Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:33:53 2011 Seq name: gi|222441941|gb|ACEP01000001.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont0.1, whole genome shotgun sequence Length of sequence - 2970 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 308 - 367 7.2 1 1 Op 1 . + CDS 413 - 1066 1007 ## COG0176 Transaldolase 2 1 Op 2 12/0.000 + CDS 1150 - 1989 1007 ## COG3959 Transketolase, N-terminal subunit 3 1 Op 3 . + CDS 1977 - 2918 1424 ## COG3958 Transketolase, C-terminal subunit Predicted protein(s) >gi|222441941|gb|ACEP01000001.1| GENE 1 413 - 1066 1007 217 aa, chain + ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 213 1 211 214 300 76.0 1e-81 MKFFIDTANVEDIKKANDMGVICGVTTNPSLIAKEGRDFNEVIKEIASIVDGPISGEVKA TTTDAEGMIKEGREIAAIHPNMVVKIPMTVEGLKACKVLSQEGIKTNLTLIFTANQALLA ARAGATYVSPFLGRLDDISVRGVDLIRDIAEIFAVAGLDTEIIAASVRNPIHVTDCALAG ADIATVPYKVIEQMTKHPLTDAGIEKFKKDYIAVFGE >gi|222441941|gb|ACEP01000001.1| GENE 2 1150 - 1989 1007 279 aa, chain + ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 6 269 7 269 270 313 57.0 2e-85 MNKLELQKMANEVRKGIVTGVHAAKAGHPGGSLSAADLFTYLYFEEMNVDPKNPQDPDRD RFVLSKGHTAPGLYATLAHKGYFPVEDLVTLRHIGSHLQGHPCMQHTPGLDMSSGSLGQG ISAAVGMALSAKLRNKSYRVYTLLGDGEIQEGQVWEAAMFAGARKLDNLVVIVDNNGLQI DGKIEDVCSPYPIDKKFEAFNFHVINVADGNDFDQLDAAFKEAREVKGMPVAIVMKTVKG KGVSFMENQASWHGTAPNDEQYAVAMEDLKKVEEALCQK >gi|222441941|gb|ACEP01000001.1| GENE 3 1977 - 2918 1424 313 aa, chain + ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate 
transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 4 313 1 309 309 343 57.0 2e-94 MSEVKKIATRESYGNALIELGKEHENLVVLDADLAAATKTGMFKKVFPERHIDCGIAECD MIGIAAGIATTGMVPFASTFAMFAAGRAFEQIRNSVGYPHLNVKIGATHAGISVGEDGAT HQCNEDIALMRTIPGMTIINPSDDVEAKAAVRAAYELDGPVYLRFGRLAVPVINDNDDYK FEIGKGVVLREGKDVTIVATGLCVSSALEAADMLAEDGIEAKVINIHTIKPIDSDLLVEA AKETGKVVTVEEHSVIGGLGGAVCEVLSEKYPVPVKRIGVNDVYGESGPAVKLIEKYGLD GKGVYASVKEFVK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:34:04 2011 Seq name: gi|222441940|gb|ACEP01000002.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont1.1, whole genome shotgun sequence Length of sequence - 35456 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 11, operones - 7 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 139 - 1104 1148 ## COG0113 Delta-aminolevulinic acid dehydratase - Prom 1254 - 1313 9.2 - Term 1221 - 1271 -0.4 2 1 Op 2 . - CDS 1365 - 1796 632 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 3 1 Op 3 . - CDS 1793 - 2644 537 ## COG1648 Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) 4 1 Op 4 2/0.000 - CDS 2647 - 4695 2341 ## COG1492 Cobyric acid synthase 5 1 Op 5 9/0.000 - CDS 4695 - 5768 1065 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 6 1 Op 6 3/0.000 - CDS 5817 - 6824 1076 ## COG1270 Cobalamin biosynthesis protein CobD/CbiB 7 1 Op 7 . - CDS 6850 - 8226 1220 ## COG1797 Cobyrinic acid a,c-diamide synthase 8 1 Op 8 . - CDS 8278 - 8994 751 ## COG2099 Precorrin-6x reductase 9 1 Op 9 2/0.000 - CDS 8997 - 9692 878 ## COG2243 Precorrin-2 methylase - Prom 9737 - 9796 5.8 10 1 Op 10 . 
- CDS 10044 - 10826 840 ## COG4822 Cobalamin biosynthesis protein CbiK, Co2+ chelatase - Prom 10897 - 10956 8.1 11 2 Op 1 6/0.000 - CDS 10964 - 12220 1022 ## COG2242 Precorrin-6B methylase 2 - Prom 12244 - 12303 3.0 12 2 Op 2 5/0.000 - CDS 12318 - 13394 1141 ## COG1903 Cobalamin biosynthesis protein CbiD 13 2 Op 3 . - CDS 13473 - 14105 526 ## COG2082 Precorrin isomerase 14 2 Op 4 11/0.000 - CDS 14127 - 15185 1263 ## COG2038 NaMN:DMB phosphoribosyltransferase - Prom 15262 - 15321 3.7 15 2 Op 5 . - CDS 15348 - 16127 991 ## COG0368 Cobalamin-5-phosphate synthase 16 2 Op 6 . - CDS 16128 - 18662 3214 ## COG0007 Uroporphyrinogen-III methylase - Term 18676 - 18712 0.9 17 3 Op 1 . - CDS 18721 - 19476 923 ## COG1010 Precorrin-3B methylase 18 3 Op 2 . - CDS 19514 - 20692 1195 ## COG0373 Glutamyl-tRNA reductase 19 3 Op 3 . - CDS 20727 - 21479 997 ## COG2875 Precorrin-4 methylase - Prom 21602 - 21661 7.3 20 4 Tu 1 . - CDS 21667 - 22941 1599 ## COG0001 Glutamate-1-semialdehyde aminotransferase - Prom 23037 - 23096 5.5 - Term 23284 - 23327 0.1 21 5 Tu 1 . - CDS 23366 - 25369 1663 ## COG0514 Superfamily II DNA helicase - Prom 25451 - 25510 7.3 - Term 25490 - 25549 16.1 22 6 Op 1 . - CDS 25567 - 27825 2318 ## COG0058 Glucan phosphorylase - Prom 27905 - 27964 5.3 23 6 Op 2 . - CDS 28107 - 28262 61 ## - Prom 28333 - 28392 6.1 + Prom 27974 - 28033 7.0 24 7 Tu 1 . + CDS 28199 - 29068 333 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 29233 - 29261 1.4 + Prom 29206 - 29265 8.9 25 8 Op 1 16/0.000 + CDS 29302 - 30198 902 ## COG1209 dTDP-glucose pyrophosphorylase 26 8 Op 2 11/0.000 + CDS 30214 - 31233 962 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Term 31349 - 31403 8.5 + Prom 31248 - 31307 5.2 27 9 Op 1 9/0.000 + CDS 31429 - 32370 998 ## COG1091 dTDP-4-dehydrorhamnose reductase 28 9 Op 2 . + CDS 32403 - 33014 637 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes + Prom 33016 - 33075 9.0 29 10 Tu 1 . 
+ CDS 33140 - 33379 264 ## gi|225025808|ref|ZP_03715000.1| hypothetical protein EUBHAL_00033 + Term 33515 - 33558 1.0 + Prom 34087 - 34146 8.3 30 11 Op 1 . + CDS 34249 - 34776 752 ## COG0778 Nitroreductase 31 11 Op 2 . + CDS 34712 - 35449 608 ## COG3201 Nicotinamide mononucleotide transporter Predicted protein(s) >gi|222441940|gb|ACEP01000002.1| GENE 1 139 - 1104 1148 321 aa, chain - ## HITS:1 COG:BH3044 KEGG:ns NR:ns ## COG: BH3044 COG0113 # Protein_GI_number: 15615606 # Func_class: H Coenzyme transport and metabolism # Function: Delta-aminolevulinic acid dehydratase # Organism: Bacillus halodurans # 3 321 9 327 328 409 58.0 1e-114 MIRRRRLRATENLRALVRETSVSVSDLIYPLFVIEGNDIKNPIDSMPGIYQYSLDRIDEE LDRIREAGVKGVLLFGIPAHKDECGTEAYNEHGIIQEAIRYIKKVFPELVVIADICLCEY TSHGHCGLIKDGIILNDETLPLLAKTAVTCAQAGADMVAPSNMMDGHIAAIREALDEAGC KMTPIMAYSAKMASGYYGPFRDAAHSAPGFGDRKTYQMDYHNPREAMRDIQDDIDEGADI IIIKPALAFLDILKTASWETDMPLCAYNVSGEYSMVKAAAANGWIDEKKIVMENMIGMKR AGAQMIITYHALDVAKWLREE >gi|222441940|gb|ACEP01000002.1| GENE 2 1365 - 1796 632 143 aa, chain - ## HITS:1 COG:CAC3256 KEGG:ns NR:ns ## COG: CAC3256 COG0454 # Protein_GI_number: 15896501 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 138 1 138 140 79 32.0 2e-15 MIELVEDKLDIDTYLELRQSVNFRKLTRDQAKKGLSNSMYTLVAFKDGKAVGMGRIVGDG AIICYVQDLIIRPEVQGEGIGGLILETLKKFVIQEGYEGTTMMFDLMCAKGREEFYKKHG FIARPTEDLGPGMIQFIQIGEEG >gi|222441940|gb|ACEP01000002.1| GENE 3 1793 - 2644 537 283 aa, chain - ## HITS:1 COG:MA0576 KEGG:ns NR:ns ## COG: MA0576 COG1648 # Protein_GI_number: 20089465 # Func_class: H Coenzyme transport and metabolism # Function: Siroheme synthase (precorrin-2 oxidase/ferrochelatase domain) # Organism: Methanosarcina acetivorans str.C2A # 115 255 46 187 220 83 35.0 4e-16 MAYFPIFTQIDGKRCLIAGGGKVAARKVYTLLQYGADVVVMAEKVCGEIKEVLPEENVFE GVFKYIHKNRCNKNEWDIDENKYDTGGNISNTELENTLDKRYSSNSNLKNISYDLEISEN 
LLESLSADLSEDLLKIRKNIFLEKEIGKAFLVVAATSNREENHHIAELCHAYNVLVNVAD SEEESSFIFPSVVRKGDISIGINSGTGSPTVSKHIRKQIEKAVPDYYADIAVFMGKLREY VKANFKEESQRRYILKTAAAEAFAEERVLTQKEIEEIIRQGQK >gi|222441940|gb|ACEP01000002.1| GENE 4 2647 - 4695 2341 682 aa, chain - ## HITS:1 COG:lin1171 KEGG:ns NR:ns ## COG: lin1171 COG1492 # Protein_GI_number: 16800240 # Func_class: H Coenzyme transport and metabolism # Function: Cobyric acid synthase # Organism: Listeria innocua # 1 496 1 499 511 455 48.0 1e-127 MAKAIMIQGTMSNSGKTFVTAGLCRVFKQDGYKVAPFKSQNMALNSYITKEGLEIGRAQA MQAEAAMIEPTHWMNPILLKPTSSMGSQVIVNGEVYDNLSAQEYYKMKDNLAPEVMKAFN HLSEENDIIVIEGAGSPAEINLAENDIVNMGMAKMADAPVILVADIDRGGVFASAYGTIK LLPVEDQERFCGIVINKFRGDVDILKPGLAMLEDLTGKPVLGVIPMEKIDVDDEDSLSDR LNQKTITEGIDVAVIRLPHISNFTDFSVFELIDGVSLRYVTDKKELGDPDLILLPGTKNT MGDMEWLIESGLEGAIIRAARTTRVIGICGGFQLLGKEMHDPDGVEHGGDMRGLGLLDTK TIFKEAKTRTRIHGHISEEHNIYNLDNLSVEGYEIHMGTTENLGEAIPMITLEDGRTDAY MTKDGRVWGSYLHGIFDNEDLVFALVQDIMKEKGINPAENHLSIAEYKEIQYNKLADLIR NSLDMDAIYKVLFGEKKEMVRCAGKKDDTSGKGLVHIYCGDGKGKTTTSVGLTVRAAGSG KKVLFYQFLKDNSSSERNILEKVPGITLVRGREMQKFTFQMNEQELDELRIYNNEMLDKL FEMAKDYDMLVMDESVYAIKSNLLDEEKLITHLEEKPVGLEVVLAGRNPSQKLMDHADYV SEIQKVKHPFDHGVSSRVGIEL >gi|222441940|gb|ACEP01000002.1| GENE 5 4695 - 5768 1065 357 aa, chain - ## HITS:1 COG:STM0644 KEGG:ns NR:ns ## COG: STM0644 COG0079 # Protein_GI_number: 16764021 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Salmonella typhimurium LT2 # 5 355 7 359 364 258 36.0 1e-68 MKKISHGGNIYKKAKEMGIREEDILDFSANISPLGLPEHIRKAMIEAIDGTINYPDPDCS RLKEAISKQDNVPESHITCGNGGADLLYRLAFGLHPKKVLLPAPAFVEYEEALSAAGAQM EYYLMRDDFVIKEDILEQITEETDFVVICNPNNPTGILTEKELILRVLERAKETNTFVMA DECFLEICQNESEYTVKPFIEKYENLIILKSFTKLYAIPGVRLGYILAGSEDVIAKVNRT GQAWSVSHIAQCAGVAALSDNVYKERVIETVAAELSYMKKEFFKLPVILYDGAANYLFFQ TPGITDLDRRLESYGIMIRNCSNYVNLGKDYWRVAVKSHEENEKLIKALRMILKGEE >gi|222441940|gb|ACEP01000002.1| GENE 6 5817 - 6824 1076 335 aa, chain - ## HITS:1 
COG:lin1155 KEGG:ns NR:ns ## COG: lin1155 COG1270 # Protein_GI_number: 16800224 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CobD/CbiB # Organism: Listeria innocua # 9 334 8 310 315 250 44.0 3e-66 MRLIYAMLLGFILDLIFGDPHGLIHPVQIIGWFIDKLKKGMQHMIYGCSYEEVREKEIER KAGVEKTAGFFLMLFIVAGTFAVVYGILYVANLIHPVLRFCLETFFIYQILATKSLKTES MKVYKKLKEGDLIGARKEVSYLVGRDTENLDESEVAKADVETIAENTADGVIAPMLFIAI GGAPLGFAYKAVNTLDSMVAYKNEELINIGFFSAKMDDICNFIPARFAAVMLMIASLILR FDFKGAVKIFKRDRFAHLSPNSAQTEAVAAGALQIQLGGTHNYFGKAVVKPTIGDDIRPV EYEDIKRTNQLLYVSAVLSMAVCCLITFLIYVQLF >gi|222441940|gb|ACEP01000002.1| GENE 7 6850 - 8226 1220 458 aa, chain - ## HITS:1 COG:MK1643 KEGG:ns NR:ns ## COG: MK1643 COG1797 # Protein_GI_number: 20095079 # Func_class: H Coenzyme transport and metabolism # Function: Cobyrinic acid a,c-diamide synthase # Organism: Methanopyrus kandleri AV19 # 6 419 11 426 463 305 39.0 1e-82 MENVKPMPRVLIAGANSGCGKTSITCGILKALVNRELHIQSYKCGPDYIDPMLHSHITGR NCRNLDPFFSTGEDLRYLVGKDSQDVDFSVTEGVMGYYDGVGISCEKSTWTVSKETGTPT ILIFNVKGMSHTMIPLIKGMVEYQDNPIAGVILNRCSKGMYQLMKPEIEEKLGISVVGYF PQKEGIYIGSRHLGLMTAAEIDNLDEILSLLGETAEECIDLDLLLEIGEKAEPLTAVKRP EVPAEKRAKIAVAWDKAFCFYYKENLEILEQLGAELCYFSPVNDVQLPEDTDGIYLGGGY PETYRKELAENLSMKESICEAAKAGKPIIAECGGFMYVCNHLVETDDSSMDMLGLIDTDV HMTKRLSMQFGYVTLNALCDTAFFKKNTNIRAHEFHYSKADKRGDSCEIKKYSGKCWNGL YVKENIMAGYPHFYFHNCREVAERFVDMAEKISKNLSM >gi|222441940|gb|ACEP01000002.1| GENE 8 8278 - 8994 751 238 aa, chain - ## HITS:1 COG:MTH1002 KEGG:ns NR:ns ## COG: MTH1002 COG2099 # Protein_GI_number: 15679020 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6x reductase # Organism: Methanothermobacter thermautotrophicus # 2 235 38 270 302 120 34.0 2e-27 MKYNVLVIAGTTESRQVIEKLLGENPNERILASVATELGKEMLLEYDIDVHVGRLDQDGF LALLKENPCEKIIDASHPFAKIVSETVKKVAESADIPYERYERTNLQYDYEGIVHVKDVQ EAITLLNELKGNVFLTTGVNTAAAYMSGVENGAERLFIRVLDNVSSLEGCAKAGYPEGHV FGKMPPFTVEDNVRLIRETEAEVLVSKDSGKTGGVDVKVEACRQTGIKMILIDRPNAL >gi|222441940|gb|ACEP01000002.1| GENE 9 
8997 - 9692 878 231 aa, chain - ## HITS:1 COG:MTH1348 KEGG:ns NR:ns ## COG: MTH1348 COG2243 # Protein_GI_number: 15679347 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-2 methylase # Organism: Methanothermobacter thermautotrophicus # 1 228 1 227 232 129 32.0 6e-30 MKGKLYGIGVGPGDPELLTLKAKRLIEECDIVAVPVKKEGEDSVALNIAKGAVKIPEEKI QEIVFTMAKDKTKREACRQAAAEEIMKLLDEGKSIAMLALGDIGIYSTYAYVHKRLLKEG YDVEMVSGIPSFCAGASKAGISIVEGNEGFGVIPSLKGIDQVEKTLGVFDNLVIMKVGSH VKEVYDLLVERGMENNALIISNVGMDGEYVGPLIPDREYGYFTTMIIKSEM >gi|222441940|gb|ACEP01000002.1| GENE 10 10044 - 10826 840 260 aa, chain - ## HITS:1 COG:lin1165 KEGG:ns NR:ns ## COG: lin1165 COG4822 # Protein_GI_number: 16800234 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiK, Co2+ chelatase # Organism: Listeria innocua # 3 257 2 253 261 207 43.0 2e-53 MGKKALLVVSFGTSFEEALPAIVNIEETCKKAFPDYDFYRAFTSGMIIRKWARTKNVIIH NPDEVMKRLVAEGYEEVICQPTHIINGLEYDKMMNMLLAYKDQIPTIKVGNPLLTEEEDY KEACEIVMQELEKPLAKDEAFVFMGHGTEHFANSAYSQFENMLRDLGHESTYVGTVEGFP GLDYVIRRLKIRDIKKVYVMPLMIVAGDHARNDLAGAEADSWDSILKADGFETEVIMKGL GEIDAIAEMFVKHLKKAESL >gi|222441940|gb|ACEP01000002.1| GENE 11 10964 - 12220 1022 418 aa, chain - ## HITS:1 COG:FN0964 KEGG:ns NR:ns ## COG: FN0964 COG2242 # Protein_GI_number: 19704299 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-6B methylase 2 # Organism: Fusobacterium nucleatum # 238 413 5 181 189 122 35.0 1e-27 MSKLTVAGIGPGEADYILPAVIEKMKKAHTVIAAKRVLPILEELCQDVSDSASVLQHDSI SFFAMGKIKDTLEQIGKILSEGHDVVMAVSGDPLMYSLYRTICNDPISENWDMDIIPGVG SLQMLGAAFGETMEDALIISVHGRAKTAGSIALAVTEHPKVFFLCSKEQGPAWLSRIMLD YQLDDVTVCAGANLSYEDEILASGTPAEMVKKEFPSLCVAMIKNPSPRQIVRPCFLSDED FERDKTPMTKEEIRILILHKMKLHPDDVVWDVGAGTGSVSIECARQVPFGTVHSVERSET AVKLIYKNKEKFSADNLTVYEGDAAETVAKLPEPDKVFIGGSGKEMSQILETIAAFPKKI KVVISAVTIETIAETNELLGKYAPDFDVIQATVGRGRKIGSYHIMDTNNPVMIFTAYI >gi|222441940|gb|ACEP01000002.1| GENE 12 12318 - 13394 1141 358 aa, chain - ## HITS:1 COG:MJ0022 KEGG:ns NR:ns ## COG: 
MJ0022 COG1903 # Protein_GI_number: 15668193 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiD # Organism: Methanococcus jannaschii # 2 319 7 331 362 239 44.0 8e-63 MKKLREGVSTGSCMTGGAEASVIWQTTGKCPSVVKVDTPIGKTLYLDIIPREFGVCGVVK DAGDDPDVTNGSEIVTKVELFEEEGDISFFGGEGVGTITQEGLKIPPGQPAINPVPRQMA EKAIRKIIGNKKASVTVSIPGGKELAKKTFNPRLGIVDGLSVLGTTGIVRPMSEEAMKDS LIAELDMYAKQGHKTILFVLGGTGENALKEQYGEFQCILQVSNYIGFMIEEAVERGFTDI LIGGFVGKLVKVASGTMNTHSHVADGRIETICTHAALHGAPTSVIKKIYDCLTTKAAMKI VEEEGLMDIWPDMAQKASEYCEKTAHKMARVGIIFLDGQNEILAVSENIKEVLSACKK >gi|222441940|gb|ACEP01000002.1| GENE 13 13473 - 14105 526 210 aa, chain - ## HITS:1 COG:FN0970 KEGG:ns NR:ns ## COG: FN0970 COG2082 # Protein_GI_number: 19704305 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin isomerase # Organism: Fusobacterium nucleatum # 10 210 10 211 219 167 46.0 1e-41 MLDKIQIVKPAEIEKRSMEIITSELNGRTWPEPEFSIVKRCIHTSADFDYADNLCFSENA ANIGVEALKNGAHIVTDTRMAWSGINKKKLASFGGEAHCFMSDDDVAKEAKERGCTRAAI CMERGAVLAEEKNIIFAIGNAPTALIRLYELIKEGKLKPALIIGAPVGFVNVVESKELIM EAGVPFIVPKGRKGGSNIAATICNAMLYQL >gi|222441940|gb|ACEP01000002.1| GENE 14 14127 - 15185 1263 352 aa, chain - ## HITS:1 COG:CAC1372 KEGG:ns NR:ns ## COG: CAC1372 COG2038 # Protein_GI_number: 15894651 # Func_class: H Coenzyme transport and metabolism # Function: NaMN:DMB phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 351 1 351 352 311 45.0 1e-84 MTLEEAIKKIQPLDEEAMEYTRARWNTLGKPLHSLGRLEEMVTQFAGIYRDKNPHLRKKA VIVMAADNGVVAEGVTQTGQEVTKTVTENMTKHNATICIMSSMSGADVYPVDIGIATDCD NPGVFPRKIKYGTDNMAKGPAMSREEAVKAIEVGIGMVADLKEKGYNVLATGEMGIGNTT TSSAVCSVLLDQSVEKVTGKGAGLTNKDLEHKIEVIRQSIALNKVNADDPLDVLAKVGGL DIAGMAGCYIGGAALNIPVFIDGFISSVAALIAIRLVPECAPYLFPSHCSNEPAGRMILD AIGKQPYIFANMCLGEGTGAVMGFTIADYAFKAYWELPSFEQTNFGTYEELN >gi|222441940|gb|ACEP01000002.1| GENE 15 15348 - 16127 991 259 aa, chain - ## HITS:1 COG:lin1112 KEGG:ns NR:ns ## COG: lin1112 COG0368 # Protein_GI_number: 16800181 # Func_class: H Coenzyme transport 
and metabolism # Function: Cobalamin-5-phosphate synthase # Organism: Listeria innocua # 5 245 1 244 248 126 35.0 3e-29 MKTQIKRFLLTLGFMTRIPVNVDLGEVKDEDMHKGFLYYPVVGLILGLCDMAVFLLVSLI LPEVFGILFAMLANLCLTGAFHLDGLSDTADGIYSARTRERMLEIMKDSRIGTNGAVAMC FDLALKFAGIYYSDIRWLMILLMPIAGKMVQGAIVYKAIYPREKGIGIYVGTVSLGTVIG TVILGLIAMVLAFSCWGVLIFAILFGFAYLFRVYITGKIGGITGDVMGAGNELAEVLLLL VVIVLTKYTGMTPLSLFPF >gi|222441940|gb|ACEP01000002.1| GENE 16 16128 - 18662 3214 844 aa, chain - ## HITS:1 COG:lin1164_1 KEGG:ns NR:ns ## COG: lin1164_1 COG0007 # Protein_GI_number: 16800233 # Func_class: H Coenzyme transport and metabolism # Function: Uroporphyrinogen-III methylase # Organism: Listeria innocua # 344 593 2 248 252 257 52.0 8e-68 MKNIIKIGTRKSKLALIQTDIVKDKIKKAFPEIEVEIVKIDTKGDQILDKSLTSFGGKGV FTAELEAELLSGAVDIAVHSAKDMPMDFPEGLGIGAVLDRADVRDTFVTTTGKKLEELEP GSIVGTSSLRRELLIKEINPYVTIKLLRGNVQTRLSKLRDGQYDGIILAAAGIERLGYEK EEGLHYQYLDPDVFLPAAGQGILAVESRVEDAEMAEILAAIHSEKAECLLMAERAFLKTI GGSCNAPAAALCREENGEFSMRAMYVKDGIHSRKTYMTVNIEDAIAEKDADSKPIAGTAV AAAVSVEEAVKTDKEITEKDTARKSKVEIAAELGISVAHEVNKGMVYLVGAGPGDEDLMT RKGLKLLREADVVVYDNLASSSLLNEVRDDAELIYAGKRSSNHHLKQYETNELLVKLALE GKNVVRLKGGDPYIFGRGGEEGQELREAGVDFEVVPGISSSYSVPAYCGIPVTHRDFASS FHVITGHEGNHKNGVSVLNYETLAKEEGTLIFLMGLKNLPNIVASLIENGKDPATPVGVL QEGTTARQRVATGTLADIVEVVKREGIKTPAITVVGDVVSLRQVLDWYGHKPLSGKSVLV TGTTSMVDRLSPILKEEGAEAISFSLIRTERMKLPELDVALKEIDKYNWIVFTSANGVEC FFEEMQEIRKDIRDLAHVRFAVIGDGTKKALEEHGIFCDFIPTAYSSKDMAEAMVPHIGK DESVLLLRAEEANRVLPDALEEAGISHTCISLYHTVTDERKADELNRLIKMADYVTFASS SAVRAFVSMVDNLDEVKGKYISIGPVTTKTAQENGLSIAKTAVVYTARGMVETMIQDAVE EGQK >gi|222441940|gb|ACEP01000002.1| GENE 17 18721 - 19476 923 251 aa, chain - ## HITS:1 COG:FN0951 KEGG:ns NR:ns ## COG: FN0951 COG1010 # Protein_GI_number: 19704286 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-3B methylase # Organism: Fusobacterium nucleatum # 2 244 4 244 249 273 52.0 3e-73 MGKLYAVGFGPGGYEHMTAKAIDVIKNADIITGYTTYVEMLKKFFPEKEYVATPMTKEMD 
RCRMAVDLAAEGKTVAMVSSGDSGIYGMAGILLEIANEKKADVEIETVPGVTAASAAASI LGAPLMHDFTIISLSDLMTPYSLIMKRVDCAGQGDFIVCLYNPKSKKRADYVEKAAEILM KYRDGKTPVGVVRHAGREEESSYITTLDAVKDAPIDMFSIVIVGNSNTYVRDGKMITPRG YEDKYGFANVE >gi|222441940|gb|ACEP01000002.1| GENE 18 19514 - 20692 1195 392 aa, chain - ## HITS:1 COG:aq_1279 KEGG:ns NR:ns ## COG: aq_1279 COG0373 # Protein_GI_number: 15606498 # Func_class: H Coenzyme transport and metabolism # Function: Glutamyl-tRNA reductase # Organism: Aquifex aeolicus # 3 365 4 368 406 177 30.0 5e-44 MGIQIISISHKIAPLHVREMFAFTEEQQKHMMQEITEHLEVSECIVLSTCNRTEMYVYSD SESKGCVFNLMEDVLLGEAGAQDEEDIGNYLLFYHGKKAIHHLFQVAAGLDSMVIGEDQI LGQVKTAHKQAREAGTTGVYLNTFFRMAVTGAKKVKTETELSKTSVSTATLALKVAEEEL GTLKDKKVLIIGATGKIGGIVLMNIQSLHQADIYVTTRKNKLIQTKHGNDEFTTIDYEDR YEYLDQMDVVISATSSPHYTLTYSKMKKQLTTAKRRVFVDLAVPMDIEAKISAVDDTCYY NIDDFTRIARENNQKKLREAEAASGILDEYELQFEQWMVFQKSLSVMGKVRDNFVKVAEH KGVEKAFDHFFYWVRENNTPEDLETFFQCLDH >gi|222441940|gb|ACEP01000002.1| GENE 19 20727 - 21479 997 250 aa, chain - ## HITS:1 COG:FN0957 KEGG:ns NR:ns ## COG: FN0957 COG2875 # Protein_GI_number: 19704292 # Func_class: H Coenzyme transport and metabolism # Function: Precorrin-4 methylase # Organism: Fusobacterium nucleatum # 2 248 8 254 257 316 66.0 3e-86 MVHFIGAGPGDPELLTIKGKKLIDQADVIIYAGSLVNKEVLAGAKEGAEIYNSATMTLEE VIEVMKKAEAEGKMTARVHTGDPAVFGAHREQMDELSRLGIDYDVIPGVSSFLATAAALK KEYTLPGVSQTVILTRMEGRTPMPPKEKLRDLAKHNSTMIIFLSVGMIEQLADTLKEEYR EDTPVAVVYKASWEDQKIVIGNLTNIAQKVKEAGITKTALTVVGDFLGDEYELSKLYDKT FTHEFRQAKE >gi|222441940|gb|ACEP01000002.1| GENE 20 21667 - 22941 1599 424 aa, chain - ## HITS:1 COG:ECs0158 KEGG:ns NR:ns ## COG: ECs0158 COG0001 # Protein_GI_number: 15829412 # Func_class: H Coenzyme transport and metabolism # Function: Glutamate-1-semialdehyde aminotransferase # Organism: Escherichia coli O157:H7 # 5 420 2 417 426 470 55.0 1e-132 MLNTTKSEELYKVALDLMPGGVNSPVRAFGAVGRNPLFIDHAKGSYVYDVDGNKFIDYVC SWGPGILGHSHPEVLEEVVKACFDGLTFGAPTGKENELAQLVKDCVPSMEMMRMVSSGTE 
ATMSAIRAARGFTGRDKIIKFKGCYHGHSDGLLVQAGSAALTQSVPDSSGVPASFAQQTL VALYNDKDSVRELFENNKDQIAALIVEPVAANMGVVIPDDDFLPFLREITEENGTILIFD EVITGFRLGVGGAQEWFGIKPDLSTFGKIVGGGMPMAIYGGRKDIMRNISPTGKVYQAGT LSGNPIATTAGIKTLEILRDHPELYTKIEENTKKLAEAYKVRAKKQGINIHVNQIGSLMS AFFTGENVKDYTGATSSDTKAYADYFNYMLENGIYVAPSQFEAMFISAAHSDEDIQRTID VLTK >gi|222441940|gb|ACEP01000002.1| GENE 21 23366 - 25369 1663 667 aa, chain - ## HITS:1 COG:CAC2687 KEGG:ns NR:ns ## COG: CAC2687 COG0514 # Protein_GI_number: 15895945 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Clostridium acetobutylicum # 2 403 4 398 714 420 52.0 1e-117 MNINETLKHYFGYDSLRPGQQELIEGILQRKDVLGIMPTGAGKSLCYQVPALMLDGITIV VSPLISLMTDQVKALNQAGVHAAYINSSLTENQIRTALFYAAQGRYKIIYVAPERLNTIR FLEFACQVDISMVTVDEAHCISQWGQDFRPSYVGIADFLAQLPKRPVVSAFTATATERVK QDIMGSLRLQNPVTVVTGFDRPNLFFRVVTRKGGKETDNSVLNYVKKHEDESGIIYCATK KNADKIYGLLQQYGIEAGHYHAGLSLEERKKNQDDFTYDRIRVMVATNAFGMGIDKSNVR YVLHYNMPQSLEYYYQEAGRAGRDGEEAECVLFFSKQDIMINKRLLEYKSTESIESDPQV RRNDYQKLNRMIDYCETQQCLRQFILSYFGDNSPCTCDKCSNCVVVEDEEEENYIQTKKE KKKAFQLANLTPKGQELFEQLRKCRTELAAEKGVPPYIICSDKTLTDMCAKCPVDNEDME TVYGMGVQKIQSYGEHFTKIIIDFLEEQSAAGGADAETLQLTTELTPEQIEKTTGMTVAA SPARKKKLPFYIAPGKLDEVELTDTCMISELTNRINELCDEEDQKNRKKLTAAFVNTLLI QKGYIEEATEGEEKVKHITEKGKEAGIQEEERYGKYGRKYYALVHTRESQEMILGELREY LADLTDE >gi|222441940|gb|ACEP01000002.1| GENE 22 25567 - 27825 2318 752 aa, chain - ## HITS:1 COG:SP2106 KEGG:ns NR:ns ## COG: SP2106 COG0058 # Protein_GI_number: 15901921 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Streptococcus pneumoniae TIGR4 # 1 750 2 751 752 1001 65.0 0 MKLSEFVKAEYGKSISECTNEELYYALLTLTKKMASGKQHGNSKKKLYYISAEFLIGKLL SNNLINLGIYDEVKEELAQNGKDICEIEEFENEPSLGNGGLGRLAACFIDSIATLGLNGD GVGLNYHFGLFRQIFENNMQTTVPDPWLTEKSWLTKTDVTYDIKFKGMTVKSRMYDIDVI GYNNTSNKLHLFDVESVDESIVEDGINFNKEDIKKNLTLFLYPDDSDEAGRILRIYQQYF MVSSAAQLILDECVAKGCNLHDLSDYVVIQINDTHPTMVIPELIRLLVERGLEMDEAIEV 
VTKSCAYTNHTILAEALEKWPIYYLKKAVPQLMPIIEVLDDKVRRKYHNQEVYIIDGEER VHMAHIDIHYSSSVNGVASLHTEILKNNELHHFYEIYPEKFNNKTNGITFRRWLLHCNPQ LTELITSLIGDGFKKDATQLEKLLDYKNDEKVLKQLLEIKSAKKMELKKALMEKQNVTIN ENSIFDIQIKRLHEYKRQQLNALYVIHKYLEIKAGKKPSTPITVIFGAKAAPAYIIAQDI IHLILCLQQLINNDPEVSPYLNVVMVENYNVTWAEKLIPACDISEQISLASKEASGTGNM KFMLNGALTLGTEDGANVEIHELVGDDNIYIFGDSSDTVVERYAQGSYHAAKYYNENTAI KEAVDFITGEELKAIGDKDRLERLFNELVSKDWFMTFPDFDEYVKTREQAYADYENREEW VQKMLVNIAKAGYFSSDRTIEEYNRDIWKLED >gi|222441940|gb|ACEP01000002.1| GENE 23 28107 - 28262 61 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENRLEYVFYFRDIFCIEVYPFLYFACRIYVLNLEINSRVSFLNNIDTKPD >gi|222441940|gb|ACEP01000002.1| GENE 24 28199 - 29068 333 289 aa, chain + ## HITS:1 COG:BH1906 KEGG:ns NR:ns ## COG: BH1906 COG2207 # Protein_GI_number: 15614469 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 11 288 8 286 299 120 26.0 2e-27 MDKLQYKIYHENKIHTPDDFPYNTYLCSIPLDFQSVNLHWHDEVEIIVIKKGSGIVSVDL TTYSVSAGDIIFVLPGQLHAISQKDNEIMEYENILFKSSLLKSSGYDLCNDKFIQPLFSG SLNICPVINNQTPYYPPVITAINEIDHLCDLRPYAYQLSIKAHLFQILYTLVSNCGQNKI KPINQKSLKKIKTILSYIANNFQENIAIEDIANYCFYSKSYFMKFFKETMGVSFIQYLND YRLEIAAELLTTTTDNILDIAFATGFNNISYFNRCFKKKYGVSPRKYRF >gi|222441940|gb|ACEP01000002.1| GENE 25 29302 - 30198 902 298 aa, chain + ## HITS:1 COG:L197041 KEGG:ns NR:ns ## COG: L197041 COG1209 # Protein_GI_number: 15672176 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Lactococcus lactis # 1 285 1 282 289 412 70.0 1e-115 MKGIILAGGSGTRLYPLTMVTSKQLLPIYDKPMIYYPMSVLMNAGIRDILIISTPQDTPR FKELLKDGKQFGVNLSYAVQPSPDGLAQAFIIGEEFIGDDTVAMVLGDNIFAGHGLKKRL CAAVKNAENGNGATVFGYYVDDPERFGIVEFDRNGKAVSIEEKPSHPKSNYCVTGLYFYD NRVVEYAKNLTPSARGELEITDLNRIYLENNSLNVELLGQGFTWLDTGTHESLVDATNFV KTMEQHQHRKIACLEEIAYLNGWISKEEVLKVYEVLKKNEYGQYLKDVLDGKYQENLY >gi|222441940|gb|ACEP01000002.1| GENE 26 30214 - 31233 962 339 aa, chain + ## HITS:1 COG:MTH1789 KEGG:ns NR:ns ## COG: MTH1789 COG1088 # 
Protein_GI_number: 15679777 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Methanothermobacter thermautotrophicus # 3 339 4 333 336 449 63.0 1e-126 MTIIVTGGAGFIGSNFIFHMLNKYPDYRIICLDCLTYAGNLSTLEPVMDNPNFRFVKESI TDRDAVYKLFEEEHPDMVVNFAAESHVDRSIENPEVFLDTNIKGTAVLMDACRKYGIKRY HQVSTDEVYGDLPLDRPDLFFTEETPIHTSSPYSSSKAGADLLVLAYHRTYGLPVTISRC SNNYGPYHFPEKLIPLMIANALNDKPLPVYGKGENVRDWLYVEDHCRAIDLIIHNGRVGE VYNVGGHNEMKNIDIVKIICKELGKPESLITYVADRKGHDMRYAIDPTKIHNELGWLPET KFADGIKKTIQWYLDNKEWWETIISGEYQDYYEKMYKNR >gi|222441940|gb|ACEP01000002.1| GENE 27 31429 - 32370 998 313 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 313 1 279 280 193 39.0 5e-49 MKFFVTGVNGQLGHDVMNELSSRSYEGIGSDIAPKYSGIQDDSPVTKMPYISLDITDKEA VTRILKETAPDIVVHCAAWTAVDLAEDADKQETVRKINAAGTQYIASACKELDCKMIYLS TDYVFDGQGTTPWKPDCKDYKPLNVYGQTKLLGEQAVANTLEKYFIVRIAWVFGQNGKNF IKTMLTVGKNHDKLTVVNDQIGTPTYTFDLARLLVDMAESEKYGYYHATNEGGYISWYDF TKEIFRQAVALGHTEYDENHVTVFPVTTAEYGMSKAARPFNSRLDKSKLVEAGFTPLPDW KDALQRYLKEVLQ >gi|222441940|gb|ACEP01000002.1| GENE 28 32403 - 33014 637 203 aa, chain + ## HITS:1 COG:CAC2331 KEGG:ns NR:ns ## COG: CAC2331 COG1898 # Protein_GI_number: 15895598 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Clostridium acetobutylicum # 1 198 1 180 185 238 61.0 8e-63 MGQIKVKKNVGGIEGLCIIEPTLHGDSRGYFMETYNKNDMAEAGLTMEFVQDNQSASTKG VLRGLHFQKQHPQGKLVRVINGTVFDVVVDLRSHSGTYGKWFGEILSAENNKQFYIPEGF AHGFLVLSDTAEFCYKCTDFYHPGDEGGLAWNDPEIGIEWPELVGEYSGCATAKGYHLKD GTSLTLSEKDQKWLGIAETFKFK >gi|222441940|gb|ACEP01000002.1| GENE 29 33140 - 33379 264 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025808|ref|ZP_03715000.1| ## NR: gi|225025808|ref|ZP_03715000.1| hypothetical protein EUBHAL_00033 [Eubacterium 
hallii DSM 3353] hypothetical protein EUBHAL_00033 [Eubacterium hallii DSM 3353] # 1 79 1 79 79 157 100.0 2e-37 MPKPFTATGKKNLIGANLIALRKKYHLSQRGLAHELQLAGYDMGKNVITRIETQQRYVTD IEIKALCDLFNVSFEDLIK >gi|222441940|gb|ACEP01000002.1| GENE 30 34249 - 34776 752 175 aa, chain + ## HITS:1 COG:FN1223 KEGG:ns NR:ns ## COG: FN1223 COG0778 # Protein_GI_number: 19704558 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 1 174 1 174 175 272 68.0 2e-73 MSDVLNKMKSRRSIRKFKPDMVPQEILDQIIEAGLYAASGMGQQSPIIIQVTKKELRDEI SKMNCEIGGWKEGFDPFYGAPAMLIVLAKKERPTYVYDGSLVMGNLMLAAHELGIGSCWI HRAKEEFKSDWGRNLLRSLGVNGDYEGIGHCALGYADGDYPAAPARNEGRVFYVK >gi|222441940|gb|ACEP01000002.1| GENE 31 34712 - 35449 608 245 aa, chain + ## HITS:1 COG:SPy1329 KEGG:ns NR:ns ## COG: SPy1329 COG3201 # Protein_GI_number: 15675271 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Streptococcus pyogenes M1 GAS # 24 175 4 153 165 98 42.0 1e-20 MLMAIILLLQREMKAAYSMLNNPLKTLTKKEWLLWLVSLTVVILSNLATNDLDLLTLISA LLGVTALIFAAKGNVWAQFLMIIFCILYGIISFRFHYWGEMITYLGMTLPMAAWSAITWL RNPSEENENEVAIRTLNKKHIIGLTISTVIVTGIFYYILAWLDTPNIMFSTLSIVTSFIA ASLTMLRSSYYAVGYAANDIVLIVLWVLASLENPAYIPVVVNFAIFLLNDMYGFVSWKKR ELERV Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:34:17 2011 Seq name: gi|222441939|gb|ACEP01000003.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont2.1, whole genome shotgun sequence Length of sequence - 5860 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 2.6 1 1 Tu 1 . + CDS 81 - 563 596 ## COG2606 Uncharacterized conserved protein 2 2 Op 1 . - CDS 837 - 2324 1603 ## COG4145 Na+/panthothenate symporter 3 2 Op 2 . - CDS 2324 - 2593 414 ## gi|225025814|ref|ZP_03715006.1| hypothetical protein EUBHAL_00039 - Prom 2623 - 2682 4.8 + Prom 2784 - 2843 10.2 4 3 Op 1 . 
+ CDS 3076 - 3960 1100 ## COG2017 Galactose mutarotase and related enzymes 5 3 Op 2 . + CDS 4006 - 5343 804 ## ACL_1119 hypothetical protein Predicted protein(s) >gi|222441939|gb|ACEP01000003.1| GENE 1 81 - 563 596 160 aa, chain + ## HITS:1 COG:BS_yjdI KEGG:ns NR:ns ## COG: BS_yjdI COG2606 # Protein_GI_number: 16078271 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 2 157 3 158 159 141 47.0 5e-34 MKKELKTNVMRILEKEKVPYEAHYYPHGKEAVDGVTVAELTGQNPDYVFKTLVTVSNKKE YFVFVLPVAKELDLKKAAKAVGAKSIEMIHVKDINKITGYIRGGCSPVGMKKQFVTTYHE TAQNNPVIMVSGGKIGTQVECKPDDLLKITKGQYADICKE >gi|222441939|gb|ACEP01000003.1| GENE 2 837 - 2324 1603 495 aa, chain - ## HITS:1 COG:PM1089 KEGG:ns NR:ns ## COG: PM1089 COG4145 # Protein_GI_number: 15602954 # Func_class: H Coenzyme transport and metabolism # Function: Na+/panthothenate symporter # Organism: Pasteurella multocida # 7 475 5 465 477 286 40.0 8e-77 MSASQKIILTIFVIYLALNAIIGIVFSKRQGNKQNVSSEKKFFIGGRNMNGLLLAMTTMA TYTSVSSFISGPGAAGLTYGYAQAWIATVQVPVTFLVLGVLGNKLAMVSRRTGAVTVVGY LKARYKSDALVIITSLLMVAFFMAQMIGQFTGGATLISSITGLNHVVSLLIFGAVVIIYT SFGGFSAVAITDTIQGIVMCIGTFLFIFFVLKAGGGLSGIDAGLARNLPGVYDDLFSVYT PGTLISFWVLVGFGTLGLPQTAVRSMGFKNTKSLHSAMWIGALTCSFIIVGMHMAGTWAG ALVDTNNLPTSDYFIPYIVQKIMPVGLAGTFLAAPMAAVMSTADSLLILASAAIVKDLWR NYVVKDNPEKIASYNKNVGKLSTLVTFAFGVVIILLTINPPDIIFFLNLFALGGLECSFF WPLIGGLFWKKGTKQAAISSSIGAAATYVFCYYNVHVAGINAVVWGLLAGAILYFAVGKA TSKNGLDADIIKNCF >gi|222441939|gb|ACEP01000003.1| GENE 3 2324 - 2593 414 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025814|ref|ZP_03715006.1| ## NR: gi|225025814|ref|ZP_03715006.1| hypothetical protein EUBHAL_00039 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00039 [Eubacterium hallii DSM 3353] # 1 78 1 78 89 152 100.0 6e-36 MKKLTRQEKHEQCIREIRGTLVVVAICCVWHISSAFLLNGTGLYFLGMPAWFSVSTLGTI VLSLLGVWYLLKKVFIDFEYEDEEEGGEE >gi|222441939|gb|ACEP01000003.1| GENE 4 3076 - 3960 1100 294 aa, chain + ## HITS:1 COG:lin1322 KEGG:ns NR:ns ## COG: lin1322 COG2017 
# Protein_GI_number: 16800390 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Listeria innocua # 6 293 4 289 290 187 37.0 2e-47 MATYTIENEKLSVTIAAHGAELSSIYDKENDRELVWQADPTFWNRHAPVLFPNVGKYYGG YFTYNGKEYPMGQHGFARDTEFEEVAAGEDFVTYRLSSNETTKKEYPFDFKLEITHRLQG GRLSVEWKVTNTGDKEMYFTIGGHPAFNVNVLPDTDFEDYSLAFKEGTESLSYVLLDTES GTAIADKTYELKLTNSKYALKKDMFDKDALVFDNGQIEWAALALPDGKPYIALESKGFPN FGIWSKPGAPYVCLEPWCGRCDNRGFEGELSEKPGINALKAGEVFEKSYDIIVY >gi|222441939|gb|ACEP01000003.1| GENE 5 4006 - 5343 804 445 aa, chain + ## HITS:1 COG:no KEGG:ACL_1119 NR:ns ## KEGG: ACL_1119 # Name: not_defined # Def: hypothetical protein # Organism: A.laidlawii # Pathway: not_defined # 12 326 4305 4623 5552 124 27.0 1e-26 MAETKKAKKHIVVFQDEEGNVLKTSFVSHEEAALPPEMPEKRGESVHHEIKFQGWDKDIS SVKENLVVKAVYKEVPKEYLVMYFHENGKMLGTETVPYRQAATQPYRPQKPQTEEYYYIF KGWNNDLSHIEKDTMAKAVFEERQRSFVVRFFHENGTLLKEENVLYGQAAQEPEVPAKQQ DEVYHYIFNGWDNTFDHIKENTEVHAVFSSVYNEYKVSIYEQLKERLVEEKIYHYGDIID YPVLRKKGYTLQWNIHPETVTQNEKIYASWDFSNPVGKVFEVDGNSYQILNPSITNGSVR LLSYTQDASQIQIPERVQIGDYYYFIEEIAIRAFCNCVKMRTLILPNCVRIISDGAFMNC KRLEKIVLGKGDDIKLHSIGKKAFAENEHLREIYFAGRNLRKVYPATFEGIRKTIKILVL PAEKAKIEKLLQKALREGKVLIDLI Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:34:38 2011 Seq name: gi|222441938|gb|ACEP01000004.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont3.1, whole genome shotgun sequence Length of sequence - 24755 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 9, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 21 - 194 121 ## gi|225025817|ref|ZP_03715009.1| hypothetical protein EUBHAL_00042 + Prom 635 - 694 9.4 2 2 Tu 1 . + CDS 843 - 1085 74 ## EUBREC_3231 transposase + Prom 1118 - 1177 6.2 3 3 Op 1 . + CDS 1210 - 1506 136 ## Acfer_0754 plasmid maintenance system killer 4 3 Op 2 . 
+ CDS 1534 - 2586 756 ## COG3093 Plasmid maintenance system antidote protein + Prom 2655 - 2714 6.6 5 4 Op 1 . + CDS 2740 - 3276 439 ## COG1247 Sortase and related acyltransferases 6 4 Op 2 . + CDS 3281 - 4507 1549 ## COG0205 6-phosphofructokinase + Term 4714 - 4749 -0.8 + Prom 4893 - 4952 6.4 7 5 Op 1 5/0.000 + CDS 5077 - 5856 1006 ## COG1348 Nitrogenase subunit NifH (ATPase) 8 5 Op 2 6/0.000 + CDS 5850 - 7259 1488 ## COG2710 Nitrogenase molybdenum-iron protein, alpha and beta chains 9 5 Op 3 . + CDS 7260 - 8330 1162 ## COG2710 Nitrogenase molybdenum-iron protein, alpha and beta chains 10 5 Op 4 1/0.000 + CDS 8327 - 10756 1964 ## COG0068 Hydrogenase maturation factor 11 5 Op 5 13/0.000 + CDS 10760 - 10969 383 ## COG0298 Hydrogenase maturation factor 12 5 Op 6 4/0.000 + CDS 10966 - 12021 1022 ## COG0409 Hydrogenase maturation factor 13 5 Op 7 . + CDS 12097 - 13113 1438 ## COG0309 Hydrogenase maturation factor 14 5 Op 8 . + CDS 13131 - 13379 330 ## Acfer_0861 hypothetical protein 15 5 Op 9 . + CDS 13395 - 14150 947 ## COG1348 Nitrogenase subunit NifH (ATPase) 16 5 Op 10 . + CDS 14189 - 15562 1059 ## CLJU_c23040 hypothetical protein 17 5 Op 11 . + CDS 15579 - 16862 875 ## COG2710 Nitrogenase molybdenum-iron protein, alpha and beta chains + Prom 17436 - 17495 5.8 18 6 Tu 1 . + CDS 17538 - 18134 709 ## EUBREC_0619 cytidylate kinase + Prom 18175 - 18234 6.5 19 7 Op 1 2/0.000 + CDS 18316 - 20103 1781 ## COG0006 Xaa-Pro aminopeptidase + Prom 20141 - 20200 4.6 20 7 Op 2 . + CDS 20222 - 20944 186 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Prom 21205 - 21264 8.8 21 8 Op 1 . + CDS 21329 - 22813 1802 ## gi|225025838|ref|ZP_03715030.1| hypothetical protein EUBHAL_00063 22 8 Op 2 . + CDS 22865 - 23713 755 ## Clole_3500 glycosyl transferase, WecB/TagA/CpsF family (EC:2.4.1.187) + Term 23766 - 23811 2.5 + Prom 23780 - 23839 12.3 23 9 Tu 1 . 
+ CDS 24024 - 24755 231 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 Predicted protein(s) >gi|222441938|gb|ACEP01000004.1| GENE 1 21 - 194 121 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025817|ref|ZP_03715009.1| ## NR: gi|225025817|ref|ZP_03715009.1| hypothetical protein EUBHAL_00042 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00042 [Eubacterium hallii DSM 3353] # 1 57 7 63 63 107 98.0 3e-22 MRKLIFDEDEMVELDTAQLDFNSGTEDYYLDLTHFCEEAIKMIGISSGEKYWILQMI >gi|222441938|gb|ACEP01000004.1| GENE 2 843 - 1085 74 80 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3231 NR:ns ## KEGG: EUBREC_3231 # Name: not_defined # Def: transposase # Organism: E.rectale # Pathway: not_defined # 3 73 17 87 90 75 47.0 7e-13 MHKHTWRLIDTINGEKASALVYSIVETAKANGLNPFRYLEFLLTEMMEHEEDTDCCFIDD LLLWSDSIPDICKMKNKKEH >gi|222441938|gb|ACEP01000004.1| GENE 3 1210 - 1506 136 98 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0754 NR:ns ## KEGG: Acfer_0754 # Name: not_defined # Def: plasmid maintenance system killer # Organism: A.fermentans # Pathway: not_defined # 1 98 2 99 99 112 51.0 3e-24 MDITYKNRKIERICTNAKVADREYGSQMSAKIHMRIDEIRAVDTVEEMIQFRIGRCHALK GNRKGQYAVDLEHPYRLVFTKHGNEIEIAHILEIVDYH >gi|222441938|gb|ACEP01000004.1| GENE 4 1534 - 2586 756 350 aa, chain + ## HITS:1 COG:XF1708 KEGG:ns NR:ns ## COG: XF1708 COG3093 # Protein_GI_number: 15838309 # Func_class: R General function prediction only # Function: Plasmid maintenance system antidote protein # Organism: Xylella fastidiosa 9a5c # 11 330 9 337 371 105 27.0 1e-22 MVRSRSFIATPPGATIKEQLNDRGMSQKEFSARMDMSEKHISRLINGDVQLTSEVAVRLE MVLGVPAKFWNNLEAIYREKLIKVEAENTMDKDEVLAKQLPYNQMSKFGWVPETRLIKER VINLRKYFEVVELSLLENRQITKIACRRLAITEKGDFALIAWAQEAKRVARQTDTAHINI DKLVRVLPEIRSMTLETPAEFSAELKDKLAQCGIALVFLPHLQGSFLQGASFIDGKKIVI GLTARGKDADRFWFSLFHELGHIVLGHIGKTDEITEQDENEADLWAKDELIPKEEFEQFK KTQDYSSISVCDFANKIGIAPGIVVGRLQKERCITYNMLNELKEQYEIAM >gi|222441938|gb|ACEP01000004.1| GENE 5 2740 - 3276 439 178 aa, chain + ## HITS:1 COG:L19745 KEGG:ns NR:ns 
## COG: L19745 COG1247 # Protein_GI_number: 15673759 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase and related acyltransferases # Organism: Lactococcus lactis # 1 165 1 165 187 145 43.0 5e-35 MNIRIATPEDAFAIQNIYRYYVDNTAITFELEVPSVKEFQERITKTLERYPYLVAEEEGE VIAYAYAGIFYDRRAYDWSAEMSVYVQRGIHGKGVGTALYEKMEELLKKQNIVNLFACIT HPNAESEAFHAARGYEKKAHFEQCGYKLGKWWDIVWMQKVIASNEGTPEPFQRFTGGF >gi|222441938|gb|ACEP01000004.1| GENE 6 3281 - 4507 1549 408 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 2 408 13 415 427 251 38.0 3e-66 MKKNLLVAQSGGPTAAINATLAGVIKQAIKEEQIDQVYGACYGIQGVLEQKFVNLTEKVD TEEKLEKLKRTPAAALGSCRFKLNDIKEDDSQYQEIVDILHKMNIGYFVYIGGNDSMDTV AKLSAYCKEKGVEDIKVIGGPKTIDNDLCGIDHCPGFGSAAKYISTVFCELEQEITVYEP KNVIIVEMMGRHAGWLTAAAALAEGKNGNVPYLVYLEEKPFSLDRFIDDVKEKLASTNAV LVAVSEGVHDTEGRFLCEQEESRELDVFGHTKLSGTGKILEEAVRAKIGCKVRSIELNLL QRCAGHILSKTDIEESGNLGANAVKLAVAGESGLMSSLTRVSDAPYTVEYSGVDIREVAN KEKKIPVEWINEAGNGVKEELITYLTPLVQGEVSSIYENGMPSYLLLK >gi|222441938|gb|ACEP01000004.1| GENE 7 5077 - 5856 1006 259 aa, chain + ## HITS:1 COG:MA2032_1 KEGG:ns NR:ns ## COG: MA2032_1 COG1348 # Protein_GI_number: 20090880 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrogenase subunit NifH (ATPase) # Organism: Methanosarcina acetivorans str.C2A # 1 243 1 241 339 242 51.0 7e-64 MLKIAVYGKGGIGKSTVTSNLAAAFASMGKKVIQIGCDPKADSTINLLGGTTPIPVMNYM REYDEDPETIEDISKVGYGGVLCIETGGPTPGLGCAGRGIIATFSLLEDLELFEIHQPDV VLYDVLGDVVCGGFAAPIREGYASKVLIVTSGEKMALYAANNINNAVKNFEDRSYAKVYG IVLNHRNVENETEKVTEFADSVGLPIVGEVPRSDEITRCEDRGMTVVEGEPESPAAQAFL NLAKELLENAQDTAEEDVW >gi|222441938|gb|ACEP01000004.1| GENE 8 5850 - 7259 1488 469 aa, chain + ## HITS:1 COG:all1438 KEGG:ns NR:ns ## COG: all1438 COG2710 # Protein_GI_number: 17228933 # Func_class: C Energy production and conversion # Function: Nitrogenase molybdenum-iron 
protein, alpha and beta chains # Organism: Nostoc sp. PCC 7120 # 101 317 92 307 480 73 28.0 8e-13 MVEIMAKETEKKKAEEAYFITTKELAEAGRDNIPEELISSKHLIYSSPATLAYNSPGAQG FGVKRAGLAIPGSVMLLLAPGCCGRNTTILSELGGYSERFFYLMMDETDIVTGRHLKKVP QAVEEIYDCLETKPSVVMICLTCVDALLGTDMERICKRAQKAVGIPVLPCYMYALTREGR KPPMVHVRQSIYSLLEPKKKKSTSVNLLGHFAPLEDDSELYDLLRQLGIKKIREISRCKN YDEYLDMAEANFNLVLDAEARFAAADMQKRLGIPYIELTRLYQLDKIKNQYTLFAAALGS KFDDDAYFEEALAAKEAFKKKYPHAVFAIGEGCNANAFELAFALIRYEFAVAEVFGNLSK EDFVYIEKMAKLSPETKIYSNLEPTMIYYEPGENPVDIVIGKDAVYYHPEAAQLEWSDDI QPFGYRGVKHFFEECERVLDEKNTKSEEKNLKNSTKNIKEEKILERGTN >gi|222441938|gb|ACEP01000004.1| GENE 9 7260 - 8330 1162 356 aa, chain + ## HITS:1 COG:FN0304 KEGG:ns NR:ns ## COG: FN0304 COG2710 # Protein_GI_number: 19703649 # Func_class: C Energy production and conversion # Function: Nitrogenase molybdenum-iron protein, alpha and beta chains # Organism: Fusobacterium nucleatum # 4 333 30 376 415 105 26.0 1e-22 MKGLRKYLSPFAPDQSGAAAVLCEFHGLIIILDAGGCAGNICGFDEPRWFESRSAIFSAG LRDMDAILGRDDRLVEKIGKACEKLSADFIAVIGTPVPAVIGTDYRALSRMIEKKTGISA LTIDTDGTKLYDDGEKKTWKELFKKFAVEKDVESGRIGIIGATPLEFGGIYEEDFLKKYF AGKGFSKVVCYGMGDGLDAVKEAAAAEKNIVVSPAGIAAAKYLQQKFGTPYELFCPPEII PEWKEKKEQVAGLLNVEELSEKKILIVHQQVLANTLREEFISANINVASWFMMNKEQKKE QDILFKEEDDWITYIKENEYDIIIADSLLKKAVPFYKGEWYDLPHFAISGKKRQSV >gi|222441938|gb|ACEP01000004.1| GENE 10 8327 - 10756 1964 809 aa, chain + ## HITS:1 COG:aq_672 KEGG:ns NR:ns ## COG: aq_672 COG0068 # Protein_GI_number: 15606085 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Aquifex aeolicus # 2 799 5 740 746 520 39.0 1e-147 MRVCIRVFGIVQGVGFRPTVKRHADACDIAGSVSNKGPYVEIFAEGSEECVHSFIKQIQE EPPKRAVILKLDVENVESGEDGIHKVESETDSKEKFQIIESEKEEGEIFVSPDIAICPEC KKELYDKNDRRYLHPFINCTCCGPRLTILDSMPYDRVRTSMGEFPMCEKCEYEYTHAETR RFDAQPVCCNDCGPEVYLLGRKERGADAIRYTRKVISEGGIVAVKGIGGFHLCCDAAKEE TVARLRQRKKRPMKPFAVMMKDLDVVRRECETEPHLEEILDGHQKPIILLPKKEGGTLCE SVAPDNPKIGVMLPYAPVQLLLFDYQDETKVSDCLVMTSANTSGAPICRDDEDALNELSG 
LCDVILSHDRKIRLRADDTVMDFYRGEPYMIRRSRGYAPLPFMMGNEFKGQVLAVGGELK NAFCIGKNQLFYPSPYIGDMGDVRTVKALKESVKRMEELLETKPQIVACDMHPSYNTRAA AEEMGLPVFLVQHHYAHILSCMAENEWTTEKKVIGVSFDGTGYGTDGTIWGGEILLADYD SFTRWGCIEPFAQTGGDASAKEGWRIAVSLLGKIYGKENALLIIETLGLCEPKLAKLQFT MEERGINTVQSTSAGRLFDAVSAILDIRKSSTFEGEASTSLQFAAEKWLDAQKKKIAGSE DFAESGIITDYGELKSISDVAQKSIVEKNNSINRNIKADLYYLPTLSLVKEVAERKLAGE NSNQLALHFHRRLAGMIVSACEKAREETGINTVALSGGVYQNKLLLDYSVTMLEERGFHV LRHHLLPPNDGGISLGQAVAAMRSLQKGE >gi|222441938|gb|ACEP01000004.1| GENE 11 10760 - 10969 383 69 aa, chain + ## HITS:1 COG:AF1369 KEGG:ns NR:ns ## COG: AF1369 COG0298 # Protein_GI_number: 11498965 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Archaeoglobus fulgidus # 1 66 1 66 77 57 42.0 4e-09 MCVGLPAKVMTMKDGMAVVDASGAKREVSAELIENLEPGDYVMVHAGIAIAKITEEDEDE AEDLLEGLL >gi|222441938|gb|ACEP01000004.1| GENE 12 10966 - 12021 1022 351 aa, chain + ## HITS:1 COG:CAC0811 KEGG:ns NR:ns ## COG: CAC0811 COG0409 # Protein_GI_number: 15894098 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Clostridium acetobutylicum # 11 346 19 358 361 303 45.0 2e-82 MKEQLKKAKAIIEGYDGPAVRVMEVCGTHTHEIFHLGVRKILPPQVELISGPGCPVCVTP AGFIDEAIWLALEKNVTICTFGDLVRVPGSTRSLADARSLGGKIQMVYSPMDAFEYAKEH KDEQVVFLSVGFETTTPASCLSVKMAKEAGLTNYALLAANKTMPGAYEMLKDSADVFLYP GHVNAITGNALNRKLCEEGGISGIVTGFTASEILTAFAVLLKKFPEGKPFFVNAYPRVVT EEGSIPAQKLIKEWMEPCDSQWRGLGMIKNSGLRLRKEAQDFDARVKFSIPKMEGRTSPA CRCGDVLQGKCRPTDCKVFGKGCTPLHPIGACMVSNEGACSAYYQYNSRGE >gi|222441938|gb|ACEP01000004.1| GENE 13 12097 - 13113 1438 338 aa, chain + ## HITS:1 COG:CAC0809 KEGG:ns NR:ns ## COG: CAC0809 COG0309 # Protein_GI_number: 15894096 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Clostridium acetobutylicum # 9 338 5 335 335 321 50.0 1e-87 
MDNLKGKTVSLAHGAGGKQTSELIDQVFKAHFANPELTSDDAAVLKMKEGKIAFTTDGFI VSPSEFPGGNIGKLSICGTVNDLSCMGAKPLYLSCAFVIEEGFPMEKLEAIAAAMEKTAN EAGVKIVAGDTKVAGKGQVDGVFITTTGIGEIQEGVETSGFMAKPGDAVIVSGDIGRHGC TILLAREDFGIDADVTSDCAPLWGTVQDILSVTKDVHVIRDATRGGVGTVLYEIAEQSKT GIRLNAEAIPVQDSVKGVCGMLGLEPLYLACEGRLVIFAPKENAEAIVAKLREGKYSSEA AIIGEVTEEMPGRVVVTTEIGAETLLPPPGGELLPRIC >gi|222441938|gb|ACEP01000004.1| GENE 14 13131 - 13379 330 82 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0861 NR:ns ## KEGG: Acfer_0861 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 1 82 1 82 91 102 60.0 5e-21 MEDTRIAVLSIIVEDSESTAKLNEMIHDYAEYVVGRMGIPYREKNISIISLVIDAPQTKI SELSGKIGRLSGVTAKAAYAKV >gi|222441938|gb|ACEP01000004.1| GENE 15 13395 - 14150 947 251 aa, chain + ## HITS:1 COG:MJ0879 KEGG:ns NR:ns ## COG: MJ0879 COG1348 # Protein_GI_number: 15669069 # Func_class: P Inorganic ion transport and metabolism # Function: Nitrogenase subunit NifH (ATPase) # Organism: Methanococcus jannaschii # 1 245 1 245 279 247 52.0 1e-65 MIKIAVYGKGGIGKSTTVSNVAAVFASQGLRVMQIGCDPKADSTVLLRHGEKVETVLDLV RTRKENFTLEEMVKEGFAGVCCVEAGGPSPGLGCAGRGIIAALEILEKKGAYEKYRPDVV IYDVLGDVVCGGFSMPMRRGYADKVFVITSGENMAIHAAANIAMAVENFKDRGYAQLGGL ILNRRNVKNEDAKVEELAEDIHSKVIGTLSRSALVTDAEEQKKTVIEAYPDSEMAEEYRI LAGKIRTLCEI >gi|222441938|gb|ACEP01000004.1| GENE 16 14189 - 15562 1059 457 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c23040 NR:ns ## KEGG: CLJU_c23040 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 33 457 19 442 442 468 51.0 1e-130 MLKMLHGKAKDEQWNNQIKEKLSPSLQEEIPFQDAAWPLPFVPGLEYNSPAHGTWNIVHM GMLLPGSHQIYVCGANCNRGVILTAAEMNAGDRFSFVEIKEEDLFNGQMEDLVIEGVSDI LHKLPKKPSVVLLFTVCVHHFMGCDLAYIYDTLRSRFPEQCFVDCYMDPIMQKEGLTPDQ KLRNALYKPLPMREKNLKQINIIGNDFPTREETELKTIAKAAGYTVKDITECEGYEEYLS MSESFLNIACYPLAKYGAEQLSKRLGMKFLYLPFSFHYDEINAQLKSLVKEMPGAVMPDT SLLRKRCERILQETKELVGDTKITIDATVVPRLLGLTRLLISHGFSVERIFTDAFSAEEK EDYLWLKENAPNLKVCATIHPNMRVYPRNPQEKILAIGQKAAYFSGTEHFVNMVEGGGLH 
GYDGICELCQKIQQAYQTKKDTKDLVVRKGLGCESCI >gi|222441938|gb|ACEP01000004.1| GENE 17 15579 - 16862 875 427 aa, chain + ## HITS:1 COG:FN0304 KEGG:ns NR:ns ## COG: FN0304 COG2710 # Protein_GI_number: 19703649 # Func_class: C Energy production and conversion # Function: Nitrogenase molybdenum-iron protein, alpha and beta chains # Organism: Fusobacterium nucleatum # 12 414 38 395 415 108 25.0 3e-23 MKQVSVTLSTYTADVSGVCSALYELGGMVVIHDPSGCNSTYNTHDEPRWYDMDSLVFISG LSQMDAIMGNDDKFINDIVRAAKELKPCFIALVRTPIPLMTGTDFEGIARVIEKQTQIPV FYFPTSGMQSYVSGAGMALETVARELVLPTNANGHFKENGIEIENVSNELSVAYTKKKNQ QKLNQSEQKQIENTESKQSNSKIKINILGATPLDFSVNSTLDSIKEFLSKHFEIISTFAM GSSIEDIQKAGEADVNLVISSVGFPAAKVLEERFSTPYVIGTPVKGFAGIIAEKLIDAAW TGKSQAAYFSVTSSGKSISRATNGIYIIGESVISQSLKAAMALKQGIDATVICPLETEPE YIGENVLLFSSEEEIKAAIAEAKTVIADPIYKTICADETNFIALPHEAFSGRIYRKEIPN LMEWVTI >gi|222441938|gb|ACEP01000004.1| GENE 18 17538 - 18134 709 198 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0619 NR:ns ## KEGG: EUBREC_0619 # Name: not_defined # Def: cytidylate kinase # Organism: E.rectale # Pathway: not_defined # 1 194 14 206 208 171 47.0 2e-41 MKKNIIAISREFGSGGRTIGRLVAQKLGIQFYDRDIIKKVVEESGLTRKYVEHYGEFAPS ADQRFAYSFVGLDEDSNSPLVQLWKTREKVIKDFATAKPCVIVGSCADYILRDREDCLKV FLYADEETKENRIQEIYGEAGLKLKNRVKDMDVRRGLNYKYFTGQDWGKAQNYDLALNRG SLGIEKCAQLIVDAASEE >gi|222441938|gb|ACEP01000004.1| GENE 19 18316 - 20103 1781 595 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 593 3 582 584 451 40.0 1e-126 MKRELELLREKMRETGVDACLIPTSDFHGSEYVGDYFKCREYISGFTGSAGTLVVTLDEA GLWTDGRYFLQAAKQLEGSGIMLRKERQPGVPAIEEYLKQTLKKGETLGFDGRCIMQDSA EKLITQLNAQGVAVRTDIDLTGAVWKNRPELSAQPVWPLPVEYAGESSESKIKRVREFLV EKKADYFLLTSLEDIAWLLNMRGNDVESTPVILSYLLLGEKKLTWYVQEKCLSEKIKILL DMQGIKAAPYAQIYEDVKKLPEDASIYYDKSAVNTALVSSLPEKVKKIEGVNPTFLFKAK KNPVEVENERNAHIKDGVAVTKFIYWLKSQIGKTKITEISAAEQLEQFRNTQEHYVEPSF 
APIIAYKEHGAIVHYSATKESDVELKPESFVLADTGGHYLEGTTDITRTIALGSLTQEEK EMYTTVLKGHIQLEMARFLQGCSGQSLDVLARTPLWEKGLDYNHGTGHGVGYLLSVHEGP NSFRYRPSVNGRNDCVFEEGMITSDEPGIYLEGKFGIRLENMIVCQKDMENDYGSFLCFD ALTLVPFERSAIIAEELSTKEKEWLNKYHQKVFETIAPYLTEEEAGWLREETQEF >gi|222441938|gb|ACEP01000004.1| GENE 20 20222 - 20944 186 240 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 2 237 4 238 242 76 25 2e-13 MKSVIITGATRGIGLALARFYADKGYCLMLNGGHNEEALAAVEKELSTKTQVITCLGSVA EEKTAATIVQKTLDTFGQIDLLINNAGIAHIGLLSDMSSEEWHTLMGTNLDSVFYMSKAV IPHMVHCHSGKILNVSSVWGEVGASCEAAYSASKGAVNSLTKALAKELAPSHISVNGVSF GVIDTDMNRCFDEEERASLAEEIPYGRFASPEEAAAFCYQITESPEYLTGQIIRFDGGWI >gi|222441938|gb|ACEP01000004.1| GENE 21 21329 - 22813 1802 494 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025838|ref|ZP_03715030.1| ## NR: gi|225025838|ref|ZP_03715030.1| hypothetical protein EUBHAL_00063 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00063 [Eubacterium hallii DSM 3353] # 1 494 1 494 494 689 100.0 0 MKKPSLIKRFFSFLLRLCVVLVLMAAIAVGSFEGVTYYLTGSFTSVKKAVKEETDQSGTE DSTEATTVDNKNMENTLVFVHDDMNNKDYTALNMYNTKTEAMDVLLLPCNAQVSVSSTLL KEIRNTMTEAGSSVNMADVARAFGDKKYEMFAKIVEDISGAKISGYDVISSTNFKKLLNV AGGVSYHFNNAISYRDTENILQTIDAGDVILDGTTAYALFTYMDGTDGEESSRLERVNTY LTSYMEALLSKNSTSAIAKKYDSLVTAEGKSGLTSTEDILKNLTTDALTIRIMQGSESKG IFTLDSQKVKLQVAGLAKQAEAYSQSSSSKSSTGTTSTSAATGSTESSKNYSIEIYNAAY VSGLAGQWESYLESEGYSISLVDSYQEEGPLSQTRINVTQDGMGQDLLTYFPDAEINVVD SISTGGDIQIYIGTDSTNVPAGSDSTTDTEDETVTDSSLDDESDSTDSTDSTDVGDDDDT TTSGGYDFGSDSEQ >gi|222441938|gb|ACEP01000004.1| GENE 22 22865 - 23713 755 282 aa, chain + ## HITS:1 COG:no KEGG:Clole_3500 NR:ns ## KEGG: Clole_3500 # Name: not_defined # Def: glycosyl transferase, WecB/TagA/CpsF family (EC:2.4.1.187) # Organism: C.lentocellum # Pathway: not_defined # 11 245 7 239 244 64 23.0 4e-09 MKDRYHKKANVLGIDVSVMRVDKAVNVSMELMRGKGLPVIYFLSAISSLLCQNNEQSAEY VKSCNLILAGDRHIEMAVHHRQDNGQIPEGIGEFADNYLKRLFSKINREGRSIYMVMEQE 
SYLESMEEYMEESYRNIEVHGNVMRKETKGEADRIVNEINACIPDIVFVCIPAERQMKFM EKHAAMMNTRLCILIESVQPLIRKETEEIPSWVEKLHLEGIYSWFKKEQKIKNTIVGSVF KKKVLNEASEEESEQKPEDSSEENILDVQEENVNTLEKRRQY >gi|222441938|gb|ACEP01000004.1| GENE 23 24024 - 24755 231 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 235 2 239 245 93 25 1e-18 MASLSLRHLSKTYDNGVNAIRDFTLEVDDKEFMIFAGPVGCGISTVLRMIAGVEDITDGE LLVDGERMNEVPSIDRNMAMIFKNGKLYPQMNIYDNLAFGLKLKELPKGEIDARIAEVTE VLKIGHLLDKMTEDLDEQERALVVLGRAMVKKPKVFLLDEPFSSLSKDVKRHMQKLVRKL YDQMHITVIYVTHNREDIQNTDARIAVMNDGSIQQIGTIGELYDNPATPYVSEFLGVAAQ AVC Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:35:30 2011 Seq name: gi|222441937|gb|ACEP01000005.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont4.1, whole genome shotgun sequence Length of sequence - 7517 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 1, operones - 1 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 3 - 73 64.6 # Cys GCA 0 0 + TRNA 129 - 202 68.0 # Met CAT 0 0 + Prom 922 - 981 9.9 1 1 Op 1 20/0.000 + CDS 1056 - 2237 1819 ## COG0183 Acetyl-CoA acetyltransferase + Term 2263 - 2297 5.1 + Prom 2277 - 2336 3.8 2 1 Op 2 7/0.000 + CDS 2368 - 3153 1137 ## COG1024 Enoyl-CoA hydratase/carnithine racemase 3 1 Op 3 7/0.000 + CDS 3222 - 4061 1216 ## COG1250 3-hydroxyacyl-CoA dehydrogenase + Term 4079 - 4122 7.3 4 1 Op 4 2/0.000 + CDS 4162 - 5316 1507 ## COG1960 Acyl-CoA dehydrogenases 5 1 Op 5 29/0.000 + CDS 5331 - 6116 1201 ## COG2086 Electron transfer flavoprotein, beta subunit 6 1 Op 6 . 
+ CDS 6142 - 7173 1512 ## COG2025 Electron transfer flavoprotein, alpha subunit + Term 7231 - 7264 -0.4 Predicted protein(s) >gi|222441937|gb|ACEP01000005.1| GENE 1 1056 - 2237 1819 393 aa, chain + ## HITS:1 COG:CAC2873 KEGG:ns NR:ns ## COG: CAC2873 COG0183 # Protein_GI_number: 15896127 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA acetyltransferase # Organism: Clostridium acetobutylicum # 3 392 2 391 392 461 62.0 1e-130 MAKKVVLAGACRTAIGTMGGGLSTVPAADLGSIVIKEALKRTNVPADQVDEVLMGCVIQA GLGQNVARQASINAGLPIEVPAVTINVVCGSGLNCVNLAATKILAGEADIVIAGGMENMS LAPFCLDKARFGYRMNNGVLKDCMVNDALTDAFNQYHMGITAENVAEQWHLTREELDEFS ANSQQKATAAIEAGKFKDEIVPVEVKLKKKTVVVDTDEGPRPGTTAESISKLRPAFKKDG IVTAANSSGINDGAACVVVMSEEKAKELGVTPMATWVAGALGGVDPSIMGVGPVASTKKV LAKTGMSIDDFDLIEANEAFAAQSLAVGRELGINPEKLNVNGGAIALGHPVGASGCRILV TLLHEMQKRDAKKGLATLCIGGGMGCSTIVERD >gi|222441937|gb|ACEP01000005.1| GENE 2 2368 - 3153 1137 261 aa, chain + ## HITS:1 COG:FN1020 KEGG:ns NR:ns ## COG: FN1020 COG1024 # Protein_GI_number: 19704355 # Func_class: I Lipid transport and metabolism # Function: Enoyl-CoA hydratase/carnithine racemase # Organism: Fusobacterium nucleatum # 1 251 1 251 258 341 68.0 8e-94 MGFIDYEVDGQVGIITINRPKALNALNSEVLKDLDATIDAVDLDAIRCLIITGAGEKSFV AGADIGEMSTLTKAEGEAFGKAGNDVFRKIETLPIPVIAAINGFALGGGCEISMSCDIRL CSENAVFGQPEVGLGITPGFGGTQRLARIVGPGKAKQLIYTARNIKAAEAYRIGLVNEVY PLEELMPQAKKMAKGIAKNAPIAVRACKKAINEGLEVGMDEAIVIEEKLFGSCFETEDQK YGMAFFLDKNKEKVKEPFKNC >gi|222441937|gb|ACEP01000005.1| GENE 3 3222 - 4061 1216 279 aa, chain + ## HITS:1 COG:FN1019 KEGG:ns NR:ns ## COG: FN1019 COG1250 # Protein_GI_number: 19704354 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyacyl-CoA dehydrogenase # Organism: Fusobacterium nucleatum # 1 279 1 279 279 438 78.0 1e-123 MKVGVIGAGTMGSGIAQAFAQTEGYEVYLCDINDEFAANGKKKIAKGFEKRVARGKMDQA KADAILAKITTGTKDICTDADLVVEAALEVMEVKQQTFKELQDIVPATCMFATNTSSLSI TQIGAGLDRPVIGMHFFNPAPVMKLIEVIAGLNTPDEMVEKIKAISVEIGKTPVQVEEAA 
GFVVNRILIPMINEAVGIYADGVASVEGIDTAMKLGANHPMGPLALGDLVGLDICLAIME VLYNETGDPKYRPHPLLRKMVRGGKLGQKTGIGFYDYSK >gi|222441937|gb|ACEP01000005.1| GENE 4 4162 - 5316 1507 384 aa, chain + ## HITS:1 COG:FN0783 KEGG:ns NR:ns ## COG: FN0783 COG1960 # Protein_GI_number: 19704118 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 384 1 381 381 489 66.0 1e-138 MDFTLSKKHEMARQLFKEFAENEVKPLAQEVDETEHFPEETVAKMQKLGFMGIPVPKEYG GQGCDPLTYIMCVEELSKVCGTTGVIVSAHTSLCADPILTYGTEEQKQKYLVPLAKGEKL GAFGLTEPGAGTDAQGVQTKAVLDGDEWVLNGSKCFITNGSYADYYIIIAITSVDTDARG RKKKKFSAFIVEKGTPGFTFGTKEKKMGIRGSATYELIFQDCRIPKENLLGPMGKGFAIA MHTLDGGRIGIAAQALGIAEGALDATIAYVKERKQFGRAIAAQQNTQFQLANMATQVEAA KLLVYKAAMAKATQRVYSVEAAKAKLFAAETAMDVTTKCVQLLGGYGYIREYDVERMMRD AKITEIYEGTSEVQRMVISGNLLK >gi|222441937|gb|ACEP01000005.1| GENE 5 5331 - 6116 1201 261 aa, chain + ## HITS:1 COG:FN0784 KEGG:ns NR:ns ## COG: FN0784 COG2086 # Protein_GI_number: 19704119 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Fusobacterium nucleatum # 1 261 1 262 262 242 53.0 5e-64 MKIVVCVKQVPDTKGGVKFKPDGTLDRAAMLTIMNPDDKAGLEAALRIKDETGAEVTVVT MGLPKAEEVLREAMAMGADKGILVTDRVLGGADTWATSTTIAGALRNIDYDLIITGRQAI DGDTAQVGPQIAEHLDLPLISYAQDIKVEGDSVIVQRQYDDRYHVLKAKMPCVITALSEL NEPRYMTPGGIFDAYDAEITTWGRKDLKDVDDSNLGLAGSPTQIAKASDKVRKGKGVMMT AETAEEGVEYIMDKLLTKHVI >gi|222441937|gb|ACEP01000005.1| GENE 6 6142 - 7173 1512 343 aa, chain + ## HITS:1 COG:FN0785 KEGG:ns NR:ns ## COG: FN0785 COG2025 # Protein_GI_number: 19704120 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Fusobacterium nucleatum # 1 339 1 385 391 334 51.0 1e-91 MNLEQYKGVFIYAQQVDNKIDAIALELLGKANDLAADLGTDVTAVLIGSDVKGLVDELAE YGADKVIVVDDPELKEYRTEPYTHALASVINEYKPDILLVGATAIGRDLGPRVSARVATG LTADCTVLEIGEFKARGDKEPRQGQLLMTRPAFGGNTIATIACPNHRPQMATVRPGVMQK KEPVAGAKAEVIEYNPGFTPNDKYVEILDIVKSVTETVDIQAAKILVSGGRGVGSAENFA 
LLDDLAEALGGTVSCSRAVVDNGWKPKDLQVGQTGKTVRPNVYFAIGISGAIQHVAGMEE SDIIIAINKDPDAPIFDVADYGIVGDALKIVPKLTEAIKAQVK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:35:36 2011 Seq name: gi|222441936|gb|ACEP01000006.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont5.1, whole genome shotgun sequence Length of sequence - 23470 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 15, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 98 - 168 51.0 # Trp CCA 0 0 - Term 89 - 156 30.2 1 1 Tu 1 . - CDS 202 - 297 71 ## - Prom 336 - 395 6.3 2 2 Op 1 . + CDS 487 - 621 65 ## gi|225025849|ref|ZP_03715041.1| hypothetical protein EUBHAL_00077 3 2 Op 2 . + CDS 627 - 923 216 ## Acfer_0912 transcriptional regulator, LysR family + Prom 1103 - 1162 6.6 4 3 Tu 1 . + CDS 1195 - 1851 588 ## lhv_1650 hypothetical protein + Prom 1910 - 1969 6.8 5 4 Tu 1 . + CDS 2047 - 3021 889 ## COG3049 Penicillin V acylase and related amidases - Term 3057 - 3094 -0.6 6 5 Tu 1 . - CDS 3211 - 3507 424 ## gi|225025854|ref|ZP_03715046.1| hypothetical protein EUBHAL_00082 - Prom 3543 - 3602 9.6 + Prom 3697 - 3756 6.1 7 6 Op 1 . + CDS 3954 - 5918 2361 ## COG1151 6Fe-6S prismane cluster-containing protein 8 6 Op 2 . + CDS 5912 - 7258 1488 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 7333 - 7375 0.4 + Prom 7477 - 7536 6.0 9 7 Op 1 11/0.000 + CDS 7655 - 10252 2956 ## COG1882 Pyruvate-formate lyase 10 7 Op 2 . + CDS 10290 - 11222 931 ## COG1180 Pyruvate-formate lyase-activating enzyme 11 7 Op 3 . + CDS 11283 - 12476 1296 ## COG0192 S-adenosylmethionine synthetase + Term 12541 - 12569 -0.0 + Prom 12771 - 12830 9.8 12 8 Tu 1 . + CDS 12939 - 13283 549 ## COG0662 Mannose-6-phosphate isomerase + Term 13419 - 13453 -0.6 + Prom 13678 - 13737 11.6 13 9 Op 1 7/0.000 + CDS 13812 - 15044 1008 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 14 9 Op 2 . 
+ CDS 15055 - 16077 910 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain
- Term 16302 - 16345 0.1
15 10 Op 1 . - CDS 16394 - 16750 242 ## TSIB_1637 hypothetical protein
16 10 Op 2 . - CDS 16729 - 17136 480 ## gi|225025868|ref|ZP_03715060.1| hypothetical protein EUBHAL_00096
- Prom 17317 - 17376 10.2
17 11 Tu 1 . + CDS 17113 - 17307 57 ## gi|225025867|ref|ZP_03715059.1| hypothetical protein EUBHAL_00095
18 12 Tu 1 . + CDS 17375 - 17977 815 ## Sterm_3311 hypothetical protein
+ Term 18014 - 18057 -0.7
+ Prom 17984 - 18043 3.4
19 13 Tu 1 . + CDS 18098 - 18679 774 ## COG0693 Putative intracellular protease/amidase
+ Prom 18719 - 18778 3.6
20 14 Op 1 . + CDS 18806 - 19036 394 ## gi|225025871|ref|ZP_03715063.1| hypothetical protein EUBHAL_00099
21 14 Op 2 1/0.000 + CDS 19043 - 20944 2046 ## COG0457 FOG: TPR repeat
+ Prom 20960 - 21019 4.1
22 14 Op 3 . + CDS 21041 - 21931 846 ## COG1404 Subtilisin-like serine proteases
+ Prom 22104 - 22163 9.2
23 15 Tu 1 .
+ CDS 22208 - 23356 1688 ## COG1454 Alcohol dehydrogenase, class IV + Term 23402 - 23448 15.2 Predicted protein(s) >gi|222441936|gb|ACEP01000006.1| GENE 1 202 - 297 71 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVFSLVDEEYLDANGNILWDKFTLFGQNKRA >gi|222441936|gb|ACEP01000006.1| GENE 2 487 - 621 65 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025849|ref|ZP_03715041.1| ## NR: gi|225025849|ref|ZP_03715041.1| hypothetical protein EUBHAL_00077 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00077 [Eubacterium hallii DSM 3353] # 1 44 1 44 44 90 100.0 5e-17 MCVCVPDLCVTTAKIFMWEIKEHLGIGKPITIGNDVYRIFRNSR >gi|222441936|gb|ACEP01000006.1| GENE 3 627 - 923 216 98 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0912 NR:ns ## KEGG: Acfer_0912 # Name: not_defined # Def: transcriptional regulator, LysR family # Organism: A.fermentans # Pathway: not_defined # 1 77 193 266 287 93 53.0 3e-18 MNQKYSSYHNLIQCCSDFGFMPNIVLCTMENSLIYRFCQEKIGIGIDADVHPEKELLAGL RKIELYDAIPWKINLVCRKNTVNDKVISRLQEICCNLD >gi|222441936|gb|ACEP01000006.1| GENE 4 1195 - 1851 588 218 aa, chain + ## HITS:1 COG:no KEGG:lhv_1650 NR:ns ## KEGG: lhv_1650 # Name: not_defined # Def: hypothetical protein # Organism: L.helveticus # Pathway: not_defined # 1 215 11 224 445 96 28.0 6e-19 MKLNRFEVVTGRYIEEIPGQSRIGYAMSDTTDFYDMIEWSKKGGYQGSTISFYDYNNGKV YEPFQKQRNVLYGTPVYLKKSFWFLQGDYNSGKITLFKYYPDKNPEMIIQLNIEDVDLYN LRIIGEDVYIVSEDDEFVSYYPESFRFSKGVNESVSMIADQKVYLSAWVEEGWDDENDCE TEEYNYYEKVVERDFKGNLLSETLGSLQQHADGTWWIA >gi|222441936|gb|ACEP01000006.1| GENE 5 2047 - 3021 889 324 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 305 1 309 328 184 33.0 2e-46 MCTAATYKTQDFYFGRTLDYEFSYGDEITVTPRKYPFHFRHMGEVVSHYAMIGMAHVAGD YPLYYDAVNEKGLAMAGLNFVGNAVYQEVEEGRENVAQFEFIPWILSKCATVKEARESLN KMNLVGTPFSEQLPSAQLHWIIADENEAITVECMKDGMHIYDNPVGVLTNNPPFEQQMFQ 
LNNYIGLSPKQPENRFSDKLNFNAYSRGMGALGLPGDLSSTSRFVRVAFTKMNSFSGVSE KASISQFFHILGSVDQQRGCCEVADGKFEITLYTSCCNTNKGIYYYTTYENHQISAVDMF GENLNEKELIKYPLIKGEQIYWQN >gi|222441936|gb|ACEP01000006.1| GENE 6 3211 - 3507 424 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025854|ref|ZP_03715046.1| ## NR: gi|225025854|ref|ZP_03715046.1| hypothetical protein EUBHAL_00082 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00082 [Eubacterium hallii DSM 3353] # 1 98 1 98 98 150 100.0 3e-35 MIEIANLEEWTKEYFSDPENQKKAEKACERYDRLMVKNIKRQLSGGAEKIFLNEEPADDP GKCMEKAKYEVIPFAKVDGKKGKVKINMLDQTAEFVPE >gi|222441936|gb|ACEP01000006.1| GENE 7 3954 - 5918 2361 654 aa, chain + ## HITS:1 COG:CAC0116 KEGG:ns NR:ns ## COG: CAC0116 COG1151 # Protein_GI_number: 15893412 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 647 9 629 629 732 56.0 0 MADKICSSADKVLEVFLENAPMDTSHHRMVKQQNKCGYGLQGVCCRLCSNGPCRLSPSKP KGVCGADADTIACRNFLRQVAAGSGCYTHVVENTARRLKELAQELQAEGKKPKYKDSVAK LAKILQINCCGNCGPDCHNSCAKTAEMIADAVLADIRKPYDEKMTLMKNIALPKRYELWE KLGILPGGAKDEIFNAVVKTSTNLNSDPMDMLLQCLRLGISTGNYGLILTNLMNDIIMGP PQISMDPVGFRIIDPEYINIMITGHQQSMFADLEEKLESEIVQKSAELVGAKGIRIVGCT CVGQDYQARSGCYKDVYCGHAGNNYTSEAVLMTGCVDLVVSEFNCTIPGIEPICEQLDIK MLCLDDVAKKANAELLPYTAEEKEKITSQIIADALCGFKNRKEKLYGTAPAEGEKRVNVM AQHGFDKSITGLSEDTLVAALGGTLQPLIDAIVAGKIKGIAAIVGCSNLRAKGHDVFTVE LAKELIKKDILVLSAGCTCGGLENCGLMTMEAVELCGEGLKEICTALGVPPVLNFGPCLA IGRIEMAACALAKELNVDLPQLPVVISAPQWLEEQALADGAYALALGFPLHLALSPFVTG SQVAVNVLTEGLKDLTGGQLIIETEVEDAAQKFEEIIKEKRIGLGIDKGGDAVC >gi|222441936|gb|ACEP01000006.1| GENE 8 5912 - 7258 1488 448 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 79 448 3 363 372 289 44.0 9e-78 MLMNRTTPFMVPVDEGNPAIIKDEALCSQCGHCFAICEEEIGVAAKYLLNPRETYQCIGC 
GQCSANCPEKAIIGKPHYKIVKELIKDPDKIVVFSTSPSVRVGFADGFGKEPGTFAQDEM VGALRALGADYVFDVTFSADLTIMEEGSELLSRILKGTGPLPQFTSCCPAWVKYMENFHP DKTAHLSSAKSPIGMQGAVIKTYFAHKKHIDPEKIISVAVTPCTAKKAEIAREELCDAGK LLNIEEMRDNDYVITTKELVQWCQEEGLDLDKITPSKFDSVLGEGTGAGMIFGNTGGVME AALRTVYRVLEGKEAPADFYQLRPVRGLSNRKEAEVTIAGKKLRVCILYGTAAAEEFLAE DMNGYHFVEVMTCPGGCISGAGQPDCGSVPVSDVVRKKRIQSLYQADEQAERRNSMDNPE IGTIYHEFFKEPLSMLSEKLLHTTYKRK >gi|222441936|gb|ACEP01000006.1| GENE 9 7655 - 10252 2956 865 aa, chain + ## HITS:1 COG:ECs4880 KEGG:ns NR:ns ## COG: ECs4880 COG1882 # Protein_GI_number: 15834134 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli O157:H7 # 10 592 2 565 765 387 34.0 1e-107 MIAKGFTEPTARIKALKAEVVNAVPEVEPYRAVLITESYKETEGMPAVMRRAKANEKIFN NLPVVIRDNELIVGALSVKPRSTNLCPEFSFDWVEKEFKTMATRMCDPFVITEETADILH EVFKYWKGKTTSELATSYMSQDCLDCQAGGVFTVGNYYFGGIGHVCVDYGKVLKIGFKGV IAEVVEAMNKLDANDPEYLQKRAFYEAVIISYNAAINFAKRYSKKAAEMAATCTDPVRKA ELLQISKNCDRVPENGATNFYEACQSFWFVQALVQIEANGHSISPGRFDQYMWPYLEADK SISKEFAQELLDCLFVLLNHVNKTRDDVSDQAFAGYAVFQNFGVGGQTEDGLDATNPVSY MCMDAAAHVRLPAPSFSVRIHNQTPDEFLLRACELARLGTGVPAMYNDEAIIPALCNRGL TLADARNYCIIGCVEPQCPHKTDGWHDAAFFNVAKVFDIAIHGGKNRDGKQLGPVTKPMP EWKSMDDLYEAYETQIEYFVSKLVEADNAVDIAHKERAPLPFMSALVDDCIGRGKTVMEG GAIYNFTGPQAFGNVDTGDAIYCIKKHVFEDKDLTMQQIYDAMEHNFGAELGAGCYDGPF VRLSTDSAEPAAAAMESVSVSSEDSMESIINAVVQKILAEKGSNLSMSVDTKSEACTSCS DAQRAEYDRIRHILDATPCFGNDIDEVDMCARKATQVYSHEVEKYKNPRGGQYQAGCYPV SANVLFGKDVQALPDGRYSNAPLADGVSPRQGHDVKGPTAAGNSVAKLDQLNISNGTLYN QKFLPSAVAGDQGLLNFAAVVRNYFDKKGMHVQFNVIDRETLLDAQAHPENYKDLVVRVA GYSAHWTVLAKEVQDDIIARTEQHF >gi|222441936|gb|ACEP01000006.1| GENE 10 10290 - 11222 931 310 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 13 310 5 300 302 220 35.0 3e-57 
MSNMEPSCDTSKKGLVFNIQRFSVNDGPGVRTIAFLNGCPLRCKWCCNPESQELKPVVMF KAQNCVGCGNCEVVCPTGASNLNVPGKIDHTKCIACGKCIDVCYHRALEMSGKWMTVEEL MGELYKDRVIYRKSGGGITVSGGEAMVQHEFLLELLKACKSFGWTTAIETTGMASEEVIK EIVPWLDVVMMDIKHIDPEVHKEYTGVSNEQILKNALLISSLAKKMIIRVPVIPGFNASP NVIAAIAKFTTYLHNVNELHLLPYHDLGSNKYGMLGKEYELADVQKPENEFMEELKAIVE KEGLTCKIGG >gi|222441936|gb|ACEP01000006.1| GENE 11 11283 - 12476 1296 397 aa, chain + ## HITS:1 COG:BH3300 KEGG:ns NR:ns ## COG: BH3300 COG0192 # Protein_GI_number: 15615862 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus halodurans # 7 385 9 399 399 442 56.0 1e-124 MNKDITLLTSESVTEGHPDKVCDMISDALLDAYLEEDPSAHVALETMASEDTVFIAGEVA SFAKVDVKNKAREVICEIGYDSKEKGIDGKECLIITNIGVQSPDIAQGVTQCNGTIGAGD QGMMYGYACDDTEELMPLSISLAHKLSKRLTQVRKEGILPYLYPDGKSQVTVAYNNQGEI LGVTDVVISAQHNAEVSVQKLRMDIEEHVIREVIPKELLLPDVKIHINPTGRFVVGGPAG DVGLTGRKIIVDTYGGIARHGGGAFSGKDPTKVDRSAAYMARYAAKNVVAAGFCKKCEIN LAYAIGIPQPVSIKIDTFGTEEIPKEIIEDTVNEVFDFSVAGIINSLSLTKVKYKPVAAY GHFGRTDLDLPWERLDKVNILQEKAVARVAEEIYKNR >gi|222441936|gb|ACEP01000006.1| GENE 12 12939 - 13283 549 114 aa, chain + ## HITS:1 COG:TM1287 KEGG:ns NR:ns ## COG: TM1287 COG0662 # Protein_GI_number: 15644042 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Thermotoga maritima # 1 111 8 118 121 80 39.0 1e-15 MVRKANIIQKQELGGGKGCAEVHEIVPEKDLYGHGKLYAKVVLKPNSSVGWHRHVGETEP YYILEGEGIFVDDDGSRTKVGPGDVCTILPGQCHAIENASSCKDLAFMALIHKD >gi|222441936|gb|ACEP01000006.1| GENE 13 13812 - 15044 1008 410 aa, chain + ## HITS:1 COG:FN0190 KEGG:ns NR:ns ## COG: FN0190 COG2972 # Protein_GI_number: 19703535 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Fusobacterium nucleatum # 210 407 350 552 552 73 26.0 5e-13 METEKKNGRGKKEVSLLGLFGKKFLEETQGKLAEATGISFFIIDYKGEPVTKGEQEEDFC EKRKEEDIRCSECQITRAFAAAKSAIKNCPYIFTCPNGLVHMAIPIVVNNQYIGAIIGGP 
IRCEKKALKDSELGDTEQERKAFLDKLGEETEYQNITSFTEKRVVAVADMVFLLFKEMGV KETTAIKLGALEHKEVHLSDIRKRNKLLEDKVSQLEYELLKGKLPKQFMLNLLTTVSSFS ILENASRTENITAEMASVLRYYLDNSSEVISLSEELKQIECYLDILRQQYDNRLEYRVYC QKEVEKQKIPIIGIFPFVEQLVDFGILAGHFRGTLYIDAEKKEDFCKVVLQLESSGLSVG HFGADITDGNFAQMQEQDTRKRYEREFGKDFEITADPNVITIQIPFQEIK >gi|222441936|gb|ACEP01000006.1| GENE 14 15055 - 16077 910 340 aa, chain + ## HITS:1 COG:BH2728 KEGG:ns NR:ns ## COG: BH2728 COG4753 # Protein_GI_number: 15615291 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 152 330 315 500 510 76 26.0 8e-14 MINILLIDQEPLFQQAFSRMIAETEECQLIGIAENSKEAFEIASRYHPQIVFCDVVLGME NGIALCNLIKEHFPEITTYILSNYCSFQMIRHSMDAGVEEYLYKPLSRKKLSELIERNNA SSLNEEENLQQEALLAAIEQKDYKKAYDTAKELVEYLFEECERRERRSKLSMLESSLFYM IPGMDNIQKNYYFQKYEITMKVLTKSAICYHWLIQIITEVFRQLCTMKYAHMNKVFQYIE NNKNNEISLSELADEAGISSGYLSRIFKKYYKISVVDYIHLRKLLQAKYYMATSEMNISD ISFLLGYSEAGYFCKIFKKYEGQTPSAFNKNFQKLTSKAG >gi|222441936|gb|ACEP01000006.1| GENE 15 16394 - 16750 242 118 aa, chain - ## HITS:1 COG:no KEGG:TSIB_1637 NR:ns ## KEGG: TSIB_1637 # Name: not_defined # Def: hypothetical protein # Organism: T.sibiricus # Pathway: not_defined # 11 110 3 102 102 99 45.0 4e-20 MARKFKLNEDKFKVAFIFANEGCNPKKDRSIIESEKIIMYVVGCADYSQAETVAQEMLEE GCAAIDLCGAFGNEGVARISKAVDRQIPVGAVHFDFHPAFRDKSGDDLFQKDFWCIGL >gi|222441936|gb|ACEP01000006.1| GENE 16 16729 - 17136 480 135 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025868|ref|ZP_03715060.1| ## NR: gi|225025868|ref|ZP_03715060.1| hypothetical protein EUBHAL_00096 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00096 [Eubacterium hallii DSM 3353] # 1 135 1 135 135 268 100.0 1e-70 MRIGIYNHQHLRGGCPICTQTYVQGMIADGMKCMNRAKGIYGEDNEFVNYTDLGDYPSEN LERPGFRRLMKDIEEDNLDVIVVITLDKITTDIDLLMETYQTIRKHNIQLITANDGKKAM EILDKALVKWQENSN >gi|222441936|gb|ACEP01000006.1| GENE 17 17113 - 17307 57 64 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|225025867|ref|ZP_03715059.1| ## NR: gi|225025867|ref|ZP_03715059.1| hypothetical protein EUBHAL_00095 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00095 [Eubacterium hallii DSM 3353] # 1 64 1 64 64 95 100.0 1e-18 MIINSNTHINSLQYIFLNIDSYIYCSRRAKEDREDTGTIMGNNESQKKNRKRNKEYRYIS IDKA >gi|222441936|gb|ACEP01000006.1| GENE 18 17375 - 17977 815 200 aa, chain + ## HITS:1 COG:no KEGG:Sterm_3311 NR:ns ## KEGG: Sterm_3311 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 4 199 7 205 205 139 44.0 6e-32 MERRELERQIVEKEWLMFQQVQGVNGRAACQDDWTTFLIMRLSQFEGWDTDVLESYFEDI VQAEAQERNLIMEKYAYMMEETDPVYFLSIKELLPEISEEKQLMAEKITSIYMEWEKEAD IKYPNIRRHGRPAEGIGVDGTVSVRNYLKCELYTYSAKTLALFILSIEKNPEYNRYLATM QKMVQAYGYSSLEEAEKEMA >gi|222441936|gb|ACEP01000006.1| GENE 19 18098 - 18679 774 193 aa, chain + ## HITS:1 COG:PA4336 KEGG:ns NR:ns ## COG: PA4336 COG0693 # Protein_GI_number: 15599532 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Pseudomonas aeruginosa # 2 186 5 189 194 231 57.0 6e-61 MQKILLLAGDFTEDYETMVPFQALSMLGYQVDAVCPGKKAGEFIKTAIHDFEGDQTYTEK PGHLFKLTKTFDEVDFDDYIGLFITGGRSPEYIRMDHKVISLVKCFVRSGKPVAAICHAA QVLTAADVVCGRKLTCYPALAAEVKLAGGNYIEVAPDEAVVDCNLITSPAWPGNTAILRE FAKALGCEFIFRP >gi|222441936|gb|ACEP01000006.1| GENE 20 18806 - 19036 394 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025871|ref|ZP_03715063.1| ## NR: gi|225025871|ref|ZP_03715063.1| hypothetical protein EUBHAL_00099 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00099 [Eubacterium hallii DSM 3353] # 1 76 1 76 76 149 100.0 9e-35 MGSLLEEMHKMRLKEKIDELESMGGNPHVADHGLGGVYRCKVCGYIFDENKEGRPFTTLS HCPICHVGQQQFEKIE >gi|222441936|gb|ACEP01000006.1| GENE 21 19043 - 20944 2046 633 aa, chain + ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 10 633 6 599 599 
356 35.0 9e-98 MKMERFNEILNKLMQMEDAEENLEMVPLYEEALELSKEIYGEYNLKTLEIYNNYGGHLRN LGLYEKAEQILRKAVICARTVRGKEHPDYATTLVNLANLLRMMKQWEESETLFYQALALY KITIGESHFIYAGTMNNLGLLYYEQGKLERAKECLEHSLHILEGKAEYIIPYATTLHNLV DIYKKEGNLTLAEETLKKEIEIYKEQQYEGTVLYAAALNSLGILYCEKGQYEKAKAVMTE SVDITKKHLGESSDAYKTSVKNLEMIQEKLQEHKIKSNHEILQETLKEMTTASCAQEYNL ETAMASARKVLENSPKVVETGFVKGLDLCRAYFNEVCYPLLEREFANFLPRMAAGLIGEG SECYGFDDEISRDHDFGPSFQIYIPKEDMPVYGERLKHRLATLPKTFQGFGARVESQYGD GRVGVFTIEDFYRKFTAAEGVPDTLSHWRQIPENALSTVTNGEVFFDNYGEFTRIREELK KGYPEDVRLKKIAARLMKMAQSGQYNFPRCNKRKEYVASRLALSEFMSVSMSLVYLLNHA YRPYYKWVHRGLLDLPILGQNAYDKMQRLSVLSLEKDSREMEWIIEEFCVACVEELKAQG LTSSSEAFLLAQGPEVLKRIQEPALRNSNPWVE >gi|222441936|gb|ACEP01000006.1| GENE 22 21041 - 21931 846 296 aa, chain + ## HITS:1 COG:BH1930 KEGG:ns NR:ns ## COG: BH1930 COG1404 # Protein_GI_number: 15614493 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus halodurans # 20 270 143 414 444 202 42.0 9e-52 MEMTNVGNLLELRRIYSRGITGKGITTAVLDTGIYAHPDFFIPQNKILYFQDFVRNRRGP FDDNGHGTHVSGIIASGGRFGDGSGIGVAPESSIVMLKVLERDGSGKIKNMIKGMEWICL NHKKYGIRIVNISVGMPVKNIENPDEDILVKKVEELWNAGLVVVVAAGNDGPAPHTITSP GTSRKVITVGTGTEAGGEYSGQGPVLGSCVCKPDIIAPATNILSCDNHGKSYKKRSGTSM ATPIVSGTIALLLSNEPWLSNRDVKIRLMQNAVDCGMAKSRQGWGLLNPEGMLRIK >gi|222441936|gb|ACEP01000006.1| GENE 23 22208 - 23356 1688 382 aa, chain + ## HITS:1 COG:ECs3659 KEGG:ns NR:ns ## COG: ECs3659 COG1454 # Protein_GI_number: 15832913 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Escherichia coli O157:H7 # 1 380 2 381 383 454 61.0 1e-127 MANCITLNQTSYHGAGAIQSIPDEVKAHGLNKAFVCSDPDLIKFGVTKKVTDVLDAAKLE YEIYSNIKPNPTIENVQTGVAAFKASGADYFIAIGGGSSMDTAKAIGVIINNPDFEDVRS LEGVAPTTKPTVPIIAVPTTAGTAAEVTINYVIKDDEKQRKFVCVDPKDIPVVAIVDPDM MSSMPYGLTASTGMDALTHAIEGYITKAAWEMTDMFHIKAIEVIARSLRAACENDPKGRE DMALGQYIAGMGFSNVGLGIVHSMAHALGAVYDTPHGVANAILLPTVMAYNADATGTKYK 
DIAIAMGVEGVEDMTQEEYRKAAVDAVAQLSKDVKIPQDLKDIVKEEDLDFLVDSAYNDA CAPGNPKDATKEDIKALYQSLM Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:36:25 2011 Seq name: gi|222441935|gb|ACEP01000007.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont6.1, whole genome shotgun sequence Length of sequence - 1036 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 76 - 270 263 ## COG1476 Predicted transcriptional regulators 2 1 Op 2 . - CDS 275 - 790 375 ## gi|225025877|ref|ZP_03715069.1| hypothetical protein EUBHAL_00105 - Prom 914 - 973 9.2 Predicted protein(s) >gi|222441935|gb|ACEP01000007.1| GENE 1 76 - 270 263 64 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 63 1 63 68 92 74.0 2e-19 MKNLRLKAARAALDMSQAELAKRVNVSRQTIVAIEKGDYNPTIKLCIAICHELGKTLDEL FWEW >gi|222441935|gb|ACEP01000007.1| GENE 2 275 - 790 375 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025877|ref|ZP_03715069.1| ## NR: gi|225025877|ref|ZP_03715069.1| hypothetical protein EUBHAL_00105 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00105 [Eubacterium hallii DSM 3353] # 1 171 1 171 171 295 100.0 1e-78 MPFSLLALIIFLIVYAIIAYGVYSSRKMQYDERQKYMQSIAYKYGFFTLILATAVNGFIE YYTKSSWGTPLAESFTIICIGLIVFFTICVFKDAYFSYSETPVKNIVQSIFLFSLIGGIN LYTGIMNQQEIISQHSPVKFNNINLISGILLLLADAIVITRAIMEHLQKEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:36:35 2011 Seq name: gi|222441934|gb|ACEP01000008.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont7.1, whole genome shotgun sequence Length of sequence - 3314 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 107 - 
166 4.2 1 1 Op 1 . + CDS 219 - 569 309 ## COG1733 Predicted transcriptional regulators 2 1 Op 2 . + CDS 566 - 1945 1459 ## COG0733 Na+-dependent transporters of the SNF family 3 1 Op 3 . + CDS 1966 - 2268 308 ## COG1925 Phosphotransferase system, HPr-related proteins + Prom 2590 - 2649 7.8 4 2 Tu 1 . + CDS 2678 - 3314 544 ## Rumal_0599 regulatory protein TetR Predicted protein(s) >gi|222441934|gb|ACEP01000008.1| GENE 1 219 - 569 309 116 aa, chain + ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 9 110 3 104 107 138 68.0 3e-33 MPEVKMEQEEKIPACPVETTLMLISDKWKVLILRDLLPGTKRFGELRKSIGHVSQKVLTA QLRQMEASELLTRKVYAEVPPRVEYTLTELGYSLKPILDAMSEWGENYQNMKEKEV >gi|222441934|gb|ACEP01000008.1| GENE 2 566 - 1945 1459 459 aa, chain + ## HITS:1 COG:FN1944 KEGG:ns NR:ns ## COG: FN1944 COG0733 # Protein_GI_number: 19705249 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Fusobacterium nucleatum # 23 459 22 459 459 471 59.0 1e-132 MNEKSIESNGMNNGSNRKNGNVRESFNSQWGFILACIGSAVGMGNIWMFSTRVSLYGGGA FLIPYIIFVVLIGATGVIGEMSLGRAARSGPIDAFGMICEKKGKRKLGEALGMIPVLGSL AMAIGYTVVMGWILKYAAGTFTGATLAPESVEDFGGRFGSMASAFGNNVWQVIALAACMA ILMFGVGRGIEKANKILMPVFFVLFVILGIYVFFQPGAADGYHYIFRVDKTALMNPKTWI FALGQAFFSLSVAGNGTLIYGSYLSDEEDIPASAARVALFDTIAAMLAALVIIPAMATTG AQLNQGGPGLLFVYLPNLIKSMPGSTLIAMIFFVAVLFAGMTSLINLYEAPIATVQEKLH VGRKAACLIIAAIGVVVSLMIQGIVSSWMDVLSIYICPLGAGLAGIMFFWIAGKKYVEVQ VNKGRKNKFTELYFPICKYIYVPVCILVLVLGIALGGIG >gi|222441934|gb|ACEP01000008.1| GENE 3 1966 - 2268 308 100 aa, chain + ## HITS:1 COG:BH3074 KEGG:ns NR:ns ## COG: BH3074 COG1925 # Protein_GI_number: 15615636 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus halodurans # 16 100 1 85 87 62 41.0 3e-10 
MYWDVPRGASYGEKVMRKFKYTIKEEQGIHARPASFLIQEARTYESRITLRVKGKEADAS NIIAVMALDVMCGQKVEVEICGTDEEVAYRGMKKFFEENL >gi|222441934|gb|ACEP01000008.1| GENE 4 2678 - 3314 544 212 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0599 NR:ns ## KEGG: Rumal_0599 # Name: not_defined # Def: regulatory protein TetR # Organism: R.albus # Pathway: not_defined # 14 202 18 205 218 76 26.0 6e-13 MGKSEKQIEKEKARNQRIIDTAFQIFVEKKIEAVTMEEIARKAGIGRATLFRCYSSKPEL VMAVCTAKWKAYFDKLDAIRPLESIHDIPAIDRFIFTLDGYIDMYQNHKDLLQYNDNFNH YMVHQEGIAAEKYEKFYKSLYSMNTRLHWMYEKAKEDKTFRTDIPEEQFFRVTFHSMMAT CAYYAGGFIWGSEENKDYTEELIRLKEMILKY Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:36:43 2011 Seq name: gi|222441933|gb|ACEP01000009.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont8.1, whole genome shotgun sequence Length of sequence - 17141 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 5, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 326 - 356 -0.9 1 1 Op 1 1/0.000 - CDS 408 - 1073 703 ## COG0603 Predicted PP-loop superfamily ATPase - Prom 1093 - 1152 2.1 2 1 Op 2 1/0.000 - CDS 1157 - 1747 705 ## COG0302 GTP cyclohydrolase I 3 1 Op 3 22/0.000 - CDS 1764 - 2429 168 ## PROTEIN SUPPORTED gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 4 1 Op 4 . - CDS 2426 - 2854 456 ## COG0720 6-pyruvoyl-tetrahydropterin synthase - Prom 2994 - 3053 4.8 + Prom 3189 - 3248 3.6 5 2 Op 1 1/0.000 + CDS 3302 - 4297 670 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 4381 - 4440 3.2 6 2 Op 2 . + CDS 4462 - 8820 4041 ## COG5610 Predicted hydrolase (HAD superfamily) + Term 8913 - 8956 8.0 + Prom 8937 - 8996 11.8 7 3 Tu 1 . + CDS 9022 - 9609 529 ## Cbei_2682 hypothetical protein + Term 9642 - 9698 3.2 + Prom 10091 - 10150 5.5 8 4 Op 1 . + CDS 10195 - 11172 513 ## COG1295 Predicted membrane protein 9 4 Op 2 . 
+ CDS 11202 - 11813 592 ## COG1434 Uncharacterized conserved protein + Prom 12213 - 12272 8.1 10 5 Op 1 40/0.000 + CDS 12489 - 13508 1421 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit + Prom 13589 - 13648 6.5 11 5 Op 2 . + CDS 13687 - 16113 3154 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 12 5 Op 3 . + CDS 16110 - 17138 969 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) Predicted protein(s) >gi|222441933|gb|ACEP01000009.1| GENE 1 408 - 1073 703 221 aa, chain - ## HITS:1 COG:CAC3627 KEGG:ns NR:ns ## COG: CAC3627 COG0603 # Protein_GI_number: 15896861 # Func_class: R General function prediction only # Function: Predicted PP-loop superfamily ATPase # Organism: Clostridium acetobutylicum # 6 213 6 213 222 361 78.0 1e-100 MQELKNKDAAVVVFSGGQDSTTCLFWALKRYKKVIALSFDYHQKHVKELECAKKICRDHE VEHHIMDMSLLNQLAPNSLTRVDIKVDESAPEEGTPNSFVDGRNLLFLSYAAVFAKQRGI TDIITGVSQSDFSGYPDCRDMFIKSLNTTLCLAMDYQFDIITPLMWLDKKETWALADELG AFDIIRNETLTCYNGIIGDGCGHCPSCKLRKKGLDAYLASK >gi|222441933|gb|ACEP01000009.1| GENE 2 1157 - 1747 705 196 aa, chain - ## HITS:1 COG:CAC3626 KEGG:ns NR:ns ## COG: CAC3626 COG0302 # Protein_GI_number: 15896860 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Clostridium acetobutylicum # 1 196 1 195 195 240 61.0 1e-63 MAIDKDAIREHVRGILAALGDDPDREGLKETPDRVARMYEEVFEGMNYTNHEIAMMFNKS FKDDLCVGSDKKDIVLVRDIPIFSYCEHHIALMYDMSVSVAYIPKEKVLGLSKIARICDM ASKRLQLQERIGSDIAEIMCEAAETDDVAVIIHGTHSCMSARGIKKEAASTVTTTFRGRF ETDATIQSQLAMMLHA >gi|222441933|gb|ACEP01000009.1| GENE 3 1764 - 2429 168 221 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803532|ref|YP_001492081.1| 50S ribosomal protein L35 [Rickettsia canadensis str. 
McKiel] # 10 213 24 222 225 69 30 2e-11 MITYPVVEKFVSINGEGQKAGKIAAFIRMRGCNLACNYCDTSWANTKDCPCEFLSAEELI TWLKEHSIENVTLTGGEPLLTEEIAPLIEALGTAGFSVEIETNGSVSLNTFDTLAHRPAF TMDYKCPDSGMENAMNTDNFSLLIPKDTVKFVVSSISDLDKAREICIQYKVAEHCPIFLS PVFGRIEPKEIVEYMIEHHWNEARLQLQMHKFIWPPEQRGV >gi|222441933|gb|ACEP01000009.1| GENE 4 2426 - 2854 456 142 aa, chain - ## HITS:1 COG:CAC3624 KEGG:ns NR:ns ## COG: CAC3624 COG0720 # Protein_GI_number: 15896858 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Clostridium acetobutylicum # 1 137 1 132 136 133 47.0 9e-32 MYILETEQSFDSAHFLSGYEGKCSNLHGHRWRVVARIAMDELNKEGQTRDMVIDFGDFKN ALKKLTDALDHSFIFETGSLKERTVAALKDEGFLLHEVPFRPTAEQFSRYFYEQLKGLNF PVYDISVYETPTNCSTYREASL >gi|222441933|gb|ACEP01000009.1| GENE 5 3302 - 4297 670 331 aa, chain + ## HITS:1 COG:BS_ykcC KEGG:ns NR:ns ## COG: BS_ykcC COG0463 # Protein_GI_number: 16078354 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 17 326 6 315 323 214 34.0 2e-55 MEKENKKKGQVMNIINKLSVVISVYNEEAVLDKMYQAVVPVVESLPCEYELLFVNDGSKD ASPAILDRLAMESEKVKVIHFSRNYGHEAAMIAGIDYSTGDGVICMDADLQHPVECIPQI LDAFHQGYEVVSMVRMSNKSAGLFKNITSKLFYKILNGISQTHFEENASDFFAVTKRPAE VLRKHFRESSRFLRAYVQSIGFRKTTIEYTAGERAGGRSKYSLRNLFRFSVKAMISYSDL PLKIASLCGTSAGIASVLLIIYSIYMKIHVGAPGGYTTIIVVICFMFTVLFFLLGIIGEY LSVILGEIRRRPIYLVRDTVNLEKEQEKSKE >gi|222441933|gb|ACEP01000009.1| GENE 6 4462 - 8820 4041 1452 aa, chain + ## HITS:1 COG:Cj1432c_2 KEGG:ns NR:ns ## COG: Cj1432c_2 COG5610 # Protein_GI_number: 15792750 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Campylobacter jejuni # 427 749 3 316 664 167 33.0 2e-40 MKKSFKELLKGCLKPATSKKKQKEEEKAKEIEVTENVEEEIIDITEEDKNLVWAFNAGQA GNDFRGNPKYLFVYVNKYRKDITAYWLSGSEETTEYVRNLGYRAYTLDTLEADRAIEKTG VLVAEQVKIGIPKGMRHVKYLNLYHGVGAKDVERVLTTGGMAEGLAHKYILHNQYYRDNQ 
LFLCPSPMMVDDFKRMCGVDEDKVVRAGYPRNIYQKYFEPITTFDHDILKAKGLDESYKI AVYAPTYRQALSEDTFHIAIPDIEKLIEVCKRQKLLFVFKMHPVMENEYAYLKMKEKYED CPYLLFWDNRDDIYEIIDKIDLAVIDYSSIFTDFVSAGVSHYIRYVFDYDEAVQGLNHDY MEVTTGELCYSFSELLQAIKTYKDRDDLKGIDRISNLYWKYSDKDSFEKIVNQTLEFTPV EREFPTLYSFDIFDTLISRKVLDPKGIFYKVKEVIKKSDLNYPEYFVEEYPKLRMFAEAD VRECYHKTMESRHSERREIYMKDIFDKLQWVYNLTDEQAAFLMKCEQEIEIANTIPLTEN INKVKKLLENKETVVLISDMYLPKEIIEKMLEKADPVLTQIPLFLSSEYGVQKTTRKLFL EVYKSFKPFYDFGKWIHTGDGRVPDVNMPRRLGIVANQIEKPEYNEFEGKLTEQLDSYDS YLVAALMARFRQNHLFDKEKFVYSYVSLCFVPYIDWLLKDAKARGYETLYFVSRDGYHLK RIADEIIKMKNIDIKTKYIYASRRAWRIPSFVHDIDEGFWFSYGNFAGVTTYDKLLQAMS LSEEKFMELFPKLSYLKEEEEIEPNEKNRLVEIFKNSTEYKEYLLNLAKEQRVSVCGYLE QEIDKTEKFAMVEYWGRGYTQENFTSLWKTIAGDDAKSYFYYSRTILASDEDNIRYNFTT DNASQLFIEAIFANMPYKSIESYKLVNGKYEPVINNISYDKELFNAMEKYLLEFTRNYEA LDVTDREKLDRGLYNFILDYYKTYQEDEIFIKVLAPLVDSVALYGQKKEFAPAFTAEDLE KMKEKVPRGKFTRNIRISVARSFKPVQDEYRKMYQLKETDTQGGGYLIPEKTMQRNEEFE QKIQKLTVDQIEKQKIYNSLVSENPVENKILFMADGKKISYTGFYTLGKLLEAQDTFVVK EVKAQEYTNKEDLLKEIATARFIICLKPLVYLAMMTLREETEVIVLGESAFPFIHGRRTT IRMAKKPLEYKEMIQKRMHIDHFPIPSLELMPVISEMYGIGEKKAFEMKGSCQTDIYFNE EFIKESKEKVVALFPEAQDKKVILYMPLLRNRTAASGYKELLNLERLKKLIGDKFVVVCN IPLPKGQYIQNIEIEGFSKNITKDISLREAVGAADIVVGDYRDTMFEAALMRKPIFMTNY DADKVLKDKELVCTMEEVQVGPMVDSEEKLAEHLSKLDEYDYSKQEAFCKKYFTYCDGKS AERVAQYIESAK >gi|222441933|gb|ACEP01000009.1| GENE 7 9022 - 9609 529 195 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2682 NR:ns ## KEGG: Cbei_2682 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 6 185 2 181 185 182 54.0 7e-45 MNIFKNLKGHFCTITHHKMLVMKTCFKVGLYKQGLLHDLSKYTPIEFIPGIIYYQGDRSP INREKELKNCSRGWLHHKGRNLHHFEYWIDYSINPGGKLVGMKMPKKYVAEMVIDRISAS KNYLKEQYNDGSALAYYLNGRHMMLIDDEADYLARYLLTMLDMRGEEYLLHYMKHTLLRH KNRDYHVRDGRLYLD >gi|222441933|gb|ACEP01000009.1| GENE 8 10195 - 11172 513 325 aa, chain + ## HITS:1 COG:CAC0168 KEGG:ns NR:ns ## COG: CAC0168 COG1295 # Protein_GI_number: 15893462 # Func_class: S Function unknown # Function: Predicted membrane protein # 
Organism: Clostridium acetobutylicum # 10 277 1 268 273 146 29.0 5e-35 MPARKRIKNLYELVMDFLKKCSDDHVGAFGAMSAFFILLSIFPFMIFLLTLTRYIPFSKD DIIVILTRMISFEEGSLIKSIVNEIYRKTGTTVSAISIIAALWSSSRGVYSIVIGLNSVY DIDENRNYFAIRLFSLVYTLLFALMISVMLVMWVFGNSLYSYFLQKFPFVVPILGYFMHK RIIFSLIFLTLLFMIIYKWIPNRVSSFRKQFPGALIATLGWVAVSVGCSIYMDNFTNFSY IYGSMAGIMILLLWLYFCMSMVFYGAEVNYFLENKDNYHLLIRTLRPNWRKQQRAQTEKM RQKSKDERADRKKNKANKNKDENQQ >gi|222441933|gb|ACEP01000009.1| GENE 9 11202 - 11813 592 203 aa, chain + ## HITS:1 COG:VC1698 KEGG:ns NR:ns ## COG: VC1698 COG1434 # Protein_GI_number: 15641702 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 7 154 7 142 215 65 31.0 9e-11 MYHLDEKQIQEITDYIFLEDKLEKADAIFIPGCARPEHTEEAARLYKDGYAPLLLPSGGY TKVQGSFQGVSKEGQKYGADFACEADFLEAVLIQNGVPASAILKECEATYTLENAEKTKV LLEKKDIHLKKAILCCKAHHSRRSYLYYSMVFPEIEILVYPVVVDGISREDWYKTAEGRK VVLGEFSRMGQQLLMMEGRIAWD >gi|222441933|gb|ACEP01000009.1| GENE 10 12489 - 13508 1421 339 aa, chain + ## HITS:1 COG:CAC2357 KEGG:ns NR:ns ## COG: CAC2357 COG0016 # Protein_GI_number: 15895624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Clostridium acetobutylicum # 1 339 1 339 339 446 58.0 1e-125 MKEKIEQIRQQALAAIEDAAGIDKLNDVKVAFLGKKGQLSSLLKGMKDVAPEDRPKVGQM VNEARAGIEAKMEEKRTAIQKKLREEKMKKEVIDVTLPGKKVAVGHRHPNQIALDDLERV FIGMGYEVVEGPEVEYDHYNFELLNIPANHPAKDEQDTFYITKDILLRTQTSPVQARIME TGRMPIRVIAPGRVFRSDEVDATHSPSFHQVEGLVVDKGITFADLKGTLQQWAEEFFGPD TKVKFRPHHFPFTEPSAEVDVSCFKCGGKGCRMCKGSGWIEILGCGMVHPKVLSDCGIDP EVYSGFAFGIGLERVTLLKYEIDDMRLLYENDARFLKQF >gi|222441933|gb|ACEP01000009.1| GENE 11 13687 - 16113 3154 808 aa, chain + ## HITS:1 COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 162 807 8 654 654 577 45.0 1e-164 
MNTPISWIKAYVPDLDCTVQEYVDKMTLSGSHVENAVYLDKNLEKIVVGRIEKIEKHPDA DKLVICQVNVGDEEVQIVTGASNVFKGAMVPVVLDGGRVAGGHDGSPNPENGIKIKKGKL RGVPSYGMMCSIEELGSTRDMYPEAPEDGIYIFDESKDVKPGDDAVAALGLRDAVVEFEI TSNRVDCFSMIGMAREAAATFEKPFYAPEVKEVGNNEKAEDYISVEVEATDLCPRYTARI VKNIKLAPSPEWMQRRLAAMGIRPINNIVDITNYVMEEYGQPMHAYDLNKIRGHKIVVKR ANDGDVYTTLDGQERKLDKDVLMINDAEGPVGIAGIMGGENSMVTDDIQTMLFEAATFDG TNIRLSSKRIGLRTDASGKFEKGLDPENALEAINRACQLVEELGAGEVVGGVVDVYPNPV EDVKIPFEPAKYNKLLGTNVSEEKMMEYFDRLEIGYDKETNMLLIPSFRQDLRCSADIAE EVARFFGYDNIPTTLPHGEATAGKKSFAARVEDVVMNIAEQNGFCGGMCYSFESPKVFDK LLLADNDPLRQAIVIANPLGEDYSIMRTIELNGILTSLAGNYNHRNKNVRLYEIGNVYLP KALPLTELPDERKRLTLGMYGECDFFMLKGVLEEMFLKLGLDGKVDFEPSQEKPFLHPGR QALIYVGGAYAGFIGQVHPEVCENYDMKCEAYVAGIDLPTVTEKATFDRRYEGVAKYPAV NRDLSLVMKKDVFVGSLEKVMKEKGGKLLESIQLFDVYEGSQIEEGYKSVAFSLVFRSPE RSLEAAEINKIVDKILKELEKMGVELRA >gi|222441933|gb|ACEP01000009.1| GENE 12 16110 - 17138 969 342 aa, chain + ## HITS:1 COG:CAC2283 KEGG:ns NR:ns ## COG: CAC2283 COG0809 # Protein_GI_number: 15895551 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Clostridium acetobutylicum # 2 342 1 341 341 465 64.0 1e-131 MMKTSDFNFDLPQELIAQDPLEDRSSSRLMVLNKESGEIAHRIFHDITEYLHPGDCLVIN DTKVIPARLIGTKEDTGAHIEILLLKRKENDVWETLVKPGKKCRPGARVVFGNGELKAEI VDVLEDGNRLVHFEYEGIFEEVLDRLGQMPLPPYITHKLQDKNRYQTVYAKYEGSAAAPT AGLHFTKELLKQIEDMGVNIARVTLHVGLGTFRPVKVENVLEHHMHSEYYNVTETAAKMI NDTKKNGGRIIAVGTTSTRTLESVADENGIIHPGCGNTEIFIYPGYKFKAIDCLITNFHL PESTLLMLVSALAGKEHIMAAYKEAVKERYRFFSFGDAMFIQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:36:50 2011 Seq name: gi|222441932|gb|ACEP01000010.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont9.1, whole genome shotgun sequence Length of sequence - 8325 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 271 - 1005 713 ## COG5523 Predicted integral membrane protein - Prom 1042 - 1101 4.8 2 2 Op 1 . - CDS 1111 - 2064 1080 ## COG0039 Malate/lactate dehydrogenases 3 2 Op 2 . - CDS 2135 - 2332 240 ## Elen_0189 diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s) - Prom 2379 - 2438 5.2 4 3 Op 1 . - CDS 2447 - 3064 576 ## EUBELI_01251 hypothetical protein 5 3 Op 2 . - CDS 3061 - 4947 1275 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 6 3 Op 3 . - CDS 4928 - 5164 202 ## gi|225025902|ref|ZP_03715094.1| hypothetical protein EUBHAL_00130 7 3 Op 4 . - CDS 5175 - 5831 765 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 5954 - 6013 9.6 + Prom 5926 - 5985 8.1 8 4 Op 1 1/0.000 + CDS 6058 - 7335 1361 ## COG3875 Uncharacterized conserved protein + Prom 7479 - 7538 12.5 9 4 Op 2 . + CDS 7579 - 8289 675 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) Predicted protein(s) >gi|222441932|gb|ACEP01000010.1| GENE 1 271 - 1005 713 244 aa, chain - ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 126 211 79 164 345 89 47.0 8e-18 MRVASDFRELARRALSGKWGSAVLTTLVASILGADIMVNGGSGTSGITNSVTNVVNRGNG DGSYNGILSTMSITSLTYILGAMMAIVTVLIVVGLVQYVIGSFVSLGLIQYNLDLIDGKD VEFGQIFSKASMFGKAFWLRLRMGIFTFLWTLLLIIPGIIKSYAYSMSGFIMAENPEMTA KEAMQVSEKMMAGNKWRLFCLQFSFIGWQILCILSLGIGFLWLTPYMNAATAAFYDEISR EPLN >gi|222441932|gb|ACEP01000010.1| GENE 2 1111 - 2064 1080 317 aa, chain - ## HITS:1 COG:CAC0267 KEGG:ns NR:ns ## COG: CAC0267 COG0039 # Protein_GI_number: 15893559 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Clostridium acetobutylicum # 11 317 6 312 313 237 38.0 2e-62 MLNNFSISERKVAIIGAGFVGASIAYALTIRKLAREIVLIDIHEEKTIGEALDIQHGIPD 
MGISSVKAGNYEDCKDCDLIIITAGRNRKPGETRLDLIAGNSAILKNVVDQMKPHYTKGV IMIVSNPVDVLVYQCTKWMGLPNGMVFGTGCILDSSRFTRLIADYTRLNTEVVKATIVGE HGDAQIPIWSRVSIAGVPIQEYCENVGLRWGENIRKDISDKVREMGATIIKGKGRTHYGI ATCVCYLAEAVLNQRLTIAPVSTMFQGEYGIEDVCLSVPSIIGVNGVEKRLEERWAEEEF LAFRQAAEKMRNVLKTL >gi|222441932|gb|ACEP01000010.1| GENE 3 2135 - 2332 240 65 aa, chain - ## HITS:1 COG:no KEGG:Elen_0189 NR:ns ## KEGG: Elen_0189 # Name: not_defined # Def: diguanylate cyclase/phosphodiesterase with PAS/PAC sensor(s) # Organism: E.lenta # Pathway: not_defined # 2 65 388 451 1435 64 43.0 1e-09 MGLIDKYHVDSKYIIFEITENTYIHNVEAVNRMIQTFHQRGIRISMDDFDSGYSSLNTLK EIIFD >gi|222441932|gb|ACEP01000010.1| GENE 4 2447 - 3064 576 205 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01251 NR:ns ## KEGG: EUBELI_01251 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 196 12 208 211 240 61.0 2e-62 MNILVIADEEAPSLWDYFRPEKLKDVDLILSCGDLNPKYLSFLATFCKGPVLYVHGNHDD RYEKTPPEGCICIEDKIYEYKGIRIMGLGGSYRYSSGINQYTERKMRNRIFKMWFPLWRK KGFDILLTHAAAYGVDDANDWAHMGFECFVKLLDIYRPKYYIHGHIHLNYGGGHTRRQQY GETEIINGYQFHKFEYETGKEIKMF >gi|222441932|gb|ACEP01000010.1| GENE 5 3061 - 4947 1275 628 aa, chain - ## HITS:1 COG:AF0890 KEGG:ns NR:ns ## COG: AF0890 COG1744 # Protein_GI_number: 11498495 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Archaeoglobus fulgidus # 290 609 59 397 397 174 32.0 4e-43 MMTEEYKEEYKKARKAAMKQYRTCASRGWSLYPPVLDEVSAYVKTAGEEVLGEMEIPLSL VTGTRTAGRQNAFSKDFLPILPENSEFARKWITLYEAQMEEGIRDPILVYEFMHQFYVQE GNKRVSVMKYLDASHIMAKVIRIFPEKTDEPSVKLYYEFIEFYRSTKFYDIVCKQVGNYA KLLKFMGKERNEACSDEERKKLQSLFYHFSSIYNAVAGNEEAVLTAGDAFLIYLNIFSYE EAASKPASKLREEILKMWKEFVPAKEAESVKRLLEPEEKKPAFWNKLLNSTQKLSIAFVY DKKPDTSSWLYAHELGRLHLEEVFPEKVETSCYIGDVEQAAVDGNQVIFTTTPLLMPESL KAALKYPEVQILNCSLNYAWKSIPTYYTRMYEVKFLTGLIAGSMAKGENIGYEADYPIYG SIANINAFALGVAMVNPNIKIYLNWTSEKGKQKAAVEELALVSARDMISADGCNRRFGLY 
SNENGEILNIATSVLDWGIFYEKIIQQMLDGTWKKVSDKETASRNYWWGLSAGVENLICS SQMPYGTKRLVDTFQSLITEGHFHPFEGMFYDKNGREHGKKNTILSNEEIITMDWLFANI VGEIPEKYELKEQAKPIVELQGVKGEKE >gi|222441932|gb|ACEP01000010.1| GENE 6 4928 - 5164 202 78 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025902|ref|ZP_03715094.1| ## NR: gi|225025902|ref|ZP_03715094.1| hypothetical protein EUBHAL_00130 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00130 [Eubacterium hallii DSM 3353] # 1 78 1 78 78 137 100.0 3e-31 MTEHAKYLIPVIYIEEKEIIAKIHAMTKTLVASILENEKPVYVDFETQERIPKDSALWYK EVIKTNGECLSINDDRRV >gi|222441932|gb|ACEP01000010.1| GENE 7 5175 - 5831 765 218 aa, chain - ## HITS:1 COG:L0245 KEGG:ns NR:ns ## COG: L0245 COG0664 # Protein_GI_number: 15674216 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Lactococcus lactis # 1 216 12 227 228 221 52.0 9e-58 MNHEYLLEFLEKKDIPTVHKLRHAYLTYYGLDQQNTYVLKEGVVKTSIILRDGREFNISY LKAPDIISLLRDEVSQYTSAPFNVRIESETAVFYKIPRVTFWEYVNEDKKLQDYINAYYR EQLSQAIYRQQLMTMNGKTGAVCAFLYLQIEPFGKKMKDGILIDFQITNDDIAGFCGIST RNSVNRILRGLRENGIIFIADQKILIKDKEYLEQYVKG >gi|222441932|gb|ACEP01000010.1| GENE 8 6058 - 7335 1361 425 aa, chain + ## HITS:1 COG:CAC0769 KEGG:ns NR:ns ## COG: CAC0769 COG3875 # Protein_GI_number: 15894056 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 423 6 425 426 629 74.0 1e-180 MFEMNVPYDKGTMKISLPEKNLAGVLDGKQSEYTTELSEKELVEQSLDNPIGSKPLEELA KGKKDIVIISSDHTRPVPSKIITPILLRRIRSVSPDARIRILVATGFHRPSTHEELVDKY GEKIVANEEIVMHISTDDASMKKIGVLPSGGDCIINKIAAEADLLISEGFIEAHFFAGFS GGRKSVLPGIASYKTIMANHSGEFINDKKSRPGNLCHNLIHEDMVYAARTANLAFIVNVV LNGNHEIIGSFAGDMEAAHEKGCDFVRSLASVDKVDCDIAISTNGGYPLDQNIYQAIKGM TAAEATLPDSGIIIMIAGCRDGHGGVGFYHNIADVKNPEEFEQKAIHTPRLETVPDQWTS QIFARILAHHKVIMVSDLVDPKLVKDMHMELASNVDEALAMAFKIKGNDAKVAVIPDGLG VVVNI >gi|222441932|gb|ACEP01000010.1| GENE 9 7579 - 8289 675 236 aa, chain + ## 
HITS:1 COG:SPy1682 KEGG:ns NR:ns ## COG: SPy1682 COG0580 # Protein_GI_number: 15675543 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Streptococcus pyogenes M1 GAS # 5 229 3 228 233 157 42.0 2e-38 MLHSLIAEFLGTALMILFGVGVHCDEVLTKTKYHGSGHIFAITTWAFGISVTLFIFGGVC INPAMALCQAILGMIPWTNFIPYCIAEFLGALFGSLVVYVMYADHFKASEGKIDPIAIRN IFSTNPNCRNLPRNFFVETIATTVFLTAILAVATNYEKQLPIGVGLIVWAVGMGLGGTTG FAMNQARDLGPRLAFQLLPIKNKADNDWQYGLLVPGIAPFVGAVLAALFSHYFLAI Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:37:25 2011 Seq name: gi|222441931|gb|ACEP01000011.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont10.1, whole genome shotgun sequence Length of sequence - 79632 bp Number of predicted genes - 78, with homology - 77 Number of transcription units - 31, operones - 16 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 21 - 95 81.3 # Pro TGG 0 0 + TRNA 101 - 171 75.8 # Gly TCC 0 0 + TRNA 175 - 248 82.6 # Arg TCT 0 0 + TRNA 271 - 344 74.1 # His GTG 0 0 + TRNA 374 - 447 81.8 # Met CAT 0 0 + TRNA 455 - 527 85.1 # Val TAC 0 0 + TRNA 533 - 619 67.4 # Leu TAA 0 0 + TRNA 634 - 707 75.6 # Arg ACG 0 0 + Prom 1874 - 1933 6.9 2 1 Op 2 4/0.000 + CDS 1967 - 3010 1435 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 3 1 Op 3 6/0.000 + CDS 3010 - 3708 908 ## COG0571 dsRNA-specific ribonuclease 4 1 Op 4 10/0.000 + CDS 3710 - 7273 3467 ## COG1196 Chromosome segregation ATPases 5 1 Op 5 . + CDS 7286 - 8227 846 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 + Prom 8237 - 8296 12.5 6 2 Tu 1 . + CDS 8399 - 9628 1593 ## COG1171 Threonine dehydratase + Term 9760 - 9806 9.1 + Prom 9675 - 9734 7.9 7 3 Op 1 . + CDS 9915 - 10085 64 ## 8 3 Op 2 . + CDS 10000 - 10962 640 ## COG3854 Uncharacterized protein conserved in bacteria + Prom 10965 - 11024 6.4 9 4 Op 1 . 
+ CDS 11058 - 11294 246 ## gi|225025912|ref|ZP_03715104.1| hypothetical protein EUBHAL_00148 10 4 Op 2 . + CDS 11318 - 11572 168 ## gi|225025913|ref|ZP_03715105.1| hypothetical protein EUBHAL_00149 11 4 Op 3 . + CDS 11605 - 11799 237 ## Closa_3250 stage III sporulation protein AC 12 4 Op 4 . + CDS 11811 - 12197 365 ## Cphy_2521 stage III sporulation protein AD 13 4 Op 5 . + CDS 12243 - 13388 236 ## Cphy_2520 sporulation stage III, protein AE 14 4 Op 6 . + CDS 13439 - 13966 397 ## gi|225025917|ref|ZP_03715109.1| hypothetical protein EUBHAL_00153 15 4 Op 7 . + CDS 13969 - 14541 576 ## Cphy_2518 hypothetical protein 16 4 Op 8 . + CDS 14538 - 15206 688 ## Cphy_2517 hypothetical protein + Term 15302 - 15343 1.7 + Prom 15215 - 15274 8.3 17 5 Op 1 10/0.000 + CDS 15377 - 15763 601 ## COG1302 Uncharacterized protein conserved in bacteria 18 5 Op 2 1/0.125 + CDS 15807 - 16208 591 ## COG0781 Transcription termination factor + Term 16250 - 16294 5.9 19 6 Op 1 . + CDS 16302 - 17510 824 ## COG1570 Exonuclease VII, large subunit 20 6 Op 2 . + CDS 17522 - 17755 348 ## gi|225025924|ref|ZP_03715116.1| hypothetical protein EUBHAL_00160 21 6 Op 3 13/0.000 + CDS 17718 - 18611 1096 ## COG0142 Geranylgeranyl pyrophosphate synthase 22 6 Op 4 6/0.000 + CDS 18635 - 20509 1635 ## COG1154 Deoxyxylulose-5-phosphate synthase 23 6 Op 5 5/0.000 + CDS 20511 - 21335 900 ## COG1189 Predicted rRNA methylase 24 6 Op 6 1/0.125 + CDS 21341 - 22177 865 ## COG0061 Predicted sugar kinase 25 6 Op 7 8/0.000 + CDS 22196 - 22654 565 ## COG1438 Arginine repressor 26 6 Op 8 3/0.000 + CDS 22669 - 24336 1354 ## COG0497 ATPase involved in DNA repair 27 6 Op 9 . + CDS 24406 - 25485 695 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 + Prom 25506 - 25565 3.9 28 6 Op 10 . + CDS 25689 - 27341 1268 ## COG2200 FOG: EAL domain + Term 27343 - 27410 21.3 - Term 27331 - 27398 24.0 29 7 Tu 1 . 
- CDS 27443 - 27820 332 ## COG1321 Mn-dependent transcriptional regulator
    - Prom 27938 - 27997 9.8
    + Prom 27911 - 27970 7.6
30 8 Op 1 35/0.000 + CDS 28145 - 29851 208 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
31 8 Op 2 . + CDS 29852 - 31609 2185 ## COG1132 ABC-type multidrug transport system, ATPase and permease components
    + Prom 31651 - 31710 4.8
32 9 Tu 1 . + CDS 31730 - 32884 1461 ## COG0426 Uncharacterized flavoproteins
33 10 Tu 1 . - CDS 33060 - 34208 503 ## COG3359 Predicted exonuclease
    - Prom 34258 - 34317 5.7
34 11 Op 1 29/0.000 + CDS 34484 - 35098 682 ## COG0632 Holliday junction resolvasome, DNA-binding subunit
35 11 Op 2 . + CDS 35108 - 36112 1095 ## COG2255 Holliday junction resolvasome, helicase subunit
36 11 Op 3 . + CDS 36140 - 36538 485 ## Closa_2236 hypothetical protein
37 11 Op 4 . + CDS 36525 - 39065 1373 ## COG0826 Collagenase and related proteases
38 11 Op 5 19/0.000 + CDS 39084 - 40451 958 ## COG0772 Bacterial cell division membrane protein
39 11 Op 6 . + CDS 40396 - 41994 1447 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
    + Term 42034 - 42076 4.7
    + TRNA 42070 - 42149 64.8 # Leu CAA 0 0
    - Term 42377 - 42413 -0.8
40 12 Tu 1 . - CDS 42487 - 42939 419 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
    - Prom 42983 - 43042 9.9
41 13 Tu 1 . - CDS 43046 - 43933 253 ## gi|225025945|ref|ZP_03715137.1| hypothetical protein EUBHAL_00182
    - Prom 43980 - 44039 3.8
    - Term 43978 - 44023 -0.2
42 14 Op 1 . - CDS 44073 - 44897 226 ## Clos_0427 signal transduction histidine kinase regulating citrate/malate metabolism
43 14 Op 2 . - CDS 44900 - 45643 575 ## COG3279 Response regulator of the LytR/AlgR family
    - Prom 45706 - 45765 11.1
44 15 Tu 1 . - CDS 46023 - 46472 483 ## Clole_2893 hypothetical protein
    - Prom 46617 - 46676 8.3
    - Term 46662 - 46710 7.6
45 16 Tu 1 . - CDS 46736 - 47182 355 ## gi|225025950|ref|ZP_03715142.1| hypothetical protein EUBHAL_00187
    - Prom 47203 - 47262 8.0
    + Prom 47159 - 47218 7.9
46 17 Tu 1 . + CDS 47399 - 47803 477 ## SpiBuddy_2157 C_GCAxxG_C_C family protein
    + Term 47893 - 47950 14.1
    + Prom 47872 - 47931 8.9
47 18 Op 1 24/0.000 + CDS 48015 - 49256 1365 ## COG0004 Ammonia permease
48 18 Op 2 . + CDS 49319 - 49675 608 ## COG0347 Nitrogen regulatory protein PII
    + Term 49740 - 49785 6.1
    + Prom 49683 - 49742 8.7
49 19 Op 1 . + CDS 49820 - 51208 1298 ## COG0534 Na+-driven multidrug efflux pump
50 19 Op 2 . + CDS 51224 - 52216 961 ## COG2896 Molybdenum cofactor biosynthesis enzyme
    + Term 52258 - 52294 -1.0
51 20 Tu 1 . - CDS 52282 - 52911 630 ## COG1309 Transcriptional regulator
    - Prom 53007 - 53066 6.7
    + Prom 53063 - 53122 8.2
52 21 Tu 1 . + CDS 53337 - 54278 1203 ## COG0521 Molybdopterin biosynthesis enzymes
    + Prom 54676 - 54735 6.1
53 22 Tu 1 . + CDS 54863 - 55921 712 ## Rumal_2410 hypothetical protein
    + Term 56110 - 56156 -1.0
    + Prom 56061 - 56120 5.8
54 23 Op 1 . + CDS 56350 - 57246 422 ## Rumal_2409 ATP-binding region ATPase domain-containing protein
55 23 Op 2 . + CDS 57203 - 57850 563 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain
    + Prom 57868 - 57927 4.8
56 24 Tu 1 . + CDS 58033 - 59085 1123 ## gi|225025961|ref|ZP_03715153.1| hypothetical protein EUBHAL_00198
    + Term 59162 - 59207 13.2
    - Term 59153 - 59192 8.6
57 25 Tu 1 . - CDS 59228 - 59992 704 ## COG1349 Transcriptional regulators of sugar metabolism
    - Prom 60019 - 60078 8.6
    + Prom 60104 - 60163 7.2
58 26 Op 1 . + CDS 60204 - 61949 1119 ## COG0579 Predicted dehydrogenase
59 26 Op 2 . + CDS 61961 - 63493 948 ## COG1070 Sugar (pentulose and hexulose) kinases
60 26 Op 3 . + CDS 63531 - 64958 1220 ## COG0277 FAD/FMN-containing dehydrogenases
61 26 Op 4 . + CDS 65027 - 66514 1280 ## COG0554 Glycerol kinase
62 26 Op 5 . + CDS 66583 - 67299 326 ## gi|225025967|ref|ZP_03715159.1| hypothetical protein EUBHAL_00204
63 26 Op 6 . + CDS 67302 - 68234 602 ## COG2025 Electron transfer flavoprotein, alpha subunit
    + Term 68268 - 68319 16.3
    + Prom 68353 - 68412 9.5
64 27 Op 1 . + CDS 68441 - 68902 376 ## Rumal_1150 AAA-ATPase-like protein
65 27 Op 2 . + CDS 68892 - 69248 211 ## EUBREC_2482 hypothetical protein
66 27 Op 3 . + CDS 69251 - 69586 219 ## EUBREC_2482 hypothetical protein
    + Prom 69610 - 69669 5.9
67 28 Op 1 1/0.125 + CDS 69714 - 71600 1978 ## COG1032 Fe-S oxidoreductase
68 28 Op 2 . + CDS 71560 - 72321 904 ## COG5011 Uncharacterized protein conserved in bacteria
    + Prom 72380 - 72439 10.6
69 29 Tu 1 . + CDS 72583 - 73239 704 ## Closa_0878 3D domain protein
    + Term 73263 - 73311 12.6
    + Prom 73333 - 73392 5.9
70 30 Op 1 4/0.000 + CDS 73419 - 74630 1116 ## COG1530 Ribonucleases G and E
    + Term 74806 - 74838 -0.8
    + Prom 74745 - 74804 6.2
71 30 Op 2 . + CDS 74878 - 75186 433 ## PROTEIN SUPPORTED gi|238924055|ref|YP_002937571.1| 50S ribosomal protein L21
72 30 Op 3 . + CDS 75245 - 75571 330 ## bpr_I1442 hypothetical protein
73 30 Op 4 14/0.000 + CDS 75577 - 75858 451 ## PROTEIN SUPPORTED gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27
    + Term 75876 - 75918 7.0
    + Prom 75868 - 75927 5.9
74 31 Op 1 1/0.125 + CDS 76084 - 77373 1825 ## COG0536 Predicted GTPase
75 31 Op 2 7/0.000 + CDS 77401 - 77688 207 ## PROTEIN SUPPORTED gi|55821596|ref|YP_140038.1| hypothetical protein stu1620
76 31 Op 3 9/0.000 + CDS 77710 - 78354 703 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase
77 31 Op 4 6/0.000 + CDS 78357 - 78926 692 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism
    + Term 79047 - 79090 3.0
    + Prom 79152 - 79211 6.3
78 31 Op 5 .
+ CDS 79237 - 79590 581 ## COG0799 Uncharacterized homolog of plant Iojap protein Predicted protein(s) >gi|222441931|gb|ACEP01000011.1| GENE 1 797 - 1651 663 284 aa, chain + ## HITS:1 COG:no KEGG:Swol_0422 NR:ns ## KEGG: Swol_0422 # Name: not_defined # Def: uroporphyrinogen decarboxylase (URO-D) # Organism: S.wolfei # Pathway: not_defined # 7 267 10 282 293 112 29.0 2e-23 MGQIWKCNGDDKRTMPRWLWEKDLSILSNEEEMAKVLLAWKKAENRKEIILPLVQNLEGA VLGADVIKRENYWTTGDYPFSRLEEIKPEQFTLMQDKRIKTIIEVTKKLKEKPLILEAEA PFSILAALIDPMKLYLAMQTEPEKLEKILKKIVLEEAEYIKAVIQAGCRIISLAEPTATM DMVGEKCFKECSGMATFMLLKEIEKFLSNSVVHLCGKLSSSMIAVHMAKEEEDFVTGETY LENLIEVTKNPEIHFVGQRCIHQKKNESGKIHVMKLNLIQTKST >gi|222441931|gb|ACEP01000011.1| GENE 2 1967 - 3010 1435 347 aa, chain + ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 6 334 2 329 331 305 48.0 6e-83 MEYKNVIAIDAMGGDNAPAEIIKGAIEAINERADIKLRLFGDKDKIETELNKYTYNKEQI GITHTTEEISCNEAPAMAIKKKKDSSLVVAIKSVKSGECDAIVSAGSSGAILVGGQVLIG KGKGVKRAPLAPLTPTAKGVSLLIDCGANVDARPEHLLQFAQMGSIYMEDVVGVKNPRVA IVNIGAEEAKGNALVKETYPLLKACKNINFIGSIEARDIPKGEADVIVTEAFVGNVILKL EEGLASTLIHVIKEGMMSTTRSKIGGLLVKPALKDTLKTFDATEYGGAPLLGLKGLVVKI HGSAKAKETKTAIFQCVTFKEQRINEKIVAHLAEEMAQAKEEKAGEQ >gi|222441931|gb|ACEP01000011.1| GENE 3 3010 - 3708 908 232 aa, chain + ## HITS:1 COG:BS_rncS KEGG:ns NR:ns ## COG: BS_rncS COG0571 # Protein_GI_number: 16078656 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Bacillus subtilis # 5 229 18 244 249 194 44.0 1e-49 MIRTERFKGLEEKIGYTFTNKRTLALAMTHSSYANEQRGRRKANNERLEFLGDAVLEVTI SDYVFREYPGYNEGRLTKLRSSLVCEYTLAICARDVELGKYLLLSRGEDATGGRERDSIL SDAFEALIGAIYLDGGMDRARTFIHTHLLKDVEDKSLFYDAKTILQEMVQAGPDPRLEYV LTREAGPDHNKEFTVEAKIGGKTYAIGKGKTKKGAEQIAAYQTILLLKKRRR >gi|222441931|gb|ACEP01000011.1| GENE 4 3710 - 7273 3467 1187 aa, chain + ## HITS:1 COG:BS_smc KEGG:ns NR:ns ## 
COG: BS_smc COG1196 # Protein_GI_number: 16078657 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Bacillus subtilis # 1 1181 1 1178 1186 578 36.0 1e-164 MYLKSIEINGFKSFANKIVFEFPQGITGIVGPNGSGKSNIGDAVRWVLGEQSARQLRGSK MEDVIFSGTQSRRPMGFAYVAITFENANRIIPLDYEEVMVARRVYRSGESEYLINGSSCR RRDIVELFYDTGIGKEGYSIIGQGQVEKILSGKIEDSRELFDEAAGIAKYKKNRTVTEKS LEQERQNLERVTDILAELEKQVGPLEKQSAKAKEYLKLRDEEKDIDIHLFLYDYERLKKE QEENERQYKIVSGDLEDTRKIYEKIKEKNGKLQDQTQAVTESLETKESEKEELRRKKETK DNEILLLSHKVESNTLLITHYEELEKQSDEKKDTKTAEIEVLKKETVQKKQEISNGQKKL EELEESILTKRKEQEACEKKISGENDRLFSIMNTSSDTKEKLSRYAAMEEQLEIRNAEYN SRYISFNSELKEYNETAEELLKQQDAAKEAFEKQENLYQELEKREKHLQEKEQKYQDEMG NLNQEYLRTKSRYETLMNITERYDGYNQSIRRIMEQKNANPGIIGVVADILALPEKYETA IEIALGGALQNIVTEDNETAKKMINFLKKNRFGRATFLPLTNIRRRNSTISPSVLQDEGV IGIASTLVKVEERFQALVDSLLGRTVVVDTVDHALSLSRKNNFSLRLVTLDGELLNPGGA ITGGVFRHSGNLLGRKREIEECKDSLRKAKAAWEERKEALSELKAGQQKLEEQKQQRKAK LDAASLLLHDLSNQIPDMDAKKEELSERITSLKEEHQILKEQIDEIRSQKEALLKEQEDN EQIHEQNNTVLDSLKEQLSEIRAEVSKLESEKNEQRISLSKMRQELDYLRSNKQRIEKEI ENIKNSLEENKKEIKRLTEENISYEEEQKTLKAALQEITIQMDGISTNLEELKQQRSTAM KKQNQIFQELESENEKLLILEKEAAKLSSKQERMKEDFESKIDYMWETYEITYNQAFTLK HYEVKAEEAAEQRKQKKELQRQIKELGNININAIEEYKEVGERYEFLRGQYDDIKEAEAK LLTMIEELNTAMKEQFTKEFGNIQQMFTTVFQDLFEGGTASLELMDNENILECGIRIIAQ PPGKKLQNIMLLSGGERALTAIALLFAIQNLKPSPFCLLDEIEAALDDANIVRFSKYLKK LSKDTQFIVITHRRGTMNAADALYGITMQEKGISTLISVDLIDKDLN >gi|222441931|gb|ACEP01000011.1| GENE 5 7286 - 8227 846 313 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 14 313 25 328 336 330 54 1e-89 MGEKKGFFSRLVSGLTKTRKNIASGLDSIFHGFSKIDDDFYEELEEILIMGDLGVDTTMN IIEDLQERVKEEHIKEPAECRQLLIDIIKKQMEVDETAYEYENRTSVVLVIGVNGVGKTT TIGKLAAQLKSQNKKVIMAAADTFRAAAIEQLTEWSNRAGVDIIAQQEGSDPAAVIYDAC QAAKSRHADVLLCDTAGRLHNKKNLMNELSKIRRVIEREFPEAYLETLIVLDGTTGQNAL VQAKQFKEVSDISGIVLTKLDGTAKGGIAIAIQAELGIPVKYIGIGEKIEDLQKFDADSF VNALFEKPAEDEE 
>gi|222441931|gb|ACEP01000011.1| GENE 6 8399 - 9628 1593 409 aa, chain + ## HITS:1 COG:FN1411 KEGG:ns NR:ns ## COG: FN1411 COG1171 # Protein_GI_number: 19704743 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Fusobacterium nucleatum # 7 398 4 395 404 372 53.0 1e-103 MMEKLSLDKFEEAYERVQEVVLPTNLVQSDFFSQMTGNKVYFKPENLQLTGAYKIRGAYY KISTLSDEEKAKGLITASAGNHAQGVAYAARKFGAPAIVVMPTTTPLLKVNRTKELGAEV VLHGNVYDEACEKAKELAAEHGYTFVHPFDDLEVATGQGSIAMEIVKELPTVDMILVPVG GGGLSTGVSTLAKMLNPKIKVIGVEPAGANCLQESLKAGHPVTLKGVNTIADGTAVKTPG ETIFPYLQENLDDIITIEDDELIVAFLDILENHKMLVENSGLLTVAALKHLKPEGKKVVS ILSGGNMDIITLASVVQNGLIQRSRIFTVSVLLPDVPGQLNKVSKVIADVQGNVIKLEHN QFVSVNRNSSVELTITMEAFGHNHKKKIIDALRAAGFKPKERTTRGVYQ >gi|222441931|gb|ACEP01000011.1| GENE 7 9915 - 10085 64 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MINFKKGKEHVCRGILEFLYSRFRKEEIGAAEAIPRGTASDFSTKIKAANRRKRNF >gi|222441931|gb|ACEP01000011.1| GENE 8 10000 - 10962 640 320 aa, chain + ## HITS:1 COG:CAC2093 KEGG:ns NR:ns ## COG: CAC2093 COG3854 # Protein_GI_number: 15895363 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 8 314 4 295 305 226 38.0 5e-59 MQQKLYREELLQIFPQKLRQRIEESVTFKNLTEIRLRADLPVLIETTQENYFLKKEEAAE KRRAIAKEEDNWILSPEQLKLIFEKISQYSVFAYTEEIGEGFITLKGGHRAGLCGKYYYR GEEKPQIKHISSINLRVAREVIGCAECIVPKLFEGNTFCHTLLISPPGCGKTTYLRDIIR LLSDGTDTITGKNVAVVDERSEIGNRTREGIGFYLGNRTDLMDHCPKAAGMLMMLRTMTP EILAADEIGGEKDIKAMAYVRNCGCRLLMTVHGNSMQDIFARPVLGEYLKKYPFERYVFL SKKEDRERIITIYNQDGIQK >gi|222441931|gb|ACEP01000011.1| GENE 9 11058 - 11294 246 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025912|ref|ZP_03715104.1| ## NR: gi|225025912|ref|ZP_03715104.1| hypothetical protein EUBHAL_00148 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00148 [Eubacterium hallii DSM 3353] # 1 78 1 78 78 144 100.0 2e-33 MVKILGGSLVLIAAYLFGMKLMEPAAEHIRLLEEGDLLYRILESEIRNTRTPLPILFGEL SDRTNTRWHNFFLSFLSH >gi|222441931|gb|ACEP01000011.1| GENE 10 
11318 - 11572 168 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025913|ref|ZP_03715105.1| ## NR: gi|225025913|ref|ZP_03715105.1| hypothetical protein EUBHAL_00149 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00149 [Eubacterium hallii DSM 3353] # 1 84 1 84 84 131 100.0 2e-29 MTIYERLLKKTWKEKFSKEEQNLFLNAGRNLLSDDMEYQREEIKQLSTYLNEKILQMKKE YTNKKKVVLISCLCMGALAVILLF >gi|222441931|gb|ACEP01000011.1| GENE 11 11605 - 11799 237 64 aa, chain + ## HITS:1 COG:no KEGG:Closa_3250 NR:ns ## KEGG: Closa_3250 # Name: not_defined # Def: stage III sporulation protein AC # Organism: C.saccharolyticum # Pathway: not_defined # 1 64 1 64 64 67 67.0 2e-10 MTIELIFKIAAVGILVTILGQVLKHSGREEQAFLVSLAGLFIVLSWIIPCISELFTEIYS LFQL >gi|222441931|gb|ACEP01000011.1| GENE 12 11811 - 12197 365 128 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2521 NR:ns ## KEGG: Cphy_2521 # Name: not_defined # Def: stage III sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 5 128 15 138 139 82 40.0 5e-15 MLVFKIALIGIAGAMLAMVTKQFKPEYSTLVLLAVCLFLIGYLTSNLKEVLNFVQTLQKR IPISSMYIKILLKLLAIAYICQIASNLCEDLGYHSISFQIETIGKLSILVLSIPIISSLL ETIEHLLT >gi|222441931|gb|ACEP01000011.1| GENE 13 12243 - 13388 236 381 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2520 NR:ns ## KEGG: Cphy_2520 # Name: not_defined # Def: sporulation stage III, protein AE # Organism: C.phytofermentans # Pathway: not_defined # 5 380 3 379 380 170 29.0 1e-40 MGKNKKKTFLYVIGFLTHFLFLCYLHPLTLQAQTKKEEIDITALSKQIEQLGEEKYPDFS SIFEKILSMQFTEVFQQIGGWFIDIAFREVFSSKILITELIGVILFSAVFSNISSSFQPY GVSDSGFFVTYLITFSIIFTNFTAMTTLFENTVILLSSFLKVLLPVYTLAVSLSGNLSTG VVFYEYFIIVVLLLNWICIKIFLPLLQYYLLLELLNRFSKKQNISRLCEGLFLFLSKGVQ VLFFLFFGFHLLETMIAPSFDAAKNNVLNKMIGLIPGAGSVVQSVTGTVLGSSLVIKNAL GAAGIVFLFLFLLLPLTKLLCYVFFYFLLSVVLEPIADERFVECISAAVKCGLLLIKVLC MSSVLFIVIIALTTLTTNHIG >gi|222441931|gb|ACEP01000011.1| GENE 14 13439 - 13966 397 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025917|ref|ZP_03715109.1| ## NR: gi|225025917|ref|ZP_03715109.1| hypothetical protein 
EUBHAL_00153 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00153 [Eubacterium hallii DSM 3353] # 1 175 1 175 175 310 100.0 3e-83 MEYIKSWLKTILYMNVLLLICDNLMQKTAYEKYYRFFSGFLLVVCLLKPVIDFAGTEQYF QVSFLQEEWKNERNLLGNVKEFQDMQDVLKKEYETAYKNQIEEIAKKKNIQVKKITFWWD NKKEHLKQIEIRGILLKGSDSTLHTTDNPSHVESLKKILMQLYDLEESDVFVEVE >gi|222441931|gb|ACEP01000011.1| GENE 15 13969 - 14541 576 190 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2518 NR:ns ## KEGG: Cphy_2518 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 7 186 44 237 239 99 35.0 6e-20 MDKKMPKKISLKDFGMEKIILIAIAGIVLLVANFSEWKNSSSKTSEKQEKSIETSQNDAY VEALENKLVHILENVDGVGKAEVMITLKSSKESVLNKDLSEEKQTEEERSVETQKVNKNQ KKQEETILSDSSGNSAPYVIKELEPEISGIVISCEGAGNKVVEASVLEAVQVLFGVSANH IKVLKMEVAK >gi|222441931|gb|ACEP01000011.1| GENE 16 14538 - 15206 688 222 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2517 NR:ns ## KEGG: Cphy_2517 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 214 1 246 253 108 36.0 1e-22 MKKVFQKNQIAITALAIMIAVAGYLNFTEQNTSSDGTKKTAGSVETIDTMDISDGDSLIK EKAKASKDVSKSLSKNSSENAKETQTDVSSETIGDAVLTQAQVSEYVAGARMEREQTHSK TKESLNEIINSTSVSEDAKKEAVDKLTELADIMEKESATEQLLASKGFEDAVVSIGEDSV DVVLNYEELSSSDRAQIEDIVTRKTGYSVSQLVISKMQTTES >gi|222441931|gb|ACEP01000011.1| GENE 17 15377 - 15763 601 128 aa, chain + ## HITS:1 COG:BS_yqhY KEGG:ns NR:ns ## COG: BS_yqhY COG1302 # Protein_GI_number: 16079489 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 19 126 16 123 135 83 42.0 8e-17 MGKEKDTKINHTVYNIEDVGQVQIADEVVAVIAGLAATEVEGVAKMSGNITNEIVSKLGM KKLSKGVKVTITGTQVDVILNLVLSYGVSIPKTSQEVQDKVKSAIETMTGLTVSEVNIRI AGIQMDYE >gi|222441931|gb|ACEP01000011.1| GENE 18 15807 - 16208 591 133 aa, chain + ## HITS:1 COG:lin1396 KEGG:ns NR:ns ## COG: lin1396 COG0781 # Protein_GI_number: 16800464 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Listeria innocua # 3 133 
1 124 128 83 38.0 9e-17 MKMSRRKVRETIFLLLFRVEFNTQEELQEQIKWYFEERPDIEGKDQIYIETKIGSILKRL PEIDEQIHSICEGWRLERLGKPELAILRLAVYEITNDANIPTGVAINEAVELAKIYCSEE APRFVNGVLAKLA >gi|222441931|gb|ACEP01000011.1| GENE 19 16302 - 17510 824 402 aa, chain + ## HITS:1 COG:CAC2082 KEGG:ns NR:ns ## COG: CAC2082 COG1570 # Protein_GI_number: 15895352 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Clostridium acetobutylicum # 4 389 5 390 399 285 37.0 1e-76 MKHIFSVTQINSYIHRIFESDYALKKIYLKGEVSNCKYHSSGHIYFTLKDEKSALRCVMF SSDRYKGLTFHLEDGQLIEACGNISVYEQTGTYQMYVRKIELSGAGELYIRYEQLKQELA QKGYFDFERKKPLPPYPEKIGIVTALTGAAIEDIKSIAKRRNPSVQLYLYPSKVQGEGAA AQIARGIRYFDTAGVDIIIIGRGGGSIEDLWAFNEIEVADAIYYADTPIISGTGHEIDMT IADYCADVRAATPSAACELAIPDMTSFYTRMHNYQEVLGTLLQQKLMLMYGKTALLERQL EAQRPDKKLMSHQLIYERLEGRLQQAMRDKYQSRQQQFTRYVDRLSALSPTSRLRGGYVF AQTKEGNPLTSAGQIEKENPFFITFSDGQAEVMPVSVKKNKK >gi|222441931|gb|ACEP01000011.1| GENE 20 17522 - 17755 348 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025924|ref|ZP_03715116.1| ## NR: gi|225025924|ref|ZP_03715116.1| hypothetical protein EUBHAL_00160 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00160 [Eubacterium hallii DSM 3353] # 1 77 1 77 77 110 100.0 3e-23 MAKTKFSIEGAFEQLEDIIEQLSSEEISLSDSMDLYKKGVKILDKCSQTLDKTQKEIIIL QEGQNDAIHKGTAFEED >gi|222441931|gb|ACEP01000011.1| GENE 21 17718 - 18611 1096 297 aa, chain + ## HITS:1 COG:CAC2080 KEGG:ns NR:ns ## COG: CAC2080 COG0142 # Protein_GI_number: 15895350 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 13 295 8 287 289 215 42.0 8e-56 MPSIKEQLSKKIEKIEQTIDFYLPEEKGLQKTVLSAMNYSVKAGGKRLRPMLMAETYEMF GGADSSVIEPFMAAIEMIHTYSLIHDDLPALDNDALRRGMPTAHIKFGEAMAILAGDALL NYAFETAAKAFDGTNKDIQTAKALQILTRKPGIYGMIGGQTVDVEQENKAISFDTLMYIH NNKTAALIECAMMIGAVLAGASEKEVSMIEQIANKVGIAFQIEDDILDVVSTSEEIGKPA GSDEKNGKNTYVLFKGIEQSKKDAEQYTREALQIFDELEYKNSFLRELLLYLVGRKN >gi|222441931|gb|ACEP01000011.1| GENE 22 
18635 - 20509 1635 624 aa, chain + ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 1 620 1 616 619 611 50.0 1e-174 MGRLLNKIKEPNDIKRISPKLYPILAQEIRDFLIDHVSQTGGHLASNLGAVEITMALHIC LHFPEDKVVYDVGHQSYVHKLLTGRKDEFTSLRQKDGLCGFPKRCESDCDVFGTGHSSTS ISAALGLAVARDLEQKEETIAAVIGDGALSGGMAYEALNNLSILRREKKNMIIILNDNKM SISENVGGMSRYLNDLRSRRSYSEFKENVENALNNIPGVGKSVARTLKKSKDSIKQLFIP GMLFENMGITYYGLVNGHDIYELIHAINRAKQHEGPILIHAITRKGMGYKYAEKNPEKFH GIGPFDKETGEVLAKKTKKTYTDIFAESLVELARENKKIVAITAAMPSGTGLKAFKKHYP KRFFDVGIAEEHAVTFAAGLAAQGMRPVFAVYSSFLQRGYDQVLHDVCIQKLPVFFGIDR SGLVGADGETHQGIFDISYLSHIPNMVLMAPKNEKEMPAMMKFALEYNGPTAMKYPRGSV YDGLSEYNAPIELGKSEMIYEGQDVVILAVGNIMEECEKAVQLLKSQGYNPGLVNVRFIR PMDEEMLHVLSKKYSLIVTVEENQLIGGYGQMVSAFLHKNVCKNQLLTLGISDYFVGHAT VNEQREEAGINADSIVKSIIDRMN >gi|222441931|gb|ACEP01000011.1| GENE 23 20511 - 21335 900 274 aa, chain + ## HITS:1 COG:CAC2076 KEGG:ns NR:ns ## COG: CAC2076 COG1189 # Protein_GI_number: 15895346 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Clostridium acetobutylicum # 2 267 5 266 267 306 58.0 3e-83 MKQRLDVLLVEQGHAASREKAKAMIMSGVVFVNGQREDKAGSTFDEKAAATIEIHGSTLR YVSRGGLKLEKAVEQFGFSLKDKTCMDVGASTGGFTDCMLQNGAVKVFSVDVGRGQLDWK LRNDERVVCMEKTNMRYVKPEDIGELADFISIDVSFISLTKILPPVKACLKEDGQVVCLI KPQFEAGREKVGKKGVVRDRQVHEDVIKQIMDFALTLGFSLLHLDYSPIKGPEGNIEYLL HLLNRPEESQISDEMIPDVVAASHQSLSHREAAK >gi|222441931|gb|ACEP01000011.1| GENE 24 21341 - 22177 865 278 aa, chain + ## HITS:1 COG:RSc2650 KEGG:ns NR:ns ## COG: RSc2650 COG0061 # Protein_GI_number: 17547369 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Ralstonia solanacearum # 56 277 74 290 302 166 36.0 6e-41 MNNFLIIANKQKDINLEITEQIKHHITRMGAVCNVYDQYNRNVTSIDIPEGTQCILVIGG 
DGTILAAARMLVGNTIPLLGINLGTLGFLADVNLADLSKTLDLLLKDQYQVENRIMLTAE VYKQGEKAATYIALNDFNINRCGASRVIGLKVGINGSTIDCYRADGVIVCTPTGSTGYNL SAGGPIINPTCKNFVITPICPHSLTARSIVLAKEDVVTVEVEQIRSNIKEEAIISFDGRE GLSIVPGDQVKIYKSQEVTPFIKATEVSFVQILKEKLL >gi|222441931|gb|ACEP01000011.1| GENE 25 22196 - 22654 565 152 aa, chain + ## HITS:1 COG:CAC2074 KEGG:ns NR:ns ## COG: CAC2074 COG1438 # Protein_GI_number: 15895344 # Func_class: K Transcription # Function: Arginine repressor # Organism: Clostridium acetobutylicum # 1 148 1 148 150 110 41.0 1e-24 MKGKRQEKILEIIRTNDIETQEELTKKLSEAGFSSTQGTISRDIRELKLTKVTGANGKQK YAPIQTEDIHVSSKYKRVLSEGILHMDNAENILVIKTVPGMAMACAAAIDSISLHGVLGC IAGDDTIMCVVKETPMVEEVMKEIDTIMKGHR >gi|222441931|gb|ACEP01000011.1| GENE 26 22669 - 24336 1354 555 aa, chain + ## HITS:1 COG:BH2776 KEGG:ns NR:ns ## COG: BH2776 COG0497 # Protein_GI_number: 15615339 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus halodurans # 1 553 1 558 565 325 35.0 1e-88 MLINLHVKNLALIEETEVDFTDHLNILTGETGAGKSILIGSIQSALGAKIPKDMIRHGCD SALIELIFHTKSQAVQKKMEEFEIPFEDGEIIISRRITNNRVINKVNDISVTIGRLKELS PLLLDLSGQHENQLLLKPQNHLKIIDSYHRSVIAPVKEKTASLYHEYQELQKKLSEQNMG EEQRIREMEFLKYEIQEIEEAQLRPGEDESLETEYQKVSHAKEILSDCAAVHEVTAGQNG CAGDLIGSAVQRLYTVSNLDAEARGLTEQLQTIDSLLNDFNRELSSYIDSMEFDGEQFAE IEERLNLINHLKAKYGDSIEKIQKYHDNSVEKYEKLSNYEEYISEIKRELSKVKEKLDIQ CERLTALRKEAAIPLTKLIKEALLDLNFLDVVFEIAFEQSEHYSALGKDNVCFMISTNPG QAVRPLHEIASGGELSRIMLAIKSILADEENIETLIFDEIDTGISGRTAQKVSERLAYIA KKRQVIAITHLPQIAAMADSHYLIEKTSDANSTISNIYPLSEEESVKELARMLGGVKITE AVLQNAREMRVLAKQ >gi|222441931|gb|ACEP01000011.1| GENE 27 24406 - 25485 695 359 aa, chain + ## HITS:1 COG:CAC2072 KEGG:ns NR:ns ## COG: CAC2072 COG0750 # Protein_GI_number: 15895342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 58 359 83 390 395 196 41.0 6e-50 MREKLKQFYIVLLVFILTIMGQYGHVLQTEAVFSTQKETVSAKSQTGKRLTEQKSEEKVY 
ASGIPIGIYIKTEGILVLGLQQIDGKNSPASCKIKEGDYILKLNAQNITTKQQFIRLLQK NGEKEVVLTLKRKNKKIKVKVQPVYSAKNKCYQIGVWIRNDTQGIGTITFIREDGTFAAL GHGINDGDIGVRFLIEGGSAYRTNISSILKGKSGMPGEIIGTIDYSPQNYLGEIYANTNG GILGKITENKKEFFTGKKISIAKKSEVKTGKAYLRSSISGKSKDYSIEIEKVFFHEKNSL KTLKIRVTDSELITLTGGIIQGLSGSPILQNGKLVGAVTHVLIDNPQMGYGIFAESMLE >gi|222441931|gb|ACEP01000011.1| GENE 28 25689 - 27341 1268 550 aa, chain + ## HITS:1 COG:PA4367_3 KEGG:ns NR:ns ## COG: PA4367_3 COG2200 # Protein_GI_number: 15599563 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Pseudomonas aeruginosa # 154 408 5 258 267 158 30.0 3e-38 MDGLRKKLPKLLPNTYCIAAIDIEHFRLFNKLYGRSSGDEVIRYICACLKQSTMENDGID AYLGGDNFVALLPDSDELLCSIREKIIEKLGKWNNTSVFFPLFGVYTIEDTSIQPELMYD RAMLARSHAEEDYKWHICRYTLEMESCLEEEVYLLAEIEKGLENEEFTFFVQPQCNIMTG QIVGAEALVRWQKEDGEFLLPGEFIPVLEKNKMIDRLDRYIWEKVCQWLRHWIDTGHSPV PISINVSRIDIFSMNVPAYLFDLMEKYQIPKHLIKVEITESAYTENNNRIASAVNTLRSG GLVVMMDDFGCGYSSLNMLENIPVDVLKLDMRFLRFEEAERKKSAHILEAIVNMASMLHL PIVVEGVEDESQEKFVQGLGYRYTQGFYYYKPLPIPKFEELLSDHRRIDTQGIVYKQVEP MHIREFIDSNFVSDSMLNNVLGPVVFFEVQSGKIKVTRVNEQYFQMIGAEHFKEDIQKEF LARIPAEERSQFNEMLENSFLNPVSGADGMLHLLRTETDKLTVYIKVFYMQEKEDWRQYY CSLMDMTKIL >gi|222441931|gb|ACEP01000011.1| GENE 29 27443 - 27820 332 125 aa, chain - ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 7 121 3 118 122 96 51.0 1e-20 MKEKNNESAENYLETILVLSKRLPVVRSVDVANQLDFKKSSVSIAMKNLREKNHITVTDA GYIYLTESGKAIADMIYERHQLLTSCLEKLGVSAEIAEKDACKIEHVISKESFEAIKEYV KANIR >gi|222441931|gb|ACEP01000011.1| GENE 30 28145 - 29851 208 568 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 312 551 116 354 398 84 26 1e-15 MKDYKKESVLAPLFKMLEAFFELFVPLVVASIIDDGIVPKDSGHIIRMCLLLLVLAAVGL TCSITAQYFAAKSAVGAATGIRYELFTHIQTLGYEEMDTVGTSTLITRMTSDINQVQNGI 
NLVLRLFLRSPFIVFGAMIMAFTIDVKAAMIFVVAIILLSIVVFGVMFITKPLYKKVQSG LDTILGTTRENLTGVRVIRAFHQEQAEYNKFLAENEELTSLQKFAGKISGLTNPLTFIII NFAILVLIHTGAVRVSLGTLSQGQVVALYNYMSQILVELIKLANLIISVTKAMACFNRIQ DVFHIEPSMKEGTKTVAAAGNTTPAVEFKNVSFTYAGGGDHAVENISFKAMPGQTIGIIG GTGSGKSTLVNLIPRFYDVSEGEVDIAGKNVQDYTYGSLRNTISVVPQKAQLFAGTIRDN LTFGCPDATEEQIEEALAISQAKEFVDTKEGRLDAKIEQGGKNLSGGQRQRLTLARALVP QSDILIMDDSASALDYATDARLRKAIQDMKRKPTVFIVSQRTSSIQNADMILVLDDGKIA GQGTHEQLLKSCNIYREIYETQFKKEEA >gi|222441931|gb|ACEP01000011.1| GENE 31 29852 - 31609 2185 585 aa, chain + ## HITS:1 COG:CAC2392 KEGG:ns NR:ns ## COG: CAC2392 COG1132 # Protein_GI_number: 15895658 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 10 585 2 577 579 541 47.0 1e-153 MAQQNTNGKRKSTIKKVWHYLKNYRFLLILSILMAALTVAGTLYVPILVGDAIDFIIKKG QVNFAAIAKILEKGATVILFTGITQWIMNICNNHITFHVTRDIRNEAMKKIEKLPLKYID GHSYGDVVSRVIADVDQFADGLLMGFTQFFTGVVTILGTLGFIFSIHVGIALLVVCLTPI SLLVARFIATHTYSMFKLQSETRGEQTALIEEMIEGEKVVKAFGYEKRAGERFAAVNDKL QGYSLKAIFFSSITNPSTRFVNSLVYASVALSGALTAIGGGLSVGQLTCLLSYANQYTKP FNEISGVVTELQNALACAARIFELIEETPETDDKKDALVLANPEGNIELSHVNFSYTTEK RLIEDFNLNVKKGQRIAIVGPTGCGKTTIINLLMRFYDVNKGTIMVEGTDIRDITRESLR TSYGMVLQDTWLRSGTIRDNITMGREGFSDEQIIAAAKEAHSYSFIKKMPKGLDTYITED GAGMSQGQKQLLCITRVMLDLPSMLILDEATSSIDTRTEQKIQNAFAKMMAGRTSFIVAH RLSTIQNADVILVMRDGHIIEQGNHETLLKQNGFYAKLYNSQFAN >gi|222441931|gb|ACEP01000011.1| GENE 32 31730 - 32884 1461 384 aa, chain + ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 3 372 6 388 405 312 40.0 8e-85 MFKVTEDIIYVGVNDHEVDLFEGQYDVPNGMAYNSYVVLDEKVAVFDTVDAHFKDEWLAN LEEAFAGRTPDYLIVQHMEPDHAANIANFAEKYPEAKIVGNAKTFPMMKQFFGTDFADRQ VIVKEGETLSTGKHNLTFVMAPMVHWPEVMMTYDTTDKIFFSADAFGKFGALDVEEEWDC EARRYYIGIVGKYGPMVQKLFGKVGSLEIKAICPLHGPVLTENLEHYLNLYNIWSSYQVE 
TEGTVIAYTSVYGNTKKAVELLAEKLKEEGCPKVVVTDLARDDMAEAVEDAFRYGKLILA TTTYNGDIFPFMKEFINHLTERSYQNRTIGFVENGSWTPLAAKIMKAAFEKSKNITFADT TVTIRSAVDETSEAQIVALAKEMK >gi|222441931|gb|ACEP01000011.1| GENE 33 33060 - 34208 503 382 aa, chain - ## HITS:1 COG:CAC0978 KEGG:ns NR:ns ## COG: CAC0978 COG3359 # Protein_GI_number: 15894265 # Func_class: L Replication, recombination and repair # Function: Predicted exonuclease # Organism: Clostridium acetobutylicum # 18 183 32 196 274 94 33.0 4e-19 MLTINKIFPQEKIPEKTESFLFFDIETTGFSRDNTILYLIGCGYFAEEGFQFIQWFNDDG TSEEEILLAFQNILVKKDWQLVTFNGNSFDIPYLKRHYDLNELTCDIEKYPSLDFYQFLK PFQNLFQMTHGKQKDWEQFLGLNREDKYDGGQLIAVYKDYLMSKDEDLLHNLLLHNEDDL LGMKYLLPLFSYRQIFLEDITLQRIAPANKVFERGNGSIAISCRLPRPLPQPLEVSTPIG ELFSDKKDLSILTITLPFVEDVLKHFYKDYHNYYYLPKEERAIHKSVGCYVERQYRRAAK ASTCYVKKEGIFLPLPKAQKHFGIQIENYPYEKTFPLYKREYKELRWFAEFDDVFSEKNT GISVYLRDIIKELLITRMECDL >gi|222441931|gb|ACEP01000011.1| GENE 34 34484 - 35098 682 204 aa, chain + ## HITS:1 COG:PA0966 KEGG:ns NR:ns ## COG: PA0966 COG0632 # Protein_GI_number: 15596163 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Pseudomonas aeruginosa # 1 198 1 198 201 134 38.0 1e-31 MISYIKGTVAYKGKDSVVIENHGIGYQIKVPSRVLEGVNSGEEAMLHTYLYVREDQLALF GFSSIQELETFQILLGISGIGPKAALSVLSTMSVEDLYYAVFSEDAKSIAKTPGIGPKGA KRMIIELKDKLNLEALESVSGAEEAAPQSSFTEGDSIADTVQALVALGYSNGEAYRAVHS VPEAEKLDAEQLLKEALKKVLTFL >gi|222441931|gb|ACEP01000011.1| GENE 35 35108 - 36112 1095 334 aa, chain + ## HITS:1 COG:BS_ruvBm KEGG:ns NR:ns ## COG: BS_ruvBm COG2255 # Protein_GI_number: 16081161 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Bacillus subtilis # 2 330 5 333 336 427 65.0 1e-119 MDRMISTELMEEDIAVEGSLRPQNLSEYIGQEKVKKNLRVFIEAAKMRGESLDHVLLYGP PGLGKTTLAGIIANEMDSNLKITSGPAIEKPGEIAAVLNGLSDGDVLFIDEIHRLNRQVE EVLYPAMEDFSIDIMIGKGASAKSIRLELPHFTLVGATTRAGMLTAPLRDRFGVVNRLEF 
YTDEELKVIVERSAELLGVKIDEAGAMEVGKRSRGTPRLANRLLKRVRDFAQVRYDGMIT YEVAQTALNLLEVDSMGLDATDRNLLEAMITKFMGKPVGLDTLAAAIGEDSGTIEDVYEP FLIQRGLIKRTPRGRALTAFAYEHMGYPVPEEIY >gi|222441931|gb|ACEP01000011.1| GENE 36 36140 - 36538 485 132 aa, chain + ## HITS:1 COG:no KEGG:Closa_2236 NR:ns ## KEGG: Closa_2236 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 1 122 122 73 37.0 2e-12 MEEKSELSVVIDGKVYRLSGGSDIYLQKLASYVDGKIRELKKQPGYNKLSTEYRDILLAL NITEELFKLRDEIEVFNQDGRDREQELYELKQQIVDRDMRLDAANKLVADYKAKVNELQK QIIGLETNNEFH >gi|222441931|gb|ACEP01000011.1| GENE 37 36525 - 39065 1373 846 aa, chain + ## HITS:1 COG:MA0538 KEGG:ns NR:ns ## COG: MA0538 COG0826 # Protein_GI_number: 20089427 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Methanosarcina acetivorans str.C2A # 10 771 6 773 855 343 33.0 7e-94 MNFIKKRNLPELLAPAGSFEHLKAAVAAGADAIYMGGEKFGARAYAHNFSREDMIEALQF AHFHERKLYLTVNTLMKEKELFEELGDFLFPYYENGLDGVIVQDIGAVRFIRENFPDLEI HGSTQLTVTDFRGAVAAKRMGMTRVVPARELSLAEIKRIKKETGLEVEVFVHGALCYCYS GQCLLSSMYGGRSGNRGRCAQPCRLPYTLMNADGTILNPSKYTKYLLSPKDLCSLSMLPS LMEIPVDSLKIEGRMKNVEYVAGVTAIYRKYLDKISENWEQRAQFQPEEEDIHALEELYC RGSFTEGYWNKHNGQDMMAAVSPKNTGRKVGKVLSVSKNKVRMHFEDTLHPKDILVIPVS KDGQDEVILTVPSKDVGTSLKGQVTLNAPRTQSIKPGMPVYRRRNEELLTHIEKNILNGV IKYPVTGDITLKVGEPLMLQLFCREESVFVEGPVIECSKKRPVTKEDILRQMHKTGNVPF ILTSFEVNLDEGCFLPMSALKRIRQDGFALLEEKLRICKNRVDNLPEKREYFYEKNLSSK ELDNQQNLMNKRYDCPNQAQRELNSEQEERIASVYNIKQAFLYCEDTFFDGICLPWEFFS EKELENVGEKIKTAGKSLYLALPRVFRGNNVLEDKLKKICSLSIWDGIYAYTINEMEFLM QLEGCTTRVIAGASFYHWNSSAIQESNALYDNMTVRELPVELSEEEISDMIKGMDKEQQQ TVDFELLIYGRIPVMQSAQCLKKTCGHCNKTSGRLWLEDKKKRKLLVTTHCLDCYNLIWQ DKPKSLIGENIKEVSSHIKRHRFDLFELTEAEVSSVKQRYLKWKEQHFTEEQSKDTSDSY WNYGIE >gi|222441931|gb|ACEP01000011.1| GENE 38 39084 - 40451 958 455 aa, chain + ## HITS:1 COG:CAC0505 KEGG:ns NR:ns ## COG: CAC0505 COG0772 # Protein_GI_number: 15893796 # Func_class: D Cell cycle control, cell 
division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Clostridium acetobutylicum # 46 427 13 395 400 189 32.0 1e-47 MVSLISTGSNYIIVIMGVIYAITCFTVFLPSTEKRQVKRMDRQELFMFVFHFICYGVLFA KTLDTKIIFLYLAQVVFFKMLIFVYSRVYVDCSRILMNHTCFLLLIGFVMLTRLSFDKAV KQFIIAAATSLIVLFIPYFMEKAVWLKKLKWIYGLLGLIFLSSVFVIGTSQNGATNWISL GHGIALQPSEFVKISFVFFIAAMLTKAPNFKTMLLTSIFAACHVIILIGEKDLGGALIYF VVYVFLCYVATGRGIYLFGGIGAGTLAAKLAYMLFAHVRVRFIAWKDPWSVIEGSGYQIT QSLFAIAAGSWLGKGLTQGRPNDIPIVESDFIFSAITEEFGILFAICLILIYLGVFIHFL KIAMDVRGRFYKLLAYGFSICFIFQVFLTIGGVTKFIPSTGVTLPLISYGGSSVASTLII FAVMQGIFIIAYKEDDENEEEQAGQPDESDRKKYD >gi|222441931|gb|ACEP01000011.1| GENE 39 40396 - 41994 1447 532 aa, chain + ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 60 491 14 460 482 248 34.0 3e-65 MKKNKQGSRTRVTVKNTTERRESRRHDQKETLFFDIKSLFTPRRGKKLKTNRYMLQTSII VVALFLGLIGYVVKFTLKDSSSVAQSSYNKRNSSLSDQTKRGQILSANQKILAYSENNEA GDEIRHYPYENMFAHIIGYTSYGKAGLESVCNSDLLTSHEKLMKQLSNGVASKKNIGDNV ITTLDTRLQKAAYEALEDYRGAVVAIEPKTGKVRALVSKPDFDPNKLDDIWDKITSDSSE SCLLNRATQGLYPPGSTYKVLTALEYMEEHPNSYKNFSYECDGQTIVNSVRISCYENEEH GEVDLDRAFAKSCNTAFVTLGSKLDIKNFVSLNKKCLFNQEIPFDLAVKKSRFELTQNSD KSEIPQTVIGQGNTLMTPFHNALLMCAVANEGTLMKPYVVDHVEGADGTTVKETIPEIYA DLMSEKDAKKLQKMLREVVTSGTGYNLDTDLYTAAGKTGTAENEGKYAHAWFVGYSNVED PDLVVCVLVENVGAGSKYAVPIAKRVFDSYYNNNMKEYYSQKTDIESTQTNQ >gi|222441931|gb|ACEP01000011.1| GENE 40 42487 - 42939 419 150 aa, chain - ## HITS:1 COG:aq_158 KEGG:ns NR:ns ## COG: aq_158 COG0494 # Protein_GI_number: 15605731 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Aquifex aeolicus # 1 133 1 127 134 72 35.0 2e-13 MIKATSCGGVVIFRGKILLLYKNYKNRYDGWVLPKGTVEAGEEYKETALREVHEETGVKA 
SAIKYVGKSQYTFNTAHDVVVKQVHWYLMMADSYYSKPQKEEYFEDSGYYKYHEAYHLLR FPNERQILEDAYQQYIELKKAGLWGTRKYF >gi|222441931|gb|ACEP01000011.1| GENE 41 43046 - 43933 253 295 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025945|ref|ZP_03715137.1| ## NR: gi|225025945|ref|ZP_03715137.1| hypothetical protein EUBHAL_00182 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00182 [Eubacterium hallii DSM 3353] # 1 295 8 302 302 456 100.0 1e-126 MTLNQMFDFTPVYIYIVAAVQLFIGFSLSFAIGQILPLRIREFMYYLFSFLFLFIPFLYE IATANVQSGMRSFLFYLALLITIFLFTKESFLKKLSLFFILVFFDGILEIVICSLYWKGI CEKLFGLHYTPYINVPNQKVSTILIFFLPAALAEIAFNFLFPLLWKRYIRFIHLTTFVEM ILLPVLCTNGFLIFLPELFGMTGWILLFFILFVLSLLFLHAVTQIPVILKQIRVNKEKKK RIEKQIFIYKQYQEQNILLRRQNHDMNNHLQALSFLLSQNRIDDMKKYIKELLKE >gi|222441931|gb|ACEP01000011.1| GENE 42 44073 - 44897 226 274 aa, chain - ## HITS:1 COG:no KEGG:Clos_0427 NR:ns ## KEGG: Clos_0427 # Name: not_defined # Def: signal transduction histidine kinase regulating citrate/malate metabolism # Organism: A.oremlandii # Pathway: not_defined # 86 270 248 437 448 77 31.0 6e-13 MHKKDMIFLSILIACSESVVLFYFCSAKSPRSFLCMMISYAILLIFPLNIFLASLREHRT GTSLSTLETTEESSKQQREIEQFTHKQLKLHKNNFTQKLKELYILLEQNDLTTAQKILGE QIPLKDTDIICCTNSIVDGILQSKKSECIQHNISFEYTILFPEKNDFSFSVLISLFSNLL DNAIESCQMASSLNPKIELSIDYKGDFLIIFMQNTKETALIFNSLARKTTKSDSLSHGFG LSIIEEIVKSYDGFCEWEDRGNIFESRIMLRYLL >gi|222441931|gb|ACEP01000011.1| GENE 43 44900 - 45643 575 247 aa, chain - ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 243 1 234 234 78 27.0 2e-14 MISIAVCDDNEITAQEISEKVKQCCPIEAEVSVFYHIKDILFSIHNENKLFDIIFMDIEF DVSSGIEASKQINAWHPDCQIIYITNYTNYFTSVYETSHIYYILKKDIDTYLSTALKKAI QQIAKITQYYLIIQSKQQHIRIAQSKILYLERIMRTTNIHTADTIYQTPEKMDSLLERLC PWFCPSHRSYIVNGRHIVNLDRHHAILSNDTSVPVSRTCYESLKKTFARCIWADGMYFTN EEAKGEL >gi|222441931|gb|ACEP01000011.1| GENE 44 46023 - 46472 483 149 aa, 
chain - ## HITS:1 COG:no KEGG:Clole_2893 NR:ns ## KEGG: Clole_2893 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 9 149 7 147 147 150 63.0 2e-35 MESDTKKLTDTTKLLRECDAGTKMAISSINEILEKVEDTKLNEILTKSRNAHEELESEIH SLLNYHKEEQKEPDPIAKGMSFIKTNVKMGIDESDKTVANLITDGCNMGIKSLNKYLNQY AMADVISKKITEKLIRLEENLRKELSTYL >gi|222441931|gb|ACEP01000011.1| GENE 45 46736 - 47182 355 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025950|ref|ZP_03715142.1| ## NR: gi|225025950|ref|ZP_03715142.1| hypothetical protein EUBHAL_00187 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00187 [Eubacterium hallii DSM 3353] # 1 148 1 148 148 271 100.0 1e-71 MNKYLKRTLVVASVMIIIFSIIMLWPLPLRKVLRQETDTAMTVSLSDNDTNISSFSLSAS SVEYRRIMEILEDYSYHCTWYSFIPHESFTGHGENLIIYTGNSGLVIDTASGRVFVDRGT AEHTYRMNYLGQNDSAKLTAQIKKVLKI >gi|222441931|gb|ACEP01000011.1| GENE 46 47399 - 47803 477 134 aa, chain + ## HITS:1 COG:no KEGG:SpiBuddy_2157 NR:ns ## KEGG: SpiBuddy_2157 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: Spirochaeta_Buddy # Pathway: not_defined # 6 132 8 133 141 105 44.0 4e-22 MNRAQKAAEYHQKGYNCAQAIVCAFCDKVGLDEKTAFKVSEGLGLGVSDTYGTCGAVTGM ALVMGMANSCGNLEAPTSKAATYQKVRELNEIFRKKNGSTICRELKGMDTGKVLRSCPGC IEDAANILSEKLGE >gi|222441931|gb|ACEP01000011.1| GENE 47 48015 - 49256 1365 413 aa, chain + ## HITS:1 COG:L0236 KEGG:ns NR:ns ## COG: L0236 COG0004 # Protein_GI_number: 15673574 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Lactococcus lactis # 1 412 1 412 413 385 57.0 1e-107 MNTGDTAFMMICAALVFFMTPGLAFFYGGLVRRKNVCNTIMACVAIMGLSVVMWTLFGYS LSFGGNHAGIIGDFRWFGLNGVGMEAGPYADSIPHLVFCAFQMMFAMITPALITGSLVGR MKFKALFLFVAIWSVIVYYPMAHMVWGDGGFLAAIGSVDFAGGDVVHISSGISALVLAII LGRRRGYEHTTYRIHNIPSVVLGASLLWFGWFGFNAGSALKADGLAAHAFMTSAISAAAA LLSWMFIDTIKNGKTTLVGAATGLVVGLVAITPGAGFVPIWSSFIIGALVSPICYFGVGF IKKKLKIDDALDAFGCHGIGGIWGGIATGIFTQKSINPVARWDGLIFGDYHLFVAQIVGI IITIAVAVVGTLICVAIVRVFTPLRVSVREEQVGLDISEHGENAYPSFNGLDQ 
>gi|222441931|gb|ACEP01000011.1| GENE 48 49319 - 49675 608 118 aa, chain + ## HITS:1 COG:CAC0681 KEGG:ns NR:ns ## COG: CAC0681 COG0347 # Protein_GI_number: 15893969 # Func_class: E Amino acid transport and metabolism # Function: Nitrogen regulatory protein PII # Organism: Clostridium acetobutylicum # 2 115 5 117 121 97 48.0 4e-21 MMIKVEAYVREDKFEDVKAALNAIGVNGLTVSQVMGCGIQRGYKEIVRGMQVDMQMQPKI KFEIVVSSEEWEVKTIEAIEKAAYTGEPGDGKIFTYEIRHAQKIRTKETGYDAIQATE >gi|222441931|gb|ACEP01000011.1| GENE 49 49820 - 51208 1298 462 aa, chain + ## HITS:1 COG:FN0944 KEGG:ns NR:ns ## COG: FN0944 COG0534 # Protein_GI_number: 19704279 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 458 1 454 455 390 47.0 1e-108 MSEERKENIQTGNPLGEASVASLMVKFAVPSIIAMLVSALYNIVDQLFIGQAVGTLGNAA TNVAFPLTTSCIALALMFGIGGASCFNLNMGGGHKDRAVYFVGNAIICLIGSGVILFLIA ELFLTPLLTGFGAPENVLPYAQTYVRITAIGFPFLILTTGGCHLIRADGSPNMAMLCNLI GAVINTVLDAIFVMVFHWGMAGAALATIIGQIISAIIVIRYLLHFKTVSLTKEHFRPQIE YISKTAQIGMASFFNQVAMMVVQIIMNNSLTHYGAASVYGEAIPLACAGIVIKVNQIFFS IVIGLSQGSQPVESFNYGAKNYDRVRKAFGLAATSGVIISIVSFVLFQIFPRQILGLFGT GEPEYFEFGVRFFRIFLFFIWLDALQPITSTFFTSIGKPAKGIFLSLTRQIIFFIPLLLI LPHFMGIEGCIYCGPIADFLAAVVTIIMAFLEFKNMPKTNGE >gi|222441931|gb|ACEP01000011.1| GENE 50 51224 - 52216 961 330 aa, chain + ## HITS:1 COG:lin1039 KEGG:ns NR:ns ## COG: lin1039 COG2896 # Protein_GI_number: 16800108 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Listeria innocua # 6 330 4 333 333 222 37.0 6e-58 MIKTGIKDSYGREINYMRISITDRCNLRCRYCMPDGAEWIPMKEILTYEEITEVCQEAVK LGITRFKITGGEPLVRRGCPEVIRMIKNIPGTELVTLTTNGLLLGEQLEALLSAGLDAVN ISLDTLDIKKYEWITGFDKLAVVLSSIEKAVASGIPVKINAVLQKGMNEEEWLPLTELAR KLPLSVRFIEMMPIGYGNMSESISNEELKETIRKHYGNLTEDFRVYGNGPAVYYSIEGFQ GNVGFISAMHGKFCNLCNRIRMTSTGDLKPCLCYEQRWALKPALRMENPEKRQEEIQNIL RKSIENKPQMHCFEDTRQVTESHPMGQIGG >gi|222441931|gb|ACEP01000011.1| GENE 51 52282 - 52911 630 209 aa, chain - ## HITS:1 COG:CAP0127 
KEGG:ns NR:ns ## COG: CAP0127 COG1309 # Protein_GI_number: 15004830 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 5 182 8 185 215 90 26.0 2e-18 MPKGSEQLTNARKEEIINACAKLYTTMNFKDITLKQISLETTFTRTSIYNYFQTKEEIFL ALFQREYDLWTADIRQIFEKYEVMTVDEFASALAHTLEKRECLLKLLSMNHYDMESRSRL DNLVEFKKSYGNSMAAVTHCLEKFFPKMTIHDIQDFLYAFFPFLFGIYPYTYVTEKQTEA MEEAHTEYVLLSLYEITYSCIKKLLADAE >gi|222441931|gb|ACEP01000011.1| GENE 52 53337 - 54278 1203 313 aa, chain + ## HITS:1 COG:CAC2022 KEGG:ns NR:ns ## COG: CAC2022 COG0521 # Protein_GI_number: 15895292 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzymes # Organism: Clostridium acetobutylicum # 153 301 5 153 165 147 49.0 2e-35 MGKITGICISEKRGVQKHLITEANIVCDWGIEGDAHGGKWHRQISLLSKEKVDAFKAKGA DIHPGSFGENLIVEGIDFSSMPVGTRFVIGDVVLEMTQIGKECHNHCLIYKTMGDCIMPR EGVFAEVITGGHIKIGQEVTCIPPKADRPYTAAVITLSDKGAAGLREDKSGPAIVEMLKS AGYDVKETLLLADEQKLLEKEMIRLSDQRQINLIFTTGGTGFSKRDRTPEATIAVCDRMA NGIAEAIRNYSMAITPRAMFSRAVSGIRNNTVIINLPGSPKAVKESLEFLLPNLEHGLGI LTNRENECGALSL >gi|222441931|gb|ACEP01000011.1| GENE 53 54863 - 55921 712 352 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2410 NR:ns ## KEGG: Rumal_2410 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 48 350 243 549 551 105 25.0 4e-21 MVYKCRLSQRKKYLPVIGICITIIYAIIYASGVQWMQVIGGDITAVLCLMFVCIFESCIY CGLIQTNTGYEQLFEVCTMGAQITDLKYHVCYASSNAMELSKTIMKESAKKEVVVDKKTV IKSRPIQGGYVLWQEDIKDILMLLEKMEENRKTIEESNCIEQENYQTKAKINMLREKNRL YDKLQMQTAGQIELLNNLLYQYEAETNLTAKRRLLAKISVIGTYIKRCGNLIFIEERAEV SDIAELEACLEESFSSLRLMGVMCAFAAPSGAFIYVRDAVRIYNFFETVIEACLDSLLSV WVKFRVHKESLIFCMEVESNANLTSFFEITDTSSYEDGVWRFTFTVKKAGEA >gi|222441931|gb|ACEP01000011.1| GENE 54 56350 - 57246 422 298 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2409 NR:ns ## KEGG: Rumal_2409 # Name: not_defined # Def: ATP-binding region ATPase domain-containing protein # Organism: R.albus # Pathway: not_defined # 10 282 164 424 424 114 29.0 7e-24 
MYRLFRTLAQKDLQKLSELRDALHSNIQNEIIRLTDKKEIYLFPDGRAWNYQESEVKDKE GNVYIEAVFSDVTKQYHEKINLTRQTEKLKEISRELRYLSDNVLILTREREVLAAKTKLH DQMGAGLTAIRQSLAQENADYSNAVRLLRQAVNAIWNDNQYPLEEGEFERFLQDARTIGV KVRCTGSLPKEEEYAHIYILAMRECLTNGVCHAGATELFITMQEDKDCYHICITNNGEVP EKEVVPKGGLYNLSRHIFDYSGEMHIQSMPYFALTITFQKKELDHEEGIDCRGSEDAS >gi|222441931|gb|ACEP01000011.1| GENE 55 57203 - 57850 563 215 aa, chain + ## HITS:1 COG:SA1700 KEGG:ns NR:ns ## COG: SA1700 COG2197 # Protein_GI_number: 15927458 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Staphylococcus aureus N315 # 3 203 4 208 209 99 33.0 4e-21 MKKVLIVEDQRMPRENMERILLDSGKYKLCASVNGADVALAVCRREKIDLILMDVCTAGN KDGIEAAAEIKAEFPDIKIIIVTSMVEVGYLKRAREAKADSFWYKDISPEKLIDVIEETM AGEHIFPDKTPSVKLGLADSSELTAKEIEVLRLVCEGLEYSEIAERMHISQRTVKFHISN ILSKTGYANKTRLAIAVTNKNFIIPSSPEKNPSSW >gi|222441931|gb|ACEP01000011.1| GENE 56 58033 - 59085 1123 350 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025961|ref|ZP_03715153.1| ## NR: gi|225025961|ref|ZP_03715153.1| hypothetical protein EUBHAL_00198 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00198 [Eubacterium hallii DSM 3353] # 1 350 1 350 350 589 100.0 1e-166 MYSEAATQYKGENIWNRGSKTLTSGTYYVQIANRGAYYPATFMFNLNGHNWISANKVTCS QSAIQLSDIGKKKVIAQTVPANSDDKIVSVYDHLTGKTWDWYNSNKVDYDGIPSSATAPG KHKWITFKTTNGKTFTTYVISPAEKLKKPRVATGYNSAKFWIDYPNKLQTSVRIQVLKKG KWTTAKTIGYFSCGNSNPVTIKGLKPLTNYKFRVQAYANGMTGTPSNIVTMKTGSKVKPA VKSIKIVQRKKVTVKGWWRKHWVGGHISRYEWVPPYTYTKYKVKVTLKKKVPKIGGMWVA NNWVKGSKKSYTVWVPNGAQKGKKLTIRIRTALGKGYGNGPEVKKVIRVK >gi|222441931|gb|ACEP01000011.1| GENE 57 59228 - 59992 704 254 aa, chain - ## HITS:1 COG:ECs3563 KEGG:ns NR:ns ## COG: ECs3563 COG1349 # Protein_GI_number: 15832817 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Escherichia coli O157:H7 # 4 252 5 251 257 130 33.0 2e-30 
MKNHRQEQILEILKEKGKVNVHNLSEYFSVTPKTIRRDLEFLEENGELVRIHGGAEMTKT DILRDKPFGVRLNIEAEKKEKIGKEAAMLLKNGQRIFIGAGSTLDYLSSYIDNTNRLYVV TDSITVVNQLNNRSEVEIFMIGGEITKHILGASGTIAENTLKNFWFDMAFISASTVDQEG HLFHRGPAEFGIYKHLAERTTKLVALIDSTKLGKHNFINVAKLRKNDILITDEDADPAFI EALRLQGIEIIIAK >gi|222441931|gb|ACEP01000011.1| GENE 58 60204 - 61949 1119 581 aa, chain + ## HITS:1 COG:CAC1322 KEGG:ns NR:ns ## COG: CAC1322 COG0579 # Protein_GI_number: 15894602 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Clostridium acetobutylicum # 92 557 3 451 475 320 41.0 6e-87 MNTLNRRINEKVKDKFPELNIEIRINERNIATVSGECETWEQLIDVGHFVAEDAKIKNVV SEMTVKGLTITPQDYASMAKEGRKKGIIQKTDVVIIGAGVIGCGIARELSKYDFHCIVVD KENDVSVGASKANNGNIHPGHAVKKGTLKHKLNILGNRMYDEWAKDLQFEFQRNGLMYIA WEEEYLPALKRRYTKGLENGVDGIRYISGEEAMAIEPELKKLDNPPIAAVWLPSLAHVEP YEVTIALAENAAENQVKFMLNTKVCDIVHDGRIQAVVTSRGVIETKYVINAAGVYADDIS RMAGDVSYTIHPRKGTIVLLDKAKTYPYKPQLGFVSNKLENRMMNVKNKESKGGGCCKTP EGNYLLGPSAKEVWNKEDTSCDAEGIAYALSCCQHKGVGEKDVIRSFAGVRAADFKEDFI IEKSEVTSGLIHVAGIQSPGLSAAPAIAKMVENILLEEMKKEGMSYKRKKNYQPYRPKRR VFRKLSLEEQNKLIKENPDYGQIVCRCEFITKGEILDAIDSPVVPTSVDAIKRRTRAGMG RCQGGFCLPVVLQILAQAQQQDCTEIDFTAKDTNILEKIKN >gi|222441931|gb|ACEP01000011.1| GENE 59 61961 - 63493 948 510 aa, chain + ## HITS:1 COG:PA3024 KEGG:ns NR:ns ## COG: PA3024 COG1070 # Protein_GI_number: 15598220 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Pseudomonas aeruginosa # 3 508 4 514 519 278 32.0 2e-74 MAEEYVIGIDSGTQSTRAILFNLKGEKIASASATHPALMTDERGLIVHDYQDVYQGLCTA CQILIEKIKKNSETATGTIKAIGIAAQRATVFFLDKEGQQLCRPISWMDRSWQSNEKYSD KWKDNLDAWQYFLVNYSRMNWMQREYPDIFEKVEKYFTTSGYIHYKLTGEFADSLGNNLG MPVDREHWNLYQDSAVYEGMAITRSQLATFRNPGEIIGTITNKAAKETGLPVGCPVVAGA GDKQCEILGSGTLNKGQAYITMGTMTGLNLVDDQYIADEYKVTKHRTYTAAVPGFWNAEA VSPRGFWLISWFRDNFAKNLNKDNPKKTIEQLLDEEAIHISPGAEGLITIPDWKSTWDKP YAKGIFMGFDMRHGRAHMFRSLLEGIMMQLKIGTDDMCHSTNREIKELRVGGGGSKSYVA VQAIADIFNLSVKKSEEPETCSLGAAICAAVGAGCFPDFYAAAKAMGQNHTEFLPSTENH 
KLYQELMEEIVNPYYKINEQLLKKLQQIIS >gi|222441931|gb|ACEP01000011.1| GENE 60 63531 - 64958 1220 475 aa, chain + ## HITS:1 COG:ECs3629 KEGG:ns NR:ns ## COG: ECs3629 COG0277 # Protein_GI_number: 15832883 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Escherichia coli O157:H7 # 1 471 3 479 484 399 45.0 1e-111 MQQKKIIKELQEIVGKDKVITDEESIALASRDYIGFRRYHRSDGKNWAPHAMCVIKPKNT DEVSKVLSFLNENHIDVVPRTGGSSVTMSIEPAEGGVIIDGCDMCDILNIDKKNHIVTAQ CGTPLEYLEEQLNKQGYTTGHYPQSLPLASLGGLTATRSIGQFSTLYGGVEDCIIGLEAV LADGSVVRIKNVPRRSTGPDLRHIFIGSEGTMGFITEVSMKLYEYHPENRWMHAYGVIGM QKGLDFIREIMVSGYKPAVVRLHDEYEVQELMGAPAPEGYAMLLFIAEGPKAIADVTGEA IQSLAEKYEFLDLGTEPVEVWLKTRNDSCANIDKNTRYLKGIVSDTTEISGNWDVIGKIY EGIISRINEEIENITFVGAHSSHSYLNGTNIYCRFIFQADKGVDYVQEDYMKIVTIIMEE TLKYGGSIAHHHGSGKYRTKWMPQEHGSSYELMYRLKDAMDPNGILNKGDLLVEK >gi|222441931|gb|ACEP01000011.1| GENE 61 65027 - 66514 1280 495 aa, chain + ## HITS:1 COG:TM1430 KEGG:ns NR:ns ## COG: TM1430 COG0554 # Protein_GI_number: 15644181 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 3 479 2 481 482 421 45.0 1e-117 MSKYIMVIDEGTTGTRGILFDKEFKIVSQDYQELIQYTPDNIRVEHDASEIYNKSVEVCK GAMQKIGATAEDISCIGIATQRNTCVIWDKNTGEPLYHAIVWQDTRTGDVAEKLKENGGE EKILRETGKVIAPHNNGLILKWCMENVPEVKEAIEKETALYGTMDTWLIWKLTEGKTHAV ACSNASSSGCVNVQKGVWNEEFIQNLGVPLKLFPTIASEASEYGVTKVFGKEIPITGAIA DQQSALFAQGCLEAGTMKCTNGTGSFMDITIGNTCKIASGGVDNLIAWKLNDTLTYMVEG FVSVTGSAVQWLRDGLKIIRSSGEIEALAASVPDTNGVYFVPALVGLTSPHNDPSARGTI IGITRGTTDAHIARATLECIAFGIKDILDVVEKECEVKIDRINVDGGASQNNLLLQMLAD YCNADVARPDTLEATALGAALMAALSINQISLEDVKHILTPDAVFHPQMDATLREERYSI WKEAVDRSLKWIKYA >gi|222441931|gb|ACEP01000011.1| GENE 62 66583 - 67299 326 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225025967|ref|ZP_03715159.1| ## NR: gi|225025967|ref|ZP_03715159.1| hypothetical protein EUBHAL_00204 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00204 [Eubacterium hallii DSM 3353] # 1 238 1 238 238 458 100.0 1e-127 
MEAEYNSIPDDVWEREEEYLRFLPYIGYEKNSYDEIGLELVRRILESNPTIVADVLFMTK ENIKKEFQNLKAHGFHEIFQYIPKGNADFIEVFKQHCKEQGNVDVVIVGQESSSRRNGTT GPQIAEFMHYPCITNVVDFHIENNTDIWIKRNTDEAIITATVKTPVVLIIGEAPDVRLKT PRRKDKLPFLQQLPHQKCWEKELEEEKIIFSLRQHKRNCQFISVKEWNQFLKNREGGN >gi|222441931|gb|ACEP01000011.1| GENE 63 67302 - 68234 602 310 aa, chain + ## HITS:1 COG:Ta0212 KEGG:ns NR:ns ## COG: Ta0212 COG2025 # Protein_GI_number: 16081361 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Thermoplasma acidophilum # 89 307 67 284 293 96 32.0 5e-20 MSISVIIFGEKTTLNHEIKAVSSFLADAFTEKQLLKMKIDFFIFIPKQQLPKKVNAPKNI NFYYIFLPENADDTSILDVIKDYNRINAYDYIFFGTTVYARSFAVKYAIQEKTHLATNIK GSFFSETVKTWEKDIFLGNISKAIPVTNHRIITLAKNIPATYDEVCYPAWKAIPVEIQSN DYITAYTKENKNIVVGWEQAEKIIVIGNGVDKNGYPLIEQFAKQINAEIAGTRRAVENGL LPVERMIGISGKSISPKLCIVIGASGLAAFVKGIEKSECIVSINSDKDALIFKYADYGII GHYQEVITVT >gi|222441931|gb|ACEP01000011.1| GENE 64 68441 - 68902 376 153 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1150 NR:ns ## KEGG: Rumal_1150 # Name: not_defined # Def: AAA-ATPase-like protein # Organism: R.albus # Pathway: not_defined # 1 153 1 156 523 123 45.0 2e-27 MGIYLNPGDTSFQGSLRSKIYVDKSGLIAKTNDVICTEQKYVCVSRPRRFGKSMAANMLA AYYDTAEDTSELFDNLFIQNCPSYQKHKNKYDVIKINMQEFLSATHDIDEMLAILQKRVI KELKLKYPDYVDNEYLVFVMQDIFMHTNHPFVI >gi|222441931|gb|ACEP01000011.1| GENE 65 68892 - 69248 211 118 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2482 NR:ns ## KEGG: EUBREC_2482 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 100 132 230 477 129 62.0 3e-29 MLYDKDYVALAYMTGILPIKKYGSHSALNMFAEYSMTNPREMTEFFGFTEKEVYELCKQY KRNFEETKVWYDGYRLTRTKERLLQIYSMYNPKSVVDAMISTMDWTEVMESVSASAVR >gi|222441931|gb|ACEP01000011.1| GENE 66 69251 - 69586 219 111 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2482 NR:ns ## KEGG: EUBREC_2482 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 92 383 474 477 145 68.0 6e-34 
MAFYFAREYYTLIKELPTGKGFADICFLPRGLYMDKPAMIIELKWDKNVQGAIAQIEDKK YTDALKDYHGNILLVGINYNKKTKEHTCIIKKKCFDFFSPHTFKESGICKK >gi|222441931|gb|ACEP01000011.1| GENE 67 69714 - 71600 1978 628 aa, chain + ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 1 617 1 614 622 687 52.0 0 MNNLALSDEILLKIEKPVRYIGNEVNMVRKDPEGKIRFAMCFPDVYEIGMSHLGMKIIYD QMNRRDDVYCERVFSPWVDLDKVMREENIPLFALESQDPVFMFDFLAVTIQYEMCYTNIL QILDLSQIGIYAKDRREDAPFVIGGGPCTYNPEPLADFFDMFYIGESETVYYDLMDLYKE HKKNGGNRKEFLRKASHIPGIYVPSLYETTYNEDGTIASFEPLYEDVPRTILKQVQMDLT NTFYPEKQIVPYIKVTQDRCVLEIQRGCTRGCRFCQAGMIYRPLRERSLPMLEELAAKGI KGTGNDEISLSSLSSSDYSHLPELCNYLIDNYRSKNINIALPSLRIDAFSLDVMSKVQDV RKSSLTFAPEAGTQRMRDVINKGLTEEVILHGAMEAFKGGWSKVKLYFMMGLPTETEEDI KGIAHLAEKIAMNYYSMDAEYRHGRISVGASASFFVPKPFTPFQWASMCEEDEYKTKAHI VNDEFKIQHNHRSLSFKWHDAKTTVLEGILARGDRRIGKVIYDVYKKGCIYDAWTEFFNY EAWMETMEENGLDYHFYTTRKRELDEIFPWDFIDIGVTKEFLKREWNNAMAEKVTPNCRQ KCSGCGAAKFGGGVCYESKSKMGENRCS >gi|222441931|gb|ACEP01000011.1| GENE 68 71560 - 72321 904 253 aa, chain + ## HITS:1 COG:CAC1255 KEGG:ns NR:ns ## COG: CAC1255 COG5011 # Protein_GI_number: 15894537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 234 1 233 238 99 30.0 5e-21 MKVRVKWAKTGVLKFIGHLDVQRYFQKALMRAELPVSFSKGMSPHQIMSFAAPLGLGMTS EGEYADISFDWTYSSEEMLSRINAVMNEGISVLEFKEIDEKEKNCMAVTAAADYLVTFRE GYYYKDAFLKRTQPFSMQEKIPIVKKTKRSEKEVDIAPMILDIREYESEPLLPGLPLKQE GIFMKLVTGSAENLKPQLVMEAFCKYLGYDYDPMGFQYHRLETYLKKDGELQPLGSVGWE ITEEKHPDEEQKN >gi|222441931|gb|ACEP01000011.1| GENE 69 72583 - 73239 704 218 aa, chain + ## HITS:1 COG:no KEGG:Closa_0878 NR:ns ## KEGG: Closa_0878 # Name: not_defined # Def: 3D domain protein # Organism: C.saccharolyticum # Pathway: not_defined # 129 218 152 238 240 82 42.0 9e-15 MKCKQYNIIATFIFMLSFLCLFPQKVSAQQTKSNNTQASVQSQASIRLNKAGSVVERGQK 
VKLKARISNSRSKKVVWKSSKRRVATVNRNGQVIARGKGTAVITAKIAGTNVYAQSVVTV KNYITMRVRTTGYCNCRSCAGKWAGCATASGTRPKEKRTIAVDKRLIHLGTKVKIGNIIY TAEDTGSAIKGKRIDVYYASHRKATAHGVKYQKIKVYI >gi|222441931|gb|ACEP01000011.1| GENE 70 73419 - 74630 1116 403 aa, chain + ## HITS:1 COG:XF1125 KEGG:ns NR:ns ## COG: XF1125 COG1530 # Protein_GI_number: 15837727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Xylella fastidiosa 9a5c # 23 394 20 409 497 191 31.0 2e-48 MEIKKNKLVITEKEGVICYGYFQDGVPTELYCEPKEQQSILGNIYAARVERVAEGIHGAF LEIGDNQKCYYSLSAEQPVKLSPGHEDKLYGGDIILVQITKDAVKTKLPVGTGNISLDGK YFVFTLTDKRTGISKKIRNTVERERLETLLKKYTREEYGIIVRTNAAGVSEETLTKELEL LQLRYEELMRKAKIATGKTLLYREPPHYITLGKELPAKALDEILTDHAEVFTELKEYYKQ TSESDTTKISFYEDTYSLYNLYRFAHYYEEAYGKYIWLKSGASLVIEHTEAMTVIDVNTG SVLKKKKQEDTLFYQINREAAKEIARQLRLRNISGIIMIDFINMKDTAQKEKLLSLLDSE CKKDRVHCNVIDMTALNLVEMTRSKVRKPLHEQIRSCMKNNRS >gi|222441931|gb|ACEP01000011.1| GENE 71 74878 - 75186 433 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238924055|ref|YP_002937571.1| 50S ribosomal protein L21 [Eubacterium rectale ATCC 33656] # 1 101 1 101 102 171 82 1e-41 MYAIIATGGKQYKVSEGDVIKVEKLGVEAGSAYTFDQVLVVGGEETKVGDPVVAGATVEA SVVEDGKDKKVIVYKYKRKTGYHKKQGHRQPYTKVKIEKINA >gi|222441931|gb|ACEP01000011.1| GENE 72 75245 - 75571 330 108 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1442 NR:ns ## KEGG: bpr_I1442 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 97 1 97 109 87 48.0 1e-16 MIQITIYKKPDNQYKGFQVIGHADSVEEGADLVCCSVSVLTINLVNSLDTFTDDEFEVTE QEELGLVQVTFKNPLSDKALLLMDSFDLGVHSIEEQYDIWLKVITREV >gi|222441931|gb|ACEP01000011.1| GENE 73 75577 - 75858 451 93 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27 [Clostridium phytofermentans ISDg] # 1 92 1 92 95 178 93 9e-44 MMKMNLQFFAHKKGVGSTKNGRDSESKRLGAKRADGQFVKAGNILYRQRGTKIHPGVNVG RGGDDTLFALVDGVVRFERKGRDKKQCSVHPVA >gi|222441931|gb|ACEP01000011.1| GENE 74 76084 - 77373 1825 429 aa, chain 
+ ## HITS:1 COG:CAC1260 KEGG:ns NR:ns ## COG: CAC1260 COG0536 # Protein_GI_number: 15894542 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 1 429 1 424 424 400 50.0 1e-111 MFADRARIFIRSGKGGDGHVSFRRELYVPDGGPDGGDGGKGGDLIFVVDPGLNTLVDYRH KRKYCAGDGKEGSKKRCTGASGEDMILKVPAGTVVKDAETGKVILDMANRTEPVVLLKGG RGGKGNQHYATATMQAPKYAQPGQRARELWVDLELKVIADVGLIGFPNVGKSTLLSRVTN ARPKIANYHFTTLNPNLGVVDLAEGNGFVIADIPGIIEGASEGVGLGYQFLRHIERTKVM IHLVDAASVEGRDPIEDIKAINKELEAYNPELAKRPQVIAANKMDAMPEEDREVIIEMLE EAFADKDIKIFPISAVSGQGVKELLWYVNDLLKELPEEPIEFDQEFFFELQEEDEEESIV VKMEEPGVYSVEGPKVERMLGYTNLESEKGFEFFQKFMKENGVLKRLEELGIEEGDTVKL YNLAFDYYR >gi|222441931|gb|ACEP01000011.1| GENE 75 77401 - 77688 207 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|55821596|ref|YP_140038.1| hypothetical protein stu1620 [Streptococcus thermophilus LMG 18311] # 1 92 3 96 105 84 47 2e-15 MTSKQRAYLKSLAMTMDPIFQIGKASLTPEVIEGIREAIDKRELIKVSVLKNCFDDPREI AEVLAERTRSQVVQVIGKKIVLYKPAKENSKIVLP >gi|222441931|gb|ACEP01000011.1| GENE 76 77710 - 78354 703 214 aa, chain + ## HITS:1 COG:Cgl2301 KEGG:ns NR:ns ## COG: Cgl2301 COG1057 # Protein_GI_number: 19553551 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Corynebacterium glutamicum # 6 204 10 204 218 138 37.0 6e-33 MAEHKKIGIMGGTFNPIHFGHLLLAETAFHQFNLDEILIMPTKNPYYKKISNSVTEEDRV AMVELAIEDNVHFQLSKEELNREGTTYTVETLSHLTVKHPGYEYYFIMGADSLYHIESWK DPEKILEMATIVVAGRAGTGTSLSSQIEYIENKYDATIYRLNSPVLEISSNDIRRRVRDG ESIRYLLPSKVVDYIYGHNLYQPDPPEEVQKENE >gi|222441931|gb|ACEP01000011.1| GENE 77 78357 - 78926 692 189 aa, chain + ## HITS:1 COG:CAC1263 KEGG:ns NR:ns ## COG: CAC1263 COG1713 # Protein_GI_number: 15894545 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Clostridium acetobutylicum # 1 170 1 169 189 125 41.0 5e-29 MMSIQKIKEDLKKKLSSKRYEHTMGVEYTSTCLAMRYGADIEKARLAGLLHDCAKYLSSE 
EKLSECERYGIPVSAYERKNPELLHAKLGACFANELYGVTDPEILSAIIWHTTGCPEMSL LDKIVFIADYMEANRNQAEDLPEVRALAFKDLDACLRLILEDTIAYLAKKKSVTDPMTQK TFDYYKENK >gi|222441931|gb|ACEP01000011.1| GENE 78 79237 - 79590 581 117 aa, chain + ## HITS:1 COG:BH1328 KEGG:ns NR:ns ## COG: BH1328 COG0799 # Protein_GI_number: 15613891 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Bacillus halodurans # 4 114 3 113 117 94 38.0 5e-20 MNESKKMALLAVEALEDKKAEDITIIDISEVSVLADYFIIADGSNRNQVQAMADSAEEAL GKAGYDAKQIEGYQSANWILMDYKDIIVHVFSKEDRAFYDLERIWRDGKQITKEDLQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:39:49 2011 Seq name: gi|222441930|gb|ACEP01000012.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont11.1, whole genome shotgun sequence Length of sequence - 10310 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 79 - 127 17.3 1 1 Op 1 . - CDS 230 - 496 346 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 686 - 745 8.8 2 1 Op 2 . - CDS 786 - 1643 1172 ## COG1209 dTDP-glucose pyrophosphorylase - Prom 1751 - 1810 6.8 3 2 Tu 1 . - CDS 1832 - 3484 1223 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC - Prom 3524 - 3583 4.4 4 3 Tu 1 . - CDS 3587 - 5197 1766 ## Closa_3527 hypothetical protein - Prom 5345 - 5404 6.4 + Prom 5202 - 5261 8.2 5 4 Tu 1 . + CDS 5503 - 6213 828 ## COG3956 Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain + Term 6353 - 6418 21.5 - Term 6341 - 6406 19.2 6 5 Op 1 . - CDS 6425 - 6916 366 ## gi|225025990|ref|ZP_03715182.1| hypothetical protein EUBHAL_00227 7 5 Op 2 . - CDS 6968 - 8338 1231 ## COG0372 Citrate synthase - Prom 8382 - 8441 8.7 + Prom 8709 - 8768 10.3 8 6 Tu 1 . 
+ CDS 8803 - 10278 1677 ## COG0260 Leucyl aminopeptidase Predicted protein(s) >gi|222441930|gb|ACEP01000012.1| GENE 1 230 - 496 346 88 aa, chain - ## HITS:1 COG:CAC1941 KEGG:ns NR:ns ## COG: CAC1941 COG2002 # Protein_GI_number: 15895214 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 79 1 79 79 92 51.0 2e-19 MQSTGIVRKLDSLGRITLPMELRKSFDIGEREPLEIFTEEDKIIIKKYNPSDIFTGQCED LIEYRGKKVSRDSIRELARIAGFKLTEE >gi|222441930|gb|ACEP01000012.1| GENE 2 786 - 1643 1172 285 aa, chain - ## HITS:1 COG:MTH1791 KEGG:ns NR:ns ## COG: MTH1791 COG1209 # Protein_GI_number: 15679779 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Methanothermobacter thermautotrophicus # 1 284 1 285 292 343 56.0 2e-94 MKGIVLAAGRGTRLYPMTKPVCKPLLPVYDKPLIYYPIATLLQAGIRDILVIIPPGEERE FQNLLGDGSELGLHIEFAVQKVARGIADALIIGEDFIGDDSVCLVLGDNIFQCHNLDEIM KEAIKDDHGAKVFGYYVDDPRPFGVVEFDENGQAVSIEEKPKNPKSNYIIPGLYFYDNQV IDIAKNLEPSARGEYEITDVNLEYLSRGQLKVIPFDRGLTWMDAGTADSMLEAAEIVKAL QKSGCYVGCLEELAWKEGFVSLDKVHEIGESLKMTNYGQYLLKLK >gi|222441930|gb|ACEP01000012.1| GENE 3 1832 - 3484 1223 550 aa, chain - ## HITS:1 COG:SA0243 KEGG:ns NR:ns ## COG: SA0243 COG1887 # Protein_GI_number: 15925956 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Staphylococcus aureus N315 # 194 543 215 561 564 231 35.0 3e-60 MRKINYRITVNITQKTMKLGFRAKMPDEMQGKRIRVQAVFTQRQVERRFPMEVVIEEGEV GDQIRADAEILLPYVFYSPPRHKVNVIFTLWCGTEEICLDDQPFPVQKELFARAEIVRKR NLFKFGLCSLGLPFLLVRDYFKEDKNFIKAAKRANDTVYRISGYNYSARQQNTDYFAARY RKYIAKNKVNPNQILFLSEREPEKNGNLMLVKRWFEENEPEVEITTFINTKTVDQLRKKE LRDCAFKCATSAVIILEDFYPQLHSIQKRSETKIVQLWHACGAFKTFGLTRMGKQGGAPQ TSMNHRNYDLVPVSSDTVRDIYAEAFGISGSKVQALGVPRTDLLFDWDYEEKKREELYGK YPILKENRVILFAPTFRGDGNKDAYYPLEAFDVNHFMERQPEDTVLILKNHPFVKQKFTV 
DAQWQDRVLDLSGEEHINDLMLISNLLITDYSSSVFEAAILELPMLFYAFDEKEYMDSRD FYFDYSQFTPGPVVTDFEALCEESAAMLQNIVSANEQADLKKFRETFLNTVDGCSTERIC RKIKTNYINI >gi|222441930|gb|ACEP01000012.1| GENE 4 3587 - 5197 1766 536 aa, chain - ## HITS:1 COG:no KEGG:Closa_3527 NR:ns ## KEGG: Closa_3527 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 206 361 110 270 274 117 40.0 2e-24 MNNFGRAVLAVGLSAVMTFTGPECFFQSGVEPVSVYAAEEKAVQSKTAGDNSEEGESTTE VTTEKKEESTTATESQESQTSSERDTSSGNTEHSQTEHTTEGTKRNTSGTSGNTDQAEDT DSSTSDTSGNSSHSNEEGTTQKKAKKDISSREDAEIRTKAIKKITDQLAEEDAPDGSDKT STFIPKIYNDTRLAMKNNKEYIAGYVYFNQTDSAWNQNGYCIAKAGCGPTSMAVVITSLT GKWVTPLDTAIWGYQHGFYSRAGSAHEMIPAMAAAYGLKCQGAGTDYSAIKKALKEGKPV VCLMGPGYFTRGGHFMVLVAIDKNDCVTVADVGSRARSAYKYRLSDVIAQSKGASAGGPF WIMSYDKESKSALREKQAIKDYTEEDMAEDFADVSYMKVKGNLPEVLQEEKADVSMKKVV SILKMSTGKSLLDTIGAGKGTALAGTSNAGRSVLSPVSQNSQVIVVNTDASNKKAGSQSG STALAGSAADTVSQGQSAKKSQDLALTLKQDIARRQWEKKIDLSWMDNYMQGTLEK >gi|222441930|gb|ACEP01000012.1| GENE 5 5503 - 6213 828 236 aa, chain + ## HITS:1 COG:BH0072 KEGG:ns NR:ns ## COG: BH0072 COG3956 # Protein_GI_number: 15612635 # Func_class: R General function prediction only # Function: Protein containing tetrapyrrole methyltransferase domain and MazG-like (predicted pyrophosphatase) domain # Organism: Bacillus halodurans # 6 236 232 450 491 133 40.0 3e-31 MDKKFNFQELQAIVATLRGENGCPWDKSQTHESLKLYTLEEAYEVNQAVTDLTKTGDCAN LKEELGDLLFQVLLQSQVAEDNGEFAIEDVIDGIARKMIHRHPHVFAGRHYDSVEQQQAD WEKLKSQEEGHKQTSLKEEIALVPESFPALIRGQKIAKKAAAAGLFSTEDEDVFKDLLTS VVNLQLGTAGEDPEKKFSSDEELSEKLGEVLFALCRFCAKYKVSGEMALLKKLEEF >gi|222441930|gb|ACEP01000012.1| GENE 6 6425 - 6916 366 163 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225025990|ref|ZP_03715182.1| ## NR: gi|225025990|ref|ZP_03715182.1| hypothetical protein EUBHAL_00227 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00227 [Eubacterium hallii DSM 3353] # 1 163 9 171 171 282 100.0 5e-75 MKAKDTGSSDWPEGEPSAEQKVTLEDYQQLIIQMRKANVQFQQMCNIPSGELTMLLTLRH 
LLLAKEFVIPSDIGDAMKLSRPAVSRMLHNLERKGYLEMKSSEEDHRYVKVQFTQTGKEL ITEEFEKCCKLLERVKERMGEKDMYKFLYYYSQFCTILVDEIF >gi|222441930|gb|ACEP01000012.1| GENE 7 6968 - 8338 1231 456 aa, chain - ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 19 452 11 441 441 528 56.0 1e-149 MNEYSMMTQELQDLAALSMEHGQIPSGLYDQYHVLRGLRDVNGKGVLAGLTDISTITSSK EVDGKMVPCDGELRYRGYDIHDLVDGFVAEQRFGYEEVAYLLIFGRLPQKQELQEFQNLL GSYRTLPTNFVRDVIMKAPGKDMMNTLQRGVLTLYGYDDMADDISIPNVLRQCLQLTSTV PLLAVYGYQAYNHYVEGKSFYIHSPKQELSTAENILRMLRPNKKYTPLEARVLDIALVLH MDHGGGNNSTFTNHVVTSSGTDTYSAMAASLGSLKGPKHGGANIKVVHMFDEMKQEVKDW KDEDEIKNYLLKLLNKEAFDHSGLIYGMGHAVYSISDPRARIFKGFLDKLAREKGYDEEF EFYDRVEKLAVETIQEKRRIYKGVSANIDFYSGFMYRILGLPDQLFTPMFAVARMVGWSA HRLEELMNCDKIIRPAYQSVAMDKEYVPMKDRIIVE >gi|222441930|gb|ACEP01000012.1| GENE 8 8803 - 10278 1677 491 aa, chain + ## HITS:1 COG:BS_yuiE KEGG:ns NR:ns ## COG: BS_yuiE COG0260 # Protein_GI_number: 16080258 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase # Organism: Bacillus subtilis # 70 477 69 485 500 279 39.0 9e-75 MQIQRLVDLEDYEFDTLLVTYNHKNRLSIELGKVSEIFESLQADGLVGGSKVYGLRGKDF SYQGQPYYNLIFIGIPEKATPRRFQLQFGQAMKEAKRFHGKNLLMTAIGEDVSYWLSSAM KGLLLADYSFDKYISDKPESPELHLGILTGIDSASFKKEEQYTRMMAKAVTTARDLVNEP ANVMTPAALAAKAKEICEKNNIKCTILEKQDCEALGMNSYLAVARGSKQSAKFIVMEYNG RVTDDEKIALIGKGLCFDSGGYNLKPGNAMKGMHGDMGGAAAVIAAMGAIAEAKLGINVT AVIPACENMISGKAMKPGDIVKSMNGKYIEVVNTDAEGRMALVDAITYAIRECGATTLID VATLTGACVVALGDRYTGAFSNSDELMSKIVQASILAGENIWRLPMGDEEYDNLNASTVA DIANCGKKCGATAAARFLGEFVEKKPWVHLDIAGTSESDEDTEIYSKGGTGAATMLLYEV VKLMEKPYKSY Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:40:07 2011 Seq name: gi|222441929|gb|ACEP01000013.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont12.1, whole genome shotgun sequence Length of sequence - 4396 bp Number of predicted genes - 4, with homology - 4 Number of transcription 
units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 513 - 1163 856 ## COG0274 Deoxyribose-phosphate aldolase - Prom 1249 - 1308 5.8 2 2 Tu 1 . - CDS 1363 - 1728 415 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Prom 2003 - 2062 10.3 3 3 Tu 1 . + CDS 2207 - 2980 905 ## COG0647 Predicted sugar phosphatases of the HAD superfamily + Term 3094 - 3149 8.8 - Term 3085 - 3133 0.2 4 4 Tu 1 . - CDS 3168 - 4040 1037 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 4206 - 4265 10.6 - TRNA 4309 - 4381 76.0 # Asn GTT 0 0 Predicted protein(s) >gi|222441929|gb|ACEP01000013.1| GENE 1 513 - 1163 856 216 aa, chain - ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 1 214 1 214 223 208 50.0 6e-54 MTKQEILTHIDHTQLKAFCTWEDIKKLCDEAIKYNTASVCIPPSYIERIKETYGDKINIC TVIGFPLGYNTTETKVFEVKDAIAKGCSEVDMVINIGHVKNGDFDKVEAEIKALKEAAGD KILKVIIETCYLTDDEKVKLCQCVTNAKADYIKIFTGFGTGGATIEDIRLFAQNIGSDVK MKAAGGVKSREDLEIFLEAGCDRIGTSSAVKMLEEN >gi|222441929|gb|ACEP01000013.1| GENE 2 1363 - 1728 415 121 aa, chain - ## HITS:1 COG:STM3261 KEGG:ns NR:ns ## COG: STM3261 COG1063 # Protein_GI_number: 16766559 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 95 1 96 347 107 51.0 6e-24 MKAVRMYAPRDLRVEEVDIPSYEADECLIKVMAVGVCGSDIPRVNQYGAHVSPIIVGHEF SGQIVKTGDKVTKFKAGDRVTCPPLIPCFKCKYCEMVNIHFVKTTIIMVQEEMVHLHNIS Q >gi|222441929|gb|ACEP01000013.1| GENE 3 2207 - 2980 905 257 aa, chain + ## HITS:1 COG:Cgl2203 KEGG:ns NR:ns ## COG: Cgl2203 COG0647 # Protein_GI_number: 19553453 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # 
Organism: Corynebacterium glutamicum # 10 257 7 254 275 240 45.0 2e-63 MESLRSKKGFICDMDGVIYHGNQLLPGVKEFVEWLQKEEKQFLFLTNASSRSPKELQNKL YRMGLEIGEEHFYTSALATAKFLQNQAPGCSAYVIGDHGLYNALYDAGITINDVDPDYVV VGETVTYGYEHIITAMNLVNKGARLIATNTDITGPIEGGIAPACRAFVAPIEATTGKKAY YVGKPNPLMMRTGLQILGVHSSEAVMIGDRMDTDIIAGIETGLDTALVLSGVSTRETVKE FPYRPRLILNGVGDIAE >gi|222441929|gb|ACEP01000013.1| GENE 4 3168 - 4040 1037 290 aa, chain - ## HITS:1 COG:FN1927_2 KEGG:ns NR:ns ## COG: FN1927_2 COG1307 # Protein_GI_number: 19705232 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 226 1 228 285 117 32.0 2e-26 MKKIGIMTDSHSGILTEEAEKLGIKVLPMPFYIDDELYLEGVDLSREEFYEKLRQGVNVS TSQPSPQEVMETWDDMLKEYETLIYIPISSGLSGSCMTAQGLAQDEEYEGKVFVVDNGHV STPLHRSVLDAVEMAEAGYSAVEIKKILEETKEQMVIYVGLSTLEYLKKGGRISSTSALL ANVLNIKPVMKFGTGVLDVYQKCRGMKKSRKAMIDAMKNELETTFHDAYKAGKVYLMAAS SSSEEVTKEWVEQIKEAFPGMEVMSDYLSFGLSCHIGPDGLGIGCTCKPV Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:40:08 2011 Seq name: gi|222441928|gb|ACEP01000014.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont13.1, whole genome shotgun sequence Length of sequence - 659 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 162 62 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family - Prom 353 - 412 9.3 Predicted protein(s) >gi|222441928|gb|ACEP01000014.1| GENE 1 3 - 162 62 53 aa, chain - ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 15 53 1 39 273 63 74.0 1e-10 MNIYAKSICKERVKMLKQCFDNVREKHPLVHNITNYVTVNDVANILLACGGSP Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:40:16 2011 Seq name: gi|222441927|gb|ACEP01000015.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont14.1, whole genome shotgun sequence Length of sequence - 30453 bp Number of predicted genes - 27, with homology - 26 Number of transcription units - 14, operones - 6 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 196 177 ## COG0675 Transposase and inactivated derivatives + Prom 368 - 427 5.7 2 2 Op 1 8/0.000 + CDS 560 - 1954 1835 ## COG0215 Cysteinyl-tRNA synthetase 3 2 Op 2 7/0.000 + CDS 1939 - 2367 240 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 4 2 Op 3 . + CDS 2393 - 3136 676 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 + Term 3205 - 3248 0.1 + Prom 3439 - 3498 8.3 5 3 Tu 1 . + CDS 3614 - 3793 196 ## gi|225026003|ref|ZP_03715195.1| hypothetical protein EUBHAL_00241 + Prom 3796 - 3855 7.1 6 4 Op 1 . + CDS 3970 - 4386 380 ## Closa_3192 hypothetical protein 7 4 Op 2 . + CDS 4392 - 4493 113 ## + Term 4548 - 4600 13.0 + Prom 4599 - 4658 9.2 8 5 Op 1 27/0.000 + CDS 4749 - 6347 1503 ## COG0286 Type I restriction-modification system methyltransferase subunit 9 5 Op 2 . + CDS 6352 - 6966 376 ## COG0732 Restriction endonuclease S subunits 10 5 Op 3 . + CDS 7019 - 7489 351 ## Bacsa_0122 hypothetical protein 11 5 Op 4 . 
+ CDS 7532 - 10510 2227 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 12 5 Op 5 . + CDS 10571 - 11497 598 ## COG0582 Integrase 13 6 Tu 1 . - CDS 11729 - 12484 608 ## gi|225026011|ref|ZP_03715203.1| hypothetical protein EUBHAL_00249 - Prom 12545 - 12604 7.9 + Prom 12568 - 12627 6.7 14 7 Tu 1 . + CDS 12706 - 13971 1151 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) + Prom 13995 - 14054 4.1 15 8 Tu 1 . + CDS 14176 - 14679 679 ## COG0703 Shikimate kinase + Prom 15120 - 15179 5.2 16 9 Op 1 5/0.000 + CDS 15257 - 16057 188 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Prom 16098 - 16157 1.5 17 9 Op 2 2/0.000 + CDS 16185 - 16634 441 ## COG4578 Glucitol operon activator 18 9 Op 3 6/0.000 + CDS 16649 - 17203 723 ## COG3730 Phosphotransferase system sorbitol-specific component IIC 19 9 Op 4 6/0.000 + CDS 17230 - 18240 1377 ## COG3732 Phosphotransferase system sorbitol-specific component IIBC + Prom 18420 - 18479 4.7 20 9 Op 5 . + CDS 18615 - 18977 647 ## COG3731 Phosphotransferase system sorbitol-specific component IIA + Prom 19045 - 19104 5.2 21 10 Tu 1 . + CDS 19217 - 21121 1565 ## COG3711 Transcriptional antiterminator - TRNA 21604 - 21676 75.7 # Thr GGT 0 0 + Prom 21759 - 21818 7.0 22 11 Tu 1 . + CDS 21948 - 22784 596 ## gi|225026020|ref|ZP_03715212.1| hypothetical protein EUBHAL_00259 + Prom 22869 - 22928 7.5 23 12 Op 1 . + CDS 23006 - 23662 491 ## gi|225026021|ref|ZP_03715213.1| hypothetical protein EUBHAL_00260 + Term 23757 - 23790 -0.8 + Prom 23692 - 23751 4.7 24 12 Op 2 . + CDS 23847 - 25514 1997 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 25574 - 25629 6.4 + Prom 25590 - 25649 5.0 25 13 Op 1 5/0.000 + CDS 25701 - 26534 1181 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 26 13 Op 2 . 
+ CDS 26534 - 27931 1902 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 27980 - 28042 6.0 + Prom 28228 - 28287 7.0 27 14 Tu 1 . + CDS 28343 - 30415 2327 ## COG5492 Bacterial surface proteins containing Ig-like domains Predicted protein(s) >gi|222441927|gb|ACEP01000015.1| GENE 1 17 - 196 177 59 aa, chain + ## HITS:1 COG:Ta1471 KEGG:ns NR:ns ## COG: Ta1471 COG0675 # Protein_GI_number: 16082436 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermoplasma acidophilum # 1 51 140 191 237 60 53.0 9e-10 MVKVDRFFPSSKKCCKCGRVKKELKLSERVYHCVCGNKMDRDCNAAINIREEARRMLTA >gi|222441927|gb|ACEP01000015.1| GENE 2 560 - 1954 1835 464 aa, chain + ## HITS:1 COG:BH0111 KEGG:ns NR:ns ## COG: BH0111 COG0215 # Protein_GI_number: 15612674 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Bacillus halodurans # 1 464 3 465 466 518 56.0 1e-146 MKLYNTLTKQKEEFVPVHEGKVGMYVCGPTVYNYIHIGNARPMIVFDTVRRYFEYKGYDV NYVSNFTDVDDKIIKKANEEGVPASEISERFIAECKKDMEGLNIEPATHNPKATEEIDGM IAMISTLIEKGYAYEKNGTVYFRTRKFKNYGQLSKKNLDDMRAGIRIAVSDEKEDAMDFV LWKPKKEGEPAWVSPWGEGRPGWHIECSEMSKKYIGDTIDIHAGGEDLVFPHHENEIAQS EACNDEKFANYWMHNAFLNIDNKKMSKSAGNFFTVREISEKYPLQVIRFFMLSAHYRSPL NFSDTLVEASKNGLERILTAIDHLREVIAAAPEGELNVEDQKNLEEANALKAKYEAAMED DFNTADAIAAIFELVKLANVTAESGSKAYAQQLLDIIVQLCDILGIITEKKEELLDDDIE ALIEERQAARKAKNFARADEIRDLLADKGIILEDTRAGVKWKRA >gi|222441927|gb|ACEP01000015.1| GENE 3 1939 - 2367 240 142 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 15 133 6 126 141 97 38 1e-19 METCLKGIAEHFPLTEKELRGYSSLGLAYIGDCIYELFVRTMVVTKGNDKANHYHQKTIS YVNAAAQTAMMEKIKPLLTDEEKAVFRRGKNAKSPSPAKNQSSHDYHIATGFEALMGYLY LSGQMERLEELIRICLSDADNL >gi|222441927|gb|ACEP01000015.1| GENE 4 2393 - 3136 676 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 
[Bacillus selenitireducens MLS10] # 8 244 9 246 255 265 53 3e-70 MAYEECTVVGRNAVMEAYRSGKTIDKLFILDGCQDGPIKSILREARKRDTLIKFVSKEKL DSLSFHEKHQGVVAIAAAYEYATVEDLFKKAEEKGEAPFFILCDEIEDPHNLGAIIRTAN LAGAHGVIIPKRRAVGLTSTVAKVSAGALNYTPVARVTNLSRTIDELKDKGMWFVCGDMG GELMYDLNLTGSIGLIIGNEGNGVSRLVKEKCDYIASIPMKGDIDSLNASVATGVLAFEI VRQRMGK >gi|222441927|gb|ACEP01000015.1| GENE 5 3614 - 3793 196 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026003|ref|ZP_03715195.1| ## NR: gi|225026003|ref|ZP_03715195.1| hypothetical protein EUBHAL_00241 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00241 [Eubacterium hallii DSM 3353] # 1 59 1 59 59 96 100.0 5e-19 MITEEQGLEYDEAMNKFYNSEVFEKLQDKETGLYCESSSYVYDLFCDELNLGHIVQAEI >gi|222441927|gb|ACEP01000015.1| GENE 6 3970 - 4386 380 138 aa, chain + ## HITS:1 COG:no KEGG:Closa_3192 NR:ns ## KEGG: Closa_3192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 137 1 138 139 128 58.0 7e-29 MNKVSDKLTVYFEEPFWVGVFEKVQGKKLSVSKVTFGTEPKDYEVYEFLLKHYYDLQFSP SVTTVIKEVKQNPKRRQREVKKQLRNTGIGTKSQQALKLQQEQNKQERKIKSRKQKQAEA EYQFQLRQEKKKEKHRGR >gi|222441927|gb|ACEP01000015.1| GENE 7 4392 - 4493 113 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDNILSNEIILVGDIVSYYIEKNEEYDFSTKMD >gi|222441927|gb|ACEP01000015.1| GENE 8 4749 - 6347 1503 532 aa, chain + ## HITS:1 COG:SPy1906 KEGG:ns NR:ns ## COG: SPy1906 COG0286 # Protein_GI_number: 15675719 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pyogenes M1 GAS # 5 524 3 522 526 575 56.0 1e-164 MAEIENSKDLISVLWSGADILRSKMDANEYKDYLLGIVFYKYLSDSFLIKVYDMICDGKP GTLKEALEAYEEVLQSEDGEELKAEMKQECHYVIEPELTYTCFADAARNNSFNREQLQKA FNNIEQSDPIFADLFTDIDLYSNRLGTGDQKQSDTVANLIKEIDKADLLNSDAEILGNAY EYLIGQFASETGKKAGEFYTPQAVSKILTKIAIDGQEEKKGLSVYDPCMGSGSLLLNAKK YVKYPEYIRYYGQELNTSTYNLARMNMFLHGIVAENQKLRNGDTLDGDWPTGEETDFNMV LMNPPYSAKWSAAAGFLQDERFSDYGVLAPKSKADYAFLLHGLYHLKNNGTMAIVLPHGV 
LFRGAAEGKIREKLLRSGNIYAVIGLPANLFYNTSIPTCIIVLKKHRDGRDVLFIDASKK FNKGKKQNEMTDEHIEAVMDLYSKRETVEKESFLASFEDIEKNDFNLNIPRYVDTFEKEP EIDLNEVLKEMEQTNKEIEQAEGEFLSLLKELTSSDEKIMASLNELVKKMER >gi|222441927|gb|ACEP01000015.1| GENE 9 6352 - 6966 376 204 aa, chain + ## HITS:1 COG:AF1710 KEGG:ns NR:ns ## COG: AF1710 COG0732 # Protein_GI_number: 11499300 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Archaeoglobus fulgidus # 13 203 141 329 341 67 26.0 1e-11 MGKTKIRFKGYTEDWEQRKLGELASSFEYGLNAAAKEYDGENKYIRITDIDDNTHEFLTD NLTSPDIELTGADNYKLTEGDILFARTGASVGKSYIYKNSDGLVYYAGFLIRARIKEEYD TEFVFQNTLTDRYNKYIAVTSQRSGQPGVNAQEYAEFEIKVPKKEEQTKIGTYFRNIDNL ITLHQRKCNQLQIIRKYMLKNMFL >gi|222441927|gb|ACEP01000015.1| GENE 10 7019 - 7489 351 156 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0122 NR:ns ## KEGG: Bacsa_0122 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 9 150 5 150 151 91 38.0 1e-17 MKNKYFPDEDITMNDLYFICYMIERVARHIKQKNKYVVNAIGKDELYHLLSCASTLHCEN LEKVEADWIRDYGLVEGSFDITNVDSELATLIPTSLDMGEVYQRLIADTLSRKEDFTEGI LWVYNDEICDVIDNYNCSAFYEPSYVIARAYQNGGF >gi|222441927|gb|ACEP01000015.1| GENE 11 7532 - 10510 2227 992 aa, chain + ## HITS:1 COG:SPy1904 KEGG:ns NR:ns ## COG: SPy1904 COG0610 # Protein_GI_number: 15675717 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Streptococcus pyogenes M1 GAS # 5 988 8 989 992 911 50.0 0 MPELESTIERKLIEQLVYGESQWTYREDLKTEADLWANFKYILEQNNKDRLNGELLSDSE FEQVKNQLQFSSFYKAGEWLVGENGKVQVHVQRDTERLHLVVMNHEHIAGGSSVYEVINQ YSALKTEEDTAASTRDRRFDVTLMINGLPLIHIELKNKQHSYIDGFWQIRKYIGEGKFTG IFSAVQMFVVSNGVDTRYFAAAGDTELNPKFMSGWVDQENNSVSDYLDFAKSVLRIPEAH EMIARYTVLDEDAKRLILLRPYQIHAIESIREASKTGRSGYVWHTTGSGKTLTSYKATRN LLMDIPAIDKAIFLIDRKDLDTQTTMAFQAYANNDLVDVDETDNVNDLKKKLKSEDRQVI VTTIQKMQILISKRLKEDTPEYQKIKNLKIAFVVDECHRAVTPKTKRELERFFGRSLWYG FTGTPRFAENPYPQLGDLPRTTEKLYGERLHKYTIQNAIHDKAVLGFQVEHNGPKNVADE 
TDSSVYNNETHMLRVLDIILNKSYHKLGFQNGKGKTYEGLLTTSSIQLAQKYYDLLTKIK NGESSLKIDEKIKQVLPDFPKFAITYSVTENEEGSCVNQQKMQKSLDDYNGMFGTKYELS QIQGYNGNLNKRLARKDAKFKSRNEQLDLVIVVDRLLTGFDAPCLSTIFIDRQPMGPHDL IQAFSRTNRIFDKNKSNGQIVTFQAPKLFKESVDNAVKLYSAGSTGIAILAQWEEVEPAF RKALSALRVCAESPSEIPEMSMKEKKIFAKMFQSFDSLFAQLKSFTNYDDSMLEQYGITE QEYDDYVGHYKNVMEEIREEKTNDSETAIEETDVDQDYELMAYSNTKIDYEYIINLIQNI VTPTEEEEDITPEERQKKIDEAKQYVEELRKDNEKVADIMSDLIEEIEKDETKYKGQSIL NIVENMKQDCIEKVVSEFCESWYAMKGDVMYAAMHYRNGEIPNESVIKKNVDYTSYKAEQ EKALPKFKYYSRMIAELKETLEKEIKPLIASI >gi|222441927|gb|ACEP01000015.1| GENE 12 10571 - 11497 598 308 aa, chain + ## HITS:1 COG:lin0524 KEGG:ns NR:ns ## COG: lin0524 COG0582 # Protein_GI_number: 16799599 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Listeria innocua # 11 308 11 308 309 388 64.0 1e-108 MLNNTNNSPLFYEYYAQWVDVYKKGAIREATMAKYLMTQKWIQKLAPELKVSELTRTAYQ QILNDYAKEHERQTTLDFHHQLKGAILDALDEGMIERDPTRKAIIKGKTPRAKKIKYLNQ FELHTLIASLDLSEEPNWDWFILLVAKTGMRFSEALAITPSDFDFARQALSISKTWDYKG EGGFLPTKNRSSVRKIQIDWQIVVKFSELIKGLPEDEPIFIGKSKIYNSTVNDVLTRHCK QCGISDISIHGLRHTHASLLLFAGVSIASVARRLGHASMTTTQKTYLHIIQELENKDVDL IMRTLSGL >gi|222441927|gb|ACEP01000015.1| GENE 13 11729 - 12484 608 251 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026011|ref|ZP_03715203.1| ## NR: gi|225026011|ref|ZP_03715203.1| hypothetical protein EUBHAL_00249 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00249 [Eubacterium hallii DSM 3353] # 1 251 4 254 254 431 100.0 1e-119 MFEMFATIVAACAAAIYGLSRSVFMSGVHRNLYRAAFIGIIILVYVTYRIYHYIKFKNIK KLNVDFDDYLYYDKYTSPLYRAGYFISYLVQNILFDYTYMKRNYNFCKGLDVYDGGVSEY YIKFRNTWYRYLKLTMIRHRISLKKLQAFSCDMKNEMIIAALNTPEEKEMHHQFTKDAAA YFQEHQSFNKKVIEQFANYAEITPMFRKNVLKHYRALAYTAKKEPEMVAMVKKQEFRRER DVNDFKKMNFR >gi|222441927|gb|ACEP01000015.1| GENE 14 12706 - 13971 1151 421 aa, chain + ## HITS:1 COG:CAC0946 KEGG:ns NR:ns ## COG: CAC0946 COG2333 # Protein_GI_number: 15894233 # Func_class: R General function prediction only # Function: Predicted hydrolase 
(metallo-beta-lactamase superfamily) # Organism: Clostridium acetobutylicum # 39 294 33 288 320 256 49.0 5e-68 MRKIKILLLSLLLAFCMAGCSEGSSVAISGSGDETDGKISQNGSGMEVHFIDVGQGDSTL IKVGGHAMLIDAGDNSEGTAVQSYLDSQNVEKIDYAIGTHPDADHIGGLDVVVYKFDCKK VFMPDVTSDTKTYDDVVQALKSKNQKAQAPKLGKTYSLGDATFTIIAPVKDYGDDTNDWS IGILLQYGKNRFLFTGDAAKQAEDDMIDTGEDLSADVYKASHHGSKTGSSDDFLDKVSPA YAVISCGEGNKYGHPSAQTLNNFRSRGIKTFRTDNQGTIVAYSDGSDITWNASPDTTWTP GEPKGSSSSWSTASKKGSVKDSAGKTTSKTSSKKTTAKSTTAKSTTDKASSKTSSKKTTE STAKNTSKNKVSYVINEDTGKFHLSTCRFVKQMNEENKVISSKSRDILIKEGYEACKVCK P >gi|222441927|gb|ACEP01000015.1| GENE 15 14176 - 14679 679 167 aa, chain + ## HITS:1 COG:MA3237 KEGG:ns NR:ns ## COG: MA3237 COG0703 # Protein_GI_number: 20092053 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Methanosarcina acetivorans str.C2A # 6 148 8 151 175 104 40.0 6e-23 MKDGKSIVLIGMPGVGKSTIGVILAKEIGYQFLDADLLIQEQEGMLLKDIIATKGHDGFL AVENQVNREVNAKHSVIATGGSAVYCEEAMLHYKDTCQIIYLRCPYEILSKRLGDLKGRG VALKDGQTLLDLFEERSVLYEKYADLIIDEGDKGIEETLEILKEKLS >gi|222441927|gb|ACEP01000015.1| GENE 16 15257 - 16057 188 266 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 11 263 4 240 242 77 26 1e-13 MNENWLGLDGQVVIVTGGASGIGKHVVDTLVKVGANAVAVDLNVKTGDELDGAYCVQCNV TDPESVNQMVQAVLEKYGKIDALVNNAGINLPRLLVDVKGEKPEYELNEDSFGKMFAVNV KGVFLCAQAVARELVKQGHGVILNMSSESGKEGSQGQSAYSATKGAVDSFTRSWAKELGK YNVRVVACAPGIMEATGLRTTAYNEALAYTRGVKPEDLSTDYSKVIPIGRDGKLDEVGSL VAYLVSDQASYITGTTVNISGGKSRG >gi|222441927|gb|ACEP01000015.1| GENE 17 16185 - 16634 441 149 aa, chain + ## HITS:1 COG:ECs3562 KEGG:ns NR:ns ## COG: ECs3562 COG4578 # Protein_GI_number: 15832816 # Func_class: K Transcription # Function: Glucitol operon activator # Organism: Escherichia coli O157:H7 # 1 118 5 118 119 58 30.0 4e-09 MIKIALVIGLAFVIQFVLSSFQMKNFNNEFVRLRRKGKVAIGRKSGGFHAGAIVMFRIDE KGIIQESRKIEGTTFLARVKDFPGFEGRYVGDLSANDVEKSHKNLRKAVEDAALTYKKYM AGEEIAQPPSNFQKAGNVVTNLFNRRAGA 
>gi|222441927|gb|ACEP01000015.1| GENE 18 16649 - 17203 723 184 aa, chain + ## HITS:1 COG:PM1971 KEGG:ns NR:ns ## COG: PM1971 COG3730 # Protein_GI_number: 15603836 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIC # Organism: Pasteurella multocida # 8 182 9 183 184 195 53.0 3e-50 MNILVKLASGFMNLFTLGGEQFVSWVTGIIPTVLMLLLLMNAIIALAGDESVNKLARVCT KNPILRYLVLPFVSAFMLGNPMALSMGKFMPEFYKPAYYASATYHCHTNNGIFPHINASE LFVWLGIANGISTLKLDTTPLAVRYLLVGLVANFISGWVTEFTTRYVEKQQGVKLSRTLK ISEN >gi|222441927|gb|ACEP01000015.1| GENE 19 17230 - 18240 1377 336 aa, chain + ## HITS:1 COG:PM1970 KEGG:ns NR:ns ## COG: PM1970 COG3732 # Protein_GI_number: 15603835 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIBC # Organism: Pasteurella multocida # 5 335 3 328 329 306 50.0 4e-83 MSELRAIRIERGPQGFGGPLIIRPTEQKNKVMYITGGGTAPECLKKIVELSGMTPVDGFH GSAPEEELAMVIVDCGGTLRCGIYPQKRIPTVNVMPVGKSGPLANFITEDIYVSAVTSKQ ISLAEEGEAVQATEVSEKKEEKAVKFNADQKVSETLAAQENKSIITKVGLGVGKFVNTFY QAGRDAIQTCITTLLPFMAFVSLLVGIINGSGFGNAFAKLLTPLAGNVFGLVALGVICSI PGLSALLGPGAVIAQIVGTLIGTEIGKGTIAPQLALPALFAINCQGACDFIPVGLGLAEA EPETIEVGVLSVMYSRFLTGWIRVLIAYAASFGLYA >gi|222441927|gb|ACEP01000015.1| GENE 20 18615 - 18977 647 120 aa, chain + ## HITS:1 COG:BH0773 KEGG:ns NR:ns ## COG: BH0773 COG3731 # Protein_GI_number: 15613336 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system sorbitol-specific component IIA # Organism: Bacillus halodurans # 1 120 5 124 125 78 34.0 3e-15 MAVIYENKVKELGKDIMMMNGGDFIIIFGDSAPAELRDYCYSVDVNPINGEIKAGQTLKI DENEYKITCVGEEAPVTLAGLGHCTIRFNGMTEAELPGTLYVEEKAVPEIKVGTTIQIIE >gi|222441927|gb|ACEP01000015.1| GENE 21 19217 - 21121 1565 634 aa, chain + ## HITS:1 COG:BS_licR_1 KEGG:ns NR:ns ## COG: BS_licR_1 COG3711 # Protein_GI_number: 16080911 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 9 461 5 478 
499 118 24.0 3e-26 MKNLFGDNRQANILSVLRKNSALNIEMLAQRFGVSERTVRNDIKDINRELKNSGLVEINQ GKCSLRVFDTRDFQNAYARIIETDDLMNSSQKRQEYVFAKLMRAMEPVLTDDIAYEMNIG RSTLISDLKKLRQTMEKYELEIVGKTSKGLALGGSELNIRKFVMENLFGSIYQNYPQDEL MLGKIREAMAEKNFEESTQKMFENYMTLMFDRFLTGHVITRMPEKYYNLVSRNSFSFVDE LIDDISKEFYIEIPIEEKIFVFLPIIGMRTPADSKNMYSIELDEKIRPLLQKIVEQIRQE LNISIDTHEFTEKFMYHIMFMINRLRYNVHIDNPMFEDIHYKYPLAFKMAEIAARVIKED AEVVVTQAEMGYLAAYFGVFLEVNTLNQKQQKIAVISDKGRVTAQLFAVQIRKVVDSSSQ LDILSPSEAVSGILDQYDIVICTTEHIIECECPVIYIHEIFDENELKNKLRQAKFCKGTD AAILDDNWYVMANILKEDTFFNLSEQEDYTSALNYIIDTLEKRGYVDADFKERVWEKESR SSMTIDNIAIPHAVQKAGDSIVLSVGVFRRPMPYKEDNIQIIFLLALPEEIRDENRLVCV YDEIMSLVKNDEMIEKITQSENYIDFMKVLYKRN >gi|222441927|gb|ACEP01000015.1| GENE 22 21948 - 22784 596 278 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026020|ref|ZP_03715212.1| ## NR: gi|225026020|ref|ZP_03715212.1| hypothetical protein EUBHAL_00259 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00259 [Eubacterium hallii DSM 3353] # 1 278 1 278 278 471 100.0 1e-131 MFFGKKQSKRKMNLSRKIMAFVLAGVLLLSSVGMNVSAQEIVTKKSTISKNTKVDIYKAT VQSGNGERTYRVFAQTSKKYKKNSFISLHGCAVCSLTTVLSGYSKKYRNYTPNKTSRILE KKVFGSSRWNANYRKSLGAQRPVSLYGISKVLSYCNISNRYIRFFKDKTAVRQIEKHLKT GNPVIIEVNNRTQKNGRFGSYTNKWSSSKHTMVLLGMTNTGKVIVADSATRKWSGKKQRI KFTTMKELVKYMIPCKSVSTSVYYKSVSSAGGYILVNP >gi|222441927|gb|ACEP01000015.1| GENE 23 23006 - 23662 491 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026021|ref|ZP_03715213.1| ## NR: gi|225026021|ref|ZP_03715213.1| hypothetical protein EUBHAL_00260 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00260 [Eubacterium hallii DSM 3353] # 1 218 1 218 218 391 100.0 1e-107 MNRKRKRNYGYVILFLGLWFFWFQNVSASQMLFPFVADDWLSFEGHSFLELQQENGRKNV NQTKEKKKDKENVKEAMAVPVKYDNVISAGDRQILYKVVSMECDTDYEGSLAVISCMLNR LESDKYPDDLMEVITEKAQFSAYYDKKTETYPYLDRVPSKECIQAVDDALSGKKRNLPSY ILYFRSSAYKKLKGYKKYGQIGDNTYFYKEHDKEQNKK >gi|222441927|gb|ACEP01000015.1| GENE 24 23847 - 25514 1997 555 aa, chain + ## HITS:1 COG:PA1794 KEGG:ns NR:ns ## COG: PA1794 COG0008 # Protein_GI_number: 
15596991 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Pseudomonas aeruginosa # 8 551 9 553 556 619 53.0 1e-177 MEEKEKVSKNFIEMAIDKDLEEGVYDHVMTRFPPEPNGYLHIGHAKSILLNYGLAQKYNG KFNLRFDDTNPTKEKVEFVESIKKDVEWLGADYEDRLFFASNYFDQMYEAAVTLIKKGKA YVCDLSAEEIREYRGTLKEPGKESPYRNRSVEENLELFEKMKNGEFADGEKVLRAKIDMA AGNINMRDPILYRVAHMTHHNTGDKWCIYPMYDFAHPIEDAIEGVTHSICTLEFEDHRPL YDWVVKEVGFEQPPRQIEFAKMYLTNVITGKRYIKKLVEDGIVDGWDDPRLVSIAALRRR GFTPEAIRNFIELAGVSKAQSSVDYAMLEFCIRDDLKLKKSRVMAVLDPVKVVITNYPEG QIEYLDVENNRENEELGSRKVAFGREIYIERDDFMIDPPKKYFRLYPGNEVRLMNAYFVT CTDYITDENGKVTEIHCTYDPETKSGSGFTGRKVKGTVHWVCADECVDAEVRLYENIVDE EKGVYNEDGSLNLNPNSLTILPGCKLEAGLKDSKAYDSFQFVRQGYFCCDAKDSSEEHLI FNRIVSLKSSYKPKK >gi|222441927|gb|ACEP01000015.1| GENE 25 25701 - 26534 1181 277 aa, chain + ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 265 1 264 278 261 49.0 9e-70 MYKIVKKETLNSVVELMEIYAPFVARKCEPGQFIILRVDEDGERVPLTIADYDREKETVT IIYQVLGYSTTLLSQKKEGEYVADFVGPLGVPAALEKKENKVIGVAGGVGAAPLYPQLRK LAENGTKVDVIIGGKSKEYVLWADKFREFCENVYVATDDGSEGTKGFVTTVLQDLLDKGE VYDECIAIGPLIMMKNVVKVTKPADLHTMVSLNPIMIDGTGMCGGCRVTIGGETKFACVD GPDFDGFLVDFDECMKRQGMFKEEEHECKMMALGGEA >gi|222441927|gb|ACEP01000015.1| GENE 26 26534 - 27931 1902 465 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 6 465 1 462 468 530 59.0 1e-150 MAKKNMSLTKVPMPEQAPDVRNKNFEEVALGYTAEMAMEEAARCLNCKNKPCVSGCPVNV PIPGFIEKVAEGDFEAAYEIITSENALPAICGRVCPQENQCEGKCVRGIKGQPVGIGRME 
RFVADYHMEHAEPVKADIKKNGKKVAVVGSGPSGITCAGELIKKGYDVTVFEALHKAGGV LSYGIPEFRLPKALVAREIKSVEDLGVDIETNVIVGRSVTIDELMEDGYEAVFVGSGAGL PRFLNIPGENLLGVYSANEFLTRVNLMKGYKFPEVPTPVKVGKRVAVVGAGNVAMDAART AKRLGAEEVYIVYRRSEEEAPARLEELHHAKEEGIIFKFLNNPAAIVGDDNGWVKGMEII KQELGEPDASGRRRPVPIEGSNYILDVETVIIAIGQSPNPLIRHTTPGLECQKWGGIIVN EETMESSKENVYAGGDTVTGAATVILAMGAGKKAAAAIDEKLSGK >gi|222441927|gb|ACEP01000015.1| GENE 27 28343 - 30415 2327 690 aa, chain + ## HITS:1 COG:CAC2367 KEGG:ns NR:ns ## COG: CAC2367 COG5492 # Protein_GI_number: 15895634 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 43 223 163 333 752 96 35.0 2e-19 MKRGKVTALVMAFIMAVSVLASPVTMLNTNAAVKAKKITMAKKATLEVGAKKKLKVTVKP AKAKVKITFRTSNSRIAKVNSKGVVKGVKAGQATITAKAKVGKKVLKARQKITVKKKAQK APAAIKVSALTVKKYDVTINEGNIETMAVTVLPANATNKKLQYTTSDNNVATVDNKGVIT AIAQGSCKVTAETMDGSNRSVSVNVTVTKTQRPRCIITQDAEVDDMNSLIHVLLYSNEVD IQGIVQSSSKFHWKGVAGQKEEKYTKPYRWPGTEWMQKYLNAYQKIYPNLKKHDKSYPAP AYLKSVTKVGNIGYKGEMDSVTEGSELIKKKILDNDERTLYLLAWGGTNTISRALKDIEK EYKGTKQWDAIRKKIINKVVIPACGEQDETYSEYIAEEWPEIKFMSCSQMSSYAYMWRTQ PEDSSKKTLYADFMLKNLIRKHGALLDNYVTWGDGTYLDGEEPGSQFGTNEDLLDSLNWW GGFNPVQKYQRYDFLSEGDSPTFFMLLDTGLRTLEDITNGGFSGRYARADKKNSKGQEVN YWSPVTDTYVKEDGSTMKVESSWKYIDDIQNDFAARADWCIVNDYVKANHAPKVSVTEGT DIKASAGETLKLHAIATDPDDDYVTVSWSEYTDASTTETALTLKGAASDTISFKIPEDAK AGQKIHLIVQAQDDGEHTLTHYQQVIITIK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:41:12 2011 Seq name: gi|222441926|gb|ACEP01000016.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont15.1, whole genome shotgun sequence Length of sequence - 20727 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 10, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 203 - 247 -0.9 1 1 Tu 1 . - CDS 285 - 1250 209 ## Cphy_0822 HAD family hydrolase - Prom 1284 - 1343 8.8 + Prom 1231 - 1290 5.5 2 2 Tu 1 . 
+ CDS 1344 - 2819 1371 ## COG1559 Predicted periplasmic solute-binding protein + Term 3047 - 3112 -0.9 + Prom 3052 - 3111 5.0 3 3 Tu 1 . + CDS 3219 - 3449 170 ## gi|225026030|ref|ZP_03715222.1| hypothetical protein EUBHAL_00269 + Term 3500 - 3543 2.1 - Term 3593 - 3631 5.1 4 4 Tu 1 . - CDS 3702 - 5084 616 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 5 5 Tu 1 . + CDS 5451 - 5879 222 ## ELI_4142 prepilin peptidase + Prom 6428 - 6487 8.3 6 6 Tu 1 . + CDS 6530 - 10087 4244 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 10148 - 10198 14.2 + Prom 10192 - 10251 10.2 7 7 Tu 1 . + CDS 10303 - 11082 400 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Prom 11539 - 11598 5.9 8 8 Op 1 21/0.000 + CDS 11713 - 16248 5019 ## COG0069 Glutamate synthase domain 2 9 8 Op 2 . + CDS 16261 - 17745 1604 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases + Term 17936 - 17976 -0.7 + Prom 18514 - 18573 5.8 10 9 Tu 1 . + CDS 18708 - 19874 824 ## COG2230 Cyclopropane fatty acid synthase and related methyltransferases + Term 19914 - 19947 4.0 + Prom 20030 - 20089 3.4 11 10 Tu 1 . 
+ CDS 20119 - 20694 733 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation Predicted protein(s) >gi|222441926|gb|ACEP01000016.1| GENE 1 285 - 1250 209 321 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0822 NR:ns ## KEGG: Cphy_0822 # Name: not_defined # Def: HAD family hydrolase # Organism: C.phytofermentans # Pathway: not_defined # 8 197 113 309 396 83 28.0 1e-14 MFNFFIIMDEIQGNNIARNFSSEYFNYTINFIISKEFPALTDFPDFSLILCDLDKNIELA KKHNLPVIAFSHKNNRQESLMGTPWLILDTDGLSPFFLNEVYCRHYKKPLTITTTNRCII RELTTRQLPELLQLQEENKNNPSGCFFPQNCTTYAEAEEFLQNYIKNQYAFYGYGIYGIF NKENETFLGIAGFSPFENVITSDTLNSKEKNFKISENLSEKIPEKKSKNISEHCSEKTSE KYPENDFNEYSAEIGYSVLKKWQQQGIASEILPPLIHFGKEYLGFTKIVTRIEKNNIASI RLAKKNNLKILICQATEQTNS >gi|222441926|gb|ACEP01000016.1| GENE 2 1344 - 2819 1371 491 aa, chain + ## HITS:1 COG:CAC1685 KEGG:ns NR:ns ## COG: CAC1685 COG1559 # Protein_GI_number: 15894962 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Clostridium acetobutylicum # 137 475 13 338 339 153 30.0 7e-37 MRKEENTEKDELYGKLQVHFIENVEAEESSEEKESDSGNLSDETARKSVSAEKENRAKKP VQKDIEEASAKSKGKKSETQSDSSQTDSEDDVKEQKGSRKRRHSKVDKQEAEAVEESFYK VTGEAEEHKSHPVRNAILGLLVLILVASAVVSYPIGKEYFQEKSVAGKDIEITIEKGSTS RDVSAILKKKGIIRYEAAFLLKLYFSDYKGKLRYGTFDLNNGMSLGKVIKELATQDGQKE NKFTIPEGYTIEMTASKLEKEGIMSAQEFLTAVTNAAVTSKYKDVLPKKKKVFYQLQGYI YPDTYYLAKDITGDQLVAKILDEFDKKFDATRQEKAKKLGMTVEEVLIRASLLQKETELP EEYPIIAGVIQNRLDKKMKLQFDSTAVYAITKGQYGIARVMYKDLKVDSPYNTYKHKGLP VGPICSPSLEAIDGVLNPQKNDYLYFQMDTVKNDGSNIFSKTYEEHKAASATTEAATETT TTAVKQKTKKK >gi|222441926|gb|ACEP01000016.1| GENE 3 3219 - 3449 170 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026030|ref|ZP_03715222.1| ## NR: gi|225026030|ref|ZP_03715222.1| hypothetical protein EUBHAL_00269 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00269 [Eubacterium hallii DSM 3353] # 1 76 1 76 76 114 100.0 3e-24 MKCLDCGNEMQQGIVEGYGQGGHSCYEFTSDEEKKKKGLKGFFTKEIISISTDNMEHCAW YCSKCKKVLMWMDTDE 
>gi|222441926|gb|ACEP01000016.1| GENE 4 3702 - 5084 616 460 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 4 456 6 440 456 241 33 2e-63 MVKLIETVYNFLWGDLLTAPLPGGGTLPLSLLVILLIPTGVYFTIRTKFLPIRLFKDMVG ALFEKKDEEQSALSVIQTLIVSTATRVGMGNLVGVVAAISAGGAGAVFWMWVTALIGSST AFIEATLSQIYKEKDPLYGGYRGGPAYYIHRYFEEKSGKKKRYVLLSVLFAISGLICWCG ISQVVSNSVASAFKNAFHIPPIYTTVILVILAAIIVLRKNATVKVLDIIVPIMAGFYLLI TLFIIITNITQLPGVFERIFSEAFGFRQVAAGGFGAVLMNGIKRGLFSNEAGSGSAPCAA AAAVADNPVKVGLTQALGVFIDTLMICSCTAMIMLLTPTKLTEGLVGMDLLQKAMEYHLG SFGVIFIAITLWLFSFSTFIGILFYARSNVAYLFGDNWFSQTAYKVLALAMLFVGGLAAY TIVWDLGDVGIGLMTIFNIIMLYPLSKKALQALKEYEKKF >gi|222441926|gb|ACEP01000016.1| GENE 5 5451 - 5879 222 142 aa, chain + ## HITS:1 COG:no KEGG:ELI_4142 NR:ns ## KEGG: ELI_4142 # Name: not_defined # Def: prepilin peptidase # Organism: E.limosum # Pathway: not_defined # 4 132 113 239 257 87 44.0 2e-16 MIVIVSILLVISIIDFKIKIIPNQLNVLLFISGIWSGFVFQEVTFLSRFLGVFSVSIPMF ILAILCSGGLGGGDVKLMAASGVLLGIKWNIFAACAGLLLGGLYGFFLLITKRAKRKDCF ALGPFLCIGIAVVFFRYIFMKV >gi|222441926|gb|ACEP01000016.1| GENE 6 6530 - 10087 4244 1185 aa, chain + ## HITS:1 COG:CAC2229_1 KEGG:ns NR:ns ## COG: CAC2229_1 COG0674 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 3 405 2 403 413 544 65.0 1e-154 MPRAKQSMDGNTAAAHVAYAYTDVAAIYPITPSSPMADSVDQWAAAGQENIFGNQVVVSE MQSEAGAAGTVHGSLAAGAITTTFTASQGLLLMIPNMYKIAAEQLPCVFDVSARTVATQS LNIFGDHSDVYACRQTGFAMLAETNPQEVMDLSPVAHLAAIEGKVPFINFFDGFRTSHEI QKIEKWDYADLKEMCNMDAVKAFREHALNPEHPAMRGSHENGDVFFQHREACNSVYEALP AVVKKYMGKINEKLGTNYDLFNYYGAEDADRVIIAMGSICDVAEEVIDYLTAKGEKVGLI KVRLYRPWVSSALLDVIPKTAKKIAVLDRTKEPGALGDPLYLDVATTLREAGLNDIVLTA GRYGLGSKDTPPSSVFAIYKELEKDAPKSRFTIGIVDDVTNLSLPEVKPAPITSAPGTVE CKFWGLGGDGTVGANKNSTKIIGDHTDKYIQAYFQYDSKKTGGITISHLRFGDNPIRSPY 
YINQADFVACHNPSYVVKGYKMVQDVKPGGIFMINCQWSDEELDHHMPAAAKKYIADNNI QLYTINAIDKAIEIGMGKRTNTILQSAFFKLANIMPIDEAVEFMKAAAKKSYSKKGDAVV EMNYKAIDAGVDAVHKVEIPASWSNPAADAPAKELTGRPETVKMVKEMMEPIGIMDGDSL PVSTFVDNADGQFELGASAYEKRGTAVSVPEWNPEACIQCNQCAFVCSHATIRPFLVSED EIKAAPDNMKVADTKPKASEYKFTMSVSPLDCMGCGECITVCPAAAKGALKMVPQESQAA EQPVFDYLVATVGRKPGAPADTTVKGSQFNQPLLEFSGSCAGCAETSYARLITQLFGEHM YISNATGCSSIWGGPAATAPYTVNKDSKQGPAWANSLFEDNAEHGFGMYLGQKTLRDQAI AKLEKMAASDKASDEFKAAFAKFMETKDNTKENTPDAQALIAEVEKAAAAGCPDAAEVLE KKQYLAKKSVWIFGGDGWAYDIGFGGLDHVLASGENVNVMVFDTEMYSNTGGQASKASNI GEVCQFAAAGKEISKKSLAEIAMSYGYVYVAQIALGANPNQAVKCLAEAEAYNGPSLIIG YAPCELHGIAKGGMNHCQDEMKKAVKAGYWNLFSFDPAKKAEGKNPFTLTSKEGDGSYQE FLNNESRYTRLIKPFPERAERLFNESEKVAKARYEHLQRLVELYK >gi|222441926|gb|ACEP01000016.1| GENE 7 10303 - 11082 400 259 aa, chain + ## HITS:1 COG:MA2121 KEGG:ns NR:ns ## COG: MA2121 COG2865 # Protein_GI_number: 20090964 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 84 212 156 283 458 75 38.0 8e-14 MLHKEDFGEGKNIEFKREIPKRHENFLKDVIAFSNSTGGKIFIGIEDKTNEVIGIGEKNP FKLADDISNMIFDSCTPIIDPEITVQTLEGKTQIINKMTIEKLEDLGLLCRIGKELQPTH AFRLMTKNKIRYAKIQCALFKGTERDIFIDKREFDGPLYSQLENAYQFVLKHINLSAKIE GLHRKESYELPVRTIRELIINAVVHRSYLDESCIFGRKEIKQGLGYKDSKAGFLIQIMQE FELIKTVKGRGKGKYRFAI >gi|222441926|gb|ACEP01000016.1| GENE 8 11713 - 16248 5019 1511 aa, chain + ## HITS:1 COG:sll1502_2 KEGG:ns NR:ns ## COG: sll1502_2 COG0069 # Protein_GI_number: 16329610 # Func_class: E Amino acid transport and metabolism # Function: Glutamate synthase domain 2 # Organism: Synechocystis # 387 1185 1 801 801 932 58.0 0 MYQDNMRQGLYDPSLEHDNCGIGAVVNIKGLKSHETVSNALKIVENLEHRAGKDAEGKTG DGVGILVQISHKFFEKTCSTLGFSIGKERDYGIGMFFMPQDELQRNRAKKMFEIIVEKEG LNFLGWREVPVQPGILGKKAADCMPCIMQGFVKRPFDVPRGLDFDRRLYIVRRVFEQSND SAYVVSLSSRTIVYKGMFLVGQLRLFYNDLQDKDYESAIALVHSRFSTNTVPSWEKAHPY 
RYIVHNGEINTIRGNADKMLAREENMESDYLKDEMIKLTPVVDPNGSDSARLDNTLEFLL MSGMPLPLAMMITIPEPWENNEGMSREKRDFYQYYATMMEPWDGPASILFTDGDIMGAVL DRNGLRPSRYYITDDDQLILSSEVGVLDFDTTHIVKKERLHPGKMLLVDTVKGELIDDEI LKDSYVKNQPYGEWLSDNLVFLKDLKIPNVAVEEYSYEESNRLMKAFGYSYEEVKDSVLP MALNGSEAIGAMGVDTPLAVLSKRHHPLFDYFKQLFAQVTNPPIDAIREEIVTSTSIYVG GEGNVLEEKAENCHVLKIHNPILTNTDLLKIRYAKVKNINVKDVPITYYKGTPLDKAINH LYVAVDQAHKNGANVVILTDRGVDENHVAIPSLLAVSAVHHHLIETKKSTSVSIILESGE PREVHHFATLLGYGACAINPYLAQFSIKKLIQDGLLKKDYYAAVNDYNNAVLHGIVKIAS KMGISTLQSYQGSQIFEAVGISKKVIDEYFTNTVSRVEGITIEDIAEDVDYHHSKAFDML GLKNDLSLDSEGDHKYRSGKEEHLYNPLTIHTLQTAAWTNNYELFKQYTSMINKETSPVS LRGLMDFNYPSKGVPIEEVESVDSIVKRFKTGAMSYGSISKEAHETLAIAMNMIHGKSNT GEGGEDLERLTVGPDGLNKCSAIKQVASGRFGVTSRYLVSAQEIQIKMAQGAKPGEGGHL PAGKVYPWIAKTRHSTPGVGLISPPPHHDIYSIEDLAQLIYDLKNANRNARISVKLVSEA GVGTVAAGVAKAGAQVILISGHDGGTGAAPRNSSIHNAGLPWELGLAETHQTLIKNDLRN KVIIETDGKLMSGRDVAMAAALGAEEFGFATGPLITMGCVMMRVCNLDTCPVGIATQNPE LRKRFKGKPEYVVNYMKFVAQEMREYMAKLGVRTVDELVGRTDLLKELPEAKEYHLDLSA ILNNPYVDKKHPICYNKKNEYNFELEKTLDEKVLLTKLKTALDKKQKRSISIDVGNTDRS FGTIFGSEITKKYYNTLEDDTFTVQCTGAGGQSFGAFIPNGLTLELVGDSNDYFGKGLSG GKLIVYPPKGIRFKAEENIIVGNVALYGATSGQAYINGVAGERFCVRNSGATAVVEGVGD HGCEYMTGGKVLIIGKTGKNFAAGMSGGVAYVLDEDNDLYLSLNKEMVSSEPVVSKYDVM EIKDMITAHVAYTNSEKGKEILNDFSKYLPKFKKIIPHDYKKMLKKISEMEGKGLSAEQA QIEAFYEIKRG >gi|222441926|gb|ACEP01000016.1| GENE 9 16261 - 17745 1604 494 aa, chain + ## HITS:1 COG:sll1027 KEGG:ns NR:ns ## COG: sll1027 COG0493 # Protein_GI_number: 16329369 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Synechocystis # 1 494 1 493 494 541 53.0 1e-154 MGKPTGFLEYERKESVGEAPLKRIKHFNEFHTPLSKKEQERQGARCMNCGVPFCQSGILI KGMVTGCPLNNLIPEWNDATYTGNWDEAYNRLRKTNPFPEFTGRVCPHPCEVGCTCNLGG DPVNIKEKELSIIERAYESGLIKPEPPEVRTGKTVAIVGSGPSGLCVAEYLNRRGHSVTV YERSDRVGGLLMYGIPNMKLEKTYIERKVKLMEKEGVVFKTGVNVGVDVKAQDLKKQYDA VVLCCGASNPRDIKAPGRKSKGIYFAVDYLKAATKSLLDSNFADKKYIDAKDKNVMVIGG 
GDTGNDCVGTSIRLGCKSLIQLEMMPKLPDERQPNNPWPEWPRVCKTDYGQEEAIAVFGK DPRVYQTTVKEFIANEKGEVCKAKLVKLESKFDKKAGRNIMAEVAGSEYEVPVDLVLIAA GFLGSQSYVTKAFGLDLTERTNVKTVEGHFKTNVDKVFTAGDMHRGQSLVVWAIREGRDC AREVDKALMGYSNL >gi|222441926|gb|ACEP01000016.1| GENE 10 18708 - 19874 824 388 aa, chain + ## HITS:1 COG:CAC0877 KEGG:ns NR:ns ## COG: CAC0877 COG2230 # Protein_GI_number: 15894164 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cyclopropane fatty acid synthase and related methyltransferases # Organism: Clostridium acetobutylicum # 27 388 28 388 391 354 48.0 2e-97 MNLTEKEFVKFLSKFDEHPFDVKIGDEVSHIGKGESEFTVIFHKIPSMKELLTSTSIALG EAYMDGKLEIEGDLYHALNMFLGQMGKFSTDKKALKKLIFSSLTKKNQQKEVSSHYDIGN DFYKMWLDETMSYSCGYFKHPEDTLYQAQVQKVERILEKLYLKEGMTLCDIGCGWGYLLI EAAKKYGIKGVGITLSKEQKKKFEERIKENHLEDQLEVRLMDYRDLPESGFKFDRVVSVG MLEHVGRDNYELFMKSVKSIMNPGGLFLLHYISALQEHPGDAWVKKYIFPGGVVPSLREI INIAGDYQFYTLDVESLRRHYNKTLLCWNANFQKHKDEVREMFGERFVRMWELYLCACAA TFMNGIIDLHQILFTNGVNNDLPMTRWY >gi|222441926|gb|ACEP01000016.1| GENE 11 20119 - 20694 733 191 aa, chain + ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 3 183 4 176 190 168 51.0 5e-42 MNIILASGSPRRKELLAQAGFDFEVEVSNADENVAEESPTEMVEELAARKAEAVVNLHNK KEDNCLVIGADTIVVLDGKILGKPSDEADARAMLASLSGRTHQVYTGVALFSVKEGIIEK KTTFHECTDVTMVSMTEKEIADYVASGDPMDKAGAYGIQGLAAIFISEIKGDYYNVVGLP ISRVYHEIEQF Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:41:28 2011 Seq name: gi|222441925|gb|ACEP01000017.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont16.1, whole genome shotgun sequence Length of sequence - 3709 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 140 - 343 274 ## gi|225026043|ref|ZP_03715235.1| hypothetical protein EUBHAL_00282 2 2 Tu 1 . + CDS 1327 - 2337 917 ## Selsp_1852 hypothetical protein 3 3 Tu 1 . - CDS 2274 - 2423 116 ## - Prom 2451 - 2510 4.0 + Prom 2353 - 2412 6.0 4 4 Tu 1 . + CDS 2486 - 3682 642 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|222441925|gb|ACEP01000017.1| GENE 1 140 - 343 274 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026043|ref|ZP_03715235.1| ## NR: gi|225026043|ref|ZP_03715235.1| hypothetical protein EUBHAL_00282 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00282 [Eubacterium hallii DSM 3353] # 1 67 15 81 81 107 100.0 2e-22 MAFFDTLKQNLMTASQVTMDKSKNTAEILKLKDQIRQDKREIRSATYKIGEIYRELHSEN HEEAYED >gi|222441925|gb|ACEP01000017.1| GENE 2 1327 - 2337 917 336 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1852 NR:ns ## KEGG: Selsp_1852 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 20 325 21 326 333 174 36.0 4e-42 MIRQTGLAQAIDIAENRAKYDECAKKLLSYKAIIAWILKSCTKEFSQYGVQYICDNCLKE EAEVSTYAVHQDELDKNDKLDGDKRVEGMNTESNSIQEHTIYYDIRLPAFLPKSNEIVRL ILNLEIQLDDTPGYPIVKRGFYYCGRMVSEQYGTVFTNEHYEKLEKVYSIWICPDPAKKR KNGIFKYHTVEDIIYGESYTKEENYDLMEVVVLNLGDADKSSDLEILDLLNVLFSATTTP EEKKQRLNDEFEIAMTVEFESEVQEMCNLSKALVEQGIKQGIKQGIELGKEERTLSMAQM MIQEREPIEKIEKYTGYTLEKLREIATKIGIPLMTE >gi|222441925|gb|ACEP01000017.1| GENE 3 2274 - 2423 116 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFQYVSILTTMRMMLKDVYAIYISIYFLSHSVINGIPILVAISLNFSNV >gi|222441925|gb|ACEP01000017.1| GENE 4 2486 - 3682 642 398 aa, chain + ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 4 395 3 388 396 331 46.0 2e-90 MKERLIFHVDVNSAYLSWEAVHQLKNGASTDIRTIPSIIGGDTSLRKGVVLAKSLPAKRF RIHTGEPVTDALTKCPTLQSFQPNFPLYHKYSKAFITILKKYAPVVEQVSIDEAYLDMSG 
LHYFYSTPLEAAEKIQTDIRETLDFTVNIGISSNKLLAKMASDFEKPDKIHTLFPSEIEK KMWHLPVRTLFFTGRAAAQKLNQLGIYTIGDIAKSNPDFLKAHLNSQGVTLWNYANGRDN SPVKESPDAPKGYGNSTTLSKDLTSLSEAKPVLLELCETVTKRLREDQVYAQVIEVELKD SHFRRKSHQTVLDYSTNTTDDFYQISVKLFKELWDGTPIRLLGVRGGKLTLDKIEQIQLF TETASSHEKREKMSKLDAALDSIQKKHGKNSIMRASKL Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:41:46 2011 Seq name: gi|222441924|gb|ACEP01000018.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont17.1, whole genome shotgun sequence Length of sequence - 8999 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 - CDS 26 - 478 299 ## COG0784 FOG: CheY-like receiver 2 1 Op 2 . - CDS 545 - 3280 2309 ## COG0642 Signal transduction histidine kinase 3 1 Op 3 . - CDS 3306 - 3464 115 ## gi|225026049|ref|ZP_03715241.1| hypothetical protein EUBHAL_00288 - Prom 3506 - 3565 4.8 4 2 Tu 1 . - CDS 3579 - 4325 706 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 4404 - 4463 9.2 5 3 Tu 1 . - CDS 4479 - 5393 717 ## COG1180 Pyruvate-formate lyase-activating enzyme - Prom 5549 - 5608 5.7 6 4 Op 1 2/0.000 + CDS 5718 - 8132 2488 ## COG1882 Pyruvate-formate lyase + Term 8156 - 8197 2.7 + Prom 8156 - 8215 7.0 7 4 Op 2 . 
+ CDS 8278 - 8955 950 ## COG0176 Transaldolase Predicted protein(s) >gi|222441924|gb|ACEP01000018.1| GENE 1 26 - 478 299 150 aa, chain - ## HITS:1 COG:slr2104_4 KEGG:ns NR:ns ## COG: slr2104_4 COG0784 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Synechocystis # 20 145 1 121 130 98 46.0 4e-21 MYTDTQERESSVSSNGEESAQNNLNGINILLVEDNDLNMEIAEFYLSDRGAMIEKAWNGQ EALDKFTSSAPNTYDIILMDIMMPVMDGLETCRKIRTSNHPEGKNIPIIAMTAQVSQECM DKCQLAGMNGHLTKPVTSEKLIKTIMEFTI >gi|222441924|gb|ACEP01000018.1| GENE 2 545 - 3280 2309 911 aa, chain - ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 680 903 1 221 385 154 37.0 6e-37 MERLDMKNNSSSKKTSYITRGAMILLTTILIILFFCIMLLVSQIQGTARIVNYAGLVRGK TQRIIKLEDAGQPHDEMIDSVSSYIKGLRYGSDELNLVRLEDRAFQSKMQELNSYFKELC EEIQRVREVGYENTAIIDKSEHFFQICDEATGLAEAYSQRKATALSQLETIVFVDIGGLV ILIALELIKALRYAAQNRILQSKVYLDEATGLPNKNKCEEILNASDLLSAQDAVAICVFD LNNLRNINNNLGHDKGDEYIRSFAVQLRIAVADEYFVGRDGGDEFIAVLKNVTRMQVEEC LRDIREQAAKYSKEYPEMPISYAVGYAMSQDFEQSTMRELFRYADKNMYIDKNRAKMEEA AEEKRMNQRLLAKVKEMGYQFSDCLYCDVFMDQYRVLRASSKFFLAEDGSYSGAVEQIVH KLATDSTRKKMWSQLQIDYLKEHMTEEQPIHEISYKYTEEDVTIHGRLTGIFCDTGRDGT VHHFILGFEVFHDRNVAASDEKLQLTQYYEQMKQAILENGNYVEALLDTAEAVYTVDFTH DRLEKIFYHSESAREFDLKIVLPCSYDTYCLKQREFITKDTMENYRIVEASAKLLGRFCS GEKQVTVEYRERGQNGKLIWLQKTVLMSQDTVYDSELQEETKVVHGIILFKDTSVFHEKE QQEKERLQLAFEKADSASKAKTEFVNRMSHDIRTPINGIMGMLDIIRKNKENPAKLEECL GKIQLSAGHLMALADDVLDMSKLESGRLVLEEVSFDLLQLMADVTSLVNAQLLEMKLSHH TYRKNLKHTILLGSPTRLRQIMLNLFSNAIKYNKVGGKIDTYAKEISFDGITVWYEFKIK DSGIGMSEKFVKEELFDIFTQEQTDARTHYKGSGLGMSIVKQLIKAMHGTIEVKSTLGEG TTFVFSSCHLK >gi|222441924|gb|ACEP01000018.1| GENE 3 3306 - 3464 115 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026049|ref|ZP_03715241.1| ## NR: gi|225026049|ref|ZP_03715241.1| hypothetical protein EUBHAL_00288 [Eubacterium 
hallii DSM 3353] hypothetical protein EUBHAL_00288 [Eubacterium hallii DSM 3353] # 1 52 1 52 52 97 100.0 4e-19 MGIYLNPGNASFRGSLRSKIYVDKSGLIAKTNDVICTEQKYVCVSEMAIIIV >gi|222441924|gb|ACEP01000018.1| GENE 4 3579 - 4325 706 248 aa, chain - ## HITS:1 COG:SPy2054 KEGG:ns NR:ns ## COG: SPy2054 COG1349 # Protein_GI_number: 15675824 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Streptococcus pyogenes M1 GAS # 3 231 2 230 248 231 53.0 8e-61 MKNRGNKILELLTKDNKIEVSHLAEELGVSQVTMRKDLDTLESKGVIKREHGFALLCSTD DINGRIAYHYEEKRKIAQKAAGLVSNGDTIMIENGSCCALLADTLTTTKKDLTIITNSAF IAEYIRGKSNFQINLLGGIYQQDSQVMVGPMILQCVENFFVDRFFIGTDGYSTKMGFTNQ DQMRAQAVRDMARQAEEVIVLTESEKFSKRGIVPLNLKDQIKAVITDSEIDSFYEAELIS KRIQVIKA >gi|222441924|gb|ACEP01000018.1| GENE 5 4479 - 5393 717 304 aa, chain - ## HITS:1 COG:SPy2055 KEGG:ns NR:ns ## COG: SPy2055 COG1180 # Protein_GI_number: 15675825 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 7 304 2 257 257 250 45.0 2e-66 MKKADRTKKGMIFNIQKFSVNDGPGIRTVVFFKGCPLHCAWCANPESQYSEMQILWNMEK CVRCHHCMEICPKKAITFFSNEIKINPYICNGCQKCIEECPARALQAEGEEKTVQGVLDV VLQDKVFYEESGGGITLSGGEMLYQPDFALQLLLAAKEEGLHTCCETTGFLKTELFAKII EQVDYILFDMKHWNSKKHKEGAGIYNELILSNMTYAVKVGKKVLPRIPVIPGFNDSLEDA IQFAATLRNIGITTCQLLPFHQFGENKYDLLGKHYAYKHQPSLHKEDLEEYKDTFIKNGI KAFF >gi|222441924|gb|ACEP01000018.1| GENE 6 5718 - 8132 2488 804 aa, chain + ## HITS:1 COG:SPy2049 KEGG:ns NR:ns ## COG: SPy2049 COG1882 # Protein_GI_number: 15675819 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Streptococcus pyogenes M1 GAS # 6 804 8 805 805 1228 73.0 0 MENQEHFGRLTDRMAAFREEVLEEKPYIDAERAVLATQAYKENQNQPRVMVRALMLQKIL ENMSIYIEDKSLIAGNQATKNKNAPIFPEYTMEFVMNELDLFEKRDGDVFYITEETKQQL RDIAPFWENNNLRARGEALLPDEVSVFMETGVFGMEGKLNAGDAHLAVNYERILAEGLKG 
YEERTKKLKAALDFTKPESIDKNVFYKAVLIVIDAVHTFANRYSKLAQDMALTETDAKRK EELLEISRICAKVPYEPASSFREAVQAVWFIQLILQIESNGHSLSYGRFDQYMYPYYKKD MENGSLSEESALELLTCLWIKTLTVNKVRSQAHTLSSAGSPMYQNVTIGGQTTDKKDAVN ELSFTVLKSVAQTRLTQPNLTVRYHANLNKKFFDECIEVMKLGFGMPALNNDEIIIPSFI NWGVKEEDAYNYSAIGCVETAVPGKWGYRCTGMSYINFPRVLLCAMNNGVDLTSKKRFTK GYGYFTEMETYEDLLAAWDRTVREMTRYSVIVENAIDKASERDVPDVLCSALTDDCIGRG KTIKEGGAVYDFISGLQVGIANMADSLAAIKKLVYEEKKITKQQLWDAILDNFQSPENKK IQEMLIEEAPKYGNDNDYVDNLVVEAYDSYLDEIKKYPNTRYQRGPIGGIRYGGTSSISA NVGQGMGTIATPDGRNAFEPLAEGCSPAHNADKNGPTAIFKTVSKLPTEKITGGVLLNQK MTPQMLSTEENKQKLEMLIRTFFNRLHGYHVQYNIVSKETLIDAQKHPEKHKDLIVRVAG YSAFFNVLSKKTQDDIIGRTEQAL >gi|222441924|gb|ACEP01000018.1| GENE 7 8278 - 8955 950 225 aa, chain + ## HITS:1 COG:SPy2048 KEGG:ns NR:ns ## COG: SPy2048 COG0176 # Protein_GI_number: 15675818 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Streptococcus pyogenes M1 GAS # 1 215 1 216 222 164 42.0 9e-41 MKLIIDDAHIDLIKKIYEYYPVDGVTTNPSILAKSGRQPFEVLKEIRSFIGEEAELHVQV VAKDADGMVDDAHRIVKELGANTYVKVPSIPEGFKAMKALAKENINITATAIYTPMQAYL AAKCGASYAAPYVNRIDNMGFNGIQVAKQIHDIFKNNNLKTEVLAASFKNSQQVLELCEY GIGASTIAPDIIEGLVKNQAITSAVDDFVKDFEGLTGAGKTMSDC Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:41:52 2011 Seq name: gi|222441923|gb|ACEP01000019.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont18.1, whole genome shotgun sequence Length of sequence - 3425 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 31 - 90 8.4 1 1 Tu 1 . + CDS 237 - 980 678 ## gi|225026055|ref|ZP_03715247.1| hypothetical protein EUBHAL_00294 + Prom 1012 - 1071 3.9 2 2 Op 1 13/0.000 + CDS 1091 - 1501 472 ## COG1959 Predicted transcriptional regulator + Prom 1646 - 1705 8.0 3 2 Op 2 20/0.000 + CDS 1741 - 2928 1508 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 4 2 Op 3 . 
+ CDS 2918 - 3343 636 ## COG0822 NifU homolog involved in Fe-S cluster formation Predicted protein(s) >gi|222441923|gb|ACEP01000019.1| GENE 1 237 - 980 678 247 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026055|ref|ZP_03715247.1| ## NR: gi|225026055|ref|ZP_03715247.1| hypothetical protein EUBHAL_00294 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00294 [Eubacterium hallii DSM 3353] # 1 247 1 247 247 443 100.0 1e-123 MKKIHALVMSLIMLFIVLAPAQNVEAAGKSLKVSEKAFFKEMKEFDYKGMNRYVKDWGEG GQLVSAFYMVPSGKKYFAKCASKMSYRIISTKKKGNKADVKVKFRYVNCEDFTFNFCMNA FYYMADGKLDNLSSMSEKQVIKLVNGIIDKSQKDTKFNRFKTKTVTIRFVKAKNCWKVQK VSDKLADVMMANFASNLQNLATFSISSACGEKSAYVIPETSEYGTVQKKVLVKVLKNIYG RKPELSA >gi|222441923|gb|ACEP01000019.1| GENE 2 1091 - 1501 472 136 aa, chain + ## HITS:1 COG:VC0747 KEGG:ns NR:ns ## COG: VC0747 COG1959 # Protein_GI_number: 15640766 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Vibrio cholerae # 1 132 16 145 188 94 34.0 8e-20 MMVSSKGRYALRVMIYLAQNDREELIPLKEIAENQNISMKYLEMIVSLLHKGNMLISGRG KKGGYRLARKPSEYTIGSILKLTEKTLAPVNCLEGGKVTCEKAGYCITLPMWQKLDGIID DYLETVTLEDLLEGRV >gi|222441923|gb|ACEP01000019.1| GENE 3 1741 - 2928 1508 395 aa, chain + ## HITS:1 COG:MA3264 KEGG:ns NR:ns ## COG: MA3264 COG1104 # Protein_GI_number: 20092080 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 3 391 58 448 448 414 54.0 1e-115 MYIYADNAGTTKMSPVAIDAMTKCMNEVFGNPSSLHSTGQRAAEVLQKAREDVAEALGAD FNEIYFTSGGSEADNQAIRSAALFGAKKGKKHLISTKIEHHAVLHTLKKLEREGFEVTLL DVGADGIVDPADVEKAIREDTALVTIMYANNEIGTLQPIDEIAKICKEKKVTFHTDAVQA VGHVPVNVHAQNIDMLSLSGHKFHGPKGIGVLYVRKGVPLFNLIEGGAQERGRRAGTENI PAIVGMAAALKEAVANLDENMAKISAMRDRLIEGLSQVEHSRLNGDAKKRLPGNVNFCFE GIEGESLLLLLDEKGIEASSGSACTSGSLDPSHVLLAIGLPHEVAHGSLRLTLCEENTPE QIEYIIKEVPEIVSYLRNISPVWEELEKGEKKHVI >gi|222441923|gb|ACEP01000019.1| GENE 4 2918 - 3343 636 141 aa, chain + ## HITS:1 COG:MA2717 
KEGG:ns NR:ns ## COG: MA2717 COG0822 # Protein_GI_number: 20091541 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Methanosarcina acetivorans str.C2A # 1 125 1 125 128 168 67.0 3e-42 MLYSEKVMDHFRNPRNLGKMDDADGIGEVGNAKCGDIMKMYIKVKDGIIEDVKFNTFGCG SAIASSSMATEMIKGKSIDDALELSNKAVVEALDGLPTHKIHCSVLAEEAVKAAVKDYYD KNGIEYDKEKFAPDDCPTCGE Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:42:07 2011 Seq name: gi|222441922|gb|ACEP01000020.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont19.1, whole genome shotgun sequence Length of sequence - 15524 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 7, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 203 - 460 308 ## Rumal_1318 hypothetical protein - Prom 513 - 572 6.6 - Term 471 - 515 0.1 2 1 Op 2 . - CDS 678 - 1427 624 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 1583 - 1642 9.7 - Term 1763 - 1798 -0.8 3 2 Op 1 . - CDS 1982 - 3679 1944 ## COG0443 Molecular chaperone 4 2 Op 2 . - CDS 3676 - 5601 1767 ## Clocel_3378 heat shock protein DnaJ domain-containing protein 5 2 Op 3 . - CDS 5604 - 6323 732 ## COG0846 NAD-dependent protein deacetylases, SIR2 family - Prom 6345 - 6404 2.2 - Term 6346 - 6397 3.4 6 3 Op 1 . - CDS 6407 - 7309 1255 ## COG4399 Uncharacterized protein conserved in bacteria 7 3 Op 2 . - CDS 7397 - 9847 1667 ## COG1199 Rad3-related DNA helicases - Prom 9928 - 9987 5.4 - Term 10262 - 10302 -0.8 8 4 Tu 1 . - CDS 10438 - 12093 2028 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase - Prom 12195 - 12254 4.5 - Term 12201 - 12234 1.5 9 5 Tu 1 . - CDS 12344 - 13438 815 ## gi|225026069|ref|ZP_03715261.1| hypothetical protein EUBHAL_00308 - Prom 13461 - 13520 2.4 - Term 13467 - 13512 6.7 10 6 Tu 1 . 
- CDS 13548 - 14051 578 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 14129 - 14188 9.4 - Term 14357 - 14424 25.0 11 7 Tu 1 . - CDS 14463 - 14960 833 ## gi|225026071|ref|ZP_03715263.1| hypothetical protein EUBHAL_00310 - TRNA 15305 - 15376 63.4 # Gln CTG 0 0 - TRNA 15406 - 15477 78.1 # Gly GCC 0 0 Predicted protein(s) >gi|222441922|gb|ACEP01000020.1| GENE 1 203 - 460 308 85 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1318 NR:ns ## KEGG: Rumal_1318 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 85 1 85 85 82 45.0 4e-15 MVQLTLRIDGMSCGMCESHVNDTIRQKFNVKKVTSSHKKGETVIVANQAIDSDKLKAAIS ETGYEVLDIKEEPYEKKGFFSFLKK >gi|222441922|gb|ACEP01000020.1| GENE 2 678 - 1427 624 249 aa, chain - ## HITS:1 COG:BH2026 KEGG:ns NR:ns ## COG: BH2026 COG1595 # Protein_GI_number: 15614589 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 1 94 1 94 179 65 32.0 9e-11 MEITRLVKRAKRKDADAFTELMESQMQNMYKVARSILSRDEDVADAISDTILVCWEKIDT LKKNRFFRTWMTRILINKCNDILRAGQRTVFTDEPLEIEYMENEFLNLEWILKGSNESQK FDINKEFGNTGVKLVSAEISALGLELVSDYPKGCKYADMNGEHQPPSFAGVKMKDGKVYT YESVVDGSGEQFETGSSGKLYGKFTETVTTKKILEVKQIKALLFFKKSAGDSETYKAEDY YEIPLDVEK >gi|222441922|gb|ACEP01000020.1| GENE 3 1982 - 3679 1944 565 aa, chain - ## HITS:1 COG:RSp0521 KEGG:ns NR:ns ## COG: RSp0521 COG0443 # Protein_GI_number: 17548742 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Ralstonia solanacearum # 1 560 23 587 593 478 43.0 1e-134 MIIGIDLGTTNSLAAYYTEHGPEIIPNRLGERLTPSVVSIDENDQVYVGKTALERMAVYP DQTANLFKRSMGSRKEFKLGNKIFTATDLSAMVLRSLKEDAEVYLGEEVTEAVISVPAYF NEVKRKAVKQAGELAGLKVERIISEPTAAAIAYGLYQKNDNTRFLVFDLGGGTFDVSILE LFDDILEVRAVAGDNYLGGEDFTNLIVEEFYKEHGLTEESLSLKEKSYYRNQAEKSKCNA TGDGFYRMQASVNGEKVETLIPRGAFEKMSSQLLDRIKTPVRKSLSDAGVKAREIDEVVL VGGTTKMPLVRKFVGKLFGRVPDTSINPDEAVALGAAIQAAMKERKEPVKEVILTDVCPF 
TLGTEVSVKTENDHLEGNHFCPIIERNTVIPASRTQHFFTVYDHQTQVEIHILQGESRFA SNNVSLGTLKLTVPDNEAGKEQIDITYTYDINALLEVEAKIVSTGETVTRLIKNQENSMT EEEMKARMRELSYLKIPPREQEKNKVLLLRGERLYEETTGELREQLEMVTQQFERILDRQ DPLKIEEARKDYEEALDWIEEEMWI >gi|222441922|gb|ACEP01000020.1| GENE 4 3676 - 5601 1767 641 aa, chain - ## HITS:1 COG:no KEGG:Clocel_3378 NR:ns ## KEGG: Clocel_3378 # Name: not_defined # Def: heat shock protein DnaJ domain-containing protein # Organism: C.cellulovorans # Pathway: not_defined # 5 638 7 636 1112 295 33.0 3e-78 MNDFFNTLGIEATKEEKKIKKAYRARLHAVNPEDDPDGFKRLREAYEEALKYARQKEEEP ENLSPAEEFISRCEQLYKDFYRRIDEQEWEKLFSEDICISLESGEEVRQRFLVFLMENFR LPSPVWKKIDQTFSITGNRKELLELFPEPYVDFLQQVVRYNGALNYELFEGDVSEDVDDY IEHYHKLRQYADLGMRKEAWEEYHFIEANYDICHPYILLEKARLLYSDEERKKQEEGGEI FRSLAKKYPQEERIVCCYGRYCQEKDTWDGTIALYDAVLEAHPESFLARTGKMESILHEG RYREARESILDLLEESPVDERLVADLNKANVYVIAELEPKYKDSLLDQDGLMELAWCYYQ NSRFEDGIALMDGFVPDEEHYLDYHNLKGRIFLTVNKDAEALEHLVPWLHAIQKLRPDGT KKTTRQMARLGYAYYTIASAKADLILDGQEGSLLEVMDYIEKAVEAEQDFSQKVSYYHTA ADIWRRKKEYAKMFDICDKIIALDPQYYPAYLLRQEACLYLGRYQEVINDYQRATALYPY HARPYCTLIRMYLSFGEQDRVKDILQLAKENGANSDELELLRARYIAIRAKSRKDLEKAL SILEKLEKKGWSLQSEMDEEEWAEIVYRKAQILTRLQEETR >gi|222441922|gb|ACEP01000020.1| GENE 5 5604 - 6323 732 239 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 3 236 5 241 245 328 66.0 7e-90 MNEKIKKLKEIIDGTDNLVFFGGAGVSTESGIPDFRSTDGLYSQKYDFPPETIVSHTFFM KRNKDFFDFYKDKMMALDAKPNAAHYKLAEWEAQGKCRAVVTQNIDGLHQMAGSKNVLEL HGSIHRNYCMRCGKFFDAAYVKNSEGAPKCDECGGLIKPDVVLYEEGLDENVISKTIHYI SQADVLIIGGTSLVVYPAAGLIDYFKGSHLILINKSATQRDSQADLVINDKIGEVFGQL >gi|222441922|gb|ACEP01000020.1| GENE 6 6407 - 7309 1255 300 aa, chain - ## HITS:1 COG:SA1664 KEGG:ns NR:ns ## COG: SA1664 COG4399 # Protein_GI_number: 15927420 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: 
Staphylococcus aureus N315 # 9 297 12 370 374 99 24.0 5e-21 MWKYIVTPIIGAIIGYVTNWIAVKMLFRPRKEVRVFGKRLPFTPGVIPRGQARLAKAVGN VVETQLLTPEYMGEKLLSEESEKEFKSHIQAWVEEQKRSEDTLHSAAVKIVEEEKVDDFA ASVEEDLTDFLSEKVIAMEPGKLIVDKVVQEAQRKLADSMFGMMLGGSFIEKIAGQIQEG IDAYIAENARGYIEKEVVAASEELQAKPIPEVTGFFEEKGIYDPEFLWRLYKRIIEEKLP ALLSSLKLSAVVEERINAMKVEEVEELVLSIMSKELGAIVNLGAVIGLILGLVNVLIFMI >gi|222441922|gb|ACEP01000020.1| GENE 7 7397 - 9847 1667 816 aa, chain - ## HITS:1 COG:CAC1672 KEGG:ns NR:ns ## COG: CAC1672 COG1199 # Protein_GI_number: 15894949 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Clostridium acetobutylicum # 9 803 8 788 791 558 39.0 1e-158 MLSYQKGVVKTSVRHLVEFLFRGGDITTGSSVSASPEAMLEGSRLHRKIQRGQKATYQSE VPLKMEWQEERYDLVLEGRADGIDKTEVNKDRLSDVDGMNQIESNEEKLTYVDEIKCVYR DVEEIEEADILHLAQAKCYAYMYGAKQDLEKIGVQITYCHLETEQIKRIFYCYTMGELSL WFAEVLDAYRMWAAYYVEAREKRNASIQALQFPFSFRPGQKKMTAIAYQAMENKKHIFLQ APTGVGKTISTLYPALKELGAEKAENIFYLTAKTITRTVAEDTIKLLKKQGLFLSAVTIT AKEKICIQEEMQCNPESCERAKGHFDRINEALYALLTAECEISRENILLFAEKYKVCPYE LSFEAAVWADCIICDYNYVFDPHVNRKSLIEGSLRQNIYLIDEAHNLLDRAREMYSADIA KSDFKVPKKYFKDRNRFLFKKLGNCVMALRKLEKQAQDGTRFRLHENVDAMYFPIFHLIG PLEEYLADHDNFSEREEIVEFYFKLTHFYMMLDSMDSGYEIYSEKRGRDFLLRLFCVNPS DKLEEYIENSRSSIFFSATLLPVQYYRQLLGGNRFEAYQIASPFAKENRLLAITEDVTSR YTRRGEWEYGKMIRYLELCLEKKAGNYMIFFPSYEMLTSVYGMAEQSNLALVSEIIVQEP SMEERSREAFLEKFREQSGKPLIGFCVLGSIFSEGIDLTGNSLIGVLIVGTGLPQICNER EVIRAYFDRQQKKGYDYAYRYPGMNKVLQAAGRVIRTAEDVGTILLMDERFLERDNQSLL PEEWESYYAVNRNNCGQVLENFWNGLDNDKVEEAES >gi|222441922|gb|ACEP01000020.1| GENE 8 10438 - 12093 2028 551 aa, chain - ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 7 547 4 532 538 291 33.0 2e-78 MSKKNNVYVIGHKNPDTDSICSAISYAYLKNELSQKQEEGAKYIPTRAGHINEETQYVLK 
YFEQEAPLYVNDIRPQVTDIEIRKTAGIEAGLSLKKAWDIMRGARIVSLPILEDEKLAGI ITVSDIAYSDMDVYDNMILSKAGTSYKNIVETIDGEMIVGDLDGEFEEGHVLIAAANPDM MEGYIEPQDMVILGNRYESQLCAIEMDAQCIVVCMGATVSKTIRKMAKEHGCRIIVSPYD TYTVARLINHSMPVRYFMTTDNLITFHTTDYVDDIQEIMAKRRFRDFPITDNQGNYVGMI SRRNLLDLERKKLILVDHNEKAQAVDGFEEAEILEIIDHHRLGNIETNSPVYFRNQPVGC TATIITQMYMENDVTIPEKIAGLLCSAILSDTLMFRSPTCTALDHLVAEKLASIAHIDIE QYARDMFDAGSNLREKSAKEIFYQDFKKFNDSNINFGVGQISSMNKEELSACKGKLLPYM EEVRTSYGLDMVFFMMTDIINESTELLMVGEMCKEVIENAFETTVEDSSAELPGVVSRKK QLIPAILGQLS >gi|222441922|gb|ACEP01000020.1| GENE 9 12344 - 13438 815 364 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026069|ref|ZP_03715261.1| ## NR: gi|225026069|ref|ZP_03715261.1| hypothetical protein EUBHAL_00308 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00308 [Eubacterium hallii DSM 3353] # 1 364 1 364 364 687 100.0 0 MQDRYVADEARVSISSLEAETIMKSTPVQTVKSILGRIGIEYEDGLPKEAYISALKEEFC SEPKWVLLMLPKVMLDFLTEVWENSVIAMTEERWNYIEYLKIFGFLAYQMGNPVTDEPNR LIVVEEMKDNFYFLLKSRKNRKLLSVYDEWEKIITGFMYYYGFIETRTLYSLFLKVSKKV ISYEDFLLFIKCRTSLWPFGAILKDTFEKNEYFQYLNVENPDMLLEYVRQHENLPYKPIK LEDLMYVSDAAGIDNRWRGVSELGNLFIDKMGLNYYRATVLVRTLLIMIKNGSSYEKLQE KVSILSFENAQIREEVQQAVRQIFENVPVFEYKGYSRAEYKRLSYQKQLKKRKNLFTIID GGKE >gi|222441922|gb|ACEP01000020.1| GENE 10 13548 - 14051 578 167 aa, chain - ## HITS:1 COG:CAC3537 KEGG:ns NR:ns ## COG: CAC3537 COG0653 # Protein_GI_number: 15896773 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 167 1 166 166 151 44.0 4e-37 MSLLTDWREFAYSHDDRTREGQEFWIGYFNQEKAVYEQILATPEEPVSGTVTELAQKYNM ELTYFVGFLDGINDSLKNPNPIEEMEADTVVSLDYDKEKLYYNMVAAKADWLYGLEQWDA LLTPERRKELYKEQRSSTTVVKPPKIGRNDPCPCGSGKKYKKCCGRN >gi|222441922|gb|ACEP01000020.1| GENE 11 14463 - 14960 833 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026071|ref|ZP_03715263.1| ## NR: gi|225026071|ref|ZP_03715263.1| hypothetical protein EUBHAL_00310 [Eubacterium hallii DSM 3353] 
hypothetical protein EUBHAL_00310 [Eubacterium hallii DSM 3353] # 29 165 29 165 165 137 100.0 3e-31 MARTTARTAKKKAAAKAATAAKATTKAAEAVEEEVKDAAVKEVEPKAEKAVEADIKEVAA AAKETVKDTGAKVVDKAKEVTKKVEKTVKEKVAEVKAAAADPEVYVQYQGNESNLAVITE RIKKQYVEEGHKESDIKSLKIYLKPEDASAYYVINDNNSGKVFLF Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:42:46 2011 Seq name: gi|222441921|gb|ACEP01000021.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont21.1, whole genome shotgun sequence Length of sequence - 14266 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 9, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 120 134 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family - Prom 342 - 401 6.8 - Term 479 - 520 8.1 2 2 Tu 1 . - CDS 678 - 824 118 ## gi|225026073|ref|ZP_03715265.1| hypothetical protein EUBHAL_00314 - Prom 845 - 904 6.0 - Term 887 - 931 -0.6 3 3 Op 1 . - CDS 963 - 1733 649 ## COG0428 Predicted divalent heavy-metal cations transporter 4 3 Op 2 . - CDS 1741 - 2160 375 ## gi|225026075|ref|ZP_03715267.1| hypothetical protein EUBHAL_00316 - Prom 2259 - 2318 7.1 - Term 2279 - 2342 13.0 5 4 Tu 1 . - CDS 2379 - 4115 1951 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Prom 4180 - 4239 5.3 - Term 4206 - 4250 9.0 6 5 Op 1 . - CDS 4256 - 5773 925 ## gi|225026077|ref|ZP_03715269.1| hypothetical protein EUBHAL_00318 - Prom 5799 - 5858 2.3 - Term 5791 - 5832 8.3 7 5 Op 2 . - CDS 5876 - 10630 5670 ## COG5492 Bacterial surface proteins containing Ig-like domains - Prom 10834 - 10893 6.6 - Term 10677 - 10736 -0.8 8 6 Tu 1 . - CDS 10902 - 11375 556 ## COG0691 tmRNA-binding protein - Prom 11458 - 11517 3.0 9 7 Op 1 . - CDS 11558 - 12007 521 ## gi|225026080|ref|ZP_03715272.1| hypothetical protein EUBHAL_00321 10 7 Op 2 . - CDS 11967 - 12158 261 ## gi|225026081|ref|ZP_03715273.1| hypothetical protein EUBHAL_00322 - Prom 12363 - 12422 9.5 - Term 12301 - 12329 -0.9 11 8 Tu 1 . 
- CDS 12467 - 13165 435 ## gi|225026083|ref|ZP_03715275.1| hypothetical protein EUBHAL_00324 - Prom 13209 - 13268 10.5 - Term 13316 - 13349 2.2 12 9 Tu 1 . - CDS 13375 - 13731 464 ## gi|225026084|ref|ZP_03715276.1| hypothetical protein EUBHAL_00325 - Prom 13953 - 14012 6.4 Predicted protein(s) >gi|222441921|gb|ACEP01000021.1| GENE 1 3 - 120 134 39 aa, chain - ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 1 39 1 39 273 60 71.0 5e-10 MLKKCFDNVREKHPLVHNITNYVTVNDVANILLTCGGSP >gi|222441921|gb|ACEP01000021.1| GENE 2 678 - 824 118 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026073|ref|ZP_03715265.1| ## NR: gi|225026073|ref|ZP_03715265.1| hypothetical protein EUBHAL_00314 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00314 [Eubacterium hallii DSM 3353] # 1 48 1 48 48 72 100.0 9e-12 MIYVVVEELIPEMSVGKHSNIGTIFFSLGFTVMLTLDVVLGTSALPSE >gi|222441921|gb|ACEP01000021.1| GENE 3 963 - 1733 649 256 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 22 256 25 269 269 178 46.0 7e-45 MVSEGFYDYRDNKGHQCSLLGTTLGAFCVFFLKNELSKGMQKALMGFAAGVMVAASIWSL LLPALEQVDSIGVWKFLPAASGFWIGILFLMAIDHFAPSECIDGKCKNKLLLAVTIHNIP EGMAVGVIYAGLLSGAEHITVMGAFSLALGIVIQNFPEGAIISLPLRAEGISQKRAFIYG VLSGAVEPVAAILTVWAASLIVPLLPYFLSFAAGAMFYVVVEELVPKMSVGKSVNVGFFM FSVGFTLMMILDVALA >gi|222441921|gb|ACEP01000021.1| GENE 4 1741 - 2160 375 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026075|ref|ZP_03715267.1| ## NR: gi|225026075|ref|ZP_03715267.1| hypothetical protein EUBHAL_00316 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00316 [Eubacterium hallii DSM 3353] # 21 139 1 119 119 112 100.0 1e-23 
MWKKSKRILLCILALMLLLSMAKSEQLFVQADMESTEKVQTKDVKNRSQRQEENSDEKGQ EESTDKKEEASTKATQKTEETIKEDKKNLEATTQKEKDASKNDEAKEETEEKTEEYKREE ENTERKTDENKAAEEVSAE >gi|222441921|gb|ACEP01000021.1| GENE 5 2379 - 4115 1951 578 aa, chain - ## HITS:1 COG:BH3104 KEGG:ns NR:ns ## COG: BH3104 COG0318 # Protein_GI_number: 15615666 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Bacillus halodurans # 14 481 2 462 566 100 24.0 8e-21 MEKNHKVELIGRPSIDRPWMKYYPESISRLQVPELSLNEYLKANCPGMDMVAMHYYGRDV SWERFFVKVDMVAKSMSAMGIKEGDQIPVFYRSVPEFIVLLLAAEKIGASLVCRDNTLQE NVDAVKEAGAKIIFVHDFLPKAEVEEYKKQAGVETIISLSPYHAAEKDKMPEHIIHEIDS FYTDGILSGDGVMTWEEFLEKGKNYTGKVEVEKDVNRPLFRAYTSGSTGTSKQVIHSAYT MIGIIAQMSFYGSAGDFRPTWLLTNLPPCLIAVVVSMILVPLCSNKLLILDPFCNVNDLD LEMMRYRPNCWPLIPMFIELLMRSDRIPEDYDMSHLFAAGVGCEAFNNGQIRRAQEFLKA HNCKAVLSIGYGQSEAGSNCTLPCLEHPIGNGNSGIPMPLTVMSIFEPGTHKEVGYNKLG EICKTGLGNMLGYDREEATEKALQKHEDGNIWLHTGDIGYMTEDGIVFALTRGDVRRVGG GRLIDLNMENKIIDAEIDGIIDAFFVVDEDFENKGYFLPYLYVVLEDGYTIDDIREDVAD ALEDFEYPAQIIQLPERPFFHYKTNRVGLKAQLRATTN >gi|222441921|gb|ACEP01000021.1| GENE 6 4256 - 5773 925 505 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026077|ref|ZP_03715269.1| ## NR: gi|225026077|ref|ZP_03715269.1| hypothetical protein EUBHAL_00318 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00318 [Eubacterium hallii DSM 3353] # 13 505 1 493 493 946 100.0 0 MKLCIGLTCGGKMEKKNRFSILLERLMGTVNLKNATLAAELQYDVSYISKWINGKTLPSE KSVAQILHVISDYIVESATSAALKTLYIEYQVDVPEDLKQAIYDNLEAEYFYVKGLTKTI GADIAPETQNYAKLTLYQFLKKMSHPALRQVKSLNVIAAIDLLNMEHDVRLQIAQIDTAS QILQRDFPEVHFSLLINFDIGEKDYIYDCIFIINMLTNYTHINFKIYNCIHSYDKIVFCA KDAYCITSMLLGGNLCYQVTISEDKKMCKTVYQRLATLCTREELMFRKCTMRNLLQGSDY ISQMLSENLCWLLGHMTEHFLPEELHKELLEYLDISDAEREKMQNIHLLVEEVLKKSHVR ILLYETALSQFAISGELDFYNNKILLDTRQREQCLSYIQNIIHEENAMELRIIHGHFVMD FQFNANPSLFLSNNLCYLRLDSKNIQNNVTLLNINIAKKMFRCFFEEIWSHSEEDVVISD WRKIESILKHMKQSVALLKKVDKEI 
>gi|222441921|gb|ACEP01000021.1| GENE 7 5876 - 10630 5670 1584 aa, chain - ## HITS:1 COG:CAC3275 KEGG:ns NR:ns ## COG: CAC3275 COG5492 # Protein_GI_number: 15896520 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 1327 1581 607 871 1043 74 30.0 2e-12 MKKQVKNLFALFLTASMAVSLVPADAFAAVHDAVKDATTTTAYEKVEAEKKAAQPKAGSA EAAIQEALYTNTDEVDLSKYNLTERKAEKLTDKSLGDDKDTDLVDVTYETDANGKVTTMS VEKDDTYAYALEEIEELAADDADEDDGSGKTKTEVGQAYKDLMAFYESPDNQEYLGIATP YFTSKDSKGGPVSSLLSLMQQKIYKEGDPEEENPANAKVTYKDMYDVIQGYQTTLQYGIQ LFGQQLLAARDEALAQIDDSMTREQKLLVLNDWLGKYCTFDMGAIQEEQDKKNDTETTEE AIQAMGETAAVALADDGQQQPTEEQIAYLRNMLLNGDLKTAFESTAFGALVRRNTICAGY TSAYTYLVQCAFPEIYKEADGKTWKKASEVNGTDSSQDKKDDNSSSENKGDTSETDKEES SSAAETETQAADGEDSSGDASSEKKDETTTETKPTYIVDFVKIFWRSDVQMLGQSQTFRN SHYFNAVKTDNSVDNWYYVDSCYNDIYVECMGRNRVETDGNMTHSYFLISHTSLAKQFDG NFDEIDTLYTDKATDTKYEDAWFTDAQGPISFDKDNWYYVQNTTSYSSGMGTSDQKPDQL VSLPRKAAATATSDSATVLVDYEKGTGAVKEGGDLLKESATKDEDVNDKIYPGLTHTSAY YGGALYLNADNQILKYDLSTNAITKVKEYNEVTAKSVADNMGTGINNNFTGMTFKVTTKD DKDAIHTVENHPIAGLSIDSQGNLNVDIATNYCYVDKYQTEQTNYNSGYMNYKFQGTTIK RGGSNDNQEFMWSADFVETLDMKHVAGDSHEYETVTVDPTCDNAGFTEERCKTCGVVKAD SKKESEAEDAKATGHHYIQYKDETYTKESDAEDAKKIVVDAIVCTKCLKAIPRKSSGNSG SIMGGSNDNSDELDIPDGLKTGHEYTGKITSLSDDKKTGNVTVSCDSCVGKELDFLDEKD VTLETKKDVEVTSKKEAEDCEKGGKIIYTAKVTVGDNEYEATTTEEAAAEKHVYKPEFTW ADDNTCTVTYTCENCKIPATEAEKCEVTSEVTEPTCTEAGKTVYTATATDKNNINHTDTK EVPIEATGHTYSADPTYEWSEDYKTCKAIFTCDKKDDTQTVECTVTPATTDATCTEAGKV VYTATCTFKGKEYTCPEKKETTIAATGHKYGEPTFNWSEDGKECTATFTCENDKTHTKDV KCTVTKDAEKSTDATCTEAGKAVYVATVKEDPNGSDEYTDTKEVKTTNALGHDYGEPTFG EWTEKDGKMTCEATFTCKNDASHVETVSCEVTSEGKDATYTEAGSVTYTATATVPEGTKT YINPVKKEVKVAQKDSKAKFAKTSYNLYSTQSGKVSLVSDYKQDGIVSIKSSNPSKVSVN TAGVIKAGVVKGKAVKVTITAVVKSGKTIKTVVTVAPTKITLNATSVPLQLKKSTSVIKV AKATLPGDQIASWSTSNKKVVIVSKSGKITAKKVGTAVVRITMKSGATAKCKVKVQKAAV KLTKIAVNSKKVNLNLKKGPKTYKLTATKAPITVVNKVTFTTSNKKVAVVSKSGKITAKK AGKAVITVKCAKKVQKVTVVVKAK >gi|222441921|gb|ACEP01000021.1| GENE 8 10902 - 
11375 556 157 aa, chain - ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 1 152 1 150 156 164 57.0 5e-41 MAKSEGIKLIANNKKAYHDYFIDEKYEAGIALHGTEVKSLRMGRCSIKESFVTISNKGEI LINHMHISPYEKGNIFNKDPLRVRKLLLHKSEINKLAGQIKMKGYTLMPLKVYFKGSLVK VEIGLARGKKLYDKRQDIAKKDAKREVERDFKIRNLG >gi|222441921|gb|ACEP01000021.1| GENE 9 11558 - 12007 521 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026080|ref|ZP_03715272.1| ## NR: gi|225026080|ref|ZP_03715272.1| hypothetical protein EUBHAL_00321 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00321 [Eubacterium hallii DSM 3353] # 5 149 1 145 145 247 100.0 2e-64 MYMIMISVISLLDDFYVFDEESYKLTGEHTGRIFTLGQKAEIVVHSANKMERTIDFILKE FSQYSDYPDPDDYYGDGMPSRKNEDAEAGKFENFESADGEYPVEQEEFEFDETGESLPDK DGILLNDWIDDEEADRLLHLYNKKKIDEI >gi|222441921|gb|ACEP01000021.1| GENE 10 11967 - 12158 261 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026081|ref|ZP_03715273.1| ## NR: gi|225026081|ref|ZP_03715273.1| hypothetical protein EUBHAL_00322 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00322 [Eubacterium hallii DSM 3353] # 1 63 1 63 63 107 100.0 2e-22 MSAREKDLKELLEYYEQNLCHKIFKYELNNKINIEVIFYIEGLCHLLGIQHVYDNDKRYQ PLR >gi|222441921|gb|ACEP01000021.1| GENE 11 12467 - 13165 435 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026083|ref|ZP_03715275.1| ## NR: gi|225026083|ref|ZP_03715275.1| hypothetical protein EUBHAL_00324 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00324 [Eubacterium hallii DSM 3353] # 1 232 14 245 245 379 100.0 1e-103 MKKEIETKICSCCGKELKLNKFIKNRRLCKVGFRKEYKTSYQSVCKDCRNEKRQNTLKEK GMSLYDNGKLVQIRREYKKINLNRFLKQTEAKLEYLDRHERFVKLLYYDNTWISNYGRPV IKKEDGTYEFVKGKRNKITQEIEYSLRKNIYIKTKKAWGYKQERVSASSLFMMIIGEEHD LGVFDTKEECYEAFEKRKKEFIIKLAEKTKKKDKIPECQYQAMINWEVKAKE >gi|222441921|gb|ACEP01000021.1| GENE 12 13375 - 13731 464 118 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|225026084|ref|ZP_03715276.1| ## NR: gi|225026084|ref|ZP_03715276.1| hypothetical protein EUBHAL_00325 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00325 [Eubacterium hallii DSM 3353] # 1 118 1 118 118 219 100.0 8e-56 METISRQEFIDMVAAKSNNRYPKIDLYYIIRDAFWCLEDILRDGNRFEVKGILGIYPEYI EERQYNNFNKGKATVPPHYKPVIKPYSRLKRACEELTEERLGEGLNEAEEDFDYSEEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:43:54 2011 Seq name: gi|222441920|gb|ACEP01000022.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont22.1, whole genome shotgun sequence Length of sequence - 13191 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 11, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 72 - 824 812 ## COG0708 Exonuclease III + Term 852 - 904 15.1 + Prom 878 - 937 5.8 2 2 Tu 1 . + CDS 977 - 1498 454 ## COG1335 Amidases related to nicotinamidase 3 3 Op 1 . - CDS 1596 - 2660 996 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase 4 3 Op 2 . - CDS 2715 - 2966 393 ## CbC4_0918 hypothetical protein - Prom 3007 - 3066 6.6 5 4 Tu 1 . + CDS 3316 - 4020 1006 ## COG0775 Nucleoside phosphorylase + Term 4056 - 4100 7.5 6 5 Tu 1 . + CDS 4425 - 5096 466 ## Cphy_0418 hypothetical protein + Term 5224 - 5266 0.6 - Term 5101 - 5138 2.6 7 6 Tu 1 . - CDS 5306 - 5959 515 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 6201 - 6260 7.5 - Term 6199 - 6251 2.7 8 7 Tu 1 . - CDS 6284 - 6856 731 ## COG1971 Predicted membrane protein - Prom 6929 - 6988 9.2 - Term 6970 - 7021 3.5 9 8 Op 1 . - CDS 7220 - 7486 354 ## EUBELI_00058 hypothetical protein - Prom 7512 - 7571 5.2 10 8 Op 2 . - CDS 7573 - 7797 67 ## - Prom 7900 - 7959 8.8 + Prom 7899 - 7958 7.3 11 9 Tu 1 . + CDS 8021 - 9103 1266 ## COG1932 Phosphoserine aminotransferase + Term 9106 - 9154 -0.9 12 10 Tu 1 . - CDS 9120 - 9683 383 ## GY4MC1_3829 sporulation protein YyaC + Prom 10075 - 10134 4.8 13 11 Op 1 . 
+ CDS 10202 - 11419 1109 ## Cphy_0051 peptidoglycan-binding LysM + Prom 11444 - 11503 5.3 14 11 Op 2 . + CDS 11532 - 12743 926 ## EUBELI_01971 hypothetical protein + Term 12746 - 12815 28.1 Predicted protein(s) >gi|222441920|gb|ACEP01000022.1| GENE 1 72 - 824 812 250 aa, chain + ## HITS:1 COG:lin1894 KEGG:ns NR:ns ## COG: lin1894 COG0708 # Protein_GI_number: 16800960 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Listeria innocua # 1 249 1 249 251 363 69.0 1e-100 MKLISWNVNGLRACVKKGFEDFFKEADADIFCVQETKLQEGQIDFAPEGYECYWNYAEKK GYSGTAVFTKKHPLKVWNGIGMEEHDQEGRVITLEFEDFFFVTVYTPNSQSELKRLDYRM KWEDDFREYLQELDKEKPVIMTGDLNVAHEEIDLKNPKTNKKNAGFTQEERNKFTELLGA GFVDSFRYLNPELAGAYTWWSYRFKAREKDAGWRIDYFVVSERWKEKIEDAIIYKTVMGS DHCPIGLQMK >gi|222441920|gb|ACEP01000022.1| GENE 2 977 - 1498 454 173 aa, chain + ## HITS:1 COG:L67226 KEGG:ns NR:ns ## COG: L67226 COG1335 # Protein_GI_number: 15672251 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Lactococcus lactis # 2 167 12 171 171 184 58.0 8e-47 MQKDFVDGSLGTKEAQAIVPAVKEKIENFNGDIIFTKDTHGEDYMNTQEGQNLPIPHCIK GTPGHEIINELQPYVKKALAVFEKETFGSRKLANFLKELHKEKEIESIELIGLCTDICVV SNALLIKAFLPEVPILADSNCCAGVTPEKHESALETMRSCQILVCEAAKQTNS >gi|222441920|gb|ACEP01000022.1| GENE 3 1596 - 2660 996 354 aa, chain - ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 3 354 5 356 359 418 57.0 1e-117 MKRIVLTGGGTAGHVTPNMALVPSLKEAGYDIQYIGSYNGIEKRLIEEMGIPYHGISSGK LRRYFDPKNFSDPFKVMKGYLEASHMIRKLKPDIVFSKGGFVSVPVVLAAKRRRVPVIIH ESDLTPGLANKICIPAATKVCCNFPETLKHLPENKAVLTGSPIRQELFSGKKEEGLRLCG FDDSKPVLLVMGGSLGAVAINNAIRENLDELLKQFQIIHLCGRGHYDTSLDGKNGYKQFE YAKKELTHLFAATDLIISRAGANAICELLALKKPNILIPLPASQSRGDQLLNAASFEKSG 
YSYVLQEEELTSNTLLKAVQYVYDEREEYIQTLKESKLNRAIPIIMDLITKYSK >gi|222441920|gb|ACEP01000022.1| GENE 4 2715 - 2966 393 83 aa, chain - ## HITS:1 COG:no KEGG:CbC4_0918 NR:ns ## KEGG: CbC4_0918 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 1 82 1 82 84 99 54.0 6e-20 MYSYKTKGVCSREIQFDVENDIIKEVRFIGGCMGNTQGVAALAKGRNIDEVINTIKGIDC GGRGTSCPDQLALALIAYKEQQA >gi|222441920|gb|ACEP01000022.1| GENE 5 3316 - 4020 1006 234 aa, chain + ## HITS:1 COG:CAC2117 KEGG:ns NR:ns ## COG: CAC2117 COG0775 # Protein_GI_number: 15895386 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside phosphorylase # Organism: Clostridium acetobutylicum # 4 228 3 226 230 194 46.0 1e-49 MVCIGVIGAMEEEVASLINQMEDAESKTMAGMTFNKGKLWNQDAVVVQSGIGKVNMAICT QILVNIYGVDMLINTGVAGGLYKDINVGDIVISSDALQHDFDVTGLGYKKSVIPGMETSV FTADTELVEMAKEACEIVNPEIQCFVGRVVTGDQFISDNGTKAALVKDYDGYCAEMEGAS MAQVATLNKIPFVIIRAISDKADNSAPVAYETFEEQAIVHTVKLLAAMFLKMSK >gi|222441920|gb|ACEP01000022.1| GENE 6 4425 - 5096 466 223 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0418 NR:ns ## KEGG: Cphy_0418 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 29 220 5 194 555 76 26.0 8e-13 MFDFILGNTVFTKAIIACCFIGILSWSVLTISYRSMIRATSRIGQTKKKWLVSLKKKYED YHEMDIKVNNVETFVDKYFRKKRVCGLPCGFWKTLYRLTIAACAITGAAGALAVSQEGGE LTSVIVTYLTGVIAAFGLIFLDILLRADAKEHRIIMNMNDYLQNVLENSIAGREVDLEDG TSREQRRRLLRYARENTKKKAKDEPVSLEEKKVLEDVLQEFFA >gi|222441920|gb|ACEP01000022.1| GENE 7 5306 - 5959 515 217 aa, chain - ## HITS:1 COG:BS_yfjS KEGG:ns NR:ns ## COG: BS_yfjS COG0726 # Protein_GI_number: 16077865 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 3 214 42 252 263 207 45.0 2e-53 MTPSAQEEVDLSRYDAWYIKTNLKGDEKPVFLTFDCGYENGYTASILDVLKKHKAPGAFF LCRHYIEDQPELVKRMKKEGHIVGNHTSHHVCMPETDSRKVREEITDNANYMKEATGYEM DRFFRPPKGEYSERTLQITKDIGYTTVFWSMAYLDYDVDNQPGSDYVINHFEKYIHPGAI 
PLIHNISKSNAEALDTVLTNLEKEGYTFHSLSDLKKG >gi|222441920|gb|ACEP01000022.1| GENE 8 6284 - 6856 731 190 aa, chain - ## HITS:1 COG:MA3749 KEGG:ns NR:ns ## COG: MA3749 COG1971 # Protein_GI_number: 20092547 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 8 165 3 156 186 96 39.0 2e-20 MTWSLLFFTTNIFLGIGLAMDAFSVSMANGLNEPHMNKGRVCTIAGVFGGFQALMPLIGW FCVHTIAESFHVFEKFIPWIALILLLFIGGKMLIEGIKNDCEEEVSAVSPGALFVQGIAT SIDALSVGFTIAEYNWIMALAASLIIAIVTFFICIGGLIIGKKFGTKLSNKAQILGGIIL IAIGIEIFIS >gi|222441920|gb|ACEP01000022.1| GENE 9 7220 - 7486 354 88 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00058 NR:ns ## KEGG: EUBELI_00058 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 88 3 90 90 92 51.0 7e-18 MRIWCKLFEKNHMLKDTVMENYDMSMSRTAKVYDCLERACYEFDLEKPMWLDKNKREFIS HARTRFYQDNFIEHIDFDYLDFQVIEED >gi|222441920|gb|ACEP01000022.1| GENE 10 7573 - 7797 67 74 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFVTICGRFFLNGGLHSGLGLRQLNKGKFKNAPYFIRKLTNASEFVTLSEIPISTYTLYV VSEYLPLNPFILRF >gi|222441920|gb|ACEP01000022.1| GENE 11 8021 - 9103 1266 360 aa, chain + ## HITS:1 COG:lin2957 KEGG:ns NR:ns ## COG: lin2957 COG1932 # Protein_GI_number: 16802016 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Listeria innocua # 1 360 1 357 363 390 54.0 1e-108 MGRVYNFSAGPAVLPEEVLKEAADEMLDYKGCGMSVMEMSHRSKVFDDIIKEAEQDLRDL MNIPDNYKVLFLQGGASQQFSAIPMNLMKNGVADYIVTGQWAKKAYQEASRYGKAVKIAS SEDKTFSYIPDCSDLPIDEDADYVYICENNTIYGTKYKKLPNTKGKTLVADVSSCFLSEP HDVEKYGVIYGGVQKNVGPAGVVIVIIREDLIRDDVMEGTPTMLKYKTQADKDSLYNTPP CYGIYICGKVFKWLKKQGGLQAMKEYNEKKAKILYDFLDESEMFKGTVVKEDRSLMNVPF VTGDKELDAKFVAEAEANGFVNLKGHRTVGGMRASIYNAMPIEGVEKLVEFMKKFEAENK >gi|222441920|gb|ACEP01000022.1| GENE 12 9120 - 9683 383 187 aa, chain - ## HITS:1 COG:no KEGG:GY4MC1_3829 NR:ns ## KEGG: GY4MC1_3829 # Name: not_defined # Def: sporulation protein YyaC # Organism: 
Geobacillus_Y4.1MC1 # Pathway: not_defined # 44 187 47 191 208 125 42.0 7e-28 MNFNIKKEKRYQYFAPDNASTIPFSIALNNHITYLNKKFEDLVLLCIGTDKITGDCLGPL VGSKLKQRKFPYPLYGTLEKPLHAGNLTEQLPEISLVHPDACTLAIDAAVGTKNHIGLVS LSRQPLSPGKGVARPLCPVGDISITGIINEASVSSEILLPYTSLYLVDKLAEYICKGILN SDLPQAR >gi|222441920|gb|ACEP01000022.1| GENE 13 10202 - 11419 1109 405 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0051 NR:ns ## KEGG: Cphy_0051 # Name: not_defined # Def: peptidoglycan-binding LysM # Organism: C.phytofermentans # Pathway: not_defined # 15 401 18 436 436 156 29.0 2e-36 MAEERQETGKKKGNEKGEVSRNLPRNYRQIGEPGAKKIYVEDYVYTYLNKLAQPENLYAR GAILFGKMYKTTIGKCMFISGAAACQNFELDLDETIFSEENWGEIYRIRDSFFPELEICG WFLSRMGFSVDLNDKIIRIHMDNFSGENKVLYMIDALENEDAFYQFENYSLKKQRGYYIY YETNQEMKNYMLAEGEMGRMPERGRRSETIKRDTAVVKNYKKALRKKKSLRPKSAFANPA YVAGGLVAAMVLIYGVRSAVQYQGNQKMQEVFQQQEMTDAKNVTGAQQGEEETDTVSKIE TTQNSENEVVSDEKAGSSEISEESNSESAVQDTEASETQTTEKTQQTWGSIGGYYTVKRG DTLAGISKKMYQSYNYIEMIAEANGIDNVDEIYPGQILEIPQIED >gi|222441920|gb|ACEP01000022.1| GENE 14 11532 - 12743 926 403 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01971 NR:ns ## KEGG: EUBELI_01971 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 20 393 48 417 422 169 28.0 2e-40 MAKQIYNYELFTNKKDELAKIRKKAARNRRIKILLLALILAGTAIVAYILFNSKCNYYTY KDKVDTEDSGDVSYENFDEGYLKYSSNGIEYQKQFGRAEWNIALSYEHPFLAKSDSYAVL GDKGENTLILLNKNGKVREFTLKYPLLQATVSEQGIIQMILEGSDSNYVQVYGKNGKLIA DMRSSIEEKGYPITAAISPNGTQLAVSYYSINGNTSKTSIAFYDLSRQLQSDSIPLKGGF DYTNTIIPKVSFMDDNTVAAFGSKVTYFYNIEDSPNEKKKLEFDQQIQSVFENEDYVGYI INNSANPEKGKYILQVYNKKGARKLNHIVDMNYDSISMWGKEIIAVRGNECTIFNIKGNI LFQGELEGDSIETIVPAKGWRTYHVVFRDKIVKMQLQFWKNGK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:44:25 2011 Seq name: gi|222441919|gb|ACEP01000023.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont23.1, whole genome shotgun sequence Length of sequence - 3205 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 7.9 1 1 Op 1 . + CDS 165 - 314 193 ## gi|225026102|ref|ZP_03715294.1| hypothetical protein EUBHAL_00343 2 1 Op 2 . + CDS 343 - 516 230 ## gi|225026103|ref|ZP_03715295.1| hypothetical protein EUBHAL_00344 3 1 Op 3 . + CDS 596 - 1369 743 ## bpr_III165 phosphoglycerate mutase family protein 4 2 Tu 1 . - CDS 1533 - 2906 1111 ## COG0534 Na+-driven multidrug efflux pump - Prom 2947 - 3006 9.9 Predicted protein(s) >gi|222441919|gb|ACEP01000023.1| GENE 1 165 - 314 193 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026102|ref|ZP_03715294.1| ## NR: gi|225026102|ref|ZP_03715294.1| hypothetical protein EUBHAL_00343 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00343 [Eubacterium hallii DSM 3353] # 1 49 24 72 72 82 100.0 8e-15 MQKLKEENYPVRILVYIFGLFIMTLGISMSVKSDLGVSPVSSIPYSIPF >gi|222441919|gb|ACEP01000023.1| GENE 2 343 - 516 230 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026103|ref|ZP_03715295.1| ## NR: gi|225026103|ref|ZP_03715295.1| hypothetical protein EUBHAL_00344 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00344 [Eubacterium hallii DSM 3353] # 1 57 1 57 57 77 100.0 4e-13 MVTISLISCLLTLRRLGSVGVGTIIAAVLVGTVLGFVTRLFGVKRDAILYPKGRISK >gi|222441919|gb|ACEP01000023.1| GENE 3 596 - 1369 743 257 aa, chain + ## HITS:1 COG:no KEGG:bpr_III165 NR:ns ## KEGG: bpr_III165 # Name: not_defined # Def: phosphoglycerate mutase family protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 257 1 260 260 280 54.0 4e-74 MRILLIRHGDPDYENDTLTEKGCREAELLAKRALSLHMGKCYVSPLGRAQRTALPSLEAA GCKAETVEWLQEFPAKLDVNKAPELAEAYPDIEKEGEKYRPRIVWDMAPGYLTEHEEYMD KINWRHSLVAKCSDMEDVYDKVTKEFDTLLAKHGYVRENGYYRVEKESTETITLFCHFGL ICVLLSHLWNVSPFTLWNSFALAPTSVTEVVTEERKQGIAYFRGLKIGDISHLYAGNEEP SFAARFCETYSNKEQRH >gi|222441919|gb|ACEP01000023.1| GENE 4 1533 - 2906 1111 457 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven 
multidrug efflux pump # Organism: Fusobacterium nucleatum # 11 440 13 438 457 221 32.0 3e-57 MKRIDFDNGKITTNILQAALPMLVAQILSLLYNIVDRIYIARIPNIGTAALGAVGLCFPI IVIITAFSNLFGSGGAPLFSIERGRGDKKRAGMIMNTSFFMLSVCAIVLMCIGFIFARPI LILFGASENALVYAYPYIMIYLIGTFPSMIATGMNPFINAQGYATTGMISVVIGAIANIV LDPLFIFMLDLGIRGAAIATVLSQCLSAGFVLYFLSKKAEYKVRLLYKEEIRTCGKDAKN IVSLGTAGFVMQLTNSLVSICCNNVLSATGGDVYVSVMTIISSVRQMIETPIWSISEGSS PVISYNYGAKRPKKVIEAWITMSVLALIYSLIAWSVILFAPKFLIGIFSSDKSLMIDTVP AMKLYFSAFIFMLFQYTGQTMFKSLNKKKQAIFFSILRKVIIVVPMTYMFPYVFHMGSNG VFMAEPVSNVIGGSLCFIVMLLTVLPELKQMNSDLSS Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:44:40 2011 Seq name: gi|222441918|gb|ACEP01000024.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont24.1, whole genome shotgun sequence Length of sequence - 2263 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 746 - 805 2.9 1 1 Tu 1 . + CDS 941 - 2227 1714 ## COG0172 Seryl-tRNA synthetase Predicted protein(s) >gi|222441918|gb|ACEP01000024.1| GENE 1 941 - 2227 1714 428 aa, chain + ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 428 6 452 460 327 40.0 4e-89 MLDLKFVRENPEVVKQNIRNKFQDEKLPLVDEVIELDAQRRATQQEADDLRSNRKKISKQ IGALMAQGKKEEADAVKEQVTKNAERLAELEKLEAELGEKVKKIMMIIPNIIDPSVPIGK DDSENVEIQKYGEPFVPDFEVPYHTDIMERFDGIDLDSARKVAGNGFYYLMGDIARLHSA VISYARDFMIDRGFTYCVPPFMIRSNVVTGVMSFAEMDAMMYKIEGEDLYLIGTSEHSMI GKFIDTVLPSDTLPRTLTSYSPCFRKEKGAHGIEERGVYRIHQFEKQEMIVVCEPEDSKM WFDKLWQNTVDLFRSMDIPVRTLECCSGDLADLKVKSIDVEAWSPRQKKYFEVGSCSNLG DAQARRLKIRVTGEKGKYFAHTLNNTVVAPPRMLIAFLENNLNEDGSVRIPEVLRPYMGG KEVLVPVK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:44:46 2011 Seq name: gi|222441917|gb|ACEP01000025.1| Eubacterium hallii DSM 3353 
E_hallii-1.0_Cont25.1, whole genome shotgun sequence Length of sequence - 24728 bp Number of predicted genes - 21, with homology - 21 Number of transcription units - 11, operones - 5 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 14 - 73 6.3 1 1 Op 1 . + CDS 160 - 450 246 ## gi|225026109|ref|ZP_03715301.1| hypothetical protein EUBHAL_00350 2 1 Op 2 . + CDS 462 - 779 406 ## gi|225026110|ref|ZP_03715302.1| hypothetical protein EUBHAL_00351 3 1 Op 3 . + CDS 804 - 1520 390 ## gi|225026111|ref|ZP_03715303.1| hypothetical protein EUBHAL_00352 + Prom 1526 - 1585 9.5 4 2 Tu 1 . + CDS 1629 - 2969 1708 ## COG0427 Acetyl-CoA hydrolase 5 3 Tu 1 . - CDS 3267 - 4295 753 ## ELI_2068 hypothetical protein - Prom 4322 - 4381 12.0 + Prom 4654 - 4713 2.7 6 4 Op 1 . + CDS 4747 - 8304 1518 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 7 4 Op 2 . + CDS 8291 - 8542 246 ## gi|225026115|ref|ZP_03715307.1| hypothetical protein EUBHAL_00356 8 4 Op 3 . + CDS 8559 - 10304 960 ## COG0515 Serine/threonine protein kinase 9 4 Op 4 . + CDS 10288 - 13767 1658 ## EUBREC_0561 hypothetical protein + Prom 13848 - 13907 2.6 10 5 Tu 1 . + CDS 13971 - 15011 787 ## gi|225026118|ref|ZP_03715310.1| hypothetical protein EUBHAL_00359 + Prom 15108 - 15167 6.3 11 6 Op 1 1/0.000 + CDS 15319 - 15477 134 ## COG4481 Uncharacterized protein conserved in bacteria + Prom 15582 - 15641 7.0 12 6 Op 2 24/0.000 + CDS 15716 - 16003 368 ## PROTEIN SUPPORTED gi|160881892|ref|YP_001560860.1| ribosomal protein S6 13 6 Op 3 21/0.000 + CDS 16021 - 16509 665 ## COG0629 Single-stranded DNA-binding protein 14 6 Op 4 . + CDS 16537 - 16797 358 ## PROTEIN SUPPORTED gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 + Term 16821 - 16867 6.4 + Prom 16814 - 16873 4.3 15 7 Op 1 . + CDS 16917 - 17312 426 ## COG1869 ABC-type ribose transport system, auxiliary component + Prom 17321 - 17380 1.6 16 7 Op 2 . 
+ CDS 17517 - 18431 692 ## COG1451 Predicted metal-dependent hydrolase + Term 18564 - 18605 8.3 17 8 Tu 1 . + CDS 18988 - 20556 505 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 + Prom 20652 - 20711 7.5 18 9 Tu 1 . + CDS 20811 - 21620 793 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III + Prom 21796 - 21855 5.7 19 10 Tu 1 . + CDS 21880 - 22656 889 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 20 11 Op 1 . - CDS 22826 - 24037 772 ## COG2073 Cobalamin biosynthesis protein CbiG - Term 24052 - 24095 7.2 21 11 Op 2 . - CDS 24117 - 24608 552 ## COG3760 Uncharacterized conserved protein - Prom 24642 - 24701 6.0 Predicted protein(s) >gi|222441917|gb|ACEP01000025.1| GENE 1 160 - 450 246 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026109|ref|ZP_03715301.1| ## NR: gi|225026109|ref|ZP_03715301.1| hypothetical protein EUBHAL_00350 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00350 [Eubacterium hallii DSM 3353] # 1 96 1 96 96 125 100.0 1e-27 MAQIIASTKEISNKRESLIVLSNSLKNKISLLETLGNSLNSMWEGAAKEKYVKQLKTDIS KMRNLLKTILEFILILEKIIAIYNKAEKKNEATAVS >gi|222441917|gb|ACEP01000025.1| GENE 2 462 - 779 406 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026110|ref|ZP_03715302.1| ## NR: gi|225026110|ref|ZP_03715302.1| hypothetical protein EUBHAL_00351 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00351 [Eubacterium hallii DSM 3353] # 1 105 1 105 105 172 100.0 8e-42 MADLINVTPEKLKATASSFQQAGKDVKKTTSDMLQLVRGISSSIWSGEASSIYLGKFNGL DADIAKMCKMIEEESQHLTTIAQEYQLAEEQNKQVAATLKNNVIA >gi|222441917|gb|ACEP01000025.1| GENE 3 804 - 1520 390 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026111|ref|ZP_03715303.1| ## NR: gi|225026111|ref|ZP_03715303.1| hypothetical protein EUBHAL_00352 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00352 [Eubacterium hallii DSM 3353] # 1 238 1 238 238 442 100.0 1e-123 
MLSVMIFCHDKREGDLIGKICNQCVALIDKEFLRNFIISDEEYHQVNIEGIQPPDLLVVE ITGIEDLCRAKRIRGIFPRGRLLVVSTNQISAESYLIPEVSPDMLLLKPYVYAKACRIIY RSLTWCCKDKCRKKKQEDILKIRTGGEIYCFHYEEIQYLEARDKKIILHVNKYEISFYSS LQKLEKSLPDYFIRCHRSYIVNFMFIQKADLINGEFYLNKKTVIPISQKYKAKLAKML >gi|222441917|gb|ACEP01000025.1| GENE 4 1629 - 2969 1708 446 aa, chain + ## HITS:1 COG:AF1145 KEGG:ns NR:ns ## COG: AF1145 COG0427 # Protein_GI_number: 11498745 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Archaeoglobus fulgidus # 1 438 7 460 482 308 40.0 1e-83 MSFIKEYAQKLVTAEEAVKVVKSHDWVDYGWTTGTPVALDAALAARADELEDVKIRGGIL LREPEIFKVNNVADHFTWNSWHMGGLERKAISKGFAYYSPLKYSELPRYYRENIKHLNVA MFQVAPMDKHGFFNFGPNASHMMAVCEAADVIIVEVNENMPRCLGGFEEGIHVSRVDYIV EGENPAIGELGAGAPPTEVDKAVAQLIVEEIPDGACLQLGIGGMPNTVGSMIAESDLKDL GVHTEMYVDAFVDIAKAGKITGLKKNIDRGRQVYAFGAGTKKMYDYLDDNPECMSAPVDY TNSAKTIAQIDNFISINNAVDIDLYGQVNAESAGIKQISGAGGQLDFVQGAYLSKGGKSF ICCSSTFTSKDGVKHTRIRPTLAEGSTVTDTRPNTHYVVTEFGKVCLKGMSTWERAEALI NIAHPDFRDELIKEAEKMHIWRRSNK >gi|222441917|gb|ACEP01000025.1| GENE 5 3267 - 4295 753 342 aa, chain - ## HITS:1 COG:no KEGG:ELI_2068 NR:ns ## KEGG: ELI_2068 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 6 342 4 336 336 161 32.0 5e-38 MIAKDDIVIRKAILHILDTNRGECILSNTLLDPGPDLHDFIRNHIYKIVSSDDTKNCEFD PEYSPIYSILETWDESDETSFIETSQAIANKLYVAMGEGLDIPAADLLFVTFQAEGIIYL ALLKMNYKESYTHEITEIPDNPVINTDIIKTHSLLPSATSRIPEAVVINLSDYHIKLLEK KYEINGEKAYYLSENFLVCRTSIPPKKKLNILTRVINNISNKYDGADLKTKMDTKSALQK EYVDNKSFDIEEIGNKLFGKSPEKKSEFDEKMEQYDLQYDNFTVTNESTVKKLEKQVMVT DSGIEISIPMETYNKLANFEVQTDVTGKSTIIIRNIDNLVLK >gi|222441917|gb|ACEP01000025.1| GENE 6 4747 - 8304 1518 1185 aa, chain + ## HITS:1 COG:CAC0039_2 KEGG:ns NR:ns ## COG: CAC0039_2 COG1674 # Protein_GI_number: 15893337 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 210 1176 74 1007 1286 557 35.0 1e-158 
MMEKAINRKNSMGISYEYGKWFIEKCGNIEIYINEILLEKKQQLEFGDVICIGTIYLLFF ENYIAVEAGERNTVSDYLEEVILENEIKGQEDVRKKEKREITFHRAPRVMEELKKVSLKV DAPPQVDVQEKKSIFMDIGTIMNLMFPMLGMNLFLIYGMKSEQNQAGIYVYSGLFMAVMS AVCSILWLMISRKYEKREQQKKVNKERLAYRRYLNKKSEYIKVQYERVYKVLQSRYLRAD TYLDSPLLDMYLWNRNLYHKDFLMYRIGIGDVEFPMKIEFPEEVFGDEENILWREAKKIK EHYEILHQIPVLLDMGRYSQIGIITKDTIAGMELVRSIILQIALCNCYTEVKIGCIYNKN KVIQSQQWDFCRWLPHIWDANRQKRFIAGNEVEARRLFYDLLQIFKEREEVSISDKSEKI LPHYILFVAEEQFLEGEMFSKYILDRGKEYGLTVVWLDSMRKKLPNTCKMVLEINGGFTG RYEIERHSQKKEKINFDYTEKNIAEKLIRSISGIKVMEIEEKAGIPEVVDFLGMYDVHTI EELHIKQRWEKNRIFESAKVLIGKKAGDEPFYLDIHERYHGPHGLLAGTTGSGKSEVLQT FILSMAVNFSPEAVCFLLIDYKGEGMSALFSELPHISGKISNLSDGQAYRAMVSIKSENK RRQRIFKECKVNNINDYTRLFNSGSVNEPIPHLLIIIDEFAELKKAEPEFMQELISVAQV GRSLGVHLLLATQKPGGVVDDKIWSNSRFRICLKVQEREDSMDMLHNMDACQITQTGRGY LQVGNNEVYELFQAGWSGALFQQEDTEVAACLVQTDGTIYKRRKNAEKNRKKKITQLQAI KQYIIRFAKEKEYQEGRKLWLEPLAKYIYLNEIHKEMNKDKKLKEKRQRVDMNKNLEVCV GIFDDPENQEQSIFSLNLMESGHIAICGRSASGKSTFFQTFLFSLLKESTAEEVCLYLLD FNGSGMDIYDLMPQVKQVIKEEEEDKVEELFENIKKEMKRRKKKFSGGNFKQYKNKSKRI ENKSNEDKRDVGKEDNVSLNQIENEKEEFPLILIIVDGFVEFCEETYQRYDDSLYLILRE GEKLGIKVMISIESFSGMYISMRIAELFKTKICLYMKDKYAYTEVFDVIQISVFPKAEIP GRGIAYYGERILEFQTALALNVHNDYERREKIKEVLNRGVADGIC >gi|222441917|gb|ACEP01000025.1| GENE 7 8291 - 8542 246 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026115|ref|ZP_03715307.1| ## NR: gi|225026115|ref|ZP_03715307.1| hypothetical protein EUBHAL_00356 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00356 [Eubacterium hallii DSM 3353] # 1 83 1 83 83 147 100.0 2e-34 MVYVDVFVAGVGKWYEFKCEETALIRDIIKEIYDVIRDKESIICEKQIDLELYLACIENK RILPLDFKLEESGVENGNRLILF >gi|222441917|gb|ACEP01000025.1| GENE 8 8559 - 10304 960 581 aa, chain + ## HITS:1 COG:CAC0404_1 KEGG:ns NR:ns ## COG: CAC0404_1 COG0515 # Protein_GI_number: 15893695 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Clostridium 
acetobutylicum # 7 281 24 296 306 251 46.0 2e-66 MLKAGFVLDGKYRILSVIGQGGMSTVYLAVHERLKQKWAVKEISMEYCENYEMISRKLIV EADILKRLDHPGLPKIVDIIEKKDAIWMVMEFIEGKTLKEILNERGRIEEKEILIWGKQL CEVLSYLHSKKPSIIYRDLKPENIILKKTGRLVLIDFGTAREYCYKNTACDAMYLGTKGY AAPEQYGGMGQTDARTDIYCLGVTLYSLLTGYNPEKPPYKIYPEKYWGEHISLEMKSLLL KCIQSEPEKRYQNCRELAYALSQIDYKKQKEKENERRKIIKFLIFMMVGQLSLMFCIGCK KVSFCYKEEAVVRYINIAEKSEDKKEASQYYKLALELIPEEKLIYQSMIKYFIRLNHFQI EDAVILLDLINSVCEEGTVIEIFQKHNALGYAEFSYALGLGYFYDMGNITGKGASEKWFC EAIDTMTSQDENEAFGKQKRKRAGLYAKIANYYNTFLLNGTDQSGERGTGDFMDFYKTLH SLNQFEVTQKSTKSDLAAAYLISREITIEIMNYAEQFLKDSSINKKMLNKELEKIEERTK YFIKEKKQMEELYQFMNEAKQKIEIADSFKDKEDVRDAGGT >gi|222441917|gb|ACEP01000025.1| GENE 9 10288 - 13767 1658 1159 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0561 NR:ns ## KEGG: EUBREC_0561 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 541 1147 664 1250 1727 90 23.0 6e-16 MQEELNTRLLQKISYLEKGAVLGMIVCSCFFLAAILYFVKANVYVILISGLKKKVAHVRN KIKRKRVRRIGDKGKNEITNKAYFKLIVLAFVGVGSIYTYRTYYVEAAVITQLAIDEDTE VMDNTAGDIGEKSPKLYLESKGWIGGTGEFAQYLFSKDNRTFTIGVEENTFHSDLEKESF LFQVSEKESGIDQEGFNKIRQYEQGEFERKASDNPIYQKTLSFETEKNRHKVYSLYLEYI NRWGMPLIGDKGAVENYGNILSGTFKSKKLVIDKKCPEIAGLKLEKADKKKEGIRFAKKS VSETYNTDEIYNTDKKCNTDITYNIDEECYYNTSVKGMIDIREKYLDLDSIHIQAMPLDD RAREAVKENEAESNDGMLDILAWIHTKKGNLHQISFDFAVEGKWKFILDCSDLAGNKGVS NQTGQEGIESTDITIDKSAPELSVDYKGIINVMEAESSPANINKKLKSNGEKITSSGNEL FMKRENSIDICIEDMNLEAENIELKLYRVKYGLNGKIEQNKESWEEITEKIKQEPEKQEL EKGEQEKNSKTVMRYSVTNLDDGHYKLMIHCTDKAGNVMTAEKSSETERCIYNGYYESPL YTVDTKSPLITSVILNQNAVKKIGKRQYFQNAPQITIKIQEENFNKINFSLEGKMFYSNG MVMEKEWERFKSQKKSLQWKSYYEDGIRINETNIQVEAEGNYTLNFGVIDSAACVANQKN LQITYDCHKPEIIYTGVDNESGDLIFKAEENGDDKKKNTLLLFHKYHFFRYFSKRRMNVF IRIKDAISGIERINYAFIPYEEQPIDINKFSQVERTGAEKINDEYFEKEDLSEFSITVSP EKENFKGYLKVYGQDYSGNVSEMVKSKGAISESLQLHKETSNITMKMPRAFFTDKEKNIR YYNNTVPVQAVFEDTYAGIYKTQLYAKTTKKKDSSIKTGKTIMWDGDNLIYRKRQQIELE ADKFSQFDADNPLTIQADLEDNAGHTEKNILNEKVVIDNTKPEIEVVYDQNNQTQYYNFS 
RKATVTVKEKNFSPDLVVWNIQGSNQKYQIGEWKNVQGTYVCEISFEEDGRDYSVGLSVT DKAGNKSEWKDRNYFTIDKTVPQISIQINGIGKQADQVPYFNTERIITFCIQEQNFDKNK VEYNIDAIHGKSRITIKNQ >gi|222441917|gb|ACEP01000025.1| GENE 10 13971 - 15011 787 346 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026118|ref|ZP_03715310.1| ## NR: gi|225026118|ref|ZP_03715310.1| hypothetical protein EUBHAL_00359 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00359 [Eubacterium hallii DSM 3353] # 1 346 56 401 401 602 100.0 1e-170 MPQVICKDQYLDRESVEISLKKIDDRNVLKKEWSYERAESENTVQVQWDNIKKTENSDGI YYLQIKGQDKAGNKIKDDFKVVFRVNQRGADFILSHALKKKINKYYLKEAPKIKLREQCV KQTKSRAVILKDNEERKVIGESSITSSVIADKKSERYGWYEKYYNFAKKDFEREGDYRVT FQADTKEKELRFVVDKTPPVVHIGNLDKQIYEEKEHEFTIRVMDNYAFKELELYMETDRD ILGQKGTKKIIIKPKDLDENYMVRKTLTADKRYQTVRYIARDKAGNVIDSNDNGDTKVCL VTDSKTVKEYQEHKKEYMLIGIISTLGIFIVISGLFIFTRRKRNLK >gi|222441917|gb|ACEP01000025.1| GENE 11 15319 - 15477 134 52 aa, chain + ## HITS:1 COG:CAC3725 KEGG:ns NR:ns ## COG: CAC3725 COG4481 # Protein_GI_number: 15896956 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 44 13 56 65 57 52.0 5e-09 MKKKHPCGANEWKLLRVGMDFRLQCMNCGREVMVPRKLVEKNFRGYIHQDID >gi|222441917|gb|ACEP01000025.1| GENE 12 15716 - 16003 368 95 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881892|ref|YP_001560860.1| ribosomal protein S6 [Clostridium phytofermentans ISDg] # 1 95 1 95 95 146 67 1e-34 MNKYELALVINAKIEDDARTDAIEKIKALIEKFGGEITNVDEWGKKKLAYEIQKMREGYY YFIQFDASAECPAEIERRVRIMEPVMRYLCVKQDA >gi|222441917|gb|ACEP01000025.1| GENE 13 16021 - 16509 665 162 aa, chain + ## HITS:1 COG:CAC3723 KEGG:ns NR:ns ## COG: CAC3723 COG0629 # Protein_GI_number: 15896954 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 1 104 1 103 144 105 49.0 5e-23 MNKVILMGRLTRNPDVRYSQGEKATCVARYTLAVNRRFRREGDQDADFINCVAFGRQGEF AEKYLKQGTKIVISGRIQTGSYTNRDGVKVYTTDVVVEEQDFAESKAAASSYTGGYQQQG 
GYQSAPEPQAAPAPTNRPAPSEAVSDGFMTIPEGIEEELPFI >gi|222441917|gb|ACEP01000025.1| GENE 14 16537 - 16797 358 86 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 [Clostridium bolteae ATCC BAA-613] # 1 86 1 89 89 142 82 2e-33 MAYSKDRGDAPMRRRGGRRRKKVCIFCADKNAVIDYKDVNKLKRYVSERGKILPRRITGN CAKHQRALTVAIKRARHVSLMPYTVE >gi|222441917|gb|ACEP01000025.1| GENE 15 16917 - 17312 426 131 aa, chain + ## HITS:1 COG:HI0501 KEGG:ns NR:ns ## COG: HI0501 COG1869 # Protein_GI_number: 16272445 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type ribose transport system, auxiliary component # Organism: Haemophilus influenzae # 1 131 1 139 139 124 48.0 6e-29 MKKRGILNAQLSYLLAALGHKDLFMIGDAGMPIPEGVEVVDLVLTAGVPTFKQVLDAVLD EVQVEGYYLAHEIKEFNPELEEYIKAGLPEAEVEYMPHEDLKKFSGKCRFAIRTGEFSPY PNVILRAGVVF >gi|222441917|gb|ACEP01000025.1| GENE 16 17517 - 18431 692 304 aa, chain + ## HITS:1 COG:RC0404 KEGG:ns NR:ns ## COG: RC0404 COG1451 # Protein_GI_number: 15892327 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Rickettsia conorii # 180 299 109 223 230 94 35.0 2e-19 MKETKKRAILLNNTKIIFYHPEERGSLVSKIRIHQGSRDLYVNVISVNSRYVDFIIHNDL SIDMKVPVGMSWEMIERYVRLNETMIFEEYEKKKVRNHQMLPITLDLEEGRIVYRGGLYL PFLGKTDVLLRIKYLSDFDEEETKIYMEDKKDQGKHLIIKTDKDSQEFLRYCIMRYYKKC ALIIVKKKVEEFAQKMGLQYNQVMITGQKRQSPLRRPRLSYQNIEIKNQLTVWGSCNRKH NLKFDWKLAMLPMEIIEYIIAHELTHLKVMNHSATFWNEMEKVMPEYKECRSWLEKHGKE YEIF >gi|222441917|gb|ACEP01000025.1| GENE 17 18988 - 20556 505 522 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 180 513 15 333 378 199 35 2e-50 MKLDETKRQKIIHPIPPLYDKDSKILILGSFPSVKSREEAFFYGHKQNRFWKLLAGILSE KKPETVEEKKDFLHRNCIAVWDVIHSCDIIGSSDSSIRNVVPNDLSEILESADIRQIYCN GAKSYEYYRKYQEKETGRKAKKLPSTSPANAAFSIEKLTNEWKEICGPLQVAPAGIGGVL LNWYDYNARILPWRSDPTPYHVWISEIMLQQTRVEAVKKYYDRWMESLPDVKALAEVPDD 
ELMKLWEGLGYYNRARNLKAAAVQIMEEFDGEIPSDYSKLLSLRGIGEYTAGAIASIAFG IPESAVDGNALRIFSRILAEDGEINKTSVKKKITQEVRRVLPEERPGDFNQALMDLGSSI CIPNGEPFCENCPWESICKAHKYGQETDFPVKAKKKQRKIEKKAVFLIEVSDKIILHKRP EKGLLSGLWELPNLDGELSAKELSEQMKKWEIGDYMIEPLGEGKHIFSHVEWQMRGYRIQ MRDISEKLLEKEEWIAVSREDLEEKYAIPSAFECYRKQIYRG >gi|222441917|gb|ACEP01000025.1| GENE 18 20811 - 21620 793 269 aa, chain + ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 269 1 268 268 303 54.0 2e-82 MEKLIVLGTGNASVTKCYNTCSIIQDEKGKYFMIDAGGGNGILTQLEKMNIPVTEIHDIF LTHEHTDHLLGMIWMIRVIGQAIQKGKYEGNCTIYCHDGLVNVLKTICELTLVKKITNLI GDRIILCPVQDGEEKQILDYKVKFFDIRSTKARQFGFVTTLNNGKRLTFCGDEPYQEHCF EYAYQTDYFLHEAFCLDGEKDEFKPYEKHHGTVKDSAELAQELQVKNLLLWHTEDKNIRK RKSLYKKEAKKYYKGNVYVPYDREIIELK >gi|222441917|gb|ACEP01000025.1| GENE 19 21880 - 22656 889 258 aa, chain + ## HITS:1 COG:SPAC3F10.09 KEGG:ns NR:ns ## COG: SPAC3F10.09 COG0106 # Protein_GI_number: 19114853 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase # Organism: Schizosaccharomyces pombe # 3 253 9 251 264 232 47.0 4e-61 MKFRPCIDIHNGKVKQIVGGSLLDKGDYAQDNFVSEKDGDFYAKLYKDAGLERGHIILLN PAGSQYYEEDVRQACLALSAYPGGLQIGGGMTAENAAFFLEQGASHIIVTSYVFKDGKIN YENLEKIVAVTGKEHLVLDLSCRKKGEDYYIVTDRWQKFTDVKLTEDVLSELADYCDEFL VHAVDVEGKAGGIEEDIAALLGNWNGISVTYAGGVSSFEDLRKLKELGRNKVDVTIGSAL DLFGGEMPFSKVLEEINF >gi|222441917|gb|ACEP01000025.1| GENE 20 22826 - 24037 772 403 aa, chain - ## HITS:1 COG:lin1161 KEGG:ns NR:ns ## COG: lin1161 COG2073 # Protein_GI_number: 16800230 # Func_class: H Coenzyme transport and metabolism # Function: Cobalamin biosynthesis protein CbiG # Organism: Listeria innocua # 1 178 1 203 343 125 41.0 2e-28 MNTAYFYLTDEGGKLAHKLATAHPGDIYNKENFKENLRAGFGKYDSLVCIMATGIVVRIL 
APLIVHKTSDPAVVVLDQKGKHAISLLSGHLGGANDLAREMAAISGGEAVITTATDVAGE LSFDTFAKKYDMAIENIGQLKHISGALLAGKKVNVFTNKNAKELYPELAEEQKRGMIDIF SLSDFFKIYIRNKNNTKTEDLNANLRSKNIITSHIETYNIAVTDDKNVKTTPIMSPSTNI DNIESDIPIVVIDEGFSLNECSTIPSSAPILYLRPRTICAGIGCKRNMEQKPIEEALLQT LKEEGLHPLSLKCIATIPLKSDEPGIIGTAANLNVPLKIIPTEKIEKLDISQLGIATSEF VASQTGVLSVSTACSYLASGKGEILRDKAKYKGITIALSKEKS >gi|222441917|gb|ACEP01000025.1| GENE 21 24117 - 24608 552 163 aa, chain - ## HITS:1 COG:mll4433 KEGG:ns NR:ns ## COG: mll4433 COG3760 # Protein_GI_number: 13473736 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Mesorhizobium loti # 5 162 7 163 165 92 32.0 4e-19 MNKQDIYNYLKENNIWHEITEHKAVYNMAELAEVDTPYPEADAKNLFVRDDKKKNFYLIT VRGDKRVNLKEFRKANGTRPLSFASAENLMDIMGLIPGAVTPLGILNDTEKKVHFYLDKH FLEEPGLVGVHPNDNTATVWLKTEDLIRIIEEHEHDVTLVSLD Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:45:55 2011 Seq name: gi|222441916|gb|ACEP01000026.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont26.1, whole genome shotgun sequence Length of sequence - 12406 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 60 - 119 6.4 1 1 Tu 1 . + CDS 139 - 585 511 ## COG1490 D-Tyr-tRNAtyr deacylase + Prom 589 - 648 7.2 2 2 Tu 1 . + CDS 796 - 2010 1761 ## COG0538 Isocitrate dehydrogenases + Prom 2175 - 2234 6.3 3 3 Tu 1 . + CDS 2313 - 3200 988 ## COG4866 Uncharacterized conserved protein + Prom 3485 - 3544 8.4 4 4 Op 1 19/0.000 + CDS 3586 - 4515 869 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 5 4 Op 2 . + CDS 4502 - 4924 486 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit + Term 4937 - 4987 7.7 + Prom 4948 - 5007 7.0 6 5 Op 1 . + CDS 5051 - 7327 2240 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 7 5 Op 2 1/0.000 + CDS 7347 - 7994 896 ## COG0491 Zn-dependent hydrolases, including glyoxylases 8 5 Op 3 . 
+ CDS 8002 - 9471 1221 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 9 5 Op 4 . + CDS 9533 - 10129 552 ## COG2206 HD-GYP domain 10 5 Op 5 . + CDS 10163 - 11956 1906 ## COG0173 Aspartyl-tRNA synthetase Predicted protein(s) >gi|222441916|gb|ACEP01000026.1| GENE 1 139 - 585 511 148 aa, chain + ## HITS:1 COG:CAC2273 KEGG:ns NR:ns ## COG: CAC2273 COG1490 # Protein_GI_number: 15895541 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Clostridium acetobutylicum # 1 148 1 148 149 155 49.0 2e-38 MKAVLQVVTHATVRVDGGITGQIGPGLLVFLGVAEEDEQADLDKIVKKVTELRIFKDDAG KTNLSLQDVNGELLIVSQFTLLADCKKGRRPSFVKAGNPQKAEEMYEEFIRVCKEKVPKV EHGVFGADMKVELLNDGPFTIVLDSKEL >gi|222441916|gb|ACEP01000026.1| GENE 2 796 - 2010 1761 404 aa, chain + ## HITS:1 COG:TM1148 KEGG:ns NR:ns ## COG: TM1148 COG0538 # Protein_GI_number: 15643905 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermotoga maritima # 1 402 1 399 399 498 59.0 1e-140 MDKIKMTTPLVEMDGDEMTRILWKMIKDELLLPFIDLKTEYYDLGLEYRDETNDQVTVDS ALAAKKYGVAVKCATITPNAARVKEYNLKEMYKSPNGTIRAILDGTVFRAPIVVKGIEPT VKTWKKPITIARHAYGDVYKASEMKIPGKGKAEIVYTAEDGTETRELIHEFDGPGIVQGM HNLNGSIDSFARSCFNYALETHQDLWFATKDTISKKYDHTFKDIFQDIYDADYKEKFEEA GIEYFYTLIDDAVARVIRSEGGYIWACKNYDGDVMSDMVATAFGSLAMMTSVLVAPDGTT EYEAAHGTVQRHYYKHLKGEETSTNSVATIFAWSGALRKRGKLDGIDELVTFADKLEKAT IDTIESGTMTKDLALITTLENVNTVNSEDFIKAIRATLENLLNA >gi|222441916|gb|ACEP01000026.1| GENE 3 2313 - 3200 988 295 aa, chain + ## HITS:1 COG:jhp0277 KEGG:ns NR:ns ## COG: jhp0277 COG4866 # Protein_GI_number: 15611347 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 3 287 2 283 290 173 37.0 4e-43 MDFQPVKASDKEIIDKYMKKANSRSCDMSFAAVYLWKDFYLLEYTVCEDMLIFRTTEDGS SYSFPIGDASPEKALLALEAHCKENEEPLKLHCVYRENEAWLEEHMPGKFEIEFDRDSAD YIYECEKLIGLKGKKFHGKKNHVNKFIKTYDWAYEKITDDNIDDCLAMLYKWKEINCEPG 
NIEKHAEACVSENALREREFLGLKGGLIRADGEVVAFAVGEQINEDTLVVHIEKAFSEVP GAYAIINQQFLVHEADGLKYVNREDDVGEPGLRKAKLSYHPEFLVEKGFARLERV >gi|222441916|gb|ACEP01000026.1| GENE 4 3586 - 4515 869 309 aa, chain + ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 2 303 5 306 307 410 66.0 1e-114 MRHLIDPMDLSVEEINHLLDLADDIILHKEKYQEVCKHKKLATLFFEPSTRTRLSFEAAM MELGGNVLGFSSANSSSASKGESVSDTVRVVSGYADIIAMRHPKEGAPYAAARKLSVPII NAGDGGHNHPTQTLTDLMTIRREKGKLDNLTIGMCGDLKFGRTVHSLIKAMSRYENINFI MISPEELRLPDYIIKDVFEKKNIKYKEVRTMEEVMPELDILYMTRVQKERFFNEADYIRL KDSFILDERKLLDAKEDLCIMHPLPRVNEISTKVDDDPRACYFKQALYGKYVRMALIMTL LGVSADENR >gi|222441916|gb|ACEP01000026.1| GENE 5 4502 - 4924 486 140 aa, chain + ## HITS:1 COG:CAC2653 KEGG:ns NR:ns ## COG: CAC2653 COG1781 # Protein_GI_number: 15895911 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: Clostridium acetobutylicum # 1 134 2 135 146 128 51.0 3e-30 MKIDSIQSGIVLDHIKAGKSMQIYKYLGLDDLDCSVAIIKNASSNKMGKKDIIKIADNMD LNLDVLGYIDPDITVCYIRDGKIVEKKHLELPEKIVNIVHCKNPRCITSVEQDLDQVFKL TDRENRVYRCAYCESKKSEY >gi|222441916|gb|ACEP01000026.1| GENE 6 5051 - 7327 2240 758 aa, chain + ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 31 756 7 738 740 708 49.0 0 MEAQKQKLKRTQKEISKPEDFTDPEVLYEKLINTIREYHPSTDLSMVEKAYKLARDAHKD QKRKSGEPYIIHPLCVAIILAELELDKETIVAGILHDVVEDTTATLEDLSREFNDEVALL VDGVTKLGQLSYSHDKMDLQAENLRKMFLAMAKDIRVILIKLADRLHNMRTLQYMRPEKQ KEKARETMDIYAPIAHRLGISKIKTELDDLSLKYLQPEVYKDLEEKLQTNKEGRENFIQS IIDEVSKHIEEAGIRAEIDGRVKHLFSIYKKMRNQNKTLDQIYDIFAVRIKVDTVKDCYA 
ALGVIHEMYKPIPGRFKDYIAMPKQNMYQSLHTTLIGSSGTPFEIQIRTFEMHRTAEYGI AAHWKYKEGGGNINKEEEKLSWLRQILEWQQDMSDNKEFLTMLKTDLDLFTEQVYCFTPQ GDVKTLPAGSTPIDFAYMIHTAVGNKMVGARVNGRQVPIDYKLQNGDRVTIVTSQNSNGP SRDWLSIVKSSQAKTKINQWFKTQFKEENISKGKELLDRYCKAKGLVMSKYMKPEYQKKC MHKYGLKNWDSILAAIGHGGLKEGQVINKLVEEYDKENRKNLTDQDALNEIEEKNKTKAV EKARSKSGITVRGIHDVSVRFSKCCSPVPGDEIIGFVTRGRGISIHRTDCVNILSMPEGD RARLIDAEWEEEAVEKGGELYMTEICLYAHNRTGILLDISKVFMELKVDIKSVSTRTSKQ GLATIVLSFEIGGIDDLNHIIKKLRNIESVIDIERSAG >gi|222441916|gb|ACEP01000026.1| GENE 7 7347 - 7994 896 215 aa, chain + ## HITS:1 COG:CAC2272 KEGG:ns NR:ns ## COG: CAC2272 COG0491 # Protein_GI_number: 15895540 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Clostridium acetobutylicum # 4 213 1 199 199 152 39.0 4e-37 MGQLKIVCAQLGPIDTNTYILMDDEIKEAIIIDPAGSPEELIKFLEDEGYSLDGVLLTHG HHDHIGALEGLREHYDPWKLRVYAAKPEQEVLMVKDYNLSTMFGEGYTTHANMMIDDKQI FEVTSKHKCQCLYTPGHTLGSCCYYFAEEGWLFSGDTLFAGSVGRSDFPTGDAQALLKSL NHVVMRLPDEVIVYPGHGPSTTIGEERETNPFVER >gi|222441916|gb|ACEP01000026.1| GENE 8 8002 - 9471 1221 489 aa, chain + ## HITS:1 COG:CAC2271 KEGG:ns NR:ns ## COG: CAC2271 COG0635 # Protein_GI_number: 15895539 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 97 489 73 471 476 317 42.0 5e-86 MIYLIQNDREYDYDVRAIALAFYERTKIVEVTKEEFEEQEWEKDDLLLTLVYTKKRIQGW LKGKVGIEGTEERAVSEEVVCDYEDHKGSRNAVCRFLYRLFEKYTGRSLPWGMLTGIRPT KIIMKWMEEEKDSAKLEQRFRETYLADPQKANLCRRVAQREKVLLESRPFEKEYSLYIGI PFCPTTCLYCSFTSFPVSRFGDRMRAYLDALYKELAFVAKHHRDKKLTTIYIGGGTPTAL DEECLSKLMNMIHELFPVEESEEFTVESGRPDSITKEKFRILKEAGVTRISINPQTMHQE TLDLIGRAHTVEQTKEAFLLARECGFDNINMDIITGLPGESLSYVHETLDEIFKLRPESL TVHSLAIKRAAHLNIEMEKYQGMVKGSTNEMLRLVDEYASNMGMEAYYMYRQKNIPGNLE NIGYCVPDKECLYNILIMEEKQDIISCGAGASSKYVFEQGRIERTENVKNLDHYINRIDE MIDRKRKYL >gi|222441916|gb|ACEP01000026.1| GENE 9 9533 - 10129 552 198 aa, chain + ## HITS:1 COG:RSc2515 KEGG:ns NR:ns ## 
COG: RSc2515 COG2206 # Protein_GI_number: 17547234 # Func_class: T Signal transduction mechanisms # Function: HD-GYP domain # Organism: Ralstonia solanacearum # 24 181 162 320 402 115 40.0 5e-26 MAFDYDGTHDLETSLQDSFGEAILHGVVVSNLASFVAREMALSEEQCHNLAIAGMLHDIG KLRLRSYVYEEKEAKLNIDELRYVRLHPSLGYAILKEHGYTKEILTAVLYHHENADGSGY PNNLKGEEIPLGARILRVCDAFGALIANRPYRSAFDIETAISIMIEEVKNFDMRVFLTFQ KVTQSEDMKNMLTRLGIS >gi|222441916|gb|ACEP01000026.1| GENE 10 10163 - 11956 1906 597 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 597 1 595 595 703 58.0 0 MAESMKGLKRSHRCTEVSNENIGETVTLMGWVQKRRNLGSLIFVDLRDRSGLIQLYFDEE TIGEEGFKKAASLRAEFVIAVTGTVEKRSGAVNENLETGDIEVKVDSIRILSESETPPFP IDADITVKDELRLKYRFLDLRRPNIQKNLMMRSKVATLTRQFLANEGFLEIETPMLTRST PEGARDYLVPSRVHPGSFYALPQSPQLFKQLLMCSGYDRYFQLARCFRDEDLRADRQPEF TQIDMELSFVDVDDVIDVNERLLAYLFKLCLNEEVSLPIQRMTWQDAMDYYGSDKPDLRF DMKIQDVSELVKDCGFGVFTGAIENGGMVRGLCAKGLGDVSSKKMKHIEKTAKDYGAKGL AYIQLKSDGTVKCPFSKFMSEEELQALIQAMGGETGDLLVFAADKFKVVCDVLGALRLKF AEELNLLDKSEYRFVWITEFPLLEWSEEENRYTAMHHPFTMPMEEDLDMIDTEPGKVRAK AYDIVLNGTEIGGGSVRIHQNDIQEKMFECLGISKEQAQEKFGFLLEAFKYGVPPHAGLA YGLDRLVMLMTKEDTIRDVIAFPKVKDASCLLTDAPNTVDDKQLEELCLEIEIPEKE Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:46:00 2011 Seq name: gi|222441915|gb|ACEP01000027.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont27.1, whole genome shotgun sequence Length of sequence - 18395 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 105 - 368 287 ## bpr_I2619 hypothetical protein - Prom 501 - 560 7.8 + Prom 727 - 786 13.4 2 2 Tu 1 . + CDS 841 - 2661 1241 ## COG1404 Subtilisin-like serine proteases + Term 2720 - 2763 1.0 3 3 Tu 1 . 
- CDS 3176 - 4360 857 ## COG0500 SAM-dependent methyltransferases - Prom 4591 - 4650 9.2 + Prom 4528 - 4587 7.2 4 4 Op 1 . + CDS 4631 - 4780 235 ## gi|225026145|ref|ZP_03715337.1| hypothetical protein EUBHAL_00386 5 4 Op 2 . + CDS 4790 - 5542 667 ## COG0730 Predicted permeases + Term 5581 - 5620 10.0 + Prom 5590 - 5649 9.7 6 5 Op 1 . + CDS 5856 - 6836 697 ## Selsp_1788 hypothetical protein 7 5 Op 2 . + CDS 6864 - 8480 1061 ## Selsp_1787 hypothetical protein + Prom 8515 - 8574 4.9 8 6 Op 1 . + CDS 8623 - 9354 641 ## Selsp_1786 hypothetical protein 9 6 Op 2 . + CDS 9341 - 13810 2978 ## Selsp_1785 hypothetical protein 10 7 Op 1 31/0.000 + CDS 14384 - 14677 486 ## COG0721 Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit 11 7 Op 2 21/0.000 + CDS 14688 - 16163 455 ## PROTEIN SUPPORTED gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 12 7 Op 3 . + CDS 16165 - 17598 1797 ## COG0064 Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) + Term 17651 - 17682 -0.7 + Prom 17625 - 17684 1.9 13 8 Tu 1 . 
+ CDS 17791 - 18354 478 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes Predicted protein(s) >gi|222441915|gb|ACEP01000027.1| GENE 1 105 - 368 287 87 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2619 NR:ns ## KEGG: bpr_I2619 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 76 1 76 76 67 47.0 1e-10 MITKTININSLDKIETFVDIINKFNGRFDLVSGCSIVNAKSIMAIFSLDISRPVYLYIYN EESAEHVTKALAEFEVNALQIQEQLLA >gi|222441915|gb|ACEP01000027.1| GENE 2 841 - 2661 1241 606 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 44 605 564 1098 1118 253 31.0 6e-67 MLQNELKAMLNEEECIGVPADEEYQEYLIDYILDENLFENELSDYCVFPIGYNQSIIYIK SDNPLTGGDRRFRYNMIPKCYGLMASEALEETGVLRVRRSPQLGYRGGGTLVAIIDTGIK LDDSLFLYEDGSSKVVSLWDQSDQRGTRPEGFLYGTEWTREEINNILQSDMDTGSNRKDE KNDSDNISSNSKNNVRKLPNDENGHGTFLAAIAAGREDIDQIFSGVAPDAELVVVKLKQS KKYLREFYSIPDGVWSCQEDDVMLAVRYVINVANKLGKPISICLGIGTNLGGHNGANSLE RYISYLSLLPKISFHLAGGNEGISGHHFHGTIRREEQYQTVDFNVAEGENGFVMELWGDE PNVYTIGILSPGGENIERMQLKMGEFRSVRFFPEDTLLEIRSFPGATIGGSQVIRMNFKN IVSGIWKLFVYGTGNGEKQYDIWLPISNFLKEETVFINPSSEQTVTSPGNAQYALAYAAY DVATGGLYVRASKGYTRDGRIVPDLAAPGVSVGIPGVSRITGAGERAISRERVRSGSSVA AAFGAGIGALMQEWAFVVGYDPFMNGQNMRTYLIQGAVKDGPYEYPNREWGYGKINIYNT LLEQRF >gi|222441915|gb|ACEP01000027.1| GENE 3 3176 - 4360 857 394 aa, chain - ## HITS:1 COG:VNG0503C KEGG:ns NR:ns ## COG: VNG0503C COG0500 # Protein_GI_number: 15789731 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Halobacterium sp. 
NRC-1 # 141 392 13 255 262 104 29.0 3e-22 MKKDGFYSSGEFAKMAHITKKTLRYYDAHNILKPSYLTPYGARFYSDEDFARLQQILLLK YLGFSLDDIREMTINDSDYHFMANALKLQQKLVQDKIEQLQLVEKAIQDTSRAIEEEHSV NWSQMLNLIHLTGMEKSMKNQYQNASNISARIQLHSKYSTNKQSWFPWIFEQCHMKNHSN VLEIGCGDGSLWLENKEKIPEHLHVVLSDISEGMLRDTRRMIHSNIQNARFDFQVFDCER IPYPNDSFDLIIANHVLFYCKHLTDVFQEIARVLKPNGVFICSTYGSEHMKEINELVSEF DDRIILSAERLFERFGKENGADLLAPYFHNITWLPYEDSLYVTEPETLISYVLSCHGNQN QYIIDHYKEFRNYVRKKTEHGFNITKDAGIFITI >gi|222441915|gb|ACEP01000027.1| GENE 4 4631 - 4780 235 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026145|ref|ZP_03715337.1| ## NR: gi|225026145|ref|ZP_03715337.1| hypothetical protein EUBHAL_00386 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00386 [Eubacterium hallii DSM 3353] # 1 49 1 49 49 74 100.0 3e-12 MNSENAFKQLGQKLMNGEITPEEHVKEYNRLLKEMSEKYDSQWRPHEHI >gi|222441915|gb|ACEP01000027.1| GENE 5 4790 - 5542 667 250 aa, chain + ## HITS:1 COG:FN1706 KEGG:ns NR:ns ## COG: FN1706 COG0730 # Protein_GI_number: 19705027 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 3 246 5 248 254 144 37.0 1e-34 MGIQTFLIVCPLVFLAGFVDAIAGGGGLISLPAYMFAGVPVHNAIATNKLSSCTGTVVST WRLIKNKRVDWYFVPGTVVCALAGSVIGANLALIISDEILKTVLVVLLPIVAFCVLRDKN LEVIVPEGMTRRKQYIIAAACSLGIGIYDGFYGPGTGTFLLLAFTKLAKMDLEKSTGNVK AVNLASNISALITFILAGKILWTLGLAASCFSIAGHYTGAGMVMHNGVKIIKPIILVVLV LLLIKVISGM >gi|222441915|gb|ACEP01000027.1| GENE 6 5856 - 6836 697 326 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1788 NR:ns ## KEGG: Selsp_1788 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 326 1 325 325 345 52.0 1e-93 MKKILLEKLLDNCKEQEYQKQYEYIINLLEEKKIKPVKASRLNGKSPALYREYWLLEETK DYSNFIEELRYQIIPDISVDYYLRHLESYEADREWVLQLNKYLKERKETLQFKVSANERS FEIWGREKFLSKGQGRRILKRCGLEMSFFNIYETTEPLAYYSRTRNVPQNLLILENKDTF FSMRRRLLEGNETILGIKIDTLIYGAGKGIFRSFEDFDLCVEPYMKAAGNQIYYFGDLDY EGIGIYENLSADFGEKWNIVPFEAGYRKMLSKAKGVDSIPETKEGQNRNIKDIFFSYFST EQVKRMKEILEKGYYIPQEILNISDF 
>gi|222441915|gb|ACEP01000027.1| GENE 7 6864 - 8480 1061 538 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1787 NR:ns ## KEGG: Selsp_1787 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 535 1 535 543 518 50.0 1e-145 MQYEFLKKFPRRMKNVGLYAVIIQNSSQKLSWKQYGFTKFDEQINLLFEVLLYIMEQSLK EEKCTMDDIATYIDTINVQYLRKDISYEQCHQLGDFIVNTVLSNEGRPMYFGGYDFEKNE YEEMHISYVANKIVYVENEVRRTSYYLTDDGYNLLLSTLEIEDNMKFNIHEIIFRLHLEK QSYDKAVNDIKNVFNLMRIQFQRVQEAMRQIRRNALSYSVDEYEEVLVGNLNTITDTKKK FQEYKTVIQERVKDLEEENINIRKLSKKEQQDLNNLRVIEEYLTRVLDEHQKILNSHFDL KILYTEELERLSQARLIQRFSMRRDLYDKVLKQADTLENMDMFLRPLFNRNPEKIYNLNK AFSYEKSVNAGMEKDTEEEVDFDEEAFRREKEEKLQKKLLVYEKSLQYLLEKASVTGEVS LGQLKDRLDIYPEEKEIFIPNVDVFKEIMVELIRNRTIDIATLKKERREYIQEQPDGFQL NEMILKLVEEQPENNDITSIEVERLENEEAITFSEIKDEENRRKTIRCSNVLIRIFRK >gi|222441915|gb|ACEP01000027.1| GENE 8 8623 - 9354 641 243 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1786 NR:ns ## KEGG: Selsp_1786 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 243 1 239 251 253 60.0 4e-66 MAYTMEDIRKSQEIFYYLLEKHELREEDEQALYKAYTEEEDIQNIVKSQAETASCDIERY GNTIYLIPQEENNFLGYSRVELKKELCKSTATNKDYYLAQFVILTLLTEFFDGDGNSSRA REYIRSGELMNILSERLREGAAYEEEHNEEEMDTAGISFSDMCHAYEALKSDDKGSHAKT TKEGFLYNILLFLQKQGLIEYIERDEMIKTTKKLDSFMDWNLLNQNNYQRVKNILGVIEN EQN >gi|222441915|gb|ACEP01000027.1| GENE 9 9341 - 13810 2978 1489 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1785 NR:ns ## KEGG: Selsp_1785 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 1 1470 1 1446 1470 851 38.0 0 MSKINAVRFINLNYNNNAMKINDECMQFSGKSTLLSLRNGGGKTVLVQMMTAPFVHRGKQ KTKDRPFESYFTTAKPSFILVEWLLDGGAGYVLTGLMVRKNQEISEEKTDALEMMAIISE YKEPCMQDIHHLPVVEQNEKTMKLKSYNSCRKLFEDYKKDKKLSFFCYDMSSPAQSRQYF YKLMEYQINYKEWETIIRKVNVKESGLSELFSDCRTEKELVEKWFLEAVESKLNKEENKV KNFQEILEKYAGKYKNIKEQLKRRDAIQKFKEAAEEIQINAEDFLVKEGEKIEQEKVIAA FIARLNVLYEEAEIERERQEEGRKKLQEELEFLKYEQLSCEFHEKNREKRNHASNREMID 
LEKESLLRKQQKIQKKVHVFLCAKQQEMVNEDKQEWEIRKEKAAISRTKEENLEPERNRI GGQLSGYYEYRLSDNKEKKEAIKKQKLQIRKDISQQKDILNEYREKTKKITESKGSFRSL VRGYDNIEIKYNSNYKENLSRNILGVYEAGMLDIKQEMYDKEQKKSIQENKEQKEKSENT TEEIHRTERAIEEKREKYFQKDSDIKQAEKEKKGYEQELEERKDILKYLELPEEKLFARE EILHKAKIKMQELSSRRRTLEKKEDALQKEYKLLVSGRVMELPDNLKEEFEKLDVPVVYG MEWLKKNGFTEKKNKEIVSQNPFLPYALILTRQELKKLSERNGETYTSFPIPIIERENLE SIKLDRTQSFVKMQDIHFYILFNENLLDEEKMEIMIEQKQKDIADIRETMQICKNEYEDY FHRFDVIKRQAVTKENWDKIQKKLQKLEKEKEDIFQNIQQARDTKQSLKKNFEILQKTLR ELEKKIESQAARQRAFKELRTAYAEYEENNKKLQEYEREEERLENRQHLTEEKISQLEEN YRELSGQENSLFREEESIQNACQKFAAYKEINRNVKAGKLLGVDSTLRTDNNSGVKIIPS EAEVLKLEARYEAVTADISQELKELELEEEKALTRYHKSFGELRELCQKYNLKNSEWQNI IYDKREQLYQEAELEDYDKKIERKANLLNEEDKKIGILNSQLEGILKQIVSECGKGNPLE EEKISQKDLESAKNQTKYQLSELERKIAFSEKAIQKYRENLTALSEYNNFSADEEIHFEQ DFKKMSEKELRDFKGMLIRDYNDIIRCVQKCRETLAQTLNKIARQEAFQDASYKTPLENM LKVCDDAAKVLRQLNITLESYDSLMKQLEVDISLVETEKKNVTELLEDYVQNIHKNLEKI GRNSTIKIREKSIKMLKVILPVWEDNEKLYSLRLSDFVDEITEEGIRLFENNENAQEYIG RKVTSKNLYDTVVGTGNVQIQLYKIEEQREQQISWNQVAKNSGGEGFLSAFVILSSLLDY MRKDDSDIFMDKNEGKVLLMDNPFAQTNAEHLLKPLMNLADKTNTQLICLTGLGGESIYN RFDNIYVLNLIEAHLRNGIQYLRPEHKKGEEVKVETILPTHIEVEEMLF >gi|222441915|gb|ACEP01000027.1| GENE 10 14384 - 14677 486 97 aa, chain + ## HITS:1 COG:lin1868 KEGG:ns NR:ns ## COG: lin1868 COG0721 # Protein_GI_number: 16800934 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase C subunit # Organism: Listeria innocua # 5 97 4 96 97 75 46.0 2e-14 MANIISDETIEYVGILAKLELSDEEKEQAKKDMANMLDYIDTLNELDTSGVEPMSHVFPV NNVFREDVVTNGDDREEILANAPEAKEGAFVVPKTFD >gi|222441915|gb|ACEP01000027.1| GENE 11 14688 - 16163 455 491 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163737840|ref|ZP_02145257.1| 30S ribosomal protein S4 [Phaeobacter gallaeciensis BS107] # 7 475 8 452 468 179 30 9e-45 MGLMDYTAVELGKKIKAGEVSVLEAANAAMDAIDALEDKFNCYVTVQERAAVQAKAEELQ KKIDDGTLTGPLAGVPVAVKDNMCTKGTRTTCSSKILENFEPAYTAEAVLNLEKAGALII 
GKTNMDEFAMGSTTETSHYGVTKNPWNAEHVPGGSSGGSAAAVAAGECSYALGSDTGGSI RQPASYCGIVGMKPTYGTVSRYGLIAYGSSLDQIGPMAKDVTDCATILETIASHDPKDST SVKREDYDFTSALTDDVKGMRIGIPRDYMGEGLDEEVKAAVLDAAKKLEEKGAIVEEFDL SLVQYAIPAYYVIASAEASSNLARFDGVKYGYRTESYEGLHNMYKKTRSEGFGPEVKRRI MLGSFVLSSGYYDAYYLKALRTKALIKQAFDKAFAKYDVILGPAAPTTAPKLGESLSDPI KMYLGDIYTISVNLAGLPGISVPGSLDSKGLPIGIQFIGDCFKEKNIIRAAYAFEQSREF TNWSLSGKEEK >gi|222441915|gb|ACEP01000027.1| GENE 12 16165 - 17598 1797 477 aa, chain + ## HITS:1 COG:CAC2669 KEGG:ns NR:ns ## COG: CAC2669 COG0064 # Protein_GI_number: 15895927 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Asp-tRNAAsn/Glu-tRNAGln amidotransferase B subunit (PET112 homolog) # Organism: Clostridium acetobutylicum # 4 477 2 474 476 501 49.0 1e-141 MSKQYETVIGLEVHVELATRTKIFCGCSTAFGGAPNTHTCPVCTGMPGSLPVLNKQVVEY AAAVGLATNCTITQYCKFDRKNYFYPDNPQNYQISQLYLPICRNGHVEIDVDGRKKNVRI HEIHMEEDAGKLVHDPSTGNSLVDFNRSGVPLIEIVSEPDMRSAEEVIAYLEKLRMIIQY LGASDCKLQEGSMRADVNLSVREVGAEEFGTRTEMKNLNSFSAIARAIEGEMERQIDLIE DGKKVVQETRRWDDDKEYSYPMRSKEDAQDYRYFPEPDLTPIVISDEWLDEIRSRQPEFR DEKQARYKEQFGLPEYDINIITEDKTLTDLFESCIELGAAAKEVSNWIMGDIMRLLKEKE MEASDIHFSPANLVKMIQMIESGAINRKVAKKVFEAIFDEDVDPEVYVEENGLKTVNDEG ALRKVIEEIVANNPKSVEDYKAGKKKAMGFFVGQTMRAMKGKADPAMVNQILKEMLD >gi|222441915|gb|ACEP01000027.1| GENE 13 17791 - 18354 478 187 aa, chain + ## HITS:1 COG:CAC3601 KEGG:ns NR:ns ## COG: CAC3601 COG0494 # Protein_GI_number: 15896835 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 1 186 1 194 202 78 28.0 9e-15 MCEKKVPVLNSMIKNRNTRFLKSYTLNYTNTEGNEKLYEMVSNFDYEKPEEIGQNASGVV IVGFWDDELLLLREFRMGVNQFIYNMPAGHSEDGESVEECARRELREETGLHIKKICKIL PPAYAAPDLSDSSAWVVIAEVEGEFNPQTEADEYIQPLFADKKQLKELLENEKFSARAQL IAYFFTI Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:46:46 2011 Seq name: gi|222441914|gb|ACEP01000028.1| Eubacterium hallii DSM 3353 
E_hallii-1.0_Cont28.1, whole genome shotgun sequence Length of sequence - 6163 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 170 - 520 315 ## gi|225026155|ref|ZP_03715347.1| hypothetical protein EUBHAL_00396 2 1 Op 2 . + CDS 504 - 3380 2488 ## COG0642 Signal transduction histidine kinase + Prom 3428 - 3487 4.2 3 2 Op 1 . + CDS 3558 - 5402 1396 ## EUBELI_01635 multiple sugar transport system substrate-binding protein 4 2 Op 2 . + CDS 5392 - 5529 62 ## 5 2 Op 3 . + CDS 5513 - 6161 430 ## gi|225026160|ref|ZP_03715352.1| hypothetical protein EUBHAL_00401 Predicted protein(s) >gi|222441914|gb|ACEP01000028.1| GENE 1 170 - 520 315 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026155|ref|ZP_03715347.1| ## NR: gi|225026155|ref|ZP_03715347.1| hypothetical protein EUBHAL_00396 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00396 [Eubacterium hallii DSM 3353] # 1 116 1 116 116 206 100.0 5e-52 MRNYEDKIYSLNETVTGQTQAMSQAVQKALENNGGVGIMTGYYDQNLSVLSVSNLLLHST GYTFDSFMEQTKGSLKNFFYGDDMDILDYDRFLQFHGTGEAQILAADGTVNNVRAL >gi|222441914|gb|ACEP01000028.1| GENE 2 504 - 3380 2488 958 aa, chain + ## HITS:1 COG:all4496 KEGG:ns NR:ns ## COG: all4496 COG0642 # Protein_GI_number: 17231988 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 409 682 305 574 575 177 38.0 1e-43 MYGLCKEDATDEAGRQIWVMSVQVNWDHVNLALLNEAIYSGFWYFDCDENSEIVNANWSH EFRKMLGYHDTLDFPNKLESWSDLLHPQDKERVMVQLQAAIKDKTNQIKYQVEYRMRMKD NQYQWFRASAEVIRRLDGSASRIAGIFINIDAEKKEIMQAQKSAAFHRAFTKADLCEYYV NLEANTFDTFKVEPSLMTVFEQSHTWDELIRHFVDSYVVETDKKAVSSFYDRGYIAERLK GLETELALECRITLNGEERWVRNVVIRGEIEDSEYAMIFLRDITETKVESARHLQMAADN AYMEQLIQSIVRLVDRFVVCDLENDRYEFYNLNGQMIYKPLGFYHDFQMQVLEKYKTLEP LEAIDILIAPDNIRKKLKSENDIYKFEYCSLDEKTYKIASYIPLEWKNGKLEKVLLASMD VTQEKKAEIESRQALKEAYRSAENANRAKTEFLSNMSHDIRTPMNAIVGLTAIAGANVES QDRVIECLGKITESSRHLLGLINEVLDMACIESGKMTLAQEDFNLSELVDNLITLTKPVL DEHKHNFDIHINHIEHEDVCGDSLRIQQVFVNLMSNAIKYTPDGGNIIFSIEEKPNGFSE LGCYEFTIEDNGIGMSPEFQKIMFDPFSRADDHRTTRVQGTGLGMAISRNIVNLMNGTIK VDSTLHKGTKITVTIYLELQEKEKEQDRDLMNLPVLVVDDDRTCCKSTVATLKEIGIMGE WVLSGREAVECCYARHELKDDYFAVILDWKMPDMDGIETARQIRKRIGKEITIIVLTSYE FSEIEEEAKAAGIDAFIAKPLFRSRLTATLRQFTSGRKEKTARNYLEKLSESDYTGKRIL LVEDNELNREIAGEILQMTGTKVETAENGKIAVEKVEASPKGSYDLIFMDIQMPVMNGYE ATAAIRSLPGAKGKLPIVAMTANAFAEDVQLAKNTGMNGHIAKPLDMNKLNDVLKNWL >gi|222441914|gb|ACEP01000028.1| GENE 3 3558 - 5402 1396 614 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01635 NR:ns ## KEGG: EUBELI_01635 # Name: not_defined # Def: multiple sugar transport system substrate-binding protein # Organism: E.eligens # Pathway: not_defined # 27 345 4 324 326 125 25.0 5e-27 MEKAKQCLIAVLLGVFLTTTVLSGCGQSKKEVSQKDNHLTVYLWENRLMKNIAPYIQKQF PDQDIEFITGNNDTDLYSYFEEHGELPDIITVRRFSGADAQDLQPYLMDFCSYDVVSKYY SYALQYYKNSEDEIQWLPICGIPQTLIANKTLFDQYGIKIPKNYKEYAQACQQFYDNGIK PYILDLAEDWSTQEVIQAGAIGEFTSLDGIKWRSSAESSADNIKFDAALWKRILSQTSTF LKDSHFTKDDISVDITTATETFLEGKAAMFHGYPALIQEFQKQMDAELIRIPFFSQTSDE AFINMTPSLNIAFNKDLEKNPEKLDIALDVLDCMISEEGQKRIADGSGVISLNTDVPTMM KDVSGLEKEIQDNSVYIRYSAQKSFAASLKVVHGLLSGEMDEEKAYDTLCSVMNSKATGE KTTVNFEHEYSISLNDKNGRDAASSILTTVRNENNIQLALAPYYYFTSSIYEGECSSNRV ALMTAKNSDTPLYLAKINGKQVYELVGNYLADFDSDFYVTNKYELPIASGMKIIVKDKEN GFSLKSIAVNEKKIDKEKKYSILLTDITMSILKKLYPECEITQLKDTTLSSAWAAAMSKG QQPSAPEDYIEVEK >gi|222441914|gb|ACEP01000028.1| GENE 4 5392 - 5529 62 
45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNKSADKTKGNKKKENYNTNAYNPAHFFKHIIYEASFEQDEQVY >gi|222441914|gb|ACEP01000028.1| GENE 5 5513 - 6161 430 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026160|ref|ZP_03715352.1| ## NR: gi|225026160|ref|ZP_03715352.1| hypothetical protein EUBHAL_00401 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00401 [Eubacterium hallii DSM 3353] # 1 216 1 216 217 394 100.0 1e-108 MNRYIDENGKSSMGAVVEQIQQTYELQVNGYYSQLHLVEDYILKEKEVSPETDANQNFFE AWEKEAGSRLVFLQENGKAMTADGTKMRIDIPGKLLLALRNGHNIAKLVAWNYEETQSGG YLVAIPCQEYHMNGEVYTAVGTVYDHSKLDSMLKLKSYNENAYLFLLDEEENITYTNQSE DKFFRNYSLLKHLKADKAITEKEAALLQKKMDKKEQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:47:45 2011 Seq name: gi|222441913|gb|ACEP01000029.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont31.1, whole genome shotgun sequence Length of sequence - 117864 bp Number of predicted genes - 99, with homology - 95 Number of transcription units - 38, operones - 24 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 5 - 268 152 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 2 1 Op 2 . + CDS 338 - 568 74 ## Dhaf_2764 hypothetical protein 3 1 Op 3 . + CDS 601 - 2148 2099 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 2161 - 2199 6.9 - Term 2147 - 2188 3.7 4 2 Tu 1 . - CDS 2247 - 3374 463 ## Ethha_1703 helix-turn-helix domain protein - Prom 3401 - 3460 9.2 + Prom 3291 - 3350 14.8 5 3 Op 1 . + CDS 3549 - 3749 244 ## gi|225026165|ref|ZP_03715357.1| hypothetical protein EUBHAL_00406 + Prom 3762 - 3821 4.5 6 3 Op 2 . + CDS 3892 - 4818 320 ## BLJ_1240 hypothetical protein 7 3 Op 3 . + CDS 4805 - 5677 346 ## COG0582 Integrase 8 3 Op 4 25/0.000 + CDS 5730 - 6515 732 ## COG1192 ATPases involved in chromosome partitioning 9 3 Op 5 . + CDS 6505 - 7401 679 ## COG1475 Predicted transcriptional regulators 10 3 Op 6 . 
+ CDS 7437 - 7748 263 ## EUBREC_3610 hypothetical protein + Term 7760 - 7819 13.3 11 4 Op 1 . + CDS 7878 - 8276 437 ## EUBREC_3609 hypothetical protein 12 4 Op 2 . + CDS 8290 - 8688 449 ## COG0629 Single-stranded DNA-binding protein 13 4 Op 3 . + CDS 8702 - 9094 409 ## EUBREC_3607 hypothetical protein 14 4 Op 4 . + CDS 9087 - 9968 854 ## EUBREC_3606 hypothetical protein 15 4 Op 5 . + CDS 9979 - 10797 582 ## EUBREC_3605 hypothetical protein 16 4 Op 6 . + CDS 10797 - 11591 518 ## COG1484 DNA replication protein 17 4 Op 7 . + CDS 11674 - 12165 424 ## EUBREC_3603 hypothetical protein 18 4 Op 8 . + CDS 12189 - 14048 1284 ## COG3505 Type IV secretory pathway, VirD4 components + Term 14099 - 14132 2.0 + Prom 14062 - 14121 3.6 19 5 Op 1 . + CDS 14170 - 14385 266 ## Ethha_1766 hypothetical protein 20 5 Op 2 . + CDS 14398 - 15267 523 ## EUBREC_3576 hypothetical protein 21 5 Op 3 . + CDS 15291 - 15755 403 ## EUBREC_3575 hypothetical protein 22 5 Op 4 . + CDS 15637 - 17838 1686 ## COG3451 Type IV secretory pathway, VirB4 components + Prom 18001 - 18060 3.4 23 6 Tu 1 . + CDS 18087 - 18431 174 ## EUBREC_3574 hypothetical protein + Term 18485 - 18518 2.0 + Prom 18445 - 18504 2.1 24 7 Op 1 . + CDS 18562 - 21378 2006 ## COG0739 Membrane proteins related to metalloendopeptidases 25 7 Op 2 . + CDS 21421 - 21723 277 ## EUBREC_3572 hypothetical protein 26 7 Op 3 . + CDS 21677 - 23101 1451 ## EUBREC_3571 hypothetical protein 27 7 Op 4 . + CDS 23108 - 25180 1506 ## COG0550 Topoisomerase IA + Term 25192 - 25226 4.0 + Prom 25253 - 25312 5.1 28 8 Tu 1 . + CDS 25408 - 31737 6474 ## COG4932 Predicted outer membrane protein + Term 31764 - 31798 4.3 + Prom 31742 - 31801 3.4 29 9 Op 1 . + CDS 31826 - 31966 73 ## 30 9 Op 2 . + CDS 31973 - 33049 1136 ## EUBREC_3566 hypothetical protein + Term 33073 - 33106 2.5 31 10 Tu 1 . + CDS 33118 - 39813 4863 ## COG4646 DNA methylase + Prom 40188 - 40247 3.1 32 11 Op 1 . + CDS 40363 - 42177 934 ## COG3344 Retron-type reverse transcriptase 33 11 Op 2 . 
+ CDS 42258 - 43295 824 ## COG4646 DNA methylase + Term 43321 - 43352 1.0 34 11 Op 3 . + CDS 43377 - 46115 2392 ## EUBREC_3563 hypothetical protein + Term 46129 - 46180 8.3 + Prom 47418 - 47477 5.2 35 12 Op 1 9/0.000 + CDS 47644 - 48381 221 ## COG3279 Response regulator of the LytR/AlgR family 36 12 Op 2 . + CDS 48549 - 49673 387 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 49696 - 49755 5.7 37 13 Tu 1 . + CDS 49945 - 50517 207 ## Closa_2478 Accessory gene regulator B + Prom 50595 - 50654 1.7 38 14 Op 1 . + CDS 50729 - 51361 564 ## COG3208 Predicted thioesterase involved in non-ribosomal peptide biosynthesis 39 14 Op 2 . + CDS 51351 - 52049 326 ## COG2091 Phosphopantetheinyl transferase 40 14 Op 3 . + CDS 52039 - 53628 1032 ## COG1021 Peptide arylation enzymes 41 14 Op 4 3/0.000 + CDS 53643 - 58184 2857 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins + Prom 58199 - 58258 3.1 42 14 Op 5 3/0.000 + CDS 58309 - 60519 889 ## COG4693 Oxidoreductase (NAD-binding), involved in siderophore biosynthesis 43 14 Op 6 2/0.000 + CDS 60526 - 64797 1444 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins + Prom 65233 - 65292 4.1 44 14 Op 7 . + CDS 65325 - 66296 488 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 66374 - 66405 2.1 + Prom 66393 - 66452 5.6 45 15 Op 1 49/0.000 + CDS 66534 - 67385 378 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 46 15 Op 2 44/0.000 + CDS 67386 - 68186 595 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 47 15 Op 3 17/0.000 + CDS 68183 - 68977 285 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 48 15 Op 4 . + CDS 68962 - 69693 304 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 49 15 Op 5 . 
+ CDS 69695 - 71308 1529 ## COG0747 ABC-type dipeptide transport system, periplasmic component + Term 71380 - 71425 10.0 + Prom 71328 - 71387 3.5 50 16 Op 1 35/0.000 + CDS 71440 - 73179 229 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 51 16 Op 2 . + CDS 73191 - 74921 228 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Term 75079 - 75111 5.6 52 17 Op 1 . - CDS 75097 - 76476 657 ## COG3843 Type IV secretory pathway, VirD2 components (relaxase) 53 17 Op 2 . - CDS 76443 - 76766 263 ## EUBREC_3546 hypothetical protein - Prom 76852 - 76911 4.2 + Prom 76858 - 76917 4.1 54 18 Tu 1 . + CDS 77016 - 77390 250 ## PROTEIN SUPPORTED gi|157165407|ref|YP_001467497.1| 30S ribosomal protein S8 + Term 77449 - 77494 8.8 - Term 77438 - 77480 0.1 55 19 Op 1 . - CDS 77493 - 77693 215 ## EUBREC_3544 hypothetical protein 56 19 Op 2 . - CDS 77750 - 78481 613 ## COG2932 Predicted transcriptional regulator - Prom 78511 - 78570 7.3 + Prom 78565 - 78624 5.9 57 20 Op 1 4/0.000 + CDS 78676 - 79923 835 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 58 20 Op 2 . + CDS 79938 - 80150 175 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) 59 20 Op 3 . + CDS 80163 - 80402 176 ## EUBREC_3540 hypothetical protein + Term 80444 - 80490 -0.5 + Prom 80531 - 80590 2.1 60 21 Op 1 . + CDS 80653 - 81225 447 ## EUBREC_3538 hypothetical protein 61 21 Op 2 . + CDS 81235 - 81606 237 ## EUBREC_3537 hypothetical protein 62 22 Tu 1 . + CDS 81710 - 82549 767 ## Selsp_1548 hypothetical protein + Term 82568 - 82616 9.0 + Prom 82647 - 82706 3.0 63 23 Tu 1 . + CDS 82739 - 83053 399 ## COG1396 Predicted transcriptional regulators + Term 83073 - 83103 0.3 + Prom 83079 - 83138 9.9 64 24 Op 1 . + CDS 83209 - 83430 228 ## EUBREC_3534 hypothetical protein 65 24 Op 2 . + CDS 83427 - 83858 328 ## EUBREC_3533 hypothetical protein + Term 84008 - 84058 12.1 + Prom 83898 - 83957 3.0 66 25 Tu 1 . 
+ CDS 84185 - 84658 457 ## EUBREC_3532 hypothetical protein + Prom 84685 - 84744 4.1 67 26 Tu 1 . + CDS 84903 - 85085 104 ## gi|225026232|ref|ZP_03715424.1| hypothetical protein EUBHAL_00473 + Term 85131 - 85182 7.0 + Prom 85147 - 85206 3.0 68 27 Op 1 . + CDS 85236 - 85616 339 ## COG2337 Growth inhibitor 69 27 Op 2 . + CDS 85597 - 85845 223 ## EUBREC_3530 hypothetical protein 70 28 Op 1 2/0.000 + CDS 85985 - 87682 998 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 71 28 Op 2 2/0.000 + CDS 87683 - 89383 1281 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 72 28 Op 3 1/0.000 + CDS 89374 - 90948 1077 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 90970 - 91011 4.7 + Prom 90979 - 91038 6.0 73 29 Op 1 . + CDS 91138 - 91524 278 ## COG0582 Integrase 74 29 Op 2 . + CDS 91517 - 91765 280 ## Rumal_3218 hypothetical protein 75 29 Op 3 . + CDS 91758 - 92135 233 ## gi|225026240|ref|ZP_03715432.1| hypothetical protein EUBHAL_00481 + Term 92262 - 92299 4.1 + Prom 92327 - 92386 4.0 76 30 Tu 1 . + CDS 92457 - 92738 287 ## gi|225026241|ref|ZP_03715433.1| hypothetical protein EUBHAL_00482 + Prom 93287 - 93346 9.0 77 31 Op 1 . + CDS 93422 - 96658 2122 ## COG0553 Superfamily II DNA/RNA helicases, SNF2 family 78 31 Op 2 . + CDS 96676 - 97332 520 ## DET1110 hypothetical protein 79 31 Op 3 . + CDS 97335 - 98192 523 ## Maqu_1263 TIR protein 80 31 Op 4 4/0.000 + CDS 98194 - 100056 1083 ## COG2189 Adenine specific DNA methylase Mod 81 31 Op 5 . + CDS 100068 - 103139 2332 ## COG3587 Restriction endonuclease 82 31 Op 6 . + CDS 103190 - 104713 595 ## Dhaf_4662 peptidase C14 caspase catalytic subunit P20 83 31 Op 7 . + CDS 104731 - 106413 828 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen - Term 106402 - 106451 12.3 84 32 Op 1 . - CDS 106500 - 108077 450 ## RSP_2044 ATPase - Prom 108125 - 108184 6.0 85 32 Op 2 . 
- CDS 108222 - 108377 98 ## gi|225026251|ref|ZP_03715443.1| hypothetical protein EUBHAL_00492 - Prom 108413 - 108472 3.6 86 33 Tu 1 . - CDS 108523 - 108678 164 ## gi|225026252|ref|ZP_03715444.1| hypothetical protein EUBHAL_00493 - Prom 108725 - 108784 7.8 87 34 Op 1 . + CDS 108789 - 109007 137 ## gi|154503692|ref|ZP_02040752.1| hypothetical protein RUMGNA_01516 88 34 Op 2 . + CDS 108940 - 109152 165 ## gi|225026253|ref|ZP_03715445.1| hypothetical protein EUBHAL_00494 89 34 Op 3 . + CDS 109125 - 109328 82 ## gi|253580568|ref|ZP_04857833.1| transposase family protein + Prom 109368 - 109427 3.1 90 35 Op 1 . + CDS 109447 - 109554 74 ## 91 35 Op 2 . + CDS 109554 - 109721 128 ## gi|225026256|ref|ZP_03715448.1| hypothetical protein EUBHAL_00497 92 35 Op 3 . + CDS 109764 - 109937 92 ## EUBREC_1337 hypothetical protein 93 35 Op 4 . + CDS 109934 - 110179 236 ## EUBREC_3346 hypothetical protein + Term 110275 - 110303 -0.0 + Prom 110562 - 110621 3.8 94 36 Tu 1 . + CDS 110762 - 110848 65 ## + Prom 111024 - 111083 5.8 95 37 Op 1 . + CDS 111111 - 112598 918 ## Spico_0247 Site-specific DNA-methyltransferase (adenine-specific) (EC:2.1.1.72) 96 37 Op 2 . + CDS 112591 - 113511 520 ## Spico_0248 Type II site-specific deoxyribonuclease (EC:3.1.21.4) 97 37 Op 3 . + CDS 113483 - 113722 57 ## 98 37 Op 4 . + CDS 113753 - 116011 1115 ## EF3220 hypothetical protein + Prom 116277 - 116336 5.5 99 38 Tu 1 . 
+ CDS 116402 - 117424 397 ## EF3220 hypothetical protein Predicted protein(s) >gi|222441913|gb|ACEP01000029.1| GENE 1 5 - 268 152 87 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 5 87 126 209 306 80 47.0 6e-16 MGEDVTGCVIALIGISIISFSGTTMHLNPIGDILSVGAAMVWSFYAVLSKKMGAFGYSVI QTTRRTFFYGILFMIPALYFFDFKWDL >gi|222441913|gb|ACEP01000029.1| GENE 2 338 - 568 74 76 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2764 NR:ns ## KEGG: Dhaf_2764 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 61 234 294 305 72 63.0 8e-12 MTWNLAVRILGAIKTSVYIYLAPVVTAVASVIVLHEKITFLSGLGIVLVLGGLLLSEQKK KLFLIRGRKHTENVVE >gi|222441913|gb|ACEP01000029.1| GENE 3 601 - 2148 2099 515 aa, chain + ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 196 515 1 316 316 446 69.0 1e-125 MKRETVIVIDFGGQYNQLVARRVRECNVYCEIYSYKTDIEKIKAMNPKGIILTGGPNSCY EPDSPTYTKELFELGIPVLGLCYGAQLMMHILGGKVEAAPVREYGKTEVLVDKKDSKIFA DVSEKTICWMSHFDYISKVAPGFEISAHTADCPVAAAENAKKSLYAIQYHPEVLHTVEGT KMLSNFVLGVCGCAGDWKMDAFVENTVKEIREKVGDGRVLLALSGGVDSSVAAGLLSRAI GKQLTCVFVDHGLLRKDEGDEVESVFGPEGQFDLNFIRVNAQQRYYDKLAGVTEPEAKRK IIGEEFIRIFEEEAKKIGAVDFLAQGTIYPDVVESGLGGESAVIKSHHNVGGLPDYVDFK EIIEPLRDLFKDEVRKAGLELGIPEKLVFRQPFPGPGLGIRIIGEVTAEKVQIVQDADAI YREEIANAGLDQEINQYFAALTNMRSVGVMGDERTYDYAVALRAVKTIDFMTAESAELPY EVLNKVMNRIINEVKGVNRVFYDLTSKPPGTIEFE >gi|222441913|gb|ACEP01000029.1| GENE 4 2247 - 3374 463 375 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1703 NR:ns ## KEGG: Ethha_1703 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: 
E.harbinense # Pathway: not_defined # 19 204 2 160 169 86 31.0 2e-15 MTFDIMKLFENDNFISPGSIGGKIKKIRELRGLTQKQLGIMCGFSASSADVRIAQYEKNK KIPREKTLKDICEALNIDVYCLFDADMLPYQRMFHALFDIEDFHGLHPIKRNNKYYLEFS GPTVIGQDILPHDFDEFLKEWYEMYQQHLPNPSDTKKEKEKKEANYALWRYEYPINVAEE NTKKLQNQMKMNHLQAQMDALNAEMQGEAELSQLDSAMADALAASKKVYRDITKESEFIL LIKKLIEADIPIKRFSPEVSTKIDYDHMHIISIKTQDIINNESNKSYFAEFLCQIETMQR AGLNINRKITCKNNELYVTYEIASSQWKYLENLNRYWDDIIYIKERIGSWPNQEIDELNK KFLNKITGPHDMIIN >gi|222441913|gb|ACEP01000029.1| GENE 5 3549 - 3749 244 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026165|ref|ZP_03715357.1| ## NR: gi|225026165|ref|ZP_03715357.1| hypothetical protein EUBHAL_00406 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00406 [Eubacterium hallii DSM 3353] # 1 66 1 66 66 125 100.0 1e-27 MTEERVPMFVGVDTVKFDLGVSRAKAYEVIKQLNKEMKEQNPRAIVVSGKVNRIWYEEAC LKSGVH >gi|222441913|gb|ACEP01000029.1| GENE 6 3892 - 4818 320 308 aa, chain + ## HITS:1 COG:no KEGG:BLJ_1240 NR:ns ## KEGG: BLJ_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_longum_JDM301 # Pathway: not_defined # 1 268 80 352 381 139 33.0 2e-31 MAVLKNKTQKNFTIISNNVLRDKELSMKDRGVLCTICSLPDEWKFSISGLSAIVPDGVDA IRKSIFNLESLGYVVRKKTRGKDGKYVSEIEVFTEKRNVIDLPPREIRHGESITDNSTWK NQYGNAIAEKTLQYNKDNKSKEIKTDNIKSIHLSGKSEVEEDVEREIDIYKYKEIIADNI RLSWLIEIAERHNEDEVIMVHEIYDVICDMVCYPRDKVLVKGTFYPWEEVKSRFLKLRYE HIADILNRLIDATLEIKNMSSYLISTLYIESLVGTLDIQARLHDDYTKFLRGKPYLNYGK EKVGYGGI >gi|222441913|gb|ACEP01000029.1| GENE 7 4805 - 5677 346 290 aa, chain + ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 38 216 34 218 368 72 30.0 7e-13 MAVYKDKWNGYKGTTWRVACYYKDWKGVRRKHEKRGFSTKKEALKYEREFLAKTSKDINM GFGAFIDIYMGDLRPQLKASTMDNKENIIEAHILPYFKDRSLSEITSVDVLQWQNELLSQ RDENGKGYSQTYLRSIQNQLNAIFNHAVKYYELPRNPCIANKKMGRAKGKEMQFWTLDEY LKFSEAIKDKPISYYAFQILYWTGIRCGDDDDKIRLNQRKPSKYKGLSRFGPEKNLQRIN 
KFMKERPIFYKNLIQMKENFRFYLRCFYCITKVVILQFNSENRTELARNG >gi|222441913|gb|ACEP01000029.1| GENE 8 5730 - 6515 732 261 aa, chain + ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 257 1 250 253 191 42.0 1e-48 MARIISIVNQKGGTGKSACTANLAVGLAQKNKKVLIVDADPQSDVSAGFGYRDCDDSNET LTALMDAVMKDEDIPSDCYIRHQAEGIDIICSNIGLAGTEVQLVNAMSREYVLKQILYGI KDQYDAVIIDCMPSLGMITINALAASDEVLIPVEASYLPIKGLQQLLKTIGKVRKQINPK LQVGGILFTMVDAHTNDARNNMELLRNVYGSQIHIFDNYIPFSVRMKEAVREGQSIFSYD PKGKATEAYRRVAEEVLKDAI >gi|222441913|gb|ACEP01000029.1| GENE 9 6505 - 7401 679 298 aa, chain + ## HITS:1 COG:BH4057 KEGG:ns NR:ns ## COG: BH4057 COG1475 # Protein_GI_number: 15616619 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 23 197 18 176 288 81 32.0 2e-15 MPSKKRATTISLQPLDALFGTNEETTNGISEIAIGSLHPFPNHPFQVKDDKKMEELSDSI TQYGVLVPGIVRLRESGGYELVAGHRRKRACELAGLEKMPVIIKDLTDDEATVIMVDSNI QREELLISEKAFAYKMKYEALKRQGKRSDLTSCQVGKKLAVEEVSQNTGDSSRQILRYIH LTELITELLELADEKKLPFNTAVEVSYLRSEEQQILLQYMSNHNMVPSMKQAKELKQISK ERMLTYSEIDQICMNESTEKVQVQIPAKKLKQYFPESYSKAQMEEVIFMLLASWAEKQ >gi|222441913|gb|ACEP01000029.1| GENE 10 7437 - 7748 263 103 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3610 NR:ns ## KEGG: EUBREC_3610 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 103 1 103 103 142 92.0 3e-33 MMIEDVIKKSILPLAFMLFWFWMVFTIMKVSGQTELFWWIFLSGLPFGIHKMRLILIPRG MDTTATLGMAALSVIIGALIGSIMIPVYVIRAIYVFIRYVIGK >gi|222441913|gb|ACEP01000029.1| GENE 11 7878 - 8276 437 132 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3609 NR:ns ## KEGG: EUBREC_3609 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 132 1 132 132 226 98.0 3e-58 MRERIESKHVTKEIKSQMPVIPVEIVKERGCTFCPYYLGNYEKQPRCMMKTCAWDDENER 
FHPVLRELIPFYKEKMEKAEEKFLAMKKIYTTLLGMFADEMKQEELEKDECYGCAYGKCG PCIGICYKSMKA >gi|222441913|gb|ACEP01000029.1| GENE 12 8290 - 8688 449 132 aa, chain + ## HITS:1 COG:SA1792 KEGG:ns NR:ns ## COG: SA1792 COG0629 # Protein_GI_number: 15927558 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Staphylococcus aureus N315 # 1 104 2 106 156 65 36.0 3e-11 MNTTVISGRLVRNPGYNEVENEKGLHKIAKFVLAVRRNYSDEVSFIPVKAFAKKAEFARD YLMQGTKVMVEGEIVTGNYEDKDTGKKIYTTEVYANRIEFAGAKMVDRPPFPEDAEGFLE IPEGMEEEMPFR >gi|222441913|gb|ACEP01000029.1| GENE 13 8702 - 9094 409 130 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3607 NR:ns ## KEGG: EUBREC_3607 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 130 1 130 130 234 97.0 9e-61 MGSVDQAYMVGHMKNADGSINNKNVVLSAACEAAELFQCMMSVNQKLEDWQRFNHTLEQH LRSVECEKIHDICQAVYGMKHPSDELEGLYLAREDEQICLLYYQVFVGMLEININSAEVL YKQKEETINE >gi|222441913|gb|ACEP01000029.1| GENE 14 9087 - 9968 854 293 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3606 NR:ns ## KEGG: EUBREC_3606 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 293 1 293 293 408 77.0 1e-112 MSKMEYMLVDCLQVPKVLFQMEKYKNLSNTAKILYSLFLDRLKFAVQNGWVDGKGDLYVI YPKSEMKKDLNTTRYGVDQAVQELVEVGNLVRIIPNNGKANHFYINDIYENEMEEESMMT LDSIMANMPMEEREIIMDKMVKASRDILETIADMGYLYEYETPLTSMEERNYRDELGQLA VDQISCDGYEFGMNLAFGILEENEENADRVLDIQKYVNRQNKKCSKKKFMKVLETLVHAI GNDEGFIADMYACDKEAREIYLEEAMNAFNYVLDELCHGTGSAAGYDFNNIEE >gi|222441913|gb|ACEP01000029.1| GENE 15 9979 - 10797 582 272 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3605 NR:ns ## KEGG: EUBREC_3605 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 272 1 272 272 530 97.0 1e-149 MADFEYFRAEESDQFSFFRIPKALFTEKEFQSLSTDAKLLYGILLDRISLSKKNGWIDGD GYVYIIYTIAELQELLRMSHTTVTKLLYELDSVHGIGLIERYRQGNNRPSVIYVKNFVKR IRGKPVRYFASGTPETGNPDCKELEIRIARNWKSGTQKNRSPDCKNLDGSNTKINKTEKN 
HTDKSKSNTPALEEKFGLYGRFKNVVLSEAELQELMTLFPWDYQKRIDHLSVYMKSSGKE YQNHFATICLWAERDGARAGMDKYEFQEGESL >gi|222441913|gb|ACEP01000029.1| GENE 16 10797 - 11591 518 264 aa, chain + ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 18 254 27 273 282 147 35.0 2e-35 MENKVTADYLDEEGCLHCGICGKRKQMKVSLMGFEHVVSCLCECEVKARQELDEKMQREE AQRQLYQRKSVGLRERRFWEWKFENDNGSNQKILIARQYVENWTDMKRKNVGLLLMGPVG TGKSFFAGCIANALLEQGERVMMTNFSRILNEMTNYQADRNQIIQNLVDYPLLIIDDLGI ERNSEFALEQVYNVIDSRYCKMLPLIVTTNLGLNEMKSTDLDTAHQRIYSRILEMCVPIY CGGEDKRKEEGTEKLVQVQNLITG >gi|222441913|gb|ACEP01000029.1| GENE 17 11674 - 12165 424 163 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3603 NR:ns ## KEGG: EUBREC_3603 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 163 1 163 163 221 96.0 6e-57 MQEEVTQKTVTFCIRATKITANLLKKVLVAYLRHQKQKSVEKKAQKNQPKQGKVTVKELA KQNAGMVNIEITNKNIKSFERYARKYGINYALKKDKSKEPPVYLVFFKGRDQDALNAAFR EFSQKQIQKANKPSIHKRLAAYRAMMPKKSKDKVKNRHQEQSR >gi|222441913|gb|ACEP01000029.1| GENE 18 12189 - 14048 1284 619 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 151 583 157 562 591 207 34.0 7e-53 MQKWKNKIGGKLSALDKKKLVLTNIPYALAAFYADRAFFLYRNSPGEDMGNKLLYAMEHA DRIFAGFVLSNNWKDLLAGIVVAVVLKVLVWQKQADAKKLRKGIEYGSARWGTAEDIKPY MSEDPWMNIPLTATEALTMESRPKQPKYARNKNIVVIGGSGSGKTRFFVKPSVMQMNCSM VITDPKGTLIEECGKMLAKGPPKKDKNGNIMKDKSGKVVHEPYVIKVLNTINFSKSLHYN PFAYIRSEKDILKLVTTIIVNTKGEGEKASEDFWVKAEKLLYTALIAFIWYEGKEEEKNL NTLLDLLNESETREEDETYQNPVDMLFEELEAKEPQHFAVRQYKKYKMAAGKTAKSILIS CGARLAPFDIAELREIMSYDEMELDKIGDRKTALFLIMSDTDTTFNFVIAMLQSQLFNLL CDKADDVYGGRLPVHVRVIADEFANIGQIPQFDKLIATIRSREISASIILQSQSQLKAMY 
KDSADTILGNCDTTLFLGGKEKTTLKEMSELLGKETIDLYNTSETRSNQKSFGLNYQKTG KQLMTEDEIAVMDGGKCILQIRGARPFFSDKYDITKHKNYRFLADENEKNRYKVEKELNP QYTPKPEEEVEVIQVELSE >gi|222441913|gb|ACEP01000029.1| GENE 19 14170 - 14385 266 71 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1766 NR:ns ## KEGG: Ethha_1766 # Name: not_defined # Def: hypothetical protein # Organism: E.harbinense # Pathway: not_defined # 1 70 1 70 71 89 77.0 5e-17 MAFFSSAITTLKTLVVAIGAGLGVWGVVNLLEGYGNDNPGAKSQGIKQLMAGAGIMLLGT TLIPQLATLFS >gi|222441913|gb|ACEP01000029.1| GENE 20 14398 - 15267 523 289 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3576 NR:ns ## KEGG: EUBREC_3576 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 289 1 289 289 493 99.0 1e-138 MNTIIERITEAIKDILIGLIKSCLDNMFTSVNEQVGTIAGQVGQTPQGWNAGIFNLIQNI SQTVIVPIAGLIITFVLCYELITMVTQKNNFHEFETYNIFLWIFKAYVAIYLVTNTFNIT MAVFDVGQHVVNNAAGVISGNTAVDASEAITRIVDALEDMELGDLFLLSMETLLISITMR ILSIIITVILYGRMIEIYLYTSIAPIPFATMTNKEWGNIGNNYLKGLFALAFQGFFMMVC VGIYAVLVNAMTISSDLHAAMFSVAAYTVILAFSLFKTGSLSKSIFNAH >gi|222441913|gb|ACEP01000029.1| GENE 21 15291 - 15755 403 154 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3575 NR:ns ## KEGG: EUBREC_3575 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 154 1 154 154 231 99.0 8e-60 MAYVPVPKDLTKVKTKVALNLTKRQLIFFSLAAVVGIPFYLVMRKPIGSSIAAILMVTIM LPFFFMAMYEKDGLPFEKVVANIIRQKFICPAVRPYKTENFYQEISTLVKKEGVPDGKKD KAGGSAGASVPGGKKKPSGKKKQKKQKRKRTGKA >gi|222441913|gb|ACEP01000029.1| GENE 22 15637 - 17838 1686 733 aa, chain + ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 247 717 27 525 617 86 21.0 1e-16 MAKKTKQEAQQERLSLAEKRNRQERRNKKSRKEREQEKRRKKQERDRRKQKNIPTTAQQS IPYIRMLQDGICQVTKTFFSKTIQFYDINYQLALNEDKTTIFENYCDFLNYFDSSISVQL 
SFINQQVDVAEFEKSIDIPDQNDDFNAIREEYRTMLKNQLSKGNNGLVKTKYITFGIEAE SLKVARPRLERIETDILNNFKVLGAQAHSLNGLERLEILYHVFNQDRIEPFKFQYKMLPE TGLKTKDFIAPTSFNFSKNQTFLMGRTMGSVSYLQILAPELTDRMLADFLDVDDSINVNI HIQSIDQSSAIKMIKSKISDLDKMKIEEQKRAVRSGYDMDVLPSDLVTYGEEAKNLLEDL QSRNERMFLVTVLVMNTARKRQKLDNNIFQVQGIAQKYNCSLKQLDYQQEAALMSCIPLG VNRIEIQRGLTTSSTAIFVPFTTQELFQEGEALYYGLNALSNNMIMVDRKRLKNPNGLIL GTPGSGKSFSAKREITNCFLITEDDIIICDPEAEYSPLVQALHGQVVKVSPVSDQYINPM DLNLNYSEEDNPLSLKADFILSLCELIVGGKNGLEPVEKTIIDRCVRLVYQEYLADPMPE KMPILEDLYNLLRKQEEPEAQRLATSLEIYVTGSLNVFNHRTNVDINNRIVCFDIKELGK QLKKIGMLIVQDQVWNRVTINRSAHKSTRYYVDEFHLLLKEEQTAAYSVEIWKRFRKWGG IPTGELVCYLLVA >gi|222441913|gb|ACEP01000029.1| GENE 23 18087 - 18431 174 114 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3574 NR:ns ## KEGG: EUBREC_3574 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 21 114 660 753 753 177 97.0 2e-43 MKMGESPREVDKKPPDNNNQITQNIKDLLASREIENIFENSDFIYMLNQASGDRQILAKQ LNISPTQLSYVTNSNEGEGLLFYGNVIIPFVDRFPKNSLYKIMTTRLEETSEAG >gi|222441913|gb|ACEP01000029.1| GENE 24 18562 - 21378 2006 938 aa, chain + ## HITS:1 COG:TP0864 KEGG:ns NR:ns ## COG: TP0864 COG0739 # Protein_GI_number: 15639850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Treponema pallidum # 635 797 381 541 546 115 40.0 4e-25 MAEQRMKVRDKKVQKMTKDGLVEENLTDKSSVRVSNRASDVQMGGKQGEKELSLVDKSSR QSERSGKNIRPSVQKSRDAPELIRTEKQSKDFGQNRKQQNRKRIRAEVGKESRLAEKSEA GQSGRLKEKRADLSENRGDSRRNQTGGSKKPKQKQRLKFAYEETGASVKNEKDMEKLNQM DTDGVNFRHQKQKEKAKKFSYEEAKKKKEAKQHSKKAQVYRANEGEAPGQKKSRLKFGEG ESVKTEKISAVKKAGSATSVALHREISKNEDDNAAVEGAHKLEEGGEGVYRLEQRSARRR KQRASRKRSRLERQEEKQAQAAAHQEQKKLKKQIQKQQIKRDYAKAKRSEQTVGTATKGT IDYIKKIGGKVTNFFKENRKVYISVAVLIGLMFLIITNVTSCSAVFLQNVITYTGTSYLS SDQAIREAELYYTQLEANLQERINNMESEEPGHEEYRYNIGPIEHDPFILISYLSAKYEE FTFEQVKPELDALFAEQYHLTTEAVNETVTETATVRVGESLGQVVTSGYCNCPICCGIWS GGPTASGVYPTANHTLAVDASNPFVPMGTKVVMNGVEYTVEDTGAFARYGVQFDVYYDNH 
AAASAHGHKTWECYLADDNGSQEVEVTRTRDVDVLNVTLNSGNLMLICQDRLRFFQKELF SAYNDTKGNLQMFATPVDFNWYSSVTSYYGYRIHPISGANQLHNGMDIGAPEGTKVMAGL TGTVTTSAYNDSYGNYVVIKDSKGYELRYAHLSSRSVSAGASVTKGDEIGLVGNTGNSTG SHLHIELLKNGERLNPIFYLETGEGTGFGGNEYTSEAAQRLLNEAARYLGTPYVWGGYSP SGFDCSGFVSYCLTNSGVRNTGRLTAQGLYNICTPVSQSEAQPGDLIFFTGTYDAGEPVT HIGIYVGNGQMIHCGHPVQYTSINSPYWQSHFYGFGRW >gi|222441913|gb|ACEP01000029.1| GENE 25 21421 - 21723 277 100 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3572 NR:ns ## KEGG: EUBREC_3572 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 100 21 120 120 102 96.0 6e-21 MKSLEKATADYEKAKEKMEASKARYEADLKRFKAAEIAKTEAENLEIVKIIRAMDMSIPE LEAFKKRMKNELPGRVEIQKEETKADDESGKNETEEENNY >gi|222441913|gb|ACEP01000029.1| GENE 26 21677 - 23101 1451 474 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3571 NR:ns ## KEGG: EUBREC_3571 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 468 1 467 475 647 97.0 0 MMNLEKMKQKKRTITKSRVKKSLLAATTFLCLTTPLLQTQTAYAAENTTPAITIEKPDGW KQGETTIAVTVDASHMPEGFSIAKIEAKAGKDGSWQDVTGSGSITITGNQTVYVRVTDGE GKVYEQNRSIKCYDTEKPTLSASLTDGVLTIQGNDTVSGITAVTVNGTTYTDLKDGMLRV QLTQKDFTTKQIEITVTDGAGNTSEKYVLQNPYYEWAKKQAEKQKSSRDSNGAMATTTSA DATGTEKTTTSPLPQDAQASEPTDAKGTVDDRTVTGIEEQLNKEGETAEAVTKTATEGTK EFYTISTKSGKIFYLIIDNSKSQDNVYFLTEVSEKDLMNFTLSDSVTLPEVDTVYAEPEK KAEEEKPETTETDEKDKVEDEIQMPEDKSPFGTYLLIALVAVGAAAGGYYFKVYKPKHEY DDEDEMEEDEDESEDSESEKREVDDADEQEESAEPEILRDEDFNDDVLEDEEEE >gi|222441913|gb|ACEP01000029.1| GENE 27 23108 - 25180 1506 690 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 677 5 702 709 452 40.0 1e-126 MKLVIAEKPSVAMALASVIGARTRKDGYVEGNGYLVSWCVGHLVGLCDASEYDEKYKKWR YEDLPIVPECWKHRVLEGTKKQFGILKKLMRDSKVDEVICATDAGREGELIFRLVYEQAG CKKPMKRLWISSMEEQAIKDGFASLKDGSCYDSLYQSALCRAKADWLVGINASRLFSVLY 
NQNLKVGRVQTPTLAMIVDRNQKIKEFTKEKYYMAHIKFDEMDAVTEHFQKKEDADRVAA DCMERMCEVEKDDVKEKTVRPPKLYDLTTLQREANRMFGYTAQQTLDAVQEMYEQKLVTY PRTDSQYLTDEMGESTETLIQMLLGKMPYAEGLEYHPDVSKVLNSKKVSDHHAIIPTMEV AKADIGELKERNRKILYLISARVLTATADPYIYESHKCQITCNYHTFYLTAKKTKQEGFK AIENKLKQFFGVKIEKEEPELDIWAGKHYGPCDSLVSEHFTQPPKQYTEDTLLSAMERAG NEELTEDTEKKGLGTPATRAAIIEKLIQSGFVKREKKNLVPTDDGNVLITVLPDEIKSPK MTAEWEMALNHIAQNTETADEFLNGITELMQELVARYQGISEEKKNQFQGKAKGEVIGKC PRCGADVQEGKVNFYCSDRNCTFTLWKNDKFLASQGKKMDKTAAKKFLSKGKIHYKDLVS RKTGRQYEATVEMVDPGEGNVQFNLSFPQR >gi|222441913|gb|ACEP01000029.1| GENE 28 25408 - 31737 6474 2109 aa, chain + ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 1137 2107 1113 1981 1983 137 25.0 3e-31 MKKKGNFRKVVSGLLAGMTMLSTVLSPMTAYAAEIQPEEKPPLYEEVKDLLDEDEVVTAK DYEIETGSVFNVKSDYTGLEIKDDNKVKVTFEEAKNDKNEDFTTDHADTYKAVYYVEPVN QEHPKYQISRKLIVREKETEVQTETEGSEAVTESETAGSEQQTEETENSEADSELLPTEE VTESEIEEFDSQASELLPDMEEDTENTTDSETGLTVSDAMEQAEEEGIDLYAMEAGETVT FMAKTSARSVQKVSVTRGTLYRYADYGYGSYLTYQYTVQFGNVSATAYCVQPSKPGPGTG NYTISKVGDGKTLAKVCYYGTKAAGDEGFFTEENGYGNLSAGAKFILVHLAASYANGSGD AFSGANSTAKNLAMKLYNYCVSQPEIPDVAMSFSDGDVKAYVDGNSQRTKDITFKADKLQ TITMKLPSGVKLHNLSTGTTSKAGASVEICGGTKFYLSAPLTQVSDVAQSWSSTMKGSIT KDYSAYKITTGSDTQDLALVFGEGVTDEKYIDFKVSWIEQATIEIVKKDDTADVNLAGAV FGVYSDEACTKLITQMPATDKNGKSSVTIIKTQDTVYLKEITAPQGYVVNATATNVKLVA SKISAVTVENKEQLAELTIYKEGQVLTGAEVSESGTVFQYENRRQKNAVYNVYAGADIVT AYGTKVYSKGDLVKENLTTGENGSVTLKNLHLGTYVVKETKAPDNFYNGSEEKSVTLTYA GQNKEVVFADVTFNNERQKADVSVVKQDKDTKKPLKGGIFALYASDDIRNADGTVIVKKG TLIEKATTGEDGTAKFTADLPIGYSYSVKEDQAPEGYVRNTEDVYTFKFSYTNDKEATVS FAHTFSNDRVTAKINLFKVDKETGKAVPQGDATLKGAVYGLYAREDIVHPDGATGTIYKA GEQVASLTTDDKGQASVNGLYLGKYYVKEITPPTGYLADTEEHDLTCSYEGDMTAEVKRE CTSSEQVIKQPFQIIKAANNGKTDADLLFGAGFTVYLKSSLTKKADGSYDFDSAKPVVSG ENGATEIFTDEKGYACSIPLPFGSYIVRETTTPHNYKPVDDFEVNITEHHPNEPQIWRVL 
LDEEFKAKLKIVKKDDETKKSVLIAGTEFKVYDLDNKKYVEQVTTYPVTTTHKSYFTDSQ GYLIMPKNLKIGHYRIEEVNAPEGYTINKNYVEIAVDANTAYQMDSESGDAIITVDYENH TVKGKLTIYKKGEMLTGFKKDFVYEERYLKGASFNVYAAENIYTPDYQKDENGNRQLIYA KDALVTTVTTGEDGKAVAENLPLGSYYVVEKTAPEGFVLNHDRSEVAFVYADQDTPVIEQ EVTVGDDRQKVAIQVEKQDAENGATVVGAVFGIYNKADIKADGKVIVKADTLLQEMTSDN DGLAACTLDLPLGQYYVKELKAPAGFVSSDEVLNLDASYQGQDVKTVKLKTVKKNQPTTV EITKSDVTTGVELDGAKLTVLDKEGNVVDQWTSVKDQPHVIKRLTVGEEYTLREEMAPYG YLKTTDVKFTLEDTAEIQKVEMKDEVPTGLLIINKNGEFLDKVTLLDNVKGTVEHLFEYV TGSLTDVTFDVFAAEDIKAADGVSEDYFKADEKVGTITTDSNGIAQMGDLPAGKYYVKEV KTAHGYVLDKEPRYVDLSYRDQDTPVITYDEKWQNARQKVKVTVVKKEKDTDRVLAGGVF GLYTSEDIKNAKGEVLLEKDSLIEQRVTDEKGQITFTADLPVDGKYYVKEIFAPDGFVTT EEVQEFTFEYAGEDQAEVSYDFTFENQPTTVELSKTDLTTGKELPGAHLKVTDSDGNTVD EWTSTEESHVIKELVVGKEYTMTETKPADGYVTAESIAFTVENTAEIQKVEMKDDVTKVE ISKTDITGETEIPGAKLTILDKDDQVVESWTSTEEAHYIEKLPIGKYTLREEQAPKGYLL TADVTFEVKDTEEIQKVAMKDDTAKGKVILNKTDKSSGEPLKGVEFELRDSKGKVLETLK TDAAGHAESKLYEIATFKNGKYDTAIKYYLVETKTLDGYTLDQTKHEVTFAYSDDSTPVV EVTFNLTNEKPEVPETPNTPDTPQSHEETKVSNAPKTGDSTNIWLPILLLVISTGGMAGL YISRKRKSK >gi|222441913|gb|ACEP01000029.1| GENE 29 31826 - 31966 73 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMKAMEKRQEARKNQEKTLKRMMAYQKKAEQMVSRKRMNHHMIRKG >gi|222441913|gb|ACEP01000029.1| GENE 30 31973 - 33049 1136 358 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3566 NR:ns ## KEGG: EUBREC_3566 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 358 1 362 362 612 90.0 1e-173 MMNRKEFYEYVKDNVKEYLPESYKDAEIKLQEVEKNNGLKLTGITIPNGDQRIVPTVYLD SLYQEYIHGKDVDSCVGDVADMRIEAQGKAEFFDMGVPDILDYEKMKDKLQMRICDKEWN TDLLADKVVTEHGDFAAYYAVNLEENGEGISSIPVTVSLMNEWGVSAEQIQADAMVADRK RGVTLMDMNEIIKSMIFGEEPENLLNEKMDMEAMENPMFCLTNKAKMNGASLLLQEDIRK QIGECLGSDYFVIPSSVHEVLILPDNGIFQVPELNAMVQEVNETQVERQEQLSDKVQFCD KKTAVMENAERREARLEKEKAEVKGGIHGRLEKAKAEIKAKEADKVPKNKSKELATAL >gi|222441913|gb|ACEP01000029.1| GENE 31 33118 - 39813 4863 2231 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # 
Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1242 2225 1 1020 1315 548 33.0 1e-155 MANKLYAMEQLTEEVAKDVAASPQEWMRFLNTASRLYKYTFPEQLLIYAQRPEATAVASM EIWNQKMYRWIKKGSKGIALIDNTSGSKTKLRYVFDVQDTYKVRNLGKDPQLWNLPVKGE QLVADYLQEQLSLEDTEGGLAESLHQAAKESMQEWLPDALEELRLDVTGTFLEELDEQNQ EVEFRELMTNSVWYVLLNRCGLDVQEYLDAEDFRHITDFNQLKVLGHLGSAVNEISRPVL MQIGRYVLKDLENDLKTVAKEKEVVYNEFNTLIRESIEDSTEDREENKEEVDYERDQLQS ERRIPDSGYQPGGDERDDREVRNDEERVSEKSQGSQVQHSDTAEPSGQSSDGDRQSGKTE SRQSDERTFGERSGTGQDGRHNGVDQTHEPDQSTGRGTGDPGDYLQLSLFPTEEEQLGEI RKAAAALEQPAAFLISDEMVDDILRTGSGQKNTLFHITARLIEGLDNEEMQSFLKDEYRT GGKGFTIDGQKISIWYDNDGIRIRRGDSARRNFDRIVTWEETADRIRDMYEEGNYVDNLI SNNAIEQEQKEMTDLLALHFRDTNRNTEEYRSYTDWQDIIRNAWPDPEGQKEIYQQFEWL QADMNENPSNYHRWEIQHNPVYFQRFRDLQRDFSWVAQQFKVERPALSFITQDEIDAVLR RGGITAGGRNRIYEYFMEHHDMKDAAEFLKNEYGTGGSSPGIPGADASDASHDAKGLKLG KGKIGNPEVEVLLKWNKVAERVRQLIRTDDYLSPEEMEKYEERQEAQRLADLEEAQQMLG EQLEQDTFTAEDITDLRLVDSEYMSGTRTKIYDFDCKVKGEEDRLQYTLEYHDDGEGFTI HTEKDDIWDRMSTQELERLDVKLGQEVLYYHYHNKTVNADTLDKLRDIREEIMEEESSYF TAISNRVWTDYDKKEKELSGEIEASEEKESLEEINGIAEPIQATNFHITDDELGQGTPKE KFRANIMAIQLLKKCEDENRNATPEEQEILSRYVGWGGLADAFDETKSAWETEYLELKTV LTPEEYAAARASTLNAHYTQPIVIESMYQVLENLGFTKGNILEPSMGVGNFFGMLPENLN QSKLYGVELDSISGRIAKLLYPDASILIKGFEKTDYPNDFFDVAIGNVPFGAYKVNDRQY DRYNFMIHDYFLAKTIDQLRPGGVAALITTKGTMDKASPEVRKYLAERADLLGAIRLPNT AFKANAGTEVSTDILFFQKRESFTKEMPDWIDLESDANGISINKYFVQHPEMILGEMKEV SGPYGMETTCAPMEGADLELQLQEAVKHIKGSMAPAIDVEAELDEMPESIPADPNVRNYS YTVVDAQVYYRVNSLMNQVKMPAATAERVKGMVAIRDTVRELIAMQMEEFVTDEEIQKQQ EKLNQVYDTYTAKYGVIGSNANKRAFSDDSSYCLLCSLEDLNEDGTLKRKADMFTKRTIK KAVAVTSVETATEALALSLNEKAKVDLPYMAQLTGKTEEKITEELVGVIFKNPLTDQWES GDEYLSGNVRDKLNTARTFAENHPEFTPNVRALEAVQPRNLEASEIEVRVGATWIEPSDY QDFMVELLHTPRYLAQKEIQVKFSEVNGEWRITGKNADSPRNAFAYATYGTERANAYRIL EDTLNLKDVRIYDKVVNDNGDEVRVLNKKETMLASQKQDALKVAFQDWIFKDQQRRERLV SVYNERFNSIRPREYDGSHLSFPGMNPEIELRPHQKNAVAHQLYGENVLLAHVVGAGKTY 
EMVAAAMESKRLGLSQKNLFVVPNHLTEQWGAEFLQLYPGANILVATKKDFEPANRKKFC ARIAMGNYDAVIIGHSQFERIPISDERQEAMLQRQIDDLEMAIQSARYEQDGGRYTVKQI EKTRKTLQTRLEKLNQKEKKDQVVTFEELGVDHLYVDEAHSYKNAFLYTKMRNVAGIAQN EAQKSADMFNKCQYLDEITGGKGITFATGTPISNSMTELYVMQRYLQNSKLQNMGLGLFD SWASTFGEVVTSIELAPEGTGYRAKSRFARFYNIPELMNMFKEIADIKTSDQLELPVPEA EYETVVLKPTEQQKEIVENLGERAEVVRNGGVDASVDNMLKITNDGRKLALDQRLVNELL PDNPESKISVCAEKSYEIWKDTAAQKSAQLIFCDLSTPKGDGSFNVYDNLKQKLMEKGVP EKEIAFIHDANTEAKKTELFGKVKSGQVRFLIGSTAKMGAGTNVQDRLIALHHLDIGWKP SDLERAPVKAS >gi|222441913|gb|ACEP01000029.1| GENE 32 40363 - 42177 934 604 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 28 599 255 823 834 288 35.0 3e-77 MRNPENVLISLTKHSNQKDYKYERLYRLLYNEELYLTAYQNIYSNDGSMTKGTDKQTVDG MSIERIRKIIVSLKDESYQPKPARRTYIPKKNGKMRPLGIPSFEDKLLQEVIRMILEAIY EGHFENTSHGFRPNRSCHTALNEIQKTFTGVKWFIEGDIKGFFDNINHATLIGILRERIN DERFLRLVRKFLNAGYIENWTFHNTYSGTPQGGIISPILANIYLDKFDKYVNEYVRKFKK GKKRMRTKEYRRNEVELSKARIALKNANDDCERENAIARIRQLEKERVNIPPSDPMDNNY ARLVYVRYADDWLCGVIGSKEDCKKIKEDFKNFLKEQLQLELSEEKTLITNAQKSAKFLS YEIRVRHSNLTKRDKTGKLVRNYTGRIVLEVSSDTIRKHLIDTGAMKLIYHNGKEIWKPK AIYRLKNCDDLEILDYYNSMIRGFYNYYCIANNSSIINSYKYIMEYSMYKTYGTKYRTSI SKVIGKFRAGKDFAVKFRNSKGVEKMRVFYNEGFKRQKKAFQSNADLVPNTIKYFSSTKL IDRIKARECELCGKTNTPIEIHHVHRLKDLQGKTFWEALMIARNRKTIALCRECHKKLHC GQLN >gi|222441913|gb|ACEP01000029.1| GENE 33 42258 - 43295 824 345 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 10 294 1020 1312 1315 123 31.0 6e-28 MARRWVLSLQREGRIIRQGNHNKKVHIFRYVTESTFDSYMWQLIENKQKFISQIMTSKAP VRSCEDVDEAALSYAEVKALATGNPAVKEKMALDVDVAKLKLLKANHMNNQYRLEDDIAR NFPQQIAKLTEIIDSYKADIAHYSEHKSTDPEQFTMEISGKVFTEKKEAGAALLAVCKDI 
KAVDAAMDIGNYQGFNMRIQFDGWSKEFILSVKHEAVSKVHLGADALGNITRINNLLDSY PEKLSEAQQRLETVYEQLANAKEEVGKPFPKEEELNQKLERLSELNALLNMDEREDAEAE VSESDEKEERSARGSIHEKLQIYKEKSQRESETGKENRKRDFGLE >gi|222441913|gb|ACEP01000029.1| GENE 34 43377 - 46115 2392 912 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3563 NR:ns ## KEGG: EUBREC_3563 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 912 1 918 918 1472 88.0 0 MAGNMITGKETQKMQEYTQERIAQAERRDIEFSTDVKEQIFEYANLIGEEEKVRTLVRNL ADAVSQADQEGVEDLLDDVRMDIQELPDPTIGKLELRDYGYTAEDMVPLRKEAALDYHRM GSKIYCLGSDGSKGEYASKEMIQAHDGLFGMESQMWERIKDNNLDYAEENLGAFQEPMSV IEQEEALKLYDAGADIYLITNFSSPMYVTERMEIERGPEHYQMSMTELERFRNLEWEMQK YPQIQSLKEANLLLGTRRIFGIYQIRDDLPGENYAFMNMSFIESHGMQIKKEDYKLVYVG ELSGNMSLDDIFEKFNIDRPEDFRGHSLSVSDIIVLNDGEKVTAHFVDSISFEQLDSFLN LEEQVLSELAYEVGERYFAIQRTEEGYDYSFYDEDFRLMDGGIYENDEISIEEAAEELLE DEGWTGERIRGDYDQLMEKVEEMDEAVMAEIQKSQGEYKPLAKVEELEEANYNMIDNVLN NMPPKKEPYLEYFAAECDEFHDMGAYEKSTDANQIAAVYEKYRENPKTAYLGCSMGIIYH DPEDSYYDEAEFAIVKGNTVFGNLMDDVRFYGELALVREGIEKIHEALPDYKYVPMRDVR EAMYPEKMTTEQLAEALDEIAEAFDPYEYRDNVEPGENTVQEVMLDLRSGNIHYYVSYLK DIVDEECDLSVRAGVLIERLKAYEPELPKDMQPMVYVNYCEKSELGNPRCQKLSDLDSKT IEQDKVWYADRDPKTNEPKTTAQMFFTVYYAEKGDKMLHHFQGKIDIGTGNGGIISQLKM QNEMKLTDESWISYQQGKGNEEYQKYMEDLTDMQNHVLPYLQSFCSLEEKGVKERREQQV AEKNESREAVPRTGGEANIAVKDAGKAERKEITGKEKKPSIHERLEINKRIIQEKQGKDE PERGDDLSVRMV >gi|222441913|gb|ACEP01000029.1| GENE 35 47644 - 48381 221 245 aa, chain + ## HITS:1 COG:CAC1581 KEGG:ns NR:ns ## COG: CAC1581 COG3279 # Protein_GI_number: 15894859 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 238 1 230 234 70 25.0 4e-12 MCKIAICDDDKEYREKIKTVIKTEGILSSNEIRFYEYESGKELLEDADILHDLIFMDMRM PGLDGNKTVLKLREYNEAAILVFCSGCFEPTPDSINVGQPFRYIMKDLHDRSLKKEIPMI LLKVKLCCGDSSVTVTSAGRIMRIHTEDILYICLAKRGCTIYIARQGKIEEMHCKESLSD LYECLAGRGFEYAHNSYIVNLAKVEGLEKNVALLSFGIQLNVSRSRKGQFADALITFLGE RNGMV >gi|222441913|gb|ACEP01000029.1| GENE 
36 48549 - 49673 387 374 aa, chain + ## HITS:1 COG:CAC1582 KEGG:ns NR:ns ## COG: CAC1582 COG2972 # Protein_GI_number: 15894860 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Clostridium acetobutylicum # 68 359 137 438 452 68 23.0 2e-11 MVMIPVIYMIASMVVFRGNIWKKLVTVCCYYVLAIIPEFLFAALTNAYGVTGASEGFRTE TEKTLALLLMSTMTFLFIKCINQVTRKRDYLTIENKTFTVLLMLPTATIVMLGCMFYSHT SFEGMNRVMVPVGASLLLMTNIFIFTVFDRFVEKSEEVKKMDRLYQKSRAEIANLQYMNK VNEDNRAFLHDINKFICTVAGLIEEGENQEVKDIMEHLGVRIQNLQKSVYCEHPILNSIL CERKFLAESKDISYRITLGNDLRLDFLEELDLISIVGNLLDNALEAAEKTEDGRYVECRM YMGNAGHFLVMEFCNGYVVPLLKDKDRYISTKRDADSHGIGLHTVGKLVKKYAGIMRVEA GEREFSVKLIFTIK >gi|222441913|gb|ACEP01000029.1| GENE 37 49945 - 50517 207 190 aa, chain + ## HITS:1 COG:no KEGG:Closa_2478 NR:ns ## KEGG: Closa_2478 # Name: not_defined # Def: Accessory gene regulator B # Organism: C.saccharolyticum # Pathway: Two-component system [PATH:csh02020] # 4 182 7 185 197 74 34.0 2e-12 MAGRLFELQSRNGILEEKDRRLYEYAYGVLLGRVVIYLIIVILGIITGNWMEMIVFLLPF TVLRQYAGGIHLEKAGGCMAVSGILVLLCSLYLASAPAVIWQMRIIWFVAVGVIFIMAPV DASSKKLDAKEKKVYGMRARVILVIECAIAGVFSVIGYSLIVNGIMVAHIVLASGLILGW TKIFFDKELE >gi|222441913|gb|ACEP01000029.1| GENE 38 50729 - 51361 564 210 aa, chain + ## HITS:1 COG:alr2045 KEGG:ns NR:ns ## COG: alr2045 COG3208 # Protein_GI_number: 17229537 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Predicted thioesterase involved in non-ribosomal peptide biosynthesis # Organism: Nostoc sp. 
PCC 7120 # 1 203 45 249 253 154 38.0 1e-37 MQGKVTVCPVQLPGREERIMEKPYIDMPVMLDDLEEAVREVVDGPYALWGHSMGGKISYE LEKRLEAEGYRAKYLFISGSRIPSIPEPKPIYHLPDEAFKRELGRFEGTPKEILENQELL DFFLPMLRADFTMDETYYDKAGVVLHTPISAFGGEKDGEADESAICEWGKYTDNDFDYRI FPGGHFYLRDYEDEVISEVERRLQGMKNGG >gi|222441913|gb|ACEP01000029.1| GENE 39 51351 - 52049 326 232 aa, chain + ## HITS:1 COG:slr0495 KEGG:ns NR:ns ## COG: slr0495 COG2091 # Protein_GI_number: 16331528 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Synechocystis # 25 187 6 171 246 92 32.0 6e-19 MEVKRCGEKEFLSADKLRIRENEIHIWCICWPEMTDFWKNHEYILSEQESEQAGKFRFPE DRMRYIAGKLIVRILLKRYLDVETVDFSVNELGKPYHKEIAGKQTVNFNISHSGEFILEG FAVGMDIGVDVQEMAGCPDYREIAGNFYTAEEAEDVINEGPELFFQYWAAKEAYVKAIGI GLGRGMDFFSVRNGEITEKGRKGQGWHLYPVKIEGYAAYVAAHKKGDKSNEL >gi|222441913|gb|ACEP01000029.1| GENE 40 52039 - 53628 1032 529 aa, chain + ## HITS:1 COG:YPO1907 KEGG:ns NR:ns ## COG: YPO1907 COG1021 # Protein_GI_number: 16122155 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Peptide arylation enzymes # Organism: Yersinia pestis # 22 529 17 519 525 477 47.0 1e-134 MSYKKDMSKYEHLEGWERCGFGEQLDKWAEKYGERTAVTDSEDEISYMELKQKADCLAAA FLRKGILKGDKVLVQLPNRISFVIVFFALSKIGAVPIMMLPAHREAELEGIIELAKPAAY IVVEKYLGFSYVPMANAMKEKYSCIRHIFVDSESGDISGMIAETCGENGAFPAVDGYETA VLLLSGGTTGVPKLIPRTHTDYMYNARMSAKRCRLDSSDVYLASLPVAHNFPLCCPGLLG TLDVGGKVVLASATSPDDILDAITEEGVTITALVPAMVTVCMEMLEWDEDYDISSLRILQ VGGAMLEDSLADKIIEEWPCKLMQVFGTAEGLLSFTSPEDEGSLIARCQGTPVSPADEVK IVDEEDKAVPEGVFGELLSRGPYTIDGYYMAEEANKKSFTPDGFYRTGDKAMWTKDRRLR LGGRIKEQINRAGEKIMPSEIEAYLCRHSKIKEAAVVGVPDETLGNRICAFLVTDDEAGI DLQEIHRFLREIGVAAYKMPDQIERVETWPLTSVGKIDKKALERMAQEK >gi|222441913|gb|ACEP01000029.1| GENE 41 53643 - 58184 2857 1513 aa, chain + ## HITS:1 COG:YPO1911_2 KEGG:ns NR:ns ## COG: YPO1911_2 COG1020 # Protein_GI_number: 16122159 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and 
related proteins # Organism: Yersinia pestis # 95 1484 4 1354 1381 734 34.0 0 MEYIEIKEQIKQKLPVARDLGDSENLLELGLSSLTIMRLVNQWRKQGVKVSFGSLMENPT LEGWWALIQRSMKKKAGKKRAQKSKITPEKDMKQPFPLTDVQYAYWVGRDEEQALGGIDC HAYLEFDGGNIDPERLEKAWNVLQYHHPMLRACFLEDGTQKILDKPYCEKIKVHDFSQMS SEEAEAMAVSVRERLSHRKLKIEEGEVAGIELTLFPENRARMHVDMALLVADVQSLQILL RDLAAAYRGESLPAESKGWNFASYLERQKEEDKEERNNAEGYWKKRLEHLPKGPGLPLAK RPQEVTKTVFNRRIVRIGKEEWAHLQVRAKEYQTTPAMVLLTAYATVLERWSRNHRFLIN IPFFNRKTEQQGLEDVIADFTTLLLLEIDCEGNPTFAELLDRIQKQLHEDMKYTAYSGVK VQRDLAQMYGDASAVAPVVFACNLGTPLVNDTFRKELGQFSYMISQTPQVWNDFQSYEDE NGVQLTWDSVDKLFPENMIPDMLECFENLLHELGKKDWNQRFDVLPEKRKREIEDTARTR VPERVECLHWAVMNHAEVCPEDIALIDAGSGRSVSYGELKARATAVAAGISEKDIKGVPV ALTLPRGIEQIETALGILLSGNSYLPVSLSQPKDRRALIHEKTGVRYVVTNRELSEKLDW PEGTEILVMEGMEEGQKNVRLPEVSPKDSAYIIMTSGSTGVPKGVEIAHESAWNTVQDIN EKYHVTSADRALAVSAMDFDLSVYDVFGILGAGGTLVLLPEQERRNADYWLEQVLKYQIT VWNSVPVLLDMLLIRAESMKQKLPIRAVMLSGDWIGMDLPQRVAAWTEDCQFVAMGGATE ASIWSNYQNVTLPMPKNWKSIPYGKPLRYQAYRVVDEYGRDCPYWAEGELWIGGFGVAKG YRGDSALTGQKFITDQYGRWYRTGDLGRIWDDETIEFLGRKDHQVKIRGHRIELGEIEHA IQEFPGVAHAVVDTVSDGHGNKTLAAYIGAPLQEDSKVTTYLYGTDIFGGGWKELKDDVS NWQMQQERKTAYKNFLAYADQRCVQLMLETLIELGVFVSEKEVLSQKEIFEKGSITETQK NTVARWLEILKKEGILREEDGRLSRTGKEVAVPEKAGDAETYFKKLKPYLKHMVTGNEVP LDVFYQKDSALAPNMLLRRIPGYEETVERLVQELKLLIEGRRKEPLQIIEIGTRDAAITR QILNVLEDVSVAYTYVDSSKYFLQEAEKELAGYERVEFEMLNLEESMDKQQMALHSYDVV ISVNALHRNIDAVDAVKKVAELLKPNGILLMTDLVVRTYLQELTATFLENGFADIRDKRK EAGMVTPDCLLWRECLSEAGLGQDIVVTEEYGRCICCSRQQASVLSYYDGALREYLSEKL PEYMVPQNYHFMAQLPTLSNGKINRKKLREDFKEETSVIRFSKAVTETEEKLLDIWKRMF GYENLGTEDNYFVLGGDSLIATKLISDIQKEFGCKIMISDIFENVTIKLLAQKIDLVRQN VIMEDIEMECGEI >gi|222441913|gb|ACEP01000029.1| GENE 42 58309 - 60519 889 736 aa, chain + ## HITS:1 COG:YPO0774 KEGG:ns NR:ns ## COG: YPO0774 COG4693 # Protein_GI_number: 16121087 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Oxidoreductase (NAD-binding), involved in siderophore biosynthesis # Organism: Yersinia pestis # 5 335 6 347 375 154 27.0 8e-37 
MRKLKSLVCGSKFGQFYIDAISKMKECVEFIGIFSNGSERSRLCAEKYNVNLYTSIDKIP EDIDIVFIAVRTGVMGGDGSELALKFLSKGIHVFIEQPIHIDEMKKCVKCSIEKGVFFHV ANFYKYMDTVHKYIECSKGLLRKNEILSIECACANQVIYSLLDNILRIFREKARFSIENF WIQNQDFFIASGSINDIPLSIKVYNSVNTVDSDSYCYYLQQINIYTSEGMLFMVDVNGPI IWKPQFRAAHDLFIKKGDELDQRIFDSKMYWYESESELSYKDIFCKKWIDAIKKEIKEFI DNIYFEKMTEYRRKIQSEIVIPSIWGLITQKIGYPRLYQNANHAVDLNNYVLDFFEEKNT FLNDSLEIKKTANNIDFYLEPDITYGYVEEAKHIMNIACLESISYKLYEATGEKENISIT EIVSCLKVADSNMHILQRWLNALSENGYIECNAGKIKWKKKNTSICEKDNWKIVSDKWVG KLSGEHVINYYLKHIEKMEALLSGEMNAALLLFPEGTIELANCFYRENLMEKYLAYCIEK ILSFAIEHTKKDKIRILELGAGTGATTDRILKVLENSTKKIAYIYSDITKYFLNFAAQKY QGKEELEYKVINIDSNLSDQGIAEESVDIVIAAGMMNNAVNTDYSMRQIICSLVHGGIAL IAEPVGENYELMLSQSFMMSRPTDAREVCNETFLKMEEWKEVFWGCGISERELVVVPSDT SPLAPLNQKLFIIRKE >gi|222441913|gb|ACEP01000029.1| GENE 43 60526 - 64797 1444 1423 aa, chain + ## HITS:1 COG:PA4225_1 KEGG:ns NR:ns ## COG: PA4225_1 COG1020 # Protein_GI_number: 15599421 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Pseudomonas aeruginosa # 11 968 5 965 1575 594 35.0 1e-169 MDANNLKKAFELFVKLQDKKVKLSNVDGKIKYRAPSGIMTNEVIEEIRKCKNELLTIINC TEHPEKLWKGKQGDSFPLTDVQSAYLIGRKNIFQYGNAACHIYMEVEYSELDSQRVEEVW NILINRHPMLRVIFSDTGIQKILRDTPWYTLYTNRNITYTQETLQTELGHKIYDASKWPM FDIGCVSGHNKAILGISIDFMIADWASIWILIDEFEKLYQNPTASLPRIEYNFSDYIYCE KVKKYTPKYFSDREYWIERLAEFPEKPDLPIINAVENIDKNVIFKRYKFVVEKNKWKTIQ LISSKMGLTSSAVILSAYAMVISKWSTNKKFVMNLTTLNRNSYESKVNNIVGDFTSINLL AVDFSIQKSFREYVKQIQQQLLTDMEHKFFSGVEVLRELSQGKENIIMPYVFTSGIGVIR NTKKEGYHGKYYQNGISQTPQVFIDCQVFDQDDELLINWDVREGIFLPEVIDEMFLMFQD LICNFLIDAENWNVEKVTAILNKNIKISKKKYDLPSGLLYSSILEMAELCPEKEAVQDEN EIWSFSKLVRKSWFVQNKLQQEGCIPGDLVGIFMPKSNYQIASVLGVLMNGCTYVPIDIQ QPKKRINEIIEQAGIKYVLISSCSKIELMFLDVKYIYADKQNNEIITKQIPQVPNSSIAY VIFTSGSTGKPKGVAISHLSALNTIYYVNKMLNISTNDIVFGVSRLSFDLSVYDIFGILS AGGKLFLPDENRIADPSYLFETIVKQKITVWNSVPAIMQMILEYRENEKRTEQFNIRCSL LSGDWIPLTLPERLISACMEKTRVISLGGATEAAIWSIYYEYEYLFDEWKSIPYGKALPN 
QEVYVLDSKKEICPLGAVGDIFISGIGLAKGYLNEPLLTEKAFIMVNGERMYDTGDKGKY LEDGNIEFLGRRDTQVKLNGFRVELGEIEANIEKIKSVKYAAVLCREKGREKRVIAYVEP QPSEFTEDFSLYVLNELKRMLPAYMIPYNIRVMKLPLTSNGKIDRKYLKNIEIVEDKKEA DTKNTINSDSSTYFPIKEIWCKLLQRSDIRPDSDLRSFGADSLIMAQAVGKIRNFLKESG SQCDVPFDILLKQTINSPTLSEIVSYIDQIILKKEVNSRYEDAEVNEEELGKLKIHKKGK EDKLVVIFHAGLGSIDEFLPFIQGLCNEEKGTIVSIALNNVKQYLDFYAGHLYESVADKY TKIILDLGYDKIQLIGHCVGGMIAFEVAKNMMDKGVDVIDVSLIDSIPGKYSIEDEVVME LMFLPMVNITYPVLAEYQLNERELYEYMDEKFGKYTGIEQKSKNKVEEYINGLREIEKEE RIRLYINSVSKNMPIQMFVYLFDIFKQSTNGAIYKPIPYLGDIRLFIAEDTEKIVFHRSE EVINFWRENCIGDVTVIMIPGNHLTCFTDVNNIQKLVNMITEK >gi|222441913|gb|ACEP01000029.1| GENE 44 65325 - 66296 488 323 aa, chain + ## HITS:1 COG:RSc1813 KEGG:ns NR:ns ## COG: RSc1813 COG2207 # Protein_GI_number: 17546532 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Ralstonia solanacearum # 186 320 166 301 303 82 32.0 8e-16 MSEQQFTDFFDPEIQKIVSEDGCSIYRMKNVTGEGVITRYEILPGIELFYNDFHMSDGQN QNKLPHSDVLEINHCREGRFECEFANGDYQYVGAGDLAINRLTNETTSTYFPLSHYHGIS ITIDLPTADQTMKRIESVIGGLNIDVFSIADKFCKNDTCIVLRTQSEIEHIFSELYRVEP RMIAYYLKVKVLELLMFLNQVSLQDYQEERRYFARNQVQTIKKMQEYMTADLRNHYTLQE LSEKFEIPLTSMKVCFKGVYGCSIYAYMKSYRMQAATILLRDTSDSITEIAAKMGYDNPS KFSEVFKKEFGELPSEFRKKLSK >gi|222441913|gb|ACEP01000029.1| GENE 45 66534 - 67385 378 283 aa, chain + ## HITS:1 COG:MA4673 KEGG:ns NR:ns ## COG: MA4673 COG0601 # Protein_GI_number: 20092271 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 3 272 29 298 309 147 33.0 2e-35 MRLSSVDPATAYAKRMIGNPTAEQIEKIRIQLGFDKPLLVQYGRWVWDLLHFDLGVSLAN GHDVWTDIATAFPKTLGIVALASIFQVVFIVIVSCIAFLLPWKLPKKAVRLLCILGVSIP SFYLATVYLDYFAVQKSLISVAGNTTLLSYISPAICIGVFGASFYTPLLMDALEHESNED YAFYARCRGLSEKRLLLCHFFPRAAMGLVPNFLQSIGLALANATVIESIFSIPGFGYLIV NHVLDRDTPMIHAEVFFLALAIALCNIAADLVQWAIGRKKGVA >gi|222441913|gb|ACEP01000029.1| 
GENE 46 67386 - 68186 595 266 aa, chain + ## HITS:1 COG:MA0880 KEGG:ns NR:ns ## COG: MA0880 COG1173 # Protein_GI_number: 20089764 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 8 260 10 264 278 125 31.0 9e-29 MKTKRFRKWCAIFIPTLVIIFFFAGFLFAPNDPMKSNLLERYADPSAEYPFGTDALGHCI LSRILYGGWTTLGIVLGGSLIVFLLGTVIGMVTSRAIMKENALIDGLINAVTAIPPIAYL IVFIGAWGSGAKTTLVALTVSYILRYIKLVRTRTDMEMGKAYVMCAIASGASKLRIMLIH IFPNLIAEMIRFLCLSCADMILAITGFSFIGLGLGDNVVDWGSMILDARGALVLHPTMIL YPIGAVILSTLCFNVIGRQLTTKEAE >gi|222441913|gb|ACEP01000029.1| GENE 47 68183 - 68977 285 264 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 250 7 261 563 114 27 4e-53 MMLKIEQLDVKERSGRYLLKDISMEIPAGHVIGLTGKSGSGKTTLLRSILGMLHTSCHID AGKILLDDIDLSHLSRKSHRELCGKKLGFIPQNPMTAFDSRLKIGYQMRETFVNRLHLNT SEATDLAKEKLVSVNLKDTDRILGAYPSELSGGMLQRVAAAILLGMSPDYVLADEPTAAL DEENRDLLLLIMQEQMKDKGILFVSHDVAALKNLCQNVYVLGAGKIIEHGTMDKLLSSPQ TDWMKQFSALSHRESRGEWKWEKL >gi|222441913|gb|ACEP01000029.1| GENE 48 68962 - 69693 304 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 19 228 33 242 329 121 31 1e-26 MGEIVIQNVNQVYHNVNQGDFNALSDINLHLCKGESLAIEGESGSGKSTLARLLIGIEKP TSGKILLDGEDITCWNYRTWKQHRKKIQAVFQDSSGTLNPARSAYANVEEALVNLTDKKR AERKKHILDLMDAVHMDYGLLETPVRQLSGGEQRRLSLLRAIAVEPDYLIMDEVTSGLDL ISADAVLTLVENFVKLSGSSCIFITHSRKDAMRIANRIIIMREGRIAEQGYLIDQLSNQK GQG >gi|222441913|gb|ACEP01000029.1| GENE 49 69695 - 71308 1529 537 aa, chain + ## HITS:1 COG:BH0567 KEGG:ns NR:ns ## COG: BH0567 COG0747 # Protein_GI_number: 15613130 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Bacillus halodurans # 21 514 21 517 539 257 32.0 4e-68 
MKQIYKKLLAAVLGVSMLLSMVGCGQNDTSDQTQTGGENEAKVLTIVTAKELDSLTTLTM NKENNIACGLVYETLVAYEDGEIVPELAEEWGWDETNTVLTFKLRQDVAFTDGAAFNAES VKEILDFDRSNPNFSGIKGIYNIESVEVVDEYTVAVHYAAPCFSYINDFCFQNVAGMMSP NVFEAENFQTFTDVVGTGPYIREEMISGDCTRFVRNENYWGEAPYYDEVVIKYIPEASSR LQALQTGEVDLIYGADLLSYDDYNQALSLDGIEGAINDGNTLTRNLVLNASSEILSDLKV RQAIAYAVNKEEITKGLTYGYETPATSLFAPGAPYTDITYNSTWSYDLDKANALLDEAGW VMNESTGIREKDGQQLSLNYTYWTDLSLAQDMALAIKTQLAEVGIDVTTTGQDQMTWWTE GVAGNYDITTWNTEGSYTEPHKFLQESLGADPHAISLQALEDFQSYSDAVNAFSTSADPE VVQSAIATALNVSNDNVIDLPISYSKDLVVYNSSKIAGYTFSSVPQFFEIDNVQPAE >gi|222441913|gb|ACEP01000029.1| GENE 50 71440 - 73179 229 579 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 342 561 135 354 398 92 30 7e-18 MDKKDNPMSRLMELAAPYKGEYVLSVILAILGVAAGLLPFFAVSKIVILLMDGETSVSVY MEWCLIAGIGFLGKVCFANFSTLVSHTATFATLAEIRLKLADKLTRVPMGYMINTPSGEL KNVLVDRVEGMETTLAHLVPELTANLCIPICILIYLLFLDWRMALAALITLPIGMLCYKG MANGYEEKFQGLIMRMRKMTSTVVEYIGGIEVIKAFNQSANSYQKYSDAVEDNAAYAVNW MKSVQLYKSMLVTIWPSVLVCVLPIGCVLYRNGSLSVPAFVTCIVLSLGIITPILNAMNF TDSISQMKSVVGELCAVLDEKELDRPETPVSIHSSNIVLENVSFSYESEKPLLKNINLSI PENSITAFVGPSGGGKSTITKLIAGFWDVTNGTVKLDGTDIRKIPLSQLMDNIAYISQDN YLFDETVLENIRMGKPSATDEEVYQAARDCGCYDFILNLENGFHTVVGGAGGHLSGGERQ RIAIARAVLKNAPIVILDEATAYIDAGNEALIQEAMAKVIAGKTVLMIAHRLSTITDADK IVVIKDGQIIDEGTHENLLTSCTLYQEMWKAHMDTKDVA >gi|222441913|gb|ACEP01000029.1| GENE 51 73191 - 74921 228 576 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 333 557 2 230 245 92 28 9e-18 MIKTLSKIYQFSGKMQGTMKKAILFSVLHSLFDMMSFGALAMVFSGLTDGFTTSMIWMIF GITLASMLLKIYCSYISDFGKVQIGYFMCAEKRIHIGDRMKYMPMGYFNDHNLGNLTSVV TTTMGDIENNASMVLTNILGGYIHAAIITIVMLCIDWRIGLTILCGILLFTWCIGRLQKK SETVSPQRQQAQETLVSNVLEYVQGMLIVKSFNLGQNSNSKMRQAILDSKDKNLKLERTF VPYNMLQQIILYGTSILVIVEGLYFYLNGTMALSICLLMTVASFMLFSQLQSAGNTSSLL RLLDVSIDKVNEIDNTPVMDEHGKPINPPNYNIVFDDVSFSYGEHKILDHVSLSIPEKTV 
TAIVGPSGAGKSTLCNLIARFWDVDDGKITIGGIDVRDYTLDSLLTNISEVFQKVYLFAD TIENNIKFGNPAASHNEVVKAAQKACCHDFIMSLPDGYDTVIGEGGATLSGGEKQRISIA RAILKDAPIIILDEATSSVDPENENLLMGAISELTKNKTVIMIAHRLKTVRNADQIFVLS GGHIVQTGKHEDLIRQPGIYADFIGIRKKAIGWKLR >gi|222441913|gb|ACEP01000029.1| GENE 52 75097 - 76476 657 459 aa, chain - ## HITS:1 COG:SP1056_1 KEGG:ns NR:ns ## COG: SP1056_1 COG3843 # Protein_GI_number: 15900926 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD2 components (relaxase) # Organism: Streptococcus pneumoniae TIGR4 # 14 216 16 230 402 62 27.0 2e-09 MAYLKIFPIKVTDKKALDYITNPDKTEEKLLVSSFGCSPETADLEFSMTREMAKKNGMDK GDNLAFHLIQSFKPGEVDAETAHRLGQQFADEVLKGKYEYVISTHVDKNHIHNHIIFNAA SFVDHHKYVSNKRSYHKLCRISNRICHENGLTTSMPIGEKGKSYKENMEYHRGTSWKAKL RVAVDKAIWASINYEEFLQKMQLAGYEIRQGKHLSFRAPEQKNFTYMKSLGSYYSEENVR IRLAKNRSKVKAPKHLSREARLYINISTYVTTGNREGFERWAKLNNLKEAARTFNYLSEN NLLNYEDFQQHLSDVNASVKAAEQRITQINSELSTQKIIRKHCDSYRLCRKVIEDCRSAK NPKAYRTKHQAEYQLHDSLKKELQDLGVTKIPSSNKIQKRIENLESEQAATVREKQELQK KQKTLDIIQQNFIALMYTPNVKSDSLQTKRESVPISRDE >gi|222441913|gb|ACEP01000029.1| GENE 53 76443 - 76766 263 107 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3546 NR:ns ## KEGG: EUBREC_3546 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 107 1 107 107 167 97.0 1e-40 MAERKRSKEIHFYVTEEERKLIRRKMIESKTKNMGAYLRKMAIDGYIVNTDTTPLKKQYE EMHKIGVNINQIAKKVNTTGDLYPEEMQELKEMVKELWRILRSSPLK >gi|222441913|gb|ACEP01000029.1| GENE 54 77016 - 77390 250 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165407|ref|YP_001467497.1| 30S ribosomal protein S8 [Campylobacter concisus 13826] # 1 114 1 115 117 100 34 3e-20 MKKEEIFEYVQKQYGTIPEYLWSKLPDSAVLRHKNGKWYAVIMTVEKSKLGLEGRELLDI IDVKCDPDMTNMIIQTYGFLPGYHMNKQHWITILLDGSVSEAKVLDFLDMSYDLIDGAGR KENK >gi|222441913|gb|ACEP01000029.1| GENE 55 77493 - 77693 215 66 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3544 NR:ns ## KEGG: EUBREC_3544 # Name: not_defined # Def: hypothetical protein # Organism: 
E.rectale # Pathway: not_defined # 1 66 1 66 66 117 100.0 1e-25 MAINNTLKDIREERNLIQADLAEAIGSCSQTIGRIERGERNPSLEIAIRLAHYLKVPVED IFQVED >gi|222441913|gb|ACEP01000029.1| GENE 56 77750 - 78481 613 243 aa, chain - ## HITS:1 COG:SPy2125_2 KEGG:ns NR:ns ## COG: SPy2125_2 COG2932 # Protein_GI_number: 15675873 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 150 232 15 97 103 72 44.0 8e-13 MRDFGEILAENRKKKGYSQSDLVDLLSQEGIQVTTKAISKWETNAREPALHVFLTLCQLL DIEDIYESFFGENPYNIMSGLNEEGRNKLIEFADILKASKKFSPLSAKIIPFHHPVEITW EPVSAGTGNYLEDSVKETYDVGHLAPEQTDFGVRISGDSMEPLYHTGDVAWIQKIDSLAN GEIGIFYLNGNTYIKELHDEPDGVYLISLNQKYRPIQVLESDSFKIFGKVIGKCKGAEIP GFH >gi|222441913|gb|ACEP01000029.1| GENE 57 78676 - 79923 835 415 aa, chain + ## HITS:1 COG:BS_yqjW KEGG:ns NR:ns ## COG: BS_yqjW COG0389 # Protein_GI_number: 16079428 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Bacillus subtilis # 1 390 2 392 412 256 38.0 6e-68 MSERVILHSDMNCFYASVEMLHHPEFAGMPLAVGGDPEARHGIVLTANYIAKKQGVKTGM ALWQAKQVCPEIIFVPPRMDLYLRFSQMAREIYSEYTDKIEPYGIDEAWLDVSDSGNLKG SGMTIAREISHRIKYELGVTVSIGISWNKIYAKLGSDYKKPDAITEFNRENYKDRIWQLP ASDLLYVGRQTNKKLQKLGIRTIGQLAESDEKLLESHFGKIGNVLWAFANGWDEDPVCKE GYEAPVKSIGNSTTTPRDLENDLDVWIIQIALAESVAARLRKHGFKCKTVEITVRDNGLY SFSRQIHLRQPTNITNEIVTAAFQLFKDNYKWEYPIRSLGIRAADLVLDDIPVQLDLFGN QEKKEKLEKLDRTVDEIRRRFGYFSIQRAAMYQDKVLSHLDAGTHTIHPHSYFHG >gi|222441913|gb|ACEP01000029.1| GENE 58 79938 - 80150 175 70 aa, chain + ## HITS:1 COG:SA1174 KEGG:ns NR:ns ## COG: SA1174 COG1974 # Protein_GI_number: 15926920 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Staphylococcus aureus N315 # 3 70 2 69 207 66 45.0 9e-12 MKKALTRKQKESYQCILDYTKEHGYPPTVREFGKLIGVKSTSSAFSRIKQLELNGYIRRI PASPRAIEIL >gi|222441913|gb|ACEP01000029.1| GENE 59 80163 - 80402 176 79 aa, 
chain + ## HITS:1 COG:no KEGG:EUBREC_3540 NR:ns ## KEGG: EUBREC_3540 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 79 1 79 79 159 100.0 4e-38 MNKVYVDVVAEFRKDGQLVPIFFTWEDGRKYSIDRILKIERCASRKAGGVGMMYTCMIQG QESHLFYEVDKWFMERKTA >gi|222441913|gb|ACEP01000029.1| GENE 60 80653 - 81225 447 190 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3538 NR:ns ## KEGG: EUBREC_3538 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 190 1 190 190 347 97.0 2e-94 MTDRDYAIKSMKEITFQMASHAQDYLEVTIERHYTDIKELMTSYQKLILENQIVLEELDM ECQEKINEDMAYVLSYLSIYNNQLNVPKMHREMNNLMIIYGLSDMIYRGMTLVKFYAPNG VMLSEILHSCFCSHYNKTDVEVQQELGIGRTSFYKMKKQALGYLGFYFYEIVVPQAKDKR FKPSLGVEEE >gi|222441913|gb|ACEP01000029.1| GENE 61 81235 - 81606 237 123 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3537 NR:ns ## KEGG: EUBREC_3537 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 119 1 119 123 230 98.0 2e-59 MRYEDSMKGVAAQIIKEQQRVSKIKNNPEEFHWHDEYATLNFFRFICKNIGNLGNGEIEK MVTRLKSIDQKAVENHVNGAVWAFDYDKMFCVLMENEICRKVCEKNKYNSWMKLIGNYCF QVV >gi|222441913|gb|ACEP01000029.1| GENE 62 81710 - 82549 767 279 aa, chain + ## HITS:1 COG:no KEGG:Selsp_1548 NR:ns ## KEGG: Selsp_1548 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 12 269 4 278 286 181 41.0 4e-44 MDVAKEEIVKTEELETRYERYKGILKDLTIMSDVFMRNVFKKRECTEYVLQVIMNKKDLK VIDQVLQKDYKNLQGRSAILDCVARDSEGKQMDVEIQQDNEGASPKRARYHSGLMDMNTL NPGQDFDDLPESYVIFITRDDALGYGLPIYHIDRKIEEVSENFKDEAHIIYVNSKKQEDT ELGRLMHDLHCKNAEDMHSKILADRVYELKETQKGVEFMCREMEQIYSEGIESGELKKAK ASALSMAADGMKVDKIAHYLNVSVQMVQKWIDESMSVAH >gi|222441913|gb|ACEP01000029.1| GENE 63 82739 - 83053 399 104 aa, chain + ## HITS:1 COG:CAC1578 KEGG:ns NR:ns ## COG: CAC1578 COG1396 # Protein_GI_number: 15894856 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 104 3 106 110 61 32.0 3e-10 
MDLKAVGQRIKAAREAKNLTQEELAALVNLSTTHVSVIERGLKVTKLDTFVAIANALDVS ADALLIDVVTHSVTGVTNELSDMIEKLPKDEQKRILNAVRALVD >gi|222441913|gb|ACEP01000029.1| GENE 64 83209 - 83430 228 73 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3534 NR:ns ## KEGG: EUBREC_3534 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 68 1 68 70 96 89.0 3e-19 MREKNIYEKIAEKYNTTPEEVRREMQIAIDTGFDNPDPAVQEEWKKMTLKGERPTPEEVI NYAVKKTKRKIKI >gi|222441913|gb|ACEP01000029.1| GENE 65 83427 - 83858 328 143 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3533 NR:ns ## KEGG: EUBREC_3533 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 143 1 143 143 245 90.0 4e-64 MKKKIIAVISGAVILIIAAGSIYGKSESGHKEGEPDVVGTFSVNRDENITVVANRGHIGD KEAFARELLQMYKDDSFYSTKFSTDRGYATSLDMNIYLWKEDIEDGESVMTAEYRPVEYG KDYDVVNHPDKFQLYIDGKEVEE >gi|222441913|gb|ACEP01000029.1| GENE 66 84185 - 84658 457 157 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3532 NR:ns ## KEGG: EUBREC_3532 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 157 35 191 191 284 98.0 9e-76 MYIEHPYFYEGKYYANIDGEMIEITKEVAYAMNNFYRSSKAKKVEIKNELGEVVDKMLRE VPYSGQSIDGEGFMIEDFPDLNCDVEHCVLTKMEQQDIHTVINQLNSEERMIIYGIFFEN KTQTQMAEIMGISRQMLSYKLKSTLNKMREMYMNKFF >gi|222441913|gb|ACEP01000029.1| GENE 67 84903 - 85085 104 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026232|ref|ZP_03715424.1| ## NR: gi|225026232|ref|ZP_03715424.1| hypothetical protein EUBHAL_00473 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00473 [Eubacterium hallii DSM 3353] # 1 60 1 60 60 118 100.0 1e-25 MICMRCDTLADPEKSVTGKPMNCWSLETRVSGLWSGVCIRAGYYSLPAYAGVTIRGIEDK >gi|222441913|gb|ACEP01000029.1| GENE 68 85236 - 85616 339 126 aa, chain + ## HITS:1 COG:CAC0494 KEGG:ns NR:ns ## COG: CAC0494 COG2337 # Protein_GI_number: 15893785 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Clostridium acetobutylicum # 1 117 3 116 122 79 38.0 2e-15 
MTIRRGDILWADLGMFPTTSVQGGVRPVIVVSNNKANTYSSVITVVPLTSRIYKKRYLPT HVFISKYDMTGIRKGSLALAEQVMSISTKCIIEKCGRVNKWSLDRVLKAVRIQMGMEGEG HDGRGI >gi|222441913|gb|ACEP01000029.1| GENE 69 85597 - 85845 223 82 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3530 NR:ns ## KEGG: EUBREC_3530 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 82 1 82 82 99 63.0 6e-20 MTAEEYRNYLETDFDDVDLKELKDIRKIRIDRNQTKEKRIAQYLRQVGNPYLVCIGNVKV KIRFANNGVSFEDAFEELLLSV >gi|222441913|gb|ACEP01000029.1| GENE 70 85985 - 87682 998 565 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 14 315 10 300 301 175 35.0 2e-43 MNQINNSKKIYHAAIYVRLSKEDGDVATAGKRESNSISNQKDLIKNFLKDKKDIVVVSER VDDGYSGSNFERPAFQMMLEDIKKGIVDCVVVKDLSRFGREYIDAGKYIERLFPALGVRF IAVNDHYDSLEGKSQADEIVIPFKNLINDAYCRDISIKIRSHLEVKRKNGEYIGAFTPYG YQKDSDNKNKLVIDAYAAGIVKEIYRMKLSGMSQTAIANALNKQGVLSPMEYKHSLGIRI QDNFKTHEQAEWSAMSVKRILENEVYTGTLVQGKRTTPNHKVKKLMKKPETDWVRIEKNH EAIVSEREFALVQRLLGIDTRTSPNEEKVYPLSGLVVCGDCGAMMIKRDVPAGGKVYAYY ICSRHAATKACAAHRIPMGKLEETVLELVKIHIENILDMKKIMDFIHEVPFQELDIKELE IRKEAKEKEAVRCRELRDYLYEDFREGIISKEDYKELHDGYTEKRKKAEEAVRSIDQQIS EVLESKSDKYHWLDYFAEHQNIQELTRTVAVELIDQILVYDKKHIEVRFNFDDCYQSLLR QIQSVGCDVSTGMDGRIEIQKREVV >gi|222441913|gb|ACEP01000029.1| GENE 71 87683 - 89383 1281 566 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 34 323 10 299 301 163 34.0 8e-40 MARKSRKVDYVNVGNKENLVTEAENQCEKVSHAALYARLSYESEKNRERNTIETQMVLLH NFVKEQKDIVVAKEYYDISKTGTNFERDGFNEMMQDIKEGNIDCVIVKDLSRLGRNYVEA GSYIERVFPFFNVRFISVNDHYDSFRDDISLLISMSNVYNEFYSRDLAKKIRSSYRTSWA NGEFPSGQMAYGYEKDKDNPHQLIPDPVAAPVVKKIFQYFIDGMTYAEIARKLNADGYLC 
PKAYKLDKAGKANEKSATWTWSGGTVHKILENQYYAGDSVHNQFTNDSWAAQRQKMNQKE EWIIIKNTHEALVSRKDFDEVQEKIGHIVKRVNEARKSNGNNVRDFNFFKQKIVCADCGK TMYLYGKTKGNHRRFYCGNNKLHGKCTPHSITDLEVNDYVLRVIRAHINVYVENVDLIRR LNQRQESIKKYDVFNREIKKCRKELEKVAVHRERLFEDYVCRIIDAEQYETFSKQDAETE KEIQNNMEILLKHQVGYEKNFHTEEEWETLINKYRNTRTLTKEMVNAFVEKIEIHESGSI TVRLVYDDMLEELRKYAKEREAELCQ >gi|222441913|gb|ACEP01000029.1| GENE 72 89374 - 90948 1077 524 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 8 309 10 299 301 175 33.0 2e-43 MPVTECIAVYIRLSSEDNDVDGSIKAESNSVSAQRMLINEFIKKQDEFTNCPVVEYVDDG YSGTNFNRPGFQRMVEDAKAGRIKSIVIKDFSRFGRDYLEVGNYLEKILPVLGIRIISIN DGFDSINSSGFTGGMSVALKNMLNAMYSRDLSRKVRSAMKTHAKNGEYMPAFPKYGYIKD PEDKHHLVIDPEAAEIVKLIFTMAADGKTKGQIAKHLNETHVLTCREYMCRKGIKMHREN EKEKKLWSVTTISDMLKNEVYLGKIIWNKKRVARTGSNKLVSNDKEEWIVVENAHEPIIS DELFQKANEKAFTNQKRVLTKRGVACPIFFCPTCGRRLGFTSRETGYRCMQAHISGLSGC AESKMDRKEAEETVLDAARNMAQLISENLEKKKSEWHKTILKEENIATLESEKKRLSSRK MKLYSDYRSEVLDKEGYMEELEKTTSRISEITLQIAELENEIAVAKKKCDEATEKEMEVN EIAALQDFDKIQLSKIIEKVFIYEPGRMEISWKMDDIFYKEEKA >gi|222441913|gb|ACEP01000029.1| GENE 73 91138 - 91524 278 128 aa, chain + ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 1 121 234 368 368 67 34.0 7e-12 MITSPKTEKSNRTIELPEFLCEEIQDYIESLYKVNENSRLFEVTKSYLHHEMDRGSKAAG IKRIRIHDLRHSSCALLINLGYSPIQIAERLGHESITITERYAHLYPSVQKEMANKLDKV FKEEEKND >gi|222441913|gb|ACEP01000029.1| GENE 74 91517 - 91765 280 82 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3218 NR:ns ## KEGG: Rumal_3218 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 65 1 64 77 72 56.0 8e-12 MIEGRLSYNCNNDRYGLLVSDLWEYEGFHCGDPLEVKVDDKWIHTRIEMAWDEKGGHWYL VGTPYFDNLEYVCVRFKEDKDV 
>gi|222441913|gb|ACEP01000029.1| GENE 75 91758 - 92135 233 125 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026240|ref|ZP_03715432.1| ## NR: gi|225026240|ref|ZP_03715432.1| hypothetical protein EUBHAL_00481 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00481 [Eubacterium hallii DSM 3353] # 1 125 1 125 125 165 100.0 1e-39 MYEVDKIYRRPTQSKGKRKSRSKKNEKNRVRNVIMNFRVSPTEKELIETRIKLTGMLKSK FFIQSCLYQTILVKGNIKTFTEIKKQIEQISKQINKNPNLEELEPEQVESLRIILEILNT LFRKE >gi|222441913|gb|ACEP01000029.1| GENE 76 92457 - 92738 287 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026241|ref|ZP_03715433.1| ## NR: gi|225026241|ref|ZP_03715433.1| hypothetical protein EUBHAL_00482 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00482 [Eubacterium hallii DSM 3353] # 1 93 4 96 96 126 100.0 4e-28 MGKDKHYRMTEESLTFIEKRDKEKYPKENLFVNAAIQYFGRKIEQEKIVGKLEDIQDKLE RIERMLEEKAVRKNMVSDKVERKELFGKPDYLK >gi|222441913|gb|ACEP01000029.1| GENE 77 93422 - 96658 2122 1078 aa, chain + ## HITS:1 COG:PM0696 KEGG:ns NR:ns ## COG: PM0696 COG0553 # Protein_GI_number: 15602561 # Func_class: K Transcription; L Replication, recombination and repair # Function: Superfamily II DNA/RNA helicases, SNF2 family # Organism: Pasteurella multocida # 3 1078 1 1087 1089 1115 53.0 0 MEVKVFDNITEIARDDMASTIEKGSKVSIAAACFSMYAYKELKKQLESVDEFRFIFTSPT FVTEKAEKQKREFYIPRLNRENSLYGTEFELKLRNEMTQRAIAKECAEWIKRKARFKSNT TGENMGGFITVENKMNQLAYMPISGFTTVDIGCERGNNSYNMINRFDAPFSMSYIQLFEN LWKDKEKLQDVTDVVVENITTAYNENSPEFIYFMTLYHVFSEFLDDISEDELPNEATGFK QSKIWNMLYDFQRDAVLAIINKMERYNGCILADSVGLGKTFTALAVIKYYENRNRSVLVL CPKKLAENWNTYKDNYINNPIASDRLNYDVLFHTDLSRAHGFSNGLDLERLNWGNYDLVV IDESHNFRNGAGTHAKTQENRYVKLMDKVMRAGVKTKVLMLSATPVNNRFVDLKNQLAIA YEGDSENINKKLSTSKTIDEIFKQAQKAFNSWSKLDVAKRTTEALLQMLDFDFFELLDSV TIARSRKHIEKYYDTTEIGKFPERKKPISLRPKLTDLNAAINYNEIYEQLIHLSLCVYTP SNYIFPSKMQKYLDLTHNKGMNLTQTGREEGIRRLMSINLLKRLESSVNSFQLTLQRMKA LIEDTITAINRFEKYGNVNINMYEVSDEDFDIDDGNTEYFTVGKKVKIELADMDYKTWRS ELIEDAETLELLTLMVADITPEHDYKLQELLRLLDEKIEDPINPDNKKVLVFSAFSDTAE 
YLYKHVSKHIKNKYGLNTAVITGSIDGKTTIAGFKATLNNVLTCFSPISKGRNILMPGSN TEIDILIATDCISEGQNLQDCDYLVNYDIHWNPVRIIQRFGRIDRIGSKNEYIQLVNFWP DMDLDEYINLKGRVETRMKISIMTSTGDDDLINPEEKGDLEYRKQQLKRLQEEVVDIEDM TTGISIMDLGLNEFRLDLLDYVKIHPDLEQKPKGLHAVVPSREDLPEGVIFVLKNINNSV NIDNQNRIHPFYMVYIGMDGEVICDYLNPKKLLDDIRLLCRGKKQPIKKLYQKFNEETDD GKNMGEMSELLSEAINSIINCKEESDIDSLFAEGGTSALLSAVSGLDDFELICFLVIK >gi|222441913|gb|ACEP01000029.1| GENE 78 96676 - 97332 520 218 aa, chain + ## HITS:1 COG:no KEGG:DET1110 NR:ns ## KEGG: DET1110 # Name: not_defined # Def: hypothetical protein # Organism: D.ethenogenes # Pathway: not_defined # 1 204 1 205 222 256 64.0 3e-67 MIGLPKTTEFNKRIPKQKFYENITISPAIKRVFVEQVKTIYWKNKIAVSTTNLAEGNKVT ELEVFEIHLNNPELDEKLLYQIDKVVPYHILFLLEYQGEYQAWVGFKEKAASGNKVFKVN RYYHTEWMKEEELPLHLEGLSMDAAYENYVRQIAGDKLQIKTNGESLKESVTRDEQRQIL QKKIDILKAQIRKEKQLNKQMQINRELKILKKELEEYL >gi|222441913|gb|ACEP01000029.1| GENE 79 97335 - 98192 523 285 aa, chain + ## HITS:1 COG:no KEGG:Maqu_1263 NR:ns ## KEGG: Maqu_1263 # Name: not_defined # Def: TIR protein # Organism: M.aquaeolei # Pathway: not_defined # 112 281 90 253 255 140 43.0 8e-32 MGVEQYQRKVNSLDKEIADLEKKKAEEDRKAADNHKKASRVSVSKNASESMIKSKMRQIE NYEEKARKAEAASADYGKKISDKRTKRNDAYLKLQKEEQNERKKQEKSIENMKRVYEQRI SELESLRLSKVKNEFLETVSGDEPEYDVFVSHAWEDKADFVEAFVQALKDREIKVWYDKS KIKWGDSMRARIDDGLRKSKFGVAILSPNYIAEGKYWTKAELDGLFQMESINGKTLLPIW HNLTKKQVMDYSPILASKLAMTTATMTAEEIADELLDMLSSEEDK >gi|222441913|gb|ACEP01000029.1| GENE 80 98194 - 100056 1083 620 aa, chain + ## HITS:1 COG:PM0698 KEGG:ns NR:ns ## COG: PM0698 COG2189 # Protein_GI_number: 15602563 # Func_class: L Replication, recombination and repair # Function: Adenine specific DNA methylase Mod # Organism: Pasteurella multocida # 3 620 4 636 636 517 46.0 1e-146 MDKMRMESVDMTAQNIEKIGSLFPNCITETIDENGKGKKVINFELLKQMLSSDVIDGDEA YEFTWVGKKASIVEANKPIRKTLRPCKEESVDWDNTENLYIEGDNLEVLKLLQESYLNKV KMIYIDPPYNTGNDFIYNDDFKMTSEEYAEEISELDEDGNRMFKNTDTNGRFHSDWCSMI YSRLMLARNLLSDDGVIFISIDDNEQSNLKKCCEEIFGEKNFVAQLIWERAFAPKNDAKY 
ISNSHDYILMFAKNINNFVIGRLPRTSEANARYSNPDNDPRGVWQSDNLTVKTYSPSGDY SITTPSGRVVEPPAGRCWSLSKNKFLERLKDNRIWFGADGNGVPRIKRFLSELKFEGMAP TSLLFFKEVGHSQEGSQELIKIMDGGVFDGPKPTRLLKRLMILANLKENSIVLDFFSGSA TTAHALMEVNKEKNLKCKYIMVQLAENTNKKKDTGYKNICEIGKERIRRAGKKIKEEAPL VTQNLDIGFRVLKLDDSNMNEVYYSPDDYSQGFLNQLESNIKEDRTDLDLLFGCLLDWGL PLSLPYYSEQIEGCTVHNYNDGDLIACFDENIPNSVIKEIAKKEPLRAVFRDSSFASSSS KINVGEIFKLIAPDTTVKVI >gi|222441913|gb|ACEP01000029.1| GENE 81 100068 - 103139 2332 1023 aa, chain + ## HITS:1 COG:PM0699 KEGG:ns NR:ns ## COG: PM0699 COG3587 # Protein_GI_number: 15602564 # Func_class: V Defense mechanisms # Function: Restriction endonuclease # Organism: Pasteurella multocida # 1 1021 1 1041 1043 787 43.0 0 MKLRFKHQKFQADAAKAVVDVFAGQPYLTPSYMMDKGYQGKTMEGQKYSQMSTMDEEDFT GWSNHKIVPELNDRLILEHIQKIQRENQIEPSGKLEGRYNLTIEMETGVGKTYTYIKTMY ELNRAYGWSKFIVVVPSIAIREGVYKSFQVTQEHFAEEYGKKIRFFIYNSARLTEIDRFA SDNSINVMIINSQAFNARGKDARRIYMKLDEFRSRRPIDILAKTNPIVIIDEPQSVEGKK TKENLKQFNPLMTLRYSATHKRDSIYNMVYQLDAMEAYNKRLVKKIVVKGISESGTTATE SYVYLESINLSKAAPTATIQFDYKGANSIRKITKTVGEGCNLYDNSGNMEEYKEGFVISR IDGRDDSVEFINGIKIYAGDVIGKVSEDQLRRIQIRETILSHIQRERELFYKGIKVLSLF FIDEVAKYKQYDDAGQPINGIYADIFEEEYNDIINNLQIKIGEDDYLRYLSSIPVNKTHA GYFSIDKKGKMINSKVLRKKTTTDDVDAYDLIMKNKELLLDRSPEKSPVRFIFSHSALRE GWDNPNVFQICTLKQSNSEVRKRQEVGRGLRLCVNQDGERMDTNVLGNDVHSVNVLTVIA SESYDSFAKGLQSEIAEAVADRPHAVTAELFIGKVIEDEQGNEQVITSDMAHAIYYDLIM SRYVDRKGNLTDKYYEDKANGEIKVAEEVADSVNSVIQIIDSVYNSRSMCPENARSNNIE LHLDENKLAMPEFKALWSKINSKTAYVVDFDTEELICKAIDSLDRKLRVSKIYFKVETGV LNEIKSKEILADGSGFLKEKTENYGRTVSVSSRVKYDLIGKLVDETNLTRKAIVRILQGI QPSVFNQFKDNPEEFIIKAAALINDEKATAIIEHITYNVTDDRYGTEVFADPTIKGRLGV NAMKANKHLYDHIIYDSTNEQKFAEELDISQDVAVYVKLPDGFYISTPVGRYNPDWAIAF YEGTVKHIYFVAETKGSMDSMQLRLIEKSKIHCAREHFKAISNSEVVYDVVDSYKSLLDI VTR >gi|222441913|gb|ACEP01000029.1| GENE 82 103190 - 104713 595 507 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_4662 NR:ns ## KEGG: Dhaf_4662 # Name: not_defined # Def: peptidase C14 caspase catalytic subunit P20 # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 5 223 
3 216 322 79 31.0 5e-13 MGNVKATLVGVSNYDNPRTPDIDACRNDIVEVKAALIKGLGISSQDIRLLGSSGRVNIKE LETELQIASLVVEPEDTYIFYFSGHGNNGTLALSETVIGVQEVIELVDNIEARNKIIILD SCRSGDFLIPQIKTPDIDRMVEEFAGAGYAVMASCGANENSTFYPGNKISLYTHFLCDAL TMDCIIKKGKKSLEEINELIFRMISVWNRKGVRIQHPIFRANITGTIYFPVQKYTSYQKQ NIFKETEEYIIYEVLPFHHSRQKRYTAKVILKYYFDEIDIVKITKEIIDEIKYSNVYKNE IEERRFRGLPANMIRCHMGNDEQDMTDCNFEWITTWVDDTMDKNNWYRESNGSKILDGIL VDKQKSYNALRILNQQTVTTEEFIKEARKYLNEMVAYADQFISKYREYSNGNISESELVN NVKDINVQIASIYFKISDLPKVPLECNSWAESIQQIAATIHDFTLFYSNKTMNTWSQENR KDLMKIKIKHYQQEIELIKQEEKKLKE >gi|222441913|gb|ACEP01000029.1| GENE 83 104731 - 106413 828 560 aa, chain + ## HITS:1 COG:DR2199 KEGG:ns NR:ns ## COG: DR2199 COG2865 # Protein_GI_number: 15807192 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Deinococcus radiodurans # 47 552 39 583 596 233 31.0 7e-61 MKRGVFLNEKEFSIILSSGESLHTEFKSWKKAGNMKERIKLAVDELVAFANAKGGTVYFG VEDDGTVTGCEKYDTQSIIEAIYDKTRPSLFVDIEKIPYQGMMVLALKVEADGKTYGTTD GRFLKRLGKNSKPFFPDEMSHHFYREQIPDFSSKIVMESTFEDINLLEVYSLKEKLKIRD PKSTLPALEDMAFLRDLGLIKKIDNVERLTVAGLLFVGKQNSISRLLPQAEVIYLHYSAE NLEEYDLRLDLQQPIVNVLDRLTEKIQNDNKILNIQVGLFRLEVEDFSEKVFQEALLNAL SHRDYESMGAVYVKHYPDKVVIENPGGFLDGITERNIITHPSAPRNKLIAETLQHLKYVQ RTGQGVDIIFREMISMGKPYPEYQVFSDAVSLTLRSAVEDVSFVKFIVNEQEKNQIFLSL AELMIMRYLSENKRIKLSEAKDLTQISLENTRKALTNLVNLHLIESVGKEYMLTARVYEA VKNDLQYTRDKGVRYIKAKEMILEYLEHNESITNAAIQELCGFSKQQARTTIDKMRKEKL IILVGKGNRSCYQRIVYENK >gi|222441913|gb|ACEP01000029.1| GENE 84 106500 - 108077 450 525 aa, chain - ## HITS:1 COG:no KEGG:RSP_2044 NR:ns ## KEGG: RSP_2044 # Name: not_defined # Def: ATPase # Organism: R.sphaeroides # Pathway: not_defined # 39 419 3 356 458 100 26.0 1e-19 MRKLQIKDIYFGKIDGYNEFLEYGQDTCKGLFYEFPNINISRLLDGSTYYIFGNKGTGKT MLLKYLESIISENPSQNFSEFIRFKRDVDEEERNQIKRSALPNNPFEEIIDSEIPSDFTL NCSLAWQVYLIKVIVSRLSRTEYGVFNRNTESWTKLCALLNATYGKLGTDNTIKKILPKM 
KRGNLEIDFNKIAKTNLEFEWVDSEKKSVSFTSLAKQIINLYSVLTPVEDKIYIFIDELE LAFKQTKKYQRDVTLIRDLIFAIEYLSDINRTHNFNVFLVTAIRSEVYKNIISKGLEINK TIHDFGVTISWEQKGGNIRNHPLLKMLEKRIHFSETKLGLEPSPDIWETYFIPYVGNKKK EVQNYILDQTWYKPRDIIRLFTTIQRLHGNKSIIDQECFDTSRKPYAEESWTEFEEFLTV KYSDKEVEGIKKSLTGILLPFSVSDFQHHLNSKKDFFEEVNELVTKRKTPEILKDLYDIG IIGNYGSNSRFMFKGDTDIDPIMPITIHYPLIKFFQASINYHQKA >gi|222441913|gb|ACEP01000029.1| GENE 85 108222 - 108377 98 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026251|ref|ZP_03715443.1| ## NR: gi|225026251|ref|ZP_03715443.1| hypothetical protein EUBHAL_00492 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00492 [Eubacterium hallii DSM 3353] # 1 51 1 51 51 93 100.0 6e-18 MVGINFGTLSLMRLWLDCYNEYMIEVEKCNKIKDVKKRGEALLKVEAEFNL >gi|222441913|gb|ACEP01000029.1| GENE 86 108523 - 108678 164 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026252|ref|ZP_03715444.1| ## NR: gi|225026252|ref|ZP_03715444.1| hypothetical protein EUBHAL_00493 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00493 [Eubacterium hallii DSM 3353] # 1 51 1 51 51 89 100.0 6e-17 MNVANVGLLTFIMQIRQERNLTQKQLGEMVGASEAYIRAYESGTRNTKPSP >gi|222441913|gb|ACEP01000029.1| GENE 87 108789 - 109007 137 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154503692|ref|ZP_02040752.1| ## NR: gi|154503692|ref|ZP_02040752.1| hypothetical protein RUMGNA_01516 [Ruminococcus gnavus ATCC 29149] hypothetical protein RUMGNA_01516 [Ruminococcus gnavus ATCC 29149] # 1 68 1 68 216 107 75.0 4e-22 MSILISIFILGYHGKYKGFSKNSFCHRTTIAHFLNSGKWDESLLLNVLKQYVIEILYSEA VRIGPPCFLYCR >gi|222441913|gb|ACEP01000029.1| GENE 88 108940 - 109152 165 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026253|ref|ZP_03715445.1| ## NR: gi|225026253|ref|ZP_03715445.1| hypothetical protein EUBHAL_00494 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00494 [Eubacterium hallii DSM 3353] # 1 70 1 70 70 132 100.0 7e-30 MSLRFFIQKQSASGPPVFCIVDDTIASKTKLLSQALYPIKDAYFCQSHLKGKQDYGHQVV AVCFPVMALF >gi|222441913|gb|ACEP01000029.1| GENE 89 109125 - 109328 82 
67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253580568|ref|ZP_04857833.1| ## NR: gi|253580568|ref|ZP_04857833.1| transposase family protein [Ruminococcus sp. 5_1_39B_FAA] transposase family protein [Ruminococcus sp. 5_1_39BFAA] # 1 65 118 182 366 131 93.0 2e-29 MLSCNGIVLNYAFVMYNKSISKIDIVQNIAKELPVPPVMFYFFCDCWYVSEKIINTFAVK GFHTISV >gi|222441913|gb|ACEP01000029.1| GENE 90 109447 - 109554 74 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCIGMREKSTENAIVLLSYPEKAFGNPKASKDIGY >gi|222441913|gb|ACEP01000029.1| GENE 91 109554 - 109721 128 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026256|ref|ZP_03715448.1| ## NR: gi|225026256|ref|ZP_03715448.1| hypothetical protein EUBHAL_00497 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00497 [Eubacterium hallii DSM 3353] # 1 55 1 55 55 92 100.0 8e-18 MSLEYYICVIGTNEACSFENGYTQICDIIYEEKYRYIFQCAKASRDFEAFMKLAI >gi|222441913|gb|ACEP01000029.1| GENE 92 109764 - 109937 92 57 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1337 NR:ns ## KEGG: EUBREC_1337 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 57 40 96 96 92 87.0 6e-18 MNQTLSHGHAAEILEIRKSELIDLYDKRGYSYFDMTMDDLDDELNTFRELKKKETMV >gi|222441913|gb|ACEP01000029.1| GENE 93 109934 - 110179 236 81 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3346 NR:ns ## KEGG: EUBREC_3346 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 65 1 65 164 97 78.0 2e-19 MIVISDTTPLISLLKINRIDLLEKLFGDVLIPQAVFGELTIDEHFQLEADQIRQKKFIVV KPVKDAVKQTERLNVSFKNNN >gi|222441913|gb|ACEP01000029.1| GENE 94 110762 - 110848 65 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTSKNGSNKNIKESLDAISVILGRRVL >gi|222441913|gb|ACEP01000029.1| GENE 95 111111 - 112598 918 495 aa, chain + ## HITS:1 COG:no KEGG:Spico_0247 NR:ns ## KEGG: Spico_0247 # Name: not_defined # Def: Site-specific DNA-methyltransferase (adenine-specific) (EC:2.1.1.72) # Organism: S.coccoides # Pathway: not_defined # 1 493 3 494 517 661 64.0 0 
MLEKIIELTNEYIESMPKKERKKYGQFFTSMETARFMAGLYKLDEKTGRVSVLDAGAGSG ILSCAFIERMETIDSIQEIELTCYENDENVLPLLKRNLEYCKEETRKKLTVNIVEDNYIL SQYLDFNHMLGGNAEPKKYDFVIGNPPYMKIPKDAPEATAMPEVCYGAPNLYFIFASMGL FNLCENGEMVYIIPRSWTSGAYFKRFREYFLTEGKLEHIHLFVSRNKVFDKESVLQETII IKVRKTSEKPETVTITSSKSNSDFGELTSLTVPYDLVVAGSDYYVYLVTDENEVEVLKKL HKFDKTLPAIGVKMKTGLTVDFRNRDILRDEAEEGAIPLFYSQHIKQGKVEFPIQKEHEY VVTEQKGLMQDNKNYLFVKRFTAKEEPRRLQCGVYLAKRFPQYQKISTQNKINFVDGVLT EMSECLVYGLYVLFNSTLYDEYYRILNGSTQVNSTEINAMPVPDLEDIQEMGRKVLKSKD YSEENCNLILEGYCG >gi|222441913|gb|ACEP01000029.1| GENE 96 112591 - 113511 520 306 aa, chain + ## HITS:1 COG:no KEGG:Spico_0248 NR:ns ## KEGG: Spico_0248 # Name: not_defined # Def: Type II site-specific deoxyribonuclease (EC:3.1.21.4) # Organism: S.coccoides # Pathway: not_defined # 5 306 1 302 303 453 69.0 1e-126 MDKKIEEAREFLQTIGMPKAQQADICCYVILAMAGIKPDMSWSEGTNEWIRIHDIIQFVN TFYGMSYAENSRETFRKQALHRFRTAALIEDNGKATNSPNYRYRLTEETVQILRTMGTPK WKESIKRFLCYHEKLIDLYASKKKMTMMPVNINGESFKFSTGKHNELQKAIIEEFSPRFA PNSECLYVGDTIEKDLVKNVEKLKELGFEITLHDKMPDVVLYRADKNWIYFVESVTSVGP MDPKRILEITEMTKDVTAGKIFVTAFLDFKTYKRFAEELAWETEVWIAEMPEHMIHLNGD RFMGPR >gi|222441913|gb|ACEP01000029.1| GENE 97 113483 - 113722 57 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVIGLWGRDDSKKKLITNNLLEEIFVGFVTLACIKFKLFVMLFCHNEILCQYSLPYESYL YLDLSFRLVLFDLYFTAEL >gi|222441913|gb|ACEP01000029.1| GENE 98 113753 - 116011 1115 752 aa, chain + ## HITS:1 COG:no KEGG:EF3220 NR:ns ## KEGG: EF3220 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 27 752 46 740 1210 150 23.0 2e-34 MEFLSEIEMFRDSFYNGDSKSEVSVAVEEAKKYEIFNLISRVSALNLFHQNQTKSVILDT YIEGLLHQKKDQFQSKYNISPGKFRRIITQISDTSLKYSVDPPENMFVQNIMFYGNYRVL NGIDQTPAYNLQNMISILFTNGIEYPKDFLNEAYILVNGMLTISEKIVSGISDINNDHNT DEEKGVIIPPAIDLNKYAELIVIEGTNFRKLFLEKSELLNLTTIEFGVEFEDDFDNKSFY TRPFLYNEEQDQYILLNAGLLPTAIVFWITCLAKKYGIFENVMENYNSYIFHECKKYLRY LGHKKVLESQMGIELFNCSGYKEYIASVQNNQLVIVQYLYDDGKNYDAYTLHSPVNKKEF NDMVPERLAYHYSKIVEYGVNKEDIFVIIIINSLGRGIAYGIKKYDYFYPPLRVNPFELM 
CISINEKTESIFIPRYLKAKNSLRTFETGILSELNQIEMYCNNNYSFYMNDDFAPSEITT YFAPGDSLDYIMRAIQKEDRRLVEDSQGIMFCEVILNDRKRKIYVDPNCIKRQEISYYIE FDTFNIWIVAKEISNAKKMDLCYSVLDLISYWLAECKTVLNKMNGGGRTYEIEIILSDEA EKYYYYKENPKPFIETLEICNSFSVMEICISPEAFQYLNYRDNSREKEFITIIIDYIYKL LGETGKINYDLNVLFANPMQKKLFSLDYQEYHYLEPVANRENHFVHGEDEDILLNEIGEE LLKIGKWNVGIVDDGERTQIAHEVVGILYRKL >gi|222441913|gb|ACEP01000029.1| GENE 99 116402 - 117424 397 340 aa, chain + ## HITS:1 COG:no KEGG:EF3220 NR:ns ## KEGG: EF3220 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 31 331 905 1204 1210 173 33.0 1e-41 MFSVNEAIRKEQLIASSVFEDSVNLVEQKSFMEEINRAFEKEYGYSLYQLMLLAEGLREY SKSQSSDTVYVAKMKDIVDFLSQYDTSFNEDTVDVIIGSIALRQREDFLVPPESFRKEDV YPWRFNRELSFTRRPIIIRGDEVIWGNRQLDHMIRFTVGLINNGKLKAHSPEMNSLVGKI SDERGAEFNDLIYKLLINLNVFEVYSNISKVNGIHISENNNTLGDIDVLIIDHEYNKIIA AEVKNFNFSKNPYEMHLEYKKMFENTKKKKGYYTKHCRRVEWCNKHIGDFKKQYGLDNNE WKVIGLFILSQPLVSTEIYHKDIKMLTEKELSVSSIRALY Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:51:37 2011 Seq name: gi|222441912|gb|ACEP01000030.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont32.1, whole genome shotgun sequence Length of sequence - 3805 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 1/0.000 - CDS 3 - 285 310 ## COG1192 ATPases involved in chromosome partitioning - Prom 386 - 445 8.5 - Term 458 - 497 10.0 2 1 Op 2 . 
- CDS 638 - 3268 3768 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 3326 - 3385 7.9 Predicted protein(s) >gi|222441912|gb|ACEP01000030.1| GENE 1 3 - 285 310 94 aa, chain - ## HITS:1 COG:NMB0191 KEGG:ns NR:ns ## COG: NMB0191 COG1192 # Protein_GI_number: 15676118 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Neisseria meningitidis MC58 # 3 91 5 80 257 62 41.0 3e-10 MKVYVVANLKGGVGKTTTTVNVAYTFSEMGGRVLVIDLDPQCNCTRFFAKVNGYSKTVRD VLENPKGINSAVYRTKYQDIDIVKGSVKITEQKT >gi|222441912|gb|ACEP01000030.1| GENE 2 638 - 3268 3768 876 aa, chain - ## HITS:1 COG:SMc00025 KEGG:ns NR:ns ## COG: SMc00025 COG0574 # Protein_GI_number: 15964685 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Sinorhizobium meliloti # 1 871 1 884 898 999 56.0 0 MAKWVYLFTEGDATMRNLLGGKGANLAEMTNIGLPVPQGFTITTEACTQYYEDGREINDE IQGQINEYIVKMEEITGKKFGDKENPLLVSVRSGARASMPGMMDTILNLGLNETVVETIA AKSGNPRWAWDCYRRFIQMYSDVVMEVGKKYFEELIDEMKAKKGVSQDVDLTAEDLKELA SQFKAEYKEKIGEDFPDDPKKQLMGAIKAVFRSWDNPRANVYRRDNDIPYSWGTAVNVQS MAFGNMGDDCGTGVAFTRDPATGEKKLMGEFLTNAQGEDVVAGVRTPMPIAQMEEKFPEA FEQFKQVCKTLEDHYRDMQDMEFTVENKKLYMLQTRNGKRTAQAALKIACDLVDEGMRTE EEAVAMIDPRNLDTLLHPQFDAAALKAATPMGKALGASPGAACGKIVFTAEDAKAWAERG EKVVLVRLETSPEDIEGMKSAQGILTVRGGMTSHAAVVARGMGTCCVSGCGDIVMDEENK KFTLAGKEFHEGDAISLDGSTGNIYDGIIPTVDATIAGEFGRIMAWADKYRTMGVRTNAD TPSDAKKARELGAEGIGLCRTEHMFFEGNRIDAFREMICSDTVEEREAALDKILPYQQGD FEQLFEAMEGNPVTIRFLDPPLHEFVPQTEEDIKKLADAQGKSVETIKAIIESLKEFNPM MGHRGCRLTVTYPEIAVMQTKAVIRAALAVQAKHADWTIVPEIMIPLVGEEKELKYVKKI VVKTADEEIKAAGSDMKYEVGTMIEIPRAALLADEIAKEAEFFCFGTNDLTQMTFGFSRD DAGKFLDAYYDAKIFENDPFAKLDQNGVGKLMDMAVKLGKGQRPELHCGICGEHGGDPSS VEFCHRIGLDYVSCSPFRVPIARLAAAQAAIAEKRK Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:51:46 2011 Seq name: gi|222441911|gb|ACEP01000031.1| Eubacterium hallii DSM 3353 
E_hallii-1.0_Cont33.1, whole genome shotgun sequence Length of sequence - 33110 bp Number of predicted genes - 31, with homology - 31 Number of transcription units - 13, operones - 5 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 85 - 354 286 ## Acfer_0671 hypothetical protein - Prom 405 - 464 6.6 2 2 Tu 1 . - CDS 627 - 2180 956 ## Sterm_0502 hypothetical protein - Prom 2256 - 2315 6.2 3 3 Tu 1 . - CDS 2567 - 2914 313 ## gi|225026270|ref|ZP_03715462.1| hypothetical protein EUBHAL_00511 - Prom 2956 - 3015 6.2 - Term 3064 - 3095 1.5 4 4 Op 1 2/0.000 - CDS 3323 - 3874 723 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein 5 4 Op 2 . - CDS 3892 - 5220 1229 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 6 4 Op 3 2/0.000 - CDS 5237 - 6643 946 ## PROTEIN SUPPORTED gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P 7 4 Op 4 2/0.000 - CDS 6643 - 7656 940 ## COG2096 Uncharacterized conserved protein 8 4 Op 5 . - CDS 7670 - 7945 338 ## COG4576 Carbon dioxide concentrating mechanism/carboxysome shell protein 9 4 Op 6 . - CDS 7958 - 8836 863 ## ELI_4075 flavoprotein 10 4 Op 7 . - CDS 8845 - 9663 881 ## COG4820 Ethanolamine utilization protein, possible chaperonin - Prom 9683 - 9742 3.9 11 5 Op 1 . - CDS 9864 - 10661 913 ## gi|225026278|ref|ZP_03715470.1| hypothetical protein EUBHAL_00519 12 5 Op 2 . - CDS 10705 - 11067 358 ## ELI_4079 hypothetical protein 13 5 Op 3 . - CDS 11077 - 12891 1861 ## GY4MC1_1864 Diol/glycerol dehydratase reactivating factor large subunit 14 5 Op 4 1/0.000 - CDS 12924 - 13418 672 ## COG4910 Propanediol dehydratase, small subunit 15 5 Op 5 . - CDS 13434 - 14093 774 ## COG4909 Propanediol dehydratase, large subunit 16 5 Op 6 2/0.000 - CDS 14117 - 15778 1848 ## COG4909 Propanediol dehydratase, large subunit - Prom 15806 - 15865 6.0 17 5 Op 7 4/0.000 - CDS 15960 - 16772 1034 ## COG4816 Ethanolamine utilization protein 18 5 Op 8 . 
- CDS 16816 - 17088 487 ## COG4577 Carbon dioxide concentrating mechanism/carboxysome shell protein - Prom 17211 - 17270 7.0 19 6 Op 1 . - CDS 17324 - 18478 1434 ## COG1454 Alcohol dehydrogenase, class IV 20 6 Op 2 . - CDS 18465 - 19532 861 ## COG2207 AraC-type DNA-binding domain-containing proteins 21 6 Op 3 . - CDS 19548 - 20783 1063 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 22 6 Op 4 . - CDS 20833 - 21495 544 ## COG5012 Predicted cobalamin binding protein - Prom 21570 - 21629 7.7 23 7 Tu 1 . - CDS 21643 - 21816 78 ## gi|225026290|ref|ZP_03715482.1| hypothetical protein EUBHAL_00531 - Prom 21895 - 21954 3.5 - Term 21944 - 21987 0.1 24 8 Tu 1 . - CDS 22047 - 24482 2936 ## COG0058 Glucan phosphorylase 25 9 Tu 1 . - CDS 24664 - 27822 3789 ## COG0060 Isoleucyl-tRNA synthetase - Prom 27974 - 28033 5.3 + Prom 28466 - 28525 10.3 26 10 Op 1 . + CDS 28545 - 28976 507 ## gi|225026294|ref|ZP_03715486.1| hypothetical protein EUBHAL_00535 27 10 Op 2 . + CDS 28984 - 30429 1561 ## COG1488 Nicotinic acid phosphoribosyltransferase - Term 30513 - 30558 8.0 28 11 Tu 1 . - CDS 30654 - 31040 75 ## Acfer_0798 MATE efflux family protein - Prom 31094 - 31153 8.4 + Prom 31062 - 31121 8.7 29 12 Tu 1 . + CDS 31196 - 31837 449 ## COG4845 Chloramphenicol O-acetyltransferase 30 13 Op 1 . - CDS 32049 - 32516 361 ## NT01CX_0673 hypothetical protein 31 13 Op 2 . 
- CDS 32513 - 32974 404 ## NT01CX_0674 hypothetical protein - Prom 33011 - 33070 5.5 Predicted protein(s) >gi|222441911|gb|ACEP01000031.1| GENE 1 85 - 354 286 89 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0671 NR:ns ## KEGG: Acfer_0671 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 1 85 1 82 89 82 50.0 5e-15 MMYPYMTLADETEIVHSQIIEKDGMKKVIVNFERPTENGFDSARCELPDYKWTERMGYSD EEIEMFEELLHSNAHLLYKYAENGGIQIA >gi|222441911|gb|ACEP01000031.1| GENE 2 627 - 2180 956 517 aa, chain - ## HITS:1 COG:no KEGG:Sterm_0502 NR:ns ## KEGG: Sterm_0502 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 138 334 13 200 201 66 28.0 3e-09 MKKMGKVARLASLLVLTLLFGLFPIGSGNNLNKVQAVTPLNNPRTVADSSMNAKQKVTWD CVWFGAYPQSEVTSGDIYNKLKNATGWDENNDIVIDGNKYRRLKGEDATYYNTKRVLNYY DWIEDYSTYHYFKYEAIKWRVLNVSNGGALLLADQALDSQRYNQNSEDITWEKSSIRSWL NAQDTINNQEGINYRKGNFLNEAFFLSEQAVILPRNIANKNNATYNTSGGNNTLDKIFLL AETQICGLDAKKYGFIMDHSIDDEARQSKCAEYAFAMGCYKFVETKYAENVNWWLRSPGG SSKAALKIDYDGRSRTGGSSIDSIEAIRPALYLKISSSNLYSYAGTVCSDGTINETPAPA RVYQDNTSSGNTGNNSSSNNTTGNNNSNNTNNNKNNTTANTKPSNKTTASKKPTKRALLP VEKMTGSLKSVKSPAKSTLAITWKKLRKVTGYQVQFCAKSNFKKGTIERKFKQKVTKTKV RPLKSKKKYYVRMRPYTKSGSKTYYGKWSKVKSVKIK >gi|222441911|gb|ACEP01000031.1| GENE 3 2567 - 2914 313 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026270|ref|ZP_03715462.1| ## NR: gi|225026270|ref|ZP_03715462.1| hypothetical protein EUBHAL_00511 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00511 [Eubacterium hallii DSM 3353] # 1 115 1 115 115 227 100.0 2e-58 MGKTYIPITEANKKNILNFIGMCRMNGLEGMAEPSFSFKDIREDIGTVIFDVFYNCRGIT EPSWNMTVKVDSDVRFHDSAASKAGETYCYLDSYRIEKDSLEFKTKTSKFNVFSF >gi|222441911|gb|ACEP01000031.1| GENE 4 3323 - 3874 723 183 aa, chain - ## HITS:1 COG:STM2054 KEGG:ns NR:ns ## COG: STM2054 COG4577 # Protein_GI_number: 16765384 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide 
concentrating mechanism/carboxysome shell protein # Organism: Salmonella typhimurium LT2 # 1 174 1 174 184 133 42.0 2e-31 MRKAVGFVEIKSIPVGIQTADEMVKAGNVELLLSTPICPGKYVIIVGGHVGPVKAAMSKA ELVSGIYLIDTHILDNIREEVLPAIAGVTPTQQIQAVGAIETISALTAIIAADTAVKASK VSIVDLRIARGLGGKGYLVITGEVSSVRSAVNACLAKLGISGEVTSTSIIPRPHPQIVES LLK >gi|222441911|gb|ACEP01000031.1| GENE 5 3892 - 5220 1229 442 aa, chain - ## HITS:1 COG:STM2053 KEGG:ns NR:ns ## COG: STM2053 COG4656 # Protein_GI_number: 16765383 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Salmonella typhimurium LT2 # 22 440 34 448 451 325 42.0 1e-88 MEKNLIDIIAKSGIVGAGGAGFPTHVKMNAKAEYVIINGAECEPLLRVDQQLMAVETKKI LDALKLCMEAVGADKGVIALKGHYEAAINSFNQLLGEYEGISLHLLPNFYPAGDEQITVY EVTGRIVPEGGIPLQCGVIVSNVETMLNIYNAYFEDKPVTDSYVTFAGEVKNHITLKLPL GMTVKEALDLAGGVTVNGPYVVINGGPMMGKHVALDSKITKTTKGLIVLPEDHPLIQDVK LPIPKMLHMAKAACCHCSMCSEVCPRHLIGHRIEPNKTIRFASYGSLCDGSDTPMIAFLC SECRLCQYACIMNLQPWKVNHELKGIMAKQGIKSTLHNQPESVHPMREYKRFPIKKLIAR LGLTKYNVAAPLTELTTEIDHVSIPVGQHIGAPAVPCVSVGDTVAKGDVIAQPAEGKLGA CIHASISGTVKAIADGCISIEK >gi|222441911|gb|ACEP01000031.1| GENE 6 5237 - 6643 946 468 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148544941|ref|YP_001272311.1| 50S ribosomal protein L29P [Lactobacillus reuteri DSM 20016] # 8 466 6 474 477 369 43 1e-101 MNIDVELIEKVVKKVLNDVETGSSESEYGYGIFDTMDEAIEASAKAQKEYMNHSMADRQR YVEGIREVVCTKENLEYMSKLAVEESGMGAYEYKVIKNRLAAVKSPGVEDLTTEALSGDD GLTLVEYCPFGVIGAIAPTTNPTETVICNSIAMLAGGNTVVFSPHPRSKGVSIWLIKKLN AKLEELGAPRNLIVTVKEPSIENTNIMMNHPKVRMLVATGGPGIVKAVMSTGKKAIGAGA GNPPVVVDETADIEKAAKDIVNGCSFDNNLPCIAEKEVIAVDQIADYLIFNMKNNGAYEV KDPEIIEKMVDLVTKDRKKPAVNFVGKSAQYILDKVGIKVGPEVKCIIMEAPKDHPFVQI ELMMPILPIVRVPNVDEAIDFAVEVEHGNRHTAMMHSKNVDKLTKMAKEIETTIFVKNGP SYAGIGVGGMGYTTFTIAGPTGEGLTSAKSFCRKRRCVLQDGLHIRMK >gi|222441911|gb|ACEP01000031.1| GENE 7 6643 - 7656 940 337 aa, chain - ## HITS:1 COG:FN1303 KEGG:ns NR:ns ## COG: FN1303 COG2096 # Protein_GI_number: 19704638 # Func_class: S Function 
unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 169 9 178 192 144 48.0 2e-34 MSNIYTKTGDKGTTGLYGGSRVDKDSLNVDAYGTVDEAISSLGVAYTLTDSPEIKEYINH IQKRMFQAGAELASDARGMEMLKDKIGEADIKYLENIIDKSTEVNGLMREFVVPGVNPSS AALHVARTVVRRAERIITALAKQVPVREELRKYINRLSDACFAMARLEEARAKNQEIEEL KDTVRQVVKTLGAMGKEEDSMDMSIETLKKMAGFIEEKAKEIGVPVAFSAVDEGGNLLYF QRMEGTLLISTKVSQDKAYTACALKCPTCDLADVTKPGESLWSLHNSGDGRIICFGGGYP IKKDGKVIGAIGVSGGTAEEDMAVATYALEKMQGGKA >gi|222441911|gb|ACEP01000031.1| GENE 8 7670 - 7945 338 91 aa, chain - ## HITS:1 COG:sll1030 KEGG:ns NR:ns ## COG: sll1030 COG4576 # Protein_GI_number: 16329366 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Synechocystis # 1 89 1 88 100 71 48.0 4e-13 MYLGKVIGTVVSTIKTPSLTGSKFLIVEKINQDLTAKKQTEIAVDTVGAGDGETVIVVGG SSARMSNGKKTDLPVDAAIIGIVDTVEISQC >gi|222441911|gb|ACEP01000031.1| GENE 9 7958 - 8836 863 292 aa, chain - ## HITS:1 COG:no KEGG:ELI_4075 NR:ns ## KEGG: ELI_4075 # Name: not_defined # Def: flavoprotein # Organism: E.limosum # Pathway: not_defined # 8 291 10 277 278 139 32.0 1e-31 MNRKEFEDALMATIVEYISDQVILRYAQACKKATVLFSGALIGYKDAVTSLNELKKDGWT LTAVLSKAAGEVITEERIRNDIGPDAIYVEGAPVNGRQIVDDSQFVIIPSLTINTAAKVA NCISDNLLTNMISRAMATGKPIVAAIDGCCPDNKVREQLGFKVTEAYKAKMRHNLEDMLA YGITLTTDYSLCSKVNEVFSAKVSLPALDNSKPAPKKDRIVSTKMEPAVQSVSQPVKKVE TPSSVKLDKKVIGRVDIAKNARFKTIVVRQDALVTGLAKDEALNRGITIVKE >gi|222441911|gb|ACEP01000031.1| GENE 10 8845 - 9663 881 272 aa, chain - ## HITS:1 COG:FN1783 KEGG:ns NR:ns ## COG: FN1783 COG4820 # Protein_GI_number: 19705088 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein, possible chaperonin # Organism: Fusobacterium nucleatum # 1 267 3 269 274 275 55.0 5e-74 MNDYNNVLVGFSELIKNEEFRPYEGDLRLGVDLGTANIVVSVVDSNNTPIAGASYPSTVV RDGIVVDFMGASRAVRNMKAKLEDLMGVEFYEAATAIPPGIISGNVKVISNVVESVGLDV 
VNVIDEPTAAASVLGITDGAVVDVGGGTTGISILKDGKVIFTADEPTGGTHMTLVLAGSL GCSFEEAERIKKDPKQEMLVFPVVKPVIEKMAAIVARFIEGYDVDVIYVVGGACSFKKFE SVFEKETGVKTIKPKEPLLVTPLGIAMNCNKQ >gi|222441911|gb|ACEP01000031.1| GENE 11 9864 - 10661 913 265 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026278|ref|ZP_03715470.1| ## NR: gi|225026278|ref|ZP_03715470.1| hypothetical protein EUBHAL_00519 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00519 [Eubacterium hallii DSM 3353] # 1 265 3 267 267 237 99.0 6e-61 MSNNKCLGFIETFGLAACVAAADAAVKSANVKLLGYEYAKGGGMCTVKVEGNVGACRAAI AAGKKAAEAVNGSLSGTKSTLLKARPADGMRELLVDNRETVGGEIALAEGYRPKGDTKKP VLVKNWKPKTEPKAEVVTEEAPEPPVVEEPEEEEPTSEEPETEEVKEEAPVEEPKAEEPA EETPVEETPVEEAPVEEPKAEEPAPEAPAEETPVEKAEAEEAKSEKTEPKKETDKKADAP KGKTSTGNRRSRSRRKKNTDNNDNK >gi|222441911|gb|ACEP01000031.1| GENE 12 10705 - 11067 358 120 aa, chain - ## HITS:1 COG:no KEGG:ELI_4079 NR:ns ## KEGG: ELI_4079 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 11 117 9 115 121 94 48.0 1e-18 MEGSNTVIDNRPSVKVFYDCDHLSEADFVSVLLGIEEEGIPYDVQAEHCSSVLELAHNAS LNSRLGVGVGISKEGIVLQHEKLDKAAPLFKIKLYQTELFRKIGANAARLVKKMPFKVLD >gi|222441911|gb|ACEP01000031.1| GENE 13 11077 - 12891 1861 604 aa, chain - ## HITS:1 COG:no KEGG:GY4MC1_1864 NR:ns ## KEGG: GY4MC1_1864 # Name: not_defined # Def: Diol/glycerol dehydratase reactivating factor large subunit # Organism: Geobacillus_Y4.1MC1 # Pathway: not_defined # 1 603 1 603 608 752 70.0 0 MKLVAGVDIGNATTETAIARIDGKNVTFLSSGITGTTGIKGTKQNIHGVFQSLKNALDEV GFEISDLDEVRINEAAPVIGDVAMETITETIITESTMIGHNPNTPGGVGIGVGISQRIDR LDTVKDGGDVIVVIPAEVSFEAAAALINRYNKIFNITGAIVQRDDGVLINNRLEKKIPIV DEVGMIDKVPLGMLCAVEVAPVGGVVEVLSNPYGIATLFKLSPEDTKQVVPIARALIGNR SAVVIKTPEGDVKERRIPAGSIEIIGAKKKVIVGVEEGAEKMMEAVNSVPVIEDIKGEPG TNAGGMLEKVRQVMSNLTNQHPKDIKIQDLLAVDTYNPQKIKGGLANEFSLESAVGIAAM VKADRLQMKMIAEELTDRLKIPVYVGGVEADMAIKGALTTPGTNVPLAIVDMGAGSTDAS IKDRDGNVKLVHLAGAGNMVTLLIQSELGLEDFNTAEDIKKYSLAKVESLFHIRHEDGTV QFFEQPLDPNVFAKVVLVKEDGELVPLEGQDSMEKIKMVRTNAKKKVFVTNAIRSLAKVS 
PTGNVRDIEFVVLVGGSALDFEVPTLVTDALSQYNIVAGRGNIRGCEGPRNAVATGLAMS CADV >gi|222441911|gb|ACEP01000031.1| GENE 14 12924 - 13418 672 164 aa, chain - ## HITS:1 COG:lin1119 KEGG:ns NR:ns ## COG: lin1119 COG4910 # Protein_GI_number: 16800188 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol dehydratase, small subunit # Organism: Listeria innocua # 1 163 1 167 170 157 52.0 1e-38 MDQELLVRQIMQEVMKNMQGSAAACECESKVTVEDYPLGEKKPELIKSASGKSLNDLTLQ GVIDGKLDAKDFRITKETLELQAQVAESAGRGFFATNLRRAGELIPVPDARLLEIYNALR PYRSTKAELYAIGDELINEYGATVSGNFVKEAADIYEKRNRLKK >gi|222441911|gb|ACEP01000031.1| GENE 15 13434 - 14093 774 219 aa, chain - ## HITS:1 COG:mll6722 KEGG:ns NR:ns ## COG: mll6722 COG4909 # Protein_GI_number: 13475607 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol dehydratase, large subunit # Organism: Mesorhizobium loti # 33 218 565 751 756 112 38.0 4e-25 MAIDEKLLKSVIAEVLKEMNTADTSAASAVEGETYECPGMQITEIGDAEKGTNPNEVVLG LAPAFGESQTTNIVGIPHADIVKEVMAGLEEEGVSCRIVKVYRTSDVSFIAHDAAELSGS GIGIGIQSKGTTVIHQKDLPPLSNLELFPQCPLIDLETYRAIGKNAAQYAKGESPNPVPV KNDQMARPKFQALSAVLHIKETEHADRDKAPKALKVVFK >gi|222441911|gb|ACEP01000031.1| GENE 16 14117 - 15778 1848 553 aa, chain - ## HITS:1 COG:STM2040 KEGG:ns NR:ns ## COG: STM2040 COG4909 # Protein_GI_number: 16765370 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Propanediol dehydratase, large subunit # Organism: Salmonella typhimurium LT2 # 1 553 1 553 554 754 66.0 0 MRSKRFEYLDQRPVNQDGYAKEWPEVGFIAMNSPFDPTPSIKIVNGEVVEMDGKTKDEFD MLDAFIARYAINLDFAEEAMATDSLEIARMLVDINVNRQECLKYTTAVTPAKLVEIVGNL SVLEAMMGITKMRCRLKPANQCHVTNVKDNMVQIAADAAEAAVRGFAEMETTVGIVRYAP FNAIAIMLGANVGRPGILTQCAVEEATELDLGMRGYTAYAETVSVYGTEDVFMDGDDTPW SKAFLASCYASRGVKMRFTSGTGSEVQMGMAEGKSMLYLEARCLGVTRGAGVQGTQNGSV SCIGVPAAVPGGIRAVACENLIAAMLDLENASSNDQSFTHSDLRRVARSIPQFIPGTDYI CSGYSSTPNYDNMFAGSNWDADDYDDWCVIQRDLRVDAGLLPAREEEVVAARNKAAKAIQ 
ALFRELGLPEITDEEVEAATYARGSQDMPDRDVVADLKAATDMLDRGVTGIDFVKALYKG GFEDVAESIFNIQKAKVVGDYLHTASMLDDKFHVISAVNSPNCYEGPMTGYQISPELWEK ISDIRTAVDPAKL >gi|222441911|gb|ACEP01000031.1| GENE 17 15960 - 16772 1034 270 aa, chain - ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 8 256 6 255 267 278 60.0 9e-75 MKDEMMAKIMDEVMKKMGGTEAPAQSMACEAPQGINEDLCGLTEYVGTAMGHTIGLVIAN LDYSVHELLKIDPKYRSIGIIADRTGAGPQIFACDEAVKATNTEVCLIECPRDTEGGAGH GCLVVFGAEDVSDARRAVEVALSFVGKHFGDVYGNSAGHLEFQYTARASYCLEKAFGAVV GKAFGITVGAPSGIGIVLADTASKTATIDPISIATPGNGGTSFSNEVNFFFTGDSGAVRQ AIIGAREVGKQLLKALEPNEPLESGTVPYI >gi|222441911|gb|ACEP01000031.1| GENE 18 16816 - 17088 487 90 aa, chain - ## HITS:1 COG:lin1115 KEGG:ns NR:ns ## COG: lin1115 COG4577 # Protein_GI_number: 16800184 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; C Energy production and conversion # Function: Carbon dioxide concentrating mechanism/carboxysome shell protein # Organism: Listeria innocua # 3 89 2 88 95 108 83.0 2e-24 MQKEALGMIETKGLVGAIEAADAMVKSANVTLVGYEKIGSGLVTVMVRGDVGAVKASVDA GTVAADKVGTVVSSHVIPRPHTDVEKILPQ >gi|222441911|gb|ACEP01000031.1| GENE 19 17324 - 18478 1434 384 aa, chain - ## HITS:1 COG:lin1135 KEGG:ns NR:ns ## COG: lin1135 COG1454 # Protein_GI_number: 16800204 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Listeria innocua # 1 382 1 377 379 362 46.0 1e-100 MKRFKLSTEVLFSEDAIDALLEEKDENAVLITDKFMVDSGMADMILKKLSNCKEVSVFSE VVPDPPMDLIERGIAFLQDKDCDIVVALGGGSSIDAAKAIVYMAKKIEEKQNPDNQKKIK LIALPTTSGTGSEVTQFAVVTDSKTGVKIPLIDESLMPDIAILDPELVKSAPPFITADTG MDVITHALEAYVSETASDCSDCLAEKAAELAFEFLPRAYRNGYDIKARDKMHRASCLAGM AFNLVNLGVNHGIAHALGAIFHIPHGRANALVLPYVIEFNADMAREGAKHSNDAAKKYQK MARIVGLPAPTPKVGVANLIAEIQRLLKYMDRPQCMSECGVSVEEFNKHREEIVRRALAD ACTQANPRKVTAEDINKILDHIAK >gi|222441911|gb|ACEP01000031.1| GENE 20 
18465 - 19532 861 355 aa, chain - ## HITS:1 COG:lin1114_2 KEGG:ns NR:ns ## COG: lin1114_2 COG2207 # Protein_GI_number: 16800183 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 244 345 44 145 145 95 47.0 1e-19 MYKILIVEDEELERTSMKIFLEENLLDVEIVGEAKSGFEAVEMIDSKDINLMLVDINIPG MDGMEVIKYARKKLPKAVIIITTAYDDFNIAHRAIKLKVDDFLLKPIRKEVLLESVKSFA SRLGEGRQTGKSNKLLSKFEMELKKCSYKASVDILREYIDQLYLEEHDVNIISKRLQEVA KAVVRISEELGITDTNELVVQMEKLKIKYLLYNNKHDAYNEIIKMVDILFDKMNIKGKLS EGGMKAVIDYIERNLKKGISLEDVANHVNISTYYLSKIFKKEMGVNFITYVTDRKMDLAK EMLVNTDIPVLNIALDLAYNEANYFSKAFKKKTGLTPSEYREKYRNRKEEENETV >gi|222441911|gb|ACEP01000031.1| GENE 21 19548 - 20783 1063 411 aa, chain - ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 170 410 334 561 563 127 35.0 5e-29 MTEKNNIYIEDVVDVKELQTLQDNWAKATDLAFVTVDCNGTPITKQSNFTPFCKRLRELD AFREKCYYCDACGGRKARRIGKAYVYICHAGLIDFAIPIMIKGQYVGAFLAGQVTSTDFA DLKKITTQSEDWKDNQELVELYSQVKEVSVEKIEAVAQIICDSYNYSVEREYANKVNAEL REKDYKLMKEEKLRIEMEKSLRDIELKALHYQINPHFLFNVLNTIGRLAFFENAKMTEDV VYAFADMMRYILRKSRDPLSPLREELTHVQNYLKIQKIRLGSRLQYSINISEAYYSVKLP FLSLTTLVENSIKYAVENKASGGMVEVNGIEKDGDLCIDIIDNGDGMSQSKIVSVLRGDA YKDNEKGSVGLYNINSRLIHYFGNEYALSIESENKPSLGTKVTIKVPLVQE >gi|222441911|gb|ACEP01000031.1| GENE 22 20833 - 21495 544 220 aa, chain - ## HITS:1 COG:mlr1231 KEGG:ns NR:ns ## COG: mlr1231 COG5012 # Protein_GI_number: 13471298 # Func_class: R General function prediction only # Function: Predicted cobalamin binding protein # Organism: Mesorhizobium loti # 5 213 23 231 238 127 33.0 2e-29 MGATIDEIIDSLKRGQGKRVSDLVKMALEEGTDPQMILEDGFLAGMEQVADKFRKEEVGV PEILSVTRALERGVSTLRHYSGGRAKKEIGTVIIGTVKGDLHDIGKNLVKIMMESKNIRV IDLGVDVSPRRFLEEAVGSHAQIVCLSGIMPHSEEDMRAVVEEFEMKGIRDQFYFMVGGY FLTERQAKAIGADCYTEDACNCAEKACRYLNKKDRRKKNH 
>gi|222441911|gb|ACEP01000031.1| GENE 23 21643 - 21816 78 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026290|ref|ZP_03715482.1| ## NR: gi|225026290|ref|ZP_03715482.1| hypothetical protein EUBHAL_00531 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00531 [Eubacterium hallii DSM 3353] # 1 57 50 106 106 109 100.0 5e-23 MKLGVQSYGRIWGFYGVTVGGEGAGVQVSSVFCKPKSEKIVIGSPNSFFYLFSCPFE >gi|222441911|gb|ACEP01000031.1| GENE 24 22047 - 24482 2936 811 aa, chain - ## HITS:1 COG:CAC1664 KEGG:ns NR:ns ## COG: CAC1664 COG0058 # Protein_GI_number: 15894941 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Clostridium acetobutylicum # 8 811 4 807 812 832 53.0 0 MIKKDTFDKEKFKKELESNVRMLFRRKLEEATPQQIYQAVAYSVKDDIIDNWIETHKAYE KQDKKMVYYMSMEFLMGRALGNNMINLLCYDDVRETLEELGLDLNLIEDQEPDAALGNGG LGRLAACFLDSLATLGYPAYGCGIRYRYGMFKQKIENGYQVEVPDNWLKYGNPFEIKRDE YAVEVKFGGYVDVEMHNGRQKFVQKGYQSVRAVPYDMPIVGYGNHIVNTLRIWDAEAINN FNLDSFDKGEYQKAVEQENLARTICEVLYPNDNHMAGKELRLKQQYFFISASVQRAIAKY KETHDDIRKFHEKVTFQLNDTHPTVAVAELMRILVDEEGLEWDEAWEITRKTCAYTNHTI MAEALEKWPIELFSRLLPRVYQIVEEINRRFVIEIQNKYPGDQEKIRKMAILYDGQVRMA HLAIAGSYSVNGVARLHTDILKKRELKDFYEMMPEKFNNKTNGITQRRFLLHGNPLLASW VTDKIGDDWITNLDHLKHLKVYVDDEKCQQEFMNIKYQNKVRLAKYIKEHNGIDVDPRSI FDCQVKRLHEYKRQLMNILHVMYLYNEIKAHPDMDIVPRTFIFGAKAAAGYYTAKLTIKL INAVADKINNDPSINGKIKVVFIEDYRVSNAELIFAAADVSEQISTASKEASGTGNMKFM LNGALTLGTMDGANVEIVEEVGKENAFIFGLSADQVMEYEKNGNYNPRDVYNNNQDVRQV LTQLVNGFYSPENPELFRALYDALLEKDTYFTLLDFDSYKEAHNRIDAAYRDEEHWARTA MLQTASAGKFSSDRTIEEYAKEMWHLEKVTL >gi|222441911|gb|ACEP01000031.1| GENE 25 24664 - 27822 3789 1052 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1050 1 1033 1035 1186 54.0 0 MYKKVSTNLNFVEREKETAQFWKDHDIFKKSIETREGDPVYTFYDGPPTANGKPHIGHVL 
TRVIKDMIPRYRTMKGYMVPRKAGWDTHGLPVELEVEKKLGLDGKEQIEEYGLDPFITQC KESVWKYKGMWEDFSKTVGFWADMDNPYVTYNDNFIESEWWALKTIWDKGLLYKGFKIVP YCPRCGTPLSSHEVAQGYKAVKERSAVVRFKVAGEDAYFLAWTTTPWTLPSNVALCVNPE ETYCKVKAADGYVYYMAKALLDKVLGKLGTEETPAYEILETYKGTDLEYKEYVPLYEGAA KSAEKQHKKAHFVTCADYVTMSDGTGIVHIAPAFGEDDAQVGRKYDLPFVQFVDGKGSMT EDTPYAGLFVKDADPKVLVDLDKEGLLFDAPKFEHDYPFCWRCDTPLIYYARESWFIKMT AVKDDLIRNNNTINWIPKSIGKGRFGDWLNNIQDWGISRNRYWGTPLNVWQCEGCGKMEC IGSRQELEEKSGNPEAQTVELHRPYIDAITLTCPDCGKTMKRVPEVIDCWFDSGAMPFAQ HHYPFENKELFEQQFPAQFISEAVDQTRGWFYSLLAESTLLFNKAPYKNVIVLGHVQDEN GQKMSKSKGNAVDPFDALEQHGADAIRWYFYSNSAPWLPNRFHADAVTEGQRKFMGTLWN TYAFYVLYANIDEFDPTKYSLEYDKLSVMDKWLLSRLNSCVKTVDDCLANYKIPETTKAL QAFVDDMSNWYVRRSRQRFWAKGMEQDKINAYMTLYTALVTFIKASAPMIPFMTEDIYQN LVKSVNTDAPESIHLCDFPAVDEAMIDEKLEQDMGEVLDIVVLGRAARNASGIKNRQPIG QMFVNGEAALTDYYKQIIEGELNVKDVIFKDDVSDLTDYTFKPQMRILGPKYGKDLGKIR NILANLNGSAAKKELDANGFLTIELNDGKINLLPEELLIDMSQKEGYVSQADHGVTVVLD TNLTPALLEEGFVREIISKVQTMRKEAGFEVTDHITVYEEGNDKIKEVMTKYTDEIKNDV LADDMCLDAEGGYSKEWDINGEKVRLGVEKKM >gi|222441911|gb|ACEP01000031.1| GENE 26 28545 - 28976 507 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026294|ref|ZP_03715486.1| ## NR: gi|225026294|ref|ZP_03715486.1| hypothetical protein EUBHAL_00535 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00535 [Eubacterium hallii DSM 3353] # 1 143 46 188 188 260 100.0 3e-68 MLKKLIVSFLCAIVVACTVLGGFCLLYFTIPSQKAKASISLPSTVEDSETAAESTTEADS KDKNVYVADVYQSLTLRSEASADSDAITDLLPMTHLEVLEFAAGTNYAYVKVLTGDKESY KGYVNSEYITKLGEPTVRIGTEE >gi|222441911|gb|ACEP01000031.1| GENE 27 28984 - 30429 1561 481 aa, chain + ## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 5 477 14 483 489 501 51.0 1e-142 MFSNNLTMMTDLYELTMMQGYFFEGNEDKVVFDVFYRKNPSGSAYAITCGLDQVIDYIKN LSFSYEDIDYLRGLGIFKEDFLSYLAGFHFSGDIYAIPEGSVVFPKEPLVKVIAPIMEAQ 
LVETAILNIINHQSLIATKASRVDYAAGGGVMEFGLRRAQGPDAGTLGARAAVIGGCTGT SNVLAGELYGIPVQGTHAHSWIMSFPDELTAFRTYAKMYPDACILLVDTYDTLKSGVPHA IQVFKEMRDAGITSKLMGIRLDSGDIAYLSKKARKMLDEAGFTNAIISASSDLDEYLINS LKTQGCTVTSWGVGTNLITSSDNPAFGGVYKLAAIKKPGDKEFTAKIKISENPEKITNPG NKTVYRIYDKESSKIKADLICLVGETFDPSEDLKIFDPISTWKKSILPAGSYQIREMLVP IFLNGQCVYSSPAVMDIKAYCQQELNTLWDENRRLINPQTVYVDLSQKLFDLKHKLLGDE K >gi|222441911|gb|ACEP01000031.1| GENE 28 30654 - 31040 75 128 aa, chain - ## HITS:1 COG:no KEGG:Acfer_0798 NR:ns ## KEGG: Acfer_0798 # Name: not_defined # Def: MATE efflux family protein # Organism: A.fermentans # Pathway: not_defined # 9 113 335 439 458 77 40.0 1e-13 MIFSVLSSLVIIEPLIIFRKFFTGFFFAEQTVIQNAHLRIMCILFFEPLCSFYEIPAGVL RGTGHSLSPAIAAIIGTCCFRIIWIYTIFGEYCQLSVLYYAFPISWILTIVLIMIDFAIV IKNRQPLK >gi|222441911|gb|ACEP01000031.1| GENE 29 31196 - 31837 449 213 aa, chain + ## HITS:1 COG:CAP0060 KEGG:ns NR:ns ## COG: CAP0060 COG4845 # Protein_GI_number: 15004764 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 7 207 8 209 220 139 37.0 3e-33 MTGCKKIDMDTWARREHYEYYTKQLKVEYNITANVDVKNVLDFCHSKGYKFYPAMIYLVT KTLNRIDNFKMFKDKYGYLCAWDKIIPNYTIFHNDDHTFSDCWTDFSEDFNIFYQDILKD MATAATKKGIKAKEGQPANFYCISCTPWTTFTGYSSRVSNGEPSFFPIVLMGKYKQHGKK ILMPVNITIAHAVADGYHVGLFFQYLQEEINNY >gi|222441911|gb|ACEP01000031.1| GENE 30 32049 - 32516 361 155 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0673 NR:ns ## KEGG: NT01CX_0673 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 7 154 3 150 151 181 64.0 9e-45 MMERQLGRESQKDNDLFYTCSLIDYIARKTKNKRAAVVDALGKECIAKIYDLADIYHSDN IERVSDDFIEEAKISVGNFDNVGECQYAVPSHWDIGKVYKRLIKQVALEKKIDVVDAIIE VYHSFISDKIDDYNSSVYYENPSYIFECFIQNKIL >gi|222441911|gb|ACEP01000031.1| GENE 31 32513 - 32974 404 153 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0674 NR:ns ## KEGG: NT01CX_0674 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 1 146 1 148 158 222 70.0 3e-57 
MAKMTVYHGGYTPVKHPEIREGRNTKDFGTGFYCTVIKEQAQRWAKRYDDKIVSIYEVRL NTHLNIKEFKDMTDEWLDFIIDCRSGKPHQYDIVIGAMANDQIYNYVSDYIDGAITREQF WVLAKFKYPTHQINFCTREALKCLEYRGYEETR Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:53:08 2011 Seq name: gi|222441910|gb|ACEP01000032.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont34.1, whole genome shotgun sequence Length of sequence - 14217 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 232 - 291 6.8 1 1 Tu 1 . + CDS 342 - 653 294 ## COG0393 Uncharacterized conserved protein + Term 700 - 736 5.0 + Prom 700 - 759 10.2 2 2 Op 1 . + CDS 828 - 1931 1524 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 2018 - 2058 1.3 3 2 Op 2 . + CDS 2096 - 4078 2215 ## COG0171 NAD synthase + Term 4147 - 4200 -0.2 4 3 Tu 1 . - CDS 4165 - 4593 292 ## Closa_1014 hypothetical protein + Prom 4583 - 4642 8.9 5 4 Op 1 . + CDS 4768 - 9222 4951 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 6 4 Op 2 . + CDS 9249 - 9743 655 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily 7 4 Op 3 . + CDS 9828 - 10820 1084 ## gi|225026307|ref|ZP_03715499.1| hypothetical protein EUBHAL_00548 + Term 10853 - 10906 10.1 + Prom 10891 - 10950 8.4 8 5 Op 1 . + CDS 11030 - 12193 1202 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain + Prom 12219 - 12278 7.7 9 5 Op 2 . + CDS 12394 - 13485 1267 ## gi|225026309|ref|ZP_03715501.1| hypothetical protein EUBHAL_00550 + Term 13654 - 13692 1.2 + Prom 13761 - 13820 5.8 10 6 Tu 1 . 
+ CDS 13842 - 14217 148 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|222441910|gb|ACEP01000032.1| GENE 1 342 - 653 294 103 aa, chain + ## HITS:1 COG:MA3383 KEGG:ns NR:ns ## COG: MA3383 COG0393 # Protein_GI_number: 20092197 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 1 103 1 104 106 99 56.0 1e-21 MLLVTTERIAGREIEPLGIVEGSSVQTVHAGKDIMNSLKTLVGGELTSYTEMMEKARTLA IDKMVRNAEQMGADAVVCVRYSSSEVMQGAAEVLVYGTAVRLI >gi|222441910|gb|ACEP01000032.1| GENE 2 828 - 1931 1524 367 aa, chain + ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 12 367 19 372 372 396 62.0 1e-110 MEDIRIVFSNIMEKLLKQAHKNKNVLEYKEINDAFKNIELTPEKFEWILDYFEKQGVDVL NTNEEDDGDDNFIDDDTEEVEIIDDADILEGVSLEDPVRMYLKEIGNIPLLTAEEEVFLA QRIEKGDEQARKQLIEANLRLVVSIAKKYVGRGMSFLDLIQEGNMGLMKAVEKFDYKKGN KFSTYSTWWIRQAITRGIADTAKTIRVPVHMVETINKTLRTSRMLLQELGREPTNEEIAK KMNMPVAKIDEILKTSRDPVSLDTPIGEEEDSQLGDFIEDESLLSPVDSASFSMLKEELE EAMASLTERERNVIKLRFGLDDGKTRTLEEVGKEFNVTRERIRQIEAKALRKLRHPSRSR KLKDFLE >gi|222441910|gb|ACEP01000032.1| GENE 3 2096 - 4078 2215 660 aa, chain + ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 340 647 1 309 310 415 64.0 1e-115 MGRLRNLGFPFCFMEVFMKDGFLKVGVASVDVEVANPIHNKEKVMEVIKKADKNGVKLLV FPELVLTAYTCNDLFLQKTLLDEAKNQLFAILEETKGQDMVIVLGLPLTVNHKLYNCAVF AQGGKILGVVPKHYLPNYSEFYEARHFAPGEEEVRKIRLGGKDVPFGMNLLFCCENMEEL VIGCEICEDLWCPLPPSTHHALAGATVICNPSASDETTTKDTYRRNLVSQQSARLVCAYL YSCAGEGESTQDLVYSGHNMIAEYGSILKESRRFQNSYIETEIDLQRLEADRRRMTTFVT EGADKNYERVYFRLKEERTELSRYFERTPFIPSNKIDREKRCDEILTIQAMGLKKRLAHT HCQSAVVGISGGLDSTLAVLVTARAFDMLKIPRENIVCVTMPCFGTTDRTYSNAVTLTKK 
LGASLREIRINKAVEQHFSDIGHDPEIHNVTYENSQARERTQILMDIANQTNGMVIGTGD MSELALGWATYNGDHMSMYAVNCSVPKTLVRHLVRYYADTSEEQELKEVLLDILDTPVSP ELLPPVDGVISQKTEDLVGPYELHDFFLYYMLRFGFHPEKIFRLTRQAFGEEYDVATCYK WLRTFCWRFFAQHFKRSCLPDGPKVGSIAVSPRGDLRMPSDASSAVWMSELDELKEKLGL >gi|222441910|gb|ACEP01000032.1| GENE 4 4165 - 4593 292 142 aa, chain - ## HITS:1 COG:no KEGG:Closa_1014 NR:ns ## KEGG: Closa_1014 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 141 1 143 146 108 44.0 5e-23 MSAKTKIVVVKAKELIYTALFVCLGIILILLLIFMFAPEEKSTKTSAGVYTPGVYTSTIT LGDNTLDVAVSVDADHITSVSLNHLSDSVSAMYPLLEPSLEEINAQLEQISSIDDLKLNA ENKYTGLILQQAIKNALSKAKS >gi|222441910|gb|ACEP01000032.1| GENE 5 4768 - 9222 4951 1484 aa, chain + ## HITS:1 COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 9 1475 6 1448 1452 1332 47.0 0 MLEGQTFIEVFPDYTVPDNLKYVLSNAIVSKVVMKQQTRQLVVHLRSEHIIPRKVLNKVA YDMKKELFGKTSVFISFDDTYELSSAYTLESITNDYWNSILHEVKKMGRVEYSLLSNGEW GFEGNIMTILLEDSFLARDKSRVLKEYLEHLFSHRFGKEIQVGFDFTDDAKQAFYKARNH KLNLEVDMVMSQIKKVESEKKDSDAEDGEGSKEEKKPQRSYSGNFFEERARRQEFSQRRR SKDPDVIYGKDCDGEVVPLEEIVDEIGQVVVRGQILKMELREIRRERTIVSFHITDFTDT IVGKIFIANDQLPEFLDLFEKGKFYKVKGMPRMDTFEHDLTLSSISGIKPIPDFTEKRMD KSLEKRVELHLHTVMSDLDSVVDIKKVINQAKAWGHPAMAITDHGVLQAFPIANHCITMD EPFKIIYGVEGYFVNDLKKLVTNDKGQTLLDDYVVFDLETTGFSPIHDAIIEIGAVKVSK GKISDHYSVFVNPQRPIPLRITELTSIDDSMVADAKSIEEILPEFLSFCEGCSLVAHNAE FDVSFIEENAKRQGFETDFTVLDTVQMARLLLTDLNKFKLNTVCKRLNIKQEHHHRAVDD ARVTAEVFLRFVEMLEEQDVHTLAKLNDMGAMSPDLIKKAPSYHGIILVKNETGRINLNR LVSASHLDYFNRRPRMPESLIQKYREGLILGSACEAGELFQAVIRGKSEEELAELVRFYD YLEIQPIGNNMFMLESDRYEAKTVEDLQNYNRKIVELGERYHKPVVATCDVHFLNPEDEQ YRRILMAGKGFSDADNQAPLYFRTTEEMLEEFAYLGEAKAKEVVITNTNKIADMIEKISP VHPDKCPPVIPKSDETLTKICYDKAHEIYGPDLPDIVEERLQRELNSIISNGFAVMYIIA QKLVWDSNDHGYLVGSRGSVGSSFVATMAGITEVNPLSAHYICPKCHYVDFDSDAVRPYS 
KKGMSGCDMPDRDCPVCGTPLQKEGHDIPFETFLGFKGNKEPDIDLNFSGEYQSKAHKYV EVIFGEGKAFRAGTIGTLAEKTAYGYVYKYFEERDIHKRRCEMERLAEGCTGVKRTTGQH PGGIIVVPHEHEIYEFTAIQHPANDVNTDIITTHFEYHSIDHNLLKLDILGHDDPTMIRR LEDLTGMDATTIPLDDPEVMSLFHSTDILGITPEDIGGIEMGSLGVPEFGTEFVMQMLKD TNPKCFSDLVRISGLSHGTDVWLGNAQTLIEEGKAEISTAICCRDDIMTYLINMGLDKEQ SFTIMESVRKGKGLKPEWEEEMTSHGVPDWYIWSCKKIKYMFPKAHAAAYVMMGWRVAWF KVHKPLAYYAAYFGIRASAFSYEIMCRGKERLDIALTELLKTPKDQQTKKDQDTIRDGRI AQEMYARGFEFMPIDIYKAKARECQIIDGKIMPPLSSVEGLGDKACEQVEEAAKHGPFTS LENFRNQAKVSKTNVDKMVELGILTGLPETDQLTFDFFMAQSQV >gi|222441910|gb|ACEP01000032.1| GENE 6 9249 - 9743 655 164 aa, chain + ## HITS:1 COG:BMEI0736 KEGG:ns NR:ns ## COG: BMEI0736 COG0663 # Protein_GI_number: 17987019 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Brucella melitensis # 7 156 14 162 175 158 52.0 4e-39 MSYIKNADIQGNAWVAPGACVVGNVTLGDESSVWYNAVLRGDMAPIVVGCGSNVQDGTVV HADNGFPCKIGNGTSIGHNAIIHGCTIGNNTVIGMGAIIMNGAQVGSDCIIGAGSLVTQG TVIPDGMLAFGSPAKVIRPLTDAEKKENQTNGISYMLMKNVQKG >gi|222441910|gb|ACEP01000032.1| GENE 7 9828 - 10820 1084 330 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026307|ref|ZP_03715499.1| ## NR: gi|225026307|ref|ZP_03715499.1| hypothetical protein EUBHAL_00548 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00548 [Eubacterium hallii DSM 3353] # 1 330 14 343 343 582 100.0 1e-164 MSDTTNTRKYKGYELDQTTSHRRNQNRPWFWKVLGISLLLTVLILAGVVFGLLHRTHKNA KHMEEINSANTVEALLKEHKLVTITASYSHLAEGSDYTTTRQIKKDKKGNYFSYFKKEGS DDDYKEVISNKELYRYNGKYTQYFGLVGNDYEEECVKAIEDSIFQGDMKDQIQNEKERDS TITLQLTTKVQSGDDYDTEYGFSAGDTIEKTIIIDKKTQLITSVEEKCGDEVFYGYTVEF DGEEKIPQFYKDLKKEKEKRVCTVYSDYDGEKNQEYTFNLPIDLYFTVLDHDGYKVYEDE DGTKEFTHYQMEVQNPDSDLSLYIKAESSK >gi|222441910|gb|ACEP01000032.1| GENE 8 11030 - 12193 1202 387 aa, chain + ## HITS:1 COG:CAC0906 KEGG:ns NR:ns ## COG: CAC0906 COG2872 # Protein_GI_number: 15894193 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases 
related to alanyl-tRNA synthetase HxxxH domain # Organism: Clostridium acetobutylicum # 3 369 2 370 387 170 32.0 3e-42 MKTRRLFYEDVYIREFEAIVLSCEERQEGYYIILNETAFYPEGGGQPADKGTIEGEEVLD VQYVDGEICHIMKKPLKAGEQVVGRINWEWRFDLMQQHSGEHIVSGMIHEKYGYENVGFH MGEEVITIDLSGMLTEEQVAEIEQKVNDYIWMNQEVNIFYPDERQRKLIPYRSKKELTGE IRLVEFPGADLCACCGLHVTHASQIGLVKILSSKKFRDGVRMEMLSGKRALEYLSKSAEQ NSQVAVRLSVKEKETKDAVQRLLDEVYNLKGELAAEKQKHFEQIAASCVGKENVLLIEGD MEPVEVRKCTDAILDTCSGVAAFFAGNDKDGYKYAIGQREGDVRELVKQVNKELNGRGGG KPFFAQGSLKATRKQIEIFFEKKVNFQ >gi|222441910|gb|ACEP01000032.1| GENE 9 12394 - 13485 1267 363 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026309|ref|ZP_03715501.1| ## NR: gi|225026309|ref|ZP_03715501.1| hypothetical protein EUBHAL_00550 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00550 [Eubacterium hallii DSM 3353] # 1 363 1 363 363 538 100.0 1e-151 MFKNKLKRLCAVTLAGMMVVSVSACSTSKSAEKNSVSKAEQASEATTEAAKDDTTADYVA SGVELPDNFENMIYPIEALMLQNYSKGYPYYANGASEEKSDSFWFSMAALTSLMENESAY GDGIEMDSYYYLKENTSNMYASALYNAYGMGNMEFPEIPDGDKYATYDDGKQMYGFLKGE VSGLVPYITNCVKDGSDYIITSQLRDKDTDEVKGTYEVTITASSFDGDDNCFAYSATAIR QIGETDSENQSTSSESEEASTDSTSENGDSTDADSSTDSTDATDSTDSTEGISQDDALSQ AKDYYGEDAEYSYKKQVTVGDKEYYDFSVSGDNVSSTDVLVSVDGQDVIGGTQNDDGSWS FDQ >gi|222441910|gb|ACEP01000032.1| GENE 10 13842 - 14217 148 125 aa, chain + ## HITS:1 COG:alr2719 KEGG:ns NR:ns ## COG: alr2719 COG0675 # Protein_GI_number: 17230211 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 3 124 267 387 452 124 49.0 6e-29 MQKGSNNRNKQRLKVARLHEKVSNQRKDFLHKQSRQITNAYDCVCIEDLDMKAMSRLLNF GESVSDNGWGMFTTFLRYKLEEQGKKLVKVGRFFTSSQTCSVCGYKNAKTKNLAIREWDC PQCGI Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:53:59 2011 Seq name: gi|222441909|gb|ACEP01000033.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont36.1, whole genome shotgun sequence Length of sequence - 52513 bp Number of predicted genes - 45, with homology - 41 Number of transcription units - 27, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 39 - 72 0.6 1 1 Tu 1 . - CDS 214 - 1665 1876 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases - Prom 1873 - 1932 8.6 + Prom 2097 - 2156 4.8 2 2 Tu 1 . + CDS 2176 - 2883 519 ## Nther_2618 polymorphic outer membrane protein + Term 3058 - 3099 -0.2 3 3 Tu 1 . - CDS 3135 - 3659 516 ## COG0602 Organic radical activating enzymes - Prom 3776 - 3835 8.9 - Term 3969 - 4019 14.1 4 4 Tu 1 . - CDS 4037 - 6199 2298 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase - Prom 6235 - 6294 5.5 - Term 6300 - 6347 8.2 5 5 Tu 1 . - CDS 6413 - 7555 1262 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 7600 - 7659 5.6 - Term 7618 - 7669 2.0 6 6 Tu 1 . - CDS 7676 - 8368 574 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 8460 - 8519 4.8 - Term 8811 - 8873 3.6 7 7 Tu 1 . - CDS 8938 - 9972 1199 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase - Prom 10103 - 10162 7.6 + Prom 10158 - 10217 11.0 8 8 Tu 1 . + CDS 10277 - 12160 1266 ## COG0542 ATPases with chaperone activity, ATP-binding subunit - Term 12234 - 12262 -1.0 9 9 Tu 1 . - CDS 12338 - 12550 143 ## - Prom 12592 - 12651 6.8 + Prom 12589 - 12648 6.2 10 10 Tu 1 . 
+ CDS 12686 - 13084 104 ## gi|225026320|ref|ZP_03715512.1| hypothetical protein EUBHAL_00561 + Term 13239 - 13283 10.1 + TRNA 13143 - 13227 56.8 # Leu CAG 0 0 - Term 13228 - 13270 10.5 11 11 Tu 1 . - CDS 13309 - 13884 449 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding) - Prom 13913 - 13972 5.7 12 12 Op 1 4/0.000 - CDS 14191 - 14553 460 ## COG3862 Uncharacterized protein with conserved CXXC pairs - Prom 14583 - 14642 3.4 13 12 Op 2 6/0.000 - CDS 14721 - 16037 1981 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 14 12 Op 3 3/0.000 - CDS 16049 - 17488 1679 ## COG0579 Predicted dehydrogenase 15 12 Op 4 1/0.000 - CDS 17518 - 19002 1931 ## COG0554 Glycerol kinase - Prom 19172 - 19231 4.8 - Term 19067 - 19131 20.3 16 13 Op 1 10/0.000 - CDS 19267 - 19917 997 ## COG2376 Dihydroxyacetone kinase 17 13 Op 2 9/0.000 - CDS 19933 - 20949 1398 ## COG2376 Dihydroxyacetone kinase 18 13 Op 3 . - CDS 20975 - 21373 618 ## COG3412 Uncharacterized protein conserved in bacteria - Prom 21428 - 21487 9.4 - Term 21419 - 21463 3.1 19 14 Tu 1 . - CDS 21527 - 22300 780 ## COG0149 Triosephosphate isomerase - Prom 22338 - 22397 5.8 - Term 22490 - 22556 25.4 20 15 Tu 1 . - CDS 22663 - 23763 1325 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) - Prom 23884 - 23943 8.5 21 16 Op 1 . - CDS 24137 - 25837 1246 ## COG3044 Predicted ATPase of the ABC class 22 16 Op 2 . - CDS 25916 - 27529 1513 ## gi|225026333|ref|ZP_03715525.1| hypothetical protein EUBHAL_00575 23 16 Op 3 . - CDS 27542 - 29704 1980 ## COG0550 Topoisomerase IA - Prom 29755 - 29814 5.2 - Term 29853 - 29897 0.0 24 17 Op 1 24/0.000 - CDS 29975 - 30754 661 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 25 17 Op 2 17/0.000 - CDS 30754 - 31497 278 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 26 17 Op 3 . 
- CDS 31561 - 32505 1213 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 32607 - 32666 5.9 + Prom 32789 - 32848 14.1 27 18 Tu 1 . + CDS 32885 - 33910 552 ## COG1609 Transcriptional regulators + Term 33914 - 33969 -1.0 - Term 33778 - 33809 -0.7 28 19 Tu 1 . - CDS 33923 - 35293 1591 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 35474 - 35533 4.9 + Prom 36125 - 36184 8.6 29 20 Op 1 . + CDS 36231 - 37055 573 ## gi|225026340|ref|ZP_03715532.1| hypothetical protein EUBHAL_00582 30 20 Op 2 . + CDS 37114 - 37674 529 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Prom 37812 - 37871 9.9 31 21 Tu 1 . + CDS 38094 - 38201 94 ## 32 22 Tu 1 . - CDS 38358 - 39071 492 ## Closa_2635 hypothetical protein - Prom 39177 - 39236 8.2 33 23 Op 1 4/0.000 - CDS 39287 - 39823 378 ## COG0700 Uncharacterized membrane protein 34 23 Op 2 . - CDS 39811 - 40440 403 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis. 35 23 Op 3 . - CDS 40458 - 40577 128 ## 36 23 Op 4 . - CDS 40621 - 41745 1272 ## COG0787 Alanine racemase - Prom 41887 - 41946 7.1 - Term 41926 - 41968 8.0 37 24 Op 1 . - CDS 41989 - 44286 2352 ## HMPREF0868_0283 hypothetical protein 38 24 Op 2 . - CDS 44299 - 45345 1074 ## EUBELI_20117 glycoside hydrolase family 13 candidate A-glycosidase (amylase-related) - Prom 45445 - 45504 7.4 39 25 Op 1 1/0.000 - CDS 45553 - 46395 928 ## COG0313 Predicted methyltransferases 40 25 Op 2 1/0.000 - CDS 46397 - 47146 722 ## COG4123 Predicted O-methyltransferase - Prom 47263 - 47322 10.9 - Term 47177 - 47226 0.1 41 26 Op 1 7/0.000 - CDS 47357 - 48256 806 ## COG1774 Uncharacterized homolog of PSP1 42 26 Op 2 22/0.000 - CDS 48275 - 49297 868 ## COG0470 ATPase involved in DNA replication - Prom 49320 - 49379 2.2 43 26 Op 3 4/0.000 - CDS 49396 - 50025 644 ## COG0125 Thymidylate kinase - Prom 50083 - 50142 2.7 44 26 Op 4 . 
- CDS 50163 - 51701 714 ## COG1982 Arginine/lysine/ornithine decarboxylases - Prom 51788 - 51847 6.0 - TRNA 51844 - 51916 81.3 # Lys TTT 0 0 - TRNA 51920 - 51992 82.6 # Phe GAA 0 0 - TRNA 51998 - 52071 68.0 # Met CAT 0 0 45 27 Tu 1 . + CDS 52093 - 52242 78 ## - TRNA 52094 - 52175 49.9 # Tyr GTA 0 0 - TRNA 52208 - 52280 82.7 # Thr TGT 0 0 - TRNA 52286 - 52358 84.6 # Val TAC 0 0 - 5S_RRNA 52301 - 52352 92.0 # AF302131 [D:490..741] # 5S ribosomal RNA # Streptococcus agalactiae # Bacteria; Firmicutes; Lactobacillales; Streptococcaceae; Streptococcus. - TRNA 52394 - 52467 85.9 # Asp GTC 0 0 Predicted protein(s) >gi|222441909|gb|ACEP01000033.1| GENE 1 214 - 1665 1876 483 aa, chain - ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 3 483 5 485 485 627 64.0 1e-179 MAKIRTRFAPSPTGRMHVGNLRTALYTYLIAKHEGGDFVLRIEDTDQGRYCEEALGIIYR TLEKTGLKHDEGPDKDGGFGPYVQSERVASGLYMDYAKQLIEKGEAYYCFCDQERLDSLK QNVGGKEISVYDKHCLHLSKEEVEANLAAGKPFVIRANMPNEGETTFHDEIYGDITQPNE ELDDMILIKSDGYPTYNFANVVDDHLMQITHVVRGNEYLSSAPKYTRLYKAFGWEEPKYI HCPLITNEEHQKLSKRSGHSSYEDLIEQGFVPEAVVNFVALLGWSSPDNREIFSLEELVE AFDYKHISKSPAVFDMVKLKWMNGEYLKAMDSDTFYNMAEPYIKEVIKKDMDLKKIAALV KTRIEVFPEIGEMIDFFEAVPEYDVDMYTNKKNKTNAEKSLPVLQELLPVIEAQEDFTND ALYQMLRGFADEKGYKPGFVMWPVRTAVSGKRMTPGGATEIMEIIGKDETVSRIKAAIEK LSK >gi|222441909|gb|ACEP01000033.1| GENE 2 2176 - 2883 519 235 aa, chain + ## HITS:1 COG:no KEGG:Nther_2618 NR:ns ## KEGG: Nther_2618 # Name: not_defined # Def: polymorphic outer membrane protein # Organism: N.thermophilus # Pathway: not_defined # 50 228 29 199 863 67 29.0 3e-10 MLSLLNIDYSSMISTLPLDSGSSDSLSNFINNGLVGLVTEANTTYKIAGSVIALILALIG CFFGYKLSRLFMGITGFIAGAIIGQIVASQILHVEGFASVLCIILCGAFIASLAFWIYRI GIFILCFALAFSAAGTLFPFEGDVQFFANVITGLIVGVLAVKYMRPVIILTSAIVCGTSA AGLLPGVTEYMGITTLSSLNSSAALTLALCVLGIAVQFLTTKEPVKKTVPRHKHG 
>gi|222441909|gb|ACEP01000033.1| GENE 3 3135 - 3659 516 174 aa, chain - ## HITS:1 COG:SPy2105 KEGG:ns NR:ns ## COG: SPy2105 COG0602 # Protein_GI_number: 15675857 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Streptococcus pyogenes M1 GAS # 5 173 26 187 204 169 49.0 2e-42 MYYGEIKKCDIANGPGVRVSLFVSGCTHHCKGCFNEETWSFEYGKPFTEETEEQILKLLE PSYISGLTVLGGEPMEYMNQEVLFPFLKKVKERFPEKSIWCFTGYVYEKDILHRMIPKWK YTKKMLEQIDYLVDGPFVMEKKNISLKFRGSSNQRIIDVKKTLETGEVVLWEEQ >gi|222441909|gb|ACEP01000033.1| GENE 4 4037 - 6199 2298 720 aa, chain - ## HITS:1 COG:ECs5215 KEGG:ns NR:ns ## COG: ECs5215 COG1328 # Protein_GI_number: 15834469 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Escherichia coli O157:H7 # 3 720 5 706 712 446 37.0 1e-125 MKIIKRSGMEVEFDRSKIIAAVKRANNSVVEGEKVSDEQVEKIAENVETICEGMNRALNV EEIQDLVENQIMNQKAFRLASNYITYRYKRALVRKSNSTDKQILSLLECNNEEVIQENSN KNPTVNSVQRDYMAGEVSKDITRRFLLPQKIVDAHEAGIIHFHDSDYFAQHMHNCCLVNL EDMLQNGTVISDVMIEKPKSFSTACNIATQAVAQIASSQYGGQSITLSHLVPFVEISRQK YRRDVRAEFEVEGMELDEQKINEIAEMRVRKEVKQGVQVIQYQVITLMTTNGQAPFVTVF MYLDEVEEGPARDDLAMIIEEMLNQRILGVKNEKGVYITPAFPKLIYVLEEDNIHENSKY WYLTKLAAQCTAKRMVPDYISEKIMKKLKDGNCYPCMGCRSFLTVYHDENDNPKFYGRFN QGVVTLNLVDLACSSGGDMDKFWEIFDERLELCHEALMYRHNRLKGTPSDVAPILWQNGA LARLKKGETIDELLYNGYSTISLGYAGLCECTRYMTGKSHTDPEAKPFALKVMQHMNDAC NKWRAESNIDFSLYGTPLESTTYKFARCLQERFGMIPGVTDKNYITNSYHIHVTEEIDAF DKLSFEAQFQELSPGGAISYVEVPNMQNNIEAVLAVMKHIYENIMYAELNTKSDYCQCCG YEGEIQIITDEHGKLIWECPNCGNQDQAKMNVARRTCGYIGTQFWNQGRTQEIKERVMHL >gi|222441909|gb|ACEP01000033.1| GENE 5 6413 - 7555 1262 380 aa, chain - ## HITS:1 COG:BH3665_2 KEGG:ns NR:ns ## COG: BH3665_2 COG0860 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 196 357 6 160 180 103 42.0 5e-22 
MKAKRVVGVLAALLLCICMIMPVNTDAAAQVRVRTVKGVTSSYTGRKSYCYVNGQKRKLT KYPIFKKSGAYMGPVGAILKNSKLKVKATAKGDKLTLTYGPNTVIVRADSRIAVTNGQKS TMGAPVVHGTYTATGKRRWIVPLNSVCTRLGINYKLSKGRIYISGTTQSSSNNTTGSTTT TTTKPSTTSSKDKIKIVIDAGHGGSDSGATGNGMAEKNLTLAIVLAAKRSFDKDSRFQVS YTRTSDTYPSLSQRAKLANNKNADMFLCVHINSASASAHGTETLWSKSRNSATQKKGLTS KTLAKAMQSAAVAATGFTNRGLVDRPNLYVLKHTNMPACLIEYGFISNKKESARMKANTS AYGKALYKAVVNLMKKQGKY >gi|222441909|gb|ACEP01000033.1| GENE 6 7676 - 8368 574 230 aa, chain - ## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 8 230 1 222 224 130 35.0 1e-30 MKGRCRMLTSRLLILNQKEAECKEMAAYFQDSGYSVIWCTEYDEATEILSRGKNKPDLLI FDVDLPKRREYEALEKIRSLSDMAVLMLSQDDKLDSQLYAYSKKIDDYMVKPAPLPLLEA HVEAVIRRTAEKRNPVESVGALSIDYEGRKIYLADKVLKVTAKEFDLLEYFVKHKGMILS RDKILDSVWGFDYIGGYRSVDTLVKKLRAKLTKEYPYIKTVYGVGYCFDV >gi|222441909|gb|ACEP01000033.1| GENE 7 8938 - 9972 1199 344 aa, chain - ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Streptococcus pneumoniae TIGR4 # 28 340 26 337 343 364 55.0 1e-100 MDMDMKFFRKLPVPKELKEQFPADERIVKIKQERDPEIRRIFEGKSDKLLLIIGPCSADR EDAVLDYLTRLSKVQEKVKDKIMIIPRVYTNKPRTTGGGYKGLVHQPDPEKKPDMLQGII AVRELHQRAILESGLTCADEMLYPENHRYVSDLLSYVAIGARSVEDQQHRLTASGVGIPV GMKNPTGGDLSVMMNSITAAQGQHVFLYRGWEVQSLGNPYAHALLRGYVDKHGKTYSNYH YEDLNELFELYQESSLKNPGVIIDTNHANSGKHYLEQIRIAKEVMHSCHLSKDIRGLVKG LMIESYIEDGNQPVGGGCYGKSITDPCLGWEKSERLIYEIADLL >gi|222441909|gb|ACEP01000033.1| GENE 8 10277 - 12160 1266 627 aa, chain + ## HITS:1 COG:BH2179 KEGG:ns NR:ns ## COG: BH2179 COG0542 # Protein_GI_number: 15614742 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: 
ATPases with chaperone activity, ATP-binding subunit # Organism: Bacillus halodurans # 303 539 46 283 711 185 43.0 3e-46 MDSTTSTITEAYSMDSAANTSIESYTMDSAASTITETYSIDSIADTLAKAYSIIEEGGQP LASETTCIQYIFLSPSHIEAFHNNELYLEYIQNLVKERFNVSFPEEKSTFDQYFSDCIQL KEQLESHGNFDKTVWLSYTTSLLESLKTPYYSVYSFTLLTLESCFTDPQLFDKVQKSFYS VLIKSSEKKQSIIRKIIEHAPVYKKYSYIKENLLSDILINTLALFGDEKEYWLASFINLF ENNNAVHSLPKNTSNKEWNTPSLLMTFFEWEELFLKNRASKDFSSFSEYLSSRTLYFYKH AGTTSDNLLKNATSNASVTSAGTSASIDSAILSNIENAYVDSEFLNTFAYNMTTRLYHAN PAIGRDQEISDLELILISPKKSPLLIGEAGVGKTSVVEGLAYRLQRGTVPDLLKNKKIFK LTTTSLLSGTKYVGEMEDRIKKLAGELEAHPDVILFIDEIHTIVGAGSTESSNNDISNML KPYIDRGDIKIIGATTREEYTRFLLPDKALTRRFYPISIEEPDEELTLSILSGSIPSIEY ETKVKNTFSANTTERILRTLISISMPENQPDDQPAKRPELPLTLLEMAFSYAALSGKTAL SCEYIEQAVHHSNRLRKEIRTNFTCAL >gi|222441909|gb|ACEP01000033.1| GENE 9 12338 - 12550 143 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINLQKIIDYLEQHNISEETFMISTGLSKSSLIKAKGSLSADDYLTICSTLGVSPWFFYE RELTEGSSDS >gi|222441909|gb|ACEP01000033.1| GENE 10 12686 - 13084 104 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026320|ref|ZP_03715512.1| ## NR: gi|225026320|ref|ZP_03715512.1| hypothetical protein EUBHAL_00561 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00561 [Eubacterium hallii DSM 3353] # 1 132 1 132 132 229 100.0 3e-59 MRNDNIARSLRYYRNVNHKSIDDIIAFLREHDYDVSEKTIYAWENGSNQPRADILMLLCE YYQIDDVLTAFGYRDAETSFLQDPLSVSEISLIKSYREHKEMQPAVHKLLDLPAPVTRTR RKRKPKKSNPEV >gi|222441909|gb|ACEP01000033.1| GENE 11 13309 - 13884 449 191 aa, chain - ## HITS:1 COG:TM1431 KEGG:ns NR:ns ## COG: TM1431 COG1954 # Protein_GI_number: 15644182 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Thermotoga maritima # 5 185 13 192 195 82 27.0 5e-16 MENEKNTSNFKIIPSVKEVKYLKKAIQSDNLCIQLTGVHIGNLQQLSHICHQAGKTVIVN HELVDGLGKDRIAFQMLKKLYHVDGIIGSSITKLHMMKGLNVKVIYRITLMDSISVDNAL KTINEVKFDAIELRPYYHAIEFLPTFKKVWDGEYYVAGFVNTEEKLKKCKEAGFSGAMTS TVELWNKKFEK >gi|222441909|gb|ACEP01000033.1| GENE 12 14191 - 14553 460 120 aa, 
chain - ## HITS:1 COG:CAC1324 KEGG:ns NR:ns ## COG: CAC1324 COG3862 # Protein_GI_number: 15894604 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Clostridium acetobutylicum # 3 119 2 114 117 96 47.0 1e-20 MKEVTMCCTTCPSGCALKVMIDNNEVVSVEGNNCPRGEKFAHKEWINPERMLTSTVYAVI NGKELLIPVKSKEPIPRKIMTEAMKEIQEIEVNAPVKMGDVIKENLAGSGVELVACKTVK >gi|222441909|gb|ACEP01000033.1| GENE 13 14721 - 16037 1981 438 aa, chain - ## HITS:1 COG:CAC1323 KEGG:ns NR:ns ## COG: CAC1323 COG0446 # Protein_GI_number: 15894603 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 28 437 26 413 417 386 50.0 1e-107 MRTLKYDVVVIGGGPAGLAAALKAREKVEKVAVLERNDELGGILNQCIHSGFGLHVFKEE LTGPEYAQRYVDMVAETDISCYLNTMVLDVTEDKHVVAMSEDGMLVFEASSVVLAMGCRE RTRGQIRIPGTRPAGVFTAGLAQRYVNLENFKPGSRVVILGSGDIGLIMARRLTLEGIKV VGVYELMPYANGLYRNIKNCLDDFGIPLHLSTTVTKIVGSKRVQAVEVSAVDENLKPIPG TEEIIPCDTVLLSIGLIPENELSKKAGIELNPVTNGPKVDNALETSVKGIFACGNVLHVH DLVDQVTKEAEDAGTYAAEYAAECAAAQQADAAVSTDVETAAEACGTSNAETEIAVIPEG MVRYTVPTRICANRTPVTKIKFRVGRPVETAKIRITSGDKVIFESKKAKKFVPSIMENVN LTAAQLADVQDEIKVTLV >gi|222441909|gb|ACEP01000033.1| GENE 14 16049 - 17488 1679 479 aa, chain - ## HITS:1 COG:CAC1322 KEGG:ns NR:ns ## COG: CAC1322 COG0579 # Protein_GI_number: 15894602 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Clostridium acetobutylicum # 3 478 2 475 475 348 41.0 1e-95 MKYDVVIIGGGITGTGIAHELAKYQLKTILLESGTDVAFSATKQNGGVIHPGYDPHPGTL KAKLNPPGARMYPRLSKELGFKILPTGTLVVAYSDQDLKKVDELMDNARINGVEKVERLD FEQLHNREPHISDKALGALLANTTVMVDPFEVAIAFMENAMQNGVELGLCQKVRKIEKRA EEDFVVYTQDRQYETRFIVNAAGVHADDVAAMAGIHEYQVEGRHGNLCVLDKVLPIHTVM FPCPGPDTKGIALIPTVSGNFLIGSTATMREDKYDVTNDAHGIDELIKGAKMLLPDFDPR CIIRTFAGQRPVVLNNGNDFYIRESEIVKGFIHAAGIQSPGIASSPAIAEYVRDLLANAG LDLKDKSDYNPYREPIPDFSDLSLEEQDALIKKDPAWGKIVCRCETVPEGEIIAAIHSTL GARTVEGVKRRTRAGMGRCQSGFCQYKVMQILSRELGIPEEQVLFEEQGAQVLYGKIKG 
>gi|222441909|gb|ACEP01000033.1| GENE 15 17518 - 19002 1931 494 aa, chain - ## HITS:1 COG:TM1430 KEGG:ns NR:ns ## COG: TM1430 COG0554 # Protein_GI_number: 15644181 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Thermotoga maritima # 4 482 3 482 482 587 57.0 1e-167 MANYILVMDQGTTGSRGVVYDEKGKTVAIDYEEYEQFFPNSGWWEQSGKTVWEVTLRTSR NAIKKAGLTGKDIAAIGIANQRETTTIWDKATGEPIYNTIVWGCRRTADICAKLIEEGYN DTVNKKTGLVIDPTYSGSKIRWILDEVEGAQERAEKGELLFGTIDCWLLYNLTKGKVHAT DYSNAARSMVMNAHELKWDDELLSMLNIPKCMLPEIKPSVGIFGYTDPEWFGAEIPITGI AGDQSAALFGQACFEEGAAKNTYGTSAVPLMYTGDVAPETEKGLLTVAWGKDGKVKYSMG ASILIAGQVIQWLRDKINLVDHAAKTSVMAESIPDNGGIFFVPAFTGLGAPHWNTGARGM IIGITAATSKEQITRAALESMAYQTRDLLEEMERNSGIKLKELKVDGGASKNDFLMQFQA DILNCRVVRPKDIETTALGACYLAGLGCGIFKSEDEIKELWEADRVFEPQMDEETRENLY AQWCKAVEKSKDWL >gi|222441909|gb|ACEP01000033.1| GENE 16 19267 - 19917 997 216 aa, chain - ## HITS:1 COG:lin0365 KEGG:ns NR:ns ## COG: lin0365 COG2376 # Protein_GI_number: 16799442 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Listeria innocua # 1 214 1 212 216 178 46.0 7e-45 MSEYMLSKDYFVQVIRDLIVLAEDRKEYFTDLDSAIGDGDHGINLSIGFREVNKNIEDWK GLSVRDFYNKVGTALLDKVGGSSGPLYGSFFMKFGLPIKTKGPDEGATFEEFIAMMEKGV EIIKKRGKSTTGEKTMLDTFVPAVETLRKEYEGGTPAKEAMEKAVEAGKAGLESTRNIVA TKGRAMRLGERAIGHLDPGAASAAEILEVFYKDMPN >gi|222441909|gb|ACEP01000033.1| GENE 17 19933 - 20949 1398 338 aa, chain - ## HITS:1 COG:lin0366 KEGG:ns NR:ns ## COG: lin0366 COG2376 # Protein_GI_number: 16799443 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Listeria innocua # 4 335 1 328 331 382 60.0 1e-106 MAKIKRLFNDPYDIVEEMVEGYVGAHKQYVKMCDLDEAQGRVVLANDAGTKDKVGVIIGG GSGHEPLFIGYVGEDFADAVVIGNINTSPSPDPCYAAAKACDNGKGCIYLYGNYAGDVMN FDMGAEKADEEDDIRVETVLVTDDVVSSENIPDRRGIAGDFFVFKVAGAKAATGADLDEV VAAAQKANDNTRSMGVAMSSATLPAKGGTIFDMEDGDMEIGMGIHGEPGIRRGKIDTADN VIDEIMEPILKDLPYVEGDEVYVLVNSLGATPLIDLHVCYRRVAQILEEKGIKVYKALVG 
PFACSMDMAGMSVTLMKLDDELKELMDAPCDTPYFTQK >gi|222441909|gb|ACEP01000033.1| GENE 18 20975 - 21373 618 132 aa, chain - ## HITS:1 COG:lin0369 KEGG:ns NR:ns ## COG: lin0369 COG3412 # Protein_GI_number: 16799446 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 6 130 4 125 125 84 41.0 6e-17 MAKFGVVLVSHSEYIAKGLKELVDEMNDGSVQVVAAGGADGGRIGTSAIKIQGAIESVED CDHILIYADLGSSILSAETAIDLIDEDLAEKVQLVDAPIVEGALAGVVQGTISDDVADVI KASEDARNVHKN >gi|222441909|gb|ACEP01000033.1| GENE 19 21527 - 22300 780 257 aa, chain - ## HITS:1 COG:PM1640_1 KEGG:ns NR:ns ## COG: PM1640_1 COG0149 # Protein_GI_number: 15603505 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Pasteurella multocida # 3 257 2 259 270 290 59.0 2e-78 MAKKKIYFGTNTKMYKTIKDTVEFVSQLQELTKDISREDMQLFVIPSYTTLRDANEAKDE DLLMVGAQNMGWEEQGQFTGEISPLMLQEVGTDIVMIGHSERRHVLGETDEEENKKVLCA LNHNFTTLLCVGETGEQKDYGISEEVIRIQLKKGLYGVTEKQTEKLWIAYEPVWAIGVNG KPATKEYAEQIHIVIRETLVELFGEEAGNEIPVLYGGSVNPENAVGLSKMEHIDGLFIGR SAWQADNFNKIIRDVLK >gi|222441909|gb|ACEP01000033.1| GENE 20 22663 - 23763 1325 366 aa, chain - ## HITS:1 COG:SA1499 KEGG:ns NR:ns ## COG: SA1499 COG0544 # Protein_GI_number: 15927254 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Staphylococcus aureus N315 # 55 359 116 420 433 89 25.0 2e-17 MIKKKIFIATAVFASICMLGGCATSKSSKKAAEATTEAAQQAKATPVKLNASEYVKLGQY KGLTIKGASTKVTDQDVEDQVNELAHDNASYEEIKDRKTVQKDDYLNVDYTTTINGKENS DYSDSNLDMHLGDGNLNVDENVDVDEKLIGAKVGDTVTIEFTFPEDYDDSSIAGKKCELA VSINMIEKEVIPEVNDALVKENTDCKTVKEYKKQVRDSLVSDKKSEAEQTNQETLWNKIM DNATQLKDFSEADIKKEVSNIKIENKEMAGYFGMSVSDFIEQYYEMSLEDYAKENLKKQC VQDLLLKENSIEITDADVDEEIQYYIDELGYKDKKEVLSAISEDEIHSELQYTKLMDALM KETTIK >gi|222441909|gb|ACEP01000033.1| GENE 21 24137 - 25837 1246 566 aa, chain - ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 
15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 11 563 8 546 549 377 39.0 1e-104 MKNKEDLKKELIKIDHKSYGMYKTLGGSYSFGNYILYIDHVQGDPFASPSRLHFEVKRDR HGFPEEYYQEKHRRLALEDQVLRRFLYELRQIDKGFMGSGKSGRITICPANQTVQERIAV VFSKEKMELRFEMGFPARGRTILAKEMQKLVFDILPQLAENTLFYRNWDTKNKKYLEQAI FLADDQKVLREELKKRNLTAFVADGAVLPRESGVSDRPMRGAVPFASPESMRIEVELPHK GKVTGMGIPEGITVIVGGGYHGKSTLLKALEQGVYNHICGDGREYVVADNSGMKIRAEDG RNVLHTDISMFINHLPAGQDTTDFSSENASGSTSQAANLIEAVEAGAGLLLLDEDTSATN FMIRDKVMARLVSDEKEPITTLLRHIRGIYRTLGISFVIVVGSSGDYLSVADFVLQMDHY KVKDVTKEAKSICEECGIAGQYPENEITVPAFSRKLKPVKPGRRKIKSMGTDTVLIDREA IDVRYLEQLSENGQTTALAHMMSWILDNVKAEEDIQHIIERMYTIIEKRGMEAIIPASCS GGHPVLPRKQELYACLSRYRSAKIVR >gi|222441909|gb|ACEP01000033.1| GENE 22 25916 - 27529 1513 537 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026333|ref|ZP_03715525.1| ## NR: gi|225026333|ref|ZP_03715525.1| hypothetical protein EUBHAL_00575 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00575 [Eubacterium hallii DSM 3353] # 1 537 1 537 537 982 100.0 0 MNIKRIVFASLFCIAVLLCAGKKAEAAFPYDMEQKPVTATSIQDGKVAVYSDENLKNKKE EVAGNNLTIEAYKVSKNGIYGNYKSGKRSGTGWFSLNTFVVNPKYKNVYATVRDHMYIYT SRSYSKVQTTIKKYSGVIVISKVDGDRQVICDKKDHYEIGWMTASAFSNTLLYDGREKQT LADGIYKFRCGYQDDENGGSKNQSAMEAYPEYTFKIMHTTQNQYYIQNVETEKYLAVTFK SASKKAEKGSVGKYVVCWTKQPDKTYGLFQLQRLNGAFSIQNVKSKYFLGQITDTKENSE DSQESDSAKNSEELKVANSGEENQADTSSKNAKQQSNTELAFTLEKSRSTQPTHWRIHAA QKMSNTKKPFIFTQYDPEWCATPYGGGGCMGTAGCGILATVNAVYALNGQYMDVMELANY AVEKNYRIVGSGTDDGIFKAAAKKYGQKYGFAWDGASGSIDVLKKKLKAGDTAIVHVQGH YVSISDYNKKTKKYLLLDSNYLPKRATSAFGDWIGVNRLLSGALESQYFYFFKSSEL >gi|222441909|gb|ACEP01000033.1| GENE 23 27542 - 29704 1980 720 aa, chain - ## HITS:1 COG:BS_topB_1 KEGG:ns NR:ns ## COG: BS_topB_1 COG0550 # Protein_GI_number: 16077493 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus subtilis # 2 568 3 575 575 695 61.0 0 MKKLVIAEKPSVGRDIARVLGCKQQGKKYIEGNEYIVTWGLGHLVTLADPEGYDKKLKTW 
RMEDLPMMPERFKLVVIPQTRGQFQAVKSLIERKDVAEIIIATDAGREGELVARWILAKA GNKKPLKRLWISSVTDKAIRDGFAHLKNAKEYENLYHAATARAESDWLVGINATRALTCK YNAQLSCGRVQTPTLAMIAAREEEIRHFRPKPYYGLQLTGKGITFTWKDKKSGSSATFSK EKNEEIYKNLSGKTAVVAEVKKTKKTSQAEGLYDLTDLSRDANQRFGFSAKQTLNIMQSL YEHHKVLTYPRTDSRYLTTDVAETLRERVEAIAVGPYRSFTNTVLKGKITPKKSFVNNQK VSDHHAIIPTEQPVILAQLSVDERKIFDLVVRRFLAVLYPPYEYEQTKITLECEGETFFA EGNVPLNRGYKEVTAPDGDESGKVFPALKEGEVIRDFSLKKTEGKTKPPARFTEGTLLKA MENPVKYMQQKDAKAAKTLQETGGLGTVATRADIIDKLFGSYLIEKKENEIFITSKAKQL LALVPEDLKKPELTAEWESRLSAIAKGKASDRKFMREIATYTKELMKQIQNGTGTFRHDN LTNKICPECGKRLLLVNGKQGKLYVCQDRECGYKERISRLTNARCPVCHKKMELVGAKDR QKFVCSCGHKESMQAFENRRKKEGAGVSKKDVANYMKKMKKEEKQPLNNALADALANLKL >gi|222441909|gb|ACEP01000033.1| GENE 24 29975 - 30754 661 259 aa, chain - ## HITS:1 COG:CAC0108 KEGG:ns NR:ns ## COG: CAC0108 COG0600 # Protein_GI_number: 15893404 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Clostridium acetobutylicum # 7 259 9 261 265 243 48.0 3e-64 MKKIRKSNIAGEILFLLGIIILWQIFYVAGVDGLGIWKAYAMPSPVGVWNSFVEMMKQGT LLAAIGNSILRGAIGYILSLIIGAVIGILINHFSFLHRNLRPLIMGIQTLPSICWVPFSI LWFGLTQTAIIFVVIMGSAFSMAIAVDNAILNVPPIYKKAALTMGANQKQIYWKVIFPAS LPELISGMKQGWSFAWRALMSGEVMTTSIGLGQTLMTGRNLADINQVMLVMVVIVVVGIL IDQVVFYNVEKRVLKKRGL >gi|222441909|gb|ACEP01000033.1| GENE 25 30754 - 31497 278 247 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 5 209 4 216 245 111 34 7e-24 MALILENVGKTFENTGQATLKNINLEIQPGEFLCVVGKSGCGKSTLLNLIAGLEAPTEGR ILLNGKEVKGPGSERTVMFQEHALFPWLNVIQNIKFGMKINGVPKEEQEIRAEKYLKMMQ LEQYKDYAIHQLSGGMRQRVALARAFTMDSDILLMDEPFSALDKQTSNHLRKELQDIWMQ THKTIFFITHSVEEAVYLGDKVVVMSGENGGIRNIVPINLERPRHVYDEEFIAYRHKILE EIQGGDY >gi|222441909|gb|ACEP01000033.1| GENE 26 31561 - 32505 1213 314 aa, chain - ## HITS:1 COG:CAC0106 KEGG:ns NR:ns ## COG: CAC0106 COG0715 # Protein_GI_number: 15893402 # Func_class: P Inorganic ion transport and 
metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Clostridium acetobutylicum # 5 304 46 343 368 260 46.0 2e-69 MVAGLAGCGGTKKDADKTTEINVGYFNNVTHAQALYMKAQGTLDKAFDGKAKVKWTSFNA GPSEVEALFSGDIDIGYIGPVPAISANVKSKGDVSIISGASQAGAELVKAPGSEIESAKD LDGKTVAIPQIGNTQHLCLLKLLSDNGLKTVEEGGTVTVTAVENADIQNMMDQGNIDAAL VPEPWGSTLVKNGAEIVLDYNGVYMEGNYPVAVVVVRNEFLKEHPDLVKEFLKQHEAATD EINQNADEAAKIINDEINAATGKSLSEDILQTAFQKLTISTEVNKDAVDDFAAISLEQKF IDQKPSDDFITEVK >gi|222441909|gb|ACEP01000033.1| GENE 27 32885 - 33910 552 341 aa, chain + ## HITS:1 COG:ECs2367 KEGG:ns NR:ns ## COG: ECs2367 COG1609 # Protein_GI_number: 15831621 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Escherichia coli O157:H7 # 2 337 3 332 341 117 27.0 5e-26 MSLKEIAHRAGTSVSTVSRVLNHPDYKCKNPELEARIWAAAQEISYTPNTAAQTLKSGAP QIKTTPLTFDIFLTRFPSLSQDLFFSELFESLKKELLRQQCVPGKFLTLPEVTSMLKEKS YFSVTDTHADGIILLGKCPAELISTLQNHYHNIVGIDRNPTNFDYDEVICSGTTAAVTAM NYLISLGHKKIAYIGDCSYEARYIGYYQSLISHNLPLDYSNIYPTSQTREEGMKTMELIL QREELPSAIFCANDSTALGVFDCLQRHRKKGYIPSVISIDNILESEQTKPMLTTIDIPKK EMAHHAVNLLLDRIHNQHSNAVRIELPCRLIVRESCSYCGR >gi|222441909|gb|ACEP01000033.1| GENE 28 33923 - 35293 1591 456 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 10 456 23 466 466 301 35.0 2e-81 MNKQKKIGLAVLVVILLYAAISVFFQYHYFIGTKVNGYSCGFRSVSKTKEMIKEDIKAYK ITIKERNKKKESISFSQVNLAFKDDGKLEGIKAQQKGYAWITALFQSQDYRDAITLTMDD TAFNDTYNNLNAFNKDMVVAPVDAYSTYDKATNSYSIVPEVYGNTVKKKKLKPLLKEAIL NMDKSIDIEKNDCYKNPAYKKDTKEVVEANKTMNKYVQETITYDFDDRTEELKGKKISKW LYETDKHEVKVHSEMAAKYIKKLADKYDTVGIKRNFTSICGNKVSVSGGTYGWRIDQKAE TKNLVKTIKKGKSVKIKPEYAHRAKSRKKFDIGSTYVEVSLGEQHMWFYKNGKTLVSTDV VTGDISKGHGTPTGVYYILYKTTDYTLTGQGYASHVDYWLPFIQIGVGIHDSSWRGSYGG GIFTYDGSHGCVNTPRSAVQKIYNNIESTYPVVVHW >gi|222441909|gb|ACEP01000033.1| GENE 29 36231 - 37055 
573 274 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026340|ref|ZP_03715532.1| ## NR: gi|225026340|ref|ZP_03715532.1| hypothetical protein EUBHAL_00582 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00582 [Eubacterium hallii DSM 3353] # 1 274 1 274 274 531 100.0 1e-149 MKKTINRIMNSYIQFFKIKNLNVQIVLTDDMYTCQKKYGFNKEDSQTLDEATARKNWKHV AACMKYPKHMNEPFTLIFKEPYLRRSPLCEVYRLVFHELTHICDYRDYARLNHLTSYRQL FDDPETVLFQHWSEYHAERRGYAAWLKHRYGIRVKYDINNSKVDILHKETVSNIQYYGEH YTNTAEYGSTRQIYFTMHLLARMSIWMQILPYQMSDILSKDPFDYKGIEWIKKLMYLFHK YPNIDQMNDHFMEIAKIVAENFSLSREEIWEKVS >gi|222441909|gb|ACEP01000033.1| GENE 30 37114 - 37674 529 186 aa, chain + ## HITS:1 COG:CAC0446 KEGG:ns NR:ns ## COG: CAC0446 COG0494 # Protein_GI_number: 15893737 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 2 158 4 153 206 94 37.0 9e-20 MEFLDIRDKNGNPTGEVKERSLVHADGDIHGTSHVWIVRKNEKGSYDLLLQKRSENKDAF PGCYDISSAGHLPAGQDYLSSALRELEEELGIKAKPEQLHFMGLHEGCCEETFYGKPFKN HEISHVYLYQEPVNIEDLTLQKEEVQEVCWLDFKECCKKVKDGDRKYCLFPEELLMIKKY LQFYLK >gi|222441909|gb|ACEP01000033.1| GENE 31 38094 - 38201 94 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQHMTLNKISIEHITLEHIGFVHHGFGVDLPDYLL >gi|222441909|gb|ACEP01000033.1| GENE 32 38358 - 39071 492 237 aa, chain - ## HITS:1 COG:no KEGG:Closa_2635 NR:ns ## KEGG: Closa_2635 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 4 234 2 198 198 122 35.0 1e-26 MKQEKNIDYFQIEGTPGGNQEWCTDFWMYLGGCGALAACDLSICLARNYGLKKCYPGDAL NLTRKEYVDFSMKMKPYIHPRVGGVTKLSMFTDGCGQYLKDCGYGAEFETLDGDKPYEEA EKFVKKAIERNLPVIYLMLRHRDKEFKDLNWHWFCITGYKTEKEEVQQRQIKNSCIHTAS IQANEMNKNKIKEGINLEERVYLNYHTYGEALSVDFKRLWNTGMYKRGGMVALKSIK >gi|222441909|gb|ACEP01000033.1| GENE 33 39287 - 39823 378 178 aa, chain - ## HITS:1 COG:CAC0470 KEGG:ns NR:ns ## COG: CAC0470 COG0700 # Protein_GI_number: 15893761 # Func_class: S Function 
unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 9 177 3 171 173 115 37.0 3e-26 MDTVKFLLYFSDFIVPFTMFYIVVYGFFNRTDVYESFLKGVKEGFQIVIEIAPTMIALLV SIGIFRASGALDSFSELLAPAGKLLHIPVEVIPVFIVRIFSSSAAVSFVLDIFKEYGPDS RLGMIVSIMMSCTETVIYTITIYYMSVNIKKTRWTLPGAMFATIAGAVASVAITELIM >gi|222441909|gb|ACEP01000033.1| GENE 34 39811 - 40440 403 209 aa, chain - ## HITS:1 COG:BH1574 KEGG:ns NR:ns ## COG: BH1574 COG2715 # Protein_GI_number: 15614137 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Bacillus halodurans # 1 181 1 181 197 151 41.0 8e-37 MLNILWVIMIAGGIFFAAFHGTMGQITESFISSSTEAVNLCIFMLGVIGVWNGMMEIAVK SGLMKKIAKTMYPFIHWLFPDIPPRHKANEYIAANMAANILGLGWAATPAGLKAMRELQK LEEGGGRASDMMCAFLVLNISSLQLVPINMIAYRSQYGSVNPAAVVLPAICATMISTIAG IVFIKIIEITRYNGKKYNRKNKKKRRWIQ >gi|222441909|gb|ACEP01000033.1| GENE 35 40458 - 40577 128 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQYADFAMYQIKHTVKNQFGYFDMEAYRTSQEQYNEEKG >gi|222441909|gb|ACEP01000033.1| GENE 36 40621 - 41745 1272 374 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 8 372 9 373 386 324 46.0 2e-88 MEEYLRTYCEIDLKAIRENYINIKKKTGDNAMTMAVIKADGYGHGAVEVAHYLNDIADYF GVATIDEGVELRKAGLKQPILLLGYNSPSLYYKNLEYGVDQTIYCYDTAKAMSEAAVKAE KTARIHIALDTGMTRIGLSPDEKGLAIVKQIAELPNIKIEGLFTHLSCADMTDKAYTESQ MERYDHFVQLLDEAGIEIPVKHICNSAAIMEYEDHRYDMARAGIILYGMYPSDEVDFSNL DLTPAMSWYSHVVNIMEPEMERGVSYGATYIVEHPCRLATVSVGYADGYARSLSNKGWVL IHGQKAPIVGRVCMDQMMVDITDIPDVKLEDIVTLIGKDGDNYISVEDMANLAGSFNYEF VCDVGKRVPRVYKR >gi|222441909|gb|ACEP01000033.1| GENE 37 41989 - 44286 2352 765 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0283 NR:ns ## KEGG: HMPREF0868_0283 # Name: not_defined # Def: hypothetical protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 61 633 195 761 
1160 368 42.0 1e-100 MNRKVRNLLASVVLGVCLLALSFIPSFAANSANSASVGNVEAMKENVVYDFQSLSFADAP DYVKGIKSVSVNKTKWEEASSRLSLMRREAYYRNTDENKLYFDGSSDGMLKSGNILTIES EGYETLYLRVIKATADTFTVKKTDSDAETPSGPSDGINTLHVRIRGSFESALEGQKKYDA ISGASTSVTTNKNSDVIVEAAVLPDGQEPKNSDWKPLSEVITVNKNRTKINIDTEACGME GVYSVHDSSLSLSGEPLKAGSYPISVTVTDESGRTATSNELNFKVYTRTEVLAKQLKLEN CKQTADGKYMYDMEPWAIASFGGENETVTVPKDIKAWYGSHTSGTYGELGYAVDGSPVQT LIVPDGCNLTLVNMKVFSSVNIVVENGGKLVLRDSSIHGNIEVKNGGKFSSNYDDYGSQF LTGSSINGQLILNDGAILENSSIYSNTNFISNGHIIRQNTNPVIVVKGKVTVDGQVFVKG DEAATGQDETTGKSLSGQPAIRIENGSLDIKKDSVVAAFGGGYIATTSVGGSGIILDNGS ISGEGTLIAFGGNGTYDDGADAISGNGTVSVKNVYAQGGCSSIPKTGATAGKAIADGVTL AKTSNRTLLNGKEIKDNYGTISELYWSSITKVPDVSKYVIEVNGSDEEDNTKPGSSEDKP NTKPGTSEKKPETKPASGSTVKKTIKTKSLKVKKTKVTLKKGKKYKIQAKKIPASSTAKI MYRSSNKRIAKVTKKGVIKALKKGKCKITVKTSDGKKKVIIVKVK >gi|222441909|gb|ACEP01000033.1| GENE 38 44299 - 45345 1074 348 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20117 NR:ns ## KEGG: EUBELI_20117 # Name: not_defined # Def: glycoside hydrolase family 13 candidate A-glycosidase (amylase-related) # Organism: E.eligens # Pathway: not_defined # 46 232 659 846 1390 73 29.0 2e-11 MRKISSFIMACAMCLSMTVVGTVAPQVAVQAEETATVELKEAPKAPSTYKTVAGYDCQLN YGDDAEEWLSAIDKVEVNGIEYEKATSSYSIWNNAKYYAKATLDSSDSTDVRYLLIGEGK FEKDESAVNKVVISADGYKDLVLEVNKDGKTVAIHTKHTGGTATCTKKAVCDFCGEEYGE CASHNYGDEYESDGTYHWKKCQNEGCDATTEKEAHKGGEATTTKRAVCEVCGAEYGDLKQ EDPKPADKTTEGKKDDTKPAEKPDRKTVKAKKVVVNKKRIVLKKGKKVKLKVKLKPANAT EKVTFKSSNKKVAKVTKNGVVKAVKKGKCKITVKTASGKKAVVKVTVK >gi|222441909|gb|ACEP01000033.1| GENE 39 45553 - 46395 928 280 aa, chain - ## HITS:1 COG:BH0049 KEGG:ns NR:ns ## COG: BH0049 COG0313 # Protein_GI_number: 15612612 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Bacillus halodurans # 3 274 14 283 289 266 52.0 4e-71 MAGTLYLCATPIGNLEDMTYRAVRILKEVDLIAAEDTRNSIKLLNHFEIKTKMTSYHEYN KVDKAVYLVNKLREGLDIAVITDAGTPGISDPGEELVRQCYEAGINVTSLPGACACITAL TMSGQPTRRFAFEAFLPYDKKERAQILENLRNETRTIIIYEAPHHLKKTLKECREYLGNR 
NMTVCKELTKRYEQKRKATLDEMIAYYEENEPRGEYVLIFEGKSLQELKEEKQQEWNQLT VQEHMEHYLEKGMDKKSAMKQVAKDRGVSKRDIYQELLKK >gi|222441909|gb|ACEP01000033.1| GENE 40 46397 - 47146 722 249 aa, chain - ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 6 245 4 243 244 227 47.0 2e-59 MENVQIKENERIDDLQIKDYRIIQKTDGFCFGIDAVLLSDYARVKKKSKGIDLCSGTGII PILMEGKYPLAHVEGLEYQEEFAKMAERSVKMNGQEEKVHIVQGDVRNIKTDYERESFDW VTCNPPYMTGEHGLTNPDYAKAIARHEITCSLEDVVNAAKWLLKTGGHFYMVHRPFRLAE IIYTLKQHKLEPKRMRLVHPYVDKEPNMVLIDVVKGGKQRITVEPPLIVYNKDGSYTDEI YQIYGEMKN >gi|222441909|gb|ACEP01000033.1| GENE 41 47357 - 48256 806 299 aa, chain - ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 1 294 1 298 303 301 54.0 7e-82 MARVIGVRFRDNGKIYYFASEQFEVEVGQNVIVETARGVEYGRVVLGNREIHDEKIIATL KPVIRIATPEDDKIQENNKEKSKEAYKICIEKIKKHKLAMKLIDCEYTFDNNKVLFYFTA DGRIDFRELVKDLASVFKTRIELRQIGVRDETKIKGGIGICGRSLCCNTFLSEFVPVSIK MAKEQNLSLNPTKISGVCGRLMCCLKNEQDTYEYLNSKLPNVGDEVKTNTGVKGVVHEVS VLRQRVKLVVTDEKGEKELVDYKVDDLIFRPRKRREKVKMDDELKKLEAMEKKERKSKL >gi|222441909|gb|ACEP01000033.1| GENE 42 48275 - 49297 868 340 aa, chain - ## HITS:1 COG:BS_holB KEGG:ns NR:ns ## COG: BS_holB COG0470 # Protein_GI_number: 16077099 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Bacillus subtilis # 3 334 7 324 329 121 26.0 2e-27 MAELNEILGNERIKEHFITAVHHKKISHAYIMEGDKGSGKKMLAEAFSKILQCEGRETTG LVESCGKCESCIQMEYHDHPDVIWVSHEKPNVISVGEIREQIVNTVEIMPYKGPYKIYIV DEAEKMNAAAQNAILKTIEEPPEYAVIFLLTTNRGAFLDTILSRCILLATRPVPGTAVEK YLVEKCSVSKEEAEFAAGFSLGNIGKAEAIISSEEFQDLKELTLSILRYIHEIESYEIAD KVKEYKKYKNRIEDFFDIFLMWFRDIMLLKADNSVNAEAAKKKIIFKNEYAYLKKQVDKL DLEAIDYVIGKIHRARVRIKSNVNFDATLELLIVEARREF 
>gi|222441909|gb|ACEP01000033.1| GENE 43 49396 - 50025 644 209 aa, chain - ## HITS:1 COG:BS_tmk KEGG:ns NR:ns ## COG: BS_tmk COG0125 # Protein_GI_number: 16077096 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Bacillus subtilis # 1 206 1 209 212 159 43.0 4e-39 MKGLFIVMEGPDGSGKTTQINLLKEYLEEAGYECLITREPGGTVIGEEIRQLILNPEHKE MSPVTEMLLYAASRAQLVHEVIGPALEEGKIVISDRFVDSSIVYQGIARKLGISTVSAVN APGIGIYRPDGIFFIDLSEAEGLRRKKEQKNLDRMEQEGIDFHHMVSEGYRKVLSGRPEV MKIDGGRSIDTIQKKIRNHVDELLKKKNR >gi|222441909|gb|ACEP01000033.1| GENE 44 50163 - 51701 714 512 aa, chain - ## HITS:1 COG:CAC2338 KEGG:ns NR:ns ## COG: CAC2338 COG1982 # Protein_GI_number: 15895605 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Clostridium acetobutylicum # 3 510 9 487 487 281 34.0 2e-75 MNTPIYNKLRELESEKRIPFHMPGHKRADFGAFFGVEKMDITEITDYDNLHEPEGIIRES MNLVRDIFKSKESWYLVGGSTLGILVSISSVCRQGDKILIGRNCHKAVYNCIRLLQLEAV YCYADVSAEYDICEDMKPEAVARELRANPDIKAVVLTSPTYEGVVSDIAGIKEVTRPYAI PLIVDEAHGAHLIFHKYFPDSAVQEGADLVIQSTHKTLPSFTQTAVLHLCSDIVTKEQVE EIIDIYETSSPSYLLMASAEYGIMYMKENQGQLAEYVDNLKNFRRKCEQLKKISLLNQKK LNCFAYDNGKLVFSVKGCGINGKELFQLLYEKYHIELEMENLTYGIAMTSICDKKEDFDE LWKAISEIDKMCEEKEKENRSLNKAVNETKIAIQNQDKVEIEICDKNKPKIETKIHNHYK VIQEKENEQTIYPPKIIESWQCRGKAMETVELADSTGRVSGKYVMIYPPGVPILVPGEKI LKETVENISQYLYNGYNVLGLSCNKIIVLKDL >gi|222441909|gb|ACEP01000033.1| GENE 45 52093 - 52242 78 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGGGGFEPPKQFAADLQSVPFGHSGIHPYLIMTFNNHNKPMNGLEPLTC Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:55:33 2011 Seq name: gi|222441908|gb|ACEP01000034.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont37.1, whole genome shotgun sequence Length of sequence - 24846 bp Number of predicted genes - 24, with homology - 24 Number of transcription units - 9, operones - 4 average op.length - 4.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 859 409 ## EF2334 hypothetical protein 2 2 Tu 1 . + CDS 1267 - 2001 340 ## COG4509 Uncharacterized protein conserved in bacteria + Term 2025 - 2066 -0.4 + Prom 2596 - 2655 8.1 3 3 Tu 1 . + CDS 2695 - 3093 296 ## gi|225026359|ref|ZP_03715551.1| hypothetical protein EUBHAL_00608 + Term 3153 - 3196 4.3 - Term 3141 - 3183 0.3 4 4 Tu 1 . - CDS 3218 - 3550 127 ## gi|225026360|ref|ZP_03715552.1| hypothetical protein EUBHAL_00609 - Prom 3683 - 3742 8.5 + Prom 3633 - 3692 12.4 5 5 Op 1 40/0.000 + CDS 3723 - 4424 521 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 6 5 Op 2 . + CDS 4417 - 5622 469 ## COG0642 Signal transduction histidine kinase 7 5 Op 3 . + CDS 5694 - 6227 240 ## gi|225026363|ref|ZP_03715555.1| hypothetical protein EUBHAL_00612 8 5 Op 4 . + CDS 6224 - 6922 264 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 9 5 Op 5 . + CDS 6919 - 9411 815 ## EUBELI_20553 hypothetical protein + Term 9637 - 9683 6.2 + Prom 9492 - 9551 6.3 10 6 Op 1 . + CDS 9719 - 10183 152 ## gi|225026366|ref|ZP_03715558.1| hypothetical protein EUBHAL_00615 + Prom 10199 - 10258 2.5 11 6 Op 2 . + CDS 10308 - 11885 1313 ## Odosp_0908 Sel1 domain protein repeat-containing protein + Term 11894 - 11947 9.5 + Prom 11955 - 12014 4.4 12 7 Op 1 . + CDS 12036 - 17324 4806 ## BCE_5332 cell wall anchor domain-containing protein + Term 17344 - 17377 4.5 13 7 Op 2 . + CDS 17385 - 17810 236 ## gi|225026369|ref|ZP_03715561.1| hypothetical protein EUBHAL_00618 + Term 17869 - 17902 4.5 14 8 Op 1 . + CDS 17944 - 18348 267 ## Phep_0230 hypothetical protein 15 8 Op 2 . + CDS 18358 - 19563 1254 ## gi|225026371|ref|ZP_03715563.1| hypothetical protein EUBHAL_00620 16 8 Op 3 . + CDS 19560 - 19994 211 ## gi|225026372|ref|ZP_03715564.1| hypothetical protein EUBHAL_00621 17 8 Op 4 . + CDS 19995 - 20207 236 ## gi|225026373|ref|ZP_03715565.1| hypothetical protein EUBHAL_00622 18 8 Op 5 . 
+ CDS 20212 - 20574 443 ## gi|225026374|ref|ZP_03715566.1| hypothetical protein EUBHAL_00623 19 8 Op 6 . + CDS 20593 - 21417 443 ## MGAS2096_Spy1127 hypothetical protein 20 8 Op 7 . + CDS 21461 - 22054 351 ## Cthe_1806 cellulosome enzyme, dockerin type I 21 8 Op 8 . + CDS 22097 - 22426 187 ## gi|225026377|ref|ZP_03715569.1| hypothetical protein EUBHAL_00626 22 8 Op 9 . + CDS 22463 - 22963 262 ## gi|225026378|ref|ZP_03715570.1| hypothetical protein EUBHAL_00627 23 8 Op 10 . + CDS 22976 - 23347 106 ## gi|225026379|ref|ZP_03715571.1| hypothetical protein EUBHAL_00628 + Term 23358 - 23395 4.7 24 9 Tu 1 . + CDS 23408 - 24845 983 ## ELI_0234 hypothetical protein Predicted protein(s) >gi|222441908|gb|ACEP01000034.1| GENE 1 2 - 859 409 285 aa, chain + ## HITS:1 COG:no KEGG:EF2334 NR:ns ## KEGG: EF2334 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 1 277 18 320 328 167 31.0 6e-40 FRFYQIPALLLEEEEYRGISDTAKILYGVLLSRLALSRKNSWIEKDTGRLYITYNLKQLV EKLGRSDRTISKAMKQLSEVGLIEKKKRGQGKPDIIYVMNFTAVHKEGAKEIVPEEQMKQ DSHMDKMQGNKREETIAETSVTGKEAKRGDSRSEEITILEKRHAREFYRKKIRHQLEYEI LMQRHKTDKNIIDAMIEIMANVYISDRDTYKISGEEIPVNEVQNRIKTLNVQHIEYIIEC LNRNTREIRKPDNYLLASLYNAPLTIGLYYKAQVQHDFKMENTVV >gi|222441908|gb|ACEP01000034.1| GENE 2 1267 - 2001 340 244 aa, chain + ## HITS:1 COG:BH3294 KEGG:ns NR:ns ## COG: BH3294 COG4509 # Protein_GI_number: 15615856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 16 241 30 248 254 97 32.0 2e-20 MGKIRDFFYLLTIAVLISLSMIFIGAAFREEYPLWKNQKILEDLRSDVKAEETGDEVPGI DWEKLKKINPDIVGWIKVPGTRIDYPVLQGSRWNEYLHKDYKGEYCYAGSIFLQPETSFE DKHLILYGHNMRTRSMFGSLHEFESEDFYKKNNKIYLYQPEKVMKYTVYSVYDCLDKSIT YLTDFTGDKKDNRWLDWLSMTIEKNAYYPIKKKPKKDGQILTLSTCSGKKKGDDYRLVVN AVIK >gi|222441908|gb|ACEP01000034.1| GENE 3 2695 - 3093 296 132 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026359|ref|ZP_03715551.1| ## NR: gi|225026359|ref|ZP_03715551.1| hypothetical protein EUBHAL_00608 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00608 [Eubacterium hallii DSM 3353] # 1 132 1 132 132 218 100.0 1e-55 MRERLTKSEMAIMEILWEEGKGLTATEIVNIAGEKKTWKDSSIHLLINKLLDKGVIEVVG FRKTMKNYARTFFPVDSKQKYIVNQITDGMSEEERKKLYEYLIGEEKKLEMIKYIQNLLT KRKEKCTEIHNI >gi|222441908|gb|ACEP01000034.1| GENE 4 3218 - 3550 127 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026360|ref|ZP_03715552.1| ## NR: gi|225026360|ref|ZP_03715552.1| hypothetical protein EUBHAL_00609 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00609 [Eubacterium hallii DSM 3353] # 1 110 18 127 127 173 100.0 3e-42 MKKTLSLILILSMFLCSNLSVFAMETTSTTSSTCNIYNDVESSGARIVHNYKKTITKVYK SRSSIPDSINYSEYNNSLKGTFSGRLYLKSIKKTASGWQATFSGTLTGRI >gi|222441908|gb|ACEP01000034.1| GENE 5 3723 - 4424 521 233 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 223 4 223 227 154 39.0 1e-37 MILFIEDDIMLSEEISEFLKENDFEVEVVGNGDDALKLLQSRIYDLCILDIGLPDCSGFL LCKKIRQFYRNPIIIFTAYENEEDIVTGLMAGADDYVTKPCSLRVLRSRIFSQLRRKQWS EEIETEIIRSGDLVIDVVHNMIFKKGRELSIKDTEFRLCYMLVRNDGRIMPRELLLERLW DMDEHFIENNTLSVYVSRLRKKLGVYNGVSYIETIKGIGYRWNHLVQRGEFNE >gi|222441908|gb|ACEP01000034.1| GENE 6 4417 - 5622 469 401 aa, chain + ## HITS:1 COG:CAC0525 KEGG:ns NR:ns ## COG: CAC0525 COG0642 # Protein_GI_number: 15893815 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 160 394 92 329 329 103 29.0 4e-22 MSKREKYYGKEGFWIISVFGIMCIIFMTIYTMFIFKEVARQNMIFFGTVEKAMDYRTSAN LVNLIFSSQNKSDLYYNGIQLMKKAGYENSYATLIFLPYQKLMIFNIYSAIIMIFTLIIL GKTQKKQSQKEEFQIILYLKKHVPIKKNLHFFSKNFLESIEKIRKDIDKQQQIHIEYNEK IMHYMEDVSHQLKTPLSVMRMICEKIEMRYSKFSVEMGKCLGQIDYMTDTIRDLINLGKF DCKKFQMKFEEISAEILVETVVNDIEILAEPKNIEISVQGKLKIKWLCDPFWMQEVLKNI 
LKNCIEHSPNGKIEIFYGIEKNLNKIIIRDNGQGFMFGRENKIFERYFLGDRTKEGSSGL GLSIAQQVIKQHFGTITASNRECGGAEFLILFPQMDATTIY >gi|222441908|gb|ACEP01000034.1| GENE 7 5694 - 6227 240 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026363|ref|ZP_03715555.1| ## NR: gi|225026363|ref|ZP_03715555.1| hypothetical protein EUBHAL_00612 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00612 [Eubacterium hallii DSM 3353] # 1 177 1 177 177 308 100.0 8e-83 MYHKKCWKVHGIFLLLLSMLLLNACHMPAGSEQTEENIMDERENSEESNIWISINDIPEG YTEIRYVSVPMRQQDCYSNAEGNKRFVVEVQQGDTDLISVEEKKKIDIDGSQGTQFIYNG ENIKYNGNEKEQIQEITKMKKGERVLEWNAGKCCCRIYGSLSYEELKEIAKNVEVKI >gi|222441908|gb|ACEP01000034.1| GENE 8 6224 - 6922 264 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 228 1 221 311 106 32 2e-22 MKDIIVEIKGLEKTYGKNEAETKAIQGIDLTIAREEFVSIVGKSGSGKSTLLNIIGGLER PTSGEVQIEGTNLFDMKDTSRTIFRRKHIGYVFQFFNLIPEMTVYENICLPSYLDHKDPD EDFIQTVMEKLGIYEKKGKYATELSGGEQQRVAVARALSLKPSILLADEPSGNLDKKNGE QLLELMLLSQRYFDQTILLVTHDLEIARLSQRMIVLEDGKIISDTALEGEKS >gi|222441908|gb|ACEP01000034.1| GENE 9 6919 - 9411 815 830 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20553 NR:ns ## KEGG: EUBELI_20553 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 18 395 29 407 951 87 23.0 3e-15 MKISKYQNRILAIVIFFVFFFGLITVIYSDSYQNTLEEQRKKLYGSWNIASYHTSDDEIQ ILEKNATIDKVGKVEVLGDILDREGENIGAIGSADNSFFSLGNIKTLSGHMPRNANEIAI EASYLTRLGYSYELGQNIHLNILFYDKEGNEQIGVYQFVLTGVIKNYSSYWKTQNNELPS FFITKRKSIGTNKNGTITKHIFISLKEKYKEEADALQVLCQRGEFVKNDYVYGEYSGGKL FILERNMLQFFILGIGCITIFLLLYIDLKNQTENLLIMRILGATKAQIFQIYIKKKKEVM LFPGIAGIFFGVALPSVLVTIINKYGNYQLYLRLSWINILGLIVLFTLVILFVFFISIIR LLKLPLRGKLQQQTEYKIVKRRRKLNKKNLFKILNIFQRKEKILSVMLVTVLSAFMLLTV YQTWEAYRSYRFSSINYPDDYTFGRLMGHIGSEDTMTEEQLSLIKGVYGVGKVESVAISS NIKLMFSGKTDDIYLKKVKEGMRDDFTQKQQLDNYIYGPLVGVSDNLTGSYLKQVEHSSK KKLKNNEVILYIPDYIKTEKEMLQVDLDLDSSTQKNGENIQERMIKEGDTVTINTKEGEK 
ILNIVGVIHCFKDEMPASLNPMVTYGVLCNESTYRSICGKYSPCYVTVGVQKGTIPYQTD IELSKVNTSLYFENNRVEREEQLNNLILKVIMTFVLTVIGLLFLLLIRLGIGLSIEKQKI RRYQTLYQLGMEKIIIWKYFLKNNFTESFFGVLLGNIALVFYRYVTERKNLIEFADYSIR KSLSFYMEICKRCVAFTHWEFLILVVIILILLNFFIAAACDYQKVYGKIK >gi|222441908|gb|ACEP01000034.1| GENE 10 9719 - 10183 152 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026366|ref|ZP_03715558.1| ## NR: gi|225026366|ref|ZP_03715558.1| hypothetical protein EUBHAL_00615 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00615 [Eubacterium hallii DSM 3353] # 1 154 1 154 154 295 100.0 1e-78 MDIVQITDRILVSIGIGMSAAFQLFRFSIMTLMPAIENIKTDVICEGLSLKRGLIITVSI LVSISPILLTIKNSEQRRSLYHLQIFLLLYILCTYFYDLWKIKHWLYGCVVLIFCIDLTW CTIGFCQEYVPKFGTHKRKKDSEKSRLEHWNSEK >gi|222441908|gb|ACEP01000034.1| GENE 11 10308 - 11885 1313 525 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0908 NR:ns ## KEGG: Odosp_0908 # Name: not_defined # Def: Sel1 domain protein repeat-containing protein # Organism: O.splanchnicus # Pathway: not_defined # 393 474 395 480 1134 67 42.0 1e-09 MREKTMKKGHLLLTLLAFVCLMGLGTIHANAATVKAKVDVKKFYGIAEQVLQEINRQRRN NGLSDLKMDANFTEVAMDIALQKQGGDASASNPEIYNNVYVRKDAMEIDDIDGYDEYSDN LWKETRNGKEKILNHIKKVGTFLYKESDITSKIGIGVVATDAPQGIPRELNYGGFCNVTY CFISNEVGHDDTNYNPYVNNGKTVRTQMDLYIDSSNTYLIVDANENCLTPKYTYDVIKKG EKRKAHPGVWQHPVIFGKVHNELCRFKCKLSNTGASWSSSDSSVVSVDSNGNITAKKNGI ATVTVKVGSLQATKKYVVSDVIGMDTDTLLYLPNWKSKSDIPKEYTGNCYAGGYALEVKN GRITWSKNMKTNKITRPGSKKSTVKTKKLKGSYYKIISKSKRTVQYVKPVKKNIISVTIP SSIKLNGKRYKVTGIAANAFKNCKKLKKVTIGINVNSIGKRAFYGCSKLQTIKVKTSKLT GSRVGKQAFKGLNKKAVIKVPKKQLKAYKRLFRVKGVGKKVTIKK >gi|222441908|gb|ACEP01000034.1| GENE 12 12036 - 17324 4806 1762 aa, chain + ## HITS:1 COG:no KEGG:BCE_5332 NR:ns ## KEGG: BCE_5332 # Name: not_defined # Def: cell wall anchor domain-containing protein # Organism: B.cereus_ATCC10987 # Pathway: not_defined # 1087 1701 285 851 975 79 26.0 1e-12 MKIKMILKKGMVFLFAAAMAVTGVFSLGVMAASGLEGAYISPKSKMGLEFANNFDRAVSK SQVDVPNFMADSDSNGEEPNKSFNAQQVAAGNSDAGQKWRGFAAYAKNHDPSDFWVKYKN 
VTTANGKSVDLKIEVTDWKTQNDESKPLPSNGDMFGHDNNSAHLGYAIGFSYKKTGVVMF SGFKWVKFKYTFLYNGTNTKAPFTGFATFQDIDQNQYVTITDGMENIVNTSYIGGSDGNT WCEPDGWTYKAIKDQNASSDMGKDTFNKTCISLYVKNMTDMSIKYGWTGHTAIAYFDLAP WAMNSFEENQNPDTPKKLIQNGSSWISNEKTSAVNYELGDTYSYRYEDSIPDYASDTLEK LRNQSQTAVSSYVWEDTIDDGLSYVSNSLQVNVGGQDFTSKFTDSSSGQHLKFSAKTDAL SNTAFYGKKVTITFKVKIKDASYSWLGHERTSDNKYRLIPNAAKRSMKYLKKLTYDSKNK MYDITDGDYGSASQNTSTVWSKVPIGDPDNPDPDPKKTVSDEDEKNVLNNTLTNADETFT YTVSKEIEKNMPSTAKYSKVVIKDDIDSCLTIDSVKIYAGDKDVTNNFTPTSASKENYLE YAASSDLLNNKDFYGNNAGTTVKMVVKTHIDAKKVSIETLRAHGHLVENDKKTETDIKIK NETTVTTTKADNQGTWDVDKKVTPPPTTTDSPVPSIKDPVKKVSDSDDLNWDATVKQDGE KTPGSHNRVTDVTNQWLYTLAQEIPAHTVELFHYKSFTITDAVDSCLSYDVKDITIKAGD KDYTDKFDITKGEDNSLTLTAKSDVLSSDEFYGGKSGNKIVVSFPVKISADAKTLKDENL GHLEIGGKKMAHLQKVSDLQKLSGFTDLVKSKDNEYVYAFLNQAKSHIDSQIKYEGQTGV KDRTTDKVETAVETADPTIKKESSKYEWQVGDKVDYTINVGDTNSNSIADNVVVTDESLP KAMLPDKDSITISSTFDKEKSGAPEDRDISKDAKIEYTEKGFVITIPKLYRGEAATIKLT CTAKEKSTYDSITDQWADCLEKANADKKLTCVNGEVIENRAYVTATLMLKDKDNPPDDLE RVWVNTPHLNIKKSVEKKEYNVGETVKYTLVVTNVHEGTLARDIVITDQLQTPGETIVEG TIRLKDETGAVYDLGKKAGKDAEKPYLETQDDTSFKITTGMNLRYDGESPFKKLAEIDDK VHWKNKEGLTQEEWDAPITHTKFTVTYEAKINKEPADTNQFVNVATAKGSNTDEVKDDEI VYLKTPVIKTEKESDKDSYQKGETAKYHLTVTQTREDRTAVNVVIKDALKDEDSEKERIK GDTIKVYYNGNEFTPVSVKSDDTSFTIETGKNLTVKDKLEVTYDAEVLSDLNKDSITNIV KAKADNAYAEIELDVPVTKVTQNISARKTSNPASGTVVKQGGQIIYYIDVHNDSETTLKN VLVRDAIPELSNYVDGSASNDGKIMEINGKQYITYIIDTLRAGSTQTLSFAVTVDEKATS EDVILNTAEIKETTDKDLNEDGNIPDDLWNEDFKETNTTKHPLSNWATDIKTVKVGTPDS YIEKRADSAVYDVGETIHYTIVAEQTTKDGKLTNAVVEDKDLPDGVDIDYDSIKVDGNTV PEQDTMTSDTNSIYLIRTKKGFKVVYPELSEKSTITFDAVINSEKLVGKKIPNKATIISD QTPEKEAVVKVKVPKNPIEKVIKKIINRKGVSGNSSPKGVKTGLSTHAGLFAGIAAALLA AAAILIKKRRNAKKKPMTLEKF >gi|222441908|gb|ACEP01000034.1| GENE 13 17385 - 17810 236 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026369|ref|ZP_03715561.1| ## NR: gi|225026369|ref|ZP_03715561.1| hypothetical protein EUBHAL_00618 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00618 [Eubacterium hallii DSM 3353] # 1 141 16 156 156 218 
100.0 1e-55 MEKILRLNLAILFLLFAMMLVSPSAASLKNPAGSITVEAKSKKKPAKKSTTRKPVTRTNY ITMKKKEKLWVGPSVAAMHPETKDWKSSNSKIATVSTKGDEEGFHKVTVKKKGTVTISCT VKKTLRGWIKGDVHKWVITIK >gi|222441908|gb|ACEP01000034.1| GENE 14 17944 - 18348 267 134 aa, chain + ## HITS:1 COG:no KEGG:Phep_0230 NR:ns ## KEGG: Phep_0230 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 3 127 23 123 161 70 35.0 2e-11 MAVCLLILTVIGYWGIFCKAGEKGWKVLIPFYNEYLLFKIAWKPSICLFKWLCLFLYEVV SVTLNAGVLLDTLQFVLPSIAFGLTVTLYYSLSRAFGHGFGYTVGIMFLPFIFAPLLGVG RSRYTRPDKYRDKF >gi|222441908|gb|ACEP01000034.1| GENE 15 18358 - 19563 1254 401 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026371|ref|ZP_03715563.1| ## NR: gi|225026371|ref|ZP_03715563.1| hypothetical protein EUBHAL_00620 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00620 [Eubacterium hallii DSM 3353] # 10 401 1 392 392 445 100.0 1e-123 MCRISVHNKMSELLNRNTDPLFEKMEKIFAERDAEYKKMEERNRMREEAVKQKENSLKKQ EEQFNNREENVRQQEKEIEEKMQMLEERQRETQEMEKYLQKKRLELEADEQQSLLDTSIL REEIRNEKLKQQRLSRELEDKLADLGYSSDNFDPVVLEEKEQKIKEQAQEIQKLKEEIEK WEDKADEWEEKESGFVSQIDTLNKEKAQLWKKLMGVEDESIPEKKEDVPGKAEEPVEERI SGAEEAVEEVEEVEVVPEEESVYAKEEENLKEAIEMDTPLTAEGFYNYLQEQEVGGVIQL RHAKQGDMVNIAISQIVVTVVFAEEGWFDIKKSIGNNNRKLRRLIRSWNSSQNDLTFKYD TSDNTVTAEGDFDKDQTAEELLRYLDRLLDTYFDTDTGEEE >gi|222441908|gb|ACEP01000034.1| GENE 16 19560 - 19994 211 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026372|ref|ZP_03715564.1| ## NR: gi|225026372|ref|ZP_03715564.1| hypothetical protein EUBHAL_00621 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00621 [Eubacterium hallii DSM 3353] # 1 144 1 144 144 240 100.0 3e-62 MKNFILAVENVPKPMLIAEAVLIVLIIGVVAIRFFIIRSKPAYLKKLPRTVYDEETIHLL FNCYKAADSIEGMLHLAVKKSRNRKNKKRFKAAISYLYTSRYKDYETALYKYAGDGTEQT ERLFTDIIGKEAAKKRLLPLKEES >gi|222441908|gb|ACEP01000034.1| GENE 17 19995 - 20207 236 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026373|ref|ZP_03715565.1| ## NR: gi|225026373|ref|ZP_03715565.1| hypothetical protein EUBHAL_00622 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00622 [Eubacterium hallii DSM 3353] # 1 70 1 70 70 97 100.0 4e-19 MRRNNWKKYIIILIAFIVGLVANLAARAILPRLLPAGMQAWTGLLATLLEIVISVGIILI IIFTNHIGED >gi|222441908|gb|ACEP01000034.1| GENE 18 20212 - 20574 443 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026374|ref|ZP_03715566.1| ## NR: gi|225026374|ref|ZP_03715566.1| hypothetical protein EUBHAL_00623 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00623 [Eubacterium hallii DSM 3353] # 1 120 1 120 120 208 100.0 1e-52 MNRVKNFIGAMREKGVRTVKRAVPAFFTAYCYFSMYVPVYAAKAAGTSVITSGFNSLYEI VAAIVSSIGQLLLLWGVFEWATALNSQDGTMQSMAFKRIASGLVACLAPQIVTVISASLK >gi|222441908|gb|ACEP01000034.1| GENE 19 20593 - 21417 443 274 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1127 NR:ns ## KEGG: MGAS2096_Spy1127 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 17 274 18 283 283 70 25.0 8e-11 MPEWLFDFLSGPTIKIAAAIWNSSMKLIIKLVCSTPQKFSTGTWAFVKENLYPWAEGIGL SALNLFFMIAFLKAVSNLHQNITLEMVIEAMIKVVVANVLYLNILNIMTTIFSISSAMAK EVFTIDAPDLKVEDMDIGSALFYQLFGLLFILAAIVCSFLMLWTVYSRYIKLYLLMILAP FAMSALVGGQEIERSFYAWMRTFLLNAFEIVVIALTMVISFKLIGAGISLFDGKNIATSM VDGFWDALNSLFTMILVTAAVKGSNAFLSKAMGL >gi|222441908|gb|ACEP01000034.1| GENE 20 21461 - 22054 351 197 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1806 NR:ns ## KEGG: Cthe_1806 # Name: not_defined # Def: cellulosome enzyme, dockerin type I # Organism: C.thermocellum # Pathway: not_defined # 22 197 146 319 2177 72 29.0 9e-12 MGKFAAKIERVITIALILSVMLILTPGVSTQAKAKKCNHKSVTWITTSKPSCTDEGMKVK KCKNCGKILKIKKIKKSGHCLRTQIEKMPTCTKPGLTATYCLNPDCIYGYRKYYKTEKIA PLGHSYIAKTYKATCTAPKTIVTSCKNCKYKSTHKEGKALGHHWTKWKLNTDSMIKKKPK KTRICSRCGKKETIYVK >gi|222441908|gb|ACEP01000034.1| GENE 21 22097 - 22426 187 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026377|ref|ZP_03715569.1| ## NR: gi|225026377|ref|ZP_03715569.1| hypothetical protein EUBHAL_00626 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00626 [Eubacterium hallii DSM 3353] 
# 1 109 1 109 109 198 100.0 1e-49 MEKLKKEVTIVLLAIMMTMVFFANAQAKAGCNHKHTGRTWIIAEKANCNHTGRKKQLCVR CRKVLLEKVIPKTKHKWKLTRKKRTRYDCLKCGGVKLKNEYGGWSYYFP >gi|222441908|gb|ACEP01000034.1| GENE 22 22463 - 22963 262 166 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026378|ref|ZP_03715570.1| ## NR: gi|225026378|ref|ZP_03715570.1| hypothetical protein EUBHAL_00627 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00627 [Eubacterium hallii DSM 3353] # 1 166 1 166 166 264 100.0 2e-69 MKTKRHIAVVLMVLIVLVLVPGSSTQAKAKKCNHKKIIWETLTKPTCEYRGRSYKKCKSC GKEWFQTIMKTPALGHKPGKPRILHPTCLSGGHKEIVCTRKGCPKSYGDEEICGSYLSYK ELPALGHSYNKGMSIKTGKKRGKKFQYQKTQKCKRCGNRRISFYYK >gi|222441908|gb|ACEP01000034.1| GENE 23 22976 - 23347 106 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026379|ref|ZP_03715571.1| ## NR: gi|225026379|ref|ZP_03715571.1| hypothetical protein EUBHAL_00628 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00628 [Eubacterium hallii DSM 3353] # 1 115 1 115 123 223 100.0 4e-57 MKIDINKDFEKAFPDDLWKGFTLKQIITFAIGIIMAAGTIAALWKFMKISPITGSYIAIP VMLPICFFGIYEYQNHSLWEMWKEMRFYRTTKKLTYGAEEAPKTGRIFTMRHPAVKKKKR KKK >gi|222441908|gb|ACEP01000034.1| GENE 24 23408 - 24845 983 479 aa, chain + ## HITS:1 COG:no KEGG:ELI_0234 NR:ns ## KEGG: ELI_0234 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 23 479 18 469 774 147 25.0 9e-34 MAVINASGKFRSFHSKSVPIVPVPKNVRQALNIERMYKNGIAKIEPSKTNSIYDRCYVFE EINYINKDESEKDIFLNQFMSWLKSMSVDFKITIANEFQSIDEFLKKIRGKQNSKVYPQI DRGIKEWTREKLENSNPNVTTLRYLTVSCRANSLDEATILLNALDTTIQQMFAKWRGKII LLNGEERLKCLHSLLRPGKKEEEQYIYLDESGKHDWKNDVMPQTIRQYSNFMIFDKTQYV SVLFGWKYSRTLKPDELMRSFSYVDFPSFITLDFSPVPADVINEKLIAAAMNNEKSISEE EEAKRKKNIIVSGPSYAKQRKKDEIEGYMDLVNDNDETGFFLNFLFVATAPDESTLAQRV EQLKETGKAQGVVISTADYTQLKALNTALPFAGRQVDYMRFFLSSSMVAMQPYFAQDILD PGGYFYGLNLTTKRLIFGNRKLLMNPHAIIVGHTGSGKSVLIKATEIFQTLISTSDDIL Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:58:26 2011 Seq name: gi|222441907|gb|ACEP01000035.1| Eubacterium hallii 
DSM 3353 E_hallii-1.0_Cont39.1, whole genome shotgun sequence
Length of sequence - 22879 bp
Number of predicted genes - 26, with homology - 26
Number of transcription units - 12, operones - 6 average op.length - 3.3

N Tu/Op Conserved pairs(N/Pv) S Start End Score
+ Prom 156 - 215 7.3
1 1 Tu 1 . + CDS 253 - 1026 576 ## gi|225026381|ref|ZP_03715573.1| hypothetical protein EUBHAL_00630
+ Prom 1255 - 1314 7.3
2 2 Tu 1 . + CDS 1362 - 2003 999 ## COG0563 Adenylate kinase and related kinases
+ Term 2017 - 2077 12.1
+ Prom 2017 - 2076 5.4
3 3 Op 1 . + CDS 2097 - 2390 323 ## EUBREC_2415 putative nucleotidyltransferase
4 3 Op 2 . + CDS 2377 - 2766 429 ## EUBREC_2414 nucleotidyltransferase
5 3 Op 3 . + CDS 2776 - 3558 1017 ## COG0024 Methionine aminopeptidase
6 3 Op 4 . + CDS 3636 - 3923 183 ## gi|225026386|ref|ZP_03715578.1| hypothetical protein EUBHAL_00635
7 3 Op 5 2/0.000 + CDS 3926 - 4144 273 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1
+ Term 4169 - 4215 7.1
+ Prom 4189 - 4248 6.5
8 3 Op 6 . + CDS 4275 - 4388 187 ## PROTEIN SUPPORTED gi|160881761|ref|YP_001560729.1| ribosomal protein L36
+ Term 4392 - 4440 5.6
9 4 Op 1 48/0.000 + CDS 4837 - 5205 543 ## PROTEIN SUPPORTED gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e
10 4 Op 2 36/0.000 + CDS 5256 - 5651 614 ## PROTEIN SUPPORTED gi|160881759|ref|YP_001560727.1| ribosomal protein S11
11 4 Op 3 26/0.000 + CDS 5668 - 6261 793 ## PROTEIN SUPPORTED gi|238916291|ref|YP_002929808.1| small subunit ribosomal protein S4
+ Term 6278 - 6319 6.5
+ Prom 6281 - 6340 3.5
12 5 Op 1 50/0.000 + CDS 6379 - 7338 1175 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit
13 5 Op 2 6/0.000 + CDS 7425 - 7961 699 ## PROTEIN SUPPORTED gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P
+ Term 7988 - 8035 11.1
14 6 Op 1 34/0.000 + CDS 8325 - 10550 1904 ## COG1122 ABC-type cobalt transport system, ATPase component
15 6 Op 2 8/0.000 + CDS 10547 - 11359 687 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters
+ Prom 11368 - 11427 6.9
16 6 Op 3 7/0.000 + CDS 11466 - 12227 506 ## COG0101 Pseudouridylate synthase
+ Term 12380 - 12406 -1.0
+ Prom 12264 - 12323 6.6
17 6 Op 4 59/0.000 + CDS 12478 - 12906 620 ## PROTEIN SUPPORTED gi|239623392|ref|ZP_04666423.1| ribosomal protein L13
18 6 Op 5 . + CDS 12931 - 13323 583 ## PROTEIN SUPPORTED gi|160881745|ref|YP_001560713.1| ribosomal protein S9
+ Term 13363 - 13406 8.3
+ Prom 13642 - 13701 6.0
19 7 Tu 1 . + CDS 13725 - 15281 1620 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
+ Prom 15532 - 15591 5.3
20 8 Tu 1 . + CDS 15611 - 16528 599 ## COG0679 Predicted permeases
+ Prom 16549 - 16608 7.4
21 9 Tu 1 . + CDS 16703 - 17398 756 ## COG2186 Transcriptional regulators
+ Term 17438 - 17502 7.2
+ Prom 17498 - 17557 6.0
22 10 Op 1 . + CDS 17598 - 18443 904 ## COG0726 Predicted xylanase/chitin deacetylase
23 10 Op 2 .
+ CDS 18457 - 19287 931 ## COG0726 Predicted xylanase/chitin deacetylase + Prom 19304 - 19363 4.7 24 11 Tu 1 . + CDS 19400 - 20869 552 ## PROTEIN SUPPORTED gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 + Prom 21441 - 21500 3.4 25 12 Op 1 . + CDS 21522 - 22490 732 ## COG1893 Ketopantoate reductase 26 12 Op 2 . + CDS 22553 - 22780 144 ## gi|225026405|ref|ZP_03715597.1| hypothetical protein EUBHAL_00654 Predicted protein(s) >gi|222441907|gb|ACEP01000035.1| GENE 1 253 - 1026 576 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026381|ref|ZP_03715573.1| ## NR: gi|225026381|ref|ZP_03715573.1| hypothetical protein EUBHAL_00630 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00630 [Eubacterium hallii DSM 3353] # 1 257 14 270 270 446 100.0 1e-124 MRKTERNFTNKVLESKRGDVQAVKSKIKSSVMAVAFLFAATMPLTMAVSSRNVQAAVTQQ KQIQTVNQKVKTNTSKAIKTITPKKVTSATMIKAYKKYLRKNVSNKKYYAIVNIGDNNAP VLLRGTQGAGNDKTGRHYTGCQIYYYKNGKVVKMDTFKDGGRWLLLSKKDGQNYLTNGGS DFSIRICVKNGNLYRYQYYNNHNWKKNPENKWATFIVKMNGKVIKNMGYLDGTLYRRHTG YYTYLKNPEISFTKNVK >gi|222441907|gb|ACEP01000035.1| GENE 2 1362 - 2003 999 213 aa, chain + ## HITS:1 COG:CAC3112 KEGG:ns NR:ns ## COG: CAC3112 COG0563 # Protein_GI_number: 15896362 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Clostridium acetobutylicum # 1 213 1 213 215 263 62.0 1e-70 MKIIMLGAPGAGKGTQAKKLSEKYGIPHISTGDIFRANIKNNTELGQKAKGYMDAGQLVP DELVVDLVVDRIKEKDCFKGFILDGFPRTIPQAEALDYALNNQNEKIDYAINVDVPDENI INRMGGRRACVGCGATYHVVNMPPKKEGICDHCGEKLVLREDDKPETVKKRLQVYHDQTK PLIEYYDKKGSLLTVDGTLDINVVFEKITEVLG >gi|222441907|gb|ACEP01000035.1| GENE 3 2097 - 2390 323 97 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2415 NR:ns ## KEGG: EUBREC_2415 # Name: not_defined # Def: putative nucleotidyltransferase # Organism: E.rectale # Pathway: not_defined # 2 97 19 114 114 107 60.0 2e-22 MEHTGIKPKVIEEIRTFARKYNIDKVILFGSRARGDYRRTSDIDLAVVGGDFARFALDVD EETSTLLEYDIVDMSREMQDELRQSIMREGKILYEKV >gi|222441907|gb|ACEP01000035.1| GENE 4 2377 - 
2766 429 129 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2414 NR:ns ## KEGG: EUBREC_2414 # Name: not_defined # Def: nucleotidyltransferase # Organism: E.rectale # Pathway: not_defined # 1 129 1 129 129 171 62.0 1e-41 MKKYENFCSALCNMKEIYKYDPPYDTVILTGLVGLFEVCFEQSWKMMKEILESHGYEEGA TGSPKIILKTAYKAGMIKDEELWLRALQERNNVAHSYNERIALGIVMRAKESFYDMFCEL KKEIDKNWI >gi|222441907|gb|ACEP01000035.1| GENE 5 2776 - 3558 1017 260 aa, chain + ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 3 248 2 246 248 256 49.0 3e-68 MAITIKNEHEIALMRESGRLLALVHEEMRKELKEGMTTKDIDQICETMIRDFGCIPNFLN YQGFPASVCVSINDEVVHGIPSKKRYIQNGDIVSLDAGLIYKGYHSDAARTHGVGEISPE AQRIIDVTRQSFFEGIKFAKPGNHLVDIARGVQTYAESNGYSVVRDLTGHGIGTHLHEAP EIPNFVQRRRGAKMQKGMALAIEPMINEGTYDVYWLDNDWTVATADGKLSAHYENTIAIT DDGCEILTMLPEEKEALGVL >gi|222441907|gb|ACEP01000035.1| GENE 6 3636 - 3923 183 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026386|ref|ZP_03715578.1| ## NR: gi|225026386|ref|ZP_03715578.1| hypothetical protein EUBHAL_00635 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00635 [Eubacterium hallii DSM 3353] # 1 95 1 95 95 178 100.0 1e-43 MLEYKLGYFATSLCGHDKGKVYMIVKQEGEMIGLSDGGRRTIDNPKWKKKKHIQMIRSER MAEAFPSLISSEDANQRICDEIAGMEGRKVRKQED >gi|222441907|gb|ACEP01000035.1| GENE 7 3926 - 4144 273 72 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 109 69 1e-23 MSKSDVIEVEGKVVEKLPNAMFKVELENGHQVLGHISGKLRMNFIKILPGDKVTIELSPY DLTKGRIIWRDK >gi|222441907|gb|ACEP01000035.1| GENE 8 4275 - 4388 187 37 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881761|ref|YP_001560729.1| ribosomal protein L36 [Clostridium phytofermentans ISDg] # 1 37 1 37 37 76 89 1e-13 MKVRASVKPICEKCKVIKRKGAIRIICENPKHKQRQG >gi|222441907|gb|ACEP01000035.1| GENE 9 4837 - 5205 543 122 aa, chain + ## PROTEIN 
SUPPORTED ## NR: gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e [Eubacterium rectale ATCC 33656] # 1 122 1 122 122 213 86 7e-55 MARISGVDLPREKRVEIGLTYIYGIGRTSADQILKAADVNPDTRVRDLTDDEVRRLSDII NADYTVEGDLRREIAMNIKRLQEIGCYRGIRHRKGLPVRGQKTKTNARTRKGPKKTVANK KK >gi|222441907|gb|ACEP01000035.1| GENE 10 5256 - 5651 614 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881759|ref|YP_001560727.1| ribosomal protein S11 [Clostridium phytofermentans ISDg] # 1 131 1 132 132 241 91 4e-63 MAKKVTKKVTKKRVKKNVQHGQAHIQSSFNNTIVTLTDSEGNALSWASAGEMGFRGSKKS TPYAAQMAAETAAKAAVVHGLKTVEVMVKGPGSGREAAIRALQACGLEVTSIKDVTPVPH NGCRPPKRRRV >gi|222441907|gb|ACEP01000035.1| GENE 11 5668 - 6261 793 197 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916291|ref|YP_002929808.1| small subunit ribosomal protein S4 [Eubacterium eligens ATCC 27750] # 1 197 1 197 197 310 74 7e-84 MAIDRTPVLKRCRSLDMDPVYLGVNKKSNRKLVRSSRKISEYGLQLREKQKAKFIYGVLE KPFRNYYATAERMKGLTGPNLMVLLESRLDNVVFRMGFARTRKEARQIVGHKHVLVNGKC VNIPSYRIKAGDVVEIKENKKTMQRYKDIVEATGGVLVPAWLDADIEELKGTVKELPSRD MIDVPVNETLIVELYSK >gi|222441907|gb|ACEP01000035.1| GENE 12 6379 - 7338 1175 319 aa, chain + ## HITS:1 COG:BH0162 KEGG:ns NR:ns ## COG: BH0162 COG0202 # Protein_GI_number: 15612725 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus halodurans # 1 318 1 314 314 380 66.0 1e-105 MFEFEKPRIEIAEISEDNKYGRFVVEPLERGYGTTLGNSLRRIMLSSLPGAAVSAVKIKG VLHEFSSIPGVKEDVTEIIMNIKELAIKNNSSSNEPKRAFIEFSGEGVVTAADIQVDSDI EIINPDLVIATLSGGADSHFEAELTITKGRGYVGADKNKSEDQSIDVIAVDSIYTPVERV NLTVQNTRVGQITDFDKLTLDVFTNGTLAPDEAVSLAAKVLSEHLNLFIDLSENAQKAEV MVETAQDPVDKVLEMNIDELELSVRSYNCLKRAGINTVEELTNKTPEDMMKVRNLGRKSL EEVLAKLKELGLSLNNSEE >gi|222441907|gb|ACEP01000035.1| GENE 13 7425 - 7961 699 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P [Roseburia intestinalis L1-82] # 1 178 1 178 178 273 78 6e-73 MAKYRKLGRTSNQRKALLRNQVTQLLYHGKIRTTEAKAKEIRKIAEGLIALGVKEKDNFE 
VVKVQAKVAKKDAEGKRVKEVVDGKKVTVYETVEKEIKKDLPSRLHARRQMLKVLYPVTE VPVEGKGRKANTKKIDLAQKIFDEYAPKYADRNGGYTRIVKIAQRKGDAALEVIIELV >gi|222441907|gb|ACEP01000035.1| GENE 14 8325 - 10550 1904 741 aa, chain + ## HITS:1 COG:CAC3102 KEGG:ns NR:ns ## COG: CAC3102 COG1122 # Protein_GI_number: 15896353 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, ATPase component # Organism: Clostridium acetobutylicum # 3 279 5 277 281 248 45.0 4e-65 MNLIDVKNLLFKYKMYSGEKEEVIEHTAIDNISLSIKKGDFVGILGHNGSGKSTLAKQLA ALLKPSGGIIYVGGMDTAKEEQILDIRKTAGLVFQNPDNQLIGNIVEEDVAFGPENMGIP AEEIEMRITKALASTGMTAYREASPGALSGGQKQKIAISGVLAMEPECIIFDEPTAMIDP ESRKELLEAIYDLKRLKNITVIYITHFLQEVSQADYLYVMSHGKITLKGTPETLFKIPEK LVENNLELPFEVALIDDLRKKSVDVPEEIYTKQQLLEFLKYHFQKDESDNIVLSEYKSER KTISESSKSNVKSQAKENSRYGNCLEKQSDSYSISVDDRNEKLADNSLGDLNLADETEVS KKSHELKETSDIKEDASKNIDEPKKTESTKGITLKNISYQYKKQDSGEEKYAIKDISLAI EPGEFVAIIGRTGSGKSTLIQHFNGLFQPKSGDYFFNGENIWEKKYDLKKLRQKVALCFQ YPEYQLFEENVLKDIAFGPKNLGFDKKECEEKARHAMQLAGLSNELEKVSPFSLSGGQKR RVALAGILAMEPEYLILDEPVAGMDAPGKKILFDLLHHLNKERGITIVLVSHNMDDVADH ADRVLVMENGQLKMDGKTEEVFVRKDELTEMGLGVPQAVEFYLDLKEILEETDIDLENAF SHDMQYNQKNREVSQKTIELSQENVEINQKETKEKGTEMHEKKIEASNEKAKEYIKTKRT KNVGIPLNIDELAAYIAGGSL >gi|222441907|gb|ACEP01000035.1| GENE 15 10547 - 11359 687 270 aa, chain + ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 1 248 1 250 267 216 46.0 3e-56 MIREMTLGQYYKADSILHKLDPRVKLLGTLLFVITLFFPKSLLSLGIATAFLILIVKLSK VPVSMMIRGVKPLFIIICFSAIINMISVSGDMLVKIGPITIAKQGLWMAFYLVVRLVYLV IGSSVMTFTTTPNQLTDGLEKGFHFLKKVHVPVHEIAMMMSIALRFIPILTEELDKIMKA QMSRGVDFESGNILERGKKLIPVLVPLFIAAIRRASDLAMAMDARCYNGGEGKTRLHPLI YEKRDYIAYGIMLLYVVIMIFCSFILKRFF >gi|222441907|gb|ACEP01000035.1| GENE 16 11466 - 12227 506 253 aa, chain + ## HITS:1 COG:CAC3099 KEGG:ns 
NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 1 252 1 244 244 218 46.0 7e-57 MKRFKLVVAYDGTNYCGWQIQPNGITVEEVLNKTLSRFLKEEIHIIGASRTDSGVHSFGN VAVFDSETTIPPEKLCFALNPQLPEDIRIVSSEQVPGDFHPRYCDTRKTYEYHIQNGKIP LPTERLYSHLVIFPLNIQAMQEASQYLVGTHDFKSFCAAKTQAQTTVRTVTGIEVEAHPL SEGEYPSKKVIIRVTGEGFLYNMVRIISGTLIKVGLGVYPPEHVKQILEACDRSQAGETV PAKGLFLKKIEYL >gi|222441907|gb|ACEP01000035.1| GENE 17 12478 - 12906 620 142 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 [Clostridiales bacterium 1_7_47_FAA] # 1 140 1 140 142 243 82 8e-64 MKSFMASPSTITRDWYVVDATGHTLGRLASEVASVLRGKNKPIFTPHMDCGDYVIIVNAD KIKVSGKKLDQKIYYHHSDYVGGMKETTLREKLNKKPEEVIELAVKGMLPKGPLGRQMYT KLFVYAGPDHKHAAQQPKELAI >gi|222441907|gb|ACEP01000035.1| GENE 18 12931 - 13323 583 130 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881745|ref|YP_001560713.1| ribosomal protein S9 [Clostridium phytofermentans ISDg] # 1 130 1 130 130 229 84 2e-59 MATTRYYGTGRRKSSVARVYLVPGTGKVTINKKDMDEYFGYDTLKVIARQPLELTSTADK FDVLVNVHGGGYTGQAGAIRHGIARALLEADPEFRPALKKAGYLTRDPRMKERKKYGLKK ARRAPQFSKR >gi|222441907|gb|ACEP01000035.1| GENE 19 13725 - 15281 1620 518 aa, chain + ## HITS:1 COG:BS_yfmM KEGG:ns NR:ns ## COG: BS_yfmM COG0488 # Protein_GI_number: 16077809 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 518 1 518 518 791 75.0 0 MSILNVEHLTHGFGDRAIFNDVSFRLLKGEHIGLVGANGEGKSTFMSIVTGKMMPDEGKV EWAKNVNVGYLDQHAVLEAGMTIQDALKSAFDPLLQKEERMNEICDMLGTADEKEMEILM EELGMIQDELTLHDFYTIDAKVEEVARALGLLDLGLDRDVTDLSGGQRTKVLLGKLLLEK PDILLLDEPTNYLDEEHIAWLKRYLLDYENAFILISHDIPFLNEVVNIIYHMENQELNRY VGDYDHFQEVYAVKKAQLEAAYRRQQQEINELKDFVARNKARVSTRNMAMSRQKKLDKMD LIELAAEKPKPEFNFRYGRTPGKMLFETKKLVIGYDEPLSKPLDFYMERGQKIALIGTNG IGKTTLLKSLLGLIPPLSGSCEQGENLQIGYFEQEVKGENPNSCIEEIWEEFPGFTQYEV 
RSALAKCGLTTKHIESKVRVLSGGEQAKVRLCKLINRDTNVLLLDEPTNHLDVDAKDELK RALLDYRGSVLLICHEPEFYKDVVNEVWDMSKWTTKIL >gi|222441907|gb|ACEP01000035.1| GENE 20 15611 - 16528 599 305 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 3 298 1 297 301 121 30.0 2e-27 MSLIVLKKIIEMFIILLVGVIIYKAKIINDVSTKHLSNVLLLLVSPLLIIQSYQIDFNRE LLQGLIWAFAASFLTFLFMIVASEFLCHGDRNRSPVEKIAVSYSNSGFIGIPLISGVLGD KGVFYMTAYITVFNLLLWTHGIILMGDNDGLKGAWKNFLSPAILSIVIGIILFLFQLRLP QFIENPLEMIASMNTPLGMIVAGANLAQGNILKSLKNKRLYYLSFIKLIVYPLAGLVVLW LLPLEFEVAFTVFIALACPAGASVIMFAQRYDRDAYYASEIFVITTLLSAVTIPLLSVVA VSLLS >gi|222441907|gb|ACEP01000035.1| GENE 21 16703 - 17398 756 231 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 223 10 221 231 76 27.0 4e-14 MKIGKRPARILLYEQVVEDLCLLIDRGDFEPGDQLPSERELIEKLNVSRNVLREAFHVLE SRGVIVSHQGKGRFLRESPISGKEKRYDSLSKNLECYSMIEAYEVRQALEEKAVELIIKN ASEKDIDELEEALQHLRKKYEETGRTSGEFELHRLYAAKTGSTFMTQTLEIVLNAIFDMM YGKFTDVLDATSAEAELESHGMIIQAIRERDVEKARKLMDEHLITSMNMFQ >gi|222441907|gb|ACEP01000035.1| GENE 22 17598 - 18443 904 281 aa, chain + ## HITS:1 COG:SMb21100 KEGG:ns NR:ns ## COG: SMb21100 COG0726 # Protein_GI_number: 16264427 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Sinorhizobium meliloti # 6 275 21 282 292 151 35.0 2e-36 MGDIRWPDGKKCAVMFTFDVDGDTTWENGNRGLKNGEKYIKSLSVGQYGPKRCVDRILQK LDQYGVKATFFVPGLTAERYPEVIKRIAEQGHEIGHHGYAHERFAGKTTEEQIEIIEKTQ KIFKDLIGHEVTGFRTPSGDWAKNTPKLLYERGFRYSSSMRGDDRPYRTVIDGEETDFIE IPTKWEVDDYVAMAYHMYPAEPAGQDRISSYRLVEDNFMREFKGHYREGLCISYMNHPQV IGCPGKIQILDHLLKYITSKEDVWVATGTEIADWYRKQSEK >gi|222441907|gb|ACEP01000035.1| GENE 23 18457 - 19287 931 276 aa, chain + ## HITS:1 COG:SMb21100 
KEGG:ns NR:ns ## COG: SMb21100 COG0726 # Protein_GI_number: 16264427 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Sinorhizobium meliloti # 8 274 21 283 292 147 32.0 2e-35 MNLEKTKWPDGKKCAAMLTINLNAELFWLQIDPSCKDMPKTLSLGQYGMTRGVDRILSIL KERGIKATFFTPGWVAEQYPDKLKRIKEEGHEIASLAYKHENMALLTEEEQEEAMKKGIE AIQNVCGVTPVGFRAPEGELTLDTLRIAKKYGMHYSSNLCDDDRPYQKDLENGETLLEIP IHWDNYDLPYFAFNYHPAFPKGQGRIANYTGVINNWKDEFYGCYEYGLCYVLQLDPAAIG APGRIGLLEELLDYIDEFDSVWWTSGEEMYQYYQNK >gi|222441907|gb|ACEP01000035.1| GENE 24 19400 - 20869 552 489 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145634045|ref|ZP_01789756.1| 50S ribosomal protein L21 [Haemophilus influenzae PittAA] # 24 466 20 440 456 217 31 6e-56 MWTFPQNFSWYANIPIIGSIPFAIYLLLGMGIYFSIRTKFVQFRYFKRGIQVLTKKKKDT TGVSPLAAFMLSAAMRVGPGNIIGVTGAISLGGPGAMFWMWISALLGMASSFVEATLAQI FKEKDGDEYVGGLPYYGQKLAGNKKGVGIALAILFIIYAMLSIPIQTFHVFTAAGTAIAS VTGVSTDRTSPLYYALAIILIIGIAIAVFGGIKRVTAVTDRMVPVMAVLYATVIVGLFIA NIHLFPAFIQAVVIGAFKPDAIFGGAFGTVLAQGIKRGLLSNEAGQGTITMSAAISEQDH PVEQGFVQSIGVFLDTIIICTLSGFIVCGAHIWTNSAYNWEVLKDSKIDVFLSSLKEMIP GTAADGAVSLFICICYGLFAFTSLLGLVSFASIAGTRITKSKAFTNFVRALGALIFVPLG TLCVLAGLELDNIWYVSDLINIALVFANAPIILYGGKYVYRALKDYVDNDGKRFVAKNIG VESDIWKDE >gi|222441907|gb|ACEP01000035.1| GENE 25 21522 - 22490 732 322 aa, chain + ## HITS:1 COG:BH1763 KEGG:ns NR:ns ## COG: BH1763 COG1893 # Protein_GI_number: 15614326 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Bacillus halodurans # 1 307 1 287 304 102 26.0 8e-22 MKYIIIGAGGTGGILGFYMTKAGKDVTLIARNAHLEAMQKQGLSVEKMWINETETMPVGA ESMESYEAKGEKADVILVCVKKYSLDSCIPFIQNISHKNTIVVPVLNVYGTGAYLQEKLP KVLVTDGCIYVSANIKQAGVLLQHGEILRVFFGVREKEDLKKLNGQLDGEYKAERLLKKI AQDFKDSGIDGILSDNIKRDALTKFSYVSPIGTAGLYLHAVAGDFQREGEARELFKTLIR EIVTLANAMGITFEEDLVERNLKILSNLPKEATTSMQRDVMEGKQSEIDGLVYEVVRMAE KYGAEVPAYRRAAEKFREMEVK >gi|222441907|gb|ACEP01000035.1| GENE 26 22553 - 22780 144 75 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|225026405|ref|ZP_03715597.1|
## NR: gi|225026405|ref|ZP_03715597.1| hypothetical protein EUBHAL_00654 [Eubacterium hallii DSM 3353]
hypothetical protein EUBHAL_00654 [Eubacterium hallii DSM 3353]
# 1 75 1 75 75 117 100.0 2e-25
MRFVVEWREVVGRGREIKELREKMGMNRREFSDYYGIPYRTVQDWEAEKRQLPDYLLRLI
KYRAEMEGMVKKDKS

Prediction of potential genes in microbial genomes
Time: Fri Jul 8 06:59:04 2011
Seq name: gi|222441906|gb|ACEP01000036.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont40.1, whole genome shotgun sequence
Length of sequence - 28009 bp
Number of predicted genes - 24, with homology - 24
Number of transcription units - 16, operones - 6 average op.length - 2.3

N Tu/Op Conserved pairs(N/Pv) S Start End Score
- Term 201 - 267 26.4
1 1 Op 1 . - CDS 281 - 1633 840 ## COG0534 Na+-driven multidrug efflux pump
2 1 Op 2 . - CDS 1633 - 1977 163 ## PTH_0545 transcriptional regulator
- Prom 2016 - 2075 7.0
- Term 2129 - 2169 1.2
3 2 Tu 1 . - CDS 2381 - 3508 837 ## COG0628 Predicted permease
- Prom 3539 - 3598 5.4
- Term 3543 - 3578 0.3
4 3 Op 1 11/0.000 - CDS 3634 - 4557 580 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9
5 3 Op 2 . - CDS 4583 - 4903 473 ## COG0526 Thiol-disulfide isomerase and thioredoxins
- Prom 4957 - 5016 6.0
+ Prom 5633 - 5692 7.6
6 4 Tu 1 . + CDS 5759 - 7762 2613 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family
- Term 7754 - 7796 -0.6
7 5 Tu 1 . - CDS 7932 - 8399 410 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
- Prom 8424 - 8483 8.7
+ Prom 8398 - 8457 7.3
8 6 Tu 1 . + CDS 8541 - 9473 814 ## COG0679 Predicted permeases
+ Term 9496 - 9547 13.5
- Term 9485 - 9534 11.1
9 7 Op 1 . - CDS 9597 - 10091 440 ## Closa_2582 hypothetical protein
10 7 Op 2 . - CDS 10104 - 11549 733 ## COG2836 Uncharacterized conserved protein
- Prom 11728 - 11787 8.2
+ Prom 11910 - 11969 7.6
11 8 Tu 1 . + CDS 12004 - 12561 469 ## COG4636 Uncharacterized protein conserved in cyanobacteria
+ Term 12794 - 12841 -0.8
12 9 Op 1 9/0.000 - CDS 12760 - 13794 281 ## PROTEIN SUPPORTED gi|90020579|ref|YP_526406.1| ribosomal protein L22
13 9 Op 2 11/0.000 - CDS 13815 - 15104 794 ## PROTEIN SUPPORTED gi|90020581|ref|YP_526408.1| ribosomal protein L16
14 9 Op 3 . - CDS 15108 - 15605 304 ## COG3090 TRAP-type C4-dicarboxylate transport system, small permease component
- Term 15898 - 15947 8.6
15 10 Op 1 4/0.000 - CDS 16061 - 17356 1625 ## COG0477 Permeases of the major facilitator superfamily
- Term 17395 - 17445 8.3
16 10 Op 2 1/0.333 - CDS 17461 - 18336 1267 ## COG0169 Shikimate 5-dehydrogenase
- Prom 18405 - 18464 3.4
17 11 Tu 1 . - CDS 18469 - 20439 2590 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family
- Prom 20501 - 20560 3.4
18 12 Tu 1 . - CDS 20562 - 21401 1235 ## COG1082 Sugar phosphate isomerases/epimerases
- Prom 21422 - 21481 6.7
19 13 Tu 1 . + CDS 21817 - 22725 779 ## COG0583 Transcriptional regulator
+ Prom 23274 - 23333 8.6
20 14 Tu 1 . + CDS 23401 - 24336 988 ## COG0169 Shikimate 5-dehydrogenase
21 15 Tu 1 . + CDS 24714 - 25439 681 ## COG1811 Uncharacterized membrane protein, possible Na+ channel or pump
- Term 25695 - 25737 0.6
22 16 Op 1 . - CDS 25922 - 26560 570 ## gi|225026430|ref|ZP_03715622.1| hypothetical protein EUBHAL_00679
23 16 Op 2 . - CDS 26557 - 27261 684 ## gi|225026431|ref|ZP_03715623.1| hypothetical protein EUBHAL_00680
24 16 Op 3 .
- CDS 27265 - 27576 165 ## PROTEIN SUPPORTED gi|18309686|ref|NP_561620.1| 30S ribosomal protein - Prom 27598 - 27657 5.9 Predicted protein(s) >gi|222441906|gb|ACEP01000036.1| GENE 1 281 - 1633 840 450 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 8 443 5 440 447 227 34.0 3e-59 MRENKDYLITEKPSRALLIFSIPMIIGNLFQQAYTIVDSAIVGRYVGETALAAVGASYAL TSIFICIAIGGGIGASVIISHHFGGHNYGRMKTGIRTALLSFLFISLILGGIGLIFSQQI MEVLNTPADAIDIAVTYLNIYFLGLPFLFMYNILSSMFNALGESRIPLYLLIFSSVLNIF LDLYMVAVLNLGVAGAAYATFIAQGISAVLSLIIFIVTLQKLPGKAEGWLSKTEFSDMSR IALPSILQQSTVSIGMMLVQSVVNSFGTQILAGFSAAMRIESLCVVPMSAIGNAMSSYTA QNIGAEKEERISQGYRMGNFMIIVIAVILCLFLEIAHKPLIALFLDNAASSQTIVTGESY LKFIGFFFCMIGFKMAVDGILRGAGAVRVFTIANLVNLTLRVTIAKFGAPIWGVSMVWYA VPLGWLANFLISYGYYRTGKWKEAVSKTVD >gi|222441906|gb|ACEP01000036.1| GENE 2 1633 - 1977 163 114 aa, chain - ## HITS:1 COG:no KEGG:PTH_0545 NR:ns ## KEGG: PTH_0545 # Name: arsR # Def: transcriptional regulator # Organism: P.thermopropionicum # Pathway: not_defined # 6 77 7 78 114 70 50.0 2e-11 MTDIELLHTITMPTRYQILMLLLEHHYCGKALAAKIGISEAAISQHLQILKDCGIVTGQR IDRQMHYQVNTELLIQSVGIFHQKVLEKSEHMPNRDYCDCELAEFCSRKGDVKK >gi|222441906|gb|ACEP01000036.1| GENE 3 2381 - 3508 837 375 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 2 375 5 382 383 156 28.0 8e-38 MEKETKKSLLLITFGVVLFALLMHFSEVIKILQRGVSLTLPIMVGLILAFVLNVPMKMFE RLFAKLKKKYPQLKKIPVTSLSLFATLACIVLLIGIVYTMFVPGLVASVRSIYNVFMENM PKWLSWLRSNGMDTAWISREVTSWDFNKLLTQITGNVGNVGNALGKVVDATTSAFGVAMN VMMGFIIAIYVLLSKVSLTRQCKKMIYAHLSKNHADKICYISSLVNETYSKFFSGQCIEA VILGLLIFIAFTVFRLPYANLIAVLTAILSFIPYIGAFASCFIGALLIVMVDPWKGLLSI VVYQVVQFIENQFIYPRVVGGSVGLAPLWTLSAALIGGNLFGAIGMIFAIPVTAVLYVLL KDDANRRLQRKKIDI 
>gi|222441906|gb|ACEP01000036.1| GENE 4 3634 - 4557 580 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 4 296 1 296 306 228 39 4e-59 MEKIYDVVIIGGGPGGYTSALYCARAGLSTMVLEKLSAGGQMALTSQIDNYPGFEEGIDG FSLAEKMEQGAERFGAETELAEVLSVDFSNDIKIVESSEGTFKAKTVIIATGAVPRELGV PGEKELVGRGVGYCATCDGMFYRNKTVMVVGGGNTAAADAMTLSRICEKVILVHRRDSLR ATKIYHKQLEETENIEFYWNTTVSEFLYEKKLTGAKLHNVKTGEDTEISVDGAFISIGRK PASELFEGILDMDEAGYIVADESTRTNIPGVFAIGDVRTKALRQVVTAVADGATASHYIE EYLAHQI >gi|222441906|gb|ACEP01000036.1| GENE 5 4583 - 4903 473 106 aa, chain - ## HITS:1 COG:ECs4714 KEGG:ns NR:ns ## COG: ECs4714 COG0526 # Protein_GI_number: 15833968 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 3 106 23 126 127 98 44.0 3e-21 MAIINLTKETFKQQISEVDKPVIVDFWAPWCTYCRRIAPAFDKIASQQEDKLIFAKLDID DAAEIAEEYGVDTIPTLIIFKNGEVFGSIVAPDSKAKIEAFIEENL >gi|222441906|gb|ACEP01000036.1| GENE 6 5759 - 7762 2613 667 aa, chain + ## HITS:1 COG:lin0492_1 KEGG:ns NR:ns ## COG: lin0492_1 COG1902 # Protein_GI_number: 16799567 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 5 365 4 364 364 501 64.0 1e-141 MKDKYTHIFEPLTVRTMTIKNRICMMPMGTNYGEQNGEMSFLHIDYYRQRAKGGTGLLIV ENANVDFPQGSNGTTQLRIDHDNYLPRLYKFCEDMHRYGTKVSIQINHAGASAMSSRIGG LQPVSASNIPSKTGGATPRPLTKDEILAIVKKYGEAAKRAQTAGFDAVEIHAGHSYLISQ FLSPVYNDRTDEFGGSIENRTRFCRMIIDEVRSQIGPHMPIMLRLSADEFVKGGNTLEDT LKYLEYLNDEVDIFDVSCGLNDSIQYQIDANYMKDGWRSYMAKAVKEKFGKPVISMGNYR DPQVVEDTLARGDCDLIGMGRGLIAEPNWVHKVANDQECMLRKCISCNIGCAGNRIGGNR PIRCTVNPSVLEGDTHLGRKVNKNCNVVVIGGGTAGLEAACSAAEVGCTTFLLEKKDVLG GLATEISKIPAKHRLNDFPVYLQRRAAALDNLYIFKNVEATLEVIEKFNPHIIICATGSA PLLPPISGLHENIDKEGGKVESILSMIKHVPDYPEDMTGQKVVVIGGGAVGLDVVEFFAP RGAEVSIVEMMPAIGRDLDPVTKNDTKCMMEKYNVNQLTSTALQEVKEDCFVVKSPEGEV 
YELPFDHGFVCLGMKAQSKVYEEVSKAYKDTNVRVENIGDSVRARRIIDGTFEGRNIVFT LEQKGFL >gi|222441906|gb|ACEP01000036.1| GENE 7 7932 - 8399 410 155 aa, chain - ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 4 155 3 154 158 159 54.0 1e-39 MNFTTLCYIEKENQYLMLHRVSKKKDGNKDKWIGVGGHFEEGESPEDCLLREVREETGLE LVNYQFRGIVTFISDKWEDEYMCLYTADKYIGEIGVCDEGELVWVEKDKIKDLNIWEGDK IFLRLLAENELFFSLKLEYQGENLINAILNGKEYN >gi|222441906|gb|ACEP01000036.1| GENE 8 8541 - 9473 814 310 aa, chain + ## HITS:1 COG:CAC0366 KEGG:ns NR:ns ## COG: CAC0366 COG0679 # Protein_GI_number: 15893657 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 5 308 1 301 301 102 28.0 8e-22 MHISILLMEQIIQLFLMIFMGFLIVKAKLLNSEDSKILSIIVLYLIIPCVIINAFQVDYT PQTVKGLLIALAGSVMTQVILLIIVSILGKVFHLNEVEVASIYYSNSGNLIVPIVMFILG KEWVLYGCVFMSVQLVFIWTHCKKIISRESTYDWRKIVLNINMISIAIGIVLFLTRIHLP EDINSTLSAVGGMIGPASMIVTGMLFAGMDFKQIFASKRVYFVTFLRLIVVPAIALVLIK FSYLASLSSNGPKIMLIVFLAIITPSASIVTQMCQVYGNDSQYASAINVVTTLLAIITMP LMVMLFQITI >gi|222441906|gb|ACEP01000036.1| GENE 9 9597 - 10091 440 164 aa, chain - ## HITS:1 COG:no KEGG:Closa_2582 NR:ns ## KEGG: Closa_2582 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 24 153 40 172 173 163 61.0 2e-39 MKKYIMTIMMMIILSLTTLSGCGKEKASTSASGVNATQKIASGENLTIPVKEISEKASFY SVEVDGTQMEVIAVKDSDGNIRTAFNTCQICYDSGNGYYKQEDDELVCQNCGNSFTMNQI GESAGGCNPWPILEEDKTETDSEIKISYEFLKESSDIFENWKVN >gi|222441906|gb|ACEP01000036.1| GENE 10 10104 - 11549 733 481 aa, chain - ## HITS:1 COG:CAC1554_2 KEGG:ns NR:ns ## COG: CAC1554_2 COG2836 # Protein_GI_number: 15894832 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 119 371 1 210 
220 162 38.0 1e-39 MERRQLYIDGMTCINCQKRIENCLRKRVEIKEAEVSYETGTAEILYDANKITLQEIIRII NGLGYDAKAEPNSRKDIALRAVQELTIILAVFFLLQHFGILNRLAPDSLADASMSYGMLF VIGMVTSVHCIAMCGGINLSQTLQREASKDISRKMFKNTLMYNMGRVVSYTVIGGVLGAI GGLTGIGGSLQSSSLLQGSLKLFAGIVMIIMGVNMLGIFQGLRGCLKNYPSKPQKIQDAF FCKKRILFLHKMRVPVFHKKLIPFVHKKISGGRKTPFLIGICNGFMPCGPLQSMQIVALA SGNIFTGAFSMFCFSLGTVPLMLGFGSVVSALGKHFTRQVLRVGAILVVVMGLSMMMQGT SLSGLDIKVASVFSTKESKQMQADNTNIAVEKNGVQYVSSTLESGHYPDITVKAGEPVEW TIKASEKNINGCNYKILLQDFNQEHTFKSGRNVIKFTPEKEGTYTYSCWMGMITGKIYVK S >gi|222441906|gb|ACEP01000036.1| GENE 11 12004 - 12561 469 185 aa, chain + ## HITS:1 COG:all1618 KEGG:ns NR:ns ## COG: all1618 COG4636 # Protein_GI_number: 17229110 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Nostoc sp. PCC 7120 # 5 135 18 149 189 63 30.0 1e-10 MPKLKEDYTKQEKINGVVYNMSPSANYRHSIVNGNIYGKLREGFKGSLCLTFMENLDYRY HVEENDDYVIPDIMVIYDRKHLKESAYTGVPRFIVETLSPATALNDKTIKKDIYQNAGVS EYWIVSPKERAVEIYYLEDSAYVLKYFYILQDDTEEEGYNADTIITLRDFPHITMALEEI FENVE >gi|222441906|gb|ACEP01000036.1| GENE 12 12760 - 13794 281 344 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020579|ref|YP_526406.1| ribosomal protein L22 [Saccharophagus degradans 2-40] # 6 327 8 323 331 112 29 2e-24 MKGKFTSVILAVVLVVITVGTVFFGYSKASGKAAKSSVEGEQRYAWPLATCSTEDTITHI FATQFAKEVEKLSDGKMKINVYPQSTLGGDRELMESCKDGDIPFVVQSPAPQVSFMPQLC VFDTPCVFENIDDARKAIDNADFQKEIQKIYKGAGYDLLGIADQCFRVMTSAKPFTGIES FKGQKIRTMENAYHLQFWKQMGANPTPMSFSEVYIGLQQGTIDAQENAYEIIVSAKLYEQ QKYLISTNAVPDYTTLIVSDEFYQGLTKEQQKIIDQAAETAQEKARESADDRRVQRQKVL EDEGMQIVDVDQKTWKQMQEKCQPVYDSIRKQAGDKLVDLYMGK >gi|222441906|gb|ACEP01000036.1| GENE 13 13815 - 15104 794 429 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020581|ref|YP_526408.1| ribosomal protein L16 [Saccharophagus degradans 2-40] # 1 425 3 422 435 310 37 7e-84 MSPLVVFILFFVFLLIAIPISVSLGLVAVLPGVFDPSFTASASYVIRSMFGGIDSFPLLA VPMFILSGIIMARGGISKRLFDLFSFFIGKRTAGLPCAAVITCLFYGAISGSGIATVAAV GSMTIPLLVELGYDKKFCTALVAVAGSLGVIIPPSIPFIMYGMASGASVSDIFLAGIVPG 
VLIGLLLMVYAVFYCKKHGEDKEKINAKIDALHEQGLWKVFKSSFFAVLSPVIILGCIYS GVASPTEAAVISVFYSLIVSIFIYKSLPIKKVYDVFVEAVRTYAPILFILAASIAFSRVL TLMQVPQLISGWILAHFTSKVVVLIVINLVLLLVGMVMDTTPAILILTPILLPIVEGFGM NPIHFGVMMVVNLAIGFVTPPIGVNLFVASSLTDIPVVDIAKRAMPLIGFFLIALLLITF IPQISLCLL >gi|222441906|gb|ACEP01000036.1| GENE 14 15108 - 15605 304 165 aa, chain - ## HITS:1 COG:BH2672 KEGG:ns NR:ns ## COG: BH2672 COG3090 # Protein_GI_number: 15615235 # Func_class: G Carbohydrate transport and metabolism # Function: TRAP-type C4-dicarboxylate transport system, small permease component # Organism: Bacillus halodurans # 1 159 1 159 183 59 32.0 3e-09 MKVLKWLDNNLEQAILLILLCGMTIVMGGQIIARYAFSASLSWSEEITRYMFIISGFISA SYCIKHGLSVKIDQLVGMLPGKGVHYMRLVSYLIQLLFFAYLVPFAWKYVKAGMASGQLS PACGIPMYLIQSFTVISFVLCCIRLIQKFIQRVEIITGKREAEEK >gi|222441906|gb|ACEP01000036.1| GENE 15 16061 - 17356 1625 431 aa, chain - ## HITS:1 COG:STM1360 KEGG:ns NR:ns ## COG: STM1360 COG0477 # Protein_GI_number: 16764711 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 6 422 11 396 411 146 27.0 8e-35 MKAKKYIWSMICVYMAYLTHGMQALIFSQNQVNFATKWGFDMTDPSSAAYAAGVAAVSTA IAWTGFGKFISVWIGGEISDRVGRKKLMIGGAILYIICFATMFVTNNATVAAITGLLGGI ATSGFWDASGYPAVQEAYPAAPGSALIMIKFFVALSSIFYPLICVQTAAAGNWKINIVIP LVLSCVVLVFACIARFVYDDEKAANAGSGSDKRADQAEIDAAKARQLVAPGKFTAFLTYF YAFLVMAVMYGAQQYTKAFGLAYCGLSEMQAASMTTMYQVGSIVAVAFWAVMMSKLKWHP LKVLIVDAVITAGSLLAVLLVHQVAIVYIAILILGFGCAGGALQTGLSVLQEFVPGPKGR NTGIYYTFMGAASYLMPVIVAKLTVISGESKAVYTLMTLMFILAVVQVVATLYLILRYKK VFGTTPLAKAE >gi|222441906|gb|ACEP01000036.1| GENE 16 17461 - 18336 1267 291 aa, chain - ## HITS:1 COG:lin2338 KEGG:ns NR:ns ## COG: lin2338 COG0169 # Protein_GI_number: 16801401 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 283 1 284 289 324 
54.0 2e-88 MERRINGHTGLLALFGTPVGHSGSPAMYNFSFQHDGLDYAYMAFDVNVEQMPAVFDSIRL FKMRGGNFTMPCKNIAAELVDKLSPAAEIIGACNVFVNDDGVITGHITDGVGFVKNLEVN GVDVKGKKTVVLGAGGAATAIQVQLALDGASEVAIFNIDDEFYARAESTKEKLATRCPEC KVTVEHLEDKEALAAAVNNADIVINATIMGMKPHEDVTLIDKSLFRKDLVVADTVYNPEK TKMILEAEEAGCQAIGGKGMLLYQGVVNYSLFTGKDFPIEEYQKFQAEQAK >gi|222441906|gb|ACEP01000036.1| GENE 17 18469 - 20439 2590 656 aa, chain - ## HITS:1 COG:lin2337_1 KEGG:ns NR:ns ## COG: lin2337_1 COG1902 # Protein_GI_number: 16801400 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Listeria innocua # 1 361 1 361 361 505 66.0 1e-142 MKFENMFSPIQIGPMEVKNRFVVPPMGNNFANTDGTMSDQSVAYYGARAKGGFGLITLEA TVVHKGAKGGPRKPCLYDDSAIPSFKKVIDACHAEGAKVSIQLQNAGPEGNAKNAGAPIQ AATAVTSADGRDIPEAVTTEQVYELARGYGEAARRAMEAGADAVEIHMAHGYLVNSFMSP RTNKRVDEFGGSFENRMRFSTLIIDEVKKATQGKIAVLARINSTEDMFGGLDNHDMCAIA TYLESCGLDALHVSRAVHLKDEYMWAPTGIHGGFSAEYVANIKKCVSIPVITVGRYTEPQ FAEVMVRTGNADLVAFGRQSLADPAMPKKAFEDRLEDMTPCIACLQGCVANMYAGKPICC LTNPVLGRESEGMKEAETKKKIYVIGGGPAGMCAAFTAARRGHDVTLFEASDVLGGNMRL AAYPPGKGDITNMIRSYITKCEKSGVKIVLNTEVTADLIKKDAPDAVIVATGSETLVLPF IKGITAPEIVHGGDCLSGKRPVGHKVLVVGGGMVGAETAEFLAEKGHDVSIIEMRDAIGP DVIHEHRVFLMEGLKEYDVHQYVDAAVSEIYPDGVSYKNAADKSDDTVYELRGFDTVVLS MGYVSKYTHREGQEVVYDFVDELKAICPEVHVVGDAIRARRALDATKEAYDVALNI >gi|222441906|gb|ACEP01000036.1| GENE 18 20562 - 21401 1235 279 aa, chain - ## HITS:1 COG:lin2336 KEGG:ns NR:ns ## COG: lin2336 COG1082 # Protein_GI_number: 16801399 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate isomerases/epimerases # Organism: Listeria innocua # 3 278 9 281 284 301 54.0 8e-82 MSKEFPITVSSWTLGDQCKFEERVKAAKEAGYDGIGLRAETYVDALNEGLHDEDILAILE KYDMKVTEVEYIVQWAEDARSYEQKYKEQMCFHMCELFNVGHINCGLMEDYSVEHTAQKL KELCQRAGKLVIGVEPMPYSGIPNMEKGWAVVKGSGCDNAKLIMDTWHWVRANQPIDLDV IKDIPADKIVSIQINDVWDRPYAKSILRDESMHDRLAPGTGIGCTADFVKMVREKGIEPK AIGVEVISDAILAKGVAEAAKHTFDNTKKVLEQVWPEVL >gi|222441906|gb|ACEP01000036.1| GENE 19 21817 - 22725 779 
302 aa, chain + ## HITS:1 COG:lin2335 KEGG:ns NR:ns ## COG: lin2335 COG0583 # Protein_GI_number: 16801398 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 292 1 290 292 288 52.0 6e-78 MNLFHLRYFETLARTEHYSKAAEILSITQPSLSYAISTLESELGIQLFEKRGRNIVLTKY GKAFYSNVKEILANLDNAVRDIKLVANGEGEINIGFLRTLGTDYVPTTVKNFLDLHPDKN IWFNFSCEHGLSVDLVRSLIDREYDMVFCSKLNNSPMVEFIPIATQELVVIVPKDHPLAA KDSVTLKETLAYQQIIFRHKSGLRQIIDDLFKSINAYPDILCEIEEDLVGAGFVSKGFGI MIAPKFDGLNHVDVKVLKLTQPSYKRYFYMAILKDVYHPPLIEEFKKYVIEQSKLNKEYN FG >gi|222441906|gb|ACEP01000036.1| GENE 20 23401 - 24336 988 311 aa, chain + ## HITS:1 COG:lin2338 KEGG:ns NR:ns ## COG: lin2338 COG0169 # Protein_GI_number: 16801401 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 290 1 286 289 292 50.0 5e-79 MDTRISGKTRLLGVFGTPIKHSGSPIMYNFCFDYYGIDCAYLAFDTDASQVKDSLAAMRR LNMRGANVTMPCKQEVCRNMDKLSDAARFVGAVNTIVNDNGVLTGHITDGMGVVLDLKDH NVSIVDKDIVLFGAGGAATAIMVQCALDGAKSITVFNRSKEGLDHVAGIGQKMMDDGVKC KLEFHSLSDTDLMHDKIRECDILINGTCVGMAPALDGQSLVTDMSVFHEDMVIYDVIYHP LVTKLMADAVANGCKEENVIGGKGMLLWQGAGAFKLYTGLDMPAQKLKEFLAKREEEGFE PPVDVAENVPY >gi|222441906|gb|ACEP01000036.1| GENE 21 24714 - 25439 681 241 aa, chain + ## HITS:1 COG:CAC1482 KEGG:ns NR:ns ## COG: CAC1482 COG1811 # Protein_GI_number: 15894761 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, possible Na+ channel or pump # Organism: Clostridium acetobutylicum # 1 240 1 242 242 236 58.0 2e-62 MPIGVIANCLGIIIGGIAGSFIGPRLTESFKTNLNLVFGACAMTMGISSISLMKNMPAVI LSVILGTVIGLIIHLGNLTQKMGMRMEKIISKIIPGQSKKDAEYQSVLVTAIVLFCASGT GIYGSITAGMTGDHSILLAKTVLDLATALIFACILGVVTCVIAVPQFIIFMVLFLLAGAI YPFCTETMICDFKACGGVLLVATGFRMMKVKEFPIADMIPAMVLVMPLSHLWVTYVLPFV S >gi|222441906|gb|ACEP01000036.1| GENE 22 25922 - 26560 570 212 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026430|ref|ZP_03715622.1| ## NR: gi|225026430|ref|ZP_03715622.1| hypothetical protein EUBHAL_00679 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00679 [Eubacterium hallii DSM 3353] # 1 212 1 212 212 336 100.0 7e-91 MSFRKILVGIFMVGIICCGIGAGVTFQEYSSFKYMGEKRIGSNKIAEKTLTENLYTDKTK KLNTSFIAYNNENENIELKTSKSVPKDKIQIKVRYDADNVKDIHIDKNLYNDDEYIGDDI YIDDDYDENSMDNVALENMTQTDESESAQKNPTSQEFFISAVGRMSDIEYVLSYKDEILK NIKERKIYNYVYPQIISVEITVNPENKNLMKI >gi|222441906|gb|ACEP01000036.1| GENE 23 26557 - 27261 684 234 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026431|ref|ZP_03715623.1| ## NR: gi|225026431|ref|ZP_03715623.1| hypothetical protein EUBHAL_00680 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00680 [Eubacterium hallii DSM 3353] # 1 234 1 234 234 428 100.0 1e-118 MTKRKFLKELEDRLQMLDEKERKDMIEEYSQHISMCMKSGMKEEEAIEDFGDMDDLIAEI LEAYHLDPEYEVKSRERSGKTGIADLLPKDFFKKKTKIAKAGNKSNVNDGNNKSNPRKNR RVEGTKRFKEEQGRKIENITSKAVKTGSNAVSWSWDKIKKFFFIFIKIVLVFLALPAVFF DLAGLFGLGILGVMAFQGYPVIGCVIIAFGAVLSMTAYILFLFTYVLGGRGAKA >gi|222441906|gb|ACEP01000036.1| GENE 24 27265 - 27576 165 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|18309686|ref|NP_561620.1| 30S ribosomal protein [Clostridium perfringens str. 13] # 1 100 3 107 110 68 38 6e-11 MNAQFKKGVLELIVLAAVEQKDMYGYELVAEVSKVVNVKEGTIYPILKRLTNEHYFETYL RESTEGPPRKYYHLTAAGILHLRELEKEWEEFSRNVNKFLKER Prediction of potential genes in microbial genomes Time: Fri Jul 8 06:59:39 2011 Seq name: gi|222441905|gb|ACEP01000037.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont41.1, whole genome shotgun sequence Length of sequence - 17009 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 7, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 48 - 392 321 ## CD2814 GntR family transcriptional regulator - Prom 473 - 532 7.1 + Prom 510 - 569 6.1 2 2 Op 1 2/0.000 + CDS 677 - 1450 533 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 3 2 Op 2 . 
+ CDS 1401 - 2384 731 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Prom 2447 - 2506 8.8 4 3 Op 1 . + CDS 2529 - 2753 301 ## gi|225026436|ref|ZP_03715628.1| hypothetical protein EUBHAL_00685 5 3 Op 2 . + CDS 2774 - 2968 132 ## bpr_I2668 hypothetical protein + Term 3188 - 3249 -0.9 + Prom 3275 - 3334 7.9 6 4 Op 1 . + CDS 3474 - 4880 711 ## FMG_0966 hypothetical protein 7 4 Op 2 . + CDS 4865 - 5056 103 ## gi|225026439|ref|ZP_03715631.1| hypothetical protein EUBHAL_00688 + Term 5258 - 5311 7.4 + Prom 5066 - 5125 11.6 8 5 Op 1 27/0.000 + CDS 5356 - 7134 1664 ## COG0286 Type I restriction-modification system methyltransferase subunit 9 5 Op 2 11/0.000 + CDS 7137 - 8363 634 ## COG0732 Restriction endonuclease S subunits 10 5 Op 3 . + CDS 8367 - 11390 2508 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases + Prom 11399 - 11458 5.4 11 6 Op 1 . + CDS 11552 - 11956 300 ## Apar_1212 hypothetical protein 12 6 Op 2 . + CDS 11957 - 13441 688 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes 13 6 Op 3 . + CDS 13407 - 13583 248 ## gi|225026445|ref|ZP_03715637.1| hypothetical protein EUBHAL_00694 + Prom 13608 - 13667 4.7 14 7 Op 1 . + CDS 13797 - 14720 536 ## COG1061 DNA or RNA helicases of superfamily II 15 7 Op 2 . + CDS 14762 - 15778 715 ## Dhaf_0648 ATPase AAA 16 7 Op 3 . + CDS 15799 - 16296 412 ## gi|225026448|ref|ZP_03715640.1| hypothetical protein EUBHAL_00697 17 7 Op 4 . + CDS 16349 - 16657 340 ## gi|225026449|ref|ZP_03715641.1| hypothetical protein EUBHAL_00698 18 7 Op 5 . 
+ CDS 16575 - 16901 219 ## gi|225026450|ref|ZP_03715642.1| hypothetical protein EUBHAL_00699 Predicted protein(s) >gi|222441905|gb|ACEP01000037.1| GENE 1 48 - 392 321 114 aa, chain - ## HITS:1 COG:no KEGG:CD2814 NR:ns ## KEGG: CD2814 # Name: not_defined # Def: GntR family transcriptional regulator # Organism: C.difficile # Pathway: not_defined # 7 85 15 93 231 81 46.0 1e-14 MKEPQLLQTKAYDYLIDLIKKGKLKTDKIYSLNQLAKDAGFSKIPFRDAVLRLEQERYID ILPSRGFTLHKMTKEDIIETYQMRNNNFDHRQRQPVRNNLTPICDLGASVEYER >gi|222441905|gb|ACEP01000037.1| GENE 2 677 - 1450 533 257 aa, chain + ## HITS:1 COG:SA0314 KEGG:ns NR:ns ## COG: SA0314 COG2110 # Protein_GI_number: 15926027 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Staphylococcus aureus N315 # 6 256 10 261 266 231 49.0 8e-61 MNQSERRIYLIQELLKENAQYRGMEIPTNTEEQRLLLRGLMNIRMPHPVRKEVLEVQDAY LQEEKKRKGIVEPDNLGEIQPGIILWQGDITKLACDAIVNAANSGMTGCYVPNHRCIDNC IHTFAGMQLREECAEIIEEQGYAEATGKAKITKAYNLPCKYILHTVGPIIQGTVTKKDCE LLKSCYTSCLALAAENGLESVAFCCISTGEFHFPNEKAAQIAVATVKEFLKQNTSVRKVI FNVFKDLDKAIYEGILQ >gi|222441905|gb|ACEP01000037.1| GENE 3 1401 - 2384 731 327 aa, chain + ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 1 321 2 289 315 216 38.0 6e-56 MFSKIWTKQSTKESFNKIEQLKEKIEEADTIIIGSGAGLSTSAGFTYTGDRFTKYFSDFQ KKYGFQDMYSGGFYPFESFEEHWAYWSRYIYINRYMDAPKPVYEKLYDLVKEKDYFVLTT NVDHCFQKAGFDKYRLFYTQGDYGLFQCSVPCHQETYDNEEEIRKMVEAQGFVIGTRIEK STNTLHLQPMRQLKTEKELTLPEGIIPKMAIPSELVPVCPKCGKPMSMNLRADNTFVEDE GWHNAAERYSEFLRRHKHTKTVFLELGTGYNTPGIIKYNFWNMTDSWDDATYACINLNEA AAPTEIKEKSICVGADIGEVLNQIFEK >gi|222441905|gb|ACEP01000037.1| GENE 4 2529 - 2753 301 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026436|ref|ZP_03715628.1| ## NR: gi|225026436|ref|ZP_03715628.1| hypothetical protein EUBHAL_00685 [Eubacterium 
hallii DSM 3353] hypothetical protein EUBHAL_00685 [Eubacterium hallii DSM 3353] # 1 74 1 74 74 140 100.0 4e-32 MEIKSILIKETTKEERIRIVQEGLSQCGGACDFCNGCDNLGGGLVDAFYEPYINGEKELR QLNEEYRSHTGLVR >gi|222441905|gb|ACEP01000037.1| GENE 5 2774 - 2968 132 64 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2668 NR:ns ## KEGG: bpr_I2668 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 4 56 5 57 58 73 66.0 4e-12 MRPPEIEVSTEEERRQYIQKTFPCIADCDMCGLCTVFKGKDPEIAYEDYIRGKRSYMEVS GDYR >gi|222441905|gb|ACEP01000037.1| GENE 6 3474 - 4880 711 468 aa, chain + ## HITS:1 COG:no KEGG:FMG_0966 NR:ns ## KEGG: FMG_0966 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 31 467 7 469 489 141 27.0 6e-32 MDKKIKRGLERERIIMDNDKERQSNPMIAIKELLQNKNYIVVLDANILLKIYRSSPDYAE FVLECLNSIKDYICIPFNVQWEYEKHRKDEYAKKVKSIENSAELCRQLISTIENKIKGQC RELSKDGYPDIDKLFNELIGKITEIEDKFDLYFIEHQNLDFLNNWEIDKVFDLVNKFSKM PAPSAAFIYKQCQEGQYRYKKKMPPGYKDGISKYGDLLVWAETYEYAVLNNKNVIFVTDD VKEDWWEKLDDGSILFRKELIKEFSHKTKNIKNSGESLKLIPMIGYDLYQAIAREFNIEA PDATSMILNATDESFVEEVHMEVFDSIWSEIAYSGISFLHVDISHIGSEGVEEWKLDEVE YNGYERIDVDSGIATYIISYFIKMFGVSYDCWGIDNNTEDIITSPGRMHECSGRVDVSVT RKVDTTINWNEDFDYQDAKIHETFIKEDGYEDEDSEFDVYYCSECGKK >gi|222441905|gb|ACEP01000037.1| GENE 7 4865 - 5056 103 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026439|ref|ZP_03715631.1| ## NR: gi|225026439|ref|ZP_03715631.1| hypothetical protein EUBHAL_00688 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00688 [Eubacterium hallii DSM 3353] # 1 63 1 63 63 104 100.0 3e-21 MWKKIGYEWEVNQRDNEGNTLCDSCMVTNENGCVCAACGCKYPEQMRGGSGTLCINCEEE YDI >gi|222441905|gb|ACEP01000037.1| GENE 8 5356 - 7134 1664 592 aa, chain + ## HITS:1 COG:CC0620 KEGG:ns NR:ns ## COG: CC0620 COG0286 # Protein_GI_number: 16124873 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Caulobacter vibrioides # 13 562 2 583 611 473 45.0 1e-133 
MNSNDLITNVGKNSNDTAALIWSVADDLVGAYKPHEYGLVILPMTVIKRFHDCLLLTHQA VLDTYKKVEKLAVKDGFLRKSSGYQFYNTSPFTFKTLIADPENIVDNFKAYLNGFSDNVQ DILARMDFDSQIKHMEEAGLLYQVISDFCTDKGDFSPEKISAVDMGYIFENLVRRFSESY NEEAGAHFTSRDIIYLMSDLLVAGDESAFTGDGISKTVYDMTMGTSQMLTCMEERLKQMD ADADVTVFGQEFNPFTFGIAKADMLIRGGDPNNMQFGDTLSDDKFSGYKFDYIISNPPFG IPWKREEKEVTAEFKKGTAGRFAPGLPAKGDGQLLFMLNGLAKLKDDGQMAIIQNGSSLF NGDAGSGPSEIRRYLIENDWLDAIVQLPNNAFYNTGIATYIWIVMKNKPVTHQGKVQLID ASACCSSRRKNIGSKNVDITKACRDLIIKAYGAYVDETYNGVDENDNAIIVKSKVMDAID LGYNKIVVETPQLDEDGNIVMKKKKPVVDKSKRDTENVPLAEDIDAYFEREVIPYNPQAW IDKAKTKVGYEIPFTRTFYEYQQIEPSDVIAARIEYYEKSLMAKLHNLFGGK >gi|222441905|gb|ACEP01000037.1| GENE 9 7137 - 8363 634 408 aa, chain + ## HITS:1 COG:CC0621 KEGG:ns NR:ns ## COG: CC0621 COG0732 # Protein_GI_number: 16124874 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Caulobacter vibrioides # 12 407 10 440 450 103 23.0 5e-22 MKTVIRPETEMKDSGIEWIGDIPSSWTIFPANGVFSEVKEKNTDLKFTNAFSFKYGEIVD KKQVGDVDNNLKETLSSYTIVRKNTIMINGLNLNYDFVSQRVAIVNESGIITSAYLAIQP DENKINPRFVLYLLKSYDYQQVFHGLGSGIRKTLKYQDFKKIMIVAPTLSEQQVIADYLD KTCSQIDEIIAEAKASIYEYKELKQSVIFEAVTKGLDKNVEMKDSGVYWIGKIPLDWEII KTKYVIKIENGSNPSTEGKIPVYGSGAKSFKTCGEYKEGPTVLLGRKGATLHIPHYIEGK YWNVDTAFNTIPINNKIELKYFYYVASCFDYNKYISQTTLPGMTQTNYRNIYMPYPSITI QEELVKWLDNKIFELDSLISEKESLINDLEAYKKSLIYEVVTGKRKVV >gi|222441905|gb|ACEP01000037.1| GENE 10 8367 - 11390 2508 1007 aa, chain + ## HITS:1 COG:CC0623 KEGG:ns NR:ns ## COG: CC0623 COG0610 # Protein_GI_number: 16124876 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Caulobacter vibrioides # 5 785 6 779 964 533 40.0 1e-151 MAENEKQFESNIEAFLISSAGGYEKATDAGYTSSSGMALDIHTLVGFVKATQPIMWQRFE KQCNSDPYKKFYKCFEDAVQMDGLLSVMRHGFKHRGMDFKVCYFKPESTLNDVAVKRYEQ NVCQCIRQWHYSEQNNNSVDMMLAINGIPVVAIELKNQLTGQTVDNAKLQWQYDRDQREP AFWLNHRILAYFAVDLYEAWMTTELKGTDTYFLPFNQGSNGAGNDGGAGNPQAEDDNYVT 
SYIWENVLQKDSLLDIIQKFISFEVKTEKKDGKNVTKKRLIFPRFHQLDVVRKLIADVRE NGSGNNYLIQHSAGSGKSNSIAWTAYRLASLHNDDNEPIFTSVVIVTDRRNLDAQLQETI TGFDHTLGSVCAIDEKKSSKDLKDALNAGKRIIVTTLQKFPVIYEEVDDTTDKRFAIIVD EAHSSQTGSSAMKLKAALADVSDALKEYAELEGKVENELLDDNDRLVREMIAHGKHKNLS FFAFTATPKGATLEMFGTEWNDGSYHPFHVYSMRQAIEEGFILDVLQNYTTYKTCYQIAK NTKDNPDVPQSKALKTIKRYEELHPYNIQQKSAIIVETFRNITKQKIKGKGKMMVVTASR LAAVRYYHEIKNYLESNSYHDVEILAAFSGSIKDPGDQKEIEWTESKLNGVNESQTKQVF HDEGNILIVAEKYQTGFDEPLLHTMIVDKKLRGVKAVQTLSRLNRTYPDKQDTFIIDFVN TKEDILKAFQPFYQETSLAQEINTDLIYKTQKMLRAFKIYDDTDIEKVNKIYFDEDKRKA NKIQAAVTNALLPIQQKYNALNQEQRYQFRKLCRTFVKWYSYITQIARMFDKPLHEEFVF CSYLAKVIPADPKVPFELADRVKLEYYNLEKTFEGSIELTKEEKGSYEPAKLKKPVKMVE TLSPLEKVIEKINEQYMGNFTEGDKVVITTLHQKLKNNKRLVKAAKTDGRQIFEKNIFPQ LFDDAAQEAYIESTETYTKLFEDAGKYRAIMGALAHAMFEELQNTHE >gi|222441905|gb|ACEP01000037.1| GENE 11 11552 - 11956 300 134 aa, chain + ## HITS:1 COG:no KEGG:Apar_1212 NR:ns ## KEGG: Apar_1212 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 22 130 19 134 142 124 45.0 2e-27 MVDTKFDKFIGLLELLETNSVGEWIIDKGHKGTEDDPIHFPYPIYTEVVYEFEHENPEYE LTKYEELFKERGLEWGQGSMENADVSHMDAQSIMALFMGMVQSERFCDGAIMEMIKSGTV RKWLMRLKELKERQ >gi|222441905|gb|ACEP01000037.1| GENE 12 11957 - 13441 688 494 aa, chain + ## HITS:1 COG:MA1602 KEGG:ns NR:ns ## COG: MA1602 COG0494 # Protein_GI_number: 20090460 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 131 1 132 132 78 33.0 4e-14 MKIIRVVAAVICDSVKEKHKIFATARGYGEFKGQWEFPGGKIEEGETSEQALKRGIEEKL DIKIEVYDLIDTIECDYPNFRLSMECFWCETITGKLVLKEAESAKWLRKNELDSVQWLPA NLTLIEKIRSEMIVNKELILTEETEKLEGEIVLDSKGNKTLKIHEGMTNIDFAGAVDRLA QYVNIAEIASTIEKGVEYVVQIPAQYQKQYEAGEYFINQNKTTGIEWPILMRKAENGQYR FVDDLPIKQQEFIEGNPFQDICNSYHNICMQQQMAQLSERVEETYQIVKMLERGQQDDRV AFIEAGRKEIMLALTLQDENDRNKQIWFGINNLLLGREQIGKAMVRRIEDFEPLPESRVV 
LFLNTLAHNNYQNKKDDEIENIQECYEMYLQATKMIAGAYAYKGEMAAVEQIFEDGISFF DSIDFGKLKSIELAHRGVDLDDMFYNHASQYVEIEKKQCVESGKSYNYVALEVTGDKLLE VLSDGGEVSKEKAE >gi|222441905|gb|ACEP01000037.1| GENE 13 13407 - 13583 248 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026445|ref|ZP_03715637.1| ## NR: gi|225026445|ref|ZP_03715637.1| hypothetical protein EUBHAL_00694 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00694 [Eubacterium hallii DSM 3353] # 1 58 1 58 58 81 100.0 2e-14 MAEKFQKKKLNRDDHKKMDNAAKHVRRVILGALTTCVIVAKKNGPKIMSAAKNIITKV >gi|222441905|gb|ACEP01000037.1| GENE 14 13797 - 14720 536 307 aa, chain + ## HITS:1 COG:CAC2824_2 KEGG:ns NR:ns ## COG: CAC2824_2 COG1061 # Protein_GI_number: 15896079 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Clostridium acetobutylicum # 1 305 304 606 616 226 43.0 4e-59 MVLFLRPTESPIVFLQQLGRGLRKSKGKEYLNVLDFIGNYEKAGRTRFLLTGKNLSEKNN YYSLNQSDFPDDCLIDFDMKIIDLFAEMDKKNLKISEQIQKEYFRVKELLGRTPSRMELF IYMEDDIYELAITHAKENPFKRYLSYQRSLKELTANEEVFYQNENGREFINPLENTSMSK VYKMPVLRAFYNEGNVRQEISEEQLLASWKAFFSTRTNWKDLDKELTYEKYLAISDKDHL KKILKMPVHYLLESGKGFFVKKERAALALKDEMKEVLDNPVFIKQMKDVVEYRTMDYYRR RYKEKDN >gi|222441905|gb|ACEP01000037.1| GENE 15 14762 - 15778 715 338 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_0648 NR:ns ## KEGG: Dhaf_0648 # Name: not_defined # Def: ATPase AAA # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 331 1 332 340 355 54.0 2e-96 MPLTRIKIENFTVFEDITIPFSKGLNILVGENGMGKTHVMKLAYAACQSRKHDVSFSQKT TMLFRPDQSGIGRLVNRNKNGENTARVLVESDIAQIGMSFSTKTKKWDAEVKSEEKWEKQ MSGTTSVFIPAKEILSNAWNLDAAVKMGNVEFDDTYLDIIAAAKIDISTDVDSVARKKYL KILQKISNGKVTVQDDRFYLKPGTQAKLEFNLVAEGIRKIALLWQLIKNGTLEKGSVLFW DEPEANINPKYIPVLAELLIMLEKEGVQIFVSTHDYFLSKYIEVKREKDSDVQYISLYKD EKNQVQCEIAKEFELLEHNTIMDTFRQLYREEIGVALK >gi|222441905|gb|ACEP01000037.1| GENE 16 15799 - 16296 412 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026448|ref|ZP_03715640.1| ## NR: gi|225026448|ref|ZP_03715640.1| hypothetical protein 
EUBHAL_00697 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00697 [Eubacterium hallii DSM 3353] # 1 165 9 173 173 294 100.0 2e-78 MDFGEFDEKNLFHIEDSKIYKSLGSGIKTVEFVLKYNENSIVFLEAKKTCPNAANRYESK EKQDKFEEYYSSITEKFIASLQIYLATILSRFQDSSEVGDSLNSINAMKEAKLKFILVVK NAEDITWLVGPMAELKARLLQIRKIWDIEVAVLNEGLAREHKLIK >gi|222441905|gb|ACEP01000037.1| GENE 17 16349 - 16657 340 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026449|ref|ZP_03715641.1| ## NR: gi|225026449|ref|ZP_03715641.1| hypothetical protein EUBHAL_00698 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00698 [Eubacterium hallii DSM 3353] # 1 102 1 102 102 169 100.0 5e-41 MYLNKVTVKNFKAITDMELSFTPGVNLLIGDNGTGKSSMLEAIGVAISGIFKGISSVSTK GISQNDIHFKTSGKGDASTEISYGIPAEITAFIDMDSEENAN >gi|222441905|gb|ACEP01000037.1| GENE 18 16575 - 16901 219 108 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026450|ref|ZP_03715642.1| ## NR: gi|225026450|ref|ZP_03715642.1| hypothetical protein EUBHAL_00699 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00699 [Eubacterium hallii DSM 3353] # 25 108 1 84 84 154 100.0 2e-36 MLQQKYLMVYQPRSQLLLIWIQKKMPIKEYEAFKFIVSCFMKKINGLDNAPEIYYSSQFE EIVYQENNMVMPITMLSAGYQSLSFYSKKTPTSKVGDELRLYLLDRAS Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:00:40 2011 Seq name: gi|222441904|gb|ACEP01000038.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont42.1, whole genome shotgun sequence Length of sequence - 8896 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 6, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 66 - 125 7.8 1 1 Tu 1 . + CDS 189 - 932 778 ## COG1378 Predicted transcriptional regulators + Prom 1131 - 1190 5.3 2 2 Tu 1 . + CDS 1211 - 2464 1750 ## COG0019 Diaminopimelate decarboxylase + Term 2654 - 2688 -0.5 + Prom 2726 - 2785 8.9 3 3 Tu 1 . 
+ CDS 2854 - 3885 887 ## COG1910 Periplasmic molybdate-binding protein/domain + Prom 3937 - 3996 7.2 4 4 Op 1 23/0.000 + CDS 4151 - 4993 1225 ## COG0725 ABC-type molybdate transport system, periplasmic component 5 4 Op 2 6/0.000 + CDS 5016 - 5684 653 ## COG4149 ABC-type molybdate transport system, permease component 6 4 Op 3 . + CDS 5689 - 6789 927 ## COG1118 ABC-type sulfate/molybdate transport systems, ATPase component + Term 6814 - 6863 0.1 - Term 6781 - 6805 -1.0 7 5 Tu 1 . - CDS 6987 - 7928 684 ## COG0561 Predicted hydrolases of the HAD superfamily 8 6 Tu 1 . - CDS 8132 - 8866 404 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|222441904|gb|ACEP01000038.1| GENE 1 189 - 932 778 247 aa, chain + ## HITS:1 COG:BS_yrhO KEGG:ns NR:ns ## COG: BS_yrhO COG1378 # Protein_GI_number: 16079765 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 4 243 6 243 275 63 24.0 3e-10 MDNVELLMSFGLTRQEARVYVLLLGEGALSGYEAAKRLGISRSNAYAALAGLVDKGAAYM AEEQAVQYHAVSIKEFCSNKLHYMEELSETLERQIPKQKQEDAKYLTIRGRRHIEDKFRN MLKETKERVYLSVSASVLELFREELLGLVAQKKKVVLLTDEEYSMDGAIIYRTKKFENLS GSADCEGDADGQLRLITDSNYALTGELQDKETSTCLYSANPNLVRLLKDALANEIRLIEI EKKDASI >gi|222441904|gb|ACEP01000038.1| GENE 2 1211 - 2464 1750 417 aa, chain + ## HITS:1 COG:SP1978 KEGG:ns NR:ns ## COG: SP1978 COG0019 # Protein_GI_number: 15901801 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 3 417 2 414 416 588 67.0 1e-168 MEKKPFVTKEQLDKIVEKYPTPFHLYDEKGIRQTARDLYKAFSWNEGFKEYFAVKATPTP AILKILQEEGCGTDCSSLTELMMSDRCGFADGNSIMFSSNETPAEEFILAKKLGAIINLD DITHIDFLKDACGIPERICCRYNPGGYFSLGTDIMDNPGDAKYGMTKPQIFEAFKRLKRE GVKEFGIHSFLASNTVSNEYYPALASILFNLAVELKENLGVHIGFINLSGGVGIPYTPDK EPNDIFAIGEGVRKAYEEILVPAGMGDVKIFTELGRFMLAPHGHLVVKAIHEKHTYKEYI GVDACAANLMRPAMYGAYHHITVMGKENEPCDHTYDVVGSLCENNDKFAIDRKLPKIDMG DLLVLHDTGAHGFSMGYNYNGRLKSAEILLQEDGSTRMIRRAETPEDYFATIEGFDF 
>gi|222441904|gb|ACEP01000038.1| GENE 3 2854 - 3885 887 343 aa, chain + ## HITS:1 COG:CAC0252 KEGG:ns NR:ns ## COG: CAC0252 COG1910 # Protein_GI_number: 15893544 # Func_class: P Inorganic ion transport and metabolism # Function: Periplasmic molybdate-binding protein/domain # Organism: Clostridium acetobutylicum # 1 335 1 316 319 258 40.0 1e-68 MGEKLFTAKEAAEILKVRKNTVYDMIKRGDLKASKLGKQLRIRQEDLDFYIQYGSQAKVV DSMQTGHNSNIDKAEFEKNTSKLPNVSNEISGVSRTTSATQNRISNQIIICGQDMILDLL ANRLNQCVEENVFRSYKGSYNGLYAMYQGEVNVATAHLWHGKTNSYNIPYISSMLPGTDV IVLHLLKRKQGFYVKKGNPKRIQSFEDLKRADVTIVNREPGSGVRVLVDEKLRQAGIFTQ EVNGYQKVVNSHLEAAATVNRGEADVAVGSEKHSLSVPGIDFIFIQEESYDMVIRKEDFT KKSYQKMIEIIRSPEYQKEVAGLGSYNVENMGKIIYSDCGCRP >gi|222441904|gb|ACEP01000038.1| GENE 4 4151 - 4993 1225 280 aa, chain + ## HITS:1 COG:BS_yvgL KEGG:ns NR:ns ## COG: BS_yvgL COG0725 # Protein_GI_number: 16080391 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, periplasmic component # Organism: Bacillus subtilis # 5 277 3 258 260 163 38.0 3e-40 MKFVKRIGVAVLAACMVLSCVGCRSSSNKGAAEGTTEVAAEGGQTKISVAAAASLQATFD DELIPLFEKENPGVTVEGTYASSGDLQQQIESGLDADVFFSAATSNMDTLVDENLVDKDT VVDLLKNDVVLITPKDSKLGIKGFKDITKADTIAIGDPESVPAGKYAKEILTNLGVYDEV EKKASLGASVTEVLSWVAEGSADAGIVYATDAQTENTNGDDKEVEIVATAEDSMMQTPVI YPVGTVSSSAHKDEAKAFEDFLQTDEAKAILEKAGFTINE >gi|222441904|gb|ACEP01000038.1| GENE 5 5016 - 5684 653 222 aa, chain + ## HITS:1 COG:alr2433_1 KEGG:ns NR:ns ## COG: alr2433_1 COG4149 # Protein_GI_number: 17229925 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type molybdate transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 222 1 222 223 190 47.0 1e-48 MTMDLSPIIISLKTATLSIIITFFLGVAAAQFVFRLKSKTMKTILDGLFTLPLVLPPTVA GFFLLYIFGMRRPIGKFLIEYFAVKIAFSWGATVLAAVVISFPLMYRSARGAFEQVNPDV LSAGRTLGMSEWNIFWKVQMPIAWPGVISGAVLAFARGLGEFGATAMIAGNIRGKTRTLP LAVYSAVASGKMDDAGQYVVILVGISFIVVVCMNYFSMEKKG >gi|222441904|gb|ACEP01000038.1| GENE 6 5689 - 6789 927 366 aa, chain + ## HITS:1 COG:sll0739_2 KEGG:ns NR:ns ## COG: sll0739_2 COG1118 # Protein_GI_number: 16331977 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type sulfate/molybdate transport systems, ATPase component # Organism: Synechocystis # 2 365 37 392 395 257 39.0 3e-68 MALSVHIKKKLKDFELAVDFDMDEGIVGFLGSSGCGKSMTLKCIAGIETPDEGRIVLDDE VLFDKKSGINLSPQKRKVGYLFQNYALFPHMTVKQNIEAGLAASGKSKQEKKEIVERLLE QFHIKELENSYPVRISGGQQQRAALARMMAAEPKFLLLDEPFSALDAYLKEKLMQEMMTY LNQFKKGVIMVSHSRDEVYQMCSDTVIMHQGMIETLGKTKEIFDNPGTVNAARLTGCKNI SRATKTGTHEVYASDWDIKLTFPEQTEIPSNLTHVGIRAHEIHPIVSKDTNLKDMDSNES IKLNTMDCQLERISEAPFEMYLILQNKSGKNTSPLWWKVTKKEWQEIGEKVPETVELPAE HIILLK >gi|222441904|gb|ACEP01000038.1| GENE 7 6987 - 7928 684 313 aa, chain - ## HITS:1 COG:CAC2244 KEGG:ns NR:ns ## COG: CAC2244 COG0561 # Protein_GI_number: 15895512 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 45 313 1 266 266 197 40.0 2e-50 MKFEIASILNRCTTELSLNKDINLAINNLLQIVNEYFYERTFYTMNRKLLFFDIDGTLLA GGIPGYIPDSTIEALKQAQANGHYIFINSGRTYGFMPEAIKEFPFDGYVCGCGTEVIFLG KTLYHYDLDEDLKHSLQSILEECKIQAVYEGRKSCYFQKTEKPFAPITAIRNSYAKTNLE QPIRFFDDPVLDFDKFVILTDENSSLDLFHERTKEHFDFIAREEMHPYGFEEVVPKGCSK AGGIDYIVEHLGESLASCYVFGDSTNDLSMLTHVKNSIAMGNSYPEVLANTSYVTTPIER DGIRNALKHFGLI >gi|222441904|gb|ACEP01000038.1| GENE 8 8132 - 8866 404 244 aa, chain - ## HITS:1 COG:alr2719 KEGG:ns NR:ns ## COG: alr2719 COG0675 # Protein_GI_number: 17230211 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 1 244 158 410 452 206 47.0 2e-53 MKLPKIGQVRLKQHRIIPEEYRLKSVTVSQTPSEKYYASILFEYEDQVQEKKLQKFLGLD FSMHELYRDSNGKEPAYPGYYRNAEKKLAREQRKLSKMQKGSNNRNKQRLKVAKLHEKVS NQRKDFLHKQSRQITNAYDCVCIEDLDMKAMSRSLKFGKSVLDNGWGMFTTFLKYKLEEQ GKKLVKVGRFFASSQTCSVCGYKNVKTKNLALREWDCPQCGTHHDRDVNAAINIRNEGMR LVNA Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:00:48 2011 Seq name: gi|222441903|gb|ACEP01000039.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont43.1, whole genome shotgun sequence Length of sequence - 22136 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 10, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 650 287 ## COG0671 Membrane-associated phospholipid phosphatase 2 1 Op 2 14/0.000 - CDS 619 - 1452 493 ## COG0688 Phosphatidylserine decarboxylase - Prom 1496 - 1555 2.4 - Term 1493 - 1535 4.1 3 1 Op 3 . - CDS 1559 - 2263 667 ## COG1183 Phosphatidylserine synthase 4 1 Op 4 . - CDS 2247 - 3323 695 ## EUBREC_1550 hypothetical protein - Prom 3355 - 3414 2.8 5 2 Op 1 . - CDS 3577 - 4308 436 ## FMG_1610 hypothetical protein - Prom 4330 - 4389 2.8 6 2 Op 2 . - CDS 4393 - 5541 1285 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 5564 - 5623 4.2 - Term 5551 - 5589 7.1 7 3 Op 1 47/0.000 - CDS 5625 - 5990 502 ## PROTEIN SUPPORTED gi|238916248|ref|YP_002929765.1| large subunit ribosomal protein L7/L12 8 3 Op 2 . - CDS 6028 - 6546 565 ## PROTEIN SUPPORTED gi|238922788|ref|YP_002936301.1| 50S ribosomal protein L10 - Prom 6600 - 6659 6.9 9 4 Tu 1 . - CDS 7217 - 8941 2059 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain - Prom 9072 - 9131 4.9 + Prom 9466 - 9525 6.5 10 5 Op 1 . + CDS 9559 - 10617 978 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis + Prom 10623 - 10682 3.2 11 5 Op 2 . 
+ CDS 10712 - 12043 1079 ## COG1316 Transcriptional regulator - Term 12090 - 12133 0.1 12 6 Tu 1 . - CDS 12187 - 16299 4895 ## gi|225026472|ref|ZP_03715664.1| hypothetical protein EUBHAL_00721 - Prom 16355 - 16414 6.5 + Prom 16435 - 16494 12.4 13 7 Tu 1 . + CDS 16541 - 17410 1004 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) + Prom 17634 - 17693 8.1 14 8 Tu 1 . + CDS 17741 - 19105 991 ## CLL_A1793 ErfK/YbiS/YcfS/YnhG family + Term 19180 - 19227 -0.8 - Term 19787 - 19822 1.0 15 9 Tu 1 . - CDS 19974 - 21038 1031 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain - Prom 21094 - 21153 6.4 + Prom 21160 - 21219 7.1 16 10 Tu 1 . + CDS 21328 - 21990 492 ## COG3619 Predicted membrane protein + Term 22006 - 22046 -0.1 Predicted protein(s) >gi|222441903|gb|ACEP01000039.1| GENE 1 2 - 650 287 216 aa, chain - ## HITS:1 COG:CAC1489 KEGG:ns NR:ns ## COG: CAC1489 COG0671 # Protein_GI_number: 15894768 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 5 204 3 198 219 146 42.0 3e-35 MHSNIKERFNNLLPKYAILPIIAALLINNCIYIGAAQLRNHLSFYSLAISLDEKIPLLTP FVIFYVLAYVQWALNYILIGRDSKKLCYQFVLGDILSKVICLFFFILFPTTLTRPEITGT DIFSQLVRFIYSVDAPVNLFPSIHCLESWCCIRAAYKMNFKTEKRTNYYRVATILMSLGI FASTLFIKQHVIADVFGGIAAFEIGMILAKILICED >gi|222441903|gb|ACEP01000039.1| GENE 2 619 - 1452 493 277 aa, chain - ## HITS:1 COG:CAC0799 KEGG:ns NR:ns ## COG: CAC0799 COG0688 # Protein_GI_number: 15894086 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Clostridium acetobutylicum # 4 267 24 287 291 234 45.0 1e-61 MTILEFLYNTVPGRFLLKPLTGPRLSRICGHFLDSELSSFLIQPFVKQNAIQLSDYETTD IKSFNDFFSRKIKQGKRPIDMEENHLIAPCDGLLSVWKIKENTVFPVKQSHYTISSLLHS KKLAQRYHGGYCLVYRLCVNHYHRYCYVDSGQKSRNFFIPGRLHTVRPVALREVPVFTEN SREYTLIRTEKFGTVVQMEVGAMLVGRIVNHEEKGSTIRGKEKGYFQYGGSTIIVLIEPE QVQIREDILQSSALTKEVPVKMGEVIGHALEHKRTIQ 
>gi|222441903|gb|ACEP01000039.1| GENE 3 1559 - 2263 667 234 aa, chain - ## HITS:1 COG:CAC0798 KEGG:ns NR:ns ## COG: CAC0798 COG1183 # Protein_GI_number: 15894085 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Clostridium acetobutylicum # 2 216 3 189 205 91 35.0 2e-18 MIGFYDYTVILTYLSLMSGTIGIMLCLNGMGHPYLGMFFLLFSGLCDTFDGKVARSKKDR TTQMKKFGIQIDSLSDLIAFGMLPACIGIAMLRYTMCIPDLSGVTRYLLTHYTLQTQIVL SLIAVLYVLAAMIRLAYFNVMEEEDQNRDETGAKIYTGLPVTSAALIFPAVLLIHMLIRA DLTFLYFGVMLITGGLFISKIQIKKPQNKGIATMIFFGAVECVVLIFALKFLKK >gi|222441903|gb|ACEP01000039.1| GENE 4 2247 - 3323 695 358 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1550 NR:ns ## KEGG: EUBREC_1550 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 12 349 4 341 341 238 41.0 2e-61 MKEKSIIQQNLKKIIGVIVVSVFAIITIYTVFQGSGISLNELTASLKEASWEGILLASVS MLGFIYFEGEALRVLVRHMGYPAKRSHGFVYSAADVYFSAITPSASGGQPASAYFMLKDG IAGTAVMAALLLNLIMYTLAILTIGLVDILIFPEVFLNFSIGCRVLIVAGGLALAGLGII FYLLLRRQALIESVGTFFVRILRMIHCGRLADKLEKKLEVSIEKYSSLVDVIFDGKRMLW KVYILNLLQRLSQIIVTLFSFAALHGDLRKLPQLLATQIYVVMGSNCVPIPGGMGVTDYL MLKGYQQLMSREAAFQLEMLSRGLSFYVCIFVSLTAVLIGYVTIKRKKKLENENDRIL >gi|222441903|gb|ACEP01000039.1| GENE 5 3577 - 4308 436 243 aa, chain - ## HITS:1 COG:no KEGG:FMG_1610 NR:ns ## KEGG: FMG_1610 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 1 242 1 242 243 313 73.0 5e-84 MEVRKRFENDKLKKLQIFNGAQLKYIAFISMLIDHVNNALITPFLNGKGFLLHLSNLFSI LGRIAFPIFVFFIVEGFFKTSNRKKYLITLLLFGVISEVPFDMFTSKVFFDPYWNNIMFT LALCLITIWIIDAVKAKISNKVLWYSISIVVVGISCAVAMALSLDYDYHAIIVAYIFYIF YDKPILGAGLGYLSIIKELYSCLGFAMTVTYNGERGRQNKWINYLFYPVHILILGILRFY LNI >gi|222441903|gb|ACEP01000039.1| GENE 6 4393 - 5541 1285 382 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # 
Organism: Bacillus halodurans # 5 377 78 454 458 348 44.0 7e-96 MSKKCPYEKKCGGCQYIDLPYEEQLKKKQKNTNKLLKSFGKVKPIIGMKDPWHYRNKVHG VVAGDRHGNCFTGIYENRSHRVIRVDSCLIENQKADEIMNTVTSLMKSFKMRPYNEDTGY GFLRHILVRTGYHTGQIMVVLVTASPVFPSKNNFVKALRKEHPEITTIIANVNNRNTSMV LGDKQQILYGKGYIEDVLCGKTFRISPKSFYQVNPIQTEVLYNKAIEFAGLTGKERVIDA YSGIGTIGMVASDNAGEVISVELNGDAVRDAVFNAKQNKVKNIRFYKDDAGDFMVKMASH KEKADVVFMDPPRSGSDKKFLDSLLRLRPKKVVYISCNPETLARDLKHLTGNGYRIKEIQ PVDMFPSTGHVEAVCLLSRREK >gi|222441903|gb|ACEP01000039.1| GENE 7 5625 - 5990 502 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916248|ref|YP_002929765.1| large subunit ribosomal protein L7/L12 [Eubacterium eligens ATCC 27750] # 1 121 1 121 121 197 86 4e-50 MTTQEIIEVIKGLSVLELNDLVKACEEEFGVSAAAGVVVAAGAGAGEAAEEKTEFDVELT EVGGQKVKVIKVVREITGLGLKEAKAVVDGAPKAIKEAVSKEEAEDIKKQLEEVGAKVTV K >gi|222441903|gb|ACEP01000039.1| GENE 8 6028 - 6546 565 172 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922788|ref|YP_002936301.1| 50S ribosomal protein L10 [Eubacterium rectale ATCC 33656] # 1 171 1 169 183 222 66 2e-57 MAKVELKQPIVDEIKAQLDGAKSMVVVDYRGISVADVTELRKQCREAGVVYKVYKNTMVK RAVEGTEFEGVVKDLEGTNAFAISKDDATAPARIVANFIKKAKCCELKSGVVEGTYYDAA GIATIATIPSRDELIAKFMGSIQSPITNFARVIKQIAEQKDGGAAESEAAAE >gi|222441903|gb|ACEP01000039.1| GENE 9 7217 - 8941 2059 574 aa, chain - ## HITS:1 COG:aq_1450 KEGG:ns NR:ns ## COG: aq_1450 COG0265 # Protein_GI_number: 15606621 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Aquifex aeolicus # 170 501 13 346 453 188 36.0 2e-47 MNEFENGFHNENLENNRINNQENNSQQVAIDVTPIEESVTQENTESHRYSYREQGTGGSS YQYDNGNNCNDTYNNSYSSHGWRDESTYNNENYYGNIPPEPDKRRRQRKNGSKNNKNGMG KKAAKLVASAAVFGLVAGACFVGVSVAKDKLYPSTADRIETTSGTTSAKSETSSSGSSSS SSNVASVVNEVMPSVVSITSTIQSSNYYGFGTQESEGAGSGFIVAKTKDNLMIATNNHVV SDATSLTVGFVDDTTAKATVVGTDSSADLAVISVKIKDIKDSTASKIKVATLGSSDDLKV GEEVVAIGNALGYGQSVTTGVVSAKNREVSLTDGTMNLLQTDAAINPGNSGGVLINMDGQ 
VVGINNAKLEDTSVEGMGYAIPITTAKTILTDLMNANSVSTKDAAFLGVVGRDINESYSS ALGIPSGIYVSQVVSGSPAEKAGISAGDVITKFEGNNVSTMSGLKEKLALKKANTKVKIT FKRANQSGTYEEKTVTVTLGKKSDFSDVTTDNSSDSSNDSNNNSNNGNNNGNSNGNSGNS NGNSGDYGYGNGNFGNDNGNGYINPYEYFFGNNY >gi|222441903|gb|ACEP01000039.1| GENE 10 9559 - 10617 978 352 aa, chain + ## HITS:1 COG:wcaJ KEGG:ns NR:ns ## COG: wcaJ COG2148 # Protein_GI_number: 16129987 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Escherichia coli K12 # 2 352 115 464 464 258 38.0 1e-68 MFYVLSMLLTIAERAAIRYALERTRRKGFNLKHVVVIGFSAAAEAYIDRIKSNPQWGYTI HGIFDDNLKADFSYRNTFCIGKLKDVEKFLQNTSMDEVAIALSLKEYYKLGDMVAICEKS GVHTKFVPDYYKFISTNPVTEDLNGLPVINIRNVPLTNTVNKCIKRLVDIIGSIVCIILF SPIMAVVAVLVKKSSPGPIIFCQERVGLHNKPFKMYKFRSMGVQSASKEQKAWTVHNDPR VTPVGRIIRKTSLDELPQLFNVLKGDMSLIGPRPERPLFVEKFKEEIPRYMIKHQVRPGM TGWAQVCGFRGDTSIEGRIEHDIFYIENWTLSFDIKILFLTVFKGFVNKNAY >gi|222441903|gb|ACEP01000039.1| GENE 11 10712 - 12043 1079 443 aa, chain + ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 121 366 29 273 341 142 37.0 1e-33 MKYGSIYSKDNIENKYNYSSTGGRRSSQNNFETSHDHHTHSTLNTDGLLRSSSGTGYHSS GSSIKNSGKPSTKKSSSQNKPDTKSQKRQTSGGRALTYHERKKKKKRGCFPFLLLIILLI AGAVFAFRFSLKGAFSKIEKYPLDKTSVTVNDTDANIKDYQNIALFGVDSQDNKIKDKGS RTDCIIIASINKSTKKVKLMSIYRDTYVSIDGEYDKINAAYSYGGPELALRTINRNLDLN ITDFATVNFKALADAVDVLGGIPLTINSEKELQNLNDYIGNMNHINGGNSPKFEKTGTYT FDGNQAVAYSRIRYMEGGDHARANHQRLVLEGIMNTAKKQPLKLGKLISTVLPQCKTSLS NDALTKLTLSMARCKIEDSQAYPFKSEDERYNGIYYGFPITAQTNVKKAHEYLFGTKDYK PTEELQKISAKIKVVTDYIGITE >gi|222441903|gb|ACEP01000039.1| GENE 12 12187 - 16299 4895 1370 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026472|ref|ZP_03715664.1| ## NR: gi|225026472|ref|ZP_03715664.1| hypothetical protein EUBHAL_00721 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00721 [Eubacterium hallii DSM 3353] # 1 
1370 1 1370 1370 1649 100.0 0 MEDMMKALLDVVRAQHTATEGSEERPFDINDIIDMALNITGRPEEPEEQQELSDTIQKLA ESLAPDIFPPKTFEEMDDSEQAASAVNMIEERLRNGGRRRETAAPQSQQEMTPQQNSLQM QSESHEENPFAQVMENENANIQQQSETADGYGNMASTDSGASEDYGNGSSYDMFGQDDVN HQEAANLNDLIYNNFMQMMGLNDPKVEYPFDRSQIRYGREKTATEMLAEDEANKAEERAL EEQRRRPVSAWELAQSAVDKDEEAHQKEEYEPKEMKMPETKSASQLAAEAIAKAKEEDQM KLEAEKRAERLMEEARKRGKDPMEFALHQQEILNYMEKNSDELVSFEDYEDLSPEEKLEI EKELYREKQIEAGVAPEDISDELPDEILEQSGIAPDQTAGEQNSQESAGQGDGTTAQQTS QSQGMPAFSDDMLRMISQEVVQENAEMILAEDANADLGLINETIFENLKNLMSQTGGAVT QEDMESLIGEVISRNTSSDSEEKQETLAQTETGTTAETAPMAFEGESAGSAVGSGMAGTR EPAVESSQSESQSQPMSAVELARAAQQAAKPEPQEARETKSAVELAKEAQENAVQKKAEP ISETEEELSEDDLNFDELDLEEESEESQSPSIEELKAQLKAAEEALAAEQLKAAQKAGKA EEEKKSEEIPKEEEATEQPMEESASTAGEQTATEESSEITPAPTAEEVQEQPEYSEVSEK EAEEFEYVDPGELVLGDHTQAEIDEALDNLASLGLEGEVYERAKRMLLLELAGSEVALDA WLEEQENGKKKKAAVSALDTEEDALEDLEDLDEDDLERELELAMDEDFIEEDLEEPANEE TLEESSQDKSEETEKEAVSEENLEENSAEKTEDESDKTEGAEDVSEQPESILKEASEEEI SSEEENSEEEEGETSEAAQEEAANKEFSETEEEAANREYSETEEKEAANSECSETEEKEI ANKGVSKKIEKEAEYKEAEYISESEDTIQVEKTRPEKAERTSSQTKKPAHSERTSHSRKH KNIVKRKEKTAPEKEEREFSAVVLTGKNVEEKEFQVSVRNPFVLKNSASFMDKFEEYIVD TQENRKLSTGFKRLDAMLRYGLHKGSYFVDATPQYLKNGFMQQIADRAAESGVDVLYIST ELTRYDLMVDTISRLSYEMNKKDEEKAVSSMAIMTGEKGADIRSLKDELNWYRGRISEHL FVLDQEAVSEYVENMEDASAGDILAELIRSIVTEGAHKPVVFIDNIENILSVEDSEDMKP LMDGIRKLAKELGIPIIMSYGYAPAESENELDPDEIEYHKSLGNMCDVYLELKYADMITE DYEELTEDDIQEMVENGEMLLINVQLHKNRRTMKASCQIQATPKFNYYEE >gi|222441903|gb|ACEP01000039.1| GENE 13 16541 - 17410 1004 289 aa, chain + ## HITS:1 COG:BH3372 KEGG:ns NR:ns ## COG: BH3372 COG0613 # Protein_GI_number: 15615934 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 6 257 13 250 266 184 41.0 2e-46 MKTYIDLHIHSTASDGTFSPTEIINDALKLAGDKNPVVIALTDHDTVAGTEEFQKAAAKH KERLTAISGVEISTDYHGVEIHVLGYNIDTKNKALLERLAVCRESRDGRNAKIIEKLQEQ GFKISMDEIKPDKPGETIARPHIAKLLMKKKYVSSVQEAFDKYLAEGRCCYVERIMPTPQ 
EAIELIKNSGGVPVLAHLMFYKKLDSAKKETLVRELKEAGLVGIEAYYNTYTPVEQEYVA GLAKQWGLLKTGGTDFHGQNKPHISLFKGQGEMEVPESILPEFLASLIR >gi|222441903|gb|ACEP01000039.1| GENE 14 17741 - 19105 991 454 aa, chain + ## HITS:1 COG:no KEGG:CLL_A1793 NR:ns ## KEGG: CLL_A1793 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 312 452 34 161 163 75 36.0 7e-12 MKITNKLCLSVASALLFFGLCTYPAVSLNASQTTVKSQSTTSSTEAVSLKAPYFTVTPGY KSAQITWKAVTGANKYKLFCSDGRSVKFKKAGTYTFNQLNEGQTYTFTVKAFAPNAKTVS RKLSATIPVTISQNVTGLSVYASNSRLNLKWNTLYNASGYSVYKKKGSRYTLLANTTRST YQTSITSGASSITFMVKPYTTINGKNYTSSTGATVSCTPNTLLSPLKTIRTMTYFCKTTK KVSLYRSWTSKKVVKTLSSGVTVDLIGRNTKYKRSEILYKGKTYYLTTGSLRAFKCNYTT SKYSTAQKLAYVKKYSSKTSYLIWVSHYTQEVSIFQGRKNNWKLIKSFPCASGNYNTRSP HGTFRIGQKENGWYYVNTYEEYITHYCGRNSFHTRVHRYPSGSSQNHHKFPIASTVYADA TIGKPASNGCVRLMDSDAYYIWKYMPANTTVISY >gi|222441903|gb|ACEP01000039.1| GENE 15 19974 - 21038 1031 354 aa, chain - ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 1 352 1 354 355 336 50.0 4e-92 MAKKALIAMSGGVDSSVAAYLMKEAGYDCIGVTMKLYDNEDIGMAREKTCCSLSDIEDAR SVCVKLGIPYYVFNFKDDFKEKVIDNFISCYQNGMTPNPCIECNRHLKFEHLYKRAKVLH CDVIVTGHYAKITHDEQGYHLLKGVDDSKDQSYVLYSLTQEQLANTCFPLGDYSKTEIRK IAEEQGFFNAHKHDSQDICFIPDGDYMSFIEKTTGKKSEPGDFVNQEGEVVGRHKGYYGY TIGQRKGLGISAPKPYYVTEIRPEENKVIVGSNEDLFKTDLIANDFNWIENLAEDEVAKA RIRYHQTEKEATARKNSDGTVDIHFLEPQRAITKGQAVVLYRDNHVVGGGTIIK >gi|222441903|gb|ACEP01000039.1| GENE 16 21328 - 21990 492 220 aa, chain + ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 220 1 219 221 233 58.0 2e-61 MNKKVQISESIELGILLALSGGFMDAYSYIGRGEVFANAQTGNMLLLGVHLSEGNIPAAI 
RYLCPVLAFTFGIALADIVRNSGIGSHLHWRQISVVAEAIILTIVSFLPQSHNLLANSLT SLACGIQVESFRKIHGNGIATTMCIGNLRSGTQNIYTFFQTKKREALEKGLLYYGIILCF ICGAIIGNQCIKYFHERAILISSVLLLLAFIMMFITRDEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:02:17 2011 Seq name: gi|222441902|gb|ACEP01000040.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont44.1, whole genome shotgun sequence Length of sequence - 24477 bp Number of predicted genes - 29, with homology - 28 Number of transcription units - 15, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 + CDS 2 - 250 183 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 + Prom 253 - 312 2.1 2 1 Op 2 2/0.000 + CDS 342 - 647 355 ## COG0526 Thiol-disulfide isomerase and thioredoxins 3 1 Op 3 3/0.000 + CDS 660 - 980 318 ## COG0607 Rhodanese-related sulfurtransferase 4 1 Op 4 . + CDS 982 - 2673 1456 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Term 2716 - 2777 11.3 - Term 2706 - 2763 9.1 5 2 Tu 1 . - CDS 2794 - 3453 426 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 3693 - 3752 8.3 + Prom 3751 - 3810 3.6 6 3 Op 1 . + CDS 3970 - 4227 153 ## gi|225026484|ref|ZP_03715676.1| hypothetical protein EUBHAL_00733 7 3 Op 2 . + CDS 4240 - 4551 323 ## COG0784 FOG: CheY-like receiver + Prom 4768 - 4827 10.4 8 4 Op 1 . + CDS 4961 - 5230 117 ## gi|225026486|ref|ZP_03715678.1| hypothetical protein EUBHAL_00735 + Prom 5243 - 5302 6.7 9 4 Op 2 . + CDS 5409 - 5726 139 ## gi|225026487|ref|ZP_03715679.1| hypothetical protein EUBHAL_00736 + Term 5728 - 5774 9.1 - Term 6099 - 6145 2.8 10 5 Op 1 . - CDS 6167 - 7018 487 ## COG3649 Uncharacterized protein predicted to be involved in DNA repair 11 5 Op 2 . - CDS 6999 - 7346 255 ## gi|225026489|ref|ZP_03715681.1| hypothetical protein EUBHAL_00738 12 5 Op 3 . 
- CDS 7343 - 7612 419 ## gi|225026490|ref|ZP_03715682.1| hypothetical protein EUBHAL_00739 - Prom 7642 - 7701 12.9 13 6 Tu 1 . + CDS 7971 - 8360 354 ## COG3436 Transposase and inactivated derivatives + Prom 8598 - 8657 8.6 14 7 Op 1 . + CDS 8798 - 9229 260 ## CDR20291_1773 hypothetical protein 15 7 Op 2 . + CDS 9235 - 9465 308 ## LMHCC_1339 conjugative transposon protein + Prom 9938 - 9997 5.9 16 8 Tu 1 . + CDS 10032 - 11771 956 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 11784 - 11822 -0.4 + Prom 11966 - 12025 11.2 17 9 Op 1 11/0.000 + CDS 12097 - 13362 1495 ## COG3705 ATP phosphoribosyltransferase involved in histidine biosynthesis 18 9 Op 2 18/0.000 + CDS 13364 - 14005 870 ## COG0040 ATP phosphoribosyltransferase + Prom 14110 - 14169 8.1 19 9 Op 3 6/0.000 + CDS 14290 - 15585 1827 ## COG0141 Histidinol dehydrogenase 20 9 Op 4 . + CDS 15588 - 16178 908 ## COG0131 Imidazoleglycerol-phosphate dehydratase 21 9 Op 5 . + CDS 16179 - 16514 466 ## COG0139 Phosphoribosyl-AMP cyclohydrolase 22 9 Op 6 . + CDS 16567 - 17382 629 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily + Term 17396 - 17432 0.8 + Prom 17429 - 17488 7.7 23 10 Tu 1 . + CDS 17531 - 17686 172 ## gi|225026502|ref|ZP_03715694.1| hypothetical protein EUBHAL_00751 + Term 17708 - 17758 11.2 + Prom 17711 - 17770 2.1 24 11 Tu 1 . + CDS 17801 - 20437 3548 ## COG0013 Alanyl-tRNA synthetase + Prom 20557 - 20616 2.8 25 12 Tu 1 . + CDS 20641 - 21051 519 ## COG0242 N-formylmethionyl-tRNA deformylase - Term 21358 - 21404 4.0 26 13 Tu 1 . - CDS 21479 - 21550 67 ## - Prom 21751 - 21810 5.2 27 14 Op 1 . - CDS 21831 - 22418 444 ## gi|225026507|ref|ZP_03715699.1| hypothetical protein EUBHAL_00756 28 14 Op 2 . - CDS 22457 - 23311 559 ## Selsp_1548 hypothetical protein - Prom 23348 - 23407 2.2 29 15 Tu 1 . 
- CDS 23435 - 24295 599 ## Selsp_1548 hypothetical protein - Prom 24334 - 24393 7.8 Predicted protein(s) >gi|222441902|gb|ACEP01000040.1| GENE 1 2 - 250 183 82 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 78 165 238 242 75 46 4e-13 TVSLARELGPKGIRVNAVAPGITETDMMKAVPKEVIEPMIERIPLRRLGQPEDIANAFVF LASDEASYITGVILSVDGMARS >gi|222441902|gb|ACEP01000040.1| GENE 2 342 - 647 355 101 aa, chain + ## HITS:1 COG:MT4033 KEGG:ns NR:ns ## COG: MT4033 COG0526 # Protein_GI_number: 15843547 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Mycobacterium tuberculosis CDC1551 # 2 101 8 108 116 107 49.0 6e-24 MSAININKNNFQNEVLNSDKPVLLDFWASWCAPCRMVVPIVEEIASERRDIKVGKINVDE EPELANKFSIMSIPTLVVMKNGKIVQQISGARPKNAILEML >gi|222441902|gb|ACEP01000040.1| GENE 3 660 - 980 318 106 aa, chain + ## HITS:1 COG:BH2813 KEGG:ns NR:ns ## COG: BH2813 COG0607 # Protein_GI_number: 15615376 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Bacillus halodurans # 16 106 36 124 125 68 42.0 3e-12 MGFFDLFKQSNINQGIEEYKRTDDAVLLDVRTPQEYQEGHIPESKNVPLQQLDNIASVAW DKAIPLFVYCYSGSRSRQATGILQRMGYSKVNNIGGIAAYSGKVEK >gi|222441902|gb|ACEP01000040.1| GENE 4 982 - 2673 1456 563 aa, chain + ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 2 469 3 469 469 411 43.0 1e-114 MKVIIVGGVAGGATAAARIRRLNEHAEITVFERSGYISYANCGLPYYIGDVITDPEELTL QTPESFFKRFHINMKIHHEVISIHPERKTVSVKNLENGEIFEEKYDKLILSPGAKPTQPR LPGVGIDKLFTLRTVEDTFRIKEYINKNHPKSVVLAGGGFIGLELAENLRELGMDVTIVQ RPKQLMNPFDPDMASMIHNEMRKHGIKLVLGYTVEGFKEKDNGVEVLLKDNPSLQADMVV 
LAIGVTPDTVLAKEAGLELGIKESIVVNDRMETSVPDIYAAGDAVQVKHYVTGNDALISL AGPANKQGRIIADNICGGDSRYLGSQGSSIIKVFDMTAATTGINETNAKKSGLEVDTVIL SPMSHAGYYPSGKVMTMKAVFEKETYRLLGAQIIGYEGVDKRIDVLATAIHAGLKATQLK DLDLAYAPPYSSAKDPVNMAGFMIDNIAKGTLKQWHLEDMDKISRDKSVVLLDVRTVGEF NMGHMDGFKNIPVDELRERINEIEKGKPVYLICQSGLRSYIASRILEGNGYETYNFSGGF RFYDAVVNDRALIEKAYACGMDY >gi|222441902|gb|ACEP01000040.1| GENE 5 2794 - 3453 426 219 aa, chain - ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 3 209 10 212 217 84 29.0 1e-16 MSFENYFPLWNDLNTAQKKLISDNLITRNVKKGTIIHNGNLDCTGLLLVKSGQLRTYILS DEGREITLYRLFDMDMCLLSASCIIRSIQFEVTIEAEKDTDLWIIPAEIYKGIMKESAPV ANYTNELMATRFSDVMWLIEQIMWKSLDKRVASFLLEETSIEGTNELKITHETIANHLGS HREVITRMLRYFQGEGLVKLSRGKITILDPKRLETLQRS >gi|222441902|gb|ACEP01000040.1| GENE 6 3970 - 4227 153 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026484|ref|ZP_03715676.1| ## NR: gi|225026484|ref|ZP_03715676.1| hypothetical protein EUBHAL_00733 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00733 [Eubacterium hallii DSM 3353] # 1 85 1 85 85 160 100.0 4e-38 MGMSADFKSKIFDPFPHAEGSVTNKIQGTNLRMAITRNLVGTIDMENELGQKSCFEVLID LRIAEDRFVSSAVREEKNEKVDSIL >gi|222441902|gb|ACEP01000040.1| GENE 7 4240 - 4551 323 103 aa, chain + ## HITS:1 COG:slr2104_4 KEGG:ns NR:ns ## COG: slr2104_4 COG0784 # Protein_GI_number: 16330590 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Synechocystis # 1 100 11 105 130 73 40.0 7e-14 MCTEDNELNVEILKELLKIESAECTICENGEKILEEFEYYDLGEYDMILMDVQMPVVNGY EVTMAIRRSSYELEKTIPIIFMTANAFSEDIQHSLTLGINAPV >gi|222441902|gb|ACEP01000040.1| GENE 8 4961 - 5230 117 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026486|ref|ZP_03715678.1| ## NR: gi|225026486|ref|ZP_03715678.1| hypothetical protein EUBHAL_00735 [Eubacterium hallii 
DSM 3353] hypothetical protein EUBHAL_00735 [Eubacterium hallii DSM 3353] # 1 89 1 89 89 150 100.0 4e-35 MITKKRKLSERKLMALSCGSYLLGIVCMTILGYLYDKIPCRNFLYIIPICLLILGAIFSC MLHIMYPIFLIEWKEVSKKDIKEMTTKKE >gi|222441902|gb|ACEP01000040.1| GENE 9 5409 - 5726 139 105 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026487|ref|ZP_03715679.1| ## NR: gi|225026487|ref|ZP_03715679.1| hypothetical protein EUBHAL_00736 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00736 [Eubacterium hallii DSM 3353] # 1 105 8 112 112 174 99.0 2e-42 MHIEKYVAKNIISLCEKRDISKYRLSQLSGISQSSLSRIMTQESLPSLITLEKICAALGV TLSQFFREKNPVDLTESQKEILKIWDNLSTKEQETVLAMLRGLQK >gi|222441902|gb|ACEP01000040.1| GENE 10 6167 - 7018 487 283 aa, chain - ## HITS:1 COG:BH0339 KEGG:ns NR:ns ## COG: BH0339 COG3649 # Protein_GI_number: 15612902 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Bacillus halodurans # 1 264 1 259 283 235 48.0 7e-62 MNPLQNRIDFCVIISVDGANPNGDPATMSWYPRVLYDGHGEISDVCLKRKIRNRLQDMGE EIFVQEDSRIDDGYRNLKERILNFDDFKNEYRNKKPDFKKIYDSTCKKWIDIRTFGQVFP FKGAGNNLSTNVRGCMSLWGATSLDVIDVQEIISIKSTNLNETEEGRKDSASFFHRYMVH KAAYVTYGSIYCQLAEKNNFTEEDAEKIHQALITLFEGDAAAMRPAGTMNVQKVYWWKHN CKTGQYPQIKVFKTLDIQPQKEYPFFTVTETPLPDLTPEVYTL >gi|222441902|gb|ACEP01000040.1| GENE 11 6999 - 7346 255 115 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026489|ref|ZP_03715681.1| ## NR: gi|225026489|ref|ZP_03715681.1| hypothetical protein EUBHAL_00738 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00738 [Eubacterium hallii DSM 3353] # 1 115 1 115 115 195 100.0 1e-48 MITYDLFTKLYGEAKTYDSLELYISERGWQDWMDDFDIETVTAILKKIYEYSIESLQDIR TKLNYSRIGLSRAYYIPLRTLENWDNKQNMPEYVEVLIKYTFFIKEMTKDESITE >gi|222441902|gb|ACEP01000040.1| GENE 12 7343 - 7612 419 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026490|ref|ZP_03715682.1| ## NR: gi|225026490|ref|ZP_03715682.1| hypothetical protein EUBHAL_00739 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00739 
[Eubacterium hallii DSM 3353] # 1 89 1 89 89 168 100.0 1e-40 MATLDEKNTYYIDYGTGAGNFEFTGTLEEAMEEANKGLCYTQVPVSISIKDGCDDIAYLP WYGIEASEDDVITASFGKFGFYGEWRINE >gi|222441902|gb|ACEP01000040.1| GENE 13 7971 - 8360 354 129 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 2 117 330 445 450 99 42.0 1e-21 MEASIDRSSLIGDAVLYTLNQEVYLRRYLEDGHLSIDNNSAERAIKNFAVGRRNWLFAKS IRGADASAIVYSIVETALLSGLKPYLYLTYVLEKLLQTGAFPKPEELDRLLPWSNELPKE LRTKIKSKK >gi|222441902|gb|ACEP01000040.1| GENE 14 8798 - 9229 260 143 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1773 NR:ns ## KEGG: CDR20291_1773 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 143 1 143 145 96 41.0 4e-19 MEQSPSYKRKIQIQFQAYCYKILRGEAVDYYRYVNYLQKHEKSIENLSSEESDELYVSDE YRSDDHLFKVLEYDIGIKSDLLAEVLHLLSQKKRDVILLSFFLGMSDTEIARIMCVVNST IHEHKKKALKLMKSELEKRGMEE >gi|222441902|gb|ACEP01000040.1| GENE 15 9235 - 9465 308 76 aa, chain + ## HITS:1 COG:no KEGG:LMHCC_1339 NR:ns ## KEGG: LMHCC_1339 # Name: not_defined # Def: conjugative transposon protein # Organism: L.monocytogenes_HCC23 # Pathway: not_defined # 4 76 3 80 80 65 44.0 7e-10 MQKDSRKNEERLLPFETIKAATEGDVDAMDKILKHYKPYIVKLSIRTDGDKSYIDEDLRE RLEAKLIAKTLEFETA >gi|222441902|gb|ACEP01000040.1| GENE 16 10032 - 11771 956 579 aa, chain + ## HITS:1 COG:CAC2247 KEGG:ns NR:ns ## COG: CAC2247 COG1961 # Protein_GI_number: 15895515 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Clostridium acetobutylicum # 23 512 1 504 523 119 26.0 2e-26 MEVKIIKANTPTELSSKRKIERLRVAAYCRVSTDDEDQTKSYQSMIKYYTDLIHNNKQWV FAGIFADKAITGTKADIREEFQRLIQECLSGKIDLVIAKSIPRFARNTLDTLKYVRMLKE RNVAVYFEVEKINTLKDGEFLLTILSSVAQQEVENTSAYVKKGLKMKMKRGELVGFQKCL 
GYDYDPLTKTLSINEEGAKTVGYIFKRYVAGAGSTMIARELNEQEIPTVHGNRWSPSSVM GIINNEKYMGDLLLGKSFTVDPISKRRLTNMGEEDRFYIENHHEAIISKEIFEKAQEIRQ RRSKGRVNGVETGKREKYSRQYAFSSMLECGFCGETVSRRCWHSHSKYEKIIWHCGRNIR KGKRFCPEAKGIPEQVIEKAFVESYQMLCSDHKDVLEEFIKRIEETLSGDSIKDRIKKID QSIRNLKYRRRKLLDKYLDGGIAKDIYEEADLDLDKKIFKANQDLIELEQQYNSENSLQK RLKDFKKTLLKNQTLEEFDREIFESIVEKVIIGGYDENGKEDPYKIIFIYKTGFKDEIEN AKKRYENTSGRVDGKLCSYITSEVKDLYPCNSDNTHRPC >gi|222441902|gb|ACEP01000040.1| GENE 17 12097 - 13362 1495 421 aa, chain + ## HITS:1 COG:CAC0935 KEGG:ns NR:ns ## COG: CAC0935 COG3705 # Protein_GI_number: 15894222 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase involved in histidine biosynthesis # Organism: Clostridium acetobutylicum # 7 367 7 368 407 243 37.0 7e-64 MKEKLLHTPEGVRDIYNSECKRKSAVEKRIHHILELYGYQDIETPTFEFFDIFNSDTGTV SSVEMFKFFDRNNNTLVLRPDMTPSIARSVAKYYSACEFPLRLSYRGNTFLNNRSYQGKL TEKTEMGAELINDDSPEADAEGIIMMIDCFLAAGLTDFRIDIGQAEFYKGIMGALEASEE VKEQIHEYIENKNALGMEILLSDLSMKDCYRNILLEYNDLYGDVTMLEKAKKLTINPRCQ AAIARLEEVYEVLKLYGYEKYVSFDLGMVNHFDYYTGIIFRGYTYGTGDAIGKGGRYDNL LAQFGKEASACGFTILIDDVLSALSRQNISISLKEYAYSLLLYKSEDRQNAIPLVVKLRD AGVRTELVSMTEGKTLEEYLAFGKNQGAEGILYLDGENVTVYDMASGDQETKTLQEFREE Y >gi|222441902|gb|ACEP01000040.1| GENE 18 13364 - 14005 870 213 aa, chain + ## HITS:1 COG:CAC0936 KEGG:ns NR:ns ## COG: CAC0936 COG0040 # Protein_GI_number: 15894223 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 208 5 208 215 185 49.0 6e-47 MRYLTFALAKGRLANKTMDMFEKLGITCDSIRDKETRKLIFVNEDLKLKFFLAKASDVPT YVEYGAADIGIVGEDTILEEGRNLYEVMDLGFGKCKMCVCGPESAREKLQHGELIRVASK YPSIAKDYFYNKKFQTVEIIKLNGSVELAPIVGLSEVIVDIVETGSTLKANGLTVLEEIC DLSARMVVNQVSLKMEQERIRKLIYDLRKLKQQ >gi|222441902|gb|ACEP01000040.1| GENE 19 14290 - 15585 1827 431 aa, chain + ## HITS:1 COG:CAC0937 KEGG:ns NR:ns ## COG: CAC0937 COG0141 # Protein_GI_number: 15894224 # Func_class: E Amino acid transport and metabolism # Function: Histidinol 
dehydrogenase # Organism: Clostridium acetobutylicum # 1 429 5 431 431 464 57.0 1e-130 MRVLKVNKENTKNILSDLLKRSPNNYDQYAGTVQGILDEIKEKGDEALFAYTEKFDKCKL TKDTIQVTGEEIKAAYEKVDDSLVEVIRKSLVNIREYHEKQKRNSWFDAKPDGTILGQKM TALASVGVYVPGGKAAYPSSVLMNIVPAKVAGVEKIVMVTPPNKEGEVTPATLVAANEAG ATHIYKVGGAQAIGALAYGTESIEKVDKIVGPGNIYVALAKKAVYGHVSIDSIAGPSEIL VLADDTANPHFVAADLLSQAEHDELASAILVTTSKTLGEKVQEEIEGYLKKLSRAEIIEK SLENFGYILEAETLNEALEVVNAIASEHLEIVTANPFEVMTKVRNAGAIFIGEYSSEPLG DYFAGPNHVLPTNGTAKFFSPLNVDDFVKKSSIIYYSKDALKAIHKDIETFAQAEGLTAH ANSIAVRFEEE >gi|222441902|gb|ACEP01000040.1| GENE 20 15588 - 16178 908 196 aa, chain + ## HITS:1 COG:CAC0938 KEGG:ns NR:ns ## COG: CAC0938 COG0131 # Protein_GI_number: 15894225 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Clostridium acetobutylicum # 3 196 4 197 197 228 56.0 4e-60 MAERKAHVERNTSETKIIVDLNLDGSGKYDIDTGIGFFDHMLNSFAKHGFFDLTCKVDGD LEVDCHHTIEDTGIVLGQAIREAVGDKKGIRRFGNFLLPMDETLVLGAVDLSGRPYCVCD LPFTVERVGGFDTEMVQEFFYAISYSAAMNLHLKLMHGSNNHHIMEAAFKAFAKALDEAC SYDPRITDVLSTKGAL >gi|222441902|gb|ACEP01000040.1| GENE 21 16179 - 16514 466 111 aa, chain + ## HITS:1 COG:CAC0942 KEGG:ns NR:ns ## COG: CAC0942 COG0139 # Protein_GI_number: 15894229 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Clostridium acetobutylicum # 14 105 13 104 115 125 60.0 2e-29 MKSKLSFDEFKLNSDGMIPVITQDYKTNEVLMLAYMTKESFEKTLETGKMTYFSRSRQEL WTKGDTSGHYQYVKSLDIDCDKDTILAKVEQIGAACHTGNRTCFFTRLAEE >gi|222441902|gb|ACEP01000040.1| GENE 22 16567 - 17382 629 271 aa, chain + ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 8 271 5 266 271 248 49.0 7e-66 MDMNKLQEKYQKLLSYLKELDSVAIAFSGGVDSTFLLAAAKEALGDKVIALTAKSCLFPE RESGEAEDFCKNLGVEQIVLDLGDTELPAFKHNPPNRCYICKKGLFEKFLQKAKEYQMSY 
VAEGSNMDDLGDYRPGLQAITELGIKSPLRAAELTKAEIRELSKQMGLPTWSKPSFACLA SRFVYGETITAEKLSMVERAEDLLRSNGFLQFRVRIHGTIARIELLPEDIPRMLDSELRK EIVDKLKRYGFTYVSLDLEGYRTGSMNEVLA >gi|222441902|gb|ACEP01000040.1| GENE 23 17531 - 17686 172 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026502|ref|ZP_03715694.1| ## NR: gi|225026502|ref|ZP_03715694.1| hypothetical protein EUBHAL_00751 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00751 [Eubacterium hallii DSM 3353] # 1 51 1 51 51 77 100.0 3e-13 MSAVLSAFVQYIITFILLVAAAAGGIFCGKKLRTKKNMESNTAGTGNTSGK >gi|222441902|gb|ACEP01000040.1| GENE 24 17801 - 20437 3548 878 aa, chain + ## HITS:1 COG:CAC1678 KEGG:ns NR:ns ## COG: CAC1678 COG0013 # Protein_GI_number: 15894955 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 876 1 878 881 904 52.0 0 MKKYGVNELRKMYLEFFESKGHLAMKSFSLVPHNDNSLLLINAGMAPLKPYFTGQEIPPR KRVTTCQKCIRTGDIENVGKTDRHGTFFEMLGNFSFGDYFKHEAIAWSWEFLTEVVGLDA DRLYPSVYLEDDEAFAIWRDEIGIAPERIFKFGKEDNFWEHGAGPCGPCSEIYYDRGEKY GCGKPGCTVGCDCDRYMEVWNNVFTQFENDGHNNYTTLKQKNIDTGMGLERLAVIVQDVG SIFDVDTIRSLRDKICEVAKKTYKENEQDDISIRLITDHIRSATFMISDGIMPSNEGRGY VLRRIIRRAARHGRLLGIEGKFLAELSKTVIEGSKDGYPELEEKKDFIYNVIIQEEGKFN KTIDQGLAILADLEESMKEKGVKELDGQDAFKLYDTYGFPLDLTKEILEEKGYTVNEEAF QTCMNEQKEKARSARKTTNYMGADVTVYESIDPSVTSTFVGYETQECDSKITVMTTDTEL TEALTDGQAGTIFVDETPFYATGGGQHADSGVITCKDGEFVVEDVVKMLGGKIGHIGHVT KGMFKVGDTVTLSVNKAQRADTAKGHSATHLLQKSLRTVLGNHVEQSGSYVDKDRLRFDF SHFQALTAEELAEVEKMVNEKIAEDLTVSTEIMSVDEAKNTGAMALFGEKYGDKVRVVTM GDFSKEFCAGTHVPHTGVIKAFKIISETGVAAGIRRIEALTGDGVMKYYLDEEKTLHEAA KAAKAEPHKLAEKIQSMLDEIKALSAENEKLKDQIAKSEVADVMDQVVEAGDYKVLPVSV KDVDMNALRTLGDDLKGKIGSGVVILASDMGGKVNLIVTATDDAVKAGAHAGKIIKEAAA LVGGGGGGRPNMAQAGGKNPAGIKAALEKAVEIAKEQL >gi|222441902|gb|ACEP01000040.1| GENE 25 20641 - 21051 519 136 aa, chain + ## HITS:1 COG:SP1549 KEGG:ns NR:ns ## COG: SP1549 COG0242 # Protein_GI_number: 15901392 # Func_class: J Translation, ribosomal structure and biogenesis # 
Function: N-formylmethionyl-tRNA deformylase # Organism: Streptococcus pneumoniae TIGR4 # 1 136 1 136 136 145 51.0 2e-35 MVRAIMKDVLFLKKKAVPATKDDMPVVQDLLDTLKANETGCVGMAANMIGVNKRIIAVNM GFANIAMLNPVIVSKKEPYEAEEGCLSLKGVRKTTRYQEIEVEYEDIHFKKQHQKYKGWV AQIIQHEVDHCEGIII >gi|222441902|gb|ACEP01000040.1| GENE 26 21479 - 21550 67 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGFGYRMILLLLPAKPLAKYSHD >gi|222441902|gb|ACEP01000040.1| GENE 27 21831 - 22418 444 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026507|ref|ZP_03715699.1| ## NR: gi|225026507|ref|ZP_03715699.1| hypothetical protein EUBHAL_00756 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00756 [Eubacterium hallii DSM 3353] # 1 195 20 214 214 322 100.0 9e-87 MAVRETYISRILSALSDGVAKVDEKIKEVISIWICIDARKNEDSIIEYQLRPEVKFGENI HPEEINLLKGILIKIRAGKNMTQSKNKLIEMLEYVIADHSIEEKKNYLKNKCGVEMTKEL ERKVEAMGESMGRVILQGMLEDAWDKGVEQERRNTEKERENAIAAFISFGIPKEKILEKG YTEAEYTKVKRKLLS >gi|222441902|gb|ACEP01000040.1| GENE 28 22457 - 23311 559 284 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1548 NR:ns ## KEGG: Selsp_1548 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 13 278 7 277 286 181 39.0 4e-44 MTDSTILQPEGLEQRYERYKEKIKHFTIMNDIFMRNVLKEISCTEYILQVIMNRTGLKVI DQTLQKDYKNLQGRSAILDCVARDAENNHFNVEIQGENDGASPKRARYHCGLLDMNLLNP GDFFDNLPETYVIFITKNDVLGYNLPINHIRRRSNETGEIFEDGQHILYVNSKKQDDTEL GRLMHDLHCEEADKMYSNVLSARVQQLKETEEGVNQMCQELEEIYNEGVERGEQSGFLRG EQSGELKKARETTLALLEMGMSVKQIAKAVNLSVETIQNWISKA >gi|222441902|gb|ACEP01000040.1| GENE 29 23435 - 24295 599 286 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1548 NR:ns ## KEGG: Selsp_1548 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 13 278 7 277 286 184 40.0 4e-45 MTDSTILQPEGLEQRYERYKEKIKHFTIMNDIFMRNVLKKTACTEYILQVIMNRTGLKVI DQTLQKDYKNLQGRSAILDCVAKDAENNFFNVEIQGENDGASPKRARYHCGLLDMNLLNP GDPFDSLPETYVIFITKNDVLGYNLPINHIRRRSNETGEIFEDGQHILYVNSKKQDDTEL GRLMHDLHCKEADKMYSNVLSARVQQLKETEEGVNQMCQELEEIYNEGVEKGEQFGFLRG 
EQSGKLKAKKETTLALLEMGMSSEQIAKAVNLSIETIQNWISEASF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:03:24 2011 Seq name: gi|222441901|gb|ACEP01000041.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont45.1, whole genome shotgun sequence Length of sequence - 6067 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 332 - 380 -0.8 1 1 Tu 1 . - CDS 456 - 2591 1227 ## COG3711 Transcriptional antiterminator - Prom 2626 - 2685 5.2 + Prom 2730 - 2789 5.0 2 2 Op 1 11/0.000 + CDS 2911 - 4782 2560 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component + Term 4851 - 4878 -0.8 + Prom 4784 - 4843 4.7 3 2 Op 2 . + CDS 4897 - 6039 1620 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases Predicted protein(s) >gi|222441901|gb|ACEP01000041.1| GENE 1 456 - 2591 1227 711 aa, chain - ## HITS:1 COG:BH3853 KEGG:ns NR:ns ## COG: BH3853 COG3711 # Protein_GI_number: 15616415 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus halodurans # 1 505 1 500 700 184 26.0 4e-46 MQLTPRADKIIRILLRFSPDSPVTIGIISEELGVSDRSVQRELPTVEKWMKHNGFCFVRK RSVGLLIDETPERLKELAALLDEKDTNSSAPADNRPERLTLLCHDLLLAEEPIKSYYFTE KFEISEGTLTADLNQLETWFTKYQLKLVRRPGLGVFIEGTEIARRQALTSFICKQVNEHH SIGNLQDKKFLSDRNFINEIDGEVMAEVNHILGGCQKQLGMQLSDNGYLHLLVYTSLCVQ RMQKGRFIKELKQSYAEISIQPEYAVSEYIMKELRQFFHLSISVDETIYMAVFISGQRIW PSGSRDLTDIKNLDVHQITLSIIKKVNEILHINFMRDSRLLTELSGHIQPTIARLRAGIP VENPLLKEFQESYPSVYKACEEGLSLLCPILGIKKVPASEIGFITLYFTMAMERIEKEIK KLSVMIVCPTGIGSSRLLTESLKKEYPDLDIRGITSAFELDNIRLQEEGVDLVISTVKLE IAYPYIHVNPILTRQDKILLDSRIKVIQEQKRQAQEKEIKEPEVPSDSSICRKDVEFLSA IGSEIYELLGNIYINQAPILKNRDEIISYCSSLYADTSDMQAHLYKILKDRDNLADTYIK PFQALLLHGRSTEISTSCFGYVRLEPPIYERGHIIRGAIVSLIPSGPASEVAAPVTSEVI GALLEEPALLKALQSLNKENFTKQLEEILLRFYKSTVQTRLHLKTSGAEKV >gi|222441901|gb|ACEP01000041.1| GENE 2 2911 - 4782 2560 623 aa, chain + ## HITS:1 COG:VCA1045_1 KEGG:ns NR:ns ## 
COG: VCA1045_1 COG2213 # Protein_GI_number: 15601796 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannitol-specific IIBC component # Organism: Vibrio cholerae # 2 477 7 488 503 461 50.0 1e-129 MKNAVQRFGKFLSSMVMPNIGAFIAWGFITALFIEKGWLPNAQLATLNGPMLNYLLPVLI AFQGGKLIGGDRGGVMGAIATIGVIVGAPDYPMLMGAMVMGPLAGFVIKKFDQAMDGRMP AGFEMLINNFSIGIFGMLLAIVGYYAIGPFMSAVLVVLKGGVQVLVDHSLLPLAAIFIEP AKVLFLNNAINHGIFTPIGIEQAAEAGKSIMYMLEANPGPGLGLLLAYWAFSKDKATKDS APGAIIIHFFGGIHEIYFPYILMNPLVIVGPIAGNICAILFYSITKAGLVGPSAPGSIIA FMSMSPKSSIVITALGVLIAATVSFLVSSPIVKMAGSKSLEDAQKKTSDMKAESKGQAAE VLSGADVKADDIKKIVFACDAGMGSSAMGATKFRNRIKPLNLGITVTNTSVDNVPADADV VVCQYILQDRAVKSAPQARLVAIGNFLQDPNLDTFYAELEGRANGTSASAPAVEPEVKEE KKAKKAVLKKEGIKTGLKSVDKETAIRAAGQLLCDLGFTNEDYIQAMVDRENMVSTYMGM GVAIPHGTSNAKETVKKSGIVVMQYPEGVDFGDEKAYLVIGIAGVGDEHLEILGNIVASL EDEELLETLKKNADVDTIMKTFG >gi|222441901|gb|ACEP01000041.1| GENE 3 4897 - 6039 1620 380 aa, chain + ## HITS:1 COG:mtlD KEGG:ns NR:ns ## COG: mtlD COG0246 # Protein_GI_number: 16131471 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 3 379 2 381 382 374 52.0 1e-103 MKKAVQFGAGNIGRGFIGALLSQSGYHVVFADVVDKIIDKINEDKTYTVHVMDVECEDQK IENISGVNSTKPEAVDEIASADLVTTAVGLVILPRIAPTIAAAIEKRMADGNKEPMNIIA CENAIRGTSQLKKAVYENLSEEGKAFADEYVGFPDCAVDRIVPPVKSENFIDVVVEEYYE WDVERASFKGEIPEIKGMTLVDDLMAYIERKLFTLNTGHCITSYLGKLRGFPTIDAAIAD EEIYDIVSKAMKESGDGLIKKHGFDPEKHAAYIKKIIGRFKNPYLQDDVSRVGREPLRKL SPTDRLISPLTTAAGYGLPVDHLLVGVGAALHYDNPEDKQSVELQAKIKEQGVREAAKEI TQLTDEKLLDAIVAEYEKLA Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:03:35 2011 Seq name: gi|222441900|gb|ACEP01000042.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont47.1, whole genome shotgun sequence Length of sequence - 39701 bp Number of predicted genes - 33, with homology - 33 Number of transcription units - 12, operones - 9 average op.length - 3.3 N Tu/Op Conserved S Start End Score 
pairs(N/Pv)
1 1 Op 1 36/0.000 - CDS 67 - 2091 1173 ## COG0577 ABC-type antimicrobial peptide transport system, permease component
2 1 Op 2 4/0.000 - CDS 2084 - 2851 685 ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component
- Prom 2909 - 2968 7.7
3 1 Op 3 40/0.000 - CDS 2975 - 4012 795 ## COG0642 Signal transduction histidine kinase
4 1 Op 4 1/0.000 - CDS 4028 - 4702 666 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
- Prom 4732 - 4791 2.8
5 1 Op 5 . - CDS 4793 - 5689 527 ## COG0583 Transcriptional regulator
- Prom 5850 - 5909 7.5
+ Prom 5803 - 5862 8.7
6 2 Tu 1 . + CDS 6048 - 7076 865 ## COG0709 Selenophosphate synthase
+ Prom 7192 - 7251 4.8
7 3 Op 1 . + CDS 7276 - 7908 752 ## EUBREC_3158 hypothetical protein
8 3 Op 2 . + CDS 7925 - 8164 301 ## Cphy_1490 hypothetical protein
+ Prom 8226 - 8285 3.8
9 4 Op 1 . + CDS 8348 - 9121 725 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain)
10 4 Op 2 . + CDS 9132 - 9410 201 ## gi|225026523|ref|ZP_03715715.1| hypothetical protein EUBHAL_00772
- Term 9426 - 9463 7.1
11 5 Op 1 . - CDS 9519 - 10844 548 ## gi|225026524|ref|ZP_03715716.1| hypothetical protein EUBHAL_00773
12 5 Op 2 . - CDS 10912 - 11766 586 ## COG1284 Uncharacterized conserved protein
- Prom 11811 - 11870 5.4
13 6 Tu 1 . - CDS 11956 - 12342 314 ## Clole_2877 hypothetical protein
- Prom 12388 - 12447 7.3
- Term 12583 - 12625 0.3
14 7 Op 1 . - CDS 12781 - 13860 592 ## gi|225026527|ref|ZP_03715719.1| hypothetical protein EUBHAL_00776
15 7 Op 2 . - CDS 13863 - 15035 679 ## VF_A0651 hypothetical protein
16 7 Op 3 . - CDS 15102 - 16487 943 ## COG0270 Site-specific DNA methylase
17 7 Op 4 . - CDS 16578 - 18086 854 ## SH0580 hypothetical protein
18 7 Op 5 . - CDS 18091 - 19521 960 ## CPF_1016 putative type II restriction endonuclease
19 7 Op 6 . - CDS 19506 - 19994 315 ## COG3727 DNA G:T-mismatch repair endonuclease
- Prom 20022 - 20081 5.7
- Term 20060 - 20106 7.2
20 8 Op 1 . - CDS 20128 - 20502 322 ## CDR20291_1765 hypothetical protein
21 8 Op 2 . - CDS 20596 - 21927 1003 ## EUBREC_0392 hypothetical protein
- Prom 21983 - 22042 6.0
- Term 22040 - 22104 17.1
22 9 Op 1 . - CDS 22213 - 22440 230 ## Rumal_3218 hypothetical protein
- Prom 22465 - 22524 5.3
23 9 Op 2 . - CDS 22627 - 22866 79 ## gi|225026536|ref|ZP_03715728.1| hypothetical protein EUBHAL_00785
24 9 Op 3 . - CDS 22934 - 23998 881 ## EUBREC_0390 hypothetical protein
- Prom 24055 - 24114 6.1
25 10 Op 1 . - CDS 24195 - 25259 724 ## COG0582 Integrase
26 10 Op 2 . - CDS 25274 - 25465 262 ## EUBREC_0388 hypothetical protein
- Prom 25664 - 25723 7.8
+ Prom 25507 - 25566 7.2
27 11 Tu 1 . + CDS 25690 - 26223 577 ## EUBELI_20054 hypothetical protein
- Term 26668 - 26705 7.1
28 12 Op 1 30/0.000 - CDS 26793 - 27986 1454 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19
- Prom 28110 - 28169 6.1
- Term 28067 - 28116 6.7
29 12 Op 2 51/0.000 - CDS 28179 - 30308 2458 ## COG0480 Translation elongation factors (GTPases)
- Term 30319 - 30371 5.1
30 12 Op 3 56/0.000 - CDS 30381 - 30851 695 ## PROTEIN SUPPORTED gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7
- Prom 30871 - 30930 5.0
31 12 Op 4 5/0.000 - CDS 31029 - 31442 657 ## PROTEIN SUPPORTED gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12
- Prom 31567 - 31626 8.7
32 12 Op 5 58/0.000 - CDS 31826 - 35494 4785 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit
33 12 Op 6 .
- CDS 35517 - 39353 812 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 39435 - 39494 6.6 Predicted protein(s) >gi|222441900|gb|ACEP01000042.1| GENE 1 67 - 2091 1173 674 aa, chain - ## HITS:1 COG:SP0913 KEGG:ns NR:ns ## COG: SP0913 COG0577 # Protein_GI_number: 15900794 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 4 669 3 660 662 343 33.0 5e-94 MNSKIYGKLAVTNLKNNRKNYLPYILASAFSVMMYFIMDSLYRNPSLVKTGATLSIMLSY ANVVLLIFSVIFLFYINSFLIKRRKKELGIYNILGMGKGHLAKMLFIESIITTTVSIIGG ILAGILFGKLVYLVLLKILHLKRDIVYMISPVSVGITAAIFGCIFFVIFLYNLVQMKLSN PIELLRGGNAGEREPKTKWLMTIIGSVCLAGGYYISLTTKEPLQAIGQFFVAVLLVAAGT YALFMAGSITLLKLLRKKKSYYYKTRHFTAISGMIYRMKQNAVGLANICILSTMVLVMVS ITISLYAGMNDVLVTRFPSEAQITNQGINQNEERQIGELVASITKENHTNPTSQIRFHEG NFTALYNDKTKNFNMTTAQSYGDTNIVEFVMIPLSDYNKTEGKAVKLADNEVLIYHRKNK NKSSIKNEETVHLNNDSYKVAANLDSMRIAKADATNSVDGWYVIVKDTNIIKKYLKAVYG KDDMEDSVEDYHEFMQYVYSFNLDGSRTNRERTEQILQEQLQAKFKSAFIEGRELSRENF YNFYGGFLFIGIFLGIIFLMATVLIIYYKQISEGYDDRERYQIMQKVGMSKKEVRQSIRS QVLLVFFLPLIMAVIHLAFAFKIITRLLSVLNLTNISLFFMYTVGTVAVFAVIYVIIYSI TAREYYKIIICRGE >gi|222441900|gb|ACEP01000042.1| GENE 2 2084 - 2851 685 255 aa, chain - ## HITS:1 COG:SP0912 KEGG:ns NR:ns ## COG: SP0912 COG1136 # Protein_GI_number: 15900793 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Streptococcus pneumoniae TIGR4 # 1 249 1 249 252 316 64.0 2e-86 MALLDVKNVRKVYTTRFGGNQVEALKDVNFSVEQGEYVAIMGESGSGKSTLLNILAALDK PTEGKVFLKEKDLSKVKEKEMAAFRRNNLGFVFQDFNLLDTFSLKDNIFLPLVLSGTSYR EMERRLLPLADKLGIKNLLEKYPYEVSGGQKQRAAVARAVITQPQLLLADEPTGALDSKA AKELLKLFTSLNQDGQTILMVTHSVKAASTAGRVLFIKDGRVFHQIYKGNLSETQMYQKI SDAMTAITAGGADDE >gi|222441900|gb|ACEP01000042.1| GENE 3 2975 - 4012 795 345 aa, chain - ## HITS:1 COG:SP1632 KEGG:ns NR:ns ## COG: SP1632 COG0642 # Protein_GI_number: 15901468 # Func_class: T Signal transduction 
mechanisms # Function: Signal transduction histidine kinase # Organism: Streptococcus pneumoniae TIGR4 # 101 340 98 312 324 187 44.0 2e-47 MIKESIFSFLKIRKMPIIIFTGIVVIFGILFYLYDIPFDAIIYGCELSFVWCAVCLFIDF YKYYKRHKLLHINREQFFDDAEQLPEHMDIIEYDYQELAKELYQAKQELISKNRIAKKEL LDYYGMWVHQIKTPIAALDILLQNTEQFLYEEDIVNSRINKRLEIIQESIPVSDMKMELF KIEQYVEMALNYLRVEDISSDLSFKQYAVDDMVCQVIRKYAKIFISKKIKMDFKPTKACI VTDEKWFIFVLEQIISNALKYIKKGQISIYMKEKSLVIEDTGIGIPAEDLPRIFEKGFTG YNGRENKKSTGIGLYLCKNIMDKLQWNITADSEVGRGTKIYLTKM >gi|222441900|gb|ACEP01000042.1| GENE 4 4028 - 4702 666 224 aa, chain - ## HITS:1 COG:SP1633 KEGG:ns NR:ns ## COG: SP1633 COG0745 # Protein_GI_number: 15901469 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 222 1 222 225 260 57.0 2e-69 MYRILIVEDDSTIASNVAAHLERWDYETKQIEDFKCVMEAFQQFDPQLVILDIGLPFYNG FYWCQEIRKISSVPILFLSSMNDNMNIVMAMNMGGDEFIEKPFDLNVLTAKVQALLRRAY SFQGNVNVLEHEGMLLNLNDASLSYKGEKISLTKNEFRILQMLMENAGKIVARDDIIARL WESDEFIDDNTLTVNVARLRKKLENAGMEGRIKTKKGIGYYLDK >gi|222441900|gb|ACEP01000042.1| GENE 5 4793 - 5689 527 298 aa, chain - ## HITS:1 COG:MJ0300 KEGG:ns NR:ns ## COG: MJ0300 COG0583 # Protein_GI_number: 15668475 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanococcus jannaschii # 2 292 9 295 296 112 28.0 1e-24 MQIFANVAECKGFLAASKKMHLSQSTVSTHMSALEHELGKKLIRRTARNFELTEDGVKLY KYAVDILLLHRKAIMEVGGNGSEFLRIGTSSVPAQWFVPTILSGFLKKNVNTQVEMTSAD SLDIIRRVKEGSLDIGFVGTKVEKQCKYIPLIKDHLVLAAPNTKEYQKIFAGESDSGESD LRQILEAPFLARRQYSGTMKEALGSLKYFGIEEEDLNIIATFDSAEMLRDCVEAGMGISI LSYQMIRELHESGKILIYRLPEKEFCRELYLIYKNESFLPEIVKRFIHYTEKYFKKEF >gi|222441900|gb|ACEP01000042.1| GENE 6 6048 - 7076 865 342 aa, chain + ## HITS:1 COG:SMa0028 KEGG:ns NR:ns ## COG: SMa0028 COG0709 # Protein_GI_number: 16262467 # Func_class: E Amino acid transport and metabolism # Function: Selenophosphate synthase # Organism: 
Sinorhizobium meliloti # 10 328 14 336 349 229 43.0 5e-60 MEKEKLFCKGGGCTAKLGAGILSRILSKLPQGPEDENLLVGYDSKDDAAVYKISDDIAFV QTLDFFPPMVDDPYTFGKIAATNALSDIYAMGGEVKTALNIVCFPEKMDLNILGEIMRGG AEKVIEAGGTLAGGHSIADSDVKYGLSVTGIVNPNKIYTNNGAKPGDALILTKRLGVGII CTANRVGEASVEAMKAVTDSMTTLNKYAAECCRNFEIHACTDVTGFSFLGHLHEMMDGQL SCHIHAKQVPVFEEALRHADEFLLTAAGQRNRNFTGPFVQFKDIPFAMEEILFDPQTSGG LLISLAKEDAPALLGKLKASGLPAEIVGEVLEKAEPEILVTN >gi|222441900|gb|ACEP01000042.1| GENE 7 7276 - 7908 752 210 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3158 NR:ns ## KEGG: EUBREC_3158 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 210 1 205 205 224 60.0 2e-57 MIKVNAIGDACPIPVVKTKNAIRELQGGGTVETLVDNEIAVQNLTKMANQKGYDVHSEKV ADNEFKVTMTVNAEAAQIAANNAVSTIEEESCPITPLNKKNTVVVIRSNQMGNGEEELGK ALLKGFIYTLSQQDTLPSTILFYNSGAYITCEDSASIEDLKSLEAQGVEILTCGTCLNFY GITDKLQVGEVTNMYVIAEKMTQADLIVQP >gi|222441900|gb|ACEP01000042.1| GENE 8 7925 - 8164 301 79 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1490 NR:ns ## KEGG: Cphy_1490 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 79 1 78 78 70 44.0 2e-11 MLRKKKPAFIIAFDATTQAMATDKLCAKNGLPGRLIPVPREITAGCGFAWKSAPEDKTIL IEALTKEGIEWASTHILEI >gi|222441900|gb|ACEP01000042.1| GENE 9 8348 - 9121 725 257 aa, chain + ## HITS:1 COG:AF1959 KEGG:ns NR:ns ## COG: AF1959 COG1924 # Protein_GI_number: 11499541 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Archaeoglobus fulgidus # 7 254 5 251 251 179 45.0 6e-45 MNKYFIGIDIGSTCAKTVVMNHEKEILHRFLQPTGWSSVDTAKQIQETLMEKGILPEESV VVATGYGRVSVPYAAKCVTEITCHAKGACYIHKNENMLLIDIGGQDTKIIRIENGMVTDF LMNDKCSAGTGRFLEVMANTLSLTPQDLCELAAKGSGTSISSMCTVFAESEVISLIGRGE SRENIAFAVIDSIVQKVVSQAARLTNGQEECICLTGGLCDYSYLKKSLGESLKTDVLTDT DGRYAGAIGAALSATTI >gi|222441900|gb|ACEP01000042.1| GENE 10 9132 - 9410 201 92 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026523|ref|ZP_03715715.1| ## NR: 
gi|225026523|ref|ZP_03715715.1| hypothetical protein EUBHAL_00772 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00772 [Eubacterium hallii DSM 3353] # 1 92 1 92 92 182 100.0 5e-45 MRMIKGGLKYYILFENYEQGLALHDVLDSDDIPNRIAPAPRAIQGKLSCGMSLLIEPAHI EQVRLSIEKHQAEYHTIVPLEGQVKPKRDKYC >gi|222441900|gb|ACEP01000042.1| GENE 11 9519 - 10844 548 441 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026524|ref|ZP_03715716.1| ## NR: gi|225026524|ref|ZP_03715716.1| hypothetical protein EUBHAL_00773 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00773 [Eubacterium hallii DSM 3353] # 1 441 1 441 441 883 100.0 0 MKLTWSEASVFAFVWACALGYIVASRILEWNFERFLPSNWYFVVKKRLAQPEAKSLRTEK VPKKISNVLEELLTLMQTGNLQKSTVRMDRTFQEKVMHEVKLLEGRGLSRQLHFTNVKLS EENAKKRDFRHWSDSGREWREVIVEAAVSDKYLHRDSGKVIHESIYPHGSILLRQSRHIR HNEVGEKDRQKEQRFYSKYSEIFCPSCGAQIQLKGDETNCPYCGGFIQSNFYDWQIEDFM IYQTPDPNMDNLKYTSVTILLSFLPTFPCVYLITNMYIAIGVAMTLTIILSLSLLWIRAR KIDELENLEKEIVRYDEGWLISNINEALYKTLLTPELLSYSVNNIKLKGVENTSDKTTIS VEAMLKQTFLQNNKHIVTKSEKHNWKLCRSRYPNRIKSKGQAVVMEKECPSCGANYIPDN NGCCSYCGYSLQVDNAKWKLL >gi|222441900|gb|ACEP01000042.1| GENE 12 10912 - 11766 586 284 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 18 275 12 269 283 155 33.0 6e-38 MKTKTDYKSILIEINILTIAIAIIAVAVYFFLVPSHTSISSMSGLGIVLSNFVPLPLSAI TMILNVVLLIIGFFTCGKEFGLKTVYTSVMLPVFLGIFENIFPNTGSITNSQELDVLCYI LVVSVGLSILFNRNASSGGLDIVAKIINKYFHMELGKAMSLSGMCVALSAALVYDKKTVV LSVLGTYFNGIVLDHFIFDHNIKRRVCIITKKEEELRQFIVRDLHSGATIYEAIGAYNME KRNEIITIVDKGEYQKLMKYINQEDPEAFITVYTVSDMRYLPKK >gi|222441900|gb|ACEP01000042.1| GENE 13 11956 - 12342 314 128 aa, chain - ## HITS:1 COG:no KEGG:Clole_2877 NR:ns ## KEGG: Clole_2877 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 127 1 122 124 122 51.0 6e-27 MLFYGFDMLFNIAFALIFCSVFIILAYGIITSITTWNKNNHSPRLNVSARVVAKRTEVSH 
HQHPNNGDMSGSHGYRTTSSTSYYVTFEVDSGDRIEFSMSGSEYGMLIEGDAGTLNFQGT RYLGFERK >gi|222441900|gb|ACEP01000042.1| GENE 14 12781 - 13860 592 359 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026527|ref|ZP_03715719.1| ## NR: gi|225026527|ref|ZP_03715719.1| hypothetical protein EUBHAL_00776 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00776 [Eubacterium hallii DSM 3353] # 1 359 1 359 359 659 100.0 0 MYFKFVDAEVECVSFQYNDTENIFEMKDKENKVICEYYPLLNLDAEYNPRDFEYYLLRND REDYSENSIYQVYASKERIGWIFPIQALMSRQHDYVKNKFFLRYAYVASCWLLENINCYN QKFSEIVVLSDFYDDTSTILVLDKENTKKINNFQLEDYTVSLYQKGYSYSEKGNLYSTIE KANKRINLQPISRELRNIKYIHTLFEKEVPKNQEAFAKFHTYYQIIEILISVVFEDKFKK FVAQLTSNKGSLFDKRDDLGNMVLEKQRVKWLFSNYVEISGEDVAILNDCCKKLLQLNGK KTCSTMAENLYSVRCLLVHNMYILNNESHELLKDLDNAFIDVLMDMLLTFNTKREGMEE >gi|222441900|gb|ACEP01000042.1| GENE 15 13863 - 15035 679 390 aa, chain - ## HITS:1 COG:no KEGG:VF_A0651 NR:ns ## KEGG: VF_A0651 # Name: not_defined # Def: hypothetical protein # Organism: V.fischeri # Pathway: not_defined # 3 387 5 386 389 287 43.0 5e-76 MSKVFDKRINSWNFYVESTFGEYLKFAKKIINNNELQRKRVKTSKTIYSLLKNDLQKGCI MPPLVLALVKTDIIDIENPDQEKMLQYINEHSENVLLLDGLQRTYTLIDADTDMKKKPDE EYQKFLKNKLRLEIYVEINKFGILYRMLTLNTGQTPMSARHQLEMLYSDMLNTEFEGVKL VTDKEGKADPDENEFIFKNAIEGFNSYMNRNELPIDRQDILENIKMLENMSEEDVSKDLF KEFLETYMKLFGMLRKITDNHIVDEEELIEYGISESPFGKKVSKIFSTSQVLTGFGAAVG KMKDLDIIQSLEDVSKLVDKLEEKNKGYVWMMELLNKLDRIKVSSKKIGNAQRMFFQYFF RELLNSESDSYLDLEAAVQNGYKKYYSQVI >gi|222441900|gb|ACEP01000042.1| GENE 16 15102 - 16487 943 461 aa, chain - ## HITS:1 COG:HP1121 KEGG:ns NR:ns ## COG: HP1121 COG0270 # Protein_GI_number: 15645735 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Helicobacter pylori 26695 # 69 198 40 164 312 85 31.0 2e-16 MKKTVCELFAGVGGFRCGLNNIKTAEDYGKEEKWDTVWFNQWEPSEKTTQYAHDCYVYRF GPRLDINGEDTTNYNIEDVDKAKIPDFNLLVGGFPCQDYSVASSLATSKGLEGKKGVLWW SIRETLEVKNPSFVLLENVDRLLKSPAKQRGKDFGVMLACFRDEGYTVEWRVINAAEYGY QQRRRRIFIFAYKNDTKYAERILKTIRYTDELEEDKKIECMENVVLKEGFFAETFAVNKA 
DKAKMKVKVLPTEIGEVSDTFQCAFENSGIMKDGIIYTIKTTPTYNGKQITLGDVMETGE VEEEYFIPEEKLYYTNPSVTHSDETEQRLPKEDRQTWQYLKGAKKLLRTSATGHEYIFSE GAISMIDQEDKPARTMLTSEGGFSRTTHIVKDKMTGRVRLLTATEAERIQGFPTNHTKYC LVKGKTVEMPLRKRRFMMGNALVVNLVADMEKNLEKIFNQE >gi|222441900|gb|ACEP01000042.1| GENE 17 16578 - 18086 854 502 aa, chain - ## HITS:1 COG:no KEGG:SH0580 NR:ns ## KEGG: SH0580 # Name: shlAIR # Def: hypothetical protein # Organism: S.haemolyticus # Pathway: not_defined # 5 499 3 490 490 228 34.0 4e-58 MDNKEYKNSEEVKKRAEEAIGRPFKEIFELAKKFQEKNGLKEKHGKGDIGQAYEEGWFNY ACNKDAEPDFKEADIELKVTPFLKNSRGYRAKERLSLGKINYKDENWNEYEESRFWIKNH HLLIMYYQYIKGIDRENFSVQKIDEILLNELPERDKSIIQHDWEKIARYVKEGRAHELTE RDFMYLSPATKGSGGNKKVKYDDRYPKAKPRAYSFKKSYMTKLFNERMLDEDEESYVLPD IVLSKEKEFDDILLETLQKYLGRTVKSLKKEFPDYRDGYSHKHGIIKQIYKSDCDLEKTD EFQKANYKLRTITVNKNGMPLEDMSFSTFDFEELLKEKTWKESIVYDEMIDSKFLLVIFS KNSEGKEILNNAMMWYIPKKDENKVKEVWEETRNVIKNGIRLQQKISHDRKGRAIFAYEN NFPKSNFNKVAHVRNKARESEYFCENSNSVKLKKPAKIIVLKEIPEELKDVPVPTGEYMT KQCFWFNKKYIKQQIKDFIKSY >gi|222441900|gb|ACEP01000042.1| GENE 18 18091 - 19521 960 476 aa, chain - ## HITS:1 COG:no KEGG:CPF_1016 NR:ns ## KEGG: CPF_1016 # Name: not_defined # Def: putative type II restriction endonuclease # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 6 474 4 494 495 506 58.0 1e-142 MGSYLKEYDETDPISIETYAKKLIGKTFADVCREDDITKSTVVREASNYEVKHANKKRKG GLGELIEERFFHYQTNNDARPDFDKAGVELKVTPYKQNTNGKFVAKERLILTMIDYFSVV NEKFEDSHMWQKSSLILLIYYLYQKEIEDRLDYRIGYVKLFTPTEQDIKIIKHDFESIVE KIKAGKAHELSEGDTLYLGAAPKAATSKDRRKQPFSDELAKPRAFAFKNSYMTYVLNNYI IPGKNTYEPIIKGDAEESFEDYVVRKIDAYYDWTVMDLCNEFHIEYQKKPKSLEAMLAYR MLGIKGNHAEEFEKANVVVKAIRIEKNNKIKENMSFPTFKFKELVEEEWNDSTFGNYLRE TRFLFVIYKYDQGRELRLKGCQFWNIPYDDLEGNVRLVWEKTQRVIREGLQIEKKNGKNY NNFPKLSENPVCHVRPHARNSKDTYELPDGRKYPKQCFWLNNSYILSQLDKKFIGD >gi|222441900|gb|ACEP01000042.1| GENE 19 19506 - 19994 315 162 aa, chain - ## HITS:1 COG:NMA0429 KEGG:ns NR:ns ## COG: NMA0429 COG3727 # Protein_GI_number: 15793434 # Func_class: L Replication, recombination and repair # Function: DNA 
G:T-mismatch repair endonuclease # Organism: Neisseria meningitidis Z2491 # 13 135 7 125 140 79 39.0 4e-15 MPAKEPRFHGEVTEKSHKNMSRIHGKDTRIEIVLRKALWSRGFRYRKNYKELPGRPDIVL TKYHIAIFCDSEFFHGKDWKVLKTKLEKGKNPDYWVRKIERNIQRDEEKDKQLSSMGWTV IHFWGKDILKNTEQCIKVIEETIFDQKLEEIDDEGEYKWEVT >gi|222441900|gb|ACEP01000042.1| GENE 20 20128 - 20502 322 124 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1765 NR:ns ## KEGG: CDR20291_1765 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 124 4 125 125 97 39.0 1e-19 MKKRIHDKTNGLNYTLQGDYYFPELIIKEEEATYGKYGILRKNYLKEHKSGYYQYLILIG KLTEYLKQIDKEAGEQVEILVKQMAERYGVAEEMREENWMEWVRRMNGIKASVEEIVLQK LIYL >gi|222441900|gb|ACEP01000042.1| GENE 21 20596 - 21927 1003 443 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0392 NR:ns ## KEGG: EUBREC_0392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 10 441 1 449 449 246 40.0 1e-63 MAIKRTISGMVGAGSLAHNRRDFIAENVDQDRVDLNICYCDENIKGVYRQLFDEAVERYN VGKRKDRQITNYYEKIRQGKQEKLFHEVIFQIGNCEDMAVGTPEGNLAVKVLDQYMQDFQ QRNSNLRVFSCYLHQDEATPHLHIDFVPYVTGWKGKGMDTRVSLKQALKSQGFQGGTKHA TELNQWINHEKEILAEIAKQHGIEWKQKGTQEEHLDVYNFKKKERKKEVQELEQEKEYLT AENEGLTLQIADARADIKNLREDKEQAIQEKELAEKYAEETQKELENLAEQRAQLQPIID KVSKELKECRKLIPELPEAGALERASTYRDKKITPLFNQMKGIIAGLAAQVQELNTEIEK WKEKYKREKQICEAVKKDLSKAQQRNQYLQEEKEELLDISEQYSREYGFLEIKLSRKPYR KTFREKWKSKKKTERSSHQGKVF >gi|222441900|gb|ACEP01000042.1| GENE 22 22213 - 22440 230 75 aa, chain - ## HITS:1 COG:no KEGG:Rumal_3218 NR:ns ## KEGG: Rumal_3218 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 74 1 77 77 69 53.0 4e-11 MKEGILAYNRVNDRYGLLSSDLWVHTGFHCGKRMEVFMDGKWILSRMEMNPKREWYLVET PYSGNLEYIRVRIEE >gi|222441900|gb|ACEP01000042.1| GENE 23 22627 - 22866 79 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026536|ref|ZP_03715728.1| ## NR: gi|225026536|ref|ZP_03715728.1| hypothetical protein EUBHAL_00785 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00785 
[Eubacterium hallii DSM 3353] # 1 79 35 113 113 108 100.0 1e-22 MKNKNVQIPYELFLLLLQYHLMEYRQNEEKIRQGLEKKMNAMAEREIYSRYKTAPTEEER EKYRQEYLDRRGIPEDFRW >gi|222441900|gb|ACEP01000042.1| GENE 24 22934 - 23998 881 354 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0390 NR:ns ## KEGG: EUBREC_0390 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 352 4 322 322 390 58.0 1e-107 MRLEEIVRQEPPLKLINMESVEVEEIKWLLYPFIPYGKVTIIQGDPGEGKTTMVLQIIAK LTRGEPILPASFEKRKEPERADVITDENKADMDVSKNKQCLQQPVNVIYQTAEDGLGDTI KPRLLAAGADCSKVLVIDDREQPLTMLDIRLEEAIIQTKARLVVLDPIQGFLGSDVDMHR ANEIRPVMKRVAVLAEKYQCAIILIGHMNKNSNGKSSYRGLGSIDFQAAARSVLIVGRIK EEPETRVVCHVKSSLAPEGKSIAFRLDQHNGFEWIGEYDISADELLCGDSRGQKSRQAKE FLKKILSDGGMAQKKIEEEAEKCGIKSKTLRNAKQELGIDAVKRGNQWFWILSE >gi|222441900|gb|ACEP01000042.1| GENE 25 24195 - 25259 724 354 aa, chain - ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 15 348 16 366 368 164 32.0 2e-40 MSVYKDKAHGTWYTSFRYIDWTGKKRQKMKRGFKTKREALNYENEFKNSVAVEPDMQMDK FVDIYFSDKKNELKANSIRNKQHMINKHIIPYFGNKKLNEITPADIIQWQKIIQENKFSK TYERMIQNQINALFNHAQRIYNLKENPCKKVKRMGKSDADKLDFWIKEEYDRFISGVDRK SEDYLMFEILFWTGIREGELLSLTIADFDMTNNLLHINKTYHRIDGKDVISTPKTDNSVR TIIIPNFLKEEVQEYISYHYGFPENERLFPIVARTLQKRMKRYEKKTGVKPIRVHDLRHS HVAYLINQGVEPLIIKERLGHKNIQITLNTYGHLYPSRQKEVAALLDEKNKGKD >gi|222441900|gb|ACEP01000042.1| GENE 26 25274 - 25465 262 63 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0388 NR:ns ## KEGG: EUBREC_0388 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 60 3 61 66 64 44.0 1e-09 MRTNYMMTVEDVMNELGVKRSKAYSILKQLNNELEKEGYVAVRGKIPRPYWETKFYGFSQ KSM >gi|222441900|gb|ACEP01000042.1| GENE 27 25690 - 26223 577 177 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20054 NR:ns ## KEGG: EUBELI_20054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 110 14 
121 154 72 38.0 9e-12 MVGKKIRAYREFRGFSQIQLAELSGINVGTIRKYELGIRNPKPEQLEKIAAALGLNVSIF LDFNIETVGDVLSLLFAIDDSVNLSLSETRNQKIILDFDNPMMQDFFKKWCQFKNVYEKE KAEILAIEDEDKRQEELYKLNATQEEWKLRAMGTTIGCHTVIKKGTEDNVIKTYDLT >gi|222441900|gb|ACEP01000042.1| GENE 28 26793 - 27986 1454 397 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 397 1 407 407 564 69 1e-160 MAKAKFERTKPHCNIGTIGHVDHGKTTLTAAITKVLAERVEGNEKVDFENIDKAPEERER GITISTAHVEYETEKRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGVMAQTKEH ILLSRQVGVPYIVVFMNKCDMVDDEELLELVDMEIRELLNEYDFPGDDTPIIQGSALMAL EDPNGPWGDKIMELMDAVDTWIPNPERDTDKPFLMPIEDIFSITGRGTVATGRVERGVLH VSDEVEIVGITDETRKVVVTGVEMFRKLLDEAQAGDNIGALLRGVQRDEIQRGQVLAQPG SVTCHHKFTAQVYVLTKDEGGRHTPFFNNYRPQFYFRTTDVTGVITLPEGTEMCMPGDNV EMTIDMIHPVAMEQGLTFAIREGGRTVGSGRVASILE >gi|222441900|gb|ACEP01000042.1| GENE 29 28179 - 30308 2458 709 aa, chain - ## HITS:1 COG:CAC3138 KEGG:ns NR:ns ## COG: CAC3138 COG0480 # Protein_GI_number: 15896387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 700 2 683 687 917 64.0 0 MAGREYPLERTRNIGIMAHIDAGKTTLTERILYYTGVNYKIGDTHEGTATMDWMEQEQER GITITSAATTCHWTLEEDCKPKKGALEHRINIIDTPGHVDFTVEVERSLRVLDSAVGVFC AKGGVEPQSENVWRQADTYNVPRMAFINKMDIMGADFYGAVDQIKTRLGKNAIPIQLPIG KEDDFKGIIDLMEMKAYIYNDEKGEDIDIIDIPEDMKDDAELYHTDMVEKICEADDDLMM AYLDGEEPSVEELKRVLREGTCNCTMVPVCCGTAYRNKGVQKLLDAIIEYLPAPTDVEAI KGQDLEGNEVEVPASDDAPFAALAFKIMTDPFVGKLAFFRVYAGTMNSGSYILNATKGKK ERVGRILQMHANKRQELDKVYSGDIAAAVGFKFTTTGDTICDEQHPVILESMEFPDPVIE LAIEPKTKAGQGKMAEALAKLAEEDPTFRAHTDQETGQTIIAGMGELHLEIIVDRLLREF KVEANVGAPQVAYKESFTKAVDIDSKYAKQSGGRGQYGHCKVRFEPMDVNGEETFKFDSE VVGGAIPKEYIPAVGAGIEEAAKAGILGGFPVVGVHATVYDGSYHEVDSSEMAFKVAGSL AFKDAMKKADPALLEPIMKVDVTMPEEYMGDVIGDINSRRGRIEGMEDVGGGRIVHGFVP LSEMFGYSTDLRSRTQGRGNYSMFFDHYEKVPKNVQEKILDAHLAAQNK >gi|222441900|gb|ACEP01000042.1| GENE 30 30381 - 30851 695 156 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 [Roseburia intestinalis L1-82] # 1 156 1 156 156 272 85 3e-72 MPRKGHIAKRDVLPDPIYNNKVVTKLINNIMLDGKKGVAQKIVYGAFDRIAEETGKDALE VFTEAMNNIMPALEVKARRIGGATYQVPIDVRPDRRQALALRWITLYSRKRSEKTMEERL AKELMDAANNTGASVKKKEDMHKMAEANKAFAHYRF >gi|222441900|gb|ACEP01000042.1| GENE 31 31029 - 31442 657 137 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146973|ref|ZP_04745574.1| 30S ribosomal protein S12 [Roseburia intestinalis L1-82] # 1 136 1 136 139 257 92 7e-68 MPTFNQLVRKGRQSVAKKSTAPALQKGFNSLKKRPTDASSPQKRGVCTAVKTATPKKPNS ALRKIARVRLSNGIEVTSYIPGEGHNLQEHSVVLIRGGRVKDLPGTRYHIVRGTLDTAGV ANRRQARSKYGAKKPKK >gi|222441900|gb|ACEP01000042.1| GENE 32 31826 - 35494 4785 1222 aa, chain - ## HITS:1 COG:CAC3142 KEGG:ns NR:ns ## COG: CAC3142 COG0086 # Protein_GI_number: 15896391 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Clostridium acetobutylicum # 15 1180 6 1161 1182 1615 68.0 0 MSDMNNQTVEQPVNFDAIKIGLASPELIRQWSHGEVKKPETINYRTLKPEKDGLFCERIF GPSKDWECHCGKYKKIRYKGVVCDKCGVEVTKASVRRERMGHIELAAPVSHIWYFKGIPS RMGLILDMSPKALEKVLYFASFVVLDPGTTNLSKKDVISDAEYRRVTEEYRNSDEGLGNF RVGMGAEAIKELLAEIDLEAEAEELKKSLKNASGQKKARNIKRLEVVEAFRLSGNRPEWM VLDAVPVIPPDIRPMVQLDGGRFATSDLNDLYRRIINRNNRLKRLLELGAPDIIIRNEKR MLQEAVDALIDNGRRGRPVTGPGNRALKSLSDMLKGKQGRFRQNLLGKRVDYSGRSVIVV GPELKIYQCGLPKEMAIELFKPFVMKELVANGLAHNIKSAKKMVEKLQPEVWDVLEDVIK EHPVMLNRAPTLHRLGIQAFEPTLVEGKAIKLHPLVCTAFNADFDGDQMAVHLPLSQEAQ AECRFLLLSPNNLLKPSDGAPVAVPSQDMVLGIYYLTMEKEGAKGEGKCFKSENEAYLAY ENKVITLHSRIKVKRRGMRPDGTMGSRIIDCTMGRLLFNEIILQDLGFVDRSDPDNFLKL EVDFQCGKKQLKQILDRCISVHGTTKTAEVLDNIKALGYKYSTIGALSVSISDMTVPKEK PEILAAAQKQVEYITKQYRRGFMTEEERYKAVVQTWFAADNELTDKLISGLDKYNNIFMM ADSGARGSNQQIKQLAGMRGLMADTSGRTIELPIKSNFREGLDVLEYFISAHGARKGLSD TALRTADSGYLTRRLVDVSQELIIREQDCCEGTNKIPSMYVEAIMDGKETIESLEDRISG RYAAEDYKDKEGNIIVEANCMITPKRAKAIVNAGYQQVKIRTMLTCKSHNGACSKCYGAN LATGQAVQVGEAVGIIAAQSIGEPGTQLTMRTFHAGGVAGDDITQGLPRVEELFEARKPK 
GLAIITEIAGRADVKETKEKREITVVNEDSGETRNYVIPYGSRIKVLDGTYLEAGDELTE GSVNPHDILRIKGAGAVQDYMMREVQRVYRLQGVDINDKHIEIIVRQMLKKVRIEEAGDS RYLPGALIDILELEDENERLLAEGKEPATAEQTLLGITKASLATSSFLSAASFQETTKVL TDAAIKGKIDPLIGLKENVLLGKLIPAGTGLKKYRNLHLDTGKVVNKIEEADEFDYDAAS MEEEIRDQKSEDAAMMMDSENL >gi|222441900|gb|ACEP01000042.1| GENE 33 35517 - 39353 812 1278 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 910 1209 1078 1391 1392 317 52 7e-86 MEKNRLRPTRIPGSKNIRMSYARQKDVLDMPNLIEVQKDSYKWFLTDGLKEVFEDISPIT DYSGQLSLEFVDFQLCREDRKYSIEECKERDATYAAPLKVKVRLHNKETDDFKQHDIFMG DLPLMTETGTFVINGAERVIVSQLVRSPGIYYAISHDKIGKKLYSCTVIPNRGAWLEYET DSNDVFYVRVDRTRKVPVTVLIRALGIGTNAEIKELFGEEPKILKTLEKDTATNYQEGLK KLYEKIRPGEPLSVDSAESLINSMFFDARRYDLAKVGRYKFNKKLMFRNRIAGHRLAQDV LDPSTGEILFEAGVRLTKEQADAIQNAAVPYVYVETEEKEVKVLSSMMVDITSFVDVDPE EVGVTELVYYPALEKILEEYDDIDEIKAQIRKNITELIPKHITREDILASINYNIHLEYG VGNDDDIDHLGNRRIRAVGELLQNQYRIGLSRLERVVRERMTTQDIESISPQTLINIKPV TAAVKEFFGSSQLSQFMDQHNPLSELTHKRRLSALGPGGLSRDRAGFEVRDVHYSHYGRM CPVETPEGPNIGLINSLATYARINEYGFVEAPYRKVDKSDPQNPVVTDDVVYLTADEEDN YTVAQANEPLDEEGHFIHTNVSGRYREETSEFRKSRIDLMDVSPKMVFSVATSMIPFLEN DDANRALMGSNMQRQAVPLLKTEVPVVGTGMEAKAARDSGVCIIAHHAGTVEYSTSKEII VKREDGIRDTYHVIKFSRSNQGNCMNQRPIVNKGDHVEAGDILADGASTCGGEMALGKNP LIGFMTWEGYNYEDAVLLSERLVQNDVYTSVHIEEYEAEARDTKLGQEEITRDLAGLSED VLKDLDENGIIRIGAEVHAGDILVGKVTPKGETELTAEERLLRAIFGEKAREVRDTSLRV PHGAYGVVMDTKVFTRENGDELPPTVNKSVRVYIAQKRKISVGDKMAGRHGNKGVVSRVL PVEDMPFLPNGRPLDIVLNPLGVPSRMNIGQVLELHLSLASKVLGFNVATPVFNGADEND IMDTLEMANDYANKTWEEFEERWGDKVNDDIMKYLWDNRDHREEWKGVPIDRTGKVQLRD GRTGQEFDSPVSIGFMHYLKLHHLVDDKIHARSTGPYSLVTQQPLGGKAQFGGQRFGEME VWALEAYGASYTLQEILTVKSDDVVGRVKTYEAIIKGENIPEPGIPESFKVLLKEFQSLA LDVRVLKEDMTEIELQEGSETINESLTSVIESSPDDNNYKDESNLGEYGYKESTLEEEAG QADAQPEFTEMPFGDSDL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:05:20 2011 Seq name: gi|222441899|gb|ACEP01000043.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont48.1, whole genome shotgun 
sequence Length of sequence - 30659 bp Number of predicted genes - 26, with homology - 26 Number of transcription units - 14, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 548 - 611 6.1 1 1 Tu 1 . - CDS 755 - 2230 1218 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 2356 - 2415 5.8 + Prom 2200 - 2259 7.1 2 2 Op 1 . + CDS 2496 - 3143 827 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 3 2 Op 2 . + CDS 3190 - 3318 61 ## gi|225026552|ref|ZP_03715744.1| hypothetical protein EUBHAL_00801 4 2 Op 3 . + CDS 3351 - 4172 923 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 4335 - 4381 5.3 + TRNA 4255 - 4327 82.3 # Lys CTT 0 0 + Prom 4255 - 4314 76.8 5 3 Op 1 . + CDS 4486 - 4644 82 ## gi|225026554|ref|ZP_03715746.1| hypothetical protein EUBHAL_00804 + Term 4729 - 4790 9.7 + Prom 4886 - 4945 8.2 6 3 Op 2 . + CDS 4993 - 6150 847 ## gi|225026555|ref|ZP_03715747.1| hypothetical protein EUBHAL_00805 7 3 Op 3 . + CDS 6216 - 6953 538 ## gi|225026556|ref|ZP_03715748.1| hypothetical protein EUBHAL_00806 + Term 7021 - 7062 9.0 + Prom 6956 - 7015 5.9 8 4 Op 1 . + CDS 7130 - 7828 727 ## COG1011 Predicted hydrolase (HAD superfamily) 9 4 Op 2 . + CDS 7893 - 10145 1468 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Prom 10607 - 10666 6.8 10 5 Op 1 34/0.000 + CDS 10747 - 11559 677 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters + Prom 11563 - 11622 3.0 11 5 Op 2 1/0.000 + CDS 11679 - 12515 353 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 12 5 Op 3 4/0.000 + CDS 12578 - 13312 909 ## COG0310 ABC-type Co2+ transport system, permease component 13 5 Op 4 . + CDS 13328 - 13663 513 ## COG1930 ABC-type cobalt transport system, periplasmic component 14 6 Tu 1 . 
- CDS 13852 - 14406 667 ## COG1611 Predicted Rossmann fold nucleotide-binding protein - Prom 14435 - 14494 8.2 + Prom 14701 - 14760 7.7 15 7 Tu 1 . + CDS 14824 - 15420 658 ## ELI_2728 hypothetical protein + Term 15454 - 15499 -0.5 + Prom 15438 - 15497 7.5 16 8 Tu 1 . + CDS 15650 - 16462 451 ## gi|225026566|ref|ZP_03715758.1| hypothetical protein EUBHAL_00816 + Prom 16560 - 16619 9.3 17 9 Op 1 18/0.000 + CDS 16689 - 17591 1065 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 18 9 Op 2 . + CDS 17593 - 19422 2143 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 19 9 Op 3 . + CDS 19450 - 20901 1401 ## Tery_4433 hypothetical protein + Term 20997 - 21032 -1.0 + Prom 21211 - 21270 6.6 20 10 Tu 1 . + CDS 21424 - 22944 1645 ## COG2059 Chromate transport protein ChrA + Term 23160 - 23215 2.1 21 11 Op 1 . - CDS 23198 - 24871 1284 ## COG0714 MoxR-like ATPases - Prom 24898 - 24957 4.4 22 11 Op 2 . - CDS 24964 - 26250 741 ## EUBREC_1241 hypothetical protein 23 11 Op 3 . - CDS 26262 - 27431 468 ## EUBREC_1237 hypothetical protein + Prom 27678 - 27737 6.9 24 12 Tu 1 . + CDS 27789 - 28571 1073 ## COG0345 Pyrroline-5-carboxylate reductase + Term 28590 - 28625 3.2 - Term 28570 - 28622 7.1 25 13 Tu 1 . - CDS 28671 - 29291 468 ## gi|225026577|ref|ZP_03715769.1| hypothetical protein EUBHAL_00827 - Prom 29336 - 29395 8.0 + Prom 29367 - 29426 7.8 26 14 Tu 1 . 
+ CDS 29491 - 30630 1154 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF Predicted protein(s) >gi|222441899|gb|ACEP01000043.1| GENE 1 755 - 2230 1218 491 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 76 423 30 377 425 196 35.0 1e-49 MKRILTFILTLTLALSFCVNSFAAATTATASETQTGTSEAQTDTSKTQAGTSEAQTDTSK VQAGASYGISNLSGAPDIVAESAVVMDANTGTILYGKEADTKRYPASITKVMTALLAVEN CKMSDVITYSNAAVNGIEAGSSTAGINVGAKLTVEDSLYALMLVSANEAAAAIAEHISGS TTEFAKLMTKRAKELGCTNTQFKNPHGLPNEEHYTTAHDMGLILKEAMKHEEFRKISGTI SYTLKKSDTLKDTLELWNHAKILRENSDYYYKYAEGAKTGFTQVALNTLVTYAKKDNVEL ICVILKDYGADKSYTDTANLFKWAFNQVKSVTPLTDFSLKTAMTANTSIDSSKLDQIQLL NSVYDKNFSVLVKKDFNESDLKTAFKLDEDKKTGRLGYIVISCDGKKLGQTEVTYDTTSK QGKAYQEGKTVDDNLKTAPVASKRKDAIHKGIEFSLRLLVAVILIILIMHLIHRHELEKR RKNRISRRKKK >gi|222441899|gb|ACEP01000043.1| GENE 2 2496 - 3143 827 215 aa, chain + ## HITS:1 COG:SPy1726 KEGG:ns NR:ns ## COG: SPy1726 COG0220 # Protein_GI_number: 15675576 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Streptococcus pyogenes M1 GAS # 1 204 1 204 211 201 50.0 9e-52 MRLRNVKGSREQIAANEYTIKDAEQIKGQWKDYFQDGKPIQLEIGMGKGKFLMQLAEKNP QIHYIGIEKYSSVLVRALEKMEENPLENLHFIRMDAENIVNVFEKGEVDQIYLNFSDPWP KDRHAKRRLTSRQFLERYHNILKEEGRVIFKTDNRPLFDFSLEEVKEAGWTLEKSTFDLH HSEYVEGNIMTEYEEKFVAKGKPICMLQIFLQEDK >gi|222441899|gb|ACEP01000043.1| GENE 3 3190 - 3318 61 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026552|ref|ZP_03715744.1| ## NR: gi|225026552|ref|ZP_03715744.1| hypothetical protein EUBHAL_00801 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00801 [Eubacterium hallii DSM 3353] # 1 42 1 42 42 62 100.0 9e-09 MNFKEKMCWNIQRKSLCNRGIWRCLCNFCQSISEVCRILNGK >gi|222441899|gb|ACEP01000043.1| GENE 4 3351 - 4172 923 273 aa, chain + ## HITS:1 
COG:lin0196 KEGG:ns NR:ns ## COG: lin0196 COG0561 # Protein_GI_number: 16799273 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 271 1 267 270 104 29.0 2e-22 MKKIVAIDLDGTLLHSDCTISSYSKEVIAEAVKNGIMVVPTSGRSFRSIKNQVQDIEGVK YCICANGTLVGNIQTEELLHNCKIPQHLVYDIYKEVKERHGFIELYCDNDAYVEKDTAPI IYDTKMGKDFCDSMLSTDVIGLSYDLLLRRGTMRINKIHVAFPDPQELREFAAKTAENKN LMVTFPSEYNMEIFPSGCNKDVGIRILAEKYNIAHEDTVAIGDSDNDVAMIEYAHIGIAV ANAMEVLKEKADYITKSNDEDGPAIVLKKILEK >gi|222441899|gb|ACEP01000043.1| GENE 5 4486 - 4644 82 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026554|ref|ZP_03715746.1| ## NR: gi|225026554|ref|ZP_03715746.1| hypothetical protein EUBHAL_00804 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00804 [Eubacterium hallii DSM 3353] # 1 52 13 64 64 99 100.0 7e-20 MPKQFNFWIVASGDAARKICNYLSKQIIYMRNNQYKKTFYTLQREMLLTGEV >gi|222441899|gb|ACEP01000043.1| GENE 6 4993 - 6150 847 385 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026555|ref|ZP_03715747.1| ## NR: gi|225026555|ref|ZP_03715747.1| hypothetical protein EUBHAL_00805 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00805 [Eubacterium hallii DSM 3353] # 1 385 4 388 388 678 100.0 0 MKQRTMKITKNIICMTSILVLFILIKPTKVYAFSMNEARNSPVLTSQYRTTRLRNEYDVK LFQVQMPKSGCFRITLRPNAVADENDIGYGWNLNIYRKDNLKEPVKQYWQIENKMVTEKL VLTSGTYYIEVKSYSEYGMSPIMVPFDIKADVVSENNWEQENNNTFKKANKISIGKKYQG TLFDDIDEDWFKVVAPNTGRITATLNCDPDTDVNDVGDGWDVSVYSASDINTEIAKEEWI ITKGSVRFNVVKGRTYYIYVHSYHYSFPEGYTYQLATSFKGPKATPAKKTTNTSKKPVKM VTGVSLKVKKKKINIRFKLVSNVSKYQIMYSTKKSFKNKKVVTVRSGNVTIKKLKRKKTY FIRVRAISKNGKAGAWSVVKKVRVK >gi|222441899|gb|ACEP01000043.1| GENE 7 6216 - 6953 538 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026556|ref|ZP_03715748.1| ## NR: gi|225026556|ref|ZP_03715748.1| hypothetical protein EUBHAL_00806 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00806 [Eubacterium hallii DSM 3353] # 1 245 1 245 245 458 100.0 1e-127 
MKKRTKFYALLFAMVLSLTQMIGAAVPVKAATKECSHGYTLVNGIGIRMHECGESVCITC YHGHSKRVVIPKTFKLNGKNLPPTQIDWADCGFSENGVKEIVIPDTLDYLGYWDEDEVNT SYKDITFFVHKGSKAAKAVQAEGYKYRYLNDNSNPQPVKGTSIKRTVRIQNLTKITWKKV KCDKYEIRYSQKKNMKNAKKINCGKNKTSLVIKKLKKNRKYYVQIRTKSGKKYSAWSKVK TISKR >gi|222441899|gb|ACEP01000043.1| GENE 8 7130 - 7828 727 232 aa, chain + ## HITS:1 COG:lin0639 KEGG:ns NR:ns ## COG: lin0639 COG1011 # Protein_GI_number: 16799714 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Listeria innocua # 1 228 2 231 234 161 34.0 1e-39 MNLIFDLDDTLYNLMGPFELVHKKLYADKTDADCTQLFMQSRVYSDEIMEAEKKGLIPHE DCFYERVKRTYHDVGIEMSREDADIFEQLYRSFQKKITLGNGVEGFLDYCKSNDIFIAIL TNGRPEPQYAKVVALGLHKWFDDEHIFISGGIGYQKPDPQAFKYVENAYALNPEETWYVG DTYEADVVGANTAGWHTIWFNHRNRECPEKKRADITVKSIEELKEVIRGQIG >gi|222441899|gb|ACEP01000043.1| GENE 9 7893 - 10145 1468 750 aa, chain + ## HITS:1 COG:mlr8088 KEGG:ns NR:ns ## COG: mlr8088 COG1595 # Protein_GI_number: 13476697 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 46 197 67 217 217 73 32.0 2e-12 MFCDICGKPLVKDLDIYGENFFNGDVQAFDYIYQNSYNWVAGEVRKMFLANSSEVEDCVQ EIYLLLYRKIKYYDPRQGSFSAWFNALVRNRIYDYGRKLAGNREILPETSQQYFKEITDR NVYINPEARIEYAEREQILDEILDDIGEGQRICIQMYYNDGLKIKEIAQILNISEGTVKS RIHNGLKKLNTKVDEMQKRGLYTFRMLPLGFLLWLLSRDDCSAKEDYRALYHARTLAGQN VLQNAPMAAGNKAAGIKMARRNQNVVKHASMQQAGQAGAKHAAINHAARTASRAGGKAIK AGMVKALAGGITAAVIGSGVFIGGKALKQHQGDKKVITSGQEYSTQEEKEGTPDSSVKEF VDDNQAFIKEILALEAEVPEGTEFQTDINENVRSVVYTSLGLLCDNKSFGEEATNIYDAV PEESIVDYNNNDDGEGTISVEESGFENMRQFLGCKETIGSLSATSSEEFLCQDGEVRAEG LSPSSDMQSSIKDIVYIPLSNGDERAIFTEQIIQDEIDGIGTAVITKADNDLGVQLKSYN IVADSDKIDQLLGSVAVLGKQLGSDGKLFSGNMTEMSDESKIHALIDSVRGYCYTEPNDE DIYANRAESLVYKPVFSEDDIKSNTEKSIIVKEDAFDDYLALVDYDGTIDDIVANKALQN KYNCSIKRNGTNIEFTFSGEFEMTYWWVETDMIDQINKDGFKILKNGKIEVDCLLYTSHA TEPDPGVEYKATLSLNEDGSGMKLENLERV >gi|222441899|gb|ACEP01000043.1| GENE 10 10747 - 11559 677 270 aa, chain + ## HITS:1 
COG:MA3552 KEGG:ns NR:ns ## COG: MA3552 COG0619 # Protein_GI_number: 20092359 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Methanosarcina acetivorans str.C2A # 18 270 16 266 271 124 32.0 3e-28 MKEKKKRFKHEHGSSFSVDYYAYASHMRSWNAGFKVLFSIISLLICIACNNLFVSLAVIF FMGFLTVAVGGLDLHHYITMLLVPIVFLIFGTIAIAIGFSMRPVGQYHVHFLNLFYIYWS KASLLKAGGLVFKALGAVSALYMMTLTTPLSEIIAVLRKAHIPKLIIELMNMIYRYIFIM IDTHSRMKNSAESRLGYCDFKTSCYSFGQVASNLLVVSLKRGNNYYNALEARCYDGDLRF LEEEKPLKAGQIVMAVIFIVVLLLIWFFTR >gi|222441899|gb|ACEP01000043.1| GENE 11 11679 - 12515 353 278 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 15 258 137 381 398 140 34 1e-32 MQEKEPILKVEDLHFTYNGEKTALNGVDLNIYEGEKIAVLGSNGAGKSTFFLNINGVLKS HHGDIYYRGEKITKKNLNDLRKNVGIVFQDADNQIIASTVMAEVSFGPMNLKLPKKEVIS RVDKALEYMNISEFKDRPPHYLSGGEKKRVSIADIIAMESEVIIFDEPTAALDPLNAAML EEVLQKMGDEGRTMLISTHDVDFAYRWAERVIVFCHGKIIADDTPLAVFKQEDILKQANL KHPMMFDVYDILKEKGIVPDGDVYPKNIAEFQAMVKGL >gi|222441899|gb|ACEP01000043.1| GENE 12 12578 - 13312 909 244 aa, chain + ## HITS:1 COG:lin1167 KEGG:ns NR:ns ## COG: lin1167 COG0310 # Protein_GI_number: 16800236 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Listeria innocua # 10 227 11 228 244 213 54.0 3e-55 MSKKQKNLAVLASVFLMLLTIAPVSNAMHIMEGFLPVKYCILWGVICVPFLVIGFMNIKK VLGNDRRTLTVLALTGAFVFVLSSLKIPSVTGSCSHMTGTGLGAILFGPCVTSILGIIVL IFQAILLAHGGLTTLGANCFSMAIAGPIISWILYKLLQKANVNKRVCIFVAAFLGDLFTY CVTSFQLALAYPSEVGGIGASIAKFLGIFATTQLPLAVIEGILTVVIVIALETYAKDELS SLEF >gi|222441899|gb|ACEP01000043.1| GENE 13 13328 - 13663 513 111 aa, chain + ## HITS:1 COG:STM2022 KEGG:ns NR:ns ## COG: STM2022 COG1930 # Protein_GI_number: 16765352 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, periplasmic component # 
Organism: Salmonella typhimurium LT2 # 6 102 3 92 93 66 37.0 1e-11 MTKNTKTVIILLVIAALIAIVPLFALKGAEFGGSDDAGSEMISEIQGKEYEPWFTPVLET ALGGELPGEIESLIFCLQTGIGVGIVAFVLGRFVERKKWQDKENGDSTSSK >gi|222441899|gb|ACEP01000043.1| GENE 14 13852 - 14406 667 184 aa, chain - ## HITS:1 COG:AGc1969 KEGG:ns NR:ns ## COG: AGc1969 COG1611 # Protein_GI_number: 15888409 # Func_class: R General function prediction only # Function: Predicted Rossmann fold nucleotide-binding protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 5 160 20 178 216 139 41.0 2e-33 MTKLTICMYGAASDRIDSIYINAAEALGKEIAKRHHKLIYGGGASGLMGACANGVLENGG EVTGVVPTFMNKFEPIKSECTEIVRTETMSLRKEVMEENADAFVIAPGGIGTFDEFFQIL TLTELGRSAKPIIVYNVSGFYDDTIAIIDKLIGMGFIRESVKSLFSVCETPEEVLNAIEW LTSQ >gi|222441899|gb|ACEP01000043.1| GENE 15 14824 - 15420 658 198 aa, chain + ## HITS:1 COG:no KEGG:ELI_2728 NR:ns ## KEGG: ELI_2728 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 4 198 6 203 213 151 43.0 1e-35 MKKIITISREFGSGGHDIGMEVAKRLDIPFYDHEIVDKAVAESGFTKEFVEEHGEYSSLS SVLLGTSAPFDSYYYAEDPQDKIFKIQRRIITKFAQEGPCVIVGRCANYILEQADIDSFD VFIRANIEKRMANVSDRVDLKGDELVKFLQKRDHKRRVYYRFYTKRKWGDYHDYNMMLDS GAIGKENCINLIIKAIGE >gi|222441899|gb|ACEP01000043.1| GENE 16 15650 - 16462 451 270 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026566|ref|ZP_03715758.1| ## NR: gi|225026566|ref|ZP_03715758.1| hypothetical protein EUBHAL_00816 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00816 [Eubacterium hallii DSM 3353] # 1 270 1 270 270 512 100.0 1e-143 MKITRCKYGHYYDKAYFRECPHCCRARGGRDEDIIIGKNELSTDLSGKNSLPGRNVRINQ LQDDSNYGQGSNRDISQKEQDRKKAERWIGEAWDELEKVAKESPKTKVTEKAAQREKQED RQEKITEEKNTVSVKKKHRNSEQNLGNYMLFAAMPLKVIKNQEFCMDFCIFPEKDWETGA KMLSFKSQESIRDLKPVQMALPTKAVLHILYKEEEIFHKDIELEAERENVFCTFPVTVTD TRDRRFHMIKAVFEIEEQNIEIPLYLMPNP >gi|222441899|gb|ACEP01000043.1| GENE 17 16689 - 17591 1065 300 aa, chain + ## HITS:1 COG:mlr7575 KEGG:ns NR:ns ## COG: mlr7575 COG0175 # Protein_GI_number: 13476292 # Func_class: E Amino acid 
transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Mesorhizobium loti # 4 300 5 301 301 447 70.0 1e-126 MSDMTHLDELEAEAIYIMREVVAECEKPVMLYSIGKDSSVMLHLALKAFYPEKPPFPFLH VNTTWKFKEMIEFRDKIAEKYGLDMIEYINEDGVKKGINPFDHGSSYTDIMKTQALKQAL DKYGFTAAFGGGRRDEEKSRAKERIFSFRNSSHAWDPKNQRPEMWKLYNTNIHQGEEMRV FPISNWTEKDIWQYIQRENIDIVPLYFAAERPFVRRNGNIIMVDDDRMRLEPGEKIEHGK IRFRTLGCYPLTGGIESDADTLDAIIDETLSAVSSERTSRVIDSDGGAASMEKRKREGYF >gi|222441899|gb|ACEP01000043.1| GENE 18 17593 - 19422 2143 609 aa, chain + ## HITS:1 COG:XF1501_1 KEGG:ns NR:ns ## COG: XF1501_1 COG2895 # Protein_GI_number: 15838102 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Xylella fastidiosa 9a5c # 1 406 50 457 457 406 49.0 1e-113 MNSLLKFITCGSVDDGKSTLIGHMLYDAKLLFADQKRALELDSKVGSRGGAIDYSLLLDG LMAEREQGITIDVAYRYFTTENRSFIVADTPGHEEYTRNMAVGASFAELAVILVDATQGV LVQTRRHTRICSLMGIEHFVFAINKMDLINYDANRFEKIEREIKMMAAEYDYKSIKILPV SATEGDNVTVKSTKMPWYKGESLLTYLENIQIHDDSIESGFTMPVQRVCRPDRTFRGFQG QIESGKIKVGDEITALPSNEKAEVKGIIDAGKEVTESSRGHAVTISLDREVDVSRGCVLV NNAKPKTGNLFTAKLLWMDDTHLVAGKNYLLKLGTKLIPAVVMNIKYKIDVNTGNEVHAD AIYKNEIAACDIAVSDKIVFEKFKDNHALGSMILIDRITNMTSACGVIMHALRRTDNLTW HEMDITRDFRAQQKGQTPKTIWLTGLSGSGKSTLANELEKHLAALGKHTMLLDGDNVRMG LNKNLGFKEADRIENIRRIAEVAKLMNDAGLIVITSFISPYVRDRRNAREIIGEDSFIEV YVSTPVEECEKRDVKGLYKKARAGEIPNFTGISSPYETPEHPEVTIDTTGKSLADSVDYV MAQLEDLLN >gi|222441899|gb|ACEP01000043.1| GENE 19 19450 - 20901 1401 483 aa, chain + ## HITS:1 COG:no KEGG:Tery_4433 NR:ns ## KEGG: Tery_4433 # Name: not_defined # Def: hypothetical protein # Organism: T.erythraeum # Pathway: not_defined # 2 355 3 341 1088 92 28.0 3e-17 MRTLYVHIGTPKTATTSIQMFCVENQKVLNKQSYSYPLLDFVYPHVAHRRNGHFLVGWVY KPGGQEDVEKEQELWEKGFAMIHREFEKYDNVILSDENIWHSSNGRKFPFWAKLMQDAKE HDYQVKVIVYIRRQDGLANSWLSQQVKEGWNTNATIKWDSFQRKTRKVVFNYYLLLEKIA 
EVTGRENIIVRIFDRKKFKGKDHTIFSDFLEAIGVDYTDDFKITEEEANRSLTGNSQEIL RIVNTVLPDDDKVRTLVRQAAQDCENYKDPQNNFVMFSEEEFNKFMGRYEKWNEAIAKDY LHQEEPLFDMKRKEGERWTPENRFMYEDIVRFFGGVVIRQQRYIEALQKDINTLKESRLE AAKGLEDANVDEETKKILQSLSEQIINQQKDIQRALENSEKIDAVRDKNQEVKVRSLTVG NELIDLRQDYDALQKETKKDSKAIRQESKERDKELKAMIRELEQTSLWFRLRRKWRHITG KDK >gi|222441899|gb|ACEP01000043.1| GENE 20 21424 - 22944 1645 506 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 28 213 5 177 186 64 27.0 4e-10 MSVTAVQISERRQKVKTTVSDESIKMRKFISMTMTMLKIGFIGFGGGSALIPVIEDEVVE KDKIVSEEEFNDDVMIASITPGALPVEVATGIGRQASGLKGMAAAATAMALPGALLTVLL QAVISSAGSVVKSQINYLSIGISAFIILTLLAYSLGTVCQAVNKREGQLYGLIILGVFLL SGEKSIYQLFGLDIKPIFALSTIQVLGAAFFVILFTKGHVRCLRRSIPAAIITVCYLLCA GNAHIIPVSAKPFILAVMAVLSILGLVQSIVESPHKEAFPAKRLATSAGFWLLFVVILSI PAMFLTAKALKFIGMGCLSSIMSFGGGDAYLSVAQGLFVEGGVINNADFYGNVVAVANAL PGSILCKILTGIAYDVGYNLNGSVIEGFLVALSGFACSVAASGAIFELVFCVYEKYESLQ IFSVVKHFIRPIISGLLLTVAVSLYTSGIRGQVQTGSGHPALVITLIVIAVNLVLMWLQR RGKNIHLIWKIVISAGISFAGCNLFL >gi|222441899|gb|ACEP01000043.1| GENE 21 23198 - 24871 1284 557 aa, chain - ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 33 238 21 212 340 91 26.0 4e-18 MNIKKAKEDIKNTVRAYLKKNDYGQYSIPSIRQRPMLLMGPPGIGKTQIMQQIAQECNIG LVAYTITHHTRQSAVGLPFIKEKTFDGKSYSVTEYTMSEIIYSIYQYMEESGKKEGILFI DEINCVSETLAPTMLQFLQCKTFGNQAVPEGWIIVAAGNPPEYNRSVHDFDFVTLDRVRK IDIEADLDVFKGYARARHLHPFILSYLELRPQNFYRTENDVDGIQFVTARGWEDLSSLIY TYEELSIKVDEDIIHQFIQHADIAKDVAAYYDLYQKYRDDYDISRILTGQAGADIYAKLF RASFDEHLSCVELLISGLHNYISDALTADNLTTEAYAFLKTYQIELKKQIQNSSDSDSAA PGSDISDTVSKVNHSNGNIAATSNTAHLDCGTPDPKSSIPTENGLYRVLLQKKAAEYEQE NKAGLTSPEQKNFQLKLLELLTAWTPDSSLPPKEAFLAAKSGFDKQCQTRIDTIKKASSA 
LENAFTFMEEAFDEGQEMVVFVTELTMDPEASQFITENGCERYFKYNKTLLVGNRRAAIL KELDRDAIYSNPNEYEF >gi|222441899|gb|ACEP01000043.1| GENE 22 24964 - 26250 741 428 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1241 NR:ns ## KEGG: EUBREC_1241 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 425 7 449 451 433 48.0 1e-120 MTQTEWENEMCIKILELIQNELYVDFRYLDVAVSALTLTPNDSLRSTATDGISFFFPPEQ ILRVFRSNPLFLNRAVLHSVFHCIFRHLWIRGSRDPDLWNLSCDIAVEWIIDSFEKKSVK RALSGIRTNIYNDFRQYKIPITAANIYRYLLPDIADNPDRLNQLMMEFFTDDHRFWPKKP STSPSAAKAGQSWDKISRRMEQELNLRGDDSASGIDAMKTQIKEGKSRRSYKDFLRKFTV LKEELHCDYDEFDLNYYSYGLRLYKNMPLIEPLESREVTKISEFVIVIDTSYSTNGPLVQ KFLEETFQIIQERDSFFHKSQIRIIQCDNQVHSDTIIKEQRDIPKLLHNFELIGGGGTDF RPAFSYINKLLEDGEFQNLKGVLYFTDGKGIYPAKRPPYETAFLFLGEEEHPDVPAWAMK LILHEEDL >gi|222441899|gb|ACEP01000043.1| GENE 23 26262 - 27431 468 389 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1237 NR:ns ## KEGG: EUBREC_1237 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 47 385 8 335 335 151 29.0 6e-35 MDFTKNTNLSIEEISENKTSGNELSVTKISDSTAEKTFTNNAFSHFVTDTMTILWEPADS GCRLLRCFGNSPILNVPDMIDGRTVSEMGAYCFSRSRPRFPEKIYKTIFIDIENQETTLE SGQAFNQKDFDFSAFGTELDGTFLEEITLPDCATTLHNAAFYNCRKLKKLSVGTAISGIG SDEFMNDSQLEHLIIRGKDSEATGLPLILERIAENITVSFCPNSSSSPESIVFFPEYYEW LDEISPAHIFSRSIHGEGFRMRKSFENGILNYRKYDSCLENALTVESPESLCKIALNRLR WPSRLEDIFREKYENVIKKYMGTAFTLAVKDQDFPMLQFICEHFSPDASALASAIDQCIE KEWGEGNAFLIEAKHKSFSKKTFDFDLDF >gi|222441899|gb|ACEP01000043.1| GENE 24 27789 - 28571 1073 260 aa, chain + ## HITS:1 COG:CAC3252 KEGG:ns NR:ns ## COG: CAC3252 COG0345 # Protein_GI_number: 15896497 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Clostridium acetobutylicum # 4 260 5 263 270 176 39.0 4e-44 MKTLGFIGMGNMAGAIAAGILQKGLMKKEEVFAYAPHYEKLEANAEKIGFVPCKDLKGLT AKADTLVMACKPYQIEGVLSEIKEELKGKALISIAAGWNYDKYEKYLDKSTRLQFIMPNT PAMVGEGVLLFEEKNSLLPEEREEIKKLFEALGIVIELPVHLMGIGGALTGCGPAFVDLI 
IEALGDAGVKYGVPRKQAYEMVSQMILGSAKLQLETGEHPGVLKDNVCSPAGTTICGVDA LEHAGIRAGFIDAIDAIMNK >gi|222441899|gb|ACEP01000043.1| GENE 25 28671 - 29291 468 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026577|ref|ZP_03715769.1| ## NR: gi|225026577|ref|ZP_03715769.1| hypothetical protein EUBHAL_00827 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00827 [Eubacterium hallii DSM 3353] # 23 206 1 184 184 328 100.0 1e-88 MKYLKSKLFTSLFAILLILSFCMAKPAQVHASKAESAKMARWMSICSSMADNIEKKHFVY SNGGTARTYNSAVKRSRRSNCALYVSWCLQKYGALGSGQTFYIRRGSSSIRKNFGHWKKK KVQVIRVNKRASRVNLKKGDVVLWSGLGHTNIYAGKNSSGERLWFDAGKAATYGHHSGSR FNNIGKKTQGYLNSKTVSYIIRIKGL >gi|222441899|gb|ACEP01000043.1| GENE 26 29491 - 30630 1154 379 aa, chain + ## HITS:1 COG:BH0726 KEGG:ns NR:ns ## COG: BH0726 COG1619 # Protein_GI_number: 15613289 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Bacillus halodurans # 7 379 7 337 338 128 28.0 2e-29 MKYGKFLPENGTIGFVAPSFGCNIEPYKSAFENAQKVWTKEGYSLSLGPNVYEGCGIGIS NTPEKCAQEFNEWYEKEDVDVLISCGGGELMCEILPYVDFEKIRQEEPKWFMGYSDNTNL LFLLATLCDTAGIYGPCAAAFGMEPWHESLKNSMEILTGMCKKVHNYDKWEREGLKTEEN PLVPYNVTEKFELRTWTAEDGLSTGKEPEEIQSVSDCNKISCDKVSAGEPHKIINEREEI AVKGRLLGGCLDCLATLCGTKFDKVEEFNEKYKEDGVLWFLEACDLNVMGIRRALWQLDE SGWFKNTKGFLIGRPYCYGEEFFGLDQYKAVIDILSKYNVPIIMDLDIGHLPPMMPLVCG SVAEAKVKDNSITIDMEYK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:06:51 2011 Seq name: gi|222441898|gb|ACEP01000044.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont49.1, whole genome shotgun sequence Length of sequence - 28060 bp Number of predicted genes - 23, with homology - 22 Number of transcription units - 14, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 100 - 159 9.8 1 1 Op 1 23/0.000 + CDS 321 - 797 452 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 2 1 Op 2 1/0.400 + CDS 801 - 2771 2258 ## COG1894 NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit 3 
1 Op 3 . + CDS 2784 - 4484 1395 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Prom 5174 - 5233 10.1 4 2 Tu 1 . + CDS 5267 - 7393 2344 ## COG0297 Glycogen synthase + Prom 7649 - 7708 10.6 5 3 Op 1 31/0.000 + CDS 7766 - 8626 1240 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 8686 - 8749 1.2 + Prom 8719 - 8778 2.8 6 3 Op 2 34/0.000 + CDS 8800 - 9444 639 ## COG0765 ABC-type amino acid transport system, permease component 7 3 Op 3 . + CDS 9454 - 10242 217 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 + Term 10385 - 10426 1.1 8 4 Tu 1 . - CDS 10283 - 10444 67 ## - Prom 10508 - 10567 4.8 + Prom 10406 - 10465 8.4 9 5 Op 1 . + CDS 10515 - 10676 239 ## gi|225026588|ref|ZP_03715780.1| hypothetical protein EUBHAL_00838 10 5 Op 2 . + CDS 10669 - 13422 2008 ## COG2200 FOG: EAL domain 11 6 Tu 1 . - CDS 13673 - 14857 1152 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 14995 - 15054 5.2 + Prom 14845 - 14904 8.1 12 7 Tu 1 . + CDS 15095 - 16171 1311 ## COG0371 Glycerol dehydrogenase and related enzymes + Prom 16205 - 16264 7.0 13 8 Op 1 . + CDS 16285 - 16797 735 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 14 8 Op 2 . + CDS 16818 - 17528 943 ## COG2738 Predicted Zn-dependent protease 15 8 Op 3 . + CDS 17534 - 17986 462 ## COG1576 Uncharacterized conserved protein + Term 18128 - 18186 3.8 - Term 17978 - 18024 0.8 16 9 Tu 1 . - CDS 18032 - 19687 596 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Prom 19705 - 19764 3.1 17 10 Tu 1 . + CDS 19814 - 21004 678 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes + Term 21068 - 21134 30.0 + TRNA 21050 - 21123 76.7 # Arg CCG 0 0 + Prom 21054 - 21113 80.4 18 11 Tu 1 . + CDS 21350 - 21793 290 ## Thit_2303 protein of unknown function DUF163 19 12 Tu 1 . 
- CDS 21961 - 22434 450 ## COG1522 Transcriptional regulators - Term 22671 - 22710 7.1 20 13 Tu 1 . - CDS 22792 - 24126 1795 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase - Prom 24260 - 24319 8.6 + Prom 24456 - 24515 6.8 21 14 Op 1 24/0.000 + CDS 24683 - 25756 1254 ## COG1131 ABC-type multidrug transport system, ATPase component 22 14 Op 2 4/0.200 + CDS 25758 - 26624 980 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 23 14 Op 3 . + CDS 26636 - 28057 1326 ## COG3225 ABC-type uncharacterized transport system involved in gliding motility, auxiliary component Predicted protein(s) >gi|222441898|gb|ACEP01000044.1| GENE 1 321 - 797 452 158 aa, chain + ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 6 158 7 158 164 149 46.0 2e-36 MLEKGYYDKADTIIKSHGAMEASLIPIIQDIQSEYRYLPPELLSYVADQIGITEAKAFSV ATFYENFSFEPKGKYVIKVCDGTACHVRKSTTILERIYSELGLSKDKVTTEDMLFTVETV SCLGACGLAPVLTVNDKVYPSMTPDDAAQLLQKLREEE >gi|222441898|gb|ACEP01000044.1| GENE 2 801 - 2771 2258 656 aa, chain + ## HITS:1 COG:TM0010_1 KEGG:ns NR:ns ## COG: TM0010_1 COG1894 # Protein_GI_number: 15642785 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase, NADH-binding (51 kD) subunit # Organism: Thermotoga maritima # 62 584 14 525 527 601 56.0 1e-171 MRIQNREELDQIRDMAAAKVSSTKCRILICAGTGCLAGGSGEIYEKMCELTKDCKDVDVE FGAEIAHSEKDTKENSNVDINKDIDADVDKGESVSGIEIPVAVKRSGCHGFCEMGPLMRI EPLGILYTKVKPEDCEEIFERTIKHGNIIHHLLYKQDGIEYPKQEEIPFYEKQTRLVLKN CGHIDAEHIEEYIAVGGYQSFAKVLFEQTPNDVIETILESNLRGRGGGGFQTGYKWKQVA RQQEKIRYIVCNGDEGDPGAFMDRSIMEGDPHKMIEGMLIAAYAVGAQEGYIYVRAEYPL AISRLRLAISQAEDRGLLGEHILGTDFSFHLHINRGAGAFVCGEGSALTASIEGKRGMPR VKPPRTVEQGLFAKPTVLNNVETFANVPMIIEKGAKWYRSIGPENSPGTKAFALTGSVKN TGLIEVPMGTSLREVIYDIGGGIKGDAKFKAVQIGGPSGGCLITPHLDVSLDFDSLKKMG 
AMIGSGGLVVMDDKTCMVEVARFFMNFTQNESCGKCVPCREGTKRMLEILERIVAGKGTR EDLDLLDELASTITDTALCGLGKSAALPVMSTLRLFRKEYEEHVVDKKCAAKNCTALRRF VISPERCKGCSKCARNCPVGAISGQIKKPYVIDDSICIKCGACESACAFHAIHIEA >gi|222441898|gb|ACEP01000044.1| GENE 3 2784 - 4484 1395 566 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 212 557 5 357 372 296 47.0 5e-80 MKYMTINNRKVAFDDEKNVLSVIRKAGINLPTFCYHSELSTYGACRMCVVEDDRGKIFAS CSEVPRDGMVIYTNTPKLQHHRKMILELMLSAHCRECTTCKKNGDCTLQNLAKQLGVSYI RFENNKPQIPIDESSDCIVIDKNKCILCGDCVRTCEEIQGLGILDFAYRGSKMQVMPAFD RKMAETACVGCGQCRVVCPTGAITIKHNIHPVWEALADKNTRVIVQIAPAVRVAVGDKFG IPKGENSLGKLVAALRRMGFDEIYDTDFGADLTIMEESKEFLERVESGEKLPLFTSCCPA WVKFCEQKYPELRANISTCRSPQQMFGAVLKEEARNNKKDTRKTVVVSIMPCTAKKAEIL RPEHKTEGEQDVDFVLTTTEVIRMIKEAGIDLAQMPWEAMDMPFGLSSGAGVIFGVSGGV TEAVLRRLVEDSGMEALNEIKFTGIRGTDGIKEAVIPYGDRQIKIAVVSGLKNADELIQK IKSGEVQYDFVEVMACRRGCMAGGGQPVPIGARTKAARYEGLYRIDDASQIKRSNDNPIV GELYEGLLKGKEHKLLHNNSDLPQAK >gi|222441898|gb|ACEP01000044.1| GENE 4 5267 - 7393 2344 708 aa, chain + ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 228 704 3 476 477 399 42.0 1e-111 MAKKISTRKTTAKSSSTAKKTSVKNAEAEVKASVKEATVEPKTVAKEEAVKPKAAVKKTA KAKTPEKKETVEAKPEAKKEEAVKPKAAVKKTAKAKTPEKKETVEAKPEAKKEEAVKPKP AAKKKTAKAKTSEKKAAVKAKPVVKKEEAAEPKTEVKEKTVKAKPAAKKAEVEVKEPVKK VEIETKVPAKKVAVKAEAPSKKEVAEPQTTVKQDIPMEQPDLGPRRSVAFIGSECYPFVK TGGLGDVMSALPKALAKLNLDVKVILPRYKCIPQKYQEKMEYRGSFYMDLCADGKQYYVG IMEYQEDGVVYDFIDNDEFFSWGDPYTNLIDDIPKFCYFGKAALAALNYLDWTPDIVHCH DWQAALVPLYLRTCFSDTNVGRAIAVLTIHNLRFQGVYDRKTIQYWSGLPDYVFNKDCMI QNWLDANMLKGGITYSNKVTTVSNTYAWEIQTEEYGEGLEEHLRYHNNKVLGIVNGIDTD IWNPATDKLLASKYDAESAIKNKKANKKALQESLGLDVDDNKMVIGLISRLTNQKGLDLV 
NDVIPGIMDGNTQVVVLGTGDAQYEDTFRYYEDKYKGSFCAYIAYNENVAHNIYAGCDAL LVPSRFEPCGLTQLISMRYGAVPIVRETGGLKDTVQPYNAFENTGNGFTFDRYESGLLYD AINRAKTLYFENRVYWDDMVVRDMNKDVSWEQSAKQYKDMYVELTPRY >gi|222441898|gb|ACEP01000044.1| GENE 5 7766 - 8626 1240 286 aa, chain + ## HITS:1 COG:SP1500 KEGG:ns NR:ns ## COG: SP1500 COG0834 # Protein_GI_number: 15901347 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pneumoniae TIGR4 # 1 278 1 264 278 117 31.0 2e-26 MKKIIAMLLVLVMSVTGLVGCGSSKGNEAGSTSAGSDSDWAYIQDKGKLTVGITLFAPMN YYNEKNKLVGFDTEMAEAVTKKLGIDVEFTEINWDSKEVELSSKNIDCIWNGMCITEERK QNMSISDPYLYNTQAMVMKKSREKEIMKSVKGLTVTAEQGSTGEGKIDGSIADDDTVKVS AQDYFKDANYVASDSMAKALMEVKSGTADVALVDSVCALGMVGEGTDYSDLVINMDNNFG QQEYGIAFRKGSDVTEKVNEAIKELYEDGTVDTIAKKYGLQEMLIK >gi|222441898|gb|ACEP01000044.1| GENE 6 8800 - 9444 639 214 aa, chain + ## HITS:1 COG:SP1502 KEGG:ns NR:ns ## COG: SP1502 COG0765 # Protein_GI_number: 15901349 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 7 213 8 213 213 139 39.0 5e-33 MMATSPMESLSSGFIINIELFFITLLLALPLGLVITFCSMSKFKPLKWLSRTFVWIIRGT PLMLQLFVVLYAPGLLFSMPMNSRFTAAVIAFVINYAAYFSEIYRGGIESVSKGQYEAGQ VLGMTKTQIFFKIILLQVIKRIIPPLSNEVITLTKDTSLARIIGLAEIIMCAERFTKQGL IWPLFSTAVFFLVFNGILTILFGWIEKKMDYFRV >gi|222441898|gb|ACEP01000044.1| GENE 7 9454 - 10242 217 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 234 1 214 305 88 27 5e-17 MMSILEVKNLKKNFGSLEVLKGIDFSLEKGEALAMIGASGSGKTTLLRCLNFLERPTHGQ IIVNNKVIFDADDNKSLGDAQIRKNRLHFGMVFQSFNLFPQYTALRNCTLARELMAKERP DFKENKKKIMDEITAEGEALLESVGLKEKMNYYPHQLSGGQQQRVAIARALMLQPDILCF DEPTSALDPELTGEVLKVIRGLADRHTTMVIVTHEMKFARDVSDKVLFLDQGTIAEYGTP EQVFEHPKEERTRQFLERYFED >gi|222441898|gb|ACEP01000044.1| GENE 8 10283 - 
10444 67 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKSNEIYKKEGRTSIFTPPLKNYIFQCSFLYFNYLSIITVSFYNSRTILIFF >gi|222441898|gb|ACEP01000044.1| GENE 9 10515 - 10676 239 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026588|ref|ZP_03715780.1| ## NR: gi|225026588|ref|ZP_03715780.1| hypothetical protein EUBHAL_00838 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00838 [Eubacterium hallii DSM 3353] # 1 53 1 53 53 85 100.0 1e-15 MIDVRESTDKLKILIVDDSELNRELLAGMLGDEYEIYQVEKNDIKVNGTQIFG >gi|222441898|gb|ACEP01000044.1| GENE 10 10669 - 13422 2008 917 aa, chain + ## HITS:1 COG:slr1588_2 KEGG:ns NR:ns ## COG: slr1588_2 COG2200 # Protein_GI_number: 16332199 # Func_class: T Signal transduction mechanisms # Function: FOG: EAL domain # Organism: Synechocystis # 658 914 19 273 283 174 35.0 5e-43 MDEYEIYQVENGKKAIDILEENREQFKLVLLDINMPVMDGYEVLSIMKRRKWLDKLPVIV ISAEISGESVKKAYELGASDYFVRPFNVAIVLRRVRNMITLYDNISSNLKDAVTMLSTIF YRILKIDLEADSYEIIEQGNSDPLRELYQKESISACLKDVAEKGYIHEEDYKEYTEFCSL EHLKKIFLDGSQYASLQYRRVLEGQYRWVSMEIVRSTEYREDNQQVVMYIRDINDDYLKL LQIAMCHTLDSVGIVSANISQGICLSFAGRRDELECQSEESIDTYIQRVSEMIPMPESRE HFCQVFSQQNMLKRFTEGTAALSMEAAFFYSEEQQPCVLRINVDMACNSFSREIEGVLHF TDVTVAYLIENVPQKIYQKDYENIIIIDAKREKMIKTDVLSSVISDYLKKEEAYEGYRSY SSHRAVVESERERFKKCVELSTIKEGLRKDKQYFFTIHETDKTGEVRLKRYSYIYIDERV DIIVGAREDITEFSEKDVLTGGYNRRGFIRITERLLNEVPDRTKYAVLFFNVKNFKAVNE LFGVESGDVVLQNIFRTLTHSKLSPVITARVESDHFVCLVENKNLDFEELTSVCDNKFVK DGKCMNLIIRCGIFYVEEKPMKISGMIDRAKLAKRYITDEYVQPYMVYDHSMQVAYIDKA KLAGELQEGIAKEQFKVYYQPVIDTKTGKIASAEALIRWIHPDKGFISPALFIPALEENG HISELDFYVLKKVWQFINDRCENNKFVVPISVNLSWMDFYDEIMMEKILKEMDRFRENGR EHMARFEITETSYAAIRENRSGILESLRIKNAKILLDDFGSGFSSFGMLQDYDFDILKID MSFIRKIGENPKTKSIVHSIIGMAHEIGIKTVAEGVETEEQVSFLRQSGCDYIQGYYYSK PLPEEEFVEFLEKADAE >gi|222441898|gb|ACEP01000044.1| GENE 11 13673 - 14857 1152 394 aa, chain - ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic 
aminotransferase # Organism: Clostridium acetobutylicum # 1 393 1 392 393 402 48.0 1e-112 MLNKQYTSMLQEKDVIFEIFQYATERAKVVGAENIYDFSLGNPSVPAPSIVNETIVQKVN TLDPLALHGYSPSFGIMETREKIAVFLNKKYGSSYKASNIFMAIGAAGALAHAFRAVTNP GDEIITFAPCFSEYKPYTAGAGLKLTIIPADIDTFQINFAAFEAQLNENTTAVLINSPNN PSGIVYSTETIKKLADILTEKSKEFGHPIYLISDEPYRDIIFEGVDAPYVANYYKNTLTC YSFSKSISLPGERIGYLAVHPDCIEANKIIEICPQISRTIGQNGAASLMQRTVADVCNYT SDLSVYETNKKILFEALKEYGYHCVEPGGTFYMFPRSLEPDSQAFCKKAMEKDLMLVPGD IFGCPGHFRIAYCVPTERVEKALPIFQELAEEYL >gi|222441898|gb|ACEP01000044.1| GENE 12 15095 - 16171 1311 358 aa, chain + ## HITS:1 COG:ECs4874 KEGG:ns NR:ns ## COG: ECs4874 COG0371 # Protein_GI_number: 15834128 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Escherichia coli O157:H7 # 6 326 18 334 380 129 30.0 1e-29 MENYSVQVPPYTVGPEAYRQIEKQCHIYGKKAVVVGGKKAMAAAKDKLLAAVKDTGIEIL DFVWFGGECTFNNAKKLEQLEAVRQADMIFAVGGGKAIDTCKLVSIDFEKPYFSFPTIAS NCAPTSAVSIVYNEDGTFCKFVHFLEPAKHVFINTEIIANAPKEYLWAGIGDTYAKYYEV SISARGEKLEHFKAMGVQLSRMCMEAMVEHGAQALKDNEQGVASYALEQVALDIIITAGW VSLLVAREHTMDYNGGVAHAFFYGLCNLPGFDEDHLHGVVVGFGVLLLLTIDGKEEECQK LIAFNKTVGLPAKLSEVGVTVEEVAKCAPFMVKDEDLEHYPYVVTEEMIVEAAKKLDY >gi|222441898|gb|ACEP01000044.1| GENE 13 16285 - 16797 735 170 aa, chain + ## HITS:1 COG:FN0320 KEGG:ns NR:ns ## COG: FN0320 COG1853 # Protein_GI_number: 19703665 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 24 170 23 180 180 79 35.0 4e-15 MKFEKIDVTKANVNPFQRIGQDWMLISAKKEGKVNTMTASWGMMGVFWGKNVVTVGIRPQ RFTKEFVDAGDTFTLTFFDGERKQEMGYLGKVSGRDEDKISKVNFHVVETEEGEPTFEEG TMVFVCKKLMETQLNPEEFIDPEVDGRWYPEKDYHHMYTAEVVAAYRIEK >gi|222441898|gb|ACEP01000044.1| GENE 14 16818 - 17528 943 236 aa, chain + ## HITS:1 COG:TM1511 KEGG:ns NR:ns ## COG: TM1511 COG2738 # Protein_GI_number: 15644259 # Func_class: R General function prediction only # Function: Predicted 
Zn-dependent protease # Organism: Thermotoga maritima # 11 234 5 230 230 203 47.0 2e-52 MYYGGYYGMYWDPTYILVIIGMVICLLASARVKSTYAKFNRVRSHSGITAAEAAEKILHG AGIYDVEIRHISGDLTDNYDPSNRVLNLSDATFRSSSVAAIGVAAHECGHAIQDAESYAP LKIRGAIVPVVNFGATISWPLILIGIVLGGSRTLIMLGVLLFSLTVIFQLVTLPVEFNAS SRALKILGNSHILYDDEISGARKVLTAAALTYVAAAASSILQLLRLLLLFGDRRDD >gi|222441898|gb|ACEP01000044.1| GENE 15 17534 - 17986 462 150 aa, chain + ## HITS:1 COG:BH4007 KEGG:ns NR:ns ## COG: BH4007 COG1576 # Protein_GI_number: 15616569 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 4 148 3 147 159 138 46.0 3e-33 MTGIKIVCVGRVKEDFFRDGIAAYVKEIRKKYSMDIMECPDEPTPDQCPASVERQIREME GERILSKIKDDDYVIALCIDGKHYSTSAWKKRMERVFSTVSSDVVFVIGGSLGLADKVVR RANEKLSFSALTFPHQMMRLILCEQIARIV >gi|222441898|gb|ACEP01000044.1| GENE 16 18032 - 19687 596 551 aa, chain - ## HITS:1 COG:CAC1017 KEGG:ns NR:ns ## COG: CAC1017 COG2244 # Protein_GI_number: 15894304 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 3 540 7 490 500 149 26.0 1e-35 MIFTGTLILTLSGFLARILGFYNRIFLSNLIGAKELGIYQLIFPIYMLCFSLCCHGFETG ISNLTSRFFAKGQKKNAHHLVRLGCLLSFCLSILLMFALFEGADYLSTFILKEASAAPSL KIAALSLPFVSIKACLHGYYIGLNHSSVPAVSQIVEQITRVGGIYLLSISVFIMGADARI AAWGMVLGEAVSTIYTILAYFRRLFFENSIFFQKNLQKNNSNKNKQNKKISNKKNQNKRI LNKNNSDKISKNYNKKYADKRTFSRQADSFTLSTRELLEHFFSFSIPISVNHFCLTIISS LETMLIPFMLEIFFHSHSQALETYGTLTGMALPFLFFPATIVNSLSVMLMPAISSAYDQK QHRQMENTISASLHFCLLIGIFSTFAFLIYGTVLGETIFHSKEAGQYVYLFSVLCPFMYA AQTTSSILNGFGKTKQTLYHNLLGVGIRIFFILLLIPSKGIPGYLIGLLAGYSLQLLLNL FCIYQIVPFYFSAEKTLLFPVLTAIGGGILSKKFWGIASAALPLSSFYILVISGFLYFLF FTLCQIFRESV >gi|222441898|gb|ACEP01000044.1| GENE 17 19814 - 21004 678 396 aa, chain + ## HITS:1 COG:MA3264 KEGG:ns NR:ns ## COG: MA3264 COG1104 # Protein_GI_number: 20092080 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase 
and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 4 385 58 441 448 414 53.0 1e-115 MKQVYLDNAATTPVRLEVEVAMRPYFCEKYGNPSGVYQMSSENRGMIENVREQIAKTLNA SAEEIYFTSGGTESDNWAIKAAAQMLKEKGRHIITSKIEHHAILNSCAYLEKQGFEVTYL DVDEWGSIRLDVLEKAIRKDTILISVMYANNEVGTIQKIAQIGKIAEKNDILFHTDAVQA YGQLLIDVKKENIDLLSASAHKFNGPKGVGFLYIRKKIPLPEYIHGGKQERNHRAGTENV PGIAGMGKAAEIAFTTREYREKEIQQLRDYLIRRLQRGIPYCRLNGSLTNRLPGNCNISF QFIEGNELLLLLDEKNICASAASACSTGDTSPSHVLTAMGIPEKLARGTLRLTIGYQNTQ EEIDYTVQCIKEAVEKLRENAEDYRRYKETFPCNIM >gi|222441898|gb|ACEP01000044.1| GENE 18 21350 - 21793 290 147 aa, chain + ## HITS:1 COG:no KEGG:Thit_2303 NR:ns ## KEGG: Thit_2303 # Name: not_defined # Def: protein of unknown function DUF163 # Organism: T.italicus # Pathway: not_defined # 1 147 1 133 133 68 30.0 1e-10 MKYEIYIQERKKSQKFCKKAIAEYEKRLSRYCKISCKFVKKEKEWNSLLTKDTGMKKLLV QPGKAELTSEKLSDQLKKWEGSGMKGVMFFVPEWNEEATDKFAELGTGSIEPLVLSDFTM SSNMTGMVLYEQIYRGYRILHNHPYHK >gi|222441898|gb|ACEP01000044.1| GENE 19 21961 - 22434 450 157 aa, chain - ## HITS:1 COG:NMB1650 KEGG:ns NR:ns ## COG: NMB1650 COG1522 # Protein_GI_number: 15677499 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Neisseria meningitidis MC58 # 1 120 4 126 154 93 38.0 1e-19 MDKIDRKILSLLQENARYPLKYLAQKVFLSSPAVSTRIERMEKAGIITGYHATIDPVALG YHITAFINLTLDPKQKNEFYPYVTDCPNVLECNCVTGTYSIMLKVAFPSTMELDTFIGHL QQFGNTQTQIVFSTAVPPRGVGIETDETDTAEEIKKA >gi|222441898|gb|ACEP01000044.1| GENE 20 22792 - 24126 1795 444 aa, chain - ## HITS:1 COG:lin0569 KEGG:ns NR:ns ## COG: lin0569 COG0334 # Protein_GI_number: 16799644 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Listeria innocua # 3 443 17 457 458 591 66.0 1e-168 MSYVDDVLAKVIAKNPSEPEFHQAVTEVFETIRPLIEANEEEYKKQAVLERITEPERMIK FRVPWVDDNGQVQVNMGIRVQFNSAIGPYKGGLRFHPSVNQGIIKFLGFEQIFKNSLTGL PIGGGKGGSDFDPKGKSDREVMAFCQSFMTELCKHIGADTDVPAGDIGVGGREIGYLFGQ YKRIKGLYEGVLTGKGLTYGGSLARTEATGYGLVYFTEEMLKCHDDDIVGKTIVVSGAGN 
VAIYAIEKCHQLGAKVVTCSDSTGWVYDPEGIDLDALKEIKEVKRARLTEYKNYRPNSEY HEGRGVWSVKCDVALPCATQNELFLEDAKQLVANGCIAVCEGANMPTTLDATKYLQENGV MFAPGKASNAGGVATSALEMSQNSERLSWTFEEVDSKLKNIMVNIYHNIDDAAKRYNKEG DYVTGANIAGFEKVLNAMLAQGVC >gi|222441898|gb|ACEP01000044.1| GENE 21 24683 - 25756 1254 357 aa, chain + ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 1 314 1 317 342 253 43.0 5e-67 MIEVKNLVKDYGKHHAVKDISFSVPDGQIVGLLGPNGAGKSTTMNIMTGYISATSGEVKI GGYDILEQPIQAKKLIGYLPEIPPLYEDMTVAEYLNFICDLKGIRKKTEKESSINEVTEA VKISDMKGRLIKNLSKGYKQRVGLAQALIGNPPLLILDEPTVGLDPNQIIEIRALIKSLG QKHTIILSSHILSEVNAICDYVLIIDKGTLVAEDTPEHLSEDFSDTDNINMSVKGSREQV EEVLKVSEYIRDYKIIDEKDGIVDVQAKTATKEDIRDNLFFEFAEEKLPIIKMERESLSL EDVFLKLTGQDVEEVESEAKKASEQIEKQGHHGFSFKRKKKADSKGTEQAETAEEDK >gi|222441898|gb|ACEP01000044.1| GENE 22 25758 - 26624 980 288 aa, chain + ## HITS:1 COG:PA3671 KEGG:ns NR:ns ## COG: PA3671 COG1277 # Protein_GI_number: 15598867 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Pseudomonas aeruginosa # 1 172 4 176 244 80 32.0 4e-15 MKAIYLKEMRSYFHSLAAYIYFALFIAATGVYFSVICMSYGYTDYAQYVFSNSTILYIVI VPILTMRLFAEEKKQRTDQLLYTSPIRSGSIVFGKYLASVTLLAISMLVSIIEAGVLSMF GNVNWKTVLTGCLGYFLLGACLMAVGMFVSSLTDNQMIAAALSFAVVLFCMLLPNISNVV PGRARYTYIVCVLAVILIAWFFYDETKNVKITAGVGVAGIAIIAILSKVKPALFDNGLSK IIDWFSVLDRFNDFCSGILNASSIIYYVSFIAVFLFLTMQGIEKRRWN >gi|222441898|gb|ACEP01000044.1| GENE 23 26636 - 28057 1326 473 aa, chain + ## HITS:1 COG:alr1621 KEGG:ns NR:ns ## COG: alr1621 COG3225 # Protein_GI_number: 17229113 # Func_class: N Cell motility # Function: ABC-type uncharacterized transport system involved in gliding motility, auxiliary component # Organism: Nostoc sp. 
PCC 7120 # 16 379 73 415 567 87 22.0 5e-17 MIKNKMNTKLKKGIFSAVAVVVLIAAVIAVNIFVTSKNYTVDVTANKIYSLSKQTKKILK GLDKEVTIYVIDKESSVNSSYAQVWKEYKKNSTKVKFVYKDPDLYPNFTKKYTDSSEEVA NDSMIIKCGKKYRYVSANDYVSYSYGSDYSYSADSLQLESLTTEAINYVVSDSTPVIYTL SGHSEQSFDTSVTSSFEGDNYSVKTLNLLTEQRVPDDCSILLINGAQKDITKDELKMIKK YMKNGGKMYVFLDAKVENLTNLYSLLKSYNVEPQKGVVVETDASKYTQYPIYLLPTIESS DATSAQYNSNIYILAPSAKGLKDITEKNAKKNSSASDYTVTSLLSTSDGAYSKVNTDSAT LNKEKNDISGPFNISVAVSDSTGGRMIVTGCTNMLLQDIDQAVSGANTDFVLNGVNYLAE QKNKISIRAKSLKTENAVVPAFNQKATLIMTVFVIPLVILAIGIGIVIKRRKL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:07:06 2011 Seq name: gi|222441897|gb|ACEP01000045.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont50.1, whole genome shotgun sequence Length of sequence - 844 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 28 - 844 618 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|222441897|gb|ACEP01000045.1| GENE 1 28 - 844 618 272 aa, chain + ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 2 252 9 221 420 144 35.0 2e-34 MDKIYAAIDLKSFYASVECVERGLDPLITNLVVADKSRTEKTICLAVSPSLKKYGIPGRP RLFEVIQKVKRINKKRLEAAPGHKFIGQSFHSDKLSDPSVALAYITASPRMSLYMKYSTQ IYQVYLRYFAPEDIHVYSIDEVFIDLTGYLTNYQMGAKELISKVIQDVLKETGITATAGI GTNLYLAKIAMDIMAKHVPADEYGVRIAYLDELTYRKKLWEHQPITDFWRVGKGYAKKLA VYQIYTMGDVARCSVGKEKEYHNEELLYKLFG Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:07:09 2011 Seq name: gi|222441896|gb|ACEP01000046.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont51.1, whole genome shotgun sequence Length of sequence - 8994 bp Number of predicted genes - 10, with homology - 9 Number of transcription units - 7, 
operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 199 119 ## COG0675 Transposase and inactivated derivatives + Prom 335 - 394 8.0 2 2 Op 1 . + CDS 452 - 550 60 ## + Prom 555 - 614 1.6 3 2 Op 2 . + CDS 650 - 1099 386 ## COG1671 Uncharacterized protein conserved in bacteria 4 2 Op 3 1/0.000 + CDS 1133 - 1660 735 ## COG0655 Multimeric flavodoxin WrbA 5 2 Op 4 . + CDS 1732 - 3090 1756 ## COG2233 Xanthine/uracil permeases 6 3 Tu 1 . - CDS 3389 - 4183 854 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 4264 - 4323 8.2 + Prom 4300 - 4359 9.0 7 4 Tu 1 . + CDS 4404 - 5057 919 ## COG0176 Transaldolase + Term 5192 - 5251 13.5 - Term 5178 - 5239 12.1 8 5 Tu 1 . - CDS 5253 - 5678 455 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif - Prom 5820 - 5879 8.7 + Prom 5906 - 5965 6.7 9 6 Tu 1 . + CDS 5990 - 6814 805 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily + Term 7002 - 7030 -0.9 + Prom 7397 - 7456 7.6 10 7 Tu 1 . 
+ CDS 7498 - 8412 995 ## EUBREC_2750 hypothetical protein Predicted protein(s) >gi|222441896|gb|ACEP01000046.1| GENE 1 17 - 199 119 60 aa, chain + ## HITS:1 COG:TM1044 KEGG:ns NR:ns ## COG: TM1044 COG0675 # Protein_GI_number: 15643802 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 1 57 320 376 405 79 56.0 1e-15 MKVDRFFASSQTCSVCGYKNAKTKNLAIREWDCPQCGTHHDRDVNAAVNIRNEGIRLVNA >gi|222441896|gb|ACEP01000046.1| GENE 2 452 - 550 60 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSLLNEGAITFEDLEDFSEDLQDTLKHFAGR >gi|222441896|gb|ACEP01000046.1| GENE 3 650 - 1099 386 149 aa, chain + ## HITS:1 COG:CAC2825 KEGG:ns NR:ns ## COG: CAC2825 COG1671 # Protein_GI_number: 15896080 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 145 1 145 148 219 71.0 1e-57 MHIFIDADACPVVGIVEKTAKKYNLPVTLLCDTNHILYSDYSEVIVVGAGADAVDYKLIS ICHKRDIVVSQDYGVAAMALGKGAYAIHQSGKWYTNDNIDQMLMERHLNKKARRSSHKNH LKGPRKRTEEDDVRFAQSFERLILKVLEK >gi|222441896|gb|ACEP01000046.1| GENE 4 1133 - 1660 735 175 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 118 4 123 179 83 39.0 2e-16 MRIVIIDGSARKGNTLTAINTFIKGASEGNQIEVIHPEKLHIAPCKGCGACQCHKGCVDK DDTNETIDKIAAADMILFTTPVYWWGMTAQLKLIIDKCYCRGMQIKEKKVGLIVVGGAPV DNVQYELIWKQFECMAEYLSWNVLFHESYSAMGKDDIAKDEEILKKLEALGKNVK >gi|222441896|gb|ACEP01000046.1| GENE 5 1732 - 3090 1756 452 aa, chain + ## HITS:1 COG:DR2280 KEGG:ns NR:ns ## COG: DR2280 COG2233 # Protein_GI_number: 15807271 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Deinococcus radiodurans # 18 451 90 489 496 301 44.0 2e-81 MNESNQVIRDAKKLGLPKMLILGVQHMFAMFGATILVPILVNSYFMDACGEHPTRGLTVA 
VTLFCAGFGTLLFHVCARMKVPAFLGSSFAFLGGFSTVAHLNTGIYAGMSANEKAAYACG GVVVAGLLYVVMSLIICLVGIKRVMRYLPPVVTGPVIICIGLSLASSAISNASTNWFLAF VALAVIIAFNIWGKGMFKIIPILMGVVISYIVALIMNAAGFTNPDGSAILNFSSVASSGF VGLPPFQICKFDVTAILVMAPIAIATMMEHVGDMSAISATVNKNFLADPGLHRTLLGDGV ATALAGLIGGPANTTYGENTGVLELSHVYDPLVIRIAACLAIIISFIPKVSAIISTMPSA IIGGVSFMLYGMISAIGVRNVVENKVDLTKSRNLIITGVIFVCGLGFGDGLTFTVAGTSI TLTALSIAAIAGILLNIILPGNDYEFDESSDL >gi|222441896|gb|ACEP01000046.1| GENE 6 3389 - 4183 854 264 aa, chain - ## HITS:1 COG:lin0370 KEGG:ns NR:ns ## COG: lin0370 COG1349 # Protein_GI_number: 16799447 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Listeria innocua # 1 261 1 254 254 202 44.0 4e-52 MMQLERQKKILDYLNKNRKATTKELSQLLDVSTTTIRTDLNQMDRENLITKTHGGAVYND RSLDNIELTGRAYFFDERALENQKEKEAIANKAIKLIKNHMCIYLDASSTCYTLGMKLSG FTKLTVITNGINLAMALKDIPGITVILTGGIVTSVSSSIEGLLGEDLLKKIHTDIAFVSA RGFSVENGLTDFSIYEADLKRRCVKSSAKTIALIDHTKFNTTSISSYASLDDLNMVITDF GLSENTKDIYEKAGVNLVIAKEMN >gi|222441896|gb|ACEP01000046.1| GENE 7 4404 - 5057 919 217 aa, chain + ## HITS:1 COG:lin2886 KEGG:ns NR:ns ## COG: lin2886 COG0176 # Protein_GI_number: 16801946 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Listeria innocua # 1 213 1 211 214 294 74.0 9e-80 MKFFVDTANVADIKKANDLGVICGVTTNPSLIAKEGRDFTEVIKEITSIVDGPISGEVKA TTTDAEGMIKEGREIASIHPNMVVKIPMTVEGLKAVKVLSKEGIKTNVTLIFSANQAILA ANAGATYVSPFLGRLDDISQPGIDLIRQIAEIFDIYAYDTEIIAASVRNPIHVTDCALAG ADIATVPYSVIEQMTKHPLTDQGIVKFQADYKAVFGE >gi|222441896|gb|ACEP01000046.1| GENE 8 5253 - 5678 455 141 aa, chain - ## HITS:1 COG:SA0649 KEGG:ns NR:ns ## COG: SA0649 COG1661 # Protein_GI_number: 15926371 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Staphylococcus aureus N315 # 1 138 1 138 140 88 34.0 5e-18 MDYRRFKDTIYLRLDPGEEIIEQIEKVAKKENIQLAQINGLGALNDFTTGVFDTVEKKFH 
PIHFEGAYEIVSLTGTVTKQDGEVYLHLHMAAGDKDGHVFGGHLSMAYVSATAEIQIQIS DGVIDRKYDKKIGLNLFDFTI >gi|222441896|gb|ACEP01000046.1| GENE 9 5990 - 6814 805 274 aa, chain + ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 7 270 4 265 271 228 44.0 7e-60 MASVEQNKKKEDLENYLKELGSVAVAFFGGVDSTFLLKVAHDTLGEKCIAVTAKSCSFPE RELKEAKAFCEKEGIRHLVVESEELDIEGFRHNPKNRCYLCKKELFEKIWKIAKEEHVVA IAEGSNFDDNGDYRPGLQAVAELGIKSPLRASKLNKADIRVLSKDLGLPTWNKQSFACLS SRFVYGETISEEKLIMVDKAEQLLLDMGFHQVRVRIHGTIARIEILPEEFPKLVEETNRN LITKSFKEYGFTYVTMDLTGYRTGSMNETLNSDL >gi|222441896|gb|ACEP01000046.1| GENE 10 7498 - 8412 995 304 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2750 NR:ns ## KEGG: EUBREC_2750 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 303 1 306 307 450 72.0 1e-125 MANIIAVVWDFDKTLVNGYMEEPIFEHYGVDSSVFWREVNSLPEKYRVEQGVRVNPDTIY LNHFIHYAKKGIFKGLNNAMLYDFGKNLHFYEGVPEIFEETRKLIEEDSIYQEYDIKVEH YIVSTGLSQVIKGSVVVQYVKGIWGCELIEEEIENGEKIISEIGYTIDNTSKTRALFEIN KGVNRHEGVEVNTKMPEELRRVPFRNMIYVADGPSDIPAFSLVNKNQGATFAIYPHGDME AMRQVEQMRVDGRINMYAEADYREGTTAYMWICHKIKECAERIRKREREKISIYAQAGTP KHLT Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:07:34 2011 Seq name: gi|222441895|gb|ACEP01000047.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont52.1, whole genome shotgun sequence Length of sequence - 67545 bp Number of predicted genes - 58, with homology - 56 Number of transcription units - 37, operones - 14 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 275 - 320 2.8 1 1 Op 1 . - CDS 359 - 2305 378 ## gi|225026621|ref|ZP_03715813.1| hypothetical protein EUBHAL_00872 - Prom 2340 - 2399 3.8 2 1 Op 2 . - CDS 2424 - 4091 1206 ## Clole_3834 GerA spore germination protein - Prom 4339 - 4398 8.8 + Prom 4292 - 4351 10.7 3 2 Op 1 . 
+ CDS 4380 - 5357 815 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 4 2 Op 2 . + CDS 5357 - 6025 875 ## COG1802 Transcriptional regulators + Term 6113 - 6152 -0.3 5 3 Tu 1 . - CDS 6158 - 7111 875 ## COG3764 Sortase (surface protein transpeptidase) - Prom 7181 - 7240 1.6 6 4 Tu 1 . - CDS 7280 - 7639 308 ## BDP_0277 putative sortase-anchored surface protein - Prom 7737 - 7796 2.3 - Term 8194 - 8231 4.6 7 5 Tu 1 . - CDS 8309 - 9883 1538 ## COG1388 FOG: LysM repeat - Prom 9908 - 9967 1.6 8 6 Tu 1 . - CDS 9993 - 10277 180 ## EUBELI_00153 hypothetical protein - Prom 10351 - 10410 1.8 + Prom 10618 - 10677 9.2 9 7 Tu 1 . + CDS 10720 - 12072 1668 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase + Term 12147 - 12211 5.4 + Prom 12108 - 12167 2.2 10 8 Tu 1 . + CDS 12219 - 13223 441 ## Cphy_3824 CotS family spore coat protein + Prom 13237 - 13296 7.5 11 9 Tu 1 . + CDS 13327 - 15042 1346 ## COG2812 DNA polymerase III, gamma/tau subunits + Prom 15069 - 15128 3.3 12 10 Op 1 . + CDS 15150 - 15266 59 ## 13 10 Op 2 23/0.000 + CDS 15283 - 15627 181 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 14 10 Op 3 . + CDS 15627 - 16226 655 ## COG0353 Recombinational DNA repair protein (RecF pathway) + Term 16257 - 16292 -0.7 + Prom 16237 - 16296 7.7 15 11 Tu 1 . + CDS 16393 - 17466 1210 ## COG0205 6-phosphofructokinase + Term 17556 - 17599 1.0 16 12 Op 1 . - CDS 17679 - 18710 558 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 17 12 Op 2 . - CDS 18774 - 20528 971 ## Cphy_0040 FHA domain-containing protein 18 12 Op 3 . - CDS 20571 - 21074 244 ## gi|225026640|ref|ZP_03715832.1| hypothetical protein EUBHAL_00891 19 12 Op 4 . - CDS 21082 - 21813 244 ## Cphy_0037 hypothetical protein - Prom 22028 - 22087 4.2 20 13 Op 1 . - CDS 22127 - 23668 968 ## EUBREC_0046 hypothetical protein 21 13 Op 2 . 
- CDS 23730 - 23957 223 ## gi|225026643|ref|ZP_03715835.1| hypothetical protein EUBHAL_00894 - Prom 23977 - 24036 3.0 22 14 Tu 1 . - CDS 24102 - 25439 515 ## Cphy_0034 hypothetical protein - Prom 25469 - 25528 2.0 - Term 25499 - 25535 -1.0 23 15 Op 1 . - CDS 25565 - 26146 203 ## TherJR_2439 type II secretion system F domain protein 24 15 Op 2 . - CDS 26154 - 27401 953 ## COG4962 Flp pilus assembly protein, ATPase CpaF 25 15 Op 3 . - CDS 27322 - 28299 553 ## gi|225026647|ref|ZP_03715839.1| hypothetical protein EUBHAL_00898 - Prom 28383 - 28442 9.1 + Prom 28406 - 28465 10.7 26 16 Tu 1 . + CDS 28530 - 28892 154 ## gi|225026648|ref|ZP_03715840.1| hypothetical protein EUBHAL_00899 - Term 28930 - 28966 0.4 27 17 Tu 1 . - CDS 29040 - 29924 1189 ## COG4086 Predicted secreted protein - Prom 29953 - 30012 1.8 28 18 Op 1 1/0.000 - CDS 30045 - 32744 2197 ## COG4485 Predicted membrane protein 29 18 Op 2 . - CDS 32755 - 33480 442 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 30 18 Op 3 . - CDS 33506 - 33685 228 ## gi|225026652|ref|ZP_03715844.1| hypothetical protein EUBHAL_00903 - Prom 33781 - 33840 3.8 31 19 Op 1 . - CDS 33909 - 34121 284 ## gi|225026653|ref|ZP_03715845.1| hypothetical protein EUBHAL_00904 32 19 Op 2 . - CDS 34185 - 35489 1511 ## COG2607 Predicted ATPase (AAA+ superfamily) 33 19 Op 3 . - CDS 35493 - 36155 686 ## gi|225026655|ref|ZP_03715847.1| hypothetical protein EUBHAL_00906 - Prom 36182 - 36241 4.5 34 20 Op 1 . - CDS 36314 - 38122 1848 ## COG1164 Oligoendopeptidase F - Prom 38145 - 38204 5.0 35 20 Op 2 . - CDS 38209 - 39933 1067 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain - Prom 40079 - 40138 12.2 + Prom 40238 - 40297 6.8 36 21 Tu 1 . + CDS 40418 - 40720 204 ## gi|225026658|ref|ZP_03715850.1| hypothetical protein EUBHAL_00909 + Term 40731 - 40783 14.6 - Term 40727 - 40763 3.5 37 22 Tu 1 . 
- CDS 40778 - 41236 367 ## Closa_2549 hybrid cluster protein - Prom 41286 - 41345 8.8 - Term 41270 - 41323 12.1 38 23 Tu 1 . - CDS 41396 - 43441 2416 ## COG1384 Lysyl-tRNA synthetase (class I) - Prom 43545 - 43604 8.6 39 24 Tu 1 . - CDS 43838 - 44143 266 ## gi|225026661|ref|ZP_03715853.1| hypothetical protein EUBHAL_00912 - Prom 44215 - 44274 8.3 - Term 44212 - 44249 0.7 40 25 Tu 1 . - CDS 44292 - 44417 88 ## gi|225026662|ref|ZP_03715854.1| hypothetical protein EUBHAL_00913 - Prom 44518 - 44577 2.5 - Term 45095 - 45138 9.2 41 26 Op 1 . - CDS 45151 - 45672 544 ## COG0590 Cytosine/adenosine deaminases - Prom 45698 - 45757 4.2 42 26 Op 2 . - CDS 45840 - 47288 721 ## Cphy_3896 hypothetical protein - Prom 47308 - 47367 5.4 43 27 Op 1 1/0.000 - CDS 47411 - 48292 1077 ## COG0030 Dimethyladenosine transferase (rRNA methylation) - Prom 48372 - 48431 5.2 - Term 48491 - 48533 -0.4 44 27 Op 2 8/0.000 - CDS 48574 - 49344 739 ## COG0084 Mg-dependent DNase - Prom 49395 - 49454 2.5 45 27 Op 3 . - CDS 49548 - 51512 2204 ## COG0143 Methionyl-tRNA synthetase - Prom 51646 - 51705 6.3 + Prom 51669 - 51728 7.3 46 28 Tu 1 . + CDS 51786 - 53612 1218 ## COG4805 Uncharacterized protein conserved in bacteria + Prom 53670 - 53729 9.8 47 29 Tu 1 . + CDS 53872 - 55458 1990 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 55528 - 55562 1.1 48 30 Tu 1 . - CDS 55663 - 56289 691 ## gi|225026672|ref|ZP_03715864.1| hypothetical protein EUBHAL_00924 - Prom 56327 - 56386 7.8 - Term 56780 - 56829 10.1 49 31 Op 1 . - CDS 56858 - 57571 769 ## gi|225026674|ref|ZP_03715866.1| hypothetical protein EUBHAL_00926 - Prom 57632 - 57691 11.3 50 31 Op 2 . - CDS 57741 - 58877 1207 ## COG0520 Selenocysteine lyase - Prom 58989 - 59048 8.4 + Prom 58902 - 58961 6.8 51 32 Tu 1 . + CDS 59061 - 59258 65 ## gi|225026676|ref|ZP_03715868.1| hypothetical protein EUBHAL_00928 + Term 59405 - 59450 0.1 + Prom 59276 - 59335 10.8 52 33 Op 1 . 
+ CDS 59579 - 60892 1034 ## gi|225026678|ref|ZP_03715870.1| hypothetical protein EUBHAL_00930 + Prom 60974 - 61033 3.5 53 33 Op 2 . + CDS 61064 - 61987 1147 ## COG0524 Sugar kinases, ribokinase family + Term 62115 - 62159 6.5 + Prom 62161 - 62220 6.3 54 34 Tu 1 . + CDS 62250 - 63572 1073 ## Clole_0241 hypothetical protein 55 35 Tu 1 . - CDS 63879 - 65549 2170 ## COG2759 Formyltetrahydrofolate synthetase - Prom 65754 - 65813 9.0 - Term 65619 - 65656 6.2 56 36 Op 1 . - CDS 65839 - 66288 384 ## gi|225026684|ref|ZP_03715876.1| hypothetical protein EUBHAL_00936 57 36 Op 2 . - CDS 66355 - 66417 67 ## - Prom 66470 - 66529 5.1 + Prom 66538 - 66597 6.0 58 37 Tu 1 . + CDS 66664 - 67308 422 ## gi|225026685|ref|ZP_03715877.1| hypothetical protein EUBHAL_00937 Predicted protein(s) >gi|222441895|gb|ACEP01000047.1| GENE 1 359 - 2305 378 648 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026621|ref|ZP_03715813.1| ## NR: gi|225026621|ref|ZP_03715813.1| hypothetical protein EUBHAL_00872 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00872 [Eubacterium hallii DSM 3353] # 1 648 4 651 651 1306 100.0 0 MYENKISARQLRRMIFIETFGAGALSIPAFACYKEQNGFWSLIFYGVFLTGTIAFFIIFS GKISEENFENRKIGKTGLARDANINVSKMQTRLKSINFPEVLPQPIKFVYMIRFFINAVA LFYFFGKTIQTVYMPESSFLFIIFPAAILLCYSLYTTLQKRARFLEIIFQWIITTYFIAV ILSFIGIEKAMQIGNAGELWQGILSDNIFRSMGNGYLMLLCSSPIEFLLFLRPAAEERGT ETKNRKVEDKNNKEGKNNREGKDDKAIRDIITAVAGAILCNVLFCFLSVRTLGPTLTSQS EWPVIKMMQLIRMPGGFLERFDILPIIFWILCMMAVLSGYLYYGWSIFQTRSRGCLKNFL GITFVCLVLLTCVVEKHTLLWTFYLKYKAFVDFPLSLILPLVVWLFKRQKDFNYQEKTEK DKEERKNIAKYKNNKQVKYFAGITLLIFFPILFSLTGCQRLTDVEEKNYILSMYVDYPAD EENAYQFWIARANLSEMEGQSDEIPCQITKIEARNLQQLEEKYLEIVPGKTEWNHIYTIF LGSGIAANKTECTRLLKEWDNEWQKSPNVLLALCPENPKELYKIKNIPAGAAGQEINLLA EQNKEKYSGQICETPIDYLRAKQQQQDKITLYRVTIENGNLKIMTDEL >gi|222441895|gb|ACEP01000047.1| GENE 2 2424 - 4091 1206 555 aa, chain - ## HITS:1 COG:no KEGG:Clole_3834 NR:ns ## KEGG: Clole_3834 # Name: not_defined # Def: GerA spore germination protein # Organism: C.lentocellum # Pathway: 
not_defined # 108 555 66 518 552 406 42.0 1e-111 MRIKYNRDQKIKGNLTQNFDKDYAFLEKALDRSGDIVKNVFYVGDTADIQSIMEEQTDAN TDSDTNSIMEEQTDANTDSDIGRKTVSKRIVKTTQNNKKIKPKKAAVIYVDGMTDADMVE DFVIRPLLKNKCEKTGHDFLSYVENHVMETVDWKEDESFEDILTDILSGNTLLLLESCPK AIILSTKKYPSRGVGETQQEMVIRGPKDSFTENMRMNTALIRRRIRDSRLKMEHTMVGER SKTDLAIVYMDDLVQPELLEKVRQKVNALSFDGILDGGMVEQLLEENVWTPFPQFQHTER PDKAASGLLEGRIVLVVDNSPGVLILPVTYQMFFQAGDDYYTRFEVASFARLLRFAASLF AIGFPGLYVAIAAFHTEMLPTSFLLSIATARTGIVIPVALEVLLMEFQFELLKEAGIHLP GQLGGTIGIVGGLIVGQAAVEAGIVSTIVVIVVSFTAIASFIVPNESFGAVFRLLKFLFI VTAAIWGIYGYLLTFAALLLHLSQIESFGVPYMLPSVCGENLNYDDKKDHYVRYPFAYMK KRPVFTREGRRIRKR >gi|222441895|gb|ACEP01000047.1| GENE 3 4380 - 5357 815 325 aa, chain + ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 8 258 6 255 289 232 46.0 1e-60 MTQPLHLKAYAKVNLGLDVLRRREDGYHEVRMIMQTVKLFDYITLQKNTCGQIRLSSNLP YLPLNEKNLVYRAIDTIRTAYNISDGVDAAIEKHIPVAAGMAGGSTDAAAALVGMNQLFS LGITQDELIKHGLTLGADIPFCIMRGTALSEGIGEILTPLPSIPPCWFLIVKPSFSMSTK FVYEHLHLDEQTAHPDIDGMIQAISHGDLTGITDRMGNVLEQVTEKHYPAIVQIKKDMRR LGAVNALMSGSGSTVFGIFTSREAAGKAAEFFHNDPGIRQTAVVRPFSKNLPRSKSQKNH NHFSKKNSSLTQGDKYTAKKGRRNN >gi|222441895|gb|ACEP01000047.1| GENE 4 5357 - 6025 875 222 aa, chain + ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 14 215 23 224 228 120 34.0 2e-27 MKDTNFNVNEYLPLRDVVFNTLRQAIITGEFAPGERLMEISLANRLGVSRTPVREAIRKL ELEGLVIMIPRKGAQVARITEKNLRDVIEIRTVLEEFAAVLACERIDQSGLHDLRQAHED FIRSVENGDILDIVDKDETFHDTIFRATNNDRLISIINNLREQFYRYRMEYVKDIRQRSN LVEEHRELLDAISSRDSIKAKELMKTHLLNQQQEVINNIQEA >gi|222441895|gb|ACEP01000047.1| GENE 5 6158 - 7111 875 317 aa, chain - ## HITS:1 COG:SPy0135 KEGG:ns NR:ns ## COG: SPy0135 COG3764 # Protein_GI_number: 15674348 # Func_class: M 
Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Streptococcus pyogenes M1 GAS # 58 293 10 251 251 181 40.0 2e-45 MKSGKKSEKKSKKTLKKKKLSSNITTIILILIFLVGLSVMLYPTVSNYINQRHQSKAIAA YDEKVSEMKPEDYTKYFEAAEKYNKKLAKNPSAFYNPDEIKGYEKILDISGTGIMGYITI KKLGVELPIYHGTDEGILQIAAGHLKGTSFPVGGDSTHSVISAHRGLPSAKLFTDLDKME VGDTFTITILNREITYEVDDISIVLPDETDSLQIEDKKDYCTLMTCTPYGINTHRLLVRG HCIGTGEPQHIHVTAEAFQIDPTITASVMGIILLIMLLIRVLIRTRKKGQEGKHLKCNQV FLHLKRGDLIGNNKKTK >gi|222441895|gb|ACEP01000047.1| GENE 6 7280 - 7639 308 119 aa, chain - ## HITS:1 COG:no KEGG:BDP_0277 NR:ns ## KEGG: BDP_0277 # Name: not_defined # Def: putative sortase-anchored surface protein # Organism: B.dentium # Pathway: not_defined # 36 116 412 492 495 94 56.0 1e-18 MERTSKEEVNIPSIKEIRLERRRLARQNRKRRIFFNTTITWDGTNRTGVLTSLTGNQEDG KITFAPNDDKSELVTNVVNKKGSTLPSTGGIGTTIFYVVGAILMVGAAVLLITKRRAEN >gi|222441895|gb|ACEP01000047.1| GENE 7 8309 - 9883 1538 524 aa, chain - ## HITS:1 COG:CAC2903 KEGG:ns NR:ns ## COG: CAC2903 COG1388 # Protein_GI_number: 15896156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 19 514 21 514 520 131 23.0 4e-30 MKVEKKAVRSGILKLEKNVQITLDEDMNVPDTRPDVEKIIESRGEVHIDEIEVLTDRMRL RGNFLVQILYLSTDKEQQISCMEHEFSIEEFMNVESAESSDIAKVTADLEDLTISIINSR KCGVRSVIYFHIAISEMKFVECTTGIEKKENVQCLYESPSMTEIIMNKKDIQRIKASVSL PAGKPNIKEILWNSMQLRDVDIRMMENKLSIRGSLFLFILYQAEEGSESLQYYDWEIPFT NELDCADSQENLIGNIAVMLGNHQAVIKPDIDGEPRDVEIEAVLELDLKAYREFKMPLLK DMYANDRKLKLKTSPITFENLIFQNNAKTKVSQRVEAAGEIHKLLQVLNVEGNVRIEDFQ LTKQGIATEGLIFCKVLYIAGDDTAPIQSKEIVIPFEYLVEIPEVAETDRCEIRGVLEQI GGYVVDSNELEIRAVAGIYVTGFSPQTMYMIDEVEEIPYSEEEISRIPSITGYIVKSGDT LWNIAKHYGTTIEKMKQYNENLTEPLETGQKLFLLKEMESLIGE >gi|222441895|gb|ACEP01000047.1| GENE 8 9993 - 10277 180 94 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00153 NR:ns ## KEGG: EUBELI_00153 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 87 50 136 143 77 43.0 
2e-13 MPERLQKIIEERKKKPGTFAYKNSKYTYLVVCYGEKSYSGYSVRVEQCWKDKEQLYLETQ LIGPAAGEEVVETLTYPFLVVRCGRTELLCRIES >gi|222441895|gb|ACEP01000047.1| GENE 9 10720 - 12072 1668 450 aa, chain + ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 12 448 16 451 451 234 34.0 4e-61 MHISIPKHASDIIKTLSAHGYEAYVVGGCVRDSILGKEPADWDITTSALPEQVKALFPRT IDTGLKHGTVTIMMDKVGYEVTTYRIDGTYEDHRRPNEVTFTSDLKEDLMRRDFTINAMA YNEEQGLVDLFGGIQDLTDRIIRCVGNPAERFDEDALRMLRAVRFAGQLNFKIEENTKAA IEAQHTFLKDVSAERIQTELMKLLVSGHPEMIRAAYETGLTSVFLPEFDRMMETPQNNPH HIYSVGDHTVHAVETIAPNPVLRLTMLLHDIAKPETRTTDENGIDHFYNHNAVGADLAKA ALKRLKFDNQTTHMVTTLIANHDIRFNDPLGTGRKHVRKIIHRVGVDLFPYLLDVMEADI SAQSDFMRLKKLTALKETREAYREILTANDCLTLKDLKINGNQLKALGISEGKMIGAILN ALLAQVLNNPELNEYEKLEELALKIYQEVQ >gi|222441895|gb|ACEP01000047.1| GENE 10 12219 - 13223 441 334 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3824 NR:ns ## KEGG: Cphy_3824 # Name: not_defined # Def: CotS family spore coat protein # Organism: C.phytofermentans # Pathway: not_defined # 1 322 1 344 347 188 33.0 3e-46 MSEKYKELLEQYDVKVLSTFRSRGTFQCETEQGLALLKEYHGSLQKLALEYEWKEKLANA GFLTTDRYFLTKEESLVTYDRYRTPFVLKHYFKGRECDCRNLSDVAASCRNLAFLHKVSS SIEEIPFENLKTETTSHLFERKNRELRSIRKFIGKVRGKNDFELLYMDCFDSFYKEASHA LAHLLKAEADLADTDCGMCHGAYHYHNVLILPDYSVATVNFESLCYQPYLLDLYLFLRKT LEKNHYDYAFFETGISGYSIYRHLTEKDFLFLYLLFLYPEKFWKISNQYYNHRKSWIPPR TLEKLQKVLAQNEERHSFLKQFADFLRTLNINLL >gi|222441895|gb|ACEP01000047.1| GENE 11 13327 - 15042 1346 571 aa, chain + ## HITS:1 COG:CAC0125 KEGG:ns NR:ns ## COG: CAC0125 COG2812 # Protein_GI_number: 15893421 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Clostridium acetobutylicum # 1 385 1 384 542 354 47.0 3e-97 MGYMALYRKWRPSDFDEVKGQDAIVQTLRNQIIYNRIGHAYLFCGTRGTGKTSIAKLFAK AVNCEHPVNGNPCNACPSCQAINNQSSLDVLEIDAASNNGVENIRDIREQVQYSPVEGRY 
KVYIIDEVHMLSSGAFNALLKTLEEPPSYVIFILATTEKHKIPVTILSRCQKYDFKRISV DTITNHLVSLMEKEKVDADEKALRYIARAADGSMRDALSLLDQCIAFYLGQTLTYDNVLE VLGTADTSVFSALLRNILNHDSIAALDIIDSMITEGRELSQFLSDFLWYLRNLLILKDQE GAEESLDMSKETIAALKEECSMVDTIALLRYIRLLSELSNQIRTATQKRVLLEVGFIKLC RPEMESDTESLSERIRQLEDKIAQGIPMIPANGLGNTGMITNNGTFAYDPYTGQPLGTSG AAGAGTAGMVNGTGNSAVSGAFPQGTEGQISPEEALEKRFSPAEADDLKEIASHWNDIVN QTSMPMQQFLRAARVAVADSSTILQLVFTNNDLAKEYFEREHHKNLDLLSELVAEQTGKV VTFECTSSTHQPTEQSNYIDLSKIHQTVIFE >gi|222441895|gb|ACEP01000047.1| GENE 12 15150 - 15266 59 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSKFLRHMQCVRLKNLGATGEKSAAKASARKTPRVRNY >gi|222441895|gb|ACEP01000047.1| GENE 13 15283 - 15627 181 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. AzwK-3b] # 10 109 6 105 114 74 39 2e-12 MAKRGGFPGGMPGNMNNLMKQAQKMQKQMAETTKALEEKSYEASAGGGVVSVTVSGKKEV TAIKIAEEVVDPDDIEMLEDLIMAATNEAFRAMEADSQAQMSKLTGGLGGGFGF >gi|222441895|gb|ACEP01000047.1| GENE 14 15627 - 16226 655 199 aa, chain + ## HITS:1 COG:CAC0127 KEGG:ns NR:ns ## COG: CAC0127 COG0353 # Protein_GI_number: 15893423 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Clostridium acetobutylicum # 1 199 1 198 198 224 52.0 1e-58 MNYYSTQITNLIEELRRLPGIGSKSAQRLAFHIINMPEEHVHHLAQTLVTAREKIRYCKY CCTLTDSEVCPICASTKRDHSTIMVVEHTQDLAAYEKTGQYHGVYHVLQGTLAPSLGKGP DDIRFKELEERLNNPEVKELILATNSSLDGEATAMYITKKVKPRGIKVTRIASGVPIGGE LEYIDEVTLSRALDGRVEL >gi|222441895|gb|ACEP01000047.1| GENE 15 16393 - 17466 1210 357 aa, chain + ## HITS:1 COG:Cgl1221 KEGG:ns NR:ns ## COG: Cgl1221 COG0205 # Protein_GI_number: 19552471 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Corynebacterium glutamicum # 3 348 5 343 346 256 40.0 3e-68 MKRIGMLTSGGDCQSLNATMRGVVKGLNVLEGTVEIYGFIDGYKGLIYGNYKRLTSKDFS GILTTGGTILGTSRQPFKLMRTPDENGLDKVEAMKSTYRYLDLDCLVVLGGNGSQKTANL 
LREEGLNIVSLPKTIDNDLWGTDMTFGFQSAVDIATQTIDCIHTTAASHNRVFIVELMGH KVGWVTLHAGIAGGADIILIPEIPYDLKHVISAIQRRNEQGKHFTILAVAEGAISKEDAK LTKKQYKAKLAKRKHSTVGYELASQIEKATGLEVRVTIPGHTQRGGSPVPFDRVISSRLG VAAAELIAKEDYGKMLIMKGYDVATLPLEKTAGKLKYVSPDDQIVLQAKHLGISFGD >gi|222441895|gb|ACEP01000047.1| GENE 16 17679 - 18710 558 343 aa, chain - ## HITS:1 COG:RSp1525 KEGG:ns NR:ns ## COG: RSp1525 COG0705 # Protein_GI_number: 17549744 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Ralstonia solanacearum # 144 338 204 389 569 107 36.0 5e-23 MEFVEALQEFLFHQGFQKIKYDSKILWGKDTGDKLSLLEVVPPLLPGQPRIGLKQREEEM LDIERQIMLKYQKKVEHLLLILNKGEPDFEERKEAEKYPDIWFFDIKAGQLYIYEYQKSQ YCGLESSLEPFSQKFLQQKITEERTELRRMLTPVNTILVLANVVVFMILSFLGNTTDAEF MAAHGAIDWMDVVEKHQYYRLFTSMFLHFGADHLLQNMLILLVIGCRLERITGKLSYLLI YIGAGLIGAGTSIIFTLGNNPNTVSAGASGAIFGVMGGLLYCIISDIIQKKRHRVEEIGL TGMIFMVASALSYGFFSTGIDNAAHIGGLVGGFLITMISRFVI >gi|222441895|gb|ACEP01000047.1| GENE 17 18774 - 20528 971 584 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0040 NR:ns ## KEGG: Cphy_0040 # Name: not_defined # Def: FHA domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 40 579 30 575 580 84 20.0 1e-14 MEVIKEGTHTYLLKQIEYRTPAYDESVVYAENESWDYYLSVEYEVLKRNELKTLVSAQLR EKDEEKSLLFDITGKRPLKRQGKERAFSQKECEKILQNMRNMIQEIEDYMLDLNCIELRP EYIYEDSKGEIQWIYFPQTSLESEIKSNLQDRELQNKNEDLQKRMEALFAWMLPQIDYED TDAVQFMYRFYNKVRKLGFSKELLETYIQTKRQKEYYDIEEVDKTSAKVRDSSKIKQDRR SNNESISYEEFFKEELELEKQKIESPNNRKQKTGDFQIGIKNGRRNSRREVQRNERNRDE KNKNQSNRNQKNKDLNNRSKSQSNENNYLIVYNGLKIISVICTFLAVILEGIFVFYGMRS GFTRQLFQYSVGGMLLIIVFACGFIWSAQTVRKIKQKNKSNSMNSIKDNINKQNKVRHRK RGEQGSSISDKFIQEKELQNRPQMIKTEVDWEAEGDWESGTEGTTILNYGNESEKEDGKI CHPMLRDMELGIIYVIKNCPFYIGSAEGVNHLHIQDKTVSREHAVILEDIYEGDGQGYIL RDMGSTNGTWLNGKKIKRGNQEQLEDGAVIRFAKKEYEFLIQDI >gi|222441895|gb|ACEP01000047.1| GENE 18 20571 - 21074 244 167 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026640|ref|ZP_03715832.1| ## NR: gi|225026640|ref|ZP_03715832.1| 
hypothetical protein EUBHAL_00891 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00891 [Eubacterium hallii DSM 3353] # 1 167 1 167 167 286 100.0 5e-76 MGNKGMMTVEASVVVPIILIIVVAEFFFGLFLIDMSVVKSETMRLADEAADVWKTDGKLA DGTYKSQQLLLRNKNFLQKNRRGQLISKTRLRLQKRIATRLNIASLDSYKVSISGDKVLV QASVRLGNPLLGGRKYTGIKGWIFKCEEKADITNEEELLRKLSGNSR >gi|222441895|gb|ACEP01000047.1| GENE 19 21082 - 21813 244 243 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0037 NR:ns ## KEGG: Cphy_0037 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 78 241 127 300 304 107 39.0 5e-22 MILVETEVHYAVSQTAKICAKQQMFSLVPEEDDQKSNQVKEDNGIHKSDRTNGRQSIRNA SSVFFDIYDAKSLCDSLVEGGRKGISVQSAFSSEEEVQVKAIYTLKLSVPFFRPIRFQKD TAVKRRVFSGYVKHRRDYDAAGDNSIVYVAENGVVYHKNASCSHICLKITGNAAIQDIVH SSKYAACEKCIHKGSSLSAIFVTAYGDCYHSTLGCSGLKRTIKAVRLKDVGRLRPCSRCA SGR >gi|222441895|gb|ACEP01000047.1| GENE 20 22127 - 23668 968 513 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0046 NR:ns ## KEGG: EUBREC_0046 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 296 508 293 501 524 104 32.0 9e-21 MSFLFIVEGVISYSASSLGEDAVKGAGESIMANYDRELFKNYHLFFLDPRQKDYIVTDGK EFINQYFSGNSFFDIYCDSLEVTGEKTAVDEDGLYLKHEIREWMKYREEKEIAKVLEDII RNTKKNDGERKECQEEVDSAETDIKKEQDEEVNQETKEEANKKDTESDLKGDVDKNTDNV EDMKSDEKAASDEIEPVSEETLKERSNWKEIKETLQLLMRTGILFYAAEHPENLSRQSIP GTNLPSKRVKSSSFDNEQRLLDKMDELSFSSLKGIKSLLSVDISMDGRGTLWTKDRYIIS YIEDCFSFYGSSDKKDNALLYEVEYLISGKNSDIDNLKRVANYILLLRFINNYIFTGKDT QMKMQINTMASAITGVMGMPQTMKAVQVLIRMAISYGESLLELHTLFSGGEVPLIKDKTT WNLTLKTMAEQLRNKQIVKKGKKNISYKDYLKLFLLTKGNSRTVCYRMMDIMQENIASKE PGFLMENCLFSYSWKGSLAAGGVKLNFVKQCSY >gi|222441895|gb|ACEP01000047.1| GENE 21 23730 - 23957 223 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026643|ref|ZP_03715835.1| ## NR: gi|225026643|ref|ZP_03715835.1| hypothetical protein EUBHAL_00894 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00894 [Eubacterium hallii DSM 3353] # 1 75 1 75 75 86 100.0 5e-16 
MAFIKRKKEQMHQNNIVYKKNRLQSILLSQDGMGVVEVILIIVVLIGLVIIFQTNIKGVV ASIFSTIQKNAGSIR >gi|222441895|gb|ACEP01000047.1| GENE 22 24102 - 25439 515 445 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0034 NR:ns ## KEGG: Cphy_0034 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 67 439 96 480 485 135 27.0 3e-30 MNNIFVSWKQKIESRIISDLKSQYTESEEQIRQRKEQFLKKQKKIILIEIIVGAVFLAFL IIRAAFLQDDILLSRNLFGQGSKEVQLSLKKDDKKKEITYKLDEQKLSAEEESKVYIQFF KKLKKIMMKNNTSLKQMQTPLNLPDTVDGYPFEITYELAEDGYIRLDGSINEEEQAKLKR GETYKTYIVVTARYGEYRKSKKYEIGIVPKKNISQTNVFYKIQQYLKKEEQENRYSRDIK IPSVYKDIEITKRQENQGGISGILILFVTVCIFIPLHNYLKLQEEGKKCQEQAERDFPVI VHLLTLYMGAGLSFFSAVKRISQNYQKQRELDDSKKYAFEKMMRMEQQMSNGVSQREACQ DWGMQFRTDSYQKLSLILIQSFTKGSKEAAMLMEAEEREAFHKRIDRAKKEGEEASTRLL FPMILLLCQVMLLVMYPALTRFQGF >gi|222441895|gb|ACEP01000047.1| GENE 23 25565 - 26146 203 193 aa, chain - ## HITS:1 COG:no KEGG:TherJR_2439 NR:ns ## KEGG: TherJR_2439 # Name: not_defined # Def: type II secretion system F domain protein # Organism: Thermincola_JR # Pathway: not_defined # 39 174 91 223 278 84 34.0 3e-15 MIVEAGLILLNILCYGDIRWLLLEQILLFPCFNVLKRKKREKEKHLYEKGFQNLLQSMMT SLQAGYSLNNSCRIALKELEELYQDQNNPMLRQMRRIIQGLDLHIALEQLFMEFAESTGL EEAKQFAVVIEIVRSTGGNMVEILKRTMQHLKYKMDTEEEIRVLLSGKLFEKTLCYQCHS LFYYICAWPILNM >gi|222441895|gb|ACEP01000047.1| GENE 24 26154 - 27401 953 415 aa, chain - ## HITS:1 COG:SMc02820 KEGG:ns NR:ns ## COG: SMc02820 COG4962 # Protein_GI_number: 15963897 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Sinorhizobium meliloti # 47 356 92 401 466 276 46.0 7e-74 MELPGKRELREQILHQMDGRTSYVSDKDVYREIDKAILRSSHGGYGTLSEKEELRRQMFY SIRGFDVLEKYLGDSSISEIMVIGASRIFVEQNGKIRRIEEHFSEEGDVYRLIEQIVAPT NRMVNEACPIVDSRLPDGSRVHIVLPPVSLEGPVITIRKFLKGGMTIEKLLAFEEFPEKL TGILSALVKSRYNILISGATNSGKSSLLNALAEYIGKRERVITIEDSAELQLFHVENLVR LETRNANMEGENEITMQDLIKASLRMRPDRIIVGEVRGAEAMSMLQALSTGHSGSVSSIH 
ANSCRDALRRLETMVLMGMDMPLAAIQGLIASSVDILIHLGRLTTGKRKILEICEIKDYS RAEYDIRTLFQYKAGQTESLEMTGKLFHTEKLKSYGQYENYCEAVRLFCEEKREG >gi|222441895|gb|ACEP01000047.1| GENE 25 27322 - 28299 553 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026647|ref|ZP_03715839.1| ## NR: gi|225026647|ref|ZP_03715839.1| hypothetical protein EUBHAL_00898 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00898 [Eubacterium hallii DSM 3353] # 1 325 6 330 330 617 100.0 1e-175 MDFSDIRTIIIGEAGEYGKRLVRYLEIHLSSAVRVYHFTTAEGMVSFKEEAEIYLLDERF FKELSEEKQKFLKQKKLILLTWREEQGSFCKYDNPQKLLDMLSDFTDYNSDLSDLAAELK DTPTKLTIVYSPVYDEYLEQIARSFMSSGDVYMGAEDLGYKGIDSEYEEGGDMGDLCYYI HLREENILDILQDMLIPKEEVQVLHSPDLYFYLRELTKEDYAWFFDKIKKESPYREVLWG AGNCFVANVDMLRFFDQIILIDSRKNPRQSIFCDRLERVLNINQEEPQIWQRIYREDILD GTAWKERAERTDFTPDGWKNIICQR >gi|222441895|gb|ACEP01000047.1| GENE 26 28530 - 28892 154 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026648|ref|ZP_03715840.1| ## NR: gi|225026648|ref|ZP_03715840.1| hypothetical protein EUBHAL_00899 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00899 [Eubacterium hallii DSM 3353] # 1 120 1 120 120 195 100.0 1e-48 MYPILFSFSTHFSTLFINNSSIQNILSQDNLSQNKILLIFCAACFAGYFIAYIKKRPTAL FSIIGRGAAGLSFIYFFNFFCAARDIITGIGINPITGAVCTVLGIPGAILLYAIKIYSFL >gi|222441895|gb|ACEP01000047.1| GENE 27 29040 - 29924 1189 294 aa, chain - ## HITS:1 COG:BH1404 KEGG:ns NR:ns ## COG: BH1404 COG4086 # Protein_GI_number: 15613967 # Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Bacillus halodurans # 26 293 29 293 297 155 37.0 8e-38 MFKRKIITLSAIMIAGLMTFQANVSADAVEEKPYLSLGADLSTDQKSKVLELLDVDEKAL DQYNIVTVTNADEHKYLDDYLDSSVIGTKALSSVLVEKRDKGEGIDVTTKNITYCTAGMY ENALTTAGVTDATVTVAGPFNITGTAALVGAMNAYEDMTGEDISSESKDAATNELVVTSE LADQLNDSDKAEQFLALIKEKVLSDDVKSESDINDIIDECSKDLDITVTDDQKAQIAELM QKINKLDLNVNDLKEQASKIYDKLSDMDIDTQGIFEKIKTALSSIFAALSGLFN >gi|222441895|gb|ACEP01000047.1| GENE 28 30045 - 32744 2197 899 aa, chain - ## HITS:1 COG:L48341 KEGG:ns NR:ns ## COG: L48341 COG4485 # Protein_GI_number: 
15672817 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 50 898 2 875 883 228 25.0 4e-59 MRLAKSLTNRDMQITFTFLNTLWKSEKGRMRKEHMNIKKEGKQIRHIGQRFGRKHVIALS FFMPLIVMTGCCIAFGIQPFGDNSLLIIDGLHQYMPFYSVLYDKLKGGESLFYSFRSGLG INFLSLFSYYLSSPLNLLIIFFKKTQLNMAVSMIIVLKIALSGMTAGIYFSSKSKKQGLH VLMISMAYALSSYMVGYCWNVMWLDAIMIFPIVVMGLERLIDKKDGRLYCLALFYALYCN YYIAFMICIFAIIWYIFYSFKSVKQFFFRGISFALYSFLAAGMAAVLLIPAYLGIKQTAS GEAMTLPDHSFLTNAADLLNRQFAMGSPISHDNFDGNANLYIGIFTVLAVGLYLLNQQIK ISDKIKKTLLVGFFYLSFGEMILNFIWHGFHDQYGIPNRFSFLFGFVLLHMLWEVFEHLD GTKGWHVAFACMAGVGLLAYARVKGTQPLEDKVYGVAGMLFILYGMILLLYTLSRKRKVW YKAAFCVVAIIEIATTACMGFNENGQISVSKFFYGTEDMEKAAASLKDGTFYRSELASAL MVDENAWYPINSVGLFGSTATDRMVNMMDNLGFYTGCNEYLYKGATPVTNVLLNVKYLYY HQEDSLTTDFKYLKTQGTFDIYENPAKGMSIGYLMNDSIKDWYYDSAYPFRVQNDLGEQA FDVFELFHDIEIDDPATNGCTASKTNDGEYYFEYGDSRPDNMTFTIPITETAENLYLFYD GTQVENAQIMVDGTNVKSGDLDGYMLPIGKVSAGSEVKVTFELKGETEDGYVRLSAADFD QEVFEDFKKTAAEQAFTVTDYSSNSLEGTVDASDNQTLFLSIPYDSGWKVKVDGKEVKTC RIGDAFLGVKVPSGEHTVSLKYTPPGFSIGWKVSLATAIIFIVLCFVKYRYKADIKKKE >gi|222441895|gb|ACEP01000047.1| GENE 29 32755 - 33480 442 241 aa, chain - ## HITS:1 COG:DRA0037 KEGG:ns NR:ns ## COG: DRA0037 COG0463 # Protein_GI_number: 15807707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Deinococcus radiodurans # 5 221 10 236 328 115 30.0 5e-26 MKISVAMAYYNGGTYIEEQMDSILSQLGEKDEVIVSVDGASDGSKPLLLKMSEVDSRIHI VKGPGKGVVRNFENAIRHCNGDIIYLSDQDDIWKPDKVEKVNYAFSNPEVKAVLHDAEIV DENGVATGAASLFSIRGSRAGILKNLLKNSYVGCCMAFRKELIPIICPIPNEMYMHDYWI GTAAEYMGKVCFLDEPLIGYRRHSSNVTQMTHGSIRFMIKKRIDIIRCLGLLKKRVRKVK A >gi|222441895|gb|ACEP01000047.1| GENE 30 33506 - 33685 228 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026652|ref|ZP_03715844.1| ## NR: gi|225026652|ref|ZP_03715844.1| hypothetical protein EUBHAL_00903 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00903 [Eubacterium hallii DSM 3353] # 1 59 1 59 59 85 100.0 9e-16 
MGVVIGRMDGQVEKIFLGNNLTRSTKNYAKILKDENLSEEEKQRILAAEKKKQEEDIIL >gi|222441895|gb|ACEP01000047.1| GENE 31 33909 - 34121 284 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026653|ref|ZP_03715845.1| ## NR: gi|225026653|ref|ZP_03715845.1| hypothetical protein EUBHAL_00904 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00904 [Eubacterium hallii DSM 3353] # 1 70 8 77 77 107 98.0 4e-22 MKKKEKLFLGGEINLSELDADVQKKYEVAKELGIYDKVLEKGWGSLTAEETGRVGGVLRK RKNSGERVLS >gi|222441895|gb|ACEP01000047.1| GENE 32 34185 - 35489 1511 434 aa, chain - ## HITS:1 COG:CAC3262 KEGG:ns NR:ns ## COG: CAC3262 COG2607 # Protein_GI_number: 15896507 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Clostridium acetobutylicum # 1 431 10 425 426 273 39.0 5e-73 MYREVSKLILYRDLGEDSILLNMADIFKRFDSCHYRADELITDIYREMKALLDLATTYGF DKNLWHNYLTFVLVTNENSFSMTSEKVGANNGTVNHFAKNDFQVFMNLFHYDFRPIEETL GIDCFSTILDYKAIGKTERMYNKNVSEKVRALSDELAAAEDVDTFFNAVVKFYKDYGVGM FGLNKAFRIVENNGKPDFVPINNLDKVVLDDLTGYEIQKKKLVDNTEAFVQGKVANNCLL YGDAGTGKSTSIKAILNQYYGDGLRMIEIYKHQFKYLSAIISHIKNRNYRFIIYMDDLSF EEFEIEYKYLKAVIEGGMETKPDNVLIYATSNRRHLIRETWSDRSDKDEELHRSDTVQEK LSLAARFGVSINYSKPTQQEYFQIVTNLAKKYPEIKLSDKELCAEANKWELSHGGISGRT AQQFINYIAGKSEK >gi|222441895|gb|ACEP01000047.1| GENE 33 35493 - 36155 686 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026655|ref|ZP_03715847.1| ## NR: gi|225026655|ref|ZP_03715847.1| hypothetical protein EUBHAL_00906 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00906 [Eubacterium hallii DSM 3353] # 1 220 1 220 220 384 100.0 1e-105 MYEFTLLKSMFFELDKYKIEGVTQTKEKQRTFRRLEVSYDSVPEITSITAEERILPYVDR YYLFINDERVDTDIHDPKVKQVIYEFYKQHHPEYLKENLPDDMTDVDILDCFEVFKMPDY RPKHAFPGLVTLEDIRNGNIHDEDKPKFAFPGLELYSNRDASEDEEEKPMVASYKGYTAF DGDREQAKDKIEQLKSQFASQIEALNVDEILKNRSENQEE >gi|222441895|gb|ACEP01000047.1| GENE 34 36314 - 38122 1848 602 aa, chain - ## HITS:1 COG:BH2856 KEGG:ns NR:ns ## COG: BH2856 COG1164 # Protein_GI_number: 15615419 # Func_class: E Amino acid 
transport and metabolism # Function: Oligoendopeptidase F # Organism: Bacillus halodurans # 7 602 3 598 598 605 51.0 1e-173 MEENKEKTVKKRQECDPAYQWHIQDLYASDEAWEKDYESLTSEIQSLAAYEGRLKEGSEV FVEYMRKKEALMKKFEAIYVYANQRYHEDTGNSFYQGLAGKAQTLSIQLDSAVVFEEPEL LAIGKKTIDSWFTQNMDMQLYKRYFYELFRQQKHVLSKEEEAILADVSDMSADVSNIFSM FNNADIRFPSIEGKEGEKIPVSHGRYTLLLESRDVNIRKSAFESVYSQYGQYRNTLAALY AANLKNTAFFAKKRHYNSSLEMALEGGEIPTSVYTNLIDTVHEHMDLMHRYVSLRKKALK AEELHMYDLYAPMVDEFEMKVPFSKAKEIVKKGLAPLGKEYGKVLSDGMESGWIDVYENE GKRSGAYSWGAYGCHPFVLMNYQDNLNNVFTLAHEMGHSMHSYYSDKNQPYIYAGYRIFV AEVASTCNEALLMEYLLKNTEEKKEKMYLINYFLEQFKGTLYRQTMFAEFEKITHEMAWN KEPLTAEVLCDIYYKLNQKYFGEDIVIDKEIALEWARIPHFYTPFYVYQYATGYSAAIAI SRKILSGDERAKEGYFKFLSGGSSMNPIDLLRLCNVDMAAKEPIESALKLFGELLDEMEE ML >gi|222441895|gb|ACEP01000047.1| GENE 35 38209 - 39933 1067 574 aa, chain - ## HITS:1 COG:CAP0002_2 KEGG:ns NR:ns ## COG: CAP0002_2 COG5279 # Protein_GI_number: 15004707 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: Clostridium acetobutylicum # 128 216 5 102 138 63 35.0 9e-10 MLCFILAGSLFFTKDVDAATFSDTQKEYINVLSKFMIEGTVSGNLDVYRTVSQGKSGKKC MEAAAISNRAALMAEGIDFLDSNWSQYYAVETRDGATTFNSTKLISRSKFRRRYKKIIRG LDEALSTVDSSMSQADKAMAVYIYFAENTTYKESKDAHTGYDVLVSHVGVCDGLANAYAL AMNTLGIPCAVISNYSKNHSWNIVKLNGIWYFVDLTNGVGVGKHEGMVVTYDSFLVGKNT FLKTHPGYKKKDLYGQGNSNNLKMRKIKLASSDYIKNSKEMKQALENKTCTFYHKGYWYW ISQDNYLKKSNLRGNKAVVFYAPRSGNYIGWIRQYNNRIILSLNSGIYRMSFLNKYPVLL KKVDNRDKNAETKGALWSLIYIGRFIVNKNGWLSYYTTDFHGVRRGNVKIFLGRAQQSGQ LTKSGRKIIQLQAGQIKQLATIHMHDRGLRTAGWTSANPSIVSIDNNGYMKAKRSGKTYI STRINGRKRAFRVKVSGYTITYKNVGINSSKNRERLSGKRAVTLRSPRKKGYTFVGWYDS SGKNVTVIPKGNTKNLILHAKWDKTSSKDSVKAK >gi|222441895|gb|ACEP01000047.1| GENE 36 40418 - 40720 204 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026658|ref|ZP_03715850.1| ## NR: gi|225026658|ref|ZP_03715850.1| hypothetical protein EUBHAL_00909 [Eubacterium hallii DSM 3353] 
hypothetical protein EUBHAL_00909 [Eubacterium hallii DSM 3353] # 18 100 1 83 83 157 100.0 3e-37 MTKTSEFHTLLRNYRGIMGNIRINKIWISFDSCFFNYYADDDEIQFNLAGQEGVITIPAV ISYDKIDEEDLFEDSVAGYTATLESGDTVTFYCFNAKNKS >gi|222441895|gb|ACEP01000047.1| GENE 37 40778 - 41236 367 152 aa, chain - ## HITS:1 COG:no KEGG:Closa_2549 NR:ns ## KEGG: Closa_2549 # Name: not_defined # Def: hybrid cluster protein # Organism: C.saccharolyticum # Pathway: not_defined # 5 152 36 187 530 79 35.0 3e-14 MNKKEKELIGALIGLAKACNVHLKTENTDGIIIKSLASIFPLEENGEELLQRVREEKLAV APDCATCFAPCGNTDEYNLDELQASGISETVRDLKFQLLNVSHEIASGMVSYTINSTEEN ISLLYKALCVVSYDVDEERVQTVLKELQRITI >gi|222441895|gb|ACEP01000047.1| GENE 38 41396 - 43441 2416 681 aa, chain - ## HITS:1 COG:PAB0139 KEGG:ns NR:ns ## COG: PAB0139 COG1384 # Protein_GI_number: 14520422 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class I) # Organism: Pyrococcus abyssi # 1 516 5 514 526 304 34.0 5e-82 MHWSEKIAKDIIKRSPDKEEYVCAAGISPSGSIHIGNFRDIATSYFVYKALIKQGKKARL LFSWDEFDRFRKVPVNVKEVAPELENEIGKPYVDVKDPYGCCSSYAEHFEKEFEASLSRF GVKVDFRRQSENYRSGKYAKYVIQAVQKRREIFDILDKFRQQVATPEEREAYYPVSIYCP VCGKDETTITSSSEDGVTCEYTCKCGHKGTWDFSKDFNCKLAWKIDWPMRWMIEGVDFEP GGKDHASPGGSYDTSKIIAKKIFGAQAPMFKGYEFVGIKGTTGKMSGSTGLNLTPDTLLK IYQPEMILWLYSKSEPNKAFDFCFDDEILRQYFEFDKMLKVYQAGKGKNYDYIEGIMHNC MIEGRELYPVPMQQIVNFGSVVDFNADMLETVFEKIGTPYKKEEFAERLELAKYWLEKCS PENMNTLLGYRNWDFYNTLNEVEKKEIQLLHDFIAKGEYDLDALNSFIYTIPREADPDFQ EENKKTAQAQFFKNAYNLMIGKAAGPRLYLFLFAVEPQRYLGLLDFSTPQTEEEKTLAAE AKAEAERKAAEEEARRKAAEEEEARKNAIAPIKEEITIDEFDKVDMRVCKVINCEIVKNA KKLLKLTLFDGLDERVIVSSIRDDYTPEELIGRKIIVIANLKPAKFAGVKSNGMLIAASG DDFGCKIIFVDDCVPEGTAIH >gi|222441895|gb|ACEP01000047.1| GENE 39 43838 - 44143 266 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026661|ref|ZP_03715853.1| ## NR: gi|225026661|ref|ZP_03715853.1| hypothetical protein EUBHAL_00912 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00912 [Eubacterium hallii DSM 3353] # 1 101 1 101 101 157 100.0 3e-37 
MQGDVIDSWEERLRLTLDHFRNENMADFRKNSEAYRRLVREEEEAYAKLDTLNLTEEQRE VIEEAFEAQSAAEAEYATESYFCGIADCIRFLKYIEGTRSK >gi|222441895|gb|ACEP01000047.1| GENE 40 44292 - 44417 88 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026662|ref|ZP_03715854.1| ## NR: gi|225026662|ref|ZP_03715854.1| hypothetical protein EUBHAL_00913 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00913 [Eubacterium hallii DSM 3353] # 1 41 1 41 41 63 100.0 5e-09 MYSAYSRMTGNRGIGSLGHLSNELNNKKGEKYLGMRMDVVY >gi|222441895|gb|ACEP01000047.1| GENE 41 45151 - 45672 544 173 aa, chain - ## HITS:1 COG:BH0033 KEGG:ns NR:ns ## COG: BH0033 COG0590 # Protein_GI_number: 15612596 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Bacillus halodurans # 6 156 4 154 159 185 59.0 3e-47 MRKYTEDEKFMKEAIKQAKKAEAIGDVPIGCVIVHDGKIIARGYNKRNKDKTVLAHAELL AMKKACKKLGDWRLEDCTMYITLEPCQMCAGAIVQARVTRVVIGSMNAKAGCGGSILNLL EMQEFNHQAEVERGVLQEECSEMLSAFFRKLREIQKEKKKKRKLIQEENQTDN >gi|222441895|gb|ACEP01000047.1| GENE 42 45840 - 47288 721 482 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3896 NR:ns ## KEGG: Cphy_3896 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 474 1 463 475 87 24.0 1e-15 MADYQRILSYLYRYEKAEKKECFGFVKAEQKSGSLKLTIQIDDERLLQGMELKLCFYERQ GESWQVWQLDTLITQEHKEEIHLTYPAAMLPAGFRIKGQSGVLLYYQDAFYYGSVWIGEE IPTETLEPLRWHKIVNSAKDKSQMQENKSGKDIAGKISEKQTETEKILPDKMSEKQKVST RNITDEMLQEQQELEENILEKKGQINTESFEEKDFQKRAENKIENIKSEENLPDSENDIA HSNNTNIEESVKIKEFSEDGIESKNKSELELKSELKTDSEIESKSELKTDSEIESKSELK TDAEMESKSELKTDSEIESKSELKTDAEMESKSELKIDSEIEPQIKIEAETEPEATSVVD NFEKMWINAMKKNPSVDNIFNTAFYEGCRISTADLAQFGEEASVLKSNQFLLKGYGRYHH ILAGKVRYAGEERYCIGVPGIYENREKYMAEIYQFPVFLSLTENRMKTGSFGYWLYLLRD GI >gi|222441895|gb|ACEP01000047.1| GENE 43 47411 - 48292 1077 293 aa, chain - ## HITS:1 COG:SP1985 KEGG:ns NR:ns ## COG: SP1985 COG0030 # Protein_GI_number: 15901808 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Streptococcus pneumoniae TIGR4 # 20 291 1 274 279 279 53.0 4e-75 MTAQGGLMANLGNPQETIAVLQRYGFNFQKKYGQNFLIDTHVLDKIIGAAEIGKDDFVLE IGPGIGTMTQYLAEAAREVVAVEIDTKLIPILEDTLKEYDNVTVLNEDILKVDIRKIAEE KNGGKPIKVVANLPYYITTPIIMGLFESEVPLDSITVMVQKEVADRMQVGPGTKDYGALS LAVQYYAEPYIVANVPPNCFIPRPAVGSAVIRLTRYQEKPVKVNDSAFMFKIIRASFNQR RKTLQNGLYNSSELRIPKEKTVAALEEMGLTPTIRGEKLSLEEFAKLSDILGR >gi|222441895|gb|ACEP01000047.1| GENE 44 48574 - 49344 739 256 aa, chain - ## HITS:1 COG:BS_yabD KEGG:ns NR:ns ## COG: BS_yabD COG0084 # Protein_GI_number: 16077107 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Bacillus subtilis # 1 252 1 252 255 244 46.0 1e-64 MIFDTHSHYNDKQFDADRAVVLESLKDAGVTQVVNVSASWKDLMDTLELISKVPFMYGAA GIHPDHVGELNEERMEQLREYCHRDKIVAVGEIGLDYHWNVEPKEVQQEWLIRQLHLATE EKLPVIIHSRDASQDTFDIMKKEHAGTTGGVIHCFSGSAEMAKEYVKMGYYIGVGGVVTF KNSRVLKEVVKAIPLECIVVETDCPYLAPAPHRGKRNSSAYLPLVIEEIARLKEITAEEV ENVTYENAQRLYGLIH >gi|222441895|gb|ACEP01000047.1| GENE 45 49548 - 51512 2204 654 aa, chain - ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 3 535 4 530 536 659 59.0 0 MSKKPYYITTAIAYTSGKPHIGNTYEIVLADAIARFKRKEGYDVRFQTGTDEHGQKIELK AAEAGISPKEFVDNTAGEIKRIWDLMNTSYDKFIRTTDEDHEKQVQKIFKKLYDKGDIYK GHYEGMYCTPCESFFTESQLVDGKCPDCGRPCQPAKEEAYFFRMSKYADRLIQYINDNPD FIQPESRKNEMMNNFLLPGLQDLCVSRTSFSWGIPVDFDPKHVVYVWLDALTNYITGLGY DCDGESGELFKKYWPADLHLIGKDIIRFHTIYWPIFLMALDLPLPKQVFGHPWLLQGDGK MSKSKGNVIYADDLVDLFGVDAVRYFVLHEMPFENDGVITWDLLVERMNSDLANTLGNLV NRTISMSNKYFGGVVANSGVTDGFDKNLMDVALDTKHKVIKCMSKLRVADAMTEIFNLFK RCNKYIDETMPWALAKDPAKQDRLATVLYNLVESITIGASLLSPFMPETAERIFAQLNSS ERTMDELEMFGLYESGKKVTDKPEILFARLDAKEVQAKVDEMNASRQAPAEEEKKDDEPV IDIEPKEEITFDDFEKMQFQVGEIIACEEVKKSKKLLCSQVKIGSQVKQIVSGIKKHYSA 
EEMVGKKVMVLVNLKPAKLAGVLSEGMLLCAEDENGELSLMVPEKKMPAGAEIC >gi|222441895|gb|ACEP01000047.1| GENE 46 51786 - 53612 1218 608 aa, chain + ## HITS:1 COG:XF0221 KEGG:ns NR:ns ## COG: XF0221 COG4805 # Protein_GI_number: 15836826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 127 598 127 599 613 104 22.0 4e-22 MKKNHRRPLSKALFTHLFFLLLILICGVSLAGCGQSSAHNSSKQKKFDQFLDSCFKEFAT ENTITLHFKLSDPSSYGIKKEPSPTYGELSSNTAKKNCKRSKQLLQELYTFPTSSLTKEQ KLTWQIFQDYLNETIMSEKYILYSSPFGADGLPSDIPVTLSEYRFDNEKDIKDYLSLVNQ IPELFTQVLDFEEERRNADIVSPDFVISDTIDQINQFLNASEENNLLVESFEERLDSLDT LSEDQKASYTANNRLLITNKVFPAYEHLKTALQVSTGSKHTTSDNSTKERLCEYENGQDY YRFLLQSDVGTDMSPEECITALETQLKDTIKDISSLTTQNKDLYTEYLSAVPTLSKPKEI MEQLKDDSLVDFPEIKNISYELKNVPNALSGTSACAFYLVPPIDSKNANIIYINNNRVES NEMFSTLAHEGYPGHLYQTNYFLSTNPSPLRTFLHCDGYDEGWGTYAQLYSYNFTEFKDV DEKTQAQLRQLYRDNDLLSLSLSSLSDLYVNYKNYDENALADYLKTYGIDKSATKQLYQY VIENPATYLSYSIGYYELNKLEKEMAASLGKSFKISDFHEAVLNVGSCNFAILRQEIEKE TEILSKVP >gi|222441895|gb|ACEP01000047.1| GENE 47 53872 - 55458 1990 528 aa, chain + ## HITS:1 COG:FN1120 KEGG:ns NR:ns ## COG: FN1120 COG1866 # Protein_GI_number: 19704455 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Fusobacterium nucleatum # 1 526 1 524 527 800 71.0 0 METYGIEKLGIVNPKAVYRNLTPAQLTEAALRRGEGKLSNTGALVVTTGKYTGRSPKDKF IVDTPSVHDDIAWGSVNVPITQEKFNAIRSKVIAYLQNREIFIFDGMAGADPVCTRKFRI INELASQNLFIHELLIRPTAEELENYGEADFTIFVAPGFKCIPEIDGTHSEAAIIVDYEQ KQVVICGSQYSGEIKKSVFSVMNFLMPKEGVLPMHCSANMDPETHETAVFFGLSGTGKTT LSADPNRKLIGDDEHGWSDRGIFNFEGGCYAKCINLSAENEPEIYNAIKFGSLVENVIMD DETREFDFDDGSLTENTRVGYPVDYISNAQIPGVGGIPKVVIFLTADAFGVLPPISRLDE NAAMYHFVTGFTSKLAGTERGITEPQPTFSTLFGEPFMPMDPSVYANMLGERIEKYNTKV YLVNTGWTGGPYGVGNRMKLKYTRAMVTAALNGTFDDVEYKHDEVFNVDIPQTCPNVPSE IMNPRDTWEDKAAYDAQAKKLAKMFQDNFTKKYPNMPKNIAEAGPKAD >gi|222441895|gb|ACEP01000047.1| GENE 48 55663 - 56289 691 208 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|225026672|ref|ZP_03715864.1| ## NR: gi|225026672|ref|ZP_03715864.1| hypothetical protein EUBHAL_00924 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00924 [Eubacterium hallii DSM 3353] # 1 208 1 208 208 337 100.0 4e-91 MKKVTKIFALALAFVIIFTLQIPANAATKQARTSKGVYGYIEELTDEYVSIECKDTQDEE FEVQVRDNKGKEVFSQREIGYFFVELSKNKLYSFRVRGMKINYDTDSYDPYTPWSNSVYF STAQYKLKQVGKTKKAKITTPKVAGISKYTVYMSLKKNGGYKKVKTVKAGKSITVSKFKG KALKAKKNYYFKFVPNKGTETVKSLRIK >gi|222441895|gb|ACEP01000047.1| GENE 49 56858 - 57571 769 237 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026674|ref|ZP_03715866.1| ## NR: gi|225026674|ref|ZP_03715866.1| hypothetical protein EUBHAL_00926 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00926 [Eubacterium hallii DSM 3353] # 1 237 1 237 237 360 100.0 6e-98 MKKITRIFALALALVMAFALQAPVNAAEKAAVPVMDREMEATTMMTDVTNQNFAPYYKSN KEITYQVAVPSDVTGVQVILYNYKGKKVSTQNALSASYGTVIRYVRFKIAKNQTYTVKVR TYLSVNGTTIYGKLSKARAFTTLTNIKLKYKYNYRTGKKNYTVVMPKAKNAKGIKNFTVY ISKDRSKGYKKVKTAKPGKNVKISKFKGKAIKTRQIYYIKIVPNLKCKVKSDVIYQY >gi|222441895|gb|ACEP01000047.1| GENE 50 57741 - 58877 1207 378 aa, chain - ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 375 1 374 379 319 45.0 5e-87 MIYFDNAATTLQKPETVAKAVAEAISNLGNAGRGAYDAALHSMRMAYETREMINEFFNGD GPEQVAFMSNATEALNTAIDGMLQPGDHVITTAMEHNSVLRPLYLKEQQGVALTILPVGE DGTISSEAFEEAIRPDTEAIVCTHASNVTGTALNLYRIGEACKRHGLHFIVDASQSAGIL PIDIKAMNIDALCFTGHKGLFGPQGTGGLCLKKGVTIRPLKSGGSGIHSFDKEQPHSMPE LLEAGTINAHGIAGLKAGLEYIQQEGLERLYTREALLTKTFYEGVRRIDGVKVYGNFDQE KRAPLVALNIRNEDAGKVSEILWEDYGIATRAGAHCAPLMHETLRTTEQGVVRFSFSSFN TMEEIKTAIEAIRELAAE >gi|222441895|gb|ACEP01000047.1| GENE 51 59061 - 59258 65 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026676|ref|ZP_03715868.1| ## NR: gi|225026676|ref|ZP_03715868.1| hypothetical protein EUBHAL_00928 [Eubacterium hallii DSM 3353] hypothetical protein 
EUBHAL_00928 [Eubacterium hallii DSM 3353] # 1 65 33 97 97 123 100.0 4e-27 MNFFYSMSRACLKNAFCKMYITICGRFRLNEGGVAGYGNRMKAKYTVNWGVQIAGMIFQT RSKDN >gi|222441895|gb|ACEP01000047.1| GENE 52 59579 - 60892 1034 437 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026678|ref|ZP_03715870.1| ## NR: gi|225026678|ref|ZP_03715870.1| hypothetical protein EUBHAL_00930 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00930 [Eubacterium hallii DSM 3353] # 15 437 1 423 423 820 100.0 0 MFSKNTSTTYKYFFMFVFFSLFLWQSFTIKPEASEIITDPLISMTVGDTKNLYFFHGDSA SAYFSSNPNIASVTTGGVLNANDVGSAEIIYSVHGVFHQRKINVADIENPSFSTTQRENL ILPDNALTTTDPVLFMQKKDSYTIQFSSSSQALATRYKGLLIWKSDKPNIVRVDSNGKVT ALKKGSATITCTLGNVSCHTYVNVITDSYTGKATDFSMLTATGNQRTYRLFKQNAHNYPR YDSYLAWHGCATCSLATVLGAYNDNYSGILPSSVIDGVEKQFTSNKDWTREHVNRSLRGQ MPLSLYGISSILKSSGVDNNYVRTYTDSEAKHDIISHLKTGNSIIFEVRQKNSRTGKRTK RWTNSYHTMVLLGVLTNGKVLLCDSVDRSWYNGGQRLKIVDLSDIMEYMFPCTSFSESMY YNGASSDGGYIKIYEIS >gi|222441895|gb|ACEP01000047.1| GENE 53 61064 - 61987 1147 307 aa, chain + ## HITS:1 COG:ECs4694 KEGG:ns NR:ns ## COG: ECs4694 COG0524 # Protein_GI_number: 15833948 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Escherichia coli O157:H7 # 2 303 4 305 309 183 38.0 4e-46 MAAKILMTGSIMMDLILQMPRIPHPSESLLGTSYSNAGGGKGSNSAVAACKAGGEVSVCG TLGKDSNGEALLGMLTDIGIDVSNLTLKEGANTGMAVIMLEEDGMNRLAIYTGANNDTTP EAVDAAFEGKEYDALMMQLEIPLETNVHAFELAHERGMITCLDAGPAQDYPLERFKGLTI LSPNETETEALVGILPKDDETCIAASKILKERSDCKYVVLKMGDKGSFIYGDDICQMVPT FKVEAVDPTAAGDCFTGVLVKQYAETKDIVASARYASAAAAISVQHLGAQPSLPTKEAID EFIAARS >gi|222441895|gb|ACEP01000047.1| GENE 54 62250 - 63572 1073 440 aa, chain + ## HITS:1 COG:no KEGG:Clole_0241 NR:ns ## KEGG: Clole_0241 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 275 439 1 202 202 83 27.0 2e-14 MRNDEKIDINLATEDTSENLSEEELAQQNYETALRYINIAEHMNKFEDQDKYYHRAIQYL KKAKPYKKVQPLLRELRNKKFGTRAAGKIELYKEACHIRDNAKTPSDYYSAQTIFSRIYH 
YEEKHPLIEKWTDPEVYAEAIKCSDSKEQMELCAKLADEKAAQLKRHSFFVSCAFIACLL AALFFTRTVSFKQCLASINSSSGNYEKAWQNYQNIYNRTNSKDAFKKYIEYRYKSAEKAL KAGDEDTAYRNYKAIAKEDYKDSQAKFVTLEKEHIKNTAIGKKVSFAYMDWRVLDKQDGK VLLLKDNSLGSTPFDETGKNVTWESSSVRKWLNSGFLNDNFFKAEQNAILDTTVKNTANP VYNTPAGKDTTDKLFLLSCDEVAQYKKGIHKTKSCWWLRTPGAAANSMSFVYKDKTVMEY GYEVTNTKITVKPAIWVTVE >gi|222441895|gb|ACEP01000047.1| GENE 55 63879 - 65549 2170 556 aa, chain - ## HITS:1 COG:CAC3201 KEGG:ns NR:ns ## COG: CAC3201 COG2759 # Protein_GI_number: 15896448 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Clostridium acetobutylicum # 1 556 1 556 556 703 63.0 0 MKTDIQIAQESTMLPIKDVAANYGITEDDLELYGKYKAKFSDEFMNSLDEKPDGKLVLMT AINPTPAGEGKTTTSIGLAQAFEKLGKKSVLALREPSLGPCFGIKGGAAGGGYSQVVPME DLNLHFTGDFHAITSANNLLAALLDNHIQQGNELRIDTRQVIWKRCMDMNDRALRNIVIG LGRKVDGYVREDHFVITVATEIMAILCLAEDIKDLKARLAKIIVAYDVDGNPVTAGQLNA VGSMAALLKDAIKPNMVQTLEHTGAIVHGGPFANIAHGCNSVRATKMALKLSDIAITEAG FGADLGAEKFMDIKCRMAGLKPDAVVLVATVRALKYNGGVPKTELVAENLDALKKGICNL EKHIENLQKFGVPIVVTLNAFETDTEKEHEFIRQFCEERGCEFALSEVWAKGGEGGIDLA KKVLKTLEEKESNFKVLYPDNLSLKEKVETIAKEIYGADGVTYEPAAEKMLAKLTDLGFG ALPVCMAKTQYSLSDDQTKLGRPEGFDIHVRDAFVNAGAGFVVILTGAVMTMPGLPKKPA AFGIDVDDNGKITGLF >gi|222441895|gb|ACEP01000047.1| GENE 56 65839 - 66288 384 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026684|ref|ZP_03715876.1| ## NR: gi|225026684|ref|ZP_03715876.1| hypothetical protein EUBHAL_00936 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00936 [Eubacterium hallii DSM 3353] # 1 149 1 149 149 254 100.0 1e-66 MKKKVLTRLALFCCLIVGLSVFSATQVQAHRIWRKYTKLPNGTYTIETDYYEYKSWQKDS VKLKLGKYFVLSSSVQKKGSGRVYTSDPLKLKLSSKCKFYSIDECTYVYKRISKKKAKRI LKNINGACYYTLDKFHVKNNKVDKVYLLW >gi|222441895|gb|ACEP01000047.1| GENE 57 66355 - 66417 67 20 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFKEKFIVERTCRKLVEAVE >gi|222441895|gb|ACEP01000047.1| GENE 58 66664 - 67308 422 214 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026685|ref|ZP_03715877.1| ## NR: 
gi|225026685|ref|ZP_03715877.1| hypothetical protein EUBHAL_00937 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00937 [Eubacterium hallii DSM 3353] # 1 214 1 214 214 320 100.0 6e-86 MEKIKISKQRILVATLTLCMLFTLFAPAANVNAASKRTKALTAYQKKLKKLDSKIYKFAL VYLDKDSIPELVITSKESIHAVYGEIYTYNNGKLKNLKYAGSDFGQFIYSPKKSVACNSS WINGYGAVSTFYRLSKTGKSTKLKRFDAIVNPKTSYKINNKKVSKKKYYAEYKKMEKKYP LKQISLKNAFNLTTKNIKNITKNYKSFVVTGAKY Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:11:30 2011 Seq name: gi|222441894|gb|ACEP01000048.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont53.1, whole genome shotgun sequence Length of sequence - 7826 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 1 - 640 197 ## COG0675 Transposase and inactivated derivatives 2 1 Op 2 . - CDS 672 - 1085 149 ## COG1943 Transposase and inactivated derivatives - Prom 1115 - 1174 6.6 - Term 1141 - 1180 -0.8 3 2 Op 1 . - CDS 1360 - 1746 228 ## gi|225026688|ref|ZP_03715880.1| hypothetical protein EUBHAL_00940 4 2 Op 2 1/0.000 - CDS 1739 - 3106 699 ## COG2199 FOG: GGDEF domain - Term 3774 - 3800 -0.6 5 2 Op 3 . - CDS 3943 - 5625 1808 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 5650 - 5709 6.3 6 3 Tu 1 . + CDS 6313 - 7605 1230 ## COG0038 Chloride channel protein EriC + Term 7754 - 7789 1.0 Predicted protein(s) >gi|222441894|gb|ACEP01000048.1| GENE 1 1 - 640 197 213 aa, chain - ## HITS:1 COG:all7245 KEGG:ns NR:ns ## COG: all7245 COG0675 # Protein_GI_number: 17233261 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 1 213 9 223 407 137 36.0 1e-32 MYPNDEQKILLAKTFGCVRMVYNHWLDRKIRQYEENKTNVTYTICAKEMAAMKKTEEYGF LKEADSIALQQSLRHLDTAFQNFFKQPKTGFPRFKSKKRNKNSYSTVCINGNITISNGYL KLPKIGQVRLKQHRIIPEEYRLKSVTVSQTPSGKYYASILFEYEDQVQEKELQKFLGLDF SMHELYRDSNGKESAYPRYYRNAEKKLAREQRR >gi|222441894|gb|ACEP01000048.1| GENE 2 672 - 1085 149 137 aa, chain - ## HITS:1 COG:alr8071 KEGG:ns NR:ns ## COG: alr8071 COG1943 # Protein_GI_number: 17227445 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 7 134 9 140 140 98 37.0 3e-21 MNLDNNTHSVFLLHYHLVLVVKYRSQVFDDGISSRAKEIFEYIAPNYNITLEEWKHDKDH IHILFRAHPNTEISKFINAYKSASSRLLKKEFPQIRQKLWKEHFWSQSFCLITTGDAPIE VIKKYIESQGQRDRKRK >gi|222441894|gb|ACEP01000048.1| GENE 3 1360 - 1746 228 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026688|ref|ZP_03715880.1| ## NR: gi|225026688|ref|ZP_03715880.1| hypothetical protein EUBHAL_00940 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00940 [Eubacterium hallii DSM 3353] # 1 128 1 128 128 245 100.0 1e-63 MNRNLSEAKEGIESKRQSGIDHGVYGAKYIYDLSGKYGPNGRIVGELLANVICYHHGGLP DAEDFNGDSPLLMRLQDENRLTDYEQVKVAFFKDLQITEQEIEQLFKKSVEEIKLLILKS STGSEASI >gi|222441894|gb|ACEP01000048.1| GENE 4 1739 - 3106 699 455 aa, chain - ## HITS:1 COG:BMEII1009_2 KEGG:ns NR:ns ## COG: BMEII1009_2 COG2199 # Protein_GI_number: 17989354 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Brucella melitensis # 296 455 2 162 166 95 34.0 2e-19 MKLKWRGLQFEHYILGILLAVEWIMSFTFLGYIHIPPISITTAHIPIVVIACLFGPVESS VAGLFFGLGSMYKASALYVMLGDRIFSPFQTDFPIGSILLSVGTRVLFGFLMGCIFKIVN KSRYKWAGKSLAALIATPLHALLVYGAMGILFPESGFNYKSSFRWGVNDYAIAAICILAV ILSDIIYHSSFVTHYKNVINDSELRQHWSVKMKMSLFIIIIFVFCIAAFSTIYFASRTEY MFGVYGVKVTKKIAQDILHLQVQFLAAMISLNFILIVIILMIYNYMKHREYMGEMDALTN VMGRRLFLNYCTKCQNDRDDIDNKSKRKGWFLFIDIDWFKQINDTLGHTVGDETLKQIAE SLKNIFSSYGAVGRVGGDEFAVIINEKMTKEQLEEQLKKFLLDISAILPERTVSCSIGVY HFGFPKSQKELLTRTDDALYKAKEKGRACYVILDE 
>gi|222441894|gb|ACEP01000048.1| GENE 5 3943 - 5625 1808 560 aa, chain - ## HITS:1 COG:CAC0273 KEGG:ns NR:ns ## COG: CAC0273 COG0119 # Protein_GI_number: 15893565 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 20 550 18 554 558 576 54.0 1e-164 MKYADKYKPMYFPAPAGYNKWMEKDHIEQPPIWCSVDLRDGNQALIIPMSLEEKIEFFKL LVDVGFKEIEVGFPAASETEYEFCRTLIERNLIPDDVTIQVLTQAREHIIKKTFEAIEGA KNVIVHLYNSTSVAQREQVFKKDKEHIIQIAVEGAELLQKLMEENGGKYRFEYSPESFTG TEMDFALEICNAVLDVWKPKKELPAIINLPATVSHSLPHVYASQIEFMSENLKYRDNVVL SVHPHNDRGTGVADTELAVLAGADRVEGTLFGNGERTGNVDIITLALNMYAHGVEPNLDF TDLPHIVKVYEKVTRMHVYERQPYAGKLVFAAFSGSHQDAIAKGMLWRETHDCDKWTVPY LPIDPKDLGREYEADVIRINSQSGKGGIGFLLEKTYGIRIPTKMREEVGYSVKDVSDHTH KELQPKEVLKVFNEKYVNICDPIELLEVHFKQEHGGISTEIILNKYGVHKVYHGKGNGRL DAVSNALKRHSNLDYTISTYEEHALKMGSNSQACAYVAIYGSDGNTYWGVGIDDDIINAS VKALISSVNRYERCEDGKED >gi|222441894|gb|ACEP01000048.1| GENE 6 6313 - 7605 1230 430 aa, chain + ## HITS:1 COG:BH0663 KEGG:ns NR:ns ## COG: BH0663 COG0038 # Protein_GI_number: 15613226 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Bacillus halodurans # 18 420 8 409 424 266 38.0 4e-71 MEEIIKRYLRKIYYNLKTCALWLFMAVLTGVAVGAFSSAFSFCLKKVTTLRGQYPFLLYF LPIGGVVIVFLYHLCGVRQDKGTNIIMTAVHGKNQDVPAYMAPLIFAATIITHLFGGSAG REGAALQMGGSIGNTIGRLFKFDEGDRKLLVMSGMSAAFSAVFGTPLAAAIFPMEMISVG IMHYAALVPCVFSSIVANRFAVNMGINPEAFTIHGIPELTFTSCAKIVLLGIFCAGLSVV FCIFLKGAGLFYNKFFKKPYMRIIVGGLLVIAITLLIGNYDYNGAGTDMIARAIAGNVPA YAFLIKMLLTALTLAAGYKGGEIVPAFFTGATFGCLFGHIFGISPSLCAATGMLALFCGV TNCPIASMLIGFELFGFEGVSYILIAISISYMLSGYHGLYKEQIIVYSKYHPKFINRLSG DERFDGQDYE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:11:45 2011 Seq name: gi|222441893|gb|ACEP01000049.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont54.1, whole genome shotgun sequence Length of sequence - 27644 bp Number of predicted genes - 21, with homology - 20 Number of transcription units 
- 12, operones - 5 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 108 - 968 733 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 1023 - 1082 5.6 + Prom 943 - 1002 5.8 2 2 Op 1 . + CDS 1217 - 1723 423 ## Shel_19200 protein of unknown function (DUF1836) + Term 1725 - 1780 3.1 + Prom 1728 - 1787 5.8 3 2 Op 2 . + CDS 1809 - 2600 691 ## COG1342 Predicted DNA-binding proteins + Term 2630 - 2665 1.1 + Prom 2729 - 2788 9.2 4 3 Op 1 . + CDS 2835 - 3296 418 ## EUBREC_0506 hypothetical protein 5 3 Op 2 . + CDS 3324 - 4004 523 ## Olsu_0661 hypothetical protein + Term 4160 - 4217 11.1 6 4 Tu 1 . + CDS 4449 - 6035 960 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Term 6257 - 6295 1.1 7 5 Tu 1 . - CDS 6032 - 6526 322 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 6724 - 6783 7.6 + Prom 6705 - 6764 7.1 8 6 Op 1 . + CDS 6799 - 8709 2040 ## COG0296 1,4-alpha-glucan branching enzyme 9 6 Op 2 21/0.000 + CDS 8672 - 9439 677 ## COG1354 Uncharacterized conserved protein 10 6 Op 3 . + CDS 9471 - 10043 654 ## COG1386 Predicted transcriptional regulator containing the HTH domain - Term 10033 - 10071 1.2 11 7 Tu 1 . - CDS 10074 - 11444 1265 ## COG0534 Na+-driven multidrug efflux pump - Prom 11630 - 11689 10.4 + Prom 11978 - 12037 6.3 12 8 Op 1 . + CDS 12057 - 12155 83 ## 13 8 Op 2 . + CDS 12227 - 12436 174 ## gi|225026711|ref|ZP_03715903.1| hypothetical protein EUBHAL_00963 + Prom 12519 - 12578 3.2 14 9 Op 1 . + CDS 12601 - 13416 881 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 15 9 Op 2 . + CDS 13406 - 15562 1570 ## COG0642 Signal transduction histidine kinase 16 9 Op 3 . + CDS 15601 - 17451 1292 ## EUBELI_01635 multiple sugar transport system substrate-binding protein 17 9 Op 4 10/0.000 + CDS 17457 - 20408 1886 ## COG0642 Signal transduction histidine kinase 18 9 Op 5 . 
+ CDS 20471 - 23356 2190 ## COG0642 Signal transduction histidine kinase + Term 23365 - 23402 0.0 + Prom 23364 - 23423 5.2 19 10 Tu 1 . + CDS 23570 - 24346 778 ## COG0428 Predicted divalent heavy-metal cations transporter - Term 24237 - 24268 -0.7 20 11 Tu 1 . - CDS 24392 - 24514 87 ## gi|225026718|ref|ZP_03715910.1| hypothetical protein EUBHAL_00970 - Prom 24561 - 24620 9.3 + Prom 24673 - 24732 5.0 21 12 Tu 1 . + CDS 24965 - 27556 1861 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 Predicted protein(s) >gi|222441893|gb|ACEP01000049.1| GENE 1 108 - 968 733 286 aa, chain - ## HITS:1 COG:SP0742 KEGG:ns NR:ns ## COG: SP0742 COG1307 # Protein_GI_number: 15900637 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 275 1 279 281 214 40.0 1e-55 MKMKIVADSSANLLALPDIDFGVVPLHIIVGENDYIDDDKIDLTAMQENLSSYKGKTSTS SPSPHEWENAFGDADVVFCITITSSLSGTYNSAVIAKDIYEQNHTGRKVYVLDSLSTGPE IALFIEKIAELIKAGFAPDDIYKTLLEYKDKTHLYFSLASLNNFAKNGRISPVIAAGIGL LGVRVVGKASDEGTLLPLDKCRGEKRALKKLIDHLKECGYTDGKIIIAHNNNEAGALELK KLIENTFGIFNGTIQKTRALCSYYAEPGSLLIGFEASLCNHKPYKD >gi|222441893|gb|ACEP01000049.1| GENE 2 1217 - 1723 423 168 aa, chain + ## HITS:1 COG:no KEGG:Shel_19200 NR:ns ## KEGG: Shel_19200 # Name: not_defined # Def: protein of unknown function (DUF1836) # Organism: S.heliotrinireducens # Pathway: not_defined # 6 158 8 162 172 179 52.0 6e-44 MRFSARDYRMPRYKEIPNVGLYLEQTVKYINECLAPIEISITPSMLSNYVKKGYIDRPIK KQYYADQIAYTIYIVIIKQVLSMDNIASLFVLQKATYTLPVAYDYFCSELEKTLVAIFEN KEEPMKDMGDIPFEKKMLHSVIVAVSHIIYLNDCFKNNLQDNASHSIK >gi|222441893|gb|ACEP01000049.1| GENE 3 1809 - 2600 691 263 aa, chain + ## HITS:1 COG:CAC3166 KEGG:ns NR:ns ## COG: CAC3166 COG1342 # Protein_GI_number: 15896414 # Func_class: R General function prediction only # Function: Predicted DNA-binding proteins # Organism: Clostridium acetobutylicum # 1 109 1 106 143 71 40.0 1e-12 MPRPIKCRKVCHFPDILEFRPSNKNGRRGEEDEKEVILLTVDEYETIRLIDKEGYSQEQC 
AGFMQIARTTVQRIYEIARKKVADAIIDGHPLRIEGGDFRICDGKSSKCGLGGCYKQEFY QKYATEKGEGIMRVAVTYENGQIFQHFGHTAEFKVYDVQDGKVIHSEVVDTNGSGHGALA GVLNALKADVLICGGIGGGAQMALSAAGIKLYGGVSGDADAAVEAFMNGTLDYNPDVKCS HHDHDHGEGHTCGEHGCGSHSCH >gi|222441893|gb|ACEP01000049.1| GENE 4 2835 - 3296 418 153 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0506 NR:ns ## KEGG: EUBREC_0506 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 141 3 142 146 84 34.0 2e-15 MFFWDQHKTITSCYELFTRKVCDRYQLTQMEYDILMFLHNNPKHNTAADIVKIRKSTKSH VSTSLKALENKGLIERIQSARNKKHIEIVILNKAESIIEDGMKAQKEFAKNVLDGLTEEE KYMCMAVFNKICANADAYLKKYAEGENEKSFLL >gi|222441893|gb|ACEP01000049.1| GENE 5 3324 - 4004 523 226 aa, chain + ## HITS:1 COG:no KEGG:Olsu_0661 NR:ns ## KEGG: Olsu_0661 # Name: not_defined # Def: hypothetical protein # Organism: O.uli # Pathway: not_defined # 8 225 2 219 222 218 50.0 1e-55 MREEKYQPKMPDIMEAIFDAGYLIFDLVAAILFFTYAKGNTLFILYGILTLTLCGGDAFH LVPRIIRAARGTNDRIKKQLGIGLQISSITMTVFYIILMYVWKDTFPDFNIPAAVKAMVW ISAIIRIAVCLLPQNNWCTEDGNLKLSIIRNAVFAVTGIGVIILYAISGNANGYHMTRMV AAIIISFGCYLPVTLFSKTKPKVGLLMIPKTCAYMWIIAMGLQLLF >gi|222441893|gb|ACEP01000049.1| GENE 6 4449 - 6035 960 528 aa, chain + ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 7 304 11 301 301 197 39.0 5e-50 MNNKAVLYLRLSKEDRNKVNKGDDSESIINQRIMLSDYAMRHNFQVVKIYSDDDESGLYE DRPGFSQMMQDAERGLFDIIIAKSQSRFTRNLEHSEKYLHHKLPLWGVRFIGVVDGVDTM DENNKMTRQVKGLTNEWYCETLSKNVRSVFLSKMNQGKFVGSSCPYGYMRDSEDHNHLII DEYAANIVRKIFRLYLEGNGKAHIASILSSENILIPSLYKTRVLQQNYYNSKLKDTTKTW CYQTVHMILNNQTYLGKMVQHKDIKISYKDKKKRRLPKEQWIVVDNTHEPIIDQETFDRV QAIQKIKRKSVNTEYSDNIFAGILFCADCKHAMNRAYAKENKKGSIGYICKIYKTMGKKH CPSHMIKNEELEEAVLSSIKNEARKILTEDDISDLDKVQKIDSREQFYKKQIQIVDAELE KYQKFKKRAFTGYMEEMITPQEYRSYVAEYEEEISKLERQKEEILTKLADENSLQNQHDE WVEAFKDYINIEKLTRDVVIELIEKIEVHEDGGLDIYYRFRNPFQVAN 
>gi|222441893|gb|ACEP01000049.1| GENE 7 6032 - 6526 322 164 aa, chain - ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 53 160 24 131 387 110 54.0 1e-24 MENSLNPLSDNSGINELSNRLKNITTSNNISVQAATDSTSNVSYQTNSESSSASSANTGF QSAAPNAILMEATTGTILYEKDADKEKPPASVTKVMTLLLIFEALSKKQITLNDTVTVSE HAASMGGSQVFLEPGETQSVDTMIKCIAVSSANDACVAYSHSRN >gi|222441893|gb|ACEP01000049.1| GENE 8 6799 - 8709 2040 636 aa, chain + ## HITS:1 COG:FN0856 KEGG:ns NR:ns ## COG: FN0856 COG0296 # Protein_GI_number: 19704191 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Fusobacterium nucleatum # 1 603 8 608 611 472 42.0 1e-133 MNLHEFYIGKAFDAYEYFGAHLTKNGVLFRVYAPNAEKIEVIGEFNDWDGIEDEMIQDAQ SGVFECVQKDAKPGMMYKYRIYQPDGIVMDRFDPYSFGSQVRPDTASVIVDLEDYHFSDE AWMKKRSKNYDLPVNIYEVHAGSWRKNEADEENGWYHYQELGKQLIPYVKEMGYTHVEFL PLTEHPADCSWGYQASGYFCPTSRYGTAKELMEMVDLFHKEGIGVIMDFVPVHFIADTYA LNKFDGTALYEYADDDKGYSEWGTCNFNYYRGEVRSFLQSSANFWMEKFHFDGIRMDAIS NAIYWQGDSSKGINEGALDFIKCMNSGLQKLHPSVMLIAEDSTNFPKVTAPVEYDGLGFD YKWDMGWMNDTLDYFRIHPYDRKFHYHKLSFSMMYFYNEHYLLPFSHDEVVHGKATILQK MWGEYDLKFQQAKTLYTYMYTHPGKKLNFMGNEIGQFREWDEEKECDWMLLDYPKHQDFH RFMKDLNHLYISEPAFFNGDYNSANFRWIEVEAVEERVYIYERMCGENHFIVVLNMSDAQ YNDFEFGYDHYAVLHEVLSGERKEYGGYFEGPKEDIVAKTTGYKWWKRKFKITIPALAAI IYKVEFLPEPKEIPLPMLDEASIDIDRKRLRAEHEN >gi|222441893|gb|ACEP01000049.1| GENE 9 8672 - 9439 677 255 aa, chain + ## HITS:1 COG:CAC2061 KEGG:ns NR:ns ## COG: CAC2061 COG1354 # Protein_GI_number: 15895331 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 10 252 1 243 249 156 45.0 4e-38 MTERDLEQSMKIDIKLDVFEGPLDLLLHLIEKNKVSIYDIPIVEITNQYMEYIREMEKTY SMESMSEFLVMAATLLKIKSKMLLPQPEKEEEEDPREELVRRLTEYKMYKYAAEELKDLS VDAQKVFFKPETVPEEIKYYEEPIRPEEIVGDITLEKLNQIFRMVMRRKKDREDPVRSHF 
GKIQKEKYKVEDRMDDIRRQIRGLKKINFRTLLDIQPVKEMVIVTFLAVLELMKVGEIKV SQEHNFAEIYLDSCE >gi|222441893|gb|ACEP01000049.1| GENE 10 9471 - 10043 654 190 aa, chain + ## HITS:1 COG:CAC2060 KEGG:ns NR:ns ## COG: CAC2060 COG1386 # Protein_GI_number: 15895330 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Clostridium acetobutylicum # 1 174 14 187 202 128 40.0 7e-30 MKDKERIKGIIEAILFTMGESVAISRLSEVLDIEEDTLKELLEEMKGDYAGKDRGICLIE LDGAYQICTKIETYEYVKKLVSQPKKRSLTDVMLETLSIVAYKQPVTKQEIEAIRGVKSD FAVNKLLEYRLIKELGRLDTIGHPIVFGTTEEFLRCFGVSSIEELPDIDEVQKQNFMEEA MEEVAASITV >gi|222441893|gb|ACEP01000049.1| GENE 11 10074 - 11444 1265 456 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 446 1 444 457 275 36.0 1e-73 MQDNKNFLAAEPIGKLLLKLSIPTVIAQLINMLYNIVDRIYIGHIPSDGSLALTGVGVCM PIIMIVTAFAALVSSGGAPRASIYMGKQDNESAENILGNCFLLQIVISLILTAILLIWSK DLLLAFGASENTISYATDYMHIYAFGTLFVQLTLGMNAFITAQGFTTISMVSVLIGAICN ITLDPVFIFAFHMGVKGAALATVISQAISTIWVVSFLCGEKTHLHLRKKYMRLEPKVSVP CVTLGLAAFIMQASESIVTVCFNSSLLRYGGDIAVGAMTILTSVMQFAMLPLQGIAQGAQ PISSYNYGAKNTNRVKKTFRLLLMTCLSYSALLWAAVQLIPRVFVSIFTADASLINFTAP MLRIYLGGLFLFGIQIACQMTFTSLGKAVNSIVVAVVRKFVLLLPLIYIMPHVVSNPTTG VYMAEPIADIIAVLFTSVLFSFQFKKALAEIQNENN >gi|222441893|gb|ACEP01000049.1| GENE 12 12057 - 12155 83 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIKKLQSVCLILACVAVTGLIGCGETDKRPV >gi|222441893|gb|ACEP01000049.1| GENE 13 12227 - 12436 174 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026711|ref|ZP_03715903.1| ## NR: gi|225026711|ref|ZP_03715903.1| hypothetical protein EUBHAL_00963 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00963 [Eubacterium hallii DSM 3353] # 1 69 1 69 69 123 100.0 4e-27 MKYLHNSSFEKTKAELEEKGCSYSNAVLIGITSLIRYDAGDKGKVIEYADKIDTSSQHLL GIINDVLDI >gi|222441893|gb|ACEP01000049.1| GENE 14 12601 - 13416 881 271 aa, 
chain + ## HITS:1 COG:SPy1274 KEGG:ns NR:ns ## COG: SPy1274 COG0834 # Protein_GI_number: 15675229 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Streptococcus pyogenes M1 GAS # 4 259 5 269 278 75 26.0 8e-14 MKGKRVIAGILLAGILAVTLAGCKNTDNTKKETEKPVITLGSDNYPPYNYLNEDGEPTGI DVELATEAFKRMGYQVDVVQINWEKKKELVERGEIDCIMGCFSMEGRLDDYRWAGPYIAS RQVVAVNKSSDIYKLSDLEGKNLAVQSTTKPEGIFLKRTDERIPKLGNLISLGHRELIYT FLGKGYVDAVAAHEESVVQYMKDYDASFRILEEPLMLTGIGVAFAKNDDRGICEQMEQTL EEMRQDGTSLKIVEKYLDDPEKYLEVDDLGY >gi|222441893|gb|ACEP01000049.1| GENE 15 13406 - 15562 1570 718 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 327 566 69 304 328 165 38.0 3e-40 MDIKRKKKEKRIQLLGGLIGICVAVVSLFYFFHVEKAEAEKRMVEIVNYVKVQCSTYTHY NESSESKSLLRAIESARQMSTNINMETENGRQLSRDFLKENLQTLWVNGIIVLDTEGKID CEYSTDESLANEITEYLQKEIIMDFVGYEERSYSERINRKDGSHIDIAACARKDAQGIVA VYYYTPPKFARNYTLTIQSLLNGYSTQKDGTIIVADEGIIVASNDESLLGQNTADNEVVK AMKRHTDSQHIYHLKNEGTGCYGIMLKQRDYYIYAYLSDTEVFHNLPLNVISVIFLYFLM FSIFWFWTYRTNQSHQKQEQEKDEKYKTELLIAAKKAEAANEAKTEFLQRMSHDIRTPIN GICGMVNMADHYADDMEKQKEYRTKVKEASNLLLELVNDVLDMSKLESGEIVLEEIPFNL SSIYREVFVVIEQVAAEQNLQIVWEKKEITHRDLIGSPRYVKRVMMNILSNAMKYNRENG HIYISCIEIPSGQPETTTMEFVCRDTGIGMAEEFQKHIFEPFAQEHAGSRTRFSGTGLGM PISKKLIEKMGGTITFESAEGIGTTFVIRVPFKIDLDVDIREEQADVSEKSIKGLHILLA EDNELNMEIAEFVLQNEGAEVSKAWNGEETVELFRKSESGEFDAILMDIMMPVMNGYEAT KMIRSLDREDAKVIPIIAMTANAFTEDRLRAKEAGMNEHIAKPFDAKLLVKIIHKLVH >gi|222441893|gb|ACEP01000049.1| GENE 16 15601 - 17451 1292 616 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01635 NR:ns ## KEGG: EUBELI_01635 # Name: not_defined # Def: multiple sugar transport system substrate-binding protein # Organism: E.eligens # Pathway: not_defined # 33 341 7 320 326 155 32.0 4e-36 
MRKKKWNRVLAVLLMMVMSISLLSGCGSKSAEKEDAETITVYLWSTNLYEKYAPYIQEQL PDINVEFVVGNNDLDFYKFLNENGGLPDIITCCRFSLHDASPLKDNLMDLSTTNVAGAVY DTYLSNFMNEDGSVNWLPVCADAHGFVVNKDLFEKYDIPLPTDYESFVSACQAFDKVGIR GFTADYYYDYTCMETLQGLSASELSSVDGRKWRTTYSDPDNTKREGLDNTVWPKAFGRME QFIQDTGLSQDDLDMNYDDIVEMYQSGKLAMYFGTSAGVKMFQDQGINTTFLPFFQENGE KWIMTTPYFQVALNSNLTKDETRRKKAMKVLDTMLSADAQNRIVYDGQDLLSYSQDVDLQ LTEYLKDVKPVIEENHMYIRIASNDFFSVSKDVVSKMISGEYDAGQAYESFNSQLLEEDS SSEDIVLDSQKSYSNRFHSSGGNAAYSVMANTLRGIYGSDVLIATGNSFTGNVLKAGYTE KMAGDMIMPNELSAYSSKMSGAELKEAVKNFVEGYEGGFTPFNRGSLPVLSGISVEVKET DDDYTLSKVTKDGKQIQDNDTFTVICLAIPKHMEAYPADDNIVFDGGNTSVDDTWIGYIS DGDAVLAEPEDYMTLR >gi|222441893|gb|ACEP01000049.1| GENE 17 17457 - 20408 1886 983 aa, chain + ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 470 747 1 276 385 191 41.0 9e-48 MGIKNHNTKREKTVMRKRLRIAVFIAVSLGIVLIVFRYFGFVSKTIYEESVSHLTEVFHQ SDNMLRELTDKNLNYLHMWGENLQDISNEDEIRNYIKKAQEDAGFLDFFFLSADGNYKMA TGETGYLGLQENIEDEIRQGNDVIANAAVPGKSQLLVFATAKAHGIYHGFEYDAIAIAYE NSDIVDVLDISAFDGNAQSFVVHPDGRVVVDHSSEAWGNVYNFFGVLREHSDMSEKEINE LSEKFKARRTDAMLLNLDGRNYYLVYEKSDIQDWMFLGLVQADIVNASMNSLQRSTILLV GAVVLCIAALLISFIIQKGRINLRKKDTEILYRDELFQKLSMNVDDVFLMLDAKTYQADY VSPNAEKLLGITVEQIRKDICILGKLHPEEQEDSEKNYMEEIQVHEQREWDFEYVHLKTG EKRWFHNIVMGSEVNGKRKYILVMSDRTSDWKMNQALSEAVHAAETANQAKSTFLSNMSH DIRTPMNAIIGFTTLAVSNIDNQERVQDYLGKILSSSNHLLSLINDILDMSRIESGKIHL EETEVSLSDVLHDLKTIISGQIHAKQLELYMDVMDVTNEDVYCDKTRLNQVLLNLLSNAV KFTPAGGTVSVRLKQYPKAVKGSKLYEIRVKDTGIGMSQEFVQKIFSPFERERTSTVSRT QGTGLGMAITKNIVDMMGGTIEVQTEQGKGTEFIVCLPLRIQSECQRIEKIAELEGLKAL VVDDDFNTCDSVTKMLVRVGMRSEWTLSGKEAILRARQSMELGDAFHAYIIDWRLPDMNG IEVTRQIRSLGDDTPIIILTAYDWSDIEAEARAAGVTAFCAKPLFMSDIRDTLMTAIGQS QAEVEDSILPEAGSDFRGRCILLVEDNELNSEIAVEILKEYGFLVDTAENGAEAVEKVKN STPGKYDLVLMDVQMPVMNGYEATRQIRALDDPALSGITILAMTANAFDEDRKKALEYGM DGFLSKPIVIEELISTLHNNLLD >gi|222441893|gb|ACEP01000049.1| GENE 
18 20471 - 23356 2190 961 aa, chain + ## HITS:1 COG:PA3462_2 KEGG:ns NR:ns ## COG: PA3462_2 COG0642 # Protein_GI_number: 15598658 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 571 855 1 274 385 171 35.0 5e-42 MDKSRKKSAVLFVCILLVLGMWMMLSVHSQAAETNNDEKQAQTIRVGSFEDTFNYVDKNG VRRGYGYELMQALAGYTEWKFEYVKCDWSDCFDKLENGEIDIMGDISYSDERAQKMLFSD EPMGEEKYILYADLSNMDIGMSDFKFMDGKRVGVLMDTEPEIMLTEWENKNGIHTEHVNV NNDNDVEKKLANHEIDAFVSLEESIWSEQGISSVTTIGKSGIYFAINKERSDIKTELDYA MCQLDQDSPFFKADMYKKYFTLDYNQSLTGGEKSWLEEHGDIRMGFLNNDPAIFSMDETT GKLTGMLPEYVSYAKDCLGNQTLKFNIQDYDDYDEMLQALQNHEIDMIFYAGRNPDIAEK KGYALTNTAWTYNLMAVTDEKNFDEGNVYTVAVPKEKEALKQQLTFSYPQWNLVDYDSFE EAAEMITNEKADCFLMGASQAMVYDNNRDFKSVPLTKTMEACFAVKGGEETLLSILNKTL KGMPSGMLTSALAIYDSTADKVTFLDFVKDNMLAFFLATGFSALSIIVIILVLLRKARKA EAAAKLAANDTQKLNEKLEIALKKAEDASLAKTSFLNNMSHDIRTPMNVILGYAQLMENE LNGKDIPEVLEHLEKLQQSGNLLLSIINNVLDMARIESGRMEIDENYCRIEDVWKSLFAV FDEKARKKNISLHYTMNVEHEHVLTDVTKVKEILVNILSNAIKYTPAGGSVMVYVDELPC DESGYMIVRIRISDTGIGMSQDYLTKIFEAFTREKNTTKSKIAGTGLGMSIVKNYVDLLG GTIDVESELGKGSTFTVTLKHRIADERYYVKKHIEESGTGNEILEGRNILLAEDNDLNAE IAEAILERAGLRIERVENGIQCVNRILKMPAGTYDMIFMDIQMPQMDGYKATQTIRNLPD KDKACIPIIAMTANAFAEDKKKTMEAGMNAHLSKPLNVPELMDTIRKFCDGKQMCQRPGK E >gi|222441893|gb|ACEP01000049.1| GENE 19 23570 - 24346 778 258 aa, chain + ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 14 258 25 269 269 209 51.0 3e-54 MNTFFGILIPFVGTTLGAACVFFMRTTFSKSVQRALTGFAAGVMVAASIWSLLIPAMKQS EKMGDLSFVPAVAGFWIGILFLLTLDHLIPHLHVGSDQAEGPKSRLGRTTMMVLAVTLHN IPEGMAVGVMYAGFLAGNAQITATSALALSLGIAIQNFPEGAIISMPLRAEGESKGKAFL GGVLSGVVEPIGAVLTIIAAQLIIPALPYLLSFAAGAMLYVVVEELIPEMSQGEHSDIGT VFFAVGFSLMMILDVALG >gi|222441893|gb|ACEP01000049.1| GENE 20 24392 - 24514 87 40 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|225026718|ref|ZP_03715910.1| ## NR: gi|225026718|ref|ZP_03715910.1| hypothetical protein EUBHAL_00970 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00970 [Eubacterium hallii DSM 3353] # 1 40 1 40 40 74 100.0 2e-12 MIKMTSDIKEIPVFASMWNMDGLILIRFCKVDYENLRNPI >gi|222441893|gb|ACEP01000049.1| GENE 21 24965 - 27556 1861 863 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 862 1 814 815 721 43 0.0 MNINKFTQKSLEAVNNCEKLAYQYGSPEIDQEHFLYSLLTIEDSLILKLIEKMEVNKEAF LSQVQQAVEKKPKVSGGQTYISKSLNQVLVSAEDEAKAMGDEYVSVEHLFLSLMKYPNTE IKKLFQAYGITRERFLQALSTVRGNQKVVSDNPEATYETLEKYGYDLVERAKQQKLDPVI GRDSEIRNVVRILSRKTKNNPVLIGEPGVGKTAVVEGLAQRIVKGDVPDSLKDKKLFALD MGSLVAGAKYRGEFEERLKAVLEEIKASDGQILLFIDELHTIVGAGKTDGAMDAGNLLKP MLARGELHCIGATTLNEYRQYIEKDAALERRFQPVMVDEPTVEDTISILRGLKDRYEVYH GVKIADSALVSAAVLSNRYITDRFLPDKAIDLVDEACAMIKTELDSLPADLDEVQRKIMQ LEIEEAALKKETDRLSVERLENLQKELAELREEFKSKKASWDQEKSAVERVSKLKEEIES LNNEIQIAQRNYDLNKAAELQYGKLPELQRQLEEAEESAKKRESSMVHESVTDDEIATII SRWTGIPVAKLTESERNKTLNLDKELHKRVVGQDEGVEKVTEAIIRSKAGIKDPTKPIGS FMFLGPTGVGKTELAKALAASLFDNEQNMVRIDMSEYMEKYSVSRLIGAPPGYVGYEEGG QLTEAVRRKPYSVVLFDEIEKAHPDVFNVLLQVLDDGHITDSQGRTVDFKNTILIMTSNI GSQYLLEGIDENGNIKTDAETMVMNDLRSHFRPEFLNRLDEIIMFKPLTKDNIGGIIELM LADVNKRLEDKELSIHLTDAAKSYVIEHGYEPAYGARPLKRYLTKHVDTLAARMILSGEV YPQDTIVIDEQGGELIASVEHRQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:12:20 2011 Seq name: gi|222441892|gb|ACEP01000050.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont55.1, whole genome shotgun sequence Length of sequence - 2844 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 136 - 195 6.1 1 1 Tu 1 . 
+ CDS 292 - 2244 2715 ## COG0441 Threonyl-tRNA synthetase Predicted protein(s) >gi|222441892|gb|ACEP01000050.1| GENE 1 292 - 2244 2715 650 aa, chain + ## HITS:1 COG:BS_thrS KEGG:ns NR:ns ## COG: BS_thrS COG0441 # Protein_GI_number: 16079947 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Bacillus subtilis # 3 640 7 642 643 581 47.0 1e-165 MIIRLKDGSEKEYAQPMSVLDIAKDISEGLARNACAGQVNGETVDLRTVVSEDSDLNILT FNDEEGKLAFRHTASHVLAQAVKRLYPEAKLAIGPAIEDGFYYDFDMAPLTREDLDAIEK EMKKVIKENPPIERYELPREEAIAYMKEKDEPYKVELIEDLPEDATISFYKQGDFTDLCA GPHLMSVKPIKAFKLTASSGAYWRGNENNKMLTRIYGTAYTKKADLEERLKYLEEIKLRD HNRLGREMELFTTVDVIGQGLPLLLPKGAKIIQTLQRWIEDLEDNEWGYVRTKTPLMAKS DLYKISGHWDHYKEGMFVLGDEENDKEVLALRPMTCPFQYYCYKNTQKSYRDLPYRMSET STLFRNEDSGEMHGLTRVRQFTISEGHLIIRPDQTNDELKGCLHLAQYCLGVLGVQDDVT YRLSKWDPNNKEKYLGDDEYWETTQDAIRNILVDQGVPFVEAEGEAAFYGPKIDIQAKNV YGKEDTMITIQLDCAIAENFDMYYIDQNGDKQRPYVIHRTSMGCYERTLAWLIEKYAGKF PTWLCPEQVRVLPISEKFADYAEEVNKELKRNGILSTVDNRSEKIGYKIREARLQKLPYM LVVGAQEQETGKVSVRSRFAGDEGQKDLKDFISAICEEIRTKEIRQEVEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:12:25 2011 Seq name: gi|222441891|gb|ACEP01000051.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont56.1, whole genome shotgun sequence Length of sequence - 19784 bp Number of predicted genes - 17, with homology - 16 Number of transcription units - 6, operones - 3 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 1192 840 ## EF2320 TraE protein, putative 2 1 Op 2 . + CDS 1198 - 3534 1649 ## ELI_0235 antigen-like protein 3 1 Op 3 . + CDS 3576 - 3860 92 ## gi|225026723|ref|ZP_03715915.1| hypothetical protein EUBHAL_00975 + Prom 3938 - 3997 3.5 4 2 Tu 1 . + CDS 4062 - 4364 382 ## gi|225026724|ref|ZP_03715916.1| hypothetical protein EUBHAL_00976 + Term 4400 - 4458 7.7 + Prom 4582 - 4641 4.3 5 3 Op 1 . + CDS 4671 - 4958 288 ## gi|225026725|ref|ZP_03715917.1| hypothetical protein EUBHAL_00977 6 3 Op 2 . 
+ CDS 4983 - 5582 468 ## Ethha_1703 helix-turn-helix domain protein 7 3 Op 3 . + CDS 5587 - 5877 383 ## gi|225026727|ref|ZP_03715919.1| hypothetical protein EUBHAL_00979 8 4 Op 1 . + CDS 6015 - 6230 218 ## gi|225026728|ref|ZP_03715920.1| hypothetical protein EUBHAL_00980 9 4 Op 2 . + CDS 6217 - 7605 953 ## EUBELI_01797 hypothetical protein 10 4 Op 3 . + CDS 7614 - 8492 775 ## gi|225026730|ref|ZP_03715922.1| hypothetical protein EUBHAL_00982 11 4 Op 4 . + CDS 8489 - 10471 1233 ## CD1105 putative DNA primase 12 4 Op 5 . + CDS 10498 - 11055 456 ## Sgly_1452 RNA polymerase, sigma-24 subunit, ECF subfamily 13 4 Op 6 . + CDS 11045 - 11704 484 ## Sgly_1453 hypothetical protein 14 4 Op 7 . + CDS 11767 - 12192 194 ## gi|225026734|ref|ZP_03715926.1| hypothetical protein EUBHAL_00986 15 4 Op 8 . + CDS 12214 - 12360 176 ## 16 5 Tu 1 . + CDS 12498 - 19388 5207 ## COG3209 Rhs family protein + Prom 19407 - 19466 8.9 17 6 Tu 1 . + CDS 19581 - 19782 110 ## gi|225026737|ref|ZP_03715929.1| hypothetical protein EUBHAL_00989 Predicted protein(s) >gi|222441891|gb|ACEP01000051.1| GENE 1 2 - 1192 840 396 aa, chain + ## HITS:1 COG:no KEGG:EF2320 NR:ns ## KEGG: EF2320 # Name: not_defined # Def: TraE protein, putative # Organism: E.faecalis # Pathway: not_defined # 1 320 518 825 825 103 23.0 1e-20 PQNEFKEIVEKYGGSYFDLTPKSKIYINGFEVSDAVFYADKDTKEKFIATQTKYAKSLVA AIMTNILFTQEHATVVSRATRKMFDKIFAQKKLKKQATLRMLREEIKLELENAKDDYDRS IIKPIYNSLEEYTDGSCDMLAYPSTVRLDNRMIGFGLKNVPPDNWEPVMVTVMHYTSTRM EYNQEAQRATHFIVDETQVVSRKGTSAEQLNTAVATFRKYGGICTMAMQNITAALENKML KELFSNCNYKCFLDQGGADANALAQIQELSQTEFNALSSEEVGRGVMVWGKKVVLFDAKI SKENELYPLINTNFHEKAKEAEKRKQRQRQEQQESNDRIEEIILQMAELAPVTVRDLLSV LEITQEEAEKQLNWLCQQGYLVKTEEAGNIRYQKAG >gi|222441891|gb|ACEP01000051.1| GENE 2 1198 - 3534 1649 778 aa, chain + ## HITS:1 COG:no KEGG:ELI_0235 NR:ns ## KEGG: ELI_0235 # Name: not_defined # Def: antigen-like protein # Organism: E.limosum # Pathway: not_defined # 309 454 473 615 615 114 45.0 2e-23 
MAVKKAKDASFKRMDSAYASSKSAIQKVKNADPLNAKEEQEQEEGNASVDPANYLSKKEK KEWKRLSYTAKQRYIRQAERQLKRQKKRNGIAKTEQSSVQEKEQQAAEIFLKERKYKSNT WKEPVDKKEIGSHDGNGEIASKSGDSRAAIRSQSTVAGGEKTADIASSASAETTKATGSA AAKTADTAAASTGAGLAALVAKKTAENFKEMLQQKNTAIESQRIKTQTKLSQLIEENKTG AGSPGVAAAAMLASVVVTVMYFAVQIAVTIFFVLLILLIPIIVIVSFLGIIITLLSGFAA AVDNTDYASGSGASIVAVAVQEIGYHEGSGNHTKYGVYTGTDGMSWCHAFVSWCANECGF IESNIIPKTAACETGRQWFIHKQQYKKAGSYTPQAGDIIYFDKGGVGESHHVGIVEYVEN GIVHTIEGNKNSQVMRGHYELTYKGIMGYGTPDYPDEGLTSSTASEILSKAQELGKMMVE EKWVYSNTDLKGSLAAAEQSAKRTNCAHGVCLVLQEIGLLKKGQIFFGNKSGELSCNGTV RKQIEKGFDIIKTGGKKSKDLDLKPGDICLFKGHTNIYAGKDENGKTYWYDFGRDQTKDK KPGSVFVKSVRKGEINWAVITVILRVKDQDGYGSGKTINIPSPYGDSFTYMGWSLITATG SNQYKLRVKTGEHYDANGFGKIGDRYVIACTPTFGKIGDEIDFVLANGRVIHGVMGDEKN MSDAGCNKWGHDNGHSVVEFVVNKSMWYHTGKTVTKFHPEWAKSRVVKAINLGKNHLK >gi|222441891|gb|ACEP01000051.1| GENE 3 3576 - 3860 92 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026723|ref|ZP_03715915.1| ## NR: gi|225026723|ref|ZP_03715915.1| hypothetical protein EUBHAL_00975 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00975 [Eubacterium hallii DSM 3353] # 1 94 1 94 94 129 100.0 6e-29 MKKKIVIFLLAVFLVFALLVSALCFFAPQKNQNSRENTTEEQKTEDAAEQTEKEKREQYH FTNLEYIYYINSENETDFEEQAAEYVEKNYKNVQ >gi|222441891|gb|ACEP01000051.1| GENE 4 4062 - 4364 382 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026724|ref|ZP_03715916.1| ## NR: gi|225026724|ref|ZP_03715916.1| hypothetical protein EUBHAL_00976 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00976 [Eubacterium hallii DSM 3353] # 1 100 8 107 107 159 100.0 8e-38 MQDDEEEYNEIGIEPINLGEPNIVDQDDSLGAVADEKELSQELLKFLEKEGEERRNFYAV SVESTGEKDYKAVLDFSTKRSDGKNISITFENGIYTFAFK >gi|222441891|gb|ACEP01000051.1| GENE 5 4671 - 4958 288 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026725|ref|ZP_03715917.1| ## NR: gi|225026725|ref|ZP_03715917.1| hypothetical protein EUBHAL_00977 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00977 [Eubacterium hallii DSM 3353] # 1 95 1 95 95 138 100.0 2e-31 
MARQKHFRLPDDVLERINHRDKEMYPTENVYITEAVRNFFKKEETENVMDKLEDIQDKLD RIEKKLEEDAGKKYSSFSSDAEKKELYGKPDFLKY >gi|222441891|gb|ACEP01000051.1| GENE 6 4983 - 5582 468 199 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1703 NR:ns ## KEGG: Ethha_1703 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: E.harbinense # Pathway: not_defined # 20 171 2 154 169 79 32.0 9e-14 MDWIKQIMEYKKGNWESPGSLGQKIKKIRMLRRYTQKELGVMCGFSLSSADVRIAQYEKN KKIPREKILKAMCNALGVGEGALVHADMLSYQEMFYALFDMEDFHGLHPVKKEDGYYLRF DSDSIINDFLEEWYKMYQKYPRPRGDDEEEKYTLCRYEYPNEIAVVSQYELLKRKALQAE IENLQLKIYDLENENKEGD >gi|222441891|gb|ACEP01000051.1| GENE 7 5587 - 5877 383 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026727|ref|ZP_03715919.1| ## NR: gi|225026727|ref|ZP_03715919.1| hypothetical protein EUBHAL_00979 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00979 [Eubacterium hallii DSM 3353] # 1 96 10 105 105 104 100.0 2e-21 MDWKKQIEKLEDELQKLTEKENRIAERKKEVGEKLRKVREQKENEENKQLADIVTEYLGP MDPKKIEDLKVVLDMYMSDQEEERVTQKERQEGEER >gi|222441891|gb|ACEP01000051.1| GENE 8 6015 - 6230 218 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026728|ref|ZP_03715920.1| ## NR: gi|225026728|ref|ZP_03715920.1| hypothetical protein EUBHAL_00980 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00980 [Eubacterium hallii DSM 3353] # 1 71 1 71 71 109 100.0 7e-23 MQEDRPVRLEKELKNLSYQVRKIGVNINQVVAKINAGYGNQIDIYNLEKALEQVKHEMKT LNEKVEDYGNH >gi|222441891|gb|ACEP01000051.1| GENE 9 6217 - 7605 953 462 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01797 NR:ns ## KEGG: EUBELI_01797 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 419 1 395 608 146 29.0 2e-33 MAITKLSNIGMPKGGGAAHLKNCISYIMNPDKTEGMVGGNAGTTPQEVYQVMIDTKQEWE KEGGRQGYHFVISFPPGEATKEEAYAVINDFCEEYLGEDYDYVFSIHTDQKHMHGHIVFN SVNRMSGYKYRYEKKDWEKYIQPLTDRICEKYGLPPLVYDKNNKIGKSYAAHYAEKEGRP SSEKIIKADIDFVIASSKDWDDFIRQMEGLGYKIRQKKYVTYIPPGFERGRRDSRLGPGY RAEEIKERIKNKGQEQGVEKILSHELSKVYEREIFQYTKTTLTVFQMKKIKNFYQTGHYL 
EEKNPYAVAWKEVRRNAVHIDQLYEECKYILEHEIKTEKDLLEKQKILIKKEAELFDQRK TLFSVEDKKIFQQYQVLKKRLADTPDWDDRFEIYQEELNDFLKEMPEGMLSAEKKKRQIN AMLREVRHEKRIVNRMIEEEKSLNKEEKITKKKGRTEIQHRR >gi|222441891|gb|ACEP01000051.1| GENE 10 7614 - 8492 775 292 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026730|ref|ZP_03715922.1| ## NR: gi|225026730|ref|ZP_03715922.1| hypothetical protein EUBHAL_00982 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00982 [Eubacterium hallii DSM 3353] # 1 292 1 292 292 445 100.0 1e-123 MGSKEEKHIVIEFQKKILQEAGCSAEVIAFLDEIEDEVFCEYYCFCFLDGMPIEEVRRID SVPVQNWKDKIKRIREERTGYLERLFVPDAQVTQQISKLHEKAGRVFKETEELRDTINTT LKQTLKIQEEALVQQKESYQSTLASKAELLKDREDKIQSLLSEIEQNKAVWENEKKDLLL KLEEKKTAVKEEEKKEAAFIEEDTVNNRERKKKRWHFFCKENKKRGTQDNNNEFIETYLA EDTFNDAQKEFLIQCLEEGDSVQEMKTYASASLTPAMMQRLRQVRKKRRERP >gi|222441891|gb|ACEP01000051.1| GENE 11 8489 - 10471 1233 660 aa, chain + ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 4 125 1064 1184 1343 95 41.0 7e-18 MRKKEQTGINLSEEEILHGKGDAYGIYQIDWKGEGREYAFLSYDSIRAKGKLPQRKDYQL VYSGILEPDENMDSLYVKFNIAHPQDFTGHSLSISDIIVLKKNGKINVSYVDMIGFVPLS DFYKEPALRVVGQITEATQGFTAEGHFGTWHSIQMQEFHNEKFFQMRHDEFGEKVADIIV NEQGQVIAEDLWHGFSPEAMKLIGEYLLNRSLHEKKEAAYVISGDSGYFMIHETDGGYDY TFYNEDYRELDGGVYDNPDVSLAEAIEDILNDAGIAIATIEEIGYEQLEQNIEESEEKEL LHYAVQESKRQLKGGDIRLTSEVYYKEKSLEGRSRADIEETVLSQAQIIVDELGLHNEVE LIGARVYGSRSREGLYRPDSDVDVALSYQGPISEDSFFNYLKEDMLYVKEIPIDINPISK TKSGTLPEYLERAEYYLDEKEIEQFAEQIDTFGRLRGDWYVDETMEQEKAVDAITDDILQ KKTGYLNDYLKKTIEISGDQEDIKQAKDLLIQMEKLERLSIFDKEPEPIPEVDFYVTECS EFPSLGEYHEGLTIDEAIAVYEKIPGDRKNGIKAIGINLHFPEGHMYSDKCDLLAGGHIC KEMLDAVPFYKENRQVRKAVRYLEKHFEKKENLSLIKPKKKQKIIIFNKKHNKKIIFMIK >gi|222441891|gb|ACEP01000051.1| GENE 12 10498 - 11055 456 185 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1452 NR:ns ## KEGG: Sgly_1452 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: S.glycolicus # Pathway: not_defined # 1 180 1 174 177 
80 32.0 3e-14 MFIYLMVLDTEEEKIKFVKIYDAYKDRMHYTASVILKDKLNAEDMVHDTFLTLTDYLDRI DEEDFTGTWNYIVTILKNKCYNFIKRSKKIELIADDTIFEQSVEGHNLLESQLIKKESEE FLNSLIRGLKYPYKEVIYLQYHNKLNSCQIAEILETNSANVRKISSRAKEQLKKGMLKKG YTYEF >gi|222441891|gb|ACEP01000051.1| GENE 13 11045 - 11704 484 219 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1453 NR:ns ## KEGG: Sgly_1453 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 31 218 33 223 225 68 25.0 2e-10 MNFEKLLEQEAQRAALQDYEEVTEKTKNYSYKFSKEFIQKMNKMIYEEKRKAKKKRRLRF LLVAAVILILNAGIVLANDDLREKVGTLIIHFFEDSIYIHNSEEQTESEKIFRQLHLTYI PAGYNLLYETENPITTYSAYYEGQNDNYIGFTQGLKENVDVHVTYDGTGSRKIQVNGKEI YVVKDKEITSFYYEDNGCIITLSSTEKELELIKILKNIK >gi|222441891|gb|ACEP01000051.1| GENE 14 11767 - 12192 194 141 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026734|ref|ZP_03715926.1| ## NR: gi|225026734|ref|ZP_03715926.1| hypothetical protein EUBHAL_00986 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00986 [Eubacterium hallii DSM 3353] # 1 141 13 153 153 214 100.0 1e-54 MRKKIMVLLICTVSLVSIPFSVIAAEKNVHTNVAVEQIMPRYGNISQISGNIQAVSGKIN CSVDVLMKTSISISVKMSLQKKNGSSWNTVKSWNKSYSNTKVVNSKAVYNSTTGATYRMK YTVTAGNDKATGTTPSVVGKK >gi|222441891|gb|ACEP01000051.1| GENE 15 12214 - 12360 176 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKRKLLSRVLCLILCLCLAIELIPSEAYANTLKGIDKAVSSQDKPVK >gi|222441891|gb|ACEP01000051.1| GENE 16 12498 - 19388 5207 2296 aa, chain + ## HITS:1 COG:lin0454 KEGG:ns NR:ns ## COG: lin0454 COG3209 # Protein_GI_number: 16799530 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Rhs family protein # Organism: Listeria innocua # 1390 2066 1469 2108 2167 175 28.0 1e-42 MSDGSISAVSYGMDVNYKDKQGKWKEINNNFLYSKDTGEEGDFSGYKTKQGETKIKFSEI AGEDKLVHLEKDKYSVDFTMISSEDEENVFDEEENVPDDESTEELEDTEEIQATEAEESS TEMEESNTEKNTEDKPQEVTTEEVEEATTEYQEEVSTEDESKEILDKQNLEDIEDEQAAL QDIESQNISETENEEDDFIDFQTINSKSDKKEEKNEKEANVEKSLITATAEEKSEIKSEE SENFSENYEGIEQVPAIIKKGTIEGISNAKKSRTIQEEVDEEEQEDSYAVENVYTQLSYE 
GIQENVSLNYSLIGSSLKEDIIVEEPSETYEYRFLLSLGNLVPKQGKEGAIYLNDSKTGK NIYVIPAAYMTDSADNYSENVKMSLEETQEGQYILKIVPDEEWMNDPDRSYPVTIDPTLE KSTADSNDFVIRDVYVVEGDSSTYYPYGNTIAYIGHGASADAQTHRMFTEFRKLPQIPAG SVITGAKLYYYQQMYDPYSDTRTPFFYLDAKEVTAGLDWTNGVTWDSQPTYKNTVLDSQK ISEETTGKYVGWDLTSVIKKKYEEGNDEKFAAVALVPDDEESISSVKYSALVRFYKATRS TNKPYVVINYRDSYGVEEYYSSQSQEIDSAGTAYVGDYSSQLTLEKMDYSNSSEAISFSL EHYYNSGLAGGNFGEVSAYTKDYSSMKIGNGWKLNAQQTIVEKKIGSTIYYIYTDGDGTE HYFYEDAEKAGTFVDEDNLGLSVTKNGSVYTMTDGDSTTTFNEGYLTSIKDSKENKICFA YGGKEYGSDNLWKPTQKSHQLTSIIQINKVDNKARKLVSLSYDENGYLTKITDKAGRITK YHYDGGELVKITHPDNTTVSYTYDNKHYLLSALDNESKYGVEYGWEKLGCRIIVNRIREF NNSGETKEYGSDVKVDGHDLQHTVYTSAGNDNNFDTSDDQISTYTFDYSGHTVNVCTKDV QGNVLGVDTGAYKSKSAKNSLTASATSGKQAYSEIVNGGIEVAGATSIDINSWNVSGSTT NGSVSATNKQAHNGDRSIALSGESTDNSYVRAYAAGRTLSPGTYTFSAYVKTTDVKKFGK DGGVFLQFHTGSKVLAQGDSLDYQTLEDTDNGWTRISVTKKFTSSRKVYGFVNLKNAQGV AYVDDLQMEKVNSASGYNLLNNGAGVTQARWDMNTEGTSFAQEEKLYASNALKITGKLQE NRNISQTVRINHKIKQQSYMLSGWAKAESVPDIAEDNSDGENRFFGLHAKFLYADGSYET EYIPFNADVSTWQYISGVVTPTQTGKVLSAIIVGCDYTFNANTAYFDGISLVEEEASLYD YDSKGNLISAIDGDGKFVYDYDDKGNIISVTDEDGNKTNFKYDESGNLEGISANEFSVKE DDEGTEQKNIIEGKDGTKSTSSIKYTADGNYISSETDAAGAVTSYQYDDKTDLLKTTVDS NKNTTSYTYNSGNSQLSKAALKDKNGEELVSLNYTYSNGIMSSLKRSGYREDKAIQQEYS FGYNKWKQVTSISAGKYKLAGYEYAGKNGSLTQLTYGNGVKETYKYDTLGQLASISYDGT VTYKYTYTTDGNIETVTDVPNDYTYTYKYDKNRDAIEYVKKKAGVTEIYSQSWESTDGLT SGNKISFDGKSYDTSITKDSQTSNVLTEKNILGNKQSFIYDSLGQKLSEKNNVYKLDYSY ESIDTDNAKDNATGRIQQISYGKINNSFSDFSLKYEYDTLGRIVKVSDTKGNILAQYSYD AQGQLLMEKLPQQNKRYEYTYDTVGNIQDAKTYTLTGTTATATKKYTYGDASWNDLLTAY DGKEITYDGAGNPLAYNNGTAWKFTWEKGRELSSASADGKNITFTYDVDGIRDSKTINGV KHEYVTLDGLIYQEKWGNNTLTFSYDNNDKPYAVKYNNTTYYYVLNQQGDVIRIVDENGT TKAEYTYDAWGNVIASTGTDSGIGAVNPLLYRGYYYDSELGMYYLRSRYYDPGVKRFINA DSVDYLGENKTELSYNLFAYCENNPIIYIDKDGYAVAHIVGGAIGGLSGVALGKFLAKKL KLHGTKKTALIAASAVGGVVLGAFLGPYVAQLSKYGIQLIKKNISYGAAAGKALCFVSGT LILTEEGLVPIEKLREGDYVYAEDINTGEQELKEIIQIYENQTQEVVCLKLKGEEIITTP YHPIYIDGRGWVAAVKVKNGDVLHTFDGKKILVEKVQYRKLEKPVKVYNFEVRDFHTYYV 
GKNNFLVHNKNCSLVKLSDKYIKKTLKLDAHAIKREYLGKKAAIARYDLAVDKNTGIIYI INKAGTIIDKTIYRTK >gi|222441891|gb|ACEP01000051.1| GENE 17 19581 - 19782 110 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026737|ref|ZP_03715929.1| ## NR: gi|225026737|ref|ZP_03715929.1| hypothetical protein EUBHAL_00989 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_00989 [Eubacterium hallii DSM 3353] # 1 67 7 73 74 123 100.0 4e-27 MNIFFDNNFDVYKIERILNIKPSDCKRKNETRMSPFNKNEHLEGYWSLVTDTFEELDIKP VMDDLLR Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:14:14 2011 Seq name: gi|222441890|gb|ACEP01000052.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont57.1, whole genome shotgun sequence Length of sequence - 14439 bp Number of predicted genes - 18, with homology - 17 Number of transcription units - 8, operones - 3 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 88 - 147 5.2 1 1 Tu 1 . + CDS 179 - 901 423 ## Bmur_1224 CRISPR-associated protein Cas6 + Prom 937 - 996 3.4 2 2 Op 1 . + CDS 1089 - 2864 1082 ## Bmur_1225 CRISPR-associated protein, CXXC-CXXC region 3 2 Op 2 . + CDS 2865 - 3740 1065 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair 4 2 Op 3 . + CDS 3727 - 4467 534 ## Dtox_0916 CRISPR-associated protein Cas5 5 2 Op 4 6/0.000 + CDS 4493 - 6913 1398 ## COG1203 Predicted helicases 6 2 Op 5 12/0.000 + CDS 6949 - 7446 343 ## COG1468 RecB family exonuclease + Prom 7571 - 7630 5.5 7 2 Op 6 13/0.000 + CDS 7723 - 8589 342 ## COG1518 Uncharacterized protein predicted to be involved in DNA repair 8 2 Op 7 . + CDS 8658 - 8936 145 ## COG1343 Uncharacterized protein predicted to be involved in DNA repair + Prom 8969 - 9028 9.0 9 3 Op 1 . + CDS 9099 - 9296 115 ## COG1724 Predicted periplasmic or secreted lipoprotein 10 3 Op 2 . + CDS 9326 - 9757 378 ## ELI_3151 hypothetical protein + Term 9764 - 9809 8.9 11 4 Tu 1 . 
- CDS 9805 - 10293 -50 ## gi|225026749|ref|ZP_03715941.1| hypothetical protein EUBHAL_01001 - Prom 10335 - 10394 6.5 + Prom 10273 - 10332 2.9 12 5 Tu 1 . + CDS 10357 - 10470 85 ## + Prom 10619 - 10678 5.0 13 6 Tu 1 . + CDS 10779 - 11039 193 ## gi|225026750|ref|ZP_03715942.1| hypothetical protein EUBHAL_01002 + Term 11040 - 11087 1.1 + Prom 11099 - 11158 5.7 14 7 Tu 1 . + CDS 11222 - 11536 234 ## Swol_2193 hypothetical protein + Term 11634 - 11671 -0.8 + Prom 11687 - 11746 9.8 15 8 Op 1 . + CDS 11778 - 11993 266 ## gi|225026752|ref|ZP_03715944.1| hypothetical protein EUBHAL_01004 16 8 Op 2 . + CDS 11987 - 12250 137 ## COG3326 Predicted membrane protein 17 8 Op 3 . + CDS 12260 - 12838 427 ## COG1451 Predicted metal-dependent hydrolase 18 8 Op 4 . + CDS 12847 - 13845 524 ## EUBELI_20061 hypothetical protein + Term 13894 - 13937 1.0 Predicted protein(s) >gi|222441890|gb|ACEP01000052.1| GENE 1 179 - 901 423 240 aa, chain + ## HITS:1 COG:no KEGG:Bmur_1224 NR:ns ## KEGG: Bmur_1224 # Name: not_defined # Def: CRISPR-associated protein Cas6 # Organism: B.murdochii # Pathway: not_defined # 2 239 1 252 252 108 28.0 2e-22 MIQFNLEFVLEKPELPLELDRLLVSFLKASLESASPKMFEQLYDKRRSVIKPYTFSYYLP GAKFKDEKIYLRQNKFSMFFSNADMEQTIYFFNGFKKILHQRYPMNGNSMELVRIKNVRR KEIKDPEIIVKMQSSLIARRHDVEENRDTYYVYDHPEFSQVVKENVQTLIERLGVDVSVE GFEIVPIKGKKIVATIFGRKVDANIGIYKLSGSPELLTFLYRTGMGGRRSEGHGMWKIVY >gi|222441890|gb|ACEP01000052.1| GENE 2 1089 - 2864 1082 591 aa, chain + ## HITS:1 COG:no KEGG:Bmur_1225 NR:ns ## KEGG: Bmur_1225 # Name: not_defined # Def: CRISPR-associated protein, CXXC-CXXC region # Organism: B.murdochii # Pathway: not_defined # 6 583 31 586 586 225 29.0 4e-57 MDRVIITLGDFLKNAGIVGMKYLLDISEAEEDSDYGITSDEQGIWLDRDFALHADWTDLY FNACVKYFGKNTVYQGVLDRIERCLTKIREDKWNPGREEKDDLKFITEKLLSNSYQNGFN IIKEKVENPEVYLELKKNKLSDKFESDILRKRLEELQQFLTQPLCRETFIMKSIIYNYIN RFWDGKCFLLRANAKKDMRAVFEKDFSEPFHKYLEGEHKKAKDTCIDCGNGITGKEKVSI AFMKEMADDLTRKRSAFWNCQVDAFLCPVCAFVYALSPLGFQMYANKFVFMNLNENISVL 
VDVNGKKRGNGLKEKGEEENYTVWFARILNKVLSDKVKELNNVQVILRGTRAEDNYMFSI IGRDALQILKQEKVQKALKYLEQHPYVKLSNEFVNVHESVVMNILQYHKQYSLLNQILKE SLGNTGVIPTAYWVFVVQLCTSIEKRMEVNVGSISEELEQSQQGIEEKLSKAKEGIFMSR IKMRDAGYKLRKEILLAKKAENDECMRGTIYQLLNALSVRNEGKFMDIVIRLYSSTKLPM ADGFVYMLGNKEKFSEYGYSFILGLKGSYTDSKDSKSEDTKQSQDESGEEN >gi|222441890|gb|ACEP01000052.1| GENE 3 2865 - 3740 1065 291 aa, chain + ## HITS:1 COG:aq_374 KEGG:ns NR:ns ## COG: aq_374 COG1857 # Protein_GI_number: 15605880 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Aquifex aeolicus # 5 288 4 352 357 144 32.0 1e-34 MTRDGLTFTMVFLAESANYGEGIGNITTLKKMTRGDFQQYSYISRQAMRYNIVKQLKWDN TPVDGKSGVVQFAPSATIEDYPEIDLFGYMKTTSKADDKKGGASTRSAVVRLSNAISLEP YQSDLEFLTNMGLAQRQNLENGIAQSEIHRSYYSYTISIDLDRIGVDVEIEVSQEEKASR VKELLDTVQYLYRDIRGRRENLSPLFIIGGRYKRKNPFFENIVSVKANKIGVATLKEIVE DSEELKEYTYTGITSGIFDNEKDVKEKLKAESVGNVFKHLKEEVDAYYEGN >gi|222441890|gb|ACEP01000052.1| GENE 4 3727 - 4467 534 246 aa, chain + ## HITS:1 COG:no KEGG:Dtox_0916 NR:ns ## KEGG: Dtox_0916 # Name: not_defined # Def: CRISPR-associated protein Cas5 # Organism: D.acetoxidans # Pathway: not_defined # 1 209 1 222 250 166 39.0 1e-39 MKAIRVECFQNLVNYRKPSSFIIKETFPLPPYSTVLGMIHAACGFTTFHPMELSIQGKNK GTISELYTRYSFSAGVKFEEGRHQICVHDKEDYGVFKGIANVELICENNMILHILPSEED FDTVYQSLLNPPQYLSLGRYEDLLDIKKVEIVHLSWEEEVSVHNNIYIPVDYETEDGERI FEDGEGGATIYNLTREYEITKNGLRRWKKDGGRVRAYYYPAGKVLDQGYVDDSEAESAVV FSKNIR >gi|222441890|gb|ACEP01000052.1| GENE 5 4493 - 6913 1398 806 aa, chain + ## HITS:1 COG:aq_371 KEGG:ns NR:ns ## COG: aq_371 COG1203 # Protein_GI_number: 15605877 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Aquifex aeolicus # 60 705 20 645 729 286 33.0 1e-76 MELYAKSKAKQLSIKEVEALKETLESFLDSLEGILTPKEEQIIIQQIQKIQNQKEEKQKT LKEHEEDIVKCAEAFFRGYGKYFTEIEKILILQACRQHDWGKANLIFQSKVCQKVRDNLS EPEKKISQIPHGFLSAVSISYKEFQKLYEGLEKDDFKAFMTAIYYHHDRKDDFIQDDIKA 
YCQKYYLKYLSEYLKEERDKVYSSQMRKLLFRNNMSSIKLDVKPSLWNKYVLLKGMLNKF DYTVSAGYEFAEENSDLEEKRLVNNINKNMKEQKFALKPAQKFMQENVNKNLVVIAPTGS GKTEAALLWLNGEKGFYTLPLKVSANDIYRRIKDDYNYKDVELLHSDAMQKYLEESTNAA DSIYQRYEKAKLLSNPLTICTVDQLFKFVYRALGTEIFAATLKYSKVILDEVQSYDPHII ATIVTGLKQITEMGGRFAIITATFPHVLKYFMEKCGLIQGEQYEFRDFSKESDIIRHKVK IKSGEMNFDEILEKGRDKKVLVICNTVTKAQEVYEQIESQLEEKDLHLLHSRYVKKDRLQ LEKMIKNFSENEEECGIWITTQIVEASLDIDFDVLYTEMCTVDSLLQRMGRCNRKGRYIP EEPNVIVYDNANGKGTVYYEDMYDRSLEKLYQYEDKIFTEELKTEYINAVYCTEEIKKTK YYEEIEYWLEHFSTIHPIEYTKEDVDKKFRNIHSITVLPDTLYEENQVLFEEGVNLLRKP NINKDIKNIITSKFSSLTLGLTYYRYKDNGLNGIDVTTIGSHGDDRKNNYPVTQIHRTSM KYDFDPDTGRGKGLILGEIDDEACIL >gi|222441890|gb|ACEP01000052.1| GENE 6 6949 - 7446 343 165 aa, chain + ## HITS:1 COG:FN1178 KEGG:ns NR:ns ## COG: FN1178 COG1468 # Protein_GI_number: 19704513 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Fusobacterium nucleatum # 3 165 2 164 164 166 59.0 2e-41 MTDKRITGVMIYYYFVCKRKLWYFCHEINMESENENVQLGKLLDEKSYERDEKHINIDNV INIDFIREHQELHEIKKSKAIEEAGIWQVKYYLYYLEERGVKNIKGKIDYPLMKKTLLVE LSEQDREKLKSIIDEIESMKKEEFPPKFTEQKICKKCAYHDLCFI >gi|222441890|gb|ACEP01000052.1| GENE 7 7723 - 8589 342 288 aa, chain + ## HITS:1 COG:FN1177 KEGG:ns NR:ns ## COG: FN1177 COG1518 # Protein_GI_number: 19704512 # Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 288 51 338 338 385 69.0 1e-107 MSEMSFNTAFINYISQYGIPLHFFNYYNFYTGSYYPKESLLAGQLLVKQVEHYTCYDERL VIAKEFIKAAADNIYRNLRYYNSRGKDVAECMKEVNSLRKKLDHTEFIEELMGIEGNIRK NYYKAWNTIVNQEIQFEKRVMHPPDNMINSLISFVNSLIYAKTLSEIYHTQLDPTISYLH EQGFRRYSLCLDVSEVFKPLIGDRLIFSLLNKNQITEKSFTKELNFLHLKKEASRLIVTE FEKRMKQTIMHKELGKKVSYQYLIRLEAYKLIKHLIGEKEYEGFRIWW >gi|222441890|gb|ACEP01000052.1| GENE 8 8658 - 8936 145 92 aa, chain + ## HITS:1 COG:FN1176 KEGG:ns NR:ns ## COG: FN1176 COG1343 # Protein_GI_number: 19704511 # Func_class: L Replication, recombination and repair # 
Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 92 15 106 106 100 56.0 5e-22 MYVVLVYDISKDENGQKRWSHIFKICKKYLSHIQNSVFEGEISKVQLVKLQQELKVYIDS NLDSVILFKSRQQKWLDKEFWGKEDDKTSFIL >gi|222441890|gb|ACEP01000052.1| GENE 9 9099 - 9296 115 65 aa, chain + ## HITS:1 COG:msr2149 KEGG:ns NR:ns ## COG: msr2149 COG1724 # Protein_GI_number: 13471995 # Func_class: N Cell motility # Function: Predicted periplasmic or secreted lipoprotein # Organism: Mesorhizobium loti # 5 62 2 59 62 73 63.0 9e-14 MKSYSSREVIGILKKDGWYEVACVGDHHQFKHSIKKGRVTVTHPNKDIPIKTLRSISKQS GVMFP >gi|222441890|gb|ACEP01000052.1| GENE 10 9326 - 9757 378 143 aa, chain + ## HITS:1 COG:no KEGG:ELI_3151 NR:ns ## KEGG: ELI_3151 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 3 137 2 134 134 202 70.0 3e-51 MSKDRYSYVAVFSYEEDGISIEFPDLPGCLPCAEKDNTEEALKNAKEALGLHIWGMEQDG EEIPVPTPITSIHLDSNEVPVLIEVFMPPVRERINSRFVKKTLSLPAWLAAKADEDGVNC SRIFQLALMNYLQVSEENKAGIN >gi|222441890|gb|ACEP01000052.1| GENE 11 9805 - 10293 -50 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026749|ref|ZP_03715941.1| ## NR: gi|225026749|ref|ZP_03715941.1| hypothetical protein EUBHAL_01001 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01001 [Eubacterium hallii DSM 3353] # 1 162 1 162 162 216 100.0 5e-55 MTLVNLFTFHNVSINSICLQKESEQYLDLHSTMFLLIPGSGFLNIFKEIFTFHNVSINSR YLIFLIFPTFSLLFLSILDSICVFPLYSHLKIIIYLLFMPLSMSPVFYFIEHRQFYNFIF LTAYAISFNFLNVYSYFSTYQQTHCNIIKLQLRIIFFILNRL >gi|222441890|gb|ACEP01000052.1| GENE 12 10357 - 10470 85 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMWLKLELIETLWNVNIDGMVLGYKDIGINRNIVECK >gi|222441890|gb|ACEP01000052.1| GENE 13 10779 - 11039 193 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026750|ref|ZP_03715942.1| ## NR: gi|225026750|ref|ZP_03715942.1| hypothetical protein EUBHAL_01002 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01002 [Eubacterium hallii DSM 3353] # 1 86 1 86 86 167 100.0 2e-40 
MIEYENYGKLESLAELENAIEMGLDIEFLLYDTRYNISWRDNKPFICECPEGVAKFFDTS ELLLQRYEVGDNPLKSLWGDMKILSM >gi|222441890|gb|ACEP01000052.1| GENE 14 11222 - 11536 234 104 aa, chain + ## HITS:1 COG:no KEGG:Swol_2193 NR:ns ## KEGG: Swol_2193 # Name: not_defined # Def: hypothetical protein # Organism: S.wolfei # Pathway: not_defined # 1 104 1 104 106 167 75.0 1e-40 MKKCKITVMRKACYKDLMEKYENPMEHACDIEEGQVFIANGWCKPVGMCDSAWESMSPFV MTLAYGGKDIYNGWMKNEKSAMISCNDGFRPVSFLIEALEEEAD >gi|222441890|gb|ACEP01000052.1| GENE 15 11778 - 11993 266 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026752|ref|ZP_03715944.1| ## NR: gi|225026752|ref|ZP_03715944.1| hypothetical protein EUBHAL_01004 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01004 [Eubacterium hallii DSM 3353] # 1 71 1 71 71 112 100.0 1e-23 MQKTYTVIEIYEADFGCEERPEGQETMVGIRLKAEDGEEIHRQEADAELYAKNINEGDKV IFIEGRIEKQC >gi|222441890|gb|ACEP01000052.1| GENE 16 11987 - 12250 137 87 aa, chain + ## HITS:1 COG:BH3136 KEGG:ns NR:ns ## COG: BH3136 COG3326 # Protein_GI_number: 15615698 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 2 81 9 88 94 68 43.0 2e-12 MLIYLLIINIIAFIMYGIDKWKAHRKQWRISEKMLLFLAVIGGSAGALAGMYIFHHKTLH KKFTIGVPLILVIQVMIFICIMENFYR >gi|222441890|gb|ACEP01000052.1| GENE 17 12260 - 12838 427 192 aa, chain + ## HITS:1 COG:RSc0521 KEGG:ns NR:ns ## COG: RSc0521 COG1451 # Protein_GI_number: 17545240 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Ralstonia solanacearum # 99 186 191 278 288 90 43.0 2e-18 MITEYKLEDAGREMLITIKRSSRKTLGLEVRENGEIFARIPEKLSDRSLKSFIEKERDWI IKKSELVKWEREHRQTTNATPVQELTSEEIENISQKIVDRVRFYQKKMGVTVGRVAIRNQ KTRWGSCSAKGNLNFNYQIYYLPDELLDYVVVHELAHRRHMNHSDEFWREVEKYFPMYRQ CRKRLKEIRLIN >gi|222441890|gb|ACEP01000052.1| GENE 18 12847 - 13845 524 332 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20061 NR:ns ## KEGG: EUBELI_20061 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined 
# 15 327 11 312 316 191 37.0 3e-47 MFDVGDRIIGLELPDTFYKLAELLEQDQFELLLSKNRIKDMHRKQDISLVYLMNDAVESF LVFHNATITGQYKSEYEGGLLAVLDKKEEEYVLVVHQGDSVITLFFEDLLQQNNLYNYGD IGHFWVKGYEYLRLLEYRLAILRDKCDYLDIDCSTPEERKLAALVEFPPLNYCCYPAVPE KYIVPRQNPWQPSDEAIDLMIEIAKEAGDKLFVHILSYYQKNNSKLMSRFIARMLHRKKH IKIVRILTKKLQKAASIYPVRSFGIENDIQIQKCLEKAKQKKQELKEKGIKADIIREEPF VEAQDSVQYKIYLMKWKQQRGNYTTEIEEITI Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:15:17 2011 Seq name: gi|222441889|gb|ACEP01000053.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont58.1, whole genome shotgun sequence Length of sequence - 49234 bp Number of predicted genes - 38, with homology - 38 Number of transcription units - 19, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 620 272 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 2 1 Op 2 . + CDS 617 - 1939 1014 ## Nther_1024 hypothetical protein + Prom 2177 - 2236 6.1 3 2 Tu 1 . + CDS 2267 - 4081 1224 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 + Prom 4730 - 4789 5.1 4 3 Op 1 . + CDS 4809 - 5465 200 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 + Prom 5488 - 5547 3.0 5 3 Op 2 . + CDS 5603 - 6478 1029 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase + Term 6506 - 6576 13.1 + Prom 6556 - 6615 9.4 6 4 Op 1 . + CDS 6837 - 8333 852 ## gi|225026763|ref|ZP_03715955.1| hypothetical protein EUBHAL_01015 7 4 Op 2 . + CDS 8373 - 9452 686 ## COG0515 Serine/threonine protein kinase 8 4 Op 3 . + CDS 9412 - 10467 658 ## gi|225026765|ref|ZP_03715957.1| hypothetical protein EUBHAL_01017 9 4 Op 4 . + CDS 10523 - 10723 253 ## gi|225026766|ref|ZP_03715958.1| hypothetical protein EUBHAL_01018 10 4 Op 5 . + CDS 10775 - 11824 652 ## gi|225026767|ref|ZP_03715959.1| hypothetical protein EUBHAL_01019 + Term 11842 - 11885 7.2 + Prom 11954 - 12013 8.4 11 5 Tu 1 . 
+ CDS 12109 - 13989 1354 ## COG0515 Serine/threonine protein kinase + Prom 14442 - 14501 7.3 12 6 Tu 1 . + CDS 14542 - 15558 619 ## gi|225026770|ref|ZP_03715962.1| hypothetical protein EUBHAL_01022 + Term 15594 - 15630 1.5 + Prom 15703 - 15762 5.0 13 7 Tu 1 . + CDS 15801 - 16697 998 ## COG0860 N-acetylmuramoyl-L-alanine amidase + Prom 16732 - 16791 4.9 14 8 Tu 1 . + CDS 16827 - 18197 1154 ## gi|225026772|ref|ZP_03715964.1| hypothetical protein EUBHAL_01024 + Term 18235 - 18287 5.6 + Prom 18486 - 18545 7.1 15 9 Tu 1 . + CDS 18657 - 20540 2173 ## COG4690 Dipeptidase + Term 20610 - 20655 1.2 + Prom 20769 - 20828 6.7 16 10 Tu 1 . + CDS 20885 - 21352 469 ## COG2246 Predicted membrane protein + Term 21466 - 21506 5.0 17 11 Tu 1 . - CDS 21327 - 22043 736 ## COG1720 Uncharacterized conserved protein - Prom 22286 - 22345 4.6 + Prom 22742 - 22801 6.7 18 12 Op 1 . + CDS 22876 - 24345 1705 ## COG2195 Di- and tripeptidases 19 12 Op 2 . + CDS 24345 - 25334 828 ## COG0042 tRNA-dihydrouridine synthase 20 12 Op 3 . + CDS 25355 - 27373 1787 ## COG1404 Subtilisin-like serine proteases 21 13 Tu 1 . + CDS 27424 - 28062 668 ## COG1896 Predicted hydrolases of HD superfamily + Term 28203 - 28247 -1.0 22 14 Tu 1 . - CDS 28742 - 30172 949 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases - Prom 30326 - 30385 12.3 + Prom 30154 - 30213 9.8 23 15 Op 1 . + CDS 30420 - 33953 2754 ## bpr_I2220 hypothetical protein 24 15 Op 2 . + CDS 33994 - 35097 851 ## bpr_I2219 hypothetical protein + Term 35098 - 35139 -0.9 25 15 Op 3 . + CDS 35158 - 35988 842 ## Cphy_0459 hypothetical protein + Term 36029 - 36074 2.2 + Prom 36011 - 36070 3.3 26 16 Op 1 8/0.000 + CDS 36150 - 38792 3222 ## COG0525 Valyl-tRNA synthetase 27 16 Op 2 . + CDS 38816 - 40102 1250 ## COG0285 Folylpolyglutamate synthase 28 16 Op 3 . + CDS 40118 - 40270 338 ## gi|225026788|ref|ZP_03715980.1| hypothetical protein EUBHAL_01040 29 16 Op 4 . 
+ CDS 40290 - 41831 1971 ## EUBELI_01610 hypothetical protein + Term 41848 - 41904 10.1 + Prom 41873 - 41932 9.6 30 17 Op 1 8/0.000 + CDS 41956 - 42291 333 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) 31 17 Op 2 6/0.000 + CDS 42288 - 42749 446 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 32 17 Op 3 . + CDS 42733 - 43470 618 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 33 17 Op 4 . + CDS 43481 - 44146 481 ## Closa_1563 stage V sporulation protein AA 34 17 Op 5 . + CDS 44109 - 44561 355 ## Aflv_1006 stage V sporulation protein AB + Prom 45111 - 45170 7.1 35 18 Tu 1 . + CDS 45244 - 46542 562 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 + Term 46688 - 46716 -1.0 - Term 46558 - 46597 2.3 36 19 Op 1 . - CDS 46834 - 47655 535 ## CLL_A0680 ABC transporter permease protein 37 19 Op 2 8/0.000 - CDS 47649 - 48347 230 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 38 19 Op 3 . 
- CDS 48344 - 48715 464 ## COG1725 Predicted transcriptional regulators - Prom 48874 - 48933 8.2 Predicted protein(s) >gi|222441889|gb|ACEP01000053.1| GENE 1 3 - 620 272 205 aa, chain + ## HITS:1 COG:BS_ylaC KEGG:ns NR:ns ## COG: BS_ylaC COG1595 # Protein_GI_number: 16078537 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 33 201 7 166 173 67 28.0 2e-11 KNIVTFLVFICYMYERSRKVSRKGSENLDKHLLEALYHRYYKELYIFVYSLCGSETVTDD ILQDTFLKALLALKDSHTNMRAWLYMVARNLYYNYYNRQKNLFSIETEEVDRRNSREKII SGEGEAFIQNDILEEILHKEEKKRVWDAVNRLKGQKKEVILLQYFGDFSQKEIAAMLKIT PGNVKVLSYRAKKELKEYLTKEEGR >gi|222441889|gb|ACEP01000053.1| GENE 2 617 - 1939 1014 440 aa, chain + ## HITS:1 COG:no KEGG:Nther_1024 NR:ns ## KEGG: Nther_1024 # Name: not_defined # Def: hypothetical protein # Organism: N.thermophilus # Pathway: not_defined # 1 438 1 411 424 94 22.0 1e-17 MTYRELLELYKTGELELEQRKKVEKDIERQEAISDYLYEQEDIPELNDIFEEKAAAEKNN KRNDDIEDIDTKFIEMVNRSIRRAFRRMGLTILAAACVLILFVQFCLPTIVSCFYYNPAK NIGDKDYAISKMSRDMAVYSEVFLPGKRRGSVQVEAKGYGNYNITVNQTSSFTGSLTDVS GEITRGELRYYDNNIFKKPATNVFVYGSIAGKEENTVEENLKYTRKYFDVKDDEEINLCA AGSKEESLKTLQELDERKMYIGYVSLDKIMSYADFKKYVDKQDLAEVWCAVQVAELEKEE EGVVNFQSNIPNIGFVCNPSYNTAIKWDEKKYPNLLPGCETQDMGNEDEWDDPEENLKSE SNARQHFVSLLNYLSNQKKFLAMMEKDNSNYTTKELKEMVSYIKKNGIKVYGFTTIADKE TLLKLSKQSEVYEIYTEEVR >gi|222441889|gb|ACEP01000053.1| GENE 3 2267 - 4081 1224 604 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 9 602 6 594 636 476 44 1e-133 MKEVKTPRKPFIFYYIIVLLVLMLVNMFVMPMIREASIKKVDYNEFMNMTLNKQIKKVEI DDSQITFTDQNGTVYKTSKMDGDWGLTERLYRSGAEFTTQVQEQMSPILSFLLSWIIPIV LFSALGYYAQKKMMNKMSGGGPNMMFGMGKSNAKVYVPSEEGIRFSDVAGEDEAKENLAE IVDYLHQPSKYSEIGASMPKGVLLVGPPGTGKTMLAKAVAGEANVPFFSISGSEFVEMFV GMGASKVRDLFKQAKEKAPCIVFIDEIDAIGKKRDGHIGGNDEREQTLNQLLTEMDGFEG NNGVIILAATNRPESLDPALTRPGRFDRRVPVELPDLVGREAILRVHSKKTRLADNVDLH AIARMAAGASGAELANIINEGALRAVRNGRRIVTQADLEESVEVVIAGYQKKNAVLSPKE KMTVAYHEIGHALVAAKQTNSAPVQKITIIPRTSGALGYTMQVEQQDKYLMTKEEIQNKI ATLTGGRAAEEIVTGTISTGASNDIEQATKLARAMITRYGMTDEFDMVAMETVNNQYLGG DASLACSADTQKKIDEKVVEVVKAQHKKALDILNENKDKLHELANFLYEKETITGEEFMR ILES >gi|222441889|gb|ACEP01000053.1| GENE 4 4809 - 5465 200 218 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 59 216 64 223 378 81 33 8e-15 MRKKERAAIVIERLKKEYPGADCTLDYNEAWKLLVSVRLAAQCTDERVNIIVKDLFAKYP GVNELAEAEPEDIEAIVRPCGLGKSKARDISKCMRMLRDEFGSKVPDNFEDLLKLPGVGR KSANLIMGDVFGKPAIVTDTHCIRLCNRIGLVDNEKNPKKVEMALWKIIPPEEGSDFCHR LVYHGREVCTARTTPYCEKCCLADVCKSYASYMKEKNQ >gi|222441889|gb|ACEP01000053.1| GENE 5 5603 - 6478 1029 291 aa, chain + ## HITS:1 COG:Cgl2543 KEGG:ns NR:ns ## COG: Cgl2543 COG0152 # Protein_GI_number: 19553793 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Corynebacterium glutamicum # 1 289 5 289 297 293 50.0 3e-79 MKEFKPIKEGKVREVYDNGDSLIIAATDRISAFDVILKNEVKDKGAILTQMSKFWFNLTE NTVPNHMISTDVKDMPEFFQNEKYAGRSMMCKKLEMFPVECIVRGYITGSGWASYQENGT VCGIKLPEGLQESEKLPEPIYTPSTKADLGDHDENVSYDKTVEILEKLYPGKGNYYANIL KEYTISLYKKCAEYAWEKGIIIADTKFEFGLDEQGRVVIGDEMLTPDSSRFWPREGYEAG KGQPSYDKQFVRDWLKQNPDSDYDLPEEVIDKTVAKYKEAYEQLTGEPFEA >gi|222441889|gb|ACEP01000053.1| GENE 6 6837 - 8333 852 498 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026763|ref|ZP_03715955.1| ## NR: gi|225026763|ref|ZP_03715955.1| hypothetical protein EUBHAL_01015 [Eubacterium hallii DSM 3353] 
hypothetical protein EUBHAL_01015 [Eubacterium hallii DSM 3353] # 1 498 1 498 498 843 100.0 0 MRKKQLPKRIEILRKAFENKWSLEAVNKELEDNLCERLYARNIYEAGLIYAFCLGFDYEN WKQLYTAYEEERKDARDYIQSPEKNIIATYKKEQKDTKECIQSLEQNMVTTYEKKRVPSY SGIWSGKITLKQIYEYIMYNSKESFQTGTVAEELEEKLWKEQENTLPDKTSIGFLDKKKN FIAFMNENQKQFSEVREKARYYFCKYLYFYIEERCENYYDKCRTEETEEKIYGESLFLEY CIQAEKNALNQLTMLKLLTPLKNEAKKAKVSLSLEQRKEYLRNAALSFRGIFNEFNYFYF GFVSLGWVEMLFEFYGPLKDWSEEEKIRIACSLGYCKKNPSDEEKEKAFLQLERLSKKQE NTEKQLDEKYAIGMERAGYQSGRSGETYFRDFITGRRDINRDTLLAFLVFVQAETKLDLK DKLTIARINRILYNCGFAQLRPDREFDEFIRKYLTGEAKEEVLYEAAERAALKSEDFYLY KLYQQAKSCQKELEGKLI >gi|222441889|gb|ACEP01000053.1| GENE 7 8373 - 9452 686 359 aa, chain + ## HITS:1 COG:DRA0335_1 KEGG:ns NR:ns ## COG: DRA0335_1 COG0515 # Protein_GI_number: 15808039 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Deinococcus radiodurans # 11 251 6 233 459 148 37.0 2e-35 MEKENWLLEETLLHEKYQIKKVLGQGGFGITYLAYDQTLQQNVAIKEYFPVKIVRRLGNT LRKGQGEYELSATAMVYPQNGQEEKYLNGMQNFLEEAQVLFGKFDVEGLAAVKDYFEENG TAYIVMEYLSGPTLQEYEKEHKGKISEKQAEILLEPVINALAYIHSIGIVHCDISPDNLI FNKEGQLKLIDFGAAKNKKKEKEEQYCKGGYTAPEQYLEKEFVGPWSDVYSLCAVWYEML TGIKVPPAVERLQKDRLKNMTMASEVSEQTNNILKKGLSLEIQKRYFSVLNLSSDIKSNM ESKEEITKQYMKDTRFVWGTLWLQITTDINERNISQEKKWISRRKSFGAERIWSNLSCL >gi|222441889|gb|ACEP01000053.1| GENE 8 9412 - 10467 658 351 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026765|ref|ZP_03715957.1| ## NR: gi|225026765|ref|ZP_03715957.1| hypothetical protein EUBHAL_01017 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01017 [Eubacterium hallii DSM 3353] # 1 351 18 368 368 574 99.0 1e-162 MGQRGFGVTYLAYDEELCQEVVLKKYRWEHGSLDNEVINKRDYYKKQRKAFLNEARILSS LFDIKEVVKVLDYFEERRTSSPYKKATTGKLTNGDAIYYGDILQVTYSTLTNWTIDSHGK EKITVTGNVTSADIYMTVWSDWSEWPDDADTESNTVKVESKSQYRYSDKATTTSTNASLP GWIQTGSTTSYGGWGSWSVWQTNAIGGSDTRQVETATVYPYYYFYCKSCGRGSRYPYYGS 
TCAYCNSSSYVTLDTGTVDWFTNSWSSSTRWGSTSKYYQYIYDYHAGANAIYWNWTDGNA QTGYRYRDRSKTITYSYYKWNDWSSWSDTAYSSNGNRKVDTKTVYRIKKKY >gi|222441889|gb|ACEP01000053.1| GENE 9 10523 - 10723 253 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026766|ref|ZP_03715958.1| ## NR: gi|225026766|ref|ZP_03715958.1| hypothetical protein EUBHAL_01018 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01018 [Eubacterium hallii DSM 3353] # 1 66 1 66 66 100 100.0 3e-20 MKKIYTLISCLVLAIMALGMNVNASTGRTIISVDKVVAGEESSVRVPVKIMNNEGLVGAT ITIEYD >gi|222441889|gb|ACEP01000053.1| GENE 10 10775 - 11824 652 349 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026767|ref|ZP_03715959.1| ## NR: gi|225026767|ref|ZP_03715959.1| hypothetical protein EUBHAL_01019 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01019 [Eubacterium hallii DSM 3353] # 1 349 1 349 349 574 100.0 1e-162 MTKPGELTEKKINIVWDGMEEDSSNGVIAILNFSKPKKAGSYDVKISYEDGGIIDGNLQP VEVTMYNGSITVKGFSEGEKDSCTVNGHKGGKATCIKKAVCSVCRKEYGEVDPTNHEGET ELRNVKKATCTEDGNTGDTYCKSCGTKIKDGKVIKALGHKFTKYVSDNNATTTKDGIKTA RCDNGCGTKNTVVDKGSKKTNPSQSPKIKIGQKLQNSVGVRCAITGKNTAECIGYVGKKN SVTIPPSLKYMGVTYQITSIKTKAFSGNKNLRSIVIPSSIRTIGSQAFFNCKNLRNITIK TPYLSKKTVGAKAFKGIHAKAKIKVSKKQKKAYQKLLKAKGAGKKVTIK >gi|222441889|gb|ACEP01000053.1| GENE 11 12109 - 13989 1354 626 aa, chain + ## HITS:1 COG:DRA0335_1 KEGG:ns NR:ns ## COG: DRA0335_1 COG0515 # Protein_GI_number: 15808039 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Deinococcus radiodurans # 12 285 4 271 459 199 42.0 1e-50 MIQTEPHHLRMGTRLNNRYLIQGVLGEGGFGITYVGMDEVLCQKVAVKEFFPRGAITRNN QQTNEVVSVYGTKAANFHQGEEKFLQEARTLAQFNNVAGVVRVQDFFRENGTAYIVMEYL EGITLKQYLQTYGPISVEEMQNIFAPILEALDKIHQNGVIHRDISPDNIMCLPEGEAKLM DFGAARDYTDYSEEGLSVILKMGFAPIEQYDSHGKQGPWTDIYALGATMYQCLTGRKPDD ATKRSLEDTLVSPSMLGVRIAPPVEYAIMRALQIRPADRYRNLGEFCENLYSVVSDNTVS VNMNTFTQPAGSYDWGSPSGTSSQSRQQGYPDVPPYTPPQPPVKTKNSTPLIVALCCTLT 
AVIMIGIGVVLFTNSKNSQTAGNTARAESGKESEKVAENTTAATQSTTESKSQDTQQSKT EQQAAATQAASTTEDTVDTQAVGADGNEEIFEDTDASADYSECLSLDNYTYLESPDGDYT FYYPQNVYNYAYYDGDSCIFTSDDENDTVKFIRKDNTSGSPAQGMENLYNSYTNLSQKEK ILFREETDERGFARGVLSGITDNQNVYASIAANDDYVYIMEVCTSFQHDSTKSTWVEYYI DCMYRYCSFSRRVGDPRSYEEYMEDQ >gi|222441889|gb|ACEP01000053.1| GENE 12 14542 - 15558 619 338 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026770|ref|ZP_03715962.1| ## NR: gi|225026770|ref|ZP_03715962.1| hypothetical protein EUBHAL_01022 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01022 [Eubacterium hallii DSM 3353] # 1 338 1 338 338 595 100.0 1e-168 MRCKFCGTTLPDHAKFCPGCGKEIDNDRERHESREASNIPEDAYYTNTNKKNSKRKRRHK KISVRFIIVYLIVVLTGGFCLYRMGFPEKVVEMLSKQSSVPDSKKEISQKSDKKKNNEKK TGKATTEKKKDKNKNKKTLESTESTLAYTDNSSMNVGACLSPEDYNTVTAKDDSFSFAYP KYLFNHSEVNEEGTSYTFSYKGNSVSDSNEAELSVYTQENEGDPLQNARELYQNFSSEVY KKYFKMYPSRVDSSGMARTLIGASADSSEKTGVYIIAANDGKRDYILKFTYPDPDMTNDY NEIDYVVDCVYRFCSFSGGTYQPRSYQQFLDDNMGSKK >gi|222441889|gb|ACEP01000053.1| GENE 13 15801 - 16697 998 298 aa, chain + ## HITS:1 COG:Cj1269c_2 KEGG:ns NR:ns ## COG: Cj1269c_2 COG0860 # Protein_GI_number: 15792593 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Campylobacter jejuni # 61 291 77 325 329 85 26.0 2e-16 MKVFKTIFLCVVLIFALAGCGNSHAEADKTVKQGSGQKTDGAVNASEESTKGAAESATEG VQVKESTAQESATESELDSDASHRADYVIVIDAGHQAHGNYEEEPVGPGASETKPKVASG TTGVSTGVEEYELNLEVAVKLQAELEDRGYQVIMIRDKNDVDISNSERAKVANDNNADAF LRIHANGSDNSTDAGMMTICQTAENPYNGELHEKSYALSEKILDSMVEATGANRERVWET DTMSGINWAKVPTTIIEMGYMSNPEEDEKLNSDDYQDEIVQGIADGLDQYFNIGGTED >gi|222441889|gb|ACEP01000053.1| GENE 14 16827 - 18197 1154 456 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026772|ref|ZP_03715964.1| ## NR: gi|225026772|ref|ZP_03715964.1| hypothetical protein EUBHAL_01024 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01024 [Eubacterium hallii DSM 3353] # 1 456 1 456 456 566 100.0 1e-159 MKDKSIKSTMKSWIMLSEKELDDTQVYMQEQEQQDQQEKDNSFSEAESGSGNVGSAKYES 
QVSDAGTETFSDAENGFERKLSREEQVDQLIKEAGEEVVDPEVDSVPRKRRKRHQNKKGK FEKGEARESIESEERFVSESTDESMSDTTREKESMASEVENAAELIKDNGNKSDKSSGIS KMLVGGVGVLLVLIIAGLGYQTQSLQKQNQKLKTEISTLKKEKKQAEQASEEQALQTKNE DGVVENTDKMFASVGVGDKEYKLLAKVSDFTKDGWKLEALRNENATLVPGGKLEKDAFLT NKDGRKIGVIIRNLSDKNQKVSDCIVTKLSFSKEMFDGQITLPGSLGFDSTEKELKAAGF KKDADGIYTYTSKKIKDDTIKAAMADGESVSEFTLERAVDESSTSTGNAAANTSASTAAG SATTTASDSSSAEQNSVTADTTLTNNSNSASDSAEE >gi|222441889|gb|ACEP01000053.1| GENE 15 18657 - 20540 2173 627 aa, chain + ## HITS:1 COG:SPy2066 KEGG:ns NR:ns ## COG: SPy2066 COG4690 # Protein_GI_number: 15675831 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Streptococcus pyogenes M1 GAS # 2 408 4 413 498 207 33.0 5e-53 MKKLMLTMLSVVGMMLFVALPAKACTSYYVGKDCTKDGTTMYGRTEDYSPKKDKVYKVIQ PKKVGKNATFKDETGATTFQAPIKVETTYRYTICHDSEGAEDGYFGEFGTNTKGVSMSAT TSASVAKTVGKFDPYIDAYESKVGGITEENLADYVLCQASSAREGVELLADVIDTVGAGE GDGLFIADQNEVWYFEILTGHNYCAIKMPSDKAAIIPNCFVIGDVDLSDKANVVASPNLV KLAKNNGFYVAAQDGKGDINVKLSYSGKGYAAHNADRIRGGQYLLSGQDNTGIYDADYQD PFFTCKNVTVEKMYELAGYRYEGMNFGRNISYRIGSRRTAEAHIFQINSSMPTELATVQW FSMASPDYSTFVPFYGALLTDVSKAYKTEAQQPNSRAAYWIFRNIGYLCEETNDGDGPNR ENYGKGVKQFYKAYMTKMEELQKNVNAQMLNVYKNDKKNLEYYATKLGIAIGNETMDFAK AMYMDIQTCKTNGTKYETSSLSADDIEYDLSMVTAPAKKADDTKPVTPAKPSAPARVQVR AKALKGKKVKVSLKKTAEAKGYEIVYSTNVNFTKKTTKKISTKNLTKTIKKLKKKKTYYI KARAYKLDGKTKVYGRWSLIRKVTIKK >gi|222441889|gb|ACEP01000053.1| GENE 16 20885 - 21352 469 155 aa, chain + ## HITS:1 COG:lin2694 KEGG:ns NR:ns ## COG: lin2694 COG2246 # Protein_GI_number: 16801755 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 31 155 21 145 145 78 37.0 4e-15 MKEKEKDIFDKIMALPGLNIFEGFYQKYKEMLLYLFFGGMTAVISIGSYSYCDVGLGFDP LIANIISWILAVTFAYVTNKVWVFSVETHGMHELFIEAFHFFTGRLFTLIVEEAILLIFI SKLHFNSIVVKVVAQVVVVVLNYIISKLIVFREKE >gi|222441889|gb|ACEP01000053.1| GENE 17 21327 - 22043 736 238 aa, chain - ## HITS:1 COG:HI0510 KEGG:ns NR:ns ## COG: HI0510 COG1720 # Protein_GI_number: 16272454 # Func_class: S 
Function unknown # Function: Uncharacterized conserved protein # Organism: Haemophilus influenzae # 10 232 4 233 239 186 44.0 3e-47 MKCEDTSVMLSIKPIAYIHSDFSEKFGIPRQSGLVDSLQAKIIFTEEFRNGDCIRGIEEF SHLWLIWEFSKNQRDKWTPTVRPPRLGGNKRKGVFASRAPFRPNPLGLSCVRLEKAVPNS PEGPILYVSGADLLDGTPIYDIKPYLSFADAHPDALGSFSTKALEHRLNVHCEDALLHLI PKEKRQSLLDCLSLDPRPSYQHDPERIYGMQYAGFDIHFSVNANDLFVTEVIPSHEKQ >gi|222441889|gb|ACEP01000053.1| GENE 18 22876 - 24345 1705 489 aa, chain + ## HITS:1 COG:FN1277 KEGG:ns NR:ns ## COG: FN1277 COG2195 # Protein_GI_number: 19704612 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 4 489 5 486 486 350 39.0 2e-96 MGVLNNLVPEKVFYYFEELCNIPHGSKNTKAISDYCVNFAKERGLAYHQDASNNIIILKN GSKGYEDSKPVIIQGHLDMVCEKEKDCTIDFEKDGLDLAIDGDYIYAKGTTLGGDDGIAV AMALAILDDDTLEHPPIEAVFTVDEEIGMLGAVAMDCSSLKSRLMLNLDSENEGHFLVGC AGGMMVECVFPAERRRWSGKVVSVHVNGVTGGHSGVEIQKQGANANVILGRVLYALKKKV AFNVICVDGGLKDNAITRSSRAILMLAPGEEEIFMDIIKEQQEIITKEYHKTDANMSIEG FLEVSKEDEASAAGKMPLTEACGNAVIAALRTFPWGVQKMSQDMEGLVQTSLNPGILATK EDEISLTFSVRSSVASEKEELYERLVCLTESLGGKVNRSSDYPAWEYVEDSKLRILMGDT YREMYKKEPVMEVIHAGVECGLFSEKMPGLDCVSYGPDIFDIHTPQERLSISSTKRIYEY TVEVLKRLK >gi|222441889|gb|ACEP01000053.1| GENE 19 24345 - 25334 828 329 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 10 329 4 311 311 295 45.0 6e-80 MEKQHKVQCYAAPLEGITGYIYRNAHQHIFQGVDKYFTPFVTPKPKKGLNTREKNDIAPE HNKNIPVVPQILTNKAADFIKVAGMMEELGYKEVNLNLGCPSGTVVSKKKGSGFLADLEG MKYFFDEVFSQVNIEVSVKTRLGVENPDEVIDLMNIYNAFPISEVIVHARVREDYYKRPV NQEAFAQCLAISKHPVCYNGDLFTVQDIENLTAKFPDIKAVMLGRGMIANPQLTEVFCGR EAHGDSTSVENRQGLDYVRWKAFLDELCYGYEHIMSGGRNVLFKLKEVWSYMITFFPESE KYGKKIKKAKTVEEYQRIVDTLFREMGIV >gi|222441889|gb|ACEP01000053.1| GENE 20 25355 - 27373 1787 672 aa, chain + ## HITS:1 COG:slr0535 KEGG:ns NR:ns ## COG: slr0535 
COG1404 # Protein_GI_number: 16332024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Synechocystis # 232 445 142 364 613 122 34.0 3e-27 MWKKTTAVLMICILCISFSFPVSNAAEISGTASDVTNYKQISSLVSDTWESDYFEKMIIS PGSDSMEKDGEQESVSDEFNVSVSKAKEITKTENSMDSYLDRQDGIYEVEKNDDGDLEVT APYQTKRIIVENDNVEDTYGASHVYVNKADGETILQYDAEEDTESAFEALKQTYGPSQCY LDKVVSLDETAMGLPLSQEIVDGVDSYSWGNDYMGMSKLKKEAASYGYTRKVTVAIVDTG INTSNRMFKGRTISSQSYNFFNGNKNVTDVFGHGTHVSGIIVDATPANVSLLVLRVANSK GQSSMLTIKTALQYAIAKKSDVINLSMGFIDANADLYNYLDSTIDKAYEKGIPISCAAGN QETGGIDVRYCYPANYSKTIAVSAIDSSGRLANYSNRGNGIDFAAPGTGIISADYKGNLT LRAMSGTSMAAPHITAAIAYLKMMQPNLSVKGVCRELELYCRNLGAKKYYGRGCPILTNL FKKGITNKKYIVILKPTLSSVSNKGSGIKVTWKKATGAASYYVYRRTNNGAWKRRAVLSA SSNSYIDRNVKQGKKYTYKVRAYNNGIFGRFSSEKKVYRLKTLTNIRVKNTSGRRAAVLW KKKTYATTYQVKYAANPSFNKARKVSANKKNSRLTTKKLKKKTYYFQIRYSYKKGGVKSW SAWSKVKKIKIR >gi|222441889|gb|ACEP01000053.1| GENE 21 27424 - 28062 668 212 aa, chain + ## HITS:1 COG:DR0704 KEGG:ns NR:ns ## COG: DR0704 COG1896 # Protein_GI_number: 15805731 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Deinococcus radiodurans # 18 157 4 144 198 101 45.0 9e-22 MNTEKEQKEIEEKWQTERLEKQINFIREMDKEKFIGRQTYLSDGKRKENDAEHAWHLALM TLLLSEYANEKIDTLKTMTMVLFHDVVEIDAGDTYAYDEEGKKTQAQREQKAAERLYGLL PEDQGAKLKAIWEEFEAKNTPESRFAHTMDNLQPVILNASTDGKAWKEHQVRLSQFMGRQ EDTPKGSETLWRYEWEKLVKPFLENGTIRGDE >gi|222441889|gb|ACEP01000053.1| GENE 22 28742 - 30172 949 476 aa, chain - ## HITS:1 COG:NMA0753 KEGG:ns NR:ns ## COG: NMA0753 COG1055 # Protein_GI_number: 15793728 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Neisseria meningitidis Z2491 # 13 476 6 469 473 333 43.0 3e-91 MKKLLIGLSAFLLFSPLHVFAAESGLSDTALGNTLPIWSCIPFAGMLLSIAIFPLIKEEW WEKHKPWVVAFWSLLFLIPFAIAFGGHVALEHLIEVTLGDYLTFIVLLFGLFCVAGNISL QGDLAGNPKVNVVLLLIGTLLSSWIGTTGASMLMIRPIIRANKWRQRKVQIMVFFIFLIS 
NIGGCLTPVGDPPLLMGFMNGVDFFWSLHLLPVMIINVIILLTLFFLIDTRAYKKDIAAG FRPEMKTEENRVKLHLDGAHNILFLLIIVAGVILSGVLPTTFPVFSEGLTIMGVTLSYAS IIEIAMILASAYLSFKTTKKEVRSSNHFTWDAIQEVAILFIGIFITMIPALLILKARGSS LGLTEPWQFFWITGALSSFLDNTPTYLVFFTTAASLGFSSGIQTLTSFIPQTILMAISCG AVFMGANTYIGNAPNFMVRSIAEENGIKMPSFFGYMLWSLSCLIPVFLIDMLIFFL >gi|222441889|gb|ACEP01000053.1| GENE 23 30420 - 33953 2754 1177 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2220 NR:ns ## KEGG: bpr_I2220 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 30 1120 20 1123 1195 343 23.0 3e-92 MSYHGKGKDMDVKEEISTIKQINSEATPIVSLSVTNLDIEVEEGKVYKDSFFIESENKVP IEGYVCTTSDKVETEAERLEGTRQEIPFYFKGKLATAGNTFEGDIVLITNGGEYNIPYCV KVIHKFVESSGGRISTMEEFVQIYENNRKEAVDIFFLPNFEEVFLQGKPEQKELYHSLMK SRSKSLILEEFLTAAGYKNTSFLDVSQDKLVLEEEDSSAQLELLLTQPGYVEGSIYSEKG QIQISRERFSSDDFTDGKLLFNVEKKQNLLNGSDIIHIDTVRQDIEISVEWWAKQTALTK EKEKKLYLKKKRAQLMHNYLYFRTGSIAFEDFAEDSGHVLEELSRYTEDNEWKLYRMHFF LMEERKEEGKELLELLETISQRDEFTPLEANYFLYLKAMYYRTPEAISKAVSTIREFYEL SEYKAEALWMLIYLDREYVYNKRLQYDTIRQLFEEGNNSSLLYFEACDILNENPNYMEEL GKFEISIFRWGVRYGYISLSLSYQFARLALKAKYYNKSIYYIAEKLYGVEPDEQFLQVIC SLLIKGNRTGKEYHEYFRLAVEANLKIIGLNEFFIRSMDFETYDLIPQRVLIYFTYSNSL DYLEKAYLYSNVLRNKEKYEEVYGAYYSKMLPFIEEQLLKGRINEHLAYLYTCFQKEVLE KPDNWKAVCDILFYHKLICTNPHIIGVYVSCPELGTEKYYPLSGGTGGVEIYNDRAVLYF VDNNEQRYVKDINYQLREFLSPQQFEEEWIRRNLSNRKILLMESGKMEENIKEQTLPVLQ RIVFNDDFTRWMQVEAVEKMLVYYENHQDKRNLARWLGKIDYSNISTEFRKTLMDYYMEV GMMEDAFFGIELYGCDIMGAAKILRLASFGAQCYQGKDDEATLSLAYAAFIRKKCNKDTL TYLMKHFKGETAALLEIWERSKRFGLETAAFERRILEQVMFTGNDTEGVFPVFESFYATD EGDELIDRYLEYASMKELESSMELNDFMHMAIGQEIVNGRMENRHSRIHFLYYFAGREDW YDTIKDTAVYIINGFLQEEFYLPVYQAYSQWITFPIHYQEMTFLTYHGRAGSNVILRYQI DGEEYSLRELHLEEVLPGMYVCHMNFFQKDHVKYRLESDGSPVDEGAGLEFETFEYGDGE ESRFFTLNHLDSEESSLPEVKDCLLKTFFADQYMKLL >gi|222441889|gb|ACEP01000053.1| GENE 24 33994 - 35097 851 367 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2219 NR:ns ## KEGG: bpr_I2219 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: 
not_defined # 173 358 240 426 433 110 32.0 1e-22 MEELETKKYAGIDLGRTSVQFSIYREGQEEMTEESFLLSEEEQKEYIESGMRQVERYMET GGLRWPDFQAVHFSMEDASEENRSKLKSAVSEELRKLHGVKVITHFRAFAEYVFHQERIM WDRNTLLLDYHDNQLSYVLIDQIRRSKQKAYRALQQRIDLNEYRVAEGTPEQDQNFGQMM KRFLVKNPANIIFLTGSGFEGNWMKKTLTYLCAGRRVFLGQNLYANGACLLGIHSIELMD EGMILMDGPDMVYHTVGVVTTEAGKPQYVPITSIGREWYNTHGSVDIILDKSQRVDFFYH NTKENEIEGAACDIKGLPKRPPKTTRIRIEVSFTSQTEGVILLKDMGFGEMFPATGKIIV FPFTLIS >gi|222441889|gb|ACEP01000053.1| GENE 25 35158 - 35988 842 276 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0459 NR:ns ## KEGG: Cphy_0459 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 273 1 276 278 88 26.0 4e-16 MRESLICIGKIGEKGYYFEDTGIQIFSYEELCYYMSQHMICYIHTLPGEDLLAYIRDELD LDKLSKQLSKLLDPEKDQMKYFAALFREGHYFNEDEIRDILDEYRGLMNAPVYLQKKWMG DLLTSSGRASRAIRYYQEALGQKEIKEEEVGRLYHNMGVAEAKLFRFENAKINFIKAYQY TGEESSLFYYYCIMALADGIEAAGEELKTFEDSDFLLDAFEEQFAAFEEDFAYSAMAEKY RKIVFLDENGKPEEALAKKQRLVSALKKDFRKEIDI >gi|222441889|gb|ACEP01000053.1| GENE 26 36150 - 38792 3222 880 aa, chain + ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 4 879 6 881 881 1130 60.0 0 MATEMNKTYNPSEIEDRLYKKWLDKKYFHAEVDRSKKPFTIVMPPPNITGQLHMGHALDN TLQDILIRFKRMQGYNALWQPGTDHASIATEVKVTNKLREEGIDKEELGREGFLKRTWEW REEYGGRIVSQLKKLGSSADWDRERFTMDEGCSKAVQEVFIRLYEKGYIYQGSRIINWCP VCQTSISDAEVEYEDQAGHFWHINYPIVGTDKCIEIATTRPETMLGDTAIAVHPDDERYK DLVGKMVLLPIVNKEIPIVADSYVDKEFGTGAVKITPAHDPNDFEVGKRHNLEEINILND DGTINENGGKFEGMDRYEARKAIVKELEEGGYLVRIENHEHNVGTHDRCHTTVEPMVKKQ WFVKMNEMAKPAIEAVKNGDLRFVPGHFDRTYLHWLENIRDWCISRQLWWGHRIPAYYCD DCGEIVVAKETPSVCPKCGCTHFTQDEDTLDTWFSSALWPFSTLGWPDKTEDLDYFYPTN VLVTGYDIIFFWVIRMVFSGYEQTGKCPFSDVLIHGLVRDEQGRKMSKSLGNGIDPLEII DQYGADALRLTLVTGNAPGNDMRYSEKKIIASRNFANKVWNASRFMLMNIEKADLSNVSL DDLTPADKWILSKANSLVKEVTDNMENYDFGVAVSKLNDFIWEEFCDWYIEMVKPRLYNE 
EDTTKAAALFTLKKVLTISLKLLHPYMPFITEEIFCSLQDEEESIMVSDWPVFEEAFDFK AEENEVEIIKNAVRNIRNLRADMNVPPSKKASVYVVSEKEEVRKVFEDSRVFFATLGYAS EVHVQADKAGIADDAVSTVIPDAVIYMPFAELVDVEKEIARLEKEAKRLEGEIKRAKGML SNEKFISKAPAAKVEAEKEKLEKYTSMAAQVAERLSQLKK >gi|222441889|gb|ACEP01000053.1| GENE 27 38816 - 40102 1250 428 aa, chain + ## HITS:1 COG:L0177 KEGG:ns NR:ns ## COG: L0177 COG0285 # Protein_GI_number: 15673139 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Lactococcus lactis # 2 423 1 422 427 224 34.0 2e-58 MMTYSEMEREINEIPKFGAKASLSNLSDYLELANHPERNLRVIHIAGTNGKGSVSAYIDS ILRQAGYTTALFTSPHLVKINERFRINFKECSDEELILAWCQVKSFMEKGEKQGLQPLTF FEILFLMGMILFSQKEIDYCILETGLGGRLDATVLSDPVLSIITSISYDHMEILGDTIEK IAAEKAGIIKNGIPVVAVDEENGAFPVIERTAKEKKSPVYGLKSQDLTILKKYENKIDFS INSRYYKISNLKVKSYASYQVQNAALAALAAHVLLPDLAENVIRNGILEMFWAGRMEEIA ENVYVDGAHNPGAVRQIYNSLADSDKEWLLLFAVCSDKDYTEMIRILGKIPWKRIYITKI DSARGADTAAVRQCFEEAAGCPICEFESAGEAFKAALRDRGDEKEENLLCLGSLYLVGEI KELAATMF >gi|222441889|gb|ACEP01000053.1| GENE 28 40118 - 40270 338 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026788|ref|ZP_03715980.1| ## NR: gi|225026788|ref|ZP_03715980.1| hypothetical protein EUBHAL_01040 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01040 [Eubacterium hallii DSM 3353] # 1 50 1 50 50 69 100.0 7e-11 MFDYDLELKKFKPAVGVEIGEDELFSDDIKDITDLVEELVAEREGKKASK >gi|222441889|gb|ACEP01000053.1| GENE 29 40290 - 41831 1971 513 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01610 NR:ns ## KEGG: EUBELI_01610 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 452 1 458 463 228 32.0 4e-58 MKCIKCGRPVGASRYCNACGFDNKHIAKALNTADYYYNIGLEKAQMHDLSGAEIYLKKAL NYNKSHKNARNLLGLVYNEMGEGGKAYIQWKISAKLSSVEENVANLYIRQMEEHPAVFEE INETAKKYNTALSYAKQGSDDLAMIQVKKVLSITPNFVNAHLLFALLHMRAGDNVSAQTD LNNALAIDRYNTTARKYLNEMGENPDTVPEKRPADALKPDNENLRNVRPVDHYEDPNKET WKQFVYMLIGLAIGVVAMFVLVIPSVKASVSVDYNNLKKEYSQTTNKKDAEISTLKDDKK SLQKKNKKLTKRLKVYEGSKGEDSMYDSILKASQAYSSGNYVECAKHLLKVDKDSLPSTT 
AKNLYTSMKDKAFQNAAAQLYNSGKASFDAYKYQDALDDLEQSYKYDKSYNTEYHIAMCY KYLNKNTDKAQEYFYDIINNSGDSELIRKAANLGLDMVINSAKEAAAKAKGGSTTTDSKS DTASSSDSSTSKKSSTKSTTEEDFGADTSNSNN >gi|222441889|gb|ACEP01000053.1| GENE 30 41956 - 42291 333 111 aa, chain + ## HITS:1 COG:BS_spoIIAA KEGG:ns NR:ns ## COG: BS_spoIIAA COG1366 # Protein_GI_number: 16079404 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Bacillus subtilis # 5 97 8 100 117 73 36.0 6e-14 MGCCHVKNQCLVIRLPKEIDHYQAEKVRRECEQSFMKFIIRDIIFDFSDTSFMDSSGIGL VLGRVRKIHPINGKVYLFGGNELIQKMWEMAGILNLVTVLDSIEKVKEVYE >gi|222441889|gb|ACEP01000053.1| GENE 31 42288 - 42749 446 153 aa, chain + ## HITS:1 COG:CAC2307 KEGG:ns NR:ns ## COG: CAC2307 COG2172 # Protein_GI_number: 15895574 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Clostridium acetobutylicum # 9 141 4 136 143 125 51.0 3e-29 MKQAGENSNSLKMVFRNRPENEKIVRTTAAVFASVLDPTLEEISDFKTAVSEAVTNAIIH AYPKTGGDIEAYFKREDKKITVLITDYGIGIKDVGKSMQPLYSTLHTQERSGMGFTFMEA FADEVKVKSVPGKGTTVKLIKIIGKGEECGENL >gi|222441889|gb|ACEP01000053.1| GENE 32 42733 - 43470 618 245 aa, chain + ## HITS:1 COG:BS_sigF KEGG:ns NR:ns ## COG: BS_sigF COG1191 # Protein_GI_number: 16079402 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 6 234 21 247 255 209 48.0 3e-54 MERTCELIKQAKEGNKEAQSILVEENSGLIWSVVKRFQGRGYDKEDLFQIGSIGLLKCIE NFDLERNVKFSTYAVPLIMGEIKRFLRDDGIVKVSRSLKEASYKIKREKEHYQKLYNREP TLKEIAVTLDMDESDILMAMESGQEVCSLHQVIYQSEGDEIHLEDKLEQQADLIEQTVDN IYVQELLKKLNEQERQIIHMRYFQNQTQAAIAKKIGISQVQVSRMEKKILNKLRKNANTK ENGNL >gi|222441889|gb|ACEP01000053.1| GENE 33 43481 - 44146 481 221 aa, chain + ## HITS:1 COG:no KEGG:Closa_1563 NR:ns ## KEGG: Closa_1563 # Name: not_defined # Def: stage V sporulation protein AA # Organism: C.saccharolyticum # Pathway: not_defined # 4 212 3 208 208 139 35.0 6e-32 
MEKKIIYLKAEQSSYVNHEKIHIGDIASVFCEDKEIEKKIKNIVLYEFDEKNEKEGRVFL SILLLIEKISEQIPYGEVRNTGETDMVIYYKAEELKSKKWVQVIKILFICATCFFGAGIT VMGYNNDVDMVKVFGQLYKVFLGTKPDDPTFVELFYSVGLALGIFLFFNHVPGKKVTNEP TPIQVQMRLYEQDVNRTFLLGASRKGEELDVSGNNDSANNK >gi|222441889|gb|ACEP01000053.1| GENE 34 44109 - 44561 355 150 aa, chain + ## HITS:1 COG:no KEGG:Aflv_1006 NR:ns ## KEGG: Aflv_1006 # Name: spoVAB # Def: stage V sporulation protein AB # Organism: A.flavithermus # Pathway: not_defined # 10 150 2 137 140 77 34.0 1e-13 MCLGIMTVPIINKVILGWVAFGAGFAVAGGFIAFISLIGIVTRLAGLTKTADAIPTYENS MALGLIFFNLVSLYQPDLQWISYTAALSIINIVGLFTGIFAGCLAGALAEVVNIIPIFSR RIKLRKGFPYMVKAAAVGKCIGCLIQFYVF >gi|222441889|gb|ACEP01000053.1| GENE 35 45244 - 46542 562 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 17 428 26 424 425 221 37 8e-57 MEERTKALLVGVNLNNDPEFETALKELESLAEACNMEVVGVETQNVSQINTGVYVGTGRV EEIKAVAHMMGAEVIIFDNTLSPMQLRNLKDIIERPVFDRTHLILQIFSSRARTREAQIQ VETARLQYELPRLTGMGEILSRQGGGSGGLSNKGAGEKKLELDKRKIRHRISELKKELRE VEKNRETQRKRRLVQGIPQVALVGYTNAGKSTLLNAFIDKYEENEEKKEDRKVMAKNMLF ATLDTTVRKIHLPDKREFLLSDTVGFISKLPHNLVEAFHSTLEEVKFANLLLEVVDYSDE HYMDHMEVTRQTLKELGADEIPCIHVFNKCDIAKANGREDVPAELPHIGKDCIYMAAGQN IGLEELVQLISNHIYQDYVECTMLIPYTEGALVSYFNENATVKETEYEAEGTKITMSCLL KDLKKYKEYVIL >gi|222441889|gb|ACEP01000053.1| GENE 36 46834 - 47655 535 273 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0680 NR:ns ## KEGG: CLL_A0680 # Name: not_defined # Def: ABC transporter permease protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 273 1 270 270 92 29.0 2e-17 MLGKLFKYEWRSISKLLLPIHGFVLLFALLSRFYFTISGGTDALLNTDSTIIGTLTMLLI FALVIVISSIAIFTYIYSGYHFHKNVFTDQGYLTNTLPVTPSQLLLSKELAALLWLLIDV VVISISIFILVGSTELFSNFSIFWSTLIRYASQTPLFTTLIIIALVLSPFLAIGILFFSI TLGNLASSHKVLASIGAYVGIYVVQQIFGLIQLVVWGYFGSTTIMRVNIYSNNYSFETFL NPILITGLIFNIILIAVCWMGSKYIMTKRLNLQ >gi|222441889|gb|ACEP01000053.1| GENE 37 47649 - 48347 230 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| 
Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 225 1 234 305 93 28 3e-18 MSTILECKNLSKKYGRKQALDNITLSLESGKIIGLLGPNGSGKTTLIKLINGLLAPTGGK LFINGNTPGVESKKIVSYLPERTYLDETQKVSEAITLFEDFYKDFDRSRAISMLEKLNID SNARIKTLSKGTKEKVQLILVMSRRAKLYCLDEPIAGVDPAARDYILSTIINNYEPDSTI LISTHLISDVENILDEVVFIKNGHIVLKNDVEDIRFNQGKSVDVLFREVFKC >gi|222441889|gb|ACEP01000053.1| GENE 38 48344 - 48715 464 123 aa, chain - ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 109 1 109 121 117 49.0 6e-27 MSWNLDSSRPIYAQIIEKVSLDIVSGKYQPGDKLPSVRDLAAQAGVNPNTMQKALSELER ENLVHSARTSGRFITEDKAMIEKMREELASTQIKEFLNKMSQMGFDYEKTITLLEKLGKE NQK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:18:03 2011 Seq name: gi|222441888|gb|ACEP01000054.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont60.1, whole genome shotgun sequence Length of sequence - 62282 bp Number of predicted genes - 59, with homology - 59 Number of transcription units - 35, operones - 14 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 1351 645 ## COG0534 Na+-driven multidrug efflux pump 2 1 Op 2 . - CDS 1390 - 1845 344 ## Elen_1297 transcriptional regulator, MarR family - Prom 1878 - 1937 5.1 3 2 Tu 1 . - CDS 2511 - 2870 463 ## COG2033 Desulfoferrodoxin - Prom 2944 - 3003 8.3 + Prom 2858 - 2917 8.8 4 3 Tu 1 . + CDS 3140 - 4312 1264 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Prom 4379 - 4438 6.8 5 4 Op 1 . + CDS 4476 - 5960 1594 ## COG0466 ATP-dependent Lon protease, bacterial type + Prom 5966 - 6025 4.4 6 4 Op 2 . + CDS 6045 - 7034 1121 ## COG2358 TRAP-type uncharacterized transport system, periplasmic component 7 4 Op 3 . + CDS 7054 - 8241 1447 ## COG0786 Na+/glutamate symporter + Term 8305 - 8336 -1.0 + Prom 8303 - 8362 7.9 8 5 Tu 1 . 
+ CDS 8383 - 8577 326 ## gi|225026810|ref|ZP_03716002.1| hypothetical protein EUBHAL_01062 - Term 8796 - 8839 -0.6 9 6 Tu 1 . - CDS 8848 - 9771 704 ## COG3965 Predicted Co/Zn/Cd cation transporters - Prom 9806 - 9865 12.8 + Prom 9746 - 9805 8.6 10 7 Tu 1 . + CDS 9899 - 11263 1087 ## COG0534 Na+-driven multidrug efflux pump 11 8 Tu 1 . - CDS 11872 - 12891 1088 ## COG0385 Predicted Na+-dependent transporter - Prom 13064 - 13123 5.4 + Prom 13356 - 13415 6.4 12 9 Tu 1 . + CDS 13612 - 13869 298 ## gi|225026817|ref|ZP_03716009.1| hypothetical protein EUBHAL_01069 + Term 14000 - 14030 1.1 13 10 Tu 1 . - CDS 14573 - 15583 923 ## COG1609 Transcriptional regulators - Prom 15608 - 15667 7.7 + Prom 15712 - 15771 5.0 14 11 Op 1 4/0.000 + CDS 15930 - 17204 1489 ## COG0153 Galactokinase 15 11 Op 2 . + CDS 17228 - 18724 1601 ## COG4468 Galactose-1-phosphate uridyltransferase 16 12 Tu 1 . - CDS 18979 - 19698 859 ## COG4816 Ethanolamine utilization protein - Prom 19722 - 19781 5.0 - Term 19752 - 19800 4.1 17 13 Tu 1 . - CDS 19916 - 20587 641 ## COG2357 Uncharacterized protein conserved in bacteria - Prom 20628 - 20687 11.9 + Prom 21255 - 21314 6.2 18 14 Op 1 39/0.000 + CDS 21401 - 22282 1270 ## COG0226 ABC-type phosphate transport system, periplasmic component 19 14 Op 2 38/0.000 + CDS 22297 - 23151 1093 ## COG0573 ABC-type phosphate transport system, permease component 20 14 Op 3 41/0.000 + CDS 23172 - 24083 943 ## COG0581 ABC-type phosphate transport system, permease component 21 14 Op 4 32/0.000 + CDS 24144 - 24896 273 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 22 14 Op 5 . + CDS 24898 - 25542 827 ## COG0704 Phosphate uptake regulator 23 15 Op 1 . + CDS 25942 - 27741 1883 ## COG1283 Na+/phosphate symporter 24 15 Op 2 40/0.000 + CDS 27744 - 28415 823 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 25 15 Op 3 . 
+ CDS 28427 - 29740 931 ## COG0642 Signal transduction histidine kinase 26 15 Op 4 . + CDS 29834 - 29977 63 ## gi|225026832|ref|ZP_03716024.1| hypothetical protein EUBHAL_01084 + Term 30137 - 30180 1.4 + Prom 29995 - 30054 5.9 27 16 Op 1 16/0.000 + CDS 30208 - 31092 1348 ## COG0214 Pyridoxine biosynthesis enzyme 28 16 Op 2 . + CDS 31094 - 31669 719 ## COG0311 Predicted glutamine amidotransferase involved in pyridoxine biosynthesis 29 17 Op 1 . + CDS 32289 - 33377 1155 ## COG3608 Predicted deacylase 30 17 Op 2 . + CDS 33404 - 34723 1248 ## COG0534 Na+-driven multidrug efflux pump - Term 34778 - 34819 1.1 31 18 Op 1 . - CDS 34932 - 36263 1092 ## COG3610 Uncharacterized conserved protein 32 18 Op 2 . - CDS 36344 - 37147 711 ## lin0468 hypothetical protein - Prom 37176 - 37235 10.8 33 19 Tu 1 . - CDS 37285 - 38364 792 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family - Prom 38509 - 38568 10.4 34 20 Tu 1 . + CDS 39317 - 40855 1807 ## COG1070 Sugar (pentulose and hexulose) kinases + TRNA 41079 - 41168 64.0 # Ser CGA 0 0 + Prom 41094 - 41153 80.4 35 21 Op 1 . + CDS 41296 - 43959 2356 ## COG0480 Translation elongation factors (GTPases) + Prom 44006 - 44065 4.7 36 21 Op 2 41/0.000 + CDS 44104 - 44817 174 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 37 21 Op 3 . + CDS 44830 - 45747 1274 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component + Prom 45798 - 45857 6.8 38 22 Tu 1 . + CDS 45893 - 46165 362 ## COG1937 Uncharacterized protein conserved in bacteria + Term 46245 - 46290 7.5 + Prom 46305 - 46364 10.6 39 23 Op 1 . + CDS 46411 - 46617 292 ## gi|225026845|ref|ZP_03716037.1| hypothetical protein EUBHAL_01098 40 23 Op 2 . + CDS 46652 - 47218 385 ## COG1309 Transcriptional regulator + Prom 47309 - 47368 4.4 41 24 Tu 1 . 
+ CDS 47452 - 48177 1121 ## COG0580 Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) + Term 48221 - 48258 7.3 - Term 48270 - 48305 -1.0 42 25 Tu 1 . - CDS 48412 - 48765 371 ## gi|225026848|ref|ZP_03716040.1| hypothetical protein EUBHAL_01101 - Prom 48791 - 48850 12.0 + Prom 48747 - 48806 6.1 43 26 Op 1 . + CDS 48933 - 49124 291 ## gi|225026849|ref|ZP_03716041.1| hypothetical protein EUBHAL_01102 44 26 Op 2 . + CDS 49190 - 49630 497 ## COG1959 Predicted transcriptional regulator + Term 49634 - 49683 10.0 45 27 Tu 1 . + CDS 49716 - 50648 781 ## COG4905 Predicted membrane protein + Prom 50841 - 50900 4.7 46 28 Op 1 . + CDS 50923 - 51606 330 ## EUBREC_1314 hypothetical protein 47 28 Op 2 . + CDS 51677 - 52207 552 ## COG4917 Ethanolamine utilization protein + Term 52292 - 52328 1.0 - Term 52330 - 52389 22.2 48 29 Tu 1 . - CDS 52431 - 52973 926 ## COG1592 Rubrerythrin + TRNA 53269 - 53341 82.0 # Thr CGT 0 0 - Term 53352 - 53400 10.4 49 30 Tu 1 . - CDS 53407 - 53802 340 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 50 31 Tu 1 . + CDS 54163 - 54396 530 ## COG0236 Acyl carrier protein + Term 54447 - 54489 5.4 + Prom 54523 - 54582 7.3 51 32 Op 1 . + CDS 54694 - 55677 1513 ## COG1077 Actin-like ATPase involved in cell morphogenesis 52 32 Op 2 . + CDS 55688 - 57898 1939 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 53 33 Tu 1 . - CDS 57945 - 58763 291 ## COG1040 Predicted amidophosphoribosyltransferases - Prom 58783 - 58842 5.6 54 34 Tu 1 . - CDS 58870 - 59307 540 ## COG0756 dUTPase - Prom 59448 - 59507 7.1 + Prom 59826 - 59885 5.9 55 35 Op 1 . + CDS 59928 - 60077 241 ## PROTEIN SUPPORTED gi|160881814|ref|YP_001560782.1| ribosomal protein L33 56 35 Op 2 . 
+ CDS 60105 - 60302 221 ## EUBREC_0372 hypothetical protein 57 35 Op 3 45/0.000 + CDS 60324 - 60848 636 ## COG0250 Transcription antiterminator 58 35 Op 4 55/0.000 + CDS 60918 - 61343 624 ## PROTEIN SUPPORTED gi|240143818|ref|ZP_04742419.1| ribosomal protein L11 59 35 Op 5 . + CDS 61410 - 62099 939 ## PROTEIN SUPPORTED gi|160881810|ref|YP_001560778.1| ribosomal protein L1 + Term 62140 - 62204 23.9 Predicted protein(s) >gi|222441888|gb|ACEP01000054.1| GENE 1 2 - 1351 645 449 aa, chain - ## HITS:1 COG:BH4045 KEGG:ns NR:ns ## COG: BH4045 COG0534 # Protein_GI_number: 15616607 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 7 448 3 446 447 278 40.0 2e-74 MYQGVFMKIQLSEHFTYKKLLRFVLPSIIMMIFTSIYSVVDGLFVSNFVGKTALAAINLT LPIIMGLSALGFMIGTGGSAIVARMLGEKKREKANEYFSMLIYVTAIGGILLSILGTIFI PAIASFLGAKGQLLSNCILYARLSFISMPAFMLQNVFQSFFVTAEKPHLGLYVVIAAGVT NMVLDFLFVGILGFGLAGAALATVCGEFIGGLFPIFYFTRKNSSLLRLGKTHWNGHVLLQ TCINGSSELMTNLSSSIMNSLYNIQLMKFAGENGVAAYGTMMYINFIFLAIFFGYSIGSA PVISYHYGAGNQDELKNLFRKSLCLIGTWGIMLALLSQLLAAPLSKLFVGYDAELFAMTE HGFRIYCLAYLVNGFNIFGSSFFTALNNGVISAAISFLRTLVFQIIMILLLPTLLGINGI WSAVGIAELMTLCVTITFFIKQRKKYHYV >gi|222441888|gb|ACEP01000054.1| GENE 2 1390 - 1845 344 151 aa, chain - ## HITS:1 COG:no KEGG:Elen_1297 NR:ns ## KEGG: Elen_1297 # Name: not_defined # Def: transcriptional regulator, MarR family # Organism: E.lenta # Pathway: not_defined # 16 151 10 145 151 117 41.0 2e-25 MKDNFSTTPYMSSNLREYNRIYKEVNDIYRDAASKFGLSNSVFDILYTICEVGEGCLQKD VCDATFIPKQTVNSAIRKLEQEGYLTLSNGKGHSKHILLTESGHTLLKETIFPIVEAENE AFTELSFEECNLLLKLHSKYTTALREKFSKL >gi|222441888|gb|ACEP01000054.1| GENE 3 2511 - 2870 463 119 aa, chain - ## HITS:1 COG:TP0823 KEGG:ns NR:ns ## COG: TP0823 COG2033 # Protein_GI_number: 15639809 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Treponema pallidum # 19 119 24 128 128 99 47.0 2e-21 MAKFYRTEDRTVLLELFGAGSSVENLKNLKANTTDAAVEKHVPVVTQEGNKITVAVSSVE 
HPMLPEHYIMGVYIETKNGGQLHKFQPGDTPKATFTLADGDEFVAAYEYCNLHGLWKDK >gi|222441888|gb|ACEP01000054.1| GENE 4 3140 - 4312 1264 390 aa, chain + ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 120 375 21 269 277 217 41.0 3e-56 MDITINELGQLTPDSYILLDMRGDVEVGHGIIPGAIHMSKEEILEKYSGGLVKADEAEAA EREDSAEKKLIIYCARGRISQELAEELRDRGYDAYSLKGGYTSWLLNEMKNQQADEVCAQ VEKSIRKKFRKNIWCKFTKAINQYELVKEGDCIAVCISGGKDSMLMAKLFQELKLHNKFP FEVKFVVMDPGYSPENRKVIEENARKLKIPIHIFESDIFESVYHIEKSPCYLCARMRRGY LYNFAKELGCNKIALGHHYDDVIETILMGMLYGAQIQTMMPKLHSTNFEGMELIRPLYLI REDDIKAWRDYNDLRFIQCACKFTDTCTTCNNEENQSKRVEIKQLIAEIKKKNPYVEAHI FKSVENVNIETVIAYKKDGVKHHFLDNYDL >gi|222441888|gb|ACEP01000054.1| GENE 5 4476 - 5960 1594 494 aa, chain + ## HITS:1 COG:CAC0456 KEGG:ns NR:ns ## COG: CAC0456 COG0466 # Protein_GI_number: 15893747 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Clostridium acetobutylicum # 103 461 171 539 786 157 29.0 5e-38 MPVFDFSNMQAKKEPVRCTYAMVYSDNDHASPENPILVIDNQKREWKHHSFGIFANPNKQ TTFEFEEEDGTISADILKVDARFVSLLKWLGEHHIHVQLSGKNVEEGYAVYKIQEISFGG RTKLSAEDGFLQFMIERLFASPAPAEEVTEEVEEEDSDEMKLTSLQSITDFMTCAGRTLP DNIRLWARRNLAVARSREVSPEERRHAQRALSIMMNIQWKSDYFKSIDPEEARKILDEEL YGMERVKQRIIETIIQINRTHTLPAYGLLLVGPAGTGKSQIAYAVARILKLPWTTLDMSS INDPEQLTGSSRIYANAKPGIIMDAFSMAGESNLVFIINELDKANSGKGTGNPADVLLTL LDNLGFTDNYMECMIPTVGVYPIATANDKSQISSPLMSRFAVIDIPDYTPEEKKAIFSKF ALPKVLKRIGLREGECIMTDDALDAVIDIFSDTTGIRDLEQAAEHIVANALYQIEVDHLE QVTFTGEMVRELLL >gi|222441888|gb|ACEP01000054.1| GENE 6 6045 - 7034 1121 329 aa, chain + ## HITS:1 COG:AF0635 KEGG:ns NR:ns ## COG: AF0635 COG2358 # Protein_GI_number: 11498243 # Func_class: R General function prediction only # Function: TRAP-type uncharacterized transport system, 
periplasmic component # Organism: Archaeoglobus fulgidus # 34 317 46 328 330 159 36.0 8e-39 MKKRWFALCMCIVMLAGILTSCGTNTGKIKFGAAGLGGTYRVFGDTFANLVTSKNKKYKM EVKTTAGSAANLRLLSDGYIQMAVAQTDLTNDAYERTGIFENEKQHGGYSAVAALYTEAC QIVVKADSSINTVEDLQDKRVSVGEEESGTEQNAKQILAAYGLNDSLVDEVNLDYSNAAE ELREGKIDAFFCTAGAQTTVIGELAKKCDIRLLSIDEKSAEKLKGTYKFYTDCTIPKETY NGQTEDVKTVGVKAVLLASDKLSADTVKDITKILFENKQELQYALPVDISLDEKSAVEGI TIPFHKGAVAYYEECGMDAAELTTEKESN >gi|222441888|gb|ACEP01000054.1| GENE 7 7054 - 8241 1447 395 aa, chain + ## HITS:1 COG:FN0793 KEGG:ns NR:ns ## COG: FN0793 COG0786 # Protein_GI_number: 19704128 # Func_class: E Amino acid transport and metabolism # Function: Na+/glutamate symporter # Organism: Fusobacterium nucleatum # 4 395 5 394 399 333 48.0 4e-91 MKIQLDMYQTIAVAVVVLMLGKFLKERVELLERFCIPAPVIGGVIFAIFTCLCYVTGIAE FSFDDILKEVCMVFFFTSVGFQANLKVLKSGGRALIVFLGLVITLILCQNFLAIGLAKLL HISPLVGLCTGSIPMIGGHGTAGAFGPVLEDFGVKGASTLCTAAATFGLIAGSVMGGPVG KRLIEKKNLLDTVVVEDDSLLIEDERKHERHASMYPAAVFQLIIAIGIGTIISKLLSLTG MTFPIYIGAMIAAACMRNIGEYSGKFTIYMGEINDIGGISLSLFLGIAMITLKLWQLADL ALPLITLLAGQTILMFLYTYFVVFNVMGRDYDAAVLSSGVCGFGMGATPNAMANMQAVCE KYAPSVKAFLLVPLVGSLFADFLNSLAITFFINFL >gi|222441888|gb|ACEP01000054.1| GENE 8 8383 - 8577 326 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026810|ref|ZP_03716002.1| ## NR: gi|225026810|ref|ZP_03716002.1| hypothetical protein EUBHAL_01062 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01062 [Eubacterium hallii DSM 3353] # 1 64 40 103 103 109 100.0 9e-23 MFGKKKKVKHAERHVDYKVSSENLAMPELHIGEDKGEEQPEKKDDGFITYENVAIPEVHI KKRK >gi|222441888|gb|ACEP01000054.1| GENE 9 8848 - 9771 704 307 aa, chain - ## HITS:1 COG:BH0427 KEGG:ns NR:ns ## COG: BH0427 COG3965 # Protein_GI_number: 15612990 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Bacillus halodurans # 15 265 10 258 301 89 22.0 1e-17 MHVGRKTQKREKSAMSVSLYGNLLFVVIELIMAIVTSSQAVLLDAVYDGVEFVMLLPSLF LIPFLYRPANEQHPFGYTQIETLFIVIKGITMTAVTFGLIFNNINLMIHGGHIISFNTVA 
YFELFACVLGIIVTVYLYIKNKTMQSPLINMEMQGWRMDSVVSFGMACAFLLPMFIPFAW FQKFVPYLDQIITVVLSLIMIPAPIHTIITGFRDLMLIPPEEETIEDIKATVEPIIGIYG HKNLYYDIVRTGRKLWVSVYITFDKDIVSLSKFKFLQDECILALAKKYPDFYFELLPDIE FTGLERV >gi|222441888|gb|ACEP01000054.1| GENE 10 9899 - 11263 1087 454 aa, chain + ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 6 429 2 425 447 267 37.0 4e-71 MSKTTINDMTIGSPAKLIIKFMIPMCLGNLFQQFYNVVDSIVAGQFLGVQALAAIGSTGS LMFFVIGWLNGLTSGFAIMVAQSFGAKQYDRMRHYVAMSIYLSVAFALVMTIGFSVANEQ ILRLMNYSDEIMPSVKGYMGIIYMGLLVTVAYDDLSAFLRALGDSRSPLYFLMISAGINV VLDIVLIVVMGMGVEGCAYATVIAQAISALLCFIYICKKFPILRLKKTDFQIALKSLGKL LGLGIPMALQFSITAIGTIIVQGAINIYGENCMAGFSAAGKLQNIFMTVFVAFGATIATY VGQNRGAGKMDRVKQGVRCTQYMIWGWSVVNMIAMFFFGKYMTYLFISPSETEVIKVAVT YFHTVFWFYPFLGSIFLYRNTLQGMGYGLIPMIGGIFELVARTAIVTLVAGKTSFAGVCM ADPAAWLAALVPLLPYYFYIMRKWKNSDLLNWQN >gi|222441888|gb|ACEP01000054.1| GENE 11 11872 - 12891 1088 339 aa, chain - ## HITS:1 COG:NMA0046 KEGG:ns NR:ns ## COG: NMA0046 COG0385 # Protein_GI_number: 15793077 # Func_class: R General function prediction only # Function: Predicted Na+-dependent transporter # Organism: Neisseria meningitidis Z2491 # 1 329 2 316 318 243 48.0 6e-64 MKQLQKFSKFLSDYTSIVVIAIAVITFFLPSLMGWVNFQLFTDPVANKFTSQSIIIGVIM FSMGLTLTTEDFKILAQRPFDICIGAIAQYLIMPFLAFFITKLLHLPTGIALGLILVGCC PGGVSSNIMSYLCGGDVAFSVGMTTVSTILSPVMTPLMVSLLASGAKITIHGLPMFVSII ETVIVPVAIGFVLNYALGKNKTFKELQKVMPGVAVLGLACVVGGVISSQGSKFFQSGAVI FIAVLLHNGLGYFLGYCAGRLTGMNTAKKRTISIEVGMQNAGLATNLATTTAQFASTPES AIICAVSCVWHSISGTLLAGLFAQYDKMKATHKKTATAK >gi|222441888|gb|ACEP01000054.1| GENE 12 13612 - 13869 298 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026817|ref|ZP_03716009.1| ## NR: gi|225026817|ref|ZP_03716009.1| hypothetical protein EUBHAL_01069 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01069 [Eubacterium hallii DSM 3353] # 1 85 15 99 99 158 100.0 1e-37 
MFNFLNSESLWIGTDMKKFNEIRTQLEKEGIPYKYKTKDHLSQTMYPAEGTLRSRTGSFG NKPDQMIEYEILVHKKDYEKVRGKF >gi|222441888|gb|ACEP01000054.1| GENE 13 14573 - 15583 923 336 aa, chain - ## HITS:1 COG:BH2227 KEGG:ns NR:ns ## COG: BH2227 COG1609 # Protein_GI_number: 15614790 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 1 329 1 333 347 202 36.0 8e-52 MATLKDIAKRAGVSSATVSRILNQDETLSVTPQTRERVQEIARELNYKKKSSPSSKTVIG IFQWVTLFQELEDPYYQAIRSGIERYCMTENLEIRRAFQSDPDYMNTLRGVQGLICIGKF NDKQIQLFESITPNVIFVDMQTSKINCNTISLDFEQAVVDALNYLTSLGHTSIAYLGGKE YLNDDTVYFEQRKDTFIRYCKEHQISYEPYLKETEFSADAGYQMMMELIEAGTLPSAVFA ASDPIAIGAMRALYQKGYRIPEDISVMGFDDINVAKFSNPPLTTVHAPADFMGQFAAHYI QLLAESNTKLSYNMPVRMTLPCEITIRDTCGASSDL >gi|222441888|gb|ACEP01000054.1| GENE 14 15930 - 17204 1489 424 aa, chain + ## HITS:1 COG:VC1595 KEGG:ns NR:ns ## COG: VC1595 COG0153 # Protein_GI_number: 15641603 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Vibrio cholerae # 42 389 30 364 405 118 28.0 3e-26 MKTTELKKGFQDGKYEELLRDIYIDESVLDYQKERYIKAIESFEELFGEKEVEIYSAPGR SEIGGNHTDHQLGRVLAASINLDAIAIVAKKENGVVLKSEGYPMINVSLADLLPKKEEEG TSEGLIRGVAAKLKEEGYEIGGFEAYVTSDVLNGAGMSSSAAFEVLTGNILSGLYNEGKV SPVLIAQAGQYAENVFFGKPCGLMDQMASSVGNLIFIDFADVKNPVIKKVNVNFEDFDHS LCIIDTKGSHADLTDEYAAIPEEMKKVAAYFGKEVLHQIDKNEFYIHIPEIRKVAGDRAV LRAMHWFEETDRVIDQVNALEKEDFEGFKSLIKASGDSSFKYLQNVYSVKNLSRQEMAVG LAFSDVILKGRGVSRVHGGGFAGTIQAFVPNDIVDIYKKNMEDIFGQDACHVLKIRKYGG MKVL >gi|222441888|gb|ACEP01000054.1| GENE 15 17228 - 18724 1601 498 aa, chain + ## HITS:1 COG:CAC2961 KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 3 498 2 497 497 546 54.0 1e-155 MGLTKNIKKLVHYGIGAKLIQPEDEIFMINQYLDLFGLDEYDNPDIFGEKMVLADILNEL TDIAFEKGIIQSDDIVTRDLFDTKLMGIMTPRPGTVKKIFDAYYEKNPKYATDYFYELSQ NSNYIRKDRIQKDMKWTVDSPYGVIDITINLSKPEKDPKAIAAAKNAKQSAYPKCQLCIE 
NEGYAGRMNHPARQNHRIIPITIHGSRWGFQYSPYVYYNEHCIVLNGQHTPMKIDRDTFA KLFDFIRQFPHYFVGSNADLPIVGGSILTHDHFQGGHYTFAMERAEMEKEFKVPGYEDVT AGIVHWPLSVIRIRHKDPERLIDLADHILKKWRNYTDPDAFIFEQTNGEPHNTITPIARR RGEEYELDLTLRNNITTEERPLGVYHPRPEYHHIKKENIGLIEVMGLAVLPSRLKKEMES LADCLVSGKDVSSVGGLEKHADWVKDFLPKYAEITQENVWDILKEEIGEVFVKVLEDAGV YKCTEEGRKAFERFINTL >gi|222441888|gb|ACEP01000054.1| GENE 16 18979 - 19698 859 239 aa, chain - ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 12 227 44 259 267 201 49.0 1e-51 MSHAEFTPTTNEFIGTAAGHTIGMVIPNIDPEVRTLLKIPEQYSSLGILTSRTGAAAQAF SVDEAAKACNVELLIFELPRDTEGYSGHGNLIVLGAYYVSDARRAVEVALTLIDEKAERI YINEVGHMEMHVTANAGPVLHQIFDAPLGKAFGFICAGPAGIAIVAADTAVKSSPVDIVW YGTPSINLSHTNEVIIGVSGDYGAVKKAVDIAYEKASALIEIFGQKPASILHFTQNPDL >gi|222441888|gb|ACEP01000054.1| GENE 17 19916 - 20587 641 223 aa, chain - ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 43 223 33 216 217 134 43.0 2e-31 MKKKKVKPVKQMGSTGTKLQKEVAAMEEAMKQHHIEEQCQLLIDEMMPQIKRLGRSLSGT YHRNIIEYTTSRIKSPESIVEKLHRKNREVSLDKAVATLRDLAGIRVICSFQDDVYRVAG AIKNLPGYELVKEKNYINKPKASGYRSIHLILRPANTAYDINYIEIQVRSAAMNYWAILE YQLCYKNEKKGAERIRKELKECAIDIAKIDKKMLKLRKEIEKI >gi|222441888|gb|ACEP01000054.1| GENE 18 21401 - 22282 1270 293 aa, chain + ## HITS:1 COG:SP2084 KEGG:ns NR:ns ## COG: SP2084 COG0226 # Protein_GI_number: 15901900 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 2 291 4 290 291 187 42.0 1e-47 MKKFISFALIAAMAVTTLVGCGNKGAKADGGSSDAAITVMSREDGSGTRGAFIELFGIEE EENGEKVDKTTDEASITNSTSVMMSSVANDANAIGYISLGSLNDTVKAVKIDGAEASVDN VKNGSYKVVRPFNIVLGKKTSDAANDFVNYIMSADGQKIIEDNGYIPVDEGAAAYQASNA 
KGKVVVGGSSSVSPVMEKLVEAYSKANTNIQVDVQTTDSTTGVSSTVEGSYDIGMASRDL KDDETAQGVEGKTIAKDGIAVIVNNESSVDELSTDQVKSIFTGETTSWSDITK >gi|222441888|gb|ACEP01000054.1| GENE 19 22297 - 23151 1093 284 aa, chain + ## HITS:1 COG:SP2085 KEGG:ns NR:ns ## COG: SP2085 COG0573 # Protein_GI_number: 15901901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 5 284 7 287 287 293 58.0 3e-79 MAKSKEKIMEIVFLLAACVSILAVALICVFLFANGIPAMKKIGFADFLLGTTWKPGNDIY GIFPMILGSIYVTVGAIIIGVPIGLLTAVFLAWYCPEKIYKVVKPAVELLAGIPSVVYGF FGMVMIVPAIRYNFGGTGSSILAASILLGIMILPTIIGVSESAIRAVPGSYYEGSLALGA THERSVFRAVLPAAKSGIFAGIILGVGRAIGETMAVIMVAGNQARVPKGILKGVRTLTAN IVMEMGYAEGLHREALIATGVVLFVFILIINLAFSIIKMRSEKE >gi|222441888|gb|ACEP01000054.1| GENE 20 23172 - 24083 943 303 aa, chain + ## HITS:1 COG:SP2086 KEGG:ns NR:ns ## COG: SP2086 COG0581 # Protein_GI_number: 15901902 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 38 303 4 269 271 269 59.0 4e-72 MENIQQAAKPSAMMNEAPMNHLTLSDRLKSYKRTPGSFLVMLLVVLSAVATIAALVFLIL YILVNGIPYINADLFSWTYTSENCSVIPAAINTVIMAGISLLLAVPVGIGSAIYLVEYAK KGNKLVKVIRVTAETLTGIPSIVYGLFGMLFFVTALKWKFSILAGAFTLAIMILPVILRT TEEALLSVPDSYREGSFGLGAGKLRTIFKIVLPAAVPGILSGVILATGRIVGETAALIYT AGTVAKIPNSAFSSGRTLAVHMYLLANEGLHVNQAYATAVILLILVIGINALSSFIAKKI SKA >gi|222441888|gb|ACEP01000054.1| GENE 21 24144 - 24896 273 250 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 5 246 12 263 329 109 30 3e-23 MANKMMIDDLNLYYGQFHALKDISLNIQEKEITAFIGPSGCGKSTLLKSLNRMNDLVEGC KITGQVTLDGTDIYRDLDVNLLRKRVGMVFQKPNPFPMSIYDNVAYGPRTHGIRSKAKLD QIVEEALRGAAIWDELKDRLKKSALGLSGGQQQRLCIARALAVKPEVLLMDEPTSALDPI STIKIEELAQELKKDYTIVMVTHNMQQATRISDKTVFFLLGEIIEAGKTDELFSMPKDKR TEDYITGRFG >gi|222441888|gb|ACEP01000054.1| GENE 22 24898 - 25542 827 214 aa, chain + ## 
HITS:1 COG:lin2637 KEGG:ns NR:ns ## COG: lin2637 COG0704 # Protein_GI_number: 16801699 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Listeria innocua # 1 212 3 214 219 134 37.0 1e-31 MRIRFDEQLKQLNKEMINMGTMIEKSIGDAVKALMKQDVELAQKVMAGDEEIDRTERKIE DLCLRLLLQQQPVARDLRNISAALKMVTDMERIGDHATDISELAIVLSEKTYVKKLDHIE EMARETMVMLIQSLEAYVEKDLDKAQKVIAHDDVIDDLFEEVKQELIELIRNHADEGEQA VDLLMVTKYFERIGDHATNIAEWVIFSITGQIAG >gi|222441888|gb|ACEP01000054.1| GENE 23 25942 - 27741 1883 599 aa, chain + ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 6 550 9 535 543 319 35.0 1e-86 MDIFGILSMIGGLALFLYGMEAMGAGLSKLSGGRMERLLEKLTSKRIMAVLLGAGVTAVI QSSSATTVMVVGFVNSGIMKLNQAVGIIMGANIGTTITSWLLSLTGIQGSSFILKMLKPS SFSPILAIVGIIFIMFTKDEKKKDIGGIFLGFAILMYGMEAMSGAVAPLADNEKFTGILT MFSNPILGLLAGTILTAVIQSSSASVGILQALCATGAVNFSTALPIIMGQNIGTCVTAII SSIGTSKNAKRAAAVHLFFNISGTVIFMVVFYTLNTFVHFSFLNTAASPAGIAVIHSLFN IGATILLFPFANLLEKMAILAIPDKGSEVEEMEEEKINPDLARLDERFLDKPGFAMEECR GVAINMARKSKKAMNLAIDLLKEYDEKTSARVEKLENQIDQYEDALGTYLVKLSERDLSV KDSRILSVLLHCIGDFERISDHAVNIRDAAVEMNKKNLQFSDKAKQELLVFSNAIRDIID RAVLAFETGDTGLAKEVEPLEQVVDALNKEEKQRHINRLRTGTCTIELGFVLSDISTNFE RAADHCSNIAVCLLQVDEGGFDTHEYLDILKEENSEEFRHEYMELSEKYTLPESKHGGK >gi|222441888|gb|ACEP01000054.1| GENE 24 27744 - 28415 823 223 aa, chain + ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 4 220 6 225 232 176 41.0 3e-44 MANIYIVEDDTNIQEIEAIALKNSGHSVEQFDDAAHFYKKLEEKLPDLAILDLMLPDEDG YEIVKKLRQNPATKKLPVIMVTAKTSEIDMVRGLDIGADDYIKKPFSIIEFITRVKAILR RTLEQPDDKYMTIGDIFLDNERHVVYVDNKGIELTYKEYELLKLLMHNAGIVLTREKIME 
RVWDTEFEGESRTVDMHIKTLRKKLGDSSRHIKTVRNVGYVME >gi|222441888|gb|ACEP01000054.1| GENE 25 28427 - 29740 931 437 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 215 437 46 271 279 169 41.0 9e-42 MRRKINVQLISISIIAIISTMLVMLFVFYDLFQTQVQDDLSANAIVLKNTGIFHSGNEKA IHMDYRELRVTWISSDGTVLYDNDADVGTMDNHMSRPEVKNAFKRGDGSAVRQSATMGTN TFYYAIKLNDGSVLRVAREASNIWTVGGKIFPIIAATILVMMILCIILAHFMTKNLVKPI EQMAVDINDFSITPAYKELAPFAATIRAQHADILKSARMRQDFTANVSHELKTPLTAISG YAELMENGMVPPEEIGSFAQKIEKNARRLLSLINDIIRLSELDSSNTQLLVEKFDLFEEA QICVENMQIYAEKNDVTITYEGGHQEITANREMINELINNLCANAIRYNKKNGSVKVRVD SSQGHPYISVADTGIGISKEHQKRVFERFYRVDKSRSKQTGGTGLGLAIVKHIVALHDAQ LNLESEVGKGTTITVTF >gi|222441888|gb|ACEP01000054.1| GENE 26 29834 - 29977 63 47 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026832|ref|ZP_03716024.1| ## NR: gi|225026832|ref|ZP_03716024.1| hypothetical protein EUBHAL_01084 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01084 [Eubacterium hallii DSM 3353] # 1 47 1 47 47 71 100.0 2e-11 MVIFAVLYGQMQNLTQQNLTQLIQTLIYEKESKANFIPLRWTFRFAQ >gi|222441888|gb|ACEP01000054.1| GENE 27 30208 - 31092 1348 294 aa, chain + ## HITS:1 COG:SP1468 KEGG:ns NR:ns ## COG: SP1468 COG0214 # Protein_GI_number: 15901318 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxine biosynthesis enzyme # Organism: Streptococcus pneumoniae TIGR4 # 6 294 3 291 291 426 82.0 1e-119 MNNNAEKQYELNKELAQMLKGGVIMDVTTPEQAKIAEEAGACAVMALERIPADIRAAGGV SRMSDPKMIRGIQEAVSIPVMAKCRIGHFAEAQILEAIEIDYIDESEVLSPADDVYHIDK KQFKVPFVCGAKDLGEALRRINEGASMIRTKGEPGTGDVVQAVRHMRMMNREIARITSMR QDELFEVAKELRVPYELVLKVHDTGRLPVVNFAAGGVATPADAALMMQLGAEGVFVGSGI FKSGNPKKRADAIVKAVTNYKDAKMLAELSSDLGEAMVGINESEIQLLMAERGK >gi|222441888|gb|ACEP01000054.1| GENE 28 31094 - 31669 719 191 aa, chain + ## HITS:1 COG:SP1467 KEGG:ns NR:ns ## COG: SP1467 COG0311 # Protein_GI_number: 15901317 # 
Func_class: H Coenzyme transport and metabolism # Function: Predicted glutamine amidotransferase involved in pyridoxine biosynthesis # Organism: Streptococcus pneumoniae TIGR4 # 1 187 1 190 193 199 52.0 2e-51 MKIAVLAVQGAFIEHEKALQRLGAETVELRKAEDLEQDFDGLVLPGGESTVQSRLLKELS MFEPLKEKIEEGLPVLATCAGLILLAQNVSNDEKRGFATLPVTVKRNAYGRQLGSFYYEG GIKGIGTYPMEFIRAPYIESVGDDVEILAEVEEKIVGVAYKNQLAFSFHPELTGNDKIHE KFLELIKVSND >gi|222441888|gb|ACEP01000054.1| GENE 29 32289 - 33377 1155 362 aa, chain + ## HITS:1 COG:SMb20435 KEGG:ns NR:ns ## COG: SMb20435 COG3608 # Protein_GI_number: 16264169 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Sinorhizobium meliloti # 18 230 20 239 331 104 31.0 3e-22 MGFQLGNLYVEKGQKTCGFLKLPFTEDELPVTLIYGREEGPTVLVDGGIHNAEYVGIECV TGLAKQLQPEDIKGILIMIHIVNVNGFQARTVSVSAEDGKNLNRVFPGNENGTYTDKLAY FMEKEIFSKVDYYIDVHNGDWFEDLTPFIYCVGNAPAETVAEAERMAQAADMPFYVKSQS GKGGAYNYAGSLGIPSVLIERGCNGMWSEEEVAASQKDVKNILRRIDVLKTKPTLSEMQM RVPRHMHHAHYIDSEKAGCWFPKKKAGQVARAGELLGELKDYFGNVIEEIRLKEDAIILY QTISYSVPENSPLIAYGHYDTCIDDLGDMNHEHTHEELHKHHKEHYDDHAGIHSREMWED MI >gi|222441888|gb|ACEP01000054.1| GENE 30 33404 - 34723 1248 439 aa, chain + ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 404 12 415 426 281 41.0 2e-75 MTEGSIWKGILLFALPLLGSSLIQQLYSTVDLIFVGQLLGTKASAAIGASGLIVTCLIGF FNGMAVGTNVFAARHYGAKRFNELKKLIQTIFWTGIIGGLLLMVIGLIFSPIFLTWMGTP KSIFPLAVRYLRIYMVSMISIVSYNLLSGVLRALGDSRTPLLYQFFGGIINVFADFIFLA VFHMGVEGTALATLFSQTVAAIGIMLHLYRLKEPYALRFSIKECSLREFTDILKVGVPAG VQSIIITLSNIIIQSQINTLGVTAVASFTVYFRVELIVYLPIIALGQAVVSFVGQNYGAG NWNRIKKGNKFSILGGSFATFIACILLIIAMPVILKAFTKDTAVAAQTLEIIKVTFPFYF FYTVLECFSSNLRGFGKAFLPMVVTVVSFCGFRIAALFVLMAQNPSSDKVALSYPISWGI AAIAMAVLYVRNRSGEKTL >gi|222441888|gb|ACEP01000054.1| GENE 31 34932 - 36263 1092 443 aa, chain - ## HITS:1 COG:SA0699 KEGG:ns NR:ns ## COG: SA0699 COG3610 # 
Protein_GI_number: 15926421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 303 434 5 135 164 62 34.0 1e-09 MSETMIRKNHMDVLWHEYTDKEGQNLPIVQASLTEKASIIGRAGIMLLSCGTGAWRVRSS MNTLAEVMGVTCTADIGLMSIEYTCFDGQDGFTQSLCLTNTGVNTSKLNRLEHFIQDFET EGKDMSGEQLHNLLDNIEEIHGLYSPIALGIAAALACGGFTFLLGGGPIEMFCAFVGAGL GNFLRCKLTKHHYTLFLGIVLSVSLACLSYAGLLKLGEIFLGLSVKHEAGYICAMLFIIP GFPFITSGIDLAKLDMRSGIERLGYALIIILVATMSAWIMALILHLQPVDFIKISLPLAG WIIFRFLASFCGVFGFSIMFNSPIPLAVAAASIGAIANTLRLELVDLVNAPPAAAAFIGA LTAGILASLMKDRVGYPRISVTVPSIVIMVPGLYLYRGFYNLGIMSLATSATWFASALLI ILALPLGLIFARILTDKTFRYCT >gi|222441888|gb|ACEP01000054.1| GENE 32 36344 - 37147 711 267 aa, chain - ## HITS:1 COG:no KEGG:lin0468 NR:ns ## KEGG: lin0468 # Name: not_defined # Def: hypothetical protein # Organism: L.innocua # Pathway: not_defined # 1 259 1 254 254 216 40.0 1e-54 MNSVLPREEEVEAIFSQILASPDSCKKLMDIFYDVMDDNHLLEADNPKHFSKVLFYAYQN GDISALLLKLCGKSMFDLLREAYLIPKRFHGKAGKNPVLLTNPEGELLPEAQKIVSGREY SKFKETYKNHICAPRSKLYLADGYDIVRSYVNDTDMEITEKRENKRRGIMILYALPDTKK IGLTEAQAYAVVWDTFHNIQKEAPHAIVYYGQETGLKKEQTFDEIGILLPIREFEKKMLH HLSEIDGLVLACREKMMKKAGMESLEL >gi|222441888|gb|ACEP01000054.1| GENE 33 37285 - 38364 792 359 aa, chain - ## HITS:1 COG:MA1426 KEGG:ns NR:ns ## COG: MA1426 COG1902 # Protein_GI_number: 20090286 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Methanosarcina acetivorans str.C2A # 4 347 1 354 365 184 32.0 2e-46 MSKLMHTPYRIGNIVISNRIVKSAMFEYCANQGRITDQHRQIYEDAARGGCGLIITGMEA ISSTAGNGPGMIHTEYDGYLEDMRSIVSKVHRYSSRIFVQLQHAGPRTDWKNGYDRFASS PIKVADGIIYHAATKDELAKVVHDFGVAARKCKLAGCDGVQIHAAHGFLLTTFLSPHFNK RADEYGGPIENRSRLLFEIYDSVRNAVGVDYPIALKLSFSDLVSDSSTPEEMLWVCKELE KKGIDFIEVSSGISADNSGASCSPVLRNGDREGKFLESALLVSDTLEIPVSSVGCYRSPD FIEHILSSTSLTAISLGRPLVREADLPNKWRTSNEKAACISCNRCFNCTDVITCLCGKK >gi|222441888|gb|ACEP01000054.1| GENE 34 39317 - 40855 1807 512 aa, chain + ## HITS:1 COG:TM0116 KEGG:ns NR:ns ## COG: 
TM0116 COG1070 # Protein_GI_number: 15642891 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Thermotoga maritima # 1 505 1 490 492 163 26.0 7e-40 MNYYLGFDAGTQSVKVAVYDEDMNCIVSSSAPTTLTYPEPGWVSMDVDEYLALTIKGMKE CASQMKEKGLDVTKIKSIMGDGVICGICGVDKDGNAITPYINYLDSRTKEDVELVNSWEL EIFGKETGNPEANCMFPAMFARWFLKNIEGFKENGAKFIHDAPYVLAHLAGLSAKDMFVD WGTMSGWGLGYKVEEKCWSKEQLDILGIPEEMMPRIVKPWDIIGHLTEEIAQETGLPAGI PICGGAGDTMQSMIGSGNMEPGQAVDVAGTCSMFCVSTKGIIPELSKKGAGLVFNSGSLP DTYFYWGYIRTGGLALRWFKDNICKKAEDGSYYQVLEKDARKVPAGSNGVLFLPYLTGGI NDIPDAVGCFLNMTMDTDQATLWHAVLEAIGYDYMEITDLYRSAGIDLSRITITEGGSRD NMWNQMKADMLDSRTVTLENAAGAVMTNCMFGAYAVGDVDNIAEALTGNIHLKNEFEPNE ENVTFYRGQYEKKTHLVKDTMKEAFSILAELR >gi|222441888|gb|ACEP01000054.1| GENE 35 41296 - 43959 2356 887 aa, chain + ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 1 630 5 640 644 531 42.0 1e-150 MGILAHVDAGKTTLSEGMLYLSGTVRKLGRVDHKDAFLDTYSLERDRGITIFSKQAVFSL GNRRINLLDTPGHVDFSAEMERTLQVLDYAVLVISGADGVQGHTETLWKLLKLYEIPTFI FINKMDQPGTDRESLLTELKERLDEGCIVFGKGKNVESLEEIAMTDEVVLDYFMEHETVR NEDICRLIRERKIFPCYFGSALKLDGVQELLAGFEEYMKPFDGKKGFGARVFKISRDDKG ERLTFLKVTGGKLVVKMPINKEEKINQIRIYSGAKYEAVNEVEAGGVCAVTGLSSSYIGQ GLGVEKGTAAPFLEPVLTYQMILPEGADTTKVLRELKQLEEEEPLLNIVWNPALEEIHVQ LMGEVQTEILKTMIAERFHLDVEFGTGKIVYKETIKSPVVGVGHYEPLRHYAEVHLKMEP LEAGSGLVFDTDCSEDVLDRNWQRLILTHLQEREHPGVLTGAPITDMKITIVAGRAHLKH TEGGDFRQATYRAVRQGLKSAESVLLEPWYSFVLEVPSEQVGRAMSDIGQMNGSFEGPEA EDKQGMVRLTGTAPASEMRDYQREVWAYTKGRGRITLTLKGYEPCHNAEEVIEEIGYDSE RDVDNPTGSVFCAHGAGFLVKWDEVPEYMHIKEDFLAEKPGIEQDEVMAVQMGNHCNYSG GYSSSYDDDPELLTIMEREFGSKQKERDRYSSYRKQTVSTPVRHTTVIKENEPKKEYLLV DGYNIIFAWEELNELAKASIDAARNKLMDILSNYQGFIGCTLILVFDAYKVKGNQGEVQK YHNIYVVYTKEAETADQYIEKTTHEIGRKYKVTVATSDALEQVIVMGQGAYRISARDFYE EVERTEKQIREINERERGEKRNYLLDYAKEEDAREMEKVRLGKTTEK 
>gi|222441888|gb|ACEP01000054.1| GENE 36 44104 - 44817 174 237 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 220 1 226 245 71 25 1e-11 MLELQNISYLVDDDNKEILRDINLAFDEKFVAITGPNGSGKSTLAKIIAGILKPTSGKIL LDGEDITDLSITERANKGISYAFQQPVKFKGLKVKDLMSVASGKTLDIKGICDILSEVGL CARDYINRELNGSLSGGELKRIEIAMIKARATKLSVFDEPEAGIDLWSFNSLIHVFEDLH EKTDGTILIISHQERILEIADRIIVINDGLVEKSGKGEDVIPELLYKSSMCQKLSRD >gi|222441888|gb|ACEP01000054.1| GENE 37 44830 - 45747 1274 305 aa, chain + ## HITS:1 COG:MTH1150 KEGG:ns NR:ns ## COG: MTH1150 COG0719 # Protein_GI_number: 15679161 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Methanothermobacter thermautotrophicus # 87 304 169 384 410 97 32.0 3e-20 MDLIQKELLEEIAGLHEVPAGAYNIRANGKAAARNTTANIDIVSKEDKPGIDIIIKPGTK KESVHIPVVISQTGLNDLVYNDFHIGEDCDVTIVAGCGIHNSGDGLSQHDGIHEFFVGKN SKIRYIEKHYGQGDGKGGKVMNPKTIIHLDENAYFYMDTVQIKGIDSTMRETDADLKAGA KLEISEKLMTHGEQTAQTQFSVDLNGEGSSANVVSRSVARDDSAQLFRAKIAGNNACAGH SECDAIIMDNAKVRAIPELEANDLDAELIHEAAIGKIAGEQLIKLMTLGLSEEEAEEMII SGFLK >gi|222441888|gb|ACEP01000054.1| GENE 38 45893 - 46165 362 90 aa, chain + ## HITS:1 COG:BH0558 KEGG:ns NR:ns ## COG: BH0558 COG1937 # Protein_GI_number: 15613121 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 2 90 12 100 100 70 40.0 5e-13 MRQCMDMDNLHNRLKRVDGQIKAIDRMIEQDVPCEDIIIQINAAKTALHKIGQVVLEGHL NHCVKDGIEHGDAEKTIADFAKALEYFSRL >gi|222441888|gb|ACEP01000054.1| GENE 39 46411 - 46617 292 68 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026845|ref|ZP_03716037.1| ## NR: gi|225026845|ref|ZP_03716037.1| hypothetical protein EUBHAL_01098 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01098 [Eubacterium hallii DSM 3353] # 1 68 1 68 68 110 100.0 2e-23 MSRTDEEIKRFETKMKSCTTMTDLLVAMSSWQSYAQSHNLEPEQKRTVDEAYLKAEEGLI TAVKPSLW 
>gi|222441888|gb|ACEP01000054.1| GENE 40 46652 - 47218 385 188 aa, chain + ## HITS:1 COG:BH3394 KEGG:ns NR:ns ## COG: BH3394 COG1309 # Protein_GI_number: 15615956 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 4 58 7 61 186 62 47.0 7e-10 MNGTKGKMAEAFKELVCKKSFQKITISDIAKESAMTRENFYYHFRDKYDIMHWIFEQQVA AQLPEDEEPFEMWFNALFCNTCEDYKYYRKLIKSLSAEEIRSDLYPLFERRVRLLVQDCL DDSVWNLRKEKEDFSTAFFTDAFLGFYINYIRDHEEIDYNMLQIGLEFLFDKFLSTVRIT KEESSEQK >gi|222441888|gb|ACEP01000054.1| GENE 41 47452 - 48177 1121 241 aa, chain + ## HITS:1 COG:SPy1682 KEGG:ns NR:ns ## COG: SPy1682 COG0580 # Protein_GI_number: 15675543 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerol uptake facilitator and related permeases (Major Intrinsic Protein Family) # Organism: Streptococcus pyogenes M1 GAS # 5 239 3 232 233 143 41.0 4e-34 MTNVIIGEFVGTILLVLLGDGVCANVTLNNSGFKAAGPLFIAIGWGLAVALPAAAFSGIC PASYNPALTIALFADGTLAAADGYTIAVIIEMIIAEMAGGFIGAVLVWVAFKPQFDASSD LGDSLRGVFCTAPGVRNLPYNILQEAIATFWLVFAIKAACGAFDLGTFGVFLTIVSCGTT FGGLTGYAMNAARDTAPRLAFAILPIKGKGDADWAYGLTAPLIGPVIGALVAVAVYAALP L >gi|222441888|gb|ACEP01000054.1| GENE 42 48412 - 48765 371 117 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026848|ref|ZP_03716040.1| ## NR: gi|225026848|ref|ZP_03716040.1| hypothetical protein EUBHAL_01101 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01101 [Eubacterium hallii DSM 3353] # 1 117 1 117 117 240 100.0 3e-62 MTTIKCRKCGKVYPFNEKICPDCGYDIRKDETLSPRDRVALGLKSDSALFEELGKENMER WKTQTREERLQDAWENRNKFEIRPDQGGLPRVNIPMNTMLFLTVIFVVAVFLAKYLA >gi|222441888|gb|ACEP01000054.1| GENE 43 48933 - 49124 291 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026849|ref|ZP_03716041.1| ## NR: gi|225026849|ref|ZP_03716041.1| hypothetical protein EUBHAL_01102 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01102 [Eubacterium hallii DSM 3353] # 1 63 1 63 63 81 100.0 2e-14 MTVEMWMGLGIVVAIIGFIMWRAGLSMQAKRNRNAGMVTMIAWLLLMVGVMTAFSFFRVK YLS 
>gi|222441888|gb|ACEP01000054.1| GENE 44 49190 - 49630 497 146 aa, chain + ## HITS:1 COG:YPO2897 KEGG:ns NR:ns ## COG: YPO2897 COG1959 # Protein_GI_number: 16123088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Yersinia pestis # 1 133 1 130 164 110 43.0 6e-25 MRISTKGRYALRLMLDIAMNDAEGKPVRVKDIAKRQEVSGKYLEQIVSVLTKAGYLRSIR GPQGGYLLTKRPDEYTVGMILRITEGSMEPVPCLEEGSEVCERMEECITIPLWRKLSEAI SGVLDHTTLDDLIEWHYAKNGNDYVI >gi|222441888|gb|ACEP01000054.1| GENE 45 49716 - 50648 781 310 aa, chain + ## HITS:1 COG:CAC1666 KEGG:ns NR:ns ## COG: CAC1666 COG4905 # Protein_GI_number: 15894943 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 5 231 2 237 238 122 32.0 7e-28 MIGQYTLQQWILYFYVYCFLGWIFESCYVSFRKKEWVNRGFLHGPFLPIYGSGAVMMLFV SEPFKNNLILTYFAGVVGATLLELVTGAAMEALLKVRYWDYSNQKFNYKGYICLSSSVAW GFLTILMNEVLHPAILHVLAVIPKMADGVFVWIVSAGLAVDLCISVRDAINLRNILVGME NVRREVVILRKRADIVIAVLDQEWREYVKNHPAAERMDEISKEIEMRLAKMKEMAELPKL PEPQKLEMLEVKEKIGKLKERGFILKRKGTGSVRTLIKGNPTMVSPKYAESFKQLKGYLK EFEEKRKGKD >gi|222441888|gb|ACEP01000054.1| GENE 46 50923 - 51606 330 227 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1314 NR:ns ## KEGG: EUBREC_1314 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 201 8 208 232 177 46.0 3e-43 MAERLKKYRHGLLLLYFPFYLAAFAYLEKKVPDKVHIINCAIDQYIPFVEVFIIPYLLWF AYVAVAGIYFFFKEKESFCKWMYFGMIGMTIFIIVSYLYPNGLELRPETFTRDNIFVQLT KMIYSMDTPTNVLPSIHVFNSMAVYFAVKNSPKLKNNKAARTGAFVMTSLIILSTMFLKQ HSVVDVLTALILSCLSYDFIYNERTEKIRDGLEELKFRRKRKEFSKY >gi|222441888|gb|ACEP01000054.1| GENE 47 51677 - 52207 552 176 aa, chain + ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 1 142 1 143 150 114 39.0 7e-26 MRKIIFIGRSEAGKTTLTQAMKGKKIVYHKTQYINHYDVIIDTPGEYAETKELAGSLAVY 
SAEADVVGLLISATEPFSLYPPNVTAQANREVVGIVTKCDHWAGNPEQAAEWLRLAGCKK IFMTSSFTGEGIADIISYLKEEHETLPWEEVKAQYDNLGYGEGESDKKMNINGVLI >gi|222441888|gb|ACEP01000054.1| GENE 48 52431 - 52973 926 180 aa, chain - ## HITS:1 COG:CAC3597 KEGG:ns NR:ns ## COG: CAC3597 COG1592 # Protein_GI_number: 15896831 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 179 1 180 181 207 62.0 9e-54 MKKFVCSVCGYVYEGEAAPEFCPLCKAPASKFIEQSEEKTWAAEHVVGVASDVPEDIKAD LRANFEGECSEVGMYLAMARVAHREGYPEIGLYWEKAAYEEAEHAAKFAELLGEVVSAST EENLRVRVEAENGATAGKFDLAKRAKALDLDAIHDTVHEMARDEARHGKAFEGLLKRYFG >gi|222441888|gb|ACEP01000054.1| GENE 49 53407 - 53802 340 131 aa, chain - ## HITS:1 COG:BS_perR KEGG:ns NR:ns ## COG: BS_perR COG0735 # Protein_GI_number: 16077938 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Bacillus subtilis # 10 131 23 144 145 81 35.0 5e-16 MGAPMKNSRQRNAILECVMRHHDHPTADIIYQELRESFPNISLGTVYRNLSLLTSLGKIM KITCENHADRFDGQTKPHAHFECKSCGCLQDIPFKPSIHPQEEIGAGFDGIISDYTITFR GYCAKCAKNSD >gi|222441888|gb|ACEP01000054.1| GENE 50 54163 - 54396 530 77 aa, chain + ## HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 3 73 5 75 78 66 67.0 1e-11 MLEKMKEIIAEQLGVEEDEITPDTSFKEDLGADSLDLFELTMALEEEYDTEIPAEELEDI ETVGDVIEYLREKGIDE >gi|222441888|gb|ACEP01000054.1| GENE 51 54694 - 55677 1513 327 aa, chain + ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 1 324 5 327 340 387 62.0 1e-107 MATDIGIDLGTASILVYVKGKGVVLKEPSVVAFDVDTRKIKAIGEEARLMIGRTPGNIVA 
VRPLRQGVISDYSVTEKMLKYFVHKSVGKSLFGRKPRISVCVPSGVTEVEKKAVEDATYA AGARDVKIIEEPVAAAIGAGIDIAKPCGNMIVDIGGGTSDIAVISLGGTVVSTSIKVAGD DFDEAIVRYMRKKHNLLIGDRTAEDIKIQVGSAYTRAEESSIEVRGRNLVTGLPKTVTVT SEETREALREATAQIVEAVHSVLERTPPELAADISDRGIVLTGGGSMLQGLEQLIEEKTG ITTMTADDPMTAVAIGTGQYIEYLGTK >gi|222441888|gb|ACEP01000054.1| GENE 52 55688 - 57898 1939 736 aa, chain + ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 1 734 1 735 739 564 43.0 1e-160 MAHLEGYVSHIRFHNEENGYTVMEIETTHGDEVLVGNFHYINEGEYICAEGEYVDHPAHG PQYRMTSYTVKEPEDKAAMERYLASGAIKGMGPALAARVIKKFKGNTFDIIESEPERLAE VKGISLKKAMDIAVQFQEKQEMRHAMMFLSEYGITSRFAVRIFEEYGNKMYDILKTNPYK LAEDITGIGFRAADAIAAKAGIAPDSDFRVCAAILYSLQQAALSGHVYLPKDVLCRSAGQ LLSMAPEFIEDHLMELVLDKKLMIRTIGEEEHVYLSSYYFMELNTSRMLLNLDLHFPMEK GKYGRRISELEESMDFQLDLLQKQAVEEALTKGVVVVTGGPGTGKTTTINLMIRCFEMEK MDIVLAAPTGRAAKRMTEATGYEAKTIHRLLEVSGGIEDGDSSHFEKNEDNPIEADAVIL DEMSMVDIHLMYSLLKAIVPGTRLILVGDSNQLPSVGPGNVLKDIIDSKAFSVVRLSKIY RQAESSHIVLNAHKINAGELITLDNKNKDFFFFEKKDVRSVMGALVYLVKTRLPEYIGAS PFEIQVLTPMRKGELGVEHLNEVLQYYLNPASESKKEKEGRTGVFREGDKVMQVKNNYKL EWKITNAKGFSIEEGLGVFNGDMGIIKQINTYSERITVVFDENREAEYPFSSLDELELAY AITIHKSQGSEYPAVVLPLLSGPPILCNRNLLYTAITRAQKCVAIAGRQSMVEQMIHNET EQKRYSGLNDCLKEGL >gi|222441888|gb|ACEP01000054.1| GENE 53 57945 - 58763 291 272 aa, chain - ## HITS:1 COG:CC0830 KEGG:ns NR:ns ## COG: CC0830 COG1040 # Protein_GI_number: 16125083 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Caulobacter vibrioides # 65 248 57 237 265 99 32.0 8e-21 MKKLFFRFDFSPKIKKFLTTLHNTQEQAILWIIRQLLPFLFPRHCPLCDKLLPYGSFIHE ECHRELPLIHSPVCMRCGKPVSSHTQEYCYDCRAFPKSFQRGLSLFLYNKKTRPIMSAFK YQNRRGLADFFCQELCRYRLSQLRDLGADAVIPVPIHKNKYKKRGYNQAALLSSRLALTL NLPHYPDMLIRSVNTLPQKQFNPQARLNNLKKAFCFNSHYDKLLSQSTPFSVLLVDDIYT 
SGATMEACTRILLEAGISEVYILSICIGIARD >gi|222441888|gb|ACEP01000054.1| GENE 54 58870 - 59307 540 145 aa, chain - ## HITS:1 COG:BS_yncF KEGG:ns NR:ns ## COG: BS_yncF COG0756 # Protein_GI_number: 16078829 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Bacillus subtilis # 1 143 1 142 144 195 64.0 3e-50 MKQTIRIKYFTDKIEKLTYIDGKSDWIDLRAAEDVSLKKGEFALIPLGIAMELPQGYEAH VVPRSSTFKNFGVIQTNHMGVIDESYCGDNDQWFMPVLAIRDTEIHVNDRICQFRIMEHQ PVISFEECNTLNGTDRGGFGSTGRA >gi|222441888|gb|ACEP01000054.1| GENE 55 59928 - 60077 241 49 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881814|ref|YP_001560782.1| ribosomal protein L33 [Clostridium phytofermentans ISDg] # 1 49 1 49 49 97 79 2e-19 MRVRITLACTECKQRNYNTTKEKKNHPERLETRKYCRFCRKHTVHKETK >gi|222441888|gb|ACEP01000054.1| GENE 56 60105 - 60302 221 65 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0372 NR:ns ## KEGG: EUBREC_0372 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Protein export [PATH:ere03060]; Bacterial secretion system [PATH:ere03070] # 2 65 4 67 71 70 53.0 2e-11 MSDEAKKTEKTSWFQGLKAEFKNISWPDRKTLIRETITVTVVSVVLGVIIATLDMVLQYG INFLV >gi|222441888|gb|ACEP01000054.1| GENE 57 60324 - 60848 636 174 aa, chain + ## HITS:1 COG:CAC3149 KEGG:ns NR:ns ## COG: CAC3149 COG0250 # Protein_GI_number: 15896397 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Clostridium acetobutylicum # 1 174 1 172 173 178 57.0 5e-45 MSELNWYVVHTYSGYENKVKANIEKTIENRKLQDQIFEVRVPLQDVVEMKGGVKKNVSKK MFPGYVLVNMVMNDDTWYVVRNTRGVTGFVGPGSDPVPLSEAEMRNLGIVAQSDNEVEID IEIGDLVEVTSGAWEGRVSTVTAINMNKQTVTIEVDMFGRETSVEIGFLDVKKL >gi|222441888|gb|ACEP01000054.1| GENE 58 60918 - 61343 624 141 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240143818|ref|ZP_04742419.1| ribosomal protein L11 [Roseburia intestinalis L1-82] # 1 141 1 141 141 244 85 6e-64 MAKKVTGYIKLQIPAGKATPAPPVGPALGQHGVNIVQFTKEFNARTADKGDLIIPVVITV YQDRTFSFITKTPPAAVLLKKYCNIKSGSGVPNKTKVASISKDKVKEIAEMKMPDLNAAS IEAAMSMIAGTARSMGITVED 
>gi|222441888|gb|ACEP01000054.1| GENE 59 61410 - 62099 939 229 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881810|ref|YP_001560778.1| ribosomal protein L1 [Clostridium phytofermentans ISDg] # 1 228 1 228 230 366 77 1e-100 MKRGKRYQEAAKLRDKTNLYDAPEAVAIVKKAASAKFDETIEAHFKMGLDGRHADQQIRG AVVLPNGTGKKVRVLVFAKGDKAEEAVAAGAEFVGAEELIPKIQNEGWFDFDKVVATPNM MSVVGRLGRVLGPKGLMPNPKAGTVTMDVAKAVQDLKAGKIEYRLDKTNIIHVPIGKASF TEEQLNENFQTLIDAIIKAKPAAAKGQYIRSATLTSTMGPGVKLNPARF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:18:59 2011 Seq name: gi|222441887|gb|ACEP01000055.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont61.1, whole genome shotgun sequence Length of sequence - 20335 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 11, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 43 - 1632 2046 ## COG4108 Peptide chain release factor RF-3 - Prom 1818 - 1877 7.9 - Term 1892 - 1921 0.5 2 2 Tu 1 . - CDS 1929 - 3113 938 ## COG2814 Arabinose efflux permease - Prom 3222 - 3281 2.7 - Term 3188 - 3237 10.6 3 3 Tu 1 . - CDS 3308 - 4033 930 ## gi|225026870|ref|ZP_03716062.1| hypothetical protein EUBHAL_01124 - Prom 4254 - 4313 10.5 4 4 Op 1 1/0.000 - CDS 4518 - 5975 1326 ## COG2199 FOG: GGDEF domain - Prom 6201 - 6260 7.6 - Term 6262 - 6302 -0.5 5 4 Op 2 . - CDS 6303 - 7706 1367 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 6 4 Op 3 . - CDS 7715 - 8671 1033 ## COG3481 Predicted HD-superfamily hydrolase - Prom 8706 - 8765 8.1 + Prom 8781 - 8840 5.7 7 5 Tu 1 . + CDS 8897 - 9187 284 ## COG1254 Acylphosphatases + Prom 9217 - 9276 5.0 8 6 Tu 1 . + CDS 9384 - 11759 2276 ## COG1193 Mismatch repair ATPase (MutS family) 9 7 Tu 1 . - CDS 11809 - 12021 128 ## NT01CX_0673 hypothetical protein - Prom 12078 - 12137 2.1 10 8 Op 1 . 
- CDS 12170 - 12661 684 ## COG0219 Predicted rRNA methylase (SpoU class) 11 8 Op 2 3/0.000 - CDS 12695 - 13870 1304 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 12 8 Op 3 . - CDS 13867 - 14349 560 ## COG1522 Transcriptional regulators - Prom 14375 - 14434 8.3 13 9 Op 1 . - CDS 14485 - 15606 1044 ## COG0309 Hydrogenase maturation factor 14 9 Op 2 . - CDS 15642 - 17117 1451 ## COG2720 Uncharacterized vancomycin resistance protein - Prom 17155 - 17214 5.5 - Term 17133 - 17175 -0.6 15 10 Tu 1 . - CDS 17239 - 17346 233 ## - Prom 17552 - 17611 7.5 16 11 Tu 1 . - CDS 17656 - 19629 1507 ## COG1032 Fe-S oxidoreductase - Prom 19766 - 19825 5.9 + TRNA 20061 - 20131 75.8 # Gly TCC 0 0 + TRNA 20231 - 20302 78.1 # Gly GCC 0 0 Predicted protein(s) >gi|222441887|gb|ACEP01000055.1| GENE 1 43 - 1632 2046 529 aa, chain - ## HITS:1 COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 4 528 5 526 526 702 65.0 0 MSRIDEIKKRRTFAIISHPDAGKTTLTEKFLLYGGAINQAGSVKGKATAKHAVSDWMDIE KERGISVTSSVLQFEYGGCCINILDTPGHQDFSEDTYRTLMAADSAVMVIDASKGVEAQT RKLFKVCAMRHIPVFTFINKMDREARDIFDLLDEIEKELGIPTCPVNWPIGSGKQFAGVY DRKNQKIDLFEDTMKGTKMGTMKEIALDDPEISNYVTDEAREVLEEEIELLDGASAEFDQ ELVDAGELSPVFFGSALMNFGVENFLNYFLKMTSSPLPRTSDQGTIDPIEEKDFSAFVFK IQANMNKAHRDRIAFMRICSGEFEAGMDAFHVQGDKKVRLSQPQQMMASERKMIDKAYAG DIIGIFDPGIFSIGDTITTSPKKFAYEGIPTFAPEHFARVRQVNTMKRKQFIKGINEIAQ EGAIQIFQEFNTGMEEVIVGVVGTLQFDVLNYRLQHEYNVEIRLEKLPYEYIRWIENTEI DLENLRGTSDMKKIKDLKGRPLLLFINSWSVGMTLERNEGLVLSEFGRA >gi|222441887|gb|ACEP01000055.1| GENE 2 1929 - 3113 938 394 aa, chain - ## HITS:1 COG:STM1522 KEGG:ns NR:ns ## COG: STM1522 COG2814 # Protein_GI_number: 16764867 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Salmonella typhimurium LT2 # 10 386 12 387 396 266 40.0 5e-71 
MQSKMTLKEWLPLLGITVSAFIFNTSEFMPIGLLTDIADSFHITEAHAGVLITVYSWIVM LLSLPLMLLLNKIDFKRLLLGTIALFGIFQMLSAFSASYGMLMISRIGVACTHSVFWSIA SPAAVSVVSEKFRSLALSMVVTGTSIAMILGLPLGRIIGLHMGWNMAFFCVGIIAFITTA YLIFVFPKVPGGESFSIKQMPEIFKNKTLMGIFLVTFLFATSYYTGYSYIEPFLQKVAGL SANWVTTTLTIFGAAGLLGSFLFSHYYDKNKYLFVKLVMISVAAALLLLYPISKAHMAVV LLCAFWGMAVMAFNVTFQSEIITYAPAAASSVAMAIYSGIYNLGIGSGTWIGGSICTHLS ISYIGIAGGMIAVVAALLCIFVVIKYMKEFDRAA >gi|222441887|gb|ACEP01000055.1| GENE 3 3308 - 4033 930 241 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026870|ref|ZP_03716062.1| ## NR: gi|225026870|ref|ZP_03716062.1| hypothetical protein EUBHAL_01124 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01124 [Eubacterium hallii DSM 3353] # 1 241 1 241 241 355 100.0 1e-96 MIDNKRQELIHSQVMEILKENQLAFSNVARPKPTKASTATSASWSKDMAPKASAATQGAW DSTNTPGGVSSLAKEKWNVQEIAALVTKALEASNVKVAFSNKPIPAPTTISSSTQPAWKK DEDEAPRKPAAGTQGAWDVTDTPGGLSAETKKVWNLDAITALVKKELTENLEFYNKPLPK PTTVSSSTQPAWKDDSEEYKPKESASAQGIWERGNKISASTQPAWKVDEIAALVTKALQN K >gi|222441887|gb|ACEP01000055.1| GENE 4 4518 - 5975 1326 485 aa, chain - ## HITS:1 COG:BS_ytrP_2 KEGG:ns NR:ns ## COG: BS_ytrP_2 COG2199 # Protein_GI_number: 16080017 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Bacillus subtilis # 315 481 38 198 199 103 34.0 8e-22 MGGAEYMKSIKKYETIINEALRSALEYDTPEGQINEFISFFGKHIGSDRIYIFEDDEEHH VTNNTYEWCAEGITPQIEHLQGVNMEVIDWWYDTFDEGKSVIISDVEELKGVHQVSYNML KYQNVCKVVVSPLRHKGQIKGFFGVDNPPAVDYNALTMFLDMIGTMLISLLKIRNTFQKE QYNASMSSYSALSEIYLCMHLIDIKTGKYEQIKNVKHVEKECEELDQENFSKRVYTTMKY FCTEMYLSSVLEFVDLSTLDRRLMETNTIVHEFIGTVSGWCRERFIKVDSDEKGRLWHVL YCVEVIDEQKRREKKLLYLSETDSMTGINNRGSGERKIAEFLEKKKGGLFCLLDCDKFKK INDTYGHAVGDTVLIKLAAVLQRACRENDIVMRLGGDEFAMYLPGMPDKAQAESFFKRLF TQIDKISIPEMQGEKINVSLGASLCSDKDGATFDTLYHEADAAMYESKKIVGNHAIIYSD LPQAR >gi|222441887|gb|ACEP01000055.1| GENE 5 6303 - 7706 1367 467 aa, chain - ## HITS:1 COG:lin1863 KEGG:ns NR:ns ## COG: lin1863 COG2265 # Protein_GI_number: 16800929 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Listeria innocua # 9 462 6 450 453 443 50.0 1e-124 MKSEQKLNLKKNQEISLTIEDFTKEGEGLGKYQGFPFFVKDTVIGDEVRVSITKLKKNYG YARLVEIVTPSKDRVTPPCPVARQCGGCKLQQISYEKQLSFKKSLVEGCLMRIGGFGKED IEQKIEPVYGMEEPWHYRNKAQFPVGYDKEGNLVAGFYAGRTHSIVANTNCAIQAEVTHP IVEKVLTYMKENKISAYDEKNHSGLVRHILTRIGFTTGEIMVCLIVNGTKKQLKHIDKLV DDLKTIEGMTSIIVNTNTDKTNKILGLRCETVWGQDYIEDYIGNIKYQIGPLSFYQVNPQ QTKVLYSKALEYAELKGQELVWDLYCGIGTISLFLAQKAKQVYGVEIIKEAIDDARRNAA LNHMDNVEFFVGKAEEVVPAQYEKTGIHPDVIVVDPPRKGCDAILLNTMLDMAPKRIVYV SCDPATLARDLKILCEEKYTLEKVSVVDQFSHSVHVESVCALKRVDE >gi|222441887|gb|ACEP01000055.1| GENE 6 7715 - 8671 1033 318 aa, chain - ## HITS:1 COG:SA1660 KEGG:ns NR:ns ## COG: SA1660 COG3481 # Protein_GI_number: 15927416 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Staphylococcus aureus N315 # 1 286 1 285 313 164 35.0 1e-40 MRYINELREGDNVSEVYLCKVKNIAKTKAGKTYYSMILQDKTGVIDTKIWDLNNGIENFE QMDYIRVEGNVTSFQGSPQLNVRRLRKAREGEFAMEDYIPCSSKSIDGMFKELSSYVNHV QNIYLRQLLVAFFGDKEFAAKFKAHSAAKRVHHGFMGGLLEHTLSVTKLCDFYCTQYPVL NKDLLITAAICHDIGKIDELSDFPENDYTDVGQLVGHIVMGTMMIDEKIRNINGFPAKLA NELKHCILAHHGELEYGSPKKPALIEALALNFADNTDAKMETFIEALAEESRQSGEWKGY NKLFESNIRATSHLGEKD >gi|222441887|gb|ACEP01000055.1| GENE 7 8897 - 9187 284 96 aa, chain + ## HITS:1 COG:CAC2830 KEGG:ns NR:ns ## COG: CAC2830 COG1254 # Protein_GI_number: 15896085 # Func_class: C Energy production and conversion # Function: Acylphosphatases # Organism: Clostridium acetobutylicum # 1 96 2 91 91 62 37.0 2e-10 MIRKHIIAHGRVQGVGLRFTVTGFAKKYNVTGWVRNLYDGTVEMEVQGLDHRVELFLQEL SSDRPGGNRFIRIDKLDITNIPSVNVADEKGFRARY >gi|222441887|gb|ACEP01000055.1| GENE 8 9384 - 11759 2276 791 aa, chain + ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 4 
791 3 785 785 583 42.0 1e-166 MNTKALITLEYDKIIKKLETFASSTMGKALCKDLLPSSDYEEILSAQTETKDALTRLYKT GYLSFQGLSDIRPHLRLLEIDSTLNTKELLDIARLLSITAQAVEYGDTEDDIMAYDSLNS YFGELDSLEFLYQRITQCILSEDEISDDASSALKDIRREIKQTNISIHNKLTSVINSQNN KTMLQDALITVRNGRYCVPVKTEYRNAFPGMIHDQSSSGSTLFIEPMAVVQLNNHLKELD IKEKMEIEKILQSLSAQAASCSRELEENQKILTKLDFIFAKAKYAKEYQGTEPIFNTDGI VDIKQGRHPLLDPKKVVPIHIYIGEDFNMLLLTGPNTGGKTVSLKTVGLFQLMGQAGLHI PAFQGSRLAVFSDIFADIGDEQSIEMNLSTFSSHMTNLVHILDEADPNSLVLLDELCGGT DPTEGAALAIAILDDLHTGKIRTVATTHYAELKMYAMDTPGVENACCEFDLETLSPTYRL LIGIPGKSNAFSISERLGLPDYIIEQARSQIDATAIDFENMLSELEKNKAEIEKEQSELY KTKQEIENLKNSLKEKQDDIKEKRDKMLRDAREEARNILEEAKEVADESIRKYHAWGQHP KQNNMKKMEAQRSDLRGRMSKLDKKLAYKAKKSSTISDPSDFKVGDSVFVTTLSLNGTVK EAANKDGDLVIQMGFLSSVVNYKNLELLAPEKAPKPQHQPKDRYSINKAATINPEINLLG NTVDEAIARLEKYLDDAMIAGLTSVRVVHGKGTGALRKGIHEYLRKLKFVKSYKLAEFGE GDAGVTIVTFK >gi|222441887|gb|ACEP01000055.1| GENE 9 11809 - 12021 128 70 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0673 NR:ns ## KEGG: NT01CX_0673 # Name: not_defined # Def: hypothetical protein # Organism: C.novyi # Pathway: not_defined # 11 62 70 121 151 65 57.0 5e-10 MLADCGCWYKIRIGNFDNVGECQYAVPSYWDIGKAYKHLIKQVALEKKIDVVDAIIEAYH SFFFRRLHTI >gi|222441887|gb|ACEP01000055.1| GENE 10 12170 - 12661 684 163 aa, chain - ## HITS:1 COG:FN0809 KEGG:ns NR:ns ## COG: FN0809 COG0219 # Protein_GI_number: 19704144 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Fusobacterium nucleatum # 2 149 1 148 150 184 56.0 8e-47 MVNIVLHEPEMPANTGNIGRTCVATGSILHLIEPLGFKINDKMLKRAGLDYWDKLDVRTY VNFEDFLAKNPGAKIYMATTKSKQTYTDVEYEDGCYIMFGKESAGIPEEILLEHKETAVR IPMLEHIRSLNLGNSVAIVLYEALRQQGFKELELEGQLHKYEW >gi|222441887|gb|ACEP01000055.1| GENE 11 12695 - 13870 1304 391 aa, chain - ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 2 387 8 393 393 
444 52.0 1e-124 MRNPLSDKVVQMKPSGIRKFFDLVQEMPDAISLGVGEPDFDTPWHIREEGIYSLEKGRTF YTSNAGLLELRKAIAHYMYRKYELTYNPAHEIVVTVGGSEGIDLALRAMLNPGDEVILPE PAFVSYLPCVKLADGVPVTIDLKEENHFKLKPEELLAVITDKTKILILSYPNNPTGAIMT REDLEPIAEIVKEKDLYVISDEIYAELTYGQDHCSIASLPGMRDRTIIINGFSKSFAMTG WRMGFATGPELIMQQILKIHQFAIMAAPTTSQYAAIEAMTNGEEDVQIMRNAYNQRRRFV LELFSEMGLKCFEPEGAFYIFPCIKEFGMTSDEFANRFLREEKVAIIPGTAFGDCGEGFL RVSYAYSIEELKEALGRLANFVERLRKEKNL >gi|222441887|gb|ACEP01000055.1| GENE 12 13867 - 14349 560 160 aa, chain - ## HITS:1 COG:BH3351 KEGG:ns NR:ns ## COG: BH3351 COG1522 # Protein_GI_number: 15615913 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 4 160 7 164 164 145 48.0 4e-35 MRHEILRMLENNSRIDLHDLAIMLGTDESVVLEEIEKMENEGIICGYPTLINWDKTDTEK VTAFIEVRVTPQRGQGFEKLAERITNYPEVKSIYLMSGAFDFAIFLEGKTLKEVSMFVST KLSTLEAVAGTATHFVLKKYKDHGMILIDKEPANRMKVTP >gi|222441887|gb|ACEP01000055.1| GENE 13 14485 - 15606 1044 373 aa, chain - ## HITS:1 COG:PAB0403 KEGG:ns NR:ns ## COG: PAB0403 COG0309 # Protein_GI_number: 14520803 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Pyrococcus abyssi # 73 364 64 325 326 103 25.0 5e-22 MRIGKVKESILKRSVLRQLHNHSEKGSPAPGEDAGVLFMPEFMPEFTQGFSGKNGVAMAV NPVEGWTFAAKRAVYGAVNSMLAAGAAPKAISLSILMPEEAEEKQLKALIKEIDSLCMQE NILVLSGHTAVSPYVSTLILSVTAMGSTTENRKDITANKGSITRNKESILVSKESIADSK GNTKQIAVGNADLDLVVAGTVGREGAAMLAAEYAKRLEERYAPSYVEAAKHLFDDGSMTA VEDILQEKEVVSVHDVREGGIFAALWEMAAAANVGLSIDLKNIPIKQHTIEVCEYFNLNP YMLRSGGTLLLACANGARIVEQLQKAGVAAAVIGQTTSGNDRLIRYDDEARFLEPPKMDE YYKVQVDISEICK >gi|222441887|gb|ACEP01000055.1| GENE 14 15642 - 17117 1451 491 aa, chain - ## HITS:1 COG:CAC0691 KEGG:ns NR:ns ## COG: CAC0691 COG2720 # Protein_GI_number: 15893979 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Clostridium acetobutylicum # 37 393 34 388 411 155 33.0 1e-37 MKKRVGIIILVLLLLILAGGGAAFYYYYSKYINIDAIYPGMTIQGMSVGGMTQEEAKAKV 
QEYIDKVSQETVTLQVKKKESTFALSDIGLKCTNMDVVEKAYDFGKTGNVFKRVIEVRKL EKEGMDFPLTFSVDKAETRKVVKKKAKKFLAKKKDATITRKDGKFVITKQVDGVDIDFEA NADKLTEVFSKKDWDHKSVVFPMDYTLDKAKHTKKELSAIKDVLGTFTTSYAGSASGRCA NVENGASLINGTLLYPGDSFSVYSKVAPFTADNGYHLAGSYSNGQTVQTYGGGICQVSTT LYNAVLRAELNVTERSNHSMTVHYVPLSADAAISGTDKDLKFTNNLDHPVYIQGVAGGSS ITFTIYGKEYRASNRKVEYVSETVSTRGPSEKVIKDNTMEEGKRVVESNGRTGYTARLWK VVYIDGKETKRTQVNSSSYMSTPSVVRVGTKKKEVPKPTTQKKDETAKAGEKAEGSKGTT GNNNQGANKAE >gi|222441887|gb|ACEP01000055.1| GENE 15 17239 - 17346 233 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKSDGTKKRQRIATVVVVLVVIAMVITTVVPAMMM >gi|222441887|gb|ACEP01000055.1| GENE 16 17656 - 19629 1507 657 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 221 638 147 548 548 263 38.0 6e-70 MRFLLCGINAKYIHSNLAIFSLKAYADRKKIPGAEIILKEYTINNYVEDILQDLYEAKAD IIIFSCYIWNISFVRELAAELKKVSPEVKIWAGGPEVSYAANKFLMENPAFDLIMQGEGE EVFSELICLTVEEKCRIKDVYKQSESKKVLSGIVEKRYFIERKQAVKEEKDIEDKHFAGE DNVYPTNYIDMSKLQKLQGIAVRDFLGEAALGNAESNIGNKTKIINTGFATLMDMDTIPF VYEDFHLFEHKILYYETSRGCPFCCSYCLSSVDKTVRFRSLPIVKKELDAFLEAKVPQVK FVDRTFNCNRQRAIDIWSYLVEHDNGITNFHFEISSDLLGEEELELFAKMRPGLIQLEIG VQSTNGETVDAIHRHMDLDKLFHYVDRVHELGNIHQHLDLIAGLPYENYERFGCSFDDLY AHEPDQLQLGFLKVLKGTMMEEEVKKYSILYRNQPPYEVLGTKWLSYDEIILLKGVEELV ELYYNSGQYTLTLKYAVPFFESPFRFYEMFSAWYRGKGYHKLNHNRFEKYNILREFLREH IDENERDTLDEIMLYDMYLRENVKGRPAWAKDTAQYKKEWKALYREQGEKLFPEDVQAGI YDSKRAANQSHIEVFEINIKKFEQSGQVEKKQVFCLFDYSRRNPLNRAARTVEWEIL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:19:24 2011 Seq name: gi|222441886|gb|ACEP01000056.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont62.1, whole genome shotgun sequence Length of sequence - 22712 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 6, operones - 4 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 1646 811 ## bpr_I2177 hypothetical protein - Prom 1675 - 1734 8.0 + Prom 1868 - 1927 5.5 2 2 Tu 1 . + CDS 2034 - 2675 705 ## gi|225026886|ref|ZP_03716078.1| hypothetical protein EUBHAL_01142 + Prom 2791 - 2850 6.0 3 3 Op 1 . + CDS 2934 - 3788 555 ## gi|225026888|ref|ZP_03716080.1| hypothetical protein EUBHAL_01144 4 3 Op 2 . + CDS 3790 - 3963 176 ## gi|225026889|ref|ZP_03716081.1| hypothetical protein EUBHAL_01145 5 3 Op 3 . + CDS 3956 - 4969 321 ## CLK_1304 hypothetical protein + Term 5064 - 5111 4.5 + Prom 5874 - 5933 4.6 6 4 Op 1 7/0.000 + CDS 5970 - 8063 1881 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases + Prom 8065 - 8124 3.2 7 4 Op 2 5/0.000 + CDS 8144 - 9346 1269 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 8 4 Op 3 . + CDS 9386 - 10078 536 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 9 4 Op 4 4/0.000 + CDS 10078 - 10536 257 ## COG1045 Serine acetyltransferase + Prom 10539 - 10598 5.6 10 4 Op 5 . + CDS 10619 - 11386 554 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 11 4 Op 6 . + CDS 11389 - 12423 580 ## Cthe_2339 glycosyl transferase, group 1 12 4 Op 7 5/0.000 + CDS 12478 - 13002 292 ## COG1045 Serine acetyltransferase 13 4 Op 8 25/0.000 + CDS 13006 - 14079 518 ## COG0438 Glycosyltransferase 14 4 Op 9 . + CDS 14085 - 15167 685 ## COG0438 Glycosyltransferase 15 4 Op 10 . + CDS 15169 - 16422 506 ## CbC4_0712 hypothetical protein 16 4 Op 11 . + CDS 16403 - 17281 428 ## Cthe_1356 glycosyltransferase 17 4 Op 12 . + CDS 17291 - 18418 663 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) + Prom 18428 - 18487 4.8 18 5 Op 1 . + CDS 18528 - 19481 466 ## Lbys_1376 hypothetical protein 19 5 Op 2 . + CDS 19439 - 20500 249 ## COG1835 Predicted acyltransferases + Term 20503 - 20552 -0.6 20 5 Op 3 . 
+ CDS 20561 - 22054 623 ## Clocel_0891 polysaccharide transport protein, putative + TRNA 22121 - 22227 22.7 # Pseudo ??? 0 0 + Prom 22153 - 22212 80.4 21 6 Op 1 . + CDS 22273 - 22380 159 ## EUBELI_00152 phosphoglucosamine mutase 22 6 Op 2 . + CDS 22455 - 22710 155 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|222441886|gb|ACEP01000056.1| GENE 1 2 - 1646 811 548 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2177 NR:ns ## KEGG: bpr_I2177 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 535 1 536 538 580 54.0 1e-164 MGIFLNSTAPFEAYKITALDKYFVDKTMLIEELIPSIGREQRFLCITRPRRFGKTVMANM VASYFGKAIDSSFVFEHLAIAKSPVYEEYINKYDVIYIDFSRLPENCQTYEEYINRIKTG IKEDLLEEFPELELKEEMSLWDILAKIFQKTNRKFMFIMDEWDAVFHMPFISQKERQEYL LFLKNLLKDQVYVELAYMTGILPIAKYSAASELNMFVEYNMATRERFSSYFGFSEEEVDK LFQIYSETTIRPRITRHDLKIWYDGYCTASGEQLYNPRSIICALSDNQLGSYWTGSGPYD EIFYYIKGNIDEVREDFILMISGEHIEAKVQEYAATAEELTTKNQIYSAMVIYGLLTYED GEVFIPNRELMYKYNELLLTNESLGYVYRLAKESERMLKATLAGDTQTMAEILEYAHNTE SPILSYNNEIELSAIVNLVYLAARDKYRVEREDKAGKGYVDFIFYPEKKNISAIILELKV DASPEEAIQQIKDKQYILRFKGKLAEKAKYTGEILLVGINYNKETKMHSCKIEKINRNLP QAKETPSG >gi|222441886|gb|ACEP01000056.1| GENE 2 2034 - 2675 705 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026886|ref|ZP_03716078.1| ## NR: gi|225026886|ref|ZP_03716078.1| hypothetical protein EUBHAL_01142 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01142 [Eubacterium hallii DSM 3353] # 1 213 1 213 213 347 100.0 3e-94 MKKRLSTILATLLMTVTLIPFAPVHAKAASADVTRQMAGNRQIKDISKKMTAYTTAMNIS VWSTTKPVKMKLNDNQKLSIAVFVRYNYKGDYSYTAKELRSETKKLFGKSASVSKIKSSK KNKNQAMFVCSSDSKFYKDPYMYCGGDFGDTIPDYKIKKVVRTGKNIYTVTTQNRLGCYG EKGRTNIGTTTLKLKKTAAGYVVKGVRYQYNGK >gi|222441886|gb|ACEP01000056.1| GENE 3 2934 - 3788 555 284 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026888|ref|ZP_03716080.1| ## NR: gi|225026888|ref|ZP_03716080.1| hypothetical protein EUBHAL_01144 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01144 [Eubacterium hallii DSM 3353] # 1 284 1 284 
284 503 100.0 1e-141 MNDLRSAIKHNFLKQIIFRLDYEGVMEADVEGCILSLRQKFFDAGFINMGKRTENQFDLQ VKMDLNIPEGNQFSVSNNNKNLVYIFSSDNKEILEISKSFFTLTVDIDQTYETFDKYIVL LAESMAVIKSVSPYFQPLRIGLRKINICFLDNLADLPLYFTKAAFNANDVIEQFSEYDYI ASNMVTILSKDNYQANYVRNIQEGMVQQEDGNQKTTYQIVIDVDVFKEGNREILPMLSDE QIIKNTLQIQNTIEFEIFIKSLSDKLIQALKQEVFQDDVIRGVV >gi|222441886|gb|ACEP01000056.1| GENE 4 3790 - 3963 176 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026889|ref|ZP_03716081.1| ## NR: gi|225026889|ref|ZP_03716081.1| hypothetical protein EUBHAL_01145 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01145 [Eubacterium hallii DSM 3353] # 1 57 1 57 57 87 100.0 2e-16 MMKFVSKSRNPYVAQQENIKKESVKGQFVSKSQAGTIHMFRDNTANIVGKKVEFKHV >gi|222441886|gb|ACEP01000056.1| GENE 5 3956 - 4969 321 337 aa, chain + ## HITS:1 COG:no KEGG:CLK_1304 NR:ns ## KEGG: CLK_1304 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 24 320 12 318 328 141 33.0 3e-32 MYSYDMHLSGRSISRYKQIIYFEKQLLIETYIIGYKSQGESILFFIKTDGGISFSGLIDC FCLKDIDKVRDVLEANKITALDFICWTHPDWDHSKGLKSIIDNYTSEKTHIWIPESVEMG EIKCSKEVRELFQYLKECTVNCSTEYNVYSTSDRKDMMCYNSISFRKDTNNFPLELVSYA PNSKLVRKQTYTDKYIKNDRSIFFVLSLGSVRIFFTGDIEDDTIEKIPANFFGEHIHIMK IPHHGSGSSVKMLDLGWDGCDIACSTVMRKGNDLPVKDVMEQYKKNAQLLLCTGKSNKEK EEEDYGVIRIITDVIENKFSIDTEGNAEIWNSISDVR >gi|222441886|gb|ACEP01000056.1| GENE 6 5970 - 8063 1881 697 aa, chain + ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 85 614 52 545 608 425 43.0 1e-118 MSTIKKNGLERECTESEGKRHGFRFRHWKVIAMYLVLYDIIAVNLSYFFGLWLRFDLRYS HIPAEYMDAFVKFAPFYTAFVLIMFYAFRLYNSLWRFASFSELNRIFAATVITTIFQVVG ITVCLERMPLSYYIVGCGMQFALTLLVRFAYRYITLERTKMEHLQEADSVHRAMIIGAGA AGQVILKELKSSNLVDARPCCVIDDNKNKWGRYMLGIPIVGGRDSILEAVERYNIDQILF 
AVPTASPEEKRDILNICQETNCELKSLPGVYQLANGQVSLSKLKPVAVEDLLGREPIKAN LDEVFQYIQGKKVLVTGAGGSIGSEISRQVAGHHPEQLILFDVYENSLYDLEQWLKRTYP DLNMIALTGSVRDSRRICQVFAQYKPDIVYHAAAHKHVPLMETSPNEAIKNNAVGTYKTA YAAMINGCQRFVLISTDKAVNPTNIMGASKRICEMIIQSFDRMIKEGKAHEMPMLYTHGE EENGIMHQIRQKKFIPRTEYVAVRFGNVLGSNGSVIPLFKRQIEEGGPVTVTHPDIIRYF MTIPEAVSLVLSAGTNAEGGEIFVLDMGSPVKIDSLARNLIRLSGHEPDVDIKIEYTGLR PGEKLFEEKLMAEEGLKKTKNKLIHIGCPIPFNTEQFLHQLTVLMKAAYDNDQNIREMVE LIVPTYHPAGKNGSENKGKVYELQRKMMEQAAVAKEV >gi|222441886|gb|ACEP01000056.1| GENE 7 8144 - 9346 1269 400 aa, chain + ## HITS:1 COG:SP1837 KEGG:ns NR:ns ## COG: SP1837 COG0399 # Protein_GI_number: 15901666 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 3 400 6 403 408 520 61.0 1e-147 MKIPFSPPDITEAEVEQVAEALRSGWITTGPKTKELERQVADLCGVNRAVCLSSQTACAE TTLRVLGVGEGDEVIVPAYTYTASASVVYHVGAKIVLVDVQKDSLEMDYDKLAEAITEKT KVIIPVDLGGVPCDYDKIFSIVEAKKSLFHPVNKLQEAIGRVIVMTDAAHAFGATWHGKP VGSIADFSNFSFHAVKNFTTAEGGAATWRDIEGIDNEEIYHQYQLLSLHGQSKDALAKTQ LGAWEYDIVGPWYKCNMTDVVAGIGLAQMKRYAGLLERRKEIIGKYDEAFKPLGIEVLNH YTEEHQSSGHLYITRIPGITLEQRHEIIVKMAEQGVACNVHYKPLPMMTAYKNLGFDIKD YPNAYKRFENEITLPLHTKLTDEEVEYVIEKYSEIVKEYI >gi|222441886|gb|ACEP01000056.1| GENE 8 9386 - 10078 536 230 aa, chain + ## HITS:1 COG:SP1838 KEGG:ns NR:ns ## COG: SP1838 COG2148 # Protein_GI_number: 15901667 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Streptococcus pneumoniae TIGR4 # 1 225 1 229 230 224 54.0 7e-59 MKKWKELPKFMQCDEVKEYYDILSKKKLSLRLKRAFDIVAATCILMITAIPMIIIAIRIA TESKGGVFYRQERVTTYGKKFKIHKFRTMVANADQIGSAVTVSGDNRITPTGAFLRKYRL DELPQVFDVLSGNMSFVGTRPEVTKYVKKYTKEMRATLLLPAGITSEASIRYKDEAELLD AADDVDRVYVEEVLPGKMKYNLQSIKKFSFLGEIVTMFRTVFAVLGKEYR >gi|222441886|gb|ACEP01000056.1| GENE 9 10078 - 10536 257 152 aa, chain + ## HITS:1 COG:SA0487 KEGG:ns NR:ns ## COG: 
SA0487 COG1045 # Protein_GI_number: 15926206 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Staphylococcus aureus N315 # 59 142 65 148 213 72 48.0 4e-13 MEMSGWRAVFEADMYRYGASKVDKYIKKWNYFFRRCQYASGGKKIIWHLLLLKHGTKHGI EIDYPVKIGKGLFIAHPYGITINDKCIIGMNCNIHKGVTIGQENRGKRQGTPIIGDNVWI GMNVTIVGNIKIGNDVLIAPNTYLNLMYQVTR >gi|222441886|gb|ACEP01000056.1| GENE 10 10619 - 11386 554 255 aa, chain + ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 2 237 6 242 257 187 43.0 2e-47 MIDGLVSIIMPSWNTERFIAETIKSVINQTYTNWELLIVDDCSTDKTDEIVASFNDNRIK YFHNEKNSGAALTRNKALREAKGEWIAFLDSDDLWMPEKLEHQIDFMNKNGYSLSFTEYE KIDEDSKPLNIYVSGPDKVNKRKIYNYDYIGQLTMMYSTKVFGLIQIKDIKKNNDYAIRL QLYKKPGTCAYLLKENLAKYRVRKVSISHDKFRRKFKSHYDLFHMCDEKPAVVAAWYTCW NMFFGVLKKKNYERG >gi|222441886|gb|ACEP01000056.1| GENE 11 11389 - 12423 580 344 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2339 NR:ns ## KEGG: Cthe_2339 # Name: not_defined # Def: glycosyl transferase, group 1 # Organism: C.thermocellum # Pathway: not_defined # 1 341 1 342 347 319 48.0 1e-85 MKKVCVIGHFGFGENLLNGQTIKTKIVTKELDKQFGADQVVKIDTHGGAKSLPSVVFQMI KAFIKCDNIIIFPAHNGIKIFVPLCNSINVFFHRRLHYVVIGGWLTEFLKKRKNLTKALM SFDGIYVETNTMRKALEIQGFNNVYVMPNFKDLNILKESELVYPNTEPYSLCTFSRVMKE KGIEDAVNAVKTVNEHFGRTVYTLDIYGQVDSAQTEWFNELENTFPSYIKYGGLVPFDKS VEVLKNYFALLFPTYYEGEGFAGTLLDAMAAGVPVVASDWRYNSEIVNEKNGYVYPVHDN YAFIDTLINVGNNLDLLLSKKPSCLKEAEKYRAENVIQCLISKF >gi|222441886|gb|ACEP01000056.1| GENE 12 12478 - 13002 292 174 aa, chain + ## HITS:1 COG:VC0919 KEGG:ns NR:ns ## COG: VC0919 COG1045 # Protein_GI_number: 15640935 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Vibrio cholerae # 6 162 9 167 184 80 35.0 1e-15 MNKYFKCDYYRMTGEKWNPVKGTLIMLLRYDIRYLYLIRRKKTKIRTLFAMRAARKYGLE 
ILSDNVGPGLYLGHAHNINVSPLAKIGKNFNLNKGCTIGRENCGKRIGAPTLGDGVWVGS NSMVVGNIHTGNDVLIAPNAYVNSDVPEHSIVIGNPAKIIKKENATDSYITMRI >gi|222441886|gb|ACEP01000056.1| GENE 13 13006 - 14079 518 357 aa, chain + ## HITS:1 COG:TM0622 KEGG:ns NR:ns ## COG: TM0622 COG0438 # Protein_GI_number: 15643387 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermotoga maritima # 1 353 2 370 388 210 32.0 4e-54 MKIIQIIPLLDLAGAETMCENLTNSLIKLGHEVTVISLYDYHSIMTERMEKNGIKIIYLN KKSGLDLSLIFKLVKIFKEYKPDVVHSHLYAAKYAHVAASICGIKAKIYTIHNIARNEAG KMNRTFNRFLFKKCHVVPVSLTEEIQKSVADEYNIDFKNTPVVFNGVSMEKCHKKNDYSG NSRILHIGRFSKQKNHEVLVKAFSRVVNSGSDVSLYLYGQGELEEAIKELVKNLNMDQNI FFCGLTDDVYSVMESSDIFVLPSLFEGMPMTLIEAMGTGMPILASNVGGIPDMIENEKSG LLCEPTVDGVAAGLERLISSADDRKLYGQNAVISSEKFSADKMAKDYCEIYLKACKG >gi|222441886|gb|ACEP01000056.1| GENE 14 14085 - 15167 685 360 aa, chain + ## HITS:1 COG:SMb21250 KEGG:ns NR:ns ## COG: SMb21250 COG0438 # Protein_GI_number: 16264502 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 348 50 400 427 145 28.0 2e-34 MKLVFITPGMTFGGAERVISILSNIWCDMGHDVSIFITSSDKPSAYRLNEKVKVEYYDDY RVDGVSHFKLISSIRDFVDKENSDCVLSFMNDVCAYTILSLTGKNIPIIYSERNDPNKTN QSKVDKIFRKIVEFGASHIVFQTEGAKQCYSKKVQRKSSIILNPVELSRIPERKKEDINY SEIVTVARLEPQKNQELLIDAFNIVSNKHKDVVLKIFGEGSLKKKLQNRIDELGLKDKVY LMGTKSDVLNWIKESYCFVLTSDFEGLPNSLIEAMCMGIPCISTDCSPGGARQLLSDDRG LIVPCGDKERLAEAINIYLENKDIAAQYGQAAYGLRKEIDCNKVAKEWLKLMEKIVRGEN >gi|222441886|gb|ACEP01000056.1| GENE 15 15169 - 16422 506 417 aa, chain + ## HITS:1 COG:no KEGG:CbC4_0712 NR:ns ## KEGG: CbC4_0712 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_BKT015925 # Pathway: not_defined # 8 397 19 407 416 144 30.0 1e-32 MKISYKQITYTVIVVFFSVALNPEWSANPILWWLSFFSVLAVISIANSFKINFKINLFKI WLISFGGMCSISIIYAINRSVSFDCIKTLVILFIILFLVDEEIDSQDELETYMKLFLISL FIMMIYVLVKVDLKSFQLAQHGEATTGLWNGNDVGMKCALFIIVMLYFLNNNTKIIQRIV VLISMVIPVVLLYYTASRKAFLMVVLGISLFYYLKHPNKKIRNLIVIFMSIYFLYILIMN 
NDVLYNAIGWRIEGALALVNKSGKVDSSALLRKKYIDVGIAAWKKSTVLGYGIDNFRIIN LTATGHLTYSHNNFIEMLVGVGGMGFLIYYFYYIKLLYEYLRIYLKHHATQILNVMTICF ILMFAMQFAVVSYNAMEQGLVILFMSKAVEFNKKNENYAVLQKKGRRRGEVNVYSNI >gi|222441886|gb|ACEP01000056.1| GENE 16 16403 - 17281 428 292 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1356 NR:ns ## KEGG: Cthe_1356 # Name: not_defined # Def: glycosyltransferase # Organism: C.thermocellum # Pathway: not_defined # 20 291 30 298 301 300 56.0 7e-80 MYTRIFDALNYRKMLNWLPDKAFLKAAFRARFGRKLNLNKPETFNEKMQWMKLYNRKPEY TMMVDKYLVRNYVREKIGEKYLIPLLGVWDDSDKIDFDKLPMQFVLKCNHNSGLGMCICK NKNNLDINQVKNELRKGMEQDYYLTGREWPYKNVPRRIIAEKYMVDESGYELKDYKYYCF DGKVKIVMINSDRMSSEKTKANYFDENYQQLDFVWGYDHAKIPPAKPEKFEEMKCLAEKL SEGIPHVRIDFYQTPNGIFFGEMTFFDGCGFDPIEPIEWDYKIGSWIKLPME >gi|222441886|gb|ACEP01000056.1| GENE 17 17291 - 18418 663 375 aa, chain + ## HITS:1 COG:BS_ywtB KEGG:ns NR:ns ## COG: BS_ywtB COG2843 # Protein_GI_number: 16080641 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Bacillus subtilis # 35 238 92 299 380 82 29.0 1e-15 MNIIIGADLVPTKSNIELFEKGDVDTLIGRELTDIIRSADYRIFNLEVPLVNESSPIKKC GPNLIAPTQCISAYKAMEIDLLTLANNHILDQGIKGLDSTIKTLKKADISYIGVGNSIKD ASKPKIINCEGKKIGIYACVEHEFTVATEKESGANPIDLLETPDHIVELKKQCDYVIVLY HGGKEQYRYPSPNLQKTCRKLVEKGADLVVCQHSHCIGCREEYLQGTIVYGQGNFLFDDE ENEYWQTSLLIMISDDFEVKYIPLKKNKNTVRVAKENEAKNILEQFNIRSEELTDSKKVG QHYDELAEYYKNFYLLNIMKIKRGLLYRIINKLTKGLLEKTIIDAAYNSTYKYTLRNYVE CEAHSELVLHILKKL >gi|222441886|gb|ACEP01000056.1| GENE 18 18528 - 19481 466 317 aa, chain + ## HITS:1 COG:no KEGG:Lbys_1376 NR:ns ## KEGG: Lbys_1376 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 2 247 5 253 278 142 37.0 2e-32 MEKIAILVVYYGEFPPYFDLWIRSCEYNQTIDFFIVTDNKFDDLPDNVKIINITLDEFRK LAEKKLHKKVRIDYPYKLCDFKPLYGHILEDYLSDYDYWGHCDVDLIFGDLRRFFSEYKI NKYDRFLHLGHLSLYKNTPECKEYYKLDGSACGNWEEVVTNPRNFLFDEWSGVYGIYHKN NISMFEERIFADISMIYKRFRLALDDPNYNKQVFYWENGHIYRKYWIDGKSFKEEFIYIH 
FKKRKFNKEGFDASTTTAFYIGPEGFTEKTASATMDDVKKINPYYGEKYELKELRHMQWE EKKEWWKKRINQLWKRE >gi|222441886|gb|ACEP01000056.1| GENE 19 19439 - 20500 249 353 aa, chain + ## HITS:1 COG:mll0134 KEGG:ns NR:ns ## COG: mll0134 COG1835 # Protein_GI_number: 13470430 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Mesorhizobium loti # 20 334 15 329 370 84 28.0 4e-16 MVEKTNKPAMEKRINSLWLLDVFRGLSALLIVLYHYTTQYDKSIGHLAKYNFSFPWGCHA VYAFFMLSGFLIVYTYKENFNMVSFLKKRCVRLYPMFWVCMVVTTIYMAFIFPERVPSVR QFLLNITMFPTLFGSTAIDGVYWTMPKELIFYIIFAIIACGGGLKGKQKAKWLWLCIVIE VICLIYCFGPFDLPAQWGIIFFMIPDYMYVFLAGCAVYYLNYTENLQQKGMMIIYLIICT VVCKCLCSESTFIFFIFSLCMLILCSREKTNQKTEGMKQILRPFIFLSEISYVLYLTHQF IGFGIIRKMEMQGMTAEVWILLPIAHAILLATILHYGVELKINQMLKRCFAKI >gi|222441886|gb|ACEP01000056.1| GENE 20 20561 - 22054 623 497 aa, chain + ## HITS:1 COG:no KEGG:Clocel_0891 NR:ns ## KEGG: Clocel_0891 # Name: not_defined # Def: polysaccharide transport protein, putative # Organism: C.cellulovorans # Pathway: not_defined # 1 444 1 444 503 263 38.0 1e-68 MKKKTVIENLVVSTLTQLITLVLGMILPRLILLAWGSEYNGLLSSVTNILRYLSLLEAGF NTATLQALYKTVGQDDKEQTSIVIRTSQHYYHRISVVYAMMVLLISVVYPLIISSEIPYW ETFAIIILQGAVGVINFAFRAAYQQLLNAEGKYYVISLITLLTTVLTYAAKIVSIGYFGS VVIMQIMSVVVIIVQVIIYAIYFNRRYKWIDKNARLDESLLENRKYYFTQQIAGLIFNST DTFVLSVFCGLKVASVYATYNLVYTALAQIISLIRGSTSFVLGQSFHKDKTFFEKVYNAY SAMQSMIGGIMASISMVLIINFVKLYTQSITDINYIDFIAALIFSFNLMLECSRGASLAS ANIAGKAPDTTWRYILEAAINLVSSLILVNFIGMKGVLLGTAIAGIYRTTDSIIYTNNHV LNRSSKYEFKNVIIDFILFFIIVYASHNIISVNAINYIELICYAIGTGIVVTVVYGAAFI LLNKNEARDLIKAIKQN >gi|222441886|gb|ACEP01000056.1| GENE 21 22273 - 22380 159 35 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00152 NR:ns ## KEGG: EUBELI_00152 # Name: not_defined # Def: phosphoglucosamine mutase # Organism: E.eligens # Pathway: Amino sugar and nucleotide sugar metabolism [PATH:eel00520]; Metabolic pathways [PATH:eel01100] # 1 35 1 35 450 72 94.0 5e-12 MGKYFGTDGFRGEANVNLTVDHARKVGRFLGWYYG >gi|222441886|gb|ACEP01000056.1| GENE 22 22455 - 22710 155 85 
aa, chain + ## HITS:1 COG:mll3879 KEGG:ns NR:ns ## COG: mll3879 COG1109 # Protein_GI_number: 13473323 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Mesorhizobium loti # 1 83 57 139 450 86 50.0 9e-18 MFEYALVAGLTASGADAYLLHVITTPSVAYVARTEDFDCGIMISASHNPYYDNGIKLING NGEKMDEATIHLVEAYLDSELEVFG Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:20:36 2011 Seq name: gi|222441885|gb|ACEP01000057.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont63.1, whole genome shotgun sequence Length of sequence - 16212 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 9, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 62 - 412 283 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 593 - 652 9.6 + Prom 662 - 721 8.4 2 2 Tu 1 . + CDS 747 - 1064 469 ## COG4496 Uncharacterized protein conserved in bacteria + Term 1099 - 1137 5.3 + Prom 1083 - 1142 4.2 3 3 Op 1 11/0.000 + CDS 1170 - 2897 2326 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits 4 3 Op 2 1/0.000 + CDS 2911 - 3483 747 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 5 3 Op 3 . + CDS 3533 - 4837 1359 ## COG1541 Coenzyme F390 synthetase + Term 4853 - 4916 19.3 - Term 4839 - 4904 10.1 6 4 Tu 1 . - CDS 4909 - 6654 782 ## gi|225026914|ref|ZP_03716106.1| hypothetical protein EUBHAL_01170 - Prom 6827 - 6886 6.2 + Prom 6705 - 6764 6.2 7 5 Tu 1 . + CDS 6905 - 8617 1485 ## COG5492 Bacterial surface proteins containing Ig-like domains - Term 9139 - 9199 0.2 8 6 Tu 1 . - CDS 9302 - 10024 759 ## EUBELI_01898 hypothetical protein - Prom 10081 - 10140 2.5 - Term 10088 - 10125 -0.3 9 7 Tu 1 . - CDS 10174 - 11361 1315 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 11557 - 11616 7.2 + Prom 11516 - 11575 8.5 10 8 Tu 1 . 
+ CDS 11697 - 13331 1711 ## COG0318 Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II - Term 13202 - 13246 -1.0 11 9 Op 1 . - CDS 13496 - 14419 460 ## gi|225026920|ref|ZP_03716112.1| hypothetical protein EUBHAL_01176 12 9 Op 2 . - CDS 14433 - 15965 1686 ## COG0591 Na+/proline symporter - Prom 16088 - 16147 7.2 Predicted protein(s) >gi|222441885|gb|ACEP01000057.1| GENE 1 62 - 412 283 116 aa, chain - ## HITS:1 COG:SPy1836 KEGG:ns NR:ns ## COG: SPy1836 COG0671 # Protein_GI_number: 15675663 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Streptococcus pyogenes M1 GAS # 1 107 57 163 167 80 42.0 1e-15 MLISFIAIPALCFLAVTIFRKVVNKKRPYEKLPIQSLIKKDKKGQSFPSRHVFSIFLIAT LWFCFWKPVGIFLFIAGVFLAIVRVIGGVHFVTDVCAGALLGVLAGLISNYVFLIS >gi|222441885|gb|ACEP01000057.1| GENE 2 747 - 1064 469 105 aa, chain + ## HITS:1 COG:lin1875 KEGG:ns NR:ns ## COG: lin1875 COG4496 # Protein_GI_number: 16800941 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 5 97 3 98 98 83 47.0 1e-16 MGKNIPKERKEGMYKAILELKSLEECFDFFEDICAMTELRSMEQRFEVASMLKKEKVYTE IMSETNASSATISRVNRMLNYGTGCLGEVIDRLNQKEGSEEAKES >gi|222441885|gb|ACEP01000057.1| GENE 3 1170 - 2897 2326 575 aa, chain + ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 1 574 1 575 584 667 56.0 0 MKQLMLGNKALARGLYEAGCSVVSSYPGTPSTEVTEEAAKYDEIYCEWAPNEKVAMEVAF GASLAGKRSFCGMKHVGLNVAADPLFTCSYTGVNGGMVICVADDPGMHSSQNEQDSRHHA IASKIPMLEPSDSTEAYEFAKKAFELSEEFDTPVIIKMCTRVAHSQSIVEPGERVVPETI PYEKNIAKYVMMPGNAIRRHPVVEERTRKLIEYGNHCDFNRVEMGDTKLGIITSSTCYQY AKEVFGDKASILKIGLVNPLPDQLILDFAAKVEKLAIIEELDPILENHCKELGLTVTGKD ALPMEGEFSQNLIAEKLGIEVPAHKELGEALPGRPPVMCAGCPHRGLFFTLSKNKCTVLG DIGCYTLGAVAPLSAMEMTLCMGASISSIHGFNKALGKESEGKTVAVIGDSTFMHSGMTG 
LANIAYNQSNSTVIILDNSITGMTGHQQNPTTGYNIKGDPAGKIDLESLCRSMGFNRVRV VDPYDLAACDTVIKEELAAEEPSVIISRRPCALLKYVKHNPPLKVEKENCVGCKSCMRLG CPAISVKDKKAVIDTTLCVGCGVCQQLCKFDALQA >gi|222441885|gb|ACEP01000057.1| GENE 4 2911 - 3483 747 190 aa, chain + ## HITS:1 COG:CAC2000 KEGG:ns NR:ns ## COG: CAC2000 COG1014 # Protein_GI_number: 15895270 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Clostridium acetobutylicum # 4 189 5 190 192 209 53.0 3e-54 MTKNIMIVGVGGQGSLLASKLLGHLLLSEGYDVKVSEVHGMSQRGGSVVTYVRFGEKVYS PIIDKGEADFIVSFEKLEAARYVEYLRPDGRIVVNTQEIDPMPVIIGAAQYPENLIEKME ALGIKVDAMDCLSLAEEAGSSKAVNIVLMGRLSKYFDIPVEKWQKAIEECVPAKFVELNH KAFALGRGEE >gi|222441885|gb|ACEP01000057.1| GENE 5 3533 - 4837 1359 434 aa, chain + ## HITS:1 COG:MTH1855 KEGG:ns NR:ns ## COG: MTH1855 COG1541 # Protein_GI_number: 15679843 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanothermobacter thermautotrophicus # 5 431 3 432 433 431 48.0 1e-120 MGNYYQPEIETMPVEKLQALQSERLVEQVKYVYDHVEFYRNKMKEAGVEPEDIKGIEDLH KLPFVTKDDLRDQYPYGFLGVPLSECVRMQSTSGTTGRRVVAFYTQEDIDVWEDCCARAI MAAGGTKDDVCHVAYGYGLFTGGAGLHGGSHRVGCMTLPMSSGNTERQIQFMEDLGSTIL CCTPSYAASLGESINEGGHRENIKLKAGIFGAEPWTEEMRHNIEESLGIKAYDIYGLTEI SGPGVSFECEEQKGMHIQEDHFIPEIINPDTGEVLPEGEIGELVFTCITKKAYPLLRYRT RDLCYLTRKKCSCGRTHIRMHKPMGRSDDMMVVKGVNVWPSQIEAVLLKQGYQANYQILV DRIGNNDTIEVQVEMTPEMKQEDVAIAAREKKIVHGLKNMLGIKVDVSILEPKTITRSEG KAVRVVDKRNLYNK >gi|222441885|gb|ACEP01000057.1| GENE 6 4909 - 6654 782 581 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026914|ref|ZP_03716106.1| ## NR: gi|225026914|ref|ZP_03716106.1| hypothetical protein EUBHAL_01170 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01170 [Eubacterium hallii DSM 3353] # 1 581 1 581 581 1115 100.0 0 MTDFKDILIKYMEELDCSSKELADSSGLSAATISRYRSGERIPDVESDNLKQLIYGIVKL AQKRNLSSINDITVHSDFLRFLPDISADFSILQANLNTLFTMLSINTSEFARFLNYDASY 
ISRIKSGERQPADPELFLVNTALFVTKRYTKKTDLSILANLFDCSLEELREEKTYLSLLK HWLQTKHTNTDKEQQSLSHFLQKLDEFNLDDYIRVIHFNELKVPTAPFQFPGSKNYFGLK EMMNSELDFLKATVLSKSQEDVIMYSDMPIEEMAKDSNFPKKWMFGMACMLKKGLHLHQI HQIDRPFAEMMLGLESWIPMYMTGQISPYYLKENTGQTFMHLLKVSGAAALQGEAIYGHH TQGRYYLTKHKTEISYYKEMAELLLKKASPLMEIFRENSMMPYHAFLQADAKTSGKRYHI LSALPLQTLNSSLLEDILNQNQINKEDAQKIKTYIDKKSAQIQQILSHDMITEEFPILSK EEFSRFPIALPLSDIFYEENIYYTWEMYKQHLESTLNYEKTHQNYCIKQNQQSAFRNIQI RIHEKKWVLVSKNRTPAIHFLIRHPKMRNAFENMIIPIVEL >gi|222441885|gb|ACEP01000057.1| GENE 7 6905 - 8617 1485 570 aa, chain + ## HITS:1 COG:CAC2107 KEGG:ns NR:ns ## COG: CAC2107 COG5492 # Protein_GI_number: 15895377 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 398 567 199 358 439 65 32.0 3e-10 MKRWKKIMALCFAFLMSFVMLGSSVEAANGPNEQDWSSAYIVIDDGSANPKQLYNANKSV TISTVRNISYDKKTNTLTLNGYQEAEKKIVANEMGDDFKVKVVGNNQIQGIAVWGYSYGG SLTLEGNGSLEINKNRVQGEPISLMAEEANAQFKVKQGVTLKVYRDNKFPSSIVVSYSAA AKDGIVFEGNDTLNQKVQKTSQKEAKRMTAAVYELADFIPCTPKDDSVTGIYGALRQAGS SDDEPYYNIYKIQEVFPEIWLASMQDENGNYFWSSNFTINTNAQKIKAYTTSVTSYSLMM LNKKTSSGTVSDYVAYFDYDQQKNEEVCSIYKIKKKANKNYAIPVEGEQNQPMSILDKYE IVPTGRMFYNYNFSGDILYSGSDQNTKPNIDTNGNRRLVSSIRLSGISKKIAAGKKLTLK ATVLPNNASNKKLIWKSSNPKVATVTQNGIVTLKKKTGGKKVTITATAKDGSKKYASWKI TSMKGIVKKVKITGSKPVKAGKKLKLKAKVIATKKANTKLLWTSGNAKYATVNAKGIVTT KKSAKGKTVKITAMATDGSGKKHTVKIKIK >gi|222441885|gb|ACEP01000057.1| GENE 8 9302 - 10024 759 240 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01898 NR:ns ## KEGG: EUBELI_01898 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 240 56 288 288 188 40.0 2e-46 MKEADIDYSVNLPVMTKPSQVEKVNSSLIQQQEYLLDMKIITFGGMHPNYTNYKEELLRL KQAGIPGIKLHPAYQNVDLDDIRMMRIIDEASSQGLITLVHAGIDIGIYDHNYSSVAQIL KLLKEVAPEKLVLAHMGNWGRWNEVESDLAGAPVYFDTAFSYGPVTPLPDAPRAPYISYN LHPKDFVRLVRKHGTDKILFGTDSPWENQQDYVKRISHMDFTKEELEQILSENAKKLFTI >gi|222441885|gb|ACEP01000057.1| GENE 9 10174 - 11361 1315 395 aa, chain - ## HITS:1 COG:CAC2832 KEGG:ns 
NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 395 1 393 393 507 62.0 1e-143 MLNEKMVALGTQRSVIRELFEYGKQRAAEVGPENVFDFSIGNPSVPAPDCVNETAIKLLN EMDSVVLHGYTSAQGDAGVRKMIADSINSRFQTSFNADNLYMTVGAAAALCCCLNGLTCP GDEFIVFAPYFPEYKVFIEAAGASMKLIPADIEAFQIDFRAFEEAINEHTKGVLVNSPNN PSGAVYSEETVKKLAEILEAKSAEYHHPIYLIADEPYREIAFEGTVVPYLPKYYKNTLVC YSWSKSLSLPGERMGYVVVPDEVDEHELVYAAIAGAGRSLGYVNAPGLFQRVCALCASET SDISVYETNKNILTKGLRDMGYHVVEPGGTFYMFPRTLIDDDIAFCEKAKLDHNLLIVPG TGFGCPGHARISYCVPTEQVERSLAAFEKLVKSYK >gi|222441885|gb|ACEP01000057.1| GENE 10 11697 - 13331 1711 544 aa, chain + ## HITS:1 COG:CC0966 KEGG:ns NR:ns ## COG: CC0966 COG0318 # Protein_GI_number: 16125218 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl-CoA synthetases (AMP-forming)/AMP-acid ligases II # Organism: Caulobacter vibrioides # 51 532 35 521 530 252 34.0 1e-66 MPVTDLLERNHKLYGEEVALVELNPTMKDTRKTTWKEYDLVQQTSSVPYRREITWSVFDE KANRVANMLLSRGVQKGDRVAILMFNCLEWLPIYFGILKTGAIAVPFNFRFASDEIKYCA DLAEVHVLFFGPEFIGRIEAIEEELSKGRLLIYVGDGCPTFAESYRELVADCSSQAPLVR LEDEDDAAIYFSSGTTGFPKAILHNHESLLQAAQMEQKHHLTNRDDVFLCIPPLYHTGAK FHWMGSLCAGSKAVLLKGTTPEIILDAVSKEHCTIVWLLVPWAQDILDSLDRGDLKLENY ELSQWRLMHIGAQPVPPSLIKHWKEYFPNHLYDTNYGLSESTGPGCVHLGIENIDKVGAI GIPGYRWSCKIVDENGNALPQGEVGELCVQGPGMMTCYYNDPKATAETIKDGWLFTGDMA MQDEDGFYFLVDRKKDVIVSGGENIYPVQIENFLSAFDKVKDVAVIGLPDKRLGEITGAV ISIKDGMECTEEEINEYCMKLPRYKRPKKLIFADVPRNATGKIEKPALRKKYGAEYLVAQ QNRA >gi|222441885|gb|ACEP01000057.1| GENE 11 13496 - 14419 460 307 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026920|ref|ZP_03716112.1| ## NR: gi|225026920|ref|ZP_03716112.1| hypothetical protein EUBHAL_01176 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01176 [Eubacterium hallii DSM 3353] # 1 307 1 307 307 572 100.0 1e-161 
MASKNTNIVSKTLVESYIRKQVKALKQSPAETIDTLTKQLLDKPKGSLEKYLLHITPEKL QDTQSGYYKLFHDILPKVNTDHLVTLGMNIGYNGIFTASKKTQTNSPESKSDALWAFMLS IDGINYEKKSAQYKAQISNAKKNGTRCFMLFIEKNAWKLLPLIKKEKDCAFFLFCPSSCL TEKFLDAVKLTKNLFISVAYNSNGNKEQILSTNVFNRLRSRKLPYGIHYYYSVSDLDNFK SGEIFKKLSKEKPLFSFFLPKKADSGKDTVTEEEKKAVYEIIIKERYWLKYPSIPFEIEK DYYYLIH >gi|222441885|gb|ACEP01000057.1| GENE 12 14433 - 15965 1686 510 aa, chain - ## HITS:1 COG:BS_opuE KEGG:ns NR:ns ## COG: BS_opuE COG0591 # Protein_GI_number: 16077734 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Bacillus subtilis # 9 487 12 476 492 172 28.0 2e-42 MKLLILIAYFIVLITIGFVCRQKATDVNGFVLGGRSVGPWLTAFAYGTSYFSAVVFVGYA GQFGWRFGIAATWAGIGNALIGSLLAWVILGRRTRIMTQHLDSATMPQFFGERFQSKSLK IAASVIIFIFLIPYTASLYNGLSRLFGMAFHIDYSVCIVLMAILTAIYVIAGGYMATAIN DFIQGIIMLAGIIAIIVAVLHGKGGFMAALDGLGHVSDPTVSSTPGVFGSFFGPDPVSLL SVVILTSLGTWGLPQMVQKFYAIDNEDSIKKGTIISTIFALVVSGGCYFLGGFGRLFSDK IDIAANGYDSIIPTMLENLHPLLIALVVILVLSASMSTLSSLVIASSSTLTIDLIKDNVV KDMSDKKQVLCIRVLIVIFIAISAIIAVIQYKSAVTFIAQLMGISWGALAGSFLAPFLYS LYWKKTTKAACWTSFVFGCMIMVLNLVAGSHFPALLQSPINCGAFAMLIGLIIVPVVSLF TKKPDAALIENTFACYDKTSTVTQKTALGK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:21:21 2011 Seq name: gi|222441884|gb|ACEP01000058.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont64.1, whole genome shotgun sequence Length of sequence - 15266 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 9, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 742 608 ## gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 + Prom 764 - 823 5.9 2 2 Op 1 . + CDS 864 - 1256 219 ## gi|225026923|ref|ZP_03716115.1| hypothetical protein EUBHAL_01179 3 2 Op 2 . + CDS 1309 - 1539 138 ## gi|291520905|emb|CBK79198.1| hypothetical protein CC1_02460 4 2 Op 3 . 
+ CDS 1601 - 1837 184 ## gi|225026925|ref|ZP_03716117.1| hypothetical protein EUBHAL_01181 5 2 Op 4 . + CDS 1889 - 2557 481 ## Cphy_0675 hypothetical protein + Term 2647 - 2708 9.0 6 3 Op 1 . - CDS 2704 - 3999 948 ## COG0582 Integrase - Prom 4022 - 4081 2.7 7 3 Op 2 . - CDS 4086 - 4715 545 ## EUBREC_3495 hypothetical protein - Prom 4739 - 4798 11.5 + Prom 4789 - 4848 8.6 8 4 Tu 1 . + CDS 4890 - 5120 273 ## EUBREC_3496 hypothetical protein + Prom 5149 - 5208 1.6 9 5 Op 1 . + CDS 5265 - 5579 322 ## EUBREC_3497 hypothetical protein 10 5 Op 2 . + CDS 5579 - 5884 97 ## EUBREC_3498 hypothetical protein 11 5 Op 3 . + CDS 5860 - 7581 1219 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 12 5 Op 4 . + CDS 7301 - 8269 273 ## EUBREC_3500 hypothetical protein + Prom 8272 - 8331 5.2 13 6 Op 1 . + CDS 8358 - 8951 270 ## EUBREC_3502 hypothetical protein 14 6 Op 2 . + CDS 8952 - 9815 648 ## EUBREC_3503 hypothetical protein + Term 9832 - 9886 1.7 + Prom 9879 - 9938 6.1 15 7 Op 1 2/0.000 + CDS 9964 - 10659 343 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 16 7 Op 2 . + CDS 10676 - 11185 300 ## COG4767 Glycopeptide antibiotics resistance protein 17 7 Op 3 3/0.000 + CDS 11199 - 12263 825 ## COG0642 Signal transduction histidine kinase + Term 12323 - 12381 15.5 + Prom 12295 - 12354 5.9 18 8 Op 1 . + CDS 12404 - 13291 490 ## COG1131 ABC-type multidrug transport system, ATPase component 19 8 Op 2 . + CDS 13293 - 14534 786 ## EUBREC_3508 hypothetical protein + Term 14627 - 14673 15.9 - Term 14619 - 14657 6.2 20 9 Tu 1 . 
- CDS 14744 - 14983 274 ## EUBREC_3510 hypothetical protein - Prom 15030 - 15089 5.0 Predicted protein(s) >gi|222441884|gb|ACEP01000058.1| GENE 1 2 - 742 608 246 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026922|ref|ZP_03716114.1| ## NR: gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] # 1 246 1 246 246 424 100.0 1e-117 IDSRSESLGHSQGSGNLQMSKAGMKLMTPGQISRMPKNDCILFLKGERPIYDKKNWPFNT EVFKEAEKIAGQNGYKNPVYVSYDERKKIYITTRFESRLNYISKEEFTFFEEKAKEDSNI QTFQIDEEAFLYLNFNETPQPSLRELEKMVKEIQVAEVNEEEEKETDQQPDMLQDREQWD LSGDIIDCFRRYSSELSPEEQEEIIKGIEEGLTDQDVKRYFTLVGAEKMRQYRRVLMVAK IRDEDN >gi|222441884|gb|ACEP01000058.1| GENE 2 864 - 1256 219 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026923|ref|ZP_03716115.1| ## NR: gi|225026923|ref|ZP_03716115.1| hypothetical protein EUBHAL_01179 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01179 [Eubacterium hallii DSM 3353] # 1 130 1 130 130 226 100.0 5e-58 MKKMIMALCIFICVFTLVACSGQQTNEPLTLGVNAIITEIDKENRIITTKDSGEEGVLGN DCLIDCSKIPMIYCNYDTQDVVDITFDDLQVGDEIILTIRSSEIENLQKEDDNKAKIAVE QLQLGTQRIK >gi|222441884|gb|ACEP01000058.1| GENE 3 1309 - 1539 138 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291520905|emb|CBK79198.1| ## NR: gi|291520905|emb|CBK79198.1| hypothetical protein CC1_02460 [Coprococcus catus GD/7] # 1 76 1 76 76 109 92.0 7e-23 MLKKLNSFFNIVIGSFIGVFIGFGIYKFWHFKTYPNLYAMQSAPWYTELLLGGALVAVVV VVCIILKLIICRKLKP >gi|222441884|gb|ACEP01000058.1| GENE 4 1601 - 1837 184 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026925|ref|ZP_03716117.1| ## NR: gi|225026925|ref|ZP_03716117.1| hypothetical protein EUBHAL_01181 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01181 [Eubacterium hallii DSM 3353] # 1 78 1 78 78 130 100.0 4e-29 MKKAIITMIVVFALVVWPFSGVWQLLSGIIGGGVQESFIYPIYGGIILLSGLLVVCTELI LEEIKSLKDDIKDKKEGC >gi|222441884|gb|ACEP01000058.1| GENE 5 1889 - 2557 481 222 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0675 NR:ns 
## KEGG: Cphy_0675 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 180 1 181 264 87 35.0 6e-16 MKKVIYLLVLCMVLGMAGCGTNQENKEPVSGNVSVEIENKENDDVVQNEQTADTGSYPPC VMVDGVIYKDTGYVASMPGCGTMDGEIVSTVAGTELPSENNQSNFGSGYHYQRSSEGQLI VVIDDERIIFRDIEKDDTSIPIEVINFNAKVKEITDDGELLVTHVSTAEGFCQMNDGEYY VSAENLAEDVQVGDIVTIWFNGMIQETYPAQLGVVYRIIKTQ >gi|222441884|gb|ACEP01000058.1| GENE 6 2704 - 3999 948 431 aa, chain - ## HITS:1 COG:intR KEGG:ns NR:ns ## COG: intR COG0582 # Protein_GI_number: 16129306 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Escherichia coli K12 # 64 410 82 392 411 70 24.0 5e-12 MAKGSVRKKGKKWYYRFYVEDASGNLVQKEYAGTESKSETEKLLRQAMDDYESKKFIAKS ENITIGELLDIWAEEELKTGTLSNGTVQNYLGAITNIKKHPISERKLKNVTSEHLQAFFD LLSFGGTYPDGSERKGYSKDYIRSFSAVLQQSFRFAVSPKQYITFNPMQYIKLKYQTDEV DLFSDDDMDGDIQPIPREDYERLIEFLQNYNPPAILPIQIAYYAGLRIGEACGLAWQDVN LEEQCLTIRRSIRYDGSRHKNIIGPTKRKKVRIVDFGDTLAEILRNARKEQLKNRMQYGE LYHKNYYREVKDKNRVYYEFYHLDGTENVPEDYKEISFVCLRPDGSLELPSTLGIVCRKI AQKLDGFEGFHFHQLRHTYTSNLLANGAAPKDVQELLGHSDVSTTMNVYAHSTRKAKRES ARLLDKVAGND >gi|222441884|gb|ACEP01000058.1| GENE 7 4086 - 4715 545 209 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3495 NR:ns ## KEGG: EUBREC_3495 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 209 1 209 209 362 99.0 5e-99 MKDKELRKLIGSRAKQRRLELNLTQPYVAEKMGVTASTILRYENGSIDNTKKMVLEGLSE ALHVSIEWLRGETDEYETDITDKKELQIRDAMGDILKQLPLDLSKKEDAFSKDLLLLMLK QYNLFLESFQFACKNYKGNTNEADIAKVMGFESNDEYNEIMFLREITHTVNAFNDMADIV RLYSKKPEMAEQRLENLLSEVLYEDSDSV >gi|222441884|gb|ACEP01000058.1| GENE 8 4890 - 5120 273 76 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3496 NR:ns ## KEGG: EUBREC_3496 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 76 1 76 76 123 100.0 2e-27 MTERKIALSIEEAADYTGIGRNTLRKLVEWKKLPVLKVGRKVLIKTDILEKFMEANEGRD LRDKGNVKAVTRNVAT >gi|222441884|gb|ACEP01000058.1| GENE 9 5265 - 5579 322 104 aa, chain + ## 
HITS:1 COG:no KEGG:EUBREC_3497 NR:ns ## KEGG: EUBREC_3497 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 104 33 136 136 146 98.0 3e-34 MAKSTKTYEERIRALEKKEQESIEATKKLIAQRKELEKRKKAEESKKRTHRLCQIGGAVE SVLGCPIEEEDLPKLIGFLKRQETNGKFFSKAMQKEPLTDMEEV >gi|222441884|gb|ACEP01000058.1| GENE 10 5579 - 5884 97 101 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3498 NR:ns ## KEGG: EUBREC_3498 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 101 1 101 101 181 98.0 9e-45 MAEGVVFPFPSLPLNEGRTYGQPEVVLISLPQVATGLRLRLPGSFRRRGFQPDLLMPQGA LLRRGRSDSFTSAKPEEDAVGNQCRIRQKGGIVSWRYFTLQ >gi|222441884|gb|ACEP01000058.1| GENE 11 5860 - 7581 1219 573 aa, chain + ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 330 18 283 1117 107 27.0 5e-23 MAIFHFTVKIVGRSKGKSVISASAYLNGDVMKNEETGRISYYTSKKEVVYTSLMMCENAP PEWLHVPEENIKRFQQSIRYKRADDKDAALEKFKITFQKQRLWNEVLKIEKNADAQLGRS FEFSLPKEWSRQEQIDYTTEYIQKTFVDKGMCVDWSIHDKGDGNPHVHLLVTMRPFNLDH SWGSKEVKDWDFVRDTDGNIVVDESHPDWWQDKKNPDRHGIRIPVLDENGVQKVGARNRK QWKRVLTDATGWNNPKNCELWRSEWAGMCNRHLSIDNQIDHRSYERQGKLKIPTIHEGAD ARKIEEKYLTGQIWNGSWKVEENQMIKKQNALLQKVITTFGKVSGALSIWKEWLNDIRRK QRSNSHDGSNDYTDRGTAEYHGRDASGNTGKGREADVLSGEGRTIAAIRERIIRTASNLA GYRRTVDTSGRKDRPDTEAYRRKSAMAGIGTEIKQREPAIAETEQRITELEQQLEKGRSI DERIKKLKERRSTGRVASADRADAGRTRPERQDYPGTENAARRIADLEREVKQREQSREH SSLKERLEENKRIVAERERENARCRNHDRGMSR >gi|222441884|gb|ACEP01000058.1| GENE 12 7301 - 8269 273 322 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3500 NR:ns ## KEGG: EUBREC_3500 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 87 322 1 236 236 451 100.0 1e-125 MNELRNSKSDDQLEELLLLTEQMQEELDQKDRTIRELKMQLDESLTLNERLNSENRAGNI 
QALKNDLRKTKELLQREKEKTHAAEIMTEECQDKLRQAEQERDYALSHQKKVEIPVEKPV LYQKCGNCNLTAYLKAKEKYDTQREKLAGRYKTKTAMYEALMFLLIWYSVSTTLFQMIRS KIFISDCVVFFDTIATFIQTIAGWIILTGKNMAQISNGISNPVVAGIVYWLIRILICGGC LVGAGILVAFIEIKIAELYKKCCWDMITIMVILISMATAIYFGKWIKTTLPINLLFLLLF VQLVYVGIRWYVKGWRETRGYH >gi|222441884|gb|ACEP01000058.1| GENE 13 8358 - 8951 270 197 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3502 NR:ns ## KEGG: EUBREC_3502 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 197 1 197 197 398 100.0 1e-110 MTQFEKLDLLLRECGGTIQTFQVLNNGISKSVFYAYVKERGLEQAAHGVYVSPDTWTDAM YLLHLRCGQAVFSHETALFFHDLTDREPLKYTITVRTGYNPSRLQEDGFQVYTVKKDLHE IGIIAMQTSFGHSVPVYDMERTICDLLRSRKNVEMQVFQDALKQYAKRKDKNLRMLMKYA AMFHVENILRPYLEVLL >gi|222441884|gb|ACEP01000058.1| GENE 14 8952 - 9815 648 287 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3503 NR:ns ## KEGG: EUBREC_3503 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 287 1 287 287 525 99.0 1e-148 MITTARQLKDLIRNMSKKKSADAQILMRNYMMERFLERISLSEYKNQFILKGGMLVAAMV GLDARATMDLDATIKGTNVSVEDVEMIISQIISIPLDDGVSFRVKRISEIMEEADYPGVR VSMETKFDGVITPLKIDISTGDVITPREIKYKFNLMLEDRTIEVWAYNLETVLAEKLETV VSRNVTNTRMRDFYDIYILQKLYGEQLQKQVLWTALVATAKKRGTLDLIEAEDIREIFDE IESSPVMETLWKTYQKNYSYAADISWHAIVKSICALYGSILENMYDT >gi|222441884|gb|ACEP01000058.1| GENE 15 9964 - 10659 343 231 aa, chain + ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 231 3 229 233 199 44.0 3e-51 MSEKILIIDDEQDIADLLEVYLKNENYVVYKFYCATDAMSCIESGDIDLAILDIMLPDMN GFSLCQLIRKKYTYPIIMLTAKIEETDKITGLTLGADDYVTKPFRPLEVVARVKAQLRRY KKYSPGIITEKIPSELSYNKLCLNVQTHECLLDGEPVSLTPTEFSILHVLLSSAGNVVSI EELFHAVWKDEYYSKNSSTITVHIRHLREKLNDTSDTPQYIKTIWGVGYKI >gi|222441884|gb|ACEP01000058.1| GENE 16 10676 - 11185 300 169 aa, 
chain + ## HITS:1 COG:L98876 KEGG:ns NR:ns ## COG: L98876 COG4767 # Protein_GI_number: 15673074 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Lactococcus lactis # 8 135 5 129 165 72 38.0 3e-13 MKRKSNCLATILFLIYLALLVWIILFKLQFSISDLDKVRSINLIPFHYDKEVGAAFHLTE VLENFLIFVPMGIYLQMLLPRTKLYVKFMLIAGTSFLLETMQYILAVGRSDITDVLTNTA GGLLGLAVYSMAARLIGNRIKANRLFSILAGIVSVVVIGLFGFLLFANR >gi|222441884|gb|ACEP01000058.1| GENE 17 11199 - 12263 825 354 aa, chain + ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 54 351 46 346 351 162 33.0 7e-40 MSNKEMTTEKRLKVRLYLSLLVYTVVGYGLTLILDYIFSKFDNGIFAWLYWRLDLLFILY LMLGFICIFNYYWKKPWGYLDEVIDATQTVYEQNNHTVSLSEPLKELEEQLNQIKMSVLL SKQAAKQAEEKKNEIIMYLAHDIRTPLTTVIGYLSLLHEAPDMPEQQKEKYVKVALNKAE RLEKLINELFEITKYNAHTVIIKKETVDLHCLIAQVIDEIYPTLSANGNTAVFTAEDNLS VNADPEKLARVFSNLLRNAASYSYPQTEITISAKRLEHDIQITFENHGKTIPQEQLNSIF EKFNRLDDARLSNTGGAGLGLSIAEEIIHLHGGKITASSQNETIDFVITLPLSA >gi|222441884|gb|ACEP01000058.1| GENE 18 12404 - 13291 490 295 aa, chain + ## HITS:1 COG:CC3566 KEGG:ns NR:ns ## COG: CC3566 COG1131 # Protein_GI_number: 16127796 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Caulobacter vibrioides # 3 288 2 288 294 207 40.0 1e-53 MELKIEHLSKQFKDKTAVNDVNLTLTPGVWGLLGANGAGKTTLMRMIADIMTPTNGAVYY DGKDIRELGEVYRSKFGFLPQDFGYHRDFTVKDYLEYMAALKDIPTRATTQKINHLLDIL SLSDVKKKKISNLSGGMKRRVGIAQAMLNDPEILVMDEPTAGLDPGERVRLRNFISEFSH DRIVLISTHIVSDIEYISTRNAIMKAGEIVDVGTTAELVKRIEGKVWNCIIPTSKLPECE MRLRIINQRGEDNNQVSIRYLSEHSEIDGAVTTEPRLEDLYLWLFPQTDLEKEDR >gi|222441884|gb|ACEP01000058.1| GENE 19 13293 - 14534 786 413 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3508 NR:ns ## KEGG: EUBREC_3508 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 28 413 11 396 396 758 98.0 0 
MRLFRLELKRILKSRRTLILLAIALLLSVAMAYLPISFEGINRPNEDGTVTELDGLAAIK YKQDLYKTSAGEVTPDRIKSALETYQSCVREYGPVEEEGFPLAVFIEKIVPFRHLLMGLS EAFADPVTGIGADLMDIDPNDIDGAYYEKCAEHLQDVMRNEQRENETAQQKALEKYSELD TPFYLHSGISKDAFDYIEFYILFLAILCVAIAASTFAGEYQTGGDSILRTTKYGHKQLAI TKILAAFTLFVVTFLVGITIHILILDAAFGTDCLKTSFQMRYSIINLPNINLGQLQIILA AAGLLSVLATVSCTLFLSAKCKDTLTVLLISIVVLLMPLFAYVAMGATWLSTILPSAGIG MQNNFLSQLADFNYLNIGGMSFWTPHVILISAGIELFVFTFLAIHSYCRHQVA >gi|222441884|gb|ACEP01000058.1| GENE 20 14744 - 14983 274 79 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3510 NR:ns ## KEGG: EUBREC_3510 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 79 12 90 90 157 98.0 9e-38 MRTIFAEYNPKRNSIDVYTSVGYMLRIDCWEAEKNLKTTPGSDCALNALAIDEPLEYAKL YLDGNLQMWVDAEDSLDIF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:22:30 2011 Seq name: gi|222441883|gb|ACEP01000059.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont65.1, whole genome shotgun sequence Length of sequence - 4080 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 60 - 96 0.0 1 1 Op 1 . - CDS 157 - 963 760 ## BL01297 membrane protein YdjH 2 1 Op 2 . - CDS 964 - 1728 500 ## BATR1942_00585 putative phage replication protein - Prom 1760 - 1819 1.5 3 2 Op 1 . - CDS 1852 - 1947 63 ## 4 2 Op 2 . - CDS 1999 - 2802 749 ## BL01297 membrane protein YdjH 5 2 Op 3 . 
- CDS 2803 - 3786 880 ## BATR1942_00585 putative phage replication protein - Prom 3834 - 3893 7.6 Predicted protein(s) >gi|222441883|gb|ACEP01000059.1| GENE 1 157 - 963 760 268 aa, chain - ## HITS:1 COG:no KEGG:BL01297 NR:ns ## KEGG: BL01297 # Name: ydjH # Def: membrane protein YdjH # Organism: B.licheniformis # Pathway: not_defined # 48 248 34 234 255 69 27.0 1e-10 MGKIWRRILIGIVILAAIIGIVAKLLEPEEYTNTKPRQNTECTITQRVLDEAGKMTDKQK KDLEALIAEKEQLIGCDIVILTINEPGLDSYDEIRDYAQAYYEDNKFGWNKANGDGIIYV DDWETGYTWMCTTGLAKTRLDDTDTEQIVDSANEIVNDNPYEAYTSIVNNAATMIQTGRS SYVQINPIIVFLAAAVIGAIFVGVQLIGHGGKDTVLRSTYAKGGVQMNEKQDIFLHSHVT RTKRPSKSSGGGGKVGGTGGHGGAGGRH >gi|222441883|gb|ACEP01000059.1| GENE 2 964 - 1728 500 254 aa, chain - ## HITS:1 COG:no KEGG:BATR1942_00585 NR:ns ## KEGG: BATR1942_00585 # Name: not_defined # Def: putative phage replication protein # Organism: B.atrophaeus # Pathway: not_defined # 7 247 93 333 341 215 40.0 2e-54 MKEEYKPDLALPFVIDRKEAIEIIRKNFAKKMFIPKDFCSTSSLESLQGRYVPFWMYDME THVHFEGEGRKIRTWTEGDYDCTETSIYRLVRDFNVNYDKIPVDASKNMPDEKMDLVEPY KYTALGKFRSKYLSGFQAEIYDENKEELLPRAQKKAKKYTEGYLDEYNNSYSSVKTAIKK VDSKEMNSYYTFLPVWRYVYRYQGTNYEFFVNGQTGKAIGTPPTSKARMFSWFAAVTFSL FFLAEMILYFLEVA >gi|222441883|gb|ACEP01000059.1| GENE 3 1852 - 1947 63 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFYRCKDCGGNAVYDPKKKKMVCESCGNEET >gi|222441883|gb|ACEP01000059.1| GENE 4 1999 - 2802 749 267 aa, chain - ## HITS:1 COG:no KEGG:BL01297 NR:ns ## KEGG: BL01297 # Name: ydjH # Def: membrane protein YdjH # Organism: B.licheniformis # Pathway: not_defined # 48 249 34 234 255 70 27.0 6e-11 MAKTYRGILIFVVILGIVMGLAGKKFDGYVYTNTEPRSNTECTLTQRVFDDADVLTDEEE AKLEKLIAEKEKLIGGDIVLMTTNDASLNTMEKLRDYAQNYYKENKIGWDKPIGSGAIYV DDWETRHTWLATRGTVKDKLSKDNTDYIVDETNDIVNKDPYKAYTRLINLLADETQKSHL FHINISPLWVALGCLVAAIAFVVFQLSDSVGKDTVKKSTYVKKDGVQLNDKQDVFLHSHV TRTKRETESSSSDCGGSDGFGGSGGSH >gi|222441883|gb|ACEP01000059.1| GENE 5 2803 - 3786 880 327 aa, chain - ## HITS:1 COG:no KEGG:BATR1942_00585 NR:ns ## KEGG: BATR1942_00585 # Name: not_defined 
# Def: putative phage replication protein # Organism: B.atrophaeus # Pathway: not_defined # 3 325 5 338 341 248 37.0 2e-64 MFYRCKSCGGNVIYDPKKKAMVCESCGNGGEPDLISQEKKHVCNNCGAEIEADKIDLSLK CPYCGTYVIFEDRMENEYEPNLVLPFALDKHKALDLLKEKFAKQMFLPGNFCAASTIESM QGLYVPFWMYDLHTHVHFEGEADKVRTWDEDDYECTETSTYRILRDFDVDYDKIPVDASK VMPDKMMDLMEPYKYGELGDFDAKYLSGFQAEVYDEDKNTLLPRAKKKADKYSQKYLSSY NVEYDAVRPTVNDKKSTEKESFYSFLPVWRYVYRYQGKNYEFYVNGQTGKAVGEAPTSTG KIIAWFIAVFGSLFFTVEMLLYLLGVL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:22:58 2011 Seq name: gi|222441882|gb|ACEP01000060.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont66.1, whole genome shotgun sequence Length of sequence - 17867 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 8, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 115 - 684 333 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 756 - 815 6.6 + Prom 662 - 721 6.1 2 2 Tu 1 . + CDS 813 - 2201 1368 ## ELI_1359 5-bromo-4-chloroindolyl phosphate hydrolysis protein + Prom 2218 - 2277 1.5 3 3 Tu 1 . + CDS 2348 - 3481 1529 ## COG3853 Uncharacterized protein involved in tellurite resistance + Prom 3491 - 3550 5.9 4 4 Tu 1 . + CDS 3688 - 4365 700 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) + Term 4388 - 4427 8.0 5 5 Tu 1 . - CDS 5033 - 8107 2579 ## COG1401 GTPase subunit of restriction endonuclease - Prom 8136 - 8195 6.8 - Term 8321 - 8363 -0.7 6 6 Tu 1 . - CDS 8401 - 9780 1127 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 9830 - 9889 6.5 + Prom 9772 - 9831 9.1 7 7 Op 1 . + CDS 9863 - 11605 1788 ## COG0673 Predicted dehydrogenases and related proteins 8 7 Op 2 31/0.000 + CDS 11610 - 12377 1087 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 12500 - 12543 -1.0 + Prom 12556 - 12615 8.1 9 7 Op 3 . + CDS 12651 - 14159 1030 ## COG0358 DNA primase (bacterial type) 10 7 Op 4 . 
+ CDS 14069 - 14425 277 ## Cphy_3522 DNA primase + Prom 14432 - 14491 5.9 11 8 Op 1 5/0.000 + CDS 14530 - 15720 995 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Prom 15871 - 15930 2.9 12 8 Op 2 . + CDS 15965 - 16783 878 ## COG2384 Predicted SAM-dependent methyltransferase 13 8 Op 3 . + CDS 16793 - 17644 847 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family Predicted protein(s) >gi|222441882|gb|ACEP01000060.1| GENE 1 115 - 684 333 189 aa, chain - ## HITS:1 COG:BH1294 KEGG:ns NR:ns ## COG: BH1294 COG0860 # Protein_GI_number: 15613857 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 4 189 3 179 253 91 34.0 7e-19 MAIKIYIDQGHNPEGINAGAEGFGIREQDVTYQVGRFLYDILSEDDRFEARLSRPTPQTT LGYSNTSSLRERVTQANNWPADYFISIHANASENPDINGTEAYVSSTDSAAWYLAQNIVT EIVRRTSTKYNGVFSRPSLYVLKNTKMPAVLVELGYITNYEDNQRMINEPYQFAYGIYVG LLNYLGLKQ >gi|222441882|gb|ACEP01000060.1| GENE 2 813 - 2201 1368 462 aa, chain + ## HITS:1 COG:no KEGG:ELI_1359 NR:ns ## KEGG: ELI_1359 # Name: not_defined # Def: 5-bromo-4-chloroindolyl phosphate hydrolysis protein # Organism: E.limosum # Pathway: not_defined # 35 448 3 395 403 159 29.0 2e-37 MSNQEDRFDFSDSENHGNQGERYRKSSEDKNDNGKADGINWGDKIFSMIQESTESMDFAE LSHNIQKTIDVAREETMKQVDKAKTEAAKQIERAKEEAARQVDKAKEEAAKQIDKARGYQ SSQAYVNVGHGRKVIAGKLKKNPGLYSGPAEIAVGAAGLGAFGGAGIGLGAFLGAVAFPI GAAITSVAVMIPFTIVSAFLLGRGIFNSKRARRIRKYASIWTGKPYVMIEDLESRVGWDR KKILKDIHFLTDRELIIGAELDAGETCLMLTDESKQQYASAMVAKRQREQEEVKAREAEE ALKAAPFEQREIHRIKKDGQEYLEQLAELKKEIVSDEMRRKIEQMETLTARIFVCASEHP ESISQTDRLFKYYFPSVLKLLKVYEDVEKQPVQGENIRKTKKEIEDSLDTMNQALEKLFD EMFQNVAMDISSDIQVLEVMLKQDGLTEDGIHADQKEPMLKL >gi|222441882|gb|ACEP01000060.1| GENE 3 2348 - 3481 1529 377 aa, chain + ## HITS:1 COG:lin2081 KEGG:ns NR:ns ## COG: lin2081 COG3853 # Protein_GI_number: 16801147 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # 
Organism: Listeria innocua # 18 373 39 396 399 229 43.0 5e-60 MDIEELLKEGPSLTLTPEAAPELPAEEKKAEPKVYEEEQYLTEEEKKQVDAFVEQIQLED STSILQYGAGTQKKMSDFSEGTLEKVRSKDLGEIGDLLNGVVVELKGFDEEEEKGIVGFF KKKASKAQTLKLRYSKAEGSIDQICEKLEGHQVQLLKDSAMLDQLYDLNKVYFRELTMYI LAGKKKLEKETNEVLPALRKKAQETGSQEDVQEVNDREALCNRFEKKIHDLELTRMVSLQ MAPQIRMVQASDITMSEKIQSTLVNTIPLWKSQMIIALGVEHSREAAEAQRKVTDLTNDL LKKNAEKLKTATIETAKESERGIIDMETLVATNESLMSTIDEVMKIQQEGREKRREAEGK LVELEDELKQKLLSQSK >gi|222441882|gb|ACEP01000060.1| GENE 4 3688 - 4365 700 225 aa, chain + ## HITS:1 COG:CAC2663 KEGG:ns NR:ns ## COG: CAC2663 COG0791 # Protein_GI_number: 15895921 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Clostridium acetobutylicum # 108 225 140 255 255 85 42.0 8e-17 MLNRMKRVAAGLLTAVFLFNIAPVSGVNNVAQVEAAKLVTKGKVNSSILNIRTRKSTKAK KITTLKRGAKVTLLSNNSKWVAVKVKGKVGYTQGKYITIANGATASATTMSKGQSVVNYA KRFLGNPYRWGGTSLTHGADCSGFVMSVYRHFGKSLPHSSYAQRNVGRRVRSLSKAKPGD IICYSGHVAIYMGNNRVIHASNPRDGIKITKGAAYRSIVTIRRIF >gi|222441882|gb|ACEP01000060.1| GENE 5 5033 - 8107 2579 1024 aa, chain - ## HITS:1 COG:YPO0387 KEGG:ns NR:ns ## COG: YPO0387 COG1401 # Protein_GI_number: 16120721 # Func_class: V Defense mechanisms # Function: GTPase subunit of restriction endonuclease # Organism: Yersinia pestis # 807 984 446 631 687 95 32.0 5e-19 MITDKQRQRDYEYEADYQSKSRYTTEQWTNILNSGQLDEKNISLLKQIYSSFNHAATLLQ LSFESGRTQEDILSQLNSAGQLLGEANSMNPEVDYEGNEYWWYLFFWGKNTDAKTLELKM HPELTEAIGTLYPELEESYYAFMSDVERSLQTRYSQEDAVWIAAAVLLYEKYYCNPGISS DDILLMQYEVQTRSQKVYGQDVNANTITQICNADERGHRFNYLRDIYKYYRVSFPGEFEG ERERPDPEEVDYDAYVYSILGYMKLSKLTDFIEHEYAHLVDESYVELTNANGFVRMAAYL SRQGGTPFSLDDTSDKALSLRADGEDASETFHKISEALLKEYPNFTYAAKGDWYVPESST VTGVLQDILYIPSYAGHNAFIGIRTVLDEDKLNIEVSLNLPCVSDEEVMLDIHDKCNMLT LMTAAPFQVTLVPSESSDLSPSGDKIKAFVLYHYDDFKTMSEEDIVGLFSTSLEIFASYY TDICQNYYPAIEDDGNVEIGTETGDSSETGTDSVTNPLAAALGSKLTYRPAADVQGPQII YKEGGTPASTDFITKAIQGYGTTPASDERTYENETASNISADSTSAGDEEFSSFVNKAAL 
EGDDEEIPARKTGNYSGRNTSITGAGSSAGSNRTSGSGSMNSASAGFSGNSSFASDAASS AAAVGAATGAGSGTPVTANVSITPDSVWPYPDNAIASTHPDRKEQNFRLYPKNTLIKGPV KTGKFHQAIMTAVGIIEGKDSNMMNIEPVPDVLEHYQQYVDEGRILHISYPDINSDGYDG FIERKRGPFIEDGIFKQFANKCADGRYVVMMEEVDLNWMHLFRETAVLLRENRREGTSSE TAITLPFSKEKFRLPSNLYIVATCDSIVCEDTILGAIDHDFFIRPVSPEPEILHGMRVEG ISLQRLMTTINMRLSYFLGADYQLGEGFFLQSPDKDSFISLCRVFREQLVPLLEKWFDGD IERIRYVLGDNGKSRPDTIFFQEIPFRDGLFKGELPDSFDKERRIYRCNEDAFYNPKSYI DIYD >gi|222441882|gb|ACEP01000060.1| GENE 6 8401 - 9780 1127 459 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 42 453 17 424 426 235 36.0 1e-61 MIDNTTFIKYNNTNIFFLKGDDVLDPSSIRQIVILLILLLLSAFFSSAETALTTVNKHRM RALANQGNKKAARVLALIEKPEKMLSAILIGNNIVNLSASSLTTTLAMSLASQAGLGKNT STFVGLATGILTLLILIFGEITPKTIATMKNESMALLYSGIIYTITTVLTPVIYVVNLFS GLLCRLFGIKPGSGQAITELELRTIVDVSQESGVIEKEEKELINNVFDFGDSVAKDIMLP RIDVSFASVDMSYDELVEIFLEEQYSRLPVYEESKDNVIGILNLKDLFFYRETHRNEAFD IYKVLREPFFTYEYQKTSTLMEEMRNNSISFAIVLDEYGSTAGLVTLEDLIEEIVGDIKD EFDESETDAIRCIGTDEYEIEGSTKLDDLNDVLGTEIESEDYDSIGGHMIELLDHLPKEG ETVKEGNYLYSIKKMDKNRVEIVYLKIERHTSEAKPAEM >gi|222441882|gb|ACEP01000060.1| GENE 7 9863 - 11605 1788 580 aa, chain + ## HITS:1 COG:BS_yulF KEGG:ns NR:ns ## COG: BS_yulF COG0673 # Protein_GI_number: 16080169 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus subtilis # 241 564 1 327 328 296 44.0 1e-79 MKQYLLFDLDGTLTDPMVGITSSVQYALEKFGIHVKYLKDLIPFIGPPLDDSFQEFYGLS KEDAGKAVEYYREYFAPKGIFENEVYPGIPEMLSRLVEAGFTLIVATSKPAVFAKQILEH FGLSDYFSFVGGSELDGTRKRKAEVIGYILETCEIKPQDAIMIGDRKHDIEGAKLCGLES VGVLYGYGSEEELSKAGADHIIKDVKLLEEYLRKQGENPDNLTWYDRLKGRTGGEGKETK MIRFGMIGTGKIAQKFWQANRYGKDFELTAVYSRTLERAREFGFQKGRLQYFDDLEAFAN SDCIDAVYVASPNCCHHDQVMTLLKAGKHVLCEKPMASNLKEAEEMFSEAEKQNLILLEG 
MRSIYAPSFEKMLPYMESLGKIRRATLQYCQYSSRYDNYKRGIIENAFKPELSNGALMDI GVYVVACMIRLFGAPVSIKASGIKLSNGVDGAGTILMEYPDMIGEAIYSKITDSAMPSQI QGEDASMLVQEIENIKDLRIVRKGVVQSIHFEQSDNILNYETQEFIKMIKTGMGWEKSRE ITLETMKVLDEARRQLHIVFPADEKSQEKKKQKKTAKEEE >gi|222441882|gb|ACEP01000060.1| GENE 8 11610 - 12377 1087 255 aa, chain + ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 3 250 12 267 372 102 29.0 7e-22 MAGQTEFLQAIQELERLGETNGNQLSMEEINSYFSDMKLEEKQLDFICNYFESHHIYITN RVQRFEEEIGDEEIPNKEDPMDAEIVAFYLKEMEKANSLSGEQERVIARKLVSGDDTARN LLIEANLSRAVEIAKEYEGRGLLLSDLIQEGNIGLMVAVNEFEPDIDKDFHAFSEKMIRK HLEETLEEYNSSTRSAVKMANRVNEMNDIATAFAKEYEREAKPSEIAERMGITEEEVREL MKVSLDAIAVLNQDK >gi|222441882|gb|ACEP01000060.1| GENE 9 12651 - 14159 1030 502 aa, chain + ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 5 424 8 423 596 338 42.0 2e-92 MRYEEELIDEVREKNDIVDVISSYVSLKKRGSNYVGLCPFHNEKSPSFSVSRDKQMYYCF GCGQGGNVYTFLMEYNRLSFVEALQNLAQRAGVELPEREQTAEERQQADARVTLRDMNKE AAIYFHYLLKSERGTQALSYLKNRGLSDETINHFGLGYADIYRDDLYQYLRSKGFTDYQM KESSLVNIDTVKGNYDKFFNRVIFPIMDMNQRVIGFGGRVMGDGEPKYLNSRETILFDKG RNLYGLNYAKRSRSNAIILCEGYMDVISLHQAGFTNAVAPLGTAFTPNQAMLLKRYTNEV ILSFDSDGAGIKAALRALPILRQNGLRGRVLSMKPYKDPDELIKAEGAESYQKRIDEAES GRMFELLVLYGQYNQQDPESRTEFLHEVAKKLAQIEEPLERQTYLTSAANRFMIRKNDLE NLVNRYGLGYQYEQVNEQYKTAPETEERRQEKKKIRKTSHRDFFLPGLWNILICINISGR ISGQMTSLSRCTIRWRLCFLNN >gi|222441882|gb|ACEP01000060.1| GENE 10 14069 - 14425 277 118 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3522 NR:ns ## KEGG: Cphy_3522 # Name: not_defined # Def: DNA primase # Organism: C.phytofermentans # Pathway: DNA replication [PATH:cpy03030] # 2 113 475 589 596 76 35.0 3e-13 MYQYIRPYIGADDFIEPLYHSVAIMLFEQLDKEEKINPAAILSRFTDAEEQKEIAAIFNT 
HLKYGLSSEDEDKALSDVVRKVKLASIEDEMVHTSDIMRWQELLSLKNKMQKFSINRV >gi|222441882|gb|ACEP01000060.1| GENE 11 14530 - 15720 995 396 aa, chain + ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 124 395 103 374 374 374 74.0 1e-103 MGRKEQIFGSQMNCLLERAKKQKNVVELQEIRDVFQNSPLTQAQLERIIAYLEEQKIDVL TMPDINADDIEDDVGEIDSLNDMDVFQKPDRIGNVQVTAEIEKGYEKTAEESENSFTERG NAEDPVRMYLKEIGRIPLLSSEEEIELAKRMEEGDEEAKKKLSEANLRLTVSIAKRYSGR GMQFLDLIQEGNLGLIKAVEKFDYRKGYKFSTYATWWIRQSITRAIADQARTIRIPVHMV ETMNRVNRTSRRLLQEYGREPTPEEIAEAMNLPVERVLEISKISQEPVSLETPIGEEEDS HLGDFIQDEHIPVPADEAAHTLLREQLEKVMDTLSEREQKVLALRFGLEDGKPHTLEEVG REFQVTRERIRQIEAKALRKLRHPTRSRKLRDFLEE >gi|222441882|gb|ACEP01000060.1| GENE 12 15965 - 16783 878 272 aa, chain + ## HITS:1 COG:CAC1302 KEGG:ns NR:ns ## COG: CAC1302 COG2384 # Protein_GI_number: 15894584 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Clostridium acetobutylicum # 1 260 1 225 229 126 35.0 5e-29 MELSIRLKTVAEAVTKGNRVADVGTDHGYVPIYLVKNNLSPAGIAMDVNKGPLEKAQEHI REEKLGGKIATRLGNGLAPLEPGETDTVIIAGMGGDLICKILKAKPEFLIEGKEFILQPQ SEWFKIRRLLKEYRYQIEKEWFLKEDGKYYVIIKAEPAWTQPQRLKHQQPSIHEYIKEIS DTQVNQAVSEKDMESIYEQYGKYLIETKNPVLKEYLEKEITKKESIAAELKQSIKEIESE ELSGKERGNIAKRQRRYQEIQKEIRDMRKAIG >gi|222441882|gb|ACEP01000060.1| GENE 13 16793 - 17644 847 283 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 4 277 5 259 269 135 32.0 1e-31 MLADYHVHTEFSNDSIYPMEEVVKDAISLGIKDICFTDHVDYGPYRDWDDPRGIQYRPGD EGEPEQVALTNVDYKKYFSMIEKMREKYREKIAIKAGLEFGVQTHTIPEYEKLFRSYPFD FIILSIHQAGDQEFWTNEYQSGRTQQEYNEGYYKELLSVVQNYHNYSVLGHMDLIVRYDS 
YGVYPFEKLKPLLTEILKTVIADGKGIEVNTSNHRYGLSDMTPSRDILKLYKELGGTIIT IGSDSHKKEHLGAYIDWAKEELRKLGYTQFCTFEKMQPIFHEL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:23:13 2011 Seq name: gi|222441881|gb|ACEP01000061.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont67.1, whole genome shotgun sequence Length of sequence - 12830 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 5, operones - 2 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 212 - 271 3.8 1 1 Op 1 . + CDS 338 - 1324 1266 ## COG0078 Ornithine carbamoyltransferase 2 1 Op 2 . + CDS 1349 - 2329 597 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 3 1 Op 3 . + CDS 2342 - 2686 319 ## EUBREC_0059 hypothetical protein 4 1 Op 4 1/0.000 + CDS 2670 - 3443 937 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 5 1 Op 5 1/0.000 + CDS 3447 - 4397 569 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase 6 1 Op 6 . + CDS 4460 - 4942 741 ## COG0782 Transcription elongation factor + Term 5001 - 5043 0.4 + Prom 5106 - 5165 7.5 7 2 Op 1 . + CDS 5202 - 5606 343 ## Cphy_3553 hypothetical protein 8 2 Op 2 . + CDS 5611 - 6624 607 ## Cphy_3552 hypothetical protein 9 2 Op 3 7/0.000 + CDS 6644 - 7546 704 ## COG1624 Uncharacterized conserved protein 10 2 Op 4 . + CDS 7494 - 8756 1315 ## COG4856 Uncharacterized protein conserved in bacteria + Term 8772 - 8821 -1.0 + Prom 8764 - 8823 2.5 11 3 Tu 1 . + CDS 8864 - 9391 211 ## COG1309 Transcriptional regulator + Prom 9436 - 9495 3.5 12 4 Tu 1 . + CDS 9522 - 10295 850 ## COG0566 rRNA methylases + Term 10509 - 10562 0.2 + Prom 10873 - 10932 18.6 13 5 Tu 1 . 
+ CDS 11128 - 12624 2000 ## COG0554 Glycerol kinase + Term 12674 - 12711 10.1 Predicted protein(s) >gi|222441881|gb|ACEP01000061.1| GENE 1 338 - 1324 1266 328 aa, chain + ## HITS:1 COG:PM0808 KEGG:ns NR:ns ## COG: PM0808 COG0078 # Protein_GI_number: 15602673 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Pasteurella multocida # 1 328 3 334 334 366 54.0 1e-101 MDLKGRSFLTLLDFNKEEIRYMLDLAHELKAKKKKGILGEELKGKNIVLVFDKSSTRTRC SFEVAAMDEGAHVTYLTNSHMNKKESIEDTAKVFGRMYDAIEYRGFGQDIVEDLQKYSGV PVWNGLTDVDHPTQVLADFLTMEEHIDKKLDEMKLTFVGDLSDNVMYALMYGSAIMGVDF KAVGPEETVCDPKVLEVSQELAKKSGATITISHDPEDVKGSDVIYTDIWVSMGESEELYP VRVKALSPYKVTTKMMKDTGNDKCLFMHCLPSYHDFETSVAKDMRDRLGLDIREVEDEVF RSENSVVFDEAENRMHTIKAVMVATLGR >gi|222441881|gb|ACEP01000061.1| GENE 2 1349 - 2329 597 326 aa, chain + ## HITS:1 COG:lin2018_2 KEGG:ns NR:ns ## COG: lin2018_2 COG0340 # Protein_GI_number: 16801084 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Listeria innocua # 84 317 16 247 253 176 39.0 6e-44 MKEKILEFLKQQEGFISGQRICDELGISRTAVWKYMNSLKKEGYEIESVTRKGYRLLQSP DLLTREAVLSCIKKGEIPGELCCFESIDSTNEEAKRRGEAGAPDGSIYVADNQTNGKGRR GRTWLSPSGEDIFFTILLRPDLPLNSVSMLTLVAASAVAEAVDKVTGQECQIKWPNDIVL NRKKVCGILTEMNMEIDSIAYVVVGVGINVNRMEFREDIAETATSLKKESGHSIERAELL SEILSAFFRDYKLFLEKQDLSPFLESYNQKLVNVGREVKIIKKGEEIIRTAIGINDRGEL IVQDAEGNTEHIFSGEVSVRGIYGYV >gi|222441881|gb|ACEP01000061.1| GENE 3 2342 - 2686 319 114 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0059 NR:ns ## KEGG: EUBREC_0059 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 108 4 107 111 127 63.0 1e-28 MFDDKVILCGSSSYEKIFYFNPEFDSLPKQIKDELQIMCVLFTEDVGGILRLEFDEKGNL NLTVESKEDDFFFDEIGSVLKIKQLRAEKRELLENMETYFRVFFLQEGQDVISD >gi|222441881|gb|ACEP01000061.1| GENE 4 2670 - 3443 937 257 aa, chain + ## HITS:1 COG:lin0253 KEGG:ns NR:ns ## COG: lin0253 COG1521 # Protein_GI_number: 16799330 # Func_class: K Transcription # Function: 
Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Listeria innocua # 1 255 1 254 259 249 48.0 4e-66 MLLAIDIGNTDFTIGVFQGKKLCDTFRMRTKINRTSDEYGVFILNTLQYKGYDTKDVEDV IIASVVPAVMHSFNSAIIKYFGLTPIIVGPGIRTGIRIATTNPKETGADRIVDAVAGYEL YGGPVIVVDFGTATTYDLVTEDGRFVGGVTSPGIRTSANALWQQAAKLPEIEIRKPQYIL AKDTVTSMQGGLVYGYIGQTEYIIRKMKEESGLGKINVVATGGLGKMIAQETSFINYYDA NLTLKGLQIIYDKQKQG >gi|222441881|gb|ACEP01000061.1| GENE 5 3447 - 4397 569 316 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 13 315 39 339 353 223 39 4e-58 MTVKEIYGTDFPIILAPMAGITDLPFRILCKEQGADVMVTEMISAKALFYGNKNTLPLMQ TEEEEKPIGVQIFGSEPELMGEMAHKIEDKGFSYIDINMGCPVPKIVNNKEGSALMLNPE LAGRIVKEVSKAVSLPVTVKFRKGFDADHINAPEFAQIMEENGAAAVAVHGRTRVQYYTG QADWDIIRDVKNAVSIPVIGNGDIFKGEDAVKMRDYTGCDGIMVARGAKGNPWIFREVKA ALSGKPIPERPTEEEVISMLLRHAQLSVQYKGEKMGIREMRKHTAWYTTGMHGSSTLRNK VNQVETFDALQELLMR >gi|222441881|gb|ACEP01000061.1| GENE 6 4460 - 4942 741 160 aa, chain + ## HITS:1 COG:CAC3198 KEGG:ns NR:ns ## COG: CAC3198 COG0782 # Protein_GI_number: 15896445 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Clostridium acetobutylicum # 1 157 1 157 158 133 52.0 1e-31 MVEAKKHILTSQGMQALEDELQDLKVVKRKEIAQKIKEAREQGDLSENAEYDAAKDEQRS MEARIEELEKIIKNAEVIDESAYDKDTVSIGSTVKFYDEEFDEELEYRIVGSTESDILKG LISNESPLGKGLIGAKIGETVQVESPDGFSRYKILGISRE >gi|222441881|gb|ACEP01000061.1| GENE 7 5202 - 5606 343 134 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3553 NR:ns ## KEGG: Cphy_3553 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 130 1 129 143 62 33.0 5e-09 MIDPKKVRLMTKLAVYEEGPGKKDLKINCYSKRTYVNIKQLESIIAITVAYLLGLGLYCF GIYTDIISQGLKFPYQKYILHAMILYLIIIIIDFVCTRRYYTKVYDKMREDIKQYDYNLY RLARYIQKEEKDRA >gi|222441881|gb|ACEP01000061.1| GENE 8 5611 - 6624 607 337 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3552 NR:ns ## KEGG: Cphy_3552 # Name: not_defined # Def: hypothetical protein # 
Organism: C.phytofermentans # Pathway: not_defined # 2 308 1 308 353 128 29.0 3e-28 MMAQIVYFKEWLIVFYKKHVDVVLPVLKLVFAFFTMCMFQGMFPYNMVINKPGIFLVLSA AQAFLPISVLYYMISILIMINLWKVSMDIFLGFVIFSIICSLAFVRVDRKHAVILIVTAV MFYLKLEYLLPVLLGMIVGFGAILPAAAGVVVYFLSVYMTDVSTLLTTSSSSNFGMGLQR IVNLMLIDKKLLVFLITFSLIIFITTLLCHLFYERAWLFAIFIGHIAMILLLLFGRLIFE LDYKIWRLFLEMILGIGCCYIYRFFRGIGDVSRIEKASFEDDEYFYYVKAVPKIKVTQKD RNVTDIKSGENEDDILFDDITEETDLFDGNREEAEKE >gi|222441881|gb|ACEP01000061.1| GENE 9 6644 - 7546 704 300 aa, chain + ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 31 273 16 250 273 221 49.0 1e-57 MQLISIGQFHKVIEKYLYWLSLPSIGFTDILEIIIISVIVYQMLKWVQLTRAWTLFKGII MLLLFALFAAIFQLNTISWLLSNSLGVGITAAIIIFQPELRRALEQLGRKKIFSNLFSFS TGEYDGRQSVLTEKTINEIVRASYEMGAVKTGALMVIEQNVALGEYVRTGISIDGIVSSQ LLINIFEHNTPLHDGAVIIRGNRIVSATCYLPLTDSMDIGKELGTRHRAAVGISEVSDSL TIIVSEETGAVSLAKDGNLYKHLKKEELLEKLKQLKIDEENNTDVFKKWKDLLRNEKSEK >gi|222441881|gb|ACEP01000061.1| GENE 10 7494 - 8756 1315 420 aa, chain + ## HITS:1 COG:BS_ybbR KEGG:ns NR:ns ## COG: BS_ybbR COG4856 # Protein_GI_number: 16077244 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 74 420 73 416 483 75 23.0 3e-13 MYLKNGRTCCEMKSQKNNMWMKILSVVIAVLIWLFVANTNDPVVTKRFYSIPVKVMNEDA LTKRGYAYEILDGEEVNITVKGKSSIVRSMGISDFQAIADFSKLSKVDAVPIDVTAKKYS DQLDITLGTTNTMKIKKDEVVTISVPVNVTAKGDPAEGYAVGRVTSTPNLIKVSGPENLL SSAKEIRATVSVDGISHDVTATDKPVLYDEEGKKIISNQIEFDTASIGIYIELWKTKTVD VKLSYTGEPAKNYHLVSFDYEPKQITIAAPDDMLESLDSITLDSVSIEGLTEDYEKDIDL TQTVLPDNVILANDDTSDVKVKATIEKITSHKLSFTKRDINITNNTNNYKVSFDKDNDYS ILVDGAASAVKKLDIKDFVPWIDINGLEPGTHEVSLHVKDVEGVTVGATTKLKITLKENN >gi|222441881|gb|ACEP01000061.1| GENE 11 8864 - 9391 211 175 aa, chain + ## HITS:1 COG:lin0482 KEGG:ns NR:ns ## COG: lin0482 COG1309 # Protein_GI_number: 16799557 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Listeria innocua # 5 166 7 168 186 64 27.0 8e-11 MRQKTETMLCVAFMELLKKCPFSKITIQKIAGQCGVNRQTFYYHFDNIYDLMSKAFEYEL VHESRIYEEVTWEYAINSFLQWMKENRVIIRNILANAELPYLKRAMYPLIAKSISCGSFM QESMQPAIKQQEEFCINFLTIGITQYVLEWVESDFRESIDDIVDHMSWLLQKVYS >gi|222441881|gb|ACEP01000061.1| GENE 12 9522 - 10295 850 257 aa, chain + ## HITS:1 COG:CAC2358 KEGG:ns NR:ns ## COG: CAC2358 COG0566 # Protein_GI_number: 15895625 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Clostridium acetobutylicum # 1 253 4 259 261 166 37.0 5e-41 MISSTSNAKVKNIMNLKKSAKARKKEKCFLVEGPRMFFEIPKDRIEECYLTEEFEDKYAE ELNGWHYELISENVCKHLSDTKTPQGVIAVVRRTEPSIEELLHKEKNPCFFLLENLQDPG NLGTIVRTAEGAGVTGIIMNRETVDIYNPKVIRSTMGAIFRVPFVIADSLPEVIFQLKQN GVSVYAGHLKGDVFYKQDYRDGSAFLIGNEGNGLTDEITALADYKIKIPMKGKVESLNAA VSATILMYETMRQREFM >gi|222441881|gb|ACEP01000061.1| GENE 13 11128 - 12624 2000 498 aa, chain + ## HITS:1 COG:VCA0744 KEGG:ns NR:ns ## COG: VCA0744 COG0554 # Protein_GI_number: 15601500 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Vibrio cholerae # 3 495 5 498 505 659 63.0 0 MAKYMMALDAGTTSNRCILFNEKGEICSVAQEEFTQYYPQPGWVEHDAKEIWHTQLSVAR QAMEKLGVTAADIAGIGITNQRETTIVWDRETGMPIYHAIVWQCRRTSEYCDSLKERGLV DKIREKTGLVIDAYFSGTKIHWILENVPGARERAEKGELMFGTVDTWLIYNLSGRKIHVT DYSNAARTMLFNINTLQWDDEILAELDIPKSMLPTPKPSSEIYGMTDESLFQGRIPIAGA AGDQQAALFGQTCFNPGEAKNTYGTGTFMLMNIGKEVKLSENGLVTTIAWGLDGEVNYAL EGSVFVAGAAIQWLRDEVKIVDAAPDSEFFCNKVPDTNGCYVVPAFTGLGAPHWDQYARG CIVGLTRGCNKAHIIRATVESLAYQTYDILEAMQADAGVKLAALKVDGGACKNDFLMQFQ ADIIDAPVHRPQCVETTAMGAAYLAGLAVGYWNSKEDVIANWAIDRVFDPQMDDDNRQKL LKGWKKAVKCSYGWAKED Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:23:25 2011 Seq name: gi|222441880|gb|ACEP01000062.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont68.1, whole genome shotgun sequence Length of sequence - 5286 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average 
op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 16 - 75 7.7 1 1 Tu 1 . + CDS 195 - 326 97 ## gi|225026977|ref|ZP_03716169.1| hypothetical protein EUBHAL_01233 + Prom 715 - 774 1.8 2 2 Op 1 . + CDS 818 - 1372 493 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 3 2 Op 2 . + CDS 1372 - 2874 805 ## gi|225026980|ref|ZP_03716172.1| hypothetical protein EUBHAL_01236 + Term 2926 - 2969 4.7 + Prom 3395 - 3454 8.7 4 3 Tu 1 . + CDS 3475 - 3687 288 ## gi|225026981|ref|ZP_03716173.1| hypothetical protein EUBHAL_01237 + Prom 3779 - 3838 7.6 5 4 Tu 1 . + CDS 4025 - 5044 856 ## COG1284 Uncharacterized conserved protein Predicted protein(s) >gi|222441880|gb|ACEP01000062.1| GENE 1 195 - 326 97 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026977|ref|ZP_03716169.1| ## NR: gi|225026977|ref|ZP_03716169.1| hypothetical protein EUBHAL_01233 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01233 [Eubacterium hallii DSM 3353] # 1 43 1 43 43 73 100.0 4e-12 MLESEYTVVSNINRHGIAPVEAAAYIVDLIERILKTGKSGVEL >gi|222441880|gb|ACEP01000062.1| GENE 2 818 - 1372 493 184 aa, chain + ## HITS:1 COG:CAC1766 KEGG:ns NR:ns ## COG: CAC1766 COG1595 # Protein_GI_number: 15895043 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Clostridium acetobutylicum # 3 178 5 179 185 64 31.0 1e-10 MTDKEIIALYFKRSENAIEQTDEKYGRLCHSISINILNDTRDSEECVNDTYLTLWNKIPP KEPNPFKAYICRIIKNLSLKKYEFNHAKKRNSAYETSLDELAECVSDGTEASDSVEFEEL RTAINTFLSELPEKKRILFLRRYWFLQPVKEIAKDYGITEKTASMRLARLREKLQQYLQE KELI >gi|222441880|gb|ACEP01000062.1| GENE 3 1372 - 2874 805 500 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026980|ref|ZP_03716172.1| ## NR: gi|225026980|ref|ZP_03716172.1| hypothetical protein EUBHAL_01236 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01236 [Eubacterium hallii DSM 3353] # 1 500 1 500 500 882 100.0 0 MDKKELFRAIDACEDRFIREAAEDIQKRKPSMIRRFFFVNSSENTEENSEKDVRKLSGLR 
VAAAIVICVVVLSIGVQTAARVSTVFRDWIEQTFQIGSSSGDQQGKENQYINNNGSGKNG NNKKGYNNNTSNIDSYGKNDKKNSAGKEVKNGRSNEEETGRVHRRNNKADEGQDSISQIE KVPMKDNLQIVGTNESFIYESKTDANSDEIVEKVYSIKDGKLSKLPIQSFSGSYKASTFS FQYSIIGQEIFSYNYSKNLVQVFDQINKQGEIYLSAENAKENEQILCVNLKSGECRQVVK PGDGMNMSMSPDGKYILINHSKSYWTVFDTEKETERKLEGLSGYALSNEYDFINDDHIAA VGDVFTKNNTEFYRLNYIDFQTGKVKVYPEYGDIKGCWTYRCDTKKKQLEIENLITKQKQ MIPLKQAEDVHIMQVNGEYVLLGTESDDVYYLYNLQEDIYRELDIPEEIRGTLEMYIAKK EKKLLFTNGKEAYLVDLKER >gi|222441880|gb|ACEP01000062.1| GENE 4 3475 - 3687 288 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026981|ref|ZP_03716173.1| ## NR: gi|225026981|ref|ZP_03716173.1| hypothetical protein EUBHAL_01237 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01237 [Eubacterium hallii DSM 3353] # 1 70 1 70 70 123 100.0 5e-27 MYSLVDASLNTLHTIKWNICSDEKMSLLEKLGLVQGAKVSVIASYFGNVIVSVNGKRIAI EKDTAERVKV >gi|222441880|gb|ACEP01000062.1| GENE 5 4025 - 5044 856 339 aa, chain + ## HITS:1 COG:BS_yitT KEGG:ns NR:ns ## COG: BS_yitT COG1284 # Protein_GI_number: 16078176 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 63 333 7 277 280 138 31.0 2e-32 MGFGKRENTFDWRIYMKKVPIKELREDLKEELKEELVSDSAILDKQRMGEEMYHKLEIRR DVKDSIVVIIASILYAINVNVFVNAGNLLPGGATGISLLLQHICRTFLHISVPYSLFSIL LNAVPATICYRVVGKKYTLRSVLCIFVMSIAVDMIPSHFITDDLLLIAIFGGIINGVLIA LILNSHATSGGTDFISMIISKKKGISVWNYIFMGNVAVLLIAGCIYGMNVALYSIIFQYA STQMINMMYKRYDKDTLFIITKKPDEVYNIILKKTNHDATLFKGVGCYKNEEKAMLYSVV NSDATRVVIMDIKKEDPDAFINYFGSKGVRGKFYYQEEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:24:02 2011 Seq name: gi|222441879|gb|ACEP01000063.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont69.1, whole genome shotgun sequence Length of sequence - 22875 bp Number of predicted genes - 20, with homology - 20 Number of transcription units - 7, operones - 4 average op.length - 4.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 108 - 167 9.2 1 1 Op 1 . 
+ CDS 218 - 415 222 ## gi|225026985|ref|ZP_03716177.1| hypothetical protein EUBHAL_01241 2 1 Op 2 33/0.000 + CDS 434 - 1147 1011 ## COG0528 Uridylate kinase + Prom 1153 - 1212 7.4 3 1 Op 3 19/0.000 + CDS 1276 - 1821 791 ## COG0233 Ribosome recycling factor + Term 1829 - 1883 10.7 + Prom 2058 - 2117 6.2 4 2 Op 1 32/0.000 + CDS 2199 - 2924 800 ## COG0020 Undecaprenyl pyrophosphate synthase 5 2 Op 2 15/0.000 + CDS 2939 - 3748 799 ## COG0575 CDP-diglyceride synthetase 6 2 Op 3 17/0.000 + CDS 3745 - 4881 1459 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 7 2 Op 4 6/0.000 + CDS 4897 - 5934 870 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 + Prom 6066 - 6125 5.1 8 2 Op 5 . + CDS 6207 - 7277 1396 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 9 2 Op 6 8/0.000 + CDS 7308 - 10712 2609 ## COG3857 ATP-dependent nuclease, subunit B 10 2 Op 7 . + CDS 10713 - 14378 2695 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 11 2 Op 8 1/0.000 + CDS 14371 - 15909 1050 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes 12 2 Op 9 1/0.000 + CDS 15983 - 16747 770 ## COG0500 SAM-dependent methyltransferases 13 2 Op 10 . + CDS 16744 - 17622 1160 ## COG1281 Disulfide bond chaperones of the HSP33 family + Term 17737 - 17772 1.1 - Term 17593 - 17637 7.3 14 3 Tu 1 . - CDS 17669 - 17953 366 ## gi|225026999|ref|ZP_03716191.1| hypothetical protein EUBHAL_01255 - Prom 18042 - 18101 10.1 + Prom 17994 - 18053 5.2 15 4 Op 1 16/0.000 + CDS 18266 - 19096 981 ## COG0207 Thymidylate synthase + Prom 19098 - 19157 2.6 16 4 Op 2 . + CDS 19179 - 19682 584 ## COG0262 Dihydrofolate reductase 17 5 Tu 1 . + CDS 19791 - 20135 535 ## Cphy_0823 hypothetical protein + Prom 20173 - 20232 5.4 18 6 Tu 1 . 
+ CDS 20332 - 20826 459 ## PROTEIN SUPPORTED gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase + Prom 20868 - 20927 8.2 19 7 Op 1 . + CDS 20947 - 22092 1696 ## COG0468 RecA/RadA recombinase 20 7 Op 2 . + CDS 22096 - 22710 699 ## Closa_0675 regulatory protein RecX + Term 22725 - 22774 -0.6 Predicted protein(s) >gi|222441879|gb|ACEP01000063.1| GENE 1 218 - 415 222 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026985|ref|ZP_03716177.1| ## NR: gi|225026985|ref|ZP_03716177.1| hypothetical protein EUBHAL_01241 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01241 [Eubacterium hallii DSM 3353] # 1 65 1 65 65 112 100.0 7e-24 MKVLLSNVSRKNGYMIIDVFFSVLSGMNYPVLAHDDTMLKFIDVDEKFKKTEIRYDFRTR KIIMG >gi|222441879|gb|ACEP01000063.1| GENE 2 434 - 1147 1011 237 aa, chain + ## HITS:1 COG:CAC1789 KEGG:ns NR:ns ## COG: CAC1789 COG0528 # Protein_GI_number: 15895065 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Clostridium acetobutylicum # 9 236 8 235 236 213 47.0 3e-55 MKDLKSLNRVLLKLSGEALAGEKKTGFDEATVLKVAEQVKKIVDDGMQVAIVIGGGNFWR GRTSENMERTKADQIGMLATVMNCIYVSDMFRHVGMKTKIYTPFTCGAFTELFSKDDAVK NLEEGVVTFFAGGTGHPYFSTDTATALRAIEIEADAILLAKAIDGVYDSDPKVNPDAVKY DEISLDEVLSKKLGVIDLTATIMCLENKMPLVIFGLEEENSIVNTMSGNFNGTYVTV >gi|222441879|gb|ACEP01000063.1| GENE 3 1276 - 1821 791 181 aa, chain + ## HITS:1 COG:CAC1790 KEGG:ns NR:ns ## COG: CAC1790 COG0233 # Protein_GI_number: 15895066 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Clostridium acetobutylicum # 1 181 5 185 185 183 58.0 2e-46 MIKKFEDKMQKSIESLEREYNSIRAGRANPHVLDKIKVDYYGTPTPLQQVGNISVPEPRI ITIQPWETSLLKAIEKEIMQSDIGINPTNDGKVIRLVFPELTEERRKELVKDVKKKGEAA KVAIRNIRRDANDSLKKANKNSEITEDELKDQQDKVQKATDKYIKTIDGMIDKKSKEILT V >gi|222441879|gb|ACEP01000063.1| GENE 4 2199 - 2924 800 241 aa, chain + ## HITS:1 COG:AGc2550 KEGG:ns NR:ns ## COG: AGc2550 COG0020 # Protein_GI_number: 15888704 # Func_class: I Lipid transport 
and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 241 8 238 247 260 51.0 2e-69 MEKELHGKLLTIPEHVAIILDGNGRWAKKRFMPRNYGHSQGAKTVEQICEDAWDIGIKYL TVYAFSTENWKRPEKEVKALMKLLRKYLKDCIERTAKNNMRVRVIGEKSRLDDDIRAAID ELEEVSAANTGLNFTIALNYGSRDEMVRAMHHMAQDCKDGAFDPEEITEEKFASYLDTKD LPDPDLMIRTSGEQRLSNWMLWQLAYAEFYFTDVLWPDFDKEELIRAVEYYNTRDRRFGG V >gi|222441879|gb|ACEP01000063.1| GENE 5 2939 - 3748 799 269 aa, chain + ## HITS:1 COG:CAC1792 KEGG:ns NR:ns ## COG: CAC1792 COG0575 # Protein_GI_number: 15895068 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Clostridium acetobutylicum # 9 241 9 240 245 104 33.0 2e-22 MFKQRLISGIILVLIAFVTLYYGGFLTFFVVGAISLIGVFELLRVIQMEKSALGVIVYTG SVLYYLLLLLGLEQYVMPLALLVLILLIAVYVFTFPKYEISSIAISYFALFYVTVMLSCI YRIRMLSDGAYMVVLVFLSAWGNDTLAYCTGRLIGKHKMSPILSPKKTVEGAIGGVIGAG LLGCLYGLFAKQFLSVNYNLIVVFGIVCAVGGLISIIGDLGASAIKRNYDIKDYSHLIPG HGGILDRFDSIIFTAPIIYYMLVMVVGLK >gi|222441879|gb|ACEP01000063.1| GENE 6 3745 - 4881 1459 378 aa, chain + ## HITS:1 COG:lin1354 KEGG:ns NR:ns ## COG: lin1354 COG0743 # Protein_GI_number: 16800422 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Listeria innocua # 1 371 1 370 380 395 54.0 1e-110 MKKIAILGSTGSIGTQTLDVVREHSDELQVVALAAGTNKERLKEQIKEFHPKLVSLSDEK KAQELKEELAGEQVEVVCGMEGLIEVAGADSADVVVTAVVGMMGILPTMEAIKKGKDIAL ANKETLVTAGHLIIPMAKEYGVSILPVDSEHSAIFQSLQGEPKAALDKILLTASGGPFRG KSAEFLETVTLEDALNHPNWSMGPKITIDSSTMVNKGLEVMEAKWLFGVDYSQIEVVIQP QSIIHSMVQYVDGAIIAQLGTPDMRVPIEYALFYPERRSLSGERLDFSKLSQITFEKPDY KVFRGLSLAIEAGKTGGTMPTVFNAANERAVAKFLKGEIKYTDIVRSIEKCMDAHKVSAH PDLEEILATEQWVYSILQ >gi|222441879|gb|ACEP01000063.1| GENE 7 4897 - 5934 870 345 aa, chain + ## HITS:1 COG:CAC1796 KEGG:ns NR:ns ## COG: CAC1796 COG0750 # Protein_GI_number: 15895072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # 
Organism: Clostridium acetobutylicum # 2 339 4 333 339 223 40.0 5e-58 MLKIILAIVLFSVIVIFHELGHFLFAKKNGICVEEFAIGIGPTIFGKQIGETKYSIKCLP FGGCCVMLGEDDDCKDPRAFGSQSALARFSVIFAGPFFNFILAFVLALFVIGFSGADPAV AGEISADSGAYEAGLHEGDRIVKLDGSRIYNFREISLFNYLHKDKADVEVTYERDGKQKT VTVTRKKTEAGTYAFGISMTEDTKEGIIGTLKYSILEVRYQIKSTFLSLKYLITGRFKLN DLSGPVGIVNMIGNTYEQSIVYGIKTVVLSLLNFAIMLSANLGVMNLLPLPALDGGRLVF IILEMIRRKKVSPEKEGMVHFAGLVLLMALMVIVMANDIKNIFFI >gi|222441879|gb|ACEP01000063.1| GENE 8 6207 - 7277 1396 356 aa, chain + ## HITS:1 COG:CAC1797 KEGG:ns NR:ns ## COG: CAC1797 COG0821 # Protein_GI_number: 15895073 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Clostridium acetobutylicum # 1 347 1 346 349 393 57.0 1e-109 MYREHTKEVRIGNRVIGGGNPILIQSMTNTKTEDVEATVAQILELEAAGCDIIRSTVPTL EAAKALGEIKKRIHIPIVADIHFDYRMAIAAIENGADKIRINPGNIGSAERVKAVVDKAK EYNVPIRVGVNSGSLEKNIVEKYGKVTAEGLVESALDKVKMIEDMGYDNLVVSIKSSDVL MCAKAHELLAPKCPYPLHVGITEAGTLFRGNIKSAIGLGIILNQGIGDTIRVSLTGNPVN EIASAKQILKTLGLRKGGIEVVSCPTCGRTNIDLIGLANEVEKMVTEFDDLDIKVAVMGC VVNGPGEAKEADIGIAGGKGEGLLIKKGQVVRKLPESELLSTLREELAHWNDKADE >gi|222441879|gb|ACEP01000063.1| GENE 9 7308 - 10712 2609 1134 aa, chain + ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 1 1132 1 1142 1153 534 28.0 1e-151 MALRLITGRSGQGKTEYLVQEVIRLSMENPQQKYYVIVPEQFSLEMQRKMVKKHPRHGFF NIDVLSFHRLAYRIFDECGYQPGEILEDLGVSMMLRKILSEGENNFLYFKKSMKKAGFID ELKSMLMEFIGYGVTWEQVEEVSEKLGENKALSNKCKELGRIYEHFDKAIEGRFMVTEQI LDVAREFAKEAPMLTNAVFYFDGFAGFTPVQLAFLKELLKVAGQINVTITIPEFVPGQRG MPEDLFTSSKKTGDALLMLCQENMQEVEEVVKLDTPVSPRFINNPELAFLEKNLFQNSRK IYSKKIERIHMTVCHNPDGEADYVMHKIEQLVRQKGYRYRDFAVLSGDVADYASAFKRKA AILNIPVFEDTKKKVSYHSGVEAVRSLFHLAQMEYSYESVFRYLKSGMSNLIDEDADYLE NYVLYAGVRGYSMWKKPFYRRLKNKDEAAIKALLLLQEKFMEETENFCSVMRDKEASVRD 
KIEVLYHTMVKLSFEEKLKNQAQKAEENSDFVKAAEYRQLYDLLLSLMDKIVMIFGEEKM AVKELAEIVDAGLDALGLGVVPLTMDQVVLGDLKRTRLHEVKVLFITGMNDGKIPPNIED RGLLSDEEKEVLKNCGISLSQSLLEQSMEDEFYMYMAFAKPTDELYFSYSVTDSDGSALR PSLLQKNISQLFPKLKRKQYPEEERRYYFNLEDSREFLIESFLQAKTEPEKVRKNRAFVM LAKYWLEQSEGRKELEKYGHWIENAYQEPELSEELLEQLYGKELSGSVTRLERFAACPYQ YFCIYGLELREREEYKIRPIDLGNLFHKALECFSRKVKESEYSWKNIPDEVQEQYISEAL HTAMDENLSDVFQSSSRNQYKIKTVERIMKRTIQVLRVHLKNSQFEPDRFELSFGKNKKL KEAEVPLEDGRKMFLQGVIDRVDTCEDDDEILMKVIDYKSGMKKFELEDFYYGLEMQLVI YMNAAEEIYKENEQNPDNKPVVPAGIFYYQLQDPIIKADYAEESELLKNFRLSGMANSDA DILSKLEEGSDGFVSMPIRLKKSGEPYKNSSVMSTQDFHYMGAYARKKAAELGERIYKGE IHPRPYRNKKGTACDYCPFADVCGFDPKLPGYEYQSFQGMSVEEVLEKIREEGE >gi|222441879|gb|ACEP01000063.1| GENE 10 10713 - 14378 2695 1221 aa, chain + ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 2 1203 7 1243 1252 581 33.0 1e-165 MNYTKEQEQAIFLRDKNIMVSAGAGAGKTRVLVSRMAELIMDEKNPVEADRFLVMTFTNA AAAEMKERISLDLEERLAKDPENHYLRKQIRKIRQADISTVHSFCNHLIRTHYNELSIDP SFRIGEEGELFLLRQQAIEQLLEEAYASGRESFVKFAESYAPGKSDKVLEELVGDLYRFS RSFPNASFWFEKTKQEALQLAEAKEWDNSPAVMLIFLKAKKELLQEKEALSKLLKNIAGE EVPEKYGVLLQDVSEYVEALSQTESYDAYYMVLSRGPVPAFPRATKKDKEWADYEIVKEW HQEVKELLQKQKETVFTAPAEELQREAAGIYPLLEEYIVLAQRFEEIYLAYKKEKNVYDF DDLEHFALELLVDHYDEGGQAYPSETAKTLAKKYKMIFVDEYQDTNLVQETILEMLSEKD NNTLFTVGDVKQSIYRFRQARPDLFLRRNEKYHNEEEGVSIELRDNFRSAPGVLCFTNYV FSRLMERDFGGVDYNEETALRAGEGGPMLEDKETSELLFFVKDSVQTLEEAPEDVLTETA LITKRIQELIEEGYHYGDIVILLRSGAGRMEPMAEFLEKNGIPVSCENKTGYFHTREITI ILNYLSVVDNVYQDIPMASVMLSSIGGFTEEELTKLRILVREPMREQYTLYDLMCLYLQE GAEEELRAKIQHFYKDLQYFREQKKEQPLGVLLWDIYQRTGFYYDVQLMPDGAKRKENLL MLLKKAEDYEKTVFKGLFYFNRYMEQLKSYEIELGEAGSGTENENVVKIMTIHKSKGLEF PVVFVSGLSKKFNRMDLTKPMLCHPELGIGMECVNITLRFHHPSLMKRAIQEKVWKDTLE EEMRILYVAMTRAKRKLILTGIIKAQDLEAGLRLQINTQKWRATSAMDWLLPILADQFRS 
TQSEKSEEREGFLKNQETQKQAHSEGVTASWLKARLVNWEEILPFFEKERKQKEEFSYQI FMDTVVIPADVSMVEQSFSYEYPNRDATKWKRKYSVSELKTLSQTTLPEEESSIMLPEQN NFEEEVKMPEFLKEEKSEIAATARGTIIHKIMELLPFGQIQTKKQLFEWIDALKGTYPES EKISAKWLYRAIEAFLFSEIGEKLSQMDRQGNLRKELPFTVGLPVSLMNADTQAEDTVVV QGIIDICGEAEESLWLIDYKTDRIKEGEESLLLDRYGNQMLYYKAALEQILGKRVSESYL YSFSLKKFIPVNLQERSGENA >gi|222441879|gb|ACEP01000063.1| GENE 11 14371 - 15909 1050 512 aa, chain + ## HITS:1 COG:CAC3316 KEGG:ns NR:ns ## COG: CAC3316 COG1502 # Protein_GI_number: 15896559 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Clostridium acetobutylicum # 1 512 1 510 510 485 47.0 1e-137 MRKFLKFLSQRIVIVTFLIFLQIMWFVGMFLKVTSYSQYITAAFKILSILVVIYIMNRSD NPSVKLGWIVVILLFPLFGGLLYLTIGGKQPISYLRRKLEPMIEESARHLKADPEIEQEL KEKDGSTASQIYYLEHQSGFPAYKDSKVDYYPSGEDCFKVMAEELRKAKEYIYLEYFIIE EGIMWNTILDILEEKVKEGVDVRVMYDDVGCIFNLPSHYADFLRKKGIKCIVFNRYIPIF STVFNNRDHRKILVIDGDVAFTGGINFADEYINEKQRFGYWKDNGVRITGKAVYSMTQMF LQMWNAFALEKDKLDYEKFCLEDVRPLAKEERGIVLPYSDHPLDSEFVGESVYINLINSA QKYIYFFTPYLIIDNEVVTALILAAKRGVDVRIITPGIPDKKMIFLVTQSYYPQLVEGGV QIFQYRPGFVHAKCAVCDDKIATVGTINLDYRSLYLHFENGLFLYNNDTVMDIKKDILDT LEDCNRIMPEMCKKNMFFLLLQGILRIFAPLL >gi|222441879|gb|ACEP01000063.1| GENE 12 15983 - 16747 770 254 aa, chain + ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 4 245 3 242 247 154 36.0 1e-37 MGSYENFARVYDELMDNVPYEEWADFILAILKKNKITDGLVLELGCGTGKLMSLLGNAGF DMIGVDNSVDMLQIAREKTSPEFLYLLQDMREFELYGTVKAVVSVCDSVNYITEKEDLTE VFRLVNNYLDPKGLFIFDFNTDYKYRDMIGETVIAEDREDVSFIWFNEYDEESQLNDIDL KVFVQEDGDCYRKFQEEHIQRGYSLQEIKQMLEESGLVFLQAFDEYSNQEPRTDSGRIVV VAQENGKEKQEETE >gi|222441879|gb|ACEP01000063.1| GENE 13 16744 - 17622 1160 292 aa, chain + ## HITS:1 COG:BS_yacC KEGG:ns NR:ns ## 
COG: BS_yacC COG1281 # Protein_GI_number: 16077139 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Bacillus subtilis # 3 285 2 283 291 274 47.0 2e-73 MSDYIIRGMAADKQVRFFAANTKDLVEKARRIHNTSPIATAALGRLMTGTAMMGSMCKND SDIVTVQIKGDGPMGGLVVTSDAKARVKGYVYNKDVMLPPNAQGKLDVGGAIGNGVLTVI KDLGLKEPYSGQTNLITGEIAEDLTYYFASSEQIPTSVALGVLMNKENTVRQAGGFMIQM MPFASDEVITALEERLKDFTSVTSHLDRGETPEDMMAELFEGMDMTIEDKIPTEFYCNCS KERVSRAVISVGKKELADMIEEGKPIEVNCHFCNSHYNFSVEELEEMLKAAR >gi|222441879|gb|ACEP01000063.1| GENE 14 17669 - 17953 366 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225026999|ref|ZP_03716191.1| ## NR: gi|225026999|ref|ZP_03716191.1| hypothetical protein EUBHAL_01255 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01255 [Eubacterium hallii DSM 3353] # 1 94 18 111 111 171 100.0 2e-41 MKKHRIISIALSVIGFAATLALVLYLLKDRIKGCSFCGDDAPDEAEEDIFDSDLDDLEDF VPVEKEEAKPSSQTKVRRGYIPLKFHSHDEAEEA >gi|222441879|gb|ACEP01000063.1| GENE 15 18266 - 19096 981 276 aa, chain + ## HITS:1 COG:BS_thyA KEGG:ns NR:ns ## COG: BS_thyA COG0207 # Protein_GI_number: 16078831 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus subtilis # 1 269 1 273 279 223 45.0 3e-58 MSYADKLFIDMCSDIIENGYSTEGEKVRPKWPDGTYAYTIKKFGVVNRYDLSKEFPAITL RKTYIRSAIDEILWIWQKKSNNVHDLSSHIWDEWADEDGSIGKAYGYQLGKKHKYKEGEM DQVDRVLFDLKENPFSRRIMTNIYVHEDLHEMNLYPCAYSMTFNVTGNRLNAILNQRSQD ILAANNWNVVQYAALLMMFAQVSGFEPGELVHVIADAHIYDRHIPIVKELIQRPVYPAPK VTLNPEVKDFYKFTVDDFTIEDYEAGPQIKNIPIAI >gi|222441879|gb|ACEP01000063.1| GENE 16 19179 - 19682 584 167 aa, chain + ## HITS:1 COG:CAC3004 KEGG:ns NR:ns ## COG: CAC3004 COG0262 # Protein_GI_number: 15896256 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Clostridium acetobutylicum # 1 137 2 134 153 108 40.0 5e-24 MKLIAAADKNWAIGKDGELLVRISEDMKNFSAITTGNVIVMGRKTLESFPGGKPLPNRVN 
IVLTHEKDYNGKGAIVVHSEEELWEELSKYDTDSIFVTGGESIYRMLLPYCDTAYITRLD YAYEADTWMPNLDKEEHWSMVEKGEERYCFDLIYHFTTYKNERPALH >gi|222441879|gb|ACEP01000063.1| GENE 17 19791 - 20135 535 114 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0823 NR:ns ## KEGG: Cphy_0823 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 111 1 111 114 68 35.0 6e-11 MPYIEVKLAKQASRKQERILTEKLGEAITSVPGKTAAGLMVSISSDEHIYKGGEFLEYGA YVNVVFKTGVTRADCEKFNEAIFTIMEEDLKVPKENVYTTFQYCDDFGVQGKFL >gi|222441879|gb|ACEP01000063.1| GENE 18 20332 - 20826 459 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994988|ref|ZP_01823966.1| ribosomal protein L11 methyltransferase [Streptococcus pneumoniae SP9-BS68] # 57 164 7 114 114 181 74 4e-45 MSREKEELQGVTLLGNQKTKYPQDYAPEMLETFINKHQDHDYFVKFNCPEFTSLCPMTGQ PDFATIYISYVPDVKMVESKSLKLYLFSFRNHGDFHEDCVNIIMKDLIKLMDPKYIEVWG KFTPRGGISIDPYTNYGKPGTMWEQVATERLVHHDLYPEKVDNR >gi|222441879|gb|ACEP01000063.1| GENE 19 20947 - 22092 1696 381 aa, chain + ## HITS:1 COG:BH2383 KEGG:ns NR:ns ## COG: BH2383 COG0468 # Protein_GI_number: 15614946 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Bacillus halodurans # 26 361 6 340 349 434 65.0 1e-121 MAASKKDNVNAGRVEYLGDDREGKLAALNNAVAAIEKNYGKGSIMKLGDSGANIDIEAIP TGSISLDVALGVGGVPRGRIIEIYGPESSGKTTVALHMIAEAQKRNGIAGFIDAEHALDP QYAKKIGVDIDNLYISQPDNGEQALEIAETMIRSGALDIVIVDSVAALVPKAEIEGDMDD QQVGLHARLMSKAMRKLTGVINRSNCAVVFINQLREKVGIMFGNPEVTTGGRALKFYSSV RMDVRRVEAIKQGGEIIGNHTKVKVVKNKVAPPFKEAEFDIMFGQGISREGDLIDLAVKV DAVQKSGAWYAYKGEKIGQGRENAKTFLREHPEIMAEVEVQVREYYNLPTDTVVESASQE EGQPKAGDNQNGQFNSITTEE >gi|222441879|gb|ACEP01000063.1| GENE 20 22096 - 22710 699 204 aa, chain + ## HITS:1 COG:no KEGG:Closa_0675 NR:ns ## KEGG: Closa_0675 # Name: not_defined # Def: regulatory protein RecX # Organism: C.saccharolyticum # Pathway: not_defined # 25 196 24 195 202 95 35.0 1e-18 MQYKITLTDTGKKKVYVNPGVREGIYLYPGEIKKLRLEDGSELEQDKFERIRLEYALPRA 
KHRAIAILAKRDKTEQELREKLLQSLTDTQSLEEAISYMKSCGYVDDTQYARDYLYFKKG RKSFLQIKMELQKKGIPAEVLETVFEEEGSQQMEDILEQARKYMRKFPELDFPARQKVYA HFARKGYAGDLIREAIDKIEELEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:25:15 2011 Seq name: gi|222441878|gb|ACEP01000064.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont70.1, whole genome shotgun sequence Length of sequence - 194898 bp Number of predicted genes - 179, with homology - 176 Number of transcription units - 97, operones - 46 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 40 - 84 14.1 1 1 Tu 1 . - CDS 228 - 1049 778 ## CLJU_c32910 hypothetical protein - Prom 1083 - 1142 8.0 + Prom 1618 - 1677 2.9 2 2 Op 1 49/0.000 + CDS 1722 - 2666 885 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 3 2 Op 2 5/0.000 + CDS 2672 - 3499 835 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components + Prom 3561 - 3620 8.1 4 2 Op 3 5/0.000 + CDS 3648 - 5216 1940 ## COG0747 ABC-type dipeptide transport system, periplasmic component 5 2 Op 4 44/0.000 + CDS 5226 - 6014 444 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 6 2 Op 5 . + CDS 6007 - 6963 786 ## COG4608 ABC-type oligopeptide transport system, ATPase component - Term 7114 - 7157 0.1 7 3 Op 1 . - CDS 7214 - 8404 947 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 8 3 Op 2 . - CDS 8457 - 9197 151 ## Bcer98_0953 hypothetical protein - Prom 9269 - 9328 8.2 + Prom 9152 - 9211 7.3 9 4 Tu 1 . + CDS 9384 - 10361 1148 ## gi|225027016|ref|ZP_03716208.1| hypothetical protein EUBHAL_01272 + Prom 10571 - 10630 5.4 10 5 Tu 1 . + CDS 10650 - 11609 1573 ## COG0280 Phosphotransacetylase + Term 11855 - 11888 -0.9 + Prom 11785 - 11844 8.5 11 6 Tu 1 . 
+ CDS 11913 - 13010 1599 ## COG0371 Glycerol dehydrogenase and related enzymes + Term 13032 - 13075 12.1 - Term 12841 - 12876 -0.7 12 7 Op 1 28/0.000 - CDS 13125 - 16574 2552 ## COG0419 ATPase involved in DNA repair 13 7 Op 2 . - CDS 16576 - 17775 1026 ## COG0420 DNA repair exonuclease - Prom 17867 - 17926 6.2 14 8 Tu 1 . + CDS 18144 - 19091 831 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 19122 - 19163 1.2 - Term 18961 - 19006 3.5 15 9 Op 1 . - CDS 19039 - 19599 554 ## COG1437 Adenylate cyclase, class 2 (thermophilic) 16 9 Op 2 . - CDS 19612 - 20802 880 ## gi|225027023|ref|ZP_03716215.1| hypothetical protein EUBHAL_01279 - Prom 20988 - 21047 6.8 + Prom 20934 - 20993 9.4 17 10 Op 1 . + CDS 21044 - 21757 888 ## EUBREC_1566 hypothetical protein + Prom 21862 - 21921 5.2 18 10 Op 2 . + CDS 21955 - 23130 1662 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) + Term 23218 - 23265 4.0 19 11 Tu 1 . - CDS 23600 - 23812 364 ## gi|225027026|ref|ZP_03716218.1| hypothetical protein EUBHAL_01282 - Prom 24038 - 24097 10.1 + Prom 23824 - 23883 8.2 20 12 Op 1 2/0.000 + CDS 24129 - 24734 761 ## COG0118 Glutamine amidotransferase 21 12 Op 2 . + CDS 24745 - 25503 892 ## COG0107 Imidazoleglycerol-phosphate synthase + Term 25506 - 25564 2.3 + Prom 25629 - 25688 2.9 22 13 Tu 1 . + CDS 25708 - 26667 625 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF + Prom 26744 - 26803 8.3 23 14 Tu 1 . + CDS 26870 - 27046 276 ## PROTEIN SUPPORTED gi|160880748|ref|YP_001559716.1| ribosomal protein S21 + Term 27057 - 27092 5.1 + Prom 27409 - 27468 7.2 24 15 Op 1 10/0.000 + CDS 27567 - 28565 1536 ## COG2376 Dihydroxyacetone kinase + Term 28644 - 28678 -0.3 + Prom 28578 - 28637 6.7 25 15 Op 2 9/0.000 + CDS 28707 - 29330 1125 ## COG2376 Dihydroxyacetone kinase 26 15 Op 3 . 
+ CDS 29349 - 29729 664 ## COG3412 Uncharacterized protein conserved in bacteria + Term 29871 - 29918 -0.9 + Prom 29864 - 29923 2.5 27 16 Op 1 . + CDS 29962 - 30177 202 ## gi|225027035|ref|ZP_03716227.1| hypothetical protein EUBHAL_01291 28 16 Op 2 . + CDS 30190 - 31443 798 ## Cphy_2612 putative stage IV sporulation YqfD 29 16 Op 3 17/0.000 + CDS 31523 - 32533 972 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 30 16 Op 4 . + CDS 32530 - 33039 672 ## COG0319 Predicted metal-dependent hydrolase 31 17 Op 1 . + CDS 33225 - 34652 1246 ## COG2081 Predicted flavoproteins 32 17 Op 2 . + CDS 34649 - 36334 1342 ## COG2509 Uncharacterized FAD-dependent dehydrogenases + Term 36385 - 36420 0.3 + Prom 36368 - 36427 6.4 33 18 Op 1 . + CDS 36509 - 36733 304 ## Cphy_2602 hypothetical protein 34 18 Op 2 . + CDS 36735 - 37283 478 ## COG0212 5-formyltetrahydrofolate cyclo-ligase + Term 37325 - 37368 -0.8 - Term 37452 - 37500 2.1 35 19 Tu 1 . - CDS 37511 - 38122 435 ## COG1309 Transcriptional regulator + Prom 38073 - 38132 8.8 36 20 Op 1 1/0.067 + CDS 38324 - 40432 2115 ## COG1033 Predicted exporters of the RND superfamily 37 20 Op 2 . 
+ CDS 40521 - 43001 3276 ## COG1511 Predicted membrane protein + Term 43099 - 43142 8.5 + Prom 43027 - 43086 5.2 38 21 Op 1 32/0.000 + CDS 43159 - 43620 697 ## COG0779 Uncharacterized protein conserved in bacteria 39 21 Op 2 22/0.000 + CDS 43658 - 44902 761 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 40 21 Op 3 8/0.000 + CDS 44925 - 45200 160 ## PROTEIN SUPPORTED gi|78222795|ref|YP_384542.1| 50S ribosomal protein L7AE 41 21 Op 4 10/0.000 + CDS 45184 - 45504 297 ## PROTEIN SUPPORTED gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 42 21 Op 5 32/0.000 + CDS 45508 - 48015 3383 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 43 21 Op 6 4/0.000 + CDS 48030 - 48416 518 ## COG0858 Ribosome-binding factor A 44 21 Op 7 1/0.067 + CDS 48413 - 49450 219 ## PROTEIN SUPPORTED gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B 45 21 Op 8 12/0.000 + CDS 49431 - 50453 931 ## COG0130 Pseudouridine synthase 46 21 Op 9 . + CDS 50519 - 51448 361 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 + Prom 51703 - 51762 7.5 47 22 Tu 1 . + CDS 51883 - 52503 657 ## bpr_I1473 hypothetical protein - Term 52538 - 52611 18.7 48 23 Op 1 . - CDS 52660 - 53382 673 ## COG0655 Multimeric flavodoxin WrbA 49 23 Op 2 . - CDS 53453 - 54229 1009 ## COG4816 Ethanolamine utilization protein - Prom 54330 - 54389 6.7 + Prom 54281 - 54340 8.9 50 24 Op 1 . + CDS 54434 - 57994 3693 ## COG1038 Pyruvate carboxylase 51 24 Op 2 . + CDS 58010 - 59290 974 ## COG0612 Predicted Zn-dependent peptidases 52 24 Op 3 . + CDS 59370 - 60746 1749 ## COG1362 Aspartyl aminopeptidase + Prom 60837 - 60896 3.7 53 25 Tu 1 . + CDS 61059 - 61400 348 ## Cphy_2804 hypothetical protein + Term 61458 - 61515 3.1 + Prom 61508 - 61567 11.2 54 26 Op 1 4/0.000 + CDS 61647 - 64283 3004 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 55 26 Op 2 . 
+ CDS 64297 - 64902 774 ## COG0237 Dephospho-CoA kinase 56 26 Op 3 . + CDS 64970 - 65761 827 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I + Prom 65773 - 65832 8.2 57 27 Op 1 9/0.000 + CDS 65863 - 66150 440 ## COG3830 ACT domain-containing protein 58 27 Op 2 . + CDS 66161 - 67525 2096 ## COG2848 Uncharacterized conserved protein + Term 67547 - 67607 22.2 + Prom 67915 - 67974 7.5 59 28 Op 1 40/0.000 + CDS 68002 - 68691 433 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 60 28 Op 2 4/0.000 + CDS 68667 - 69932 567 ## COG0642 Signal transduction histidine kinase + Term 69942 - 69990 9.1 + Prom 69964 - 70023 5.2 61 29 Op 1 36/0.000 + CDS 70046 - 70720 324 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 62 29 Op 2 . + CDS 70783 - 72003 618 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 63 29 Op 3 . + CDS 71976 - 73160 1048 ## EUBELI_01520 hypothetical protein + Term 73307 - 73353 7.2 + Prom 73359 - 73418 5.6 64 30 Op 1 . + CDS 73478 - 73753 164 ## gi|225027073|ref|ZP_03716265.1| hypothetical protein EUBHAL_01329 65 30 Op 2 . + CDS 73823 - 74386 309 ## CDR20291_3288 hypothetical protein 66 30 Op 3 . + CDS 74376 - 75035 420 ## Sgly_1453 hypothetical protein + Prom 75106 - 75165 8.2 67 31 Op 1 . + CDS 75196 - 75681 93 ## gi|225027076|ref|ZP_03716268.1| hypothetical protein EUBHAL_01332 68 31 Op 2 . + CDS 75684 - 76583 543 ## gi|225027077|ref|ZP_03716269.1| hypothetical protein EUBHAL_01333 69 31 Op 3 . + CDS 76598 - 77284 449 ## gi|225027078|ref|ZP_03716270.1| hypothetical protein EUBHAL_01334 70 31 Op 4 . + CDS 77284 - 77553 184 ## BCB4264_A3444 TPR-repeat-containing protein + Prom 77633 - 77692 8.1 71 32 Tu 1 . + CDS 77900 - 78514 445 ## COG2357 Uncharacterized protein conserved in bacteria + Prom 78620 - 78679 7.0 72 33 Op 1 . 
+ CDS 78709 - 79407 497 ## PROTEIN SUPPORTED gi|23335919|ref|ZP_00121150.1| COG0081: Ribosomal protein L1 73 33 Op 2 . + CDS 79373 - 80020 446 ## Sgly_2563 helix-turn-helix domain protein 74 33 Op 3 . + CDS 80084 - 80782 424 ## COG4912 Predicted DNA alkylation repair enzyme + Prom 80853 - 80912 2.5 75 34 Tu 1 . + CDS 80937 - 82364 670 ## BAA_A0205 pXO1-133 + Term 82587 - 82632 -0.8 + Prom 82561 - 82620 10.0 76 35 Tu 1 . + CDS 82644 - 83669 957 ## COG2008 Threonine aldolase + Term 83807 - 83841 -0.2 + Prom 83824 - 83883 5.7 77 36 Op 1 2/0.000 + CDS 83915 - 85144 1340 ## COG1160 Predicted GTPases 78 36 Op 2 . + CDS 85192 - 86604 1295 ## COG1027 Aspartate ammonia-lyase + Term 86777 - 86815 0.0 - Term 86813 - 86841 -0.9 79 37 Tu 1 . - CDS 87032 - 88432 1657 ## COG1115 Na+/alanine symporter - Prom 88476 - 88535 2.8 + Prom 89066 - 89125 8.8 80 38 Op 1 . + CDS 89158 - 89373 96 ## gi|225027093|ref|ZP_03716285.1| hypothetical protein EUBHAL_01349 81 38 Op 2 2/0.000 + CDS 89342 - 90355 1084 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family 82 38 Op 3 5/0.000 + CDS 90428 - 91447 1130 ## COG0303 Molybdopterin biosynthesis enzyme 83 38 Op 4 . + CDS 91465 - 91953 495 ## COG0315 Molybdenum cofactor biosynthesis enzyme 84 38 Op 5 . + CDS 91956 - 93851 1443 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Term 93644 - 93706 2.2 85 39 Op 1 . - CDS 93918 - 94394 305 ## COG1418 Predicted HD superfamily hydrolase 86 39 Op 2 . - CDS 94439 - 95764 1301 ## COG0366 Glycosidases - Prom 95824 - 95883 7.4 - Term 95774 - 95830 0.5 87 40 Tu 1 . 
- CDS 95900 - 96136 334 ## Closa_0890 Phosphotransferase system, phosphocarrier protein HPr - Prom 96304 - 96363 10.2 + Prom 96302 - 96361 8.9 88 41 Op 1 31/0.000 + CDS 96402 - 97193 1132 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain 89 41 Op 2 34/0.000 + CDS 97227 - 97895 594 ## COG0765 ABC-type amino acid transport system, permease component 90 41 Op 3 . + CDS 97895 - 98638 285 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 + Term 98657 - 98691 4.4 91 42 Op 1 22/0.000 + CDS 98697 - 99470 948 ## COG0263 Glutamate 5-kinase 92 42 Op 2 . + CDS 99467 - 100723 1505 ## COG0014 Gamma-glutamyl phosphate reductase + Term 100887 - 100938 12.6 - Term 100873 - 100925 12.1 93 43 Tu 1 . - CDS 100944 - 101792 732 ## gi|225027106|ref|ZP_03716298.1| hypothetical protein EUBHAL_01362 - Prom 101850 - 101909 4.3 94 44 Tu 1 . - CDS 102186 - 103433 922 ## gi|225027107|ref|ZP_03716299.1| hypothetical protein EUBHAL_01363 - Prom 103512 - 103571 10.3 95 45 Op 1 . + CDS 103813 - 104607 739 ## COG2357 Uncharacterized protein conserved in bacteria 96 45 Op 2 . + CDS 104690 - 105712 641 ## COG0530 Ca2+/Na+ antiporter - Term 105764 - 105826 3.1 97 46 Tu 1 . - CDS 105861 - 106727 956 ## Closa_3823 ErfK/YbiS/YcfS/YnhG family protein - Prom 106759 - 106818 10.6 + Prom 106935 - 106994 6.4 98 47 Tu 1 . + CDS 107030 - 108079 1052 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain + Term 108162 - 108192 0.4 99 48 Tu 1 . - CDS 108608 - 112846 4273 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) - Prom 113086 - 113145 6.6 + Prom 113193 - 113252 9.6 100 49 Tu 1 . + CDS 113272 - 113643 349 ## Hore_11320 peptidoglycan-binding LysM + Term 113745 - 113783 3.1 101 50 Tu 1 . - CDS 113727 - 113966 93 ## - Prom 114121 - 114180 7.2 + Prom 113670 - 113729 7.2 102 51 Tu 1 . 
+ CDS 113839 - 114777 985 ## COG0598 Mg2+ and Co2+ transporters + Term 114789 - 114828 6.9 + Prom 114779 - 114838 2.8 103 52 Tu 1 . + CDS 114924 - 115739 498 ## Closa_0669 hypothetical protein + Prom 115746 - 115805 5.2 104 53 Op 1 . + CDS 115827 - 116165 341 ## gi|225027118|ref|ZP_03716310.1| hypothetical protein EUBHAL_01374 105 53 Op 2 . + CDS 116181 - 117110 665 ## Closa_1954 hypothetical protein 106 53 Op 3 . + CDS 117126 - 117566 672 ## EUBELI_00954 hypothetical protein + Term 117586 - 117618 -0.1 + Prom 117695 - 117754 5.8 107 54 Tu 1 . + CDS 117812 - 117997 261 ## PROTEIN SUPPORTED gi|160881022|ref|YP_001559990.1| ribosomal protein L28 + Term 118070 - 118103 5.1 + Prom 118224 - 118283 9.8 108 55 Op 1 9/0.000 + CDS 118408 - 118764 469 ## COG1302 Uncharacterized protein conserved in bacteria + Prom 118820 - 118879 6.7 109 55 Op 2 4/0.000 + CDS 118933 - 120642 2265 ## COG1461 Predicted kinase related to dihydroxyacetone kinase + Term 120695 - 120745 10.2 + Prom 120688 - 120747 5.2 110 56 Op 1 . + CDS 120778 - 122829 1891 ## COG1200 RecG-like helicase 111 56 Op 2 . + CDS 122904 - 123626 720 ## COG2365 Protein tyrosine/serine phosphatase + Prom 123696 - 123755 10.4 112 57 Op 1 14/0.000 + CDS 123786 - 124343 285 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 113 57 Op 2 . + CDS 124383 - 124865 440 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 114 57 Op 3 . + CDS 124887 - 125504 808 ## EUBELI_01034 hypothetical protein 115 58 Tu 1 . - CDS 125640 - 125945 116 ## gi|225027130|ref|ZP_03716322.1| hypothetical protein EUBHAL_01386 - Prom 126148 - 126207 6.2 + Prom 125889 - 125948 1.5 116 59 Tu 1 . + CDS 125968 - 126111 68 ## + Prom 126667 - 126726 7.2 117 60 Op 1 . + CDS 126754 - 127272 234 ## PROTEIN SUPPORTED gi|170754849|ref|YP_001782001.1| ribosomal protein L32 family protein 118 60 Op 2 . 
+ CDS 127352 - 127531 276 ## PROTEIN SUPPORTED gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 119 60 Op 3 . + CDS 127581 - 127727 63 ## + Prom 127774 - 127833 5.5 120 61 Op 1 2/0.000 + CDS 127866 - 128639 1131 ## COG1691 NCAIR mutase (PurE)-related proteins 121 61 Op 2 . + CDS 128683 - 129972 1719 ## COG1641 Uncharacterized conserved protein + Term 130002 - 130045 5.1 + Prom 130233 - 130292 8.7 122 62 Tu 1 . + CDS 130347 - 131477 1124 ## COG2768 Uncharacterized Fe-S center protein + Term 131638 - 131678 7.2 + Prom 131642 - 131701 5.9 123 63 Tu 1 . + CDS 131774 - 131974 268 ## EUBREC_0225 hypothetical protein + Term 132005 - 132062 -0.9 + Prom 132016 - 132075 1.6 124 64 Op 1 5/0.000 + CDS 132117 - 132887 1172 ## COG2022 Uncharacterized enzyme of thiazole biosynthesis + Prom 133072 - 133131 4.9 125 64 Op 2 3/0.000 + CDS 133171 - 134424 1153 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 126 64 Op 3 . + CDS 134417 - 135154 168 ## COG0352 Thiamine monophosphate synthase + Prom 135176 - 135235 5.9 127 64 Op 4 . + CDS 135317 - 136093 720 ## COG2510 Predicted membrane protein + Term 136123 - 136184 15.6 + Prom 136595 - 136654 7.0 128 65 Tu 1 . + CDS 136717 - 138045 400 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 138104 - 138147 0.1 + Prom 138302 - 138361 6.0 129 66 Op 1 4/0.000 + CDS 138398 - 138841 376 ## COG1846 Transcriptional regulators 130 66 Op 2 35/0.000 + CDS 138914 - 140662 1969 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 131 66 Op 3 . + CDS 140665 - 142557 2104 ## COG1132 ABC-type multidrug transport system, ATPase and permease components + Term 142674 - 142728 -1.0 132 67 Tu 1 . - CDS 142570 - 143973 991 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs - Prom 144088 - 144147 5.6 + Prom 144077 - 144136 8.3 133 68 Tu 1 . 
+ CDS 144278 - 145492 1409 ## COG1457 Purine-cytosine permease and related proteins
+ Term 145508 - 145568 17.1
+ Prom 145503 - 145562 7.9
134 69 Op 1 . + CDS 145584 - 146216 659 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
135 69 Op 2 . + CDS 146259 - 146777 508 ## COG4917 Ethanolamine utilization protein
+ Term 146836 - 146873 6.1
+ Prom 146854 - 146913 4.7
136 70 Tu 1 . + CDS 146982 - 147911 1035 ## COG0530 Ca2+/Na+ antiporter
- Term 147931 - 147978 -0.8
137 71 Tu 1 . - CDS 148105 - 148980 540 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins)
- Prom 149105 - 149164 6.1
+ Prom 149057 - 149116 6.3
138 72 Tu 1 . + CDS 149268 - 150368 893 ## FN1582 hypothetical protein
+ Term 150448 - 150482 2.0
+ Prom 150577 - 150636 6.7
139 73 Tu 1 . + CDS 150689 - 151342 960 ## COG4816 Ethanolamine utilization protein
+ Prom 151433 - 151492 5.4
140 74 Tu 1 . + CDS 151524 - 152177 821 ## COG4816 Ethanolamine utilization protein
+ Prom 152486 - 152545 6.7
141 75 Tu 1 . + CDS 152711 - 154462 2405 ## COG0173 Aspartyl-tRNA synthetase
+ Term 154657 - 154705 1.1
+ Prom 154928 - 154987 9.2
142 76 Op 1 . + CDS 155061 - 155183 59 ## gi|225027159|ref|ZP_03716351.1| hypothetical protein EUBHAL_01415
143 76 Op 2 . + CDS 155211 - 157199 2862 ## COG3808 Inorganic pyrophosphatase
+ Term 157416 - 157455 2.3
+ Prom 157455 - 157514 5.1
144 77 Tu 1 . + CDS 157540 - 157932 382 ## CLJU_c39230 hypothetical protein
+ Prom 158046 - 158105 4.3
145 78 Op 1 . + CDS 158298 - 159128 869 ## COG3757 Lyzozyme M1 (1,4-beta-N-acetylmuramidase)
146 78 Op 2 . + CDS 159070 - 159504 173 ## gi|225027164|ref|ZP_03716356.1| hypothetical protein EUBHAL_01420
+ Prom 159544 - 159603 5.4
147 79 Op 1 . + CDS 159635 - 160162 473 ## COG4917 Ethanolamine utilization protein
148 79 Op 2 . + CDS 160237 - 160416 318 ## gi|225027166|ref|ZP_03716358.1| hypothetical protein EUBHAL_01422
+ Term 160443 - 160475 -1.0
+ Prom 160446 - 160505 2.3
149 80 Tu 1 . + CDS 160548 - 161252 624 ## COG5523 Predicted integral membrane protein
+ Prom 161272 - 161331 7.2
150 81 Tu 1 . + CDS 161367 - 162077 948 ## gi|225027168|ref|ZP_03716360.1| hypothetical protein EUBHAL_01424
+ Prom 162169 - 162228 6.2
151 82 Op 1 3/0.000 + CDS 162260 - 164170 1701 ## COG0685 5,10-methylenetetrahydrofolate reductase
+ Prom 164180 - 164239 3.6
152 82 Op 2 . + CDS 164328 - 166799 3156 ## COG1410 Methionine synthase I, cobalamin-binding domain
+ Prom 166835 - 166894 6.0
153 83 Tu 1 . + CDS 166939 - 167859 804 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
+ Prom 167978 - 168037 7.3
154 84 Op 1 . + CDS 168069 - 169196 1395 ## COG1316 Transcriptional regulator
155 84 Op 2 15/0.000 + CDS 169221 - 169712 767 ## COG0440 Acetolactate synthase, small (regulatory) subunit
+ Term 169777 - 169812 1.1
+ Prom 169781 - 169840 9.8
156 84 Op 3 . + CDS 170019 - 171038 1289 ## COG0059 Ketol-acid reductoisomerase
+ Term 171059 - 171103 7.5
+ Prom 171242 - 171301 6.8
157 85 Tu 1 . + CDS 171350 - 172411 945 ## EUBELI_01556 hypothetical protein
+ Prom 172473 - 172532 5.2
158 86 Op 1 . + CDS 172555 - 173007 425 ## COG0394 Protein-tyrosine-phosphatase
159 86 Op 2 5/0.000 + CDS 173031 - 174074 900 ## COG1716 FOG: FHA domain
+ Prom 174425 - 174484 8.5
160 86 Op 3 . + CDS 174522 - 175616 896 ## COG0515 Serine/threonine protein kinase
161 86 Op 4 . + CDS 175684 - 176661 1128 ## gi|225027182|ref|ZP_03716374.1| hypothetical protein EUBHAL_01438
162 86 Op 5 . + CDS 176654 - 177133 336 ## gi|225027183|ref|ZP_03716375.1| hypothetical protein EUBHAL_01439
163 86 Op 6 . + CDS 177136 - 178293 1128 ## COG0860 N-acetylmuramoyl-L-alanine amidase
- Term 178143 - 178190 4.1
164 87 Tu 1 . - CDS 178374 - 179222 647 ## COG1737 Transcriptional regulators
- Prom 179249 - 179308 6.0
+ Prom 179581 - 179640 7.5
165 88 Op 1 . + CDS 179707 - 180168 299 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type)
+ Prom 180180 - 180239 3.7
166 88 Op 2 . + CDS 180259 - 180936 895 ## COG0120 Ribose 5-phosphate isomerase
+ Prom 180973 - 181032 9.2
167 89 Tu 1 . + CDS 181137 - 182108 1221 ## COG0673 Predicted dehydrogenases and related proteins
+ Term 182138 - 182173 3.3
+ Prom 182219 - 182278 5.4
168 90 Tu 1 . + CDS 182317 - 182532 292 ## gi|225027191|ref|ZP_03716383.1| hypothetical protein EUBHAL_01447
+ Term 182769 - 182808 3.2
+ Prom 182782 - 182841 7.2
169 91 Tu 1 . + CDS 182868 - 184235 1075 ## COG0534 Na+-driven multidrug efflux pump
+ Term 184338 - 184393 12.0
+ Prom 184351 - 184410 6.0
170 92 Tu 1 . + CDS 184517 - 185812 1657 ## COG0148 Enolase
+ Term 185868 - 185924 8.1
+ Prom 185895 - 185954 7.4
171 93 Tu 1 . + CDS 186113 - 186583 549 ## COG4894 Uncharacterized conserved protein
+ Prom 186616 - 186675 5.1
172 94 Tu 1 . + CDS 186722 - 187513 573 ## COG0300 Short-chain dehydrogenases of various substrate specificities
+ Prom 187517 - 187576 5.8
173 95 Op 1 17/0.000 + CDS 187627 - 189030 1668 ## COG0569 K+ transport systems, NAD-binding component
174 95 Op 2 . + CDS 189045 - 190523 915 ## COG0168 Trk-type K+ transport systems, membrane components
+ Prom 190932 - 190991 9.0
175 96 Op 1 . + CDS 191075 - 191284 277 ## Closa_0962 FeoA family protein
176 96 Op 2 22/0.000 + CDS 191338 - 191559 442 ## COG1918 Fe2+ transport system protein A
+ Prom 191570 - 191629 5.7
177 96 Op 3 . + CDS 191714 - 193906 2791 ## COG0370 Fe2+ transport system protein B
+ Term 193917 - 193978 11.3
+ Prom 193923 - 193982 2.6
178 97 Op 1 . + CDS 194010 - 194177 203 ## gi|225027203|ref|ZP_03716395.1| hypothetical protein EUBHAL_01459
179 97 Op 2 .
+ CDS 194233 - 194703 443 ## Closa_0966 ferric uptake regulator, Fur family + Term 194728 - 194778 7.9 Predicted protein(s) >gi|222441878|gb|ACEP01000064.1| GENE 1 228 - 1049 778 273 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c32910 NR:ns ## KEGG: CLJU_c32910 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 93 272 138 316 319 68 23.0 2e-10 MQNNKAKNLNFTEILSNEPDNILSLDVSSVKTLQIGYQAETLFLHKATDDTLTIKEYING LTGSEYYAKVTANRFKTTIRYGRREAVNPATCIEIFLPDSYNGELLLSSQYGNIMTDADW KVERFAAETTEGSISLNTITAPRIRLVSSTSLIHITKAEGFTDIHSVSGTIIADDIEGGA KLATSSSPIQATFTSLNNIVECETLNGNIQLLLPENTGMKIDGISKRGKIFSDIEGLSIK EKPGNVQNITGILGEKPFQNVRISTINGDISLK >gi|222441878|gb|ACEP01000064.1| GENE 2 1722 - 2666 885 314 aa, chain + ## HITS:1 COG:FN1521 KEGG:ns NR:ns ## COG: FN1521 COG0601 # Protein_GI_number: 19704853 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 9 308 9 308 312 284 54.0 2e-76 MSVKQLGRRLLQIVIVLIGISFFTFILTYLAPGDPVRTRYAASGMVPTEKELQEERERLG LNDPVLVQYGRWLGKVLHGDFGTSYSNGKPVAELLSVRLLPTLKLAFSALILMLLVAVPL GMLSAVYKNTWVDYLVRGVTFLGVSIPNFWVGLILLYVVALKFSLLPVISTGDGFEKIIL PAITLAFAMAGKYTRQVRTAVLEELNKDYVTGARARGMSEQEILWKQVFPNALLPLITLL GLSLGNLLGGTAVVEVIFTYPGLGNLAVQAITAYDYSLIQGYVLWVALIYMVINLLVDMS YTFVDPRIRRNERG >gi|222441878|gb|ACEP01000064.1| GENE 3 2672 - 3499 835 275 aa, chain + ## HITS:1 COG:FN1522 KEGG:ns NR:ns ## COG: FN1522 COG1173 # Protein_GI_number: 19704854 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Fusobacterium nucleatum # 3 275 2 274 276 301 56.0 7e-82 MQKILNFIKKNKQFSFFFLLVILLIFVAVFAPVIAPQDPYVSDLKNALQAPNSEHWFGTD KLGRDIFSRVIYASRISLPATLTLVAIIFVAGTILGTLAGYFGGWVDAVIMRLSDMMISF PGMVLAIAVAGIMGASIKNAVIAIAIVSWSKYARLARSLVMKIRHEDYVYAAIVTGSKTG 
YILRKYMLPNVIPTLVITAATDIGGMMLELAGLSFLGFGAKAPAAEWGLMLNEGRTYMQN APWMMIYPGLAIFIVVVIFNLLGDSLRDVLDVREE >gi|222441878|gb|ACEP01000064.1| GENE 4 3648 - 5216 1940 522 aa, chain + ## HITS:1 COG:FN1523 KEGG:ns NR:ns ## COG: FN1523 COG0747 # Protein_GI_number: 19704855 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Fusobacterium nucleatum # 5 518 6 522 526 288 33.0 2e-77 MKKMKKFLGIMLAMVMAVTMLAGCGGSSKESSSSDNGKTLKFGCFSYSESLDPANMINSA WCSSRYGIGECLFKYKDDLSVENTICDEYSTEDNKTWVFHIRDGVKFSNGNDVTPSAIKA CWDVLYANKTGSSNPSKFMDYESITPDDDARTLTIVLKQAKADLRKDLAYPVFVILDVSG DYDLAENPIGTGPYALTSFDSKTKATLVKNKYYWNGEVPYDNVEIDFLSEDNKALALQNG DINLVENVTSTSDLDTLKKDDNFNVSIGQGVRTGFAYINEKGILGDDTLRQAVQMALDHK TMCEVTVGGLYTEGISVLPSSLAYGYDNLKNPYTYNVDNAKKLLDDAGYKDTDGDGIREM NGKKISINYITYENRRLSDFAQAIQTQLADIGIEAKINSMDADREWNKMVAGEYDLCDSN WITVGNGDPTEYMANWYGKSDANFCNYKNAEYDKLYEQLDTEFDEAKRAKIIEQLQQILI NDAAVIVHGYYNSSMISDKTVSGANIHTADYYWLTTEIAPAK >gi|222441878|gb|ACEP01000064.1| GENE 5 5226 - 6014 444 262 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 1 262 8 270 563 175 35 2e-91 MLDVQDITVAYSSNAEPTIEGFHLEMKPGEICSIVGESGSGKTTVIRSILGLLPSGGKVT KGDILFEGESLLQYSKKQWRELRGTQISMIFQDSGAMINPIRKIGSQYVEYIRAHKNISK KDAFDMAVEMLEKMRLPDGKRIMESYPFELSGGMCQRVGIAMAMTFQPKLLLADEPTSAL DVTTQAQIVRQMMELRDDFGTGIIVVTHNIGVAAYMADKLLVMRQGKVVDSGQREDILHN PQSEYTKNLLASVPALKGERYV >gi|222441878|gb|ACEP01000064.1| GENE 6 6007 - 6963 786 318 aa, chain + ## HITS:1 COG:FN1525 KEGG:ns NR:ns ## COG: FN1525 COG4608 # Protein_GI_number: 19704857 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Fusobacterium nucleatum # 1 316 19 334 342 385 56.0 1e-107 MSKEKELILEAKHVTKKFPLTKGKELIANDDVSLKFYKGQTLGIVGESGCGKSTFMRMMV QLEQPTSGEILFKGKDITKMKGEELRKNRRNVQMVFQNPSTSFNPKMKVGDIICEPLMNF GLIKKKEKDAVARKYLEMVELPGDFADRYPHNMSGGQRQRVGIARAIALEPEIIFCDEST SALDVSIQKAILELLVKLQKEKNIAIGFICHDLAMISSIAHQIAVMYMGNIVEILPGEEV ARGALHPYTKALLQSVFDLHMDFNKLITPLEGEASMPTGELQGCPFQNRCAFCMEKCQKE KPELKTLDENHQIACWRI >gi|222441878|gb|ACEP01000064.1| GENE 7 7214 - 8404 947 396 aa, chain - ## HITS:1 COG:SA0247 KEGG:ns NR:ns ## COG: SA0247 COG1887 # Protein_GI_number: 15925960 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Staphylococcus aureus N315 # 12 390 173 556 562 189 34.0 6e-48 MSFFKKYKKKFVAKLRQLRKYYRYKVYFPKKYESYCNQPVQENKVLFLEMRFTTLSNSFQ YLYKKLEESGEYDLKCSYVQFNFIRGREFTKRVDDMLQELATAKYVFIDDASLILSSIPL RKETIAINLWHACGAFKKFGRSTAELKFGSSAATLDKYPNYENLTHVTVSSPEVIWAYEE AMHLPKGIVKATGVSRTDLFYDSEFVESRRQKLYEIMPEAKDKKVILYAPTFRGHVATAK SPDKIDFERFYQELGDEYVIVCKHHPFVKKPPVIPEELQHFARDLTKDLSIEDLLCSADI CISDYSSLVFEYSLFEKPMIFYAYDYDNYCDWRGFYYDYSEFTPGPVVQTEDELLDSIKN IDTQFDKQRVIDFKEKFMGSCDGHATERIIELMKES >gi|222441878|gb|ACEP01000064.1| GENE 8 8457 - 9197 151 246 aa, chain - ## HITS:1 COG:no KEGG:Bcer98_0953 NR:ns ## KEGG: Bcer98_0953 # Name: not_defined # Def: hypothetical protein # Organism: 
B.cereus_NVH # Pathway: not_defined # 1 246 14 270 277 68 28.0 2e-10 MITYFMQLNVKLLFVLILFGFIMGQTRHAAIQNFSNKNRSYLILDFFNLVGTPVHEIGHL LFGLIFGYKIDQICLYRTTKKALYSGGTLGFVKMHHENRSFLQKLQGDFGQFFISIGPLL FGPCLIYIISLFLPENVRTLPLSFQQNWTIFFKALQQLQASDIIFLFIFLYIIMGISLNM ELSRQDLHLACKGLLFLEVFFLILSGLSVLFHWNLQYGIDILFRWNLLISSIGIISALIA NLISLI >gi|222441878|gb|ACEP01000064.1| GENE 9 9384 - 10361 1148 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027016|ref|ZP_03716208.1| ## NR: gi|225027016|ref|ZP_03716208.1| hypothetical protein EUBHAL_01272 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01272 [Eubacterium hallii DSM 3353] # 1 325 1 325 325 545 100.0 1e-153 MSLMIGAVVIVIIVAGAAFFMSGKSGAKNTAQSETQEKKGSEGKETAVVQEKAESADKSQ SAEAREQKAKTESVLENDKEKGTGENTEKIISENQKVKNEEKVDFKEKNRQKEQLYYYIN HELQTLFATYNKGEWNRLEKLGDMSSALYKEKESIIEGLPSFTGTMVKEYFSCIDFKEEE EKAVCYVKDKEQLKKAFLQFMMPFYPYYYKQLSEGGLRHTALLQPNTLRLFQQLTGKKFR PGYRNRYTTGVRAFEWDKNRYKVYGKDGKLLCDAIFGTDGIIEGYGQKKYTDETHPDWNI VEEGNWENGTFEGGILRYEYKRPVE >gi|222441878|gb|ACEP01000064.1| GENE 10 10650 - 11609 1573 319 aa, chain + ## HITS:1 COG:BH3823 KEGG:ns NR:ns ## COG: BH3823 COG0280 # Protein_GI_number: 15616385 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Bacillus halodurans # 1 316 4 319 330 333 56.0 3e-91 MFKKLAEQLKANARTIVFTEGTDARILEAADKLTTEGILGVILLGKADEVKSAAKEAGFN IEKCQIIDPEQYAEIDAMVDEMVTLRKGKMTPEQVREALSKSNYFGTMLVKMGKADCLLG GATYSTADTVRPALQLIKTKPGNKIVSSCFIMVKGDKKLAMGDCAINIAPNEDELAEIAI ETAKTAKIFGIDPKVAMLSYSTYGSGKGEMVTLMANATKKVKEMAPDLDIDGELQFDAAV APEVAKVKCAGSTVAGQANTFIFPEISAGNIGYKIAARLGEYEAIGPVLQGLNAPINDLS RGCNAQEVYKMALVTAALS >gi|222441878|gb|ACEP01000064.1| GENE 11 11913 - 13010 1599 365 aa, chain + ## HITS:1 COG:STM4108 KEGG:ns NR:ns ## COG: STM4108 COG0371 # Protein_GI_number: 16767374 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Salmonella typhimurium LT2 # 1 362 1 362 367 360 53.0 3e-99 
MANIIISPSKYVQGKGELANIAQYATKLGKKPFILISEGGKKRVGGIIEDSFKDTGCELV FEAFHGECSKNEINRLVAIVKEKGCDLVIGVGGGKIFDTAKAVAYYVESPVIVCPTIAAT DAPCSALTVVYTDDGVFEEYLWLPANPNLVLVDTDIIAKAPARLLVSGMGDALATYFEAK ACQASNATSCAGGHITIAAITLAKVCYQTLIEEGVKAKLAVEAGVCTTAVEKVIEANTLL SGMGFESGGLAGAHAIHNGLTCIKDCHHLYHGEKVAFGTLTQLVLENADEDLLDEVITFC MDVGLPTTFADLGIEHPDKDALMEAATLAASPDDTLGNEPLEVTPDKVYAAMVGADALGR YYKGL >gi|222441878|gb|ACEP01000064.1| GENE 12 13125 - 16574 2552 1149 aa, chain - ## HITS:1 COG:CAC2736 KEGG:ns NR:ns ## COG: CAC2736 COG0419 # Protein_GI_number: 15895993 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Clostridium acetobutylicum # 1 1149 1 1163 1163 390 29.0 1e-107 MRPLQLKLKGINSYREEQVIDFETLTSQGLFGIFGPTGSGKSTILDAMTLALYAKLPRDT KNFINTNETTASVSFLFSITTDRTRKYLAERSFRYHNNTYTSTVRNTTGRLIVCDGENET VLADKPTEVTAECIRLLGLTSEDFMRTVVLPQGQFSEFLHLKNKERRIMLQRIFHLEKYG IELTSKIGRAKQKQELLLSKLDGEKKNFEEISSVQLSKLQKQEKSDTKQHSRLSKKLAKL EKSFQKTEELYNTQQELLPLKKAQKEKELLLPSIKEKENNLNITKKANQLRPFSIQVSNA QSAYDQACEKLETVQKETACLQSHFEKTEKEYHAAASAQKEELPLLQAREQVLHTAQSAA DTIHTWEQTKGNAQKELQEILASLAPLKETQEQHFKKERQIHSEIVESEKSIEADRISPE RKQLLEKGNVMEETYREKRENYNTNANALSTLNTEIKEEHDAFSIQQKSLKNIYTQLCHR RSALNEQHSAAKKELGIIKGKEEQAKQKQKSWQQNHMAQLLREELEEGALCPVCGTPYKE HVSQSHIAENVKTSSETISDVTTKRDTEAFSKNDSETIEKKEAELSEISFKNLEKGLKKL EKQQQTQQNLIQQLEQELFSLDTHISNLKDALEITDEDTIEDNETDRHFTINENISQASP VIVEKPIDYFSISKQISTYLQAKGDLRRQKEQCERQAATLKEQHSALQNYAEEILTLRKD NNIENFSNTLETLRQKEQERQQKEITIKEQRQHLEKLRSDIKVLSAQIAGLENKSSSLTA NIENCEKVIAEQTEKFPKTLSINMDFVSEITKNQQRQQEINTALKHTEKDYQQSLKELQK KNKELTSCQSTSRSSETHLKQVQELYLTQKKELGFSKTDQPELYYMEEKELTKLEKDITS YYDDVRQNEERIRYLQEKLGKKEIEESEWIRQKEELADIKQQIKELDSSILLLTHQIETM TKDLAKKEELEQKWETESHKKDMIKELEQLFKGNAFIEYVAQSKLSYIAREASVILSKIS GGNYALEINDSAEFIIRDNKNGGAIRPSDTLSGGELFITSLSLALALSSSIQLNGAAPLE FFFLDEGFGSLDDDLLDTVMNSLERLQSKKRSIGIISHVETIQARVPLKLLVTPSDLSQK GSQIRLEYS >gi|222441878|gb|ACEP01000064.1| GENE 13 16576 - 17775 1026 399 aa, chain - ## HITS:1 COG:CAC2737 KEGG:ns NR:ns ## COG: 
CAC2737 COG0420 # Protein_GI_number: 15895994 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Clostridium acetobutylicum # 1 396 1 396 408 447 53.0 1e-125 MKLIHTADWHLGKNIEGHSRLEEQRLFLKDFINICEEEQADIIIIAGDIYDNYNPSAMAE QLFYDTLKQLSRNGDCMTVVIAGNHDNPERLVASGPLARDHGIVMAGTPNSVITPGMYGK NEITESDAGYIHAIIHGEEADILLVPFPSEKRLNEVYLDETEEETKKAASYSEKMSSLFG SLATHFHEDSIHLIASHLFVMNSVEDGSERSIQLGGSYMVGGEIFPENADYIALGHVHKP QKVPGCPKARYSGSPLPFNVRETSFDKQVIAVTLTAGSPCTIRELPLPVYKPIEVWHCEN IDDAVEQCEANADRECWVYLEVKTDHYIHEEDIKKMKAIKADILSITPVLPEDESEDFSA SDLKEQPFDELVKNFYRKKFHVDIGEETFSLLMSIIEEN >gi|222441878|gb|ACEP01000064.1| GENE 14 18144 - 19091 831 315 aa, chain + ## HITS:1 COG:CC3033 KEGG:ns NR:ns ## COG: CC3033 COG0697 # Protein_GI_number: 16127263 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Caulobacter vibrioides # 15 299 17 301 310 236 47.0 4e-62 MKEQWMQKTVVVWFLAMICCLLWGSAFPCIKIGYKMFHIASADASSQLLFAGCRFFLAGV LVLLLGTSGNFKLKKTEIPMAMTLAVFQTILQYVFFYMGLAHTTGVKSSIINGANSFLVI IIASLIFHQEKLTGPKILGCIIGFAGVVLVNLGGAGLDMSFHWNGEFFILISTVSAACSS AFIKKFSTKANPVLLSGWQFMMGGTVLIIIGKLMGGKLQFEGIAPMFLLLYMACISAVAY TLWSLLLKYNPPSRIAVFGFMNPVFGVMLSALLLKEKGQAFSLTGLTSLILVCIGIYIVN RENPLKEASTGKISV >gi|222441878|gb|ACEP01000064.1| GENE 15 19039 - 19599 554 186 aa, chain - ## HITS:1 COG:MJ0240 KEGG:ns NR:ns ## COG: MJ0240 COG1437 # Protein_GI_number: 15668415 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate cyclase, class 2 (thermophilic) # Organism: Methanococcus jannaschii # 1 178 1 172 175 108 38.0 7e-24 MIEVEIKLPLFRRSITERALTDCGFKAGNLVKESDFYFTSDFHDFMKTDEALRIRTCENF TTRETTSFLTYKGAKLDTVSTTRKELETRIEDADIAREILISLGYKKLYPVTKLRQYYHK GRMTACVDQVESLGSFLELEILVDSPEQKESALQSIEALLLDMGSSLKETTRKSYLSMLL SKGSPD >gi|222441878|gb|ACEP01000064.1| GENE 16 19612 - 20802 880 396 aa, chain - ## HITS:1 COG:no 
KEGG:no NR:gi|225027023|ref|ZP_03716215.1| ## NR: gi|225027023|ref|ZP_03716215.1| hypothetical protein EUBHAL_01279 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01279 [Eubacterium hallii DSM 3353] # 1 396 1 396 396 657 100.0 0 MKKNFLLLFTLLLLALVGTGCGNKVSGSLGKDGAELVGTWNGVGNALGRDDSYSCDYLSF SIDKEGTFTLKDIAQNKTCLSGSLSAKSEKFSLDTREETTAELPSGWDGLGSGSTLSYDM PTKNKLVLTYDNVSYLFEKQNDSSKQTSKNSSASPLLNLAENDVWYSNDGSNEDKTTYEL SLYDNYMELYSIDPKAATEGEATFLTNFFYLSSRDNEFTFYTYKEASMKLPDFLKDLPDG FSKSTITMKAEDDSITLSYNGKSLSFYNNVIYGLKSDSDSYRLCDTCFNWKFDSSDHFSY FTMNPDTNTLYLYVTDGAEDADEPKYVAGEVHINEKKHTITYHFDKDLSEKSIGTDSNLY KIFEKMDKTKLSYTLKDKQLTLKLDKTTYKLTLDEY >gi|222441878|gb|ACEP01000064.1| GENE 17 21044 - 21757 888 237 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1566 NR:ns ## KEGG: EUBREC_1566 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 237 50 286 286 390 77.0 1e-107 MKMISIEKELKENSYPGRGIIVGRSEDGTKAVIAYFIMGRSSNSRNRIFVEDGEGIRTQA FDPSKLEDPSLIIYAPVRVLGNKTIVTNGDQTDTIYDGMDKQLTFEQSLRSREFEPDGPN YTPRISAVLHLEKGEFNYAMSILKSNNGNSESCNRYTFAYNKVPAGEGHFISTYMHDGNP LPSFEGEPKWMAIPDDMDAFAELLWSSLNEDNKVSLFVRYIDIATGKYESKIINKNK >gi|222441878|gb|ACEP01000064.1| GENE 18 21955 - 23130 1662 391 aa, chain + ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 391 5 391 391 527 64.0 1e-149 MKELELKYGCNPNQKPSRIYMEEGELPIQVLNGKPGYINFLDAFNGWQLVSELKKATGLP AATSFKHVSPAGAAVGLPLSDTLAKIYWVDDLGELSPLACAYARARGADRMSSYGDFIAL SDVCDVSTANMIKREVSDGVIAPGYEPEALEILKSKKRGNYNVIQIDPAYVPAPIEKKQV FGITFEQGRNELNIDDKFFDNIVTENKELTEEAKRDLAISMITLKYTQSNSVCYVKDGQA IGIGAGQQSRIHCTRLAGQKADNWWLRQSPQVMGLQFKDEIRRADRDNAIDIYMGDEYMD VLADGVWENTFKVKPEVFTREEKRAWLDKLTDVALGSDAFFPFGDNIERAHKSGVKYIAQ PGGSIRDDNVIDTCNKYGIEMSFTGIRLFHH >gi|222441878|gb|ACEP01000064.1| GENE 19 23600 - 23812 364 70 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027026|ref|ZP_03716218.1| ## NR: gi|225027026|ref|ZP_03716218.1| hypothetical protein EUBHAL_01282 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01282 [Eubacterium hallii DSM 3353] # 1 70 1 70 70 118 100.0 1e-25 MKSTYNVEFAGNQIESKEVIARAKKVWVDAGNKNRKVKDLKTMDLYLKPEENAVYYVFNE EESGSFPLYE >gi|222441878|gb|ACEP01000064.1| GENE 20 24129 - 24734 761 201 aa, chain + ## HITS:1 COG:MJ0506 KEGG:ns NR:ns ## COG: MJ0506 COG0118 # Protein_GI_number: 15668683 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Methanococcus jannaschii # 1 201 3 197 198 209 53.0 4e-54 MVAIIDYDAGNIKSVEKALLHLGEEVMITRDREKILNSDKVILPGVGAFGDAMEKLRSYG LDKVIYEVVEKKIPFLGICLGLQLLFEKSDETPGVKGLGILPGEVLRIPDKEGLKIPHMG WNSIKIKKGARLFKDIPEDSYVYFVHSYYLKAGREEDVTATTEYSALIHASVEHDNVFAC QFHPEKSSEIGLKILKNFVEL >gi|222441878|gb|ACEP01000064.1| GENE 21 24745 - 25503 892 252 aa, chain + ## HITS:1 COG:CAC0941 KEGG:ns NR:ns ## COG: CAC0941 COG0107 # Protein_GI_number: 15894228 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Clostridium acetobutylicum # 1 252 1 253 253 325 67.0 4e-89 MFTKRIIPCLDVNDGRVVKGVNFVNLIDAGDPVAIAEAYDKAGADELVFLDITASSDARN TVADMVRKVAEKVFIPFTVGGGIRTVDDFKAILREGADKVSVNSAAIMRPELISEAADKF GSQCVVVAIDAKRREDGGWNIFKNGGRVDMGIDAVEWAMRVEKLGAGEILLTSMDCDGTK AGYDLELTRTISENVSIPVIASGGAGTKEHFYDAFDQGKADAALAASLFHFKELEIMDLK RYLREKGISVRI >gi|222441878|gb|ACEP01000064.1| GENE 22 25708 - 26667 625 319 aa, chain + ## HITS:1 COG:lin0027 KEGG:ns NR:ns ## COG: lin0027 COG1619 # Protein_GI_number: 16799106 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Listeria innocua # 3 316 2 280 291 211 41.0 1e-54 MKLAKGDKVAIVCCSNGQSLSKKKEIENLKVILEEMGLISVFSKYIYQKISLEEVENNLK GFPESGSPLEKARELMEFYEDASIKAIFDISGGDLANTILPYLDYERIASVKKQFCGYSD LTTIINAIYAKTQKVSVLYQIRNLLYEYSAQQKEDFIKTCFAKDLNNIINNDAVKMMEKN 
NLYDLSVENSGLYQFPYKFYQGKTVEGIVVGGNIRCFLKLAGTEYFPDVTDKILLLEARS GNPAQIVTYFAQLEQLGVFKKVKGILLGTFTQMEREQPVSAVYSLLKRYINEELPVVKTE YIGHGEDSKAIQIGKKYKF >gi|222441878|gb|ACEP01000064.1| GENE 23 26870 - 27046 276 58 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880748|ref|YP_001559716.1| ribosomal protein S21 [Clostridium phytofermentans ISDg] # 1 58 1 58 58 110 93 4e-23 MSSIIVKENESLDSALRRFKRSCAKAGIQQEIRKREHYEKPSVKRKKKSEAARKRKYK >gi|222441878|gb|ACEP01000064.1| GENE 24 27567 - 28565 1536 332 aa, chain + ## HITS:1 COG:FN1840 KEGG:ns NR:ns ## COG: FN1840 COG2376 # Protein_GI_number: 19705145 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 1 328 5 332 332 364 57.0 1e-100 MKKIINAVDQVENQMIEGMVKAFPQYVKKLDCGNVVVRAEKKTGKVALISGGGSGHEPAH GGFVGKGMLDAAVAGAVFTSPTPDQIYEGIKAVDDGKGVLMVVKNYTGDVLNFEMAAEMA ADMDGIEVKEVVVNDDVAVQDSLYTTGRRGVAGTVFVHKIAGALAEKGASLDEVQAVAQK VIDNVRTMGMAIKPCTVPAAGEPGFELADDEMEIGIGIHGEPGVEKKPLETADEIVDTLL EKILADIDYSGSEVAVMINGCGSTPLMELFIVNNRVADVLAEKGIKVYKTMVGEYMTSLE MEGFSITLLRLDDQMKELLDEEADTPALTVAK >gi|222441878|gb|ACEP01000064.1| GENE 25 28707 - 29330 1125 207 aa, chain + ## HITS:1 COG:FN1841 KEGG:ns NR:ns ## COG: FN1841 COG2376 # Protein_GI_number: 19705146 # Func_class: G Carbohydrate transport and metabolism # Function: Dihydroxyacetone kinase # Organism: Fusobacterium nucleatum # 8 205 3 200 202 175 48.0 6e-44 MADVKKVIEIIDKIIADVDEQKLFLTELDNVIGDGDHGINLARGFDAVKTIEDTFESKDI GAILKAIGMKLVSTVGGASGPLYGTAFMKSGALLNGKTEMDMNDFIEMLQVSIDGIMKRG KAVKGEKTMLDAMIPAHDAIKASYEADGDAKKALDAGVKAAEEGIEYTKTIIATKGRASY LGERSIGHQDPGATSFTLMLKAVQELA >gi|222441878|gb|ACEP01000064.1| GENE 26 29349 - 29729 664 126 aa, chain + ## HITS:1 COG:BH3395 KEGG:ns NR:ns ## COG: BH3395 COG3412 # Protein_GI_number: 15615957 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 3 118 7 120 128 77 43.0 8e-15 MVGIVVVSHSPKLAEGVVELASLMAPETPMKAAGGMDDGGFGTSFEKITEAIDEVYSDDG 
VAIFMDLGSAVMTTEMVLESMEDRKIKMVDCPVTEGAIAAAVVAAGGASLEEVVQVAQAS KEQAKF >gi|222441878|gb|ACEP01000064.1| GENE 27 29962 - 30177 202 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027035|ref|ZP_03716227.1| ## NR: gi|225027035|ref|ZP_03716227.1| hypothetical protein EUBHAL_01291 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01291 [Eubacterium hallii DSM 3353] # 1 71 1 71 71 138 100.0 2e-31 MLERKSPFVCSVNGAKEISIENYGSLGDFSDKKIVLNCYKCKMIIEGEGLLIEYFTDVDM KITGHIHSIHF >gi|222441878|gb|ACEP01000064.1| GENE 28 30190 - 31443 798 417 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2612 NR:ns ## KEGG: Cphy_2612 # Name: not_defined # Def: putative stage IV sporulation YqfD # Organism: C.phytofermentans # Pathway: not_defined # 16 390 1 386 412 226 33.0 2e-57 MLFSLFHFLSGYCRIVLSGMNQERFLNLCAAKQILLWNIKREGKHYIFYISRGGYKELQE ISEKTGSTFRCIQKKGIPYLFCRYKKRKCFALALLFCTAIFIVMSCFIWQVQVTGSYGHS EEELLDYLEKKGVHSGTKISGFSCARLEEQIRKDFEDIAWVSCERKGTLLRVRVKETLDK RDQKEGEMQESPCNLIASKKGRIASILVRSGSAKVKKGDKVKPGDVLISGIVEIKDDEGE IAEETLVRAQGDVYAISKIKYEDCFPLIYYKKNYTGKTEKSYRFLINGYTIKMPERKTAY AEYDEQSREILLHIGPQFYLPVSLFVITKAQCQISKESYSVSQAKKEARKRLFLFLKEYE RKGVVILKNNVRIDSDKNECTARGIIKIKERVGKVQEIPSTLDGEKDIKQEPAVLNR >gi|222441878|gb|ACEP01000064.1| GENE 29 31523 - 32533 972 336 aa, chain + ## HITS:1 COG:MT2437 KEGG:ns NR:ns ## COG: MT2437 COG1702 # Protein_GI_number: 15841880 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Mycobacterium tuberculosis CDC1551 # 11 317 28 333 352 319 52.0 6e-87 MNIAEQEFPFPAEHASNVFGQFDANMKKIEKTLHVTIIFRDDKLKLIGGEKDCDRAKHVI EQLLLLSQRGNDITEQNINYTLSLSMTNEESAVTQLDSDVICHTIMGKPIKPKTMGQKKY VDMITDKMIVFGVGPAGTGKTYLAMAKAITAFKANEVNRIILTRPAIEAGEKLGFLPGDL QSKVDPYLRPLYDALYEIMGADTFAKNMEKGLIEVAPLAYMRGRTLDNAFVVLDEAQNTT PAQMKMFLTRIGFGSKAIITGDLTQKDLPSGTRSGLDDALHVLKNIDEIGVCNLTSKDVV RHPLVQKIVTAYDEYDKKKETRQKRKQRTTGKQEKK >gi|222441878|gb|ACEP01000064.1| GENE 30 32530 - 33039 672 169 aa, chain + ## HITS:1 COG:CAC1293 KEGG:ns NR:ns ## 
COG: CAC1293 COG0319 # Protein_GI_number: 15894575 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Clostridium acetobutylicum # 16 160 16 161 166 119 47.0 2e-27 MTFEIENVYPEVILPDYETVIKNVIEAALDYEKCPYEAQVYVLLTDNEEIHEINKEHRQI DRPTDVLSFPMADYPVPGDFSDIEERDPDAFHPETGELILGDIIISMDKVKEQAKAYGHS NTRELAFLVAHSMLHLMGYDHMVDEERKVMEAKQEEILKNCGYIREMEA >gi|222441878|gb|ACEP01000064.1| GENE 31 33225 - 34652 1246 475 aa, chain + ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 30 447 4 405 405 238 34.0 2e-62 MAKQIQKSTVEKDRILPEKLIQDSTVWDTIIVGAGAAGMTAALACGKEKQKVLLLDRQNQ MGRKILVTGNGKCNLTNFAQKPKYYRSDCPEKVQKVLSVFGQKEAMELFQSLGIYTKDKN GYVYPYSEQAASVREAFETEILSNPSITFVPFSEVYNVQKKEDVFLIKAKEQLGQSEQST GKQGNGEKIEKVYRCQSLLVTTGGFAGPKFGCDGSGYAFAKVFGHEIIKPLPALTALKSS APFLKKVSGVRNQAEITLEIDGEPVKKETGELQWTDYGISGVAIFQLSRFAITALEEKKK VALYFDFMPELSSKEKLRLFLMLAQNHKKRDMLSFVKGLFPAKLCPVILREAGLREETLA GTLKEEEWNALVMAVSHFPLRINGYMGYEKAQVTRGGISLTELTGNLESDFQPGLFFAGE VADVDGTCGGYNLQWAFSSGTVAGRAIVKRNQELFKDDRHTMEENDGISERNRRR >gi|222441878|gb|ACEP01000064.1| GENE 32 34649 - 36334 1342 561 aa, chain + ## HITS:1 COG:CAC3077 KEGG:ns NR:ns ## COG: CAC3077 COG2509 # Protein_GI_number: 15896328 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 33 556 35 537 540 473 48.0 1e-133 MIRISQLKLKVGHTREELFEEIIHQAHGKRPVSWRIVRKSVDARKKPQLFYVYTIDAEFE NEKKLLSAKKSKWTRSTVVKYRFPYNKKDGSGASSTEVSTIDRAVIVGMGPAGLFAALVL ARAGFAPIVFERGDCVEKRSEIVEHFFESGELDEESNVQFGEGGAGTFSDGKLNTLVKDK FGRNHFVLKEFVKHGAPEEILYEAKPHIGTDILKDVVASIRKEIESLGGEVHFRTKVCDI LCEKISGSVAERERAEAEKKLLMQDKQLTGLILEKEGVQAEYPCRNVIFAIGHSARDTFY MLHERELSMNPKAFAIGVRVEHLAHLINESQYGEGYPEEVPTASYKLTHQCKGTGRGIYS FCMCPGGYVVNSSSEKGRLCVNGMSYHNRAGHNSNTALITTVTPDDFPSDSPLAGLEFQR 
KYEELAYKIGNGKIPVQLFGDFLKNQMSTKLGNVTPSIKGEWQFANLHECLPDYVCASLE EGMRAFGHKIKGYDAEDTVFSGVETRTSSPLRMERNKQFESNIQGLFPCGEGAGYAGGIT SAAMDGIKTAEAIAEKILTQN >gi|222441878|gb|ACEP01000064.1| GENE 33 36509 - 36733 304 74 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2602 NR:ns ## KEGG: Cphy_2602 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 66 1 66 69 65 59.0 6e-10 MNNIDIDRINELYHKSKSVGLTKAEEAEQKKLRKAYLAAIRKNLRGSLNRIDIQNKDGSI ENLGEKYGKKLDKN >gi|222441878|gb|ACEP01000064.1| GENE 34 36735 - 37283 478 182 aa, chain + ## HITS:1 COG:BH1417 KEGG:ns NR:ns ## COG: BH1417 COG0212 # Protein_GI_number: 15613980 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Bacillus halodurans # 6 181 1 179 186 108 37.0 5e-24 MPEEALELKKNLRCDMKARRNALTDEEKKHAAANCLSKLKELPEFMNADWIYAYIACRNE LETADIISWCLSYGKHVAVPKVQGEIMHFYEITALSDCVPGAFGILEPAGEEKDRITTPG FMLVPGLAFDKNGNRLGYGGGFYDKYLASHEEIITAALGYDFQIVEKVPSESHDRRMDYL IY >gi|222441878|gb|ACEP01000064.1| GENE 35 37511 - 38122 435 203 aa, chain - ## HITS:1 COG:SSO2506 KEGG:ns NR:ns ## COG: SSO2506 COG1309 # Protein_GI_number: 15899244 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Sulfolobus solfataricus # 3 113 9 106 198 60 35.0 2e-09 MGKVETNKKQKNTTLLQTSFELFTEKGFTKTTISDIVNRAGLAKGTFYLYFKDKYDLRDK LVTYKTSQLFGNAHTALIEQNIQGFENQMFFVIDHIIDRLEKEPKLLQFISKNLSWGIFK NAFEKTVPENSRQFYDYYQDMLTQNHIFCENQELMLFTILELVGSTCYSCILYQQPVSME EYLPYLHQTIRQIFITFTKPMAD >gi|222441878|gb|ACEP01000064.1| GENE 36 38324 - 40432 2115 702 aa, chain + ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 9 692 9 686 687 310 30.0 9e-84 MLIKMGKQIAKHKFLILLIGFLLLIPSLIGMVTTRVNYDLLSYLPETLETVKGQDIMVDE YGMGAFSMVVVENMELKDVQKLEKKFEAVDHVKDVLWYDDVADLNLPVQMIPKKLREGFF 
KGNATMMIVLFDNTTSSDESMNAVAEMRKLANKQCFISGMSGVVNDIKDVALDELPVYVV IAAVLSFLVLEFTTESFLVPVFFLLSIGIAIIYNLGSNVFLGQVSYITKALTAVLQLGVT MDYSIFLLNSYEEYKEKYTEDRNRAMGHAIASTFKAVVGSSVTTVAGFAALCFMTFALGR DLGIVMAKGVIIGVICCVTILPSLILAFDKQIEKTKHRPLIKSVDGASNFIIKHHWVWLI VFLVLLLPAVHGNNNIKIYYDIAKSMPSSLPSNVATDRLKKDFNMSTIHMVMMDSNMDSK DKRDMLKEIDNVEGVKWTISMSSLVGASVPESMIPKDVKSMLQSGDHELAFICSEYESAT PKVNKQIAKINKITKNYDKTSMVIGEAPLMKDLEDVTNIDIQNVNIASIAAIFVIILIIF KSISLPVILVAIIEFAIAVNMAVPYYQGVSLPFVASIVIGTIQLGATVDYAIVYTNRYLK ERRNGKNKLEAISIAHKSNMLSIITSALSLFAATYGVACYSQVDMIGSICTLLARGAIIS MIVVLTLLPTMFLIFDKAICKTTKGMGHIDDHKDTVVNDIVL >gi|222441878|gb|ACEP01000064.1| GENE 37 40521 - 43001 3276 826 aa, chain + ## HITS:1 COG:lin2460 KEGG:ns NR:ns ## COG: lin2460 COG1511 # Protein_GI_number: 16801522 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 236 775 146 655 927 138 24.0 5e-32 MRNKRFVKHTSRAFAILMSAMMAVGIAMPVNAAGKKDSKKEDKTETVYVNANADGSVDKI TVSDWLKNHSSSDTLEDFSNLENIKNVKGDETFTQNADGSIVWDSKGNDIYYQGESNEEL PVTMKVTYYLDGQQMDPKDMAGKSGEVKIRFDYYNNSNETVKVKGKNYNIKTPFTMVTGM ILSSDVFSNIEVTNGKVISDGDKNVVVGLAFPGLKSSLNLASYDKLDDVDIPEYVEVTAK ADKFELALTATAATTGTLKDIDTSNLNDVDDLKDSMDKLTDATDKLIDGSEALSDGMGTL SSSAGEYTKGVSSANAGVKQLLSGLNTLSSKSGQLESGVESLNKGLKALKKGTKSLNAGI SSYTKGVSGLDAGLQSAASGTGKLESGAGALSKGIKSYTENEAKIAAGIKKLNNKVSGMS NLELPSDTELKAVKKASSALESDAKKLQESATEIQSAIDKMNKLSEAVEAHNSKIDEHNK EVSEKFSSAKSALDDIDKKATDKANEKIASQKDSLSSSATSKAKEEAKSAIDSVNMDGLT DEMKSALKSAIDNNVSVSAEPSSIEISGLASDAEKALGDAPSAEKLNIGKVNVGLSDMET LLKDMKTQAAVLESFAANTGSLSDAASQIPDLIKGVDELNTGAQALTANNKTLTSGMKDL TSGLSTLSTGLDTMTKGAATLTGNNSVLTKGASSVDKGTGKLVAGSSQLVTGVKAYAQGV NAAAIGVQSLSSGMNKLDSAGGQLTSGIDKLATGSDTLTKGLKTFNDDGISKLSDLAGDD LDSVINHFKAVKKADNRYKSFGGIKKNAKGSVKFVIETDPIEADEN >gi|222441878|gb|ACEP01000064.1| GENE 38 43159 - 43620 697 153 aa, chain + ## HITS:1 COG:BS_ylxS KEGG:ns NR:ns ## COG: BS_ylxS COG0779 # Protein_GI_number: 16078722 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # 
Organism: Bacillus subtilis # 5 153 4 156 156 122 45.0 3e-28 MKRTEIVAKTEELVTPIIEENHFELVDVEYVREGANWYLRVYADKEGGINIDDCVLISRA LEEKLDAEDFISDAYILEVSSPGLGRPLKKEKDFKRSLGKSVECKLYKAINKQKEFEGIL KDFTEDTVTIETEGNELVLDRKDIAMIRLAIDF >gi|222441878|gb|ACEP01000064.1| GENE 39 43658 - 44902 761 414 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA [Brucella melitensis 16M] # 3 408 9 425 537 297 39 2e-79 MSELIDALNQLQKEKHIEKEVIMQAIEDSLVAACNRDFGKNAIVKVNMDRDTGDISVYVE KTVVEEVEDPATEISLADAKMIRFHYEIGDVVNIEVTPKNFGRIAAQHARSVIVQKIKEE ERRVVFEHFAHKEKDIVTGVVQRYNGKNISVSLDDKTDAVLTEKEQVKGEVFKPTDRIKL YIVEVKDTNKGPKILVSRTHPELVKRLFEKEVAEVYDGTVEIKSIAREAGSRTKIAVASN DPNVDPVGACVGINGARVNAIVSDLRGEKIDIIEWNDDPAVLIENALSPSNVVSVDVDEE EKSARVVVPDYQLSLAIGKEGQNARLAAKLTGYKIDIKSESQAAELAEELDFGEDFGDLD EAEDFDLEAPVEEPEVSASEATADFEAEEVTEEPETAEDSFEEVSSEEADTDEQ >gi|222441878|gb|ACEP01000064.1| GENE 40 44925 - 45200 160 91 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|78222795|ref|YP_384542.1| 50S ribosomal protein L7AE [Geobacter metallireducens GS-15] # 1 84 1 92 205 66 38 3e-16 MGKKIPLRQCVGCREMKTKKEMIRVIKTPENEVVLDAKGKQNGRGAYLCFSAECLQKARR SKGLERSLKISIPDEIYDRLEEELKELDTKQ >gi|222441878|gb|ACEP01000064.1| GENE 41 45184 - 45504 297 106 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239626258|ref|ZP_04669289.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 [Clostridiales bacterium 1_7_47_FAA] # 1 94 1 94 105 119 58 1e-25 MTQNSKVLSMLGLAAKAGKVASGEFSTEKEVKSGNACLVIVAEDASDNTKKLFRNMCKYY EVPMEIFATKEELGHWIGKAYRASICILDEGFAKSIVKKISLNMEG >gi|222441878|gb|ACEP01000064.1| GENE 42 45508 - 48015 3383 835 aa, chain + ## HITS:1 COG:BS_infB KEGG:ns NR:ns ## COG: BS_infB COG0532 # Protein_GI_number: 16078726 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus subtilis # 247 835 128 715 716 677 63.0 0 MPKMRVHELAKLLDKSSKEIIAELKQQGVEVQSHMSTVSDEHVDKLKAKFQKKESGNNRG NNESRSNDNRGSRNDNRGDRNNNRNDNRGDGRNNNRNNDNRNGRNENRGDGRNNNRNDNR 
GGRNNRNDNRNDNRGGGRNNNRNNDNRGGRNDNRGDNRNNNRNGNDRRNRPSMGDTPINA KETRNGKENRHDINREHSRDNKENNRNDNRGGKNNRRNNRNNDFGSSRKQKNNKKQQAPV QPPKQEEPKEDVIRNITLPDPVSLKDLADACKVQPSVIIKKLFMAGQMVTINTEFSFDDA AEIALEYNCIAEEEVKVDVIEELLKEEEEDERNLTSRPPVVCVMGHVDHGKTSLLDAIRS THVTAGEAGGITQHIGAYVVEINGEKITFLDTPGHEAFTSMRLRGAQATDIAVLVVAADD GVMPQTIEAINHAKAAEVEVIVAVNKIDKEGANVDRVKQELTEYGLIAEDWGGSTVFVPV SAKTGEGIDELLEMITLTAEVLELKANAKRRARGLVIEAKLDKGRGPVATILIQKGTLHV GDSINVGHAFGRVRAMMDDKGNRVKVAGPSTPVEILGLNDVPFSGEVLMAHGNEKEARAT ADAFIAYGREKMLEETKNKLSLDDLFDQIQAGNVKELNLVVKADVQGSVEAVKQSLMKLS NEEVMIKVVHGGVGAINESDVILASASNAIIIGFNVRPDAMAKAAAEREKVDMRLYRVIY NAIEDIEAAIKGMLDPVYREKVIGHVEVRQIYKASGVGTIAGCYVLDGTITKDAQSRIVR DGIVIYEGELASLKRFKDDVKEVKAGFECGIVFEKYNDIKEGDQIEAFVMEEIPR >gi|222441878|gb|ACEP01000064.1| GENE 43 48030 - 48416 518 128 aa, chain + ## HITS:1 COG:BH2411 KEGG:ns NR:ns ## COG: BH2411 COG0858 # Protein_GI_number: 15614974 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus halodurans # 4 115 2 112 116 91 46.0 3e-19 MRKNSVKNTRINSEVLKELSRIISREIKDPRIAPMTSVVSVEVAPDLKTCKAYISVLGDK EAQESTLEGLRSAHGYIRRELARSINLRNTPDVRFILDQSIEYGVSMSKKIDEINAAMHE REREASEE >gi|222441878|gb|ACEP01000064.1| GENE 44 48413 - 49450 219 345 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149007035|ref|ZP_01830704.1| 50S ribosomal protein L31 type B [Streptococcus pneumoniae SP18-BS74] # 16 307 5 289 311 89 25 1e-16 MSTDKRENVTFAERKEEFERQVQTAQTIAISGHMNPDGDCIGSCLGLYTYIVEQYPEKEV TLFLEPVQDKFSFLKYADQITQEKKIDGPYDLFFSLDCSDTDRLHDFKGYFEEAKYKICV DHHFTNEGFGDLRFIVPDASSTCEVLFDMFDWDKISLECAQDLYMGIVHDTGVFKHTNTT RKTMETAGALLEKGIHSEDIIDRTFYKKSYVQNQILGRALMESIILLHGKVIFSIVRKRD FDFYGIDSSDLDGIIDQLRVTDGIECAILLYEKEDGNYKVSMRSNDAVDVSAIAKIFGGG GHIKAAGCTVHGQPRDIVMNITHMVEHQLNEQEEKQESIYDKRNS >gi|222441878|gb|ACEP01000064.1| GENE 45 49431 - 50453 931 340 aa, chain + ## HITS:1 COG:CAC1805 KEGG:ns NR:ns ## COG: CAC1805 COG0130 # Protein_GI_number: 15895081 # Func_class: J Translation, ribosomal structure and 
biogenesis # Function: Pseudouridine synthase # Organism: Clostridium acetobutylicum # 2 253 1 248 289 216 48.0 4e-56 MISGILNIKKEPGFTSHDVVAKLRGIVHQKKIGHTGTLDPDATGVLPVCLGKATKLCDII GDWDKTYEAVLLLGRETDTEDTSGQTLKEREVTVTEEEIRDIILSFQGSYDQIPPMYSAK KVNGKKLYELARQGIVIERKPCPVTLSSIIIKKIDLPYVTFEVTCSKGTYIRSLCRDIGE KAGCGGCMAGLVRTRVSSFKIEDGFTLSEVEAMRDADTLLEHIVPVDEVFLHLPAFFVKK EGEKLLYNGNPISLSLCALDENSSISNWQLYEDLSSNEQMKMEEADTEDLGKKEKSGKDG KQYSSYTVGNRFRVYDNQKNFIGIYQLQEKKMLKPYKLFL >gi|222441878|gb|ACEP01000064.1| GENE 46 50519 - 51448 361 309 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 15 305 19 314 317 143 32 5e-33 MEYITSSDFHFKDTAVALGKFEGIHRGHQLLMDEVKKQELHGLRSVVFTFDRPTRLTLTG DTEYKQIYTKKERREILEKRGIDILIEHPFTKEFAALTPDRFIREVLVEKVGAKVIVVGT DFHFGKNRSGSITDLEKLEEECGYHLIVVEKLQLNGKDISSTRIRASLEKGAMEEAKALL GRNYSVSGEILHGNALGRTIQVPTINQQVPSFKLLPPNGVYVSKIHWKDEVYYGITNIGT KPTVNNTSEKTVETNIFDFNKDVYGEEMVVELLHYHRKETRFSSVEALQAQLFQDIEFGK QYVTNVLSH >gi|222441878|gb|ACEP01000064.1| GENE 47 51883 - 52503 657 206 aa, chain + ## HITS:1 COG:no KEGG:bpr_I1473 NR:ns ## KEGG: bpr_I1473 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 199 1 202 210 258 65.0 2e-67 MNTIITIGRQYGSGGKEIGKKLAEYYNIPFYDKELLKVAAKESGICEEMFENFDEKPTTS FLYSLVMDPYALGYNAASFDMPLNQKVFLASFDAIKKVADEGPCIIVGRCADYALKDYDN KLNVFIHAPMAFKKSRIQEQYEVPEAKVKDVAIKTDKQRASYYNYYTSKKWGDLKSYDLC IDSSILGIDGTVELIKQAVALKEKNL >gi|222441878|gb|ACEP01000064.1| GENE 48 52660 - 53382 673 240 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 208 1 208 208 301 66.0 1e-81 MKVLLLNGSPHPNGCTFRALSEVEKTLQQEGIETEIFHIGTKAIRGCMGCGMCRKNKLGR CIFGEDAVNVVIKKMKECDGLIVGSPVHYAGASGGITSFLDRMFYSTGGFAYKPGAAIVS CRRAGSTAALEQLNKYFMMNNMPLVPSAYWNMVHGNTPEEVEQDLEGLMVMRTLGRNMAW 
MLKSIDAGKAHGILTPAAEPKTRTNFIRECAPVKEEAEEKNFAPYFISYNKVNKAYQEED >gi|222441878|gb|ACEP01000064.1| GENE 49 53453 - 54229 1009 258 aa, chain - ## HITS:1 COG:lin1116 KEGG:ns NR:ns ## COG: lin1116 COG4816 # Protein_GI_number: 16800185 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 10 228 41 259 267 168 40.0 7e-42 MDVNNFLQSEATQFVGTAAANTIGLVIGTIHEDIKELLEIPDEYQAISIVSSRTGTVAQA IAIDDAVKAVNAKLLRFEMAIDAGKQCGQGCLFVLGAENISDARRLVEIALEQIDYWAAC IYVNEVGHMESHVTPRAGEILHQIFGTPLGKAFGVIGAAPAGIGIVAVDQCMKAAPVDIV WYGSPSHNLTMMNEFSAGISGDVSAVQKALEAGKEVGCELLRVCGITPISITKVHEVCGT DYVKESYTPTSCLKPKEE >gi|222441878|gb|ACEP01000064.1| GENE 50 54434 - 57994 3693 1186 aa, chain + ## HITS:1 COG:CAC2660 KEGG:ns NR:ns ## COG: CAC2660 COG1038 # Protein_GI_number: 15895918 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Clostridium acetobutylicum # 3 1147 2 1143 1144 1243 54.0 0 MEVKKFKKVLVANRGEIAIRVFRALSELGIGSVAVYSKEDRYALFRTKADEAYPLNPDKG PVDAYLDIPTIIRIAKEKKVDAIHPGYGFLSENPEFAEACRKNDIVFIGPNVEAMNIMGD KISAKRAAEESKVPIIDGSDFIIHDVDTAMKIADEIGYPVMLKASNGGGGRGMRIVQSQE EMEEAFEKSRNESKRVFGEAKLFIEKYLIHPKHIEVQILGDNYGNIVHLYDRDCSLQRRN QKVIEFAPAWSISEETRKKILDSAKRLARRVGYTNAGTMEFLVDEEEHPYFIEMNPRIQV EHTVTEMVTGIDIVICQILIAEGYPLNSPTIHIYSQNEIKCTGYALQARVTTEDPANHFL PDTGKIDLYRSGSGGGIRLDGGTAYTGAVITPHYDSLLVKVISHDRTFENAIHKSVRALK ELRIHGVKTNITFLINVLNHETFHKGKCFTTFIEETPELFELHYKLDRTTKILEFLANKM VNVNPGPKPFLENRKVPVIDTERKLFGTRDEFLRLGAKEFTQKVYRSDKLYVTDTTMRDG QQSLLATRMRTKEVAGAARATNLYLKDAYSIEAWGGATYDTEYRFLKESPWKRLDILRER MPNVMIQMLLRASNVVGYSTYPDNVIKKFIKVSSDHGVDIYRIFDSMNWIENMKLPVEEA LKTGKVVEGAICYTGDILNPKETKYTIDYYCKKARQLERMGCHVIALKDMSSLLKPYAAK ELITALKEEVRVPIHLHTHDTTGNGIATYLMAAEAGADIVDCAIGSMSSLTSQPSMNALV EALRGTKRDTGIDPDGLLVLNEYYEHERKVYKPFESDMDSPNPEIYKYEIPGGQYSNFAA QVKEMGAGNDFNDIKRLYKEADEVLGNIIKVTPTSKVVGDMAIFMLKNDLTKDNILEKGK DLSYPDSIVQYFKGALGQPEGGFPEKLQRIVLKGQEPITVRPGSLLPDVDFEAIGRHLRE 
NYYMESMEQPEVMEQKVLSYALYPKVYEDYCEHFQAYNDVSKLESHVYFYGLRPGEETTI QIEEGNDTLIKFVGKSEPDEKGYRILQFEVNGFLREVKILDKHFEVKADRRLKTDPKNPG HLGATLPGTICDIRIKEGDRVTKNMPLMVIEAMKMETTVISKVNGTVDKIYVKDGEEVNE DTLLVSFIIDMEEQPEEVHPELPEMDFTTLEKNEFKTVDVTEEENK >gi|222441878|gb|ACEP01000064.1| GENE 51 58010 - 59290 974 426 aa, chain + ## HITS:1 COG:BS_ymxG KEGG:ns NR:ns ## COG: BS_ymxG COG0612 # Protein_GI_number: 16078734 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus subtilis # 1 400 1 398 409 94 22.0 6e-19 MEREHITPQGVSVFHDKNENIHSFCLCLYVRAGSLFETKENNGISHFFEHIVFKNIHYQM GENLYQTLDRCGLDFNASTYEEFIQFIITGAPAHFEEAAEILTGIFEPITLPEEVIDTER KRIKAEIREEDEESSLGYFTKKIAWKNTSLERTITGKKKTLDKIKGKQLRKFQKEVLSSN NIFFYISGNYPETAVVTVTKLMENYPLTVMKKERKNLAPVPKRFFARKPKVYVKNSKKTC VCFSVDINASNYTLAEKNLLFDILFEGEFCKIHQELSEKKGYVYSYDPCFQHYNNIGQMT LTYEVLPKHLYDSVETVVEVLKSMKEGITDELSYVLPHYVDNAGFLLDDTEEFNWVRAYE GHILDVYYKDIEERTESYRRVTTERMTEICREVFRQSNILLTVKGKKKKVDTERLAEILC DALSCR >gi|222441878|gb|ACEP01000064.1| GENE 52 59370 - 60746 1749 458 aa, chain + ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 3 457 10 465 465 584 62.0 1e-166 MRENSWLSYTEEDDKNVEVLAQEYKEFLSKCKTERECTQYFIKEAESHGYQDLNKLIAQG TRLKAGDKVYGVGMGKTIALFQIGKQPLTEGMNILCAHIDSPRLDLKQNPLYEDTELSFM DTHYYGGIKKYQWVALPLALHGVVAKKDGTVVNVNIGEDPADPVVYVTDLLIHLAGKQME KKGSVVVEGENLDILVGSRPLAGEEKDAVKANILRLLKEKYQMEEEDFLSAEIEVVPAGP ARDCGLDRSMIAGYGHDDRVCAYPSFAAMMEAGHVDRTSCCLLVDKEEIGSVGATGMQSM FFENTVAEILALMGQDSNLAVRRTLARSRMLSSDVSAAYDPAYAEAFEKKNCAYFGKGMV INKYTGARGKSGSNDANAEYLARLRRIFDDNKVAFQTAELGKVDYGGGGTIAYIAALYGM EVVDSGVAVLSMHAPWEIISKSDLYEAKKGYVAFLKNA >gi|222441878|gb|ACEP01000064.1| GENE 53 61059 - 61400 348 113 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2804 NR:ns ## KEGG: Cphy_2804 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # 
Pathway: not_defined # 1 101 1 101 192 73 35.0 4e-12 MNCQNAQSMVLNFINNKLDKEETKAFIEHVRDCKDCWEELEIYYVMLVGLKQLDEGEELA ADFRKKLQNEVESRYVEIEREAKRKHIAKIITILVTAAILIWMFAYLISAMLL >gi|222441878|gb|ACEP01000064.1| GENE 54 61647 - 64283 3004 878 aa, chain + ## HITS:1 COG:BH3153_2 KEGG:ns NR:ns ## COG: BH3153_2 COG0749 # Protein_GI_number: 15615715 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Bacillus halodurans # 395 878 86 569 569 449 50.0 1e-125 MADKIMLLDGHSLLNRAFYGLPDLTNAEGKHTNAVLGFLNIMLRYVEEEKPTHLLVAFDR KEPTFRHIKYKEYKGTRKPMPAELHEQVPLMKEVLKTMEIPIITQAGIEADDILGTIGKE AEEAGFDVAIVSGDRDLLQLATKKVKIKIPKTKASGTVTEDYYEEDVRELYGVSPTEFID MKGLMGDTSDNIPGVPGIGEKTAAKIIKEYHSIENAHEHIEEIKPARAKNNLQEYYEQAI MSKDLATIKKDCDISYDFKDAKIGTLFTKDAYQLFKELELKSLFKYFGEDEKEEETLEVV STFDLGEIENAFTSIKEEGQAALGIIGVSGNCKGLSLTWRDNTVLALVGGFVTEDYLKEK ITELLKEGTELIVLDYKALLHFSEEVREYESQIQDVGIQAYLLNPLQSAYDYEDIARDYL RQMLPSRKEVCGKKALEEVFSQEEEKTVQMAAYLSYVPFHAYPLVLEKLREAEMEELYLT IERPTVATLYDMEKYGITVEKDALKEYGDQLVGRIEELEKNIYEQAGKTFNINSPKQLGV ILFEDLKMPFAKKTKTGYSTSADILEKLAPDYPIVSNILEYRQLTKLKSTYADGLAAFIQ EDGKIHSIFHQKVTATGRLSSSDPNLQNIPIRMPLGRAIRKVFVPDSGYVFVSADYSQIE LRVLAHMAGDEHLIDAYRHGEDIHRMTASQVFGVPFEEVTPLLRRSAKAVNFGIVYGISA FGLSQDLKISRKEASEYIDKYFATYPGIKTYLDGNVAFAKKEGYVKTLYGRVRPIPELSS SNFMQRSFGERVAMNSSIQGTAADIIKIAMIRVSKRLQEEKLASRLILQIHDELLIETKE DEVEEVRKILKEEMMGAADLKVPLSIDIEEGKTWYEAK >gi|222441878|gb|ACEP01000064.1| GENE 55 64297 - 64902 774 201 aa, chain + ## HITS:1 COG:SPy0498 KEGG:ns NR:ns ## COG: SPy0498 COG0237 # Protein_GI_number: 15674604 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Streptococcus pyogenes M1 GAS # 3 194 2 190 197 121 38.0 7e-28 MKVVGITGGVGSGKSVVMNILKEKYGAEVILADLVAHDLMEPGQQNYLDIVEAFGEGILA QDKTIDRPALAKVVFGNQERLARLNAITHPNVKKEIFHRIDGIKEKGEASFIAVEAALLL EEGYQNDFDMMWYVYVDEATRIERLKEGRGYTEEKCREIMAKQLPEEVFRKECSTVIDNH LGISETENQIKTAVESLLSLY >gi|222441878|gb|ACEP01000064.1| GENE 56 64970 
- 65761 827 263 aa, chain + ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 1 263 1 259 261 196 42.0 5e-50 MNLASIASGSSGNCIYVGNEKSHFLIDAGISRKRIVEGLTQMEVAPETIQGIFVTHEHMD HISGLGVFLRKYPVPVFATAKTIDEILSTSSLGKVDKNLFESITPDNPIYMDGIKIEASR ILHDAADPVCYTVSDTESKVGVATDFGTYDEYLVEKLQGCESLLVESNHDLNMLMVGPYP YPLKKRIMGNKGHLSNERAGQFLSKVIGEECKHIFLGHLSKENNYGELAYETVKVELLLH NIDLDKANFTLQVASRTAPTCII >gi|222441878|gb|ACEP01000064.1| GENE 57 65863 - 66150 440 95 aa, chain + ## HITS:1 COG:MJ1558 KEGG:ns NR:ns ## COG: MJ1558 COG3830 # Protein_GI_number: 15669753 # Func_class: T Signal transduction mechanisms # Function: ACT domain-containing protein # Organism: Methanococcus jannaschii # 6 95 1 90 90 77 44.0 9e-15 MEGKKLKKSIITVLGKDSVGIIAKVCTYLANNNVNILDINQTITGGFFNMMMIVESEEVV KTFPVMASELEQIGEELGVKIQAQHQEIFEMMHRI >gi|222441878|gb|ACEP01000064.1| GENE 58 66161 - 67525 2096 454 aa, chain + ## HITS:1 COG:CAC0479 KEGG:ns NR:ns ## COG: CAC0479 COG2848 # Protein_GI_number: 15893770 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 2 454 1 451 451 584 68.0 1e-166 MLNINEVIETNKMIHEMNLDVRTITMGISLLDCVSDSLDEVCENIYNKITTKAKNLVKVG KEIEQEFGIPIVNKRISITPIALVGGACCKGPEDFVRIAKELDRCAKEVGVNFIGGYSAL VNKGMTKADKYLIESIPEALSVTDRVCSSINVGSTKTGINMDAVKMLGEIILETSRKTAD KDSIGNAKLVVFTNAPDDNPFMAGAFHGVTEADTIINVGVSGPGVVKRALENVRGKDFEE LCETIKKTAFKVTRVGQLVAKEASKRLGVPFGIIDLSLAPTPAVGDSVGEILEEIGLEYA GAPGTTAALALLNDQVKKGGVMASSYVGGLSGAFIPVSEDQRMIDAVEAGALTLEKLEAM TCVCSVGLDMIAIPGDTKATTISGIIADEMALGMVNQKTTAVRIIPVIGKGVGDTVQFGG LLGYAPIMPVNQFDCSAFVNRGGRIPAPIHSFKN >gi|222441878|gb|ACEP01000064.1| GENE 59 68002 - 68691 433 229 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K 
Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 224 1 223 227 175 40.0 7e-44 MPGILVVEDDENLNRGITFSLKKSGYEVFSAESVKKAKRIASDNNVDVTICDVNLPDGNG LEFVRWMRCNYNTYIICLTALDQEMDQVMGYEAGADDYITKPFSLSVLLLKIEAHFRRRQ EKTEAGKMISGDIIFIAGEMKVLIKSREISLTKTELKMLTFFLQNPKQILSKTQILENVF DLEGDFVDENTIAVNIRRLREKIEDNPAAPVYIKNIRGLGYIWNQEVRQ >gi|222441878|gb|ACEP01000064.1| GENE 60 68667 - 69932 567 421 aa, chain + ## HITS:1 COG:CAC0451 KEGG:ns NR:ns ## COG: CAC0451 COG0642 # Protein_GI_number: 15893742 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 65 419 64 412 416 117 28.0 4e-26 MESGGKAVTKRYRKKITVVLSLVLPVIVLFAILNCFTTYMFYEDYKYKMNLMTEIAAKEE FSGLDAVSELLKDKDIETNEQGRRLLEQYGYWGNKGNAFYSQFRHQVMVTGAVSTVICVL LLTFLLYWKKKEDACHQKILDQLEEILIRFRENKFDDLLKTENHAELEKLNDQLEAIGHH IQLIKEEARAEKENTKEMVSDISHQLKTPVAALDTCFSVLMQNDLSATEQEEFRIRCRSA LDGLETLLQSLLEISKMETGLIQINKKKLPLMDTVISAVNRTYPKADEKEMEFVFDYEKE LETCTIMQDKRWLGEAVINVLDNAVKYSPCGSKIFIRLQKRNDLVRMEIEDQGIGIPQNE YHKIFQRFYRGSSKEVMEKSGTGIGLFLSREIIEKHAGTITVTSGKKKKGSTFVIQLPYV G >gi|222441878|gb|ACEP01000064.1| GENE 61 70046 - 70720 324 224 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 220 1 219 223 129 34 1e-28 MSVILETKQLCKFYGAGENQVKAVDQVDIQIEQGEFVAIVGKSGSGKSTLLHMLGGLDTP TKGSVTLAGKDLYRMKEDALAVFRRRKIGFVFQAFNLVSSVNVWENIVLPLGLDGRKVDE AYVNDIIATLGIENRIYNLPNQLSGGQQQRVAIARALVNRPEIIFADEPTGNLDSKTSDE VIALLKMTAKKYGQTIVMITHDDEIAQVADRILVIEDGQVVDFR >gi|222441878|gb|ACEP01000064.1| GENE 62 70783 - 72003 618 406 aa, chain + ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 
378 23 387 832 72 23.0 1e-12 MMAICLTTMLLVIISTVGNGVIHLQKSQAAGSYGSNYGLFISADGSQLKEVNRRAEIDAT GTMCTEGIIKGNEKGGFVCMDETARKMLPYNKEYELKEGQYPEKMQEIAAGRAFFRAMGY GDVKIGDTVTLDYRAGMQSEYKPEEFVVSGILYDRDEYTIEASYVAFGSQEFYDEHVAEN DRQYNIYFTLNDSANVSMNNIDSVIKQIAAACGIEEKNVIVNDLYLQWVLQPSYETIAVC GILILAIVLFSVVVIYNIFQVGIVNKIQEYGKIKALGATKKQMKQLIFREGIFLTISSIP VGLLLGFLIAKYGFNWLVEQGNFVSTGTGSMGVQNQQVPLFSLPIMLLCIFVSFLTVALA LRKPMKIVSRISPIEATRYLENAETQKKENEIAEKMSPCFLWQWQI >gi|222441878|gb|ACEP01000064.1| GENE 63 71976 - 73160 1048 394 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01520 NR:ns ## KEGG: EUBELI_01520 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 394 421 814 814 754 92.0 0 MFSMAMANITGNPKRTIGTILTLGLSCALFVIISNYVGNIDTEHEARLSVNHGQFELQLD YSAEYDERYPENNLDTILTDNPLNDSLIEEIKSIPGVTDVMTREIVSVNLNGTRFPAAVV SKKDFDFMRQDGDIGSMDYDQAVKNGDIFFGWSAWMEQDGYAPGESIVFDFENGSGTYTY QGKIAGSFVSADTYLVIPEGVYRSMNPRGTAYGYLWVDCDKKDVASVEQSLNTLISNTSH IKMDTYHAQLESAESVTRMMKLGCYLFMAVVGLIGFMNMANTMIMNITTKKQEYGVLQAV GMTNKQLNLCLQLQGLIFTVGTICVALIIGLPLGYALFSYAKNNGIFGMNVYHVPITPIL AMILLVGLLQIVLSCVLSSNLKKETLVERIRYQG >gi|222441878|gb|ACEP01000064.1| GENE 64 73478 - 73753 164 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027073|ref|ZP_03716265.1| ## NR: gi|225027073|ref|ZP_03716265.1| hypothetical protein EUBHAL_01329 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01329 [Eubacterium hallii DSM 3353] # 1 91 7 97 97 150 100.0 4e-35 MHYQNYTYRTGSYRKQHKRYTANRKRQEEYESLPSFLFFRLRLLICIMLFLAFVTADKYV LTKEDTAPVFQKIEKNQSYKDLIKKYKLIGE >gi|222441878|gb|ACEP01000064.1| GENE 65 73823 - 74386 309 187 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_3288 NR:ns ## KEGG: CDR20291_3288 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 178 1 173 180 81 33.0 1e-14 MLIYLMALDTEEERIKFVRLYEDYRNRMHYTASILLKSEIEAEDIVHDTFLTLTDYLDRI DEKDSVGTWNYIVTILKNKCYNFLKRNKRIELTEDEEVFEQPVEMYNLLENQLIKEEAEE FLNTLIKGLKYPYREVIYLQYHNKLNSRQIAELLRTSPANVRKISSRAREQLKKGMQKKG YTCEYGF 
>gi|222441878|gb|ACEP01000064.1| GENE 66 74376 - 75035 420 219 aa, chain + ## HITS:1 COG:no KEGG:Sgly_1453 NR:ns ## KEGG: Sgly_1453 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 28 218 30 223 225 66 24.0 9e-10 MDFEKLLEQEARKAARQDYEEVTEKTKDYSYKYSKAFIQKMNAMIHEEKRKAKKKRRWHI LLVAAIILILNAGIVLANDDLREKVGELIIQFFDDNIHIRSSKEIADSEEIFRQLHLGYV PKGYHILYETENPTTMYDVYYEGTNDNYITFMQGFKENVDVHITYDGTGRKKVQVNGKEL YMIKDGNITSYYYEDGEYLITLSGTEKESELIKMLKSLK >gi|222441878|gb|ACEP01000064.1| GENE 67 75196 - 75681 93 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027076|ref|ZP_03716268.1| ## NR: gi|225027076|ref|ZP_03716268.1| hypothetical protein EUBHAL_01332 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01332 [Eubacterium hallii DSM 3353] # 1 161 1 161 161 296 100.0 5e-79 MKFEIIYSRKIRDAVAVFTIFMFICQYGLTREDVAENFSRTAPTWGWFALDKPDWEFWNE ELKETRVWNEEDFRTYKRKVFKKDIIQSVYFLPVLWVLKFVDNYYIPVILMGITVFIYCS SYADKWASRIKDLTKEYQRAALRQLIDEKQKVYIKTESIEG >gi|222441878|gb|ACEP01000064.1| GENE 68 75684 - 76583 543 299 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027077|ref|ZP_03716269.1| ## NR: gi|225027077|ref|ZP_03716269.1| hypothetical protein EUBHAL_01333 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01333 [Eubacterium hallii DSM 3353] # 1 299 1 299 299 583 100.0 1e-165 MLKKIIKKRAYEGNKIEDNLLFVTLQTQIVGLDMNNGYEQIFLSDKLSNMCNFYYDKDTK QLLVYNTSGHAYLYQMPEGKRLSQKLYLRAMDLQPDSIAHKGDYYWYSDTNFCLRRLDIK DGSTKQILKDKERRVVNVIQVADRLYIFTDMRREIEGYVKILCYHIDENGDLKYEFEWGE RPYVFMGYVRRDFDRKTVIVGAKTYTKYVGDYYVAFEPENKRFVPIYPITDEAWELYGHT MLEGNKILAANPKYVKVIDIPSRKVLAKYEPEETIYSATFLTKNKGAIFTNRGVYEFTF >gi|222441878|gb|ACEP01000064.1| GENE 69 76598 - 77284 449 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027078|ref|ZP_03716270.1| ## NR: gi|225027078|ref|ZP_03716270.1| hypothetical protein EUBHAL_01334 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01334 [Eubacterium hallii DSM 3353] # 1 228 1 228 228 361 100.0 3e-98 
MTDEEKLTLIQKYEELSEEEFKEECFEDLKKFAFDKDDTVKTFTASVLSRFVNTAKQKEA IEILLQLCHEENELVRTEAYDAIGCYCEKSVVKSLEKAVLSEKDGLARSYAIDSWIYVQQ HLCTNETEVIGYLNAILEKERDDNCILNCLYGKYSFGERAVLDEILNYIKNTDYHIRCSV LSIIELILEEESCTKECKRKIKITLTELLKREKANAVKEQAEEIMRWL >gi|222441878|gb|ACEP01000064.1| GENE 70 77284 - 77553 184 89 aa, chain + ## HITS:1 COG:no KEGG:BCB4264_A3444 NR:ns ## KEGG: BCB4264_A3444 # Name: not_defined # Def: TPR-repeat-containing protein # Organism: B.cereus_B4264 # Pathway: not_defined # 3 54 2 53 161 70 61.0 2e-11 MWHIWTIDDWEKNIDLTREGHRTGLRLRFDKDIDSEVRRACKEFAVFLRREYFFQWINAL KLTPIGQERQATNYARYILDEYAETREHP >gi|222441878|gb|ACEP01000064.1| GENE 71 77900 - 78514 445 204 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 4 194 20 215 217 135 41.0 5e-32 MNEKEYYNLIKPYEDAMQMILSRLDTLNHTLYESSATHLVPIHNIQSRIKSKKSIEDKLI KKEKKPSLTNAKNFLMDIAGARVICYFVNDIYNLVEILKSQTDLIVMKERDYIAGPKPNG YRSYHIILGIPVYCLDGMEYFPVEIQFRTMSMDFWASMEHRINYKKERQDREKLVKELKE HARKLEKIEKSFEKYSENIQNTVE >gi|222441878|gb|ACEP01000064.1| GENE 72 78709 - 79407 497 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|23335919|ref|ZP_00121150.1| COG0081: Ribosomal protein L1 [Bifidobacterium longum DJO10A] # 4 226 6 226 226 196 43 8e-49 MGKIKLLHGSDHIIEKPDLLLGKTNNDYGRGFYCTQELSMAMEWACKKNSDGFVNKYVLE QDSLNILNLLDGKYNILHWMALLLKNRTFRLSNEIAIDACDYIIENFSIDLKAYDVIIGY RADDSYFSFAESFVQNGLPLRSLNEALHLGRLGTQTVLISEKSFRNLTFDGADFADKTVY YPKFIARDSNVRETYRREIRNRHSYKNDIFVLDILREEMKDNDSRIQRILSI >gi|222441878|gb|ACEP01000064.1| GENE 73 79373 - 80020 446 215 aa, chain + ## HITS:1 COG:no KEGG:Sgly_2563 NR:ns ## KEGG: Sgly_2563 # Name: not_defined # Def: helix-turn-helix domain protein # Organism: S.glycolicus # Pathway: not_defined # 1 211 1 213 213 182 47.0 1e-44 MIHAYSESYLYDAKQNLAECFDYAISNCRFNADIFSRLFVQSGYADKFERGNPAIVSGIS GIELAQEIIMYAYMNYKFPEKIFSEERSEVYWAGWALAEYQWATCRRFKDIFSRIPLSEI 
VTMYSVYHEMDIGHFIEDMNKKYMSITQEIHLETIRENRGISQVELAALSGVKLRSIQMY EQKVNDIDKAQARTLYKLSRVLGCSIEDLLENPEL >gi|222441878|gb|ACEP01000064.1| GENE 74 80084 - 80782 424 232 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 8 232 19 251 251 119 35.0 7e-27 MNQKNIVKDIQSKLFDLQDIKYRDFHAKLMPTVNKEKIIGVRIPVLRSFAKEFGKTKEAR LFLQVLPHSYYEENNLHGLLLEQIKDYEKCLQELERFLPFIDNWATCDLLAVRTVKKHLD VFIEEVYRWIQSQHTYTIRFGIGMLMRYYLDENFQMEYPEKVMQICSKEYYVNMMRAWYF ATALAKQYENILPFIEDKKLDVWTHNKTIQKAIESYRITPEQKMYLRTLKIK >gi|222441878|gb|ACEP01000064.1| GENE 75 80937 - 82364 670 475 aa, chain + ## HITS:1 COG:no KEGG:BAA_A0205 NR:ns ## KEGG: BAA_A0205 # Name: not_defined # Def: pXO1-133 # Organism: B.anthracis_A0248 # Pathway: not_defined # 17 475 58 482 485 162 27.0 3e-38 MAAQLEMHMQEVYSLRFFYSFQIPRLGKEFDLLQIKDNHIVNIELKSGVVSDQAIRKQLI QNRYYLSVLGRPIQSYTYISSQNRLVRLTHHDHIVDADWERLCEDLQKEGTNYEGNIEDL FRAELYLISPITDPVRFLKKEYFLTSQQRDIEKKILRDIYVKQSGCFWFSGIPGTGKTLL LYDIAMKLSVRHRICMVHCEENGEKWRILHERLQRIDFLADEQIRIEKKSGSQNSGQDKG PDSSQDYDQRKQFNCEERKAGTQIPLEKYRGILVDEAHLLSKDKIERLLELSKEQPVIFS SDSEDVISSEEMDKENIKKLENQTDIKVFRLTNRIRTNAELSTFIQNMMHLPPRMNSRGY PHIFVVYANDDVEAENLLSDYIKQGYQWVEIEESERQEAQADLKMQAVRDMDKIVLLLDE RYYYDEEGYLRAACFMKNGSSYVRKIFHRLNHARESIALVVKNNEKVYNTLLDLL >gi|222441878|gb|ACEP01000064.1| GENE 76 82644 - 83669 957 341 aa, chain + ## HITS:1 COG:SA1154 KEGG:ns NR:ns ## COG: SA1154 COG2008 # Protein_GI_number: 15926897 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Staphylococcus aureus N315 # 1 341 1 341 341 444 60.0 1e-124 MLDFVNDYSEGAHEKILLRLLETNMEKLSGYGTDQYCESAKQKIREVCECPEADIYFLVG GTQTNATVIDAILSQYEGVVAANTGHISTHEAGAIEASAHKVLAIPHKNGKITAKAVKQY FADFYADGNHEHMVFPGMVYISHPTEYGTLYSKKELEELSIVCHEYDAPLYLDGARLGYG LVSEENDVTLADIASLCDAFYIGGTKVGALCGEAVVFPKNNAPKHFMTTVKQHGALLAKG 
RVLGVQFDTLFTDNLYFEISKNAIKTADTLKQALKEKGYTFYIDSPTNQIFLVLEDERLK ELEQKVAISFWEKADDNHTIIRLATSWATKEEEIKALIEIL >gi|222441878|gb|ACEP01000064.1| GENE 77 83915 - 85144 1340 409 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 402 4 400 411 450 56.0 1e-126 MGLNDTPSGERIHIGFFGRRNAGKSSVVNAVTGQELAVVSDTKGTTTDPVSKAMELLPLG PVMIIDTPGFDDKGHLGELRVRKTKQILNRTDIAVLVIDATEGIKQCEEELLAIFKEKEI PYLLVYNKEDILSEEQKNLLRTEQSEQVNSICVSALEKKHINELKEKLARLVKTDNQKLQ LVGDLVHPADLVVLVIPIDKAAPKGRLILPQQQVIRDVLEADGAAICVKEYELRELLDTI GKRPAMVITDSQVFAKVSADVPEEIPLTSFSILMARYKGFLEEAVLGAAAIDRLKDGDRI LISEGCTHHRQCDDIGSVKIPRWLKNYTGKKLTIETSSGREFPEDLSPFSLIIHCGGCML NEREMKYRMKCAKDQNIPFTNYGTAIAYMQGILERSIGIFPYLLEELKR >gi|222441878|gb|ACEP01000064.1| GENE 78 85192 - 86604 1295 470 aa, chain + ## HITS:1 COG:CAC1652 KEGG:ns NR:ns ## COG: CAC1652 COG1027 # Protein_GI_number: 15894929 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Clostridium acetobutylicum # 2 463 9 445 470 466 55.0 1e-131 MRLEKDTLGEKYLTDETVYGINTQRAVENFPLSHKKVNLHLIHAMLLVKKAAAKTYENLV EDIEKEKYQAIVAACDELLLKTEEDKSFSQQAEHNRDNQNQAEDNRRGIDALFVTQALQG GAGTSTNMNVNEVIANIALKHLGKNYGDYNCIHPLDDVNRGQSTNDVYPTALRIASIFLI RKLSDACAALQTSLQEKENAFQDIEKLGRTEWMDAVPITLGEEFGACAQAIARDRWRIYK VEERLRQVNLGGTAVGLSDNAQHKYRFRVIEELRRLTGLGLAAAEYPMDLTQNNDVFVEV SGLLKALAVNLMKIANDFRIMNSGPGGGIGELHLKAMQQGSTIMPGKVNPVIPEMTMQCA MRIIANDTAITMAAAHGEFELNAFLPLIADSLLESLELAERAVTIFREKCVETIEPDVEN CKKHLEQSYAFATAYTPKLGYETVAKIIKENPPEKAKEILICEAAKQTNS >gi|222441878|gb|ACEP01000064.1| GENE 79 87032 - 88432 1657 466 aa, chain - ## HITS:1 COG:MA2837 KEGG:ns NR:ns ## COG: MA2837 COG1115 # Protein_GI_number: 20091661 # Func_class: E Amino acid transport and metabolism # Function: Na+/alanine symporter # Organism: Methanosarcina acetivorans str.C2A # 5 465 8 453 469 404 52.0 1e-112 
MFGTINSILTKIDDLVWGVPLIVMILATGIFLTIRLKGIQVRKLILAFKYMFSNETDGEH GEISSFGALCTALSATIGTGNIVGVATAIVAGGPGALFWMIIAATVGTATKYSECLLAIK YRTVADDGHIVGGPFYYIEKGMGENWKWLAKIFAVFGVMVGLLGIGTFSQINGITSAVNN FFDANNAWTVQLFGRDYSWTVVIAGIILTIFVAFVIIGGLRRISAVAQVVVPFMAAAYII AAVVILILNYKAIPGAVVEIVQSAFGMRAVAGGALGAMMLAMQKGIARGIFSNEAGIGSA PIAAAAVQTTEPVRQGLVSMLGTVIDTIIICTMTGLSIVITGSWDKGLEGVAVTTRAFQV GLPFPAKVSAFILMLCLVFFAFTTILGWDYYSERCLDYLTGNNQKVVRAYRWIYIACVFI GPYMTVSAVWTIADIVNGLMAIPNLIALLALNGVVVAETKKYFKKF >gi|222441878|gb|ACEP01000064.1| GENE 80 89158 - 89373 96 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027093|ref|ZP_03716285.1| ## NR: gi|225027093|ref|ZP_03716285.1| hypothetical protein EUBHAL_01349 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01349 [Eubacterium hallii DSM 3353] # 1 71 1 71 71 109 100.0 6e-23 MDKYKELTCVICIIIEAIKFCCTYIYLFKNSTLRNSTDFTMFNYQNKICDSSKINFKGVF IDEKTFTEIFI >gi|222441878|gb|ACEP01000064.1| GENE 81 89342 - 90355 1084 337 aa, chain + ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 74 219 105 248 541 91 35.0 2e-18 MKKHLQKYLFEEKVCNLVTEISSKETGGIPESNTNVILKRKIIEQTEEDFSYQPILRKEE NAYRFFEPIAKEERLIVLGGGHISGYLCEFAAKTGFDVWVVDEREEFSNRERFPHAKKVI CGKFTDVLPTLNINKNDYVAIVTRGHSCDGDCLYYILTHELPGYLGMIGSKRRVSAQFKM FREMGVPEEKIAQVHNPIGLPINGVTPPEIAISILAELILEKRTKKTDGTVQTELDYEVL LEWLNGTRPCAMATILKAQGSSPRKEGAKMLIFEDKSILGSVGGGLAESKVIEKGHEMIG SGGAFLFHFVMDADVAARVGMACGGTFDILIEDVVRE >gi|222441878|gb|ACEP01000064.1| GENE 82 90428 - 91447 1130 339 aa, chain + ## HITS:1 COG:mlr0093_1 KEGG:ns NR:ns ## COG: mlr0093_1 COG0303 # Protein_GI_number: 13470396 # Func_class: H Coenzyme transport and metabolism # Function: Molybdopterin biosynthesis enzyme # Organism: Mesorhizobium loti # 7 328 9 324 330 110 27.0 4e-24 MKLMKTQDAVGQVLCHDITQIIKGVTKDAVFRKGHVITEEDIPVLLSVGKDNVYIWEKGE 
TRLHENEAAEILREMCQGEYMHPTSVKEGKIELVADIDGLLKVDSDKMKKINGMGELMIA SRHGNFAVKKGDKLAGTRIIPLVIEREKMEQAKAVCGGEPILELKPFVHKKVGIVTTGNE VYYHRIEDTFTPVIKEKLAEYDTEVIGQKICNDDHEKITKAILSFIERGADLVLCTGGMS VDPDDKTPLAIKNTGAEIVSYGAPVLPGAMFLLSYYNGNIPIIGLPGCVMYAKRTIFDLA LPRIMADDEISVEELAALGEGGLCLNCPVCTFPNCGFGK >gi|222441878|gb|ACEP01000064.1| GENE 83 91465 - 91953 495 162 aa, chain + ## HITS:1 COG:RSc0560 KEGG:ns NR:ns ## COG: RSc0560 COG0315 # Protein_GI_number: 17545279 # Func_class: H Coenzyme transport and metabolism # Function: Molybdenum cofactor biosynthesis enzyme # Organism: Ralstonia solanacearum # 4 155 5 156 158 159 58.0 1e-39 MEYTHFDKHGNAVMVDVSEKAVTKRVAVAKGSIRMSRECFDKIKYQDMKKGDVLGVARIA GIMGAKKTSELIPLCHILNLTSVTVDFIMQEESSSITAVCETKTTGKTGVEMEALTGVQI SLLTIYDMCKSVDKTMEIYEVCLCSKSGGKSGDVVYRTVDGE >gi|222441878|gb|ACEP01000064.1| GENE 84 91956 - 93851 1443 631 aa, chain + ## HITS:1 COG:BS_ydiF KEGG:ns NR:ns ## COG: BS_ydiF COG0488 # Protein_GI_number: 16077662 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 1 515 2 555 642 338 36.0 2e-92 MRYQIENGTVSLGGKTILNHIDFSIKGQEKIAVVGRNGAGKTTLLRLISGELSLDRDDKR FGKGITTDKAISIGFLHQQSFSNAERTVEEEILSLVSEEDIFSKEKYYFEKEYDRIFTGF GFKKEDKCKKIGQFSGGEQTKLAFIKLLLSKPDILLLDEPTNHLDIASVEWLEEYLVSYE KAVIMVSHDRFFIDRTAEIIYELSDGKLMRYVGNYTEYKKQKEKQRVSQQKKYDAQQKEI ARLNELIEKFKHKPKKASMARSKKKVLERMQKVEKPDKEEAYIFKEKLEPLMPGAKNVLE TEHLKIGYKNPIRELTLRIRRGQKIAVLGANGAGKTTFFKTIIGQLQPISGKYIIGNGIA TGYFDQHSGEIQSEKRVEEHFGECFPKLTEKEKRQILGRYLFSSQKANVKIRDLSGGEKS RLILAEILESRPNFLVLDEPTNHMDLPAKETMESAFSAYTGTMLFISHDRYFVSKIADAL LLFEEDGVKYYPFNYEHYLHMLKKKREGDQTWADVVEAENTALVEGLLSVPAKERHQSAR FNTEQSYTDWQLTLAREQLEKCQKNIEYWQKQAATELDNLTFEQYQSEKWTKRQRELHKR FEELSEEYLQQNLIWYEKYQEYEEAFSDYLE >gi|222441878|gb|ACEP01000064.1| GENE 85 93918 - 94394 305 158 aa, chain - ## HITS:1 COG:CAC1667 KEGG:ns NR:ns ## COG: CAC1667 COG1418 # Protein_GI_number: 15894944 # Func_class: R General function prediction only 
# Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 25 155 36 163 173 122 48.0 2e-28 MKLSNAEMQEIDTILEKYCNTKEALSMKNFCQHGSVSTYKHVMYVTRVCYYLNKHFGLGA DVKSLVVGAFLHDFYLYDWHEKDDSHKWHGFSHAHTALLNADARFALTDKERDIIAHHMW PLNLTKLPRCREALLVCLVDKYCSTVETLTMRSAESAF >gi|222441878|gb|ACEP01000064.1| GENE 86 94439 - 95764 1301 441 aa, chain - ## HITS:1 COG:CAC2686 KEGG:ns NR:ns ## COG: CAC2686 COG0366 # Protein_GI_number: 15895944 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Clostridium acetobutylicum # 2 408 3 417 451 344 43.0 2e-94 MWFDEAVVYQIYPLGLCGVLNEERSSDDSVVGHSILEVKDWISHIKKLGATCILFNPLME SDYHGYDTRDYYTVDRRLGTNDDLKEVCDAIHEAGLKILFDGVFNHVGRGFWAFKDVQEK KWDSPYCGWFHLSFDGNTEYNDGFWYEGWEGHNELVKLNLDNPDVRQHQFDAIKQWTDLY GIDGLRLDVAYCLSDFYLKELRRFTSTLKEDFVLMGETLHGDYNMYMNDEKCHSVTNYEC YKGLYSSFNCANMHEISYSLNRQFGYENWCIYRGKHLLSFVDNHDVSRIATILEDKQQLP VIYGLLFGMPGVPSIYYGSEWGIEGKKQGRDEEIRPRLEKPEWNELSDAIAAYAKAHHTH PALYNGGYANLLVRDKQLIFERSCDEERVIVAINADSNTFHADFNARSGLGTDLVTGETV DFGGGMDLPPYSVAYWKTEVI >gi|222441878|gb|ACEP01000064.1| GENE 87 95900 - 96136 334 78 aa, chain - ## HITS:1 COG:no KEGG:Closa_0890 NR:ns ## KEGG: Closa_0890 # Name: not_defined # Def: Phosphotransferase system, phosphocarrier protein HPr # Organism: C.saccharolyticum # Pathway: not_defined # 1 76 1 75 77 64 43.0 1e-09 MKTVTIKFGSVDNATSFVKAIEDFESHFDLIYGQYVVDAKSLLGVMTMDIRNKVDLRIME RENEMPTIMKAISPYIAA >gi|222441878|gb|ACEP01000064.1| GENE 88 96402 - 97193 1132 263 aa, chain + ## HITS:1 COG:lin0840_1 KEGG:ns NR:ns ## COG: lin0840_1 COG0834 # Protein_GI_number: 16799914 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Listeria innocua # 1 259 1 251 268 254 58.0 9e-68 MKKFLSVALAAVMMLSLAACGSSSNKAESSSEGGSASDKTYIIATDTTFAPFEFTNDKNE FVGIDVDILAAIAKDQGFKYDLQSLGFDAAVAALESGQADGTIAGMSITEERQKKYDFSE 
AYYDSYVCMAVKKGSDIKGYEDLKGKKVAAKTGTQGADCAESLKDKYGFDITYFDDSATM YQDVKTGNTVACFEDQPVMAYGVSQGNGLEIVAEEKDNFSTPYGFAVLKGQNADLLKMFN EGLKNIKANGTYDEIVAKYTSAK >gi|222441878|gb|ACEP01000064.1| GENE 89 97227 - 97895 594 222 aa, chain + ## HITS:1 COG:lin0840_2 KEGG:ns NR:ns ## COG: lin0840_2 COG0765 # Protein_GI_number: 16799914 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Listeria innocua # 6 220 1 209 212 188 47.0 5e-48 MNLFGVLSEVYPTLLTGLTMTVKITILSLIIALVIGIFVCLMNISSNIVLRGIAKFYIWL IRGTPMLVQAFYFYFALPQLVQAVTGSQFRITVFTASLVTLALNAGAYISEIFRGSIESV NKGQMEAARSLGLSYAKSMQKIILPQAVRICLPSLVNQFIITLKDSTILYAIGLSEVMYN AKIYVGRTMESFATYTWVAVFFLLIVTVLSFISRYVERRMNA >gi|222441878|gb|ACEP01000064.1| GENE 90 97895 - 98638 285 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. AzwK-3b] # 1 241 1 257 563 114 29 3e-24 MTTEKDVKIHVKGLKKSFGHVHVLKGIDAEIKAGEVICVIGSSGSGKSTFLRCLNRLEEA TAGEILVDGQSITEKNSNIDKIRQHIGMVFQQFNLFPHYTVKENIMLAPVELKLKSKEEA EKKALELLKRVGLAEKADAKPKQLSGGQQQRVAIARALAMEPDIMLFDEPTSALDPEMVC EVLDVMKELAAEGMTMVVVTHEMGFAHDVADRVFFIDQGVIMEQGTPEEVMDHPTNERTQ SFLSKVL >gi|222441878|gb|ACEP01000064.1| GENE 91 98697 - 99470 948 257 aa, chain + ## HITS:1 COG:lin1228 KEGG:ns NR:ns ## COG: lin1228 COG0263 # Protein_GI_number: 16800297 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Listeria innocua # 2 256 10 268 276 246 49.0 3e-65 MRIVVKVGTSTLTHTSGRINIRRTEKLCKILSDVKNAGHELILVSSGAIGMGVGKLNMKS KPEDMPTKQAAAAIGQCELMYLYDKLFREYNHIVAQMLITGLDFSNEESHKNFQSTLNRL LELNVIPIINENDTIATQEISVGDNDTMGAIVAENANADLLIILSDIDGLYTGDPRNNPD AQLIDTVFSITPEIKELAGTKGTTLGTGGMITKIHAAEISMNAGIDMIIANGTKPELLYN IIDGEKVGTKFIGRKKA >gi|222441878|gb|ACEP01000064.1| GENE 92 99467 - 100723 1505 418 aa, chain + ## HITS:1 COG:NMB1068 KEGG:ns NR:ns ## COG: NMB1068 COG0014 # Protein_GI_number: 15676952 # Func_class: E Amino acid transport and metabolism # Function: 
Gamma-glutamyl phosphate reductase # Organism: Neisseria meningitidis MC58 # 3 418 4 419 420 500 63.0 1e-141 MTTKEILTQAVTAKNAINSADTETKNKALANMADALLAHTDAILEANKQDVEAARGKISD VMIDRLMLDAGRIEGMAKGIRELIDLDDPAGKVLRTVERPNGIVIEKTAVPMGVIAIIYE SRPNVTSDAAALCIKSGNVCVLRSGKEAWKSANAVVTALKEGMVKSNLPGEGIQLIEDTS RESSVELMKAVGYVDLLIPRGGPGLIRSCVENAKVPCIQTGTGICHVYVDESASFEKALQ IIENAKTSRPSVCNAEEVLLVHSAVAEKFLPLLQKKLVEERKEKGEIPVELRLCDRAAKI ISGTPAGADDFDTEFLDYILAVKIVDSVEEAVEHISKHSTGHSEAIITESKEAADYFTMR VDSAAVYVNVSTRFTDGGEFGLGCEMGISTQKLHARGPMGLEELCTYKYIIRGEGQIR >gi|222441878|gb|ACEP01000064.1| GENE 93 100944 - 101792 732 282 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027106|ref|ZP_03716298.1| ## NR: gi|225027106|ref|ZP_03716298.1| hypothetical protein EUBHAL_01362 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01362 [Eubacterium hallii DSM 3353] # 1 282 1 282 282 527 100.0 1e-148 MLWEGGLHMNLRQINELWNEYKEKYDLPKDVQISAVQYDAEKYVEGFFNILELLDGKYIL HLNPRFHLYSDSYARFILFHEFTHFYDFINSSFDETEDLFIYMNAYSEFHACRVTLARFM ERLTLKTVHIDKIQIPGPFNEISIRHLLEEGLWRVKVCFEQFLETLEPNDFVMTFRQLMY LLGYISLFENDDVLLEQTLQFAHLDTDDFFALYHALKEIDHDEILRLTRKIYNSAFLIYL RGFFRRNYDKRILPDDELENITADNCDEYKAILDKRQQEVAA >gi|222441878|gb|ACEP01000064.1| GENE 94 102186 - 103433 922 415 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027107|ref|ZP_03716299.1| ## NR: gi|225027107|ref|ZP_03716299.1| hypothetical protein EUBHAL_01363 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01363 [Eubacterium hallii DSM 3353] # 1 415 1 415 415 795 100.0 0 MKKHIFLGVLLMLTGFFFIPSKTFAFQHAYESKSDFIYALARNELPVYYSDYSNDLKTVL PKYTGVKVIGSSGSWYEIQYASKKGGTKNGWVTRDEFHSDCLIYDGREKQPFSNGTYQLS FYEENSSDSSFAMNTASIISENFSCSFKYAGDNRYTIRKAGEEKYLKADTLSNTPSSNEL WGSKQEAGTFLITRKKDYYTICDETTKRNLSQNDGSILEFTTDSNAVWRLTRNKKAIEKE NLQVFVQFDPVWAKHHYGNETTKDTDTNNFCTSGCGIFATVNAIYSLSGHFPDPYELAQY ASDKHYRIEDCGTDSGFFKAAAEKFGYKYGFSYDGSGESFKELKEKLKEGDTAIAYLPGH YGTIVDYNAKKDKYLLMDPHYLPKRGTSSFGDWVSQKDLEEGTLMVQTFFYYKAE >gi|222441878|gb|ACEP01000064.1| GENE 95 103813 - 104607 739 264 aa, chain + ## 
HITS:1 COG:BH1885 KEGG:ns NR:ns ## COG: BH1885 COG2357 # Protein_GI_number: 15614448 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 50 241 21 213 251 182 51.0 4e-46 MEQRKKIFGALDDLMRSPIGDRLRERVFRELSTKEIHDLLQREQKPYKELMAYYQCALME VETKFRVLNEQFSLAHDRNPIESIKTRVKSLESITEKLNRRNLPFSTQSIEENLTDIAGI RIICSFREDIYFLADCLLEQDDITLIQRKDYIENPKENGYRSLHLIVEVPIFLEHKKKMM KVEVQLRTIAMDFWASLEHKLKYKKDIASDQLTADLKKCADESAALDLKMDAIRKEIEVL KAPEAKAIGREVKHLPGGILNENQ >gi|222441878|gb|ACEP01000064.1| GENE 96 104690 - 105712 641 340 aa, chain + ## HITS:1 COG:MA3021 KEGG:ns NR:ns ## COG: MA3021 COG0530 # Protein_GI_number: 20091839 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Methanosarcina acetivorans str.C2A # 12 333 15 307 315 167 37.0 2e-41 MIYMSFFIGLILICKGGDWFVDAASKLSEILGIPKYVIGATIVSFATTMPEIIVSVAAAL DGHSAMAIGNAVGSVSANTGIILALSLFFLPVAIKREEYLFKNSLYIGTLLLLLISFKDR IFDRADCLLLLLAGILFLVENVKNAVCKKEVSNSKRKKEDMILHKKIQAGKNGEKKVKNK RSKWKEKQSSIWETFFWFCIGAVCIVAGSELLVHSACKIAEILHISEDIISVTIVAMGTS LPELVTTVTAILKKESSLSVGNIIGANMIDSAFILPICTLLSGEKLPVSLQMAQIDMPLC ILISVLALLPMLIRQKIKRRDGILVFSIYAGYLLYICWAI >gi|222441878|gb|ACEP01000064.1| GENE 97 105861 - 106727 956 288 aa, chain - ## HITS:1 COG:no KEGG:Closa_3823 NR:ns ## KEGG: Closa_3823 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family protein # Organism: C.saccharolyticum # Pathway: not_defined # 36 219 277 440 483 65 28.0 3e-09 MKKVTSLLMALILSVTILMGSIPMNASAATTEVPVGWYQQGKKVYYYENGKKVTGWHTLK SYYDKKKYKFYFNKNGVLVKDLFTLNYKKWIKKDITIVVNTKTHNATLYAKDKKTGKYNI PLKTMVCSTSRKAKGTKAYSGCRLEKTSAVRWFIYKKSRPYHYYQWGVKVKHGNFYFHSA RYTTTNNRKLEVGLYNDLGTNQTTTSVRLQAVNAKLIYDIATKTNKKKRVWVKVIRKNKE KGPFGVYTLAGTTGKIKNKKKRIDPTDPSIKGNKKIYSYAMSILNGLK >gi|222441878|gb|ACEP01000064.1| GENE 98 107030 - 108079 1052 349 aa, chain + ## HITS:1 COG:RP184 KEGG:ns NR:ns ## COG: RP184 COG0484 # Protein_GI_number: 15604058 # Func_class: O Posttranslational modification, protein 
turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Rickettsia prowazekii # 6 348 4 350 370 168 35.0 1e-41 MVTKTDYYDVLGINKNADEKTIKKAYRKLAKKYHPDINPGDSNAEAKFKEVTEAYEVLSD PEKKKLYDRFGHAAFDGTGNAQSGPYGNDGGFGGFSNFGGSGGFRGGTYRSNNGNGYQEF HFEGGDMGDIFENFFGGGFKKSGFRGGSGQNGNFNNGFHGNFRQNGGFGNRQSYQTKGND IHADIQVNFDAAAFGKKETIHLKDENGKVQALEVNIPAGIETGQSIRLKGKGTPGYNGGT AGDLFLKVTVSEKPGFKRDGQDIFNTVTIPFITAVLGGDVTVPTIYGNVRCSIKPGTQSG SKLRLRGKGIVSMKNSSVHGDQYTTIEVQVPKNLSPKAKQKLREFAAEL >gi|222441878|gb|ACEP01000064.1| GENE 99 108608 - 112846 4273 1412 aa, chain - ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 5 659 7 663 663 828 60.0 0 MKHALGIDIGSTTVKVTVINESHDILFSDYKRHFANIKGTLQTLLSEAREKLGNLTIHPT VTGSGGMAISEYLHIPFCQEVVCVANALQDYEPKTDVAIELGGEDAKIIYFNNGIEQRMN GVCAGGTGSFIDQMASLLQTDATGLNEYARDYDTLYPIAARCGVFAKSDIQPLINEGATK ENLAASIFQAVVNQTISGLACGKPIKGYVAFLGGPLHFMPELKNAFIRTLNLDEEHIIDP PYSHLFAAKGAALNAKEECSFTLDDLLKEFESDIQLAVEVERMEPLFKDEKEYDEFIEEH NQFKVKKADLATYSGKCYLGIDAGSTTTKLALVGEDGSLLYRFYDNNNGSPLNTTIRAMK ELKELLPPAAKISGSCSTGYGEALIKAALQLDHGEVETMAHYYAAAFFEPDVDCILDIGG QDMKCIKIRNHAVDNVLLNEACSSGCGSFIETFAKSLNYSVADFAKVALFAPSPIDLGSR CTVFMNSKVKQAQKEGASVGDISAGLAYSVIRNALLKVIKLTDPKQLGKKIVVQGGTFYN DAVLRSFERISGCHAVRPDIAGIMGAFGAALIAREREEETGDTQMLSIDEIINLEYSTSM SRCQGCNNHCILTINKFNNDRQFISGNRCERGLGLEKNHEHVPNLYEYKYKRIFGYKALK PDEAKRGVIGIPRVLNMYENYPFWYTFFTNLSFRVLLSPKSSRKIYELGIESIPSESECY PAKIAHGHVMWLLKKGINTIFYPCVPYEEDQMEGANNHYNCPIVTSYAENIKNNMEELRD PNVLFMNPFLALDNADALKVRLYDELSKHFDVTKEEIDHAVDEAYKEAAQVRKDVQAKGE ETLAYLAETGKMGIVLCGRPYHIDPEINHGIPELINSFGIAVLSEDSICHLGHVERPLIV SDQWMYHSRLYAAANYAKKNKQLEVIQLNSFGCGLDAVTTDQVNDIMTNSGRIYTVLKID EVNNLGAARIRIRSLIAAVKIRNRKHIEPHPVPANIEKVEFTKEMKKDYTILIPQMSPIH FDLLTPAFRHCGYNVEILPEMGKQAVDTGLKYVNNDACYPSLIVVGQIMTALTSGKYDLN 
KTAVIISQTGGGCRATNYIGFIRRALAKNGYGQIPVISLSVQGIEKNEGFKFTPALLVRA LQGVIYGDLFMRVLYRVRPYEKVKGSANHLHEVWKKRCIKSLEKARYGEFKKNIRGIIRD FDRLPITDEKKPRVGIVGEILVKFLPDANNHLVELLEEEGAEAVMPDLLDFMFYSFYNSN FKAEYLGKSTKAAKLCNMAISALEAYRGFLRKELERSDRFTAPPHIEELAEYAKPIVSLG NQTGEGWFLTGEMVELIKQGAPNIVCTQPFACLPNHVVGKGVIKELRRKYPESNIVAIDY DPGASEVNQVNRIKLMLATAEKKLKSDLPQAR >gi|222441878|gb|ACEP01000064.1| GENE 100 113272 - 113643 349 123 aa, chain + ## HITS:1 COG:no KEGG:Hore_11320 NR:ns ## KEGG: Hore_11320 # Name: not_defined # Def: peptidoglycan-binding LysM # Organism: H.orenii # Pathway: not_defined # 15 122 63 173 173 63 36.0 3e-09 MSEKVYVNICDTKKCPGPVYAVKTGDTLYSIAQRHHCRVRTLLDLNPFVDIYNLQPGEEI CVPDCRGQGKVDFRPYVVKEGDTLKIILKNVSMTFEELAKVNRILYNLTVPAGTILLVPA KKE >gi|222441878|gb|ACEP01000064.1| GENE 101 113727 - 113966 93 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLNWSAVVLSSSAVGLVRAIQHPSSATATLRIPESVVLKILIMRLPSFNSLSYLSLHLLY LGKTKERVRFADSYITETV >gi|222441878|gb|ACEP01000064.1| GENE 102 113839 - 114777 985 312 aa, chain + ## HITS:1 COG:CAC0294 KEGG:ns NR:ns ## COG: CAC0294 COG0598 # Protein_GI_number: 15893586 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Clostridium acetobutylicum # 1 311 1 315 315 303 50.0 2e-82 MIRIFKTTDSGIRKVAVAEEGCWIALTNPTAEELSTTADQFNIDVDDLKSPLDEEERSRI QVEENYTLIILDIPTIEERKGREYFYTIPFGIIFTDKYIFTVCLVDSPVLTVFMDGRVRD FYTYKRTRFIYQMLYRNATMFLQYLKIIDKKSDEVEKKLHISQRNQELIEMLDLEKSLVY FTTSLRGNEVVLEKLMRNTSIPRYEDDEELLEDVIIENKQAIEMAKIYSDILSGTMDAFA SVISNNLNIAMKFLSVITIVLTVPTIVTSALGMNVAGIPFANNPYGFWIVVGISLLLTLI AGAIIAKNKHFK >gi|222441878|gb|ACEP01000064.1| GENE 103 114924 - 115739 498 271 aa, chain + ## HITS:1 COG:no KEGG:Closa_0669 NR:ns ## KEGG: Closa_0669 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 270 1 269 281 230 42.0 5e-59 MNWLDKLERKFGRYAIRNLPLYIVMLYAIGAVLNILTKGSYYYYVALNPYMILHGQIWRL LGFLIVPPNASLIFIIFVLMFYYSIGESLVQVWGAFRFNMYILIGVLATIVASFIVYFIY 
PSTMIFMDTFYLNLSMFLAYATIFPEMRVYLYGILPLKVKWMAWFDVALLVWQFVVGGLG ARIAIVVSLLNFLLFYFSSRNYKKVSPKEIHRKRAYRKASSVASNKPYRHKCAVCGKTEL DDPNLEFRYCSKCNGNYEYCNEHLFTHTHVK >gi|222441878|gb|ACEP01000064.1| GENE 104 115827 - 116165 341 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027118|ref|ZP_03716310.1| ## NR: gi|225027118|ref|ZP_03716310.1| hypothetical protein EUBHAL_01374 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01374 [Eubacterium hallii DSM 3353] # 1 112 1 112 112 221 100.0 2e-56 MTGGGLLLTISEHIMIGKTISHDAKAILRGVRHPFIWKKGLWYGITTSCGKDVIMYILSS REMRQKFYQNNDLKLLGLAGSMKEAYEIVLKLVQMGYNSDCIYEMNDYLEKF >gi|222441878|gb|ACEP01000064.1| GENE 105 116181 - 117110 665 309 aa, chain + ## HITS:1 COG:no KEGG:Closa_1954 NR:ns ## KEGG: Closa_1954 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 50 304 52 325 329 75 29.0 3e-12 MLYVLLLLLKILGILLLLFIGLVLLVLLTPIQYSFELEKEEQKGPKFMVRVTWLFWLFYF KTSYIEKVFDYRIRILGYQIAGNQPEFLEKQKERQKKKSQKEKAKEERKTAETLAEGKSR SIDSVSKQKAEKAVPKEKSSDVSEGREKTKEKDSGDDKNSKDKKKNKERKQKDGKKPKKE SSLKKIKSKIKSLKETKVNLDKMPWREWLELGKDVLIRFLKHALPGKLEGNIAFGTGNPE YTGYITGIAAVFYPKYGEHFSLYPDFERKMFEAKCRGRGRIRLGYMLVLAVSILKEKSVR TMIKNIILG >gi|222441878|gb|ACEP01000064.1| GENE 106 117126 - 117566 672 146 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00954 NR:ns ## KEGG: EUBELI_00954 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 133 29 157 157 132 55.0 6e-30 MDKQNFDTTVNSLFSGMDHVLSSKTVVGEPQVIGDTIILPLVDISFGMGAGAFNKEKGSS AGGGIGGKMSPSAILVIHNGQTKLVNIRDQDAVTKVLDMVPDAVNKVVSIIKGEDAEDKV DPEVESAVNKAAQAEASVVDVDGAGI >gi|222441878|gb|ACEP01000064.1| GENE 107 117812 - 117997 261 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881022|ref|YP_001559990.1| ribosomal protein L28 [Clostridium phytofermentans ISDg] # 1 61 1 65 65 105 76 2e-21 MARCAICDKGAHFGIKVSHSHRRSNKMWKSNIKSVRVKVNGASKKMYVCTSCLRSGKVER A >gi|222441878|gb|ACEP01000064.1| GENE 108 118408 - 118764 469 118 aa, chain + ## HITS:1 COG:BS_yloU 
KEGG:ns NR:ns ## COG: BS_yloU COG1302 # Protein_GI_number: 16078646 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 118 1 119 120 95 44.0 2e-20 MKGQMHTDLGQVIIDEEVIATYAGINAIECFGIVGMASVNMKDGFAQLLKRDNVSKGVHI KINDNKIYIDFHIIVVYGVSIATVTENLIQNVKYRVEKFTGMSVEKINIYVEGVRIVD >gi|222441878|gb|ACEP01000064.1| GENE 109 118933 - 120642 2265 569 aa, chain + ## HITS:1 COG:CAC1735 KEGG:ns NR:ns ## COG: CAC1735 COG1461 # Protein_GI_number: 15895012 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Clostridium acetobutylicum # 1 569 1 547 547 451 44.0 1e-126 MNSHTIDAKTFQKLFLAGANRINSRKDYINELNVFPVPDGDTGTNMSLTILAAAKEVGAV AEPTMATICKAISNGSLRGARGNSGVILSQLFRGFTKAVSKADSLGKDEITAGFTRAVET AYKAVMKPKEGTILTVAKGMADKAEELVDEDMDVIAFCEAIVAYGYEVLSKTPDMLPVLK EAGVVDSGGQGLMEVLQGIVDALTGKVTDIEVPAEAASMSSAVISAPKADISTADIKFGY CTEFIILLDKPLTKEDEKGLKKFFLSIGDSLVLVADEEICKVHVHTNHPGEAFEKALTFG ALSNMKIDNMRLEHQEKLIKDAENEAARQRQQEAEEEKLAAKEAAAEKPAKEVGFIAVSV GEGLTDIFKGLGVDYLIEGGQTMNPSTEDMLNAIEQVNAKNIFILPNNKNIILAAEQARD LTEDKKIIVLPTKTIPQGISAMIGYMPEAGVDENEENMKEAYQDIASGQVTYAVRDTVID GKEIHNGDIMGINDDGIAAVGKNLTETTLELLSTMADEDSELICLYYGADVTKDDADTLS DAIEDKFPECEVEVNFGGQPIYYYMISVE >gi|222441878|gb|ACEP01000064.1| GENE 110 120778 - 122829 1891 683 aa, chain + ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 5 666 128 808 831 507 40.0 1e-143 MMHVEEGESQLGIPVREIKGIGEKTEKLLAKLDIETVDQLVHHYPRCYTTYPEPISISEI KTGQRCSIEAEIASPIHLKTVRKLKLCTGLIADVSGQLFVRWFNMPYLRNTLKQGEQWIF TGTPIYKDGRLMLEQPEYCKREKYLQLMETFQPIYPLTTGLSNKTVQKAQAAAFEMYREE EYLPETVRNYYDLEPVNTALREVHFPTGTEHLMEAKKRIIFDEFFRFFSALELVKDREEQ ALNHYIIPMEEPVKRFVENLPYPLTGAQKKVLNEIRQDFSDTMAMNRLLQGDVGSGKTIV AMTAMYAAVLAGYQAALMAPTEVLAEQHYQNFVKLLSPLGITVALLTGSTKAKEKREIKA 
ACASGEIQILIGTHAVIQDDVAFDNLAFIVTDEQHRFGVKQRDAFMKKGKDPHVLVMSAT PIPRTLGIILYRDLDVSIMNEMPASRLPIKNSVVGTSYRPAAWEFIRKQVALGHQAYVIC PMIEENEKMDLENVEEYARMLSQALPPSITVEALNGHMRPAEKNDIMERFSKNEIQILVS TTVVEVGIDVPNATVMLIENAERFGLAQLHQLRGRVGRGKAQSYCVFISGSEKEEAMERL SIIGHSNDGFEIANEDLKLRGPGEFFGVKQSGTMNFALGDIYSNADILKMASEAVDYLKK EGYNFQKLHQYSLEKNLNFAKNI >gi|222441878|gb|ACEP01000064.1| GENE 111 122904 - 123626 720 240 aa, chain + ## HITS:1 COG:lin1914 KEGG:ns NR:ns ## COG: lin1914 COG2365 # Protein_GI_number: 16800980 # Func_class: T Signal transduction mechanisms # Function: Protein tyrosine/serine phosphatase # Organism: Listeria innocua # 10 235 44 293 298 92 28.0 7e-19 MKRKPYLNYTRLPMENAYNVRELGGYATKKGSVTIHHQFLRSENLTDITEEDKTFLIEYG LSGIIDLRSREEALIYPNPFRGNNAVKYINCPLITDGILDLRAVKEVGFDPGEFYVKLVE YKEMLYKIFHFILDNIDGCLLFHCQAGKDRTGVLAMILMGLAGVSKEDIVANYEVTHTYL KENVKLRLEDGLEELEFSKPQWIEKAYDHIIEHYGSFKVYLMAVGLTKKEIKKIKGKLAG >gi|222441878|gb|ACEP01000064.1| GENE 112 123786 - 124343 285 185 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 153 12 164 199 114 40 3e-24 IMRVIAGSARHLKLKTIEGMGTRPTTDRIKETLFNMLSFYVEESRFLDLFSGSGGIGIEA LSRGASQAVFVEQNRKAAACIEENLNHTHLREKAVVMSKDVMTALRILEDKKQAFDYIFM DPPYGKLLEKEAVLYLDGSVLCDENTTIIIESDLDTEFSWVMDTGFTITKEKIYKTNKHT FLQKK >gi|222441878|gb|ACEP01000064.1| GENE 113 124383 - 124865 440 160 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 3 160 5 162 164 174 51 3e-42 MATAVYPGSFDPITLGHLDIIKRTAAVFDKVIIGVLINKAKKPLFSIEERVELIKEVTKN IPNVEIVSFNGLLIEFSDKMKADVIVRGIRAVSDFEYELMMAQTNKQLNPNIETMFFATS AEYSFVSSSMIRELAAFDGDITPFVPEEVSKRVYEKYKQS >gi|222441878|gb|ACEP01000064.1| GENE 114 124887 - 125504 808 205 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01034 NR:ns ## KEGG: EUBELI_01034 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 179 4 175 203 82 33.0 8e-15 
MANMSIEELIEEIEVYVDNCKTAGVLSSGSMIKINREELLAMLDEVRTRLPKELAESRQI IKSKESIIADAKARAERIVKDAAKEAGVMIDDNEIVALANMRADSIVKDAQKAADELNGK ARETAREVQTGALQYTQNMLEGLEYMYSTIIKEEKEYFNSVLEKLKSEHKQIVADKQEID LQLGAGIRTGRRKEDFEKKEERAEE >gi|222441878|gb|ACEP01000064.1| GENE 115 125640 - 125945 116 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027130|ref|ZP_03716322.1| ## NR: gi|225027130|ref|ZP_03716322.1| hypothetical protein EUBHAL_01386 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01386 [Eubacterium hallii DSM 3353] # 1 101 211 311 311 171 100.0 2e-41 MIFSILSGMLTSFPLFQAPVFTFVLANLEVTTGIHLLALKPFITPQIQYALIAAATSFGG LCTMAQVQTVLSATDLSLKRYIVIKTGTAFVSFLLCLILLC >gi|222441878|gb|ACEP01000064.1| GENE 116 125968 - 126111 68 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MISREGRKTSSDKERAGGFLFESFVTDFAESFLAKEDELACLRFFCL >gi|222441878|gb|ACEP01000064.1| GENE 117 126754 - 127272 234 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|170754849|ref|YP_001782001.1| ribosomal protein L32 family protein [Clostridium botulinum B1 str. 
Okra] # 1 167 1 163 166 94 30 3e-18 MKINISEILSNPSVIKNFKVEPDFEKLKLRRGSYPVSYKEPFLLTVSKAEDKLHVTGETT IRLIIPCDRCLEDVENTFHITIDRSVNPNTESDGIEDVDELSFIDGYMLDVDKLIMDEIV VALPTKVLCKEDCKGLCSICGTNLNNHTCDCHKESLDPRMAAIQDIFRDFNQ >gi|222441878|gb|ACEP01000064.1| GENE 118 127352 - 127531 276 59 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 [Oribacterium sinus F0268] # 1 59 1 59 60 110 83 4e-23 MSICPKNKSSKGRRDRRRANWKMSAPTLVKCSKCGALTMPHRVCKACGSYNKKEIVSVD >gi|222441878|gb|ACEP01000064.1| GENE 119 127581 - 127727 63 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIDRPDIMSGLFFVYRLYSKNAFWFLQCIFLEAITKLINFHINKEVVA >gi|222441878|gb|ACEP01000064.1| GENE 120 127866 - 128639 1131 257 aa, chain + ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 1 248 1 248 248 237 55.0 2e-62 MEHTKLTELLNKVAQGEVSVEKAALELKTEPFEDLGFAKLDHHRKIRQGAAEVIYGAGKT PEQILKITEAFRKKGDNAVLITRMSQEAADLVGASLPLRYDALSRTGIVGELPEKDGNGK VVIATGGTSDLPVAEEAALTAEVLGNEVVRIYDVGVAGIHRLLAYSEDLMSAQVIVVVAG MEGALASVVGGLADCPVIAVPTSVGYGASFGGLSALLSMLNSCASGVSVVNIDNGFGAGY MASMINHVSMKNAENLN >gi|222441878|gb|ACEP01000064.1| GENE 121 128683 - 129972 1719 429 aa, chain + ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 426 1 416 420 272 38.0 9e-73 MKTLYLECNMGAAGDMLMAALTELMPDKDAFVKELNDLQIPQVNIERTTIAKCGIMGTHM VVTVNGEEEKSIDVPMHAHTHSHVHEEHEHSHEGHTHAHGHSHHHTSMKDIEHIIGHLNV PEKVKEDAIAVYKLIADAESHAHGRPVEEVHFHEVGAMDAVADIVGVCLAIYKLAPEQII ASPVHVGYGQIHCAHGILPIPAPATAHILQGIPIYGGRIEGELCTPTGAALLKHFAQSFG QMPMMAVEQTGYGMGMKDFTDANCLRAIIGNTVEGQEQTGCHGAVQEMDSIIELCCNLDD MTPEKIGFVTELLMEEGAFDVYTTNIQMKKNRPAVMLTCMCAKEDREKFLTLILKHTTTL GVREYTCKRYGLKREIREVETIYGTVRVKAASGYGITKEKPEYEDMARIAKEKKISLAEI 
EKEVYKNLK >gi|222441878|gb|ACEP01000064.1| GENE 122 130347 - 131477 1124 376 aa, chain + ## HITS:1 COG:TM0034 KEGG:ns NR:ns ## COG: TM0034 COG2768 # Protein_GI_number: 15642809 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Thermotoga maritima # 4 373 3 350 357 327 46.0 3e-89 MEKSKVYFTDFRAKLGEGLPLKLKRLLKKAGISEIDMENKFVAIKMHFGELGNVSFLRPN YAKAVADVIKELGGKPFLTDCNTLYPGSRKNALEHLQCAWENGFTAMTVGCPILIGDGLK GTDDIEVPVEGVEYIKSAKIGRAIMDADIFISLSHFKGHETTGFGGAIKNIGMGCGSRAG KKEQHTNGQPTIHEDMCRGCRRCMRECANNGLVFDEETKKMHIDGANCVGCGRCIGACNF DAIEFENWAATKDLNCRMAEYTKAVVDGRPNFHISLVVDVSPNCDCHAENDAPILPNIGM FASFDPLALDQACVDACLKMQPLPNSQLSENMHKEGFCDHHDHFENSTPESEYKTCLEHA EKIGLGSREYELIVMK >gi|222441878|gb|ACEP01000064.1| GENE 123 131774 - 131974 268 66 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0225 NR:ns ## KEGG: EUBREC_0225 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Sulfur relay system [PATH:ere04122] # 1 64 1 65 65 70 58.0 2e-11 MVRINGQDCEKTGISVSDYLEQENYNVKHIVVEYNLEIIPKERYAQTILKDGDEVEIVSF MGGGSK >gi|222441878|gb|ACEP01000064.1| GENE 124 132117 - 132887 1172 256 aa, chain + ## HITS:1 COG:FN1754 KEGG:ns NR:ns ## COG: FN1754 COG2022 # Protein_GI_number: 19705075 # Func_class: H Coenzyme transport and metabolism # Function: Uncharacterized enzyme of thiazole biosynthesis # Organism: Fusobacterium nucleatum # 4 256 2 254 257 363 73.0 1e-100 MKTKDKLVIGGHEFDSRFILGSGKYSMKLIEAAVRDAGAQIITLAVRRANTKDHENILDY IPEGVTLLPNTSGARNAQEAVRIARLARELGCGNFVKVEIMRDSKYLLPDNQETIKATEI LAKEGFVVMPYMYPDLNAARDLVNAGAACVMPLASPIGSNKGLATKEFIQILIDEIDLPI IVDAGIGKPSQACEAMEMGAAAIMANTALATAGDLPMMAAAFRQAIEAGRKAYLSGLGRV LTRGASASDPLTGFLH >gi|222441878|gb|ACEP01000064.1| GENE 125 133171 - 134424 1153 417 aa, chain + ## HITS:1 COG:FN1753 KEGG:ns NR:ns ## COG: FN1753 COG1060 # Protein_GI_number: 19705074 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related 
uncharacterized enzymes # Organism: Fusobacterium nucleatum # 40 417 3 374 376 469 59.0 1e-132 MENQFLIDSEYLSAEALEKKHRLETDPSSRKSHMEYMPGMEQITSDVMEKVMSQMNSYDY EKYTAKDVKAALEHETCSIEDFKALLSPAAAPFLEQMAQKAKIETSKHFGNTVYLFTPLY IANYCENYCVYCGFNCYNHINRMQLNMEQIAHEMKVIADSGMEEILILTGESRAKSDVKY IGEACKLARKYFRMVGLEIYPVNTNEYRYLHECGADYVTVFQETYDSDKYETLHLMGHKR VWPYRFEAQERALMAGMRGAGFSALLGLSDFRKDALASALHVYYLQRKYPQAELSLSCPR LRPIINNDKINPLDVGEKQLCQVLCAYRIFLPFVGITVSSRESEIFRNGIVKIAATKVSA GVSTGIGDHESKYTGKEEDGEQGDEQFEINDGRSFQKMYEDISEEGLQPVLNDYLYV >gi|222441878|gb|ACEP01000064.1| GENE 126 134417 - 135154 168 245 aa, chain + ## HITS:1 COG:Cj1043c KEGG:ns NR:ns ## COG: Cj1043c COG0352 # Protein_GI_number: 15792370 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Campylobacter jejuni # 57 237 6 185 201 146 44.0 3e-35 MCKRRNQNSTKDLKQQLIKNGTYQNRDYQGKNFEKKNTTGGEFNSILKNSVLEKSDILCI TNRKLCKDDFLKRIQIIAAAQPKAIVLREKDLSEEAYTILAEKVMHICEKYSVPCILHSF AKAAMALNVKAIHMPLPLLRKMTPQEKNHFEIIGASCHSLEEAKEAERLGCTYITAGHIF LTDCKKGLPGRGLTFLQNICENVSIPVYAIGGISNENINDVRQTGAAGACIMSGFMKCNN IAEII >gi|222441878|gb|ACEP01000064.1| GENE 127 135317 - 136093 720 258 aa, chain + ## HITS:1 COG:alr2616 KEGG:ns NR:ns ## COG: alr2616 COG2510 # Protein_GI_number: 17230108 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Nostoc sp. 
PCC 7120 # 124 255 10 141 143 103 47.0 3e-22 MATAIRTIVVLIFAWLMVFVTGAEAGISNISSRTLLFLVLSGLATGASWLCYFHALQKGD VNKVVPIDKSSTILTILLALIFLQEGLSLGKGVGIVLIGVGTMLMITRKGMANQEKSSDG SWFIYAVFSAVFASLTAILGKAGIEGVDSNLGTAIRTTVVLVMAWFMVFVTGEQKKVRTV EKIELLFICLSGLATGASWLCYYRALQEGPASVVVPIDKLSMLITIAFSYIVFHEKLTKK AAFGVVLITAGTILLAMM >gi|222441878|gb|ACEP01000064.1| GENE 128 136717 - 138045 400 442 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 15 441 7 422 447 158 29 1e-37 MSRPTNNMSEELFKVDGKTSVGETLPLALQHVVAMIVGCVTPAIIVAGVAGLSEQDSVIL IQAALTMSAITTLIQVFPLLRTGKIAIGSGLPVIMGISFAYVPTMQAIAGRFDIATILGA QIVGGIVAVFVGIFIKQIRKFFPPLITGTVVFAIGLSLYPTAINYMAGGASNESYGSWEN WLVAIITLCIVTALNHYGKGIWKLASILIGIIAGYIISLFFGMVDFSAVVNASWFALPKP MHFGIKFEPSSCVAIGVLFAINSIQAIGDFSATTTGGLDRMPTDEELNGGIVGYGLSNIV CAIFGGLPTATYSQNVGIVGSTKVVAKRVFETSAIIILIAGLIPKFSSVLTTIPYCVLGG ATVSVFASIAMTGIKLITTAPMDFRNTTVVGLSIALGMGVTQANAALATFPAWVTTIFGK SPVVLATITAVFLNLVLPKNSD >gi|222441878|gb|ACEP01000064.1| GENE 129 138398 - 138841 376 147 aa, chain + ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 7 143 7 141 143 67 34.0 1e-11 MKREEMIGFIVRSLDNMMMRNAVGDEKSEKKGMMPVMQGWIIGYLYDHKEEEIFQRDIEA EFYIARSTVTCLVKQMEQKGYIARVAVERDCRLKRLCLLEKGEKLHECFLQNIDNVEKKV REGIDEEELQTFFKVAKQIRENLEQMQ >gi|222441878|gb|ACEP01000064.1| GENE 130 138914 - 140662 1969 582 aa, chain + ## HITS:1 COG:CAC3414 KEGG:ns NR:ns ## COG: CAC3414 COG1132 # Protein_GI_number: 15896655 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 1 572 1 570 577 692 61.0 0 MIKTLAGQVKEFKKDSIITPIFMILEVIMEMIIPLLMSSIIDDGVEKGDIHHIYIIGIWM VVAAGIGLFAGIMGGKYGARASTGFARNLRQAMFENIQTFSFSNIDKFSTPSLVTRMTTD VTNLQNSYQMLLRMCVRAPFSLICAMIMSFSVNVKLASIYLVAVIILGALLAIIIWRAMG 
YFTAMFKKYDDLNESVQENVSAIRVVKAYVREKYEKSRFTKASGNVYRMAVKAERIVSFN APLMMLIVYSCILGISWLGAKMIVQSALTTGELMMLLAYCMSILMNLMMLSMVFVMLSMS VASAERVSEVLNETPDIVNPENPVYKVKNGSIVFDHVSFRYNKEGKKNVLNDIDLSIKEG ETIGIIGGTGSAKSSLVNLISRLYDVSEGRLLVGGKDVREYDIETLRDEVSVVLQKNVLF SGTILDNLRWGDENATEEECKRACHLACADEFIDRMPDGYHTYIEQGGSNVSGGQKQRLC IARALLKKPKILILDDSTSAVDTATDAKIREAFAKEIPDTTKLIIAQRVSSIQSADRIIV MDNGEISGFDTHENLMKSNEIYRDVYESQTKGGGDFDENGGE >gi|222441878|gb|ACEP01000064.1| GENE 131 140665 - 142557 2104 630 aa, chain + ## HITS:1 COG:CAC3415 KEGG:ns NR:ns ## COG: CAC3415 COG1132 # Protein_GI_number: 15896656 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 8 627 4 622 627 810 64.0 0 MARIKNMKNAKKIENPGVLLKRLMGYIMKHYGAAFVTVLICILISVICNVQGTMFMQTLI DDYIVPMLKDGSRDFSGLAHAITRVAGFYAIGVAVTLLYSQIMVNITQGTLRSLRDDLFT HMQDLPIKYFDTHAHGDIMSVYTNDIDTLRQMISQSIPQIFNSGIMVVSIIVCMLVLSVP LTVVSLAMVGIMVYLSKRIATLSGKYFVQQQKDLGAANGYIEEMMQGQKVVKVFCHEEKS IEEFKELNDKLFHSADNANKFANIAMPVNAQVGNISYVIVAIVGGILAINGIGSFTLGGL ASFLTFNKSLNMPIGQVSQQVNFVAMALAGAQRIFELLDKEPEVDEGYVTLVNATEDADG TIHETDKHTGTWAWKHFHKEDGTTTYAKLLGDVEFLGVDFGYDEKKIVLHDIKMFATPGQ KIAFVGSTGAGKTTITNLINRFYDIQDGKIRYDGININKIKKADLRHSLGMVLQDTHLFT ASVKDNIRYGKLDATDEEVIEAAKLANADSFIERLPQGYDTVLTGDGANLSQGQRQLLSI ARAAIADPPVLILDEATSSIDTRTEKIVQEGMDRLMHGRTTFVIAHRLSTIKNSDCIMVL EQGRIIERGSHDDLIEEQGRYYQLYTGNKA >gi|222441878|gb|ACEP01000064.1| GENE 132 142570 - 143973 991 467 aa, chain - ## HITS:1 COG:FN1462 KEGG:ns NR:ns ## COG: FN1462 COG1167 # Protein_GI_number: 19704794 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Fusobacterium nucleatum # 1 457 1 456 469 308 35.0 1e-83 MLTYSFTNLKNESLYMHLYQCIRQDIVDGKLVSGEKLPSKRSFAKNLGVSTITIENAYAQ LLSEGYIYSIPKKGFYVSDFKNNVIKPVKLTTENVKLSSGQSDFLADFSSNQTRPENFPF 
SIWAKLMRETFNTNSPELMTKPPCGGIMGLRQAIATHLEQFRGMHVQPEQIFIGAGTEYL YGLLIQLLGFDKKYAIENPGYEKIAAIYNSYNVAYHYIPMDNNGILVDELEKSEADVVHI SPSHHFPTGIVTPISRRYELLGWAAASTDRYIIEDDYDSEFRLTGKPIPSLQSMDITQKV IYINTFTKSLSSTMRISYMVLPPKLANRFLERLSFYSCTVSNFEQYALMRFINEGCFEKH INRMRNFYHKQRDSLLDAIKNSPLASYVTIMEEDSGLHFLLKVNTELSDEELMQRALQKG VKLNSLSAYYHDSPDDFAAHTFIINYSYLNTTGVEDAIKVLYDVIKK >gi|222441878|gb|ACEP01000064.1| GENE 133 144278 - 145492 1409 404 aa, chain + ## HITS:1 COG:NMA0365 KEGG:ns NR:ns ## COG: NMA0365 COG1457 # Protein_GI_number: 15793373 # Func_class: F Nucleotide transport and metabolism # Function: Purine-cytosine permease and related proteins # Organism: Neisseria meningitidis Z2491 # 12 403 15 404 437 261 43.0 2e-69 MEENKTSVLSNGLIWFGAGVSMSEILTGTYFAPLGMKTGLLAILIGHVIGCILLFLAGYL GAKTNNNAMETVAVSFGRKGNLPFSILNILQLAGWLAILNYDGALAANGIFSAGAGVWCA IIGVLLLVWVLVGLKRLEKINMVAMTALFLMTVALCFVIVTKGNAGAFLNGGSVSGETLS FGAAIELAASMPLSWIPVISDYTSNSTKPVKATTVSTIVYGLVSCWMYIIGMSAAIGMGT SDIAQISLRAGLGVVGLLIIVFSTVTTNFLAAHSAGVSGEVVGESFSKKINGKYISVVVV IVGTIAAILYPMDNIEEFLYLINSVFAPMIAVMIADFFFNKKRRAVKEWDWANLAVWLIG LILYRALMHVDIIIGYTLPDIVITAVLCMVVHRILPQNHLDKAI >gi|222441878|gb|ACEP01000064.1| GENE 134 145584 - 146216 659 210 aa, chain + ## HITS:1 COG:YBR111c KEGG:ns NR:ns ## COG: YBR111c COG0494 # Protein_GI_number: 6319587 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Saccharomyces cerevisiae # 54 152 78 168 231 59 38.0 3e-09 MYNGKMETIYESEYMNLYDLQYREGGHYYCASRRNKDRMVALTPDEECGTMQPDAVSCFV VLNIKGQPKKLLLNWEYRYPVGQYMLSVPAGLIDKGDWNNPNALVDTAIRELKEETGIEV EESDEIKVVSPCVFSTPGMTDEGNALVYISINRDEMPKMNHDGAEGSEVFEGFRLVSKED AEEYLCKGRDEHGMYYPLFTWAALMFFMGV >gi|222441878|gb|ACEP01000064.1| GENE 135 146259 - 146777 508 172 aa, chain + ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization 
protein # Organism: Salmonella typhimurium LT2 # 1 146 1 147 150 117 41.0 7e-27 MRRILLIGRSESGKTTLKQVLRGENITYKKTQYVSQYDCVIDMPGEYAENLDLAHTIDLY AAESDVVGIVLSATEPYSLYPPNITPLGNRDEVGIVTKIDHWAANPQQAAEWLKLTGCKK IFFTSAYTGEGIEEIFEFLREEHDICPVEKVMDALHKESFGKQAELPHYHLQ >gi|222441878|gb|ACEP01000064.1| GENE 136 146982 - 147911 1035 309 aa, chain + ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 16 306 14 314 318 181 38.0 1e-45 MILQILLLCVGFVLLVKGADFFVDGAAGIAGKLGIPQLIVGLTIVAMGTSLPEAAVSITS ALKGSAGITIGNVVGSNIMNVLVILGITAVITNVAIQKSTLYCEIPFMIGITALMIVFGI TRRAVSFPEGVIFWILFIAFLAYLFVMSKKNKAAEEEDECQMSYLKCIIFIIAGGIMVVW GSDLVVDAASEIARAFGMSERFIGLTIVAFGTSLPELVTSVTAAKSGNAGIAIGNIVGSN IFNILFVIGTTALICNVPFESKFLIDSAVAVAAGVILWLGTIRHKELRRPCGVVMLVAYV GYFAYLCMN >gi|222441878|gb|ACEP01000064.1| GENE 137 148105 - 148980 540 291 aa, chain - ## HITS:1 COG:L161266 KEGG:ns NR:ns ## COG: L161266 COG0791 # Protein_GI_number: 15672918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Lactococcus lactis # 145 275 67 190 197 63 35.0 5e-10 MLPVTFVHAQRLYTTDGVNVRAKPNSSSKVLTSVSAGTSVTKTGRSGNWIAVRVNGIKGY IYKSYLSGSKNTSTATVSKSTSYRAVITASSVNLRAKPSFSSRVKGSLSAGQAVTVCSTN GSWKKVQTSKGKKGYVYGIYVRKSASSSKTTAKKTTTASSSSYRTKAVSYAKRRVGDKYS QSRRNQKGYADCSSLMRDAYKSASGKYIGNTTYEQLELMHSYLYSISSVRQATVGDLVFH QNGSNHVGIYLGNGKVLHASQTAGKVKISTFSSSSRYWDTGCNAAGYCVKH >gi|222441878|gb|ACEP01000064.1| GENE 138 149268 - 150368 893 366 aa, chain + ## HITS:1 COG:no KEGG:FN1582 NR:ns ## KEGG: FN1582 # Name: not_defined # Def: hypothetical protein # Organism: F.nucleatum # Pathway: not_defined # 103 366 157 456 456 176 36.0 1e-42 MKKKYIAWILIVSILFLGRGADAMAKTSLRKPLGVLANGVDGQVVLSWESVQGADGYEVF EKTEEENVFLKLRATKQCKLVFKNKKRGSVYQYKVRAYKIYKKNGEEKRIYSSFGAVART TVAKNSTSTIKNFLTTALAPVGSTMYIWGGGWNKADTGAGTDGLRIGLNASWRKFCAKQK 
ASYNYRKHRFAFGNGLDCSGFVGWSVYNINKTKNGKQGQGYVTKASKAASTYAKYGWGTY KKTIAVKDWKPGDIMSSPTHIYIVVGSCADGSVVLVHSSPAGVRLSGTPDKKGRTNSEAV KLAKKYMKKYYPSWYKRYPSCKKDKSYLTDYAQFRWRAGKGKMISDPDGYQKKNARQVLK DLYSDK >gi|222441878|gb|ACEP01000064.1| GENE 139 150689 - 151342 960 217 aa, chain + ## HITS:1 COG:FN0081 KEGG:ns NR:ns ## COG: FN0081 COG4816 # Protein_GI_number: 19703433 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 7 216 8 217 217 152 40.0 4e-37 MKRRIDAHVVMEKLVPRITEEYRKIYDIPENHESMAIFSADCEDVMWLAVDDATKKAKIK VIQIETVYGGVDYSWSRYGGEITAIISGEKVADVKSGLQYAKDYIENKSGNYSLNEEGTL GYYVDYVPRIGKYYQESLGLSEGTSIAYLVSAPVESMYGLDKALKAADVKIAELSEIPSR VNTGGAIVYGTESACRSAVEAFAEGVEYCAFHPMDIE >gi|222441878|gb|ACEP01000064.1| GENE 140 151524 - 152177 821 217 aa, chain + ## HITS:1 COG:FN0081 KEGG:ns NR:ns ## COG: FN0081 COG4816 # Protein_GI_number: 19703433 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 6 217 6 217 217 146 37.0 3e-35 MKIVKMETHMLSQKVIANVTPEFARVYNIPEGHTSVGFFSADNDDIGYLAVDDASKKANI KLIHAETYYGGTTCSWSKYGGSVFVLFSGPKVEDVKSGLRYVRDFIENHSELCNFDGDEG TAFYAQTIPRPGKYFQEWCELKPGESYAYLVGGPIETNYALDKALKAGNTRVARYWYPPS HANSSGAVLAGTESACRAATNAFVEALEYAIKNPLEV >gi|222441878|gb|ACEP01000064.1| GENE 141 152711 - 154462 2405 583 aa, chain + ## HITS:1 COG:RC0187 KEGG:ns NR:ns ## COG: RC0187 COG0173 # Protein_GI_number: 15892110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Rickettsia conorii # 7 579 17 602 615 602 51.0 1e-172 MNIANIYRDRTLDMLSEENVGETVRVAGWVENIRDHGGVSFIDLRDMYGVLQVVMRDTTL LEGINKEDCISLEGVIEHRDEETYNPKIPTGTIELEAHKVDMLGKVYKQLPFEVMTSKEN HENVRLKYRYLDLRNKKVKDNMIFRSKVISFLRQKMTDMGFLEIQTPILCASSPEGARDY IVPSRKYKGKFYALPQAPQQYKQLLMVSGFDKYFQIAPCFRDEDARADRSPGEFYQLDFE MSFATQEDVFRVGEEVLTATFEKFAPEGSVVTAAPYPVISYKQAMLEFGSDKPDLRNPLR IVDVTDFFQRCTFKPFHGQTVRAIKVSNKMSKGFHEKLLKFATSIGMGGLGYLEVKEDLS 
YKGPIDKFIPDDMKAEIREIASLKAGDTIFFIADKEKQAALYAGQLRNELGQRLDLLEKN AYRFCFINDFPMYEYDEEQKKIIFTHNPFSMPQGGLEALTTKDPLELLAYQYDIVCNGVE LSSGAVRNHNLDIMVKAFEIAGYTEEDLKEKFGALYNAFQYGAPPHAGMAPGVDRMIMLL RNEENIREVIPFPMNGNAQDVMCGAPNEVSEMQLREVHIKVRK >gi|222441878|gb|ACEP01000064.1| GENE 142 155061 - 155183 59 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027159|ref|ZP_03716351.1| ## NR: gi|225027159|ref|ZP_03716351.1| hypothetical protein EUBHAL_01415 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01415 [Eubacterium hallii DSM 3353] # 1 40 1 40 40 63 100.0 5e-09 MCGTFKKVTKKQEKPSIFLFITIKNIYKQKHFLDNDINIS >gi|222441878|gb|ACEP01000064.1| GENE 143 155211 - 157199 2862 662 aa, chain + ## HITS:1 COG:FN2030 KEGG:ns NR:ns ## COG: FN2030 COG3808 # Protein_GI_number: 19705321 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Fusobacterium nucleatum # 1 657 4 669 671 580 58.0 1e-165 MERLIYFAPVLGICALLFAFYLTKKVGKQDAGTDRMKEIAAFIHEGARAFLTAEYKILVV FVAVLFVLIGIGIGNWVTAVCFLVGALFSTAAGYIGMNVATKANVRTAAAAKDSGMNKAL SIAFSGGAVMGMCVVGFGLFGAGVVYILTKNPDVLSGFSLGASSIALFARVGGGIYTKAA DVGADLVGKVEAGIPEDDPRNPAVIADNVGDNVGDVAGMGADLFESYVGSLVSAITLGVV YAKESGAIFPLVIAALGVLASVIGCFFVKGDENSSPHKALKYGSYSAAIVVMIGSLILSK MFFNGFKEAIAIIFGLVVGLLIGVITEIYTSGDYRFVKKIAQQSETGPATTVISGIAVGM QSTAVPIILIAIGIIGAYSFSGLYGIALAAVGMLSTTGITVAVDAYGPIADNAGGIAEMS GLPSEVRNITDKLDAVGNTTAAMGKGFAIGSAALTALALFVSYAQAVGLFEEGINLLDYK VIVGMFVGGMLPFLFSAFTMDSVSKAAYKMIEEVRRQFKTIPGILEGKGKPDYKSCVAIS TQAALKEMLLPGVMAVLAPVFIGVVLGPDALGGLLGGALVTGVMLAIFMSNSGGAWDNAK KYIEDGHHGGKGSEAHRAAVVGDTVGDPFKDTSGPSINILIKLMTIVSLVFAPLFLKIGG LF >gi|222441878|gb|ACEP01000064.1| GENE 144 157540 - 157932 382 130 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c39230 NR:ns ## KEGG: CLJU_c39230 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 4 128 7 131 131 137 55.0 2e-31 MSGELIKVGKYDLKYNKILDIDIAELDIYRSTGLPAHMVKRKHFNCLKYIDDIPDILENP DYVGVNPNEKDKSIEFVKKYSKNVLLGVKLEKEGQYLYISTMYEVQESKLNRRLYSGRLK 
EMSVDNVKKI >gi|222441878|gb|ACEP01000064.1| GENE 145 158298 - 159128 869 276 aa, chain + ## HITS:1 COG:yegX KEGG:ns NR:ns ## COG: yegX COG3757 # Protein_GI_number: 16130040 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lyzozyme M1 (1,4-beta-N-acetylmuramidase) # Organism: Escherichia coli K12 # 82 259 72 247 275 59 29.0 8e-09 MKNKISGKIVALLLTLIIVATSGMPLYAKETGDKANSFRYENGQSIAKAVPFSTLSPNAW KKIDGKYYSGDGSVIEGAVAKGIDVSHHQGTVDWKKAKEDGVEFAIIRCGFGMNQTKQDD AQWFNNVKGCEENNIPYGVYLYSYADTVKKAQSEAEHVLRLIKGHTLAYPVYYDIEEKAV LNKLTAEQLGKIAATFVNKVKAEGYQTGIYSNKSNFETFLTDSQFSQWNKWVAQYNSAAC TYKGTYQIWQAADTGTIDGKWSSRYQFCHENYLIKR >gi|222441878|gb|ACEP01000064.1| GENE 146 159070 - 159504 173 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027164|ref|ZP_03716356.1| ## NR: gi|225027164|ref|ZP_03716356.1| hypothetical protein EUBHAL_01420 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01420 [Eubacterium hallii DSM 3353] # 12 144 1 133 133 124 100.0 3e-27 MGNGAVDINFAMKTTSSSDNNPGTSTEKPATSATKPTTTAAKNNTRPAPVKLRTPTIKVK SSAKKKAQIIWNLKGNGVKYQVYYSTKKNKGYKKAATINSSKGKATVKKLKSKKKYYFKV RCSKTVKGHTYYSAYSKIKSIKIK >gi|222441878|gb|ACEP01000064.1| GENE 147 159635 - 160162 473 175 aa, chain + ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 1 142 1 143 150 119 42.0 4e-27 MRKIMFIGRSEAGKTTLTQAMKGNTITYHKTQYVNHYDVIIDTPGEYLQSKNLISALAVF TAEADVIGLLMDATEPFSLYSPNLTPVCNREVVGVVTKIDHWAAQPAQAAAWLELAGCKK IFYTSAYTGEGIADILSYLKEERDTLPWEEVKSKYDELGFGAGEDAKDQNRFHNI >gi|222441878|gb|ACEP01000064.1| GENE 148 160237 - 160416 318 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027166|ref|ZP_03716358.1| ## NR: gi|225027166|ref|ZP_03716358.1| hypothetical protein EUBHAL_01422 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01422 [Eubacterium hallii DSM 3353] # 1 59 1 59 59 94 100.0 3e-18 
MRGYEAAQILKEKGITVEDVAEEITYRKETNAFSNNPKNEAFMNMTLNELVRIKEMWNI >gi|222441878|gb|ACEP01000064.1| GENE 149 160548 - 161252 624 234 aa, chain + ## HITS:1 COG:lin0656 KEGG:ns NR:ns ## COG: lin0656 COG5523 # Protein_GI_number: 16799731 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Listeria innocua # 9 204 7 176 345 66 28.0 6e-11 MKLGSGMYKEIARDALKGNWMKAIAAGFAAGWLGVFSSSFLFIAGYVLVAVILVYFLEFL PGFYPILFLGTTIIALIYFFIGGVIRLGYIDFNLALLDRRKNGIYRLGSRMSDWWRVLCA KITLFFALSLGYVLLIAPGIITKYSYAMVPYILEERPDFTVHEAFKASKQIMKKHKWELF CLRFSFIGWYIIGILTLGIGLIFINPYRYAAEAAFYNEISGRAEAYYGRKEDRY >gi|222441878|gb|ACEP01000064.1| GENE 150 161367 - 162077 948 236 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027168|ref|ZP_03716360.1| ## NR: gi|225027168|ref|ZP_03716360.1| hypothetical protein EUBHAL_01424 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01424 [Eubacterium hallii DSM 3353] # 1 236 15 250 250 434 100.0 1e-120 MEILGKVLSEKTEAIYKDITGGLIYPIKIVKLDPENEEDCSRPVAMDTGKKEFIVKVDNT LADNLFENALIRDIIYCQQMSNNAPVLTAKSRNDIDGFQVAMMISSIIMDIDVENKLRSY DMHIDDIDTMRLSDLYAFLKSGMADYNRELYNVFTGLQITLLYFTTSKRSNIEEIIETFY LSDKSAMDAIDKYVDIIDRYGVDDNRSMMRCMRKLAIACGMKGRLLLEYEGKVTEI >gi|222441878|gb|ACEP01000064.1| GENE 151 162260 - 164170 1701 636 aa, chain + ## HITS:1 COG:CAC0291_2 KEGG:ns NR:ns ## COG: CAC0291_2 COG0685 # Protein_GI_number: 15893583 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Clostridium acetobutylicum # 354 636 1 283 285 259 45.0 8e-69 MNIREYLENHKLLTDGAMGTYFDSIEKENYICSEEANITNPALVREIHRSYVKNGAQLLR SNTFLANEGTFLSLTQAKAEAFENITLKQLIIAGYQLAKETAQEVYQEEYPIFAAADIGP ILEERDSEEADILQQYYEICDSFLEAGADIFVLETFPDTQYVLKMAEYIRKSCPEAFIIG QFSLTPTGYSRTGFHYKTILQEATQSGLLDGAGLNCGVGAAHMKKFLKSYIEEFGVPEDM VLTSLPNCGYPQIVRGHAVYSDSVPYFGEKLGEISELGVQVLGGCCGTTPEYIGEIYKTV FQAKKTEGAVKIPTKIVWGDVTKRVQKRKEAAVHITQAGEQKEAIKEKQVTNTFRQKLEM GELVCAVELDPPFDTDDTKLLNGAKALLSTKADIITIADSPLARSRADASLMAAKIKMTT 
GMDVMPHLSCRDKNRIAIRSGLLGSYASGIRNYLFVTGDPVAREDREFTKSVFDFNSIRL MKFAQSMNQEVFKDDTIFYGGALNQNGASPENIAGRMLKKMEAGCRYFLTQPVYDKEGIE RLTFLKERTGAKILIGIMPLVSRRNALFIKNEMPGIHVPDEVVAKYKEGASREEFEEAAI EISKQIIEMGKGIGAGFYFMTPFNRVSLVKSILEYV >gi|222441878|gb|ACEP01000064.1| GENE 152 164328 - 166799 3156 823 aa, chain + ## HITS:1 COG:TM0268_2 KEGG:ns NR:ns ## COG: TM0268_2 COG1410 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Thermotoga maritima # 303 820 15 481 483 271 35.0 5e-72 MKTEELYTMIQQKPVILDGATGSNLQKVGMKPGVCPEEWILENEDKLIDLQKSFVEAGTN ILYAPTFSGNRVKLEEYGLADRAEEINKRLVGLSKRAAGDKALVAGDMTMTGVALEPVGP MKLEALINIYKEQAKYLLEAGVDLFVVETMMSLAETRAAVIAIKEVCNLPVIASLTFQED GRTLYGTDPVTAVVVLQSIGADIIGVNCSTGPGEMIPVIKKMKEYAEVPILAKPNAGLPV LEDGVTVYPMTPEEFASWGPAFIEAGAGLIGGCCGSTPEHIRMLTEKVSGLTTKAPCNIH PIMLASERKSQEIALDGKFLVIGERINPTGKKKLQEELRAGKLDLVEEMAEQQEEMGAHI LDINMGTNGIDEKEMMLKAISKVTMVSSLPLCIDTSYVEVMEAALRAYPGRALINSISLE TEKIEKLLPLAKKYGAMFILLPLSDEGLPKSLAEKKEIINTILEKAKKLGVSKNQIIVDG LVTTVGANKNAALETLETIRYCKEDLKLCTTMGLSNISFGLPERPYVNGAFASMAIASGL TMAIANPSNQLLMGISFASDLLRNKEGSDIAYIEQIQRMEPLKEAAIAKAAKTAGNGEKQ PNEKNAAVGEADGNIKKNPVFEAVLKGNKDGIVDVVKKELSKGTKPGEILDGLLIPAINE VGVLFDKQKYFLPQLISSANTMEQAVEYLEPLLKEGGIQEKMPTIIIATVEGDIHDIGKN LVALMLRNYGYDVIDLGKDVPADEIIAAAKEHNASIIVLSALMTTTMMRMKDTIALKEKE HLDVKVMIGGAVTTQSFADEIGADGYSADAADAVRLAKKLLAE >gi|222441878|gb|ACEP01000064.1| GENE 153 166939 - 167859 804 306 aa, chain + ## HITS:1 COG:STM2085 KEGG:ns NR:ns ## COG: STM2085 COG0463 # Protein_GI_number: 16765415 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 6 305 2 302 314 223 40.0 3e-58 MSETRKIDVIIPTYHPDEKLEKCLRMLKRQTVQPQRILLINTEEEFFHSKVFSTLKQGEI IHITKPEFDHGGTRNQAAAMCDGEIMILLTQDAIPADEHLIENLIKPFEDEEVCAAYGRQ MADKKDNPVEAYTRIFNYPKESRIKSKEDLPKLGIKTFFCSNVCAAYRKSEYNALGGFPL 
HTIFNEDMIFASHLIEAGKKIAYAADAKVWHWHNYTAKDQLKRNFDLAVSQVDAGGLFTQ VKSESEGIRLVKMTLKHFIKKGQFYYIPKIITQNGAKYIGYRLGKNYKKLPKKWILKISL NPWYWK >gi|222441878|gb|ACEP01000064.1| GENE 154 168069 - 169196 1395 375 aa, chain + ## HITS:1 COG:lin0463 KEGG:ns NR:ns ## COG: lin0463 COG1316 # Protein_GI_number: 16799539 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 18 340 2 308 309 160 34.0 4e-39 MKETDKNKNIDKKEAKKAKKARKRGRAGRRIAQVLMCFLLTVAFVASAAGGAVAAYYKAA TNTITTDTKTALKNADVVDADLEYDQDVVNILLIGCDKRSDETEAGRSDSTMIATIDLKN GKLKLTSLMRDMYIDIPGYGYHKFNAAYSYGGVQLEYETIAETFGIKLDGYVEVDMEAFR EVVDLIGGVPMDLTEAEAYFLKTAYVNSKHGEKNVKAGENMLTPYQALAYCRIRQDITGE FGRTDRQRKILISVFNELKSEDVMTITNICMKALKYVTTDLTEKQIRSLLTSVITMGATE VEQFRIPYEGSYTEGRLSGNGAWVMQVDYEANKKALQYFVFGIGEDPGINDSYGVGNDKF IKGYYPEKTEGISTY >gi|222441878|gb|ACEP01000064.1| GENE 155 169221 - 169712 767 163 aa, chain + ## HITS:1 COG:MJ0161 KEGG:ns NR:ns ## COG: MJ0161 COG0440 # Protein_GI_number: 15668333 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanococcus jannaschii # 5 162 10 168 172 144 48.0 9e-35 MRQIVLSLLVDNNPGVLARVAALFSRRGYNIDSITAGITNDPKYTRITVAVSGDNQILEQ IRNQIAKLVEVRKIVELENSHSVCRELVLVKVMADKKEKQEIIAIADIFRAKVVDVAANS LMIELTGNQNKLDAFLNLMADFEVLEMVRTGITGLGRGVSTLN >gi|222441878|gb|ACEP01000064.1| GENE 156 170019 - 171038 1289 339 aa, chain + ## HITS:1 COG:TM0550 KEGG:ns NR:ns ## COG: TM0550 COG0059 # Protein_GI_number: 15643316 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Thermotoga maritima # 3 333 2 330 336 423 66.0 1e-118 MAAKIFYQEDCDLSLLDGKKIAIIGYGSQGHAHALNLKESGCDVIVGLYEGSKSWAKAEK QGLKVYTAAEAAKQADIIMILINDEKQAKLYKESIEPNLEEGNMLMFAHGFNIHFGCIVP PANVDVTMIAPKAPGHTVRSEYLAGKGTPCLVAVEQDYTGKAQELALAYGAGIGGARAGL LETTFRTETETDLFGEQAVLCGGVCALMQAGFETLCEAGYDPRNAYFECIHEMKLIVDLI YQSGFAGMRYSISNTAEYGDYITGPKIVTAETKKAMKQILSDIQDGTFAKDFLLDMSDAG 
SQVHFKAMRKLASEHPSEKVGADIRKLYSWSDEDQLINN >gi|222441878|gb|ACEP01000064.1| GENE 157 171350 - 172411 945 353 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01556 NR:ns ## KEGG: EUBELI_01556 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 340 1 340 353 444 72.0 1e-123 MTFYQELQLSSTGSKELIKKTTDPKEKRRHILIYNVKVYLVVAFCFALVTLFSAVFGSGN SVAGVVVLLALLVLRQADFGIKTTHGLLCIAGIFGILIIGPRFTNTLAPIPAFFVNLVCI MLLMILGCHNVIMSNHSTFVLGYLLLQGYDVTGEEYSMRVISLLIGMAICMAVFYKNQRN RPYRRTFLDLFREFNLHSARNWWYIRLTVMVSTALLIMSLLGLPRAMWAGIACMSVCLPF SSDLVARAKLRGPYNILGSLIFVVLYIVLPKSMYPYIGIIGGIGVGYSAGYAWQTVFNTF GALSIASGLFGMTGAVVLRIGANVFASIYTVLVDKVLNKIFEWTTQIAPKYES >gi|222441878|gb|ACEP01000064.1| GENE 158 172555 - 173007 425 150 aa, chain + ## HITS:1 COG:L67624 KEGG:ns NR:ns ## COG: L67624 COG0394 # Protein_GI_number: 15674184 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Lactococcus lactis # 1 148 1 144 145 127 39.0 1e-29 MIKVLMICHGNICRSPMAEFILKDMVKKKGMEDSFLIASCATSREEIGNSLHPGARAKLD EMGIPHKKRAAVQLTKEDYTTYDYLLCMDQWNINNVKRIIGHDPKHKVHLLLEYAGVKRD IADPWYTGNFDVTYDDIVQGCEAFLEELGV >gi|222441878|gb|ACEP01000064.1| GENE 159 173031 - 174074 900 347 aa, chain + ## HITS:1 COG:alr4954_1 KEGG:ns NR:ns ## COG: alr4954_1 COG1716 # Protein_GI_number: 17232446 # Func_class: T Signal transduction mechanisms # Function: FOG: FHA domain # Organism: Nostoc sp. 
PCC 7120 # 2 93 4 95 168 65 35.0 1e-10 MKIIIKCTDGIMTGKEWRFNGSGQITLGREKGSMVTTAPEDNSISRHQCVIDVVPPNVYV SDAGSTHGTYVNGQLIGKKGQGGQCPVRDGAFIGVGSGGRTAVFQIRIIEEERGTVSPDY IADNMDNHYNRSFNGQNLGNDFASSFQFVVPRIRDIFIRPVHTAIEFGKMNSAILGTTMI LINMAVLLVLAVIVMAIAKDKLFGFGAYVPWGRIIIGVELMAAFSYFGLAALMLLSARVF FRAEITYAQLLSFYGVQAIWSTAYLLVESFLLMIGTEGSIMLCLFVFWISILHTFLIGIF SYGEVVHLSPDKKMYSYFVFVILYIITYAIILAILGGGVRDSLFGLI >gi|222441878|gb|ACEP01000064.1| GENE 160 174522 - 175616 896 364 aa, chain + ## HITS:1 COG:alr3997 KEGG:ns NR:ns ## COG: alr3997 COG0515 # Protein_GI_number: 17231489 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Nostoc sp. PCC 7120 # 287 364 411 487 496 60 38.0 5e-09 MGYRERNNSQNGGWNTGNNLQSGWNARNSSQNSWNPNNSQSGWNPGNSQQPGWNPDNSQQ NWNNMNPNSQWNPNMGGGYQGDPGRKNRGGLTKEVIIALLIAVAAIVFVVILFVVSGKKS SSQKDNNSTKSSTQAVAEATTQAATEETQATTAATTQATVATTAATQSQNTTTAAMDNTF IFSDSNSRKLTKEEVLRLSAWDTKLAKNEIYARHNRKFDNQKIQAYFDSQQWYHGTVAPS DFDEDVFNEYEKANVELLKKAEDGTLTASTSSGTSSTDDGYIISDSSIRELTDSDLSGLS KGKLRIARNEIYARHHRKFDSADLQIYFDKKSWYSGTIEPSDFDEKNELSQIEKKNIDLI KKYE >gi|222441878|gb|ACEP01000064.1| GENE 161 175684 - 176661 1128 325 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027182|ref|ZP_03716374.1| ## NR: gi|225027182|ref|ZP_03716374.1| hypothetical protein EUBHAL_01438 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01438 [Eubacterium hallii DSM 3353] # 1 325 1 325 325 578 100.0 1e-163 MRKRFSLCKKVGVFLLTAVMCTSQLNVQAEKKYDFETEYQVVAGANGYFIAKNQKGEYGL LDREGEKAVGFDYSEMEFPKDTMQYQYIKVKQNKNWGIVDYDENPLVPLNFEKVSEYSNG NTIAAGYNGNTTYLYDLKGKEQKKQLTGSGSYSVISDTVFWGKEDIRNEKDNTLVKLEEN SLLDKKKNSLAVKVGDTHYAVQKYTTQLTGNLETDQNIEVYEKDEKTQDILPEEEWVKDK DSTVKGSLKLIKAVSDKNIIADMQMDDGFKVIYNIEDGQSTSKYKSIGEFRDGKAFAVDM DNNIKIIDTSGKAINKNDIEVADCG >gi|222441878|gb|ACEP01000064.1| GENE 162 176654 - 177133 336 159 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027183|ref|ZP_03716375.1| ## 
NR: gi|225027183|ref|ZP_03716375.1| hypothetical protein EUBHAL_01439 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01439 [Eubacterium hallii DSM 3353] # 1 159 1 159 159 249 100.0 4e-65 MDDFYRYYAPDFNFLYPKSVFAYSEADDETNSYYLSSADGSMTLEFYEEDTSGSAKENVK KFYNKTQNEVHAIKYDPKWEDHQVIAGQYVANTNEAFYYFIRNENAKNYIFKFHYVDPDL NNDYNQEDYIMETIYRGCSFSKASKPRSFRNFEDEGKKQ >gi|222441878|gb|ACEP01000064.1| GENE 163 177136 - 178293 1128 385 aa, chain + ## HITS:1 COG:slr1744 KEGG:ns NR:ns ## COG: slr1744 COG0860 # Protein_GI_number: 16330376 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Synechocystis # 37 242 470 647 649 85 33.0 2e-16 MMKTIIKTNYFSRVFLYGLIFSLCIFPLSVKARQTSPKKDLCIVIDPGHGGIQSGTQRGT VEEKTLNLKIAQYLKEALEKYKGVTVSLTRDGDYDVSLTDRTQYSVDKNADLMVSIHNNA TGDCAAYDNGCTVLAAKDGYKQELADEEQKLACNILNELSALGIENQGILLRDSEANEKY ENGELADYYAIIRGGVLKDIPTVLVEHAFVDDDSDFENYLSSDAKLKALAEADAKGIARY YQLTTEDGKKAESPLENYKEKIVHIIDGNCKHNKISYKTYYSSSKKETEEKSANTDTTTA KAEKQEQTTEKSAPEAKTDVISESESQEKSSEEFSNTVENTSSKSVTKPKKRESVQTDSM ILFVLIAALLIALILLGTLLKKRKK >gi|222441878|gb|ACEP01000064.1| GENE 164 178374 - 179222 647 282 aa, chain - ## HITS:1 COG:CAC0191 KEGG:ns NR:ns ## COG: CAC0191 COG1737 # Protein_GI_number: 15893484 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 19 256 18 255 283 115 30.0 8e-26 MEKSVLDTICASLDTFFESERKIGTYIIQHTAKVVDMTVGELAQACGVSDASVSRFCKKI DMKGFHHLKITLAKEISERGKEEEEVSNRISVNDIGQSLKNILANKVTEITQTVSMMDTK QLHEILDKLNTAKTVQFFAVGNTIPVAIDGAFKLNQIGIPAVSGTIWETQIGYTYNMMAD DVVIAISNSGESTAVLRALEAARSAGATTISITNSEKSSAAQLSDYHITTATREKLFLDG YCFSRVSATTVIEILYLFLTSMRKDAYKSVVRHEQAIAFTKL >gi|222441878|gb|ACEP01000064.1| GENE 165 179707 - 180168 299 153 aa, chain + ## HITS:1 COG:BS_fruA_1 KEGG:ns NR:ns ## COG: BS_fruA_1 COG1762 # Protein_GI_number: 16078504 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system 
mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Bacillus subtilis # 1 142 1 146 169 98 39.0 5e-21 MRITELLRKSSIALQVDVQSKEEIIDYLVDLISQSGIIPDKEEYKQGILLRESLGTTAIG DEIAIPHAQLDSIMAPGIAAITVKNGVNYDAPDGQDVKLCFIITAPKTFYWSVSWLSGTF FLKETFTRKCRRKKEKALKDIIYRKLFLKIYSI >gi|222441878|gb|ACEP01000064.1| GENE 166 180259 - 180936 895 225 aa, chain + ## HITS:1 COG:CAC1431 KEGG:ns NR:ns ## COG: CAC1431 COG0120 # Protein_GI_number: 15894710 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase # Organism: Clostridium acetobutylicum # 2 225 5 227 227 201 48.0 9e-52 MNMKQLCAKEALKYIKDGMCIGLGGGSTIGYLAEYLAKEERKITVVTPSDDTAELCKNLG LMVLSLEMTAHINIAFDGCDELDKELNALKANGAIHTKEKIIASMAEKYVLLIDESKFYD TLPLKDSVTLEVIPQSRNYVQAQIEKLGGKAVMRKSGAKAGFVISDNGNYIMDTDFSNVA AFAGKPEELHQKLKQLTGVVETALFIDVVAFALCVCGDEVKVIEK >gi|222441878|gb|ACEP01000064.1| GENE 167 181137 - 182108 1221 323 aa, chain + ## HITS:1 COG:CAC3400 KEGG:ns NR:ns ## COG: CAC3400 COG0673 # Protein_GI_number: 15896641 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Clostridium acetobutylicum # 1 323 1 322 322 387 57.0 1e-107 MKKLNWGMLGTGWIAHEMGEALKRVNGEIYACCATSMESAEKYAEEFDVKNAYGSADEMF DDENIDIVYIATPHNSHYEFLMKALQSGKHVFCEKAITVNDTQLEEAVKLAKKKNLVICD GMTLYHMPVFKKMKEIVDSGKLGPVKMIQVNFGSCKEYDVTNRFFSKELAGGALLDIGVY ATSFSRFFMKSKPNSILTTANYFETGVDETSGIILRNPDGQMAVMALTMRAKQPKRGVVA CEDGFIEIYNYPRGDKATIVYTNDGHTETIEAGEAAYALDYEVEDMQNYVLNGGSEEYLT YSRDVMSVLTDIRKQWGMIYPFE >gi|222441878|gb|ACEP01000064.1| GENE 168 182317 - 182532 292 71 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027191|ref|ZP_03716383.1| ## NR: gi|225027191|ref|ZP_03716383.1| hypothetical protein EUBHAL_01447 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01447 [Eubacterium hallii DSM 3353] # 1 71 1 71 71 139 100.0 6e-32 MEKKYLFFDIDGTLTDRATGEIVPSAKEVLQRLEENGHFVAIATGRAHYKAENFTLAMAG VLSWMIQKMFI >gi|222441878|gb|ACEP01000064.1| GENE 169 182868 - 184235 1075 455 aa, chain + ## 
HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 19 450 92 528 547 188 29.0 2e-47 MTELIKQKKQVPERLYSNKKLAALLIPLALDQLLNSFMGTIDTLVVSNLGSAAISAVSLV DSINILIVQAFFALASGGTVVCSHYLGCKSKDHAKEAARQLVFVTLMLSLIIGAICLVFN QQILNLIFGKVEKDVMAGAKQYFFYSVLSYPFIALYNDGACILRAQNNSKFPMQISIVSN FLNAFLDVIFVWVFHWGVAGSAIATAGSRFFSMSIVLWKLRNPSLEIPFRNYFSIRPNWR EIKKILNIGIPSGIENSMFQFGKLAIQSTVSLMGTAAIAAQGMTNVIENLNGILAIGIGI GLMTVVGETLGAGRKEEAVYYIKKFCIIAEITLAVSCLIFYLLVRPITYFGGMEPESAKL CIYMVTWISIVKPLIWIMAFIPAYGLRAAGDVKFSMTVSVLCMWFCRVSLVIFLARAYHM GPMAVWIGMFVDWGVRNIIFTIRFRSRQWLNHEVI >gi|222441878|gb|ACEP01000064.1| GENE 170 184517 - 185812 1657 431 aa, chain + ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 3 428 4 427 431 521 63.0 1e-147 MLHLEIEKVVGREIIDSRGNPTVEAEVYLADGTVGRGAAPSGASTGEFEALELRDGNKDR FGGKGVSKAVGNINTTINEALKGIDASDIYAVDGAMLAADGTKDKSNLGANAILAVSIAA VRAAATALQIPLYRLLGGVNGNRLPVPMMNILNGGAHAANTVDVQEFMIMPAGAPSFKEG LRWCTEVFHALAALLKERGLATSVGDEGGFAPDLGSDEEAIECILEAVEKAGYKPGEDFV LAMDAASSEWKSATKGEYLLPKSGRKFTSAELIEHWKQLCEKYPIYSIEDGLDEEDWEGW QQLTKELGDTVQLVGDDLFVTNTERLSKGIKRGCGNSILIKLNQIGSVSETLEAIKMAHN AGYTAVTSHRSGETEDTTIADLAVALNTCQIKTGAPSRSERVAKYNQLLRIEEQLGNAAV YPGKGAFHISR >gi|222441878|gb|ACEP01000064.1| GENE 171 186113 - 186583 549 156 aa, chain + ## HITS:1 COG:BS_yxjI KEGG:ns NR:ns ## COG: BS_yxjI COG4894 # Protein_GI_number: 16080945 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 149 2 150 162 89 28.0 2e-18 MKFLIKQRVFSWSDTFDIYDEYENKKYFVQAEFLSLGHRLHVYDMNGHEVGLIKEKVLTF LPEFEVMIGGHSCGIIKKKFTFFHPQYELDYNGWHVEGDFLSWNYDVYEACSAVIHITKE PLHWGDTYVIDFLDPKDELMGLMLVLAIDAANCTNN >gi|222441878|gb|ACEP01000064.1| GENE 172 186722 - 187513 573 263 aa, chain + ## 
HITS:1 COG:all3753 KEGG:ns NR:ns ## COG: all3753 COG0300 # Protein_GI_number: 17231245 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Nostoc sp. PCC 7120 # 4 254 6 255 263 111 29.0 2e-24 MKVKRIAIVTGATGGLGREFVRLLLKEKNIDEIWALARNEKKLGRLKEKFGKKVKIYSFD LSSVRQIQCFEEELKQEAQRRKLEISYLINNAGFAKFCSYGDLSIEESLNMIHVNIDAVV AMGLVCIPYMKKGSHLINIASQASFQPLPYQNIYSSTKSFVRNYSRALNVELKEKGIYVT AVCPGWIKTDLYKRAEIGARKATTRYVGMVTPDKVAKKALKDAKKGKDISIYSLFTKISH VAAKLLPQRIMMKIWLRQQHLDN >gi|222441878|gb|ACEP01000064.1| GENE 173 187627 - 189030 1668 467 aa, chain + ## HITS:1 COG:FN0242 KEGG:ns NR:ns ## COG: FN0242 COG0569 # Protein_GI_number: 19703587 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Fusobacterium nucleatum # 17 466 1 450 452 314 39.0 2e-85 MFSFMNPGKSKPAKKGLKIIIVGCGKVGTTLVEQLSKEGHDIIIIDKNAKKVQEIANIYD IMGIVGNGASYSVQMDAGIEDTDLIISVTESDELNLLCCTVAKQVGDCAAIARVRTPDYS KEAGYLREKLGLAMIINPELEAAREAARILFLPSALEVNSFAHGQAELIKFAIPAGNILD GLQIAELRGKIETNILVCGVERDGEVHIPSGDFILQQGDVISVVASRKIAREFLSHIGFE TKQVKNTMIIGGGKAAYYLAKNLLDVGIDVKIIESNKERCEELSILLPKAIIINGDGTDQ ELLREEGLPYVESFVPLTGIDEENILLTLHTRQVSKAKVITKINRINFKDVIAKLDLGSV IYPRYITSEAIIAYVRARQASLDSNIETLYHMFDHRVEAIEFRVYEESAVTNIPLKDLAL KKDLLFCFINRNGAIILPNGQDSIQVGDTVMVVTTHTGFNDLRDILA >gi|222441878|gb|ACEP01000064.1| GENE 174 189045 - 190523 915 492 aa, chain + ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 1 478 1 481 483 372 44.0 1e-102 MNSSIIRFVLGYVLKMEAVLMLLPCIVAVLYREQTGFAYLLVAAVSMAFGTLMTIKKPKS HVFYLKEGCVATALSWIFLSFFGALPFWISGEIPSLIDALFETVSGFTTTGSSILADVEA LSHCALFWRSFTHWIGGMGVLVFLLAVIPLSGGSHINLMRAESPGPSVGKLVPKIKYTAQ ILYIIYFGMTIVEIVLLLISRMPAFDAITLSFGTAGTGGFGIKGDSLASYTALQQWIVTI 
FMILFGVNFNAYYLILFRKFKKALQMEEVRAYFAIIFVATAIITGSLVGGNMQFFDALRH AAFQVGSIITTTGYATVNFDAWSQTCRAILVLLMFVGACAGSTGGGIKVSRFIVMVKTMI KELNSYIHPKSVKKIKMDDKPIEHEVVRSINVYFITFMIIFVASVIAISFEGHDLVTNFT AIAATINNIGPGLSMVGPDCNFGFFSDFSKLVIIFDMLAGRLELFPLLILFHPAIWKELF TQKITMRRKKKF >gi|222441878|gb|ACEP01000064.1| GENE 175 191075 - 191284 277 69 aa, chain + ## HITS:1 COG:no KEGG:Closa_0962 NR:ns ## KEGG: Closa_0962 # Name: not_defined # Def: FeoA family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 69 2 70 71 82 81.0 5e-15 MPLSMVKAGEPNVIKKVGGKSETKKFLENLGFVVGGLVTVISEINGSLIVNVKDSRVAIG KDMANKILV >gi|222441878|gb|ACEP01000064.1| GENE 176 191338 - 191559 442 73 aa, chain + ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 72 80 148 152 75 52.0 3e-14 MKTLKEVAVGQTVTVKKLSGAGPVKRRIMDMGITKGVEVYVRKVAPLGDPVEVTVRGYEL SLRRADAEMIEVE >gi|222441878|gb|ACEP01000064.1| GENE 177 191714 - 193906 2791 730 aa, chain + ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 5 714 4 702 709 689 48.0 0 MSVKIALAGNPNCGKTTLFNALTGSNQFVGNWPGVTVEKKEGKLKGHKDVVIMDLPGIYS LSPYTLEEVVARNYLIGERPDAIINIVDGTNIERNLYLSTQIMELGIPVIMAINMMDLLE KSGEKIDIAKLGKDLGCEVVEISALKGTGIKKAAEKAVALAQSKKSAEVVHSFDKKVEDA IDAVKVMLGSQVPEEQKRFFAIKLLERDDKIVEQLNTVPDVSNEIKALEDAFDDDTESII TNERYVYISSVVADSCKKNGAKKLTTSDKIDRIVTNRWLALPIFAVVMFIVYYVSVSTVG DWATTWANDGVFGDGWHLFGIKSLFIPSVPSIVEGALNAVGCADWLSGLILDGIVAGVGA VLGFVPQMLVLFIFLAFLESCGYMARVAFIMDRIFRKFGLSGKSFIPMLIGTGCGVPGVM ASRTIENDRDRKMTIMTTTFIPCGAKLPIIALIAGALFNGAWWVAPSAYFVGIAAIICSG IILKKTKMFAGEPAPFVMELPAYHMPTVSNVLRSMWERAWSFIKKAGTIILLSTIILWFL MNFGWVDGNFGMLEAEQLNDSILATIGNVIAPIFAPLGWGDWKMAVAAVTGLIAKENVVG TFGILFGFAEVAEDGSEFWGQLANSLPAVAAYSFLVFNLLCAPCFAAMGAIKREMNNIKW 
FFFAIGYQCLLAYVVGLCIYQVGTLITAGTFGIGTVVAFVLIIGFIYLLFRPYKESNTLN VNVKSVAKAK >gi|222441878|gb|ACEP01000064.1| GENE 178 194010 - 194177 203 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027203|ref|ZP_03716395.1| ## NR: gi|225027203|ref|ZP_03716395.1| hypothetical protein EUBHAL_01459 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01459 [Eubacterium hallii DSM 3353] # 1 55 1 55 55 89 100.0 1e-16 MGTIIVGLILVLIVSLIIRSMVKDKKSGKSIQCGGSCSHCAGHCSAPCDTPSNRD >gi|222441878|gb|ACEP01000064.1| GENE 179 194233 - 194703 443 156 aa, chain + ## HITS:1 COG:no KEGG:Closa_0966 NR:ns ## KEGG: Closa_0966 # Name: not_defined # Def: ferric uptake regulator, Fur family # Organism: C.saccharolyticum # Pathway: not_defined # 17 153 20 156 170 177 65.0 1e-43 MLQKANQNLPINDVGIHRTQMQKEIILQKLKERGCRITRQRKMLLDVILNEDCSSCKEIY YKASMKDSGIGTATVYRMINILEEIGALSRKNMYKIACGPECEVKDACVIEFDDDTIIEL SGKSWNQVVQLGLRACGYSSGHKIKSVTAKSCEFSR Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:30:50 2011 Seq name: gi|222441877|gb|ACEP01000065.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont71.1, whole genome shotgun sequence Length of sequence - 5575 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 4, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 34 - 93 6.1 1 1 Tu 1 . + CDS 124 - 1314 1128 ## COG0053 Predicted Co/Zn/Cd cation transporters + Prom 1319 - 1378 7.1 2 2 Tu 1 . + CDS 1493 - 2590 715 ## COG1865 Uncharacterized conserved protein + Term 2698 - 2734 5.6 - Term 2686 - 2722 1.8 3 3 Op 1 . - CDS 2763 - 3116 447 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 3162 - 3221 6.8 4 3 Op 2 . - CDS 3229 - 3903 516 ## COG0692 Uracil DNA glycosylase - Prom 3938 - 3997 9.2 + Prom 3995 - 4054 7.0 5 4 Tu 1 . 
+ CDS 4171 - 5388 643 ## COG1686 D-alanyl-D-alanine carboxypeptidase Predicted protein(s) >gi|222441877|gb|ACEP01000065.1| GENE 1 124 - 1314 1128 396 aa, chain + ## HITS:1 COG:CAC0606 KEGG:ns NR:ns ## COG: CAC0606 COG0053 # Protein_GI_number: 15893895 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Clostridium acetobutylicum # 1 388 11 400 403 330 41.0 4e-90 MTEFLVNKFIKDSANIESTEVRTRYGMLASVVGIFCNVLLFSVKLAIGLILSSLAVTADA FNNLSDAASSIISFVGVKMAGKPADAEHPFGHGRIEYIAALIVSFLVIEVGFTFFKSSIS KIMHPEEITFDPVPFIILILSILVKLWMAFFNNKLGKRIDSKVMLATAADSLGDVITTSA TVISIVICHFTSINVDAIAGLIVSGIVIWSGVSIAKDTLEPLIGQRVPSELYQKITDMVE SYEGIVGAHDLIVHNYGPNRSMATIHAEVPNDVSIEASHEIIDRIERDAKKELNILLVIH MDPVEMRDEEVLELRDKTSHIVHALDPELHFHDFRVLKENEQKNLIFDLVVPDSYTEKDA NRVMHQLIALLHEMEKNVDCIITLDRSFEAPKNNIT >gi|222441877|gb|ACEP01000065.1| GENE 2 1493 - 2590 715 365 aa, chain + ## HITS:1 COG:MJ1613 KEGG:ns NR:ns ## COG: MJ1613 COG1865 # Protein_GI_number: 15669809 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanococcus jannaschii # 44 246 84 253 255 93 34.0 5e-19 MSTGPNNGGFTTDLAAVFNNDANPGDGMEVKLRADTYREHMDILAKEDLGLDPSSCSGLM TVASMDNSAISTLAYDDFSVTSIATAGVRNNGGRIGDPASWHEKSENTFEDTTSSDTKKK NPIQNLTKNILGKNISGKIISDKDKLPVGTINILLYIDADLSKEALASALVSCTEAKVAA MQELLIASRYSCGIATGTGTDGAIIISNAESKTHLTNAGKHSKLGELIGRTVISSIKEAL KLQQGITPQIQHDIIHRMDRFGVTEDALWDCYKETYRNLIRAEFTDILDRIRTDDTLVTY TSLYAHLLDQLSWGLLSFAECRIAANELLKLAVLHPDAECGTENIIQNYILAIADRIHRE SLKKK >gi|222441877|gb|ACEP01000065.1| GENE 3 2763 - 3116 447 117 aa, chain - ## HITS:1 COG:BH3485 KEGG:ns NR:ns ## COG: BH3485 COG1393 # Protein_GI_number: 15616047 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus halodurans # 1 115 1 115 119 125 57.0 2e-29 MKVLFVEYPKCSTCKKAGKWLQEHGVGYDDRHIVEQNPTAEELKEWHEKSGLPLKRFFNT SGQVYRNNGIKDKLPGMSEKEQYALLATDGMLVKRPIVVGEDFVLVGFKEKEWEEKL 
>gi|222441877|gb|ACEP01000065.1| GENE 4 3229 - 3903 516 224 aa, chain - ## HITS:1 COG:lin1190 KEGG:ns NR:ns ## COG: lin1190 COG0692 # Protein_GI_number: 16800259 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Listeria innocua # 1 224 1 224 224 283 61.0 1e-76 MAAINNDWAEALKEEYRKPYYRELYKKINEEYRTREIFPPSDEIFSAFHLTPLKNVKVVI LGQDPYHNNGQAHGLSFSVKPGVDIPPSLVNIYKELHDDLGCYIPNNGYLVKWAKQGVLL LNTVLTVRAHQANSHRGIGWEEFTDAAIRVLNAQDRPLVFLLWGRPAQNKKPMLTNPKHL ILEAPHPSPLSAYRGFFGCRHFSQANEFLRFHELEPIDWQIENI >gi|222441877|gb|ACEP01000065.1| GENE 5 4171 - 5388 643 405 aa, chain + ## HITS:1 COG:CAC2057 KEGG:ns NR:ns ## COG: CAC2057 COG1686 # Protein_GI_number: 15895327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 50 379 22 343 351 195 37.0 1e-49 MFILSVFFIFTIFFADSCYLDVYAFSKTETIRNFTVSDKSVKDNASKNSAPAELSSLHAK AAILMDGDNQRVLFGKEEEKELPMASTTKIMTCLYAIEHGNLSDKVTFSRRAASAPKVRL GARSGSQFLLSDLLYALMLESYNDVAVAIAEHIGGSVEQFCEDMTQEARDYGCYHTSFET PNGLDSAKHYTTCHDLALLTCLALKNENFCKIIKEQSFSIKELKNGQSYSLNNKNLFLTS YNGAIGVKTGYTNNAGYCFVGAVKKDGHYLISVVLGSGWYPNRRYKWEDTKKLMDYGIKN YNKKEIPLNQGIPSKLPVKNGRKSTCTLSAPASTSLYISPTEKTRCIISLSPSLTAPVQK GTAVGSVVLYIDSKPYRSYPIYARQSIKKRTFAWYYKRMFYYLIH Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:30:51 2011 Seq name: gi|222441876|gb|ACEP01000066.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont72.1, whole genome shotgun sequence Length of sequence - 5183 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 73 - 132 5.5 1 1 Tu 1 . 
+ CDS 327 - 5036 4270 ## Cthe_1806 cellulosome enzyme, dockerin type I Predicted protein(s) >gi|222441876|gb|ACEP01000066.1| GENE 1 327 - 5036 4270 1569 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1806 NR:ns ## KEGG: Cthe_1806 # Name: not_defined # Def: cellulosome enzyme, dockerin type I # Organism: C.thermocellum # Pathway: not_defined # 420 1287 1140 2007 2177 264 27.0 3e-68 MKKQRKHINLSRITAFILSVAMLAGFCPQGVSLADIPITLPKVNAAENISNPRIVKDSSM DAGQKVTWDCVYFGSYPQSEITSKDGSIYNTLKNATGWDENNDITIGRTKYRRLKGEDAT YATSSDSPYRYDWNDNYKIYHYFKYEPIKWRVLNRNGNDALLLADIALDDQKCNTNCEDV TWETSSMRSWLNGYEASVNQPETDYSRKNFINSAFTSTQRSAIKTTSVVNDNSNIYGTES KNNIADKVFLLANSKSYNKDVATSYGFDFDNDVYGDEARRSRCSAYAYAMGTERCYPSDD DEGFIGNVSWWYYMPDRDWMAISVDFDGQGEGVGVERDDIGVRPALHINLSSTNLYSYAG TVCSDAMKSGRKKPDYPSEPDEPDTPNQPTQPDKPGTATGTRTEESNVDIEIDGGVDFTI PENVPILGGGDVSLDYGTIPVTFEREDNTYRIGIGVQDMNKKDWTTFKKFVETQKESYRK GMNSLLASKFGTASMGMSVEPEMEAYGYVEGTITKTNGVESAGGKLVVEIKGTAKQEWQT FVVVVPVVIKVKGTAGTKADFSVGFDFNKSKVYTKGEVELTLPSVRLTGGIGVSYIADIS VYGEAKNLVTVESDGKDNDITASLEGALGVSAKALCFSYEKELLNGSFDYLSTKKKSKAR MYTRALPKLEPEAKDYEIQRVDSSSWDGSAVAEQTAKPRSIKRAASAKNTSGTVTTLLSD VYASAKPQLLQTASGKKLLIFTTDMGDRTTGNHTAVVYSIYNERGWSIPKLIDDDGTADF DAVAAVDGENVYVTWINAKRTFTPEEAEAEDFMTKLAAETEVQAAKITLNGNTGTVTKYP AITDNAIADLHPSITVKNHVPYIAWNSNSANDILKGTGTNTVYLASLNGNAFTTKKLSEE NKPVQSVAIGNLDNDVVTAYTLNSGTEENPQVQLTTVNAKGRTTIAANGQNLSPSFAKID GSSVLLWYAQDAEGSSLNYIDAIDGAVESYIEDDAVISADYTVVDGGDSQLLICSSEKEN AEEAGRNLYAYVIRDGEVYEPVTLTDLEGYAAVPSGIWNGTAYEYLFTRTDVDITENTVT ENTDLCITSVVPQSKLVIGDIDYTQEKLMPDEDASITIPVKNNGLTNCGEGKVQIVYNGN VIGQADLEAGITAGETQDVTVDLTVPGDAVAKEILKVEAISGENTTADSTKNIQSAGSEL ALSVKQEDDNITAVIDNNSAFDTTAALTLKAGDASGKVLKTINLGNVESYDTVEKTFTKE ELKKLGSDTVYMEVSGDAEESIKSDNTAFVYVGTEELKTLDYLTATKTKVTYTKGEKLNL DDLEVTAVYTDGSKAKVTGYTTNVKNIDMSKTGKKQLEILYEEIGIGRKVVMPITVENAK PNTPKPDTGKKPSKKVKVTSIRLSGLSKQIAAGKKLTLKAAVLPKTASNKKLLWKSSNTK VATVTQGGVVTLKKKTGGKKVTITATATDGSKKYASWKITSMKGIVKKIKITGSKPVKAG KKLKLKAKVTATKKANKKLLWTSGNTKYATVNAKGIVTTKKSAKGKTVKITAMATDGSGK KKTVKIKMK Prediction of potential 
genes in microbial genomes Time: Fri Jul 8 07:31:17 2011 Seq name: gi|222441875|gb|ACEP01000067.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont73.1, whole genome shotgun sequence Length of sequence - 22520 bp Number of predicted genes - 22, with homology - 22 Number of transcription units - 11, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 207 - 266 8.9 1 1 Op 1 2/0.000 + CDS 347 - 1999 1119 ## COG0606 Predicted ATPase with chaperone activity 2 1 Op 2 13/0.000 + CDS 1996 - 3117 757 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake + Prom 3316 - 3375 6.5 3 1 Op 3 1/0.000 + CDS 3419 - 5494 1950 ## COG0550 Topoisomerase IA + Term 5534 - 5593 7.2 + Prom 5675 - 5734 5.0 4 1 Op 4 . + CDS 5781 - 6551 817 ## COG4465 Pleiotropic transcriptional repressor + Prom 6576 - 6635 2.2 5 2 Tu 1 . + CDS 6740 - 8530 1730 ## COG1479 Uncharacterized conserved protein + Prom 8752 - 8811 7.8 6 3 Op 1 38/0.000 + CDS 8924 - 9655 1047 ## PROTEIN SUPPORTED gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 + Term 9673 - 9716 4.1 + Prom 9666 - 9725 6.4 7 3 Op 2 . + CDS 9761 - 10696 458 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts + Term 10745 - 10808 10.6 + Prom 10709 - 10768 4.2 8 4 Tu 1 . + CDS 10904 - 11509 309 ## gi|225027219|ref|ZP_03716411.1| hypothetical protein EUBHAL_01475 9 5 Op 1 . + CDS 11705 - 11863 168 ## gi|225027220|ref|ZP_03716412.1| hypothetical protein EUBHAL_01476 10 5 Op 2 . + CDS 11806 - 12291 386 ## Fisuc_0607 hypothetical protein + Term 12320 - 12373 13.2 + Prom 12293 - 12352 3.7 11 6 Tu 1 . + CDS 12383 - 12976 509 ## Spico_0561 regulatory protein TetR + Term 13003 - 13056 2.2 + Prom 12986 - 13045 4.0 12 7 Op 1 . + CDS 13072 - 13713 643 ## COG3153 Predicted acetyltransferase 13 7 Op 2 . + CDS 13728 - 14009 189 ## gi|225027224|ref|ZP_03716416.1| hypothetical protein EUBHAL_01480 + Term 14133 - 14170 1.0 + Prom 14228 - 14287 5.7 14 8 Op 1 . 
+ CDS 14331 - 14951 462 ## COG1309 Transcriptional regulator 15 8 Op 2 . + CDS 14962 - 15150 95 ## MGAS2096_Spy1114 hypothetical protein + Term 15212 - 15255 5.1 + Prom 15226 - 15285 2.6 16 9 Op 1 . + CDS 15322 - 15486 97 ## gi|225027228|ref|ZP_03716420.1| hypothetical protein EUBHAL_01484 17 9 Op 2 . + CDS 15500 - 15916 296 ## EUBELI_20153 hypothetical protein 18 9 Op 3 . + CDS 15949 - 16152 124 ## gi|225027230|ref|ZP_03716422.1| hypothetical protein EUBHAL_01486 19 9 Op 4 . + CDS 16198 - 19119 2121 ## Sgly_2993 cell wall binding repeat 2-containing protein + Prom 19121 - 19180 7.5 20 9 Op 5 . + CDS 19203 - 19889 521 ## gi|225027232|ref|ZP_03716424.1| hypothetical protein EUBHAL_01488 + Term 19896 - 19934 6.1 + Prom 19954 - 20013 3.7 21 10 Tu 1 . + CDS 20063 - 20734 481 ## gi|225027233|ref|ZP_03716425.1| hypothetical protein EUBHAL_01489 + Term 20754 - 20808 6.1 + Prom 20762 - 20821 7.7 22 11 Tu 1 . + CDS 20876 - 22045 1072 ## COG0790 FOG: TPR repeat, SEL1 subfamily + Term 22098 - 22150 2.0 Predicted protein(s) >gi|222441875|gb|ACEP01000067.1| GENE 1 347 - 1999 1119 550 aa, chain + ## HITS:1 COG:alr4088 KEGG:ns NR:ns ## COG: alr4088 COG0606 # Protein_GI_number: 17231580 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Nostoc sp. 
PCC 7120 # 1 540 1 503 509 399 40.0 1e-110 MVAEVFSAMIQGIEGQIVKIQVDISNGLPNFNMIGYLSNEVKESKERVRTALNNIGVLLP PKRISVNFAPADFRKCGTSFDIGVAAAILLAMDFVPETYIKDTLFVGELSLNGQICPISG VLPIVVAASKRGIKRCIVAQENVAEAAFVEQMEVVGCRNLEELFFYLKHGYIINDDSGEI LENSGNISEDSGKECTIQDVKKSQEISESTRSLNLLVKDILENADEDKISVTDFSEVKGQ VMAKRAMEIAAAGYHNLMLGGPPGVGKSMLASCISGIMPQMTTEEMIDTTIIYSAKGLLK DSFSLITKRPFRNPSLAVTQAGMFGGGMTPQPGEISLAHHGILFLDEFPEYKREIIEMLR VPLEKHEISIVRKEQTLTFPADFLLSVCANSCPCGYYPMDKCRCTYRQIVKYQNRISGPI LDRIDLFVRCEEIGYEALTSNSKEESSADIRKRVSKAWEIQKERFAQSTTLFNGRMNKNE VEIYCKLDEEGQQLMKEAFSVFQLTGRSYFKILKVARTIADLEGSYDIKSRHVEEALYFR QKGEGRVKVR >gi|222441875|gb|ACEP01000067.1| GENE 2 1996 - 3117 757 373 aa, chain + ## HITS:1 COG:RSc0068 KEGG:ns NR:ns ## COG: RSc0068 COG0758 # Protein_GI_number: 17544787 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Ralstonia solanacearum # 8 284 31 310 401 206 39.0 6e-53 MTNNDFYWYWFTNIAGIGVKTQKKLLEHFGHPSTIYNAPESEIKGILTTGQCKEVIASKN QKKIEASIRKLEKREIYFFHRESQEYPERLAQLYEPPNLLYYIGRLPNFSKPILAIVGAR RATIYGRKMAREFAKRLAECGIQIVSGMAAGVDAAGHKGALDANGYTLGVLGGGIDTIYP VENFNLYQQVYQMGGVLSEYNMGISPQKGLFPMRNRIISGLSDGIFVTEAGARSGSLITA DQGLEQGKDIFALPGRITDCMSRGCNDLISQGAVLVRSPEDIMDIIIKEGKTGNKKKNMD VNREKFFKEQKFFCENKKEEKVYSLLDEIHPMTFENLLENSGFSAFELQHILMKFELKNI IYQIEQNVYLRKV >gi|222441875|gb|ACEP01000067.1| GENE 3 3419 - 5494 1950 691 aa, chain + ## HITS:1 COG:CAC1785_1 KEGG:ns NR:ns ## COG: CAC1785_1 COG0550 # Protein_GI_number: 15895061 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 577 1 576 578 594 54.0 1e-169 MSANLVIVESPAKAKTIQKFLGKSYKVIASNGHVRDLPKSTLGIDVENDFEPKYITIRGK GDILAQLRKEVKKADKVYLATDPDREGEAISWHLYYALKLENKKCSRITFNEITKDAVKK SIKNSRDIDMNLVDAQQARRVLDRIVGYKISPILWSKIKRGLSAGRVQSVALRIICDREK EIDEYISQEYWSLIANILVPGAKKTLEAHFAGDKNGKIELHSQKEVDALVEKLQKEKFEV 
LSVKNGERRKKPPLPFTTSTLQQEASKRLNMSTQKTMRLAQELYEGVTLKGKGSVGLITY LRTDSTRISEEADASCREYIAQQYGEQFVGDKTAVKSSKTGKIQDAHEAIRPTDSSLSPA MLKEQLPRDLFRLYQLIWTRFLASRMAEAVYETTSVKIGAGDYRFNVAASKVKFDGFMTV YSYDSEKSTMNAAMKKITKDTEITLSGLDPQQHFTQPPAHFTEASLVKTLEELGIGRPST YAPTITTLLARRYIVKESKNLYITELGEAVNDMMEDGFPTIVDVNFTANMESLLDGVEEG KVKWKTIVENFYPDMMQAVEKAEKELEEVKIEDEVSEEVCEECGRNMVVKYGPHGKFLAC PGFPDCRNTKPYYEKIGVSCPKCGKDVVLKKTKKGRKYYGCIDNPECDFMSWQKPSNIRC KQCGGYMVEKGKKLVCADKECGFVMDKEEVK >gi|222441875|gb|ACEP01000067.1| GENE 4 5781 - 6551 817 256 aa, chain + ## HITS:1 COG:CAC1786 KEGG:ns NR:ns ## COG: CAC1786 COG4465 # Protein_GI_number: 15895062 # Func_class: K Transcription # Function: Pleiotropic transcriptional repressor # Organism: Clostridium acetobutylicum # 5 255 4 258 258 213 51.0 4e-55 MSVQLLDKTRKINKLLHNNNSHKVVFNDICEVLSNILLSNVLVISRKGKVLGIKNRQDIE VIDELINNEVGGYIDSLLNERLLGVLSTKENVNLETLGFEHTEIDRYQAIITPIDIAGER LGTVFMYRKEHHYEIDDIILVEYGTTVVGLEMMRSVNEENAEETRKVSIVKSAISTLSFS ELEAIQHIFDELDGSEGILVASKIADRVGITRSVIVNALRKFESAGVIESRSSGMKGTYI KVLNDVVFDELKKMKH >gi|222441875|gb|ACEP01000067.1| GENE 5 6740 - 8530 1730 596 aa, chain + ## HITS:1 COG:TM0991_1 KEGG:ns NR:ns ## COG: TM0991_1 COG1479 # Protein_GI_number: 15643751 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 6 381 6 359 364 186 31.0 1e-46 MKSDNYKITQYSVSSILNYVENSQIAIPEIQRPFVWKGEEVRALIDSLYEGYPIGYLIVW QNSQVRVRGFGKGGTKKILIDGQQRVTALMAALLGKEVLDEQYQSCRIRIAFNPLAQTGD ERFAVCDEKHEQDSKWIPDISIFFRRDFSFRQFEKEYKEANPGEDFTPLEESVDTLKEIV KHQVGVIELSFLLDIDVVSEIFIRINLQGKPLNQEDFVMSKISVNEEYGGDYIRNCIDYF CHLLREPSFYQVLQQNETDFFNSEYGQAVAWCQNGEKNLYIPSYADVLKVVLISYFGKVR IGDLVYLLSGRTGEKKGLTKKEISKKLSEESFEKLGAGVRAFVCEKNFKGFQKALKKAGY CCSRLLYSQSVLNYCYAMYLLMDRQGIGEEEKESLLARWMTMVMLTGHYQSGGEGTVLKD YANATEEGFASYLAQIEELKLTEEFYDSILPDKFASTTARTAPFLVYVATQCARGVHSLF SDVTIEELYKNKTESYQILPKTYLAKCGYKTREIYGQVANLTYISKETKDIVKRKAPADY REKLEEIIGRESITASLQENGLTEEIFTAGEKDVTKILADRRKQMALEIKEFYKTL >gi|222441875|gb|ACEP01000067.1| GENE 6 8924 - 
9655 1047 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 [Roseburia intestinalis L1-82] # 1 241 1 243 247 407 84 1e-113 MSVISMKQLLEAGVHFGHQTRRWNPKMAPYIYTERNGIYIIDLQKSVGKVDEAYNAIRDC VANGGKILFVGTKKQAQDSIKNEAERCGMYYVNQRWLGGMLTNFKTIQSRIAQLKKIEAM EADGTFDVLPKKEVINLKKQQEKLEKNLGGIKEMQDIPDMIFVVDPRKERICIQEAETLG IPLVGICDTNCDPEELDYVIPGNDDAIRAVKLIVSKMADAVIEGNQGQDAEVAEEEAAEE AEA >gi|222441875|gb|ACEP01000067.1| GENE 7 9761 - 10696 458 311 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 3 310 4 281 283 181 41 5e-45 MAITAKQVKELREMTGAGMMDCKKALTATEGDFDKAIEFLREKGLATAEKKAGRVAAEGL VKVIVSDDKKKAVAVEVNAETDFVAKNEKFQAYVAQVAEQAMETEAADIDAFLAETWKFD ETKTVNEALAGQIAIIGENMNIRRFQQVKEDAGFVASYTHMGGKIGVLVDVATDVVNADI EEMAKNVCMQVAALNPKYTDRSEVDQDYLAKEEEILTAAAKNEKPDANDKIITGMVKGRL NKELKEICLMDQVYVKAEDGKQSVSKYVEEVAKANNAKIAIKGFVRMETGEGIEKKEEDF AAEVAAQMAGK >gi|222441875|gb|ACEP01000067.1| GENE 8 10904 - 11509 309 201 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027219|ref|ZP_03716411.1| ## NR: gi|225027219|ref|ZP_03716411.1| hypothetical protein EUBHAL_01475 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01475 [Eubacterium hallii DSM 3353] # 1 201 29 229 229 406 100.0 1e-112 MSIHSLNSVNEIKERENEICVVSQIYVDDELIAYYPYVLVRKYIKKMNLAKNFFPRIAEA ISEQLTACYFQKFKGGDWLFGDKFAQMLIANCVSSNLHNGIKFAHLIEKIEQLSVSTFEG EYFSTGVIVSSNFAKHFQGGKYAAYHFKGFPQFLFMVYQEIFCRWLSKTGNQLDERRPIF DIYRLVGENGYMEIDICFPLK >gi|222441875|gb|ACEP01000067.1| GENE 9 11705 - 11863 168 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027220|ref|ZP_03716412.1| ## NR: gi|225027220|ref|ZP_03716412.1| hypothetical protein EUBHAL_01476 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01476 [Eubacterium hallii DSM 3353] # 1 52 8 59 59 80 100.0 3e-14 MKTKNIILKLRTERGMSQDELADKIMVTRQAVSRWENGAMQMEHIHITIWTN >gi|222441875|gb|ACEP01000067.1| GENE 10 11806 - 12291 386 161 aa, chain + ## HITS:1 COG:no 
KEGG:Fisuc_0607 NR:ns ## KEGG: Fisuc_0607 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 4 161 33 195 195 100 33.0 2e-20 MGKWCYADGTYTYNDMDELIDVCVKNMVNENFTEKQAGSYLKEMLPKLDYWKRYDELSDN GQFEEFKKQLINEINDLHIDGLPRVDKLNALVGKYVNLEYQLPNGQKVKFLDDQKTYLGN QLESEFRAERCFGILANMDFILVCTYEKNGENPELLIYKKR >gi|222441875|gb|ACEP01000067.1| GENE 11 12383 - 12976 509 197 aa, chain + ## HITS:1 COG:no KEGG:Spico_0561 NR:ns ## KEGG: Spico_0561 # Name: not_defined # Def: regulatory protein TetR # Organism: S.coccoides # Pathway: not_defined # 1 196 1 206 209 145 43.0 8e-34 MGNRKEEILIVALHLFARDGYEAVSVSQIAGELDMTKGALYRHYKSKRDIFDCIVQRMEQ QDSEQARQNEVPEESIENVPEEYQNVSVEDFVGYSKSMFEYWTEDDFASSFRKMLTLEQF RNEEMQNLYQQYLVSGPAEYVKDLFKNMEIKNPEEEAVKFYANMFFYYSVYDGATDKTKA KCQFEYMLDKIVEEMKQ >gi|222441875|gb|ACEP01000067.1| GENE 12 13072 - 13713 643 213 aa, chain + ## HITS:1 COG:MA1701 KEGG:ns NR:ns ## COG: MA1701 COG3153 # Protein_GI_number: 20090553 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 212 16 215 217 145 36.0 6e-35 MEKSNIMIRLEKKEEHQKVENLVRESFWNVYRPGCLEHYVLHQLRNDLAFVPELDFVMFL NENGKEDKLIGQNMFMRTAIKSDDGRNIPIMTMGPICITPELKRQGYGKALLDYSLDKAA KLGCGAVCFEGNIDFYGKSGFRPASEFNIRYHGLEEGQDASFFLCKELIPGYLNGITGEY ATPAGYFVDEKKAEEFDKMFPYKEKKKLPGQLF >gi|222441875|gb|ACEP01000067.1| GENE 13 13728 - 14009 189 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027224|ref|ZP_03716416.1| ## NR: gi|225027224|ref|ZP_03716416.1| hypothetical protein EUBHAL_01480 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01480 [Eubacterium hallii DSM 3353] # 1 93 1 93 93 174 100.0 2e-42 MQIKSIQPMAAKILAEETGKMIIATKQLFYAMEVHKLLHFQNADMSAVSFAMTVHGLMDY ELDLRSGECKTENQERNNLDEYLQWFCRENATK >gi|222441875|gb|ACEP01000067.1| GENE 14 14331 - 14951 462 206 aa, chain + ## HITS:1 COG:aq_2179 KEGG:ns NR:ns ## COG: aq_2179 COG1309 # Protein_GI_number: 15607114 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Aquifex aeolicus # 1 84 7 89 192 61 36.0 1e-09 MESKEKNTKEKILEEALKLFAQSGYMGTSMNEIASRLGVTKAALYKHYSSKQEILDSIVE RMNQMDVKRAKEYEMPDGDMEDVIAGYKNIALEKIRQFTKVQFLHWTEEEFSCCFRKMLT LEQYRDPKMAKLYQNYLAEGPLRYMEAVFSGIVESGEDAKQLAMDFYGPIFLLYSIYDGA ENKQEVVKQVEEHVEKIAKIFQKIST >gi|222441875|gb|ACEP01000067.1| GENE 15 14962 - 15150 95 62 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1114 NR:ns ## KEGG: MGAS2096_Spy1114 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 10 57 4 51 61 66 58.0 5e-10 MNKDEMEYQKWVYCPVCGNKTRVRLRKDTVLLHFPLFCPKCKKESLIDAEGFTVRLIGCK SM >gi|222441875|gb|ACEP01000067.1| GENE 16 15322 - 15486 97 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027228|ref|ZP_03716420.1| ## NR: gi|225027228|ref|ZP_03716420.1| hypothetical protein EUBHAL_01484 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01484 [Eubacterium hallii DSM 3353] # 1 54 16 69 69 92 100.0 1e-17 MMMNIMVKTAGKALEKKTDRTPEEEDMLDMMLHGGERVKIQNLQGVISWYNGKR >gi|222441875|gb|ACEP01000067.1| GENE 17 15500 - 15916 296 138 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20153 NR:ns ## KEGG: EUBELI_20153 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 136 1 136 138 176 77.0 2e-43 MKLYFGNMVTTVTTIMIIALVGFIGYSVWNRNTISYWGCRNLFLLMYGLVICCFAAARDG LDKTIQAAVDGSCIPGIFPLISIPTIVGCIGAAMIIVAAVATPIAKSQRMREIWFYVMSS GVMLKVVVMELARIIQFI >gi|222441875|gb|ACEP01000067.1| GENE 18 15949 - 16152 124 67 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027230|ref|ZP_03716422.1| ## NR: gi|225027230|ref|ZP_03716422.1| hypothetical protein EUBHAL_01486 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01486 [Eubacterium hallii DSM 3353] # 1 67 1 67 67 114 100.0 2e-24 MKVVISKCEKVERRVLAVEEQEIFLGYLARTKRESVKWVDYDGKILKCVIVIFDACNSLD CMLLYTI >gi|222441875|gb|ACEP01000067.1| GENE 19 16198 - 19119 2121 973 aa, chain + ## HITS:1 COG:no KEGG:Sgly_2993 NR:ns ## KEGG: Sgly_2993 # Name: not_defined # Def: cell wall 
binding repeat 2-containing protein # Organism: S.glycolicus # Pathway: not_defined # 798 965 136 301 723 91 36.0 2e-16 MKKKLLAILFMVCFIFVFAPKTVHAYSYYWYVNPIDTLYSDIAEKKNFYLAGKESKYAYY CHNMDEIAIRDGSNPTRGMLYDEFKIISKGMGSIVLRMDEKMSERNRKNIKEYGLKFQKK KGTDTHYYIYIIKQDVKYYLNIDRNVNKLGFSKNSKTEWIVNPVPRSNYLFRIVDNKTGD LFLNVDANGLQVTTNSQPLYLYHAEDLAENTYTIHDKEVEYKHSDKLPQPKYKEYKTFYS YTKKPLSQLQKQWGYIYRQILDMIPDGDYTSYKVGENPGFDYDPPNENKEVYANYYNKAN IKLKWNIKPTDKDIITYNQRDYNGATKQITGQDIEFDVPEGGWISLEFPSVYDSGKFYKV TSVKENGTELKVTRMDGVEKENERGPFPDSFYFDTVNKDATYEITIEASYPEVKDFSYNL YVKRGDFDSYSNEGTITNLTDSKVQLLTTTKELDGGVLRDPFIIILYGYRDYEKQHSISY KYEYYMVKDGKATRLDLTGDSVSVDLKDIFANIKDEKQVKVYAVVTKTRKEDGAFKTFKT EDVTITCKYEGGDFSIDVANADTIHPMQSNRIVVLMKKTNENNEAASCEIKREGDENYTK IEKLGVGFPYVITQNGTYTIRLTDKFGKSATKTLEYNNIKEDMTLPKISGVEDGKIYCLT RTITVTDEHLRDVKVNGNRIELDENNQYVLSGSGKQVITATDDYDNVTEITVTINDHHIG GKATCKAKAICDICKMEYGELDSANHVEKVKWIITPTTHEQRYGCCNAVAVKKAAHQWKA GKCTKCGTRILVGKPGITAKSSNGTTIIRWKKVTNASGYKVYRAKNKKGKYELLKTTAAL SYSDSSIAGGQNYYYKVVAYYKNGSTMINGKASDVVLQVGTLKKVSLRVKNKKKSTASLS WKKADGAKKYQIYRATGKKGKYSKITTTKKLTYEDTSLSKNKTYYYKVRAYYVKAGKNIY GSYSNVKSVKITK >gi|222441875|gb|ACEP01000067.1| GENE 20 19203 - 19889 521 228 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027232|ref|ZP_03716424.1| ## NR: gi|225027232|ref|ZP_03716424.1| hypothetical protein EUBHAL_01488 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01488 [Eubacterium hallii DSM 3353] # 1 228 1 228 228 416 100.0 1e-114 MASRYEKEDSKRRILSACVQLFLEKGYQKTTTVEILKKADVTPSTFYNIYHTKGSILTEL TEFMFGNQFNISGKIVGEETNPVLLYAVETCLQLTLAELNENLREIYVEAYTVPENIELI HKKTAVQLQKIFAPYLPGYSASDFYEMEIGTAAFMRGYMARPCDMYFTLERKLARFLSMS LSVFKVPQEEQEAILSYIENLNIREIANKVMQQLFVTLEMKYEFTLTN >gi|222441875|gb|ACEP01000067.1| GENE 21 20063 - 20734 481 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027233|ref|ZP_03716425.1| ## NR: gi|225027233|ref|ZP_03716425.1| hypothetical protein EUBHAL_01489 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01489 [Eubacterium hallii DSM 3353] # 1 223 8 230 230 
359 100.0 1e-97 MKCRTPKLLVSFFILAIILIIAVPSTGFAAKTTVSTVKAKNGKKQSKNVTVKYRKQMNRI LEATQLYYSIRLNYSMQSGETKKIKLTSLEKQNVAAGRQVLEGQSRITNFAFSQRVKELF GANARIASLSFKTYPETPKELVVRCNSNYVKLAVGEWGEDYPVYKLKSVTKQGKKWKAVF KVNMYDAYADQTQPLGKVTLTLKKNKKSSYGFNITGITLRRSK >gi|222441875|gb|ACEP01000067.1| GENE 22 20876 - 22045 1072 389 aa, chain + ## HITS:1 COG:ECU11g0430 KEGG:ns NR:ns ## COG: ECU11g0430 COG0790 # Protein_GI_number: 19074843 # Func_class: R General function prediction only # Function: FOG: TPR repeat, SEL1 subfamily # Organism: Encephalitozoon_cuniculi # 7 165 189 347 590 106 36.0 1e-22 MGLFSFESKQVKEWKKLAKSGDMEAQYHLARAYANGKGASINMKRAVDYCVQSAEQEYAP AQALLAHFYGYGKGVDKNYEEMIHWGEKAALQGYAQAQYNVGRCYEQGKGKEKDFEKAMH WYMLAAQQGRPDAAYKLGQFYEHGLGVEEDIEKAEEWYKKAAEEDTGDDAEIGNTNIDIE EKMKKDLETLTSAQIEKMQDLDNKNIDNNRKYRCISEMKVGEAIYLHEEELLESPFVLIK AENHETELGWGKAKAICNTATLVSLTTGKKRTAILYCNNGSMRKEALKSIQSQVDDEYNH EAVAWNEPHSRIPVVLKNVQYTHKHNDMYVMTDRKTNEEYEMVPQSWAVPILDKLEYNTI VTIIYYEGKWDFVSKKETVDIGKYKFNIR Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:32:34 2011 Seq name: gi|222441874|gb|ACEP01000068.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont74.1, whole genome shotgun sequence Length of sequence - 12453 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 104 - 163 6.6 1 1 Tu 1 . + CDS 282 - 983 169 ## gi|225027236|ref|ZP_03716428.1| hypothetical protein EUBHAL_01492 + Term 1082 - 1126 10.1 + Prom 1149 - 1208 4.6 2 2 Op 1 . + CDS 1313 - 9370 4340 ## COG4646 DNA methylase 3 2 Op 2 . + CDS 9361 - 9810 315 ## gi|225027238|ref|ZP_03716430.1| hypothetical protein EUBHAL_01494 4 2 Op 3 . + CDS 9852 - 11117 834 ## gi|225027239|ref|ZP_03716431.1| hypothetical protein EUBHAL_01495 + Term 11154 - 11203 8.5 + Prom 11172 - 11231 7.2 5 3 Tu 1 . 
+ CDS 11251 - 12451 1050 ## COG3505 Type IV secretory pathway, VirD4 components Predicted protein(s) >gi|222441874|gb|ACEP01000068.1| GENE 1 282 - 983 169 233 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027236|ref|ZP_03716428.1| ## NR: gi|225027236|ref|ZP_03716428.1| hypothetical protein EUBHAL_01492 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01492 [Eubacterium hallii DSM 3353] # 1 233 24 256 256 471 100.0 1e-131 MLFTVFMFACQYGLTRETVAENFSRTGICWSEFALNVVFSILLIFLVWDDDYILSDESIT FRRVIFCKKHKWNEFPYCGICYKKNENADDRNEKLLYFSNKPESDSKFEKVYHVIEYTPE IEAIFQIYCPQMPDLDGKFWKNELMGTKMWEDADIDNYKRKIFKYDFIQSVHFLPIICAL KLASNYYVPVVMLVGAIVFTYSPFIKKWHNKVKELTREYQRAALCQLIAERCT >gi|222441874|gb|ACEP01000068.1| GENE 2 1313 - 9370 4340 2685 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1398 2667 1 1311 1315 649 32.0 0 MATTLLTAEEMLEETIRKVSSSPDEWLRFLNTASRVYQYSFDDQLMIYAQRPNAVGCTSF QIWKKTNHYVKQGTAGIALIHTVNGRKKLRYVYDYKDTDIVRGVPVTEVRKPYLWQIDEE DKKDVSDHLQRRYHIKGQEADLATMLKKLTEEIIEDTMAEEVPKLLRDRQDSYLEDLDQD TIRLEYRELFLSTAWYLLLSRCGIDPEEYMYLEDFRSITDFNNEDVLLRLGSPISEHCAS VLKEIGLYLFQKNLQKNRAETIVEGRAKEYNQDRQIPVQKEAEQDEINLYERRRRSDVSR AGTNKDGGAHREIRTDEGKIPERTRQIDISDPSDERLRGTSDRDPEAGRREDGRTGARAD EGGRGNGGTESSRPDEMGRTGKQLQTAGGGNSETGDYIQLSLFPQNEQEQADYGTAAGDK PAAVFAFPETYIDEALRTGGNDDHSLEYVLLDLMQKKPGELLRKDLKEAWKGGKGLFMPD GVKVSVWYSEEGIFFRRGNGARRNPEKTLSWDEAAERIESLYQNGQFTTKELQEKEFDTI RTRIAEYVFFFYRDSRLESRWGSVYIEAVEKITGELRDPIGVNTIYEELLSFEKRVDFET GWQKRNFEAAKEHLKKLNQASIPVSQNGTEPVHIAFITADEIDKLLQEGGIVQGGKSRIL SFYQQEPMPDKKVAIAFLKEEYGIGGHSHSISGSSHSDEWHDAKGFHLKKGKAEKKLTWK EAEKRIRILIENKEYPYVKDKEMQEQSVKEDHTKEKENVLNFSTEEKKEDTLQFQIDYSE HDRLSVYDKENERFHTKKISFAVADRVFTVLDEQERKQQEEEEAPSYYKTAFSVYAVIDG EEYSFRGRYDIGSEGKSLLEHIREYYAYCLSPDCLYRKHWAEDGMLEETIAQLTRDKKVF 
IPYLEEHLGLSPEEEKQAEEFLEKKEPGVLQDRRYEIVDTGRFDYPFSVHELTRTEEGFS YSGNTKFCRNKEEVDAWIEEQQEIPRKMEAAIESSEDYEDAAVSFVTTVDPEGNRQPAYR LLKKGKNGVEPYPSEEMLFHSIGEAKEYIKDHAEEIRQISYDEIMDWAAEIRQDKITATE VQEEILDAETVAVTPASEENSQNYHILKGELETGTIREKCWKNILAIETVKQLEKEQRQA TKEEQNILADYVGWGGIADVFDEKKPQWQEERERIQTALTPEEYRSAVGSVLNAHYTQPV LIKAMYQVLAGLGFTKGRILEPSCGTGNFFGLLPESMNKSTLYGVELDQMSAKIAGYLYP EVNIENTGFERTDYPDGYFDIAVGNVPFGDYRVNDPVYNRHGFLIHDYFFAKTLDKLRPG GVAAFITTKGTMDKENTKVRQYLFKRAELLGAVRLPNTAFKNAGTKVTSDILFLQKREKE IADSEWISVTEDAQGIPVNSYFAAHPEMILGTMKEISGPYGVETACIENEGVSLEIQLRD AIANITGHIPERMPEERAEIQYGTEPVTDMPDREANRIYSYVVSEAGDIYYKNESGLEQQ RVTKTTKSRITGMAAIRDCVRELIRLQVEETENTETKIQAEQRKLNTLYDSYTEEWGLLN SIANKRAFSEDSSYPLLCSLERLDEDGNLAGKADMFSKRTIRQKESITHVNTANEALAVS MTEHGKVDLSFMSGLCGKPGEKITEELAGVIYRNPVTQEWETADSYLSGNVREKLRIAEN FLESTPEYKRNVEALKNVQPKRLEASEIEVRLGSMWIPLEVYEQFMIETFQPPKYIAANH TIKIQYSSYTGEWNIQGKNIDGSILATNTYGTKRANAYRLLENSLNLKNIQIFDTYTDSE GKEHRELNKKETILAGQKQDMIKEKFKDWIFKDRERREMLVTIYNERFNSIRPRQYDGSH LEFPGMNPEITLKPHQKNAVAHQLYGDNTLLAHCVGAGKTFEMAAAAMEAKRIGITHKSL FVVPNHLTEQWGAEFLTLYPAANILVATKKDFQPANRKKFCARIATGDYDAVIIGHSQYE KIPLSAERQRNILEEQIDEIEMAISLAKEAQGENYTIKQMVKSRKNLEVRLAKLNDKKKD DVVTFEELGIDRLFVDESHAFKNLFLYTKMRNVAGVAQTESQKAMDMYNKCRYMDEITGG KGITFATGTPVSNSMTELYTIQRYLQYDRLQEMELGMFDNWASTFGETITALELSPEGSS YRMKTRFANFFNLPELMSVFQEVADIQTADMLKLPVPQAEYENIVLPASEQQKEILQSLA ERAELVRNGAVDPSEDNMLKITTDGRKLALDQRLLNDMLPDTENNKVSACAERCFTIWEE TKGQKAAQLVFCDSSTPKKDGTFNVYDALKEKLMKKGIPEEEIAFIHDANTDVQKARLFS KVRSGQVRFLLGSTSKMGAGTNVQDRLIALHHLDVPWRPADIEQQEGRILRQGNKNKKVK IFRYITENTFDAYSWQLIENKQKFIGQIMTSKSPVRSCQDVDEAALSYAEVKALATGNPK IKEKMDLDVQVTKLKMLKANYESNLFRLQDAIAVEYPEKIAKYEELTTAYDTDIKHIDTV LNTPFLMEIHGVTYNDEKAAGEMLVQACTQMKKAHADAENIGSFWGFQMKISYSLFDNSF YVRLTREASFIVEVKKDPVRNIERILTAMRNLPGQKKTAEERLEDARQQLVQAKKEVQKP FEKEEELKFIQARLVKVNAELDVGSDEEKIQKDGKNRQEEQRICL >gi|222441874|gb|ACEP01000068.1| GENE 3 9361 - 9810 315 149 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027238|ref|ZP_03716430.1| ## NR: gi|225027238|ref|ZP_03716430.1| hypothetical 
protein EUBHAL_01494 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01494 [Eubacterium hallii DSM 3353] # 1 149 1 149 149 259 100.0 7e-68 MPVIRVSKMGKCIQKLREERGWSLEELAGRILLSPEELKCMENGIKEPDKMTVTLLGYIL NVHADNLLNGEIHCRVSQTELMELMKETIDYLEGIQKENKQIGVDMEKIQEKYGLPGKDM PERERTERMQKQGHIFTNKIQKESVPKLL >gi|222441874|gb|ACEP01000068.1| GENE 4 9852 - 11117 834 421 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027239|ref|ZP_03716431.1| ## NR: gi|225027239|ref|ZP_03716431.1| hypothetical protein EUBHAL_01495 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01495 [Eubacterium hallii DSM 3353] # 1 421 1 421 421 792 100.0 0 MEEKSREVVMAISNAARYMDPFFKLSAMGTILLLQYFARMVKERKLKETEFTDFQKFLRM TEGKYDIMNVPEIPEEQLSEELNTLGIHYMILPDLEKNDGMLQVAVYQPDRENFGAWYQR HILSCMTGGEKDIQELRNLTSGKTTIVSFPLEDEEELIREDFEKLGINYSRLPDLHVGDG ELQVVIANADLPKVESWYKLYRDDLRKDGITDVPDMKKMSMDNYMQTGQQTEAEYIDTAS PELKAVNAKYEGKEKGEIEHQIEAAEYNTMGKESSTAYLRYVNDPAYIPISIDKKTLVEK SSVINKDGLDRYNQFACRIPGTYGKNEKQLVIPETQVFETQKGSYIAFLNKEEPVFVFNV RTKQVDHEMRKLTGEEFAKQYFDKVDRPSERKVTSLKKYKEKGKDLSDLKIKMPDPPIRS K >gi|222441874|gb|ACEP01000068.1| GENE 5 11251 - 12451 1050 400 aa, chain + ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 130 393 157 401 591 68 22.0 2e-11 MIQKRKPPWGMFFILLFLALAVSYYIAGLFKLDDVTIQNYQDKLAYILMHPFRNWFNEKT PAVIGIALTAWVMFVCYYLTYYRNFHPDAEHGVAEWADVPKAAKRLYGKGEEPVTCLSKN ITVNAKALPNMHILILGGSGDGKTTSLLIPNILLANMTNIILDVKGDLLKNYGGYLKEKG IAVKSLNFKDMAQSDQYNPFRYIENYTDMVELITNIQTSVKPPDAQKGDPFWDDGVGLYL QSLFEYEWLQAKEEGRTASMLGILDLVNKETQKTDEEGTTQLQQDMQELQERYGEDYPPV RDYRKLKEGASETVRSIVIMVNAQLRLFEIPEIKRIFEGTDDIDIPSLGLGVEGNPKKKT ALFLVMPSGDSSYNLFINMFYTQLFTVLKRVADNRRDGQL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:33:24 2011 Seq name: gi|222441873|gb|ACEP01000069.1| Eubacterium hallii DSM 3353 
E_hallii-1.0_Cont75.1, whole genome shotgun sequence
Length of sequence - 45943 bp
Number of predicted genes - 38, with homology - 36
Number of transcription units - 17, operones - 7 average op.length - 4.0
N Tu/Op Conserved pairs(N/Pv) S Start End Score
1 1 Tu 1 . + CDS 44 - 667 395 ## BCAH820_0604 endolysin
+ Prom 675 - 734 8.9
2 2 Op 1 . + CDS 778 - 2667 1163 ## bpr_I2549 hypothetical protein
3 2 Op 2 . + CDS 2681 - 3487 600 ## CDR20291_2292 putative cell wall hydrolase
- Term 3675 - 3718 -1.0
4 3 Tu 1 . - CDS 3816 - 4502 448 ## COG2964 Uncharacterized protein conserved in bacteria
- Prom 4564 - 4623 8.1
+ Prom 4674 - 4733 8.6
5 4 Tu 1 . + CDS 4778 - 5479 678 ## COG3341 Predicted double-stranded RNA/RNA-DNA hybrid binding protein
+ Term 5653 - 5706 1.3
+ Prom 5682 - 5741 9.5
6 5 Op 1 . + CDS 5811 - 5909 57 ##
7 5 Op 2 25/0.000 + CDS 5894 - 6976 717 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin
+ Prom 7003 - 7062 4.3
8 5 Op 3 42/0.000 + CDS 7114 - 7836 732 ## COG1121 ABC-type Mn/Zn transport systems, ATPase component
9 5 Op 4 . + CDS 7833 - 8681 759 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components
+ Term 8835 - 8879 10.1
+ Prom 8860 - 8919 7.0
10 6 Tu 1 . + CDS 8948 - 9037 101 ##
+ Prom 9354 - 9413 9.3
11 7 Op 1 . + CDS 9572 - 12490 3091 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like
12 7 Op 2 16/0.000 + CDS 12515 - 13420 1141 ## COG1159 GTPase
13 7 Op 3 1/0.000 + CDS 13429 - 14166 417 ## COG1381 Recombinational DNA repair protein (RecF pathway)
+ Term 14226 - 14267 7.4
+ Prom 14270 - 14329 6.2
14 7 Op 4 . + CDS 14369 - 15757 1696 ## COG0423 Glycyl-tRNA synthetase (class II)
+ Term 15801 - 15858 1.5
+ Prom 15974 - 16033 6.9
15 8 Tu 1 . + CDS 16173 - 18281 2431 ## COG3968 Uncharacterized protein related to glutamine synthetase
+ Prom 18630 - 18689 4.1
16 9 Op 1 1/0.000 + CDS 18725 - 20032 1174 ## COG0285 Folylpolyglutamate synthase
17 9 Op 2 5/0.000 + CDS 20032 - 20841 961 ## COG0294 Dihydropteroate synthase and related enzymes
18 9 Op 3 . + CDS 20843 - 21667 525 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13
19 9 Op 4 . + CDS 21684 - 22826 1143 ## COG0077 Prephenate dehydratase
20 9 Op 5 . + CDS 22854 - 25346 1948 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily
21 9 Op 6 11/0.000 + CDS 25386 - 26921 1524 ## COG0248 Exopolyphosphatase
22 9 Op 7 . + CDS 26935 - 29085 2079 ## COG0855 Polyphosphate kinase
23 9 Op 8 . + CDS 29106 - 30314 1138 ## COG1472 Beta-glucosidase-related glycosidases
+ Prom 30317 - 30376 6.1
24 10 Tu 1 . + CDS 30465 - 31919 1595 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase
+ Term 31993 - 32028 2.1
+ Prom 31921 - 31980 3.1
25 11 Tu 1 . + CDS 32037 - 33470 1742 ## COG0015 Adenylosuccinate lyase
+ Term 33520 - 33560 8.8
+ Prom 33528 - 33587 4.8
26 12 Tu 1 . + CDS 33612 - 35282 1402 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
+ Prom 35306 - 35365 6.5
27 13 Tu 1 . + CDS 35391 - 36680 1683 ## COG0104 Adenylosuccinate synthase
+ Term 36681 - 36738 2.1
+ Prom 36683 - 36742 3.5
28 14 Op 1 . + CDS 36868 - 37890 1112 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
29 14 Op 2 . + CDS 37862 - 38269 398 ## COG5341 Uncharacterized protein conserved in bacteria
+ Term 38418 - 38475 6.3
+ Prom 38356 - 38415 11.3
30 15 Op 1 12/0.000 + CDS 38594 - 39907 1341 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC
31 15 Op 2 12/0.000 + CDS 39928 - 40881 1186 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD
32 15 Op 3 13/0.000 + CDS 40881 - 41510 930 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG
33 15 Op 4 3/0.000 + CDS 41512 - 42270 1021 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE
34 15 Op 5 12/0.000 + CDS 42285 - 42860 723 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA
35 15 Op 6 . + CDS 42876 - 43667 1030 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB
+ Term 43721 - 43761 1.2
+ Prom 43875 - 43934 7.3
36 16 Tu 1 . + CDS 43977 - 44918 719 ## COG2267 Lysophospholipase
+ Prom 44924 - 44983 3.3
37 17 Op 1 . + CDS 45076 - 45327 321 ## Aaci_1292 sporulation protein, YlmC/YmxH family
38 17 Op 2 .
+ CDS 45398 - 45856 462 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains Predicted protein(s) >gi|222441873|gb|ACEP01000069.1| GENE 1 44 - 667 395 207 aa, chain + ## HITS:1 COG:no KEGG:BCAH820_0604 NR:ns ## KEGG: BCAH820_0604 # Name: not_defined # Def: endolysin # Organism: B.cereus_AH820 # Pathway: not_defined # 11 203 6 188 266 65 28.0 1e-09 MEAIMSNYYPDISHYHPVRNWGKVKSNCPFLIAKATQGTSYIDETLDTNIRECERRGIPY WLYAFLNRGDETAQTRFLVEVCKERVEKHFIGYVLDVESGNTEDGVESALNYLKRQRHKM MLYTMYSEYERYERIIRRRPKRCAWWEARYGLNDAVYRPRYSAHKGADLQQFTSNGICPG IPGRVDLNRITGQGKKEHWFRKAGKCD >gi|222441873|gb|ACEP01000069.1| GENE 2 778 - 2667 1163 629 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2549 NR:ns ## KEGG: bpr_I2549 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 68 628 32 614 616 110 22.0 2e-22 MNKGKNKGKIQKSRQNAEGREAGSFLGKAFDSYHKFLSNKVINFVINMILAIGTVLPFMM KMCNKVAFTYEVNDDAAIVQILDGSYTGTSDGHAIFIKYPLSWIIAKLYELNPKLPFTVP ADNGTNWYVTAIVLLEVFALTVVLFRILNYFRCNRILICFFYTLAFVYVWMPCFFHLTFS TVAAFLGCMSLLFTGFSKKEELWRPWNLLCLGILGISAYCMRKQCFYMVIPFLLIEIWYK YRMDFFRSVKPWFIFGVCGVLGAGILFLNTQMYGSMGWKNYFIYNHARAYMQDYTGLPDY EENEDFYQSIGVSENAQKVFKSYSYCLYDDFSTETIEKIYNYQKTQELQLSLEQKAENAK EKAYRYCMKKKQTGEFLKFSGFYVWFLIVPLTAVTLLFKWKNGFLRWVSTFLYGGTCAFL IHIEWIYLAMNGRFPQRVEESIRLLMLSVGFMIVCHLLSFWKDTSFIRISVVIQCILLAV ILHMGVNGSDRIQAIQGIQAGREAGSGQKAEIAAYCGKHKENYYILDTQSFGKPSGVKDD LHQGNWYMSGSWTAYSPLYEEKLAKDGISNLGTGFLLKKNVYIITKGKKNVLNLLGQEDT EHLTYKAVDEIEASGNLFFAVYKVSRSVK >gi|222441873|gb|ACEP01000069.1| GENE 3 2681 - 3487 600 268 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2292 NR:ns ## KEGG: CDR20291_2292 # Name: not_defined # Def: putative cell wall hydrolase # Organism: C.difficile_R20291 # Pathway: not_defined # 94 261 75 250 396 69 29.0 1e-10 MPKLYKKLIDIIALLCLAAVFLFTFSYLKSWWTEEIGEKRQEQNIWKEKRTGTWCTIKSN SLDGVPIYDEAGGSVPIGSLPEGKLCELTDSTIKEGKKWGKVKYAGLSGWMKMSYLKYIC QESISIQEDSQIYINVSTEKGIRMYQEPDVTSDAVLKGIPYGAEFIVQETRDGWGKVSNN 
GRTGWINLYYAGCYPESSKAAWKVETLSSAQQINFREKPGEDQRSIAKVPENTYLEMKEF KNGWGRTEYGGQEGWVKLSYLTPCKKKK >gi|222441873|gb|ACEP01000069.1| GENE 4 3816 - 4502 448 228 aa, chain - ## HITS:1 COG:FN1942 KEGG:ns NR:ns ## COG: FN1942 COG2964 # Protein_GI_number: 19705247 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 3 226 4 226 229 167 42.0 2e-41 MTDLLYSYTTTVKFLGSVLGPDYEVVLQDLSNINHSVIAIENGHISGRTIGSPLTSAVFQ MLSSKVYEEDDFIANYKGVAENGHILRSSSMFIKDSNGNPIGLLCINFDDSRYMELHEKL FSLIHPGNFVKNISSGIAESKENVKDLSSRITENFAIDVASLREQIFNDATANFTTPVDR LNQNERKEIMIKLYEQGLFQLKGSIPFVAKRFSCSQATIYRYLGEINN >gi|222441873|gb|ACEP01000069.1| GENE 5 4778 - 5479 678 233 aa, chain + ## HITS:1 COG:CAC2551 KEGG:ns NR:ns ## COG: CAC2551 COG3341 # Protein_GI_number: 15895813 # Func_class: R General function prediction only # Function: Predicted double-stranded RNA/RNA-DNA hybrid binding protein # Organism: Clostridium acetobutylicum # 1 217 1 239 240 122 35.0 4e-28 MAKKKIYAVRKGHKTGLFYTWDECKKAVYGYSGAEYKGFLTKEEAEAFLNIGAAKIITNK NSSTAATGVDSTALTTSTSDRLVVYVDGSFDVSIQKYSFGCVILLEDGRIIERSGNGEEP ELLAIRNVAGEMHGAMYAVQWAFENGYSSVVIHYDYEGIEKWATGVWSAKNPHTKKYAAF MKRMQEKIEVIFQKVKAHSGDFYNERVDKLAKKALVEGHGIPDVDVYREKNRK >gi|222441873|gb|ACEP01000069.1| GENE 6 5811 - 5909 57 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQLFKKKKSYQQVLCYWQDFFYVLFYQWDVL >gi|222441873|gb|ACEP01000069.1| GENE 7 5894 - 6976 717 360 aa, chain + ## HITS:1 COG:lin0191 KEGG:ns NR:ns ## COG: lin0191 COG0803 # Protein_GI_number: 16799268 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Listeria innocua # 29 359 24 311 312 115 28.0 1e-25 MGCSVVKNTSGKNTTRKNAESENTIEQNSIEKGNSNKISIVCTTFPQYDWVKNILGEEAE RFNVTLLLDNGVDMHSYQPAVKDIATAGSSDLFIYVGGESDTWVEDALKEAKNKDLKAIN LMETLGNFVKEEEVVEGMQEERESLGHSHEKSSKEKQEQTQKESHENSQEINGQKEAADE EPEYDEHIWLSIRNAEIMVKNIEKAIEQLDSDNAKVYQTNAENYIKKLDTLDKQYANTIQ 
NAKYKAILFGDRFPFRYMADDYDLKYYAAFAGCSAETMAGFETVTFLAKKADELRLPVIL TIENSDGRIAEAVKSNTTKKNQKILAMNSLQSVTKEQIADGITYLQVMQENLSVLSEALN >gi|222441873|gb|ACEP01000069.1| GENE 8 7114 - 7836 732 240 aa, chain + ## HITS:1 COG:MA0024 KEGG:ns NR:ns ## COG: MA0024 COG1121 # Protein_GI_number: 20088923 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn/Zn transport systems, ATPase component # Organism: Methanosarcina acetivorans str.C2A # 4 216 5 221 254 130 36.0 2e-30 MAYITCKDLTLGYEGHAIVSDLNFKVSKGDYLCIVGENGTGKSTLIKTLLHLQDAISGDI ITGDGLKAYEVGYLPQQTVVQKDFPATVEEIVLSGTLAKCGWRPFYRKAEKKLAEEKMKR MEVWKLRKSCYRNLSGGQQQRTLLARALCAASKVILLDEPVTGLDPKVTAEFYQVTKDLS EEGIAVIMVSHDVQALDYATHILHMQKKNSFYGTKEEYMNSDKWKLFKNAEENRIGGDKK >gi|222441873|gb|ACEP01000069.1| GENE 9 7833 - 8681 759 282 aa, chain + ## HITS:1 COG:lin0193 KEGG:ns NR:ns ## COG: lin0193 COG1108 # Protein_GI_number: 16799270 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Listeria innocua # 14 277 6 264 268 92 31.0 1e-18 MTHILQELSFYLSYPFVRYALIVGVLIALCSSLLGVTLVLKRFSFIGDGLSHVAFGAMAV ATIFKLVNNTIFIMPVTIAAAILLLRTGQNTKIKGDAAIAMLSVSSLAVGYLVMNLFSTS ANVSGDVCSTLFGSTTILTLTKTEVFTCIIMSILVVAFFILFYNRIFAVTFDENFTRAIG KNAETYNMIIAIITAVIIVLAMSLVGSLLISALIIFPALSAMRVMQSFKGVIIYAAVLSV IGAFFGIVLSILCSTPVGATIVILDLILFFVNCIIGKFIGRI >gi|222441873|gb|ACEP01000069.1| GENE 10 8948 - 9037 101 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKGFIEKLEELLELGETKRDLKRPCIAFH >gi|222441873|gb|ACEP01000069.1| GENE 11 9572 - 12490 3091 972 aa, chain + ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 3 970 6 974 976 761 43.0 0 MLKETDFKSYSFVQKEKIDELNGYGYVLEHKKTGARVLLIENDDTNKVFSIAFRTPPADD TGVAHILEHSVLCGSDKFPSKDPFIELAKGSLNTFLNAMTYPDKTVYPIASCNAQDYHNL 
MHVYLDAVFHPNIYKRDEILKQEGWHYEIADKDDELKFNGVVYNEMKGVFSSPDDVLARK IQEALLKDTPYAFESGGDPDAIPELTREKFLEFHSKYYHPSNSYIYLYGDVDFARELAFI DEEYLSHYEKKTVDSKVAMQEVFKAPETLKDTYSVSEKEEEGVYLSYNVAAGDSCDNERG LAMQILDYVLFTMPGAPVRKKLIDAGLGKDVDSYYDGGIQQPLFSVIVKNAKKGNEALFI KTLEEALREQAENGLNKKAIYSAINNYEFKYREADFGRFPKGLIYGLNFLNSWLYDDTKA LELADSLTPLARLKEKVETGYFEQLIKESFLENTHKAYVYLYPEVGKNERLEEELKEQLA RMKDKLNAKQLNYLIEDTKKLKEFQETPSTQEELEKIPTLDLSDISREVLPFKNKEVTIG GTTAVVHEYHTNGIVYSDFCFDMSELPEELIPYATLLTEIYRYVDTEHFSYNDLATEINL KIGGLSFQTGMNVLVWKKDAYRPYFSVHMKCMENQVADGMSLLKEVLLSSKMDNKKRLKE IISELRTKMDTRIPAAGHVYAANRALSYIDPMMKYKDTAEGIGFYEFVKKLDKNFDSNAD LLMKQLVRAQMCIFRKENLTLSLTGEFNFKSLMEGEMLQFNRMLYDMPCVKAVPAFVLEK KNEGFKTASKVQYVASAGCFEKEGQEYHGALKVLKTIFSYDYLWVNVRVTGGAYGCMCNF SRNGYGFFTSYRDPNLSATLDVYKKAADYVRNFEAGKRDMTKYIIGTISGIDQPLEPSAL GERSFHAYQSGITVEMIQKERNQVLDATDETIRSLADYIESMMGAGTVCAIGNDRKLEEE KEIFKQVCSLNQ >gi|222441873|gb|ACEP01000069.1| GENE 12 12515 - 13420 1141 301 aa, chain + ## HITS:1 COG:BH1367 KEGG:ns NR:ns ## COG: BH1367 COG1159 # Protein_GI_number: 15613930 # Func_class: R General function prediction only # Function: GTPase # Organism: Bacillus halodurans # 5 300 8 303 304 335 57.0 7e-92 MSRCFKSGFVTIIGRPNVGKSTLMNQLIGQKIAITSSKAQTTRNRIQTVYTSEEGQIVFL DTPGINKAKNKLGDYMLMAAERTLNEVDLILWLVEPTTFIGGGEQYIIEKLQNVKTPIFL VINKTDVATEEEILKAIVAYKDQCNFAEIIPVSALEGNNTDDLVHSIFKYLPEGPMYYDE DTITDQPERQIVAELVREKALRLLSQEIPHGIAVVIERMKTRKNGNIVDIEAVIVCERES HKRIIIGKKGAMIKEIGTQARVEMENLLDMKVNLRLWVKVRKDWRDSDILLKNYGYNKKE L >gi|222441873|gb|ACEP01000069.1| GENE 13 13429 - 14166 417 245 aa, chain + ## HITS:1 COG:BH1369 KEGG:ns NR:ns ## COG: BH1369 COG1381 # Protein_GI_number: 15613932 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Bacillus halodurans # 6 243 4 241 254 96 28.0 4e-20 MSEQTKVTGIVLSVYPVGENDRRLTILTKERGKIQVFARGSRRPKHPLFGVTQPLIYCEF MVTDTRNYTYLNSAECKDYFYFLKKDLEDIYYSSYFAEVAEYFTMEGQDERNILNLLFVT FLAMKKKLISLSLIRRIYELKILQFAGFGLQSFQCVHCGNEEHLTRISFEEGGMLCSGCA 
VKGQGREVESAVLYVLQYVSSIPLNRLYSFTLKENIDKEFEWIVKRYFAVQAGHRFKSAE MLECV >gi|222441873|gb|ACEP01000069.1| GENE 14 14369 - 15757 1696 462 aa, chain + ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 449 4 450 462 694 72.0 0 MEKTMDKIIAVAKARGFVYPGSEIYGGLANTWDYGNLGVELKNNVKKAWWKKFIQESPYN VGVDCAILMNPQTWVASGHLGSFSDPLMDCKECKERFRADKIIEDFAQEHEIELESSVDG WSQEKMKAFIDDNNVPCPSCGKHNFTDIRQFNLMFKTFQGVTEDAKNTVYLRPETAQGIF VNFKNVARTSRKKIPFGIGQIGKSFRNEITPGNFTFRTREFEQMELEFFCKPDTDLEWFA YWKDFCVNWLKDLGLKDEELRIRDHDKEELSFYSKATSDIEFLFPFGWGELWGIADRTDY DLNCHQEVSGQDLTYFDDAEKKKYIPYVIEPSLGADRVTLAFLCAAYDEEEIKDGETRNV MHFHPAIAPVKVGILPLSKKLNEGAEKIYHELSKIYNCEFDDRGNIGKRYRRQDEIGTPY CVTYDFDSVEDGAVTVRDRDTMEQERIKIEDLKDYFAEKFNY >gi|222441873|gb|ACEP01000069.1| GENE 15 16173 - 18281 2431 702 aa, chain + ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 5 702 4 696 696 882 63.0 0 MKTDVKKIFGSNVFNDAVMKERLPKAAYKKMKATIEAGAELDPEVADIVAHEMKEWALEK GATHFTHWFQPLTGITAEKHDAFIDPQGDGTTLLRFSGKELIKGEPDASSFPSGGLRATF EARGYTAWDCTSPAFIKETPHSTILCIPTAFCSYKGEALDKKTPLLRSMQALDVQAIRIL RLFGNKTAKHVTTSVGAEQEYFLVDKGNFLKRKDLIFTGRTLFGAMPPKGQEMDDHYFGV IPERVLGFMEKFNEELWKLGISAKTQHNEVAPAQHEIAPIYDQNNIATDHNQLVMETAQR VADELGLKCLLHEKPFAGINGSGKHNNWSMCTDEGENLLDPGETPHENMQFLLFLAAILR AVDEHADLLRLSASTPGNDHRLGANEAPPAIISVFLGEQLEDVVEQFVDNGEATSSLEGE EYISGVHSLPHFQKDATDRNRTSPFAFTGNKFEFRMVGSTQSIADPNTVLNTIVAEVLCD MADKLEAAEDLPMALHDMIKDTIAAHKRIIFNGNGYSDEWVVEAERRGLPNIKCMVDAVP ALVKPEAVEIFEKHGVYTKAEMESRAEVMYETYSKTINIEALTMIDMAKKQIVPATIKYQ TSLAESVNAIKAASSAVDTSVQEALMADISTNLKDMYAALAHLEEVAGKAEAMEEGAEQG HYYHDVVFTAMDELRKPADKLEMLVAKSAWPFPSYGDLIFEV >gi|222441873|gb|ACEP01000069.1| GENE 16 18725 - 
20032 1174 435 aa, chain + ## HITS:1 COG:CAC2398 KEGG:ns NR:ns ## COG: CAC2398 COG0285 # Protein_GI_number: 15895664 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Clostridium acetobutylicum # 1 431 1 425 431 271 42.0 2e-72 MNYISLIEELKKRGSIPGLDAIQGLLEELGHPEDDLKIVHIAGTNGKGSVFAYLSSILMA AGFKVGRYISPTISCYEERFQINGTYIAKDKLERLYGVVEEAMRKEEAKIGLRPTLFEVE TALSFLYFKEEKVDYALVEVGMGGRLDATNVIKHPELTVISSISYDHKAILGDTLEEIAC QKAGIIKESAPVVLSENPEEVCKVVKHEAAEKKVHCIEVKPQDYEILAETPYGSTFLWKE QRYETKLPGNHQISNAVTALTASEYLFEKDYEKNEARKKIPKELDTMNIKAAQQGGIIRT IWPGRLEVLKKEPLFYRDGAHNPDGARKLASFLQKHFTNKRIIYIMGVLKDKEYTKILQY LMPMAKKVYVFRPNNERGLSAEILANTIKEKADVPVMIEPDVNTAVSKALEAAQPEDILV ACGSLSFMEEMEAIQ >gi|222441873|gb|ACEP01000069.1| GENE 17 20032 - 20841 961 269 aa, chain + ## HITS:1 COG:CAC2926 KEGG:ns NR:ns ## COG: CAC2926 COG0294 # Protein_GI_number: 15896179 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Clostridium acetobutylicum # 1 266 1 266 268 287 53.0 1e-77 MKIGKYEFDLLHKGYIMGILNVTPDSFSDGGKYQSVDLALKQAERMIEEGAAILDIGGES TRPGHKKITDQEEIERVVPVIEAIKKNFDIAVSLDTYKYEVSKAGIAAGADMINDIWGLK WDERLAPLLAKEDVACCLMHNRDNQEYTNFIEDFCSDMEETLAIAKKAGIRDERIVLDPG VGFGKTFEQNLSIMKHMDVFSKWGLPVLLGTSRKSVIGLALNLPVDQREEGTAATTVLGR MKGASIFRVHDVKTNYRELKMIEAILEAE >gi|222441873|gb|ACEP01000069.1| GENE 18 20843 - 21667 525 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 1 271 1 270 278 206 37 2e-52 MDEIRVKNLEVFCHHGVYKEENVLGQKFLVNIVSKVDTRAAGKTDELELSVSYGDICRCV KKEMTKQNDKLLERVAERLAECILLQFPLIKEVEIEVKKPWAPVLMHVDYTSVKIKRSWH KVYIGVGSNLGEREEYMKLAKEKVSALPDTKNFKSASIIETEPYGYTEQGKFLNTVYSIE TLDTPAEFLDKLHQIENEAERKREIHWGPRTLDLDILLYDDLVTEDEEITIPHPELTKRL FVLEPLCELTPRGIHPLERRRYSDILEDLKEKEK >gi|222441873|gb|ACEP01000069.1| GENE 19 21684 - 22826 1143 380 aa, chain + ## HITS:1 COG:DR1147 KEGG:ns NR:ns ## COG: DR1147 COG0077 # 
Protein_GI_number: 15806167 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Deinococcus radiodurans # 117 380 23 285 293 180 38.0 3e-45 MDKEIRKVDDVRDDITKIDYEIAELFEKRMGFAAELALSKKQAGESIYNKNKEDEKLSDI TKNRSNPFVIKGLEEVFIQMMSISRKYQYHMVHQRDRYIENYFTEVPELVMFPDTRIVYP GVPGSFSEMACEKFFGADVDHYAVVNFKDVAMALNNGDADYGVLPIENSSAGDVTGVYDI LLENDVCMVGEVFVKVEHCLLGCPGSKIEDLEVVLSHPQGLMQCAPYLENLDVKKVSVEN TAIAAERVAREKIMTQGAIASRRAAELYGLDILDAGINFDKNNVTRFVILSKKRQYTENA NKISISFSLLHESGTLYNILSHFLYNDLNLSHIESVPLPDQQWEYRFYIDINGNLHDPAV KNALQGVRTEVADFKILGNY >gi|222441873|gb|ACEP01000069.1| GENE 20 22854 - 25346 1948 830 aa, chain + ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Sinorhizobium meliloti # 393 822 215 628 639 181 28.0 4e-45 MSETVKNKHVKKKNSAVLLLKTVITAVLLFFTWYLCSHFMEYQKNATNQVNKYRIDQVCQ LSAGSAVSQKFVAKHTHLKTVKVYFGNDYSGQASGKVILNIIDLETGKSIQRLTKNISDI VNNDYTEFKTDLQLTKKKEYSIQLTTSGAESGKEPLIFQWTTKETGFRGKLKINQEEQGK YLVSKLYYPVTIYQQWAGICMMMALVLLLLWFALPAPEMVKKALGQILFFAAPLFTFWFV ERFTDNPIFRMRAAEFWLNILVYYMFFGLLYLIFNSRRVSVTIGSILWCIIGIANYYVLS FKGAPIVPSDIMSARTAANVAENYTYSIQPVFVWNVLFLLLYLAIMWRCPVPKKMGWKKR VIMLVVIGLLGSVLGHFVVEQKTLKNFGIKNNVWDQKKGYAKNGLFFGFVLNMNSLVQEK PSDYSVEAAKDIAEKYEEKFANEDSDKKKKGRLETADGTKPNVIGIMNEAFSDLSVINEF STNEDYMPFIHSLKENTIKGSLYMSIFGSGTCNSEFEYLTGNSMSFLQNGIIAYTQVVKD KLPNMTYLLKEQGYKGNLALHPYLASGWNRVQVYDYMGFDHFYSETDFKNPTMYRKYISD ESDFKKIEELYENRTEKDEPFYLFNVTMQNHGGFDKTYSNFHNDIQITDNHKNEQAEQYL SLVKKTDDAFKQLVEYFSKVKEPTIIVMYGDHQPAVQSSFYDSLFGKSAGSLTNEELMNK YRTPFIIWANYDIKEKTIDKMSANYLSAYVMNEAGLETSPYQKFLLKLRKKLPVLTAMGC FDKKGKYYESALESPYSDMVKEYQILQYNNLIDTKHTVNSFFYLSDEQKK >gi|222441873|gb|ACEP01000069.1| GENE 21 25386 - 26921 1524 511 aa, chain + ## HITS:1 COG:MA0083 KEGG:ns NR:ns ## COG: MA0083 COG0248 # Protein_GI_number: 20088982 # Func_class: F Nucleotide 
transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Methanosarcina acetivorans str.C2A # 7 500 13 512 543 116 23.0 1e-25 MGISPYAAIVIGSKNMLMKVYEISGGKGFRTVDEIRYEYELGKEAYFTRKITFAQIEEIC NVLMEFQQRMREYAITEFRCYATSAIRNAKNQVSVLNQIKIRTGVDVIMLSNSELRFLMY KGIQVLDIDFNKIIEKNTAILDIGSGSVQVSLFDKQALYMTQNLDIGTVKVRGFLESVEN NVLDYISVLEEYIQYEFDMFKSGYLKDKDIKNVIAIGDEIKNINRLIPELHLTDTMSYEQ ILYIYKKIKKASYQDIALKYGLSIEDARMVLPSVLIYKTFLERSKADTVYICATNLCDGS VVDYAQETKKIKLSHDFYQDIVASAKYIGKRYRYSKVHAQYITDLALEIFDKTKKFHGLT QRDRLILEISAILHDIGKYVNLNDPGKNGYQLIMSTEIMGLSHREREEVANIVLYNTQEL PDTDELDTAFWREEYLRIAKLSSILRLANALDRGHKQKLEKCSMARKGKELIISLETNQD ITLEITRFSKKAETFSEFTGITPVLKQKRAL >gi|222441873|gb|ACEP01000069.1| GENE 22 26935 - 29085 2079 716 aa, chain + ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 6 690 14 696 705 715 53.0 0 MDKSIFDRPENFTNRELSWLEFNQRILGEARDRKNPLFERMKFLSITASNLDEFFMVRIA SLKDMVNAGYKKKDIAGMTPQEQLGALNEKTHAFCEKQYTTYNRSLLPKLSEAGLEIVTF NSLSEKEEEFLEEYFHKNVYPVLTPMAIDSSRPFPLIQNKTLNIAALIKSRGEDKKEKKD YDIATVQVPSVLPRVIVLPQKDGGKRKCRVILLENVIEHYLDVLFLNHEIICSAPYRIMR NADLSIDEDDAEDLLKEIEKSLKMRQWGEVIKFEYEERMDSRLVKYLKKQFKVHPCDMYA FNGPLDLTFLMKCYGIEGFSDLKEEPYTPQKNKKLRADKNIFTQIRKGDVLLHHPYESFD PIVAFIRQAAEDKDVLAIKQTLYRVSGHSPIIAALAQAAENGKQVTVLVELKARFDEENN INWARKLEKAGCHVIYGLVGLKTHCKIALVVRKEADGIRRYVHLGTGNYNDSTAKLYTDT GMFTCRDVVGEDATAVFNMLSGYSEPANWNRLIVAPIWMKKRFLEMIHRETKNAKEGKPA KIIAKCNSLCDRKIILALYEASCAGVQVDLIVRGICCLVAGKPGVSENIRVRSIVGTFLE HARIFYFYNDGHEEIYMGSADWMPRNLDRRVEIVFPVEEEELKEKAKHILDVQLSDTLKA HRLLEDGTYQKVDRRGKEAIEAQKTFCEEAIAAANESKKKVPKKRTFDPRFSPQEE >gi|222441873|gb|ACEP01000069.1| GENE 23 29106 - 30314 1138 402 aa, chain + ## HITS:1 COG:BH0675 KEGG:ns NR:ns ## COG: BH0675 COG1472 # Protein_GI_number: 15613238 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related 
glycosidases # Organism: Bacillus halodurans # 46 393 130 484 686 247 39.0 2e-65 MLLLCFTLFAGILAGCGSHKKKEKSEESTSTSKDSGELDDITLEGMVSDALSSMTLKEKI GQMFIVCTDSLDFDAETEVTKKMQQKLEEYKPGGVIFFSYNLKDREQVKSMIADIQKSND IPLFISVDEEGGSVARIANSKNMQTTKFPAMSEIGKSGDSKKACEVGETIGKEIRELGFN LDFAPVADVNTNAENTEIGNRSFSSDAKTVAGMVSQEVKGLQSQGVSSTLKHFPGQGQCG EDTHKGYVDLNATIDRLREVEFLPFESGIAAGADMVMMSHVAVSQVTGKETPASLTKLMV TDILREELQFNNVIITDAMNMKVITKFYDADQAAVMAVQAGNDMILMPDNFEQAFEGVLE AVKDGTISESKINEAVSRILSVKIRRGILPTDSKLFRHNAGK >gi|222441873|gb|ACEP01000069.1| GENE 24 30465 - 31919 1595 484 aa, chain + ## HITS:1 COG:CAC1392 KEGG:ns NR:ns ## COG: CAC1392 COG0034 # Protein_GI_number: 15894671 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Clostridium acetobutylicum # 5 477 2 467 475 487 53.0 1e-137 MAFDMEEQFSEECVDSGIHEECGVFGAYDFDGNDIASTVYYGLFALQHRGQESCGIAVSD TNGPKKNIQVHKGMGLVNEVFSSENLEKLKGNISVGHVRYSTAGSSTRENAQPLVLNYYK GTLALAHNGNLVNALELREELEKTGAIFQTTIDSEVIAYHVAKERISSATAEEAVLAAMR KLKGAYSLIVMSPRKLIGARDPFGFRPLCIGKRDNTYFLTSETCALDTVGAEFVRDVEPG EVVTLTPSGIVSNRELCFKDSSKQARCIFEYIYFARPDAVIDGVGVYASRIKAGRFLAMD SPVEADMVVGVPESGNPAAQGYAMESGIPYGTAFIKNSYVGRTFIKPKQSMRESSVQVKL NVLKDAVKGKRIVMIDDSIVRGTTCHRIVRMLKDAGAKEVHVRISSPPFLHPCYFGTDIP SEDQLIAYGKTLDEIRDSIEADTLAYLGMERLKELNNGLPYCDACFSGNYPIEPPTQDIR GEQG >gi|222441873|gb|ACEP01000069.1| GENE 25 32037 - 33470 1742 477 aa, chain + ## HITS:1 COG:CAC1821 KEGG:ns NR:ns ## COG: CAC1821 COG0015 # Protein_GI_number: 15895097 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Clostridium acetobutylicum # 1 477 1 476 476 658 66.0 0 MYDKYQSPLSERYASKEMQYIFSPDKKFKTWRKLWIALAETEYELGLDSVTKEQIEELKA HADDINFDVAKAREKEVRHDVMSHVYAYGVQCPNAKGIIHLGATSCYVGDNTDIIIMTEA LKLVRSKLINVIAELSAFAMKYKDLPTLAFTHFQPAQPTTVGKRATLWLHDLMLDLEDLE YVISTMKLLGSKGTTGTQASFLELFNGDHETIRKIDGKIAEKMGFDACYPVSGQTYSRKI DSRVLNVLSGIAQSAHKFSNDIRLLQHLKEIEEPFEKHQIGSSAMAYKRNPMRSERIASL 
ADYVMSDAMNPAFVASTQWFERTLDDSANKRLSVPEGFLAIDGILDLYLNVVDGLVVYPK VIEAHLMRELPFMATENIMMDAVKAGGDRQELHERIRQHSMAAGRVVKEEGKENDLLERI AADSAFGMTMEQLQAIMKPENFVGRSPEQTVEFITEYVQPVLDENKDILGMKAEINV >gi|222441873|gb|ACEP01000069.1| GENE 26 33612 - 35282 1402 556 aa, chain + ## HITS:1 COG:CAC3213 KEGG:ns NR:ns ## COG: CAC3213 COG2244 # Protein_GI_number: 15896460 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 7 527 2 494 512 175 29.0 2e-43 MSSKDTKETNVLAQASILMVAGIVTRIIGVLYRSPVTSIIGDEGNGYYSIAVNIYTMILL ISSYSIPMAVSKVVSAKMAVHQYKIAHKVFKCALIYVLVIGGIASLITFFGASVLIPSNQ PKAIPVLRILAPTIFFSGFLGVFRGYFQAHRTMIPTSLSQIVEQVANAVISIGAAVFFIH AFAGGDADKIPVFGAMGSAAGIGAGVITGLLFMLFIYSCNKKYFKKEIQKDDTGVDSSYK DIFRIIFMMVTPVILSTFVYNISTVVDQTLFMDIMGFKRMASETAASLYGVFSGKYWVLI NVPIALANSMSTAMIPAISGSYELKDYKKCRGHVRQAIHFTMVISIPAAIGMGALAYPIM EVLFPQKATIDLAVSILRTGCISIIFYALSTVSNGVLQGIGKVNIPLRNAAVSLVLHVVL LTPLLYFTNLNLYALVFATMFYAFLMCLLNNLSVRKYLGYHQEMKRTFMIPLLSSVIMGI LCYVFYQGIYLILSGIFGSFIHLRILVFICLMISVGFAVIVYFVLVLKLKGITEAELRNF PKGNMLVRMAKKSKLL >gi|222441873|gb|ACEP01000069.1| GENE 27 35391 - 36680 1683 429 aa, chain + ## HITS:1 COG:SP0019 KEGG:ns NR:ns ## COG: SP0019 COG0104 # Protein_GI_number: 15899967 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Streptococcus pneumoniae TIGR4 # 5 420 6 418 428 371 43.0 1e-102 MVKAIVGANWGDEGKGKITDMLAQDSDIIIRFQGGANAGHTIINDYGKFALHTLPSGIFY GHTTSIIGNGVALNIPILFNELKEITSKGVPAPKILISDRAQIVMPYHILFDEYEEERLA GKSFGSTKSGIAPFYSDKYAKVGFQVNELFEEEAVLREKIERVIVQKNILLEHLYHKPLI DADELYNTLMEYKEMVAPYVCNVSAYLDKAIKEGKNILLEGQLGTLKDPDHGIYPMVTSS STLAAYGAIGAGIPPYEIKKVVTVCKAYSSAVGAGAFVSEIFGDEAQELRVRGGDGGEFG ATTGRPRRMGWFDCVASKYGCRLQGTTDVAFTVVDVLGYLDEIPVCTGYEIDGKVTTEFP VTNQLEKAKPVLEVLPGWKCDIRGIKKYEDLPENCRKYIEFVEEHIGYPITMVSNGPGRD DIIYRGFEK >gi|222441873|gb|ACEP01000069.1| GENE 28 36868 - 37890 1112 340 aa, chain + ## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## 
COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 340 1 338 338 373 54.0 1e-103 MKILTIAIPSYNSMDYMRNCIESLLPGGEDVEILIVDDGSKDETPAIADEYEAKYPGIIR AIHQENGGHGEAVNAGLRNATGFFYKVVDSDDWVDAEAYQKILAFLKEAVKEEEPLDMLL SNYVYEKVGAKHKKVIQYHSILPENRYFGWEEIGHFHASQNILMHSVIYRTELLRSFQFE LPKHTFYVDNIFVYWPLPYVKKMYYLDVDFYRYFIGRDDQSVNEKVMISRIDQQIRVNKI MIDLYAKHESMLSCPQLKEYMLHYLETIQMVTSVLLMKMNTSESEAMRDDLWHYLEEKSP DAYKILKSSVLGKFTKSHNKLSHKLTMGGYKIAQKWIGFN >gi|222441873|gb|ACEP01000069.1| GENE 29 37862 - 38269 398 135 aa, chain + ## HITS:1 COG:L178600 KEGG:ns NR:ns ## COG: L178600 COG5341 # Protein_GI_number: 15673322 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Lactococcus lactis # 18 131 21 132 135 73 37.0 1e-13 MHRNGLDSTKKSYKNDILLILFFLIIGIGAFAFMQLQGKRGAEVKVSVNNKEYGIYSLDK NQTVTIGEDDWENILIIKDGKASMAKADCPDKICVNHAAISKKGETIVCLPHKVVVEVVD ENGTQDGNQIDIISK >gi|222441873|gb|ACEP01000069.1| GENE 30 38594 - 39907 1341 437 aa, chain + ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 437 7 441 441 334 42.0 2e-91 MGLLTFKGGLHPYDGKELSKDKPITEYLPQGELVYPLSQHIGAPAVACVKKGDRVLVGQK IAEAGGFVSANIYSSVSGTVKKIEPRMTVSGNKVNSIIVENDGEYETVSFQPVTEKTNEA IINAVKEAGIVGLGGAGFPTHVKLSPKEPEKIDTIIINAAECEPYITADYRCMMEIPEQL ISGLNIMLSLFPKAKGVIGIEDNKPEAIAKLTEMCKNESRIEVAALKTKYPQGAERSLIF AVTGRAINSSMLPADAGCIVDNVATAIAIHEAITLGKPLFERVVTVTGDAIKNPGNFKVK NGTNAAELVEAAGGFKSQPEKVISGGPMMGMALSTLDVPCAKTFSSLLCFTKDEVAACEP SNCIRCGRCISVCPAGLMPTKLSEIADHGDFALFDELNGCECVECGCCSYICPAKRRLTQ SMKTGRREAMALRRKKK >gi|222441873|gb|ACEP01000069.1| GENE 31 39928 - 40881 1186 317 aa, chain + ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 
15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 12 313 9 315 318 213 45.0 4e-55 MNEQLYNVSANPHVRDKASTHSIMLDVIIALLPATLFGFWNFGVKAIINTVICVAVCVLA EYVWQRFMHQKITIKDCSAALTGLLLALNLPPEVPFWIPVIGGLFAIIVVKQLFGGLGQN FMNPALGARCFLLISFAGIMTNFSYNGFGGSFDATTSATPLAALKAGEVVNLKAMFLGNT AGTIGETSALLLLVGGVYLIAKKIISWRIPVCYIVTLGLFVLLFGGHGFDMAYLAEQLCG GGLMLGAFFMATDYVTSPITPKGQIVFGILLGILTGIFRLFGGSAEGVSYAIIFCNLLVP LIERVTTPTAFGKGGKK >gi|222441873|gb|ACEP01000069.1| GENE 32 40881 - 41510 930 209 aa, chain + ## HITS:1 COG:MA0661 KEGG:ns NR:ns ## COG: MA0661 COG4659 # Protein_GI_number: 20089548 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Methanosarcina acetivorans str.C2A # 4 207 7 188 188 71 33.0 9e-13 MNSLVKDALKLVIITVVAGLVLGAVYGITKKPIADQEAKAQMEAYKAVFPKASDFKDVDG FSEEAASKVIASYKNTVDGHESDVISSAVEAVDASGKALGYIFNITTSKGYGGDIQLTVG IQSDGTVSGYSVLSISETAGLGMKAKDDPSWGKQFAGKKADAFSVVKDGSGSGDDAKIDA ISGATITSKAVTGAMNSCLAYFQSLEGGN >gi|222441873|gb|ACEP01000069.1| GENE 33 41512 - 42270 1021 252 aa, chain + ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 5 205 2 198 200 207 56.0 1e-53 MKNNSALERLYNGIIKENPTLVLILGMCPTLAVTTSAINGAGMGLSTTVVLMFSNMIISI LRNFIPDRVRIPGYIVIIASLVTVVQFLLQGYVPALNDALGVYIPLIVVNCIILGRAESY ASKNGPVSSFFDGLGMGLGFTVSLTILGAFRELLGAGTIFGKTILSESFYTPITIFILAP GAFFVLACLVAARNKIMNRKVKNGEMEEIPAGCGDCSACGNSCCSSKSFFDMSVDNKKEE TKAASEEKAENK >gi|222441873|gb|ACEP01000069.1| GENE 34 42285 - 42860 723 191 aa, chain + ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium 
nucleatum # 19 190 21 192 194 186 60.0 2e-47 MKELILLIISAAIVNNVVLSQFLGLCPFLGVSKKVETAGGMGAAVIFVITIASLVTSLIY KFILAPLDLTYLQTIVFILVIAALVQFVEMFLKKSMPALYESLGVYLPLITTNCAVLGVA LNSVQYGYNILQSVVYGLGISIGFTIAIVILAGIREKMEYNDIPESWQGMPIVMVTAGLM SIAFFGFSGII >gi|222441873|gb|ACEP01000069.1| GENE 35 42876 - 43667 1030 263 aa, chain + ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 4 262 3 261 264 166 42.0 5e-41 MSISGIILAAVVVGGTGLVISILLGIASEKFKVPVDEKEVAVRECLPGNNCGGCGFAGCD ALAKAIAAGEAPVGACPVGGQPVADKIASIMGVEAGVGEKQVAFVKCAGTCDKAKSKYKY AGNEDCVSAMSVPGGGPKACSFGCTGFGSCVKVCDFDAIHVINGVAVVDKEKCVACGKCV ATCPKSLIELVPYAAPHKVQCSSKEFGKAVKEVCSAGCIGCKMCTRVCEADAITVENNIA KIDYSKCTGCGKCAEKCPAKIIL >gi|222441873|gb|ACEP01000069.1| GENE 36 43977 - 44918 719 313 aa, chain + ## HITS:1 COG:lin1226 KEGG:ns NR:ns ## COG: lin1226 COG2267 # Protein_GI_number: 16800295 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Listeria innocua # 19 308 15 305 306 108 26.0 2e-23 MAYQKFTGTYQSSDHINRIHYYIYEPETDLRAILQIVHDFGDFVERNENLIRFFTDHGVM VCGCDHIGHGRSSKKEDYGYFGSKNGWTYLVKNTKKLTHYMKREYPDTPYFIYGHGLGSL IVRMDCIHEKGINGVILSGTSGKQKYCWRKILLAALLKRIWGAEHRSVYLEKDIEKRLNH RFLKEQDTYSWIAANEKIRREYSRIYSEHLPVTVAAYEDILKMLALVSTRKWYRSVNEDI PILLLGGKEDPIGNGGKGILEVHRQLEDTRHWVELHLYEGMRHNICDEVRRDEVYTDMLQ WMKLHMNEEYELQ >gi|222441873|gb|ACEP01000069.1| GENE 37 45076 - 45327 321 83 aa, chain + ## HITS:1 COG:no KEGG:Aaci_1292 NR:ns ## KEGG: Aaci_1292 # Name: not_defined # Def: sporulation protein, YlmC/YmxH family # Organism: A.acidocaldarius # Pathway: not_defined # 1 75 2 76 98 66 42.0 3e-10 MRFWELSNKDVINCKNGHRLGCVGDLEIDVCKLCITDFYVPTGGKYCGCLGKKSEYKIPV GAVIRIGVDSILVDIDEKKCLVK >gi|222441873|gb|ACEP01000069.1| GENE 38 45398 - 45856 462 152 aa, chain + ## HITS:1 COG:lin1597 KEGG:ns NR:ns ## COG: lin1597 COG1327 # 
Protein_GI_number: 16800665 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Listeria innocua # 1 152 1 152 154 165 58.0 3e-41 MKCPFCGLDNTRVIDSRPADDNSSIRRRRLCDECGKRFTTYERVEIMPLTVIKKDKTREP YDRSKMMSGILRACHKRPVSVEQIEQMVNKIENELFNLEYKEIESKQIGEVVMEHLKSLE QVAYVRFASVYREFKDVETFMNELKKMLDNNR Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:33:57 2011 Seq name: gi|222441872|gb|ACEP01000070.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont76.1, whole genome shotgun sequence Length of sequence - 18463 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 7/0.000 - CDS 17 - 2869 2689 ## COG0178 Excinuclease ATPase subunit 2 1 Op 2 . - CDS 2885 - 4879 2320 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 4909 - 4968 8.1 3 2 Op 1 . - CDS 5328 - 6884 1599 ## gi|225027284|ref|ZP_03716476.1| hypothetical protein EUBHAL_01540 4 2 Op 2 . - CDS 6945 - 7826 512 ## COG3764 Sortase (surface protein transpeptidase) - Prom 8010 - 8069 11.6 - Term 8045 - 8099 9.0 5 3 Tu 1 . - CDS 8286 - 9983 1583 ## COG0155 Sulfite reductase, beta subunit (hemoprotein) - Prom 10048 - 10107 8.0 + Prom 10344 - 10403 5.9 6 4 Tu 1 . + CDS 10509 - 12491 2157 ## COG0326 Molecular chaperone, HSP90 family 7 5 Op 1 . - CDS 12588 - 14876 1419 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 8 5 Op 2 . - CDS 14893 - 15768 742 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 9 5 Op 3 . - CDS 15793 - 17193 1169 ## Sca_2153 hypothetical protein - Prom 17325 - 17384 8.0 10 6 Tu 1 . 
- CDS 17388 - 18194 782 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 18220 - 18279 7.2 Predicted protein(s) >gi|222441872|gb|ACEP01000070.1| GENE 1 17 - 2869 2689 950 aa, chain - ## HITS:1 COG:CAC0503 KEGG:ns NR:ns ## COG: CAC0503 COG0178 # Protein_GI_number: 15893794 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 5 942 2 939 939 1223 63.0 0 MSEKKEYIRIRGAREHNLKNINIDIPRNEFVVLTGLSGSGKSSLAFDTIYAEGQRRYMES LSSYARQFLGQMEKPDVDDIQGLSPAISIDQKSTNRNPRSTVGTVTEIYDYLRLLYARVG IPHCPKCGKVISKQTVDQIVDILMKLPERTKIQLLAPIVRGRKGEHVKILKDAKKSGYVR VRIDGNLYDLAEEIKLEKNKKHNIEIIVDRLMIKEGIENRLTDSIETVMKLTGGLLLVDV VGGEPMNFSQSFSCPDCGISIDEIEPRSFSFNNPFGACPVCHGLGFKMEFDVDLMIPDKK MSIADGAITVLGWQSCTDPKSYTRAVLDALAREYNFDLSTPFCDLSPEVQNIIIHGTNGH EVKVYYKGKRGEGVYDVAFEGLLRNVQRRYRENHSERQRAEYENFMTVTPCDSCHGQRLK PESLAVTVGQYNISELTCLSVKRLISYFNEIELTERQQLIGKQVLREIRGRIGFLNDVGL DYLSLSQPTGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDKLISALKRLR DLGNSLIVVEHDEDTMREADCIVDIGPMAGENGGKVVAVGTAEELMANPDSITGAYLSGK MKIPVPEVRRKPQGYLTVKGASENNLKNINVKFPIGVFTCVTGVSGSGKSSLVNEILYKK LACVLNRARIKPGKHKDIEGTEQLDKIINIDQSPIGRTPRSNPATYTGVFDHIRELFAMT KDAKARGYDKGRFSFNKKGGRCEACAGDGILKIEMQFLPDVYVPCEVCHGKRYNRETLEV KYKGKNIFEVLDMNVDEACEFFESVPSIRRKIETLRDVGLSYIKLGQPSTTLSGGEAQRV KLATELSRQSTGKTIYVLDEPTTGLHFADVHKLVDILQRLTEGGNTVVVIEHNLEVIKTA DYIIDMGPEGGEGGGTMVTCGTPEEVAAVEKSYTGKYLKKYLNPDLPQAR >gi|222441872|gb|ACEP01000070.1| GENE 2 2885 - 4879 2320 664 aa, chain - ## HITS:1 COG:BH3595 KEGG:ns NR:ns ## COG: BH3595 COG0556 # Protein_GI_number: 15616157 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus halodurans # 4 656 5 657 660 841 65.0 0 MDHFELVSDFAPTGDQPQAIEELVKGFKEGNQFETLLGVTGSGKTFTMANVIAQLNKPTL ILAHNKTLAAQLYSEFKAFFPHNAVEYFVSYYDYYQPEAYVPSTDTYIEKDSSINDEIDK LRHSATAALIERKDVIVVSSVSCIYGIGSKNDYAKMMISLRPGMEIDRDEVVRRLIDIQY VRNELDFKRGSFRVRGDVLEVVPVSSFEDAIHIEFFGDEIDRIMQVDVLTGEIKASLNFA 
IIFPASHYVVPQEQIERAVKTIKEELDERVEYFKENDQLLEAQRISERTNFDIEMLKETG FCSGIENYSRHLTGLEPGKAPYTLIDFFGDDFLMIVDESHITIPQVRGMYAGDRSRKQTL VDFGFRLPSALDNRPLNFDEFEERIDQMLFVSATPNVYEGEHEMLRAEQIIRPTGLLDPP IDVRPVEGQIDDLISEINKEVAKKNKVLVTTLTKRMAEDLTAYMKELDIRVKYLHSDIDT LERIEIVRDLRMGVFDVLVGINLLREGLDIPEVTLIAILDADKEGFLRSETSLVQTIGRA ARNAEGHVIMYGDTITDSMRAAITETKRRREIQMAYNEEHGITPKTIQKAIRDVISITNE EAPEKSRGSLKKDMESMNRKELTEMIAKLTKKMNKAAAELNFEEAAELRDELKKYKIALR DYDD >gi|222441872|gb|ACEP01000070.1| GENE 3 5328 - 6884 1599 518 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027284|ref|ZP_03716476.1| ## NR: gi|225027284|ref|ZP_03716476.1| hypothetical protein EUBHAL_01540 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01540 [Eubacterium hallii DSM 3353] # 1 518 1 518 518 875 100.0 0 MKSIIARKFSKYIGAFFAIVAMVAGLLLPAQIKAADSYNKDKKGSITINLDDVKQEDSIT NKSGVSVSIYQVASIGHDGVNISFDIASSLESTGVDVNDITTSDKNLNSAKKLTTVIDNS GISSVTKKTDSNGKVSFTDLAQGMYLVEEKDSASYGMFSPFLVAIPYMEDGQNWIYDVET YTKGVSNQQGSLEVTKALVYMDPETGKIYNLQAPKSYEENGETVPGARYYVGLFCDAEGT IPYGKNYLKTIDFNGSHSETIKYDNLPDGIYYLFETDENGVPYAMGETHTEKDRTYKCVV GDGSGDSADNKVQLQGRPQGKMNISNVFTIIPDDFVVVAELTIEKKVLDQNGQQTTTNDT FYASVYKQTGSDDNVENELIQTVQLQQNGKVTVPVRVEKDGSTTKVYIEETDADGNTIDP DNFAYKISGEGNVELSMEQQTGTITLTNREKGDEEEETTSSEEETTSTEKEKTPPNNKKK KSKSGKTGDTTPILTWIIIGVAAAVVIIFLVVRRRKSE >gi|222441872|gb|ACEP01000070.1| GENE 4 6945 - 7826 512 293 aa, chain - ## HITS:1 COG:SP0467 KEGG:ns NR:ns ## COG: SP0467 COG3764 # Protein_GI_number: 15900383 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sortase (surface protein transpeptidase) # Organism: Streptococcus pneumoniae TIGR4 # 30 274 5 245 261 193 41.0 3e-49 MKSKILGILITLLFVAGYATLNYPVLGTLYNQIREGKVIDSYDHAVHTMNEQKRKKYWED AREYNERLARENPQLSDAFSQEEKKPDSAYNHILNMEESGVMGALEIPKISLYLPIYHGT SQEVLEKGIGHLEGSSIPIGGKNTHAVLTGHRGLPSAELFSNLDQMERNDEFYIHILGKT LAYRVFNVETVRPEETGHLAIARGQDRVTLVTCTPYGINTHRLLVHARRVPYKEKSSSSR KNNLWKWLLKQKVLLISTGILILLIIYSLVRSRKNRKRKKRRHKKKSGGNKKK >gi|222441872|gb|ACEP01000070.1| GENE 5 8286 - 9983 1583 565 aa, 
chain - ## HITS:1 COG:CAC0094 KEGG:ns NR:ns ## COG: CAC0094 COG0155 # Protein_GI_number: 15893390 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfite reductase, beta subunit (hemoprotein) # Organism: Clostridium acetobutylicum # 3 564 2 515 516 372 36.0 1e-103 MNQELMKEFKADLKEFREMTEKFYAKEVSVKDYKGFSGGFGSYAQKGGEASMLRLRMPGG RVTKEKLKFLVDSIERYDVKRAHITTCQTVQFHDLDAKAVCDIMEQAMDAGIVTRGGGGD FPRNTMVSPLSGVEQGEYFDVLPYAEEAGDYLMGIIKTVKLPRKLKVGFSNSLANVTHAT FRDLGFVAKEEGTFDVYSAGGLGNNYRMGVKVAENVKPEEVLYYVEAMVRTFTTYGNYES RAKSRTRYMQETLGVDGYRKAYQEKLAEVKAEYKDSLLIKLEGKVAENAINNMENNSADD VEGKNTADMSENITESITKNVADNIVKTDENVILETAESYPQKEAASERILPQKQQGLYA VAYHPIGGIVPVKKFGEIYNIIKDIGDAEVRIAPDETLYIINLNASQAKEIAAITEDGAK NLFETSVSCIGSTVCQIGLRDSQGLLASIVATVEPYHFADGVLPRIHISGCTSSCGTHQI GKLGFHGASKRVDGKMQPAFAFHVNGTDAQGAEHFGEEWGTMLAEEIPDFFVELGQMISA EHLTYESWYTQNPEKLKKIAEKYIR >gi|222441872|gb|ACEP01000070.1| GENE 6 10509 - 12491 2157 660 aa, chain + ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. 
PCC 7120 # 268 660 209 654 658 285 38.0 2e-76 MAKQGSLSINSENIFPIIKKWLYSDHDIFIRELVSNGCDAVTKLKKLSVMGEFEEADDEK YKVEVRINPTDKTLTFIDNGIGMTEDEVDEYINQIAFSGAAAFMEQYKDKASDEQIIGHF GLGFYSAFMVADKVTIDTLSYKEGATPVHWECDGGTEFDMEDGDKAERGTTITLYLNDES YEFCNEYRCREVLNKYCSFMPVEIYFVNEEEEAKKAEEAAKKDAEEKVIDVDVKDADAKE NNADTEDGDSEVTLEEEEEEEDDAPKPINQIHPLWTKHPNDCTDEEYKEFYRDVFHDYKE PLFWIHLNMDYPFNLKGILYFPKINTKYDTIEGTIKLYNNQVFIADNIKEVIPEYLLLLK GCIDCPDLPLNVSRSALQNDGFVKKISNYITKKVAEKLSGMCKTDRENYEKYWDDISPFI KFGCIRDQKFNDKMKDYIIFKNMEGKYVTLPDYLEAGKEKYENNVFYITDEVAQSQYINM FKEQEIDAIYLTHQIDTTFITHLEQRNPEVKFLRIDSELTDNFKETIDENEEKELTEKLS EKFKKATGVENLIVKVEKLKNADMPSMITVSEQTRRMSEMMEMYGMSNSSTGLGAEGETL VLNMNNDLVQYVLNNDESENTTTICQQLYDLARLANHPLKPEEMTAFVARSNKILSILTK >gi|222441872|gb|ACEP01000070.1| GENE 7 12588 - 14876 1419 762 aa, chain - ## HITS:1 COG:ECs5319 KEGG:ns NR:ns ## COG: ECs5319 COG1368 # Protein_GI_number: 15834573 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Escherichia coli O157:H7 # 392 735 120 448 750 165 33.0 2e-40 MNKQQKNYIKAVVGFVLAAIILCAGIMTALYLMREEKGMENLMAPVLKRLIAGILLFGLT VCWRHFLKKSVSKVAGIIVIVLLSLFFAWHISGIPQEKALKPQNVNLEKAEFRGYEINGN HFLIKEARPFIKYKMQSSKIANVKVYFEEPVAQDTDVKISYQTKEKPTYEQNPRVWANIK KGEKIAFGEIDATGITRIKVRFGSKIGQQFVLDHIELNANYRERVQKKQITMIIYFMMFL LMPACYLLLSHAVELQKRTDQNKILKVLSFPFGLLVPVALFITLLVCSMVAWLKSTCGDV SFSIIVLQLTSPIKGTDSGVINSIIKTGIIPPLLVTLTISIVYLIMVRVLYNLEDLPVKK VPAWTKICLEIILLIALVGTIQVQGTKVGMWEYIKSVQEKTDFYEKYYVNPAKTKLDFPS QKRNLIYIFMESMESSYADQEDGGIMDDNYIPNLTKLAKENINFSDKADGKLGGPTCLEA TAYTVGGMVAQTAAINLKLHNSGSMFGNFLPNLTTMGDILNKEGYQQVFLCGSEGDFAGR DTYFTSHKDFHIEDYNAAKKEGFIAPDYKVFWGHEDEILYKRAKKQLEQLSSSDKPFNLT MLTVDTHFPRGYKCRLCKDKYNRQYANVIACADQQVYDFVEWIKKQDFYKNTTIVIAGDH TTMVDTSDPIWGNLNNNYKRTVYNTIINADCTYKENVTENRDFSTMDMFPTTLAALGVQI DGNRLGLGTNLFSGQKTLPEKLGRGYINQELKKNDKEYNGFY >gi|222441872|gb|ACEP01000070.1| GENE 8 14893 - 15768 742 291 aa, chain - ## HITS:1 COG:ECs5319 KEGG:ns NR:ns ## COG: 
ECs5319 COG1368 # Protein_GI_number: 15834573 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Escherichia coli O157:H7 # 7 264 192 448 750 127 33.0 3e-29 MGGPVCLEATAYTAGGLVAQTSAINLKVMNSGAVSDSFLPNLTALGDILNKQGYNQMFLC GSDGDFAGRDAYFKTHKDYQIEDYKAAIKEGDIPKDYKVYWGHEDKVLYERAKKNLKKLS KEGKPFNLTMLTVDTHFPNGYICDLCENKYDTTYGNAVACADRQVYDFVQWIKKQDFYED TTIIIAGDHTSMVDTGSKFWKSLSNDYQRTVYNAIINPQCAYKKKVTTKRKFSTMDMFPT TLAALGVEIDGNKLGLGTNLFSGEETLREELGANYINKELKRNDKMYNQFY >gi|222441872|gb|ACEP01000070.1| GENE 9 15793 - 17193 1169 466 aa, chain - ## HITS:1 COG:no KEGG:Sca_2153 NR:ns ## KEGG: Sca_2153 # Name: not_defined # Def: hypothetical protein # Organism: S.carnosus # Pathway: Glycerolipid metabolism [PATH:sca00561]; Metabolic pathways [PATH:sca01100] # 278 466 113 303 607 104 32.0 1e-20 MENQGNMQSRKKAAVIYFLAAFFIACIGIAGANYLKGELGLSAEELVKKVELTVEIAVLL GIAIAVRHLLKKRNRKIAGIIVLLFVAGVFSWQNSGIWQESQLKPQRLETPESEFDRYDI KDGHFTVAEANANIVKEINVDYLDNVTVHFSKPVTQKVIVRVLYETKTQHGFDNKTRKMR VKVHKGETVGCVSMKVKDVTRIKIGIGRKIGTQFDYGYTEINGNYAARMHQKKVSMVKYF VFFMLIPAGYFILAAGKKWQEKTDKNICLRILSFPFGFITILSLFATFLVVNLVEWVKMT CGNVSFSIILLQLTSPIKGTDSGIINSIIKTAVVPPVLLAVAAVLCYLFIVRGMYALEDL PVKKIPRWSKICIEVVMVVFFLHTVQVQGTEIGMWDYIQSVRESSDFYEKEYVNPAKVKM TFPKEKKNLIYIFMESMESSYADKEDGGTMDDNYIPNLTKLARENV >gi|222441872|gb|ACEP01000070.1| GENE 10 17388 - 18194 782 268 aa, chain - ## HITS:1 COG:BS_yqiI KEGG:ns NR:ns ## COG: BS_yqiI COG0860 # Protein_GI_number: 16079475 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 107 267 52 204 206 85 31.0 8e-17 MKKMMKKTFSLLMIFMLVFTMIPTSTLKVSAKAAAATTQAKKATTKTAKKTAKKKKKTKA KKQRTVFIAAGHQQRGISSTERLAPGSSRRKAKLTSGTAGVRTHIPEYKTNLAIAKATKK ELKKRGYKVIMLRTTNNCPLSNQQRTKKANKSGADIHICIHCNASGASARGPLVCVPGSS RYVGKKIFNSSRRLGSCLLSSVAKAVNKKSHGTIRSDYYTTINWAKIPTMILECGFLTNP TEDRQLNSSSYQKKLAKGIANGVDKYFK Prediction of potential genes 
in microbial genomes Time: Fri Jul 8 07:34:30 2011 Seq name: gi|222441871|gb|ACEP01000071.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont77.1, whole genome shotgun sequence Length of sequence - 8564 bp Number of predicted genes - 8, with homology - 8 Number of transcription units - 2, operones - 2 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 130 - 189 3.5 1 1 Op 1 1/0.000 + CDS 288 - 914 769 ## COG1802 Transcriptional regulators 2 1 Op 2 1/0.000 + CDS 911 - 1555 828 ## COG1802 Transcriptional regulators + Term 1623 - 1670 -0.7 + Prom 2031 - 2090 2.3 3 1 Op 3 1/0.000 + CDS 2128 - 3396 1558 ## COG0471 Di- and tricarboxylate transporters + Prom 3425 - 3484 3.2 4 1 Op 4 11/0.000 + CDS 3540 - 4439 250 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 5 1 Op 5 . + CDS 4439 - 5065 221 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Prom 5665 - 5724 7.4 6 2 Op 1 1/0.000 + CDS 5956 - 6744 967 ## COG0703 Shikimate kinase 7 2 Op 2 8/0.000 + CDS 6788 - 7648 1013 ## COG0169 Shikimate 5-dehydrogenase + Prom 7702 - 7761 5.8 8 2 Op 3 . 
+ CDS 7801 - 8562 1091 ## COG0710 3-dehydroquinate dehydratase Predicted protein(s) >gi|222441871|gb|ACEP01000071.1| GENE 1 288 - 914 769 208 aa, chain + ## HITS:1 COG:STM3358 KEGG:ns NR:ns ## COG: STM3358 COG1802 # Protein_GI_number: 16766653 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 1 149 1 153 209 69 28.0 5e-12 MRALSEKRLSDYMAQQIADEILSGRIVGGTQLKQEELAEVFGASRIPIREAFQVLEGQGL IVRLATRRIYAVELSEEQIQIIYEMIGDILKKAVENLKNEDKGKEVITILSSDLSTSKRF DEILLDCTDNAYLSKLLYTASDCYIRFAVSCGLNEQENKCREEMKQIILEENGVNLDSIS GRKKIFKKIDVWMKVLSDIVNIERGKNK >gi|222441871|gb|ACEP01000071.1| GENE 2 911 - 1555 828 214 aa, chain + ## HITS:1 COG:STM3357 KEGG:ns NR:ns ## COG: STM3357 COG1802 # Protein_GI_number: 16766652 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Salmonella typhimurium LT2 # 4 213 6 215 221 204 51.0 1e-52 MMGLRPIKMPSAKERVAAELRKAILSRQMKEGEVLSLESVASQLNVSAMPVREAFQILAR DGLIQLRKNKAAVVLGVTETYIKEHYQLRAILEGAAARLCAAPEADISELEEIYEDSCKV LENKDFSHYTDINREFHNEIWTAAGNGKMKNMISELWNGLSMGNMVSEEDYAKVSVKEHG EILEAIKAHDGDAAEAAMKEHIMRSRDDMLTYYN >gi|222441871|gb|ACEP01000071.1| GENE 3 2128 - 3396 1558 422 aa, chain + ## HITS:1 COG:STM3356 KEGG:ns NR:ns ## COG: STM3356 COG0471 # Protein_GI_number: 16766651 # Func_class: P Inorganic ion transport and metabolism # Function: Di- and tricarboxylate transporters # Organism: Salmonella typhimurium LT2 # 1 421 1 422 422 509 69.0 1e-144 MSQITITLLFLLFAIVMFMWEKIPLGLTSMIVCVGLVVTGVLEWQTAFAGFIDSNVILFV AMFIVGGALFETGMANKIGGIVTHFAKTERQLIVAIMVIVGVMSGFLSNTGTAAILIPVV IGIAAKSGYSRSRLLMPLVFAAAMGGNLTLIGAPGNMIAQSGMEGIGLKFGFFDYAKVGV PILIVGIIYFAFIGYKFLPNKEGSDEGIFDESKDFSHVPKWKQYLSLVILLLTLVGMIFE EQLGIKLCVIGCMGALALMITGVISEKDALASIDLKTIFLFGGTLSLAAALEQTGAGELI AEKVIGMLGDNPSPYVLTFVIFMLCCVMTNFMSNTATTALMVPIGISIAQGMGADPSAVL MACVIGGSCAYATPIGMPANTMVVTAGGYTFKDYAKAGVPMILVATVVSMILLPIFYPFF PG >gi|222441871|gb|ACEP01000071.1| GENE 4 3540 - 4439 250 299 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 26 284 26 288 508 100 26 8e-42 MSKEEQVTQLTEHMAKFIAHVAKKLPDDVIAKLEELRDKEDSPLSKTIYDTMFKNQELAV KLNRPSCQDTGVLQFWVKCGTKFPLIGELEALLKDAVVKATFEAPLRHNSVETFDEYNTG KNVGKGTPTVFWDIVPDSDECEIYTYMAGGGCTLPGKAMVLMPGEGYEGVTKFVMDVMTT YGLNACPPLLVGVGVATSVETAALLSKKALMRPIGSHNDNERAAKMEALLEDGINAIGLG PQGMGGKYSVMGVNIENTARHPSAIGVAVNVGCWSHRRGHIVFDKDLNYKITTHTGVEL >gi|222441871|gb|ACEP01000071.1| GENE 5 4439 - 5065 221 208 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 3 180 309 484 508 89 32 8e-42 MVEVKDGKKILTTPISAEDLKGIKIGDIIYLNGSMTTCRDVAHRRLVEEGRELPVDVRNN AIFHAGPIIRPLENDKFEMVSVGPTTSMRMEKFEYEFVKETGVRVIIGKGGMKENTERAC KDFGAIHCVFPAGNAVVAATEVEEIVRAEWRDLGMPETLWNCRVKEFGPLIVSIDPEGNN FFEQQKVEYNKRKDEQIKKISAQVGFIK >gi|222441871|gb|ACEP01000071.1| GENE 6 5956 - 6744 967 262 aa, chain + ## HITS:1 COG:FN0822 KEGG:ns NR:ns ## COG: FN0822 COG0703 # Protein_GI_number: 19704157 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Fusobacterium nucleatum # 92 240 4 153 172 100 36.0 2e-21 MTRLEKLRKRLGQCDEILLDAILMRHHIIKDIMAYKEENGLNILDPEQEERQKEWLDGRL EDESNASEVWDIFHAIGKASKQMQARKLFDYNIVLIGFMGAGKTSISEYLKTLFAMDVIE MDQIIAEREGMSIPDIFEVHGEQYFRDLETNLLIEMQSRKNVVISCGGGTPLRECNVVEM KKNGRVVLLTASPETIFDRVKDSHDRPVIENNKNVPFIADLMEKRRAKYEAAADIIINTD GKSLIEVCEELVQQLLAMDENK >gi|222441871|gb|ACEP01000071.1| GENE 7 6788 - 7648 1013 286 aa, chain + ## HITS:1 COG:lin0493 KEGG:ns NR:ns ## COG: lin0493 COG0169 # Protein_GI_number: 16799568 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 7 280 12 291 291 176 36.0 6e-44 MAKVDINTKMITLLGDPLKQSFAAQMQNCGYEAAGLNMIYFYTEVNNEHLADVVNGIRYM NFAGFAVTKPNKVKVLEYLDELDPLCEKMGASNTVVKTPEGKLVGYNTDGIGFIRSMERD GNVKIDENIYFCIGSGGAGRAMCSALAYYGAKKIYITDVFEESSKSLVEDINKNFAPVAE FCPAGDFSKVKEATVVLNASGIGMGSHIGENPLPKEFISKDQFYFDACYNPAKTQFLLDA 
EEAGATILNGLGMSLYQGAAQIEYWTGQEPPIEAMRQKLLDIISEK >gi|222441871|gb|ACEP01000071.1| GENE 8 7801 - 8562 1091 253 aa, chain + ## HITS:1 COG:STM1358 KEGG:ns NR:ns ## COG: STM1358 COG0710 # Protein_GI_number: 16764709 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Salmonella typhimurium LT2 # 1 252 1 252 252 276 56.0 2e-74 MNPVVVRNVAIGEGIPKICVPIVGKTREEILEAAKNILPIGADVVEWRVDWYEDIFDFEK VEETAKQLREVLGEMPILFTFRTSKEGGEKAIETPVYVELNQKISATGYVDLVDMEAFTG DDAVKAVIEEAHKHGVKVVASNHDFDKTPAKSDIIYRLRKMQELGADIPKIAVMPQNKKD VLTLLAATEEMVNNYADRPIITMSMAATGVISRVCGEVFGSALTFGAVGKASAPGQMGAG ELKEMLTTLHASL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:34:33 2011 Seq name: gi|222441870|gb|ACEP01000072.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont78.1, whole genome shotgun sequence Length of sequence - 11819 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 5, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 617 - 928 257 ## COG4636 Uncharacterized protein conserved in cyanobacteria - Prom 988 - 1047 3.0 2 2 Tu 1 . - CDS 1145 - 1270 196 ## gi|225027303|ref|ZP_03716495.1| hypothetical protein EUBHAL_01559 - Prom 1304 - 1363 2.1 - Term 1834 - 1880 9.1 3 3 Tu 1 . - CDS 1882 - 2199 146 ## DSY2751 hypothetical protein - Prom 2299 - 2358 3.9 - Term 2307 - 2353 6.2 4 4 Op 1 . - CDS 2371 - 3870 1115 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 5 4 Op 2 . - CDS 3950 - 4309 392 ## EUBELI_20567 hypothetical protein 6 4 Op 3 10/0.000 - CDS 4357 - 7188 2218 ## COG0642 Signal transduction histidine kinase 7 4 Op 4 . - CDS 7267 - 10134 1802 ## COG0642 Signal transduction histidine kinase 8 5 Op 1 . - CDS 10773 - 11210 327 ## Bacsa_0373 YolD-like protein 9 5 Op 2 . 
- CDS 11210 - 11818 483 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|222441870|gb|ACEP01000072.1| GENE 1 617 - 928 257 103 aa, chain - ## HITS:1 COG:sll0925 KEGG:ns NR:ns ## COG: sll0925 COG4636 # Protein_GI_number: 16329481 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in cyanobacteria # Organism: Synechocystis # 6 101 32 124 202 60 32.0 9e-10 MDWNLKITDMIGDMPEHSTVIVNFVAAIRHQLKNSTCYVYSDNIQYHFQDSQGNNKIIIP DASINCRTKSRHGNTFTDAPRFVMEVLSPSTEKYDRTEKMQLF >gi|222441870|gb|ACEP01000072.1| GENE 2 1145 - 1270 196 41 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027303|ref|ZP_03716495.1| ## NR: gi|225027303|ref|ZP_03716495.1| hypothetical protein EUBHAL_01559 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01559 [Eubacterium hallii DSM 3353] # 1 41 15 55 55 77 100.0 5e-13 MCDNKLDPRFYPEVYQAVAGKDYTVYAYMNDGGVRKFDVKP >gi|222441870|gb|ACEP01000072.1| GENE 3 1882 - 2199 146 105 aa, chain - ## HITS:1 COG:no KEGG:DSY2751 NR:ns ## KEGG: DSY2751 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 7 101 9 104 110 61 31.0 9e-09 MHIEKYVAKNIISLCEKRDISKYRLSQLSGISQSSLSRIMTQESLPSLITLEKICVALGV TLSQFFREENPVDLTESQKEILKIWDNLSTKEQETVLAMLRGLQK >gi|222441870|gb|ACEP01000072.1| GENE 4 2371 - 3870 1115 499 aa, chain - ## HITS:1 COG:slr2100 KEGG:ns NR:ns ## COG: slr2100 COG3437 # Protein_GI_number: 16330586 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Synechocystis # 2 328 8 344 368 199 34.0 8e-51 MDKRQKILIVDDSKLNREILKEILGETYNYLEAENGNQAIQMIGENIGINLMLLDINMPQ MNGFEVLEIMKRSQCIAETPVIMISSEDAVNTMRKAYELGITDYITRPFDSVIVKKRVQN TLGLYMNQKHLINVVYDQVYEKEENNNIMIQIMSNILGSRNSESREHILHIKTATEMMLR QLVKVTDAYPLTEADIALITTASSLHDIGKIRIPEEILNKPGRLTDEEFKIMETHSELGA AIIKDMDFSQGHPLVHTAWEICRWHHERWDGKGYPDGLKGEEIPISAQVVSIVDVYDALT 
SERCYKKAFEHDTAIQMIMDGQCGQFNPILLKCLKELSLQLSKMLNKEMDDNKYSHEIQR LSNEILSDKSLPNQIYSQSLVKVMQEKIDFFKSNSGTNSIDYNVVSGQLTILNGKQQVLC QRNNPKFDLFKEFGVIEEDVQYIRVLLHQTSVQNKEISVQIKATVENKSQMYKMKLHTLW SPMKKDVCIGIIGYFDIIK >gi|222441870|gb|ACEP01000072.1| GENE 5 3950 - 4309 392 119 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 114 1 112 114 117 52.0 1e-25 MTMQECYKAIGGNYEAVLGRLHNEALIPRFTLKFLEDQSYLQLKQALENKNYEDAFRSAH TLKGVCQNLSFDRLYEVSNELTELLRDRTGEKPGIPEAMEKVTEVYEMTIEEIKKGLSH >gi|222441870|gb|ACEP01000072.1| GENE 6 4357 - 7188 2218 943 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 528 927 262 654 676 178 31.0 4e-44 MISSAKKYMAMLLDFVCVCMIFSPQTAYAAEDSSQHETVKVGFFAIDGYHMMDEEGNRSG YGYDFLRLMARYWDVDYEYVGYDQSWDDMQQMLEYGEIDMVTSARKTPDREEKFDFSRPI GTNYGMLTVRSDNSTIVDGNYSTYNGMRVALLNGNTRNKEFADFADNKGFTYVPSYFDTT AEMEEALQSEKVDAIVTSSLRKTNNERIVDKFGSSDFYVIVKKGNTELLNEINYAIDQMN AVEGDWKTTLYNKNYESIETKNLEYTEQEKSIIAQYSKDNPIRVLCDPTRYPYSYNENGE MKGIIPDYFRKIADYAGISYEFLTPATRDEYIAYQKNKEATDMSIDARLETDNYAETKKW GLTAPFITMQLARVTRRDFDGKINVVTTVDQTASNSIADAMAPGAEKLMCSTRQEMMEAV RKGKADAAFVYYYMALAFVNSDTTGTLTYTLLEQPTYTYRMVISSTENHALAGILTKAMY AMPKKLVEDLAARYTTYKATDLTFVDWIRLHPVVTVWVLLIFGWLLTTMAVIAMRLSARK KAQKAAQETAEEMAELAEHAQATNKAKTAFLSHMSHDMRTPMNAIMGFTSIAMKNNPSDE VKNCLEKIDESSEYLLSLINDVLDLTRVESGKVKYNPVPADLKSITDSALDITKGFLTNR DISFKIQREEAKIPNVLVDPVRLRDVLVNILSNAVKFTPDGGTITFEARCQEKGGDGYIN MHYRISDTGIGMSEEFTKEVFEKFAQEDSDVRTQYHGVGLGMAIVKKYVDMVGGTISVQS KKHEGTTFTVDIPLEITDKECNKSDTGFSEKVNLTGVKVLLAEDNELNAEIATVQLEEFG MNVERAVDGKNAVEIFRNHPEGTFDVILMDIMMPEMNGYEAAKAIRAMNDRPDGKNIPII AMTANSFAEDVQASLDAGMNTHLSKPIVIEEVIKTILRYVHND >gi|222441870|gb|ACEP01000072.1| GENE 7 7267 - 10134 1802 955 aa, chain - ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: 
VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 563 800 2 236 260 164 39.0 8e-40 MKKCKLVQNIMNFKFCYIIFCTIICFIVLSVPTFAKENSDNVIRVGSFEETYNTVNEKGE RSGYGYEYLQDIAGYAGWTYKYITSDWKNCFTQLENGEIDILGGISYTDERAENMLFSDM PMGEEKYYIYTDASNMDLTAGNLDSFEGKNIGVFKDHIPEDLLNEWELKYGLHTQHVNAS NTAEVMDKLSKHEIDCFVSVEESRWEESDISPVTSIGETEIYFAINPKRPDIKEALDSAM RRIKDDNPFYTDDLYRRYLSAQSSSFLSKEEREWIGQHGAIRIGYLNQDGGISSVDPSTG KLTGVITDYVDLAENCLQGQTLEFELKGYDTRSELLQALQDGKIDLIFHTNQNPYFAETK GVALSDTLLTLNMAAITAKDFFDENKENTVAVEKDDFSLKAYLSYNYPKWEIIEYGTSDA AVKAMQKGEVDCIVSSSGTVSDYLKNNKLHSVFLTKEADVPFAVRQGEPVLLSILNKTLT SMPTAQFSGAVVSYNASLRKVTAKDFIQDNLLAVSLIVGIFIFVVLCIILDSLKKSKRAE EKSKKSAEQALKLNQELEEKQQELQNALVEAQSANKAKTSFLNNMSHDIRTPINGIMGML TILEKSGNDGERAKDCLNKINESSKLLLSLVNDVLDMAKLESDTVVFSDESINLDQVCQE ITESLSFQAEEKGLHVIGEHDDYSGIYVWSNAVHLKKVLMNLFTNSMKYNKVNGSIYMSM RTIERSEDHMTCEFKIRDNGIGMSEEFIKNELFTPFVQADNSPRSDYNGTGLGMPIVKQL VEKMGGTITVESKLGEGSCFTVILPFKIDTNARLEEKEDFNADISGVRILLVEDNELNAE IAEFMLTENGAKVETVKNGLEAVQHFEACESGRYDVILMDVMMPVMDGLTATKTIRSLER QDAKTIPIIAMTANAFREDVEKCMEAGMNAHLAKPLDDKTIKQTICEELRSSRDR >gi|222441870|gb|ACEP01000072.1| GENE 8 10773 - 11210 327 145 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0373 NR:ns ## KEGG: Bacsa_0373 # Name: not_defined # Def: YolD-like protein # Organism: B.salanitronis # Pathway: not_defined # 4 137 5 137 138 121 49.0 1e-26 MGKYDDIINLPHHVSKKHSQMPIADRAAQFAPFAALTGYGEQIKETARTTEEFHQVSGRE KEEIDQKLQILQQTQYKHPLISIYYFAPDTRKQGGHYNKTTGCVKKIDVINEKLMMIDGN EIAFKHIKEIQGDFFLSIMESENIT >gi|222441870|gb|ACEP01000072.1| GENE 9 11210 - 11818 483 202 aa, chain - ## HITS:1 COG:BS_uvrX KEGG:ns NR:ns ## COG: BS_uvrX COG0389 # Protein_GI_number: 16079209 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Bacillus subtilis # 14 202 255 416 416 68 28.0 5e-12 
TIADIKVYKPEAKSIGSGQVLSSAYSSEKAKIAVLEMAEQIAFDLFEQKLVSKQFVLTIG YDRDNLQGQEYSGEVATDRYGRKIPKHAHGTINLDIPTSSLKEITTAVSSLYDRIIDKEL LIRRLTLSAAKVMPKEGQVYQQLDLFTDYEALKKEQEKERRLQKSILDIKKKYGKNAVLR GLSYEEGATTRTRNGQIGGHKA Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:34:46 2011 Seq name: gi|222441869|gb|ACEP01000073.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont79.1, whole genome shotgun sequence Length of sequence - 509 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:34:58 2011 Seq name: gi|222441868|gb|ACEP01000074.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont80.1, whole genome shotgun sequence Length of sequence - 38559 bp Number of predicted genes - 44, with homology - 42 Number of transcription units - 29, operones - 9 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 39 - 77 -0.6 1 1 Tu 1 . - CDS 224 - 352 113 ## gi|225027315|ref|ZP_03716507.1| hypothetical protein EUBHAL_01571 - Prom 518 - 577 2.4 + Prom 135 - 194 3.6 2 2 Op 1 . + CDS 279 - 431 56 ## 3 2 Op 2 . + CDS 522 - 728 193 ## COG2723 Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase + Term 770 - 812 7.2 + Prom 807 - 866 8.1 4 3 Tu 1 . + CDS 900 - 2237 1391 ## COG0534 Na+-driven multidrug efflux pump + Term 2414 - 2479 0.9 5 4 Tu 1 . + CDS 2757 - 3524 550 ## COG2865 Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen + Term 3532 - 3590 12.1 6 5 Tu 1 . + CDS 3651 - 3782 116 ## gi|225027320|ref|ZP_03716512.1| hypothetical protein EUBHAL_01576 + Prom 3800 - 3859 5.9 7 6 Op 1 . + CDS 3884 - 4189 254 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Prom 4216 - 4275 1.9 8 6 Op 2 . + CDS 4324 - 6015 1466 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases + Term 6060 - 6119 9.1 - Term 6045 - 6108 7.9 9 7 Tu 1 . 
- CDS 6132 - 6791 380 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases - Prom 6867 - 6926 6.2 + Prom 6829 - 6888 9.1 10 8 Op 1 . + CDS 6926 - 7675 413 ## gi|225027324|ref|ZP_03716516.1| hypothetical protein EUBHAL_01580 11 8 Op 2 . + CDS 7731 - 8087 123 ## gi|225027325|ref|ZP_03716517.1| hypothetical protein EUBHAL_01581 + Term 8121 - 8171 6.5 + Prom 8122 - 8181 3.0 12 9 Op 1 . + CDS 8207 - 8407 259 ## gi|225027326|ref|ZP_03716518.1| hypothetical protein EUBHAL_01582 + Prom 8409 - 8468 3.5 13 9 Op 2 . + CDS 8500 - 8832 65 ## bpr_I2653 hypothetical protein + Term 8871 - 8922 0.2 14 10 Tu 1 . + CDS 9252 - 9551 233 ## COG0675 Transposase and inactivated derivatives + Prom 9637 - 9696 2.4 15 11 Op 1 . + CDS 9731 - 10348 491 ## Shel_05170 hypothetical protein 16 11 Op 2 . + CDS 10409 - 11254 312 ## COG0348 Polyferredoxin + Prom 11394 - 11453 5.2 17 12 Tu 1 . + CDS 11492 - 11713 167 ## gi|225027331|ref|ZP_03716523.1| hypothetical protein EUBHAL_01587 + Term 11769 - 11807 1.3 18 13 Tu 1 . - CDS 11710 - 12045 343 ## EUBREC_0657 hypothetical protein - Prom 12099 - 12158 7.1 + Prom 12358 - 12417 2.8 19 14 Tu 1 . + CDS 12484 - 13707 1591 ## COG1301 Na+/H+-dicarboxylate symporters 20 15 Tu 1 . - CDS 13871 - 14746 681 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding - Prom 14794 - 14853 3.1 + Prom 15242 - 15301 5.0 21 16 Tu 1 . + CDS 15395 - 19282 4306 ## COG0841 Cation/multidrug efflux pump + Prom 19301 - 19360 2.0 22 17 Tu 1 . + CDS 19406 - 19570 58 ## 23 18 Tu 1 . - CDS 19754 - 21124 1064 ## COG0733 Na+-dependent transporters of the SNF family - Prom 21223 - 21282 10.3 + Prom 21330 - 21389 11.1 24 19 Tu 1 . + CDS 21504 - 21911 475 ## CKR_0758 hypothetical protein + Prom 21915 - 21974 4.8 25 20 Op 1 . + CDS 22120 - 24135 2104 ## COG3437 Response regulator containing a CheY-like receiver domain and an HD-GYP domain 26 20 Op 2 . 
+ CDS 24177 - 24524 474 ## EUBELI_20567 hypothetical protein + Term 24671 - 24712 4.1 + Prom 24663 - 24722 5.6 27 21 Tu 1 . + CDS 24817 - 25527 462 ## COG1636 Uncharacterized protein conserved in bacteria - Term 25730 - 25769 0.6 28 22 Tu 1 . - CDS 25837 - 26562 290 ## PROTEIN SUPPORTED gi|87119277|ref|ZP_01075175.1| 30S ribosomal protein S1 - Prom 26615 - 26674 8.9 + Prom 26621 - 26680 8.3 29 23 Op 1 . + CDS 26905 - 27630 566 ## COG3409 Putative peptidoglycan-binding domain-containing protein 30 23 Op 2 . + CDS 27620 - 28543 536 ## EUBREC_3563 hypothetical protein 31 23 Op 3 . + CDS 28516 - 28776 198 ## EUBREC_3563 hypothetical protein 32 23 Op 4 . + CDS 28805 - 29593 703 ## COG0642 Signal transduction histidine kinase + Prom 29621 - 29680 4.2 33 24 Op 1 2/0.000 + CDS 29744 - 30100 182 ## COG0640 Predicted transcriptional regulators 34 24 Op 2 . + CDS 30084 - 30716 441 ## COG5658 Predicted integral membrane protein + Term 30734 - 30771 1.1 + Prom 30719 - 30778 2.5 35 25 Tu 1 . + CDS 30879 - 31100 111 ## gi|225027349|ref|ZP_03716541.1| hypothetical protein EUBHAL_01605 + Term 31115 - 31157 7.7 36 26 Tu 1 . - CDS 31358 - 31609 73 ## Swol_1745 transposase - Prom 31696 - 31755 5.3 + Prom 31512 - 31571 3.6 37 27 Tu 1 . + CDS 31659 - 31943 181 ## gi|225027352|ref|ZP_03716544.1| hypothetical protein EUBHAL_01608 + Prom 32022 - 32081 3.5 38 28 Op 1 . + CDS 32101 - 32802 268 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 39 28 Op 2 . + CDS 32806 - 33531 456 ## CLJU_c09150 putative permease 40 28 Op 3 . + CDS 33538 - 34269 159 ## BAS3614 bacteriocin ABC transporter permease subunit 41 28 Op 4 . + CDS 34262 - 35077 633 ## Blon_1002 hypothetical protein 42 28 Op 5 40/0.000 + CDS 35139 - 35798 595 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 43 28 Op 6 . 
+ CDS 35789 - 37153 764 ## COG0642 Signal transduction histidine kinase + Term 37393 - 37445 0.6 - Term 37282 - 37334 11.4 44 29 Tu 1 . - CDS 37335 - 38318 512 ## Selsp_1837 hypothetical protein - Prom 38483 - 38542 5.1 Predicted protein(s) >gi|222441868|gb|ACEP01000074.1| GENE 1 224 - 352 113 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027315|ref|ZP_03716507.1| ## NR: gi|225027315|ref|ZP_03716507.1| hypothetical protein EUBHAL_01571 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01571 [Eubacterium hallii DSM 3353] # 1 42 53 94 94 78 100.0 1e-13 MAPKTEIRNKLADALDADLSALSDINIQTYEDVMHTLFLFEE >gi|222441868|gb|ACEP01000074.1| GENE 2 279 - 431 56 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSESADKSASRASASLFRISVLGAIIFFSYFLILESAVLEEKPTFSPNSF >gi|222441868|gb|ACEP01000074.1| GENE 3 522 - 728 193 68 aa, chain + ## HITS:1 COG:ascB KEGG:ns NR:ns ## COG: ascB COG2723 # Protein_GI_number: 16130623 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase/6-phospho-beta-glucosidase/beta- galactosidase # Organism: Escherichia coli K12 # 4 62 293 351 474 64 55.0 4e-11 MMMPEDDEILKNGTVDFVYISYYSSNCVSVTQKGEITAANGGENIKNPYLETSAWGWQID VLLVQKSH >gi|222441868|gb|ACEP01000074.1| GENE 4 900 - 2237 1391 445 aa, chain + ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 10 444 18 459 464 158 28.0 2e-38 MEGNLTKGPILKTLTKLAVPIMASSFLGTLYNITDMAWIGLLGSRAVAGVGVGGMFTWLS QGLAAMARMGGQVHVAQCIGRGDRERAHGFAQAAVQLAAFMGMAYAILSLLFTRQMVGFF QLADAQAHAAAMSYMRIACGLIVFSFMTLTLTGLYTAQGDSKTPFIANLVGLATNMVLDP VLILGVGMFPKLGVVGAAIATVTAQAIVMSIMIVGIVIQKKENVLKGTRLLAKIPREYLQ GICKIGIPTAIQGMAYCAISMVLTRMISGYGAEAVATQRVGGQIESISWNTADGFAAALN AFIAQNYGAGKNDRVKKGYKASLWTVGIWGLLISAVFICIPEPIARIFFHEPKAIATSVE YLIIIGFSEAFMCVELTTVGALSGLGRTRLCSIISIAFTSARIPLSIILGGIMGLDGIWW AISSTSIVKGIIFTCTFLWLTRKRR >gi|222441868|gb|ACEP01000074.1| GENE 5 2757 - 3524 550 255 
aa, chain + ## HITS:1 COG:MA2370 KEGG:ns NR:ns ## COG: MA2370 COG2865 # Protein_GI_number: 20091202 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing an HTH domain and an uncharacterized domain shared with the mammalian protein Schlafen # Organism: Methanosarcina acetivorans str.C2A # 37 207 205 373 458 132 43.0 8e-31 MAEHSVKKEQLINWKILKQSEGQLLATNAYALLTSDYFSFSKTQCAVFKGTDRAVFLDKR EFTGPIYTQIEEAVDFVLRNIRLGATIDGLVRKEKYELPPEAIREMIINAHCNRNLLDES CIQVAVYDDRLEVTSPGGLYNGLTYEKVMNGHSKIRNKGIANIFSQMGLVEAWGSGIKRI LNAAEEYGLPKPRFQEFDNMFRVELFRVNPITDQANQATNQVNQDFERLDFFSCLYSRFT CKFREISYSHKIKLE >gi|222441868|gb|ACEP01000074.1| GENE 6 3651 - 3782 116 43 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027320|ref|ZP_03716512.1| ## NR: gi|225027320|ref|ZP_03716512.1| hypothetical protein EUBHAL_01576 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01576 [Eubacterium hallii DSM 3353] # 1 43 1 43 43 76 100.0 6e-13 MFDQQHQRNNTVLFCRLESENGCATDFIADSGRDIASDHKMKK >gi|222441868|gb|ACEP01000074.1| GENE 7 3884 - 4189 254 101 aa, chain + ## HITS:1 COG:BH3098 KEGG:ns NR:ns ## COG: BH3098 COG0526 # Protein_GI_number: 15615660 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus halodurans # 1 101 1 100 104 106 49.0 1e-23 MSAININKNNFQNEIMDSEKTVLLDFWASWCAPCRMVVPIIEEIAGERPDIKVGKINVDE QPELASKFGIMSIPTLVVMKNRKIVQQVSGARPKNAILEML >gi|222441868|gb|ACEP01000074.1| GENE 8 4324 - 6015 1466 563 aa, chain + ## HITS:1 COG:FN1903_1 KEGG:ns NR:ns ## COG: FN1903_1 COG0446 # Protein_GI_number: 19705208 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 2 461 3 461 469 412 43.0 1e-114 MKVIIVGGVAGGATAAARIRRLDEYAEITVFEKSGYISYANCGLPYYIGDVITDPEELTL QTPESFFKRFHINMKIHHEVISIYPDRKTVSVKNLENGEIFEENYDKLILSPGAKPTQPR 
LPGVGIDKLFTLRTVEDTFRIKKYINKNHPKSVVLAGGGFIGLELAENLRELGMDVTIVQ RPKQLMNPFDPDMASMIHNEMRKHGIKLVLGYTVEGFKEKDNEVEVLLKDTPSLHADMVV LAIGVTPDTALAKEAGLELGIKESIIVNDRMETSVPDIYAAGDAVQVKHYVTSNDTLISL AGPANKQGRIIADNICGGDSHYLGSQGSSVIKVFDMTAATTGINETNAKKSGLDVDTVIL SPMSHAGYYPGGKVMTMKVVFEKETYRLLGAQIIGYEGVDKRIDVLATAIHAGLKATQLK DMDLAYAPPYSSAKDPVNMAGFMIDNMAKGILKQWHLEDMDKISKDKNAVLLDVRTVGEF SRGHINGFKNIPVDELRERINEVEKGKPVYLICQSGLRSYIASRILEGNGYETYNFSGGF RFYDAVVNDRALIERAYACGMDY >gi|222441868|gb|ACEP01000074.1| GENE 9 6132 - 6791 380 219 aa, chain - ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 2 209 9 212 217 81 27.0 1e-15 MNFDNYFPVWNELNTAQKKLISDNLITQHVKKGAVIHNGNLDCTGLLLVKTGQLRTYILS DEGREITLYRLFDMDICLLSASCIIRSIQFEVTIEAEKDTDLWIIPSEIYKVLMKESAPV ANYTNELMATRFSDVMWLIEQIMWKSLDKRVASFLLEETSIEETNELKITHETIANHLGS HREVITRMLRYFQGEGLVKLSRGKITILDLKKLETLQRS >gi|222441868|gb|ACEP01000074.1| GENE 10 6926 - 7675 413 249 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027324|ref|ZP_03716516.1| ## NR: gi|225027324|ref|ZP_03716516.1| hypothetical protein EUBHAL_01580 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01580 [Eubacterium hallii DSM 3353] # 1 249 1 249 249 515 100.0 1e-144 MQGDVSFTFLDRIEKVELNIVDRRWQSALALALTLPDICGERCWQLRCEYLHQNKGFLND ENNIHFHLGLNCGMSVCQLDSMNIQENRIDIRIDIEQFCLRMCKAAKSYYDKVNLEKDFS LYNTPVLDFIQVTQKKKDASIIALICGNERYAKGLKEALQFISEQIMLFYTPESAKTKLG KHKPDLWIVTEDMTRQPNQPWRADRTTPVIIITGNPDAVEIKKNSGKLTVLSMPLSIVDL RKNVEIYVS >gi|222441868|gb|ACEP01000074.1| GENE 11 7731 - 8087 123 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027325|ref|ZP_03716517.1| ## NR: gi|225027325|ref|ZP_03716517.1| hypothetical protein EUBHAL_01581 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01581 [Eubacterium hallii DSM 3353] # 1 118 23 140 140 149 100.0 6e-35 
MMLMRRQKDKMSEIKIQMDQLQAQHDELQKERDQLLEQMNHLEEQREHLQMQTDQAFRSA DEIEKICDEIWHHANTIHLYSSLSEEESKSITMKEKQKVIIRTIEEIINIIYKYKGDV >gi|222441868|gb|ACEP01000074.1| GENE 12 8207 - 8407 259 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027326|ref|ZP_03716518.1| ## NR: gi|225027326|ref|ZP_03716518.1| hypothetical protein EUBHAL_01582 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01582 [Eubacterium hallii DSM 3353] # 1 66 1 66 66 79 100.0 7e-14 MAKRVASVDKKSCAACGVCENTCPLAAVKVHRGCYAVVEETLCVGCGKCARSCPVGCIEM KERVAV >gi|222441868|gb|ACEP01000074.1| GENE 13 8500 - 8832 65 110 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2653 NR:ns ## KEGG: bpr_I2653 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 101 35 135 193 129 69.0 5e-29 MLDFLLPLGIAIFGGNKFFCNHLCGRGQLFSKLGGDFKCSRNKPTPRWMSSKWFRYGFLI FFLTMFGNMVFQTYLVGGGASSLHEAIKLFWTFRVPWGRTYMWRAWMWKP >gi|222441868|gb|ACEP01000074.1| GENE 14 9252 - 9551 233 99 aa, chain + ## HITS:1 COG:alr2719 KEGG:ns NR:ns ## COG: alr2719 COG0675 # Protein_GI_number: 17230211 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 6 97 320 410 452 111 56.0 4e-25 MKAVSRSLNFGKSVSDNGWGMFTTFLKYKLEEQGKKLVNVDRFFASSQTCSVYGYKNAKT KNLALREWDCPQCKTHHDRDVNAAVNIRNEGMRLVNALP >gi|222441868|gb|ACEP01000074.1| GENE 15 9731 - 10348 491 205 aa, chain + ## HITS:1 COG:no KEGG:Shel_05170 NR:ns ## KEGG: Shel_05170 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 3 202 76 275 436 322 73.0 7e-87 MSITKIVITGGPCAGKTTGMSWIQNAFTERGYRVLFISETATELISGGVAPWICSDNTEY QKCQMKLQLEKEKVFEQAARTMNSDKILIVCDRGTLDNKAYMTEADFSLVLNELGLNEVE LRDGYDAVFHLVTAAKGAEKFYTTANNTARTETVDEAVALDDKLIAAWTGHPHLRIIDNS LGFEEKMKHLIAEIANFLGEPDHMR >gi|222441868|gb|ACEP01000074.1| GENE 16 10409 - 11254 312 281 aa, chain + ## HITS:1 COG:Cj0782 KEGG:ns NR:ns ## COG: Cj0782 COG0348 # Protein_GI_number: 15792120 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Campylobacter jejuni # 112 281 73 251 260 65 29.0 1e-10 MMRKNTMFRKYIPSLLLFLLFELVAITLWIAKYNIFYLLNFSYIGGCLALGTALFTAGKC YARHFVQLAVGSYMLLYLGIISCENMQIEGFWYYLFLGVFEAATIHYAVAKIFGPLLFGR GWCGYACWTAMVLDFLPYKQPKKPRKEKLGVLRYVMFVLSLAVVLSLFLMKIANLERIMF WIFLVGNTFYYIVGIVLAFVFKDNRAFCKYLCPVTVFLKPMSYFSLLRVHCDENKCIHCG KCLKVCPMNVEVNKESRKRKNGTECILCYECTKVCPTKALH >gi|222441868|gb|ACEP01000074.1| GENE 17 11492 - 11713 167 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027331|ref|ZP_03716523.1| ## NR: gi|225027331|ref|ZP_03716523.1| hypothetical protein EUBHAL_01587 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01587 [Eubacterium hallii DSM 3353] # 1 73 1 73 73 76 100.0 5e-13 MITLFVIGILILSIKFIVLACKATWGMAKALLFIIGLPVILIGLLIAGFISLAAPLLIIT LIAVFIWSFAKQY >gi|222441868|gb|ACEP01000074.1| GENE 18 11710 - 12045 343 111 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0657 NR:ns ## KEGG: EUBREC_0657 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 108 1 108 111 131 58.0 8e-30 MHFCPERLIQVRESLHINKAEAARRLNLSAMAYGRYERGEREPSYQYAFYIAYTFHCNLD FLYGQSDEMETDFITITRSDSEEVYSLVEEMQNNSDFEKRILAYAEKLRAL 
>gi|222441868|gb|ACEP01000074.1| GENE 19 12484 - 13707 1591 407 aa, chain + ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 12 402 12 410 424 206 33.0 6e-53 MKGFYQWYRKHITGAIFAGLILGILTGLFLADKFEVVLTATSALGSIYMNALNMMIFPLV FCSIVMGIASIGSAKTTGKITGAAMLFFLCTTALASFMGLIIPRLIHLGRGVKFEMATSD IQATKMDSILDTVKSLIPSNPVKAFAEGNMLQVLVFSLVIGFTLIAIGEKGEPLLKVIDS LNEVCLKIISTVMYFTPIGVFCTIVPVVEANGTSTIVSLATQLVILYIAFFGFALVVYGA AVKVLGKTSPIRFFRAIMPAALNAFGTCSSSATIPLSKQCVEDELGVSNQVSSITIPLGA TVNMDAVSILMSFMIMFFANACNVNISISMMVLVLLANVLLSVGTPGIPGGAIASFAALA TMAGLPAGVMGVYISINTLCDMGATCVNVIGDMAGCVVLQEQIDIEA >gi|222441868|gb|ACEP01000074.1| GENE 20 13871 - 14746 681 291 aa, chain - ## HITS:1 COG:CAC3396 KEGG:ns NR:ns ## COG: CAC3396 COG2816 # Protein_GI_number: 15896637 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Clostridium acetobutylicum # 62 269 43 249 271 138 35.0 2e-32 MIQDIYPHKLDNHYDINAVPDMDSIVLYFTEEGVLHRLGQYKEYDNYLFFPAVRDFPSDT TDIDEFPFEKDDLTYLFSVDDAHYFLLEKEPAYLPEGYDFAPVRKLRTKEVHPKYRIFTG ITGFQLSNWYKNNRFCGRCGSKTVHSTTERALKCPSCGHVIYPRIVPAVIVGVCNGDEIL VTKYRTGFAHYALVAGFTEIGETLEETVQREVMEETGLHVKNIRYYKSQPWGIVDDILAG FYCDVDGDTEIQMDESELKLAEWKKREDIVLQPDDFSLTNEMMLMFKKGKQ >gi|222441868|gb|ACEP01000074.1| GENE 21 15395 - 19282 4306 1295 aa, chain + ## HITS:1 COG:BH3816 KEGG:ns NR:ns ## COG: BH3816 COG0841 # Protein_GI_number: 15616378 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Bacillus halodurans # 423 1280 165 1005 1093 339 29.0 2e-92 MTKFFVKKPYFVLVTVIIVLVIGFVSLGEMQTDLLPKMELPYMAVITTEVGASPEKVESD VVKPMESTIGTINGVEKLTSTSANNYGMLMLEFAEDTNMEAALVRVSKALNSLDLPEGCG TPNIMEISADMMATMYASVSYEGKDIKELSSFTEQTVKPYLERQDGVASVSANGLIQDHV EIRLNEKKIDRMNDKILGQTNDKLLEASDKIEEAKASLKKSKEQLKKQEKTLSSKQEETN 
TKLAEAQTKLAEALSQKAAYEAELNSLKASQSALKAEKKAYKDAGLEKTYKTLNNTFKSL NATMGEAAKQMGITIPSSVKEAIQHPKQFEAFVSWMEQMGYGEQFSKLTIDSLKQMYKAV EVRLPQIDTELSNLQVKIKVQKAIIDKLNEQMKGLDEQQSKLIAGGYSAAAGFGSGQAQI AAGKTQIESAEKELKDSEDKLEDSRKAARENANIDTLLSLDTLSAMISAQNFSMPAGYIE DKSSHQWLVEVGDNFTNEKQLKNLVLTKIDGVGKIRVKDIADITVVDNQGDAYSKVNGEN AILLSVFKGSTANTSTVSKGIQNSFKDLEKKYKGLSFTTIMNQGDYINIIIKSVLSSILL GAVLAIIVLALFLKDIRPTIVVAFSIPFSVLFAIIIMYFTGININVMSLAGLCLGIGMLV DNSIVVMENIYRFRNEGMSASKAAVRGTAQVAGPILASTITTICVFLPMVYTSGMVSQLL MPFAFTISYALIASLLVALTVVPTMGSVLLRKTKDVKQPWFDKIKDVYGKVLELALRFKI IPLAISIVLLVLCVMQSMRTGIVMMDDMDSNQIAVSMTLDDNVKKENAYKTADEVMGILT GVDGIKKVSALDGNASVTSSMLGTGTTDNYTSFSFNIITEDNIKTTKEFRKIRKEIESKT KRVKCKEITVSSSAMGDMSSLSGGGLSVNVYGEDQQKLIKISEDVIKMMKEISEVKTASN GIESADKMLHLKIDRNKAAEHGLTIAQIYQEILKHTTTEKNAITLNMDEKDVEVNLINET DKLTYENILDMKIETTEKNSDGEDEKHTYKLSDFASEDDGYSMDNITRENQKPYLTVTAE IDENANATLLSRKLQDKIDKYKAPGGYEVELSGDATQTTDMLKQMGKALALGFMLIYLVM VAQFQSLLSPFIVIFTVPLAFTGGMIGLGFFGMSISAMALMGFMILMGTVVNNGIVFVDY VNQLRIGGMSKREALIETGKTRMRPILMTALTTILSMSVMVFSQDAGNAMQKGMAVVVAF GLLYATLMTLFIVPVMYDILYRKQPKVIDVEDSDL >gi|222441868|gb|ACEP01000074.1| GENE 22 19406 - 19570 58 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCYYNNSLSVNLMVFHFLNILYRQSPSSNFSKIISEFPLTVKDDPSTDLQIIPK >gi|222441868|gb|ACEP01000074.1| GENE 23 19754 - 21124 1064 456 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 3 453 8 454 459 420 53.0 1e-117 MKEREKLGSRLGFILISAGCAIGIGNVWKFPYMAGQGGGGAFVLFYLIFLLMLGLPIMTM EFAVGRASQKSPVKAYYALEKPGQKWHIHGYITLIGCYLLMMFYTTVSGWMLHYFYLTAT GKFTGLDSETVSEQFNTMLSQPLVMGFWMVVVVIAGILVCSIGLQNGLEKVTKVMMISLL LIMIVLAVNSFFMDGAKEGLSFYLIPDFGRMKEIGIIKTITGAMNQAFFTLSLGIGAMAI FGSYIGKERSLLGESLNIALLDTFVAITSGLIIFPACFTFNVDQTSGPGLIFITLPNIFN HIPMGRLWGSLFFVFMSFAAFSTILAVFENIISCGMELTGWSRKKSSFVNAIAIILLSIP 
CVLGYNLWSWDGFAVFGGAVLDFEDFLVSNLFLPLGSLVYLLFCTSRYGWGWKNFTTEAN TGKGLKMQNWMRGYITYILPLIVLFIFFFGLYDKFF >gi|222441868|gb|ACEP01000074.1| GENE 24 21504 - 21911 475 135 aa, chain + ## HITS:1 COG:no KEGG:CKR_0758 NR:ns ## KEGG: CKR_0758 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 71 2 71 364 62 43.0 4e-09 MGRKCVGFVEAVGLAAALQAADTIAKSANIRLLGYEYSGYDGHILIKFEGNAGAIKAAAE AAAVAIHKVHTDFDQTTAVNLEKTGLDEAVYDVLVHNKNTVGDPLQIASGKRPQGTNRQA KWVGHWEQKGCYIRE >gi|222441868|gb|ACEP01000074.1| GENE 25 22120 - 24135 2104 671 aa, chain + ## HITS:1 COG:VC1348 KEGG:ns NR:ns ## COG: VC1348 COG3437 # Protein_GI_number: 15641360 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator containing a CheY-like receiver domain and an HD-GYP domain # Organism: Vibrio cholerae # 302 655 73 434 441 213 36.0 1e-54 MQADETMTKQESEEMIKQLEKVFDIVRLLDEETLYGGSGIESEGGEVCQCYSFWGRNEVC ENCISLKVMQDKVQRTKLELLGQDFYQVISKYVKIDGKPYAMEMISRLDREVLVDDDGKN QLLSEISGYNDKLYRDALTGVYNRRYFEEKIKTTRFSAGVAMIDLDDFKIYNDTYGHEAG DMVLDTVVQIIKKNIRKSDILIRYGGDEFLLLLPDIKGDAFEIKLRQIQERVHTAAVPGY SQLRLSVSVGGILSNGEKIEDAIRRADKYMYQAKMTKNSVVTEQSKICTEDRETGRKGGQ KILIIDDSEMNRMLLSEMLGEEFDILEAENGKEGLEILQQYGTSISIVLLDIVMPVVDGF EVLDFMIKEHWNEEIPVIMISSENSPDTMRRAYEMGVVDYISRPFDARVVYRRVLNTIKF YAKQRHLVTLITNQVYEKEKNNRMMISILSQIVEFRNGESGQHVLNINILTGLLLESLVQ KTDKYHLNGSDRLLIITASALHDIGKIGISDRILNKAGKLTEEEFEVIKRHPIIGASILK NLALHQDEPIVKVAYEICRWHHERYDGGGYPDGLKGEQIPISAQIVSLADVYDALVSDRI YKKAYSHKEAVRMILAGECGAFNPLLLECLEEIQGKIKEELEVQDVPEISPVPVQCPISE ISELSMPEDKK >gi|222441868|gb|ACEP01000074.1| GENE 26 24177 - 24524 474 115 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20567 NR:ns ## KEGG: EUBELI_20567 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 114 1 114 114 134 62.0 1e-30 MTVKECYEQMGSDYESVLGRMGSEAMIKRFALKFLQDPSFNNLKENLEKNDGEEAFRAAH TLKGVCLNLGFDELYEASAEITEKLRGKETAGSEDMFQKVEEKYQKTVNAIKGLE >gi|222441868|gb|ACEP01000074.1| GENE 27 24817 - 25527 
462 236 aa, chain + ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 6 212 1 203 208 220 57.0 2e-57 MNTNNIQKVNYQKILDKITDKIEKECAEGENRPALFLHACCAPCSSYVLEYLTKYFKITI FYYNPNITPEIEYTKRVKELQRFIHEAGYEEDVTFVEGDYDPQVFFDMARGMEDLPERGL RCYHCYALRMEEAAKKAAEMGADYFSTTLSISPHKNAQWINEIGEKLADKYGVKHLPSDF KKKGGYLRSIALSEEYHLYRQDYCGCVFSRRDSEVPKQTNSVIKIAPTPTRMLNNI >gi|222441868|gb|ACEP01000074.1| GENE 28 25837 - 26562 290 241 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|87119277|ref|ZP_01075175.1| 30S ribosomal protein S1 [Marinomonas sp. MED121] # 1 206 54 264 561 116 34 2e-25 MSESMADFEKEIDESLKTLNVVENEGEATEEETGFNAWDKVQQYLEDGTVLTLKVGGIVN KGVIVYVDGLRGFIPASHLEIGYVEDTNPYLGKTVEAKVITADPEKNRLVLSVKEVLYDK KRAEKEARINAIEAGAVLTGKVESLMPYGAFVDLGNDISGLVHISQIAQKRIESPAEVLK AGQEVNVKVLKVENGKISLSMKALEEVAPEEEDTSGEEVVSYKDEGDATTGLGALLAGLK L >gi|222441868|gb|ACEP01000074.1| GENE 29 26905 - 27630 566 241 aa, chain + ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Clostridium acetobutylicum # 5 218 4 216 437 148 42.0 8e-36 MPDVGTLQVSLVSNRGKRPITDAAVEISSAGEPDKILEVLKTDLSGQTEVIELETPPLEY SMEPGLNQPYSEYNLKITAPGFEPVEIAGSQMLSGQLALQNLSLEPTITETASYEVLAIG QHTLYGDYPPKIAESEVKDIQATGEVVLSRVVIPEYVIVHDGAVSDNTAKNYYVLYKDYI KNVASCEIYSTWPKETLRANILAIMSFTLNRVYTEWYRSHCSFMVGKCGSAGIFFLLLFA F >gi|222441868|gb|ACEP01000074.1| GENE 30 27620 - 28543 536 307 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3563 NR:ns ## KEGG: EUBREC_3563 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 298 539 830 918 489 86.0 1e-137 MLFKVEYRDPEDSYYDEAEFAIVKGNTVFGNLMDDVRFYGELALVRGGIEKIHEALPDYK YVPMRDVREAMYPEKMTTEQLAEAVDEIAEAFDPYEYRDNVEIGENTVQEVMLDLRSGNI 
HSYISYLKDIVDEECDQSVRAGVLMERLKAYEPELPKDMEPMVYVNYCEESGLMNPRCQK LSDLDSKTVEQDKVCYADRAPNTNEPKTIAQMFFTVYYAEKGDQMLHHFKGKIDIGTGNG GLLNQLKLQNEMRLTDESWINYQKGKGNEEFQKYMEDLTDMQNHVLPYLQSFCSLEKKEC RNAGNSR >gi|222441868|gb|ACEP01000074.1| GENE 31 28516 - 28776 198 86 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3563 NR:ns ## KEGG: EUBREC_3563 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 86 832 918 918 86 59.0 3e-16 MQERREQQVMDKKQSRATAVTGVEKSANVKDAGKVERKSLPQKKVAIGKEKKPSIHERLE INKKIIQEKQGKDKPEREADFGVRTV >gi|222441868|gb|ACEP01000074.1| GENE 32 28805 - 29593 703 262 aa, chain + ## HITS:1 COG:VC1349_3 KEGG:ns NR:ns ## COG: VC1349_3 COG0642 # Protein_GI_number: 15641361 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 57 252 2 194 260 115 35.0 1e-25 MITWLPDMEVFRNLPLSVISVIFLYFLMFSIFWLWTYRTNQAHQKKEQEKDEKYKAELLI AAKKAEAANEAKTKFLQRMSHDIRTPINGICGMIDMAEHYADDMEKQTEYRAKIKEASNL LLELVNDVLDMSKPESGEVVLEESPFNLNKIFEEVLVVIEQIAAEQNIRIVWEKKEIKHC DLIGSPRYVKRVMMNILSNAVKYNGENGQIYISCREIPSEQPEMTTMEFVCRDTGIGMTG TVLSSMTATVAQTVKKDGHMSS >gi|222441868|gb|ACEP01000074.1| GENE 33 29744 - 30100 182 118 aa, chain + ## HITS:1 COG:BS_yvbA KEGG:ns NR:ns ## COG: BS_yvbA COG0640 # Protein_GI_number: 16080432 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 29 111 3 87 90 85 47.0 3e-17 MTILFRQNILYLEIFLNTFERSDAMAFADTFKALSDPARREILQLLKNGRMSAGEIGSHF HMTGATVSYHLKILKKADLIWETKEKNFIYYDLNTSVVEELLLWLKDLKGDESHDEKI >gi|222441868|gb|ACEP01000074.1| GENE 34 30084 - 30716 441 210 aa, chain + ## HITS:1 COG:MA3135 KEGG:ns NR:ns ## COG: MA3135 COG5658 # Protein_GI_number: 20091953 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Methanosarcina acetivorans str.C2A # 8 208 11 215 227 77 26.0 2e-14 MMKKYKWSLLLGSLLSLSPILIGLLLWNQLPAKLPTHFSTGNTPDGWSSKAITVFGLPLF 
MFAMMWFCFLFTHVDPKRQNISPKLIRALVWFMAALSPIVVCSSYAIALGINVNITLLTY LIIGLMLLVIGNYLAKTKQNYSAGIRIPWTLSSEENWNRTHRVSSRLFIFCGLAMIANGF LDSKILLFFIMAVLIVIPYAYSFFLFKKGI >gi|222441868|gb|ACEP01000074.1| GENE 35 30879 - 31100 111 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027349|ref|ZP_03716541.1| ## NR: gi|225027349|ref|ZP_03716541.1| hypothetical protein EUBHAL_01605 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01605 [Eubacterium hallii DSM 3353] # 1 73 52 124 124 145 98.0 9e-34 MKAGKMQNFFSVLIENNCLQDSGNTKSRTTKNDDFLHGFGISNMRKAVEKYDGQLMTKCE NGKFTLKILILIP >gi|222441868|gb|ACEP01000074.1| GENE 36 31358 - 31609 73 83 aa, chain - ## HITS:1 COG:no KEGG:Swol_1745 NR:ns ## KEGG: Swol_1745 # Name: not_defined # Def: transposase # Organism: S.wolfei # Pathway: not_defined # 2 64 436 497 570 81 55.0 1e-14 MYAVTTDLLDGDVKDILKFSEGRWKIEECFRVMKTDFETRPVFLHDDVRIKAHFLICFLA VVIYVTSKEIWEMPSATRHSWIN >gi|222441868|gb|ACEP01000074.1| GENE 37 31659 - 31943 181 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027352|ref|ZP_03716544.1| ## NR: gi|225027352|ref|ZP_03716544.1| hypothetical protein EUBHAL_01608 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01608 [Eubacterium hallii DSM 3353] # 1 94 1 94 94 200 100.0 3e-50 MCIFSTPKLFAGIRKLYPEEYQLLKHNEEILEFMLDNKCDLDTFVGNAKSCVYHGDEKAI RSLVTGEFSANDIYAKGQWMYPAGAFHGVEGEPC >gi|222441868|gb|ACEP01000074.1| GENE 38 32101 - 32802 268 233 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 227 1 229 305 107 33 8e-23 MNMILKTTNLCKSFNGQTAVNNISLNIERNSVYGLLGPNGAGKSTTLKMITGILRPTSGS IEFDGHAWKRSDLNHIGALIEMPPLYENLTAYENLKVRTAILGLTDKRIDEVLQIVRLTE TGKKRAGQFSLGMKQRLGIAVALLNNPKLLILDEPTNGLDPIGIEELRELIRSFPAKGIT VILSSHILSEVQQTADHIGIIAGGILGYEGKLNANENLEQLFMDVVKSNHREG >gi|222441868|gb|ACEP01000074.1| GENE 39 32806 - 33531 456 241 aa, chain + ## HITS:1 COG:no KEGG:CLJU_c09150 NR:ns ## KEGG: CLJU_c09150 # Name: not_defined # Def: putative permease # Organism: C.ljungdahlii # 
Pathway: ABC transporters [PATH:clj02010] # 3 229 4 240 254 147 40.0 5e-34 MDYLKSEHLKFKRTISNKLIFIVPLITAIFAWIMGGYMGYQYMTLYWWYAFLLPGAIAIL CSLSHRKEENAGKYYSVFSMPVNLSRFEFAKGIILVEKLLVSAIFLALLISISNIIAPAT AVYSLLHSITGSIGIILASVWQIPLCLYLARKTGMFVPIVLNTILGIVLSTATALGNTVA WWFVPYCWAAKLAEPLMGIEMNGTYAGNCGFSIAIMISISLSILLFLILSYADAKDFSKG R >gi|222441868|gb|ACEP01000074.1| GENE 40 33538 - 34269 159 243 aa, chain + ## HITS:1 COG:no KEGG:BAS3614 NR:ns ## KEGG: BAS3614 # Name: not_defined # Def: bacteriocin ABC transporter permease subunit # Organism: B.anthracis_Sterne # Pathway: ABC transporters [PATH:bat02010] # 6 240 15 258 260 99 30.0 9e-20 MGFIYVLRSEQEKTKHTSFWAIHFCVPVMGALLFLAYYSLYASTADSKKLKMILEITTTF FPLLISVIVGLNVALEEKASHFQTLLAVPNRHKNMLAKLTYLYGSGVFALFFLFLLFVIG IHLLGMADTVQLGMLIGAAAGMAFCNLIIYILHLFLSFKFGLGLSLFWGVFESLQCILYS NIELKGVARYIPFAWSMNWVQDILSRQIFNYGTEKIWIAALTTGGLLLTLLWFSHWEGRK NYE >gi|222441868|gb|ACEP01000074.1| GENE 41 34262 - 35077 633 271 aa, chain + ## HITS:1 COG:no KEGG:Blon_1002 NR:ns ## KEGG: Blon_1002 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 271 1 268 268 321 59.0 2e-86 MNKSKKIMLLLLFTFIVCVALAGCTLQDRIEEYSSDKEQCYLNTENVTRFSYKGNDYTIL ADTVSNGGLGEWIGYIRQLAAIDENGKILLQENVEPATFQSLADLAEKAPEAAYIIPFLN VYAASNADDYLIVDVNGGYHKAVISENVKDSDTVFDFKKTEESINDSFEVNPENATQLLW GGTVYQVTSDMVSDDELGSYIDILAESVTFDTETKIPLSKEDLSKIDWYGENAGQGRESW FYTDVYEIYGTDKAEAVAVNVNDNYYIAKRQ >gi|222441868|gb|ACEP01000074.1| GENE 42 35139 - 35798 595 219 aa, chain + ## HITS:1 COG:SPy1081 KEGG:ns NR:ns ## COG: SPy1081 COG0745 # Protein_GI_number: 15675069 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pyogenes M1 GAS # 1 219 1 227 228 169 40.0 3e-42 MAAILMIDDERAILELVKNGLRKDGHFVTTYTSAAQVPLDKLNRYDLIILDIMMPDVDGF SYCDKIRSLVDCPILFLTAKTMEHDITFGLGLGADDYLTKPFRIAELRARVNAHLRRERR 
ERHTALTFDRIKIDLSEKELRVDNTPVALTKSEYLICEYLARNKGQVFSKEQIYEAVFSL EGDSDNSTISTHIKNIRSKLNKLDIQPIATVWGIGYKWE >gi|222441868|gb|ACEP01000074.1| GENE 43 35789 - 37153 764 454 aa, chain + ## HITS:1 COG:SPy1082 KEGG:ns NR:ns ## COG: SPy1082 COG0642 # Protein_GI_number: 15675070 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Streptococcus pyogenes M1 GAS # 3 452 1 444 448 112 25.0 2e-24 MGMKKTTSLKASFWKFLCMLLIGLIGAVAIPFSLILAGTSTGLITYADYSERCTKDLAPV IAATPDLADVQLPMGCEYLVLDKNYQVTETTLEGDDLDRAMEYAISGQINTNLNKQYLLV TRKNEYVVLQYYIGSQFTNEWLYEHFPSPEILLYIFIAINCIAVCVILTAKFAKNMRTQL SPLFEATRQVAEQNLDFEVGHSKIKEFEDVLCSFANMKDNLKISLEKQWNAEQLQREQIA ALAHDLKTPLTVIQGNIDLINETELDDEQRLYAGYITESSEQIGIYIKVLIDISRTIAGY QLHLEKFDIADYMGQIKAQASSLCLTKGICLQLETGANLGTLKADKLLLERAIMNVISNA LDYSPPQGTIYVTVQKTDCFLHISITDEGGGFTPEALHHAQKQFFMGDKSRTSNMHFGMG LYITSSIIKQHDGQLVLSNSKKTGGAQVIIKIPY >gi|222441868|gb|ACEP01000074.1| GENE 44 37335 - 38318 512 327 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1837 NR:ns ## KEGG: Selsp_1837 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 23 316 21 335 342 150 32.0 9e-35 MIIILKQTNIAHTIDLTADRAKYDECAKKLLTYNAVIAWILKSCTKEFSQYSVQFICDNC LTGDIEISQRSVHQDYLDKNTRLNGNQRIDGLNTEANSINEQTVYYDIRFKACLPDSDEP VQLIINLEIQLNDTPGYPLVTRGFYYCARMISEQYGTVFTDEHYEKIQKVYSIWICPDPA KKRKNGIFKYHTVEDTIHGTPYTSPESYDLMEVVILNLGDADKESDLEILDLLNVLFSPS TSPEEKKQRLNDDFHIAMTEEFESEVQNMCNLSNGLVALGEDKKSYSLAQMMIENEEPLD KIILYTGYSVEKLKDIANTMGKLLPEK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:36:44 2011 Seq name: gi|222441867|gb|ACEP01000075.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont82.1, whole genome shotgun sequence Length of sequence - 11700 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 6, operones - 4 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 581 463 ## ELI_2668 predicted metal-binding protein 2 1 Op 2 . 
+ CDS 669 - 1880 1178 ## EUBREC_2482 hypothetical protein 3 1 Op 3 . + CDS 1967 - 2602 628 ## COG2755 Lysophospholipase L1 and related esterases + Prom 2623 - 2682 7.7 4 2 Tu 1 . + CDS 2782 - 3573 828 ## gi|225027363|ref|ZP_03716555.1| hypothetical protein EUBHAL_01619 + Term 3614 - 3654 9.0 + Prom 3601 - 3660 8.4 5 3 Op 1 . + CDS 3822 - 4685 955 ## COG0077 Prephenate dehydratase 6 3 Op 2 . + CDS 4688 - 5371 486 ## gi|225027365|ref|ZP_03716557.1| hypothetical protein EUBHAL_01621 + Prom 5405 - 5464 3.7 7 4 Tu 1 . + CDS 5496 - 6779 1598 ## COG2873 O-acetylhomoserine sulfhydrylase + Prom 7024 - 7083 7.8 8 5 Op 1 . + CDS 7132 - 8166 1161 ## COG1705 Muramidase (flagellum-specific) + Term 8275 - 8313 2.0 + Prom 8221 - 8280 4.2 9 5 Op 2 . + CDS 8327 - 8728 405 ## BATR1942_04580 putative small membrane protein + Term 8867 - 8923 3.0 + Prom 8858 - 8917 7.7 10 6 Op 1 40/0.000 + CDS 9027 - 9704 941 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 11 6 Op 2 . 
+ CDS 9718 - 11283 1384 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|222441867|gb|ACEP01000075.1| GENE 1 3 - 581 463 192 aa, chain + ## HITS:1 COG:no KEGG:ELI_2668 NR:ns ## KEGG: ELI_2668 # Name: not_defined # Def: predicted metal-binding protein # Organism: E.limosum # Pathway: not_defined # 9 190 19 199 201 172 45.0 5e-42 KEKKYVAEKIENFQAEIPVEEYMKECVDVATFLEFCKECSNYGKLWCCPPYTFDVEKDYW NKYRKIQIMGRKLYLPKELTSQSYIKNEQWKVTEEFLLPYKEELEQEILKLEQEYPGSVS LSSGACLHCKNAECTRLSGKPCRFQDKMRYSIESLGGNVGKTVTKYLKQELQWVEEGKLP EYFMLIYGLLIP >gi|222441867|gb|ACEP01000075.1| GENE 2 669 - 1880 1178 403 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2482 NR:ns ## KEGG: EUBREC_2482 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 54 396 1 345 477 452 64.0 1e-125 MGIYLNPGTIEFQESINSEIYVDKTMLIERTNAALRTKQKYMCISRPRRFGKSMAADMLA AYYGKDNDSAALFDGLEISHSETFSTHLNKYNVFKINMQEFLSMTQNVDEMLKVLQKRLL KELKYKYPEYVDSDNLVFVMQDIFSYTGQSFIILIDEWDCLFREYKEDEEAQRKYLDFLR VWLKDKAYIALAYMTGILPVKKYGTHSALNMFIEYSMTDPGNMAEYFGFAEREVESLCVE YGMSIEETKAWYNGYGLYSHNKQEDILYSIYNPKSVVEAMLRHRFGNYWNQTETYEALKI YIQMNMDGLKDAIIEMLAGNSVRINIGTFHNDMTTFATRDDILTLLVHLGYLTYDVEKES VRIPNKEVAQEYINAISTMDWKEAISQIKDKKYALKFQGKLEE >gi|222441867|gb|ACEP01000075.1| GENE 3 1967 - 2602 628 211 aa, chain + ## HITS:1 COG:SMa2002 KEGG:ns NR:ns ## COG: SMa2002 COG2755 # Protein_GI_number: 16263550 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Sinorhizobium meliloti # 3 208 8 213 220 162 41.0 6e-40 MTTILCYGDSNTYGYNPVNGLRYPKDVRWTGVLQKLLGEQYAVIEEGCNGRTTVFEDIAE PWKAGLGYLKPCLNTHKPIDFVIMMLGSNDLKRMFHATAKEIADGAEQLVSIIKEFTKEK QGFMPKVILVSPPEIGADIATSEFARSFDEDAIERSKELPVFYEKIAKKYDCIFFNAAKV IESSKVDSLHLMPEAHKKLAEELYKCIMENE >gi|222441867|gb|ACEP01000075.1| GENE 4 2782 - 3573 828 263 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027363|ref|ZP_03716555.1| ## NR: gi|225027363|ref|ZP_03716555.1| hypothetical protein EUBHAL_01619 [Eubacterium hallii 
DSM 3353] hypothetical protein EUBHAL_01619 [Eubacterium hallii DSM 3353] # 1 263 44 306 306 490 100.0 1e-137 MGFDLKELGEDNEQMEEFLNAVIADSKNYIEWNTENGVQNYIARALPSEQFLITISCTFQ DVAIDRDLDQIKKMTTALRERISDDRIDSIYSMTGEEKEREFESLARDLHDVCMGNTSES EKDTEEAAKNVQNTSSEVSYQQSAPTKPSGSTMTDKEKKQQAVRPHIPAQKLIFDNFHNL MDFCSLLNKDYFIPSSLYKKADKYILLVEFPVEMDNSQIITFMITAEEYGAECSNQRLEG YYLSEHAKLIIKEKAVETLFRMN >gi|222441867|gb|ACEP01000075.1| GENE 5 3822 - 4685 955 287 aa, chain + ## HITS:1 COG:VC0705_2 KEGG:ns NR:ns ## COG: VC0705_2 COG0077 # Protein_GI_number: 15640724 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Vibrio cholerae # 202 284 187 268 278 61 33.0 2e-09 MDLSDVRKNIDRVDAEIRKLFVERMTLADQVACIKAETEDKIYKPDREEIIIKKQTEGMK PELVREYTALIKRIMEVSRKYQYGRTLELRQCFPFEYSKMPAVILKPTMVKEELYICEDF SKDKVITVSSYEEIGNYIKEGKADAGIGIIEEVGIGVSDELHNLLAEKDLYITHCKVQED GGVRRKVVTFTDKLVVLPSHNRVKVMFECPNHSGSISSILSMISDYGVNLTEIHSRPFWK DNSWNYKFFVEMNANIDTKEIRALIYQLSQETAYLKILGSYACEGDF >gi|222441867|gb|ACEP01000075.1| GENE 6 4688 - 5371 486 227 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027365|ref|ZP_03716557.1| ## NR: gi|225027365|ref|ZP_03716557.1| hypothetical protein EUBHAL_01621 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01621 [Eubacterium hallii DSM 3353] # 1 227 1 227 227 415 100.0 1e-114 MEFLEEKIFDREQYDTYIRRIHYDDRDAVMYEKMLHFLFPYIVEERGIEDKEGLMEEYIH WNGQLIEKCLYHLPHDIPYINICRAYQSVVFLFAGNIKEGMEQLRRIGTENFQYDGKAPK SIMTQDGKVYKPVSKLFLLYNYAKVLKQKGMQGELEAFQKEYPYAFTIYGDEEHREHDLF YKEQMGQYKDVIFYPVYQRVYKIGRGPQDFFVYYDAAKSVPVSQVPE >gi|222441867|gb|ACEP01000075.1| GENE 7 5496 - 6779 1598 427 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 6 424 3 418 422 514 57.0 1e-145 MKEYSLGTKCVQGGYTPGNGEPRQIPIVQSTTFKYETSEAMGRLFDLEEEGYFYSRLQNP TNDTVARKICELEGGSAAMLTSSGQAASFYAVFNIASCGDHIVSSSSIYGGTYNLFAVTM 
KRMGIDFTFVSPECTEEELNAAFQPNTKAVFGETIANPALTVLDIEKFAKAAHAHGVPLI VDNTFATPVNCRPFEWGADIITHSTTKYMDGHGAGLGGCIVDSGNFDWTKYPDKYPGLCT PDDSYHGVIYTERFGIGGAYITKATAQLMRDFGSMQSPQNAFILNLGLESLEVRMKRHCD NGLAVAQFLQQHEKVTYVNYCGLEGDKYHALAEKYLPNGSCGVVSFGVKGGRKAAENFMK HLKIAAIETHVADARTCCLHPASATHRQMNDEELEACGVKPDLIRYSCGLENAEDLIADL AQALDQI >gi|222441867|gb|ACEP01000075.1| GENE 8 7132 - 8166 1161 344 aa, chain + ## HITS:1 COG:lin1064_1 KEGG:ns NR:ns ## COG: lin1064_1 COG1705 # Protein_GI_number: 16800133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 174 343 43 201 201 102 37.0 1e-21 MKHKKTLLTFLMLCGLFILSKPVSAAKVPVIKAGKSTNLKGTIINVKYSGAAVTMANKSA TPSIKIGSEIYVPCKTLFADNGIHASYTANGNTVTVKNGKRKVIFYANKKYAKVNGKKMT LKAAPYFVTYKKSNIRDLLVPAKQAAAFLGLKYTYSSRAKLVTLGVRSGIETSATQVSKV AKTRFINKMGPLARANYKRTGILASVTMAQAILESGWGQSTLAENGNNLFGMKISLSGNN WSGSAWDGVNYYAKRTYEYGSRGRYSITAKFRKYSCAEDSIEDHSAYLLGAKNGSRKRYA GLTKTKSYKKQLQIIKKGGYATSGSYVNDLCRVIRTYNLTKWDK >gi|222441867|gb|ACEP01000075.1| GENE 9 8327 - 8728 405 133 aa, chain + ## HITS:1 COG:no KEGG:BATR1942_04580 NR:ns ## KEGG: BATR1942_04580 # Name: not_defined # Def: putative small membrane protein # Organism: B.atrophaeus # Pathway: not_defined # 11 128 11 128 143 80 36.0 3e-14 MERMLEGMVIFVAKIHNYILSLNDAYEKNFTDKQLHFLVIGILGMLILMVIYPLFKLLSE NHILVIAFIYVFTVIVVITFAIEIGQKISDSGTMDFADIVFGIAGFLLMFVIFAVIRQIF LAIINLFRKIGKH >gi|222441867|gb|ACEP01000075.1| GENE 10 9027 - 9704 941 225 aa, chain + ## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 224 5 222 224 221 51.0 8e-58 MDKLKIMVVDDESRMRKLVKDFLKKKDFEVIEAADGEEALDIFFKDKSISLIICDVMMPK INGFDVVKEIRQYSQVPIIMLTAKGEESDELNGFDLGVDEYISKPFSPKILVARVEAILR 
RSNNLSTGEVLKAGGIELDIAAHEVRIDGKEITLSFKEFELLNYFVVNQGVALSREKILN NVWNYDYFGDARTIDTHVKKLRSKLGEKGEYIKTIWGMGYKFEVD >gi|222441867|gb|ACEP01000075.1| GENE 11 9718 - 11283 1384 521 aa, chain + ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 10 515 9 489 492 234 33.0 2e-61 MRNSIRFKMTSILILIVGWVIVFTWFLNNTFSEKYYVSSEKASIVKTFDKVKNILKNEEN QDTIDEEMEQISNKTNIKMMIAQSSNYYYSQNVIFSNLIDGSKTYEEILGYLDQIRSQVI LGNDKGNFKSNSGRSIFPWENSAENNMSTLINKGYFVTQLKDQASGQNGIYLFGFTDNDY LIAMRVSIEGIKASADISSRFMAYIGMIGIILGSLVIFIFSSNFTRPIKNMAAVANRMAN LDFDAKVTVTQDDELGELGYSMNQLSEKLESTIADLKAANLELKKDIEKKEKIDDMRKEF LSHVSHELKTPIALIQGYAEGLKENITDDEESKDFYCEVIADEAKKMNAMVKKLLTLNQI EFGNNQLSMQRFDVCQMIQNKINSSQILFKKKNTTVIFEEEAPVYVWADELMIEEVFSNY LSNALNHVYENGTIRIWFERMDSNLRIHVYNDGKCIPEEELNKLWIKFYKVDKARTREYG GNGIGLSIVAATMKAHGKDFGVANKENGVDFYFDLDAKIKS Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:37:27 2011 Seq name: gi|222441866|gb|ACEP01000076.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont83.1, whole genome shotgun sequence Length of sequence - 25421 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 15, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 184 114 ## gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 + Prom 202 - 261 5.9 2 2 Tu 1 . + CDS 296 - 649 265 ## gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 + Prom 767 - 826 6.3 3 3 Tu 1 . + CDS 870 - 1379 501 ## Trebr_0078 GCN5-related N-acetyltransferase + Term 1544 - 1585 -0.5 + Prom 1409 - 1468 5.0 4 4 Tu 1 . + CDS 1634 - 3658 1187 ## COG2199 FOG: GGDEF domain 5 5 Tu 1 . - CDS 3726 - 4163 255 ## COG0789 Predicted transcriptional regulators - Prom 4218 - 4277 5.8 + Prom 4193 - 4252 5.2 6 6 Op 1 . 
+ CDS 4283 - 5281 829 ## COG1073 Hydrolases of the alpha/beta superfamily 7 6 Op 2 . + CDS 5284 - 5874 533 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Prom 5885 - 5944 3.6 8 7 Op 1 . + CDS 6053 - 6295 258 ## COG0655 Multimeric flavodoxin WrbA 9 7 Op 2 . + CDS 6325 - 6417 113 ## + Prom 6622 - 6681 2.9 10 8 Op 1 . + CDS 6701 - 7198 420 ## COG0716 Flavodoxins 11 8 Op 2 . + CDS 7215 - 7598 317 ## COG0789 Predicted transcriptional regulators 12 8 Op 3 . + CDS 7604 - 8110 518 ## COG0716 Flavodoxins 13 8 Op 4 . + CDS 8122 - 8676 495 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 14 8 Op 5 . + CDS 8681 - 9532 967 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase + Prom 9556 - 9615 5.2 15 9 Op 1 . + CDS 9667 - 10644 769 ## COG3049 Penicillin V acylase and related amidases 16 9 Op 2 . + CDS 10683 - 11363 470 ## COG2357 Uncharacterized protein conserved in bacteria 17 9 Op 3 1/0.333 + CDS 11399 - 12958 905 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 12975 - 13032 16.1 + Prom 13068 - 13127 7.3 18 10 Op 1 . + CDS 13157 - 14590 1041 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes 19 10 Op 2 40/0.000 + CDS 14608 - 15279 708 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 20 10 Op 3 . + CDS 15279 - 16322 778 ## COG0642 Signal transduction histidine kinase 21 10 Op 4 . + CDS 16356 - 17447 623 ## EUBREC_3483 hypothetical protein 22 10 Op 5 . + CDS 17410 - 17982 331 ## COG0558 Phosphatidylglycerophosphate synthase + Term 18167 - 18225 16.1 - Term 18156 - 18209 15.6 23 11 Op 1 2/0.000 - CDS 18290 - 19246 673 ## COG3666 Transposase and inactivated derivatives 24 11 Op 2 . - CDS 19197 - 19916 498 ## COG3666 Transposase and inactivated derivatives - Prom 20005 - 20064 7.4 - Term 20096 - 20146 18.4 25 12 Tu 1 . 
- CDS 20193 - 20729 638 ## gi|225027396|ref|ZP_03716588.1| hypothetical protein EUBHAL_01652 - Prom 20768 - 20827 12.7 + Prom 20724 - 20783 8.0 26 13 Op 1 . + CDS 20868 - 22232 998 ## COG0534 Na+-driven multidrug efflux pump + Prom 22239 - 22298 7.4 27 13 Op 2 . + CDS 22318 - 23148 615 ## EUBREC_1089 hypothetical protein 28 13 Op 3 . + CDS 23154 - 24011 628 ## EUBREC_1088 hypothetical protein 29 13 Op 4 . + CDS 24105 - 24314 145 ## LCAZH_2344 hypothetical protein 30 13 Op 5 . + CDS 24329 - 24484 95 ## EUBREC_1088 hypothetical protein + Term 24531 - 24578 5.6 + Prom 24577 - 24636 4.9 31 14 Tu 1 . + CDS 24771 - 25055 176 ## COG2200 FOG: EAL domain + Prom 25086 - 25145 4.4 32 15 Tu 1 . + CDS 25230 - 25415 145 ## gi|225027405|ref|ZP_03716597.1| hypothetical protein EUBHAL_01661 Predicted protein(s) >gi|222441866|gb|ACEP01000076.1| GENE 1 2 - 184 114 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026922|ref|ZP_03716114.1| ## NR: gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] # 1 60 1 60 246 129 100.0 9e-29 IDSRSESLGHSQGSGNLQMSKAGMKLMTPGQISRMPKNDCILFLKGERPIYDKKNWPFNT >gi|222441866|gb|ACEP01000076.1| GENE 2 296 - 649 265 117 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026922|ref|ZP_03716114.1| ## NR: gi|225026922|ref|ZP_03716114.1| hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01178 [Eubacterium hallii DSM 3353] # 1 117 99 214 246 169 94.0 8e-41 MNYISKEEFTFFEEKAKEDSSIQTFQIDEEAFLYLNFNETPQPSLRELEKMVKEIQVAEV NEEEEEKETDQQPDMLQDRGQWDLSGDIIDCFRRYSSELSPEEQEEIIKGIEEGMTD >gi|222441866|gb|ACEP01000076.1| GENE 3 870 - 1379 501 169 aa, chain + ## HITS:1 COG:no KEGG:Trebr_0078 NR:ns ## KEGG: Trebr_0078 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: T.brennaborense # Pathway: not_defined # 20 145 1 126 155 163 63.0 2e-39 MNLELILMSENDIAEFKKNVQCAFQKGFEDVFGKCEETILPEKDIDQSLNGKGSVVYKAV 
LDGETVGGVIVAIDTETQHNHLDLLYVKSGVQGKGIGKSIWYAIEKLYPDTKVWETCTPY FEKRNIHFYVNICGFHITEFFNEKHPMLDTPDDFIGDGNEGMFVFQKEM >gi|222441866|gb|ACEP01000076.1| GENE 4 1634 - 3658 1187 674 aa, chain + ## HITS:1 COG:AGc4051_1 KEGG:ns NR:ns ## COG: AGc4051_1 COG2199 # Protein_GI_number: 15889504 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 509 670 349 510 513 77 34.0 1e-13 MFVRLFLIATVLFFGETLAYIFRGNLGLFNILVTRIANLMVFAMNIAMANTYVRFVSSVF VEKGAEVSGNSVKIANIFSCINIFIVVVNLFYPWMYYFDEANYYHRNNSWYVYTLISLVV IFIGAGMAIKYRKYLKKRSFISMILFSFIPIIATVVQSFIYGFSITNLGLGIGSVVMFAA YMYDWSHNGDEHTNMINDSRFGAVIMFIIMLLSMSVSIIACVNVIQQVTKENSEIQSRTI AQMVSAKIENELIKPITVSQTISSDIDIRTYIEGKTREEAESVKDDITNRLVSIGNEFDY KMVFVVSDKTRAYYTYNGISRYLDVENDSHDIWYKDYLDSGKRYIVNVDTDEDNNGSLSV FINYGIIDTNGDILGACGVGVDMNDLVDILARFEEEYNIKVYLVNHDGLIQVDTDVSSIG TGYLDNSYFGNISDDDFYYQLSKNGCYMTKYLEGFDWYIVIRDNNPVKLDENKIILPIVL IFIAGVLIMATSFVIISMREKKAKNAYNRRYEASIKDELTGLYNRRGFEVDCEIIKKNNN LIEYVLIMMDLNGLKAANDNISHEAGDELIIGASKCMDNAFSGLGRTYRVGGDEFAALLR GTREEAQDAVKTFDYLTENFQGNLISELSVSKGIVVCSEHIELNFEEIKAMADKLMYADK DEYYRRTGKDRRRV >gi|222441866|gb|ACEP01000076.1| GENE 5 3726 - 4163 255 145 aa, chain - ## HITS:1 COG:CAP0107 KEGG:ns NR:ns ## COG: CAP0107 COG0789 # Protein_GI_number: 15004810 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 2 139 3 141 152 116 43.0 1e-26 MYTIGQVSAMFGLPVSTLRYYDKEGFFPNLERKGNIRYFSDNELEALRIIECLKKSGLEI KDIKQFFIWVSEGSSSYEKRKELFEARKSAVETEIQELQKTLSLLKFKCWYYETAMKDGN EDAINAMLPDKLPKNIQKLYDKGHD >gi|222441866|gb|ACEP01000076.1| GENE 6 4283 - 5281 829 332 aa, chain + ## HITS:1 COG:PA2218 KEGG:ns NR:ns ## COG: PA2218 COG1073 # Protein_GI_number: 15597414 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 6 329 49 366 367 347 55.0 2e-95 MLKTEELQLVSEWDKTFLQSEKVDHQKVTFINRYGITLAADLYKPKNTSGKYPAIAVSGP 
FGAVKEQCSGLYAQTMAEKGYLTIAFDPSFTGESGGNPRFMASPDINTEDFMAAVDFLSV REDVDPDKIGIIGICGWGGMALNAAALDTRIKTTVASTMYDMTRVNANGYFDSENSEEAR YAKKQSLNTLRTEEYRKGEYSRGGGCVPLPVPEDAPFFVKDYSEYYKGRCYHKRSLNSND GWNSIGCMSFMNQPILKYSNEIRSAVLIVHGEKAHSYYFGKDAYENMIKNSKYTSNKELL TIPGAVHTDLYDNPDVIPFDKIQKFFKENGVG >gi|222441866|gb|ACEP01000076.1| GENE 7 5284 - 5874 533 196 aa, chain + ## HITS:1 COG:MA0410 KEGG:ns NR:ns ## COG: MA0410 COG0110 # Protein_GI_number: 20089303 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 22 187 26 191 191 182 55.0 4e-46 MTCEEYLSNMKPGYIVDGGSEEHLVMHELSQRALKITMELNNKYHTKEEIIQLMSELTGQ KIDESFGMFPPFYTDCGRNIHIGKNVFINAGCKFQDQGGIYIEDGVLIGHNAVLATINHM EDPEKRAGMIFQPIHIEKKVWLGANVTVLPGVTIGEGSVIAAGAVVTKDVPANIIAAGVP AKVIRKVKKDTEKGEI >gi|222441866|gb|ACEP01000076.1| GENE 8 6053 - 6295 258 80 aa, chain + ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 72 62 131 179 66 46.0 1e-11 MCVLKDDVADIMAKVKDAEVIVYATPIYYYEMCGQMKTLLDRLNPLYSTDYSFRDIYMIA TAAENDESAFEKATMVCKAG >gi|222441866|gb|ACEP01000076.1| GENE 9 6325 - 6417 113 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGGGGIDAANIAASHVDVMKKAYELGKKL >gi|222441866|gb|ACEP01000076.1| GENE 10 6701 - 7198 420 165 aa, chain + ## HITS:1 COG:YPO2003 KEGG:ns NR:ns ## COG: YPO2003 COG0716 # Protein_GI_number: 16122245 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Yersinia pestis # 14 164 85 234 235 103 34.0 1e-22 MKKLLIIYYSWSNGNTERIAKMLQSETDSDILKIDTVVPYSGSYDDVVNQGQNEVQRGYE PEIKPLDINIADYDVIAVGTPTWWYTMAPAVKTFLHQQDFTGKTVVPFMTNGGWPGHVIK DMKAACKGANVVCDMQIQFDSTGGSNLETPQEQINEWIQSVKNLL >gi|222441866|gb|ACEP01000076.1| GENE 11 7215 - 7598 317 127 aa, chain + ## HITS:1 COG:CAC0766 KEGG:ns NR:ns ## COG: CAC0766 COG0789 # Protein_GI_number: 15894053 
# Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 116 1 116 126 111 54.0 3e-25 MTIKEVSEKYGISQDTLRYYERVNVIPKVTRTSGGIRNYQEEDLRWVELAVCMRNAGLPI ESLIEYQRLFRAGDSTIPARLELLNEQMDILQKQKEQIEETMDRLSYKISRYEEAVKTGK LVWTKEE >gi|222441866|gb|ACEP01000076.1| GENE 12 7604 - 8110 518 168 aa, chain + ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 4 168 10 174 179 167 46.0 1e-41 MAKLVAFYSRADENYFGGSMKYIQIGNTEKAAKMIADMTGADLFKIEQKIPYAADYNTCI AQAKEDKQTGKRPEILNLPQDIDQYDEIYLGYPNYWGTMPMAVYTFLESYDFTGKKIHPF CTHEGSGLSNTESDIKKSAKGAVIEKGIAIHGSSVDQAKDVLERWIQK >gi|222441866|gb|ACEP01000076.1| GENE 13 8122 - 8676 495 184 aa, chain + ## HITS:1 COG:MA0410 KEGG:ns NR:ns ## COG: MA0410 COG0110 # Protein_GI_number: 20089303 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 23 176 36 191 191 122 40.0 5e-28 MECKEIRVDVRTASIKESTEGQRQTKILFQINHTMPFTDEYNHLLKELFGNNLGDGSMIS PPLNGACVGSVKIGRNVFINSNLLAMARGGITIEDNAMIAANVQLISNNHDPYDLCTLTC KPVLIREYAWVGAGATILPGVCIGRHAIVGAGSVVTKDVPDYAVAVGNPAKVIKMLDKEK FQED >gi|222441866|gb|ACEP01000076.1| GENE 14 8681 - 9532 967 283 aa, chain + ## HITS:1 COG:L126956 KEGG:ns NR:ns ## COG: L126956 COG0656 # Protein_GI_number: 15673856 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Lactococcus lactis # 4 276 5 280 281 231 44.0 2e-60 MENMKLNNNLECPMLGLGTFMLSPADAYTSTLEALKMGYSLIDTANAYVNERAVGRAIKD SGIDRKNIFLSTKIWASEYENENTVEETLERLGVDYVDLLYIHQPAGNWLAGYRMLEKAY REGKAKSIGISNFEGKYMEELETKWEIVPQFIQVEAHPYFTQKELRVTLDKYGIKLMSWY PLGHGDTALMNELVFAGLGKKYGKTPAQVILRWHTQMGFVVIPGSKNAEHIKDNMDIFDF ALTDEEMEQIAKLDKNERYYHRTDEQLVQFANWKPEFEKLMEK >gi|222441866|gb|ACEP01000076.1| GENE 15 9667 - 10644 769 
325 aa, chain + ## HITS:1 COG:BS_yxeI KEGG:ns NR:ns ## COG: BS_yxeI COG3049 # Protein_GI_number: 16081005 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Bacillus subtilis # 1 305 1 309 328 189 36.0 8e-48 MCTAATYKTKDFYFGRTLDYEFSYGDQIVITPRNYAFNFRHVGDMKNHYAIIGMAHVAED YPLYYDAMNEKGVAMAGLNFVGNAVYAAIKPDVENIAQFEFIPWILSQCSSLVEVRELLE RINIVNTPFSEQLPLAQLHWIISDENESITVESMSDGLHIYDNPVGVLTNNPPFPQQMFQ LNNYMYLSPKQPRNTFCENLALDAYSRGMGGLGLPGDLSSSSRFVRVAFTKVNAISGESE EESVSQFFHILGSVDQQRGCCEVADGKYEITLYTSCCNVTKGIYYYNTYENHQISAVDMH VENLDSDKMICYPVIQGERINYQNK >gi|222441866|gb|ACEP01000076.1| GENE 16 10683 - 11363 470 226 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 29 220 25 213 217 185 51.0 6e-47 MTKDEVKKLRMNSPESLQFLNNATKFYNLMMMYRCAIREIQTKLEVLDDEFSVENNRNPI SFIKTRIKKPNSIYDKLQKMGYEFTTENIQTYLNDVAGVRIVCAFIDDIYMISDLITQQD DIKVIEIKDYIKNPKSNGYRSYHMIVEIPVFFAKGKTPMRVELQIRTNGMDFWATLEHQL RYKKGIEEMPGYDEISEELLHSARAIIEADNEMQRIKDKIGMFHEI >gi|222441866|gb|ACEP01000076.1| GENE 17 11399 - 12958 905 519 aa, chain + ## HITS:1 COG:CAC3316 KEGG:ns NR:ns ## COG: CAC3316 COG1502 # Protein_GI_number: 15896559 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Clostridium acetobutylicum # 18 519 11 510 510 388 39.0 1e-107 MKQDTLEGKAKTKNGVKRLCFSIVCILLEVIFIITIVTRLNEYAEIINLFTRILSGILVL GLYASDKTSSMKMPWVILILIFPIMGVGLYLLIGLNGGTHKMRERYAEIDSKLLPMLPDS QECLSRIKETIPKAGNIASYIQRNSQYPIYQNTDIVYFDEAVKGLEAQIKDLEKAQKFIF MEYHAIEDAEAWHKIQDVLEERVKAGVEVRVFYDDMGSIGFINTDFVKKMEAIGIHCRVF NPFMPGLNLFLNNRDHRKITVIDGKVGFTGGYNLANEYFNYTHPYGQWKDTGIRLEGDAV QSLTVTFLEMWNAVSAKAANDSDFSKYLFHYDYAAQQTGFVQPYADSPMDNEQVGEEVYI SMINKAEKYCWFMTPYLIITDEMTHALCLAAKRGVDVRIITPGIPDKKFIYNITRSFYHG 
LVKHGVRVYEWTPGFCHAKMSVADDCMATCGTINLDYRSLYHHFENGCFMADCQAVVEIK NDLIRTMGECRDVTDQYQTGRSAYLRLGQLFMRLFAGLL >gi|222441866|gb|ACEP01000076.1| GENE 18 13157 - 14590 1041 477 aa, chain + ## HITS:1 COG:jhp0176 KEGG:ns NR:ns ## COG: jhp0176 COG1502 # Protein_GI_number: 15611246 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Helicobacter pylori J99 # 8 439 2 458 502 126 26.0 1e-28 MKKHKICKILLVILAIFLCVAFYELLGICVAYKKQPEVSNTTKKETKNGSWNECSENTER AVIIEKNPEALLQRVRLIKNAKKEIILSTFAFQSDESGKLILGALHDAADRGVHIRLLVD GMESWIDMEGNPYFYGLSSHENVEIKLYNKANPLKPWKMMGRMHDKYLIADGKRYILGGR NTYNYFLGDFPGHKNYDRDVLVVCDEPEKENSVNQLSEYFETIWNQEDSGYFHNNKKLAN RKSVKNAVLELQNSYQKYFEENKERICETNYTDETFETEKITLVSNPIHTGSKEPVVWYQ LGELMKNAKNRVKIHTPYIICNDMMYNTWEGIAENVSDFSIMTNSVANNGNPFGAADYAK NRNRILSTGINIWEYEGGYSYHGKSILIDDDLSVIGSFNMDMRSAYLDTELMIVIRSKDI NKQLEEGMMEYEKVSRQILEGGTYNDPYHVEPIELTKKRQRKIFLVQHLLGWARYLF >gi|222441866|gb|ACEP01000076.1| GENE 19 14608 - 15279 708 223 aa, chain + ## HITS:1 COG:lin2728 KEGG:ns NR:ns ## COG: lin2728 COG0745 # Protein_GI_number: 16801789 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 220 1 221 225 174 39.0 2e-43 MFQILIVEDDKELSQLFQKVLEKNGYQVKSASDGAQALEVLDKEYIDLIISDIMMPVMDG YELVSELRSAGYQIPVLMITAKGSFDDMRQGFLSGSDDYMVKPVNVNEMVLRVGALLRRA QILNEHKIVIGSTEFDYDAMTVTTDKESLVLPKKEFLLLYKLAASPGRTFTKQQLMDEVW GYETEADPHTIEVHIGRIRERLKDNPDFEIVTMRGIGYKVVKK >gi|222441866|gb|ACEP01000076.1| GENE 20 15279 - 16322 778 347 aa, chain + ## HITS:1 COG:lin2727 KEGG:ns NR:ns ## COG: lin2727 COG0642 # Protein_GI_number: 16801788 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 53 341 168 458 459 167 33.0 2e-41 MEQKKEKGLRIRSCLTGAIWLALVFSTVISALLFAFLNHFFNLPGSIPVLGWLLIFNTLI 
AGLITSFINAKLLEPITRLSKAMKEVSQGDFEQHLETNSRIAEVGESYQSFNVMTKELRA TEVLQMDFVSNVSHEFKTPINAIEGYTMLLQGEELSPDQEEYVEKILFNTQRLSGLVGNI LLLSKLENQNIPMKKTEYRLDEQIRQAFLSLETKWTEKEIGFQVELEEVKYTGNEGLFMH IWINLLDNAIKFSPSKGTITMFLKQEQDSVKFILEDEGPGIEDDVKSRIFDKFYQVDGSH KAEGNGLGLALVKRIVDSAGGTIKAENREYGGCRFVIELPKQKDEII >gi|222441866|gb|ACEP01000076.1| GENE 21 16356 - 17447 623 363 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3483 NR:ns ## KEGG: EUBREC_3483 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 354 1 354 354 618 96.0 1e-175 MTFYQELQLNQAGSKNLLKKSETLKEKLYHMWVYLVKIAVTMAFCFFFVSIFSILFGNEN SIVGVVVLLCLMVFRNADLGIHTGQSTMLLALFFVIMTVCPHLANQFSPVLGMLLNIAAL AVLILFGCHNPFMFNQSTLVLGYLLLYGYDVTGKSYQMRLVGMALGAALTCFVFYRNHKN RTYKRNLKDLIQEFDITSSRTKWQICQILCVPIVLCIAELCNMPRAMWAGIAAMSAILPF MEDMHYRVRKRIVGNIAGVICFTVLYFLLPSSIYAYIGILGGIGVGFSAQYGWQAVFNTF GALAIAAETYGLQGAVSLRVIQNVFGVVFALAFCVIFYWFMSKKKESEVTVHAEGSESGR KFE >gi|222441866|gb|ACEP01000076.1| GENE 22 17410 - 17982 331 190 aa, chain + ## HITS:1 COG:CAC3596 KEGG:ns NR:ns ## COG: CAC3596 COG0558 # Protein_GI_number: 15896830 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Clostridium acetobutylicum # 15 179 1 162 174 79 32.0 5e-15 MQREVNQEENLNRIITVPNLLSFFRLCLIPVIIWSYCVKKNPLLAGEILLLSGLTDLADG YIARRFHRISNLGKILDPVADKLTQAAMLICLFTRFPHMLLLIVIMAGKELYMVVSGCLV IRKTGKVHGADWHGKIVTFLLYGTAAVHIIWFHITPMVSDLLIGLCAIMMVISVALYIIQ NTRTLKGETV >gi|222441866|gb|ACEP01000076.1| GENE 23 18290 - 19246 673 318 aa, chain - ## HITS:1 COG:BH2841 KEGG:ns NR:ns ## COG: BH2841 COG3666 # Protein_GI_number: 15615404 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 25 301 258 525 573 148 34.0 1e-35 MFTEKEKENLHFNKLHEELEQCGKRLMEYKECFEIMGKDRNSYSKTDLEATFMRMKEDYM LNGQLKPAYNVQIAVENYFIVHGYVSSDRTDYNTLIPVLEKHKNAFGSILEEVTADSGYC SEKNLLYLKRNEISSYIKLQDHEKRKTRAYSEDISKYYNMRTGIFEDEQFYICHDGRELR 
HLRTESKEQDGYTQTFEVYGCADCSGCEHKARCLYKYDAEKDAEKNKVMKINEQWEELKE RSHANIQSERGILKRQTRSIQTEGHFGDIKENENFRRFNYRSADKVYKEFMLYAIGRNIN KYHRFLHEKLRKFEGKTA >gi|222441866|gb|ACEP01000076.1| GENE 24 19197 - 19916 498 239 aa, chain - ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 144 43 188 334 91 37.0 2e-18 MEELDFSGLLANCSDKGRTGYNPIMMYAVVTYANMRGIRSVDRIVDLCERDLAFIWLTRG QKPKRDAFYDFKNRKLTSDVLDDLNYQFMRRLQKEGLVTLKELFIDGTKIEANANRYTFV WRGSINYHLAGLLDSIDQLFEKYNSFLQENDYGTKYELGNAQMFVIEGMDKVREVIEKNR KRKLVKHKKLSNNRIIEIDNCSPLEIRKLQNNLTLIADNEGIEFVYGKGKRKPALQQAA >gi|222441866|gb|ACEP01000076.1| GENE 25 20193 - 20729 638 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027396|ref|ZP_03716588.1| ## NR: gi|225027396|ref|ZP_03716588.1| hypothetical protein EUBHAL_01652 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01652 [Eubacterium hallii DSM 3353] # 1 178 1 178 178 164 100.0 3e-39 MRYKDELRIAGVTALFCCAMLMITAVVPKDTTVAKARKTFKEAPETAADAEAQAQSEAED AGFTVYTEDSNYDYEDNSYDDSGSDDTSTDTSDSDSSGDTTYDPSYDDSTTDNTEDNVIE DDSSSDYDTSDTDTSDSGDGTEDDTEFDYPTEEETFPSDYADTGDYTGTNVYTDNNWQ >gi|222441866|gb|ACEP01000076.1| GENE 26 20868 - 22232 998 454 aa, chain + ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 407 3 409 426 250 36.0 5e-66 MKNNKYEIDMCNGTLMDKLISFSLPLMLSGILQLLFNAVDIIVVGRFTGRQALAAVGSTT ALINIFTNLFIGISLGANVLAARFYASGKEKEMSETVHTSITLALISGLVMALAGVLLAR FALNLMGTPNDVIDQSVLYMRIYFLGMPFFMLYNYGAAILRAVGDTKRPLFFLVISGMTN AVLNLVLVIVFHMGVAGVAIGTIVSQLISSILVLRCLYTSNTSYRLYFSKLGIKTQYLKQ IFQVGIPAGIQSTVINLSNALLQSSVNSFGSVAMAGYTAANNIFGFLYMSVNAVTQSCMS FTSQNYGVKKLKRMDRVLLDCMILSVGVTLTLGCGAYFFGPELLKIYTSDADVIRCGVEV LAFTTVPYFCCGIMDLLPGALRGMGYSGVPMILSIIGTVGTRIVWIFGLFPAHRSLSFLF 
ISYPVSWILTILMQAVCFCFVRKHVHQSVNRDLA >gi|222441866|gb|ACEP01000076.1| GENE 27 22318 - 23148 615 276 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1089 NR:ns ## KEGG: EUBREC_1089 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 253 5 251 252 172 37.0 1e-41 MSTIFPREEKAEQIFDEILKNPHACERLKDTFFEAIPSAEESEGAGTDIPGTVFAAALFT AYENKDLSAFMMAVCNSSVFDLLRNSFLISIRFNDKGVENPIFLTDENGDLLSGSHQHAC AKKYKMFHKLYEQQDEIPDYHMYMTDGFREKHGYNEKGEIETFRISEHTGILMLFEFPES VTLEINEERSYAIIWKYLMKLQEQLPRALMYYGRRDENGVEKNTSKLVIFLPFCHFEHEM EKTVELANGIGLGCREAILAEIKALKNDPKNKTAGV >gi|222441866|gb|ACEP01000076.1| GENE 28 23154 - 24011 628 285 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1088 NR:ns ## KEGG: EUBREC_1088 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 285 9 289 446 372 63.0 1e-101 MQQLIRKNHMDISWHEYTDEDGANVPVTQASLTEKASIIGRVGIMLLSCGTGAWRVRSSM NTLAEIMGITCTADIGLMSIEYTCFDGQDGFSQSLCLTNTGVNTSKLNRLEHFIQEFEVD GKDMSGEELHVLLDNIEKIHGLYSPIALGFAAALACGGFTFLLGGGPIEMFCAFIGAGIG NFLRCKLSKHHFTLFLCIVSSVSLACLVYAGLLKIGEMLFGISIQHEAGYICAMLFIILG FPFITSGIDLAKLDMRSGTERLMYALIVILVATMAAWLMALILHL >gi|222441866|gb|ACEP01000076.1| GENE 29 24105 - 24314 145 69 aa, chain + ## HITS:1 COG:no KEGG:LCAZH_2344 NR:ns ## KEGG: LCAZH_2344 # Name: not_defined # Def: hypothetical protein # Organism: L.casei_Zhang # Pathway: not_defined # 1 64 327 390 451 85 68.0 6e-16 MFNSPVRLAATAAGIGAIANTLRLEMVDMTAIPPAGAAFVGALTAGILASLIKSKVGYPR ISLTEFHLL >gi|222441866|gb|ACEP01000076.1| GENE 30 24329 - 24484 95 51 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1088 NR:ns ## KEGG: EUBREC_1088 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 51 396 446 446 81 78.0 1e-14 MYLYRGFYNLGIMSLGPAASWFASAILIIAALPLGLIFARILTDETFRYCT >gi|222441866|gb|ACEP01000076.1| GENE 31 24771 - 25055 176 94 aa, chain + ## HITS:1 COG:YPO0998_4 KEGG:ns NR:ns ## COG: YPO0998_4 COG2200 # Protein_GI_number: 16121300 # Func_class: T Signal transduction mechanisms # 
Function: FOG: EAL domain # Organism: Yersinia pestis # 1 77 163 240 254 75 46.0 2e-14 MESISRLPFSVLKIDKCLINHVCETRVKILVDHIIKLSKALNMRVLAEGVETTEQLDVLR KIKCDEIQGFYYARPMPEAQFIEYVRKSDNHKKL >gi|222441866|gb|ACEP01000076.1| GENE 32 25230 - 25415 145 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027405|ref|ZP_03716597.1| ## NR: gi|225027405|ref|ZP_03716597.1| hypothetical protein EUBHAL_01661 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01661 [Eubacterium hallii DSM 3353] # 1 61 1 61 61 79 100.0 7e-14 MKKFMPIIYVIAFVIGIIFIVTADTGLITSYYPALIAGIIISCFARVFLAIYLYVWHKQS K Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:38:28 2011 Seq name: gi|222441865|gb|ACEP01000077.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont84.1, whole genome shotgun sequence Length of sequence - 11963 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 203 - 262 8.8 1 1 Tu 1 . + CDS 369 - 1316 781 ## Closa_2306 LysR family transcriptional regulator + Term 1490 - 1547 1.0 + Prom 1625 - 1684 5.8 2 2 Op 1 4/0.000 + CDS 1746 - 2957 1609 ## COG1171 Threonine dehydratase + Prom 2971 - 3030 3.3 3 2 Op 2 . + CDS 3089 - 4465 1567 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 4 2 Op 3 8/0.000 + CDS 4538 - 5737 1374 ## COG0078 Ornithine carbamoyltransferase 5 2 Op 4 . + CDS 5776 - 6717 1193 ## COG0549 Carbamate kinase 6 2 Op 5 . + CDS 6731 - 7117 602 ## COG0251 Putative translation initiation inhibitor, yjgF family + Term 7295 - 7340 5.0 + Prom 7148 - 7207 6.2 7 3 Op 1 4/0.000 + CDS 7379 - 8764 1457 ## COG0044 Dihydroorotase and related cyclic amidohydrolases 8 3 Op 2 . + CDS 8848 - 10059 930 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases + Term 10154 - 10199 2.7 + Prom 10125 - 10184 7.9 9 4 Tu 1 . 
+ CDS 10209 - 11009 768 ## gi|225027415|ref|ZP_03716607.1| hypothetical protein EUBHAL_01671 + Term 11236 - 11274 2.3 10 5 Tu 1 . - CDS 11290 - 11889 438 ## gi|225027416|ref|ZP_03716608.1| hypothetical protein EUBHAL_01672 Predicted protein(s) >gi|222441865|gb|ACEP01000077.1| GENE 1 369 - 1316 781 315 aa, chain + ## HITS:1 COG:no KEGG:Closa_2306 NR:ns ## KEGG: Closa_2306 # Name: not_defined # Def: LysR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 309 1 308 309 318 49.0 2e-85 MFVSNVGGIIAAASKESAHPLRKTGKIPAIKRIVLSMQQAGLFPIVVVVGADDYESRYQL NNLNVVFLILEESDEKRELFHSVKAGLSYLQDKCLSVVFTPVNAPMFIPKTIVEMRKYHD DIVVPSYKKKAGHPVLISNEMIPDILAYDGENGLRGAIEKYAGRRVFVEVDDIGVLSLNQ EDDELQSRIEEHNKSILHPILTFGIGHETPFFNARLKLLLFLIEDLNNVRKACDTMALSP GKAWDMINELEDKLGYTVVKRRRGGRNGGKTFLTEDGRQFLITCQRFEEQVISFSEQKFE KMFLESHMLEKKEEE >gi|222441865|gb|ACEP01000077.1| GENE 2 1746 - 2957 1609 403 aa, chain + ## HITS:1 COG:STM1002 KEGG:ns NR:ns ## COG: STM1002 COG1171 # Protein_GI_number: 16764362 # Func_class: E Amino acid transport and metabolism # Function: Threonine dehydratase # Organism: Salmonella typhimurium LT2 # 2 398 3 401 404 403 51.0 1e-112 MEKIKWILNQMPKNEDKNLAVMALDNVNQARKFHRSFPQYSVTPLAKLDGMAGQLGLGGL FVKDESYRFGLNAFKVLGGSFAMAKYIAKEMGKDVSEMTYDYLTSKKFRDDFGQATFFTA TDGNHGRGVAWAANKLGQKAVVHMPKGSTQTRFDNIAKEGASVTIEELNYDDCVRLAAKE AAETPHGVIVQDTAWDGYEEIPSWIMQGYGSMAGEAAEQLREVEVNRPTHVFVQAGVGSL AGGVVGYFANLYPNNPPKFIVMEAGAADCLYKGAEAGDGEPRIVGGDLQTIMAGLACGEP NTIGWDILRNHVSAFISCPDWVSAKGMRMLAAPVKGDPSVTSGESGAVGMGVISSIMTDD AYKELREALELDNSSQVLMFSTEGDTDPDKYKKIVWGGEYPTV >gi|222441865|gb|ACEP01000077.1| GENE 3 3089 - 4465 1567 458 aa, chain + ## HITS:1 COG:ECs3745 KEGG:ns NR:ns ## COG: ECs3745 COG0624 # Protein_GI_number: 15832999 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Escherichia coli O157:H7 # 3 431 5 394 403 523 60.0 1e-148 
MALDFQKINEAAENYRADMAAFLRAMISHPSESCEEKEVVACIKAEMEKLNFDKVEVDGL GNVIGWMGDGEKIIAIDSHIDTVGIGNIDNWEQDPYKGYETDEIIYGRGGSDQEGGMASA VYGAKIMKDLDLIPEGYKIMIVGSVQEEDCDGMCWQYIYNKDKIVPEFVISTEPTDGGIY RGHRGRMEIRVDVKGVSCHGSAPERGDNAIYKMADIIADVRALNNNGCDESTDIKGLVKM LSPKYNPEHYEDAQFLGRGTCTVSQIFYTSPSRCAVADSCAISIDRRMTAGETWDSCLDE IRALPSVQKYGDDVKVSMYMYDRPSWTGEVYETECYFPTWINKENAAHVQALVDAHKGLF GDKRMGPDSRKELRNRPLIDKWTFSTNGVAIQGRYGIPCVGFGPGAESQAHAPNEITYKD DLVRCAAVYVAAANLYNEDNKTDDVSQFRAGKTNNDIK >gi|222441865|gb|ACEP01000077.1| GENE 4 4538 - 5737 1374 399 aa, chain + ## HITS:1 COG:ECs3743 KEGG:ns NR:ns ## COG: ECs3743 COG0078 # Protein_GI_number: 15832997 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Escherichia coli O157:H7 # 1 395 1 394 396 538 66.0 1e-153 MDKKLQGYIDKLNKLNFKEMYNGDFFLTWEKTDDELEAVFTVAEALRYMRENNISTKVFE SGLGISLFRDNSTRTRFSFASACNLLGLEVQDLDETKSQIAHGETVRETANMISFMADVI GIRDDMYIGKGNKYMHEVVDSVTQGNKDGVLEQKPTLVNLQCDIDHPTQCMADMLHIINT FGGVENLKGKKVAMTWAYSPSYGKPLSVPQGVVGLMTRFGMDVVLAHPEGYDIMPEVEEV AKKNAAENGGSFTKTNSMAEAFKDADIVYPKSWAPFAAMEKRTDLYAEGDQAGIDALEKE LLAQNAEHKDWCCTEELMSTTKDGKALYLHCLPADINGVSCKDGEVEASVFDRYRDPLYK EASYKPYIIAAMILLAKQKDITGTLEALADKATPRHFGE >gi|222441865|gb|ACEP01000077.1| GENE 5 5776 - 6717 1193 313 aa, chain + ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 1 313 1 310 310 353 62.0 3e-97 MSKRIVIALGGNALGSNLEEQMKAVKNTAVAIADLIEEGNEIVIAHGNGPQVGMIQNAMT ELRRSNPEKYVQSPLSVCVAMSQGYIGYDLQNALREELLDRGIQKDCATMLTQVEVSPDD PAFKNPTKPIGSFMTKEEADELVREKGYKVIEDSGRGYRRVVASPKPVSIVEIDTIKALV ESNHVVIACGGGGIPVFKTGGHHLKGAAAVIDKDFAAEKLAEQLNADCLIILTAVEKVAI HFGTPEQEALSELTPELAEKYIEEGEFPAGSMLPKVQAAVAFTQSGEGRTSLITLLEKAK DGIAGKTGTLIHQ >gi|222441865|gb|ACEP01000077.1| GENE 6 6731 - 7117 602 128 aa, chain + ## HITS:1 COG:HP0944 KEGG:ns NR:ns ## COG: HP0944 COG0251 # Protein_GI_number: 15645560 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Helicobacter pylori 26695 # 1 124 1 124 125 134 56.0 4e-32 MNKALHTDNAPAAIGPYSQAVQAEKTIYVSGQLPVDPATGEFAGEDIKAQTKQSLTNIKN ILASAGADMSDVTKTTVLLQDIADFGAMNEVYAEFFTEPYPARAAFQVAALPKGAKVEIE AVAVISEK >gi|222441865|gb|ACEP01000077.1| GENE 7 7379 - 8764 1457 461 aa, chain + ## HITS:1 COG:ygeZ KEGG:ns NR:ns ## COG: ygeZ COG0044 # Protein_GI_number: 16130775 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Escherichia coli K12 # 1 455 5 457 465 460 47.0 1e-129 MKWLIKNGTIASEEKTFLADLLLEDDKIIKIDTNIEDAEAKCIDAEGCYVMPGAVDIHTH MDLDVGIARAIDDFYTGTIAAACGGTTTIIDHMAFGPKGCNLMHQVKEYHKLADHNAVVD YGFHGVFQHVNDDILKEMKKVVKDEGITSFKIYMTYDYKLSDEDIVRVLKQAKEDGILIT VHCENDGIVSYFRKKFVGEGKTEVRYHPLSRPNEAEAEAVNRMLYLASAIGDAPVYIVHL STKEGLEEIRQAKKRGQKHIGVETCPQYLLLTDDLYDDSSEGLKAVMAPPLRKPADNEAL WDGLKNNEIDTVATDHCPFTFKRQKQQGADDFTKCPSGAPGVEERLSLLYSEGVCKGKIS MEQLVKYACTNPAKVGGVYPQKGTLSVGSDADIVIFNPKKKWTMTKKKMYGAADYTCYEG REIEGKIDLVMQRGKILVKDGEFLGERGDGKYLKRGKSSIV >gi|222441865|gb|ACEP01000077.1| GENE 8 8848 - 10059 930 403 aa, chain + ## HITS:1 COG:AGpA709 KEGG:ns NR:ns ## COG: AGpA709 COG0624 # Protein_GI_number: 16119709 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 391 18 374 387 131 28.0 2e-30 METYEESVELTRTFVKIESTNPGTEEAAMREAILAYLKDTSCEIQLDDVGNGRKNVIATL WPENTKEESIKEKYKEVPMFIFICHMDTVVVGEGWTKNPFGAEIEEQNGILRMYGRGSCD MKSGLACALATFKKTARRLEKGEISLKCPLRMICTIDEEGDMTGSDHIVEQEFVKKEDYV LDLEPTDGEIQMAHKGRLWFHVHVKGITAHASKPEKGADAIFAASKFIVQIQEIFEHFPV HEELGYSTITFGQIQGGYQPYVVPDACTIFIDCRLAPPVTDQIVVQHFNKIIKEIETQIP GIKISYDITGNRPYIEKNPDSILLKNLKEAVEAETKQKAVVKAFTGYTDTAVIAGRLHNT ECMSYGPGSLQYAHKPDEFVEIRDIIRCEKVMNHLVMSLCEER >gi|222441865|gb|ACEP01000077.1| GENE 9 10209 - 11009 
768 266 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027415|ref|ZP_03716607.1| ## NR: gi|225027415|ref|ZP_03716607.1| hypothetical protein EUBHAL_01671 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01671 [Eubacterium hallii DSM 3353] # 1 266 1 266 266 520 100.0 1e-146 MFCSNCGAEAEGNFCKNCGAPLPKEEPSTEKSNESIDAEYKVYDEEEDQKQKDKQKNVKV TIKNKRGGSSRNTTTTKRKRRSALSSGGDAVSNAASTAGNVVGKSVKTAGNILFTGLHWL CALLMIGIVLQIFKNVWNYTSVSTIMGLASSRDIDTIVFIGGSAFFILFGIIEALWILGS KKVMDMGVLRNIDTGRGIFAFGLFLVITVLTPFIDGFLTFGYNGMAGGQWILHAFTGNNG IFYCAVLGLVISFIRKMRGSIRGMIA >gi|222441865|gb|ACEP01000077.1| GENE 10 11290 - 11889 438 199 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027416|ref|ZP_03716608.1| ## NR: gi|225027416|ref|ZP_03716608.1| hypothetical protein EUBHAL_01672 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01672 [Eubacterium hallii DSM 3353] # 1 199 1 199 199 267 100.0 4e-70 MYEEQRLTIPATGKHDWSDWWIVKKSTVFKTGTKKRECWICDKTQTVSIPKLKPFVKLNK KTISLTAGKSYRLKASYAKGDSVKKCKSSNKKVATVSKKGVISARQAGKVRITIYTKSGK KATCKVTVTAKKKAAKKSSSNKSGSGSGSSKTNGSTVYWTPGGAVYHRTPNCPTLSRSRT IYHGSISSCPKNRGCKVCF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:39:03 2011 Seq name: gi|222441864|gb|ACEP01000078.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont85.1, whole genome shotgun sequence Length of sequence - 15576 bp Number of predicted genes - 14, with homology - 13 Number of transcription units - 5, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 447 - 506 8.2 1 1 Op 1 . + CDS 545 - 1291 939 ## COG1385 Uncharacterized protein conserved in bacteria + Prom 1293 - 1352 4.5 2 1 Op 2 . + CDS 1378 - 2487 1321 ## COG0082 Chorismate synthase 3 1 Op 3 15/0.000 + CDS 2556 - 3698 1332 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase 4 1 Op 4 . + CDS 3732 - 4052 441 ## COG1862 Preprotein translocase subunit YajC + Prom 4195 - 4254 4.2 5 2 Op 1 . 
+ CDS 4408 - 4788 420 ## gi|225027421|ref|ZP_03716613.1| hypothetical protein EUBHAL_01677 + Prom 4808 - 4867 4.2 6 2 Op 2 . + CDS 4888 - 5034 173 ## + Term 5048 - 5078 1.6 + Prom 5179 - 5238 4.5 7 3 Op 1 . + CDS 5296 - 6660 1594 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 8 3 Op 2 1/0.000 + CDS 6661 - 8436 1389 ## COG0608 Single-stranded DNA-specific exonuclease + Prom 8488 - 8547 5.4 9 3 Op 3 . + CDS 8607 - 9452 981 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 9455 - 9514 7.0 10 4 Tu 1 . + CDS 9555 - 10112 771 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 10234 - 10275 -0.0 + Prom 10218 - 10277 7.1 11 5 Op 1 1/0.000 + CDS 10310 - 11179 887 ## COG1737 Transcriptional regulators 12 5 Op 2 1/0.000 + CDS 11179 - 12420 1300 ## COG2081 Predicted flavoproteins + Prom 12432 - 12491 6.1 13 5 Op 3 21/0.000 + CDS 12516 - 13220 228 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 14 5 Op 4 . 
+ CDS 13235 - 15121 1297 ## PROTEIN SUPPORTED gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase + Term 15170 - 15214 6.1 Predicted protein(s) >gi|222441864|gb|ACEP01000078.1| GENE 1 545 - 1291 939 248 aa, chain + ## HITS:1 COG:CAC1285 KEGG:ns NR:ns ## COG: CAC1285 COG1385 # Protein_GI_number: 15894567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 247 1 245 250 186 45.0 3e-47 MYQFFIQPDSIKEEEIWIADPQDVNHIRNVLRMKKGEKVSLCCEAKGKEYICSIEEIESD IVKAAIIDINGESRELPVKITLFQGLPKSDKMELIIQKAVELGVAEIVPMATKRAVVKLD AKKAAKKVQRWNEIAKSAAKQSKRGLIPEVKPVMSFKEAVEYGKSMDMLLIPYEDAKGIA HSREVVETVKDKKSLGIYIGPEGGFPEEEVSLAMKAGAEPVTLGHRILRTETAGMTLLSI LMFMLEED >gi|222441864|gb|ACEP01000078.1| GENE 2 1378 - 2487 1321 369 aa, chain + ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 1 352 1 355 365 347 51.0 3e-95 MAGSSFGTLFCMTTWGETHGKGVGVVVDGCPAGLSLCEEDIQKYLNRRKPGQSKYTTKRK EDDKVEILSGVFEGKTTGTPISMAVFNKDQHSKDYSAIKDIYRPGHADYTFDKKYGFRDY RGGGRSSGRETTARVAAGAVAAKVLKELGIEVFAYTKAIGPVQIDENRFSKEERDKNALY MPDAQAAAQAEEYLQEKMKELDSCGGIVECVITGMPVGVGEPVFDKLSANLGKAILSIGA VKGFEIGDGFAAAASAGSENNDDFYNDNGTVKKKTNHAGGVLGGMSDGSVITFRAAFKPT PSIAQPQQTVNRDGKDTEIVIHGRHDPIIVPRAVVVVETMAAVTVLDLLLQNMGSRLDSV KNFYENEKK >gi|222441864|gb|ACEP01000078.1| GENE 3 2556 - 3698 1332 380 aa, chain + ## HITS:1 COG:CAC2282 KEGG:ns NR:ns ## COG: CAC2282 COG0343 # Protein_GI_number: 15895550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Clostridium acetobutylicum # 1 372 1 372 376 594 72.0 1e-170 MYKLLTKDGMAKRGEFETVHGTIQTPVFMNVGTVGAIKGAVSTDDLRTIGTQVELSNTYH LHVRTGDKLIKQFGGLHKFMGWDKPILTDSGGFQVFSLSGLRKIKEEGVYFQSHIDGHHI FMGPEESMQIQSNLGSTIAMAFDECPSSRADRTYIQNSVDRTTRWLERCKKKMQELNSLP 
DTVNPHQMLFGINQGGVFADIRIEHAKRISELDLDGYAVGGLAVGETHDEMYYILDEVVP YLPENKPTYLMGVGTPANIIEAVDRGIDFFDCVYPSRNGRHGHVYTHHGKLNLFNKKFEL DPRPIEEGCQCPACRTYSRAYIRHLLKAKEMLGMRLCVLHNLYFYNHLMEEIRGAIDEHR YKEFKKTTLEGFLTPEEEYR >gi|222441864|gb|ACEP01000078.1| GENE 4 3732 - 4052 441 106 aa, chain + ## HITS:1 COG:BH1229 KEGG:ns NR:ns ## COG: BH1229 COG1862 # Protein_GI_number: 15613792 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Bacillus halodurans # 20 99 11 88 88 63 38.0 9e-11 MLLAQSASASSSITMIIAYVILFGAIIYFMTIRPQKKQQKAADEMHAGMQAGDSVLTTSG FYGVIIDIMDDTVIVEFGNNRNCRIPMQKDAIAAVEKPNLTKTEEE >gi|222441864|gb|ACEP01000078.1| GENE 5 4408 - 4788 420 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027421|ref|ZP_03716613.1| ## NR: gi|225027421|ref|ZP_03716613.1| hypothetical protein EUBHAL_01677 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01677 [Eubacterium hallii DSM 3353] # 1 126 1 126 126 187 100.0 2e-46 MDGISYKKWMERAAGLFLSALIAVLLTMLFLAVFAFAMLKGNLSASVEKIGLIVMSVMAC FVGGFLCGKKNIKKRYLWGLGVGVLYCLFFGVIRVGMGQAFVTDTTAFFTMMLYCAAGGM LGGMLS >gi|222441864|gb|ACEP01000078.1| GENE 6 4888 - 5034 173 48 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRVKTLTGNKTYGTTMCKGGCGECQTSCQSACKTSCTVGNQTCESNK >gi|222441864|gb|ACEP01000078.1| GENE 7 5296 - 6660 1594 454 aa, chain + ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 1 448 3 454 454 465 50.0 1e-131 MIHQYKNNGYNIVMDVCSGAIHVVDDVTYDVIALYEEKSLEEIENELSYSKEDIEEAYKE VTSLKEEGLLFTEDIYEDYIKEVKSRKTVVKALCLHIAHDCNLACRYCFAEEGEYKGHRE LMSAKVGKAALDFLVANSGNRHNLEVDFFGGEPTMNFGVVKEVVEYGRSLEEKHNKHFRF TLTTNGVLLNDEIMEFANKEMDNVVLSVDGRKEIHDYMRPTRNGKPSYDLIMPKFIRFAE SRHQQKYYVRGTFTNRNLDFSKDVLHLADLGFEQISMEPVVGQPEEPYAIQEKDLPQIFD EYDRLAKEMIKREKAGKGFNFFHFMIDLTGGPCVAKRLSGCGSGTEYLAVTPWGDLYPCH 
QFVGEEKFLLGNVFDGIKRTDICDEFRSCNVYTKKKCKDCFARFYCSGGCPANSYNFHGT IDDSYDLSCDMERKRVECAIMIKAALADTEGEEE >gi|222441864|gb|ACEP01000078.1| GENE 8 6661 - 8436 1389 591 aa, chain + ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Clostridium acetobutylicum # 1 587 1 587 587 590 47.0 1e-168 MEKWFIASKRADFNKIGEIFHISPVTARLMRNRSLTTVREMQRYLYGSLSDLYDSHLLKD ADKGAEIIKEKIETGKKIRIISDYDVDGVSSNYILYQGLKRCGADVDYKIPDRVEDGYGI NEHLIEKAAEDGIDTIITCDNGIAAAEQIAYGNSLGLTLIVTDHHDIPFQRQEDGSISYN LPSAAAVINPKQKDCPYPFENICGAVVVYKFIQVLYDLFGIDRRESEVFLEIAAMATVCD VMPLCDENRIIVKEGLRRIQHTNITGLQALIEANHLEEKKISSYHFGFILGPCINASGRL TSAKDALELLLCEDKELCRQKAEALTQLNAERKEMTANGVKAAKEYLESTGHANDKVLVV YLPNCHESIAGIIAGRMKEHYHKPVFVITRTKEGLKGSGRSIEAYHMYKEMNKIKECFDK FGGHPMAAGFSMQEDRLEEFREKLNANATLTEDDFAEKVHIDVALPISYLSESLINEFEL LEPFGNENTKPLFAQKDLYIKGMRVLGTTGRCLKLFLSDANGTSIDAMYFGESDVFLEEL ISFCGKEEAQRIQKGLSHHIWMDVTYYPQVNEWRGQKNIQIVIQNYRFKNL >gi|222441864|gb|ACEP01000078.1| GENE 9 8607 - 9452 981 281 aa, chain + ## HITS:1 COG:CAC3035 KEGG:ns NR:ns ## COG: CAC3035 COG0561 # Protein_GI_number: 15896286 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 11 278 4 274 280 110 30.0 3e-24 MEQEKKQPDVRLIGLDLDGTTLTSDKVLTPHTKEVLEKCLAEGIQVLPATGRVKSGIPEY LTQIEGMRYVILSNGASVLDLKEDKVLYQNCIPWERALELFDVLETYDTFYDIYALGAGW CEARFYDNVENYGIEKHIEKLVRTSRNRIDDLREWMKENKAPVEKINMFFAKEEDRQRAF RELGQIEDLAVTCSLGNNLEINGATCNKGDAMLNLGKILDIPIESIMACGDGNNDFEMVK MAGVGVAMKNGEESLKEVADFVTKTNDEEGVAYAIEHFCNV >gi|222441864|gb|ACEP01000078.1| GENE 10 9555 - 10112 771 185 aa, chain + ## HITS:1 COG:SA1461 KEGG:ns NR:ns ## COG: SA1461 COG0503 # Protein_GI_number: 15927215 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Staphylococcus 
aureus N315 # 15 182 3 170 172 169 49.0 3e-42 MPDLGAERVDKVKKLEDYVINIPDFPKPGIIFRDITSILRDPEGLKLSVHELMHCLEGTE FDVVVGAESRGFLFGMPLAYNKSKGFVPVRKKGKLPRETVSKEYALEYGTAEIEIHKEDI RPGQRIVFVDDLLATGGTAKAAIDLIEELGGVVVKVLFVMELEGLHGRDVLKGYDVESVI TYPGK >gi|222441864|gb|ACEP01000078.1| GENE 11 10310 - 11179 887 289 aa, chain + ## HITS:1 COG:CAC1850 KEGG:ns NR:ns ## COG: CAC1850 COG1737 # Protein_GI_number: 15895125 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 2 285 4 285 293 224 42.0 2e-58 MEHLNLLQRIEKKSSEFSKGQRRLAQYITENYDIAAYLTASKLGKEAGVSESTVVRFAYQ LDYEGYPELQKAIQVIVKTNSNSIQRMSLSSKRYQEKGVLKSILYTDSERLRDTIQSGVD EEEFNRSVMLINDARRLYILGARSAAYLAGLMGYYFKMMFDNVIIVDANSTSETLEQIYD ISDKDVMMGITFPRYSKRTICALQYAKNHGAKTIALTDNMQSPIVEYADCKLIAKSDVMT IVDSLVCPLSVVNAMVTAIALLRKDDVEKRLMALEELWNEYDVYNRSLL >gi|222441864|gb|ACEP01000078.1| GENE 12 11179 - 12420 1300 413 aa, chain + ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 14 412 1 390 393 377 48.0 1e-104 MNKVIIVGGGAAGMMAAISASKEGKKACILEKNEKLGKKLFITGKGRCNITNACPADEFL THVVTNPRFLYSTFSQFNNEDMMDYLEEIGLPIKTERGQRVFPQSDRSSDVIDALKKQCK KGGVQIYYHTEVKELLFDEEKENCVGVLLSDGKKMMGDSVILACGGFSYASTGSNGSGYT LAKQAGHKIKSIEPSLVPFEMKEMWCKDLMGLTLKNIGVRIKAKKKTVYEGFGEFLFTHF GVSGPLVLTASTCLGKYQKELEAGELKLILDLKASLTPEQLDKRFLREFDTYRNKNISNV MERLLPKKMIPVFLEEAQIPEDKKIRDISKKERRHMIELMKNFEMHITGVRGFNEAIVTR GGVNVKEINPSTMESKKVKHLFFAGEMMDLDAVTGGYNLQIAWTTGYVAGKNA >gi|222441864|gb|ACEP01000078.1| GENE 13 12516 - 13220 228 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 1 229 21 282 863 92 27 2e-18 IRSQVSKEWEEMSMYSIAIDGPAGAGKSTIAKAIAEDIGFIYVDTGAMYRAIALYFLRKG IDGHDRKLVAAECPNIHVTLEYDEEGKQQVILNGENVTGYIRKEEVGKMASVTSAVPEVR AALLDLQRDMAKTSDILMDGRDIGTNVLPDASLKIYLTASADVRAKRRYDELKEKGADCD IEQIKTDIIARDKQDMEREIAPLCQAEDAILVDSSDMSIPQVIDTIISEFKKIK >gi|222441864|gb|ACEP01000078.1| GENE 14 13235 - 15121 1297 628 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15895122|ref|NP_348471.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Clostridium acetobutylicum ATCC 824] # 1 626 1 641 642 504 42 1e-142 MMKINIAKSAGFCFGVKRAVDTVYKESSKKNVYTFGPIIHNDEVVADLERQGVHVINDSE DFKSLSEGTIIIRSHGVSKAIYDEILNAGLDLVDATCPFVRKIHKIVERESGDGRVIIII GNNHHPEVEGIMGWCHTHPIVIENREQAEGIELDADTKISIVSQTTFNYNKFQDLVEIIS EKGYDINVFNTICNATEERQTEVRELAKKSDAMIVIGGRHSSNTQKLFEISKKECSNTFY IQTKNDLNMEDFSNIGMLGITAGASTPNNIIKEVHESMAEMSFDQMLEESFKTIRNGEVV QGTIIDVKEDEIILNIGYKADGIISRNEYSNDQNLDLTTVAKVGDEMEAKVLKVNDGEGQ VLLTYKRLKAEKGNKRLEEAFNNHEVLKAPVAKVLDGGLSVVVEETRVFIPASLVSDTYE KNLKKYDGQEIEFVITEFNPRKRRIIGDRKQLLVAEKKAKQEALFAKIEAGMTVEGTVKN VTDFGAFIDLGGADGLLHISEMSWGRVENPKKVFTVGEKLNVLIKDIQGEKIALSLKFPE TNPWLHADEKYAVGNEVTGKVARMTDFGAFVELEPGVDALLHVSQIAKEHIEKPSDVLKV GQEVTAKVVDFKKDDRKISLSIKALDNE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:39:18 2011 Seq name: gi|222441863|gb|ACEP01000079.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont86.1, whole genome shotgun sequence Length of sequence - 9619 bp Number of predicted genes - 8, with homology - 6 Number of transcription units - 7, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 279 - 1142 579 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 1246 - 1305 11.9 + Prom 1188 - 1247 11.9 2 2 Op 1 1/0.333 + CDS 1348 - 4386 2322 ## COG3250 Beta-galactosidase/beta-glucuronidase + Prom 4389 - 4448 3.6 3 2 Op 2 . + CDS 4554 - 5951 1244 ## COG2211 Na+/melibiose symporter and related transporters + Term 5960 - 6018 12.3 - Term 6036 - 6073 3.1 4 3 Tu 1 . 
- CDS 6116 - 6193 108 ## - Prom 6325 - 6384 6.9 5 4 Tu 1 . + CDS 6303 - 7103 443 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 7347 - 7411 22.1 - Term 7209 - 7241 -0.9 6 5 Tu 1 . - CDS 7409 - 7813 78 ## COG1943 Transposase and inactivated derivatives - Prom 7842 - 7901 5.8 + Prom 7811 - 7870 6.0 7 6 Tu 1 . + CDS 7894 - 9033 706 ## PROTEIN SUPPORTED gi|163764777|ref|ZP_02171831.1| ribosomal protein L22 + Prom 9268 - 9327 8.8 8 7 Tu 1 . + CDS 9363 - 9452 93 ## + Term 9494 - 9540 13.2 Predicted protein(s) >gi|222441863|gb|ACEP01000079.1| GENE 1 279 - 1142 579 287 aa, chain - ## HITS:1 COG:BH2229 KEGG:ns NR:ns ## COG: BH2229 COG2207 # Protein_GI_number: 15614792 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 48 282 40 276 287 85 27.0 1e-16 MYLNTGYLNHSHIDFKDKSRPLIVGSCGTYRLSRHPKLPTYRPRGRLDYQIIYITAGCGH FHFDNVNNEMIVPAGNIVLYRPKELQKYEYYGEDKTEVYWIHFTGSNVKNILRQYGFPDK ERVFQVGTSNEYEQIFKRIIIELQRCQDNYEEMLVLLLRHLLISFHRELTREHILKNEYL DHEMDNAVTFFSENYNQNININDYATSRGMSVSWFIRNFKKYTGSTPMQFIVGIRINNAQ MLLETTTYSINEISKIVGYDNQLYFSRLFHKLKGYSPREYRKLRNKF >gi|222441863|gb|ACEP01000079.1| GENE 2 1348 - 4386 2322 1012 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 3 1006 4 983 1087 556 33.0 1e-158 MIVPRYYENLSVLHENTMPARAYYIPASRRMDNLVEHREESDRMQLLNGTWKFQYFNSIY DIQDSFFEKNYDTENFDEIQVPSVWQMAGYDTHQYTNIRYPFPFDPPYVPQDIPCGAYVH TFEYSRDEKAPKSFLNFEGVDSCFYVWINGSYIGYSQVSHMTSEFDVTDVLQDGKNTVAV LVMKWCDGSYLEDQDKFRMSGIFRDVYILKRPKQAISDYHIKTRIEDMLAKVEIEMKFYS PLNVKISIEDRNGAVVALGSIAEEGTAVLEIASPELWNTENPYLYKLILETENEVIVDHI ALRKIEIKDQVIYLNGQKIKFRGVNRHDSDPVTGFTINPEQITTDLTLMKQHNFNAIRSS HYPNAPFFYEMCDKYGFMVIDEADIEAHGPFMIYRKEDTDYNRFKRWNEKIADDPVWEEA IVDRVKLMVERDKNRFCIVMWSMGNESAYGCNFEKALEWTKNFDPDRITQYESARYRNYD 
ETYDYSNLDVYSRMYPALSEIQEYLDKDGSKPFLLVEYCHSMGNGPGDFEDYFQMIQDND KMCGGFVWEWCDHAIAHGTAENGKTIYAYGGDHGEEIHDGNFCMDGLVYPDRTVHTGLLE YKNVYRPARVISYNKESGELVLHNYMDFDDLKDYVKISYELTQDGLVISKGILPEFSVAP HGEGKTNLKINVPENGKCYLKLIYHLKKELPLLDEDHILGFDEIEVSKEDTKCKLAEKWI PKTVVDSELQVNENDTQIHIKGREFAYTIDKRTALFTEMKFAGREYLNHPMELNIWRAPT DNDMYIKSEWKKAHYDKAYTRAYTTEVVQGKHGVKITSHASVVAETVQKILDVTITWKIE AAGKIDADIAVTKDDEFPDLPRFGVRMFLDKKLSAVRYFGMGPQESYCDKHQAASHGLYR ADVGDLHEDYIRPQENGSHYDCEYVEINNSRYGIVASAEKAFSFNASYYTQEELEKKTHN YELIESDSVVFCVDYALNGIGSNSCGPVVLEQYRFDDVLFRFQFTLIPYVKG >gi|222441863|gb|ACEP01000079.1| GENE 3 4554 - 5951 1244 465 aa, chain + ## HITS:1 COG:BS_ynaJ KEGG:ns NR:ns ## COG: BS_ynaJ COG2211 # Protein_GI_number: 16078820 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Bacillus subtilis # 2 446 4 441 463 150 26.0 4e-36 MEEKKYLKWYNKIGYGSGDIAGNVVYAFLTSFMMVYLTDSVGLAAGVVGTLIAVSKLFDG FTDIFFGSMIDKTHSKMGKAKPWMLYGYIGCAITLVGCFAIPVSLGTTAKYAWFFISYTL LNGGFYTANNIAYSALTSLITKNSKERVQMGSYRFIFAFSTSLLIQAITVGFVDKCGGDA AAWRTVAIIYAIIGLVVNTISALSVKELPEEELNEGEVKNDNEKYGMVQAFKLLVKNKYY MMICGTYILQQLYGAMIGAGIYYMTWVLKNKNLFGQFAWAVNIPLIIALIFTPTLVGKWK GMYKLNLRGYVLAVIGRALVVVAGYMGSVPLMMAFTALAALGQGPWQGDMNAVIASCSEY TYLTQGKRIDGTMYSCTSLGVKIGGGIGTAVVGWMLEFSGYVGTNVTQPQSALDMMQFMY LWLPLIFDVLIMFVLSRMNVEDANKKLRAEKGIAADEVTDALDIN >gi|222441863|gb|ACEP01000079.1| GENE 4 6116 - 6193 108 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNLKKHLNDLLVFYTEYFEHPYHV >gi|222441863|gb|ACEP01000079.1| GENE 5 6303 - 7103 443 266 aa, chain + ## HITS:1 COG:lin2267 KEGG:ns NR:ns ## COG: lin2267 COG2207 # Protein_GI_number: 16801331 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 15 259 20 287 292 75 25.0 9e-14 MQLKYVDTKEEIIAINRKSKHIEPHLHNALEIVCVTSGSLELGVGQELYHMEKGDIGFVF PDVIHHYQVLTPGVNKATYLIASPFTIAKFADIMQSMAPEYPIIKAEKVEPEVYRVINAI LETERSDITVAQAYLQIVLARCIGKLNLVEKSSVGSNDLIYQTVSYISANFKKKFSLEEM 
AKDLGVSKYVLSRLFSKTFHRNFNKYLNDARLNYACHRLENTSDSITNICLDSGFESQRT FNRVFKERYKISPSDYRSTCVKEMLS >gi|222441863|gb|ACEP01000079.1| GENE 6 7409 - 7813 78 134 aa, chain - ## HITS:1 COG:HP0437 KEGG:ns NR:ns ## COG: HP0437 COG1943 # Protein_GI_number: 15645065 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Helicobacter pylori 26695 # 5 134 11 142 142 112 45.0 2e-25 MDNRYNRHNRRKYNLKVHIVLVTKYRKQLLKDSIADDVKQKIFDIANAYGYEIIAIETDK DHIHFLLSYDTTDKVCDIVKIVKQETTYYLWHKYGSLLSKQYWKKKIFWSDGYFACSIGE VSSATIQKYIENQG >gi|222441863|gb|ACEP01000079.1| GENE 7 7894 - 9033 706 379 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764777|ref|ZP_02171831.1| ribosomal protein L22 [Bacillus selenitireducens MLS10] # 150 369 1 220 225 276 58 4e-74 MEQITITAKIQIVATETDKVLLDETMSVYRNACNYVSDYVFHTHDLKQFSLNKVLYSTLR EKFGLKSQMAQSVLKTVIARYKTILENQNEWIKPSFKKPQYDLVWNRDYSLTQNRFSINT LNGRVKLSYFADGMSKYFDHTIYKFGTAKLVNKHGKYYLHIPVTYDVEESNISDICNVVG IDRGINFVVATYDSKHKSGFVSGKAIKQKRANYSKLRKELQMRRTASSRRRIKAIGGREN RWMRDINHQVSKALVKNNPKHTLFVLEDLSGIRNAAERVKTKDRYVSVSWSFYDLEQKLI YKAKQNQSSVIKVDPRYTSQCCPCCGHVEKSNRNKKIHLFTCKNCGYKSNDDRIGAMNLY RMGIDYLADSQVPNTVVTE >gi|222441863|gb|ACEP01000079.1| GENE 8 9363 - 9452 93 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGEVFTSFLISVIAGVVSYYICKWLDRDK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:40:35 2011 Seq name: gi|222441862|gb|ACEP01000080.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont87.1, whole genome shotgun sequence Length of sequence - 56393 bp Number of predicted genes - 61, with homology - 61 Number of transcription units - 20, operones - 10 average op.length - 5.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 17 - 196 185 ## COG0675 Transposase and inactivated derivatives + Prom 357 - 416 6.2 2 2 Tu 1 . + CDS 652 - 1398 591 ## COG3022 Uncharacterized protein conserved in bacteria 3 3 Tu 1 . 
- CDS 1558 - 2766 1028 ## COG4286 Uncharacterized conserved protein related to MYG1 family - Prom 2823 - 2882 9.7 + Prom 2719 - 2778 7.8 4 4 Op 1 8/0.000 + CDS 2960 - 3967 1120 ## COG0310 ABC-type Co2+ transport system, permease component 5 4 Op 2 34/0.000 + CDS 3967 - 4794 213 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters + Prom 4856 - 4915 8.3 6 4 Op 3 . + CDS 4941 - 5606 274 ## PROTEIN SUPPORTED gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 + Term 5671 - 5710 1.1 - Term 5504 - 5544 2.9 7 5 Op 1 . - CDS 5752 - 6900 746 ## CLH_1999 hypothetical protein 8 5 Op 2 . - CDS 6890 - 7480 419 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 7519 - 7578 8.7 + Prom 7633 - 7692 5.8 9 6 Tu 1 . + CDS 7719 - 9572 1489 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase + Prom 9605 - 9664 4.8 10 7 Tu 1 . + CDS 9796 - 10695 993 ## COG1307 Uncharacterized protein conserved in bacteria + Term 10727 - 10767 5.6 - Term 10705 - 10762 4.6 11 8 Tu 1 . - CDS 10936 - 12801 1839 ## gi|225027449|ref|ZP_03716641.1| hypothetical protein EUBHAL_01705 12 9 Tu 1 . - CDS 13215 - 16466 2934 ## EUBELI_00340 hypothetical protein - Prom 16519 - 16578 14.7 + Prom 16436 - 16495 8.4 13 10 Op 1 2/0.000 + CDS 16713 - 17807 1101 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 14 10 Op 2 . + CDS 17914 - 19695 1802 ## COG3976 Uncharacterized protein conserved in bacteria + Prom 19701 - 19760 4.3 15 11 Op 1 . + CDS 19805 - 20296 651 ## COG3976 Uncharacterized protein conserved in bacteria 16 11 Op 2 . + CDS 20296 - 21135 355 ## COG0348 Polyferredoxin + Prom 21148 - 21207 6.5 17 12 Op 1 . + CDS 21259 - 22206 895 ## COG0714 MoxR-like ATPases 18 12 Op 2 . + CDS 22203 - 23393 781 ## Cphy_3708 hypothetical protein 19 12 Op 3 . + CDS 23383 - 25764 1118 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases + Prom 25768 - 25827 7.4 20 13 Op 1 . 
+ CDS 25855 - 26718 939 ## COG1092 Predicted SAM-dependent methyltransferases 21 13 Op 2 . + CDS 26733 - 27239 689 ## COG4769 Predicted membrane protein + Prom 27501 - 27560 4.1 22 14 Tu 1 . + CDS 27745 - 28659 838 ## COG0583 Transcriptional regulator 23 15 Op 1 . + CDS 29104 - 29790 661 ## COG0356 F0F1-type ATP synthase, subunit a 24 15 Op 2 . + CDS 29838 - 30068 449 ## EUBREC_2902 hypothetical protein 25 15 Op 3 . + CDS 30135 - 30587 667 ## COG0711 F0F1-type ATP synthase, subunit b 26 15 Op 4 . + CDS 30565 - 31083 561 ## EUBELI_01483 F-type H+-transporting ATPase delta chain 27 15 Op 5 42/0.000 + CDS 31089 - 32591 2002 ## COG0056 F0F1-type ATP synthase, alpha subunit 28 15 Op 6 42/0.000 + CDS 32607 - 33497 822 ## COG0224 F0F1-type ATP synthase, gamma subunit 29 15 Op 7 42/0.000 + CDS 33529 - 34926 1843 ## COG0055 F0F1-type ATP synthase, beta subunit 30 15 Op 8 . + CDS 34938 - 35360 554 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) - Term 35491 - 35525 -0.4 31 16 Tu 1 . - CDS 35714 - 35917 160 ## gi|225027471|ref|ZP_03716663.1| hypothetical protein EUBHAL_01727 - Prom 36131 - 36190 6.6 - Term 36354 - 36397 3.0 32 17 Tu 1 . - CDS 36469 - 37458 858 ## Selsp_0413 hypothetical protein + Prom 37786 - 37845 5.5 33 18 Op 1 . + CDS 37868 - 38461 464 ## COG0622 Predicted phosphoesterase + Term 38497 - 38532 4.4 + Prom 38464 - 38523 2.1 34 18 Op 2 . + CDS 38555 - 38998 235 ## gi|225027474|ref|ZP_03716666.1| hypothetical protein EUBHAL_01730 35 18 Op 3 . + CDS 38986 - 39471 375 ## gi|225027475|ref|ZP_03716667.1| hypothetical protein EUBHAL_01731 36 18 Op 4 . 
+ CDS 39473 - 40786 717 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis + Prom 41208 - 41267 5.8 37 19 Op 1 40/0.000 + CDS 41503 - 41820 487 ## PROTEIN SUPPORTED gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 + Prom 41844 - 41903 2.5 38 19 Op 2 58/0.000 + CDS 41955 - 42584 797 ## PROTEIN SUPPORTED gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 39 19 Op 3 61/0.000 + CDS 42610 - 43230 781 ## PROTEIN SUPPORTED gi|238922831|ref|YP_002936344.1| ribosomal protein L4/L1e 40 19 Op 4 61/0.000 + CDS 43230 - 43529 389 ## PROTEIN SUPPORTED gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 41 19 Op 5 60/0.000 + CDS 43604 - 44446 1257 ## PROTEIN SUPPORTED gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 42 19 Op 6 59/0.000 + CDS 44462 - 44743 460 ## PROTEIN SUPPORTED gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 43 19 Op 7 61/0.000 + CDS 44771 - 45157 547 ## PROTEIN SUPPORTED gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 44 19 Op 8 50/0.000 + CDS 45169 - 45831 921 ## PROTEIN SUPPORTED gi|238916271|ref|YP_002929788.1| small subunit ribosomal protein S3 45 19 Op 9 . + CDS 45831 - 46271 654 ## PROTEIN SUPPORTED gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 46 19 Op 10 . 
+ CDS 46249 - 46464 267 ## PROTEIN SUPPORTED gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 47 19 Op 11 50/0.000 + CDS 46485 - 46739 346 ## PROTEIN SUPPORTED gi|158319552|ref|YP_001512059.1| ribosomal protein S17 48 19 Op 12 57/0.000 + CDS 46762 - 47130 564 ## PROTEIN SUPPORTED gi|160881775|ref|YP_001560743.1| ribosomal protein L14 49 19 Op 13 48/0.000 + CDS 47144 - 47452 278 ## PROTEIN SUPPORTED gi|16125508|ref|NP_420072.1| ribosomal protein L24 50 19 Op 14 50/0.000 + CDS 47477 - 48016 770 ## PROTEIN SUPPORTED gi|160881773|ref|YP_001560741.1| ribosomal protein L5 51 19 Op 15 50/0.000 + CDS 48032 - 48217 299 ## PROTEIN SUPPORTED gi|158319556|ref|YP_001512063.1| ribosomal protein S14 52 19 Op 16 55/0.000 + CDS 48244 - 48645 568 ## PROTEIN SUPPORTED gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 + Prom 48712 - 48771 6.6 53 19 Op 17 46/0.000 + CDS 48816 - 49358 765 ## PROTEIN SUPPORTED gi|160881770|ref|YP_001560738.1| ribosomal protein L6 54 19 Op 18 56/0.000 + CDS 49375 - 49743 510 ## PROTEIN SUPPORTED gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 55 19 Op 19 50/0.000 + CDS 49760 - 50263 683 ## PROTEIN SUPPORTED gi|160881768|ref|YP_001560736.1| ribosomal protein S5 56 19 Op 20 48/0.000 + CDS 50279 - 50461 229 ## PROTEIN SUPPORTED gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 57 19 Op 21 53/0.000 + CDS 50486 - 50929 549 ## PROTEIN SUPPORTED gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 58 19 Op 22 . + CDS 50929 - 52242 764 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 + Term 52265 - 52311 8.2 + Prom 52972 - 53031 11.0 59 20 Op 1 6/0.000 + CDS 53207 - 54649 1988 ## COG0579 Predicted dehydrogenase 60 20 Op 2 4/0.000 + CDS 54683 - 55957 1756 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases 61 20 Op 3 . 
+ CDS 55962 - 56327 698 ## COG3862 Uncharacterized protein with conserved CXXC pairs Predicted protein(s) >gi|222441862|gb|ACEP01000080.1| GENE 1 17 - 196 185 59 aa, chain + ## HITS:1 COG:Ta1471 KEGG:ns NR:ns ## COG: Ta1471 COG0675 # Protein_GI_number: 16082436 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermoplasma acidophilum # 1 51 140 191 237 59 55.0 1e-09 MVKVDRFFPSSKKCCKCGRIKKELKLSERAYHCACGNKMDRDRNAAINIREEARRMLTA >gi|222441862|gb|ACEP01000080.1| GENE 2 652 - 1398 591 248 aa, chain + ## HITS:1 COG:CC3385 KEGG:ns NR:ns ## COG: CC3385 COG3022 # Protein_GI_number: 16127615 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 4 245 3 250 255 130 31.0 2e-30 MLKVIISPAKKMKIRNDILPYKNIPFFIKEAEKLFSYMGNMSFDELKDLWKCSDKLVEEN ERQFRIGNLKQNLSPAILSYEGIQYKYMAPAVFEEQAFTWLEKHLRILSGFYGALSPFDG VIPYRLEMQARFHKEEIDNLYDFWGQKLADYVLKDCDCLINLASKEYSKSILKYIPENVR VITCVFGELIDGKIKEKGTYAKMARGSLVRYMAEHQIENPEEIKQFDELGYKYKEEYSNS TKYCYLIH >gi|222441862|gb|ACEP01000080.1| GENE 3 1558 - 2766 1028 402 aa, chain - ## HITS:1 COG:AGl2056 KEGG:ns NR:ns ## COG: AGl2056 COG4286 # Protein_GI_number: 15891145 # Func_class: S Function unknown # Function: Uncharacterized conserved protein related to MYG1 family # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 272 8 284 315 114 27.0 2e-25 MNTLLEQIKKKNAKAFTHSGKFHADDVFSYALLLYLNPAITITRGNKVPKDFDGIVFDIG RGKYDHHQRDSRIRENGVPYAAFGLLWEELGAEILGEELAAKFDESFIQPLDINDNTGEK NELATLIGNFNPSWDVENGENEAFSRAVQTAGMILVNMFEKYKGNERAEKRVEEILAAHN SSVLSGEKSEIEAKILVLPEFVPCQKQLRETDIAFIIFPSNRGGYCIQPLKKEHSLNYKC SFPENWLGLERDELKQATGLTSANFCHKGGFIMTVDDVNDAISACKISLENFTETSCIIN LGGSSKMDEILKEIPHMENAAIIHCDLPKMPALTFDRNLGEISMEKEEFASYIKDYVKGI LKYKPDAVYVEGELFIVYPVIRVLHKKHIPVYIKHQNRVVAI >gi|222441862|gb|ACEP01000080.1| GENE 4 2960 - 3967 1120 335 aa, chain + ## HITS:1 COG:CAC0771 KEGG:ns NR:ns ## COG: CAC0771 COG0310 # Protein_GI_number: 15894058 # 
Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Clostridium acetobutylicum # 1 327 1 322 322 328 55.0 1e-89 MHIPENYLSPSTCAVMAAAMVSVWTYSVKKVKEEIPKVKMPLLGIGAAFSFLGMMFNIPL PGGTTGHAVGGTLIAILTGSPSAGCIAVTIALLIQALLFGDGGILAFGANCFNMAFILPF LGFAIYKLIWEKTGKRKLAAGIGSYIGINAAAFCAAIEFGIQPLLFTDAAGKALYCPYPL TISIPAMMIGHLTLFGIAEIVLTTAILAFVEKISPETLEEKPAQSAFKPLYILMAVLIIF TPLGLLASGTAWGEWGVEEMASLVSNGKALGYTPAGMEKGFSLASLFPDYSMAGMPEWIG YILSAVVGVAIIVIFFKLLAGSKKDKIDFSKGQSV >gi|222441862|gb|ACEP01000080.1| GENE 5 3967 - 4794 213 275 aa, chain + ## HITS:1 COG:CAC0772 KEGG:ns NR:ns ## COG: CAC0772 COG0619 # Protein_GI_number: 15894059 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 11 253 2 245 269 163 35.0 4e-40 MQNQEKNNSFLPSWMCETEQYVPSIDKDGFITKSTQAVLGVLSKLKWKEGKDGCLSASPS LKLCYTFLYILLTACSKNYVFSLIMAGGTILALATYPAETMKQILSGTFGAVLFSALILL PAVFIGNPQILLIITTKVFVSVTLIGILSAGTSWNKLTASLRSFHIPDIFIFTLDITLKY IAVLGEICMEILTSLRLRSVGQNRKKAQSFSGILGISFLKSREMAEEMYAAMCCRGFVGE YKTQKTSTFRKQDILSLLLMIGVTGIFIYFEGLWK >gi|222441862|gb|ACEP01000080.1| GENE 6 4941 - 5606 274 221 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145635097|ref|ZP_01790803.1| 50S ribosomal protein L25 [Haemophilus influenzae PittAA] # 1 213 1 205 205 110 29 2e-23 MIELKNVCYAYGNEIALRYINLNIQKGESVIIQGPNGCGKSTLIKILNGIIFPMEGSYTY EDHEITEKALKDSRFAKWFHQQMGYVFQNADTQLFCGSVEEEIAFGPIQMGLSEEKVKQR TEDCLRLFGIEKLRERPPYHLSGGEKRKVSLACILSMNPEVLILDEPLAGLDESTQKMLI DFLKKFHAAGKTLIIITHNNQLAKELGTRFIQMNENHELTI >gi|222441862|gb|ACEP01000080.1| GENE 7 5752 - 6900 746 382 aa, chain - ## HITS:1 COG:no KEGG:CLH_1999 NR:ns ## KEGG: CLH_1999 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 346 1 340 346 133 31.0 1e-29 MKNDIYDLLNDIDNQIDTFETVDITSTDLKNWKQSFALKKKTSVKKHTWKKYAAAAAAFI 
ILLGGSSAPVRQNVYAQSQQIMESLSTLLGTKEDLSPYSTVVGKSISKNGITITLNEVIL DGHSLIISYKTKIKDSSVLKKSNIKTITELPIDIDLTIAGEPIVESVGICSTPIDKLTSI SESEIQLNDSSILSEKADFTINFSTLDKSNTSIGKLKFAASGKELNAKTTNVALNHSFTL PNKTKITLKNYRSNIVNHQISFTISKNNTGLSGDIILKGKDDLGNPVEFYLDTADGTNGT FTVDRMTGYISPKATCLTLTPYYNDFPLSSSVNSTEETAESTEKTVAESTEIEVSTTNTD DAQEISISNYKSLGKAFTITIH >gi|222441862|gb|ACEP01000080.1| GENE 8 6890 - 7480 419 196 aa, chain - ## HITS:1 COG:CAC1766 KEGG:ns NR:ns ## COG: CAC1766 COG1595 # Protein_GI_number: 15895043 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Clostridium acetobutylicum # 1 180 1 179 185 84 30.0 2e-16 MKIKESNFVEALSHRNERALEYVMVHYGGLVKSVVHRYLNVLSQYEEECMNDVFFAVWEH IDSYDSARNPFANWIGGIARLKALDYKRKYANHLLETSWENAQYTKGLDDAIELHTQFEE EFSEETKQMLSCLKPQDRELFIKLYVEEKPFDEISKEMNTNKPVLYNRLFRSKKKLRSLF PQTNSTHLTGGTSYEK >gi|222441862|gb|ACEP01000080.1| GENE 9 7719 - 9572 1489 617 aa, chain + ## HITS:1 COG:CAC2707 KEGG:ns NR:ns ## COG: CAC2707 COG0122 # Protein_GI_number: 15895964 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Clostridium acetobutylicum # 6 275 13 287 292 146 35.0 2e-34 MDYIDIFIKNDYIDLKQIAESGQCFRWKKICPGRYFVISDGRTACFFQEKTGIRILCHEK DEEYFRKYLDLDTDYEKIIEQIDPEDRFLSGAAKMGKGIRILRQDLWEMIVSFIISQRNN IPRIMKSIDALCEKLGEKIVFNYEEEHLIGYSFPGPEVLAKADLSEFKFGYREKYIRQTA EDILTGKFELSVLQTAVAKGASPEEGKEMLKKLHGVGEKVASCIQLFGLHQLSLFPIDTW IAKVEKMYYNGHFPVERYEGIAGVMQQYLFFRVREEAEKRACLEVKADKMQKEKPEEIRK NVSKEVLRKQEKKEYNLSGKMLYVSDLDGTLLNSDALLNEDVPKRLNALIEQGLCFTVAT ARTYATVNSIMKDVNLTCPMILMNGVMIYDPVKKSCIHAEIIERDSVEYILKGRKKFGVT GFAYALSPEISEDCSKSGETSVVDSAGRVGKSGRKMATYYEKIATQHMEKFYTERRDLYH KPFSKVERLEDISGEDIIYFSICYEEEVLRPFYEYLKKDERLNLNFYKDVYGDGLWYLEI SHKNASKYYGIQKLRKLLHPAAITGMGDNLNDIPLFEACDRSCAVGNAHKEVKERADYIL DTNLNAGVVKFLEKEMK >gi|222441862|gb|ACEP01000080.1| GENE 10 9796 - 10695 993 299 aa, chain + ## HITS:1 COG:CAC3284 KEGG:ns NR:ns ## COG: CAC3284 COG1307 
# Protein_GI_number: 15896529 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 290 1 277 285 122 31.0 8e-28 MDYKIISDGCCDLEKEIIEQYNLRIIPFYISFDEETYYKEIEEIGIREVYERMVGNPDVY PKTSLPSVQNYIDVFTEYVKEGTPVVCICFTPSLSGSYNCACNAREIVCEDYPDAKIAVI NSEAGTVSQGLMVIEAGRMCQNGVPYEQCVDILEKMKKTNRIFFTVGNTDYLKHGGRIGK LAGIASSALSLKPLITLVNGEIEASGIGRSRKKTVAKTIALMDDYFKETGDAMEDYSFRV GFGYDIEEGKEYYNTVANHIYGESHVDQVGICQIGATIAVHTGPYAIGIGCVKKYENYL >gi|222441862|gb|ACEP01000080.1| GENE 11 10936 - 12801 1839 621 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027449|ref|ZP_03716641.1| ## NR: gi|225027449|ref|ZP_03716641.1| hypothetical protein EUBHAL_01705 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01705 [Eubacterium hallii DSM 3353] # 1 621 1 621 621 967 100.0 0 MRKQKLLTKALAASMAVFMSVSQLSVPAFAAEGTKSVSIAGSTYTYSAEALEGYTQIGNS GLYYKLTNGTGFSGTIYGTSNLTYKEFYSGDVSSTDSFDVVTTASTSKYSVLSNAWTDYK KESAEAAGGYHVKGVANVNIAVDSDLYVESIILKDANKELSKAYEEASDITLNENPTQVP SQYKILQENGSYVSNNKTVATVTDAVPTLTTGSTWGEYEIDVVETSTKYLRNSRQDEGFP VNSTIQGIILETTEGYKVGLEHLSNIWVQPYKLSFNVANEAATMGIAKSDNTAEFAKLVN KTINKITYVTPEGNYVYTFADGIFIKPAYSEEISGTFSDDMKSFTLNQLPTVKNGTLTVT YTVGAGHMKTPYTLYSGAIAKSVSLDLNAIPSDAEGGTYSVAISCDNYADIHVAIPVTEA QKAQLQELIKQAETALKGAGADDSVLLAHKNEAAELLANESSTSADAADLINELTELLKP YQSTEAPTPSIPDSTTASKPNTTTAAKPATKPATTAKKPNTTATKVKLAKQTTKVKANGK KKIKVSWKKDKKASGYEITYSTKKSFKGKKTIVVKSNKTTSKVVKKLTSKKKYFVKVRSY KQVGKTKTYGAYSKVKTVKVK >gi|222441862|gb|ACEP01000080.1| GENE 12 13215 - 16466 2934 1083 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00340 NR:ns ## KEGG: EUBELI_00340 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 39 448 28 436 485 395 56.0 1e-108 MRNKLLVAGTLALAMTCTTVFSSITPAVVKAASVSTVQTQTAENSDDYVYCYAGLTWAEY WKAEGVWAAGDTTSSDTADSHGEKDKGAFDAVTRATANHGLHRGSYQCTAIIEGEKGTYN ISHWTDANTAILTNGEKITFTRGAIKTSDGTEDTMKDYKVYGLKYVPVAVKSSDFEAFKA KYSVVENNGTLIGGYGEKNLQAYNNVTASVDKDTNGLKTAEKQSDGSFTFTQRSNGSSSG 
IKDKTLKTADLNNMGATVKDASGSYGEFLRVDFTKNYGDLGANMQAVKWTYYGNDSTRSK ALATYGTKFAADNWMHKSMGIQLGLTNSIRCQLPKGYDGTGYWSLTIYALGYADSTYNFE ATDANIVKPDTNAGDTTELNKLVSSVEELKKSDYTEKSWNSFETELNEAKDILAKETPTQ AEINEAVEHLTAAKNALDKYEYGTANLSYADFYYGELNDVKEDTTLDLTSDKAASYRESG MYDAVSSATTNKYKSFFSSTYSEDNANGQGGSIIGMKDVNIAVPTSLYENAKKAISENKE CSNKLLEIIGSMKLSDNAPSEYKILNGDGTLTAMKSEVTEDTSDTLTIKTQSSYGQYEVD PVTSENSPLSNLSAKDSLLGIIVEDEDGKKYGMEHLENIWIGGKFSFAVTDGFKEKHGNT IDYNRHEALQGKTIKKVTYLVKNGADIVINTNLKCKTLAGSDSIKATANDNFKDGATVSV DTTAVPSGSNYTLSSVSMGKTVLTEGTDYTYANNTLTIKATDNTGVGSYSMVFTDDTYSD IVASFTLESGYKTGDISINKNNQVTLPAGVNFKKYVNNITSIKINGVEKTGKGGIKATDL FDADGNINFNAAIKGKDGSSTPVFADKSASYTIELTSTGYPSVSGTVQLNTSILEASIKK AEALDSSKYTAETWKALQTALTEAKEAKSANTQAIVDAANTKLTEALSGLKEKAVTPSKP ATPSNPDTTTTKKPATKPALKKSNVKLSKPVLKVGKTTKNKAKVTWKKVKKATGYEIQYT TKGFGNKKDTKTIKVSKAKITSAQLKKLKKKTRYKVRIRAVYTKAGYQSATSKWSAIKTV KTK >gi|222441862|gb|ACEP01000080.1| GENE 13 16713 - 17807 1101 364 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 57 359 9 316 319 215 38.0 1e-55 MLLTTVFTAGMLSGCGTSGDTEKSKNDSRKAVEQTTQAEVDTDRKSQNEEDESESRDIFA MDTYMTLTAYGKNAKKALDAAVDEINNIEQLVSTGIDSSEVSQINKNGKGSVSETTGYLI KRSKEIYDSTNGVFDITIYPIMQAWGFPTENYRVPGKKELKKLRGLMGADHVLYDEKKQE VTLNKEGMKIDLGGIAKGYTSSKVMDIFKENGISSAVISLGGNVQTLNGKPDGSDWRVAV ENPADTGSYIGVLSIKDKAVITSGGYERYFKQDGKTYHHIIDPANGYPANNGLTSVTIVS DDGTLADGLSTSLFIMGPEKAQKYWKEHSDEFDTILVKDDGSILVSEGLAEYFTSESDFT IIKK >gi|222441862|gb|ACEP01000080.1| GENE 14 17914 - 19695 1802 593 aa, chain + ## HITS:1 COG:CAC2762_2 KEGG:ns NR:ns ## COG: CAC2762_2 COG3976 # Protein_GI_number: 15896018 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 211 297 46 132 132 70 54.0 1e-11 MREKWKKYVEKFKSLGDKVSGVKGITAIIPAVCLAVLMVTVLTGYKTPQAKKYEASETED 
ISQIKEALAKESRAATAETTKKNTTKKGKKGAIDVKDGTYKGSANGYGGKVTVNVTVSKK TMTAIDVVSAPGETDSFFQRAKGVIDEMLTAQSTDVDVVSGATYSSNGIIGAVKNALFGT ESNNATAAAANAGNAGGSAPSVSKVSESGTWKDGTYTGSGKGFGGTISVKVTVKDGKISA IDVTSASGETASYFSKAKGIIPKMISGQTTNVDVASGATYSSNGIITAVRNALSKAETGK SSTKKKKKKNKKNKKKNSGSNSNNNNNNIAAPAEGYEDGTYTGSAACSGEQFKEYSVTAN VTIKNGKISAVEISSTAKGTNLKQFMSRDEIKNLPSLIVSKNGTSGVDAVSGATYSSHAI FNAVNDALSKAKKNGSSTEKKEETTTEKKEETTTEKKEETTTEKKEETTTEKKEETTTEK KEETTENPDEGKNYKNGTYKVSITCEPDEDEDFDPYTISMDITIKKDKITEISNITANTN STNKAYTNDAKKGMISKIIANGNADGVNTVAGATCSSKAIKDACQKAFNAAKK >gi|222441862|gb|ACEP01000080.1| GENE 15 19805 - 20296 651 163 aa, chain + ## HITS:1 COG:CAC2762_2 KEGG:ns NR:ns ## COG: CAC2762_2 COG3976 # Protein_GI_number: 15896018 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 76 161 46 131 132 61 44.0 6e-10 MKYKNFLIRLCDLILVLGLLAGYQAVIYSRDKEATIAELKSQVNQLQGEKEDILEAAKNS GKLGTDGASAGQAGTYKDGTYTGSSQGFGGEIKVKVTVSGKKISAIDIIEASGEDEAYLS MAKDIVKTILDKQTTEVDTISGATYSSTGIKNAVGQALEGAAK >gi|222441862|gb|ACEP01000080.1| GENE 16 20296 - 21135 355 279 aa, chain + ## HITS:1 COG:CAC2762_1 KEGG:ns NR:ns ## COG: CAC2762_1 COG0348 # Protein_GI_number: 15896018 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Clostridium acetobutylicum # 5 258 3 252 253 101 29.0 2e-21 MKKKKKISVQTIIEVMIRAVFFIIYPALFSTAFTGIKLIVEQITQRASLQWNSFVQTLVV LLAFTIVFGRFFCGFACAFGTWGDFVYFISSMVCKKRKKKPLKCFSKEGKILRYLKYVVL LVVLLLCIKGYSSAVAVHSPFSAFSRFHQLKMADSLIGTGLFLLVTAGMALEPRFFCRFL CPMGAVFSLMPVLPFSVIKRNRENCIPNCKACNLVCPANLEIPSDKEGDNATSGECFSCG KCIGKCPKQNTHNGFWGKLWLPMLVVKAVILFVIFRILY >gi|222441862|gb|ACEP01000080.1| GENE 17 21259 - 22206 895 315 aa, chain + ## HITS:1 COG:lin0471 KEGG:ns NR:ns ## COG: lin0471 COG0714 # Protein_GI_number: 16799547 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Listeria innocua # 3 306 8 311 315 303 49.0 2e-82 
MNKANLIIEEVNKVIIGKEKVIRKVWMTILSGGHVLLEDVPGVGKTTMALAFSKALGLSY RRIQFTPDVMPSDVVGFYYYNKESGKFEYRQGAVMTNLLLADEINRTSSRTQSALLEVME EGQVTVDGVTRNIPEPFFVIATQNPLGSAGTQMLPESQMDRFMVLLSMGYPTIKEEMILM SQRKVSDPLDDVNEVISKEELLSMQKEVNEIKVSSLIYQYIAMLSDATRRHDMIQLGVSP RGSLALCRMAKASAFLAGRDYVVPEDVQDVVKDVFRHRLVLKSRARLSRKDVDKIMDEIC ATVHVPDRRAAGGRR >gi|222441862|gb|ACEP01000080.1| GENE 18 22203 - 23393 781 396 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3708 NR:ns ## KEGG: Cphy_3708 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 79 335 75 340 404 85 27.0 3e-15 MTRKRLWMNRISGIVCLLLWIALVIFLSGTTGKIAGMALGIMLLVAVIYNLYLPKEVSTH LIIPAYAQKEEKITGRLCISNTGVFPVLSGKVFLQVKKTISGEKEIFSMNIRAAGRKNGA AAFVIDSEYMGFLEITILRVELYSFFGCMKKVTYVNQEKVVMILPQTTELHFPVQNSGAA CSFFDEEQSGKKGNGPGEYFGIRPYVDGDSMKLIHWKLTGKTDEYMVKEVEVPMMRMPLI FLETRVEKLDAAMIDGLLEAFFSISQHMAQEGQKHCLCWWDKKTDGWNFYNVANAMQLEE VYAHVFQSMFFSQGNATIDSFEKERKEQFSEVIYITEYWNEKQAVFDDWNLTLLLERREA RKEEGEALVPYIFSAETMGEDIDRILTAYRSVDYGA >gi|222441862|gb|ACEP01000080.1| GENE 19 23383 - 25764 1118 793 aa, chain + ## HITS:1 COG:lin0469 KEGG:ns NR:ns ## COG: lin0469 COG1305 # Protein_GI_number: 16799545 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Listeria innocua # 485 645 426 598 721 74 27.0 1e-12 MEHKNSFHIKKLNHQSGFGWAEMMRAGYVWLAALLCFIWYAADRFEMDTAGVALFIGAAF FILLYFVGISVGKGFTVFISGLYFFVPFFILSMENIRRALLIFSRTVFDYLADYGQSKNI FHRIIQNTSVDMQTGNVEFFPFVVGILGITLWLILFMALLWKVQPCILFFTVAICFAFPF FVWGVRPELFAVLSFLIFCFFFFAPSFWGGVLGTSILIILLLLGTFSGMTNTGQMSAIQK KWNKERYKEDVSVLPEGQLTKAKKMKRTGEEALKVNIYKKNSYYLKGFVGTTYENNCWKA STNNWNIFGSDRSMLAASDREAYESLLWLHNKEFYGNTVMYHFVRQKKLSEGKTDPSLYS LSVKNVGANRKYLYLPYESAVLPTIYEEEGTAAINKTENIFASGIKGAGEYSFKAMETMA GKLKSSTNIKTEDKNLSSKKMTESYKEYVNATCLFLNKETKTLISSATEGSLQSNLSVLA IINRVKSFMKTSITYNENPGSISEDEDFAQWFFEDKKTGYDVQYAAAAVMMFRYYGIPSR YVEGYLLTPETVKEAGTAETVTVSQKYAHAWPEIYLDGIGWIPVEVTPKYENIMGTPYYE 
TQETAVSNSSSSDQSLENDDKKKQEKQQEQKKQEKEQTTEKKSEKKTVETPQQIPQNRSK QSKGGRTVHKILYILCFFIFIVGLLLFVFRKRLQKVYRNYKTDKLLKQQKYSQAVMFWYD ILCKTIYGKEKIKDNRVRTFEKRLRTFDREVEPKKLHLCMRIRQKAVYSPEGITRKEAEA AVHFLKKECEKLK >gi|222441862|gb|ACEP01000080.1| GENE 20 25855 - 26718 939 287 aa, chain + ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 9 284 58 336 338 224 41.0 1e-58 MWLADNWKEYEVIDCSEGEKLERWGKYTLLRPDPQVIWNTKKEDKHWKHLNAHYHRSKKG GGEWEFFDLPKQWDIHYRSLTFHLKPFSFKHTGLFPEQAVNWDWFSDKIKKAGRPVKVLN LFAYTGGATLAAAAAGASVTHVDASKGMVTWAKENAVASGLGDAPIRWLVDDCVKFVERE IRRGNHYDGIIMDPPSYGRGPKGEIWKIEEKIYSLVCLCEKLLSKDPLFFLINSYTTGLQ PAVLSYMLGTAVAKKHGGKVSADEVGLPVSANGLVLPCGASGRWEKK >gi|222441862|gb|ACEP01000080.1| GENE 21 26733 - 27239 689 168 aa, chain + ## HITS:1 COG:lin2789 KEGG:ns NR:ns ## COG: lin2789 COG4769 # Protein_GI_number: 16801850 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 4 162 5 166 180 84 37.0 9e-17 MSAKKTAFYGMFLALALVAGYIEQLIPINLGIPGVKLGLANIVTMLLLYIVGVPAACLIS VLRILLSGFLFGSGFAMVYSAAGAAMSMLVMALLKKTKKFSSVGVSVAGGIFHNVGQIIV AMIVLETKALAYYLPILILSGLVAGILIGILSGILTKRLNPIIKQHFV >gi|222441862|gb|ACEP01000080.1| GENE 22 27745 - 28659 838 304 aa, chain + ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 6 288 4 285 296 177 34.0 2e-44 MEITYDYYRIFYYAAKYKSFSKAAEILMSNQPNITHFMNNLENQLGCRLFIRSNRGVTLT REGEKLYKHVTIAYQHFQLAELELANDKSLQSGIITVGVSEIALHLLILQVMADFRKAYP GIRIRLSNHSTPETIQEVKDGLVDLAVVSTPADIPATLKSTSLMDFSDVMVVSSSHKELC EKPIHLSKMQDYPLICLGRGTKSYEFFSNFYQRYGLKLNPDIEAGTIHQILPMVVYDLGV GFLPENVGIEALEDGKIIKVPLIEKIPQRQICLVESKRTPLSIAANALKQFINNYIETKL KDKA >gi|222441862|gb|ACEP01000080.1| GENE 23 29104 - 29790 661 228 aa, chain + ## HITS:1 COG:CAC2871 
KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Clostridium acetobutylicum # 11 228 2 220 221 134 40.0 2e-31 MDALVNELMKELNCDTVFTIPVLGGIPVKESVVVTWIIMAVILVLCIALTRNLSVENPSR GQIILETIVSGGHNFFKDTLGEHGAEYIPYLMTVTLYIGIANLIGLLGFKPPTKDMNVTA ALAFMSIVLIEVAGIRQKGAKGWVKSFAEPIPIILPINIMEIFIKPLSLCMRLFGNVLGS FVIMELLKLIVPAVLPAVFSCYFDIFDGLIQAYVFVFLTSLFIKEATE >gi|222441862|gb|ACEP01000080.1| GENE 24 29838 - 30068 449 76 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2902 NR:ns ## KEGG: EUBREC_2902 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Oxidative phosphorylation [PATH:ere00190]; Metabolic pathways [PATH:ere01100] # 1 76 1 76 76 81 89.0 1e-14 MSSALFVAIGAGVAVLTGIGAGIGIGKATAAATDAIARQPEAESKISKALLLGCALAEAT AIYGFVIALLIILFLK >gi|222441862|gb|ACEP01000080.1| GENE 25 30135 - 30587 667 150 aa, chain + ## HITS:1 COG:lin2677 KEGG:ns NR:ns ## COG: lin2677 COG0711 # Protein_GI_number: 16801738 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Listeria innocua # 18 139 26 147 170 61 31.0 5e-10 MLNINPWNMVMIVINLLVLYAIFRKFLYKPVMNVIHRREELIQGQFDSAKKTQDDALQLK ADYEEKLQKANVKADEIILAARDQAKEEHEKAVLETQAKTDRMIEKAKADIEKEQKQAQQ EVQGEIAKLALIAARKIIETGEAHDAEGSK >gi|222441862|gb|ACEP01000080.1| GENE 26 30565 - 31083 561 172 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01483 NR:ns ## KEGG: EUBELI_01483 # Name: not_defined # Def: F-type H+-transporting ATPase delta chain # Organism: E.eligens # Pathway: Oxidative phosphorylation [PATH:eel00190]; Metabolic pathways [PATH:eel01100] # 1 172 1 173 173 141 45.0 1e-32 MTQKAVNNAKVLYRLNISRSDVEEAHKIFTKAPFLLETLTNPVVSEEAKEVVLDKIARQS GMPELLENFMKMMCRIGEIGQITDIFNCYYEYWDKMNHILHAKVIYADDKAKEEKEAIQK QLESIYPDEKVVLSEETDASLLGGYVVRVGYEEYDHSYEGKLRRLERKLTGR >gi|222441862|gb|ACEP01000080.1| GENE 27 31089 - 32591 2002 500 aa, chain + ## HITS:1 COG:Cj0105 KEGG:ns NR:ns ## COG: Cj0105 
COG0056 # Protein_GI_number: 15791493 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Campylobacter jejuni # 6 497 5 496 501 573 57.0 1e-163 MGAISAEDIISIIQSEIENFDVDNTSEETGTVIYVGDGIVTVYGIDHAMYGEVVVFDNGV KGMVQDIRQNEIGVILFGRDTGIKEGTKVVRTKKKAGIPVGNEYIGRVINALGEPIDGGA EIKADGYRPIEEEAPGIIERKSVDTPMETGILSIDSMFPIGRGQRELIIGDRQTGKTSIA TDTILNQKGKGVICIYVAIGQKASSVAKIVNTLEKYDAMDYSIVLSSTASEPASLQYIAP YAGTALAEYFMHQGKDVLIVYDDLSKHAVAYRAISLLLERSPGREAYPGDVFYLHSRLLE RSSRLSDKLGGGSITALPIIETQAGDVSAYIPTNVISITDGQIFLESNLFFEGMRPAVNV GLSVSRVGGAAQTKAMKKACGSIRIDLAQYREMEVFTQFSSDLDDATKAQLAHGRAIMEL LKQPLCHPLSLHEQVITLCLANNGIFDIVMPKEVKKYQKDILSYLDLKHPEIGKEIEKKK DLTDELIAKIIEAAKEYTDK >gi|222441862|gb|ACEP01000080.1| GENE 28 32607 - 33497 822 296 aa, chain + ## HITS:1 COG:SPy0759 KEGG:ns NR:ns ## COG: SPy0759 COG0224 # Protein_GI_number: 15674807 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Streptococcus pyogenes M1 GAS # 2 290 3 289 291 173 34.0 3e-43 MASAKEIQDRMRSIKDTLKITNAMYMISSSKLKKSKKMLADTEPYFYTLQSEMSRILRHL PDMNSIYFKTNAEIPERKRKAGYIVITADKGLAGSYNHNILKLAEEELEKRDDYKLFVLG ELGRHYFEQKGINIDKQFHFVVQDPSLSRARRIAEDLLKLYHENQLDELYIIYTTMVNAM QEEAQVAQLLPLKKTDFKIPVPIDIPLEGLALKPSAEEVMDHIVPNYVVGFVYGALVEAF SCEQNARMMAMEGATNSAKQMLKELDIEYNRARQAAITQEITEVIAGAKSQKKKKK >gi|222441862|gb|ACEP01000080.1| GENE 29 33529 - 34926 1843 465 aa, chain + ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 461 6 463 466 629 71.0 1e-180 MKTGKIIQVLGPVVDVEFENQELPAIRDALEVQNGDKKCVMEVAQHIGNHVVRCIMLAAS EGLHRDMEVTAEGSGIKVPVGEKTLGRLFNVLGETIDDGEPIKDAPKMVIHREPPTFEEQ NPAVEILETGIKVIDLLAPYAKGGKIGLFGGAGVGKTVLIQELISNIATEHGGYSIFTGV GERSREGNDLWTEMGESGVLAKTALVFGQMNEPPGARMRVAETGLTMAEYFRDEKKQDVL LFIDNIFRFTQAGSEVSALLGRMPSAVGYQPTLATEMGELQERIASTKNGSVTSVQAVYV 
PADDLTDPAPATTFAHLDATTVLSRKIVEQGIYPAVDPLESSSRILEADIVGEEHYEVAR KVQEALQKYKELQDIIAILGMEELSDEDKTVVFRARKIQKFLSQPFHVAENFTGIKGVYV PVKETIRGFKAILDGEMDEYPENAFFNVGTIEDVKKKAEEMKAAN >gi|222441862|gb|ACEP01000080.1| GENE 30 34938 - 35360 554 140 aa, chain + ## HITS:1 COG:VC2763 KEGG:ns NR:ns ## COG: VC2763 COG0355 # Protein_GI_number: 15642756 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Vibrio cholerae # 1 133 1 132 140 71 32.0 5e-13 MASNTFSLKIISANRVFFTGRCQSVIVPGYDGQKEVLAHHENMVIAVNEGEMRFLPEGEE EWQYAVVGIGFIEIVNNRVTLLVESVERPEEIDIARAQEAKERALEKIRQKQSIQEYYHT QASLSRAMARLRVGHRQKHV >gi|222441862|gb|ACEP01000080.1| GENE 31 35714 - 35917 160 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027471|ref|ZP_03716663.1| ## NR: gi|225027471|ref|ZP_03716663.1| hypothetical protein EUBHAL_01727 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01727 [Eubacterium hallii DSM 3353] # 1 67 1 67 67 117 100.0 3e-25 MILVDMSDLELKAYAQQLSYMTYDFNLDYTLDIKPIAKSNAHFKKWIINYPFYSNIHKEG IVLYSAT >gi|222441862|gb|ACEP01000080.1| GENE 32 36469 - 37458 858 329 aa, chain - ## HITS:1 COG:no KEGG:Selsp_0413 NR:ns ## KEGG: Selsp_0413 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 18 325 20 328 335 182 37.0 2e-44 MDKDIKKSISATDKDAQYDEKAKNLLGHKIILAHILVKTIDEFKGMNPKDVVQYIEGKPY ISTVPVDTGSTNVEKEQDGEKVIGLNTENSELHEGLARFDIIFYVRMKDGLSQVIVNIEA QKAEPSAYDIINRAVFYVSRMISSQKGREFVNSNYNDIKRVYSIWICMNMSQNCMNYIHF TQESVVGTYQWKGDIDLANIVLIGLAEDLPEKEERYELHRLLGALLSAKLNVDEKFDIIG NEFDIPLESDIRKDVNDMCNLSQGIKEQAYVEGTENGIAIGKQEGITIGKQEGIAETIIK MYRKGYEAEQISDILDMTAEEVREIIENE >gi|222441862|gb|ACEP01000080.1| GENE 33 37868 - 38461 464 197 aa, chain + ## HITS:1 COG:CAC2749 KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 182 1 178 180 187 50.0 2e-47 
MKWMIASDIHGSAFYCEKMLKAFEREQADKLLLLGDILYHGPRNDLPKGYAPKKVIELLN NIKEKILCVRGNCDAEVDQMVLEFPIMADYAMITEGDLNIFVTHGHHFNEKKLPPMFEKT QEGLILLHGHTHVPVCRVHESYTYMNPGSVSIPKENSEHSYMILENGAFWWKTLDGETYL NFKGSKPERGYCREEYR >gi|222441862|gb|ACEP01000080.1| GENE 34 38555 - 38998 235 147 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027474|ref|ZP_03716666.1| ## NR: gi|225027474|ref|ZP_03716666.1| hypothetical protein EUBHAL_01730 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01730 [Eubacterium hallii DSM 3353] # 1 147 27 173 173 278 100.0 1e-73 MKRTDQYLYLANGRLHRSLFGAINEKGEVKIVKDNQVLTRMNEKQAKALGNIARKIKELD NQKSISAVYLYGSVARGSAHGMSDLDIFIVLNDNDLTPEECRKFRIENSSFNDIDIDIHF ENRNIFLKSNSTYHTNIKKEGLELWTA >gi|222441862|gb|ACEP01000080.1| GENE 35 38986 - 39471 375 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027475|ref|ZP_03716667.1| ## NR: gi|225027475|ref|ZP_03716667.1| hypothetical protein EUBHAL_01731 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01731 [Eubacterium hallii DSM 3353] # 4 161 1 158 158 260 100.0 2e-68 MDSMTYLDFAENDYKYFMHSYESGYVANNMAANAQNTAEKYLKHLIDQYDHDEQRLDLRT RTLRTHNLSQLMNYLSNEMGMEIPLRVKRDINALNNYYFNARYPGDNSFFVSKDDIEICK EGLDSCRELVLSVDKEMRTKNKEKELISENIPIVEDEEWDI >gi|222441862|gb|ACEP01000080.1| GENE 36 39473 - 40786 717 437 aa, chain + ## HITS:1 COG:CAC2766 KEGG:ns NR:ns ## COG: CAC2766 COG1477 # Protein_GI_number: 15896021 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 257 433 138 313 319 135 40.0 1e-31 MKKQILSIIMAGCLLLSMTACSSDKKNTKSASEQTTADTSTSTNSPQEYSKTDFVMSTVL SEKIYGTKDVTQDIKEELDKLEKEQLSWREDSSIVSKINADAQKGTKTKLDSDMTSWVED SLELARRSNGAFDPTIGRLTRLWNIEGDNPKVPSKQEVKNTLKDTGYTKIHLEKVETQNT AITKKNVDKDIKDNTDKNGDAAKDTDNNTINSTAQNTADNMVNNEANNTPDNTALNEERL ETTDKKTNTDESISSIYIEDQCTLDLGAVGKGIACDVAQNYLKQQKEVSGAVIAVGGSIL LYGSKTDGTNWNVAVQNPRGKDGEAMGVLSLSGTTNVSTSGDYEKYFMQNGKRYHHILDP STGYPAESSLISVTVVSDNGLLSDGLSTACFVLGKEKGQKLLETYGAEGIFIDQNKKVTV 
TKGLKDKFTILNEEYSE >gi|222441862|gb|ACEP01000080.1| GENE 37 41503 - 41820 487 105 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 [Roseburia intestinalis L1-82] # 1 105 1 105 105 192 89 5e-48 MASQIMRITLKAYDHQLIDASAKSIIETVKKTGSKVSGPVPMPTKKEVVTILRAVHKYKD SREQFEQRTHKRLIDIITPTQKTVDALSRLEMPAGVYIDIKMKQK >gi|222441862|gb|ACEP01000080.1| GENE 38 41955 - 42584 797 209 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916265|ref|YP_002929782.1| large subunit ribosomal protein L3 [Eubacterium eligens ATCC 27750] # 1 209 1 209 211 311 74 5e-84 MKKAILATKIGMTQIFAEDGELIPVTVLQAGPCEVTQVKTVENDGYSAVQVGFQDMREKL STKPMKGHFEKAGVSVKRFVREFKLDDAENYELGQKITVDVFEAGDKVDATAISKGKGFQ GAIKRHGQHTGPKTHGSKYHRHAGSNGMASDPSKVMKGKKMPGQMGHVQITIQNLEVVKI DAENNVILVKGSVPGPKKSLVTLKAAVKA >gi|222441862|gb|ACEP01000080.1| GENE 39 42610 - 43230 781 206 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238922831|ref|YP_002936344.1| ribosomal protein L4/L1e [Eubacterium rectale ATCC 33656] # 1 206 1 206 206 305 73 4e-82 MASVSVYNMEGAQVGTIELSDSIFAVPVNEHLVHQAVVAQLANKRQGTQKAKTRSEVRGG GRKPWRQKGTGHARQGSIRAPQWTGGGVVFAPTPRDYSVKMNKKEKQLAMKSVLTSKVNE SKFIVLDELKLAEIKTKQIKAVLDNLKVEKALIVTKEKDDVVVKSANNLPKVATTALNNI NVYDILKYDTVVVTSEAVAAIEEVYA >gi|222441862|gb|ACEP01000080.1| GENE 40 43230 - 43529 389 99 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 [Roseburia intestinalis L1-82] # 1 99 1 99 99 154 74 1e-36 MADLKYYDVILKPVVTEKSMTAMGEKKYTFYVNPDATKTQVKEAVERMFEGAKVAKVNTM NLNGKKKRRGMVYGRTAAKKKAIVQLTQDSADIQIFEGL >gi|222441862|gb|ACEP01000080.1| GENE 41 43604 - 44446 1257 280 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 [Roseburia intestinalis L1-82] # 1 280 1 281 281 488 82 1e-137 MGIKTYNPYTPSRRHMTGSDFSEITKKTPEKSLTYSLKKNSGRNNQGKITVRHRGGGAKK KYRLIDFKRNKDGIPAKVVAIEYDPNRTANIALICYVDGQKSYIIAPNKLEVGTTIMNGP DAEIHIGNCLPLENIPVGTEIHNIEMHPGKGAQLVRSAGNSAQLMAKEGKYATLRLPSGE 
MRMVPLNCRATIGQVGNIEHDLINIGKAGRKRHMGIRPTVRGSVMNPNDHPHGGGEGKAS IGRPGPCTPWGKPALGLKTRKKNKHSNKLIVRTRDGKNVK >gi|222441862|gb|ACEP01000080.1| GENE 42 44462 - 44743 460 93 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145874|ref|ZP_04744475.1| 30S ribosomal protein S19 [Roseburia intestinalis L1-82] # 1 93 1 93 93 181 94 6e-45 MARSLKKGPFADASLLKKVDAMNQSGDKSVIKTWSRRSTIFPSFVGHTFAVHDGRRHVPV YVTEDMVGHKLGEFVATRTYRGHGKDEKKSKVR >gi|222441862|gb|ACEP01000080.1| GENE 43 44771 - 45157 547 128 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 [Roseburia intestinalis L1-82] # 1 128 1 128 128 215 82 5e-55 MAKGHRSQIKRERNANKDNRPSAKLSYARCSVTKACFVLDAIRGKDVETALGILEYNPRY ASEIIGKLLKSAIANAENNNGMNRENLYVAECYADKGPTMKRIQPRAQGRAYRIEKRQCH ITVVLDER >gi|222441862|gb|ACEP01000080.1| GENE 44 45169 - 45831 921 220 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916271|ref|YP_002929788.1| small subunit ribosomal protein S3 [Eubacterium eligens ATCC 27750] # 1 220 1 225 225 359 80 2e-98 MGQKVNPHGLRVGIIKDWDSRWYAEKDFADNLVEDDKIRKYIKNRLYSAGISRTEIERAS DRVKIIIHTAKPGIVIGRGGSAIDELKKELEKLTGKKLIIEIKEVKRFDVDKDAQLVAEN IAQQLENRISFRRAMKSCMQRTMRNGALGIKTSCSGRLGGADMARTEFYSEGTIPLQTLR ANIDYGFAEANTTYGKVGVKVWVYHGEVLPTKETKEGSDK >gi|222441862|gb|ACEP01000080.1| GENE 45 45831 - 46271 654 146 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 [Clostridium phytofermentans ISDg] # 1 145 1 145 145 256 85 2e-67 MLMPKRVKRRKQFRGSMAGKATKGTAITYGDFGLVACDPCWIKSNQIEAARVAMTRYMKR GGKVWIKIFPDKPVTAKPAETRMGSGKGSLEYWVAVVKPGRVMFEVAGVPEETAREALRL AMHKLPVKCKIVSRADLEGGDNSENN >gi|222441862|gb|ACEP01000080.1| GENE 46 46249 - 46464 267 71 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 [Clostridiales bacterium 1_7_47_FAA] # 1 70 1 70 70 107 77 2e-22 MITVKTTKYVEDLRTKSIAELNEELVAAKKELFNLRFQNATNQLDNTSRIKEVRRNIARI QGIIAEQNSAE >gi|222441862|gb|ACEP01000080.1| GENE 47 46485 - 46739 346 84 aa, chain + ## PROTEIN SUPPORTED 
## NR: gi|158319552|ref|YP_001512059.1| ribosomal protein S17 [Alkaliphilus oremlandii OhILAs] # 1 84 1 84 84 137 79 1e-31 MERNLRKTRVGLVTSDKMDKTIVVSVTDNVKHPLYNKIVKRTYKLKAHDENNECRIGDRV KVMETRPLSKDKRWRLVEIVEKAK >gi|222441862|gb|ACEP01000080.1| GENE 48 46762 - 47130 564 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881775|ref|YP_001560743.1| ribosomal protein L14 [Clostridium phytofermentans ISDg] # 1 122 1 122 122 221 92 5e-57 MIQQESRLKVADNTGAKELLCIRVMGGSTRRYANIGDTIVATVKDATPGGVVKKGDVVKA VVVRTKKGARRKDGSYIKFDENAAVIIKDDLNPRGTRIFGPVARELREKKFMKIVSLAPE VL >gi|222441862|gb|ACEP01000080.1| GENE 49 47144 - 47452 278 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16125508|ref|NP_420072.1| ribosomal protein L24 [Caulobacter crescentus CB15] # 1 102 1 103 104 111 56 8e-24 MAQKIKRNDMVKVIAGKDIDKEGKVLSVDAKNHKVVVEGINMATKHTKPSMQNQAGGIVQ EEAAIDVSNVMLLHNGQPTRVGFKFVDGKKVRFAKSTGEVID >gi|222441862|gb|ACEP01000080.1| GENE 50 47477 - 48016 770 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881773|ref|YP_001560741.1| ribosomal protein L5 [Clostridium phytofermentans ISDg] # 1 179 1 179 179 301 82 7e-81 MTRLKEQYDSQIKAAMMKKFGYKNEMQIPKLVKIVVNMGVGDAKENHKLLDSAIGDMEKI TGQKAVVCKAKKSVANFKLREGMPIGCKVTLRGDKMYEFADRLINLALPRVRDFRGVNPN AFDGRGNYALGIKEQLIFPEIEYDQVDKVRGMDVIFVTTAHTDEEARELLTLFGMPYAK >gi|222441862|gb|ACEP01000080.1| GENE 51 48032 - 48217 299 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|158319556|ref|YP_001512063.1| ribosomal protein S14 [Alkaliphilus oremlandii OhILAs] # 1 61 1 61 61 119 86 3e-26 MAKKAMKVKQQRKQKFSSREYSRCKICGRPHAYLRKYGICRICFRELAYKGQIPGVKKAS W >gi|222441862|gb|ACEP01000080.1| GENE 52 48244 - 48645 568 133 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 [Roseburia intestinalis L1-82] # 1 133 1 133 133 223 81 2e-57 MTMSDPIADMLTRIRNANTAKHDTVDVPASKMKISIADILLKEGYIKGYDIIEDGAFKTI RIALKYGADKNERIISGLKRISKPGLRVYADVENMPKVLGGLGVAIISTNKGVVTDKEAR SMNVGGEVLAFVW >gi|222441862|gb|ACEP01000080.1| GENE 53 48816 - 49358 765 180 aa, chain + ## 
PROTEIN SUPPORTED ## NR: gi|160881770|ref|YP_001560738.1| ribosomal protein L6 [Clostridium phytofermentans ISDg] # 1 180 1 180 180 299 80 3e-80 MSRIGRLPVEIPAGVEVTVAENNVVTVKGPKGTLTESLPVEMDIKVENNQVVVTRPNDLK KMKSLHGLTRTLVANMVTGVTKGYEKVLEINGVGYRAQKQGKKLILSLGYSHPVEMEDPE GLESVLDGQNKITIKGIDKQKVGQYAAEIREKRKPEPYKGKGIKYADEVIRRKVGKTGKK >gi|222441862|gb|ACEP01000080.1| GENE 54 49375 - 49743 510 122 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|239623363|ref|ZP_04666394.1| ribosomal protein L18 [Clostridiales bacterium 1_7_47_FAA] # 1 122 1 122 122 201 81 1e-50 MINKKSRSEVRAKKHRRLRNRISGTASTPRLAVFRSNNHMYAQIIDDTVGKTLVSASTVQ KEVKAELEKTNDVAAAAHLGTVIAKRAIEKGITTVVFDRGGFIYQGKIQALADAAREAGL NF >gi|222441862|gb|ACEP01000080.1| GENE 55 49760 - 50263 683 167 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881768|ref|YP_001560736.1| ribosomal protein S5 [Clostridium phytofermentans ISDg] # 1 167 1 167 168 267 78 9e-71 MKRELIDASQLELEETVVSIKRVTKVVKGGRNMRFAALVVVGDKNGHVGAGLGKAIEIPE AIRKGKEDAMKKLVKVPVNEVGSIPHDFIGKFGSAEVLLKASPEGTGIIAGGPSRAVLEL AGYKNIRSKALGSNNKQNVVLATIAGLKEIKTPEEVARLRGKSVDEL >gi|222441862|gb|ACEP01000080.1| GENE 56 50279 - 50461 229 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227871783|ref|ZP_03990188.1| ribosomal protein L30 [Oribacterium sinus F0268] # 1 60 1 60 60 92 75 4e-18 MADKLKITLVKSPIGAIPKQRATVEALGLKKVNKTVEMPDNDAVRGMIWHVRHLVKVEEI >gi|222441862|gb|ACEP01000080.1| GENE 57 50486 - 50929 549 147 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240145859|ref|ZP_04744460.1| 50S ribosomal protein L15 [Roseburia intestinalis L1-82] # 1 147 1 146 146 216 76 3e-55 MNLSNLHPAAGSKHSDAFRVGRGHGSGNGKTAGRGQKGQKSRSGGKVRVGFEGGQMPLYR RLPKRGFTCINSKKIIAINVSELERFEADSVVTIDTLIESGLVKNTFDGVKILGNGELSK KLTVQVNAFSKSAVAKIEAAGGKAEVI >gi|222441862|gb|ACEP01000080.1| GENE 58 50929 - 52242 764 437 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 10 435 19 437 447 298 36 4e-80 
MFETLRNALKVKDIRKRLLFTLVVLIICRLGSQLPIPGIDTDTISQYLNSLLGDSFNLLN SFTGGSFESMSLFALNVTPYITASIIIQLLTIAIPALEELYRDGEDGRKKINNITRFVTL GLSVLESAGLAIGFGKQGLLSNYGPLIVMEMIVCLTAGSVFVMWLGEQITDKGVGNGISI ILLCNIVSRMPSDLYNLYQKFMEGKQISNVIIAGVIIFIVIIGTIILTIILNDAERRIPV QYSRKIQGGSQLGGLGSTLPVKVNTANVMPIIFSSSLLQFPLVIKQLIGADPKGAAGFIF NALNQSNWCNPDHWNWSIGLIVYLVLNVIFAYFYTSITFNPLEISNNMKKQGGYIPGIRP GKATVDYLNSILTYIIFIGAVGLCIVAVIPIFFNGYFGANVSFGGTSIIIIAGVVLETMK QIESQTLVRQYTGFLTE >gi|222441862|gb|ACEP01000080.1| GENE 59 53207 - 54649 1988 480 aa, chain + ## HITS:1 COG:CAC1322 KEGG:ns NR:ns ## COG: CAC1322 COG0579 # Protein_GI_number: 15894602 # Func_class: R General function prediction only # Function: Predicted dehydrogenase # Organism: Clostridium acetobutylicum # 1 478 1 475 475 416 46.0 1e-116 MYDIIVIGAGVTGTCTARELSKYQVNMCVIDKGDDVASGTSKANSGIVHGGYDCKPGTLM AEMNVRGNELLYELAEDLDYPIKKNGSLIVCTVPGERGKLDVLLEQAKENGVPGCRIIDK EEALKMEPNLTDNTVAALYVPTGGIICPWGLAIAMGENAAMNGCEFKFEHAVENIIKKDD HYEVVTNKGTFETKIVVNAAGVYADTLHNMVSEDKKHIIARRGEYLLMDKELGDYFSATV FPLPGKMGKGILCAPTIHGNMFVGPSATDVEGKDAVETTQEILDDLAYKAQHSYLTRTKL PMNKIITSFAGLRAHLPEHEFIIEEAKDAPGFFDALGIESPGLTSSPAIAERIAGQIQDK YHFPEKDNFIAKRKDVIHMEDLTLEEKNALIKEKPEYGSIVCRCEMITEGEIMDAINRPL GARTMDGVKRRTRAGMGRCQAGFCTPRTMEILSRELGIPMTEITKKGEGSNLLVGKIKED >gi|222441862|gb|ACEP01000080.1| GENE 60 54683 - 55957 1756 424 aa, chain + ## HITS:1 COG:FN0182 KEGG:ns NR:ns ## COG: FN0182 COG0446 # Protein_GI_number: 19703527 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Fusobacterium nucleatum # 4 424 5 420 421 460 58.0 1e-129 MKSYDVVIVGGGPAGLAAAIAAKKEGIDSILVIERDNQLGGILNQCIHNGFGLHTFKEEL TGPEYAARFIDQAAELNIEYKLHTMVCDISKDKVVTVMNREEGIVMYQAKAVILAMGCRE RPRGALNIPGYRPAGIYSAGTAQRLVNMEGFMPGKKVVILGSGDIGLIMARRMTFEGAEV KVVAEVMPYSGGLKRNIVQCLDDYDIPLLLSHTVIDIEGKDRLTGVVIAEVGPDRKPIPG TEVHYDCDTLLLSCGLIPENELSKQADVAMNPVTNGPLVDESLETNIDGVFACGNVLHVH DLVDYVSEEAATAGKNAALYVKNNCGKDAQKSDKVVEIKAIDGVRYTVPSTIHVDNMADL 
LTVRFRVGGVFKNSYISVYLNDERVQHRRKQVMAPGEMEQVILKKKDLEAYEGLETITVK IEEE >gi|222441862|gb|ACEP01000080.1| GENE 61 55962 - 56327 698 121 aa, chain + ## HITS:1 COG:CAC1324 KEGG:ns NR:ns ## COG: CAC1324 COG3862 # Protein_GI_number: 15894604 # Func_class: S Function unknown # Function: Uncharacterized protein with conserved CXXC pairs # Organism: Clostridium acetobutylicum # 5 121 4 117 117 98 49.0 3e-21 MEKRDLTCIGCPLGCQITVTMENGEVTDVKGNTCPRGDKYAREEVTNPTRVVTSIVKVEG GNLAAVSVKTKDVIPKGKIFDILDEIKPVVVKAPVKIGDVIVPNVAGTGVDVIATKNIQA V Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:42:12 2011 Seq name: gi|222441861|gb|ACEP01000081.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont88.1, whole genome shotgun sequence Length of sequence - 43089 bp Number of predicted genes - 30, with homology - 28 Number of transcription units - 18, operones - 6 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 239 - 430 66 ## - Prom 642 - 701 6.5 2 2 Op 1 . + CDS 410 - 625 86 ## 3 2 Op 2 . + CDS 640 - 1488 601 ## gi|225027502|ref|ZP_03716694.1| hypothetical protein EUBHAL_01758 + Term 1557 - 1601 2.4 + Prom 1895 - 1954 4.5 4 3 Tu 1 . + CDS 2012 - 3271 1400 ## COG3641 Predicted membrane protein, putative toxin regulator - Term 3607 - 3650 1.0 5 4 Op 1 24/0.000 - CDS 3851 - 4645 498 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component 6 4 Op 2 . - CDS 4638 - 5408 413 ## COG1116 ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component - Prom 5475 - 5534 6.5 + Prom 5388 - 5447 6.7 7 5 Tu 1 . + CDS 5479 - 6075 498 ## gi|225027507|ref|ZP_03716699.1| hypothetical protein EUBHAL_01763 + Term 6084 - 6135 -0.8 + Prom 6548 - 6607 10.1 8 6 Tu 1 . 
+ CDS 6633 - 7160 483 ## PROTEIN SUPPORTED gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P + Term 7303 - 7353 6.9 + Prom 7365 - 7424 6.3 9 7 Op 1 6/0.000 + CDS 7485 - 10121 3263 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) + Term 10218 - 10251 2.5 + Prom 10283 - 10342 7.1 10 7 Op 2 . + CDS 10395 - 11510 1307 ## COG1186 Protein chain release factor B 11 7 Op 3 . + CDS 11546 - 13696 200 ## PROTEIN SUPPORTED gi|213965205|ref|ZP_03393402.1| 30S ribosomal protein S1 12 7 Op 4 . + CDS 13726 - 14448 677 ## gi|225027514|ref|ZP_03716706.1| hypothetical protein EUBHAL_01770 + Prom 14524 - 14583 2.5 13 8 Op 1 12/0.000 + CDS 14609 - 15040 486 ## COG0802 Predicted ATPase or kinase 14 8 Op 2 20/0.000 + CDS 15041 - 15769 218 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase 15 8 Op 3 . + CDS 15766 - 16200 279 ## PROTEIN SUPPORTED gi|146295284|ref|YP_001179055.1| ribosomal-protein-alanine acetyltransferase + Term 16223 - 16282 5.1 + Prom 16250 - 16309 7.4 16 9 Op 1 . + CDS 16365 - 16604 158 ## Closa_0626 hypothetical protein 17 9 Op 2 . + CDS 16621 - 17532 871 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 18 9 Op 3 . + CDS 17538 - 18569 786 ## PROTEIN SUPPORTED gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 19 9 Op 4 . + CDS 18595 - 19299 358 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 20 9 Op 5 . + CDS 19317 - 20915 263 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Prom 20989 - 21048 3.2 21 10 Tu 1 . + CDS 21069 - 23030 1720 ## COG3855 Uncharacterized protein conserved in bacteria + Prom 23086 - 23145 5.3 22 11 Tu 1 . + CDS 23225 - 24841 471 ## gi|225027524|ref|ZP_03716716.1| hypothetical protein EUBHAL_01780 + Term 24988 - 25022 -0.8 + Prom 25226 - 25285 5.2 23 12 Tu 1 . 
+ CDS 25404 - 27692 3105 ## COG1048 Aconitase A + Term 27781 - 27828 3.0 + Prom 27900 - 27959 6.3 24 13 Op 1 32/0.000 + CDS 28049 - 29782 2249 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 25 13 Op 2 . + CDS 29795 - 30313 694 ## COG0440 Acetolactate synthase, small (regulatory) subunit + Term 30390 - 30429 -0.4 + Prom 30592 - 30651 7.5 26 14 Tu 1 . + CDS 30742 - 37020 4742 ## gi|225027528|ref|ZP_03716720.1| hypothetical protein EUBHAL_01784 + TRNA 37147 - 37220 76.9 # Pro CGG 0 0 - Term 37331 - 37379 1.6 27 15 Tu 1 . - CDS 37429 - 38229 941 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) 28 16 Tu 1 . + CDS 38917 - 39453 534 ## COG0406 Fructose-2,6-bisphosphatase 29 17 Tu 1 . - CDS 39627 - 40481 640 ## gi|225027532|ref|ZP_03716724.1| hypothetical protein EUBHAL_01789 - Prom 40599 - 40658 8.5 - Term 40866 - 40911 3.2 30 18 Tu 1 . 
- CDS 40972 - 42555 2050 ## COG0166 Glucose-6-phosphate isomerase - Prom 42581 - 42640 7.7 Predicted protein(s) >gi|222441861|gb|ACEP01000081.1| GENE 1 239 - 430 66 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTAADHRGFDCSVDMDYYCHYTDSAHLADMDSDCADHPMIVNPVIAHREVDFPAIDCTM IVH >gi|222441861|gb|ACEP01000081.1| GENE 2 410 - 625 86 71 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGSRIHRWVKIPIASRAIKILIIINRTHKGISMPMVHSHIHKGIPMSMDNSQIYRESQF HIISRLIYRLQ >gi|222441861|gb|ACEP01000081.1| GENE 3 640 - 1488 601 282 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027502|ref|ZP_03716694.1| ## NR: gi|225027502|ref|ZP_03716694.1| hypothetical protein EUBHAL_01758 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01758 [Eubacterium hallii DSM 3353] # 1 282 181 462 462 567 100.0 1e-160 MPFIIAGALIVILFLGGLAIGLISSNKSGTQANKNSQNAEQGDSSKILVQNGDFDTVVTV KSKDIDNIKLFKEPGGKKKAGKVQEGVPCALLQEKTVDGEKWAKIDFCNRQGWCRMDRLR TIFGETCYFYVKENSNENTVFVNERAIKLHTGAGQDTDIAATDVKYGTELTISKVEDGWG RTNYQNKECWIDMNVVGFYTSKYWQVERCDGSKNGIKLRKSSTEDSEQLTTVPLCTKFQS SDCRNGWARFTYGGKTGWLKLHYATPCGSSKGLSFSEDENNF >gi|222441861|gb|ACEP01000081.1| GENE 4 2012 - 3271 1400 419 aa, chain + ## HITS:1 COG:BH3254 KEGG:ns NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Bacillus halodurans # 24 417 4 334 336 199 38.0 1e-50 MNEKETGKMKNQNNTAQSAEKQSFLQKKNVEISFKRYAIDAMGAMALGLFASLLIGTIIK TLAENVGTTGLMAALAAITDTGELAAKVAGLSMLNRFCWILWQIGNYATMVTGVAMAAAI GYALDAPPLVLFSLCAVGQATNTLGGSGGPLAVLFVAIIACEAGKLVSKETKIDILVTPA VTIIVGVGCAMVLAPIFKTICDSLGTFIGWATNLQPFFMGIIISVVVGVVLTLPISSAAI CAAVGISGGAVIAGVVDGSISMEVWNGLALAGGAATVGCCCNMLGFAVISYPDNGVGGLV AQGLGTSMLQVPNLMRKPVLWIPPVLTSAILGPVATCIFQLRNNGAAISSGMGTAGLVGP IGIITGWSNMPKGYAPGAFDWVGMILVCFILPVVLSWAIGKFMRKKGWIKEGDLKVDLG >gi|222441861|gb|ACEP01000081.1| GENE 5 3851 - 4645 498 264 aa, chain - ## HITS:1 COG:CAC0618 KEGG:ns NR:ns ## COG: CAC0618 COG0600 # Protein_GI_number: 15893906 
# Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Clostridium acetobutylicum # 3 262 2 261 264 196 41.0 5e-50 MNEISKEQLLYMKRTRQHKQFVLFFQLFLFIFFIILWEITSRAGIINAFIFSSPSRMLAS GKELLLNGQLLKHLGITLAETFASFFLVTAVSLFTAILLWWNNTLSEILEPYFVILNSLP KSAMAPIFIVWLGNNMKTIIVTAISVAIFGSILNLFTSFMSTDPDKLKLIRTLHGSQADC LTKVILPMNLPTILSILKVDIGLCLIGVVIGEFLAAKEGLGYLIIYGSQVFKMDWVMLSI VLLCLIAALLYGVLNRLEKHFRNF >gi|222441861|gb|ACEP01000081.1| GENE 6 4638 - 5408 413 256 aa, chain - ## HITS:1 COG:CAC0619 KEGG:ns NR:ns ## COG: CAC0619 COG1116 # Protein_GI_number: 15893907 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, ATPase component # Organism: Clostridium acetobutylicum # 10 256 6 252 253 241 47.0 1e-63 MPQTKPFLSVRSLSHTYHTNSGETPALRDIHFDLFPGEFAAIVGPSGCGKSTLLELIAGL IPIETGSLLYPACNKTPAIGYMLQKDHLLEHRTIYKNIILGLEIQHRLTEKNLEYVQKLM RQYGIADFASSYPRELSGGMRQRAALIRTLALKPEFLLLDEPFSALDYQTRLDVSDDIAK IIRQSGVTTLLVTHDLSEAISIADRIIVLGKRPGHIRKIVTIDFKTNKKLSSKEARIHPK FQEYFNQIWKELKADE >gi|222441861|gb|ACEP01000081.1| GENE 7 5479 - 6075 498 198 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027507|ref|ZP_03716699.1| ## NR: gi|225027507|ref|ZP_03716699.1| hypothetical protein EUBHAL_01763 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01763 [Eubacterium hallii DSM 3353] # 1 198 1 198 198 362 100.0 1e-98 MIIASIIVADKILKEYFKSSFQGKHREVRTKGNGFGVRKEVGKQKVKENLFRKKEQNKAE LKQRRNNKEKNDGKNIGEGNWREYLSPTIELLENKGAASGIMAKHPKILKIISAAMLCFC ALALGKERQKGKTTMTGIGYALLLGGGISNFIDRMKKGSVTDYVRFPKFPVKKISELVFN LSDFGIFAGVFCLLLRKK >gi|222441861|gb|ACEP01000081.1| GENE 8 6633 - 7160 483 175 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P [Clostridium tetani E88] # 1 175 1 176 176 190 55 1e-47 MRYIIYGKNLEVTEGLKQAVHDKFSKLEKFFTPETEVQVTLSVQKDNQIVEATIPMKGTI LRVEPNSTDMYVSIDMAVDLLERMIRKYKTKISKYGKGELKFNDEFMEEDVDNHETVTIT 
KSKRFAIKPMDVEEACVQMELLGHNFFVFRNAETFEVNVVYKRKDKTYGLIEPEF >gi|222441861|gb|ACEP01000081.1| GENE 9 7485 - 10121 3263 878 aa, chain + ## HITS:1 COG:CAC2846 KEGG:ns NR:ns ## COG: CAC2846 COG0653 # Protein_GI_number: 15896101 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 877 1 839 839 907 56.0 0 MGLVDKLFGNKSEKEIKRIEKYVDAIEALDEKMQALSDEELRAKTQEFKDRLAAGETLDD LLVEAFAVVREAADRTIHLKHYRVQLMGGIILHQGRIAEMKTGEGKTLVSTLPAYLNALE GKGVHIVTVNDYLAKRDANWMGQVHQFLGLSVGVILNSYDNAERQAAYNCDITYITNNEL GFDYLRDNMVIYKKDLVQRELNYAIVDEVDSVLIDEARTPLIISGQSGKSTDLYKACDVL ARQMQRGTSDGELSKIDAIMGIDIEEDGDFLVNEKDKQVMLTTEGVKKVEQFFHIKNLAD AENLEIQHNIILALRAHNLMFRDRDYVVKDNEVLIVDEFTGRIMPGRRYSDGLHQAIEAK EHVKVNRESKTLATITFQNFFNKYKKKAGMTGTAMTEDKEFMDIYGMDVVEIPTNKPMIR VDHEDAIYMTKKEKLNAVINDIVESHKKGQPVLVGTINIDTSEEVSELLSKRGIKHNVLN AKFHELEAEIVSHAGERGAVTIATNMAGRGTDIKLEDGVAELGGLKIIGTERHESRRIDN QLRGRAGRQGDPGESKFYLSLEDDLMRLFGSERMMGVYKALKIPEGEEIQHKTITKQIEK AQKKIENNNFGIRRNLLDYDQVNNDQREVIYAERRRVLNGESMRDVVYKMIKDVVADDVN MVAGEQPLKELNLVELNESLRRKIPLEPITLTDEEKEKNDKEALISRLQDEAMKVYHKKE KDFNEFAKVNVPEEGVELGEEDAAIFNDGIREVERVILLKVIDSKWMNHIDDMDQLRESI GLQSYAQKDPVVQYKFLGYDMFEEMTKNIRHDTVGALMHVQIEQKVERKQVARATGTNKD ELVERGPYRRKEAKIGRNAPCPCGSGKKYKNCCGRFAN >gi|222441861|gb|ACEP01000081.1| GENE 10 10395 - 11510 1307 371 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 2 363 1 362 366 384 53.0 1e-106 MVELEQYKFTVNQYKEPMKELGVSLALSHKHEQIKELESEMREEGFWNDPDKAQEVTRKV KNLKDTVSAYHALELTLDDVSTMIELGNEENDASVIPDIEELLGEFEIEFDNLKIQTLLS GEYDKNNAIVTLHAGAGGTESCDWAGMLYRMYSKWADSHGFKTEVLDYLDGDEAGIKSIT FEVNGENAYGYLKSERGVHRLVRISPFNAAGKRQTSFASCDVMPDIEEDLSVEIADEDIR IDTYRASGAGGQHINKTDSAIRITHIPTGVVVQCQNERSQHKNKDKAMQMLKAKLYLIKE 
QENREKLSDIRGEVSDNGWGNQIRSYVMQPYTMVKDHRTGEETGNVQSVMDGNLDAFMNA YLRWITTPKKA >gi|222441861|gb|ACEP01000081.1| GENE 11 11546 - 13696 200 716 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|213965205|ref|ZP_03393402.1| 30S ribosomal protein S1 [Corynebacterium amycolatum SK46] # 636 711 201 275 487 81 48 7e-15 MDIIKKLAQELAIKEEQAKAAVQLIDEGNTIPFIARYRKEMTGALNDETLRNLHERLLYL RNLEEKKEQVINAITDQGKMTEDLKKKILSAETLVAVDDLYRPYRPKKQTRATKAKEKGL EPLANILLLQKTTKSMEEEAAAFVNAEKGVANAKEAIAGAKDIIAEMIADMADYRKSIRN CTMKQGSLVVKAKDEKAESVYEMYYDFREPLSKVTGYRVLAINRGEKEKFLTVKVEAPDV QIVNALNKTMVKDNLNTKALMEEAILDAYNRLIAPSIERDIRSEITEKAQEGAITVFGKN LTQLLMQPPMEGRTVLGWDPAFRTGCKLAVVDPTGKVLDTVVVYPTAPQCKVEQAKKTVH DMIKKYNVSVISVGNGTACRESEQVITELIREIPEQVAYVIVNEAGASVYSASKLATEEF PEFDVGQRSAASIARRLQDPLAELVKIEPKAIGVGQYQHDMNQKRLEEALNGVVEDCVNK VGVDLNTASVPLLSYIAGISKAVAKNIVAYREENGRFHTRKELLKVPKLGPKAFLQCAGF LRIPGGEEPLDNTSVHPESYEAATKLLNDLGYTSESIRGGGLSSLNSQIKDEKGLAENLG IGEMTLHDIIEELEKPGRDPRSEMPKPVLRTDVLEMKDLKEGMVLKGTVRNVIDFGAFVD IGVHQDGLVHISQMTDRYIKHPLEVVSVGDIIDVQVMSVDLNKKRIQLTMKIDHNK >gi|222441861|gb|ACEP01000081.1| GENE 12 13726 - 14448 677 240 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027514|ref|ZP_03716706.1| ## NR: gi|225027514|ref|ZP_03716706.1| hypothetical protein EUBHAL_01770 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01770 [Eubacterium hallii DSM 3353] # 1 240 1 240 240 437 100.0 1e-121 MKSFEEISKMNLAELDRPVGVGEEELETETTGAAGNHSVSDAEKAGTPGADDLETSINMS AAVSDNEEEKQPEELEKKVMMEPQGKHYRIDADITQSALNTFLFGHTYRQPLMILVTVLA IAWPIAVIVKKQGNIAMPLICSLFILIWIPFTTYLRAKNARKLNPIYEQTFHYMLDEWGL HLELSENAIDVEWKKVTKIIFYKSVAVIYTGKNNAFLIPTEAMGAQRADILNFIEEMKNR >gi|222441861|gb|ACEP01000081.1| GENE 13 14609 - 15040 486 143 aa, chain + ## HITS:1 COG:CAC2838 KEGG:ns NR:ns ## COG: CAC2838 COG0802 # Protein_GI_number: 15896093 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Clostridium acetobutylicum # 1 138 1 137 152 138 47.0 3e-33 MKIESNSAEETFALGKQCGEKAAAGQVYCLYGDLGVGKTVFTKGFAAGLGIKEPVSSPTF 
TILQVYDEGRLPFYHFDVYRISDPEEMYEIGFEEYIEGEGVCFIEWANLIEELLPAQYTE IHIDKDLSKGFDYRLITVEEVQR >gi|222441861|gb|ACEP01000081.1| GENE 14 15041 - 15769 218 242 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 42 239 1 189 380 88 30 4e-29 MKLLALDSSGLVASVAILDGETLVAEYTLNYKKTHSQTLLPMLDEIVKMTQTELSEMDAI AVAAGPGSFTGLRIGSATAKGLGLALNKPIISVPTLEGIACNFYGTDAIICPMMDARRQQ VYTGIYHFAGGTEFEELVAQEAGPVEDIIKKCNELGQKLGKPVIFLGDGVPVYKDIIEEL CNVEHSYAPAHLSRQRAGAVAIRAMKLYEEGKAEPASDHAPIYLRKSQAERELEAKGKKL EE >gi|222441861|gb|ACEP01000081.1| GENE 15 15766 - 16200 279 144 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|146295284|ref|YP_001179055.1| ribosomal-protein-alanine acetyltransferase [Caldicellulosiruptor saccharolyticus DSM 8903] # 6 140 10 144 153 112 42 5e-24 MKICPMTEADCEKTGLLMKECFSVPWSVEGLKEMFHTEGYCSFLAKEEEEVIGYVGMKMV LDEADITNVAVLPSHRKKGIAGKLLKQLLEEAKKQNLHSIYLEVRASNIAAVTLYEHAGF KEVGQRKNYYDNPREDARLMLWEA >gi|222441861|gb|ACEP01000081.1| GENE 16 16365 - 16604 158 79 aa, chain + ## HITS:1 COG:no KEGG:Closa_0626 NR:ns ## KEGG: Closa_0626 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 8 79 54 125 125 76 48.0 4e-13 MEDTMDKKVHEIYCNSCGKKIETTDNIKREDYLSIKKEWGYFSKKDGEIHSIFLCESCYD IWTKQFKLPVKVEKQTELL >gi|222441861|gb|ACEP01000081.1| GENE 17 16621 - 17532 871 303 aa, chain + ## HITS:1 COG:CAC1584 KEGG:ns NR:ns ## COG: CAC1584 COG1234 # Protein_GI_number: 15894862 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 301 5 311 313 341 53.0 1e-93 MLELCLLGCGGMMPLPYRKLTSLMARYNGNSILIDCGEGTQVAIKEKGWSFKPIEVICFT HYHADHISGLPGLLLTLGNAQRTEPLLMVGPRGLERVVNSLRVIAPDLPFPIVFHELKEK EEVLDVCGCRITAFRVNHNITCYGYTIEIDRAGKFDVNKAKENNVEMQYWNRLQKGETIE LPDRTLTSDLVLGPSRKGIKCTYTTDTRPTDSIVEHAKDADLFICEGMYGEPEKKAKAVE YKHMTFYEAAEMAKKACVKEMWLTHYSPSLVRPEPYMEEVRKIFAAASAGKDGKSTTLLF EEE 
>gi|222441861|gb|ACEP01000081.1| GENE 18 17538 - 18569 786 343 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Cryptobacterium curtum DSM 15641] # 9 342 518 858 860 307 47 8e-83 MEEQKKDVLILAIETSCDETAAAVVKNGREVLSNTIYTQIKLHTMFGGVVPEIASRKHIE KINIVIETALKEADVTLEDIDAIAVTYGPGLVGALLVGVAEAKALAYAAKKPLVGVHHIE GHISANYIEHKELEPPFLCMVASGGHSHLVLVKGYGEYEIIGRTRDDAAGEAFDKVARAI GLGYPGGPKIDKLAKEGNKKAIAFPRAKITDAPDDFSFSGLKSAVLNYLNQCQMKGEEVN RADVAASFQQSVTDVLTEHTVGAAKRLGINKVAIAGGVASNSSLRAQMEEACRKNGIEFY HPSPIFCTDNAAMIGVAGYYEYIKGIRDGWDLNAKPNLKIGER >gi|222441861|gb|ACEP01000081.1| GENE 19 18595 - 19299 358 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 7 231 6 227 234 142 39 3e-33 MKEKVTAIVLAAGKGSRMHSEIQKQYMTLLDRPVITYALEAFEKSSVDEIILVVAPGEIE YAQENILDKYDYKKVSGIITGGAERYDSVYKALCSMPEEGYVLIHDGARAFITPELIEFC IDQVKKDKSCVMGMPVKDTIKIVDEDRYAVSTPPRSTMWQIQTPQCFVTAEIREAYQKMM EAGDDSVTDDGMVMETYGIRGVRMIKGSYENMKITTPEDMLLGEAILKGRSTER >gi|222441861|gb|ACEP01000081.1| GENE 20 19317 - 20915 263 532 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 258 511 258 459 466 105 30 3e-22 MKKENTINQEQEEMVFAEAPAQVIFNEKKEPDDIVEDDAEEIDEEDYEIEDDGDEEDDTP KNKFESCCVCHRTEEQGAKLIHMPTGLSICSDCMQKSFDSFNSGFGGGNGIQFLDLSNMD FNNLKDMNLGDLLSGTMSNQKPKTKKAKKKKKKNKKEEPKLTLKNIPAPHQIKAQLDEYV IGQDYAKKVMSVAVYNHYKRVITNTMDEIEIDKSNMLMIGPTGSGKTHLVKTLARLLNVP LAICDATSLTEAGYIGDDVESVLSKLLANADNDVEKAETGIIFIDEIDKIAKKKNTTSRD VSGESVQQALLKLLEGSEVEVPVGASSKNAMVPTAVMNTRNILFICGGAFPGLEDIIKER MTQHSSIGFHAELKDKYDHEKNMLQYVTVEDIRKFGMIPEFIGRLPIIFTLNALTKEMLV DILKEPKNAILKQYQKLLALDEVDLQFEDGALTAIAEKAMEKDTGARALRAIIEKFMLDI MYEIPKDDNIGRVTITEDYIRNNGAPLIEMRPYLQLEQKKREIAAAEEQKEN >gi|222441861|gb|ACEP01000081.1| GENE 21 21069 - 23030 1720 653 aa, chain + ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 
# Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 651 13 661 665 767 58.0 0 MEKLNLKFLQSLATQYPSVAKACTEIINLQAILNLPKETEHFVSDVHGEYEQFLHILKTG SGAIRNKIEDVFGNTLCARDKKSLAALIYYPEQKVERMRLELSEEDFQDWYKVSLHRLIA VCKESSAKYTRSKLRKSLPADFSYILEELLSENNRVVEKEEYYNEIFDTIIRLNRAKEFI VAIAGVIQKLVVTRLHVIGDIFDRGPGPHIIMDHLMEHSNVDFQWGNHDVVWMGAASGQK ACIANVIRLSARYNNLNVLEDGYGINLVPLASFAIETYANDPCQCFRVHAQEDSDLREVS LNMKMHKAIAIIQFKLEGQVIKRRPEFNMDKRLLLDKIDYEEGTIMIEGKKYELLDKSFP TIDPNNPYELSEAEEALMNRLCMNFLNCDKLQEHIRFLFNKGGLYLCYNSNLLYHGCVPL DEKGNFRKVKIGSKQYSGKELYDVLEYYARKGYYEQDNREEHEYGKDIMWYIWSNENSPV YGKEKMATFERYFIADKEVQIERKDHYYRLIEREDVVNNILKEFGVDIEKGHIINGHMPV HVKEGESPVKCNGKVMIIDGGFSKAYRGTTGIAGYTLIYNSRGLRLVSHETFTSTEDVIE KEIDVHSDTVIREVSAQRERVIDTETGRQLKEQISDLEKLLHAYRTGQILEKE >gi|222441861|gb|ACEP01000081.1| GENE 22 23225 - 24841 471 538 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027524|ref|ZP_03716716.1| ## NR: gi|225027524|ref|ZP_03716716.1| hypothetical protein EUBHAL_01780 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01780 [Eubacterium hallii DSM 3353] # 1 538 1 538 538 949 100.0 0 MNMDRSQKNLEKKKDDRTIEYLELKYDTNIQRGLTLAKVRQKQRGKRSSRNSITKEKKRI WFMCLKQECQDLLFFFFLYLCVVSVLFQFDQRVKTVLFVSFMVFLFGKVEKTFRYSKVLF KKEEIYKTKVVVLRENIFQCVQVNSLVKGDIVRLKRGEKVPLAVISLKLPYKEYKQGEIF AENTGRAIVIQQVVYPFFENVKENTGINRTVKMTENIEDKYVEKNESMKGKISLNTICKQ RKKQGRLYAAEEKEREENGGSYTENMILQEIFIRQGIYFQPDFLKKTVVEGDRPIVAVGF EDIYFPSRFQKEKFQRFMQGLKKTGVKYFFFTNQNRESAFSIGKQTGIVKEKREVIDRKQ FLLLKNVALEKQIKSIRIYCGLSEVEKSQIIAIWRKHQQEYSKKFLQKQGRILFMSGLNQ SGREKITEDCVYACCHSGSQEGDIWFKKNWRDTIVKYLTGQNIWQKFLENTQKWEGQILA SLCIFMLISLLLSLYMPQQENMQKITQTGSLFCAAYIIGKEIVEEGFRKWFLKKLKNN >gi|222441861|gb|ACEP01000081.1| GENE 23 25404 - 27692 3105 762 aa, chain + ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 761 
10 757 761 915 59.0 0 MELLKTGAYLVNGTEIIEDTPEAQARLQAVAGPNAPSKEEAAKGTIAYGILEAHNTSDNM EKLKIRFDKLTSHDITYVGIIQTARASGLEKFPMPYVLTNCHNSLCAVGGTINEDDHMFG LTCAKKYGGIYVPPHQAVIHQFAREMLACGGAMILGSDSHTRYGALGTMAMGEGGPELVK QLLNQTYDINMPGVIGIYMTGEPIKGVGPQDVALAIIGAVFQNGYVNNKVMEFVGPGVEK LSADFRIGVDVMTTETTCLSSIWKTDDVIKDFYDIHGRVEDYKELNPAKVAYYDGMVKID LSKICPMIAMPFHPSNTYTIEDMNKNLMDVLHDCEQKALVSLDGQVDFKLTDKVRDGKLY VEQGIIAGCAGGGFENLCAAADIIKGKSIGADEFTLSVYPASTPIYMELVKNGAAAALME SGAIVKTAFCGPCFGAGDTPSNNSLSIRHSTRNFPNREGSKLQNGQIASVALMDARSIAA TAANKGYLTPATDLDVEYTGRKYHFDSSIYANRVFDSKGVADPSVEIKFGPNIKDWPKMS ALPENMILKVVSEIHDPVTTTDELIPSGETSSYRSNPLGLAEFTLSRKDPAYVGRAKEVR AAQEAVEANSNPVEALAELAPVIDAIHAKYADVTNENIGIGSTIFAVKPGDGSAREQAAS CQKVLGGWANIANEYATKRYRSNLINWGMLPFLIEKGDLPFANGDYIFLPDVRKAVETKA AEFEAYVVKDGSLTPFALKMGELTDDEREIILKGCLINYNRQ >gi|222441861|gb|ACEP01000081.1| GENE 24 28049 - 29782 2249 577 aa, chain + ## HITS:1 COG:MA3792 KEGG:ns NR:ns ## COG: MA3792 COG0028 # Protein_GI_number: 20092588 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Methanosarcina acetivorans str.C2A # 2 576 8 559 564 526 46.0 1e-149 MKVSGNDLFVTAMKKEGVDTIFAYPGGMVVNLLDALHDCHGIDLVLPRHEQGLIHAADGY ARATGKVGVCLVTSGPGATNLVTGIATANYDSIPLVCFTGQVPEHLIGNDAFQEVDIIGV TRNIAKFVIMVRDREMLAQRIKEAFYIARSGKPGPVLVDLPTDVMAELGSTSYPTEVNIR GYKPNKGVHVGQLKKAINLLEEAKKPMFLIGGGVNLAHAQEEMTKLAEKIDAPVITTVMG RGAISTEHPLYIGNIGMHGSYAANKTADECDLMFSIGCRFNDRVTGEIKKFAPNAKIVHI DIESAAISRNVTVDIPIVADAKAAILKILEHTEPMKHEEWIAEVKGWDKEYPLHMEVEEG VNPQRIIETLNEVYADRDTVFTSDVGQHQMWASQYLKLDATHRLVQSGGLGTMGFGLPSA VGAQIGCPEKSVVSISGDGGFQMNMQELATAVCQELPLVNIVFNNNRLGMVRQMQELFFK KRYTITCLRYHKSCKGKCGTPGWVCPEYVPDFVKMAEAYGSKGYFVTSNDQLKDTILEAR EYAEKNKKPVIVECMVAPDELVMPMIKGGASFEDIML >gi|222441861|gb|ACEP01000081.1| GENE 25 29795 - 30313 694 172 aa, chain + ## HITS:1 COG:MJ0161 KEGG:ns NR:ns 
## COG: MJ0161 COG0440 # Protein_GI_number: 15668333 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Methanococcus jannaschii # 4 164 2 161 172 101 34.0 8e-22 MREINTEKKKRWVSLYVENRIGVLARISGMMSGKSYNISSITVGVTEDPTVSRMTIGLSS DDATFEQIKKQLNRCVEVIKLNDITDAPTHMKEILFVKIHECSEFEKGEIFRISQVFGAK AIDYGINSVLLESVNTEGRNNDLIALLTRKFDHVEVVRGGIVAIESISMIDR >gi|222441861|gb|ACEP01000081.1| GENE 26 30742 - 37020 4742 2092 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027528|ref|ZP_03716720.1| ## NR: gi|225027528|ref|ZP_03716720.1| hypothetical protein EUBHAL_01784 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01784 [Eubacterium hallii DSM 3353] # 1 2092 1 2092 2092 3623 100.0 0 MRIFFRRILIFVLVAALFFTNINISVAASGIGKESFYQMKNEQEDTRQIENITTESQEST TAISKETTEEKEKSEQGTTEQHNEAEEKCTEQESESAGTQDKSQQQQEKLTGSELKEGGQ TKKSTERNQEQSTAEQREDEEEESVPVVMSTHPSTNHSVTMPDVTSHYFLLQKNSKKVGS VTGKNLLTVANAQAASKYIYESYSNGRQYNFTPRFNNNSSVSVGGGTGGEVSFTKVKNGT ESEFGTLYPAVSKGRIASKVFNLQKTTVNPYAVYKDVGNWYDYNTKRTYKIDLKITVTGY KFPRAAIRKQLSNQDLKAPYIGFQKNKIGISVMGTDYVQTRMDFYYSGTTTPVSGIKGLI QFCDIDAQQGVDFGTGFEKVLMFQMQQSKLQYNATGLITGSKGYVSSRTSEGFDNNDETT TAVGIFAGSTVNCRWTVAKCDQKDTGGNAAYAVKSGYGIPADSSLADSTSYYWSNSTGLL SIRADVGIGLLPDEVKKAIYHGKINSKNSESAKKFLGLSDRKETFSYVLSAPVPSPSNVT TAKYTSFQFEDKVESLLNVKDVKVYADEAVNNNVQADQMNYTDVSANFNITKTQGTDHAT TVKVAAKNAKLSAAAFYGHSYYVHIEVQMKSDEELHKINKSVADWYQTDNTVTAKVSEAG KCQGSVAVLNKGTLNVANNQGGSVKKESNYVASKAGMIIKVKKTDQNSGQPIEGVKFGLF GGENVSIDKAEPIDTAVTNKEGIAVFKTGTSYSFYKDKYGDGPYCVKEIAIPAVYKNVWK PSLNKEWTYKIPTLKSEALFDLSSNIPLEAQLENTNYQAEKNLIKVYKKSKDTGAYLSGA EFTLLQWSQSKNQYEELFTLEEDVDENKRSVYQNKQEFKNTLDNLGRYKIVEKKAPKGCI LTKQEWTFELSENTDNKENPVIFENLSTGEKQKGSLVYENPLQKGKLLIQKEDDEGQPVS GAVFSVTAAQDIYAPWDVKEDGTPASDAKPLVTKGEVVDKITTGEDGKGESNKSLYIGKY IVEEIKGAWNHIKGDTIYEAELHYGLDSSKELISYHLKVNNLLMHPSLAVSKLADKTTNE EKKEVSFDEKTGRYIEKKVTGRYKAGEFVDYTIRVTNTGNVSLYNIKVTDDMDCKGEFAG QTLSKYADMETAAFVLPESGYFVTKSGDKVYAQMSASSNLVLNLHHLAIADSVEMHVKVK 
LKEAVKDVWKLKNEVRVQAQYDDNGQEDDADKPHLVDVPVKDLVDGENNSLVVDWDYINI PGIPEEKVIKTADRTTGIQIENGEITSGSKIPGIYNAKEKVKFSIVVKNSGEAALKRITV KDALSDELKAVSDMESAGFVFDDATVDKDSFYVLTTAKGKKITAKVIDKNTVILCNTGED GSGTDRLFADDYITLNYSVNLLPGTANLYDLSNKVYVNGWYFNGNEDEEVPGEDDEDIIE VPGIPEGRTAKLADKTTGVVLKEGRYDAGAKISGVYENGNEVTYKITVSNTGSANLYDLR LTDVLSKELEEALEKDSVSFIEQVYTSKDNRKVRTNLEESQKLWMDFLAAGDAVDVYLKG KVRIDVGNLFSLENIVNLTARYKKGDEKARKEQEETEKEEASTEKKEETSKDDAAEKEED KKDDQKDDQEGTKVPKKELEPLSKESKKAIEEAYEAIQKLTVEELKEESKQYAEIPVTEL MQDEDYINIPGTPLAKIAKLADKTQGVTLVKGRYEGQKEEGIYEYGDTVDYTITLTNAGT ADLYNLLVDDTLDKKLLSVLKSDSITITTGQVTTKMGDTIQVRTAEKDEDIGKDSESITK QSESRSVVLDHLKSGDSVAIHLKGIIQSGANRDTGLNNMVHLVAQYKTVNENGENEENYV EDTPEMTDNDTVGIGTPDILAAKKADKTKNVILENGRYTGKKKYGTYKEGEEVQFILTVT NVGNGTANHVKVTEEPSAELKKYVQMKGFANKAGDVIRTKKGKELHIETSKKRELCLKKI EAEDAVELIYIGKVKKDIPSIKFLKNEVVITGQNKDGSKIPTTSKMSDYDKINLKEKEIK NQNPKKNRETGTKGNGAKTGDNTPVLMYSFLSVAAILALSVLISYKNKKRKN >gi|222441861|gb|ACEP01000081.1| GENE 27 37429 - 38229 941 266 aa, chain - ## HITS:1 COG:MA0415 KEGG:ns NR:ns ## COG: MA0415 COG1028 # Protein_GI_number: 20089308 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Methanosarcina acetivorans str.C2A # 6 261 11 252 256 121 33.0 1e-27 MGFLKGKTAIITGGGRAVLSDGSCGSIGYGIATAYAKEGANLVITGRNLQKLEDAKEELE RLYGIQVLPVQADVNAGTDNEAAVANVVKQAIDTFGQIDVLINNAQASASGVSLADHTTE QFNLALYSGLYAVFYYMKACYPYLKETKGSVINFASGAGLFGNYGQCAYAAAKEGIRGLS RVAATEWGADGINVNVVCPLAWTAQLENFEKAYPEAFKANVKMPPAGHYGDVEKEIGRAC VQLASPDFKYMNGETLTLEGGMGQRP >gi|222441861|gb|ACEP01000081.1| GENE 28 38917 - 39453 534 178 aa, chain + ## HITS:1 COG:YPO0455 KEGG:ns NR:ns ## COG: YPO0455 COG0406 # Protein_GI_number: 16120784 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Yersinia pestis # 6 178 5 184 215 75 29.0 5e-14 
MAKYFYFTRHGQTVWNVENKICGATDIALTDLGHQQAAELGERILKEGIKIDEILYSPLM RAADTAKHISEVTGIPAREELRLKEQNFGKYESTPRNGEEFKRAKQDFVCRYEGGESMLQ LCQRIYNLLDDIKKEDKVYLLVAHNGISRAIESYVRDMSNEEFAGFGIKNCELRKYEF >gi|222441861|gb|ACEP01000081.1| GENE 29 39627 - 40481 640 284 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027532|ref|ZP_03716724.1| ## NR: gi|225027532|ref|ZP_03716724.1| hypothetical protein EUBHAL_01789 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01789 [Eubacterium hallii DSM 3353] # 1 284 1 284 284 477 100.0 1e-133 MKNTIFKSFKFTFSLMLTVMLCLSFSTITNAASVKDIPVKKTISADITGDGKADKILITT TMDDDFRVKKLKVTVNKKTAFSKNCTDSGINYITAKYAKMTNKNEFIQLMGIGDNDYIVF NQIYKYNNTSKKLYSVSKLDNTACEIVSAKKNRLTLHHGEQPAETGWLTWKMSYKFRNNK LVLTNATTSTVKSTIGYSRKDSYSKLFRKNIFVTAKKLRFYNGKKLAFTVPKGKQVTLKK LTLSKGNIYLQFQYGKKTGWISVNNKNYDFESPYFKKVNSRLAG >gi|222441861|gb|ACEP01000081.1| GENE 30 40972 - 42555 2050 527 aa, chain - ## HITS:1 COG:TP0475 KEGG:ns NR:ns ## COG: TP0475 COG0166 # Protein_GI_number: 15639466 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Treponema pallidum # 4 527 3 529 535 622 59.0 1e-178 MATWNNLDTLASYEKLAGLKNHVDIKEAMAGENGAERVAKYTAPMAEGLSFNYAAKQVDD TVLTALTELAEEAQLAEKFEELYNGAVINTGEKRLVLHHLARRQLGNDVVVDGVNKREFY VSQQEKAADFANKVHAGEITNAAGEKFTTVVQIGIGGSDLGPRALYIALENWAKENGVAK MEAKFISNVDPDDAAAILKSTDLAHALFIVVSKSGTTLETLTNEAFVKDALTKAGLNPAN HMLAVTSETSPLAKSDDYLEAFFMDDYIGGRYSSSSAVGGVVLSLAFGPEIFARILNGAA EEDKLAANKNILENPDMLDALIGVYERNVQGYPSTAVLPYSQALSRFPAHLQQCDMESNG KSVNRFGEPVNYVTGPTIFGEPGTNGQHSFYQLLHQGTDIVPLQFVGFKNSQIGMDVVIE DSTSQQKLCANVAAQIVAFACGKKDENLNKNFEGGRPSSIIIGDQLTPESLGALLAHFEN KIMFQGFVWNVNSFDQEGVQLGKVLAKRVLAHDTDGALKAYSDLLNI Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:44:47 2011 Seq name: gi|222441860|gb|ACEP01000082.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont89.1, whole genome shotgun sequence Length of sequence - 6636 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 2, operones - 2 
average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 50 - 577 472 ## gi|225027535|ref|ZP_03716727.1| hypothetical protein EUBHAL_01792 2 1 Op 2 1/0.000 + CDS 564 - 1178 718 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain + Prom 1180 - 1239 6.7 3 1 Op 3 4/0.000 + CDS 1265 - 2062 1072 ## COG0561 Predicted hydrolases of the HAD superfamily 4 1 Op 4 . + CDS 2117 - 2905 888 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 3194 - 3253 5.0 5 2 Op 1 36/0.000 + CDS 3376 - 4074 342 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 6 2 Op 2 . + CDS 4087 - 6438 1682 ## COG0577 ABC-type antimicrobial peptide transport system, permease component Predicted protein(s) >gi|222441860|gb|ACEP01000082.1| GENE 1 50 - 577 472 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027535|ref|ZP_03716727.1| ## NR: gi|225027535|ref|ZP_03716727.1| hypothetical protein EUBHAL_01792 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01792 [Eubacterium hallii DSM 3353] # 1 175 17 191 191 265 99.0 1e-69 MTGQIHTLKSNARAIGAMSLYELAASLEIRGKMQDESYIEKAIPLLFFEWERALDGMFFF LENTKKFVLESEKKEKENGKQKVYGELTEEVNQEKLYEETYDGILEAIGKYQGKYAEEQL EKLLSLEKNPKKKKILGQVQKSVQNLEFEQAESLMKGTYWTNKVTTEWEDRNGQK >gi|222441860|gb|ACEP01000082.1| GENE 2 564 - 1178 718 204 aa, chain + ## HITS:1 COG:slr1909 KEGG:ns NR:ns ## COG: slr1909 COG2197 # Protein_GI_number: 16329722 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Synechocystis # 6 203 10 208 213 115 37.0 7e-26 MGKNSILIVDDDLILLKTAEEILSSIYTVSVAKSGKQAITLLKKGVVPDLILLDIAMPEM DGYTTLEKMKSIPNCQDIPVIFLTGFAESDYEVKGLKAGAVDYIVKPFVKEVLLARVEMH LERSAKETGKSFADAAIEKNRGEIQKKLTPIEWKIALAVAQGMENREIAEQMNYSYGYVK NVIARIFSKLEIERRRELRKLLIG >gi|222441860|gb|ACEP01000082.1| GENE 3 1265 - 2062 1072 265 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns 
## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 10 265 3 256 256 151 35.0 1e-36 MKKDSAKDIKIAFFDVDGTLVDMEKKVITPAMIETLKHLKENGIILCMATGRGPYLVPSF PGIDFDVFLSYNASYCYTKDEVIFSNPIPKEDVKAIVENASEIHRPVFLAGVEGGGANGC DKDLADYFAIAKSKVNILDNFDFEKLMDKKVYQMMVGCYKEEYADILRDVDGARITAWWT RAADITPANGNKGVGVRKILDYYHLDKENAIAFGDGTNDIEMLEAVGTGVAMGNATDDVK AVADAICGHVAEDGIYHYCKEQGLI >gi|222441860|gb|ACEP01000082.1| GENE 4 2117 - 2905 888 262 aa, chain + ## HITS:1 COG:CAC0522 KEGG:ns NR:ns ## COG: CAC0522 COG0561 # Protein_GI_number: 15893812 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 3 260 2 257 265 156 37.0 3e-38 MNIKLVAVDMDGTLLNSKKEMPEEFIPWVKSHPEIKTVIASGRQYYTLERDFLPIRDNLA FIAENGGLVFEKGEIIFKDEIDKQDVLKCLDIIDKVPYATPVICGAKSAYITKKDVDEEI YYNVEMYYERRQIVEDLRKVADNDTIVKVAIYFKKEKAEESMKYFENIGEKLTAVLSGAS WIDIANKTESKGNAIKAICEKYGIAHTEAMGFGDYLNDMSLLESCGESYCMKNGHPTLKA LAKYVTEKTNDENGVMEVLKTL >gi|222441860|gb|ACEP01000082.1| GENE 5 3376 - 4074 342 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 215 2 217 245 136 38 5e-32 MFLEIKQIKKSFGMGDSRIEVLKGIDLEIERGEFCVLLGPSGSGKSTLLNIIGGIDGADS GSITIDGERIEDMTEKKLSLYRRKHLGYIFQMYNLIPNLTVRENIEVGAYLSDKPLDVDE LLHTLGIYEHQRKLPNQLSGGQQQRTAIGRAIVKNPDILLCDEPTGALDYNTSKDILRLI ETVNQKYGNTIVMVTHNDAIKDMADRVVKLRDGMIRKNYTNEQKVPAMELEW >gi|222441860|gb|ACEP01000082.1| GENE 6 4087 - 6438 1682 783 aa, chain + ## HITS:1 COG:CAC1534 KEGG:ns NR:ns ## COG: CAC1534 COG0577 # Protein_GI_number: 15894812 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 5 783 3 745 746 216 24.0 2e-55 MKNPLRKRLLRELRQEMGKYLVVFILMVATIGFVSGFLVADGSMLTAYDNSFEKYNIEDG 
NFATEEKVYKNQQETIEGYGVKLYENFYVEKALSNGSTLRIFKNREEVNKVCLMKGKMPK EIGEIAIDRMYADNNKLSVGDTLKSGKKTWKITGLVALSDYSALFQNNNDTMFDAIKFGV GIVTKEEFSSFNESQLTYDYAWKYNKKPKNEKEEKKRSEDFMEDIGKDITLESFIPQYVN QAIHFTGDDMGSDEAMIIVLLYIVMVIMAFVFGITTSNTIRKEAGVIGTLRASGYTKNEL IRHYMSMPVFVTLIGAVVGNILGYTIFKNVCAGMYYGSYSLPTYVTVWNAKAFLLTTVVP VLIMLVVNYGILRSKLKLPPLKFLRRDLSRKKQKRALRLSSRINIFSRFRLRVIFQNFSN YIVLFVGIVFANLLLFFGLLLPSVLDHYQTDIQNNMLAKYQYMLSVPTSAMSGNKISSLF SLLEYSLGAKTENEDAEEFSAYSLNTTPKTFKSEEVTLYGVKSDSKYVKINLKDEKADLQ ADSKVTADAYISSAYADKFSLQPGDMITLKEKYEKDKYTFRVKGIYEYNGGLCIFMNQNQ LNRIFDLGSDYYSGYFSDTKITDISSKYIGSVIDLDALTKISRQLDVSMGNMMGLINGFA IAIFLVLIYLLSKIIIEKNAQSISMVKILGYSNGEISRLYILSTSLVVVLCLLLSFPVET ILMKGIFREMMLQSLSGWIALYIDPKIYVKMFVIGLVTYAVVALLEYRKIQKVPMDEALK NVE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:45:02 2011 Seq name: gi|222441859|gb|ACEP01000083.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont90.1, whole genome shotgun sequence Length of sequence - 21413 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 28 - 87 6.5 1 1 Op 1 3/0.000 + CDS 135 - 1619 1123 ## COG2199 FOG: GGDEF domain + Term 1709 - 1744 -0.1 + Prom 1680 - 1739 3.8 2 1 Op 2 . + CDS 1780 - 2337 332 ## COG1309 Transcriptional regulator + Term 2432 - 2466 4.0 + Prom 2452 - 2511 6.9 3 2 Tu 1 . + CDS 2571 - 4232 1641 ## COG1757 Na+/H+ antiporter + Term 4302 - 4358 14.9 - Term 4298 - 4338 6.4 4 3 Tu 1 . - CDS 4365 - 5123 650 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 5224 - 5283 6.8 + Prom 5221 - 5280 7.7 5 4 Tu 1 . 
+ CDS 5358 - 6149 709 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 6159 - 6210 6.5 + Prom 6235 - 6294 6.1 6 5 Op 1 16/0.000 + CDS 6323 - 7618 1861 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 7828 - 7887 4.0 7 5 Op 2 10/0.000 + CDS 7907 - 9415 174 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 8 5 Op 3 . + CDS 9434 - 11083 1770 ## COG4211 ABC-type glucose/galactose transport system, permease component + Term 11120 - 11180 7.4 + Prom 11148 - 11207 6.3 9 6 Op 1 7/0.000 + CDS 11316 - 12914 1340 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 12940 - 12999 2.1 10 6 Op 2 . + CDS 13030 - 14790 1280 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 14899 - 14958 10.1 11 7 Op 1 . + CDS 14985 - 15125 63 ## gi|225027551|ref|ZP_03716743.1| hypothetical protein EUBHAL_01808 12 7 Op 2 . + CDS 15135 - 15317 193 ## gi|225027552|ref|ZP_03716744.1| hypothetical protein EUBHAL_01809 13 7 Op 3 . + CDS 15310 - 15723 259 ## COG1943 Transposase and inactivated derivatives 14 7 Op 4 . + CDS 15713 - 17326 814 ## GWCH70_0818 IS transposase + Prom 17422 - 17481 4.6 15 8 Tu 1 . + CDS 17548 - 18522 643 ## EUBREC_0483 hypothetical protein + Prom 18536 - 18595 3.2 16 9 Tu 1 . + CDS 18641 - 19642 947 ## COG1879 ABC-type sugar transport system, periplasmic component - Term 20061 - 20108 7.1 17 10 Tu 1 . 
- CDS 20127 - 21143 454 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase - Prom 21224 - 21283 9.5 Predicted protein(s) >gi|222441859|gb|ACEP01000083.1| GENE 1 135 - 1619 1123 494 aa, chain + ## HITS:1 COG:PA4929_2 KEGG:ns NR:ns ## COG: PA4929_2 COG2199 # Protein_GI_number: 15600122 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Pseudomonas aeruginosa # 317 493 3 179 196 111 37.0 3e-24 MKKKRIEQNEIKRYFLIIAAFLSFMILSLLFFLSWLQQDVEVSSRKTVATNVERQSYHLK SVLDIQFNYLTGMASYMGEGEDLFCQENKRLIQSIVKEQDLDLIGIMKADGDTYYNNGAR KNVKNRGYFKKVMQGQKIMSPPIESKVDGRTKVILAVPIYRNGKVTGALGASYNVGALNQ MLFNDIYDGIGFSMIIDTSGKIISCDSGLSYRKIGINDNVYDFYTKSGEVAEKTMKSAKA NIKKQKSGILVLKEKNAMSYLAYEPLHINNWMLCYLVPQSKAQEEFQFIGNKEVILFAVI GVGVITFFILIFQRTSMKQKKMREIASRDGLTQLYNKAGTEQRIKEWLHSDRSKAGAVLM MMDVDYFKQINDTYGHAVGDRVLYKVGHLLKSSFRENDIVGRIGGDEFLIFMEGITSEEF AVRRMENLQQYLRELPIEELEEHTLTCSMGAAFMPKDGTTFGELYKHADEALYTTKRSGR DGFRIYHEVEEKEV >gi|222441859|gb|ACEP01000083.1| GENE 2 1780 - 2337 332 185 aa, chain + ## HITS:1 COG:lin2279 KEGG:ns NR:ns ## COG: lin2279 COG1309 # Protein_GI_number: 16801343 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 6 178 9 187 187 63 25.0 2e-10 MPPFAKREIKNSFIKLLTERPISQITVKDIVEDCGVNRNSFYYHFQDIPSLLEEIIVEMT AKVIENLPEESTFEEKVTAALEEINLNKRMIYHIYGSSNREFYEKQLMKICDYVTRTYIR SRDYSEKVASKDLEFVISYLKCELFGQLIDWLNHDMSYDIVEHSRILCRMFAGSMRMVCQ KYKLI >gi|222441859|gb|ACEP01000083.1| GENE 3 2571 - 4232 1641 553 aa, chain + ## HITS:1 COG:VC1131 KEGG:ns NR:ns ## COG: VC1131 COG1757 # Protein_GI_number: 15641144 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Vibrio cholerae # 44 540 8 522 533 308 39.0 2e-83 MKMKVKRKQAVMLGLIAVFAVLLAITMLTGKPSADGEYTSVVYATALSLLPPVIAIGLAL FTKEVYTSLLAGILAGALLYANGNLELMLNTMLFNEDGGMVSKLSDSGNVGILVFLVMLG ILVSLLNKAGGSAAFGKWASKHIKTRIGAQISVMILGVLIFVDDYFNCLTVGSVMRPVTD RHKVSRAKLSYIIDATAAPVCIIAPISSWAAAVTSSVPADSGINGFAVFIQTIPYNLYAI 
LTLVMLLTITVLRVDFGPMKHHEMNAIAGDLFTTPGRPYEDNEEEIIKENSHVLDLVLPV VVLIASCIVAMIYTGGFFTGASFVDAFANSSASVGLVLGGSVTLAFTFVYYMLRDVLSFR EFTECIPDGFKSMIAPIMILTLAWTLSGVTNLLGAKVFVADMVQHFAQGLQGFLPMVIFV VAAFLAFATGTSWGTFSILIPIVIGVFPSGQMMAISIASCLAGAVCGDHCSPISDTTIMA SAGGHCEHVNHVATQLPYVGLVAAVCMIGYFIIGVLQAVGLTQFSLLALPVCIVLLVAIL FVIREKTGGKEEI >gi|222441859|gb|ACEP01000083.1| GENE 4 4365 - 5123 650 252 aa, chain - ## HITS:1 COG:BH0847 KEGG:ns NR:ns ## COG: BH0847 COG1349 # Protein_GI_number: 15613410 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 4 251 6 251 259 132 35.0 7e-31 MIPDERRKEILELLDEKGYVSVKDLSQRLYVSLPTIRRDLTLLEREGYVLRTHGGASLSI SDSFVEPFALRKKTNLEAKKYIGKIAASLIHNNDTLFITSSSTCLEFANHIKPDLRLNIL TNGMPLAHKLSENNNVTVECPAGIYNYFHEGIYGKEVKELISKRYAQYCFTSCNGIDLIN GLTFSTDLDMTLIRACRDNCDQLIVLADHTKFGMTYYFKTLSIKEIDVIITDQEPAEKWI TYCEENGITLIY >gi|222441859|gb|ACEP01000083.1| GENE 5 5358 - 6149 709 263 aa, chain + ## HITS:1 COG:SPy0567 KEGG:ns NR:ns ## COG: SPy0567 COG0561 # Protein_GI_number: 15674657 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Streptococcus pyogenes M1 GAS # 1 260 1 262 265 119 32.0 6e-27 MVKLIMTDIDGTLIPDGTMDINPEYFEVIEKLVEKGIIFVVASGRHMSSVKKVFAPVLDK IWVASQNGNVLTYHGKSRIIKSIPQEWGREMWRQFSKLKGVEGVLDTATEMYCPFEETSM YKILADEYHFNVTGTGGWNQVPEEDFSMMTLYHPQSAENICKELVEDKWKGKLEFLTSGK YWVDIVMPEVGKGTALEEICRQLGIAPEETIAFGDNLNDISMIQSAGKGYAVNTAREETK KAADEVIPGYAENGVLEVLKTFL >gi|222441859|gb|ACEP01000083.1| GENE 6 6323 - 7618 1861 431 aa, chain + ## HITS:1 COG:TP0684 KEGG:ns NR:ns ## COG: TP0684 COG1879 # Protein_GI_number: 15639671 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Treponema pallidum # 40 426 27 398 403 291 43.0 1e-78 MRKKVVAMMMAAALVVSTAACGSSGSSSSSKSESNTGSQAAATSEVANKDKPLVWFNRQP 
SNSSTGELDKTALNFNKDTYYVGFDANQGAEVQGQMVKDYIEKNIDKIDRNGDGVIGYVL AIGDIGHNDSIARTRGVRKALGTGVEKDGEINSAPTGTNTDGKAAEVQDAELEVNGKKYV VRELASQEMKNSAGATWDAATAGNAIGTWSSSFGDSIDVVVSNNDGMGMSMFNAWSKDNK VPTFGYDANSDAVAAIAEGYGGTVSQHADVQAYLTLRVLRNALDGVDVDTGIGTADEAGN VLSEDVYKYSEEERSYYALNAAVTADNYKDFTDSTVVWKPVSNQLDSSKHPTKKVWLNIY NASDNFLSSTYQPLLQNYDDLLNLDVEYIGGDGQTESNVTNRLGNPNQYDAFAINMVKTD NAASYTAILNQ >gi|222441859|gb|ACEP01000083.1| GENE 7 7907 - 9415 174 502 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 276 480 17 217 245 71 25 4e-12 MEDKREDIILSIRGMSKSFGRNRVLDHINLDVKRGTVMGLMGENGAGKSTMMKCLFGTYQ KDEGNIYLDGKEVNFSGPKDALENGIAMVHQELNQCLERNVVDNLFLGRYPVNSLGVIDE KRMKNEASKLFRKLGMTVNLEQPMRNMSVSQRQMCEIAKAISYNSKVIVLDEPTSSLTVQ EVDKLFEMMRMLRDQGISLIYISHKMDEIFEICDEISVLRDGNLVMTKSAKETNMNELIT AMVGRVLENRFPPVDNQPGDVVLSVQHLSTKFEPYLQDVSFDVHEGEIFGLYGLVGAGRT ELLETIFGIRTRAAGRVYLNNKLMNFSCAKEAMDYGFALITEERKANGLFLKGDLTFNTT IANLNAYKKGVAISDPKMVKATSNEIKIMHTKCMGPGDMISSLSGGNQQKVIFGKWLERG PQVFMMDEPTRGIDVGAKYEIYELIIQMAKQGKTIIVVSSEMPEILGITNRIGVMSNGHL SGIVNTKETNQEELLRLSAKYL >gi|222441859|gb|ACEP01000083.1| GENE 8 9434 - 11083 1770 549 aa, chain + ## HITS:1 COG:TP0686 KEGG:ns NR:ns ## COG: TP0686 COG4211 # Protein_GI_number: 15639673 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Treponema pallidum # 28 549 30 531 531 397 41.0 1e-110 MANGAILTAEQERKLRQPIDEYVGKIQKEIDELREHGTAEVIEYQNLIANVKRDKTLSKG EKESEIKEFEAKLSQAKAVEAQNKDKVAKLISDAESYLKENFEKLYYNAVKESCEAEKAK ALEDHKQRLAQLEKEHKEALAGMSDQVEIKEENYVHKNRISNEKLELEKEKQRIKDRKHD AFTYKYHLIDLLRLSEFTFAEEVAQKWENYKYTFNRRSFLLQNGLYIAIILIFVALCVIT PIKKGTPLLTYNNVLNILQQASPRMFLALGVAGLILLTGTDLSVGRMVGMGMTAATIIMH QGINTGTVFGHTFDFTNIPVGGRVVLALVVCIVLCTVFTSIAGFFTAKFKMHPFISTMAN MLVIFGLVTYATKGVSFGSIEATIPNMIIPKVNGFPTIILWALAAIIIVWFIWNKTTFGK NLYAVGGNPEAAAVSGISVFAVTLGAFIMAGILYGFGSWLECARMVGSGSAAYGQGWDMD 
AIAACVVGGVSFTGGIGKISGVVTGVCIFTALTYSLTILGIDTNLQFVFSGIIILVAVTL DCLKYVQKK >gi|222441859|gb|ACEP01000083.1| GENE 9 11316 - 12914 1340 532 aa, chain + ## HITS:1 COG:BH1123 KEGG:ns NR:ns ## COG: BH1123 COG4753 # Protein_GI_number: 15613686 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 522 1 519 526 158 27.0 3e-38 MYTLLLVDDEEEVTRIIAKKVKWNDLGFSVTGHANNGLKALEMLEEMQPDVVMTDIRMPY MDGLELCGQIREKYPATKLVLFTGFDDFEYAKEAVHLEIEEYVLKPLNAAEITKVFEDLK KKLDQEINEKKSAELLKKYYKDSLPLLQANFYSTLIEGRVPEEELTKYLANYQIDFTGPL YGCVLIHASKTQVPKDMDPQLVAISVDKQAGERLEQKWKAKRFNYLGDTVMIVQLESEKE VSELTDSCDRFCKYVHHMIGAVVTVGVGQICDNISDLVKSYQSAREAVSYRVLYGSNQAI NLKEIVPTRKVATEAADGTELAYLFKMICLGKKPDVKAAVERYMERKIIPLKSLERYHVA VMELISELYHFMVNNELDIQKLSGGTGQLYTLLSNMEARILEKWLLNICLVLHEDMANAR DHSKKSLIDQAKEYVHNNYSNEALNLDDICSELGVSNSYFSSIFKKETGQSFVAYLTGYR MEKAARMLVETSEKSYMIAKSVGYTDPNYFSYVFKRQYGVSPSKYRTEYESN >gi|222441859|gb|ACEP01000083.1| GENE 10 13030 - 14790 1280 586 aa, chain + ## HITS:1 COG:BH3447 KEGG:ns NR:ns ## COG: BH3447 COG2972 # Protein_GI_number: 15616009 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 28 575 44 601 602 220 28.0 8e-57 MIAFSVVSAVILLCMGSVMYLRFSSMSQKEILENNQRLMDQTVESVEDYLINMRQVSDAL YYNIIKESDMSSESDKMHNGMNLLYEANKENLRSIAIYNQYGSLLEAEPVVAQKEDPNVT KQDWFIQAMNQMENIHFSTPHVQNLFDDGTQQYYWVISSSRVVELTDGTNTQLGVLLVDM DYSGISRMMERINTTDSGQYFYLCDSNGQIIYHPHQVQLDNGMKKESSKKAARAKESVYE ERINGEHREIVVDTIGYTGWKLVCVMPYSIFSNKMLDVKHFVLILILLMAMMLTLINRLV SVRISRPIMKLNNSVTEYEEGKEPEIYIGGSREIRHLGRSIQDSYKQNNALMQKIVWEQT ERRKSELEVLQSQINPHFLYNTLDSITWMIEGERNDDAVFMISQLARLFRISLSKGHTII SIRDELQHAQSYMNIQKVRYKNKFQITFDVDSDILDCCIVKLILQPILENAINYGVREMD DCGEIIVQGRKEEDEILFTIADNGMGIPEEEIEFLLTDTQRVHKKGSGVGLVNVNNRIKI LFGEKYGLHIESELDEGTTVSIRIPAIIYSEENRKKFENTSKSGCS >gi|222441859|gb|ACEP01000083.1| GENE 11 14985 - 15125 
63 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027551|ref|ZP_03716743.1| ## NR: gi|225027551|ref|ZP_03716743.1| hypothetical protein EUBHAL_01808 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01808 [Eubacterium hallii DSM 3353] # 1 46 1 46 46 63 100.0 5e-09 MYECLKDCLNCANSFSEPSDNGNDILHCMVHDEEIVEEEYCCEDYN >gi|222441859|gb|ACEP01000083.1| GENE 12 15135 - 15317 193 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027552|ref|ZP_03716744.1| ## NR: gi|225027552|ref|ZP_03716744.1| hypothetical protein EUBHAL_01809 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01809 [Eubacterium hallii DSM 3353] # 1 60 1 60 60 102 100.0 8e-21 MARLKSNNVQVNISIPSEWKAELENLARIYSVEEGETITFLDLMRRGIQEKYQLGEQAHE >gi|222441859|gb|ACEP01000083.1| GENE 13 15310 - 15723 259 137 aa, chain + ## HITS:1 COG:SSO1474 KEGG:ns NR:ns ## COG: SSO1474 COG1943 # Protein_GI_number: 15898306 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Sulfolobus solfataricus # 5 133 2 125 133 106 41.0 1e-23 MSEEQYHSASHCKYLIQYHIIWCPKFRFSVLKGNVEDTLKQILQKICDDYNYHIKALEVM PDHIHIFIDVPQTVAPCDVARTLKSISAVELFRVFPRLKQFYAKCGVLWSRGYFVSTVGH ISEATVIKYIEEQKNDK >gi|222441859|gb|ACEP01000083.1| GENE 14 15713 - 17326 814 537 aa, chain + ## HITS:1 COG:no KEGG:GWCH70_0818 NR:ns ## KEGG: GWCH70_0818 # Name: not_defined # Def: IS transposase # Organism: Geobacillus_WCH70 # Pathway: not_defined # 66 532 45 484 487 201 32.0 8e-50 MISDEEYKKLLKQFHKLSDRHILVVETDMPYSDVLKVMILSDKIRKAGNELVSFMRKNYD QLIRTKKYRKLLKLYGSTKDKKKRKDLADQLNEMQKAYNVTWDHCRRSMIPIGRKYGIDA VFALTKAEDVWRGIEKCLYGNGKAIHFSKYGELPCIRAKQINRGIPISVKDDRLQFKLGK TAFGIQVNDRFQQDEADAVLTYLAEPEIVDNKAVNTLIEDAYYIDTYRPCYATLVIKMIR GKYRVYVHLTIEGKTKLKYDKHGNPRHKYGKGVIGADIGTQTVAYTSDTEVGLKNLSERG NSIQTSERKERLLYRAMDRSRRAANPQNYNEDGTVKKGRKTWKYSNHYKKLKAKHSELCR INAVNRQLAINEDANHLRSLGDTFVTEPKNASKLMKRAKETAVNDKGRFNKKKRFGKSIK NRCPSGFQTTVEKKFKTTGGTYIEVPNNYRASQYDHTVDDYIKKKLSDRLYKLQDGTEVQ 
RDWYSSFLLYCYDYKIQKIDKNKCITQFDNCYKKEKALITWIKAHKIKVLNSGIKIA >gi|222441859|gb|ACEP01000083.1| GENE 15 17548 - 18522 643 324 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0483 NR:ns ## KEGG: EUBREC_0483 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: ABC transporters [PATH:ere02010]; Bacterial chemotaxis [PATH:ere02030] # 1 324 24 351 351 284 50.0 4e-75 MKNGKRSFILTEIILGMLVVILAFFMLYNKNEDKTERIAVIIPDIEESQWSAFKYGLKMA SQEYGVNTILISKSNINLSEDEMEVVKQEIDKGADAIIVKMTGNSNEYSQLKKIQKKVPI MLAGESLAETKKQSDIPVTEPDQYEMGVALVQKLLEKNNGNLSGKKIGIFLENSNSEADL NRREGVCDTLKGTGAEIVWTVSRDEEIKRNTTLQSQRKVDIVLALDDTSFVEAGESVKQN NLYGAIVYGIGNSTEAVYYLDARWAECLIVPDEFTAGYQCVAETVNALRKTFYQMKNQEV PYTVLTRDNLFSKENQDLLFTLSK >gi|222441859|gb|ACEP01000083.1| GENE 16 18641 - 19642 947 333 aa, chain + ## HITS:1 COG:YPO1507 KEGG:ns NR:ns ## COG: YPO1507 COG1879 # Protein_GI_number: 16121780 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 26 333 32 335 335 196 37.0 6e-50 MVIFAGTLIFYTCGKSQPQTDNQIYIGVACYDQKDTFIEELVDAFKEQCSSMESKDYAIS MTIMDASNSQRVQNDQIEQMINDGCNVLCVNLADRTEPSEIIDAAKEKDIPIIFFNREPV EEDMRRWDKLYYVGGKAKQSGELQGELAADFIKVNPQADRNNDGKIQYVILEGEMGHQDA IIRTESVVESLKANDISLEKLSYQIANWNRAQAQTRMMQLIGQYKTNIEMVIANSDSMAL GAIDAYEKLGYTESNIPVFFGIDGTDEGLEAVQKGKIAATVYNDKEGQAKAMAELVIAAV TGKEMEDIQFENNRCIYLPYKKVTKENVETMIK >gi|222441859|gb|ACEP01000083.1| GENE 17 20127 - 21143 454 338 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 25 333 29 340 345 179 35 1e-44 MVKVELLKELDNGIRFYSMENESLKIVVVNMGCRIMEIHTPDASGNKSDVILGLKNIEDY ANDPAYFGAIIGRVANRIGDAHFTLNGKTYELYANNGKNHLHGGKEGFDKKIFDVTPLEN GLRFHYFSKDGEEGYPGNLNLYVTYTLNENAFSIHYEADTDADTVVNFTNHLYFNLSNTL DKVNDHYLQINSDYIGCVDDTCLATGELLPVKDTPFDFNKRKRIGQDLEADHIQLKNAFG YDHSFVLNGTENQLTLEEPVSGRRLIISTTAPVVQVYTGNFLKDGCIGKAGRVYENREGV ALETQLMPNAINTEENPSTILRKGDKFESNTKYVFEIM Prediction of potential genes in 
microbial genomes Time: Fri Jul 8 07:45:38 2011 Seq name: gi|222441858|gb|ACEP01000084.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont91.1, whole genome shotgun sequence Length of sequence - 43533 bp Number of predicted genes - 37, with homology - 36 Number of transcription units - 22, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 48 - 107 4.6 1 1 Op 1 . + CDS 174 - 581 170 ## CKR_2633 hypothetical protein 2 1 Op 2 . + CDS 679 - 1182 238 ## LSL_1338 DNA-binding protein 3 1 Op 3 . + CDS 1242 - 1379 70 ## + Prom 1431 - 1490 8.3 4 2 Op 1 . + CDS 1517 - 1786 320 ## Acfer_0671 hypothetical protein 5 2 Op 2 . + CDS 1770 - 2048 151 ## LGAS_0243 hypothetical protein + Prom 2096 - 2155 13.1 6 3 Tu 1 . + CDS 2196 - 2603 342 ## gi|225027564|ref|ZP_03716756.1| hypothetical protein EUBHAL_01821 + Term 2645 - 2711 -0.9 7 4 Tu 1 . - CDS 2869 - 3177 213 ## gi|225027565|ref|ZP_03716757.1| hypothetical protein EUBHAL_01822 - Prom 3303 - 3362 4.9 + Prom 3131 - 3190 3.6 8 5 Tu 1 . + CDS 3416 - 6559 2683 ## COG3291 FOG: PKD repeat - Term 6550 - 6586 2.1 9 6 Tu 1 . - CDS 6765 - 7613 647 ## Selsp_1548 hypothetical protein - Prom 7664 - 7723 4.8 10 7 Tu 1 . - CDS 7935 - 9164 1763 ## COG0137 Argininosuccinate synthase - Prom 9211 - 9270 8.1 + Prom 9250 - 9309 11.5 11 8 Op 1 11/0.000 + CDS 9438 - 10478 1206 ## COG0002 Acetylglutamate semialdehyde dehydrogenase + Prom 10491 - 10550 5.1 12 8 Op 2 10/0.000 + CDS 10593 - 11819 1906 ## COG1364 N-acetylglutamate synthase (N-acetylornithine aminotransferase) + Prom 11857 - 11916 5.4 13 8 Op 3 13/0.000 + CDS 12017 - 12913 1386 ## COG0548 Acetylglutamate kinase + Prom 13013 - 13072 2.8 14 8 Op 4 . + CDS 13148 - 14350 1623 ## COG4992 Ornithine/acetylornithine aminotransferase + Term 14416 - 14460 1.7 + Prom 14439 - 14498 8.9 15 9 Tu 1 . 
+ CDS 14633 - 15073 470 ## COG3238 Uncharacterized protein conserved in bacteria + Prom 15076 - 15135 3.3 16 10 Op 1 1/0.000 + CDS 15215 - 15826 607 ## COG1555 DNA uptake protein and related DNA-binding proteins 17 10 Op 2 40/0.000 + CDS 15835 - 16524 873 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 16584 - 16643 1.9 18 10 Op 3 . + CDS 16733 - 17968 1092 ## COG0642 Signal transduction histidine kinase 19 10 Op 4 . + CDS 17981 - 18979 877 ## Cphy_1956 hypothetical protein + Prom 19076 - 19135 8.3 20 11 Op 1 4/0.000 + CDS 19199 - 21448 974 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 21 11 Op 2 . + CDS 21517 - 22500 679 ## COG1466 DNA polymerase III, delta subunit - Term 22968 - 23010 10.2 22 12 Tu 1 . - CDS 23028 - 23291 279 ## PROTEIN SUPPORTED gi|160880450|ref|YP_001559418.1| ribosomal protein S20 - Prom 23381 - 23440 8.6 + Prom 23335 - 23394 7.8 23 13 Tu 1 . + CDS 23538 - 24476 1044 ## Cphy_2317 germination protease (EC:3.4.24.78) 24 14 Tu 1 . + CDS 24875 - 25861 990 ## EUBELI_01207 stage II sporulation protein P + Prom 26354 - 26413 5.9 25 15 Tu 1 . + CDS 26497 - 27507 1392 ## COG0547 Anthranilate phosphoribosyltransferase + Prom 27511 - 27570 1.9 26 16 Tu 1 . + CDS 27612 - 29426 2005 ## COG0481 Membrane GTPase LepA 27 17 Op 1 5/0.000 + CDS 29983 - 31140 802 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Prom 31201 - 31260 5.0 28 17 Op 2 21/0.000 + CDS 31320 - 32369 1165 ## COG1420 Transcriptional regulator of heat shock gene 29 17 Op 3 29/0.000 + CDS 32385 - 32984 758 ## COG0576 Molecular chaperone GrpE (heat shock protein) 30 17 Op 4 31/0.000 + CDS 33032 - 34903 2630 ## COG0443 Molecular chaperone + Term 34940 - 35009 15.7 + Prom 34977 - 35036 6.1 31 18 Op 1 5/0.000 + CDS 35082 - 36254 1475 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 32 18 Op 2 . 
+ CDS 36266 - 37234 829 ## PROTEIN SUPPORTED gi|160880441|ref|YP_001559409.1| ribosomal protein L11 methyltransferase + Term 37357 - 37394 -0.6 + Prom 37459 - 37518 6.4 33 19 Tu 1 . + CDS 37724 - 38785 1374 ## COG0180 Tryptophanyl-tRNA synthetase 34 20 Tu 1 . - CDS 39179 - 39904 914 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 40070 - 40129 7.0 + Prom 40179 - 40238 5.2 35 21 Op 1 1/0.000 + CDS 40295 - 40990 927 ## COG3010 Putative N-acetylmannosamine-6-phosphate epimerase 36 21 Op 2 . + CDS 40994 - 41827 491 ## COG1737 Transcriptional regulators + Prom 41937 - 41996 6.1 37 22 Tu 1 . + CDS 42070 - 43431 1183 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase Predicted protein(s) >gi|222441858|gb|ACEP01000084.1| GENE 1 174 - 581 170 135 aa, chain + ## HITS:1 COG:no KEGG:CKR_2633 NR:ns ## KEGG: CKR_2633 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 7 133 3 130 149 105 43.0 5e-22 MKDFVITKIAQPLFLISFGGMAYFNFEILYRGYSHVSMIICGGLSFYTIGLLNEGKKWHP SFLFQMILGSLIIISYEYLTGVIVNIYLKLNVWDYSKVPFNYRGQICIPFLIIWFFATPL CIWMDDVIREALFKF >gi|222441858|gb|ACEP01000084.1| GENE 2 679 - 1182 238 167 aa, chain + ## HITS:1 COG:no KEGG:LSL_1338 NR:ns ## KEGG: LSL_1338 # Name: not_defined # Def: DNA-binding protein # Organism: L.salivarius # Pathway: not_defined # 2 92 1 91 203 97 51.0 2e-19 MVEPIKVGRFIAQNRKDLNLTQKELAEKLGVTDRAVSKWENGRSIPDVGIIESLCKELNI SIGEFFAGEKIQEKEYKKETEKMLLASLDEKQLYGVQIVIYILEFVSLELLMLPFQLSEK WPSINRINIIYWILAAVVFSVCVILMKKCQKENCVIQINEYRESKWL >gi|222441858|gb|ACEP01000084.1| GENE 3 1242 - 1379 70 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSIGESIAIGLEIIIGFIIVVGVAIIQSKMRKEEWEEENNSKKSD >gi|222441858|gb|ACEP01000084.1| GENE 4 1517 - 1786 320 89 aa, chain + ## HITS:1 COG:no KEGG:Acfer_0671 NR:ns ## KEGG: Acfer_0671 # Name: not_defined # Def: hypothetical protein # Organism: A.fermentans # Pathway: not_defined # 2 86 3 86 89 100 55.0 2e-20 
MYPFMTLNDDTEITHSEMKADGSVKVYIETPDEVGGFHNATCWLPAYKWENIEGYSDTEI SYFKQLIRKNAHLIIEFSQEGGVLNAATS >gi|222441858|gb|ACEP01000084.1| GENE 5 1770 - 2048 151 92 aa, chain + ## HITS:1 COG:no KEGG:LGAS_0243 NR:ns ## KEGG: LGAS_0243 # Name: not_defined # Def: hypothetical protein # Organism: L.gasseri # Pathway: not_defined # 1 85 1 83 93 79 41.0 6e-14 MPQLLRIGPYSIYFWSNEGDPLEPIHVHISEGRAKSTATKIWITSTGNVILSNNNSQIPE RILKKLMRMIAANSDDIINEWINRFGEIKYFC >gi|222441858|gb|ACEP01000084.1| GENE 6 2196 - 2603 342 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027564|ref|ZP_03716756.1| ## NR: gi|225027564|ref|ZP_03716756.1| hypothetical protein EUBHAL_01821 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01821 [Eubacterium hallii DSM 3353] # 1 135 3 137 137 244 100.0 2e-63 MKKRIGIYHYQYYRGACKLCYRHFIQEMVESYEKCHEVAEEIYGKENCEFLHYTDFGQYD SENTDRPSFRRIMEAIEKKELDVVMCYGLNNITINEELLIKFYKYVRSQGMEFLTADHGR DAMKYIDIYIKKNNL >gi|222441858|gb|ACEP01000084.1| GENE 7 2869 - 3177 213 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027565|ref|ZP_03716757.1| ## NR: gi|225027565|ref|ZP_03716757.1| hypothetical protein EUBHAL_01822 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01822 [Eubacterium hallii DSM 3353] # 1 102 1 102 102 163 100.0 4e-39 MIGIAENIPAKEESYELHRLLATILSSNLEANEKLDIMEREYHILLEDDIREEVEEMCNL SQGIKEKAYQGGYELGHESGYAEIIFKMYESGLPIEHLLLQN >gi|222441858|gb|ACEP01000084.1| GENE 8 3416 - 6559 2683 1047 aa, chain + ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 526 614 692 780 1995 73 41.0 1e-12 MWWLRSPGYYSYNAAEVSNDGWVYRYGNYVSYYIGGVRPALHLNLSSSNLYSYAGTVCSD VMKSGESGSDNPVNPGKPEGETTTTPTKVTEDMLFTPEYCKYLNNTTYNKMFETLWTDMS DVSRNCKWDDISIAARTVMSDGISGQLKRIGESLVSKAFNKNIQAEKVQSALALDYVNHA KEYSDYCANVYDEATDSLSKSKKVVSLLKKGSGKVGDTLKWVAEDEDIEEFAKAINAIFS YGDDLKVEQDLKKILKSSEVMKKVSSVFKTSGKAISAYQVAVSIEILRITTEDTQKHYMS 
LVDKDSTVYQGLKINYEKIKRQDAVNFALEMLDTSGITDAIGWITAKVTGAGTTASVTVA VVDLIYTGLNFILKKAGVIDVQDYLKVQYALLDQSWLKLAVSNQRLNIANNVYNTATLKN NYQDAFELYLSCMLQTPGYMEDYSGSDSYYKKLKKDIKKYESKLTYKNYIQSCLQNANAS YSYTLSGDKATITKVNPGTGRSARVSSPFALNVMAMENEDSYCLDIPSEVDGYEVEAVNT DLFDNKTEAFAVTIPDTVTTIKEGAFQDCTALKQVFLENGLKTIETNVFAGCEDLSIINV PDTVNTINENAFGDNISIEAAADSQAHNYAEQNGNTFTGREKTVTAISVKKNMDKTEYAM SEPIDLTGLELTVTYVDGTTKDIKEGFTGDFTAKQIGRNEVVLYYNGVKTKTTVQVRSEQ CEYTVLYQDTCGEKIHENVTGSGTTGETITLSIPEIAGYTPVETNIKQILGETNVFKVIY SVAKKKDIGEASVSYKKEHTETGQEIRPAVEVRYKGLLLKADTDYELEYNNNIKPGKGNI WIAGIGDYEGVQLCNFVIKQKKTAVVKPGTSNKKIQKITLSAISKQIASGKKVTLKPTVL PANASNKKLTWKSSNTKVATVTQGGVVTLKKNTGGKKVTITATATDGSRKSASWQITSMK GIVKKVKITGSKSVKAGKKLKLKAKVTATKKANTKLRWTSSNTKYATVNAKGIVKTKKTA KGKSVKITAMATDGSGKKQSVKIKIKK >gi|222441858|gb|ACEP01000084.1| GENE 9 6765 - 7613 647 282 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1548 NR:ns ## KEGG: Selsp_1548 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 10 274 4 277 286 186 39.0 9e-46 MPDSTMLQSEDLQERYERYKAKIKNFTIMNDIFMRNVLKKKECTEYILQVIMEDKGITVI DQALQKDYKNLQGRSATLDCVAKAASNRQFNVEIQGDNEGASPKRARYHCGLLDMNILNP GESFDALPDTYVIFITRNDILGFHLPIYHIDKEVIETKEPFDDGQHIIYVNASRRDSTEL GHLMHDLHCKDASEMYSDILSTRVRELKESEKGVEEMCKELEEIYNEGEQSGVQKGIQKG ELKKARETVFALLEMGMPLEQITKAVKIPLTTVQSWIAETKF >gi|222441858|gb|ACEP01000084.1| GENE 10 7935 - 9164 1763 409 aa, chain - ## HITS:1 COG:CAC0973 KEGG:ns NR:ns ## COG: CAC0973 COG0137 # Protein_GI_number: 15894260 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Clostridium acetobutylicum # 3 405 2 399 400 440 54.0 1e-123 MSKEKVVLAYSGGLDTTAIIPWLKETYDFDVVCCCVDCGQGEELDGLEERALYSGASKLY IEDITDDFCENYIMPCVQADAVYENKYLLGTSMARPGIAAKLVEVARKENAVAICHGATG KGNDQIRFELGIKALAPDLKIIAPWRDDKWQMDSREAEIEYCKAHDIHLPFSVDNSYSRD RNLWHISHEGLELEDPACEPNYDHLLVLGVSPEKAPDEPEYLTMTFEAGVPKTINGEEMK VADIIRKLNELGGKHGIGIVDIVENRVVGMKSRGVYETPGGTILMEAHRQLEELVLDRDT 
MAVKKDMGNKLAQVVYEGKWFTPLREAIQAFVESTQKYVTGEVKFKLYKGNIIKAGTTSP YSLYSETLASFTTGDQYDHHDAEGFITLFGLPLKVRAMRMQELEKNHSK >gi|222441858|gb|ACEP01000084.1| GENE 11 9438 - 10478 1206 346 aa, chain + ## HITS:1 COG:Cj0224 KEGG:ns NR:ns ## COG: Cj0224 COG0002 # Protein_GI_number: 15791596 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Campylobacter jejuni # 2 339 3 336 342 395 56.0 1e-110 MIRAGIIGSTGYAGQELVRILLGHKDVEIKWYGSRSYIDQAYASVFQNMFQLVPDICKGE DLEQLCEEVDVIFTATPQGLCSSLVSENVLSKVKVIDLSADFRIKDVDTYEKWYGIKHQS PQFIKEAVYGLCEVNREKIKKARLIANPGCYPTCSFLSIYPLAKAGLIDMKSVIVDAKSG TSGAGRGAKVANLYCEVNESIKAYGVATHRHTPEIEEQLSYASGQEAVINFTPHLVPMNR GILVTAYANLVKDVTEEEIRKIYEDAYKEEQFVRFLNAGVCPETRWVEGSNYVDVNVKVD ERTHRVIMMGAMDNLVKGAAGQAVQNMNLLFGLPETEGLLMPPVFP >gi|222441858|gb|ACEP01000084.1| GENE 12 10593 - 11819 1906 408 aa, chain + ## HITS:1 COG:TM1783 KEGG:ns NR:ns ## COG: TM1783 COG1364 # Protein_GI_number: 15644527 # Func_class: E Amino acid transport and metabolism # Function: N-acetylglutamate synthase (N-acetylornithine aminotransferase) # Organism: Thermotoga maritima # 12 408 5 397 397 343 48.0 3e-94 MKVIEGGVTAAKGYKAAGLRAGIKAGKTNKDMAMICSEKEAVCAGTFTKNVVKAAPVFWD KDVVYKEGKAQAVVVNSGIANACTGEEGLSNCKKEAEKAGELLSIAPEHVLVASTGVIGA QLVMDVVEKGIEMLVPELDSGKEAGENAATAIMTTDTHKKEIAVECTLGGKTVTIGGMCK GAGMIHPNMGTMLSFITSDAAIDQKLLQQFLSEIVEDTFNMISVDGDTSTNDTCLVLCNG LAENPVITEEDEDGKEFKAALAYVMEYLAKQIAGDGEGCTRLFEVTCNGAATKEDAKIIS KSVVCSTLTKAAVFGKDANWGRILCAMGYSGVTFDPEKVDIVLESEEGSLAIVKDGIATD YSEEKATQILSANPVKAILDIHAGEEKATAWGCDLTYEYVTINADYRS >gi|222441858|gb|ACEP01000084.1| GENE 13 12017 - 12913 1386 298 aa, chain + ## HITS:1 COG:Cj0226 KEGG:ns NR:ns ## COG: Cj0226 COG0548 # Protein_GI_number: 15791598 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Campylobacter jejuni # 14 292 7 280 281 296 56.0 2e-80 MAEMEINKNVDEIMTKANTLIEALPYIRRFNGATIIVKYGGSAMLDAQLKMNVIKDVALL 
KLVGMRPIIVHGGGKEISKWVKLSGKEPEFYNGLRVTDKETMNIAEMVLNKVNKELVGLM QKMGVNAVGLCGKDGNMIRVNKKMPDGKDIGFVGEVASVNIDLLNTLMDNDFIPVIAPIG LDDEFNAYNINADDAACGVAKAMDAEKLVFLTDIEGVFIDPANKKTLISEMDIKTAKEFI ENGVVGGGMLPKLNNCIDAIEQGVSRVHILDGRVAHCLLLEFFTEKGIGTAILKEPLF >gi|222441858|gb|ACEP01000084.1| GENE 14 13148 - 14350 1623 400 aa, chain + ## HITS:1 COG:Cj0227 KEGG:ns NR:ns ## COG: Cj0227 COG4992 # Protein_GI_number: 15791599 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Campylobacter jejuni # 10 395 6 393 395 384 51.0 1e-106 MTKMQEYMQKGEDYILKTYNRYPVVFEKGDGVSLYDMDGKRYLDFAAGIAVFALGYNNKE YNDALKNQIDKILHTSNLYYNQPAIDAAEKIVKTSGMSKVFFTNSGTEAIEGALKVARKY AYLKDSSKDYEFIAMNHSFHGRSQGALSVTGNAHYQDGFVPVEMRAVFADFNDLEDVKSK ITDKTCGIICEVVQGEGGIYPAKKEFLEGLRKLCDEKDILLIFDEVQCGMGRCGSLYAHD LYGVKPDVMALAKALGCGVPVGAFVTNEKASHALVPGDHGTTYGGNPFATAAVCEVFRQF EEKDIVAHVNEVGTYLYEKLEEVKNQFATVKDHRGVGLMQGLEFEGPVGDLIVTALKKGL VIISAGANIIRFVPPLVIEKKDVDEMIAILTEAMKENGLQ >gi|222441858|gb|ACEP01000084.1| GENE 15 14633 - 15073 470 146 aa, chain + ## HITS:1 COG:CAC3547 KEGG:ns NR:ns ## COG: CAC3547 COG3238 # Protein_GI_number: 15896783 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 14 146 14 143 143 93 42.0 1e-19 MFGFIVALVSGALMSIQGVFNTEVTKQTGTWLTNSFVQFTGFLLCLVIWFFKERKKHPPM EIFAVEPKYYLLGGVIGAFITFTVIAGMSTLGPAKAVMLIVAAQLIVAYLIEVFGIFGVE KTGLDWMKLLGTGLFLAGIFIFKARG >gi|222441858|gb|ACEP01000084.1| GENE 16 15215 - 15826 607 203 aa, chain + ## HITS:1 COG:BS_comEA KEGG:ns NR:ns ## COG: BS_comEA COG1555 # Protein_GI_number: 16079613 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus subtilis # 17 202 23 204 205 117 41.0 1e-26 MRQCVMRRLISVALGIVMIILAGCSFSVPVREEESTDSTVVIKADQKEDTEQARETEIYV HVCGCVKKPGVYRLHFGARTQEAIDAAGGFSEKANQTAWNLAEVLQDGMQVYIPSKDEAK EALNEEQSLGKDLSASQKEDTVNINTASKEELMTLPGIGESRADAVIACREEKGSFTSIE 
GIKDAAGIGDGIFNRIKDLITVN >gi|222441858|gb|ACEP01000084.1| GENE 17 15835 - 16524 873 229 aa, chain + ## HITS:1 COG:BS_yycF KEGG:ns NR:ns ## COG: BS_yycF COG0745 # Protein_GI_number: 16081093 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 1 229 1 231 235 216 50.0 2e-56 MGKKVLVVDDERLIVKGIKFSLVQDDMEVDCAYDGEEALEMAKKTEYDIVLLDVMLPKMD GFEVCQRIREFSNMPIIMLTAKGDDMDKILGLEYGADDYITKPFNILEVKARIKAIMRRS SKRATEDEKKKIFEKNGLKIDCDSRRVFVEDKEVKLTAKEYDLLELLSFHPNKVYSRENL LNIVWGYDYPGDVRTVDVHIRRLREKIEKNPGEPLYIHTKWGVGYYFQD >gi|222441858|gb|ACEP01000084.1| GENE 18 16733 - 17968 1092 411 aa, chain + ## HITS:1 COG:SA1515 KEGG:ns NR:ns ## COG: SA1515 COG0642 # Protein_GI_number: 15927270 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 176 401 321 545 554 167 40.0 4e-41 MSSLSDEAGENVFEVEQLANNLEGRILLVDSRFKILADTYNRNCGKYLVTKETIHAMKGE EVAYRYIGKQYLQVITPIQDKNNDKSDVRGLIIAMVSLQNCNEMSEYLYDQRDNLVGIFF IITLIFSVLVALLIRNNFRRMQNELDIIAAGHQEEGLGQENFREFYNLSSSFNKIINRYK ELENTRQEFVSNVSHELKTPITSMKILADSLLMQPDVPVELYEEFMNDIVHEIDRENQII TDLLTLVKMDKTQADLNITSVKINDLLEVLLKRISPIAEKRGIKIRLETMREVVAEVDEV KLSLACNNLIENAVKYNHDDGKVDVSLNADHKFFYIRVKDTGCGIPEDCQEQIFERFYRV DKARSRETGGTGLGLAITRNVVLMHKGSIRVHSKEGEGATFIIRIPLNYIS >gi|222441858|gb|ACEP01000084.1| GENE 19 17981 - 18979 877 332 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1956 NR:ns ## KEGG: Cphy_1956 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 58 309 32 284 309 103 30.0 1e-20 MKKRIPYVIFLVSLLSLTVLCGCDKSTKSAQEKQEKTENQEQVTKEETDRKKNNTINLYY LNAQEDGFKKVSCKLQHSDDLKEMANEVLYKLSDTENNNTNKYKASIFDGVVINDVSVSR GKATIDFGAGYIQLSHTQEVLLRTSVVKSLLQIQGIDGVLFSVNGSSLLDNDKMPVGLME ASTILTDDGQADIYSAKKKVKMYYTNETGDKLIPYLTELTVTDNVPLETQVLLALKNPPA SKKNLKSPLPSDFYVNQTQILNNICYVDLSSEVEKAAVNVKEKVTVYAMVNSLTDLDTAY 
QVQFTIDGKKISKLNEFENFDALLTSDFSLCK >gi|222441858|gb|ACEP01000084.1| GENE 20 19199 - 21448 974 749 aa, chain + ## HITS:1 COG:lin1517_2 KEGG:ns NR:ns ## COG: lin1517_2 COG2333 # Protein_GI_number: 16800585 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Listeria innocua # 491 733 1 247 259 145 37.0 4e-34 MGGIAFFRASYFNEIDRILSKKTLTGTLIGQVEYVRQNAEGEYQITVRENSFVVNESYLK DKIKKNLKKCSISSIKSNLRQTKKLPGKCRLVKIPVSKGKIYPGDFITCTGKLKAIEEQI NPGEFNTKIYYYSVGIRYQFFGENLTRKRESPLSLHRIAGNVREKIDAIYRHILSDTEYA LLKAIFLGDKTDLSKEQKHLYEENGMAHLLAVSGLHVSIVGGMLFRFLRKKGRSYAFSCM VGSGILFFYAVMTGFGNSVFRAAVMFFCFLLAQYFGAEYDMISSMSLAGILMLLEHPWRL LESGCVISFASIFSIGIILPIAKELWEKRSQNRLVAGEFPTESPRRKIIRQAFFANIVIS MSITPLLLRFFYQWSPYSILLNMFVIPAMSPLLISAVLGGLLGFFSYVSAFLGCIPAVVL LRTFEMIFRLVKQIPGAVIVTGCPPWWEILFLYFIEISFFFLWYYRQKGFSVIFCLFLIA GRCFFTVPALKITMLDVGQGECIFIKMPTGESMLIDGGSTSKKNIAEYTLVPALKYYGTD HLDYVIITHTDEDHISGIRELFEQNYPIKHIVLPDTKAMNSSNTIRKAGKKKGYSFLKIS RADQLNFGTIHLRCLHPQKGWEAEDTNSGSLVLYLTYDKFTMLFTGDLNGEQESLLEPDR GEESMPSVNILKVAHHGSKNSTTKSFLEEFQPQKAIISAGKNNLYGHPHKETIKKLQKNG ADIYGTLWGGAIIIESDGQKYKINYFKRG >gi|222441858|gb|ACEP01000084.1| GENE 21 21517 - 22500 679 327 aa, chain + ## HITS:1 COG:BH1337 KEGG:ns NR:ns ## COG: BH1337 COG1466 # Protein_GI_number: 15613900 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus halodurans # 3 319 5 329 342 100 24.0 3e-21 MKELTKQIKEKSFHKVYLLTGDEPYLVLQAKHMLKNAMIKEGDTMNYAAFTDGKIDLNTL QELAFTYPFFSEKRLLLLDGTGILKTGKDEFVSIMENMPETTCIIICEPEVDKRSKVYKW IKKNGYVGEFLKKNQTEKVLLRWIAALLGKEKKQIRENNARYFLQKTGDDMFQIKNEIDK LIAYAGDREEITQADIDAVASGEVQDKIFDLVDAIARKNKAAALSYYNDLILLKEPPMHI LFLIVRQYRILHIIANMRGLRKPDDAIAKTAGIPRFAIRKNEQQLRGYGEKMLEKCIEEC IQIEEEIKTGRINDQIGVESLICRLML >gi|222441858|gb|ACEP01000084.1| GENE 22 23028 - 23291 279 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880450|ref|YP_001559418.1| ribosomal protein S20 [Clostridium phytofermentans ISDg] 
# 1 87 1 87 87 112 68 5e-24 MANIKSAKKRILVIETKTLRNKSIKSKVKTCIKNVEVAVAKGDKEAAQAALTVAISEISK ATSKGIFHKNTAARKVSRLTKAVNGIA >gi|222441858|gb|ACEP01000084.1| GENE 23 23538 - 24476 1044 312 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2317 NR:ns ## KEGG: Cphy_2317 # Name: not_defined # Def: germination protease (EC:3.4.24.78) # Organism: C.phytofermentans # Pathway: not_defined # 10 310 10 306 307 236 42.0 1e-60 MNIQRSFSHTDLALELKDELEESLEEQQAFDGIKIQQERIGERGLQETVIEIDSEEGEKQ LGKPRGIYVTLEGENMAGNDGSFHEEMSECLAKRLQSLLSGKRKLLFIGLGNGEVTPDAL GPLVIKNLFITRHLTGWKEIEGCPAVAALAPGVMAQTGMETGEIVEGIVKKIHPDALVVI DALAAKSSERLNRTIQISNTGIAPGSGVGNHRNEITEKTMGIPVIAIGVPTVISIPSLAC DIMEAFCASQGEGMEDIFCSWPEKEKYHFLGEILDEKLWELFVTPKEIDEAVKRISYTLS EGINQFVAGRIF >gi|222441858|gb|ACEP01000084.1| GENE 24 24875 - 25861 990 328 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01207 NR:ns ## KEGG: EUBELI_01207 # Name: not_defined # Def: stage II sporulation protein P # Organism: E.eligens # Pathway: not_defined # 58 327 98 368 368 243 48.0 1e-62 MLLPFFRNGMQEELFPIQRFALDYWKGKEYYKAKYSDTVPDYFTESDDQSVEEKKKDPGK MKQMGTGRSYTIKELASYSFLLEHFYIVDETTSMTKQDLNGTKLVTMDLSAKLGKEEPLV LIYHTHGSETYKKVNGKEGSVIEVGTALTKELENVYGIKTVHDKSVYDMVDGQLDRNAAY NFAGDSVEAALRKNPSVKVVIDLHRDSVENNIHLRTKINGKSAAQIMFFNGVSRLASKGD IGYLYNPNKEANLAFSLQMQLLCGKYYPDLTRRIYIKGYRYNLHLAKRAMLVEVGAQNNT VEEAKNAMKPLAEMLYRLLSGEKSYKKN >gi|222441858|gb|ACEP01000084.1| GENE 25 26497 - 27507 1392 336 aa, chain + ## HITS:1 COG:MJ0234 KEGG:ns NR:ns ## COG: MJ0234 COG0547 # Protein_GI_number: 15668409 # Func_class: E Amino acid transport and metabolism # Function: Anthranilate phosphoribosyltransferase # Organism: Methanococcus jannaschii # 1 332 1 332 336 307 52.0 2e-83 MIKEAIIKLSKKEDLSYEMAQAVMDEIMSGEANDIQKSAYLTALAMKGETIDEITASAAG MREHCTRLLNDMDVLEIVGTGGDRSNSFNISTTSALVISAGGVPVAKHGNRAASSKCGAA DVLEALGVNITVSPAQSAQMLKDINICFLFAQKYHTAMRYVANVRKELGIRTVFNILGPL SNPAGANMQVMGVYDEKLVEPLARVLSNLGVKNAMVVYGQDCLDEISMSARTSVCEIKDG KFRSYEIEPEQFGFKKCDKEELVGGTPEENAKITLSILNGEKGPRREAVLMNAGAAFYVA 
GKADTLEDAVKLASEIIDSGKAKARLDEFIKLSNQE >gi|222441858|gb|ACEP01000084.1| GENE 26 27612 - 29426 2005 604 aa, chain + ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 6 602 6 602 602 854 69.0 0 MAGIDQSKIRNFCIIAHIDHGKSTLADRIIEKTGLLTDREMQNQVLDNMDLERERGITIK SQAVRIVYHAEDGEEYIFNLIDTPGHVDFNYEVSRSLAACEGAVLVVDAAQGIEAQTLAN VYLALDHDLEIMPVINKIDLPSADPDRVVEEIEDVIGIEAQDAPKISAKNGINIEQVLEQ IVKKIPAPGGSPENPLQALIFDSVYDSYKGVIIFARIMEGTIKKGTNMLMMATGAKAEVV EVGTFGAGQFFPCDELSAGMVGYITASLKNVKDTRVGDTVTDADNPCAEPLPGYKKVNSM VFCGIYPADGAKYPDLRDALEKLQLNDAALQYEPETSVALGFGFRCGFLGLLHLEIIQER LEREYNLDLVTTAPSVIYKVHQTDGTVFDLTNPTNLPDPSTIEYMEEPIVSAEIMVTKEY VGAIMTLCQERRGTYISMEYMEETRALLKYELPLNEIIYDFFDALKSRSRGYASFDYEMK GYQQSDLVKLDILINRETVDALSFIVHSSTAYERARKICEKLKEEIPRHLFEIPIQAAIG SKIIARETVKAVRKDVLAKCYGGDISRKKKLLEKQKEGKKRMRQVGNVEIPQKAFMSVLK LDED >gi|222441858|gb|ACEP01000084.1| GENE 27 29983 - 31140 802 385 aa, chain + ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 8 385 4 374 374 295 43.0 8e-80 MCASKTDLELYIHIPFCIKKCNYCDFLSFPSNEERREIYVQSLINEIEQTGKLLDKDAYA VRSIFIGGGTPSLLSGKQIERIMLAVRNAFFIMEEAEITMETNPGTLTKENVFSYKKAGV NRFSMGLQSADDACLHLLGRIHTWEEFLKSYELARKAGFENINVDLMSGLPGQTVSIYRR TLEKVMALQPEHISAYSLILEEGTPFGESEEIQKKIPDEETDREMYQLTKEVLAENGYER YEISNYARKGKECIHNLGYWSGIPYLGFGLGASSYFEGTRFSNEKNLEEYQKKPYVPFMM REDYTVLSEKDEIEEFMFLGLRKRAGISEREFKERFRVGLKDIYGKVIAKYEEMDLLEWT ADGKMLRLTDAGIDVSDYIFCDFML >gi|222441858|gb|ACEP01000084.1| GENE 28 31320 - 32369 1165 349 aa, chain + ## HITS:1 COG:CAC1280 KEGG:ns NR:ns ## COG: CAC1280 COG1420 # Protein_GI_number: 15894562 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: 
Clostridium acetobutylicum # 1 342 1 334 343 201 34.0 1e-51 MELDERKMKILKAIISTYLETGDPVGSRTIAKDSDLHLSSATIRNEMADLEELGYILQPH TSAGRIPSDLGYRYYVNCLMEENDEIRTRETEREKEHQDLINKMDRMEEVLKNVADVLAA NTNYATLISAPQIKESKLKFIQLSMVDSRRLLVVIVVGNDTVKNVLMNMDEPLGNDEILK LNILLNSFLQGLTLNDINIELVHTMKVQAGMHADILEGVFNGIVEAIHEADNLEIYTSGA TNMLKYPELSNPEHTTALLDTLVEKKQLVKLVDDTVNQEEKHGIQVYIGNETQVQSLRDC SIVTATYELKEGGTGTVGIIGPKRMDYRKVVNTLRNLTDDLDEIFNNKS >gi|222441858|gb|ACEP01000084.1| GENE 29 32385 - 32984 758 199 aa, chain + ## HITS:1 COG:alr2445 KEGG:ns NR:ns ## COG: alr2445 COG0576 # Protein_GI_number: 17229937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Nostoc sp. PCC 7120 # 30 197 57 228 248 91 37.0 1e-18 MEEEKKVSDESIVEEAKGETPEETTTEEGTETQETKETKTKKKKKSAAALEIELKKSEEK AAEMTDKYQRLMAEFENARKRNAKEQSHMYDVGAKEVLAKLLPVVDNFERGLDALSEEEK EGAFAQGFIKIYQQMITVLEEIGVKPMDAVGKEFNPDFHNAVMHEENEEMGENLVSEEFQ KGYMYKDGVLRHSMVKVVN >gi|222441858|gb|ACEP01000084.1| GENE 30 33032 - 34903 2630 623 aa, chain + ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 621 1 610 615 709 66.0 0 MGKIIGIDLGTTNSCVAVMEGGKPVVITNAEGMRTTPSVVAFTKTGERVVGEPAKRQAVT NADKTISSIKREMGTDYKVTIDDKKYSPQEISAMILQKLKSDAENYLGEKVTEAVITVPA YFNDAQRQATKDAGKIAGLDVKRIINEPTAAALSYGLDNENEQRIMVYDLGGGTFDVSII EIGDGVIEVLSTAGDNRLGGDDFDNVITQYMLDDFKAKEGVDLSTDKMAMQRFKEAAEKA KKELSSSTTTNINLPFITATAEGPKHFEMNLTRAKFNELTAHLVERTATPVSKALNDAGL NASELSKVLLVGGSTRIPAVQDKVKQLTGKEPFKGINPDECVAIGASIQGGKLAGDAGAG DILLLDVTPLSLSIETMGGVATKLIERNTTIPTKKSQIFSTAEDNQTAVDIHVVQGERQF ARDNKTLGQFRLDGIPPARRGVPQIEVTFDIDANGIVNVSAKDLGTGKEQHITITAGSNM SDEDIDKAVKEAAEYEAQDKKRKEGIDARNDADNMVFQTEKAMEEAGDKIDASEKATVEA DIAKLKEVLDRTTPDNISEADVAALNEGKEQLMKDAQSLFAKMYEQAGGAAGQAGPGPQA GAGADPNATYNSDDVVDADYKEV >gi|222441858|gb|ACEP01000084.1| GENE 
31 35082 - 36254 1475 390 aa, chain + ## HITS:1 COG:YPO0469 KEGG:ns NR:ns ## COG: YPO0469 COG0484 # Protein_GI_number: 16120798 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Yersinia pestis # 4 361 3 350 379 313 50.0 4e-85 MAEKRDYYEVLGVDKNASEAEIKRAYRKVAKKYHPDMNPGDKEAEEKFKEAAEAYEVLSD PEKKSKYDQFGHAAFEQGGGGAGGFGGFDFGGDMGDIFGDIFGDFFGGGRRGGDARRNGP RQGANLRAGVRITFDDAIRGVEKELEITLKEECKTCNGSGAKPGTSPETCPKCKGTGQVV FTQQSMFGTVRNVQTCPDCNGSGQIIRNKCTDCSGTGYIKVRKRIKVPIPAGIDNGQSVR IREKGEPGINGGPRGDLLVNVTVSRHPKFQRQEYDIYSTEPVTFAQAALGGTIKIDTVDG PYEYTLKPGTQTDTTIRLKGKGVPTLRNKNIRGTHIVKFVVQVPEKLNAAQKEALRKFQE AMGEVAPSEKKEEAPHEKHEKHKKKGFFKK >gi|222441858|gb|ACEP01000084.1| GENE 32 36266 - 37234 829 322 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880441|ref|YP_001559409.1| ribosomal protein L11 methyltransferase [Clostridium phytofermentans ISDg] # 1 319 1 329 333 323 50 8e-88 MKWNKLTIETTTAAEDMLSYELSEMGMEGVEIEDHVPLSEEDRKIMYVDLLPDEIAPDDG KARISCYVDEKEDLQAVIAKVKAKIEELSAFLPVGSGEITLGVTEEEDWINNWKVFFKPF RLDDNIVIKPTWETLTDKKDDDIVVEIDPGTAFGSGSHETTKLCISQLKKYIKKDTELLD VGTGSGILSIIGLKLGAHHAMATDIDPNAIRATDENFAINGVAGQAEVQQVNILDQDEAY RFYEKNNGKRYDVVVANILADVIIPLSGIVTPLLKEDGIFITSGIINTMEEAVKEAMLAN NFEILEINHMKDWVSITAKPKR >gi|222441858|gb|ACEP01000084.1| GENE 33 37724 - 38785 1374 353 aa, chain + ## HITS:1 COG:L0358 KEGG:ns NR:ns ## COG: L0358 COG0180 # Protein_GI_number: 15672048 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Lactococcus lactis # 2 346 3 340 341 417 59.0 1e-116 MKKVILTGDRPTGRLHVGHYVGSLSERVRLQNSGDYDEIYIMIADAQALTDNAEHPEKVR QNIIQVALDYLACGIDPEKSTIFIQSMVPELTELTFYYMNLVTVARVQRNPTVKSEIKMR NFEASIPVGFFCYPISQAADITAFRATTVPVGEDQMPMIEQCKEIVHKFNSVYGETLTDP QIVLPSNKACLRLPGIDGKAKMSKSLGNCIYLSDEADVVKKKVMSMFTDPNHLRVQDPGK VEGNPVFIYLDAFCKDEYFAEFLPDYQNLDELKEHYQRGGLGDVKVKKFLNKVLEAELGP IRERRKMWEQRIPDVYDILQAGSEVAEKKAAETLNDVREAMKINYFDGDFALN 
>gi|222441858|gb|ACEP01000084.1| GENE 34 39179 - 39904 914 241 aa, chain - ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Clostridium acetobutylicum # 1 237 1 237 241 256 53.0 3e-68 MRIYKATDYNDMSRKAANIISAQIIMKPDCVLGLATGSSPVGTYKQLIEWYNKNDLDFSE VTSINLDEYKGLSPEDPQSYRYFMNTHLFDHVNIDKNRTFVPDGLATDPEKACAEYNANI IKQGGIDLQLLGIGRNGHIGFNEPGTVFKKETHCVDLTESTIEANKRFFASEADVPRQAY SLGIKNIMQARKILVIVSGKDKADALYNAVHGEITPAVPASILQLHNDVTIVADADALNN F >gi|222441858|gb|ACEP01000084.1| GENE 35 40295 - 40990 927 231 aa, chain + ## HITS:1 COG:BB0644 KEGG:ns NR:ns ## COG: BB0644 COG3010 # Protein_GI_number: 15594989 # Func_class: G Carbohydrate transport and metabolism # Function: Putative N-acetylmannosamine-6-phosphate epimerase # Organism: Borrelia burgdorferi # 7 231 4 229 232 254 57.0 1e-67 MSINEKIENLKGKLIVSCQALPSEPLHSPFIMGRMALAAKIGGASGIRANTKEDIAEIQT QVDLPIIGIVKRDYEDSEIYITPTMKEIDELMGVKPEIIAMDATISTRPEGKTLDEFFHE VKKKYPEQLFMADCSTIEEALHADELGFDFIGTTMVGYTKQSEGDKIEENDFEILREIVS KVNHKVIAEGNINTPEKARRVLELGAYSVVVGSIITRPQLITKSFVEAIEK >gi|222441858|gb|ACEP01000084.1| GENE 36 40994 - 41827 491 277 aa, chain + ## HITS:1 COG:SP1331 KEGG:ns NR:ns ## COG: SP1331 COG1737 # Protein_GI_number: 15901185 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 6 270 5 269 269 200 41.0 3e-51 MEYYIKSVIPVIESNYNKFTMVEKTIADFFINNQKKINFSAKSVAAKLFVSEASLSRFAQ KCGYRGYREFIYQYEESFVEKKEFMTGNTLMILNAYQELLNRSYSLVDEKQIERIGNYLN ATKKVIVCGKGSSGLAAREMEIRFMRIGVDVDSITDTDLMRMQAVFQNENSLVFGFSIGG QKEEVIFLLKEAKKRGAKTVLFTANNRDDYKEFCTEVLLIPSLRHLNHGNVISPQFPILI MIDIIYSYYVEQDKTKKKILHDRTVQALEETYTEKAE >gi|222441858|gb|ACEP01000084.1| GENE 37 42070 - 43431 1183 453 aa, chain + ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and 
metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 450 4 387 393 222 34.0 2e-57 MIIDNVKIYTEAGTFVSGGIITQGDKITEIYTEAEKENIFKKMNMETEHTAAQCSVQDYS AKNDIDNVLDGKGAYAIPGLIDLHFHGCVGDDFCDGDKEAIRRIAEYEVSVGVTAIAPAT MTLPVAELENILKTAAAYKKEYEEDFVCIGEKMKDEDEQAQNMEAKQQNITDEACDALKA TVVSAKVKNKKRADFVGINMEGPFISPVKKGAQDERNIIPCNEKIAQKFLDASDNLVKFI GIAPEENENTISFIKNMKDKVNISLAHTNADYESAKAAFDAGANHAVHLFNAMPAFTHRE PGVVGAVSDSEHVMAEIICDGVHIHPSMVRAAFKMMGANRMIFISDSMRATGMPDGQYTL GGLDVKVRGNRATLVSDGALAGSVTNLADCMRTAITKMGIPLETAIACATKNPAISLGIY DERGSISVGKKADILLLDKNLTLKTVIKDGVTI Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:46:38 2011 Seq name: gi|222441857|gb|ACEP01000085.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont92.1, whole genome shotgun sequence Length of sequence - 23704 bp Number of predicted genes - 18, with homology - 18 Number of transcription units - 13, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 101 - 463 372 ## gi|225027601|ref|ZP_03716793.1| hypothetical protein EUBHAL_01858 2 2 Tu 1 . - CDS 649 - 1590 953 ## gi|225027602|ref|ZP_03716794.1| hypothetical protein EUBHAL_01859 - Prom 1736 - 1795 6.7 3 3 Tu 1 . - CDS 1814 - 2395 687 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 2536 - 2595 8.5 - Term 2975 - 3012 -0.4 4 4 Tu 1 . - CDS 3095 - 4330 1239 ## COG1301 Na+/H+-dicarboxylate symporters - Prom 4398 - 4457 5.0 + Prom 4619 - 4678 7.6 5 5 Op 1 . + CDS 4837 - 6204 1489 ## COG0534 Na+-driven multidrug efflux pump 6 5 Op 2 . + CDS 6218 - 6805 578 ## EUBELI_20196 cytidylate kinase + Term 6842 - 6887 8.2 + Prom 7012 - 7071 7.1 7 6 Op 1 . + CDS 7119 - 7409 170 ## gi|225027607|ref|ZP_03716799.1| hypothetical protein EUBHAL_01864 8 6 Op 2 . + CDS 7440 - 9512 1584 ## COG0642 Signal transduction histidine kinase + Prom 10017 - 10076 8.7 9 7 Tu 1 . 
+ CDS 10142 - 11083 780 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain - Term 11159 - 11202 1.0 10 8 Op 1 . - CDS 11262 - 12032 687 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 11 8 Op 2 . - CDS 11977 - 12930 697 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Prom 12999 - 13058 8.2 + Prom 13425 - 13484 4.8 12 9 Tu 1 . + CDS 13504 - 14085 287 ## COG1434 Uncharacterized conserved protein + Prom 14103 - 14162 3.3 13 10 Op 1 . + CDS 14201 - 14923 558 ## smi_0183 serine/threonine protein phosphatase + Prom 14927 - 14986 2.6 14 10 Op 2 . + CDS 15017 - 15538 292 ## COG3177 Uncharacterized conserved protein + Term 15556 - 15594 2.0 + Prom 15603 - 15662 9.8 15 11 Op 1 4/0.000 + CDS 15700 - 16086 359 ## COG1937 Uncharacterized protein conserved in bacteria + Prom 16181 - 16240 4.6 16 11 Op 2 . + CDS 16268 - 18805 3099 ## COG2217 Cation transport ATPase + Term 18956 - 18997 -0.5 - Term 19334 - 19385 14.5 17 12 Tu 1 . - CDS 19398 - 22109 1447 ## Cthe_1806 cellulosome enzyme, dockerin type I - Prom 22154 - 22213 3.5 18 13 Tu 1 . 
- CDS 22229 - 23662 773 ## COG1649 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|222441857|gb|ACEP01000085.1| GENE 1 101 - 463 372 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027601|ref|ZP_03716793.1| ## NR: gi|225027601|ref|ZP_03716793.1| hypothetical protein EUBHAL_01858 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01858 [Eubacterium hallii DSM 3353] # 1 120 34 153 153 224 100.0 2e-57 MLEEELGVKLFVRGAKNTTLTEAGKALYIRAGSLLTMADITKREVIKANEVATIHIGLAP STVSMMAKQFKASMRPLADKCFVYKIDSDELITEVLLAWNKEKVTDEIKEFLEFLFNQFD >gi|222441857|gb|ACEP01000085.1| GENE 2 649 - 1590 953 313 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027602|ref|ZP_03716794.1| ## NR: gi|225027602|ref|ZP_03716794.1| hypothetical protein EUBHAL_01859 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01859 [Eubacterium hallii DSM 3353] # 1 313 1 313 313 349 100.0 1e-94 MANFAIAADENVIARGNKLIEELQEPGEKKGVTLNRLFDLVSTHLQEDQLKRSGVDTEAL DASITNIRNLFTAALSGKEEIRAEYERRMAELRESKEELEKNYKIQLGKLASEKEDALRK YTDLKELQETAETARKAAEEQAASAVNLVKEKEKTNIMLTEKLRDAEQKAGNYDTLEKEN ASLKQKVSDLQFKIKDYEKNELLHIKEIEQLKKEAHKNSVTIEKLNTEKYKEHETIQAQL SEKTKLLSEQEKELNVLHIQLAEQSKESELIKERAVIEKEREMLSKIEELRNALDEAKEE KYNLRLQLTKLQK >gi|222441857|gb|ACEP01000085.1| GENE 3 1814 - 2395 687 193 aa, chain - ## HITS:1 COG:XFa0019 KEGG:ns NR:ns ## COG: XFa0019 COG1961 # Protein_GI_number: 10956730 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Xylella fastidiosa 9a5c # 1 182 4 180 188 108 39.0 5e-24 MVYGYCRVSTKGQAKDGNSIEAQEAAVREAGAEEIYADAFTGTKKHRPQLDKLLKVIQNG DKVVVTKLDRIARSATQGIALIDMMLDKGVTVHILNMGIMDNTPTGRLIRTIMLAFAEFE RDMIVERTQEGKAIAKQKDDFKEGRPRISKIKIDHALSLLDEGYSYKQVERMTGISTATL ARRKRERLLEEDS >gi|222441857|gb|ACEP01000085.1| GENE 4 3095 - 4330 1239 411 aa, chain - ## HITS:1 COG:VCA0088 KEGG:ns NR:ns ## COG: VCA0088 COG1301 # Protein_GI_number: 15600859 # Func_class: C Energy production and conversion # Function: 
Na+/H+-dicarboxylate symporters # Organism: Vibrio cholerae # 1 410 4 421 424 287 44.0 3e-77 MKEKKKMSLAMQIFIALVLAIAAGLLLQKHAQFAETYIKPFGTIFLNLLKFIVVPIVLFS IMCGIISMRDIKKVGAIGLKTVVYYMCTTAFAITIGLIGGNLFKKMFPVIATTDLSYQVG EKTSLMDTIVNIFPSNFISPMAEANMLQVIVMALLIGFAIILVGEEKNTRIITACNDLND VFMKCMEMILKLSPIGVFCLLCPVVAANGATIIGSLAMVLLAAYVCYIVHAVVVYSFAVK TIGGISPLTFFKEMLPAIMFAFSSASSVGTLPINMECTEKLGTSREIASFILPLGATINM DGTAIYQGVCAIFIASCYGIHLTLPQMLTIIFTATLASIGTAGVPGAGMVMLAMVLTSVG LPVDGIALVAGVDRIFDMGRTTVNITGDASCCVIVSNLEKKREARKMAKSM >gi|222441857|gb|ACEP01000085.1| GENE 5 4837 - 6204 1489 455 aa, chain + ## HITS:1 COG:MA1121 KEGG:ns NR:ns ## COG: MA1121 COG0534 # Protein_GI_number: 20089987 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Methanosarcina acetivorans str.C2A # 11 450 19 457 475 179 28.0 1e-44 MAESNKMKDMSVNKLMIQMGIPMILSMALQAVYNIVDSAFVGNMRVGSEAALNALTLVFP VQMLMVAVGIGTGVGTNALLARTLGQGNSKKAAKVAGNSLFLGVIIYVVCFLFGIFGVKA YISSQTVDTEVLEMGVSYLRICCVISFGIIFFSLFEKLLQATGRSLYSTIGQVVGAVVNI ILDPIMIYGIGPFPEMGVKGAAYATVIGQVASAVLLLIFHMKLNKEFEHDAKYMKPDIGI IKEIYAIGLPAIIAQALMSIMVYVMNLILKFNPSAQTAYGLFYKVQQFVLFLAFGLRDAI TPIIAFAYGMRSKKRIQDGIKYGLLYTIVLMILGIAITEIFPGAFATLFNAGQSREYFIS AMRVISISFLFAGINVAYQGIYQALDGGMESLVISLLRQLIIILPLAGVFSLFVRNGQMG VSLIWWAFPITEVIACLAGYVFLKKIRKNKVDVLN >gi|222441857|gb|ACEP01000085.1| GENE 6 6218 - 6805 578 195 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20196 NR:ns ## KEGG: EUBELI_20196 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: not_defined # 1 195 1 195 196 332 84.0 6e-90 MAKRIITISREFGSGGRFIGEEVAKKLGITYYDKNIISEIAEKSGLSPEYIQESAELSPK KGLFAYALAGRDITGKSVEDMVYEAQRKVILELAEKESCVIIGRNADFILKDRNDVLNVF IYGDMPEKIQRIMGLYNVEEKEAVKMIADIDKRRMTNYNFYTDQKWGKASNYTLCLNSSQ LGYDRCEAIIMECAK >gi|222441857|gb|ACEP01000085.1| GENE 7 7119 - 7409 170 96 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027607|ref|ZP_03716799.1| ## NR: gi|225027607|ref|ZP_03716799.1| hypothetical protein EUBHAL_01864 [Eubacterium hallii DSM 3353] 
hypothetical protein EUBHAL_01864 [Eubacterium hallii DSM 3353] # 1 96 1 96 96 179 100.0 4e-44 MEEYRDDIKSKLHYMDEILHKISFMSQAENEKQLDDMTPSILKSVGKYTAADRAYIFEWN SEKKEVLKIHLNGVHQELNHRYRICRKFCAGNQMCQ >gi|222441857|gb|ACEP01000085.1| GENE 8 7440 - 9512 1584 690 aa, chain + ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 284 672 274 654 676 160 28.0 7e-39 MKKDSGPYKKIKIYCYIVGLTAFLTAFFVGIFLLHNLEKNEKTTGKYMAQVTEKRVRARL DQYIMLSNLLGNYISAGENLDENTFSELAEKIPNEDGVIKAFELAPEGIVTDIYPKQENE GAFGLDMLQEHERKKDAVVARDSGKYTIGGPYQLKQGGTGALLFNPVYQDSNSDKGEFWG FVILVIDWNRFIDEINLDYLSDADFCYRIWSYDRDGNDRIILAESQDDMPDNILTVECTV PNNTWYFDIIPSEGWIPRSYWIMCIVVSYVYSLLVATVFYLIFSKKHRERQYEAELKKSA EQAKNASEAKTRFLFNMSHDIRTPMNAIVGFSGLLEKNLQNEKKAKEYLGKIQSSSNLLL MIINQILEMARIESGTAVLQLKAEDMDALFHRVNTVFEEDIRKKNLQYYADLDIRHHYVV CDQTKLQEIMLNIISNAIKYTPEGHSIYVEIHEAASENPSKIHYIFSCEDTGIGMSEEYL PHVYEEFSREHTTTENKVQGTGLGLSIIKSMIELMDGSIQVESRQGIGTKFTVDLSFDIA SKEEVYGNQNAMKPPAIHTIKGTRILLVEDNELNAEIAKTVLEDVGALITRVEDGQQAVK LFKEKPAGTFDAILMDLMMPVMDGYTATKKIRSLERSDAKTIPIIAMTANAFQEDAEKCI AVGMNAHLAKPLNIEKMMTTICHLVKKKNE >gi|222441857|gb|ACEP01000085.1| GENE 9 10142 - 11083 780 313 aa, chain + ## HITS:1 COG:BS_deoR KEGG:ns NR:ns ## COG: BS_deoR COG2390 # Protein_GI_number: 16080994 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Bacillus subtilis # 10 311 10 310 313 172 33.0 8e-43 MKNDKIIKIIRVAKKYYESHMDQKVIAQEEGMSVSTVSRMLKKAEEMGYIKITVEYPVLS NEELSASLKKKYNLEKVFLVPKLSDAPVAVQEDVCRAAAKDLSSYLNDGSIIGTAWGRTM KSFANYVSDLGVHNVKVVQINGKTNETSVPVGADDLSQAIARAGHGEAYVIPAPVAVDSA EIAEMLKKERNISAALTLARECQIALFGVGNLSRNTILYRSGSLKEEDFIELEEKGAVGD VCTCFFDARGEAVSSSFAKRRISITLEELKKIPCKIGVASGLEKKEALHAALLGEYINIL YADEELGKELIAI >gi|222441857|gb|ACEP01000085.1| GENE 10 11262 - 12032 687 256 aa, chain - ## HITS:1 COG:L162604 KEGG:ns 
NR:ns ## COG: L162604 COG0436 # Protein_GI_number: 15672142 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 16 256 164 404 404 363 69.0 1e-100 MKNPNGIQTLMISAKKINDRTKAIVIINPNNPTGALYPKEVLEQIVQVAREHQLIIFSDE IYDRLVMDGEKHISIASLAPDLFCVTFSGLSKSHMIAGFRVGWMILSGNKAMAKDYIEGL NMLSNMRLCSNVPGQSIVQTALGGHQSVEDYIMPGGRIYEQREFIYKALTDIPGISAVKP KAAFYIFPKIDTKKFNIVNDEQFVLDLLREKKVLLIHGGGFNWQQPDHFRVVYLPRIEVL KKATDSMADFLNHYHQ >gi|222441857|gb|ACEP01000085.1| GENE 11 11977 - 12930 697 317 aa, chain - ## HITS:1 COG:L162604 KEGG:ns NR:ns ## COG: L162604 COG0436 # Protein_GI_number: 15672142 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Lactococcus lactis # 152 315 1 164 404 246 67.0 6e-65 MIQKEVFSERLKAAMKKQNLKQIDLVRAAQVQGIKLGKSHVSQYVSGKTVPRTDILLFLA KTLQVEEEWLIGVSNVESDTEISATNRNILDTNKTILGNSNKENSEALTMSAKRQKSSIF EAEIISDINKNIQTTSAATISRNISNKEGKKMREFKKSSKLDNVLYDVRGPVVDEATRME EAGTHVLKLNIGNPAPFGFRTPDEVIYDMRQQLTDCEGYSNAMGLFSARKAIMQYAQLKH IPNVNINDIYTGNGVSELINLCMSALLDNGDEILIPSPDYPLWTACATLAGGKPVHYICD EESEWYPDIDDIRKKNQ >gi|222441857|gb|ACEP01000085.1| GENE 12 13504 - 14085 287 193 aa, chain + ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 187 69 249 259 105 36.0 4e-23 MQKYFLGKCVIGIMAIAIGIVVITAVIETVLIVKASAKKPQENATLIVLGCKVYGEHASR SLRERLDAALIYLEENPNSQCIVSGGMGEGEKISEAECMYRYLTKKGINSSRIIKEDKST STRENLRFSKKIMEEKGLGNNIAIATSEYHQYRASQIAKSLGFSVGAVSGHTAWWLFPTF YMRELYGILYQAL >gi|222441857|gb|ACEP01000085.1| GENE 13 14201 - 14923 558 240 aa, chain + ## HITS:1 COG:no KEGG:smi_0183 NR:ns ## KEGG: smi_0183 # Name: not_defined # Def: serine/threonine protein phosphatase # Organism: S.mitis_B6 # Pathway: not_defined # 1 226 1 232 255 67 28.0 7e-10 
MKVLIIPDVHLKPFMFKQAAELMERGIAKRAVCLMDIPDDWNKQFDISLYEQTYDAAISF AKKYPDTAWCYGNHDLSYVWHKLETGYSPMAAYTVQKKLIDLKEVLPEGNLIKYVHRIDN VLFSHAGMSKDFVNLHVPKSKYDDVDDVLDYVNHLGEMKMWYDGSPIWLRPQYTDVKMYK SQQFLQVVGHTPMETITKKKNVISCDVFSTYRDGKPIGTEEFLLLDTITWDYSMVNYGNY >gi|222441857|gb|ACEP01000085.1| GENE 14 15017 - 15538 292 173 aa, chain + ## HITS:1 COG:jhp0651 KEGG:ns NR:ns ## COG: jhp0651 COG3177 # Protein_GI_number: 15611718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 21 150 78 209 234 68 33.0 8e-12 METINHFQCFDYMIDSANDILDEDFIKNTHKILKTNTSDSRISWFNVGEYKSRKNMVGDL ITTPPEKVKSTIEKLLNEYNKKQTVSFDDILDFHVKFERIHPFQDGNGRVGRIIMFKECL KYNIVPFIIEDNLKMYYYRGLKEWDNEKGYLRDTCLTAQDRYKQYLDYFEIKY >gi|222441857|gb|ACEP01000085.1| GENE 15 15700 - 16086 359 128 aa, chain + ## HITS:1 COG:BH0558 KEGG:ns NR:ns ## COG: BH0558 COG1937 # Protein_GI_number: 15613121 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 42 128 12 99 100 77 45.0 6e-15 MGECCNNKEDISLSCHCGANDSFKSNSSQPKCQNCSVKKKVRSEKEFKALLNRLSRIEGQ VRGVRSMLENDAYCIDILTQVSAINAALNSFNKVLIANHIRTCVKENIQAGNDEVVEELV STLQKLMK >gi|222441857|gb|ACEP01000085.1| GENE 16 16268 - 18805 3099 845 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 6 741 78 809 818 639 50.0 0 MEQYNVTGMTCAACQARVEKVVSKVPGVTSVSVNLLTNSMGVEGTALSTDIVAAVEKAGY HASVKGAEKESSQGAEALADTETPKLLKRLIISLIFLMPLMYLSMGHMMWNWPLPGFLNN NHVGMGLAQLLFTVIIMVINQRFFISGFTSLIHRAPNMDTLVAMGATAAFSYSTYALFAM TAAQTAGNNKLVMSYMHEFYFESAAMILTLITLGKTLEAYSKGKTTDALKSLMNLAPKMA TVVRNGQEQIISAEQVKKGDIFLVKPGESIPVDGIVLEGNSAIDEAALTGESIPVDKAEG DNVSAATINQSGFLKCEATRVGEDTTLSQIIKMVSDAAATKAPIAKIADKVSGIFVPAVI TIAVITIIGWLLAGQTVGFALARGISVLVISCPCALGLATPVAIMVGNGVGAKNGILFKT AVSLEEAGKVDIVALDKTGTITTGQPKVTDILSVDGISEQELLETAFSLEKKSEHPLAKA 
IVEYGDEKKFTVPVVEDFQAVPGNGLTGTLNNKTLIGGNLLFIEKSLSISEKIKHSAEQL ASAGKTPLFFAKDNHLLGIIAVADVIKEDSPQAIKELKAMGIHVVMLTGDNERTARAIGE QAGVDNVIAGVLPDGKESVIRALGEKGKVAMVGDGINDAPALTRADVGIAIGAGTDVAMD AADVVLMKSKLADVPAAIRLSRGVLRNIHENLFWAFFYNTIGIPLAAGLLIPVLGWKLNP MFGAAAMSLSSFCVVTNALRLNLLNIRSTKKDKKKKKAIDVSLININNNEKKEVNEMTKT MNIKGMMCGHCEAAVKKALEALPEVASAEVSHEKGTAVVTLEKEIADDVLKKTVEDKDYE VVEIK >gi|222441857|gb|ACEP01000085.1| GENE 17 19398 - 22109 1447 903 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1806 NR:ns ## KEGG: Cthe_1806 # Name: not_defined # Def: cellulosome enzyme, dockerin type I # Organism: C.thermocellum # Pathway: not_defined # 288 785 381 909 2177 182 32.0 6e-44 MKKTKLRLLLLLLFIGGLIILPQKTKAAEKAADIIPVNISVKYGQTEARTIFDMINEMRT NPYDTWCWDAYDNEKIPCPNLEELKYDYDLERVAMKRAAEIALSYEHERPMGGKVWDIYN EENIKWLAAGENISVGHTTAAEANLGWREDDEPYAGQGHRRNMLSSDYNCVGIGHVYYNG VHYWVEVFANRPEINTTEIPANDSTETVSVSVDKKKIKTVDVTFDQDAYSLRIGENITPI ITETRIDVVNFSSQGRGLAPVLDTPVVSIDDSSVASYNNNQLSGLKEGTTNLSATLYGMT ASGSTVSVHDCKNHWDTGKVTKKSTCTEPGEKTFTCSICRQTTKKESVPATGHNYASAWT TDIAATCEKTGTESRHCQNENCTATTEQRSIAALGHNWKDNGDETAICTRNGCKATHTHA WDSGTITTEPTCTAQGERTYHCTYEENGICTATKTEAVKKLGHRYILTKRDKPTCESDGI LYGKCSRCSKTITQQDTKNPALGHDWKDNGDGTATCSREGCGKNHTHDLDSGTTTVEPTC TATGEKEYCCTYINCPYKKTETLKATNHKNKETRNGEKTTCQKEGYTGDIYCKDCGILIS SGKVTKKYDHDWDRGTVTKEATCKEEGSVTYRCENCDETETVSIKKTAHNYKIMEQKDAT CTENGYSISACQTCNDKKKEEIVAKGHSKGIRNKKTATCKAEGYTGDTYCRICKTLLEEG KILPKLEHQWNDGTVTKRATYQAAGELTYRCRKCTAKRIISIEKLAYPKAGTSYTISGNE YKISKPGAEVILVKTSKTTKNVTVPAQIYVQGITYKVTSIGAKAFNNNKNLTKVTIGTNI IKINSNAFFNCKNLKTVTIKSVRLTKKAANKKAFKNAHKKLVIRVPKKVKKIYKKIFKGL KVK >gi|222441857|gb|ACEP01000085.1| GENE 18 22229 - 23662 773 477 aa, chain - ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 79 428 37 374 510 211 35.0 2e-54 MKKLIENFIIIMVLTIAILFSYALIEKKISGRSSSEFINFLFSTQEKASSNKSSVTSNKG 
NSKKGNNSSTQSADTMNYRAVWLSYLEFNSYRKSVKNNNESSFRKFYKHILQQIKTIGCN RIIVQVRPFGDALYASDYFPWAACISGTQGKNPGYDPLKIMTEMSHKEGISIEAWINPYR ISSGNSIRSLSKTNPARKWFSVQNTKRNILSYEGSLYYNPSSESVRNLIIQGVKEIVQNY NVDGIHMDDYFYPSFTEKNVTTAFDAPEYKQQLKTNLSSTDSTSLTSADKSSNEISLADW RRDNVNRLVSGIYKAVKEINSDVTFGISPAGNLDNLRSDLEYYVDIDTWVSQNGYVDYLM PQIYWGFTNEVAPFDKVTDAWCILMENSPVKLYIGLQLYRMGSTEPGQSDEKELQKTSLL KKELSYLKKQKKIEGYCLFSYQYLDCQNKKYHFDAEQFSTKRKKLLNQIVKSFKNNS Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:47:29 2011 Seq name: gi|222441856|gb|ACEP01000086.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont93.1, whole genome shotgun sequence Length of sequence - 8648 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 101 - 160 10.0 1 1 Tu 1 . + CDS 233 - 826 449 ## EUBELI_20285 hypothetical protein + Term 939 - 991 17.3 + Prom 930 - 989 4.8 2 2 Tu 1 . + CDS 1172 - 2869 2036 ## COG1966 Carbon starvation protein, predicted membrane protein + Prom 3353 - 3412 1.7 3 3 Op 1 . + CDS 3442 - 4386 956 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain 4 3 Op 2 . + CDS 4475 - 6913 1626 ## COG2918 Gamma-glutamylcysteine synthetase + Prom 6947 - 7006 3.2 5 4 Op 1 . + CDS 7027 - 7800 895 ## COG0121 Predicted glutamine amidotransferase 6 4 Op 2 . 
+ CDS 7827 - 8582 821 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 Predicted protein(s) >gi|222441856|gb|ACEP01000086.1| GENE 1 233 - 826 449 197 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20285 NR:ns ## KEGG: EUBELI_20285 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 176 36 213 232 127 34.0 2e-28 MDKIGHLVMTKSLSGNIHRAHKAALLFGSILPDLLVYTYLEGHTWEATFEQITKQMEALE AKGRGGSFSYLKLGWILHYVEDYFTYPHNTIFEGTIPEHYAYEKKMTRWMREGALEQMSL PMCKRLDSAAEVEERLQELHDRYLSQKMCYENDMAYMRQMVSEILNCYAEIFVRKSEFAR FMEWVRKKVGIMTGFVS >gi|222441856|gb|ACEP01000086.1| GENE 2 1172 - 2869 2036 565 aa, chain + ## HITS:1 COG:PAE1423 KEGG:ns NR:ns ## COG: PAE1423 COG1966 # Protein_GI_number: 18312627 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Pyrobaculum aerophilum # 1 564 1 569 618 236 30.0 1e-61 MNTLVIVLIAAVLLTVAYAFYGRWLSKKWGIDPKAETPAHKFEDGQDYVPTDGWTVFSHQ FSSIAGAGPVTGAIQAAVFGWLPVLLWVLIGGIFFGAVTDFGALYASVKNDGKSMGLLIE KYIGKLGRKLFLIFCWLFTLIVIAAFADMVAGTFNAYTVDANGVIALSDAAKTNGAAGTI SLLFIAFAMIFGLLHKHLHLTGWKETVVGLICTVAALAIGMTMPITLGKDGWTYITFVYI FFAAVLPMWLLKQPRDAMTTYMFIGMIAGAVVGLLVAHPTMNLPVFTGFHNDQLGDLFPI LFVTVACGAVSGFHSLVSSGTSSKTVSNEKDMLKVGYGAMVLESLLAVIALCVAGAAASA NGTPADGTPFQVFSSGVAGFFEMFGIPVYVAQCFMTMCVSALALTSLDAVARIGRMSFQE LFSVDDMEHAEGWRKVLCNTYFSTIITLAAAFVLTRIGYANIWPLFGSANQLLSALVLVT LCVFLKVTGRSNKMLFPPLIIMLCVTGTALVERTISLVQAFSAGSATFLVEGLQLIIAIL LMILGVIIVVNSLRAYFASKKNSEI >gi|222441856|gb|ACEP01000086.1| GENE 3 3442 - 4386 956 314 aa, chain + ## HITS:1 COG:lin2104 KEGG:ns NR:ns ## COG: lin2104 COG2390 # Protein_GI_number: 16801170 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Listeria innocua # 1 314 1 312 315 153 28.0 5e-37 MDAAKMNQAIRIAKLYYELHYNQLEIAAREGISKSSVSRILKNAMDMGIIEVRIKDSVLA NGNLENELIARFPIKRAIIVPDLVGNSQILMQDVCAALAEDLPRYIKNDSVIGVTWGHAL 
AVLAQQLPKIKRSGVSVIQLTGGFSRAIFESGALDVLKRFVDSVGGTGYQIPAPAMVAES SIVEALKKDPQIHEILEMAEVCQTAIYSVGNLERPSIVYEMGLIDENDYKDFSRRGAVGD CCSHFIKQNGELFDKKIDARVVGASLETIKKIPNKLVVAVGKEKEKIITAALQGGLVDSL YIDEPTAELIVRHN >gi|222441856|gb|ACEP01000086.1| GENE 4 4475 - 6913 1626 812 aa, chain + ## HITS:1 COG:lin2913_1 KEGG:ns NR:ns ## COG: lin2913_1 COG2918 # Protein_GI_number: 16801972 # Func_class: H Coenzyme transport and metabolism # Function: Gamma-glutamylcysteine synthetase # Organism: Listeria innocua # 392 798 14 432 458 218 34.0 4e-56 MNEKNEPVNFEVIYQATDRFIDTFSFLNVSRDDNEHIYLAVDDKTGDGLSYDCSYNTLEF SFGKEEDMNVLYKRFCQYYTYIQKELRKEGHMLTGMGINPRYAVNQNVPVVSERYRMLFH HLSFYKKYGNSIPFHSYPNFGLFSCASQIQLDVEEEQVVPMLNTFTKLEPFKALILANSL WGENAEILCSRDNFWRNSLHGLNRHNVDMYNVVFDTTDEIVRYIKSMSLYCVEREGKYIN FPPVVFSKYFSSDRIKGEYFDGNRYREITFHPEISDLQYLRSFKFEDLTFRGTVEFRSVC EQPVGEIMASGALHAGLMENIGELSEFLEKDTSIYHNGYNASELRRTFPQAFIWKGGETF KSRQTDGRRNGERKNTGGLHRRVWRNVMLHIENEYIREHLLDGAFGIEMESLRVVGDGML SQTAHPFPGDAHIVRDFSENQTEINTGVNESIEAAIEELKGHTRRIQEKLKALPEPELLW PFSNPPYIGGEKDVPIAQFDGKDAHKTAYREYLSDKYGRYKMTFSGSHVNFSFSEDLLRA DFALQSEPDFMKYKNKLYLELAQKIAVYGWILVAVTAASPIVDSSFMEKGVYGKSVFTGL SSVRCSELGYWNEFPPTFDYSDIDSYVNSIEKYVKNGLLKAPSELYYPIRLKPAGENNLI SLRKNGIDHIELRMFDLNPLTGAGIDARDVKFAQLLMVWLATMPSWYVSKKDQVNAVQNF KNAAHYDLKTVKIARTEKRARSIVHEALKVIGWMKEFYQGLKMDDVQQILDFEYEKFVDP EKRYAWQVRKQYQENYVEKGLVLARQKQNMTV >gi|222441856|gb|ACEP01000086.1| GENE 5 7027 - 7800 895 257 aa, chain + ## HITS:1 COG:MJ1515 KEGG:ns NR:ns ## COG: MJ1515 COG0121 # Protein_GI_number: 15669709 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferase # Organism: Methanococcus jannaschii # 1 235 1 256 355 114 30.0 1e-25 MCELFGVDSAKKIPLNELLREFFSDGTEHPHGWGMAFFYGNAVSLEKQPQSAVTSNYLKQ RLRAKIETDKMIAHIRLATRGTMDYENTHPFVMRDNYDRTWTLAHNGTIFESEVLEPFVA TEDGDTDSERILCYIIDQVNAAQDEKGAALTKEERFALLNRIILEIAPENKLNLLIFDGE IMYIHTNYKDSLYRCRKDKAIVMATRPLDRDKWKNVPMNQLLAYEDGKLIYTGTKHEYEF VDSEEKMHMLFLDFANL >gi|222441856|gb|ACEP01000086.1| GENE 6 7827 - 8582 821 
251 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 5 248 6 248 251 298 59.0 9e-81 MLNQFSRTQLLLGEEAMDKLKNSRVAVFGVGGVGGYVCEALARSGVGTFDLIDDDKVCLT NLNRQIIATRKTVGKYKVEVMKERILDINPDAQVNVHQCFFLPENADDFPFDEYDYVVDA VDTVTAKIEIIMQAQKYGTQVISSMGAGNKLDPAAFQVADIYKTKMCPLAKVMRRELKKR GVKKLKVVYSEEKPTRPIEDMSISCRTNCICPPGAEHKCTERRDIPGSVAFVPSVAGLII AGEVVKDIAGK Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:47:37 2011 Seq name: gi|222441855|gb|ACEP01000087.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont94.1, whole genome shotgun sequence Length of sequence - 14169 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 8, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 5 - 64 5.5 1 1 Op 1 . + CDS 185 - 574 467 ## COG1725 Predicted transcriptional regulators 2 1 Op 2 . + CDS 590 - 1120 521 ## COG0542 ATPases with chaperone activity, ATP-binding subunit 3 1 Op 3 . + CDS 1093 - 3051 1970 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 + Prom 3106 - 3165 7.7 4 2 Tu 1 . + CDS 3198 - 3608 572 ## Cphy_3460 hypothetical protein + Term 3815 - 3853 1.2 + Prom 3612 - 3671 3.6 5 3 Tu 1 . + CDS 3867 - 5240 1404 ## COG1066 Predicted ATP-dependent serine protease + Prom 5483 - 5542 8.8 6 4 Tu 1 . + CDS 5590 - 6837 1780 ## COG0112 Glycine/serine hydroxymethyltransferase + Term 6871 - 6929 21.1 + Prom 7184 - 7243 6.4 7 5 Tu 1 . + CDS 7269 - 9089 1282 ## COG0642 Signal transduction histidine kinase - Term 8993 - 9031 0.4 8 6 Tu 1 . - CDS 9078 - 9719 764 ## ELI_0578 hypothetical protein + Prom 9835 - 9894 6.6 9 7 Tu 1 . + CDS 10008 - 12179 1752 ## COG0550 Topoisomerase IA + Prom 12298 - 12357 6.6 10 8 Op 1 . 
+ CDS 12418 - 12642 101 ## Rumal_3660 transposase, IS605 OrfB family 11 8 Op 2 . + CDS 12675 - 13367 162 ## PROTEIN SUPPORTED gi|163756452|ref|ZP_02163565.1| 50S ribosomal protein L34 12 8 Op 3 . + CDS 13385 - 14168 699 ## gi|225027639|ref|ZP_03716831.1| hypothetical protein EUBHAL_01896 Predicted protein(s) >gi|222441855|gb|ACEP01000087.1| GENE 1 185 - 574 467 129 aa, chain + ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 122 1 123 125 102 43.0 2e-22 MRIKLDFNSPEAIYIQLRNQIVMQIAQEQLHEGDSLPSVRGLAGELGVNMHTVNKAYAML RQEGYLKLDRRKGAVISVQISDKSEEMMKVNGYMQMIVAQAICKEITKDEMKQLVEEMYD TFLMIEEEE >gi|222441855|gb|ACEP01000087.1| GENE 2 590 - 1120 521 176 aa, chain + ## HITS:1 COG:SA0483 KEGG:ns NR:ns ## COG: SA0483 COG0542 # Protein_GI_number: 15926202 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases with chaperone activity, ATP-binding subunit # Organism: Staphylococcus aureus N315 # 9 144 10 143 818 78 32.0 7e-15 MRTEYNNYASQALEKAAQDAASCHQKYIGTEHLILGLLAQPDSLSGRVLYENGLNYENFL EAVLGESAALGDNANQTFLGYSPKAEHVLLTSEEEAIRMHAQQIGTEHILLAIIKEKDCI ALRLMKGMNINLKKIYVDVLVATGVDSTRAQKEFQSLTAPKKAKKRILWSQHIVWI >gi|222441855|gb|ACEP01000087.1| GENE 3 1093 - 3051 1970 652 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 8 649 175 814 815 763 58 0.0 MVAAYCVDLTERVKEGKLDPIVGRSQEIERMIQILSRRTKNSPCLVGEPGVGKTAIVEGL AQWIVRGEVPEMLWDKKVMTLDLPGMIAGSKYRGEFEERIKKLLQEVISRGDILLFVDEI HTMIGAGGAEGAIDASNIMKPFLARGEIQLIGATTREEYRKYIEKDAAFERRFQPVAVEE PTKEEAVAILNGLKEKYEQHHHVHYTAGAIQAAVELSERYISDRFLPDKAIDVMDEAASK KRVGLYTMSDEMRALEEQFDKYGKRKEEALAEGNLRKAKMNNTKQKTVLEKLEEQRKVWE QKKAVSRVNVTENDVAEVVSSWTKIPVTRISQSESQRLLGLEDTLKKRVIGQDEAVKMVA KAIKRGRVGLKDPKRPIGSFMFLGPTGVGKTELSKALAEALFGDENAMIRVDMSEYMEKH 
SVSKLIGSPPGYVGYEEGGQLSEQVRRHPYSVILFDEIEKAHGDVFNILLQVLDDGQITD SNGRKADFKNTVIIMTSNAGAQRIMAPKNLGFITERSAKEDHERMKSHVMEEVKQIFKPE FINRVDGIMVFHTLEKPEQRKIVELLLKELSDRLSKNPGFTLEFSDKVADYVLEKGYDPQ YGARPLRRAIQNELEDFLADEYLEGHIKNNDRILLDINDGKPVVKKAEKKKK >gi|222441855|gb|ACEP01000087.1| GENE 4 3198 - 3608 572 136 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3460 NR:ns ## KEGG: Cphy_3460 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 136 1 136 137 124 49.0 9e-28 MANVKELLKAEENGSLSFGDYSLTQKTKLDDFSFEGDTYKVKTFQEITRLEKNGGVVYES VPGSAVHGYKETERQIVFETEAADDLQITLEVEPEKEYKVYVNDTNIGKLKSSLSGKISF SIELDAGETAKVQIVK >gi|222441855|gb|ACEP01000087.1| GENE 5 3867 - 5240 1404 457 aa, chain + ## HITS:1 COG:BH0104 KEGG:ns NR:ns ## COG: BH0104 COG1066 # Protein_GI_number: 15612667 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus halodurans # 4 456 2 453 457 492 55.0 1e-139 MAKSKAKTVFFCKECGYETPKWMGQCPGCHQWNTMTEEKVSPVSKGTGKRGDNLPRQELT GLFEVSMEEEDRSSSGIPELDRVLGGGIVKGSLTLVGGDPGIGKSTLLLQICRYQANSGK KVVYVSGEESLKQIKMRAQRLGGFKQNVFLLCETDINAAAEAVREAKPDMVVVDSVQTMY SEEITSAAGSVSQVREVTSVLMQLAKVEGIAVFLVGHVTKEGVVAGPKTLEHMVDTVLYF EGDQTAIYRILRGVKNRFGSTNEIGVFEMKEQGLVEVENPSKVMLDGRPTDASGSVVVCS MEGTRPLLIEIQALVSPTSFNMPRRTTVGIDFNRVNLLLAVLEKKAGMQLGGCDAYVNLA GGMKLGEPAIDLGIICAVVSSYRNLPVHEGTLIFGEVGLTGEVRGVSHVEQRLAEAIKMG FTRCIMPKTNAEVLGDNICQKIQVIGVSNVREVIQVL >gi|222441855|gb|ACEP01000087.1| GENE 6 5590 - 6837 1780 415 aa, chain + ## HITS:1 COG:CAC2264 KEGG:ns NR:ns ## COG: CAC2264 COG0112 # Protein_GI_number: 15895532 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Clostridium acetobutylicum # 4 411 3 411 411 518 63.0 1e-147 MFSFDYVKDFDPEVAKAMEDELGRQRNNLELIASENLVSEAVMAAMGSHLTNKYAEGYPG KRYYGGCQYVDVVENLAIERAKELFGCEYVNIQPHSGAQANMAVFFAVMNLGDTYMGMNL DHGGHLTHGSPVNMSGKNYHCVPYGVNDEGVIDYDKVREIALECHPKMIIAGASAYARKI 
DFKRMREIADEVGAVLMVDMAHIAGLVAAGLHESPIPYAHVTTTTTHKTLRGPRGGMILS SDEVNKKYNFNKAIFPGIQGGPLMHVIAAKAVCLKEALTPEFKEYQKQIVKNASVLADAL IERGFDIVSGGTDNHLMLVDLRKMGLTGKDMEKLLDSVHITCNKNTVPNDPQSPFVTSGL RLGTPAVTSRGLKEDDMVQIAEAIKLTIIDHKLEEAEAIVKSLTEKYPIYQGDKF >gi|222441855|gb|ACEP01000087.1| GENE 7 7269 - 9089 1282 606 aa, chain + ## HITS:1 COG:PA1611_1 KEGG:ns NR:ns ## COG: PA1611_1 COG0642 # Protein_GI_number: 15596808 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Pseudomonas aeruginosa # 216 453 257 492 516 181 38.0 3e-45 MEAWNISYIYTVIFMMLINNILVYVLSRTPGFGKSERWILQLLIAVCVCDLSDVFGILFK QTAGRNVLFILDAAFIFSVASISLFFLCYSENIYGSDMFKKRIPLITIHIPIDVLCIIIV ASYRTEWIYKYPQILMIANIYNAYSVLLSLWRIHKEKNAEKRKVLWQPVFYIVPFFIGIF LQCFFNTMPWANTSLTITILLIFVNNQQRLLQKKTQDAEAAVRAKSEFLSHMSHDIRTPI NGMMGMLDIAQAHLNNPEKMDLCLSKMRGAADQLLSLINDVLDMSKIETGSIQLVEEPFD MIRLLNGTLAVQEIIASEKSLTIEQDIEGAIEHPCVCGSPNYVRSILVNIISNAIKYTNP GGDIFVSARELSCDGEYVKFEFIVSDTGIGMSEEFAEHIFEPFTQEHAENRSSYQGTGLG MSIVKNLINKMKGTITLETKQGEGSTFTITLPVKLDTVCFEETETEEEETSIEGMKILLV EDNDLNLEVAQYILEDAGAEIIVARNGLESVELFEQSESDSFDVILMDVMMPVMDGLTAT KRIRKLKRKDARTVPVIAMTANVFNEDIIAAKEAGMNEHIAKPLDFDKLIHTLAKYFLKM DKKLIS >gi|222441855|gb|ACEP01000087.1| GENE 8 9078 - 9719 764 213 aa, chain - ## HITS:1 COG:no KEGG:ELI_0578 NR:ns ## KEGG: ELI_0578 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 213 1 211 211 209 47.0 8e-53 MTPKQEAWAIQAEHVIKQMERRHFNAYYCSTKEEALTKALSIIEKGSVIASGGSATLGEV GLLDYIKNHSEDYTFIDRSIAKTEDEKRQIHAKAILSDYYLMSTNAFTIDGQLVNIDGNG NRVSALCYGPKHVVIITGMNKLVPSVDAAYKRIRQDACPPNCVRLGLTNPCSKTGICGEC MSASSICDLFVTTRMSRYPDRIHVIMVGEELGY >gi|222441855|gb|ACEP01000087.1| GENE 9 10008 - 12179 1752 723 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 677 3 674 709 320 35.0 4e-87 
MKKLIIAEKPSVAKNIADATASSRRNGYFEGQNYIITWAFGHLLQLYDARDYDPEMRSWK IEKFPYIPEKFLYKIKSDGKNKDKPDSGVVQQMDIIKSLMERSDVEGIISATDDDREGQI IADEIINYISPQKPVERILLNEWTAEEVNRGLQNLKPNEEMKSLQDAGFGRQIADWLIGI NLTSVATLKYRNQKNIRLLNVGRVLMPTLKIIYDRDKEIADFKVTKYYKLKGIFQTEDKK KFDSIYYIKAGEKFDNPEELKEIQEKIKDKNGEVVSKKITNKKEYPPYLFNLSGLQGYIT SKYKGWTSDRVLKTAQQLYEKKFITYPRTASVVLDESLKEKTAKVLNIHKKSCPYEEMIQ FKSTKRIFDSKKVESHSAIIPTYMQPGSLNESERIVYEAVKNRFLAQFMPIAVAEDTVLH IKMQNTDIPGEFVAKGKVQLEAGWKLIEGIDAKETILPAVKKGDVLPLIKSEINEVERKP PKPHTEKTLLKVMETCGKKYEEKEEEQMMMAILSGFSIGTPATRAETIKKLKDAGYIKAK GKSLSCTDLGKILVETFPARELFDLEYTGRLEKTLSDIERKKFTKDDFLAMIEQFVIESV DKIKNDKVFGGPVTLEPLHTEPAGICPECGAPVVETARAFGCSGWKKGCKFAIWKDDRFI TSLGKKVSYEMVKILLEHGKVGFRGCVSKRGNKFSAYFYYEKDEKTGRYNWRLEFIDTNP PQA >gi|222441855|gb|ACEP01000087.1| GENE 10 12418 - 12642 101 74 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3660 NR:ns ## KEGG: Rumal_3660 # Name: not_defined # Def: transposase, IS605 OrfB family # Organism: R.albus # Pathway: not_defined # 2 34 1 33 372 65 78.0 6e-10 MVANRAYKFRIYPNDEQKILFAKTFGCVRMVYNHSVGHTGIARICLAYSVIKREPNLPRL GSPTSTGGEDVTKR >gi|222441855|gb|ACEP01000087.1| GENE 11 12675 - 13367 162 230 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756452|ref|ZP_02163565.1| 50S ribosomal protein L34 [Kordia algicida OT-1] # 24 216 121 340 391 67 25 7e-11 MQPEKKMKRLENRELFQDFNRTIASVLFEENTYPWEVLPKIHDFIMEVGMLLSQDEYEEV KEHVWVAKSAMVAPTAYINGPAIIGPDAEIRHCAFIRGNAIVGEGAVVGNSTELKNVILF DKVQVPHYNYVGDSILGYKSHMGAGSITSNVKSDKKLVEVHLDDTKIETHMKKIGAILGD YVEVGCGSILNPGTIVGRHSNIYPLSSVRGYVAGHSIYKNKNEVVQKEEY >gi|222441855|gb|ACEP01000087.1| GENE 12 13385 - 14168 699 261 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027639|ref|ZP_03716831.1| ## NR: gi|225027639|ref|ZP_03716831.1| hypothetical protein EUBHAL_01896 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01896 [Eubacterium hallii DSM 3353] # 1 261 1 261 261 444 100.0 1e-123 MKRGRRSKYVVLLISLVLIFSLTGCKASKKKVLESSYYKELQKENKKLKKQNKSLKSKVD AENDMTEDEQRASDYLEKISRDHLVKLEVGYADNMDGSEFIEGEAVFSLATTIASRADKT 
TKYTPDEVKEKYGPGYEYILYDEDNAIYEIMVYGGNYIVFTDLPNNVYYAYNASAIGDAF LHFRNGYPNSKLFHRLADAPLIINDKGRCYENEAASSVATYIDQMSKKKSNEARAKKKWG KKAAKKVSKGRTYTFYHHGNT Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:48:24 2011 Seq name: gi|222441854|gb|ACEP01000088.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont95.1, whole genome shotgun sequence Length of sequence - 84970 bp Number of predicted genes - 77, with homology - 75 Number of transcription units - 30, operones - 16 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 93 - 164 63.1 # Glu CTC 0 0 - TRNA 243 - 314 63.1 # Glu CTC 0 0 + Prom 350 - 409 10.9 1 1 Op 1 1/0.000 + CDS 457 - 1908 686 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 2 1 Op 2 6/0.000 + CDS 1901 - 4531 2844 ## COG0249 Mismatch repair ATPase (MutS family) + Prom 4581 - 4640 9.2 3 1 Op 3 12/0.000 + CDS 4668 - 6863 1858 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 4 1 Op 4 . + CDS 6870 - 7838 1080 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 5 1 Op 5 . + CDS 7822 - 9141 1521 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance + Prom 9144 - 9203 4.8 6 2 Op 1 4/0.000 + CDS 9242 - 9940 743 ## COG2003 DNA repair proteins + Term 10001 - 10044 11.4 + Prom 9952 - 10011 1.7 7 2 Op 2 22/0.000 + CDS 10064 - 11104 1461 ## COG1077 Actin-like ATPase involved in cell morphogenesis 8 2 Op 3 . + CDS 11120 - 11989 851 ## COG1792 Cell shape-determining protein + Term 12001 - 12053 5.7 9 3 Op 1 . + CDS 12065 - 12580 506 ## EUBREC_1743 hypothetical protein 10 3 Op 2 1/0.000 + CDS 12573 - 15512 2923 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 11 3 Op 3 22/0.000 + CDS 15580 - 16329 690 ## COG0850 Septum formation inhibitor 12 3 Op 4 22/0.000 + CDS 16351 - 17142 1007 ## COG2894 Septum formation inhibitor-activating ATPase 13 3 Op 5 . 
+ CDS 17157 - 17432 192 ## COG0851 Septum formation topological specificity factor 14 3 Op 6 . + CDS 17452 - 18576 742 ## COG0772 Bacterial cell division membrane protein 15 3 Op 7 . + CDS 18589 - 18984 531 ## COG1803 Methylglyoxal synthase 16 3 Op 8 . + CDS 19006 - 19971 821 ## COG1686 D-alanyl-D-alanine carboxypeptidase + Prom 19987 - 20046 3.8 17 4 Tu 1 . + CDS 20090 - 20506 563 ## EUBREC_1736 hypothetical protein + Term 20592 - 20630 4.3 + Prom 20525 - 20584 2.3 18 5 Op 1 . + CDS 20639 - 21586 546 ## COG1242 Predicted Fe-S oxidoreductase 19 5 Op 2 . + CDS 21588 - 22979 1122 ## EUBREC_1735 hypothetical protein 20 5 Op 3 14/0.000 + CDS 22979 - 23671 717 ## COG0325 Predicted enzyme with a TIM-barrel fold 21 5 Op 4 . + CDS 23687 - 24232 756 ## COG1799 Uncharacterized protein conserved in bacteria + Term 24306 - 24361 17.5 22 6 Op 1 . + CDS 24380 - 25465 1168 ## COG0337 3-dehydroquinate synthetase 23 6 Op 2 15/0.000 + CDS 25520 - 26023 479 ## COG0597 Lipoprotein signal peptidase 24 6 Op 3 . + CDS 26038 - 26997 286 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit + Term 27056 - 27112 3.1 + Prom 27038 - 27097 3.4 25 7 Op 1 . + CDS 27127 - 27393 333 ## PROTEIN SUPPORTED gi|238924297|ref|YP_002937813.1| ribosomal protein S15 + Term 27404 - 27439 6.0 26 7 Op 2 . + CDS 27456 - 28112 669 ## gi|225027665|ref|ZP_03716857.1| hypothetical protein EUBHAL_01924 + Prom 28130 - 28189 6.8 27 8 Op 1 . + CDS 28211 - 28420 110 ## 28 8 Op 2 25/0.000 + CDS 28350 - 28613 472 ## COG1925 Phosphotransferase system, HPr-related proteins 29 8 Op 3 . + CDS 28642 - 30351 2235 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) + Term 30398 - 30437 7.0 + Prom 30494 - 30553 8.3 30 9 Tu 1 . + CDS 30593 - 32596 1790 ## COG0210 Superfamily I DNA and RNA helicases + Prom 32641 - 32700 5.0 31 10 Tu 1 . 
+ CDS 32726 - 33547 1158 ## COG1968 Uncharacterized bacitracin resistance protein + Term 33652 - 33694 -1.0 + Prom 33550 - 33609 4.1 32 11 Tu 1 . + CDS 33762 - 34337 577 ## COG0500 SAM-dependent methyltransferases + Term 34346 - 34382 2.5 + Prom 34351 - 34410 9.6 33 12 Op 1 13/0.000 + CDS 34531 - 35298 1141 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 34 12 Op 2 1/0.000 + CDS 35295 - 36197 1356 ## COG0167 Dihydroorotate dehydrogenase + Prom 36248 - 36307 5.5 35 12 Op 3 . + CDS 36420 - 37061 845 ## COG0461 Orotate phosphoribosyltransferase 36 12 Op 4 . + CDS 37082 - 37234 230 ## + Term 37241 - 37284 -1.0 + Prom 37264 - 37323 3.2 37 13 Op 1 . + CDS 37367 - 37849 203 ## COG4767 Glycopeptide antibiotics resistance protein 38 13 Op 2 . + CDS 37884 - 38189 267 ## gi|225027677|ref|ZP_03716869.1| hypothetical protein EUBHAL_01936 39 13 Op 3 . + CDS 38235 - 40355 2364 ## Closa_1605 hypothetical protein + Prom 40397 - 40456 6.1 40 14 Op 1 . + CDS 40504 - 41772 1460 ## COG0044 Dihydroorotase and related cyclic amidohydrolases + Prom 41782 - 41841 4.1 41 14 Op 2 . + CDS 41868 - 42794 1293 ## COG0284 Orotidine-5'-phosphate decarboxylase 42 14 Op 3 . + CDS 42824 - 43336 780 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase + Prom 43342 - 43401 6.0 43 15 Op 1 21/0.000 + CDS 43430 - 44455 851 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 44 15 Op 2 . + CDS 44449 - 45075 858 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN + Term 45119 - 45181 3.3 + Prom 45103 - 45162 4.0 45 16 Op 1 . + CDS 45199 - 46476 1796 ## COG0151 Phosphoribosylamine-glycine ligase + Term 46496 - 46532 -0.2 46 16 Op 2 . + CDS 46543 - 47244 833 ## COG1296 Predicted branched-chain amino acid permease (azaleucine resistance) 47 16 Op 3 . + CDS 47234 - 47551 304 ## EUBELI_20620 hypothetical protein + Term 47698 - 47751 4.3 + Prom 47931 - 47990 3.1 48 17 Tu 1 . 
+ CDS 48012 - 52577 4904 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 49 18 Tu 1 . - CDS 52776 - 54446 1951 ## COG1069 Ribulose kinase - Prom 54482 - 54541 8.5 50 19 Tu 1 . - CDS 54558 - 55916 1480 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 56019 - 56078 5.0 + Prom 56292 - 56351 4.4 51 20 Tu 1 . + CDS 56381 - 57613 1066 ## COG1609 Transcriptional regulators + Term 57673 - 57709 0.0 - Term 57749 - 57787 -0.6 52 21 Tu 1 . - CDS 57925 - 58416 467 ## COG1607 Acyl-CoA hydrolase - Prom 58473 - 58532 6.5 + Prom 58886 - 58945 5.8 53 22 Tu 1 . + CDS 58995 - 60536 1702 ## COG1620 L-lactate permease + Prom 60544 - 60603 5.8 54 23 Op 1 1/0.000 + CDS 60725 - 62137 1852 ## COG0277 FAD/FMN-containing dehydrogenases 55 23 Op 2 29/0.000 + CDS 62156 - 62959 1067 ## COG2086 Electron transfer flavoprotein, beta subunit 56 23 Op 3 3/0.000 + CDS 62973 - 63977 1500 ## COG2025 Electron transfer flavoprotein, alpha subunit + Prom 64339 - 64398 3.9 57 23 Op 4 . + CDS 64463 - 65605 1498 ## COG1960 Acyl-CoA dehydrogenases 58 23 Op 5 . + CDS 65623 - 66909 1231 ## COG3875 Uncharacterized conserved protein + Term 66992 - 67034 7.3 + Prom 67032 - 67091 6.4 59 24 Tu 1 . + CDS 67137 - 67880 741 ## COG2186 Transcriptional regulators + Term 67886 - 67946 1.1 - Term 67874 - 67932 0.7 60 25 Tu 1 . - CDS 67995 - 68255 237 ## NT01CX_0524 stage III sporulation protein D - Prom 68279 - 68338 3.5 + Prom 68682 - 68741 7.4 61 26 Op 1 8/0.000 + CDS 68923 - 69378 482 ## COG1762 Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) 62 26 Op 2 7/0.000 + CDS 69430 - 70131 526 ## COG1445 Phosphotransferase system fructose-specific component IIB 63 26 Op 3 2/0.000 + CDS 70148 - 71167 1053 ## COG1299 Phosphotransferase system, fructose-specific IIC component + Term 71282 - 71339 0.1 + Prom 71201 - 71260 7.6 64 26 Op 4 . 
+ CDS 71368 - 73281 1586 ## COG3711 Transcriptional antiterminator 65 26 Op 5 13/0.000 + CDS 73328 - 74317 1519 ## COG3444 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB 66 26 Op 6 13/0.000 + CDS 74349 - 75149 1173 ## COG3715 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC 67 26 Op 7 4/0.000 + CDS 75170 - 76084 1078 ## COG3716 Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID + Term 76122 - 76173 14.3 68 27 Tu 1 . + CDS 76195 - 76554 403 ## COG4687 Uncharacterized protein conserved in bacteria + Term 76571 - 76612 3.6 + Prom 76572 - 76631 5.2 69 28 Tu 1 . + CDS 76683 - 77630 840 ## COG1482 Phosphomannose isomerase + Prom 78292 - 78351 3.6 70 29 Op 1 35/0.000 + CDS 78504 - 79904 1546 ## COG0147 Anthranilate/para-aminobenzoate synthases component I 71 29 Op 2 9/0.000 + CDS 79980 - 80543 618 ## COG0512 Anthranilate/para-aminobenzoate synthases component II 72 29 Op 3 9/0.000 + CDS 80614 - 81399 722 ## COG0134 Indole-3-glycerol phosphate synthase + Prom 81420 - 81479 4.7 73 29 Op 4 23/0.000 + CDS 81510 - 82193 537 ## COG0135 Phosphoribosylanthranilate isomerase 74 29 Op 5 37/0.000 + CDS 82209 - 83381 1510 ## COG0133 Tryptophan synthase beta chain 75 29 Op 6 . + CDS 83399 - 84190 436 ## PROTEIN SUPPORTED gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc + Term 84225 - 84266 -0.3 + Prom 84216 - 84275 9.1 76 30 Op 1 . + CDS 84316 - 84564 315 ## Dhaf_3301 prevent-host-death family protein 77 30 Op 2 . + CDS 84566 - 84865 303 ## Dhaf_3300 addiction module toxin, RelE/StbE family Predicted protein(s) >gi|222441854|gb|ACEP01000088.1| GENE 1 457 - 1908 686 483 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 51 465 8 421 451 268 36 5e-71 MTNINYELIDTTAPAPEQEPERQRYFIAKCREIIKERELEIGRKLTFYDQTFGCQMNFKD SEKLNGILEDIGYVKADTEKADFVYYNTCTVRENANVRVYGRLGALKNYKKKNPEMVIAM CGCMMQEPEEQEKVKTTFKFVDVVFGTHNIFKLAELLYECLSGRKRVFDVWEKTDQIIED LPSDRKFGFKAGVNIMFGCNNFCSYCIVPYVRGRERSRKPEDIVAEVKKLASEGVVEVML LGQNVNSYGKNLENPISFAKLLTMVEEVEGIERIRFMTSHPKDLSDELIEVMAHSKKICK HLHLPVQSGSSRILKLMNRKYTKEHYLELVDKIRTAVPDIAITTDIIVGFPGETEEDFQE TLDVVRKAKYDSAFTFIYSKRSGTPAAKMPDQVPDDVVHERFDRLLKVVNEAAKEQNGKL TGNTELVLVEEIDEKDDTMVTGRLSNNSVVHFKGDASLIGKIVPVVLEESKGFYYLGRLS VNG >gi|222441854|gb|ACEP01000088.1| GENE 2 1901 - 4531 2844 876 aa, chain + ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 4 876 3 867 869 828 50.0 0 MAKLTPMMEQYFQIKNKYKDCILFYRLGDFYEMFYEDALTASRELEITLTGKNCGQEEKA PMCGVPFHSCEPYINKLVEKGFRVAICEQVEDPKAAKGIVKRDVIRVVTPGTNVMTQSLD ESKNNFIMSVFCEDDLFGIGVCDLSTGEFRTTQVEHQDALFDEMNKFQPSEIICNDAFCI CGMDFDFIKDKVGAVISPVASYYFEKDNCEMLIKEQYNLANLEGIGLTDYPFGVIASGAL LQYLHETQKNSLSHLMELKVYSTENYMVIDSSTRRNLELCETLREKGKKGSLLWVLDKTK TAMGARMLRKMIEQPLIHKTAIQDRLDAVEMLKDNLMAREELREYMNSIYDLERLTMKVS YRSANPRDLISFKTSIQYLPYIKDILAQFSKGVLAKMAGDLDTLEDLHELLEASIEEDPP IPIKEGGIIKEGFDEEIDHLKKAKTDGKTWLAELEEREREKTGIKNLRVRYNKVFGYYIE VTNSYKDLVPEYYIRRQTLANAERYTTEELSELARTILGAEDRLYALEYETYVAIREKLA AEMERIQKTASVIAWTDVFASLALVAETNQYVRPTLNQRGVIDIKDGRHPVVEKMMSGEL FVANDTLLDHKKNRVDIITGPNMAGKSTYMRQTALIVLMAQIGSFVPAKSANIGLVDRIF TRVGASDDLASGQSTFMVEMSEVANILRHATKDSLLILDEIGRGTSTFDGLSIAWAVVEY IAGSSLSGAKTLFATHYHELTELEGKLPGVNNYCIAVQEKGDTIIFLRKIIKGSADKSYG IQVAKLAGVPEVVIERAKEIAAELEESDITANTKNIGKKHTEEVEPVQISLFDTMGMVAE QPKKSEVEETLKKLDISNMTPMEAMNQLYELQKKCK >gi|222441854|gb|ACEP01000088.1| GENE 3 4668 - 6863 1858 731 aa, chain + ## HITS:1 COG:SA1138 KEGG:ns NR:ns ## COG: SA1138 COG0323 # Protein_GI_number: 15926879 # Func_class: L Replication, recombination and repair # Function: DNA mismatch 
repair enzyme (predicted ATPase) # Organism: Staphylococcus aureus N315 # 1 730 1 668 669 371 33.0 1e-102 MPEIRVLDQSTINQIAAGEVVERPSSIVKELVENAVDANSTAVTVEIKGGGIDFIRITDN GCGIEKEQVRKAFLPHATSKIRSASDLETVVSLGFRGEALSSIASVAKVELVTKTDEGIS GIRYVIEGGEEKSYDEIGCPEGTTFIIRNLFFNTPARRKFLKSKMTEAGYVESFIQRLAL SHPDISFKFICDNKNKISTSGNGNLKDVIYNIFGRDVAMNLLPVKGNENGILVDGYIGKP VISRGNRNYENYFVNGRYLKNNIISKAIEEGYKGHAMVHKFPFTALMISMDPHCFDANVH PAKMEMRFRNAEELYSSVMSAVRSSFVKKELIPKVGIGNDKKEKESVKKIPEPFERKRRA IEERFSSLSQEKIREIEKRKEQGDAAEQRVSVLAKNSMENNVRDSVQGNDGNNAQNPIQN PAQNTTGNMSGSVNGYSGVHGENSENATEISNRRSSIEERIARLHVGENNAIKKEIDDVI PKSMDESTAKAEEYSESDDRQIKQDKKPSGQKELLKEADTYGKGTQMSFADVPLLSEEAR PKHRIIGQVFRTYWLVEFDEKLFFVDQHAAHEKVMYEKLKKDLENNTIVQQMVAPPVILT FSIKEQQKFKICEESFKKLGFLIEEFGGNEYCIRGVPANLLGIDPQELFIEIFDQIEENS GKMNLEMITDRLATMACKAAVKGNTAMSYQEMDALMDQLMKLDNPYQCPHGRPTIISMTK YELEKKFKRVQ >gi|222441854|gb|ACEP01000088.1| GENE 4 6870 - 7838 1080 322 aa, chain + ## HITS:1 COG:CAC1835 KEGG:ns NR:ns ## COG: CAC1835 COG0324 # Protein_GI_number: 15895110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Clostridium acetobutylicum # 7 313 2 308 309 290 48.0 4e-78 MVVEQKKPLIILTGPTAAGKTKLSIGLAKSIGGEIISADSMQVYKKMDIGTAKIRPEEMD GVPHYLVDEFDPSEEFNVVVFVERAKAAMEKIYAAGHIPIIVGGTGFYIQALLYDIDFSE HEGKESYRKELEQLAKEKGKEYLYEILKEVDPEYAQIVHFNNVKRVIRALEYHKETGHKL SEHNKEQRQKDSPYNFAYFVLYHDREVLYDRINRRVDLMMEDGLLEEVKGLIEEGYTKDL VSMQGLGYKEFFDYFDGEMSLEETVDKVKRDTRRFAKRQLTWFRREKEVVWLNKKEYEQE KNLLDSVLNIIKEKGICNGTKR >gi|222441854|gb|ACEP01000088.1| GENE 5 7822 - 9141 1521 439 aa, chain + ## HITS:1 COG:BS_ynbB KEGG:ns NR:ns ## COG: BS_ynbB COG4100 # Protein_GI_number: 16078807 # Func_class: P Inorganic ion transport and metabolism # Function: Cystathionine beta-lyase family protein involved in aluminum resistance # Organism: Bacillus subtilis # 26 425 16 417 421 441 53.0 1e-124 MEQRDKILEEYYAQMGIAKPVYDFCMKVEEELKDRFLSIDKTAECNQMKVLRAMQKNHLS 
EACFAPTTGYGYNDIGRETLEKIYADVFGTEDALVRPQITCGTHALALALMSQLRPGDEL LSPVGKPYDTLEEVIGIRPSNGSLAEYGISYHQVDLLPDGEFDWDGIEKALNEKTKLVTI QRSKGYASRPTLSVKRIGELIHFIKERKPDVLCMVDNCYGEFVEIEEPSQVGADLIVGSL IKNPGGGLAPIGGYLCGTKECIDRAAYRLTSPGLGKEVGATLGVTTSLTQGLFLAPTVVK GALKGAVFAANIYEKLGFKVIPDSKESRHDIIQEVEFHDPELMVAFCEGIQAAAPVDSFV RPEPWDMPGYDAQVIMAAGAFVSGSSIELSADGPMKEPYAVYFQGGLTWEHAKYGIMMSL QKIYETGKLKNILINDTES >gi|222441854|gb|ACEP01000088.1| GENE 6 9242 - 9940 743 232 aa, chain + ## HITS:1 COG:BS_ysxA KEGG:ns NR:ns ## COG: BS_ysxA COG2003 # Protein_GI_number: 16079856 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Bacillus subtilis # 6 229 8 228 231 169 43.0 3e-42 MERQSVTMKQRPETERPYEKCLKAGAESLTDGELLAVLIRCGTKNYSALELAFALLDRHP VYKGLAGLYHLDMEMLKTIPGIGNVKAIEILCALELSKRLARASIKEESDFSSPEYIASY YMEEMRHLCVEKVMLLLLDGRHRLMKEILLSKGTANSSWVPVRTIFVEALRYEAVYMILI HNHPSGCPEPSREDLVITKQIKEAGELLGIPLSDHIIIGDKCYTSLREQQMM >gi|222441854|gb|ACEP01000088.1| GENE 7 10064 - 11104 1461 346 aa, chain + ## HITS:1 COG:aq_845 KEGG:ns NR:ns ## COG: aq_845 COG1077 # Protein_GI_number: 15606197 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Aquifex aeolicus # 7 335 15 336 341 242 41.0 1e-63 MSEKIYGIDMGTNSIKIYQKAAGVILHQKNMIAMVKKKEPLALGDEAYSMYEKAPKNIEV KQPIINGVIADFTAMQTMIQLYFKGMHGKKFAEGLAAGGNAEFYIAVPSDITEVEKRAFY DLIASSSLKTKKIRIVEKPIADALGAGLDVMDATGRLIVNIGADTTEISLISLGGIVQSR LLKIGGTKFDEAIQQTVKKKHNLVIGMKSAERLKMQLASGIKEDEEITYDVMGRNLITGL PNKVTVTNHLIFEAMSESLSSIIDAIKFILEKTPPELSMDIMHDGVCMTGGSTHIKNLDL LIQQETGLSIMKMEEPELTVVKGLGMIMENPDYERLAHNLQDSDFH >gi|222441854|gb|ACEP01000088.1| GENE 8 11120 - 11989 851 289 aa, chain + ## HITS:1 COG:CAC1243 KEGG:ns NR:ns ## COG: CAC1243 COG1792 # Protein_GI_number: 15894526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Clostridium acetobutylicum # 15 285 12 278 283 114 31.0 2e-25 
MKNRHNFEFSPKTMLVFLTIICILLLSVSAVLKDAVKPLSAVAETVVIPMQDGINSFGVW VGDHAGSFKSMKELKKENAKLNEKVRELTQENETLSTNEKELNNLQKLLKIDKAYSQYKK VGARVISKGSGNWYNTFVINKGSKDGIKVNMNVIADGGLVGIVTETGRTYAKVCSIINDS SSVSAKSSQAGDTCVVQGNSESVQKDGTIDVTYISKDADMKEGEELITSHISSKYLEGIK IGTVSKISTDSSNLTKSAKVTPFVDFQHIEDVLVITQLKEVPADSDSAD >gi|222441854|gb|ACEP01000088.1| GENE 9 12065 - 12580 506 171 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1743 NR:ns ## KEGG: EUBREC_1743 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 168 1 168 173 136 45.0 4e-31 MKRGIVTAVLVFICFILQTSVFELIKLAGITPNILLILISSVAVMRGQKPGMIVGFFSGL LMDIFYGSTIGGFAFLYMVFGFVDGFFNRIYYSDDNILPLVLIGVNDLIYGFIMYILCGL LQNHLRIIYYLKNIILPEMVYTVAVGLVCYQILLRINDWLEKNEKRGADFV >gi|222441854|gb|ACEP01000088.1| GENE 10 12573 - 15512 2923 979 aa, chain + ## HITS:1 COG:HI0032 KEGG:ns NR:ns ## COG: HI0032 COG0768 # Protein_GI_number: 16272007 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Haemophilus influenzae # 632 957 279 623 651 157 32.0 8e-38 MFKFKKKQEKDKTKLRGQIIELLSNRIIIMGVLLMVLFCLLIYRLFVLQIIQGDEHLANF NYKIKKTIEISGARGNIYDCNGKTLAYNQLAYTVTLENTDETSRIARERTNANGKNGQKV TENDVKNEVIYKLIKVLETNGDTINYSLPMTVNSKGKLKFTVSGSSLARFKKDIYGITNI DNLSGDEKKKAEKYLNSTPEEVYEYLRSGKNGPQGTGNMFGIADSYSTEDTLKIMSVRYD VFMNRYSQTTPITVATNISDKSIAAISEHDDEYPGVSIKADSLRKYNDAKYFSSVLGYTG VVSESELKELNGNGGKYEANDVVGKTGIEKTMESTLQGKKGQKDVLVDNLGKVIKTVKTT KASAGNNVYLTIDADLQKYAYNILERRLAGILLAHLTTADTAGSEKRVPIKDVYYALIDN NIINISKLSRKKAKTNEKDVYQIYRKKQETVLSTLRKDLQSGTTIRKNLSEEKQDYVSYI YKMLENDGILVASSIDENDQVYLDWKDEKITFRKFLRHAINNEWINISSFNIKSDYYDAD EIYDELINYIVDALKSEEDFDKILYEYMIKNGNLSGKKVCLLLYDQGVLDKKKDKDYNSL YTGKMSAYNFMYKKIKNIEITPAQLALDPCSGSVVITDPDTGEVKAMVSYPSYNNNRLAN GIDSSYYASLNTDLSSPMLNRATQSRTAPGSTFKPISSTAVLEEGVANASTYVKCTGIFD KISPAAKCWIYPNAHGSLNVSGAIAVSCNFFFYQMGYNLGTVNGTYNSQRGLTKLKKYAS MYGFNRKSGVEITEYEPHISDEDAIRSAIGQGTHSYTPSQISRYVTSLVNHKNLIDLSLI 
EKTTDSSGSVITSYKKRTEEKLDIKSSTFDLIKKGMIGVVNGKDSSIKYLYEKQGMKVAG KTGTAQENKKRPNHALFISYAPYDNPDITMTVVVPNGYTSANAAEIARDIYKYYFGKASK AEKKATTALLPTGSDGSND >gi|222441854|gb|ACEP01000088.1| GENE 11 15580 - 16329 690 249 aa, chain + ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 1 238 1 215 216 119 35.0 5e-27 MKSSVIIKGNKYGFQIVLNPTLPFEELLREVGNKFKESTHFFDLNKPIAVSFTGRELTSG EQNLLVDTISENSGLNISCVIDGAKVYETQFAKVLNEIREEEKKEKKRQEAESFPENRIA KNGQFYRGTLRSGQKIEVDGSIVILGDINPGAQIVAGGNVVVLGCLKGTVHAGYPEDNSA FVAALMMEPMQIQIGSYIARSPDQRVKNTKKNAKKKNRQELLAKLAFVENDSIFIETITR SLISEISIS >gi|222441854|gb|ACEP01000088.1| GENE 12 16351 - 17142 1007 263 aa, chain + ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 1 259 1 261 264 331 65.0 8e-91 MSEVIVITSGKGGVGKTTTTANIGTGLAKLGKKVVMIDTDIGLRNLDVVMGLENRIVYNL VDVVEGNCRLKQALIKDKRYSNLFLLPSAQTRDKSAVSPEQMRKLVDELRKDFDYILLDC PAGIEQGFKNAIAGADRAIIVTTPEVSAIRDADRIIGLLEAEELKKIELVINRIRMDMVK RGDMMSVEDVVDILAIDLIGVVPDDESIVIATNEGEPLVGSDTQAGKAFANICHRVLGEE VPIMEFETESILKKIANLFKRGY >gi|222441854|gb|ACEP01000088.1| GENE 13 17157 - 17432 192 91 aa, chain + ## HITS:1 COG:FN0177 KEGG:ns NR:ns ## COG: FN0177 COG0851 # Protein_GI_number: 19703522 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation topological specificity factor # Organism: Fusobacterium nucleatum # 1 84 1 81 99 64 45.0 4e-11 MGFMDGFLRKKTSGNTAKDRLKLVLVSDRANCSPETMELIKNDIIKVISKYMEIDTDGLD IQITQMESDGGHGNVPAIYANIPIKELRSPR >gi|222441854|gb|ACEP01000088.1| GENE 14 17452 - 18576 742 374 aa, chain + ## HITS:1 COG:HI0031 KEGG:ns NR:ns ## COG: HI0031 COG0772 # Protein_GI_number: 16272006 # 
Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Haemophilus influenzae # 13 368 20 360 371 175 34.0 1e-43 MLKQKYQFKNYNWVLIGAVLILSGMGVLFINSADSSYTSKQLVGLIFCTGVMLFLSVVNF NFVCGFNRVLYMINIVLLVLVKLVGVSVNGAQRWINLGFTRLQPSELTKIIMIIFVAVYI QEHEEDLMEWKVLLKLALLCALPLFLVVIEPNLSTTLDITFILLSVIFVGGFSMALIKKW LKIIIPVMIPLGFLFIWYIQTPNQILLHDYQVTRIMTFLEPSKYSSTSAYQQDNSVMAIG SGKLYGKGLNNNTIADVTVADTGFVSEQQTDFIFSVVGEETGFVGSVIVIALLAIIVIEC LKTAYVAKNMSGRLIASGMAALIGFQSFINIGVATEFLPNTGLPLPFVSYGLTSLLSYMA GIGIVLNIGLQRHY >gi|222441854|gb|ACEP01000088.1| GENE 15 18589 - 18984 531 131 aa, chain + ## HITS:1 COG:lin2020 KEGG:ns NR:ns ## COG: lin2020 COG1803 # Protein_GI_number: 16801086 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Listeria innocua # 1 131 1 131 134 152 53.0 2e-37 MNIGLIAHDGKKKLMQNFCIAYRGILNKHELFATATTGRLIEEVTNLTIHKYLAGPLGGE QQMASQIEHNLIDMVIFLRDPQNQQVHEPDIFNVMHLCDVHNIPLATNLATAELLVKALE RGDLEWREMYR >gi|222441854|gb|ACEP01000088.1| GENE 16 19006 - 19971 821 321 aa, chain + ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 50 317 16 272 387 127 34.0 3e-29 MKKKLICTISILVLAAALCSGCSTAPFKITDSFASYYAKNKEEVGSDLKGMASDLAVLSD SEATDPNYQSNDYADLLINDSTNEVLESYHCFDRLYPASITKVMTALLTLEHGNMDDEIT LKHDINLTENGAVISTLTKGDTVTVGQVFHTMLIKSANDCAVILAEYVAGSESKFIDMMN AKAKELGATHTHFVNPNGLHDNNHYTTAYDLYLIFKEAVKYDTFVDTVSSKDYTMTYTTP KKTQINEYMQSTNYYLLNEFPVPEGVVMYGGKTGTTSMAKSCLILMTKNKKGERFFSVVL GAETKDMLYSSMSQLLEKTAN >gi|222441854|gb|ACEP01000088.1| GENE 17 20090 - 20506 563 138 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1736 NR:ns ## KEGG: EUBREC_1736 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 134 1 136 140 135 55.0 4e-31 
MIDIIAGAKGKGKTKILIQQVNDDIKLTKGTIVYLDKNNKHMYELSNQIRLIVVPEFNVT DTDMFLGFIAGIISQDHDLDKIYLDSFLTTACIKDNLDYAVEKLDALSEKFGVSFVISAS VDKENMPESVQKYVSHAL >gi|222441854|gb|ACEP01000088.1| GENE 18 20639 - 21586 546 315 aa, chain + ## HITS:1 COG:TM0515 KEGG:ns NR:ns ## COG: TM0515 COG1242 # Protein_GI_number: 15643281 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Thermotoga maritima # 6 306 3 302 303 246 42.0 3e-65 MEKQYYYAYSKYLKEKYGEKVYKLPVNLPVTCPNRKEGGGCLFCADAGTGFEAKESTMPV LEQLQSSREKVEKRYHAHKFIVYFQNYTNTFLPLPALLQYMEEAAGMENIVEISLSTRPD CIRQDYLEAFQNFRERTGIEVSIELGLQTVNYHTLSAMNRGHGLAEFINAVLLIAPYHFP VCVHMILNLPGDTMEDARESARVLSALPVSMVKLHSLYIPKECALYKEYEAGNIQICSKQ EYIERLAEFISLLRKDIVVERLFSRIPEKDSAFSNWGNSWWKLTDEWKALMEERQYVQGI RCDYLNGAALNRWEM >gi|222441854|gb|ACEP01000088.1| GENE 19 21588 - 22979 1122 463 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1735 NR:ns ## KEGG: EUBREC_1735 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 462 1 460 462 203 30.0 1e-50 MAGKEKGKLHIHINLGTLVFLVIVVYLMAYMLTYIGKDKLAIYEVSASDIVDNIEGTGVI LRQENLVNTAQDGYINYYVQDGAMVNKNGVVYMIDTTGEIQDYLKKVLEKKNTMSLDEKE QIIEKLKIFSEGYTDNHFSLSYETQNDINNTLLDYTDAMLTEHKAELEEKYGKDSYVEVT SAQEGLVSFVSDGQENLKQSQVSEETFLSKAKMSDLRTNKKQKAGSPVYRFIKGQDWQLM VPVGQNGYNRLKKRAEDNSSIQVTFQKDNFTTTTSYRCLKQNGKYYVILSFDNYIQRYLN QRYLSVSLTLSETNGLKIPSSSLVKKSVYRIPKSFLVHGGNSAEKDQLNIMETNKKGEKV LRQTSAIVYKTNDKYAYVVSKDLKTAIIISETDKQKIYTIKDSDKVEILGVYMVNKGYAV FALVDMVERNGDYCIVSTSGSKIELYDRIILNSDTVKEGQVIY >gi|222441854|gb|ACEP01000088.1| GENE 20 22979 - 23671 717 230 aa, chain + ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 12 230 6 220 221 207 52.0 2e-53 MLKENLKNVQNNIKKACERVGRKPEEVTLVAVSKMKPLSDIEELLETGQLEYGENYVQEL CDKYENISKPVHWHMIGHLQTNKVKYIIDKVELIHSVDSLHLAKQIEKEAVKKGVDAQIL 
VQVNIAQEDTKFGIDGPEVMSLVEEISKFPHVHIRGLMTSAPFVANPEENRCYFKKLHKL FVDIREKNIDNVSMDILSMGMTNDYEVAIEEGATMVRVGTGIFGARNYNK >gi|222441854|gb|ACEP01000088.1| GENE 21 23687 - 24232 756 181 aa, chain + ## HITS:1 COG:lin2136 KEGG:ns NR:ns ## COG: lin2136 COG1799 # Protein_GI_number: 16801202 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 70 174 49 152 152 73 35.0 2e-13 MKFAEKFLNMMKLNDADDETEPFDEEYNEVTEEEGYDEEEMEDKEEEPHKPILSLRRKKK DLTELQEDDEDGQSRVIPFHGKEEEGESVKVIRPQTFNEAQIVADFLKEGKTIVVNLEGI EIGQAQRIIDFIGGASFAVDGSLKAISNNIFIVAPGNIEVSGDLRDEIVSEDMLSPELTK F >gi|222441854|gb|ACEP01000088.1| GENE 22 24380 - 25465 1168 361 aa, chain + ## HITS:1 COG:VC2628 KEGG:ns NR:ns ## COG: VC2628 COG0337 # Protein_GI_number: 15642623 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Vibrio cholerae # 1 355 1 356 366 257 39.0 2e-68 METKRLEVTAEGKKVYDIVIENNFNQLVKELSVFGIKERKLCIVTERQVASYYLEEVCSL LEKECAKLEVYIFQEGEESKNLSTVDSLYHFLITHEFDRKDMLLALGGGVTGDLTGFAAA TYLRGIDFVQIPTSLLAQVDSSIGGKTGVDYHAYKNMVGAFYMPKLVYINVNTVKTLSER QYHSGLGEVIKHGLIRDKAYFEWMKDNKEAVAKREDEALAYMIEGSCEIKRAVVEEDPKE HGVRALLNFGHTLGHSIEKLMNFSLAHGECVAVGSLLAADISCQRGYISEEENKEIKELY AYFDFPVLPEELKAEDIVKETRHDKKMEHGKIKFILLDKVGEATIYQDVTPEEMLTVLER V >gi|222441854|gb|ACEP01000088.1| GENE 23 25520 - 26023 479 167 aa, chain + ## HITS:1 COG:SP0928 KEGG:ns NR:ns ## COG: SP0928 COG0597 # Protein_GI_number: 15900808 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Streptococcus pneumoniae TIGR4 # 1 148 1 146 153 94 40.0 1e-19 MGKKAKSVVIFFVLLLIDQLTKLWAVKILKPIGSIKIIHNVLEIYYVENRGAAFGIMQNK QWFFLIITLAVLVGLLWISGKIPEEKHFIPLKACLYFVGAGAVGNMIDRVFRKYVVDFIY FSLINFPVFNVADIYVTVAAFMLVVLILFFYQEEDLNRVFSKKKEQE >gi|222441854|gb|ACEP01000088.1| GENE 24 26038 - 26997 286 319 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 97 306 82 279 285 114 35 1e-24 MAERNEMMQQNQKQNIEQKYYYTVMEEQEGMRLDQFLAGELKEHSRSYIQKLIKEGRVTV GDKKEKPGCRLKVEDNVVICVPPLKELEVLPEKMDLDIIYEDEDVILINKPKDMVVHPCP GRYTGTLVNGLLYHCRDNLSGINGVLRPGIVHRIDKDTTGVLVVCKNDMAHKSLAEQLKE HSITRKYEAIVYNNFTQEEGTVDAPIGRSPADRKKMAVEPKNGKRAVTHYKVLSHLNHQF NHIECQLETGRTHQIRVHMASIRHPLLGDTTYGPKNAMGLTGQCLHARVLGFIHPRTGEY VEFEAPLPEYFQKLLEKYS >gi|222441854|gb|ACEP01000088.1| GENE 25 27127 - 27393 333 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238924297|ref|YP_002937813.1| ribosomal protein S15 [Eubacterium rectale ATCC 33656] # 1 88 1 88 88 132 72 5e-30 MITKEMKQEIMTKYARTEGDTGSPEVQVALLTARINDLQNHFKANPKDHHSRRGLLKMVG QRRNLLAYLKGKDIERYRTLIEQLGLRK >gi|222441854|gb|ACEP01000088.1| GENE 26 27456 - 28112 669 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027665|ref|ZP_03716857.1| ## NR: gi|225027665|ref|ZP_03716857.1| hypothetical protein EUBHAL_01924 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01924 [Eubacterium hallii DSM 3353] # 1 218 1 218 218 372 100.0 1e-102 MRKIIIRVVIAAVILLAGIGVFQWHNYQQKVEYRKEALLYFKEEDYSKTISYLGQALKLQ SVFAGKLDLDMTCYLAESHYQLKEYDEAEKIYDKLINNDSKNAQYYILKGECLAKSGDAK AAVKVYEQGWNSTNDTDFLEKICEIYVEQKDYDNALKYIQKGIEQGGESKAGFMYGKIVI YEKAQEYDRAYEAAKKYVELYPDDEAGKKEYIFLSTRI >gi|222441854|gb|ACEP01000088.1| GENE 27 28211 - 28420 110 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTKGTRGGMNAGYNTNRMHQPAAQYPVFFSYYLIYINLLRKGEINYGITESYNQQPTGT SHETGRSIR >gi|222441854|gb|ACEP01000088.1| GENE 28 28350 - 28613 472 87 aa, chain + ## HITS:1 COG:ECs4354 KEGG:ns NR:ns ## COG: ECs4354 COG1925 # Protein_GI_number: 15833608 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Escherichia coli O157:H7 # 1 87 1 87 89 64 41.0 4e-11 MVSQKVTINNPQGLHMRPAGVFAKGMAKFDSDVTINFNGKATNGKSLLNIIGACIKCGSE VEIQCDGADEEEALKTAIEMIESGLGE >gi|222441854|gb|ACEP01000088.1| GENE 29 28642 - 30351 2235 569 
aa, chain + ## HITS:1 COG:BH3073 KEGG:ns NR:ns ## COG: BH3073 COG1080 # Protein_GI_number: 15615635 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Bacillus halodurans # 3 564 6 567 572 533 50.0 1e-151 MYTGIGASAGIGIGSAVIVKEPSLAYDNKAVSDVDAEKKRLAEALDKCIAKTQAMADDMK ERVGEKEAEILEGHVLLLMDPEMTGQMETSIADDKMCAEAAVEKICDYYMQMFSSIDDEM FQQRATDVCDIKTRLLKILLDVEDVDLANVPEGTVLVAEDLTPSMTAGINPKNVTGILTE VGGKTSHSAILARALEIPAVLSIPGITEKISNGDKVVLDGSEGQTYINPDEKLQAEYEQK RVAYLEEKAALAQFVGRETKTADGVKVELCANIGKPEDALKVVECDGEGIGLFRTEFLFM DRPQIPTEEEQFAAYKKVAETLEGKPVIIRTLDVGGDKEIPYLALEKEENPFLGFRAIRL CLKREDLYRPQLKALLRASAYGDIRIMVPMVTCLDELRAVKNMIAELKEELKAEGVAYDE NIKVGIMIETPAASLMADVFAKEADFFSIGTNDLTGYTMAVDRGNAEVAYLYSAYNPAVL RSIERVIKAGRAEGIMVGMCGEAAADPLLAPLLISFGMNEFSVSATSILATRKGISLWTK EEADAVAAKAMSLTTEAEVEAYLKSVAKH >gi|222441854|gb|ACEP01000088.1| GENE 30 30593 - 32596 1790 667 aa, chain + ## HITS:1 COG:BS_yjcD KEGG:ns NR:ns ## COG: BS_yjcD COG0210 # Protein_GI_number: 16078247 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 5 624 136 758 759 336 34.0 1e-91 MPAIFNKEQEEAITHKEGPLMVLAGPGSGKTLVITYRVKWLIENAGVHPSNILVITFTRA AAEEMRKRFFAFDGMENAPVTFGTFHSIFFMILRYAYRYTAGNIIREDVKRRYIKEITEN MELEIEDEKEFLSGIINEISYVKGEMMSLSYYHSSNCSDELFAQIYEGYERRLREENLID FDDMLVFCYELLKERKDILAMWQQKFQYILIDEFQDINKVQYEIVRMLAGKGDHLFIVGD DDQSIYRFRGARPEIMLGFEKDYPDAKKVILNTNYRCSAEIVDSAEHLISHNTKRFPKNM QAARGKKVPITFRYLKDAGEECTDILKGIRFYHKKGIPLEEMAVIFRTNTQPRLLVGRLM EYNIPFQMRDVIPNIFDHWIARNILTYIKIAMGNKDRKLFLQIMNRPKRYISRSMLTDPQ VDLKKLKQETFGKKWLYEKIDKLEMDLCLLRKMEPYAAIQYIRNGIGYEDYMNEYAQFRR MNPDDLEEVLNQIQESAKEYHSFEEWFSYIDSYGEELEKQMEAGRQQQKSGVTLTTMHSS KGLEYEVVFVMDINEGTTPHKKAVKDADLEEERRLFYVAVTRAKTYLFLYSLKELYQKDA QISRYIGELRYDKKEFKKGRRVVHKNIGKGTILELKDDKIKIRFDNSKKPRLFSIKYLME QGLLELE >gi|222441854|gb|ACEP01000088.1| GENE 31 32726 - 33547 1158 273 aa, chain + ## HITS:1 COG:CAC0501 
KEGG:ns NR:ns ## COG: CAC0501 COG1968 # Protein_GI_number: 15893792 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Clostridium acetobutylicum # 3 272 4 274 274 300 63.0 2e-81 MLIEMLKVILLGIVEGITEWLPISSTGHMILVDEFIKLNVPKDFMEMFLVVIQLGAILAV CVIYFHKLNPFSPKKSAKEKRETVDIWGKVIVGCLPAAVIGLAFNDKIDALFYNAPTVAF TLILYGILFIIVENYNKHREPQVGELSDLTYKIAVIIGCFQVLALIPGTSRSGATIIGAM ILGTSRFVAAEYSFFLSIPVMFGASFLKLVKYGFHYTGMEVTILVVGMLVAFIVSIVAIK FLLGYIKKNDFKAFGVYRIVLGIVVAAYFLFVK >gi|222441854|gb|ACEP01000088.1| GENE 32 33762 - 34337 577 191 aa, chain + ## HITS:1 COG:lin1771 KEGG:ns NR:ns ## COG: lin1771 COG0500 # Protein_GI_number: 16800839 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Listeria innocua # 7 189 6 187 191 134 39.0 9e-32 MKRRTPITQYCHERIQQMIKEPLLCIDATAGTGKDTVFLAKLVGEKGRVISMDIQEMAIE QTKKRLLKERLSDRAEVVLDSHAHMDKYAQKDSVSLIMFNLGYLPGGDHSLSTKADTTIE ALEKGLNLLHEGGMISLLIYSGGDSGFEEKKQVLAWLRELTDDKYTVLVEAFYNKPNNPP LPVYILKNETA >gi|222441854|gb|ACEP01000088.1| GENE 33 34531 - 35298 1141 255 aa, chain + ## HITS:1 COG:BH2535 KEGG:ns NR:ns ## COG: BH2535 COG0543 # Protein_GI_number: 15615098 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Bacillus halodurans # 6 254 5 259 259 185 39.0 8e-47 MEMIKETAKVVRQQQIDEGIFDMELSFPKGAALAKPGQFIAMYCNDKSKLLPRPISICGI NKEEGTLRVVYRVAGEGTKEFSEMKEGDTLEVMGPLGNGFALKEEKAIIIGGGIGIPPML ELAKQLNVEKTVVLGYRTSTFLKDEFEAVCDVKVATEDGSQGAKGTVIDAIEKYGVEGKV IYACGPMPMLKALAVYAEEHGMEAQISLEERMACGIGACLGCICKTKGKDHHTNVNNTRI CKDGPVFDAKEVVFA >gi|222441854|gb|ACEP01000088.1| GENE 34 35295 - 36197 1356 300 aa, chain + ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate 
dehydrogenase # Organism: Bacillus subtilis # 3 297 2 297 311 308 52.0 1e-83 MNLEVTLAGKTFKNPVTTASGTFGSGAEYEQFVDLAALGAVTTKGVANVPWEGNPTPRVA ETYGGMLNAVGLQNPGIDLFCKRDIPHLKEKGASIIVNVCGRSTSDYVEVVERLADEPVD LLEINVSCPNVKEGGIAFGQNPDALYEITKAIKAKAKQPIVMKLSPNVTDITEMAKAAEA GGCDALSLINTITGMKIDIHRRKFVLANKTGGMSGPAIKPVAVRMVYQVSHACKLPIIGM GGILTGEDAIEMMMAGATMVSVGTANFHNPRATMEVLDGMKAYMERYHIEDINEIIGCID >gi|222441854|gb|ACEP01000088.1| GENE 35 36420 - 37061 845 213 aa, chain + ## HITS:1 COG:CAC0027 KEGG:ns NR:ns ## COG: CAC0027 COG0461 # Protein_GI_number: 15893325 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 213 12 224 224 291 66.0 5e-79 MIDCNVLKFGDFVTKSGRKTPFFVNTGFYRTGSQLKRLGEYYAQAIHDHFGFDFDVLFGP AYKGIPLSVAATIAISEKYDKDIRYCSNRKEVKDHGDKGILLGSPIQDGDKVIIIEDVTT AGTSIQETLPIIKAQGDVEVGGLVVSVDRMERGQGEKSALTEISEKYGMKTTAIVTMEEV VEHLYNKPYKGKVIIDDKLKAAIDAYYDQYGVK >gi|222441854|gb|ACEP01000088.1| GENE 36 37082 - 37234 230 50 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNEKIFKTLGQIGASGIVVGIITVIVGVSVGVVSIVSGARALKLKGKIMI >gi|222441854|gb|ACEP01000088.1| GENE 37 37367 - 37849 203 160 aa, chain + ## HITS:1 COG:BH2294 KEGG:ns NR:ns ## COG: BH2294 COG4767 # Protein_GI_number: 15614857 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Bacillus halodurans # 14 157 12 160 163 67 34.0 1e-11 MNENRKRLGSLSLLLLILYGCVVIYFVLFSDRLGRGDGYSTYRYNLVPFEEIRRFIIYRN YVSAGAFLLNLFGNLLVFFPIGFLVPIWRLEKTGCIRIIIYAFLFSLCIETLQLITKVGV FDVDDLMMNTVGGLIGWIIYCIIRFIYHKWKAKNKKKEED >gi|222441854|gb|ACEP01000088.1| GENE 38 37884 - 38189 267 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027677|ref|ZP_03716869.1| ## NR: gi|225027677|ref|ZP_03716869.1| hypothetical protein EUBHAL_01936 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01936 [Eubacterium hallii DSM 3353] # 1 101 1 101 101 132 100.0 8e-30 MNTRNISFSEKKVSKNAVITLVMGIIILIGFVILTILSVVTGGGLNVIGGLIGCILMILA 
FFCVLWGVMSYDDAKTSQKYKISGICINIVAIFIGITFIMM >gi|222441854|gb|ACEP01000088.1| GENE 39 38235 - 40355 2364 706 aa, chain + ## HITS:1 COG:no KEGG:Closa_1605 NR:ns ## KEGG: Closa_1605 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 703 1 688 691 721 51.0 0 MDYRELFKESNKEVEERFLLTRERIGEIAAGEKNGMKEQVHRYFKKTAQFLEKCASDFLC ISSDVWKSCSFEQMKSENELFYQDILPQNYGKSFANPAFAVQELGEEFGRILSFLYTELR SERAFVFEQNLEKITILNELFLEIYCMFEDETVSYKQIKDTIYWFLYDYADDWAGYRVRE LVDPALDFATSIIMDSDLGDLRYLYSYGDYISEVELKTAEFLNGLSDEEIEQIAATFTEG YQRGFELKGVDVSKKDTVNIRYPIGFERVIRAAIRQFAAMGLKPVIYREALNTLNKRQNL RIGYISSDPNPQYGYDHRFDNAIYLDKRMVDRKISCMRKAYEEYEQEAAGFAGPACFEVF GEEPFEPVNKPESYRLSERQQSLSVEYSSLSNELLNEFINQEERSFTIIAYPVPSIGEKF PEIFEEIRKVNTLDNELYKGIQQNIINILDQAESVHIMGRGRNMTNLTVALCDLKDAQKE TKFENCLADVNIPVGEVFTSPKLKGTNGLLHVTEVYLNGLCYKDLKITFKDGMVDDYECA NFPDKEDGRKFIKENLLYNRETLPMGECAIGTNTTAYVMAAKYGIMEKLPILIAEKMGPH IAIGDTCYSYCEDVRVYNPDGKEIVAKENECSVFRKEDPKKAYFNCHTDITIPYEELSRI AAVTAQTVEVPIMNDVAEERVEIPILLDGRFVLEGTLKLNEPFEGK >gi|222441854|gb|ACEP01000088.1| GENE 40 40504 - 41772 1460 422 aa, chain + ## HITS:1 COG:L81189 KEGG:ns NR:ns ## COG: L81189 COG0044 # Protein_GI_number: 15673051 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Lactococcus lactis # 1 422 19 440 441 460 55.0 1e-129 MKLVIKNGRVMDPASGRDEVTNIWVEDEKIIGFGEHPEWAEEAEELNAEGCVVAPGLVDV HVHFRDPGFTYKEDLETGSRAASAGGFTTVVCMANTKPIVDTPEVLQDILERTKSLPIHV KQASAVSKNFEGKELVDFDAMVEAGTCGFTDDGIPLTNAGFIKKAMEAAARVDMPISLHE EDPTLNEVNGVNKGVISDKMGIGGAPAISEDVMVARDVMLALETGAKVDIQHISSGRSVD LVRLAKKKGAKVYAEATPHHFTLTEEDVPNYGTMAKMNPPLRTEWDRTKIIEGLADGTID MIATDHAPHAIEEKEREFTKAPSGIVGLETSLSLGITSLVKRGYLTMMQLLEKMTVNPAR LYKFDCGTLEEGKAADIVIFAPDEVRTVESFESKAENSPFLGAQLYGKVKYTICDGKVVY RG >gi|222441854|gb|ACEP01000088.1| GENE 41 41868 - 42794 1293 308 aa, chain + ## HITS:1 COG:CAC2652 KEGG:ns NR:ns ## COG: CAC2652 COG0284 # Protein_GI_number: 15895910 # Func_class: F Nucleotide 
transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Clostridium acetobutylicum # 1 306 2 286 286 199 37.0 5e-51 MINQLVKGIKEKDAPIVVGLDPMLSYVPEHLVKDAFAAYGETLEGACEAIWQFNKGIIDA TYDLIPAVKPQVAMYEQFGVEGMKAFKKTCDYAKSKGLVIIGDIKRGDIGSTSEAYAIGH VGTVKIGEHIYSPFTEDFVTVNPYLGSDGVKPFLKVCKENNKGAFILVKTSNPSSGEFQD REIDGKPLYEIVAEKVAEWGKEVMGEDYSYVGAVVGATYPEMGAALRKIMPKNYILVPGY GAQGGKGKDLAPFFNEDGLGAIVNSSRGIICAYQKDPYKEKFSPVDYADASRQAVLDMKE DLKNAFVK >gi|222441854|gb|ACEP01000088.1| GENE 42 42824 - 43336 780 170 aa, chain + ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 169 1 169 171 208 62.0 3e-54 MAKVGIVMGSDSDMPIMAKAADVLKELGIDFDMTIISAHREPEEFFAYAKGAEEKGYKVI IAGAGKAAHLPGMCAAIFPMPVIGIPMKTSDLGGVDSLYSIVQMPTGIPVATVAINGAAN AGILAAKMLATSDPELLERLKAYSEKLKNQVVAKADKLEEIGYEEYLKQM >gi|222441854|gb|ACEP01000088.1| GENE 43 43430 - 44455 851 341 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 1 338 12 344 356 332 48 4e-90 MDYKNAGVDIEAGYKSVELMKKYVQGTMRPEVLTNLGGFSGAFSLKKFMTMENPTLVSGT DGVGTKLKVAFLMDKHDTVGIDCVAMCVNDIACAGGEPLFFLDYIACGKNFPEKIASIVS GVAEGCKQSEAALIGGETAEMPGFYPEDEYDLAGFAVGIVDEKDLITGKDIKAGDVLVGI ASSGIHSNGYSLVRKVFPMEKEALNEYREELGKTLGEALITPTKIYVKALKAVKEAGITI KGCSHITGGGFYENIPRMLPEGARAVVKKDSYEVPAIFRLLAKEGNIAEEMMYNTYNMGI GMMLALDPKDAEKAIEALKAVGEEAYVVGSVEAGEKGVTLC >gi|222441854|gb|ACEP01000088.1| GENE 44 44449 - 45075 858 208 aa, chain + ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 208 1 204 204 207 50.0 1e-53 MLKVAVLVSGGGTNLQAILDAVDSGKITNTEIRVVISNNEGAYALERAKNYGTEALLLSP 
KSFETREEFNQKLLEALKERDIDLVVLAGYLVVVPPCVIKEYENRIINIHPSLIPSFCGK GCYGLHVHEKALARGVKVSGATVHFVDEGTDTGPIIMQKPVMVEQGDTPEVLQRRIMEQA EWNILPETINLIANGKVHVDGRVVTIDK >gi|222441854|gb|ACEP01000088.1| GENE 45 45199 - 46476 1796 425 aa, chain + ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 420 1 417 428 416 52.0 1e-116 MKVLIVGSGGREHAIAWAVSKSAKADKIYCAPGNAGIEEYAECVNIGAMEFDKLVAFAKE KEIDLTIIGMDDPLVGGVVDAFEAEGLRVFGPRKNAAIIEGSKAFSKDLMKKYNIPTAEY ENFDSDEAALAYLETAKFPIVLKADGLALGKGVLICNTLEEAKDGVKTIMEDKKFGDAGN RMVIEEFMTGREVSVLAFCDGKTIKCMTSAQDHKRAKDGDQGLNTGGMGTFSPSPFYTKE VEEFCEKYIYQPTMDAMAAEGRPFVGILFTGLMLTEDGPKVLEYNARFGDPEAQVVLPRM TNDIIDVMEACIDGRLDEIDLQFEDNAAVCVVLASDGYPVSYKKGYVINGLDEFKKHDGY YCFHAGTKRTDEGIVTNGGRVLGVTAKGSDLKEARKNAYAATEWVTFENKYMRHDIGKAI DDVEA >gi|222441854|gb|ACEP01000088.1| GENE 46 46543 - 47244 833 233 aa, chain + ## HITS:1 COG:BH2910 KEGG:ns NR:ns ## COG: BH2910 COG1296 # Protein_GI_number: 15615473 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permease (azaleucine resistance) # Organism: Bacillus halodurans # 5 218 17 224 237 106 31.0 3e-23 MNFKQGIRDGIPIGLGYAAVSFAFGILAVDKDLTVSEAVLISLTNLTSAGQFAGLTIIAE LGTLVEMALSQFIINLRYMLMAISLSQKVDDDFKGIWRWILGFAITDEIFAVAIQNPGKI KRNYFMGLTIIPIIGWTLGTLGGALLGNILPAIITNALGVALYGMFIAVVVPKARDDSHV FVAVCIAVAISVALKYIPVFADLSGGFAIIICTVVASLIAAVLFPVEVSEDEF >gi|222441854|gb|ACEP01000088.1| GENE 47 47234 - 47551 304 105 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20620 NR:ns ## KEGG: EUBELI_20620 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 101 1 101 103 77 54.0 1e-13 MSFKIFLPYLLVMAIVTYLIRAIPLVLVKHKIKNKFFNSFLYYIPYTVLAAMTIPAIFTA TGNLISAIAGLAVAIVLAYQRKSLIVVAVGACAAVFIIEGIITLI >gi|222441854|gb|ACEP01000088.1| GENE 48 48012 - 52577 4904 1521 aa, chain + ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## 
COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 15 716 7 662 663 516 43.0 1e-145 MKDERRKAEQPEVLVGIDVGSTTTKIVVMQTGNSDYKVVFSNYERHHADQLKSVLEILKK FDEKMPAQKVRVCLTGSGAKPIAESLGVPFAQEVAANAIALQKKYEKVGTAIELGGQDAK MIFFKESETDHMRSVSDMRMNGSCAGGTGAFIDEIASVLKVPVEEFNALAAKGEHVYDIS GRCGVYAKTDIQPLLNQGAAKEDLALSAFHAIAKQTIGGLAQGLEIESPVAFEGGPLTFN PVLVRVFSERLQLEKEDILIPEHPELMIAAGAALSLKEMFAGSKKVAIADILKKLEQIQQ QKEEAKKNGINISLENEEKARPFFASVEEKAAFEKAHEKKKITWKKPQKGEVVRGYLGID AGSTTTKFVLLSDNEEILDSFYAPNEGDPLKVAKQALIDLRERYKEAGAELEILSAGTTG YGEMLFSKAFDIKNHTVETVAHARAAEKYIQDASFILDIGGQDMKAIWLENGVITNILLN EACSSGCGSFLENFASSLHIPTKEIAKTAFSSKNPAVLGSRCTVFMNSGIITEQRNGRTS EDIMAGLCRSIIENVFTKVIRVSNLDSLGKKIVVQGGTFENDAVLRAMEEYVGRDVVRAP YPGIMGAIGMALIAKEKYEKSPETEEIWDLQNLDTFSYTQDANSPCPFCGNHCNRTIVKF SNGNSWVTNNRCERGEVIGNPKDKEVNRQVKEKLQKVNAVPNLYKIREELLFKDYPHNLS NTYKNEHSTEHVGENIKEHGKESTKQRTEVIGLPRVLSYWDTMPFWKTFWQALGYTVRIS DKSTRKMYEDGLSAVTSDTVCFPAKLVHGHIRNLVKHKVDRIFMPSVTTVPTENTEETSQ SMCAVVKGYPIVIRNSDNPSEKWGVPYDAPLFHWYTKKDRQKQLVAYMSETFGITKEETL AAIKAGDAAQKSFDTELKKAGAKVLEDVQKAGKFAVVLASRPYQNDALVNHDLPLMFTKA GIPVLTADSLPEVNRTDLHRSRIDVVNNYHARMLSSAILAAQSENLEYVQVVSFGCGHDA YLSDEIVRMMKEISGKTPLILKVDESDIQGPLGIRVRSFLETVEMGREKKLVHEVKPLAE PYPVKFTKESRKERVVLVPNTSHAFCRVMSAALSTQGIKAIPLDIGREEAIRLGKQYVHN DICFPAQIVIGEALAALRSGKYDGQKVAIGMGKYVGDCRLTHYGALLRKALDDAGYADVP ILTNDDADSHDMHPGFKMNILSAARIAFALPMIDVLEELLRKIRPYELTKGDADVAFEKA LDEVIEGLEKHGIAGARKGFKKAITIMKKVRYDRTYLKQQVLIVGEYLLNFHPGANHDIE AYLEENGFEIIEARMTDVIRKTYFYQDAQIKEYHLNKPIDQKVWYRTADTIFDFAHKLTD SIAKEHPLYEPACRMDELVKDSDPIIHHTFDAGEGVLIPGEILHHAKHGCKFFLILQPFG CLPNHVVGRGITKKLKEIYPDVQILPLDYDPDVSFANIENRLQMLVMNAKQAKRRVSVVQ KKENKKETVKKGKIHKQESYC >gi|222441854|gb|ACEP01000088.1| GENE 49 52776 - 54446 1951 556 aa, chain - ## HITS:1 COG:BS_araB KEGG:ns NR:ns ## COG: BS_araB COG1069 # Protein_GI_number: 16079931 # Func_class: C Energy 
production and conversion # Function: Ribulose kinase # Organism: Bacillus subtilis # 4 555 3 554 560 605 52.0 1e-173 MSKYAIGVDYGTLSVRALLVNIETGEEAATSVYEYPHGVMEEHLPTGERLPSGWALQEPQ DYVEGLIITIRNVLAEKKVLPEEVIGIGVDFTSSTVIPVKADRTPLCHLEKFTHNPHSYV KLWKHHGAEEEALQIDRIAKEREEKWLPLYGGKVSSEWMIPKILETLHHAPDVYKEADRF MEALDWIIWQMTGEETRSACGAGYKAFYRHDSGYPTKDFFKALDPAMENIVEEKLDAPIK SIGETAGYLTDSMARELGLLAGTPVGTGIIDAHSSLPGCGIGTPGTMMIIVGTSSCHMML SETEAGIAGVGGLVKDGIMPGYFGYEAGQCCVGDHFAWFTDNCVPESYEQEARSRGISIH QLLTEKLAGYKAGQSGLLALDWFNGVRSPLMDFNLNGLIMGMNLLTKPEEIYLSLIEATA YGTRMIIEQFEDAGVPVNALVLSGGIPAKNKMLVQIYSDVCNKEIRISGTDQASALGAAI LGIAAAPERITGFKNANEAAEKLGKVRDEVYKPNPDNVAVYDKLYQEYKTLHKYFGTGEN DVMKRLNKIREDRLSE >gi|222441854|gb|ACEP01000088.1| GENE 50 54558 - 55916 1480 452 aa, chain - ## HITS:1 COG:BH1862 KEGG:ns NR:ns ## COG: BH1862 COG0371 # Protein_GI_number: 15614425 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Bacillus halodurans # 16 429 14 392 399 261 34.0 2e-69 MAKKINLNDYLGHSMDCTCGKEHKTNLKIVDIDKDAVSRLPKHIEALGYKKPFLVADVNT WAAAGALAAEKLDVAGISYKKYIFNEKELIPDEKTLGSLMMAFSRDCDVIVAVGSGTLND LCKFSSFQLGLDYMVFATAPSMDGFVSIGAALITNYVKTTYQAHVPLAVIGDTDILAAAP MEMITAGLGDILGKYTCLLDWKMAHIITGEYYCSHIADMVKEAVEIVVEESTRIKDRNPD AVKAVTEALVLSGIAMSFVGNSRPASGSEHHLSHYWEMKFQAEGKKPILHGIKVGIGMIA VTKMYETLENEQFDFASLKERSFDYAAWEKKVKDCYQDAAPGIIALEEKAQKNNLDKRNK RLAILEEKWPEILRTIKESLPSSAEMEKLLASLGAPINPAQIGVSSELVEDAVILAKEVR DRFTLLQVLWDTGLLDEYAKMIAAYFGEKANC >gi|222441854|gb|ACEP01000088.1| GENE 51 56381 - 57613 1066 410 aa, chain + ## HITS:1 COG:CAC0360 KEGG:ns NR:ns ## COG: CAC0360 COG1609 # Protein_GI_number: 15893651 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 8 408 3 327 328 90 21.0 5e-18 MNEKKRTTTRDIAKACFVSQSAVSMILSGRKDIHFAPETIERVKRTAKEMGYEYKARAKR KKTGTNDTIMIMCPSLATQYYTTLIQFITQEAQEHGLCTLTAYTNRSKEREEYYLNMAAD TGFYGVIYTYAPRAVTSLNHLHGKVPLVLINDHNSDLKIELLELDSKKSGRLIAAHLLEL 
GHRDIGYMTTPLSSIEVPRRRRLEGMQEEYERQGFDPAMIHVVSGKREDQETITGNKHYD TGYGLTRKYFKKREELLYKKKEDSEAVQILKEKKKVEKIEKIKQEDISSKHAIENKEDVK ETGLENITAFVGTNDFVAIGIMDAITSLGYRIPEDFSVCGFDNTLVASFSGISLTTIDHC IGEKGAYAVSMLMNQKNRMEQKDGKREKKRPIMRLEYEPQLIVRHSTGRR >gi|222441854|gb|ACEP01000088.1| GENE 52 57925 - 58416 467 163 aa, chain - ## HITS:1 COG:BH0798 KEGG:ns NR:ns ## COG: BH0798 COG1607 # Protein_GI_number: 15613361 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA hydrolase # Organism: Bacillus halodurans # 8 153 7 153 157 97 36.0 1e-20 MNFRAKTVKESIVETVHIIRPSDLNDAGRLFGGVLMQWIDEVAALVAKRHSQMNVTTASV DNLQFLHGAFQRDVIVIIGRVTHVGHTSMEVKVETYVENTDGERALINRAFLTLIGLGPD NRPTTLPRLILESEEDRKEWERAEIRRSLRRKQREEGFHFYGE >gi|222441854|gb|ACEP01000088.1| GENE 53 58995 - 60536 1702 513 aa, chain + ## HITS:1 COG:BB0604 KEGG:ns NR:ns ## COG: BB0604 COG1620 # Protein_GI_number: 15594949 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Borrelia burgdorferi # 5 506 6 498 500 318 39.0 1e-86 MFLKFFLAILPILWLIIALSVLKIAGHKACLIALAITAVLAAGYWKLSLICTATAGLEGI LNALWPICLVIVAALFTYNLTLETGAMELIKKMLASVSVDQRVLLLIIGWGFGNFMEGMA GFGTAVAIPASMLAAIGINPILSVIACLVVNSTPTAFGSVGVPTVTLASVTNLDALQLSG SVALIQVILTFLSPFFMVFIVGKGFKALKSVLPMVLIASLSFTVPWFIAAQVIGCELPNI IGSIISMICMVAAARFLNKNPEPEYRVQLSGEEQSSGFTASEGVKAWSPFILIFLLLMFT STLCPPIHNLIADIKTTVTVYAGDNPGSLSFSWINTPGIMIFIAAIIGGLIQGASFGTMG KVLIETLKKYWKTILTICCVMATAKIMGYSGMISDIAAALVVVAGPFYPLISPLVGALGA FVTGSGTSTCVLFGGLQSQTAISLGLNPSWMAAANVMGAGIGKMICPQGIAIGAGAIGAV GSESEILRKVFKYFVIYVVIAGIICYAGSMAGF >gi|222441854|gb|ACEP01000088.1| GENE 54 60725 - 62137 1852 470 aa, chain + ## HITS:1 COG:FN1536 KEGG:ns NR:ns ## COG: FN1536 COG0277 # Protein_GI_number: 19704868 # Func_class: C Energy production and conversion # Function: FAD/FMN-containing dehydrogenases # Organism: Fusobacterium nucleatum # 1 469 5 474 475 649 68.0 0 MYNQLTPEIIEEIKKVSNNVFLGEEINPDYAKDEMPIYGTHMPDLAVQPRSTEEVAAVMK ICYENNIPVTPRGAGTGLAGGAVPLYGGVLIDISKMNKIKSYDMENFVVEIEAGVLLNDL 
AEDCQSKGMLYPPDPGEKFACVGGNVATNAGGMRAVKYGATRDYVRAMTVVLPTGEITHL GATVSKTSSGYSLLNLMIGSEGTLGIITELTLKIIPAPKAVASLIIPYEDLETALATVPK FKMAHMNPQAIEFMEREIVLASERYIGKSVFPQELEGVQIGAYLLVTLDGNSEDEVDALI EQAAELVLEAGAIDVLVADTSAKMKDAWAARSSFLEAIMEETKLLDECDVVVPVNKIAEY LTFVNKTGEECGLVIKSFGHAGDGNLHIYQCSNDLEEEEFKKRVDKFFDIIYKEATNCGG LVSGEHGIGSGKVKYLVDSVGETNMAIMEGIKRVFDPKMILNPGKVCYRL >gi|222441854|gb|ACEP01000088.1| GENE 55 62156 - 62959 1067 267 aa, chain + ## HITS:1 COG:CAC2710 KEGG:ns NR:ns ## COG: CAC2710 COG2086 # Protein_GI_number: 15895967 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: Clostridium acetobutylicum # 1 260 1 253 259 209 45.0 3e-54 MNICVCIKQVPDTNEIKVDPVTHTLVRKGVPSIVNPFDTYAQEVGVRLKEKLGGKVVVIS MGPEQAKEAIKTCLSVGADAGYLISSRKFGGSDTLATSYILSEAIKAVEEKEGLKFDLIL CGKQAVDGDTAQVGPEIAEHLGLPQITYALDIIEKDGDIQVKRECDEGYDMISTQLPAVV TVVRLPYEPRYPTIKSKMAAKKKEIPVLTEEDIPAISLERCGLSGSPTKVKKTYTPVTEK NGVKLEGMEAEDAAKEVVKLIYDAKIL >gi|222441854|gb|ACEP01000088.1| GENE 56 62973 - 63977 1500 334 aa, chain + ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 1 331 1 333 336 314 49.0 2e-85 MNLADYKNIWVFIETECGKPKNVGFELINAAKPLAQEKGCELIAVVIGKDIEGVAKDAIC YGADSAIIVDGPEYEYYTTDAFAKALVALVEKYKPETLMIGATNNGRDLGPRVSCRLKTG LTADCTAIAIDEETGNVAWTRPTFGGNLMATIMCPENRPQMGTVRPAVFKKGAYDEAKTG EIVKEEITVAADEIRTTLIERVKEVAESINLEEAEIIVSGGRGVGSQENFQLLQDLADVL GATVGCSRAVVDAGWMPHAHQVGQSGKTVAPKLYFAIGISGAIQHVAGISGSDTIVAINK DPDAPIFDVADYGIVGNLNEVVPALTEAFKAEKA >gi|222441854|gb|ACEP01000088.1| GENE 57 64463 - 65605 1498 380 aa, chain + ## HITS:1 COG:FN0783 KEGG:ns NR:ns ## COG: FN0783 COG1960 # Protein_GI_number: 19704118 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Fusobacterium nucleatum # 1 375 1 377 381 488 64.0 1e-138 
MNFRQDEEHQDILEMVHDFAVNEVKPLAAEIDRTEEFPMKNVKMMAEMGLMGIPFPEEYG GAGMDTLAYIQTVEELSKYCASTGVILSAHTSLCATPIYQFGSEEQKQKYLPKLCSGEWI GAFALTEPNAGTDASGQQTVAVDAGDHWVLNGSKIFITNAGVADVFFVLAMTDKSQGTKG ISAFIVEKGYKGFSIGKHEDKLGIRASSTCELIFEDCIVPKENLVGELGKGFKYAMMTLD GGRVGIAAQALGIAEGALEETTKYITERKQFGRRISQFQNTQFEMAQMRAQTEAAKLLVY QAACAKDDHEPFSHKAAMAKLVAARNAMDVTTRCLQLYGGYGYTKDYPIERMMRDAKITE IYEGTSEVQMMVISGWMGVK >gi|222441854|gb|ACEP01000088.1| GENE 58 65623 - 66909 1231 428 aa, chain + ## HITS:1 COG:CAC0769 KEGG:ns NR:ns ## COG: CAC0769 COG3875 # Protein_GI_number: 15894056 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 425 4 426 426 436 51.0 1e-122 MDMKLPFSTTGMMLHIDDSMDYEVLSSSIESMPKSDKTEDEIVLEAMAHPIGSPKLSELA KGKENVVIICSDHTRPVPSKHIIPFMLKEIREGNPDAKITLLIATGFHRATTREELVGKF GEEIVDNECIAIHDSQDMDAMANIGTLPSGAPLLINKIAANADLLVSEGFIETHFFAGFS GGRKSILPGVSSKVTVLGNHCSKFIDSPYSRTGILEGNPIHKDMIAASKMAHQKYIVNVI IDADKKVVHAVAGDAIEAHAAGCKFLQDYCQVVPKKAADIAISTNGGYPLDQNMYQSVKG MTAAEAAAKDDGILIMVSNCGDGHGGEGFYEALKNCSSPADLMAEILKVPQDQTKPDQWE YQIQCRILMQHKVIYVMCEEHRKMAQEMGFAVANDVNEALEMAIKEKGKDAHISIIPDGV SVMVKKPE >gi|222441854|gb|ACEP01000088.1| GENE 59 67137 - 67880 741 247 aa, chain + ## HITS:1 COG:CAC2546 KEGG:ns NR:ns ## COG: CAC2546 COG2186 # Protein_GI_number: 15895808 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 235 5 228 231 113 31.0 3e-25 MNEFTDFKTYKMKNQKTYQYVFDYFSEQILSGELKINDKILPEREIAEKLDVSRNSIREV MHMLEINGLLECRQGSGNYLRCEPQDYMVKFMNMVMSLQQIKYTEVYDIRMGYETVALRL AIKNATEEEIEEMHQILIKMDEIENPKESGKLDMQFHNKLIEASHNRMIVLYTSMIAELI NQFIADFRVRIMANQRQADALRRAHWNIYDSLVAKDFVAGSFAMQRHFDIVGEQLERLQQ QMENQGK >gi|222441854|gb|ACEP01000088.1| GENE 60 67995 - 68255 237 86 aa, chain - ## HITS:1 COG:no KEGG:NT01CX_0524 NR:ns ## KEGG: NT01CX_0524 # Name: spoIIID # Def: stage III sporulation protein D # Organism: C.novyi # Pathway: not_defined # 1 82 1 82 84 104 69.0 1e-21 
MRDYIELRAIEVGKYIIDSKATVRKTATKFGVSKSTVHKDLTERLPQIDHHLADEVRDIL DINKSERHIRGGMATRQKYLLLHRVN >gi|222441854|gb|ACEP01000088.1| GENE 61 68923 - 69378 482 151 aa, chain + ## HITS:1 COG:BMEI1786 KEGG:ns NR:ns ## COG: BMEI1786 COG1762 # Protein_GI_number: 17988069 # Func_class: G Carbohydrate transport and metabolism; T Signal transduction mechanisms # Function: Phosphotransferase system mannitol/fructose-specific IIA domain (Ntr-type) # Organism: Brucella melitensis # 18 146 18 145 154 87 32.0 1e-17 MDMLVKKEKIFDNIVAVSGTKKEAIRTISRLCAKQTMLDATQLEKAFWKREEWDSTGCGE GIAIPHAIIKEIDEVQLFIVRFRYPIEWESFDDKKVDIAFAIIAPEEKGQEKYLVFLSRL ARKLVDEKFIADFRNCRSTEAVYQYVKNELE >gi|222441854|gb|ACEP01000088.1| GENE 62 69430 - 70131 526 233 aa, chain + ## HITS:1 COG:VC1823 KEGG:ns NR:ns ## COG: VC1823 COG1445 # Protein_GI_number: 15641825 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system fructose-specific component IIB # Organism: Vibrio cholerae # 1 94 1 94 99 99 50.0 7e-21 MKIIGVTKCPTGIAHTYMAAARIEKECERLGYEVKVETQGSQGTENKLTKREIAKADYVI IAADVVIEEPERFYGKWVLKTRIKPLLKNTQGVFERLEQDSFIMGGASFQNENNQNRNEN SGNIENFQNKNAKDKNAENTEDKNIENREKNGDAVKQNITYQNAINRSRIVENTQENQKT EKIVGKGYRQELENEQSIAGNNKAKQSDKNNKNKIVKTKKKKTGERAQRSFRY >gi|222441854|gb|ACEP01000088.1| GENE 63 70148 - 71167 1053 339 aa, chain + ## HITS:1 COG:VC1826_3 KEGG:ns NR:ns ## COG: VC1826_3 COG1299 # Protein_GI_number: 15641828 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Vibrio cholerae # 1 333 15 348 359 246 46.0 6e-65 MNGASYMIPFVVVGGMLVSLSVSMGAQTSADGTVVYIGLWNKIHAIGNLAFTLMYPILSG FIAFSIAGRATLAPAMIGAMVATDGEILGTEGGTGFIGCIIVGYLVGYLVKWMNSWNLAK EFKPMMPIFIIPLTGVAVVSALFIFVLGKPVTLITDLLNSLLVELAKNPSSAVVLGIVLG AMVGIDMGGPINKVAFFFGVTSIAQGNLQIMGIVSTSIAVAPLSMGIAAVVGKDKFTPEE KSAGIYTIFMGIIGISEGAIPFAASDPGHVLPAVVAGSALTGAAAAVCGVTSAVPHGGLV VALFKATNYMSLYFLCVMAGTALSIAIVLAFKRRQEKNR >gi|222441854|gb|ACEP01000088.1| GENE 64 71368 - 73281 1586 637 aa, 
chain + ## HITS:1 COG:BS_yjdC_1 KEGG:ns NR:ns ## COG: BS_yjdC_1 COG3711 # Protein_GI_number: 16078265 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Bacillus subtilis # 1 489 1 499 510 114 23.0 8e-25 MRFTRRDMKMIEYLRDYEVVELKAMSEYLGVSVKTLKNQLKELSETLKEFGVDVQFVSSN QILVKGHEKFAEVMSVSIPRFEMEFERRFLLLLVLHDNFLTIQEIADELLVSKSYAEKHL AAIMKKYPEDIQAQRHYGIRYAASQNKRREVFVKILFPYLFGEDYIVALEQFDSLHFPLF HYFTKEQMLRTRGAIQALQRLEWFQLTDESLQQLFLYILFLTRHADSENTEKPVAVQEDF INTQEFDGLYEWILGWCQEFRLPDSKEELRYMYTLLLSLRKQKIACQDQIMDKMRHPIKE ILKGISERLSADFSEDEDLIEGLSSHIYTTILRGNHLDVETDEYMLKSMKRQYPFGFEMA AIAADYIADIYNLSMKENDLIYLAIHFQAAIERMKDAGEKTKIIIVCHFGAAAARIIRSK IERKLVGVEVTGMYSLQEFKQLKNPDCDYIVTTERILKADFPIIYISMALPEREMQKIKE GIKEIQVNHLLELNILEAIILPIEEKNMESAVKAMVQPLLEEDFVTEKYLQSVLEREEMS STSLNHIALPHGNPAMVQNTRLVIGRMNHPILWDDSKVSCAFLFAVSSEMLKEKPMLFNT FYRTMADPEVEESIKKLQMEKNLPDEVFRQKLFQILR >gi|222441854|gb|ACEP01000088.1| GENE 65 73328 - 74317 1519 329 aa, chain + ## HITS:1 COG:lin0143_2 KEGG:ns NR:ns ## COG: lin0143_2 COG3444 # Protein_GI_number: 16799220 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIB # Organism: Listeria innocua # 166 327 1 162 165 214 72.0 2e-55 MVGIILATHGNFATGIQQSASMIFGEQPNVAAVTLQPNEGPDDVRKKMEEAVASFEDPEQ VLILVDLWGGTPFNQANGLIAGHEDTWAIVAGLNLPMLIDAYASRMMMDTAHELAVQISG SAKEGVKIYPESLEPKEEKKVAATSDGQPRGALPEGTVVGDGKIKYVLARIDSRLLHGQV ATAWTKTTGPNRIIVVSDGVAHDDLRKSMIREAAPPGVKANVVPISKMIQVSKDTRFGNT KALLLFETPQDALAAIKGGVDIKELNIGSMAHSVGKVAVSKVLSLDEKDIETFEELKKLG VKFDVRKVPSDSQDNMDEILKKAKAELAK >gi|222441854|gb|ACEP01000088.1| GENE 66 74349 - 75149 1173 266 aa, chain + ## HITS:1 COG:lin0144 KEGG:ns NR:ns ## COG: lin0144 COG3715 # Protein_GI_number: 16799221 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IIC # Organism: Listeria innocua # 1 266 1 268 268 296 75.0 2e-80 
MSIISMILVVIVAFLAGIEGVVDEFQFHQPLVACTLIGLVTGNLEAGIVLGGSLQMIALG WANIGAAVAPDAALASVASAIILVLGGQGVKGVSTAIAVAIPLAVAGLFLTMVVRTLSVA CVHRMDAEAEKVNFRGVEMWHIIAICLQGLRIAIPAACLLAIPTETVQNFLQSMPAWLTD GMSIGGGMVVAVGYAMVINMIATKEVWPFFAIGFCLAAISDLTLIALGAIALAIAFIYIN LSEKGGNGNGGGGNVGDPLDDILNDY >gi|222441854|gb|ACEP01000088.1| GENE 67 75170 - 76084 1078 304 aa, chain + ## HITS:1 COG:lin0145 KEGG:ns NR:ns ## COG: lin0145 COG3716 # Protein_GI_number: 16799222 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannose/fructose/N-acetylgalactosamine-specific component IID # Organism: Listeria innocua # 4 304 3 303 303 456 77.0 1e-128 MAEEKITLSKQDRRSVALRSTFLQGSWNYERMQNGGWCFAMIPAIKKLYTNKEDQKAALK RHLEFFNTHPYVASPVLGVTLALEEEKANGAQVDDAAIQGVKVGMMGPLAGVGDPVFWFT ARPMLGALGASLAMGGSILGPILFFVLWNVIRWAFMWYTQEFGYRAGSKISEDLSGGLLQ KVTKIASILGMFVLGSLIERWVSINFIPVVSKVKLSDGAYIDWNSLPAGAEGIKTAISQY ASGMALEPTKVTTLQDNLNSLIPGLMPLLLTLLCMWLLKKKVSPIVIILALFVVGILGHV VGIL >gi|222441854|gb|ACEP01000088.1| GENE 68 76195 - 76554 403 119 aa, chain + ## HITS:1 COG:CAP0069 KEGG:ns NR:ns ## COG: CAP0069 COG4687 # Protein_GI_number: 15004773 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 118 12 129 135 159 66.0 1e-39 MAQSLNTKVDLVMDATSFHGMNNYGKIMIGDRGFEYYNEKKMNDYIQIPWEEVDYVIASV MFKGKYIPRFAIQTKKSGTFTFAAKKPKELLRAVRNYVPSDHMVRSLSFFDVLKRAFKK >gi|222441854|gb|ACEP01000088.1| GENE 69 76683 - 77630 840 315 aa, chain + ## HITS:1 COG:BS_yjdE KEGG:ns NR:ns ## COG: BS_yjdE COG1482 # Protein_GI_number: 16078267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus subtilis # 4 311 6 313 315 213 38.0 4e-55 MAILKLKPSGKDYIWGGHKLVDNYGKEMTGDRLAETWELSCHPDGPSFVANGEDAGKTLR QYIEEHGKKVLGTNCERFEDFPILTKFIDAQDNLSIQVHPDNEYALKNEGQYGKTEMWYV VDAEEGACLYHGFNREISKDEFTKRIEEDTLLEVLNKVPVHKGDVFFIEAGTIHAIGKGL IIAEIQQNSNVTYRVYDYGRVGKDGKKRELHIEKAVAVTNCAPAKKDDSHYPHIADCDYF 
TVDKLNLDGTTFNKLEGTVSEKSFLSILVLDGEGTITSNGESVSYKKGDSIFLTAGSGEY SIEGVCDALLTTIRG >gi|222441854|gb|ACEP01000088.1| GENE 70 78504 - 79904 1546 466 aa, chain + ## HITS:1 COG:all0328 KEGG:ns NR:ns ## COG: all0328 COG0147 # Protein_GI_number: 17227824 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component I # Organism: Nostoc sp. PCC 7120 # 8 460 22 496 504 332 40.0 7e-91 MKRIRRAYSKTVSLIDETAVQIYQSFVGKKKGFLLESYDKNNGRYVFMGKNPEEVIKSKE NGLVIEKEDGTTKELSGNPVDLLKGYYSDFEIRKDDEQLAFTGGLVGGLGYDFVRYKEEL PDNNPDEIGISTISLMLVTEFLSIDHFAETMTAVVVEDDTEAGRKKAMFRCEEMVKEAFA NIRKDEKSEFQPDGKIIKQSDTLEEYSEKVNKIKQYIIDGHVFQTVLSQRWTIETKQKGF DLYKELREINPSPYMYYFNFEEFEIIGSSPEMIVKQKDNKVLTCPIAGTRPRGKDDAEDQ KFAEGLMKDPKEKAEHVMLVDLARNDMGRIAEFGTVKVRDFMHIQKYSHVMHIVSLVEGR KKGEYHPLDLVNAFLPAGTLSGAPKVRAMEIIDELESVRRGLYGGAIGYIDFNGDMDFCI TIRTMIKKADKVYIQAGAGIVADSNPESEYNECCNKARALAKVLVP >gi|222441854|gb|ACEP01000088.1| GENE 71 79980 - 80543 618 187 aa, chain + ## HITS:1 COG:RSc2882 KEGG:ns NR:ns ## COG: RSc2882 COG0512 # Protein_GI_number: 17547601 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Anthranilate/para-aminobenzoate synthases component II # Organism: Ralstonia solanacearum # 1 186 1 186 189 245 59.0 3e-65 MILLIDNYDSFTYNLYQYMGIFEKDIKVVRNDKITIEEIEQLNPNRIVLSPGPKSPKEAG ICMDVVKHFYTKKPILGICLGHQSIGAAFGAKIIHAKELMHGKQSLIEHDGKSVFEGIPS PVHVARYHSLAVDEKTLPDVFEILARTEDGEIMAMQHKEYPLIGIQFHPESIYTDHGKKM IENFLEL >gi|222441854|gb|ACEP01000088.1| GENE 72 80614 - 81399 722 261 aa, chain + ## HITS:1 COG:CAC3160 KEGG:ns NR:ns ## COG: CAC3160 COG0134 # Protein_GI_number: 15896408 # Func_class: E Amino acid transport and metabolism # Function: Indole-3-glycerol phosphate synthase # Organism: Clostridium acetobutylicum # 1 242 1 242 262 181 45.0 1e-45 MILDVIVEDKKKRLKEHKARTSEEQMKELALVSERKSISFYEALAKPGLSIIGEFKKASP SMGKIEMTMELMDRIDEYNISVDAISCLTEEDHFLGNTDYFKEIRNISPLPMIRKDFIID 
PYQIYEAKVIGADCILLIAAILSDEQMAEYYKLATELGMDVLVETHDEEEMGRALKIQPK IIGVNNRNLKDFTISLENTKRLRPMVPEGTVFVSESGVTTNEHIAFLKECKVDALLIGGA FMLSEHPRELATEWKNLYNEK >gi|222441854|gb|ACEP01000088.1| GENE 73 81510 - 82193 537 227 aa, chain + ## HITS:1 COG:SP1813 KEGG:ns NR:ns ## COG: SP1813 COG0135 # Protein_GI_number: 15901642 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylanthranilate isomerase # Organism: Streptococcus pneumoniae TIGR4 # 10 225 2 191 199 109 31.0 5e-24 MDKELINSHTKIKICGMTCEADIKAVNTYLPDYIGFVLFFPKSNRNISIEQAEHLLEKVD KKIRTVAVVVSPTTEQIRQIEKAGFDYIQIHGTVTEDVYKQCKLPILRAFNVSDLDKLNE YEAKDKIKGYVFDSKTPGSGKTFDWSLLDNIRQRQKTDASKDVSHKKNKKMIFLAGGIDE TNVKRAISQVAPDVIDLSSAVEKTSEDGTFHGKDPEKIRTIVTMVHD >gi|222441854|gb|ACEP01000088.1| GENE 74 82209 - 83381 1510 390 aa, chain + ## HITS:1 COG:CAC3158 KEGG:ns NR:ns ## COG: CAC3158 COG0133 # Protein_GI_number: 15896406 # Func_class: E Amino acid transport and metabolism # Function: Tryptophan synthase beta chain # Organism: Clostridium acetobutylicum # 3 383 2 383 394 507 63.0 1e-143 MQNKKYGEYGGQYVAESLMSVLAELETAYNEAINDPKFIEEYNYYLKEYVGRETPLYFAE KLSKKYGTKIYLKREDLNHTGAHKINNVIGQILLAKRMGKKKVIAETGAGQHGVATATGA ALFDMECTVFMGEEDMERQALNVFRMEMLGTKVECVKSGSRTLKDATNEAIRTWAKNAED TFYIIGSAVGPHPYPQMVKEFQKVISVETKKQILEKEGRLPDCVLACVGGGSNSIGMFAE FIDEKDVELIGVEAAGFGIETGKHASAFASGKAGVLHGMRSYLLQNEDGNIQIASSISAG LDYPGVGPEHAYLHDSGRATYVSITDDEAMQALRELCREEGIIPAIESAHALAYAFKKAK TMTEDQIMVVNLSGRGDKDVHTIAEYFKDK >gi|222441854|gb|ACEP01000088.1| GENE 75 83399 - 84190 436 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149916131|ref|ZP_01904653.1| 50S ribosomal protein L25/general stress protein Ctc [Roseobacter sp. 
AzwK-3b] # 1 242 1 242 263 172 40 5e-42 MNRIEKRMEELQNKNEKAFVTYMTAGLPDMEGTKALIKAQAEAGIDIIELGIPFSDPTAD GPVIQDASYRSICKGTNLKGVFAAVEEVRKDCDVPLVFMMYYNTILYYGVDAFARKCAEV GVDGLIIPDLPKEEQFELVEALDNTENAPILIQLVAPVSADRIPMILENARGYVYCVSSM GVTGQAAHFHKNVRKYLENVKAVSKIPVMMGFGIRTAEDVAGLKDTIDGCIVGTHFIELM EENNYNLDVAKEYIRKFKKELNA >gi|222441854|gb|ACEP01000088.1| GENE 76 84316 - 84564 315 82 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3301 NR:ns ## KEGG: Dhaf_3301 # Name: not_defined # Def: prevent-host-death family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 81 1 81 82 95 61.0 4e-19 MASILPVSDLRNYNEVLKKCHKGEPVYLTKNGRGRFVVIDIEDYERERAEKKLLMKLQEA EEAVKDGKGWLDLDELKTLVEE >gi|222441854|gb|ACEP01000088.1| GENE 77 84566 - 84865 303 99 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_3300 NR:ns ## KEGG: Dhaf_3300 # Name: not_defined # Def: addiction module toxin, RelE/StbE family # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 3 93 4 94 107 84 45.0 1e-15 MLKLRINPIVAKDLKNIKDYIAEDNEEYAIKTIKEIYGKFENLHMFPGMGADLSKRVSFR TDYKYAIWEDYVIIYKMSNEFVEIYRVINRYQDITRIFD Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:49:36 2011 Seq name: gi|222441853|gb|ACEP01000089.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont96.1, whole genome shotgun sequence Length of sequence - 27475 bp Number of predicted genes - 29, with homology - 25 Number of transcription units - 15, operones - 7 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 2592 3295 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 2 1 Op 2 . - CDS 2608 - 3147 408 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 3 1 Op 3 1/0.000 - CDS 3161 - 6151 3825 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 4 1 Op 4 . - CDS 6181 - 7521 1547 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases - Prom 7541 - 7600 4.6 + Prom 7533 - 7592 14.6 5 2 Tu 1 . 
+ CDS 7686 - 8339 541 ## COG2964 Uncharacterized protein conserved in bacteria
+ Term 8351 - 8406 5.4
- Term 8339 - 8392 7.1
6 3 Tu 1 . - CDS 8400 - 9767 1402 ## COG2233 Xanthine/uracil permeases
- Prom 9814 - 9873 10.7
- Term 10102 - 10145 11.2
7 4 Op 1 . - CDS 10194 - 10751 546 ## COG1954 Glycerol-3-phosphate responsive antiterminator (mRNA-binding)
8 4 Op 2 1/0.000 - CDS 10757 - 12277 1572 ## COG1070 Sugar (pentulose and hexulose) kinases
9 4 Op 3 . - CDS 12306 - 13190 1066 ## COG0191 Fructose/tagatose bisphosphate aldolase
- Prom 13213 - 13272 3.2
10 4 Op 4 . - CDS 13300 - 14061 213 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
- Prom 14173 - 14232 13.2
- Term 14075 - 14135 21.2
11 5 Op 1 . - CDS 14322 - 15536 1127 ## COG5441 Uncharacterized conserved protein
- Prom 15640 - 15699 5.1
- Term 15647 - 15692 5.1
12 5 Op 2 . - CDS 15733 - 16533 485 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III
- Prom 16719 - 16778 6.9
- Term 16789 - 16826 -0.8
13 6 Op 1 . - CDS 16848 - 16919 73 ##
14 6 Op 2 . - CDS 16994 - 17257 270 ## HMPREF0837_10282 transcription regulator
- Prom 17419 - 17478 6.4
+ Prom 17244 - 17303 8.0
15 7 Op 1 . + CDS 17380 - 17832 262 ## MCCL_1543 hypothetical protein
16 7 Op 2 . + CDS 17848 - 18072 168 ## gi|225027736|ref|ZP_03716928.1| hypothetical protein EUBHAL_01995
17 7 Op 3 . + CDS 18095 - 18319 347 ## gi|225027737|ref|ZP_03716929.1| hypothetical protein EUBHAL_01996
18 7 Op 4 . + CDS 18351 - 18461 113 ##
19 7 Op 5 . + CDS 18508 - 18657 169 ## gi|225027739|ref|ZP_03716931.1| hypothetical protein EUBHAL_01998
20 8 Tu 1 . + CDS 18766 - 18846 59 ##
+ Term 18875 - 18906 2.1
+ Prom 18893 - 18952 7.2
21 9 Tu 1 . + CDS 19042 - 19170 142 ## gi|225027741|ref|ZP_03716933.1| hypothetical protein EUBHAL_02000
+ Prom 19207 - 19266 2.5
22 10 Tu 1 . + CDS 19321 - 20313 629 ## CPR_0832 hypothetical protein
23 11 Tu 1 .
- CDS 20731 - 21444 476 ## COG4748 Uncharacterized conserved protein - Prom 21471 - 21530 8.4 - Term 21676 - 21714 8.4 24 12 Tu 1 . - CDS 21745 - 21858 56 ## - Prom 21878 - 21937 2.0 + Prom 22129 - 22188 13.9 25 13 Op 1 . + CDS 22213 - 23730 498 ## BACI_pCIXO200950 hypothetical protein + Term 23805 - 23838 1.0 + Prom 23733 - 23792 6.6 26 13 Op 2 . + CDS 23946 - 25184 704 ## COG0582 Integrase + Term 25216 - 25251 1.7 - Term 25204 - 25237 4.5 27 14 Op 1 . - CDS 25263 - 26273 1326 ## Closa_2929 hypothetical protein 28 14 Op 2 . - CDS 26292 - 26996 1095 ## COG0822 NifU homolog involved in Fe-S cluster formation - Prom 27030 - 27089 2.9 - Term 27139 - 27179 -0.2 29 15 Tu 1 . - CDS 27221 - 27439 141 ## gi|225027749|ref|ZP_03716941.1| hypothetical protein EUBHAL_02008 Predicted protein(s) >gi|222441853|gb|ACEP01000089.1| GENE 1 3 - 2592 3295 863 aa, chain - ## HITS:1 COG:BH0748 KEGG:ns NR:ns ## COG: BH0748 COG1529 # Protein_GI_number: 15613311 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Bacillus halodurans # 163 852 2 741 760 317 31.0 1e-85 MATEYCFTVNGMRRTTTEKKSLLRYLRDDLHLFSVKDGCSEGACGTCTIIVDGVAVKACV LTTDKSVGRNIVTVEGLTEREQDAFVYAFGSKGSVQCGFCIPGMVMAGKALIDQNPDPTE EEIKYALRGNICRCTGYKKIIEGIQLTAAILRGDAEIDYDLERGEQYGVGQRAFRVDVRK KVLGYGKYPDDIVMDGMVHLSAVRSKYPRARVLKIDASKAEALPGVLGVLTAKDVPNNKV GHIQQDWDVMIAEGDITRMIGDTICVVVAETEEILEKAKKLVKVDYEKLEPIRNVHEAMA EDAPQIHSSGNLCQQRHVTRGDAKTALEKAAYKVTRSFTTPFTEHAFLEPECAVSFPYKD GIKILSTDQGAFDTRKEVSIMFGWDPEKIVVENQLIGGGFGGKEDVTVQHLTALAAYKYQ RPVKAKFTRQESLFFHPKRHAMEGTFTLGCDENGIFTGLDCEIYFDTGAYASLCGPVLER ACTHSVGPYCYQNTDIRGFGYYTNNPPAGAFRGFGVCQSEFALESIINLLAEQVGISPWE IRYRNAIEPGKVLPNGQIADCSTALKETLEAVKDVYEANKDHAGIACSMKNAGVGVGLPD KGRCNLVIEDGVCVIYAAASDIGQGCATVFCQDVAQATGLPLSKIRNAVCNTENAPDSGT TSGSRQTLLTGEAVRGAAVKLADAMKEVDNDLEKLNGKNYFYEYYEPTDKLGANVPYPKS HIAYGFATHVVILDKKGKVTEVYAAHDSGKVVNPTSIQGQIEGGVLMGMGYALTEDWPLK 
DCVPQVTYGTLGLLRSTQIPNIHAIYVEKEKLLDVAYGAKGIGEIATIPTAPAVQGAYYA LDGKFRTKLPLEDTFYRPKERRV >gi|222441853|gb|ACEP01000089.1| GENE 2 2608 - 3147 408 179 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 46 169 138 261 264 95 43.0 3e-20 MAVRRKKVAVIHCNGRPTRKEGFEQVTVTGNCQEILTQYSEGINICSYGCLGGGSCVEAC RLHAVSIGERGIAKIDEEKCVGCGLCAKACPQHLIQVVPTENTIQPQCANLDKGPKAIQG CAVSCIGCRICEKYCPCKAIKVVDNHAVICQEKCIACGMCAVKCPRGVIHDKNGILSIR >gi|222441853|gb|ACEP01000089.1| GENE 3 3161 - 6151 3825 996 aa, chain - ## HITS:1 COG:ygfK_2 KEGG:ns NR:ns ## COG: ygfK_2 COG0493 # Protein_GI_number: 16130780 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Escherichia coli K12 # 441 995 1 577 582 397 41.0 1e-110 MSDIMRPMSIGHLMNWALSEYKKKGSIFGIDKLVHYTSGQVLPIYEEKIESPFGPAAGPN TQLAQNIIASYVAGSRFFELKTVQVMDGAELSACVAKPCITAGDECYNCEWSTELYVSQA YAEYVKAWVVCKILAKELGLGNPDGFVFNMSVGYDLEGIKSEKVNTFIDDMIEAKDTEVF KECINWALENVDSFENVDADYIKSISSNISSSITESTLHGCPPDEIERIATYLITEKHLH TFIKCNPTLLGYEYARKRLDELGFDYVAFDDHHFVEDLQWDDAIPMLERLMALTKEKGLE FGVKLTNTFPVDVAAKELPSEEMYMSGRSLFPLSIHLAKLLSEQFDGKLRISYSGGATIY NIREMFDAGIWPITMATNILKPGGYERLSQISEKFMECGTERFHGTDTKAIAALDDAVTS DELYKKPVKPLPERHMEKELPLFDCFTAPCRNGCPIAQDIPAYLEAMTEGDATKALDIIL ERNALPFITGTICPHHCGDKCMRNYYEETLHIRETKYAAADAAYETVLKNIAKPEARTDK KVAVIGGGPAGLSAAMFLSRAGVPVTVFEKSENLGGVVRNVIPDFRIKREDIEKDVALCK AYGAEFVTGKEVASIKALKEEGYTDVIVAIGAWKPGNAHLQYGGAFDAVEFLADAKNEKI EVVLGKNVVVLGGGNTAMDVARAAIRVKGVENVRLVYRRTKRYMPADEEELREAIEDGVQ FMELLAPIGVENGQLKCSVMELGEADESGRRAPVDTGKVEYVPADTVIAAVGENIDGTLY EDMGVELDRKGRPVVDANMMTSVEGVYAAGDSRRGPATVVEAIADASKAAAAIAGISYEK YAEANVAEDSAKYIAKKGYQSPDLTEMPDKRCLGCSTVCETCADVCPNRANVAIKVAGKC 
QEQVIHVDGMCNECGNCAVFCPYAGRPYKDKFTLFWSEEDFTNSENEGFYCEENGEIHLR LDDKVEKTDINGIRTISADAAAVIETIQKDYSYLMK >gi|222441853|gb|ACEP01000089.1| GENE 4 6181 - 7521 1547 446 aa, chain - ## HITS:1 COG:Z4218 KEGG:ns NR:ns ## COG: Z4218 COG0402 # Protein_GI_number: 15803416 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Escherichia coli O157:H7 EDL933 # 1 442 23 461 464 224 31.0 2e-58 MLLIGNGRVITRDQENPYLEDGAVVISGEKIKEVGSLTEMKAKYPEAEFVDAKGGVIMPG LINAHTHIYSGLARGLSIAGNNPTNFLEVLDGTWWAIDRHLTLDGTKACAYATVLDCIRN GVTTIFDHHASFGEISGSLFAIKDVVKELGIRSCLCYEVSERDGEEKTLQSIKENADFAK WAKEADDDMVKAMFGGHALFTISDKTFEKMVEANDGMTGFHIHVAEGMNDVYDSIRNYGC RPVNRLLYNGLLGEKTMLGHCIHVSPAEMDIIKETGTMVVNNPESNMGNAVGCAPVLKMM EKGITVGMGTDAYTHDMLESMKVFLIIQRHQQAMPNVAWCEDVKMQFENNRIIAEKYFNQ PLGILKEGAAADVIVMDYKPFTPFSDENIDGHMIFGMMGKNCRTTVINGKVLYKDREFVN IDEDAINAWTMEQSKKLWGELNHRTY >gi|222441853|gb|ACEP01000089.1| GENE 5 7686 - 8339 541 217 aa, chain + ## HITS:1 COG:PM1350 KEGG:ns NR:ns ## COG: PM1350 COG2964 # Protein_GI_number: 15603215 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 3 212 13 216 223 101 34.0 8e-22 MKKSLIEFYTRLAHGLAVQFGSNCEVVVHDLQGKDIEHSIIAIENGHVTGRRIGDGPSQI VLESLHNSEQNLKQEDKLAYLTKTKNGKILKSSTIFIRDDSDKIIGIFGINYDISLMLAM EKELSSFTGTIEEDTTKEPIAVNIGDLLDELITQSVQHIGKPVTLMSKEDKVQVVRLLNE AGAFLITKSGPKVCQFLGISKYTLYSYLDEIKSKNTI >gi|222441853|gb|ACEP01000089.1| GENE 6 8400 - 9767 1402 455 aa, chain - ## HITS:1 COG:CAC0872 KEGG:ns NR:ns ## COG: CAC0872 COG2233 # Protein_GI_number: 15894159 # Func_class: F Nucleotide transport and metabolism # Function: Xanthine/uracil permeases # Organism: Clostridium acetobutylicum # 3 444 2 435 435 235 34.0 1e-61 MSEKKLTREADTSTLYEFKGSVPKPSLLVPLSLQHVLAAVVGVITPSIIIAGVCGLTESE KTMMIQAALLLTAIATLLQLFPIFNRLGAGLPVIMGTSFAYVPTLQAIGGEFDMGTILGA EIIGGIVAIIFGIFVKEIRKLFPDVVTGTVIFTIGLSLYPTAIKYMAGGAGSKEFGSMKN 
WVVALCTFGIVLILSNFTKGIFKLGSLFFGMIAGYLLALAFGMVDFSSVTNTTAIVAVPQ FMHFQIKFVPSACISLAIVYIVNSVQTIGDLTSTTMGGMDRVPTNRELQGGILTQGVMSI FGAFFGGLPTATYSQNVGIVTVNKVINRIVFLVAAVILMIAGFIPVFSAALTTIPQSVIG GATLSVFAQIAMTGVRMFTKDGLTARKTTVVGMSVALGVGITQVADCLSGPGLPGWINSI FGSSSVVVATIMAILLNLILPKEETAAETTEETKE >gi|222441853|gb|ACEP01000089.1| GENE 7 10194 - 10751 546 185 aa, chain - ## HITS:1 COG:TM1431 KEGG:ns NR:ns ## COG: TM1431 COG1954 # Protein_GI_number: 15644182 # Func_class: K Transcription # Function: Glycerol-3-phosphate responsive antiterminator (mRNA-binding) # Organism: Thermotoga maritima # 6 184 18 195 195 83 27.0 2e-16 MEKQGFLVVPSIRDVKYLKYTLESECREVLLSNAHIGNLKQLTENCHRNGQKVIVNHELI GGLGNDRIAFEMLKKLYKVDGVIGSSVSKLNQMQHMGLETIFRISLIDSHSVEMALRSLK GVNFTAVELRPYSHAVEFLPMFKEIFTGKFFAGGFINSEERIKICQKAGFDGVMTSTKKL WSYIE >gi|222441853|gb|ACEP01000089.1| GENE 8 10757 - 12277 1572 506 aa, chain - ## HITS:1 COG:STM3674 KEGG:ns NR:ns ## COG: STM3674 COG1070 # Protein_GI_number: 16766959 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Salmonella typhimurium LT2 # 1 499 1 489 498 246 29.0 9e-65 MEKYLMGIDAGNTSSKVVIFDLYGKMIATAATPSMHFKRRGEGFEEFDVTELWNLISVCI KEAVEKAAVAPEQIAGVGVTSFGNGVVFLDKEGYAIAPGCFSQDYRANSIIEMYQREGTY EKINDIVKGTLFAGEPGPILRWYKENYREIYDQIGGILQFKDYIMYRLTNVFATDLNCFG GAFMVDMNTMDYSRELMDLYGIPELYDALPKLATEPTEIVGTVTKEASEITGLAEGTPVA AGMMDILACLVGAGATGEEVYTAVAGSWCINETHSDRIIPNASSNMPYLKKGEYLNCSYT GASGSNYEWFTRILGGTAKLEAQDRNLSQYAVLDELIEMVPIEKVKVFFSPFVAQPSIHM NAKANFFNIDMSTSYAEICYSVAEGVAFIHKHHIDFLRNAGLPVKEIRLTGGTAKSHVWN QIFANVLQVPIVGVDCEETGAQGVAIAAGIGAGVYKDYEDAFEKAVKVKEPVLPDDSTFP IYEKRYKEWCRLNEIMKTYWDEKSKG >gi|222441853|gb|ACEP01000089.1| GENE 9 12306 - 13190 1066 294 aa, chain - ## HITS:1 COG:STM3253 KEGG:ns NR:ns ## COG: STM3253 COG0191 # Protein_GI_number: 16766551 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Salmonella typhimurium LT2 # 1 293 1 283 284 206 40.0 5e-53 
MALVNIKEMLEKARKEHYAVGAFDASTIEMAMAIVEAAEEEKSPVIVMGLTPDLQQGNEK MLTYWTESLKDLAKKASVPVCLHLDHARDMNFLKRCVDAGFTSVMYDASEYPFEENVRLS KEAADYAHKYGATVEAELGHVGDGIVSGVIKEDGNYDNPEDYLTNPEEMKRFIAETGVDC LAVAVGTSHGVYVHEPKLDFERLELLNSISDIPMVVHGGSGTPDDQIKKAISLGVTKLNI YSEMMAAYFGTMKEELEKAGTMAIWMSNANREPLKAVKKVVKEKIRLTGSAGKA >gi|222441853|gb|ACEP01000089.1| GENE 10 13300 - 14061 213 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 5 248 1 238 242 86 27 2e-16 MDYGLKDKVVLVTGSTKGLGKAIAELIAQEEGIPVVTGRREKEVHEIMMELREKYQDERI QGYCVDFSELSSVDFIFDVIKKDLGSIDVLINNAGIWPTAYVVDMKPEDFERTIRVNLEV PYLLCQKFVQEKIAAQQKGKIVNIVSQAAFHGSTTGHAHYAASKAGLVSFSISLAREVAK YGINVNMVAPGMVRTPMTEEALKAKPDYYNGRIPIGRVAEPEEVAQAAVFLASKCADYYT GITFDATGGMLMR >gi|222441853|gb|ACEP01000089.1| GENE 11 14322 - 15536 1127 404 aa, chain - ## HITS:1 COG:YPO3839 KEGG:ns NR:ns ## COG: YPO3839 COG5441 # Protein_GI_number: 16123974 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 2 389 3 389 405 273 40.0 4e-73 MKTIALIGTFDTKGEEYLYVKNKIENLGLRTLTIHAGIYKEVFSPDINHDSVAVLGGGSV AELQEKKDRGYAMEVMSKGLCALIPKLYAEKLFDAVLALGGTGGTSLVTPCMRLLPLGVP KIMVSTMASGDVSRYVGTSDILMMPSIVDVAGINRISSQVLTHAVHAIVGMVEHENTDIP VKKPLIVATMYGVTTPCVMCAKEYLEQEGYEVIIFHASGTGGKMMESLINSGIVDGVLDL TTTEWIDEIAGGIMAAGTGRLDAAALNGVPQVVSVGAADMITFGERESLPEKYKDRVVYM HNPAITVVKSNIEENVTFGIKVGEKLNQCKNNAVLLLPLQGISMNDKVGSEYYGPREDQA LFITLKKVINNPLVEVIDVDAHINDEAFAIFAARKLVALMEMKK >gi|222441853|gb|ACEP01000089.1| GENE 12 15733 - 16533 485 266 aa, chain - ## HITS:1 COG:CAC1330 KEGG:ns NR:ns ## COG: CAC1330 COG1234 # Protein_GI_number: 15894609 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 3 266 4 267 268 210 40.0 2e-54 MKMHIIGTGTATVSKYINTSCVFEEDDQLFLVDGTGGSDILRAFDTMNLNWKKLHYAFLS HEHTDHFLGMIWVIRTIAELLELEEYEGDFYLYGNDVVLGKVLHVCRMILKKRSQAFLET 
RIKFVLVKDHERKHILNYDFTFFDIGSTKAKQYGFLMEYDNGKKLVFAGDEPLKENGKKY SKNVDWLLSEAFCLFDEEPKFHAYQYQHQTVKEASKIAQDLKVKNLLIWHTEDQTGKKRK ERYTEEAKQFYKGNVWVPEDGEVFNI >gi|222441853|gb|ACEP01000089.1| GENE 13 16848 - 16919 73 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVIKVPKKVKKTYAKIFKGLKVK >gi|222441853|gb|ACEP01000089.1| GENE 14 16994 - 17257 270 87 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0837_10282 NR:ns ## KEGG: HMPREF0837_10282 # Name: not_defined # Def: transcription regulator # Organism: S.pneumoniae_TCH8431-19A # Pathway: not_defined # 20 69 1 50 50 62 54.0 5e-09 MGVSSEMRVSLELARELNGLKQSEAAQKIGVSTDTLGNYERGKSYPDIPILRKIESVYGV PYDRLIFLSLDFGLTEKRTEKPERKSS >gi|222441853|gb|ACEP01000089.1| GENE 15 17380 - 17832 262 150 aa, chain + ## HITS:1 COG:no KEGG:MCCL_1543 NR:ns ## KEGG: MCCL_1543 # Name: not_defined # Def: hypothetical protein # Organism: M.caseolyticus # Pathway: not_defined # 4 71 2 69 209 88 58.0 9e-17 MSNLGNKQIMANNIRYYMNIHSVSQTEICNTLGFKMPTFSDWVNAKTYPRIDKIELMANY FGVTKADLVEDHSSRSHLTQCQTKDEETLVLSYRELNDINKKSVAYTNNLLSTQRMEDEL RTAHARTDIEATPEGIQSDLDIMNDDSLWD >gi|222441853|gb|ACEP01000089.1| GENE 16 17848 - 18072 168 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027736|ref|ZP_03716928.1| ## NR: gi|225027736|ref|ZP_03716928.1| hypothetical protein EUBHAL_01995 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01995 [Eubacterium hallii DSM 3353] # 1 74 1 74 74 123 100.0 5e-27 MFQIIYGKKAIKFLKKQDKPTQKCLMTAISRLPLEGDIKKLQGASGYRLRVGNFRVLFDV NGVIIDIIDIGNRG >gi|222441853|gb|ACEP01000089.1| GENE 17 18095 - 18319 347 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027737|ref|ZP_03716929.1| ## NR: gi|225027737|ref|ZP_03716929.1| hypothetical protein EUBHAL_01996 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01996 [Eubacterium hallii DSM 3353] # 1 74 11 84 84 138 100.0 1e-31 MSNVKERIFGAVTIMSDEDAEKVWNLIQATFLLNNVEEVTPDPDEIAALNAYHSGDPDYQ PAMSQEEVLKELGL >gi|222441853|gb|ACEP01000089.1| GENE 18 18351 - 18461 113 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MTYEQLLDAADQEGLAVKEQPLSTHDGLIIGSHIAI >gi|222441853|gb|ACEP01000089.1| GENE 19 18508 - 18657 169 49 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027739|ref|ZP_03716931.1| ## NR: gi|225027739|ref|ZP_03716931.1| hypothetical protein EUBHAL_01998 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_01998 [Eubacterium hallii DSM 3353] # 1 49 1 49 49 88 100.0 2e-16 MAEHLDVTEEFLKDALDAYLLKYGKCTVVDNYMVFFEPLGVVDMNYGIE >gi|222441853|gb|ACEP01000089.1| GENE 20 18766 - 18846 59 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKEVEENSDCHEFISQEEAMKELGL >gi|222441853|gb|ACEP01000089.1| GENE 21 19042 - 19170 142 42 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027741|ref|ZP_03716933.1| ## NR: gi|225027741|ref|ZP_03716933.1| hypothetical protein EUBHAL_02000 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02000 [Eubacterium hallii DSM 3353] # 1 42 9 50 50 64 100.0 3e-09 MGLSDIFNIGNIKKENEELKEMLTPDMNDAIDLQHKINDFNK >gi|222441853|gb|ACEP01000089.1| GENE 22 19321 - 20313 629 330 aa, chain + ## HITS:1 COG:no KEGG:CPR_0832 NR:ns ## KEGG: CPR_0832 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_SM101 # Pathway: not_defined # 2 329 98 429 431 287 56.0 6e-76 MSSDVYKERLTTIRNQQKQMIKQDIAASGNTDWTVNNNKAKGRKMVNDMKKLLLRAFNSE CDETIGKVKYNNIETSVRKIVKSAEQIQKLGTIMSVYINQSYIDLKIVELYLAFEYQQKK QQEKEEQRELRAQQREEAKLKKEIEEKRKKIKKEQTHYQQALKNLLSQIKEHGETEDLIA KKAELETELSNIDKSIKDIDYREANQKAGYVYVISNVGSFGENIYKIGMTRRLEPQDRVD KLGDASVPFKFDVHAMIFSDNAPALEAALHRAFEDRKLNMVNTRREFFYVTLDEIKQVVK ENFDKTVEFIDFPDAEQYRTSLKMREQLLA >gi|222441853|gb|ACEP01000089.1| GENE 23 20731 - 21444 476 237 aa, chain - ## HITS:1 COG:lin1828 KEGG:ns NR:ns ## COG: lin1828 COG4748 # Protein_GI_number: 16800895 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 7 236 12 241 355 240 56.0 2e-63 MDFNEQIKQLSKRISTLKNSVVTEEATKTSFIMPFFQILGYDVFNPTEFFPEYVADVGIK KGEKVDYAIIIDGKPCIFVECKSCNEDLDKHASQLFRYFAAAPVKFGILTNGIVYRFYTD 
LEKDNIMDLEPFVEVNLENINESGIKALMKFRKETFDKSNIYKAAEELKYSTLIKKVFED EFDTPSDEFVRFVLNDIYTGKKNKNMVEKFKPMVRKAFSAFINDIVNQKLNEVFDTM >gi|222441853|gb|ACEP01000089.1| GENE 24 21745 - 21858 56 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MISLYSFLFSVLAGIISGIIVEMIMKLLSKWFGRKKK >gi|222441853|gb|ACEP01000089.1| GENE 25 22213 - 23730 498 505 aa, chain + ## HITS:1 COG:no KEGG:BACI_pCIXO200950 NR:ns ## KEGG: BACI_pCIXO200950 # Name: not_defined # Def: hypothetical protein # Organism: B.anthracis_CI # Pathway: not_defined # 4 505 14 517 517 273 33.0 1e-71 MNDVKLKDLYMGLPDGEVEARDKRFQELFFDPNNKYNEIINSNEKFLIIGSKGTGKTYLS KYIVEQSPSKQTCIIVDPKNFWICKLINIDEQELTNDYISVLCKWFLLYEIANSLLNKHR WLKHLPRCKLNKLRKFMLEYNDDTFYKIVSLSTTNNQEITGNLSHGISHSDKLQTSNFQH SAGIKTSDGVSYESTRKRFFDLIDYFEQLVFDCFQINDHLLIILDDLDELKKEAGEQSEN IIYNLITAAKKYNFYFNSRAKSLKIIMLLRSDILNKMQGNHPNLNKIKTSCSIDLYWLLD STHDKWDHPLISMIFHKIRASCEPYKNRSNKELFEILFPESIDKKNPLDFLLDHSLGRPR DIVTFLNCAKKEFPERTCFSATVLKETRKIYATDFYNEMLNQASFYKSSAYSTQCLKLIA GIKRPSFSYSDIQTLYEENRTSYSEIDNLDDALHFLYELGAIGNAWKSKKGKHRTCWYYK IDAIDEVDLSQNFTIHYGLRKKFSL >gi|222441853|gb|ACEP01000089.1| GENE 26 23946 - 25184 704 412 aa, chain + ## HITS:1 COG:BH3551 KEGG:ns NR:ns ## COG: BH3551 COG0582 # Protein_GI_number: 15616113 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus halodurans # 1 394 1 359 378 101 26.0 3e-21 MGKVSTRKRGKTWQYYFQLASVNGTRKWKTGSGYRTKAEAQATGTKALAEYNSTGIAFKV SEQSVADYFDYWMEHYVEQELAETTVNTYKKRIRLYIKPYIGSYKLKNVQGETLRNFLAE LHRTGMSRNTLTCIKGMLTSAFGYATVQAKFISVDPSYKLTLPNKRKDSEVGTRKENHIF VEEDMWNAIIERFPEGHPSHLALMLGYYCGLRLGEVYGLTWDCVDFENKTITINKQMQEP SGCGKWLLYVPKYDSSRTVTVGNNVLALLKRTLEFQLTDKETCGEYYQENYMNYDEETHS LLSLNDLRPVHFVNAKQGGLLAHPRNMQHTSRSIHGKAKNCTLISEEWDFHSLRHTHATI LYEAGVPMPLIQKRLGHINIQTTKRYTDHVTKKMLSMLDEVINGDNIDNNLE >gi|222441853|gb|ACEP01000089.1| GENE 27 25263 - 26273 1326 336 aa, chain - ## HITS:1 COG:no KEGG:Closa_2929 NR:ns ## KEGG: Closa_2929 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: 
not_defined # 1 336 1 336 336 572 90.0 1e-162 MALFESYERRIDKINSVLSSYGIASLEEAEKITKDAGLDVYGMVKGIQPICFENACWAYT VGAAIAIKKGARKASEAAAAIGEGLQAFCIPGSVAENRKVGLGHGNLGKMLLEEDTECFA FLAGHESFAAAEGAIGIAEKANKVRQKPLRVILNGLGKDAAQIISRINGFTFVETEYDPY TNTVKEVFRKSYSDGLRAKVNCYGANDVCEGVAIMWKENVDVSITGNSTNPTRFQHPVAG TYKKERLEAGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSSSVP AHVEMMGLIGMGNNPMVGATVAVAVSVEEAAKAGRF >gi|222441853|gb|ACEP01000089.1| GENE 28 26292 - 26996 1095 234 aa, chain - ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 228 1 228 230 360 78.0 1e-99 MIISQEVQNMCPVAQGVHHGAAPIPEEAKWVKAKEVKDISGFTHGVGWCAPQQGACKLTL NVKEGIIQEALVETIGCSGMTHSAAMASEILPGLTVMEALNTDLVCDAINTAMRELFLQI VYGRSQSAFSEDGLQIGAGLEDLGKGLRSQVGTMYGTLQKGPRYLEMAEGYVTGIALDED DLIIGYQFVNLGKMTDFIKKGDDPNTAWEKAAGQYGRVADAAKIIDPRTDKVID >gi|222441853|gb|ACEP01000089.1| GENE 29 27221 - 27439 141 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027749|ref|ZP_03716941.1| ## NR: gi|225027749|ref|ZP_03716941.1| hypothetical protein EUBHAL_02008 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02008 [Eubacterium hallii DSM 3353] # 1 72 1 72 72 79 100.0 1e-13 MAKQIWYHAEKAAIAKMKDIFKEAYQKQKAEQTAQKETEVSGSTAPDQTDSTSESDASST HRNEMESEGTTE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:50:54 2011 Seq name: gi|222441852|gb|ACEP01000090.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont108.1, whole genome shotgun sequence Length of sequence - 42935 bp Number of predicted genes - 29, with homology - 29 Number of transcription units - 16, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 286 - 345 6.1 1 1 Tu 1 . + CDS 395 - 2236 2392 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Prom 2288 - 2347 4.2 2 2 Tu 1 . 
+ CDS 2489 - 3157 533 ## CLOST_0058 hypothetical protein + Term 3236 - 3272 2.0 + Prom 3241 - 3300 5.4 3 3 Op 1 . + CDS 3407 - 3847 175 ## gi|225027752|ref|ZP_03716944.1| hypothetical protein EUBHAL_02011 4 3 Op 2 . + CDS 3855 - 4067 213 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 5 3 Op 3 . + CDS 4143 - 4313 118 ## gi|225027754|ref|ZP_03716946.1| hypothetical protein EUBHAL_02013 6 4 Tu 1 . - CDS 4577 - 6259 2345 ## COG1109 Phosphomannomutase - Prom 6312 - 6371 10.6 + Prom 6987 - 7046 5.2 7 5 Op 1 3/0.000 + CDS 7078 - 7824 1163 ## COG0760 Parvulin-like peptidyl-prolyl isomerase + Prom 7936 - 7995 7.7 8 5 Op 2 . + CDS 8070 - 8939 1361 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 9031 - 9066 2.4 + Prom 9228 - 9287 9.7 9 6 Tu 1 . + CDS 9457 - 9978 237 ## gi|225027758|ref|ZP_03716950.1| hypothetical protein EUBHAL_02017 + Term 10084 - 10135 4.4 - Term 10072 - 10123 1.4 10 7 Tu 1 . - CDS 10156 - 11637 1197 ## COG1640 4-alpha-glucanotransferase - Prom 11660 - 11719 5.8 11 8 Op 1 . - CDS 11755 - 12771 851 ## COG1609 Transcriptional regulators 12 8 Op 2 38/0.000 - CDS 12807 - 13640 734 ## COG0395 ABC-type sugar transport system, permease component 13 8 Op 3 35/0.000 - CDS 13640 - 14482 858 ## COG1175 ABC-type sugar transport systems, permease components - Prom 14554 - 14613 2.9 - Term 14495 - 14535 5.1 14 8 Op 4 . - CDS 14616 - 15935 1641 ## COG1653 ABC-type sugar transport system, periplasmic component + Prom 16162 - 16221 7.5 15 9 Tu 1 . + CDS 16429 - 17598 1372 ## COG3839 ABC-type sugar transport systems, ATPase components + Term 17622 - 17668 0.2 + Prom 17801 - 17860 9.6 16 10 Op 1 . + CDS 17913 - 19214 1410 ## ELI_3930 hypothetical protein 17 10 Op 2 . + CDS 19282 - 19929 671 ## COG0546 Predicted phosphatases + Prom 20198 - 20257 7.1 18 11 Tu 1 . + CDS 20313 - 21140 683 ## gi|225027767|ref|ZP_03716959.1| hypothetical protein EUBHAL_02026 - Term 21481 - 21524 1.7 19 12 Op 1 . 
- CDS 21581 - 23338 1567 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 20 12 Op 2 . - CDS 23332 - 25173 2188 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases - Prom 25225 - 25284 10.7 - Term 25346 - 25404 4.6 21 13 Tu 1 . - CDS 25467 - 26036 179 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 - Prom 26076 - 26135 14.2 + Prom 26205 - 26264 7.9 22 14 Tu 1 . + CDS 26300 - 35707 9791 ## LLO_3002 transmembrane protein (fibronectin III domain and GP5 C-terminal repeat) + Term 35741 - 35794 7.4 + Prom 35838 - 35897 9.2 23 15 Op 1 . + CDS 35939 - 36598 696 ## Mlab_0596 hypothetical protein + Prom 36606 - 36665 8.2 24 15 Op 2 . + CDS 36734 - 36940 334 ## PROTEIN SUPPORTED gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31 + Term 36959 - 37026 15.4 + Prom 36975 - 37034 2.4 25 16 Op 1 1/0.500 + CDS 37090 - 38031 889 ## COG3872 Predicted metal-dependent enzyme 26 16 Op 2 32/0.000 + CDS 38034 - 38927 355 ## PROTEIN SUPPORTED gi|169796031|ref|YP_001713824.1| putative adenine-specific methylase + Prom 38990 - 39049 7.3 27 16 Op 3 . + CDS 39166 - 40245 1607 ## COG0216 Protein chain release factor A 28 16 Op 4 . + CDS 40251 - 41321 947 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 29 16 Op 5 . 
+ CDS 41393 - 42844 1408 ## COG0297 Glycogen synthase Predicted protein(s) >gi|222441852|gb|ACEP01000090.1| GENE 1 395 - 2236 2392 613 aa, chain + ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 613 1 608 608 656 53.0 0 MCGIVGYTGREAAAPIVLNGLSKLEYRGYDSAGVAVFNEAENDIDIVKTKGRLKNLLEKT DSGKAMPGGCGIGHTRWATHGAPSEKNAHPHCSDDKMVVMVHNGIIENYQELKDHLTQKG YTFYSETDTEVAVKLIAYYYEKTDHNPLESMSRALLRVRGSFALGIMFKDRPGVVYAARK DSPLIVGQNEGGAFIASDVPAILSYTRNVYYLDNMEMAELHEDSIRFFNMNGDEIEKEMV NIKWDAEAAEKGGFEHFMMKEIHEQPKALKDTINTYVKDGKIDLSSVGISPEEFAQYHRI CIVACGSAYHVGIAAKYVMEDLTGLIVDVDFASEFRYRNPKLSQDTLVVIISQSGETADS LAALRLTKEMGVKTLAIVNVVGSSIAREADHVMYTLAGPEIAVATTKAYSTQLICMYILS MYAAQVRGEIEESTLNSMLKEIVLLPEKVASILADKERIQWFANKFASAHDVFFIGRGLD YAICQEGSLKMKEVSYVHSEAYAAGELKHGTISLIEDGTLVIGVLTQSHLCEKTLSNMQE VKSRGAYLMGVAENGSYYLEDMVDFVVYVPKTDPHFTTTLAIIPLQLMGYYVSVARGLDV DKPRNLAKSVTVE >gi|222441852|gb|ACEP01000090.1| GENE 2 2489 - 3157 533 222 aa, chain + ## HITS:1 COG:no KEGG:CLOST_0058 NR:ns ## KEGG: CLOST_0058 # Name: not_defined # Def: hypothetical protein # Organism: C.sticklandii # Pathway: not_defined # 1 218 8 220 222 91 28.0 3e-17 MEIKTFHGFEDFCVEKFENTDEWYYGIYPTWMSNVEVAEECGKGREYAGSKLCFFNKTGK AYESIKQEKNVYLEAPVYDKASNSFGILRYDFNKRILQAIAYKPDEESVEVIEEISLSEI DDLMNIRLIESPFMIVRNSMEENAIEFLAPKRKIYLEENESIEFVENEKMYSSKWNEDLL DVLNYGYGEEVIVRDLETGKILERYKGYLNRMPDGTAWLMTE >gi|222441852|gb|ACEP01000090.1| GENE 3 3407 - 3847 175 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027752|ref|ZP_03716944.1| ## NR: gi|225027752|ref|ZP_03716944.1| hypothetical protein EUBHAL_02011 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02011 [Eubacterium hallii DSM 3353] # 7 146 1 140 140 217 99.0 2e-55 MRISGNLIFGSVLSIGALVVFTVIQIDRPDIVTTEMKKLSLSSQEPFTLFICLLINCISL 
FLVARFTKQIFEWIELSESPFIPQISNQINKISAALFVYIIVPTGSVNEINLSSVVLGIV IVLVLYCLSAVFKYGCSLQKEVDGML >gi|222441852|gb|ACEP01000090.1| GENE 4 3855 - 4067 213 70 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 1 67 1 67 67 86 62 2e-16 MAIILRLDRVMADRKINLKDLSEQIGIANVNLSKIKTGKVSAIRFSTLNAICKALDCQPG DILEYQEDED >gi|222441852|gb|ACEP01000090.1| GENE 5 4143 - 4313 118 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027754|ref|ZP_03716946.1| ## NR: gi|225027754|ref|ZP_03716946.1| hypothetical protein EUBHAL_02013 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02013 [Eubacterium hallii DSM 3353] # 1 56 1 56 56 69 100.0 7e-11 MKKYLNRIYTILLIILCVVLCVLQIQAGTLLSTGTIIRFVIIFIAYGLGMAYKKKS >gi|222441852|gb|ACEP01000090.1| GENE 6 4577 - 6259 2345 560 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 2 546 3 557 575 468 42.0 1e-131 MYLDEYKRWLAADLEDSDLHPELAGIEGNDDEIKDRFAVALKFGTAGLRGVLGAGTNRMN IYVVRQATQGLANWVKTQGGNQTVAISYDSRIKSDVFAKTAAAVLAANGIKVRIYDALMP VPALSFATRYYECNAGIMVTASHNPAKYNGYKAYGPDGCQMTDDAAAIVYDEIQKTDVLT GAKYISFAEGVEQGLIRFVGDDCKNAFYEAIEARQVRPGLCKTAGLKLVYSPLNGSGLVP VTRVLNDIGITDITIVPEQEYPNGYFTTCSYPNPEIFEALKLGLDLAKESGADLMLATDP DADRVGIAMKCPDGSYELVTGNEMGVLLLDYICAGRKELGTLPEKAVAVKSIVSTPLADA VAAHYGVEMRSVLTGFKWIGDQIASLEAAGEVDRFIFGFEESYGYLAGPYVRDKDAVISS MLICEMAAYYRSIGSSIKERLEEIYAEYGRYLNVVDSFEFPGLTGMDKMSGIMQGLRDNP PTSIGDKKVVSVTDYKNAEATGLPAANVLTYGLDNGATVVVRPSGTEPKIKTYFTTLGKD LADAQATKDELAAALAPLFK >gi|222441852|gb|ACEP01000090.1| GENE 7 7078 - 7824 1163 248 aa, chain + ## HITS:1 COG:CAC0279 KEGG:ns NR:ns ## COG: CAC0279 COG0760 # Protein_GI_number: 15893571 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Clostridium acetobutylicum # 1 248 1 247 247 160 
41.0 2e-39 MSEKIIAVSAGREITETEFNEFLSKLPEQQQAYVATEEGRKQALTQYANYFLFEKLGYDK KYDQDEAFLATMEAVKRELLGQYALTQEIKDIQATQEECEAYYNEHKAMFVKEAKATAKH ILTATEEESKKVLEEIESGVKTFEDAAKEYSTCPSKAQGGSLGTFGRGQMVKEFDEAVFT AEVGKVIGPVKTDFGYHLIRVDELTGGEQSEFAEVYPQIMQQLTTEKQNKKYMAVRQEMI EKYGLEFK >gi|222441852|gb|ACEP01000090.1| GENE 8 8070 - 8939 1361 289 aa, chain + ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 3 288 2 287 287 431 78.0 1e-121 MALVSAKEMLQKAKAGHYAVGQFNINNLEWTKAILLTAEECKSPVILGVSEGAGKYMTGY KTVVGMVNGMLEELNITVPVALHLDHGSYEGAKKCIEAGFSSIMFDGSHLPFEENVEKTK ELVAICNEKGMSIEAEVGSIGGEEDGVVGMGECADPQECKAIADLGVTMLAAGIGNIHGK YPENWAGLQFDVLDDIQKLTGEMPLVLHGGTGIPEDMIKKAISLGVSKINVNTECQISFA EATRKYIEAGKDLEGKGFDPRKLLAPGAEAIKATVKEKMEIFGSIGKAE >gi|222441852|gb|ACEP01000090.1| GENE 9 9457 - 9978 237 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027758|ref|ZP_03716950.1| ## NR: gi|225027758|ref|ZP_03716950.1| hypothetical protein EUBHAL_02017 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02017 [Eubacterium hallii DSM 3353] # 1 173 1 173 173 327 100.0 2e-88 MTNLNNDFMESYKHLDKICKEIFNSEKGVTTYIDTMKEVNDGNRYVPLWNKTLYKLKHYR HIRNNYVHEVGTSQYDICTRDDIEWLNNFYKEIMKTTDPLARYRREKSSYKAVEDVKYIR YTEGGEGEQLQYQNVNSKSDDGEGLRLLNIILKFLIIMGINLAVVLFIMLMIS >gi|222441852|gb|ACEP01000090.1| GENE 10 10156 - 11637 1197 493 aa, chain - ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. 
PCC 7120 # 1 491 4 497 502 468 48.0 1e-131 MRTAGILMPITSLPSPYGVGTLGQCARDFIQFLHRAGQTYWQILPICPTGYGDSPYQSFS SYAGNPYMIDFDDLKKEGLLQEEEYASISWGDNPARVDYSLLYSHKFSILRIACSRISGV LSERFQTFCKEQNVWLNDYALYMSIKDYYGGQSWQEWPETLQKREKDSLEAIRQELTDEI LFWKKVQFLFFYQWDKLKAYAEENNIKIIGDLPIYVSYDSADVWAHPEEFQLDENLCPIE VAGCPPDGFSADGQLWGNPLFNWDYMKETGYQWWVNRIAYQSKIYDVIRIDHFRGFDEYY AIPYGDSTAKNGYWKPGPGMDLFTTIENKLGKLPIIAEDLGFLTDSVKQLLASTGFPGMK VLEFAFDSRDTGSGYLPHCYPNNCVVYTGTHDNDTILGWFETAPEEFQNNAIRYLRLTKE EGYHIGMMRCAWASVANTAIMQMQDFLGLGVEGRMNTPSTLGGNWTWRCLPGDYNNELAD WLYEETKIYDRLN >gi|222441852|gb|ACEP01000090.1| GENE 11 11755 - 12771 851 338 aa, chain - ## HITS:1 COG:HI0506 KEGG:ns NR:ns ## COG: HI0506 COG1609 # Protein_GI_number: 16272450 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 2 313 3 309 332 156 30.0 5e-38 MTIKDIARLSGYGIGTVSRVLNNHPDVSEKARKKILSVIEENNFQPNSNARHLKMQSSSS VILIVKGTSNLLFADIVEQMQSLFLKNGENVSTYYIDEDANEVNYAIQLCRDRNPKGIVF LGGNLEYFKEDFAHIHIPCLLLTNNSSDLDFDNLSSVTTCDEQAAQSVIEYLAKKGHTSI GVVGGNLTETEISFRRFTGAKWGFKNSKLSFNQRLQYEPCRFSMEEGYQAAMRLLNRNTA ITAIFAMSDLIALGTMRAIYDMGKRVPEDISIVGYDGIALSRYCVPRLTTVQQDTTQLAK KGVDLMLQRINYPYHATHKTTPFHLIEGESVMELNGDK >gi|222441852|gb|ACEP01000090.1| GENE 12 12807 - 13640 734 277 aa, chain - ## HITS:1 COG:Cgl2406 KEGG:ns NR:ns ## COG: Cgl2406 COG0395 # Protein_GI_number: 19553656 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Corynebacterium glutamicum # 14 277 40 304 304 273 54.0 3e-73 MKNKTSGIFATIFFTLISILFVSPILIVLMNSFKKKAFISLEPFKLPNEKSYVGLENYLT SISKYDFIKAVGWTVFITVLSVIVILVCCSMCAWYITRVQNKITKLIYLLCVFSMVIPFQ MVMFTLSGTADTLGLNTPWGLIIVYLGFGAGLAVFMFCGFVKSIPVEIEEAAMIDGCNPI RTFFSVVLPIMKPTYISVGILEAMWIWNDFLLPYLILDIKKYKTISIIIQYMKGSYGRVD MGAIMACLIMAVIPVIVFYLSCQKYIIKGVAAGAVKG >gi|222441852|gb|ACEP01000090.1| GENE 13 13640 - 14482 858 280 aa, chain - ## HITS:1 COG:Cgl2407 KEGG:ns NR:ns ## COG: Cgl2407 COG1175 # Protein_GI_number: 19553657 # 
Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Corynebacterium glutamicum # 1 279 1 280 281 293 57.0 2e-79 MEKSIKKYWPIFVLPTFAAFIIGFIVPFIEGIYLSFCKFTTIRDASFVGLSNYVEAFKDN TFTHAFGFTAVFAVVTLVLINVLAFAIALALTQKIKGTNIFRTIFFMPNLIGGIVLGYIW QLIFDGIFEKFDLALKLSAKLGFWGLVILICWQQIGYMMIVYIAGLQAIPGSLNEAAMID GANKWQRLIHVTIPNMMPSITICTFLTITNSFKLFDQNLSLTGGEPMHQTEMMALNIYDT FYGRVGYEGVGQAKAVIFFIIVVAIGMIQLYATKSKEVQQ >gi|222441852|gb|ACEP01000090.1| GENE 14 14616 - 15935 1641 439 aa, chain - ## HITS:1 COG:Cgl2408 KEGG:ns NR:ns ## COG: Cgl2408 COG1653 # Protein_GI_number: 19553658 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Corynebacterium glutamicum # 1 433 1 437 443 314 40.0 2e-85 MRKMKRLLAAGLATIMACSMLTGCGGGSSDKKASSDKDSSKGSVYYLNFKPEADEQWQEL AKEYTDETGVPVTVVTAAANQYETTLKSEMGKSEAPTLFQVNGPVGLASWKDYCYDLTGS DILNELTSDKFELKNGDEIAGVGYCTETYGLIYNKALLKKAGYTQNDIKSFADLKKVAED ITKRKDELGFSAFTSAGMDGSSDWRFKTHLANLPIYYEYQKDGITDTKAIKGTYLDNYRN IWDLYINNGTCDAKQLSKKTGDDAVAEFTTEQAVFYQNGTWAYGDIADIGNDNLGMLPIY IGAPGEEKQGLCTGTENYWCVNKNASKEDIQATLDFINWCVTSEEGTKMMCSADGMGFVI PFKNNLKSENPLVNIANDYMEKGYTPVDWTFSTMPSEQWKNDLGSALTTYAADQTDANWK LVQNAFVDGWAKEAAAATK >gi|222441852|gb|ACEP01000090.1| GENE 15 16429 - 17598 1372 389 aa, chain + ## HITS:1 COG:CAC3237 KEGG:ns NR:ns ## COG: CAC3237 COG3839 # Protein_GI_number: 15896483 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Clostridium acetobutylicum # 1 387 1 368 369 406 57.0 1e-113 MAEVTLKHIKKVYPNTEKKSKKGQKKSNLMVTDEGVLAVQDFNLDIKDREFIVLVGPSGC GKSTTLRMVAGLEEISGGELLIDGKRMNDVAPKGRDIAMVFQSYALYPHMTVRENMEFPL KLRKMPKDEMNKRVDEVAEILDITQYLDRKPKALSGGQRQRVAIGRAIVREPKVLLMDEP LSNLDAKLRNQMRAELIKLRQRIDTTFIYVTHDQTEAMTLGDRIVIMKDGIVQQVGTPQE VFDHPANIFVAGFIGMPQMNMFDAKLVEQSGKYSVELGGVSIILSEEKQAALAKNNVKSQ 
EITLGIRPEHIALAQPGKDAITGKVDVSEMMGSAIHLHVNACGRDTIIIVQTMDLTGGED LSIGSEISFAFGGNAVHVFNKETGENLEF >gi|222441852|gb|ACEP01000090.1| GENE 16 17913 - 19214 1410 433 aa, chain + ## HITS:1 COG:no KEGG:ELI_3930 NR:ns ## KEGG: ELI_3930 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 14 431 18 434 436 546 63.0 1e-154 MLNVTLDSLGLETVRGDESFVSRVQDMPVSKEEFFDLTKMAKYVGVTEQFKDVINTFHTP EGETPAGFKRELVMEKDGVVKVDLVRDISYDKNGILRPTNVLFSADSANPYEVEPISPLI SNLTCNPGIIYDLFINNPKANVGNKYKNRDEVMAEIGRVLGPGCDISVELNNPFEQDFNK ILEEAEKFREMFSRYRVVIKVPHTGAVTPQNVTQLLSENKKLDKRPDQVGTEDALRGHNL ALKLHEHGFRVNFTLMFEPFQTMLAMQARPYFINTFLRHRLLQSQNIKKYVDMYEVSKDN KILETLKDYFISCDYYTEADRDMALADVLAFGKDLLKYRHFEDKQGQDGLDGMRHNLRVL KNSNLKDTRLIVCSMEGPYNYPDIDKLLTEPEFQDMNHKVVITAEPNYLARFTSTNQVIS YQRRFMNAANGQS >gi|222441852|gb|ACEP01000090.1| GENE 17 19282 - 19929 671 215 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 196 4 203 222 117 31.0 1e-26 MGKAVIFDLDGTLLNTLDDLEDSVNHTLNYFKYPKRTKAEVRSFIGGGAKALIKKSLPEN VTAEKYEEVLSYFQAYYKKNADKKTGLYPGVKKLVNKLSDENYSIGVVSAKGDIVVKELV ESFLGDKVNETLGEKEGIKRKPAPDSILIMMDTLKCKPEETIYVGDSEVDVEAAANAGIR CASVTWGFRDKEDLEKINPLYIADNVQELYELIKG >gi|222441852|gb|ACEP01000090.1| GENE 18 20313 - 21140 683 275 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027767|ref|ZP_03716959.1| ## NR: gi|225027767|ref|ZP_03716959.1| hypothetical protein EUBHAL_02026 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02026 [Eubacterium hallii DSM 3353] # 25 275 35 285 285 441 100.0 1e-122 MQKKSIIIIILVVLLLVAILAVAAELGMKSGGRPNKIIEKITNGNNNEDNSNSDSNNMKE KDSEKEFEDDTNSELNTGYGYEMDVLSDKNNSTEEKTVYEKGYDLPVDKQEDKTAKKELH EMMQAVWPIYDKSEKGDTSNVVLPSKTIKKILDKIESCSVPAFFSGREEHMRNADKFDGF LKNAQKGEKDSIIVYEVHEDGGVGRNKYIFDGQNMYVLYVGSVWNNDEDVISIDTTYTRI KEWNYTKDGWFTGEFCVPEPPERSETVDGRFRILV >gi|222441852|gb|ACEP01000090.1| GENE 19 21581 - 23338 
1567 585 aa, chain - ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 219 569 9 358 372 346 52.0 1e-94 MVNIKINDVPISVEEGTTILEACREAKMIVPTLCYLKDINEIGACRVCVVEIKGIERLVT SCNNTVKEGMEIYTNSPKVRDARRINIKLILSQHNCFCPTCVRSGNCQLQKVANDLEFGS GTFKQHITEEKWPMDFPLIRDESKCIKCMRCIQICDKVQSLKVWDIAKTGSRTTVNVSLG RNIKEADCALCGQCITHCPVAALSGRDDKRPVFSQNGMLNTRRKTTVVQVAPAVRTAWAE AFKLSRKFASPERLAGALRLMGFDYVFDTTYAADLTIMEEGSEFLERLKHKEDYQWPMFT SCCPGWVRFLKSQYPDMVDCLSTAKSPQQMQGAIIKNYFADKIHEDPENIFSVSIMPCIA KKAECALPTMDSTGTGPDVDVVLNTREFVDYMKSLNIDVYGLPEDRFDSPCGEGTGAAVI FGTTGGVMEAALRSCYYLATGQNPDPDAFHGVRGMDGWKEAAIDINGTEVKVAVVSGLGN ARKLIEAVRRGEVFYHFVEVMACPGGCVGGGGQPIHEGKEMAEIRSKNLYFLDSQNERRF SHENPEVLKTYEEYLEKPLSRMSHKLLHTDHHGWDMPLAPKLKKR >gi|222441852|gb|ACEP01000090.1| GENE 20 23332 - 25173 2188 613 aa, chain - ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 137 535 46 459 468 224 36.0 3e-58 MSRLEVRKKSKAQLVVDNLYEDLERRIIASPPGLCPVDMTASFLKLFHAQTCGKCVPCRI GLGVLEDMLEDVLDGKATLETLSLIERTAKAIYDSADCAIGYEAGRMVLKGLEGFREDFI EHITNGHCSCHLEQPVPCVALCPAGVDVPGYISLIADERYADAVKLIRKDNPFVTSCALV CEHPCETKCRRNFVDDAVNIRGLKRYAADHCGLVPPPECCEPTGKTVAIIGGGPSGLSAA YYLQQMGHQCTVFEQRKELGGMLYYGIPNYRLPKERLAEDVQNILATGVETKMNTCIGDG DGEISFKMLKENYDAVYISIGAHTDKKLGIPGEDAKGVISAIQMLRDIGDGNPPDFTGKN VIVVGGGNVAMDATRSAIRLGAKNVSVMYRRRKADMTALMDEIEAAIAEGAEILELHAPQ AIETNKKGYVTAMVADPKIIGLITGGRPSPANSGREPIHIPCDIVIVAIGQDIVSEPFER AGIPAKRGRFLAQKDSSIAEKDGVFSGGDCVTGPATVIRAIAAGKVAAANIDNYLGFNHK IESGVVVPEPRIENKTPCGRVTMMERNTTERKKDFNIVENGMTCKEANQEAHRCLRCDRF GYGVFKGGRVEEW >gi|222441852|gb|ACEP01000090.1| GENE 21 25467 - 26036 179 189 aa, chain - ## 
PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 7 180 2 175 190 73 32 2e-12 MSAITTTKKKSFTTFQMALIAVMAAITCILGPLSIPIPISPVPISLTNLAIYLTVCLLGW KFGTISYLIYLLIGIAGLPVFSGFSSGFAKLLGPTGGYLIGFIPMTIICGFAFEKFSNHG MQIVGLAIGTIVAYIFGTAWLAIEAHLTFYQALLAGVIPYIPGDLVKIILVVLAGPIVKK RLQSGGFLL >gi|222441852|gb|ACEP01000090.1| GENE 22 26300 - 35707 9791 3135 aa, chain + ## HITS:1 COG:no KEGG:LLO_3002 NR:ns ## KEGG: LLO_3002 # Name: not_defined # Def: transmembrane protein (fibronectin III domain and GP5 C-terminal repeat) # Organism: L.longbeachae # Pathway: not_defined # 1620 2745 1399 2395 3329 148 24.0 5e-33 MKQRFLSIFLALALIIPFFPQVTLPAKAAETTAKTQDENSNENTDAFGIKMDTTIDTKKE KANNPYGTEGWVNLFTVPELFVSQGYDGYRSFETYNYNNDRDHKEGSIGSITDGTRIGKK EEGNKNGFTIMDTAPVDAEGEGQKRYVAVLGYWLGGKRLELYIADKDGNRVSNTYAIGGE KTLDYLEQADAFEDTGFVSVAAGDFNGDGKDTVIGYAPLMESDSEQPQLWEFSIGSNMQL EKTGTVCNIFDILGTGNIATKHSNNGKVFRNTPVVQMTTADTDKDSVDELVVTAGMNNTK ADVNNRQSRMFIYDNITEEEKSYWNKTFELDTKGYNNGNQRLRWASSSVGNLVMTGSGAD YPEIITAGWVDKKTNKDADLTHDIGAYVTSCSKTTKKGNSAIGTYAKSEVPAIGSNGQPQ VSGFTKDGHYRDDVQSLVVVDTFAADGVNAKASVLIMDTIYTYEAGKGLNEEYRTDYFNH SDNGIGTSIITNGLVQDAVSGNFSGNEEGREQIVFTTCQKRASKNQYFYKTYTYKKTGKS FWQCNSTGYRISKKGNAYTSLCAPDIDSDSTIARIKDVSLTYTEPEVLALLESTPYFSEV DEGDIGNSETAYGKENGEGTSVSTAEGLTTNIVAGFEWSVDDICAGFVCGAGFETSVEQG YTWETATSTTKKFSLNYSNDTGENQVIVYRRPVTTYRYEIKGTKDTMVLARQGTLLTSML PVDEYNEAAQSYELEEIADGTLATPGNPFSYRSSTAGLNNVAESKITTQYGKEGTVTQEF STETEQEKTFTYDLNASFTAYGLVFGVKAGGGAGTTYSESQSTINTEAITKTGAVTGKQV EGYDFNWKFAHWTTKVNGTEIPVLGYVLTNVIAPPSPPENLAVESVTSDSAKITWDAGER GADEYRIYQIYSDGSDIRIGTVDGTESTYEVTGLKPDTSYTYAIKAYKEGKKGDAISGES VFSEKLIVTTLPEKMGTVTITNPENASVKIGGSAVFKADLSSTASDYRATNYKWQRREKG GKWQTIDGAKSSKLTLNDLTEADNNTEYRCIFRVSYTSASSLIEYYSKAATLTVGETAVA PELTITGHDNTGNGTLAKPYAGKSNYNKKTGTTTQEIKTTKNITIEKSGTQPELTVYTDG DQTAPKYYGVGKDENDNIVYYQVAKTGDTYTAGDKITVAEKYSYTNLNGAAVTDVPAEFN SGKDSVTVTKNDVTYYLQAKITGKQRAGINTGSSGVTSTDSRLESMTRITYYWKSNNGYY 
TYDSSAPNTPGSAVTLSDADKNVLYDVYHKADTKVVVGRNETYTVSDTSTVNGKQETTYE NESQYGFALVTISSGETTTYTITSIQEDVEETYTVGDTELAGFDPAKLTLVTKQVINTVE TPVYTTQAGDSLTLRAKVTEKDNSKAAAGASVEFKIVNTQTNGVETISATTDANGAASAK WTAATSGLYSIQVNVLAKSGYTASATKAQYYNAGGTYETNTTEYRLVLSSAGKTLTGTMT YGGSVSYELQERAITVSSDAAKAQTVGEWKKSEKTNLTYTVETTEKDRKKVDEVKQPLSV ASYNFRVYDGDTIDPQKELAAAALQVTKAAVTITPEIKNGVTPSSASDITLKVEPEILGV NLNDVLNVNCGYFTDKTATGKFDVILSYKTDSNGAVTDKVKAFQNNYTAVFESASFTVKP DSAQVKFSCGENGTIVGRYADNWYPMASGSNQTKGTRLRFAVAPNSGYGVAKWIINGTDY AVDAKNLPEGMRISEDGKILDVASFNPASQTASTNPGHTKDGVLTVEVSFKSTSHEITYN VDGKGGTLTAVNENDKKINSGTKITQGSKVTFTAEPEEGYIVSGWKVDGKTYKWQDKDED YLGTTLVLEDISKDENVIVSFKKSTASYKVITSVADEDGKTDTSLAKVTAINAETKEAVT DLTSIKEGTTLTFTASVADKTNHMVKLWQTSKDGKTWEDAALSGGSNTFTLYNISGNLYI RPVITIAQKYSLKYKVVLDDGKPSETIVTDKKIAELTATSNGQEIASGESHSAYIPIEFA LTLNNDYYVTGWSKNVKASEDLSGASLDALDSNTEVVVTIKEKPVVTIPSGISNGTLTVI YKDAANKDISVSNGDHVPNGTQLLVTLTPEKGYVVDEDALGASVETEYTDEDKSGNTTDT KSYKVENVIANVTVKGAFKALGTHKVTYEPVIASGESANGTLTAKADRKQMDIYKIDKLT TGEKVYEGSTLTFTAAPDQDYSVQEWRVNGKVLKEDGIKVTDSTLTIANVEKDYTVTVQF KKSGSDTTIAAGANGKIVSAVAGKVDQIANIETGFVLASGATVDITAQPDTGYKVGCWKV NGKVVDGQTGNTYTYTADENGTGAAITVQFVQIDYAVSWSAVNGTVKAEDTSGTEYEGNK ADIRGGSKVTFSATPKEAYKVSCWKVNGKVVDGENANTFTFTVPSGAKETPEVASYKVEA VCEKDQFTLTYAQPSNGTLTARGAAGEVASGDKVNGDEKYTFTVKPDADYIVESWKVDGQ VIDSHSTSYEVTVKKNTEVSVRLVPASYKVTYKVNNEQGKLLVGKDTEEKTDGEISAAYG TSIKFTAVSNKFCHIKGWKLDGTEVTDTTEGISISADGSELTLSEVKKEHSVEAIFDAAT MYEVSYEVDETSEGVASNAGTLNAKAGNTDLKLKKDQTTTVEGGKTLTFTAVPVSADFMV AGWYVNGKKVENELSNTCVIEKLDKKVHVTVQFTQYKGYALPVSGEGYALSEMKRTPDDT TPDTEIRENGTLSFKVAPDTDNKYIRIDKLVINGYDCLADKLSEGKEQPENCTLVEVQKN EDGSYVITVSGITGEIQTDITAHKHTLKKVDKVNPTCTKAGNKEYWICEDENCKEMFLEE AAMKAVQWKDILLPATGHNYQNGICKNCGAKDPTYVDIHPGTPKVKAKAKGDRKIKLSWS KAKDAQGYIVYRYNAKTRKYAVIANTKKTAYTDKKRTPGTVYRYLVKAYGVSLNKKVIYG QVSNCASAVTKPQTPKITSVKKDGTTKAAIQLKTERNVKGYQLYEYMWQTKKFKLVGKIE GKKYYKYDSKKKKFTRDRKSKVVQNAKKKTISVKITTNNVNFKRYRRYRFKVRSYVKYNG KQIFSKLSKQKIVTR >gi|222441852|gb|ACEP01000090.1| GENE 23 35939 - 36598 696 219 aa, chain + ## HITS:1 COG:no 
KEGG:Mlab_0596 NR:ns ## KEGG: Mlab_0596 # Name: not_defined # Def: hypothetical protein # Organism: M.labreanum # Pathway: not_defined # 16 141 10 140 222 70 29.0 4e-11 MEGKRLTSYAMEELECPKCGHKHSLKKYKVINVTEKAKLKEEIMKNRLYQFSCEECEYMA PLTYDSLYVDSQRNIMIYMAPVMNAEIKAEIAELEQEKGIDKRLVDNINDLKEKIMIADN HLDDRVIEIIKIMYIDQMKKEMEDDTLLNILFDYNRDNYCFLVFFQKKGIGKIPLTREFY RQVEDKYKDAIKEHSMDSFMKVDMEWAGKILFKNHNKFN >gi|222441852|gb|ACEP01000090.1| GENE 24 36734 - 36940 334 68 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238917368|ref|YP_002930885.1| large subunit ribosomal protein L31 [Eubacterium eligens ATCC 27750] # 1 68 1 68 68 133 85 2e-30 MREGIHPNYYQAKVVCNCGNEFVTGSTKEDIHVEICSKCHSFYTGQQKAAKARGRIDKFN RKYGMNQN >gi|222441852|gb|ACEP01000090.1| GENE 25 37090 - 38031 889 313 aa, chain + ## HITS:1 COG:CAC2886 KEGG:ns NR:ns ## COG: CAC2886 COG3872 # Protein_GI_number: 15896140 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Clostridium acetobutylicum # 2 293 3 296 317 264 48.0 2e-70 MKKTSIGGQAVMEGVMMKNLDRYAVAVRKPDHEIEVMTDEYKSLGSRYAVLGLPIIRGVV NFGESLYIGLKTLSYSSSFYEEEDYEPGKVEQFFTKIFGDKLESVLMGITMVISVILALG IFMVLPFFLTNLMKGFLPSYSIRTLIEGIIRVALFLIYIWLISKMEDIKRVFMYHGAEHK TINCLEHGEDLTPENIKKYSRLHKRCGTSFLLIVMIVSIVVFMFIRVDSLVWKFVLRILL VPVVAGISYEFIRLAGRSDSKIVNTLSKPGLYLQYFTTREPDEEMMEVAIKAVEGVFDWR EYLRAMHNGELED >gi|222441852|gb|ACEP01000090.1| GENE 26 38034 - 38927 355 297 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169796031|ref|YP_001713824.1| putative adenine-specific methylase [Acinetobacter baumannii AYE] # 18 258 42 288 334 141 37 7e-33 MTIREVLINIRERLQNAGIEDFEYESWAFLDWKLHIDRAEFYMNPNGEVKEELLAELESV LKQREQRVPLQYLMGECEFMGYDFYVDERVLIPRQDTECLVELAVEDIRNRKTQNRCESN NTADQKNEQKVKVLDLCTGSGCIGISVAKLCPDTEVTLADISEGALSVAKKNAQNLDAGV TLIKGNLFENIEGRFDYILSNPPYIPSEVIEGLMPEVKEHEPRLALDGEADGLSFYREII NEAPDYLNPDGRIYFEIGAEQGEDLTHLMNERGFSEVKVHKDLAGLDRIVTGIYSRK >gi|222441852|gb|ACEP01000090.1| GENE 27 39166 - 40245 1607 359 aa, chain + ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## 
COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 354 1 354 359 404 58.0 1e-112 MFDKLEDLVKRLDEVMQELSEPDVVSDQNRFRNLMKEQNELAPIVEKYKEYKEAKQTIED SVEMLEEEHDEEMRELLKEELSEAKKNVEQYEEELKVLLLPKDPNDDKNVIVEIRAGAGG DEAALFAAEIYRMYKNYAESKRWKTEFIDVNENGIGGFKEVSFMINGQGAYSRLKYESGV HRVQRIPATESGGRIHTSTITVAIMPEAEEVDVQLDMNDCRFDVFRASGNGGQCVNTTDS AVRLTHIPTGIVISCQDEKSQLKNKDKAIKVLRARLYDLEQSKAHDAEAELRKSQIGTGD RAEKIRTYNFPQGRVTDHRIKLTLHRLDDILNGDLDEIIDSLTAADQAAKLSKLQEMER >gi|222441852|gb|ACEP01000090.1| GENE 28 40251 - 41321 947 356 aa, chain + ## HITS:1 COG:L0065 KEGG:ns NR:ns ## COG: L0065 COG0079 # Protein_GI_number: 15673188 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Lactococcus lactis # 4 353 3 350 360 363 50.0 1e-100 MQEWKKNIRQVVPYVPGEQPKKANVIKLNTNENPYPPSPKVKEQCAKICAETEELRLYPD PTAGMLVEAIAKYKGLDSSQVFVGVGSDDVLAMAFLTFFNSERPIFFPDITYSFYDVWAD LFKIPYDKKPLDENFMIKKDDYYWKNGGVVFPNPNAPTGVLMPLDEIEDIISHNQDVIVI VDEAYVDFGGHSAQELLSKYENLLVVQTFSKSRSMAGMRIGYAMGSAELIKALNDVKYSF NSYTMNRTSILLGTASIEDDVYFKETVEKIVNTREWFKAEMKKLGFTFPDSKANFLFASH PKVPAKEIFEAAREKDIYVRYFDKPRINNYLRITIGTDEEMEKFLKFLTSFLKIQK >gi|222441852|gb|ACEP01000090.1| GENE 29 41393 - 42844 1408 483 aa, chain + ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 3 481 2 477 477 407 45.0 1e-113 MKKILFAASECVPFMKTGGLADVVGALPKEFDKNEWDVRVVMPNYRGIPEEYRNNFEYVT HFYMGVGPYIPNVYVGIMKYVYNGITYYFVDNLDYFGAMEPYSDTRTDVEKFTFFCKAVL SILPVIDFRPDLIHCHDWQTGLIPVYLKTEFAANPFFWGIKTMMTIHNLRFQGVWDINTM KGLSGLPDYLFTPDKLEFKKDANMLKGGIVYSDFVTTVSNTYAQEIQTAYYGEGLDGLLS ARNQSLFGIVNGVDYSLYDPSNDVKLYRNYSEYNHREGKAANKAELQKQLGLEVNPNKYM 
IGLISRLTDQKGLDLVRYAMDRLIDDNTQIVVIGTGDPSYEEMFRYYAWCNSDKVSANIL YSDDLAHKLYAAADAFLMPSLFEPCGLTQIIAFHYGTIPIVRETGGLKDTVIPLNEFEDT GDGFSFSNYNGDELINTVNYSKYIYFEQPEIWDHMITRAMEKNLSWGVSKRRYEELYNTL IGR Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:52:19 2011 Seq name: gi|222441851|gb|ACEP01000091.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont349.1, whole genome shotgun sequence Length of sequence - 5538 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 2, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 103 - 642 448 ## EUBREC_0237 hypothetical protein - Prom 676 - 735 7.2 - Term 788 - 831 0.1 2 2 Op 1 . - CDS 888 - 1859 774 ## Elen_2186 hypothetical protein 3 2 Op 2 . - CDS 1878 - 3470 945 ## COG1696 Predicted membrane protein involved in D-alanine export 4 2 Op 3 . - CDS 3479 - 3709 355 ## gi|225027783|ref|ZP_03716975.1| hypothetical protein EUBHAL_02042 - Prom 3729 - 3788 3.4 5 2 Op 4 . - CDS 3796 - 5394 1299 ## COG1020 Non-ribosomal peptide synthetase modules and related proteins - Prom 5469 - 5528 5.6 Predicted protein(s) >gi|222441851|gb|ACEP01000091.1| GENE 1 103 - 642 448 179 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0237 NR:ns ## KEGG: EUBREC_0237 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 179 37 224 224 130 39.0 2e-29 MYTEAYKKLERSLTDNITEAQAKLGYSKNTMQFCYPLRTLNHFFNSEWEADEMNKALQEF IDFVAPRFGETRIEYKEHRFCFYLSEAAPTYVHEHKEKNEFIFELIQLLLKPDTSLETVT SFFKEWDKNCVIKEVNNDEFDILITFTDRDDPYYYCFKDEELQIVYHRFLPEDYEDFGF >gi|222441851|gb|ACEP01000091.1| GENE 2 888 - 1859 774 323 aa, chain - ## HITS:1 COG:no KEGG:Elen_2186 NR:ns ## KEGG: Elen_2186 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 46 311 53 318 322 160 31.0 6e-38 MSKKRMITKSVIFLLILAFILKIFSSLSIAMGYNANPLYNYSSYSIFQEPDNSIDVLSIG DSNVYSSIFPLVWWEQQGFTGYTWGQPSQRIPETYEYLKKIYKHQKPSIVLIDGNNLFRD 
KTDIDNLDSITKAKLATIFPVISFHKNLNPRRLKNIFGNQHSVMKGYYYRKASHKVHKKK HRMKFTRKCWQINKLSTSTFSKCIHYCKSQGSIPVLISVPNYNGWNYQKHNALQEIADKN GINFVDLNLELKEQINWKKDSVDGGDHLNIKGAKKTSAYLGEYLKKEYGLPDRRGTANYK QWDNDVEEWEKLMLLDKHRKVGL >gi|222441851|gb|ACEP01000091.1| GENE 3 1878 - 3470 945 530 aa, chain - ## HITS:1 COG:FN1672 KEGG:ns NR:ns ## COG: FN1672 COG1696 # Protein_GI_number: 19704993 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane protein involved in D-alanine export # Organism: Fusobacterium nucleatum # 93 530 69 486 486 228 35.0 2e-59 MQYNSWLYLFPFLGGSIILYYLIPVKRRWIGLLFSSLLFYILAMKNLVAVVLFTTATIYL GARTIEEFNEQFKLKKKELDRAARKELKSKITKKNRYIIALLCVINFGILFVTKYFNFFG ENINALFHALHISAQIPFLDIFLPLGISFYTLSAISYVTDVARGVCKAEKNYFRLLLFLI FFPIITEGPISRYGQLGEELKAEHHFDYKQFCFGAQLIVWGLFQKVLLSDRVNMYVRAIF ANHEKYSGFPVIMAILLYTFQLYMDFAGCINIARGSAQLFGITLEENFKRPFFATSVNEF WRRWHITLGAWFKDYIFYPISLGTHFQKFSQNCRKHMNRYYAATIPGIFALFAVWFGNGI WHGAEWKYIIYGLYYYVIMVLGMLLEPAFVSFCSKLNIDRNASWYHTMQVVRTFILVNIG MLIFRSKDIKTAITMLGSVFQPWHGKSNFWQLAFNRGGLFKLDFLLIAFSVILLYFIGKK QENGHSIRELILAQPLPIRWAIYIIPIVLIIIAGAYGPGWGVADFIYAKF >gi|222441851|gb|ACEP01000091.1| GENE 4 3479 - 3709 355 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027783|ref|ZP_03716975.1| ## NR: gi|225027783|ref|ZP_03716975.1| hypothetical protein EUBHAL_02042 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02042 [Eubacterium hallii DSM 3353] # 1 76 1 76 76 110 100.0 4e-23 MDTILEILETIKPGADFASSTDFIEEHLLESMEILQLVSELNDEFDINITLPYIKPENFK SVESIYNMVQEILEDE >gi|222441851|gb|ACEP01000091.1| GENE 5 3796 - 5394 1299 532 aa, chain - ## HITS:1 COG:Cj1307 KEGG:ns NR:ns ## COG: Cj1307 COG1020 # Protein_GI_number: 15792630 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Non-ribosomal peptide synthetase modules and related proteins # Organism: Campylobacter jejuni # 2 523 11 499 502 360 37.0 4e-99 MKNILQYFEKTCEQFPNHIAVANESSSVTFNDLKTRSQQTGTMIASELKQDSLPVAIYID 
KSPALVEAMLSVLYSGNFYVVLDTQMPVERVRRIFETLQPAAILTDKSYIDKVESINSFK NDNAINSTSITEQSCENSQIKEDKLKTFYKTFYIEDSISTEIDTKLLGRLRSQMTERDPA YILYTSGSTGQPKGTVISHRALIAYSDWFIHAFDINERTVFGSQTPLYFSMSVSDLYASL RTGATYQIIPKKYFSFPMQLIEYLNTYKINTIYWVPSALNIVANWDTFAYIKPKYLKKVL FAGEVMPVKQLNYWRKALPDVFYANLFGPTETTDICSYYVVNRTFSDDETLPIGVACSNC DLIIADENGKEATSGELLVRGPFLADGYYHNPEKTAEVFVQNPLNDAYPEIVYRTGDLVY RGEDGLLRYQGRKDFQIKHMGYRIEPGEIEAAIGALPEIHACVCIYLETTDEILLFYQGK IKQESLSAAVSGKLPAYMRPNKFIRIRQMPYNSNGKVNRKLLKTNYLEENQL Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:52:34 2011 Seq name: gi|222441850|gb|ACEP01000092.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont350.1, whole genome shotgun sequence Length of sequence - 2269 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 16/0.000 - CDS 419 - 1255 786 ## COG0784 FOG: CheY-like receiver 2 1 Op 2 . - CDS 1279 - 2097 758 ## COG0784 FOG: CheY-like receiver - Prom 2129 - 2188 8.6 Predicted protein(s) >gi|222441850|gb|ACEP01000092.1| GENE 1 419 - 1255 786 278 aa, chain - ## HITS:1 COG:CAC2071_1 KEGG:ns NR:ns ## COG: CAC2071_1 COG0784 # Protein_GI_number: 15895341 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Clostridium acetobutylicum # 5 167 1 155 162 84 31.0 2e-16 MKITMKNEKIIKVLVADNSEFAKNNLRDFLESSEGIQLINVVDNGKDAYHFIMDEAPDIV LIDVILPVMDGFTVIEKVNANKSIKKKPSFIIISSMGNQSMVEYACKLGVLYYMMKPYNL DSMLHRIFQTASSHMEAERDTKKKERKHLMKGYGISGYMENTLENDVTDIIREIGIPAHI KGYQYIREAIMMTVNDINLLNYITKLLYPTIAKKYKTTSSSVERAIRHAIEVAWNKGQID VLEDMFGYTISAGKGKPTNSEFIALIADKLRLEYRMQA >gi|222441850|gb|ACEP01000092.1| GENE 2 1279 - 2097 758 272 aa, chain - ## HITS:1 COG:BS_spo0A_1 KEGG:ns NR:ns ## COG: BS_spo0A_1 COG0784 # Protein_GI_number: 16079478 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus subtilis # 1 121 1 120 120 106 46.0 5e-23 
MSNINVVIADDNERIRQNLQEIVSRECDMTLVGSASDGMDALSMIRTSHPDVVLLDIIMP KMDGLGVLEEVKKDENLLKKPAFIILTAMGQEEVMEEALGLGANYVLLKPFDSNALVKKI RQIKNGKIAANRKDFITLKEKTEEKDGDYRLEIIVTNIIHEIGVPAHIKGYQYLRDSIIM SVNNMDILNSITKQLYPTIAKMHETTPSRVERAIRHAIEVAWNRGKMDTINELFGYGIQA GKGKPTNSEFIALIADKIRLEYKGKIKYPAEA Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:52:47 2011 Seq name: gi|222441849|gb|ACEP01000093.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont351.1, whole genome shotgun sequence Length of sequence - 50366 bp Number of predicted genes - 43, with homology - 42 Number of transcription units - 22, operones - 11 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 557 - 1858 1243 ## COG2851 H+/citrate symporter 2 1 Op 2 . - CDS 1892 - 2815 799 ## COG0657 Esterase/lipase - Prom 2842 - 2901 8.0 3 2 Tu 1 . - CDS 3496 - 4302 663 ## Sterm_3186 MerR family transcriptional regulator - Prom 4524 - 4583 6.8 - Term 4596 - 4655 10.9 4 3 Op 1 1/0.000 - CDS 4728 - 5240 574 ## COG1827 Predicted small molecule binding protein (contains 3H domain) 5 3 Op 2 13/0.000 - CDS 5266 - 6117 546 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 6 3 Op 3 10/0.000 - CDS 6153 - 7358 1337 ## COG0029 Aspartate oxidase 7 3 Op 4 . - CDS 7379 - 8287 968 ## COG0379 Quinolinate synthase - Prom 8346 - 8405 9.4 - Term 8399 - 8461 19.1 8 4 Tu 1 . - CDS 8495 - 9187 549 ## gi|225027797|ref|ZP_03716989.1| hypothetical protein EUBHAL_02056 - Prom 9385 - 9444 8.6 + Prom 9161 - 9220 6.0 9 5 Op 1 40/0.000 + CDS 9350 - 10018 713 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 10 5 Op 2 . + CDS 10021 - 11388 687 ## COG0642 Signal transduction histidine kinase + Term 11628 - 11663 1.1 + Prom 11564 - 11623 6.4 11 6 Tu 1 . 
+ CDS 11768 - 12679 626 ## Cphy_2618 Ig domain-containing protein - Term 13112 - 13159 3.1 12 7 Op 1 11/0.000 - CDS 13341 - 15197 2031 ## COG0465 ATP-dependent Zn proteases 13 7 Op 2 10/0.000 - CDS 15187 - 15777 773 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 14 7 Op 3 1/0.000 - CDS 15804 - 17345 748 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 15 7 Op 4 . - CDS 17318 - 18727 867 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit - Prom 18754 - 18813 3.3 - TRNA 19499 - 19572 81.1 # Met CAT 0 0 - Term 19457 - 19494 5.0 16 8 Op 1 . - CDS 19634 - 20113 461 ## gi|225027808|ref|ZP_03717000.1| hypothetical protein EUBHAL_02068 17 8 Op 2 . - CDS 20136 - 20585 105 ## gi|225027809|ref|ZP_03717001.1| hypothetical protein EUBHAL_02069 18 8 Op 3 . - CDS 20586 - 20867 327 ## Closa_4066 sporulation protein YabP - Prom 20889 - 20948 7.4 + Prom 20866 - 20925 7.3 19 9 Op 1 1/0.000 + CDS 21049 - 21324 509 ## COG0776 Bacterial nucleoid DNA-binding protein 20 9 Op 2 . + CDS 21372 - 21611 276 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) + Term 21687 - 21734 -0.0 21 10 Tu 1 . - CDS 21906 - 22448 439 ## COG2002 Regulators of stationary/sporulation gene expression - Prom 22590 - 22649 6.0 22 11 Op 1 . - CDS 22667 - 23785 899 ## COG0657 Esterase/lipase 23 11 Op 2 . - CDS 23853 - 24623 505 ## EUBREC_2419 hypothetical protein - Prom 24655 - 24714 3.0 24 11 Op 3 . - CDS 24723 - 26009 939 ## Closa_3262 hypothetical protein - Prom 26042 - 26101 9.3 - Term 26249 - 26291 -0.5 25 12 Op 1 13/0.000 - CDS 26351 - 27127 196 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 26 12 Op 2 9/0.000 - CDS 27127 - 28065 981 ## COG4120 ABC-type uncharacterized transport system, permease component 27 12 Op 3 . - CDS 28096 - 29124 1441 ## COG2984 ABC-type uncharacterized transport system, periplasmic component - Prom 29189 - 29248 5.1 28 13 Tu 1 . 
- CDS 29345 - 30259 1121 ## COG0039 Malate/lactate dehydrogenases - Prom 30290 - 30349 7.3 29 14 Tu 1 . - CDS 30752 - 31141 505 ## gi|225027822|ref|ZP_03717014.1| hypothetical protein EUBHAL_02082 - Prom 31200 - 31259 4.3 30 15 Tu 1 . - CDS 31338 - 34889 2565 ## COG4485 Predicted membrane protein - Prom 35086 - 35145 5.6 - Term 35075 - 35137 -0.2 31 16 Tu 1 . - CDS 35245 - 36975 1760 ## COG1376 Uncharacterized protein conserved in bacteria + Prom 37194 - 37253 15.2 32 17 Op 1 5/0.000 + CDS 37464 - 38594 1228 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis + Prom 38623 - 38682 4.5 33 17 Op 2 . + CDS 38824 - 39840 864 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 39907 - 39966 6.7 34 18 Tu 1 . + CDS 40073 - 41254 1297 ## COG0439 Biotin carboxylase 35 19 Op 1 . - CDS 41678 - 43123 1497 ## Fisuc_0966 polysaccharide biosynthesis protein 36 19 Op 2 . - CDS 43134 - 44426 1512 ## COG1541 Coenzyme F390 synthetase - Prom 44562 - 44621 6.7 - Term 44630 - 44672 4.4 37 20 Tu 1 . - CDS 44683 - 45285 659 ## gi|225027831|ref|ZP_03717023.1| hypothetical protein EUBHAL_02091 - Prom 45324 - 45383 3.7 - Term 45298 - 45331 -0.9 38 21 Op 1 5/0.000 - CDS 45409 - 46158 834 ## COG0489 ATPases involved in chromosome partitioning 39 21 Op 2 . - CDS 46158 - 46982 883 ## COG3944 Capsular polysaccharide biosynthesis protein 40 21 Op 3 . - CDS 47001 - 48020 1022 ## bpr_I2394 cell envelope-related transcriptional attenuator 41 21 Op 4 . - CDS 48035 - 48154 117 ## 42 21 Op 5 . - CDS 48155 - 48883 460 ## COG4464 Capsular polysaccharide biosynthesis protein - Prom 48914 - 48973 9.0 - Term 49092 - 49144 0.1 43 22 Tu 1 . 
- CDS 49187 - 50098 790 ## COG0679 Predicted permeases - Prom 50304 - 50363 3.1 Predicted protein(s) >gi|222441849|gb|ACEP01000093.1| GENE 1 557 - 1858 1243 433 aa, chain - ## HITS:1 COG:BH0745 KEGG:ns NR:ns ## COG: BH0745 COG2851 # Protein_GI_number: 15613308 # Func_class: C Energy production and conversion # Function: H+/citrate symporter # Organism: Bacillus halodurans # 6 387 4 396 442 131 28.0 3e-30 MNATLIMAFILILIVVATVITKKLPFNFVLFIAPVLCALILGHSIEETSTFIVEQLASMM KSAGFMLLFAFLYFQMLTEAGVFDTIVSAITTKLGNKMNVIVIMVLTTLIGGFSILTGNF TPAYLITFPLMVPLYKKFDFDREAAFIIAQTAMSAMCFIPWGIGMAYTASSAGLDANKLA AASMPWGLCFIPAIIFQWVYFGIKHKRRVGTFQAVTTTVEAAAQQEENQNRRPKLFWVNF ILFILCLVALGIFGIAPYFVFIFATVITAMLNYKDNFGEIFNKVGAMYLNILIMLLAINV YQAVFNNTGMVEALSNGLMQVCPSFLLRYLHVIMLLLCVVIIYVVPFQIFNALYPVFISI GAGFGIPAVAIIAPFVCNLSLATSSTPTNSSTYTGCALTETDVQHYCKKAVPVQTVTNAI VVLTAVLFHVLKL >gi|222441849|gb|ACEP01000093.1| GENE 2 1892 - 2815 799 307 aa, chain - ## HITS:1 COG:YPO2336 KEGG:ns NR:ns ## COG: YPO2336 COG0657 # Protein_GI_number: 16122560 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Yersinia pestis # 41 307 61 331 334 99 28.0 1e-20 MTPEEIKNTDFETFFQAVPQMKEVHHIKATGQEIYHVYKHDVVYVTRSEGERTLQMLVPE VKPESAQQKYPAILYVQGSAWLKQDQYKRLSAMADFARRGFVTAILQYRESDLATFPAQV QDAKTGIRFLRKHAEEYHIDVDNIFIMGDSSGGHTAVLAGITAGQDILDTEDYNEYSCQV KAIVDFYGVTDVRQKEDFPTTLNQGEPDSPEGKLIGGNNVYEHPELSVPVTCANYVEKAT PIPPILMMHGTGDDTVSWRQSVRLYKKLREEEKDVTFYIMENAVHGDNAFWTTENLNIVD EFIKKHI >gi|222441849|gb|ACEP01000093.1| GENE 3 3496 - 4302 663 268 aa, chain - ## HITS:1 COG:no KEGG:Sterm_3186 NR:ns ## KEGG: Sterm_3186 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: S.termitidis # Pathway: not_defined # 1 110 1 110 327 73 36.0 6e-12 MNEKTVGMLSKFTGVSVHTIKYYEKLGLISSSRREQSNYRSYDVKSCTDIYECMKYRNMG FSLKDIQLLLKKGDNALLSSLLEKRSQELTTEVTELLNTRELIENYRKELADLDRRFGKW YIEDCPDFYFRRQTKGLNYMDEASCESDGINLAEYAPKSSSLLELSPEYFKGDLSAFSWG HGISIDTDNDFMKNKKGFEKISGGRMFTAYMSLRGHYASEGDLINEFLRYYNEYKSGIPK 
APVYGFRLRITLDETGERKDYFRMQLPL >gi|222441849|gb|ACEP01000093.1| GENE 4 4728 - 5240 574 170 aa, chain - ## HITS:1 COG:SP1234 KEGG:ns NR:ns ## COG: SP1234 COG1827 # Protein_GI_number: 15901096 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Streptococcus pneumoniae TIGR4 # 6 169 5 171 171 125 39.0 4e-29 MTGSKRREELINTIKSSTSPISGKTLAQLYDVSRQVIVQDIALLRTAGYDIISTNRGYIL NAPHTITRVFKVSHTDAQTEDELYSIVDLGGTVVNVMVNHRVYGHMEAPLGISSRLKVKA FIDDIKNGKSSPLKNITSNYHYHTVEADSEKTLDLIEKTLEEKGYLITEV >gi|222441849|gb|ACEP01000093.1| GENE 5 5266 - 6117 546 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 13 278 14 283 286 214 43 6e-55 MNTISMKLQADELIRLALQEDISSEDVTTNSVMKEAVEGEVQLICKQDGIVAGLDVFKRV FELLDENTKTEFLCKDGDAVKKGQLMGTVTGDIRVLLSGERVALNYLQRMSGIATYTHTV SALLKGTKTKLLDTRKTTPNMRIFEKYAVRAGGGFNHRYNLSDGVLLKDNHIGAAGSVAK AVQMAKEYAPFVRKIEVEVENLDMVKEAVEAGADIIMLDNMSVEEMKEAIRIIDGRAQTE CSGNVTKENIDHLTSLGVDYISSGALTHSAPVLDISLKNLHAL >gi|222441849|gb|ACEP01000093.1| GENE 6 6153 - 7358 1337 401 aa, chain - ## HITS:1 COG:FN0009 KEGG:ns NR:ns ## COG: FN0009 COG0029 # Protein_GI_number: 19703361 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Fusobacterium nucleatum # 6 383 7 381 435 382 51.0 1e-106 MDFRTDVVIVGTGVAGTFSALNLPKDKNIIMITKSDLESSDSFLAQGGICVLTDENDYDN YFEDTMKAGHYENRKESVDIMIRSSQEIIHDLIGYGVDFAHDGDKLLYTREGAHSRPRIL FHEDITGQEITSKLLAQVKKLPNVTIYEYTTMTDIIVQDGHCAGIIAKTKDGGIMHIRAE DTIFASGGIGGCYKHSTNFPHLTGDALDISKKHGIRLEHLDYVQIHPTTLYSKEPGRRFL ISESVRGEGAVLYNKNGERFVNELLPRDVVTKAIKAQMEKDGTSHVWLSMEHIDKETILN HFPNIYQRCLEEGYDVLKEWIPVVPAQHYFMGGIWVDSDSKTSMDHLYAVGETSCNGVHG ANRLASNSLLESLVFAKRAARKICGLEQTVKPVDFSESAAV >gi|222441849|gb|ACEP01000093.1| GENE 7 7379 - 8287 968 302 aa, chain - ## HITS:1 COG:FN0008 KEGG:ns NR:ns ## COG: FN0008 COG0379 # Protein_GI_number: 19703360 # Func_class: H Coenzyme transport and metabolism # 
Function: Quinolinate synthase # Organism: Fusobacterium nucleatum # 5 301 3 297 298 323 57.0 2e-88 MTIVEEIKKLKEEKDAVILAHYYVRPEVQEIADYIGDSFYLSKVATKLKEQTIVFCGVSF MGESAKILNPKKTVLMPDMNADCPMAHMAEIETIKKVRSEYDDLAVVCYINSTAALKEYS DVCVTSANAVKIVKELPQKNIFFIPDRNLAHYVASQVPEKNFVYNNGFCPTHERIEAEDV ANVKAQHPDAQIVAHGECQESILLMADYVGSTSGIINYVTESDCQEFIVCTEEGVGYKLK TQNPDKTFYYPDRLPVCPNMKKNTLEKILHVLKTGENEVHVSDALRENSKKPLEKMLELA AK >gi|222441849|gb|ACEP01000093.1| GENE 8 8495 - 9187 549 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027797|ref|ZP_03716989.1| ## NR: gi|225027797|ref|ZP_03716989.1| hypothetical protein EUBHAL_02056 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02056 [Eubacterium hallii DSM 3353] # 1 230 1 230 230 364 100.0 2e-99 MRKRRYLFPGMAVVLLFTILFAAGYTTPAIGNVNTVQAATKKKDISREGKKKRSIIKGKI FELEVDKPESVKDKNLTWSIKDSSIVTFAGKERHDDDIKFKALKAGTTKITCRNTKTDKK INFKITVKNVKKTIKASYQNKTICAKGSLKRSIRVNGELELEVKKTKGKGVKDRQLKWTV EDSSVLAFEDMDDGIYDDEMEFVGIKPGKTTVTCTNTANKEKVTFTITVK >gi|222441849|gb|ACEP01000093.1| GENE 9 9350 - 10018 713 222 aa, chain + ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 1 222 1 222 224 203 52.0 2e-52 MRMLIVEDEKDLCNILKKRLQKDYTIDICMDGEIAKDYLNTYTYDIILLDIMLPKIDGIT LLKWLRNQKKTTPVLLLTAKSSIEERVNGLDSGADDYLIKPFAFEELQARLRVLLRRNIA ADPSDILIYEDLVMDTKKKKVSRNGKEISLTTKEYKLLQYMMRNPGHVLSRDQLEQRGWD STFEGGSNIIDVYIRYLRKKIDSDYEKKLIHTVYGKGYCLGE >gi|222441849|gb|ACEP01000093.1| GENE 10 10021 - 11388 687 455 aa, chain + ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 4 453 11 441 445 190 29.0 4e-48 MKSSKLSIKFRITFWYTLVFSLILALVFFYIYFLSKSYSVHDLQEELKDESADILENLYD 
KSSRNDILYNADWTKFYDDNVSISLYDKNSSFINGVLPEHFPDKVPFINNKVRQIKSDKN IWLVYDYCYEESPEDSIWVRSIASYGHWSRILYNMLWIFAVLFPITVFISAFVGYHILKK ALQPIYTISDMAKEISFSVNLSKRIETSNTHDEFSYLADAFNSMIAKLEKSFEKEKQFTS DAAHELRTPISVIQSHCEYCLDELKLTDSVREEIEIIYKKTTYIGQLISQLLTISRTDKH SFKPQKEEVDLTLLIESVIEELQQKAAYKNISLLFEYIPQNESYIIQGDTMLLMRMIYNL IENAINYGINGGCVWVNLIKNQENTDIIIKDNGIGIEKEHLDKIWNRFYRVDKSRSTGEG FGLGLSMVRFIVDIHGGSIAVKSKPGYGTTFTVSL >gi|222441849|gb|ACEP01000093.1| GENE 11 11768 - 12679 626 303 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2618 NR:ns ## KEGG: Cphy_2618 # Name: not_defined # Def: Ig domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 19 156 10 140 399 65 33.0 3e-09 MNKYFKKSTKRSVISMLSIAITLCLLFSLLFPGKAVNAAPRMRLNKTAVTLIQGKTVKLR VIGTKRKVTWKSSNKKIAKVNKKGVVKALSPGKCTITAKVRGKKLKCKVTVDTVERINAK KLYDLIRKKGKKGKGEEKNLRTISTKFRPKGTDDSIEVRITAYPEKGKLLFSYDYVLDSP WDSYHTELTMNLLKKKKGTISSSYRNLYVDPVYTHSVNGTISTLYDGKSQGLFLTECYNG ADSDEAYDDVETSVPYKGKPRPDDIIKGIYRINDAFANYNILLKKYGYSMKKIGFTKWKN TNN >gi|222441849|gb|ACEP01000093.1| GENE 12 13341 - 15197 2031 618 aa, chain - ## HITS:1 COG:CAC3202 KEGG:ns NR:ns ## COG: CAC3202 COG0465 # Protein_GI_number: 15896449 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Zn proteases # Organism: Clostridium acetobutylicum # 11 610 13 601 602 608 53.0 1e-174 MLNRNMRNTAILLVVFLAVYMFGNYFAEKNLMSGSGSSSYSYERLESDIAKNKVKSIEVV QNAQVPTGSAIVKLNDGEKKVAHVTDVNKVIDLAQQYKVSCQVDDVKSSDAGWLSMILPW IVIGVIMMFMFGMISRASGGGGGGNKMMNFGKSRAKMQDPESQNIRFDQVAGLQEEKEEL KEIVDFLKSPQKYIEVGARIPKGVILVGPPGTGKTLLAKAVSGEAGVPFFSISGSDFVEM FVGVGASRVRDLFEDAKRNAPCIVFIDEIDAVARRRGTGLGGGHDEREQTLNQLLVEMDG FGVNEGIIVMAATNRVDILDPAILRPGRFDRKVAVGRPDVKGREEILKVHVKGKPLAEDV DLHQIARTTPGFSGADLENLMNEAAIHAARNNARFIRMEDIRKSFIKVGIGTEKKSRLVP ELERKITAYHEAGHAILFHLLPDVGPVYTVSIIPTGMGAAGYTMPVPENDNVFETKGRMI QEIKVGMGGRIAEELIFDDVTTGASQDIKQVTDTARSMITKFGMSDRLGFINYEENTDEV FLGRDLGHSRSFSEEVASIIDKEVKKLVDDCYTDAKRILTENMDVLHSCANLLLEKERIS REEFEALFKNDKAAEENL 
>gi|222441849|gb|ACEP01000093.1| GENE 13 15187 - 15777 773 196 aa, chain - ## HITS:1 COG:FN0288 KEGG:ns NR:ns ## COG: FN0288 COG0634 # Protein_GI_number: 19703633 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Fusobacterium nucleatum # 1 174 1 175 175 189 57.0 3e-48 MADKINVLIPEETVDAKIKEIGEQISKDYAGKEVHLICILKGGVFFACELAKRITVPVSL DFMQVSSYGDATESSGIVRIKKDLDDTMENKEVIIVEDIIDSGRTLHYLIPVLKQRNPKS IRLCALLNKPDRREVEVQIDYLGFDIPDEFVIGYGLDYAQKYRNLPFIGVVEPEAAEDAE EGAEQASGEEKGDNVE >gi|222441849|gb|ACEP01000093.1| GENE 14 15804 - 17345 748 513 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 13 511 9 461 461 195 28.0 2e-49 MWDKHWERVDSFLKKWKMVQEGDSILLGISGGADSVCLARYFLARREALSLKLYAVHINH MLRGEEAKRDEEFVQDFCHKWNLSLNVEYRNIKEESRQKKCSEEEAGRIARYECFEKYAK EYHCGKIAVAHHQNDAAETILFRMLRGTGPQGMAGILPVNGKIIRPFLCLSREEIMDILQ NIGQNYVDDSTNTNEEYSRNYIRHRILPEMEHVNQKAAAHISELGMQMQELLAYVTPQME KLYNENVIANEQGELFLEEKTFSVMSLFEQKEMMRRMLFEISGHRKDISLVHVEQMLALM ANKEGKQQNCPYGVLAKRVRDGLLLMKLSTDVTFSDKNKQKDGQQKSDEQNLENLNKKNK KCGNHSEDKMQPVYLQLTAESEESKAEETIVLSNETLLQKFTIHFSILPWNGGKVAKRDC VKYFDYDKMKCKPCLRTRDTGDYFIMDKEGRRKSLGRYFIDTKIPASDRDGQLLLADGSH IMWIIGGRISEFYKVSSETKRVLRVSVQEKEEN >gi|222441849|gb|ACEP01000093.1| GENE 15 17318 - 18727 867 469 aa, chain - ## HITS:1 COG:BH0078 KEGG:ns NR:ns ## COG: BH0078 COG2208 # Protein_GI_number: 15612641 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Bacillus halodurans # 40 454 387 806 830 117 25.0 5e-26 MHNRSYNEQDAIQRLLLFEEGLSRLIELMKIEKKDVLGNDMDFLCQSIMQDVCRKCPKYR ECYGSRQKETIGEIASILEQAYQCSGVDGRMASSEFRRNCVFFQPFMEEISWLYRLIYQN 
IYWEQRYDEIKQVMCRQLDEQRLFLRECRVNMQKGEEICGRKKMALKTLFFRKGIHFIKG KEYKDGSGLLQVTVQVQPLLGTKKAELVRKCLNTYYKRNFYRNMETTWLRPGENRISFIE ENGFHVVFGQRCCNKKGEDVCGDTFSFTNFGRKRAVMLLSDGMGTGKKANEDSRRLIETL EDLLEAGISEEFALEMIQDALLFQDKGAFSTIDAAVISLKTGILKLLKAGGMATFIRHKN SVERIMPAALPPGCRIGQQFDLKYKKLYDGDMVIMVSDGMLEFENMPEISFRMESLIGKI KTNNAQVFANELIEAVPVLEDGYDDDRTVLVAAIWEKGTQKCGINIGRE >gi|222441849|gb|ACEP01000093.1| GENE 16 19634 - 20113 461 159 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027808|ref|ZP_03717000.1| ## NR: gi|225027808|ref|ZP_03717000.1| hypothetical protein EUBHAL_02068 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02068 [Eubacterium hallii DSM 3353] # 1 159 1 159 159 166 100.0 5e-40 MKKRKKKKKNSKQFNRQFKKTFFTKKSAIIIVCTVIILSAFWGAYQKVERKAAKYQKTID ELKTEVKNLNQTNKTLEKEKNNIDTDESKERIARERLGMIKEGEISLQEAQEGESASTTA ADSAETTMASEDQSGQTTESADQTTQSQTENKATIQKTD >gi|222441849|gb|ACEP01000093.1| GENE 17 20136 - 20585 105 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027809|ref|ZP_03717001.1| ## NR: gi|225027809|ref|ZP_03717001.1| hypothetical protein EUBHAL_02069 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02069 [Eubacterium hallii DSM 3353] # 1 149 1 149 149 234 100.0 2e-60 MENKKIADIIIKQIHLFFISGLYGILLGLWYEFFRTLRKNFVHKNKMVHLEDVIFCFTAA IGLFILFQVYNQGMVRFYCLVGLECGAVFYFLVLSEVIGKLFSVFIKIISKIVKITGGII LFPGKVIVKNTGKMLKNMRRTVKIIRRNK >gi|222441849|gb|ACEP01000093.1| GENE 18 20586 - 20867 327 93 aa, chain - ## HITS:1 COG:no KEGG:Closa_4066 NR:ns ## KEGG: Closa_4066 # Name: not_defined # Def: sporulation protein YabP # Organism: C.saccharolyticum # Pathway: not_defined # 1 92 1 93 94 81 41.0 1e-14 MEERILKSHKISLFNRSSGIISGVKEVISFDPNEIILDTEQGMLMIQGEELHVTKLTVEK GEVEIEGLVYSMVYSDDGYMKGEKGGLLRRLFH >gi|222441849|gb|ACEP01000093.1| GENE 19 21049 - 21324 509 91 aa, chain + ## HITS:1 COG:CAC3211 KEGG:ns NR:ns ## COG: CAC3211 COG0776 # Protein_GI_number: 15896458 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Clostridium 
acetobutylicum # 1 90 1 90 91 87 62.0 4e-18 MNKTELVAAIAERTELTKKDADQALKAFIDVVGDELSKGEKIQLAGFGTFEVTERAARDG INPRTKETIHIPASKAPKFKAMKSLKEKVNN >gi|222441849|gb|ACEP01000093.1| GENE 20 21372 - 21611 276 79 aa, chain + ## HITS:1 COG:BH0073 KEGG:ns NR:ns ## COG: BH0073 COG1188 # Protein_GI_number: 15612636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Bacillus halodurans # 1 76 1 76 88 78 61.0 4e-15 MRLDKFLKVSRLIKRRTVANEACDAGRVLVNGKTAKASLNVKNGDVLEIQFGNKTVKAEI LDVKDTAKKDEAKELFRYL >gi|222441849|gb|ACEP01000093.1| GENE 21 21906 - 22448 439 180 aa, chain - ## HITS:1 COG:CAC3214 KEGG:ns NR:ns ## COG: CAC3214 COG2002 # Protein_GI_number: 15896461 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 178 1 181 183 145 43.0 3e-35 MKATGIVRRIDDLGRIVIPKEIRKTLKVKEGMPLEIYTDKEGGIILRKFLPFSEMSSLSE EFAQCVAQQMGNAVFVTDREKVVAAAGYTGEDIVGEPISHILENILDDRDERLPLSERNR FIPIIHGMEEHEQICQAIRSNGQIMGAIIIQAKDRKQRLGESEKKIALVAAEFLGKQVSV >gi|222441849|gb|ACEP01000093.1| GENE 22 22667 - 23785 899 372 aa, chain - ## HITS:1 COG:mlr0240 KEGG:ns NR:ns ## COG: mlr0240 COG0657 # Protein_GI_number: 13470513 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Mesorhizobium loti # 147 370 77 299 316 153 43.0 6e-37 MNRRNLGTSSGRREKAVDKSAMERMVRQNRARRKKRRKDKSISLAATMSKNVIALATSLH GIGPLIQNGTFRKRVAEIEPPWHPPRGFSLVPIEMPDFSMELLLHQNEFFSNNSLEYLKT DDSNMPEDRAGKEDTAERLLTENREVVLQLHGGGYIGKMKNAYRDFAVLYARMPGERAVL SVDYRVAPEDPYPAALEDAYAAYQWLLEMGCRGSQIIVVGDSAGGGLALALCLYLKDKGE PLPKKLVLMSPWTDLAATGDSYETNFEKDPLFGNTTDSMIYSNAYYGENDPKTPYISPLY GNYEGFPPMLFQVGGAEMLLSDSARAAKKAKAAGCEVQLTIYDEMFHVFQLGMKKMKESR EAWKEIEEFLTI >gi|222441849|gb|ACEP01000093.1| GENE 23 23853 - 24623 505 256 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2419 NR:ns ## KEGG: EUBREC_2419 # Name: not_defined # Def: hypothetical 
protein # Organism: E.rectale # Pathway: not_defined # 1 256 5 264 264 136 33.0 9e-31 MNKCKRCNVWVADDTMVCPLCHTILTEEKSKGQKSGSGFPDSEFEYNGARYVDIDSVADK RFLIQGPGYPDVKKNTRRFRKAGRILLFISLALEMLLAFINYLTFDMAPKYWSVVTGGII AYLILTMWDIFSRRQGHIRKIYTQIFVILGLMILIDFALDWSGWSLEFGLPCIIYGLVLA IIICMSVNSSSWQNYLLMQLAAIALSILDVVLHFTGYLHHIVLAWIAFGLSVLLWSGTMI IGDRKAKNELKRKLHI >gi|222441849|gb|ACEP01000093.1| GENE 24 24723 - 26009 939 428 aa, chain - ## HITS:1 COG:no KEGG:Closa_3262 NR:ns ## KEGG: Closa_3262 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 11 425 9 423 618 373 43.0 1e-101 MKSRVQKKETRWEKLDNTANLFPVIANETMTNVYRIAVILSENIEPECLQEALEQVLPWF NTMNVRMRTGMFWYYFETNVKGKPVVREEEDFPCRYIEQHRNKSYLFRVTYYKNRINLEV FHALADGMGAVNFLRELTYQYLRICYPQLSGEMGDRLCDDTSLDTEDSYLKNYKKAGKKT YKSVPAVHMKGERLVRGELGIIHGYMSVKQLKEKAKELGMSINEYLSGIFVYSIYKGYLH GNVSKKPIVLCVPVNLRPFFGSMTTRNFFAMASASFLPEKEKYERQEVMKLVQAELKRQI TQENLEKMIAYNVSNQKNYALRVVPLFLKKPAIKFVYLMSAKATTTTITNMGQMMVDEAY RPYIKRFQIILSPSAGQNTKATVCSYGDELTFTFSSLLKDTSVQKVFFRSLAEDGIDVEI ETNGVYYD >gi|222441849|gb|ACEP01000093.1| GENE 25 26351 - 27127 196 258 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 237 1 222 305 80 28 2e-14 MSSQLKLTDITKKFEANTVNEKVALDHLNLTVEPGQFVTVLGSNGSGKSTMFNVILGSLF PDEGKVYLGDKDITKLKDYKRALNIGCLYQNPLRGTAPNLTIEENLALAYTRKASPSFFA LNKKDSAYFREVLSTLGMGLEDRMKTKIGLLSGGQRQAASLLMATIAEPELLLLDEHTAA LDPGAAEKVMALTKKIVEENKITTLMITHDMEYALEYGSRTIMLDSGKIVMDLTGKEREE MTTGKLAELFYGKAHKRG >gi|222441849|gb|ACEP01000093.1| GENE 26 27127 - 28065 981 312 aa, chain - ## HITS:1 COG:FN2080 KEGG:ns NR:ns ## COG: FN2080 COG4120 # Protein_GI_number: 19705370 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Fusobacterium nucleatum # 7 308 1 275 278 161 35.0 1e-39 MNEIAGMLVSALGQGFIYAPMALGVFLTFSILKTPDLTVEGSFVFGMTACVVVTIAGHPI 
LGLVAGTLAGAAAGLCTGLLQTKLKIEPILSGILTMTGLYTVNYAILGGQSNRYLQYEAK NSLGVTVNKPADTVYKIFQRDFGKDMSSGMTAFLLTMIIVIVTCAVLVTFFRTRNGMAII ATGDNEEMVRSSSINADVSRIGGVMISNALVAFSGALLCQYQKYADLNCGNGMLVMGLAS VIIGLTFFSNHSVTIQAVGVVLGSLLYRIIIQIAYKIDMPSYAVKLLSAVIVVIALVIPW LQKTYKKKGGRR >gi|222441849|gb|ACEP01000093.1| GENE 27 28096 - 29124 1441 342 aa, chain - ## HITS:1 COG:BMEI0015 KEGG:ns NR:ns ## COG: BMEI0015 COG2984 # Protein_GI_number: 17986299 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, periplasmic component # Organism: Brucella melitensis # 42 326 18 307 314 149 35.0 6e-36 MRKFKKFVSTAIIAAMAGTLCLTGCGSNSSSESSTGSSKSGKQVTVAVVQPMSHTSLDQI RDTITSELGKDDNIKVVTDNANGDTAALSSIIENYKSEGVDIVVPIATSTAQTAKSVYDG EDTPIVFAAVSDPEAAGLTGEDCANITGVSNNIPADEIVKLIANFQPDYKKIGFLYTSSE TNSVSTITAAKKYCDDNNIAYEESSISNVSELQTAVESLISKGVDALYTGNDNTIASAMA TYTDVAYAKKVPIYCGADSMVADGGFATVGVNYVQLGKQVADMVEKIANGEKVSDIPYET ISDYAKYVNMQAVKQFGGNFDKKAFEEFDVLVEEDGTSHFNK >gi|222441849|gb|ACEP01000093.1| GENE 28 29345 - 30259 1121 304 aa, chain - ## HITS:1 COG:FN1169 KEGG:ns NR:ns ## COG: FN1169 COG0039 # Protein_GI_number: 19704504 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Fusobacterium nucleatum # 2 301 5 312 318 235 41.0 8e-62 MRKVVIIGAGHVGSHCGYALAHSGIAGEIVLVDVDKDKAHAQALDIADSVSFCERETVVR DGDYKDCEDAALVVVAIGEARKPGQTRLDMLGRSVEMLKELLEQLKPYKIPGIVVTITNP ADIVADYVRKGLGLERNRVFGTGTLLDTARLIRTLSEESGVGRRSIQAYSLGEHGDSSMI PFSSVTIGGLPFDAYDISKEKVLEATRQIGMTIIEGKKSTEFGIGRALTEMAACILRDEK KIMPASVLLQGEYGQHDVHCGVPCLIGKNGIEKIIELPLTEEEQEMLNKSCEVIKKHIKM ASEN >gi|222441849|gb|ACEP01000093.1| GENE 29 30752 - 31141 505 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027822|ref|ZP_03717014.1| ## NR: gi|225027822|ref|ZP_03717014.1| hypothetical protein EUBHAL_02082 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02082 [Eubacterium hallii DSM 3353] # 1 129 1 129 129 195 100.0 9e-49 MAEQHLTKSELAIMEVMWEQEEALTASEIIKASGDKEWKNSSIHLMVNALLEKGFLEVAG 
FKKTTKNYARTFKPTMTKEKYFVRKVMGNTPVAEKKKIYMQLLKEVDEIETLDTIKEEIY ARKKELENR >gi|222441849|gb|ACEP01000093.1| GENE 30 31338 - 34889 2565 1183 aa, chain - ## HITS:1 COG:BS_yfhO KEGG:ns NR:ns ## COG: BS_yfhO COG4485 # Protein_GI_number: 16077927 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 398 1180 73 815 819 171 24.0 1e-41 MSIQGKNGTSRREVFSRELIVAFYFIFLVMTALCHKFGEVHIVHAFTRIVQIVYVPGIFF AVGYYIQIQSEKYGDFKGCMLRHAAYSLGLYALLGIPYEVFAKHVDPFTTIRNFISFIKV FEISAIFLSVSIVFLLCAYLWDYIEGWMEKPVLLLLICIIGFLPVFLPEGIIGYGLAGTL IGGDRTHAVAISTELFVFFWGVFCARKKQNTFLEKKNFIMAALSFVIAGVFAVLHIKAAC FVFLGMFLAFIAINICVPFAGIYAALEEAVVNILERVWGIITTPASREGEENWRSLLRYF IGYTVLFVVMAVLIFLPYILQNKTLIWEADGLGQYVPKVYRFIQYIPEIFKDIFSGNFDF KQYDFTTGLGATVAISYDPVYWLYLLFSPAKIENAYSFLIIVRYFLAGLSMSGLVLYFKK SHFAAYTASIVYAFSGYAIYAGTKHGQFLTPMILLPILVVAMERLIRNRKWYMLTVLVGV SLLCSYYFLYMNTIALGIYFVLRIVCTEEYRNWKTFFERGFIIVGSYILGASLGVISLFT SFGSYMGSSRSGGGKITAFLSTTPLFYRTEWLVDFFTSYISDNFASGLWLKLGFAPLAML AIVLLFTRKNRKELRWLFVIFTAFSLFPVFGYIFSGFSNVNNRWGYIYVVIVSFILAESL DKMRDLSGKEMGIMTAITGLYGLVNAFSDKLFVPSIYGAFGLLAATLIMLFLLNSDKITF SKKQFRGIVLGMTALVIFLNAKWFITSGDDSYTHLDTYVSAGDSKKKISGTALKHLDEVP GADTDEFYRSTNLVTTGNLRSSSLLYGYNDVSTFSSTLNGGIVNYNNAMGNCRWNIVSIY DYNFRTYLNTLASVKYMGVAKKNTATIPYGYKKVQETKNKYYSIYENQYSLPLGYTYDKI VNTDRIDQYSAAEKQETTMLAAIVEDKDMDKNSNLTVATKLPLTAQKLKIKNIKLNGVSM TKDTIEIEKPGATMKFSFEAPANAETYLSLVGDIYAEKDAKEHFITARIKAPGVKYGHKF RIDAYTTGQKEYLFNLGYREGAVKTCTLKFVGTGTLKYKDLAIYSQTMSNYADRVNALKE NSLQNAKAEKNTVTGNITVDKDKMLVVTLPYQKGWTAYVDGKKTDIQRVNYQYIGINLKK GTHDIKLHYQLPGIKLAFMITGCGIIAFVAIIIFNIVRKRRKN >gi|222441849|gb|ACEP01000093.1| GENE 31 35245 - 36975 1760 576 aa, chain - ## HITS:1 COG:CAC1822_2 KEGG:ns NR:ns ## COG: CAC1822_2 COG1376 # Protein_GI_number: 15895098 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 466 576 9 120 120 73 35.0 7e-13 MNKKHLGIVALFILCFSFALAGKSAQAATTPQETSVSKIAYVDASGRLVLKFADFSDAGY 
TYTVTNEKTGAKTDTAQLASKKGVLTVSLGQLYSAKTEYKLILTSKDNTKKLTVHYYTGK ALGSYSIKQKSNDSVKASWTVSDKIYTGYSLELYLGENSTVAQKRISVGKTAASASIASS SLKNARYYTYLIGQTTLNNNKTVYYGQGMLKTFDYVKKPAKVKGVKAVPNANAAKLSWNA VSNASYYTVYKSTKSSSGYKVAKSKCTSTSLKIKGLTGGKKYYFKVVAVSQAGGKIVIGT QSNAVLAKIPVVAGQVRNVQLTFDSKKNLALTWSRTAKATSYHILYKKAADSKYKILIKT KKSNYSLAKLKADTKYNIKVQAVTRIGNKVYLSSKTSKVITVTPRQYRDKNYNKLLASQV RSIGYVGNKCIYTTKKYSTEVKTAFVNYKGYSSKTKYLIWISHYTQQVSIFEGSKGKWKM IRTFICATGTAKNHSPRGVFKITYKEKGWFYTSTKELYVTHYKGRNSFHTRPLWNNGSVQ NPTIGKPASHGCVRCYNQDAKYIYDKMPIGTTVVSY >gi|222441849|gb|ACEP01000093.1| GENE 32 37464 - 38594 1228 376 aa, chain + ## HITS:1 COG:wecE KEGG:ns NR:ns ## COG: wecE COG0399 # Protein_GI_number: 16131647 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Escherichia coli K12 # 1 375 1 376 376 556 68.0 1e-158 MINFNIPPFTGKETEYMTKAIQNHKISGDGEFTKKCNKWIEDKTGTAKALLTTSCTHATE MAALLADIKPGDEVIMPSYTFVSTADAFVLRGATAVFVDIRPDTMNIDETLIEDAITDKT RAIVPVHYAGVSCEMDTIMDIAKRHNLLVIEDAAQGVMSSYKGKALGTIGDYGCFSFHET KNYSMGEGGCLLIRDEKNIEDAEIIREKGTNRSKFFRGQIDKYTWVEAGSSYLPSDMNAA YLYAQLEVADKIYDDRMSTWNTYYENLTPLKEAGYIELPVVPEGCVHNAHMFYIKAKDLE ERSALIKFLKENGISSVFHYIPLHGAPAGQKFGRFHGEDKYTTKESERLLRLPLYYGLEK EKVLTVCEKIKEFYSK >gi|222441849|gb|ACEP01000093.1| GENE 33 38824 - 39840 864 338 aa, chain + ## HITS:1 COG:STM2298 KEGG:ns NR:ns ## COG: STM2298 COG0463 # Protein_GI_number: 16765625 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 4 317 11 325 327 146 30.0 5e-35 MLHSIIIPCYKSSRTIREVVELTSAELDRLGRPEYEFILVDDYSPDDGATLRELKALAAD YPFVKAISLAKNSGQHNAVMAGLNYAQGDLLIAMDDDMQTHPSQLHFLLDEIEKGYDIVY GYYPDKKHSAFRNFGSYLNYLTVRILIGKPKDMKTSSYWVIRKFVRDYVIQYQSPYTHLQ GLFLRTTRNISCVPIKHFEREVGQSGYTLKKLIQLYSNIMGYSVVPLRLSTYCGYFFSIL SILGALAIVIRKLVNPAMALGWPSMMCAICFFSGLIMLFMGTIGEYLGRMFLGMNKQPQF 
VVREVITQNNAPDTEKTGINTSTVEESQTTSSSDIEKL >gi|222441849|gb|ACEP01000093.1| GENE 34 40073 - 41254 1297 393 aa, chain + ## HITS:1 COG:TP0695 KEGG:ns NR:ns ## COG: TP0695 COG0439 # Protein_GI_number: 15639682 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Treponema pallidum # 1 260 1 274 597 100 26.0 5e-21 MKKIMILGAGIYQVPLIKTAKKLGLYTIVVSIPGNYPGFELADKVYYENTVDDEKILEIA KEEQIDGIITTGTDVCVITIGKVCDAMGLAGLSYEAAQIAVDKMLMKTKYEEYGVRTARF RKVLFSDPDIKKTIEGLEFPLIFKSVDSSGSRGITRVDSYDKFDSAMNYVKENTRKDYFL IEEFVEGEEFGAQAFVYNGEVQFILPHGDYVFHGDTGVPIGHFAPYNLNEDVIEDAKEQL RGAVKAMQLDNCAINADFILKDNKTYVLELGGRCGATCLAELTSIYYGFDYYEKILRTAL GENPHFTFDKPFVPNASHLLMSDKDGIIKSQVDYNEPNEDICEVEFDYKPGDEVHKFHVG PHRIGHIITKGETLEDAQKLLFEAMDKVEITVE >gi|222441849|gb|ACEP01000093.1| GENE 35 41678 - 43123 1497 481 aa, chain - ## HITS:1 COG:no KEGG:Fisuc_0966 NR:ns ## KEGG: Fisuc_0966 # Name: not_defined # Def: polysaccharide biosynthesis protein # Organism: F.succinogenes # Pathway: not_defined # 17 445 17 440 474 176 29.0 2e-42 MAGKNKDASLNKRAFTSGFFFILAQLFARGLTFAVTPVYSRLLTKAQYGVVRTYESWLLI AYTIMSLCLWRSVDVAKKDFEDDYNGYVSSVHTLSYIAIAFFFGLCMIFKTQVQDFCQMD DLMFYTCFLYVFTYTSMLYVQRRDKQVLKYKFSTMATLLTIVPGTFLSIWLIYRGRVQGL IGQLVDRRVIGYYVPQIIGGAVVAIVIWTQGKKFINLKYWKYGLAFSLPLIPEALSIQIM NQSDKIMIQKMISKEAAGVFAIATTISFIIWILEDSVWNAWIPWLYAKIGEGAEKEVAKP WISVMHMFGILSWILVMLAPEEILILGGAKYKMAVWLIAPMVSGTLFRFYSYSYSALQNY YKRTQYVAAGTIGTMVLNVILNYVCILNFGYMAAAYTTAFSYIILLLVQGILEHKITGMV IVPLHKTVLIAIAYGAINLLSMNLYRVAWYYRYAVMVLVVGAAAIALRKQMMAVLKMFKK K >gi|222441849|gb|ACEP01000093.1| GENE 36 43134 - 44426 1512 430 aa, chain - ## HITS:1 COG:MA1063 KEGG:ns NR:ns ## COG: MA1063 COG1541 # Protein_GI_number: 20089933 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 36 428 44 445 445 108 23.0 1e-23 MFEKLERKLVNVLLFFDLFKMKTKIPTDFRGVMHTQDKRVQKLVKRAYEIPFYKKRFDEA GVKPEDIRTGDDLSKLPLLTKDELRAWMNEEAKNPKYDCWFHDTTSGSSGIPLMLLVSPK 
EKAYNMANWFRVMMCAGYNPFTGKTMSRKSAHSISGGSDTFLQHFGILRRGFVNQYAPEP EIVKQINAYKPDFLYMNKSEFMRICLYCKQNHVELEKPKFYCPTGEKIDDTARKLFAEIL GPGIIDSYGTAETGAAMVRLFDSEEYTVHNDSFVVNIYDEKNRPAKEGNIVVTPLYKTDL PLINYAIGDKGTCEVRDGVRYITSVQGRMNDFFRYETGEVTTFFEIAPIIAHCEDILQMR FIQETYHKIHIQCVHNIAESKLTKEEVEKDMTAKLNAKFKHPFEIEYEWMDSIPPDKNGK LRMIVCKVQD >gi|222441849|gb|ACEP01000093.1| GENE 37 44683 - 45285 659 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027831|ref|ZP_03717023.1| ## NR: gi|225027831|ref|ZP_03717023.1| hypothetical protein EUBHAL_02091 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02091 [Eubacterium hallii DSM 3353] # 1 200 1 200 200 164 100.0 4e-39 MKTVKRIAAVLLAVMLMAPAAVHAAAPSVKQTNINKKKATATVTYNKKKQLPTVITVSGQ KLVVGKDCIILKKKKSAKKNAGTYKITIKGIGNYYGTTTVTYKIKKAKQKIKTSTKAKTY KASTIKKKGKKFNLKTKAIRKKVTYKVRGKKSAKKYIKVSKKGKVTIKKGIKKGTYKIVI KAKATKNYKRGKKVVKITIK >gi|222441849|gb|ACEP01000093.1| GENE 38 45409 - 46158 834 249 aa, chain - ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 15 223 17 227 227 129 36.0 4e-30 MKKLNLKIKDMPYAAEEALNRLRVNIKFCGKNTKKIVITSSIPNEGKSTVSVRLWKLLSE AGFPTVLVDVDLHKSELQKKYEVEGIKEGLNHFLSGLAEYEDVVYETNIPNGHIVPVTTL LENPSALLEDPRLGELLDRLAEDYRYVIIDTPPLDNISDGALIASMSDGAVLVVRCGETS KALVRQSLQQLDRVGCPVLGTVLNRAEIRSGAYKKYYNRYGKKYGDYYGDYYGAYYGGDT TEKKAGIKE >gi|222441849|gb|ACEP01000093.1| GENE 39 46158 - 46982 883 274 aa, chain - ## HITS:1 COG:SP0348 KEGG:ns NR:ns ## COG: SP0348 COG3944 # Protein_GI_number: 15900277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 32 257 2 229 230 105 30.0 7e-23 MSLINNRNEESTPEFEVISGGKEDVRSSFADRTEQEAEIDLIDLAWALLDKIHYIVLCFL IGAVIMNAYSYFLVRPTYKSTAKMYVVSASKNSVVDLDALNIGTSLTADYEQLMLSYPVL 
EQVINKLNLDMDSDTLAKMITLENPTDTRILNINVVSTDPKSARDIANTLMEVSVDYLPK TMSTNAPNVAQKAKLADYKDGPSYTKYTMIGALAGAFLYCMYLVVKYLMDDTIHTADDME KYFDIVPLAVIPDVSELASEKQQKKGKLEKGFSK >gi|222441849|gb|ACEP01000093.1| GENE 40 47001 - 48020 1022 339 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2394 NR:ns ## KEGG: bpr_I2394 # Name: not_defined # Def: cell envelope-related transcriptional attenuator # Organism: B.proteoclasticus # Pathway: not_defined # 55 337 67 350 354 144 35.0 5e-33 MAKKYYSRSEKEKLAKKLAAVVAVFVIVCTATVVIGHIVDNQKMNTGQKDAAAASDIVTI NGVKCKPNWDVQTYLFIGEDDRGVKTGKTESDGTGQSDVLELLVIDTKKNIYHKLPINRD TITDVKSLDDDGSYLATTKTQIALAHAKGDGMELSCENTVDAVSHMLYGIRIEGYISLNM DSIKILNHLAGGVPVTIEDDFSQSDPSLVMGQNITLTDDQAMHYVHDRMNVGDGTNECRM RRQKAYVDALFPLYKAKLQADSGFINEFYNDLSDYMVTDMSVGEMGNVANMIMNSEDKGE LSIKGTNAIAEDDGFNEFTMDETSRGEVAMELFFNKVEK >gi|222441849|gb|ACEP01000093.1| GENE 41 48035 - 48154 117 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKKVSIIISVILFVILAVLLVCEKTNPGAGSALLQGIF >gi|222441849|gb|ACEP01000093.1| GENE 42 48155 - 48883 460 242 aa, chain - ## HITS:1 COG:CAC3045 KEGG:ns NR:ns ## COG: CAC3045 COG4464 # Protein_GI_number: 15896296 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Clostridium acetobutylicum # 3 237 1 231 254 95 30.0 7e-20 MSVIDFHSHVLPRIDDGSHSSEESLGMLQISASQGIDVMAATSHFYAIEDRISSFLNRRR CSEERLKERMNQELTKEERIPRLIMGVEVAFFTGISRAERLEELTYEGTDLLLLEMPFTK WNKSEIEEVRYILERRKLRVMLAHLERFLMIPGNKKRIYELMELPVYVQINAGSFERWGE RRQILKMIRKKEQIFLGSDCHGLNHRVPNLKNGREALEKMMGSTFIDKMDKEAAVLLGLE DK >gi|222441849|gb|ACEP01000093.1| GENE 43 49187 - 50098 790 303 aa, chain - ## HITS:1 COG:CAC2949 KEGG:ns NR:ns ## COG: CAC2949 COG0679 # Protein_GI_number: 15896202 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Clostridium acetobutylicum # 4 295 5 299 305 115 30.0 9e-26 MKELINLQLMMFALIITGLILKKVGVVGKEVQKGMTNLVIDLILPCNIIYSFMIKFSSKI 
AQDFVVILVISILIQVFCVILGGVLYNKCSVKRKKCLRYATICSNAGFLGNPIAQGVFGM MGLTLASIYLIPQRIMMWSEGVSVFTEAPTKKEMLKKIFTHPCIIACEIGIVLMLTGWRP PHFLEETITSVSNCNTAMSMLVIGMILADADPKTIIDKDVLFYTFIRLLIIPFCVFIPCW LLHIDSLVMGVSVILAAMPAGATTTILASKYDCEEEFSVKLVVFSTAASLITTPLWNMIL LKF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:54:22 2011 Seq name: gi|222441848|gb|ACEP01000094.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont352.1, whole genome shotgun sequence Length of sequence - 42747 bp Number of predicted genes - 48, with homology - 46 Number of transcription units - 27, operones - 14 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2 - 61 2.4 1 1 Tu 1 . + CDS 82 - 444 374 ## gi|225027839|ref|ZP_03717031.1| hypothetical protein EUBHAL_02099 + Prom 484 - 543 3.8 2 2 Tu 1 . + CDS 644 - 1828 1188 ## COG0116 Predicted N6-adenine-specific DNA methylase + Prom 1872 - 1931 3.8 3 3 Op 1 7/0.000 + CDS 1953 - 2537 382 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 4 3 Op 2 . + CDS 2542 - 3063 636 ## COG0622 Predicted phosphoesterase + Term 3107 - 3167 5.8 + Prom 3082 - 3141 3.7 5 4 Op 1 . + CDS 3179 - 4807 1632 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 6 4 Op 2 . + CDS 4839 - 5210 372 ## CDR20291_1765 hypothetical protein + Term 5211 - 5256 7.1 + Prom 5357 - 5416 6.2 7 5 Tu 1 . + CDS 5438 - 6712 788 ## COG1106 Predicted ATPases + Term 6939 - 6997 4.9 + Prom 6924 - 6983 12.6 8 6 Op 1 . + CDS 7058 - 7915 303 ## PC1_2981 hypothetical protein 9 6 Op 2 . + CDS 7916 - 8140 138 ## Thit_1410 protein of unknown function DUF2188 + Term 8156 - 8216 17.0 - Term 8148 - 8200 11.2 10 7 Op 1 . - CDS 8233 - 9942 1377 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 11 7 Op 2 . - CDS 10079 - 10300 163 ## gi|257414192|ref|ZP_04745521.2| conserved hypothetical protein - Prom 10358 - 10417 3.1 12 8 Op 1 . 
- CDS 10471 - 11328 452 ## COG1484 DNA replication protein 13 8 Op 2 . - CDS 11325 - 12041 259 ## EUBREC_3581 hypothetical protein 14 8 Op 3 . - CDS 12066 - 12449 359 ## Rumal_2309 hypothetical protein 15 8 Op 4 . - CDS 12453 - 13370 611 ## Rumal_1095 hypothetical protein + Prom 13511 - 13570 4.5 16 9 Op 1 40/0.000 + CDS 13591 - 14265 373 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 17 9 Op 2 . + CDS 14253 - 15431 671 ## COG0642 Signal transduction histidine kinase + Term 15445 - 15489 4.1 + Prom 15462 - 15521 4.7 18 10 Op 1 . + CDS 15548 - 16618 270 ## Cphy_1407 hypothetical protein 19 10 Op 2 . + CDS 16615 - 17271 295 ## gi|253579583|ref|ZP_04856852.1| conserved hypothetical protein 20 10 Op 3 . + CDS 17275 - 18147 623 ## COG1131 ABC-type multidrug transport system, ATPase component + Term 18151 - 18189 8.3 - Term 18139 - 18176 8.1 21 11 Tu 1 . - CDS 18185 - 20431 1179 ## Rumal_1099 hypothetical protein - Prom 20529 - 20588 9.3 + Prom 20596 - 20655 6.8 22 12 Tu 1 . + CDS 20682 - 20816 108 ## 23 13 Op 1 . - CDS 20864 - 21472 346 ## Clole_1882 hypothetical protein 24 13 Op 2 . - CDS 21465 - 21650 306 ## gi|154506194|ref|ZP_02042932.1| hypothetical protein RUMGNA_03736 - Prom 21866 - 21925 7.4 + Prom 21893 - 21952 6.2 25 14 Tu 1 . + CDS 21988 - 23673 1322 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 23680 - 23739 4.3 26 15 Tu 1 . + CDS 23843 - 24112 63 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Prom 24340 - 24399 5.5 27 16 Tu 1 . 
+ CDS 24423 - 25106 568 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 25122 - 25175 -0.7 + Prom 25418 - 25477 2.4 28 17 Op 1 4/0.000 + CDS 25529 - 26011 286 ## COG0642 Signal transduction histidine kinase + Term 26063 - 26102 4.0 + Prom 26028 - 26087 2.6 29 17 Op 2 36/0.000 + CDS 26110 - 26790 342 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 30 17 Op 3 . + CDS 26804 - 29353 1257 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 29368 - 29412 10.0 - Term 29356 - 29400 1.6 31 18 Tu 1 . - CDS 29413 - 29979 347 ## gi|225027870|ref|ZP_03717062.1| hypothetical protein EUBHAL_02130 + Prom 30131 - 30190 13.7 32 19 Op 1 . + CDS 30215 - 30754 328 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 33 19 Op 2 . + CDS 30741 - 31520 321 ## EUBREC_3106 hypothetical protein 34 19 Op 3 . + CDS 31504 - 32388 617 ## COG1131 ABC-type multidrug transport system, ATPase component 35 19 Op 4 . + CDS 32381 - 33571 838 ## EUBREC_3104 hypothetical protein + Prom 33647 - 33706 6.8 36 20 Tu 1 . + CDS 33733 - 33849 86 ## + Term 34067 - 34105 8.3 + Prom 33855 - 33914 6.1 37 21 Tu 1 . + CDS 34116 - 35075 461 ## COG4823 Abortive infection bacteriophage resistance protein - Term 35380 - 35424 8.1 38 22 Op 1 . - CDS 35425 - 35553 57 ## gi|291549684|emb|CBL25946.1| Archaeal ATPase - Prom 35578 - 35637 2.5 39 22 Op 2 . - CDS 35667 - 35834 155 ## gi|291540163|emb|CBL13274.1| Bacterial mobilisation protein (MobC) - Prom 35880 - 35939 3.3 - Term 35856 - 35901 5.9 40 23 Tu 1 . - CDS 36025 - 36462 340 ## COG1959 Predicted transcriptional regulator - Prom 36531 - 36590 15.7 + Prom 36502 - 36561 10.0 41 24 Op 1 . + CDS 36617 - 36793 157 ## gi|153811087|ref|ZP_01963755.1| hypothetical protein RUMOBE_01478 42 24 Op 2 . + CDS 36793 - 37641 766 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 43 24 Op 3 . 
+ CDS 37690 - 38667 675 ## COG0846 NAD-dependent protein deacetylases, SIR2 family + Term 38717 - 38778 12.4 - Term 39009 - 39058 13.2 44 25 Tu 1 . - CDS 39113 - 39658 388 ## gi|225027885|ref|ZP_03717077.1| hypothetical protein EUBHAL_02145 - Prom 39685 - 39744 7.3 45 26 Op 1 . + CDS 39927 - 40364 486 ## EUBELI_20398 hypothetical protein 46 26 Op 2 . + CDS 40368 - 41711 1000 ## COG0534 Na+-driven multidrug efflux pump + Prom 41766 - 41825 5.6 47 27 Op 1 7/0.000 + CDS 41855 - 42268 223 ## COG1943 Transposase and inactivated derivatives 48 27 Op 2 . + CDS 42273 - 42747 106 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|222441848|gb|ACEP01000094.1| GENE 1 82 - 444 374 120 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027839|ref|ZP_03717031.1| ## NR: gi|225027839|ref|ZP_03717031.1| hypothetical protein EUBHAL_02099 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02099 [Eubacterium hallii DSM 3353] # 1 120 28 147 147 235 100.0 8e-61 MFDLISISEAIKRSYDSEGLLVDVRSEEVFKKGHLPMAVNLPFEEIMNEKFDIEAIEKDF NIEKEQPVFLYCDTGSTSMLAAKKLDSVGIHAYSVVGGIMFYKGYLEKEKNDLWTMVRTS >gi|222441848|gb|ACEP01000094.1| GENE 2 644 - 1828 1188 394 aa, chain + ## HITS:1 COG:BH1771 KEGG:ns NR:ns ## COG: BH1771 COG0116 # Protein_GI_number: 15614334 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Bacillus halodurans # 9 385 3 378 385 379 51.0 1e-105 MNGGKMSNKYEFIAPCHFGLEAVCKREIADLGYEIVKVEDGKVTFAGDMDALVRANIFLR TTQRILLKVAEFKALTFEELFQNVKKVPWEEYFPSDARFWVTKATSVKSKLFSPSDIQSI VKKAMVERMKEHYHINWFEEDGADYPVRVIIYKDIVTIGLDTSGESLHKRGYRRMVSKAP IEETLAAALIKLTPWNKSRILVDPFCGSGTFPIEAAMMGANIAPGMHREFQAQNWDNIVS PKLFSRGFEEAESLVDTSVEMDIQGYDIDPQIVKAARENAKRAGVDGLIHFQQRPVHHLS HPKKYGFVITNPPYGERLEEKEDLPALYRDMGKAFAGLDGWSEYVITAYEDAERYIGKKA AKNRKIYNGMMKTYFYMFPGPKPPRRKKLVQPEE >gi|222441848|gb|ACEP01000094.1| GENE 3 1953 - 2537 382 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella 
fastidiosa Dixon] # 1 194 1 195 200 151 46 5e-36 MKKKLIFATGNKGKMKEIREILGDLDYEILSMKEAGVDVDIVEDGTTFEENAIIKAKTVM EATGSLVLADDSGLEVDYLNKEPGVYSARYMGENTSYRIKNQIILDRLHGVPDIVRSARF VCVIAAAFPDGRVETRRATIEGRIAQEPAGENGFGYDPIFYLPEKGKTTAQLSAEEKNEI SHRGKALRQIKEIL >gi|222441848|gb|ACEP01000094.1| GENE 4 2542 - 3063 636 173 aa, chain + ## HITS:1 COG:PH1746 KEGG:ns NR:ns ## COG: PH1746 COG0622 # Protein_GI_number: 14591503 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Pyrococcus horikoshii # 34 159 32 157 163 86 40.0 3e-17 MSEKKMRILVISDSHGRNDDVAGVIEQVGPIDMLIHCGDVERGDDYIRSLVDCPVHMVSG NNDYNLDLPAQDIFNIGDYKVLVVHGHTFCVYRGVERLKQYALQNHIDIVMFGHTHKPYI EIDEDVTILNPGSVSYPRQPDHMPTFLIMEIDDEGEAHYGHGYYKSKFTELKI >gi|222441848|gb|ACEP01000094.1| GENE 5 3179 - 4807 1632 542 aa, chain + ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 1 450 4 454 456 424 48.0 1e-118 MKKGDIFEGKVIRIEFPNKGIIDIEGQKVIVKNALEGQIVRFSINKKKRDKIEGRLLEVI EPSPIEQPAACKHFGICGGCRYQNLSYEQQLDLKKRQVEELIEKNGLSFDIENIYGSPIT EGYRNKMEFTFGDEEKDGPLALGMHKKNSFYDIVTLDDCRIVDPDFNVLLQAILKYFKEK GETYFHKIRHEGFLRHLVMRRSVKTGDILINLVTTTQSRLDESEFVNMILSQKIDGKVVG ILHTLNDNLADVVQSDETKTLYGQDYFYEYLYNMRFKISPFSFFQTNTLGAEVLYDKVRE YVGETKDKLIYDLYTGTGTIAQMLASVASKVVGVEIVEEAVEAAKKNAVDNHLDNCEFIA GDVLKVVDNLTKKPDILVLDPPRDGIHPKALTKIINFNVDEMVYVSCKPTSLMRDLLVFR EAGYEVKRACLVDMFPGTVHVETVVLLSQQKPDDTIEIDLDLDELDATSAELKATYQEIK DYVLKEFGLKVSSLYISQVKRKCGIEVGENYNLSKSENARVPQCPKEKEDAIKAALKYYA MI >gi|222441848|gb|ACEP01000094.1| GENE 6 4839 - 5210 372 123 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1765 NR:ns ## KEGG: CDR20291_1765 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 9 122 12 124 125 91 39.0 9e-18 
MKSTFEKMGGTYTLGADGIYYPNLVSTDEEPHYGKYGMMRKTYLKEHRPAMYSLYMLEDR LTEHLNTVDDEAQERMDILVSQMMEKQGITEELKARDQMEWVRAVNNVRNSAEEIVRHEI IYK >gi|222441848|gb|ACEP01000094.1| GENE 7 5438 - 6712 788 424 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 46 424 44 416 420 109 26.0 1e-23 MFVSFSVGNTGPFKEITGITTLAENLKKEFLEENTFEVSGQNYNKISYIYGANGSGKTNY LAALTKMQKMIIMSTVLGANNNKLLEVPAIKKELAAPIETFKFDIDCKSKETYFEIQVII EEILYTYSFAIQDGKIQKELLTKKKKRTEVLIKRTSPKYEDIVLRSGLSSFKNMVSVVRE DALCLAMAAMLNNPLASMILNEIMNYRVINMASVGNAPDFDEENTNEEAIERYLKYLKIA DPTLTNLKVDLESKMDKHVLSEDDLENKELVIKNIHVSVQSLHATYKDHKQVGEIELPFL KYESNGTIRMLRVLPAIFEALDAGSTLFIDEIENGLHPNLVKLLAGLFNSNESNPHHAQL ICTTHDTLLLDGVRRDQVWFTDKNQYGETSMCRLSDYPNVRSNDNIAAKYLQGVFGAIPN TQNL >gi|222441848|gb|ACEP01000094.1| GENE 8 7058 - 7915 303 285 aa, chain + ## HITS:1 COG:no KEGG:PC1_2981 NR:ns ## KEGG: PC1_2981 # Name: not_defined # Def: hypothetical protein # Organism: P.carotovorum # Pathway: not_defined # 17 278 38 298 303 206 41.0 1e-51 MMVLRDAARFKGILVKLGFSEDLIEGERVLPSMLNPTLKRNAEPFYIKDKTKPKEQYTQT LWWTRHEWAGRGETREVTDFVTIPRERYARIKFEPYSVELFLKYDEQGQLMVMTDFISYC HDNEKLLINTINIFLTNFEECEILTENFENVMPTRIIKLNWEVLPSGDYPWKRMQDDLQK VSAKSSKTAKKLLIDKCEFINSFQPDFRAYGKSGFHGYVIFGFMHRNIYVLESVYPNNAT YVFGKNWEELSKLTKAEILKENLQDVRIIHNNNWQQEIRDLLEVA >gi|222441848|gb|ACEP01000094.1| GENE 9 7916 - 8140 138 74 aa, chain + ## HITS:1 COG:no KEGG:Thit_1410 NR:ns ## KEGG: Thit_1410 # Name: not_defined # Def: protein of unknown function DUF2188 # Organism: T.italicus # Pathway: not_defined # 1 74 1 74 74 94 68.0 1e-18 MGKNQHVTPHKEGGWQVKGENNTRATVRTDTQAKAIQRAREIARNQESELFIHGKNGRIR ERDSYGNDPFPPRG >gi|222441848|gb|ACEP01000094.1| GENE 10 8233 - 9942 1377 569 aa, chain - ## HITS:1 COG:AGpT237 KEGG:ns NR:ns ## COG: AGpT237 COG0507 # Protein_GI_number: 16119945 # Func_class: L Replication, recombination and repair # Function: 
ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 303 18 283 1117 143 31.0 1e-33 MAIYHMQAKVVSRGSGRSAVAASAYMSCSRMYNDYDGIQHDYTRKHGLIYQEVMLPPIAP SKWIDREQLWNAVEETEKTKDSRLAREFVVALPVELDKDSNISLLQDFIKKNFVDMGMCA DFAIHDTDGHNPHAHILLTVRPLNENGTWQYKTEKEYLCIKDGEEKGFTASEFKTAQKQG WEKQYGYKVGKKKEYLTSSAAQEKGYERIDKHPKSSRYGRQNPISEQWNSDEQLCIWRAN WADAVNKMLARNQINATIDHRSFADQGITEQPTIHEGYIAQNMEKKGMIADRCEINRQIR ADNKMLRELKAKVAKLAEAVEKSIPIIAETLEAIRNHMIFTQYHLLHNEMQKKVIHDWMN HFNPILNKYNTVKKKLKAKVTERKELNVQKDKTSILNPIQHIKLNQQLTTITEEIEELKS RKEQLIFQAECSTDKDMTNLSKKYDQMKNNLDILDSQGISLKKQLEKDAAAFREEKFRPE PEQYTELLDTRIQIRPDFRDKLIEQLKGTFGKYYDYHRRDIAADEVDYLNVEDPDVFSHR AWELEYQRKQEMRRNQPIRSKKRSHDMEL >gi|222441848|gb|ACEP01000094.1| GENE 11 10079 - 10300 163 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|257414192|ref|ZP_04745521.2| ## NR: gi|257414192|ref|ZP_04745521.2| conserved hypothetical protein [Roseburia intestinalis L1-82] conserved hypothetical protein [Roseburia intestinalis L1-82] # 1 73 50 122 122 115 94.0 1e-24 MADKKIMTPEEKALLQARHRQEEAEARNRKKERDARTHRLVQEGAILESIVPHIKEMDLD SLKRELIIRLRGM >gi|222441848|gb|ACEP01000094.1| GENE 12 10471 - 11328 452 285 aa, chain - ## HITS:1 COG:CAC1933 KEGG:ns NR:ns ## COG: CAC1933 COG1484 # Protein_GI_number: 15895206 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 32 276 27 281 282 130 32.0 3e-30 MTKFIEKPTAITPNRELQSDEYFNETDHLIYCSKCNTPRQCRHELQGKVLIPSIRCKCQQ EIFEQEEAQRKLHEKQMEIEHLKTSGLQDKALYDYTFARDNGINPEIKLAHNYVSNWEEM RANASGLLIWGDVGTGKSFFAGCIANALLEKGVPVLMTNFSRILNTLTGMHFEDRNQFIN SLNRYSLLIIDDLGIERNSDFALEQVFNVIDSRYRSKKPLIITTNLTLSELNNAADIAHK RIYDRILERCIPVRINNRNIRQDNATANLKQAKRILLNNHAPNKN >gi|222441848|gb|ACEP01000094.1| GENE 13 11325 - 12041 259 238 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3581 NR:ns ## KEGG: EUBREC_3581 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: 
not_defined # 5 238 1 248 248 166 41.0 9e-40 MTDFLTADTNLPSYMMFPRFLLDMEINETAKMLYIILLDRARLSQKNEGWSDTDGHVFIY FTIEALAEVLHKSQMTVKTALAVLEKQELIFRKRQGPGQPNRIYVKLPKETIHYTDRFLS LRQTKNCPIDRQDSFPDTDRKLSGNKKEIKKNNLAIRGSKEPLSPYGKFQNVFLSEKELE DIRQTIPDWKDYIERLSGYMASTGKQYQNHAATIISWARQDHPASRQRNYESEEYETL >gi|222441848|gb|ACEP01000094.1| GENE 14 12066 - 12449 359 127 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2309 NR:ns ## KEGG: Rumal_2309 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 3 126 2 127 128 115 48.0 4e-25 MQNNIRNTNLRFNLDKEQQRRAWEYLQTMDRQDFKSYSQVISLALVDYFDRYYRTQADPY LETREREELFVKQIVDAVENSLKQALPLFLSGLTAGMAQREPQIRASFPAPENSQPDSDV DWDFLGE >gi|222441848|gb|ACEP01000094.1| GENE 15 12453 - 13370 611 305 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1095 NR:ns ## KEGG: Rumal_1095 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 305 1 306 306 431 65.0 1e-119 MRELRNTKIIAVDHGYGNMKTANTVTPTGIKAYETEPIFTGNILEYNGIYYRIGEGHKEF IPDKAMDEEYYLLTLMAIARELNVFSIREADVHLAAGLPLTWIRNQREAFRSYLLQNPEV HYRFNGKEYHLRFVGCNLYPQGYPAIVNHLGNFKGTNLLADIGNGTMNILYINNKKAQES RCWTEKLGVNQCMIAAKNAVLDKFGVKIEESTVEQILRFGTADISAPYMDCISSIARQYV AELFSTLRKYEYNPDLMRLYVVGGGGCLIRNFGTYDKLRVTIIDDICATAKGYESLAYMS LKRRG >gi|222441848|gb|ACEP01000094.1| GENE 16 13591 - 14265 373 224 aa, chain + ## HITS:1 COG:Cgl2904 KEGG:ns NR:ns ## COG: Cgl2904 COG0745 # Protein_GI_number: 19554154 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Corynebacterium glutamicum # 4 221 14 236 240 150 36.0 2e-36 MKYKILVVDDDKELVKMLCSYFNMKQYETITATDGMEALNKIKMKPDIILLDINMPRMDG IEVCRLIRSKVLCPILFLTARVDEDDKINGLLSGGDDYITKPFSLRELEARIVTNIKREE RHQQKTEYRFMDEMLIDYSEKIVAIAGHRMEFTKIEYQIIEFLSMHPGQVFDKERIYAQV CGYDAEGDSRTITELVRRIRKKIADYSEKEYIETVWGIGYRWKK >gi|222441848|gb|ACEP01000094.1| GENE 17 14253 - 15431 671 392 aa, chain + ## HITS:1 COG:CC1181 KEGG:ns NR:ns ## 
COG: CC1181 COG0642 # Protein_GI_number: 16125433 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 102 382 206 466 475 95 26.0 2e-19 MEKIRNLSLKKTILLYFVISLTAAFLLSGFTVHFAGNMQNKIWEKYIDYADYTDVFQQYG KKYEIEISRPNQSQMNRLDYHLSEMCDFMETYSVLIFSIVGSVVAVFFFYKNKLKTPLQE LKDASQMIADNELDFHVSYENKDEMGTLCKEFEMMRSDLADNNRKMWRMIDDEKALRNAI AHDIRSPLSILRGYQEMLLEFVSAESIKTEDVIDILQTGMYQIDRIEHFTENMRKMSHLE QRELQCSEIELSELAKKIEAEAAMLSKKESKLCKVERVQEQNIVKVDEELVMEVTDNLLE NAVRYAQKSIALQIKKKDGFLIISVEDDGIGFVDTEEKVTEPFYHKNPQDDLKHFGLGMY ISRIFCEKHGGNLKIYNARQGGAHVEALFKAE >gi|222441848|gb|ACEP01000094.1| GENE 18 15548 - 16618 270 356 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1407 NR:ns ## KEGG: Cphy_1407 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 164 1 164 164 78 30.0 6e-13 MKTIIKRSILDYFKNPVLWIGLIIIVASMYQCLSSYLQIHYIKQNEQITQNDVALEDADV MDGYIPTSDDKERRREWEDTIKETLMDTSKNGFGFSRQEADHVMKEIQNMDVKTASEFLE SQYGYYNAIYAYEDLEIHKGTAEEINHYIERKLSEHSFSWYFAKKFTDFAGLHMAFFATV LLSFLFIQDTRKSTYELLHTKPVTAVQYICGKVISGFISMLGVLVILNVIFFMLCLKTSL ESGFPVTPIDFCVNSLIYIIPNLLMICCVYTITAVIFKNPLPAAPILFLHIIYSNMLTMK NDIYYMRPFSIMVRFPGRFFETHVAKMSNINQIILVISSVILVCISVTIWKRRRVH >gi|222441848|gb|ACEP01000094.1| GENE 19 16615 - 17271 295 218 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579583|ref|ZP_04856852.1| ## NR: gi|253579583|ref|ZP_04856852.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] conserved hypothetical protein [Ruminococcus sp. 
5_1_39BFAA] # 1 218 1 218 218 386 100.0 1e-106 MKTELKNCLSLYKIFYSCAFILILCAIHPIIYYEEIGSAIQSPIAFLTIIFCSDTYLMEV KSKRADVFHLYDQKKQLKVISQRVGVQILYLLILSCVGYVLFFWQKPGSVNEGISGIQIF LLYFIAMFGTIWLWSICSVILCTLLRNMWAGIGCLFGIVIGLISKAGSSFFGNLGLFSFS FCEPTQLMSESWIYGTLVSFIAGLFLFAVLPMTLKKRG >gi|222441848|gb|ACEP01000094.1| GENE 20 17275 - 18147 623 290 aa, chain + ## HITS:1 COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 5 271 4 266 293 187 37.0 2e-47 MSIRINDLTVRFKNGVVAVNKASLEIPKGIYGLLGENGAGKTTLMRVLTTVLKQTEGMVS LDGILYNEGNYEKIQKKIGYLPQEIDLYPNLTVKECLVYMGGLSGVARNDLEQRITYYLE KTSLTEHQNKKMRQLSGGMKRRVGLIQALLNNPDFLIIDEPTTGLDPEERIRIRNLLVDF SKDRTVLFSTHVVEDLAATCTQLAIMKKGSFLYSGSVSGLLENAKGCVWNCTVQNAEEAR LLEANYSISSKQYVENGIHMKVLSKGKPNENCVLDNDITLEDAYIYLTNS >gi|222441848|gb|ACEP01000094.1| GENE 21 18185 - 20431 1179 748 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1099 NR:ns ## KEGG: Rumal_1099 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 1 715 23 735 771 595 44.0 1e-168 MSQPDFLYELFEDFMDDPANQDFSMDNGLVCRWLTGQAKISPKISAYYSKPSNQKKLAET IHQNLLPLMSDCNMAMQDIYTLFIQDDTISDAKKKNLASLYKPASSRLLFLAKLISFGME RQFIKRDTKNQKLIAGGALSPIVLDYIMDSEVPKPCRHFIGRDKELEELYTMLEENRHVF LCGIAGIGKSELAKAYAKHYKKHYTNILYVEYTGDLHQDITDMDFIDDPPEISEQERFQR HNRFLRSLKSDTLLIIDNFNVTATQDSFLSVVLKYRCQILFTTRSNLNEYCTFQLKEIKD INILFQLTSAFYSEADKYRSTVEKIIETVHYHTFAVELAAKLLENGISTPGQLLAKLQEE RASLDNEDKIKIIKDGQSSKATYYSHIHTLFSLYALSRKQQDIMCNLCFLPYTGISARIF AKWLELPTLNEINDLIETGFVQTTTRHTISLHPMIKEIALSETKPSVSSCHILLDSLQKI CLMHGIEVDYYKKLFQTAGNIIELIEKDDIPKYLLFLENVFPYMDNYNYQKGMKEIIQEL KNFLKPKDIGTDSDRALLLDFQATLEIKPEKAIKLEKDALAQIENITADNARLVSNLHAN LGGLYRMNGHPDLAREHMEKSISLLDQFNLLHINDSIPQIANYAMFLTEQQEPERGISEL QKLSGIIKEYHSDDCLDYAKVQETLGTIYLMTANLPQAKTHFKRAFKIYEKIWADEPEMI EAKYQEIQELYPQIGFCIGKNLSGLLTK >gi|222441848|gb|ACEP01000094.1| GENE 22 20682 - 20816 108 44 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MQQNIEFERCIDFLVRMIDKYGAEVLRELEEEKQNKEEKEAAVS >gi|222441848|gb|ACEP01000094.1| GENE 23 20864 - 21472 346 202 aa, chain - ## HITS:1 COG:no KEGG:Clole_1882 NR:ns ## KEGG: Clole_1882 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 16 183 10 179 202 123 41.0 4e-27 MSDKYFTVNIRAYLDKDEPTYIGEESLYDLLSDFSCPKNPDVEYFLLHNAIEFTKKDQSI TYLVFDAEDASLVGYFSLTIKPISVRASNISKTMAKKLSRVSILDEETQSYTTAAYLIAQ LGKNYSLPKEKRIPGNILLGFALETISSLKYSVGGVMEFLECEDNEFLLSFYTQNHFKPF DTRITASQNNEPHTLHQLLKFI >gi|222441848|gb|ACEP01000094.1| GENE 24 21465 - 21650 306 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154506194|ref|ZP_02042932.1| ## NR: gi|154506194|ref|ZP_02042932.1| hypothetical protein RUMGNA_03736 [Ruminococcus gnavus ATCC 29149] hypothetical protein RUMGNA_03736 [Ruminococcus gnavus ATCC 29149] # 1 61 54 114 114 114 98.0 2e-24 MATSSITHNFVVSNPNSVKRFVAAIDEADRDRTPKQTLPGRQLTNPQEILALMSKRKKKH V >gi|222441848|gb|ACEP01000094.1| GENE 25 21988 - 23673 1322 561 aa, chain + ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 1 560 1 559 559 536 51.0 1e-152 MKNKKIKCDIYTRVSTTMQVDGYSLDAQKEKLKRYAEFQNMEIVNEYSDEGKSGKSVEGR PEFQRMLDNIENGTDEVQFVLVFKLSRFGRNAADVLNSLQRMQDFGVNLICVEDGIDSSK DSGKLMISVLSAVAEIERENILVQTMEGRKQKAREGKWNGGFAPYGYELVNGELQIAEDE AEIIRLIYDKFIHTNMGISAIAAWLNQHGYKKKKRQNNTLDAFAASFIKGVLDNPVYCGK LAYGRRKNEKVSGTRNEYRIVKQENYMLHDGIHEGIVSETDWELAHQKREKTGVKYEKTH SLDHEHILSGILRCPLCGSGMYGNVNRKKKKDGTLYKDYFYYACKHRRLVDGHKCGYRKQ WSEEKINNAVEEVIRKLVKNPKFEEAILNKIGSRIDTEEIEKEIEGLEKQHRQLTGAKAR LGQQMDNLDIMDKFYEKKYQDMETRLYRLYDEIEGVENSIEEVKNRLLNIRQQKISEENV YQFLLYFDKLYDKFTDLEKKEFLNSFVEQVDIYEQEQPDGRFLKHIKFRFPVYFGDRETQ ELCWDNESTVETVVLLQRKNM >gi|222441848|gb|ACEP01000094.1| GENE 26 23843 - 24112 63 89 aa, chain + ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: 
CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 4 85 152 233 233 100 58.0 6e-22 MNTGNTEVTLTNREFEILYLLASSPGRVFSKEQIYDLVWEEPYFGDYNIVMSHIRNIREK IGDNPSKPIYIQTVWGVGYRFNKNISSGL >gi|222441848|gb|ACEP01000094.1| GENE 27 24423 - 25106 568 227 aa, chain + ## HITS:1 COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 4 227 3 226 227 164 41.0 1e-40 MSEKLLIVEDDKKLNDGIRLALKNDSYFFYQCQTLQKAREILKKEDITLVLLDVNLPDGN GIDFVREIRKNSQVPIILLTVNNMEVDIVTGLEAGANDYITKPFSLMVLRARVSVQLRNK EAAAKNSMELDGFEFYFDKMEFFKDGEPIELSKTEQKLLKVLCENRGKVVKREYLIDEVW QGETEFVDAHALTVAVKRLRDKLEDDTQKPEYIKTVYGIGYTWAVNG >gi|222441848|gb|ACEP01000094.1| GENE 28 25529 - 26011 286 160 aa, chain + ## HITS:1 COG:BS_resE_4 KEGG:ns NR:ns ## COG: BS_resE_4 COG0642 # Protein_GI_number: 16079368 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 1 158 108 269 269 105 36.0 4e-23 MESGIIAVHSEDTTIKQMLESIQQQFNVKVREKNITLSLCDTDLHAMCDSKWTVEALGNI VDNAIKYTACGGNVQIKVEQYSFFVKIDIIDDGIGIEKEEIPKIFGRFYRSLSVADQPGV GIGLFLAREIIQAQKGYIKVTSKRGKGSTFSVFLPIAKKE >gi|222441848|gb|ACEP01000094.1| GENE 29 26110 - 26790 342 226 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 220 1 218 245 136 37 2e-31 MNILKAVNLRKIYGQGETEVRALDGINLEVEKGEFVAIVGTSGSGKSTLLHIIGGLDNPT SGQVIVDGQNLSHMTDEELTIFRRRNIGFVFQQYNLVPMLNVWENIVLPVKLDGKKIEKG YVNEIIDTLGIRTKLENLPSALSGGQQQRVAIARALAAKPAILLADEPTGNLDSKTSQDV LGLLKVTSKRFHQTIVMITHNEEIAQMTDRILQIEDGKIVSDSGLV 
>gi|222441848|gb|ACEP01000094.1| GENE 30 26804 - 29353 1257 849 aa, chain + ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 11 847 5 829 832 84 20.0 1e-15 MVKVENKETLRLLTKRFMKMNRARNIIAVIAIMLTSLLFTSLFVGSVSMILSKRATEIKQ FMDSSHAIAQNLSEEDAVRLQKTIEQSDEVERYGTGIFLGAGRDKHFGFSVEVRYADKNM AESFNCLPTTGRLPEKENEVAVSSLVLEALGVTPKIGEEVTLTWEVNPMLKQYKTDTFQI CGFWQGDKAVLGQMVWVSEAYAKENCYPVTQKELKNGIYNGGKEYSVWYKNLWNLEKKTE NISKTAGFTKARTGLEINPAYNLFEEDSFSFTSLIVMVLFVILAGYLIIYNIFNISVKTD IRAYGLLKNVGTTGKQLKKIVRMQAWKLSEVGIPIGLIFGYLAGFCMSPSLTADAQISAQ AGQTTQTVVSANPLIFFAAALLTLLTVYLSCLRACKMVERVSPVEALHLAEGEQSQKKIK KNTSVTWWGMAVQNVFRNWKKGLIVMLSIALSMVVVNCIVMLVQGYDFESYQNIVLASDF QLDQMTDTLSNTNFNGITPEIKEILNKCPESEKTGYVYYSEERHKMEPALLKTWETLAEK NKENWTDYEKQIWEETKADNTVKVHFLGISEAVFDMLEWKGEKCSWDTFKSGDYVIVDYS DKYTEQPVSYYQSGETFKMEYGNGKQKDYGVIGEAMMPYSLDYPYADSVYITVMVPEEEY FTQTENQSAMYATIDAKKGEDKQVKEYIDKNVLKENDMINVSSVLDMKASFRRFVSKYYM IGSFLVIILAFIGIMNFFNTTATSVISRKKELALLEVVGMTKKQISKMLVAEGFLYLGGA FVIAVLLIMFGAKQILVNTLGTAFFFRLHLTIVPCVLMIPILVGIAYVIPKYQFEKMSRE SIVERIRKE >gi|222441848|gb|ACEP01000094.1| GENE 31 29413 - 29979 347 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027870|ref|ZP_03717062.1| ## NR: gi|225027870|ref|ZP_03717062.1| hypothetical protein EUBHAL_02130 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02130 [Eubacterium hallii DSM 3353] # 1 188 1 188 188 345 100.0 1e-93 MHFCLTIDHRSFADQRIAEQPTIHEGYIAQNMEKKGMIADHCEINRQIRADNQMLRELKT QVSKLAQAVKNSIPVIAETMETIRNHMIFTQYHLLHNEMQKEVIHDWMNHFNPILNKYNT VKKQLKGMFGKYYDYHRRDIAANEVDYLNVEDPDVFSHRAWELKYQREQEIRRNQPARTK NKSYDIEL >gi|222441848|gb|ACEP01000094.1| GENE 32 30215 - 30754 328 179 aa, chain + ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma 
subunit, sigma24 homolog # Organism: Bacillus halodurans # 6 171 5 184 187 79 29.0 4e-15 MEYKTDSLLVHLARAGDEDACEKLVKKYYSSIYQYCLLHINDPYEAEDLTQEVFTRFFSN LYRYKEYGKVKNYLYTIAGNTVKNYYKKKKDIPSEELLKSEDCSKNHVEELGVRLTIEQA VRKLPEEIRETAVLYFFQELKQREIAELLHIKLSLVKYRIGRAKELLMKELEVKKDEEL >gi|222441848|gb|ACEP01000094.1| GENE 33 30741 - 31520 321 259 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3106 NR:ns ## KEGG: EUBREC_3106 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 242 1 237 254 207 46.0 3e-52 MKNYKQQIKEYKEEIEEKYVRAETENKTVKACQNILSESVLSKQSHRTSYFEFLYEQTKF IKKRWWILQGCVLMCLWIWLSNYTSDIKDMMRIMGISATIFVVLIIPEIWKNRRNGAIEI EQASYYTLRQICTARMLMFGVVDLVIVMVFLAITYQTTILSLSDLVVNFLLPVNVSCCIC FRVLYSRWEKSEYIAVLSCLVWVGLWMMIVANDVIYQKIVTPVWVAMLILTFVYFVFCVQ KSLVFDEKILEDYTYEIRI >gi|222441848|gb|ACEP01000094.1| GENE 34 31504 - 32388 617 294 aa, chain + ## HITS:1 COG:CC3566 KEGG:ns NR:ns ## COG: CC3566 COG1131 # Protein_GI_number: 16127796 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Caulobacter vibrioides # 3 281 2 282 294 211 38.0 2e-54 MKLEFKNVTKQYGEVNAVNKVNCVMEKGIYGLLGVNGAGKTTLMRMLCTIVQPSEGQILW NGKDIWKLGGEYREVLGYLPQDFGYYPDLTVYDYMMYISSIKGLKIAFARKKVKQLLNQV GMEKFSKRKMKNLSGGMVRRVGIAQAMLNDPQILVLDEPTAGLDPNERIRFRNLISELSE ERLVLLSTHIVSDIEYIANNIMLMKDGELFYAGTAEDLVSSMKEKVWQCNVSRSMIDKYM NEYLVGNIKTTKTGAELRIISIEKPTEDAVAVEKNLEDAFLFYFGEKSEEKQDA >gi|222441848|gb|ACEP01000094.1| GENE 35 32381 - 33571 838 396 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3104 NR:ns ## KEGG: EUBREC_3104 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 394 2 397 401 377 47.0 1e-103 MLKLELKRIFSKKINVFAIGLALILAVIFSGFAVTSNRYVDENGNASTGIMATRKLTDNR RAWKGTLTEDELGKVIEQNKNAMTQSSEENAIYGTTLQPIDDIRGFIISVLTPDAEYDES VLNQITEENIQEFYDTYHKNMEKMAEEYGKTSVQKKYLEKKYNEIKLPVEYESYSSWDTM IMHVETYSIILAIIVGFICAGIFADDFQTKADAVFFSTKYGRTKAVKTKILAGIATTVMI YCMGIILLSVICFGIMGTSGMNTPYQMYQAYSIYIMSYGQYYLLTVVCGFIASMLAAVVS 
MLVAAKMHTISVAVCIPFFLYCLLPFIGRALSGYTTLFNLIPTILTNVQASVKVPLIYQI GNCVFRQIPLVMVMYTVMAIALLPFIYKSFRRYGNK >gi|222441848|gb|ACEP01000094.1| GENE 36 33733 - 33849 86 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQQNIEFERCIDFLVRMIDKYGEEVLRELEEEKEAAVS >gi|222441848|gb|ACEP01000094.1| GENE 37 34116 - 35075 461 319 aa, chain + ## HITS:1 COG:PM1540 KEGG:ns NR:ns ## COG: PM1540 COG4823 # Protein_GI_number: 15603405 # Func_class: V Defense mechanisms # Function: Abortive infection bacteriophage resistance protein # Organism: Pasteurella multocida # 1 233 1 240 309 87 28.0 3e-17 MQTPKKFSSFSDQVSWISDEKGIRIKDREYAEEMLRQIGYFPLMGGYKHLFRISNTKKYK AGTSFEEIVSLYKFDAELRELFFKYLLQIERQMRSLMSYYFTEMYGAEQKQYLDANNYNN TKRNHATIVKLIATLKRATTTTDYTYINYYRKTYGEIPLWVLANVLTFGNLSKMFRVFPQ SLKSKVSKNFEPLNQHQMEQFLSVLTKYRNVCAHGERLFTYRTVDAIADTPLHKKLSLPQ SGNQYEKGKQDLFAVMIAFRYLLPGKDFLEFKRKLIKEIDRVNREVEHISEVELLNKMGF PEKWKSITKYHLKEISSKR >gi|222441848|gb|ACEP01000094.1| GENE 38 35425 - 35553 57 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291549684|emb|CBL25946.1| ## NR: gi|291549684|emb|CBL25946.1| Archaeal ATPase [Ruminococcus torques L2-14] # 8 42 716 750 750 64 91.0 3e-09 MFLTEQQEPEMIKAKYQEIQELYPQVGFFLGQQLSNLLTKQT >gi|222441848|gb|ACEP01000094.1| GENE 39 35667 - 35834 155 55 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291540163|emb|CBL13274.1| ## NR: gi|291540163|emb|CBL13274.1| Bacterial mobilisation protein (MobC) [Roseburia intestinalis XB6B4] # 1 43 1 43 105 69 83.0 7e-11 MANRNRTNPVQFYLSDDEQYILTTKFKASGMKIMSPFYARHILEQTWITITITKV >gi|222441848|gb|ACEP01000094.1| GENE 40 36025 - 36462 340 145 aa, chain - ## HITS:1 COG:lin0458 KEGG:ns NR:ns ## COG: lin0458 COG1959 # Protein_GI_number: 16799534 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Listeria innocua # 1 132 1 133 155 104 41.0 6e-23 MKYSTKLSDTIHILIFIALGDDEQLSSTKIAESIKTNPAYVRQLMATLKNAGIVVNTQGH ANAALAKSADKINMYDIYRAVEGDKPLLHLDTDTNPDCGIGINIQFAIGDFYHEIQNMVN EKMKSITLQDIIDRYYFKIREIKNL >gi|222441848|gb|ACEP01000094.1| GENE 41 
36617 - 36793 157 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153811087|ref|ZP_01963755.1| ## NR: gi|153811087|ref|ZP_01963755.1| hypothetical protein RUMOBE_01478 [Ruminococcus obeum ATCC 29174] hypothetical protein RUMOBE_01478 [Ruminococcus obeum ATCC 29174] # 1 58 232 289 289 114 91.0 2e-24 MQIPFWEMTYKNEKVFYACLNQKKSSAPEHIKDKGIYIAGDLAKTLRDLKENIAGKEM >gi|222441848|gb|ACEP01000094.1| GENE 42 36793 - 37641 766 282 aa, chain + ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 13 263 28 279 315 135 30.0 7e-32 MKMNVYQEISQIIKEADGILIGASNGLSIAEGYNIFADDAWFQENMGDFREKYGLRCVLH GFSVPMKVEEKWAFVSRLVKAKAMQDEPSEIMKNIYAIVKDKEYFVVTSNAEDHFVPAGF EADRVFEMEGKLTQMRCKNRCHDEVYPNQKAVLAMTEEEVNGRVPKELLPKCPKCGGDME VNWGEMSSFTETKNWKEKAARYQEFIQNLHGKKLVILEFGIGWRNQMIKAPLMQLAAVEP QASYITFNKGEIYIPEEIKEKSIGVDGNLTVALKEIRKGRID >gi|222441848|gb|ACEP01000094.1| GENE 43 37690 - 38667 675 325 aa, chain + ## HITS:1 COG:SA0315 KEGG:ns NR:ns ## COG: SA0315 COG0846 # Protein_GI_number: 15926028 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Staphylococcus aureus N315 # 50 312 28 289 315 146 30.0 5e-35 MSCYENLVMKTQMNYSRHYSTYASGGTAVVLSKQKPLSYEEQIQEFVRRVQEAECIVVGG ASGLSAAGGGDFYYSDTPSFREHFGKFADKYGFKGAFSGMMHRFSTRNEHWGYVATFLNT TQNAPIREPYLDLDRILQGKDFHILTTNQDTQFVKIYPEEKVSEIQGDHRFFQCSQCCQD ETWDAVQPVADMIAAMGAGTMVPDELIPRCPYCGAEMFPWVRGYGNFLQGKKYEEEYEKI SKYIQKNKDRKILLIELGVGRMTPMFIQEPFWELTNSLKDAYYISVNSEYQFLPEFIEDK GIAILEDIGTVLKDLRKAKEESAFV >gi|222441848|gb|ACEP01000094.1| GENE 44 39113 - 39658 388 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027885|ref|ZP_03717077.1| ## NR: gi|225027885|ref|ZP_03717077.1| hypothetical protein EUBHAL_02145 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02145 [Eubacterium hallii DSM 3353] # 1 181 1 181 181 334 100.0 2e-90 
MIYTPMNLSLNELTKMATQEVNFDETFFTNIEECIKYNSIGTLNWAIHTLTIIRERIDVG QKVKYKDIDLTPDTYKKLLHEHYGIYVVKGVYPKMSLKHKVYFKLENTENGMDLIYTGQE ENKLFRWIADINENESLVRVLPTNVVYIRNIKLGSLTPFVTEHNSVYVYNEKTGKIEEVF E >gi|222441848|gb|ACEP01000094.1| GENE 45 39927 - 40364 486 145 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20398 NR:ns ## KEGG: EUBELI_20398 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 145 1 145 149 140 53.0 1e-32 MTKDKIKEVMDACYQAKRIRDMLPALPGGVTPSHIHYLDTISKLEKTEGKVKVSDISENL GLPRPSVTKTVKDMEKLGFVEKETTETDGRFVYIKTTRTGRDLVDKYVDEYFGNLSEDLG GITDEDADKMIEVVEKFYIVMSNRK >gi|222441848|gb|ACEP01000094.1| GENE 46 40368 - 41711 1000 447 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 7 429 6 423 452 159 25.0 2e-38 MNAKERSDFTQGSILKKMIPFMMPILGALILQAAYGAVDLLVVGRFGTTAGLSAVSTGSQ ILNLVTFVITQLAMGITVLIARYIGEKHTSQIGELLGGATTVFTTIAVVLFVVMVFFAKP LAVLMQAPQEALILTSVYVRICGGGIFFIVAYNVLSAIFRGLGDSRSPLIFVMVACVVNV VGDLVLVAGFHLDAAGAAIATVLAQAVSVVLALLMLKRRQLPFKITKKDFRVNSQCKRFL TVGIPLALQEFLTQMSFLALCAFINKLGLEASSGYGVACKIVNFAMLIPSALMQSMASFV SQNVGAGKEDRAKKAMFTGMGIGLIVGIIVFASVWFFGDVLTSIFTTDEAVVQKGFEYLK GFAPETILTAILFSMIGYFNGHEKTLWVMIQGLIQTLLVRLPLAYFMSIQPDASLTNIGL AAPVATCFGIILNVIFFLYMTRKMKKA >gi|222441848|gb|ACEP01000094.1| GENE 47 41855 - 42268 223 137 aa, chain + ## HITS:1 COG:alr8071 KEGG:ns NR:ns ## COG: alr8071 COG1943 # Protein_GI_number: 17227445 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 7 134 9 140 140 100 37.0 9e-22 MNLDNNAHSVFLLQYHLVLVVKYRRQVFDDGISSRAKEIFEYIAPNYNITLEEWNHDKDH IHILFRAHPHTEISKFINAYKSASSRLLKKEFPQIRQKLWKEHFWSQSFCLITTGGAPIE VIKKYIESQGQRERKRK >gi|222441848|gb|ACEP01000094.1| GENE 48 42273 - 42747 106 158 aa, chain + ## HITS:1 COG:DR0666 KEGG:ns NR:ns ## COG: DR0666 COG0675 # Protein_GI_number: 15805693 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Deinococcus radiodurans # 3 158 4 161 408 115 41.0 4e-26 MANRAYKFRIYPNDEQKILFAKTFGCVGMVYNHWLDRKIRQYKEDKTNVTYTICAKEMAA MKKTEEYGFLKEADSIALQQSLRHLDTAFKNFFKQPKTGFPRFKSKKRNKNSYSTVCIKG NITLSNGYLRLPKIGQVRLKQHRIISDEYRLKSVTVSQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:56:31 2011 Seq name: gi|222441847|gb|ACEP01000095.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont353.1, whole genome shotgun sequence Length of sequence - 21000 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 10, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 304 - 363 6.6 1 1 Tu 1 . + CDS 414 - 1253 1208 ## COG1464 ABC-type metal ion transport system, periplasmic component/surface antigen + Term 1330 - 1376 -0.9 - Term 1568 - 1634 17.1 2 2 Op 1 . - CDS 1644 - 1862 246 ## COG3546 Mn-containing catalase 3 2 Op 2 . - CDS 1868 - 2239 342 ## Clole_2722 hypothetical protein 4 2 Op 3 . - CDS 2236 - 2430 216 ## gi|225027894|ref|ZP_03717086.1| hypothetical protein EUBHAL_02154 - Prom 2567 - 2626 8.3 + Prom 2619 - 2678 8.4 5 3 Op 1 . + CDS 2698 - 2880 77 ## gi|225027895|ref|ZP_03717087.1| hypothetical protein EUBHAL_02155 6 3 Op 2 . + CDS 2946 - 4079 1533 ## COG0523 Putative GTPases (G3E family) + Term 4088 - 4146 9.6 + Prom 4084 - 4143 2.4 7 4 Tu 1 . + CDS 4165 - 5142 1254 ## Cphy_2882 G3E family GTPase-like protein + Prom 5315 - 5374 7.4 8 5 Op 1 . + CDS 5409 - 6365 907 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 9 5 Op 2 . 
+ CDS 6453 - 8555 1127 ## LAF_1406 hypothetical protein 10 5 Op 3 . + CDS 8603 - 10537 989 ## LAF_1406 hypothetical protein + Term 10590 - 10628 5.1 11 6 Tu 1 . - CDS 10683 - 11603 528 ## Closa_3823 ErfK/YbiS/YcfS/YnhG family protein - Prom 11816 - 11875 9.6 + Prom 11609 - 11668 7.2 12 7 Tu 1 . + CDS 11911 - 13740 2366 ## CPF_1176 glycerol dehydratase reactivation factor, large subunit + Term 13830 - 13881 3.4 + Prom 14139 - 14198 6.8 13 8 Op 1 . + CDS 14252 - 15121 1136 ## COG1307 Uncharacterized protein conserved in bacteria 14 8 Op 2 . + CDS 15180 - 15824 498 ## Shel_24010 GDSL-like lipase/acylhydrolase 15 8 Op 3 . + CDS 15857 - 17296 1210 ## Cphy_2618 Ig domain-containing protein + Prom 17485 - 17544 9.6 16 9 Tu 1 . + CDS 17614 - 18654 1314 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase + Prom 18859 - 18918 7.0 17 10 Tu 1 . + CDS 18960 - 20864 2164 ## COG0296 1,4-alpha-glucan branching enzyme Predicted protein(s) >gi|222441847|gb|ACEP01000095.1| GENE 1 414 - 1253 1208 279 aa, chain + ## HITS:1 COG:CAC0986 KEGG:ns NR:ns ## COG: CAC0986 COG1464 # Protein_GI_number: 15894273 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface antigen # Organism: Clostridium acetobutylicum # 2 276 3 269 272 236 51.0 4e-62 MKKFGAFLLAGVLAIGALTGCGSTDKKAESSTGSTDTQVIKVAASATPHAEILEEAKPLL AKEGYDLQVTVFDDYVQPNEVVDSGDFDANYFQHVPYMEQFNKEKGTKLVDAGDIHYEPF GIYPGTKKSLDEIADGDEIAVPNDTTNEARALLLLQDNGLIKLKDGAGLTATVNDIAENP HNIKIVELEAAQVARVTGETAFVVLNGNYALQAGYSVKKDALAYEASDSEAAKTYVNIIA VKEDNEDSDAIKALVKVLKSDDIKKFIDEKYDGAVIAFE >gi|222441847|gb|ACEP01000095.1| GENE 2 1644 - 1862 246 72 aa, chain - ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 64 1 64 200 81 54.0 3e-16 
MWNYEKRLEYPINIKRTDAAMAKAIISQFGGPDGETGAAMRYLSQRFAMPYRPVMGVLTD VGIEASNSNLPG >gi|222441847|gb|ACEP01000095.1| GENE 3 1868 - 2239 342 123 aa, chain - ## HITS:1 COG:no KEGG:Clole_2722 NR:ns ## KEGG: Clole_2722 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 42 123 5 85 86 69 42.0 3e-11 MMQNMHSSQNCNMNSCQHRSMNCNFNSACNNNRNQNCNCNMNQQNLKCYIDFVSFAALDC AMFLDTHPKNAEGLEYFEYYTNARKQALKEYSSRFSPLTLDTVPKGTDFWAWANEPWPWE MEG >gi|222441847|gb|ACEP01000095.1| GENE 4 2236 - 2430 216 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027894|ref|ZP_03717086.1| ## NR: gi|225027894|ref|ZP_03717086.1| hypothetical protein EUBHAL_02154 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02154 [Eubacterium hallii DSM 3353] # 1 64 1 64 64 126 100.0 5e-28 MMENHSSCSYKQEKEVLKLDSMPLAMAYVPWQKWQNIYKPENALCAGTIFQELDLPFTGR RCQK >gi|222441847|gb|ACEP01000095.1| GENE 5 2698 - 2880 77 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027895|ref|ZP_03717087.1| ## NR: gi|225027895|ref|ZP_03717087.1| hypothetical protein EUBHAL_02155 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02155 [Eubacterium hallii DSM 3353] # 9 60 1 52 52 90 98.0 3e-17 MLVPQNDILKIPKTVRESSEEVIAFGRFFLCRRKCTLLAFFKSGLYNIRLINLVYYFIEK >gi|222441847|gb|ACEP01000095.1| GENE 6 2946 - 4079 1533 377 aa, chain + ## HITS:1 COG:FN0779 KEGG:ns NR:ns ## COG: FN0779 COG0523 # Protein_GI_number: 19704114 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Fusobacterium nucleatum # 3 184 2 179 294 127 34.0 4e-29 MTKIDIISGFLGAGKTTLIKKLIEEALKGQKVVLIENEFGEIGIDGGFLKESGIQINEMN SGCICCSLVGDFNTALKDVLEQYTPDRIIIEPSGVGKLSDVMKAVQKVVDAEANVVLNSH ITVADAQRAKMYLKNFGEFYRNQVEFASAVILSRSQNVKEDKLEKAVELLRSLNDKCPIV TTPWEELSGAQILEVMEGENDFAKELMEAAQVCPECGHHHEHGEECHDHHHDHEEHEHHH HDHDHEEHEHHHEHDHEEHEHHHEGHEHHHDHDCCGHDHHHHHADEVFTSWGKETPKKFT EEGIRAILDTLSKEDSDEYGIILRAKGIVPDENGKWIHFDLVPGEDEVRYGSAEYTGRIC VIGSKLNEDKLAELFGL >gi|222441847|gb|ACEP01000095.1| GENE 7 4165 - 5142 
1254 325 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2882 NR:ns ## KEGG: Cphy_2882 # Name: not_defined # Def: G3E family GTPase-like protein # Organism: C.phytofermentans # Pathway: not_defined # 8 325 1 317 318 317 50.0 5e-85 MLGKKKELEIPVYLFMGFLEAGKTTFIQETLEEDYFNDGERTLLFACEEGMEEYDEELLK RTNTTVVYVEEQEDFNTEFLTSKLLQYYPDRVIIEYNGMWTIDHLVEAMEGTPLMIFQTI VSANAETFDLYMNNMRSLAVEMFKMAELVIINRCTKATPRATYRRSIKAVNRRVQVVFDS MVPGEDMEEEEDELPFDISGDEIHLEDDDYGVWFIDAMERPELYDGKTMVMKTRIFKAMR MPKGTFVPGRHAMTCCADDIQFIGYLCHTNHAKSSTIKSLKNKMWVTLTAEVKVEYNEEY QDKGPVLYAKRIEAAEPPEEELVYF >gi|222441847|gb|ACEP01000095.1| GENE 8 5409 - 6365 907 318 aa, chain + ## HITS:1 COG:CAC2324 KEGG:ns NR:ns ## COG: CAC2324 COG0463 # Protein_GI_number: 15895591 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 3 298 2 298 315 326 55.0 4e-89 MDKIAVLIPCYNESKTIEKVIKDFKEVLPDATIYVYDNNSTDGTDEIARRNGAIVRYEYQ QGKGNVIRRMFRDIDAECYIMTDGDDTYPAVNAPEMVDKVLNRNVDMVVGDRLSSTYFQE NKRPFHNFGNSLVRSSINHLFHTDIKDIMTGYRAFSYEFVKTFPVISRGFEIETEMSIHA ADKNMFVENVVVDYKDRPEGSESKLNTYSDGFKVLRTIARLFRTYQPMKYFGSIAGVLAV LGGGFSIPVFKDYFVTGLVKRFPTLIVCCFVILTAIVSLFSGAILKTITWKNRQDFEMTR HQARNKKEELLKEAKEAE >gi|222441847|gb|ACEP01000095.1| GENE 9 6453 - 8555 1127 700 aa, chain + ## HITS:1 COG:no KEGG:LAF_1406 NR:ns ## KEGG: LAF_1406 # Name: not_defined # Def: hypothetical protein # Organism: L.fermentum # Pathway: not_defined # 191 670 139 586 592 165 27.0 6e-39 MDILIKPKKNHYIIYAIIIAIFSTVGMNISLMNTNADKLLKKFEYSSTWNNFMAVFRTSI SGKNDVRLFYQLVAVGITLMFLVIFSYRFCVREIVSSAVISIIFGFCMWFGTVFSNKESW NYFLRNKYVKILDIWYILAYALAMFGALLIIGKIVAASGNKAEQKAEVKTKAKTKDKTVR NTRKVFFMCAAVLLVCWLPYYIVFWPGFQHTDLPTQMLQYFHVPTRFQGHDITDGVNILY SNDHPYFQTKLVGLCIEFGFKIHNVNVGYSIYTFIQMIAFIVAFSSIIATLYRFEVNHTL LKISLVLYAVIPVFPLYSLLIGGDSFFSVFFLYYMLEILWIFGTKGAVLKNNKFILAMIL EIFLMSASKNQGVYVAAAMFLFCIIYFSKYRARIAVCMVVPIILFQFGYCGAFFKVAKIA SVGKQEALSVCFQQTARYVKYHGDEVTQEEEEAIKKVLNYKDIGNLYDPNLSDPVKKTFK 
TESTSEDLKNYFKVWLSMGLKHPGEYIQSFIANTYQYYYMEFRNKRGLYLKPEIMDFYIR KRPWVQKSEAIQKLIRKLQVHVPENLKSVREKGVLAMDTVRRFPVVSWFTNPGVITWMML IGFFALWTKRRYTSILEFLPVFLIFGVCLLSPKNGNLRYLYPACCMVPALLAAAFGNLRE EQANAIEDTKEVSKRRRAGSRKKHRIARTLPEDKRQRVKV >gi|222441847|gb|ACEP01000095.1| GENE 10 8603 - 10537 989 644 aa, chain + ## HITS:1 COG:no KEGG:LAF_1406 NR:ns ## KEGG: LAF_1406 # Name: not_defined # Def: hypothetical protein # Organism: L.fermentum # Pathway: not_defined # 142 625 99 557 592 151 27.0 9e-35 MKIHIELKKKRYILYAAIIALLGVVGANISILHTNADKITKHIEYKSFLTQLIAVFRVSI TNKEGNGIFYQLMAAGLFILFLIVFSYKYTVREVISAAILSILYGFCMWIGTVFSHRESW DYFLRNKYVMLLNAFYMLGNAIILFGILLLLCRVILRISETDNQTEHSLERVFLTCVLVL FVLWLPYYISMWPATIHGDFLMQVLQLFHYPTMLQHQLTSDGVNVFYSNDHPFLHTQLAG LFIKFGIKINNKALGYGMYTFLQMTAYIGGISAIITTLYKFGANHRILKAALGAYALFPV FPLYAIMVGGDSFFALFFLWFMVGILWIFKTQGDVLKNIKFDIFFAVIVFLMSASKNQGI YIALVTLVFCVICLKKYRIKILVTMFVPIFIFQFAYTGLLFKAARVSTVGKQEALSVCFQ QTARYVKYHGDEVTGEEEAAIKKVLAYKKLAKKYQPALSDPVKGTYKSEVTSTDLKNYFK VWLQMGLKHPDEYFQAFFANTYGYYAPLFNSRGGLYLGLSTVRFYRSNRKWAQEMIPESF CDKVDFKEPKILSPIRERMKFLMGISYKIPIINWLYNPGVITWLILIAFFALWIKRKYFD MAAFLPVFLIVCVCLLSPRNDNLRYIYPACVLIPGMLANLQGDR >gi|222441847|gb|ACEP01000095.1| GENE 11 10683 - 11603 528 306 aa, chain - ## HITS:1 COG:no KEGG:Closa_3823 NR:ns ## KEGG: Closa_3823 # Name: not_defined # Def: ErfK/YbiS/YcfS/YnhG family protein # Organism: C.saccharolyticum # Pathway: not_defined # 62 300 267 479 483 94 31.0 6e-18 MKKFIRQLTIFSLIFIMTFGVGFSSQKGINVQAATTQTTTTNDGKLRKNGVLFTGISDNK YYKRGVFTKYTGWKNWKGNRYYLKKGKAVTGWKYLRDYSGSRTKYKYYFKKNGRLSKDLF KTFGSSYKKKRMKLELNLVTHNITFLLYDGKTNKYDIPAKTVVCSTARDGRSTYVGNHYL SKGTARSWFIYKKSNPWHYYQWGVFVKGTRSWIHSEMYRGTSNKKLIASTYNGLGTNQTT ACIRVQAGNARLIYDIAKTNRYSIPIRIYRSSNKGPFGKITLNDTTGKIPGNQNYDPTDP AFKNKR >gi|222441847|gb|ACEP01000095.1| GENE 12 11911 - 13740 2366 609 aa, chain + ## HITS:1 COG:no KEGG:CPF_1176 NR:ns ## KEGG: CPF_1176 # Name: dhaF # Def: glycerol dehydratase reactivation factor, large subunit # Organism: C.perfringens_ATCC13124 # Pathway: 
not_defined # 4 603 1 606 616 722 66.0 0 MGDMKYVAGVDIGNATTEVALAKIDGQDIEFLTSGIGPTTGIKGTLQNISGVFMSLKNAL EKAGMDYSDLAEIRINEAAPVIGDVAMETISETVITESTVIGHNPSTPGGTGIGVGTSVL VTELSKIREAKDVIVIVPNKVRFAQAAALMNQAKENIHITGAIVQADDGVLLNNRLDKKI PIIDEVAMIEKVPLGMTCAIEVAEQGTVLSTLSNPYGIATVFDLSSEETKRVVPIARSLI GARSGVVIKTPTGDVQERSIKAGTINIIGLKKEVDVDVDEGADKIMEAISSIQEIDDIQG ETGTNVGNMLNKVRRVMAGLTNKSPQDIKIQDLLAVDTFNPQKVVGGIADEFALENAVGI AAMVKSDRLQMEMIAQVLTEKLRVPVYVGGVEADMAIKGALTTPGTSVPLAIVDMGAGST DASIINREGKVKLIHLAGAGNMVSLLIQSELGLDDFELAEDVKKYTLAKVESLFHIRHEN GTVQFFDKPLDPSIFAKVVLVKENDELVPLDGVESLEKVKAVRMEAKKKVFVTNAIRSLS KVSPTGNPRDIQFVVMVGGSALDFEIPNFVTDALADYEIVAGRGNIRGCEGPRNAVATGL AMACVEQEN >gi|222441847|gb|ACEP01000095.1| GENE 13 14252 - 15121 1136 289 aa, chain + ## HITS:1 COG:FN1927_2 KEGG:ns NR:ns ## COG: FN1927_2 COG1307 # Protein_GI_number: 19705232 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 288 1 283 285 106 27.0 4e-23 MSKIQIITDSASDISKEMEQKYNLDVISYSIVIGDKTYTSRVDFDNEGCYELMEASESLP MTSQITAFQFMELYYDYYQKGCTDLIFILINGKGSATYNNSLMAMDMLIEEHPECEGKIH IYSHDGGSYSAGYGQPAVMAAKMAQEGKSVEEINAYLEESIARRRIYFGIYNLKYAGKSG RIPSAAAFIGDKIGIKPIMKIWDHEITTAGKCRGEKKVVSKICDMAIADMEPGSEYQIEY GNDEKIKEEMAAKMTEKLGYGPTGYFQIGAEVAANAGPRIVGVAFDAKK >gi|222441847|gb|ACEP01000095.1| GENE 14 15180 - 15824 498 214 aa, chain + ## HITS:1 COG:no KEGG:Shel_24010 NR:ns ## KEGG: Shel_24010 # Name: not_defined # Def: GDSL-like lipase/acylhydrolase # Organism: S.heliotrinireducens # Pathway: not_defined # 6 209 3 229 232 140 37.0 4e-32 MEKRRISILGDSISTFAGYTPEEAVFYDSYVQWETGVKSVEDTWWMQVIKALDGELGTNH SLAGSMVSGNLSTSAMSYERIEKLGTNGIPDIILISAGCNDWGFCVLPEEFEEAYRTMLH RIKEQYPHADIYCATLPEGKQPQGETFFNIDSTISKRVYSDIIRENAQKAQVYIADLANV QEEYETVDGVHPNKAGMHTLARMWIREMKKFICK >gi|222441847|gb|ACEP01000095.1| GENE 15 15857 - 17296 1210 479 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2618 NR:ns ## KEGG: Cphy_2618 # Name: not_defined # Def: Ig domain-containing protein # Organism: 
C.phytofermentans # Pathway: not_defined # 31 299 30 307 399 81 28.0 8e-14 MKKKNIVFVLCLIFALGFLFMPQEGRNAEAASRTRLSSTSLKVVPGKTEKLKIYGRRGRK VVWTSSRPRVVSVENGKLTALKGGTSTITARVGSQKLHCKVRVVGLNTTKITLAKGDKFQ LKVKNGYRTTWTSKNKKIAKVSKNGVIKAKKSGVTTIVCRTNGRKLKCKVYVASLKYSTL RLKAESSYHLSIKHAGDVLAWTSSNPSVAEVDSNGNITTLPVSGTSVLTCKSGKAVLSCK LTVVSPDNIITNMSTLPTSSNQDRFTVAVNSYPNVRHYTVYRQSASINASSFKNYMPYHG CAACAAATVLTGFGKNITPKRVTDKNGLEYKAFGKKIWKKNYKKELKDQMPVSLYGINKI LNNNGIQTEYVRRFSDAEACQQIISHLKTGNPVVIEAKKGKWANSYHTMVLLGLTDTGKA IIADSANRTGFGSKQRVKYESVSNLIKHMFACSVSGAKSANCYFGSSSGGGYILVNPNP >gi|222441847|gb|ACEP01000095.1| GENE 16 17614 - 18654 1314 346 aa, chain + ## HITS:1 COG:CAC1479 KEGG:ns NR:ns ## COG: CAC1479 COG0115 # Protein_GI_number: 15894758 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Clostridium acetobutylicum # 1 340 1 340 341 585 78.0 1e-167 MEKKNINWGELGFSYQQTEKRFVANFKDGAWDDGTLTEDATITLNECAGVFQYAQTCFEG LKAYTTENGDIVCFRPDLNAARMADSCRRLEMPVFPEDKFVEAVVETVKANAGYVAPYGS GATLYIRPYMFGSNPVIGVKPADEYQFRIFVTPVGPYFKGGAKPITIRVSDFDRAAPNGT GHIKAGLNYAMSLHAIVDAHNQGYDENMYLDPATRTKVEETGGANFIFITKDGKFVTPKS DSILPSITRRSLMYIAEHYLGMEVEHREVRFDEVKDFAECGLCGTAAVISPVGKIVDHGK EICFPSGMEEMGPVTKKLYDTLTGIQMGHIEAPEGWIKTICKHGEC >gi|222441847|gb|ACEP01000095.1| GENE 17 18960 - 20864 2164 634 aa, chain + ## HITS:1 COG:sll0158 KEGG:ns NR:ns ## COG: sll0158 COG0296 # Protein_GI_number: 16331275 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Synechocystis # 7 598 109 716 770 628 50.0 1e-179 MPDTRFITEFDQYLFGQGTHYDLYNKLGAHPMTVDEEEGVYFAVWAPNAAAVSIVGDFNE WDENATPMERLEPLGIYQIFLTGIKVGDIYKYCVTAQDGKKTLKADPYGFQAELRPNNAS VVADISDFKWHDSRWMKKREKFDDKKNPMFVYEVHPGSWKKHEQTEEDEDGFYNYREIAH ELAAYVKDMGYTHVELMGIAEHPFDGSWGYQVTNYFAPTSRHGSPEDFQYFMDYMHEHNI GVILDWVPAHFPRDAFGLAEFDGTCLYEYADPRKGEHPDWGTKVFDYGKTEVQNFLICNA 
LFWLEHYHVDGLRVDAVASMLYLDYGREDGQWVPNIYGGNENLEAIEFFKHLNTIVKKRN PGIVMIAEESTAWPKVTDKAEYGGLDFSLKWNMGWMHDFLEYMKLDPYFRKYNHTKMNFA MVYAYSENYMLVLSHDEVVHLKCSMIEKMPGSYEDKFKNLMAGYAFMTGHPGKKLLFMGQ DFGQHREWSEKRELDWFLLDKEPNRHLQAFVKELLHLYKNNKCLYEYDCYPEGFEWINAD DGDRSIFSFVRHSESGKSNMLFIINFTPVERPDYRVGTTCRRKHTLVLSSDDKKFGGTGK RRPKEYKPAKKECDGRKYSFRYKLPAYGVAVFKF Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:57:30 2011 Seq name: gi|222441846|gb|ACEP01000096.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont354.1, whole genome shotgun sequence Length of sequence - 564 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:57:36 2011 Seq name: gi|222441845|gb|ACEP01000097.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont355.1, whole genome shotgun sequence Length of sequence - 32913 bp Number of predicted genes - 25, with homology - 23 Number of transcription units - 14, operones - 6 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 429 - 488 4.2 1 1 Tu 1 . + CDS 512 - 2167 1552 ## Ccur_02400 hypothetical protein 2 2 Op 1 . - CDS 2443 - 7935 5542 ## Clocel_1425 Ig domain-containing protein group 2 domain-containing protein 3 2 Op 2 . - CDS 7962 - 8276 247 ## gi|225027913|ref|ZP_03717105.1| hypothetical protein EUBHAL_02173 - Prom 8317 - 8376 12.4 - Term 8401 - 8437 1.2 4 3 Op 1 . - CDS 8481 - 9119 464 ## Sgly_1030 hypothetical protein 5 3 Op 2 . - CDS 9106 - 9729 236 ## gi|225027915|ref|ZP_03717107.1| hypothetical protein EUBHAL_02175 6 3 Op 3 . - CDS 9755 - 10144 449 ## Sgly_1028 hypothetical protein 7 4 Op 1 . - CDS 10291 - 10683 324 ## gi|225027918|ref|ZP_03717110.1| hypothetical protein EUBHAL_02178 - Prom 10711 - 10770 5.0 8 4 Op 2 . - CDS 10787 - 10906 225 ## gi|225027919|ref|ZP_03717111.1| hypothetical protein EUBHAL_02179 - Prom 10938 - 10997 11.6 9 5 Op 1 . - CDS 12763 - 13182 196 ## COG1203 Predicted helicases 10 5 Op 2 . 
- CDS 13222 - 15180 1190 ## COG1203 Predicted helicases 11 5 Op 3 . - CDS 15203 - 16294 829 ## CTC01145 hypothetical protein 12 5 Op 4 . - CDS 16310 - 17212 841 ## COG1857 Uncharacterized protein predicted to be involved in DNA repair 13 5 Op 5 . - CDS 17232 - 18623 795 ## CTC01143 hypothetical protein - Prom 18730 - 18789 6.5 14 6 Tu 1 . - CDS 18812 - 19549 328 ## CTC01142 hypothetical protein - Prom 19633 - 19692 6.5 - Term 19879 - 19916 7.1 15 7 Tu 1 . - CDS 20047 - 21054 868 ## Selsp_1852 hypothetical protein - Prom 21101 - 21160 8.3 - Term 21853 - 21896 2.4 16 8 Tu 1 . - CDS 22005 - 22400 343 ## COG3293 Transposase and inactivated derivatives - Prom 22439 - 22498 5.8 - Term 23569 - 23612 3.0 17 9 Op 1 . - CDS 23682 - 24185 510 ## Selsp_0413 hypothetical protein 18 9 Op 2 . - CDS 24257 - 25540 1216 ## CDR20291_2878 hypothetical protein 19 9 Op 3 . - CDS 25595 - 25726 197 ## - Prom 25758 - 25817 7.4 + Prom 25858 - 25917 5.5 20 10 Tu 1 . + CDS 25941 - 26813 624 ## EUBELI_00788 hypothetical protein + Term 26897 - 26950 13.3 - Term 27333 - 27377 12.3 21 11 Op 1 . - CDS 27404 - 28333 1257 ## COG0549 Carbamate kinase - Prom 28374 - 28433 10.3 22 11 Op 2 . - CDS 28478 - 29326 1006 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 29375 - 29434 8.3 23 12 Tu 1 . - CDS 29468 - 30838 957 ## COG1323 Predicted nucleotidyltransferase - Prom 30940 - 30999 8.1 + Prom 31003 - 31062 8.4 24 13 Tu 1 . + CDS 31283 - 32518 1422 ## COG0282 Acetate kinase + Term 32561 - 32606 9.5 - Term 32547 - 32594 12.2 25 14 Tu 1 . 
- CDS 32762 - 32911 122 ## Predicted protein(s) >gi|222441845|gb|ACEP01000097.1| GENE 1 512 - 2167 1552 551 aa, chain + ## HITS:1 COG:no KEGG:Ccur_02400 NR:ns ## KEGG: Ccur_02400 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 12 548 125 668 697 523 46.0 1e-146 MKKKQKHHFTGKKWIAALVVVVLILAAIYYYIALPAINIHNPELWKFVLFLAVIIFALYA LPKTTFTGDRRHPVNFSGNYKTKTFKFLASLVVVIAAAYFVGTLLSSPVINASKYQKLMK IEKSEFTKDIDQISYDQIPLLDKSSAEILGERKMGSLVDLASQFEVADEYSQINYKNNPV RVTPLRYADLVKWFTNKKAGIPAYIKIDMATQQTELVRLKEGMKYTPYDHFGRYLYRHLR FKYPTYMFDDINFEINEEGTPYWVCSVKKYNIGLFGGQTIGRVVLCNAITGECKDYKVED TPKWVDRVYSADMLVNLYNYYGTLKHGFINSILGQKDCLTTTNGYNYLAINDDVWVYTGV TSITSDQSNVGFVLMNQRTMETKYYEIEGAIEDSAMTSAEGKVQNLGYKATFPLLLNIDA EPTYFMALKDASGLVKKYAMVNVHNYQIVATGDTVNDCEASYRSLIQSNGMGSDEESKKA STKTTAGTIKKIAQAVIDGSSHYYIVLNDSDTIYDVDISKYLDAILLEEGDKVTIEYTEG NPSTVNKISKS >gi|222441845|gb|ACEP01000097.1| GENE 2 2443 - 7935 5542 1830 aa, chain - ## HITS:1 COG:no KEGG:Clocel_1425 NR:ns ## KEGG: Clocel_1425 # Name: not_defined # Def: Ig domain-containing protein group 2 domain-containing protein # Organism: C.cellulovorans # Pathway: not_defined # 1587 1808 557 784 1065 75 29.0 2e-11 MKYQKTRYVLARVLAAILTLTAVFGLSAPLITQAASNVSVLYRYDGDGTETKNRVGIRSV IINDKKYFDIHDGSVTDRVDPNNQIAFLQAVLQSGNETDGGTDSLIGQWASLAEQIYGTH NKYAAFVNAGGGFAKNFTGEENRTRNGYADVAYALSQKEPKSDRKSGKDNTYWTGLAYAK NLKSVRQQAAEEIASGINRKVSGSSILNSTEGQDAALKQIDDDTTQDVLYSLVTCVDRVG STPRFCYNTFGLAFYDFKLSVIAGEGLEYITKAQKYDSLKEAVNGQAAGVTYKTNANSNP KLSYYKNESKEEADIGMEFKQSNSLTTSNTLETGKSYSYSEMIGSETTLSGEIPLIAEVE QTLKMEVTCEQALSTAYSETKEYSETSENTISSSMSLPAQTAVGMESSKAVTNVQLGYDC PVAITYKVAVFSLSGTVYDDNMKIQSFSTAGYRQSHFSTIFGSESEKGGTTAMDNLYNRA VRYTATPNYEQAYGQTYGWTKKRSDGNPADKLEKLEWKDILNGTIKEEELTDEKTKVTLN RILVDEDGVIQHTFSSEPSGTEYPIGYESLIEVPSTMTQNTKKYTIYDKTVKDSNGEEVD NDINYNANGDTYNKWIQPIGEEEKKSLAEQGVSQEELDALNSVNFYYTENAPESGTTGQR ANSNQKANKAVKQSANTAQRAAAGTNNSLINKVSWLSTHCPMSVTGGVLNYDTVSMNSNI KEIVPLYPLKKISVTNGVKTLNMISGDTFDLSTVTLKGSNKNDVEYYGFDQNKGHWILID 
QNGKELTGSNIASLTEEKANGEAILTAGDEEGTLYLKYMINDDTTYTSLENEQPATNAGL DATAKIKVNVTAKPFDGSVQAEGTTTIYVGEEPVNLIGNEDIKGYAVDSTDKKISGAPIV WEARLDEEDGIKLENNRVSFTKEGTYQIRATYRGKHSEWISVKALPEKALTTLKISDDTK PATLESFIFKDKGTPDIIDLSKLTVKAFDQYDGEWNDLSDVKWHIDFSGEKAEDSFSDYL VANKLPIDKAGTYKIQARARKGRAVYAESNVLELVVKPTRKLTSLNIETDIEKDGIGIGD SYSSDLKEDVKVTALDQYGDEYDWTKEEYKWLTGGKYSKVTGDTLLGLVKGSDTLQLIVG KDDAKIESNVIDFAIVTKPYVKELYTGDSAVVREGEEYDLTKVNFTAKDQNGEAYILSQK EIDRIQWKLTDKGTIKSTQASFSAAKKTLSVAEGTLGYGETGNVILTGTFTNKNGIEAQA VQFTLTVRQKPVLDTLQLEKKDAESTLKNGENAYVDEYFTVKGLDQYGEEYGLDEVDITW NSDNEKAFRFENNKVIAAVEAGKKANIAVSATNKLNQNVVSNEVELSVPRVRSLQKITLE EVPEAIPFNSTLDLTALKATCYDELGEIYSEEDLKAYPAKVIFSLDANGINCKLDTAKNI LQTYDEYGYITISAMAVNSSTTNEMQDENGNTIISSVKIWVGPKVEVLKNTEKMIATAGA NTITLEGKCLEDNMKAGLFDADGRLITEAGTSGNADSQSATLDIPSNIGGKSDVVYTVKY AIKDTYMDEPTGKITVSNKIPATGVKLNKNNLVIAPKNREVLKAIVSPSNSTDKLSWSST ARRIASVDKNGNVSAAAPGKAVITVTTESGKKASCNVLVGLRQGDVISSGIYRYKVTDSS VDGTGQMEVAGFIKGRRAKNVTFPKSIAWNGIKYNVTSVGTRAFEANKKIKTVIIPDSVE KICHKAFYRCANMKKLTVGKNVSFFGAHAFCNNKKITKIVFKGKKLKTLKDPHVFICVNH AKVYVPKSKYKVYKKLLSTYGLGKCKFVKK >gi|222441845|gb|ACEP01000097.1| GENE 3 7962 - 8276 247 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027913|ref|ZP_03717105.1| ## NR: gi|225027913|ref|ZP_03717105.1| hypothetical protein EUBHAL_02173 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02173 [Eubacterium hallii DSM 3353] # 1 104 1 104 104 199 100.0 6e-50 MNPKEIRREKIHQRIHILRIRYDRCIIFSLTLLSVGLAVCMGIMFQEMHLSGVPAVVDSY GSVLLRDGADLYVIIAVAAFVAGVIFTMVCIWFQNKKSDQKKKV >gi|222441845|gb|ACEP01000097.1| GENE 4 8481 - 9119 464 212 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1030 NR:ns ## KEGG: Sgly_1030 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 196 1 201 211 164 47.0 3e-39 MMYIENIFICISIPLLLSLIFIQGEARKYTLFVCIGMGICVLSAYVSSFFMNYYGVDTIE ASIEITPVCEEVMKLFPLLFYILIFEPQRDQIPRVAIAISVGFATFENVCYLTENGAADF DLLLIRGISTGALHILCGIFIGFGLAYIFRQRAIAITGMVGLLGACGVFHAIFNLLISAD GIWRWMGYLFPSGMIIVLVSVWHLMVKLNHMK 
>gi|222441845|gb|ACEP01000097.1| GENE 5 9106 - 9729 236 207 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027915|ref|ZP_03717107.1| ## NR: gi|225027915|ref|ZP_03717107.1| hypothetical protein EUBHAL_02175 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02175 [Eubacterium hallii DSM 3353] # 1 207 1 207 207 338 100.0 1e-91 MNFEIIDNIFQVIVFSLIVVADIVCWFLHKNRLYIILALAHSCFMMGTLYFVLYLVIRGK VPQFFYVSEISWIASYLFLHSYQIVGYKGQRMKISVIPLICGIGVAIISIWSGIFGPAIL STGVFTLAAGAIVYISVFQILYGDAPYKSSICILLCIILQVSLYISSSFFHDYTRFNLYF CIDIVLTISMAMLLPCTFMEVGKDDVH >gi|222441845|gb|ACEP01000097.1| GENE 6 9755 - 10144 449 129 aa, chain - ## HITS:1 COG:no KEGG:Sgly_1028 NR:ns ## KEGG: Sgly_1028 # Name: not_defined # Def: hypothetical protein # Organism: S.glycolicus # Pathway: not_defined # 1 114 1 111 117 79 40.0 5e-14 MRTNEERQQLIHRRTLEIQREQQKKRERVISGSGIATCLVMIIGIGCLMPEVTKQASASG AKTNYTTGMASMLGNYEALGYICMGILAFALGVSVTILLYRLRKAEEHRKKAEEYKQSQQ VRENSLKKE >gi|222441845|gb|ACEP01000097.1| GENE 7 10291 - 10683 324 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027918|ref|ZP_03717110.1| ## NR: gi|225027918|ref|ZP_03717110.1| hypothetical protein EUBHAL_02178 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02178 [Eubacterium hallii DSM 3353] # 1 130 1 130 130 235 100.0 8e-61 MISDEMAYRQYLDGKEESADILVERYGDALTYYINGYIHDIHESEDLMIEAFAQIFAKER PIDGKGSFRAYLYKTARNLTIRHRQKHKIWFLHLDEVDFELPSEELVDAKLLQSEREQHL YEGMANLKRV >gi|222441845|gb|ACEP01000097.1| GENE 8 10787 - 10906 225 39 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027919|ref|ZP_03717111.1| ## NR: gi|225027919|ref|ZP_03717111.1| hypothetical protein EUBHAL_02179 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02179 [Eubacterium hallii DSM 3353] # 1 39 1 39 39 67 100.0 4e-10 MGRFVDPDNSAFQDALSSKIYIDKMGLISYTNSVIDMTE >gi|222441845|gb|ACEP01000097.1| GENE 9 12763 - 13182 196 139 aa, chain - ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R General function prediction only # Function: 
Predicted helicases # Organism: Fusobacterium nucleatum # 1 124 673 794 812 62 32.0 3e-10 MNSFKKTYKYVSEIRPYSYQKNEKKLREIVTETIIPSPVYEEKSTQILECVQKLKQADLS VIQRQKLRNQIMKYTVNVPYYHWKSYQDALNSKKTSGRKAKYYEPIQIGKYEKIKVMECN YDKKGYSPIEWTKKTEDNE >gi|222441845|gb|ACEP01000097.1| GENE 10 13222 - 15180 1190 652 aa, chain - ## HITS:1 COG:FN1179 KEGG:ns NR:ns ## COG: FN1179 COG1203 # Protein_GI_number: 19704514 # Func_class: R General function prediction only # Function: Predicted helicases # Organism: Fusobacterium nucleatum # 5 647 9 631 812 426 43.0 1e-119 MDTSKYWAKSDKVTTLVIHNAALREALDKLYELGYIEKEMYELLKIACEYHDLGKINEIF QARMKNKKLHFDLEKEIPHNVLSLFFVDSERVNKINPEDKKNYVRMLFAILYHHYNSDLM RMVSSSKRQLENIFPQDAFPDIQDEIYSLPIGIEEEISEVIGLKDEKAVKMKGLLHKCDY AASGHYPIEFRNDFLLSGLEYLLMQWNNPADQEKQEIQSAEEFALSEQDKNENERATVDW NSLQYFCKEKQNQNIIVVAQTGMGKTEAGLLWIGNHKGFFVLPIRTAINAIYNRLRKDML VNSDKIDEQVGILHSEQLEYYTKNIEGSDEDIYEYVKRGKQFSLPLNISTLDQLFDFIFQ YPGYELKLATLSYSRIVIDEIQMYNAELLAYLIFGLEKIVKMGGKVAILTATLPPFVKEL LEQHLNIPKENQEVFIDDKKARHHIKVCDKKINAEDIVKLYLKNKEACKSNPHKGNKILV VCNKIKTAQKIFDDVNELIKAKYDKEEIHIFHSRFIKEDREKLEKEIIEFGQTGNKSENV VEKGGIWITTSVVEASLDIDFDYLVTELQDLCSLLQRMGRCNRKGLKSVFESNCYIYLQI DSADLIGKTKRTGRRGKEKVYGFIDATLYELSKKAISEHEGILTESQKQEML >gi|222441845|gb|ACEP01000097.1| GENE 11 15203 - 16294 829 363 aa, chain - ## HITS:1 COG:no KEGG:CTC01145 NR:ns ## KEGG: CTC01145 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 363 1 360 360 322 52.0 1e-86 MKAFRLIIKQTSANYRKPETIKNKMTYPLPPFSTVIGALHNACGYTEYKEMNLSIQGKYE SMHREPYTDYCFLNRLEDDRGILVWLPNASFFSNAFVRVAHTQKSQGNSFRKGVTIQVEN QNLLDEYRRLKDLNDSITEFKKTRLNPVLALIKKRKKTLKEKKKVRKQEGLDCIAVLTRE NELKEIEKQIKEWFEKYKYENYTEPLSHFRNLVKSVKYYEILNNIELVIHVQASDEVLHD LEEHWTDIKSIGRSEDMVDVQEAKIVELQEKLNKEVNNIYSAYIDYDTIKRGNSEGVRIY PLGRKDKYVRGTVYYLNKNYEISPENGKRIFQKKKVVYTSNYEINKVGENVYIDEDEYIV NFI >gi|222441845|gb|ACEP01000097.1| GENE 12 16310 - 17212 841 300 aa, chain - ## HITS:1 COG:FN1181 KEGG:ns NR:ns ## COG: FN1181 COG1857 # Protein_GI_number: 19704516 # 
Func_class: L Replication, recombination and repair # Function: Uncharacterized protein predicted to be involved in DNA repair # Organism: Fusobacterium nucleatum # 1 296 1 298 300 337 64.0 2e-92 MKKKALTLTVVANMTSNYAENLGNIASVQKVFKNRKIYTIRSRESLKNALMVQAGLYDDL QTISDNSKQKVVVQKAVSEKLNAANCRALEGGYMNTIDTTYIRNSSIYVTDAISCESFVN EPRFHNNLYLASNYAQANGLNVQNNADKVGLMPYQYEYDKGLKCYSITVDLEMIGKDKNF NMEALPEEKAERVILLLNAIQNLSLVVKGNLDNAEPIFIVGGLSDRKTHYFENVVHVKED KLELSEDLIQKKNQGFYAGLLKGQALLNEDEIEEKLQTIGVSEFFSKLGEDVKTYFDVMK >gi|222441845|gb|ACEP01000097.1| GENE 13 17232 - 18623 795 463 aa, chain - ## HITS:1 COG:no KEGG:CTC01143 NR:ns ## KEGG: CTC01143 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 450 1 442 447 342 45.0 3e-92 MVTRLENDIFDIMVEPSDWRYSAAIVGLSRYFEDRDKMEDEDYEVDEDCIKFCSDSITKE EFLLFAESYYGEEFPHVQVEEILSRKEFSKEMIDLVNKLLKENSIMKKVFSKKKFDGSNQ EELLEMIDENRLVLIRETFRNKKSMYANFANESQLFQEKSDVCCRLQGYYVDPGRKSRGA AFAFNNNINVVEDSILLDFVPFGFFGEYESFFVNDSYSVKELIHTNDLLAEKMEEERTEE GKKIRDARKILFRAMQESADFINYDVEVILKERPTQDKPNDFFQTLFIREHSIQVLRDFK KHYKSFCFKRKMRENYDLDIQKEVIDCILNLKRADELILFFLREESAYLVDALIKLNQYI EGSEAAMEKGQRSAFACAKELIRKLPENKRRTYRQRLTSTLILKDYTGYLNILVQLANYA DMIFDFAFDLFEDFEKNMDIAYAFVNALGKAPDKKALGSEKAK >gi|222441845|gb|ACEP01000097.1| GENE 14 18812 - 19549 328 245 aa, chain - ## HITS:1 COG:no KEGG:CTC01142 NR:ns ## KEGG: CTC01142 # Name: not_defined # Def: hypothetical protein # Organism: C.tetani # Pathway: not_defined # 1 244 1 244 247 180 36.0 4e-44 MRVELLFDLEKPSLPADNRAIFVSFLKKTLEQSHEGHFFERYFGGTARKDYSFSIILDKP RYEGERIQLKTPRLKMIFSADDRNRTGLIFSMAFMGMKYKKFPLPEKNAMTLKRIAHVKE NVIISNRVMFRTIPGGGLVVRDHSRETNRDKYYVIGEEQFEEKARQSLMRQAMEAGFPKR LAEEIKFQPVSGKKVVSRLYGIMIDVSAITFIVEADSMILQYFYQAGASSRTAIGYGLCS IIEQL >gi|222441845|gb|ACEP01000097.1| GENE 15 20047 - 21054 868 335 aa, chain - ## HITS:1 COG:no KEGG:Selsp_1852 NR:ns ## KEGG: Selsp_1852 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 20 325 21 326 333 164 34.0 6e-39 
MIRQTGLAQAIDIAGNRAKYDECAKKLLSYKAIIAWILKSCTKEFSQYGVQYICDNCLKE EAEVSTHAVHQDELDDELDKDNKLDGDERVEGMNTESNSIQEHTIYYDIRLPAFLPKSNE IVRLILNLEIQLDDTPGYPIVKRGFYYCGRMVSEQYGTVFTDEHYEKLEKVYSIWICPDP ARKRKNGIFKYHTVEDIIYGESYTKEKNYDLMEVVVLNLGDADKSSDLEILDLLNVLFSA TITPEEKKQRLNDEFEIAMTVEFESEVQEMCNLSEALVELGIEQGKELGREERNISMAQM MIQEREPVEKIERYTGYALEKLKEISNGIGIPLMK >gi|222441845|gb|ACEP01000097.1| GENE 16 22005 - 22400 343 131 aa, chain - ## HITS:1 COG:CC0928 KEGG:ns NR:ns ## COG: CC0928 COG3293 # Protein_GI_number: 16125180 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 2 117 1 115 123 104 44.0 5e-23 MLRRYELTDDEWNRIAPLLPPENSGKQGRPKKSNRTILNGIVWIARSGAPWRDLPERYGA WQTVYSRFRKWIDDGIIDNIFRVLSLEAELTELSMDASIVSAHQHSAGAKKGGLQTKSDT AEVGPAQKFMR >gi|222441845|gb|ACEP01000097.1| GENE 17 23682 - 24185 510 167 aa, chain - ## HITS:1 COG:no KEGG:Selsp_0413 NR:ns ## KEGG: Selsp_0413 # Name: not_defined # Def: hypothetical protein # Organism: S.sputigena # Pathway: not_defined # 3 163 160 328 335 87 32.0 2e-16 MNSNYNDIKRVYSIWVCMNMSQNCMNYIHFTQESVVGTYQWKGDIDLANIVLIGLAEDLP EKEERYELHRLLGALLSAKLNVDEKFDIIGNEFDIPLESDIRKDVNDMCNLSQGIKEQAY AEGTENGIAKGVAKTIIKMYRKECGAEQISDLLDMEVEQVREIIENE >gi|222441845|gb|ACEP01000097.1| GENE 18 24257 - 25540 1216 427 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2878 NR:ns ## KEGG: CDR20291_2878 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 240 385 26 163 343 87 39.0 1e-15 MKKTMHNYRKLLNKNIMLALMFCLTLGIGIFTNSVQVKAANSLKIGNIQAYGGSQVAFEA TFKMNGYQKIEIYRSTSGGAFQLVDNFTEEGELWNTYSEGWYTVGNKKKVTCYADNTKVA GQVVFEDDSVTVGNRYSYKVVLTCNDGSKIMSGTVSVAVTLDTAEILNIYSADAKKVKIK WRRVSKAQGYQIYRTEGKKWKVIKTIKSGKKVTFVDKKVKKNKVYRYKVRAYGTVKGKKV YGNFGYVCKASLKKPTVKGAYKKGSIYGPALNNNQLMQVRRVVQSFKTNYIKKGMSNYEK AFIAFNYLNQNCKYATRGWQYNGANTAWGALVYGEAQCSGYARGMKALCDAIGVPCYYVH ANKKALNPSHQWNQVKVDEKWYIVDAQSGYFLAGSKTWKNEIGMSWDTKGLPKCSGSNHK RGGFYGI >gi|222441845|gb|ACEP01000097.1| GENE 19 
25595 - 25726 197 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAGKKNNGVVAVISDLTNEQAAQLTKEIIKAKRKVAPKGKSIL >gi|222441845|gb|ACEP01000097.1| GENE 20 25941 - 26813 624 290 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00788 NR:ns ## KEGG: EUBELI_00788 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 10 238 6 250 315 158 40.0 3e-37 MTKINTGNANREYKDRLFRFVFGAEENKAYLLSLCNAVSGTDYTDVDDIEITTLSDAIYI KMKNDISFLIDSQMNLFEHQSTFNPNMPLRGMECFAELYGIYIIENNLDIYVSSLQKILT PRYYVIYNGTEKQPDVVKLKLSDAFQVPDDSGEFEWTATMLNINYGHNRKLLEQCQPLYE YAHFIKLVREYSEAMELKKAIDKAVEKAREWKCIGTFLYQCKSEVSVMLLTEFDEKKHED NLIKLGEKEGREKERMKNICSMLALSLSPEIIAKACEVSVDYVLNLKKEL >gi|222441845|gb|ACEP01000097.1| GENE 21 27404 - 28333 1257 309 aa, chain - ## HITS:1 COG:yqeA KEGG:ns NR:ns ## COG: yqeA COG0549 # Protein_GI_number: 16130776 # Func_class: E Amino acid transport and metabolism # Function: Carbamate kinase # Organism: Escherichia coli K12 # 4 308 3 308 310 289 50.0 5e-78 MSKKKVVIALGHRALGTTMPEQYKATRNTAKAIADLVEAGADVVISHSNAPQVGMIHTAM NEFSKDREDYTQAPMSVCSAMSQGYVGYDLQNAIRAELLKRGVYRPVCTIITQVMVDPYD DAFNEPVKVIGRLLSKEEAEQEEKHGNYVTEVEGGYRRIVAAPKPQGIIEIDSIKALSDQ GQVVIACGGGGIPVLAQGVELHGASAVIEKDYASGRMAELLNADVFMVLTSVENVCVGYN TEQESPLEHITVAEAQKYIDEGQFEPNTMLPKIEAGVSFLEKGNGRKVIITSIDKALEGY LEKTGTIIE >gi|222441845|gb|ACEP01000097.1| GENE 22 28478 - 29326 1006 282 aa, chain - ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 1 280 1 280 283 254 46.0 1e-67 MYNAGLILEGGGMRGVYTAGVLDAFLDAGIEFSSVYGVSAGSCHACSYLSKQRGRAFRVN VDYLDDPNYCGAKTFLKTGNVFGPEMLYEQIPDVLDPFDHQAFLNYPGKFYAVVTDVETG AARYLRVRDLRKQMWMIRSSSSLPLVSKTIVVKGHKLLDGGIADSIPIRKSIEDGNEKNV VILTRDHGYRKAPNKLMPIMRLRYPKYKEFLHAMETRHVRYNETIDFLEEQEKEGKVFII RPEAKVEVGRIEKDKAKLTVLYKQGLHDGEKAIEAMKEYLEK >gi|222441845|gb|ACEP01000097.1| GENE 23 29468 - 30838 957 456 
aa, chain - ## HITS:1 COG:CAC1741 KEGG:ns NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 1 455 1 401 402 192 31.0 8e-49 MKVAGIIAEYNPFHKGHQYHIEETRKKTGADYVVVVMSGDYVQRGEPAIADKYMRTRMAL SGGADLIIEMPAIYATASAEYFATAGIGILDQLGCVDYLSFGSEWAEVEDFSAYATLFLE ESEEYKQILQEQLKSGKSFPEARAFAAGNLLFDSKPEKAIEFLKEPNHILGLEYIKALKR RNSLIRPVVIKRKGNHYHENKLTENYSSATAIRQEMYHFYRNFSRKNPYNTETCNSGKCG NTGKNTNNRVSNIYNNTYNKEKEAPQAFEKALCGEYLPFIEGFLQNNFVTWEDLMPYLDY TFLLKNKVIGKYFGMNLDLARRFQNIYEPGLSFEDLIEGLHARQITDAALRRVLLHIVLH MKYYPFLEEAKDIPIPYARILGFSKSSSPLLKEIRQNATLDIIQRPAEGKKLYSNSSAQA QIYSIDIRTADLYEQIAARKAGRKPISEYKRQQVIR >gi|222441845|gb|ACEP01000097.1| GENE 24 31283 - 32518 1422 411 aa, chain + ## HITS:1 COG:MA3606 KEGG:ns NR:ns ## COG: MA3606 COG0282 # Protein_GI_number: 20092406 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Methanosarcina acetivorans str.C2A # 1 394 1 395 408 480 58.0 1e-135 MKVLVINCGSSSLKYQVIDSETEHVLAKGLCERIGIDGVLTYQPAGQDKIKYEAAMPAHR QAVELVLKQLTDPENGVLKSIDEIDAVGHRMVHGGEKFACSTLLTEEVLKTVESCNDLAP LHNPPTLVGVAACKELLPTTPMVGVFDTAFHQTMPPEAYIYGLPYEYYEKYAVRRYGFHG TSHKYVSLRAAEILGKKPEDLKIVVCHLGNGSSISAVDGGKCVDTSMGLTPLEGLVMGTR SGDIDPTCIEFIAHKENLSLEQVMDIINKKSGVLGISGVSSDFRDLDEAAKAGNKRADLA LRVFSHSVVKYIGSFIAVMNGVDAIVFTAGIGENDDIIRSRIIEHFDYLDTTLDQKANKM HGEERIISTPESKVKVICIPTNEELAICRDTVEIVTKKMVEDTVKDVLGNN >gi|222441845|gb|ACEP01000097.1| GENE 25 32762 - 32911 122 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no RGRLLCLRQIPILLDKGYRLHTISTTSGGSKGLGGGDRIQATMVFERIE Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:59:25 2011 Seq name: gi|222441844|gb|ACEP01000098.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont356.1, whole genome shotgun sequence Length of sequence - 507 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:59:26 2011 Seq name: 
gi|222441843|gb|ACEP01000099.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont357.1, whole genome shotgun sequence Length of sequence - 821 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 753 761 ## COG0428 Predicted divalent heavy-metal cations transporter Predicted protein(s) >gi|222441843|gb|ACEP01000099.1| GENE 1 1 - 753 761 250 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 16 250 25 269 269 178 47.0 9e-45 MIIDIVKGISVPFLGTTLGAFCVFFLKNELSKGMQKALTGFAAGVMVAASIWSLLLPALE QGASLGMWKFLPAVIGFWIGILFLMAIDYFAPPECIDEKCKNKLLLAVTLHNIPEGMAVG VIYAGLLSGAEHITEIGAFSLALGIAIQNFPEGAIISLPLCTEGMSKKRAFIYGVLSGAV EPVAAVFTVWAASLIVPLLPYFLSFAAGAMFYVVVEELIPEMSAGKHSNIGTILFSFGFT LMLTLDVALG Prediction of potential genes in microbial genomes Time: Fri Jul 8 07:59:46 2011 Seq name: gi|222441842|gb|ACEP01000100.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont358.1, whole genome shotgun sequence Length of sequence - 60339 bp Number of predicted genes - 65, with homology - 64 Number of transcription units - 34, operones - 13 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 370 - 429 5.8 1 1 Op 1 6/0.000 + CDS 475 - 1452 999 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III 2 1 Op 2 4/0.333 + CDS 1531 - 1755 424 ## COG0236 Acyl carrier protein 3 1 Op 3 3/0.333 + CDS 1771 - 2715 1320 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 4 1 Op 4 26/0.000 + CDS 2721 - 3647 1320 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 5 1 Op 5 11/0.000 + CDS 3650 - 4393 233 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 6 1 Op 6 4/0.333 + CDS 4417 - 5655 1658 ## COG0304 
3-oxoacyl-(acyl-carrier-protein) synthase 7 1 Op 7 4/0.333 + CDS 5662 - 6165 670 ## COG0511 Biotin carboxyl carrier protein + Prom 6199 - 6258 6.0 8 2 Tu 1 4/0.333 + CDS 6287 - 6712 603 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases + Prom 6715 - 6774 4.2 9 3 Tu 1 4/0.333 + CDS 6824 - 8188 1793 ## COG0439 Biotin carboxylase + Prom 8236 - 8295 5.5 10 4 Tu 1 . + CDS 8411 - 10204 1635 ## COG0777 Acetyl-CoA carboxylase beta subunit + Prom 10223 - 10282 4.6 11 5 Op 1 . + CDS 10361 - 11011 715 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 12 5 Op 2 2/0.333 + CDS 11055 - 11429 207 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Prom 11505 - 11564 3.9 13 6 Tu 1 . + CDS 11601 - 12065 519 ## COG1846 Transcriptional regulators + Term 12096 - 12146 9.7 + Prom 12506 - 12565 1.7 14 7 Tu 1 . + CDS 12603 - 13532 599 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component + Prom 13565 - 13624 7.9 15 8 Op 1 . + CDS 13650 - 14051 403 ## gi|225027959|ref|ZP_03717151.1| hypothetical protein EUBHAL_02219 + Prom 14073 - 14132 4.0 16 8 Op 2 . + CDS 14204 - 14479 388 ## gi|225027960|ref|ZP_03717152.1| hypothetical protein EUBHAL_02220 + Term 14519 - 14552 6.1 17 9 Tu 1 . - CDS 14571 - 16499 2000 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 16527 - 16586 8.9 + Prom 16551 - 16610 10.3 18 10 Tu 1 . + CDS 16698 - 17360 818 ## COG2344 AT-rich DNA-binding protein + Term 17392 - 17445 6.1 + Prom 17476 - 17535 2.8 19 11 Op 1 . + CDS 17586 - 19112 573 ## PROTEIN SUPPORTED gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 20 11 Op 2 . + CDS 19200 - 19544 401 ## COG2337 Growth inhibitor 21 11 Op 3 . + CDS 19572 - 20741 1336 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 20804 - 20863 7.2 22 12 Tu 1 . + CDS 21030 - 21764 1112 ## COG0217 Uncharacterized conserved protein + Prom 22472 - 22531 7.8 23 13 Tu 1 . 
+ CDS 22553 - 23545 1134 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components + Term 23573 - 23630 19.1 - Term 23563 - 23614 16.8 24 14 Tu 1 . - CDS 23635 - 24414 752 ## COG0796 Glutamate racemase - Prom 24439 - 24498 2.2 + Prom 24750 - 24809 3.0 25 15 Tu 1 . + CDS 24829 - 25326 733 ## COG0622 Predicted phosphoesterase + Term 25332 - 25395 0.1 + Prom 25353 - 25412 6.1 26 16 Op 1 . + CDS 25473 - 26294 686 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase 27 16 Op 2 . + CDS 26297 - 26965 740 ## COG0535 Predicted Fe-S oxidoreductases 28 16 Op 3 . + CDS 27028 - 27972 1109 ## COG0039 Malate/lactate dehydrogenases + Term 28085 - 28130 0.9 + Prom 28020 - 28079 4.7 29 17 Tu 1 . + CDS 28245 - 28970 805 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family + Prom 28972 - 29031 7.8 30 18 Tu 1 . + CDS 29080 - 29373 386 ## COG4496 Uncharacterized protein conserved in bacteria + Term 29585 - 29625 -0.9 - Term 29468 - 29525 4.0 31 19 Tu 1 . - CDS 29609 - 30229 447 ## Closa_1554 hypothetical protein - Prom 30302 - 30361 10.3 + Prom 30354 - 30413 10.7 32 20 Tu 1 . + CDS 30501 - 31523 1245 ## COG4260 Putative virion core protein (lumpy skin disease virus) + Prom 31611 - 31670 6.3 33 21 Op 1 . + CDS 31724 - 33940 2152 ## COG0210 Superfamily I DNA and RNA helicases + Prom 33982 - 34041 4.6 34 21 Op 2 . + CDS 34065 - 34373 513 ## gi|225027983|ref|ZP_03717175.1| hypothetical protein EUBHAL_02243 + Term 34390 - 34450 9.8 + TRNA 34764 - 34838 81.3 # Pro TGG 0 0 + TRNA 34844 - 34914 75.8 # Gly TCC 0 0 + TRNA 34964 - 35037 82.5 # Arg TCT 0 0 + TRNA 35061 - 35134 74.1 # His GTG 0 0 + TRNA 35141 - 35212 65.9 # Gln TTG 0 0 + TRNA 35257 - 35329 81.3 # Lys TTT 0 0 + TRNA 35385 - 35459 81.3 # Pro TGG 0 0 + TRNA 35465 - 35535 75.8 # Gly TCC 0 0 + TRNA 35580 - 35659 65.9 # Leu TAG 0 0 35 22 Tu 1 . + CDS 35928 - 37076 426 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases 36 23 Op 1 . 
- CDS 37035 - 37295 127 ## gi|225027985|ref|ZP_03717177.1| hypothetical protein EUBHAL_02254 37 23 Op 2 . - CDS 37292 - 37897 389 ## bpr_I2503 ribonuclease - Prom 38074 - 38133 8.6 - Term 38002 - 38049 2.2 38 24 Tu 1 . - CDS 38260 - 38460 267 ## COG2155 Uncharacterized conserved protein - Prom 38488 - 38547 8.2 + Prom 38499 - 38558 3.6 39 25 Tu 1 . + CDS 38597 - 39079 681 ## gi|225027989|ref|ZP_03717181.1| hypothetical protein EUBHAL_02258 + Term 39291 - 39338 -0.5 + Prom 39295 - 39354 6.6 40 26 Tu 1 . + CDS 39517 - 40482 773 ## Cphy_2931 hypothetical protein + Prom 40607 - 40666 10.2 41 27 Op 1 . + CDS 40730 - 41845 1026 ## COG0628 Predicted permease + Prom 41960 - 42019 8.4 42 27 Op 2 . + CDS 42066 - 42272 280 ## COG1278 Cold shock proteins + Term 42288 - 42327 6.5 + Prom 42277 - 42336 6.9 43 28 Tu 1 . + CDS 42541 - 43155 608 ## COG0406 Fructose-2,6-bisphosphatase + Prom 43183 - 43242 8.9 44 29 Op 1 . + CDS 43274 - 44599 1245 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 45 29 Op 2 . + CDS 44643 - 44837 106 ## gi|225027996|ref|ZP_03717188.1| hypothetical protein EUBHAL_02265 46 29 Op 3 . + CDS 44842 - 44970 64 ## 47 29 Op 4 . 
+ CDS 44987 - 45529 689 ## COG1592 Rubrerythrin + Prom 45822 - 45881 7.5 48 30 Op 1 8/0.000 + CDS 45924 - 46280 262 ## COG2739 Uncharacterized protein conserved in bacteria 49 30 Op 2 23/0.000 + CDS 46299 - 47654 1729 ## COG0541 Signal recognition particle GTPase + Prom 47660 - 47719 1.9 50 30 Op 3 19/0.000 + CDS 47744 - 47989 308 ## PROTEIN SUPPORTED gi|238916983|ref|YP_002930500.1| small subunit ribosomal protein S16 51 30 Op 4 12/0.000 + CDS 48008 - 48238 434 ## COG1837 Predicted RNA-binding protein (contains KH domain) + Term 48248 - 48297 9.2 + Prom 48245 - 48304 2.8 52 31 Op 1 30/0.000 + CDS 48333 - 48842 183 ## PROTEIN SUPPORTED gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 53 31 Op 2 33/0.000 + CDS 48842 - 49588 703 ## COG0336 tRNA-(guanine-N1)-methyltransferase 54 31 Op 3 5/0.000 + CDS 49676 - 50023 460 ## PROTEIN SUPPORTED gi|160880535|ref|YP_001559503.1| ribosomal protein L19 + Term 50037 - 50074 6.2 55 31 Op 4 2/0.333 + CDS 50100 - 50645 536 ## COG0681 Signal peptidase I + Prom 50674 - 50733 5.6 56 32 Op 1 2/0.333 + CDS 50766 - 51614 1114 ## COG1161 Predicted GTPases 57 32 Op 2 2/0.333 + CDS 51604 - 52197 578 ## COG0681 Signal peptidase I 58 32 Op 3 4/0.333 + CDS 52218 - 52829 495 ## COG0681 Signal peptidase I 59 32 Op 4 1/0.333 + CDS 52832 - 53587 829 ## COG0164 Ribonuclease HII 60 32 Op 5 . + CDS 53607 - 53960 278 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase 61 32 Op 6 . + CDS 53970 - 54818 809 ## COG1496 Uncharacterized conserved protein + Prom 54851 - 54910 6.4 62 33 Op 1 1/0.333 + CDS 55010 - 56344 1336 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family 63 33 Op 2 2/0.333 + CDS 56337 - 57662 1750 ## COG1160 Predicted GTPases 64 33 Op 3 . + CDS 57663 - 58298 701 ## COG0344 Predicted membrane protein + Prom 58839 - 58898 7.2 65 34 Tu 1 . 
+ CDS 58919 - 60313 1728 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases Predicted protein(s) >gi|222441842|gb|ACEP01000100.1| GENE 1 475 - 1452 999 325 aa, chain + ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 3 325 4 323 325 295 47.0 8e-80 MFVKIKGTGSCLPEKVLDNFEISQLVDTNDEWIQSRTGIKSRHIAREETAVSMAAKAAKR ALEDAQVAAEEIDLLIVSSVSSEQLLPCTACSVQKEIGAVNAAAFDLNAACSGFIVAYQM AAGQIKAGLSKKALLIGVECLSNIVNWEDRGTCILFGDGAGAAVVSADEDGNIDGRGNIE IPSVLHSDGSRGEVLTCQNPTGKRADGSLKGYVAMDGREIYKFAARQVPVVVKEILEKAG KSVEEVDLFVLHQANRRIVEAIAKRLKQPIEKFPMDMMQNGNMSSASIPVLLDELKKAGK LQPGMKIVVAGFGAGLTWGGMYLEW >gi|222441842|gb|ACEP01000100.1| GENE 2 1531 - 1755 424 74 aa, chain + ## HITS:1 COG:SA1075 KEGG:ns NR:ns ## COG: SA1075 COG0236 # Protein_GI_number: 15926815 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Staphylococcus aureus N315 # 2 70 4 72 77 61 59.0 5e-10 MFEKIKELIVESLGIEEDQVTMEASFKEDLKVDSLDLFEMVMSLEDEFDVEIPTEELEKM ETVGDVVEYIKEHK >gi|222441842|gb|ACEP01000100.1| GENE 3 1771 - 2715 1320 314 aa, chain + ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 305 2 306 310 350 61.0 2e-96 MKTVITELLGIEYPVIQGGMAWVGTAELAAAVSEAGGLGIIGAGGAPASWVEEQIHKVKE KTDKPFGVNLMLLNPEADAIAELLVKEEVKVVTTGAGNPEKYMEEWKKAGVRVIPVVAST ALAKRMEKAGADAVIAEGTESGGHIGETTTMALVPQVVDAVNIPVIAAGGIADGRGMAAA FMLGAKAIQMGTIFVASKESIVSEAYKNKVIKAKDIDTKVTGRTTGHPVRCLRNQQTREY LKLEQAGASFEELEKLTLGGLRKAVVDGDVIHGSVMAGQIAGLVKDSRSCKEIIEGICRE MKNVVLSQAENIGR >gi|222441842|gb|ACEP01000100.1| GENE 4 2721 - 3647 1320 308 aa, chain + ## HITS:1 COG:CAC3575 KEGG:ns NR:ns 
## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 300 1 305 308 278 49.0 8e-75 MSKIAFLYPGQGAQVAGMGKDFYEKSPLAKEVFDKSCEILGQDMKHICFEENELLDRTDY TQAALVTTCMAMTKEIMARGLTPDMTAGLSLGEYCAITTAGGMELEDAIKMVWLRGNLMH NAVPEGKGGMAAVLGLSGEAVNEAIAYMEGVYVANYNCPGQIVITGDKEAVENAAPALKE AGAKRVIPLNVSGPFHSIYLKEAGEKLYQALSEVTLGELKIPYMTNVDASIVTDTTRTKE LLKEQVYSSVLWEQSVRAMIADGVDVFVEIGPGKTLSGFMRKIDRNVKMFRIGTMEDIDK TVEAIKEL >gi|222441842|gb|ACEP01000100.1| GENE 5 3650 - 4393 233 247 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 243 4 238 242 94 32 1e-18 MLLEGKTAIVTGASGGIGKAVAIALAKEGAAVAVHFHGNEEKALLVKKEIEEAGGKAEIF RANVADFEECNALVKAVAKTFGSIDILVNNAGITKDGLLMAMSEADFDNVIDTNLKGCFQ MIRFASRRMMKQRYGRIINVSSVSGVAGNAGQANYSASKAGIIGLTKSAAKELASRGITC NAIAPGFVKTEMTDVLSDEVKENAKKQIPLGRFAEPEDIANAAVFLASDKASYITGQVLL VDGGMVM >gi|222441842|gb|ACEP01000100.1| GENE 6 4417 - 5655 1658 412 aa, chain + ## HITS:1 COG:CAC3573 KEGG:ns NR:ns ## COG: CAC3573 COG0304 # Protein_GI_number: 15896807 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 1 411 1 411 411 469 59.0 1e-132 MKRRVVITGMGAVTPIGNTLDEFWNGIKEEKVGIGAITKFDTEDYKVKIAAEVKDFNVKE RLDNKAARRMEVFSQYAVAASREAFAMAGLDMEKEDPYRVGVIIGSGIGGLSSMEKEHEK IITKGPKRVSPMLIPLMISNMAAGNVAIDLGCKGKCTDVVTACATGSNSIGDALRAIQYG DADVMLAGGTESSITPVAVAGFTNLTALSSESDPLRASLPFDEARGGFVIGEGAGVVVLE ELEHAKKRGAKILAELAGYGSTCDAFHITSPAEDGSGAAKAMELAMEEAGVKPEEVDYIN AHGTGTHHNDLFETRAVRRAFGESADHLKMSSTKSMIGHLLGAAGAVELIVCVKSIEEGY IHPTVGTTNPGEGCDLDYVINGAVEQEVNVAISNSLGFGGHNASLLVKKYEE >gi|222441842|gb|ACEP01000100.1| GENE 7 5662 - 6165 670 167 aa, chain + ## HITS:1 COG:CAC3572 KEGG:ns NR:ns ## COG: CAC3572 COG0511 # Protein_GI_number: 15896806 
# Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Clostridium acetobutylicum # 1 166 1 156 159 108 43.0 5e-24 MDYQQILELVKEVSKAGLTNFEYTEGNIRIAMSCPQPEEKIVVPASNIALQEAIGVSANS VNGANTAGTEGVAATAAAQSQAAEAVGEKGGNLVKSPLVGTFYAAPSEDAEPFIKVGDTV KKGQTLAIVEAMKLMNEIESEFDGVVTEILVENEENVEYGQPLFRIQ >gi|222441842|gb|ACEP01000100.1| GENE 8 6287 - 6712 603 141 aa, chain + ## HITS:1 COG:BH3735 KEGG:ns NR:ns ## COG: BH3735 COG0764 # Protein_GI_number: 15616297 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Bacillus halodurans # 3 140 2 139 140 169 59.0 1e-42 MRLGIKEIEQILPHRHPFLLVDYIEEMEPGVSAVGYKCVTFHEDFFRGHFPQEPVMPGVL TVEALAQVGAVAILSLEENKGKTAYFGGINKCRFKGKIVPGDKVKLETKIVKRKGPMGVG EATAYVDGKVVVKAELTFMVG >gi|222441842|gb|ACEP01000100.1| GENE 9 6824 - 8188 1793 454 aa, chain + ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 446 1 446 447 552 61.0 1e-157 MIRKILIANRGEIAIRIIRACREMGIETVAVYSEADREALHTQLADEAVCIGPAAAKDSY LNMEQIISATMITGADAIHPGFGFLSENSQFAKLCEACHITFIGPDSDIIARLGNKAVAR QTMVDAGVPVIPGCQKALTDVKEALEIAKGIGFPVIVKAVLGGGGKGMRVAYNEDEFENA FLMAQKESGLAFGDESMYLEHFVENPRHIEFQILADNYGNVIHLGERDCSVQRNHQKVIE ESPSAVVDEELREKMGQAAVLAAKAAGYKNAGTIEFLLEKDKSFYFMEMNTRIQVEHPVT EWVTGIDLIKAQIRIADGEKLKWKQEDIQITGHAIECRINAEDPSKNFRPCPGRITDMYL PGGKGVRIDSAIYSGCEVSPYYDSMITKLIVFAATRKEAIAKMHRALGEVIIEGITTNID FLYEIMERPDYQEGDFTIQYLEKVLEERSQEIKK >gi|222441842|gb|ACEP01000100.1| GENE 10 8411 - 10204 1635 597 aa, chain + ## HITS:1 COG:CAC3569 KEGG:ns NR:ns ## COG: CAC3569 COG0777 # Protein_GI_number: 15896803 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase beta subunit # Organism: Clostridium acetobutylicum # 25 283 26 284 285 327 61.0 5e-89 
MKLKSIFKKTPVTKPEVKEAASKVKPEVPEGLLKRCNKCGKGIFTEDYKKNLYICPKCGG YLRMPAQKRIAFLTEKDSFEEWDTGLTTENPLHMIGYPDRIKSLQEKTKLDEAVITGKAR IGANEVALMVMDGRFLMASMGEVVGEKIARGVERATKEKLPVIIFTCSGGARMQEGMTSL MQMAKTSAALKRHSDAGLLYITVLTDPTTGGVTASFAMLGDIILAEPKALIGFAGPRVIE QTIHKKLPKGFQRSEFLLKHGFIDKIVERKDMKTVLEKILTMHRLTAEGVAENTGNNAVF NDGDITVASEQEVGQTVKIVKNSRKQKLSATQKKRASEKTAWERVLTSREKDRPVGEDYI SGLFEEFIEFHGDRNFGDDAAICGGIAYFQGQPVTVIAQMKGKSTSENIERNFGMPEPEG YRKALRLMKQAEKFHRPIICFVDTPGAFCGMEAEERGQGEAIARNLYEMSSLETPILSVL IGEGGSGGALAMAVADEVWILENAVYSILSPEGYAAILWKDGSQAARAAKAMKLTSYDLY KAGFVEKIISEPESYTLDSMMNVFDNLEENISAFLENSKEMTEKERVEARYQRFRSM >gi|222441842|gb|ACEP01000100.1| GENE 11 10361 - 11011 715 216 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 207 1 206 355 197 50.0 1e-50 MKKNLFSIGKKQVRVPIVQGGMGVGISLGRLAGTVAKEGGAGTISAAQIGFKEPDFYENP IEANKRAIHKEMQKARAISPDGIVGFNLMVAMNDYETYAHEVIAAGADFIVSGAGLPVDL PAYTADSDIAIAPIVSTQKSARVILKFWDKKYKRTADFIVIEGPMAGGHLGFHKEQLEEF TPDIYGEEVKKIITVVQKYEEKYEKKITCHPGRRHL >gi|222441842|gb|ACEP01000100.1| GENE 12 11055 - 11429 207 124 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 121 231 349 355 128 49.0 3e-30 MQIATRFVTTEECDADEHYKQTYIQAEKEDIVIVKSPVGMPGRAIRNTFLDKVKSEGRIP PTKCLRCIHTCNPAETPYCITEALIHAAKGETENALLFCGGRAYEAKTIETVKKVIDYFC SSCI >gi|222441842|gb|ACEP01000100.1| GENE 13 11601 - 12065 519 154 aa, chain + ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 154 1 152 154 108 43.0 3e-24 
MNDVRETINTMLVKTFHEILELEEKAIITDEFSDISNNDMHIIEAIGLGDGARMSVVAKR LNITVGSLTTSMNSLVNKGYVARARSEKDRRVVNVYLLEKGIAAYKHHEEFHEKMLDAIM ETLSPEELVVWAKSCDRLTQFLKSYDKSHIKKAK >gi|222441842|gb|ACEP01000100.1| GENE 14 12603 - 13532 599 309 aa, chain + ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 12 232 108 328 541 67 22.0 5e-11 MNLYAKSEDIHKKIVMAVILTIWIIGIIIAVVHKMREWNKVKGYLIKGTSKGDGVAEKIL REIDSECPIKIEYNLAIAEPFIKGLRHPVIYLPEKECNEKELEFILMHEYLHWKRKDLWK KFIINIIGMIFWWNPLAYLLCKDLDQIIELNCDNAMSKKYSEMDTLYYLDTLTYMAGGRR ANFDKVSSDTLGFVKKLEVRPLKQRFHYVMFKKDDKRTQRKMNLFILGVSVIWFVASYYF ILQPEYIVPLTDMHQENTPFVSDGENSYLEEQKDGTYIFHYGKYEEKVLKEDVENGIYEE YPIIKYNKK >gi|222441842|gb|ACEP01000100.1| GENE 15 13650 - 14051 403 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027959|ref|ZP_03717151.1| ## NR: gi|225027959|ref|ZP_03717151.1| hypothetical protein EUBHAL_02219 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02219 [Eubacterium hallii DSM 3353] # 1 133 15 147 147 215 100.0 7e-55 MKRKIPKITNSEKQILEVLWDEEKPLTSSEIVGVSDDRTWKASSVHLLLNSLLNKDLVEV AGFKKTTKNYARTFQPTMTREEYSINQLRQEQRNTSRTLSRLFNALLKDEEDDDLLKELA DMVDSRRSQIRKD >gi|222441842|gb|ACEP01000100.1| GENE 16 14204 - 14479 388 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027960|ref|ZP_03717152.1| ## NR: gi|225027960|ref|ZP_03717152.1| hypothetical protein EUBHAL_02220 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02220 [Eubacterium hallii DSM 3353] # 2 91 1 90 90 176 100.0 7e-43 MMRKAKWKRVVAAGAILMSLGMVFPSADIQAEEVDFGSEIVEIVEEDSITPCADQIVIYY RVHNGRRQKRRWNKTQGKWVDKNWIDLGPAY >gi|222441842|gb|ACEP01000100.1| GENE 17 14571 - 16499 2000 642 aa, chain - ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG: CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction 
only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 641 2 638 643 542 47.0 1e-153 MILACQKIEKAFGGKSVLNNVNFLINEGEKAAVIGINGAGKTTLFKIITGEYEADNGEVI FQKGSTYGYLSQVIDVSSHRSIYEEMLDAKKEIIEMEKKIRQLEKDISTLSGEALESAME SYSRLTDRFEKANGYAWKSEITGVLKGLGFDESQFATPIHTLSGGQKTRVALSRILLMHP DLILLDEPTNHLDMESIRWLETFLSNYRGAVLIVSHDRYFLDKVVDKVIEIERGTSQVFN GNYSAYAEKKKAQRDAQMKLYMNQQQEIHHQEEVITKLRSFNREKSIKRAESREKMLSKL ELVDKPVVLNSKMRITLEPEVLSGNDVLTIEGLSKSFGDKALFRNLNVQIKRGEVVGLLG ANGTGKTTLLKIINRQLRADSGKIRYGSKVSIGYYDQEQHVLDDSKTIFDEISDAYPKLT NTRIRNVLAAFLFTGEDVFQVIGTLSGGEKGRVSLAKLMLSNANFIILDEPTNHLDIQSR EILEEAINNYEGTVFYVSHDRYFINQTATRILDLSPEGIVNYKGNYNYYLEQKEAGNISA DSDNITLTSATDAALAPSPEKAKEDWKRSREEAARQRKRANDLKKTEKEISRLEEENDQL KEEMALPENATNVAKLMELNTKFEENEAALLELYDKWEELSE >gi|222441842|gb|ACEP01000100.1| GENE 18 16698 - 17360 818 220 aa, chain + ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 13 220 4 211 214 224 53.0 1e-58 MPDSKTVVKSGSEKSISPAVIKRLPRYYRYLGDLLKNNVVRISSKELSQRMNVTASQIRQ DLNNFGGFGQQGYGYNVEYLYNEMGKILGLDKTNNIIIVGAGNLGQALANNQDFDSNGFK IIGLFDVNPRLIGMTVRGVEVYDIDMLETFLKEHEVMIAALTLPKSKATKVAEQLVDLGI KALWNFAPVDLHFPEEILVENVHLAESIMTLSYRIHSHNQ >gi|222441842|gb|ACEP01000100.1| GENE 19 17586 - 19112 573 508 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|153825000|ref|ZP_01977667.1| ribosomal protein S15 [Vibrio cholerae MZO-2] # 28 499 31 489 490 225 33 5e-58 MRYVASREEMQNIDAYSINNVGIPGIVLMEKAALALEEVFLERIPTDSRVLIVTEKGNNG GDGLALGRLLLEEGYTVDFYEIGVIPHDSDSHQIQKHILEQMEADFLMKFPDEEYDVIVD AIFGVGLKREVAGRHREVIERLNQMEALKVAVDVPSGVDASAGQILGIAFEADITVTFGL PKVGLLLYPGADVSGEVIVKEIGFPNKAVEEIAPQMVSFTTDDLELLPERKAWTNKGSYG KVLLIAGSKNMAGAALLSGTAAYKSGSGLVRIFSCEENRVILQEKLPEAILTTYDSEEKA WELLPESLSWASVIGIGPGIGQSAFASKLVKQVLTLGKAPLVIDADGLNNLAALLQEDTE 
LRQLFHEYEGGMILTPHLKEMSRLTGEEIAEIRSNLPKAAASTADKGHVIVLKDARTIVS DGSVPSYINMSGNNGMATGGSGDVLTGIICGFLAGGLDILTAARLGVYCHGLAGDAAAKE KGYYSVLAGDLPNYLETILKRKHFPEEI >gi|222441842|gb|ACEP01000100.1| GENE 20 19200 - 19544 401 114 aa, chain + ## HITS:1 COG:BH0522 KEGG:ns NR:ns ## COG: BH0522 COG2337 # Protein_GI_number: 15613085 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus halodurans # 1 113 2 114 116 150 66.0 5e-37 MIKRGDIYYADLRPVVGSEQGGVRPVLIIQNDAGNRHSPTVICAAITSRMNKAKLPTHVE LSTTECDITKDSVVLLEQIRTIDKQRLKERVCHLDTNTLKRVNYALKISLELIT >gi|222441842|gb|ACEP01000100.1| GENE 21 19572 - 20741 1336 389 aa, chain + ## HITS:1 COG:BH0936 KEGG:ns NR:ns ## COG: BH0936 COG0436 # Protein_GI_number: 15613499 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 4 378 3 377 385 316 39.0 6e-86 MNFEPAEKMKNFEAGIFQILDEKKKELQKQGKKIYNFSVGTPDFETPECVMKAVSEACLH PENYHYSLGETDELLDALAARYKNRYNTDIAKNEIMSVYGSQEGMAHIFLPLINPGDIVL LPNPGYPIFSTGAFIAQSNQWFYPLTEENNYLPDFDAIPADVKKAAKVMVVSYPANPICK TAPDEFYEKLITFARENNILIIHDNAYSDIIYDGREGKSFLSFKGAKEVGVEFYSLSKSY NYTGARMSFVIGNAQVIEVFKRFRSQIDYGIFLPVQAGAVAALKSDDAAVQEQCKKYQER RDALCEGLREIGWNIPDSEGTMFAWGPLPDGYTNSNEFVMELMEKTGVICTPGSSFGSLG EGYVRFALVLPPEEIKKAIASIKESGILK >gi|222441842|gb|ACEP01000100.1| GENE 22 21030 - 21764 1112 244 aa, chain + ## HITS:1 COG:CAC2295 KEGG:ns NR:ns ## COG: CAC2295 COG0217 # Protein_GI_number: 15895562 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 244 1 242 246 243 56.0 3e-64 MSGHSKFANIKHKKEKNDAAKGKIFTVIGREIAVAVKEGGPDPANNSKLRDVIAKAKSNN MPNDTIERGIKKAAGDANSVNYVQNTYEGYGPNGIAIIVETLTDNKNRTAANVRSAFSKG RGNIGTSGCVSFMFQKKGQIIVSKEEYETDPDEFMMVALDAGAEDFVEEDDSYEITTDPD AFGEVREALEKEGVPMASAEVTMIPDNYVTLTDPEDIKNLNRTLDLLDEDDDVQAVYHNW EEEE >gi|222441842|gb|ACEP01000100.1| GENE 23 22553 - 23545 1134 330 aa, chain + ## HITS:1 COG:CAC0620 KEGG:ns NR:ns ## COG: CAC0620 
COG0715 # Protein_GI_number: 15893908 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Clostridium acetobutylicum # 22 330 20 331 338 274 43.0 1e-73 MRRYLIILMMPCFVMLFGICCLAGCEGSRHHLKKVRLNEVAHSIFYAPQYVAIEKGYFEK EGIKLELTTGFGADKTATAVISGDADIAFMGPEATIYQFNQGNADYLINFAQLTQRAGNF VVSRKKEDSFKWEDLKGKKVIGGRPGGMPEMVFEYVLKKHGMNPQKDINLVQNIDFANTS GAFVSGEADYTVEFEPAATLIEEQGAGYVVASVGKESGYVPYTAYSVKKSYLETHKELLE AFTRAIEKGQEYVNTHTPAQIAKVIAPQFKDTDRKTLEKIVERYAEQDSYKENTKFEKES FTLIQDILEEAGELEKRVDYETLVTTEFSE >gi|222441842|gb|ACEP01000100.1| GENE 24 23635 - 24414 752 259 aa, chain - ## HITS:1 COG:CAC3250 KEGG:ns NR:ns ## COG: CAC3250 COG0796 # Protein_GI_number: 15896495 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Clostridium acetobutylicum # 1 258 1 255 256 242 48.0 6e-64 MNQLSDPIAVFDSGMGGISVLREMTKLMPNEDFIYYGDSKHAPYGTKTLEEVRTLTIEHI TYLIKEKHAKAVAVACNTATSAAVRILRDMYPDLPLVGVEPAIKPAVLATGHAKVVVMAT PMTLREEKFHKLESLYDEQADIYPLPCPGLMEFVEQGILSGEKLETFLHNLLDPYKDKDI TGIVLGCTHYPFLKETIQKIAGPSVTIFDGGYGTAKELLRRLRVADLAQVDNTRKGSVTF LNSSDDPALIQRSKKLLNS >gi|222441842|gb|ACEP01000100.1| GENE 25 24829 - 25326 733 165 aa, chain + ## HITS:1 COG:BH3066 KEGG:ns NR:ns ## COG: BH3066 COG0622 # Protein_GI_number: 15615628 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Bacillus halodurans # 4 138 2 131 169 92 36.0 3e-19 MGQKYLVVSDNHGNMSNLEYVIERFRGKIEALIHCGDMEIPPEMLEELAGCPVYMAEGNC DYNFSRDKEDIFELGDHVAFVTHGDAYGVSWGEEELVSKAQEMGADVVFYGHTHCPAFHY YEKEGVTVFNPGSIALPRQMTPAGPTFLIIDLADDGRLTPELYTL >gi|222441842|gb|ACEP01000100.1| GENE 26 25473 - 26294 686 273 aa, chain + ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 5 271 6 276 290 191 38.0 
9e-49 MKRQKRVAVINDVTGFGRCSAAVAQPIISAMKIQCCVMPTAILSVHTGFPNYFLQDYTPY LRTYMNSWEETGIEFDGITTGFIGSKEQIGLVIEFFKRFKKENTLAVVDPVMGDYGKLYS SYTDDMCQEMKKLLPYADVLTPNLTEACRILDLDYHKVDLSEEGLVAICEALSDMGPSRI VITGLQQGDLIFNYIFEKNKTSELLSTRKIGGDRSGTGDVFSSIVTGALIQGQDFKTAVK RAVTFLDKAIAYTAQMELPWNYGICFEEYLEEI >gi|222441842|gb|ACEP01000100.1| GENE 27 26297 - 26965 740 222 aa, chain + ## HITS:1 COG:aq_2060_2 KEGG:ns NR:ns ## COG: aq_2060_2 COG0535 # Protein_GI_number: 15607030 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Aquifex aeolicus # 31 220 6 191 194 102 32.0 5e-22 MILYTVKNGVYYNLEDFKEDLSDLGENGFPNKIYINMTNRCSCSCTFCLRSLKEFNEHNS LWLKEEPSVELVKELFSKYDWSKVAEIIFCGFGEPTMRFNDVVEVGQWLKSIHPDIPLRI NTNGLSDLVFGEPTASRLAGIFDTVSISLNSSTAQKYLDVTRNRFGLASYDAMLSFAKAC QRYVPNVVMTVVDVIGEEEVAACQKVCDTHGLHLRVRPYEAN >gi|222441842|gb|ACEP01000100.1| GENE 28 27028 - 27972 1109 314 aa, chain + ## HITS:1 COG:L0018 KEGG:ns NR:ns ## COG: L0018 COG0039 # Protein_GI_number: 15672358 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Lactococcus lactis # 3 314 2 313 314 317 48.0 1e-86 MEKADKRKIVLIGTGMVGMSYAYALLNQNLCDELVLIDINKKRAEGEAMDLNHGVAFSGG NMEIYAGEYTDCCNADLVVLTAGLPQKEGQNRLDLLKENRKIFETILQSVLENGFHGIFL VATNPVDIMTRIVYEISDFPPEKVIGTGTALDTARLRYLLGEKFMIDPRNMHAYVMGEHG DSEFVPWSQAMMTTKPIFDLCGETEGCHFQELLELEEEVRMAAYKIIEAKKATYYGIGMA MARITKAIFGNEYSVLTVSAHLQGEYGENGIYIGIPCVVNRMGIQRIVELPLGSEEKQRL HSSCETLENTYREI >gi|222441842|gb|ACEP01000100.1| GENE 29 28245 - 28970 805 241 aa, chain + ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Clostridium acetobutylicum # 1 237 1 235 244 204 42.0 1e-52 MDYLLDVHTHTIASGHAYNTIMEMAKAGFDKGLKLLGITEHAPMMPGTCHAMYFHNLKVV PRTMCGIELMLGAELNILDYDGHIDLDTRVLKQLDLKIASLHSVCIQPGTRKENTQAVLG 
AVHNPLVDIIGHPDDGIYPLEYEPIVEAAKETNTLLEVNNNSLNPAGSRKHTRENLIAML ELCKEYKQPVIMNSDSHVFCDVARRDFSEKLIKEIDFPEELIVNRSVDVFKEYIHRKYEE K >gi|222441842|gb|ACEP01000100.1| GENE 30 29080 - 29373 386 97 aa, chain + ## HITS:1 COG:BH0639 KEGG:ns NR:ns ## COG: BH0639 COG4496 # Protein_GI_number: 15613202 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 9 95 10 96 100 102 57.0 1e-22 MGKSVHNEETKKLMEGILKLQSVDECYAFFEDLCTVNELLSLGQRFEVANMLRNHKTYNE VAEATGASTATISRVNRCLNYGTDGYQLVLGRMEEDK >gi|222441842|gb|ACEP01000100.1| GENE 31 29609 - 30229 447 206 aa, chain - ## HITS:1 COG:no KEGG:Closa_1554 NR:ns ## KEGG: Closa_1554 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 17 204 21 206 206 164 50.0 2e-39 MDNYKDWLYALLKTEPIPPEDIPDIGLYMDQVTTLMDTRLAASKRYPDDKILTKTMINNY TKNHLLPPSDKKKYSREHVFLLIMIYYLKSMLSISDIQSLLEPLSEKYFPKNEENGLTLS DIYQKIIDDNTANEEYISEDIIKTWERTRNSFAEHSPSTEESEYLDDFSFIYQLGYDIYI RKQIIEKLIDKRREENPPKKKGSKKK >gi|222441842|gb|ACEP01000100.1| GENE 32 30501 - 31523 1245 340 aa, chain + ## HITS:1 COG:BS_ydjI KEGG:ns NR:ns ## COG: BS_ydjI COG4260 # Protein_GI_number: 16077688 # Func_class: S Function unknown # Function: Putative virion core protein (lumpy skin disease virus) # Organism: Bacillus subtilis # 1 340 1 323 323 388 59.0 1e-108 MGLFKSQLANVVEWEEYRDDVIFYKWNNSEIKKGSRLIIRPGQDAIFLYNGVTEGVFEKE GNYDIESDIIPFLSTLKGFKFGFNSGIRAEVIFINTKEFTMKWGTKQAVNLPTEALPGGM PIRANGTFQFKVGDYQMLIDKVAGVKKQYTTDEVRERVLSMLNQLLLRWIVREGKDMFNI QANAVEISQGILGELNTEMQKIGLAATSFQIDSVSYPEEVQKMVTKVASQSMVGDMGRYQ QMAMIDAMTNTSNNGGGNNMATDMASMQMGMMMGQQMVNQMQSAMQQNNQNMQGQVQNAA PAPQNQAAGQASGTIPNFCPNCGQKTEGSKFCPNCGQKLV >gi|222441842|gb|ACEP01000100.1| GENE 33 31724 - 33940 2152 738 aa, chain + ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus 
N315 # 2 736 3 727 730 642 48.0 0 MSIYDTLNNEQREAVFCTEGPLLMLAGAGSGKTRSLTHRIAYLIEEKGVAPWNILAITFT NKAAQEMRERVDALVGYGSEDIWISTFHATCSRILRRHIDLLGYDRNFTIYDASDQKSLM KEVLKEMKIDTKQFLERSVMSEISSAKNEYKSPLDYRNEYGSNFRNQRIADIYEHYQKRL KENNALDFDDLLVKMVDLFQTNPDVLEHYQDRFQYIMVDEYQDTNTVQFLLVSLLAKKYR NLCVVGDDDQSIYKFRGANIYNILNFEKVFPDAQVIRLEQNYRSTQNILNAANGVIANNK GRKEKKLWTENQKGELVHFKQYDTEYDEADGVVSRINFLAMRGVQYKDMAILYRTNAQSR IFEEKLKQKNIPYAIVRGISFYDRKEIKDLMSYLKVVDSGMDDLSVKRIINVPKRGIGQT TINRLQEFAILNQMSFLDAVFNADEIPEVTRALAKLHKFADMIEEFREYASEHEISELLE HILDVTQYRAELEAEGTDESISRLEDIEELFNDIAYYEEEEENPNLRDFLAEKDMYTLNA GIDNLEDENNKVLLMTLHNAKGLEFNNVFLGGMEEGVFPGFGAMMSGDESEIEEERRLCY VGITRAKERLFLSAAKRRMLRGQTQYNRRSRFIDEIPGQYLDTEQRVSEQRVVKNTERPA KYQYGAKAGKPYNLSDFKVKPVGELDYQVGDRVKHIKFGVGTVQEITKGGRDFEVAVEFD RVGRKKMFASFAKLKKVK >gi|222441842|gb|ACEP01000100.1| GENE 34 34065 - 34373 513 102 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027983|ref|ZP_03717175.1| ## NR: gi|225027983|ref|ZP_03717175.1| hypothetical protein EUBHAL_02243 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02243 [Eubacterium hallii DSM 3353] # 1 57 1 57 102 64 100.0 3e-09 MKNIDELLGVAKLSELMSNDKKDESKKVLWVLAIIGAVAAVAGIAAVVYKYLSPDYLEDI DDDYDDFDDDFFDEDDEVEETAEEEAAKTEEEKKSEEATEEK >gi|222441842|gb|ACEP01000100.1| GENE 35 35928 - 37076 426 382 aa, chain + ## HITS:1 COG:SA0903 KEGG:ns NR:ns ## COG: SA0903 COG3594 # Protein_GI_number: 15926637 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Staphylococcus aureus N315 # 10 320 6 313 336 94 28.0 3e-19 MSEAVKKSTKRIYKWDNLKCFLIVMVVIGHFVNQYAPISNTMKSLSLFIYSFHMPLFIFL SGLLQKRWSQRCKFQWDKPLYYIMIGYALKVCIYGIKILFHQKAVFQWFEDTGIPWYMFA MAAFMVIAYLIKELPLWFVLPISILAACLAGYDENIGSFLYLSRIIVFFPFYYTGYCLDI KKLQEVLNKTWIKCLSAIFLTDMILYTLAKIEDNYSYIRLFTGRNAYSLINVESCGAVHR LMFYVIAFLMGIAIISLIPNCKVPVVGKVGARTLQIYFWHRLILYVLTFSGITGNLIKNV PSGWVWIYLAMAIILPFVLSAEVFSAPLTLLKNGENIILAGIEKLSINFYKWKESTLTVL PVLEFITIVLILIYYGEDVGIL >gi|222441842|gb|ACEP01000100.1| 
GENE 36 37035 - 37295 127 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225027985|ref|ZP_03717177.1| ## NR: gi|225027985|ref|ZP_03717177.1| hypothetical protein EUBHAL_02254 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02254 [Eubacterium hallii DSM 3353] # 1 86 1 86 86 146 100.0 5e-34 MIITIDGLAMKKKEDFYQNIKEELDVPEYFGNNLDALYDFLTEQPDILQMKFLHYNMMQS QLGRSFCQQLLKVLQDADILTIIDEY >gi|222441842|gb|ACEP01000100.1| GENE 37 37292 - 37897 389 201 aa, chain - ## HITS:1 COG:no KEGG:bpr_I2503 NR:ns ## KEGG: bpr_I2503 # Name: not_defined # Def: ribonuclease # Organism: B.proteoclasticus # Pathway: not_defined # 1 198 1 181 182 175 50.0 1e-42 MNLKKKLWRLLSVLSTLLLVILPISGCDYFFSAESNTETGTYTTLSETTDYTSDSATEQN KTSNIESGNTSTTEKKQIDETDGYLDKDGSYTSKEDIMNYLIEYGQLPNNFITKKEAKKL GWSGGSLEPYAPGKCIGGDYFGNYEKVLPVVSGRTYHECDIDTLNAKSRGAKRIIYSDDG QIYYTDNHYKSFTLLYGDDVQ >gi|222441842|gb|ACEP01000100.1| GENE 38 38260 - 38460 267 66 aa, chain - ## HITS:1 COG:BH3345 KEGG:ns NR:ns ## COG: BH3345 COG2155 # Protein_GI_number: 15615907 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus halodurans # 1 59 1 60 75 60 56.0 1e-09 MKWIDYTALTLAIIGAIVWGLIGIFQFNLVSFFFGENSWFSRLIYDLVGLSGLYLLTLFG RIGRDM >gi|222441842|gb|ACEP01000100.1| GENE 39 38597 - 39079 681 160 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027989|ref|ZP_03717181.1| ## NR: gi|225027989|ref|ZP_03717181.1| hypothetical protein EUBHAL_02258 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02258 [Eubacterium hallii DSM 3353] # 1 160 1 160 160 251 100.0 8e-66 MKKQKTYVAMLLAALVILSLAGCGMKNSKDNAEATTGWSVENPDNTTKDKKTTESTDAED NIEEDADGIMDNLGAGTFDTYEDAKNYLMDKLTADNEDMSYEFREETQDLTSYDSGNPGA EGYQFHVYESEGGKKTGDYYVDKDTGKVYRYTKDNKIMEY >gi|222441842|gb|ACEP01000100.1| GENE 40 39517 - 40482 773 321 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2931 NR:ns ## KEGG: Cphy_2931 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 22 315 1 305 306 192 37.0 1e-47 
MCYNYFHNKQIKEYYGKGEENMTFEAFQSYIKENVLKGWREDADIEMAVVRKNNGIELCG LYIRREEEQISPTIYLDEYYSYYLKGEALEEIITRIREEYEWKISRVADYHFNLEKFEYV RDRIVYRLVNYEKNKEILEDCPHLRLYDLALTFRWVAHSDDIGISTALVTNQELQVWGIS MNELLLAARENTPRLFPVHMIDMDEMIAQAGIPISLDESAIPMYIMTNEQEVNGASVLLY DNVLESFALEKKTDFYILPSSIHEVILVPSNKIDDPSALFTMVSDANNTVVALGDILSDS VYYYNRRKNQIVPVGKERKIV >gi|222441842|gb|ACEP01000100.1| GENE 41 40730 - 41845 1026 371 aa, chain + ## HITS:1 COG:BS_ytvI KEGG:ns NR:ns ## COG: BS_ytvI COG0628 # Protein_GI_number: 16079968 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 25 340 34 357 371 106 25.0 7e-23 METVIFFGIAVLLIIGIPKLMGFFWPFVASWILALLATPLCNFLEKHIKLNKKWSSAVII VLVLLALAGIGYLAVTKLGRELMGLFSGAPEYYVYVQRTIKSFGDSLTNIVSPISSDFGK QIQGFFTDFLSQMGGIINKFAPKGVEMMGSAAADLTNGFIGTVVMIISAYFFIADRERLT EGFWRIVPEDLKSTVRDIKDKVIAALGGFLLAQFKIMCIVFVILLAGFFLMGNPYALLLA LVISFVDLLPILGTGTILIPWTLFCFVQRDYRQAMFLIILYVVCLVARQFLQPKMIADSM GLDSMATLVLIYTGYKLNGMKGMILALLAGVIILSLYRLGLFDRKIKRMSRLLYEYRHCG ADCEDDSNAKL >gi|222441842|gb|ACEP01000100.1| GENE 42 42066 - 42272 280 68 aa, chain + ## HITS:1 COG:BH3610 KEGG:ns NR:ns ## COG: BH3610 COG1278 # Protein_GI_number: 15616172 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Bacillus halodurans # 3 66 2 65 65 82 60.0 1e-16 MFKGTVKWFNNQKGYGFIQDESGKDIFVHYTGLNMPGFKSLEEGNEVEFDIVQGEKGPQA SNVVKLQG >gi|222441842|gb|ACEP01000100.1| GENE 43 42541 - 43155 608 204 aa, chain + ## HITS:1 COG:ECs0676 KEGG:ns NR:ns ## COG: ECs0676 COG0406 # Protein_GI_number: 15829930 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Escherichia coli O157:H7 # 1 200 1 198 203 130 35.0 1e-30 MKLYLVRHGETALNEKGCYYGKTDAVLSVRGIEQAKYLQHIFKEVSFDYVVASPLVRAYN TAQIIIEERKQQIFGDSRLMEQDFGIFEGLTYKQLKGKYPQELEQWNKEFSTYRIPEGES FLDVRRRVEAFLKDIPSGEENKSEKMLITAHKGTLGHLLAAMLKLPPEGYWNFVFDQGCY SEVDLEDGYAIIRKLNVRDIPRDI >gi|222441842|gb|ACEP01000100.1| GENE 44 43274 - 44599 1245 441 aa, chain + ## HITS:1 
COG:lin1549 KEGG:ns NR:ns ## COG: lin1549 COG2256 # Protein_GI_number: 16800617 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Listeria innocua # 17 440 5 423 427 409 49.0 1e-114 MDLFEYMSMQRKKEESPLAVRMRPKNLDEVAGQQHIIGKDKLLYRAIKADKISSLIFYGP PGTGKTTLAKVIANTTSANFVQMNATTSGKKDMEQAVSQAKDAFGMYGKRTILFIDEIHR FNKAQQDYLLPFVEDGTVILIGATTENPYFEVNSALLSRSQIFHLEPLAESDIYRLVKTA VEDNERGMGAYGAVITEEAARFIAEMAGGDARRALNAVELGVLTTEPDDKGQLVIDLSVA EECIQRKSVNYDRDGDNHYDNISAFIKSMRGTDPDAAVFYLARMLDAGEDPKFIARRIMI CASEDVGNADPQALVVAVAASQGVERIGMPEARILLSQAAAYVASAPKSNACIMAVDKAL EMVRSQNTGQVPPYLRDAHYGGAKKLGHGIGYKYAHDYPEHYVKQQYLPDELKEERFYVP TENGYEKKIKAHLKHLREREE >gi|222441842|gb|ACEP01000100.1| GENE 45 44643 - 44837 106 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027996|ref|ZP_03717188.1| ## NR: gi|225027996|ref|ZP_03717188.1| hypothetical protein EUBHAL_02265 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02265 [Eubacterium hallii DSM 3353] # 1 64 1 64 64 77 100.0 5e-13 MKKIKRILALLLVIFLIAMVFVTLYCAVTGSPYFMASLFVMLGLPLLIYAYMFIYRIVKD KDEK >gi|222441842|gb|ACEP01000100.1| GENE 46 44842 - 44970 64 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNDIFKNLLDNYLLISYHINNNNYYYYKMVTKGNYLYMDTIV >gi|222441842|gb|ACEP01000100.1| GENE 47 44987 - 45529 689 180 aa, chain + ## HITS:1 COG:FN0455 KEGG:ns NR:ns ## COG: FN0455 COG1592 # Protein_GI_number: 19703790 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Fusobacterium nucleatum # 3 180 2 179 179 193 57.0 1e-49 MPELKGTKTEQNLMTAFAGESQARNKYTYFASKAKKEGYVQIASIFEETANNEKEHAKMW FKLLEGGAIRSTIENLQAAADGENYEWTDMYATFAKEAREEGFDDIAEKFEGVGAIEKEH EARYLKLLDNVKKEIVFSRDGDTIWQCANCGHICIGKKAPDVCPVCDHPQAYFQIKAENF >gi|222441842|gb|ACEP01000100.1| GENE 48 45924 - 46280 262 118 aa, chain + ## HITS:1 COG:CAC1753 KEGG:ns NR:ns ## COG: CAC1753 COG2739 # Protein_GI_number: 15895030 # Func_class: S Function unknown # Function: Uncharacterized protein 
conserved in bacteria # Organism: Clostridium acetobutylicum # 1 102 1 106 116 60 37.0 6e-10 MDRIVEKSLLFDFYGELLTQHQKEVYGEYIQNDLSPTEIAVLRGISRQGAYDLIKRCEKI LTDYENRLHLVEHFQKVKKTVASIHTCAEEIAECDDKSIVREKIAQIAELSSNILEEY >gi|222441842|gb|ACEP01000100.1| GENE 49 46299 - 47654 1729 451 aa, chain + ## HITS:1 COG:BH2484 KEGG:ns NR:ns ## COG: BH2484 COG0541 # Protein_GI_number: 15615047 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus halodurans # 1 431 1 433 451 458 56.0 1e-129 MAFESLTDKLQNVFKNLRSKGRLTEADVKEAMKEVKRALLEADVNFRVVKSFVKTVTDRA IGQDVLTGLNPGQMVIKIVKEEMESLMGSEMTELVIKQGNEITVFLMAGLQGAGKTTTCA KIAGQLKKKGKKPLLVACDVYRPAAIKQLEVNGAKQGVPVFTMGDKQNPVDIAKASIEHA KENDNNIVILDTAGRLHVDESMMEELQNIKANVDVTQTVLVVDAMTGQDAVNVAKTFEEK VGIDGVILTKLDGDTRGGAALSIRAVTGKPILYVGMGEKLEDLQQFYPDRMTSRILGMGD VFTLIEKAENAVDEEEARKLEQKLRKAEFGFDDFLSQMQQIKKMGGLTDLLSMIPGVGSQ LKNVDIDDSAMDSIEAIIYSMTPEERANPNILNPSRKKRIANGAGVKLQDVNKLCKQFEQ SRKMMKQMNGMMGKGGRRKGGFNLGGMKLPF >gi|222441842|gb|ACEP01000100.1| GENE 50 47744 - 47989 308 81 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916983|ref|YP_002930500.1| small subunit ribosomal protein S16 [Eubacterium eligens ATCC 27750] # 1 81 1 81 81 123 70 3e-27 MAVKMRLKRMGKKRNPIYRIVVADARSPRDGRNIDEIGLYDPNQEPCVVKIDEEAAKDWL SKGAQPTDTVARLLKNAGIEK >gi|222441842|gb|ACEP01000100.1| GENE 51 48008 - 48238 434 76 aa, chain + ## HITS:1 COG:CAC1756 KEGG:ns NR:ns ## COG: CAC1756 COG1837 # Protein_GI_number: 15895033 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Clostridium acetobutylicum # 1 74 1 74 75 75 68.0 3e-14 MKELVKVIAVSLVDHPDEIVVTEKETENSIILELHVAPDDMGKVIGKQGRIAKAIRTVVK AAASRDDKKVIVDIAQ >gi|222441842|gb|ACEP01000100.1| GENE 52 48333 - 48842 183 169 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163796730|ref|ZP_02190688.1| 50S ribosomal protein L19 [alpha proteobacterium BAL199] # 1 154 1 152 179 75 31 9e-13 
MEDLFQVGIITGTHGLKGEVKVFPTTDDKERFLDLDTVLIDTGKELIEKKVQYCKFFKQF VFVKFEGLDDINDVEKYKRCPLKVTRENAVPLEEDEYYVADLLGLTIVDESGVTIGELID VIETGANDVYEVKTPDGGHVLLPAIKDCILDVDMEEKIILVHMLKGLVD >gi|222441842|gb|ACEP01000100.1| GENE 53 48842 - 49588 703 248 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 242 1 242 246 296 58.0 3e-80 MNFHVLTLFPDMIEAGLHTSVTGRALKNGYIHLSAVNIRDYSKDKHKHVDDYPYGGGAGM VMQPEPIYLAYEDVTQNMEKKPRVVYVTPQGSVFNQSMAEEFSKEEDLIFLCGHYEGVDE RILEEIVTDYVSIGDYVLTGGELPAMVMIDAVSRLVPGVLNNEESAEFESFHDNLLEHPQ YTRPVEFRGRKVPDVLLSGHHGNIDKWRREQSLKRTLERRPDLIETAILSKEDEKYLKKL KKGLHESE >gi|222441842|gb|ACEP01000100.1| GENE 54 49676 - 50023 460 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880535|ref|YP_001559503.1| ribosomal protein L19 [Clostridium phytofermentans ISDg] # 1 115 1 115 115 181 79 6e-45 MKDIIKSIEDAQLKPEVDSFRVGDTVRVSAKVVEGSRERIQVFEGTVIKRQNGGAKETFT VRKFSNGVGVEKTWPLHSPIVTKIQVVRKGKVRRAKLYYLRDRIGKRAKVKELVK >gi|222441842|gb|ACEP01000100.1| GENE 55 50100 - 50645 536 181 aa, chain + ## HITS:1 COG:BS_sipT KEGG:ns NR:ns ## COG: BS_sipT COG0681 # Protein_GI_number: 16078505 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus subtilis # 20 169 23 177 193 92 39.0 4e-19 MKNYRYLERRNMMIRRIGLWVGEVVAAILLAFLVIQFCFQTVTVHGDSMQPAYYDGDTVL VNKLDYRIGSPKRLDAVILELENGSTTHYSVKRVVGLPGETIKIENGKIYINNKELKGFS EEDILSAGLAAYDVELGEDEYFVMGDNCNNSEDSRVSNIGNIKRSQFVGKIAGTLHKVTR Q >gi|222441842|gb|ACEP01000100.1| GENE 56 50766 - 51614 1114 282 aa, chain + ## HITS:1 COG:BS_ylqF KEGG:ns NR:ns ## COG: BS_ylqF COG1161 # Protein_GI_number: 16078668 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Bacillus subtilis # 1 279 1 280 282 281 49.0 7e-76 MHFQWYPGHMTKARRMMQENIKLIDIVVELVDARVPFSSKNPDIDELAKNKYRLIVLNKA 
DMADAKTTAAWQNYFEAKGFFVAKVNSQKGAGMKEVKSLIEKACQEKKARDKKRGILNRP LRAMVVGIPNVGKSTFINSFAGKACAKTGNKPGVTKGKQWIKINKNVELLDTPGILWPKF EDQAVGLRLALIGSIKEEILNTTEMAMEFIGFLKKSYPGVLVERYSVDESKENLEILEEI ARIRGCLMKGNQTDIDKAGRLLLEDFRSLRLGKLSLEVPDAQ >gi|222441842|gb|ACEP01000100.1| GENE 57 51604 - 52197 578 197 aa, chain + ## HITS:1 COG:alr2304 KEGG:ns NR:ns ## COG: alr2304 COG0681 # Protein_GI_number: 17229796 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. PCC 7120 # 5 196 2 206 215 109 32.0 3e-24 MRSKEREAEKKNASEKKEKSTAQNILGTIAYIIGVCVFVFLILHFVGQRTVVNGSSMDTT LANGQNLVMDKLSYRFHDPERYDIIIFPGPEEFGQHPYYIKRIIGMPGETVQIKDGKVYI NDKELKSDVYGITDYIDYPGIAEEPITLGDDEYFCLGDNRPVSQDSRYKEVGPVKRSIIV GKVWIRIWPLTKFGKVS >gi|222441842|gb|ACEP01000100.1| GENE 58 52218 - 52829 495 203 aa, chain + ## HITS:1 COG:alr2304 KEGG:ns NR:ns ## COG: alr2304 COG0681 # Protein_GI_number: 17229796 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Nostoc sp. 
PCC 7120 # 48 202 36 206 215 108 39.0 5e-24 MGRRTKKNPEVTLGRRQRRRKEQRRKNVREIISTILYLAILCAAVYFILHYVGQRTVVRG DSMDTTLSDGQNLIMDKLSYHFRDPERYDIVIFPGPEEYGEHPYYIKRVIGLPGETVQIK KGKVYINGKKLKSDIYGITKYIDEPGIAEEPLELGKDEYFCLGDNRPVSYDSRYEEVGPV HRSEIIGKVWIRIWPLTKFGKVS >gi|222441842|gb|ACEP01000100.1| GENE 59 52832 - 53587 829 251 aa, chain + ## HITS:1 COG:SP1156 KEGG:ns NR:ns ## COG: SP1156 COG0164 # Protein_GI_number: 15901021 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Streptococcus pneumoniae TIGR4 # 1 246 1 246 259 206 48.0 2e-53 MKSISMIKEEFKNLTPEQITAQIALYQEDERKGVQNFLLSQQKKVDKYYQEIERIENLCQ YEKEYSQYDFICGIDEVGRGPLAGPVVAGAVVLPKGSRILYINDSKKLSAKKREELFDII KEEAVSVGIGMASPERIDEINILQATYEAMRQAIANLSVKPELLLNDAVMIPGVDIKQVP IIKGDAKSISIGAASIMAKVYRDHMMEEYDKVFPGYDFASNKGYGSKEHIEALHRLGPCP IHRCSFLKNIL >gi|222441842|gb|ACEP01000100.1| GENE 60 53607 - 53960 278 117 aa, chain + ## HITS:1 COG:CAC1763 KEGG:ns NR:ns ## COG: CAC1763 COG0792 # Protein_GI_number: 15895040 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Clostridium acetobutylicum # 1 114 4 121 122 75 40.0 2e-14 MRKNSGGAAEEAAVLFLEGKGIRILERNFRSYHGEIDIIALEQEMILVVEVKMRSYGDCG TAAEAVDFRKQKRICYTFNYYRMQRRLAENTAVRFDVIEVDKDFRCHWIQNAFEFQG >gi|222441842|gb|ACEP01000100.1| GENE 61 53970 - 54818 809 282 aa, chain + ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 37 280 30 273 278 164 35.0 2e-40 MEENMDFVYANENHSTIINEKNGVVFLTFPKLVKAGVCHGFSTRIGGVSEGHLSSMNLSF TRGDDPEKVQENFRRIGAAIGFDAKDLVLSSQIHETEIRKVTAKDKGDGIVRETVPGIDA LVTDEAGIPMYTSYADCVPLLFYDPEHHVAGLAHSGWRGTVAKIGAKFVAYIQREYGSKP ENIIAAIGPSICRSCYEVGEEVAEEFKQVFLPEEIPLIMDDKGNGKYQLDLWKANELILT QAGIRKENLDITDICTCCNSDKLFSHRASHGKRGNLGCFMCL >gi|222441842|gb|ACEP01000100.1| GENE 62 55010 - 56344 1336 444 aa, 
chain + ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 6 434 3 430 437 400 48.0 1e-111 MKKHLHKIAWVEPGSIAEELGLEPGDAIEEINGNEIEDIFDYQYYVEDEYLDVLILTKDG EECVLEIEKDEDEDLGITFESSLMDEYHSCCNKCMFCFIDQMPPGMRDTLYFKDDDSRLS FLQGNYITLTNMRDKDIERVIKYHLSPINISVHTTNPELRCKMLHNRFAGDVLDKIGRFY EAGIRMNSQVVLCQGLNDEEELDRTISDLGKFIPHMESLSVVPVGLTKYREGLAPLKLFG KEDAKKVLKQIHKWQDYFRENYGTTFVHASDEWFILAGQEFPDEAYYEGYGQLENGVGMM RLLLEEVKERLEELSGDDREKTVSIATAKLAFPTIEKLARDVEKKFPHIKVHVYCIKNEF FGEHITVSGLLTGGDIIAQLKDRELGEELLLPCNVLKADEDIFLDDISLKGLSDSLQVPV NIIQSEGRDFVDKIIAIENGGNYE >gi|222441842|gb|ACEP01000100.1| GENE 63 56337 - 57662 1750 441 aa, chain + ## HITS:1 COG:CAC1711 KEGG:ns NR:ns ## COG: CAC1711 COG1160 # Protein_GI_number: 15894988 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 439 1 437 438 508 58.0 1e-143 MSKPVVAVVGRPNVGKSTLFNALAGSRISIVEDTPGVTRDRIYAEVSWLDYQFTLIDTGG IEPESDDIIISRMREQAETAIMTADVILFLVDVRQGLVDADFKVADMLRRSHRPVVLVVN KVDNFDKFMPDVYEFYNLGLGDPIPVSAQGKLGIGDMLDQVVEKFEAPVQEEEEEDDRPK IAVIGKPNAGKSSIINKLLGEERVIVSPVAGTTRDAIDTTVKRNGQEYVFIDTAGLRRKS KIKEELERYSIIRTVTAVERCDVAVLIIDATEGITEQDAKIAGIAHERGKGMIIAVNKWD LVEKDNTTMKKFTEKIREKLSYMPYAELIFLSAKTGQRLPKLFEMIDAVIENCALRVQTG VLNEILTEAMAMKQPPSDKGKRLRIYYITQVSVKPPTFVMFINEKKLTHFSYTRYIENQI RTTFGFRGTPIHFIYRERKEK >gi|222441842|gb|ACEP01000100.1| GENE 64 57663 - 58298 701 211 aa, chain + ## HITS:1 COG:all0492 KEGG:ns NR:ns ## COG: all0492 COG0344 # Protein_GI_number: 17227988 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Nostoc sp. 
PCC 7120 # 10 207 11 218 226 127 41.0 1e-29 MNNAVYYVLVIVAGYFCGCIESGYLVGKLCGMDIRDYGSGNSGTTNVLRVLGKTRALLTF LGDALKSFIPVLVVKFLLCPAVPTLNADLIQLVTGFAAVMGHDYPVFLKFKGGKGIATTA AAMMAFDWRIGLCCFIIFVVVTAATRYVSLASILLSAGIVVEVLIFHPGRWDLFAMCVLY AFFAIYRHRANIGRLLSGTESKIGQKVEVKK >gi|222441842|gb|ACEP01000100.1| GENE 65 58919 - 60313 1728 464 aa, chain + ## HITS:1 COG:CAC0970 KEGG:ns NR:ns ## COG: CAC0970 COG0119 # Protein_GI_number: 15894257 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Clostridium acetobutylicum # 26 450 6 429 448 576 61.0 1e-164 MEMDNRSVRLNKKSNLLKLDEYMYPLVDTDKPNVFRNLFPYEEVPKIAFNDRIVPHEMPE DIWITDTTFRDGQQSRTPYSTDQIVHIYDAFHRLGGPKGIIRQCEFFLYSKKDRDAVYKC MEKGYKFPEITSWIRASKKDFELVKEIGLKETGILVSCSDYHIFYKMKMTRKQALEHYLG VVRECLETGISPRCHLEDITRSDIYGFVIPFCLELMKLKDEYGIPVKIRACDTMGYGVNF PGAVIPRSVPGIIYGLKEHAGVPSELLEWHGHNDFYKAVNNSTTAWLYGASAVNSALFGI GERTGNTPLEAMVFEYAQLRGTLDGMDTTVITELAEYYEKEIGYHIPERTPFVGKNFNVT RAGIHADGLLKNEEIYNIFDTDKLLNRPVLVAVSNTSGLAGIAHWMNTYYRLKGDKAVDK KSPLVEEVKKWVDKEYETGRVTAITDSELEKVIKESSEKLGINF Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:01:00 2011 Seq name: gi|222441841|gb|ACEP01000101.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont359.1, whole genome shotgun sequence Length of sequence - 48908 bp Number of predicted genes - 39, with homology - 37 Number of transcription units - 21, operones - 9 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 178 - 237 8.4 1 1 Tu 1 . + CDS 377 - 898 653 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Prom 1061 - 1120 4.7 2 2 Tu 1 . + CDS 1199 - 3283 1428 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily + Term 3290 - 3341 3.1 - Term 3518 - 3562 7.6 3 3 Tu 1 . - CDS 3589 - 3981 100 ## gi|225028019|ref|ZP_03717211.1| hypothetical protein EUBHAL_02288 - Prom 4060 - 4119 9.1 + Prom 3873 - 3932 6.4 4 4 Tu 1 . 
+ CDS 4015 - 4095 75 ## + Term 4141 - 4202 4.3 5 5 Tu 1 . - CDS 4265 - 5014 963 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 5035 - 5094 9.8 + Prom 5226 - 5285 11.5 6 6 Op 1 . + CDS 5344 - 6492 1214 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB 7 6 Op 2 3/0.000 + CDS 6511 - 9993 3608 ## COG0587 DNA polymerase III, alpha subunit 8 6 Op 3 . + CDS 10050 - 11042 1463 ## COG0205 6-phosphofructokinase + Term 11080 - 11150 8.2 9 7 Tu 1 . - CDS 11243 - 12343 1077 ## COG1454 Alcohol dehydrogenase, class IV - Prom 12584 - 12643 6.5 + Prom 12518 - 12577 4.1 10 8 Tu 1 . + CDS 12677 - 13333 813 ## COG0036 Pentose-5-phosphate-3-epimerase + Prom 13488 - 13547 6.6 11 9 Op 1 . + CDS 13632 - 14462 1123 ## COG0489 ATPases involved in chromosome partitioning 12 9 Op 2 . + CDS 14486 - 16519 1274 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 13 9 Op 3 . + CDS 16523 - 18412 1734 ## COG0322 Nuclease subunit of the excinuclease complex + Prom 18760 - 18819 2.9 14 10 Tu 1 . + CDS 18862 - 19806 1167 ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism + Term 19936 - 19996 1.0 + Prom 20102 - 20161 7.0 15 11 Op 1 1/0.000 + CDS 20232 - 21161 910 ## COG0812 UDP-N-acetylmuramate dehydrogenase 16 11 Op 2 . + CDS 21182 - 22033 699 ## COG1660 Predicted P-loop-containing kinase 17 11 Op 3 2/0.000 + CDS 22062 - 23015 766 ## COG1481 Uncharacterized protein conserved in bacteria + Term 23020 - 23072 -0.9 18 11 Op 4 . + CDS 23096 - 23359 400 ## COG1925 Phosphotransferase system, HPr-related proteins - Term 23596 - 23634 5.1 19 12 Tu 1 . 
- CDS 23703 - 24536 562 ## gi|225028036|ref|ZP_03717228.1| hypothetical protein EUBHAL_02305 - Prom 24688 - 24747 9.3 + Prom 24600 - 24659 11.3 20 13 Op 1 1/0.000 + CDS 24862 - 25830 813 ## COG1879 ABC-type sugar transport system, periplasmic component 21 13 Op 2 19/0.000 + CDS 25849 - 27225 1246 ## COG4585 Signal transduction histidine kinase + Prom 27240 - 27299 6.4 22 13 Op 3 1/0.000 + CDS 27351 - 28022 848 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 23 13 Op 4 . + CDS 28077 - 29390 973 ## COG1653 ABC-type sugar transport system, periplasmic component + Term 29448 - 29485 -0.8 + Prom 29422 - 29481 7.3 24 14 Op 1 19/0.000 + CDS 29599 - 30507 1123 ## COG1105 Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) + Prom 30521 - 30580 7.5 25 14 Op 2 . + CDS 30671 - 32617 2515 ## COG1299 Phosphotransferase system, fructose-specific IIC component + Term 32810 - 32874 19.0 + Prom 32851 - 32910 8.3 26 15 Op 1 . + CDS 33117 - 35021 1233 ## COG0296 1,4-alpha-glucan branching enzyme + Prom 35080 - 35139 5.6 27 15 Op 2 . + CDS 35180 - 35593 527 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Term 35620 - 35676 -0.1 + Prom 35694 - 35753 7.5 28 16 Tu 1 . + CDS 35785 - 36369 460 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 36383 - 36448 3.4 + Prom 36378 - 36437 9.1 29 17 Tu 1 . + CDS 36497 - 36592 60 ## + Prom 36943 - 37002 5.8 30 18 Op 1 1/0.000 + CDS 37030 - 38343 1720 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase 31 18 Op 2 . 
+ CDS 38365 - 39552 1691 ## COG0192 S-adenosylmethionine synthetase + Prom 40004 - 40063 6.3 32 19 Op 1 1/0.000 + CDS 40183 - 41031 608 ## COG5495 Uncharacterized conserved protein + Term 41057 - 41101 3.4 33 19 Op 2 19/0.000 + CDS 41149 - 41976 1174 ## COG0413 Ketopantoate hydroxymethyltransferase + Term 41995 - 42030 -0.8 + Prom 42148 - 42207 7.5 34 19 Op 3 12/0.000 + CDS 42239 - 43081 1178 ## COG0414 Panthothenate synthetase + Term 43155 - 43203 -0.9 + Prom 43153 - 43212 5.1 35 19 Op 4 . + CDS 43244 - 43645 591 ## COG0853 Aspartate 1-decarboxylase 36 19 Op 5 . + CDS 43649 - 44839 1247 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase + Term 44959 - 45002 0.8 + Prom 45391 - 45450 7.7 37 20 Op 1 1/0.000 + CDS 45624 - 46361 540 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 38 20 Op 2 . + CDS 46432 - 47145 811 ## COG0564 Pseudouridylate synthases, 23S RNA-specific + Term 47150 - 47193 6.1 + Prom 47207 - 47266 6.3 39 21 Tu 1 . 
+ CDS 47372 - 48808 1823 ## COG0442 Prolyl-tRNA synthetase Predicted protein(s) >gi|222441841|gb|ACEP01000101.1| GENE 1 377 - 898 653 173 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 173 1 174 174 225 66.0 3e-59 MANPIVTITMENGDVIKAELYPQIAPNTVNNFISLVKKGFYDGLIFHRVIKGFMIQGGCP LGNGTGNPGYSIKGEFSNNGFTNQLKHEAGVLSMARSMMPNSAGSQFFIMHKAAPHLDGE YAAFGKVIEGMDVVDKIASVKTFMDSPYEKQVMASVTVDTQGEEYPEPEKIEK >gi|222441841|gb|ACEP01000101.1| GENE 2 1199 - 3283 1428 694 aa, chain + ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Sinorhizobium meliloti # 203 690 153 628 639 162 26.0 3e-39 MNKLKKKKAGIKDFFKGKHGRNFLLALDVLLAIAFFAQPDLYYNPQAPDFFDRFYADSLI ICGGLWAVLVFLTVKKIHFSAEVNRILTYIAGIATPFIAFLWLEFYNDAQFWVPIFSIPF LYLVLDIIVYYVIYVLFLLIFNSIRAASICMVVVTAVFGIFNYELTLFRSMSFIASDIYS FVTAVSVANTYQVQIDVDTAEFFMMALVLVALLLKLDKVKLFKWKGRIVYAIVSCMIFAG FTQVYVYSDYLEDIGVDFRVYRPQYKYRYYGTLLTTMRTFGYLHVTQPEEYSVNAVKKIT KQYTENESTETQEKNTSTQNKTTKKPNVIAIMNESFADLKAVGDLQTSKDYMPFFRKLKE NAIKGYTYSSVFGGNTANSEFEFMTGNTLAFLPDNSVPYQLFLRSKTAGLTYTLKDQGYS PCYALHPFYKTGYGRYKVYPLMGFDKFYTSDNFSVFTDTVNYHITDSEDYKKLISLYENR TDKDKPFYLFNVTMQNHGSYDGSTLETGDEVQIEGDLQSYSKAEQYLNMIKMSDKALKEI VHYFEKVDEPTVIVFFGDHQPDLEETFYNRLLHTDIQKLEGEDLEQLYKVPFLIWANYDI KEENVEKTSNNYLSTYLAEVAGIKKTGYLEYLTKLREEIPAINAIGYWDKNGKFYEIDDK KSPYYDLIHEYNLLEYNNLFGKDNQQKEFFYLKE >gi|222441841|gb|ACEP01000101.1| GENE 3 3589 - 3981 100 130 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028019|ref|ZP_03717211.1| ## NR: gi|225028019|ref|ZP_03717211.1| hypothetical protein EUBHAL_02288 [Eubacterium hallii DSM 3353] hypothetical protein 
EUBHAL_02288 [Eubacterium hallii DSM 3353] # 1 130 1 130 130 251 100.0 1e-65 MLKSYDKVLDNAAWIKQATIYKETTVTTKAKIFYFHGGGLLYGFRKDLPEKHISVITQAG YEIISFDYPLAPAADLEQIVPDICDSAWRSVPSVTCLIPKRFSLTFPLLVILLERILYAV SHIFFKIFII >gi|222441841|gb|ACEP01000101.1| GENE 4 4015 - 4095 75 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVGMLAANKLGELDRDVYNEKKTIAY >gi|222441841|gb|ACEP01000101.1| GENE 5 4265 - 5014 963 249 aa, chain - ## HITS:1 COG:L0152 KEGG:ns NR:ns ## COG: L0152 COG1349 # Protein_GI_number: 15672771 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Lactococcus lactis # 4 248 3 247 247 236 53.0 3e-62 MTERQSKLIKLVNLYQKIEVSRLAELLDVSQVTIRKDLDHLEEEGLLSREHGYALIKNAN DINTRLTINYDKKLEIATKAAEMVSNGETVMLESGSTCALLAEQLAKLKKDVTIITNSAY IAIRIRELPIRKVILLGGEYQKEYQGMVGPLVRKCAKEFYVDKFFVGTDGFIPDAGFTCD DLMRVETMKYMEDSANRMIILADSSKFSQKGVVIQTTFSEIDTVCTDAEIPEDALENLKR HNINVEIVE >gi|222441841|gb|ACEP01000101.1| GENE 6 5344 - 6492 1214 382 aa, chain + ## HITS:1 COG:ECs5298 KEGG:ns NR:ns ## COG: ECs5298 COG1775 # Protein_GI_number: 15834552 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli O157:H7 # 1 382 8 390 390 439 53.0 1e-123 MELIKALPEIFEDFAEQRKKSFLTMKELKDRDVPVVGAYCTYFPQEIAMAMGAVTVGLCS TSDETIPVAEKDLPRNLCPMVKASYGFAVSDKCPFFYFSDVVVGETTCDGKKKMYELMGE FKNVYVMELPNSQSETALKLWKEEILKFKSYLEQTFKVEITEENVRKAVHMMNENRIALK NLYEVMKNDPAPMNGQELFNVLYGSQFRFDKEKVPEEIHALREKIMKEYGEGEKQPKKKR ILLTGCPSSGAPMKVVKALEENGAVVVAYENCGGTKSADRLIDESAEDIYEAIAERYLQI GCSVMTPNKNRYELLGRLIEEYKVDGVVEMTLQACHTYNVEAKSIEKFVKGKGVPYIHVE TDYSQEDIGQLNTRMTAFVEML >gi|222441841|gb|ACEP01000101.1| GENE 7 6511 - 9993 3608 1160 aa, chain + ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium 
acetobutylicum # 3 1148 11 1166 1167 1169 52.0 0 MAFTHLHVHTEYSLLDGSSKIKELLPRAKELGMDSLAITDHGVMYGVIDFYKKAKEVGIK PILGCEIYVAPGSRFDREQGRGEDRYYHLVLLAENNQGYKNLMKIVTRGFTEGYYYKPRV DYEVLEKYHEGIIALSACLAGEIPNKILKEDFDGARAAANKMRDIFGENNFFLELQDHGI RQQTQVNTSLIRLSRELGIPMVVTNDVHYIREEDAVPHDLLLCIQTGKKVSDQDRMRYEG GQYYLKSEEEMQKVFPYAREAMDNTHKIAERCNVEIVFGEQKVPKFDVPEGYDAFSYLKE LCETGLVKRYGDPPPQELQERLSYELSTIKNMGYVDYFLIVWDFIRFAKSQGIAVGPGRG SAAGSIVSYCLEITNIDPIRYQLIFERFLNPERVSMPDIDIDFCYERRQEVIDYVYEKYG KDKVVQIVTFGTMAARMAVRDVGRVLDIPYVQVDNVAKMIPMELGITIEKALKSNPELKQ AYESDDVVKNLIDMSMRLEGLPRHTSIHAAGVVIGSKPLDEFVPLSRGADNVITTQFTMT TIEELGLLKMDFLGLRTLTVIQDAARMIGKKLGHDFSIDTIDYDDAKVYEMIGQGKTEGV FQLESAGMKSFMKELKPTNLEDVIAGISLYRPGPMDFIPKYIKGKQNQDSIQYTCPELEE ILAPTYGCIVYQEQVMQIVMRLAGYTLGRSDLVRRAMSKKKGDVMARERQNFVYGNEEEG VEGCIKRGIPEETANKIFDEMIDFAKYAFNKSHAAAYAVVSYQTAWLRCYYPVEFMAALL TSVITNPKKITEYINTCRVMGISILPPDINEGEVGFSVAGDSIRYGLAAIKSLGKSVIDV MTQEREANGKYKDLKDFMGRLTSKEINKRTIENLIKSGALDSFGKTRKQQMLVYPVVLEQ VNREKKESMSGQMSLFDFFSEEEKKEYEMQYPDVGEYDDAQKLALEKDVLGIYVSGHPLE KYMDSIEKQTTARSTDFEPDEESGRAIVRDGQHYVVGGLISNITAKLTKNNQNMAFVTLE DLYGTVEIIVFPTIYQNVKPYLIEDNGLYVKGRASVSEESGKLIAEYIVPIDQIPKEVWI QTENIGEFTDKQQDLYKIIRKYPGKDEIVIFSKKEKAIKRLPTYENISAKNDVFSELKSL FGEKNVKVREKSIEKSQKKR >gi|222441841|gb|ACEP01000101.1| GENE 8 10050 - 11042 1463 330 aa, chain + ## HITS:1 COG:CAC0517 KEGG:ns NR:ns ## COG: CAC0517 COG0205 # Protein_GI_number: 15893808 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Clostridium acetobutylicum # 7 330 1 319 319 354 58.0 2e-97 MNKTSAVQTIGILTSGGDAPGMNAALRAVARTAWARGIDVKGIYRGYSGLLNDEIFDMEK EFTCDIISRGGTALFTARCEEFKQLEYQEKAAEILRSHNIDGLVVIGGDGSFRGAQKLSA QGINTVALPGTIDLDIACTEYTIGFDTAVNTAMQAIDKIRDSSTSHERCSIIEVMGRNAG YIALWCGIASGAEDILIPERFDGNRPKLENELIESILEKKALGKKHHLIINAEGVGHSYG MAKRIEQATGVETRASILGHMQRGGNPTAKDRVYASYMGALAVDLLCSGATNRVVGYRNG QYIDYDIEDALGMAKDVDQYEFEIAQILGR >gi|222441841|gb|ACEP01000101.1| GENE 9 11243 - 12343 1077 366 aa, chain - ## HITS:1 COG:TM0920 KEGG:ns NR:ns 
## COG: TM0920 COG1454 # Protein_GI_number: 15643682 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Thermotoga maritima # 3 365 4 359 359 222 36.0 9e-58 MHFFDPTNTYAEKNCVKNHKKELLALGTRAFVITGHSSSKKNGSLDDVIAVLEEASVPYQ IFNEIEENPSVETVVRAAEIGKEFKADFVIGIGGGSPLDASKAIALLIANPEETGDCFYV AKDLKTLPIAAVPTTCGTGSEVTPVSVLTRHDTQTKKSISYKIFPDIALIDGKYLLSASK NLLINTCIDALAHAAESRVSIQTNIYNQMFANYALKLWGDIVPFLRSDEELSEELAEKLM LTSTIAGMAIAQTGTSIPHALSYDVTYHNGVAHGKACGIFLAAYLRVYAEQKPEDVEEIL TLLGFKTLDDFASYLTEILGTVSLTQKEVTFYIDRMMENTSKLATCPFELTREDIEKIYR ESIVVE >gi|222441841|gb|ACEP01000101.1| GENE 10 12677 - 13333 813 218 aa, chain + ## HITS:1 COG:CAC1730 KEGG:ns NR:ns ## COG: CAC1730 COG0036 # Protein_GI_number: 15895007 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Clostridium acetobutylicum # 5 210 4 210 216 232 56.0 3e-61 MKKILAPSILSADFANLGRDVLTAIDAGAEYVHIDVMDGMFVPQISFGNPVIKALRPLTD VIFDAHLMIEDPGRYIEDFAKAGADIITVHAEACTHLDRVVHQIKDCGKKAAVAINPATP LSAIEYVLPELDMVLVMTVNPGFGGQKYIPYCGEKVRELRKMITEKNLDVDIQVDGGVKA LNIAEIAECGANIFVCGSAVFEGNIEKNVKEILEQMNK >gi|222441841|gb|ACEP01000101.1| GENE 11 13632 - 14462 1123 276 aa, chain + ## HITS:1 COG:FN2098 KEGG:ns NR:ns ## COG: FN2098 COG0489 # Protein_GI_number: 19705388 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Fusobacterium nucleatum # 28 276 8 254 257 226 46.0 3e-59 MSDCTHDCSSCGESCAQRESKPEDFLVKLSKGSSVKKVIGVVSGKGGVGKSMVTSLLAAA SMREGFKTAILDGDITGPSIPKAFGLSMNLVGNAEGLILPATTTCGIDIVSVNLMLANET DPVIWRGPVLGGVIKQFWGETLWQNIDYMFIDMPPGTGDVPLTIFQSVPLDGIVIVSSPQ ELVGMIVEKAVNMARMMNVPILGLVENMSYVECPDCGKQIKVFGESHIDEIAAEYDVPVL AKLPMDPALAAACDAGKIEYVENNYMKDAIEVLKKL >gi|222441841|gb|ACEP01000101.1| GENE 12 14486 - 16519 1274 677 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 18 652 6 635 636 495 43 1e-139 MDKKNNQNDNPNKKKNIKSAIILLVVCLGITALFTSVMSRYKNGSQKKISYSKFLTMLDK GEIKKVQVADRKIYIEPKTQQNPLMKTTYYTVSMNDALLVNRLKEAEDNNKISSYEQQDS SGSNAILSVMVSYVLPFVLIYAMMYFVMRGIGKGGGMMGSVGKSNAKVYVEKKTGVTFAD VAGQDEAKESLTEMVDFLHNPGKYIEIGARLPKGALLVGPPGTGKTLLAKAVAGEANVPF FSLSGSDFVEMFVGVGASRVRDLFKQAQSMAPCIIFIDEIDAIGKSRDSRYGGGNDEREQ TLNALLAEMDGFDSSKGLVILAATNRPEVLDKALLRPGRFDRRVIVERPDLKGRVETLKV HAKNVKMDETVNFDEIALATSGAVGSDLANMINEAALAAVKAGREAVSQKDLLEAVEVVI AGKEKKDRILGEEEKKVVSYHEVGHAMAIAVQKNTEPVQKITIVPRTMGALGYTMQVPEE EKYLMSKEQMLSELVTLFGGRAAEEVVFNSVTTGASNDIERATQIARAMVTQYGMSERFG LMGLESVQNRYLDGRAVMNCSDATGALIDEEVKEMLKVAYDKAKKIIEDHREVMDEIAEF LIEKETITGKEFMEIYNKSLKKDELESVEANVEEDKEKTELSEAETVSEESSLKDAESQD AQKVTEKKDSEQTKQEE >gi|222441841|gb|ACEP01000101.1| GENE 13 16523 - 18412 1734 629 aa, chain + ## HITS:1 COG:CAC0508 KEGG:ns NR:ns ## COG: CAC0508 COG0322 # Protein_GI_number: 15893799 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Clostridium acetobutylicum # 1 617 1 617 623 577 47.0 1e-164 MFDIQEELKKLPAKPGVYLMHNAQDEIIYVGKARILKNRVRQYFQSSYNKSVKIQHMVSH IAYFEYIITDSELEALVLECNLIKEHRPHYNTMLKDDKSYPYIRVTVQEDYPRILFCRQV KRDKSKYFGPYTSAGAVKDTIELMRKVYKIRSCSRSLPKEIGKDRPCLYYQINQCDAPCQ NYISKEEYRAHVDKALKFLNGDEKEIINSLEEKMKKAAAELEFEQAAEYRDLIENVKRIG EKQKINDTGGDDRDIIAMAKAGDEAVVAIFFIRSGKLLGRDHFHMTGIGDSEKQEIITDF IKQFYVGTPFIPKEILTQEEVLDQEILEKWLSDKRGSKVIFAMPKRGSKHQMMELARKNA QNVLAQDSEKLKREERRTIGAVHELEQALGIGNLNRMEAFDISNTNGYENVASMVVFEKG KAKRSDYRKFKIKTVAGPDDYHCMEEALERRFSHGIREQKEREEKGQDMELGSFTRFPDI LMMDGGKGQVNIALQVLEKLGLTIPVCGMVKDDFHRTRALYYNNEIIEFPKNSEAFRMIT RLQDEAHRFAITYHKALRGKEQVHSVLDDIKGIGPARRKSLMKHFKDIGKVKEASVTELC EADGITENVAEEIYRFFHEDTSKKNGDVV >gi|222441841|gb|ACEP01000101.1| GENE 14 18862 - 19806 1167 314 aa, chain + ## HITS:1 COG:BH3590 KEGG:ns NR:ns ## COG: BH3590 COG1493 # Protein_GI_number: 15616152 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate 
metabolism # Organism: Bacillus halodurans # 1 310 1 309 310 260 41.0 2e-69 MYKVKLTEFVNKMNLKSLLPDIDTDSIMIEQSAVNRLALQLAGFFEHFDSERIQVIGNVE AAYMKQLAPESIPPLCEKIFSFNIPCLVFCRGLEPSQEMLDTAEKNGVPIFTTEMTTSDF IANSVRWLSEELAPRISIHGVLVDVYGEGVLIMGESGIGKSEAALELIHRGHRLVSDDVV EIRKISDDTLLGTSPDITRHFIELRGIGIVDVKTLFGVESVMYNQTIDMVIRLEDWNREA NYDRLGLDEEYIEFLGNKVVCYSIPIRPGRNLAVIVESAAINYRAKKMGYNAAQELYNRV TGSLSTNKEEQKEQ >gi|222441841|gb|ACEP01000101.1| GENE 15 20232 - 21161 910 309 aa, chain + ## HITS:1 COG:lin1459 KEGG:ns NR:ns ## COG: lin1459 COG0812 # Protein_GI_number: 16800527 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Listeria innocua # 6 301 1 295 298 281 50.0 9e-76 MNKEFLQNLREQIKAGTVTEQEPMNKHTSFAIGGPADVFVQPATREEIRSAVYCAKEAGI PFFVMGNGSNLLVSDEGFRGMIIQIGKNFQAISVKDTVIEVQAGALLSRTARAAWNAGLT GFEFAAGIPGSVGGAVAMNAGAYGGEVKDVLLDAEVLTQEGEFLTLTGEELDLSYRHSCI FEKNYVVLSARFSFEKGESDKIKARMDELAKARREKQPLEFPSAGSTFKRPEGYFAGKLI QDAGLKGYTVGGAQVSEKHSGFVVNRGGATAEEVAFLIKQVQKKVMKQFNVMMQPEVRFV GFADTEVRE >gi|222441841|gb|ACEP01000101.1| GENE 16 21182 - 22033 699 283 aa, chain + ## HITS:1 COG:CAC0511 KEGG:ns NR:ns ## COG: CAC0511 COG1660 # Protein_GI_number: 15893802 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Clostridium acetobutylicum # 1 283 9 292 294 290 49.0 2e-78 MSGAGKSTALKVLEDFGFFCVDNLPVALLPNFGEIASDRQTEVDKVAVGVDIRNGEQLTR LNSQLEVLCELGIEYEILFLDAEDETLVKRYKETRRTHPLAKEDRIETGISREREMISFI RDRADYIIDTTTLLTRELRQEVEKIFVGNEEYGNFIITVLSFGFKYGIPSDADLVFDVRF LPNPYYEQALRPLTGNDRSIQEYVMKNGDGRKFLDKLEDLMQFLIPRYLDEGKNSLVIGI GCTGGKHRSVTITNGLYARLKKFPYTVRMEHRDIEKDARTKGK >gi|222441841|gb|ACEP01000101.1| GENE 17 22062 - 23015 766 317 aa, chain + ## HITS:1 COG:CAC0513 KEGG:ns NR:ns ## COG: CAC0513 COG1481 # Protein_GI_number: 15893804 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 315 1 316 317 247 42.0 2e-65 
MSFSSEVKEELSHVDGHENHCMIAELAAIIGLCGKIKCEDGRYSLILHTENVSVARRFYS LIKDVFHFQPQVSTTQHEFLKKNRFYMISIEESGQTLKILKTAQLINNLGQMDEEFSVKD NPVLQRTCCRRAFLRGAFLSAGSITDPEKNYHFEIACISSEKAAQIKELFEFFGLEAKIV LRKKYYVVYLKEGAMISDTLNIMEAHIALMKFENVRILKDVRNSVNRRVNCEAANINKTV SAARKQIEDIEYIKNTVGLERLSENLRDIAYARLEEPDATLKELGEKLSSPVGKSGVNHR LRKLSQIAQDIREKGHL >gi|222441841|gb|ACEP01000101.1| GENE 18 23096 - 23359 400 87 aa, chain + ## HITS:1 COG:BS_crh KEGG:ns NR:ns ## COG: BS_crh COG1925 # Protein_GI_number: 16080527 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 85 1 85 85 75 40.0 3e-14 MISKEMEIKIPSGLDPSTIALFIQIASQYDSSVYVEVKNKKVNAKSIMGMMTLGVPAGGK INVIAKGDDETKAIEHIGKYLNNDLAH >gi|222441841|gb|ACEP01000101.1| GENE 19 23703 - 24536 562 277 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028036|ref|ZP_03717228.1| ## NR: gi|225028036|ref|ZP_03717228.1| hypothetical protein EUBHAL_02305 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02305 [Eubacterium hallii DSM 3353] # 1 277 1 277 277 531 100.0 1e-149 MNCPNCGTYYDENEKICKKCHTPLSNAVYNQVSSGNDINQPAPKRKSFVPIVAAILICCL IAGGIGGYFYYISRVKQGCREATRQIFTYAKNMDFSDVAPSDLPEPLKSEPNVRELVKQQ LKTYIGDSELGSLVDVDSIDVTTLCDEMVDQASYKITDVSATYNSCTVTVTTKNIDFYSL PGTIYDEIKSQITDGNSSLWTSIKNSISSLLGGSSQTSIEKIDISENLKNWYKETKKNGP TRKTTGKIVFGIKDGKWALISFDERLIYGYYGLRPLQ >gi|222441841|gb|ACEP01000101.1| GENE 20 24862 - 25830 813 322 aa, chain + ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 4 322 6 325 325 272 45.0 6e-73 MRGKKAILCIGYIIVATICIVSIALLFFSRNKNESEKSKRTIGALYMTMNNPYFEVINEE IKAVVESKGDVLETRDSGMDAKVQAKQVEKFIEEKVDCILINAVDWKKIGSSLEKAKKAG IPVIAVDALIYNRSFVDGTVVSNNYQAGEQCAKDLMKRRKKGKILFLVQSENKSAIDRIK GFKETLEKAGWKYENVGELECQGQLELSQPLVEKVLNKTKDIDVVMALNDPSAMGAMAAL 
DAEHMLSDVLVYGADGSPEAKTMIYENKMAATSAQSPRTTGKKTVEMLYKILDGKKTEKQ CIVPVSLITKDNINDYSLSGWQ >gi|222441841|gb|ACEP01000101.1| GENE 21 25849 - 27225 1246 458 aa, chain + ## HITS:1 COG:CAC1454 KEGG:ns NR:ns ## COG: CAC1454 COG4585 # Protein_GI_number: 15894733 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 18 432 14 436 441 313 41.0 5e-85 MEKRKDAAGYQHQCFTMMVLLNVIIVLFLTFTIVLTQYRVAAAQEAQKFIGTLKVMPERP EQRIIMVVFSLFFLCGIIYYKRKNEAEGKEKVLLWNVFEIIFLIFVLKELDMSYNGVIFL VVADMLIYVEDRKNKLIFLFIAFLCYMVCSYNLVTMFIPLNSFETWAAFYDWNTERIFLS VKTICEITNMVLFLIYIVILMMKDRRERERIKLLNEQLQKANEQLHEFAQEKELMGETKE RNRLAREIHDTLGHILTGISVGIDAVLVLMDIAPDKAKEQLEGIGDTARRGLQDVRRSVR KLKPDALERMSLNNAIHQMIEDMSKVTNTKIYFVSYMEELEFEADEEEVIYRIIQESTTN AIRHGKATEIWIRISEKDEELMIIISDNGCGCESIKEGFGLKHIRERVELLNGKVNYQGM IGFTIIAKIPIRNKLTGKKDRDELNNLADNNDKAKDSQ >gi|222441841|gb|ACEP01000101.1| GENE 22 27351 - 28022 848 223 aa, chain + ## HITS:1 COG:CAC1455 KEGG:ns NR:ns ## COG: CAC1455 COG2197 # Protein_GI_number: 15894734 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Clostridium acetobutylicum # 1 223 1 224 225 282 63.0 4e-76 MIKVLIADDQELIRESLKIVLGANEDMEVTGIAENGEQVLSEIKKKKPNVILMDIRMPGI DGVECTRLVKEKYPDIYIIVLTTFDDDEYIFGALKYGASGYLLKGISMKNLAEAIRTVVS GKAMINPEITDKIVKLFNHMAQSSYAIQVDNELTEELTKTEWRVIQKIGCGMSNKEIASE LALSEGTVRNYLSSVLSKLELRDRTQLAIWAVQTGAVNRYIGE >gi|222441841|gb|ACEP01000101.1| GENE 23 28077 - 29390 973 437 aa, chain + ## HITS:1 COG:CAC1456 KEGG:ns NR:ns ## COG: CAC1456 COG1653 # Protein_GI_number: 15894735 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 35 435 38 437 439 404 49.0 1e-112 MGDIKLKKGTGILFVMAVLFISSLIYLHYGDQNTLRVGIFYGSNWEVPGTVHYEILDQAI KKFESKYPNVKVEYEKGILSNDYSEWLSEQILKGTEPDVYLVLDEDFNTLASLGALKNLD 
MLVGGDKEFDFNVFYSSVYKAGQYEGSQYALPMECNPTLMFVNKTLLQKEGIKVPDNDWT WDDFYDICKTIVKDSDGDGQMDQFGCYDYTWLQAVYSNGITPFNQSGTSSNFLDEKFSQS LEFVKRLEATNRNYQVTSNDFDLGKVAFRPFTFSDYRTYMPYPWRIKKYSGFEWTCIRMP AGPSGDNRSEMSALMAGMSSRTEKKQCAWDFVKLLTTDTDIQKLVYEDTSAASVLKSVNT SQDTMNLLNKDTPGDSIIDMSLLDTAIDAGVIPNRFEQYEEAYEKTDSLIKSYVDEEGDS STFLFQMKNQIDKILKK >gi|222441841|gb|ACEP01000101.1| GENE 24 29599 - 30507 1123 302 aa, chain + ## HITS:1 COG:BS_fruB KEGG:ns NR:ns ## COG: BS_fruB COG1105 # Protein_GI_number: 16078503 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-1-phosphate kinase and related fructose-6-phosphate kinase (PfkB) # Organism: Bacillus subtilis # 1 300 1 298 303 273 49.0 3e-73 MIYTVTFNPALDYVVFLDNLKLGDVNRSTRESIFYGGKGINVSTILNTLGLETTALGFVA GFTGKAIEDGLAAQGIHTDFIHLPEGNSRINVKVKHGDETEINGQGPVITEEAINELFAK LDTLTKEDILVLAGSIPNTLPEDIYEKILARLESKGIRVVVDATKDLLLNVLKYHPFLIK PNNHELGEMFGTVCKTDEDIVHYAKKLQEKGAVNVLISMAGDGAILITEDGQNIKMGTPK GKVVNSVGAGDSMVAGFITGYIRGGNYEEALKSGTAAGSATAFSEGLASLEMYEKMLKQI TE >gi|222441841|gb|ACEP01000101.1| GENE 25 30671 - 32617 2515 648 aa, chain + ## HITS:1 COG:L185031_3 KEGG:ns NR:ns ## COG: L185031_3 COG1299 # Protein_GI_number: 15672941 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, fructose-specific IIC component # Organism: Lactococcus lactis # 299 643 2 349 351 308 50.0 2e-83 MRIVDLLKKEAVVLNAEVSTKEQMLDLLVDLHKEVGNIKDKAAFKAGIMKRESEGPTAIA EGICIPHSKNEAVINPGIAAITVPSGVDCEALDGEPSNLFFMIAAPAEGSDVHLEALSRL STILMDPAFREKLLSAKEVDAFLAAIDEKETEKYGEEDKAAQTAEAPAKKAGYQVLAVTA CPTGIAHTYMAAEALEEKGKELGISIKVETNGSGGAKNVLTQEEIDACDGIIIAADKNVE MARFDGKPVIKVKVSDGIHKSQELIERAVKGDGPIYHHEGGKAASTASEGDESFGRQIYK HLMNGVSNMLPFVIGGGILIAIAFLLDDYSIDPSNFGMNTPMAAFFKTAGGVAFNFMLPI LAGFIAMSIADRPGLMVGFAGGAIASAGTTFASCFNSDITVISGGFLGALFAGFLAGYLV LALEKITEKMPSSMDGLRPMLIYPVAGLLLISVIMFAINPFFSWLNQLLADALNSLSGTN SILLGLIVGGMMSIDMGGPFNKAAYVFGTASLAYQTDAGYMIMAAVMVGGMVPPIAIAIA SWVFKNKFTDNDRKTAPVNAIMGLCFISEAAIPFAAGDPLHVIPPCVIGSAVAGALSMAF 
KCTLMAPHGGIFVFPVVGNALMYIIALAIGSIVGAVMLGLLRKPVESK >gi|222441841|gb|ACEP01000101.1| GENE 26 33117 - 35021 1233 634 aa, chain + ## HITS:1 COG:all0713 KEGG:ns NR:ns ## COG: all0713 COG0296 # Protein_GI_number: 17228208 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. PCC 7120 # 9 633 6 752 764 269 27.0 1e-71 MRYEISEVVKKDQVMELVFGNCSKPKDLLGRHFILQGQVISAYHPDAVKMEVISEDGERY PMDTVERQPVFSLFLPHKRPFSYQIHMTFHDGNTYISSDPYSYEGLITEEEEKLFSKGNW TEVYHKMGCHKVKLGNTEGMYFAVWAPGARRVSVVGDFNYWNGMLYPMHKMENSDIFELF IPGLSCGQFYKFEIKNVQGDIIQMVDPYAVMNEEKENGASRMFDLGRFRWEDSRWLSKRY HGNVFKTPMSVCEVRISELDSPDEKVQEIVQDMGHTHILLRGTPERAKLGVERGFFEPAF YGNTPDTMRFFVNRSHKRNIGVMLEISPEYLTRAVHLFEKKHPQAVNYLLANILFWIKEY HIDGFVFRGLSENSSDFLEKAKEVIKKEDNSVLFIGEEIKGKQTRDFFDFEWNMELKAGV EEYLGTDFEKRQGKYFCLSQPLMKGDFSNTLLLLNKEKNNLFDESLIDKKPSCDYDKLTG VRMSYGYLMGVPGRKAWDLHSHENISSQEYVKSLIRIYQEYPALYEYDPLRASFEWINGM DAESSVLRFMRKSPSGRDNLLFICNFGTEEKENCQVGVPKAGKYTMISNSDAVEFGGEGR DEHQEVQAVSECWDLRPYSIKISVPPLATLIFKF >gi|222441841|gb|ACEP01000101.1| GENE 27 35180 - 35593 527 137 aa, chain + ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 4 137 6 142 142 137 50.0 5e-33 MKEDCIFCKIAKGEIHSATVYEDSHFTVILDVNPATKGHCLIIPKEHFDNIYDLDGETAG KLFALATCIARAMRDALKCDGLNLVQNNGEIAGQTVNHFHLHLIPRYEGDGLNLNWPQQE ISAEQLEEIRQSIKKSI >gi|222441841|gb|ACEP01000101.1| GENE 28 35785 - 36369 460 194 aa, chain + ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 20 184 20 185 197 181 50.0 7e-46 
MGKLDFKPGNMLYPLPAVMVSVRDKEGNDNIITVAWTGTVCTNPPMLYISVRPERHSYKA LHETGEFVVNLTTEKIAKATDFCGVRSGRDMDKFEQTGLIKGEAKKINAPVIEDSPVNIE CRVREEVALGSHTMFIADVVHVTVDDSYMDERGTFHLEKAAPIVYSHGTYFGLGESLGTF GYSVRKKKKKRKNR >gi|222441841|gb|ACEP01000101.1| GENE 29 36497 - 36592 60 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTGIFETGLQVVYNYSDKNNHIREGIGNTSL >gi|222441841|gb|ACEP01000101.1| GENE 30 37030 - 38343 1720 437 aa, chain + ## HITS:1 COG:CAC3539 KEGG:ns NR:ns ## COG: CAC3539 COG0766 # Protein_GI_number: 15896775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Clostridium acetobutylicum # 1 419 1 415 418 442 60.0 1e-124 MEQYIIKGGNPLVGEVVIGGAKNAALGILAAAIMTDGECLIDNMPNVRDTNVLLQAMEGI GATIDRKGDNEVVISGKNIDSTGDLIVDNEYIRKIRASYYLIGALLGKYRKAQVVLPGGC DIGSRPIDQHIKGFRALGAEVKIEHGMIIAQAEQLVGNRIYLDVVSVGATINIMMAASLA QGNTVIENAAKEPHIVDVANFLNSMGANIRGAGTDVIRIKGVEKFHDTEYSVIPDQIEAG TFMMAAAATRGDVLIKNVIPKHLETISAKLIEIGAEIEESDDAVRVVAAQRLRHTQIKTL PYPGFPTDMQPQMAITLGLSTGTSTITESIFENRFRYVEELRRMGANIKMVEGNTAIIHG VEKYTGASVAAPDLRAGAALVIAGLAAEGYTTVTQIGYIQRGYERFDEKLRALGGLIEKV DSEKETNKFIYKTGTVN >gi|222441841|gb|ACEP01000101.1| GENE 31 38365 - 39552 1691 395 aa, chain + ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 3 393 5 395 400 630 76.0 1e-180 MERRLFTSESVTEGHPDKMCDQVSDAILDALMEKDPMSRVACETAMTTGLVLVMGEVTTS AYVDIQKLVRDTIKEIGYTRGKYGFDADTCGVMVALDEQSSDIAMGVDKALEARENKMSD EELEAIGAGDQGMMFGYASNETEEYMPYPIALAHKLSRQLTKVRKDGTLSYLRPDGKTQV TVEYDENGKPDRLDAVVLSTQHDPEVTQEQIHEDIKKYVFDAILPQDMVDENTKFFINPT GRFVIGGPHGDSGLTGRKIIVDTYGGYARHGGGAFSGKDCTKVDRSAAYAARYVAKNIVA AGLADKCEIQLSYAIGVAQPTSIMVDTFGTGKIAEDKLVDMVRENFDLRPAGIIKMLDLR RPIYKQTAAYGHFGRNDLDLPWEKLDKAETLKKYL >gi|222441841|gb|ACEP01000101.1| GENE 32 40183 - 41031 608 282 aa, chain + ## HITS:1 COG:RSp0668 KEGG:ns NR:ns ## COG: RSp0668 COG5495 # 
Protein_GI_number: 17548889 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 51 271 60 281 294 97 28.0 3e-20 MRTGIIGAGKVGCSLGKYFRLNNLKVTGYYDVNENLAKEAATFTETTFIEDLETIVKISD TLFLTVPDDLITTVWNQMKDMSLEGKFICHCSGALSAGDAFPGIDKCGAFGYSVHPLFAV SDKYNSYKELSHAYFVIEGDEKHREEIAGIFNNLGNEVRYIAAKDKVKYHCAAAVCSNHV VALIQESLDLMQECGFDEESALKALAPIMLGNMQHIAERGTVNSLTGPVERADVKTVEKH LNCLDEKQQMLYRLLSEVLISIGEKKNPGRDYGRLKHILGNE >gi|222441841|gb|ACEP01000101.1| GENE 33 41149 - 41976 1174 275 aa, chain + ## HITS:1 COG:CAC2914 KEGG:ns NR:ns ## COG: CAC2914 COG0413 # Protein_GI_number: 15896167 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Clostridium acetobutylicum # 1 274 1 274 276 347 62.0 1e-95 MKNTVTTFQAMKDKGEKISMLTAYDYSTAKLEDAAGINGILVGDSLGNVVLGYDTTIPVT VEDMIHHGAAVARGAKNSLVVVDMPFLSYQTSVYDAVVNAGKIMKETQCDCVKLEGGKSV CPQIKAITDASIPVMAHLGLTPQSVNAFGGFKVQGKTVEAAQQLIEDAKAVEEAGAFAVV LECVPAKLARKVTEALHIPTIGIGAGEGCDGQILVYADMLSMFSDFTPKFVRSFANVGEI MSQAFKDYDAAVKDSSFPAQEHTFKIADEVLDKLY >gi|222441841|gb|ACEP01000101.1| GENE 34 42239 - 43081 1178 280 aa, chain + ## HITS:1 COG:CAC2915 KEGG:ns NR:ns ## COG: CAC2915 COG0414 # Protein_GI_number: 15896168 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Clostridium acetobutylicum # 1 277 1 277 281 369 61.0 1e-102 MEKVYTVKEVREQVKAWRREGLTVGLVPTMGYLHEGHASLIKKAVEQNDKVVVSVFLNPT QFGPTEDLEAYPRDFEADCKLCESIGAAMVFHPEPSEMYAPDFCTWVDMDVLSKTLCGKS RPIHFRGVCTVVSKLFNIVTPDRAYFGQKDAQQLAIIRRMVRDLNMDIEIVGCPIVREED GLAKSSRNTYLNEEERKAALILSKAVFLGKKMVEDGETSAAAVKEAMIKKIESEPMAKID YVEAVDGLSMQPVEEIKAPVLVAMAVYIGKTRLIDNFIVE >gi|222441841|gb|ACEP01000101.1| GENE 35 43244 - 43645 591 133 aa, chain + ## HITS:1 COG:CAC2916 KEGG:ns NR:ns ## COG: CAC2916 COG0853 # Protein_GI_number: 15896169 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Clostridium acetobutylicum # 1 126 1 126 
127 171 63.0 3e-43 MQITMLQGKIHRATVTQAELDYVGSITVDEDLLDAAGIREYQMVQIVDVNNGNRFETYTI AGERGSGVMCLNGAAARCASVNDKIILMAYAQMTPEEAKENKPNVVFVNDENKISRVTNY ERHGRLFDMERLG >gi|222441841|gb|ACEP01000101.1| GENE 36 43649 - 44839 1247 396 aa, chain + ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 394 1 395 404 360 51.0 4e-99 MLKGKTVVLGVTGSIAAYKIANLASMLVKQHCDVHVIMTKNATNFINPIAFETLTNNKCL VDTFDRNFQFHVAHVSVAQKADVMMIAPASANIIAKLAHGIADDMLSTTALACSAKKIVS PAMNVHMFENPIVQDNLNILKKYDMEVVEPAVGYLACGDTGAGKMPDPEVLYEYIIKEIA CEKDLKGKKIMVTAGPTQEAIDPVRYITNHSTGKMGYAIAKRAMLRGADVTLVSGPVALD PVPFVKMVPVTTAQDMFEAVSGGFPEQDILIKAAAVADYRPSNVSDEKVKKRDGDMAIAL ERTTDILGYVGEHKKPEQIICGFSMETQNMLENSRKKLKKKHLDMIVANNLKEQGAGFGT DTNRVTIISEKEEKNLELMSKEEVADEILNYILEFK >gi|222441841|gb|ACEP01000101.1| GENE 37 45624 - 46361 540 245 aa, chain + ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 229 230 281 63.0 9e-76 MKHILIIEDERAVAELERDYLEINSFKVSIEETGEGGLQRALKGDIDLIILDLMLPDMDG FEICRELRHRIDIPILIVSAKKSDIDKIRGLGIGADDYITKPFSPNELVARVKAHLNRYI RLTRKRIDVNEYIEYPGLKIDKSARRVFVNEEERFFTTKEFDLLTYLATHPNHVFSKDEL FSAVWGMDSMGEIATVTVHIKKIRKKTENANVKSGYIETVWGSGYRFRVIDSDVSGERES IVDAI >gi|222441841|gb|ACEP01000101.1| GENE 38 46432 - 47145 811 237 aa, chain + ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 7 223 81 289 305 97 33.0 2e-20 MLKKKRVPIVFEDDAVIVCEKPAGMPVQSDHTRDMDVLTILKHHIFEEQNSEEEPYLAVV 
HRLDRPVGGLMVLAKTKEAAANLSEQIQNFEFEKNYQAVVCGNLREDFGTFEDYLLKNGK TNKTEVVKAGTPGARKAELDYELIDCIETKEGVLTWILVILHTGRHHQIRVQFASRGLGL YGDTKYNPKFQKTKKKYTEIGLYSTRISFNHPVTGERMTFKIEPSGEAFEKMDVEAF >gi|222441841|gb|ACEP01000101.1| GENE 39 47372 - 48808 1823 478 aa, chain + ## HITS:1 COG:MYPU_1830 KEGG:ns NR:ns ## COG: MYPU_1830 COG0442 # Protein_GI_number: 15828654 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Mycoplasma pulmonis # 4 478 25 501 501 469 47.0 1e-132 MAKEKKLVEAITPMDKDFTQWYTDVVTKAELISYSNVRGCMIIRPRGYAIWENIQKQLDA RFKETGVENVYLPMLIPESLLQKEKDHVEGFAPEVAWVTHGGEEKLAERLCVRPTSETLF CDHVKDIVHSYRDLPVVYNQWCSVLRWEKTTRPFLRSSEFLWQEGHTFHATAEEAKERTE QMLNVYADFCEQVLAIPLVKGRKTEKEKFAGAEATYTIEALMHDCKALQSGTSHNFGNGF AKAFGIEYTNKDNKLSPVYETSWGMSTRIIGAIIMVHGDDSGLVLPPRIAPVQTMIIPIQ QKKDGVLDKAFEIRDTLKSNFSVKVDDSDKSPGWKFSEQEIQGIPTRIEIGPKDIEKGQA VIVRRDTREKIVTPIDKIPETLTEVLETMQNDMYNKAKAHLEKNTHIAHNWEEFNDILEN QQGFIRAMWCGEKECEEAIKDENGATTRCIPFVQAKHSDVCVHCGKPAKCEVYFGKAY Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:01:36 2011 Seq name: gi|222441840|gb|ACEP01000102.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont361.1, whole genome shotgun sequence Length of sequence - 13284 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 3, operones - 2 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 40 - 1530 1657 ## COG0498 Threonine synthase - Prom 1711 - 1770 6.7 2 2 Op 1 . - CDS 1777 - 2652 698 ## gi|225028060|ref|ZP_03717252.1| hypothetical protein EUBHAL_02329 3 2 Op 2 . - CDS 2658 - 3563 596 ## COG1131 ABC-type multidrug transport system, ATPase component 4 2 Op 3 . - CDS 3576 - 5990 1753 ## BBR47_50310 hypothetical protein 5 2 Op 4 40/0.000 - CDS 6015 - 7079 961 ## COG0642 Signal transduction histidine kinase 6 2 Op 5 . 
- CDS 7165 - 7875 808 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 7910 - 7969 9.4 7 3 Op 1 . - CDS 9056 - 9985 794 ## COG1897 Homoserine trans-succinylase 8 3 Op 2 . - CDS 10023 - 11984 2098 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 9 3 Op 3 . - CDS 11977 - 12798 780 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 10 3 Op 4 . - CDS 12798 - 13190 609 ## COG5496 Predicted thioesterase - Prom 13221 - 13280 5.0 Predicted protein(s) >gi|222441840|gb|ACEP01000102.1| GENE 1 40 - 1530 1657 496 aa, chain - ## HITS:1 COG:CAC0999 KEGG:ns NR:ns ## COG: CAC0999 COG0498 # Protein_GI_number: 15894286 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Clostridium acetobutylicum # 3 493 4 494 496 579 59.0 1e-165 MDLIYKSTRNNQETATASEAILKGLAGEGGLFVPSYIPKLDKSLEELSKMSYQEVAYEVM KLFMTDFTEEELKHCINSAYDKKFDTENIAELVKKDDAYYLELFHGATIAFKDMALSILP YLMTTSAKKNQVKNEIVILTATSGDTGKAALAGFADVPGTKIIVFYPKHGVSPIQEKQMV TQRGENTYVIGIEGNFDDAQTGVKKMFSDAALAEEMNEAGYQFSSANSINIGRLVPQVVY YVYSYATLLARGEISDGEQINVVVPTGNFGNILAASYAKQMGVPIAKLICASNENKVLFD FFTTGCYDRNRDFILTTSPSMDILISSNLERLIYQTAGRDAEKNNEFMKELNTNGVYTIT DDMKEQMKDFYGNYATEEETAQTIKEVYNKADYIIDTHTAVAASVYKKYEKETGDTTKTV IASTASPYKFTRSVMNAIDSSYDSKTDFELIDELEKLSGVKVPQAIEDIRSAAVLHDTVC ETEDMCKEVKKILNIK >gi|222441840|gb|ACEP01000102.1| GENE 2 1777 - 2652 698 291 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028060|ref|ZP_03717252.1| ## NR: gi|225028060|ref|ZP_03717252.1| hypothetical protein EUBHAL_02329 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02329 [Eubacterium hallii DSM 3353] # 1 291 1 291 291 493 100.0 1e-138 MWNPFLKVEWLKEGRNIRLPMMLIFYNAILAFITILFMFFNAESFQEGYSYDTSAYLYQF LIISTIQIGMIFVLMPFSVWGFYSTDREKHMLEEFVMIPGSSKQFIIARVSVIIAVYMML FISSLPIISLSCIYSGLPWRKIIRLGIMLFICTFWSASVSIFSFSYCKKGIWAFAQNTVI EAVFILGTILGTEIMRTISISVSGMDNLAPITTSLCLLLSLINPLAAYMGYYGNITGDSG 
LMNLYCGRIGIDSSTQAFSFLFYKAASIMCILMGIAFLALAVWQMEKQARE >gi|222441840|gb|ACEP01000102.1| GENE 3 2658 - 3563 596 301 aa, chain - ## HITS:1 COG:alr0970 KEGG:ns NR:ns ## COG: alr0970 COG1131 # Protein_GI_number: 17228465 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. PCC 7120 # 17 299 22 307 316 129 29.0 7e-30 MLDFQNISIQRKGKKVLDDFSLKAPDGVILALLGEDKEGKSTLLKAASGGEKPETGDIYI DGISIYETGENAYCNFGYMSKEYGFYSLLKVEEYYELFLALYKVGGRYRSRRIEEVLELL DMKEYEGAFIGELPAETLPFLYLGKAILQEPEWLLMDEPFDNMSVTARRKMIGILLTLHE KGTSIILNTPMYPEMMGFITDVLVMESGKNLTFGPVEEVYAEALCKSPVRMHILAQMEKA LEVLKENPLVDRVTVDGDDVIFRFNGGEKEEAELLTDLVASGALIHNYMRDQADIEEIFR R >gi|222441840|gb|ACEP01000102.1| GENE 4 3576 - 5990 1753 804 aa, chain - ## HITS:1 COG:no KEGG:BBR47_50310 NR:ns ## KEGG: BBR47_50310 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 175 441 89 357 738 87 25.0 4e-15 MINDLFRTYKKIILLLLVLLCSVVFWFYGCRHQQRSQSEVVEWNKKTIKGTNGYCYKFKT SNCTGTVTFGAAGYVAKDKEMPVTLDIFAAEKDFTGVMKVTLPGENGKGIAYQSAVKCKA GEKEKIVLNVPQLGNPSAICFEIMDSFGVTELSEDVSFSDAKNRNGAFSEQAENLIGVLS GQSKELSYLNSLKIGEESDEESVKVVCYSKNSFPQTEEEFQGLDGMIIDSFDTKSLSGKQ KSALKSWLKSGGKLLIAGGGQQIDSFEGLEKTFGIIQEDVVVSELYLADADNTVQELPIL MSNLQLSQKYEWEAYGDFEPEIGYTAAVGSGKISILRFSLTNSAFLQWSVRDKAAVEILS HFMGKDEDTESSDTSLWYVKKALYAFMKSQLPNTFFYGVFFIFYILLMVTIAYYYLCKIK KREYIWIIVPVLALVFTVGLFIRSRGMKSSGDSYFSALRVTDSEKEQENIYFLYQNDEGV EGNVNFLSTITSVIPMDYNYRTIVGKNIKTSGEDITINNTQKGFDVAFEESVPGTSRILK LSGNTKSASEKEVFTQDIFTTYTSFYGNITNTSSNNFSRVILIRGNQYAILKSVKAGEKA SVSKDMVKCWNHFDEENSAFGTENENTVTGNIMEYLQQTYMNTQKSKEELLVVGITSEND FKLFSDDNVLKNHVTVMINHFTLEEETGDHILNINMSCLKQEGEGLAIEEDTLEENKTEV SYVFDKEEIEALARNKDNFTGKIYAYNYSTGQRDRILSSTESYITGESLNSYLSGDGEMR ITYKMDASDEYGPAPVLSLIYKKQ >gi|222441840|gb|ACEP01000102.1| GENE 5 6015 - 7079 961 354 aa, chain - ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal 
transduction histidine kinase # Organism: Bacillus halodurans # 50 343 55 346 351 214 39.0 3e-55 MGIYLYETEIIAGEYKKVWGASGQGKMSFGTFTMSQSIALDAWVIAIFTILLLIIYFILI SNKFVTYVREIISGVERMKSGDLMEEIPVRGEDEFSEIAASINEMRHNLYETMEAQKAVE KTKDELITNVAHDLRTPLTSILGYLDLLTQGDFLTEEQKQKYLGIVSSKAKQLETLVKDL FDYTRYDRNKVKIKKEILDLNLFVPQLVDEFYPSFMDHQLECRTDFYEGALNIEGNGELL ARAIGNLISNAIKYGADGKLVEVHTGLKDKKAFVAIVNYGKIIPAKDLDKIFDKFYRVEN SRSLKTGGTGLGLAIAKNIINLHEGNIWATSDESGTRFQIELPVVQAHMPNTGK >gi|222441840|gb|ACEP01000102.1| GENE 6 7165 - 7875 808 236 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 235 5 230 233 246 56.0 3e-65 MEKNASILVVDDDKEIADLVEIYLVNDGYQVLKASDGEQCLQTLKEHPEVRMLILDIMMP GIDGLEVCREIRKTSNIPILILSAKAEDMDKIVGFGTGADDYLTKPFHPLELLARVKSQM KRYTTIQLGEEQQPASKDEIEIQDLTINKAAHVVKKNGKEVALTPIEFDILYLLASNRDR VFSTDEIFEKVWNEKVYEVNNTVMVHIRRLREKIEDNSRSPKILKTVWGVGYKIEG >gi|222441840|gb|ACEP01000102.1| GENE 7 9056 - 9985 794 309 aa, chain - ## HITS:1 COG:CAC1825 KEGG:ns NR:ns ## COG: CAC1825 COG1897 # Protein_GI_number: 15895101 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Clostridium acetobutylicum # 1 301 1 300 301 363 58.0 1e-100 MPIRINADLPAKKILEHENIFVMDYERSLKQDIRPLKIIIMNLMPTKQATELQLLRSLSN TPLQVDITFLKTETHDAQNESQTHMETFYTIFPEIKQQYFDGLIITGAPVETLDFEEVDY WPELVEIMEWSKTHVTSTFHICWAAQAGLYYHYGVRKELLDEKIFGIFQHHTIHRKTPLI RGFDDVFNAPHSRHSQIVREDVEKNPELTILADSEEAGPFIIIAKEGRQIFVTGHPEYDV LTLDGEYKRDLAKGMDNVPFPNRYYKDDKPELGPVKSWRCHANTLYTNWLNYYVYQMTTY DPMEIQNLK >gi|222441840|gb|ACEP01000102.1| GENE 8 10023 - 11984 2098 653 aa, chain - ## HITS:1 COG:CAC2673 KEGG:ns NR:ns ## COG: CAC2673 COG0272 # Protein_GI_number: 15895931 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA 
ligase (contains BRCT domain type II) # Organism: Clostridium acetobutylicum # 6 652 10 666 669 371 37.0 1e-102 MDKQARMKKLVEVLNQASKAYYQEDRELFSNKEYDELYDELQALEAETGIVMSQSPTQNV GYEVLSELEKETHESPMLSLDKTKSPEALRDWLGEKEGLLSWKMDGLTVVLTYRDGVLQK AVTRGNGVVGEVITNNARVFKNLPAKIPFAGEVVLRGEAVIKYSDFKKINEGIEDVDARY KNPRNLCSGSVRQLNNEVTAKRNVFFFAFALVKAEGRELENSVESNMLWVQNLGFEIVPY KRVDRENIMGEIQNFSEKIVENDFPSDGLVLIYDDIAYGISLGRTAKFPRNSIAFKWADE EAETRLLHIEWSPSRTGLINPIAVFEPVELEGTTVSRASLHNISIMEGLELGENDHIKVY KANMIIPQIAENLTKSGTCQPIEHCPACGGATEVRIENNVKTLYCVNPYCSAKKVKLFSH YVSRDAMNIDGLSEATLMKMIEQGFLSELNDLYNLEQYKEQIIAMDGYGEKSYNNLMQSI EKSRDTQLFRFVYGIGILNVGSSNARLLCRHFGNSLENLRGASVEEMTQIDGIGEVIATS VHDYFENEHNQKLLEKLLPYLHFEEENVSAGGEQEQSLLDKTFVITGTVEHFANRKELKE KIESLGGKVTGSVSKKTDYLINNDNMSSSSKNKKAKELGIPVITEEEFLSMIE >gi|222441840|gb|ACEP01000102.1| GENE 9 11977 - 12798 780 273 aa, chain - ## HITS:1 COG:VC2223 KEGG:ns NR:ns ## COG: VC2223 COG1187 # Protein_GI_number: 15642221 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Vibrio cholerae # 37 259 20 242 340 252 55.0 6e-67 MAATGKNNQGKNDSKRNNLKKNATDFKEKISNGEPMRLNKFLSDAGVCSRRQADRYVEQG RILVDGEPAQMGQKVHAGQDIRVDGKRVSVSRKQIVLAVNKPKGIVCTTEKREKDNIVDF IGYPERIYPVGRLDKDSEGLILMTNDGELMNEILKARNYHEKEYEVTINRPVTNVFLKQM SEGVEILDTVTRPCKVKRTGVYTFRIILTQGLNRQIRRMCEACGYRVRELKRLRIMNIHL GSLPTGKYREITGEELSKLEKLAGRKPGGVYHG >gi|222441840|gb|ACEP01000102.1| GENE 10 12798 - 13190 609 130 aa, chain - ## HITS:1 COG:FN0889 KEGG:ns NR:ns ## COG: FN0889 COG5496 # Protein_GI_number: 19704224 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 3 123 2 122 127 83 41.0 8e-17 MSLNTGMKMIKTETVTVENAAKTLGSGSLLVYGTPAMLLLVEKTAVALLDGHLDEGMTTV GTNLNVDHVSASPIGCEVSCEVTLTEIDRKKLVFAVEVKDPAGVIGKGTHERFIVDAEKF QNKANGKFDK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:01:59 2011 Seq name: 
gi|222441839|gb|ACEP01000103.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont362.1, whole genome shotgun sequence Length of sequence - 1194 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 1192 789 ## gi|225028069|ref|ZP_03717261.1| hypothetical protein EUBHAL_02339 Predicted protein(s) >gi|222441839|gb|ACEP01000103.1| GENE 1 1 - 1192 789 397 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028069|ref|ZP_03717261.1| ## NR: gi|225028069|ref|ZP_03717261.1| hypothetical protein EUBHAL_02339 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02339 [Eubacterium hallii DSM 3353] # 1 397 1 397 398 664 100.0 0 NKEEIKKEEQHPRRTIKWIHYSKLKTNPDQYCPARDKEEIESLADLIEAAGEVIENLSVS KSDTDEYRIIAGHKRCEACRLLVEERGLKQFEFLPCIVNNVSVVQESFRVLASNGYHTKK PYELMHEITEMERLLWEHPEEFPPSLQKGRMVERLSKKMGIARATVQEYQQISRNLLPEA MEKFKEGEIEKSAALTLARLPEKMQKEVIEQGITKNVDIEEYKQKNLEPGAVEIRVSYRL LGIDEYDRSTVSRKALVKFLKGRYGSTSYEISQGEIRISCSDKNISISERSITWERYVTL LDQYCPKEEKEEPEPVERREEQRDMVSEIETEDTDCKEPAGKPDREGREVSVSMDELVEE GREETAGLTRQEYMNSLDIKGVSSYISYFLPEEILRC Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:02:17 2011 Seq name: gi|222441838|gb|ACEP01000104.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont363.1, whole genome shotgun sequence Length of sequence - 1185 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 1185 705 ## gi|225028070|ref|ZP_03717262.1| hypothetical protein EUBHAL_02340 Predicted protein(s) >gi|222441838|gb|ACEP01000104.1| GENE 1 3 - 1185 705 394 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028070|ref|ZP_03717262.1| ## NR: gi|225028070|ref|ZP_03717262.1| hypothetical protein EUBHAL_02340 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02340 [Eubacterium hallii DSM 3353] # 1 394 1 394 395 645 99.0 0 DKEEIKKEEQHPRRIIKWIHYSKLKENVEQYCPARDKEEIESLADLIQATGEVIENLSVS KLDTDEYKIIAGHKRCQACKLLVEERGLKEFAFLPCIVNNVSAVQESFRVMASNGYHTKK PYELMHEITEMEKLLREHPEEFPPSLQKGRMVERLSKKMGIARATVQEYQQISRNLLPEA MEKFKEGEIEKSAALTLARLPEKMQKEVIEQGITKNVDIEEYKQKNLEPGAVEIRVSYRL LGIDEYDRSTVSRKALVKFLKGRYGSTSYEISQGEIRISCSDKNISISEKSITWERYVAL LDQYCPKEEKEEPELVERREEQRDMVPEIETEETDCKEPAGKSDREISVPMDELVEEGRE ETAGLTRQEYMNSLDIKEVSSYISHFLSEEILRC Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:02:52 2011 Seq name: gi|222441837|gb|ACEP01000105.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont364.1, whole genome shotgun sequence Length of sequence - 64527 bp Number of predicted genes - 57, with homology - 56 Number of transcription units - 26, operones - 11 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 71 - 130 9.2 1 1 Tu 1 . + CDS 218 - 1306 1374 ## COG2508 Regulator of polyketide synthase expression + Term 1307 - 1370 12.2 + Prom 1499 - 1558 2.2 2 2 Tu 1 . + CDS 1622 - 2899 1217 ## COG0793 Periplasmic protease 3 3 Tu 1 . - CDS 3090 - 3572 389 ## EUBREC_2387 hypothetical protein - Prom 3614 - 3673 9.6 + Prom 3575 - 3634 6.6 4 4 Tu 1 . + CDS 3661 - 4935 1396 ## COG1362 Aspartyl aminopeptidase + Prom 4946 - 5005 4.5 5 5 Tu 1 . + CDS 5064 - 6533 1391 ## Cphy_2385 stage IV sporulation protein A + Prom 6555 - 6614 4.0 6 6 Op 1 . + CDS 6668 - 7468 742 ## COG2357 Uncharacterized protein conserved in bacteria 7 6 Op 2 . + CDS 7556 - 9013 969 ## COG0144 tRNA and rRNA cytosine-C5-methylases + Prom 9032 - 9091 2.4 8 7 Tu 1 . 
+ CDS 9165 - 9893 716 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases + Prom 9903 - 9962 4.7 9 8 Op 1 . + CDS 10074 - 12182 1719 ## COG1032 Fe-S oxidoreductase 10 8 Op 2 . + CDS 12206 - 12967 569 ## COG0300 Short-chain dehydrogenases of various substrate specificities 11 8 Op 3 . + CDS 12975 - 14243 1003 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 12 8 Op 4 . + CDS 14245 - 14790 446 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase + Prom 14817 - 14876 8.0 13 9 Op 1 4/0.000 + CDS 14926 - 15804 900 ## COG1561 Uncharacterized stress-induced protein 14 9 Op 2 4/0.000 + CDS 15816 - 16085 307 ## COG2052 Uncharacterized protein conserved in bacteria 15 9 Op 3 . + CDS 16075 - 16722 729 ## COG0194 Guanylate kinase 16 9 Op 4 . + CDS 16759 - 17079 522 ## EUBREC_1674 hypothetical protein 17 9 Op 5 . + CDS 17099 - 17881 427 ## bpr_I0781 hypothetical protein 18 9 Op 6 . + CDS 17946 - 19406 2055 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 19 9 Op 7 4/0.000 + CDS 19422 - 21674 1783 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase + Prom 21724 - 21783 10.3 20 10 Op 1 26/0.000 + CDS 21822 - 22298 781 ## COG0242 N-formylmethionyl-tRNA deformylase 21 10 Op 2 20/0.000 + CDS 22289 - 23224 1199 ## COG0223 Methionyl-tRNA formyltransferase + Prom 23226 - 23285 4.3 22 10 Op 3 4/0.000 + CDS 23334 - 24689 1105 ## COG0144 tRNA and rRNA cytosine-C5-methylases 23 10 Op 4 5/0.000 + CDS 24691 - 25740 879 ## COG0820 Predicted Fe-S-cluster redox enzyme 24 10 Op 5 17/0.000 + CDS 25737 - 26468 658 ## COG0631 Serine/threonine protein phosphatase 25 10 Op 6 7/0.000 + CDS 26474 - 28570 2130 ## COG0515 Serine/threonine protein kinase + Term 28573 - 28631 19.3 + Prom 28575 - 28634 2.2 26 11 Tu 1 . + CDS 28668 - 29531 706 ## COG1162 Predicted GTPases + Prom 29614 - 29673 4.0 27 12 Tu 1 . 
+ CDS 29738 - 30421 767 ## COG1564 Thiamine pyrophosphokinase + Prom 30572 - 30631 6.3 28 13 Tu 1 . + CDS 30662 - 31702 927 ## Poras_0486 hypothetical protein + Term 31811 - 31846 -0.7 + Prom 31999 - 32058 9.5 29 14 Tu 1 . + CDS 32097 - 33194 1677 ## COG0012 Predicted GTPase, probable translation factor + Prom 33698 - 33757 11.4 30 15 Tu 1 . + CDS 33780 - 34676 781 ## COG0682 Prolipoprotein diacylglyceryltransferase + Prom 34686 - 34745 5.1 31 16 Op 1 26/0.000 + CDS 34807 - 35589 560 ## COG1682 ABC-type polysaccharide/polyol phosphate export systems, permease component 32 16 Op 2 . + CDS 35589 - 36326 866 ## COG1134 ABC-type polysaccharide/polyol phosphate transport system, ATPase component + Term 36491 - 36537 6.2 + Prom 36403 - 36462 10.4 33 17 Op 1 . + CDS 36657 - 37604 1017 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 34 17 Op 2 . + CDS 37633 - 38154 557 ## gi|225028109|ref|ZP_03717301.1| hypothetical protein EUBHAL_02379 35 17 Op 3 3/0.000 + CDS 38182 - 40287 1897 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 36 17 Op 4 4/0.000 + CDS 40381 - 42099 1478 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 37 17 Op 5 28/0.000 + CDS 42162 - 43115 1139 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Prom 43176 - 43235 6.8 38 17 Op 6 25/0.000 + CDS 43417 - 44778 1742 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 39 17 Op 7 . + CDS 44784 - 45905 1031 ## COG0772 Bacterial cell division membrane protein 40 17 Op 8 . + CDS 45922 - 46698 548 ## Closa_2458 hypothetical protein + Prom 46774 - 46833 8.2 41 18 Tu 1 . + CDS 46889 - 48043 1694 ## COG0206 Cell division GTPase + Term 48225 - 48294 5.6 42 19 Op 1 . + CDS 48503 - 49171 429 ## gi|225028117|ref|ZP_03717309.1| hypothetical protein EUBHAL_02387 43 19 Op 2 . 
+ CDS 49210 - 49953 454 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 44 20 Tu 1 . - CDS 50469 - 51248 651 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Prom 51322 - 51381 11.0 + Prom 51329 - 51388 11.4 45 21 Op 1 . + CDS 51451 - 52245 830 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 46 21 Op 2 . + CDS 52277 - 53512 1348 ## COG3633 Na+/serine symporter + Prom 53820 - 53879 4.9 47 22 Op 1 4/0.000 + CDS 53947 - 54483 427 ## COG1396 Predicted transcriptional regulators 48 22 Op 2 30/0.000 + CDS 54520 - 55593 1398 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 49 22 Op 3 36/0.000 + CDS 55586 - 56401 712 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 50 22 Op 4 25/0.000 + CDS 56395 - 57195 657 ## COG1177 ABC-type spermidine/putrescine transport system, permease component II 51 22 Op 5 . + CDS 57192 - 58298 1068 ## COG0687 Spermidine/putrescine-binding periplasmic protein + Term 58430 - 58475 0.1 + Prom 58330 - 58389 7.4 52 23 Tu 1 . + CDS 58515 - 59012 570 ## COG0782 Transcription elongation factor - Term 58806 - 58858 -0.5 53 24 Tu 1 . - CDS 59027 - 60103 827 ## COG1609 Transcriptional regulators 54 25 Op 1 . + CDS 60435 - 61490 1394 ## COG0240 Glycerol-3-phosphate dehydrogenase 55 25 Op 2 . + CDS 61521 - 63185 1650 ## COG1069 Ribulose kinase + Term 63221 - 63276 6.7 + Prom 63226 - 63285 2.9 56 26 Op 1 . + CDS 63328 - 63417 89 ## 57 26 Op 2 . 
+ CDS 63383 - 63958 381 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 64157 - 64225 20.0 + TRNA 64123 - 64208 60.0 # Ser TGA 0 0 + TRNA 64276 - 64364 53.9 # Ser GGA 0 0 Predicted protein(s) >gi|222441837|gb|ACEP01000105.1| GENE 1 218 - 1306 1374 362 aa, chain + ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 169 355 134 312 312 89 33.0 9e-18 MISNQILQDTIDGIKAITRIDLCVMDTEGKPLASTLDAVEEYKEAVLVFAESLAESQALQ GYQFFKVFDENQLEYIILVKGETDDVYMVGKMAAFQVQNLLIAYKERFDKDNFIKNLLLD NLLLVDIYNRAKKLHIETEVRRCVFIVETKNDRDNNAFETVRNIFSAKTRDFITAVDEKN IILVKEVKNGEGYDELTKTAQVIVDMLNTEAMTKVHVAFGTIVNEIKEVSRSYKEAKMAM DVGKIFYPDKNVIAYSRLGIGRLIYQLPLPLCKMFIKEIFDGRSPDEFDEETLQTINKFF ENNLNVSETSRQLYIHRNTLVYRLDKLQKSTGLDLRVFEDAITFKIALMVVKYMKYMESI EY >gi|222441837|gb|ACEP01000105.1| GENE 2 1622 - 2899 1217 425 aa, chain + ## HITS:1 COG:CAC0499 KEGG:ns NR:ns ## COG: CAC0499 COG0793 # Protein_GI_number: 15893790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Clostridium acetobutylicum # 89 424 65 399 403 248 43.0 2e-65 MDFKEKEMNLELKEQDSEHNKKKKEEGSPKRKKGCFKRVLSVLVLVVIIGFSFFAGRQSV MTSNAYGTGLNNGKILRKLSMLEAFAGNYYLNKIDAENVENNIYKGFAKGLQDPYAEYYT PSEYQQLSEEDAGKYNGIGISIAKDTDTNYPEVVSVFKDQPAYKAGIKNGDLITAIDQKS TADMELQDVVSAIRSEDNDKVVLTIYRDKKTKDYTVEKTSVELDSVTYKMRKNKIGYIAV SQFHENTDEQFDKAVTDLESQGMKSLIIDLRDDGGGLLTTCTNMLSRLVAKDKMLVYTKD KKGKKEEFKSDSGKTVDVPMVVLVNGNTASASEIMTGCLKDYKKATVVGTTTYGKGIVQT ILPLTDGSAFKFTIAKYYTPKGTDIHEKGIEPDVEVKMSDKQWKKAQTDEKADKQLKKAI EILKK >gi|222441837|gb|ACEP01000105.1| GENE 3 3090 - 3572 389 160 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2387 NR:ns ## KEGG: EUBREC_2387 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 154 1 
157 167 104 38.0 1e-21 MLSIQISDIKDFMSHLLSKDTFDHFYFIEASIKMGVSYQIQGRINEGFYDTSVEQTLNRE FCYWKEIRHRIFDIIKGKRLPLSCKIVLGLSKQSISYLIAHNNSSFREEDIEGIYVNILY DPKNLLITTGISYKNFSLDKSLEYAFDEYLAKFLKEKAVI >gi|222441837|gb|ACEP01000105.1| GENE 4 3661 - 4935 1396 424 aa, chain + ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 4 422 11 431 433 282 34.0 7e-76 MNVLTECLAQGTSPVGVVSFAKEYLKQQGFEELYYNKMISPKENSKYYIAPFPDVLFAFT TGAGKEMIQSLRMAFAHVDQPCFKVKGKPDFKSMGCAMLNVEVYGGMMDHTWFDRPLGLA GTIVLKGEDVFNPIEVTYDSKRPVALIPGLAIHMQREANNGWKIDRQKELMPLMGLANGR WTEDAFLHFLAEELSVDESEILSYDLNLYNYDQPQLCGVKEEILCSPRIDNLASVSALLE SLTDGERESGINLIGLFNHEEVGSFTKSGADSALLPGILERILSGMGCSKEQIDISLAQS YYLSVDGAHAAHPNYTDRCDMTTRAYMGQGVTVKVSGTQKYASDCKMYAILKGLSEKYDI LLQEINDRNTIRGGTTLGPMIGSHLPMAGCDLGIPMLAMHSARETMAMADYVSLCQLVTA FFAE >gi|222441837|gb|ACEP01000105.1| GENE 5 5064 - 6533 1391 489 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2385 NR:ns ## KEGG: Cphy_2385 # Name: not_defined # Def: stage IV sporulation protein A # Organism: C.phytofermentans # Pathway: not_defined # 1 489 1 491 491 580 57.0 1e-164 MEKQNIYQDISARTGGEIYLGVVGPVRSGKSTFIKRFMELLVLPEIKDENERKRTMDELP QSGTGKMVMTTEPKFIPKEAAELPMEDMKIKIRLIDCVGFMIPGAGGNLENGQERLVKTP WFDYEVPFTKAAEFGTRKVIRDHSTIGILVTADGSFGEIPRDSYVEAEKKTVAELNEIGK PFLVLVNSERPYSKATQALTEKLSKEYNTSVMAVNCDQLRQEDILEILKNVLLEFPLSSV GFYLPKWVETLRDDHWMKKSVLDLVKEFMTDKNKMKDLYQKVFPSNDYIESGKIEKIHMD TGKVDVKIQIRDSYYYDILSDLTGLPIKSEYHLIRLMKELSAKKREFEEVSQALKDARER GYGVMKPVLSEITLSEPEVVKHGNKYGVKIRAEAPSIHLIRANLTTEVAPIVGTQLQAED LITYIKEQAGEEPREIWNVNIFGKTLEQLVDDGISGKVTKINDDSQEKLQNAMEKIVNES SGGLICIII >gi|222441837|gb|ACEP01000105.1| GENE 6 6668 - 7468 742 266 aa, chain + ## HITS:1 COG:CAC0642 KEGG:ns NR:ns ## COG: CAC0642 COG2357 # Protein_GI_number: 15893930 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium 
acetobutylicum # 6 255 6 255 262 205 45.0 7e-53 MEIQLWRELLDPYQLAVDELTVKFHHIIEEHRNQGLYSPIESVDGRVKSIVSILDKMQRK NVSMNDIEKKIEDLAGIRIICQFVEDIDRVVNLIKNRTDMKVKCEKDYVSHMKTSGYRSY HMIILYTVNTIHGPKELSAEIQIRTMAMNFWATIEHSLQYKYKENIPEYIQQKLLDASEA IIEVDHEMSDVRDEIMDAQNSHRKKETLVKDILNNIQNLYKVANQREVTKIQDEFYKVYV LDDMLQLKHFARQLDIISEGYRAQSL >gi|222441837|gb|ACEP01000105.1| GENE 7 7556 - 9013 969 485 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 1 297 1 277 280 215 43.0 1e-55 MNLPEDYQNKMQKMLGGDEYLQYLDSFNYSYGQTLRVNQLKKEPADFIRRFLLDGQVSWC STGFYYEGEQKLSAHPYYYAGVYYLQEPSAMAPAAFLPVEPGDKVLDLCAAPGGKSTALA AKLQGEGFLLSNDISASRCKPLLKNMEMAGVKNSVITCESPEKLAGRFAGYFDKILVDAP CSGEGMFRREPSMVKSWSPEEVERYAKLQREILTSAAQMIKPGGYLLYSTCTFAKEEDEQ TVEYFLEQHRDFSLQELPLCDGIEEGKPEWTIRGREDVKNCRRFLPHKVKGEGHFAALFR KADGVESVDKQCIKEDDISKSVHKETKDKEKMCKDGKTQNISKVKNVKIPEAMDEFMREI PEWEQWKNRIIFIKERAFLLPENCPDFKGLRVVRSGLYLGDCLKKRFEPSQALAMALKPE EYKRSVSFKAEDICVEKYLRGETVDIEDAGLNGWTLFCVDEFPLGWGKCNRGRLKNKYNP NWRKL >gi|222441837|gb|ACEP01000105.1| GENE 8 9165 - 9893 716 242 aa, chain + ## HITS:1 COG:lin2436 KEGG:ns NR:ns ## COG: lin2436 COG1187 # Protein_GI_number: 16801498 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Listeria innocua # 8 236 1 231 233 213 51.0 2e-55 MPETLKKMRLDKFLTQAAELTRSEAKQKIKKGSVTVDGEVVKKAEMKVSSEDAICLDGEV VTYEKYRYIMLHKPAGVVSATEDAQCQTVLDLITEGRKGLFPVGRLDKDTEGLLLLTNDG ALAHNLLSPRKHVDKTYFAILDGEVGEKEQELFLQGLDIGDEKKTLPAKLEITDKANEVF ITVQEGRFHQVKRMAEAVGRKVEYLKRISMGPLVLDKELKKGEYRPLFEEELAKLKESTA SK >gi|222441837|gb|ACEP01000105.1| GENE 9 10074 - 12182 1719 702 aa, chain + ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and 
conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 7 598 126 736 742 592 50.0 1e-169 MIKKHGFLPMNKKEMKERGWEQADFVYICGDAYVDHPSFGAAIITRLLEDAGYKVAFIAQ PDWNDESSIAVCGEPRLAFIVSAGNMDSMVNHYTSFKKRRHQDAYTPGGGFSGRPNRPTI VYSNLIRKTYKKTPILLGGIEASLRRLAHYDYWSDKVKRSVLLDSGADLLMYGMGEHSIL EIAEALDSGIAVQDITYIRGTVFKCRDLEELSGDYELLPSYEEITESKETFAKSYYRQYV NTDAITAKRLVEPYKKKEYLVQNPPSLPLTQEEMDHVYELPYMRAWHPSYDKLGGVPALK EIKFSLTSNRGCFGGCSFCALTFHQGRRIQVRSQESIIKEAELIIKDPDFKGYIHDVGGP TADFRHTACEKQLKYGVCANKQCLFPKPCANMNADHSDYITLLRRLRALPGVKKVFIRSG IRFDYVLADKKDHFLEELCKYHVSGQLKVAPEHICDSVLKLMGKPENQVYEKFAKAYRNM NKKLGKKQYMVPYLMSSHPGSHLKEAIALAEYIRDLGYMPEQVQDFYPTPGTISTCMYYT GLDPRTMKEVYVPVTLKEKQMQRALIQYRNPENYDIVKAALIKAGREDLIGFDEHCLIPP RKMGHNYHPGKKVSADNTAKSKDGRADTRKYTKPKSGKPQNMKSKQFKTDKYKTGKDRPE DFKAGRQKNEKYKTDRFTGKTQERGKSKKKTIRNVHTRKKYR >gi|222441837|gb|ACEP01000105.1| GENE 10 12206 - 12967 569 253 aa, chain + ## HITS:1 COG:RSc2321 KEGG:ns NR:ns ## COG: RSc2321 COG0300 # Protein_GI_number: 17547040 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Ralstonia solanacearum # 4 188 12 198 275 122 36.0 7e-28 MNIGIITGASSGMGKEFVKLTMQNHLDLDEIWVIARRKERLEAWPSLYKEQKFRILPLDL QKKEDVECLKQTLEETQPHIELFVHCAGFGIMGKIQDISMEEQAEMVDVNCRGVVEISSL LLPYMKYRGRMIYMASAAAFLPQPGFAVYAASKAFVLSYVRALRAENRERQLRVTAVCPG AVKTEFFNRALTKKHLPAYKKLVMADPKKVIKKAWKDNEQNKEISIYGKAIKLTHLVTKI LPHSLFLQFIGRK >gi|222441837|gb|ACEP01000105.1| GENE 11 12975 - 14243 1003 422 aa, chain + ## HITS:1 COG:CAC1015 KEGG:ns NR:ns ## COG: CAC1015 COG0564 # Protein_GI_number: 15894302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 123 368 68 312 318 154 36.0 4e-37 MKEITITKREEGQRFDKFLIKYLPGASSGFLHKMLRKKNIKLNGKKAEGREKLTAGDLIQ IFFSDETLEKFQNPQSGRDDSYEADSSMENHKSDRGFGQSGRNRQPDREQAKVYTKDRAG NGGQERKLTKEERELRKQVRVLYSSEDILVFHKPAGMLSQRAKASDDSLNDYLIDYCIQT 
GIISKEELAGFRPSIANRLDRNTSGIVLAGISIKGLQRLASMLRERTLGKYYLCLVEGKV KEDARIAGYLTKEEKNNKVSLHKEKAEGASYIETEYKVLKSTDKASLLKIRLITGKSHQI RGHLASVGHPVFGDYKYGNRDFNNQIKWKDGVNYQLLHSYELVVPEGTGELFGLHIIDPV PEVFHKVQTRWGLEVPELSYENEAKNRETDNKRAGNTRTGRTTNKKTENARTEKIRKSRR EK >gi|222441837|gb|ACEP01000105.1| GENE 12 14245 - 14790 446 181 aa, chain + ## HITS:1 COG:BH3539 KEGG:ns NR:ns ## COG: BH3539 COG3331 # Protein_GI_number: 15616101 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus halodurans # 10 166 8 164 168 113 39.0 2e-25 MATWNSRGLRGSVLEELINYANQKYKDKGLALIQKIPTPITPVRFDKESRHITLAYFEKK STVDYIGAVQGIPVCFDAKECAVDTFSLQNLHPHQVEFMKNFEKQNGISFLLIYYTHRQK CYYLRFSEMIKYWERMEAGGQKNFKYEELSPKFFVPAAGGIALHYLVAMQEDLKERNTDK K >gi|222441837|gb|ACEP01000105.1| GENE 13 14926 - 15804 900 292 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 292 1 292 292 230 47.0 3e-60 MIKSMTGFGRGESVSEDCKVTVEIKAVNHRYCDLNMKLPRKLNYLEADIRSFLKQSIQRG KVDVFINYEDLSAKDVNVRLNEELGKEYYAALTKLGETLGISSDVTALQIGRFPDVLSLE DVAVNQESIKEQLMKALSEAAGHFSDSREKEGENLKTDILAKLEGMKENVAFIEERYPSI VSEYRKKLEDKVAELLGDTNVDENRIAAEVVIYSDKICVDEETVRLKSHIDGMRDELLKG GNVGRKLDFIAQEMNREANTILSKANDIEVSDHAIDLKTEIEKIREQVQNIE >gi|222441837|gb|ACEP01000105.1| GENE 14 15816 - 16085 307 89 aa, chain + ## HITS:1 COG:BS_yloBa KEGG:ns NR:ns ## COG: BS_yloBa COG2052 # Protein_GI_number: 18677778 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 4 85 5 86 89 89 54.0 2e-18 MADFINVGFGNMVNGDKIISMVSTDAAPIKRMIQNARDEGKAVDATCGRRTRAVLVMESG HLILSALTTDTLSSRCHSKRNDEDEENER >gi|222441837|gb|ACEP01000105.1| GENE 15 16075 - 16722 729 215 aa, chain + ## HITS:1 COG:CAC1718 KEGG:ns NR:ns ## COG: CAC1718 COG0194 # Protein_GI_number: 15894995 # Func_class: F 
Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Clostridium acetobutylicum # 1 203 1 201 209 171 45.0 8e-43 MKDKGTLVVISGFSGAGKGTVSKALVEKFGYSLSVSATTRQPREGEQDGREYYFKSEDDF LRLIDYNGFIEYAQYVDHYYGTPRKFVEDELAAGHVVILEIEVQGAMKIKEQYPEAILLF ITAPSIDVLKNRLIGRGTETAEVVEKRMRRAAEEAESIDQYEFIVSNEEGKLEDCMNTIH SIIESESCRITKRQEFVESLREGLQQYKENTEKDK >gi|222441837|gb|ACEP01000105.1| GENE 16 16759 - 17079 522 106 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1674 NR:ns ## KEGG: EUBREC_1674 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Purine metabolism [PATH:ere00230]; Pyrimidine metabolism [PATH:ere00240]; Metabolic pathways [PATH:ere01100]; RNA polymerase [PATH:ere03020] # 1 85 1 85 86 88 58.0 7e-17 MLHPSYTELMEKINKEGETGEEPVINSRYSIVIATSKRAREIIGGDEPLVNGMEGEKPLS IAVKELYDSKIKILPGDEENMEDEEHLSSEEFEELGEVDPAFTNVE >gi|222441837|gb|ACEP01000105.1| GENE 17 17099 - 17881 427 260 aa, chain + ## HITS:1 COG:no KEGG:bpr_I0781 NR:ns ## KEGG: bpr_I0781 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 26 110 19 103 176 71 44.0 4e-11 MRGLPFYVKREAFARADARQRVAYILSVVLVITMLCVIFRFSAADATHSSHLSEGVCIRL VRELNSVFPEQFPKEKLVKVAEAIEYPIRKCAHFTEYAVLGITVNLYLWMCYRMEMLLSR KKKAKDIQRTEEKLQKTVKQKAVVQENVLPEIRNIKEKVTAQESATQETEIARQTVENTE QKMELSIWNFIIRAEIFCALYACSDEFHQYFVPGRSCKFFDVCVDSTGTFFGALLFWGIF HLREQKRERKRQEISKKSKS >gi|222441837|gb|ACEP01000105.1| GENE 18 17946 - 19406 2055 486 aa, chain + ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 482 1 478 482 356 41.0 5e-98 MKCSQLLEHLEYTCLQGSTDVKVTAVVNDSRKIEEGCLFLCIKGAAFDGHKFAAEAAEKG AAVLVVEDEVEVPDSVTVIKVDNTRYAMALISAAWFGYPAEELTTIAVTGTKGKTTTTYM VKSLLEEAGHKVGVIGTIEVVIGQEHIPVNNTTPESYDIHSYFRKMAEEDCDVVVMEASS QGFKLDRTAGIMFDYGLFTNLSPDHIGPNEHKDFAEYLSCKAKLFNQCRYGYANIDDEHF 
AEITKNATCPIETFGLNENADLVAYDVELTRDRDFLGVDFGLKGTCEGKISCGVPGTFNV HNALGAISIAGHMGVTVEQMNKALRHFSVKGRVQIVPTGYDYTLIIDYAHNAVALESILN TLRAYDPPRLISLFGCGGNRSKLRRYEMGEVSGKLADFTIITSDNPRFEEPQAIIDDILI GMKKTDGEYISIIDRHEAISYAMHHAQPGDIVVLAGKGHETYQEIEGKKYHMSEEEIVQD VLDGKY >gi|222441837|gb|ACEP01000105.1| GENE 19 19422 - 21674 1783 750 aa, chain + ## HITS:1 COG:BS_priA KEGG:ns NR:ns ## COG: BS_priA COG1198 # Protein_GI_number: 16078634 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Bacillus subtilis # 8 746 3 801 805 559 39.0 1e-159 MGDSNTRYADIIIDISHEALDRVFQYRVPLSLWKEVRPGSRVFVPFGRGNRETEGYVVAI RPEADYEESKIKEILRVNTEGVSVESELIKVASFLKEQYGSTMIQALRTVLPVKAKMKPK EEVFITLSADIEAAQKLLAEWEQKHYVARARFLFLLIEKGRITKEEAVKGCNFPLKEIRR LAEQGIVKLESRIKYRNPFPELTKRSTGWPLNEEQQAVADNFQKDYDNGKRGTYLLYGVT GSGKTEVYLALIEQVLAKKKQVIVLIPEISLTYQTVRRFYERFGERIAVINSRMSKGEKS DACERIRAGEADVIIGARSALFAPTERLGLIIIDEEHDGAYKSDTSPKYHARETAVYRAG LCGASVVLGSATPSVEAYARALSGEYKLWTLKKRAGNAMLPTTQIIDLREEFKKGNRSIF SGELHKKIEERLAKKEQIMLFLNRRGFAGFVSCRACGQVIKCPHCDVTLTYHRNGKLRCH YCGYEETFIKQCPICKSSHVATFGLGTEKVEAALHAEFPTARVLRMDMDTTRRKHAHEEM LAAFSKGEADILLGTQMIVKGHDYANVTLVGILAADLSLHEQDFRSGEKTFQLLCQAAGR AGRGDKRGDVIIQTYSPEHYSITTAAEHSYENFFKEEYTYRQLMGYPPCAHMLVILVQSG DESQSVVARLRIEKMIEQSQVGQEVPVQILAPGQASLSKLKDVYRQVLYLKHKDKETLLD LKRRLEPVLEKHPMFAGITIQFDFDPLSHY >gi|222441837|gb|ACEP01000105.1| GENE 20 21822 - 22298 781 158 aa, chain + ## HITS:1 COG:CAC1722 KEGG:ns NR:ns ## COG: CAC1722 COG0242 # Protein_GI_number: 15894999 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 147 1 146 150 146 55.0 1e-35 MAIRKIRFIGDPCLNKVCKPVQKITPSIETLIEDMFDTMYEARGVGLAAPQVGILRRICV IDVMDEDPIILINPEIIETAGEQTDEEGCLSIPGKCASVTRADYVKVKSFDMELNPVIIE GEGLRARALQHEIDHLDGVLYGERANEPYHDVEEEECE >gi|222441837|gb|ACEP01000105.1| GENE 21 22289 - 23224 1199 311 aa, chain + ## HITS:1 COG:CAC1723 KEGG:ns 
NR:ns ## COG: CAC1723 COG0223 # Protein_GI_number: 15895000 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Clostridium acetobutylicum # 1 307 2 309 310 288 47.0 8e-78 MRVVFMGTPDFAVPTLAALIENHEVVGVVTQPDKRKGRGKAMAFTPVKEKALEYDIPVYQ PVKVGEEEFIEILRGLNPEVIVVAAFGQILPESILNMPKYGCINVHASLLPKYRGAAPIQ WSIIDGEKETGVTIMYMEKGLDTGDMIDKVVVPIDTKETGESLHDKLAAAGGPLLLEVLD KLEAGTAVRTRQNDEESCYAKMLTKDLGKIDWNKDAASIERLIRGLNSWPSAYTAFHDKT LKIWDADVEKENGSAQPGAVAKVTSDAIYVQTGEGLLKINEVQIQGKKRMPVKAFLLGHK VEEGTLLGENA >gi|222441837|gb|ACEP01000105.1| GENE 22 23334 - 24689 1105 451 aa, chain + ## HITS:1 COG:lin1936 KEGG:ns NR:ns ## COG: lin1936 COG0144 # Protein_GI_number: 16801002 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Listeria innocua # 1 447 1 441 446 251 35.0 2e-66 MEKRTDMRAEALDTLIDIERNKKLSHIAIGETLMRNQFEKKADRAFYTRLCEGVTERKIY LDYILDHYSKTPMKKCKPLIRSLLRMGAYQILFMDVRDAAACSEAVSLAKKRGFGRLSGF VNGVLRTLVREKENLPVPDKKDDVKYVSIIYSVPEWLSKLIIEQYGKESAEKIFASFLEA RPLTIRTNLSKTTPEELEKELEEAGVKVEKGNYLPYAFTISNLNYLGKIAAFRQGKFAVQ DESSQLAVAIADINEGDFVLDVCAAPGGKTFHAADRLKGAGKVLSRDLTEYKTELIEENK DRMHYENVEVQQWDALEEDESLLESVDVLLADLPCSGLGIMGRKNDIKYQMTEEQLGELA ALQRQILSVVWKYVKPGGQMIFSTCTLNKGENAENIKWIEENTPLHLVSIEEYLPQNLKN RTGSQGYLQLIPGEDTCDGFFISKFKRPEAK >gi|222441837|gb|ACEP01000105.1| GENE 23 24691 - 25740 879 349 aa, chain + ## HITS:1 COG:BS_yloN KEGG:ns NR:ns ## COG: BS_yloN COG0820 # Protein_GI_number: 16078638 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Bacillus subtilis # 9 348 22 361 363 342 49.0 7e-94 MDSSMDLKSMTLEELTDCVMELGEKKFRAKQIYGWLHQKLVRSPEEMKNVPAKCIEKLLK EHPFYGVEEVEHYESKIDGTQKFLFSLHDGNMVESVLMKYKHGNSVCISSQAGCRMGCRF CASTLLGLSRNLYPSEMLDQIYAIQKATGERVSHLVVMGTGEPFDNFESLCRMIELLCSP DGLNISHRNITVSTCGIVPKIYEFADRNPQVTLAISLHSPNDTMRRELMPIANKYSMDEL MEAARYYTRTTGRRITFEYSLVKGVNDKKEHAQELISRVKGMNCHINLIPVNPIKERDFE 
QSTQNNVAAFKHILERQHIQVTVRREMGRDIQAACGQLRKSYKERSGDE >gi|222441837|gb|ACEP01000105.1| GENE 24 25737 - 26468 658 243 aa, chain + ## HITS:1 COG:lin1935 KEGG:ns NR:ns ## COG: lin1935 COG0631 # Protein_GI_number: 16801001 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Listeria innocua # 7 238 7 241 252 160 42.0 2e-39 MKSFGMTDIGRKRKVNQDYLFFSDEPVGCFPNLYIVADGMGGHKAGDKASSYAVNRFVEL ARKETKELPFLVMERLLNEVNEEVYTLSCKEEQYSGMGTTFVAATVVDGTAYIMNIGDSR LYYFDEKIHQVTMDHSLVEELVRAGELDRAESRNHPQKNIITKAVGVAEDVQPDFFIVDM KENSRILLCSDGLSNMVDDEKLEEILSGEGTEEELAEKCIEEALFYGGLDNIAVVIAQNR NGR >gi|222441837|gb|ACEP01000105.1| GENE 25 26474 - 28570 2130 698 aa, chain + ## HITS:1 COG:lin1934_1 KEGG:ns NR:ns ## COG: lin1934_1 COG0515 # Protein_GI_number: 16801000 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Listeria innocua # 5 319 4 325 342 277 48.0 4e-74 MLQKGTFLGGRYEILEHIGSGGMADVYKARCHKLNRYVAIKVLRREYCDNESFVRKFTVE AQSTAGLTHPNIVNIYDAGNEEGVHYIVMELAEGMTLKRYIRRYGRLSARETVDFAIQIA SGLQAAHEHHIIHRDIKPQNILVSDSGTIKVTDFGIARAATGDDTISSSAMGSVRYLSPE QARGGYADERSDIYSLGITIYEMATGKVPFDGENTVAIALMHLRDEITPPRCYFPDIPAS LEKIILKCTMKQPEQRYQSAAELILDLQKVFLSPDGSYVYVNPLVDDSPTIQRNKEDITK IRKSLNKSGNRRKEDNTKDRIRINEENEEDEDYKDDKEMNPKLEKMILLITVLFGVLVAC VVFYFIGNSLHLFQPSSGTTQQSTEQTTQDRTKATTTEEKKTEYVKMPKLLDLTKNAAED EMKAIGLKADFAYEKGSSSDDSDLIVVKQQYKKGESLAVGTKVKLTLGKKESSTTAAEKV EVPALVNYTEKKAEQELKKLGLKVKKAYANSDTVKEGYVIKQSPKGGSIVNTGFAVTITI SKGVSKTTVPSLIGVSQSVAERELNRVGLKLGSVSYDYSGEVGEGDVISQGIESGTSVDK GTKVPIVLSLGEQESYRYEGSVTIDEQPFDDGANGSVKLVLKQDSWESTIYSKSSVSNDD FPLNVTFEGDREGSGSVIMYIDGKEYNTYDVSLNAISD >gi|222441837|gb|ACEP01000105.1| GENE 26 28668 - 29531 706 287 aa, chain + ## HITS:1 COG:CAC1729 KEGG:ns NR:ns ## COG: CAC1729 COG1162 # Protein_GI_number: 15895006 # Func_class: R General function prediction only # Function: Predicted GTPases # 
Organism: Clostridium acetobutylicum # 1 287 6 287 288 247 45.0 2e-65 MKGIGGFYYVYVEGSGLYECRARGIFRNKKMKPNVGDVVDIDILSEEEKTGNLVKIHKRK NQLIRPMAANVDQALVIFAVHEPEPNFHLLNRFLITMEQQEIPVIICFNKTDLATTEQMQ QLEKDYENSGCQVIFQAAATGEGITQIEELLRGRTTIMAGPSGVGKSSTLNCISKEKQME TGAVSEKIKRGKHTTRHSELIHIGEDTYLMDTPGFSSLFLENIEKEDLKYYFPEFEPYEN TCRFNGCCHIHEPGCEVKKALEEGKISKLRYDDYCMFYEELAQQKKW >gi|222441837|gb|ACEP01000105.1| GENE 27 29738 - 30421 767 227 aa, chain + ## HITS:1 COG:CAC1731 KEGG:ns NR:ns ## COG: CAC1731 COG1564 # Protein_GI_number: 15895008 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Clostridium acetobutylicum # 30 227 25 211 211 99 35.0 6e-21 MQNGKTLLIVSGGSIDINFSKEYIKDKKYDRIIAADSGLAHCKEIGIEPTDILGDFDSLK NKELLEEYRKKGIPLREFPTRKDYTDTHLAVKYAVDLKPQKVTILGATGTRYDHALANIS LLAFLKDNGIEAKIVDAHNEIEMLHGPEERKYLRSDIKNPQNPGKEYFSIIAFSPEVTGI DEEGFSYSLKNGTLYNKESVGVSNEIMAKEATLRVKKGYLLVLITRD >gi|222441837|gb|ACEP01000105.1| GENE 28 30662 - 31702 927 346 aa, chain + ## HITS:1 COG:no KEGG:Poras_0486 NR:ns ## KEGG: Poras_0486 # Name: not_defined # Def: hypothetical protein # Organism: P.asaccharolytica # Pathway: not_defined # 69 287 36 257 1007 76 28.0 2e-12 MRKQFKTLSALMLSAVLAVSALPFSKVEAKSKWVEINGVNYEINRITGECEASLNVKKGK SEVRIPNKVKYQGDTYKVTFFSWDDWDQDWKEETNRSYKPAAGSYQAVLEKITIAKGVRV SEPACHYQKLKKIVFEEPAGVSGTEFYDCPQLQSLYIPKKVKYWPTVRKCPKVKITISSS NPYLKVINNDIYSKNGKTLYSVASTKANYKVKKSVKVIDDGAFYKNDNIKSIYLPDSVKK IGDEAFGDMKNLQSIRFSNSIKELDYAIFNKCKKLTTVKLPKKIQTIRGTFGGKKNCYLK KVYINATSLKKCNLKSIPKTCKIYVKNKKVKQQVKGAGFKGKILIR >gi|222441837|gb|ACEP01000105.1| GENE 29 32097 - 33194 1677 365 aa, chain + ## HITS:1 COG:CAC2134 KEGG:ns NR:ns ## COG: CAC2134 COG0012 # Protein_GI_number: 15895403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Clostridium acetobutylicum # 1 365 1 365 365 437 61.0 1e-122 MKLGIVGLPNVGKSTLFNSLTKAGAESANYPFCTIDPNIGIVAVPDQRLKVLSDMYDSAK 
IVPATIEFVDIAGLVKGASKGEGLGNQFLANIREVDAIVHVVRCFEDTNVIHVDGSVDPL RDIETINFELIFSDIEVLDRRISKSARAAHNDKKIAAEVEFLKKVKAHLEDGQLAKTFVT EDEDEQALLNECNLLTAKPVIFAANVSEDDLADDGAENDYVQKVREYAKSCEAEVFVVCA QIEQEIAELDDDEKAMFLEDLGLKESGLEKLIKASYSLLGLISYLTAGPIESKAWTITRG TKAPQAAGKIHSDFERGFIKADVISYEDLIANGSMTAARDKGLVRSEGKEYVMQDGDVVL FKFNV >gi|222441837|gb|ACEP01000105.1| GENE 30 33780 - 34676 781 298 aa, chain + ## HITS:1 COG:sll1187 KEGG:ns NR:ns ## COG: sll1187 COG0682 # Protein_GI_number: 16329453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Synechocystis # 24 286 20 263 283 168 38.0 9e-42 MQYTDIRFPHLGIVLSHVGRYISIGNFQIMFYGIIIACGFLAGLVVAQQEAKRTGQNPEI YMDYLLWMMIPAIIGARIYYVVFSWDAYKDNILEIFNLRHGGLGIVGGVTMAIIVLLIFA KVRKQSALLMLDTMTMGLLLGQIMGRWGNFFNREAFGGYTNGPLAMQIPLKYFEQYGRVS ELKTSGILDHLVTLTVNGEKLSYIQVHPTFLYEGLWNLLILLCIFLYRKHKKFDGELLCI YLMGYGAGRFFIEGLRVDQLIIGHTGIAVTQVVCICIFAGGLLGMILGRIRHKKGKAC >gi|222441837|gb|ACEP01000105.1| GENE 31 34807 - 35589 560 260 aa, chain + ## HITS:1 COG:BH3660 KEGG:ns NR:ns ## COG: BH3660 COG1682 # Protein_GI_number: 15616222 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate export systems, permease component # Organism: Bacillus halodurans # 1 260 1 256 256 105 28.0 1e-22 MKRFWRDLTSHYRYAIYSAKSELKSEVANSYLNWIWWVLEPFCFMLIYAFIFGVVFEARE QYFPAFIFIGITAWDFFNRCLTQSVKLLKNNKSIVTKVYIPKFILLISKMFFNGYKMLFS FGIIVIMIIYFRIPITLNILYVIPSFIVLFVVTFACMTHLMHYGVFIQDLANVTNIVLRM VFYMTGIFYNIEKRLPAQYTGILMKGNPMALVLSSLRKCILYGQAPDVKWLIIWFIIGII VAALGVRKIYKNENSYVKVI >gi|222441837|gb|ACEP01000105.1| GENE 32 35589 - 36326 866 245 aa, chain + ## HITS:1 COG:slr0982 KEGG:ns NR:ns ## COG: slr0982 COG1134 # Protein_GI_number: 16329492 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: ABC-type polysaccharide/polyol phosphate transport system, ATPase component # Organism: Synechocystis 
# 17 240 30 256 430 187 42.0 1e-47 MEENKNSAISVRNVCIDYKSLKTGSIKQNFFKLKKVEQEVFHAVRNVSFEVPEGQILGIT GKNGSGKSTMLRAIAGIFSADSGEIDLHGHTISLLSIGVGFKNELSGRENIILSGMLLGF SREFIIEKMPEIIEFAGIGNFIDMPVKTYSSGMYSKLAFSITAILETEIMLIDEVLSVGD QKFKKKSYNKMKELISKRDRTVVIVSHSMSTLEDLCDQIIWMHDGEIVMYDKPEIVLPVY TDFMR >gi|222441837|gb|ACEP01000105.1| GENE 33 36657 - 37604 1017 315 aa, chain + ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 1 310 1 309 312 361 60.0 1e-99 MEFVHKSIMLEEVIESLAIKPNGIYVDGTLGGAGHSSEIVKRLGEDGRLIGIDQDGEAIE AATKRLKPYKDKVTIVRSNYAQMKEVLRDLGIPKVDGILLDLGVSSYQLDNAERGFTYRE DVPLDMRMDQRQTKTAKDIVNDYSEMELYHIIRNYGEDKFAKNIAKHIVQARQKTPIETT GQLIEVIKAAIPKKVRATGGHPAKKTFQAIRIELNHELDVLKNNLEDMIDLLNDEGRIAI ITFHSLEDRIVKNIFRTSERPCICPPEFPVCVCGRVSKGKVITRKPIVPGKEELEENSRS KSAKLRVFERRITEG >gi|222441837|gb|ACEP01000105.1| GENE 34 37633 - 38154 557 173 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028109|ref|ZP_03717301.1| ## NR: gi|225028109|ref|ZP_03717301.1| hypothetical protein EUBHAL_02379 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02379 [Eubacterium hallii DSM 3353] # 1 173 1 173 173 271 100.0 1e-71 MALRNTSERYEYIYGSNVRKLETDMAGKQTAKAVPAKKNVRRPEQRRTGNKAADSEAAAA IQKNAAKFLAFDWKYTLVALIAVIICSAAAMFYLHANAQLNVMEKQISDLKTEKTSLESK QSAIQAEIDKSINLDQIKDYAEKKLHMIYPKDSDVLLYQRDSDDYFRQYESVD >gi|222441837|gb|ACEP01000105.1| GENE 35 38182 - 40287 1897 701 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 15 664 12 604 729 253 30.0 7e-67 MAKKKKADRRLTTPMKGKLILVFGGIIFLFLILVVKLCYVTATKGKEYETAVLSQQKHDS 
TILPFERGKIYDRNGNILATNEKIYTLILEPQNILLQDKKYEEATINALHEYFGFSKSKL KKTIEDNPYSYYEVYKKNMTYDEVEKFQKFLDLADASMKGVSKAKKAKIESAKMVKGVSF EEDYKRVYPYNSLACRVLGYTSSGNVGNWGIEQYYNSQLNGVNGRAYYYFNEELDQEQSV KEPKNGNSVVSTIDMQIQKIIEQKLKDFDDKIGSKVTNILVMNPQNGEILGMTSSYPYNL NKPMDEKSLLSLYSQSDIDKMKAYTKQKQAEETTDSEDASEDSQDSTKKKTDDQKTIYDA FNELWRNSIISDTNEPGSTYKPFTVATGLESGALTGNENYFCTGSLMVGKRNIGCSHVHG NITLKDAVAKSCNVAMMNIGFKEGADTFYKYQNIFGFGRSTGIDLPGETDTKSLVYNASN YSNSVTLATNAFGQNFNCTMMQMAAGFCSLINGGNYYRPHIVKQIQSDNGAVVKDIGKEV LRKTISEETSATIRSYMQQTVESGTGTKAQIEGYSIGGKTGTAEKIPRNKKDYYISFIGF TPVESPQLLIYVTIDEPNVSFQANAGLAVELEKACMEEIVDVLGIKPETTDTSNTDNAST EQKDDTTSATTEASSNTTKNKKNTSSSTTKSKKNTKDAANQ >gi|222441837|gb|ACEP01000105.1| GENE 36 40381 - 42099 1478 572 aa, chain + ## HITS:1 COG:BS_spoVD KEGG:ns NR:ns ## COG: BS_spoVD COG0768 # Protein_GI_number: 16078581 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus subtilis # 1 569 1 562 645 382 39.0 1e-105 MRAIKVHTFQKEKYTLVFGAFILGFTLLAGRLCYLMIFRSEELSKKALNIEQRERTIKAA RGKIYDRNGIVLADNQAVCSVSVIYYQIKESQKVIKLLSQKLALSEEEVKKKVEKVSSRE KIKSNVPKSVADEIREAGLDGVKVDEDYKRYYPFGTLASKVLGFAGADNQGILGIESRYD DVLKGTDGKILTLTDYQGIEIENAAETRVEPVNGNDLYLSVDYNVQCYVQQAAEKVLKVK KAKRVSVILMNPQNGEIYALVSLPEYDLNEPFVLTKAYEAEGKNQNDKLNDMWRNPVISD SYEPGSTFKIITATAALEERKVTLQDSFFCPGFKIVEDRKIRCHKTQGHGSETFKQGVMN SCNPVFMEIGARVGAKDMLRYYHKLGLYERTGVDLPGEANSIMHKLDKIGAVELATMSFG QSIQITPLQLMRAVSAGINDGRLVTPHFALEKKNPVTKEITEYEYKEKAGAVSKETSQTL REVLEAVVAEGTGKNGQVEGYRVGGKTATSEKLPRRSGKYISSFLGFAPANHPQILGLVL IDEPQGTYYGGVIAAPVMAEIFQNVLPYLDNL >gi|222441837|gb|ACEP01000105.1| GENE 37 42162 - 43115 1139 317 aa, chain + ## HITS:1 COG:BS_mraY KEGG:ns NR:ns ## COG: BS_mraY COG0472 # Protein_GI_number: 16078583 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 1 314 2 318 324 244 46.0 1e-64 
MDHSIAIPALVAFFITLILGPGLISFLHKLKFGQFIREEGPESHLKKSGTPTMGGILFLV GILVGSAFYISDYKKIIPILFVTLGFGLIGFLDDYIKVVMKRNLGLTPAQKMLGQIVITA VFAFYMVNHSGLGTDVIIPFSGGKEWHLGILYIPILFVVVLGTVNGANFTDGLDGLAASV TTLIAVFLTVIAVGTNAGIHPVICAVIGALLGFLCFNTHPAKVFMGDTGSLALGGFVVSA AYMLKIPLFIPIVAFIYFIEVLSVIIQVTYYKKTKKRIFKMAPIHHHFELSGWPETKVVS IFSIVTAVLCLIGMLAY >gi|222441837|gb|ACEP01000105.1| GENE 38 43417 - 44778 1742 453 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 3 452 7 451 451 260 37.0 4e-69 MDLKSEKILIAGAGISGIGAAALLGKAGIAAAIYDGNEELDKEAVAAKLPEGMEISFELG EFKEELADKYDMLILSPGISIHKPVAKAFYDRKKPVWGEIELAAHFAKGKIAAITGTNGK TTTTALVGSILREYYKSVFVVGNIGIPFTSVALDTTEDSVIAAEISSFQLETTIDFQPEV SAVLNVTPDHLDRHGTMEVYADTKFSISKKQNKEQVIVLNYDDPITREMAKKATARPVMF SRLEELDEGIVLRGDTFIIKEDGKELPVCTTSEIKLLGSHNHENILAAIGITYYMGVPVE QIRDAVMKFTAVEHRIEYVDTVDGVDYYNDSKGTNPDAAIKGISAMTKPTILLGGGYDKK STYDEWIDAFHGKVRYLIVMGATAQAIADTAKNHGFHNVIFADTFEEAMNIAKEKARPGD AVLLSPACASWGMFKNYEVRGKVFKEIVRKFKG >gi|222441837|gb|ACEP01000105.1| GENE 39 44784 - 45905 1031 373 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 34 368 26 363 366 244 43.0 2e-64 MQKKSWKKPKRRPQAMDYSILFLVLFLVGFGLVILYSTSSYKASLLYNDTTYWVKKQALF AAMGICGMLFIATRDYHIWQKKWWFAWVIYGGVIGLLLLTFAIGAASHGSQRWISIGPFS LQPSELAKIGIILFLAAYISSKSREMRQWKKMVIPFLFAIPIIVIVGIENLSTCIILLAI SFIMIFVATPLLVPFVVIGLIGVAGAGGLLLTQGYRMERITVWLDPAASEKGHQTIQGLY AIGSGGLFGKGLGQSMQKLGFLPEANNDMIFSVICEELGLFGALCVLALFFALIWRFMVI AVNAPDLYGSMIVVGVIAHIGIQVFINIAVATNTIPNTGIPLPFISYGGSSLVFMLLEMG LVLSVSRYINVKK >gi|222441837|gb|ACEP01000105.1| GENE 40 45922 - 46698 548 258 aa, chain + ## HITS:1 COG:no KEGG:Closa_2458 NR:ns ## KEGG: 
Closa_2458 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 15 254 5 237 242 108 26.0 2e-22 MGERNLKLVADNKPRKRHILLLFACLAVVLGMGIYFLITDFSIQKIQISGNNTYTNAEII EAMKEDGYIDNTLLMIAQNQIFDQTYLPFIEKVSMSYDDSHILKVKVKEKLRTGVFKYMN EYVYFNENGIAMESRNTLFEGVPVVTGVKFNEMNLKRKILENKVPVKKNYFNTIVSITKK ITTYKLNVSEIHFEGEDDITLTSSKYKIYLGSSSYLDGKMSKLSSILETVSSNYKKGTID MHLYTDDKPIVTFKENDK >gi|222441837|gb|ACEP01000105.1| GENE 41 46889 - 48043 1694 384 aa, chain + ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 13 383 12 368 373 317 52.0 3e-86 MLEIMSNEELAQAKILVIGVGGAGNNAVNRMVDEAIEGVELIGINTDKQALDLCKAPTRV QIGEKLTKGLGAGAKPEIGAAAVEENRDEITELVKEADMVFVTCGMGGGTGTGAAPVVAE IAKEMGILTVGVVTKPFIFEGKPRMNNALNGIERLKENVDTLIIIPNDKLLQICDKRTSI KDAFCKADEVLQQGVQGITDLIFKPGLINLDFADIQTVMRDKGIAHIGIGVGSGEDKAVD AIKSAMESPLLETTVSGATDIIINFSGDIGIQEVYEAVSYLTEVAGDEANIIFGNVESED VPDDEVSITIIATGLHDVSGKGASAVKVRKPAADKNNAESNGPTYSGQAIKTGTGRSSDS RESSRTYDNRDMEIKIPEFLQKRR >gi|222441837|gb|ACEP01000105.1| GENE 42 48503 - 49171 429 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028117|ref|ZP_03717309.1| ## NR: gi|225028117|ref|ZP_03717309.1| hypothetical protein EUBHAL_02387 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02387 [Eubacterium hallii DSM 3353] # 1 222 102 323 323 387 100.0 1e-106 MGGAVSFFLSCIHIKTVGGRKTAWKAMAVIWFSAGVIFLWKMKKKKENATQISKTYTYEV RISRKGRKAVYKGIYDSGNLLVSQITGQGVCIIQKEQAAKLLLTKEKSEIEQIYIQFSEE KEEKKREKNGKKKQQIEQPLWRMWAKQFQTGIYVLQYSTVGKKNAKMPGVMAEEIVVLKD GEVLTRTKGMLGISQEELSETKEFFVLLPNDIFDREESSNIL >gi|222441837|gb|ACEP01000105.1| GENE 43 49210 - 49953 454 247 aa, chain + ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus 
halodurans # 30 246 23 236 237 281 70.0 6e-76 MAVVLNASGVNRVRFTVINNMKSLFHPQREIHYIGGTDILPPPLEPEEEAYVLSQLDTRR SEEAKALLIERNLRLVVYIAKKFENTGIGVEDLISIGTIGLIKAIQTFKLDKKIKLATYA SRCIENEILMYLRRTNKTKMEVSIDEPLNVDWDGNELLLSDILGTDSDVISKDIEDEIDR KMLNKAMETLSDREKLIIRLRFGIGGEESEEKTQKEVADLLGISQSYISRLEKKIIHRLK REMIKMQ >gi|222441837|gb|ACEP01000105.1| GENE 44 50469 - 51248 651 259 aa, chain - ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 1 257 1 257 257 322 61.0 6e-88 MFLNKVEICGVNTSELPLLSDEEKENLFIQIEQGNMDAREQYVKGNLRLVLSVIQRFSSN HEHADDLFQVGCIGLMKAIDNFDRTLNVKFSTYAVPMILGEVRRYLRDNNSIRVSRSLRD TAYKAIYAKEQLTKSMNRTPTIEEVSKKTEIPEEDIIYALDAIATPMSLFEPVYQDGNDP LFLMDQICDKKNKEDTWIEHLSLQEAMNQLSGREYHIIEKRFFEGKTQTEVADEISISQA QVSRLEKNALKSIRNFIER >gi|222441837|gb|ACEP01000105.1| GENE 45 51451 - 52245 830 264 aa, chain + ## HITS:1 COG:CAC3586_1 KEGG:ns NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 253 1 245 245 215 42.0 7e-56 MVAEIVCVGTELLLGDIVNTNAQHISKELAHIGIDLYYQTVVGDNKERVWNVLDTALKRS DLIIMTGGLGPTKDDLTKEVAASFFGKKLVFHEQTYEHVRKKLESYGINEMTESQKKQAL VPEGSLVVTNPAGLAPGIIMAQGDKAIVMLPGPPKEMKAVLAECCHLFINRLSDQVFVSI NIKCKGPDELPLREIGEAPVADLLGDILDNENPTVATYAKEDGVLIRVTASGKTREDALT AMQPVVTKIAEILAGKIAWVKEEV >gi|222441837|gb|ACEP01000105.1| GENE 46 52277 - 53512 1348 411 aa, chain + ## HITS:1 COG:SPy0324 KEGG:ns NR:ns ## COG: SPy0324 COG3633 # Protein_GI_number: 15674487 # Func_class: E Amino acid transport and metabolism # Function: Na+/serine symporter # Organism: Streptococcus pyogenes M1 GAS # 6 404 1 401 404 400 59.0 1e-111 METSSLKRIVNKYTETSLIIRILIGMAIGVVLALAVPQFTGIKILGDLFVGALKAIAPLL 
VCLLIMSSLAQTKEGHNGNMKTVVILYMFSTIMGAIVAVAGSFLFPLKITLADAVQQEAP KGVVEVLENLLLKIVANPIDALVNANYLGVLAWAVILGIALKKSTPGTKQMLSDASDAVS QAVRWIINLAPFGILGLVFNAVSTSGMKIFTQYGKLILLLVGCMLFQEFITNGIIVGFCL KKNPYPLISRCARESGLTAFFTRSSAANIPVNMELCEKMGLDKDNYSVSIPLGSTINMDG AAITITVMTLAAAHTLGISVSIPTAIVLSILATLSACGASGIAGGSLLLIPVACSLFGIS NDIAMQVVGVGFIIGVIQDSCETALNSSSDVLLTATAEFMQWRKAGKEIPF >gi|222441837|gb|ACEP01000105.1| GENE 47 53947 - 54483 427 178 aa, chain + ## HITS:1 COG:CAC0841 KEGG:ns NR:ns ## COG: CAC0841 COG1396 # Protein_GI_number: 15894128 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 178 1 179 179 175 50.0 3e-44 MELGEKIKELRNKQGLTQEELADRAELSKGFISQLERDMTSPSIATLEDLLQCLGTTLGE FFNEEQEEEQIVFTEEDFFVKKDEEYKNQIKWIIPNAQKNCMEPIHLLLEKGGETCPDNP HEGEEFGYILKGKVEIHLGKRICIAQKGDAFYYRADKTHYLKGIERTELIWVTSPPSF >gi|222441837|gb|ACEP01000105.1| GENE 48 54520 - 55593 1398 357 aa, chain + ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 2 350 3 351 352 399 57.0 1e-111 MKNKLVELVNVSKSFGSTKVLDELCLEVKENEFLTLLGPSGCGKTTTLRIIGGFEKPDSG KVFFDGADITNVPANKRNLNTVFQKYALFGHMTIEENIAFGLKIKNKSDSYIKDKIKYAL KLVNLEGFEKRKPTSLSGGQQQRIAIARAIVNEPKVLLLDEPLGALDLKLRQDMQYELMR LKEELGITFIYITHDQEEALTMSDKIIVMNKGYIQQMGTPEDIYNEPENAFVADFIGDSN IIDGIMLEDRLVEILGTKFECVDEGFGRNRPVDVVIRPEDVVLGEEGQGILDGVVTSLIF KGVHYEMEVEANGYEWLVHSTNCHPVGSKVSISVIPFNIQIMHKPESEDEKAVTIDE >gi|222441837|gb|ACEP01000105.1| GENE 49 55586 - 56401 712 271 aa, chain + ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 3 271 6 277 277 233 44.0 4e-61 
MSRKVLTTPYIVWGAAFIILPLFMVLYYGCTGSDGSFTLSNITAMADSVHYKAFWNSVWI ALASTLICLVISYPLAYILSKGKHRGSGIVIMLFILPMWINFLLRVLALQVILSKTGILN AILGFFGLPLQKLMYTKGAVLVGMVYDYIPFMILPIYNALCKIDKDVIEAAHDLGASARV TFQKIIIPLSLPGVISGIIMVFIPSISEFVVADILGGSKVLLFGNVIDQEFNVANNWNLG SGLSIVLMIFIFISMAIMNRHASEEGDNMLW >gi|222441837|gb|ACEP01000105.1| GENE 50 56395 - 57195 657 266 aa, chain + ## HITS:1 COG:CAC0838 KEGG:ns NR:ns ## COG: CAC0838 COG1177 # Protein_GI_number: 15894125 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component II # Organism: Clostridium acetobutylicum # 5 253 3 252 260 199 42.0 4e-51 MVKAKSFIEKFYVAFIGFLLYAPLLVLFVCSFNDSKSRTVWGGFTLHWYTDLFQNEEVLS AVRTSLLLTTLAALIATLIGTLACIGMTAMKLRFQKVMEGLSNIPLLNADIVTGVAIMLL FVHFMSLGFTSMLIAHVTLGLPYIILNVMPRFQQLDKNVYEAAQDLGASAVYAFVKVVLP EIIPGILSGFFFAFTVSMDDFVVTYFTKGAGINTISTMLYQQLRRGINPQMYALSTLLFL TILILLGLINHMSATKQKKADREENR >gi|222441837|gb|ACEP01000105.1| GENE 51 57192 - 58298 1068 368 aa, chain + ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # Func_class: E Amino acid transport and metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 19 362 7 347 354 295 45.0 1e-79 MNLLEKYYLQGKGKLMAGIFVLVVLLIISTIAFTGCFSSGKKSDDKNTVTVFNYGDYIDV SVLKTFEKETGIKVKYEEYVTPEDMYTKYKSGVINYDVICTSDYMVEKMMQEGEAQKIDT SSMEYYKNIDEKYLDFCKAFDPTNEYAVPYLWGTVGILYNTKVIKEKVDSWSILWDKKYD DQIVMQNSMRDAFMVPLKWNGISLNTTSQKELKQAQSLLLKQKPIVEAYLVDEARDAMVS GDASMAVIYSGDATVAMEENEDLDYVVPKEGSNVWFDCFLIPKTAEHKEAAEKFIDYMNR QDIAQKNFDYIYYGTPNKAVYDTLDDDTKEDYTIFPEEATLDKCEVYKYLGEKTDQYYNR LWKELKSY >gi|222441837|gb|ACEP01000105.1| GENE 52 58515 - 59012 570 165 aa, chain + ## HITS:1 COG:BMEI0508 KEGG:ns NR:ns ## COG: BMEI0508 COG0782 # Protein_GI_number: 17986791 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Brucella melitensis # 2 145 3 145 157 91 39.0 8e-19 
MKHKLTQQDVDKMKAEIEHRKVEVRPELLEHVKEARAHGDLSENFEYHAAKKEKNRNESR IRYLERMIRTCEIVEDHSNADQVGLNKTVTLYFEEDDETDKFSFVTTVLVDTMQNRVSIE SPLGKALLGHKKGERVYIPVNAQYGYYVEIKEIEPFDEDVPLNEY >gi|222441837|gb|ACEP01000105.1| GENE 53 59027 - 60103 827 358 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 3 323 2 307 333 129 28.0 6e-30 MGKRNRVTTRDIAEELGISQSTVSMILSNKANVSFTEETVQKVKTKAKELGYKKPVPKEL VQEKSLANTIVVLCPMVTNGYYSMMMQSITDHAKEYDYTVMTISTGRDAATEELYLDLFA RVQLAGIICLYPLSKIQKINALSKRFPVISVGDKPLSCQFDSVELDSRKQGHIMANYLLS LGHTDITYISTPIRGKEIGRIHRLDGVKSSFREHGIPLEHLTVLYQSQPAFDRYPLENAE YQNGYDLATHALEEHTSSTAFIGNNDMAAFGIMAAISDHGYRIPADFSVCGFDNITLSAM PQIALTTIDHASVRKGEEAVDMIHRKNTQKKDETSHAYIMRLEYEPQLIVRKSTGRKF >gi|222441837|gb|ACEP01000105.1| GENE 54 60435 - 61490 1394 351 aa, chain + ## HITS:1 COG:AF0871 KEGG:ns NR:ns ## COG: AF0871 COG0240 # Protein_GI_number: 11498477 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Archaeoglobus fulgidus # 3 311 2 303 335 147 30.0 2e-35 MSVITFVCAGQMNSAITFPAFENGHEVRLVGSPLDRDIIYGLKKDNFHITLKRTLHDGIK YYQIEELEEALKGADLVLGGVSSFGLDWFCDEILPVLPEDVPVLTVTKGMVDLEDGTLVP YPHIFAQRQPEGKHLNLNAIGGPCTSYELADHDDSHVAFCGKDIETLKYIKSLLETDYYH ISLSTDVVGVECAVAMKNAYALGVSLAVGLAEKRDGKIGSQHYNSQAALFGQSVKEMTKL LELIGGKPENIIHGAGDLYVTIFGGRTRKIGTLLGRGLSFEEAMAELQGVTLESIVISTR TARAVRKLAERGIVSLEDFPLLMHVDAIINQGAEVNIPWKDFTWNDFSLAE >gi|222441837|gb|ACEP01000105.1| GENE 55 61521 - 63185 1650 554 aa, chain + ## HITS:1 COG:BS_araB KEGG:ns NR:ns ## COG: BS_araB COG1069 # Protein_GI_number: 16079931 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Bacillus subtilis # 4 550 3 549 560 595 51.0 1e-170 MSKYAIGIDYGTLSVRALLVNIETGEEVAASIYEYPHGVMEEHIPSGKKLPVGWALQDPQ DYLEGLLVTVRDVVSRNKILPGEVVGIGVDFTSSTVIPVKADKTPLCHLPEFKEEPHAYV 
KLWKHHGAEEEAKLMNDVAVARGEEWLPTYGGKISSEWMYPKIYETLRHAPEVYDAADRF MEAGDWIIWQMTGEETRSACCAGYKAYYHHEKGYPSKDFFKAVDPGMENIVADKLDAPIK GVGEKAGHLTASMAREMGLMEGIPVATCIIDAHASLPGCGIGEPGKMMIIVGTSSVHMML GEKEVAIKGSSGTVKDGIMPGYFGYEAGQSCVGDHFAWFVENCVPESYEQEARVRGISIH QLLSEKLSTYKAGASGLLALDWFNGVRTPLMDFNLNGLIMGMNLLTKPEEIYLSLIEATA YGTRMIIEGFEEAGLEVKDITLSGGIPIKNEMLVQVYSDVCNRKIKVVDSTQSSALGAAI LGAAAAPERITGFKNANEAAKKLGKVKDKVWEPNQDNVEIYNELYEEYKTLHTYFGTGIN DVMKHLNNIRERRK >gi|222441837|gb|ACEP01000105.1| GENE 56 63328 - 63417 89 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYKIEIVKVYEVKSEVANEFFKKDKRVA >gi|222441837|gb|ACEP01000105.1| GENE 57 63383 - 63958 381 191 aa, chain + ## HITS:1 COG:L181238 KEGG:ns NR:ns ## COG: L181238 COG0494 # Protein_GI_number: 15673901 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Lactococcus lactis # 6 182 4 179 186 143 45.0 2e-34 MNSLKKIKEWLNTFTSFNEQEQVDCQRIRQFLEEEENLLLRDNVKMHFTASAWVLNPSKN KVLMLYHNIYQSWSWSGGHADGEGDLLSVAMKEVKEESGLVSLKPLSDSPISIEILGVQP HYKKQKYVSAHLHLNYTFLLHNTKEEKLKICPEENSKVGWLSPDEAVRSSTEAWMKPIYK KLNQKMRKYLG Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:04:13 2011 Seq name: gi|222441836|gb|ACEP01000106.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont365.1, whole genome shotgun sequence Length of sequence - 95997 bp Number of predicted genes - 87, with homology - 86 Number of transcription units - 42, operones - 24 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 44 - 103 6.6 1 1 Op 1 . + CDS 180 - 665 177 ## EUBELI_00161 stage II sporulation protein R + Prom 729 - 788 5.5 2 1 Op 2 . + CDS 817 - 1776 819 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases + Term 1807 - 1844 6.6 + Prom 1825 - 1884 3.3 3 2 Op 1 . + CDS 2131 - 3186 1150 ## COG1459 Type II secretory pathway, component PulF 4 2 Op 2 . 
+ CDS 3203 - 3541 314 ## bpr_I2055 hypothetical protein + Prom 3616 - 3675 7.2 5 3 Op 1 . + CDS 3721 - 4206 330 ## bpr_I2056 hypothetical protein 6 3 Op 2 . + CDS 4203 - 4619 403 ## gi|225028140|ref|ZP_03717332.1| hypothetical protein EUBHAL_02412 + Prom 4720 - 4779 3.6 7 4 Op 1 . + CDS 4835 - 5377 440 ## gi|225028141|ref|ZP_03717333.1| hypothetical protein EUBHAL_02413 8 4 Op 2 . + CDS 5404 - 6468 1036 ## COG2805 Tfp pilus assembly protein, pilus retraction ATPase PilT 9 5 Tu 1 . + CDS 6986 - 7996 1213 ## gi|225028143|ref|ZP_03717335.1| hypothetical protein EUBHAL_02415 + Term 8036 - 8084 8.2 + Prom 8002 - 8061 7.9 10 6 Op 1 . + CDS 8121 - 8354 64 ## gi|225028144|ref|ZP_03717336.1| hypothetical protein EUBHAL_02416 + Prom 8356 - 8415 8.2 11 6 Op 2 12/0.000 + CDS 8452 - 9300 936 ## COG3959 Transketolase, N-terminal subunit 12 6 Op 3 . + CDS 9290 - 10246 1041 ## COG3958 Transketolase, C-terminal subunit 13 6 Op 4 . + CDS 10243 - 10968 1021 ## COG0274 Deoxyribose-phosphate aldolase 14 6 Op 5 1/0.000 + CDS 11021 - 11746 1050 ## COG0274 Deoxyribose-phosphate aldolase + Term 11876 - 11913 0.6 + Prom 11759 - 11818 2.0 15 6 Op 6 . + CDS 11943 - 13460 1678 ## COG1070 Sugar (pentulose and hexulose) kinases + Term 13514 - 13571 17.3 + Prom 13607 - 13666 7.8 16 7 Tu 1 . + CDS 13693 - 14415 813 ## COG2188 Transcriptional regulators + Term 14548 - 14582 -0.5 + Prom 14524 - 14583 7.7 17 8 Op 1 . + CDS 14805 - 15650 1162 ## COG0191 Fructose/tagatose bisphosphate aldolase 18 8 Op 2 . + CDS 15683 - 16624 1030 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) + Prom 16650 - 16709 5.2 19 9 Op 1 . + CDS 16776 - 17810 1277 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 20 9 Op 2 . + CDS 17815 - 18240 233 ## Closa_0530 hypothetical protein + Prom 18267 - 18326 4.2 21 10 Tu 1 . + CDS 18379 - 19566 730 ## COG2508 Regulator of polyketide synthase expression + Prom 19892 - 19951 7.1 22 11 Op 1 . 
+ CDS 20026 - 20112 129 ## + Prom 20116 - 20175 5.1 23 11 Op 2 . + CDS 20197 - 21327 1242 ## COG1454 Alcohol dehydrogenase, class IV + Term 21505 - 21552 11.1 + Prom 21423 - 21482 5.4 24 12 Tu 1 . + CDS 21637 - 22851 1792 ## COG4992 Ornithine/acetylornithine aminotransferase + Term 22862 - 22910 7.1 + Prom 22969 - 23028 6.5 25 13 Op 1 . + CDS 23070 - 23675 681 ## EUBREC_0098 cytidylate kinase 26 13 Op 2 1/0.000 + CDS 23688 - 25205 1780 ## COG0531 Amino acid transporters + Prom 25745 - 25804 4.2 27 13 Op 3 . + CDS 25838 - 26401 563 ## COG1396 Predicted transcriptional regulators + Prom 26483 - 26542 1.8 28 14 Op 1 . + CDS 26728 - 27438 481 ## CLL_A2067 hypothetical protein 29 14 Op 2 . + CDS 27454 - 27942 294 ## gi|225028164|ref|ZP_03717356.1| hypothetical protein EUBHAL_02436 + Term 28040 - 28090 14.7 - Term 28027 - 28077 13.1 30 15 Tu 1 . - CDS 28108 - 28566 711 ## Cphy_3788 hypothetical protein - Prom 28602 - 28661 6.3 31 16 Op 1 . - CDS 28772 - 29434 432 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase 32 16 Op 2 . - CDS 29521 - 31032 1679 ## COG1640 4-alpha-glucanotransferase - Prom 31061 - 31120 5.0 + Prom 31390 - 31449 8.1 33 17 Op 1 . + CDS 31649 - 32674 877 ## COG1609 Transcriptional regulators 34 17 Op 2 . + CDS 32678 - 34096 1788 ## COG2407 L-fucose isomerase and related proteins + Prom 34108 - 34167 7.4 35 18 Op 1 12/0.000 + CDS 34279 - 35106 1023 ## COG3959 Transketolase, N-terminal subunit 36 18 Op 2 2/0.000 + CDS 35099 - 36031 1382 ## COG3958 Transketolase, C-terminal subunit 37 18 Op 3 . 
+ CDS 36034 - 37515 1898 ## COG0554 Glycerol kinase + Term 37653 - 37689 0.8 + Prom 37578 - 37637 5.2 38 19 Op 1 3/0.000 + CDS 37719 - 38756 1451 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 38807 - 38853 -0.5 + Prom 38760 - 38819 2.5 39 19 Op 2 4/0.000 + CDS 38901 - 39542 894 ## COG5618 Predicted periplasmic lipoprotein 40 19 Op 3 21/0.000 + CDS 39586 - 41115 1773 ## COG1129 ABC-type sugar transport system, ATPase component 41 19 Op 4 . + CDS 41209 - 42267 1325 ## COG1172 Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components + Term 42302 - 42346 8.2 + Prom 42334 - 42393 8.8 42 20 Op 1 9/0.000 + CDS 42520 - 44511 760 ## PROTEIN SUPPORTED gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein + Prom 44541 - 44600 5.1 43 20 Op 2 16/0.000 + CDS 44791 - 45237 504 ## PROTEIN SUPPORTED gi|160881875|ref|YP_001560843.1| ribosomal protein L9 44 20 Op 3 . + CDS 45321 - 46670 1347 ## COG0305 Replicative DNA helicase 45 21 Tu 1 . + CDS 47056 - 47946 448 ## Closa_0311 MerR family transcriptional regulator - Term 48281 - 48318 4.6 46 22 Tu 1 . - CDS 48348 - 48518 271 ## gi|225028182|ref|ZP_03717374.1| hypothetical protein EUBHAL_02454 - Prom 48613 - 48672 6.3 + Prom 48650 - 48709 7.0 47 23 Tu 1 . + CDS 48757 - 49437 617 ## COG0515 Serine/threonine protein kinase + Prom 50036 - 50095 3.9 48 24 Op 1 16/0.000 + CDS 50186 - 51421 1469 ## COG0117 Pyrimidine deaminase 49 24 Op 2 15/0.000 + CDS 51438 - 52106 683 ## COG0307 Riboflavin synthase alpha chain 50 24 Op 3 18/0.000 + CDS 52128 - 53333 1539 ## COG0807 GTP cyclohydrolase II 51 24 Op 4 . + CDS 53409 - 53882 795 ## COG0054 Riboflavin synthase beta-chain + Prom 54177 - 54236 7.8 52 25 Tu 1 . + CDS 54411 - 54623 407 ## Ethha_1084 domain of unknown function DUF1858 + Term 54781 - 54829 -0.3 - Term 54998 - 55054 4.5 53 26 Tu 1 . - CDS 55219 - 56085 904 ## Closa_0328 Negative regulator of genetic competence 54 27 Tu 1 . 
- CDS 56814 - 57521 297 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 57570 - 57629 6.6 + Prom 57701 - 57760 7.1 55 28 Op 1 . + CDS 57783 - 58844 1206 ## COG0009 Putative translation factor (SUA5) 56 28 Op 2 . + CDS 58856 - 59329 373 ## gi|225028194|ref|ZP_03717386.1| hypothetical protein EUBHAL_02466 57 28 Op 3 . + CDS 59362 - 59544 87 ## gi|225028195|ref|ZP_03717387.1| hypothetical protein EUBHAL_02467 + Prom 59547 - 59606 3.7 58 29 Op 1 . + CDS 59628 - 60044 235 ## Cphy_3743 hypothetical protein 59 29 Op 2 40/0.000 + CDS 60049 - 60741 532 ## COG0356 F0F1-type ATP synthase, subunit a 60 29 Op 3 37/0.000 + CDS 60813 - 61079 589 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 61 29 Op 4 38/0.000 + CDS 61115 - 61657 676 ## COG0711 F0F1-type ATP synthase, subunit b 62 29 Op 5 41/0.000 + CDS 61642 - 62214 738 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 63 29 Op 6 42/0.000 + CDS 62207 - 63712 2164 ## COG0056 F0F1-type ATP synthase, alpha subunit 64 29 Op 7 42/0.000 + CDS 63728 - 64585 776 ## COG0224 F0F1-type ATP synthase, gamma subunit 65 29 Op 8 42/0.000 + CDS 64608 - 66002 1728 ## COG0055 F0F1-type ATP synthase, beta subunit 66 29 Op 9 . + CDS 66011 - 66415 339 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) + Term 66482 - 66528 -0.9 + Prom 66720 - 66779 11.1 67 30 Op 1 . + CDS 66867 - 67382 603 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Prom 67471 - 67530 6.3 68 30 Op 2 . + CDS 67602 - 68507 1270 ## gi|225028207|ref|ZP_03717399.1| hypothetical protein EUBHAL_02479 + Term 68581 - 68629 4.2 + Prom 68512 - 68571 3.4 69 31 Op 1 . + CDS 68664 - 69038 429 ## Rumal_1396 Cna B domain-containing protein 70 31 Op 2 . + CDS 69079 - 69537 354 ## gi|225028209|ref|ZP_03717401.1| hypothetical protein EUBHAL_02481 + Term 69567 - 69608 4.1 71 32 Tu 1 . 
+ CDS 69625 - 70266 541 ## COG0637 Predicted phosphatase/phosphohexomutase + Prom 70296 - 70355 3.8 72 33 Tu 1 . + CDS 70382 - 70915 405 ## gi|225028211|ref|ZP_03717403.1| hypothetical protein EUBHAL_02483 + Term 71089 - 71128 -0.9 + Prom 71064 - 71123 8.8 73 34 Op 1 . + CDS 71151 - 72344 1188 ## COG0500 SAM-dependent methyltransferases 74 34 Op 2 . + CDS 72337 - 74889 2204 ## EUBREC_0675 hypothetical protein 75 35 Tu 1 . - CDS 75114 - 75494 87 ## gi|225028214|ref|ZP_03717406.1| hypothetical protein EUBHAL_02486 - Prom 75514 - 75573 4.4 + Prom 75853 - 75912 9.6 76 36 Tu 1 . + CDS 76058 - 78394 2101 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) + Term 78514 - 78547 0.6 77 37 Op 1 . - CDS 79457 - 80650 1522 ## COG0462 Phosphoribosylpyrophosphate synthetase 78 37 Op 2 3/0.000 - CDS 80729 - 81730 621 ## COG1484 DNA replication protein 79 37 Op 3 . - CDS 81734 - 82924 806 ## COG3935 Putative primosome component and related proteins - Prom 82975 - 83034 8.1 + Prom 82954 - 83013 9.6 80 38 Tu 1 . + CDS 83159 - 83311 74 ## gi|225028219|ref|ZP_03717411.1| hypothetical protein EUBHAL_02491 + Term 83431 - 83467 0.2 - Term 83489 - 83523 2.0 81 39 Tu 1 . - CDS 83657 - 85057 1660 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Prom 85351 - 85410 10.0 82 40 Op 1 7/0.000 + CDS 85464 - 86723 1506 ## COG0448 ADP-glucose pyrophosphorylase 83 40 Op 2 . + CDS 86743 - 87867 1289 ## COG0448 ADP-glucose pyrophosphorylase 84 40 Op 3 . + CDS 87898 - 88197 446 ## COG2088 Uncharacterized protein, involved in the regulation of septum location + Term 88256 - 88309 14.6 + Prom 88226 - 88285 7.5 85 41 Op 1 7/0.000 + CDS 88520 - 90673 1770 ## COG0193 Peptidyl-tRNA hydrolase 86 41 Op 2 . + CDS 90720 - 94283 2717 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) + Prom 94722 - 94781 6.7 87 42 Tu 1 . 
+ CDS 94809 - 95972 1528 ## bpr_I2253 peptidyl-prolyl cis-trans isomerase PpiC-type (EC:5.2.1.8) Predicted protein(s) >gi|222441836|gb|ACEP01000106.1| GENE 1 180 - 665 177 161 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00161 NR:ns ## KEGG: EUBELI_00161 # Name: not_defined # Def: stage II sporulation protein R # Organism: E.eligens # Pathway: not_defined # 1 126 64 193 227 95 40.0 8e-19 MRADSDKKEAQERKLIVRDAVLRYVKEDVYRADSAQELKQNLAEKTEEIAQVAANASDGK PVRVYFTKERFPIRHYGTAIFPAGEYQALRVDIGKAAGHNWWCAFYPNLCYTPEKNFTLS ESGKKQWKKITELTQGKWLEKTKHYFTKLKKFLPYWAKRSV >gi|222441836|gb|ACEP01000106.1| GENE 2 817 - 1776 819 319 aa, chain + ## HITS:1 COG:CAC2409 KEGG:ns NR:ns ## COG: CAC2409 COG1305 # Protein_GI_number: 15895675 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Clostridium acetobutylicum # 169 299 245 376 397 66 34.0 9e-11 MKNKKQIISRLLWIGNIFLLIFCLAGCGTSGTGTSESGTEQQESSSAAPNYYHKGKIFLP QADGKVTEEHEGVKLDLSHTDQGYFMAAYTGSADKLLIQVEGSDNIPYRYYFDPDGKYNA LPLTAGDGSYAVTAYENVGDNRYAVVFTKIVDVTLENEVLPFLYSNQYVNFDENTKTVAL AKKLTKGKTEIEAVQEVYEYVIKNIVYDDEKAATVKSGYLPNVDDTLKTKKGICFDYAAL MTAMLRSSGIPTRLDIGYATNIYHAWISTYLDEQGWVDNVIKFDGKNWTMMDPTFAAGGG EGIQDFITDSSNYNIEYIH >gi|222441836|gb|ACEP01000106.1| GENE 3 2131 - 3186 1150 351 aa, chain + ## HITS:1 COG:aq_747 KEGG:ns NR:ns ## COG: aq_747 COG1459 # Protein_GI_number: 15606135 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Type II secretory pathway, component PulF # Organism: Aquifex aeolicus # 7 350 64 406 408 143 26.0 4e-34 MKGRKMLSNSEVAVFCEQSGMILESGLSMLEGISIMEEDVELHNTEYKKMYEDMRMCLEE TGIFHEALTKTGLFPSYVIQMTKIGEETGNLDVVMKGLAAYYEREENTMQDIKNAISYPL MILCMLLAIMLVMVVKVVPVFNQVYEQLGTQITGPGAVLLKAGEALKQYWIIPTVLLCVF AVGVWWLLKTTKGNVALRRLTGKVFSTKSIVKRLNSARVSSAFSMALQSGMDYGRSFDMA ATLLENDEETKQKLMNCKEKMFEGESFSKVIRETEIYSPLQSRMIMIAERTGEADQALSK IAVQIDGEVTADIQNFVSVIEPTLVIVLSVLVGGMLLSVMMPLMGILSVIG >gi|222441836|gb|ACEP01000106.1| 
GENE 4 3203 - 3541 314 112 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2055 NR:ns ## KEGG: bpr_I2055 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 17 105 22 110 115 83 40.0 2e-15 MEQNVEKKKKDYKKGYILPVCIFILMMVMLFYGVNSVTNATAQKEQESLKNAVVQSAVHC YSVEGAYPDSLDYLKKHYGITWDESKYKVTYEIIAKNIRPEVKVIPLQESKK >gi|222441836|gb|ACEP01000106.1| GENE 5 3721 - 4206 330 161 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2056 NR:ns ## KEGG: bpr_I2056 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 158 1 161 161 82 31.0 7e-15 MNNKQKQIHTIDTVFPLIFILLFGFCALALVLSGAHVYQDTTDGLKQNYTIRTAATYLQE KVREYSSESQIEVLSQDGQTVIALYEEGDSDYVTYIYLYKGKLRELFTKKGRDVVWSSGQ ELVSADTFSVTKQKEDLLQIELSGDGQEELLYIRIYGEDSK >gi|222441836|gb|ACEP01000106.1| GENE 6 4203 - 4619 403 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028140|ref|ZP_03717332.1| ## NR: gi|225028140|ref|ZP_03717332.1| hypothetical protein EUBHAL_02412 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02412 [Eubacterium hallii DSM 3353] # 1 138 1 138 138 252 100.0 7e-66 MKRVRHSRSGLFLIELMICILFFSITAGICIQFFVKSHNMSQDAKNLYQAQQEAASMAEI LEKDIDSLDNISVYYDKDWNQCDKEEKMYWLEVTQAQNQAQEDDLKKVKIAVYSGENSKG ESSKKEEIYHLNLSIYIP >gi|222441836|gb|ACEP01000106.1| GENE 7 4835 - 5377 440 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028141|ref|ZP_03717333.1| ## NR: gi|225028141|ref|ZP_03717333.1| hypothetical protein EUBHAL_02413 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02413 [Eubacterium hallii DSM 3353] # 1 180 9 188 188 258 100.0 2e-67 MKKREIPVMNVGISLIILIFMNICLAAFAVLSLENAVSDYSLSKKTAKHTTQYYEAVNNV QEQLAKKNQELREKAETKTTEAKSSQKKAEIKTAETKSTEKKIKVKTVKNMQKQSQKDIK KVAKSQIKLTESVSKSQQLVVTLQLDETGGYPQYYIQKWKLCSSDDWQADDSLDVYQSGE >gi|222441836|gb|ACEP01000106.1| GENE 8 5404 - 6468 1036 354 aa, chain + ## HITS:1 COG:aq_745 KEGG:ns NR:ns ## COG: aq_745 COG2805 # Protein_GI_number: 15606134 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and 
vesicular transport # Function: Tfp pilus assembly protein, pilus retraction ATPase PilT # Organism: Aquifex aeolicus # 1 354 13 361 366 301 46.0 1e-81 MGIQEYLQEATGKGASDLFIVAGLPLTYKINRKMIRVGDRLMPEDTRKLIEDIYVIAQER KIETLEKNGDDDFSFSIPNLGRFRVSTYRQRGSLAAVIRLIAFRLPNPEELSIPSQIMQL AEYRKGLVLVTGSAGSGKSTTLACLIEQINSTREEHIITLEDPLEFLHRHKKSIVSQREV SSDTESYKVALRASLRQSPDVILLGEMRDYETINIAMTAAETGHLVFSTLHTLGAANAVD RIIDVFPANQQRQIIVQLASVLNAVVSQQLLPDKEGKLIPAFELMTVNRAIKNMIRDNKV HQIDGLIGSSGQEKMFSMDSYITKLYKEGKITRETALDYATAPELMAQKLLERV >gi|222441836|gb|ACEP01000106.1| GENE 9 6986 - 7996 1213 336 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028143|ref|ZP_03717335.1| ## NR: gi|225028143|ref|ZP_03717335.1| hypothetical protein EUBHAL_02415 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02415 [Eubacterium hallii DSM 3353] # 1 336 1 336 336 407 100.0 1e-112 MKKRSNKSFLASAAYLVVALVLCAGVIFYTGNGGTGLTVADITDTMPETGTDEIVAGEEE VPLASAPKVSVKKSTKTTTKKKTLKKAAKKTKTTTKKKTSKKKSTKKTATVQTVTETTVQ TTEKTSTKKKSKVQTIRTTVVTTVKTTTQTFGTTTTTTTNNAASVGASTAVSSSGFAISK FSDIKGHVDSKVYDAFVNLGFELKINSKLATTGVFSTKNHNIQLKRGQSSYLLHELGHFV SALKGRNGKKIDQSSEFTRIYNEEKSAYVGNNKAYVTQDAAEYFAESFRDYTENPSALKS QRPKTYSYISQMVSSLSSSDVKAFRNAYGWYWSINK >gi|222441836|gb|ACEP01000106.1| GENE 10 8121 - 8354 64 77 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028144|ref|ZP_03717336.1| ## NR: gi|225028144|ref|ZP_03717336.1| hypothetical protein EUBHAL_02416 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02416 [Eubacterium hallii DSM 3353] # 1 77 1 77 77 127 100.0 3e-28 MLRFTELLTLAYRTRVKSLLKGTFLYIVLLWDLPLYGEKIIKGLGKNRGMRERHVICWFF AFRMLLRRSMYKNSYKI >gi|222441836|gb|ACEP01000106.1| GENE 11 8452 - 9300 936 282 aa, chain + ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 1 280 4 282 286 261 46.0 8e-70 MRRETIRCLKETAAKCRLNVVRMLRASGHGHIGGTYSSIDMVTALYFYKMNVNPEKPDWE 
DRDRFLLSAGHKSMVQYAVLAEKGYFPKEILDTYGHLHSKLAGHPNMHKLPGVEASTGAL GHGLSIATGMAMGLRLDKKPSKVYTILGDGELAEGSNWEAAAAAAHYKLDNLTAIVDFNG LQISGKVTDVMDFTPIGDKFREFGWAVREIDGNNMDEIVEAFDALPFEEGKPSMIVAHTV KAKGLKEGEGKASYHYWNPSKEDCDALEKDLMEQLRGYEDEI >gi|222441836|gb|ACEP01000106.1| GENE 12 9290 - 10246 1041 318 aa, chain + ## HITS:1 COG:MJ0679 KEGG:ns NR:ns ## COG: MJ0679 COG3958 # Protein_GI_number: 15668860 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Methanococcus jannaschii # 6 314 8 316 316 210 37.0 3e-54 MKFDNYMDPRKTFGKAVTQEAEKNEDIVVLSADSGKSSGFGEFIEKYPDRYFECGIMEQG VIGISSGLATTGKVPVFCAIAPFVTARPFEMFRNDLGYMKQNVKVVGRNCGITYSDLGAT HHSLEDFAIVRMIPGVTILAPQDPYEIEEAVKAMLSMNGPVYMRIGNPKIPVLFEREPFV IGKGRKITEGEALTLITTGSMTMAAMEAVERLEEEGIQVEHIGMPTVWPIDAELICDSVK KTGRVMTIEEHYIKGGLGTIVMETLNDNGINVPIKMHGIPHCYASNGPYNELMAYYKLDA DGVYGGIKEFLKNEEGNK >gi|222441836|gb|ACEP01000106.1| GENE 13 10243 - 10968 1021 241 aa, chain + ## HITS:1 COG:HI1116 KEGG:ns NR:ns ## COG: HI1116 COG0274 # Protein_GI_number: 16273041 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Haemophilus influenzae # 9 223 2 213 223 140 37.0 3e-33 MIDFSKIKSKKEVGKCFDYAILPKQTTEQTIRDCCKEAIKYNCKAFCFSSSYWTPIVAEE LKGTDILVGAAIGFPFGQQTSAVKAFETEEAVRMGATVLDNCMNVGALKDKRYDEIKQEF KEYVEAAQGVMTKMIIEVCYLTDDEIKAACELCIEAGIDWVKSSTGQYAGPTLEQVILMA DCCKGTNTRVKVSGVKDPRPQNAYVFLRAGADLIGTRQAPQIIDSFDTMRKIGIIPPYEG E >gi|222441836|gb|ACEP01000106.1| GENE 14 11021 - 11746 1050 241 aa, chain + ## HITS:1 COG:HI1116 KEGG:ns NR:ns ## COG: HI1116 COG0274 # Protein_GI_number: 16273041 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Haemophilus influenzae # 7 227 1 218 223 135 36.0 8e-32 MVDLSKMTKWDMGKLFDFSVLPKQSTEADIRRGCQIAKEYNCKAFCFSSSEWTPIVAEEL KGTDIMVGAAIGFPFGQQSSAVKAFETAEAVRLGATVLDNCMNVGDLKDKKYDKILAEFK EYVDAAQGVMTKMIIETCFLTKEEIVTATKLVCEAGIDWVKTSTGQYGGPSVQDVMLMVD 
AVKGTKTKVKVAGVKDPRPQNAFCFIQAGAELIGTRAAVEILDAVDDLRKIGLIPAYTGD K >gi|222441836|gb|ACEP01000106.1| GENE 15 11943 - 13460 1678 505 aa, chain + ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 3 496 2 500 500 253 31.0 6e-67 MAQYLVGLDGGTTGCKTCIFDLDGKLVGSDYREYPSYYPEPGWVEQTTEDLLPNFYDSIK TAINKSGINPEDIIGFGFSSQGSVIGMLDENGELLRPWVGWQDLRGEGEGIEYLTSRIPR SEIYKQTGDPVGTTFSNAKLAWLKLHEPENWEKTALFSTMQDYFLKQFGADDYYTDYSSA SREGMMDVDNECWSKEMHDILGIDLSKRAKIVKEPGKVVGHIGKEVSEKSGLPVGTPICM GAHDQNCCTFGAGAVDDGTAVLVMGTFGSCFVVSDKSIRDPKERLVVKGNHGCGNFTIEA FSNTAASSYRWYRDTFCDYEKLMAKEQGEDPYDLINKQIATSPIGANGITFLSFLQGAGG ARINGKARGTFVGMTLGTRKADMARAVMEGICYEMYDIIRAEENSGIKIDKIRLAGGAAK SPLWCQMMADIFKHPIQILENGEAGCLGAALYAGVGVGAYKDCHEAAKVARTTKEYTPNP DNYAAYDAAYKRFCDVYDALDKKIF >gi|222441836|gb|ACEP01000106.1| GENE 16 13693 - 14415 813 240 aa, chain + ## HITS:1 COG:PA2299 KEGG:ns NR:ns ## COG: PA2299 COG2188 # Protein_GI_number: 15597495 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Pseudomonas aeruginosa # 6 239 8 242 249 129 33.0 5e-30 MLEKNSPKPLYQQLKDILVDAIDSEKWKANEKIPSENELSSIYGLSRMTVRSVLTDLVKE GLLYRVQGKGTFVAEKIVTVSPSYIGIREQLEKMGYEVETRIVECQEERSSETVAKKLNL LPGESVFKIKRVRYIKGDPISLHISYINARYKEKLTVEMLEKEQLCVLLSENYGIHKKKV SETLESVVASNEEAELLAVEKGHPLLLLRDVLYDEISRPYEYTKVVFRGDRIKIRLQYEE >gi|222441836|gb|ACEP01000106.1| GENE 17 14805 - 15650 1162 281 aa, chain + ## HITS:1 COG:SA1927 KEGG:ns NR:ns ## COG: SA1927 COG0191 # Protein_GI_number: 15927699 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Staphylococcus aureus N315 # 9 274 9 279 286 157 35.0 2e-38 MKNNRFMNVLLKAKEENIAVGAVNIFNYLTAEAAVKAAEEIGVNLIIQTSAGTVEHFGAQ KLYDIVDAARKGAKIEVALHLDHCRNKELGKLCADTGWDSIMMDFSHLPFEENIANTREM AEYAHSRNVAIEGEIGVISGVEEEIVSDEAVGADFEETMEFVERSQIDAVAPAIGTAHGI 
YHGVPKINFELVEKLGKEKTPVVIHGGSGLSAETFTRLIELGGRKVNISTLVKNAYLDKT KELILSGEKFAPIPFDTEVENAVKEEVKKHLEVFSGKRTSF >gi|222441836|gb|ACEP01000106.1| GENE 18 15683 - 16624 1030 313 aa, chain + ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 8 308 8 280 290 83 28.0 5e-16 MLANYECDLHGHTNRSDGNDSPVEYIRHAVHRGVKILAITDHDIVPPEMIELGGGKRQEI TAYAKGIGVELLRGIEISCETYIDDVHLVCLGCDWNAPYFKELDEFTINSKVNSYRKLID LLNEQGMEMTWEEVLQNNGYPVQEQFVQKKMIFELMARKGYMESWKEAKLYVKNSKEVSV KREKPDAVSVIKEIHKLGGIIILAHPYLISEPVSYKGKEMSRQEFIEVLIEAGLDGIEAS YTYDKTSYGGTMTKDEIKKEVIERYAGRGLIISGGSDYHADGKKGVKNPREIGECGITRE DFMKYEKLVRILP >gi|222441836|gb|ACEP01000106.1| GENE 19 16776 - 17810 1277 344 aa, chain + ## HITS:1 COG:CAC2895 KEGG:ns NR:ns ## COG: CAC2895 COG1181 # Protein_GI_number: 15896148 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Clostridium acetobutylicum # 2 341 10 340 343 303 46.0 4e-82 MGGKSSEHDVSLVSAKTVIENIDKDLYNVIMVGITKDGQWKLVDRLDHLEDGSWVNSKTS AYICPDTGRKELLLERDGKVDRISIDVVFPVLHGMNGEDGTVQGLLELAAIPYVGCGVLA SACSMDKFYTKIVVDSIGVRQAKFVGVRRHELTHMDEVVARVEAGVPYPVFVKPSKAGSS QGVSKAENREELIDALNLAAKHDSKILVEETIVGREIECGVLGMNEPKASGVGEVLSAEE FYSYDAKYNNEESRTVVDPELPEGKAEEIRKDAVAIFKALDCSGLSRVDFFLTKDTNEVV FNEINTLPGFTAISMYPMLWEARGLSKKALVQKLIDLAMNRYDV >gi|222441836|gb|ACEP01000106.1| GENE 20 17815 - 18240 233 141 aa, chain + ## HITS:1 COG:no KEGG:Closa_0530 NR:ns ## KEGG: Closa_0530 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 135 1 134 148 66 29.0 4e-10 MARKVWLTITGIQKGFDQEEPVTLVTEGSYIHKNGVHYIFYSERTEEGAVIKNRLTISPG NVELRKTGAGNFNSVLVFDTEHNKNCLYQSPAGPMELVSDTKKIRVLHGDAHFKLQLKYS LIMNGMLVSDYDLTVLAKFME >gi|222441836|gb|ACEP01000106.1| GENE 21 18379 - 19566 730 395 aa, chain + ## HITS:1 COG:CAC1426 KEGG:ns 
NR:ns ## COG: CAC1426 COG2508 # Protein_GI_number: 15894705 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 18 394 17 393 397 136 26.0 6e-32 MAIILEEIYEVALHRYNMKLVAGGRGLRNLVDWVHTVEEMDYVSFLKGRELIITTGIREK DEETLVRFVKSLHETGASGLVINIGKYITRVPRGVIAYSEEAGFPVFTLPWEVHLVDFNR DLCNLIYKTMQEQDGLETALQKAIFSRKEEQQYMPVLHENHIHKDTEFFMIQCYFPANKD IQADDKFDNTQFFYRFIRQCQQIFAEKKVLSVIFQRDLYITLVAKMPTGTERENVIQQIK TFSESFARQKELYMAVGDEETVVENLWEKYRDLSYLCRWGRQKGKNICQEEELGISRLWL SIGNIKKLEKYREKMLGELERFDQETGSEYIRILKCYLETNGNIGEVATECYLHRNTVSY HLKKIAEITGKTLNSTKDRSDWYMAYQIDEFLGLI >gi|222441836|gb|ACEP01000106.1| GENE 22 20026 - 20112 129 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLREALPKIVTGSVPGAKAKAILDRRNE >gi|222441836|gb|ACEP01000106.1| GENE 23 20197 - 21327 1242 376 aa, chain + ## HITS:1 COG:aq_1145 KEGG:ns NR:ns ## COG: aq_1145 COG1454 # Protein_GI_number: 15606402 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Aquifex aeolicus # 4 375 5 382 387 236 37.0 5e-62 MEFNYFLPVNIVFGSGKVLETGELTRPYGKKALIVTGKSSAKKSGLYDKVNDSLKAAGLE TALFDKVSQNPLTTTAQEGADFAKANGCDVVVAIGGGSIMDCAKAIAFLSKNDGDINDYI YNRLRSDNALPLVLIPTTCGTGSEGNGFAVLTNPDNGDKKSLRCNAIVAKVSIVDPECMM TMPKKVLASVTFDALCHNIEAYTSKIAQPLTDALSLYAIELIANNLVSVYNGTGNKKEWE NITMASTIGGMVINTARVTLAHGMEHPASGLKNIVHGQGLAALTPVVIDASYKGDEEKFG NIAKIFGGKSAADCADKIRELLKAIDLECKLSDLGIAEEDIPWMAENCMKVSAAGVANNP VVFTQEEIAEIYRKAL >gi|222441836|gb|ACEP01000106.1| GENE 24 21637 - 22851 1792 404 aa, chain + ## HITS:1 COG:MJ0721 KEGG:ns NR:ns ## COG: MJ0721 COG4992 # Protein_GI_number: 15668902 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Methanococcus jannaschii # 15 398 8 391 398 335 45.0 9e-92 MKLKDTGLTVQDIKDKVNKYMIETYERFDFLAETAEGMYIYDENGTPYLDFYAGIAVNNA 
GNRNPKVVAAVKDQVDDIMHTFNYPYTIPQALLAEKVCKTLGMDKIFYQNSGTEANEAMI KMARKYGIEKYGPNKYHIVTAKMGFHGRTFGAMSATGQPGNGCQVGFGPMTYGFSYAPYN DLEAFKNACTENTIAIMVEPVQGEGGVHPATPEFLKGLREFCDEKGMLLLLDEVQTGWCR TGAVMSYMNYGVKPDIVSMAKGLGGGMPIGAICATEEVSKAFTPGSHGTTFGGHPISCAA ALAEVEELLDKDLAGNAKKVGDYFAAKLEKLPHVKEVRHQGLLVGVEFDDEISGVEVKHG CLDRKLLITAIGAHVIRMIPPLIVTEEQCDEACAIIEEAVKALC >gi|222441836|gb|ACEP01000106.1| GENE 25 23070 - 23675 681 201 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0098 NR:ns ## KEGG: EUBREC_0098 # Name: not_defined # Def: cytidylate kinase # Organism: E.rectale # Pathway: not_defined # 1 198 1 198 211 339 91.0 4e-92 MAHLVITIGCEYGAKGNMIGRKVAKDLGIKFYDRDTVDKIIKEVGIPKDIMEKVEEGGTI AGKGAEGDVRGSFSKYADLTQRAIHVQKTIIRKLADRESCVIIGRSADYILKEQKPILRI FIYSPEDVRIENVMKSHNFSAEDAKLFITEKDKRYHKRHLALTGSNRGDRHNRDMLIDSS LLGVDGTAELIEEVAKKVFHE >gi|222441836|gb|ACEP01000106.1| GENE 26 23688 - 25205 1780 505 aa, chain + ## HITS:1 COG:BH0994 KEGG:ns NR:ns ## COG: BH0994 COG0531 # Protein_GI_number: 15613557 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Bacillus halodurans # 71 465 1 385 395 326 46.0 8e-89 MNKKESEFNKVFSAWDILVIAFGAMIGWGWVVSTGNWIEKGGVLGAALGFCLGGVMIFFV GLTYAELTAAMPQCGGEHVFSHRAMGPTGSFICTWAIVLGYVSVACFEACAFPTIITYLW PGFLKGYLYTVAGFDVYASWLAVAIIVAFLIMLINIMGAKTAAILQTVLTCIIGGAGILL IVASVINGTVDNLDGQMFVGNTAGLNIKAILGVAAMSPFYFIGFDVIPQAAEEINVPPKK IGKMLILSVILAVVFYALVIIAVGLVLDSGAIIKSQEATGLVTADAMGTAFGMKIMAKVV IVGGMCGIITSWNSFLIGGSRAMYSMAESYMIPKTFAKLHPKHKTPINALILIGVLTMLA PFAGRQLLVWISDAGNFACCLAYCMVAVSFMILRKKEPDMPRPYKVPAYKFFGTMAVIMS GFMVCMYCIPGSGGSLILPEWGMVGAWSLLGVVFYVVCKRKYKESFGDLVELISDEDAAT LMPEADDKELDKVIDAAIDRVLASI >gi|222441836|gb|ACEP01000106.1| GENE 27 25838 - 26401 563 187 aa, chain + ## HITS:1 COG:ECs2037 KEGG:ns NR:ns ## COG: ECs2037 COG1396 # Protein_GI_number: 15831292 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Escherichia coli O157:H7 # 1 174 1 167 178 108 34.0 4e-24 
MDYLSHNVAVNLKQIRLSKGMSLGEVAEQTGVSKSMLAQIEKGTANPSLGVLGKITSGLR IEFQTLIDPPREDFCLVSPKEIFPTKELEGQYKVWTCFPYEDSHSVEVYRIDIEPGGVYM SGSHGEKTREYITVTEGVLTIECHGHTQTINKDQVYKFETDQAHIYRNEGAEMVSCTCFF LDYTGLR >gi|222441836|gb|ACEP01000106.1| GENE 28 26728 - 27438 481 236 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2067 NR:ns ## KEGG: CLL_A2067 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 33 234 4 206 219 98 30.0 2e-19 MLNLYTFAFAKNIKINKLKEMFWRDDLKKHEKLLFHGAKSKIDGKIDIHRGRHNNDFGQG FYAGESYEQAISFVSGFENSSVYYICFNDDDLTCKRYEVNQEWMMTIAYYRGALDEYKNH PVIKKNIEQSRACDYIIAPIADNRMFQIINSFIEGELTDEQCKHCLAATNLGMQYIFVSE KAVSQAKLIERCYISQNERKYYKNIRLEESKLGANKVKLARKQYRGKGRYIDEILV >gi|222441836|gb|ACEP01000106.1| GENE 29 27454 - 27942 294 162 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028164|ref|ZP_03717356.1| ## NR: gi|225028164|ref|ZP_03717356.1| hypothetical protein EUBHAL_02436 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02436 [Eubacterium hallii DSM 3353] # 1 162 1 162 162 299 100.0 5e-80 MKKIDRDGLLLCELQATAFENSIDKMDSSSEIFIRRFMKSNIAKRMDDESILESNLQAND ILQLVDEEYGVSHYGTVKYTHNEMYWIGYIYRYFAITYEFTSARVYKIIKPKELRELFLP YHTLDPSQAIERILEAKGLLLDEEAELQRQYEIFRKIRMGSK >gi|222441836|gb|ACEP01000106.1| GENE 30 28108 - 28566 711 152 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3788 NR:ns ## KEGG: Cphy_3788 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 136 3 137 151 123 53.0 2e-27 MNPWIVFAIILLVALAILGFLYYKGSKLQKEQQEQKEQLFSAATPVTMLVIDKKRLPLKD SGLPQIVIDQTPKRARRAKVPVVKAKIGPKVTSLIADENVYDLIPVKAEIKAMISGIYIT SINNFRKAPVPKPEKKGMMAKLRKKADDMTTK >gi|222441836|gb|ACEP01000106.1| GENE 31 28772 - 29434 432 220 aa, chain - ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 2 216 21 235 241 174 41.0 1e-43 
MRARLKKLEKTDPQEASRIAQEQVQKVFKHLLKISGVTIEVNGLENIPDEASLFVGNHSS YFDIIVTGATIPGGVGFVAKDSLGKIPGLSSWMKCIHCLFLDRSDVRKGLQTILEGVDYL KEGYSMCIYPEGTRSTTGELGEFKGGSLKMAQRAKAPVVPVAITGTRDIFENNPGLRVYP SHVTISFGKPFKFSDLPAAEKKFAGEYVKKAIEDMLAAQQ >gi|222441836|gb|ACEP01000106.1| GENE 32 29521 - 31032 1679 503 aa, chain - ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. PCC 7120 # 13 503 5 497 502 506 50.0 1e-143 MNDKFSLLKRMRRLSGILVHPTSLPGKFGIGDLGPEAYRFVDFLQESGQHLWQTLPLGPT GGHNSPYQCFSSFAGQPLLISPELLKDEGLLTDADLSNVPHFSDEEVDFAAVGEYKEELF KKAFARFKELPEDNPLKSELALFCKSAHWLDDYAMFMAIKKSKGGAHWLEWEKKYRKPTK SQKAVIEREFEDEILYEKFLQWTFCRQWSQLKAYANERDILLIGDIPIFVSGDSADVWAE PRLFQVDSDGFPTVVAGVPPDYFSATGQLWGNPLYDWKYHKKTNYEWWMERFKTQFLLSD IVRIDHFRGLESYWEIPADSETALNGKWVDGPKDEFFEVLIKTFGEEPPIIAEDLGIITD DVRALRDKFGLPGMKILQFAFEDEDSNYLPYNQPYNCVCYTGTHDNDTTTGWYATASEKA RDKVRRYMNTDASQVSWDFIRTCFGSPARFAIIPVQDLFCQGSDCRMNTPGKADGNWAYR MKKELLTSDVAKRLYDVTKLYGR >gi|222441836|gb|ACEP01000106.1| GENE 33 31649 - 32674 877 341 aa, chain + ## HITS:1 COG:VCA0132 KEGG:ns NR:ns ## COG: VCA0132 COG1609 # Protein_GI_number: 15600903 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 3 332 2 333 334 161 28.0 2e-39 MKASIKQISEITGFSPATISNALNHKKGVNQKTAERIFQVAKEIGYISENKITKIKLVIY KRNGLIIDDSPFFPILLKGVENECRVSGYEMVICNVDRNDPDYEEQVKLILNDTSSAILL LGTEMMDEDFPLYMSGNCPMVLLDCWSSNYSVNAVLINNADSAVTAVQYLINKGHREIGY LRGRFRIKAFTSRCAGYSRALSKNNIELNSQYIFTVSTTMDGAYKDMKEYLAKGAKMPTA FFADDDMIALGCIKAFQEAGYRVPEDISVASFDDLPFCEISSPRLTTIKVFKYEMGQIAV RRLIEVMKDGGKINTKTQVCTEFVERDSVIDLNSASKEGGN >gi|222441836|gb|ACEP01000106.1| GENE 34 32678 - 34096 1788 472 aa, chain + ## HITS:1 COG:TM0951 KEGG:ns NR:ns ## COG: TM0951 COG2407 # Protein_GI_number: 15643713 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Thermotoga 
maritima # 58 449 59 462 471 142 27.0 1e-33 MDKVVLGFAPTRRAIFSAPAAVVYRGLTAKRLKELNVDFVDIDDVNDEGLLYDDAGMEKI YEKFRDAKVDGLFVPHCNFGTEYEVCRLAKKLNVPVLLWGPRDECPDENGMRLRDSQCGL FATGKVLRRFGIPFTYLKNCNLEDECFAEGIDRFLRVCNVVKAFRNTRVLQISTRPFDFW STMCNEGELLEKFGIQLSPIPMPELTQAMKDVKENSPEEIKEVTDYCREKMCIKVTPEQL DNVAALKIAMTRLGKKYGCNCGAIQCWNALQDEIGIMPCAANALCNDEGFPIACETDIHG TITSVLVEAAAMGETRSFFADWTVRHPYNDNAELLQHCGPWPISIAKEKPTIDTPVAFDC SGSLMAQAKDGEHISLVRFDGDNGEYSLLLGNAKTVDGPYTKGTYMWVEVENLDRLEDKL VQGPYIHHCVGVHQDVVPVLYEACKYIGVTPDLYDPIEEKVKAIIRGEKVEK >gi|222441836|gb|ACEP01000106.1| GENE 35 34279 - 35106 1023 275 aa, chain + ## HITS:1 COG:TM0954 KEGG:ns NR:ns ## COG: TM0954 COG3959 # Protein_GI_number: 15644626 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Thermotoga maritima # 1 266 10 273 286 283 51.0 4e-76 MNLEALSFECRQNVLDMIMEGRAGHIGGDFSVMDILITLYFKQMNISPDNLDDPDRDLFV MSKGHSVEAYYAVLAAKGFLDIEDVIKNFSKFGSPYIGHPNNKLPGIEMNSGSLGHGLPV CVGMALANKMDGKTGRVYTVMGDGELAEGSVWEGAMAAGHYKLDNLCAVVDRNRLQISGN TEDVMADDDLDARFRAFGWNSIHAQGNNIESLNEAFEEAKTCKGKPTVIIADTIKGYGGG EIMENKADWHHKVPTQEMYETISAELLKRRDAANE >gi|222441836|gb|ACEP01000106.1| GENE 36 35099 - 36031 1382 310 aa, chain + ## HITS:1 COG:TM0953 KEGG:ns NR:ns ## COG: TM0953 COG3958 # Protein_GI_number: 15644625 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Thermotoga maritima # 3 307 2 305 311 301 50.0 1e-81 MNKIANRQVICETLMEQVKEDKSIVALCSDSRGSASMTPFFNAYPENSVEIGIAEQNLVS ISAGMAKCGKKPFCFSPASFISTRSYEQAKVDVAYSNTNVKLVGISGGISYGELGMSHHS AQDIAAMSAIPNMRVYLPSDRFQTKHLIEALLKDEKPAYIRTGRNPVEDIYSEDNCPFEM DKATVIKEGTDVVLIGCGEMVRPCVEAAEILEKEGISATVLDMYCVKPLDTEAIIKAATN AKAVVTAEEHAPFGGLGSMVAQVVGANCPKKVVNVALPDAPVITGNSKAVFDYYGMNGEG IAAKAKEALA >gi|222441836|gb|ACEP01000106.1| GENE 37 36034 - 37515 1898 493 aa, chain + ## HITS:1 COG:TM0952 KEGG:ns NR:ns ## COG: TM0952 COG0554 # Protein_GI_number: 15643714 # Func_class: C Energy production and conversion # 
Function: Glycerol kinase # Organism: Thermotoga maritima # 4 493 2 492 492 473 47.0 1e-133 MKKYVLGIDQSTQGTKGIIFDGSGKIIARTDLAHDQIINEKGWVEHNPDQIMKNVIQVVK NVVEKAGIDKEEILTLGISNQRETAVCWNKNTGRPVYNAIVWQCARGEAICNKIAEDGYA QMVQEHSGIPLSPYYSAAKIAWVLQNVPEAKEAADKGELAVGTMDSYLVYQLTKDHAFKT DYSNASRTQMFNVSALDWDEELISLFGIKKEMLAEICDSNALYGYTDFDGYLEEAIPIHG VMGDSHGALYGQGCVNPGMIKATYGTGSSIMMNVGEKPIFSDKGVVTSLAWSLSGKVNYV LEGNINYTGAVITWLQKDLGLIASAKETEELAKSANPGDTTYLVPAFTGLGAPYWDSKAK AVVSGITRTTGKAELVRAALDCIVYQISDVVNAMAEDSGIKVDELRVDGGPTKNGYLMQF QSDILDIPVQVPDAEELSAMGPAYAAGIAMGIYEQTALFESLNRTRFTPEMDAETRCTKY TGWKESVGLVLTK >gi|222441836|gb|ACEP01000106.1| GENE 38 37719 - 38756 1451 345 aa, chain + ## HITS:1 COG:SMb20349 KEGG:ns NR:ns ## COG: SMb20349 COG1879 # Protein_GI_number: 16264083 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 58 345 26 313 313 271 54.0 2e-72 MRMFKKVLAGFVAAAMVATMAGCGSSNSSTKDSGSDKAAAEATTEAAAPAKVLNTNAANK LIYCITPSTSNPYFGTVQTACKEEGEKLGYKVKCVSHDDDATKQSELFDTAISEGASAIV CDNAGADASIEAVQKAYDAGIPTFMVDREINKEGLCVSQIVANNDQGAQAAAEALVEATG GKGDYAELLGLESDTNCQVRSNAFHRVLDQTDMKMVAQQSANWDQTEGQSKAETILQQYP DITAIVCGNDTMACGAAAAVAAANLDHDVYIIGVDGSNDMRDNIKDGKALATALQQIDKI TRNAVTQANDYLTNGTTGLEEKQLVDCVVITKDNADKLDNFVFAE >gi|222441836|gb|ACEP01000106.1| GENE 39 38901 - 39542 894 213 aa, chain + ## HITS:1 COG:mll7013 KEGG:ns NR:ns ## COG: mll7013 COG5618 # Protein_GI_number: 13475842 # Func_class: R General function prediction only # Function: Predicted periplasmic lipoprotein # Organism: Mesorhizobium loti # 15 212 20 217 221 86 28.0 4e-17 MAKRLISTVLTVVMAAVLLTGCVKVVKIGHEGDLTGQKEFNPGDSVEAIWDSEVIPECEK KAVDVSDILAKYDGKLTEMGEEFDGLKTKAASYYNYAVKLEGAKITKVDKDSFYGTVTLE VPGYTGKTVVTCQIGKYKNSSIRDDMSFINFSDFTNQTEWGQVNTSMLEKVEEAVVKPVY DDLAEGATVNLVGCFTADSNDAIVITPVVLEVK >gi|222441836|gb|ACEP01000106.1| GENE 40 39586 - 41115 1773 509 aa, chain + ## HITS:1 COG:mll7012 KEGG:ns NR:ns ## COG: mll7012 COG1129 # 
Protein_GI_number: 13475841 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, ATPase component # Organism: Mesorhizobium loti # 1 508 1 506 523 475 47.0 1e-134 MAENKDKDIVMKCRHMEKVYPGTKALDDVDFNIYRGKVNVLIGENGAGKSTLMKQIAGIE QPSSGKMYLDGEEVSFKNTTEARAKGVGIIHQELSLFPNLNVYQNIFMNNEQKKGVVVNN EIHKQKTKEVLERLEYPIDPEAKVGDLRVGQQQMIEIARNLIQDDLKILIMDEPTSSLSQ SEVQVLFKIMRELNAQGISIVYISHRLEEIMEIGDHVTILRDGKYVDDAAVKDIDVPWIV QKMTGSNKSYPKKERDIDWEKQDTVLEVKDLCLPKKGGGWLLDHVSFKLKRGEILGIYGL MGAGRTEVFECLMGLHPEHTGEVYVDGKKIKIKNIAEQIKNGFALVPEDRQAEGLVQTLD IEKNISLSSLKTCCNKFVLSKQKEDEYVDAQIKDIHIKVADKHLPILSLSGGNQQKCVIG KGVLTDPKILLMDEPSRGIDIGAKTEVFDIINQYADRGLSIICVSSELKEMLAIANRVVV LSNGIKTAEFTDDEITEDNLVLASYKGHH >gi|222441836|gb|ACEP01000106.1| GENE 41 41209 - 42267 1325 352 aa, chain + ## HITS:1 COG:mll7011 KEGG:ns NR:ns ## COG: mll7011 COG1172 # Protein_GI_number: 13475840 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose/xylose/arabinose/galactoside ABC-type transport systems, permease components # Organism: Mesorhizobium loti # 15 345 1 325 332 266 51.0 5e-71 MNKKKDNSNLMMTLLKGRTLFVLIILFIFFSIKADSFCTVNSLLLVCKHVAQYGILGIGM TYVIITGGIDLSVGSVVGLVGMIAGGLIQEGLTLKFAGVTLYFSVPAITVICIIIGIIIG IVNGALIAKCKLAPFIVTLGTMYIGRGLANIRSGGNTFSSIKGQEALGNVGIMQFDSRVF AVGGFPGIPIAAILFLILAVIAAFVLKKTPFGWHILAIGGNEKAARLSGIKVEKNIILVY AISGACSALIGLVTMAQIAASHPNTGDTWEMNAIAAVVLGGTSMAGGVGTIGGTVIGAFV IGIINDGMTMCGVSEFWQKVIKGLVIILAVLIDQFQRNLQAKMALQARNENK >gi|222441836|gb|ACEP01000106.1| GENE 42 42520 - 44511 760 663 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein [Aster yellows witches'-broom phytoplasma AYWB] # 54 662 109 697 849 297 31 2e-79 MVVFILVAALGVIEVNPGLGYTILVGDVIYLIAAVLCYEYTEKMISKEFLAAGFEQSQIQ KQILKELPVPYVMTDQDGTIIWHNEQFSEIVGQKITKKSITQIFPGLYKKVFPQNGQMKE YCITHEEHKYHVECRNMVFKNLEQTEGEIQEDKETQGDILQVFYFFDETQLIGYKEESRK KSLVAGLIYIDNYEEVLENMEEVRQSLLLALVERKIMKYMQGLDAIIKGFEKDKFLFVFE 
EEKLEQLMESKFSILDEVRLINIGNELSMTLSIGVGVNGGSYTENYESARVAMDLALGRG GDQAVVRDGDAISYYGGKTQKVEKSTRVKARVKAHALREIMLTHDKIFVMGHKNPDADCV GAALGIYRLSKALNKSTYIVMDKDCPAIEPIVDGIYRERADEEEDIFVTNEEAMLLKDKS AMLIVVDVNVPAMTECPELLKSISTIVVLDHHRQSNETIKNAVVSYVEPYASSTCEMVAE ILQYIVDKPKLKNIEADAMYSGILVDTDNFVIKAGVRTFEAAAYLRRAGADVTRVRKMFR EDMEHCRIRTAIINKAEIYMDEFAISTFDGENVSGATVVGAKAANALLNIQGIRGSFVLT ALQDCIYVSARSIDELNVQVIMEKFGGGGHLTMAAAQLKDVSLSDAAEMLKHTLRKMHED GDI >gi|222441836|gb|ACEP01000106.1| GENE 43 44791 - 45237 504 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881875|ref|YP_001560843.1| ribosomal protein L9 [Clostridium phytofermentans ISDg] # 1 145 1 145 148 198 66 8e-50 MKVILLEDVKSLGKKGQLVEVNAGYARNFILPKKLGIEATSKNINDLKLQKAHEDKVAAE QLAAAKVLAEELKNKSVELKMKVGEGGRTFGAISTKEIAAAAKEQLGYELDKKKISVDEP IKSLGVHNVKIKLHPKVTADLKVKVTEK >gi|222441836|gb|ACEP01000106.1| GENE 44 45321 - 46670 1347 449 aa, chain + ## HITS:1 COG:CAC3715 KEGG:ns NR:ns ## COG: CAC3715 COG0305 # Protein_GI_number: 15896946 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Clostridium acetobutylicum # 11 443 6 434 442 414 50.0 1e-115 MPTVDENFVKRIQPHSEEAERSVLGSMLMDRDAIVEAEDILTKEDFYQRQYGIVFEAMVE LYREGKAVDLITLQNKLKEKEVPEELMSLDFFRDLVEVVPTAANIGQYAKIVHDKATLRA LIKVTENIGNECYLGKEDTETILEKTEKDIFGLLQNRNRMSDFVPIQQIVLNSLSAIEEA SKTKGRVTGVATGFTELDYKLTGLHPAELILVAARPAMGKTAFVLNIAQYAAFKDNHAVA MFSLEMSKEQLVTRLMASESMVDSQQMRTGDLRDSDWEKLLEGAALIGNSKLIIDDTATT LAEIRSKCRKYKQTYGIELVVIDYLQLMSGSESRRNESRQQEISEISRALKVLARELDVP IIALSQLSRAVEARQDHKPMLSDLRESGAIEQDADVVMFLYRDEYYNPDSEKKNLAEVIV AKQRNGATGSVELVWLGQYTKFANKERVM >gi|222441836|gb|ACEP01000106.1| GENE 45 47056 - 47946 448 296 aa, chain + ## HITS:1 COG:no KEGG:Closa_0311 NR:ns ## KEGG: Closa_0311 # Name: not_defined # Def: MerR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 104 1 100 219 79 50.0 1e-13 MKENKYTISEAARLLSVENHVLRYWEEELELNIPRNELGHRYYREEEIQLLHCIKELKNK GFQLKAVKEVLPDLEQKKLIDIHKLLTTQEELENQELEEAQKEAEQQRDSQTEREMQRGK 
ERQNGKENEQQRNSKVYFSEENDRDTFENSSGNNKKGRTNNRNSDRNHNSTNSNGTVKED RNCKVVTIHQNEKSALAEHEEKMQQFEEIMGRIVGKALREQAENLGDAMGNHLCNKMLRE VDELMVEQEQREEERYKKLDEAIRSTQKARQEIAVSKDGIFKKKSRFFRKNKRKGI >gi|222441836|gb|ACEP01000106.1| GENE 46 48348 - 48518 271 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028182|ref|ZP_03717374.1| ## NR: gi|225028182|ref|ZP_03717374.1| hypothetical protein EUBHAL_02454 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02454 [Eubacterium hallii DSM 3353] # 1 56 1 56 56 68 100.0 1e-10 MAYVISDDCISCGTCEGECPAGAISEGDGKYEIDADTCMDCGSCAGACPAGAISQE >gi|222441836|gb|ACEP01000106.1| GENE 47 48757 - 49437 617 226 aa, chain + ## HITS:1 COG:alr3997 KEGG:ns NR:ns ## COG: alr3997 COG0515 # Protein_GI_number: 17231489 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Nostoc sp. PCC 7120 # 157 222 424 489 496 61 39.0 1e-09 MGRNEKKYGTIGLMILTVIIIFSVVGIVIAGFRMIKKNEDARLISAADTEYNKDKEKRQE IEDHLESDGTQEANTETETKGESEEAASDSETDNKEEVFSADENENASSDEEADGETEDR IYLIPGDDKSSNRYEKAQHLSYTTTVKYTIQNLSVLDSYGLKITRNEIYARHGRIFNDQE LQEYFQRQNWYVPQTASNDFDDSCLNEVEKYNIQLISTYEQQAGGN >gi|222441836|gb|ACEP01000106.1| GENE 48 50186 - 51421 1469 411 aa, chain + ## HITS:1 COG:CAC0590_1 KEGG:ns NR:ns ## COG: CAC0590_1 COG0117 # Protein_GI_number: 15893879 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Clostridium acetobutylicum # 4 146 21 160 160 179 58.0 6e-45 MPEEKYMRRAIELAKKGSGHVNPNPLVGAVIVKDGEIIGEGYHECYGQLHAERNAIANAR KRGNNIEGSTIYVTLEPCCHYGKTPPCTEAIIEEKIARVVVGSDDPNPLVSGKGFKLLRE KGIEVIPHFLKEECDAMNHVFFHYISTGTPYVAMKYAMTMDGKIACYTGDSKWVTGEESR AHVQTLRNHYKGIMAGIGTVLADDPMLSCRIEGGRDPVRIIADSHLRIPMDSQLVRTAKQ QPLIVACLPDADETKAVQLEEKGVEVLRIPGIAVNDFSGSSSDSVLKYKDRLLADNLADN NISREVDSDIIEEKQKEVISLPVLMKELGSRKIDGILLEGGGQLNESALQAGIVQRVYCY IAPKIFGGAQAKTPVEGQGLAKAADAWHFTRIGMQEFGQDILLEYERTKEN >gi|222441836|gb|ACEP01000106.1| GENE 49 
51438 - 52106 683 222 aa, chain + ## HITS:1 COG:SP0177 KEGG:ns NR:ns ## COG: SP0177 COG0307 # Protein_GI_number: 15900114 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Streptococcus pneumoniae TIGR4 # 1 222 1 211 211 228 55.0 5e-60 MFTGIIEEVGEIQSIKKGANSAVITIKANTVLGDLHLGDSVALNGVCLTATSIGKNNYSV DVMHETLRRTNLGELKSGSKVNLERAMAANGRFGGHIVAGHVDGTGTISSMKKDDNAVWI TIDTSAAILKYIISKGSITIDGVSLTVAKVDNKSFAVSVIPHTGQNTTLLDKKPGDTVNL ENDMVGKYVEKLLQYGSIPAEEEEGTTEKTGITMDFLIQNGF >gi|222441836|gb|ACEP01000106.1| GENE 50 52128 - 53333 1539 401 aa, chain + ## HITS:1 COG:BH1556_2 KEGG:ns NR:ns ## COG: BH1556_2 COG0807 # Protein_GI_number: 15614119 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase II # Organism: Bacillus halodurans # 208 401 1 194 197 260 60.0 4e-69 MSDIKFNTIEEAIADFKAGKMVIAVDDDDLENEGSLLVAGEFATPEVINFMATNAKGLIT MPISEEVARQLGLEALIWQGSDGEPSVSTVSFDKADAPTGISAQDRSATVIAATKADAKP EDFRMPGHMFPKVARQGGVLKRTGFTEAAVDLAVMAGLRPVGLCCDILDEDGFVGKLPDL VEFAKKYDLKLITIADMIQYRRKTECFVSREAEADFPTHYGHFRIFGYINKLNGEHHVAI VKGDVSDGKPVLCRVHSECLTGDALGSRRCDCGQQYRAAIQMIEKEGRGVLLYMRQEGRG IGLINKLKAYQLQDTGMDTVEANLALGFPEDLRDYGIGAQILADLGIKELRLMTNNPAKV VGLSGYGIEIVERVPIIIEPNSDDLFYLKTKQEKMGHYTKY >gi|222441836|gb|ACEP01000106.1| GENE 51 53409 - 53882 795 157 aa, chain + ## HITS:1 COG:CAC0593 KEGG:ns NR:ns ## COG: CAC0593 COG0054 # Protein_GI_number: 15893882 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Clostridium acetobutylicum # 5 155 3 153 155 211 69.0 4e-55 MAYHVIEGKVVGKKEKIGIVCARFNEFIVSKLLGGAVDGLVRHGISEENIDVAWVPGAFE IPLICEKMVKTGKYDAVIALGTVIRGSTSHYDYVCNEVSKGVAQVGLQAGVPVMFGILTT ENIEQAIERAGTKAGNKGYDCAVSAIEMINLIKEIEN >gi|222441836|gb|ACEP01000106.1| GENE 52 54411 - 54623 407 70 aa, chain + ## HITS:1 COG:no KEGG:Ethha_1084 NR:ns ## KEGG: Ethha_1084 # Name: not_defined # Def: domain of unknown function DUF1858 # Organism: E.harbinense # Pathway: not_defined # 4 67 3 66 70 82 
57.0 8e-15 MAQQITKQTLIGEMLQMDMGIAAILMAAGMHCVGCPSSAMESLEEACMVHGMDADQVLKR VNDYLANKDK >gi|222441836|gb|ACEP01000106.1| GENE 53 55219 - 56085 904 288 aa, chain - ## HITS:1 COG:no KEGG:Closa_0328 NR:ns ## KEGG: Closa_0328 # Name: not_defined # Def: Negative regulator of genetic competence # Organism: C.saccharolyticum # Pathway: not_defined # 1 287 1 239 239 175 35.0 2e-42 MKLEKLSDTQIRCTLSKEDLSQRQLHLSELAYGSEKAKELFRDMMQQASIELGFEADNIP LMIEAIPISNDCLVLVVTKVEDPDELDTRFSRFSKINVDDSFDEDFSDIDDTDFEEMDFL DDEDDIDMDDEPLPFSPSSDFDNADSDASTSSKERSAIDDALDLIAPFTQAIAQAKKEAM RKKKENRSSVQDCQYYSFQNFSQAAQLGAFLAPFFEGESSLYKDSFSNNYYMILRKTQSE NDTFHRACNIAADFGVRISASYATPAYFREHFETILEENAVEMLGELA >gi|222441836|gb|ACEP01000106.1| GENE 54 56814 - 57521 297 235 aa, chain - ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 42 226 43 229 238 126 37.0 4e-29 MKLLSHPIILSVFLLCFCAGAFFIGHAALSYATFQSAENFCVLLDAGHGGNDPGKVSSSG VKEKDINLAITLKCQSVLEQNGVKVILTRNSDCSLADSNASNKKASDLKKRKALIKESQI NCAVSIHQNSFPDTSSHGAQVFYHPQNPDSKRLASLIQAQMQNLTGIQNHRKIKANTDYY LLRDNNTPTVIAEVCFLSNPSEAAMITEETIQEKAAFQIAMGIMQFLHSPSFNPQ >gi|222441836|gb|ACEP01000106.1| GENE 55 57783 - 58844 1206 353 aa, chain + ## HITS:1 COG:PH0435 KEGG:ns NR:ns ## COG: PH0435 COG0009 # Protein_GI_number: 14590350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Pyrococcus horikoshii # 20 351 16 338 340 332 54.0 7e-91 METELLNIEKVSPEEKAAALKKAGEIIREGGLVAFPTETVYGLGADALNAEASAKIYAAK GRPSDNPLIVHIHDVDQVYEIASEVPEAAKKVMEKFWPGPLTVILNKKSCVPDGTTGGLK TVAIRMPSHPLARDFIRESGRMIAAPSANTSGRPSPTLASHVYEDMQGRIPLILDGGAVG IGIESTIIDMSTDTPTILRPGYITKDMLEEVLPKVNIDPAVTGRTMKKNVVAKAPGMKYR HYAPKGQLTLVEGDRDKVIARINELVKEKEEEGHKVGVIGTDETLDSYHADILRSIGSRQ KPETVAANLYRILREFDDLECDYMYSESFFEQGLGNAIMNRMLKAAGYHLITL >gi|222441836|gb|ACEP01000106.1| GENE 56 58856 - 59329 373 
157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028194|ref|ZP_03717386.1| ## NR: gi|225028194|ref|ZP_03717386.1| hypothetical protein EUBHAL_02466 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02466 [Eubacterium hallii DSM 3353] # 19 157 1 139 139 248 100.0 7e-65 MDYRKIIIVGENNIGRSFMAECILRGILQKRNCQNIEVVSRGLVVLFSEPVAPGAAKILR EKGYRIEDFRSSQLTDEDLASADLVLAITKGLAEEIKESFDEATSCMSIGTFIDTTETEE RIPEVTEENQETYQTCFDILEPLMEAVADRIIGELLV >gi|222441836|gb|ACEP01000106.1| GENE 57 59362 - 59544 87 60 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028195|ref|ZP_03717387.1| ## NR: gi|225028195|ref|ZP_03717387.1| hypothetical protein EUBHAL_02467 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02467 [Eubacterium hallii DSM 3353] # 1 60 13 72 72 83 100.0 5e-15 MLALITQLGISMLTCIFLCMIAGKFLAERLHADWIFLVFLILGILAGFRSCYTIILRFIK >gi|222441836|gb|ACEP01000106.1| GENE 58 59628 - 60044 235 138 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3743 NR:ns ## KEGG: Cphy_3743 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 7 132 1 126 126 63 34.0 4e-09 MNWRRWINKDNETLFDLIFGCIVYSIVFEAIGLLVVENKESYSLGLLLGTAVAIGLSVSM YKSLNNCLVMTEHQARRNMVFSTLLRAVVILLAAWIGMRSGYFSFPGIIIGILGLKISAY FHAYTNVYITKKLYKKGR >gi|222441836|gb|ACEP01000106.1| GENE 59 60049 - 60741 532 230 aa, chain + ## HITS:1 COG:FN0364 KEGG:ns NR:ns ## COG: FN0364 COG0356 # Protein_GI_number: 19703706 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Fusobacterium nucleatum # 57 225 26 207 218 119 40.0 6e-27 MGSSDVDFFIHSYYEFKLFGVTLSINTTMVTTVIVCLILLALILFARHEIMKDYDEPNVV QNVVEMIVEKMDAMVVSNMGIHAKKYLNYVEALMAFIFLSNISGLFGLRPPTADFGTTFG LALITFVMIEYAWIKTKGFGIIKDLLDPFPVFLPINIISEFATPVSMSLRLFGNVTAGTI MLGLWYALMPWFTKIGIPAFLHAYFDFFSGAIQTYVFGMLTMVFISNRYD >gi|222441836|gb|ACEP01000106.1| GENE 60 60813 - 61079 589 88 aa, chain + ## HITS:1 COG:FN0363 KEGG:ns NR:ns ## COG: FN0363 COG0636 # Protein_GI_number: 19703705 # Func_class: C Energy production and 
conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 1 88 1 88 89 81 62.0 3e-16 MTGITNEGLVLACSAIGAGLAVIAGIGPGVGQGIAAGYGASAVGRNPGAKGDVMSTMLLG QAVAETTGLYGLVIALILLYANPLIGKL >gi|222441836|gb|ACEP01000106.1| GENE 61 61115 - 61657 676 180 aa, chain + ## HITS:1 COG:SA1909 KEGG:ns NR:ns ## COG: SA1909 COG0711 # Protein_GI_number: 15927681 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Staphylococcus aureus N315 # 23 175 21 173 173 66 27.0 2e-11 MLLEALKYDNRIFGLDPQLVFQVAFQMLAIFILFLAASYLLLDPVRKILNDRKERVMKEQ KEAKESREQAIRFKDEYDTKLKRIDKEAEQILSDARKKAMQRENEIIADARAEAARIMEN AKSEAELEKKRVKDEVKQEIISVAALMSEKIIAASVDEQTQNALFEQTLKEMGDKTWLNE >gi|222441836|gb|ACEP01000106.1| GENE 62 61642 - 62214 738 190 aa, chain + ## HITS:1 COG:BH3757 KEGG:ns NR:ns ## COG: BH3757 COG0712 # Protein_GI_number: 15616319 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Bacillus halodurans # 5 176 6 177 183 84 29.0 1e-16 MAKRVSSIYGNALFELAVEEKKMDALLSEVQTLQQILLENADLLSLLNHPEVSKIEKLDL LKNIFSGRASDEVLGFLSIIVEKDRQKDIPKIFEFFVDKAKEYKGIGKVKVVSATELSAR QKEKLTKRLLETTKYTSFEVDYQINPALLGGLIIRIEDRVLDSSLKTQIEKLSKGLSKLS LEKEGVVGVG >gi|222441836|gb|ACEP01000106.1| GENE 63 62207 - 63712 2164 501 aa, chain + ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Clostridium acetobutylicum # 2 501 1 500 505 672 65.0 0 MVNLRPEEISSVIKEQIKNYTAKLETTDVGTVIQVADGIARIHGLENAMQGELLEFPGEV YGMVMNLEEDNVGVVLLGDSRAVNEGDVVKTTGRVVEVPVGDALTGRVVNALGQPIDEKG PIDTDKYRPVERVAYGVIDRQAVDTPVQTGIKAIDSMVPIGRGQRELIIGDRQTGKTAIA VDTIINQKGQNMHCIYVAIGQKASTVAGIVKTLKEHDAMEYTTVVASTASELAPLQYIAP YAGCAIGEEWMERGEDVLIIYDDLSKHATSYRTLSLLLRRPPGREAYPGDVFYLHSRLLE 
RASRLSKELGGGSLTAIPIIETQAGDVSAYIPTNVISITDGQIYLETEMFHAGFRPAINA GLSVSRVGGAAQIKAMKKIAGPIRTELAQYRELASFAQFGSELDDDTRERLAQGERIKEV LKQPQYQPLAVEKQIVIIYAVTRKYLLDVPVERIAEFEKGLYEFVETKYPEIYKAIRETG EINSETDVLIQKAITEFKNQF >gi|222441836|gb|ACEP01000106.1| GENE 64 63728 - 64585 776 285 aa, chain + ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 1 284 1 284 285 192 39.0 6e-49 MASMRDIKRRKESIQSTQQITKAMKLVSTVKLQKARTRAESAKPYFQKMYDTVQSILSKS GNIEHPYLTFNGSEKKAVIVVTSNRGLAGGYNNNIVKLVADSGLPKESVEIYGIGKKGIE SFERRGYYIASDDSDIINEPLFSDAVEIGRKVLDAYAAGEIGEIYLAYTGFKNTVVHTPE FIKLLPVEVKAEEESEKKSQAPMNYEPEAEESLDLLIPKYINSIIYGAFIEAVASENGAR MQAMDAATSNAEEMISTLSLAYNRARQGAITQELTEIISGAEALN >gi|222441836|gb|ACEP01000106.1| GENE 65 64608 - 66002 1728 464 aa, chain + ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 464 3 462 466 650 73.0 0 MAEQNTGKITQVIGAVLDIRFDQGVLPEINDAVEIRRKDGSKLVAETAQHLGDDIIRCIA MGPTDGLVRGMEAIATGGPITVPVGEVTLGRIFNVLGEPIDKKPVPEGVKRNPIHRKAPT FAEQATETEILETGIKVVDLLCPYQKGGKIGLFGGAGVGKTVLIQELIRNVATEHGGYSV FTGVGERTREGNDLYTEMSESGVIDKTAMVFGQMNEPPGARMRVGLTGLTIAENFRDEGG KDVLLFIDNIFRFTQAGSEVSALLGRVPSAVGYQPTLQTEMGALQERITSTKNGSITSVQ AVYVPADDLTDPAPATTFAHLDATTVLSRAIVEQGIYPAVDPLESTSRILDPRIVGEEHY QVARGVQEILQRYKELQDIIAILGMDELSEEEKVLVARARKVQRFLSQPFFVAEQFTGTT GRYVPLGETIQGFKEILEGKYDEIPEGMFLNAGNIDDVLARYNK >gi|222441836|gb|ACEP01000106.1| GENE 66 66011 - 66415 339 134 aa, chain + ## HITS:1 COG:BH3753 KEGG:ns NR:ns ## COG: BH3753 COG0355 # Protein_GI_number: 15616315 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Bacillus halodurans # 5 134 3 133 133 106 45.0 1e-23 
MADKTFRLEIIAPDRIFYSNDIYMVEYNTVEGEVGIYADHIPMTQIIAPGRLIITERNEE KQAALLSGFVEITPEKMTILAEAVEWPQNIDVNRAKEAKTRAERRLSSEQGGVDIGRAEL ALKRAITRLDVAGK >gi|222441836|gb|ACEP01000106.1| GENE 67 66867 - 67382 603 171 aa, chain + ## HITS:1 COG:CAP0111 KEGG:ns NR:ns ## COG: CAP0111 COG0454 # Protein_GI_number: 15004814 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 11 171 2 162 162 156 44.0 2e-38 MMNEELIVMEHMQIRMATMEDLDKIAFVESECFPFEEAATKEKFAERIKEYGTHFWLMCE EEKVIAFVDGMVTDEKNLTDEMYEKACLHDEKGAWQMIFGVNTLPKYRRQGYAEKLLRYA INDARAQDREGLVLTCKDKLVHYYEKFGFQNEGVSESIHGGATWYQMRLTF >gi|222441836|gb|ACEP01000106.1| GENE 68 67602 - 68507 1270 301 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028207|ref|ZP_03717399.1| ## NR: gi|225028207|ref|ZP_03717399.1| hypothetical protein EUBHAL_02479 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02479 [Eubacterium hallii DSM 3353] # 1 301 23 323 323 295 100.0 4e-78 MNMQELLTYVEELTFKTSMFGYDKDEVDIQLDKICDEAEAIILAKEKEIEELKKESKTAL GIAAAATGKTMEELEKDTQEDIVETEEDEQTVTLDEAEETVQETVSNEVPISAVAVADAD EVQDEIETVRAQLAAAQKKVAALEAEIAAEKEKTEKAIARAEEAEKSAENNKAPETTDQA YERYMKNADLLCKQLSDLEGRQNAILDEAREQAEREREKARQDAADIIGEARKTAERLLK DAQENKENAIEEAKKIHEDSLRKVEEEKKQCDELSAQKAAMVNSLKKLTEDTARLMEKMQ G >gi|222441836|gb|ACEP01000106.1| GENE 69 68664 - 69038 429 124 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1396 NR:ns ## KEGG: Rumal_1396 # Name: not_defined # Def: Cna B domain-containing protein # Organism: R.albus # Pathway: not_defined # 2 89 1403 1481 1533 87 51.0 1e-16 MDITGNNEVAGARLLLKDKEGNVIESWMSATEAHVFEQKLIAGETYTLAEVTAPSGYEVA ADITFTVNKDGTVSIDGKAVDGNEIVMKDDTTPAGEEGKLVVSKLVTFQGQESGSKQNFL RCII >gi|222441836|gb|ACEP01000106.1| GENE 70 69079 - 69537 354 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028209|ref|ZP_03717401.1| ## NR: gi|225028209|ref|ZP_03717401.1| hypothetical protein EUBHAL_02481 [Eubacterium hallii DSM 3353] hypothetical protein 
EUBHAL_02481 [Eubacterium hallii DSM 3353] # 1 152 1 152 152 193 100.0 4e-48 MKCEGAWSAYTVFEHLKDGTYYVAETDEDGNKLESSEACKIEGNGTKCQITPTQKTAYAV IENQLLIPEYDKYLKEDDGDDTDGNDGKEKKKTSTSKNNSSNKTTKKGKNSKTGDNSHIL FYAALAAAALAAGSAEVYRRRRRATGRNKHDR >gi|222441836|gb|ACEP01000106.1| GENE 71 69625 - 70266 541 213 aa, chain + ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 6 210 10 212 219 117 32.0 2e-26 MLKLIIFDMDGLMFATEQVNYRAFTEIVKEEGYNPTFEQYIGFLGMNAKDIQKKYYVYYG EDVDAEGIYKKVGNRSKQIIREEGVPEKEGLRELLQVVREKGLQTAVASGSDTDVIKEYL DRTGLNEYFDMVLSSKDVKRGKPFPDVFLEICKAFDVKPEETLVLEDSANGVQAALAGNL PVINIPDLLPIPKEQQEKCVAVVENLKEVIPYI >gi|222441836|gb|ACEP01000106.1| GENE 72 70382 - 70915 405 177 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028211|ref|ZP_03717403.1| ## NR: gi|225028211|ref|ZP_03717403.1| hypothetical protein EUBHAL_02483 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02483 [Eubacterium hallii DSM 3353] # 1 177 1 177 177 303 100.0 4e-81 MNKKGDFDFEWLPGFREMMACQNEEQNTQQETDWGNENLQIEALLPPEQLQQEEQRYWKD YEYFIQMLPMVAREVWIVGDALLDQYEYKGSPIYSEYPDKVTILKIADMVYDKLKYREEQ EMIQADDEEMIRNPYFVEVSLRDAITPFRTLIQVILHWNINHRRQRYYRRQKVFSTQ >gi|222441836|gb|ACEP01000106.1| GENE 73 71151 - 72344 1188 397 aa, chain + ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 2 390 4 399 412 252 39.0 9e-67 MEFLLTKIKENLNEQMLLLTISNPRKAEEVKKYNIRPILVKDKLVFQSAAYTKTQVFHKN LSKRELLEEVEKILPLYKQVQLQTSGAELTALINKKGKAAIKVKKQNNPAVAKLSSDKSH LLHNRTKKYILPEGVAVPFLKDLGVMTAEGKIVRTKYDKYRQINRFLEFIEDVLPHLPKD REITILDFGCGKSYLTFAMYYYIKELKGYDIRIIGLDLKKDVIKNCSRLAVKYGYDKLNF YEGSIEEFEGVTQVDMVVTLHACDTATDYALYKALRWGASVILSVPCCQHELNKQISASE 
FEPITDYGILKERFCALATDGIRAKILEEQGYDTQILEFIDMEHTPKNLLIRALHRKKPS AKKREKASKEVNAFCEQFGFAPTLWKLLQEEGGVNNE >gi|222441836|gb|ACEP01000106.1| GENE 74 72337 - 74889 2204 850 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0675 NR:ns ## KEGG: EUBREC_0675 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 14 848 10 840 842 301 26.0 1e-79 MSKNVQLIILHIAVAVLCFIGALVFFNNRMGMEKGSAVAELANPSYPVLEIGNDTSSYNL MAGYKEDIDLSLVRNQITLTNNKNAVILKLHDYDYDITAIQYTLFEKTPDKPTETGTLNQ LTKNKKKNIKLGTLTFKNTLSENKVYYLKMAVRLNDSTRIYFYTKVQSGSGYHLDDYLAF VLKFHNNLFDKATMDENASYLETSADTIDDNLESVSINSGREAVSFGNMEVKQETKPRIT LQEMNNTYTVIRVNTILSTEISDGVIQYYDLSETYKLRYTADRMYLLDYERTMDAYYNES IIDSANNLISLGIQNEKNISYIYSDKGYRVCFAVEGQLWYYDYQSSDMYKIYSLASENIS DIRNATGNHGIKLLSMDDKGNIFYLVYGYINRGRHEGMNGIQVMKYDAKTNCNEEISFLS TSLPYDSMKEDLEKFSYLNSKSVFYCILEGDLHEIDLEKKKDKILESGLVNESLTASKDQ SIIAIEKEQNLYKNKQIEMIDLESGKKQNFTAGSDKRIRAVGFLSNDFIYGEANAVNVSK SSNGTVSFPITKIHIVDINGKTIKEYQKAGRYIMSTQIKGSILEMTLGKRTGGKIQKTGT KDYIRYKEKEESDAVTLTSKYTDTYWTQLYLKFPNYVYIQVVPDLLLTKIMVNEDDVTLK LSNSGEQLEQYFVYASGTQKAVYTNLTEAIARAYTERGNVIDSKENVLWKCIYADYAQVA GMDNVVKVQSDAKSLAGCLSMIAAVNGKEAAPDSIDIGKGSIEQLMKKYSGHTARNLTGC TVDEILYYVSQGSPVLAKLNSNRYVIVMSYNATKIRYLDPVTGKSTAASRTDVTNRLEKS GKVFYSYIAD >gi|222441836|gb|ACEP01000106.1| GENE 75 75114 - 75494 87 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028214|ref|ZP_03717406.1| ## NR: gi|225028214|ref|ZP_03717406.1| hypothetical protein EUBHAL_02486 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02486 [Eubacterium hallii DSM 3353] # 1 126 129 254 254 227 100.0 2e-58 MILMENASQYRQNAFYRNFDILIIFCCLSEVFQTDCGAAGILCIYFFYSIYKKRGSSTEL PIKAGFIGTLPAALPLLTYVSPFPVQIFALADSLLLHCYNGEKGQGSKYFFYLFYPLHLL ILSFIF >gi|222441836|gb|ACEP01000106.1| GENE 76 76058 - 78394 2101 778 aa, chain + ## HITS:1 COG:BS_pbpD KEGG:ns NR:ns ## COG: BS_pbpD COG0744 # Protein_GI_number: 16080201 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: 
Bacillus subtilis # 62 642 54 617 624 330 35.0 8e-90 MSKKHNGLLWRVLKIWATCLILAFCVLGVIAVKRVGVPVVTMYREASKDVAKSSKDTFKA EQTSLVYDAEGNEIKKLKQEKDVSYLDYEDIPDAAKLAIISIEDKNFTTHHGIDVEGVAR AGLSLILNRGNITMGGSTITQQLARNIFLGYEKTYSRKIKEMFIAVALEQKYTKQEILEF YLNNVYFANGYYGIESASEAYFNKNAKDLSISQIAFLCSIPNSPNRYDPYKHKDSTLKRR DRILKNMYEDGYISESQYEKSVAEKITIMPKKETTTNDYAETYILKCATEALMKQQGFEI KTTFSSDSEKEKYNNEYDELYNTCKKSLYSAGYRIYTSIDMEKQKQLQDSVSSQLSMFTE KTDDGVYKVQGAAVSIDNSTGKVAAIVGGREEDQEGYGLNRAYQSFRQPGSAIKPLLVYT PALEKGYTPDSTLDDSRMTGSDSVSNAGYSYSGSISLRRAVEKSSNVATYRLYEDLGPRS CMKYLEALNFKGLDKRDYEYNSTCIGGFTNGTTVVEMAAGYATIANDGEYHTPDCIIKIT DAKGNDIVPEDKSTKSVYSSSASRMMTSVLQSCVTESEGTAHVCQLDVDMPAACKTGTTN DYVDGWLCGYTPYYTTAVWVGMDKYKSVDNLKGNTYPASIWKNYMDKIHQGLTRKEFESY LSDTESTTEATTESTEDDTESTTESTTESTEDNTAADTTEPDTSQDYYNPDDNNNDNNTE NNNNGYGGGNNGNGGNGGNDANNGNNENNNDSGMDNGGNNDGGDSGNNGDSGTPDNGE >gi|222441836|gb|ACEP01000106.1| GENE 77 79457 - 80650 1522 397 aa, chain - ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 14 375 11 357 371 374 50.0 1e-103 MRPDDSKFATVPVGKLGIIVHPSCAELGKRIDQYIVNWRKEREHEHKENVVFKDYEQDTY IIDATCSRFGTGEAKALISDSVRGVDLYILADVLNYSLKYKVCGHENHMSPDDIYADVKR IIGAASGKPKSITVIMPFLYESRQHKRSSRESLDCAYALQELTAIGVDNVLTFDAHDPRV QNAIPLKGFDNIMPTYQFIKALLRSTSDLEIDNEHLMVISPDEGAMNRTMYFSSVLGVDI GMFYKRRDYSVVVDGRNPIVAHEFLGSSVEGKDVMIIDDMISSGESMLDVARELKRRKAR RVFCISTFGMFTNGLESFDKAYEEGLIYKVITTNCTYQRPDLFERPWYVSCDVAKYIALL IDSINHEASISAILNPHDRITKRLEEYKQFHTIKEEK >gi|222441836|gb|ACEP01000106.1| GENE 78 80729 - 81730 621 333 aa, chain - ## HITS:1 COG:CAC3588 KEGG:ns NR:ns ## COG: CAC3588 COG1484 # Protein_GI_number: 15896822 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 15 326 15 322 329 210 39.0 3e-54 
MGISTRQYQDIMNEYDAVRRRNYMAEQERKERVYALIPEIRQIDEQIAHISVEKAKALLL KQVSNAEAKKSLQNTIYDLSMEKVNLLAIHDYPADYLDPIYDCPECKDTGYIGDKKCRCF QQKIRHILYSQSNIEDVAGTESFSAFRREYYSTQRSGREKLSPRENIENVLSASHSFIES FDSKPGQNLLIYGNAGVGKTFLSNCIAGELLNRGKGVIYLTAYQFFDQLADYTFRRGANN AQTLPAFLHCDLLIIDDLGTELNNSFINSQLFLCINERILNKKSTIISTNLSLEQINRSY TERVFSRIIQSYTLLHIYGEDIRIKKAFSSLDE >gi|222441836|gb|ACEP01000106.1| GENE 79 81734 - 82924 806 396 aa, chain - ## HITS:1 COG:CAC3587 KEGG:ns NR:ns ## COG: CAC3587 COG3935 # Protein_GI_number: 15896821 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Clostridium acetobutylicum # 163 391 108 321 328 96 26.0 7e-20 MSDKITLLYNVQNEVTVLPNKFIDEYMIKADGEYVKIYLLILRLQGMGLPVDVERIADYL ELTRKDVLRALSYWEKAGILRTGAASETAATVTGSESMANGISGDTGTAQTVGTSGSDGN IITSGTSGSDGNTMASESTGQTGASLSPTITPVPDKNTLSPTEVQASMTNKDLERTIYMA ETYIGRPFSTTELNSFCYINDQLHFSSDLLEYLIEYCVTRGKKSVRYIESVAINWYQQGI TSVQEAKEQSTLYSQNVFPIMKAFGISNRDPGSAELDYIKKWNSLGLGTDIIIEACSRTL LATHQASFPYANKILEDWKRLGVRNTSDIKHLDDKHRSTASSSSGSAAGYTQGQPRKGSI GAAARKNTATNQFHNFEQRTYDFNDLESRLLDKTRR >gi|222441836|gb|ACEP01000106.1| GENE 80 83159 - 83311 74 50 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028219|ref|ZP_03717411.1| ## NR: gi|225028219|ref|ZP_03717411.1| hypothetical protein EUBHAL_02491 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02491 [Eubacterium hallii DSM 3353] # 1 50 1 50 50 71 100.0 2e-11 MDNVDTFVDNVDNFVDRWVEFDILGRKLVKKGDKFQDIGVNEQKYNVDKM >gi|222441836|gb|ACEP01000106.1| GENE 81 83657 - 85057 1660 466 aa, chain - ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 18 465 13 458 458 418 47.0 1e-117 MALEGNMYTIDFQKPIHVHFIGIGGISMSGLAEILLNRHFTVTGSDMQASDMTKHLEETG AKVVIGQKAENITDDIDLVVYTAAIHESNEEFAAAKNKGIPMMTRAALLGQIMANFAKSI 
AVAGTHGKTTTTSMLTHILLQADTDPTVSVGGMLDRIGGNIRVGHSDLFLTEACEYTNSF LEFYPLYSIILNVEEDHMDFFKDIEDIKNSFHKFASQTADDGLIIINGDMGHTDFILNGL SQKHVTFGLNPDNDYTAADITFDKEGNASYNLIAHGQEKGRITLKVKGRHNVMNSLAAIA CTEAIGLPLDAIRKGLLSFGGTHRRFEYKGSLGDVTVIDDYAHHPTEIRATLSAAKDYPH DELWVIFQPHTYTRTKAFLPEFAKALEQADHIVLADIYAAREVDTGEVSSKDVMKLLQED GQDVHYFPSFEEIKDFVKAHVKGHDLLITMGAGNVVEIGEELLAEK >gi|222441836|gb|ACEP01000106.1| GENE 82 85464 - 86723 1506 419 aa, chain + ## HITS:1 COG:TM0240 KEGG:ns NR:ns ## COG: TM0240 COG0448 # Protein_GI_number: 15643012 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 7 418 5 419 423 442 53.0 1e-124 MIKKEMIAMLLAGGQGSRLGVLTSKLAKPAVAFGGKYKIIDFPLSNCINSGIDTVGVLTQ YQPLRLNQHIGIGIPWDLDRNIGGVSILPPYEKSQNTDWYTGTANAIYQNMEFMEYYHPD YVIILGGDHIYKMDYELMLEFHKKNNAQITLATYEVPWEDASRFGLVITDENNQILEFEE KPANPRSNKASMGIYIFNWEVLKEALETLSEQPGCDFGMHVIPYCHQRGDKIMAYNYQGY WKDVGTLSAYWEANMELIDIIPEFNLYEEFWRIYTKTDVIPPQYFSADSKVNACIVGDGT EVYGEISNSVIGAGVVIEEGAVVTNSIIMNNTVIKKGAKIEKSIIAESVEVGENAELGVF EEAENKYKPKVYSGGLVTIGEGSVIPANVKIGKNVAIVGKTTAEDYPDGILASGESIIR >gi|222441836|gb|ACEP01000106.1| GENE 83 86743 - 87867 1289 374 aa, chain + ## HITS:1 COG:TM0239 KEGG:ns NR:ns ## COG: TM0239 COG0448 # Protein_GI_number: 15643011 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 352 1 352 370 300 42.0 3e-81 MKSIGIILAGGNNNRIGRLSDKRAIAALPVAGSYRAIDFTLSNMSNSRVSKVAVLPQFNA RSLNEHLSSSKWWDFGRKQGGLYVFTPTVTKNNNSWYQGTADALYQNIDFLKKSHEPYVI IATGDGVYKMDYGKVLEAHIAKNADITVVYTTLKDTDDDLTRFGVLKLDENERIVEFEEK PLVASSNNVSIGVYVIRRRHLIEILERCAREDRHDFVQDVLVRYRNVRKIYGYKLDTYWR NIATVDSYFKTNMDFLKKDVRDYFFKQYPDVYSKVDDLPPAKYNTGAEVKNSIISSGCIV NGKVENSIIFKDVYIGNNCTIKNSIILNDVYIGDDSYIENCIVESRDTLRPGTNYVGTDD EVKIVIEKNVRYIL >gi|222441836|gb|ACEP01000106.1| GENE 84 87898 - 88197 446 99 aa, chain + ## HITS:1 COG:CAC3223 KEGG:ns NR:ns ## COG: CAC3223 COG2088 # Protein_GI_number: 15896470 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Clostridium acetobutylicum # 1 83 1 83 95 119 74.0 1e-27 MQITDIRIRKIAKEGKMKAVVSVTFDNAFVVHDIKVIDGEKGLFIAMPSRKASDGEYRDI AHPINSDTRDIIQKSILQEYEAIMASGEEETVGEAEVAL >gi|222441836|gb|ACEP01000106.1| GENE 85 88520 - 90673 1770 717 aa, chain + ## HITS:1 COG:BS_spoVC KEGG:ns NR:ns ## COG: BS_spoVC COG0193 # Protein_GI_number: 16077121 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Bacillus subtilis # 522 707 1 185 188 207 54.0 7e-53 MNTNKIFAGCFNFFRKWKVKKNQITLIEKLNTGGTGSLFEIKNECEKRGLPFHFNIITHA DYEVSLRNLGGLLKLFTIKAFRMATSSHIFLNDNFLPLGYMKLSDETKVVQLWHGMGSFK KFGGSSETNPEVLKELKAATKNTDHILASSENIRDNYAEAFIAPKEKVICIGCPQVDYFF RDHDIAAWKEELSEQYPEMKGKKLVLYAPTFRGEEEHDKKLLEAFDFDAFQKELGKDYFL MVRLHPQIQSAKVPDTVANMTDYPNVRKLLCMTDILIADYSSIAVEYSLLNRQIILYAFD KEWYLSKDRGFYFDYEKTAPGPIVENMQDLIDCIKNKQWDIAKVEKFAHLHNDYFDDKSA ERVVDYYFGNGKKLPNSASEPEPFYEEWNQYRPKHRRKRNPDSISQNIFDNASGKSQNGK LPEKWATQDAEEAVNSWESERKKQRKRQQQKARMQEKLKQQTANVTKQKNKKNNNFISGI DKLTSKKEKDINKQKQLKLQNQEEKSTKETSEEIKSRKEHRMKVIVGLGNPTDQYKGTRH NVGYMAIDRIAEANRININQHKFKAMVGSGIIGGSKVLLVKPLTYMNLSGESIRPIMDFY KLDLSDILVIYDDISLEPGMLRLRTKGSAGGHNGMKSIIKHLGGDTFPRIRVGIGGEKHP GQDLADYVLGHFKDDEKELLSDALDKAEKAAELFAQDEFSEAMNKYSVGKKKRKTLE >gi|222441836|gb|ACEP01000106.1| GENE 86 90720 - 94283 2717 1187 aa, chain + ## HITS:1 COG:BS_mfd KEGG:ns NR:ns ## COG: BS_mfd COG1197 # Protein_GI_number: 16077123 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Bacillus subtilis # 67 1138 54 1118 1177 943 47.0 0 MQQKSATTKIKGGIMKTLSNPLLKFDTYIDIKTAMSQQAYPIVLNGCVDSQKAHFIPNLG EDFPCRFVLTYKEEKAKELYQDLSFFDPNTVLYPSRDVLFYSADVHSNHIERQRMDILKK IVEGKPLTIVACLDVFMEKMIPFEEFKEHCLTIDFESIIDTDALKLKLSELGYENSGLVE APGQFGIRGGIIDIFPLTEELPVRIELWGDEVDSIRSFDTETQRSVEKLDEVQVYPATEM 
ILSRNKIGEAVRRMKEEYKKQEEAFKKRKRFAEKERLRKMTVRTEEELLSFGTAEGSEAL LSYFYEKTVSFLEYLPENTLFFIDEPHRVLEKGKTYEEEFFLCMQSRLEGGYVLPGQADL LFGYEEILSKVMVGPLILLSSVIQDYAFYKPKTTCDIEAKSIFSYNNSFDQLIKDLEHWK KQNYRILLLSSSTTRAKRLAENIRDYGLLAYFAADFDRTIAPGEIMVASGRLGNGFEYPT LKFVVLSEKDIFKERKAKKPKKKSQYSGQKINSLSEISVGDYVVHEKYGLGIYRGMEKIE SDGITKDYINIEYKDASNLFVPASQLELIQKYSNLSARKPKLNKLGGTEWEKTKSRVRSQ VQIAAQDLVKLYAERQAKEGYAYGKDTVWQKEFEELFPYEETEDQLSAIEDTKRDMESHR IMDRLICGDVGYGKTEVAIRAAFKAVMDSKQVVYLVPTTILAQQHYNSFKERMEHYPVEI AMLSRFCTPKEQKRIFDGLKNGTIDIVIGTHKVLSKNIKYKNLGLLIIDEEQRFGVKQKE KIKQLKKDVDVLALSATPIPRTLHMSLAGIRDMSVLEVPPVDRRAIQTYVMEYNEELVRE AIERELGRGGQVYYVYNRVNNIDEVAAGLQRLLPNATVEYAHGQMGERQLETIMSGFINK EIDVLVSTTIIETGLDIPNVNTMIIQDAQLFGLSQLYQLRGRVGRSNRTAYAFLMYRRNS ILKEEAEKRLKAIREFTDLGSGFKIAMRDLEIRGAGNLLGAEQSGHMESVGYDLYCKMLN EAVLTMKGEQQEVDTFTTSIDLSIDAYIPETYIKSESEKLSWYKRIATIETQEESEDMIE EMTDRYGDTPAPLIRLMDVALLREEAHQAWLLSIEQKGSKILFTMNPRAKVRVEEIDGFL KQYRNKMKIKPEANPVFVFESTGIPKKDLLAKVREIIGEIQKLQDKS >gi|222441836|gb|ACEP01000106.1| GENE 87 94809 - 95972 1528 387 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2253 NR:ns ## KEGG: bpr_I2253 # Name: not_defined # Def: peptidyl-prolyl cis-trans isomerase PpiC-type (EC:5.2.1.8) # Organism: B.proteoclasticus # Pathway: not_defined # 7 360 10 346 356 91 23.0 8e-17 MKAKLTKVLSLVLVVIMALSLMTGCSTSSSGKGTKTQDGNKVLFSYDGTDVTLKEAWLYA KMLSAQYEQTYSSYYGSDFWSMSMGKDDDGNDQTFEEYVKKQVISQMKQNIILIKKADKY DCKLTHSEKQQCSDSALSYYQDKTGKKVMKECGATREDVEKIYEDSTLASKVQKKIESKE KVTVTDDEARKSTIYRVVFATTKTGDNGQTTEMSKKEKKAVKAKAEKALKEIQSGKKTIK KVAKEQNYSNTDESYAAGESEEGEAFEKVMKSMKDGDIADKVLECDNGYVIAKLVAYTDK EETASNKKTLKEKKQSEAFQKKYDNWTKKLEKKWDYNKDVDQKLWAQVSLKSADSTATET STETTAASSEASTTEASSEATTAAKSK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:07:14 2011 Seq name: gi|222441835|gb|ACEP01000107.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont366.1, whole genome shotgun sequence Length of sequence - 577 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 8 
08:07:15 2011 Seq name: gi|222441834|gb|ACEP01000108.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont368.1, whole genome shotgun sequence Length of sequence - 3221 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 74 - 133 2.3 1 1 Tu 1 . + CDS 165 - 701 338 ## COG1787 Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes + Prom 1984 - 2043 5.3 2 2 Op 1 . + CDS 2080 - 2172 181 ## 3 2 Op 2 . + CDS 2204 - 3217 722 ## COG2866 Predicted carboxypeptidase Predicted protein(s) >gi|222441834|gb|ACEP01000108.1| GENE 1 165 - 701 338 178 aa, chain + ## HITS:1 COG:sll1429 KEGG:ns NR:ns ## COG: sll1429 COG1787 # Protein_GI_number: 16330594 # Func_class: V Defense mechanisms # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase and Mrr-like restriction enzymes # Organism: Synechocystis # 62 170 186 294 304 107 47.0 1e-23 MRHFDIIKWLWVAGIIYTFLQGYTFYMQRQGEIAVGCSFLCAFCIFQFIRRLHTLYYLHM KMKNIDKLNGRAFERYLTVQFRHLGYHVTLTSYSHDYGADLVLRKWGKKIVVQAKRYERN VGIAAVQEVVGSIAYYKADRAMVVTNSNFTKSARDLAKRNEVELWGRKEMQKKFHIKA >gi|222441834|gb|ACEP01000108.1| GENE 2 2080 - 2172 181 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTIGDFFTMAAVIVAACLVYNFVKAYTKKK >gi|222441834|gb|ACEP01000108.1| GENE 3 2204 - 3217 722 337 aa, chain + ## HITS:1 COG:BH1603 KEGG:ns NR:ns ## COG: BH1603 COG2866 # Protein_GI_number: 15614166 # Func_class: E Amino acid transport and metabolism # Function: Predicted carboxypeptidase # Organism: Bacillus halodurans # 43 325 67 341 355 93 26.0 5e-19 MKKSNIKRILSGVFAMLMVCCIAFTTQGLTVNAAIVSGGKKNYSYKELKTDLKQLQKKYR DHCQVNVIGKTADKRNLYEVVIGNPNAKKHLLVIGNLHAREHMTVQLCMKQIEYYLNNYN KKINGKKVSATLNKVAIHYVPSCNPDGTAISQKGFNAIRNKSLRNGLRRMGGSSSKWKAN ARGVDLNRNWKVAFKKAGKKGSSGYRGPKAASEKEVQALVKWVNRIERRGKIAGVVSYHS TGSILYGRCASRATKKVRNITTRMYKLAKSLTRYHLMPTESISVARGCSREYFLYKRNIP 
CITIEVGVGAAPLSGREFNSIWNKNKNVVIREAQLFD Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:07:29 2011 Seq name: gi|222441833|gb|ACEP01000109.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont369.1, whole genome shotgun sequence Length of sequence - 39734 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 15, operones - 7 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 3731 1061 ## gi|225028232|ref|ZP_03717424.1| hypothetical protein EUBHAL_02504 + Prom 3628 - 3687 5.1 2 2 Op 1 . + CDS 3833 - 4744 413 ## COG4974 Site-specific recombinase XerD 3 2 Op 2 . + CDS 4800 - 5351 325 ## gi|225028234|ref|ZP_03717426.1| hypothetical protein EUBHAL_02506 + Term 5584 - 5626 1.0 - Term 5348 - 5385 2.6 4 3 Op 1 8/0.000 - CDS 5415 - 6500 901 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 6523 - 6582 4.5 5 3 Op 2 . - CDS 6649 - 8121 1494 ## COG1004 Predicted UDP-glucose 6-dehydrogenase - Prom 8371 - 8430 10.5 + Prom 8360 - 8419 10.2 6 4 Op 1 . + CDS 8567 - 9118 354 ## Clocel_3236 VanZ family protein 7 4 Op 2 5/0.000 + CDS 9144 - 10529 1729 ## COG1109 Phosphomannomutase 8 4 Op 3 . + CDS 10554 - 12392 2358 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains + Term 12429 - 12471 -0.2 + Prom 12582 - 12641 7.0 9 5 Op 1 . + CDS 12697 - 13419 507 ## Dtox_2202 hypothetical protein + Prom 13434 - 13493 4.0 10 5 Op 2 . + CDS 13546 - 13746 136 ## gi|253579386|ref|ZP_04856656.1| conserved hypothetical protein + Term 13845 - 13900 11.1 11 6 Tu 1 . - CDS 13718 - 13876 117 ## gi|225028243|ref|ZP_03717435.1| hypothetical protein EUBHAL_02515 - Prom 13896 - 13955 6.1 + Prom 13873 - 13932 11.8 12 7 Op 1 . + CDS 13959 - 15002 441 ## BDI_0562 hypothetical protein 13 7 Op 2 . + CDS 15045 - 17234 1482 ## COG0451 Nucleoside-diphosphate-sugar epimerases 14 7 Op 3 . + CDS 17219 - 18127 580 ## BBR47_05570 hypothetical protein 15 7 Op 4 . 
+ CDS 18147 - 20468 1333 ## gi|225028247|ref|ZP_03717439.1| hypothetical protein EUBHAL_02519 16 7 Op 5 . + CDS 20473 - 22299 1113 ## gi|225028248|ref|ZP_03717440.1| hypothetical protein EUBHAL_02520 17 7 Op 6 4/0.000 + CDS 22333 - 23814 819 ## COG0438 Glycosyltransferase 18 7 Op 7 . + CDS 23885 - 25327 533 ## COG4267 Predicted membrane protein 19 7 Op 8 . + CDS 25351 - 25722 228 ## gi|225028251|ref|ZP_03717443.1| hypothetical protein EUBHAL_02523 20 7 Op 9 26/0.000 + CDS 25727 - 26926 625 ## COG0438 Glycosyltransferase 21 7 Op 10 . + CDS 26947 - 27756 580 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 22 7 Op 11 . + CDS 27749 - 29062 441 ## Rumal_1156 type 1 dockerin - Term 28990 - 29031 -0.0 23 8 Tu 1 . - CDS 29049 - 30491 368 ## Kfla_4441 hypothetical protein - Prom 30540 - 30599 9.1 + Prom 30460 - 30519 8.6 24 9 Op 1 . + CDS 30578 - 32122 560 ## Exig_2626 polysaccharide biosynthesis protein 25 9 Op 2 . + CDS 32119 - 33210 481 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 26 10 Tu 1 . - CDS 33227 - 34225 262 ## CPE0465 hypothetical protein + Prom 34338 - 34397 5.3 27 11 Tu 1 . + CDS 34497 - 34805 320 ## gi|225028258|ref|ZP_03717450.1| hypothetical protein EUBHAL_02530 - Term 34611 - 34646 1.0 28 12 Tu 1 . - CDS 34817 - 35272 184 ## Rumal_1722 hypothetical protein - Prom 35327 - 35386 7.6 + Prom 35398 - 35457 6.2 29 13 Tu 1 . + CDS 35506 - 35886 521 ## gi|225028261|ref|ZP_03717453.1| hypothetical protein EUBHAL_02533 + Term 35904 - 35963 1.3 - Term 35899 - 35942 8.2 30 14 Op 1 . - CDS 35954 - 37666 1208 ## COG2274 ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain 31 14 Op 2 . - CDS 37720 - 38886 493 ## bpr_I1003 hypothetical protein + Prom 39094 - 39153 6.1 32 15 Tu 1 . 
+ CDS 39297 - 39732 466 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|222441833|gb|ACEP01000109.1| GENE 1 2 - 3731 1061 1243 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028232|ref|ZP_03717424.1| ## NR: gi|225028232|ref|ZP_03717424.1| hypothetical protein EUBHAL_02504 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02504 [Eubacterium hallii DSM 3353] # 1 1243 1 1243 1243 2349 100.0 0 MYNSDTILRKLIQQLKSYYNSETEGVTVIIGAGCSLRSSSDDISTEGIVKDIVKKHLKED QDLPDTWTELYSLFVNNVWTGQGEQDRIDLLAPYFKNLEPSKGYLAIRNLIKTGYIKNII TTNFDPLIDEILKDIPHRTFIDGLKKESFSDDNFSINFVKAHGDFQHGNLRFAPDELRRL SPEIEAVIHSLTKGIVIVVGYRGQDIGIMNALNTSKAHCAFWVSPEKPDLYDQYSNGSID TWMKTRNSEYNFINGEKGDFNVLFPFIESKLQEVPKQEIKSLKLWKKSSLYNIISTSERA SIVFSNLLKIEDEILKEYIWTVKEPYFCVDSTTLLSILITVLQDKLDFQNDITNEIEALL LATALTIFSQIQGYPICVDQYIYIIKEKYRYLQNIPQPDDNFWSILKELLAEKEIIEKEK FFEISITVDSTKLFYYVLKKGMFSKMQSLFILIQQILLFQKTTCEKKDIISTSHRIKLIL ENYCKEIKYSSTSTKIFFENMTPQDDNVLEGYFRNIMKAQTRELVGKSRTYYIKPDIYIY YEVDLENNRLTSLYDEFYNRANTLTEEFLSAVQFNNFILPDTVEILKRFIVSDNSGLIIT GVSGSGKTTALAHWIKNLQASDEYLILPFSGKNQLTKNGLQSFSDWLSEPLKLQEINELF CCRKKTLLLVYDAINEIPNTFDFLQNIYNQLICFVEKLSTHKYNNIKMILSLRKDVYMQL NDNHAVEPREGIFFPTILPNREISLTYEISPFSKDQAVKFLMDSTNQSQEVITTLYDEFS GILQMPIYIKLFADAFTSFSNLNENNFGELISRWYKILLQKISENNTHQDELVVVINEII LSKYQYIDENVYIKNIAKRLNLEEKIIIKDVNLFAEAGILTYKRVTKEINFSHDVYEQIF LTQYLMKIEYPEDFCCNILEKQINMHILSLALKDYLLFQKELNQEYYFETLIKFISKNHS LLNNCIIENALLLNENSSIENHWSILFQKIDYNIGLNCHKYILQLILDILNTRIDKHSFF SNSTLNELRPLVIQPTYIMNSQKDQAFFEFINAKYWYTFSYKFENDVLKDAQIYCINAKK LLLSSDSNISLCDDINFLYAILLRYEGKLVEAVNTLQKVLQNQIKFSVPDKICKTALELG AIYRELTKFDDALGLYKSIKKLYRDFPPYWDTRLQLNMGIIYKNKQQVKSVNYKLTKDDF NDFKITQQLFFNVYKYACSTDDVLLKLEILAELIELCALARHY >gi|222441833|gb|ACEP01000109.1| GENE 2 3833 - 4744 413 303 aa, chain + ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 3 295 1 296 297 175 33.0 9e-44 
MKLQETINMYLEYCEYRKELDWNTLKAYRIDLRQFVEFIEGEEPSKEKIERYITYLHKKY KQKTVKRKIASLKVFYAYLEEEEIIEENPMRRVHTKFKEERVLPRIIPREEIEKLLNCMY KELRESEKNQRFILRDVAVVEVFFATGARVYEIANIRIENINLESGVIRFMGKGGKERIV QIGEESVIELLNRYYTENKEEIHACGYFFINKRKSRFTEQSIRLMLKKYTKQAKIERNIT PHMFRHSFATYLIEEGVDISCVQQILGHSSIKTTQIYIHVSSQKQADILREMHPRNKMQI RAA >gi|222441833|gb|ACEP01000109.1| GENE 3 4800 - 5351 325 183 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028234|ref|ZP_03717426.1| ## NR: gi|225028234|ref|ZP_03717426.1| hypothetical protein EUBHAL_02506 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02506 [Eubacterium hallii DSM 3353] # 1 183 7 189 189 330 100.0 3e-89 MVSSNDYTIFKFGGSSLKNEEYGIKMMMQYFPDYFTEGELKQNLERYNLHRMFQIVKKYF VKEYGYTGTEVELRNNYQKSINSYIYNDEKNPIFVHIDELFESTVIGFLMAMFKWSKDFS NLDIYGKCFKYILYLLNDVCIFGEMQEAEANRALMDTLNGDIQILQLAEDYYKQFANKSK VLH >gi|222441833|gb|ACEP01000109.1| GENE 4 5415 - 6500 901 361 aa, chain - ## HITS:1 COG:BH3709 KEGG:ns NR:ns ## COG: BH3709 COG0451 # Protein_GI_number: 15616271 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Bacillus halodurans # 11 361 3 335 343 335 47.0 8e-92 MNHIDLKNKTVFVTGSAGFIGSNLVLELLRTQSPIHIIGLDNMNDYYDVNIKEWRLKEIE KCVAEHNDSTYVFYKDDLANKAIIDKIFAEHKPDIVVNLGAQAGVRYSITNPDAYIQSNM IGFYNILEACRHSYDNGEEGVDHLVYASSSSVYGTNKKVPYSTEDKVDNPVSLYAATKKS NELMAHAYSKLYNIPSTGLRFFTVYGPAGRPDMAYFGFTNKLLKGETIQIFNYGNCKRDF TYVDDIVEGVKRVMQGAPEKKNGEDGLPIPPYAVYNIGNSNPENLLDFVTILQEELIRAE VLPADYDFEAHKELVAMQPGDVPVTFADTSALERDYGFKPDTSLRTGLRHFAEWYKEFYN V >gi|222441833|gb|ACEP01000109.1| GENE 5 6649 - 8121 1494 490 aa, chain - ## HITS:1 COG:BH3708 KEGG:ns NR:ns ## COG: BH3708 COG1004 # Protein_GI_number: 15616270 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Bacillus halodurans # 85 490 2 388 388 490 63.0 1e-138 MKKLSITKMADTIIAKRKEQKLTQAQLAEMTGINRGMISRLESCDYTPSIDQLQSIAEVL 
NFEVVDLFEDDKVTSSKPVLDKKYNIAVAGTGYVGLSIATLLSQHNHVTAVDIIPEKVDL INNRKSPIQDDYIEMYLAEKELDLTATLDGEEAYKNADFVVIAAPTNYDSKKNFFDCSAV EAVIELVLKVNPNATMIIKSTIPVGYTASVREKYNTKNIIFSPEFLRESKALYDNLYPSR IIVSCDEESREKAETFAKLLQQGAIKENIDTLFMGFTEAEAVKLFANTYLALRVSYFNEL DTYAESKGLNTQEIINGVCLDPRIGTHYNNPSFGYGGYCLPKDTKQLLANFNDVPQNMIS AIVESNRTRKDFIADQVLNMAGAYEANSSWDPSKEGEKIIGVFRLTMKSNSDNFRQSSIQ GVMKRIKAKGAKVIIYEPTLKNGETFYGSEVVNDLKKFKKNSQAIIANRYDSCLDDVKDK VYTRDIFKRD >gi|222441833|gb|ACEP01000109.1| GENE 6 8567 - 9118 354 183 aa, chain + ## HITS:1 COG:no KEGG:Clocel_3236 NR:ns ## KEGG: Clocel_3236 # Name: not_defined # Def: VanZ family protein # Organism: C.cellulovorans # Pathway: not_defined # 54 172 58 176 471 75 31.0 1e-12 MDLYQIFIEHNRPWSSFEIICYFIMLILLFFVYIFLYKCGKVNRVQIIAGLLLFTYYCIV LESTVFTRKNQGYHAYELEVFWSWKDVIFYHSREMLKENLLNIALLLPFGLLLPFPFYKR LRWWQGLAMGLIASAVIEVLQLVLCRGLFEFDDMIHNGFGCMLGSMLAGLLFQKLINSNL QTE >gi|222441833|gb|ACEP01000109.1| GENE 7 9144 - 10529 1729 461 aa, chain + ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 454 1 443 450 398 48.0 1e-110 MGKYFGTDGFRGEANVNLTVDHARKVGRFLGWYYGQLKKQAGDDAPAKIVIGKDTRRSSY MFEYALVAGLTASGADAYLLHVTTTPSVAYVARTEDFDCGIMISASHNPYYDNGIKLING NGEKMDEATIHLVEAYLDGELEVFGQKYEEIPFAHRDAIGCTVDYAAGRNRYMGYLISLG LYSFKGMKVGLDCANGSSWNMAKQIFDSLGAKTYIINAEPDGTNINRDAGSTHIEGLQKY VVENGLDVGFAYDGDADRCLCVDEKGNVVDGDAILYIYGRYMKERGKLLTNTVVTTVMSN FGLYKAFDELGIGYAKTAVGDKYVYEYMTKNGCRIGGEQSGHIIFSKYVSTGDGILTSLK MMEVMMAKKKKMSQLTEDLHIYPQVLVNVKVKDKAVAQADEDVQAAVNKVAKALGDTGRI LVRESGTEPLIRVMVEAETEEICHTYVDEVVAVINEKEATR >gi|222441833|gb|ACEP01000109.1| GENE 8 10554 - 12392 2358 612 aa, chain + ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar 
isomerase domains # Organism: Clostridium acetobutylicum # 1 612 1 608 608 651 53.0 0 MCGIVGFTGAQQAAPILLDGLSKLEYRGYDSAGIAVRDGENETEVVKAKGRLKALAEKTD NGTAVKGSCGIGHTRWATHGEPSENNAHPHKSDDGNVVAVHNGIIENYLELKEKLTRKGY VFYSETDTEVAVKLIDYYYKKYEGTPIDAINHAMVRIRGSYALAVMFKDYPEEIYVSRKD SPMILGIEDGESYIASDVPAILKYTRNVYYIGNMELACVRKGEITFYNLDGEEIEKELKT IEWDAEAAEKAGFEHFMMKEIHEQPKVVGDTLNSVLKDGQFDLSSVGLSEEEIKDISQIY IVACGSAYHVGIAAQYVIEDLAKIPVRVELGSEFRYRNPLLDPKGLVIVISQSGETADSL AALRESKEKGVRTMAIVNVVGSSIAREADSVFYTLAGPEIAVATTKAYSTQLMATYILAL QFAKVRGEISDETYKNMIAELQTIPDKIAKILEDKERIQWFASKQANAHDVFFVGRGIDY AICMEGSLKMKEISYIHSEAYAAGELKHGTISLIEDDILVVGVLTQPDLYEKTISNMVEC RSRGAYLMGLTTFGQYNIEDSTDFAVYIPKIDPHFATSLAVIPLQLLGYYISVAKGLDVD KPRNLAKSVTVE >gi|222441833|gb|ACEP01000109.1| GENE 9 12697 - 13419 507 240 aa, chain + ## HITS:1 COG:no KEGG:Dtox_2202 NR:ns ## KEGG: Dtox_2202 # Name: not_defined # Def: hypothetical protein # Organism: D.acetoxidans # Pathway: not_defined # 20 119 25 124 502 70 37.0 5e-11 MAGRPKKKPEYNPELQFNNFLQELRDAYEEADSLRSLADELNISLLKLRKLLITADVFTS DICTEINDLHQSGKKIPEIMKLTGLSRASVHSYLPYIKGLYNAAEISLNAERCRTYKNRQ EQVRLLQEIPSEENLWQAVIAFQEYPFKTATGLPFRYKLKVGKNGEYNRELLIDRREKSK SLAWSSVVLAFENSKRISEEVKKPKALGDIRGVSYIYPILWRFGLIRVPEAIEKKMGKQR >gi|222441833|gb|ACEP01000109.1| GENE 10 13546 - 13746 136 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253579386|ref|ZP_04856656.1| ## NR: gi|253579386|ref|ZP_04856656.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] conserved hypothetical protein [Ruminococcus sp. 
5_1_39BFAA] # 1 66 401 466 466 122 96.0 1e-26 MLVIDSEKLSQQIHETESDYMEKSKEVLANGQETEGAKYQGKVLNRKKKLYYGVLRIIIR PLRQLL >gi|222441833|gb|ACEP01000109.1| GENE 11 13718 - 13876 117 52 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028243|ref|ZP_03717435.1| ## NR: gi|225028243|ref|ZP_03717435.1| hypothetical protein EUBHAL_02515 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02515 [Eubacterium hallii DSM 3353] # 1 52 1 52 52 86 100.0 5e-16 MRLLYSQKQHSYLSFKLILFSHTNSEVPAKQGVQVVGMIFQTHSKAAAEEDE >gi|222441833|gb|ACEP01000109.1| GENE 12 13959 - 15002 441 347 aa, chain + ## HITS:1 COG:no KEGG:BDI_0562 NR:ns ## KEGG: BDI_0562 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 11 344 2 327 328 270 43.0 8e-71 MRKVDKNTQDIDFVITWVDGNDLEWKREKTARLGHTDMDISVNTDDRKERYRDWDNLRYW FRGMEKYAPWVRKVHFVTWGHIPQWLNTKHPKLNIVCHEDFIPQKFLPTFNSHTIEWNFH RIPGLTEQFVYFNDDMFLLKPVYMEDFFKDEKPIDMLALQPDVANVDDPTMPYIYLNNTM LLAKYFDKRSNIKKQPWAYFHIGYPPMYFLYNVLELAFQRFTGFYTVHGPSPLKKSTYQK LWELEPELLSDVCSHPFRHREDVSQYVLREYQKLSGEFVPENVQKICGYYDAEQTNDRLV RSILRQKKKIICINDSNHEIDFEKVKKEINQAFEQVFSEKSSFENNI >gi|222441833|gb|ACEP01000109.1| GENE 13 15045 - 17234 1482 729 aa, chain + ## HITS:1 COG:CAC0731 KEGG:ns NR:ns ## COG: CAC0731 COG0451 # Protein_GI_number: 15894018 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Clostridium acetobutylicum # 1 721 1 722 725 173 24.0 1e-42 MNILLVGGSCSLINSMILKLRKEGHRVFLLTGDKYRHNKYERVFEKYEFSYDCENLNDIL ESVNADVTILMGAFDTNYRWDGEEREIVLFISHLVNILVAYSVKNKKRLIFLSSDEIYNG NYTEDIKEEEPFSGVGIRADALSQAEEICDNFRINRSLDIITLRLDHLYNIPKERKDVNN ICADMCLNCMCDGHIKAKTDHTFSLLYENDAVEYIYQFIKTSNHKYSLYNLSSNDVVSEV KLAAMIQKAMEIESNIVAFSDSNGRCVLSGNRFEREFGVHAFGNLEKNIKDMVSYMKKHE SVFLKGDDLKLSWWKMLYEKWKWLIRTLFPFFENLVCFVPFFMMNNRTVGSEYFANLDPF LLYVLLFAIVYGQQQATFSAILAVAGYMFRQMYTRSSFAVLIDYNTYVWIAQLFILGLVV GYMRDQIRIMRLESQELEEHLHRQIVDIRDINESNVRVKEVMEQQLIDHKDSIGKIYSIT 
AGLEQRMPDEVIFYAVEMLGKLMKTKDVALYNVVSKDYARIFSASSQKARSLGNSIRYRE MTDIYDALKEQKVYINKKMDEQYPLMARGIYEGEEVQMIVMMWGLSWEKMTLGQANFLTV VSYLIQNAVLRAQRYMQALEEKRYSQNSRILEPEAFESLVQAYMKAKLKNLVECVLIKVD VQNSEYQKTDEQMSGYMRDSDYMGMLPDGNLYVLLTNTTREGAVIVQDRFEKNGYKTEYV EMMSVCHRE >gi|222441833|gb|ACEP01000109.1| GENE 14 17219 - 18127 580 302 aa, chain + ## HITS:1 COG:no KEGG:BBR47_05570 NR:ns ## KEGG: BBR47_05570 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 73 299 67 290 320 72 25.0 2e-11 MSQRMKLFIFLLIINLIVVVFYLIWNYIRRKEKIGSVWMKAVMMLLCPVTGPAFVFLSFL LFKLFSSQGMDLSDVVFSKDRKENFHRPDEEMERNMVSLEEALEVTDKKSLRTLMLNVIR GDYRNSLAAINLALNSEDSETAHYAASVLQDVLNDFRSKVQTDYLLCQEENEQQVENCIK LVEYMNPILEQQVLTNLEQRSMAERMQEVLQKAWELDKIKISSTVYEKVCQRLLEVKDYE KCTLWCDRAMEQYPGVLSSYTCQMKLYFSCGKKEKFFQVMQELRDSDIAIDNETLELIRT FM >gi|222441833|gb|ACEP01000109.1| GENE 15 18147 - 20468 1333 773 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028247|ref|ZP_03717439.1| ## NR: gi|225028247|ref|ZP_03717439.1| hypothetical protein EUBHAL_02519 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02519 [Eubacterium hallii DSM 3353] # 3 773 1 771 771 1488 100.0 0 MSMTTLVILRFAGIFAAYTGLTVLLPAIMFRRILAGRGLSEQFLMCYTFGNFYIINIVFA VQLLHISGFWALVLFTAVPGILIWSRVNRVSLRELCIKTGIICKKILQGSMGMRGALYRV CNRISAVLKRLVRLFYYKVICNTLQWILVGAVIIALFWIYGRNLLLTYGYCASDIPVHLN WINEMVRGNLFSDGVYPFGFHCMIYYLHTVFRVDTYAILCVFYLVQVFFIHMVLLAVMKL LCRSLYLPYAGVLVYILGNLWAKQTYSRFGASLPQEFGMIFVIPSVYFLISFFQTEKEKL KTRETKLLSGCFALAFALTLAIHFYGTMIAGLCCIGIAVGFCPRFLNKEYFKRIMMTGII SVFLAVLPMGIAFAGGTPLQGSLGWGLSVINGGKSDTEDEEASSEDITMEEMAARLNENS KNNIKSNNAKSMQTNMTETPESTFADKVREIPEKIKNTWNMMAQRIGEFIINLPEQWCAY AVLIGIAVVILLGLIFIILRKTTYGEMLMSAGFCMGILTLLLCASGLGLPMLMDPARCSI YYVYLLILTLTVLMDGILYLIFMHRLFTIPRNVLSFILTVFIVGSMLHQGMIKMPGFGSD YVSNGALTCLSNIIKENEDKTWTIVSANDETQMGLDHGWHYETITFLRNMEYMNKDTKLI IPTKNVYFFIEKIPLNYAVLYSGSGQSISKKAASQSLPNSGGITMYQGEGRWILMSRMYY WAQAFKERYPNDMKVYYESEDFVCYVMPQNMYHQYNFAIDYGYNQFRTQEDKN >gi|222441833|gb|ACEP01000109.1| GENE 16 20473 - 22299 
1113 608 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028248|ref|ZP_03717440.1| ## NR: gi|225028248|ref|ZP_03717440.1| hypothetical protein EUBHAL_02520 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02520 [Eubacterium hallii DSM 3353] # 1 608 1 608 608 1179 100.0 0 MVSKRKFFSITIMMFVLLFLFQFSMVIRDRQNTYDVNSNFTTRQKDGKNVWKIKQIDPVA VRTTDRSYALFVGDSSSDMATAVSRWCIYAKWDMAQCRNLEDYQENDKALPGMLILESEK YALGDNLEKLEKLEQKGVIIVFGCLENPKNIEKNQELMDFLGIYKVVSEKTKLTGVKLFE GLLLGGEVVYETPKEKEEKERQDLQLEIPWYQVGSGTKTYMVGLLDHSKQKTQVENEDLP TILWRNGIKGGSVFCIVGDYMKDSTALGLLNGMITEASEYYIYPVVNAQNLSMINFPAFA DENDEEMMKLYSQSVTGMTRDIMWPSLVSIVEKGKLKMTCFMQPQADYEDGIEPDTKDMI FYLKQMKEQEAEVGLSLEYKNAGSLREKLDQDADFFQKSDSSYKYGAAFAEERDLDTITG LMNTELLKNVSTLVCEYTEKEPVVSYCTDSVTLQSVTSDGMNYSYSDDIRMRSIQSSLAY TNVMLNMHDIFWPQQKTDRWQIMQKHFSSNILTYWKNFTGFGSMTLSESNTRIRTFLNLD FSESRTENEITLETSEKGSWFLLRTHGEEIAEIEGGTEKKIEDGTYLIQAQDTTVKIQIK TSGLHYDR >gi|222441833|gb|ACEP01000109.1| GENE 17 22333 - 23814 819 493 aa, chain + ## HITS:1 COG:CAC0734 KEGG:ns NR:ns ## COG: CAC0734 COG0438 # Protein_GI_number: 15894021 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 1 471 1 467 471 419 41.0 1e-117 MKVCIVAEGCYPYVVGGVSGWINSLIRSFPNVEFILLAIVANRSFRGKFVYELPENLTQV YEVYLDDCEWESGNRKRKSLRQVHLSHKEYDEMLNMVLNRRTDWGVIFDLFHRKHLSVDA FLMGEDFFRIARECYDRNYSNINFSDFLWTLRSIYLPMFWALRMDVPKADLYHCVATGYS GILGSMAKHFHGSSLLISEHGIYTREREEELIKASWVRGVYKNIWIEQFKKMSLLAYEKA DMVTCLYDHAKDLQMELGCPPEKIRVTPNGINTQTLADLPGKTEEEKQFINAGAVLRVTP IKDVKTMIQAFAFAKKKVKNMKLWIMGPADEDKKYAKECYDLVESLGVQDVEFTGRVNVK DYLGKMDFTLLTSISEGQPLTILESYAAHKPVIATDVGNCRELIYGNNDGFGEAGILTHI MNIEEIAHAMVTMSVNEKDRRCMGEAGYRRVNALYRIDQMKEVYREIYKGFSDRQNLSWT EEPFQISVHEKMM >gi|222441833|gb|ACEP01000109.1| GENE 18 23885 - 25327 533 480 aa, chain + ## HITS:1 COG:CAC0735 KEGG:ns NR:ns ## COG: CAC0735 COG4267 # Protein_GI_number: 15894022 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium 
acetobutylicum # 1 467 1 464 478 198 29.0 2e-50 MAGIGVKLTNIYKKNTLTTDIIGAGYSMVVTIAPMLLVIGALMIMEYFLDFSSVGYVTRE LFSCTVLYIFIFSLLTSSPFNSVLSKYMSDVIYEETYEDILPCYYVGMLLNISLSALVGI PFCIREYLVGKVDIIYVFTGYCGYIALVLVFYSMLYLSICKDYKKISFFFAVGMTVTVFL SFLLVKVFHWGITYGMLFSLTIGFWLIACLEMSVVRSYFKENSGKYRQVLVYFKEYWPLV VTNFLYILGLYVHNFVFWTTDLHMIVVKSFVCVTTYDMATCLAMFTNISASVIFISRVEM YFHGRYKAYSEAVIGGRGADIDITKKGMFRALSEELMSLARVQFIITLVIFLICMIFLPQ LGFGGMTMRIYPCLASGYFILFLMYAEIIFMYYFNDMKGCVLTGLVFCLITLVGSIISSH FTEIWYGAGLVVGSFAGWTVAYGRLRWMEKHLDVHIFCNGHLLKKKKGDCPSPKVFDRYE >gi|222441833|gb|ACEP01000109.1| GENE 19 25351 - 25722 228 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028251|ref|ZP_03717443.1| ## NR: gi|225028251|ref|ZP_03717443.1| hypothetical protein EUBHAL_02523 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02523 [Eubacterium hallii DSM 3353] # 1 109 1 109 123 183 100.0 4e-45 MEVGDILRLFVIVSGVFMVVRAILSLAKRKMTEPFCLAWVFLSALMILSGALLNPSQMKR YVSTRGLILTIIIVSGVLWGLWFISTQVSILKRKDQELAMQVSLLNNDYEKIIKELEKLE KEK >gi|222441833|gb|ACEP01000109.1| GENE 20 25727 - 26926 625 399 aa, chain + ## HITS:1 COG:PM1138 KEGG:ns NR:ns ## COG: PM1138 COG0438 # Protein_GI_number: 15603003 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Pasteurella multocida # 227 386 211 372 392 95 33.0 2e-19 MKKVLFVINTLGGAGAEKALLELLKRFTPDRYEVDLYVLLEQGELISQVPGYVNILNRDY TAESVLSAEGKKKLNKKVFMRLFTHGALFKNIPYLIKNAVAMLGRKKIYADKLLWRVMSD SGMKLNKSYDMAVAYLEGGATYFVHDHVKARKKFTFLHVDYKYAGYTRELDRDCYLDFDR IFTVSGEVKSVFDSVYPECRNKTFIFHNLIDREEIYRKSELPGGFSDTYTRKRILTVGRL TAQKAYEVAVDAMKLLKDQGIKARWYVLGEGELRNKLQQKIDSLGLKEDFLLLGAKENPY PYYKQCDLYVHATRFEGKSIAIQEAQVLGCTILVSDCSGNREQVENGTDGILCQLSPEDI SRKIAELLGNEEKCREYGKKATVRISDEQGDILKLFEIA >gi|222441833|gb|ACEP01000109.1| GENE 21 26947 - 27756 580 269 aa, chain + ## HITS:1 COG:SPy0794 KEGG:ns NR:ns ## COG: SPy0794 COG0463 # Protein_GI_number: 15674837 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall 
biogenesis # Organism: Streptococcus pyogenes M1 GAS # 16 250 4 228 231 154 37.0 2e-37 MENRENVGKEAKKEVLIIIPAYNEEKTIVPLLEKLEQPEIAQVADVLVVNDASQDKTNHV TKKRNHTVITNVFNLGYGSGLQLGYKYAVRKGYSYVIQMDADGQHDVCNLLEIYKELQKE DENGKLPDIVLGSRFMEGSSGFSVSFAKKIAFVWFRIMLKTGTGRRITDPTTGLQGLNWR TFRFYSKYNNFDDKYPDANMLMQMLLLGFRVREIPAVMHVRTTGVSMHSGLKPVLYMFRM TFSILAVWIREKILKMDEDKIRELEKYNE >gi|222441833|gb|ACEP01000109.1| GENE 22 27749 - 29062 441 437 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1156 NR:ns ## KEGG: Rumal_1156 # Name: not_defined # Def: type 1 dockerin # Organism: R.albus # Pathway: not_defined # 68 369 135 439 593 84 25.0 1e-14 MNKNNWVFQQVELIESNVCLENPGRGWYGIYTFVVQDSINPEELRWSLRDGELLALVLLD ISIYRSMQLDENALNNIRSILSFFMYYKKDVILRPVYDREGKGRECEPDSFEQVLEHLQQ IGGLLRDMQHSVFIFQGMLAGSWGEMHDSRYLSPECINRMWNCMRQYLGDEIYLAVRTPA QWRTLRSEKEFRQKKYAHLALFDDGILGSMTHLGTFGTELREAAGWAQAWTRKEEMEFVN CLTEDMPCGGEVVACTENDRMYEAKFIVNELKKMHITYLDSTYDMKALDLWKSISWDGFD NLKIWKGYSLYEYIGEHLGYRLVVRTVELKPIRRGRAEFMFEIENTGFGRVFQETELFLI MKNRNETREVPISMDMRKIQAGTRICGKTVIKLMEGRVSLKLQRKKDGRIIHFANKNSTD TLYLGSLHCGKIYLSEE >gi|222441833|gb|ACEP01000109.1| GENE 23 29049 - 30491 368 480 aa, chain - ## HITS:1 COG:no KEGG:Kfla_4441 NR:ns ## KEGG: Kfla_4441 # Name: not_defined # Def: hypothetical protein # Organism: K.flavida # Pathway: not_defined # 46 471 170 575 587 137 26.0 2e-30 MKKKSSIILHSLFFVVCSVLAVLCLYHIISIRRLNVKYTPAITEESSRELDNPYRGFYQL SGYILSDNQKPEKSAAWCRKSCASNPYPLMLLEINLKNYSNTSISTNAKNQLDKILEECV QAKKQVILRFLYDWDGQALSTEPSDLPQIKNHISQISSTVNKYADCVYILQGTLTGNTGE MNHSNYGDINQIRQIIEELNQNISSDIFLAVRTPGQLRGILRTRTPLSSTDAGNGSLQAR LSLFNDGMLGSVYDLGTYDDTPLQPDSNLDEQGTRSEELLFQYKLCQYVPNGGEVTVDNE YNDLDNAIADLSQMHISYLNSEHDPAVLDKWKNSTYTGSRTDVFSGCTGYDYISTHLGYR YVMKESSVDFHSVLSDTASLYITIANTGFSSAYTRFDTVLFLTDESTEKSKKIETNIDNR TIAGNDCSTFKTDLDIRSLRKGTYSISLRMEDPYTKNSIHFANQGAENSKTVPVGTLILH >gi|222441833|gb|ACEP01000109.1| GENE 24 30578 - 32122 560 514 aa, chain + ## HITS:1 COG:no KEGG:Exig_2626 NR:ns ## KEGG: Exig_2626 # Name: not_defined # Def: polysaccharide biosynthesis 
protein # Organism: E.sibiricum # Pathway: not_defined # 5 440 5 440 511 332 41.0 3e-89 MKQKSRTEYSVMNTTVAFLAQTLAIFMGFFTRVVFTRTLSEGYVGINGLFTDILNILSLS ELGVGTAITYALYAPIARRDIKKQQILMRMFRNFYRITAGFVLLAGLCLIPFLDILMKDR PDVNHLIIIYLLYLLNSVVSYLLIYKKTLVDAHQMNYITVMYHNGFLVLQDILQIMVLFL TRNFILFLVIAVVCTIIGNICMSRKADRLFPYLKEPCKEQLPQEERHDILRNVKAMLMHK IGNVVVNNTDNLLISSFVGIISAGIYSNYYLIIGSVRQVLEQAMLGVAASVGNLGVTEEN EKIGRIFDSLFFIGYWLFGFAGVCLLELLNPFVKLAFGKKYLFPKEIVLILCINFFINGM RRAVLIFKESMGLFWNDRYKAIAEAVLNLVISILLVTYFGVAGVFAGTFCSTVLTSVWVE PYVIYKYRLKKPVTGFFVKYAGYLGVMAAVWGITEYFCRFISGQAFMILVCRLGICLIVP NILLWFAYRRTAEWKVLWNLLKKIVEKVFTGDKR >gi|222441833|gb|ACEP01000109.1| GENE 25 32119 - 33210 481 363 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 11 256 6 250 344 124 29.0 2e-28 MTALEKQQKKISVIIPVYNAEKYIRETLDSIIKQSYKNLEIILIDNGSKDRSPDIIQKYE AKYPEIHMIEGSGKGPGASRNRGLKLAKGDYIVFADADDYLPDKEIFCKYINLAEQTDAD IVVSNYARLWKDKILPATKHQSFALCSPSSEEFRFQGFFSVGTLSYAWGKLYRREFLRNH QIYFADLSYAEDKLFNMQCYICDAKYAFLEDIGYIYRKNDMSVSWQYRPDSVENWFKLVY ELKSWIEKEDKDPKEYTSLIGYLLIFAAFFDGKMEYMQHKKSLWAVRKVLKKYGSNPMGR KAFTELADRRKKLQITQKLWKIMIRTFSLGMKWRWYLLMSVGIKILVSCRVDERLSDTGL RKK >gi|222441833|gb|ACEP01000109.1| GENE 26 33227 - 34225 262 332 aa, chain - ## HITS:1 COG:no KEGG:CPE0465 NR:ns ## KEGG: CPE0465 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 7 279 78 347 410 73 25.0 1e-11 MMLNYYQIEHFTRLTVSLLRKNNIDCYLLKGLSLADCYPVPEYRKLGDLDLYLADPSDLT RAQAILESNGYISEDELSDHHVTYRYIFSKTGRRFLLELHYRVVGQYQYSPANKLVDSVF SPKCLKASTQTIHGYTYTVLPPTEYTFYMIHHMLKHYLYSGFGIRLLCDFTFYLKRHEKE INFRQIHEWCRESRIIHLYEIILECCRKYLGLPDSIDARTQYNPHDCHDFIIQILKDGDM GTDISQSLVGSSSYQKVNFLTYFKEGHIQMKVRFPKASHYPVLWPALWCITFICFIRNTY TVRNTTFHQTLHSFKKINQKTKLIHIFENSDF >gi|222441833|gb|ACEP01000109.1| GENE 27 34497 - 34805 320 102 aa, chain 
+ ## HITS:1 COG:no KEGG:no NR:gi|225028258|ref|ZP_03717450.1| ## NR: gi|225028258|ref|ZP_03717450.1| hypothetical protein EUBHAL_02530 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02530 [Eubacterium hallii DSM 3353] # 1 102 1 102 102 162 100.0 8e-39 MLLRGNKGKGREKGKMKKTEGYVVRRLEDEYLVLPSGRRTEEVNEVISLSETAGFIYMHA ERAENVYELTQLVAEEYEIEPAEVLEDVRNVVLTLQKKGILL >gi|222441833|gb|ACEP01000109.1| GENE 28 34817 - 35272 184 151 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1722 NR:ns ## KEGG: Rumal_1722 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 29 140 20 130 144 79 40.0 4e-14 MNNTTSDPVKIDLEQLLREGNIIRIKPQGYSMYPLFIPGRDEALIQQTDGTDCHRNDVVL YRRDQGILVLHRICRITSDGFYMVGDNQYEVEGPLRQDQIIGKLIAVNRNSREFTVGNPF YKFVSSLWLFMLPVRPFCFKLSAFLRKLHHS >gi|222441833|gb|ACEP01000109.1| GENE 29 35506 - 35886 521 126 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028261|ref|ZP_03717453.1| ## NR: gi|225028261|ref|ZP_03717453.1| hypothetical protein EUBHAL_02533 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02533 [Eubacterium hallii DSM 3353] # 1 126 1 126 126 211 100.0 1e-53 MKTYKKPVVTVDAGMAEGVYAASGAKGKLNVVYHGVWDRWGTNGGKGLAMADWSGINGTI TLNITFNDTVDQVETDNASVQASCSGKTATFTFASTVSNSLTIGVHLNHGTSIDDLKMTG FTYSVN >gi|222441833|gb|ACEP01000109.1| GENE 30 35954 - 37666 1208 570 aa, chain - ## HITS:1 COG:VC1446 KEGG:ns NR:ns ## COG: VC1446 COG2274 # Protein_GI_number: 15641457 # Func_class: V Defense mechanisms # Function: ABC-type bacteriocin/lantibiotic exporters, contain an N-terminal double-glycine peptidase domain # Organism: Vibrio cholerae # 86 550 235 690 721 177 28.0 5e-44 MWKQTLWIYQYGRRYWKAMILYTLLGLVGTGVSLVSSLISRDLVDIITGHQTGKLVSTFA AMIGFSVANILVSQASGYASTFINLKVDSEIKNDIFAKMLVTDWESLTDYHTGDLVTRWS SDASNISSGILNWIPNLIIYTFRFISALAIVLYYDPTFALFALLGIPFSALLSKPLLKRM RDNNQRSAAMNAKLYGFNQEAFSNIQTIKAFDLIHFYTEKLKSLQKDYVNMRLDFQKMSI LTSVLMSIIGFIVSYSCYGWGIYRVWSNAISYGTMTMFLSLSGTLTSSVNSLVSLIPSAI SLTISAGRLMDIVEMPQEDYSQDSAVRNFEKQHKSEGIGLNMQDLSYTYHNGTAVFANAS 
IEAYPHEIVALVGPSGEGKTTMLRLILSLLTPQEGSSHICAGTDHEHMIDMSPSTRKLFS YVPQGNTMFSGTIAENMRNVKQDATDEEIIEALKLACAWDFVEKLPDGIESIVKERGGGF SEGQAQRLSIARALLRRSPILLLDEATSALDVTTERKVLRNIMQDTYPRTCIVTTHRPTV LNICNRVYAIRDKKCEILNDVEIHEMIQTF >gi|222441833|gb|ACEP01000109.1| GENE 31 37720 - 38886 493 388 aa, chain - ## HITS:1 COG:no KEGG:bpr_I1003 NR:ns ## KEGG: bpr_I1003 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 8 382 7 381 385 246 36.0 1e-63 MFNCQPGYTLRKIKGINYLLPYGQQIADLKKGFVLNETSTFLWNVLQHHEGAEPQQLAEI LARTYQLDESYYPELLKDVTDFLTQLTAMGMITEDLHLISTIPSVSMIIAGICIKLYGSA ELISPNFKPFYHEFSDDDTSQEIELVTTPPPSRCYGQVLLQNSEMTVFENPDRYVVLFPQ MQNLYEAHMLKDSTYVRIYCHPQVSETNIENLFHAIRLFFLFTAQRNGLFAIHSASVLYQ GKAWLFSGHSGMGKSTHTALWHELFDTPYLNGDLNLLGLKEDHIIVYGIPWCGTSGIFTA EAYELGGIVLLGRDLQTDYLEELNPSEKVLRVMQRMISPAWTERQLSENLFFAGEIADRV PVLHLLCTKNSSAACVMKNTIDQLEKLQ >gi|222441833|gb|ACEP01000109.1| GENE 32 39297 - 39732 466 145 aa, chain + ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 140 1 135 450 129 51.0 2e-30 MGKYFGTDGFRGEANVNLTVDHARKVGRFLGWYYGQLKKQRGDDIPAKVVIGKDTRRSSY MFEYALVAGLTASGADAYLLHVTTTPSVAYVARTEDFDCGIMISASHNPYYDNGIKLING NGEKMDEATINLVEAYLDSELEVFG Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:10:33 2011 Seq name: gi|222441832|gb|ACEP01000110.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont371.1, whole genome shotgun sequence Length of sequence - 989 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 476 - 535 6.4 1 1 Tu 1 . 
+ CDS 568 - 988 241 ## Tresu_0316 ferric uptake regulator, Fur family Predicted protein(s) >gi|222441832|gb|ACEP01000110.1| GENE 1 568 - 988 241 140 aa, chain + ## HITS:1 COG:no KEGG:Tresu_0316 NR:ns ## KEGG: Tresu_0316 # Name: not_defined # Def: ferric uptake regulator, Fur family # Organism: T.succinifaciens # Pathway: not_defined # 4 134 3 133 137 84 38.0 2e-15 MKERKSYTTKNRLVILDYLKENCSTTISAADIKKHLEEKEISVNTTTVYRLLDKLCAENI IIKYSDINSDKAVYQYAGEKGHCREHLHLKCIKCGKVMHLDCGFMDELREHIMEEHHFQL QCSGDLLHGLCEECAKNSDF Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:10:38 2011 Seq name: gi|222441831|gb|ACEP01000111.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont372.1, whole genome shotgun sequence Length of sequence - 6720 bp Number of predicted genes - 9, with homology - 8 Number of transcription units - 4, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 339 - 1142 386 ## COG5527 Protein involved in initiation of plasmid replication 2 1 Op 2 . + CDS 1171 - 1308 81 ## 3 1 Op 3 . + CDS 1311 - 1562 252 ## Rumal_3218 hypothetical protein 4 1 Op 4 . + CDS 1620 - 1961 305 ## gi|225028270|ref|ZP_03717462.1| hypothetical protein EUBHAL_02542 + Term 1973 - 2012 6.0 - Term 1959 - 1998 6.0 5 2 Op 1 . - CDS 2051 - 2653 501 ## gi|225028271|ref|ZP_03717463.1| hypothetical protein EUBHAL_02543 6 2 Op 2 . - CDS 2683 - 2952 251 ## gi|225028272|ref|ZP_03717464.1| hypothetical protein EUBHAL_02544 - Prom 3009 - 3068 5.1 7 3 Op 1 . - CDS 3082 - 4110 346 ## gi|225028273|ref|ZP_03717465.1| hypothetical protein EUBHAL_02545 8 3 Op 2 . - CDS 3695 - 4990 702 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 5151 - 5210 8.4 + Prom 5082 - 5141 7.1 9 4 Tu 1 . 
+ CDS 5161 - 5412 357 ## gi|225028275|ref|ZP_03717467.1| hypothetical protein EUBHAL_02547 + Term 5611 - 5660 9.6 Predicted protein(s) >gi|222441831|gb|ACEP01000111.1| GENE 1 339 - 1142 386 267 aa, chain + ## HITS:1 COG:pli0023 KEGG:ns NR:ns ## COG: pli0023 COG5527 # Protein_GI_number: 18450306 # Func_class: L Replication, recombination and repair # Function: Protein involved in initiation of plasmid replication # Organism: Listeria innocua # 84 232 99 260 386 58 26.0 1e-08 MKRELNTKESVRTVTSNQFITACGLEGISLKARKLLYIAISQSKKTDTEFYEYEIGVKEF ADLMGIAPTHVYQEADVVTDELMRGFVKIQEKNNKSWRKYQLFDMCQYTENGSIRFEISK QMTDFLLNLTGNFSQPLLYDFLKMRSPYSMAIWHLMQMKMKSKKLGVTATTSFDISLAEL RKVTGCIDKLKQVGEFKKRVLDKALREIFDCAGVKITYQNIKVGRTVEGFRFFAEGVAHI DLSKIPLAEQERIQANAERIKKGEPLQ >gi|222441831|gb|ACEP01000111.1| GENE 2 1171 - 1308 81 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIEIIRNWKKKRQDRKQAKHEKLLVELAEMVDKRRAELLQREREG >gi|222441831|gb|ACEP01000111.1| GENE 3 1311 - 1562 252 83 aa, chain + ## HITS:1 COG:no KEGG:Rumal_3218 NR:ns ## KEGG: Rumal_3218 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 79 2 76 77 72 53.0 6e-12 MEYKEGRLGYNRKNDRYGLLVTDLWEIDGFSCGNRLQVEINGEWVDTHMEMDWSTGKGIW YLTGTDLKGSALENKRVRVLRDR >gi|222441831|gb|ACEP01000111.1| GENE 4 1620 - 1961 305 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028270|ref|ZP_03717462.1| ## NR: gi|225028270|ref|ZP_03717462.1| hypothetical protein EUBHAL_02542 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02542 [Eubacterium hallii DSM 3353] # 1 113 1 113 113 180 100.0 3e-44 MEDLRKYTRKSELYKLSAEEQTEKMKLVMKEVEKISAYTGVSEEAAYNLFLNGYDVPSEL DGQKIAFSESELTPDLLSNLAENNVRSVKVMSDESIAAGYMDYLHKQKKDNKF >gi|222441831|gb|ACEP01000111.1| GENE 5 2051 - 2653 501 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028271|ref|ZP_03717463.1| ## NR: gi|225028271|ref|ZP_03717463.1| hypothetical protein EUBHAL_02543 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02543 [Eubacterium hallii DSM 3353] # 1 200 1 200 200 
357 100.0 3e-97 MKNKIFITASILAMLFQCVPANVFAATSPTPVVTTSVVIKDENNDKVLEEYQLDTANTKE VVDKNGDIIYTTTATAGDSTNLNKFSRASQVGSKTFHGWKGTVNISYSESGEKANLKSAS AKWTHVSGSTAISKSKEFHYGQDLMTNAHNGYKKFTGLTVSASPNWKKGRYAYGVGNKVG ANVCAKISGKYVYVFCDVSF >gi|222441831|gb|ACEP01000111.1| GENE 6 2683 - 2952 251 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028272|ref|ZP_03717464.1| ## NR: gi|225028272|ref|ZP_03717464.1| hypothetical protein EUBHAL_02544 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02544 [Eubacterium hallii DSM 3353] # 1 89 12 100 100 166 100.0 7e-40 MFMLGMKEAFLSNNVELIGIIEHELRKKNIPYKVKTVNSGVLNRTTGTFVGRAGQSYNDL DIMYYIYSKPKYVSEIKFIANDCLNKTNY >gi|222441831|gb|ACEP01000111.1| GENE 7 3082 - 4110 346 342 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028273|ref|ZP_03717465.1| ## NR: gi|225028273|ref|ZP_03717465.1| hypothetical protein EUBHAL_02545 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02545 [Eubacterium hallii DSM 3353] # 112 342 1 231 231 384 99.0 1e-105 MSESADYSTEQVEQLINELETLISQNEELSQEIINLQEQLSRKDSQNLEVLEQAERWKKK ACGLEEQIQKLKQEMSENLSANSMLMSKLQQKSETIKSLNDEIGRFQESDKVLKENINLK LQNKELQEAKKKALTEVSKIKNAYAEKIRALDTKQWEIKNAKRNQNALIQRKAEELYTVK KHTQQALNLSFLLYSCIITLFNAIRSETLRMDSVAFLTQTWNILRVIALNVKKAAIQASE WGSRVKKQSLAIVLHNLLPVIVVLASLALFGFIIYILIKLGKSYKKYFWDNLSLFITFFS LAGIVFFADLIRNIISTNVILLFLFTQLIYIFIRYFKIFINK >gi|222441831|gb|ACEP01000111.1| GENE 8 3695 - 4990 702 431 aa, chain - ## HITS:1 COG:mll0964 KEGG:ns NR:ns ## COG: mll0964 COG0507 # Protein_GI_number: 13471082 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Mesorhizobium loti # 1 265 1 239 1015 132 37.0 1e-30 MAIYHCSIKNIGRSDGRSAVACSAYRAAQKLRDKETGLLYDFTKKSEVIYSKIFLCKNAP AAYVDRETLWNEVQKIEKNKNARLAREWELAIPHELTFEQSKKLVCDFAKTLVDEGMCVD VNIHWKNNNHHAHIMGTTRQIKENGKWGQKEKKVYKLDENGQKIAVLDADGKQKIGANGR KLWQREAVETNDWNKSDKVEEWRKRWAECCNQYLDLKNQVDHRSYERQGIDYIATIHEGY 
TARLIEKRGGIAERCEINREIVKKNNLLRTLRQQLREVNEKITELIGEKGKQANERISRL LNRAGRTAHQRARNSHQSERGTIAGDNKPTGTTFEERLSKLRSARTSGTLEKESLRTGRT DTEVKTRDVREFISKLDADEQASAEKRNDKIAQRRDREISRERQGAKGEYKLKAAEQRTT GSKEKGSNRSL >gi|222441831|gb|ACEP01000111.1| GENE 9 5161 - 5412 357 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028275|ref|ZP_03717467.1| ## NR: gi|225028275|ref|ZP_03717467.1| hypothetical protein EUBHAL_02547 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02547 [Eubacterium hallii DSM 3353] # 1 83 1 83 83 70 100.0 3e-11 MASIDERLEKLKKQKEELKAKEKKLLAQKASAERKKRTKRLIEVGAAVESVLKQPIEKED LPKLINFLEQQEERGNYFSKAMK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:11:46 2011 Seq name: gi|222441830|gb|ACEP01000112.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont378.1, whole genome shotgun sequence Length of sequence - 60277 bp Number of predicted genes - 75, with homology - 70 Number of transcription units - 45, operones - 21 average op.length - 2.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 34 - 936 960 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 2 1 Op 2 . - CDS 944 - 1108 62 ## gi|225028278|ref|ZP_03717470.1| hypothetical protein EUBHAL_02550 - Prom 1218 - 1277 6.2 + Prom 1168 - 1227 7.0 3 2 Op 1 40/0.000 + CDS 1263 - 1937 728 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 4 2 Op 2 . + CDS 1973 - 3352 797 ## COG0642 Signal transduction histidine kinase + Prom 3613 - 3672 8.2 5 3 Tu 1 . + CDS 3695 - 4858 523 ## PCC7424_2290 hypothetical protein + Term 5023 - 5064 2.1 6 4 Op 1 4/0.000 - CDS 5068 - 5496 576 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 5543 - 5602 7.0 7 4 Op 2 1/0.333 - CDS 5737 - 6195 579 ## COG0071 Molecular chaperone (small heat shock protein) - Prom 6333 - 6392 7.0 - Term 6337 - 6385 -0.1 8 5 Op 1 40/0.000 - CDS 6459 - 7877 767 ## COG0642 Signal transduction histidine kinase 9 5 Op 2 . 
- CDS 7870 - 8577 564 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 8750 - 8809 9.3 - Term 8780 - 8832 3.2 10 6 Op 1 8/0.000 - CDS 8897 - 10057 746 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 11 6 Op 2 1/0.333 - CDS 10023 - 10322 198 ## COG1396 Predicted transcriptional regulators 12 7 Op 1 35/0.000 - CDS 10873 - 12738 183 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 13 7 Op 2 . - CDS 12740 - 15184 2797 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 15212 - 15271 5.6 14 8 Tu 1 . - CDS 15282 - 15878 637 ## ELI_2728 hypothetical protein - Prom 15989 - 16048 1.9 15 9 Op 1 . - CDS 16051 - 16485 447 ## Closa_1534 MarR family transcriptional regulator - Prom 16562 - 16621 4.3 16 9 Op 2 . - CDS 16646 - 17284 441 ## COG1937 Uncharacterized protein conserved in bacteria - Prom 17442 - 17501 9.4 + Prom 17117 - 17176 8.6 17 10 Tu 1 . + CDS 17270 - 17509 76 ## 18 11 Tu 1 . - CDS 17532 - 18131 443 ## Vpar_0013 hypothetical protein - Prom 18297 - 18356 7.3 + Prom 18105 - 18164 12.6 19 12 Tu 1 . + CDS 18368 - 19750 1525 ## COG2211 Na+/melibiose symporter and related transporters + Term 19944 - 19992 13.6 + Prom 20090 - 20149 8.5 20 13 Op 1 7/0.000 + CDS 20378 - 21985 1342 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain + Prom 21998 - 22057 2.6 21 13 Op 2 . + CDS 22093 - 23850 1157 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain + Prom 23852 - 23911 6.9 22 14 Tu 1 . + CDS 23986 - 24627 233 ## gi|225028299|ref|ZP_03717491.1| hypothetical protein EUBHAL_02571 + Prom 24736 - 24795 2.9 23 15 Tu 1 . + CDS 24831 - 25817 383 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 25863 - 25897 -0.9 + Prom 25887 - 25946 6.3 24 16 Tu 1 . 
+ CDS 25970 - 27583 982 ## COG3894 Uncharacterized metal-binding protein + Term 27744 - 27787 1.0 25 17 Tu 1 . - CDS 27997 - 28338 364 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family - Prom 28387 - 28446 8.9 26 18 Tu 1 . - CDS 28458 - 28997 505 ## COG1896 Predicted hydrolases of HD superfamily - Prom 29023 - 29082 2.9 27 19 Op 1 . - CDS 29271 - 29825 422 ## bpr_I0670 HAD superfamily hydrolase - Prom 29872 - 29931 5.3 28 19 Op 2 . - CDS 29999 - 30658 677 ## COG5018 Inhibitor of the KinA pathway to sporulation, predicted exonuclease - Prom 30723 - 30782 9.4 29 20 Tu 1 . + CDS 31310 - 31417 58 ## TDE0249 flavoredoxin, putative + Term 31431 - 31469 8.0 - Term 31418 - 31456 4.2 30 21 Tu 1 . - CDS 31599 - 31772 241 ## BDP_1352 hypothetical protein - Prom 31796 - 31855 2.8 - Term 31954 - 31990 4.1 31 22 Op 1 . - CDS 32015 - 32434 479 ## COG0394 Protein-tyrosine-phosphatase 32 22 Op 2 . - CDS 32427 - 33110 393 ## TDE2638 hypothetical protein 33 22 Op 3 . - CDS 33120 - 33461 326 ## COG1192 ATPases involved in chromosome partitioning - Prom 33634 - 33693 7.4 34 23 Tu 1 . - CDS 33855 - 33950 85 ## - Prom 33983 - 34042 6.0 35 24 Op 1 . - CDS 34093 - 34617 469 ## COG0655 Multimeric flavodoxin WrbA 36 24 Op 2 . - CDS 34643 - 34756 82 ## 37 25 Tu 1 . - CDS 34918 - 35787 494 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 35840 - 35899 6.1 + Prom 35768 - 35827 6.2 38 26 Tu 1 . + CDS 35939 - 36502 288 ## COG1309 Transcriptional regulator + Term 36530 - 36566 2.0 - Term 36518 - 36552 1.6 39 27 Op 1 . - CDS 36591 - 37208 408 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 40 27 Op 2 . - CDS 37232 - 37552 188 ## COG1733 Predicted transcriptional regulators - Prom 37653 - 37712 5.1 + Prom 37578 - 37637 6.3 41 28 Tu 1 . + CDS 37686 - 37889 257 ## 42 29 Op 1 . 
- CDS 37774 - 38046 190 ## Ethha_1352 integrase family protein 43 29 Op 2 1/0.333 - CDS 38065 - 39069 922 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) 44 29 Op 3 1/0.333 - CDS 39080 - 39847 588 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 45 29 Op 4 1/0.333 - CDS 39876 - 40346 604 ## COG0716 Flavodoxins - Prom 40411 - 40470 4.7 - Term 40374 - 40422 5.1 46 30 Op 1 4/0.000 - CDS 40473 - 41351 703 ## COG0583 Transcriptional regulator 47 30 Op 2 . - CDS 41380 - 41805 283 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 48 31 Op 1 . + CDS 42190 - 42576 232 ## LAC30SC_09585 hypothetical protein 49 31 Op 2 . + CDS 42554 - 43987 494 ## Tresu_1917 Relaxase/mobilization nuclease family protein - Term 43959 - 43998 -0.7 50 32 Op 1 . - CDS 44137 - 44439 265 ## gi|197302208|ref|ZP_03167267.1| hypothetical protein RUMLAC_00935 51 32 Op 2 . - CDS 44449 - 44655 130 ## COG1476 Predicted transcriptional regulators 52 32 Op 3 . - CDS 44652 - 44981 79 ## gi|218132785|ref|ZP_03461589.1| hypothetical protein BACPEC_00646 - Prom 45013 - 45072 8.6 - Term 45135 - 45179 8.3 53 33 Op 1 . - CDS 45195 - 45374 89 ## gi|225028330|ref|ZP_03717522.1| hypothetical protein EUBHAL_02602 - Prom 45465 - 45524 2.2 54 33 Op 2 . - CDS 45527 - 45808 222 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair - Prom 45981 - 46040 5.6 + Prom 45967 - 46026 4.1 55 34 Tu 1 . + CDS 46110 - 46529 408 ## Closa_2770 XRE family transcriptional regulator + Prom 46533 - 46592 5.9 56 35 Op 1 . + CDS 46622 - 47137 480 ## COG2856 Predicted Zn peptidase 57 35 Op 2 . + CDS 47203 - 47589 323 ## Mlab_0595 hypothetical protein + Term 47629 - 47688 8.6 - Term 47621 - 47672 10.3 58 36 Tu 1 . - CDS 47732 - 47881 105 ## CPF_0994 hypothetical protein - Prom 47987 - 48046 8.7 + Prom 47790 - 47849 6.3 59 37 Tu 1 . 
+ CDS 47973 - 48182 93 ## CDR20291_2714 hypothetical protein + Term 48428 - 48464 -0.7 + Prom 48571 - 48630 8.7 60 38 Tu 1 . + CDS 48725 - 49375 708 ## COG4887 Uncharacterized metal-binding protein conserved in archaea - Term 49416 - 49455 4.5 61 39 Tu 1 . - CDS 49478 - 49624 149 ## Rumal_1101 integrase family protein - Prom 49740 - 49799 2.8 - Term 49764 - 49813 12.1 62 40 Op 1 . - CDS 49832 - 50434 406 ## COG1073 Hydrolases of the alpha/beta superfamily 63 40 Op 2 . - CDS 50494 - 50967 429 ## Rumal_2970 GCN5-like N-acetyltransferase - Prom 50999 - 51058 4.8 64 41 Tu 1 . - CDS 51065 - 51481 367 ## Closa_3192 hypothetical protein - Prom 51640 - 51699 1.5 65 42 Op 1 . - CDS 51707 - 52108 326 ## gi|225028342|ref|ZP_03717534.1| hypothetical protein EUBHAL_02614 66 42 Op 2 . - CDS 52122 - 52418 211 ## gi|153854275|ref|ZP_01995574.1| hypothetical protein DORLON_01569 67 42 Op 3 . - CDS 52428 - 53426 539 ## Amet_3992 replication initiator A domain-containing protein 68 42 Op 4 25/0.000 - CDS 53526 - 54407 753 ## COG1475 Predicted transcriptional regulators 69 42 Op 5 25/0.000 - CDS 54397 - 54954 464 ## COG1192 ATPases involved in chromosome partitioning 70 42 Op 6 . - CDS 54969 - 55838 574 ## COG1475 Predicted transcriptional regulators - Prom 56016 - 56075 2.6 - TRNA 56261 - 56334 80.5 # Arg CCT 0 0 71 43 Tu 1 . - CDS 56385 - 57260 646 ## Cphy_3931 hypothetical protein - Prom 57298 - 57357 11.0 - Term 57332 - 57379 10.1 72 44 Op 1 . - CDS 57444 - 57980 367 ## EUBREC_3697 hypothetical protein 73 44 Op 2 . - CDS 58012 - 58920 908 ## COG1475 Predicted transcriptional regulators 74 44 Op 3 . - CDS 58910 - 59017 98 ## - Prom 59044 - 59103 6.7 75 45 Tu 1 . 
- CDS 59129 - 59899 830 ## COG1192 ATPases involved in chromosome partitioning - Prom 59969 - 60028 6.7 Predicted protein(s) >gi|222441830|gb|ACEP01000112.1| GENE 1 34 - 936 960 300 aa, chain - ## HITS:1 COG:BH0676 KEGG:ns NR:ns ## COG: BH0676 COG1597 # Protein_GI_number: 15613239 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 5 277 6 272 295 184 35.0 1e-46 MKGLFIINPSSGRQNFIDKIKEIAGMLVIDQICNTIDVFYTEKQDDALNKAAALEKGQYD FVVAVGGDGTLNEVINGVVLSQSNTPVAVISAGTVNDFATYLNLPQTPKEFCDMIADFKL KKVDVGKVNEKYFINVVAAGLLSDTGFKVSKDKKAVMGKLAYYLEGAAELPKQFGKSWKM KFITDNKTVEEEILLFMVANSQSVGGFREIAPLASVSDGLFDIIIIKKMDIFQMLPLMLS ILQGDHVNHPSVEYIQTKKLHIENLSGEEVKVDYDGEELPEGFPVDISIIPQAINIVVPK >gi|222441830|gb|ACEP01000112.1| GENE 2 944 - 1108 62 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028278|ref|ZP_03717470.1| ## NR: gi|225028278|ref|ZP_03717470.1| hypothetical protein EUBHAL_02550 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02550 [Eubacterium hallii DSM 3353] # 1 54 1 54 54 82 100.0 1e-14 MKRYVRIIKLENYEEISGNSIAEDFTFKEMYYQRISALVLSIDSLKIYRNLKEV >gi|222441830|gb|ACEP01000112.1| GENE 3 1263 - 1937 728 224 aa, chain + ## HITS:1 COG:FN0585 KEGG:ns NR:ns ## COG: FN0585 COG0745 # Protein_GI_number: 19703920 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Fusobacterium nucleatum # 1 224 1 224 224 214 50.0 1e-55 MKILVVEDEPDLNHIIAKHLQAEGYIVDSCFNGNDAVYYITEESYDAILLDAMLPGKDGF EVLKEIRSQKIETPVMFLTARDQTEDIVTGLDFGADDYVVKPFSFEEVLARLRVIIKRSP QEHGTTYECADLLIDTDKKSVTRAGTDIELSPKEYAILEYMVRNKNIALSRLQIESNAWG LDFEDESNIIDVYIRYLRKKIDDDFDVKLIQTVRGIGYMLTEPD >gi|222441830|gb|ACEP01000112.1| GENE 4 1973 - 3352 797 459 aa, chain + ## HITS:1 COG:FN0586 KEGG:ns NR:ns ## COG: FN0586 COG0642 # Protein_GI_number: 19703921 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Fusobacterium nucleatum # 160 453 153 444 445 164 36.0 5e-40 MKQNKKIAITHYISIRFSAIIFIMAAIMILFISYFSNKTIYFDIRRQIRKETHYDFLNID IQNGKIFVNKNFIFEEDHVQKIILDSQGDFFRGHYPDKKLKNIPVNQWHFQKVKCSSDYY YIYDRPYLKKDTITNKRVLVIVRNIGKASDFDSQYKTMKYISYTFTFVIAFIGFLLIGIV SSQLTIPMKKIKDTADKIGTDGDLAKRIEYTTPFKELDAMIQANNRMMDRLEDLFQQQQQ FTSDVAHELRTPSAIVSAECQYLKKYGKNIDDYTESLTVIERQNTKTTEIISQLLQLSRL EQGRIKDDFEYSNFKTLIESVCDMEPLQFKKQITIYSNLDDISIYMNVGLMAIAVKNIIN NAMKYSKNKSSIKLKLWKEKDYVFFEVKDYGCGMSEETKKNIYDRFYRADKSRNTEGFGL GLSLVHKIIELHKGTIRVQSELGVGSIFTLSLPSDKSSQ >gi|222441830|gb|ACEP01000112.1| GENE 5 3695 - 4858 523 387 aa, chain + ## HITS:1 COG:no KEGG:PCC7424_2290 NR:ns ## KEGG: PCC7424_2290 # Name: not_defined # Def: hypothetical protein # Organism: Cyanothece_PCC7424 # Pathway: not_defined # 1 300 1 309 322 90 30.0 8e-17 MKTLYIHIGCPKTATTSIQYFCNENKEILSKNGIYFPIFEQKYKDVNPYRNGHFLIAHQY DSNGKINTLDEHRIFRFNMDHIIYMFSKYNNILLSDESIWHSTHFFKKDLWEILRKESLK SAFKIKIIVYLRRQDAALNSMWNQKIKKSKQRFRSMSWEEFINTTPPAIYLDYEEGLNEI ASELGSENIIVRRYGKKYFKNGSIYDDFLSTLSLSLTDDYTITTFERNPRLSGNSNEMQR IFNSIPNATKKDFSFFYHMLSEVSSEDPDTEKLGMFTPEEASAFMENYRASNRRIMEKYF NCSEDLFKTNFKNIKKWEWDSRHMCEDIIRVLGNTTISLRKENEDMQQHILELEKAIELQ NENISTLKTKLKHPVRTLFQKIADSTH >gi|222441830|gb|ACEP01000112.1| GENE 6 5068 - 5496 576 142 aa, chain - ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 35 130 46 140 151 71 36.0 5e-13 MMMPSIFGNNFVDDVFDDMFPFAGNYTTANYDLMKTDVKDAGDHYELEMEMPGVEKENIK AELKDGYLTVTAQQNTNKDEKDKQGNYIRRERYSGSCQRSFYVGEGVKQEDLKAAFNNGI LTVAVPKEVQKPVEEKQYITIE >gi|222441830|gb|ACEP01000112.1| GENE 7 5737 - 6195 579 152 aa, chain - ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O 
Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 44 139 46 140 151 74 42.0 7e-14 MMMPSIFGENLFDDFDSWFSDPVEKRFFGKKNPLYGKHAKNLMKTDVRETKNSYEVDVDL PGFKKDEIKLELDNGYLTISAAKGLDKDETDKETGKYIRRERYAGNLSRSFYIGDVKQED VKAAFKNGILSISVPKEDKKAKEEKKYITIGE >gi|222441830|gb|ACEP01000112.1| GENE 8 6459 - 7877 767 472 aa, chain - ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 142 450 117 427 432 108 25.0 2e-23 MIKKVQTKAKGWKIHTQLMVLILLTGLASILLFEVLWLNKWDIYEYVQQVPGLRDMSQDE EFWEKLRTEAANYNIPSSEDDKEGIKKIQPFLSQGDKYTGIYIYNLKEGTYVAGKFPSVL NNAAFSTFFDMGYQLTGGEGEETYEFPMKFKNGYATVMVWFYHRVRFIYPYCIFCLTVSI GLFFGVILFFIKKKMNMVASLKDEILRMAAGNLRDSVPEMGQDEIGIIAEELDNLRTALQ DTLVREKESRRANQDLITAMSHDLRTPLTILNGYLEVLRLNRNPEMHEEYLNRCLQKTSD IREMTDRMFEYALVFEEVDDPKVKEIPIEYFRDCINEHSDFIKLAGFQPELDYTNVERWI TGDKGMIKRIFSNLFSNILKYGDKSVPVNIKGCFTKTTYILTMSNGIKQQSTGVESNHIG LMSVEKMMQQLGGKSSFSEENGLFSVRLEFHLTAYGRVDRPETYSGNNAGII >gi|222441830|gb|ACEP01000112.1| GENE 9 7870 - 8577 564 235 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 5 232 3 227 233 203 47.0 2e-52 MEENSSSILVVDDNPEIREIIQVLLGGEGYLVETAGNGVKALEMLENREYDLIILDIMMP GMDGYQTCRKMREESNAPILFLSARTKDSDKTLGFSSGGDDYLAKPFSYNELISRAKALI RRYQVYKGKAEKVETPKNFSCKNLVVAEETKSVYSDGQLLELTDTEYAILLLLLKNRRQV FSTERLYEAVWKENFYYGAENTVMVHIRNLRRKVERDSKNPMIIKTSWGKGYYCD >gi|222441830|gb|ACEP01000112.1| GENE 10 8897 - 10057 746 386 aa, chain - ## HITS:1 COG:CC2770 KEGG:ns NR:ns ## COG: CC2770 COG3550 # Protein_GI_number: 16127002 # Func_class: R General function 
prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Caulobacter vibrioides # 13 378 11 405 435 151 28.0 2e-36 MKENGTLQVLYDGKVVGNLALMAGRKIAFEYSEEWLETGFSISPFSLPLKEQVFIPTKEY FGGLFGVFGDSLPDGWGNLLLNRMLRKKGMKPEEMTMLNRLAIVGQYGMGALTYRPEEHF SFEPQGYDLDDLALQCQKILNTEYSEKLDELYLLGGTSGGARPKIMTEIDGEDWIIKFPA HVDGQDAGGMEYEYALCAKACGIDMTEVRLFPSKKCKGYFGIKRFDRERIVDGIGTKVKA GNKKKRIHMLTAAALLELDFEQPSLDYHSLMKLTKILTRDNEADVREMFRRMCFNVFAHN RDDHSKNFTYLYDEMKKGWRLSPAYDLTFSNTYYGEHTTMVDGNGRSPGKKELLAVGMQA GLEKSWCEEVIEKMKNTVEERLERYL >gi|222441830|gb|ACEP01000112.1| GENE 11 10023 - 10322 198 99 aa, chain - ## HITS:1 COG:FN1997 KEGG:ns NR:ns ## COG: FN1997 COG1396 # Protein_GI_number: 19705293 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 7 91 21 105 106 77 48.0 4e-15 MEALVWETAEELDMKLAQRVRNIRRRRKISQEELSRMSGVSYGSVKRFEATGKISLLSLT KLAMALDMADELRELFTQVPYRNIQEVIDERKRNTTSFI >gi|222441830|gb|ACEP01000112.1| GENE 12 10873 - 12738 183 621 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 397 604 38 251 329 75 27 9e-13 MAENKAGTSNRGPMGGRGPMGGGRRMMQGGEKAKNFSGSFKKLIRYIGNYKFAILAVMIF AAISTVFSVAGPKVMGKATTALAEGLMKKIAGTGGIDFTRIGKILLLTLGLYLTSTVFNF LQGWIMTGVTQKICYRLRKEISEKINRMPMKYFESRTYGEVLSRITNDVDTLGQGLNQSV TQVITSTATLIGVVVMMLTISPLMTLIAFIILPVSAILMSTVVKFSQKYFRTQQEYLGHI NGQVEESYGGHLVIKAFNKEKDMVKTFDETNGVLYESAWKSQFLSGLMQPIMAFVGNLGY AGVAISGGLLAIRGTIGVGDIQAFIQYVKNFTQPIQQIAQVINQVQSMTAASERVFEFLE EEEEDQEHEGAVSVENVTGEVKFFHVNFGYNPNQTIINDFSVDVKPGQKIAIVGPTGAGK TTMVKLLMRFYDVNSGQILLDNHDVRDYNRRDLRKAFGMVLQDTWLYKGTIMENIRYGRL DATDEEVIEAAKAAHADSFIRTLPGGYNMELNEDASNVSQGQKQLLTIARAILADSKVMI LDEATSSVDTRTEALIQKAMDNLMEGRTSFVIAHRLSTIKNADVILVMKDGDIIEQGNHE ELLAKGGFYAQLYNSQFEEAA >gi|222441830|gb|ACEP01000112.1| GENE 13 12740 - 15184 2797 814 aa, chain - ## HITS:1 COG:CAC3282 KEGG:ns NR:ns ## COG: CAC3282 COG1132 # Protein_GI_number: 15896527 # Func_class: 
V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Clostridium acetobutylicum # 205 765 25 584 584 525 48.0 1e-148 MTKIFKNMAPYWYMIVAIVLLLIVQAFGDLSLPQYTSDIIDVGIQNKGVEHILPVKMTED EYEISQLYMTSKEKKIWKDTYEKKGEYYICKAEDEEKLDQLDDTFLTAIFLNHNMSNVKE SQFKKMIKNSIASNPAMAPMKDKIDDMSVDEIGKMLNMKFKSFQEEDDNGKKVIYVDVRP MLYQMKQTGMMSAKDIQKSREEIEKKMNDIGESTLFSTGVAYATKCDKAAGVDIDKIQTD YLWKEGGRMLGIAFMILVAAIGVGFLASKVGASIGRDLRGKIYKKVMGFSNAEMNRFSTA SLITRSTNDIQQIQMVTAVMLRLLLYAPIIGIGGIIKVYQTGAGMEWIIALAVVVILGFV MLLVSIAMPKFKIMQTLVDGLNLVSREILTGLSVIRAFGREKTEEERFDEANKKLTGTQL FTNRIMTFMMPGMMFIMYSVTILITWVSAQKIDAGTLQVGAMTAFITYAMQIVMAFLMMT AMSIMVPRAGVAADRIDEVLKTEASVQNVKKPETLKEHKGVLEFSHVDFKYPGAEHNVLS DIDFKVEPGKTTAIIGSTGCGKSTLVNLIPRFYDVTGGQITLDGKDIRRISMEELREEIG FVPQKGVLFSGTIASNLRFGKADATDEDIKEAAEIAQATEFIETKKEKYDSPIAQGGSNV SGGQKQRLAIARAIAKKAKVLVFDDSFSALDMKTDAALRKELNEKVQDASIVIVAQRVST ILHADQILVLDDGKIVGKGTHEELLKNCEVYLQIAKSQLSEKELGLEKLGLAEEKAEKET NKKEILSTKIDEKENNKLKEKSDDKKLKHKKGGK >gi|222441830|gb|ACEP01000112.1| GENE 14 15282 - 15878 637 198 aa, chain - ## HITS:1 COG:no KEGG:ELI_2728 NR:ns ## KEGG: ELI_2728 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 4 192 7 198 213 112 32.0 7e-24 MKSVAIEREFGSGGRDIGIRIAKEAQIPYYDGQLLIEAAKGYGIEIATLKEFDENKVGSF LYNIAMMAGYNQYENMAKINEIFYGMKETIKNLHAEGPAVFIGRCSTEILKSCEDVVTVF VRCSDKEKRVQRIFDKENVDTMQKARRLMDRKDWGREKYFKFWTKKDWKDEKNYDLVLDT AKFSLEECSEILLKRMNE >gi|222441830|gb|ACEP01000112.1| GENE 15 16051 - 16485 447 144 aa, chain - ## HITS:1 COG:no KEGG:Closa_1534 NR:ns ## KEGG: Closa_1534 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 12 143 19 150 160 65 30.0 5e-10 MRDSLINTCGMKMRKLLDKKSEVFAQKYGVRNVELEILLFLYHSPCGDTAKDIVEEKNLS KAHISKSVDNLRAKGFVVLTEDENDRRKRHIELTEKAVEAAKNMLEVHNECKEVVMRYVT DEEKELMNQIMQKMLRSLDEELQK >gi|222441830|gb|ACEP01000112.1| GENE 16 16646 - 17284 441 212 aa, chain - ## HITS:1 COG:Cj0510c 
KEGG:ns NR:ns ## COG: Cj0510c COG1937 # Protein_GI_number: 15791872 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Campylobacter jejuni # 122 212 6 96 96 83 53.0 2e-16 MENKHIQNNRNENNKKNDTTKKIGKIHIQENEYIKNRNNFYNQEDIQGNVHYDSTHKEGH SHKHEELSEHQHIGKSEKSSVNQHTHKSGELSVNQHTHKSGEASVCQHTHTHVDANGNVY THTHTHSPQIVKNELNRIARIIGHMKSIKIMIESGRDCSEVLIQLAAVDAAVKSLSRVIL KEHMSTCIVDAIKTGDDEAIEALNEAIDKFMK >gi|222441830|gb|ACEP01000112.1| GENE 17 17270 - 17509 76 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFIFHFILPRIIKICGNFEFCNIQRDSREDYKNKINVYKINVIKSYKRYIPAIILLIAII LYENDIKNAAISLPNIIDK >gi|222441830|gb|ACEP01000112.1| GENE 18 17532 - 18131 443 199 aa, chain - ## HITS:1 COG:no KEGG:Vpar_0013 NR:ns ## KEGG: Vpar_0013 # Name: not_defined # Def: hypothetical protein # Organism: V.parvula # Pathway: not_defined # 7 199 4 196 199 195 48.0 9e-49 MKEDIYENKWDKLLKIHTMGRDDSNAGQYYYPYEPTPYSVLERLANTGIIRKGNTLLDYG CGKGRVDFFLAYQTRCRSIGVEYDERIYNKAIENKEKAISAGRVAFKAENAEYFSVPREV DRIYFFNPFSLEILQKVIHRILESYYEKPREIKLLFYYPSDEYISYLMTVDEFMFEDEID CRDLFPGNDMRERIVIFGL >gi|222441830|gb|ACEP01000112.1| GENE 19 18368 - 19750 1525 460 aa, chain + ## HITS:1 COG:melB KEGG:ns NR:ns ## COG: melB COG2211 # Protein_GI_number: 16131946 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 20 450 4 439 469 293 37.0 6e-79 MENNSNSGVIYQGKIPNRVKYCFAFGALGKDLIYGMIATFSMIYFTDIIKVAPAFIGSMF FVAKLWDAFNDLFMGMIVDNTRSRWGKFVPWLVIGTLINAFVFITVFTDFHLSGVSLCVF ASVVYVLWGMTYTIMDIPYWSIIPNLTSDPEEREKVSVLPRIFASIGQSLIIAGFGVQII KGLGGNYTGYHRFALIIAATFIFTMAVCVINLPKKQQATGTAEKMKFRDIFTVIKKNDQL RWAVLLILLYNIGIQAIMGVATYYFSYVCNNAGMLSAFMISASVAEVVGLLIFPKVTKLL SRKKAFLMACTLPPIGLILLLIVGFVCPSNIPLTAIAGIIVKTGTGLELGCATVFLADVV DYGEYVLGVRNEGVVFSLQTLIVKLTAAFTALAIGFVLELTGYVPNAVQSLATQNSIRVL MCIVPALGIILAYVVFKTKYKLNDEFMKKVVAKISGKAEE >gi|222441830|gb|ACEP01000112.1| GENE 20 20378 - 21985 1342 535 aa, chain + ## HITS:1 COG:BH3842 
KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 529 1 515 530 145 27.0 2e-34 MINFTIILADDEQQILYGMKNGIDWESLGFSVVGTAQNGKEALELIEEYHPDLLISDIKM PFMDGLELSKTIHENYINTKIILFSGWDDFEYARTAISYGVSQYIMKPIDYNEMQKLLTT MHEELEKEYAEKMNRTRLENIYTESIPLLRQQFFTQLLTETLTEDYCTLQMKNLKLNLDY EVFHIITVKIRENDLNDVLSELSIKETIKESLEKITDLYHFGMIDREVFLLCGNKKHDIE RITRTIQEASVIIERIFHTKISCGISSSGTSLRALPVLYQQALEALDYNVVIQDESYTYY NDILPLQKDAPLQKDDDWNSEVDSIEKIITHCSEEELKTAVLNMLEHLHEAHYNLNEYQI VILEISFSLARLYKKYQITSDKEFAGSKKMAVKILSLNTGEELDNWLINYFQLMRTLIQK KQVDNNVILAENAKKLVEEHFREPDLSVESICKELHVSSSYFSKIFKQETETTFLNYLIS RRMEEAKMLLKMTDYKSHVIGTMVGYPEPNYFSYVFKKHCGISPVKYRKLGGENN >gi|222441830|gb|ACEP01000112.1| GENE 21 22093 - 23850 1157 585 aa, chain + ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 32 572 53 563 563 216 29.0 1e-55 MNNKKLSIRVLFPLIMSISIVFCILSCMILFSHYISTYLVRKTIEETSRQKNVLAQNIEN EIDSINSIMNNIYYNTIKKYDIQDKNFKTILANEITSNSETIYGLALYDTDGKNLWHSNN LTSTSMQNESWFTQAKDNIETICYGSKKLVYPDNVKQVFQISRYVEYINHGKMKSGVLLM QYYTDSIDAILEHYKNTQNSYCYLLDNTSNFLYHPFIKEISSGLYKENTTKIALNSTNYS IKKLHGTKWLIERQQIGYTGWNIVLVSSLFNIHTENMGIYYVIWLLLLIVGIILVFTDIL LFHDFTNPVYQLLDTMQEFGKGNYEVKAAENGIGELKTLSAHFNIMAEKLQNQMGEIRRN EREKRKMEKKLLQSQINPHFLYNTLDSIIWMIQSGEYKGAEQMVSSLAKFFRISLSQGKD IIPLEKELEHATSYLAIQNIRFKDKFEFSVQADKKLLHYLCPKLSIQPLLENAIYHGMEG VYDDGEITISVYEKDNLIHIDVSDNGLGMPEKVVEYIMHNRVVSSKRGSGIGVRNVDERI KLIYGEQYGVTIESELDEGTTATIIIPKLEETAEINSDESNKTKL >gi|222441830|gb|ACEP01000112.1| GENE 22 23986 - 24627 233 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028299|ref|ZP_03717491.1| ## NR: gi|225028299|ref|ZP_03717491.1| hypothetical protein EUBHAL_02571 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02571 [Eubacterium hallii DSM 3353] # 1 213 1 213 213 384 100.0 1e-105 MKNNKKKYALVIILILFIAFFSSQCMMTRTNTESKKTVAVILPYTYNKENSRILDAIRDY SHNNSIILDVWHEKSLSSNELSSIIANEEQNHAIGILLIYPENYLTQQEKHYDYNNVLAI TATMQTAFSHYAAFENSPSPSYLLPVSAQVIKKIQDGKTNCIYIKNTYKLGYESIQMLDT YSRNKYMKDISIPCHKVDKAAIESGELDALLTN >gi|222441830|gb|ACEP01000112.1| GENE 23 24831 - 25817 383 328 aa, chain + ## HITS:1 COG:HI0822 KEGG:ns NR:ns ## COG: HI0822 COG1879 # Protein_GI_number: 16272763 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Haemophilus influenzae # 38 322 47 328 349 139 32.0 6e-33 MKTNLFSYNSILFTLIIILISTITIWKLPQTQNGYTHIGIAVYDMDDTFINDYVTKLQNK IDRSSFSGKKVLYEIFDAEGNANRQEHQLQYMYTQNFDALLINLVTPSSAASVLNETANH DIPVILFNREADKKDLSISKQTWYVGTAAKKAGAIQADMLQNVWNTDKINLDRNKNNKLD YILIEGEQTHFDAVRRTNGFLENSQNLPLNQLTNLSADWQRNLASVKFSKLDTSIIQSAE AIICNNDDMALGVYDYYKQHNLKLPLILGINNSPEMNQKIQAGEIYGTVDNGTSDQISYI CKLLNDILNKDTAKYHKIWYSKPFAIEK >gi|222441830|gb|ACEP01000112.1| GENE 24 25970 - 27583 982 537 aa, chain + ## HITS:1 COG:AF0691 KEGG:ns NR:ns ## COG: AF0691 COG3894 # Protein_GI_number: 11498299 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein # Organism: Archaeoglobus fulgidus # 10 530 16 554 585 237 31.0 5e-62 MKTIELSISDKNKSLLTHLQENGEFLPAYCAGRGTCGKCKVQFLNNIPAHTTHDAAFFSA KELSEGWRLACQSYVKGQFTIQIEDYEEDQIQAASAFNLSADKSVSNKFTAADTVSDSSS KSASVQHISMQPSSEKSDSKIATAPSSSTHNEISSYDSLALAVDIGTTTIASNLIDTNQK KVLQTITSINHQRIFGTDVLSRIDIANQGRLLELQRTILNDLDSICEQFQLGKDITKLSI PVIISGNSTMQHLLQKLSCESLGSFPFTPVDISMHSYKNMTILPGISTYVGADIVSGIVA CGIDQKEEVSLLVDLGTNGEMVIGNKDRLLVTSTAAGPAFEGGNIRYGIAGVPGAIDTVT IFDNHPAISTISNKKPIGICGTGVIETVYELVKEKIVDNTGLIADKYFDNGYPLTENISF TNKDIREVQLAKSAIRAGIEILVTSYGIDYSQIAHLYLAGGFGQKINYTKAVGIGMLPKE LNDRIISVGNSSLAGAGMLAMNSELSHRFTHIPEISEEISLANHKAFNDLYLEYMSF >gi|222441830|gb|ACEP01000112.1| GENE 25 27997 - 28338 364 113 aa, chain - ## HITS:1 
COG:CAC0417 KEGG:ns NR:ns ## COG: CAC0417 COG1393 # Protein_GI_number: 15893708 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Clostridium acetobutylicum # 1 112 1 111 112 117 59.0 5e-27 MNVQIFGTKKSFDSKKAERYFKERGIKYQFVDMKEKGLSKGEFHSVCQAVGGYEKLIDPK CKDKNLLALVQYLSEEDKKEKILENQKIIRIPIVRNGKQATVGYEPDVWKDWK >gi|222441830|gb|ACEP01000112.1| GENE 26 28458 - 28997 505 179 aa, chain - ## HITS:1 COG:yfdR KEGG:ns NR:ns ## COG: yfdR COG1896 # Protein_GI_number: 16130293 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Escherichia coli K12 # 3 111 11 114 187 79 40.0 2e-15 MGNYITTYTRKHFDPINPDAELICIEDIAHALPMICRGNGHVSSFWSVGEHCILCAKEAA AKGYSNRLVLAALLHDASECYMSDVPRPLKQNMKKYKEYENHLLDVIYTKFLGFTLNIDE EKLIKEIDNALLWYDLKYLLDEESEEKQPKLHVKVDYTVRSFSEVEKEYLGLFEKYREV >gi|222441830|gb|ACEP01000112.1| GENE 27 29271 - 29825 422 184 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0670 NR:ns ## KEGG: bpr_I0670 # Name: not_defined # Def: HAD superfamily hydrolase # Organism: B.proteoclasticus # Pathway: not_defined # 4 175 5 179 401 153 43.0 3e-36 MKKKIAVLFPGIGYTCDKPLLYYTGKLAVARGYKLVKVEYGNFPSGIKENKEKMEEAFRC GLEQAEKILQDICWEEYEDILFVSKSIGTVISSAFARRHQLSAKNILFTPLQQTFLFADG NGIVFHGTADPWADTKDITEGCKKLNLPLILTDEGNHSLETGDTLKDLENLQQIIGLLFN SLDK >gi|222441830|gb|ACEP01000112.1| GENE 28 29999 - 30658 677 219 aa, chain - ## HITS:1 COG:PA1575 KEGG:ns NR:ns ## COG: PA1575 COG5018 # Protein_GI_number: 15596772 # Func_class: R General function prediction only # Function: Inhibitor of the KinA pathway to sporulation, predicted exonuclease # Organism: Pseudomonas aeruginosa # 2 178 3 175 183 59 28.0 5e-09 MNYLVIDLEMCKVPKNYRGKNYKYASEIIQVGAVLLDEEYKEIGTLCQYVHPEFGILDYF ITNLTGIEKGQIKNAPKLKDALIHMADWLGEREYKVFAWSKSDYWQLDHEIKSKKLNDEK LDELMKPERWVDYQEIFGKKYNFEQAVGLQEALMLCDIEPDGRMHDGLDDAWNTARLIEK LEKNPNYKLIYRERQEQEDSQPLKVRLGELFEGLNLQLG >gi|222441830|gb|ACEP01000112.1| GENE 29 31310 - 31417 
58 35 aa, chain + ## HITS:1 COG:no KEGG:TDE0249 NR:ns ## KEGG: TDE0249 # Name: not_defined # Def: flavoredoxin, putative # Organism: T.denticola # Pathway: not_defined # 1 34 154 187 187 68 91.0 7e-11 MSKVNAIAFDPYTHGYYKVTERVGEAFRDGLKLKK >gi|222441830|gb|ACEP01000112.1| GENE 30 31599 - 31772 241 57 aa, chain - ## HITS:1 COG:no KEGG:BDP_1352 NR:ns ## KEGG: BDP_1352 # Name: not_defined # Def: hypothetical protein # Organism: B.dentium # Pathway: not_defined # 1 55 1 55 117 93 85.0 3e-18 MNTDKIYAEQLANKYAPKDTTKVVALRKLDTKAKLPATVFTYTFGIVTAMVAGVGDG >gi|222441830|gb|ACEP01000112.1| GENE 31 32015 - 32434 479 139 aa, chain - ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 3 131 2 129 136 154 60.0 6e-38 MNKKKVALICVHNSCRSQIAEALGKYLASDIFESYSAGTETKPQINQDTVRIMKETYGID MEANSQHSKLITDIPDVDIAISMGCNVGCPFIGRAFDDNWGLDDPTGKSDDDFKAVIQRI EENIMELKKRLSESGVLRE >gi|222441830|gb|ACEP01000112.1| GENE 32 32427 - 33110 393 227 aa, chain - ## HITS:1 COG:no KEGG:TDE2638 NR:ns ## KEGG: TDE2638 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 20 225 1 206 208 336 78.0 4e-91 MKEECIICKAPLIYLEKDEMMECVLCHKKELSKTRCEQGHYVCNECHTKGMDVIIDICLS ETSKNPIEIIRKMMEQPFCHMHGPEHHVMVGSALLAAYKNAGGEIDLPEALLEMMNRGKT VPGGVCGFWGACGAGISTGMFISIISGATPLKNEPWGLANKMTSKALDAIGSIGGPRCCK RDSYMAIISAIDYVAENFNIQMEKPVIKCIHSDKNNQCIKERCPFHE >gi|222441830|gb|ACEP01000112.1| GENE 33 33120 - 33461 326 113 aa, chain - ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 60 104 65 109 318 62 60.0 1e-10 MSEKRCYTVQELQEILGVSRPTIYNLYAQAIPSLSQAQRIKQLSKEKQLSLEKMEKIMCK VISVVNQKGGVGKTTTTVNVGIGLAREGKKVLLIDADSQGSLTASSFIEALGR 
>gi|222441830|gb|ACEP01000112.1| GENE 34 33855 - 33950 85 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMTVNEVSKLTGVSIRTLQYYDTIGLLKPIS >gi|222441830|gb|ACEP01000112.1| GENE 35 34093 - 34617 469 174 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 2 118 4 123 179 80 39.0 2e-15 MKIVIINGSARKGNTLAAINAFIKGASEKNKIEIIEPDKLNIAPCKGCGVCQCSKGCVDK DDTNPTIDKIAAADMILFATPVYWWGMSAQLKLIIDKCYCRGLQLKNKKVGTIVVGGSPV DSIQYELIDKQFDCMAKYLSWDMLFKKSYYATARDELEKNKDSMNELEEIGKNL >gi|222441830|gb|ACEP01000112.1| GENE 36 34643 - 34756 82 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINGELVIEWKKEKSIQCEDFDEDTGEKEYTMLWKQK >gi|222441830|gb|ACEP01000112.1| GENE 37 34918 - 35787 494 289 aa, chain - ## HITS:1 COG:mll2462 KEGG:ns NR:ns ## COG: mll2462 COG0697 # Protein_GI_number: 13472235 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Mesorhizobium loti # 2 287 3 287 352 152 37.0 6e-37 METEQKSKKGIAFAILAAALYAINAPFSKILLEFMPPTLMAGFLYVGAGIGMIFIALMRK IKKYEAKELKLTKSELPYTIAMIVLDIAAPICLLFGLNSTTAANASLLNNFEIVATAIIA LMVFKEKISTRLWFGIFFVTLSCGILSFEDISSLRFSYGSLFVLLATICWGFENNYTRKI SSKDPLQIVLLKGIFSGIGSLIIGLFIGERIEALWSIVAVLCVGFVAYGLSIYFYVYAQR LLGAARTSAYYAVAPFIAAILSLIIFREIPDVTYFVALVLMIVGAWLSS >gi|222441830|gb|ACEP01000112.1| GENE 38 35939 - 36502 288 187 aa, chain + ## HITS:1 COG:SPy0846 KEGG:ns NR:ns ## COG: SPy0846 COG1309 # Protein_GI_number: 15674880 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pyogenes M1 GAS # 2 184 4 173 173 99 34.0 2e-21 MDRRQQKTREAIFHAFSTLLEKKSYNHITVQEILDTANIGRSTFYAHFETKDELLNAVCK ELFGHIIDSAMDKTHIHGLYSNEEMPQSVFCHLLQHLKENDNNILGLLSCESSEIFLRYF 
KDSLNELVQTQFINHNRKRNQDLPQEFLVNHISGSFVEMVLWWLKDKKKHTPEELDYYFR AVIEPIL >gi|222441830|gb|ACEP01000112.1| GENE 39 36591 - 37208 408 205 aa, chain - ## HITS:1 COG:SPAC18B11.09c KEGG:ns NR:ns ## COG: SPAC18B11.09c COG0110 # Protein_GI_number: 19113786 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Schizosaccharomyces pombe # 3 192 15 204 207 163 46.0 2e-40 MTEREKMINGKLYNPMDKELEQLRLNARKLARKYNLTDEDNQEEQAQILRKLLPATEELP YLQAPVYFDYGCNTYFGKFSSANFNFTCLDVCEIHIGDNVMIGPNVTLATPMHPLLPEER NVRKKEDGSFYNLEYAKPITIKDNCWLASNVVVCGGVTIGEGCVIGAGSVVTRDIPPYSL AAGNPCRVIRKITEKDHMSDRIEKN >gi|222441830|gb|ACEP01000112.1| GENE 40 37232 - 37552 188 106 aa, chain - ## HITS:1 COG:CAC0195 KEGG:ns NR:ns ## COG: CAC0195 COG1733 # Protein_GI_number: 15893488 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 11 106 16 111 115 82 40.0 2e-16 MKDVNYQLDECPVQKILNLFQRKWNLRIIYELSKHESMRFGELKKAVLDISNTVLTSTLK ALENQGLVLRQQFNEVPPHVEYSLTESAKALDKVFQTMREWGNRYL >gi|222441830|gb|ACEP01000112.1| GENE 41 37686 - 37889 257 67 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MELLNTLISRHSVRTYNGEPLLSSDLEKILKAAKAAPVGLGLYDQMHLTVVNPMHLKILF LYAVILC >gi|222441830|gb|ACEP01000112.1| GENE 42 37774 - 38046 190 90 aa, chain - ## HITS:1 COG:no KEGG:Ethha_1352 NR:ns ## KEGG: Ethha_1352 # Name: not_defined # Def: integrase family protein # Organism: E.harbinense # Pathway: not_defined # 3 73 189 259 464 70 46.0 3e-11 MYALSVCEDERLKLAINLSFSCSLRLGELLGLTWDCVDISPEAIEENRAYVFINKESQRI RKESLNALDSQQLSASGRKGLDPQAQLLLP >gi|222441830|gb|ACEP01000112.1| GENE 43 38065 - 39069 922 334 aa, chain - ## HITS:1 COG:lin2113 KEGG:ns NR:ns ## COG: lin2113 COG0667 # Protein_GI_number: 16801179 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Listeria innocua # 1 319 1 327 331 339 52.0 5e-93 
MEYGKLGNTDIEVSKLCVGCMSFGKAGTMHDWTLDEAESENVIKHALDLGYNFFDTANGY SAGTSEEYLGKALKKNVARNQVVIASKVYFNEGRLSKQAIMREIDGTLSRLGTDYLDLYI IHRFDYDTPIEETMEALHDLVKAGKVRALGASAMYGYQFYNMQLAARDNGWTPFSAMENH YNLLYREDERELLPICKQMKVSLMPYSPLAAGHLARPQWRSESLRGTTDRVAMGKYDKTE AEDMQIVKRVAELAEKYNCKMSQIAIAWQWAKGILSPIIGATKTQYLDDSVGAFDIKLTA EDFAYLEEPYVPHEIVGAIDKNPAQGVILLDEKK >gi|222441830|gb|ACEP01000112.1| GENE 44 39080 - 39847 588 255 aa, chain - ## HITS:1 COG:Cgl1022 KEGG:ns NR:ns ## COG: Cgl1022 COG0599 # Protein_GI_number: 19552272 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Corynebacterium glutamicum # 1 105 1 105 107 127 57.0 2e-29 MGKIVQTAGRNTLGEFAPEFAHFNDDVLFGENWNNQDIDVKTRSIITVVALMASGITDSS LKYHLQNAKNHGVTQKEIAAVITHVAFYAGWPKAWAVFNLAKEVWEAGEGDLPYEEEAMR VHAKEMVFPIGAPNDGFAQYFSGRSFLAPISTSQVGIFNVTFEPGCRNNWHIHHAKSGGG QILVCVAGRGFYQEEGRDAVEMKPGDCINIPVDVKHWHGAAPDEWFSHLAIEVPGVDCSN EWCEAVSEKEYAGLR >gi|222441830|gb|ACEP01000112.1| GENE 45 39876 - 40346 604 156 aa, chain - ## HITS:1 COG:YPO2003 KEGG:ns NR:ns ## COG: YPO2003 COG0716 # Protein_GI_number: 16122245 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Yersinia pestis # 13 156 85 231 235 104 36.0 6e-23 MSKNLVAFFSASGTTKKVAEMIAEEAKADLFEIEPKVPYTKTDLDWMNKKSRSSVEMSDK KYRPEIMKKKMDMSSYDEILLGFPIWWYVAPTIVNTFLEAYDFSGKKIVLFATSGGSGFG NTVKELQPSAPDAVITEGSLLNRGTKQEISEWVKSL >gi|222441830|gb|ACEP01000112.1| GENE 46 40473 - 41351 703 292 aa, chain - ## HITS:1 COG:lin0450 KEGG:ns NR:ns ## COG: lin0450 COG0583 # Protein_GI_number: 16799526 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 1 287 1 285 291 182 35.0 8e-46 MELRLLRYFLTVAKEQSFTKAAEQLHITQPTLSRQMAAFEEELGVTLFIRSGKKISLTEE GILLKRRALEILNLEEKTLEELKGKEDVVEGNITIGCGEFAAVETLAEICKTYKEKYPLV QIVLHTATADAVYEMMNKGLVDIALFMEPVDTEGLDYIRITDCDHWCVGMRPDDPLAEKE FIRKEDLIGKPLILPERVNVQSELANWFGKDFSKLQIAFTSNLGTNAGVMAANGLGYPVS 
IEGAAKYWREDILVQRRISPEITTSTVIAWRRNIPYSLAVSKMIEEINAFQA >gi|222441830|gb|ACEP01000112.1| GENE 47 41380 - 41805 283 141 aa, chain - ## HITS:1 COG:CAC1468 KEGG:ns NR:ns ## COG: CAC1468 COG0454 # Protein_GI_number: 15894747 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 1 140 1 142 142 98 37.0 3e-21 MIRKLLNGDIDRVADIWLKTNLKAHYFISNQYWKSNYELVKEMMSQSEVYVFEADKMIQG FVGLNDEYIEGIFVSDGMQSCGIGKLLLDYIKDKKERLQLNVYQKNVRAISFYQREGFII QCEGLDEATGEKEYTMLWKQK >gi|222441830|gb|ACEP01000112.1| GENE 48 42190 - 42576 232 128 aa, chain + ## HITS:1 COG:no KEGG:LAC30SC_09585 NR:ns ## KEGG: LAC30SC_09585 # Name: not_defined # Def: hypothetical protein # Organism: L.acidophilus_30SC # Pathway: not_defined # 5 126 3 128 132 67 35.0 1e-10 MAPKKKEKNLKRSNIITLKLTDIELAVLNEAADVTGLSRSEYVRKLLLEKQINHQIEVVA DMNDLRKLVSEYGKIGSNLNQIAKHFNSGGSQSRAIENEIHQCITDLFLLRKEVLKLAGG MNGNRKTH >gi|222441830|gb|ACEP01000112.1| GENE 49 42554 - 43987 494 477 aa, chain + ## HITS:1 COG:no KEGG:Tresu_1917 NR:ns ## KEGG: Tresu_1917 # Name: not_defined # Def: Relaxase/mobilization nuclease family protein # Organism: T.succinifaciens # Pathway: not_defined # 1 448 1 489 555 290 36.0 1e-76 MAIVKHIKSRNANYSAAINYLLFEDDEKTGKKIVDESGRSILRKEFYMDGLSCDPMSFDK ECELTNAHFHKNKKREDIKSHHYIISYDPADVTENGLTGERAQAISLELAKQMFPGYQAL VVTHTDGHNESGNIHTHIVINSVRMTAVERQPYMDKPHEEAAGYKHRSTDKFMNAFKKTV MDRCQQEGLHQIDLLAPAERKITQKEYMSQKHGQQKLDEINQKIIEDGLKPTSTVFLTQK EYLRNAIDECAATSNSFDEFQSKLLEQFQISVIEHRGRYSYLHPDRQKRITERALGTRYG KEHLEQTFLRKDPLVILYVRSHLRLVVNLQTNVKAMQSPAYAHRVKLSNLQQMANTIIYV QEHGFDTQSDLKNTLLASKQELKEMQTQFAQHRSDLRILNDQIRYTGQYYANKEVYSQFS NAKYKGKYRKEHAKEIQKYEEARNWLRSFYQDGKMTSLKTLTLQKEKLQTADRFLLR >gi|222441830|gb|ACEP01000112.1| GENE 50 44137 - 44439 265 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|197302208|ref|ZP_03167267.1| ## NR: gi|197302208|ref|ZP_03167267.1| hypothetical protein RUMLAC_00935 [Ruminococcus lactaris ATCC 29176] 
hypothetical protein RUMLAC_00935 [Ruminococcus lactaris ATCC 29176] # 1 100 1 100 100 149 97.0 9e-35 MENIIMLILGVFISVVGIVNIKGNISTIHSYNRRKVKEEDIPKYGKTVGTGTLIIGISLV VGFIVSFWSEIIIDYIILPAVIVGLGFILYGQFKYNKGIF >gi|222441830|gb|ACEP01000112.1| GENE 51 44449 - 44655 130 68 aa, chain - ## HITS:1 COG:SPy1934 KEGG:ns NR:ns ## COG: SPy1934 COG1476 # Protein_GI_number: 15675737 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 7 67 1 62 68 60 46.0 6e-10 MKEQLQLKNHLKEVRTEANLSQAQLAEMVGVSRNTISSIETGQFNPTAKLALILCIALDK KFEELFYF >gi|222441830|gb|ACEP01000112.1| GENE 52 44652 - 44981 79 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|218132785|ref|ZP_03461589.1| ## NR: gi|218132785|ref|ZP_03461589.1| hypothetical protein BACPEC_00646 [Bacteroides pectinophilus ATCC 43243] hypothetical protein BACPEC_00646 [Bacteroides pectinophilus ATCC 43243] # 1 109 1 109 109 176 97.0 7e-43 MKKDEILNASRKEHRNKDLAEMEVVYQAGSHASRVGALVCCLLSLLSSVLAHTMIYSPWV IYFSIIATQWLVRFIKMKRKSDLVLTVLFFVFSILAFVGFVSHLLEVRI >gi|222441830|gb|ACEP01000112.1| GENE 53 45195 - 45374 89 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028330|ref|ZP_03717522.1| ## NR: gi|225028330|ref|ZP_03717522.1| hypothetical protein EUBHAL_02602 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02602 [Eubacterium hallii DSM 3353] # 1 59 11 69 69 112 98.0 1e-23 MGLILCFAISNTTIKGQKLAFKQGVTEELKATDMMRWIGLMNNIRACADEIVLNDIVYS >gi|222441830|gb|ACEP01000112.1| GENE 54 45527 - 45808 222 93 aa, chain - ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 8 91 12 95 420 77 42.0 6e-15 MSQDQTSIICIDLKSFYASVECVERGLDPFKANLVVADPTRSKSTICLAITPAMKALGIK NRCRIHEIPDCVKYITAMPRMQLYMDYSAKIYS >gi|222441830|gb|ACEP01000112.1| GENE 55 46110 - 46529 408 139 aa, chain + ## HITS:1 COG:no 
KEGG:Closa_2770 NR:ns ## KEGG: Closa_2770 # Name: not_defined # Def: XRE family transcriptional regulator # Organism: C.saccharolyticum # Pathway: not_defined # 1 121 1 121 140 90 39.0 2e-17 MTFGEKVRSLRKEKKMSQQELASMVGVSYRTIRSWEVEGRFPKQNVLYQKLADALQCDVS YLMSEDEAFITEASEQFGNRGAKQAQQILEQAAAMFAGGSLTDEDKIAFMDEIQSLYLDS KRRAKKFTPKKYLKNQEEK >gi|222441830|gb|ACEP01000112.1| GENE 56 46622 - 47137 480 171 aa, chain + ## HITS:1 COG:BH3550 KEGG:ns NR:ns ## COG: BH3550 COG2856 # Protein_GI_number: 15616112 # Func_class: E Amino acid transport and metabolism # Function: Predicted Zn peptidase # Organism: Bacillus halodurans # 11 117 10 113 146 59 38.0 4e-09 MRNTYIYTETKKLIKKYGTRDSFEIMDQMNIAVGETSRYKTLKGYCFMSCKTIYVMISSF LSEEEKMIVAAHELGHIILHRSQLKMAPMQDDTLYNMTDNTEYQANLFAADLLIEDEDIE EMVQNEDLDYFGLCSSLNATPELMSFKLYSLTKRGQAYHMPMEIQSNFLAK >gi|222441830|gb|ACEP01000112.1| GENE 57 47203 - 47589 323 128 aa, chain + ## HITS:1 COG:no KEGG:Mlab_0595 NR:ns ## KEGG: Mlab_0595 # Name: not_defined # Def: hypothetical protein # Organism: M.labreanum # Pathway: not_defined # 3 127 2 121 208 69 32.0 4e-11 MSKHHIEKVTCPSCHHEGDFEVWDSINTVLNPEMKEKVLNQSIFLYTCPNCGETFRLNYP TLYHQMEDLVMIYLVPESEVEKTYEMFYGENALADYRTEKYLNRIVTSANQLIEKIKIFD AGKDGRIN >gi|222441830|gb|ACEP01000112.1| GENE 58 47732 - 47881 105 49 aa, chain - ## HITS:1 COG:no KEGG:CPF_0994 NR:ns ## KEGG: CPF_0994 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 49 77 125 127 62 53.0 4e-09 MLLVIGKNEDCVGSIGGIICGVQLIPLIGSILPTEIALKKNFDKNGTRR >gi|222441830|gb|ACEP01000112.1| GENE 59 47973 - 48182 93 69 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2714 NR:ns ## KEGG: CDR20291_2714 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 61 212 272 277 63 44.0 2e-09 MNNYYGEIITEYLTNYMCFIVLSVEIFSILFTFGVVKSVVVLLFVDTAIFIGLLVVGLVK MDLRGKEVL >gi|222441830|gb|ACEP01000112.1| GENE 60 48725 - 49375 708 216 aa, chain + ## HITS:1 COG:MJ0455 KEGG:ns NR:ns ## COG: MJ0455 
COG4887 # Protein_GI_number: 15668631 # Func_class: R General function prediction only # Function: Uncharacterized metal-binding protein conserved in archaea # Organism: Methanococcus jannaschii # 34 205 19 187 193 140 48.0 2e-33 MKKEDMSCIDCAVKNCNKMDKTYPDFCLTTHMDEEVLNEAMECYNEDENRKVTIAAAEVE YENYCKHTRVEEIMDFAKKINAKKIGIATCVGLLKESRILADILRRHGFEVYGVGCKAGT QKKTSVGIPECCEGVGVNMCNPILQAKLLNKAKTDLNVVVGLCVGHDSLFYKYSEALTTT AVTKDRVLGHNPVAALYTADSYYSKLKKSNISNLGV >gi|222441830|gb|ACEP01000112.1| GENE 61 49478 - 49624 149 48 aa, chain - ## HITS:1 COG:no KEGG:Rumal_1101 NR:ns ## KEGG: Rumal_1101 # Name: not_defined # Def: integrase family protein # Organism: R.albus # Pathway: not_defined # 1 47 361 407 415 68 63.0 7e-11 MCEAGVNIKVIQDALGHSDISTALNIYADVTKEMKAEEFKGLDSYFKV >gi|222441830|gb|ACEP01000112.1| GENE 62 49832 - 50434 406 200 aa, chain - ## HITS:1 COG:FN0852 KEGG:ns NR:ns ## COG: FN0852 COG1073 # Protein_GI_number: 19704187 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Fusobacterium nucleatum # 1 195 1 195 199 244 59.0 7e-65 MDKAVIYIHGKGGNAEEAIHYKPLFSNCDVIGLDYIAQFPWEAKEEFPLLFNSIYRNYKT VEVIANSIGAYFAINALSNQQIEKAYFISPVVDMERLIADMMIWANVTEDELKERKEIQT TFGETLSWDYLCYARENPIIWEIPTHILYGEKDNLTAYGTIFEFVQRTNSTLSIMKNGEH WFHTDEQMKFLDEWIIKSSK >gi|222441830|gb|ACEP01000112.1| GENE 63 50494 - 50967 429 157 aa, chain - ## HITS:1 COG:no KEGG:Rumal_2970 NR:ns ## KEGG: Rumal_2970 # Name: not_defined # Def: GCN5-like N-acetyltransferase # Organism: R.albus # Pathway: not_defined # 1 154 1 154 157 205 66.0 5e-52 MNYSIRELKRDENKILDTFLYEAIFIPEGVPVPSKDIINKPDLQVYVKDFGENKGDLCLV AQVADEIVGAVWVRIMNDYGHIDNETPSFAISLLKEYRNYGIGTELMKQMLMKLKLEGYK QASLAVQKMNYAVRMYRKVGFEIVDENDEEYIMICKL >gi|222441830|gb|ACEP01000112.1| GENE 64 51065 - 51481 367 138 aa, chain - ## HITS:1 COG:no KEGG:Closa_3192 NR:ns ## KEGG: Closa_3192 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 138 1 139 139 112 50.0 7e-24 
MDKASGKLTVYFEEPFWVGVFERIEDGKLSVAKVTFGAEPKDYEVQEYIQKYYFSLKFSP AVETVVKDIKRNPKRMQREAKKQTMETGIGTKSQQALKLQQEQNNQVRKERNRKKKEAEE QRMFELKQQKKREKHKGH >gi|222441830|gb|ACEP01000112.1| GENE 65 51707 - 52108 326 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028342|ref|ZP_03717534.1| ## NR: gi|225028342|ref|ZP_03717534.1| hypothetical protein EUBHAL_02614 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02614 [Eubacterium hallii DSM 3353] # 1 133 3 135 135 246 100.0 5e-64 MSENNDYIQLPPLKKDTPSDVVAFMWEYMKVPENSREKVKNLLKDANENGVKLSHQAPTL YDVVPKEEIAEFEELMRKTIADIVSEASSVACWVYVQKYVKQKTLDEMLQELPGAGQFII VMDTWFERLMADQ >gi|222441830|gb|ACEP01000112.1| GENE 66 52122 - 52418 211 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153854275|ref|ZP_01995574.1| ## NR: gi|153854275|ref|ZP_01995574.1| hypothetical protein DORLON_01569 [Dorea longicatena DSM 13814] hypothetical protein DORFOR_02720 [Dorea formicigenerans ATCC 27755] hypothetical protein DORLON_01569 [Dorea longicatena DSM 13814] hypothetical protein DORFOR_02720 [Dorea formicigenerans ATCC 27755] # 1 98 1 98 98 110 98.0 3e-23 MFILKLVGKILLLPVWLILFVIGLAVKMTVQTYAVVRGILGFIFTLLIIATAYCYHDWVQ VAFLFSLSVILYLILFAGVFVDTVLDMTRERIIDFIIS >gi|222441830|gb|ACEP01000112.1| GENE 67 52428 - 53426 539 332 aa, chain - ## HITS:1 COG:no KEGG:Amet_3992 NR:ns ## KEGG: Amet_3992 # Name: not_defined # Def: replication initiator A domain-containing protein # Organism: A.metalliredigens # Pathway: not_defined # 3 318 2 319 319 233 41.0 6e-60 MEKKMTFNYFYGTEADQFSFYRIPKALFTDSYFKDLSSDAKILYGLMLDRMSLSIKNQWF DDKNRAYIYFSIEDIMELLNCGRNKATKSMRELDDETGIGLIEKRRQGFGKVNVIYVKTF MPEKTDEKKFEEELKKFKKQTSVENEEPAEVYNSNFMKSQNQTSRSPENKLQEVYISNPN NTNLSDTEMNDNKSNHIISVDEKRFDSDNRSEDYQAYENLVKETIDYESLEVTHHDDMRQ VDEIVNLIVETVMCKNDKILIASNWYPASLVKKKFLMLTYSHIEYVLHCMSGNTTKVKNI KKYLLAALFNAPSTMNGYYQAEVNHDMPGLVR >gi|222441830|gb|ACEP01000112.1| GENE 68 53526 - 54407 753 293 aa, chain - ## HITS:1 COG:SP2240 KEGG:ns NR:ns ## COG: SP2240 COG1475 # Protein_GI_number: 15902043 # Func_class: K 
Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 29 218 7 178 252 88 31.0 1e-17 MKNRSGEKIKLASIDELLGVVNEESAMEIEISKIHPFKNHPFKVLDDEKMQDLVESVRIN GVLTPVLLRMDDNEEYEMVSGHRRMHAAQLAGLTTIPAIVRELSDDDAVIAMVDANIQRE ELLPSEKAFAYKMKLDAMKRQGSRTDLTLCQSGTKSRTDQLLGEQVGESARSVQRYIRLT ELIPELLDLVDNKKLQFTVAVDISYIDKEVQEWIYEYISDTGFIKPKQIAALRNQLNDGP INQIQMLSIFNNCVMAKKVSRSLTFSEKKLTKYFPDDYTAKDMEQVIESLLEK >gi|222441830|gb|ACEP01000112.1| GENE 69 54397 - 54954 464 185 aa, chain - ## HITS:1 COG:DR0013 KEGG:ns NR:ns ## COG: DR0013 COG1192 # Protein_GI_number: 15805054 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Deinococcus radiodurans # 3 182 49 291 296 111 33.0 8e-25 MCKVISVVNQKGGVGKTTTTVNVGIGLAREGKKVLLIDADPQGSLTASLGYEEPDDLRIT LATINALVSSDSVLIPVQAAYLLVKGLQQLIKTILTVKKRLNRKLAIEGILLTMVDFRTN YARDIASRVHTTYGSQIEVFENVIPMSVKAAETSAEGKSIYMHCPKGKVAEAYKNLTQEV LKNEK >gi|222441830|gb|ACEP01000112.1| GENE 70 54969 - 55838 574 289 aa, chain - ## HITS:1 COG:SA2498 KEGG:ns NR:ns ## COG: SA2498 COG1475 # Protein_GI_number: 15928294 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 10 236 24 232 279 74 27.0 2e-13 MTEQNRQSGEKEERIIEIEIERLRPFKEHPFQVKDDKEMFLLQESIEKYGILNPLIVRPV PDGYYEIISGHRRKHAAEKLGYRKVPVIIRVLSEDDSILSMVDSNLHRERISYSEKAFAY KLKNDVLKRKSGRKKSQVDHKTPRKRAIEIISEDCGDSPKQVQRYISLTKLIPEMLQKLD DEIISFCPAVEIAALSEKEQRELLVAMEYAQAIPSLSQAQRIRQLSKEKQLSLEKMEEIM CEVKKGEITRVAFTNEQLHKYFPNSYTPAMMKREILALLKLWKKESWES >gi|222441830|gb|ACEP01000112.1| GENE 71 56385 - 57260 646 291 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3931 NR:ns ## KEGG: Cphy_3931 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 289 1 289 291 232 40.0 2e-59 MREYLKAIGFSDLNSRKKIDELINEIKKNPSRKNWFQIDEEEAIFIYEKDFAEAVGIAVI EVMDRDGYRVTDHFYPYVRGANYLYHEDLEFEHYTDKEGYAGICDENNIGIPLIFHVNNP 
VDYLKIVYGKFHDKINSITLSGMSKKGMIILPVEKDEFQEREERKGNELRNEMIDAAKAG DIEAMEQLTLEDMDTYTAVSSRSKKEDLFTIVTSYFMPHSVECDKYSVLGKIINVMEMQN SRTKEIFYYLSVECNSIQIEFTIAKEDLMGEPKVGRRFKGILWLQGEVDCL >gi|222441830|gb|ACEP01000112.1| GENE 72 57444 - 57980 367 178 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3697 NR:ns ## KEGG: EUBREC_3697 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 40 167 37 160 169 107 45.0 2e-22 MNSRILNFFNVKLDAFIVFVSVVVVIMFIALFFLLLDYLKMRKRYVEFMTGESGKSLEYT IYKRFREIDKLKAGQKDNDAQIAIIYDMLRRNYSKVGIYKYDAFNVENSLSGGSISFALT LLNSRDNGFILNVIHNRAGCHVYLKEIKEAACEQVLAEEEKISLDMALKYDEKKETDK >gi|222441830|gb|ACEP01000112.1| GENE 73 58012 - 58920 908 302 aa, chain - ## HITS:1 COG:CAC3729 KEGG:ns NR:ns ## COG: CAC3729 COG1475 # Protein_GI_number: 15896960 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 298 1 282 283 226 48.0 3e-59 MAVRKGLGRGLDLLIPKDEGMPKKSTEKNKENKEPKENQVLTLSIHDVEPNRNQPRKQFD EDAIEELADSIKQYGVIQPLIVQKKDKYYEIIAGERRWRACKKAGLKEVPVIIKNYDEKE TLKISLIENLQREDLNPIEEAKAYEQLYNTYGLKQDEIAASVSKSRTAITNIMRLLKLDE RVQNMVIENLISSGHGRTLLSIDDGDMQYQLAEKILDENLSVREAEKLVKCILEHKDNKK NVEEESSQEKSMIDFFENKMKDILGSKVVIKNKKNNKGKIEIEYYSKDELERIIDLIQSI KE >gi|222441830|gb|ACEP01000112.1| GENE 74 58910 - 59017 98 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFHSQKRQYVSRETLGGYSETVYRYMEGAGEFYGG >gi|222441830|gb|ACEP01000112.1| GENE 75 59129 - 59899 830 256 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 255 1 253 253 290 58.0 2e-78 MSRVIAIANQKGGVGKTTTSINLSACLAEKGKKVLLIDMDSQGNTTSGFGFEKNELDKTV YEVLREEVSIEEAIIPVEECFENLFLIPANRNLAGAEIELVTRENMQHILKKQLEPIKDE YDFIVIDCPPALGMLTVNAMTAADSVLVPIQCEFYALDGLSQLIYTIELIQESLNPDLYI EGVVFTMYDARTNLSLQVVENVKDNLKQTIYKTIIPRNVRLAEAPSYGLPINLYDKRSSG 
AEAYRMLADEVIENAK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:14:13 2011 Seq name: gi|222441829|gb|ACEP01000113.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont379.1, whole genome shotgun sequence Length of sequence - 20437 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 12, operones - 7 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 339 246 ## COG4268 McrBC 5-methylcytosine restriction system component + Term 346 - 388 4.1 - Term 338 - 372 3.6 2 2 Tu 1 . - CDS 388 - 531 81 ## - Prom 621 - 680 10.4 + Prom 580 - 639 7.0 3 3 Op 1 . + CDS 756 - 944 159 ## MGAS9429_Spy0565 phage protein 4 3 Op 2 . + CDS 990 - 1379 233 ## COG1598 Uncharacterized conserved protein 5 4 Op 1 4/0.000 - CDS 1635 - 3047 1996 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase - Prom 3144 - 3203 5.4 6 4 Op 2 9/0.000 - CDS 3205 - 4329 1549 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 7 4 Op 3 . - CDS 4363 - 4731 667 ## COG0511 Biotin carboxyl carrier protein - Prom 4753 - 4812 3.4 - Term 4756 - 4806 -0.6 8 5 Op 1 . - CDS 4826 - 5170 414 ## Closa_1325 sodium pump decarboxylase gamma subunit 9 5 Op 2 . - CDS 5183 - 6592 1902 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) + Prom 7232 - 7291 8.9 10 6 Tu 1 . + CDS 7330 - 7971 616 ## EUBREC_2283 hypothetical protein 11 7 Tu 1 . - CDS 8512 - 9162 627 ## COG2199 FOG: GGDEF domain - Prom 9306 - 9365 6.0 12 8 Op 1 3/0.000 - CDS 9750 - 10580 1390 ## COG5564 Predicted TIM-barrel enzyme, possibly a dioxygenase 13 8 Op 2 3/0.000 - CDS 10626 - 11852 1295 ## COG5441 Uncharacterized conserved protein - Prom 11916 - 11975 6.8 - Term 11920 - 11973 -0.7 14 8 Op 3 . - CDS 12042 - 13265 1101 ## COG5564 Predicted TIM-barrel enzyme, possibly a dioxygenase + Prom 13349 - 13408 9.0 15 9 Op 1 . 
+ CDS 13628 - 13885 342 ## Cphy_3270 hypothetical protein + Prom 13967 - 14026 1.5 16 9 Op 2 . + CDS 14065 - 14997 791 ## COG0642 Signal transduction histidine kinase + Term 15105 - 15161 1.1 17 10 Tu 1 . + CDS 15423 - 16028 806 ## COG0560 Phosphoserine phosphatase 18 11 Op 1 . - CDS 16265 - 16816 626 ## COG0655 Multimeric flavodoxin WrbA - Prom 16836 - 16895 4.8 19 11 Op 2 . - CDS 16897 - 17673 720 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 17881 - 17940 9.3 + Prom 17752 - 17811 8.4 20 12 Op 1 1/0.000 + CDS 18039 - 18845 537 ## COG0561 Predicted hydrolases of the HAD superfamily + Prom 18968 - 19027 4.7 21 12 Op 2 . + CDS 19068 - 20096 1135 ## COG0673 Predicted dehydrogenases and related proteins Predicted protein(s) >gi|222441829|gb|ACEP01000113.1| GENE 1 1 - 339 246 112 aa, chain + ## HITS:1 COG:mcrC KEGG:ns NR:ns ## COG: mcrC COG4268 # Protein_GI_number: 16132166 # Func_class: V Defense mechanisms # Function: McrBC 5-methylcytosine restriction system component # Organism: Escherichia coli K12 # 1 112 241 347 348 58 33.0 4e-09 MQTDIMLSKNNNILIIDAKYYSHMTQQQYGIHTLHSNNLYQIFTYVKNKEFELRNYEHTV SGMLLYAQTDEDIIPNNTYHMSGNQISVLALDLNQDFSKISRTLDDIAKNFL >gi|222441829|gb|ACEP01000113.1| GENE 2 388 - 531 81 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNNAVSYAMDSQVIFQMFFSIVCIAAAIGVSILVLRFMRKFIDFLLQ >gi|222441829|gb|ACEP01000113.1| GENE 3 756 - 944 159 62 aa, chain + ## HITS:1 COG:no KEGG:MGAS9429_Spy0565 NR:ns ## KEGG: MGAS9429_Spy0565 # Name: not_defined # Def: phage protein # Organism: S.pyogenes_MGAS9429 # Pathway: not_defined # 1 62 28 88 88 82 69.0 5e-15 MPMKPKEMIRLLKKNGFIKISQNGSHVKMKNFKTGKQTTVPLHSKDLKKGLEDAILKQAG LK >gi|222441829|gb|ACEP01000113.1| GENE 4 990 - 1379 233 129 aa, chain + ## HITS:1 COG:SP1786 KEGG:ns NR:ns ## COG: SP1786 COG1598 # Protein_GI_number: 15901615 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 6 125 5 146 150 75 34.0 2e-14 
MKKLFYPAVFHIAEEGGYWVTFPDFPECLTQGENMQEAYDMAIDALGLILTDDINDHKEL PEPSNITHVDDGVVVIIPYDYLEYNRKYHNKSIKKTLTIPEWLNNEALRLNLNFSQILQD ALLNKIQSR >gi|222441829|gb|ACEP01000113.1| GENE 5 1635 - 3047 1996 470 aa, chain - ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 9 453 4 448 448 550 59.0 1e-156 MAEIKKKPVKIVETVLRDAHQSLIATRMTTEEMLPIVDKMDQVGYHAVECWGGATFDACL RFLKEDPWERLRKLRDGFKNTKLQMLFRGQNILGYKPYPDDVVEYFVQKSIANGIDVIRI FDCLNDLRNLQTAVDATKKEGGHAQIALSYTIGDAYTLDYWKKTAKNIEEMGADSICIKD MAGLLVPNEATRLVTALKQATDIPIELHTHYTSGVGGMTYLKAVEAGVDIIDTAISPFAM GTSQPATEVMAETFRGTEFDTGFDQKLLGEIADYFRPIREKYLANGLMSPKVLGVNIKTL MYQVPGGMLSNLVSQLEQQGASDKFYEVLEEVPKVRKDFGEPPLVTPSSQIVGTQAVLNV LTGERYKMITNESKALLRGEYGQTVKPMNKEVQKKAIGDEKPITCRPADLLEPELSKIEA EMSQWKEQDEDVLTYALFPQVATEFFKYRAAQENKIDPAKADVKNGAYPV >gi|222441829|gb|ACEP01000113.1| GENE 6 3205 - 4329 1549 374 aa, chain - ## HITS:1 COG:SPy1177 KEGG:ns NR:ns ## COG: SPy1177 COG1883 # Protein_GI_number: 15675149 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Streptococcus pyogenes M1 GAS # 4 373 1 376 376 356 56.0 3e-98 MSYVVETMTNLAAQTGFAFLKPGNFIMILVALVFLYLAIAKGYEPLLLVPISFGMLLVNL YPSIMEEGGLLHYFYLLDEWSILPSLIFLGVGALTDFGPLIANPASFLLGAAAQFGIYSA YFLAMLMGFNGKAAAAISIIGGADGPTSIFLAGKLGQTDLMGPIAVAAYSYMSLVPIIQP PIMKLLTTEKERQIKMEQLRTVTKTEKIMFPIVVTVVVCLILPTTAPLVGMLMLGNLFRE SGVVKQLSETASNALMYIVVIILGTSVGATTSAEAFLNVSTIKIVVLGLVAFAVGTAAGV LFGKLMCKVTGGKVNPLIGSAGVSAVPMAARVSQKVGAEADPSNFLLMHAMGPNVAGVIG TAVAAGTFMAIFGV >gi|222441829|gb|ACEP01000113.1| GENE 7 4363 - 4731 667 122 aa, chain - ## HITS:1 COG:SPy1176 KEGG:ns NR:ns ## COG: SPy1176 COG0511 # Protein_GI_number: 15675148 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Streptococcus pyogenes M1 GAS # 1 
120 1 115 116 75 44.0 2e-14 MKNYRITVNGVAYDVAVEELGAGEESAAPVSTPAAVPAPAKKAAPKTSGSAGAVKITAPM PGKIVAVKAQAGASVKKGDAVLVLEAMKMENEICAAQDGVIASVEVAVGDMVEGGDVLAT MN >gi|222441829|gb|ACEP01000113.1| GENE 8 4826 - 5170 414 114 aa, chain - ## HITS:1 COG:no KEGG:Closa_1325 NR:ns ## KEGG: Closa_1325 # Name: not_defined # Def: sodium pump decarboxylase gamma subunit # Organism: C.saccharolyticum # Pathway: not_defined # 5 110 152 247 256 72 57.0 4e-12 MNVFEQALLNTLMGMGTVFAVLIFISLLISLFVYIPSIERALKNRSSKKEKKAAQEERPA PKRPILEEVVEEEELVDDGELVAVITAAIMAANGGAAVSADKLVVRSIKRVKRR >gi|222441829|gb|ACEP01000113.1| GENE 9 5183 - 6592 1902 469 aa, chain - ## HITS:1 COG:VNG1529G KEGG:ns NR:ns ## COG: VNG1529G COG4799 # Protein_GI_number: 15790513 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Halobacterium sp. NRC-1 # 1 466 35 511 516 336 37.0 5e-92 MNARERIDYLFDDHSFVEVGGYVSARNTDYNLGAKKVEGDGVITGYGVINDRMVYVYSQD ASALGGSVGEMHAKKIASLYDLAMKTGAPIIGMLDCAGLRLQEATDALHAFGTIFNKQAQ ASGMIPQISAVFGMCGGGSAVMSAMSDFTLMEKENGSLFINSPNALAGNEESRQNTSGAQ FQSEQAGNVDFVCETEEALLDQLRILVDVLPSDYEAEDVYFDNDDDLNRVIPELNDAAYD AAYILKSISDNNMYIETKAAYAKEMVTAFIRLNGETVGAVANQPVDGESVLTRNGLEKAE RFIKFCDAFSIPLLTVTNVTGLRANAHEERKQALALAKFTTTLAQVSVPRINLIAGEAYG TAAEVMNSKAIGADFVLAWPQASVGMMDAEQAVRIMYAEEINTDKNKTALIAEKTAEYAE GQSSALAAAKRGYIDDIIEPDATRKRLIAAYEMLYGKKVTPVTKKHTAV >gi|222441829|gb|ACEP01000113.1| GENE 10 7330 - 7971 616 213 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2283 NR:ns ## KEGG: EUBREC_2283 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: Pyrimidine metabolism [PATH:ere00240]; Metabolic pathways [PATH:ere01100] # 1 212 1 212 212 226 49.0 4e-58 MENRKIQFRSKACNLHLSAYPGHFATKHSHVNYFLDMTTLKVRQSNAEEAARALVPLYKH NTVVDTIVCLDGTEVIGAFLAEKLTESGFFSYNQHKSIYIVTPEIDSDGQMFFRKNIQPM IKDRNIIILMASVTTGITINRSWECIEYYGGKLQGISAIFSTLSEYGNLPVNSLFTPNDI PTYHTYSKHDCPYCKSGQKIDALVNGHGYSELL >gi|222441829|gb|ACEP01000113.1| GENE 11 8512 - 9162 
627 216 aa, chain - ## HITS:1 COG:aq_1455 KEGG:ns NR:ns ## COG: aq_1455 COG2199 # Protein_GI_number: 15606624 # Func_class: T Signal transduction mechanisms # Function: FOG: GGDEF domain # Organism: Aquifex aeolicus # 68 215 40 182 236 85 37.0 8e-17 MTLRIFTMFPQKKYYALRGKLEMIFGMVVFPVIAGVLQMFFTEISIICFGLTLGIIQVFT AFLTNRITMDELTQINNRTKLMQYLEGYMERHTEGEETDLHFLMIDLDDFKWINDTYGHV EGDRALIRIAGVLKKTLAGHAGILARYGGDEFCIAGEMLREEAEHLIKNLYENLEKANKQ AASPYDIGMSVGCAQFTKEVRTIPDLVNSADEDMYL >gi|222441829|gb|ACEP01000113.1| GENE 12 9750 - 10580 1390 276 aa, chain - ## HITS:1 COG:mll9387 KEGG:ns NR:ns ## COG: mll9387 COG5564 # Protein_GI_number: 13488176 # Func_class: R General function prediction only # Function: Predicted TIM-barrel enzyme, possibly a dioxygenase # Organism: Mesorhizobium loti # 3 276 9 282 285 337 59.0 2e-92 MLRKSREEILKDFKAQVAQGKILVGVGAGTGITAKCSEKADVDMLIIYNSGRFRMAGRGS LSGILSYGDANAIVQEMGQEVLPIVKKTPVLAGVCGTDPFRVMDIFLKQLKEQGFNGVQN FPTVGLIDGKFRANLEETGMGYGLEVDMIREAHKLDMLTCPYVFEPEQAKAMAEAGADIL VAHMGLTTKGSIGAETALTLDDCCEKIREIIKAGKEVNPDIMVICHGGPIADPEDAAYVI NNVPEIDGFFGASSIERLASERGMTAQAAAFKAIEK >gi|222441829|gb|ACEP01000113.1| GENE 13 10626 - 11852 1295 408 aa, chain - ## HITS:1 COG:YPO3839 KEGG:ns NR:ns ## COG: YPO3839 COG5441 # Protein_GI_number: 16123974 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Yersinia pestis # 2 405 3 401 405 303 43.0 5e-82 MKTIAIAGTFDSKGEEFLYVKKLVEKLGLQTYMIHTGVFEPAFEPDVSNREVAEAAGYDI DKIVEKKDRALATEALSNGMKVLVPQLYKEGKFDGILSFGGSGGTSLVTPAMQKLPIGVP KLMVSTMASGDVERYVGTSDILMMPSIVDVAGINKISKIIFKNAVLTIAGMVAGQEKLAE EETIEEKPLIAASMFGVTTPCVEQAKAVLEQAGYEVLVFHATGTGGKTMESLIESGFFTG VLDLTTTEWCDEVVGGILAAGEDRCRAAIRKKIPQVVSVGACDMVNFGPIDEVPKQFAGR NLYKHNPLVTLMRTSVEENKVIAEKLAERWNETENKMTLFLPKQGVSMIDAKGQPFDGPE ERKVLFDTLKNNISNENVEIVEMDNNINDEAFAKAAAQKLMELINEGK >gi|222441829|gb|ACEP01000113.1| GENE 14 12042 - 13265 1101 407 aa, chain - ## HITS:1 COG:YPO3838 KEGG:ns NR:ns ## COG: YPO3838 COG5564 # Protein_GI_number: 16123973 # Func_class: R 
General function prediction only # Function: Predicted TIM-barrel enzyme, possibly a dioxygenase # Organism: Yersinia pestis # 3 270 10 277 280 204 39.0 2e-52 MDREYILRLLRAQINVNGHIIGAAVGSGMTAKFAAMGGADLLLALSAGKYRIMGRSSFAS YFCYGNSNEQVMEMGQRELFPIIKDTPIVFGLMANDPSLHLYEHLKAIKEAGFSGIVNFP TMALIDGQFREALEEEGNTYKQEVEAIHLARYLDLFTIAFVTTEEETRQMIEAGADVICV NLGLTKGGFLGPKRHLSIEDARRMTDKIYALCNEMNPDIIKMIYAGPANTPIDMQYMYQN TECQGYIGGSTFDRIPTERAIYNTMKAFKSYGDFDRNNPMTRLLDGDWNARTTVEFVQKY VEEHYMDEVILGDLALVAHVSPSYLSTRFKKETGASFTEYLIRYRVNKAKNLLKEKQNRC REVAENVGYKDYAQFSKIFKKYVGLSPKEYQRMALEEENQYKHKKSE >gi|222441829|gb|ACEP01000113.1| GENE 15 13628 - 13885 342 85 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3270 NR:ns ## KEGG: Cphy_3270 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 85 1 85 85 79 48.0 6e-14 MNKFYKALEEKIKASGYPREISGRDVYNDICDQIEDKENGTYLVLSKFEDDVTFEYHITI QDEGFNLGILTMRTPEGVFQTDFDE >gi|222441829|gb|ACEP01000113.1| GENE 16 14065 - 14997 791 310 aa, chain + ## HITS:1 COG:VCA0736_1 KEGG:ns NR:ns ## COG: VCA0736_1 COG0642 # Protein_GI_number: 15601492 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 21 202 491 669 726 79 31.0 1e-14 MNCHDPFYFFLIAFKFINNLFRTPLNGIIGIIDINSKSSDMELIEENRQKARCAAQQLLS LLNEVLDMSRLEHGETVLTNELFHLSELTAALQDTMQIHAADSGIILICDSYTVPSNCER VYGSPIYIQKIFFNIIENAIKYNKTGGQIRWKTELIEQTESRLTYKYTISDTGVGMNETT VTIILPFTIANAQKNSSAHGNGNADYLNNDEFTPSISDEDITAASLSGKQILLVEDNELN LEIAQFMLEDAGIIVTPVKDGSQALKAFLEKPAGTYYDMILMDIMMPVMNEHLTKPLNCE KLIATIVKFL >gi|222441829|gb|ACEP01000113.1| GENE 17 15423 - 16028 806 201 aa, chain + ## HITS:1 COG:PA1757 KEGG:ns NR:ns ## COG: PA1757 COG0560 # Protein_GI_number: 15596954 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 1 191 1 191 205 213 53.0 2e-55 MEIVCLDLEGVLVPEIWIAFAEETGIPELKRTTRDEPDYDKLMKYRLNILKEHGLGLKEI 
QETIAKIDPIPGAKEFLDKLRELTQVIIISDTFSQFAGPLMKKLGYPTIFCNSLVVADNG EITDFKMRCEKSKYTTVKALQSIGYDTIASGDSHNDLGMIQASKAGFLFKSTDAIKAEYP EIPAYETYDELYAAIKGVIEA >gi|222441829|gb|ACEP01000113.1| GENE 18 16265 - 16816 626 183 aa, chain - ## HITS:1 COG:MA0444 KEGG:ns NR:ns ## COG: MA0444 COG0655 # Protein_GI_number: 20089335 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 4 183 2 180 201 100 37.0 2e-21 MGKKVLIITGSPRLKGNTNALVTAFASGAVAAGNEVEVFDAATANLNGCHADKSCEQRGC CGQKDDGVRMNELMREADVLVLASPVYWSGFTSQIKTAIDRFYQFSFPKGRETLHIKDLY LIAASANPDSAMFNDVVHAYEQLCELLHFDKGGTLLCPGLAGADDLENHQEFLEKAVKMG MAI >gi|222441829|gb|ACEP01000113.1| GENE 19 16897 - 17673 720 258 aa, chain - ## HITS:1 COG:CAC2950 KEGG:ns NR:ns ## COG: CAC2950 COG1349 # Protein_GI_number: 15896203 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Clostridium acetobutylicum # 11 251 5 248 254 116 31.0 5e-26 MECRKELSGVKRREQIIDNLREHGEVTVKILSNQFGVSEMTIRRDLHLLEEQGYATIHYG GAALLDRQNIGCQSFSLRKGKESENKRRIAKKAASFIREGDVLFMDTSTTVLEILTYMPA FRVKIITNSMPVMESVYRDENIDLYMAPGLYRSVGGPLDYATAEYLSRFHYNKAFFGSTS YEPDFGTSAAEEIEAAVKKCVIENSDENYLLIDHSKMGKRNFVKWGDTDNYTAILSDEEL DIGWRTKIARNGGKLILC >gi|222441829|gb|ACEP01000113.1| GENE 20 18039 - 18845 537 268 aa, chain + ## HITS:1 COG:HI0003 KEGG:ns NR:ns ## COG: HI0003 COG0561 # Protein_GI_number: 16271979 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Haemophilus influenzae # 4 264 3 258 262 121 32.0 1e-27 MSTYKAVFSDIDGTLLTSEHIVSTATIEAIASLQDKNIPFVIISSRSASCIYPILEKHHF QCPIVACGGAWIEDMNGRILSNEGMSKQTAATVIEFMTEKKFDLAWGIYSGKNWITNDRT DARILHEESVVEVESVEGTVALLPEDAVVNKILCMCNPACILQIEEDLKKAFPILSIVKS SNYLLEIMPAGMNKAKGIRKFCELFSIATEDSVAFGDNYNDLEMLYAAGCGILMGNAPVA ILETFPGKVTLDNNHDGIPVALKDLGLI >gi|222441829|gb|ACEP01000113.1| GENE 21 19068 - 20096 1135 342 aa, chain + ## HITS:1 COG:BH2703 
KEGG:ns NR:ns ## COG: BH2703 COG0673 # Protein_GI_number: 15615266 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus halodurans # 1 342 1 322 329 245 40.0 8e-65 MKLAFIGTGKIIADALFAIEPIKEIEKTAIFARPHSEEKAEAFAKQYNIAEVYTDYEKLL TETAADTVYIGLINSAHYSYAKQALLHGKNVILEKPFTGFYNEAQELQQIAEENKLFIFE AVTVLHNEVFYEMKKNVEKLGTLRMALCNYSQYSSRYDAYLEGDITHSFDPAYYGGALYD INVYNIHYCVGLFGEPKDVNYYPNIGPNGIDTSGTLVLVYDGFSAVCTGSKDSDSPGYVS IQGEKGFMKIDSKPNIASELTTTYVDENVKERVRDAAGAMVRATITDYFTAKEQHHRMTQ EFLDFAKIIDEKDYETAHELLTESVTVVGVLEKARSKAGIKF Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:14:32 2011 Seq name: gi|222441828|gb|ACEP01000114.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont380.1, whole genome shotgun sequence Length of sequence - 11390 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 6, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 42 - 695 567 ## Closa_3698 von Willebrand factor type A 2 1 Op 2 . + CDS 790 - 1998 840 ## COG1879 ABC-type sugar transport system, periplasmic component 3 2 Tu 1 . + CDS 2101 - 2730 644 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 2806 - 2840 -0.8 4 3 Tu 1 . - CDS 3075 - 4442 1066 ## COG0534 Na+-driven multidrug efflux pump - Prom 4563 - 4622 1.5 5 4 Tu 1 . - CDS 4700 - 6091 989 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 6119 - 6178 6.5 + Prom 6526 - 6585 3.7 6 5 Op 1 3/0.000 + CDS 6611 - 6907 462 ## COG0011 Uncharacterized conserved protein 7 5 Op 2 21/0.000 + CDS 6876 - 7658 572 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component + Prom 7680 - 7739 5.6 8 5 Op 3 17/0.000 + CDS 7830 - 8873 1432 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 9 5 Op 4 . 
+ CDS 9085 - 9819 238 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 10 6 Op 1 . - CDS 9676 - 10038 138 ## 11 6 Op 2 . - CDS 10031 - 10195 156 ## gi|225028386|ref|ZP_03717578.1| hypothetical protein EUBHAL_02659 12 6 Op 3 . - CDS 10183 - 11055 415 ## Cphy_2596 Ig domain-containing protein - Prom 11106 - 11165 7.5 Predicted protein(s) >gi|222441828|gb|ACEP01000114.1| GENE 1 42 - 695 567 217 aa, chain + ## HITS:1 COG:no KEGG:Closa_3698 NR:ns ## KEGG: Closa_3698 # Name: not_defined # Def: von Willebrand factor type A # Organism: C.saccharolyticum # Pathway: not_defined # 1 217 1 217 219 387 85.0 1e-106 MRKNLTEIVFILDRSGSMSGLEQDTIGGFNSMINQQKNADGEALVSTILFDNVSEVLHDR INVKDIQPLTDHDYMVRGCTALLDAIGGAIHHIGNIHKYARQEDIPEHTMFVITTDGMEN ASRYYSSNKVKQMIERQKIKYGWEFLFLGANIDAVETASLFGIDEDRAVNYQCDSEGTAL NYEVISEAISAVRCSVPLGSNWKKRIDEDFKKRGNRE >gi|222441828|gb|ACEP01000114.1| GENE 2 790 - 1998 840 402 aa, chain + ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 240 129 353 357 103 30.0 8e-22 MKEGADGILLAPANSNSLVDSCEKVRKKKIPVVLIDSSINSTEFDVCYMTDNIDAGEMAA KEMLTMLHDTGNSPQESLEVGILLSSDASQAMVNRVSGFLDYWAKYAPTKWEITKDIALN GGDVKKAESDAAKLLKKNENIKGIYGCNNTSTVGIAKTLTKQKRTDIVMVGFDMAEETKQ FIKDSDYRGVSLMQKQDQMGYLGIGTLNSLINGKKSEQKYFDTGVIMIDSDYLMEKKEKK RYIGLSSTDALTGILNRRAFQVELEEQIRKKEPGFFIFIDVDNFKSYNDTYGHNNGDLCL KHFAKAIQECFPEGSILGRYGGDEFVVYLKDVTKESVYIYMEKFQKMIEDLTLATGEQVY LSASAGGAAFPEQGEDFISLCRSADVALYNVKQNGKAAFKIK >gi|222441828|gb|ACEP01000114.1| GENE 3 2101 - 2730 644 209 aa, chain + ## HITS:1 COG:Ta0460_1 KEGG:ns NR:ns ## COG: Ta0460_1 COG0350 # Protein_GI_number: 16081579 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Thermoplasma acidophilum # 116 200 21 105 120 74 42.0 2e-13 
MNTREEAIKYGLSFANTYVEAPFRDQNWQLVRVKGSKKAFLWVYEKDGFIHLNVKVEPEW RDFWREVYSSVIPGYHQNKKYWNTLILDGTIPEKEVKRMITESYDIVTDSPTKRIYEAVK KIPKGHVATYGQVAAMAGNPKMSRAVGNALHKNPDPENIPCFRVVNSKGELAGAFAFGGE GEQAKRLEEDGVEVKNGKVDLKKFGINSN >gi|222441828|gb|ACEP01000114.1| GENE 4 3075 - 4442 1066 455 aa, chain - ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 11 454 15 457 459 230 33.0 3e-60 MKKTFFSHDKNFYAALFPLLIIISLQNLISYSVNMADNIMLGAYSQNALSGASLVNQVFF LIQQAAVVIGDGLVVIASQYWGQKRMEPIRQITGIVLKIAFCLGIFLILLCSLFSGAILG FFTSDTAILSEAGSYLQIIKYTFLLYLVTQVLLSALRSVETVRIAFVISCVSLVVNVSIN YTLIYGRFGFPKLGIRGAAIGTMISRCLEFLIVFIYVIKIDKKLKLFSDKAVFHIDKLLR SDYVKVAVPVFISGMLWAISVPMQTAILGHLSSDAIAANSVASTFYQYMKVVVLAMSSAS SVMMGKAVGKGDLDEVKAAARTLSVLDVGIGIILGAVLYFTKDFLLAQYTLSPTAAQLAE QLIIIMAIVMVGMAYQMPVGSGIIRGAGDTKFSLYVNMISTWGIVLPLSLSAAFWWKWPI PAVVICLQSDQIFKGLPIFLRFRSYKWIHKLTRDK >gi|222441828|gb|ACEP01000114.1| GENE 5 4700 - 6091 989 463 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 4 458 2 441 456 385 44 1e-107 MLKTIEAVNSAVNNFIWGIPAMVCIIGVGLYLSFRTGFIQIRKFPYSIKNTLGRIFDKSE AADGSITPFQAVCTALAGTVGTGNIAGVAGAITIGGPGAVFWMWISALLGMCTKYSEVTL AVHFREKNSQGDWVGGPMYYIKNGLSKHWHWLAFLFSLFGVLTVFGTGNATQVNTIVTAI DSLMLNFNLASSGSLSTINLVIGIVIAIAVALILIGGIRRIGKVTETLVPFMALLYILLG LGVVILNIQQIPAVFQSIFEGAFHPAAFTGGMVGTLFTSMKKGVSRGIFSNEAGLGTGSI AHATADTSQPVKQGLFGIFEVFTDTIVICTLTALIILCSGINVPYGQTAGAELTIQGFTA TYGGWVSIFTAIALCCFAFSTTIGWGLYGSRCIEFLFGTKTIRPFMIVYSLVAILGATVD LGLLWSIAETFNGLMSIPNLIAVFLLSGTVVKLTKEYFGAKKA >gi|222441828|gb|ACEP01000114.1| GENE 6 6611 - 6907 462 98 aa, chain + ## HITS:1 COG:CAC1398 KEGG:ns NR:ns ## COG: CAC1398 COG0011 # Protein_GI_number: 15894677 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 98 6 97 97 76 47.0 1e-14 
MNASVAIQVLPSVQDEEEVIRIVDEVIDYIKSTGLNYYVGPCETSIEGDYDTLMEIVKNC QLVAAKAGCKAMNTYVKISYKAEGDVLTIDKKVTKHHQ >gi|222441828|gb|ACEP01000114.1| GENE 7 6876 - 7658 572 260 aa, chain + ## HITS:1 COG:FN0237 KEGG:ns NR:ns ## COG: FN0237 COG0600 # Protein_GI_number: 19703582 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Fusobacterium nucleatum # 21 257 1 237 241 216 46.0 2e-56 MTKRLQSITSKLPAVVALCLLILLWQFLCQSGAVPAYMLPSPVQVIKALVTDLPTILEHA SVTLQEAFYGLCIGVGLAFVMATLMDHFQILNKALYPILIITQTIPTIAIAPLLVLWMGF YMAPKITLVVITTFFPITVGLLDGYKSVDKDSIDLMRAMGATKIQIFFHVKFPTALPQFF SGLKISASYAVVGAVISEWLGGFEGLGVYMTRVSKAYAFDKMFAVIIFIVIISLLLMGAV NLIKTLSLPWVRVEKKEALE >gi|222441828|gb|ACEP01000114.1| GENE 8 7830 - 8873 1432 347 aa, chain + ## HITS:1 COG:SP2197 KEGG:ns NR:ns ## COG: SP2197 COG0715 # Protein_GI_number: 15902004 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Streptococcus pneumoniae TIGR4 # 34 345 27 334 335 294 48.0 2e-79 MRKLNWKKIVAGTLSAVMLLSVTACGSSSTKSESASTEAGKKELTKVTFCLDWTPNTNHT GIYAAKALGYYEDAGLDVEIVQPPENGAATMCASGQAQFAIEAQDTMAAAFDSDNPLGIT AVAGLIQHNTSGIISRKGDGISSPKGLEEKTYSTWESPIEQATLKTVMKDEGADFSKVKL IPNNITDEPAALKANQTDAIWVFYGWGGINAEVEGVDCDYWNFKDIDSVFDYYTPVMIAN NDFLKNSPDEAKAFLAATKKGYQYAIDNPKKAADLLIAGDDTGSLKGSEDLVYKSQEWLS KQYVADADNWGVIDETRWNNFYKWLAENKLTTKDLTGKGFSNDYLPK >gi|222441828|gb|ACEP01000114.1| GENE 9 9085 - 9819 238 244 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 227 1 238 245 96 26 9e-20 MVLLKAEHITKKYSGRTIIKDINIELHKGELISLLGVSGSGKTTLFHVLSGLTTPEEGRV LLNGEDITSRPGQISYMLQKDLLFPHKKIIDNTALPLVLHGMSKKEARKKAQTYFEDFGL EGTEYQYPSQLSGGMRQRAALLRTYLSSNGVALLDEPFSALDTITKTAIHRWYLEVMQHI ELSTIFITHDIDEAILLSDRIYILNGKPGEIRDEIIIKEKKPRAEDFYLTDEFLAYKREI IAKL >gi|222441828|gb|ACEP01000114.1| GENE 10 9676 - 10038 138 
120 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHEKRSYLPVVLSRSEMEAIIDSTFNLKHKAMIALMLDIFIIGSYLNIRQNVVLCSDYRF LYRGVSLPAANQNLKLRYYFPLICKEFICEVKILRPWFLFFNDDLISNFSRLSIENINSV >gi|222441828|gb|ACEP01000114.1| GENE 11 10031 - 10195 156 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028386|ref|ZP_03717578.1| ## NR: gi|225028386|ref|ZP_03717578.1| hypothetical protein EUBHAL_02659 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02659 [Eubacterium hallii DSM 3353] # 1 54 1 54 54 83 100.0 5e-15 MEILIEQNRCAMACESYVQRFLFIKIKKAVTKTSYFNIVSFTIVDFIMEVNYYA >gi|222441828|gb|ACEP01000114.1| GENE 12 10183 - 11055 415 290 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2596 NR:ns ## KEGG: Cphy_2596 # Name: not_defined # Def: Ig domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 39 111 32 104 390 68 47.0 4e-10 MRVHLTPFLKRMTNYFFSILLISSLLSISVFSPTTVHAASKVKLNKTNITLVKGKQSKLK VKGTTKKVKWCSSNKKVAKVNTNGTVKGVKPGTCYIYAKVNKKKLKCKVTVMTQKSFNAK QFYLFVRKKGKKGKNNTRILSREILYPTDELHMTYIVAYPQVGKISFMYNFYPCNQNANY KTTITMNIISGQEGTVFSKISNWNPSSTVESTGTITMNYDGKSKGFSYTKINSHFEADYE EDNYDYIGPPSSIDIPSFLSSINDTFSSCNKLMKKYGYSMKKIGFTKWKF Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:14:53 2011 Seq name: gi|222441827|gb|ACEP01000115.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont381.1, whole genome shotgun sequence Length of sequence - 631 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 93 - 136 1.0 1 1 Tu 1 . 
- CDS 223 - 630 75 ## Cphy_2587 serine-type D-Ala-D-Ala carboxypeptidase (EC:3.4.16.4) Predicted protein(s) >gi|222441827|gb|ACEP01000115.1| GENE 1 223 - 630 75 135 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2587 NR:ns ## KEGG: Cphy_2587 # Name: not_defined # Def: serine-type D-Ala-D-Ala carboxypeptidase (EC:3.4.16.4) # Organism: C.phytofermentans # Pathway: Peptidoglycan biosynthesis [PATH:cpy00550]; Metabolic pathways [PATH:cpy01100] # 1 132 314 446 450 99 41.0 3e-20 ISLIAVVMACQSSKERVKDCASLLDYGFSLCKIYTDKQPPALSTVPVHNGTKEFISCKYK KNFSYVFTSEINQNNIQKKVVFQKNLSAPIKKNQVIGRLEYSYNNKKIGKVSILASETVG KAKYTDYLQKLLLNL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:15:00 2011 Seq name: gi|222441826|gb|ACEP01000116.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont383.1, whole genome shotgun sequence Length of sequence - 1469 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + SSU_RRNA 1 - 1398 99.0 # EF404081 [D:1..1493] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:15:02 2011 Seq name: gi|222441825|gb|ACEP01000117.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont384.1, whole genome shotgun sequence Length of sequence - 4378 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 1 - 702 920 ## COG2145 Hydroxyethylthiazole kinase, sugar kinase family 2 1 Op 2 11/0.000 + CDS 689 - 1342 639 ## COG0352 Thiamine monophosphate synthase 3 1 Op 3 3/0.000 + CDS 1388 - 2839 1250 ## COG0351 Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase + Prom 2849 - 2908 7.2 4 1 Op 4 . 
+ CDS 2952 - 4256 1786 ## COG0422 Thiamine biosynthesis protein ThiC Predicted protein(s) >gi|222441825|gb|ACEP01000117.1| GENE 1 1 - 702 920 233 aa, chain + ## HITS:1 COG:CAC3096 KEGG:ns NR:ns ## COG: CAC3096 COG2145 # Protein_GI_number: 15896347 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxyethylthiazole kinase, sugar kinase family # Organism: Clostridium acetobutylicum # 1 231 41 271 273 267 59.0 1e-71 MADDSGEVEDITSICQGLNINIGTLNKNTIPSMFLAGKKANVLGHVVLLDPVGAGASGLR TKTANELVRDIKFTVIRGNISEIRTLMEGTGNTKGVDADLADAVTEENLDETIGKLKAFS EATGSVIAVTGAIDLVADSEKCYVIRNGKPELGAITGTGCQLSGMMTAFLCANAEQPLEA AAAAVCAMGIAGEIGFEHLKEGEGNSTYRNRIIDAIYHMDGNTLEERAKYELR >gi|222441825|gb|ACEP01000117.1| GENE 2 689 - 1342 639 217 aa, chain + ## HITS:1 COG:SP0725 KEGG:ns NR:ns ## COG: SP0725 COG0352 # Protein_GI_number: 15900622 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate synthase # Organism: Streptococcus pneumoniae TIGR4 # 4 216 1 209 210 144 40.0 2e-34 MSCVNREQLRKDLLLYAVTDRSWLGNETLYEQVEKALKGGATFIQLREKELMRNISWKKR WLSKNCVISIMFHLLSMIMSESQKIWMRMEVHVGQSDMEADDVRKILGEDKILGVSAQTV EQAVLAEKMGADYLGVGAVFSTSTKKDAADVSKETLKAICEAVNIPVIAIGGIGADNILS LQGSGICGIAVVSAIFAAKDIEAATKVLKDRTEKMLK >gi|222441825|gb|ACEP01000117.1| GENE 3 1388 - 2839 1250 483 aa, chain + ## HITS:1 COG:CAC3095 KEGG:ns NR:ns ## COG: CAC3095 COG0351 # Protein_GI_number: 15896346 # Func_class: H Coenzyme transport and metabolism # Function: Hydroxymethylpyrimidine/phosphomethylpyrimidine kinase # Organism: Clostridium acetobutylicum # 218 483 4 262 265 283 54.0 7e-76 MIKGAIFDIDGTLLDSMPIWENAGARYLATLGIKAKPDLKERLDALSLPEGAIYMQKEYG LSVSAEDILEGVNQVVKDFYYKEAVMKPGAYALVKRLKENGVKLIIATATDKEMAKAALI RNGIWQDFTGMITCEEAGAGKTSPKVFELARQKLGTKKEETWVFEDSLYAVKTATEAGFP VCSIYDTYSVGNAKEIQKLSNIYVRDFSEIGDYSFSNMKTVLTIAGSDSSGGAGIQADIK TLTVHKVYAMTCITALTAQNTVGITGIMPVPAEFFKKQMESIFTDIKPDAVKIGMIASKE QAEIIAEYLEKYSIKNVVADPVMISTSGTVLVEETTRKILYEKLYPKVSLLTPNIPETEF LSGIKITDKKTREEAAKVIADRWNCAVLSKGGHSEENADDLLYESFLQEEKKEKAVWFPE 
ERIDNPNTHGTGCTLSSAVAANLAKGFPVEESVKKAKVYISGAIRAMLNLGQGNGPLNHM WDL >gi|222441825|gb|ACEP01000117.1| GENE 4 2952 - 4256 1786 434 aa, chain + ## HITS:1 COG:CAC3014 KEGG:ns NR:ns ## COG: CAC3014 COG0422 # Protein_GI_number: 15896266 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis protein ThiC # Organism: Clostridium acetobutylicum # 3 434 2 433 436 556 64.0 1e-158 MRNYTTQMDAARKNIITPEMEKVAAKERMTPEEIRALVAEGKVAICANKNHKCLDPQGVG SMLKTKINVNLGVSKDCKDYDIEMKKVMEAVNMGADAIMDLSSHGNTEPFRKKLTSECPV MIGTVPIYDSVIHYQRDLDTLTAKDFIDVVRLHAENGVDFVTLHCGITRKTIDQIKKHKR KMNIVSRGGSLVFAWMTMTGEENPFYEYFDEILDICQEYDVTISLGDACRPGCLADGSDV CQIEELVRLGELTKRAWEKNVQVMVEGPGHMPMDQIAANMKLQSTICSGAPFYVLGPIVT DIAPGYDHIVSAIGGAIAAQNGAAFLCYVTPAEHLALPNLEDVKQGIMASKIAAHAADIA KGVRGAREIDDKMADARRVLDWEAQWECAMDPETAKAIRDDRKPEHEDTCSMCGKFCAVR SMNKALAGEHIDIL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:15:03 2011 Seq name: gi|222441824|gb|ACEP01000118.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont385.1, whole genome shotgun sequence Length of sequence - 4896 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 33 - 92 7.3 1 1 Tu 1 . + CDS 121 - 1557 1130 ## COG0297 Glycogen synthase - Term 1719 - 1763 -1.0 2 2 Tu 1 . - CDS 1919 - 3667 1385 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP + Prom 3618 - 3677 3.9 3 3 Tu 1 . + CDS 3912 - 4337 497 ## EUBELI_00568 hypothetical protein + Term 4514 - 4565 8.7 + Prom 4532 - 4591 9.9 4 4 Tu 1 . 
+ CDS 4612 - 4894 278 ## COG1192 ATPases involved in chromosome partitioning Predicted protein(s) >gi|222441824|gb|ACEP01000118.1| GENE 1 121 - 1557 1130 478 aa, chain + ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 3 475 2 473 477 403 44.0 1e-112 MKKILFASAECAPFVKTGGLGAVVGSLPKQLNHKKYDVRVVLPDYHCIDAKWREQMETLV TFPMYLGWRMQTVTVKTLKYEGIVYYFIENNFYFCGDSPYYDMWVDIEKFSYFSKAVLEM LSYLEFEPDIIHCHDWQSSLVPVFLKAFYCADPFYRNIKTVMTIHNLKFQGITEIDRLKD ITGLPDDMFTYDKLEYNNSANLLKGGLVFADKITTVSKTYAEEIKQPEYGEGLDSVLQHR SDDLCGIVNGIDYDIYNPSTDEYILCHYDDKTFDKGKKKNKASLQAKTGLPKKRTAFTLG IVSRLTEQKGFALFSYMMERLMKKPVQLYVLGDGEEEYRNMFLSYQEQYPDKVYVQLDYT DEMAKYIYAGCDAILMPSRFEPCGLCQLMSLRYGAVPIIRKTGGLKDTVSIYNPKSKTGT GFGFTDYNAEGFLDAILAALDVYKKNPEEWKNLTAKGMRKNYSWKASSRKYEKLYSDL >gi|222441824|gb|ACEP01000118.1| GENE 2 1919 - 3667 1385 582 aa, chain - ## HITS:1 COG:BH2516 KEGG:ns NR:ns ## COG: BH2516 COG1293 # Protein_GI_number: 15615079 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus halodurans # 1 572 1 559 570 351 36.0 2e-96 MAFDGVVIANIVYDMNRLLTGGRIYKIYQPEADELLLVVKNNKETYRLLISAGASLPLIY FTTKTKANPMTAPNFCMLLRKYISNGRIIEITQPGMERIVEITIEHLNELGDTCRKKMVI EIMGKHSNIIFIDEKNMIIDSIKHISNQVSSVREVLPGRTYIAPPGKGKISLNDLSVEWM ENTLLVKPVSVQKAVYGSISGISPVLANEICYRADIDGDSAVASLTPVQQSSLYGELVRL KNQIETHAFSPNIIYEGKAPKEFGAVTFSMFSDLTTEEYPSISEVLETFYAQKEVVTRIR QKSVDLRHIVQNALERTAKKYDLQLKQLKDTDKKEKYKIYGELLHTYGYEAAPHQKELKC INYYDGKEITIPLDPDLNAMENAKKYFDRYGKLKRTYEALSTLTKETFAELSHLESVSNA LEIARDENDLAMIKEELIECGYMKRHGRNQKKRQGKSKPLHYISSDGFHMYVGKNNFQND ELSFKFANGKDMWFHAKKMAGSHVIVKLGTANELPDRTYEEAARLAAHYSKGKNAPKVEV DYTERRNLKKPPQAKPGYVIYHTNYSMLIDPDISGIKEISDL >gi|222441824|gb|ACEP01000118.1| GENE 3 3912 - 4337 497 141 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00568 NR:ns ## KEGG: EUBELI_00568 # Name: not_defined # Def: hypothetical 
protein # Organism: E.eligens # Pathway: not_defined # 16 122 20 128 157 94 43.0 2e-18 MGKAENGEIREVTANIWEDRKHHLWFPISFTKYTVGNGRLYVNSGFLSSREDECLLYRIT DITLYRSLPQRIFGTGTIELHTKDRSTPVIRLENIAKSAQVKRVLSDLIEKEREDKKVVG RDMYGAVSHIDPMEEIQDDHM >gi|222441824|gb|ACEP01000118.1| GENE 4 4612 - 4894 278 94 aa, chain + ## HITS:1 COG:RP058 KEGG:ns NR:ns ## COG: RP058 COG1192 # Protein_GI_number: 15603937 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Rickettsia prowazekii # 1 90 1 91 255 62 38.0 3e-10 MKVYVVANLKGGVGKTTTTVNVAYTFSEMGGRVLVIDLDPQCNCTRFFAKVNGYSKTIRD VLENPKGINSAVYRTKYQDIDIVKGSVKITEQKT Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:15:15 2011 Seq name: gi|222441823|gb|ACEP01000119.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont392.1, whole genome shotgun sequence Length of sequence - 35089 bp Number of predicted genes - 36, with homology - 35 Number of transcription units - 19, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 218 - 277 5.4 1 1 Tu 1 . + CDS 400 - 1089 619 ## CLJ_A0004 putative resolvase + Term 1170 - 1211 8.0 + Prom 1190 - 1249 9.6 2 2 Op 1 . + CDS 1349 - 1414 95 ## 3 2 Op 2 . + CDS 1416 - 2348 994 ## CKL_2932 hypothetical protein 4 2 Op 3 . + CDS 2364 - 2639 336 ## gi|225028400|ref|ZP_03717592.1| hypothetical protein EUBHAL_02674 5 2 Op 4 . + CDS 2639 - 3268 797 ## EUBELI_01204 hypothetical protein 6 2 Op 5 . + CDS 3287 - 4228 852 ## COG5377 Phage-related protein, predicted endonuclease 7 2 Op 6 . + CDS 4259 - 5005 682 ## COG4712 Uncharacterized protein conserved in bacteria 8 2 Op 7 . + CDS 5020 - 5874 563 ## Cphy_2931 hypothetical protein + Term 5880 - 5941 12.9 + Prom 5879 - 5938 6.6 9 3 Op 1 . + CDS 6036 - 6209 134 ## Rumal_1118 DNA-binding domain-containing protein 10 3 Op 2 . 
+ CDS 6281 - 6439 213 ## gi|225028406|ref|ZP_03717598.1| hypothetical protein EUBHAL_02680 + Term 6450 - 6509 10.5 + Prom 6454 - 6513 12.0 11 4 Op 1 . + CDS 6573 - 6914 265 ## Cthe_1748 hypothetical protein 12 4 Op 2 . + CDS 6950 - 7357 217 ## COG3727 DNA G:T-mismatch repair endonuclease 13 5 Tu 1 . + CDS 7416 - 8327 412 ## gi|225028409|ref|ZP_03717601.1| hypothetical protein EUBHAL_02683 + Term 8385 - 8427 -0.6 - Term 8364 - 8423 10.0 14 6 Tu 1 . - CDS 8465 - 8662 130 ## gi|225028411|ref|ZP_03717603.1| hypothetical protein EUBHAL_02685 - Term 9224 - 9272 6.6 15 7 Op 1 . - CDS 9279 - 11090 273 ## gi|225028411|ref|ZP_03717603.1| hypothetical protein EUBHAL_02685 - Prom 11146 - 11205 2.2 16 7 Op 2 . - CDS 11213 - 11965 446 ## gi|225028412|ref|ZP_03717604.1| hypothetical protein EUBHAL_02686 - Prom 12147 - 12206 8.6 17 8 Tu 1 . + CDS 12282 - 12563 140 ## gi|225028414|ref|ZP_03717606.1| hypothetical protein EUBHAL_02688 18 9 Op 1 . + CDS 12681 - 12860 263 ## gi|225028415|ref|ZP_03717607.1| hypothetical protein EUBHAL_02689 19 9 Op 2 1/0.000 + CDS 12857 - 13579 626 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 20 9 Op 3 . + CDS 13637 - 16234 1663 ## COG4485 Predicted membrane protein + Prom 16278 - 16337 1.9 21 10 Tu 1 . + CDS 16383 - 16634 339 ## gi|225028418|ref|ZP_03717610.1| hypothetical protein EUBHAL_02692 + Term 16782 - 16828 2.1 22 11 Tu 1 . - CDS 16565 - 17194 233 ## gi|225028419|ref|ZP_03717611.1| hypothetical protein EUBHAL_02693 - Prom 17241 - 17300 1.8 23 12 Tu 1 . - CDS 17302 - 19044 391 ## gi|225028420|ref|ZP_03717612.1| hypothetical protein EUBHAL_02694 - Prom 19255 - 19314 8.6 + Prom 19081 - 19140 11.4 24 13 Op 1 . + CDS 19231 - 20130 509 ## gi|225028421|ref|ZP_03717613.1| hypothetical protein EUBHAL_02695 25 13 Op 2 . + CDS 20143 - 20436 168 ## gi|225028422|ref|ZP_03717614.1| hypothetical protein EUBHAL_02696 + Term 20457 - 20506 10.1 + Prom 20467 - 20526 14.5 26 14 Op 1 . 
+ CDS 20698 - 21321 308 ## gi|225028423|ref|ZP_03717615.1| hypothetical protein EUBHAL_02697 + Prom 21323 - 21382 1.8 27 14 Op 2 . + CDS 21411 - 23015 1102 ## gi|225028424|ref|ZP_03717616.1| hypothetical protein EUBHAL_02698 + Term 23069 - 23124 8.4 + Prom 23068 - 23127 4.0 28 15 Op 1 . + CDS 23208 - 23504 265 ## gi|225028425|ref|ZP_03717617.1| hypothetical protein EUBHAL_02699 29 15 Op 2 . + CDS 23506 - 25260 1399 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 30 15 Op 3 . + CDS 25264 - 25860 377 ## COG3409 Putative peptidoglycan-binding domain-containing protein + Term 25928 - 25978 5.7 + Prom 25947 - 26006 5.2 31 16 Op 1 24/0.000 + CDS 26241 - 27320 1308 ## COG0505 Carbamoylphosphate synthase small subunit 32 16 Op 2 . + CDS 27313 - 30546 4340 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) + Prom 30558 - 30617 4.7 33 17 Tu 1 . + CDS 30655 - 31986 1398 ## COG0174 Glutamine synthetase 34 18 Tu 1 . + CDS 32102 - 32596 411 ## Closa_1273 ANTAR domain protein with unknown sensor + Term 32757 - 32799 -0.4 + Prom 32721 - 32780 10.7 35 19 Op 1 . + CDS 32831 - 33679 873 ## COG0253 Diaminopimelate epimerase 36 19 Op 2 . 
+ CDS 33716 - 34930 1351 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase Predicted protein(s) >gi|222441823|gb|ACEP01000119.1| GENE 1 400 - 1089 619 229 aa, chain + ## HITS:1 COG:no KEGG:CLJ_A0004 NR:ns ## KEGG: CLJ_A0004 # Name: not_defined # Def: putative resolvase # Organism: C.botulinum_Ba4 # Pathway: not_defined # 1 224 1 224 230 274 68.0 2e-72 MGKIYGYVRISTKKQNMERQIRNIQKEYPEAVIIKEVYTGIKFQGRKELNKLIKQLKAGD TVVFDSVSRMSRDAEEGFQVYQDLYKKGIELIFLKEHHIDTMTYKAALHNGIPLTGTNVD YILEGVNKYLMSLAKEQIRLAFLQSQKEVTDLRQRTKEGIETARINGKHIGLASGTKLVT KRSIECKEKIQKYSRDFDGMLSDKECIQLIGIARGTYYKYKREIKDGIC >gi|222441823|gb|ACEP01000119.1| GENE 2 1349 - 1414 95 21 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGIIKSIVTIIIVITGRKGGK >gi|222441823|gb|ACEP01000119.1| GENE 3 1416 - 2348 994 310 aa, chain + ## HITS:1 COG:no KEGG:CKL_2932 NR:ns ## KEGG: CKL_2932 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 309 1 309 311 359 56.0 7e-98 MAANVESMFYTRTAPWHGLGVRVEEVLGSKEALIQAGLDWKVEQTDVYAASGERIPGYKA NIRDIDRSVLGIVGDRYKIVQNEEAFAFTDGLLGEGVKYETAGSLAGGKIVWMLAKLPEK YIISGDAIEPYLVFCNSHDGSGAIRVAMTPVRVVCQNTLNLALKGASRVWSARHTGNVMS RMDEARETLQLANAYMSQLGRSINELQAKKLTDKKVLAMIDSLYPVSEDLSEMQKKNNLK QQEILKACYFDAPDLQGVGKNGYRFINAVSDMAYHGKPLRQTKNYNENLFRKTIDGLPVL DKTYQMVLSA >gi|222441823|gb|ACEP01000119.1| GENE 4 2364 - 2639 336 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028400|ref|ZP_03717592.1| ## NR: gi|225028400|ref|ZP_03717592.1| hypothetical protein EUBHAL_02674 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02674 [Eubacterium hallii DSM 3353] # 1 78 1 78 91 129 100.0 7e-29 MNPVKDVVEGKILSEMKKYRQSNSFTTTTTGCDTGDTENTGHPVYEIFCIILKKFDPVLL EKLLGVADLFEYLKQLWNKSEKTEEKKGEER >gi|222441823|gb|ACEP01000119.1| GENE 5 2639 - 3268 797 209 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01204 NR:ns ## KEGG: EUBELI_01204 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 201 87 291 298 82 32.0 1e-14 
MKTLEQEVLKKIAKKSLFVKLVNTERNESLVEQSISKEFLDLSAVVRVVLKMDKEGMASM ALSKGDAEILGMTEEEIYAAALANTLRLFPPKLMNLGRYVEMSIGAKLPLGEDEVTTYIL TNQKEVDGAIYFMSPEVVGAIAEALEDDLYILPSSVNEVLLVRASELEDGVDELKEMVRD ANETVVAEKDILSYNVYHYDKEHGITIAA >gi|222441823|gb|ACEP01000119.1| GENE 6 3287 - 4228 852 313 aa, chain + ## HITS:1 COG:BS_yqaJ KEGG:ns NR:ns ## COG: BS_yqaJ COG5377 # Protein_GI_number: 16079682 # Func_class: L Replication, recombination and repair # Function: Phage-related protein, predicted endonuclease # Organism: Bacillus subtilis # 2 312 5 315 319 239 42.0 4e-63 MKYDIVCSTKNLSTEEWLKIRTLGIGGSDAGIIAGCNPYRSIFELWQEKRGEIPVREEES EVTHFGKVLEDVVRKEFIQRTGLMVRKKNSILRSKEYPFMIADVDGIISELDGTYSIFEA KTAIEFKNNDWKNGEIPKAYQLQVQHYLAVTGFQKAYIAVLVGGNKFYWTEVYRDEQLIK MLVSMESHFWHCVREGEEPDVDASKATSLYLGAKYNNAKVGSVIDLDEQMLSAIEKYEQI KIELDSLTEQKRLVENQLKSALAENEAGRVGGHIVKWKQVVQTRIDSSKLKEKHKDIYDD CKKEVTYRKFSVA >gi|222441823|gb|ACEP01000119.1| GENE 7 4259 - 5005 682 248 aa, chain + ## HITS:1 COG:CAC1936 KEGG:ns NR:ns ## COG: CAC1936 COG4712 # Protein_GI_number: 15895209 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 11 183 1 173 229 184 56.0 2e-46 MVSSLFWRYKMKKIRELRADEIEVRVGVVRKNGISLLLYKDARCDMNILDETFGIAGWQR KHELINGALFCTVSIKDENGEWISKQDVGVESYTEAVKGAASDSFKRACFNIGIGRELYT APFIWIPSDKVSIAEGNGKLKVRDSFNVSYISYDENGKVIKSLEIRNQKNEIVCQFNMGN KVSKQAFQTANKNQLSDYRMGKLYDEMKRTGVTEQQIVERFHVSLANLTERDYKRVMSAL KNTDTRAA >gi|222441823|gb|ACEP01000119.1| GENE 8 5020 - 5874 563 284 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2931 NR:ns ## KEGG: Cphy_2931 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 274 1 294 306 90 26.0 7e-17 MEKNKFIEELFYELSCQLGKEFEMKQKRVWKNNGVSYEGLVIEKLEEELSPVVSLDRCYE DFQAGISVQEISEQVLKEYRRTAGNISLLSQQDIWSYERIKDRLYVEMVNKEWNEQYLEH KYYVPFLDLAVVYYIDTQPDYKEAQSHRGTAVTKDIFKIWSVEPEVIWKQVLENMGKESL FLMLIVTEEENSLMTLEDVLQKNQGALALLQKSVLDEAKKKIGESFYMLPVSIFELMIVP ESQADEVEEMKKAIIESNKELPAEYLLSNSVYYYGDKGEAEIAG 
>gi|222441823|gb|ACEP01000119.1| GENE 9 6036 - 6209 134 57 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1118 NR:ns ## KEGG: Rumal_1118 # Name: not_defined # Def: DNA-binding domain-containing protein # Organism: R.albus # Pathway: not_defined # 1 57 1 57 57 71 56.0 1e-11 MYEYIPEVMTFKECKELLKVGKNTLLDLIHSGQISAFKIGNRWKIAKQSVIEFIRYR >gi|222441823|gb|ACEP01000119.1| GENE 10 6281 - 6439 213 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028406|ref|ZP_03717598.1| ## NR: gi|225028406|ref|ZP_03717598.1| hypothetical protein EUBHAL_02680 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02680 [Eubacterium hallii DSM 3353] # 1 52 1 52 52 81 100.0 2e-14 MCNLSQGVEERGIEKGLAEAVLRIYKKGYSIEQISDLLDMNIEKIKAIIESK >gi|222441823|gb|ACEP01000119.1| GENE 11 6573 - 6914 265 113 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1748 NR:ns ## KEGG: Cthe_1748 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 20 111 420 517 517 72 40.0 4e-12 MSQITEITRRDIYDLFHDGYYISEWHSELIKYPYYGNTDKLLAFYTYIADSFGANMNSWL GRITRDTNVNGSAMPVDLLINFAQDYAERGYNHDTIKRIFSVNREVRLSDIQS >gi|222441823|gb|ACEP01000119.1| GENE 12 6950 - 7357 217 135 aa, chain + ## HITS:1 COG:NMA0429 KEGG:ns NR:ns ## COG: NMA0429 COG3727 # Protein_GI_number: 15793434 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Neisseria meningitidis Z2491 # 1 102 14 112 140 80 45.0 8e-16 MQHIHSKDTKIEVVLRKALWAKGYRYRKNWSELPGKPDIVLIKYKIAIFCDSEFFHGKDW EVLKPRLEKGNNPDYWIPKITRNMERDMETDKKLLFLGWTVLHFLGKDILKHPEECIQVV EETILAQKIELDAYL >gi|222441823|gb|ACEP01000119.1| GENE 13 7416 - 8327 412 303 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028409|ref|ZP_03717601.1| ## NR: gi|225028409|ref|ZP_03717601.1| hypothetical protein EUBHAL_02683 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02683 [Eubacterium hallii DSM 3353] # 1 303 1 303 303 558 100.0 1e-157 MYYKYEVSSRILDAIDNLGEFYFQIVLRENMPFILGDCYFVVKKYHNQVCYISDSLQISY 
YSERDNQNEAMRKIRVALEFFVYLTGNPYNSEGGMTKSIVDARPIIDINKSKRKLLKIQE TETAYQRIRQKRKLLESTLQLYNLGIRLNFLFGDENCEDAFFTFFKIIEKIVSDEFDIEK EGIDRGKEETKECLERILSQTYNIQITEERLTKFSGEISNYIFNIVFGDNYYRIMWFCQK YNISVDCNIISKLVVIRNKIAHAEKVTISGDEYAYIMKLTREVINAKFFSKKPLIIDSKI INI >gi|222441823|gb|ACEP01000119.1| GENE 14 8465 - 8662 130 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028411|ref|ZP_03717603.1| ## NR: gi|225028411|ref|ZP_03717603.1| hypothetical protein EUBHAL_02685 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02685 [Eubacterium hallii DSM 3353] # 1 60 528 587 603 88 70.0 1e-16 MMSYEDKLLCNIQFYLISTELEQAIGRSRLLRTNSTIYLFSNFPCPQAKLSKIDYLKKGN ADALS >gi|222441823|gb|ACEP01000119.1| GENE 15 9279 - 11090 273 603 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028411|ref|ZP_03717603.1| ## NR: gi|225028411|ref|ZP_03717603.1| hypothetical protein EUBHAL_02685 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02685 [Eubacterium hallii DSM 3353] # 1 603 1 603 603 1202 100.0 0 MCSCRLWKEFVSGDGNILSHRDKFLLATNILYIVGFEKDFLDALKRNYPDSDRNKWRYNL NWMRDREYKPASCSNCKYCNSCNHRSNIVETLKGKKLITFNGNHDFVSVDESFKQISYEL KEALKSKKDQIHLISGQTGIGKTSLYIDAINSGKYKKPLLIAVPTSDLKSELIQRIGKEN IFDIPSFDDLPLIGRRFDIQQYYNQGDFEKARQCIIECARDVCEPEELNKFRVYLNPQIY LEQTSSCVIMTHARLLSLPDKILKRFEIIIDEDILYNSILVRVGSIKISTLKKILEKNYL SYAKREMIQDLLALKEKKCCINRDYIPNEIDTRIIKRCKTNDNIAEFLKAGCYMKLDDYI KYLPPIKLPKCKMIILSATLDQTIYEIFFPTRNIIYHEVKQAAYTGNLIQYPAYSMSRTA IKNIVLDENSDYPTLSMLFEKIISHTNNVVYGITFKRYEEALPLGYTLHFGNLTGTDYLS GKNGIIIGTPHFPTYLYELIAYSVGISEKSKNSYKNRQVSYKGYDFIMMSYKNKILQKIQ LYLISSELEQAVGRSRLLRTNSTVYVFSNFPCNQATFCNIDYLKDADDPEGNIDNYLINT MFF >gi|222441823|gb|ACEP01000119.1| GENE 16 11213 - 11965 446 250 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028412|ref|ZP_03717604.1| ## NR: gi|225028412|ref|ZP_03717604.1| hypothetical protein EUBHAL_02686 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02686 [Eubacterium hallii DSM 3353] # 1 250 1 250 250 503 100.0 1e-141 MKIKATVDKYCYTSKPTDSDAKKMRMRLPQEENYYDIKTIADCVGEQGQAFLPATFKGMA 
SKQENFEQCQLFGIDFDEEPDYEKIKRKFADYHLPIVFSYHTFSSTPEHPKYRIILCHIV PITERWLADMILKMLKKMFPEADSHCFETARLFYGGKGLIDFNDVVAETGENTFNVYNLV MQFEKFLYTNDKKNFSRNIRVFAHKYQIGLSNNHFNIFYIKIPHTPTRRVFLSTPQSKMK KLLEVILNIN >gi|222441823|gb|ACEP01000119.1| GENE 17 12282 - 12563 140 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028414|ref|ZP_03717606.1| ## NR: gi|225028414|ref|ZP_03717606.1| hypothetical protein EUBHAL_02688 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02688 [Eubacterium hallii DSM 3353] # 1 93 1 93 93 135 100.0 1e-30 MTFTEWINTQEDICSKLREQGYEEKNDIRKFFNDRIRCILNTRDIPLKNLSPKRSYERIF QINRDNERKCFFPELFKEDDEEDNTEAEEKEKN >gi|222441823|gb|ACEP01000119.1| GENE 18 12681 - 12860 263 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028415|ref|ZP_03717607.1| ## NR: gi|225028415|ref|ZP_03717607.1| hypothetical protein EUBHAL_02689 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02689 [Eubacterium hallii DSM 3353] # 1 59 1 59 59 85 100.0 9e-16 MGVVIGRMDGQVEKIFLGNNLTRSTKNYAKILKDESLSEEEKQRILAAEKKKQEEDIIL >gi|222441823|gb|ACEP01000119.1| GENE 19 12857 - 13579 626 240 aa, chain + ## HITS:1 COG:DRA0037 KEGG:ns NR:ns ## COG: DRA0037 COG0463 # Protein_GI_number: 15807707 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Deinococcus radiodurans # 5 221 10 236 328 113 30.0 3e-25 MKISVAMAYYNGGTYIEEQMDSILSQLGNQDEVIVSVDGASDGSEPILLNMEKKDCRIQV IKGPGKGVVKNFENAIRHCTGDIIYLSDQDDIWKPDKVKKVNAAFDNLEVKAILHDAEII DENGKATGAESLFALRKSKPGILKNLIKNSYVGCCMAFRKELIPVICPIPKEMYMHDYWI GTAAEYMGQVCFLKDKLIGYRRHSSNVTQMTHGSIDFMVKKRIDIIRCLGLLKKRVREIG >gi|222441823|gb|ACEP01000119.1| GENE 20 13637 - 16234 1663 865 aa, chain + ## HITS:1 COG:L48341 KEGG:ns NR:ns ## COG: L48341 COG4485 # Protein_GI_number: 15672817 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 20 858 8 874 883 220 25.0 9e-57 MNIKEEGKKIWQIRKKINRKHIIILSFFMPLIVMIGCCIAFGVQPFGDNSLLIIDGLHQY 
MPFYSILYDKLKSGGSLFYSFRAGLGINFLSLYAYYLSSPLNLLILFFKKSQLNMAVSLI IVLKVAISGLTAGIYFSSKSKKPGLPVLMISMAYALSSYMVGYCWNVMWLDAIMIFPIVM MGLEDLIDRKDGRLYCFALFYALYCNYYIAFMICIFVVIWYILYSFKSVKQFFFRGIAFT FYSFLAAGMAALMLIPAYLGIKQTASGESMSLPEHSFLTNAADLLNRQFAMWTPITYDNF DGNANLYIGIFTVLAVVLYLLNSKIKISEKIKKVLLIGFFYLSFSEMILNFIWHGFHDQY GIPNRFSFLFGFVLLYMLWEVFENLEGTKGWHIALACMAGVGLLVYARAKGTNPLDDEVY GVAGMLFILYGMLLFLYTLSRKRKGWYKNVFCIVALIEIVATAFMGFNENGQIAVPKFFY GVNDMEKAVASLEDKTFYRSELASSLIVDENALYPINGVGLFGSTASDSMVNLMDNLGFY TGCNEYLYKGATPVTNLLLNVKYLYYHQEDTLNTDFEYVKREGAFNIYKNPAREMSMGYL MNLSVEDWDSESAYPFRVQNSLGKQAFGVSELFHNVEIADPATNGCTASKTNDGEYYFEY KSTQADNMVFTIPVTKTIDDLYLFYDGTQVENAQITIDGTNVKSGDLDGYMLPIGKVSAG SEVKVTFQLKGETETGYVRLSAADFDQKEYNNLKEETADRAFIVRDYSSNYIEGNVNAKT DQTLFFSIPYDEGWYVEVDGKPADMWAVGNAFLGIDVSAGEHTVSLSYTSLGASAGWKIS SLAIVIFLGSCFVKSRYKSDIKKKE >gi|222441823|gb|ACEP01000119.1| GENE 21 16383 - 16634 339 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028418|ref|ZP_03717610.1| ## NR: gi|225028418|ref|ZP_03717610.1| hypothetical protein EUBHAL_02692 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02692 [Eubacterium hallii DSM 3353] # 1 83 1 83 83 147 100.0 4e-34 MFRKKLIALSAMLIAGLLVFQANVSADAIEEKPYLSLGADLTATQKRKVLELLDVKEDEF DQYKVVKVTNKDEHAIIKNWYPL >gi|222441823|gb|ACEP01000119.1| GENE 22 16565 - 17194 233 209 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028419|ref|ZP_03717611.1| ## NR: gi|225028419|ref|ZP_03717611.1| hypothetical protein EUBHAL_02693 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02693 [Eubacterium hallii DSM 3353] # 1 209 1 209 209 407 100.0 1e-112 MGTLKNCKTTDNLIDFAKADCYIKQKDCIKYLAPVQLPKCKMIILSATLDETIYNLFFPE RKIVYHEIEQAAYKGNLIQYPCYSMSRKTINDFIEKNTLHCPTVSALFKKIITNTDNVYY GITFKKYENDIPLEHTLHFGNLTGTDYYSGKNGVIIGTPHFPSYLYQLIAFSIGISEHGS LKNRKVSYKGYQFLMMACSSLFVTFTTLY >gi|222441823|gb|ACEP01000119.1| GENE 23 17302 - 19044 391 580 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028420|ref|ZP_03717612.1| ## NR: gi|225028420|ref|ZP_03717612.1| hypothetical protein EUBHAL_02694 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02694 [Eubacterium hallii DSM 3353] # 1 580 1 580 580 1143 100.0 0 MDIKISFDKTAFQCKPTNRQTAFMKKRLPKEEDYYDIKTIAERIGNHGHSFLPATFEGLA SKQENFEQCQLFGIDFDDEPDYEKIKKKLMEYHLPIVFSYHTFSSTPEHPKYRIILCHIV PITEKWLADMILKMLKKMFPEADKHCFETARLFYGGKGLIEFHEVIEETGENTFNVYNLI QEFEKFLYIHDSRNFSRNLKLFARNYKLNIHNNHFDIYEYNTQRGCISTDPNTNEEKIGS NIKYELYIPNTSSKIIFYELQEEAVKKESSSQRIRKKSAITIMKKVSNDKICNCCRLWRD FINGEPLDHNTRFSLATNIRFLTGYQKKFFDTLNKYYPDTDISKWKGYFNYMKRQNYNPT GCNDCKYKDTCNHKTNLLETLCGRKQITYFNKQSFISVDESVAQVSSELNAALKKNTTNI HLIPGQTGIGKTTIYVKLIKSRPRKKPFLIAVPTNNLKSELVQKIGRDKVLEIPSFEDLP LPPELRQTIEKNYSMGFINDALESIKTYSHNSKDNSIFMNYLHPETALAHSSKCVIMTHA RFLTLPNRVLKNFEVLIDEDILYTMLTRTALFKFLLSKKL >gi|222441823|gb|ACEP01000119.1| GENE 24 19231 - 20130 509 299 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028421|ref|ZP_03717613.1| ## NR: gi|225028421|ref|ZP_03717613.1| hypothetical protein EUBHAL_02695 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02695 [Eubacterium hallii DSM 3353] # 1 299 3 301 301 499 100.0 1e-139 MTFNMWIKKQKDIYGELQNQQYDSPKKQWEFFDNRINCVLGTARHSEQSYEGMFQENKDS RNNSKKEEEAKRGRKHLDEDIEFGEEEQKFFRKIFEMFPTSEKLTLIEEDKWEEFDIEDR LELFNITVDMLENNGYWFPLSFGNWDLCNQWAGGKYQCKEDELEAKLFFPHMYRIKSKLI ELKGEMLQYSELNKDKLIDEEKEFEQDLDRYLSILDSRATEMSFIQRVNKKIAKIQQLRC GLFREERLREETDEREIITEEDRYFEAGLDRYLNWLDEDKNGIDYARRIINKIEYIEKM >gi|222441823|gb|ACEP01000119.1| GENE 25 20143 - 20436 168 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028422|ref|ZP_03717614.1| ## NR: gi|225028422|ref|ZP_03717614.1| hypothetical protein EUBHAL_02696 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02696 [Eubacterium hallii DSM 3353] # 1 97 1 97 97 134 100.0 2e-30 MRYSSAQINKNLDISELKITIPEVILEANEDIYNRYCKLGEKMEQDSKEDDVFEEKLKEY IELLRKAGERQRVGKEYIEETAKKYAKKFNKNVNIKN >gi|222441823|gb|ACEP01000119.1| GENE 26 20698 - 21321 308 207 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028423|ref|ZP_03717615.1| ## NR: gi|225028423|ref|ZP_03717615.1| hypothetical protein EUBHAL_02697 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02697 [Eubacterium hallii DSM 3353] # 1 200 1 200 207 269 100.0 1e-70 MLKRVLNTELEKKVFEDEMMEKLSVVPAEYEVYEEYASRQENFDKDFFTYCNKTPLLVDA DEASYDESEENVGVPNCKEKSLETFMAESQEEINKKKLEEARIKYHYKWSTQYIRKLEEE NHRIRQNNCRLARRNERLEEKNQNLLSECKDLRKENKSKKEKLKRQKQVLKKYRKTVFKN IKNGKWDNKRSRRLISKQDVDDLLLLL >gi|222441823|gb|ACEP01000119.1| GENE 27 21411 - 23015 1102 534 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028424|ref|ZP_03717616.1| ## NR: gi|225028424|ref|ZP_03717616.1| hypothetical protein EUBHAL_02698 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02698 [Eubacterium hallii DSM 3353] # 1 534 1 534 534 990 100.0 0 MNFSFVPNQGLMVDGRYITEFDVKVNRIFYYIEEDESKKNHMEYELVAKMLDGTELSPRR VKKLNNLSYFTLWTEIMDAGLDRWERQMLLLRLQTSCKDAKQVKMYHITSNGYQRLEVGK EIFIAGDYTYMSNQDGIQIEVDENVKQMKWKNKVRLADKKNADRNEEDYLTVSQGASEIL FTAVLLSAIKFLFIKAGFNPAVCINLYGKTGSYKTALTQAMMYMENPSQFMCSLVNDQKK MAVKKVQQCYGFPMVLEDYHPAATGYDYKRQISMMDAVVRHIESNPQSAVVFITSEFLDG VESLQSRTLQIEAQNVDLKTLTKIQQEQTMATIVMNFLDKLLGNMKDAIETIRTQYEALS RHGEQSMRIEESIVFLKITAILYGKYVDNSDKLKLKSKLDKALTLQKEKQKLHLEQLSMS EEDRVIKVVYELIDDKLYEKCSYKNKALYRGKADEMLVSNRGVFLSRKALVYGLKKHKCT GISVNKIILALSQMELLDEDRGGTHTTKVQNIRTYCILYYELKERYEELRGTGE >gi|222441823|gb|ACEP01000119.1| GENE 28 23208 - 23504 265 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028425|ref|ZP_03717617.1| ## NR: gi|225028425|ref|ZP_03717617.1| hypothetical protein EUBHAL_02699 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02699 [Eubacterium hallii DSM 3353] # 1 98 1 98 98 155 100.0 9e-37 MNQTTKRSKELKEVHVGRNETGKSITGISLEAPKYSCIETKQDGIKVILEFPEFDETRER TEAGYHQMEIQKEVNNVLSHTLREQIKNHQIYHVERIS >gi|222441823|gb|ACEP01000119.1| GENE 29 23506 - 25260 1399 584 aa, chain + ## HITS:1 COG:SA0057 KEGG:ns NR:ns ## COG: SA0057 COG1961 # Protein_GI_number: 15925764 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Staphylococcus aureus N315 # 3 443 5 407 542 86 24.0 
2e-16 MGEKKRVRMLLRVSSNQQLEEDGDLGIQRQLVKEFIAQHEEWHLDSKEYFEGSNSGYSNA VENRDILQQALRDAQRNEYDILAVYKDDRIGRRMWEIGAYVMQLKSYGVDIYTVKDGCIS PESDDIMGQMMLALRYGNAQKSSSDTGMRVKDTAQKLVQQGKFVGGKAPYGYKLELSGEI SKHGRALHHLVIIPERAKAVKHIYELSLDKEYGSSKIARELNLDEEYKELAPSEVWRGGT VTSILTNPIYAGYTAYKRREKKENGKSKRLDSSEWILSEEPNPEIQIISPDDWNLVQIKR KQRGMKYRTAPQNKGVKVIARNEGMLPLIDVIYCGYCGCKCTNGSRYNYWTIKDTGERRA SRQAIYKCQDAWQGVPHYKVYQFKAERIEKIVFQAIAEYIDKLQENEDIFEQISENGRKA KKEKAKELSREQERLKHIQQKISVMEDKIPEAMLGEYPVPVDKLVELIEKQKQEKEQQEK VIAEKQKELDGADVSFNEWDSLRMQIPTWKDVFLNADNPTKRVLVNKLIEKISVKEDEVV IRFKININEIRSKALKKSGSIVPENKLEDKSVILDSLKIYPNPE >gi|222441823|gb|ACEP01000119.1| GENE 30 25264 - 25860 377 198 aa, chain + ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # Organism: Clostridium acetobutylicum # 5 198 215 408 437 228 55.0 7e-60 MVSTYRGKGKDFTITSSTAFDQKWINGKNTYDSISNVVDEIFNSYLSRPEVTQPILTQYC DGKRVSCPEFMSQWGSKALGDDGLSAIEILRYYYGDDMYINEAETISGIPASYPGYELTI GASGQKVRQMQEQLNVIAGDYPLIPKIRVDGIYGPATANSVKIFQKIFHLPETGVVDFAT WYKISQIYVAVSRIAELK >gi|222441823|gb|ACEP01000119.1| GENE 31 26241 - 27320 1308 359 aa, chain + ## HITS:1 COG:lin1950 KEGG:ns NR:ns ## COG: lin1950 COG0505 # Protein_GI_number: 16801016 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Listeria innocua # 2 359 3 359 363 371 50.0 1e-103 MKAFLILEDGTVLTGESFGAAREVISEVVFNTSMTGYLEILTDPSYAGQSVVMTYPLIGN YGICMEDMESIHPWPKALIVRECAKLASNFRNEMTLDEFLKKYDVPGISGIDTRHLTRIL REYGVMNGMITTREDFDLDEVVKQLKEYKVQGVVREVSCKEKRVLPGDGFKVALMDFGAK KNIEKSLNKRGCEVTVYPAYTKAEEIIAANPDGIMLSNGPGDPKECVEIIEELKKLFATN IPIFGICLGHQLMALANGADTKKMKYGHRGANHPVKDLKTGKVYISSQNHGYMVMEDSLN RDIAEVSFINVNDGTLEGVHYLGKNVMTVQFHPEACPGPNDSEFLFEEFINMMEVAKDA >gi|222441823|gb|ACEP01000119.1| GENE 32 27313 - 30546 4340 1077 aa, chain + ## HITS:1 
COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 1 1052 1 1050 1071 1219 57.0 0 MPKNTSIKKVLVIGSGPIVIGQAAEFDYAGTQACRSLKEEGIHVVLVNSNPATIMTDKDI ADEVYIEPLTPEVIKKLILKVSPDSLLPTLGGQAALNIAMELDECGFLKEHKVKLIGTTG KTIRKAEDREEFKETMEKIGEPIAPSQIVTNVEDGLEYAQKIGYPVVLRPAYTLGGSGGG IADNEEEFVEIIQNGLRLSRVGQVLVEKSIAGWKEIEYEVMRDGNGTCITVCNMENLDPV GVHTGDSIVVAPSQTLGDKEYQMLRNSALNIIDELGVEGGCNVQFALNPDTFDYCVIEVN PRVSRSSALASKATGYPIAKVTAKIALGYHLDEIKNAVTGKTFASFEPALDYCVVKVPRL PFDKFISASRHLTTQMKATGEVMSICTNFEGGLMKALRSLEQHVDSLMHGDFDKLSDKEL EEHLHIVDDLRIYCIAEALRRKVNPQKIHDITKIDLWFIDKLNNIVEMEAKLKNAEELDR DLLLKAKEMEFPDKVIGELIGQSEEYVKNLRRQMDIVPAYKVVDTCAAEFAAETPYYYSC FGSFNEVEETSDKKKIMVLGSGPIRIGQGIEFDYCSVHSVWAFKEMGYETIIVNNNPETV STDFDIADKLYFEPLTEEDVEHIVELEKPDGAVVQFGGQTAIKLTEKLNKIGVPILGTSA ENVDAAEDRELFDEILEQCCIPRPKGGTVFTKEEAIKTANEIGYPVLIRPSYVLGGQGMQ IAISDEDISEFMDVIQRYAQEHPILIDKYLMGVEVEVDAVCDGEEIVIPGIMQHVERAGI HSGDSISVYPARDLSPKIKKVIGHYTKLLAKALHVIGLINIQFIVYQDEVYVIEVNPRSS RTVPYISKVTGIPIVDLATKVMTGMKLKDLGYEPGLQPESPYWAIKMPVFSFEKIRGADI SLGPEMKSTGECLGISKNFEEALFKAFLGSGVDLPKHKKMIMTVKDSDKLDAVEVGKRFT ALGYEIYATRNTCKTLQESGVPAKRVNKIEESHPNLLDLILGHQIDLIIDTPSQGIERSK DGFVIRRHAVETGVNCLTSLDTANALLTSLESSCREDLSLVDIAEIDKEIESKINEA >gi|222441823|gb|ACEP01000119.1| GENE 33 30655 - 31986 1398 443 aa, chain + ## HITS:1 COG:BS_glnA KEGG:ns NR:ns ## COG: BS_glnA COG0174 # Protein_GI_number: 16078809 # Func_class: E Amino acid transport and metabolism # Function: Glutamine synthetase # Organism: Bacillus subtilis # 1 443 1 444 444 573 60.0 1e-163 MGNYTKQDIIQLVEEKDVQFIRLQFTDVSGTLKNVAITASQLEKALDNKCMFDGSSIEGF VRIEESDMYLYPDLNTFEILPWRPQHGKVARFMCDVYRPEGVPFEGDSRYVLKKVLKKAA KMGYTFDVGPECEFFLFHTDDNGQPTNITHEKASYFDVTPLDLGENARRDIVLTLESMGF EIEASHHEVAPAQHEIDFKYDNALVTADNIMTFKMVVKTIAKRHGLYATFMPKPITDVDG 
SGMHINMSLEKDGKNIFFDKSNPMGLSKECYHFIAGLIKHAKGMSCICNPLVNSYKRLVP GYEAPVAICWSSINRSPLIRIPASRGAGTRIEFRSPDPAANPYLVMALCLAAGLDGIENE LEAPEPIGKNMYLLTEEQIADMNIDVLPATLGEACDAFEEDKYIQEVLGEHISKKYLEAK RKEWHEYCRQVSDWETEAYLYKY >gi|222441823|gb|ACEP01000119.1| GENE 34 32102 - 32596 411 164 aa, chain + ## HITS:1 COG:no KEGG:Closa_1273 NR:ns ## KEGG: Closa_1273 # Name: not_defined # Def: ANTAR domain protein with unknown sensor # Organism: C.saccharolyticum # Pathway: not_defined # 1 163 17 179 182 170 47.0 2e-41 MKNVLIKNGFDVSAVCVTGAQAIQAVERLEAGVVVCGIRFADMIYDELKECIPDTFEMVV IASNAQWQEYGDDDVIYLPLPLKAYDLVDTVSDLLTDIHRRLKREREKPNKRSSADQGVI NQAKSLLMEKNELTEEEAHRYLQKRSMDNGTNMVETAYMVLQVF >gi|222441823|gb|ACEP01000119.1| GENE 35 32831 - 33679 873 282 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 1 274 3 276 279 263 46.0 3e-70 MKFTKMHGIGNDYVYVNCFEESIKNPAEVSKFVSDRHFGIGSDGLILISPSAKADFRMNI YNADGSQAEMCGNGIRCVAKYVYDYGLTDKTEISVETLAGIKYLKLQVENGKVATVEVNM GAPILEPKKIPVAVENNPVVDVPVEVKGKTYHMTCVSMGNPHAIIFMDNVKELDIEAIGP YFENHPVFPKRTNTEFVEVLDKNTVNMRVWERGSDETLACGTGACATTVACILNNKTEDE VTVHLLGGDLKIRWDRKENLVYMTGPATVVFDGEITLPEGIL >gi|222441823|gb|ACEP01000119.1| GENE 36 33716 - 34930 1351 404 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 404 1 408 410 528 60.0 1e-150 MFKVNENYLKLPGSYLFSNIGKKVAAYKEANPDKSIISLGIGDVTQPLAPEIIKSLHSAV DEMGKAETFRGYAPDLGYEFLRNAIVDGDYKSRGCDISADEIFVSDGAKCDSGNIQEIFS VDNKIAVCDPVYPVYVDTNVMAGRTGTYNPTTETWSDVIYMPCTAENDFVPDFPKEEPDI IYLCFPNNPTGTTITKAQLQEWVDYANKIGAVIIYDAAYEAYISEDDVAHSIYECEGART CAIELRSFSKNAGFTGTRLGFTVVPKDLKAGDVALHSLWARRHGTKYNGAPYIIQRAGEA CYSEAGKAQLKEQVAFYMNNAKIIKEGLKDAGYTVFGGVNAPYIWLQTPDKMPSWDFFDF 
LLNKANVVGTPGSGFGPSGEGYFRLTAFGSYENTLEAIERIKAL

Prediction of potential genes in microbial genomes
Time: Fri Jul 8 08:18:37 2011
Seq name: gi|222441822|gb|ACEP01000120.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont397.1, whole genome shotgun sequence
Length of sequence - 12729 bp
Number of predicted genes - 10, with homology - 10
Number of transcription units - 3, operones - 3 average op.length - 3.3

 N Tu/Op    Conserved   S      Start     End Score
            pairs(N/Pv)
                        + Prom    49 -   108  10.7
 1  1 Op  1 .           + CDS    246 -  2336  1486 ## PROTEIN SUPPORTED gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase
 2  1 Op  2 .           + CDS   2363 -  2956   319 ## Entcl_0681 hypothetical protein
 3  1 Op  3 .           + CDS   2969 -  4342   897 ## COG0286 Type I restriction-modification system methyltransferase subunit
 4  1 Op  4 .           + CDS   4370 -  5122   638 ## LPST_C0625 glutamine ABC superfamily ATP binding cassette transporter, substrate binding and permease protein
                        + Term  5161 -  5206  -0.1
                        + Prom  5135 -  5194   4.9
 5  2 Op  1 1/0.000     + CDS   5333 -  6013   544 ## COG0740 Protease subunit of ATP-dependent Clp proteases
                        + Term  6026 -  6061  -0.5
                        + Prom  6028 -  6087   5.3
 6  2 Op  2 .           + CDS   6119 -  9160  2644 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins
                        + Term  9239 -  9313   4.7
                        + Prom  9250 -  9309   5.6
 7  3 Op  1 .           + CDS   9347 - 10198  1039 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase
 8  3 Op  2 2/0.000     + CDS  10195 - 11592   850 ## PROTEIN SUPPORTED gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase
 9  3 Op  3 6/0.000     + CDS  11570 - 12142   378 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase
10  3 Op  4 .
+ CDS 12142 - 12648 271 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase + Term 12667 - 12711 -1.0 Predicted protein(s) >gi|222441822|gb|ACEP01000120.1| GENE 1 246 - 2336 1486 696 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|62291006|ref|YP_222799.1| polynucleotide phosphorylase/polyadenylase [Brucella abortus bv. 1 str. 9-941] # 7 694 9 694 714 577 45 1e-164 MVKQFSMELAGRTLTVEIGKVAAQANGAALMRYGDTVVLSTATASEKPREGIDFFPLSVE FEEKMYAVGKIPGGFNKREGKASEHAVLTSRVIDRPMRPLFPKDYRNDVTLNNLVMAVDP DCSPEVTAMLGASVATSISDIPFDGPIAGTRIGLIDGEFIVNPTADQQLVSDLALTVAST AEKVIMIEAGANEVPEDKMIEAIFKAHEVNKEVIKFIDRIVEECGKEKHEYEHVDTPEDL WNDMVDFITPEAMEEAVFTDVKQVREENIRQIKEKLEERYAEEHEDWLPLIDDAVYKFQK KTVRKMILKDHKRPDGRAINEIRPLAAEIDLLPRVHGSGMFTRGQTQIMTITTLAPLSEA QKIDGLDANVTSKRYMHHYNFPSYSVGETKPSRGPGRREIGHGALAERALVPVLPSEDEF PYAIRTVSETLESNGSTSQASICASTLSLMAAGVPIKKPVAGISTGLVTGDSDDDYIVLT DIQGLEDFFGDMDFKVAGTHDGITAIQMDIKIHGLTPDIIREAISRTKEAREHILTDVME PVISTPRDHVGEYAPKIMQMHVDPDKISEIIGKQGKTINAIIDETGVKIDINDEGRVDIC GVDQVMIDRAMEIIRLIVEPVEAGKIYEGEVVRIMNFGAFVQLAPNKDGLIHISKLSKER VEKVEDVVNIGDKVKVKVLEIDKMGRINLALREIIK >gi|222441822|gb|ACEP01000120.1| GENE 2 2363 - 2956 319 197 aa, chain + ## HITS:1 COG:no KEGG:Entcl_0681 NR:ns ## KEGG: Entcl_0681 # Name: not_defined # Def: hypothetical protein # Organism: E.cloacae_SCF1 # Pathway: not_defined # 1 183 1 179 190 68 23.0 2e-10 MEFLLEQISQITHGVVLNRIKPKILSEGAEYPILTISQLIDEENGVWDYEVPTALVDKKK VKNLNLAKEHSIVIGLTSFHRAAVLEKQHEGKIIPSNFVSIEFQNGIMDPYYFAWYFNEH PEIKRQRMIAIQGNSSVKVLSVHMIRQLMIECPNLSTQMSIGKLYYYQKRKKVLLMEKMR LEEAVLTEEMMHHLEKL >gi|222441822|gb|ACEP01000120.1| GENE 3 2969 - 4342 897 457 aa, chain + ## HITS:1 COG:SA0391 KEGG:ns NR:ns ## COG: SA0391 COG0286 # Protein_GI_number: 15926109 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Staphylococcus aureus N315 # 281 457 330 514 518 104 32.0 4e-22 MDEKQLQYEFLSILSESHPGLTFEEYRDSCVFLLFYQYLCLKHGDSLEEAYKPGELVKMA 
IRGKLQVASFLKFMESASSFLHLLNKDFQLTEFSFYKSLERVHSLEKQKSYARFFRKFLK KIDGWDCKEELLRQYPKLFILLIAEFAKLKKDTYISEELSELYHKFFCRRFHKKKGKCKV LFPEFQYGILASSIISTEKNIEIYGYTKEQEYIDIFTIVCYMRGIPLDSLHLFMKKDWRA IRDLPDGADNILIFMPEGVEAGEYIASPKLSLGKEQFYAGTKGEFPFLLTAISCLKENGF LAAVFPGAMLYREGREAQIRKYLVEELNCLDTIMLLPDSIFHSIGQAEAILFFQMNRERK DILFFDCSEIESLDKEQIDTIDQLWSERKTIPGLCACVERDEIEKNEYNLNLPRYITKVV KETAIDMEKGKARIREIEQELQEIENRIAIYRRELGL >gi|222441822|gb|ACEP01000120.1| GENE 4 4370 - 5122 638 250 aa, chain + ## HITS:1 COG:no KEGG:LPST_C0625 NR:ns ## KEGG: LPST_C0625 # Name: not_defined # Def: glutamine ABC superfamily ATP binding cassette transporter, substrate binding and permease protein # Organism: L.plantarum_plantarum # Pathway: not_defined # 27 248 22 244 478 70 25.0 5e-11 MRKKKRIPVSIMLLVFLFLLTGCTLGGTKQEAETKVDYYIGVIDDNAPYYYEEDGLPKGY YADFVAAMAKEEPFTYKFVPVDASSYNENLSKKSIDGFIGATNASSGKNILASKSFYTSN ICVLAPKSSNISTLKQIKDNSIAAPADTEEEVFAKYLANRYKGRAVPFLSVKEALSDIEA GNTQILVADKEYYKQHKETFKSWKVLKTSHRFQNEHRLFMQKDGKLQEIYSKGIKALETN GSLEHFSNTL >gi|222441822|gb|ACEP01000120.1| GENE 5 5333 - 6013 544 226 aa, chain + ## HITS:1 COG:CAC1811 KEGG:ns NR:ns ## COG: CAC1811 COG0740 # Protein_GI_number: 15895087 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 22 218 28 224 226 204 51.0 1e-52 MNTQLESLLELGEVLLENNGTGQHILFLSVIGEIEGHHLSSDRIKTTKYEHLLPLLTQVQ DNQEIDGLFLLINTVGGDCSCGLAAAEMIASIKKPTVSLVIGDSHSIGVPLSVAADRTFI APTATMILHPVRLNGTIIGAPQTFEYFRLIQDRIVTFVEAHTGVAKSRLERMMIRQGMMV KDLGTILVGKMAVEEGIIDEVGGVSEALFWLHQEIERSRKKRQEKK >gi|222441822|gb|ACEP01000120.1| GENE 6 6119 - 9160 2644 1013 aa, chain + ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: 
Clostridium acetobutylicum # 449 1005 219 765 765 547 54.0 1e-155 MASNKTNENKIKRKTTKKVTTKNNVTRKTTAKNHTSGTRKNASQKAEGIPMSSEITLVVA FGLVVLMMFSNFGHCGLIGKWLASFFFGIFGCAQFVMPVFFFLGITILVANDYSTLAMKK CGAGFVILMEVSAVSQLLYKDTTVTVHDLFSVAINNHATGGIIGGGVALLLREGFGTAGA MVVLICVFLLACILLTQRSFVSFFKKMVKNVRASKEEWEKSAEKRRLEREAAAAEDRLIN LEEEEESSDKATSFLKMKSGYKKTSDKKRAKQCEEKVKELEEKKEKKNPKKQPSTSAEQE KTIKNIKVNVPVFTSSDEVKRKDCVRELHPQIYPFTEEDGKTEPFKGKLDFSAFTFPEEN TSLISEQAQESTLEVQEESYAPPYKEKGAFHTSLPEEIRINEDLGSLIEEVSESIVSEAE AISEENVYPDAADQAIEQKEVPEIEAVKDMKEKQEENEESRIWKNGAEIKQRDTEAKDNA SEVNDRDIAIQEKQMARQDTIKKETAQKVSSVESKEVVPDRKKAEGKDYLFPPASLLIKE EQGHSSGQQQYLQETAQKLYETLKSFGVNVTITDISCGPSVTRYEMFPEQGTKVSKILSL TDDIKLNLAASDIRIEAPIPGKAAIGIEIPNKHNQTVHFRDLIESQTFKTFKSKLAFAVG KDIGGKTVVTDLAKMPHLLIAGATGSGKSVCINTLIMSILYKAAPEEVKLIMIDPKMVEL SIYNGIPHLLIPVVTDPKKASGALNWAVAEMTNRYKKFTETGVRNIEGYNKKVRELQKSG EIDPETIKKMPQIVIIIDELADLMMVAPGEVEDAIVRLSQLARAAGIHLVIATQRPSVNV ITGLIKANVPSRIAFSVSSGVDSRTIIDMNGAEKLLGKGDMLFYPAGYSKPVRVQGAFIS DNEISDVVTFLKENEDVAVYDTEVTEKIENKLKSSAVSQERDEYFEAAARFVIEKDKASI GMLQRMFKIGFNRAARIVDQLSDAGIVGPEEGTKPRKVLMSSEQLESYFEEYL >gi|222441822|gb|ACEP01000120.1| GENE 7 9347 - 10198 1039 283 aa, chain + ## HITS:1 COG:BH2784 KEGG:ns NR:ns ## COG: BH2784 COG0190 # Protein_GI_number: 15615347 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Bacillus halodurans # 5 277 6 277 279 302 56.0 4e-82 MSLRIDGKKISSDIKEELKEQVKLLKEEGISVCLAVIQVGSDPASSVYVNNKKKACAYIG IESQAYELPEETTQEELLSLISRLNKDDHVNGILVQLPVPAHMDEKEIIQAISPAKDVDG FHPQSVGALSTGEPGFVSCTPAGIIQLLKRSGISIDGKDCVVVGRSNIVGKPMSMLLLRE NGTVTICHSHTKNLKEVCQRADILVAAIGKPKFFDDSYVKDGAVVIDVGIHRNEENKLCG DVDYDKVVDKVSAITPVPGGVGPMTIAMLMNNCVEAALRMNNR >gi|222441822|gb|ACEP01000120.1| GENE 8 10195 - 11592 850 465 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228000795|ref|ZP_04047796.1| SSU ribosomal protein S12P methylthiotransferase [Brachyspira murdochii DSM 12563] # 1 458 1 439 440 332 
40 1e-90 MKNVLFVSLGCDKNLVDSEKMLGLLNEAGYRVAQEESEADAIVVNTCCFIHDAKEESVET ILEMAEWKKKGRLKALIVTGCMAQRYQDEIQQEIPEVDAVIGTTGYTEIVPILDEILAEA EASQKEAAVEEPKEKSFVNCCPSIDLLPASLADKRVVTTGGYTAYLKIAEGCNKRCTYCI IPYIRGHYRSFPMEDLLEEARKLAEGGVKELILIAQETTVYGMDCYGRKALPELLTKLCE IEGIEWIRILYCYPEEITDELIAVMKKEKKICHYLDIPIQHSEDTILKRMGRRTNRAELV SLVEKLRKEIPDIVLRTTLITGFPGETEEEFKNMVDFVDSMEFDRLGVFPYSAEEGTKAA EMDGQITEEVKESRRDEIMALQQEISADKAASRIDDEMSVLIEGYLYEDDIYIGRTYMDA PKVDGNVFVRAEEELISGDIVPVRITGANEYDLMGDVIYADEFTE >gi|222441822|gb|ACEP01000120.1| GENE 9 11570 - 12142 378 190 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 7 183 484 669 904 150 41 1e-122 MPMNLPNKLTIVRIIMIPFFVFFLLTDFVAGSKWIATVLFCAASITDFLDGHIARKYNLV TNFGKFMDPLADKMLVSSAFICLVAQNKIAAWIVIVIIAREFVISGFRLVASDSGVVIAA SYWGKFKTNFQMFAIILLMLNLGENFPAYAGGIHIAEQILIYIALILTIVSLVDYLAKNI DVLKEGGNEK >gi|222441822|gb|ACEP01000120.1| GENE 10 12142 - 12648 271 168 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 9 163 746 902 904 108 40 1e-23 MSSDRIMAEEPIEKKITRILIHRKMTVTAAESCTGGLVAGTLVNADGISEVFKESYVTYS NEAKHKLLGVKKETLEKYGAVSRQTAAEMAEGAVRAAGADASVVTTGVAGPGGGTEEKPV GLVYIGCCVKGRTIVKRCFYGGNRQGVRHSAVKEALEILYCQLLADKY Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:18:46 2011 Seq name: gi|222441821|gb|ACEP01000121.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont398.1, whole genome shotgun sequence Length of sequence - 2363 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 44 - 103 3.0 1 1 Op 1 . 
+ CDS 124 - 492 228 ## gi|225028444|ref|ZP_03717636.1| hypothetical protein EUBHAL_02718 2 1 Op 2 5/0.000 + CDS 482 - 847 225 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 + Term 925 - 953 -0.9 + Prom 1042 - 1101 5.9 3 1 Op 3 . + CDS 1127 - 2363 1020 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|222441821|gb|ACEP01000121.1| GENE 1 124 - 492 228 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028444|ref|ZP_03717636.1| ## NR: gi|225028444|ref|ZP_03717636.1| hypothetical protein EUBHAL_02718 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02718 [Eubacterium hallii DSM 3353] # 1 122 1 122 122 207 100.0 2e-52 MSSITTAVAVDYRLQQWAQLVKDCQNRPQDMTVEQWCDTKGISKSNYYYRLRCIRKACLE HVQDNSVSCQQVVEIPEKITQQSRSESPSDISIEINGVVIHIKEDISESLLEKIIRVVSR AK >gi|222441821|gb|ACEP01000121.1| GENE 2 482 - 847 225 121 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 10 107 10 106 107 91 42 7e-19 MLNDATCFKKIFIVTGYTDLRSGIDRLADTIRSYLGEDAIEADTLYLFCGRRTDRIKGLV WENDGYLLLYKRLEAGQFQWPRTESEVRSITEQQFRWLMEGLTTHPKKQVKRLKKAPDYT L >gi|222441821|gb|ACEP01000121.1| GENE 3 1127 - 2363 1020 412 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 148 408 75 331 450 134 31.0 4e-31 MQMKYTEEKLNTFSKETLIQLFLTQQEQLSEIDGKLQLVLEQLAVMNQNRFGRKTEQMPV PEQLAFTDVDGELVLFNEAEVLAALDEREKETTVKKRPVKKKGKREEDLSGIPVVKVEHY LTEKELEDLFGKAGWKQLPDEIYKRYRFIPAKVEVEEHHVGVYASKEGDTMKKAPHPACL LRGSLVSPSLEAAVLNGKYINAVPFARLEKEFGRYGLAITRQNMANWTIECADRYLAVFY DYLHKYIYKSPVIHADETHVLVRKDGREAGSKSYMWVYRTGEKEKKAVILYEYQKTRKAD HPERFLKDYKGICVTDGYQVYHKLGEEKEGLTIAGCWAHARRRYDEVVKALPSKSRKSSI AWQALEQIQMIYHVDNTLKELKPEERKKRRQSSVKPLVEAYFAWVKEKKADG Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:19:02 
2011
Seq name: gi|222441820|gb|ACEP01000122.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont400.1, whole genome shotgun sequence
Length of sequence - 30755 bp
Number of predicted genes - 26, with homology - 26
Number of transcription units - 17, operones - 6 average op.length - 2.5

 N Tu/Op    Conserved   S      Start     End Score
            pairs(N/Pv)
 1  1 Tu  1 .           - CDS     26 -   439   185 ## COG1943 Transposase and inactivated derivatives
                        - Prom   469 -   528   7.2
                        - Term   493 -   533   0.2
 2  2 Tu  1 .           - CDS    571 -  1674  1384 ## COG0287 Prephenate dehydrogenase
                        + Prom  2098 -  2157   6.5
 3  3 Op  1 .           + CDS   2253 -  4346  2825 ## COG0480 Translation elongation factors (GTPases)
 4  3 Op  2 .           + CDS   4350 -  4901   492 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases
                        + Term  4964 -  5014   8.6
                        + Prom  5151 -  5210   7.9
 5  4 Op  1 .           + CDS   5287 -  5730   374 ## COG4492 ACT domain-containing protein
 6  4 Op  2 .           + CDS   5742 -  6947  1626 ## COG0460 Homoserine dehydrogenase
                        + Prom  6958 -  7017   8.7
 7  5 Tu  1 .           + CDS   7164 -  8372  1706 ## COG0527 Aspartokinases
                        + Prom  8605 -  8664   6.5
 8  6 Tu  1 .           + CDS   8689 -  9168   541 ## COG2131 Deoxycytidylate deaminase
 9  7 Op  1 .           + CDS   9622 - 10116   404 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35
10  7 Op  2 .           + CDS  10147 - 10344   256 ## PROTEIN SUPPORTED gi|160878606|ref|YP_001557574.1| ribosomal protein L35
11  7 Op  3 .           + CDS  10368 - 10724   507 ## PROTEIN SUPPORTED gi|240146873|ref|ZP_04745474.1| ribosomal protein L20
                        + Term 10735 - 10771   7.3
                        + Prom 10748 - 10807   7.8
12  8 Op  1 .           + CDS  10833 - 11906  1153 ## COG1705 Muramidase (flagellum-specific)
13  8 Op  2 .           + CDS  11970 - 12581   596 ## gi|225028461|ref|ZP_03717653.1| hypothetical protein EUBHAL_02735
                        + TRNA 12917 - 13001  53.5 # Leu AAG 0 0
                        + Prom 12919 - 12978  80.3
14  9 Tu  1 .           + CDS  13178 - 13969   794 ## ELI_2632 4Fe-4S ferredoxin
                        - TRNA 14540 - 14624  53.5 # Leu AAG 0 0
                        - Term 14808 - 14836  -0.9
15 10 Tu  1 .           - CDS  15035 - 15229   256 ## gi|225028465|ref|ZP_03717657.1| hypothetical protein EUBHAL_02741
                        - Prom 15399 - 15458   3.2
                        - Term 15357 - 15404   2.2
16 11 Tu  1 .           - CDS  15465 - 15944   558 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis
                        - Prom 16065 - 16124   9.8
                        + Prom 16060 - 16119   6.8
17 12 Tu  1 .           + CDS  16143 - 17096   870 ## COG3608 Predicted deacylase
                        + Prom 17126 - 17185   7.1
18 13 Tu  1 .           + CDS  17246 - 18190   564 ## COG3608 Predicted deacylase
                        + Term 18315 - 18352   4.8
                        + Prom 18244 - 18303   8.1
19 14 Tu  1 .           + CDS  18370 - 22128  4882 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain
                        + Term 22167 - 22203  -0.1
                        + TRNA 22364 - 22448  53.5 # Leu AAG 0 0
                        + Prom 22366 - 22425  80.3
20 15 Op  1 .           + CDS  22528 - 24384  1718 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
21 15 Op  2 .           + CDS  24371 - 25633  1233 ## Rumal_0023 hypothetical protein
                        + Term 25634 - 25680  -0.5
                        + Prom 25650 - 25709   1.8
22 16 Op  1 .           + CDS  25772 - 26572   546 ## Clole_0279 hypothetical protein
23 16 Op  2 .           + CDS  26626 - 28257  1776 ## COG1686 D-alanyl-D-alanine carboxypeptidase
24 16 Op  3 .           + CDS  28326 - 28520   286 ## COG1983 Putative stress-responsive transcriptional regulator
25 16 Op  4 .           + CDS  28556 - 29593   853 ## COG0502 Biotin synthase and related enzymes
                        + Prom 29696 - 29755  11.0
26 17 Tu  1 .           + CDS  29777 - 30709   704 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase

Predicted protein(s)
>gi|222441820|gb|ACEP01000122.1| GENE 1 26 - 439 185 137 aa, chain - ## HITS:1 COG:alr8071 KEGG:ns NR:ns
## COG: alr8071 COG1943 # Protein_GI_number: 17227445 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp.
PCC 7120 # 7 134 9 140 140 99 37.0 2e-21 MNLDNNAHSVFLLHYHLVLVVKYRRQVFDDGISSRAKEIFEYIAPNYNITLEEWNHDKDH IHILFRAHPNTEISKFINAYKSASSRLLKKEFPQIRQKLWKEHFWSQSFCLITTGDAPIE VFKKYIESQGQRDRKRK >gi|222441820|gb|ACEP01000122.1| GENE 2 571 - 1674 1384 367 aa, chain - ## HITS:1 COG:BH1666 KEGG:ns NR:ns ## COG: BH1666 COG0287 # Protein_GI_number: 15614229 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Bacillus halodurans # 7 362 5 361 366 249 39.0 4e-66 MNANHMTIGFIGLGLIGGSIAKTIRRIHPDSIIYGFDTDIDSLKMAKEDGTLTQYFETLD PTFSSCDIIFLCAPVSNNIEYLKELKGIISESCLLTDVGSVKEPIQSAIKELGMESNFIG GHPMVGSEKSGYAHANDHLLENAYYFLTPSERTLFQLTTKFSSFIQGLGALAVSLKPEEH DFITAAISHVPHIVAAELVHLVRRADRNNGMLKQLAAGGFKDITRIASSSPVMWEQICEN NSSNIKTLLTSMIHDLQEVIDQLDKENGAYVNEYFKEAGDYRNSVPDHSIGLFDKVHKLY VHIPDQPGTIATVASLLAFNNISLKNIGIIYNREFEEGVLEIVLYDEESCQKGAQVLEER NYIVHIR >gi|222441820|gb|ACEP01000122.1| GENE 3 2253 - 4346 2825 697 aa, chain + ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 693 3 690 690 580 43.0 1e-165 MNVYATESIRNVVLLGHGGCGKTSLIESMAFTTGIIKRVRTVTEGGTVSDYDKEEVKRKF SIQSSVIPLEVDGVKVNLLDTPGFFDFSGEVYEALSVADGAIIVINGKSGVEVGTRKSFD FCAKRGIPVIVYITCVEDEHVDASAIIDELRESYGNKIVPFHMPMKEGTEFKGFINAAKM LECRLQPNGTYYEKEVEEGVNDELDEYRQQLMEGVAETSEELMEKFFAEEPFTEEEIKTA IVSSIAKRDLMPVICGAINTDYGANILLHDMVKYFSAPGMVTDQFVGIKVKTDEPVSASY DVSQPVSAYVFKTIADPFVGRFSLVKVTDGILKADSAVYNANKDAEERIGKLYVLRGKEQ IEVNALYAGDIGAIAKLSVTKTGDTLSPKANPIIFEKAEMPEPYTFVRYIAKTKGDEDKI SAALKKLMDEDLTLKEVNDKANRQMLLYGLGDQHLEIVKSKLADRYKVEIELVKPKVAFK ETIRSKVEVRGKYKKQSGGHGQYGDVLMEFEPSGDLETPYVFDEKIFGGAVPKNYFPAVE KGIAESCLKGPLAGYPVVGIKATLTDGSYHPVDSSEMAFKMAAITAFKNGVMDAKPVLLE PIVNLKVDVLDKNTGDVMGDLNKRRGRILGMNPLEGGRQELVADIPLASLIGYSTDLRSM TGGSGEFSYEFSRYEQMPADAQEKVLKEAAAEKENVK >gi|222441820|gb|ACEP01000122.1| GENE 4 4350 - 4901 492 183 aa, chain + 
## HITS:1 COG:lin0808 KEGG:ns NR:ns ## COG: lin0808 COG0317 # Protein_GI_number: 16799882 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 1 178 1 179 180 96 32.0 3e-20 MIDKAITFATKAHEGQFRKGTKLPYIVHPLEVGVIVSRMTQDKEVIAAAILHDTLEDCKE VTFSTLCQEFGERVAEIVKAESEEKGGSWNERKANTVKRLKEEKASDMKLVALGDKLSNA RSLKRDYQMIGDKLWERFNMKDKRQQAWYYRGLCDSLKDMENFPEYWEFCELIAYVFRGV VVD >gi|222441820|gb|ACEP01000122.1| GENE 5 5287 - 5730 374 147 aa, chain + ## HITS:1 COG:BS_pheB KEGG:ns NR:ns ## COG: BS_pheB COG4492 # Protein_GI_number: 16079843 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Bacillus subtilis # 7 144 6 143 147 107 36.0 6e-24 MENGSKFYMVKKKALPEILLKVAEANVLLNRGKAKTIKEAIDAVGISRSSYYKYKDDIFP IHEKIQGKTITFVMEMEDQPGLLSSILKIVADCKANILSIHQSIPVNHIALITLSVMILP ITEDVAVLFDRIENADGINNLKIVSEE >gi|222441820|gb|ACEP01000122.1| GENE 6 5742 - 6947 1626 401 aa, chain + ## HITS:1 COG:MTH1232 KEGG:ns NR:ns ## COG: MTH1232 COG0460 # Protein_GI_number: 15679243 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Methanothermobacter thermautotrophicus # 2 373 1 379 423 273 39.0 4e-73 MVNVAVLGYGTIGSGVVEVLQTNTEVIAQRAGEEIAVKYILDLRDFPGDPNADKVVHDYD IIDKDEDVQVVVECMGGVEPAYTFVKRALLNGRSVATSNKELVAKHGAELIAIAHEKNIN FLFEASVGGGIPIIRPLVQCLTADVIEEVSGILNGTTNYMLTRMKEEGISFEEALKEAQE KGYAELHPEADVEGYDACRKIAILSSLAFGQQVDFEDIYTEGITSITAEDIRYATAMNKS IKLLGNSWRVDGKIYAMVCPMLLDNAHPLSGVNDVFNAIFVRGNMVDETMFYGKGAGKLP TASAVAADVVDCVKHTGKNIKILWNPEKLTLSDLGDFERQYFVRMPADAEAKAITAAFGA VEMVSVEGINEKAFITGVMKESAFNEAAEKAGHIINRIRLD >gi|222441820|gb|ACEP01000122.1| GENE 7 7164 - 8372 1706 402 aa, chain + ## HITS:1 COG:Cj0582 KEGG:ns NR:ns ## COG: Cj0582 COG0527 # Protein_GI_number: 15791942 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Campylobacter jejuni # 1 400 1 398 400 376 51.0 1e-104 
MLIVKKFGGTSVGNKERILNVAKRCIEDYQKGNDVVVVLSAMGKSTDELIDMAKDINPTP SKREMDMLMTTGEQVSVSLMAMAMGSLGVPAISLNAFQVAMHTTHRYGNAQLKRIDTDRI RNELEQRKIVLVTGFQGIDKFDNVTTLGRGGSDTTAVALAAALHADACEIYTDVDGVYTA DPRYVKKARKLAEITYDEMLDLASLGAGVLHNRSVEMAKKYGVQLVVRSSLNNHEGTIVR EEVNMEKMLVSGVAADKNATRIAVIGLKDEPGIAFHLFNALAKYNINVDIILQSVGRNGT KDISFTVAEDEADEAVAIIQKSFPKSEYNKIDEEKDVAKISIVGAGMMTHPGVAASMFEA LYDAGVNIKMISTSEIRVTVLIDEKYTEKALNAVHDKFALGD >gi|222441820|gb|ACEP01000122.1| GENE 8 8689 - 9168 541 159 aa, chain + ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 3 159 15 170 174 203 57.0 9e-53 MKRQDYVSWDAYFMGVALLSAQRSKDNNTQVGSCIVNPHNKILSMGYNGMPTGCNDDRMP WERKGTPLDTKYLYVCHAELNAILNYSGGSLRGARIYTTLFPCNECTKAIIQAGITEVIY YSDKYADTDSTIAAKRMLDMVGITYRQYQKEGKELVLDL >gi|222441820|gb|ACEP01000122.1| GENE 9 9622 - 10116 404 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 9 164 2 158 159 160 49 1e-38 MINEQVRDREVRVISSNGEQLGIMSSKEAMKLAREAELDLVKIAPNAKPPVCKIIDYGKY RYELARKEKEAKKKQKTIDVKEVRLSPNIDKNDLNTKINQARKFLSKGDKVKVTLRFRGR ELAHVNSSKVILEEFAEQLSDVAVVDKQPKFEGRSMIMFLTEKR >gi|222441820|gb|ACEP01000122.1| GENE 10 10147 - 10344 256 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160878606|ref|YP_001557574.1| ribosomal protein L35 [Clostridium phytofermentans ISDg] # 1 65 1 65 65 103 80 2e-21 MPKMKTSRAAAKRFKKTGSGKLKRNKAYKSHILTKKSQKRKRNLRKAAMTDATNVKNMKK ILPYL >gi|222441820|gb|ACEP01000122.1| GENE 11 10368 - 10724 507 118 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240146873|ref|ZP_04745474.1| ribosomal protein L20 [Roseburia intestinalis L1-82] # 1 118 1 118 118 199 83 1e-50 MARVKGAIGAKKRHNRTLKLAKGYRGARSKQYRVAKQSVMRALTSAYAGRKQKKRQFRQL WIARINAAARMNGLSYSKFMYGLKLAGVEVNRKMLSEMAIADPAGFAALAELAKSKLA >gi|222441820|gb|ACEP01000122.1| GENE 12 10833 - 11906 1153 357 aa, chain + ## HITS:1 
COG:lin2738_1 KEGG:ns NR:ns ## COG: lin2738_1 COG1705 # Protein_GI_number: 16801799 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 174 352 21 179 180 97 37.0 3e-20 MKNVKVIKKAAIFLMVFVLAFSSLSAAAAVTYSRGGKRVRYTGSSYKVYYNSKRVNSVTR PGLMVNGNIMIPYSYTMVKRGPKVSYSKQNKGKVLTLSYNGNKIVLYLNKTYAYINGKKE KIRTAPFKAKVGGTGLIMVPAKVVCEALGINYTYNAARKAIYMTGKAITTNAPSTSNGTT VGSSLQATAFKNMSTQEFVNAVGPIAREDYRKTGVLASVTLAQAINESGWGKSGLTQSSN NMFGMKTSLSGNSWSGSAWDGKSYAEVKTTEEYNGKKVIITAKFRKYSSVAQSIADHSAY LSNAMNGTHRRYNGLTTTKSYSTQLSILQKGGYCTWSGYVSELTTLIKKYNLTSWDN >gi|222441820|gb|ACEP01000122.1| GENE 13 11970 - 12581 596 203 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028461|ref|ZP_03717653.1| ## NR: gi|225028461|ref|ZP_03717653.1| hypothetical protein EUBHAL_02735 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02735 [Eubacterium hallii DSM 3353] # 1 203 20 222 222 374 100.0 1e-102 MGFSLALVICVLLFIFGFKHFLKGREYQGNAVQKEAESSTQDIHKEYLDTDKTFQSGYIM IKDVPEKGIYTSKLPGEKKKKGAVSLKNGQILWASKKGSYEGKTYYHLRNGMYLKASENY TEELTSYEKLGGYVAITYISSTGVRLRAWADFSADNVVKSVYVGDKVPIKGKVTLKNGDS AYITEEGLYLTTDIQYLNDYTTE >gi|222441820|gb|ACEP01000122.1| GENE 14 13178 - 13969 794 263 aa, chain + ## HITS:1 COG:no KEGG:ELI_2632 NR:ns ## KEGG: ELI_2632 # Name: not_defined # Def: 4Fe-4S ferredoxin # Organism: E.limosum # Pathway: not_defined # 10 241 8 236 262 184 42.0 4e-45 MQLLKIYYAYFSPGNTTEKVVSHIASSFQNYPVESINLTDYATRQEDYYFKENELLIIGV PAYGGRVPAPVVDALRHFTGSNTPVVLVATYGNRDIDDTLIELKKNVIDKGFIPVGAASF VCQHTFLKECAEGRPDEEDLKMAGEFGDKLKERLRLLVTYDAGDLEVPGTFPYTKPPMGE FPFKVETNEYCIYCMLCADVCPVKAISESNPKEIDSSICLRCGSCLRICPTQAKYFTEEP FKVLQEKLSPLCGVRKENWYTIA >gi|222441820|gb|ACEP01000122.1| GENE 15 15035 - 15229 256 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028465|ref|ZP_03717657.1| ## NR: gi|225028465|ref|ZP_03717657.1| hypothetical protein EUBHAL_02741 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02741 
[Eubacterium hallii DSM 3353] # 1 64 28 91 91 113 100.0 5e-24 MKSSKKTYQEVAEACSKYQPVSEEYGFKNSSCECNTPSCLNCTHFTDQEHCDLDLYDQIV ERIK >gi|222441820|gb|ACEP01000122.1| GENE 16 15465 - 15944 558 159 aa, chain - ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Clostridium acetobutylicum # 1 158 1 157 158 229 68.0 2e-60 MDKIASFTVNHLDLIPGVYVSRKDYVGKEVLTTFDLRMTAPNREPVMNTAELHTIEHLAA TYLRNNDEWKEKVIYFGPMGCRTGCYLILAGDYESKDVVGLVRDTFSFIADYEGEIPGAA ARDCGNYLDQNLPMAKYLAKKYLNEALNNPSEKNLIYPA >gi|222441820|gb|ACEP01000122.1| GENE 17 16143 - 17096 870 317 aa, chain + ## HITS:1 COG:VC2282 KEGG:ns NR:ns ## COG: VC2282 COG3608 # Protein_GI_number: 15642280 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Vibrio cholerae # 33 301 61 320 362 87 27.0 2e-17 MIETVMTAQMPVGLPIKIKKNRLEPDYIREDMPRLSIVAGVHGDELQGQYICYELIRRIK EEREHLRGIVDIYPCVTPMALEIRKRNAPGALDMNAMFPGSRHGHTIEYMAAEVVADLMG SDFCIDIHSSDIFIREIPQIRVPENAGKKVTELAQKSGIPVLWMNSNSSSVWEGSLAYSL RRKGIPNVVIESGIALKIDYDYCKKVLDGVFRMMKELDIFDGEVSPTKKTVTVKKNDVEV VHCNTSGIFIPKAQIGSYVREGQMIGSIVRPILGAEIEEITVPCDGMIFSLREYPAVCEG ELLARICKTERPAGKYE >gi|222441820|gb|ACEP01000122.1| GENE 18 17246 - 18190 564 314 aa, chain + ## HITS:1 COG:VC2282 KEGG:ns NR:ns ## COG: VC2282 COG3608 # Protein_GI_number: 15642280 # Func_class: R General function prediction only # Function: Predicted deacylase # Organism: Vibrio cholerae # 36 308 65 325 362 64 22.0 2e-10 MEEKVLYSMESPFQQEINVKGYYFGKNEGKEHSACIVGALRGNEYQQLYICSKLIDVLKK IEKHGDIVGENGILVIPSVSNMSMDFGKRFWISDDTDLNRLFPGNPSGESGSRVAYAIME TTKGYRYGIHLPSFYLGGTFMPHIRLLDPEHGSTSLANLFGLPYVIEAKSRPFDRTTLHN NWQRNGTEAFSLYTGMTGKIDDELAAQAVSSILRFLTRMGIIRYYSHSGYIASVLKEGDL EPIMTEAAGIFIKKAECGQEVSRGDVLAQIMKADSGEIISEVKAPTDGIIFYAETRPVVY SNLLVFQIIKRIHL >gi|222441820|gb|ACEP01000122.1| GENE 19 18370 - 22128 4882 1252 aa, chain + ## HITS:1 COG:CAC1655_1 
KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 2 966 3 965 985 1113 56.0 0 MSQVKRLYVEKKPSYAVKAKELAEELKAYLSIDDVSNVRVLIRYDVENVSAETMEKAAVT VFSEPPVDDLYEEDFPKKEGDRVFSVEYLPGQFDQRADSAMQCLQLLNEKEEPIIRSATT YVFSGNISDEEFEKIKEYCINPVDSRETDEKKPETLVMDYEEPADVAIFDGFKDMPEEEL KALYDSLGLAMTFEDFRYIQNYYHGEEDRNPSMTEIRVLDTYWSDHCRHTTFSTELKDVT FGDGYYKPLFEKTYNEYLADRAEVYGDRKDKFICLMDLALMGMKKLKQDGLLDNMEESDE INACSIVVPVEFEDKTEEWLISFKNETHNHPTEIEPFGGAATCLGGAIRDPLSGRSYVYQ AMRVTGCADPTVPMAETMKGKLPQRKLTTGAAKGYSSYGNQIGLATGLVDEVYHPGYVAK RMEIGAVIGAAPRKNVIRETSDPGDIIVLLGGRTGRDGCGGATGSSKAHTTKSIDTCGAE VQKGNAPTERKIQRLFRREEVSILIKKCNDFGAGGVSVAIGELADGLRIDLDKVPKKYAG LDGTELAISESQERMAVVLDKKDVEQFLTYAEEENLEATVVAEVTEAKRLVMSWRGKEIV NLSRAFLDTNGAHQETTVEVEMPDEKECYYKKQDEFKDAKEMWTEKLSDLNVCSKKGLVE MFDSSIGAGTVTMPFGGKTQLTPIQTMTAKLPVLEGKTELTTMMSYGFDPYLSSWSPYHG AVYAVLESVAKIVAAGGDFRKIRLTFQEYFERLGTDPKRWGKPFAALLGAYNAQKGFGIP SIGGKDSMSGSFEDIDVPPTLVSFAVDYTKASHVVTPEFKKAGNNLLRVDIPFDEYGIPD YKEAGKIYDTIHKLMRKDVIASAYALGAGGLIEAISKMSFGNMYGVKLVRDFNKEDLTKK YYGSILVEIPTEWMDEMAVLPNIFVGRVTDDPFFQVGEEEISLKEMCDVWQKPLESVFPT KTNATKHNVSQPLYRTNDIYVCKNKVAVPKVFIPVFPGTNCEYDSAKAFRRAGAEVETLV FKNQSESDILESVDAFEKAISQAQIIMFPGGFSAGDEPEGSAKFFATVFRNEKIKESVMK LLNERDGLALGICNGFQALVKLGLVPYGEIVNQTEDSPTLTMNEIGRHVSTMVYTKVVTN KSPWLRKAVLDQIYAIPASHGEGRFVASKEWIEKLFKNGQVATQYVNLNGEPTMDDRYNP NGSYYAIEGITSPDGRVLGKMAHSERIGENIAKNIYAKQDQKIFESGVEYFK >gi|222441820|gb|ACEP01000122.1| GENE 20 22528 - 24384 1718 618 aa, chain + ## HITS:1 COG:CAC3012 KEGG:ns NR:ns ## COG: CAC3012 COG0488 # Protein_GI_number: 15896264 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 607 1 630 632 521 46.0 1e-147 MSVINVEHISKLYGDKMILEDLSCSVDEGDKIGIIGINGTGKSTLLRIIAGEEEADEGKI 
IFSNGMTIGWMGQNPEFDEESSILKYVCEGKKIEDDYGYESDAKAMLTVLELENFDEKIK NLSGGQKKRAALCKVLLQKPDILILDEPTNHLDNKMSDWLENYLKSFRGVLLMVTHDRYF LDKVTNHIWEVEGGKVYYYDENYSGYLERKAEREERELASERKRQSILRSEVKWVMRGAR ARSTKQKARLERFEQLKAMDSPKTAKQVEMGSVGTRLGKKTIELYDISKAYGDKVLFKHF SYIFKRFERIGFVGHNGCGKSTLMKILADLEQEDSGAIEWGETIKIGYFAQECEVMDERE RVIDYIKDAAEYVRTSEGLVSASKMLERFLFSSDMQYTPIAKISGGERRRLYLLKVLMQS PNVLILDEPTNDLDIATLRVLEDFLDEFAGIVITVSHDRYFLDRTVDRIAAFENGNIVVY EGDYTEYQEKSGRIETDSIDSVDSGSGLHIKKSNEKKKEGREQWLASKNKEKKLKFSYKE QKEFETIDEDIEKLEEKITELEEQISKCATDFIKLNELMQEKEKTEAELSDKMERWVYLN DLAEKIEAQKRENNNENI >gi|222441820|gb|ACEP01000122.1| GENE 21 24371 - 25633 1233 420 aa, chain + ## HITS:1 COG:no KEGG:Rumal_0023 NR:ns ## KEGG: Rumal_0023 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 4 420 5 420 420 481 58.0 1e-134 MKTFKEADILLPGKEVNMEKWSVIACDQFTSEPEYWEEVKKFAGDSPSTLDMILPEVYLE QENHEEQLKNIEASMNKYLHEKVFTEYKDAMIYVERTDSVGNVRAGIVGKIDLEQYDYSK KSTSFIRATEATVAERIPARVEIRNNAVVELPHVMLLVDDAGKHVIEPCEQEKEHLPLLY DFDLMMGGGHLCGYLLGKEEKKRIEKALTFLADPKCFADKYHVKDRPVLLFAVGDGNHSL AAAKAYYEQLKAAHPDEDLSEHPARYALVEVVNLHSPALQFEAIHRIVTGVHVEDLLSKM NAALHIKKEGKGQTFYTIYNGKKKKCVIERPTSKLTVGSLQRFLDGYLAHNGGKIDYIHG EEVVESLAREKDSIGFILPDMGKEELFPTVIHDGALPRKTFSMGHAQDKRYYTECRKIRK >gi|222441820|gb|ACEP01000122.1| GENE 22 25772 - 26572 546 266 aa, chain + ## HITS:1 COG:no KEGG:Clole_0279 NR:ns ## KEGG: Clole_0279 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 248 1 248 274 155 34.0 2e-36 MPTTYTHYAYGQEVFHKLPEELQRKIEPYIEYYNIGVHGPDILFYYHSYCKNKVNQYGVK VHHEPAREFFKRAFKVFQAQKNRNEAFAYLAGFMTHFILDSSCHPYVRKRIKETGISHTE IETDWDFEVMQRDGRNLNHYKRACHIKDKKEIAAVIAPYFKKSAKQIQVSLFHMKLVINY IFRSGFGVKRAITTVIGGYVSPELDLHHHFIPKEINPADAETCKHLWELYNDSQELCVEM IIKLYKALHSMDMSFCKERRLSRNFS >gi|222441820|gb|ACEP01000122.1| GENE 23 26626 - 28257 1776 543 aa, chain + ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 8 287 9 293 425 108 32.0 3e-23 MKKKLTGILLLLLFIALPLQGQAKVKAPKKQCHAYVVMDAGSGEVLFGQDANKKIYPAST AKLMTAIVCVEKGNVNSVIKTKSDVVYRTTPGTYSLGIGAGVNYTFKDLLHMSLMSSAAD ATDSLAVGVFGSKKACVEAMNEKCKELGLKKTHFDNPVGSDIGAGYNETYASAKEMAKIC RYAMAIPLIRSAVSKAHYSTQKGGMYVNTTNWFLKGMAYYDRDTYKIIGSKSGTTNAAGH VFIATAADYKGHELICAYFGNVSKESTFASIRSLFDYAFNNYKKGKLTLTPSNYDVRSSQ KYGAVYSEYSALHCYPAQKDGLFAPNKAITRKQLGTMLEAIDSLKDNATLSAFVSENENG TVTTTRFAQLLQELYPVTISDKKAEEVLASCSSIDTMDETAKEAYASFASGALAVDDSCK TANQRITRGQALLIADKLADYQMNYLADHAQTQIAEVRQIPGKDGTITLPAMSYTTFNKK WSDSLKEQKEAQDAEAAAKKAEAERKAAKEKQPKKDAEQITEGTKPKENASQATTATQKQ KKK >gi|222441820|gb|ACEP01000122.1| GENE 24 28326 - 28520 286 64 aa, chain + ## HITS:1 COG:CAC2659 KEGG:ns NR:ns ## COG: CAC2659 COG1983 # Protein_GI_number: 15895917 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Clostridium acetobutylicum # 1 63 1 63 63 72 52.0 2e-13 MGKKLYKSSLDRKICGVCGGFAEFFGIDATILRLLVVLFTLAGGSGVLFYIVAALIMPDE PEYY >gi|222441820|gb|ACEP01000122.1| GENE 25 28556 - 29593 853 345 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 10 341 14 339 350 301 45.0 2e-81 MDISKKLVETHSLTKEECLELFSLFNNSEVMEYLKEEAVKVRKQYYGTKVFTRGLIEFTN YCKNNCYYCGIRRDNRNVERYRLSEEEILACCANGEKLGFRTFVLQGGEDPWYTQERIAD IVKKIKKAHPDCAITLSVGEKSKEIYKAWKEAGADRYLLRHETANTSHYHMLHPESMHLS NRKQCLYDLKELGYQVGAGFMVGSPGQTWEHLAEDLVFLQELQPQMVGIGPFIPHHDTKF AKEEAGSVDLTLKLLSVLRLLLPKVLLPATTALGTLDPLGREKGLSAGANVVMPNLSPVE NRKQYELYDNKICTGDEAAKCKNCLSARVRSAGYELVEDRGDYIG >gi|222441820|gb|ACEP01000122.1| GENE 26 29777 - 30709 704 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 310 1 318 319 275 46 
2e-73 MMKYCFGVDVGGTTVKLGLFTVEGELLDKWEIKTYTENEGERILPDVAEAIKGKIAEKLL KAEEICGIGVGVPAPVDKNGAIERAANVGWMAKEIKKELEELTGFPCVIGNDANVAALGE MWKGAGEGEKDLIMVTLGTGVGGGIIIDGHAVVGAHGAGGEIGHITVRDDETEACGCGRK GCLEQYASATGLVRLAKRYFEKNTKNSILTGKEITAKEVFDAAKAGDAAALEITEEFGAY LGQALVNLAATVDPAVFVIGGGVSKAGDILLETVRKYFYDHAFYGNQKTKITLATLGNDA GIYGAAKQVL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:19:36 2011 Seq name: gi|222441819|gb|ACEP01000123.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont401.1, whole genome shotgun sequence Length of sequence - 3035 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 30 - 293 362 ## COG1925 Phosphotransferase system, HPr-related proteins - Prom 372 - 431 6.8 - Term 415 - 470 13.2 2 2 Op 1 . - CDS 494 - 745 234 ## gi|225028479|ref|ZP_03717671.1| hypothetical protein EUBHAL_02756 3 2 Op 2 . - CDS 754 - 1134 609 ## COG0251 Putative translation initiation inhibitor, yjgF family 4 3 Op 1 . - CDS 1300 - 2007 663 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 5 3 Op 2 . 
- CDS 2004 - 2915 836 ## COG0648 Endonuclease IV - Prom 2945 - 3004 6.9 Predicted protein(s) >gi|222441819|gb|ACEP01000123.1| GENE 1 30 - 293 362 87 aa, chain - ## HITS:1 COG:TP0589 KEGG:ns NR:ns ## COG: TP0589 COG1925 # Protein_GI_number: 15639577 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Treponema pallidum # 1 87 1 87 88 57 35.0 4e-09 MVQAELTITNEQGIHLGPANKIANLADRYDCQIHFKTDYMDVNAKSIINLISGTFRHGDV VQCVCNGPDEEEALAAMKRILSENLEE >gi|222441819|gb|ACEP01000123.1| GENE 2 494 - 745 234 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028479|ref|ZP_03717671.1| ## NR: gi|225028479|ref|ZP_03717671.1| hypothetical protein EUBHAL_02756 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02756 [Eubacterium hallii DSM 3353] # 1 83 1 83 83 156 100.0 4e-37 MLQSGIIELDVHSMNRAQAKTYIDSQLKRAKKDIYQIKVIHGYRGGTAIRDMVRKTYRNH PKVVRVELPLNPGETNLILKEYF >gi|222441819|gb|ACEP01000123.1| GENE 3 754 - 1134 609 126 aa, chain - ## HITS:1 COG:lin0837 KEGG:ns NR:ns ## COG: lin0837 COG0251 # Protein_GI_number: 16799911 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Listeria innocua # 1 123 1 123 125 132 52.0 2e-31 MEKKVIVSEKSPAAIGPYSQAIEYNGIVYTSGVIGVNPATGEVGETIEEQTTQVFENLKG LLADAGSSLDQAIKTTVFIKNMDDFAKVNEIYATYFTGAFPARSCVEVARLPKDLLIEVE VTAIKG >gi|222441819|gb|ACEP01000123.1| GENE 4 1300 - 2007 663 235 aa, chain - ## HITS:1 COG:FN1387 KEGG:ns NR:ns ## COG: FN1387 COG2220 # Protein_GI_number: 19704722 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Fusobacterium nucleatum # 6 218 2 226 237 126 36.0 3e-29 MSKLTVTYIKHSGFLVETANSYLLFDYWQGKLPELQYDKELYVFSSHAHHDHYTKDIFKL ENKCQTVVYVLSSDIKRASSFWKKAENVIFMKPHEEKEIDDCVISTLKSTDEGVAFLVKT EGKTIYHAGDLHWWDWPGEPEEDNKMMEQLYKAEMEYLKAEKIDCAFVVLDPRQEESGSL GMEYFIKNVGAHFIFPMHCWEQYDIIEKYKKDVDRKFLTGQIMNITAPGQQFVLD 
>gi|222441819|gb|ACEP01000123.1| GENE 5 2004 - 2915 836 303 aa, chain - ## HITS:1 COG:BH1386 KEGG:ns NR:ns ## COG: BH1386 COG0648 # Protein_GI_number: 15613949 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Bacillus halodurans # 2 299 3 296 298 388 65.0 1e-108 MIKLGCHVGMSGKEMFLGSVKEAISYNANTFMIYTGAPQNTRRKPVEELRIEEGWALMRE NGIDEFVVHAPYIINLGNTVKSETFELAVEFLQLELERTQALKCKTLVLHPGAHVGAGAD VGIRQIIKGLNEVFSQDKDGKVNIALETMAGKGTEIGRSFEEIAAIYDGVTHNERLRVCF DTCHTSDAGYDVREDFASVIEQFDRIIGKDQIAVFHINDSKNPQGAHKDRHENIGFGEIG YDCLRNIVYNKDFEAIPKILETPYIPNPETKKRAFAPYKYEIAMLREGAFHEDLKEKILE GNK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:19:43 2011 Seq name: gi|222441818|gb|ACEP01000124.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont406.1, whole genome shotgun sequence Length of sequence - 3597 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 461 - 781 226 ## COG1687 Predicted branched-chain amino acid permeases (azaleucine resistance) 2 1 Op 2 . - CDS 781 - 1041 104 ## gi|225028485|ref|ZP_03717677.1| hypothetical protein EUBHAL_02762 - Prom 1061 - 1120 4.8 + Prom 866 - 925 3.8 3 2 Op 1 . + CDS 1158 - 2696 1241 ## Shel_01450 hypothetical protein 4 2 Op 2 . 
+ CDS 2684 - 2854 94 ## gi|225028487|ref|ZP_03717679.1| hypothetical protein EUBHAL_02764 Predicted protein(s) >gi|222441818|gb|ACEP01000124.1| GENE 1 461 - 781 226 106 aa, chain - ## HITS:1 COG:BS_azlD KEGG:ns NR:ns ## COG: BS_azlD COG1687 # Protein_GI_number: 16079723 # Func_class: E Amino acid transport and metabolism # Function: Predicted branched-chain amino acid permeases (azaleucine resistance) # Organism: Bacillus subtilis # 1 106 3 110 110 111 65.0 3e-25 MTTTQQIITIAVVVLGTMTTRFLPFIIFSADSAPEFVRFLGRVLPNAVIGLLVIYSLKDS IGGVNHCIPELIALAFIFILHKWKKNTLLSIAGGTILYMFLVQIVF >gi|222441818|gb|ACEP01000124.1| GENE 2 781 - 1041 104 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028485|ref|ZP_03717677.1| ## NR: gi|225028485|ref|ZP_03717677.1| hypothetical protein EUBHAL_02762 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02762 [Eubacterium hallii DSM 3353] # 1 86 1 86 86 122 100.0 1e-26 MPDNLYNLNIHQLNLLLYRCLYLLHNPIGFLYILSFFLCRNRLDDSAKVVKAPKTKNGVK DKAANENLLPSTKSLNNKMNESEENA >gi|222441818|gb|ACEP01000124.1| GENE 3 1158 - 2696 1241 512 aa, chain + ## HITS:1 COG:no KEGG:Shel_01450 NR:ns ## KEGG: Shel_01450 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 182 506 368 677 744 248 43.0 5e-64 MLDGDIDVTKKYYNQVGRFYDMYHNTLVKLNHLEASLVDDEPGPLNIPDSAVEYNGHYYY LCCNNEAEDYATAENYCKEQGGYLATITSEKENKFLFNYVRKKGYSSAYFGLNNLKDGKA YQWNNGELLIYTKWAKNEPDNTFSDYGYYVRFNENAKDGTWKVDTFSGGETNFNNVFLCE WGDYSVTGNDGLKVTSKKRDIVLTLDISASMDGIPLDETKKAAAKFVDSILNKNSNIGLV SYSDEATSLSGICSNDVFLKNTITSLSSAENTNIEDGLSRAYSMLQLGQSKKKLIVLMSD GLPTLGKDGEELIKYAEKIKDQGVLIYTLGFFQNTEEYKAEGQYLMEKIASEGCHYEVSS SEDLVFFFEDVAGQIGGQKYIYVKVACPVDVSVTYKGETLSSAENDQNLRTSFGTLSFRE NEGKENNEEESSGYSNTYLKEADSKVKILRLKEGTDYNIKINGTSDGEMDYTIGFVNDEG EYNDFRRFEDIDINKDTVIDTVANTSKKHCLI >gi|222441818|gb|ACEP01000124.1| GENE 4 2684 - 2854 94 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028487|ref|ZP_03717679.1| ## NR: gi|225028487|ref|ZP_03717679.1| hypothetical protein EUBHAL_02764 
[Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02764 [Eubacterium hallii DSM 3353] # 1 45 1 45 56 77 100.0 2e-13 MLNIDEDGDGKYDLKLQAGVNGYGEEVKIDRGIYIILAYAFISALIITIIVRRITI Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:20:01 2011 Seq name: gi|222441817|gb|ACEP01000125.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont408.1, whole genome shotgun sequence Length of sequence - 725 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:20:02 2011 Seq name: gi|222441816|gb|ACEP01000126.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont409.1, whole genome shotgun sequence Length of sequence - 697 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 696 378 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|222441816|gb|ACEP01000126.1| GENE 1 3 - 696 378 231 aa, chain + ## HITS:1 COG:all7245 KEGG:ns NR:ns ## COG: all7245 COG0675 # Protein_GI_number: 17233261 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 5 230 81 305 407 133 36.0 3e-31 NVQLHLEKAYKNFFRDPKIGFPRFKSKHHSKNSYTTNVVNGNILVESKRIRLPKLKWIAM KKHREPAEGLRLKSVTVSMESSGKYFASLLYEGYSCENQAAGPDYSTAKILGIDYAMQGM AVFSEKVETEEAGFFRKNEKRLAREQRKLSRCVRGSHNYELQKKKVARCHEKIRNQRRDH LHKLSRKIADGYDAAAVEDIDMKAMGQCLHFGKSVQDNGYGMFREMLDYKL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:20:03 2011 Seq name: gi|222441815|gb|ACEP01000127.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont412.1, whole genome shotgun sequence Length of sequence - 867 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + LSU_RRNA 326 - 812 91.0 # CP001104 [D:874657..877541] # 23S ribosomal RNA # Eubacterium eligens ATCC 27750 # Bacteria; Firmicutes; Clostridia; Clostridiales; Eubacteriaceae; Eubacterium. Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:20:05 2011 Seq name: gi|222441814|gb|ACEP01000128.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont413.1, whole genome shotgun sequence Length of sequence - 7934 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 310 - 702 480 ## CLJU_c14440 hypothetical protein - Prom 729 - 788 4.0 + Prom 1504 - 1563 3.0 2 2 Op 1 1/0.000 + CDS 1583 - 2173 704 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Prom 2414 - 2473 3.7 3 2 Op 2 . + CDS 2615 - 3088 503 ## COG0517 FOG: CBS domain 4 2 Op 3 . + CDS 3144 - 4814 1348 ## COG0144 tRNA and rRNA cytosine-C5-methylases 5 2 Op 4 . + CDS 4823 - 5164 203 ## gi|225028497|ref|ZP_03717689.1| hypothetical protein EUBHAL_02774 + Term 5362 - 5404 8.1 + Prom 5368 - 5427 12.2 6 3 Tu 1 . + CDS 5498 - 7168 2332 ## COG0004 Ammonia permease 7 4 Tu 1 . 
- CDS 7449 - 7703 86 ## gi|225028500|ref|ZP_03717692.1| hypothetical protein EUBHAL_02777 - Prom 7830 - 7889 9.8 Predicted protein(s) >gi|222441814|gb|ACEP01000128.1| GENE 1 310 - 702 480 130 aa, chain - ## HITS:1 COG:no KEGG:CLJU_c14440 NR:ns ## KEGG: CLJU_c14440 # Name: not_defined # Def: hypothetical protein # Organism: C.ljungdahlii # Pathway: not_defined # 1 129 1 123 123 72 37.0 8e-12 MKETILLYNLDSAAIRNKIKFLCIQGGIHIRVVEKNQYDVPIGTLAFGKKEDMELYLRSE SSEEKPSFNDPMLVFAGFTGQKLDQFLNAMRKQKIPKINLKAMLTEHNVKWDSMTLHDEL AREHEAMNKK >gi|222441814|gb|ACEP01000128.1| GENE 2 1583 - 2173 704 196 aa, chain + ## HITS:1 COG:BH0986 KEGG:ns NR:ns ## COG: BH0986 COG0494 # Protein_GI_number: 15613549 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Bacillus halodurans # 1 151 1 154 208 75 32.0 6e-14 MEEKNYKIYDSLYNVIGVAPYEEVHTQGLLHQVVHFWLAEKKDDQLWFYLQKRAEDDIVY KGAYDILTSGHIDPEETHEEAMVHQIKEQLGLFLEKEQLTYIGHIHQQIKKDKYFDNTFV QSYLYVTDHAEKDFADAGNVVKVKFEELAAFTGNPEGKITVYDIQGQPIEETTHDQWWPR DEEFEKVMAPYIEKHF >gi|222441814|gb|ACEP01000128.1| GENE 3 2615 - 3088 503 157 aa, chain + ## HITS:1 COG:CAC3674 KEGG:ns NR:ns ## COG: CAC3674 COG0517 # Protein_GI_number: 15896906 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Clostridium acetobutylicum # 1 125 1 125 140 115 45.0 2e-26 MNISFYLIPKSEVTYIHQEDTVAQALRVIHKHGYQAVPVIDEKGRYVGTISEGDFLWNLI EEYHMDMEVMRKSHVKSMRLRWDYKPVSIDADIEQLDNHILNQNFVPVVDGRNVFIGIIT RKEIIKELLKQKKDVLAKSSLAEQNVHNTQQAGRTFH >gi|222441814|gb|ACEP01000128.1| GENE 4 3144 - 4814 1348 556 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 4 302 3 280 280 212 41.0 2e-54 
MADLPQSFLDSMKEILGEDYEAFLTGFDGQRQYGLRVNTLKMNLEEFERIAPFHLKKVPW ISNGYFYNAEDVPAKHPFYSAGLYYLQEPSAMTPASRLKVQPGERVLDLCAAPGGKATEL GAALQGEGLLVANDINTARAKALLRNLELFGISNSFVTNEPPHVLAERFPEFFHKIMVDA PCSGEGMFRKNPAVVDSWKEKGPEYFSKLQREIIVQAADMLLPGGMMFYSTCTFSPLENE KTITHLLKERPDMEVVPMEDYENFAEGLTSYKDEAFDESCKLCRRIWPHKMSGEGHFMAL LHKKSGAEQEIKSDEQDSVMQNINYLQAENSKESGLNRTFHMNQEANTSEILNRDKKENT IKYQQDAEPVFDRKNIEISQKSDEKASKKSKKAKKKQQTETLSSVWWEKCKSLSKEQKAA AEDFFAHVNLAYDVGRIDVRGDNLYYLPEPQYDGRGLHFLRNGLFMGEFKKKRFEPSQPF ALALRAQDFDQVLDFPADDERLQRYLRGETLDVSDLINGEKKRKGWQLVMAAGHPLGFGK LVNNNLKNKYPAGWRK >gi|222441814|gb|ACEP01000128.1| GENE 5 4823 - 5164 203 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028497|ref|ZP_03717689.1| ## NR: gi|225028497|ref|ZP_03717689.1| hypothetical protein EUBHAL_02774 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02774 [Eubacterium hallii DSM 3353] # 1 113 8 120 120 211 100.0 2e-53 MINIISKRELSNSVIYELDDNILRSNIISILKMFENGKHKFGECSCSTWDEDGHVMRPCF HSIDELEKLKEYPSPDMSFDVDYVNSDTGKYSYTISASTNNNIITFFIDNKKC >gi|222441814|gb|ACEP01000128.1| GENE 6 5498 - 7168 2332 556 aa, chain + ## HITS:1 COG:TM0402 KEGG:ns NR:ns ## COG: TM0402 COG0004 # Protein_GI_number: 15643168 # Func_class: P Inorganic ion transport and metabolism # Function: Ammonia permease # Organism: Thermotoga maritima # 1 400 31 429 435 363 51.0 1e-100 MDMSVWYLLGAVLVFFMQCGFAMVETGFTRAKNAGNIIMKNLMDFCIGTVVFFVLGYGIM NSENYFFGLIGRPEYQMFTDFANFDWSNFFFQLVFCATAATIVSGAMAERTKFSTYCIYS AVISAIVYPIEAGWVWNSAGWLAKLGYVDFAGSSVIHMVGGIASVIGAAMLGPRIGKYTK GKDGKTVVNAFPGHSLTLGALGCFILWFAWYGFNGAAASDPTQLAQILGTTTIAPAVATF VCMMFTWIRNGAPDVSMCLNASLAGLVGITAGCANVDAVGATIIGLVDGILVVIVVEFID QKLKIDDPVGAVAVHGCNGLWGTVAVGLFDYNNGVFYGGGFHQLGVQVLGVVCIAAYTAV AMTIVFTILKHTIGLRVSAEEEIMGLDIAEHDLASAYADFLPISATTMGGVTTETIDVTD LRDKKLAPVIGGAKETGGRYTKLTIMCKEDRFAILKDAMSQIGVTGMTVSHVMGCGTQKG KTGQYRGVKIDMNLLPQLQVDIVVSTVPPELVVEAAKKALYTGEYGDGKIFLYDVENVVR IRTNETGIAALDNEEK >gi|222441814|gb|ACEP01000128.1| GENE 7 7449 - 7703 86 84 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|225028500|ref|ZP_03717692.1| ## NR: gi|225028500|ref|ZP_03717692.1| hypothetical protein EUBHAL_02777 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02777 [Eubacterium hallii DSM 3353] # 1 84 1 84 84 154 100.0 2e-36 MIADSLIYLKEYNEITAKNRNEKNFDTSDEFLSGLQKSDRLHPVISLCIYYGKEDMRMSL PGKLNFEGLTQKSIRKNYYFYTLK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:20:29 2011 Seq name: gi|222441813|gb|ACEP01000129.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont415.1, whole genome shotgun sequence Length of sequence - 27815 bp Number of predicted genes - 30, with homology - 28 Number of transcription units - 15, operones - 9 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 2 1 Op 2 . - CDS 1544 - 2104 806 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase - Prom 2150 - 2209 3.6 3 2 Tu 1 . - CDS 2249 - 2548 130 ## - Prom 2596 - 2655 5.2 - TRNA 2370 - 2443 81.1 # Met CAT 0 0 - TRNA 2469 - 2541 85.1 # Val TAC 0 0 4 3 Op 1 . - CDS 2697 - 3878 1304 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase - Prom 3931 - 3990 5.6 5 3 Op 2 . - CDS 4077 - 4562 478 ## Closa_0686 hypothetical protein - Prom 4682 - 4741 10.0 6 4 Tu 1 . - CDS 4769 - 5788 919 ## COG5279 Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain - Prom 5825 - 5884 1.8 - Term 6166 - 6229 18.1 7 5 Op 1 41/0.000 - CDS 6308 - 7933 1584 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 - Prom 7969 - 8028 5.3 8 5 Op 2 . - CDS 8101 - 8373 528 ## COG0234 Co-chaperonin GroES (HSP10) - Prom 8394 - 8453 5.3 9 6 Tu 1 . - CDS 8573 - 8923 150 ## gi|225028508|ref|ZP_03717700.1| hypothetical protein EUBHAL_02787 - Prom 9044 - 9103 6.8 - Term 9013 - 9058 6.6 10 7 Op 1 . - CDS 9122 - 10378 470 ## COG4219 Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component 11 7 Op 2 . 
- CDS 10498 - 10869 431 ## Toce_0329 transcriptional repressor, CopY family - Prom 10910 - 10969 5.2 12 8 Tu 1 . - CDS 11489 - 12349 747 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 12483 - 12542 1.8 - Term 12478 - 12550 25.8 13 9 Op 1 1/0.000 - CDS 12585 - 13739 1325 ## COG0726 Predicted xylanase/chitin deacetylase - Term 13747 - 13785 6.2 14 9 Op 2 . - CDS 13846 - 14403 731 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 15 9 Op 3 . - CDS 14466 - 14981 478 ## COG2179 Predicted hydrolase of the HAD superfamily - Prom 15009 - 15068 3.4 16 10 Op 1 . - CDS 15108 - 15737 445 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 17 10 Op 2 . - CDS 15796 - 15942 86 ## 18 10 Op 3 4/0.000 - CDS 15890 - 17020 1148 ## COG0826 Collagenase and related proteases 19 10 Op 4 . - CDS 17040 - 17687 574 ## COG4122 Predicted O-methyltransferase 20 10 Op 5 . - CDS 17684 - 18052 417 ## gi|225028519|ref|ZP_03717711.1| hypothetical protein EUBHAL_02798 - Prom 18091 - 18150 4.5 21 11 Op 1 . - CDS 18267 - 18656 357 ## gi|225028520|ref|ZP_03717712.1| hypothetical protein EUBHAL_02799 22 11 Op 2 . - CDS 18660 - 20324 1947 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily - Prom 20374 - 20433 4.7 - Term 20364 - 20433 2.0 23 12 Op 1 . - CDS 20446 - 20733 469 ## gi|225028522|ref|ZP_03717714.1| hypothetical protein EUBHAL_02801 24 12 Op 2 6/0.000 - CDS 20773 - 21210 432 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 25 12 Op 3 . - CDS 21213 - 21470 381 ## COG4472 Uncharacterized protein conserved in bacteria 26 12 Op 4 . - CDS 21525 - 22877 405 ## PROTEIN SUPPORTED gi|227424643|ref|ZP_03907734.1| SSU ribosomal protein S12P methylthiotransferase - Prom 23061 - 23120 5.8 + Prom 23026 - 23085 7.7 27 13 Tu 1 . 
+ CDS 23110 - 23346 344 ## bpr_I2619 hypothetical protein + Term 23416 - 23453 5.1 - Term 23402 - 23439 5.1 28 14 Op 1 7/0.000 - CDS 23606 - 24790 1505 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 29 14 Op 2 . - CDS 24887 - 26035 1240 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes - Prom 26132 - 26191 6.6 30 15 Tu 1 . - CDS 26406 - 27806 1409 ## COG5492 Bacterial surface proteins containing Ig-like domains Predicted protein(s) >gi|222441813|gb|ACEP01000129.1| GENE 1 9 - 1547 1210 512 aa, chain - ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 8 128 9 125 391 62 28.0 2e-09 MIVWDDGTKKQRVYEMWEDNFHDPVSYANFYFNEVYGKNEVLLNEEMSADENIVKGMLHL NPYTLHVKGQQAEAKYIVGVATDEEYRRQGVMRELLVKTFQELRSRGELFTYLMPADENY YLPFDFRFGMAQVEQELEYLPELRKETEKQAKDIIEKDSKDSIDAKQESDAKNKEVLLKK EEVEQESQPEECPIKEDILGDNKNKRESSFEKYTFKTELSDAELETLVAQENVIKDTLFD IFIEISVPYIRRMEKEVESDYGQVLYVFDGESYVGRAAVGAEDSYFVLSQVFCADASKRE NFLEQVILYCEEKYHYNRYQLILDPSWKDITTKIGLYRNFRFMPAKRVKKIMFRLLNIEE LSKFLECKIETLKETSETEKEKVEAKICEADSCGTNSEGNDDCKSDNCENTICEIDNCEA YTYREDIYIEDKYLEEQSGVYHFSLKDGKVSITKTETIKTDGMQSVSIAALTDYLFGKKE ESIDSCETLTEEGKMLLKNICPLSENCIMEIV >gi|222441813|gb|ACEP01000129.1| GENE 2 1544 - 2104 806 186 aa, chain - ## HITS:1 COG:CAC0434 KEGG:ns NR:ns ## COG: CAC0434 COG0245 # Protein_GI_number: 15893725 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Clostridium acetobutylicum # 1 154 1 154 155 208 66.0 4e-54 MRIGNGYDVHRLVENRKLILGGVEIPYEKGLLGHSDADVLLHAVMDALLGAAALGDIGKH FPDNDPAYEGADSMKLLEEVGKLLQEKAYFIENIDATVIAQRPKLAPYIPEMKKNIARVL CIQEGQVNVKATTEEGLGFTGEGLGISASAVCLLSEIQNYVGADVLMEDRDCAGCGGCAR FGGIQK >gi|222441813|gb|ACEP01000129.1| GENE 3 2249 - 2548 130 99 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MTWDLSSAGRASALQAEGHRFKSCRSHLFFVNIKYGGIAQLARARGSYPRCRWFKSSSRY FLLQFFVCNGRQTQTRKGYYVKMLRITALKLLDRNLSLI >gi|222441813|gb|ACEP01000129.1| GENE 4 2697 - 3878 1304 393 aa, chain - ## HITS:1 COG:PAB0873 KEGG:ns NR:ns ## COG: PAB0873 COG1473 # Protein_GI_number: 14521524 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Pyrococcus abyssi # 7 381 6 376 383 286 41.0 5e-77 MTREEVLEKVSKKREEFISWYHEFHQIPELGFEEKETTELICEKLTGMGIHPLRLDPTGV TAYIGPHDAYTVGLRADIDGLRVLEDTDLPFQSKHPGKMHACGHDGHITGLLGAASILKE MESELTVRVKLIFQPSEENTKGAKQIISQGILEDVDEVFGLHLFSDIKAGEISVEPGARM AQTDRFTITFIGKGGHAAKPHQCVDATVMAADFVMNVQTIVARELDPIEGAVVTVGSLKS GTQYNIISGKAVLEGTCRSYSEDVAQHLQDAIRRRAEAVADSYGGNVKIAYEQGSHPPVL NDAALSRRIADEAEQLLENRKLVHIAPLMLGEDFSWYQKEVPGVFAFVGCGTPGRKCYPN HHPKFEVEESALSDGVLLHLAAVVSAMHREKEE >gi|222441813|gb|ACEP01000129.1| GENE 5 4077 - 4562 478 161 aa, chain - ## HITS:1 COG:no KEGG:Closa_0686 NR:ns ## KEGG: Closa_0686 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 161 1 154 157 77 33.0 1e-13 MTKEQFIMELEQCLQGEVSAYELSDSLTYYRQYFEDEIRNGKSEEQVVEELGSPRLIARS IIDAHGIEESASGNSDYGDNSYGSGYAQNDSSYRSYGDAYSDRRDIENNNQKGALQSAGK MIMTIIISALVFFIVLFLLRALLPLAVIIIAAVLIVRLIRG >gi|222441813|gb|ACEP01000129.1| GENE 6 4769 - 5788 919 339 aa, chain - ## HITS:1 COG:CAP0003_2 KEGG:ns NR:ns ## COG: CAP0003_2 COG5279 # Protein_GI_number: 15004708 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Uncharacterized protein involved in cytokinesis, contains TGc (transglutaminase/protease-like) domain # Organism: Clostridium acetobutylicum # 133 244 1 114 132 85 43.0 1e-16 MPHLNASAQTVIDTKKELLAEIENALSSGKTKISFVTSDLDQTDFDTLNQNIEGFYGVVK EYQIKSVKFLNKSYVTLNCEISDNYYVEKAFFDEEDIPEERDKARDLYEACKSFLTSLQS KKRSDYEKEKLIHDYIVSNVAYGYPGGKKEPEGDAYSAYGALVLKKAVCNGYAEGMKLLC DLSGVTCKMISGTADGEKHAWNLIKLDKEWYHADLTWDDPEPDETSRIMYTYFNVDDTQM 
KADHKWNAALYQKAEGNEYNYYRKKDLLCEDYKSFRSKCEDILEEKSPNSIQFMVKDYDQ DTYSDDNLQFILRYSGASSLRMQIAGKTPYTMLYFKLQY >gi|222441813|gb|ACEP01000129.1| GENE 7 6308 - 7933 1584 541 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 541 3 543 547 614 58 1e-175 MAKEIKYGVEARKALEAGVNQLADTVRVTLGPKGRNVVLDKSFGAPLITNDGVTIAKDIE LEDPFENMGAQIVKEVATKTNDVAGDGTTTATVLAQAMINAGMKNLAAGANPIILRKGMK KATNAAVEAIGNMSEQVGGKEQIAKVASISAADEEVGQLVADAMEKVSQDGVITIEESKT MKTELDLVEGMQFDRGYISAYMATDMEKMVAELDDPYILITDKKISNIQEILPLLEQIVQ SGAKLLIIAEDIEGEALTTLIVNKLRGTFSVVGVKAPGYGDRRKAMLEDIAILTGGTVVS EEVGIELKDATMDMLGRAKSVKVEKENTVIVDGAGDKAAIKARVGQIKSQIEETTSDFDR EKLQERLAKLSGGVAVIRVGAATETEMKEAKLRLEDALAATRAAVEEGIIAGGGSAYIHA AKAVKEMAKELTGDEKTGAEIVLKALEAPLYHIAANAGLEGSVIINKVAESEAGVGFDAL SETYVNMVESGIIDPAKVTRSALQNATSVASTLLTTESVVADIKEDTPAPAMPAGGGMGM M >gi|222441813|gb|ACEP01000129.1| GENE 8 8101 - 8373 528 90 aa, chain - ## HITS:1 COG:CC0686 KEGG:ns NR:ns ## COG: CC0686 COG0234 # Protein_GI_number: 16124939 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Caulobacter vibrioides # 1 90 1 95 96 84 48.0 4e-17 MKLVPLADRVVLKQLEAETKTKTGIILTSSAQEKPQEAEVVAVGPGTEDVKMEVSVGQKV IYSKYAGTNVKMEEEEYIIVKQSDILAIVE >gi|222441813|gb|ACEP01000129.1| GENE 9 8573 - 8923 150 116 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028508|ref|ZP_03717700.1| ## NR: gi|225028508|ref|ZP_03717700.1| hypothetical protein EUBHAL_02787 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02787 [Eubacterium hallii DSM 3353] # 1 116 9 124 124 184 100.0 1e-45 MWIRKIGLGILIGVVLCSVMVNAEERNSRKAEPMTVMERMREIPRKIYGKKKSRNKMRKV PGKFKLPPDARIGDIIKTKSGYKQIIDIWGNGKFRLKSLEKKYYRQRKTRVKKRYN >gi|222441813|gb|ACEP01000129.1| GENE 10 9122 - 10378 470 418 aa, chain - ## HITS:1 COG:CAC3437 KEGG:ns NR:ns ## COG: CAC3437 COG4219 # Protein_GI_number: 15896678 # Func_class: K Transcription; T Signal transduction mechanisms # Function: 
Antirepressor regulating drug resistance, predicted signal transduction N-terminal membrane component # Organism: Clostridium acetobutylicum # 74 219 114 272 541 73 24.0 8e-13 MTAQKYIWILRLSIILYLCPFQEFKYLILPGKLLERFNFADFWGIAGKKISRFEVVNIPL LFGDYYVIPRGIFLVTVIWFIICTILVIYYYGTYFFIKRQIRKNGIKTDFFSGKRGKDIE VFQTAKVKTPCTVGWLHPNIFIPKRDYTVKEKEWLLRHEITHVRHGDTFWKALALLVCIF RWYNPFVYYLFQQYSVMCEYYCDAVCMQNSNKEEKKNYAIFLVKSATLTIPRRGLAIVQG LTNNGEKMQERVDRILDEDKKPAGKIQLFIAGILTVMCLCSFMTIFVYSATQEQDVPAQD NTFTEGEWDYFYGEDVSMGDASLDFSKSTILFESKQREITPVFQSNLMNNGSDKGQKKAQ QKRCNHVMEEGTLKIHKSKKKGCHIEAYEAQRCKKCGIVKKNQLSYDTDYVVCSHAKK >gi|222441813|gb|ACEP01000129.1| GENE 11 10498 - 10869 431 123 aa, chain - ## HITS:1 COG:no KEGG:Toce_0329 NR:ns ## KEGG: Toce_0329 # Name: not_defined # Def: transcriptional repressor, CopY family # Organism: T.oceani # Pathway: not_defined # 5 120 7 120 122 70 34.0 2e-11 MNSHLSKTEYRIMEYFWSTGGKYTFGELMKYFNEEEDKNWKKQTLNTFLSRLIDKGLLER KKEETKAYYGAALTKAEFKQRKAKAILEECYEGKISHFIAALTGNNAITKVDEKELIAHI LDR >gi|222441813|gb|ACEP01000129.1| GENE 12 11489 - 12349 747 286 aa, chain - ## HITS:1 COG:TP0986 KEGG:ns NR:ns ## COG: TP0986 COG0697 # Protein_GI_number: 15639970 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Treponema pallidum # 2 278 4 278 294 195 43.0 8e-50 MSQKNKGIILIIMSAFFFALMNMMVRLAGDLPSIEKSFFRNFVAVFFALIALKRSNTPVH VPKGQLKNLLMRSVCGTLGILCNYYAIDHLMLADASILNKLSPFFAILFSFILLKEKIHP FAAGCVFIAFVGSLFIIKPGFASVTALPAFVGLLGGMGAGIAYTYVRKLGTNGVKGPFIV LFFSAFSCIVTLPYLIFDFHPMTLAQLGCLLLAGLFASGGQFTITAAYTCAPAGEISIYD YSQIIFSTVLGFLFLRQIPDMWSFVGYGIIIAASVAMFLFNNRKRV >gi|222441813|gb|ACEP01000129.1| GENE 13 12585 - 13739 1325 384 aa, chain - ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # 
Organism: Bacillus subtilis # 154 346 23 213 217 158 41.0 1e-38 MRRKKKSGAGKMIAAVVVVVILLCGIFAIMKNLASPASADNKAGNKEGNSAKATEASTEA PDNSVPMTGLKVTAPATTIRVGQTMQLKISHEPSNATNTKLKWSCSKDGMVTVTKDGVLK PGKNAGKNTVKVTATATDGSKLSASFDLRIYPAIDPSKPMVAITFDDGPNPETTTPMLDA LEENYAKATFFCLGQNAGYYPETVQREYNLGMEVGTHTYSHVVLTSLSASALDSEISKSV DAINKAIGVKPSLMRPPYGAVNKTVLSAVGGYNLCCMNWSLDTEDWKTKNADATYNEVMK AQDGDVVLLHDIHEYNVDAVKRFVPDLIAEGFQLVTVPELYEARGETLEAGTVHYRTDPT TQAGTETAPAASTDSAAESTTASE >gi|222441813|gb|ACEP01000129.1| GENE 14 13846 - 14403 731 185 aa, chain - ## HITS:1 COG:CAC2094 KEGG:ns NR:ns ## COG: CAC2094 COG0231 # Protein_GI_number: 15895364 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Clostridium acetobutylicum # 1 185 1 185 186 214 56.0 9e-56 MISAGDFKNGLTIELEGNVYQIVDFQHVKPGKGAAFVRTKMKNIKNGGVVEKTFRPTEKF PPAHIERTDMQYLYTDGDLYYFMDVETFDQIALNEDMIGDSLKFVRENDMVKMISHNGDI FAVEPPMFAELEVTECEPGEKGNTATGATKPCTVETGAVVYVPLFVNLGDKLKIDTRTGE YLSRV >gi|222441813|gb|ACEP01000129.1| GENE 15 14466 - 14981 478 171 aa, chain - ## HITS:1 COG:BH1322 KEGG:ns NR:ns ## COG: BH1322 COG2179 # Protein_GI_number: 15613885 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HAD superfamily # Organism: Bacillus halodurans # 1 169 1 166 171 111 33.0 7e-25 MLQMLYPDEYLDSTYTIDFKKLYKDGYRGILFDIDNTLVPHGAPADKRAVALFKKLREIG FQTCLISNNKEPRVKSFCDKVGSTYVFKAGKPLPGGYEEGIRRMKTTKENTLFIGDQIFT DVLGAKRAGLHAIMVKPIHPKEEIQIVLKRYLEKVVLFFYFRSIRKKSVEK >gi|222441813|gb|ACEP01000129.1| GENE 16 15108 - 15737 445 209 aa, chain - ## HITS:1 COG:CAC1689 KEGG:ns NR:ns ## COG: CAC1689 COG1191 # Protein_GI_number: 15894966 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 3 201 26 224 234 207 55.0 1e-53 MRSFKKPLKAEEERYYLKMYRDGDEKAKEKLIEHNLRLVAHIVKKYNSPDRDLDDLISIG TIGLIKAVNTFDLNKGGRLVTYAARCIENELLMMIRQEKKILREVSLYEPIGTDQEGNHI 
SLLDIISTEEAELIEQYLHKYYLSRLEQEVFLLTDDREREIILKRYGLCGNEPHTQKEVA AMMGISRSYVSRIEKKALEKLRSSYEQYM >gi|222441813|gb|ACEP01000129.1| GENE 17 15796 - 15942 86 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEAIKNLPVLAMEDGDRNSVESAPHPKQEIWINFGEALERGFLLRRKE >gi|222441813|gb|ACEP01000129.1| GENE 18 15890 - 17020 1148 376 aa, chain - ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 1 363 1 361 406 415 55.0 1e-116 MRNTELLIPASSLEVLKTAVLYGADAVYIGGEMYGLRAKARNFSREDMEAGVKFAHENGV KVYVTANIVAHNEDLEGIREYFRELKEIGPDALIISDPGVFMIAREECPDIERHISTQAN STNYATYNFWYAQGASRVVAARELSLAELSEIRANIPEDMEIEAFVHGAMCISHSGRCLL SNYFTGRNANMGACTHPCRWKYYVMEESRPGEYLPVEENERGTYIFNSKDLCMIEHIPDL LEAGVDSLKIEGRMKTALYVAAVTRTYRRAIDDYKTSPDLFEKNLDYYREEIAKCTYRQF TTGFYYGKTDENSQIYDANTYIKEYTYIGIVQDIDENGWCKIYQRNKFSVGEIIEVMLPD GSNKKSSSTCDGRRRP >gi|222441813|gb|ACEP01000129.1| GENE 19 17040 - 17687 574 215 aa, chain - ## HITS:1 COG:CAC1686 KEGG:ns NR:ns ## COG: CAC1686 COG4122 # Protein_GI_number: 15894963 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 2 215 4 213 216 136 36.0 3e-32 MITEERVLDFIRSFGVDRGSEALHMIEKEAIRDDVPIIRKESGELLRILLQIKKPEKILE VGAAIGFSSVFMGENTDNNTHITTIENYPPRIERAKVNIALAGMEDKITLISGDAAEVLK ELSGSWDFIFMDAAKGQYIHFMPEVMRLLAPGGILVSDNVLQDGDIFESRYGIKRRNHTI HNRMREYLYALTHDEALDTVILETGDGMAISVKKE >gi|222441813|gb|ACEP01000129.1| GENE 20 17684 - 18052 417 122 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028519|ref|ZP_03717711.1| ## NR: gi|225028519|ref|ZP_03717711.1| hypothetical protein EUBHAL_02798 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02798 [Eubacterium hallii DSM 3353] # 1 122 1 122 122 219 100.0 4e-56 MAEKVLHVIIKILMYIFIVLLFILLFVRMSGVGRAIFADKPKDKNPQIASETVLVVEQGE SLLEISKDLAEQGIVKNPYLFAISLRCMDGYQNIRPGEYQVSSSEKPSEILKQLTHEEEK IQ 
>gi|222441813|gb|ACEP01000129.1| GENE 21 18267 - 18656 357 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028520|ref|ZP_03717712.1| ## NR: gi|225028520|ref|ZP_03717712.1| hypothetical protein EUBHAL_02799 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02799 [Eubacterium hallii DSM 3353] # 1 129 1 129 129 220 100.0 2e-56 MMDEGKLSIPFFPDTEDIRQGTKKLTDMICHTEDYKCYQKDLTVLKEQEELYRKFKEFRG KSLYLQLEKGQEQYFEKIESLHSEYKDVLTEPVVVDFLSAEQRMCKLMRLVYDGIAENIK LDLSYMDEL >gi|222441813|gb|ACEP01000129.1| GENE 22 18660 - 20324 1947 554 aa, chain - ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 9 554 8 555 555 659 58.0 0 MKVTGNQSVKIIPLGGLEQIGMNITAFEYEDSIIVVDCGLAFPDDEMLGIDLVIPDITYL KENASKVKGLVITHGHEDHIGAIPYALKELNMPIYATKLTMGLIENKLKEHNLLRTTRRK VIKHGQSINLGRFRIEFIKTNHSISDAAALAIYTPAGIIVHTGDFKIDYTPVFGDAIDLG RFAELGKKGVLALMCESTNVEREGYTPSERTVGRTLRHLFDENKDHRIIVATFASNVDRV QQIIDCAYSCGKKVVIEGRSMVSTISTAAELGYIKIPDNTLIDIEQMKNYVDEQLVLITT GSQGETMAALSRMAASTHRKVSIKPGDVIIFSSHPIPGNEKAVSKVMNELAKKGASVIFQ DTHVSGHACQEEIKLIYTLTKPQYAIPVHGEYRHLKRHKDLAIEMGIPKENVMILQSGDV LTISKEQAKVTGHVPAQGILVDGLGVGDVGNIVLRDRQHLSQNGLFIVVVTLDRYNGVLL AGPDIVSRGFVYVRESEKLMEEAKSVVTNTLEHCNSKKITDWGRIKGAIRDDLGDYIWKK MKRSPMILPIIMEV >gi|222441813|gb|ACEP01000129.1| GENE 23 20446 - 20733 469 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028522|ref|ZP_03717714.1| ## NR: gi|225028522|ref|ZP_03717714.1| hypothetical protein EUBHAL_02801 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02801 [Eubacterium hallii DSM 3353] # 1 95 1 95 95 129 100.0 5e-29 MNNYQSIPFKDEDGKEKEFYVLEQTTLNGFHYLLVEDGSADADDEEVEVLIMKRVADAAE DEEWATYEMVEDEEELISVSKVFEQLMDDINFEVE >gi|222441813|gb|ACEP01000129.1| GENE 24 20773 - 21210 432 145 aa, chain - ## HITS:1 COG:lin1537 KEGG:ns NR:ns ## COG: lin1537 COG0816 # Protein_GI_number: 16800605 # Func_class: L 
Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Listeria innocua # 4 138 2 135 138 137 58.0 8e-33 MKERWMGLDYGTKTVGVAVSDALGITAQGVETVTRKSNKKLRQTLARIEALIEEYEVSKI VLGLPKNMNNTLGERAEETKEFQAMLQRRTGLEVVLWDERLTTMESERILQEGGVRRENR KERIDWMAATLILQSYMDVHPIPEK >gi|222441813|gb|ACEP01000129.1| GENE 25 21213 - 21470 381 85 aa, chain - ## HITS:1 COG:BH1268 KEGG:ns NR:ns ## COG: BH1268 COG4472 # Protein_GI_number: 15613831 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 4 85 5 87 90 86 55.0 1e-17 MGRNNTQFFHVVKEQEYDVASILKDVYEALTEKGYNPVNQIVGYIMSGDPTYITSHKNAR SLIMKVERDEILEVLFENYIDTQLK >gi|222441813|gb|ACEP01000129.1| GENE 26 21525 - 22877 405 450 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227424643|ref|ZP_03907734.1| SSU ribosomal protein S12P methylthiotransferase [Denitrovibrio acetiphilus DSM 12809] # 1 402 1 402 435 160 26 8e-39 MSKRRAAFLTLGCKVNQYETDAMEEILEKAGYEIVSFKETADVYIINTCSVTNMADRKSR QMIHRAKKNNPDAIIVAAGCYVQAAEEELAKKNEADILVGNNKKKDIAQILEEYFAAKEP EQEVPVVSEVIDINHTKEYEDLTIHKVNEHTRAYIKIQDGCNQFCSYCIIPYTRGRIRSK NPEEVIEEVKNLAAQGYKEIVLTGIHLSSYGKDLGTVTLLDVIKRIQQVEDVERIRLGSL EPRIITEEFVKELVKCDKVCPHFHLSLQSGCDETLKRMNRKYTTEEYEEALNILRKYYEH PALTTDVIAGFVGETEEEFEKTRAYLEKINLYEMHIFKYSVREGTRAQKMSGHVPEQVKT ERSEVLLSMAKRHKTAFEEWYVGRKEKVLLEEIVEKNGKKYFQGYTAHYVKVAVELTEND TKLEQNEVIVVDIQGFIDENLLYGKVSIEF >gi|222441813|gb|ACEP01000129.1| GENE 27 23110 - 23346 344 78 aa, chain + ## HITS:1 COG:no KEGG:bpr_I2619 NR:ns ## KEGG: bpr_I2619 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 1 76 1 76 76 103 76.0 2e-21 MKTIKISLNSIDKVKTFVNVINRFDAEFDLVSGRYVIDAKSIMGIFSLDISKPIDLNIHN ADNLDEIMEQLQPYLVTE >gi|222441813|gb|ACEP01000129.1| GENE 28 23606 - 24790 1505 394 aa, chain - ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 
# Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 4 391 3 381 384 356 50.0 5e-98 MEFKAFLIKYAEIGIKGNNRRFFEDALVRQMKYALKPVGDFSIRKESGRIFVHANEEYDF DEAVESLSHVFGIAGICPIVVEEDKDFEKIAEVVEKYVDEDYADKHFSFKVHVRRADKKY PIPSMEAAARLGERLLEKFEDLSVDVHNPDVMLTVEIRDKVYIYSREIPGPGGMPVGTNG RAMLLLSGGIDSPVAGYMISKRGVSIDATYFHAPPYTSERAKQKVVDLAKEIAKYTGPIT LHVVNFTDIQMAIYEKCPHDELTIIMRRYMMKIAEHFANQDNCLSLVTGESIGQVASQTM HSLYVTNEVCQMPVFRPCIGLDKNEIVKVAEKIGTYETSILPYEDCCTIFVAKHPVTKPR LDVIKRSERNLDDCIDELMKTAIETTEEIKINAH >gi|222441813|gb|ACEP01000129.1| GENE 29 24887 - 26035 1240 382 aa, chain - ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 4 379 3 377 379 300 45.0 4e-81 MSEIYFDNGATTRALPQVREIMDEVLEKEYGNPSSMHMKGFDAEKYVNHAKEIIAKSLKV DSKEIYFTSGGTEANNLALIGTAFANKRERKHIITSCIEHASVYNPLSFLEDEGFEVTYL PVDEHGIVDLEALKKALRKDTLMVSIMCVNNEIGAIEPVEEIGKIVKSFDPKILFHTDCI QAYGKLNCYPKKWKADMISVSGHKIHGPKGIGFIYIKNGTKIKPIIWGGNQQKGMRSGTE NVPGIAGLGKAAELIYENHEEKIAHIREIKDHFIKRVMEEIPDVKDNSGDAPHVASISFL GVRSEVLLHALEEREIYVSAGSACSSNKPEVSGTLKGIGLTKEYYESTLRFSFSIFNTVE EADICVEALKELLPMLRRFTRR >gi|222441813|gb|ACEP01000129.1| GENE 30 26406 - 27806 1409 466 aa, chain - ## HITS:1 COG:YPMT1.20c KEGG:ns NR:ns ## COG: YPMT1.20c COG5492 # Protein_GI_number: 16082802 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Yersinia pestis # 294 465 50 214 220 65 36.0 2e-10 MYDWTEKEVKNLYLPMQAVVYLTKEGLAQASMMDKLMDDLCDVAVNRENYYMRLQANMIS GMAGVLGKNSGLDTGALAVTIARVIPGLAAKHPLDNAYLLTHLIPIDENGQKIVCIPEDA DYKEDLKATDNGTMTYTVMNQNLETNECSQVKSYVDIPIKEGEVYKSTFETDTESSTETL KNVSGDEVQPSIEKGASEDIKKQVNVTAQSGGSVTGGGSYTISEYAKVTAEPETNYTFAG WYENDNLISKEKEYRFCVQNNRNLSAHFSKNANQGTQKPSSNTTKPSGNTTNAPSIKPFK 
ATGKPKQIAAGKKVNLKAEIGLPDSITKQLTWKSSNTKVATVNADGVVTVKKKTGGKKVT ITASNDKIKVSASWKVTSMKGIVKKVKITGTKTVKAGKSLKLKANVTATKKANKKLQWTS SNAKYATVNAKGVVKTKKVAKGKTIKITAMATDGSNKKATFKVKVK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:21:20 2011 Seq name: gi|222441812|gb|ACEP01000130.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont416.1, whole genome shotgun sequence Length of sequence - 15523 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 8, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 10 - 180 196 ## gi|225028532|ref|ZP_03717724.1| hypothetical protein EUBHAL_02811 - Prom 372 - 431 12.3 2 2 Tu 1 . + CDS 723 - 1052 263 ## gi|225028533|ref|ZP_03717725.1| hypothetical protein EUBHAL_02812 + Prom 1063 - 1122 6.7 3 3 Tu 1 . + CDS 1223 - 1870 681 ## COG1739 Uncharacterized conserved protein - Term 1973 - 2024 9.1 4 4 Tu 1 . - CDS 2056 - 3891 2079 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 4030 - 4089 15.8 + Prom 4110 - 4169 15.2 5 5 Tu 1 . + CDS 4222 - 4893 544 ## COG0629 Single-stranded DNA-binding protein + Term 4898 - 4941 8.0 + Prom 5107 - 5166 5.8 6 6 Op 1 . + CDS 5200 - 6441 1587 ## COG0124 Histidyl-tRNA synthetase 7 6 Op 2 . + CDS 6512 - 7030 476 ## COG2109 ATP:corrinoid adenosyltransferase 8 6 Op 3 6/0.000 + CDS 7067 - 7954 1199 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase + Prom 8016 - 8075 5.4 9 6 Op 4 . + CDS 8098 - 8859 1048 ## COG0289 Dihydrodipicolinate reductase + Term 8935 - 8982 4.9 + Prom 9309 - 9368 9.1 10 7 Op 1 . + CDS 9424 - 12888 3698 ## COG1038 Pyruvate carboxylase 11 7 Op 2 . + CDS 12900 - 13085 101 ## + Term 13294 - 13331 2.4 - Term 12944 - 13009 5.9 12 8 Tu 1 . 
- CDS 13036 - 15009 763 ## COG1404 Subtilisin-like serine proteases - Prom 15041 - 15100 2.6 Predicted protein(s) >gi|222441812|gb|ACEP01000130.1| GENE 1 10 - 180 196 56 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028532|ref|ZP_03717724.1| ## NR: gi|225028532|ref|ZP_03717724.1| hypothetical protein EUBHAL_02811 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02811 [Eubacterium hallii DSM 3353] # 1 56 1 56 56 84 100.0 2e-15 MTIKEARTKTNRPAEEIAKLLGFPLKTWLAWEEGERKPPQYIEQLILERLEKMNKS >gi|222441812|gb|ACEP01000130.1| GENE 2 723 - 1052 263 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028533|ref|ZP_03717725.1| ## NR: gi|225028533|ref|ZP_03717725.1| hypothetical protein EUBHAL_02812 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02812 [Eubacterium hallii DSM 3353] # 1 109 1 109 109 194 100.0 2e-48 MERILKNERRVEGDCKRYLLRYTLVSNGFPKEEGEGIYYGIITEQFVWTDEEERWEAQDT AQVKGFSESLEESMAFFEKMVQGEVMPVSLNEIVDDWKSAFYMGRKESA >gi|222441812|gb|ACEP01000130.1| GENE 3 1223 - 1870 681 215 aa, chain + ## HITS:1 COG:lin2660 KEGG:ns NR:ns ## COG: lin2660 COG1739 # Protein_GI_number: 16801721 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 204 1 205 211 152 41.0 3e-37 MIRHKAVYRGGSGEIVEKKSRFIANIKSVETVEEAQVYIEEMKKKYWDARHNCSAFSVGT EQVTTRCSDDGEPSGTAGKPILEVISGSGIHNIVVVVTRYFGGTLLGTGGLVRAYTDATR AGIENSDIVEKIPGRRVDIAMDYTDLGKLQYMLAQNEVLTEDTEYTDKVIIHALFPESDK EKLKKKITEATSGRVMTQEGEEVYFGTVDGEVIVF >gi|222441812|gb|ACEP01000130.1| GENE 4 2056 - 3891 2079 611 aa, chain - ## HITS:1 COG:CAC1684 KEGG:ns NR:ns ## COG: CAC1684 COG1217 # Protein_GI_number: 15894961 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Clostridium acetobutylicum # 2 607 3 605 605 731 63.0 0 MKISREDVRNVAIIAHVDHGKTTLVDELLKQSGTFRANQEVGERVMDSNDIEKERGITIL SKNTAIHYEGVKINIVDTPGHADFGGEVERVLKMVDGVILVVDAFEGPMPQTKFVTKKAL ELNLPFIVCINKIDRPEARPEEVMDEVLELLLELDADDEQLDSPFVYASAKSGIAMLELS 
ENGTDMKPLFETIIKHIPAPTGDPDADTQILISTIDYNEYVGRIGIGKVENGSIRVNQDA LLVNHHDPSVNKKVKISKLYEFDGLNKVEVNDATVGSIVAISGIADIHIGDTICSPDHPE PIPFQKISEPTIAMQFLVNDSPLAGQEGKFVTSRHIRDRLFRELNTDVSLRVEEMENTDS FKVSGRGELHLSVLIENMRREGYEFAVSKAEVLYHTDENGKKLEPMETAVIDVPEEFSGV VIEKLSQRKGELRNMSMSGTGSTRLEFSIPSRGLIGYRGDFMTDTKGTGIINTLFEGYGP YKGDIQYRKQGSLIAYESGETVTYGLFSAQERGTLFVGPGEKVYSGMVIGQNGKTDDIEV NVCKTKKLTNTRASGSDDALKLSPPRILSLEQALDFIDTDELLEITPKNIRIRKKILDPT ARYRAKRNGKA >gi|222441812|gb|ACEP01000130.1| GENE 5 4222 - 4893 544 223 aa, chain + ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 3 223 2 219 229 220 51.0 2e-57 MTEKILKNNQAVVAGEIISDFEFSHEVFGEGFYFVKLKVSRLSHSSDIIPLLVSERLIDV SQSHIGQFLEARGQFRSYNKHENNRNHLVLSLFVRELELIDSVENRKPNMIFLDGYICKE PVYRTTPLGREIADVLLAVNRAYGKSDYIPCICWGRNARYAGNLSVGSRIQLWGRIQSRE YQKRVGENNVVSRVAYEISVNKMEYIDDGINLERKVADSGNIG >gi|222441812|gb|ACEP01000130.1| GENE 6 5200 - 6441 1587 413 aa, chain + ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 5 344 11 339 438 184 36.0 3e-46 MKITPVKGTNDYLPAQAELRNYLQKKICETYEQAGFQRIATPIIEDIENLDKSEGGDNLN LIFKVMKRGEKLKKAIASQNPDNIADMGLRYDLTLPLSRYYANNRASLPQPFKVIQMDRV YRAERPQKGRLREFIQCDIDILGSDSSTCEIELIHTTAKALLNIGINNFKVKINDRRILR EILLSVGFEEDQLDSVCISFDKLNKIKAEGVEAELTEKGFDKAVIAEFMKLLSNAPFTLD YVKEICKNSEYVDSLAYIIDTSNALADGKYVAEYDMTLVRGQGYYTGTVFEVESLDFNSS IAGGGRYDNLIGKFLKEQIPAVGFSIGFERIYSILMEKESYNTEKQKKVVLIYEEDQVAD AIRYAEELRKTYKTALYIKPKKLGKFLNKLEEQGFDGFQVFGRDEEVRMFGEQ >gi|222441812|gb|ACEP01000130.1| GENE 7 6512 - 7030 476 172 aa, chain + ## HITS:1 COG:PH0075 KEGG:ns NR:ns ## COG: PH0075 COG2109 # Protein_GI_number: 14590029 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid 
adenosyltransferase # Organism: Pyrococcus horikoshii # 3 160 8 160 175 95 37.0 3e-20 MDGSIQVYYGNGRGKTTAALGLGIRAAGVGKQVIMVQFLKKKHSDTLDFLKKLEPELQIF RFEKAACGYSDLSPKEKQEQMLNIKNALGYSRKVLDTGQCDLLILDEIFGLVDYNIITIE ELMDLIAVKKDSMDLILTGRNLPDEVREIADCVYSIQREKEKTPEILAQGNF >gi|222441812|gb|ACEP01000130.1| GENE 8 7067 - 7954 1199 295 aa, chain + ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 1 294 1 292 293 261 45.0 1e-69 MAVFTGAGVAIATPFDENYNIDYESFGKFIDFQIENKTDAIIVCGTSGEASTLSHEEHIA AIRYCVNRVNKRVPVIAGTGSNCTDTASYLTKQAQNAGADAALVVTPYYNKATQNGLIAH YTDIAKHTDLPIILYNVPSRTGCKLEAATIAKLVKDVDNIVGVKEATGDIAFATQIMYDT QGDIDMYSGNDDMIVPMLSIGGKGVISVLSNVAPEDTHNICAEYFAGNVEKSRELQIKYL ELIHSLFCEVNPIPVKKALELMGFYGGNVRLPLTQLEPAHTERLKKAMIETGILA >gi|222441812|gb|ACEP01000130.1| GENE 9 8098 - 8859 1048 253 aa, chain + ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 250 1 248 250 243 48.0 3e-64 MVKVMMYGCNGYMGHVICNLIKDMENVTVAAGIDPCPGDQKDFPVFASLDECNVDVDVII DFSAAVAVDGVLDYAAAKKIPLVECTTDLSEEQLAHLEEASKQTAILRSANMSLGVNTLL EMVKTAAKILGEAGFDIDIVEKHHRRKLDAPSGTALAFADSINEAADGKYNYVFDRSERR MQRPQDEIGISAVRGGTIVGEHEIIFAGTDEVIEFKHTAYSRAVFGKGAVEAAVYLADKP AGLYDMKDVISAK >gi|222441812|gb|ACEP01000130.1| GENE 10 9424 - 12888 3698 1154 aa, chain + ## HITS:1 COG:CAC2660 KEGG:ns NR:ns ## COG: CAC2660 COG1038 # Protein_GI_number: 15895918 # Func_class: C Energy production and conversion # Function: Pyruvate carboxylase # Organism: Clostridium acetobutylicum # 7 1150 2 1143 1144 1295 57.0 0 MSGKVNVKKFRKVLVANRGEIAIRIFRALSELGITTVSIYSKEDRYAMFRSKADESYPLN PQKGPIDAYLDIDTIIKIALSTGVDAIHPGYGFLSENPDFADACERNGIVFIGPGSNIMN 
AMGDKISSKKIAIEANVPIIPGVDYAIKEVDEAKKIAAQVGYPVMLKASNGGGGRGMRIV NREEDLEKEFNEAKNESKKAFGDDMIFIEKYLKGPKHIEVQIVGDNYGNVVHLYDRDCSV QRRHQKVVEYAPAFSIPETVRQEIFDASIRLAKTVGYRNAGTLEFLVDADNHPYFIEMNP RIQVEHTVSEMVTDIDIVQTQILIAEGYPLASEEIAIPSQESVTCTGYSIQTRVTTEDPA NNFLPDTGEITVYRSGSGNGIRLDGGNAYTGAVISPYYDSLLVKAISHDRTFLGAVRKSI RALQEMRIRGVKTNIPFLVNVLNHPTFQSGQCYTTFIEETPELFELTKSQNRANKIIEFI GDRIVNSNNGEKPFYENRVLPKLDKSKPVYGARDEFLKLGAQGFMQKILKEDKLYVTDTT MRDAQQSLMATRMRSKDLCGAAYATNAFMQNAFSVEAWGGATYDTAYRFLKESPWKRLEL LRERMPNTLIQMLLRASNAVGYSNYPDNVVKEFIRISADHGIDVFRIFDSLNWVENMKMP IEEALKTGKIVEGTICYTGDVSNPNETKYTLDYYVKMALELEKLGCHSIAIKDMAALLKP RAAKELVGTLKKELHVPLHLHTHDSTGNGVSTVLMAAEAGVDIVDLAIESMSSMTSQPSM NAVVEALRGTKRDTGLDFEELSELSRYYNRIRSVYAPFESDMKSPNTEIYKYEIPGGQYS NLLAQVTEMGSPEEFEAIKGLYKEANDLLGNIVKVTPTSKAVGDLAIFMYKNNLNKDNIL TAGASLSYPDSVVSYFRGMMGQPYGGFPKELQKIVLKDIEPLTERPGKLLPPEDLEGIKK HLIEKYHYEDKSEEVMAQKAISYALYPKVYEDYCEHFEMYNDVTRLESHVYFYGLRKGEE TYLKIGEGKELLIKYLEQSDPDENGIRTLSFQVNGSIRTVKIQDHNLEIKTDRRLKADKN NPQHLGSSIPGTVDKVLVKEGDVVTKNMPLMTIEAMKMETTVVSTVNGTVDKIYVEAGDS VHQDDLLVSFHIKE >gi|222441812|gb|ACEP01000130.1| GENE 11 12900 - 13085 101 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMPILPLVRVPNVYKLTKMAKITSAAIVEISGWGYFYYSYKNDAGLKRYFLNIMLPSQFF H >gi|222441812|gb|ACEP01000130.1| GENE 12 13036 - 15009 763 657 aa, chain - ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 267 655 137 516 1118 231 35.0 4e-60 MLSITEIEQINQKLNPELALSLSLDNAKRMESSLLHSGFSPSTDMWKLILFYNGSLEEIK EMISFSFIPLLGNFAILFIKEKDIPSFLTFPQVLYVELSRPIYENALTGIDASCFPDYNI VTGRNVNATTLTGKGTAIAVLDSGVDYRHPDFRNVDGSTRILAYWDQSLPFASFNKENTN SNSSNSDNFQYISTTNSQNNYIAADNRTNTRRNNVSKHSSATDNPYNLGVIFSEEDLNRL LMPKSSSVPSDSSTFSVTDPVTELLSPSEDVSGHGTHIAGICSGNGRASNGNSQGVAPES SLIVVKLKNETASVYTDYANLMMAVDFAVRFANSRSLPLSINISYGSNDGSHTGSSLLEL 
FMEQVSLYGKNVICAATGNEGLTRRHASLNTISNQNTYDKSIDFTIAPGERSLYLEIWQT FADNFFYELFSPSGLESFVFPAVPGIYAYMIADTTIYLTINNPTPYQPFRQYFLSFSSNT TFITSGTWALHIESTPTGKIVDGRLQFWLPSKEATNSATGFLVPSSDMTFTIPSTASSVI SVSGYDSSLDVFAPFSGQGFSNNRHTKPDLCAPAVNILSTAPGGGYTIRTGTSMAAPFVT GAAALLMQHGIVNGNDPFLYGEKAKAYLWKSARPLPAFSEYPNEKIGWGALCLRNIF Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:21:44 2011 Seq name: gi|222441811|gb|ACEP01000131.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont419.1, whole genome shotgun sequence Length of sequence - 21801 bp Number of predicted genes - 23, with homology - 20 Number of transcription units - 16, operones - 6 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 32/0.000 - CDS 25 - 684 886 ## COG2011 ABC-type metal ion transport system, permease component 2 1 Op 2 . - CDS 674 - 1717 1221 ## COG1135 ABC-type metal ion transport system, ATPase component - Prom 1875 - 1934 8.7 - Term 2098 - 2126 0.3 3 2 Tu 1 . - CDS 2127 - 2210 82 ## - Prom 2251 - 2310 2.6 4 3 Tu 1 . - CDS 2363 - 3520 857 ## Closa_2166 FliB family protein - Prom 3626 - 3685 5.4 5 4 Tu 1 . - CDS 3862 - 4545 441 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs + Prom 4735 - 4794 3.0 6 5 Tu 1 . + CDS 4857 - 5309 447 ## COG3546 Mn-containing catalase + Term 5329 - 5361 1.5 - Term 5464 - 5505 2.6 7 6 Op 1 . - CDS 5606 - 6376 747 ## COG0642 Signal transduction histidine kinase 8 6 Op 2 . - CDS 6373 - 6462 68 ## - Prom 6504 - 6563 4.7 - Term 6911 - 6968 9.2 9 7 Op 1 . - CDS 6977 - 7627 388 ## SDEG_1773 hypothetical protein 10 7 Op 2 . - CDS 7665 - 8435 562 ## COG4509 Uncharacterized protein conserved in bacteria - Prom 8529 - 8588 7.7 11 8 Op 1 . - CDS 8707 - 9699 1182 ## MGAS2096_Spy0115 fibronectin-binding protein 12 8 Op 2 . - CDS 9780 - 10178 418 ## SDEG_1780 signal peptidase I (EC:3.4.21.89) 13 9 Op 1 . - CDS 10328 - 10747 319 ## gi|225028556|ref|ZP_03717748.1| hypothetical protein EUBHAL_02835 14 9 Op 2 . 
- CDS 10760 - 12820 1773 ## M28_Spy0109 hypothetical protein 15 9 Op 3 . - CDS 12741 - 13049 149 ## gi|225028558|ref|ZP_03717750.1| hypothetical protein EUBHAL_02837 - Prom 13123 - 13182 3.7 16 10 Tu 1 . - CDS 13538 - 14398 551 ## COG0582 Integrase - Prom 14428 - 14487 10.6 + Prom 14382 - 14441 9.0 17 11 Tu 1 . + CDS 14544 - 15530 1287 ## Rumal_1291 diaminopimelate dehydrogenase (EC:1.4.1.16) 18 12 Tu 1 . - CDS 15787 - 17418 1350 ## COG1620 L-lactate permease - Prom 17442 - 17501 5.3 19 13 Tu 1 . - CDS 17522 - 17632 57 ## - Prom 17656 - 17715 3.2 - Term 17943 - 17986 7.5 20 14 Tu 1 . - CDS 18150 - 19016 847 ## COG1578 Uncharacterized conserved protein - Prom 19077 - 19136 3.1 - Term 19121 - 19173 16.1 21 15 Op 1 4/0.000 - CDS 19194 - 19661 640 ## COG4917 Ethanolamine utilization protein - Prom 19695 - 19754 2.4 22 15 Op 2 . - CDS 19837 - 20220 445 ## COG4810 Ethanolamine utilization protein 23 16 Tu 1 . + CDS 20673 - 21539 840 ## COG0685 5,10-methylenetetrahydrofolate reductase + Term 21720 - 21764 13.1 Predicted protein(s) >gi|222441811|gb|ACEP01000131.1| GENE 1 25 - 684 886 219 aa, chain - ## HITS:1 COG:VC0906 KEGG:ns NR:ns ## COG: VC0906 COG2011 # Protein_GI_number: 15640922 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, permease component # Organism: Vibrio cholerae # 2 218 9 225 225 171 50.0 8e-43 MWTKEITDMIINGVWATLYMTVASTVLGYVFGLPMGVLLAITDKEGLTPNAVVYKILDVI ANIVRSIPFLILLILLIPFTRLLVGQSYGSTATIVPLVIAAIPFIGRMVESSIKEVDPGV IEAARSMGASNMQIVMKVLLVEARTSLLTGATIAIGTILGYSAMAGCVGGGGLGDIAIRY GYYRYQTSIMIVTVVLLVILVQIFQSIGMFLAAKLDKRK >gi|222441811|gb|ACEP01000131.1| GENE 2 674 - 1717 1221 347 aa, chain - ## HITS:1 COG:STM0247 KEGG:ns NR:ns ## COG: STM0247 COG1135 # Protein_GI_number: 16763636 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, ATPase component # Organism: Salmonella typhimurium LT2 # 4 340 2 338 343 297 46.0 2e-80 
MSEIVIENLYKSFDTKDGTVEALKNVNLSIESGDIYGIIGMSGAGKSTLVRCINFLEVPT KGRVLINGKSLGDYTSRELRKQREDIGMIFQHFNLLMQKNVLENVCFPLYIQGKSKKDAR KRAKELLDIVGLGDRTGAFPAQLSGGQKQRVAIARALASDPKILLCDEATSALDPQTTSS ILALLKDINQRFNITIVIITHQMSVVREICSHVAIVKGGEVAEQGTVEDIFTHPKTAVAR ELLKNDVGDDGEEKRGTTAGGKEIIKSGEKIRIVFSENSAFEPVIANLILNFNEPVNILK ADTKNVGGVAKGEMVLEFAPDCTKNKEMKQYLLDRGLEIEEVSEYVD >gi|222441811|gb|ACEP01000131.1| GENE 3 2127 - 2210 82 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSPSKKMGIEEQDYDDDVCHANELHNL >gi|222441811|gb|ACEP01000131.1| GENE 4 2363 - 3520 857 385 aa, chain - ## HITS:1 COG:no KEGG:Closa_2166 NR:ns ## KEGG: Closa_2166 # Name: not_defined # Def: FliB family protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 374 1 368 386 295 42.0 2e-78 MRYVKPHFYDSFVCTAGDCPDTCCAGWQIMIDEESLERYENEPGEFGKILRNSIDWEEEC FYQNNRRCAFLNDENLCDLYKALGPDALCDTCRMYPRHTEEYEGLRELSLSLSCPKAAKI ILSCKEPVRFLEEETEEEDDFEEFDFMMFSQLEDTRDVLFAVLQDRSLPLTLRISVSEQL TESYQNCIEEGRQFDIDDLLRECERHQKEGSLSEFISKHLSEKGADAASLHQWNRQKKEL QVLRGLERLRPEWNQILDGAEKWLYQENEETYKNICKEFHQMYGALSNYKEEWENVGEQL MMFFVYTYFCGAVYDDMVCSKMEMALFSIRWVQEFLIVRWLENGKTLSMKDVEEISWRYA REVEHSDNNLNTLEDWLFENYYLIH >gi|222441811|gb|ACEP01000131.1| GENE 5 3862 - 4545 441 227 aa, chain - ## HITS:1 COG:SP1040 KEGG:ns NR:ns ## COG: SP1040 COG1961 # Protein_GI_number: 15900911 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Streptococcus pneumoniae TIGR4 # 18 212 85 286 559 72 26.0 6e-13 MIAEAKQRKFDLLITKGISRFARNTLDSIYYTRELRKVGVGVLFLNDGINTLDGDAELRL SIMASIAQEESRKTSERVKWGQKRRMEEGVVFGRSMLGYDVRNGKMYVNEEGAEVVRKIF YKFVEERKSTHTIARELLEEGIYPMRSKKWRNTVILRILQNEKYCGDLVQKKTFTPDYLS HEKKYNRGEEEFVIIKNHHEPIISREMFEKAEKIFSRNKAQKDSENT >gi|222441811|gb|ACEP01000131.1| GENE 6 4857 - 5309 447 150 aa, chain + ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: 
Clostridium acetobutylicum # 9 134 61 185 200 161 58.0 4e-40 MYSLFDASVGTEELNHIEMVCTIVHQLTRNLTVQQIKEQGFADYYVDHTTGIYPVAASGT PWSADTMQSTGDPITDLMENMAAEQKARTTYDNILRLVKDDPDVYDAIKFLRAREIVHFQ RFGESLRIVQDNLNSKNFYAFNPAFDNKKC >gi|222441811|gb|ACEP01000131.1| GENE 7 5606 - 6376 747 256 aa, chain - ## HITS:1 COG:RSp1178 KEGG:ns NR:ns ## COG: RSp1178 COG0642 # Protein_GI_number: 17549399 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Ralstonia solanacearum # 2 246 420 659 676 106 34.0 5e-23 MNLLSNAVKYTPEGGTIHFTIRELPYEREGYALFQTVVEDTGIGISKEYIPHLFEAFSRE KSSSESGIIGTGLGLRIVKKFVDLMEGSIVVESEIGEGTRFTVTIPHRIATANEYISEEN AKELPEEIKLNNVRILLAEDNMLNAEIAMTLLADANAYVELAPDGEKALSMLKRATDGYY DLIIMDIQMPHMNGYEATKNIRGLPDGRCRIPIIAMTANAFEEDRKRAIESGMNGYVTKP IKIEELISTIKKILKS >gi|222441811|gb|ACEP01000131.1| GENE 8 6373 - 6462 68 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQKINTNEFLKNVRIGLWRIEVEEGKLQ >gi|222441811|gb|ACEP01000131.1| GENE 9 6977 - 7627 388 216 aa, chain - ## HITS:1 COG:no KEGG:SDEG_1773 NR:ns ## KEGG: SDEG_1773 # Name: not_defined # Def: hypothetical protein # Organism: S.dysgalactiae # Pathway: not_defined # 3 156 14 170 209 69 32.0 1e-10 MLLLVVLCLAVFPISTFAADAVQVQIPVSIQTSGETPSPEENYTVELQAVDDAPMPSENV LEISGSGKAFFSPIQYTTPGIYYYTITQQSGTHKRGHYDQTVYYVKVSVTNGENGNLETV IAAHTDADMTDAKCDITFTNYYKPIKKTSESTTETIPTTKRKPETKPGNKTSIKKSKNKV KTGDNSNATLFAILLLVSGTGLLIIAISRYREKYHK >gi|222441811|gb|ACEP01000131.1| GENE 10 7665 - 8435 562 256 aa, chain - ## HITS:1 COG:SPy0129 KEGG:ns NR:ns ## COG: SPy0129 COG4509 # Protein_GI_number: 15674344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 7 247 5 234 237 149 35.0 6e-36 MDVSRIFLKTANSLLSAVIVLFLVVAGAYSAYALWDNMQVYASVDDIQSQLLKYKPTPGE DNGPTFEELRAINPDVCAWITLDGTKIDYPVLQGEDNLTYINKDVYGNFALAGSIFLDSN CDRSFQKKYSLLYGHHMAEHKMFGDLDLYEKQKFFDKNQTGTLVLPDRSYKLEIFACIKT SANEDNIFIPQKWQKSTDGLLKYAEKSAKYIHQDTMKKIGTSKDFSQILAMSTCSSDYTD 
ARTVILAVMKPYSPEK >gi|222441811|gb|ACEP01000131.1| GENE 11 8707 - 9699 1182 330 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy0115 NR:ns ## KEGG: MGAS2096_Spy0115 # Name: not_defined # Def: fibronectin-binding protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 93 327 101 345 347 92 31.0 2e-17 MKRVKRLFACLLALTMMLGLSATAFATETAEKVQTPANGGTVTFEKEYKLVGEGISPAET FEFNIAKAGVENGGVDAEKQDMPTVGTAAYAQNGATTAGNKKEVTVALPTYTSVGVYSYI ITEKTGTTAGVAYDSKSVTMKVTAVDNNGTIEIAAVSFTKAGKKLADNEAAFNNTYTANK LSISKTVAGNLGDKSKYFDFTVTLTGKEGEEYSLPATITGGSDKSGKKENVAISGTTTFK LKHGDTITIENLPAGVSYKVEEATPSDYTMNATGQEGTMDAEAKTAAFTNTKTGDIDTGV YLDNLPYILVFAGVLAIAAVFIVRRRRFED >gi|222441811|gb|ACEP01000131.1| GENE 12 9780 - 10178 418 132 aa, chain - ## HITS:1 COG:no KEGG:SDEG_1780 NR:ns ## KEGG: SDEG_1780 # Name: not_defined # Def: signal peptidase I (EC:3.4.21.89) # Organism: S.dysgalactiae # Pathway: Protein export [PATH:sds03060] # 1 130 42 171 173 103 41.0 2e-21 MRAPDNGMFPAIKGGDLLIGFRLQRNFLKNDVVVYKANGKLQVGRILGQETDVITIDDTG QLLVNGTPQTGEIAFPTYAKKGIKKYPYRVPKGCVFLLGDYRTQTKDSRDYGPIKMEDVK AKVITVLRRREL >gi|222441811|gb|ACEP01000131.1| GENE 13 10328 - 10747 319 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028556|ref|ZP_03717748.1| ## NR: gi|225028556|ref|ZP_03717748.1| hypothetical protein EUBHAL_02835 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02835 [Eubacterium hallii DSM 3353] # 1 139 1 139 139 193 100.0 4e-48 MAERHKREAKLNRGRKAKNKGREVLPNTKKSTSTEVINQEHKEIIKWFQTVKFRRTLFGG VDENDVWKKLQELNQLYESAIRAERERYNILLTDHEKTYESQIYKYKQELMKNEGTTEQT EMDNTENEVPPETRNEVRR >gi|222441811|gb|ACEP01000131.1| GENE 14 10760 - 12820 1773 686 aa, chain - ## HITS:1 COG:no KEGG:M28_Spy0109 NR:ns ## KEGG: M28_Spy0109 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS6180 # Pathway: not_defined # 507 682 168 349 350 69 34.0 6e-10 MCLSLILVCSMSISAFAVNRIGRAENETTQSVQMTDNQKGEAAATRQKSSEAQTKTRQEN MKKSDSESKTKDNSEEQKKDSKDADNKGAKENSTEKDAKEDSEKNISKEEKSTKEEKNTE 
KKSEETTDVSEATTTAAKETTTEAAEKKTEAVTEKTPEGKDSKEATSEEKAEEETQAEDD KNKEIAVTSISGPTEVQVGKTIRLTGTSGSYGYSHSWSSSNKNIATVTGNGSTAEVKGVA VGTVTITHRYSSQWNGNTRTETYQVKVTKFADNYSGEAAIYYLANPAGDPWTNDTGAWAP SQDTSNTLANISTSGATWEDGHVDDIVYENKNIKSNVASYITSWPDGSKGSTWTVKEGDS STASYFTFILNSIWDKYKSSVAADLGIGADQLQQSDIKEITLTPRKISRDNGGTHPYHID CALSIKSTKVFTAKFWVKEPGESSEYRQIDAKNYLTGNPVARTTKATIGSTKTVNGVTYV LDGWYPENAEGGAYGSTKIADNRWNNYIPSADELANGTVNFYAHYSPTTTSIKLKKLVTG SMGDKQKKFHFTISIEKENKNVTFKVGNTSKTGSATVDLANDEESTLTEIPVGADVSITE EDYSGSGYTTSYVIDNGNSALEIKANISNIQAKNDVSAHEIVFTNNKEAIPDTGITLDSL PFIALLALSIAGGIFYLFCRYKKRFV >gi|222441811|gb|ACEP01000131.1| GENE 15 12741 - 13049 149 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028558|ref|ZP_03717750.1| ## NR: gi|225028558|ref|ZP_03717750.1| hypothetical protein EUBHAL_02837 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02837 [Eubacterium hallii DSM 3353] # 1 102 50 151 151 211 100.0 1e-53 MKATFPAVFEKYLTGQAVHIPKVSACHFIFVEEFQKSIYKGNINLNPDHYAGQSKEVKYE TSSFIWEVQIIKKGHHDVPVFDTCLQHVYFGICGEPYRPGRE >gi|222441811|gb|ACEP01000131.1| GENE 16 13538 - 14398 551 286 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 23 272 14 255 265 109 28.0 7e-24 MKYILKREMLVCYEKVLIEEEKSQATIEKYMRDVRKFFQYVEEMGKKEGITKEIVLSYKR SLIEEYAPSSVNSMLASLNHFFKVNHWYECIVKSLKIQQRTFRAKERELTKEEYYRLLRA AQKEGKYRLYCILQTICGSGIRVSELKYITVKAVTRGRAVIFMKNKTRTILLPPKLCRLL KDYCKQEKITSGMIFITRRGNPVDRSNILHEMKNLCGTAGVEREKVFPHNLRHLFAYTYY KAEKDIAHLADILGHSSINTTRIYTMGSGEEEMKQLSNLGLIVRCG >gi|222441811|gb|ACEP01000131.1| GENE 17 14544 - 15530 1287 328 aa, chain + ## HITS:1 COG:no KEGG:Rumal_1291 NR:ns ## KEGG: Rumal_1291 # Name: not_defined # Def: diaminopimelate dehydrogenase (EC:1.4.1.16) # Organism: R.albus # Pathway: Lysine biosynthesis [PATH:ral00300] # 1 328 1 329 329 470 68.0 1e-131 MDKIKIGIVGYGNIGRGVEQAIKRNDDMELAAVFTRRDPATVSIQTEGAAVKHFDDMVSM 
KGEVDVMILCGGSATDLPVIGPEVAASFNTIDSFDTHAKIPEYFANVDKAAKEGNNISII SVGWDPGMFSLNRLYAESILVQGSTYTFWGKGVSQGHSDAIRRIDGVKNAIQYTVPIEDA VEQVRSGSEPELTTRQKHLRECYVVAEEGADKAAIENAIKTMPNYFSDYDTTVTFITEEE LKANHSKMPHGGFVIRTGETGCEGNKHVIEYSLKLDSNPEFTGSVLVAYARAAHRLSKKG ECGARSVFDIAPAMLSQMSAEELRAHML >gi|222441811|gb|ACEP01000131.1| GENE 18 15787 - 17418 1350 543 aa, chain - ## HITS:1 COG:jhp0129 KEGG:ns NR:ns ## COG: jhp0129 COG1620 # Protein_GI_number: 15611199 # Func_class: C Energy production and conversion # Function: L-lactate permease # Organism: Helicobacter pylori J99 # 2 543 13 541 551 147 27.0 7e-35 MSIWTWLGAVFPILFLFLMMSVFHMKTERAAFLGIICAMISAILIGKSTPQIVTMDLEKG AFSALNILIVIWPAVFLYEMMEFAGVFQSIKQMLQKKTQDELVMVLLICWLFSSFLQGIT GFGVPVAVCAPILIALGVQPLWSVIITLLGHAWANTYGTFALAWDALITQSETQKIFETK WLAGLLLLGVNFAGALLISWLYGKGKALRHILPFLLIISLVQGGGQLIVSFTNSAIAAFI PTTLALVVGWLILSAGFYTKPWKMDSKLMVRRENEEIEKQLNKTITTEQLVEVKENSFNK PITNGQAIFPFIILAAISVIVLLIKPIHEVMNQTIFALSFPQIETGRGFIVEATDSYGAI HIFTHAGFVLLLTVIVTLFLYLQKGYLDKQKIKSIITATLKKIIPTSLGILFLIVMSQVL KGSGIMQLIAQGVTQLTGKYYSFAVPFLGLLGAFVTSSNTSSNILLGSFQKTAADMISVN EAAILAAQTAGGAIGTVVGPSTILLGTTTAGCEGKEGEVLRFMLPIVLVEAAITGIVITV LIH >gi|222441811|gb|ACEP01000131.1| GENE 19 17522 - 17632 57 36 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYLSCSRIIHTGHSSRLDTLNGLAAGLKMIKAMGQK >gi|222441811|gb|ACEP01000131.1| GENE 20 18150 - 19016 847 288 aa, chain - ## HITS:1 COG:TM0176 KEGG:ns NR:ns ## COG: TM0176 COG1578 # Protein_GI_number: 15642950 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 3 288 6 293 293 134 31.0 2e-31 MIANSRCIACILSKQERKIRKNSDEQKKKEYIHKVLKILYEYGTKECAPMVEKRIKDIYK EYYEEDVDYVALKHRYNQYMLEKEEEIRAHIQKNNDIETCIKYVCAGNYIDFGVANKIDD QLLQGLLDKVNELKISKKEVAYLEDELAQAKTLLYITDNCGEIVLDKIFIEELQNKYKNL QITAMVRGGIASNDATMEDAEEIGLTNVVPVIGNGVAIMGTVKEQLSEEAKEKIQNADII IAKGMGNFESMYQEGMNPYYLLLCKCELYKEIFGVEMFQPIFCKEERL >gi|222441811|gb|ACEP01000131.1| GENE 21 19194 - 19661 640 155 aa, chain - ## HITS:1 COG:STM2056 KEGG:ns NR:ns ## 
COG: STM2056 COG4917 # Protein_GI_number: 16765386 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Salmonella typhimurium LT2 # 2 126 19 146 150 95 40.0 4e-20 MQAMKGKPITYHKTQYVNNFDVIIDTPGEYAETKQLSGALAVYGCEADIIGLLMSAIEPF SLYPPNVVSVSNREVVGVVTKCDHWAANPELAAEWLRLAGCKKIFFTSAYTGEGIAEILS YLKEPVDVLPWEVAKAEYDKLGFGPGESEKHTLRI >gi|222441811|gb|ACEP01000131.1| GENE 22 19837 - 20220 445 127 aa, chain - ## HITS:1 COG:lin1108 KEGG:ns NR:ns ## COG: lin1108 COG4810 # Protein_GI_number: 16800177 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 12 126 5 115 116 110 48.0 4e-25 MDIKELLPKDTEGRLRIIQETVPGRQITLAHVITSPKPIVYRKLGLNPDIDFDRSAIGII TVTPSESAVIGADIAIKSGDVYLGFVDRFSGTLILTGRISAVESAVKAVIHYFQNDLGYA TCPVTKK >gi|222441811|gb|ACEP01000131.1| GENE 23 20673 - 21539 840 288 aa, chain + ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 10 288 5 288 296 217 40.0 3e-56 MNTMQTNPQDILKTLSLSFEVFPPKKWEDFPGLYETLDELKGLHPEFISCTYGAGGSNSK KTAEIASHIENELGIRSIAHLTCAALTEDKLLSTIENFKNAGIHSVLALRGDKPFDMSEE DFANRQYKHPSEIIPILKKAGFRVAAACYPEKHYEAPSMEEDLHWLKHKVEQGADALISQ LFFDNDAFYRFMDAADQKGITVPIHAGIMPITSAKMINTTINLSGASIPKALSDLIATYG ENPSDMRKAGIEYAARQINDLAEQGMHYVHIYTMNHSETAKEICGLIK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:22:46 2011 Seq name: gi|222441810|gb|ACEP01000132.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont420.1, whole genome shotgun sequence Length of sequence - 1439 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 92 - 151 8.7 1 1 Tu 1 . 
+ CDS 233 - 1237 803 ## COG0502 Biotin synthase and related enzymes Predicted protein(s) >gi|222441810|gb|ACEP01000132.1| GENE 1 233 - 1237 803 334 aa, chain + ## HITS:1 COG:BH1748 KEGG:ns NR:ns ## COG: BH1748 COG0502 # Protein_GI_number: 15614311 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Bacillus halodurans # 14 333 6 322 333 246 40.0 5e-65 MSEAVTIDTIKNILTKNVNKEYSLSREEAIAIMNLPEDDMPLLMEMAGSLRKKYKGNHVS IHLLTNARSGNCSQNCAYCAQSCRSKAEIEKYKWVNDEKLYSDNAFVHDNHLSRHCIGLS GMKFTDEEIEELAEKIRVMKAQGTHLCCSIGFLTEKQALMLKEAGLDRINHNLNSSRSYY HNICTTHTYEQRVNNIHMLQRLGFEICSGGIIGMGESKEDIVDMLLDLREIQPEALPINF LLPIPGTPLENADTSVLTPAYCMKVLCLARLMAPQSDIRCAAGREVYFKGREKELLSIVD SIFASGYLTADGQGISDTIKTITDAGFTYEIESD Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:22:47 2011 Seq name: gi|222441809|gb|ACEP01000133.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont428.1, whole genome shotgun sequence Length of sequence - 848 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 - CDS 14 - 352 193 ## COG3293 Transposase and inactivated derivatives 2 1 Op 2 . 
- CDS 379 - 777 158 ## COG3293 Transposase and inactivated derivatives Predicted protein(s) >gi|222441809|gb|ACEP01000133.1| GENE 1 14 - 352 193 112 aa, chain - ## HITS:1 COG:CC1206 KEGG:ns NR:ns ## COG: CC1206 COG3293 # Protein_GI_number: 16125457 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Caulobacter vibrioides # 2 112 56 164 164 104 45.0 4e-23 MLSEGQRNDINYAIPLLEQVDIKESDILADRGYDSNKLMDYIYAHGGEPTIPSRKGAKFD RYCDWHLYKERHLVENYFLKLKAFRRIATRYDKLASTFAAFICIASILIWLK >gi|222441809|gb|ACEP01000133.1| GENE 2 379 - 777 158 132 aa, chain - ## HITS:1 COG:mll8100 KEGG:ns NR:ns ## COG: mll8100 COG3293 # Protein_GI_number: 13476707 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 2 120 1 116 254 102 44.0 1e-22 MLRRYELTDEEWNQIVSLLPPENSGKQGRPSKCNRTILNGIVWIARSGAPWRDLPERYGP WQTVYSRFRKWMEDGILDNIFRVLSLDAELSELSIDASIVQAHQHSAGAKKGGHPTKSDT VAVEQVQKSMPL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:22:48 2011 Seq name: gi|222441808|gb|ACEP01000134.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont430.1, whole genome shotgun sequence Length of sequence - 3513 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 220 - 279 5.7 1 1 Tu 1 . + CDS 501 - 1268 371 ## gi|225028571|ref|ZP_03717763.1| hypothetical protein EUBHAL_02850 + Term 1443 - 1496 2.5 + Prom 1829 - 1888 6.1 2 2 Tu 1 . 
+ CDS 1909 - 3147 863 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 Predicted protein(s) >gi|222441808|gb|ACEP01000134.1| GENE 1 501 - 1268 371 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028571|ref|ZP_03717763.1| ## NR: gi|225028571|ref|ZP_03717763.1| hypothetical protein EUBHAL_02850 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02850 [Eubacterium hallii DSM 3353] # 1 255 1 255 255 529 100.0 1e-149 MNCIHLNFVSDKEGRKFLFPILPQDGLNEKTLNVVITDGDSQRIYPVFQQKAGIYGDYSE YMTRHGCACCSLTTALAAFVEKYADLKPNGTISEVERKHFPEEVYTENYGKVMARQMPVS LYGISLILQEEGVSCEYIGDFEDKAAEKQMMEHLYKGKPVIIETSRMRRKGKRIVHFFDK KYAGSYHTMILLGVDEEGQIVFTDSATRDWAGEQQRLKRAKLPELISYMFPQKNVGDTHL YFSRKRNTGGYILIR >gi|222441808|gb|ACEP01000134.1| GENE 2 1909 - 3147 863 412 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 1 382 8 388 418 337 44 1e-92 DFMTVYDELVARGLIAQVTDEEEIKELVNNGKAVFYIGFDPTADSLHVGHFMALCLMKRL QMAGNKPIALIGGGTAMIGDPSGRTDMRQMMTKETINHNVECFKKQMSRFIDFSDGKAML VNNADWLLDLNYVELLREVGPCFSVNNMLRAECYKQRMEKGLSFLEFNYMIMQSYDFYEL YQKYGCNMQFGGNDQWSNMLGGTELIRRKLGPDADAYAMTITLLLNSEGKKMGKTQSGAV WLDPNKTSPFDFYQYWRNVADADVMKCIRMLTFLPLEEIDAMDSWEGSKLNEAKEILAFE LTKLVHGEEEAQKAQDAARALFSNGGDTANMPACAVTEEDLRDGTVDILALLVKSGLAGT RSEARRNVTQGGVTLDGEKVTDFKAVYTLDDFKGEGKVLKRGKKKFIKIVAE Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:23:01 2011 Seq name: gi|222441807|gb|ACEP01000135.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont431.1, whole genome shotgun sequence Length of sequence - 3646 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 144 - 203 7.5 1 1 Op 1 . + CDS 384 - 1694 1806 ## COG2873 O-acetylhomoserine sulfhydrylase 2 1 Op 2 . 
+ CDS 1710 - 2639 809 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 + Prom 3120 - 3179 6.7 3 2 Tu 1 . + CDS 3215 - 3547 312 ## gi|225028578|ref|ZP_03717770.1| hypothetical protein EUBHAL_02857 Predicted protein(s) >gi|222441807|gb|ACEP01000135.1| GENE 1 384 - 1694 1806 436 aa, chain + ## HITS:1 COG:CAC2783 KEGG:ns NR:ns ## COG: CAC2783 COG2873 # Protein_GI_number: 15896038 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Clostridium acetobutylicum # 7 432 3 425 427 506 63.0 1e-143 MAKKTYKDRDFKFETLQLHVGQESADPVTDARAVPIYQTSSFVFHNSQHAADRFGLKDAG NIYGRLTNPTVDIFEQRVAALEGGVAALATASGAAAVTYAIQNVALAGDHIVAAKNIYGG TYNLLAHTLVDYGVETTFVDPLKPEEFEAAIKENTKALYIEALGNPNSEVTDIEAVAEIA HKHNIPLLIDNTFGTPYLLRPLEHGADIVIHSATKFIGGHGTSIGGVIVDDGKFDWKASG KFPQLTEPNPSYHGISFVDAAGPAAFVTRIRAILLRDTGATLSPLHAFIFLQGLETLSLR VERHVENALKVVEYLKNQPLVEKINHPSVSEDPEQQRLYKKYFPNGAGSIFTFEIKGDAE TARKFIDNLELFSLLANVADVKSLVIHPASTTHAELNEQEQADQGIKPNTIRLSIGTEHI DDIIEDLDEAFKAVAE >gi|222441807|gb|ACEP01000135.1| GENE 2 1710 - 2639 809 309 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 309 3 305 308 316 54 2e-86 MAKVYKGIQELIGNTPLVEAVNLEKKLGLKATLLLKLEYFNPAGSVKDRIANGMIEDAEK KGLLKEGATIIEPTSGNTGIGLAAIAAAKGYKAIFTMPETMSVERRNILKAYGAEIVLTP GEKGMSGAIAKAEELAKEIPGSFIPGQFVNPANPATHKATTGPEIWNDTEGKVDIFLAGV GTGGTVTGIGEYLKEQNPDVKVIAIEPATSPVLSQGHGGAHKIQGIGAGFVPDVLNTKIY DEVFPVENEKAFEYGKELAKAEGIITGISSGAALYAAVEVAKRPENEGKTIVVLLPDNGD RYYSTALFQ >gi|222441807|gb|ACEP01000135.1| GENE 3 3215 - 3547 312 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028578|ref|ZP_03717770.1| ## NR: gi|225028578|ref|ZP_03717770.1| hypothetical protein EUBHAL_02857 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02857 [Eubacterium hallii DSM 3353] # 1 110 1 110 110 181 100.0 1e-44 MVIENNPKELAAMKKFHEGNRAEGLKLQEEFASEFREEYKDKDHCPCKKACRYHGNCKEC 
VAIHRAHQEHVPNCMRPMLNRKIKILSELTEHTLANEIEPPKEHLRKEFL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:23:09 2011 Seq name: gi|222441806|gb|ACEP01000136.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont432.1, whole genome shotgun sequence Length of sequence - 3839 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 - CDS 5 - 667 773 ## COG0569 K+ transport systems, NAD-binding component - Prom 736 - 795 6.5 2 1 Op 2 . - CDS 916 - 2220 1080 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 2294 - 2353 5.5 3 2 Tu 1 . - CDS 2365 - 3738 1156 ## COG1253 Hemolysins and related proteins containing CBS domains Predicted protein(s) >gi|222441806|gb|ACEP01000136.1| GENE 1 5 - 667 773 220 aa, chain - ## HITS:1 COG:BH2663 KEGG:ns NR:ns ## COG: BH2663 COG0569 # Protein_GI_number: 15615226 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus halodurans # 9 219 5 216 220 157 42.0 1e-38 MRKNRGTAYGVIGLGRFGTALAIALAQAGKEVIAIDRSEEKIKNIRRYTDYAFVAENLSM ETLKEIGIQNCDVAIICIGEKVDVSILTTMSAIELGVPHVIAKATSEEQGAVLKKIGAEV VYPERDMALRLGKKLVSDNFLDFVSLSNSVEIRQIPVGKMLIGKSVQESDIRRKYKLNII AIENENQTNIEILPDYHFQTGDIIVVIGKTENMDKFETED >gi|222441806|gb|ACEP01000136.1| GENE 2 916 - 2220 1080 434 aa, chain - ## HITS:1 COG:DR1668 KEGG:ns NR:ns ## COG: DR1668 COG0168 # Protein_GI_number: 15806671 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Deinococcus radiodurans # 13 412 72 487 512 239 36.0 7e-63 MKQIFDSLKKQSPSRIIALGFSAVILVGAVLLMMPFSLKKGVHLSFLNALFTSTSAVCVT GLVVVDTADSFTAIGQAIVAALIQIGGLGIASVGVGFIIAAGKRVSIKSRLMVKEALNVD TYHGMVRLVRIVLAITLGFEAAGAIISFFVFRKDYSPLHAVGISIFHSIAAFNNAGFDNL GGMHNLIPYKDNILLNITTDVLVIVGGIGFLVLVDIVKQKSFKKLCLHSKVAISMTAVLL 
IGGTLILKATENITWMGAFFQSMTTRTAGFSTYNIGNFSKAGLFAMCILMFIGASPGSTG GGVKTTTIFVMIQALKCMGRKKSAHAFKRSISEENISKAFMITVLSAGILCVATFLMCVF EPEYDFIQIIFEVASALSTAGLSTGITPELGVAARILIILLMFTGRVGAFTLLSVWGKVS TPNARYSEEMISIG >gi|222441806|gb|ACEP01000136.1| GENE 3 2365 - 3738 1156 457 aa, chain - ## HITS:1 COG:CAC0460 KEGG:ns NR:ns ## COG: CAC0460 COG1253 # Protein_GI_number: 15893751 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Clostridium acetobutylicum # 12 439 8 431 443 232 33.0 1e-60 MSSITYGGNFLDKMISVLVLQIVLLILNAIFSSAEISVISLNEAKVKQMAEEGDKRAEKA AMLLKKPARFLETIESAVTFASLLSGAFVAVYIGGFLKAGCMHLLAKWNISVSGQVMSVA AIVAAALLLTYINLLFGEILPKRIAMSRTEEDSFKMVGILFFFAKIFAPLAVLLSVSSSL LLRIIGINPEEEQEVVTEEEIRMMLAEGKEQGTIQNEESKLIQNVFEFNDTTAEQVSTRR RDLVCLNLEDNAEEWEKTIRECRFSHFPIIDSNQEDVVGILDTKDYFRSEDKSKEYVMKH AVDTPYFVPENMKANVLFANMKQTRIYFAVVLDEYGGMSGIVTLHDLVEALVGELEEEEM PAKPEDIERIAEGVWRIQGCAQLDEVSETLNVEFSEIFDTFSGFVWDAIGRVPAEGEKFS IEANGLKIDVENIKNHMVDYAIVRKIPRKKIEKEPEE Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:23:19 2011 Seq name: gi|222441805|gb|ACEP01000137.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont433.1, whole genome shotgun sequence Length of sequence - 31877 bp Number of predicted genes - 28, with homology - 28 Number of transcription units - 8, operones - 5 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 17/0.000 + CDS 435 - 1772 722 ## COG0168 Trk-type K+ transport systems, membrane components 2 1 Op 2 . + CDS 1826 - 2473 622 ## COG0569 K+ transport systems, NAD-binding component 3 1 Op 3 . + CDS 2493 - 2780 276 ## gi|225028584|ref|ZP_03717776.1| hypothetical protein EUBHAL_02863 4 1 Op 4 16/0.000 + CDS 2831 - 4405 1132 ## COG2205 Osmosensitive K+ channel histidine kinase 5 1 Op 5 . + CDS 4398 - 5099 552 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 6 1 Op 6 . 
+ CDS 5113 - 6015 556 ## EUBELI_20172 hypothetical protein + Term 6072 - 6124 10.3 - Term 6052 - 6120 17.0 7 2 Tu 1 . - CDS 6133 - 6948 601 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family - Prom 7057 - 7116 9.5 + Prom 7102 - 7161 8.8 8 3 Op 1 26/0.000 + CDS 7256 - 8272 1417 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 8314 - 8366 12.0 + Prom 8374 - 8433 2.2 9 3 Op 2 13/0.000 + CDS 8464 - 9675 1818 ## COG0126 3-phosphoglycerate kinase + Term 9713 - 9774 4.1 + Prom 9704 - 9763 4.0 10 3 Op 3 8/0.000 + CDS 9786 - 10535 1227 ## COG0149 Triosephosphate isomerase + Prom 10607 - 10666 6.4 11 3 Op 4 . + CDS 10700 - 12244 1921 ## COG0696 Phosphoglyceromutase + Term 12329 - 12379 9.2 + Prom 12383 - 12442 10.8 12 4 Tu 1 . + CDS 12484 - 13230 500 ## COG3786 Uncharacterized protein conserved in bacteria + Term 13424 - 13458 -0.8 + Prom 13326 - 13385 6.8 13 5 Tu 1 . + CDS 13579 - 14448 1136 ## COG0024 Methionine aminopeptidase + Term 14639 - 14689 0.7 + Prom 14933 - 14992 11.9 14 6 Op 1 20/0.000 + CDS 15111 - 16283 1603 ## COG0683 ABC-type branched-chain amino acid transport systems, periplasmic component + Term 16302 - 16347 10.5 + Prom 16289 - 16348 7.3 15 6 Op 2 24/0.000 + CDS 16457 - 17338 1139 ## COG0559 Branched-chain amino acid ABC-type transport system, permease components 16 6 Op 3 19/0.000 + CDS 17351 - 18436 1285 ## COG4177 ABC-type branched-chain amino acid transport system, permease component 17 6 Op 4 18/0.000 + CDS 18439 - 19197 255 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 18 6 Op 5 . + CDS 19212 - 19922 287 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 19 6 Op 6 . + CDS 19931 - 20548 707 ## Clole_2248 hypothetical protein 20 6 Op 7 . 
+ CDS 20629 - 21918 1424 ## COG0148 Enolase + Prom 21991 - 22050 7.7 21 7 Op 1 15/0.000 + CDS 22088 - 23338 1738 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein + Term 23362 - 23404 8.6 + Prom 23461 - 23520 3.8 22 7 Op 2 24/0.000 + CDS 23557 - 25083 1597 ## COG3845 ABC-type uncharacterized transport systems, ATPase components 23 7 Op 3 26/0.000 + CDS 25083 - 26177 1074 ## COG4603 ABC-type uncharacterized transport system, permease component 24 7 Op 4 5/0.000 + CDS 26192 - 27172 1036 ## COG1079 Uncharacterized ABC-type transport system, permease component 25 7 Op 5 . + CDS 27198 - 28475 1613 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 26 7 Op 6 . + CDS 28546 - 28788 241 ## gi|225028609|ref|ZP_03717801.1| hypothetical protein EUBHAL_02888 + Term 28809 - 28855 7.2 27 8 Op 1 10/0.000 + CDS 28878 - 31265 1666 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 28 8 Op 2 . 
+ CDS 31357 - 31830 387 ## COG0691 tmRNA-binding protein Predicted protein(s) >gi|222441805|gb|ACEP01000137.1| GENE 1 435 - 1772 722 445 aa, chain + ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 15 438 18 438 445 254 37.0 3e-67 MSKDIHKKKHLSSFQLIILGFASVILLGAIILMLPVSSAEGVITPFNQTLFTSTSAVCVT GLAVLDTGSYWSVFGQVVILLLIQIGGLGVVTVAVSVFMLSGRKISLMQRSTMQNAISAH KVGGIVRLTKFILKGTLFIEMAGALALLPVFYHDFGRKGIWMAVFHSISAFCNAGFDILG TPANPFPSITAYAGNPIVNVVIMFLIIAGGIGFLTWEDICINKTHFRKYHMQSKIILVTT ALLIVLPAVFFFFSDFTHLSVGKRLLASAFQAVTPRTAGFNTMDISKMTEVSKAMIIILM LVGGSPSSTAGGMKTTTFAVLILNAFATFRSQENVETFGRRIEWSVIKNASTIAMMYCML FLCGGITISVYEGLPLSECLYEAASAVGTVGLTLGITPKLHIVSQFILIILMYLGRVGGL TLIYAVLSKKKKGNARLPLEKLTVG >gi|222441805|gb|ACEP01000137.1| GENE 2 1826 - 2473 622 215 aa, chain + ## HITS:1 COG:BS_ykqB KEGG:ns NR:ns ## COG: BS_ykqB COG0569 # Protein_GI_number: 16078515 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus subtilis # 2 211 3 212 221 137 38.0 1e-32 MKNILLIGLGRFGRHIAMELSELGHEIMAVDVNEGRVDKVLPYVTNAQIGDSTNAEFLES LGIGNYDICFVTIGGSFQNSLETTSLLKELGAKMVISRAERDVQEKFLLRNGADKVIYPE KQVAKWASIRYTDDHILDYMEVDASHAIFEVEVPKEWIGKTVGELDIRRKYDINILAIKN NGELSMAISPDTFFTGNISLLVIGEHKAIKKCFHI >gi|222441805|gb|ACEP01000137.1| GENE 3 2493 - 2780 276 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028584|ref|ZP_03717776.1| ## NR: gi|225028584|ref|ZP_03717776.1| hypothetical protein EUBHAL_02863 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02863 [Eubacterium hallii DSM 3353] # 1 95 2 96 96 165 100.0 9e-40 MKEVILLLTVLVMFIVCFFVMGKIGYFINENYKAFIIEERENTQSCKIISLDKISDEELL KEIHDYQKKCGKAGMIIYDSKKKDKRQCFDGDKAV >gi|222441805|gb|ACEP01000137.1| GENE 4 2831 - 4405 1132 524 aa, chain + ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # 
Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 12 524 372 888 888 456 46.0 1e-128 MEENERREGVLKQHFALYVKKPSMKDCFVTVFIFMICTIIGLLFQKLHFTDTNIVTIYIL GVLITSILTDGYLCSIAGSFLSVVLFCFFLTEPRMSFQTYAVGYPVTFAIMLVCSVLAGT LSAKLKMHARIYSQLAFRTQVLFDTDRLLQKARSSKEMLRVTCTQLVRLLNRDIVAYIVE EGKLSQGQVFYKNKEETDQQFLIPEEQKIARWVYENNKRAGVGTNYFKNAKCLYLAIRIG DHVYGVIGIPANKDVFDSVEYSILLSVVNECALAMENLRNAIEKEKNAVLAKNEQLRADL LRAISHDLRTPLCSISGNADMLLNNGACLDSKTKQQIYVDIYDDSEWLIGVVENLLSITR LNDGRLKFKFTDQLLDEVIAESLRHISRKHDEYTIVTKCEELILVRMDVQLIIQVLVNLI DNAIKYTPKGSKICIRGMKCNGKAQICVEDNGPGIADEMKPYIFEMFYTGKSTIADSQRS LGLGLSLCQSIIEAHDGRLLLTDNTPRGCIFTFTLPLSEVTLNE >gi|222441805|gb|ACEP01000137.1| GENE 5 4398 - 5099 552 233 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 232 1 231 240 278 61.0 8e-75 MNKTVVLVVEDDRPIRNLIVTTLKMHDYKYLTAKNGASAIMEASSHHPDIVLLDLGLPDM EGVDVIKKIRTWSNMPIIVISARSEDSDKIEALDSGADDYITKPFSVEELLARIRVTQRR LAIIQAGEEQEASIFINGQLKIDYTAGCTYLKGKELHLTPIEYKLLCLLSRNVGRVLTHT YITQQIWGNCSQNDIASLRVFMATLRKKLEPDKNGLQYIQTHIGIGYRMLRIE >gi|222441805|gb|ACEP01000137.1| GENE 6 5113 - 6015 556 300 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20172 NR:ns ## KEGG: EUBELI_20172 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 300 7 305 305 480 74.0 1e-134 MKRSIAQYHLPGLFEFYELYKVFLPLFREHGEYFYDWCDIGSVYGAPADCIWGGGRAGFG DDDAKKVLDLMKGYGISARLTFSNSLLREEHLLDKKCNALCRLFEETGDIQNGVIIHSDL LLEYLKKNYPNLYFVSSTTKVLTNFQDFLKEVKREDFQYVVPDFRLNKSFDRLNTLTQIE KDKVEFLCNECCWFGCKDRKRCYEAVSRKNLGEHYPEHHCTAPNAEQGYRFSKAMKNSGF IGIEDIQNSYLPMGFSNFKIEGRGLGSALVLEFLLYYMVKPEYQIHIREAIYLDNMLDLF >gi|222441805|gb|ACEP01000137.1| GENE 7 6133 - 6948 601 271 aa, 
chain - ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 5 265 5 259 269 122 30.0 7e-28 MIYTDTHLHSSFSSDSDTPMELMIKQGIRLGMKTLCFTEHYDPCYPDTPDGLDFLLDFEN YKKTLFTLKEKYASQIEILYGLELGVQPHLGQTLANFYKKYGNDFDFIINSCHIVNGTDP YYGTYFKEMGATRGLMNYFETILANLKVFPHYQSVGHLDYVCRYLPESSQTFFYDHFTDV LDEILIEIIERNRALEVNTAGLKYGLDWPNPDISILRRYRELGGELITIGSDAHQPEHMA WEFDHVPAILHRAGFHHYVVFKGKKPVFYDI >gi|222441805|gb|ACEP01000137.1| GENE 8 7256 - 8272 1417 338 aa, chain + ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 335 1 331 334 427 66.0 1e-119 MAVKVAINGFGRIGRLAFRQMFGAEGYEVVAINDLTSPKMLAHLLKYDSSQGTYEKAATV KAGEDSITVDGQEIKIYAFPDANNCPWGEIGVDVVLECSGFYTSKEKAQAHINAGAKKVV ISAPAGNDLPTVVYNTNHTVLKPEDTIISAASCTTNCLAPMADALNKYAPIQSGIMCTIH AYTGDQMTLDGPQRKGDLRRSRAAAVNIVPNSTGAAKAIGLVIPELNGKLIGSAQRVPTP TGSTTILTAVVKGADVTVEGINAAMKAAANESFGYNEDEIVSSDIVGMRFGSLFDATQTM VNKVADDLYEVQVVSWYDNENSYTSQMVRTIKYFAELA >gi|222441805|gb|ACEP01000137.1| GENE 9 8464 - 9675 1818 403 aa, chain + ## HITS:1 COG:CAC0710 KEGG:ns NR:ns ## COG: CAC0710 COG0126 # Protein_GI_number: 15893998 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Clostridium acetobutylicum # 1 403 1 397 397 515 69.0 1e-146 MGLNKKSVDDINVKGLKVLCRCDFNVPLKDGKITDENRLVAALPTIKKLIADGGKVILCS HLGKVKTEEDYTTKTLAPVAARLSELLGQEVKFAADREVVGPNAKAAVEAMKDGDVILLE NTRFRKEETKCGEEFSKDLASICDVFVNDAFGTAHRAHCSNVGVASLVDTAVVGYLMQKE IDFLGNAVENPVRPFVAILGGAKVADKLNVISNLLEKCDTLIIGGGMAYTFLKAKGYEIG NSLVDETKIDYCKEMMEKAESLGKKLLLPVDSVMVDAFPNPIDAEIPTYVYDADAMAADK 
EACDIGPKTIELYAEAVKTAKTVVWNGPMGVFENPILAAGTKAVAAALAETDATTIIGGG DSAAAVNQLGYGDKMSHISTGGGASLEFLEGKELPGVAAANDK >gi|222441805|gb|ACEP01000137.1| GENE 10 9786 - 10535 1227 249 aa, chain + ## HITS:1 COG:BS_tpi KEGG:ns NR:ns ## COG: BS_tpi COG0149 # Protein_GI_number: 16080445 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Bacillus subtilis # 1 238 1 239 253 254 55.0 1e-67 MRKKIVAGNWKMNKTPSEAVELVNMLKPLCANEDVDVVFCVPAIDIIPAMEAAKGSNIAI GAENMYYEESGAYTGEISPAMLVDAGVKYVVLGHSERREYFAETNETVNKKMLKAFEHGI TPIMCCGETLEQREQGVTMDFIRQQVKVGFLNVTADQAKTAVIAYEPIWAIGTGKTATTE QAQEVCAEIRKCIAEIYDDATAAAIRIQYGGSVNAKTAPDLFAQADIDGGLVGGASLKAD FGQIVNYNK >gi|222441805|gb|ACEP01000137.1| GENE 11 10700 - 12244 1921 514 aa, chain + ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 511 1 509 510 630 61.0 1e-180 MSKKPTVLLIMDGFGLNDRHEANAVYEANTPNIDKLMEKCPFVKGNASGLAVGLPDGQMG NSEVGHLNMGAGRIVYQELTRITKAIKDGDFFDNKAFLEAVENCKKNNSTLHLYGLLSDG GVHSHNTHLYALLELAKRQGLTDVCVHCFLDGRDTAPTSGKEFIEELETKIKEIGVGQIA SIEGRYYAMDRDNRWDRVEKAYAALVYGEGNEAADAVEAIQASYDNDKTDEFVIPTVIKK DGQPVGTVKANDSVIFFNFRPDRAREITRAMCDPEFTGFERRNGVFPLTYVCFTEYDETI PGKIIAFHKEEITNTFGEFLAANGLTQLRLAETEKYAHVTFFFNGGVEVPNKGEDRILVK SPAVATYDLQPEMSAPEVGEKLVEAIKSDKYDVIIINFANPDMVGHTGVQEAAIKAVETV DTCVGNAVAALKEVDGQMFICADHGNCEQLIDYETGEPYTAHTTNPVPFILVNYDPAYTL REGGCLADIAPTLIEMMGMEQPAEMTGKSLLIKK >gi|222441805|gb|ACEP01000137.1| GENE 12 12484 - 13230 500 248 aa, chain + ## HITS:1 COG:ML2522 KEGG:ns NR:ns ## COG: ML2522 COG3786 # Protein_GI_number: 15828358 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mycobacterium leprae # 69 245 36 216 218 84 32.0 2e-16 MKENTKKGSCYLNGLAIFLSIIFVISIFIAADNAIPLQAATAKSQSTTVANRSKVKKDKW TTLLQKYEKSKKVNQLIFVKYKGKSKANIVLYEKVNGKFKKVFGCIGYVGKNGIGKKREG 
DKKTPTGTYSFTKAFGIKKNPRSKIKYIKLNKYHYWSGDRRYYNQMIDIRKVKASRSGEH LIDYKPHYNYALAIDYNKNCVYKKGSAIFLHCTGSNPYTGGCVAVKQKYMKKIMQTVDKN AKICIYPK >gi|222441805|gb|ACEP01000137.1| GENE 13 13579 - 14448 1136 289 aa, chain + ## HITS:1 COG:ECs0170 KEGG:ns NR:ns ## COG: ECs0170 COG0024 # Protein_GI_number: 15829424 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Escherichia coli O157:H7 # 44 288 5 249 264 246 50.0 4e-65 MKIGRNDKCWCGSGKKYKACHMSFDNRIKDYWNRGYEVPDHSMIKTPEQIEGIREAGKKN TMVLDFITPYVKEGVSTGELNRLIEEYTREIGGIPACLGYQGYPKSVCISIDDVVCHGIP SDDQILKSGQILNVDCTTIYNGYFGDASRMFCIGDVSEEKKKLVRVTKECVDLALAATKP WGTMGDMGYVCNKHAVENGYSVVREIGGHGCGVQFHEEPWVNHIGQPDEGILFVPGMTFT IEPMVNMGKPDVWEDEDDGWTIRTEDGKPSAQWEYTILVTEDGTEILSH >gi|222441805|gb|ACEP01000137.1| GENE 14 15111 - 16283 1603 390 aa, chain + ## HITS:1 COG:FN1432 KEGG:ns NR:ns ## COG: FN1432 COG0683 # Protein_GI_number: 19704764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport systems, periplasmic component # Organism: Fusobacterium nucleatum # 2 387 3 375 383 246 40.0 5e-65 MRKFISVMLVAAMAVTALTGCGSNSSSSSKKDADKYYIGGIGPTTGATAIYGTAVKNGAQ IAVDEINAAGGINGKQIEYRFEDDQNDAEKAVNAYNTLKDWGMQMLVGTTTTAPCIAVAG KTASDNVFQITPSASSPDVLSSGNGNVFQVCFTDPNQGVASAQYIAENKLPTKIGIIYDS SDVYSSGIEEKFEAEAKTQGLNIVSKAAFTADSKTDFGTQLQKAKDAGADLLFLPIYYQE ASIILKQADTMGYKPKFFGVDGMDGILTVENFDTKLAEDVMLLTPFAADAKNKSVQNFVK TYKEKYKDTPNQFAADSYDAVYALKAAIEESKATTDMSASDMCDALKGAMTKIKMQGLTG GEDGLTWNESGEVTKSPKAVIIKNGAYKAM >gi|222441805|gb|ACEP01000137.1| GENE 15 16457 - 17338 1139 293 aa, chain + ## HITS:1 COG:FN1431 KEGG:ns NR:ns ## COG: FN1431 COG0559 # Protein_GI_number: 19704763 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid ABC-type transport system, permease components # Organism: Fusobacterium nucleatum # 1 293 14 308 308 253 51.0 3e-67 MSFISYLINGVSLGSIYAIIALGYTMVYGIAKMLNFAHGDVIMIGCYIVFVLMGSFHLNS 
TVSIIAAVIGCTVLGVVIEKIAYKPLRKASPLAVLITAIGVSYFLQNFALILFGADTKSF TSVVSIPTIKLAGGDLTISGVTIATVISCIIIMAALMIFIKKSKSGQAMLAVSEDKDAAQ LMGVNVNKTISLTFAIGSGLAAIAGVLLCSAYPTLSPYTGAMPGIKAFTAAVFGGIGSIP GALIGGILLGVIEILGRAYISSQLSDAIVFAVLIIVLLVKPTGLLGKQVQEKV >gi|222441805|gb|ACEP01000137.1| GENE 16 17351 - 18436 1285 361 aa, chain + ## HITS:1 COG:FN1430 KEGG:ns NR:ns ## COG: FN1430 COG4177 # Protein_GI_number: 19704762 # Func_class: E Amino acid transport and metabolism # Function: ABC-type branched-chain amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 50 360 2 282 285 201 45.0 2e-51 MKKSNMKSINKQTKSMLITYAMVIAVFIIVEIMISTGMMSSLMQGLLVPLCIYSIVAIGL NLCVGYLGELSLGHAGFMCVGAFSSAVFSKCMMTSGMNETLRFILALLIGTVVAAVFGWL IGIPVLRLRGDYLAIVTLAFGEIIKNVINVMYLGRDSKGFHFSIKDSASLNMQADGKMII DGAKGIVGTPNQSTFVVGIIIILISLFVSFNLVNSRTGRAIMAIRDNRIAAESVGINITK YKLMAFTISAAIAGAGGVLYAHNLSTLTALPANFGYNMSIMILVFVVLGGMGSFRGSIIA AVVLTMLPEVLRGLADYRMLIYAIVLIIMMLFNWAPKAIEWREKHSLKRMFGKKEKAEEV Q >gi|222441805|gb|ACEP01000137.1| GENE 17 18439 - 19197 255 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 14 244 33 254 329 102 25 2e-21 MALLEVKNLGISFGGLRAVDQFEINIKKGQLYGLIGPNGAGKTTIFNMLTGVYKPTDGII KLDGEDLVGKKTIEINKAGIARTFQNIRLFSQQSVLDNVKIGLHNQYKYSTLTGILRLPK YRKMERKMNEKAMELLKVFELDDKADFLASNLPYGEQRKLEIARALATNPKLLLLDEPAA GMNPNETIELMDTIRFVRDNFDMTVLLIEHDMKLVSGICEELTVLNFGQVLAQGKTKDVL NNPEVIKAYLGE >gi|222441805|gb|ACEP01000137.1| GENE 18 19212 - 19922 287 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 226 1 229 245 115 26 4e-25 MAMLEVKDLEVYYGMIQAIKGISFEVNQGEVIALIGANGAGKTTTLQTITGMLQAKKGHI LFEGQDITKVPGHKIVTMGMAHVPEGRRVFANLSVYENLKLGAYTRKDKNEIAQSLEMVY ESFPRLKERRNQSAGTLSGGEQQMLAMGRALMSKPRIVLMDEPSMGLSPIFVDEIFKIIQ KISAEGTTVLLVEQNAKKALAIADRAYVLETGKIALSGDAKELMNNESIKKAYLGE >gi|222441805|gb|ACEP01000137.1| GENE 19 19931 - 20548 707 205 aa, chain + ## HITS:1 
COG:no KEGG:Clole_2248 NR:ns ## KEGG: Clole_2248 # Name: not_defined # Def: hypothetical protein # Organism: C.lentocellum # Pathway: not_defined # 1 202 1 202 208 259 61.0 5e-68 MKHTVITIARSYGSGGRTLGKLLAKELGIHCYDRELLRMASEQSGINEALFGEVDEKVKS LPLFGISKKIYKGEVFPPESDDFVSDDNLFNYQAKVIKEVAAEESCVIIGRCADFILKDN PNVIRLFFYAPREDCITRVKTQNGGKEKDIIRKIEKTDKYRSDYYKYHTGKDWNDSRNYD FCLNTASMSYEKLVAVVKAYIEQYA >gi|222441805|gb|ACEP01000137.1| GENE 20 20629 - 21918 1424 429 aa, chain + ## HITS:1 COG:CAC0713 KEGG:ns NR:ns ## COG: CAC0713 COG0148 # Protein_GI_number: 15894001 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Clostridium acetobutylicum # 1 422 1 415 431 511 63.0 1e-144 MKKGFNIEDVHGLEVLDSRGNPTVEVTVRLESGAMGKAIVPSGASTGRFEAVELRDEEKR FGGKGVERAVTNVNTRICDRLCGCNALEQREIDRILREADGTENKSRYGANAILGVSLAV ARAAADGLGLPLYRYVGGVNGKVLPVPMMNVINGGCHAKNSIDFQEFMIMPAGAETFSEG IQMCAEIYQQLKKTLSEKGYSTGVGDEGGFAPDLDSAQAALEILQEATTLAGYEPGEDIF FAMDAAASELYDENTKRYYFPGESRMREKEICRNAEEMVQYYEELVEQFPLISIEDGLSE EDYAGWKVLTYRLGEKVQLVGDDLFVTNVKRLSDGIEKGIANAILIKPNQIGTLSETLDA ILMAQKEGYRTIISHRSGDTEDTTIADLAVAVNAGQIKTGAPCRAERTAKYNQLMRIEEE LGNNGSYGR >gi|222441805|gb|ACEP01000137.1| GENE 21 22088 - 23338 1738 416 aa, chain + ## HITS:1 COG:AF0890 KEGG:ns NR:ns ## COG: AF0890 COG1744 # Protein_GI_number: 11498495 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Archaeoglobus fulgidus # 3 349 22 346 397 160 34.0 3e-39 MKKLLAFILAAVMTASLCACGSSKGGAEGSTSAKGSSAAGEAVAASDIKVGCIMIGDENE GYTEAHLKAVEEMKKNLGLSDDQVVIKTNIKEDEGCYDAAVDLAEQGCNIVIANSFGHES YILQAAADYPDVQFCHATGYQAKSSGLSNMHNFFTRIYESRYLSGVVAGMKLNEMIKNGE VSKDKCKIGYVGAFPYAEVISGFTSFYLGVKSVCDSATMEVKYTNSWASFDLEKECAEQL IADGCVLISQHADTTGAPTACEAKKVPCVGYNIDMIPTAPDAALTSATINWGPYYTYAVQ SVIDGTAIETDWAKGVKDGADAITELNDKTVAEGTKEAVEEAQKGIEDGTLHVFDTSKFT VGGKTLEELVESGDKDATKLKDYIKDGYFHESDVESGMPSAPAFTFIIDGINSITK >gi|222441805|gb|ACEP01000137.1| GENE 22 23557 - 25083 1597 
508 aa, chain + ## HITS:1 COG:AF0887 KEGG:ns NR:ns ## COG: AF0887 COG3845 # Protein_GI_number: 11498492 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport systems, ATPase components # Organism: Archaeoglobus fulgidus # 9 506 1 492 495 393 42.0 1e-109 MEQTYAIELKNITKRFGKIVANDGINLKVKRGEILSLLGENGSGKTTLMNMISGIYFPDE GHILVDGKAVSIRSPKDSFDLGIGMIHQHFKLIDVMTAAENIILGLKGNAVLDMKKVHKD IQELVDRYGFDLDPAQKVYTMSVSQKQTVEIVKMLYRGANILILDEPTAVLTPQEVDKLF EVLRNMRDDGRAIIIITHKLNEVLELSDRVAILRKGKYIGTLQTSEATVESLTEMMVGEK VSLDINRPMPHDLKKRLTINNITCTNEDGLHVLENVSFEAFGGEVLGVAGISGSGQKELL EAIAGLTKLSSGSVIFHSPEGEDIDLAKYDAAQINELGVRHSFVPEDRLGMGLVGSMGMT GNMLLRSFRKGKGFFTSLKAPKELAEKIKEDLDVVTPDIDYPVRRLSGGNVQKILVGREI ASSPAVLLVAYPVRGLDINSSYMIYNLLNKQKERGVSVVCVAEDLDMLLEFCDRIMVLCD GKVTGIVDARKTTKEEVGRLMTSHKEEQ >gi|222441805|gb|ACEP01000137.1| GENE 23 25083 - 26177 1074 364 aa, chain + ## HITS:1 COG:SMc00243 KEGG:ns NR:ns ## COG: SMc00243 COG4603 # Protein_GI_number: 15965422 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Sinorhizobium meliloti # 10 358 3 350 364 138 31.0 2e-32 MDNTVKEPLIHLSKREDMSFGKKWLVRIIAILLSLIVCAGFIFIIIKMNPVEVYEGIVEG AVGTGRRFWVTARDALTLLCVAVAVTPAFKMKFWNIGAEGQILMGCVASAAMMIYAGNAL PTGLLLVLMFVCSFIAGGIWGIIPAFFKARWNTNETLFTLMLNYVAMQVVTFCIVYWENP AGSNSVGIINAPTRAGWLPQLFGLDYGWNLAIVLLLTVLMFIYMKRTKHGYEIAVVGESE NTARYAGINVGKVIVRTVGISGGIAGIAGFILVSGSSHTISTSTAGGRGFTAIIVAWLAK FNVWAMAVIAFLLVFMQQGAIQIATQYNLNENASEVITGIILFFVIGCEFFINYKLEFRK KHSA >gi|222441805|gb|ACEP01000137.1| GENE 24 26192 - 27172 1036 326 aa, chain + ## HITS:1 COG:AF0889 KEGG:ns NR:ns ## COG: AF0889 COG1079 # Protein_GI_number: 11498494 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Archaeoglobus fulgidus # 2 323 5 306 310 174 38.0 3e-43 MSVLITFIQKAVMQGICMLFGADGEILTEKSGNLNLGVPGMMYMGGIAGLMAAFYYEKLA 
AHPNGLVGMILSLLAAFLMAAFGGLIFSVLTITLRANQNVAGLALTTFGVGFGNFFGGSL STLAGGVGQISVAITGDAYRAKIPLLSKIPVIGDIFFNYGFLTYFAIIITLLMAFFLKRT RWGLNLNAVGESPATADAAGINVTLYKYLATVIGGGIAGLGGLYFVMEYSGGTWTNNGFG DRGWLAIALVIFALWKPINAIWGSFLFGGLYILYLYIPGLGRSMQEIFKMLPYLVTIIVL IITSLRKKKEQQPPESLGLAYFREDR >gi|222441805|gb|ACEP01000137.1| GENE 25 27198 - 28475 1613 425 aa, chain + ## HITS:1 COG:BMEII0702 KEGG:ns NR:ns ## COG: BMEII0702 COG1744 # Protein_GI_number: 17989047 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Brucella melitensis # 45 360 34 333 363 84 24.0 3e-16 MRKHIAVIIALVMAFSLCGCGSSKSLFPENTGNKQQTKTADSGSIKVGYLLSSDGDAPDT VARVQGIRKMQEETGIADDQIIIAESVKKADCEKKAAELVEKGCDIIFAENPNLESILEE AAKKYPDVQFCQEGGKLAKKSGLSNFHNYDTRIYEAYHVAGLVAGVKLNHLLDKGDISAS ECVIGFVAYDKTAKTTSCINAFYLGVERVCSQSSILVRYVGKRGVYDADGKAARQLIAAG VKMMAQYTYTTAVATVCAENDTPLIGNDVNLISTAPKDALTSVTSDWSKYYTYAVNKVLN GKEIAADWTAGYAEDAVVISQLNDEHVTDGTAAKAAELEKNLRAGNAKVFNTEKFTIDGS SLDTLATSNTNFKKYKKYIKNGNLQESMTQSKSIFDTFIDGITESTQDYLAEQSADSAES TTQSN >gi|222441805|gb|ACEP01000137.1| GENE 26 28546 - 28788 241 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028609|ref|ZP_03717801.1| ## NR: gi|225028609|ref|ZP_03717801.1| hypothetical protein EUBHAL_02888 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02888 [Eubacterium hallii DSM 3353] # 11 80 1 70 70 120 100.0 3e-26 MARTLVTIIFMLICLVMIIVVMMQESKNSGFGALSGQVDSDSYVRKNRGRTKEGKLERIT AILGFLFFVVAIGLCLKVFQ >gi|222441805|gb|ACEP01000137.1| GENE 27 28878 - 31265 1666 795 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 11 756 4 730 730 646 46 0.0 MKDRLLFEQRKNSIRDMIYDPNYIPLKLKEMAYLMDVPHKDRGALKEVMDALVADGSVEL TARGKYIKPENQNVTGVFTANARGFGFVTAEGEEEDIFIPATYVNGALHKDVVKVKVTKK TGGEGKRREGMILKILERGCKTLVGTFQKNTSFGFVLPDDRHYDKDIFISKKHINGAKDG DKVVVRLTDFGGERKKPEGAVIEILGAMDDPATDVTSIVRAYGIEEEFPKSVIKEAQAIP 
QEITEQPVGKRADFRDLLTVTIDGEDARDLDDAITLSKKEDKYYLGVHIADVSHYVKEDS PLDKEALERATSVYLADRVIPMLPKELSNGICSLNAGVERLAMSCMMTFDASGNVLDHTI TESVICVDERMSYTGVKAILEGEAHPEGKREDIRALCFLMKEAAAILKEKRRKRGAIDFD FPESKIVVDEKGYPVDIHPYERNVATDIIEDFMLLANETVAEDYFWQEIPFVYRTHEAPD SDKIKKLDTFIHNFGYYMKTGRESFHPKEIQKLLFSLEGEPEEPLISRLALRSMKQAKYT TLNVGHFGLSTQYYTHFTSPIRRYPDLQIHRIIKENIHGKLNQKRLEHYESILPSVAQQS STMERRAQDAEREVDKLKKVEYMQQYIGGVFTGVISGVTSWGFFVELDNTIEGMVPINSL LDDFYVFDEESYKLTGEHTGRIFTLGQKAEIVVRSADKMERTIDFILKEFSQYGEYPDPD DYYGDGMPSRKKENTESEDFEFVDGEYPANQEESEFDETGESLPDKDGILLNDWIDDEEA DRLLHLYSKKKIDEI >gi|222441805|gb|ACEP01000137.1| GENE 28 31357 - 31830 387 157 aa, chain + ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 1 152 1 150 156 163 57.0 1e-40 MAKSEGIKLIANNKKAYHDYFIDEKYEAGIALHGTEVKSLRMGRCSIKESFVTISNKGEI LINHMHISPYEKGNIFNKDPLRVRKLLLHKSEINKLAGQIKMKGYTLMPLKVYFKGSLVK VEIGLARGKKLYDKRQDIAKKDAKREAERDFKIRNLG Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:23:44 2011 Seq name: gi|222441804|gb|ACEP01000138.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont434.1, whole genome shotgun sequence Length of sequence - 12384 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 6, operones - 1 average op.length - 5.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 158 - 217 8.4 1 1 Tu 1 . + CDS 296 - 1846 1549 ## COG1418 Predicted HD superfamily hydrolase + Term 1891 - 1929 1.5 + Prom 1873 - 1932 7.4 2 2 Tu 1 . 
+ CDS 2049 - 2648 323 ## gi|225028614|ref|ZP_03717806.1| hypothetical protein EUBHAL_02893 + Term 2707 - 2739 4.2 + Prom 2673 - 2732 2.8 3 3 Op 1 15/0.000 + CDS 2792 - 3553 800 ## COG1319 Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs 4 3 Op 2 11/0.000 + CDS 3558 - 4010 444 ## COG2080 Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs 5 3 Op 3 1/0.000 + CDS 4019 - 6313 2354 ## COG1529 Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs 6 3 Op 4 . + CDS 6360 - 6956 425 ## COG2068 Uncharacterized MobA-related protein 7 3 Op 5 . + CDS 6962 - 7585 768 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins + Term 7722 - 7767 -0.4 - Term 8026 - 8059 1.5 8 4 Tu 1 . - CDS 8122 - 8949 451 ## CLL_A0643 conserved hypothetical protein TIGR03172 - Prom 9196 - 9255 7.2 + Prom 8941 - 9000 7.4 9 5 Tu 1 . + CDS 9250 - 10062 944 ## COG1975 Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family + Term 10063 - 10121 4.2 + Prom 10068 - 10127 5.0 10 6 Tu 1 . 
+ CDS 10235 - 12280 2055 ## COG0855 Polyphosphate kinase Predicted protein(s) >gi|222441804|gb|ACEP01000138.1| GENE 1 296 - 1846 1549 516 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 9 516 7 514 514 534 62.0 1e-151 MSPIAVIIVLIAVVLTAVISVFATIAYRKSVSEAKVGSAEEKAREILDDALKTAEAKKRE ALLEAKEESLKAKNDFEKESRERRMELQRYEKRVLNKEETLDRKSDALEKKEQSLAHKES SLEKQKVKVEELHEKRLQELERISGLTSEQAKDYLLKTVEDDVKREMAVMVKEMKTRAKE EASKKAKEYVVTAIQKCAADHVAETTISLVQLPNDEMKGRIIGREGRNIRTLETLTGVDL IIDDTPEAVILSSFDPVRREVARIALEKLIVDGRIHPARIEEMVEKAQNEVEQTMREEGE AALLEVGVHGIRPELVRLLGKMKYRTSYGQNALKHSIEVAQLSGLLAGEIGEDVRLAKRA GLLHDIGKAVDQEMEGSHISIGVDLCKKYKESPVVINAVESHHGDVEPTSLIACIVQAAD TISAARPGARRETLETYTTRLKQLEDIANSFKGVDKTFAIQAGREIRIMVVPEQVSDSDM ILLARDISKQIESELSYPGQIKVNVIRESRAVDYAK >gi|222441804|gb|ACEP01000138.1| GENE 2 2049 - 2648 323 199 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028614|ref|ZP_03717806.1| ## NR: gi|225028614|ref|ZP_03717806.1| hypothetical protein EUBHAL_02893 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02893 [Eubacterium hallii DSM 3353] # 1 199 1 199 199 273 100.0 6e-72 MYYTIEGKRVLGILYIAGLITGTLFFNITIKMQVFKVSDFLDFTEYMKMLENIDLAIFCS YVVFVRFRQLLLFFVGLFLFSPYIVFCLFDYGVSVFTGILISAMVLKYGWTGLVGGTCFL FPQYFFYGMIFVIVYIYLFRNAAIGNFYTVISRGRRGTHTLLKDRIIIILLIIGLFFVGC YIEAYISPAILKKIFGIFT >gi|222441804|gb|ACEP01000138.1| GENE 3 2792 - 3553 800 253 aa, chain + ## HITS:1 COG:APE2219 KEGG:ns NR:ns ## COG: APE2219 COG1319 # Protein_GI_number: 14601920 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, middle subunit CoxM/CutM homologs # Organism: Aeropyrum pernix # 1 249 10 282 292 71 29.0 1e-12 MAESLEQAWELNQKGRNNIIIGGNLWLKMGRRNIINAIDLSSLGLDKIEEDEGGFRIGCM ATLHDIETHEGLNKEFQNLFKEAVRHIVGVQFRNCATIGGSIFPKLGFSDVLTAFLACDT 
QVILYKKGEVPLREFIYIPTDNDILTHLYVKKDGRKTAYESFRMTETDFPTITCAVAKTQ DGFETVLGARPKHAEVVVDEVVADAFSSEEVADYAKRVIGKLQYGSNIRGSEQYRKQLSK VLIGRCISRLTEE >gi|222441804|gb|ACEP01000138.1| GENE 4 3558 - 4010 444 150 aa, chain + ## HITS:1 COG:SSO2433 KEGG:ns NR:ns ## COG: SSO2433 COG2080 # Protein_GI_number: 15899181 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, small subunit CoxS/CutS homologs # Organism: Sulfolobus solfataricus # 3 142 15 156 171 129 44.0 2e-30 MIIDLFINGKKIQAETAPDQMLIDFLRGLGYKSMKRGCDTGNCGLCTVWMDEKPVLSCSV PAARAAGHKITTLEGVQEEAAEFSDYLANEGADQCGYCSPGLIMNVLALKREIPNPTMAQ IKEYLSGNLCRCTGYQGQYRALAKYFGVEE >gi|222441804|gb|ACEP01000138.1| GENE 5 4019 - 6313 2354 764 aa, chain + ## HITS:1 COG:Z4220_2 KEGG:ns NR:ns ## COG: Z4220_2 COG1529 # Protein_GI_number: 15803418 # Func_class: C Energy production and conversion # Function: Aerobic-type carbon monoxide dehydrogenase, large subunit CoxL/CutL homologs # Organism: Escherichia coli O157:H7 EDL933 # 2 758 1 787 790 428 33.0 1e-119 MENTRFVNKGIVKKDAKVLLSGRPVYTDDIAPSDCLVVKLLRSPHAHAWIEDISTAAALK VPGIEAVFTYKDVPEKRYTQAGQTFPEPSPDDRLILDQHIRYNGDPVAIIAGKDENCVSQ AMKRIKVKYKVLEPLLDFTKAIDNPIIIHPEDNYVVKSDIGNDVKRNICASQCRGNGDID KVLSECDYTIDQTYHIKAVSQAMMEVFCAFTSIDIYGRIKVISSTQIPFHARRIIARGLN IPKSKVRVVKPRIGGGFGAKQTAVMEIYPAFVTYKTGKPAKLVFTREECFIQGSPRHEMQ VHVRLGADKNGKIRGIDLSTLSNTGAFGEHGPTTVDLSGTKSLSLYRMEAFRFKTEVVYT NVLSAGAYRGYGATQGIFAVESAVNELAHKLDIDPIKLREMNMVREGDLLTNYFGEVTNS CNLDKCVAKVKEMIGWDEKYPYVKTGPHTVRSVGMAMAMQGSGIGGVDVGSVQLKLNDDG FYTLMIGCSDMGTGCDTILAQMTADGLGCEFDDIIVRGVDTDQSPYDSGSYASSTTYVTG MAVVKACESMRKRICEEAAAIMGLKENEVEFDGETISENDNPEHNMTLSDLITSLQAGNG HIIEVTESHTSPVSPPPFMAGAAEIEVDLETGKITPINYVAAVDCGTVINTNLARIQAEG GLVQGLGMTLYEDMQYTKAGKMKNRNFMSYKIPTRLDMGNIKVEFESSYEETGPFGAKSI GEIVINTPAPAIADAVYNATGLRFRTLPITAEQVLMGLLDKEEA >gi|222441804|gb|ACEP01000138.1| GENE 6 6360 - 6956 425 198 aa, chain + ## HITS:1 COG:SSO2432 KEGG:ns NR:ns ## COG: SSO2432 COG2068 # Protein_GI_number: 15899180 # Func_class: R 
General function prediction only # Function: Uncharacterized MobA-related protein # Organism: Sulfolobus solfataricus # 1 192 8 188 189 78 29.0 7e-15 MASGYSERFGSNKLLESFLDKPLFTYTVDLVASCQAEMKIIATRYEPVIQYVKENVPSMS IVWNAHPERGISESIHLGIRYLKQQKDYEEYAGCCFMVCDQPLLQKESMQELFKSFYTNP EGIHMCVTKKREGNPVIFPKQLFDELMALTGDKGGKRVARQHPELVHKVYVAEKELSDMD YKEDKINLEKEHNKHNMR >gi|222441804|gb|ACEP01000138.1| GENE 7 6962 - 7585 768 207 aa, chain + ## HITS:1 COG:CAC0873 KEGG:ns NR:ns ## COG: CAC0873 COG0503 # Protein_GI_number: 15894160 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Clostridium acetobutylicum # 1 189 1 189 189 219 62.0 3e-57 MKLLEERIKKDGKALGKTVLKVDSFINHQVDSEFMDALGKDFAEHFKDCGITKVFTIESS GIAPALMTAKYLDVPMIILKKQTSKILNGKVYQTRVTSFTKGTSYELTLYQDYIDPSDKV LLIDDFLANGEAAMGASRLIKESGAELSGIGILIEKSFQKGRKRLEDAGYYVYSQARIGR LDKGVIEFLPEDAPVLEGVDEKELLDK >gi|222441804|gb|ACEP01000138.1| GENE 8 8122 - 8949 451 275 aa, chain - ## HITS:1 COG:no KEGG:CLL_A0643 NR:ns ## KEGG: CLL_A0643 # Name: not_defined # Def: conserved hypothetical protein TIGR03172 # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 21 273 30 265 265 103 31.0 8e-21 MNKYFFNHDLSPLLGLADSQIITFIGGGGKTSLMNTLGKEFASHGYPTLLTTTTHIMKPD FLSDESYIENEDLGQLANIFTNLKKNTLPLAALGIPEKVVNSNVKWRSPSSDFCEKIAEF SKKFSTKNPYKFLKILCEGDGSKRLPIKLPKDGEPVFFPKTDTVIGVIGLSCLGKPIKET LFRYELLPNLTSLDNYFIKSLQSADIVTTDFLYRLCLSEKGLRKNITSQKFCIIFNQADI LDEKALAEVITLRNQLQTKGICSHIISVKNNYIIN >gi|222441804|gb|ACEP01000138.1| GENE 9 9250 - 10062 944 270 aa, chain + ## HITS:1 COG:yqeB KEGG:ns NR:ns ## COG: yqeB COG1975 # Protein_GI_number: 16130777 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Xanthine and CO dehydrogenases maturation factor, XdhC/CoxF family # Organism: Escherichia coli K12 # 11 269 270 527 541 239 47.0 6e-63 MYTFEEWKKNLIIVRGGGDIATGTIYKLYQSRFPVLVLEIANPSCIRRTISFCEAVFDKK 
VEVEGVTAKRTESLEDAFAIYGQGQIPVMIDENGSMIQEVQPPVVVDAILAKRNLGTKIT DAPAVIGVGPGFCAGKDVDAVIETQRGHNLGRVIYEGEAAPNTGIPGMIGGYAKERVIHA PATGKLHILRQIGEIVEAGDILADIEGTPVKTLISGVIRGMIREGYDVKKGLKIADVDPR VKEQENCYHISDKARCVAGGVLEAILHLLK >gi|222441804|gb|ACEP01000138.1| GENE 10 10235 - 12280 2055 681 aa, chain + ## HITS:1 COG:BH1392 KEGG:ns NR:ns ## COG: BH1392 COG0855 # Protein_GI_number: 15613955 # Func_class: P Inorganic ion transport and metabolism # Function: Polyphosphate kinase # Organism: Bacillus halodurans # 6 675 20 696 705 509 42.0 1e-144 MSQQVYVNRELSWLKFNERVLEEAENKKVPLCERLTFASIYQSNLDEFFRVRVGSLVDQM LLGGKIRDNKTKMTAKEQIEEVLHQVMKLNRRKDAVYDAIMGQLEDYGVRLVDFRKISKK ESEYLEKYFLSEIAPVISPTIVGKRQPFPFLKNNEIYAVVVLQTKSGKEKLGIIPCSNTG FKRLVELPTAGTYILAEELILHYIPEVFKRYNIKAKSLIRVTRNADIDADALYDEDLDYR DFMAELIKRRKKLAPVRLELTREMDGEIVDVLCDYLELDSDYVFQVQAPLDLSFVFEIQD TLRKTPELFYEKRVPQKSSQFRDGEPVFPQIREKDKLLSYPYESMKPFLNFLREAANDKE VISIKMTLYRVAKHSKIVEYLIDAAENGKEVLVLVELKARFDEENNIEWSRRLEDAGCRV IYGLDGYKVHSKLCLVTRKSEGQVEYYTQIGTGNYNEKTARLYTDLSLMTANVEIGLEAA KVFQALSMEETVDNVQHLLVAPRCLQNKVLSMIDEEIACAKEEKEAYIGLKMNSLTDKKI IDKLIKASQAGVKIDMVIRGICCLIPGVKGKTENIQVRSIVGRFLEHSRIYIFGTQEREK VYIASADFMTRNTLRRVEVATPIYDKDLKMQLEEMFITMLSDNQKARQEDSRGNYEIAEA QETPLNSQEFFYEQAYMRVTI Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:24:01 2011 Seq name: gi|222441803|gb|ACEP01000139.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont435.1, whole genome shotgun sequence Length of sequence - 7174 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 806 - 839 4.5 1 1 Tu 1 . - CDS 847 - 1749 972 ## Closa_3994 hypothetical protein - Prom 1790 - 1849 4.7 - Term 1912 - 1946 -0.7 2 2 Tu 1 . - CDS 1960 - 2406 213 ## gi|225028626|ref|ZP_03717818.1| hypothetical protein EUBHAL_02905 3 3 Op 1 . - CDS 2620 - 2811 170 ## gi|225028627|ref|ZP_03717819.1| hypothetical protein EUBHAL_02906 4 3 Op 2 . 
- CDS 2833 - 4602 879 ## Sbal195_1006 hypothetical protein - Prom 4763 - 4822 8.8 - Term 4853 - 4894 7.5 5 4 Op 1 5/0.000 - CDS 4912 - 6750 2230 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains 6 4 Op 2 . - CDS 6775 - 7173 508 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|222441803|gb|ACEP01000139.1| GENE 1 847 - 1749 972 300 aa, chain - ## HITS:1 COG:no KEGG:Closa_3994 NR:ns ## KEGG: Closa_3994 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 297 3 306 310 317 54.0 5e-85 MKTALLIMAAGIGSRFGTGIKQLEPVDDFNHIIMDYSIHDAIEAGFNHIVFIIRKDIEEE FKEVIGNRIAKICEKYDVTVDYAFQDINDIPGTLPEGRTKPWGTGQAVLAAKNILDTPFI VINADDYYGKEGFKAVHEYLVNGSKSCMAGFVLKNTLSDNGGVTRGICKMDTENNLTEVI ETGNIVKTENGAEADGVTIDTDSLVYMNMWGLTPDFLNVLEEGFKEFFEKEVVKNPLKAE YLIPTFIGELLEQGKISVKVLRSNDTWYGMTYKEDVVSVRDNFNKMLKNGVYKADLFSDL >gi|222441803|gb|ACEP01000139.1| GENE 2 1960 - 2406 213 148 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028626|ref|ZP_03717818.1| ## NR: gi|225028626|ref|ZP_03717818.1| hypothetical protein EUBHAL_02905 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02905 [Eubacterium hallii DSM 3353] # 1 148 1 148 148 255 100.0 8e-67 MKSEIIVALIAAGTTILTIIGSIVAVKITSNSQIKEQKIAADREMKRKYYNAFIEAFTKK LFYIGKHDCIEKIEAEMAFVLEANRLPLYASEEMIQFIEKIKYPQIAKETKMEEFLIILR RDLCENTFENFSDIKGISLVLPNDAEGV >gi|222441803|gb|ACEP01000139.1| GENE 3 2620 - 2811 170 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028627|ref|ZP_03717819.1| ## NR: gi|225028627|ref|ZP_03717819.1| hypothetical protein EUBHAL_02906 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02906 [Eubacterium hallii DSM 3353] # 1 63 1 63 63 94 100.0 2e-18 MGLNPLGGPNRRGLTLLQKNKDGDAERNTLTEFNSNNTKLLNLEETKEMIAKLAYIQERL AEE >gi|222441803|gb|ACEP01000139.1| GENE 4 2833 - 4602 879 589 aa, chain - ## HITS:1 COG:no KEGG:Sbal195_1006 NR:ns ## KEGG: Sbal195_1006 # Name: not_defined # Def: hypothetical protein # Organism: S.baltica_OS195 # 
Pathway: not_defined # 1 578 1 588 589 147 23.0 2e-33 MKYSIPRKCDNFSEIDGILYFAQRLEEMLFDYTVDLFRMPLLNTHGLIKEYCSVTKKVEK NEVREYQRDIVFEEFSASFKSDIVIKECWGQDNIDRILKSFGSSSKQEKNDTIAYLNATF DNGKYYYWCVDTIKKYVRLPKQKKKIEATIRCWVSEILSMGYNSDYIYNELKKHFFSNGK ITESSVDDFLDIFNFEYHKYTVYFSVSNIALKFKEILEKRIRLCFNNDGNFSLFKKDKDK VIVYFEDIKAPCPNIAAEIAYNRLDLFFSFYKFVGNKRFFSIQKKAMIIEEQQSPIFVNA HKFSYNIIDDTDFAKIGATSDNLLTGLLINAESEYSLLRKSIELHNTALAVPDLKSGFLN LWSSIEVLCQPKNEGNKFEYVLKNVIPILKKEYLYSVIEDIIKCLKDNLPKCKYEEVLGL SNEIGCDIKKIFYLLFLPQYKEERKKIYGILGDFPVLRSRIACIAELDTTKKVKEYVGKY AQRVTWHLYRMYRTRNAIIHSGEVPHNIKYLGEHLHAYVDATLEEFVTKLSGDIPFDSTN NVIMDIKFATERIDNILEKDQKIDEKILDVLIHPEIGYTIQCKEHISNL >gi|222441803|gb|ACEP01000139.1| GENE 5 4912 - 6750 2230 612 aa, chain - ## HITS:1 COG:CAC0158 KEGG:ns NR:ns ## COG: CAC0158 COG0449 # Protein_GI_number: 15893453 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Clostridium acetobutylicum # 1 612 1 608 608 650 53.0 0 MCGIVGFTGVQQAAPILLDGLSKLEYRGYDSAGIAVRNGENETEVVKAKGRLKALAEKTD NGATVLGSCGIGHTRWATHGEPSEGNAHPHKSDDGNVVAVHNGIIENYLELKEKLTRKGY VFYSETDTEVAVKLIDYYYKKYEGTPIDAINHAMVRIRGSYALAVMFKDYPEEIYVSRKD SPMILGIEDGESYIASDVPAILKYTRNVYYIGNMELACVRKGEITFYNLDGEEIEKELKT IEWDAEAAEKAGFEHFMMKEIHEQPKVVGDTLNSVLKDGQFDLSSVGLSEEEIKDISQIY IVACGSAYHVGIAAQYVIEDLAKIPVRVELGSEFRYRNPLLDPKGLVIVISQSGETADSL AALRESKEKGVRTMAIVNVVGSSIAREADSVFYTLAGPEIAVATTKAYSTQLMATYILAL QFAKVRGEISDETYKNMIAELQTIPDKIAKILEDKERIQWFASKQANAHDVFFVGRGIDY AICMEGSLKMKEISYIHSEAYAAGELKHGTISLIEDDILVVGVLTQPDLYEKTISNMVEC RSRGAYLMGLTTFGQYNVEDSTDFAVYIPKIDPHFATSLAVIPLQLLGYYISVAKGLEVD KPRNLAKSVTVE >gi|222441803|gb|ACEP01000139.1| GENE 6 6775 - 7173 508 132 aa, chain - ## HITS:1 COG:BS_ybbT KEGG:ns NR:ns ## COG: BS_ybbT COG1109 # Protein_GI_number: 16077245 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 128 317 444 448 124 45.0 5e-29 
KNGCRIGGEQSGHIIFSKYASTGDGILTSLKMMEVMMAKKKKMSQLTEDLHIYPQVLVNV KVKDKAVAQADQDVQAAVNKVAEALGDAGRILVRESGTEPLIRVMVEAETEEICHTYVDE VVAVINEKEATR Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:24:28 2011 Seq name: gi|222441802|gb|ACEP01000140.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont436.1, whole genome shotgun sequence Length of sequence - 4264 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 574 - 633 2.9 1 1 Op 1 3/0.000 + CDS 659 - 1384 752 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control + Prom 1386 - 1445 2.5 2 1 Op 2 . + CDS 1481 - 2533 865 ## COG0582 Integrase 3 2 Tu 1 . - CDS 2773 - 3162 75 ## gi|225028633|ref|ZP_03717825.1| hypothetical protein EUBHAL_02912 - Prom 3259 - 3318 7.4 + Prom 3115 - 3174 6.7 4 3 Tu 1 . + CDS 3402 - 4028 655 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) Predicted protein(s) >gi|222441802|gb|ACEP01000140.1| GENE 1 659 - 1384 752 241 aa, chain + ## HITS:1 COG:HP1182 KEGG:ns NR:ns ## COG: HP1182 COG0037 # Protein_GI_number: 15645796 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Helicobacter pylori 26695 # 4 218 7 227 253 144 37.0 1e-34 MKLQKLLSLTRQAIDAYGMIKENDKIAVGISGGKDSLTLLYALSYLRNFYPHKFDIVAVT VDLGFDNLNLEEIREFCKKLGVEYQIIKTQIAEVVFQVRKESNPCSLCAKMRKGALNDAM KELGCNKVAYAHHKDDLLESMMMSFLFEGRIHTFAPVTYLDRSGLTVIRPLLFLYEGDVK GFVKEYNLPVVKSPCPVDGSTKRENVKNLIGEINHQYPGAKARMIRAVIDDIVNPDCPPQ N >gi|222441802|gb|ACEP01000140.1| GENE 2 1481 - 2533 865 350 aa, chain + ## HITS:1 COG:CAP0080 KEGG:ns NR:ns ## COG: CAP0080 COG0582 # Protein_GI_number: 15004784 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 41 344 19 307 323 117 30.0 3e-26 
MQQPTYDEQITMKNEMKLRELCAGLPAFCKEFFRGIEPTTSSRTRIAYAYDLKIFFHFLQ EKYPDIEESDLKNWPVTVLDKVTLTQLEEYLDYLRLYEDAENKVHTNEERGRMRKTASIR TFYKFFYRKQVIKNNTASLLNLPKKHEHTIVRLEIDEIAKLLDAVEEGDNLTKSQKKYHN KTKIRDLAILTLLLGTGIRVSECVGIDISNLDFETNGMKIHRKGGADVILYFGEEVREAL LDYLEEREKIIPKEGSENALFLSMQKRRISVRAVENLVKKYARPVTPLKTITPHKLRSTY GTQLYQETGDIYLVADVLGHKDVNTTRKHYAAMEDQRRRMAAGKIQLREK >gi|222441802|gb|ACEP01000140.1| GENE 3 2773 - 3162 75 129 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028633|ref|ZP_03717825.1| ## NR: gi|225028633|ref|ZP_03717825.1| hypothetical protein EUBHAL_02912 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02912 [Eubacterium hallii DSM 3353] # 1 129 1 129 129 247 100.0 2e-64 MQEISMKTTAQIYDFSELQKKKRQRELREKRPFFITGAVLIISLLSVCFFLYFGDRVVKA QESANDIQYKVVEIKNGDSLWSIAKENMDNTNDSGFINIYQYIHEIKRCNNMKSNQINAG CYLMVPYYN >gi|222441802|gb|ACEP01000140.1| GENE 4 3402 - 4028 655 208 aa, chain + ## HITS:1 COG:BH2356 KEGG:ns NR:ns ## COG: BH2356 COG1974 # Protein_GI_number: 15614919 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Bacillus halodurans # 5 205 3 204 207 218 54.0 5e-57 MSYGKISAKQKEILEYIKQSILSHGYPPAVREICEAVHLKSTSSVHSHLETLEKNGYIRR DPSKPRAIEIIDDNFNLTRRELVNVPIVGTVTAGEPILAIENIQGYFPIMPEFVNNKQTF MLKVKGESMINAGIFDGDFILVEETPTASDNDIIVALLEDSVTVKRFFKEEDHIRLQPEN DTMEPIIVPQDSPFSIVGRVIGLYRAFK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:24:42 2011 Seq name: gi|222441801|gb|ACEP01000141.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont437.1, whole genome shotgun sequence Length of sequence - 26145 bp Number of predicted genes - 19, with homology - 19 Number of transcription units - 17, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 26 - 85 8.1 1 1 Op 1 . + CDS 126 - 1418 860 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif 2 1 Op 2 . 
+ CDS 1490 - 2221 369 ## Closa_1841 hypothetical protein 3 2 Tu 1 . - CDS 2284 - 2658 554 ## gi|225028637|ref|ZP_03717829.1| hypothetical protein EUBHAL_02916 - Prom 2838 - 2897 11.5 + Prom 2844 - 2903 9.3 4 3 Tu 1 . + CDS 2926 - 3879 726 ## COG0523 Putative GTPases (G3E family) + Term 3988 - 4021 0.6 + Prom 4030 - 4089 4.8 5 4 Tu 1 . + CDS 4220 - 5779 1122 ## EUBREC_2878 hypothetical protein + Term 5819 - 5866 7.3 - Term 5807 - 5854 7.3 6 5 Tu 1 . - CDS 5879 - 7300 1916 ## COG0469 Pyruvate kinase - Prom 7343 - 7402 5.0 7 6 Tu 1 . - CDS 7421 - 8713 1312 ## COG0460 Homoserine dehydrogenase - Prom 8750 - 8809 3.3 - Term 8737 - 8774 8.0 8 7 Tu 1 . - CDS 8816 - 10471 1795 ## gi|225028642|ref|ZP_03717834.1| hypothetical protein EUBHAL_02921 - Prom 10706 - 10765 15.1 - Term 10974 - 11016 3.1 9 8 Tu 1 . - CDS 11135 - 12769 1862 ## COG1151 6Fe-6S prismane cluster-containing protein - Prom 12872 - 12931 5.8 10 9 Tu 1 . - CDS 12981 - 13316 461 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain - Prom 13398 - 13457 3.6 + Prom 13304 - 13363 6.7 11 10 Tu 1 . + CDS 13408 - 14091 460 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases + Prom 14378 - 14437 7.9 12 11 Tu 1 . + CDS 14540 - 15124 204 ## PROTEIN SUPPORTED gi|154684571|ref|YP_001419732.1| 50S ribosomal protein L25/general stress protein Ctc + Term 15150 - 15204 18.4 - Term 15138 - 15192 18.4 13 12 Op 1 . - CDS 15321 - 15686 280 ## gi|225028647|ref|ZP_03717839.1| hypothetical protein EUBHAL_02926 14 12 Op 2 . - CDS 15758 - 16126 261 ## ELI_1761 hypothetical protein - Prom 16149 - 16208 3.7 - Term 16170 - 16218 11.2 15 13 Tu 1 . - CDS 16244 - 18145 2151 ## COG1297 Predicted membrane protein - Prom 18180 - 18239 13.7 + Prom 18795 - 18854 3.4 16 14 Tu 1 . + CDS 18883 - 19335 557 ## ELI_2290 hypothetical protein + Prom 19390 - 19449 4.1 17 15 Tu 1 . 
+ CDS 19480 - 21720 2056 ## COG0659 Sulfate permease and related transporters (MFS superfamily) + Term 21751 - 21792 -0.6 + Prom 22102 - 22161 10.9 18 16 Tu 1 . + CDS 22222 - 23460 754 ## COG0665 Glycine/D-amino acid oxidases (deaminating) + Prom 23804 - 23863 7.7 19 17 Tu 1 . + CDS 23957 - 25732 2110 ## COG0441 Threonyl-tRNA synthetase + Term 25936 - 25974 -0.9 Predicted protein(s) >gi|222441801|gb|ACEP01000141.1| GENE 1 126 - 1418 860 430 aa, chain + ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 11 426 6 424 440 403 48.0 1e-112 MTEREIPLERKLKILADAAKYDVACTSSGSSRRGKKGHLGNTVSVDICHTFSSDGRCVSL LKILFSNDCVFDCKYCPSRAANEKERATFTPEEICRLVVEFYRRNYIEGLFLSSAVVKSP AYTMERMYEAIYMLRNKYHFNGYIHVKAIPGAPDDIIYALGFLVDRMSVNLELPTAEGLK RLAPNKKPKNLIRPIQQIQRGIQTQRVALGKDRRMERAKGNQYLSNSIFNEKNTSSVEEK SVFSIMEDAQKSVKKNDNVNRRQNDPELPAIGTAPPLLREHRLYQADWLLRYYGFQADEL LSSDRPNFNTFIDPKCDWALRHLEYFPVEINQASYEQLLRVPGIGNKSAGRIVRARRQAA LDFEDIKKMGVVLKRAVYFITCRGKMKYHTPIEEDFITRQLIGTNQKDNWKIEHPTTYRQ LSLFDDFNLT >gi|222441801|gb|ACEP01000141.1| GENE 2 1490 - 2221 369 243 aa, chain + ## HITS:1 COG:no KEGG:Closa_1841 NR:ns ## KEGG: Closa_1841 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 240 15 251 259 202 44.0 8e-51 MTCIYEAWASHLGHNNIKLRTEPLGTMELFCEYRHVEADKEKTESVIRTIQQKISFHAYQ MVYHAAMAADEEEKLDSIYRFLILGFHYGRKILDSLQNPIVMKIFELERKASNEAHIFRE CIRFTEMNHHILVGIISPKCDVVTLLAPHFVDRLPSEDWMIIDDNRRTAVVHPADQPYYL TSLSEEEMSEFLARESAAKDTDSMTALWKTFFQTIGIKERKNPVCQRNLIPLWYRKHMSE MQD >gi|222441801|gb|ACEP01000141.1| GENE 3 2284 - 2658 554 124 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028637|ref|ZP_03717829.1| ## NR: gi|225028637|ref|ZP_03717829.1| hypothetical protein EUBHAL_02916 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02916 [Eubacterium hallii DSM 3353] # 
1 124 1 124 124 231 100.0 2e-59 MLRDDFDLRLTNELIETYLKIAYGIKPGCWLDAEKGLLQENTLYPKLKQMINDGNINDAE DILYEYANPINPDILKVGLFFYYDLNRMADAELEAADFTRDEVKEGVQELLKIYDQENFG TIFR >gi|222441801|gb|ACEP01000141.1| GENE 4 2926 - 3879 726 317 aa, chain + ## HITS:1 COG:STM2212 KEGG:ns NR:ns ## COG: STM2212 COG0523 # Protein_GI_number: 16765541 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Salmonella typhimurium LT2 # 1 160 1 157 328 62 30.0 1e-09 MVKIDLITGFLGSGKTTFIKKYAKYLIDQGLNIGILENDFGAVNVDMMLLQDIAGEKCTL EMVAGGCDKDCHRRRFRTKLIAMGMCGYDRVLIEPSGIFDMDEFFDALHESPLDRWYEIG NVITVVDAMLEENLSEDAEFILASEVANAGIVLLSKAQEAAEIDIERTKAHLNKAMESVH CDRRFEKEIFAKDWNKLSDADFKKIQSAGYAGADYEKKDIAEEDAFQSLYFMNLTMPVEK LEEKVKQIFNDKECGNIFRIKGFMQTKPDQWIELNATHQNITIQSIKKGQEIFIVIGEKL NKEKITTNLMGTQTPLC >gi|222441801|gb|ACEP01000141.1| GENE 5 4220 - 5779 1122 519 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2878 NR:ns ## KEGG: EUBREC_2878 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 519 1 522 526 728 67.0 0 MGTFFNPGNGSFTQDKNSEIYIDKTELLKFLNKKLGTNGKCIAVSHARRFGKSHAAGMID AYYSLGSDSSKLFENTKIASDPDYEKYRNKYNVIHLDISSVWDYHKEDLIESIEERVCED FQKIYGDLLNYKRDFYLLIQDIYLLTGIPFVIIIDEWDCVIRNSHDQALVHKYLQFLHSL FKSEESKSFLALGYITGILPIKKIKDESALNNFQEYTMLDSYPITEYYGFTEKEVKDLCK DYSMDFETVKAWYNGYLIDGEHMYNPNSVSQALERHKIDSYWRNTSAFDTINTFITMNYA GLKDDVMKMLTGGKVRINTNTFQNDFSIITSKDDALTALIHLGYLGYDADRKKSFIPNYE VATAFELALQTGSWKEIAESISKCDELLDETIDGNAERVAELIGLAHETYTSILKYNDEN VLSCVLTMAYFTAPAYYNVIREMPSGKGFADFVFLPRANAGNRPAMIIELKYNQSAETAL KQIKEKRYQGALSGYQGKILLVGINYDSEKHHTCMIEEL >gi|222441801|gb|ACEP01000141.1| GENE 6 5879 - 7300 1916 473 aa, chain - ## HITS:1 COG:BH3163_1 KEGG:ns NR:ns ## COG: BH3163_1 COG0469 # Protein_GI_number: 15615725 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Bacillus halodurans # 1 473 1 473 473 466 51.0 1e-131 MRKTKIICTLGPSTDKGDVLEQLMLSGMNVARLNFSHDTYENQKKRIDKVKALRTKHNLP 
VACLLDTKGPEIRLKTFKDERVTLEMGQDFCLTTRDVEGTKDIVSVTHKDLHKDIHVGSS ILIDDGLVGLKVVAIKGQDIHCKVENGGTISNRKGVNIPGVELSIPFMSEKDKEDLLFGI KQDVDFVAASFTRTADDIKEMKAFLKANGGEDIRIIAKIENSQGVDNIDSIIEACEGIMV ARGDMGVEIPEEEVPIIQKMIIKKVIAAGKVVITATQMLDSMMKNPRPTRAETTDVANAI YDGTSATMLSGETAAGAYPVEAVKMMARIAERTEKEINYRKRFNDMAKYTDPGITDAICH ATCSTAYDLDAKAIITVTKSGFSARMISKYRPGCDIIGCAMDEKVCRQLNLSWGVRPILL GEEWEVFVLFERAINACKRDGLLEKGDVTVITSGVPIGRSGTTNMLKVQVVDD >gi|222441801|gb|ACEP01000141.1| GENE 7 7421 - 8713 1312 430 aa, chain - ## HITS:1 COG:BH3422 KEGG:ns NR:ns ## COG: BH3422 COG0460 # Protein_GI_number: 15615984 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Bacillus halodurans # 6 429 2 427 431 356 43.0 4e-98 METAQNKIVKAALLGAGTVGSGVYQLVQERQEDFSHICGTQIQITKILVRDASKEREGIP SPLLTDNWNEIIEDDNISIIIEVMGGIEPAKSYLLEAMKAGKQIVTANKDLIAEHGHELL DSAEKYGCDFKFEAAVAGCVPIIQVLKQSMSSENITEVMGIVNGTTNYILTRMTQSGMSY DEALKEATDLGYAEADPTADVDGLDAGRKIAIMASIAFHSRVTFSDVYIEGIRNITAKDI FYAKEFNSVIKLVGIARKDEDGIEVKVLPILIPQDHPLATVNDSFNAVFVHGTASDDTMY YGRGAGKRPTASAVTGDLCTVARHIVEKHSNLHVCSCYKELPVKNIQDTYSRFFLRLQVA DRPGVLANITSVFGTSQVSIAQIIQKSRQNGNAELVIITDEVKELYFTDALNILKRLSIV NSISSLIRVY >gi|222441801|gb|ACEP01000141.1| GENE 8 8816 - 10471 1795 551 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028642|ref|ZP_03717834.1| ## NR: gi|225028642|ref|ZP_03717834.1| hypothetical protein EUBHAL_02921 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02921 [Eubacterium hallii DSM 3353] # 1 551 1 551 551 744 100.0 0 MKKRIYKNLLALTLACSVALSTAPVSSFARDVAAVSESEATTEAPKEKATEVEKKTEVSK DATTVTEKTEASEEKTTAAEKTEAPEEKTIAAEKTEAETSEEKTTSKKTKTSEKKATDPE CVKPDKPVVDGPTPEGKKDTDEATKPSTVTLHFNQSMPGHVLSVSKASSFNFGGYASDKK ATVKASYVGIAWNGKGYSSSTPSYKAGTYTETLYVTDAAYSATPISRTYTITRANTHIII NNPVDCTYDGSAFAVDATVYDDNNNEVTKANVTYIRVDGGKYIYLGFTAPVEAGAYIAIA TYPGDATHRPSAAKSFFIIHPKAVKVNIDYKEKYVGDKDPELTYKAEGVLDGDSLGLKLT REAGEEAGYYDIYPDTSSLNPNYKLAETPDGTDHFHILKRDSGETPNPGTTETPNPGTTE TPDPGTTETPDPGTTETPDPGTTETPDPGTTETPADDSSATTTETPADDSSATTTETPSV 
DPSENTDPTTKTTKKTTTQKKANTKKATTKKNKKSKSSKTGDASNPVLYMVVSVFALVMC AFIVLRRRTDR >gi|222441801|gb|ACEP01000141.1| GENE 9 11135 - 12769 1862 544 aa, chain - ## HITS:1 COG:CAC3428 KEGG:ns NR:ns ## COG: CAC3428 COG1151 # Protein_GI_number: 15896669 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 542 3 564 567 778 68.0 0 MENQMFCYQCQETAGCSGCTRVGVCGKQPDIAAMQDLLVYVTKGISAVTTTLRQEGVEIQ PAVNHMITLNLFTTITNANFSKEAIIARIQETLSEKDLLLSKLTNKDVLPEAALWNGSVE EFDTKAAAVGVLSTENEDIRSLRELITYGLKGLSAYSKHANVLLQDNEEIDAFMQRTLAA TLDDSLSVDDLIALTLETGKYGVEGMALLDNANTSAYGNPEITKVNIGVGSRPGILVSGH DLRDLEMLLEQTQGTDVDVYTHSEMLPAHYYPTFKKYPNFVGNYGNAWWKQKEEFESFHG PILMTTNCIVPPKDSYKDRLYTTGATGYPGCKHIPGEIGEEKDFSSIIEQAKHCTPPEEI EHGEIVGGFAHNQVLALADDIVAAVKSGAIKKFVVMAGCDGRMKSRNYYTDFAKALPKDT VILTAGCAKYKYNKLDLGDICGIPRVLDAGQCNDSYSLAVIALKLKEVFGLEDINELPIL YNISWYEQKAVIVLLALLYLGVKNIHLGPTLPAFLSPNVAKVLVNTFGIAGIGTVEEDLK LFGF >gi|222441801|gb|ACEP01000141.1| GENE 10 12981 - 13316 461 111 aa, chain - ## HITS:1 COG:SPy1581 KEGG:ns NR:ns ## COG: SPy1581 COG1917 # Protein_GI_number: 15675470 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Streptococcus pyogenes M1 GAS # 1 110 1 110 112 99 44.0 2e-21 MAYIKNISHEEVLHLTDQISADKGQIVSKTLAQNDYVSVTLFAFSKDEEIATHDSTGDAM VLVLEGTGQFTVDGKEYILQAGDTLVMPAKKPHSVYAKEDFKWLLTVVFPS >gi|222441801|gb|ACEP01000141.1| GENE 11 13408 - 14091 460 227 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 6 224 9 223 229 78 26.0 9e-15 MDFYKLSKTVLFQGATPEETEQMLSCLKAGKRKYKKEEMIYRVGEVVSCIGIVLSGSVLI ENDDIWGNRSVLERAGEGEIFAETYASIPDQKMLVNVIAAEDTEILFLNVGKMLHLCSNS 
CIFHNELIKNLLQVSAQKNLALSRRIFNTSSKSIRGRLLSYLSNQAVLCNSKEFDIPFNR QQLADYLSVDRSALSKELGKMSKEGLIYVKKSHFRLLSVQELFEEES >gi|222441801|gb|ACEP01000141.1| GENE 12 14540 - 15124 204 194 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|154684571|ref|YP_001419732.1| 50S ribosomal protein L25/general stress protein Ctc [Bacillus amyloliquefaciens FZB42] # 1 193 1 190 202 83 28 2e-15 MNTLSAEKRDMQIKAKKLRREGFVTGNVFGKNMEGSMPLKIEKKEAERILRTCNKGSQLI LTVEGQKMPVLLKEIDYNAMKHEIVEMDFQALVKGEKVHSVAEVVLLNHEKVKSGVVELL LEEISYEALPENLIEKVEIDLDGKTAGDTIRVKDLPIATAEGIHLKTNPEETVVQVTEVH NAPEEEETEEAAEE >gi|222441801|gb|ACEP01000141.1| GENE 13 15321 - 15686 280 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028647|ref|ZP_03717839.1| ## NR: gi|225028647|ref|ZP_03717839.1| hypothetical protein EUBHAL_02926 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02926 [Eubacterium hallii DSM 3353] # 1 121 1 121 121 231 100.0 2e-59 MELLKNLIHYLPYILLFAVATMLIYGWGLWRSMHQNQDLNNLLTSKGISRIKKELKRNGK MTKKELSDCVKDLTASQPFSKQRIGVTNPNQFLDSLLPYMERQEMISKSKQNGKVYYDLI H >gi|222441801|gb|ACEP01000141.1| GENE 14 15758 - 16126 261 122 aa, chain - ## HITS:1 COG:no KEGG:ELI_1761 NR:ns ## KEGG: ELI_1761 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 14 115 16 118 141 91 43.0 9e-18 MKGVPMAKQKIDKNYLDFIPVKNPEIKYETNDKGNITVFIRWEGFFNKIAQKFFHRPKVS SIDLDDYGSFVWNIIDDKKDIHTLSQELDAHFPKMEKSLSRLIKFLEILKDHHLITYSEK LK >gi|222441801|gb|ACEP01000141.1| GENE 15 16244 - 18145 2151 633 aa, chain - ## HITS:1 COG:VNG6268C KEGG:ns NR:ns ## COG: VNG6268C COG1297 # Protein_GI_number: 16120189 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Halobacterium sp. 
NRC-1 # 10 605 28 622 655 365 39.0 1e-100 MKENKNDFKPYIPADQVVPEFTVTALILGILLAVIFGAANAYLGLRVGMTVSASIPAAVL SMGIIRIILRKNSILENNLVQTIGSAGESVAAGAIFTLPALFLWAKEGKIDSPSILTIFL VALVGGILGVCFMVPLRQALIVEEHGVLPFPEGTACAEVLLAGEEGGNKAGIVFSGLGIA AIYKFIADGVKLFPSEIGYDIQAYAGSSVGIQVLPALAGVGYICGPQISKYMFAGGTLSW FVLMPMIALFGKDATIFPGSEVISTLAPGSLWGTYIKYIGAGAVAAGGIMSLIKTSPLIV RTFKQAIGSMAKNRATADASRTQRDLPMPIILGIIAVIAVAIWLLPIFPVSFLGAVLVVI FGFFFATVSARMVGLIGSSNNPVSGMAIATLIISTLILKATGTTGTTGMIGSICIGSIIC IVAAIAGDTSQDLKTGFIVGATPKLQQIGEMIGVIASSAAIGYVLYLLNAAWGFGSNEIP APQATMMKMLVEGIMNAELPWALILVGVFIAIVVEILGIPVLPFAVGMYLPFSLSAGIMA GGVVRWILERRKAANENEEKEKKACIERGTLFTSGLIAGEGLMGVILAICAVAKVDSKFV SPVALPQIASLVIFIILLAYLYFLCVKKNNKTN >gi|222441801|gb|ACEP01000141.1| GENE 16 18883 - 19335 557 150 aa, chain + ## HITS:1 COG:no KEGG:ELI_2290 NR:ns ## KEGG: ELI_2290 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 1 145 1 146 159 191 75.0 6e-48 MPKTKLQNVIFTIIMAFVMVYAMVCYNIAIDKGGMSNQIFLLAFHEMPIMWPIAAILEFF VVEKLSKFLAFRIVSPTDKPIFITLAICSMIVCLMCPVMSFIATALFKHPGSEIIAQWLQ TTVINFPMALCWQIFFAGPLVRNVFGKFVK >gi|222441801|gb|ACEP01000141.1| GENE 17 19480 - 21720 2056 746 aa, chain + ## HITS:1 COG:lin0529 KEGG:ns NR:ns ## COG: lin0529 COG0659 # Protein_GI_number: 16799604 # Func_class: P Inorganic ion transport and metabolism # Function: Sulfate permease and related transporters (MFS superfamily) # Organism: Listeria innocua # 1 544 1 545 553 443 40.0 1e-124 MKSSLFPTLKNYKKEYLGKDIAAGIMIAAVSIPISMGYAEVSGLPAVFGLYGSVLPILFF ALFSTSPQFIFGVDAAPAAIAGAALASLGIESGSADALRYIPVIALFVGLWLLLFYFLKA GKLVTFISTPVMGGFISGIALTIILMQIPKIMGGKSGSGELPELLKHLYEIAQHINWVSV ALGVGALAILIISKKVMPKFPMAVVIMALGMIATMVFHVDRYGVTLLAKVEPGLPKFILP DIFHVDLSHAAGRGLMIAVVVMAETLLSENNFAAKNGYKIDDNKEILACAAGNIISAFTG SCPVNGSISRTSMNEQFDGKTQAVSITAAVTMVAVLLFAAGFIGYLPVPVLTAIVISALM NVVEWHLAVRLFKVSRKEFYIFVAACMAVLFLGTIYGVIIGIILSFVAVVIRETNPPRAF LGGIPGRDGFYDLNKNHYAHPIENTVIYRFNASLFFANSKLFQEDIENHLKEDTKTVIVD AGTITNIDITAADTLLMLKNNLEKKGIAFYITEHMQGLNTQLRNLGMGSLIEEGCVRRTI 
TAALLDSGLQKPFPLEGVPADLQENLKELQENAEKALSSHKHNTKNIEKIKKTLWLHTLP AEEENTLEEFAWAFGEDTVNEIEKRVHQVISHLHHWPELEAISESGLARQMRAWHNLGVI DEDEVLRRMELHLDELSANLEEDKNKQLIFQMIEKRRQHLEMELKQENPEIWQKLVESRK HLERRLEKQNPEAAKRLHELEEKFLG >gi|222441801|gb|ACEP01000141.1| GENE 18 22222 - 23460 754 412 aa, chain + ## HITS:1 COG:CAC2596_1 KEGG:ns NR:ns ## COG: CAC2596_1 COG0665 # Protein_GI_number: 15895855 # Func_class: E Amino acid transport and metabolism # Function: Glycine/D-amino acid oxidases (deaminating) # Organism: Clostridium acetobutylicum # 3 410 2 398 399 288 36.0 2e-77 MSDSIWSDGGRIQMPQYEKLTRIKKVDVTIVGGGLCGLLCAYFLKEAGVECLLLEGSRIG SGTVRHAMAQVTSQHGLIYSRLLETLGEEKARMYFEANELALHKYRELAASIDCDFEERS SYIYSRTDKECIEREVRAASLLGMKAEFCETPELPFDTAGAVRFPNQAQMNPVKFLYGLV KDIEKSDKVTIFEEMPVDDWINGTTWSGLYITIPKNIICTTHFPFYKKAGGYNNKLYQNR SYIMALDHVPELHGMYADASDNRLALRGYKDMIFVGGATHRVGQEERDWNEFREKILSYY PEAIEREHWEVEDCVSLDGIPYIGPYSDKTPNMYVATGFNGWGMTSAMTAAMLLTDAIIN GQKTNGATESYPWGEVFYPERKIVRSQYFANVKENVRENIKSFLTRKSKIND >gi|222441801|gb|ACEP01000141.1| GENE 19 23957 - 25732 2110 591 aa, chain + ## HITS:1 COG:CAC2362 KEGG:ns NR:ns ## COG: CAC2362 COG0441 # Protein_GI_number: 15895629 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 15 588 62 629 637 809 66.0 0 MKVLEKNGQVIEGNFEEKEVKEAFWHTSAHVLAQAVKRLYPDTKCAIGPAIDNGFYYDFD FSFPFTEENLAAIEKEMKKIVKQSLPLEVFEVTKEKALDYMKERQEDYKVEMIEELPEGE TITFYKQGDYEEFCAGPHVSNTCVIKAIKLLSTAGAYWRGDEKNKMLTRIYGISFPKAAE LKDYLNMLEEAKKRDHRKLGKELGLFTIMEEGPGFPFFLPKGMILKNTLIDYWRKMHTRE GYEEISTPILLNRQLWETSGHWDHYRENMYTTVIDDMDFAVKPMNCPGGMLVYKMEPRSY KELPLRLGELGLVHRHEKSGQLHGLMRVRCFTQDDAHIFMTEAQIENEIQKVVKLIDEVY KKFGFTYHVELSTRPADSMGTEEEWEVATNALENAIKAMNIPYEVNEGDGAFYGPKLDFH LQDSIGRTWQCGTIQLDFQLPQRFEAEYIGADGEKHRPIMIHRVVFGSIERFIGILIEHY AGKFPTWLAPVQAKVLPVSDKFADYAKQVAEALKASGVRVEVDDRDEKLGYKIRQAQLQK VPYMLILGEKEQSAGTVSVRKRDLIDENDSPELGTMSVEEFAAIIKEESQY Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:25:48 2011 
Seq name: gi|222441800|gb|ACEP01000142.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont438.1, whole genome shotgun sequence Length of sequence - 4968 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 639 - 1058 540 ## gi|225028655|ref|ZP_03717847.1| hypothetical protein EUBHAL_02934 - Prom 1092 - 1151 4.9 - Term 1149 - 1203 13.9 2 2 Op 1 . - CDS 1224 - 1688 676 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 1709 - 1768 3.7 - Term 1700 - 1760 10.6 3 2 Op 2 . - CDS 1785 - 3338 1873 ## COG1069 Ribulose kinase - Prom 3377 - 3436 3.4 4 3 Op 1 . - CDS 3441 - 3878 798 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 3903 - 3962 1.8 5 3 Op 2 . - CDS 3969 - 4763 867 ## COG0647 Predicted sugar phosphatases of the HAD superfamily - Prom 4795 - 4854 2.3 Predicted protein(s) >gi|222441800|gb|ACEP01000142.1| GENE 1 639 - 1058 540 139 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028655|ref|ZP_03717847.1| ## NR: gi|225028655|ref|ZP_03717847.1| hypothetical protein EUBHAL_02934 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02934 [Eubacterium hallii DSM 3353] # 5 139 1 135 135 243 100.0 4e-63 MSRFMTIKAEEIAEAFETERRQYFVGNLKRPQHIPFVKSETTEMGLTVYDKFTGEPAHRH SVAKEYAYVISGRTQYMDLDTREIYEYHAGDFFVTFPGTTYVQKAEAGTKMIFVKEPSIN DKELVPMDEEAQKWMETEF >gi|222441800|gb|ACEP01000142.1| GENE 2 1224 - 1688 676 154 aa, chain - ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 4 145 1 141 149 157 58.0 1e-38 MSEVKVIGIGNDHAAVDLKNEIKAYLEEKGYKVVNYGTDSSESCDYPVYGAKVGNAVASG EVDAGVLICGTGIGISIAANKVNGVRAAVVSEPVSARLTKEHNNANIIAFGARIVGSEMA KAIVDAWLDAEYIGSGRHERRVNMIEEEAAKYNK >gi|222441800|gb|ACEP01000142.1| GENE 3 1785 - 3338 1873 517 aa, chain - ## HITS:1 COG:SMb20852 KEGG:ns NR:ns ## COG: SMb20852 
COG1069 # Protein_GI_number: 16264893 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Sinorhizobium meliloti # 5 509 4 496 509 263 30.0 5e-70 MAEYLMGIDAGTGSVRVALFDFRGQNRAYAVAEYGTTYPKNGWAEQSDYEWFEALKKAIP ECIKKAGIAADNIAAITCDATTNTLVYLDENDTSVRPPILWMDVRATEEASFIDTIREEY DATKFYKPGFRADTMVPKCMWVKKNEPENWAKTKTIMEFEDWLNWKLTGRKTVSMSVAAF RWNYDDKNGGYPVDFYNAVGLDDVVDKFPEPVLKVGEVVGNISAEAAEIFGLSKKTVVVE GTADCNACMFGVGGVRPNGMTLIGGTSTCLLGLSEEEFHVDGVNGTYPNCMYDGTSLLEG GQTAAGSILTWFKNNLLPAAWMQEAVNRNMNIYDLITEKASEVPIGCDGLVMMDYFQGNR APYADSKARGMFWGLSIGTTPAHLARAVYEGVAYGANHCIVSMKKAGYDVKEIYACGGLA QSDFWMQMHADIIGVPMYTTVESQSAGCLGDCIIAAVGVGIYPNFWEAADSMIRVDKEYI PNMEAHKEYQFYMDRYMETWPQMREIVHKTVDHNNKK >gi|222441800|gb|ACEP01000142.1| GENE 4 3441 - 3878 798 145 aa, chain - ## HITS:1 COG:PM1645 KEGG:ns NR:ns ## COG: PM1645 COG0698 # Protein_GI_number: 15603510 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Pasteurella multocida # 1 125 1 128 151 121 50.0 6e-28 MKIVFGCDPNATEFKLQLMEYVKGLGHEVADLGSDDPIYANTAIEVAKAVAAKEYDRGII VCGTGIGVSIAANKVKGAYAACIHDVYQAQRAQLSNHANIITMGSQVIGIELAKQLVKEY LSVEWDPNCRSVAKVQRIIDFENEE >gi|222441800|gb|ACEP01000142.1| GENE 5 3969 - 4763 867 264 aa, chain - ## HITS:1 COG:FN1255 KEGG:ns NR:ns ## COG: FN1255 COG0647 # Protein_GI_number: 19704590 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Fusobacterium nucleatum # 6 261 13 271 275 218 43.0 9e-57 MQKKEDALKGVKLFVLDMDGTVYLGNHMIDGALDFIHEVDASEDRDYIFFTNNASRVPSV YVEKLHKLGLDVDESKVVTAGDVCAEFLKVNYPGAKVYLNGTPVLEENWKEKGIHLVEED PDVAVQSFDTTLTYHKLDRICHYVRNGVPFIATHMDTNCPTEYGFMPDCGAMCSLITDST GVKPRFLGKPWKETVDMVAEITGYKAEEMAFVGDRLYTDVATGVNNGAKGFLVLTGEADM QTVAESDVEPTCIYDSLGEMRRYL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:26:09 2011 Seq name: gi|222441799|gb|ACEP01000143.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont439.1, whole genome shotgun 
sequence Length of sequence - 47540 bp Number of predicted genes - 41, with homology - 41 Number of transcription units - 24, operones - 13 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 62 - 121 6.7 1 1 Tu 1 . + CDS 157 - 666 658 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Prom 683 - 742 6.0 2 2 Op 1 . + CDS 938 - 1093 103 ## gi|225028661|ref|ZP_03717853.1| hypothetical protein EUBHAL_02940 + Term 1135 - 1166 -0.8 + Prom 1217 - 1276 4.7 3 2 Op 2 . + CDS 1384 - 2775 1805 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) + Term 2816 - 2885 28.5 + Prom 2908 - 2967 7.5 4 3 Tu 1 . + CDS 3096 - 4280 1059 ## COG0628 Predicted permease + Term 4321 - 4382 7.5 + Prom 4409 - 4468 8.2 5 4 Tu 1 . + CDS 4489 - 5511 1357 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases + Prom 5620 - 5679 11.5 6 5 Tu 1 . + CDS 5724 - 6386 671 ## COG0546 Predicted phosphatases 7 6 Tu 1 . + CDS 6905 - 7126 327 ## Cphy_1320 small acid-soluble spore protein alpha/beta type + Term 7175 - 7237 9.0 + Prom 7255 - 7314 9.1 8 7 Op 1 29/0.000 + CDS 7398 - 8717 2103 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 9 7 Op 2 24/0.000 + CDS 8787 - 9371 608 ## COG0740 Protease subunit of ATP-dependent Clp proteases + Term 9497 - 9545 11.1 + Prom 9399 - 9458 2.4 10 8 Op 1 18/0.000 + CDS 9573 - 10871 281 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Term 10893 - 10947 -0.9 + Prom 10894 - 10953 10.3 11 8 Op 2 4/0.000 + CDS 11143 - 13449 2685 ## COG0466 ATP-dependent Lon protease, bacterial type + Prom 13476 - 13535 8.0 12 8 Op 3 . + CDS 13593 - 14186 780 ## COG0218 Predicted GTPase + Term 14299 - 14339 4.8 + Prom 14306 - 14365 6.7 13 9 Tu 1 . + CDS 14397 - 15356 1039 ## COG1893 Ketopantoate reductase + Prom 15612 - 15671 4.9 14 10 Tu 1 . 
+ CDS 15698 - 16147 493 ## Cphy_0479 stage V sporulation protein AC 15 11 Op 1 40/0.000 - CDS 16182 - 17330 811 ## COG0642 Signal transduction histidine kinase 16 11 Op 2 . - CDS 17327 - 18004 341 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 18112 - 18171 7.4 + Prom 18046 - 18105 6.9 17 12 Op 1 . + CDS 18157 - 18297 109 ## gi|225028679|ref|ZP_03717871.1| hypothetical protein EUBHAL_02958 18 12 Op 2 . + CDS 18290 - 19162 347 ## COG0348 Polyferredoxin 19 12 Op 3 . + CDS 19179 - 20072 1057 ## EUBREC_2028 alkyl hydroperoxide reductase + Prom 20093 - 20152 9.9 20 13 Op 1 . + CDS 20203 - 21072 676 ## gi|225028682|ref|ZP_03717874.1| hypothetical protein EUBHAL_02961 21 13 Op 2 . + CDS 21098 - 21859 784 ## COG0101 Pseudouridylate synthase + Term 21927 - 21970 0.1 + Prom 22074 - 22133 3.2 22 14 Op 1 . + CDS 22186 - 23202 862 ## Cphy_0480 stage V sporulation protein AD 23 14 Op 2 . + CDS 23212 - 23568 375 ## EUBREC_2431 sporulation protein + Term 23697 - 23735 -0.7 24 15 Op 1 . - CDS 23578 - 24576 709 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 25 15 Op 2 . - CDS 24576 - 25868 982 ## COG4936 Predicted sensor domain - Prom 26060 - 26119 6.9 - Term 25966 - 26008 2.7 26 16 Tu 1 . - CDS 26159 - 28690 3181 ## COG1882 Pyruvate-formate lyase - Prom 28875 - 28934 9.4 - Term 28915 - 28966 2.3 27 17 Op 1 . - CDS 28998 - 31682 1940 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein) 28 17 Op 2 . - CDS 31712 - 32077 207 ## PROTEIN SUPPORTED gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 - Prom 32202 - 32261 7.7 + Prom 32143 - 32202 9.0 29 18 Op 1 . + CDS 32435 - 33316 942 ## COG1376 Uncharacterized protein conserved in bacteria + Prom 33319 - 33378 8.1 30 18 Op 2 . + CDS 33406 - 35817 2937 ## COG0495 Leucyl-tRNA synthetase + Term 35956 - 35980 -1.0 + Prom 36233 - 36292 9.0 31 19 Op 1 . 
+ CDS 36334 - 36540 203 ## Blon_1307 transposase IS200-family protein + Prom 36610 - 36669 3.5 32 19 Op 2 . + CDS 36703 - 37806 729 ## COG0675 Transposase and inactivated derivatives + Prom 37958 - 38017 4.1 33 20 Tu 1 . + CDS 38054 - 38500 240 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 34 21 Op 1 1/0.000 - CDS 38673 - 39215 542 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 35 21 Op 2 22/0.000 - CDS 39298 - 39672 316 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 36 21 Op 3 . - CDS 39672 - 40109 527 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit - Prom 40266 - 40325 7.3 - Term 40626 - 40656 -0.9 37 22 Tu 1 . - CDS 40711 - 41334 353 ## COG2323 Predicted membrane protein - Prom 41362 - 41421 1.7 + Prom 41969 - 42028 3.8 38 23 Op 1 . + CDS 42175 - 43695 1139 ## COG0169 Shikimate 5-dehydrogenase + Prom 43700 - 43759 2.9 39 23 Op 2 3/0.000 + CDS 43790 - 44500 213 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 40 23 Op 3 . + CDS 44500 - 45525 1169 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases + Term 45639 - 45679 9.9 + Prom 45632 - 45691 8.7 41 24 Tu 1 . 
+ CDS 45830 - 47227 1002 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 Predicted protein(s) >gi|222441799|gb|ACEP01000143.1| GENE 1 157 - 666 658 169 aa, chain + ## HITS:1 COG:lin0387 KEGG:ns NR:ns ## COG: lin0387 COG0494 # Protein_GI_number: 16799464 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Listeria innocua # 1 168 1 162 169 84 33.0 9e-17 MEFWDIYDENKQLTGRMMKRNDWCLKDGEYHLTVLGVIARPDGTFLITKRVMTKAWAPGW WEVSGGGVQAGESSEEAVQREVKEETGLDVRNAEGGYLFTYKRENPGEGDNYFVDVYRFV MDIDENDVSFQEAEIDGYMFATKEQIEGFAAEGKFLHYDSIKQAFSFEK >gi|222441799|gb|ACEP01000143.1| GENE 2 938 - 1093 103 51 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028661|ref|ZP_03717853.1| ## NR: gi|225028661|ref|ZP_03717853.1| hypothetical protein EUBHAL_02940 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02940 [Eubacterium hallii DSM 3353] # 1 51 1 51 51 95 100.0 1e-18 MANGFKIDLRLDSGKEVKYMDSCDSAGRSMICGGSKKILTWLNVDFIFSCA >gi|222441799|gb|ACEP01000143.1| GENE 3 1384 - 2775 1805 463 aa, chain + ## HITS:1 COG:SA0867 KEGG:ns NR:ns ## COG: SA0867 COG2239 # Protein_GI_number: 15926597 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Staphylococcus aureus N315 # 25 460 31 461 461 243 32.0 5e-64 MEKMLKKPDYAKEILAIMHGSFSKEEMLDRIADYHENDIAEAFELLSESERKSWYQMFGP EQIAEIFSYIDDPDSYLKELPLDEAAKVLSFMDSDDAVDILDEMDHSTQEKLVGLLDEES GHDIKMILSYEDDEMGSKMTTNFIVIHNNLTIRQAMRELVQQAGENDNISTVYVLDENDK FYGAIDLKDLIVARQEDNLEDIISTSYPYVNDHEMIDDCIERIKDYAEDSIPVLTEDNLL IGVITAMDIIEAVDEEMGEDYAKLAGLTAEEDLKETTIESMKKRLPWLIILLFLGMGVSS VVGVFETVVAVLPVVMCFQSLILDMAGNVGTQSLAVTIRVLMDETLSTKQKLGLVVKEMR IGLCNGLFLGVLAFVFIGFYIWLLKHNPVAHAFVISGCVGFSLMTAMVISSMVGTLVPMF FHKIKIDPAVASGPLITTINDLVAVVTYYGLAWLLLIHVLHMV >gi|222441799|gb|ACEP01000143.1| GENE 4 3096 - 4280 1059 394 aa, chain + ## HITS:1 COG:FN0345 KEGG:ns NR:ns ## 
COG: FN0345 COG0628 # Protein_GI_number: 19703688 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Fusobacterium nucleatum # 84 363 38 315 331 106 27.0 7e-23 MKRYLKIGIIGAAILASGILCAFVLFKMPVIISVLKGITEILKPFLYGVVFAYLLAPLCN KIEEKLFQIFPKAKTKARRFICFIAIVISLCVAIAVIWLIIMMIIPQVWDSVMKIIQMVP QKLIVVNNWIEHMLENQPELQAYFEEFSSQAESNIDSLLNVDTIQKVQSIINSLSVQLFG VLGVVKNIFLGLLISAYLLGSRKLFGAQAGLILHGVFSDKWAKIIEEEIRYTDKMFNGFL VGKIIDSAIIGLLCFAGTSIMGFEAPAFISVIIGITNIIPFFGPFIGAIPCGLLLLLENP MHCLYFIIFIFVLQQLDGNVIGPKILGNTTGVSSFWVLFAILLFGGMWGVVGMVIGVPLF AVIYDIIRKLVYRGLRKHKRESMITDYEEKYHKS >gi|222441799|gb|ACEP01000143.1| GENE 5 4489 - 5511 1357 340 aa, chain + ## HITS:1 COG:all0170 KEGG:ns NR:ns ## COG: all0170 COG1304 # Protein_GI_number: 17227666 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Nostoc sp. PCC 7120 # 182 337 212 364 365 119 42.0 1e-26 MTYEEVMKKAKSEIGPYCKVCPQCNGKACKNQMPGPGAKGVGDVAIRNYDAWKDIRINMD TICENVTPDTSIELFGEKFDYPFFAGPVGAMKLHYGDKYDDLTYNDILVSACAANGIAAF TGDGTNPDVFKAATAAIKKNHGQGIPTVKPWNIETIREKMDMVQDCGAKMVAMDIDAAGL PFLKNLNPPAGSKTVEQLGEIVNAAGIPFIVKGIMTVAGAKKAFDAGASAIVVSNHGGRV LDQTPATAEVLERIVEWNQGRMKIFVDGGIRQGTDIFKALAMGADAVLIARPFVQAVYGG AEEGVKLYIEKLAAELSDTMAMCGAASIKDISRSMLYRPM >gi|222441799|gb|ACEP01000143.1| GENE 6 5724 - 6386 671 220 aa, chain + ## HITS:1 COG:BB0676 KEGG:ns NR:ns ## COG: BB0676 COG0546 # Protein_GI_number: 15595021 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Borrelia burgdorferi # 3 217 2 219 220 112 34.0 5e-25 MNKKDTVVFDLDGTLLDTLEDLKNSVNYAMTQCGYPEHTLDEVRRYVGNGILLLMQRAIP GGKDDPNFDKAFTLFKEHYGKHCNDTTKSYAGIMELLHTLKENNIKIAIVSNKADFAVKE LNAIYFKDLIPVAIGEKESEGIRKKPAPDTVIEALKQLGSTAENSVYIGDSDVDIETARN SGMDEILCDWGFRGEEFLKQHGAKIIIKKPDEILSLIGIE >gi|222441799|gb|ACEP01000143.1| GENE 7 6905 - 7126 327 73 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1320 NR:ns ## KEGG: Cphy_1320 # Name: 
not_defined # Def: small acid-soluble spore protein alpha/beta type # Organism: C.phytofermentans # Pathway: not_defined # 7 71 2 66 66 76 67.0 4e-13 MASNKNSSSNRVEVPEARDAMDRFKMEVANELGVNLKQGYNGDITARDAGSIGGEMVRKM IKRQEEEMAGQSR >gi|222441799|gb|ACEP01000143.1| GENE 8 7398 - 8717 2103 439 aa, chain + ## HITS:1 COG:BH3053 KEGG:ns NR:ns ## COG: BH3053 COG0544 # Protein_GI_number: 15615615 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus halodurans # 1 425 1 427 431 319 48.0 8e-87 MSVQVEKLEHNMAKLTIEVEASKFDAAMKKAYNKKKGSFNLPGFRKGKVPMYIIEKEYGA GVFYEDAANELMPEAYAAALEESGLEVVSRPEVDVTQIGKGQNFIFTAEVAVKPEVKLGE YKGLDVQMDSVEVADEEVEAKLKEAQEQNAREITVEDRAVQDGDIITLNYAGTVDGVAFD GGTAENQQLVIGSHTFIDNFEEQLIGVEIGGEKDVEVTFPEEYHAPDLAGKAAVFHVEVL GIKEKQLPEIDDDFAQDTTEFDTLEEYKNDIKEKLATAKEEQAKSAAENALIEQIVESSE MDIPDAMVDFQVDQMVDEFKQRLSYQGLSLEQYVQFSGQNMDALRENMKGDALKRVQGSL VLEAIVAAENIEANEEDLKKEFEKMAQMYQMEASQIEGMIPEEQKENMKKDIAVQKAVEF VLNQANVTEATEELDFEEK >gi|222441799|gb|ACEP01000143.1| GENE 9 8787 - 9371 608 194 aa, chain + ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 192 1 192 193 287 75.0 1e-77 MSLVPYVIEQTSHGERSYDIYSRLLKERIIFLGEEVNDTTASLIVSQLLFLESEDPEKDI SLYIMSPGGSVTAGMAIYDTMQFVKCDVSTICMGMAASMGAFLLAGGAKGKRLALPNSEI MIHQPSGGAQGQATEIEIVAEHILHTKEKLNRILAENTGQPYEKIVQDTERDNYMTAEQA KAYGLIDAVVEKRQ >gi|222441799|gb|ACEP01000143.1| GENE 10 9573 - 10871 281 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 155 399 258 448 466 112 32 3e-24 MATRSDNNKQFRCSFCNKTQDQVRKLVAGPKGVYICDECIEVCTEIMEDEFENFNEDTQE 
INLMKPKEMKAFLDDYVIGQDEAKKVLSVAVYNHYKRILAGGSYDVELQKSNIIMLGPTG SGKTLLAQTLARQLNVPFAIADATALTEAGYVGEDVENILLKLIQAADYDIERAQTGIIY IDEIDKITRKSENTSITRDVSGEGVQQALLKILEGTVASVPPQGGRKHPHQEFLQIDTTN ILFICGGAFDGLEKIIENRIGKKSIGFQAEIMEQRKKDVGVLLKQVLPEDFVKFGLIPEF IGRVPVNVSLNPLDKDALVKILTEPKSALVKQYQKLFEMDGVELKFTDDALEAIAEKAIA RNTGARGLRSIMESVVMDLMYTIPSDDLVESCTITKETVDGSGEPELTYHDQPVTKKTVT RKRQKKNNNEIA >gi|222441799|gb|ACEP01000143.1| GENE 11 11143 - 13449 2685 768 aa, chain + ## HITS:1 COG:BS_lonA KEGG:ns NR:ns ## COG: BS_lonA COG0466 # Protein_GI_number: 16079872 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus subtilis # 3 766 6 769 774 677 48.0 0 MEKQILPLLALRGKMVYPNTSVYFEVSRPKSMAALEQAVNHEQRIFLVNQIDPSLDKPEE EDLYTVGTVAKILQMVKAGQGVLRVFVEGEARARITSYIEFDGCVKAEVEEIPDTNYPEN PVEEEAFFRMLEEEAQEFSEKNPGFFAPQLQKAIDEKELLLLINELASQLPFELGKKQQI LEESDVKKQVEMLLAILKEEVEISIIRNELAEKVKKNVDKNQKDYLLREQQKVIREELGE DDIVSEADEYLEKCEALKASKEVKDKIKKEIKRYKKMPPVAAESTMTKNYIETMLEMPWN KVSRDSKSLEKAMEILEEDHYGLKKVKERILDYLAVRFLTKKGETPILCLVGPPGTGKTS IAKSIARALHKKYVRISLGGVRDEAEIRGHRKTYVGAMPGRIAEGIRQAGVKNPLMLLDE VDKVSSDYKGDTASALLEVLDGEQNVNFRDHYLEVPLDLSEVFFIATANDLTPIPKPLRD RMEIIELSSYTENEKFHIAENYLVKKQRKLSGLTKKQVSISDEVIREVIHFYTREAGVRN LERTIGQLMHKAARKIVEGEEKVDIKKKSLKELLGPAPFEDEDENLKPQIGVVRGLAWTS VGGDTLSIEVNVMPGKGKMELTGCMGDVMKESAQIGLSYVRSIAENYGIDSGYFEKHDIH LHIPEGAVPKDGPSAGITMALAILSAITGIPVRGDIAMTGEITLRGKVLPIGGLKEKLLA AKTVGVKEVLVPARNKRNVVELDEEITDGMKITYVEDMNEVIEHAMKK >gi|222441799|gb|ACEP01000143.1| GENE 12 13593 - 14186 780 197 aa, chain + ## HITS:1 COG:SP1568 KEGG:ns NR:ns ## COG: SP1568 COG0218 # Protein_GI_number: 15901411 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Streptococcus pneumoniae TIGR4 # 16 191 18 193 195 189 53.0 2e-48 MVIKNVNLETVCGITSKLPKNTLPEVAFAGKSNVGKSSLINALVNRKSLARTSGQPGKTQ TINFYNVNEEMYIVDLPGYGYAKVSKEVAAKWGPMIENYLHTSEQLRMVFLLIDLRHKPT 
NNDVQMYRWILSNGFSPVIVATKADKIKRSQLQKQLKLLKDTLKVIEGVPIVPFSAVTKA GRDEIWELIEEYALAEE >gi|222441799|gb|ACEP01000143.1| GENE 13 14397 - 15356 1039 319 aa, chain + ## HITS:1 COG:CAC1605 KEGG:ns NR:ns ## COG: CAC1605 COG1893 # Protein_GI_number: 15894883 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate reductase # Organism: Clostridium acetobutylicum # 2 299 3 301 301 146 32.0 7e-35 MRIKSVAVLGAGAVGSYIIWGLSNRKDIKLGVIAEGERKERLEKNGCLINEKVYHPEVWT PSEANGVDLLIVSLKYGALPGAVDSIKEAVGDNTIIMSLMNGVDSEEIISEKVDASHILH SIIKVASHKEGTGYCFDPETTIGIIFGEISAPFESERVEEIKELFADTGLHFRATEYIRE EMWSKYRLNICNNLPQAILGAGVGCYSDSTHMKAISDGLKKEVEAVAAAKGIDMSKVDIK SARGSAVPPTARYSTLQDIDAGRHTEIDMFSGALVRMGKELGVPTPYNEYTYHMIKALEE KNDGLFDYDGKNAETTCAK >gi|222441799|gb|ACEP01000143.1| GENE 14 15698 - 16147 493 149 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0479 NR:ns ## KEGG: Cphy_0479 # Name: not_defined # Def: stage V sporulation protein AC # Organism: C.phytofermentans # Pathway: not_defined # 10 147 26 162 167 153 57.0 3e-36 MKKEFSNQEYAEKVKKLTPKFSSFSNCWHAFVSGGAICVLGELIRQIAYNRVGVSEENSY IVVSVCLVLISVVLTGFQLYEPLAKWCGAGTLVPITGFANSVASPAIEYQKEGQVFGIGV KIFTIAGPVILYGVFTSWVIGCVYWIFFL >gi|222441799|gb|ACEP01000143.1| GENE 15 16182 - 17330 811 382 aa, chain - ## HITS:1 COG:L0128 KEGG:ns NR:ns ## COG: L0128 COG0642 # Protein_GI_number: 15672986 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Lactococcus lactis # 143 380 200 435 441 145 34.0 2e-34 MKKMSLQWRLTIITTLLIAMICGSLTIFIYKNGVYYIDSLQNTVDAKSEDNNEKNPDEIY ISIPDEEWNNFAKNFSIQVYNNKADYKKSSLLFSTLLSLLGGVITFFISGHALKPLCDFS KKIEEVQAQNLSDSRIEENKFSELNQLSVSYNKMLERLSEAFKLQRQFTANAAHELRTPL AVMQLQIDLYNSSKHPNNDTSAQQTISMITEQTERLSKMVRTLLDMSELQTIARDEEIAI SALVEEVLADLEPLAQEKGINLIEKCDNVLLMGSDILIYRLVYNLVENAIKYNFSGGTVT VNATQQNSQLHLTVEDTGNGIPEELKERIFEPFFRLDKSRSRELGGVGLGLALVREIVRV HNGSILVKNNANSGTTFEVIFP >gi|222441799|gb|ACEP01000143.1| GENE 16 17327 - 18004 341 225 aa, chain - ## HITS:1 COG:SMc00458 KEGG:ns NR:ns ## COG: SMc00458 
COG0745 # Protein_GI_number: 15964767 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Sinorhizobium meliloti # 1 220 1 216 223 164 41.0 1e-40 MRLLIVEDEKELCDTISKSLYESGYEVDTCYDGYEALDYILTEDYDLIVLDLNLPGMDGM DILRELRKENEETKVLILSARSQIIDKVEGLDAGANDYMEKPFHIQELEARIRSLTRRKF VQKDICLHCKDIKFDTKKREAYAKGILVPLTRKENGILEYLLLNQGRPVSQEELIEHVWD ASIDSFSGSIRVHMSSLRKKFKSVLGYDPIVNKIGEGYKLVEGVS >gi|222441799|gb|ACEP01000143.1| GENE 17 18157 - 18297 109 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028679|ref|ZP_03717871.1| ## NR: gi|225028679|ref|ZP_03717871.1| hypothetical protein EUBHAL_02958 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02958 [Eubacterium hallii DSM 3353] # 1 46 9 54 54 65 97.0 1e-09 MKCRKISQFILLIAGVSMMSYGAVRGEAAVVLGKAIKLCLECVGIG >gi|222441799|gb|ACEP01000143.1| GENE 18 18290 - 19162 347 290 aa, chain + ## HITS:1 COG:MJ0750 KEGG:ns NR:ns ## COG: MJ0750 COG0348 # Protein_GI_number: 15668931 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Methanococcus jannaschii # 89 284 53 232 238 86 29.0 5e-17 MDKKTSLISKKIASLRGGIQAAATLLTNIHLPNLFKGKIYQGNAKTVCVPGLNCYSCPAA TGACPIGAFQAVVGSSQFKFTYYITGFFILLGVMLGRFICGFLCPFGWFQDLLHKIPGKK FSTEKLRLLRYLKYMILFVFVIFLPIFITNSIGMGDPFFCKYICPQGVLEGAIPLSLGNT AIRSALGKLFSFKLIILIMVILLSILFYRPFCKWICPLGAIYSLFNKISFLSIKVEDDKC VGCHQCTKVCKMDVDVIKSPNHPECIRCGACMKACPKDAIHYQFIKFNDK >gi|222441799|gb|ACEP01000143.1| GENE 19 19179 - 20072 1057 297 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2028 NR:ns ## KEGG: EUBREC_2028 # Name: not_defined # Def: alkyl hydroperoxide reductase # Organism: E.rectale # Pathway: not_defined # 1 297 1 300 300 333 65.0 5e-90 MKTRRTIIMILTLCVALSFTACTGKGDVGKADSSSVSEGTEKTEDLQAKLDDLYQQENEI FEKHKDVWEKVFSKMSKADAGSDNYAEYLASTVESNKNSFTDDELKTLKEDIETIRGIEE QITEVDNKLDTSGASDSKDDDTIAFNNVSGKDFDGNDVEENLFSKNAVTVMNFWFTGCKP 
CVAELSKLNELNDAIKSMGGEVVGVNTDTFDGKEATIKEAKKILESQGAKYRNFALDANS DAGKYASEIMAFPTTILVDRKGNIVGEPMVGGIDNQENYDTLMKRIKSVIEADSANK >gi|222441799|gb|ACEP01000143.1| GENE 20 20203 - 21072 676 289 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028682|ref|ZP_03717874.1| ## NR: gi|225028682|ref|ZP_03717874.1| hypothetical protein EUBHAL_02961 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_02961 [Eubacterium hallii DSM 3353] # 1 289 1 289 289 539 100.0 1e-152 MEDRNQLDRIFSYIDEQKNMGKTVAEEDIEEKNYKAQELWKEKVWRGYAELKKAGFKGGD SLFLIAAYFAGKPDKTITPLLRRLQMVMKEREDLISGGMLAASYYGMEELSMRIPVLEEG VRNLYADQKDIEALTGSIMIADGGPAEVAKAIQWYMFFVKNGFDVKKRQMARVIGLLAVI SSSPVMVGRELMNRTNESIGRYENEQKDKNYMQDTFCEQVCTYIKQLQRKEQEKAKKLGK TSYRMLTGEKNVTVVDYTQEEEVSLNGSNMLVGMEQEVGLILSAIHMGV >gi|222441799|gb|ACEP01000143.1| GENE 21 21098 - 21859 784 253 aa, chain + ## HITS:1 COG:BH0167 KEGG:ns NR:ns ## COG: BH0167 COG0101 # Protein_GI_number: 15612730 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus halodurans # 3 246 2 238 263 174 40.0 2e-43 MKKNYKMTICYDGSRYYGWEHQPDRDTIQGKLENVLTRMCGEFIDVLGAGRTDAGVHARG MVANAVFDATDKERGGRERTPDEIRDYMNRYLPDDIAVKEVREASERFHARYKAVGKLYR YTCYDGPVKPVFQRKYVAVLEKRPDVEKMQQAAAYLEGKHDYKSFCGNPRMKKSTVRVVD KIEITRKGSFIYFDFHGTGFLQNMVRILVGTLLEVGKGKIKPEQIPEILEAKNRQMAGPT APAQGLCLIKVDY >gi|222441799|gb|ACEP01000143.1| GENE 22 22186 - 23202 862 338 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0480 NR:ns ## KEGG: Cphy_0480 # Name: not_defined # Def: stage V sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 3 335 10 354 355 393 56.0 1e-108 MKQMQGKQSILFQNPVKILSHACVGGKKEGEGPIGKHLDLIVEDPMFGKENWEESESCFL KTAGEIALRKGKKKKKDVRMAFCGDLLGQLIASSFGIAELEIPYYGVYGACSSIGAALSI GAMAVNGGFADLVLSGCSSHFASAEKEFRFPLGYGSQRPYSATWTVTGAGAFLLNEDKGD VQIRGITTGKIKDYGVQDPFNMGACMAPAAADTIYQSFIDFEMESVDFDRIITGDLGAVG QKLLFELLDENHKDIKKNHMDCGMQIFDEESQDTHAGGSGCACSALMLSAVILPKLASGE WKRVLFVPTGALLSQVSANEGRTIPAIAHAVWLESGKE >gi|222441799|gb|ACEP01000143.1| GENE 23 
23212 - 23568 375 118 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2431 NR:ns ## KEGG: EUBREC_2431 # Name: not_defined # Def: sporulation protein # Organism: E.rectale # Pathway: not_defined # 1 117 22 138 139 138 66.0 6e-32 MIYLKAFVVGGLICTLVQIFLDNTKLMPGRVMVLLVIAGVILGAAGLYEPLIKWAGCGAS VPLSGFGYNLSKGVIEAVKEEGAIGLFKGGLKAAAAGTSSALIFSYIASLIFQPKMKR >gi|222441799|gb|ACEP01000143.1| GENE 24 23578 - 24576 709 332 aa, chain - ## HITS:1 COG:BH3842 KEGG:ns NR:ns ## COG: BH3842 COG4753 # Protein_GI_number: 15616404 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 140 1 132 530 92 37.0 9e-19 MYNVLIVDDDTLMRNALRAMISRFKNFQIIGEAANGAEAIAICRLYPVHLIFMDIIMPGM TGIEAGKKIKEDSPQTEICILSAYSDFHFAKEAMELHIKKYFSKPVSFLDISTYLENFSP STYKSVCPQLSQALELANSQSFSETYAGLSSIINEIFSSQNASIESLKKTFIEIGHGLLD SLEYADQRNEITDLFPLPDTWLVNGEIMKIWLFKIMDYIYQQKSIRRYSILEGVFLFIEQ HIKEKISLGQITEGSMISQGYLSRIFRDQFGISTMEYIRLRKIQMVKINFIFTTHNTSDV AFMLGFNESSYLQKVFQKIEGISIQEFKKRLK >gi|222441799|gb|ACEP01000143.1| GENE 25 24576 - 25868 982 430 aa, chain - ## HITS:1 COG:STM2036_1 KEGG:ns NR:ns ## COG: STM2036_1 COG4936 # Protein_GI_number: 16765366 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Predicted sensor domain # Organism: Salmonella typhimurium LT2 # 32 192 10 166 199 75 26.0 2e-13 MYTEKDLFSNNDEQNRGNFDTDVGILELFGKEPLENLQSIISKVTGLGFVTADFRGEPLT CMTGFTPFCKFTRERGNNGKLCNLSDAFGIVRSAITRKYCIYFCPCGLMEVAIPIIVHGK FLGGFLSGQVRCNDAPDDIPRFSGLIQPEYLEEDQAQVDEYFAQLPTYEYQKFSDIAELI YLMVNQLAENEMLRQQQYDSHKIQEQKLTSKINELEYENSSLTKELNYMQAKLTPHFILS LLTDIANLSVLENAPKTNQMAVALAQYLKYSLCTTGDNILLSDELKQIESYFNMLKIKYE DAFTYSITAKKEIDMLRIPSHILFPFLDRAASLITVNTGHLDIAISISCENGYTTIDIVS DYNPDNEESNPASSFKNFPDNDTFRSLISNIRQRLYSIFGNNYEIKESYQNNSKIHDIIK LPISHEERIM >gi|222441799|gb|ACEP01000143.1| GENE 26 26159 - 28690 3181 843 aa, chain - ## HITS:1 COG:pflD KEGG:ns NR:ns ## COG: pflD COG1882 # Protein_GI_number: 16131789 # 
Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Escherichia coli K12 # 10 843 2 765 765 564 37.0 1e-160 MLAKGFSKPTDRVERLRSMIVDAVPCIEAERAVLITESYQATEGLPMIMRRAKALENILN NLTVTIRDDELVVGTLTKAVRSCQLFPENSYKWVMDEFDTIETRMADPFKISEEDKATLR KVLPYWEDKTISDLASSYMSEKTKECIANGVFTVGNYFFGGVGHIIIDYDKAIRRGYKAI IQDAVEALESFDCNDPDFIQKTQFCKAIITVLSAAINFAKRYSDKAKEMAAVETNPTRKA ELLQIAANCEKVPANGATNFYEACQSFIITQMILQVESSGHSESPGRFDQYMYPYLEKDL ASGAITKDFAQELLDCVFVKLNDLNKVRDQISAQAFAGFQVFQNIGAGGQTEDGMDATNE LSYMMMETVAHLRLSAPSFSIRVWQGTPDEFLYRACELARLGYGLPAMYNDEVIIPALTN RGISLHDARGYGLIGCVEPSVPGKEQGWHDAAFVNVAKILEITINNGRIGDLQIGPKTGE VDTFKTLEDFMQAFQKQIEYFVYYVAEADNCVDYAHMERGELPFLSSFVADCISDAKGIC AGGAKYNFTGPQAFGVADSGDSVYAIKKHVFDDKDITFAELKEAMDANFGYPVDGEVAPC AASAETEIEKDLYDQICKILGKEGININKSAATAAPACGSNNEKYERIRAMMDATDCFGN DIDEVDMIARRCAQMYCYEVEKYRNPRGGQFQAGIYPVSANVLFGKDVGALPDGRLAKKP LADGCSPRAGKDVKGPTAAAASVAKLDHEVASNGTLYNQKFLPSAVAGDQGLMNFAAVIR SYFEKKGMHVQFNVVDKETLLDAQAHPENYKDLVVRVAGYSAMFTELAKEVQDDIISRTE QHF >gi|222441799|gb|ACEP01000143.1| GENE 27 28998 - 31682 1940 894 aa, chain - ## HITS:1 COG:CAC2301 KEGG:ns NR:ns ## COG: CAC2301 COG0744 # Protein_GI_number: 15895568 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Clostridium acetobutylicum # 35 723 39 663 809 259 32.0 1e-68 MNYNEPSIKKKREELSSPKKRRRTTASAVAFRVILFLICAICVSGAGLAYGSFKGIIASA PKNVSLSPKYSSTIIYDDNNKAVQPLHDYSSNRIKVKSEELPDNLKNAFIAIEDERFYEH DGVDFKGILRALWTDLRNGSTSQGASTLTQQLIKNNVFEAGGESNIIAKIKRKIQEQYLA INAEKTKPKEEILTDYLNTINLGKGNLGVETASKYYFGKSAKNLTLSECAVLAGITKNPT YLNPVDYPEANDARRKLILKKMYQLNYISAKEYQQASEDNVYAKIKSHTAKGNKNSVYSY FTDALITQIVADLQEQKGYTQSQAYQLVYRGGLKIYSTQNTKLQKIADSVINNPDNYPVA TKFSLEYKLQVTHADSSSSTYTEINVASYFRKKKKDPTYKTIYNSKKEMKAAVKAFRKSV MKKGDKKKAESIHYSMEPQLSYSLIDQKTGQVKVLVGGRGQKQDDLALNRATSVKRQPGS TFKILSTYAPAIDTGSMTLATVFDDAPYQYENGQKVTNYVSTSYKGLTTVRDAIIDSNNI VAVKALTDLTPQVGFDFLQKLGFTTLVNNRVSTNGTYESDVNQSLALGGITDGVTNVELT 
AAYAAIANQGTYNTPVLYTTVKDSNGNVLLSNKKKSRKVIKKSTAWLLTNAMEDVVKKGT GRAAQLDSDMAVAAKTGTTSNNYDYWFCGYTPYYTASVWTGYDYNTSFDNDEDYHKVIWK KIMDRIISEKKQKVKSFPSNKNIKKAEICIKSGKKALPNVCSKDPEKSMVRTEYFASGTV PKDSCDAHIAVTFCLKSHLVAQKFCPDKFRYTKIFRVRPKHSSHKTDDEPYFLNIDINNK CNIHTEEWHQKKLEKKKKKQEEKLKKQQKQQQSGNDTTDTSINNIEKQIKKLLN >gi|222441799|gb|ACEP01000143.1| GENE 28 31712 - 32077 207 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148984704|ref|ZP_01817972.1| 50S ribosomal protein L20 [Streptococcus pneumoniae SP3-BS71] # 6 118 7 123 126 84 39 1e-15 MIEFNHFNFNVLNLEKSIAFYKEAIGLSVVREKDSSDGSFKIVYLGDGRTGFTLELTWMR DRKEPYNLGDEEFHLAFKTDEYDAFHKKHEEMGCICYENPSMGIYFINDPDGYWIEIVPV R >gi|222441799|gb|ACEP01000143.1| GENE 29 32435 - 33316 942 293 aa, chain + ## HITS:1 COG:CAC2631 KEGG:ns NR:ns ## COG: CAC2631 COG1376 # Protein_GI_number: 15895889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 18 154 97 228 229 76 39.0 5e-14 MKKKRMLLLAAMFVFTFAFIRFNTVDVQAASKYTIYVNRRTNLVNVINSKTGKLVKAMYC STGRNYRTIRGTYNTTAKYKWRPLIHGVYGQYSTRIHGAYLFHSVPYYSINKSQVSTKEY NKLGQQASAGCIRLAVTDAKWIYDHCRLGTKVVIGEGRTLKKPTRPKVRVSTKKRAGWDP TDPDSRNPYRPKLTLKKKATKTIVYGSTFNIKNMVNVSSSYASSDALLKSMKVKGKVNTK KAGTYKVQCTITDPYTAVSVTKTFTFKVGKKPKQTTTEKKAPTELTTEEKTAE >gi|222441799|gb|ACEP01000143.1| GENE 30 33406 - 35817 2937 803 aa, chain + ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 5 801 3 802 804 868 52.0 0 MAEPYNHRAIEQKWRDKWEATPINVNDGKKPKYYCLDMFPYPSGSGLHVGHWRGYVISDV WSRYKMLQGYYLIHPMGWDAFGLPAENYAIKMGTHPKITTASNIKNIKRQINEIAAIYDW DMEVNTTDPNFYKWTQWIFVKMFKAGLAYEKEMPINWCPSCKTGLANEEVVDGCCERCGS PVTKKNLKQWMLRITKYADRLLNDLDKLDWPEKVKKMQTDWIGKSYGAEVDFPVEGKDEK ITVYTTRPDTLHGATFMVLAPEHAMAKELATDETREAVEKYIFNASMKSNVDRLQDKEKT GVFTGSYAINPLNGAKVPIWLSDYVLADYGTGAIMCVPAHDDRDFEFATKFNIPIIQVIA 
KDGKEIENMTEAYTEAAGTMINSGEWNGMESSVLKKEAPLMIEKMGIGRKTVNYKLRDWV FSRQRYWGEPIPIVHCPKCGAVPVPEEELPLLLPEVEKYQPTGTGESPLADITEWVNTTC PCCGAPAKRETNTMPQWAGSSWYFLRYVDNKNDKELVNREKADKYLPVDMYIGGVEHAVL HLLYSRFYTKFLNDIGVIDFDEPFKKLFNQGMITGKNGIKMSKSKGNVVSPDDLVRDYGC DSLRIYELFVGPPELDSEWDDKGIEGVNRFLKRFWKLALDNKDNDVKATPELIKVRHKLV HDITNRLNSFSLNTVISGFMEYNNKLMELSKKGGVDKETLETYIILLAPFAPHVSEELWE QFGHTDSVFHATWPEYDEEAMKDDEIEIPVQINGKTKCTIAINPDGDKDEILAKAKEALG DKVNGTIRKEIYVPGRIVNLVVK >gi|222441799|gb|ACEP01000143.1| GENE 31 36334 - 36540 203 68 aa, chain + ## HITS:1 COG:no KEGG:Blon_1307 NR:ns ## KEGG: Blon_1307 # Name: not_defined # Def: transposase IS200-family protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 68 12 79 132 94 64.0 2e-18 MMYYHLIMVVRYRRKVIDNPISERAKGIWEHIAPRYGIVLEEWNHDIDHVYVMFHAQSRM ELSKFINA >gi|222441799|gb|ACEP01000143.1| GENE 32 36703 - 37806 729 367 aa, chain + ## HITS:1 COG:alr2719 KEGG:ns NR:ns ## COG: alr2719 COG0675 # Protein_GI_number: 17230211 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. 
PCC 7120 # 6 367 7 410 452 221 38.0 2e-57 MNIACRFRIYPTEEQKILLGKTFGCCRFLYNQMLNDKIREYEKTKKMLKNTPAMYKKEYP FLKEIDSLALANVQLHLEKAYKNFFRNPKIGFPRFKSKHHSKNSYTTNVVNGNILVESKR IRLPKLKWIAMKKHREPAEGLRLKSVTVSMEPSGKYFASLFYEGYSCENQAAEPDYSTAK ILGIDYAMQGMAVFSEKVETEEAGFFRKNEKRLAREQRKLSRCVRGSHNYELQKKKVARC HEKIRNQRRDHLHKLSRKIADGYDAVAVEDIDMKAMGQCLHFGKSVQDNGYGMFREMLDY KLAWKGKKMIKVDRFFPSSKKCCKCGRIKKELKLSERVYHCACGNKMDRDRNAAINIREE ARRMLTA >gi|222441799|gb|ACEP01000143.1| GENE 33 38054 - 38500 240 148 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 11 147 11 146 147 97 38 2e-19 MSKIEEVRTAMYAAMKAKDKERKDALSMLLGALKAKAIDKREDLTEAEENSIIAKEIKQC KDTIEMSPADRTDIIDQCKLRIAVYEEFAPKQMSEDEIKATIQSVLDELGITEPTGKDRG KIMKDLMPKVKGKADGKLVNQMVASMLK >gi|222441799|gb|ACEP01000143.1| GENE 34 38673 - 39215 542 180 aa, chain - ## HITS:1 COG:jhp1038 KEGG:ns NR:ns ## COG: jhp1038 COG1013 # Protein_GI_number: 15612103 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Helicobacter pylori J99 # 1 175 132 306 314 173 43.0 2e-43 MVYVCYDNGAYMNTGIQRSSATPMYADTTTTPVGSESDGKAQNRKDLTQVIAAHNIPYVA QTTFVQNFKDLHTKAEKAIYTEGAAFLNVMAPCPRGWRYNTPDILEICQLAVDTCFWPLF EVIDGEWIVNYVPKKKLPIEDFLRPQRRFKHLFKPGKEELIARIQAEVDRKWEELLAKAK >gi|222441799|gb|ACEP01000143.1| GENE 35 39298 - 39672 316 124 aa, chain - ## HITS:1 COG:TM0018 KEGG:ns NR:ns ## COG: TM0018 COG1013 # Protein_GI_number: 15642793 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Thermotoga maritima # 30 120 11 102 324 105 55.0 1e-23 MISDINKLELDVKWQNLTPGCTIVGSCTAEVFRTEERLSPGHRMCAGCGATIAVRNVLRG LHEEDEAVITCATGCLEVSSFMYPYTAWKDSFIHNAFENAGATCSGVEAAYRALKKKGKV KKHS >gi|222441799|gb|ACEP01000143.1| GENE 36 39672 - 40109 527 145 aa, chain - ## HITS:1 COG:TM0015 KEGG:ns NR:ns ## 
COG: TM0015 COG1014 # Protein_GI_number: 15642790 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Thermotoga maritima # 2 145 50 189 192 106 40.0 2e-23 MGAPITAYNRLSNKQIRVHSNIYDPDYVVVVDETLLDSVDVTEGLKEEGAIIINTKHAKE ELLPKLNGYKGRVYTIDARKISLEALGKYFPNSPMLAAIVKVSGVMDDEAFLAEMQKSYA HKFASKPEVIEGNMKALSLALQEVQ >gi|222441799|gb|ACEP01000143.1| GENE 37 40711 - 41334 353 207 aa, chain - ## HITS:1 COG:CAP0075 KEGG:ns NR:ns ## COG: CAP0075 COG2323 # Protein_GI_number: 15004779 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 1 203 31 231 233 122 31.0 5e-28 MSEMSMFDYISGITIGSIAAEMATAPDSSFLEPLTAMIVYGLLTAFLAFITGKNMKIRHF ISGSPYILFNDGHLYEGNFKKGHIDLSEFLVQCRLNGFFDLKELQTAILEENGRISFLAK SEHRPVSPSDLNLSPHKEWLLPTYIMDGKILYENLLHSGKDEEWLRKQLSAHGCKRLSDI FLATGNQQNTLNVYFKNNRRGSSVIKE >gi|222441799|gb|ACEP01000143.1| GENE 38 42175 - 43695 1139 506 aa, chain + ## HITS:1 COG:AGl1139 KEGG:ns NR:ns ## COG: AGl1139 COG0169 # Protein_GI_number: 15890687 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 283 10 291 291 259 46.0 8e-69 MSKNYRAELVGLFGNPVDENPTGPMMEAGFAAQGLNYRYITMKVEKENLKDAIAGIRAIG MRGLNLTIPHKIAVIPFLDELSPAAKIIGAVNSIRVQDGQLIGENTDGKGFVTSLMETGI ELNGRIITVLGSGGAARAVAVECAISGAETVNIVARNEERGKELADLISEKTEAQGVYFT WKGSVPIPEGTEVLVNCTNVGLYPDENKPDITYEDIRKNMTVCDVVFNPPETKFLKEAKA RGAATVNGLGMLVNQAALNYCLWTENMAPKDMMKEALLREFNLENETVQEEKTIQKNIAI QEKVTKDTVKNMKTDITDTVNIMENTRRTPGRFQATQGEENIMDEQDRKLIEKMMEYYAG DPKRVQHFLKVYEFAKLIGESESLDTETMHILRTAAIVHDIGIKISEEKYGSSNGKYQEK EGPAVAEPMLLALGYDEAVIDRVLFLIAHHHTYNEIEGLDYQILVEADFLVNLFEDGSSR EAAQKVQKNIFKTNTGTKYLSGLFLN >gi|222441799|gb|ACEP01000143.1| GENE 39 43790 - 44500 213 236 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] 
# 5 236 7 228 234 86 29 2e-16 MVFGVILAGGIGSRMGNVEKPKQYLSVGGKPIILHTLEKFYVNSKFEKLIVLCPSQWINH TKNLVKKNFNDSSKIVVIEGGSTRNETIMNSIRYIEKEYGLDDDTIIVTHDSVRPFLTYR IIEDNIRYAQEYGACDTVIPASDTIVESAGHEIISNIPDRSIMYQGQTPQSFKAKKLKEL YEGLTEEEKEILTDACKIFVIKGQDVHLVEGEVFNIKITYPYDLRVAETLIKGEEK >gi|222441799|gb|ACEP01000143.1| GENE 40 44500 - 45525 1169 341 aa, chain + ## HITS:1 COG:SA0246 KEGG:ns NR:ns ## COG: SA0246 COG1063 # Protein_GI_number: 15925959 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Staphylococcus aureus N315 # 1 339 1 339 341 419 57.0 1e-117 MLNTVYRLVAPRRFEVAFNDIDLNSDKVVVRPTHLSICHADQRYYQGTRPADVMAKKLPM ALIHEAIGDVVYDATGEFKPGELVVMIPNTPVEKDDVIAENYLRSSKFRASGFDGFMQDN VALDRDRLVRLPQDIDRTVAAFTEIVSVSVHALRRFERFTHARKNVVGIWGDGNLGYITA VFFHYMHPDTKLVIFGTNEDKLSGFTFADETHLVWEIPEDLKVDHAIECVGGDASQKAID QMIDIIQPEGTIALLGVSEYAVPINTRMVLEKGLHLYGSSRSGRADFEKTVELYQEYPEI LGYLSNIVGAQVEVNSIADMTRAFEMDINKMMGKTIMIWNK >gi|222441799|gb|ACEP01000143.1| GENE 41 45830 - 47227 1002 465 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 2 457 1 440 456 390 47 1e-108 MLKISNIVSQIDSFVWGPVMLVLLVGTGVLLTVRCNFLTWRNLPWALKKTLSKEARTKSR GTGDVSPFSALTTALAATIGTGNIVGVATAMVSGGAGALVWMWISACFGLSSKFSECMLA IKYREVNEKGEMSGGPMYTMKKGLKNKKLGAVLGWLFALFAVIASFGIGNMTQGNSIATS VHATFGVSVTVVGVVITVLALLIIIGGIKTISKVSSVVVPVMAIFYVAAGIIVILGNIHN LPAGISMIFKMAFSVKAVGGALCGNIVASMMNAARYGVARGCFSNEAGMGSAAITAAAAT TDNPVRQAYINMTGTFWDTIVVCTITGLAIASSGVLGQIDPATGQLYVGSALTIAAFSTV LGKVGGYLVTIGITLFAFSTILGWEYHGEKAFEYLVGTHKLNMVYRIIFSLVAYVGATQT LELVWNFSDIANALMAIPNLICMILLSGEIAKDVKAFQKIIDRER Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:26:55 2011 Seq name: gi|222441798|gb|ACEP01000144.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont440.1, whole genome shotgun sequence Length of sequence - 1378 bp Number of predicted genes - 2, with homology - 
2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 3 - 1055 767 ## MGAS2096_Spy1130 TraG/TraD family protein 2 1 Op 2 . + CDS 1074 - 1377 195 ## MGAS2096_Spy1130 TraG/TraD family protein Predicted protein(s) >gi|222441798|gb|ACEP01000144.1| GENE 1 3 - 1055 767 350 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1130 NR:ns ## KEGG: MGAS2096_Spy1130 # Name: not_defined # Def: TraG/TraD family protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 99 348 41 291 597 118 30.0 5e-25 SAFTRSRHCFWRKRNNWIEKDTGRLYITYNLKQLVEKLGRSDRTISKAMKQLSEVGLIEK KKRGQGKPDIIYVMNFTAVHKEGAKEIVLEEQMEQDYQDKLAYILMHPFRNWFNRKTPAV IGIALTVWAMFVCYYLTYYRNFHPDAEHGVAEWADVPKTAKRLYGKGEEPVTCLSKNITV NAKALPNMHILILGGSGDGKTTSLLIPNILLANMTNIILDVKGDLLKNYGGYLKEIKNIT VKSLNFKDMLQSDRYNPFVYIDNYTDVIELITNIQTSVKPPDAQKGDPFWDDGVGLYLQS LFEYEWLLAKEGNRVASMPGILELVNRETQKTDEEGTTQLQQDMQELQER >gi|222441798|gb|ACEP01000144.1| GENE 2 1074 - 1377 195 101 aa, chain + ## HITS:1 COG:no KEGG:MGAS2096_Spy1130 NR:ns ## KEGG: MGAS2096_Spy1130 # Name: not_defined # Def: TraG/TraD family protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 4 101 303 392 597 67 34.0 2e-10 MRDYRKLKEGASETVRSIVIMVNAQLRLFEIPEIKRVFEGTDDIDIPSLGLGVEGNPKKK TALFLVMPSGDSSYNLFINMFYTQLFTVLKRIADNRRDGQL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:27:06 2011 Seq name: gi|222441797|gb|ACEP01000145.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont441.1, whole genome shotgun sequence Length of sequence - 6660 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 3, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 273 - 332 7.2 1 1 Op 1 22/0.000 + CDS 530 - 763 273 ## COG1918 Fe2+ transport system protein A 2 1 Op 2 . 
+ CDS 760 - 2769 1601 ## COG0370 Fe2+ transport system protein B + Prom 2906 - 2965 7.0 3 2 Op 1 26/0.000 + CDS 3039 - 3467 528 ## COG1585 Membrane protein implicated in regulation of membrane protease activity + Prom 3473 - 3532 2.4 4 2 Op 2 . + CDS 3552 - 4511 1236 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs + Term 4587 - 4648 11.4 + Prom 4522 - 4581 3.6 5 3 Tu 1 . + CDS 4663 - 6606 1366 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily Predicted protein(s) >gi|222441797|gb|ACEP01000145.1| GENE 1 530 - 763 273 77 aa, chain + ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 3 69 80 146 152 69 46.0 1e-12 MTLAQLPVGKEGVIETVGGEGNLRCRFLDMGLIPGTKVKVLKMAPMGDPIQIHLRGYDIT IRKEDGEKIVLKEGSVQ >gi|222441797|gb|ACEP01000145.1| GENE 2 760 - 2769 1601 669 aa, chain + ## HITS:1 COG:CAC1031 KEGG:ns NR:ns ## COG: CAC1031 COG0370 # Protein_GI_number: 15894318 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Clostridium acetobutylicum # 1 662 6 674 683 511 40.0 1e-144 MIFALIGNQNCGKTTLFNQMTGSNQHVGNFPGVTVDQKMGKFTTASVDGTVVDLPGIYSL RPYTNEEIVTRDFLLKEKPDGIINIVDATNIERNLYLTLQLIEMRIPMVLALNMMDEVRN NGGSINVKEMSRLLGIPIIPISAIRNEGVGDLIHTAYEVAENKQYPKVYDFCTPGPVHRC IHGLYHQLEDHASRSGMNGRFAAVKVIEGDEDIIERLKLSENELEMMEHSIIEMERDRGL DRNAAMADMRYSFIENICEQSVVKCQVSKEYERSVQIDSVLTNRYLALPVFAGIMVFIFW MTFGPFGSFLSDALSAGIDWVSREIEILLTNYGINPVVKSLVIDGIFTGVGSVLSFLPVI LILFFFLSILEDTGYMARIAFVMDTFLRKIGLSGRSFVPMLLGFGCSVPAVMASRTLSSE RDRNLTIMLIPFISCSAKIPIYTVFTAAFFPHRGVLVMSCLYFGGIVIGILMALIFKKVL FAGKPMPFVMELPNYRFPSLKSVLLLMWEKARDFLERAFTIIFLASMIIWFLQTFDSRLN VVTDSANSLLAILGHLLAPIFKPLGFADWRISTALVTGITAKEAVVSTLSVLLGTNAAHV SAALGTLFTVRSAVSFLVFTLLYTPCVAAIAAIKQEMHSTLKTIGIVIMQCGVAWLTAWL VYIGSGVIL >gi|222441797|gb|ACEP01000145.1| GENE 3 3039 - 3467 
528 142 aa, chain + ## HITS:1 COG:TM0865 KEGG:ns NR:ns ## COG: TM0865 COG1585 # Protein_GI_number: 15643628 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated in regulation of membrane protease activity # Organism: Thermotoga maritima # 4 142 5 140 140 73 33.0 2e-13 MTVMYWLGAAAIFVVIEIITMGLTTIWFAGGALVGAVMAAFSLPLWSQIIAFVIVSVILL ILTRPWALKYLNSRTVRTNADSLIGQTALVTQDIDNLNAKGQVKVEGQIWTARSISDDVQ LHEGQKVMIESISGVKVIVKPI >gi|222441797|gb|ACEP01000145.1| GENE 4 3552 - 4511 1236 319 aa, chain + ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 6 304 6 293 294 318 59.0 1e-86 MAGTIFMIIIILLIVYIVTSCVRIVPQAQAYVIERLGAYNGTWSVGMHFKVPFIDRVAKK VLLKEQVVDFAPQPVITKDNVTMRIDTVVYYQITDPKLYAYGVDNPIMAIENLTATTLRN IIGDLELDSTLTSRETINTKMRATLDEATDPWGIKVNRVELKNIIPPTEIQNAMEKQMKA ERERREAILRAEGEKKSSILRAEGHKESMILEAEAEKEAAILNAEAKKEATIREAEGQAE AILKVQRATADGLRAIREAGADEAVIKLKSLEAFEKAADGKATKIIIPSEIQNLAGLVTS IKEVAAQPEAVEKVDTTAK >gi|222441797|gb|ACEP01000145.1| GENE 5 4663 - 6606 1366 647 aa, chain + ## HITS:1 COG:TM1703 KEGG:ns NR:ns ## COG: TM1703 COG1368 # Protein_GI_number: 15644451 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Thermotoga maritima # 82 621 62 573 593 306 32.0 9e-83 MKTIIKKLLAHYWLVYPVLILKMFIYYWQTKRLDMLGIYDVPLLSLLFLFCVFEAFSFKE SKPRRYGFYIVYTLITLIMMADAAYSSYFGKYVSVNQLYQITSLGQIAGDGDVIGASVSP WCLLTLIDYPFVLYWYRLRNKGKKGMLDELAKCWPEKFHFKNVWKEKKFMLSLIKSALHI IIYIVAICAWYYYGLNPQNLRSVQQVNHIEFFTYHTNDIVVNVVGKLKRSSVDEKAIQKK MKSIVPKSSGTAYKGVAKGKNLILIQTESFNNFVIGATYNGQEITPNLNKLLKKDTIYFN HFYSTTGVGNTCDAEFSALNSLYPNDIRECYRMYVDNTYNGLPWMFREKGYSAMAFHGYV 
KTFWNRSEAYKNQGFQHYYSEEELKQTQISGFGITDKEMFRQAVDILKTKQQPFFSFMIT LTNHIPYELDQSLASLKLKDSDVGSTFGNYLQTVRYTDESFGKLIEYLKKNDMYDNTVIA IYGDHQGMNKETPSVQWKMTDFLGKEYDYDEMLNVPFIIHIPGLNESKVADTVGGEVDIM PTIANLMDLDTKQPYVFGHDLLNADEGFVAQISYVGEGSFITSDDNMLFTIGKDGTVASG RLMSLLDGSRKNINEKLCQKYSDRAQSLIDTCKEVLDNNLIANYITH Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:27:07 2011 Seq name: gi|222441796|gb|ACEP01000146.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont445.1, whole genome shotgun sequence Length of sequence - 1353 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 73 - 1089 775 ## COG2110 Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 + Term 1156 - 1194 -0.9 Predicted protein(s) >gi|222441796|gb|ACEP01000146.1| GENE 1 73 - 1089 775 338 aa, chain + ## HITS:1 COG:MA1614 KEGG:ns NR:ns ## COG: MA1614 COG2110 # Protein_GI_number: 20090472 # Func_class: R General function prediction only # Function: Predicted phosphatase homologous to the C-terminal domain of histone macroH2A1 # Organism: Methanosarcina acetivorans str.C2A # 3 155 29 183 195 193 58.0 3e-49 MPLKIVRNDITKMNVDAIVNAANTSLLGGGGVDGCIHRAAGPDLLEECRMLHGCQTGNAK ITNGYRLPCKYVIHTVGPIWLDGKHQEQKLLESCYDTSLNLAKEYGCESVAFPLISSGIY GYPKDQALKVAVDIIGNFLLENEMTVFIVIFDRKAYQISGKLFADITAYIDDRYVKEHSD SREEQRWRSEPLIGESCFGSAVERMEAKPSYNAISSRTLEDALNQIDESFSEMLLRKIDE SDMTDTQCYKKANIDRKLFSKIRSNKFYKPSKTTVLAFALALELPIAEMQDMLGKAGFTL SRSSKFDIIVEYFVEQGNYNVYEINEALFAFDQSLIGS Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:27:10 2011 Seq name: gi|222441795|gb|ACEP01000147.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont446.1, whole genome shotgun sequence Length of sequence - 10035 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 8, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 
354 - 413 5.6 1 1 Tu 1 . + CDS 435 - 824 310 ## ELI_2320 hypothetical protein - Term 1453 - 1496 0.1 2 2 Tu 1 . - CDS 1550 - 2566 1342 ## COG1087 UDP-glucose 4-epimerase - Prom 2649 - 2708 5.7 3 3 Tu 1 . - CDS 3200 - 4075 872 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III - Prom 4269 - 4328 6.4 - Term 4238 - 4276 1.0 4 4 Tu 1 . - CDS 4399 - 5793 1523 ## COG1757 Na+/H+ antiporter - Prom 5847 - 5906 3.9 5 5 Op 1 . + CDS 6031 - 6171 63 ## gi|210614563|ref|ZP_03290218.1| hypothetical protein CLONEX_02432 6 5 Op 2 . + CDS 6236 - 6481 198 ## EUBREC_1224 hypothetical protein 7 5 Op 3 . + CDS 6507 - 6638 108 ## + Term 6781 - 6828 0.1 - Term 6769 - 6817 6.5 8 6 Op 1 . - CDS 6908 - 7285 478 ## COG0346 Lactoylglutathione lyase and related lyases 9 6 Op 2 . - CDS 7334 - 7774 366 ## gi|225028728|ref|ZP_03717920.1| hypothetical protein EUBHAL_03007 10 6 Op 3 . - CDS 7794 - 8063 281 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis - Prom 8099 - 8158 5.4 - Term 8126 - 8165 5.2 11 7 Tu 1 . - CDS 8180 - 8359 200 ## gi|225028731|ref|ZP_03717923.1| hypothetical protein EUBHAL_03010 - Prom 8400 - 8459 5.7 12 8 Op 1 . - CDS 8530 - 9546 1072 ## COG1087 UDP-glucose 4-epimerase - Prom 9575 - 9634 2.7 - Term 9592 - 9633 2.2 13 8 Op 2 . 
- CDS 9636 - 10034 460 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|222441795|gb|ACEP01000147.1| GENE 1 435 - 824 310 129 aa, chain + ## HITS:1 COG:no KEGG:ELI_2320 NR:ns ## KEGG: ELI_2320 # Name: not_defined # Def: hypothetical protein # Organism: E.limosum # Pathway: not_defined # 44 125 41 122 183 77 43.0 2e-13 MTEKLEKLKQGYANLSEWLEHIMAAIVLIAIVIAICSLWAPFKEFLQTRSESGAFLKYMA SVFDIVIGIEFFKLLCKPRKDTMLEVLMFVIARHMIIEHTTAFENLLSIVAISILIIVDR YFLKSKTLN >gi|222441795|gb|ACEP01000147.1| GENE 2 1550 - 2566 1342 338 aa, chain - ## HITS:1 COG:SP1828 KEGG:ns NR:ns ## COG: SP1828 COG1087 # Protein_GI_number: 15901657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 1 337 1 336 336 418 60.0 1e-117 MNILLAGGAGYIGSHTAVELLTAGHDVVIVDNYCNSCAEAVNRVEEVSGKKVVSYEADVK DKVAMAKIFAENKIDCVIHFAGLKAVGESVQKPIEYYRNNIDTTLTLLECMKEAGVKKFV FSSSATVYGEENDIPYIETMKRGSCSNPYGWTKVMMEEILEDAAKADEELTVVLLRYFNP IGAHESGRMGEDPQGIPNNLMPYIAQVAVGRRDHLTVYGGDYPTKDGTCRRDYIHVMDLA NGHLKAVEYAAQHKGVEVFNLGTGTPYSVLEIIHAFESANDIKIKYEIGDRRAGDLPEFW ANAEKAEKILGWKTQRTLEDMCRDTWNWQKNNPQGYNK >gi|222441795|gb|ACEP01000147.1| GENE 3 3200 - 4075 872 291 aa, chain - ## HITS:1 COG:RSc2495 KEGG:ns NR:ns ## COG: RSc2495 COG1234 # Protein_GI_number: 17547214 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Ralstonia solanacearum # 3 288 42 317 320 113 30.0 3e-25 MKTKLVLLGTGTPNACPNANGPSSAVVVGDRAYIVDFGPGVVRQASAAYFNGIDALRPDL LTVAFCTHLHTDHTAGYPDLIFTPWVLERPVPLKVFGPKGMQHMTDHILKAYETDIDFRI NGFEKANESGYRVEVTEIESGIIYKDDRVTVEAFPVSHGTLECYGYKFITPDKTIVITGD TAPLDIVAEKAKDCDILLHEVEYAAGISCREPKWRKYHREVHTLSTDLAQVAKKANPKLL VTYHRIYHMNIQDNRKNLAAEMAWRDEAILDEIREAGYEGYVVNGKDLDVF >gi|222441795|gb|ACEP01000147.1| GENE 4 4399 - 5793 1523 464 aa, chain - ## HITS:1 COG:BS_mleN KEGG:ns NR:ns ## COG: BS_mleN COG1757 # Protein_GI_number: 16079413 # Func_class: C Energy production and conversion # 
Function: Na+/H+ antiporter # Organism: Bacillus subtilis # 36 451 36 451 468 256 36.0 7e-68 MKEEKKLNLPQSLLLLVLIIFSVAVCIRLKTGGPMIGLFASWIFIYLVCKIVRIPYDNVV AGAYDAIRMVVPTLCLLMAIGVMIGTWLQSGTIAIIIAWGLKMINPAWLLPLTLLFCSVL SVVTGTSYGSVGSAGVAMMAIGNAMGINSGMVAGAVICGAMFGDKLSPLSDTTNLAPAVA GAKLGDHIRAMFWTTIPTYIITLIIFTVLGIQQTSGGYTAGDISNYITELNGEFHLGAVT LIPAILIIVLLLCKVNAISALGISSFAAGAVSFFVQHATLQSIIQTAYSGYTTTIEEGVL QSILNRGGMGSMLQYVAIISFAVGMGGMLEKLGVLEHILNAVVKRINSDGSMILVTLIVG YITSLISCSQPMSHVLTGRLMAPVFKERKVAPEILSRCLEDSGTMAGPMIPWHGYGVYMA GTLGVAWAAFFPYLFLLYLTPIFSIIYGFTGISIKHVSGEEVAE >gi|222441795|gb|ACEP01000147.1| GENE 5 6031 - 6171 63 46 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210614563|ref|ZP_03290218.1| ## NR: gi|210614563|ref|ZP_03290218.1| hypothetical protein CLONEX_02432 [Clostridium nexile DSM 1787] hypothetical protein CLONEX_02432 [Clostridium nexile DSM 1787] # 1 46 504 549 549 68 86.0 2e-10 MRDTQREDFSNRRDYERELRTVSANIDMILGKNHEQEQQIEKEQNL >gi|222441795|gb|ACEP01000147.1| GENE 6 6236 - 6481 198 81 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 60 26 85 94 94 73.0 2e-18 MKAHKYWSLGALAAMIGTFYTGYKNMKTAHKYFACSSLLCMIMAIYSGHKMISGKSRKKK DPVSEEAHTINHEYHDTTSLL >gi|222441795|gb|ACEP01000147.1| GENE 7 6507 - 6638 108 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKITHKIRCEQWTAIVNECLAKVGFCLQSEQVTYDLHSAYPH >gi|222441795|gb|ACEP01000147.1| GENE 8 6908 - 7285 478 125 aa, chain - ## HITS:1 COG:CAC0249 KEGG:ns NR:ns ## COG: CAC0249 COG0346 # Protein_GI_number: 15893541 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 2 125 3 126 126 172 66.0 1e-43 MFDTIHHIAIIGSDYEKSKHFYVDLLGFEIIRENYRKERDDYKIDLACGEQEIELFIIKD APARVNYPEALGLRHLAFKVKSVDDTVKELNAKGIATEPVRLDDYTGKKMTFFHDPDNLP LEIHE >gi|222441795|gb|ACEP01000147.1| GENE 9 7334 - 7774 366 146 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|225028728|ref|ZP_03717920.1| ## NR: gi|225028728|ref|ZP_03717920.1| hypothetical protein EUBHAL_03007 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03007 [Eubacterium hallii DSM 3353] # 1 146 1 146 146 274 100.0 1e-72 MKEENIKGEYQRKVDKSLIPTCNIMGVNIAAINMNWLLKYLDANLSDTNTKFIWNAMIRG KYKIRQEQVRFLQEMPSEENLWKAVIAFQEYPFKTATGLPFRYKLKVGKNGEYNRELLID RREKSKSLAWSSVVLAFENSKINVFS >gi|222441795|gb|ACEP01000147.1| GENE 10 7794 - 8063 281 89 aa, chain - ## HITS:1 COG:alr4823 KEGG:ns NR:ns ## COG: alr4823 COG2148 # Protein_GI_number: 17232315 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. PCC 7120 # 3 88 149 234 235 98 54.0 3e-21 MTLAGDMSLVGTRPPTVDEWDKYELHHRARLATKPGLTGMWQISGRSNITDFEEVVKLDK QCISEWTMGLDIRILFKTVQVVFKKEGSM >gi|222441795|gb|ACEP01000147.1| GENE 11 8180 - 8359 200 59 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028731|ref|ZP_03717923.1| ## NR: gi|225028731|ref|ZP_03717923.1| hypothetical protein EUBHAL_03010 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03010 [Eubacterium hallii DSM 3353] # 8 59 1 52 52 105 100.0 1e-21 MKTTLLIMAAGIGSRFGTGIKQLEPVDDSNHIIMDYFDIQLQPFEELEKAMKWWKSNKL >gi|222441795|gb|ACEP01000147.1| GENE 12 8530 - 9546 1072 338 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 337 1 336 339 474 66.0 1e-133 MAVLVTGGAGYIGSHTCVELLENNKEVVVLDNLSNSSEEALNRVKRITGKEVKFYKGDIS DIVILDTIFKKENIESCIHFAGLKSVGESVAKPLEYYQNNIAGTLILLQKLKEYNVKNII FSSSATVYGDPAFVPITEECPKGLCTNPYGWSKSMLEQILMDIYKADETWNIILLRYFNP IGAHKSGLMGENPNGIPNNLMPYVTQVAVGKLKELGVFGNDYDTPDGTGVRDYIHVVDLA KGHVKALQKIDEKCGFKIYNLGTGKGYSVLDIVKNFEAATGMKVPYVIKDRRPGDIATCY CDPGKAAKELDWKAENGIKEMCEDSWRWQKKNPNGYEG >gi|222441795|gb|ACEP01000147.1| GENE 13 9636 - 10034 460 132 aa, chain - ## HITS:1 COG:BS_ybbT KEGG:ns NR:ns ## 
COG: BS_ybbT COG1109 # Protein_GI_number: 16077245 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 128 317 444 448 127 47.0 6e-30 KNGCRIGGEQSGHIIFSKYASTGDGILTSLKMMEVMMAKKKKLSQLTEDLHIYPQVLVNV MVKDKAVAQADADVQAAISKVAERLGDTGRILVRESGTEPLIRVMVEAETKEICHTYVDE VVSIINKKEATR Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:27:42 2011 Seq name: gi|222441794|gb|ACEP01000148.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont448.1, whole genome shotgun sequence Length of sequence - 19562 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 240 - 299 7.3 1 1 Tu 1 . + CDS 330 - 827 557 ## gi|225028735|ref|ZP_03717927.1| hypothetical protein EUBHAL_03014 + Term 1056 - 1118 4.2 + Prom 1272 - 1331 7.8 2 2 Tu 1 . + CDS 1384 - 2835 2095 ## COG0516 IMP dehydrogenase/GMP reductase + Term 2850 - 2894 12.1 + Prom 2870 - 2929 6.5 3 3 Tu 1 . + CDS 3166 - 3744 352 ## COG5632 N-acetylmuramoyl-L-alanine amidase + Prom 3802 - 3861 2.8 4 4 Op 1 1/0.000 + CDS 3885 - 4967 1677 ## COG0473 Isocitrate/isopropylmalate dehydrogenase + Prom 5000 - 5059 6.0 5 4 Op 2 6/0.000 + CDS 5100 - 6773 2183 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 6 4 Op 3 3/0.000 + CDS 6794 - 8497 2157 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 7 4 Op 4 5/0.000 + CDS 8517 - 9680 1543 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases + Prom 9716 - 9775 3.9 8 4 Op 5 . + CDS 9798 - 10883 1396 ## COG0136 Aspartate-semialdehyde dehydrogenase + Prom 10962 - 11021 6.3 9 5 Op 1 . + CDS 11041 - 11904 1047 ## COG0668 Small-conductance mechanosensitive channel 10 5 Op 2 . 
+ CDS 11933 - 13267 1121 ## COG0534 Na+-driven multidrug efflux pump + Prom 13298 - 13357 6.2 11 6 Tu 1 . + CDS 13383 - 14441 788 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis + Term 14445 - 14495 13.1 - Term 14431 - 14483 7.9 12 7 Op 1 . - CDS 14494 - 15876 1226 ## COG0733 Na+-dependent transporters of the SNF family 13 7 Op 2 . - CDS 15892 - 17277 1504 ## COG0165 Argininosuccinate lyase - Prom 17317 - 17376 7.8 + Prom 17348 - 17407 6.4 14 8 Op 1 3/0.000 + CDS 17461 - 18426 865 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 18520 - 18579 8.9 15 8 Op 2 . + CDS 18627 - 19427 1111 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain + Term 19495 - 19530 6.5 Predicted protein(s) >gi|222441794|gb|ACEP01000148.1| GENE 1 330 - 827 557 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028735|ref|ZP_03717927.1| ## NR: gi|225028735|ref|ZP_03717927.1| hypothetical protein EUBHAL_03014 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03014 [Eubacterium hallii DSM 3353] # 1 165 1 165 165 273 100.0 4e-72 MKVQIEVGKGIQCDGIEIMFGDSISSVVKAVGEVDKYEDNYYFYESSLLVHVDSNKCIDE IEIRNDEEHSHVVMLNGTNIFSEMKDVVIELIVRLNQSPVEDELGTYEAKRIGLAYSFSM TDEEIQEMISEAKEEGTYEEMKEEIEADIKMAKYLQTISIRKPKR >gi|222441794|gb|ACEP01000148.1| GENE 2 1384 - 2835 2095 483 aa, chain + ## HITS:1 COG:BH0020_3 KEGG:ns NR:ns ## COG: BH0020_3 COG0516 # Protein_GI_number: 15612583 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Bacillus halodurans # 201 482 1 282 282 389 68.0 1e-108 MGKIIKEAITFDDVLLVPAYSEVIPNEVNLETKLTNKIKLNIPFMSASMDTVTEHQMAIA MARQGGIGIIHKNMSIEAQAEEVDKVKRSENGVITDPFYLSPEHTLQQAEDLMAKFRISG VPITENGKLVGIITNRDLKFETNFNKKIKESMTSEGLVTAKEGITLEEAKQILGKARKEK LPIVDDDYNLKGLITIKDIEKQIRYPYSAKDSNGRLLCGAAVGCTPDILNRVDELVKSHV DVIVIDTAHGHSANVLKTFALVKEKYPDLQVIAGNIATAEGTKAMIECGVDAVKVGIGPG 
SICTTRVVAGIGVPQITAVMDCYEMADKYNIPIIADGGIKFSGDVTKAIAAGANVVMLGG LLAGCDESPGEFELYQGRKYKVYRGMGSLAAMENGSKDRYFQANAKKLVPEGVEGRVAYK GKLEDTIFQLVGGLRSGMGYCGAKTIQELKEKGQFVKITAASLKESHPHDIHITKEAPNY SVE >gi|222441794|gb|ACEP01000148.1| GENE 3 3166 - 3744 352 192 aa, chain + ## HITS:1 COG:lin2374 KEGG:ns NR:ns ## COG: lin2374 COG5632 # Protein_GI_number: 16801437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 34 190 7 177 316 104 35.0 1e-22 MSGISPKQEEGSDVGSRVLPAEDGPCGRPEITEDFLDKNPYSRSGIALRKVKGVVIHYVE NPGSTAKENRDYFNNLQNTHLTKASSHYIVGLDGEVIQCIPQSEISYASNNRNKDTISIE CCHPKKNGKFNDKTYNSAVRLTAWICKTYGLSSKNVIRHYDVTGKMCPRYYVKHEQAWKD FCNKVQECLEKD >gi|222441794|gb|ACEP01000148.1| GENE 4 3885 - 4967 1677 360 aa, chain + ## HITS:1 COG:CAC3171 KEGG:ns NR:ns ## COG: CAC3171 COG0473 # Protein_GI_number: 15896419 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Clostridium acetobutylicum # 3 356 4 353 357 412 62.0 1e-115 MSYKVGVIRGDGIGPEIVAEALKVLDKVAEKYNETFDYTDILLGGASIDATGKPLTEEAL ETAKSCDAVLMGSIGGNTTTSPWYKLPANLRPEAGLLAIRKGLGLFANMRPAYLYEELKD ACPLREDIIGDGFDLVVMRELTGGLYFGERKTFEEDGVTKAVDTLTYSEEEIRRIAIRGF DIAMKRRKKVTSVDKANVLDSSRLWRKVVNEVAKDYPEVELEHMLVDNCAMQLVRDPKQF DVILTENMFGDILSDEASMVTGSIGMLSSASLKEGSFGLYEPSHGSAPDIAGQDIANPIA TILSASMLLRYSFNMDEAADAVDNAVKQVLKDGYRTVDIMSEGMTKVGCKEMGTLIAERI >gi|222441794|gb|ACEP01000148.1| GENE 5 5100 - 6773 2183 557 aa, chain + ## HITS:1 COG:CAC3170 KEGG:ns NR:ns ## COG: CAC3170 COG0129 # Protein_GI_number: 15896418 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Clostridium acetobutylicum # 1 551 1 547 552 669 62.0 0 MNSDHVKKGMQQAPHRSLFNALGYTKEEMERPLVGIVSSYNEIVPGHMNLDKITQAVKMG VAMAGGTPVVFPAIAVCDGIAMGHTGMKYSLVTRELIADSTECMAKAHQFDALVMIPNCD 
KNVPGLLMAAARINVPTVFVSGGPMLAGHVDGRKRSLSSMFEAVGAYEAGKMTAEKVEEY VNKVCPTCGSCSGMYTANSMNCMTEVLGMGLRGNGTIPAVYSERIRLAKHAGMKVMELLK NNVRPSDIMTKKAFLNCLTVDMALGCSTNTMLHLPAIAHEAGVELNMDIANEISAKTPNL CHLAPAGPTYMEDLNEAGGVYAVMNELSKKGLLYEDQITVTGKTVGENIKDVHNLNPEVI RPIDNPYMAQGGIAVLKGNIAPDTGIVKQSAVVPEMMVHEGPARVFDCEEDAIKAIKGGD IVPGDVVVIRYEGPKGGPGMREMLNPTSAIAGMGLGDSVALITDGRFSGASRGASIGHVS PEAAVGGPIALIEEGDIIKIDIPNNSLNVDVSDEELAKRKEKWQPREPKITDGYLRRYAA LVTSGNRGAVLDVDQLK >gi|222441794|gb|ACEP01000148.1| GENE 6 6794 - 8497 2157 567 aa, chain + ## HITS:1 COG:CAC3169 KEGG:ns NR:ns ## COG: CAC3169 COG0028 # Protein_GI_number: 15896417 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Clostridium acetobutylicum # 6 551 4 549 554 628 57.0 1e-180 MKQVLNGSEIVLECLKEQGVDTVFGYPGGAILNIYDELYKQSDKIHHVLTSHEQGASHAA DGYARATGKVGVCMATSGPGATNLVTGLATAYMDSIPVVAITANVGVSLLGKDSFQEIDI KGVTMPVTKHNYIVKDIEKLADTIREAFQIAQSGRKGPVLVDITKDVTAAKTYFESMEPA VIHPVCETIKEKSVKEAIEMIEEAKQPYVLLGGGCTLSDASEQVREFVKKIDAPVCDTLM GKGVFPGEDPAYTGMLGMHGTKTSNIGVSQSDLLIAIGTRFSDRVYGNAKTFASRAKIIQ IDIDPAEVNKNILIDTAIIGDVKAVLTILNRRLSQQQHEAWMQEIQDMKAKYPMTYHQER LSGPGLIEKIYELTEGDAIITTEVGQHQMWAAQYYKYERPRQFLTSGGLGTMGYGLGAAI GAKCGCPDKVVVNIAGDGCFRMNMNEIATATRSEIPVIEVVVNNHVLGMVRQWQNLFYGE RYSSTTLYDKVDFVKLAEAMGAKAKRVETIEEFEAAFKEAVASNEPYLLDCIIDSDDKVW PMVAPGGSISDAFNEEDLEKQKQEESK >gi|222441794|gb|ACEP01000148.1| GENE 7 8517 - 9680 1543 387 aa, chain + ## HITS:1 COG:lin2956 KEGG:ns NR:ns ## COG: lin2956 COG0111 # Protein_GI_number: 16802015 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Listeria innocua # 1 387 1 389 395 358 47.0 7e-99 MFQYKCLNPISPCGTSLFTEEYKQVEELQKADAVLVRSAAMHDMQDVPNLLAVARAGAGV 
NNIPIADYSEKGIVVFNTPGANANGVKEMVIAGMLLASRDLIGGNKWVEENKEDPNITKA MEKAKKAFAGREIQNKKIGVIGLGAIGVLVANAAHNLNMEVYGYDPFLSVKSAWNLSRAV HHVSSIDEIFENCDFITIHVPLLDSTKNMISAEGIAKMKKDAVILNFARNGLVDEDALVA ALDNGKLAHYVTDFPTPKVAGVKGVIAFPHLGASTEESEDNCAEMAVDQLMDYLENGNIT NSVNYPNTQLGVCQTQGRIAILHRNIPNMLTRFTGAFAKDNINITEMSNKTKGDYAYAIF DVDSVITEESVQHIIDIEGVLKVRVVK >gi|222441794|gb|ACEP01000148.1| GENE 8 9798 - 10883 1396 361 aa, chain + ## HITS:1 COG:CAC0022 KEGG:ns NR:ns ## COG: CAC0022 COG0136 # Protein_GI_number: 15893320 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 4 361 3 360 360 495 64.0 1e-140 MDQKLKVGILGGTGMVGQRFISLLENHPWFEVVAIAASPRSAGKTYEEAVGGRWKMQKAM PEAVKNIVVMNVNEVEKVASEVDFVFSAVDMSKEEIRAIEDAYAKTETPVVSNNSAHRWT PDVPMVVPEINPEHFEVIEAQKKRLGTKKGFVAVKPNCSIQSYAPALFALKEFGPKEVVA TTYQAISGAGKTFKDWPEMVENIIPFIGGEEEKSEQEPLRLFGKVVDGKIEKATAPLITT QCIRVPVLDGHTAAVFVNFEKKPTKEEIIDRWENFKGLPQELNLPSAPKQFIRYMEEDNR PQVKLDIDFENGMGISMGRLREDTMFDYKFVGLSHNTVRGAAGGAVLCAELLTAQGYITA K >gi|222441794|gb|ACEP01000148.1| GENE 9 11041 - 11904 1047 287 aa, chain + ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 32 281 36 283 287 184 39.0 1e-46 MLEAASLKSFAHSLIYEYLGKTGVKLADFCGDIIGALLILFIGFKIVSYVVKMAHKIFER SIMDTTLQTFLLSFIKIGGKVLVIFMAVTKIGVAASSIVALLGSAGLALGLSLQGSLSNI AGGVILLLLKPFQVGDYIIEGNSGKEGTVQAIGIMYTKLLSVDNKAIMIPNGNLSNASIT NVTYQEKRRVDLNVSIEYSEDIKKVRKVLNALIEKEPARITDEPVKIYVNEFQASSIDIG IRYWVKTEDYWESRWRVLEEIKEEFDNNNIAIPFEQLDVNLIPQEKE >gi|222441794|gb|ACEP01000148.1| GENE 10 11933 - 13267 1121 444 aa, chain + ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 433 4 431 452 167 26.0 4e-41 
MEKNLTTGSVLKSILYFSLPYLLSYFLQTLYGMADLFIIGQFEDISSITAVSIGSQVMHM ITVMIVGLAMGATVNIGQAIGAGNKKKAAKDIGNTVTLFMMLAIVGAAGLLFFIKPIVGM MSTPQEAVHETVIYLTICFLGIPFITAYNIISSIFRGIGDSKSPMYFIAAACAINIILDY LFIGGLHMGAAGAALGTTCSQTISVLIALVVIMKKKTGISLSKNDFKIDRKTIKRLLKIG IPVALQDGFIQIAFIIITIIANRRGLDTAAAVGIVEKIISFLFLVPSSMLSTVSALGAQN IGAGKRERAQRILYYAVLITVGFGVMISIIMQFIAGPVVGLFTENAKVIVLGSQYIRGYI FDCIFAGIHFSFSGYFCACGHSELSFIHNIIAILLVRIPGVYITAKVFSSLFPMGLATVG GSLLSVMICVIAYAWMKKKQGCMD >gi|222441794|gb|ACEP01000148.1| GENE 11 13383 - 14441 788 352 aa, chain + ## HITS:1 COG:BH2575 KEGG:ns NR:ns ## COG: BH2575 COG0275 # Protein_GI_number: 15615138 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Bacillus halodurans # 55 352 4 309 310 162 34.0 9e-40 MEENKELEKPHKRRVRYKGTHPRKFEEKYKELQPEKYKDTIEKVIRKGSTPAGMHISICV QEILDFLQIQPGQKGLDCTLGYGGHTRKMLECLQGEGHIYGLDVDPIEIVKTTDRLRKAG YGEDILTIIQQNFRNIDLVAEEHGLFDFVLADLGVSSMQIDNPERGFSYKVEGPLDLRLN PDKGISAAERLRELTRDEIEGMLRENADEPYAEQIAKEIMKTFRQGEQIDTTGQLKEVIE RALVFLPDDKEKKDTIKKTCQRTFQALRIDVNSEFEVLEEFLEKLPHILAPGGRVAILTF HSGEDRLVKKIWKRQQKEGLWSEVAKNVIRPSKEECHRNGRARSTKMRWAIK >gi|222441794|gb|ACEP01000148.1| GENE 12 14494 - 15876 1226 460 aa, chain - ## HITS:1 COG:BS_yhdH KEGG:ns NR:ns ## COG: BS_yhdH COG0733 # Protein_GI_number: 16078012 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus subtilis # 15 459 9 450 451 242 35.0 1e-63 MKHSKKSHKEGRGSFSGRIGYVLAVAGSAVGLGNIWRFPYLAAKYGGGIFLLVYLLLTVS FGYVLIMSETALGRMTKKSPVGAFASFGNSFPFKLGGWLNAVIPILIVPYYSTIGGWVIK YLAEYLKGNVQEVAQDGYFTEFITDSFQTELWFLVFAALIFIIILGGVRNGVERMSKIMM PILVLLAIVVTIYSVTRPGALAGVKYFLIPNMKNFSFMTIVAAMGQMFYSLSIAMGILYT YGSYTDKDMDLEQSTTQIEIFDTGIAILAGLMIIPAVFSFSGGNPENLQAGPSLMFITLP KVFASMGIGTAAGILFFVLVLLAALTSAVSLMETCVSTFADEFHWSRKKCCVFMAVVMIV LGSASSMGYGALDFIQIFGMAFLDFFDFLTNSVMMPLAALATCFLILRVVGFKKISEEIE 
QSSSFRRKKLYTVFLKYFAPICLIIILCSSIANVLGIISM >gi|222441794|gb|ACEP01000148.1| GENE 13 15892 - 17277 1504 461 aa, chain - ## HITS:1 COG:BH3186 KEGG:ns NR:ns ## COG: BH3186 COG0165 # Protein_GI_number: 15615748 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Bacillus halodurans # 1 454 1 454 458 497 53.0 1e-140 MAQLWGGRFTKETDKLVYNFNASINFDKKFYKQDMEGSIAHVKMLAAVGILTDEERDKII AGIEGILRDVENGSLEITEEYEDIHSFVEAVLTDRIGDVGKKLHTGRSRNDQVALDMKLY TRQELLAIRELVLELIKTCQTLMEENIHTVMPGFTHLQKAQPLTLSHHVGAYMEMFKRDY SRLSDIYDRMNYCPLGSGALAGTTYPLDREMTAELLDFYGPTLNSMDSVSDRDYVIELLS ALSTIMMHLSRFCEEIILWNSNEYRFIEIDDAYSTGSSIMPQKKNPDIAELIRGKTGRVY AALTSILTTMKGIPLAYNKDMQEDKELTFDAYDTVKGCLALFNGMLKTITFHKDIMEQSA KHGFTNATDAADYLVNHGVPFRDAHGIVGRLVLYCIDKGIALDDMSLEEYKAISPVFEND IYEVIDVHTCVDKRNTIGAPGPKAMEQVIAINKKWIEEHEE >gi|222441794|gb|ACEP01000148.1| GENE 14 17461 - 18426 865 321 aa, chain + ## HITS:1 COG:CAC0076 KEGG:ns NR:ns ## COG: CAC0076 COG0697 # Protein_GI_number: 15893372 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 3 314 1 287 303 219 40.0 4e-57 MEMKTHKIRDTIFLFLTAMIWGAAFVAQSVSMDYIGPFTFICLRSVIGGLFLIPVIMVID NIRKKRRSESVKTASDNGINSFQKMQAEEEKLSWKNKRLLESGIVCGIFLFLANCFQQTG IQYTTVGKAGFITTFYIIIVPVVGLFFKKYCGILTWIGVVIALAGLYFLCITEKMTIQRG DALILCCSFLYTGQILAIDHYNPFVDGVKMSCIQFLTGGMLGAICMFLFESPNMAMILNA AGPILYTGIMSTGVGYTLQILGQKGLNPTVAALIMSLESVFSALSGYLFLHQVLTTRELI GCALMFIAIVLAQLPDIRRKV >gi|222441794|gb|ACEP01000148.1| GENE 15 18627 - 19427 1111 266 aa, chain + ## HITS:1 COG:FN0800 KEGG:ns NR:ns ## COG: FN0800 COG0834 # Protein_GI_number: 19704135 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Fusobacterium nucleatum # 47 266 11 230 
230 154 42.0 1e-37 MKMKKLAAVALAVVMSVSMLAGCGSSNDKKSAESSTSANGTATVKTAKDGVLTMATNATF PPYESYEGNDIVGIDADIAKAIADKLGLKLEIQDMEFNSIITAVQSGKADLGLAGMTVTD ERKQSVDFTDSYATGIQSVIVKEGSSIKSIDDLKGKKIGVQLATTGDIYAKDDFGEENVE EYNKGADAVMALTSGKIDAVIIDNQPAKSFVETTDGLQILDTDYVQEDYAAAIQKGNTDL LNAVNGALKELKEDGTIQKILDKYIK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:27:58 2011 Seq name: gi|222441793|gb|ACEP01000149.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont451.1, whole genome shotgun sequence Length of sequence - 20626 bp Number of predicted genes - 16, with homology - 16 Number of transcription units - 11, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 495 - 1733 1336 ## COG1004 Predicted UDP-glucose 6-dehydrogenase - Prom 1764 - 1823 9.7 2 2 Op 1 1/0.000 + CDS 1861 - 3033 1416 ## COG0562 UDP-galactopyranose mutase 3 2 Op 2 11/0.000 + CDS 3048 - 4196 1226 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 4 2 Op 3 11/0.000 + CDS 4207 - 5181 839 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 5 2 Op 4 . + CDS 5216 - 6454 768 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 6457 - 6505 7.2 6 3 Tu 1 . - CDS 6473 - 7570 172 ## c0970 hypothetical protein - Prom 7629 - 7688 10.0 + Prom 7585 - 7644 9.2 7 4 Tu 1 . + CDS 7743 - 9566 654 ## LAF_1406 hypothetical protein + Term 9668 - 9722 13.0 + Prom 9602 - 9661 5.2 8 5 Tu 1 . + CDS 9801 - 11054 1275 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 11141 - 11186 8.5 + Prom 11099 - 11158 8.5 9 6 Op 1 . + CDS 11250 - 11999 644 ## Ethha_0582 ATP-dependent transcriptional regulator, MalT-like, LuxR family + Term 12010 - 12067 -0.9 + Prom 12054 - 12113 7.1 10 6 Op 2 . + CDS 12247 - 13509 1051 ## COG1649 Uncharacterized protein conserved in bacteria + Term 13524 - 13579 19.1 + Prom 13597 - 13656 7.6 11 7 Tu 1 . 
+ CDS 13710 - 13922 365 ## EUBELI_00572 hypothetical protein + Prom 14141 - 14200 5.4 12 8 Tu 1 . + CDS 14259 - 14591 524 ## Rumal_2099 hypothetical protein + Term 14663 - 14703 -0.4 + Prom 14647 - 14706 4.2 13 9 Tu 1 . + CDS 14782 - 16881 2574 ## COG2217 Cation transport ATPase + Prom 17137 - 17196 7.2 14 10 Op 1 . + CDS 17380 - 17997 369 ## EUBREC_1586 hypothetical protein 15 10 Op 2 . + CDS 18076 - 18510 715 ## COG0716 Flavodoxins + Term 18748 - 18803 4.2 - Term 18736 - 18789 -0.8 16 11 Tu 1 . - CDS 18948 - 20279 1157 ## COG0534 Na+-driven multidrug efflux pump - Prom 20317 - 20376 6.8 Predicted protein(s) >gi|222441793|gb|ACEP01000149.1| GENE 1 495 - 1733 1336 412 aa, chain - ## HITS:1 COG:BH3708 KEGG:ns NR:ns ## COG: BH3708 COG1004 # Protein_GI_number: 15616270 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Bacillus halodurans # 6 412 3 388 388 521 67.0 1e-148 MSSYKIAIAGTGYVGLSNAILLSQHNEVYAVDIIEEKVNLINSGKSPIVDKEIQEYLATK DLNLTATTDAKKAYENADFVIISTPTNYDPKMNYFDTSSVEAVIKLVLEYNPNATMIIKS TVPVGYTLSVREKFNTENIIFSPEFLREGRALYDNLYPSRIIVGAPLDDERLVQAAHTFA GLLKEGAIKEDIPVLFTNPTEAEAVKLFANTYLALRVSFFNELDSYAEVHGLNTKQIIDG VGLDPRIGTHYNNPSFGYGGYCLPKDTKQLLANYKDVPENLIEAIVKSNSTRKDFIADRI LEKAGYYHGTAGRDGTPGNQVTIGVYRLTMKSNSDNFRQSSIQGVMKRLKAKGAEVIIYE PTLADDTFFGSKVEKDFDAFKANCNVIIANRFSDELADVEDKVYTRDLFRRD >gi|222441793|gb|ACEP01000149.1| GENE 2 1861 - 3033 1416 390 aa, chain + ## HITS:1 COG:Cj1439c KEGG:ns NR:ns ## COG: Cj1439c COG0562 # Protein_GI_number: 15792757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Campylobacter jejuni # 26 390 2 365 368 466 63.0 1e-131 MLKYYGYITEQYFLEKNVKGEIMKKYDYLIVGAGLFGAVFAHEMTKEGKKCLVIDKRDHI AGNIYCENVEGINVHKYGAHIFHTSDKVVWEYINQFAEFNNYINSPVARYKDELYNLPFN MNTFSKMFGIKTPAEAKAKIQEEIEELGITEPKNLEEQALSLAGRSVYEKLVKGYTEKQW GRDCKDLPAFIIKRLPLRFIYDNNYFNDRFQGIPMGGYTAIVEKMLEGSDVLLNTDYYEF RKENAGIAEKTVYTGMLDKYFDYKYGVLEYRSVRFETETLDMDNYQGNAVVNYTEREVPY 
TRIIEHKHFEYGTQPKTVISREYPSEWKLGEEPYYPVNNEKNEEVAGKYRELADKEENVI FGGRLGEYRYYDMDKVIAAALKAVEKEKNN >gi|222441793|gb|ACEP01000149.1| GENE 3 3048 - 4196 1226 382 aa, chain + ## HITS:1 COG:Cj1434c KEGG:ns NR:ns ## COG: Cj1434c COG0463 # Protein_GI_number: 15792752 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 3 356 8 356 445 253 41.0 3e-67 MAKVSIIIPTYNVEMYLVECMESVVNQTLKDIEIICINDGSTDGSLEILKSYAQKDDRIV LVDKENGGYGIGMNIGLDKATGEYIGIVEPDDFIPLNMYSDLYEKAVENDLDFIKADFYR FKRDSETEDMELVYNHLSPNKEDYNIVFNPSETPKAIRYIMNTWSGIYKRSFIEEFHIRH NETPGASFQDNGFWFQTFAYAKRGMIIDKPYYMNRRDNPNSSVHNKEKVYCMNVEYDHIR ELLVRDKEVWERFRSMYWLKKYNNYMGTIKRIGEEYKREYVHRFSDEFKRGLELGELDES VFTKISWSNIQFITKDPNGFYMKKVATPVKVRRLTKKVEKLEKQNAKLQNDIKIIKGSRS YRLGWFMLTIPRKIKALFKGNK >gi|222441793|gb|ACEP01000149.1| GENE 4 4207 - 5181 839 324 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 222 5 219 344 126 33.0 7e-29 MSVKVSVILPVYNVSEYLRQCMDSIVGQTLKDIEIICVDDGSTDDSLEILKEYEAKDKRV KVIEQKNAGAGAARNNGLAIATGEYLSFLDSDDFFEPDMLEKAYEKGKSSNAQVVVFRSD QYREDLNEFVQVKWTLREKQIPPYRPMNCRSFTGNIFKVFVGWAWDKLFERKFVEENHLK FQQIRTSNDMLFVFSALAFAQRMEIVDLVLAHQRRNNPNSLSNTREKSWDCFHTALNALK DALTAHGNFWEFEQDYINYALHFSLWHLDTITGAKKEVLFNKLKNEWFAEFGIDSLPAER FYDKKEYAKYLKIKSNEFNAYYEE >gi|222441793|gb|ACEP01000149.1| GENE 5 5216 - 6454 768 412 aa, chain + ## HITS:1 COG:ECs4493 KEGG:ns NR:ns ## COG: ECs4493 COG0463 # Protein_GI_number: 15833747 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli O157:H7 # 10 267 8 260 338 113 30.0 6e-25 MSTDKKEYGISVIIPCYNVGQYVEETLKSVLNQSFKNYEIICLNDGSTDGTLEILNKYQS LYPYIQVFTGENHGVAYQRNKGVQRARGKYIYYLDGDDLIKENCLETLYQYAEADSLDIL 
YFEAESFYESKEIEEAFPQFLTLYHRHKEYDGVYDGKNLYIEMENKGDIKMSVCLQFVRR QFLIDNNIKFGEERYFEDNLYTVRTMIKAARARCVRDNLYLRRVRGNSIMTGGKRKIRFE SYLEVVRGLMQILEDEKQNSALQEAIYKRIRGTYINIFKDYLKVPESERDELFGEKNSPL YLLIGMAFFMNIEEVDRKKVSEKLKKTYAEKSEINEKLQRTYAEKSEINAKLKKAYEEKT ERGEKLKKLRKHLKDTKAELEKKSKRLEQLEQEKKQLEKHFLGRFVLEKYNK >gi|222441793|gb|ACEP01000149.1| GENE 6 6473 - 7570 172 365 aa, chain - ## HITS:1 COG:no KEGG:c0970 NR:ns ## KEGG: c0970 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_CFT073 # Pathway: not_defined # 1 349 1 329 343 67 24.0 1e-09 MKRNPSIELCRLLACLIVIGVHTSLSAAVNEHSWDTFRLFLNCLLADGVAIFFMITGAFL FNNDYCRLLKKTLKHIIIPLIIFSIIGFYCFDWLFDKQPLSLHIAASKYIDTFRTLLSWQ NPISHSVHLWYLYVYILVILIYPLLKAFVDYLDADTNRTKIFLIGTFIFLIMNDISSNSL GTFSHHSINALVPASIEIIWGHILYKYQKIFTRKLAIPVSIFLFFSLNYLRSLIQLHRYS LETPNNYILFWYSSIGIICASAIFIFCCSVIKAPGSSNIIQKFICWLASYTFPVYLIHIP IRDLLATYNVPTMLNHFMFSHFSEFIADILYNSSIVLIIFVICLIISILFRNIKKLLNKV IHCFA >gi|222441793|gb|ACEP01000149.1| GENE 7 7743 - 9566 654 607 aa, chain + ## HITS:1 COG:no KEGG:LAF_1406 NR:ns ## KEGG: LAF_1406 # Name: not_defined # Def: hypothetical protein # Organism: L.fermentum # Pathway: not_defined # 75 577 62 558 592 203 29.0 2e-50 MMQECSMRDKIKPGVYAILSAVALCISGRFSVIIGGWAKLAAELSGEKEQAFVIVILLMI LYRKSWSIYISGRKWITHILAVCFSCCMLIGKSFSQTGNIKFIFGDIKQFCIAVIVFFGF YILFDVAITLLYVYINEKSEEKEKSIKIKWIEEHYFAFSFLCMLLCWSPYILCYLPGSVP HDGYWQLNMAFGINPLTNHHPWVITFIYGVVMRIGRYISDNFGIFMIVAIFTVIEILCYA SVCNSLKKWGASKKVYIGTLVFFSVVPAFGGYAQAVIKDNIFTALLALFFIIYIDICIQH GKNIEIKKMVILFLVGMMVCLSRNNGVYIVIPSMVCLTLYVQKERSRYVILLICLMVCYQ GLEGYVAPQLGVEKGSVKEVLSIPFQQTARYIKEYPEEVTLKEKKAINDILSYDGIKENY NPEISDFVKNTFREGSEDKLDEYFKVWFEMFLKHPMVYIEATLNNTFSYYYPFYNEKVLG DYQFYIKGAPVATGYFDIHYITPVKIRTVLAEYAQIWKKVPGFSQLVDPGFYTWILLLLA GYLIYRKRTRDILLLIGPAINILICIASPVNGLLRYALPLMACTPLLIYLTVREEKESET EGENRPL >gi|222441793|gb|ACEP01000149.1| GENE 8 9801 - 11054 1275 417 aa, chain + ## HITS:1 COG:L16653 KEGG:ns NR:ns ## COG: L16653 COG0463 # Protein_GI_number: 15672197 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Lactococcus lactis # 4 225 1 220 333 112 32.0 2e-24 MEAIKVSVVVPVYNVDEFLDNTLSDITGQTLREIEIICVDDGSTDNSCKIIEEWMEKDSR IQLIRQKNQYAGVARNNGLKQAHGKYVIFWDADDLFEHNALEVMYAQAEQENSDICICEA RKYDNAKEKYIPSDAYLKEDLLPGKQTFNKFDVPDYIFNLTNNVPWNKLYLKEFITKNKL QYQAIKQANDTYFTIMALFLAERITYVKDVLIAYRVNNDESLSGKASDTVFCAYDSWLYA KEHIEKYPDFNLVRFSFLNRALSGFYHALNIQTTFESYEKLYRKLVEEGFHEFGLDECTE ENIYAVWMYKDMQKMYETEPADFLVQKSITRRVNNENNNVRRKILREKNAVLRENVQALK EDKKTLQSDKKKLQEEKKKLEGEKKKLQKEIDNIKQSKAYRLGNKLLWLPRKVVKRG >gi|222441793|gb|ACEP01000149.1| GENE 9 11250 - 11999 644 249 aa, chain + ## HITS:1 COG:no KEGG:Ethha_0582 NR:ns ## KEGG: Ethha_0582 # Name: not_defined # Def: ATP-dependent transcriptional regulator, MalT-like, LuxR family # Organism: E.harbinense # Pathway: not_defined # 1 246 1 244 244 304 60.0 3e-81 MNTLQTNDWLILNSIIYEIYTTADFDGMRKKFLEQMKMLVDFDSADFYLAAADGEHKLTA PVMYHCDEDLSDVYDTIDYGRGILYSGKSIVYRETDIMADEVRTKTEYYDKVYKPNRWHY SLQMIIAKDKQFLGVVTLYRNIGKDDFNHDEVFLVDTLKEHMAYRLWQQRQNRIQFGEKL TVSAATEKFGLTRQEHNILHLLMEGKDNSFICDYLSISINTLKKHILNIYRKLGIRNRVQ LFKKIREKE >gi|222441793|gb|ACEP01000149.1| GENE 10 12247 - 13509 1051 420 aa, chain + ## HITS:1 COG:BS_yngK KEGG:ns NR:ns ## COG: BS_yngK COG1649 # Protein_GI_number: 16078889 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 45 372 30 360 510 195 35.0 1e-49 MLKKGKLFLVVLLTCFLLGNTFGSIAVNAAGNSASQAAQTVGTMAKSDEYKAFWFSYYDY DAYRTKYKKRNASTFKKYFTKVVKKGKSLGMNCIIVHVRPFGDAMYKSKYFPWSKCISGK QGKNPGFDPLKIMTSVAHANGMKIEAWINPYRVAAGSTNYSRLSSKNLARKWHNNKKTRR NVLAYRGSLYYNPAKAQVRELIVNGAKEIVQNYDVDGIHMDDYFYPTFNSSNVNSAFDAK EYKASTMGKHKNGIVAFRRQQVNALVKAIHSAVKATKPNVTFGISPAGNIDNLTSKYSYY VDINKWLNSHDYVDYICPQIYWGFKHPYAKFDRVTKRWMKAAKSKKVKVYIGIAVYRAGH NTGASSSERREWRSDANVLKKQVQYARKKGCDGFAFFDYQDLKSKTSARAVKNLKKVLKR >gi|222441793|gb|ACEP01000149.1| GENE 11 13710 - 13922 365 70 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00572 NR:ns ## KEGG: 
EUBELI_00572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 70 14 83 83 84 67.0 1e-15 MLPLTMASQGEPMTIKKIGGKQETKKFLETLGFVVGGTVTVVSEINGNMIVNVKDSRVAI GKDMANKIMV >gi|222441793|gb|ACEP01000149.1| GENE 12 14259 - 14591 524 110 aa, chain + ## HITS:1 COG:no KEGG:Rumal_2099 NR:ns ## KEGG: Rumal_2099 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 10 75 6 71 91 91 71.0 1e-17 MLECLTKINWKKVGLFASGTLFGTAGIKVLASDDAKKVYTNCTAAVLRAKETVMNTVTTV QENAEDIYEGAKQINEERAEAKVAAEFAEEAEAAEEETTEEVEAEAEAAE >gi|222441793|gb|ACEP01000149.1| GENE 13 14782 - 16881 2574 699 aa, chain + ## HITS:1 COG:SP2101 KEGG:ns NR:ns ## COG: SP2101 COG2217 # Protein_GI_number: 15901916 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Streptococcus pneumoniae TIGR4 # 1 692 1 682 687 397 33.0 1e-110 MKFQIKHEIKGRMRLHIFQKKQMSCEQADILLYYLENIEGVESAKVYERTADAVVRYAAD KDGFVRQNIIKEVQKFVYDKVEVPDGLIENSGRELNREFQEKLIGQITAHYARKFLLPAP FRAVYTGIKSLGYIWKGLQTLAQRKLEVPVLDATAIGVSILRGNYDTAGSVMFLLGIGEL LEDWTHKKSVGDLARIMSLNVEKVWLKTEDGEVLVPYSQIKQGDSIVVHMGNVIPFDGTV LAGEAMVNQASLTGESMPVRKSEETSVYAGTVVEEGEITVKVKAVGGSGRYDKIVTMIED SQKLKSGLEGKAEHLADKLVPYTLAGTGLVYLATRNATKAIAVLMVDFSCALKLAMPISV LSAIREASQHNVTVKGGKYMEAVAEADTIVFDKTGTLTKAEPTVAEVVPFGGNDADEMLR MAACMEEHFPHSMAKAVVDAALKKELDHEEMHSKVQYIVAHGISTTIEGKTAVIGSYHFV FEDENCKIPAGEEAKFEALPEEYSHLYLALEGTLAAVICIEDPLREEAADVIKALKKAGI SKVVMMTGDSERTAAAIAKRVGVDEYYSEVLPEDKANFITKETEKGRKVMMIGDGINDSP ALSAASVGIAVSDGAEIAKEIADITISAESLYEVVALKKISNALLKRIHKNYRSIVGINS GLIVLGVAGVLAPATSALCHNVSTLLISLNSMKNLLPEE >gi|222441793|gb|ACEP01000149.1| GENE 14 17380 - 17997 369 205 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1586 NR:ns ## KEGG: EUBREC_1586 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 193 1 194 205 172 44.0 1e-41 MRQEVFEIVQKMDKDNIRTQMALQCAPLITGLKVANLLIIPSKNEEFVGAILDGTDISYM 
RLAKSECKTTFFLYREASLTVWLTKAENRVLLREAGYNGKVLSDILRAVQIRYEAYVQKG KDFPHEIGVLLGYPAEDVKGFVVNEGKNYLYSGYWKVYGDLSEAKQLFYKFDRAKEALIE LVSQGIGIRNVIESCSSNLLLDGAA >gi|222441793|gb|ACEP01000149.1| GENE 15 18076 - 18510 715 144 aa, chain + ## HITS:1 COG:CAC0587 KEGG:ns NR:ns ## COG: CAC0587 COG0716 # Protein_GI_number: 15893876 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Clostridium acetobutylicum # 1 140 1 139 142 123 53.0 1e-28 MSKISVVYWSQTGNTEAMAEAVGAGIAKAGKEADVVEVSSASMEDLKAAKVFALGCPAMG AEVLEEMEMEPFVEEVEGFAQGKTIALFGSYGWGDGQWMRDWEERMTAAGATVLNGEGLI CQETPDDDAIKQCEELGEQLAQNA >gi|222441793|gb|ACEP01000149.1| GENE 16 18948 - 20279 1157 443 aa, chain - ## HITS:1 COG:lin2873 KEGG:ns NR:ns ## COG: lin2873 COG0534 # Protein_GI_number: 16801933 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 6 442 8 446 450 184 26.0 2e-46 MEQTYMKEKPILPLLLSMAAPMILSMLVNSLYNIIDSIFVAKISEDAMTALSLVFPIQNV VNSVAVGFGIGINVMIAFCLGADQREQADKAASQGMFLSIVHGIILSITGILIMPSFLKL FTNNQDTLSLGLRYSNIVLLFSIIIHVEITFEKYFQAMGMMVVSMLSMLCGCILNIILDP VLIFGIGPFPRLGIEGAAIATGIGQCSTLLIYILFYIFKKPPVTIRKHLLRPEKNVCIRL YGVGIPGALNMALPSLLISALNGILASISESGVIILGIYYKLQTFLYLSVNGMIQGMRPL MGYNYGAKEYRRVNKIYRLALTIAIGIMTAGTLLFMFTPETFMKMFTTQETTIAAGASAL RIISCGFIVSSVSVVSSGALEALGKGNESLVVSLLRYLVIIVPLAFVLSRFIGANGVWHS FWITETVTAFVAYGIYRKALRNK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:28:26 2011 Seq name: gi|222441792|gb|ACEP01000150.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont453.1, whole genome shotgun sequence Length of sequence - 1009 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 293 218 ## BDP_0721 ImpB/MucB/SamB family protein involved in DNA repair (EC:2.7.7.7) 2 1 Op 2 . 
- CDS 248 - 982 500 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair Predicted protein(s) >gi|222441792|gb|ACEP01000150.1| GENE 1 2 - 293 218 97 aa, chain - ## HITS:1 COG:no KEGG:BDP_0721 NR:ns ## KEGG: BDP_0721 # Name: not_defined # Def: ImpB/MucB/SamB family protein involved in DNA repair (EC:2.7.7.7) # Organism: B.dentium # Pathway: not_defined # 1 97 193 289 530 164 73.0 1e-39 MDIMAKHVPADEYGVRIAYLDEITYRKKLWEHQPITDFWRVGKGYAKKLAAYQIYTMGDI ARCSVGKEKEYHNEELLYKLFGINAELLIDHAWGYEP >gi|222441792|gb|ACEP01000150.1| GENE 2 248 - 982 500 244 aa, chain - ## HITS:1 COG:SA1196 KEGG:ns NR:ns ## COG: SA1196 COG0389 # Protein_GI_number: 15926944 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Staphylococcus aureus N315 # 2 197 9 171 420 124 36.0 1e-28 MDKIYAVIDLKSFYASVECVERGLDPLTTNLVVADKSRTEKTICLAVSPSLKKYGIPGRP RLFEVIQKVKRINKERQETAPGHKFIGQSFHSDKLSDPSVALAYITAPPRMSLYMKYSTQ IYQIYLRYFAPEDIHVYSIDEVFIDLTGYLTNYQMGAKELISKVIQDVLKETGITATAGI GTNLYLAKIAMDIMAKHVPADEYGVRIAYQKQGSQLLQGLVRIYILQRLQWTLWQSMCQQ MSMG Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:28:31 2011 Seq name: gi|222441791|gb|ACEP01000151.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont454.1, whole genome shotgun sequence Length of sequence - 11690 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 7, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 30 - 89 5.5 1 1 Tu 1 . + CDS 113 - 1489 1462 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) 2 2 Tu 1 . - CDS 1739 - 2653 881 ## COG0530 Ca2+/Na+ antiporter - Prom 2874 - 2933 10.1 3 3 Op 1 . - CDS 3090 - 4055 1142 ## SGGBAA2069_c22740 hypothetical protein 4 3 Op 2 5/0.000 - CDS 4057 - 4671 502 ## COG4709 Predicted membrane protein 5 3 Op 3 . 
- CDS 4661 - 4990 347 ## COG1695 Predicted transcriptional regulators - Prom 5040 - 5099 13.2 + Prom 5002 - 5061 8.1 6 4 Op 1 . + CDS 5274 - 5435 84 ## 7 4 Op 2 8/0.000 + CDS 5482 - 5853 370 ## COG1725 Predicted transcriptional regulators 8 4 Op 3 . + CDS 5838 - 6749 220 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 9 4 Op 4 . + CDS 6724 - 8751 1322 ## Closa_1704 hypothetical protein 10 5 Tu 1 . + CDS 8831 - 9118 226 ## gi|225028780|ref|ZP_03717972.1| hypothetical protein EUBHAL_03059 + Term 9204 - 9250 -0.5 11 6 Tu 1 . - CDS 9371 - 10753 1409 ## COG2252 Permeases - Prom 10864 - 10923 6.4 12 7 Tu 1 . - CDS 10962 - 11591 846 ## COG0035 Uracil phosphoribosyltransferase - Prom 11621 - 11680 2.6 Predicted protein(s) >gi|222441791|gb|ACEP01000151.1| GENE 1 113 - 1489 1462 458 aa, chain + ## HITS:1 COG:MA3198 KEGG:ns NR:ns ## COG: MA3198 COG1350 # Protein_GI_number: 20092014 # Func_class: R General function prediction only # Function: Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Methanosarcina acetivorans str.C2A # 9 436 15 441 442 492 58.0 1e-139 MSNKEIPYKIYLEENEMPDSWYNVRADMKKKPAPLLNPATLEPLKPEELEPVFCKELVKQ ELNDTDAYIPIPQEIKDFYKMYRPSPLVRAYCLEKKLGTPAKIYYKFEGNNTSGSHKLNS AIAQAYYAKQQGLKGVTTETGAGQWGTALSMACAYFDLDCKVYMVKVSYEQKPFRREVMR TYGASVTPSPSTTTEVGKKILEEHPGTTGSLGCAISEAVETATKHEGYRYVLGSVLNQVL LHQSVIGLESKIAMDKYGIKPDIIIGCAGGGSNLGGLISPFMGEKLRGEADYHFIAVEPA SCPSLTRGKYVYDFCDTGKVCPMAKMYTLGSGFIPSANHAGGLRYHGMSSIVSELYDQGL IEARSVEQTSVFEAAEQFARVEGILPAPESSHAIRVAIDEALKCKETGEEKTILFGLTGT GYFDMVAYEKYNNGEMSDYIPTDKDLEAGFAGLPKQPE >gi|222441791|gb|ACEP01000151.1| GENE 2 1739 - 2653 881 304 aa, chain - ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 1 303 1 315 318 194 42.0 2e-49 MEYVLLVVGFILLIKGADFFVEGSSSLAQILKVPSVVIGLTIVAMGTSAPEASVSINAAL 
AGSNDIAISNVVGSNIFNGLVVVGVCAFLSAFKTNRDILKRDMPLNILVTFILCLMFLDG KLNRIEGGILFAGMILYITVMIYHAIKNKSENEPEKILSLPRSIIYMIAGLAAVIFGGDL VVDKACIIATNFGVSQNFIGLTIVAIGTSLPELVTSIVATKKGDSGLALGNAIGSNLFNI LFILGFSAILSPLHVLNESIIDCGILLISSILLFIFAKTKQHMSKLEGIVCIGLYLAYMG YLLI >gi|222441791|gb|ACEP01000151.1| GENE 3 3090 - 4055 1142 321 aa, chain - ## HITS:1 COG:no KEGG:SGGBAA2069_c22740 NR:ns ## KEGG: SGGBAA2069_c22740 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus_gallolyticus # Pathway: not_defined # 1 321 1 301 301 87 28.0 8e-16 MKKTTRTLLGIGGAFCLIGALLFGIGFTSGGTKYVHATNLNSMNAQEDNKESANRFTLKK TEISDFTSLNINLEDSDLNIEESPNEKSYIEYAQATKDKKNPVSYFIENGTLTLKEAGNT GASYYINVDISFLQAALSSKNIEDYTKESSKYENYVTLYLPKDKLVSSAKIYLGYGDFYV KNAAFDTINIKLDDGDFDANTLTANSGKLTFSYGDCDLQKASLGNISLTSKDGDLTVNEL TLSGKTKIALSYGDAEITLNDSTKKVSGFDLATHYGTITASGLNGNKTLDEDDDVNTFQS DSKDGKTKLAIDSKDGDITVK >gi|222441791|gb|ACEP01000151.1| GENE 4 4057 - 4671 502 204 aa, chain - ## HITS:1 COG:SPy2173 KEGG:ns NR:ns ## COG: SPy2173 COG4709 # Protein_GI_number: 15675910 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pyogenes M1 GAS # 14 204 1 194 195 115 35.0 4e-26 MENKRYKKEQDIHLTKEEYLAQLKKYLKRLPKEDYNNAMDYFTEYFEDAGPEGEAALIQE LGTPKEAAYEILDNLITEKRKDPDTPIWKIIFLTFLSICAAPIGGALALTIIALALAGVL VIVAGLLAIFSFGIAFALVGGKLFIRGIIAITASLSGASLISGAGLFSIGISILAILAVF CFCKWIVLVLAHFIQNMSRKRSVK >gi|222441791|gb|ACEP01000151.1| GENE 5 4661 - 4990 347 109 aa, chain - ## HITS:1 COG:SPy2172 KEGG:ns NR:ns ## COG: SPy2172 COG1695 # Protein_GI_number: 15675909 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 3 107 2 106 108 139 68.0 9e-34 MFYYPLSSLLIECLILSIVEREDSYGYEISQTVKLVANIKESTLYPILRKLEQNGYLHTY SQEFQGRNRKYYSITDSGKEQLIFLKKEWNEYYHTIDDIIEGRMIPDGK >gi|222441791|gb|ACEP01000151.1| GENE 6 5274 - 5435 84 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIEFSLFRLPLDIGVYLRYIKCINNVNTLLTYLVQYMLRFPEVFEDLAALAVE >gi|222441791|gb|ACEP01000151.1| 
GENE 7 5482 - 5853 370 123 aa, chain + ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 6 89 5 88 121 85 50.0 3e-17 MIFIDYNDKRPIYEQVTEKIQTLILNGVLEPDSKLPSVRSLAMELSINPNTIQRAYSELE REGFIYSVKGRGNFVKMNENLIIKEQEKLLIKFKENVEELRARGILKEQALACINEAYEE VER >gi|222441791|gb|ACEP01000151.1| GENE 8 5838 - 6749 220 303 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 248 7 260 312 89 25 1e-17 GGGKMIEIKKISKRFDKIEAVSDVSLSIEEGQVFGLLGTNGAGKSTLLRMMAGVLKEDEG EILIDGEAVWDCVEAKQKFFYISDEQYFFPNATPLDMAAFYKTLYPAFNEKRFRKFLVKF TLQKDRKIQTFSKGMKKQLSILLGVCSGVKYLFCDETFDGLDPVMRQGIKSILANEIDER KFTPIIASHNLRELEDICDHVGLLHKGGVLLSKDLETMKCNIQKVQCALPKEDDKKLEKM FDILQFNRRGKLVTMTVRGSEKQVMAKLSVLKPVFYECIPLSLEEIFISETEVVGYDIKN LIL >gi|222441791|gb|ACEP01000151.1| GENE 9 6724 - 8751 1322 675 aa, chain + ## HITS:1 COG:no KEGG:Closa_1704 NR:ns ## KEGG: Closa_1704 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 656 1 697 725 188 26.0 6e-46 MTSRISYSKLSRENRKRMLGMTIITVLTFFIKAVFLVMGVNECSSHTEVLRLFQPNMGMA GVVIVLAACSAAVSLYYLHSRKQTDFYESLPIKRATLFRLVVQNSLFIFLIPLVIEEAIE FVIAQQQISGGGWEIVSSSLWYLLIFAATWLTMALAMIMTGNIIVGILGFGVFASYFPIV IYNIFPIYAGSFFATYSGNTADNVYNNITSYLSPVWVGLRGMADINSGRETQIKYMMILL LWIVGLYVLCRTLYNRRPAESAGRAMAFTKANTVIKVLLVIPSALYSGIIFYSLGNARYI FWLIFGVVFGVFVIHALIECIYEFDVRAIFCHKKQLIVIMFGSLFIMASFSADLFGYDRY IPKASRVDSIEIKPLGVNSYGYWGKGEERSGLNGEEKQLALSVIKESVEVKNNAETSSSE YYEKNSKFLFFSSVSKKEVMRDDIEISTTFCMKNGNRKVRNYILRSQKAKELYNKLYATR EFKKNIYSLYTRDYNDIKEITWSGIETEYLNLTKEEKKQFLDTYLSELDSLSYTDVQKTV PVGNIDISIGGTMGSENATQDTYYIYPSFKKTLAFLAQKGCAVNQTIDNLDISSIGVEGY DGSTGENIRYGITDKKRINKLKKKLVLQQMENVPGITVINGNMEIQVNYKNKNGGNSIYC YSICTDDELQQLLKE >gi|222441791|gb|ACEP01000151.1| GENE 10 8831 - 9118 226 95 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|225028780|ref|ZP_03717972.1| ## NR: gi|225028780|ref|ZP_03717972.1| hypothetical protein EUBHAL_03059 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03059 [Eubacterium hallii DSM 3353] # 1 95 1 95 95 175 100.0 1e-42 MYFIEIFQAFTGRLPEVDDLIANTAGFLVGYLTAQGFLELFDKKAYKYGIIRILSTVILF CISIFALSFWANGDALQAEEEAYVKMMEQYGMEHR >gi|222441791|gb|ACEP01000151.1| GENE 11 9371 - 10753 1409 460 aa, chain - ## HITS:1 COG:VC2278 KEGG:ns NR:ns ## COG: VC2278 COG2252 # Protein_GI_number: 15642276 # Func_class: R General function prediction only # Function: Permeases # Organism: Vibrio cholerae # 1 459 1 428 430 338 47.0 2e-92 MLEKFFKLSENHTDAKTEILAGITTFMTMAYILAVNPSIMAATGMDSGAVFTATALAAFI GTLLMAIFANYPFALAPGMGLNAYFAYTVVIGMGYTWQYALTAVFAEGIIFILLSLTNVR EAIFNAIPMNLKSAVSVGIGLFIAFVGLQNAHIVVGGSTLLQLFSVDAYNKANGVEASFN NVGITVILALAGIIITGILVVKNIKGNILWGILITWGLGIICQFAGLYVPNADLGFYSLL PDFSKGLSIPSLTPIFGKLQFKGIFSVDFIVILFAFLFVDLFDTIGTLVGVSAKADMLDE EGKLPHIKGALLADAVATTFGAILGTSTTTTFVESASGVSEGGRTGLTAVTTAILFGLSL FLSPIFLAIPSFATAPALVIVGLYMLSNVTNINFTDMSEAIPAYVCIIAMPFFYSISEGI SMGIISYVVINLITGEAKDKKISALMYVLAILFILKYIFL >gi|222441791|gb|ACEP01000151.1| GENE 12 10962 - 11591 846 209 aa, chain - ## HITS:1 COG:lin2682 KEGG:ns NR:ns ## COG: lin2682 COG0035 # Protein_GI_number: 16801743 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Listeria innocua # 1 209 1 209 209 272 59.0 3e-73 MNNVIEMTHPLIKHKISILRDKNTGTNEFRKLIEEIGILMGYEALRDLPLEDVEIETPIE TCKTPMISGKKLAIVPILRAGLGMVNGILALVPSAKVGHIGLYRDEETHEPHEYYCKLPD PIDQRLIVVLDPMLATGGSAIAAINFIKEHGGKNIKFMSIIAAPEGVERLAKAHPDVQIY CGNIDRQLNEDAYICPGLGDAGDRIFGTK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:29:05 2011 Seq name: gi|222441790|gb|ACEP01000152.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont455.1, whole genome shotgun sequence Length of sequence - 30203 bp Number of predicted genes - 23, with homology - 23 Number of transcription units - 11, operones - 6 average op.length - 3.0 N Tu/Op 
Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 47 - 1435 1758 ## COG2509 Uncharacterized FAD-dependent dehydrogenases - Prom 1498 - 1557 9.9 + Prom 1782 - 1841 6.8 2 2 Op 1 24/0.000 + CDS 1973 - 3802 2125 ## COG0845 Membrane-fusion protein 3 2 Op 2 . + CDS 3833 - 4618 288 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 4 2 Op 3 . + CDS 4620 - 7430 2722 ## EUBREC_2392 hypothetical protein 5 2 Op 4 . + CDS 7442 - 8698 397 ## PROTEIN SUPPORTED gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 + Prom 8777 - 8836 6.7 6 3 Tu 1 . + CDS 8895 - 10241 1584 ## COG1109 Phosphomannomutase - Term 10246 - 10295 13.1 7 4 Op 1 . - CDS 10471 - 10740 395 ## gi|225028789|ref|ZP_03717981.1| hypothetical protein EUBHAL_03068 - Prom 10761 - 10820 2.4 8 4 Op 2 . - CDS 10822 - 11745 895 ## COG0248 Exopolyphosphatase - Prom 11792 - 11851 8.1 + Prom 11848 - 11907 8.3 9 5 Op 1 . + CDS 11940 - 12116 233 ## Ccel_2580 hypothetical protein 10 5 Op 2 . + CDS 12142 - 12276 135 ## gi|225028793|ref|ZP_03717985.1| hypothetical protein EUBHAL_03072 11 6 Tu 1 . - CDS 12269 - 12448 244 ## Bacsa_1905 hypothetical protein - Prom 12555 - 12614 10.2 12 7 Op 1 30/0.000 + CDS 13105 - 13581 668 ## COG0066 3-isopropylmalate dehydratase small subunit 13 7 Op 2 30/0.000 + CDS 13585 - 14838 1590 ## COG0065 3-isopropylmalate dehydratase large subunit 14 7 Op 3 30/0.000 + CDS 14903 - 15463 726 ## COG0066 3-isopropylmalate dehydratase small subunit 15 7 Op 4 . + CDS 15473 - 16711 1430 ## COG0065 3-isopropylmalate dehydratase large subunit + Term 16868 - 16904 -1.0 + Prom 16880 - 16939 5.7 16 8 Op 1 . + CDS 16999 - 18471 1665 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 17 8 Op 2 . + CDS 18483 - 19139 496 ## COG0406 Fructose-2,6-bisphosphatase 18 8 Op 3 24/0.000 + CDS 19181 - 21097 2313 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit + Prom 21143 - 21202 8.2 19 8 Op 4 . 
+ CDS 21238 - 23502 2190 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 20 9 Tu 1 . - CDS 23730 - 24806 718 ## COG0469 Pyruvate kinase - Prom 24894 - 24953 6.4 + Prom 24998 - 25057 10.6 21 10 Tu 1 . + CDS 25166 - 26149 710 ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II + Term 26274 - 26332 8.1 + Prom 26155 - 26214 6.0 22 11 Op 1 5/0.000 + CDS 26347 - 26706 402 ## COG0640 Predicted transcriptional regulators 23 11 Op 2 . + CDS 26803 - 29970 3703 ## COG2217 Cation transport ATPase + Term 29997 - 30058 18.1 Predicted protein(s) >gi|222441790|gb|ACEP01000152.1| GENE 1 47 - 1435 1758 462 aa, chain - ## HITS:1 COG:CAC3595 KEGG:ns NR:ns ## COG: CAC3595 COG2509 # Protein_GI_number: 15896829 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 5 456 3 450 457 531 57.0 1e-151 MNQVYDVIIIGCGEAGIYAGYELSLKNPSLKIGVFEQGRDIYKRSCPIVAKKVKQCINCK VCDTMCGFGGAGAFSDGKFNFTTEFGGWLTDYMDHDEVMELIHYVDDVNVAHGATTSYFS TTTPEAQALAKKALEFDLHLLQAQCKHLGTEKNLMILTNIYEDLKDKMDFHFNTAISEIK TCSEGYELVTEKGDVARCQYLIAAPGRSGAEWFANQCKNLGIKLINNQVDIGVRVELPAR VFEHITDVVYESKLVYRTKQYGDSVRTFCMNPYGHVVAENVEGINTVNGHSYSDASLRSE NTNFALLVSNRFTEPFDEPYRYGKHIASLSNMLAGGVLVQRFGDLVKGIRTNEHRLSQSF VKPTLTAAVPGDLSLALPKRQLDDIIEMIYALDKVAPGTANYDTLLYGAEVKFYSSRLEL SHELETKLPGFFAIGDGAGTTRGLAQAGASGIKAARAVLARL >gi|222441790|gb|ACEP01000152.1| GENE 2 1973 - 3802 2125 609 aa, chain + ## HITS:1 COG:BS_yvrP KEGG:ns NR:ns ## COG: BS_yvrP COG0845 # Protein_GI_number: 16080382 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Bacillus subtilis # 123 525 2 391 397 131 31.0 4e-30 MFRKKNGQEFMDETNLDSSTNDKEEKKGFFHRKKKKEINLYEDDMGYIEDDSNFDEDEYD KELEELQEEEKPEKKRGFFHRKKKEEEKSGGTEKLEGTKEEEESLLSDDDAIILQEPEKK KFSKKKLALICGTAAVIVIAVVFFVIRNIGGTSEGKAYVESVREITGLGTAGGANNRYTG TVDAEKSWKITLQSDLSVEKRYVNVGDQVKKGDKLFKYNTQELKLSKEKKELEQETLQNE 
ITQLTKDIKSYQSDLKSASASEKIQLQTQILTAQTTIKKDQYTIKSNKESIKLLEKNIKD ATVKSKMNGLVKKVNASLETSSSDDSSDDGSDSGDGSSDDSSYMTILAVGNYRIKGTVSE TNVWSLNEGDPVIVRSRVDNSQTWKGTISSIKTDTTADDTSDSSSDYSDGYTDDESSTGE TASKYNFYVKLDNDKDLMMGQHVLIEQDNGQDEQKEGMWLPSAYIKKNNNKYYVWLDSHN KLKLHEIQVGEYDENLDEYEVKSGLKTSDYIACDDSTLKEGMKVTKVSPEDNTSGYGDEQ ENDDAESFGSDASEDTSDVGEDYSDYSDSADTYDSSDDGSLSDGADFGSDSSDSGNSGTS DSDIADFGE >gi|222441790|gb|ACEP01000152.1| GENE 3 3833 - 4618 288 261 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 226 6 221 311 115 31 3e-25 MLLELKHIYKNYLQDKIVIPVLKDVNISIDEGEYVAIMGPSGSGKTTLMNIIGCLDRPTK GEFLLEGESIVNYNENRLSDLRLNTLGFVFQSFNLLPKQTSLDNVALPLSYAGVPVKKRK EIAFQALKRVGLAERVHFRPSQLSGGQQQRVAIARALVNNPKIILADEPTGALDSRSGIQ VMNLFQQLNEEGVTIIMITHDANIANHAKKVLHIFDGEIVEDKERKEKQQAEVQSEKEAK NDTVFSEIEQDFGVKEWEGEI >gi|222441790|gb|ACEP01000152.1| GENE 4 4620 - 7430 2722 936 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2392 NR:ns ## KEGG: EUBREC_2392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 650 930 249 522 525 161 38.0 1e-37 MKGKKKILLAVALVLVMAVGITSIVVIRNRNKTVVKVFPVSMLNSSDWFDTDTSLTGTLT SDYIQEVHIDGGQKVKKVYVKKGDKVKKGDILLKYNVSEKELDLKLQKLQIQSSKMEIAE MEKELKKLKNTKTVGAVNTGNSDVMNASVNLSAIKGTSLSYLVAEANGNTEGTSSDVKVQ SGNAVSKSGSGSGSESGSGSGSGSESGSGSGSGSESGSGSGSGSESGSGSGSGSESGSGS GETKLVLRDHVNSEADKAATSGAGTKEDDSLIFTLTKNGTISGAFVKKLISEGKYAEFRS YKSEESKNPIATFELSPDTKVNVYDNDNYTVNKLNEGGDSRTLHNSIESVSERDSSSGDG SSAGNAYIYYLNVNGRVKGSVIKSLHDSKKYATFIEFDENGKEKNRYEYNPDRVFTDLEE DGLYKVSDFDNLTIRPAYKDLKADKLEDGQKDSNGKYIFYLQKGGKISGSLLNDLIKKGI NAEFIEYASEKDYDNENTENANTLEITSSSKFTVELADGTGYTISYLKSKVKKDDPSKPT TEHENPTTPSTDKKIKFKQNRKRVTAGNGCLLTVVTEDGKSIDQSKVTWKLSDNISKDTT ITKTKSGVWLYVDEGEPSSSVTITAQVKNVKGISKEQISRTLIVKGADTGDDGDDNNGGG SDDGSGSGSGDGIDDGTDSSGGDDNGYTAEELKAAISDKEDDIAQAKEDLHDAQISYKEA KAEVDKATVKAKLAGTVTTAYSKGTLPTDGSAAIVVKAADGMYVKTSISEMELDSVKVGG TIKCVSSDTGDEYTAEVKEISDFPTADSSNGGDVSNPNSSYYPIVAFIKDADGLSPGGSV 
EVSYSSKSMGTANENAVYLQKAYIRTEDKKSYVYLRDKKTKRLKKQYITIGKTMNGQYIE IVSGVTEDDNIAFPYGKNLREGVKTKISEDDSEMIY >gi|222441790|gb|ACEP01000152.1| GENE 5 7442 - 8698 397 418 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788031|ref|ZP_02182477.1| 50S ribosomal protein L9 [Flavobacteriales bacterium ALC-1] # 3 418 7 413 413 157 28 7e-38 MLENIRLSFQGVWSHRLRSFLTMLGIIIGIAAIIAIVSTIKGTNEQIKKNLIGSGTNTTT VQLTEEGNYPAEMEYGIPKGIPVLDDDMRQQLADIDHVKDASLYVSRTQVDYIYYNDTSL SGGQLIGADSHYLGTCGYIMKRGRGFTDSDYEKYRKVAVIDSVAADSLFGDANPVGKTID IKGEPFIVIGVIEESTAFSPVINSVEDYYTYVYNDGSASGIVVIPDVDWNIVQQYDEPQN AKLLADSTDNMTSVGKAAQTLLNETVTSKSIKYRAANMNETAEESQQVSATTNKQLIWIA SISLLVGGIGVMNIMLVSVTERTREIGLKKAIGARKNRILAQFLTEAAVLTSLGGIIGTI TGIALAEIIGKISNVPIAISIPAVILAIAFSTVIGVVFGLLPAVKAANLDPIVALRHE >gi|222441790|gb|ACEP01000152.1| GENE 6 8895 - 10241 1584 448 aa, chain + ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 442 1 444 450 418 49.0 1e-116 MGKYFGTDGFRGEANVDLTVEHAYQVGRFLGWYYGKDKKEKCKVVIGKDTRRSSYMFEYS LVAGLTASGADVYLLHVTTTPSVSYVVRTEDFDCGIMISASHNPYYDNGIKLINSKGEKM DEETILKVEDYIDGKIEVPMAVRDQIGCTVDYSAGRNRYIGYLISLATRSYKNIKVGLDC ANGSSWMIAKSVFDALGAKTYVINAEPDGLNINMNAGSTHIEVLQNFVKENQLDVGFAFD GDADRCIAVDENGNVVDGDLILYVYGRYLKEKGALRNNTVVTTIMSNFGLYKAFDELGID YEKTQVGDKYVYENMVKNGHRIGGEQSGHIIFSKYATTGDGILTAIKMMEVMLEKKSSLA TLTSPVRIYPQVLKNVRVKSKPEAQNDADVQAAVKQVAETLGSTGRILVRESGTEPVIRV MVEAETEEECEKYVDSVIDVIKEKGHCE >gi|222441790|gb|ACEP01000152.1| GENE 7 10471 - 10740 395 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028789|ref|ZP_03717981.1| ## NR: gi|225028789|ref|ZP_03717981.1| hypothetical protein EUBHAL_03068 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03068 [Eubacterium hallii DSM 3353] # 1 89 1 89 89 169 100.0 8e-41 MTILEKRKLAPERDEHLKKGEFTVLNGEYSKNFNVYVVGEKHKEAVDFIKVALDRSTDNV LGFGIKMGCQKNFNVIVWDEEDIPEEYRV >gi|222441790|gb|ACEP01000152.1| GENE 8 10822 - 
11745 895 307 aa, chain - ## HITS:1 COG:BH1393 KEGG:ns NR:ns ## COG: BH1393 COG0248 # Protein_GI_number: 15613956 # Func_class: F Nucleotide transport and metabolism; P Inorganic ion transport and metabolism # Function: Exopolyphosphatase # Organism: Bacillus halodurans # 2 297 5 305 518 119 25.0 5e-27 MKCAIVDLGSNTIRLSLYNTLENGGFETLFSKKYMAGLAGYVSHGIMSNDGINQACAVLL DFKILLQQLGVKDMHVFATASLRNIKNTEKALETIKRRTGLSVDVIEGSEEGILGYYGAL YTTDLKNGMMFDIGGGSTEFVRVKNGKVKNSQSVSIGSLNLFHSNVSGLWPEKKEQKAIT KQINKKLNTVDFPKKSPEKVCCVGGTCRAILNIVNYHFNKQENNRIITREEFDKILKILT KHDITSRNYILKLCPDRVHTIIPGMMVVKEIMDRLNCKELWVSRYGVREGYLCKNFLMTG STQEVTK >gi|222441790|gb|ACEP01000152.1| GENE 9 11940 - 12116 233 58 aa, chain + ## HITS:1 COG:no KEGG:Ccel_2580 NR:ns ## KEGG: Ccel_2580 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 3 58 5 60 60 63 62.0 3e-09 MASQCDLCAYYWIDEEDGTGECQVNLDEDDLARMLSGSSDSCPYFQMDDEYKIVRKQM >gi|222441790|gb|ACEP01000152.1| GENE 10 12142 - 12276 135 44 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028793|ref|ZP_03717985.1| ## NR: gi|225028793|ref|ZP_03717985.1| hypothetical protein EUBHAL_03072 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03072 [Eubacterium hallii DSM 3353] # 1 44 1 44 44 65 100.0 9e-10 MTVCTGGFHLDGIVKEQIKEVMEVAEQMSEEVLQELEKSFCNTI >gi|222441790|gb|ACEP01000152.1| GENE 11 12269 - 12448 244 59 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1905 NR:ns ## KEGG: Bacsa_1905 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 3 58 2 57 58 70 57.0 2e-11 MGQLPQDPIMLYSVINTKLRDFYSSLEVLCEDMGLSEEELKEKLSSAGFEYDKDRNQFI >gi|222441790|gb|ACEP01000152.1| GENE 12 13105 - 13581 668 158 aa, chain + ## HITS:1 COG:MJ1277 KEGG:ns NR:ns ## COG: MJ1277 COG0066 # Protein_GI_number: 15669463 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Methanococcus jannaschii # 3 158 8 165 168 171 51.0 4e-43 
MNKVWKFHNDVDTDQIIASQYLLLPDIEAMKQYTFESLDPDFAKKAQPGDLIVAGENFGC GSSREQAPSVLKALGIKAVIAKSFARIFYRNSINIGLPVIVCKDLYEHVEDGGQADLDLS AGTVTVDGQVFSCTKLPEYMQNILNAGGLIASLNKEEA >gi|222441790|gb|ACEP01000152.1| GENE 13 13585 - 14838 1590 417 aa, chain + ## HITS:1 COG:aq_940 KEGG:ns NR:ns ## COG: aq_940 COG0065 # Protein_GI_number: 15606262 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Aquifex aeolicus # 1 417 1 422 432 372 46.0 1e-103 MGMTLGEKIIARAAGVDFVKPNDIHTVTLDRMMSNDGTTHLTVDMYHNKLKNPKIADKDK VVFIMDHNVPAENPKTAEAHKKMRDFAIENDIKLYEGQGVCHQIMMEDYVCPGELIFGAD SHTTSYGALGAFGTGVGCTDFLYAMTTGTSWVMVPETIRFNLHGKLREGVYPRDLILTII GDIGANGANYKIMEFAGDGAHNLSVDDRMVLCNLTVEAGAKAGIVEPDEKVVEYCKAHGR EAVHMFKSDEDAHFCEVYDYDLSKIEPVVVRPDFVDNFALLKDVKKEKIAIDEAFLGSCN NGRIDDLRVAAEIVKGKQVDPKVRFIVAPASKAVYIQALEEGIITTLMEAGAMVMNCNCS VCWGSCQGVIGENEVLISTGTRNFKGRAGAPSSKVYLASAATVAASALAGYITGPED >gi|222441790|gb|ACEP01000152.1| GENE 14 14903 - 15463 726 186 aa, chain + ## HITS:1 COG:AF0629 KEGG:ns NR:ns ## COG: AF0629 COG0066 # Protein_GI_number: 11498237 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Archaeoglobus fulgidus # 29 182 2 160 161 154 48.0 1e-37 MHAYYNYRYKIIYSVYVTERGIIMEQFTGKVWVLGDDIDTDIIIPTEYLALPTVEDMKQY GFSPLRPELASQIQEGDIIVAGKNFGCGSSREQAPEIIKALKVKCVIAKSYARIFFRNSI NNGLLLIENSDIQDVIKEGDSLTVNMGESLEFDGKTYPIHPLPDNLMDIINAGGLVKAMQ KRNGII >gi|222441790|gb|ACEP01000152.1| GENE 15 15473 - 16711 1430 412 aa, chain + ## HITS:1 COG:MJ0499 KEGG:ns NR:ns ## COG: MJ0499 COG0065 # Protein_GI_number: 15668676 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Methanococcus jannaschii # 1 411 1 421 424 375 44.0 1e-104 MGSTLIEKIIEKNTGLSKVTPGQIVTVNVDRVMIHDIFIPFVADKFEEMGFTKLWDPDKV VLIYDHLVPTSAVEDVRHFKIGDAFADKYGLTHVHRRDGICHQLMTEAGYAKPGNIVFGT DSHTTTYGCVGCFSSGIGYTEMASILGTGEMWIRVPETIKVVINGTLPANVTAKDVILRL 
IGDLRADGATYKALEITGSAVDAMSVASRMTISNMAIEAGAKCCLFRPDEKTCEYSEVNL EDVDWLYGDEDASYCRVMTYQAEELVPVCACPSQVDNIHPVSELVGTEIDQVFIGSCTNG RLEDLARAAKILEGKTVACRTIVTPASRKIYIEAVKAGYMETLAKAGAIITQPGCGLCCG RSGGILCDGEKVLATNNRNFLGRMGTSKVGIYLGSPATAATSAIYGKITIPE >gi|222441790|gb|ACEP01000152.1| GENE 16 16999 - 18471 1665 490 aa, chain + ## HITS:1 COG:SA1886 KEGG:ns NR:ns ## COG: SA1886 COG0770 # Protein_GI_number: 15927657 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Staphylococcus aureus N315 # 24 482 1 451 452 256 35.0 5e-68 MLECPLDIPKSFKDKYDYQEKIVMENITIKDIVEACSGQLLCGDENKVIKEFSIDSRSGS EDSIFVPIIGERVDAHKFIDGALKINGATFTSEHDEPLEGFENKPWIKVADTVEAMQKVG TFYRNRMNLPVVAVTGSVGKTTTREMISTVLASQKKVFQTIGNQNSQIGVPLTLSHLTRE DEIAVLEIGMSERGQIEKLTNMIRPNVAVVTMIGVSHIAQLKTQENICLEKMDIVKGLPE DGMVFLNGDDKFLAPYRGKLSHKTFFYGMDKACDYRAENVHVKDGQTIFHFFYKEGKVVK DMEVVLGTMGEHNVRNALVALGVAHQMGLDMKAAAKALVTFHGQRQQMHTLKSCTLIDDT YNASPDSMKASVSVLSSMEGVKGRRIAALADMLELGEKEREYHYEVGKFIAGTQVDEVAA YGELSEEILKGVEDNNERIVVKHFESRDALKDYLLTYVHPDDVLLLKASNGMKLKEIAEA FLQNEKKVLK >gi|222441790|gb|ACEP01000152.1| GENE 17 18483 - 19139 496 218 aa, chain + ## HITS:1 COG:alr3338 KEGG:ns NR:ns ## COG: alr3338 COG0406 # Protein_GI_number: 17230830 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Nostoc sp. 
PCC 7120 # 1 186 237 421 449 93 30.0 3e-19 MRHGETKWNKRSKLQGQVDIPLAPKGIEQAEMTSEGMKDIPFDHIFSSPLKRAYKTAQVV RRDRPIEIVRDDRLKEMSFGTSEGKIIGKIMANPAMVRYQRFRLDPAHFRPAKYGEYFQD VLKRTDEFFQEEIVPLEGKAENILIVAHGCVVRSFILNFTKRPLSEFWKTPFGRNCSSAA FEYKNGEINMIYENKLYYDVDCPDWQQQIPVYKKGRTR >gi|222441790|gb|ACEP01000152.1| GENE 18 19181 - 21097 2313 638 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 2 631 3 629 637 633 52.0 0 MEQQTTYNANSIAVLEGLEAVRKRPGMYIGSVSTRGLNHLIYEIVDNAVDEHLAGYCSNI QVILDEDGTATIQDNGRGIPTGINSKTGIPAVEMVFTMLHAGGKFGTGGYKISGGLHGVG ASVVNALSVWLEVKVRSEGKVYQQMYEKGKAVAPLEVIGTCRKGDTGTTVTFLPDGEIFD KTYFKAESIKSRLHETAYLNPGLSITFENRRPGEEETVLFHEEEGLKAYVRDLNKGKPAV GEIVYFKKKVDDIEVEAAFQYVDEFQETIMGFCNNICTMEGGTHITGFKTKFTSVMNQYA RELGILKEKDKNFTGADVRNGMTAVLSIKHKDPRFEGQTKTKLDNPDAGKAVSEVLGEEL PLYYDRNLEELKKVIACAEKSAKIRKAEERARTNLISKSKFSIDTNGKLANCESRNPEEC EVFIVEGDSAGGSAKTARNRRTQAILPIRGKILNVEKASMDKVLANAEIKTMIHTFGCGF SEGYGNDFDISKLKYNKIVIMTDADVDGAHIATLLLTFFYRFMPDLIHQGHVYLATPPLY KAIPKRGKEEYLYDDRALENYRKNHKSNFTLQRFKGLGEMDAEQLWETTLNPETRILKQV EIEDGRLASEVTSMLMGSEVPPRRAFIHAHAQDADLDL >gi|222441790|gb|ACEP01000152.1| GENE 19 21238 - 23502 2190 754 aa, chain + ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 4 750 6 735 830 508 40.0 1e-143 MSEQIIKTEFSDVMQKSYIDYAMSVICQRALPDVRDGLKPVQRRVLYAMQELGLSADKPH RKSARIVGDTMGKYHPHGDSSIYEALVVMEQDFKKGMPLVDGHGNFGSIEGDGAAAMRYT EAKLQKFTQDVYLADMDKNVVDFQPNFDETEKEPVVLPVRIPNLLINGAEGIAVGMTTSI PPHNLSEVVDGVKAYMDNPEITTEELMEYVKGPDFPTGGIVVNQKELKNIYETGSGKMKL RGKVHFEKAKKRSERDKLVITEIPYTMIGANISKFISDVVSLIENKTTSDIIDVSNESSK 
EGIRIVLELRRNADAERLENLLYKKTRLEDTFGVNMLAIANGKPELLSLKDIISHHTKFH YEVLTRKYETLLKKELEQKEIKEGLIKACDIIDLIIEILRGSKSLKDAKSCLINGDTENI TFKTQKSKKEASKLCFTEKQASAILEMRLYRLIGLEIMALQEEYAEILKKIDKYQDILEH PASMKRVMKKDLEKIKKNYGFDRKTELTNAKAAVVKALPIEEKEVVFVMDRFGYSKILDK NTYERNEETVLKEYRHIVHCMNTDKICVFTDTGVMHQIKVQDIPSGRLRDKGTPLDNIGN YDSRNEQILLIVPDRTLKEASLLFVTASSMIKLVDGAEFIVQKKTVASTKLAEGDILAAV RLVAAKEGSINAQIIMQSEEGYFLRFPLEEVTRKKKGAIGIRGMKLQEDDHIKHVYLTGV ENEDNPSIIYKEKELVFSKIRLMGRDGKGVKVRR >gi|222441790|gb|ACEP01000152.1| GENE 20 23730 - 24806 718 358 aa, chain - ## HITS:1 COG:BH3163_1 KEGG:ns NR:ns ## COG: BH3163_1 COG0469 # Protein_GI_number: 15615725 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Bacillus halodurans # 10 350 9 333 473 192 33.0 6e-49 MKHSLHLYGTLGPACATKEKIRTMFLSGMTGMRLNLSHSNLDDCKDWLQQYYEVSERLHI VPELLVDLQGPELRIGTLDIPLSLSENDLIFLVSEKSFDKRKENSAIFSRIIAGEDELSI IPCPDILFSYFDEGQEVKLNDGKIVLELSAPLEEDIFCAKVKKGGELTSRKSIAIKDVTV PLPVLTETDKQNLKLLRKYHVTGVMLPFVRNADDLITLRKELCADGMDYIRIFAKIESLE GVENLPSLLPFCDEIVIARGDLGNAIPLPKLPAIQKRIAMTCLRAKKPFMVVTQMLASME QNPVPTRAEVTDIFQAVLDGASSLMLTGETAVGAYPEEAIKILAQTSEEALRFKIAKF >gi|222441790|gb|ACEP01000152.1| GENE 21 25166 - 26149 710 327 aa, chain + ## HITS:1 COG:PAB0518 KEGG:ns NR:ns ## COG: PAB0518 COG1237 # Protein_GI_number: 14520974 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Pyrococcus abyssi # 1 301 4 247 272 107 27.0 5e-23 MRWKLTTLIENHVDKEERYMCEHGLAILIEGENQEEQVCLLMDTGQSGIFYENAAKMGIS LENLSALLISHAHYDHAGGVKRLIEEETIRKIYVGKDFFQGKYYEKNDGTMKDIGIAFSK EELEKKGITVCEVKEDMQMIFPGVTLYRNFERIVGYEQLNPRFFVKKEDKEIVAGCFAES FFQTHSGTEEGMTAYSTDCSSDELSVKPAIEKVISEYTKDSFTDEIAVALDTEQGIVVIV GCSHPGIMNILRTIEKRSGKKICGVVGGTHLMEADGERLRKTIDDLKEMNINFIAVSHCT GEDNLETIKNNFGEKFIFNCTGNVIRF >gi|222441790|gb|ACEP01000152.1| GENE 22 26347 - 26706 402 119 aa, chain + ## HITS:1 COG:FN0260 KEGG:ns NR:ns ## COG: FN0260 COG0640 # Protein_GI_number: 19703605 # 
Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 7 114 11 118 125 100 50.0 6e-22 MGTKKSDCVAMFPESIEKVQKSMPKDCEIDKVVSFFKVLADDTRIRILYALKEQEMCAGD IAVFLDMTKSAVSHQLAVMRKMHQVRARREGKNVFYSLDDQHIVDIMEEALIHMTHTDS >gi|222441790|gb|ACEP01000152.1| GENE 23 26803 - 29970 3703 1055 aa, chain + ## HITS:1 COG:CAC2241 KEGG:ns NR:ns ## COG: CAC2241 COG2217 # Protein_GI_number: 15895509 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 329 1054 4 699 699 612 45.0 1e-174 MQRKVYILENLDCANCAAKIERKLSKLPELSDVSVTFATKQLRFAAEDPEAVLPKIRETI QSMEPDVEVVERTRSRRKAAETHNHEHHHHEHGEECGCGHDHHDHDHDHEEHEHHHHHHE HGEECGCGHDHHDHDHDHEEHDHHHHHHEHGEECGCGHDHHDHDDHHHHHEHGEECGCGH DHHDHDHHHHHDHGPAKPQATRSHTHFQVDHHQVEGHPEGCQCEQCNSYVEYCDVCGESL AKCNCHMPDEDLEKKVYILEGIDCANCAAKIEAKIRQMPEVGFASVAFATKQLRVSANNQ AELLPKMQAVVDSIEDGVTIVPRQRKKLSGISNTKVYILEGLDCANCAAKIEAKLRTLNG VDDLTITYATKQMKLSAKNPDQMIPMIKETIDAMEDGITIVPKDNKVIKSEEAGEKKFSF NNPLVSIGVGAVIFIIGEILEHVGNVPTIPMFALFLIAYLVLGGKVLITAGKNIMKGQVF DENFLMCIATIGAFCIQEFPEAVGVMLFYRIGEYFEEKATEQSRTQIMEAVDLRPEVVNL VIGNDVRIIDAEEANVGDILLVRPGDRIPLDGVIIDGESRIDTSPVTGEPVPVMAKAGDN IVSGCVNTSGQLKIRVEKILEESMVTRILDSVENAAASKPNIDKFITRFARVYTPFVVLF ALFVAVVLPFILPDSLNWHFFVDSAYTGTVNTIHGTSGTASIYTALTFLVISCPCALVLS VPLAFFSGIGAGSKKGILFKGGIAIESLKNVKAIVMDKTGTITKGNFVVQKANPAGNAMT ANDLLAISASCELSSTHPIGNSIVEAAEEKGLSIERPSKVEEIAGHGIRAELSRGVVLCG NRKLMDAQNVDLSVYQKENFGTEVLVALNGKFVGNIVISDTVKDDAKDAIAAVKKQGIIT AMLTGDAQESADAVAKETGIDEVHAKLLPQDKLSELKKIRENHGAVMFVGDGINDAPVLA GADVGAAMGSGADAAIEAADVVFMNSEMKAIPEAISIAKMTNSISWQNVVFALAIKIIVM IMGLFGFANMWIAVFADTGVSVLCLLNSIRILHRK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:29:46 2011 Seq name: gi|222441789|gb|ACEP01000153.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont457.1, whole genome shotgun sequence Length of sequence - 4783 bp Number of predicted genes - 5, with homology - 4 Number of transcription units - 3, 
operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 10 - 85 83.3 # Thr TGT 0 0 + TRNA 126 - 199 87.0 # Met CAT 0 0 + TRNA 218 - 291 85.9 # Asp GTC 0 0 + TRNA 327 - 399 82.8 # Val TAC 0 0 + TRNA 405 - 491 67.4 # Leu TAA 0 0 + TRNA 507 - 580 75.6 # Arg ACG 0 0 + TRNA 608 - 679 63.4 # Gln CTG 0 0 + Prom 1161 - 1220 4.2 2 1 Op 2 . + CDS 1348 - 1461 101 ## + Prom 1486 - 1545 9.1 3 2 Tu 1 . + CDS 1622 - 2341 664 ## COG3884 Acyl-ACP thioesterase + Prom 2397 - 2456 8.9 4 3 Op 1 . + CDS 2594 - 3442 330 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains 5 3 Op 2 . + CDS 3447 - 4541 451 ## COG1266 Predicted metal-dependent membrane protease Predicted protein(s) >gi|222441789|gb|ACEP01000153.1| GENE 1 840 - 1157 453 105 aa, chain + ## HITS:1 COG:no KEGG:Rru_A0915 NR:ns ## KEGG: Rru_A0915 # Name: not_defined # Def: hypothetical protein # Organism: R.rubrum # Pathway: not_defined # 5 98 4 97 99 105 55.0 7e-22 MTEGRIIYNPSASVMKMFYRKMNAESREELKSRKFDAVGLLQTTVAGVLYYADRAFKTSN VYPVEINGSCPQHITTLAFFGETTAVETAMKTLIREEKEKKNRLM >gi|222441789|gb|ACEP01000153.1| GENE 2 1348 - 1461 101 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTDEYEVADENQVKIFVDKINVLNNKAKKVIKVIWEK >gi|222441789|gb|ACEP01000153.1| GENE 3 1622 - 2341 664 239 aa, chain + ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 12 212 16 219 248 101 29.0 1e-21 MYEMKIRVRYSEVDREGIARLHQILEYFQDCGTFQSEELGLGVEEDQKNHRAWYLIAWNV KIIRHPRMSEYIDVTTQAYKMRGFYGYRRYSIIDEQGATIVSAEAIWILMNTQKMLPMKV TKEIADIYVEKDADKTVRINRKISDDGEWKEYEPIEVTKQYLDSNNHVNNTVYALWTEDI LPEGACAKEVRIDYRKAAMFKETIQVFVQQQEDIFKVKYIKPDGEIAALVEMTLKTIED >gi|222441789|gb|ACEP01000153.1| GENE 4 2594 - 3442 330 282 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| 
Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 29 270 22 270 285 131 34 9e-31 MIELGVKQKLYIDHKTDFGVYLSDTPERNGKSDCVLLPKKQVPQNAKIGDEVEVFVYRDS KDREIATTNMPKLQLGELAVLQVAQVNNIGGFMYWGLEKDLLLPYSEQVVKVQKGEKYLV GLYIDKSDRLCATMKVYDFLRTDSPYKTDDIVEGTVYGKNPEYGVFVAVDNKYNGMLQNK EIVRKLKIGEKVQARVLSVREDGKLNLSLREKAYLQMDVDSAKILEKLKENDGFLPYHDK SAPEDIRSEFGMSKNEFKRAIGRLYKSKKIKISKTGIELTEE >gi|222441789|gb|ACEP01000153.1| GENE 5 3447 - 4541 451 364 aa, chain + ## HITS:1 COG:CAC0483 KEGG:ns NR:ns ## COG: CAC0483 COG1266 # Protein_GI_number: 15893774 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Clostridium acetobutylicum # 114 253 72 211 309 70 30.0 7e-12 MSQQKFLNPNRIWFTIVLLHIVGSVLDSVLAILPFGRIIAAIAIWFFLANIQEKNIRRIE ICIAGCIITIFMMYSLDISYQGTILLNELLLLYVFWVCQNAYVKPVEYCHLRKVSLKSIG LIIVAAIFLFIMADYVNACSMIAFQNLLDDSLQAIVNKPVEALVVVAILPAIIEEFLFRG MIYRGIANKSNKKMAIIISALLFAFLHMNFNQMCYAFVMGLVFAIVIYLTDNLSVSILLH MLFNAFTVIITCFEKSNAIQAILQCNIAGYTLFNPSLTDTQGKIEISLLIIGGLIAILSA VIAGWLILLIAKTEKTTETTINNENILKDNDNIFKNNEKWKPDLRFFAGSGVCILIAILY EILL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:29:54 2011 Seq name: gi|222441788|gb|ACEP01000154.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont458.1, whole genome shotgun sequence Length of sequence - 1757 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 84 - 569 399 ## CLH_2160 sporulation transcription factor Spo0A - Prom 617 - 676 9.4 + Prom 581 - 640 9.4 2 2 Op 1 . + CDS 703 - 1428 594 ## gi|225028814|ref|ZP_03718006.1| hypothetical protein EUBHAL_03100 + Prom 1481 - 1540 4.7 3 2 Op 2 . 
+ CDS 1583 - 1744 79 ## gi|225028815|ref|ZP_03718007.1| hypothetical protein EUBHAL_03101 Predicted protein(s) >gi|222441788|gb|ACEP01000154.1| GENE 1 84 - 569 399 161 aa, chain - ## HITS:1 COG:no KEGG:CLH_2160 NR:ns ## KEGG: CLH_2160 # Name: spo0A # Def: sporulation transcription factor Spo0A # Organism: C.botulinum_E3 # Pathway: Two-component system [PATH:cbt02020] # 41 150 151 259 272 95 43.0 7e-19 MSNVFIDFAGIFHVKDDLFRFIDTMEDHQKVLIFTINSENDFKNVLRIETSGIQKLLMQI GIPANMLGFSYIVTALELIALDPDYLNNITKGLYVDVAKKCSSTAARVERNIRHAIEIGF LKGDLEEIEKIFHCSSYSDKGAPTNSKFLAEVYYYMVNNEL >gi|222441788|gb|ACEP01000154.1| GENE 2 703 - 1428 594 241 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028814|ref|ZP_03718006.1| ## NR: gi|225028814|ref|ZP_03718006.1| hypothetical protein EUBHAL_03100 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03100 [Eubacterium hallii DSM 3353] # 1 241 1 241 241 470 100.0 1e-131 MFSDKIRELMIERDISLTDLAELSGVPFETLRNIYYRKVKDPKVSTAFQLASALEVTVEY LLKDMSTTEEEAENSEGQEGIEEELLENYRECGNHGKAIIRLVARLECKAARKERLNLRK MKHRIPCFIPVGRIGDGIDYNSCTVEDEYTMIKKVYLAIEITNNNFAPAYCKGDRILLEN RFPELGECAVFTDGMKAYFRYFKMEGEMYCLKCLNGRGKDMLLRRMDEVDCLGTCISVIR K >gi|222441788|gb|ACEP01000154.1| GENE 3 1583 - 1744 79 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028815|ref|ZP_03718007.1| ## NR: gi|225028815|ref|ZP_03718007.1| hypothetical protein EUBHAL_03101 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03101 [Eubacterium hallii DSM 3353] # 1 53 1 53 53 91 100.0 2e-17 MGQILNEYDIGIYKGTRNIFSVLSIVGSVSNCSLEFVQKGEKESRSSRTLCSR Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:30:13 2011 Seq name: gi|222441787|gb|ACEP01000155.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont459.1, whole genome shotgun sequence Length of sequence - 2795 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 38 - 1222 788 ## gi|225028816|ref|ZP_03718008.1| hypothetical protein EUBHAL_03102 2 1 Op 2 . - CDS 1274 - 2569 1360 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase - Prom 2675 - 2734 4.5 Predicted protein(s) >gi|222441787|gb|ACEP01000155.1| GENE 1 38 - 1222 788 394 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028816|ref|ZP_03718008.1| ## NR: gi|225028816|ref|ZP_03718008.1| hypothetical protein EUBHAL_03102 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03102 [Eubacterium hallii DSM 3353] # 1 394 1 394 394 635 100.0 1e-180 MRKLRKILLSTIFALTVSTTFFANTAGTQTVTAASGTAVTFKRKVIAYRTGSVYNFVPMG NAADNRRALNLLMEGNEKKVININNNVHIDTYLRPGNNTTINAGKHTITSDKGVIINDPT AASYTNFKNLTINGGIWKNSSSSGLAGTMMRISYASNISINNATVYTNYKGHGIELISCS NVVVNNCTLKAQGKCSKTCVEEQLQIDLSSPTTAPGLYRLSKKLCNGTPCKNITVKNCTI QGAHGICANFAGAGNEAKYRKARNYHSNITIENCNVTGISAEAIALFNTKSATVKNCRIT TKTPKSRNSYSVGLAVAYQKGSAPKATTKNVISLINNTVKGGQQGIFVRSNASKKLGTVI VSGNTVSAKKGKKNAIKVLSAKKVKIAKNKTKKW >gi|222441787|gb|ACEP01000155.1| GENE 2 1274 - 2569 1360 431 aa, chain - ## HITS:1 COG:BH1667 KEGG:ns NR:ns ## COG: BH1667 COG0128 # Protein_GI_number: 15614230 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Bacillus halodurans # 4 431 7 430 431 424 54.0 1e-118 MKKITRTTGLKGTLTVPGDKSISHRAIMFGSLSEGTTTIHGFLKGADCLSTIDCFRKMGI SIEEKDETIYVHGKGLHGLSKPEETLDVGNSGTTTRLISGILAGQDFDTVLSGDASLNSR PMGRIMKPLSMMGADVTSINNNGCAPLSIKGHTLNAIHYDSPVASAQVKSCVLLAGLYAE GTTSVTEPALSRDHTERMLRSFGADIVSDGNTCTITPPETLHGQHIEVPGDISSAAFFIV AGLITPDSEITIKNVGINDTRAGILKVCQDMGADITLLNTREEGGEPVADLLVKTSKLHG TVVEGSIIPTLIDELPVIALMACFAKGKTIIKDAHELRVKESDRIAIMTENLTAMDADVI DTDDGFIINSRSEESIPVLHGAEINCSMDHRIAMTFAIAGLNADGETMITDSDCVDVSYP GFFAQLEALNN Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:30:30 2011 Seq name: gi|222441786|gb|ACEP01000156.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont460.1, whole genome shotgun sequence Length of sequence - 2763 bp Number of predicted genes - 2, with 
homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 169 - 1347 1477 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 1374 - 1433 5.6 2 1 Op 2 . - CDS 1439 - 2386 987 ## COG2390 Transcriptional regulator, contains sigma factor-related N-terminal domain - Prom 2407 - 2466 9.8 Predicted protein(s) >gi|222441786|gb|ACEP01000156.1| GENE 1 169 - 1347 1477 392 aa, chain - ## HITS:1 COG:BH1862 KEGG:ns NR:ns ## COG: BH1862 COG0371 # Protein_GI_number: 15614425 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Bacillus halodurans # 5 370 1 389 399 191 29.0 2e-48 MDEILNLSINEMPQTEFDCSCGKHHNFSVHDMSIRKGAIEDLPKMAEPFKDGKILVVFDN HTYKVAGKRAVELLKENGFNVKELLFDTGDDILIPDEKTLGRIVQEQDLDTSLMVAVGSG VINDSVKFVTSRSGLPYIIVATAPSMDGYVADGAPIFSQGYKYSPVAHLTYGLVGDTDIL KTAPQDLIQAGYGDVVGKITAIADWDLAVKANNDYRCDTCVTLVNRALDKCFAKAEGLKD RDPESLGALLEALTLTGVAMALVNISRPASGAEHMLSHFWEMDYIARGLNPNHHGIQVGV ATPIIARFFEELADILPEGTGALCPPHEEIEALLAKGGAPTSPKDIGISKELFHDSLLKG YTVRPRYSIMQFAKDNGRLEEIADKITEEIYG >gi|222441786|gb|ACEP01000156.1| GENE 2 1439 - 2386 987 315 aa, chain - ## HITS:1 COG:BH0780 KEGG:ns NR:ns ## COG: BH0780 COG2390 # Protein_GI_number: 15613343 # Func_class: K Transcription # Function: Transcriptional regulator, contains sigma factor-related N-terminal domain # Organism: Bacillus halodurans # 6 314 5 313 316 196 34.0 7e-50 MRKVIDDPRLMVRVCDLYYNQGISQQQIAKDLNLSRPTVSRVLALAREQGIVKISISNVD AVEHWELERKLEKEYGLQEVIIVGENSSEDKMKEALGEAAARYLEYTIKDGNTVGVSMGS TLYEVISHVMHPEAKRVTFVPIVGGVGRVRMELHANSLAESLSRIYDGKFVPLHAPARVS SRNIREELLKEETLLPAIRLTQKLDIAVVGIGYPNEKSAIMATGYFKENEIDSLINRKVA GELCMQFYDIKGDTSPYEDDNNVIGMDISKLRTVPRSIGIAGGIEKLRAIRGAINGHYIN TLITDIQCAEALISE Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:30:30 2011 Seq name: gi|222441785|gb|ACEP01000157.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont461.1, whole genome shotgun 
sequence Length of sequence - 532 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 322 - 531 102 ## COG0675 Transposase and inactivated derivatives Predicted protein(s) >gi|222441785|gb|ACEP01000157.1| GENE 1 322 - 531 102 69 aa, chain - ## HITS:1 COG:all7148 KEGG:ns NR:ns ## COG: all7148 COG0675 # Protein_GI_number: 17233164 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 1 69 28 96 421 62 42.0 1e-10 YPNDEQKILFAKTFGCVRMVYNHWLDRKIRQYEENKTNVIYTICAKEMAAMKKTEEYRFL KEADSIALQ Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:30:32 2011 Seq name: gi|222441784|gb|ACEP01000158.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont465.1, whole genome shotgun sequence Length of sequence - 6013 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 2 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1653 1085 ## COG0642 Signal transduction histidine kinase - Prom 1721 - 1780 3.6 + Prom 2162 - 2221 5.4 2 2 Op 1 . + CDS 2461 - 2955 266 ## gi|225028826|ref|ZP_03718018.1| hypothetical protein EUBHAL_03112 3 2 Op 2 . + CDS 3035 - 4030 518 ## gi|225028827|ref|ZP_03718019.1| hypothetical protein EUBHAL_03113 + Term 4095 - 4136 5.2 - Term 4076 - 4131 15.0 4 3 Op 1 . - CDS 4177 - 4401 404 ## EUBREC_1497 hypothetical protein - Prom 4435 - 4494 5.9 5 3 Op 2 16/0.000 - CDS 4501 - 4662 112 ## COG0784 FOG: CheY-like receiver 6 3 Op 3 . 
- CDS 4705 - 5952 1095 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|222441784|gb|ACEP01000158.1| GENE 1 3 - 1653 1085 550 aa, chain - ## HITS:1 COG:SMb20356_1 KEGG:ns NR:ns ## COG: SMb20356_1 COG0642 # Protein_GI_number: 16264090 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Sinorhizobium meliloti # 89 338 391 646 667 188 40.0 2e-47 MFLFPIMFFKSGGYKGGMPSFYIFGILFTIFMLEGWLMFFCVWLELIIYIFTIGIAYYYP DTVIWFQSEKEIVVDVLTGVVVSSASLGVAMYLHFRIYKKQQVFLMQAREEAMEANRAKS TFLANMSHEIRTPINVMLGMNEMILRESESREVVQYAKSVEKAGNYLLSLINNILDITRI ESKKLDIIEEKFSLRQLVQEVCLIGAKQAEAKNLEFVVDVEETLPKYLEGDALHIKQVIL NLINNAVKYTKKGKVFLKVCQEEKQISFSVKDTGIGIKKEDMEALFDMFMRADIKRHRNI EGSGLGLTIAKELCEQMGGHIQAESIYGKGSNFTVYFPLKDAGTEKIGQWKVVEGEPVQE KRKEFFASEAQILLVDDSEQNIQVITSLLRRTGVQLDTAASGFECIEKVRNKKYHLIFLD YMMPEMDGIETFHRLKKEVNGQSVPVIALTADVSTGIHQHFLSEGFSDYLSKPVMWEKLE ELLLQWLPAALVSMKNGAGEDWNITEKQLLDLKQKLKKWDIELSEGLHLLGGSIVQYRKL SELFVEYYEP >gi|222441784|gb|ACEP01000158.1| GENE 2 2461 - 2955 266 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028826|ref|ZP_03718018.1| ## NR: gi|225028826|ref|ZP_03718018.1| hypothetical protein EUBHAL_03112 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03112 [Eubacterium hallii DSM 3353] # 1 164 1 164 164 324 100.0 1e-87 MEHFYIYSFTLAVIPLLLGLLFIWFGKKLWDNATKKNEEYQQRYTGRTTLRVIKVEKNEW EETNSNEQGPESRISVTSYTPVYEYTVNGQQYEYHTRLGSSTDQYPIGKECPGYYNPKNP ADVTETLKEITGGGSHFLSLLFFGIGVLAILFALIRVWAVITLI >gi|222441784|gb|ACEP01000158.1| GENE 3 3035 - 4030 518 331 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028827|ref|ZP_03718019.1| ## NR: gi|225028827|ref|ZP_03718019.1| hypothetical protein EUBHAL_03113 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03113 [Eubacterium hallii DSM 3353] # 1 331 3 333 333 532 100.0 1e-149 MLNKKREQKLKNQLYFAFRIRRIFLLIPLLLLCLSFVIPTQVHAQTDDSYFLNMWLNTKS LADLKNVYPGHTNIIVATISDPDLSKIPASKVSATSSNPSVLKVIDKYNIDFANYHPNMD 
IGIEIEAMKAGKATLTIQYKTIVKKLDITVKKSTAFIKKKKGTMLMGYHYFMRDFVTVHG KEPSIKKIKSSNKKIIKVSGQKLIPKKPGKATISAVINGKKQSIRITVKSPAPTYDQLKS KKAGLIYDSYSGKCYVKIKFTNKSQRTITKVQLKTTLFFDDAWDDMKPVKKKNVKVNIKP GKSQTVKIYFGKYPVLAPNRWKYEILQFWYR >gi|222441784|gb|ACEP01000158.1| GENE 4 4177 - 4401 404 74 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1497 NR:ns ## KEGG: EUBREC_1497 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 74 28 100 100 96 63.0 3e-19 MNIEEKLEIHDCPLCDGGALLEEEGGCGYYVMCLECGCQSVTIDFHSEEERLEAAKKTVD LWNAGKVISCNPGE >gi|222441784|gb|ACEP01000158.1| GENE 5 4501 - 4662 112 53 aa, chain - ## HITS:1 COG:CC2501_2 KEGG:ns NR:ns ## COG: CC2501_2 COG0784 # Protein_GI_number: 16126740 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Caulobacter vibrioides # 9 51 18 60 99 59 60.0 2e-09 MNEVDADRYDLIFMDIQMPKMDGYMTTREIRTLKNNKKANIPIVAMTANVLRN >gi|222441784|gb|ACEP01000158.1| GENE 6 4705 - 5952 1095 415 aa, chain - ## HITS:1 COG:AGc1799_2 KEGG:ns NR:ns ## COG: AGc1799_2 COG0642 # Protein_GI_number: 15888325 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 59 247 43 240 323 118 37.0 3e-26 MLICIVKKSVVDNVMRNYQKTVLFTTILMGSFIILLFAGLFYSISKLNIADQKAECEKRN SELQLQTMKEMEAANKKLKKAKNIATEALQTAENANKAKTDFLSNMSHDIRTPMNAIIGI TSLIRHDAGDKAKVTEYADKIDTSSQLLLGIINDVLDMSKIEAGKTVFKYTDFSILDFIQ EIDHMFKIQAEEKNQTLQFTKENILHEWVNGDRVHLMQIFSNLLSNAVKYTQKGGTIEVE SELGQGSCFEVIMNLKVVENRSVSLEPQAEKEELDKNILKGMRFLCAEDNELNAEILIEL LKIEGAECTICENGERLLNTFEQSAPGDYDMILMDVQMPVMNGYEATKAIRRSTHKLAKT IPIVAMTANAFSEDIQHSLAAGMNAHISKPVEMKVLKKTIRNIKFGEGGYRNAGH Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:04 2011 Seq name: gi|222441783|gb|ACEP01000159.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont466.1, whole genome shotgun sequence Length of sequence - 21246 bp Number of predicted genes - 18, with homology - 18 Number of 
transcription units - 7, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 198 - 285 55.6 # Ser GCT 0 0 1 1 Op 1 11/0.000 - CDS 388 - 936 713 ## COG1838 Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain 2 1 Op 2 . - CDS 952 - 1794 982 ## COG1951 Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain - Prom 1819 - 1878 5.2 3 2 Tu 1 24/0.000 - CDS 1956 - 4514 3203 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Term 4527 - 4578 12.8 4 3 Op 1 9/0.000 - CDS 4604 - 6535 2193 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 5 3 Op 2 9/0.000 - CDS 6602 - 7702 903 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 6 3 Op 3 6/0.000 - CDS 7720 - 7929 349 ## COG2501 Uncharacterized conserved protein - Prom 8045 - 8104 5.6 7 3 Op 4 16/0.000 - CDS 8107 - 9219 1007 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) - Prom 9340 - 9399 9.2 - Term 9459 - 9507 4.0 8 3 Op 5 . - CDS 9535 - 10869 951 ## COG0593 ATPase involved in DNA replication initiation - Prom 10914 - 10973 2.4 + Prom 11249 - 11308 6.8 9 4 Op 1 . + CDS 11360 - 11494 194 ## PROTEIN SUPPORTED gi|160882064|ref|YP_001561032.1| ribosomal protein L34 10 4 Op 2 16/0.000 + CDS 11554 - 11904 249 ## COG0594 RNase P protein component 11 4 Op 3 18/0.000 + CDS 11913 - 12113 122 ## COG0759 Uncharacterized conserved protein 12 4 Op 4 16/0.000 + CDS 12129 - 13415 1431 ## COG0706 Preprotein translocase subunit YidC 13 4 Op 5 4/0.000 + CDS 13415 - 14047 745 ## COG1847 Predicted RNA-binding protein + Term 14059 - 14119 17.1 + Prom 14090 - 14149 6.6 14 5 Op 1 11/0.000 + CDS 14169 - 15566 1534 ## COG0486 Predicted GTPase 15 5 Op 2 24/0.000 + CDS 15585 - 17468 2268 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 16 5 Op 3 . 
+ CDS 17478 - 18194 768 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Prom 18290 - 18349 5.0 17 6 Tu 1 . + CDS 18379 - 19155 576 ## gi|225028847|ref|ZP_03718039.1| hypothetical protein EUBHAL_03134 + Prom 19220 - 19279 6.4 18 7 Tu 1 . + CDS 19348 - 21042 1946 ## COG0018 Arginyl-tRNA synthetase + Term 21108 - 21151 1.0 Predicted protein(s) >gi|222441783|gb|ACEP01000159.1| GENE 1 388 - 936 713 182 aa, chain - ## HITS:1 COG:CAC3090 KEGG:ns NR:ns ## COG: CAC3090 COG1838 # Protein_GI_number: 15896341 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase beta subunit/Fumarate hydratase class I, C-terminal domain # Organism: Clostridium acetobutylicum # 1 180 1 180 187 212 56.0 4e-55 MNKTIQVPLSEEDINTLKAGDYVYLSGIIYTARDAAHKRMYESMHKGESLPIELNGNVLY YLGPSPAREGQVIGSAGPTTSSRMDKYTPEMLDKGLKGMVGKGKRSPEVIEAMKRNGAVY FAAVGGAGALLSKCIKKAEVIAYDDLGTEAIRKLEIENLPVIVVIDKDGNNLYETAKEKW KK >gi|222441783|gb|ACEP01000159.1| GENE 2 952 - 1794 982 280 aa, chain - ## HITS:1 COG:CAC3091 KEGG:ns NR:ns ## COG: CAC3091 COG1951 # Protein_GI_number: 15896342 # Func_class: C Energy production and conversion # Function: Tartrate dehydratase alpha subunit/Fumarate hydratase class I, N-terminal domain # Organism: Clostridium acetobutylicum # 1 275 3 277 282 339 60.0 4e-93 MRVINVEEISKNIKEMCIEANYYLSDDMKNALYKAAEQEENPLGCQILNQLKENLDIAGV EQIPICQDTGMAVVFAEVGQDVHIEGGSLTDAINKGVHDGYVEGYLRKSVVNDPFIRENT KDNTPAVIHYSIVPGENIKLTVAPKGFGSENMSRVFMLKPADGMEGAVNAIVSAVREAGP NACPPVVVGVGIGGTFEKAALMAKQALTRPVGTHSEFPSIKAMEEEVLEKVNNLGIGAAG LGGTVTALAVNINTYPTHIAGLPVAVNMCCHVNRHAVRTL >gi|222441783|gb|ACEP01000159.1| GENE 3 1956 - 4514 3203 852 aa, chain - ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 9 844 6 830 830 906 57.0 0 
MDDSNIFDKIHEVDLKNTMEKSYIDYAMSVIVSRALPDVRDGLKPVQRRILYSMIELNNG PDKPHRKCARIVGDTMGKYHPHGDSSIYGALVNLAQDWSTRYPLVDGHGNFGSVDGDGAA AMRYTEARLSKISMEMTADIYKDTVDFVPNFDETEKEPSILPSRFPNLLVNGAQGIAVGM ATNIPPHNLRETIDGVVKIIDNQVEEDRDTEIDELLEIVKGPDFPTGAAILGRKGIDQAY RTGRGKIKVRAVTNIEPMKNGKQRIIVTELPYMVNKARLIEKIADLVKEKRIDGITELRD ESDRSGMRICIETRKDVNANVLLKQLFKHTQLQDTFGVIMLALVDNQPKVLNLKEMLVHY LNHQKDVVTRRTKYELNKAKERAHILEGLLKALDYIDEVIEIIRASKNVAEARDNLIKRF EFSQAQAQAIVDMRLRALTGLEREKLQNEYDELEKKIAELEAILADEKVLLGVIREEILL IRDKYGDDRRTSIEMDMEDLSDEALIPRKDAIITFSKLGYIKRMTPDNFHSQNRGGKGIK GMNVIDEDHIEDLFMTSTHNYIMFFTNKGRVYRLKGYEIPESSRTARGVNIINLLQLQPE EKITAIIPLSEDGKQHYLVMATRNGIVKKTGFDEYKNVRKNGLAAISLREGDELIEVKQT DENEDIFLVSREGQCIRFKLQDVRETGRVSMGVIGMNLGDTDEVVGMQIHSEGDDLLIVS EKGMGKRTPLDEFTVQHRGGKGLRCYKITEKTGYVIGVKTIKPEEEIMLITTEGIVIRMK SDSISVIGRNTSGVKLINIDADSDIKVASVAKVRFQEEDESEDMEEDISEVSEETVSDEN EAIENETAPEEV >gi|222441783|gb|ACEP01000159.1| GENE 4 4604 - 6535 2193 643 aa, chain - ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 12 643 8 637 637 828 63.0 0 MSEEQISAAQEYGADQIQILEGLEAVRKRPGMYIGSTSSKGLHHLVYEIVDNAVDEALAG FCDTVKVYINEDNSITVRDNGRGIPVGINKKKGIPAVEVVFTILHAGGKFGGGGYKVSGG LHGVGASVVNALSTWLEVDIFHEGKIYRQRYERGKVMYPLKIVGDTDKRGTEVRFLPDPT IFEETVFDFSVLKQRLREMAFLTKGLKIVLKDKRPEENVALTFHYEGGIREYVEYLNKSK EVLYPQVIYCEGKKGDVVVEVALQHNDSYNEGVYSFVNNITTPEGGTHLAGFRSALTKTF NDYARKNKLLKDSEQNLTGDDIREGLVAIVSIKIPEPQFEGQTKQKLGNSEARGAVDSVV SEQLTYFLEQNPNVAKIICEKAVLAQRAREAARKARDLTRRKSALDGMSLPGKLADCSDK DPQNCEIYIVEGDSAGGSAKTARSRATQAILPLRGKILNVEKSRLDKILVNNEIRAMITA FGTGIHEDFDITKLRYNKIIIMTDADVDGAHIATLLLTFLYRFMPDLIKEGHVYLAQPPL YKVEKNKKVWYAYSDEELNKILTDIGRDGNNKIQRYKGLGEMDASQLWETTMDPEKRILL RVNMDDDAASELDMTFDTLMGDRVEPRREFIEANAKYVQNLDI >gi|222441783|gb|ACEP01000159.1| GENE 5 6602 - 7702 903 366 aa, chain - ## HITS:1 COG:CAC0004 KEGG:ns NR:ns 
## COG: CAC0004 COG1195 # Protein_GI_number: 15893302 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Clostridium acetobutylicum # 1 359 1 362 363 296 43.0 3e-80 MFVESIELNNYRNFDSLKVEFSPGVNIFFGDNAQGKTNLLESIYVSGTLRSHRGSRDKDL IRFGEDEAHIRLFFRKDSLSHRLDVHLKKNKSKGVAVNGVPVRRSGELLGMMHIVFFSPE DLSIIKEGPAGRRRFLDMELSQIDKGYMQQLVAYSKILNERNNLLKQINLYPALIDTLDG WDEQLLAAGQFLIKKREEFVYFLDEMMAKIHGQLTGGKEQIKVEYEKNVEAEKFREQLYS KRNKDISSGTTSVGPHRDDLRFKVGGIDIRKFGSQGQQRTAALSLKLSEIRLIEQVTGEK PILLLDDVLSELDAGRQSWLLESIQDIQTLISCTGLDDFVNSRISLDKVFRVKEGIVQEE KTDGLV >gi|222441783|gb|ACEP01000159.1| GENE 6 7720 - 7929 349 69 aa, chain - ## HITS:1 COG:CAC0003 KEGG:ns NR:ns ## COG: CAC0003 COG2501 # Protein_GI_number: 15893301 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 68 1 68 68 67 55.0 4e-12 MEELKLRDEFIKLGQAMKAAGIVSSGIDAKMLIQDGQVTVNGEVETRRGKKLYDGDVFEF EGEEFKVVK >gi|222441783|gb|ACEP01000159.1| GENE 7 8107 - 9219 1007 370 aa, chain - ## HITS:1 COG:CAC0002 KEGG:ns NR:ns ## COG: CAC0002 COG0592 # Protein_GI_number: 15893300 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Clostridium acetobutylicum # 1 366 1 364 366 228 35.0 2e-59 MKITCSKNQFLYGVNTVSKAVSNKTTMDILQCILIEAYDDCIKLTANDTELGIETVIEGT IIEPGKIALEAKIFSEIIKKIPDNDVCIETDESGKTNIECEQIQCSIMGRKGDDFSHLPE IEKENKITLSQFDLKEVIRQTIFSVSDNENNKLMTGELLSIENNKMTVTSLDGHRISVRN IELQESYPKMEVIVPGKTLSEVSKILPGEAKDTINLYLTDKHILFEFGETKVVSRLLEGK FYNIGQMLSNDYETKLTINKMELLRCLDRSTLFVKESDKKPIILKIENEQMNLMISSQIG SMKDQIAISKEGKDLVIGFNPKFLVDALRVIDEEEVTIYLINSIAPCIIRDEKSSYLYLI LPINVSNSNV >gi|222441783|gb|ACEP01000159.1| GENE 8 9535 - 10869 951 444 aa, chain - ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # 
Organism: Clostridium acetobutylicum # 1 442 1 445 446 431 51.0 1e-120 MKELIESKWQEILNILVKEHEISEVAVNTWIQPLIIQTITDDSITFFMVRGPRGIEFIKH KYYDIYLSMAIEQVTGKQFKILFTDSNSAAPKSEPKKVSVPAEYIGNLNPRYTFDTFVVG PTNKMAHAVSVAVAESPGGAYNPLFLYGGAGLGKTHLMHSIAHHIINNRPDLRVLYVTSE KFTNELIDSLKHDKNKEFRDKYRNIDVLLIDDIQFIIGKESTQEEFFHTFNELHEAKKQI VISSDKHPREIATLEERLRSRFEWGITADIQPPDYETKMAILKKRAELEHLDINQEVMQY VATNINSNIRELEGALNKIYVFANLEKKPVTLELAENALKDTIECQKEVTPQLIMDVVAE HYNISVSDIISKKKNKEIANPRQICMYLSRKYTDYSLQNIGKIMGNRDHTTVIHGHDKIG KMLETDENLKSNLDIIIKKLNPPT >gi|222441783|gb|ACEP01000159.1| GENE 9 11360 - 11494 194 44 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160882064|ref|YP_001561032.1| ribosomal protein L34 [Clostridium phytofermentans ISDg] # 1 44 1 44 44 79 86 2e-14 MKMTFQPKKRQRSKVHGFRKRMSTANGRKVLARRRAKGRNKLSA >gi|222441783|gb|ACEP01000159.1| GENE 10 11554 - 11904 249 116 aa, chain + ## HITS:1 COG:CAC3738 KEGG:ns NR:ns ## COG: CAC3738 COG0594 # Protein_GI_number: 15896969 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Clostridium acetobutylicum # 7 114 6 115 119 79 46.0 2e-15 MSEFKSLRKNWEFQAVYRNGKSKANRCFVMIIKKNDTSSNRVGISVSKKVGNSIVRHRVT RVIREVMRLHWGEIKSGYDIVIVARPSAKDSDYGKFESAIFHLLNLHHLLKDDDLE >gi|222441783|gb|ACEP01000159.1| GENE 11 11913 - 12113 122 66 aa, chain + ## HITS:1 COG:TM1462 KEGG:ns NR:ns ## COG: TM1462 COG0759 # Protein_GI_number: 15644211 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 1 66 5 69 81 90 66.0 6e-19 MIKIIRFYQKYLSALKGRATCIYTPTCSQYAIEAIEKHGVLKGGLLAAWRILRCNPFSKG GYDPVP >gi|222441783|gb|ACEP01000159.1| GENE 12 12129 - 13415 1431 428 aa, chain + ## HITS:1 COG:BH1169 KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 38 130 61 155 280 88 45.0 2e-17 
MLLTQSTTPIIGWIATLLGYVMEFIFYCLNFIGIQNIGLCIIIFTIIVRLLMLPLTIKQQ KFAKISQVMQPEINKIQRKYRNKTDQASMMKQNEEIQKVYEKYGTNPTGGCLQLVVQMPI FLALYQVIRKIPAYIPQVKAVYMQVVTAIAGQAGAIDAINKIGKGLKSSYVTSLASDATK NQIIDTLNYFNADAWHKLAKAIPSASDVINTSSTHIIGMNDFFAGINVSQTPGFHPSIYW LIPILAALFQYLSAKTMKQPELDGNNPAAGMTKSMTVMMPLMSLYFCLVTPAGLGIYWVT SALFQCLQQVIINKYMDNADIDALVAKNKEKAAKKKAKGQKTFMERLMDTSAKADAAKEE IENPSARKTIKQIASINTKKIAGPEGTGAEDYANLDSVDISKLGEIGKKAYLVSQYEKEH GHTRGGKK >gi|222441783|gb|ACEP01000159.1| GENE 13 13415 - 14047 745 210 aa, chain + ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 1 208 1 208 209 174 49.0 7e-44 MEVKEFKAKTVDEAITAATLELGISSDKLQYEVVDEGSKGFLGIFNSKPAVIKVILKKSL LERTQEFCDELFAAMKVETTVNIDFREEDNVMNIDLSGSDMGILIGKRGQTLDALQYLIS LYVNRESDAYIRVKLDTENYRERRKATLEKLAKKIAYTVKRTKKPVALEPMNPYERRVIH SALQNDRYVCTKSEGEEPYRKVVIMLKRDR >gi|222441783|gb|ACEP01000159.1| GENE 14 14169 - 15566 1534 465 aa, chain + ## HITS:1 COG:FN0006 KEGG:ns NR:ns ## COG: FN0006 COG0486 # Protein_GI_number: 19703358 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 5 465 3 455 455 378 50.0 1e-104 MINEFDTIAAIATAVSNAGIGIIRISGSEAMEILVKIFEPYNKKADVYQLENHRIYYGNI KDGEEVVDECIVLIMKGPHSYTKEDVVEIDCHGGVTVVYKVLNLVLKNGARAAEPGEFTK RAFLNGRIDLSQAEAVMDLIDSKNEMARKNSMTQLKGGLSDKIKQLREEIIYQIAFIESA LDDPEHYSLDGFPEKLLEEDKKWITIAKEMLDSYDNGRIIAEGIRTCIVGKPNAGKSSFL NALLGEERAIVTDIAGTTRDTLEESVTIDGITLNIVDTAGIRDTEDKVESIGVERAKKEI ESADLILFLMDTSVQISEEDIEILQRIRDKKKIILLNKSDKATEESGFEQSALKEYISEE TPVISISAKYGDGIDNFIMELKNMMFHGRIDVNQEVYITNARHKDALAKTYESLECVERS IEAGMPEDFFTIDLMNAYEKLGLIIGESVEEDLVNEIFSKFCTGK >gi|222441783|gb|ACEP01000159.1| GENE 15 15585 - 17468 2268 627 aa, chain + ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome 
partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Clostridium acetobutylicum # 8 621 9 620 626 817 64.0 0 MKTVEEYYDVVVVGAGHAGCEAALAAARLGCETIIFTVSMNSVALMPCNPNIGGSSKGHL VKEIDALGGEMGKNIDKTYIQSKMLNKSKGPAVHSLRAQADKDAYSMTMRYTLQNTDHLT LRQAEVTELIVEDGVIKGVKTFSGAIYHAKTVVLATGTYLKARCLYGEVVNYTGPNGLQA ANYLSQSLKDNGVELYRFKTGTPARIDKRSVNFDVMEEQFGDEKIVPFSFTNKEEDIKRE QISCWLTYTNEETHKIIRDNLDRSPIYAGVIEGTGPRYCPSIEDKVVRFADRKRHQLFIE PEGEFTNEMYVGGMSSSLPEDVQYAMYRTVKGLENVKIIRNAYAIEYDCINPNQLKLSLE FRNIQGLFSGGQFNGSSGYEEAACQGLIAGINAARKCLGKDPVILDRSQAYIGVLIDDLV TKENHEPYRMMTSRAEYRLLLRQDNADMRLTPIGHEIGLIDDERYDKFLLKKEQIEKEIE RLKHVNIGANKTVQELLESLGSTKLNSGSTLEELIKRPELNYECLAPIDPDREPLDDDVA EQININIKYEGYIKRQLQQVEQFKKMENKLIPDTINYDEVHNLRTEAVQKLKAVHPHSVG QASRISGVSPSDISVLMIYMEQMRHRK >gi|222441783|gb|ACEP01000159.1| GENE 16 17478 - 18194 768 238 aa, chain + ## HITS:1 COG:BS_gidB KEGG:ns NR:ns ## COG: BS_gidB COG0357 # Protein_GI_number: 16081152 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Bacillus subtilis # 1 238 1 237 239 226 47.0 2e-59 MDFKEKLKARAEKEGISLTPKQLGQFELFYKMLIETNKSMNLTAITDEDEVIEKHFIDSL SCRRVVDMSQIRTCIDVGTGAGFPGIPLKIVYPEIDFVLVDSLNKRVKFLKDVKEALGLE GLEALHGRAEDLARDKSLRAAFDLCVSRAVANLSVLSEYCVPFVRTNGYFVSYKGKKGLE EISNAQNCMNVLGCKIEKVDDFRLEEDEAERLLIRIKKCKGTPKLYPRKAGTPSKNPL >gi|222441783|gb|ACEP01000159.1| GENE 17 18379 - 19155 576 258 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028847|ref|ZP_03718039.1| ## NR: gi|225028847|ref|ZP_03718039.1| hypothetical protein EUBHAL_03134 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03134 [Eubacterium hallii DSM 3353] # 1 258 1 258 258 413 100.0 1e-114 MDIKEKNDIFIDNANLQGRMSVFHDKLRKHLSNEYLFLSEKKDNLENEILQLKLEKEQKD KTLFPHIEKRDVRKYFSPLNINEIDEEQKDEKEKQLTANISRISEEAELLDSRMLEIKDF LQDIESMLQEDSEDSMSTLLIKGEKNRISVEDFYKNHEVYPELLLNLKELTTFFQNQYEG 
LEILLEFEDGGIQLDTKVVANILRQLTINIMSAVEDYNISMILIEGKVTDDKILLTLNCM CDDGPVDTFKISYDIDRI >gi|222441783|gb|ACEP01000159.1| GENE 18 19348 - 21042 1946 564 aa, chain + ## HITS:1 COG:CAC1041 KEGG:ns NR:ns ## COG: CAC1041 COG0018 # Protein_GI_number: 15894328 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 3 564 12 563 563 586 53.0 1e-167 MKNKIIDILSENIEQLDKEEIAAALEIPKKTDMGDFAFPCFKLARVFRKAPNMIAEELTA QINEKGYDIFDKVVNVGAYINFFLAKEAFAKEIMNTVDTPDFGQGTEGEGKTVCIDYSSP NVAKNFHVGHLRTTIIGNSLYKIYSKLGYHVERINHLGDWGTQFGKLIVAYKKWGSKEAV EENGIDELMKIYVKFHQEAEKDDSLNDQAREWFVKMEQGDEEALSIWQWFKDISLVEYKR IYKLLGMDFDHFTGESFYRDKTHEVVEKLQEKDLLVESEGAHIVPLDDYDMAPCLIMKKD GSSIYATRDLAALLYRKRTYNFDKCIYVTGLEQKLHFAQVFKVIELLGYDWYQNLVHVPY GLVSMEGGKLSTRNGNVIYAEQILHEAIEKIHEIINEKNPDLPNKEEVSRQVGIGAILFN DLYNQRIKDVIFNWDKILNFDGETGPYVQYTYARCASVFRKVGAVELPAEIDYSVLTDDA TMNLLKDLTRFPEVIKEAADKYEPFMIARFAVSVAQHFNKFYHDCQINVEEENVKLARLK VVDVTMKVIKSALDLLGIECPEQM Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:24 2011 Seq name: gi|222441782|gb|ACEP01000160.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont470.1, whole genome shotgun sequence Length of sequence - 7804 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 998 - 1057 5.1 1 1 Op 1 11/0.000 + CDS 1094 - 3346 2651 ## COG1882 Pyruvate-formate lyase 2 1 Op 2 . + CDS 3357 - 4085 593 ## COG1180 Pyruvate-formate lyase-activating enzyme - Term 3991 - 4037 1.6 3 2 Tu 1 . 
- CDS 4276 - 7467 2667 ## COG3867 Arabinogalactan endo-1,4-beta-galactosidase - Prom 7493 - 7552 7.4 Predicted protein(s) >gi|222441782|gb|ACEP01000160.1| GENE 1 1094 - 3346 2651 750 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 5 750 3 743 743 987 62.0 0 MLQKEQWQGFNGRIWREEINLREFIQDNYTPYDGDASFLEGPTEATDTLWGELQKLQKAE REKGGVLDMDTDIVSSLTSHKPGYISESLKEKEQIVGLQTDKPLKRAFMPFGGIRTAEEA CTTYGYTPNPEFHKIFTDYHKTHNQAVFDSYTPEMKKARHNKIITGLPDTYGRGRIVGDY RRVALYGIDLLVADKQKDFANCGDGTMTDDVIRLREEISMQIKALKDMKVMAESYGYDIS EPASNAKEAVQWTYFGYLAAIKTQNGAAMSIGRIATFLDIYIKRDMDKGILTEKEAQELI DHMTMKFRMVKFGRIPAYNQLFSGDPVWATLEVGGLGQDGRSMVTKTDYRFLHTLENMGP SPEPNLTVLYSERLPKAFRDYAAHISITTSSIQYENDDVMRPVWGDDYSICCCVSATQTG KEMQFFGARANLAKCLLYAINGGVDEKTGQQVGPEYKPITSEYLDYDEVMHKYDIMMDWL AGLYVNTLNLIQYMHDKYYYEYALMALIDTNVRRTFATGIAGFSHVVDSLSAIKYAKVKT VRNEEGLVVDYETTGDFPKYGNDDDRADDIAVWLLQTFMKKLEKFHTYRDSEPTTSILTI TSNVVYGKATGSLPDGRKAGEPLSPGANPSYGAEQNGLLASLNSVAKLPYEWALDGISNT QTINPDALGHSEEERIGNLVNVMDGYFAQGAHHLNVNVFGVEKLIDAMEHPEKEEYANFT IRVSGYAVKFISLTKEQQLDVIARTCHDRM >gi|222441782|gb|ACEP01000160.1| GENE 2 3357 - 4085 593 242 aa, chain + ## HITS:1 COG:SP1976 KEGG:ns NR:ns ## COG: SP1976 COG1180 # Protein_GI_number: 15901799 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pneumoniae TIGR4 # 2 240 13 250 264 318 59.0 5e-87 MALVHSTESFGSVDGPGVRFIVFLQGCPLRCQFCHNPDTWKMTEENGAIWKNAEELLNQA LRYRPYWKNGGGITVSGGEPLLQIDFMLEFFKKAKEKGIHTVIDTAGGPFTRKEPFFSKF QELMKYTDLLLVDIKHIDTECHKVLTGHSNENILDMIRYLSDIKKPIWIRHVLVPERSDK DEYLTRLADFIHSLDNVEKVEILPYHTMGIYKWKELGLEYPLEGIQPPTKERVKNAKKIL AI >gi|222441782|gb|ACEP01000160.1| GENE 3 4276 - 7467 2667 1063 aa, chain - ## HITS:1 COG:BH2023 KEGG:ns NR:ns ## COG: BH2023 COG3867 # Protein_GI_number: 15614586 # Func_class: G Carbohydrate 
transport and metabolism # Function: Arabinogalactan endo-1,4-beta-galactosidase # Organism: Bacillus halodurans # 413 810 206 585 921 249 38.0 2e-65 MKNNHNLLMKMKNLEKFSGAIKRCVAIAMATMLFATTGFSLDTHPVKAQEVSSSADKIEI PEISKDAFKKYKKISGIGANTILGADFTYYQQCLEWGKSYKNYMSQSVDNIFDYVKSQGI NTISLKVAVNPTGENAYLSLDNAIKTLKAVKASKANLKTNLVLLYSDEMTYAGENGQKLP ADWEKAEKAEQSVSRVESAKTYTKEIIAKLKQVNVLPDIVTIGNEVDWNFLGITDGEGWE GWKAMGDISALLKKEGVKNAVSIAAPSDAASVKYIVQKLGYASVDYDYIGVNVYPDNNTN NYIKSLKKEVESCASDKQLIVSNVEYERVNEANTANVYTQADSIYNLLEATIDEKNAGGL IYNEAAYAGSWKSFFDDEGDAQVSMAIFAYAQGHETDTSRDPYKYGDDTGLKQQKVTIKK VQNMSDSTIRGIDISSYTALKKAGVKYYDNEGKEASLLKVLSDNGVNYIRIRIWNDPYNE KGETYGGGSNDVKAGLEIAKEAAKYNIKVLLGFHYSDFWADPAVQLLPKAWKKDRNNQEK MCSNVYEFTKETLEQFKDAGADIGMVQVGNEISQGMMGIMHKTKANVWQEEEKSTIIDSY LSAGARAVRECTPDALVAIHLDNLNLGMYKDAMNAWERDNVDYDVLGASSYAFWAGKNML KNVRKAGEYVASRGKLFAVLETSWLNSQKDADGTVNMVNNTNGAVYKVGPQGQADMLSDL YNAILSNDNGLGAFYWEGAWIPVRAGWVNWKYNKEMANEFGTGWAAENAEGYYPTSKLYY NGNPVWGGNSWDNQTLFDDKGYPLDSLRFYKDAVSSNEKYSRVVIALCDKENNVLEYRVV KVVSGKSMTYILPDIKGYTKEKDTIKILGTNDKISKVSVVYNKDIKKQAITVKKASYTIP YGTKFNLKNEVKAVGSLTFTSNNTNVVSVEKQGKKLVVKKPGKVTIKITAGATADYKQAS RIVTIYAVPKKQTIKKVSTSKRKVKVNIKKDAKATGYQIVAAKNSKFSKGKKVLTKKGIR QVTYTITKLDSKKIYYVKARSYKKIGNKKYFGPWSKVKKIRVK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:25 2011 Seq name: gi|222441781|gb|ACEP01000161.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont472.1, whole genome shotgun sequence Length of sequence - 548 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 548 542 ## COG1109 Phosphomannomutase Predicted protein(s) >gi|222441781|gb|ACEP01000161.1| GENE 1 2 - 548 542 182 aa, chain - ## HITS:1 COG:SP1559 KEGG:ns NR:ns ## COG: SP1559 COG1109 # Protein_GI_number: 15901402 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Streptococcus pneumoniae TIGR4 # 1 182 135 315 450 183 51.0 2e-46 EYEEIPFAHKDAIGCTVDYAAGRNRYMGYLISLGLYSFKGMKVGLDCANGSSWNMAQQIF DSLGAKTFVINAEPDGTNINRDAGSTHIEGLQKYVVENGLDVGFAYDGDADRCLCVDEKG NVVDGDAILYIYGRYMKERGKLLTNTVVTTVMSNFGLYKAFDELGIGYAKTAVGDKYVYE YM Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:28 2011 Seq name: gi|222441780|gb|ACEP01000162.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont476.1, whole genome shotgun sequence Length of sequence - 9315 bp Number of predicted genes - 11, with homology - 11 Number of transcription units - 4, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 610 506 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 2 1 Op 2 . + CDS 610 - 1035 314 ## Bacsa_0373 YolD-like protein + Term 1194 - 1240 1.7 3 2 Op 1 . + CDS 1585 - 3561 2267 ## COG1151 6Fe-6S prismane cluster-containing protein 4 2 Op 2 . + CDS 3555 - 4901 1379 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 5036 - 5079 8.4 + Prom 5088 - 5147 11.2 5 3 Op 1 . + CDS 5176 - 5388 193 ## EUBREC_1224 hypothetical protein 6 3 Op 2 . + CDS 5392 - 5847 512 ## COG1225 Peroxiredoxin 7 3 Op 3 . + CDS 5847 - 6596 554 ## COG3022 Uncharacterized protein conserved in bacteria + Prom 7029 - 7088 6.3 8 4 Op 1 1/0.000 + CDS 7125 - 7757 357 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 9 4 Op 2 3/0.000 + CDS 7763 - 8305 740 ## COG1592 Rubrerythrin + Prom 8407 - 8466 2.5 10 4 Op 3 3/0.000 + CDS 8487 - 8861 438 ## COG2033 Desulfoferrodoxin + Prom 8869 - 8928 6.8 11 4 Op 4 . 
+ CDS 8956 - 9114 257 ## COG1773 Rubredoxin Predicted protein(s) >gi|222441780|gb|ACEP01000162.1| GENE 1 2 - 610 506 202 aa, chain + ## HITS:1 COG:BS_uvrX KEGG:ns NR:ns ## COG: BS_uvrX COG0389 # Protein_GI_number: 16079209 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Bacillus subtilis # 14 202 255 416 416 72 30.0 5e-13 TIADIKVYKPEAKSIGSGQVLSSAYSSEKAKVVVLEMAEQIAFDLFEQKLVSKQFVLTIG YDRDNLQGQKYSGEVATDRYGRKIPKHAHGTINLDIPTSSLKEITTAVSSLYDRIIDKEL LIRRLTLTATKVMPKEGQVYQQLDLFTDYEALKKEQEKERRLQKSILDIKKKYGKNAVLR GLSYEEGATTRTRNGQIGGHKA >gi|222441780|gb|ACEP01000162.1| GENE 2 610 - 1035 314 141 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0373 NR:ns ## KEGG: Bacsa_0373 # Name: not_defined # Def: YolD-like protein # Organism: B.salanitronis # Pathway: not_defined # 4 135 5 135 138 115 49.0 7e-25 MGKYDDIINLSHHVSKKHSQMPIADRAAQFAPFAALTGYGEQIKETARTTEEFHQVSGQE KEELNQKLQILLQTQYKHPLISTYYFTPDARKQGGHYEKKTEYVKRIDPVKEYIIMKDDT IIRFHYIKELQGEIFNKVLSF >gi|222441780|gb|ACEP01000162.1| GENE 3 1585 - 3561 2267 658 aa, chain + ## HITS:1 COG:CAC0116 KEGG:ns NR:ns ## COG: CAC0116 COG1151 # Protein_GI_number: 15893412 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 647 9 629 629 732 56.0 0 MSDKICSSADKVLEVFLENAPMDTSHHRMVKQQNKCGYGLQGVCCRLCSNGPCRLSPAKP KGVCGADADTIACRNFLRQVAAGSGCYTHVVENTARRLKEMAQELQAEGKKPKYKDSVAK LAKILQINCCGNCGSDCHNSCAKTAEMIADAVLADIRKPYDEKMTLMKNIALPKRYELWE KLGILPGGAKDEIFNAVVKTSTNLNSDPMDMLLQCLRLGISTGNYGLILTNLMNDIIMGP PQISMDPVGFRIIDPEYINIMITGHQQSMFADLEEKLESEIVQKSAELVGAKGIRIVGCT CVGQDYQARSGCYKDVYCGHAGNNYTSEAVLMTGCVDLVVSEFNCTIPGIEPICEQLDIK MLCLDDVAKKANAQLLPYTAEEKEKITSQIIADALCGFKNRKEKLYGTAPAKGEKRVNVM AQHGFDKSITGLSEDTLVAALGGTLQPLIDAIVSGKIKGIAAVVGCSNLRAKGHDVFTVE LAKELIKKDILVLSAGCTCGGLENCGLMTMDAVELCGEGLKEICTALGVPPVLNFGPCLA IGRIEMAACALAKELNVDLPQLPVVISAPQWLEEQALADGAYALALGFPLHLALSPFVTG 
SQVAVNVLTEGLKDLTGGQLIIETEVDAAAQKFDEIIKEKRAGLGIDSGINKGGDAIC >gi|222441780|gb|ACEP01000162.1| GENE 4 3555 - 4901 1379 448 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 79 448 3 363 372 291 45.0 3e-78 MLMNRTTPFMVPVDDANPAIIKNEALCSECGHCFAVCEEEIGVAAKYLLNQREAYQCIGC GQCSAVCPEKAITGRPHYKIVKELIQDPEKIVVFSTSPSVRVGFADGFGKEPGTFAQDEM VGALRALGADYVFDVTFSADLTIMEEGSELLSRILKGTGPLPQFTSCCPAWVKYMENFHP DKTKYLSSAKSPIGMQGAVIKTYFAHKKHIDPEKIISVAVTPCTAKKAEIAREELCDAGK LLNIEEMRDNDYVITTKELVQWCKEEGMDLGKITPSKYDSVLGEGTGAGMIFGNTGGVME AALRTVYRVLEGKEAPADFYQLRPVRGLNNRKEAEVTIAGKNLKVCILYGTAAAEEFLAE DMSGYHFVEVMTCPGGCISGAGQPDCGSVPVSDAVRKKRIASLYQADERAQYRNSMDNPE IGMIYNEFFKEPLSLLSETLLHTTYKSK >gi|222441780|gb|ACEP01000162.1| GENE 5 5176 - 5388 193 70 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1224 NR:ns ## KEGG: EUBREC_1224 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 70 26 94 94 90 62.0 2e-17 MKAHKYWSVGALVTMLGTFYTGYKKLNSSHKYFALSSMLCMVMAIYTGHKMITGNRKEKA KKEKVTEGEE >gi|222441780|gb|ACEP01000162.1| GENE 6 5392 - 5847 512 151 aa, chain + ## HITS:1 COG:CAC0327 KEGG:ns NR:ns ## COG: CAC0327 COG1225 # Protein_GI_number: 15893619 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Clostridium acetobutylicum # 5 150 6 151 151 152 53.0 3e-37 MLEVGIKAPDFELPDQNGKIHRLSDYTGKKVILYFYPKDNTAGCTKQACGFSERYPQFTE KGAVILGVSKDSVASHKRFEEKYGLAFTLLADPERKVIEAYDVWKEKKNYGKVSMGVVRT TYLIDEQGIIIKANDKVKAADDPKNMLKELA >gi|222441780|gb|ACEP01000162.1| GENE 7 5847 - 6596 554 249 aa, chain + ## HITS:1 COG:RSc2009 KEGG:ns NR:ns ## COG: RSc2009 COG3022 # Protein_GI_number: 17546728 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 1 247 1 255 257 143 32.0 
3e-34 MKIILSPAKKMIVDTDNLAPVELPVYIDKTAEVLNWMKSKSKEELKAIWKCNDKIAEQNF NRLENMDLYNRLTPAVLAYEGIAFQYMAPSVFEIQQFEYLQNHLRILSAFYGILKPMDGV TPYRLEMQAKVGIGDAKNLYEYWGELLYRSVIDDSRIIINLASKEYSKCIEKYLTPQDRY ITIVFCELSGDKLVTKGTYAKMARGEMVRFMAENSIENPEDIKKFDRLGYSFRSDLSSDA EYVFERKKE >gi|222441780|gb|ACEP01000162.1| GENE 8 7125 - 7757 357 210 aa, chain + ## HITS:1 COG:AF0830 KEGG:ns NR:ns ## COG: AF0830 COG1853 # Protein_GI_number: 11498436 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Archaeoglobus fulgidus # 1 152 1 151 169 111 39.0 1e-24 MDEKAMYKLSYGLFVLTAKEDEKDNGCIINTAIQAASEPNQLSICVNKANYTHDMIKRTG KFTVSVLSQSVEFELFKHFGFQSGRNINKFEAFDKCARSTNGVYYITEGTNAYISVTVNK TEDLGSHTMFIGEITDMETLSNVPSVTYEYYQNNIKPKPREVGKTDDGQTIWRCRICGYE YVGEELPDDFICPICKHPASDFEKVVKKWR >gi|222441780|gb|ACEP01000162.1| GENE 9 7763 - 8305 740 180 aa, chain + ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 8 180 6 195 195 194 56.0 1e-49 MATNKYTGTQTEKNLQEAFAGESQARNKYTFFASVAKKEGYEQMSALFLKTADNEKEHAK MWFKELAGIGDTKENLAAAAEGENYEWTDMYEGFAKTAEEEGFPELAAKFRAVGEIEKHH EERYRALLKNIETAQVFEKSEVKVWECRNCGHVVVGTKAPEICPVCNHPQSYFEVREENY >gi|222441780|gb|ACEP01000162.1| GENE 10 8487 - 8861 438 124 aa, chain + ## HITS:1 COG:AF0833 KEGG:ns NR:ns ## COG: AF0833 COG2033 # Protein_GI_number: 11498439 # Func_class: C Energy production and conversion # Function: Desulfoferrodoxin # Organism: Archaeoglobus fulgidus # 3 123 5 123 125 104 43.0 4e-23 MEVKYYVCNHCENIIEMVKDKGVPVICCGEPMQELKAGVTDAAVEKHVPVYTVEGSHVHV VVGETKHPMLEEHFIEWITLSTNQGIYRKQLNPGQEPVADFCLCDGELVEEVYAYCNLHG LWKC >gi|222441780|gb|ACEP01000162.1| GENE 11 8956 - 9114 257 52 aa, chain + ## HITS:1 COG:NMA1201 KEGG:ns NR:ns ## COG: NMA1201 COG1773 # Protein_GI_number: 15794145 # Func_class: C Energy production and conversion # 
Function: Rubredoxin # Organism: Neisseria meningitidis Z2491 # 2 48 3 49 56 60 63.0 8e-10 MKYVCDVCGWEYDEEEGYPEGGIAPGTKWEDVPEDFECPLCNVGKDQFSEVE Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:34 2011 Seq name: gi|222441779|gb|ACEP01000163.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont479.1, whole genome shotgun sequence Length of sequence - 1648 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 108 - 1574 835 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|222441779|gb|ACEP01000163.1| GENE 1 108 - 1574 835 488 aa, chain - ## HITS:1 COG:MA3799 KEGG:ns NR:ns ## COG: MA3799 COG3666 # Protein_GI_number: 20092595 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 239 478 4 234 243 105 30.0 3e-22 MLSSQLKLSLSYYSDLYDMIVPKDHILRKIRELVDFSFIYDELKNKYCLDNGRNAKNPIL LFKYLLLKYLYNLSDNGVVERSRYDMSFKYFLELTPEEEVIHPSLLTKFRKQRLKDENIL DLLINKSVELAINQGVLKSNTIIVDSTHTEARYHKTTQREMLKKASTSLKVALKDSDKDI YLPDEPEKRASLDEYNEYCHELLNTVQESPYSELPTIKEESSMLSEILDNQIDCYDQSID PDARIGHKTVNSSFYGYKTHLAMTDERIITAATITSGEAFDGQELPVLVQKSKAAGATVN EVIADTAYSTLENLKDAQSNDYKLISKLNPSIIKGTRSEDGFIFNKDADTMQCPAGHLAI KYRIDKRSNQKKNTRIKYFFDINKCHVCPYRNGCYKENAKTKTYSITIKSDYHKNQEAFQ KTQYFKERFKTRYMIEAKNSELKNIHGYSRCDAAGLSNMQLQGAVSIFAVNLKRILKLKG EVCPNEKK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:35 2011 Seq name: gi|222441778|gb|ACEP01000164.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont482.1, whole genome shotgun sequence Length of sequence - 1580 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 82 - 282 137 ## + LSU_RRNA 310 - 1469 92.0 # CP000885 [D:78327..81226] # 23S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. Predicted protein(s) >gi|222441778|gb|ACEP01000164.1| GENE 1 82 - 282 137 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRIRWTAGLRFLHPLLSELRGHGKGSRAGNGKPGASEGVAGLENPPCMRRRRDAERNEVV KLRPLP Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:31:41 2011 Seq name: gi|222441777|gb|ACEP01000165.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont483.1, whole genome shotgun sequence Length of sequence - 3722 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 68 - 127 3.7 1 1 Tu 1 . + CDS 195 - 500 245 ## CKR_2609 hypothetical protein 2 2 Tu 1 . + CDS 1041 - 1304 85 ## gi|225028869|ref|ZP_03718061.1| hypothetical protein EUBHAL_03157 + Prom 1343 - 1402 4.9 3 3 Tu 1 . + CDS 1434 - 1691 155 ## gi|225028870|ref|ZP_03718062.1| hypothetical protein EUBHAL_03158 + Term 1789 - 1838 -0.9 4 4 Tu 1 . 
+ CDS 2193 - 2462 90 ## gi|225028871|ref|ZP_03718063.1| hypothetical protein EUBHAL_03159 Predicted protein(s) >gi|222441777|gb|ACEP01000165.1| GENE 1 195 - 500 245 101 aa, chain + ## HITS:1 COG:no KEGG:CKR_2609 NR:ns ## KEGG: CKR_2609 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 99 88 189 192 93 46.0 3e-18 MEPDISVICDKDKLNDRGCVGVPDFVIEVVSKSSRKMDYTTKNPLYSDVGVREYWIVDPE KERTTIYYYEQDEAPMISLFIQTIQSGIYEGLEINISELLK >gi|222441777|gb|ACEP01000165.1| GENE 2 1041 - 1304 85 87 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028869|ref|ZP_03718061.1| ## NR: gi|225028869|ref|ZP_03718061.1| hypothetical protein EUBHAL_03157 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03157 [Eubacterium hallii DSM 3353] # 1 87 1 87 87 141 100.0 2e-32 MQELIDTLWNVNKKSEAEKKRQVKELIDTLWNVNQKVALKEYNALVKELIDTLWNVNIFV NLTNTCQCPELIDTLWNVNKFGTSFIM >gi|222441777|gb|ACEP01000165.1| GENE 3 1434 - 1691 155 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028870|ref|ZP_03718062.1| ## NR: gi|225028870|ref|ZP_03718062.1| hypothetical protein EUBHAL_03158 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03158 [Eubacterium hallii DSM 3353] # 1 85 1 85 85 138 100.0 1e-31 MLLELIDTLWNVNVFGTQTISSCDSGINRYIMECKCRLDRTITTDWIELIDTLWNVNVKK VDTDEWEYQELIDTLWNVNTHLKLI >gi|222441777|gb|ACEP01000165.1| GENE 4 2193 - 2462 90 89 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028871|ref|ZP_03718063.1| ## NR: gi|225028871|ref|ZP_03718063.1| hypothetical protein EUBHAL_03159 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03159 [Eubacterium hallii DSM 3353] # 1 89 1 89 89 165 100.0 9e-40 MNISEETLVGTITELIDTLWNVNDAGYVSNVLKIRINRYIMECKLSISIHFLFPFAELID TLWNVNQRIFKNQKTAAERINRYIMECKY Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:32:04 2011 Seq name: gi|222441776|gb|ACEP01000166.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont484.1, whole genome shotgun sequence Length of sequence - 9624 bp Number of predicted genes - 9, with 
homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 440 - 1897 1638 ## COG1376 Uncharacterized protein conserved in bacteria - Prom 2102 - 2161 9.1 - Term 2005 - 2048 -1.0 2 2 Tu 1 . - CDS 2183 - 2668 395 ## COG2954 Uncharacterized protein conserved in bacteria - Prom 2694 - 2753 6.8 + Prom 2684 - 2743 9.1 3 3 Op 1 . + CDS 2851 - 3399 852 ## COG0693 Putative intracellular protease/amidase 4 3 Op 2 . + CDS 3410 - 4204 726 ## gi|225028877|ref|ZP_03718069.1| hypothetical protein EUBHAL_03165 + Term 4278 - 4317 1.6 + Prom 4217 - 4276 2.4 5 4 Op 1 . + CDS 4327 - 4884 769 ## COG2087 Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase 6 4 Op 2 . + CDS 4893 - 5702 654 ## COG4509 Uncharacterized protein conserved in bacteria 7 4 Op 3 . + CDS 5716 - 7776 1671 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 8 4 Op 4 . + CDS 7804 - 8373 555 ## COG0591 Na+/proline symporter 9 4 Op 5 . 
+ CDS 8383 - 9348 1254 ## COG0591 Na+/proline symporter + Term 9491 - 9526 1.0 Predicted protein(s) >gi|222441776|gb|ACEP01000166.1| GENE 1 440 - 1897 1638 485 aa, chain - ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 21 485 6 465 466 306 38.0 6e-83 MSDTQTTNVTPDSSTTRNRRSDKKSSKTGKIAAAVIVVLLLAVYGGGVYYYSSHFPHNTL INHVKVGEMDVTKAEKTFTDDLASHKISLKEKERTEVIDANDVGTVINVGSQISDLQDSM NPWLWFTNLFGSKHYTVKLDVTYDETKLEKVINNLACFKKENVTAPKSTYIKAGDSQFEI VPEVLGNTVKKKALIKLIGKDLSTGITKIDLEKENLYKLPKYYAKDKVVTDALAKANKYA GGTITYDFDYTTETLDYETSKDWVKISKDFKVTLDESKVGDYIEKLGSKYNTMGSSRPFT TAYGSKINVYGGDYGWKIYFDKEKTKLIKEIKSGKDVEREPVYSYKAKCRKSAKDDIGDS YVEVSISNQEIWLYVNGECKVNSSVVTGDPTRGHQTYTGVYAITYKQKDATLTGPNAGGG SYSSHVNFWMPFNGGQGLHDADSWRSSYGGSIYRGSGSHGCVNCPYSTAATLYKYVDAGF PVIVY >gi|222441776|gb|ACEP01000166.1| GENE 2 2183 - 2668 395 161 aa, chain - ## HITS:1 COG:AGl752 KEGG:ns NR:ns ## COG: AGl752 COG2954 # Protein_GI_number: 15890492 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 148 4 152 162 80 37.0 8e-16 MEIERKFLIKSLPDNLESYSCDTLIQAYISTEPVIRVRQKNEDYILTLKSAGLLAREEIE MPLTEESFKHLLSKKDGISIKKKRYKIPDTPYLIELDVFEGDYNGFCMAEVEFSDIDSAN AYNAPAWFGPEVTMDSRFHNSSLSKRNPAEVQEFLELIQKM >gi|222441776|gb|ACEP01000166.1| GENE 3 2851 - 3399 852 182 aa, chain + ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 4 172 3 171 188 134 48.0 9e-32 MRSVYVFCADGFEEVEGLTAVDLLRRAGVSVTMVSIMGRTKITGARNISVNTDILIEDIK EEADMLVLPGGMPGTNYLRDHEGLAELLKKQYEAGKWVAAICAAPSVFGGLGFLKDRKAT SYPGCLDGIPVGEYTEEPVAVDGNVVTSRGVGTAIAFALKLIEVLISKEKADEIAASIVY PV >gi|222441776|gb|ACEP01000166.1| GENE 4 3410 - 4204 
726 264 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028877|ref|ZP_03718069.1| ## NR: gi|225028877|ref|ZP_03718069.1| hypothetical protein EUBHAL_03165 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03165 [Eubacterium hallii DSM 3353] # 1 264 2 265 265 488 100.0 1e-136 MQQESKIGFLWQYVIACIHPSQYKELIKKKKGAFVGYVAVLVMFLVLIENVIPFAAWTAS VGGLENLVLNRIPKFTLEKGYFQSESPIDFTIGGVVHIKADSSVEEFKESDFKSDYVQEL LVGKKNILIKMPSGNQQIVLSQFKNWKLNNKGLAEMLPAFYMFLVFYLVVLLITKAVQYL LVALAFALICRGSVRTTEGKIVSMAESFYIAVYARTLAAVIGSANIALGYMIDSFVLMIV TVFITMGFIMRGEISVLDPQPSKE >gi|222441776|gb|ACEP01000166.1| GENE 5 4327 - 4884 769 185 aa, chain + ## HITS:1 COG:CAC1383 KEGG:ns NR:ns ## COG: CAC1383 COG2087 # Protein_GI_number: 15894662 # Func_class: H Coenzyme transport and metabolism # Function: Adenosyl cobinamide kinase/adenosyl cobinamide phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 1 185 1 185 185 163 45.0 1e-40 MSKVIMVTGGSRSGKSVIAEQKAKEYGKRSVLYLATAIPIDDDMKERIRMHQERRDPEWG TYEGYRDLGEVVKNTEKNTILLDCVTVMITNILFEEEERDFDKISASEVEKLESEVIKEL TNLVKAIRDEDKTLIIVSNEVGMSIVPSYRLGRIFSDISGKANQVLASLSDEVYVAISGL PLRLK >gi|222441776|gb|ACEP01000166.1| GENE 6 4893 - 5702 654 269 aa, chain + ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 8 265 2 246 246 109 29.0 7e-24 MSRKKKRKGSGFFSNLLLLLCIVVFCVAAWKLWGYYNNYHGGRKEYEQLRQYVKEKKDSD KKRKKKSGPDTCPVQVDFDALARINPDVKAWIYIKGTGINYPIVQTTDNTSYLHRTFEGK DSFIGAIFLDAGCEADFSSENSIVYGHNLKNGQMFGMLKKNYDTSYNADADYKDHQKIWI ITPQEEIEYQVFAAREIDVNVDADVYTIEFATEEDYADYLKTAVEKSLYHLGTKTGSTEN MLTLSTCTSSSEAGRFIVQAMQIQRTAKE >gi|222441776|gb|ACEP01000166.1| GENE 7 5716 - 7776 1671 686 aa, chain + ## HITS:1 COG:SMc00195 KEGG:ns NR:ns ## COG: SMc00195 COG1368 # Protein_GI_number: 15965601 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline 
phosphatase superfamily # Organism: Sinorhizobium meliloti # 196 577 222 625 639 150 25.0 1e-35 MGEKNIWERIKDSSNCRVLILNLILTLCLNLLLEFTERRSVSEVFSFVQERTFVFLYNGF IIFLCLSVVFLVKKKIFAYVFITGCWSLVAIANGIVLSDRKTPFTAVDLTLVKSVLPILS SYLEVWQIVAIVILLVIGVGGLVCLYLYSPEDKKFKSAFSGFLYTAVTVVCFCAVTYVGV GKGMLIKKFDNLIAGYKDYGVAYGFCVTAIDTGIDRPINYSRDTVKGIKKKVKKAEKKQK QSEKAEDVREPNIIFIQLESFFDATTVKNLKVSEDPIPTFHKIQKEYTSGYLKVPVYGAG TINTEFEVITGMNMDYFGTGEYPYRSILHKTTCDSVAYWLKEKNYESSVIHNNNASFYDR DAVFSNLGFNNFITIENMDVKSRNEVGWAKDSILTTYIMDTLNQTKKKDIIYTISVQGHG DYPTDDQSGSPITVSGEGMSQSYLNQFTYYVNQTREMDDFIKDLTDKLSDYHEDVMVIAY GDHLPGMNLESKDLKTNSKYETPYFIWDNFGYNKENKKKQSGKVEAWQLASKVLKEVGIH NGFLNEYHQTMEEDKKYRKNLKLLQYDMLYGSNFVREDKKSLEPTKINYSLSPVEITEIK EDRDDYLLLGNNFTDASRVFVNGVRAASKKESSSVLEIPKSAVKEGDKITVHQVSVTNEN ITLNQSEEYEFHKDKVRLLYKNLYDE >gi|222441776|gb|ACEP01000166.1| GENE 8 7804 - 8373 555 189 aa, chain + ## HITS:1 COG:MA0003 KEGG:ns NR:ns ## COG: MA0003 COG0591 # Protein_GI_number: 20088902 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanosarcina acetivorans str.C2A # 1 188 2 189 514 176 56.0 2e-44 MSTTTAILCAFGVYLLLMIGIGVWCMKKTTGADDYFLGGRGLSGPVAALSAQASDMSGWL LMGLPGSIYALGTGQAWIAIGLGLGTIANWLIIAKPLRAYTIVANNSMTLPEFFGNRYHD EKKILLGISSAIIVVFFLVYTASALASGGKLFNTVFGIDYHVALFIGAAVILIYTFMGGF LAVCTTDFV >gi|222441776|gb|ACEP01000166.1| GENE 9 8383 - 9348 1254 321 aa, chain + ## HITS:1 COG:MA0003 KEGG:ns NR:ns ## COG: MA0003 COG0591 # Protein_GI_number: 20088902 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Methanosarcina acetivorans str.C2A # 48 311 239 487 514 235 49.0 7e-62 MLIALLIVPIIAWGYVSGDFSSLLAQSGVNSEHYMSLMYNGDHAITPIEIISNLAWGLGY CGMPHILIRFMAIKNEKELRKSSVIAIVWVVISLTLAVVIGVVGRAYLLPMILGETAGAP STESVFIQMIQETFTNKINLPFIGGLFICGILAAIMSTADSQLLVCSSSVSADIYQGIFK PSASDKEVLNIGRITTIIVAALAFIIAWDPNSSVMALVSDAWAGLGAAFGPLVVMSLFWK 
RTNLPGAIAGLLSGALTVIIWDYIPIIGGQTPYVATGLYSLVVGFIISLLFIVVVSLMTK APEESIIEEFELYKGYEEYDK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:32:17 2011 Seq name: gi|222441775|gb|ACEP01000167.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont485.1, whole genome shotgun sequence Length of sequence - 2923 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 165 - 212 4.8 1 1 Tu 1 . - CDS 228 - 998 946 ## COG1145 Ferredoxin - Prom 1096 - 1155 4.1 2 2 Tu 1 . - CDS 1342 - 1566 187 ## gi|225028884|ref|ZP_03718076.1| hypothetical protein EUBHAL_03172 - Prom 1693 - 1752 6.1 3 3 Op 1 . + CDS 1658 - 2119 196 ## gi|225028885|ref|ZP_03718077.1| hypothetical protein EUBHAL_03173 4 3 Op 2 . + CDS 2140 - 2598 260 ## gi|225028886|ref|ZP_03718078.1| hypothetical protein EUBHAL_03174 + Term 2614 - 2662 12.7 Predicted protein(s) >gi|222441775|gb|ACEP01000167.1| GENE 1 228 - 998 946 256 aa, chain - ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 103 1 103 115 107 50.0 2e-23 MIRRIIHINEEKCNGCGACADACHESAIGMVNGKAKLLRDDYCDGLGDCLPACPTGAITF EEREAAAYDEAAVMANKQKKEASAINTAQSVPQSNEAKKAHTAHGCPGQRIAQFNRQNTA DSASDNETTSTPAAITSQLSQWPVQIKLAPVTAPYFNGAKLLIAADCTAYAYANFHQDFI KGKITLVGCPKLDAVDYSEKLTEIIATNDIKSVTIVRMEVPCCGGLEHAAVTALKNSGKF IPWQVVTISIDGKILD >gi|222441775|gb|ACEP01000167.1| GENE 2 1342 - 1566 187 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028884|ref|ZP_03718076.1| ## NR: gi|225028884|ref|ZP_03718076.1| hypothetical protein EUBHAL_03172 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03172 [Eubacterium hallii DSM 3353] # 1 74 1 74 74 122 100.0 1e-26 MSRLKELRKEKGYTQVKMQLLTGIDQSDYSKIENGKRYYTFEQCRKIALALETSMDYLAG LTDEKKPYPRIIKK >gi|222441775|gb|ACEP01000167.1| GENE 3 1658 - 2119 196 153 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|225028885|ref|ZP_03718077.1| ## NR: gi|225028885|ref|ZP_03718077.1| hypothetical protein EUBHAL_03173 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03173 [Eubacterium hallii DSM 3353] # 1 153 1 153 153 209 100.0 5e-53 MVVIFFNLGLVLAFILLVSIVFINASTKFVLAILGVICILFLIKDIIQDLYFALYKYKNN VLIVIISFALDIARIAFFYKLIHHFAAGTARATGLSTFGWMLGYVISILVGGVLFFAGEA NGLLFGMGQKNESSILMNVFMTGCLVAFCHFII >gi|222441775|gb|ACEP01000167.1| GENE 4 2140 - 2598 260 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028886|ref|ZP_03718078.1| ## NR: gi|225028886|ref|ZP_03718078.1| hypothetical protein EUBHAL_03174 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03174 [Eubacterium hallii DSM 3353] # 1 152 14 165 165 288 100.0 1e-76 MNQITEKEREFIENKLKMTTRTIVVSRTKQFCLNKIGCEIIFDGRKIASVDSGKKVDIKI NGLPHKIACLWTFADGNQTLTETFTIPYGVSSLKYETMRKIENSHLFTLKGLNLTTTPVG ILMEDTRLREESNNLFRQKLNEMLQWYRTNNL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:32:44 2011 Seq name: gi|222441774|gb|ACEP01000168.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont486.1, whole genome shotgun sequence Length of sequence - 14101 bp Number of predicted genes - 13, with homology - 13 Number of transcription units - 6, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 251 - 310 4.2 1 1 Op 1 . + CDS 379 - 3591 3134 ## BBPR_0282 cell surface protein 2 1 Op 2 . + CDS 3588 - 4076 325 ## HMPREF0868_0819 signal peptidase I (EC:3.4.21.89) + Prom 4123 - 4182 5.3 3 2 Op 1 . + CDS 4273 - 5313 1170 ## M5005_Spy_0109 fibronectin-binding protein 4 2 Op 2 . + CDS 5359 - 6138 572 ## COG4509 Uncharacterized protein conserved in bacteria 5 2 Op 3 . + CDS 6159 - 6854 574 ## gi|225028893|ref|ZP_03718085.1| hypothetical protein EUBHAL_03181 + Term 6953 - 7017 11.6 6 3 Tu 1 . 
+ CDS 7316 - 7543 244 ## gi|225028894|ref|ZP_03718086.1| hypothetical protein EUBHAL_03182 + Prom 8265 - 8324 7.4 7 4 Op 1 1/0.000 + CDS 8358 - 9272 929 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Prom 9295 - 9354 3.6 8 4 Op 2 1/0.000 + CDS 9433 - 9708 319 ## COG0675 Transposase and inactivated derivatives + Prom 9735 - 9794 6.5 9 4 Op 3 . + CDS 9832 - 10140 447 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 10166 - 10225 10.1 + Prom 10185 - 10244 6.8 10 5 Op 1 . + CDS 10269 - 11087 371 ## COG2091 Phosphopantetheinyl transferase + Prom 11089 - 11148 6.4 11 5 Op 2 . + CDS 11168 - 11836 846 ## gi|225028899|ref|ZP_03718091.1| hypothetical protein EUBHAL_03188 12 5 Op 3 . + CDS 11858 - 12931 1321 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) 13 6 Tu 1 . - CDS 13111 - 13779 719 ## COG0406 Fructose-2,6-bisphosphatase - Prom 13829 - 13888 9.4 Predicted protein(s) >gi|222441774|gb|ACEP01000168.1| GENE 1 379 - 3591 3134 1070 aa, chain + ## HITS:1 COG:no KEGG:BBPR_0282 NR:ns ## KEGG: BBPR_0282 # Name: not_defined # Def: cell surface protein # Organism: B.bifidum_PRL2010 # Pathway: not_defined # 356 783 599 1024 1176 111 29.0 2e-22 MWKKVKRILSCILALILLLSLIDGQQFFVRAESEVKVTQTEQASNKMEETKTSSRKEEGS EANNKENGKQEEGTQKEELSTKATEKTTETKNEDGKVDKPDNLNKENENNGQENGGKNEE KETGDKGEETIKEKAEEASAEEDTAQETEAKEQIDATDDTQERMNQQDENVPAMAQRGTD VSEKETEYVPKSQPQGVSIKVYAKENVFPEGTTMKVKELTNKELDSAQNVLEKGNVSYDG FLGYDISFYNKEGKEIEPEEGSVRVEVDMNLNLLPKDLQMDTLQMQHLKEEKKDRTVETV AKAADHKLSVSKSKVTAKFEVKSFSDFILTWNIDATPADPLETGDNAASIEKQINHEKYA TLRDDGTYDLTLTVAGKKGTETNKAKLDVIYILDKSGSMKEDFGGTSKRIAASNAITALT KSLKQNANIDARFSMVTFSGNKTTGMWGQGDTKTWDDAEVAVSWTTDAGTIERGSKPTSN GGTNYQAGIRTAKELLTSKRAGAMTAVIFISDGDPTFYYNPDGYTRGDGNNDGNGGADNL KVCLDAAKNEIANLGVNYFYTVGVGKANDYVNLSDLCSASGVSGAKNFDGTNTDELTKAF STIESDILTFLCSNVSIQDVLSENVEIVKDKDGVFKSLKIVVTGKEGKTIVEGDNKVTFQ DGTQNVTLKAGYDSKTKTITLDFPAEYQLNAEYTYKVIANIDATEKAYENYRKNLTDNKD 
ENEKGYKDVADAGTGTHAGKKGMYTNENSEAKMTYTFRGEEYTELYDKPVIKLHSGKLIL EKEVEGLDSLTPEQLEQYKANLKFKIKVKTKNNTSLKEEEITLTDLAKESNGNEDSGNSS NTNKRNKYIYTVMEGINPGSFYEITEDGGEVEGYIWETAADKKSENGTIAKDETEKVSFK NTYSRKKIPLIINKTVEGNMSEKRKEFAFSITLKDANGAVYELSDEEIKDVGVSTKGEGQ KGVYTFTLKDGESKEFSLPYGCKYTISEEDYSSSGYKTYIGEKKEENQKRTTEEETLTQK IEINFLNKKEVIPPTGVETTMTAWLLMTGVTLLLGAVFLLFGIRRKRFVA >gi|222441774|gb|ACEP01000168.1| GENE 2 3588 - 4076 325 162 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0819 NR:ns ## KEGG: HMPREF0868_0819 # Name: lepB_1 # Def: signal peptidase I (EC:3.4.21.89) # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 8 158 144 292 296 114 39.0 2e-24 MSEIKAFFIRLVILLIFLWLLFGVFFGITPMRGNDMFPRISAGDLLLYYRLEKNFNSGDV LVFRKQGKISTGRVVAHGGDSVEITGDGELKVNGSIVIETNVFYKTYPYDKKKVNYPLSL KKDEVFLLCDYREGGRDSRYFGAVSKKEIKGKVITILRRSDL >gi|222441774|gb|ACEP01000168.1| GENE 3 4273 - 5313 1170 346 aa, chain + ## HITS:1 COG:no KEGG:M5005_Spy_0109 NR:ns ## KEGG: M5005_Spy_0109 # Name: not_defined # Def: fibronectin-binding protein # Organism: S.pyogenes_MGAS5005 # Pathway: not_defined # 11 342 2 339 340 70 27.0 1e-10 MRADKKKFSTRLMAALMAGTLAAGMMGMNVSAAVVHSGNPITTIPVTKNVLTDGNTMAPN TTFEFEVAVADAGTFNDGNKDQVVYKGIAGGLTAETGAAFTPGGKGSAAETYTAEGSLKT DAAVFERPGVYHYTVTEKANNYEGVTTDTTSYDVYVYVYNRTDGLYVGNVVSAKNGDKAD LIFNNDYGQDKNKDTTHDVVIKKVITGNQAVESDTFQLVVTVTGTAGEKYKVTLDNAEQN PLTSGEKATYTVTNNTEIHIYGLTEGDKVQAVEEANTQGYQATYTTGLSEGTLTISSDGS EATVTNTKNSTSPTGVILNYGPYILMIALAGSMAAFFFFKRNRKEA >gi|222441774|gb|ACEP01000168.1| GENE 4 5359 - 6138 572 259 aa, chain + ## HITS:1 COG:SPy0129 KEGG:ns NR:ns ## COG: SPy0129 COG4509 # Protein_GI_number: 15674344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 5 245 3 236 237 181 37.0 1e-45 MKIAVRIAQAGNQILNIIVMGIILFLFLYGGYSLWDTYMSAKSAFLSDDLLSYKPQPGEG ANPSLEDLMAINKDVAGWITIDDTHIDYPVVQGKDDMEYINKDVYGEFSLSGSIFLSCMN KKDFSDNYNLVYGHHMANGGMFGDVVSFTEKSYFDKHKTGELYLPDKTMHIDLFACMKTS 
ASDSKVYNPQNISKTSESFKSFLDYVREAAICYRDSGRQDTGQIIGLSTCSEAVTNGRVI LFGRLSEVTKSNQKQKKAE >gi|222441774|gb|ACEP01000168.1| GENE 5 6159 - 6854 574 231 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028893|ref|ZP_03718085.1| ## NR: gi|225028893|ref|ZP_03718085.1| hypothetical protein EUBHAL_03181 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03181 [Eubacterium hallii DSM 3353] # 1 231 1 231 231 409 100.0 1e-113 MKRSKIKGKWIWGVLSTTLAIIFLFSVKMIAVYAAEGERECTIAIPVSVEMKGNIKSEVF EYALEPADENSSTEAVFSEVEVTQEGITKGSFEEMRYTRPGDYKYTVYQLKGESKDVIYD GSVYEVTVRVVNTQEGGLAAEVWAVKDQSDQKTDEIKFINQYKETTTATTEAINNTIHHT ITNTITKSSIGTSSPKTGDKSNVMLWGGIAILSGLCVFFFIFEGKRRKTEE >gi|222441774|gb|ACEP01000168.1| GENE 6 7316 - 7543 244 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028894|ref|ZP_03718086.1| ## NR: gi|225028894|ref|ZP_03718086.1| hypothetical protein EUBHAL_03182 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03182 [Eubacterium hallii DSM 3353] # 1 75 1 75 75 140 100.0 2e-32 MKDGNYTYRRPAQVTIKADGKDVENGTFWIRDDAAFTTHTINVKGCKDLRIFITQGNIIK KFDTEYGFADVKLSK >gi|222441774|gb|ACEP01000168.1| GENE 7 8358 - 9272 929 304 aa, chain + ## HITS:1 COG:FN2101 KEGG:ns NR:ns ## COG: FN2101 COG0697 # Protein_GI_number: 19705391 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 7 294 8 295 301 186 39.0 4e-47 MNAKVKGTAFALFGGACWGLSSVVGKLLFDMRGMNAEWLVTARLVFGGFIMLCFAAFQKK GEIFKIWKEKKTAFSMILFAALGMLACQLTYFLCVQYSNPATATVLQYTSPVMIMLLCLF LDRKLPRIIDIIVLIAVVVGVFLMSTHGNIHSLAISQKALILGITTAITVVFYTVWPVRL LREYGSTLVIGWGMFIGGIMLMPFSKFWAPPGRWDWVTITLVAIVVVFGTIVSFSCYLKG VMYLGPVKSSLFACMEPLVSTILTIFLLHQAFVAADIVGIALIIVSVTSLAVNDVIKEAR EKEK >gi|222441774|gb|ACEP01000168.1| GENE 8 9433 - 9708 319 91 aa, chain + ## HITS:1 COG:TM1044 KEGG:ns NR:ns ## COG: TM1044 COG0675 # Protein_GI_number: 15643802 # Func_class: L Replication, 
recombination and repair # Function: Transposase and inactivated derivatives # Organism: Thermotoga maritima # 7 56 316 365 405 72 58.0 2e-13 MHEEEQGKKLVKVDRFFASSQTCSVCGYKNAKMKNLALREWDCPQCGTHHDRDVNAYVRK NAGLQTADIWYNKEDFVKEIIEQRNKNPERV >gi|222441774|gb|ACEP01000168.1| GENE 9 9832 - 10140 447 102 aa, chain + ## HITS:1 COG:ECs4714 KEGG:ns NR:ns ## COG: ECs4714 COG0526 # Protein_GI_number: 15833968 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Escherichia coli O157:H7 # 3 97 23 118 127 110 46.0 8e-25 MVIVKVTESNFEQEVLQAEGKVLVDFWAVWCGPCQMLSPIVDEVAAEAGNVKVGKVNVDE EPALAAKFGVMSIPTLLVFENGKVVKQSIGYIEKDEVMALIQ >gi|222441774|gb|ACEP01000168.1| GENE 10 10269 - 11087 371 272 aa, chain + ## HITS:1 COG:CAC1329 KEGG:ns NR:ns ## COG: CAC1329 COG2091 # Protein_GI_number: 15894608 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Clostridium acetobutylicum # 20 251 17 197 201 60 25.0 3e-09 MKNTKVYLMSTENVKDPRTFALWKEFLPKEHWEKTVRPLKEEDRKTELAAWFLLYQALRE WGISEEKINADGAYYYGEHGKPMRRNEEICFNLSHSGKYVLCAVSEMEIGCDIEKIKEVK WKLAKRFFSEKEYDFLVRLGRQEKLMKQGETVKSGKTDKQEKVGKQQNINRQKEKGKIKE NAYTVEEAFTRFWVLRESYVKKTGEGLGAALTGLDFSDILGQKNSKGKKNGEFLEETFFE MEYDGYRIAICGEKDSKPEFVVYRGTIDNCFL >gi|222441774|gb|ACEP01000168.1| GENE 11 11168 - 11836 846 222 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028899|ref|ZP_03718091.1| ## NR: gi|225028899|ref|ZP_03718091.1| hypothetical protein EUBHAL_03188 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03188 [Eubacterium hallii DSM 3353] # 1 222 1 222 222 372 100.0 1e-101 MAVEGTKIREMMEGKTLEKDNIKKMYDMVQNASEPEDKKKVMKHSVFFIGQLPLKENHIV DLEQTRRIVNGSVPFSEVDTYMDYLLKGLSTQKLLLEKEDGSYEVNTKYEKSVIKVRKIA RAFQLEKEKALEAGIPKAKKMYQMGTKYYHSGQYEEAAACFMNAAELAEYRMAYYSLALM YFKGQGVDQSFEEALYYARKALVKGAVIAQELEQEILEAMNA >gi|222441774|gb|ACEP01000168.1| GENE 12 11858 - 12931 1321 357 aa, chain + ## HITS:1 
COG:SA1499 KEGG:ns NR:ns ## COG: SA1499 COG0544 # Protein_GI_number: 15927254 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Staphylococcus aureus N315 # 60 316 116 379 433 141 39.0 2e-33 MKKFATIALAVVLSMGILTGCGSSKKAAEATTEVKEETVDYGQGLNEDGTLEGVKATDYV TVCDYSALKIPKKEVKVSDSDVQTEIDTILSSYNQVTDRKVKKGDTVNIDYKGMVDGKEF DGGTASGASLKIGSGTFIDGFEDQLIGKMPGETVQVKVTFPKDYQGKEVAGKDAVFETTI NYIDETPKLTDKFVKEKLSDRYGYTTVKEMKKTIRDEIFKTNKTDYIWNHMIEKSKFKEI PDELINDRVDVLVNGLKAQLKASNYTLKDYLSAYGIEDETTLRDQYKSSCESTVKVFLIA DAIAADKKISVTDEDVKAYFNGEDTAQYEKQYSKAYINRIVLNNLVIQEIEKNVTVK >gi|222441774|gb|ACEP01000168.1| GENE 13 13111 - 13779 719 222 aa, chain - ## HITS:1 COG:CAC1385 KEGG:ns NR:ns ## COG: CAC1385 COG0406 # Protein_GI_number: 15894664 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Clostridium acetobutylicum # 1 198 2 184 191 87 29.0 2e-17 MKLYIIRHGETKLNALGRLQGWTDEPLNQNGKDLAIITGEALKEIPFDLVITSPLKRARE TGELTVAASEKNFGRKIPIIEDRRIMEFNWGCWEGLGCLENNFEVPDPNYNIFYTDAFHY QGSPDGESVADVMERTADFFNELIANPDYQDKTILIATHGCALRAMLNSLYENRDSFWQE HVPYNCAVTIIDVQNGQAKITAKDKIYYDESLCFDHYKVIDK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:33:36 2011 Seq name: gi|222441773|gb|ACEP01000169.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont488.1, whole genome shotgun sequence Length of sequence - 1365 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 21 - 80 7.4 1 1 Tu 1 . 
+ CDS 102 - 1325 1212 ## COG1301 Na+/H+-dicarboxylate symporters Predicted protein(s) >gi|222441773|gb|ACEP01000169.1| GENE 1 102 - 1325 1212 407 aa, chain + ## HITS:1 COG:FN1148 KEGG:ns NR:ns ## COG: FN1148 COG1301 # Protein_GI_number: 19704483 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Fusobacterium nucleatum # 1 394 1 386 390 387 58.0 1e-107 MNKNNFFTSLPFKLLMGVFLGICIGLVLNYADGNAATRAILNVVVTAKYILGQLINFCVP LIIIGFIAPSITKLGSHASRMLGVAIVIAYVSSLGAALFSTVSGYALIPHLSISSAADKL KTLPDVVFELSIPQIMPVMSALVLSILLGLAAAWTKADLIAAFLDEFQKVVLAIVSRIII PILPFFIGLTFCSLAYEGTITKQLPVFLKVIVIVLVGHFIWMTLLYVLAGVYSGENPLDI IKNYGPAYLTAVGTMSSAATLAVALQCAEKCKPLRKDMVQFGIPLFANIHLCGSVLTEVF FCMTVSKILYGTVPTPGTMILFCVLLGVFAIGAPGVPGGTVMASLGLITGVLGFGDSGTA LMLTIFALQDSFGTACNVTGDGALTLILTGYTKRHNIEEQTIERIDL Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:33:38 2011 Seq name: gi|222441772|gb|ACEP01000170.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont489.1, whole genome shotgun sequence Length of sequence - 2799 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 12/0.000 - CDS 14 - 448 208 ## COG3610 Uncharacterized conserved protein 2 1 Op 2 . - CDS 445 - 1206 496 ## COG2966 Uncharacterized conserved protein - Prom 1230 - 1289 7.5 + Prom 1182 - 1241 5.4 3 2 Tu 1 . 
+ CDS 1271 - 2623 858 ## COG4905 Predicted membrane protein Predicted protein(s) >gi|222441772|gb|ACEP01000170.1| GENE 1 14 - 448 208 144 aa, chain - ## HITS:1 COG:CAC2266 KEGG:ns NR:ns ## COG: CAC2266 COG3610 # Protein_GI_number: 15895534 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 129 1 129 152 77 35.0 6e-15 MILQFFAAFIGAAGFGFLFHLQFKHIFPAALGGVLTWVIYVIATIWGFDVFISSLFASAF AAIYSEVIAKIRKAPTTLFLIISVVVLIPGGSLYYTMSYAVQKKWAEAAAYGSQTLQCAL GLAIGMSLIWIFHDMARKVYYLIH >gi|222441772|gb|ACEP01000170.1| GENE 2 445 - 1206 496 253 aa, chain - ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 10 251 5 254 257 99 30.0 4e-21 MNPPNAILHELLNLAEIMLTNGAEVSRVEETLNRMGHAYGATQMNVFAITSSIVVTMVFE EGEEYTQTRRITTPVGTDFFKLEQANALSRRCCSSPLSLPDLKYEINKISKISPAPYTIY LGSMLGAGGFSVFFGGTILDGAIAVIFAFLLCFLQRHLNSICSNTLILQFFCSLFVGSGI CVAAHFFPVIHADKVMIGDIMLLIPGILMTNSIRDILIGDTISGVMRLVESLLWAGALAC GFMCAIWFVGGVL >gi|222441772|gb|ACEP01000170.1| GENE 3 1271 - 2623 858 450 aa, chain + ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 11 235 9 229 270 94 27.0 6e-19 MSYTWYEVFWFFLLYSFAGWVIGTSGAALKEKKFIDVGFLFGPWCPSYGFGAVGFALFLP ELKDQPVFLFAGGVILSFIITSFTGVTLERIFHQKWKDYSRRRFNFGGYVNLPYTVVWGA AALLCIWIVNPFLSKIIGILPYGLGKVILIIAYVIIIIDFIGTVSGILSVHIRLARLGKL VLVEEVADNLQKAADYMGNGLTGWVLKHLEKSHEGLEIKEIIRSKLQRREKAQEMTGVFA AGCGFYKLFWLFFLGSFLGDITETIFCWWKMGTITSRSSVVYGPFSLVWGIGCFFLTYML YQYRDRSDSFIFIFGTVLGGAYEYICSVVLELVFGTVFWDYSKIPFNLGGRINLLYCFFW GIAAVVWMKILYPKLSGFIEKIPKKIGVPLTWFLVVFMVFDMAISGLALNRYHERHQSAK TPETAITKILDTRFPDTRMERIYPSAKHVR Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:33:44 2011 Seq name: gi|222441771|gb|ACEP01000171.1| Eubacterium 
hallii DSM 3353 E_hallii-1.0_Cont491.1, whole genome shotgun sequence Length of sequence - 17895 bp Number of predicted genes - 21, with homology - 20 Number of transcription units - 10, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 2/0.000 - CDS 27 - 440 472 ## COG1310 Predicted metal-dependent protease of the PAD1/JAB1 superfamily 2 1 Op 2 . - CDS 443 - 1246 976 ## COG0476 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 3 1 Op 3 . - CDS 1246 - 1449 347 ## EUBELI_00642 thiamine biosynthesis ThiS - Prom 1497 - 1556 8.4 4 2 Tu 1 . - CDS 1663 - 1977 320 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 2028 - 2087 2.8 - Term 2012 - 2048 -0.8 5 3 Op 1 . - CDS 2125 - 2370 378 ## COG0425 Predicted redox protein, regulator of disulfide bond formation 6 3 Op 2 . - CDS 2363 - 3226 1153 ## COG2221 Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits - Prom 3263 - 3322 2.9 7 4 Op 1 . - CDS 3358 - 4584 1227 ## COG2873 O-acetylhomoserine sulfhydrylase 8 4 Op 2 18/0.000 - CDS 4599 - 6248 1905 ## COG2895 GTPases - Sulfate adenylate transferase subunit 1 9 4 Op 3 . - CDS 6250 - 7152 1071 ## COG0175 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes 10 4 Op 4 6/0.000 - CDS 7183 - 7497 310 ## COG1146 Ferredoxin 11 4 Op 5 . - CDS 7472 - 9211 1906 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 12 4 Op 6 2/0.000 - CDS 9286 - 9735 502 ## COG1959 Predicted transcriptional regulator 13 4 Op 7 . - CDS 9833 - 10765 835 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 10830 - 10889 5.1 + Prom 10739 - 10798 7.5 14 5 Tu 1 . + CDS 10819 - 10944 57 ## + Prom 11231 - 11290 9.7 15 6 Tu 1 . + CDS 11323 - 12273 470 ## EUBREC_3193 hypothetical protein + Prom 12509 - 12568 6.3 16 7 Tu 1 . 
+ CDS 12594 - 13016 414 ## COG1959 Predicted transcriptional regulator 17 8 Tu 1 . - CDS 12967 - 13860 714 ## COG0583 Transcriptional regulator - Prom 13918 - 13977 9.6 + Prom 13874 - 13933 8.6 18 9 Op 1 7/0.000 + CDS 14103 - 14726 430 ## COG2059 Chromate transport protein ChrA 19 9 Op 2 . + CDS 14618 - 15292 460 ## COG2059 Chromate transport protein ChrA 20 10 Op 1 9/0.000 - CDS 15900 - 16775 1054 ## COG1760 L-serine deaminase 21 10 Op 2 . - CDS 16776 - 17444 755 ## COG1760 L-serine deaminase - Prom 17531 - 17590 9.5 Predicted protein(s) >gi|222441771|gb|ACEP01000171.1| GENE 1 27 - 440 472 137 aa, chain - ## HITS:1 COG:MT1376 KEGG:ns NR:ns ## COG: MT1376 COG1310 # Protein_GI_number: 15840789 # Func_class: R General function prediction only # Function: Predicted metal-dependent protease of the PAD1/JAB1 superfamily # Organism: Mycobacterium tuberculosis CDC1551 # 12 134 20 140 146 67 35.0 5e-12 MQVKLKKEDFQAMAAHAVAHLPEEACGLIAGEKSDTERVIKKVYYLTNIDHTNEHFSLDP KEQLAAIKDMRKNGLEPLGNWHSHPETPARPSKEDIRLAFDSEASYLIMSLMDQENPVIH AFRVREGQVSKDEIVTI >gi|222441771|gb|ACEP01000171.1| GENE 2 443 - 1246 976 267 aa, chain - ## HITS:1 COG:aq_1329 KEGG:ns NR:ns ## COG: aq_1329 COG0476 # Protein_GI_number: 15606532 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 2 # Organism: Aquifex aeolicus # 2 265 4 268 271 321 58.0 7e-88 MLSNEQLERYSRHIILKEVGAKGQKKLLNAKVLIIGAGGLGAPAAMYLAAAGVGTIGIAD ADEVDLSNLQRQIIHSTQDVGKAKVQSAKETMEAINPDVKVITYRTFVDSESIMDLIKDY DFIIDGTDNFPAKFLINDACVMAKKPFSHAGIIRFKGQLMTYVPGEGPCYRCVFKNPPPK DAVPTCKQAGVIGAMGGVIGSLQAMEAVKYILGVGDLLTGYLLTYDALKMEFRKIKLPEH TEDCAVCGDHPTITELIDYEQTECEEI >gi|222441771|gb|ACEP01000171.1| GENE 3 1246 - 1449 347 67 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00642 NR:ns ## KEGG: EUBELI_00642 # Name: not_defined # Def: thiamine biosynthesis ThiS # Organism: E.eligens # Pathway: Sulfur relay system [PATH:eel04122] # 1 66 1 67 69 80 73.0 2e-14 
MKITVAGKVKEYEDGLNITQLIEKENVETPEYVTVSVNDEFVERVDFEKALKDGDEVEFL YFMGGGR >gi|222441771|gb|ACEP01000171.1| GENE 4 1663 - 1977 320 104 aa, chain - ## HITS:1 COG:FN0093 KEGG:ns NR:ns ## COG: FN0093 COG0526 # Protein_GI_number: 19703445 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Fusobacterium nucleatum # 7 102 8 103 103 79 41.0 2e-15 MAERLNKDSFQEKVLNASGLAIADFYSDSCIPCKKLSPILADLDEKYPDVYIGKVNIVYE QELIEEYSVMSSPVLIFFKNGEEVDRIIGAVSKHKIEEAIEKNK >gi|222441771|gb|ACEP01000171.1| GENE 5 2125 - 2370 378 81 aa, chain - ## HITS:1 COG:AF0554 KEGG:ns NR:ns ## COG: AF0554 COG0425 # Protein_GI_number: 11498164 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted redox protein, regulator of disulfide bond formation # Organism: Archaeoglobus fulgidus # 3 78 12 87 88 59 36.0 2e-09 MSEIHFDEQVDITDVVCPVTFVKAKVALEEMDEGQVLAVKMNDGEPVQNVPRSIKEEGHQ ILKLDTNDDGTYTLFVKKVGE >gi|222441771|gb|ACEP01000171.1| GENE 6 2363 - 3226 1153 287 aa, chain - ## HITS:1 COG:MA3439 KEGG:ns NR:ns ## COG: MA3439 COG2221 # Protein_GI_number: 20092251 # Func_class: C Energy production and conversion # Function: Dissimilatory sulfite reductase (desulfoviridin), alpha and beta subunits # Organism: Methanosarcina acetivorans str.C2A # 3 280 9 286 288 219 39.0 4e-57 MAVDYATLKKGGFMRQKQKNNFSLRLRTVGGHMSAEQLAKVAEVAEKYGEGYVHLTSRQG VEIPFIKLDQIEEVKNALAEGDVVPGVCGPRVRTVTACQGAAICQSGCIDTYALAKELDE RYFGYELPHKFKFGITGCQNNCLKAEENDVGIKGGMQVAWVEDKCIQCGVCVKACRNEAI TLEDGKISIDTGKCNYCGRCVKSCPVDAYDAQPGYIVSFGGTFGNHISKGETIIPFIESH EKLLKVCDAALKFFEQNGKAGERLKFTIDRTGKEAFEKAIKEAYENE >gi|222441771|gb|ACEP01000171.1| GENE 7 3358 - 4584 1227 408 aa, chain - ## HITS:1 COG:CAC0102 KEGG:ns NR:ns ## COG: CAC0102 COG2873 # Protein_GI_number: 15893398 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Clostridium acetobutylicum # 
1 408 1 409 409 350 43.0 2e-96 MKFNTKILHGKAGRNTPDGGTLPGISQVSAFSYDSAEHLEKVFGNKAPGFAYTRIGNPTV AAFEQRINELEGGVGAVACASGMAAITAALLNVLSAGDEIIAGSGLFGGTIDLFKDLEAF GITTHFIPHITVEDIKTVLNPNIKVVFGELIGNPGLDVVNIKEVTDFLHENGVPFIADAT TATPYLVNALSLGADVVIHSSSKYINGSGNSISGIIVDGGKFKWDKDRYPAMKEYSKYGK FAYTARLRNDVWRNMGGCLAPMNAYLNILGLETLGIRMKVLCHNALTLAKALEELAGITV NYPGLESSPYKPLVDGQFGGLGGAILTIRAGSREKAFKLINSLKYAHIATNIGDLRTLVI HPSSTIYLHSTKEEQEHAGVYEDTIRISVGIEDVEDLIADFTQAIENI >gi|222441771|gb|ACEP01000171.1| GENE 8 4599 - 6248 1905 549 aa, chain - ## HITS:1 COG:CC1482_1 KEGG:ns NR:ns ## COG: CC1482_1 COG2895 # Protein_GI_number: 16125729 # Func_class: P Inorganic ion transport and metabolism # Function: GTPases - Sulfate adenylate transferase subunit 1 # Organism: Caulobacter vibrioides # 2 407 22 430 430 384 49.0 1e-106 MKGLLKFITCGSVDDGKSTLIGHILYDSKLLYTDQEKALELDSKVGSRGGKIDYSLLLDG LMAEREQGITIDVAYRYFTTDNRSFIVADTPGHEEYTRNMAVGASFADLAVILLDATQGV LVQTRRHARICALMGIKYFIFAVNKMDLAKYSEERFREIEKQIEALKAELGLENVYIVPV SATEGDNVTKASDNMTWYEGKPILSILEQVDIDENQEEGFYLPVQRVCRPNHQFRGFQGQ IESGSIKVGDEITALPSMEKAHVKSIHVTDKDVQSAEKGQPVTIQLDREVDVSRGCVLSA GAGEKVTSSVEATLLWMDDDKLESGKNYFVKLGTRLVPGIVSKILYSIDVNTGEQKPADS LGKNEIAECEITFVDRVVADEFKDHKTLGELILIDRVTNMTSACGVVTEVKEDGQEAGKK ADQLRAVIKGQRAATIPFDLADDNITRSFVEQVEKELTLGGRHTYLYAPKDDEPVGLVVR HLNRAGIIVLLVLNKDLKNKEEIKNSLSKYSKYIKNWEEGEKGDVDSTVQTVRKITEYAE DVYGVASHI >gi|222441771|gb|ACEP01000171.1| GENE 9 6250 - 7152 1071 300 aa, chain - ## HITS:1 COG:mlr7575 KEGG:ns NR:ns ## COG: mlr7575 COG0175 # Protein_GI_number: 13476292 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: 3'-phosphoadenosine 5'-phosphosulfate sulfotransferase (PAPS reductase)/FAD synthetase and related enzymes # Organism: Mesorhizobium loti # 4 300 5 301 301 442 70.0 1e-124 MGTLSHLDELEAEAIYIMREVAAECEKPVMLYSIGKDSSVMLHLAMKAFYPEKPPFPFMH IDTTWKFKEMIEFRDKIAKEKGIDMIVYTNEEGVKQGINPFDHGSAYTDIMKTQALKQAL TKYGFTAAFGGGRRDEEKSRAKERIFSFRNKEQAWDPKNQRPEVWKLYNTQIKQGESIRV 
FPISNWTEKDIWQYIKRENIEIVPLYYAAERPVVERDGNIIMVDDDRMKLRPGEKIEHKS VRFRTLGCYPLTGGIESTAHTLDEIIDETLSAVDSERTSRVIDHDGGAASMERRKKEGYF >gi|222441771|gb|ACEP01000171.1| GENE 10 7183 - 7497 310 104 aa, chain - ## HITS:1 COG:CAC0105 KEGG:ns NR:ns ## COG: CAC0105 COG1146 # Protein_GI_number: 15893401 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 104 1 104 104 113 54.0 7e-26 MSVRILQNKCVGCGKCLSVCPGNLLKKGEDGKVHIRNIRDCWGCTACLKECHTGAILYFL GADMGGMGSMLSVKEKGDVRIWRVDSPNGETKTIEINMKDSNKY >gi|222441771|gb|ACEP01000171.1| GENE 11 7472 - 9211 1906 579 aa, chain - ## HITS:1 COG:CAC0104 KEGG:ns NR:ns ## COG: CAC0104 COG1053 # Protein_GI_number: 15893400 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Clostridium acetobutylicum # 6 573 8 558 559 630 56.0 1e-180 MKTKRLETDVLIIGGGTAGCFAAYTLGKEAGVSVLIAEKANIKRSGCLAAGVNALNAYIT EGHIPQDYVDYAKKDAHGIVREDLLLSMSERLNHVTKVLDDLGLVILKDENGKYVARGNR NIKINGENIKPLLAEAAMSQDNVTVLERVNITDYIIEENEIKGAWGFNIEENVFYDIRAK AVLCATGGAAGLYRPNNPGFSRHKMWYSPFNTGAGFAMGILAGAEMTTFEMRFIALRCKD TIAPVGTIAQGVGARQMNSHGEIYETKYGLTTSQRLYGTVTENREGRGPCYLKTDEISAA QEQDLYRAYLNMAPSQTLKWLEGSRGPASENVEIEGTEPYIVGGHTASGYWVDNGRRTTV KGLFAAGDVAGGCPQKYVTGALAEGEIAAESMTKYLKNLNADSEKSNAIKEIETKIDNAS ILKKKEYEAHLTAEGKDSEDTFFITAEQLEEAMQKVMDQYAGGIATDYQYNEERLVIGKK RIGELLGMIDELKAKDMFELKDIYEVRERLIVCQTLIEHLRARKETRWHSFGENTDYPKE SNQWLKYVNSKRKPDGGTEIIFRELVTGGKVYECTDFTE >gi|222441771|gb|ACEP01000171.1| GENE 12 9286 - 9735 502 149 aa, chain - ## HITS:1 COG:CAC2236 KEGG:ns NR:ns ## COG: CAC2236 COG1959 # Protein_GI_number: 15895504 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 149 1 147 147 95 42.0 3e-20 MNLSKKSRYGITALIDLAVYAKDQCVQLSSIAERNNISVKYLEQIFSSLRKAGLVRSVKG PQGGYLLAKRPESITVADVIHALDGSYLLEDERSVVSGTDEKGISTAIQKLLIDQMNDQT DRLLKQLTLRNLVDSSMEYSQVGHEMYYI >gi|222441771|gb|ACEP01000171.1| GENE 13 9833 - 
10765 835 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 3 308 2 304 308 326 56 8e-89 MANIYKGTLGLIGNTPLVEVANIEKELGLKARVLVKLEYFNPAGSVKDRIAKAMIEDAEK KGILKEGSVIIEPTSGNTGIGLASIAAAKGYRIILTMPETMSVERRNILKAYGAEIVLTS GAKGMKGAIAKANELSEEIEGSFIPGQFVNPANPAIHKATTGPEIWNDTDGEVDIFIAGV GTGGTLTGTGQYLREQKPEVKIVALEPDTSPVLSEGKAGPHKIQGIGAGFVPDVLDTKIY DEIFRATNEDAFATAKLLAKKEGILVGISSGASLHAAIEYAKKPENEGKTIVALLPDTGD RYYSTALFTE >gi|222441771|gb|ACEP01000171.1| GENE 14 10819 - 10944 57 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYLMAMIIALLTYFVNRFIGIYLKTIFEKREIWLNHAVFEG >gi|222441771|gb|ACEP01000171.1| GENE 15 11323 - 12273 470 316 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3193 NR:ns ## KEGG: EUBREC_3193 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 277 1 273 322 216 40.0 1e-54 MRQADALTKEYLSNNEIFADVFNYLIYDGQQRILPENLIERDTSEITLPLGKRGELATIQ KFRDILKGCIAKEYKNTLYVLFGVENQSHIHYAMPVRNMLYDAINYSAQVNEKTKKYRKI RKQNPNFKETTEEFLSGWHPDDRLVPVITVTIYFGNDGWDAAKSLQEMFSETDESLKEFL PDYKLHLISCNNISNFTKFHTEFGRLMHILKVISDEEQMDILLSDPGYSALSVTAAQIIN TFTGLHFSIPEKEDTINMRNAWTDHKESGRREGFNEATTSYTQRMYKAGIPLEVIAEVIE KPVTEVEKILSNDLIH >gi|222441771|gb|ACEP01000171.1| GENE 16 12594 - 13016 414 140 aa, chain + ## HITS:1 COG:DR2094 KEGG:ns NR:ns ## COG: DR2094 COG1959 # Protein_GI_number: 15807088 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Deinococcus radiodurans # 2 133 48 177 197 98 35.0 3e-21 MISTKGRYALRVMLDLAEHGNEGYIPLKDIAKRQAISDKYLESILKSLVKQKMLKGLRGK GGGYQLTRTPAEYSVGEILEVAEGTLAPVACLMDDAEDCPRANQCRTLPMWKKFDSLVHD FFYNITLEQILEDPKFLPEE >gi|222441771|gb|ACEP01000171.1| GENE 17 12967 - 13860 714 297 aa, chain - ## HITS:1 COG:sll0030 KEGG:ns NR:ns ## COG: sll0030 COG0583 # Protein_GI_number: 16332003 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Synechocystis # 2 283 5 288 304 138 30.0 2e-32 
MSIRHFRIFIAVAEEENITKAAKKLYLTQPTVSVAIKEIEEHYGARFFERINQRLHITEE GKRFLDYAKHFIAMYDEMEIAFKNTDFTGRLRVGASVNIGTYYIPRLIREFRDRYPGINV QVKINTTAVVEKELLDGKLDLALVGGTIQSEHIVAEPLFYEKYTAICAPEHPMAGKTVPL EEFMKEPLLSREKSSGAYEVFQSAVNQTGLTVTPVWESTSQTTLMEAVHYGLGVTIVPEQ MIKREVREGRLSQITFSDFSFQNTIHIAYHHSKYLSPSIKKFILLVKTLDPPEFVPE >gi|222441771|gb|ACEP01000171.1| GENE 18 14103 - 14726 430 207 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 2 180 3 176 186 87 27.0 1e-17 MKNKPHKLWVLFLTTLYISAFTFGGGFVIVTFMKHKFVDELHWIDEQEMLDFTALAQSCP GAIAVNAAILVGWNVYGFAGMIVATLGTILPPMIILSVVSFFYAIFSTNVWVAVVLKGMQ AGVAAVILDAACSLGESVLKEKSSLSIFIMTAAFVCDFFLEVNVVYIILVAACIGIIQFI WYQYRPFFLRQTQKETQKQKCLQETKL >gi|222441771|gb|ACEP01000171.1| GENE 19 14618 - 15292 460 224 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 36 165 1 130 176 96 41.0 5e-20 MYRHHSVYMVSVQAIFSQANTEGNTETKMLTGDEIMIYIQLFLSFVQIGLFSVGGGYAAI PLIQSQVVETHGWLTMNEFTNLITIAEMTPGPIGVNSATFTGLQIAGIPGALAATFGSIF PSCILVSILALVYCKYKETSAISNILSSLRPAVVALITAAGFSMFQTAVLSGKALHISSV NVFALFLFIAAFIALRKWKLNPILIMCLCGVANLFFTLLKSANF >gi|222441771|gb|ACEP01000171.1| GENE 20 15900 - 16775 1054 291 aa, chain - ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 6 284 5 284 290 212 44.0 6e-55 MIEYHSIQNILDLCKEQDKKIWEVVLQADMDESGASGEQVFEKMRFMYRAMQEADQSYEA NLKSQSGMVGGDGEKMRQYNATGKNLCSEYSSLAMEKALKMGESNACMKRIVAAPTAGSC GVIPAVLLSYEENKNAEEDELVKALFVASGIGKVVAENASIAGAFGGCQAEIGTASAMAA GALAYLQGADNEQIMQAATFALKNMLGLACDPVGGLVEVPCVKRNVAGVVNAISSAQLAL 
AGITSVITPDDTIDSMRRIGNDLPSCLKETSKGGLAVTESAKRIMEQFKSH >gi|222441771|gb|ACEP01000171.1| GENE 21 16776 - 17444 755 222 aa, chain - ## HITS:1 COG:BH2497 KEGG:ns NR:ns ## COG: BH2497 COG1760 # Protein_GI_number: 15615060 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Bacillus halodurans # 3 214 6 216 220 185 45.0 7e-47 MNIFDVIGPVMVGPSSSHTAGAVRIGYVARQLLGEEVREAKILLYGSFLATGKGHGTDKA LTAGLLGMQTDDYRIPKSIEIAEDKGMKIAFGEAVLKEAHPNTAQLYLKGISGKTMELVG QSLGGGRINIAEIDGIETNFSGENPTLIVHNLDQPGHVSEVTSMLAHKGVNIATMQLYRS SKGGKAVMVVECDQEVPQEGIRWLERVEGVLKVTYLGKSEEK Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:34:06 2011 Seq name: gi|222441770|gb|ACEP01000172.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont492.1, whole genome shotgun sequence Length of sequence - 43626 bp Number of predicted genes - 39, with homology - 38 Number of transcription units - 24, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 67 - 519 51 ## gi|225028929|ref|ZP_03718121.1| hypothetical protein EUBHAL_03218 2 1 Op 2 . - CDS 546 - 821 299 ## gi|225028930|ref|ZP_03718122.1| hypothetical protein EUBHAL_03219 - Prom 870 - 929 7.0 - Term 935 - 976 5.6 3 2 Tu 1 . - CDS 1186 - 2415 1177 ## COG4857 Predicted kinase - Prom 2502 - 2561 9.4 - TRNA 2992 - 3064 82.6 # Phe GAA 0 0 - Term 2872 - 2910 5.1 4 3 Op 1 . - CDS 3159 - 3434 300 ## COG0640 Predicted transcriptional regulators - Prom 3475 - 3534 3.5 5 3 Op 2 . - CDS 3618 - 4568 1124 ## COG1045 Serine acetyltransferase - Prom 4645 - 4704 10.8 6 4 Tu 1 . - CDS 4712 - 6079 1132 ## COG0534 Na+-driven multidrug efflux pump - Prom 6304 - 6363 4.1 7 5 Op 1 . - CDS 6534 - 6884 483 ## Tresu_0329 hypothetical protein 8 5 Op 2 . - CDS 6881 - 7693 712 ## Rumal_0582 hypothetical protein 9 5 Op 3 . - CDS 7695 - 8498 1012 ## COG4245 Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain 10 5 Op 4 . 
- CDS 8482 - 9675 1008 ## Rumal_0580 serine/threonine protein kinase 11 5 Op 5 . - CDS 9689 - 11044 1445 ## COG0515 Serine/threonine protein kinase 12 5 Op 6 . - CDS 11069 - 11944 1005 ## COG4245 Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain - Prom 11987 - 12046 6.9 13 6 Tu 1 . - CDS 12315 - 13697 1660 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 13723 - 13782 12.3 14 7 Op 1 . - CDS 13974 - 14945 668 ## SACE_4224 teichoic acid biosynthesis protein C - Prom 15081 - 15140 6.9 15 7 Op 2 . - CDS 15195 - 15392 287 ## Tresu_0820 hypothetical protein - Prom 15584 - 15643 9.8 + Prom 15360 - 15419 12.3 16 8 Tu 1 . + CDS 15631 - 16593 699 ## EUBELI_00038 cytidylate kinase + Term 16738 - 16783 8.0 - Term 16725 - 16769 4.0 17 9 Tu 1 . - CDS 16776 - 17576 587 ## gi|225028946|ref|ZP_03718138.1| hypothetical protein EUBHAL_03236 - Prom 17736 - 17795 6.9 + Prom 17646 - 17705 8.5 18 10 Tu 1 . + CDS 17899 - 18255 399 ## gi|225028947|ref|ZP_03718139.1| hypothetical protein EUBHAL_03237 + Term 18493 - 18546 1.3 - Term 18575 - 18625 -0.4 19 11 Tu 1 . - CDS 18809 - 19417 519 ## COG4186 Predicted phosphoesterase or phosphohydrolase - Prom 19537 - 19596 2.1 - Term 19497 - 19539 2.1 20 12 Op 1 . - CDS 19612 - 20265 827 ## Cbei_2388 hypothetical protein 21 12 Op 2 . - CDS 20293 - 21633 971 ## COG0427 Acetyl-CoA hydrolase 22 12 Op 3 . - CDS 21611 - 21703 57 ## - Prom 21781 - 21840 3.9 - Term 22162 - 22209 1.5 23 13 Tu 1 . - CDS 22364 - 23416 1482 ## COG0280 Phosphotransacetylase - Prom 23610 - 23669 8.5 - Term 23631 - 23664 1.1 24 14 Tu 1 . - CDS 23798 - 25468 1978 ## COG1283 Na+/phosphate symporter - Prom 25514 - 25573 2.8 25 15 Op 1 . - CDS 25575 - 26495 229 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family 26 15 Op 2 . 
- CDS 26565 - 27092 502 ## COG0671 Membrane-associated phospholipid phosphatase - Prom 27193 - 27252 8.0 - TRNA 27307 - 27393 67.4 # Leu TAA 0 0 - TRNA 27449 - 27530 49.9 # Tyr GTA 0 0 27 16 Op 1 . + CDS 27916 - 28209 197 ## COG2929 Uncharacterized protein conserved in bacteria 28 16 Op 2 . + CDS 28224 - 28433 235 ## NE2102 hypothetical protein + Term 28588 - 28616 -0.2 29 17 Tu 1 . - CDS 29097 - 31220 2129 ## COG0366 Glycosidases - Prom 31266 - 31325 6.9 - Term 31349 - 31401 5.3 30 18 Tu 1 . - CDS 31421 - 33010 1960 ## COG0513 Superfamily II DNA and RNA helicases - Prom 33116 - 33175 13.7 - Term 33148 - 33188 4.0 31 19 Tu 1 . - CDS 33322 - 34143 716 ## COG0388 Predicted amidohydrolase - Prom 34173 - 34232 8.0 + Prom 34369 - 34428 7.8 32 20 Tu 1 . + CDS 34664 - 35323 491 ## COG1272 Predicted membrane protein, hemolysin III homolog - Term 35451 - 35501 2.0 33 21 Tu 1 . - CDS 35508 - 35795 135 ## COG2827 Predicted endonuclease containing a URI domain 34 22 Tu 1 . - CDS 36327 - 37514 904 ## Closa_2891 Peptidase M23 - Prom 37713 - 37772 13.7 - Term 37753 - 37792 2.0 35 23 Tu 1 . - CDS 37999 - 39129 784 ## COG2385 Sporulation protein and related proteins - Prom 39323 - 39382 6.7 + Prom 39325 - 39384 9.2 36 24 Op 1 4/0.000 + CDS 39626 - 40003 545 ## COG4810 Ethanolamine utilization protein 37 24 Op 2 . + CDS 40038 - 40517 614 ## COG4917 Ethanolamine utilization protein 38 24 Op 3 . + CDS 40535 - 41788 1191 ## COG4857 Predicted kinase 39 24 Op 4 . 
+ CDS 41803 - 43026 1153 ## COG4857 Predicted kinase + Term 43143 - 43186 2.4 Predicted protein(s) >gi|222441770|gb|ACEP01000172.1| GENE 1 67 - 519 51 150 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028929|ref|ZP_03718121.1| ## NR: gi|225028929|ref|ZP_03718121.1| hypothetical protein EUBHAL_03218 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03218 [Eubacterium hallii DSM 3353] # 1 150 1 150 150 196 100.0 4e-49 MKKKIFLVFILFLTWLNYLLSYEYVKIMMDTADGSKVSNTMVLLIFPIMWNYGAAIIKII SSIVLIIITLIYQTNEYVDIRKICYSSDDSLLWKIVSILFYIAIVLFAAYILWSNVKYSF FGITWFLSSILLASSYFVWIKNLSFRLQDS >gi|222441770|gb|ACEP01000172.1| GENE 2 546 - 821 299 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028930|ref|ZP_03718122.1| ## NR: gi|225028930|ref|ZP_03718122.1| hypothetical protein EUBHAL_03219 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03219 [Eubacterium hallii DSM 3353] # 1 91 1 91 91 140 100.0 3e-32 MKKILLVMAVTVALSLNACTFTDQSETSNVVDESVSVDQGGQVIRVDIPNGDGTFTTLEG EEAQKWYDKAGEEDQQEAAKEASTQSIEDVK >gi|222441770|gb|ACEP01000172.1| GENE 3 1186 - 2415 1177 409 aa, chain - ## HITS:1 COG:BS_ykrT KEGG:ns NR:ns ## COG: BS_ykrT COG4857 # Protein_GI_number: 16078420 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Bacillus subtilis # 35 396 36 391 399 158 29.0 2e-38 MIKITKENIIPYLKAHMPDFDDSLPVQISMVGEGTEEEDGDGYVNYIYRVQTPKESLVLK QGTEISRVSQQKIATYRNRLEYNSMRIFYAITPEYVPYLKFQDRENNIFVMEDVSDLKVV RFQLNKNKMFPELGRQCGEFMAKTEFCTSEYYLSREQYRGLQKHFENTELRKIMEDQMFL DCFGCDIDYSLGKNFADFADQFSKDSRYRTELYKLRRKFMSHADGLIHADLHTSNIMASR DKMKVIDMEFAFMGPFGYDLGYLVGNLISQYCAACFKRFPSEQSRKQFKAYLLATIKSLI EIYMKTFTLCWEKSVKERYRGQQGLLQSILQEVMEDVPGYASMVNWFRSVSEIPYPDFDI IEDKDAKRNATVLSLMIDWGIMFGRYKYQSADDLIETIIGIEEEFRKSL >gi|222441770|gb|ACEP01000172.1| GENE 4 3159 - 3434 300 91 aa, chain - ## HITS:1 COG:pli0034 KEGG:ns NR:ns ## COG: pli0034 COG0640 # Protein_GI_number: 18450316 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 9 88 
7 86 97 81 47.0 3e-16 MNKKFVVHARVLKAMSDENRLRILDLLREREYNASELLDEMDFGQSTLSHHMRILLQAGA VKARKNGKWTYYSLNKEIFSTVSEWIAEYGE >gi|222441770|gb|ACEP01000172.1| GENE 5 3618 - 4568 1124 316 aa, chain - ## HITS:1 COG:PA3816 KEGG:ns NR:ns ## COG: PA3816 COG1045 # Protein_GI_number: 15599011 # Func_class: E Amino acid transport and metabolism # Function: Serine acetyltransferase # Organism: Pseudomonas aeruginosa # 126 312 7 187 258 155 44.0 7e-38 MKETMKDIILNASADMTESCHKTDFIVSRGERCLPDRSCIIRILKDLKRVMFPGYFGEEN IALISPQYFIGDRLTDIYGKLKPQIEISLYYRDYEKYTEEEISERAEEVCEEFFKRLPYV QQMLLKDVQAGFDGDPAAKSREDVVISYPGLFAIFVYRIAHELYTQDVPFLPRIMTEYAH SRTGIDINSGAKIGEYFFIDHGTGVVIGETTEIGNNVKLYQGVTLGALSTRSGQKLKGKK RHPTIKDNVTVYSGASILGGETEIGEGVIVAGGAFVTKSVPAHTKVIVKNPEIKIKDSDK KIELQTTENKEGIWEI >gi|222441770|gb|ACEP01000172.1| GENE 6 4712 - 6079 1132 455 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 20 450 92 527 547 182 28.0 1e-45 MSLFKRKPWTGKTELLFSDKALTALILPLFLEQLLSVFVGLSDSIMVAQVGEAAVSAVSL VDSIAILLINIFNALATGGAVVSGQYLGAKDKKKAAYAGRQMMQFMVWMSIAIMLLIYLG KYFILHTVFGSIEADVMANCNTYLLISAASIPFLGVYCGGAALFRAMGNSKTSMYVSCIM NVINVSGNAILVFGFHRGVEGVAIPTLLSRVVAAFLMVWLLHDEKHILTLKGMNLFKFDG KMIRKILHIGVPNGIENSMFQLGKILLLSLISTFGTASIAANAVSNSLAYFEILPGMALG LATVTVISRCVGAGDYPQAEYFAKRLLKFTYTCMWITCIVMMLLLPFMLKIYNLSPETSG YATQIMILHSIMAMIFWPAAFALPNVFRAANDVRFTMVVGVTSMWLFRIACAFIFGKYMG FGVFGAWIAMVLDWVVRAIFYVIHYRCGKWKEKSL >gi|222441770|gb|ACEP01000172.1| GENE 7 6534 - 6884 483 116 aa, chain - ## HITS:1 COG:no KEGG:Tresu_0329 NR:ns ## KEGG: Tresu_0329 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 2 116 11 134 134 94 41.0 1e-18 MKKEASVILAKCAQDKLYGIRIEKRDNDWVRTWAFKLKEETAEKEGFDKINFTGSFYTDE EYPGCPYCGAKKCFVCGNCGKVSCYDGSDKVVCNWCGSNGTAADDDGKLDVSGGGF >gi|222441770|gb|ACEP01000172.1| GENE 8 6881 - 7693 712 270 aa, chain - ## HITS:1 COG:no 
KEGG:Rumal_0582 NR:ns ## KEGG: Rumal_0582 # Name: not_defined # Def: hypothetical protein # Organism: R.albus # Pathway: not_defined # 2 263 3 266 301 194 40.0 2e-48 MFQGVHSTVIGDSHIRKGIVCQDSSGTVVTDKFAIAVVADGHGSAKHFRSDVGSRIAVKI TTKLLKNYMNRPDFKEQFFKHPEFILAQMEKQILMKWREAVEEYHLENPLTEEEENKIPE KAKGRLRTAVIYGSTVLAAVLTKDFSYGMILGDGGFVVLGGDKELYIPLEDKNSHANYTS SLCNTDAIHFFEHWYTTETVKALFVSTDGLFKSFASEEDFLKYHGLLSHMFHDTEKMQKS LKRNFEKRTREGSGDDISIAFVYQEGEEAE >gi|222441770|gb|ACEP01000172.1| GENE 9 7695 - 8498 1012 267 aa, chain - ## HITS:1 COG:YPO0594 KEGG:ns NR:ns ## COG: YPO0594 COG4245 # Protein_GI_number: 16120923 # Func_class: R General function prediction only # Function: Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain # Organism: Yersinia pestis # 23 220 2 201 327 62 25.0 7e-10 MSRQFDENIEAIDPLETMPPAKKSMVIFFLVDTSKSMEGSKLESLNRVMGDILPELIGVG EAGTDVKVAVMSFSSGCEWITPEPVLIEEYQRWENLRADGVTDLGDACEELCQKLSRNSF LRAPSLSYAPVIFLMTDGYPTDNYKKGFEMLKKNRWFQYGLKIALAIGSNVDLDVLREFT DDEELVLQACGAEMLKKLVREIAVTSSKIGSTSMTLTDATGERSLAEVAGAKKEQMVEAV QEMKQDILGLDDLTDLDDMDIEFDEGW >gi|222441770|gb|ACEP01000172.1| GENE 10 8482 - 9675 1008 397 aa, chain - ## HITS:1 COG:no KEGG:Rumal_0580 NR:ns ## KEGG: Rumal_0580 # Name: not_defined # Def: serine/threonine protein kinase # Organism: R.albus # Pathway: not_defined # 11 371 103 440 447 172 29.0 3e-41 MSNFNKNGWVSLAQICEERQLVIDAETGKKVLRPAYFSSMNAMIEGAFQFARFFEEIHQK GKVYCSISPDVFYFNLKNGAFHFEGEEFLGEAYVQEPDAAEIEFTEFLAPELAEALAEEQ EKLLSETEEQETLETFKECYSLETDRYFMAVYLFEYFFHTGSPFEGKKMVNRCFLSPEEK ELFRAREGRFCMEPGEEENIPVKGIQDKLIQYWNEYPEILQKMFQKAFLDGGRLRELRPT EVDWKQLLVRMAMDYKSCHCGFHGFCYRLLPKENGTFACPKCGKIYYPLTNGMDRILLAE GEKLYECQTGRNPMDKDTVTGLIVENRQKKGLYGIKNVSQGVWRGFYPDGKIKDIPNGQG IPIWNGMSVRFELGEEWNLRLMQQVEERKEDEDEQTV >gi|222441770|gb|ACEP01000172.1| GENE 11 9689 - 11044 1445 451 aa, chain - ## HITS:1 COG:YPO0592 KEGG:ns NR:ns ## COG: YPO0592 COG0515 # Protein_GI_number: 16120921 # Func_class: R General function 
prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Yersinia pestis # 106 354 127 380 499 68 24.0 2e-11 MRELKIGERIPIEAGGEAVVSRELGRGGQGIVYLVRYNGREYALKWYFISRIKNPEAFRN NLRKNIEDGAPEGGKQFIWPLFLTETKKNGAFGYLMRLAPSGYESFTDIYNGYRWEKPRV KGQPARKIKVKFASLDVMLTAAINIVKAFRALHLSGKSYQDLNDGGFFIDTRSGDVLVCD CDNVAADGDNFGIGGMPGFMAPEVVRGIAKPDVLTDRYSLAVVLFKLFFRGDPLEGSKVL QCVVMTEENDLIHYGKDPVFIYDPNNASNRPVNGVHDNVIKLWKIYPDFIREAFTLSFTY GIQEPNARIIEKSWIQMLIQLKLDIIHCSCGKTAFSSAFEKTGEHTLRCRNCGSTIYTMG VKDYELPLNLGAKLYKCLTKKNSDDFESVTGMVIENRLKKGLFGIKNMSGDVWKAKFPDN SVREVGPGKGVPIWKGLEIDFGDNLIAKILL >gi|222441770|gb|ACEP01000172.1| GENE 12 11069 - 11944 1005 291 aa, chain - ## HITS:1 COG:YPO0594 KEGG:ns NR:ns ## COG: YPO0594 COG4245 # Protein_GI_number: 16120923 # Func_class: R General function prediction only # Function: Uncharacterized protein encoded in toxicity protection region of plasmid R478, contains von Willebrand factor (vWF) domain # Organism: Yersinia pestis # 50 245 2 199 327 62 25.0 1e-09 MPKWLDLNEVDDDEILLKKNRMSRSLVLDTPEVADAAEEDFLDTMEPAKKSMTIFFMIDV SGSMKGTKIGSLNSTMEELLPSLIGVGEASTDVKIAIMKFSTDVEWVTPEPVKIEEYQYW NRLEADGLTFMGDAFMELSKKLSRSTFLSSPSLSFAPVIFLLSDGSPNDDWKKGLDTLKQ NKWFQHGLKIALGIGSKVNMDVLRAFTGNDELAVQAKNADQLRELIKLLAVTSSQIGSRS LALVDSNGRQPAAETVAQAKQQVLVEEIRTGAKDILGEAVDLGAVNYDEGW >gi|222441770|gb|ACEP01000172.1| GENE 13 12315 - 13697 1660 460 aa, chain - ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 3 460 6 463 463 669 69.0 0 MTIREIYKNSEQFLNKEVKLGGWIRSIRGSKHFGFLVLHDGTFFKPIQVVYEEKLENFQE ISKMNVGAAVIVEGTLVPTPQAKQPFEIQAATVTLEGASAPDYPLQKKRHSLEYLRTISH LRPRTNTFQAVFRIRSMAAYAIHQFFQERGFVYVHTPLITASDCEGAGEMFQVTTLDLNN IPKDKEGNVDFAQDFFNKPTNLTVSGQLNAETFAQAFRNTYTFGPTFRAENSNTARHAAE 
FWMIEPEMAFADLKDDMEVAEAMLKYVISYVMENAPEEMDFLNQFVDKGLKDRLNHVLNS EFGHVTYTEAVEILEKHNDKFEYKVSWGTDLQTEHERYLTEEVFKRPVFVTDYPKEIKAF YMKLNPDGKTVAAMDCLVPGIGEIIGGSQREDDYEMLKKRMEEVGLDEEKYQFYLDLRKY GSTHHSGFGLGFERCVMYLTGISNIRDAIPFPRTVNNCEL >gi|222441770|gb|ACEP01000172.1| GENE 14 13974 - 14945 668 323 aa, chain - ## HITS:1 COG:no KEGG:SACE_4224 NR:ns ## KEGG: SACE_4224 # Name: tagC, dinC # Def: teichoic acid biosynthesis protein C # Organism: S.erythraea # Pathway: not_defined # 65 317 74 326 332 72 25.0 2e-11 MKKGQSILGILFFALLFGLCFQQNALNVEAKTRVIHKDITPKEAASSDLKVIKKVTKLAA HRWIQSYTMDSKYYYYIQMTSPYTGNLRITRVKYRGLGRYIKDHMDLKKFGHATNLDCSV SNGQTWLWTGSDCKGNDVSRAISGFRYQKNKTLRKHGTIHYKIPDAKSKKYMTNVYPAIN QNSTQMAVRYTYGGKQYYQIYNLAKGRFINPRNPVKRICLSATSGDFQGFDLYGTSIYTI EGSPRKSFLLGYDRSRRYQPTKIRRYNYAVRSKKTITVRGAKTLSFREPEGIKVLAGGQI YMMYVSNTLTNQSCNIYRVKRKI >gi|222441770|gb|ACEP01000172.1| GENE 15 15195 - 15392 287 65 aa, chain - ## HITS:1 COG:no KEGG:Tresu_0820 NR:ns ## KEGG: Tresu_0820 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 1 65 1 65 65 83 69.0 3e-15 MKIRLNENEKVVKMVKEGLKKKNGYCPCRLEMNEDTKCMCKEFREQIADENFEGYCHCML YYKEK >gi|222441770|gb|ACEP01000172.1| GENE 16 15631 - 16593 699 320 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00038 NR:ns ## KEGG: EUBELI_00038 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: not_defined # 11 310 10 316 327 171 34.0 3e-41 MKKSDEQERNIANRLYNLDKKVYETTGSTLGRVFPSSPSFALEKIIDALKINREYFLKNE MTLIANMIGTIRDEEKAAPLKQEYDELEKDILAYNPTIILPHSIDNNDAVASRRFSSDNH LIICISRQFGTGGHTIGFELAQRLGIAYYDKEIIHLACDRMGLNSKDVTDYGIDSSGSSR GILSSIGKASKKFFSTQSDMLFFTQSNLICEMATQESCIFMGRCADVVLTNNNIPHLSIF LSAPFKERVKYEMGIASCDYKTASAKVKTMDQERKNYYQYYTGRQWGYAGNYDLCVNTFF YGHDETIQVLYDMAEMSCKK >gi|222441770|gb|ACEP01000172.1| GENE 17 16776 - 17576 587 266 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028946|ref|ZP_03718138.1| ## NR: gi|225028946|ref|ZP_03718138.1| hypothetical protein EUBHAL_03236 [Eubacterium hallii DSM 3353] hypothetical 
protein EUBHAL_03236 [Eubacterium hallii DSM 3353] # 1 266 1 266 266 460 100.0 1e-128 MSTYQRTGKRIFFFTVFFMAVLLFAGKGNVQAKRKSVALNKTTVTAYKGMAPVKLKVKNV KKGKNIIWFSSKSSVAEVSQDGTVTFHKKGNAIVQAKVGKKTLKCIVSVCSKKAYKAVEK AKKFHSARNMSYSQGNRMGKRSVDCSSFCGRCYLPQGITMGGSTSWCNTAAGMALWSTKK GKVVANSGVSIGKKKLLPGDLVFYKKGYNGRYKNIYHVEIFVGYEWRNNQLVGYTMSSIT GNYPSILKRDYTSRKGRVAQIARPAQ >gi|222441770|gb|ACEP01000172.1| GENE 18 17899 - 18255 399 118 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028947|ref|ZP_03718139.1| ## NR: gi|225028947|ref|ZP_03718139.1| hypothetical protein EUBHAL_03237 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03237 [Eubacterium hallii DSM 3353] # 1 118 1 118 118 234 100.0 2e-60 MKFHRVLLFTSIQMDITPVGDDLHVLLTCGGENRLGCTAFSTPVTDTDPVECETSVITDV DNPEDLFCPYIAESLCKKTGQRVLCTGGIFVENPDEHQIDKLYENVDEMIMDWVHFLD >gi|222441770|gb|ACEP01000172.1| GENE 19 18809 - 19417 519 202 aa, chain - ## HITS:1 COG:SMb20398 KEGG:ns NR:ns ## COG: SMb20398 COG4186 # Protein_GI_number: 16264132 # Func_class: R General function prediction only # Function: Predicted phosphoesterase or phosphohydrolase # Organism: Sinorhizobium meliloti # 1 153 1 141 164 73 31.0 2e-13 MRYYISDLHFFHESLNTQMDCRGFKDSAQMNAYMIKQWNSRVRPKDEVVILGDFSVGKAK ETNEVLSQLNGTLYMIKGNHDSYLKDSEFNKERFVWIKPYAEMHDNRRKVVLSHYPIFCY NGQYRRNKKGEPLTYMLYGHVHDTMDEQLVRQFCQITRETKRQSKYDEEPMPIPCNMINC FCMYSDYVPLTLDEWIDYYLIH >gi|222441770|gb|ACEP01000172.1| GENE 20 19612 - 20265 827 217 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2388 NR:ns ## KEGG: Cbei_2388 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 212 1 206 207 208 51.0 1e-52 MKVKQVTDASFKKYGKILTGIDFSEIYNVLEKIKYPETVEYAASFGPLEEPDFRQKISNT LYGELSVEIGYCCGHNKVLNALEYHRSSEANVAATDVILLLGQQSDITDDFKYDTAQLEA FFVPAGTAVELYATTLHYTPIGTKENDYAFKTGVILPFGTNFPLGVKLGAEAEEEKLPEE KLLFAKNKWLIAHEEGGEEGAFIGLTGKNISVDDLVI >gi|222441770|gb|ACEP01000172.1| GENE 21 20293 - 21633 971 446 aa, chain - ## HITS:1 COG:AF1145 KEGG:ns NR:ns ## COG: AF1145 COG0427 # Protein_GI_number: 
11498745 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Archaeoglobus fulgidus # 4 436 8 459 482 288 38.0 2e-77 MLKDFMTEYRQKLCTPEEAVEVVKSGDWVDYTSSLGKPVLLDKALAKRRDELFDVKIRGN LVEGPIEVAECDETQEHFVYHTWHCSAYERKLCDRGLCYYIPMVFHNNAAYYKYFLNVNV VMVSVSPMDKHGYFNYSVNTGVAGPIVQNADVVIVEVNEHMPKIHGGYGECIHVSEVDYI VEGKHEPFTTGKPYVPSEIDRKIAQNLLPYICDGATLQLGIGSMPNALGELIAETDRKDL GMHTELCSDAYLHLYKAGKLTNKKKTIDRGKGVFGVAIGSSELYEWLDDNHGVAAYPLEY VNRPDVIAQIDNMVSINSCVSVDLYGQVSSESFGARQISGTGGQLDFLIGASSARGGKAF ICMSSTYKDKSGQLFSRVLPQFNGDIITSPRSQVYFLATEYGVINMEGRSTWERAEGLIS IAHPDFRDELIKEAEKRKIWRRSNKR >gi|222441770|gb|ACEP01000172.1| GENE 22 21611 - 21703 57 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKELEDMSVQLANREKYRGEKNGVKGFYD >gi|222441770|gb|ACEP01000172.1| GENE 23 22364 - 23416 1482 350 aa, chain - ## HITS:1 COG:BH3823 KEGG:ns NR:ns ## COG: BH3823 COG0280 # Protein_GI_number: 15616385 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Bacillus halodurans # 7 323 10 326 330 323 52.0 4e-88 MLTKMIETLKKNPRTIVYTEGTDKRILESAHRLTQEGILGVVLLGKPEEVKAAAKAGNFN IDACQIIDPENYEGIDAMVAEMVKLRKGKMTDEQVRAALSKSNYFGTMLVKMGGADCLLG GATYSTADTVRPALQLIKTKPGNNIVSSCFILVRGDEKIAMGDCAINIDPSEDDLVEIAI ESAKTAKIFGIDPKVAMLSYSTLGSGKGPSVTKVANATKKIKEAAPELAVEGEIQFDASV SPEVAEVKCPGSPVAGQANTFIFPDINAANIGYKIASRLGGYTAVGPVLQGLNAPINDLS RGCNAEEVYSMSIVTAALSVPEEEKVDEEGLEAIVRAVVETMLAAKKLDK >gi|222441770|gb|ACEP01000172.1| GENE 24 23798 - 25468 1978 556 aa, chain - ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 4 543 8 542 543 266 33.0 9e-71 MLVMLTELLSGLALFLFGMKFMSEGLQKAAGDKMRVILDRVTKNRFIAVIFGAIFTAIIQ SSGATTVMEVSFVNAGLLTLEQSVGITFGANIGTTITAQLVSLKLTAIAPYIIFAGACLM MFGNKPITRKISEVIFGFGALFLGINSMTGALSRIPEYPVIMHWFSYLKNPFIALLVGLL FTVAVQSSSVTVSVLVLLADSHLVGLGSCLYFILGANIGSCTPAVMAAMNANKNAKRTAM 
VHFLFNALGMIIIGTILFFARNQIVDAISVLGGADNAKRFVANADTAFKVFQTIIFLPFA SQFVKLTKLFIPGNKEGEEEDNGLHLEYIRKASSKFVPSTAVVEIIQEIERMAGLARENL VASMEALADKNLDKVKEINAREKIIDYLSSEITDYLVEVNRYELPLADSRRIGALFHVVI DLERIGDHAINILENAEKRSQLNEKFTGAGKEELTTMYTEVLQLFDKSMEMFVTNKKDNI DEIIDMENDVDQMQIDYQNQQVKRLSKGVCGVEIGLIFTDMVIGLERIADHSTNIAYSIF HENPEDAEEQQKVLAE >gi|222441770|gb|ACEP01000172.1| GENE 25 25575 - 26495 229 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 31 300 29 283 287 92 27 3e-18 MTHYLEYRLGPTDDGKEVLFLMQKVLGFTTKKVRSVKHEEWGILLNGKRVTVRARGKEGQ LLKVMLEDSQEKQNKIASTKMNLHILYEDEDLIFVDKPAGIVSHPSKGHLSDSMINGVQA YFENEGREKNERSNIHLIGRLDKETSGILGIAKNSVTAERMIKQRQRGILIKEYYALVKG NPPEEGILEIPMEEMRDARDGDKLKMKVGKSENAKTAKTFFQVVKRFEKFSLVSLVIETG RTHQIRFHMASIGCPLLGDSFYGDGPELGISRTALHAQQLTFTHPFCEKVMRIESEMPKD LSFLLR >gi|222441770|gb|ACEP01000172.1| GENE 26 26565 - 27092 502 175 aa, chain - ## HITS:1 COG:CAC2438 KEGG:ns NR:ns ## COG: CAC2438 COG0671 # Protein_GI_number: 15895703 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 16 167 24 170 180 98 40.0 6e-21 MEIEILSMLQKIRTPLLDIFMSNITKLGNAGIVWILLTIVLLLIPKTRKSGLILASALIV DLILCNGILKPLIARIRPFDVNSAIQLIVAKPHDYSFPSGHTAASFTAVMALYLAGEKKL WKIALVLAVLIAFSRLYLYVHYPTDVLGGIITGAIAGYIGYKLTFIVQSKHRKKL >gi|222441770|gb|ACEP01000172.1| GENE 27 27916 - 28209 197 97 aa, chain + ## HITS:1 COG:sll1965 KEGG:ns NR:ns ## COG: sll1965 COG2929 # Protein_GI_number: 16330001 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Synechocystis # 4 96 47 136 136 71 44.0 4e-13 MSDIKFQWNDNKNKINQKKHKISFEEAQSVFFDEEALIIDDPEHSLEEERFIILGLSKKA NLLVVCHCYRESDTIIRIISARKATVRETISYNKNLF >gi|222441770|gb|ACEP01000172.1| GENE 28 28224 - 28433 235 69 aa, chain + ## HITS:1 COG:no KEGG:NE2102 NR:ns ## KEGG: NE2102 # Name: not_defined # Def: hypothetical protein # Organism: N.europaea # 
Pathway: not_defined # 1 69 31 99 99 88 53.0 8e-17 MKDEYDFTKARKNPYAKKLKQQITINIDVDTIDYFKEQSKQSGIPYQTLINLYLADCVAQ KRQLQMTWK >gi|222441770|gb|ACEP01000172.1| GENE 29 29097 - 31220 2129 707 aa, chain - ## HITS:1 COG:ECs0453 KEGG:ns NR:ns ## COG: ECs0453 COG0366 # Protein_GI_number: 15829707 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 116 642 98 561 605 240 31.0 6e-63 MEMVQNLIRNLSAAEKLQFYSYHTKPVLNFQALFTDGSSNYMNPVEPKTGDEVTVRFRTA RENAEHVFLCVNGEKKEMQIASKTERFDFYESSFFMGTEIADYYFEIHSGGTCCYFNKKG PARDLEPFFNYKVTPGFSTPDWAKGAIFYQIYVDRFANGDTSNDVLNREYIYINQPSKKI DDWYRYPEEMDVRNFYGGDLQGVLDHLDYLKGLGVDVIYLNPIFVSPSNHKYDIQDYDYI DPHYGKIVVDEGNTLPDWENNNMNASKYISRVTDKRNLEASNEFFIHFVEEVHKKGMRVI LDGVFNHCGSFNKWMDAERIYENQYGYEKGAFVDANSPYRHFFKFYNQNAWPYNDDYDGW WGHKTLPKLNYEESRQLYDYILNVGRKWVSPPYNADGWRLDVAADLGQSEDFNHQFWRDF RTAVKEANPEAIILAEHYEDAGSWLMGDQWDTIMNYSAFMEPVTWFLTGMEKHSDERRGD LLGNTQAFVDAMVYHMSRFQYPSLMVSMNELSNHDHSRFLTRTNQTVGRTASMGAEAANQ NVNKGIMRAAVMIQMTWPGAPTLYYGDEAGLCGWTDPDNRRTYPWGREDQNLIQYHRDII AIHKEEEALMKGSLKYLYGGFKVLGYGRFTDTEKMVVLINSGFDEARVKVSVWEVGVTDD EEMVEVIMSNEEGYTMEKKVHPVTSGVLDITLPRVSAILLKAVEIEK >gi|222441770|gb|ACEP01000172.1| GENE 30 31421 - 33010 1960 529 aa, chain - ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 4 526 6 524 539 507 52.0 1e-143 METVKFTELDIKPEILKAVANMGFEAMSPIQAKAIPVELSGKDVIGQAQTGTGKTAAFGI PILQKVDPKLKKPQAIVLCPTRELAIQVADEIRKLAKYMSSVKILPIYGGQEISKQIRSL KAGVQIIIGTPGRMMDHMRRKTVKFDNIHTVVLDEADEMLDMGFREDIETILNGVPEERQ TMLFSATMPKPIMELARAYQQNPEIIKVIRKELTVPNITQYYYEVRPKNKSEVLSRLLDI YDPKLSVVFCNTKKGVDELVADLKGRGYFAEGLHGDMKQTMRDRVMHRFRSGKTDILVAT DVAARGIDVDDVDAVFNYDLPQDEEYYVHRIGRTGRAGRTGMAFSFVVGREVYKLKDIRR YCKAKIKAQPIPSLNDVTETRVEKIFDRIDHYIEDQNLNKYIDMVEEFVNEKDYTAMDVA 
AAFLAEILGTADGKDAGSNKEDFGDTGAEEGMVRLFINIGKKQGIRPGDILGAIAGESGI SGNLVGTIDLYDKYTFVEVPREVASDVLEAMKNVKIKGKSINVEPANRK >gi|222441770|gb|ACEP01000172.1| GENE 31 33322 - 34143 716 273 aa, chain - ## HITS:1 COG:CAC1665 KEGG:ns NR:ns ## COG: CAC1665 COG0388 # Protein_GI_number: 15894942 # Func_class: R General function prediction only # Function: Predicted amidohydrolase # Organism: Clostridium acetobutylicum # 1 267 1 257 260 221 44.0 1e-57 MRIGMAQMDISWENIEDNKEKVEQFFKKAEENQVDCLVFPEMTLTGFSMNVEETGEKSEN QKHFFEEMSKKYKILTVFGYTEPVPEERLKEHPNWNHYYNRLGIAENGELKLNYAKIHPF SYGFEGEYYQGGRKLKSVEWKGTTLGAFVCYDLRFPEIFQISSEKSEIIFVIANWPKSRI DQWDTLLKARAIENQVFMVGVNRTGEGDGLHYNGHSAIYSPNGEAVTTIREEECLLIGDI NPEEIKEMRKTFPMKNDRREELYIKMWENAPHQ >gi|222441770|gb|ACEP01000172.1| GENE 32 34664 - 35323 491 219 aa, chain + ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 7 219 4 214 214 163 43.0 2e-40 MNKTSKKFKDPGSAITHLIAMIGAVICAFPLIGKAVHTNNSIVVFAMSVFISSMILLYGA STLYHSLDISKKVNLFFRRIDHSMIFVLIAGSYTPVCLLALDRKQGIPLLVLVWATAIIG IIIKIFFINCPHWVSSIIYIGMGWTCVLVFKPLLASLPASAFAWLLAGGIIYTVGGIIYG LKLPIFDKLPKDFGSHEIFHLFVMGGSICHFIFMYAYLM >gi|222441770|gb|ACEP01000172.1| GENE 33 35508 - 35795 135 95 aa, chain - ## HITS:1 COG:SA0446 KEGG:ns NR:ns ## COG: SA0446 COG2827 # Protein_GI_number: 15926165 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Staphylococcus aureus N315 # 13 88 4 79 82 82 63.0 2e-16 MKETDKQNQHQGNYTYIVKCSDETLYTGWTNNLKKRLEAHNSGKGAKYTKNRRPVELVYF EEYDTKQEAMKREYAIKQLSRQKKLALIHSHQSIF >gi|222441770|gb|ACEP01000172.1| GENE 34 36327 - 37514 904 395 aa, chain - ## HITS:1 COG:no KEGG:Closa_2891 NR:ns ## KEGG: Closa_2891 # Name: not_defined # Def: Peptidase M23 # Organism: C.saccharolyticum # Pathway: not_defined # 209 394 72 253 253 142 39.0 4e-32 
MLVIPEYLKPLVSDYRMNLIQVRQSESLCFRNQDISIVFDMIRSIYNKDYETFHEMYKDK TMSTELGLGRNGSSIHCLSKIPCINFIFLVKTIPRGDEMKRPNNLFQKKYISSLVIAAFL ILSVLTAYNIKIGNADGGKSDNKTKTETTEKSNDVIGPVSGNDTGSATNSADSAANNDMD VEPGAEGQNGTDTLDSNAEDNGEGTKYSNNSGRNSSSEQDDADTKDENNTKNNISGTKEN AGNNAENSGASGDSVYNYNGKKKLTWPVMGNIILPYSMDATVYYTTLDQYACNDGIIVGA KVGEEVVSPANGRVVNIEDTDRYGKVVTILLGNYYKAYCGQLDNVDYEIGDDIQEGDVLG TVAEPTKSFVLEGPNVFFKMTYKDKTVNPVKYLRV >gi|222441770|gb|ACEP01000172.1| GENE 35 37999 - 39129 784 376 aa, chain - ## HITS:1 COG:BS_spoIID KEGG:ns NR:ns ## COG: BS_spoIID COG2385 # Protein_GI_number: 16080728 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Bacillus subtilis # 46 371 73 333 343 91 28.0 3e-18 MKKLALYILFIFGALLIPTIATILISGNMGVADIKMGITVEMKNGEKMDGEQFVIGMAAS ELSYIKEEEALKAWMIVCRTNFVKAASGESNSKKVGSETDMTVAQKTRNIANAASTQNSG NTTNVVGIQNSNNVKDVQSTEIINRKYQNIRQENLNLDYISMKELEENNGRKAYLEIKKK LEDASDATFGQVLTYENKYADALYHEVSIGKTVSSEEMYSVAVPYLISVDSSQDVESPDY MDVKVLPYDEVLQKLTKADKSGNLAKTDKSQNIKILKSSLKITKHTENGFVQEITAKDNK WTGEEWKKIFALNSTNFYLENYNGKLRMVTVGKGHDMGMSLYGANALAEKQITAATILSY YYPGTKVVTPDLPQAK >gi|222441770|gb|ACEP01000172.1| GENE 36 39626 - 40003 545 125 aa, chain + ## HITS:1 COG:lin1108 KEGG:ns NR:ns ## COG: lin1108 COG4810 # Protein_GI_number: 16800177 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Listeria innocua # 13 125 5 116 116 121 54.0 4e-28 MTKDMINNGVNSDDKSRIIQEYVPGKQITLAHMVTRPKASVYKKLGLPDDYHDAIGILTI TPSEATIIAADICTKAASIKLGFVDRFSGSVIIVGSTSNVESALNSVITTFYNKLGFDTV SITKT >gi|222441770|gb|ACEP01000172.1| GENE 37 40038 - 40517 614 159 aa, chain + ## HITS:1 COG:FN0075 KEGG:ns NR:ns ## COG: FN0075 COG4917 # Protein_GI_number: 19703427 # Func_class: E Amino acid transport and metabolism # Function: Ethanolamine utilization protein # Organism: Fusobacterium nucleatum # 2 136 3 138 145 110 43.0 8e-25 MKIMLIGPSAAGKTTLIQRLIDEEIKYDKTQAVEYIGNFIDTPGEYMQQRGYWGSLTITS 
HDADIIGLVQDSSSEDCWFSGGLTSKFEKPVVGVITKIDKDDSNPKQARSYLELAGCTTI FEVSSYTGEGVEALSQGFHDMVNKYREENKGRYADDYFD >gi|222441770|gb|ACEP01000172.1| GENE 38 40535 - 41788 1191 417 aa, chain + ## HITS:1 COG:BS_ykrT KEGG:ns NR:ns ## COG: BS_ykrT COG4857 # Protein_GI_number: 16078420 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Bacillus subtilis # 35 399 36 391 399 168 30.0 2e-41 MITLSKENVVDYVKSRLDFFNPDGDIKVSAIGEGTVEEDGDGFINFVYRVSDGVHHLIVK QSTLESRSKGSFTLDLNRYKLEYDAMKICAAIVPDLIPKLYDCDEENRVFITEDVSYLRI SRFQLLKGVTYPKLADQIARYMAATQFYTSEYYLDTKRFRDLSVHFMNTTMRKIMEVGMF LTSVTPEDTVGRPLDPEFVTFSKRICADPAVLLARQKLRHLFMSKGECLIHCDLHTSNIF ASQTEAKVIDMEYAFFGPFAYDMGYFTANFIAQYAAAAFRPFETEEARKEFKAYCLFMIR ETYVKYCEYFTQYCKEDSKVEYHNIPGLAEDFCLTTLREFIGFAASAMLGRICDLIPYPD YDDIDDYVQRHNAKCFSIILNREMLVKWESYNSIDEFIQDILTIEEIYCRNIKDLKL >gi|222441770|gb|ACEP01000172.1| GENE 39 41803 - 43026 1153 407 aa, chain + ## HITS:1 COG:BS_ykrT KEGG:ns NR:ns ## COG: BS_ykrT COG4857 # Protein_GI_number: 16078420 # Func_class: R General function prediction only # Function: Predicted kinase # Organism: Bacillus subtilis # 42 400 41 391 399 157 30.0 4e-38 MIILTKENILSYIKEHVPSLQLKEPVKVSMIGEGDLGEDVEGDGYCNYVFRVSDADGYSY IVKQSTEHLKRRGRALTPTRNRFEYEIMELRAKIVPQYVPALYLGDPENNVFIMEDVSNL KLIRFQMNKNHLFPELAKQGAQYLAATHFYTSEFYLPTEEYRKLLAHFMNAELRVVMEDG IFLHIFGSDHYDAACGPEFEEYCKSIRFDPNLEFQRYKLRHLFMSKSETLIHGDFHTSNI FADDTHLKVIDMEYTFGAPFSYDLGFIIANIISQACSAAFRPFDTETHRKNYVAYLISLI EMLYTYYIQFFFEYWEKDAKLEYKITNGYKESLALDILRECFGFAACVNFSRVCGDMDTA DFDCIKDNELRTKAKFASATIDKMLFEKWDSYNDIKEVIEDIINMIC Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:35:22 2011 Seq name: gi|222441769|gb|ACEP01000173.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont494.1, whole genome shotgun sequence Length of sequence - 618 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 3 - 602 399 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|222441769|gb|ACEP01000173.1| GENE 1 3 - 602 399 199 aa, chain + ## HITS:1 COG:yi5B KEGG:ns NR:ns ## COG: yi5B COG2801 # Protein_GI_number: 16131429 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Escherichia coli K12 # 21 194 108 277 283 140 43.0 1e-33 SNLKSRYNGCTRASNNPAFIAENILNREFKAEHLNEKWVTDVTEFKYGNTLDSVHKVYLS AILDLCDRRPVAYVIGDSNNNALVFETFDKAVKANPGAHPIFHSDRGYQYTSRRFHQKLE EAGMVQSMSRVAHCTDNGVMEGFWGILKREMYYGKKFESREELEKAITEYIDYYTNDRPQ RGLGVLTPMEFHEKQRLAA Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:35:30 2011 Seq name: gi|222441768|gb|ACEP01000174.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont495.1, whole genome shotgun sequence Length of sequence - 32679 bp Number of predicted genes - 28, with homology - 26 Number of transcription units - 16, operones - 5 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 139 - 1743 1976 ## COG0504 CTP synthase (UTP-ammonia lyase) - Prom 1894 - 1953 8.6 - Term 1954 - 1997 3.8 2 2 Tu 1 . - CDS 2004 - 2837 336 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 2978 - 3037 10.7 + Prom 3015 - 3074 11.9 3 3 Tu 1 . + CDS 3260 - 4180 1272 ## Closa_3989 hypothetical protein 4 4 Tu 1 . - CDS 4477 - 5424 589 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 5570 - 5629 8.3 + Prom 5642 - 5701 7.9 5 5 Op 1 . + CDS 5764 - 6339 598 ## gi|225028978|ref|ZP_03718170.1| hypothetical protein EUBHAL_03270 6 5 Op 2 . + CDS 6355 - 7782 1261 ## CKR_1546 hypothetical protein 7 5 Op 3 . + CDS 7779 - 9938 1599 ## gi|225028980|ref|ZP_03718172.1| hypothetical protein EUBHAL_03272 8 5 Op 4 . + CDS 9857 - 11524 1353 ## gi|225028981|ref|ZP_03718173.1| hypothetical protein EUBHAL_03273 9 6 Op 1 . 
+ CDS 11641 - 12540 962 ## Gbro_2459 ABC transporter-like protein 10 6 Op 2 19/0.000 + CDS 12533 - 14122 1376 ## COG0772 Bacterial cell division membrane protein 11 6 Op 3 . + CDS 14028 - 15335 1284 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 - Term 15454 - 15497 1.0 12 7 Tu 1 . - CDS 15517 - 15993 178 ## gi|225028985|ref|ZP_03718177.1| hypothetical protein EUBHAL_03277 + Prom 16163 - 16222 8.0 13 8 Op 1 17/0.000 + CDS 16331 - 17257 872 ## COG0631 Serine/threonine protein phosphatase 14 8 Op 2 . + CDS 17259 - 18479 980 ## COG0515 Serine/threonine protein kinase 15 8 Op 3 . + CDS 18526 - 19206 571 ## gi|225028988|ref|ZP_03718180.1| hypothetical protein EUBHAL_03280 16 8 Op 4 . + CDS 19221 - 20489 1026 ## gi|225028989|ref|ZP_03718181.1| hypothetical protein EUBHAL_03281 17 8 Op 5 . + CDS 20480 - 20941 417 ## gi|225028990|ref|ZP_03718182.1| hypothetical protein EUBHAL_03282 - Term 21234 - 21276 1.1 18 9 Tu 1 . - CDS 21511 - 22077 238 ## gi|225028991|ref|ZP_03718183.1| hypothetical protein EUBHAL_03283 - Prom 22145 - 22204 4.9 + Prom 22104 - 22163 3.7 19 10 Tu 1 . + CDS 22185 - 22334 110 ## + Term 22369 - 22424 -0.4 - Term 22420 - 22453 2.1 20 11 Tu 1 . - CDS 22484 - 22720 199 ## - Prom 22834 - 22893 6.8 + Prom 22973 - 23032 8.2 21 12 Tu 1 . + CDS 23068 - 24366 1739 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Term 24443 - 24502 7.1 + Prom 24476 - 24535 7.8 22 13 Tu 1 . + CDS 24647 - 26167 1821 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 26225 - 26268 0.1 23 14 Op 1 . - CDS 26350 - 26880 413 ## Closa_2357 hypothetical protein 24 14 Op 2 . - CDS 26862 - 27782 768 ## COG1242 Predicted Fe-S oxidoreductase 25 14 Op 3 . - CDS 27788 - 28939 1265 ## COG0281 Malic enzyme - Prom 29149 - 29208 5.3 26 15 Tu 1 . + CDS 29423 - 30994 1518 ## COG1158 Transcription termination factor + Prom 31214 - 31273 6.9 27 16 Op 1 . 
+ CDS 31301 - 31771 519 ## COG2426 Predicted membrane protein 28 16 Op 2 . + CDS 31792 - 32193 174 ## Ccel_3106 Zn-finger containing protein + Term 32403 - 32455 16.1 Predicted protein(s) >gi|222441768|gb|ACEP01000174.1| GENE 1 139 - 1743 1976 534 aa, chain - ## HITS:1 COG:CAC2892 KEGG:ns NR:ns ## COG: CAC2892 COG0504 # Protein_GI_number: 15896145 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Clostridium acetobutylicum # 1 529 1 529 535 709 62.0 0 MPVKYVFVTGGVVSGLGKGITAASLGRLLKARGYKVTSQKFDPYINMDPGTMNPIQHGEV FVTDDGAETDLDLGHYERFIDEKLNKQSNVTTGKVYWSILNKERRGDFGGHTVQVIPHVT NEIKSRFYHNEDASETEVAIIEIGGTAGDIESQPFLEALRQFQHEVGHENCILIHVTLIP YLKASGELKTKPTQASVKELQGMGIQPDILVCRSDLPLDDDIKAKIAQFCNVPKKRVIQN LDVDILYELPLAMEKEKLANVACECLNMECPQPDLTDWIDMVNAWKHPKHKVKVALVGKY VSLHDAYISVVEALKHGAVANNAEVEIKWVDSELVSDYNVDSFFSDVDGIIVPGGFGDRG IEGMICSIRYAREHKIPYLGLCLGMQLTIVEFARNVLGLKDAHSHEFNENTANPVIHIMP DKEGITDLGGTLRLGSYPCILDKHSKAYQLYGHKQIEERHRHRYEVNNDYREVLQENGMM LSGFSPDGRIVEMVEIPEHPWFIGTQAHPEFKSRPNKPHPLFKGFLAASLAHQK >gi|222441768|gb|ACEP01000174.1| GENE 2 2004 - 2837 336 277 aa, chain - ## HITS:1 COG:STM4423 KEGG:ns NR:ns ## COG: STM4423 COG2207 # Protein_GI_number: 16767669 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Salmonella typhimurium LT2 # 40 276 21 266 274 96 27.0 5e-20 MTQTRYTLQQHHVNLKSSVKLSHVNMYHCESPASNIFPGHPYSKIIFIQKGQGQFYFGNQ FLPVRENDLLLVNPDKQKFSVDVRESPLDFVILGIENLSFIDDTKQCIPPFTRLSFPSDT CQHLMHMLIHELEKHSEFYEDACRYYLNLILIEIQRNTNIHYDFPSKEKNNRDCKFVKEY LDSHYTENITLDILSKESKMNKYYLVHSFTKHFGCSPISYLNDKRIEESKNLLETTNHSI AAIASMIGFSSQSYFSQSFKKNTFMTPNEYRRASRQI >gi|222441768|gb|ACEP01000174.1| GENE 3 3260 - 4180 1272 306 aa, chain + ## HITS:1 COG:no KEGG:Closa_3989 NR:ns ## KEGG: Closa_3989 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 2 305 3 306 309 393 62.0 1e-108 MKEPVLIVMAAGMGSRYGGLKQMDPVDEQGHAILDFSVYDAKRAGFKKVVFIIKHAIEKD 
FREIVGKRIEPFMEVEYVFQELDKLPEGYQVPEGREKPFGTGHALLCCKDVVDAPFAVIN ADDYYGPKAFQMLYDYLTTHEDDDLYRYVMIAFHIEKTITENGHVSRGVCQVDKNHYLKE ITERTRIEKRGAEAAFTLDDGATWTPVPDKTPVSMNCWGFTPGFLKVLEDKFPAFLDKGL KENPMKCEYFLPSVVEELLKEKRATVEVMESAEKWYGVTYKEDKKSVMDAISEMKQQGVY PEELWS >gi|222441768|gb|ACEP01000174.1| GENE 4 4477 - 5424 589 315 aa, chain - ## HITS:1 COG:PA0748 KEGG:ns NR:ns ## COG: PA0748 COG2207 # Protein_GI_number: 15595945 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 187 300 190 303 306 76 30.0 7e-14 MALESLHLKKEFEIRQLTSLHYFEYKSDFIFKSEYHDYWKLLYVEDGSAEISFSNKKQAP VTLQKGELFIQSPNEYYSFRASSGNIAVLFTAGFYSDTANLSLLANKKLVCQKKETELLQ ALALEGKNNFSPKINPASPAALERRYNQPFGGEQLIGIYLEMLLISLMRQYTSSEEKKET PSENTKKDASASLLKTDSILFNRITDYYEAHITEHIKVEHLCKEFGIGRSHLQRIFREQT GYGAIEYFCQMRISAAKRCIRENRMNLTDTAASLGYTSIYYFSKQFKKITGMSPSQYQKM VRSSKKDPLYEQREL >gi|222441768|gb|ACEP01000174.1| GENE 5 5764 - 6339 598 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028978|ref|ZP_03718170.1| ## NR: gi|225028978|ref|ZP_03718170.1| hypothetical protein EUBHAL_03270 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03270 [Eubacterium hallii DSM 3353] # 7 191 1 185 185 323 100.0 3e-87 MISTFTMEAGLVIAALMAVVGVVQWKRQSGRQQQEKYILDYAGQVSELESQKREEERQQR QEEEKKQKDRLFGMDFMITHGSAFGGCAAICPDGRIRRDKVALDNILMYGKHPSGKEFTY YIEDQGLEICACDNPEELTVISKQQPFEIREAGLSRGQGVPTKQAVIKQDVKYYIILESK HEISIKATKRC >gi|222441768|gb|ACEP01000174.1| GENE 6 6355 - 7782 1261 475 aa, chain + ## HITS:1 COG:no KEGG:CKR_1546 NR:ns ## KEGG: CKR_1546 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 11 464 42 425 436 82 23.0 4e-14 MQETKTIRVKKCPYCFRNISNEDAGFLLRTDGVRFQSPALNEVFSYKTDTAYLYFWSAMG IPEEQIDAKRIIIDNEVMTELNQELTAAGRDLAVKRFDTDSCGYTFHVEEGAVTLFSNTM VCPHCHNVLPQNFFKYEMLMIGLAGSVASGKTVYLCSLMMNGFDVMQRQNLTVRNAYGNP NDDYKIEMERNADRLMRYGICPESTSKAFKKPVFLEVTYRIGQKNLHMLTAIYDVAGELI RDSAGTGRTGFVRHMDGFICLVDPAQMHLEHSLINKQIPDEERVLEKLYVMGKEEQMSIQ 
RMSNENGKQVMDQADFMMENTMSDEYFYERKADTVLDSIRSVLGDHELKQKYMALTIAKS DLLEELGEIKEYRGSSLLFERKQVSYGFMNMDHHFLRQEILKQIFDQKVFRLQRNLADYK ESGLFAVSALGCETEETMEGTQKEVKTVGRVRPIRTEEPILWMMMKFMQERGWLE >gi|222441768|gb|ACEP01000174.1| GENE 7 7779 - 9938 1599 719 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028980|ref|ZP_03718172.1| ## NR: gi|225028980|ref|ZP_03718172.1| hypothetical protein EUBHAL_03272 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03272 [Eubacterium hallii DSM 3353] # 1 719 1 719 719 1424 100.0 0 MKTEYLIHEYMWSNHCISGNKLGWGITASSMPEDRAYLRELEKLAQAAVIDKTGKTEVDE LVYSSVCGFVKMSSVPCESGEDKRQNKRVRIYQPKAPESNPVAYLAPGGEWAEEESVGYL QPLFLEEPEFHRKDILQEMNLMSRLPEFMQVVFWCLSGHSEGINIVAPDWKEEEFAEKAK RLMYVIHSLLPQPARERAGYVSFTREAIPSVSFYFSQKVCGTKYFNLSEKSERNDWENTQ TALDQYFYNGFAQASQKEDEIYRDFQKTAGKYLKTVRDNGNLLKKVEWIFYDIARKHGQS ALSIEILSENFPELLYWVCKDKALEYVAEDILKEIREYKFSAKERQKYIENLLTGMTGRS RERILKEIDRVLGEVFEEDKVEFASLLAVIREKNKDIYTSLLCETLANQKLSDYGKSLFR MNARDMESLYKYVKDFNEEKVPGEQKDEILRTGIGLLNEELFDKDRFELFDKIAIHLNRR EQWIKILQDFVKQLQEHAALFNKKQLDTACYVEKMLGGYRPETRMVLRQERGHRTHRIKK GGQSKDSGEEDSDMGRRYKKAAETEKEKARKTGTEIENVEAEAVEDSLEEEGPAVPFLLM GFPQGFLTGCIMYLSHYSLMIGHWKIALGMAGMWVLLMLNYQAVIIQRKASYPLWKVVGL CLVEGWIIEAAAWFFRSQKVRLYYFIILGVITVCIQVVNLLRLKKEKSQMEEESAHAGR >gi|222441768|gb|ACEP01000174.1| GENE 8 9857 - 11524 1353 555 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028981|ref|ZP_03718173.1| ## NR: gi|225028981|ref|ZP_03718173.1| hypothetical protein EUBHAL_03273 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03273 [Eubacterium hallii DSM 3353] # 33 555 1 523 523 1018 100.0 0 MYSGSESPSPEKGKKSDGRRIGTCRQIEKEEVMERLADDTIESFMNAIEERLAEGQKNKI PFSRFLEEKMDAFDYSNTSLAKKVFHRVEKKKEGTVSYVPVTRQAIGAWLRGSMPSSRDI YVTLGMAFEMNLEEINHILLETYMGYGLYCKNIDDALWIALINGLFPIDAFEDVRAHIED ILEENIQQDSRSLATMDLWVMLSEVKTLEEFYELIRSYKDEFKDGTRKFGQCLEEVIEEE YGYYDKAAWFLRDIGCLHCEAQFSKIRAGKAVVTREWLLRFCIALQPTYESIEKLLAKAQ MEPLGITPAEVIIEMIARYKSNSVANSQEIWIMIESASERLRQKGYEIEEDLCRKYNSVY ELPAAQKWWFSFCIGRQILKNEKNREFGYEKNGYCRFAMEDKLLFDDMNRNKKNVSFKKS 
AAEYMEEGIDNWQEVEFQEIPALLIEKNYVPDALDMEKFEDYCYVRRPSRFTKDFLRNDM YFYAATLYSIWTGKCYQKENGMQNVEEIQKEFIKNNLDGKALLSILAEVLGEEEDGWKSD LSSMVEAVAKVRKNQ >gi|222441768|gb|ACEP01000174.1| GENE 9 11641 - 12540 962 299 aa, chain + ## HITS:1 COG:no KEGG:Gbro_2459 NR:ns ## KEGG: Gbro_2459 # Name: not_defined # Def: ABC transporter-like protein # Organism: G.bronchialis # Pathway: not_defined # 210 283 209 280 848 65 41.0 3e-09 MSIISQIGVVILIAFIVLVLIRSNKNLKKASEEGENFVDEWALRQEKKEEIKQKREEKKL EVQKEKEQKKLQKKQKRKGILVDEDNEPEEYELREEKEEKQALEEEQYAEEYPKKNDFSE MEYEEESQQDGYDGESIRVNTDWLYDYEEQERKRQERMQPEESSVGQDYKDESGKIVPMK KRETTGITLMKLDANHRVMARYRVNKIPFTIGRSTSNDLVLDDLCVARNHCRIIERDGRY LLEDVGTMNKLYVNGMVTGQVPLSDGLRVYIGNEEFQIAVEGGRSQSTRLYKNAEGSYE >gi|222441768|gb|ACEP01000174.1| GENE 10 12533 - 14122 1376 529 aa, chain + ## HITS:1 COG:SP0803 KEGG:ns NR:ns ## COG: SP0803 COG0772 # Protein_GI_number: 15900696 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 187 490 102 404 407 106 28.0 1e-22 MSRHDTGTKAFSFERVVWEIQRAFYSFKNSPVLALLLMQAAAFIFVQAKGAGKIYLDMIG ILLAITLLSWFFTSVIHGNKKIVIYTLLLLTVGTMLQCIFKEEQVLKNPQYYATHNPATS LQFQYILGFVMALLAAFIYINSRNIASIKCLKRFKLSTIKVCRIFFWFSLLLSAATLVLA KSVGNVRNWITIGGVSLQTTEFIKFLYVFIAAGLLGTKANPDKENIRAFYTVTFLEVLFL ALQSEFGTMLLILMLFLTFLFLFVPDIKVFIGTVFVMAAGSVGLSVIGAQITKWNSAGVF LGTNKLAQIFLSNYNKIANRFIYWLHPEKDALGLGYQLLKAKESIVLGGWFGTSSVTELP VKTSDLVYPALIQRCGMIFALLVFIVFIMMWLEGVRLFVRKQDRYHRAVGAGFVFMLFDQ TLIIIAGSTGLCPLTGITLPFISSGGTSLMISFMIVGLIVAVSSNVKWKGTVEDEEEQDK FFKENAVTAKCHAYLRHLNDHFSRESFRAASGRIKRSGQKEESGQGKDI >gi|222441768|gb|ACEP01000174.1| GENE 11 14028 - 15335 1284 435 aa, chain + ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 13 406 35 454 482 205 32.0 1e-52 
MIISLVRVSALLPGVSKEADKKKSQAKAKIYEKEYVRGSILDRNGNTIAFSQKPGGARTY SHPYAFSNLVGYWSKIYGTYGVEKTMNEELVHSNCGANPKQKKGADVSLTIDAALQERAY KDIEKYKGSVAVLDAKTGEILALASSPSFNVSEIEDKWKKINEKEGVFLSNAYQNPVAPG SVFKLITSKEIVEAGIEREEVEDTGSITVNGQTIRNYGGKAYGSISFREGFVKSSNVYFM NRALKLGGLRLYKTGKSFLLGEDISLDFATIHSNFDLKGYEDNVVATTAFGQGETLVTPL QMAMVTQSIANDGVMLKPYLFKSVVNGKGQTTAEGKSEKLVETMDTDTAQEIKEAMKAAA QSYGMSTVGEKEYSIAAKTGTAERGDGTNNAWLVTFAPADNPQYVIVANRLKTTEIGKTL APVVEDLYNTLFDAN >gi|222441768|gb|ACEP01000174.1| GENE 12 15517 - 15993 178 158 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028985|ref|ZP_03718177.1| ## NR: gi|225028985|ref|ZP_03718177.1| hypothetical protein EUBHAL_03277 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03277 [Eubacterium hallii DSM 3353] # 1 158 106 263 263 275 100.0 8e-73 MECCDGCLLPKLVPFLALEDLLYLLLSTGKVLSYLLEKGILYTDLHPSNVLIRTLDGHLA ITLLDFTYCYYFLNNPNPPYELRFSYNLSPDLKGQQMLIQELTYFLHALLEIKEQQRTGK KEPLPFSVCMLLETGNHPPENLSLQEFLIMIQKCFLSF >gi|222441768|gb|ACEP01000174.1| GENE 13 16331 - 17257 872 308 aa, chain + ## HITS:1 COG:DR2513 KEGG:ns NR:ns ## COG: DR2513 COG0631 # Protein_GI_number: 15807498 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Deinococcus radiodurans # 37 249 43 241 353 70 27.0 5e-12 MAFFFASEINRRSSNEDSYCQMEIHMNAEAVVKAMVVADGMGGLSGGKYYSEAAVNLWYQ ELLQTLMSDRFKGNPLDRQIEILQEFSEEIYSKLNQGLYKKGLDAGIKGGTTLSAVIHFW DTFIVSNCGDSPIYRIKNGTIELISEVQNVAEEMVREGKTTAGSVLYYQNKNRLLDYLGK RKECKPYCRLFPAEEIDGILMGSDGAFGDLTKAEIETICCDCERPQRVIGQIFESARESG EEDNQTAILYVNQEKEKNKKESEVKFPKTDTRQSKKKEDICCYTELPKSKKPLSLKEKLF KNRFIGGR >gi|222441768|gb|ACEP01000174.1| GENE 14 17259 - 18479 980 406 aa, chain + ## HITS:1 COG:lin1934_1 KEGG:ns NR:ns ## COG: lin1934_1 COG0515 # Protein_GI_number: 16801000 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Listeria innocua # 21 293 4 271 342 73 25.0 
6e-13 MNCYFEEGEVFTWIRDRERSGENMVVSLKMLKECHRTGSKQAMIAEDIRTGKKYFVKVLF CNDLEQVYVEKESKVQLYSPYIIRIYGGMLDEKNKRFITLVEYIEESDLSELMRGRGIAG DTWNEKMKVRNRIAMKFLLGIDHYMSMYRQDPIVHRDLKPENVLASPDGSVVKIIDFDWV HLHASNVTVMLRREQKGTPGYADPRYWNSYICRPEMDIYSAGLVLYFIYTGKHHFYGNDE IKNYMIGDDYAYQLKEMPGIDDRLSRIIAKMIAREEERYGSIKEVIQDMKEYLMSVRKLP ELPEFLEQPQEQKTIRFSYRIGDVKYSPYMKNYRFIPIEFGTKQERSQNGRMSGHILSFY RMDDKVKALILHEDCHIIRKQKEKEVCEGDIFTYAGTNIEILQIKR >gi|222441768|gb|ACEP01000174.1| GENE 15 18526 - 19206 571 226 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028988|ref|ZP_03718180.1| ## NR: gi|225028988|ref|ZP_03718180.1| hypothetical protein EUBHAL_03280 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03280 [Eubacterium hallii DSM 3353] # 1 226 1 226 226 422 100.0 1e-117 MNESWNPEIPLDLQQNFKKEKEQAFTLYLDFVVDATASMYTVFPAVYYAAAHFLECLSKY EVYPQIGLTLIRNEENGEETETVLFEGRDSFTSDISLFLKKLKGTKLYGGGDDGKESVHK AIGTSLRKFPVSGRNKAIMVFTDAYGSNDYEEYTQYPVGQVIFFSTDEMSEEDFHFCFVK PDGELDEEASPMFIKIDKLLKPMSTEFLDNIVKPLKDLMKGVSIGA >gi|222441768|gb|ACEP01000174.1| GENE 16 19221 - 20489 1026 422 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028989|ref|ZP_03718181.1| ## NR: gi|225028989|ref|ZP_03718181.1| hypothetical protein EUBHAL_03281 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03281 [Eubacterium hallii DSM 3353] # 1 422 1 422 422 755 100.0 0 MEERERMETVPMGNGSKEELFEKGHPVIRELKSIPEIIFEEQIGMLPSSAIPVKSLFARK QMDHFHVDSLILKKIALQVLHILKKLSQKHIYPGLIELQEFYVDMDNSQYGVVLLHPEKF QLLTFEQDYEWYPEDERIFGDATLFDEKMQQIADNRLIYKILVASAKGNVKIPPLKTEAD YSELFYNILSEEWKQIFENREICSYEKLQNLLEESIQMEEEFARMTKESLDEKSEQNKTK KAEELQARKSGETSAEQFDLFIILRTELDNSKKISKMLYLLQDELELENTLSGYNCQQAF VFGNGAVQVKTFQNYRTGFRCQFPQTIREYSSGEALIIGTDLMKEKKEELLLKNGENHKQ QLCLYILMDGRIKNDQLFQIALNRLQKLKEEGVKLCLRMVDNIYCEACQKLKDIVEDTQR CC >gi|222441768|gb|ACEP01000174.1| GENE 17 20480 - 20941 417 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225028990|ref|ZP_03718182.1| ## NR: gi|225028990|ref|ZP_03718182.1| hypothetical protein EUBHAL_03282 [Eubacterium hallii DSM 3353] hypothetical 
protein EUBHAL_03282 [Eubacterium hallii DSM 3353] # 1 153 1 153 153 254 100.0 2e-66 MLLKKLKDFHERTMEQYKEEENLEPWKKKVMELHEKSAFLFYYDATLEENAEQNSLIIQG SLVEGELPIGSTVYLYTGEGKYLGSGRILSEPEEKEQGRRGLFKRRRNQFNLGLDEYLGK KVEKMKSREKTKMFHHIEANASLISELLICEAK >gi|222441768|gb|ACEP01000174.1| GENE 18 21511 - 22077 238 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225028991|ref|ZP_03718183.1| ## NR: gi|225028991|ref|ZP_03718183.1| hypothetical protein EUBHAL_03283 [Eubacterium hallii DSM 3353] hypothetical protein EUBHAL_03283 [Eubacterium hallii DSM 3353] # 1 188 1 188 188 318 100.0 1e-85 MPNRNDSSSPLSRLINDDNLTILENVIPYCNTNLGKLLALCLKFSEMQKIIEGFDDSERL NACGFENNSTDIESILRSVRSSVSEEKARQIDNILQILNFSRFYEKYNQILSEHPELVRP PAPAAAETQNAPEANPFSDPSLFFLLNSLMNNSDGNQNEKLKQVMALAQNSDNKDMEKIL STLLNSGL >gi|222441768|gb|ACEP01000174.1| GENE 19 22185 - 22334 110 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNENLIVEDNTIYEIDPECLKKKRKCQGETGEKVETENTKKQKMQRRNR >gi|222441768|gb|ACEP01000174.1| GENE 20 22484 - 22720 199 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDLAATNCQNSCSSGCGCGNDFGFNNCFWLILILLCCNNGNDCGCGNDNLIWLILILCC CGNGCGIGTNGNGCGCGC >gi|222441768|gb|ACEP01000174.1| GENE 21 23068 - 24366 1739 432 aa, chain + ## HITS:1 COG:Cgl0241 KEGG:ns NR:ns ## COG: Cgl0241 COG1167 # Protein_GI_number: 19551491 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Corynebacterium glutamicum # 5 427 7 424 426 397 46.0 1e-110 MKPYRELSKEELLALQAELQAEYDAEKEKSLNLDISRGKPGKEQLDLSMPMMDVLNAESV LASENGTDVRNYGVLDGIPEAKELMAGMVGAKPSQMIVLGNSSLNIMYDCVARAEIFGIM GNTPWSKLDKVKFLCPVPGYDRHFGVTEQFGIEMINVPMTEDGPDMDVVEDYVNNDPAVK GIWCVPMFSNPGGVVYSDETVKRFANLKPAAKDFRIFWDNAYGIHYLYDRPESEQPHILN ILEECEKAGNPDIVYEFCSTSKITFPGAGVAAMASSEANIADIKSKLKWQTIGPDKINQL RHVRYFKDINGMKDYMKKHAAIIRPKFEAVLETLEKELGELGIASWSKPVGGYFISFNAE KGCAKAIVAKCKEAGLVLTGAGAAYPYGNDPEDSNIRIAPTLPPIEELKVATDLFVTCTK 
LVTVEKFLSEME >gi|222441768|gb|ACEP01000174.1| GENE 22 24647 - 26167 1821 506 aa, chain + ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 134 455 64 379 450 196 38.0 6e-50 MRGVETRIQEIRHAVFTEVAKMAYEEGPVDKKIEALPYKIIPGETGNFRNDVFLERAIVG ERLRMAMGLPYRGAAEPAPVSDGIMEADKPEGYYTPPLINVIKFACNACDEKKVHVTDGC QGCLAHPCMEVCPKKAISLDRVTGKSIIDQDACIKCGRCATVCSYNAIIVQERPCAKACG MKAITSDENGKATIDYDKCVSCGMCLVNCPFGAISDKSQIYQVIKAIQSGEKVYAAVAPA FVGQFGPKVTPEKLRAAMKELGFADVLEVAIGADLCAKQEAEDFLKEVPEELPFMATSCC PAWSVMAKKLFPEYANCVSMALTPMTLTARLIKNHTPDAKVVFIGPCAAKKLEAMRRTVR SEVDFVLTFEEMSGIFAAKDLDLENMHEDADGVNDVSTDGRNFAVSGGVAQAVVDVIKKE YPDKEIKIANAEGLEECRKLMMLASKGKYNGYLLEGMACPGGCVAGAGTMQPIKKSQGAV KIYATKANHKVATETEYVKELDKLVD >gi|222441768|gb|ACEP01000174.1| GENE 23 26350 - 26880 413 176 aa, chain - ## HITS:1 COG:no KEGG:Closa_2357 NR:ns ## KEGG: Closa_2357 # Name: not_defined # Def: hypothetical protein # Organism: C.saccharolyticum # Pathway: not_defined # 1 173 1 171 174 145 44.0 7e-34 MSEESLTLYKLMILFMLDNLDFPLTNSQLSEFFVNHGYTSYFHLQQAINELVESEFVRAE TIRNTTNYHLISSGKEALSMFHTQISEPIKNDILDYFAEKKYQLRKEVDITADYYPLKKK RGEYMVKTQIKEKGSLLLELNINVVSKEQAIAICDSWEKKSDAVYGKLMEMLLLDE >gi|222441768|gb|ACEP01000174.1| GENE 24 26862 - 27782 768 306 aa, chain - ## HITS:1 COG:FN1142 KEGG:ns NR:ns ## COG: FN1142 COG1242 # Protein_GI_number: 19704477 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Fusobacterium nucleatum # 1 298 2 299 304 288 46.0 9e-78 MERYYSYSRFIKEYFGEKLYKICIDGGFTCPNRDGTLATGGCIFCSEGGSGEFTENASLS VSEQIHLGKSQTSRKYQGEHYIAYFQAFTNTYAPVDRLRALYEEAITAPEIVALSIGTRP DCITPEILDLLEELNKRKPVFIEMGLQTSHDSTASFLNRAYPTETFTKTCHALAEAGIRV TAHIILGLPGETLDMELDTIRYLNTLPVSGIKISMLYVLKNTVLASYYEQHPFHILTMEE YINHLIICLSQLRKDIVIERMTGDGPKDLLIVPKWIGHKRPVLNTIHKEMKQRNFTQGGS LCQKNL >gi|222441768|gb|ACEP01000174.1| 
GENE 25 27788 - 28939 1265 383 aa, chain - ## HITS:1 COG:SA1524 KEGG:ns NR:ns ## COG: SA1524 COG0281 # Protein_GI_number: 15927279 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Staphylococcus aureus N315 # 1 380 1 381 409 424 55.0 1e-118 MTNKEKALEMHKQWNGKLSVEAKCQVASAEDLAIAYTPGVAEPCRAIAKNPKEAYTYTIK SNTVAVVSDGSAVLGLGNIGALAAMPVMEGKAALFKEFGGVNAFPICLDTQDTEEIIETV VRIAPAFGGINLEDIAAPRCFEIEERLDKMLDIPVFHDDQHGTAIVVLAGIINALKVVGK QKEDCKVVVNGAGSAGIAIAKMLLSYGFKDLTLCDRQGILCSGDPSLNWMQEKMTQITNL SHKTGTLADAMKGADIFVGVSAPGIISQDMVRSMAKDSILFTMANPDPEILPHLAKEAGA KVVGTGRSDFPNQVNNVLVFPGIFKGALEGHSTHITSEMKLAAAHAIAGLVPAEELSDTN ILPHAFDPQIADVVSAAVKEYID >gi|222441768|gb|ACEP01000174.1| GENE 26 29423 - 30994 1518 523 aa, chain + ## HITS:1 COG:CAC2889 KEGG:ns NR:ns ## COG: CAC2889 COG1158 # Protein_GI_number: 15896143 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 2 506 8 468 483 418 50.0 1e-116 MNFKEMTLLQLKAYAKEKGIRGVSTLKKNQLIERIIASENKSIQTKAEESIKSEKNTTSE NGIKTEKNAKSENGIKTEGEPKAEKRVKTEENSKSEDNIRKTQQKPQESEKKEKVYHHND RENGNSVNSANDENAGKENKKSELSSLDSGNMANGILEVMPDGYGFIRCENYLPGENDVY VSPSQIRRFNLKTGDILVGNIRIKTQNEKYSALLYVQSVNGYKPFDAAKRKNFEDLTPIF PNERISLETENAPLSMRMVDLLSPIGKGQRGMIVSQPKTGKTTLLKQIARSITATRPNMK VIVLLIDERPEEVTDIRESIEGPNAEVIYSTFDELPEHHKRVSEMVLERAKRLVEHKQDV VILLDSITRLARAYNLLVPPSGRTLSGGLDPAALYMPKKFFGAARNMREGGSLTILATAL VETGSKMDDVVFEEFKGTGNMELVLDRKLAEKRIFPAINIQRSGTRREDLLLSKEEQEIV YALHREMSGNRAEENMEQILNFFKRTKNNKEFIQVMKHSLLKK >gi|222441768|gb|ACEP01000174.1| GENE 27 31301 - 31771 519 156 aa, chain + ## HITS:1 COG:PH1658 KEGG:ns NR:ns ## COG: PH1658 COG2426 # Protein_GI_number: 14591427 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 5 154 9 159 159 61 31.0 8e-10 MIKHYLTVFLISMVPIAELRAAIPYSQAVGLPLIPSYVIAIIGNMLPVPVIYLFARKVLE IGSNMPSIIGKFCRWCTAKGEKGGEKLQATAGRGLFVALLLFVGIPLPGTGAWTGTLAAS FLKMNFKSSVLAAMCGVLLAGVIMMTLSAGFFSLIQ 
>gi|222441768|gb|ACEP01000174.1| GENE 28 31792 - 32193 174 133 aa, chain + ## HITS:1 COG:no KEGG:Ccel_3106 NR:ns ## KEGG: Ccel_3106 # Name: not_defined # Def: Zn-finger containing protein # Organism: C.cellulolyticum # Pathway: not_defined # 43 133 40 130 130 97 48.0 2e-19 MKRLGSFFDEIYGFDGMSSFLLFISVILNLVTGMWPSQAVNNFNLISYLPLLGCIFRVFS RNHERRAHENECFLKLARPLCENIIEKREEQAEAKLFRFFKCPVCKQKLRVPKGKGKVEI TCPKCGNKFIKKA Prediction of potential genes in microbial genomes Time: Fri Jul 8 08:38:00 2011 Seq name: gi|222441767|gb|ACEP01000175.1| Eubacterium hallii DSM 3353 E_hallii-1.0_Cont496.1, whole genome shotgun sequence Length of sequence - 3727 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 107 - 1573 817 ## COG3666 Transposase and inactivated derivatives - Prom 1595 - 1654 5.3 - Term 2075 - 2123 3.0 2 2 Op 1 34/0.000 - CDS 2198 - 2920 584 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 3 2 Op 2 . 
- CDS 2907 - 3638 865 ## COG0765 ABC-type amino acid transport system, permease component Predicted protein(s) >gi|222441767|gb|ACEP01000175.1| GENE 1 107 - 1573 817 488 aa, chain - ## HITS:1 COG:MA3799 KEGG:ns NR:ns ## COG: MA3799 COG3666 # Protein_GI_number: 20092595 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 239 478 4 234 243 105 30.0 3e-22 MLSSQLKLNLSYYSDLYNMIVPKDHILRKIRELVDFSFIYDELKNKYCLDNGRNAKNPIL LFKYLLLKYLYNLSDNGVVERSRYDMSFKYFLELTPEEEVIHPSLLTKFRKQRLKDENIL DLLINKSVELAINQGVLKSNSIIVDSTHTEARYHKTTQREMLKKASTSLKAALKDSDKDI YLPDEPEKCASLDEYNEYCHELLNTVQESPYSELPTIKEDSSMLSEILDNQIDCYDQSID PDARIGHKTVNSSFYGYKTHLAMTDERIITAATITSGEAFDGQELPVLVQKSKVAGVTVN EVIADTAYSTLENLKDAQSNDYKLISKLNPCIIKGTRSEDGFIFNKDADIMQCPAGHLAI KYRIDKRSNQKKNARIKYFFDINKCHVCPYRNGCYKENAKTKTYSITIKSDYHKNQEAFQ KTQYFKERFKTRYMIEAKNSELKNIHGYSRCDAAGLSNMQLQGAVSIFVVNLKRILKLKG EVCLNEEK >gi|222441767|gb|ACEP01000175.1| GENE 2 2198 - 2920 584 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 229 48 3e-60 MISVKNLCKSFGDHQVLKDINEHIAPGEKIVIIGPSGSGKSTFLRCMNLLERPTSGQIIF DGIDITDPKTDINKVRQHMGMVFQHFNLFPHKTIMENITLAPVRLKLMKSEEAKEEALRL LKLVNLEEKADAYPGQLSGGQKQRIAIVRSLAMKPKMMLFDEPTSALDPEMVGEVLEVMK NLADQGMTMAVVTHEMGFAKEVGTRVMFMDEGRILEQGTPEDIFDNPKEARTQEFLSKVL >gi|222441767|gb|ACEP01000175.1| GENE 3 2907 - 3638 865 243 aa, chain - ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 16 241 11 236 236 159 46.0 6e-39 MSEWFSGIKAKFILDFIADQRWKYITDGLKVTLEVTFLALILGLVLGAVIAVIRTTHDQL KEDQSKGFGSFLIKLADLICRVYLTVIRGTPTMVQLLIMFFIVLASSNNKVMVAVITFGV NSGAYVAEIFRSGIMSVDKGQMEAGRSLGLGYTDTMLQIIMPQAIKNCLPALVNEMITLL 
KETSICGYIGLNELTRGGDIIRGITYDAMMPLLIVAVIYLVIVMFFTWLMGKLERRLRES DIR