Prediction of potential genes in microbial genomes
Time: Tue May 24 20:48:27 2011
Seq name: gi|330405847|gb|ADLB01000001.1| Lachnospiraceae bacterium 2_1_46FAA cont1.1, whole genome shotgun sequence
Length of sequence - 50694 bp
Number of predicted genes - 50, with homology - 43
Number of transcription units - 21, operones - 11 average op.length - 3.6

    N     Tu/Op     Conserved S         Start       End  Score
                    pairs(N/Pv)
    1     1 Op  1   .         -  CDS       36 -     134    162  ##
    2     1 Op  2   .         -  CDS      172 -    1005    406  ## COG1237 Metal-dependent hydrolases of the beta-lactamase superfamily II
                              -  Prom    1219 -    1278    4.5
                              +  Prom    1620 -    1679    6.4
    3     2 Op  1   36/0.000  +  CDS     1715 -    2485    772  ## COG1136 ABC-type antimicrobial peptide transport system, ATPase component
    4     2 Op  2   5/0.000   +  CDS     2475 -    4409   1194  ## COG0577 ABC-type antimicrobial peptide transport system, permease component
    5     2 Op  3   40/0.000  +  CDS     4421 -    5080    534  ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
    6     2 Op  4   .         +  CDS     5077 -    6081    743  ## COG0642 Signal transduction histidine kinase
                              +  Prom    6113 -    6172    3.2
    7     3 Tu  1   .         +  CDS     6200 -    7603   1834  ## COG0531 Amino acid transporters
                              +  Term    7623 -    7661    5.2
                              +  Prom    7625 -    7684    3.7
    8     4 Op  1   .         +  CDS     7704 -    9545   1866  ## COG0322 Nuclease subunit of the excinuclease complex
    9     4 Op  2   .         +  CDS     9589 -   10524   1024  ## COG1493 Serine kinase of the HPr protein, regulates carbohydrate metabolism
   10     4 Op  3   .         +  CDS    10535 -   11467    622  ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase
                              +  Term   11479 -   11521   11.2
                              -  Term   11465 -   11507   11.2
   11     5 Tu  1   .         -  CDS    11517 -   14078   2209  ## COG0744 Membrane carboxypeptidase (penicillin-binding protein)
                              -  Prom   14098 -   14157    8.3
                              +  Prom   14057 -   14116    7.5
   12     6 Op  1   .         +  CDS    14261 -   14917    636  ## COG1739 Uncharacterized conserved protein
   13     6 Op  2   1/0.000   +  CDS    14931 -   15314    384  ## COG0346 Lactoylglutathione lyase and related lyases
   14     6 Op  3   .         +  CDS    15311 -   15901    519  ## COG0110 Acetyltransferase (isoleucine patch superfamily)
                              +  Term   16074 -   16104   -0.9
                              -  Term   15874 -   15914    8.1
   15     7 Tu  1   .         -  CDS    15930 -   16466    482  ## PROTEIN SUPPORTED gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P
                              -  Prom   16515 -   16574    8.9
                              +  Prom   16303 -   16362    6.2
   16     8 Tu  1   .         +  CDS    16514 -   16627     64  ##
                              -  Term   16570 -   16613    7.1
   17     9 Op  1   .         -  CDS    16616 -   17227    690  ## gbs1857 hypothetical protein
   18     9 Op  2   .         -  CDS    17263 -   18171    353  ## COG0671 Membrane-associated phospholipid phosphatase
   19     9 Op  3   .         -  CDS    18171 -   19994   2081  ## COG1217 Predicted membrane GTPase involved in stress response
                              -  Prom   20222 -   20281    8.7
                              +  Prom   20055 -   20114    3.3
   20    10 Op  1   .         +  CDS    20232 -   20861    729  ## COG0629 Single-stranded DNA-binding protein
   21    10 Op  2   6/0.000   +  CDS    20865 -   21782    911  ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase
   22    10 Op  3   .         +  CDS    21770 -   22462    492  ## COG0289 Dihydrodipicolinate reductase
   23    11 Tu  1   .         +  CDS    22575 -   22865    408  ##
                              +  Term   22873 -   22919    1.0
                              +  Prom   22920 -   22979    6.2
   24    12 Op  1   .         +  CDS    23038 -   25605   3335  ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase)
   25    12 Op  2   .         +  CDS    25625 -   25702     91  ##
   26    12 Op  3   .         +  CDS    25794 -   26738    833  ## COG1186 Protein chain release factor B
   27    12 Op  4   .         +  CDS    26762 -   27199    461  ## COG4492 ACT domain-containing protein
   28    12 Op  5   .         +  CDS    27205 -   28422   1280  ## COG0460 Homoserine dehydrogenase
   29    12 Op  6   .         +  CDS    28465 -   29676   1299  ## COG0527 Aspartokinases
                              +  Term   29681 -   29721    8.2
                              -  Term   29565 -   29595   -0.3
   30    13 Tu  1   .         -  CDS    29693 -   30148    275  ## COG2954 Uncharacterized protein conserved in bacteria
                              -  Prom   30186 -   30245    5.8
                              +  Prom   30205 -   30264    8.8
   31    14 Op  1   .         +  CDS    30288 -   30839    716  ## COG0693 Putative intracellular protease/amidase
   32    14 Op  2   .         +  CDS    30850 -   31515    551  ## gi|210615406|ref|ZP_03290533.1| hypothetical protein CLONEX_02749
   33    14 Op  3   29/0.000  +  CDS    31604 -   32890   1869  ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor)
   34    14 Op  4   24/0.000  +  CDS    32922 -   33503    792  ## COG0740 Protease subunit of ATP-dependent Clp proteases
                              +  Term   33514 -   33548    4.4
   35    14 Op  5   18/0.000  +  CDS    33563 -   34810    266  ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16
   36    14 Op  6   4/0.000   +  CDS    34881 -   37202   2350  ## COG0466 ATP-dependent Lon protease, bacterial type
   37    14 Op  7   .         +  CDS    37211 -   37816    689  ## COG0218 Predicted GTPase
   38    14 Op  8   .         +  CDS    37800 -   38471    592  ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases
                              +  Term   38543 -   38591   -0.5
   39    15 Tu  1   .         -  CDS    38620 -   38928    168  ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein
                              -  Prom   39021 -   39080    4.6
                              +  Prom   38917 -   38976    7.5
   40    16 Op  1   .         +  CDS    39075 -   40424   1389  ## COG0534 Na+-driven multidrug efflux pump
   41    16 Op  2   .         +  CDS    40437 -   40949    620  ## Cbei_2688 GCN5-related N-acetyltransferase
                              +  Term   40988 -   41021   -0.2
                              -  Term   41013 -   41052    5.9
   42    17 Tu  1   .         -  CDS    41066 -   41920    591  ## CLH_2455 hypothetical protein
                              -  Prom   41952 -   42011    3.8
   43    18 Tu  1   .         -  CDS    42036 -   43385   1391  ## COG0534 Na+-driven multidrug efflux pump
                              -  Prom   43489 -   43548    5.9
                              +  Prom   43447 -   43506    8.2
   44    19 Op  1   .         +  CDS    43536 -   44933   1245  ## EUBREC_0404 hypothetical protein
   45    19 Op  2   .         +  CDS    44934 -   45572    701  ## EUBREC_0405 hypothetical protein
   46    19 Op  3   .         +  CDS    45529 -   48912   2738  ## EUBREC_0406 hypothetical protein
   47    19 Op  4   .         +  CDS    48890 -   50083    865  ## EUBREC_2907 hypothetical protein
                              +  Term   50163 -   50199    3.1
                              +  Prom   50086 -   50145    4.3
   48    20 Tu  1   .         +  CDS    50294 -   50389    164  ##
                              +  Term   50570 -   50611    5.2
   49    21 Op  1   .         -  CDS    50521 -   50604     75  ##
   50    21 Op  2   .         -  CDS    50604 -   50693     62  ##

Predicted protein(s)
>gi|330405847|gb|ADLB01000001.1| GENE 1 36 - 134 162 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no
MAKEKGSFKEFLSAVAPEYQTSLCGLCRGYCA
>gi|330405847|gb|ADLB01000001.1| GENE 2 172 - 1005 406 277 aa, chain - ## HITS:1 COG:MTH1101 KEGG:ns NR:ns ## COG: MTH1101 COG1237 # Protein_GI_number: 15679112 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily II # Organism: Methanothermobacter thermautotrophicus # 14 277 1 259 260 124 31.0 3e-28
MYIKINWNCKGDFMRITSLVENTTKSELKTKHGLSLYIETKSHKILFDLGSDSTLFENAM KRNIDISKVDTVIISHGHFDHGGALKKFLDINSVANIYVQKEAFEPHYSKTLFLKIPAGI DSKLKSNRQIKLLNGNYKIDEELSLFTVSKTNKFYSPANNALYDKKGKDTFVHEQNLIIS ENKVVVIMGCGHAGVVNIMEEAEKYQPHFCIGGFHLFNPLTKKTVSKELLNSIAMELQKY KDIEFYTCHCTGEKAYHYLSQQMDNMNYMSCGDVIEI
>gi|330405847|gb|ADLB01000001.1| GENE 3 1715 - 2485 772 256 aa, chain + ## HITS:1 COG:lin2219 KEGG:ns NR:ns ## COG: lin2219 COG1136 # Protein_GI_number: 16801284 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, ATPase component # Organism: Listeria innocua # 1 255 1 254 255 278 54.0 1e-74
MEHILNLEQVKKYYGGNENNITKAVDGISMYVDKGEFVAIMGASGSGKTTLLNCISTIDT VTSGHILVNGKDITAIKEKDFANFRRENLGFIFQDFNLLDTLTIKENIALSLIINKRNPQ KIEEEVEMIARKLDIADILSKFPYEVSGGQKQRCACARALINKPNLILADEPTGALDSKS SRLLLETMSNMNEKLQATILMVTHDPFSASFCKRILFLKDGKIFNEIQKGEKSRKEFFNE ILDILTLLGGEMGDVK
>gi|330405847|gb|ADLB01000001.1| GENE 4 2475 - 4409 1194 644 aa, chain + ## HITS:1 COG:CAC0227 KEGG:ns NR:ns ## COG: CAC0227 COG0577 # Protein_GI_number: 15893519 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 2 644 3 634 634 130 22.0 9e-30
MLSKLAFRNAKRSLKDYLIYLITITMSFSLMFAFNLVSNSEAVIKLSTGMDTFKNILLFV SAIIIFVVCFLINYTTKFMFEKRSKEFGTYMLLGIKKKEITKMFVLENILLGCLALALSI PIGFVFSQFISLIIVRILNVPGVVFISFGWKATGLLAIYFLAIYILVLLNMWRKMRKMSV
HGFLYLEKQNEKKMFQNKKNRNRIFVISVLIGIISVFIWHSQFNMKNFSNQETINYLFVC TIGFIISIYGISATVSDMILSVVLRSKGMKYSNDNLFVTRTFASKARTMSFVLGTFSVLI LVALLCFNMSGITKGMYDSILNQDAPYDVAVMDDRNFHDEYLAVIDEDYTIEDSFAYDIY WDKKCQVQKYVGQFHDSDCIVKLSDYNRLMELRDEDTVELKDNEYMIVAAASTEARLKDI QEIKKVQMSGGKTLKLKEITTDNFWVFMNGEADYTMIFPDEYVRNLEAGESHLVVNTKEG TNAKLEDKIQEKMKRHLGDKTYKVTSRGEVIEEQNTMTAMLSSICLYMAFIFIAVVGTIL AVQSLSDSTKYKYRYLTLRRLGVNDKNLYKTIRKQLLVLFGLPLIYPVVLCFFLVKSINN IYYIFLESQSTYILYYFSGLLVFLCIYAVYWIVTYIGFKRNINE >gi|330405847|gb|ADLB01000001.1| GENE 5 4421 - 5080 534 219 aa, chain + ## HITS:1 COG:CAC1516 KEGG:ns NR:ns ## COG: CAC1516 COG0745 # Protein_GI_number: 15894794 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 219 4 225 229 150 36.0 2e-36 MKIGIIEDDEITRRELSKLLQTKGYETLLFTDFENIPKNLGRCVLDLVLLDVNLPYENGY EICKKVKAVQQTPIIFVTSRDTTADELKSIQSGGIDFIPKPYDTLILLEKIKRALQLSDP NNFRQITKKECTLDLHLSVLKYKDETIELTRNEFRILYYFFMNEEKVIGKEELLEKLWND KYYLDENVLLVNMTRLKKKLKEIGISHLLENIRGKGWKL >gi|330405847|gb|ADLB01000001.1| GENE 6 5077 - 6081 743 334 aa, chain + ## HITS:1 COG:CAC0225 KEGG:ns NR:ns ## COG: CAC0225 COG0642 # Protein_GI_number: 15893517 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 326 1 329 339 155 33.0 9e-38 MSFITFLEEKILYILFQLFFLVLVSFLLFLSGVDGLYIFLMILIYAVFQIGFLWLTYRNK CRKNQRIVEMTDNLQEAYYISEILQKPKDLENQAYYYALKRACKAMNDKISALTEEQVEY QEYVESFAHEIKIPIGALSLTFDNARDYSLKKETDKIFQLVEQMLYYARSENTEKDYFVK ELSLDEMMHHVLLKFRYVLLEEKVTLDIHDMENTVYTDEKWLQFILSQILQNAVKYMNKE EKRLVIYSVSHDGNISLVMEDNGCGIKQSELSRVFEKGFTGSNRKKTNATGMGLYLAKKL SDRLGLRLQIDSEEGKYTRVTIFFPKGSIHSFSE >gi|330405847|gb|ADLB01000001.1| GENE 7 6200 - 7603 1834 467 aa, chain + ## HITS:1 COG:STM0969 KEGG:ns NR:ns ## COG: STM0969 COG0531 # Protein_GI_number: 16764329 # Func_class: E Amino acid transport and metabolism # 
Function: Amino acid transporters # Organism: Salmonella typhimurium LT2 # 1 465 1 472 473 483 52.0 1e-136 MNQKNEKKILWYSLAFMAFGGVWGFGNVINGFSEYNGLQAIVAWIIVFGIYFVPYALMVG ELGSAFKNANGGVSSWISATMGKRLAYYAGWTYWVVHMPYISQKPNSAVIATSWIIFQDN RASEMNTTVMQILCLAIFFLALFLASKGVSMLKKISTLAGTTMFIMSILYIVLMMAAPAI TNAPLLEIDWSVKTFMPTFDMKFFTGLSILVFAVGGCEKFSPYVNKTKNPAKDFSKGMIA LAVMVAVCAILGTISMGMMFDSNNIPEDLMTNGAYYAFQKLGDYYKVGNLFVILYAITQF LGQFSVMVISIDAPLCMLLGNADENYIPKSMFKQNKHGAYTNGHKLVAVIVSILIIIPAF GIKNVDVLVKWLVKLNSVCMPLRYLWVFVAYIALKKAGEKFAPEYKFVKNKTVGIIFGAW CFIFTAIACIGGIYSEDLFQLTLNILTPFVLLGLGAIMPILAKKSKK >gi|330405847|gb|ADLB01000001.1| GENE 8 7704 - 9545 1866 613 aa, chain + ## HITS:1 COG:CAC0508 KEGG:ns NR:ns ## COG: CAC0508 COG0322 # Protein_GI_number: 15893799 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Clostridium acetobutylicum # 1 611 1 617 623 606 49.0 1e-173 MFNIEEELKKLPGRPGVYLMHDEKDDIIYVGKAISLKNRVRQYFQSSRNKGAKIEQMVTH ISRFEYIVTDSELEALVLECNLIKEHRPKYNTMLMDDKTYPFIKVTVEEDYPRVLFAREI KKDKAKYFGPYTSAGAVKDTIDLIRKLYHIRSCNRNLPKDIGKERPCLNYHIKQCYAPCQ GYISKEEYRKSVQQALRFLGGEYDKILKTLEEKMEKASEELDFEKAIEYRELLNSVKQIA QKQKITNSDGEDKDILAVATEENDAVVQVFFIRGGRLIGRDHFYMKINEGEQKNEILENF IKQFYAGTPYIPRELMVQEELKESRLLEEWLTQKRGHKVYITVHQKGQKKKLVDLAEKNA KLVLSTDKERLKREEGRTIGAVKEIEKLLGLSNLVRMEAFDISNTNGFNSVGSMIVYEKG KPKRNDYRKFKIKGVQGADDYASMEEVLTRRFEHGIKEKEEGKELGSFTSFPDLIMMDGG KGQVNVALQVLERLHLNIAVCGMVKDDNHRTRGLYYNNVEIPIDRNSEAFRLITRIQDEA HRFAIEFHRKLRGQGQVHSILDDIKGIGPARRKDLMKHFMSLDAIKEATVEELKNLPSMN EKSAQEVYNFFHK >gi|330405847|gb|ADLB01000001.1| GENE 9 9589 - 10524 1024 311 aa, chain + ## HITS:1 COG:lin2626 KEGG:ns NR:ns ## COG: lin2626 COG1493 # Protein_GI_number: 16801688 # Func_class: T Signal transduction mechanisms # Function: Serine kinase of the HPr protein, regulates carbohydrate metabolism # Organism: Listeria innocua # 6 299 4 296 312 255 42.0 7e-68 MSCAGSVGLEKLVEKMELKNLTPDIDMTERVITVPDINRPALQLTGFFDHFDSERIQVIG 
YVEYTYLEKMPEERKEVIYEQLLSYDIPCLIYSTKVYPDELMLKKANEKNIPVFATDKKT SAFMAEAIRWLNVELAPCISIHGVLVDVYGVGVLIMGESGIGKSEAALELIKRGHRLVTD DVVEIRKVSDETLVGTAPDITRHLIELRGIGIVDVKMLFGVQSVKETQNIDLVITLEDWN KEKEYDRLGLEEEYTEFLGNKVVCHSIPIRPGRNLAIIVESASVNHRQKQMGYNAAQELY RRVQENLTQKK >gi|330405847|gb|ADLB01000001.1| GENE 10 10535 - 11467 622 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 6 310 7 319 319 244 41 9e-64 MRYCFGVDIGGTTVKMGVFHFDGTLIEKWEIETRKENHGEMILPDVADSIREKMENHNLD KEAVLGVGVGIPAPVTEAGVVQATANLGWGYKEVKHELETLVGIPVKVGNDANVAALGEM WKGGGFGHKNMVMVTLGTGVGGGIISNGHMVVGGHGAGGEIGHICVNYEETEKCGCGNRG CLEQYASATGIVRLAKKRLAENDDETVLRNEEVSAKTVFDAVKENDAVAIEIAKEFGKYL GYALANLAAAVDPEIVVIGGGVSKAGTILLDYIIESFMERVFFANKECRFELARLGNDAG IYGAAKLILE >gi|330405847|gb|ADLB01000001.1| GENE 11 11517 - 14078 2209 853 aa, chain - ## HITS:1 COG:CAC2301 KEGG:ns NR:ns ## COG: CAC2301 COG0744 # Protein_GI_number: 15895568 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Clostridium acetobutylicum # 33 743 34 683 809 302 32.0 2e-81 MNYGKKNASKKQREITSKANMQKKRIFSRLFKSVILCIVIIGIVGIIGAGFFVKKTIDNA PNISPANVKPTGYTTFVVDKDGNEIEKFVSSGSNRVYKTLDEIPVHLRQAFVAIEDERFY QHKGIDLKGIVRAGVVGLTSGHFSEGASTITQQLIKNTVFPDFVNENGFFESLERKIQEQ YLALEIEKQMDKDLILENYMNTINLGQNTLGVQAASKRYFNKDVSELTLSECATIAGITQ NPSTFNPVTNPEKNKERRKKVLTKMLEQECISQAEYDEALADDVYARIQNVNTQLAGSDV DSYFIDALADQVIEDLQRTDPGYPGYSETQAINALYSGGLTIYATQDAEIQKIADEEMND DSNYPSSVYTGVEYALTVTRKDGTQENYSSGHLKKFGKEKYNDKQGLLFTSEEKAKARIE AFKASIAQEGDKYDEKILFSPQPQASAVIMDQYTGEVKALVGGRGKKQSSRSLNRATGSL RQPGSAFKIVAVYAPALDSAGQSLGTVTKDEPFERNGKSYKNAYNGFKGNVTMRTAIQDS VNISALKTFEEITPQLGFEYSEKLGISTLVKSKKINGKVYSDLQLPTALGGITDGVYNLE MTGAYAAIANGGVYNTPIFYTKILDHEGNVLIDNTSNSRTALRKSTAYLLTSAMEDVVKQ GTGKQAALTNMPVAGKTGTTTSRVDIWFCGFTPYYTCSVWVGFDDNKELPSGNYNQKLWK GIMSRIHKNLPYKDFEVPDTVAKKKVCSISGKVASSSCPYHMEYIALDGGSDKVCSSHSG YPGSVGGITSSEDENTNQDENGDFSDGNDTTATQPENSGTDNGNGSDSTDNSNNSGDGNS 
DSNQEQPVPPTVP >gi|330405847|gb|ADLB01000001.1| GENE 12 14261 - 14917 636 218 aa, chain + ## HITS:1 COG:BS_yvyE KEGG:ns NR:ns ## COG: BS_yvyE COG1739 # Protein_GI_number: 16080604 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 208 1 208 217 170 44.0 2e-42 MLAEYKTVYTGGEAEIVEKKSRFIATVQPIKTEEEALEFIEKVRKQHWSATHNCFAYTAG ERFEVQRCSDDGEPNGTAGKPMLDVLLGEKIHNIAVVVTRYFGGTLLGTGGLVRAYSKAT QEGISASKIITKMYGCKLEITTDYTGLGKIQYILGQRGITVLQSTYTEKVVLEVLVPKEE EEVLTGEIVEGTNGQALIEKGEECYFAKIDGEMVIFQE >gi|330405847|gb|ADLB01000001.1| GENE 13 14931 - 15314 384 127 aa, chain + ## HITS:1 COG:FN1050 KEGG:ns NR:ns ## COG: FN1050 COG0346 # Protein_GI_number: 19704385 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Fusobacterium nucleatum # 1 127 1 127 127 183 72.0 5e-47 MEIEHIAMYVKDLERTREFFEKYFGAVSNKKYHNIKTDFQSYFLSFDDGARLEIMHRPSM DDAEKTFARTGFIHLAFRVGSREKVNELTEKLRKDGFEVMSGPRTTGDGYYESCIIGIEN NQIEITL >gi|330405847|gb|ADLB01000001.1| GENE 14 15311 - 15901 519 196 aa, chain + ## HITS:1 COG:MA0513 KEGG:ns NR:ns ## COG: MA0513 COG0110 # Protein_GI_number: 20089402 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanosarcina acetivorans str.C2A # 1 190 9 198 199 258 63.0 6e-69 MTEKEKMKRQMLYDANNDTELLQERKVAKDLCYEYNQLRPSDEENQMKIIKSLLGKTKEK FCIIAPFWCDYGYNIEIGENFFANHNTVILDGGKVVFGDNVFVGPNCGFYTAGHPIDFEK RNTGLEYARPITVGDNVWIGAGVHVMPGVRIGNNVVIGGGSVVVKDIPDNSMAVGNPCRV IREITEEDSKEKSLRI >gi|330405847|gb|ADLB01000001.1| GENE 15 15930 - 16466 482 178 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|28210085|ref|NP_781029.1| SSU ribosomal protein S30P [Clostridium tetani E88] # 1 177 1 176 176 190 55 2e-47 MKFIISGKNIEVTPGLKETIEHKLGKLERYFTPDTEIIVTLSVEKERQKIEVTIPVKKDI IRSEQVSSDMYVSIDLVEEVIERQLRKYKNKLVARHQAGGNFNQAFFESEDDTADTNEIK IVRTKRFGIKPMFPEDACMQMDLLGHNFFVFSNAETEQVNVVYKRKDGSFGLIEPEFQ >gi|330405847|gb|ADLB01000001.1| GENE 
16 16514 - 16627 64 37 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDILYIKTEKNATKRKIKILNKNKRLTIKSQSFRLFD >gi|330405847|gb|ADLB01000001.1| GENE 17 16616 - 17227 690 203 aa, chain - ## HITS:1 COG:no KEGG:gbs1857 NR:ns ## KEGG: gbs1857 # Name: not_defined # Def: hypothetical protein # Organism: S.agalactiae_NEM316 # Pathway: not_defined # 1 202 11 214 215 146 39.0 4e-34 MTKYLSAILKKLATAMEIFIAFLLAVGIILLSARLTLDLIHIPNLEVSPNYEDLLENCFN LIIGVELIRMVYTHRPGTVFEVLLFAIARQVIIEHGSPLNSLIGVVAIAVLFATRKFLFC EFDEAEKILFRATQKVFLVNRLIHVNIPFEEGDTLRDVIVRQLDKHQEDVGVGACAYFSN LGLRVAKMHKGEISRVEVIRSIK >gi|330405847|gb|ADLB01000001.1| GENE 18 17263 - 18171 353 302 aa, chain - ## HITS:1 COG:CAC0605 KEGG:ns NR:ns ## COG: CAC0605 COG0671 # Protein_GI_number: 15893894 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 4 281 15 274 290 107 29.0 2e-23 MDFLRLLEGIRSPLFNKIFQFVTYFGQELLIIAFICTLYWCVNKRFAYILGFTYFTAGLC VQTLKITFRIPRPWVIDPNFHPVESAVPAATGYSFPSGHTQGATSLFAPIAFHTKKRILS FLCVCAFLLVGFSRMYLGVHTPKDVLVSMGISLFFAWLITHFSHILLDMECYRKRIVFCL ICLAALISIYTLFLLHTDTIDYKYAADCLKACGAGIGFAVGWYIEQTYLQFDVKTKHIYT QVLKVIIGLLIALLIKSFLPFLIGESIFSKMAEYFILVLWILVLYPYCFNRVNKFSLQKS ID >gi|330405847|gb|ADLB01000001.1| GENE 19 18171 - 19994 2081 607 aa, chain - ## HITS:1 COG:CAC1684 KEGG:ns NR:ns ## COG: CAC1684 COG1217 # Protein_GI_number: 15894961 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Clostridium acetobutylicum # 2 607 3 605 605 724 62.0 0 MKTKREDIRNIAIIAHVDHGKTTLVDELLKQSGVFRENQEVAERVMDSNDIERERGITIL AKNTAVYYKDTKINIIDTPGHADFGGEVERVLKMVNGVILVVDAFEGAMPQTKFVLRKAL ELQLPVIVCINKIDRPEARPDEVIDEILELFMDLDASDEQLDCPFIYASAKAGHAILDLT DDPKNMVPLFETILDYIPAPEGDPDADTQVLISTIDYNEYVGRIGVGKVDNGTISVNQDM VVVNAHEPDKQKKVKISKLYEFDGLNKVEVKEAGIGSIVAISGISDIAIGDTLCSPENPV AIPFQKISEPTIAMQFIVNDSPFAGQEGKYVTSRHLRERLFRELNTDVSLRVEESENADS FRVSGRGELHLSVLIENMRREGYEFAVSKAEVLYKKDEKGKLLEPIETAYIDVPDEFTGT 
VIDKLSQRKGELQNMSASNGGYTRLEFSIPARGLIGYRGEFMTDTKGNGILNTAFEGYAP YKGDIQYRKQGSLIAFETGESVTYGLFSAQERGTLFIGPGEKVYSGMVIGQNGKAEDIEL NVCKTKHLTNTRSSGSDDALKLTTPRILSLEEALDFIDTDELLEVTPKTLRIRKKILDSK MRKRGAK >gi|330405847|gb|ADLB01000001.1| GENE 20 20232 - 20861 729 209 aa, chain + ## HITS:1 COG:CAC2382 KEGG:ns NR:ns ## COG: CAC2382 COG0629 # Protein_GI_number: 15895648 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 3 209 2 206 229 210 50.0 1e-54 MSDKVIENNQVTIMGEVVSEFTFSHEVFGEGFYMVEVLVRRLSNSDDRIPLMVSERLIDV TQDYRGEYIMVNGQFRSYNRHEEDRNRLVLSVFAREISFVEEELDGAKTNNILLDGYICK APVYRKTPLGREIADLLLAVNRPYGKSDYIPCICWGRNARYASSFEVGEHVQVLGRIQSR EYVKKLSEVETEKRVAYEVSVSKLECVGE >gi|330405847|gb|ADLB01000001.1| GENE 21 20865 - 21782 911 305 aa, chain + ## HITS:1 COG:CAC2378 KEGG:ns NR:ns ## COG: CAC2378 COG0329 # Protein_GI_number: 15895644 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Clostridium acetobutylicum # 3 294 2 291 293 293 53.0 3e-79 MKSVFIGSGVALVTPFRKDGSINEEMLEELVEYHCENQTDALIICGTTGESATLSEEERL FCIRRAVEFAKGRVPIIAGTGTNCTQTTVQLSKEAENCGADGVLVVTPYYNKPTQRGLIA HYGKIAENIRIPIILYNVGSRTGCNIEPETVAQLMKNEENIVAIKEASGNISQVARIMEL TEGKTDIYAGNDNQIVPVLSLGGKGVISVLANIAPKEVHKICENFLSGNIKESCALQLKA LPLIRQLFCEVNPIPIKKALSLMGKDVGGLRMPLLELEEEKERTLARIMCEYGIELSGEG NIWLT >gi|330405847|gb|ADLB01000001.1| GENE 22 21770 - 22462 492 230 aa, chain + ## HITS:1 COG:CAC2379 KEGG:ns NR:ns ## COG: CAC2379 COG0289 # Protein_GI_number: 15895645 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Clostridium acetobutylicum # 1 229 1 248 250 194 42.0 1e-49 MVDIILHGCNGKMGEIVTMLAGKDEEVQIVAGVDKCGSKERAYPVFDSLWECGVKADAVI DFSHSGGIEELLNVCEKRSLPLVLCTTGLSQEQQKKVEKSSGKFPILQSANMSLGITVLL GLLKTLGRELIPRGYDVEIIEKHHRYKKDAPSGTALALKKAVGEENIPIASVRCGKIVGE 
HEVIFAGEDEVIEIKHTAYSKTIFAKGALEAAKFLPGQKNGLYTIKDVFL >gi|330405847|gb|ADLB01000001.1| GENE 23 22575 - 22865 408 96 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQIKKTVVALLCVCMLFAVGCGTKENANNNGNTTTNTDKKNTDGDGVMEDVGRGIDNGV RDVVDGVEDTVDNVTGNNRNNTNNNTNNSADKTPTK >gi|330405847|gb|ADLB01000001.1| GENE 24 23038 - 25605 3335 855 aa, chain + ## HITS:1 COG:CAC2846 KEGG:ns NR:ns ## COG: CAC2846 COG0653 # Protein_GI_number: 15896101 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 855 1 838 839 946 59.0 0 MGLLEKIFGTHSEHELKRIYPIVDKIEALEPDMQKLSDEELRDKTKEFKERLAKGETLDD ILPEAFATVREAAVRTLHMRHYRVQLIGGIILHQGRISEMRTGEGKTLVSTLPAYLNALE GKGTHIVTVNDYLAKRDAEWMGQVHEFLGLTVGVVLNSMDNDERREAYNCDITYVTNNEL GFDYLRDNMVIYKEQLVQRGLHFAVIDEVDSVLIDEARTPLIISGQSGKSTKLYEACDIL ARQMERGEASGEFSKMNAIMGEEIEETGDFIVNEKEKNVSLTEDGVKKVEKFFHLENLAD PENLEIQHNVILALRAHNLMFRDQDYVVTPEGEVMIVDEFTGRIMPGRRYSDGLHQAIEA KEHVKVKRESKTLATITFQNLFNKYKKKSGMTGTALTEEKEFRDIYGMDVVEIPTNLPVQ RKDLDDAVYKTKEEKFQAVVDAVVEAHAKGQPVLVGTITIETSELLSRMLKKEGVPHNVL NAKFHEMEAEIVAQAGVHGAVTIATNMAGRGTDIKLDDDAKNAGGLKIIGTERHESRRID NQLRGRSGRQGDPGESRFYISLEDDLMRLFGSEKLMGVFNTLGVEDGEQIEHKMLSNAIE KAQKKIENNNFGIRKNLLEYDQVMNEQREIIYEERRRVLDGESMRDSIYHMITEYVENLV DACVSPDLDSDEWDLAELERSLLTTIPMTFVTPDEVKNMRQKELKHVLKERAVKAYEAKE AEFPEIEHLREVERVILLKVIDAKWMDHIDDMDQLRQGIGLQAYGQRDPLVEYKMLGYDM FGAMTNAIAEDTVRLLFHVRIEQKVEREQVAQVTGTNKDDTSVKEPKKREEKKVYPNDPC PCGSGKKYKQCCGRK >gi|330405847|gb|ADLB01000001.1| GENE 25 25625 - 25702 91 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVELDQFKSVMNSYEEPLNELRDSL >gi|330405847|gb|ADLB01000001.1| GENE 26 25794 - 26738 833 314 aa, chain + ## HITS:1 COG:BS_prfB KEGG:ns NR:ns ## COG: BS_prfB COG1186 # Protein_GI_number: 16080582 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Bacillus subtilis # 1 309 56 364 366 370 58.0 1e-102 
MRELKGLQSFLENAESLLTSYEDIAMLIEMVYEDNDESMVGEIRREIRQFEKNFEEIRIN TLLSGPYDKDNAIVTLHAGAGGTESCDWASMLYRMYSRWAERKGFSISVLDYLDGDITGI KTVTFEVNGENAYGYLQSEKGVHRLVRISPFNSAGKRQTSFASCDVIPDIEDEIDIEINE DDLKVDTYRASGAGGQHVNKTSSAIRITHLPTGIVVQCQNERSQHSNREKAMQMLKAKLY LIKQQEQADKVSDIRGEVKEIGFGNQIRSYVMQPYTLVKDHRTNVENGNVTAVLDGDIEP FINAYLKKRNEDEM >gi|330405847|gb|ADLB01000001.1| GENE 27 26762 - 27199 461 145 aa, chain + ## HITS:1 COG:BH1214 KEGG:ns NR:ns ## COG: BH1214 COG4492 # Protein_GI_number: 15613777 # Func_class: R General function prediction only # Function: ACT domain-containing protein # Organism: Bacillus halodurans # 5 143 6 144 147 105 39.0 4e-23 MDSNYYVIKKKAVPEVLLKVVEVKRLLAAERFMTIQEATEKIGLSRSSFYKYKDDIFPFH ENEKGQTITLAIQMDDMPGLLAEILQEIAKYKANILTIHQSIPLNRVATLTISVEILSTT GNISDMVNEIENNEGVHYLKIVGRE >gi|330405847|gb|ADLB01000001.1| GENE 28 27205 - 28422 1280 405 aa, chain + ## HITS:1 COG:BH3422 KEGG:ns NR:ns ## COG: BH3422 COG0460 # Protein_GI_number: 15615984 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Bacillus halodurans # 1 346 4 355 431 274 44.0 3e-73 MIKIAVLGYGTVGSGVVEVLEKNREIIAKRAGEEIQVKYVLDRKDFSGTPIENIIVNDFE QIVQDEEIKIVVEVLGGVEPAYIFVKRSLEAGKSVVTSNKELVAQCGSELLEIAMNNGVN FLFEASVGGGIPIIRSLHQALTGDEIEEITGILNGTTNYMMTKMFNEGAEYDEVLKEAQD KGYAERNPEADVEGHDACRKIAILSSVISGKCVDYKEIYTEGIAEITATDMKYAKKLGRT IKLLASAKRDGDKISAMVCPCLLNDSHPLYHVDGVFNAVFVRGNMLGDAMFYGSGAGKLP TASAVVGDVVDAAKNMGRTVMPVWTNEKISLTDKADTEKKFFVRIKGTPEEYAEKIEGVF GKTETIVLENVEGEFGFMTEKIKESVYEEKASQFSNILHMMRVEE >gi|330405847|gb|ADLB01000001.1| GENE 29 28465 - 29676 1299 403 aa, chain + ## HITS:1 COG:MT3812 KEGG:ns NR:ns ## COG: MT3812 COG0527 # Protein_GI_number: 15843329 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Mycobacterium tuberculosis CDC1551 # 2 402 3 408 421 347 48.0 2e-95 MLIVKKFGGSSVANKERIFNVAKRCIEDWQKGNDVVVVLSAMGDTTDELLAKATDINPRA TKRELDMLLTTGEQVSVSLMAMAMQAMGVPAISLNAFQISMQSTSKYGNARFKRIDTDRI 
THELENRKIVIVTGFQGVNKYDDYTTLGRGGSDTTAVALAAVLHADVCEIFTDVDGVYTA DPRVVKNARKLEVVTYDEMLELATLGAKVLHNRSVEMAKKYGVTLIVRSSLNNEEGTIVK EAVKKMEKMLITGVAGDKNTARVSAIGVKDEPGIAFSLFHTLAKNNINVEIILQSVGRNG TKDISFTVSQDDLQATIELLEEKKEVLTIQEIGYDENVAKISIVGAGMMSNPGVASRMFE ALYNSRVNIKMISTSEIRITVLIDEKDVDRALNAVHEEFNLGE >gi|330405847|gb|ADLB01000001.1| GENE 30 29693 - 30148 275 151 aa, chain - ## HITS:1 COG:mll4592 KEGG:ns NR:ns ## COG: mll4592 COG2954 # Protein_GI_number: 13473857 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Mesorhizobium loti # 2 148 4 151 157 66 35.0 2e-11 MEIEKKFLVSRLPENLSAYPHRTIEQGYLSTEPVIRIRKDGEKYELTYKSKGLMIREEYN LPLTQSAYEHLKPKIDGHLITKVRYRIPYNERLTIELDCFLGEFSPLILAEVEFPDEMSA NHFLPPDWFGEDVTFSSDYQNSSLSQKGPLL >gi|330405847|gb|ADLB01000001.1| GENE 31 30288 - 30839 716 183 aa, chain + ## HITS:1 COG:CAC1629 KEGG:ns NR:ns ## COG: CAC1629 COG0693 # Protein_GI_number: 15894907 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Clostridium acetobutylicum # 4 181 2 178 188 143 45.0 2e-34 MEKKVCVFLADGFEEIEGLTVVDILRRADIKVETVSVTGEKEIHGSHEINVQADTLFEEA NFENAEMLVLPGGMPGTIRLQEHRGLEALLRKFYGSKKYIAAICAAPTVFGKLGFLEGRK ATCYPAMEEGLVGADVIRDMVIVDGHVVTSRGMGTAIPFALALVELLAGSEKAEEIRESI IYG >gi|330405847|gb|ADLB01000001.1| GENE 32 30850 - 31515 551 221 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210615406|ref|ZP_03290533.1| ## NR: gi|210615406|ref|ZP_03290533.1| hypothetical protein CLONEX_02749 [Clostridium nexile DSM 1787] # 1 220 1 224 225 154 41.0 5e-36 MKRLKQGIVIILCCSLMIGMTACGKFDAKGYVQALLDHRFQGDTEKLLKFDKDSEAKKLK EDYEQYIQTFSQGLTEGLNASGEMEDKFYKVCKEIFRSMKYNVTKEEKVSSDEYKVTVEI QPTDIFVKWKSMLKESMAEIQQKSEHGEYKGTEEEILQLMLLDITAQSYELLETAYTEAS YGEKESVTLTIKKGENKEFAPKEEEISDLITKILRLDAIQD >gi|330405847|gb|ADLB01000001.1| GENE 33 31604 - 32890 1869 428 aa, chain + ## HITS:1 COG:BS_tig KEGG:ns NR:ns ## COG: BS_tig COG0544 # Protein_GI_number: 16079875 # Func_class: O Posttranslational modification, 
protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Bacillus subtilis # 1 424 1 423 424 334 50.0 2e-91 MSLQVEKLEHNMAKLTVEVSAEEFEKALNGAYMKQRGKISIPGFRKGKVPRQMVEKMYGV EIFYDDAANAIIPEAYAKAYDESELEIVSQPKIEVVQIEKGKPFIFTAEVALKPEVTLGE YKGLKVDKYSNRVTQKEVEERLVQEQEKNARTIAVEGRPVQDKDEVVLDFEGFVDGVAFE GGKGENYPLTIGSGAFIPGFEEQLIGAEAGKEVEVNVTFPQEYHAPELAGKAAVFKCTVN EIKAKELPELDDEFAAEISEFDTLDELKADIKAKIKEQKNADGKRQKEDQAMEQVVENAT MDIPEAMIDTQVRQMADDFAQRMQQQGLNMEQYFQFTGMTAEKMLEELRPQAVKRIQTRL VLEAIVKAENIEITDERIDEELAKMAEAYKMEVEKLKEFMGENEKAQMKMDLAVQEAVTL VVDNTVEK >gi|330405847|gb|ADLB01000001.1| GENE 34 32922 - 33503 792 193 aa, chain + ## HITS:1 COG:CAC2640 KEGG:ns NR:ns ## COG: CAC2640 COG0740 # Protein_GI_number: 15895898 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Clostridium acetobutylicum # 1 190 1 190 193 290 74.0 1e-78 MSLVPYVIEQTSRGERSYDIYSRLLKDRIIFLGEEVTDVSASIIVAQLLHLEAEDPEKDI QLYINSPGGSVSAGLAIYDTMQYIKCDVSTTCIGMAASMGAFLLAGGAKGKRFALPNAEI MIHQPLGGAQGQATEIQIAAEHILKTRQKLNEILAERTGKSFEQISADTERDNFMSAKEA EEYGLIDKVVVSR >gi|330405847|gb|ADLB01000001.1| GENE 35 33563 - 34810 266 415 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 153 398 258 449 466 107 30 2e-22 MAKVDEAIRCSFCNKSQAQVRKLIAGPNGVYICDECIDVCSEIIEEELEYEEDNGFGEIN LLKPEEMKAFLDDYVIGQDEAKKVLSVAVYNHYKRIMAEKDLGVELHKSNILMLGPTGCG KTLLAQTLAKVLNVPFAIADATALTEAGYVGEDVENILLKIIQAADGDIQRAEYGIIYID EIDKITRKSENPSITRDVSGEGVQQALLKIIEGTIASVPPQGGRKHPHQELIQIDTTNIL FICGGAFEGIDKIIETRIDKKSIGFNAEIAEKHEYDIDRLLGQVLPQDLVKFGLIPELVG RVPVTVSLEMLDEEALIRILTEPKSAIVKQYQKLLELDGVELVFDKEALVEIAKTSLARK TGARGLRAIMEKIMMDTMYHVPSDDSIKYCRITKEVVQGIGEPVYERENTEVVSA >gi|330405847|gb|ADLB01000001.1| GENE 36 34881 - 37202 2350 773 aa, chain + ## HITS:1 COG:BH3050 KEGG:ns NR:ns 
## COG: BH3050 COG0466 # Protein_GI_number: 15615612 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Bacillus halodurans # 1 770 1 770 774 694 48.0 0 MNEQIKRLPMVALRGMTILPKEVVHFDVSREKSLEAVQKAMAEEQQIFLLTQKCIETENV TQDDLYEMGVVASVKQIVKMPKKILRVLVEGEQRAKLNELVQTEPYLEAEITVLEEYPFV EEEPVKQEAMIRTLQELFLQYAVKTPKLTKETVEQIAGIDELKRLVDEISANVPFRYEDT QKLLEETDALKRYFLLVEKLENEIQVSKIKEELQEKVKARVDKHQKEYLLREELRVIQEE LGDDSTISDADEFAKAVKELDAPEEVKEKLRKEIRRFRNTMNGPAENGVTRTYIETMLEM PWNKSGEDNLDIHYAKEILEGEHYGLEQVKERILEFLAVRALTSKGSSPILCLVGPPGTG KTSIAKSLSKALHKKYVRISLGGVRDEAEIRGHRKTYVGAMPGRIANALKTAGVRNPLML LDEIDKVSTDYKGDTFSALLEVLDSEQNVKFRDHYLEVPLDLSEVLFVTTANSLQTIPRP LLDRMEIIEVTSYTDNEKFHIAERHLIPKQLEKHGLEKDMLTISRGAIWKISHNYTREAG VRQLERRIGDICRKSAREILEKNKQKIRVTESNLEKYLGKEKYTYQKANETDEVGIVRGL AWTSVGGDTLQIEVNVMQGKGELMLTGQLGDVMKESAKTGISYIRSIANDADIPKEFFEE HDIHIHIPEGAVQKDGPSAGISMATANFSAITNRKVRADVAMTGEITLRGRVLAIGGLKE KLLAAKVAGIKTVIVPFENKRDVEELSSEITKGLEIIPVSHMREVLEIALVKE >gi|330405847|gb|ADLB01000001.1| GENE 37 37211 - 37816 689 201 aa, chain + ## HITS:1 COG:SP1568 KEGG:ns NR:ns ## COG: SP1568 COG0218 # Protein_GI_number: 15901411 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Streptococcus pneumoniae TIGR4 # 5 189 7 191 195 195 51.0 5e-50 MVIKNVELEIVCGITSKLPDTELMEIAFAGKSNVGKSSLINALLNRKSLARTSATPGKTQ TINFYNVNKELYLVDLPGYGYAKVSPKEKEQWGKLIERYLNTSKKLRAVFLLIDIRHEPT ANDKMMYDWIVYHGYNPIIIATKLDKLKRSQIQKHVKMVKQGLNLVPGTKVIPFSSVTKQ GREEIWDLVKTMEESEDDRTE >gi|330405847|gb|ADLB01000001.1| GENE 38 37800 - 38471 592 223 aa, chain + ## HITS:1 COG:FN0217 KEGG:ns NR:ns ## COG: FN0217 COG0664 # Protein_GI_number: 19703562 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Fusobacterium nucleatum # 1 216 2 213 217 100 32.0 3e-21 
MTEQNRNYLKETLTFWNEITEEEREMIIQNTAVVKYKQGQNIYNAQQECVGVLLIKKGEL RTYILSEEGKEVTLYRLGEGDVCILSASCILKNITFDVHVDAEKDSEVLLINSPAYSALS QKNIYVEAFTLRLTADRFSDVMWAMEQILFMSFDRRLAIFLLDETAKTGTDNLSLTHEQI AKYVGSAREVVSRMLKYFAKEGMVELSRGSIKIVDRQKLKKLL >gi|330405847|gb|ADLB01000001.1| GENE 39 38620 - 38928 168 102 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 4 99 18 112 120 69 36 4e-11 MEALNITIDNFETEVMQSDKPVLIDFWATWCGPCQMVLPIIKELAEEVTHAKICKVNVDE QPELAKKFRVLSIPTLMVIKDGEIVKREVGAKSKEEILELIG >gi|330405847|gb|ADLB01000001.1| GENE 40 39075 - 40424 1389 449 aa, chain + ## HITS:1 COG:FN0667 KEGG:ns NR:ns ## COG: FN0667 COG0534 # Protein_GI_number: 19704002 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 10 412 12 414 426 263 38.0 6e-70 METTVTKNLMTEGSIWKKITFFALPIFLGNLFQQMYNTADSLIVGNFLGSSALAAVSSSG NLIFLMIGFFNGISIGAGVVIARYFGARNKEKVETAVHTTVAFGLAASVVLTIVGVFFAP QILILMDTPANVLPESVTYFRIYFMGSLGFVMYNVFVGILQSVGDSKHPLYYLIISSIIN VVLDLVFIAGFHYGVGAAALATTISQFVSAFLCMGQLMRTKEEYRLCLRKIRFDREMLGQ IIRIGLPSGIQNSIIAIANVVVQSNINAFGETAMAGCGAYSKIEGFGFLPITSFTMALTT FVGQNLGAKQEERAKKGARFGILTTVVLAELIGVLIFIFASPLTRAFDSNPEVIMYGVAR ARTAALFFCLLAYSHAISAVLRGAGKSMVPMIVMLICWCVIRVTILVVFGKLVGKIDIVY WVYPITWGLSSIIFFIYYKKANWLNSFKE >gi|330405847|gb|ADLB01000001.1| GENE 41 40437 - 40949 620 170 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2688 NR:ns ## KEGG: Cbei_2688 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.beijerinckii # Pathway: not_defined # 1 167 1 167 171 164 47.0 1e-39 MKLREAGMQDISQIKEMYKDIVSEMEKNDIYIWDEIYPCEFLEEDIKKKQFYILVNDKGE MLSGFALNSLHEGEKSIIWENENADVRYIDRLGVHVKHLRKGIGLLTLKEAISLTKQLGG EYLRLFVVDNNFPAIMLYEKAGFKKREGIYEEKIDSDLSFKEYGYEVKTI >gi|330405847|gb|ADLB01000001.1| GENE 42 41066 - 41920 591 284 aa, chain - ## HITS:1 COG:no KEGG:CLH_2455 NR:ns ## KEGG: CLH_2455 # Name: not_defined # Def: hypothetical protein # Organism: 
C.botulinum_E3 # Pathway: not_defined # 1 222 1 219 272 134 40.0 4e-30 MKKQLKNLTIKDNFMFAAVMLDEDNCKEFLERALQIPIDHVEVSTEKNIVYHPEYKGVRL DVYAKDEKNTRYNVEMQVSTQNSLGLRSRYYQSQMDMEMLLSGSEYAELPNSYVIFICDF DPFNEQKYKYTFQMECKETANAELHDKRNIIFLSTKGRNDEEVSEELIRFLKFVKADLKE SQNDFQDTYVRKLQKFIKHIKQNREMEEKYMLFEELLKDEREVGRKEGVFKGMTDILFMF LSKFGVLPDELQKKISKETDEEVLKNWIQTASQVTSLEEFISKM >gi|330405847|gb|ADLB01000001.1| GENE 43 42036 - 43385 1391 449 aa, chain - ## HITS:1 COG:FN1653 KEGG:ns NR:ns ## COG: FN1653 COG0534 # Protein_GI_number: 19704974 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 8 444 8 443 445 319 41.0 7e-87 MQKNKVANPITEGVIWKQLLIFFFPIVVGTLFQQLYNTVDAIIVGRFVGKEALASVGGSA SVLVNFVIGFFTGLSAGASVIISQFYGANDRENMQKGLHTAYAFSIIFGVITSIAGLILT PWLLEITKTPADIFDDSVLYLRIYFAGMLFMLIYNMGSSILRAIGDSKRPLYYLIVCCFI NIGLDILFVVYFHMGIAGAAIATLIAQAVSAVLVTRALMTSCDMLKLIPQEIRIHFTLLK SEFKIGLPSGVQSCMYSLTNIVIQAAVNNFGTDTAAAWAAYGKLDAIFWTICGSFGIAIT TFAGQNYGAGLYERVKKSVRVCLLMAFGVSAVLLTFLMVACRPLYYIFTTDPNVIDLGVY MLRLITPSYMIFIFIEIFSGALRGIGDVFIPTLITLGGVCFIRIPWVMFITPMRNEVSTL LYSYPVSWAATVLLLLPYYLYQKKKRLSK >gi|330405847|gb|ADLB01000001.1| GENE 44 43536 - 44933 1245 465 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0404 NR:ns ## KEGG: EUBREC_0404 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 464 15 504 511 394 45.0 1e-108 MNLINKIPTEFYKLFASKYRAYYQKFLLAIYGETSKSYSLLGLTEEECKCIMNENIADMV LEWSEEQADEEGEFLTRANMASVCLKNLENWGWLHRDYDETLNQYVVSFPEYSRLYVELF EKLFSEEGNRERESVLAVYSHLYTYSSDMEKNNDILKSAWNTSRTLVQMLANMQEGIRGY FDELSKQKSFLGIQEVLVKEINNSDSRKYAILTTTDSFYRYKEAVKELIEKNLGENEKRR QSFMKKRLQFEKDSPQYARNERAIELCEEAMDIIYRIEREFDTIERRYNKLIEQKTIFAN RAAARIRYLLMEGVAEEDGTVAFVNLLNQSSKREEILAKLAAGMGMTERYREVTDKSMYR QKNTDKEKFTPATVKEIPVEDEGLEEFILKPLYTQKEIRDFRRKNEKAGEFIVTENTVQS IEDLEKLFFVWQEVTERAESDADVAIEEEVETKGFKFSKLRIKEK >gi|330405847|gb|ADLB01000001.1| GENE 45 44934 - 45572 701 212 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0405 NR:ns ## KEGG: 
EUBREC_0405 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 196 1 196 241 234 58.0 1e-60 MFKYMEELSLTETENLRKTISNLFKQTCILQVKYDPVTLTPRDNIYYEMCVRHKKFIEDY LMVLGCELTHDPQEHIFRLTGEGVETERFSLTTTIIILMIKMIYRDKILGEGLEATVTTL EEIREYGKNTNLLNKKLKVAEWHEALYLLSKHQMIELPGAVRDVEDTTPIYLYSTINLYC SSLNINALLEEYREEIVENETVEENLYENVIE >gi|330405847|gb|ADLB01000001.1| GENE 46 45529 - 48912 2738 1127 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0406 NR:ns ## KEGG: EUBREC_0406 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 11 1118 1 1111 1111 1057 53.0 0 MKQSKKIFTKMLLNNWGGISHKELVFHEYVNLFSGKSGSGKSTVMDAIQVILYGSVAASF LNKAADDAKNRRSVLSYLRGEQKDGTANRDGMDFCSQIVLEVEDRSTHIVTCIGVAFEVG KKDLELKRYSFFSHSGRLAEDGYIAEDGSPYTLARLKKLIERRKESEDNRGRGDVNRLYP SKESYLNTLYDVILGYIEPGRFMTMEKSAIALRMTNGTGQFIRDYMFPKSKEGTVSVISE QLGAYREIKERVEDLENRIQLLENIREHNLNLIQFKSDAIRTQAILKVITIESLKAQLQS KKETLESVIDTVEKIGETCRILEEQENKCFEEKVEIEAQIRASDYGQKEKELEEIRYTIE ILAGSSAQWRAVLHGLSAWEENEDISAYLSNPALHTIREVTEGNVTEEILTKLKKYLKDT LENIEEELQELQYSIRKTEKELAEKQEIIEDLKNDRKPYKKELKRARARLQNALSDRYGR TIHVGIFADLFDITDEKWKNAIEGRLGRIKYSLVTAPEYALEAATIFRKMREREFEEVNL INTAAIKRDEPKARENTLYEAVKTSETYVDLCLKQYLGRIVKCETIEQLHEVREGVTPDC YSYSNYMFRHLREKDYGDYSCIGTKVSKLKLQELEDEAYLLKKQLKADRQMKDSLKKART FEQLNNKNAEQLIQMSKASEDLENYYKKEEKLIRELQELNAGTLLPQLKEMKRQLEEKIE TLREELQKRNKEHLERIKIQGNAETELKTVESRLEDELQGYVKNEALEQEVLRELETKNE YTYRRQKQKELEKLDKAIEEETEQRLTARDEFNRQYSAYGFTGREPQNDVYDRVLEQCKK DFEPKYKAEFEQQYHLVYHTLRENVIATIHGEIKAAFRHKREINRMLGKIRFSDSIYQID ILPAENENRQFYEMLMAPELDSKVFDNHGYEGQLSLGEDEFYQKYETQILRLTEKFMPPR DEDSGKRAVYKEEMERYADYRNYLNFNMYEQVEDEDGNIRKNAVDDMAGRDSGGEGQNPK YVALLAGFAMLYMNQSNRDSKIRLVLLDEAFSKMDKERSEVCLHYARELELQLIVCVPDE RLQSLIRNVDSVYGFRRHQNQISMMHIDKGDYLKMIEGEEHEKIADA >gi|330405847|gb|ADLB01000001.1| GENE 47 48890 - 50083 865 397 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2907 NR:ns ## KEGG: EUBREC_2907 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 
397 9 417 417 350 46.0 4e-95 MKKLLTLSRWIIDKTNVDSYRAGRLQGEKHPCIDQSLLNQLGGMANLLKQAEEMERDAGL NKYIRFDWRDMGKDITKMHYSIDIIPLLCKKEGIEDSLEKQRRYIEKIENLLSEVSRAET NWLYSYYNSLLVRLKQGKNVKEMEDETLFRCLQGMAAIQNPLWKRIFSANVLENSKKFEK EYETKVVGILRMYSDLPDKDDMTDEEILKSYGILSYAQVLEWKGTVIYRLDTGKTVDTSD FYYGTIINSQTLEHSKPIAIPHIRKIMTIENKANYENMVYEKDTLYIYCHGFFSPKEVKF LNELTKLTDENVEFLHWGDMDFGGIRIFLFNKEKIFPKLKPYKMGRECFLEAIGKNAGID LEERKRRSLEEMKAGELEELKECILKYGVEIEQEMVI >gi|330405847|gb|ADLB01000001.1| GENE 48 50294 - 50389 164 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MANKPSSASLLKLIGYKKNASSEMLEISDTV >gi|330405847|gb|ADLB01000001.1| GENE 49 50521 - 50604 75 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNSVRILSIPNSEKTSSIGLKWPNLLD >gi|330405847|gb|ADLB01000001.1| GENE 50 50604 - 50693 62 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no YKDENKAACDLMLLYNDDLRRAHYLKEKF Prediction of potential genes in microbial genomes Time: Tue May 24 20:49:49 2011 Seq name: gi|330403115|gb|ADLB01000002.1| Lachnospiraceae bacterium 2_1_46FAA cont1.2, whole genome shotgun sequence Length of sequence - 23125 bp Number of predicted genes - 22, with homology - 18 Number of transcription units - 8, operones - 5 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 235 245 ## Cphy_1438 transposase IS204/IS1001/IS1096/IS1165 family protein - Prom 287 - 346 6.0 + Prom 283 - 342 9.3 2 2 Op 1 . + CDS 533 - 718 146 ## 3 2 Op 2 . + CDS 715 - 819 69 ## + Prom 824 - 883 4.0 4 3 Op 1 1/0.000 + CDS 1062 - 1799 240 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 5 3 Op 2 . + CDS 1748 - 2071 168 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 6 3 Op 3 . + CDS 2104 - 2184 90 ## 7 3 Op 4 . + CDS 2206 - 2865 422 ## Cla_a032 putative DNA primase TraC 8 3 Op 5 7/0.000 + CDS 2866 - 4872 2109 ## COG0556 Helicase subunit of the DNA excision repair complex 9 3 Op 6 . 
+ CDS 4882 - 7728 2982 ## COG0178 Excinuclease ATPase subunit + Prom 7740 - 7799 4.4 10 4 Op 1 . + CDS 7852 - 8841 1360 ## COG1077 Actin-like ATPase involved in cell morphogenesis + Term 8845 - 8883 6.2 11 4 Op 2 . + CDS 8909 - 11122 2033 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 12 4 Op 3 . + CDS 11144 - 11839 406 ## COG1040 Predicted amidophosphoribosyltransferases + Prom 11845 - 11904 3.1 13 5 Op 1 . + CDS 11924 - 12292 394 ## gi|226324395|ref|ZP_03799913.1| hypothetical protein COPCOM_02176 14 5 Op 2 . + CDS 12293 - 13321 1231 ## EUBELI_01627 hypothetical protein 15 5 Op 3 . + CDS 13371 - 14255 998 ## COG1624 Uncharacterized conserved protein 16 5 Op 4 . + CDS 14233 - 15495 1598 ## EUBELI_01625 hypothetical protein 17 5 Op 5 . + CDS 15521 - 15787 357 ## EUBREC_2779 putative phosphotransferase system HPr protein + Term 15797 - 15853 10.0 + Prom 15816 - 15875 5.2 18 6 Op 1 . + CDS 15898 - 16815 1184 ## Cphy_0247 hypothetical protein 19 6 Op 2 . + CDS 16841 - 20017 2862 ## COG4193 Beta- N-acetylglucosaminidase 20 6 Op 3 . + CDS 20029 - 22725 2135 ## COG1376 Uncharacterized protein conserved in bacteria + Term 22737 - 22795 6.3 21 7 Tu 1 . - CDS 22715 - 22834 89 ## - Prom 22869 - 22928 8.6 22 8 Tu 1 . 
+ CDS 22848 - 23124 161 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|330403115|gb|ADLB01000002.1| GENE 1 1 - 235 245 78 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1438 NR:ns ## KEGG: Cphy_1438 # Name: not_defined # Def: transposase IS204/IS1001/IS1096/IS1165 family protein # Organism: C.phytofermentans # Pathway: not_defined # 1 66 1 66 391 92 62.0 5e-18 MHSHYTNKLLNIEDVIIKKIHHADTFLKIYLETNPHEQVCPCCGSTTKRIHDYRYQTIKD LPFQLKLAISFFAKGGMY >gi|330403115|gb|ADLB01000002.1| GENE 2 533 - 718 146 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFFESGNKNAMMPFPIMQLSISGITIYKAENECIECCFVDFNVVLVVYADRIILDETGKN R >gi|330403115|gb|ADLB01000002.1| GENE 3 715 - 819 69 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNRWPDSYIEVIVFVFARGNFIKVYLQVVLKKLF >gi|330403115|gb|ADLB01000002.1| GENE 4 1062 - 1799 240 245 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 17 232 8 219 339 97 28 3e-34 MANREVGFIVTEEYRHWINEVKKRIKQSQIKASVKINYELLDLYWELGRDIVNKQENAKW GDAFLSVMSKNLQKTFPGMSGFSAQNLRSIRFWYKFYNDNAKCLQVVSKFEMVEKMVKSI PWGHNQRIMYKCKDINEALFYVQKTMDNNWSRNVLEHQIDSNLYERQGKAITNFQVKLPI PQSDLAEQTLKDPYNFDFLTLREEYDEKELEDALVNQITQFLLELGTGFSYIYCFIMFVY ILMWS >gi|330403115|gb|ADLB01000002.1| GENE 5 1748 - 2071 168 107 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 1 105 233 327 339 69 36 3e-34 FIYILFYHVRLHSYVVIELKTEKFKPEFAGKLNFYVTAVNKNMKSKQDNQTIGILICKDK DDVVAEYSLDNVSQPIGIARYELTKVLREEFKSSLPTIEEIEKELSE >gi|330403115|gb|ADLB01000002.1| GENE 6 2104 - 2184 90 26 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKVLFDKNCILVGWYRNRDGKVFVIF >gi|330403115|gb|ADLB01000002.1| GENE 7 2206 - 2865 422 219 aa, chain + ## HITS:1 COG:no KEGG:Cla_a032 NR:ns ## KEGG: Cla_a032 # Name: not_defined # Def: putative DNA primase TraC # Organism: C.lari # Pathway: not_defined # 1 194 1 187 213 94 31.0 2e-18 MALILKVPYEEKDEAKVLGAKWNPQLKKWYVEKRKDYHKFTKWILGDKEQVYILCDYFYV 
VEGLHTCFKCRNLTQVIGYGVKKYFDVCNPELYGDEKAWSFEDDEIHIASHIYPLPEKVL NYLKDKYGYYESYSKTIQSSYLANHCSNCKVIQGDFYLFGEVDSPFFIDSEERARKLKLY RVPLKNDIIVEADIGRGSEDWLIEVFAQSIDLSIDGRVE >gi|330403115|gb|ADLB01000002.1| GENE 8 2866 - 4872 2109 668 aa, chain + ## HITS:1 COG:BH3595 KEGG:ns NR:ns ## COG: BH3595 COG0556 # Protein_GI_number: 15616157 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Bacillus halodurans # 10 662 5 659 660 869 67.0 0 MAGGNYMDHFELISEYAPTGDQPQAIEQLVKGFKEGNQCQTLLGVTGSGKTFTMANVIQQ LNRPTLIIAHNKTLAAQLYGEFKEFFPNNAVEYFVSYYDYYQPEAYVPSSDTYIAKDSSI NDEIDKLRHSATAALSERRDVIIVASVSCIYGLGSPIDYQEMVISLRPGMIKDRDEVVAK LIEIQYDRNDIDFKRGTFRVRGDVLEIFPAISEEYAIRVEFFGDEVERITEIDVLTGEIK NVLNHVAIFPASHYVVSKENMEKATKAIEEELEERIRYFKGEDKLLEAQRISERTNFDIE MMKETGFCSGIENYSRHLTGLKEGQPPHTLIDYFPDDFLIMIDESHKTIPQIAGMYAGDQ SRKSTLVDYGFRLPSAKDNRPLNFEEFESKINQILFVSATPGEYEEKNELLRADQIIRPT GLLDPEVEVRGVEGQIDDLVGEVNKEIAKKNKVLVTTLTKRMAEDLTDYMRELGIRVKYL HSDIDTLERTEIIRDMRLDVFDVLVGINLLREGLDIPEITLVAILDADKEGFLRSETSLI QTIGRAARNAEGHVIMYADTMTDSMKNAIRETERRRKVQMAYNEEHGITPQTVKKAVRDL ISISKKVASEELQMEKDPESMSRSELEKLIKDMTKQMKKAAAELNFEMAAQVRDKLVELK KQLQEIES >gi|330403115|gb|ADLB01000002.1| GENE 9 4882 - 7728 2982 948 aa, chain + ## HITS:1 COG:CAC0503 KEGG:ns NR:ns ## COG: CAC0503 COG0178 # Protein_GI_number: 15893794 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Clostridium acetobutylicum # 8 945 2 939 939 1231 65.0 0 MAKKTEQKQYIKIRGANEHNLKNIDVDIPRNELVVLTGLSGSGKSSLAFDTIYAEGQRRY MESLSSYARQFLGQMEKPNVESIEGLSPAISIDQKSTNRNPRSTVGTVTEIYDYFRLLYA RIGIPHCYKCGKEIKKQTVDQMVDQILELPEGTKIQLLAPVVRGRKGRHEKLFERAKKSG YVRVRVDGNLYELTEEIVLDKNIKHNIEIIVDRLIVKAGIEKRLTDSVESVLSLAEGLMT VDVIGGEEIHFSQSFSCADCGISIDEIEPRSFSFNNPFGACPVCYGLGYKMEFDVDLMIP DKSLSIDEGAITVMGWQSCTDKGSYTRAILDALAKEYHFSLDTPFQDYPQEIQDILIHGT NGKEVKVYYKGQRGEGIYDVAFEGIIKNVERRYRETGSETMKAEYESFMRITPCHECGGQ RLKKSALAVTVGGKNIAEVTNLSIGKLQEFLDTLELTPTQELIGGQILKEIKARTGFLID 
VGLDYLTLARATGTLSGGEAQRIRLATQIGSGLVGVAYILDEPSIGLHQRDNDKLLKTLK HLRDLGNTLIVVEHDEDTMLEADYIVDIGPKAGEHGGEVVAVGTAKQIMRNKNSITGAYL SGRIKIPVPEVRKQPTGFLKIVGARENNLKNIDVDIPLGVLTCVTGVSGSGKSSLINEIL YKKLAKELNRARTIPGKHDRIEGVEQLDKVIDIDQSPIGRTPRSNPATYTGVFDLIRDLF AATADAKARGYKKGRFSFNVKGGRCEACSGDGILKIEMHFLPDVYVPCEVCGGKRYNRET LEVKYKGKSIYDVLNMTVEEAMHFFENVPSIRRKMETLYDVGLSYVRLGQPSTTLSGGEA QRIKLATELSKRSTGRTIYILDEPTTGLHFADVHKLTEILRRLAEDGNTVIVIEHNLDVI KTADYIIDIGLEGGDRGGTVVATGTPEEIAKNPNSYTGKYIDAILKKN >gi|330403115|gb|ADLB01000002.1| GENE 10 7852 - 8841 1360 329 aa, chain + ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 4 323 8 327 340 382 62.0 1e-106 MSIDIGIDLGTASILVYVKGKGVVLKEPSVVAFDRDTNVIKAIGEEARLMLGRTPGNIVA VRPLRKGVISDYTVTEKMIKYFVQKAMGKRTFKKPRISICVPSGVTEVEKKAVEEATFAA GAREVHLIEEPVAAAIGAGIDIAKPCGNMIVDVGGGTADIAVISLGGTVVNTSIKIAGDD FDEAIVRYMRKKHNLLIGERTAEDIKIKIGTTYPLVEDETMEVRGRNLVTGLPKTVTVTS SETEEALRETTGQIVEAVIGVLERTPPELSADILDRGIVLTGGGAMLRGLEELIEERTGI NTMTAEDPMKVVAIGTGQFVEFMSGRKDA >gi|330403115|gb|ADLB01000002.1| GENE 11 8909 - 11122 2033 737 aa, chain + ## HITS:1 COG:CAC2854 KEGG:ns NR:ns ## COG: CAC2854 COG0507 # Protein_GI_number: 15896108 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Clostridium acetobutylicum # 8 737 8 733 739 556 44.0 1e-158 MEVVTGYVEHIVFRNEDNGYTVFQFNTEDGDLTCVGNFHYISEGEMLELSGEYVTHKVYG VQLQVLSHKAKEPEDLFSIERYLGSGAIKGLGSALAGRIVRRFKEDTIRIIEEEPERLAE VKGISEKKAQEIAEQVEGKKEMRNAMIYLQKYGISTTLAAKIYKKYKDSVYRVLEENPYR LADDIQGIGFKTADEIALRIGIHTDSDFRIRSGIFYALLQSVADGHIYLPKEELLKKAGE ILEVNIEDIEKYLMDLSMEKKVVLKENEGQIRVYPSGYYYMELNTAKMLHDLDIHEEISE SMVERRLHKIEDNTGMILDEMQRRAVIEATKHGILIVTGGPGTGKTTTINAMIHYFESEG 
LDITLAAPTGRAAKRMTEATGYEAQTIHRLLEVSGNPEDSEKRGGFGRNEENPLETDVII IDEVSMVDLALMYALLCAVTVGTRIILVGDGNQLQSVGAGNVLKDMIASECFPVVRLTKI FRQATESDIVMNAHKINRGERVVLDNKSRDFFFLKRQDANTIISVIITLLQKKMPKYVDA QPYDIQVLTPMKKGLLGVERLNKILQQYMNPPEKKKAEKEYGEKLFREGDKVMQIKNNYQ LEWEISTKFGLVVDKGIGVFNGDMGIITQINTYNETLEVEYDEKRKVTYPFQLLDELELA YAITIHKSQGSEYPAVIIPLLQGPRQLYHRNLLYTAVTRAKKCVTLVGSDGTFYEMIENT NEQNRYTSLAERIREFM >gi|330403115|gb|ADLB01000002.1| GENE 12 11144 - 11839 406 231 aa, chain + ## HITS:1 COG:HI0434 KEGG:ns NR:ns ## COG: HI0434 COG1040 # Protein_GI_number: 16272382 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Haemophilus influenzae # 14 219 9 220 229 105 29.0 9e-23 MIKRILEWLYPPVCVFCGKICEQGICAECRKKVGIIGEPRCKKCGKPIRLEEAELCYDCE REELDYEQGRSLWLHKMPVSSSIYAFKYKNRRIYGEVYGKEMAKTFEKLIRLWEIDVIVP VPLHRKKQKKRGFNQAEILAKEIGLRTGLPVDTTLVKRKINTVPQKEFSRRERKKNLKNA FEVTRKIKEKNVLIIDDIYTTGSTIHSISVLLKKSGAEKTYFLTISIGQGL >gi|330403115|gb|ADLB01000002.1| GENE 13 11924 - 12292 394 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|226324395|ref|ZP_03799913.1| ## NR: gi|226324395|ref|ZP_03799913.1| hypothetical protein COPCOM_02176 [Coprococcus comes ATCC 27758] # 1 122 10 131 131 115 49.0 1e-24 MTQMAMYESKEGQTDFKISGYYQKDYIGLHTWIMVIWSTIGYAITAGAVFFIFLDEIFAN PSLLRFLILGGILLAGYIATVVISVIVAHNFYKKKHVDARKRVKRFNRELIQLSKMYEKE IR >gi|330403115|gb|ADLB01000002.1| GENE 14 12293 - 13321 1231 342 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01627 NR:ns ## KEGG: EUBELI_01627 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 317 1 317 350 210 39.0 5e-53 MSRLLEWKEKLQSLYAGYSTYIDKGIRFVVAITSFLLISSNIGFMEKLAKPIISIVLAAV CAFLPMIVTVIVSAGLMLAHMFVLSEGIALITAGIIILMFIFYFRFTPKKAIILLLTPIA FAFKVPMLIPIAYGLAGTPVYIVPVICGTIVYFLIQFAKTFSTTITGASKGGMMTVVGTF SKQIFQSKELWAVIVAIAICLLIVYAIRRKSIDRSWEIAIVSGAVAYIIVMVVGSVALDV EVHYASLIIGSVISAVIGFGLELMLFSVDYARTEYLQFEDDEYYYYVKALPKVSVSAPEK QVKRINRRQETDTADNRSATGRLQLTGEADIDKIIEQELMKK >gi|330403115|gb|ADLB01000002.1| 
GENE 15 13371 - 14255 998 294 aa, chain + ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 23 285 14 270 273 202 43.0 5e-52 MKEWIAQLFDEYFSYVPHITVTDVVEIIIIAFVVYEIALWIKNTKAWMLLKGMLVLAVFI SVAAIFQMNTILWLVKNSIGVLATVTVVVFQPELRRALEKLGERNLLSAVVPFDKSKEKE KFSQETIEGIVTATFEMAKVKTGALIVVEDAIQLTEYERTGIRLDSVVSSQLLINIFEHN TPLHDGAIIVRGDRIVAATCYLPLSDNRNLSKDLGTRHRAAVGMSEVSDALIIVVSEETG AVSYAQGGRIVRHVNERQLTEKLNQIRKKTDEEKKRGLRRLWKGRAKNERTIEK >gi|330403115|gb|ADLB01000002.1| GENE 16 14233 - 15495 1598 420 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01625 NR:ns ## KEGG: EUBELI_01625 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 405 1 398 433 130 28.0 7e-29 MRERLKNNFGLKVLSLLFAIVLWLLVVNIDDPIGEVTFRNIPVKVLHEEIFTSKSSTYTI VDDKDTVNVTVSARRKVLSEIKPSDIIVTADIKDRVTNSLTEATLPTEVTIQGFEGEYQS AYTTPKNIDIEIEASTNKKFPISVTTIGTPRDGNVIGSMTANPEFITLAGGESQINRVKK VVAKANVSGISSSGQVDAELILYDENEKVIDQALFDSNLGKEGVKVDIEVLKTKEVPLRF DMSDINTASGYTLGNIFYEPKKILVSGEESVLKELDIIDIPSGALKLNDISETTEVKVDV SKYLPKKVKLVDDTAGTIIVTVSVERYGTKAFSIPGNNVILENTLAGLKANIGTVENIEI QVKGSRAALEKLAEAPKVYVDLGKYTTAGTITVPLQAELPPGCTLVGNVTVPVVLTGEQK >gi|330403115|gb|ADLB01000002.1| GENE 17 15521 - 15787 357 88 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2779 NR:ns ## KEGG: EUBREC_2779 # Name: not_defined # Def: putative phosphotransferase system HPr protein # Organism: E.rectale # Pathway: not_defined # 1 86 1 86 88 83 52.0 3e-15 MVSKKVKINNPTGLHLRPAGMFCRRAAEFEECKITFHVGCTTGSAKSVLSVLAACIKQGD EIEIICEGKNEESALEAMARLVEEELED >gi|330403115|gb|ADLB01000002.1| GENE 18 15898 - 16815 1184 305 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0247 NR:ns ## KEGG: Cphy_0247 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 305 1 305 306 362 61.0 1e-98 MKKATLVIMAAGIGSRFGGGIKQLEPVGPNGEIIMDYSIYDAREAGFNKVVFVIRKDLEK 
DFKEIIGSRIEKEIEVAYAYQELMDIPEKFRGRFSERTKPWGTGQAILCCKDVVNEPFLV INADDYYGKEAYREAYQYLTEEKPESEKVQASMIGFVLGNTLSENGGVTRGICKVDENGM LTDIVETSNIEETETGAGIRTEDGITPVNVHSPVSMNMWGLHRDFFKVLESGFEEFLENV EEGNLKAEYLLPTIIGGLLEEGKIDVKVLKSHDKWFGVTYKEDKEFVVTSIKNLVEQGVY PQKLF >gi|330403115|gb|ADLB01000002.1| GENE 19 16841 - 20017 2862 1058 aa, chain + ## HITS:1 COG:SP0965_2 KEGG:ns NR:ns ## COG: SP0965_2 COG4193 # Protein_GI_number: 15900842 # Func_class: G Carbohydrate transport and metabolism # Function: Beta- N-acetylglucosaminidase # Organism: Streptococcus pneumoniae TIGR4 # 240 400 2 156 156 98 38.0 8e-20 MIKRIGQKIIAGILTAVLIAGSLNGISGYAEDNTLSEKEQSEEMSFQGENPVTAENYREV DAKGNVNQAKEESGIVEEDVSFYAATPQIVNFNTKSSGQTTDYTEEETGVAGYTYGPYAA DAAYLGKTSSGKVRFMLAGVIGTVNAREVQIVNKSSAKSLSYYYVSNNRLYHKIATNINN AGSGSTLDNGPAPSYLKTGVNYYSYDGHYFYTESNFGNMIDDYNNKTRSHSVNPNNPFYN YFQYLPLRSQSGYNESSFNTLLNNKVTSTSKMRNTGDDFVKNQNTYGANALLMAGIAANE SAWGTSNIAKTKNNLFGLNAVDSSPGESANYFESVSQCIKEYGEKWLSKEYLNPQNWKHY GAFLGNKASGMNVKYASDPYWGEKAAAMAWLLDGNGGNKDAYKYTIGIKDTIYPSNVVNV RKESTTSSTAFYKTKKNANCSFLILDKNPVNKFYKIQSEPVLNSGRTGINSSSGVYNFSN MYAYLSSDYLKIVSEGKTTWSVNGIDTDLSSPQLFDTSIKVSANISGNKTGLQYKFVWQK NDWSEWSVLKDFSSTDYATWLPKSEGNYTILLDVKDTSGKVITVSIPYQIKNWSYSGVKT SIQSPSALKTQINLSADIKGTTSGLQYKFVWKKDNWKEWGVIKDFSTTNNAVWTPTKAGT YELIVDVKDRDGFVATRRLTYQIVEKAWEVQNTVFSPNGAEIGNTIKITQNMRNITEDNV GLQYKFVWQKDNWKEWGVIRDFSTSNTVNWVPTVSGDCEIIVNIKDNRGNSATKTAKYQV QKGKWTFSEIAADKELPQIAGEKIKLTAKVHGNAAGLRYKFVWQKDNWKEWGVIQELSSK NTAEWTPKKSGTYTVYVDVKDSDGVTTSYPKIFTIKNTGEITIQLQPNTNQNVGGKVNIK ALVGGSNNNTKYKFVWQKDNWKEWGVIQDLSTKNNVDWIPKKEGTYTICVDVKQGNASAE TKTIQVKVGKWTYNEISVNKNQNGELEIKPIIQGNTTGFTYKYVWQKNNWKEWGVIKDFS GQEKMSWKPNGKGQYTMVVDVKDAAGNVKTVSKVFNIQ >gi|330403115|gb|ADLB01000002.1| GENE 20 20029 - 22725 2135 898 aa, chain + ## HITS:1 COG:CAC1822_2 KEGG:ns NR:ns ## COG: CAC1822_2 COG1376 # Protein_GI_number: 15895098 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 743 869 1 120 120 63 34.0 
1e-09 MGKKAISFVLAVALCVTTMGHLSYAQDNRLQNETVEKANETEWKLQDIKFDKASPQEVSE NINISADVESENTNLLYKYVWQKNDWKEWGVIQDFSNEKTVKWQPKTSGKYYIYVDVKEP GKKTESLSEAYSVVKNKWDYNDLLPEGTELKKNKKVTISTELSGNKKGLRYKYVWQKNDW KEWGVIRDFSEENEVSFTPDGLGKCTIVVNIKDQDGDVITKSKEYNVKTDIWKYNEIVTD YSSPQEKYSDPITITAKTSVETERLQYKFVWQKDDWREWGIIQNLSKKNTAVWQPKSTGK YTIYVDIKDADGITTTKRIPYEIIPVNWKYEAIKTTPEDVQKKENEVEIQAQTSGNTKGL QYKFVWQKDDWKEWGVIQEFSEKNTAVWKTPNKSGSYKIYVDVKDRDGKTRTKSLDYFVA TQLWNPEEVVVNEGIEEQIYTKIPIEARVSGDTEGLQYKFVWKKGAADDDWGKEWGIIQE LGSSNRTEKWYPKKSGIYTIYVDIKDVDGRKKTITKEYKVLEAPWKLDELEIYGSPDRFV GDTLKVEAKTSGETEGLQYKFVCRRGSGWEDWEVVQDFLTENQIEIPIDKDKEYNIYVDI KDNRGVTFDAETTKVRGHKYLSVKSSSTVISKGQSVNVYPEITGEKGELQYKYVWQKDNW QKWGVIKEFSSSSSISWTPSEAGIYYIIIDVKANGKVQTKSVKIEVKNAKNGWYYEGGYK FYYKNGVKQLDLDGILPKQSNYYIKVNRKACTVTVYAKDGNNGYIIPVKRFACSVGKPNT PTPVGTFYTPAKYRWHTLMGPSYGQYCTRITGSILFHSVAGKNMTSYNLDARDYNMLGQP ASHGCVRLCVRDAKWIYDNCSLKTKVTIYDASSPGPLGKPSTIKIPLWQTWDPTDPNI >gi|330403115|gb|ADLB01000002.1| GENE 21 22715 - 22834 89 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDCGLFGVRFLFLGHKNGTVAPVDYSSTFATAPFYYSIY >gi|330403115|gb|ADLB01000002.1| GENE 22 22848 - 23124 161 92 aa, chain + ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 92 9 92 334 117 70.0 4e-27 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACK Prediction of potential genes in microbial genomes Time: Tue May 24 20:50:55 2011 Seq name: gi|330401009|gb|ADLB01000003.1| Lachnospiraceae bacterium 2_1_46FAA cont1.3, whole genome shotgun sequence Length of sequence - 78815 bp Number of predicted genes - 75, with homology - 72 Number of transcription units - 22, operones - 13 average op.length - 5.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 59 - 262 192 ## COG3666 Transposase and inactivated derivatives + Term 302 - 352 6.0 + Prom 364 - 423 5.9 2 2 Op 1 . + CDS 579 - 1976 1383 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 3 2 Op 2 . + CDS 1976 - 3076 1025 ## COG0562 UDP-galactopyranose mutase 4 2 Op 3 . + CDS 3084 - 4112 517 ## COG3594 Fucose 4-O-acetylase and related acetyltransferases 5 2 Op 4 11/0.000 + CDS 4187 - 5218 1018 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 6 2 Op 5 26/0.000 + CDS 5218 - 6276 959 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 7 2 Op 6 . + CDS 6290 - 7378 749 ## COG0438 Glycosyltransferase 8 2 Op 7 . + CDS 7393 - 8676 650 ## gi|293401746|ref|ZP_06645887.1| hypothetical protein HMPREF0863_02027 9 2 Op 8 . + CDS 8661 - 9296 742 ## 10 2 Op 9 . + CDS 9299 - 9856 637 ## 11 2 Op 10 16/0.000 + CDS 9868 - 10950 1321 ## COG1088 dTDP-D-glucose 4,6-dehydratase 12 2 Op 11 1/0.000 + CDS 10980 - 11858 856 ## COG1209 dTDP-glucose pyrophosphorylase 13 2 Op 12 . + CDS 11880 - 12794 674 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 14 2 Op 13 . + CDS 12794 - 13252 374 ## Cbei_2596 WxcM domain-containing protein 15 2 Op 14 9/0.000 + CDS 13239 - 14333 1238 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis 16 2 Op 15 . + CDS 14348 - 15127 725 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 17 2 Op 16 . + CDS 15124 - 16569 1385 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 18 2 Op 17 . + CDS 16589 - 17947 941 ## DSY3308 hypothetical protein 19 2 Op 18 . + CDS 17986 - 19212 1362 ## COG4198 Uncharacterized conserved protein 20 2 Op 19 . + CDS 19225 - 21039 1703 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 21 2 Op 20 . 
+ CDS 21055 - 21999 716 ## COG2423 Predicted ornithine cyclodeaminase, mu-crystallin homolog + Term 22011 - 22059 1.1 + Prom 22396 - 22455 7.8 22 3 Op 1 1/0.000 + CDS 22490 - 23629 1196 ## COG1316 Transcriptional regulator + Term 23726 - 23780 3.0 + Prom 23688 - 23747 5.8 23 3 Op 2 . + CDS 23792 - 24931 1309 ## COG1316 Transcriptional regulator + Prom 24954 - 25013 10.2 24 4 Op 1 . + CDS 25040 - 26440 1505 ## EUBELI_00559 hypothetical protein + Term 26448 - 26487 6.0 + Prom 26564 - 26623 4.5 25 4 Op 2 . + CDS 26654 - 28216 1654 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 28222 - 28275 10.0 + Prom 28252 - 28311 8.0 26 5 Op 1 4/0.000 + CDS 28346 - 30586 2125 ## COG2217 Cation transport ATPase 27 5 Op 2 . + CDS 30611 - 30907 332 ## COG1937 Uncharacterized protein conserved in bacteria 28 5 Op 3 . + CDS 30922 - 31119 394 ## gi|210608693|ref|ZP_03287970.1| hypothetical protein CLONEX_00149 29 5 Op 4 . + CDS 31124 - 32860 1457 ## EF3301 hypothetical protein 30 5 Op 5 . + CDS 32873 - 33544 654 ## gi|210608691|ref|ZP_03287968.1| hypothetical protein CLONEX_00147 + Term 33638 - 33704 30.0 + TRNA 33620 - 33694 86.4 # Pro TGG 0 0 + TRNA 33699 - 33769 75.8 # Gly TCC 0 0 - Term 33770 - 33812 9.2 31 6 Op 1 . - CDS 33826 - 34446 500 ## COG3546 Mn-containing catalase 32 6 Op 2 . - CDS 34450 - 34716 344 ## Dtox_3011 CotJB protein 33 6 Op 3 . - CDS 34713 - 34925 125 ## gi|210608690|ref|ZP_03287967.1| hypothetical protein CLONEX_00146 - Prom 35026 - 35085 7.0 + Prom 34963 - 35022 8.0 34 7 Tu 1 . + CDS 35042 - 36202 1281 ## COG0772 Bacterial cell division membrane protein + Prom 36239 - 36298 5.5 35 8 Op 1 . + CDS 36366 - 37205 974 ## COG1307 Uncharacterized protein conserved in bacteria 36 8 Op 2 . 
+ CDS 37259 - 37495 285 ## gi|210608684|ref|ZP_03287961.1| hypothetical protein CLONEX_00140 + Term 37510 - 37557 10.4 + Prom 37504 - 37563 5.3 37 9 Op 1 29/0.000 + CDS 37680 - 38117 460 ## COG2001 Uncharacterized protein conserved in bacteria 38 9 Op 2 . + CDS 38136 - 39071 767 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 39 9 Op 3 . + CDS 39119 - 39607 429 ## EUBREC_2253 hypothetical protein 40 9 Op 4 3/0.000 + CDS 39631 - 41502 1722 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 + Prom 41529 - 41588 9.0 41 9 Op 5 4/0.000 + CDS 41657 - 43324 1270 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 42 9 Op 6 28/0.000 + CDS 43336 - 44292 1265 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 43 9 Op 7 25/0.000 + CDS 44306 - 45661 1685 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 44 9 Op 8 . + CDS 45677 - 46762 1138 ## COG0772 Bacterial cell division membrane protein 45 9 Op 9 . + CDS 46784 - 47539 654 ## EUBREC_2248 hypothetical protein + Prom 47551 - 47610 4.8 46 10 Tu 1 . + CDS 47678 - 48907 1672 ## COG0206 Cell division GTPase + Term 48915 - 48976 13.5 + Prom 48910 - 48969 5.2 47 11 Op 1 . + CDS 49079 - 49861 574 ## Cphy_2470 peptidase U4 sporulation factor SpoIIGA 48 11 Op 2 . + CDS 49874 - 50614 760 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 49 11 Op 3 . + CDS 50694 - 52301 1547 ## COG1236 Predicted exonuclease of the beta-lactamase fold involved in RNA processing + Prom 52315 - 52374 3.2 50 11 Op 4 . + CDS 52401 - 53180 619 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit + Term 53187 - 53236 1.4 + Prom 53209 - 53268 3.3 51 12 Op 1 . + CDS 53288 - 53938 638 ## COG0546 Predicted phosphatases 52 12 Op 2 . + CDS 53968 - 54231 464 ## EUBREC_1570 hypothetical protein + Term 54233 - 54270 3.7 53 13 Tu 1 . 
- CDS 54434 - 54916 452 ## COG1396 Predicted transcriptional regulators - Prom 55070 - 55129 3.1 - Term 54960 - 55007 1.1 54 14 Tu 1 . - CDS 55149 - 55373 326 ## gi|197302777|ref|ZP_03167830.1| hypothetical protein RUMLAC_01506 - Prom 55401 - 55460 4.8 + Prom 55389 - 55448 7.9 55 15 Tu 1 . + CDS 55480 - 56346 663 ## CLH_2455 hypothetical protein + Prom 56428 - 56487 3.2 56 16 Op 1 . + CDS 56515 - 56934 377 ## CD1104 hypothetical protein 57 16 Op 2 . + CDS 56948 - 57127 167 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases 58 17 Tu 1 . - CDS 57172 - 57603 457 ## Cphy_3425 hypothetical protein - Prom 57633 - 57692 9.3 + Prom 57467 - 57526 9.3 59 18 Op 1 7/0.000 + CDS 57760 - 58332 691 ## COG0193 Peptidyl-tRNA hydrolase 60 18 Op 2 . + CDS 58347 - 61685 3041 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 61 18 Op 3 . + CDS 61733 - 62773 1199 ## EUBREC_0459 hypothetical protein + Term 62786 - 62825 8.2 - Term 62775 - 62809 6.3 62 19 Tu 1 . - CDS 62817 - 63263 267 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 - Prom 63287 - 63346 8.1 + Prom 63267 - 63326 7.6 63 20 Op 1 . + CDS 63410 - 63958 673 ## COG2002 Regulators of stationary/sporulation gene expression 64 20 Op 2 . + CDS 64021 - 64395 378 ## Cphy_3435 hypothetical protein + Prom 64463 - 64522 7.5 65 21 Op 1 . + CDS 64551 - 64694 173 ## + Term 64695 - 64751 1.1 66 21 Op 2 1/0.000 + CDS 64788 - 66191 1581 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 67 21 Op 3 3/0.000 + CDS 66195 - 68333 2669 ## COG0342 Preprotein translocase subunit SecD + Term 68340 - 68387 9.4 68 21 Op 4 7/0.000 + CDS 68400 - 70115 1510 ## COG0608 Single-stranded DNA-specific exonuclease 69 21 Op 5 9/0.000 + CDS 70185 - 70709 710 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins 70 21 Op 6 . 
+ CDS 70736 - 73036 2389 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 71 21 Op 7 1/0.000 + CDS 73046 - 73669 603 ## COG0491 Zn-dependent hydrolases, including glyoxylases + Term 73694 - 73740 5.1 72 21 Op 8 . + CDS 73753 - 75192 903 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases 73 21 Op 9 13/0.000 + CDS 75183 - 76439 1437 ## COG0124 Histidyl-tRNA synthetase 74 21 Op 10 . + CDS 76458 - 78257 2007 ## COG0173 Aspartyl-tRNA synthetase + Term 78304 - 78363 11.3 - Term 78357 - 78407 6.0 75 22 Tu 1 . - CDS 78447 - 78800 289 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|330401009|gb|ADLB01000003.1| GENE 1 59 - 262 192 67 aa, chain + ## HITS:1 COG:CAC0656 KEGG:ns NR:ns ## COG: CAC0656 COG3666 # Protein_GI_number: 15893944 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 67 123 189 189 93 70.0 7e-20 MNRSIQVEGAFGVLKNDYEFQRFLLRGKSKVKLEILLLCMGYNINKLHAKIQKERTGSYL FLVKETA >gi|330401009|gb|ADLB01000003.1| GENE 2 579 - 1976 1383 465 aa, chain + ## HITS:1 COG:CAC2330 KEGG:ns NR:ns ## COG: CAC2330 COG2148 # Protein_GI_number: 15895597 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Clostridium acetobutylicum # 55 446 58 444 461 221 37.0 2e-57 MSYWIRKFEATMWLVIKAILYLSLLGVFMLILSEENPSLIRLSRTMGITMTTFAIVGILF LRIYGMYDVGRRKSKPIIYSISLATFFTDLVVYLQLMIMNTITPSITAFRLSSIGALIIA YIVQLIVIVIFAYAGNALFFKIHEPDSCCIITSSQESLDCITRAISRFRKQYKIDYVLDY KDKKIWETIDKADTVFIQDVPVVERSEIIRYCYKKKTHIYFNPEIEDIVEMNAKYYLLDD VSVLNANVKAWTMEQRIAKKLLDMGLAIVLGILTSPIWIVSAIAIKAYDGGSIFFKQKRA TLNGRVFEVYKFRTMKENVENRSVTDDDDRITKPGKILRKIRMDELPQLLNILKGDMSFV GPRPEMLENVNEYEKQLPEFRYRLRVKAGLTGYAQIAGKYNTTPKDKLIMDMMYIEQFNI WKDIQLIFQTFIVLLKSDSTEAFKTNGGGPEYVFHAAESDEKEYE >gi|330401009|gb|ADLB01000003.1| GENE 3 1976 - 3076 1025 366 aa, chain + ## HITS:1 COG:Cj1439c KEGG:ns NR:ns 
## COG: Cj1439c COG0562 # Protein_GI_number: 15792757 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-galactopyranose mutase # Organism: Campylobacter jejuni # 4 365 2 363 368 449 60.0 1e-126 MKKYDYVIVGGGLFAGTFAYFARKQGKKCLVVEKRETLGGNIYCEDVEGIHVHKYGAHIF HTSNRQVWDFVNSLVEFNRYTNSPIANYKGEIYNMPFNMNTFSKMWGVATPKEAKEIIDR QRAVISGEPQNLEEQAISLVGEDIYKKLIKGYTEKQWGRDCKDLPSFIIKRLPVRYTYDN NYFNDLYQGIPVGGYNVLINALFEGCDVELGVDYNENREKYSALGEKVLYTGTLDSLYDF CYGKLEYRSLHFESEVLNEENHQGVAVVNYTDRETLYTRVIEHKHFEYGTQDKTVITKEY PADWKEGMEPYYPINDEKNQELYQKYRAKADKESNLILGGRLAEYKYYDMDKVIESAFQL VEKELA >gi|330401009|gb|ADLB01000003.1| GENE 4 3084 - 4112 517 342 aa, chain + ## HITS:1 COG:CAC3042 KEGG:ns NR:ns ## COG: CAC3042 COG3594 # Protein_GI_number: 15896293 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose 4-O-acetylase and related acetyltransferases # Organism: Clostridium acetobutylicum # 7 286 2 270 337 96 30.0 9e-20 MKVGMNQREKSFDIAKGIGMLAVVLGHMSIPAKMGDFIFSFHMPLFFLINGYFFKKKEIS DSKYIITKVKTLILPYIMTCIFVIAFSTLFQLRQGIGMTGILENAKSWILAALYGSGTFT HFLKWNFRIIGAIWFLLAMFWADVIFHFLLKSKNLYVWGTVLSIIGYFTAKIVWLPMSVQ AGLSAIFFIQIGYFIKQKNMLQIYGNNKQVFCLSAIFWGIAVIYSGKLYMVGNNYGGGFL DVIGAVSATFLILKFSKFLEKKCGKIAELLSLYGKNSLIVLCFHLVELNTFPWSVLSGKF SYPTGLVVVFACKVIWSIAMIYVVHKITLLRNVYCGKQKSKR >gi|330401009|gb|ADLB01000003.1| GENE 5 4187 - 5218 1018 343 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 1 320 9 329 344 132 28.0 8e-31 MPVYKVEEYVGKAIESIQAQTLKDWEFLIVDDGTPDKSGEICDAYAEKDHRIRVIHKENG GAPSARNMAIEMAKGEYFYFLDSDDWAEPTMLEDMYNLAKRDNAQLVVAGFYIDTFIGDG QFMTDNYVVEDAVYPNKETFRRNAYKLFDKNLLYTPWNKLFEAKYVMENNLRFPTTFWDD FPFNVSVVRNVERVTVTSKQYYHFLRARTESETAAYRPGMYEKREEEHGWMVNLYKEWKV GGPESMEMIARRYMERFVGCVENITNPKCEMTAKEKKKEIKKMLQNPRINEMGKLAKPRS LYMKILLLPIKWHCTFLTYLEAKTITYVKTKNTKLFTKLKVGR 
>gi|330401009|gb|ADLB01000003.1| GENE 6 5218 - 6276 959 352 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 7 307 5 300 344 130 30.0 4e-30 MENRIAVSVIVPAYNAEKVVENCIGSISKQSLKDIEIIAVDDGSKDSTRKVLEKLAEGDN RIRLIKKDKNEGLSAARNSALEIATGEYVGFVDADDWVEEDTFERMFREGKGADLIVSGY KHDTMEENRKQVNISREVKTQPGYWNKKAEVISRAAYIDTAKMFAYTWNKLYRREIIERK KLVFSKQVLIEDFIFNTEFWNEISTLSVVCCTGYHYIKASKDALTQKFLPDFLQIMNLRF DCMKNLLEKNGVYEENCKEQVANIYIKHAVAGIVRNCSPAGNYSFKEQFDRVKKLLKDTH SKEACQNAKGNSKQEKVCNLVFKSKVALMNLLLGKMIYAMQTKSKTAFDKLK >gi|330401009|gb|ADLB01000003.1| GENE 7 6290 - 7378 749 362 aa, chain + ## HITS:1 COG:HI1698 KEGG:ns NR:ns ## COG: HI1698 COG0438 # Protein_GI_number: 16273585 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Haemophilus influenzae # 1 359 1 353 353 123 27.0 5e-28 MKRIMFISWDMSILGGINQVLVGLANKLSKDYEVYIVSLVKSGEKTKYILNSEIKGLEYL TEEDCRGREVLIKGRKKLRKLIKEKEIDILFLMGFQVSLPVILMTFGLKCKNVFCDHEAL LSRWHEKKITMVRYMTSIFSKKVITLTKQNAEDYKEKFRLSDKKVDFIYNSITEEVLNNC SEYNSDSKIILSVGRFSKEKGYDILVEVARKVLGKHEDWKWYIYGNGDTFFEIEQQIKKE KLDKQVILKGEVSDVSSIYGQAGIFVLTSYREGLPLVLLEAKANHLPCVSFDIISGPKEI IRDKVDGILVPPYDREKMAETIEKLICDTSLRKKMAEKAEENLSKFSEKEIMKQWKQLIE EL >gi|330401009|gb|ADLB01000003.1| GENE 8 7393 - 8676 650 427 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|293401746|ref|ZP_06645887.1| ## NR: gi|293401746|ref|ZP_06645887.1| hypothetical protein HMPREF0863_02027 [Erysipelotrichaceae bacterium 5_2_54FAA] # 7 422 17 430 444 184 31.0 7e-45 MKSLFWEKGEFKLWILLLCAYHFFPFITTKGSIVLYIWAYVIPLAYIALNLNYLKKIVSS IAHSEALICVSGIVLLICLSLIIPVIYNTGDFTYLTDSILGMIKILIRMLFIVLVIIKNI PEATKETFMKYFIYSCCLYICSTIIMMLFPSIKDIFYQLIKESDHSKLMAMDARYKTRYG WGGFSGFEYTFKCVLAIIFNSYLIETRIKQKRIWLNVGISLFLLVGTLFYGRVGSLFGGC VLVVLFIRLMKKRPKILVGVIVAGLMGCIALFVLQSRNETIKTWFNWAFDLFVTFAKTGK 
FETESSNVLIEQMLFVPEIKTILFGDGMYTTLTGYYMSTDAGIMRSLLFGGLGFALLRYL SFYIPLGLSMFKKRMNSADKTMYLWVLLLCVVFEIKGEILFSCIPIFIWIMAMEQYERWR KKEWNEI >gi|330401009|gb|ADLB01000003.1| GENE 9 8661 - 9296 742 211 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MERDLSFGKLCKMYILNIWVIILAGIIVAGAMGAVLKGKSEASVTQSVYLVYDLDTEQKE TLETKRSVYFDAYKGLISGNTLLDSNEFSKEEKALLSGITTEVESSCYTLTMQAQNVEQD EELLNRYVKASEKWMQEKYKDDSISVETLKKVSTSNEGGISIMKIALGFIVGAILAALGL FIWFVSDKKIRTEEDVQYYTELECLTTVRRR >gi|330401009|gb|ADLB01000003.1| GENE 10 9299 - 9856 637 185 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRAMVDIENLREIVLLSESKVWGITSLSRKTGKTFVARQLVENLLKAGKKVVYLSFQKE SGEETKNFLEKELKKAENVSGSVKSVVLKDTLHLEEIVYSEKFVQLVEEYKKEYDYIIMD MMSMEENSIAKKIASVCENNFIVVAKDCENGEEVGKMVHQLKTLNIRLAGIVLNEYHEKK KWLRM >gi|330401009|gb|ADLB01000003.1| GENE 11 9868 - 10950 1321 360 aa, chain + ## HITS:1 COG:CAC2332 KEGG:ns NR:ns ## COG: CAC2332 COG1088 # Protein_GI_number: 15895599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Clostridium acetobutylicum # 1 360 1 351 351 456 63.0 1e-128 MRTYLVTGGAGFIGSNYIHYMLKTYGENIKVINVDKLTYAGNLENLSEVENLPNYQFVRA DICDRDSIEKIFAENEIDRVVHFAAESHVDRSIKEPEVFIKTNVLGTLVLLNAAKKAWEI DDGVYKEDKKFLHVSTDEVYGSLENSDEYFYETTPYDPHSPYSASKASSDFLVKAYMDTY RFPANITNCSNNYGPYQFPEKLIPLVINNALKGEKLPVYGDGKNVRDWLFVEDHVRGIDA VQEKGRLYETYNIGGHNEKQNIEIIHIILDTLLELLPENDERRKNISESLITYVTDRKGH DRRYAIAPDKIEKEVGWTPETKFEVGIKKTIQWYLEHEDWMKNVTSGSYQQYYKEMYGNQ >gi|330401009|gb|ADLB01000003.1| GENE 12 10980 - 11858 856 292 aa, chain + ## HITS:1 COG:CAC2333 KEGG:ns NR:ns ## COG: CAC2333 COG1209 # Protein_GI_number: 15895600 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Clostridium acetobutylicum # 5 288 2 285 288 407 68.0 1e-113 MKKRKGIILAGGTGSRLYPITKVISKQIVPIYDKPMIYYPLSILLLADIREILIISTPKD IDGFRNLLGDGHKMGIELSYAVQEQPNGLAEAFIIGEDFIGDDDVALILGDNIFYGQSLS DVLKNATAREEGATIFGYYVKEPSAYGVVEFDDELNVLSIEEKPENPKTNYAVPGLYFYD 
NDVVEIAKNVQPSARGEKEITSVNNEYLKRGKLKVELLGRGFAWLDTGTPDGLLEAANFV ATFQKRQGLYVSCIEEIAYKRGFIDSKQLEKLALELPNTPYGEYLLELSKGH >gi|330401009|gb|ADLB01000003.1| GENE 13 11880 - 12794 674 304 aa, chain + ## HITS:1 COG:YPO3098 KEGG:ns NR:ns ## COG: YPO3098 COG0463 # Protein_GI_number: 16123272 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Yersinia pestis # 7 226 3 206 247 68 25.0 1e-11 MKQDLLVTIIVLTYKDFSGLNKTVEAILEQTYQNIEIIISDDGSENYQEDMFLPFQKKAE AENKKIILKHHAQNEGTVKNINGALELATGDIIGFLGCGDYYASKEIIKEIVDEFLKKDA EVVTGKMKGISLSNPQRTTCLPEKHLIKLLKEGNRDKIYRKMFNENCFCAPATFYKKDVY DKVGKYDERMRLIEDYPFMFKLVRNHIKIVFMDRYVTIYLFDGVSSGKQSPAICADLEKI KKYVLLPHIEECDRRSRRLFLYNYERKNTKSVTEKLCVTIKYFDQFLYWQYKRVDGICKA QRRK >gi|330401009|gb|ADLB01000003.1| GENE 14 12794 - 13252 374 152 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2596 NR:ns ## KEGG: Cbei_2596 # Name: not_defined # Def: WxcM domain-containing protein # Organism: C.beijerinckii # Pathway: not_defined # 24 142 17 135 135 136 56.0 2e-31 MVSNTKIVEFKDIINTKNSTKYMGHLVPIEVGEDIPFVVNRLYYITDVPKNETRGYHSHN DLEQVLICLHGSVTIKVNTPYEEEYVVLNKVNQGLYIGPMVWREMYDFSDDAVLLVLASK HYDEADYIRDYNEYCHMAEAYFEGGRKAHDTI >gi|330401009|gb|ADLB01000003.1| GENE 15 13239 - 14333 1238 364 aa, chain + ## HITS:1 COG:all0498 KEGG:ns NR:ns ## COG: all0498 COG0399 # Protein_GI_number: 17227994 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 2 360 8 374 395 301 42.0 9e-82 MIPFNDFHIMHKELKNEMGEMFQQVFDKNWFIQGEQEETFNKKFAQYCGAKYCIGVGNGL EALRMILQAYDIGEGDEVIVPSNTFIATALAVTYVGAKLVFVEPDIETYTINPDLIEEKI TERTKAIIAVHLYGQTCDMDGILEIAKKHNLKVIEDAAQAHGAEYKGKKAGNLGDAAGFS FYPGKNLGALGDAGAITTNDEELAKKVRAIGNYGSEKKYNHIYKGTNSRLDEMQAGFLNV KLGHLDKWNERRRKIADRYLNEIKNPNIVLPTVKEENVPVWHIFAIRSERREELIEYLSE NNIGTMIHYPIPIHLQKAYEDLGHKQGDYPIAEKISNEQISLPMFYGLKDEEVDHIISCL NAWK >gi|330401009|gb|ADLB01000003.1| GENE 16 14348 - 15127 725 259 aa, chain + ## HITS:1 COG:TM0759 KEGG:ns NR:ns ## COG: TM0759 COG0110 # Protein_GI_number: 15643522 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Thermotoga maritima # 1 225 14 239 254 112 33.0 9e-25 MKRGKNVILEEGVFVGNDVILGDNVYIERGTIIHDNVEIGANTFIGANSILGEHLAGYYR DRENYKQPKLVIGEGSLIRSGAIIYGDVKVGKEFQTGHRVTIRENTTIGDCVRIGTNSDI QDGVEIGNYVNIHSDVFISADNKIHDYVWICPRVLFANDFTPPSNEIKGSIVESFSTICS NTTILPGTHIRQNVLVCAGAVMGGDSEEGFTYKGIPAKKGKPISEVKNHITGEEAYPWPL HFDRGMPWEKETFEEWSNS >gi|330401009|gb|ADLB01000003.1| GENE 17 15124 - 16569 1385 481 aa, chain + ## HITS:1 COG:MA3764 KEGG:ns NR:ns ## COG: MA3764 COG2244 # Protein_GI_number: 20092562 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Methanosarcina acetivorans str.C2A # 36 399 37 407 492 66 26.0 9e-11 MKRFINLYKNLSNEAKASAWFVVCNIIQKGISFFTIPIFTRLLTTEEYGMTNVYQSWMSL IIIFATLNLQYGVFNTAMIKFEEDRDRFISSLQGLSTAFTVGIFAIYFIAHKSWDSLLGL PFVLMLTMFAELFTSPALGFWSGKQRFDIKYKGLVSLTLLIAVLSPVIGILLVIPAENRG VARIIGCAMANIIVGLILYFYNWKKGKSLYVKEYWIFALKFNLPLIPYYLSQMVFNQSDR LMIDYLSGRDKAGIYGVAYSMGLVLNFVINAINGSFVPWTYKSIRDKKQKNIKNIANGIS ILVAVMLSLLILITPELMRFIAGPEYYEAIWVVPPIAASLFFLFMSQLSINIEFYFGENT LLIKGSIISALVNVVLNFIFIQMFGYIAAGYTTLIAYIIFCLTNYSCMKKICKKEFGEEN WKLYDTKFLSILSIGFLITMVVLTVLYPFVIVRYVIIAAILFVLFLKRKLIIEKIKEVRK G >gi|330401009|gb|ADLB01000003.1| GENE 18 16589 - 17947 941 452 aa, chain + ## HITS:1 COG:no KEGG:DSY3308 NR:ns ## 
KEGG: DSY3308 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 50 420 403 764 789 130 25.0 7e-29 MLNRFKAKINEIYLDSTIPYNNISKKMVCALKQRITKEDFKTFQKLKKLKKKQNNLDKKI RIVFLLQMPEVWGKQQIVYEEMKKRDNIETIIFTIPEYDIKTGGHKEDIAYDYAVKEGLT EIVQSEKNGNWVSLKELQPDYVFYQRPYETYLPEVYRSKNVLEYAKTCYIPYAIFASISS AMNLEYERGFARNIYYHFVSNLEMEHLVRHKFSITSKLGLRQVKYLGVPILESVLKTAVK GEDNEVWKNWGSQQGQLKVLWTPRWTVDEKLGGSHFFHYKDKFTELVRDDKNIYFSFRPH PMAFDNYVKEKLMSVEQVEKLKKEYAETSNMVIDSHRGYVDTFWGADVLITDISSMMMEF FVTGKPIIFCGTNMALDSLHKEVVGTLYKGNTWQEIADALEQLKSGNDYLQEQRSQVIKR WFANMDKTSEKIVDAIVKDYQLSCTTNKKMIY >gi|330401009|gb|ADLB01000003.1| GENE 19 17986 - 19212 1362 408 aa, chain + ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 408 1 413 414 465 57.0 1e-131 MANIKPFCSVRPNESLAREIAALPYDVYNREEAKQVVEKNPKSFLKIDRAETQFDDDFDM YSEEVYQKAHDTLAEMIDDGEFVKDEKPCFYVYELIMNGRSQTGIVGCASVDDYLNNVIM KHENTREDKELDRIHHVDTCNAQTGPIFLTYRANEKINHIVTDKKQEKALYDFVAEDGVS HRVWRIDAVEEVEKIAELFAAIPHIYIADGHHRAASAVKVGLKRRNENPNYTGREEFNYF LSVLFPDEELMVMDYNRVVKDLNGYTKETFLEELEKDFVICPKGEMPYSPKAKGEMGMFL GKQWYSLTFKGKVSENPVESLDVSILQNKVLEPLLAIEDPKTDKRIKFIGGIRGLKALEE QTKEGVAFSMYPTSMRELFAVSDAGLLMPPKSTWFEPKLRSGLFIHQI >gi|330401009|gb|ADLB01000003.1| GENE 20 19225 - 21039 1703 604 aa, chain + ## HITS:1 COG:mlr5890 KEGG:ns NR:ns ## COG: mlr5890 COG0079 # Protein_GI_number: 13474906 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Mesorhizobium loti # 229 603 66 437 449 173 28.0 1e-42 MQAIILAAGMGKRLKELTSNATKCMVEVNGVTMIERMLSQLDALKLNRIVIVVGYEGKKL MEYIRSLNISTPIEYVDNDIYYKTNNIYSLYMAKDYLVQDDTLLLESDLIFEDSVLQRLL DNPYPSLALVAKFESWMDGTVVTLDEEDNIKNFLGKKDFVFEDIPNYYKTVNIYKFSKEF SNSHYVPFLEAYSKALGNNEYYEQVLKVITLLDKPEIKAERLGHESWYEIDDVQDLNIAE 
SIFATKEEKLNKFNRRFGGYWRYPQLIDFCYLVNPFYPNQKLVDEIKANFERLLCEYPSG MEINSLLGAKYYGLHKEQVCVGNGAAELIKSLMENHEGKVGMIYPTFEEYPHRKKEEDIV PYWIESKDFRYSADDLMSFYEDKDIQFLVLINPDNPSGNYIEKNDVLRVAEWAEKKNIKF VVDESFVDFAETEGSSTLLEEEIIKAYNNLIVVKSISKSFGVPGLRLGILASNDLQLISD MKKDVAIWNINSFAEFYMQIFEKYKSNYEEALVKFKEVRKEYVELLNGIEYLRVIPSQAN YLMCELTGQMTSRELTEILLNEYNILIKDLSSKNGFSGKSYIRVAVKRPEENMKLVEAIK SVLI >gi|330401009|gb|ADLB01000003.1| GENE 21 21055 - 21999 716 314 aa, chain + ## HITS:1 COG:RSp0418 KEGG:ns NR:ns ## COG: RSp0418 COG2423 # Protein_GI_number: 17548639 # Func_class: E Amino acid transport and metabolism # Function: Predicted ornithine cyclodeaminase, mu-crystallin homolog # Organism: Ralstonia solanacearum # 82 294 103 326 340 77 26.0 4e-14 MKIITFEDILKLNISPFECFEWVSSAIEEKKKALLPAKISLKPEIEGVFYNTMPVILPSI NYGGVKLVTRYPKRNPSLDSEILLYDLKTGENVALIDGNWITTMRTGAVAAHSIKLLANP EFSVIGFIGLGNTARATLKVLLSIFPEKHFKIKLKKYKNQHELFQQVFAKYPNVTFSYYD TMEEVIKGSEVVVSAATVFEEDVCSDDCFDEGVLLVPIHTRGFTNCDLFFDKVFADDTNH VKGFKNFNQFKSFAEISDVVLKNVCGREDKKERILVYNIGIALHDMYFAGKIYEKMKECE EISLNAPTKKFWVE >gi|330401009|gb|ADLB01000003.1| GENE 22 22490 - 23629 1196 379 aa, chain + ## HITS:1 COG:CAC3046 KEGG:ns NR:ns ## COG: CAC3046 COG1316 # Protein_GI_number: 15896297 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 50 356 28 337 341 153 31.0 6e-37 MAKKEAGNPRRTRTARSRRRRRKRGNWFLRLSIGKKVAVCLLGVLICLIASGVIYVSAKL NKLNTEDIPKKNIAVNDLEEGVGEGFTNFALFGSDSRAAGADEGTRTDCIIVASLNNKTR EVKMVSVYRDSLLDIGEGTLNKCNGAYSHGGAKQAIDMLNTNLDLEIEDYVTVDFAAISD VIDLLGGVEIEVSEVELPYLNKFLGETAQVAGKKANPVTKAGMQTLDGVQATTYARIRST KGGDFKRAERQRYVIEKMVEKALKADLATINKIIDVVLPKIKTSLSATEILNYAKSFNKY TLGDNIGFPIEKTTDTLPGLGSVVIPVTLESNVVQLHEFLYGEEDYQPSSKVQSISSKLQ KKVGKRVADPETQWESPSE >gi|330401009|gb|ADLB01000003.1| GENE 23 23792 - 24931 1309 379 aa, chain + ## HITS:1 COG:CAC3063 KEGG:ns NR:ns ## COG: CAC3063 COG1316 # Protein_GI_number: 15896314 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium 
acetobutylicum # 39 299 25 292 339 135 31.0 1e-31 MSENKKLREEKRLAKQKKRKRRRIKRAVVLIAEILILLALCGAAYVMAKYDKFQTVTFSK GDIQSNEGVKQEGYMTVALFGGDSRNGQLEKGAHADTIILASIHHDTKEVRLASVYRDTL TQQVSGKIQKANYAYFAGGPKDAINMLNKNFDLDIQDYVTVDFKALADVVDLLGGIELEV TDQEAKEINNYIDETGTVTGKKATHLTNGGTLKLDGVQAVTYARIRKNVGGDYKRAERQQ KVIAKVVEKAKTMDLKTINKIINKVFPQISTSFSLADMIGLASGALDYKLIETTGFPMES MNGRVDKVGSVIVPVGLVENVQELHQFLYPKEEVKEVSDTVKSIAEKIETLTGITRDKLN DPSADISKSSHSTAGEEKK >gi|330401009|gb|ADLB01000003.1| GENE 24 25040 - 26440 1505 466 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00559 NR:ns ## KEGG: EUBELI_00559 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 95 457 194 532 533 80 24.0 2e-13 MKIMKRIAAVLLALCLTIPMYSVMSYAASGRVSFTDLQTKTGETVEVACALRAGTGSLES FDITLKYDASLLSFKEGNGVTKESDGVLKFSGKGDGTNKIRFTMKFQALKAGTTKIEVVN STGVVTGGETVECVNGTSTIQIAEGTTPQTETTEQTSTENGGDASQSGITVSGKSYQFSE DFNSSSIPVGFVETKLSYNGGERKFVRQENGSIVLGYLVDAENKGDFFLYNEEDATFSPY VQVTVSPSTSIVLLENNEGVKVPSGYQKVKLTVNEHEFPAWQDKKNEGFYLVYAMDSKGT KGFYQYDTEQESYQRYVSDGTDGTDKAATKVSKLNNFITEHLSMVILCVGLGILLLVIII IVLAVKLRHRNLELDDLYEEYDIDVEDLDDEESDDKLVSLKEEKAIEESLIEEQKEEFIE DFEDEDFVDDFDSEDFDDEDFDDEDFEDFDFDDDDDDFDMNFIDLN >gi|330401009|gb|ADLB01000003.1| GENE 25 26654 - 28216 1654 520 aa, chain + ## HITS:1 COG:BS_yyxA KEGG:ns NR:ns ## COG: BS_yyxA COG0265 # Protein_GI_number: 16081088 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Bacillus subtilis # 123 516 12 392 400 165 34.0 2e-40 MDNEYKSGQGNEENQENMNSEVNFVLRQPEEEAVEEQVQSTEEVTEEQVQPVEEITEEQM QLAEETMEEFTQPAENILTEEPMENAQTEDNKTENGHFTEESHIGESYQMGSNIPPKNNV PPTPQRPKKKKGSKGWVKAVACVGLAVLFGVVASGTYQVSNYVTDRLLGRSTSSSKETAK VNTTKVSNSTKTVTSDVTEIVKEAMPSVVSITNMSVQEVQSFFGGTQMQESQSSGSGIII GKNDTELLIVTNNHVIENSTSLTVSFIDNESVEGVVKGTDASRDVAIVAVPLSKIKSETL EEISIATVGDSSKLNVGEPAIAIGNALGYGQSVTTGIISATDRELDGFKGKLIQTDAAIN 
PGNSGGALLNANGEVIGINTVKINAEAVEGIGYAIPLSDIKDLLENLMNKETRTKVSDAQ RGYLGIAGFDVTEESAKMYNMTKGVYVSEVVKGGGADEAGVIQGSIITGFEGVGVDTMQA LQEQLQYYKVGEKVKVTLQVPADKGNYKEETVEITLGKKQ >gi|330401009|gb|ADLB01000003.1| GENE 26 28346 - 30586 2125 746 aa, chain + ## HITS:1 COG:CAC3655 KEGG:ns NR:ns ## COG: CAC3655 COG2217 # Protein_GI_number: 15896888 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 8 745 78 813 818 739 54.0 0 MKTEKYTISGMSCAACSSAVERVTRKLEGVSESNVNLTTGKMTITYDDTVLTRENIITKV EKAGFGVTLDVEKSKEEKQHEKDELRESIKRTKRHLITNLILAIPLLYISMGHMLPITLP LPKWLDMAENPLHFALAQCILTVIILYNGRKFYLVGFKSLFKGNPNMDSLVAIGTGSAFL YSLVMTIRIPYDVSGVHNLYYESAAIVVTLVMLGKYMEGRSKGKTSEAIRKLMELAPDKA IVLRDGKQIEVLVEEIKVGERILVKPGNKIALDGVIVEGNTSVDESMLTGESIPVEKEKG MTVIGGSINYQGAIQVEVTRVGEETTLAKIVKLMEEAQGKKAPISKLADIVAGYFVPTVM VIAVVSAIIWAILGHDLAFVLTIFVSVLVIACPCALGLATPTAIMVGTGLGANHGILIKS GEALETTHKVDTVVLDKTGTITEGKPKVMGIISHDMEEEKLLRIAASCEQNSEHPLGQAI VEEAKERGLKLDGTESFNSITGQGIQAVLKGTEYYIGNKKLCEELKIDMGGNEQEAQNMA RKGQTPMFVIANKKVVGIISVADPIKETSKEAIKQLKGLGITVYMLTGDNRLTADYIGKK VGVDKVVSEVLPQDKVSVVEELQKQGKRVMMVGDGINDAPALVQADVGMAIGSGSDIALD SSDIVLMKSDLQDVYKAIRLSKETIRNIKQNLFWAFFYNACGLPLAAGALYLINGTLLNP IFAGLAMSLSSVSVVGNALRLRRLKL >gi|330401009|gb|ADLB01000003.1| GENE 27 30611 - 30907 332 98 aa, chain + ## HITS:1 COG:BS_yvgZ KEGG:ns NR:ns ## COG: BS_yvgZ COG1937 # Protein_GI_number: 16080405 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 11 98 12 100 101 79 50.0 2e-15 MEEKCCERTKHRESKEYKDMINRLSRIEGQVRGIRKMVGEERYCVDILTQVSAIQSALNS FNKKLLASHIHSCVVDDIQDGKVEAVDELCEIIQKLMK >gi|330401009|gb|ADLB01000003.1| GENE 28 30922 - 31119 394 65 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210608693|ref|ZP_03287970.1| ## NR: gi|210608693|ref|ZP_03287970.1| hypothetical protein CLONEX_00149 [Clostridium nexile DSM 1787] # 1 64 1 64 65 85 71.0 1e-15 MTVINVEGMHCEKCVERIEKAMNEAGLDYKVSLADKTVEIDGCEHCVKTAMEILDDLGFD GEIEA 
>gi|330401009|gb|ADLB01000003.1| GENE 29 31124 - 32860 1457 578 aa, chain + ## HITS:1 COG:no KEGG:EF3301 NR:ns ## KEGG: EF3301 # Name: not_defined # Def: hypothetical protein # Organism: E.faecalis # Pathway: not_defined # 260 578 14 328 328 138 32.0 6e-31 MNAKKGIVAAICIMCMFCIGKTGAVRATETVPEVYEDILRDEDYMEVSLGRCYPGKQIFY DIPKKKTVIEPFEEQKQIIFREGEKAEVYYGKYRGREEYDNFRLLSSMLDADFFFEPFLA LNITKEKIEEENFYVHIDEEQEVEYFSELLIFAEVLNGDEAYMESVDILFDEAYRPAQIQ FHFQSNILSEEQLAYLEKEMTQRYKYITEEEIDAKMKETEEKMSHTVAEDDVTIEDYVTE LPNILRAKSRYLIETLFYQGEELEKEAGWDKRAVLAYMEAVNTKFNLFGANGYEAQNVTE EEYLQEIERTYESAHQIGSDVYTYMFEASKISPKEKALLVIRQMGGYMDKNGMLQLGEGD TFAPEMAPHSVFLEDYAECVRKAYRKKHLDNEMIHQFRMYIDKHNIEYVRGYYEGKTDYD RLKSYAKNFNMKLYYGEPSRHHNKTEKEQKFKEQRYDKILTPNKLSEFIIDVKTGSFVTE WDVLTKTKSNRIQSTTASYENMEKSQKKKVINSESFNYAPADYNEEHHRLDVLPATPAAG RKKTYLENDFKRSLKKIWKSPKRTQYKEKYKSPKDYLK >gi|330401009|gb|ADLB01000003.1| GENE 30 32873 - 33544 654 223 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210608691|ref|ZP_03287968.1| ## NR: gi|210608691|ref|ZP_03287968.1| hypothetical protein CLONEX_00147 [Clostridium nexile DSM 1787] # 5 223 7 224 224 155 42.0 2e-36 MKKIMMLGVAFCMSLTLLSCQKEVDSSEKINDILENEKYMKIELGEDYHQSSVLFALQEE KAVVPKDVKWGLPKYREKDTIGVSFAEEGTEIYKADGEMYPELYAKAYLASIVNLDLSVK EVKDRNYSVKIADKEQLKWFEEELYTVSTLVIGDSYELSGVTVEFDREYRPVKKIFQLKK KNNMILGDEEDDSEKCIQEFSYDVGKMKFNREFSRVENLIGKD >gi|330401009|gb|ADLB01000003.1| GENE 31 33826 - 34446 500 206 aa, chain - ## HITS:1 COG:CAC1338 KEGG:ns NR:ns ## COG: CAC1338 COG3546 # Protein_GI_number: 15894617 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Clostridium acetobutylicum # 1 185 1 185 200 253 62.0 2e-67 MWNYEKRLQYPVKITQTNPKIAQVIISQFGGPDGELAASMRYLSQRYTMPYKQVTGILTD IGTEELAHMEMICAIVYQLTKNLSVEEIEKYGFDKYYVDHTLALWPQSAGGTPWTATYFQ SKGDPITDLHEDMAAEQKARTTYDNILRLVKDPEVCDPIRFLREREIVHYQRFGESLRIV QDNLDSKNFYAFNPEFDKTQNCNCRR >gi|330401009|gb|ADLB01000003.1| GENE 32 34450 - 34716 344 88 aa, chain - ## HITS:1 COG:no KEGG:Dtox_3011 NR:ns ## 
KEGG: Dtox_3011 # Name: not_defined # Def: CotJB protein # Organism: D.acetoxidans # Pathway: not_defined # 19 84 17 82 85 65 42.0 7e-10 MNNVPSRKELLHLINVASFSVDDVKLFLDTHPDNREALAYFQEYNAIRTQALKDYARLYS PLTLDSITTCSDYWKWIDEPYPWQEGGC >gi|330401009|gb|ADLB01000003.1| GENE 33 34713 - 34925 125 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210608690|ref|ZP_03287967.1| ## NR: gi|210608690|ref|ZP_03287967.1| hypothetical protein CLONEX_00146 [Clostridium nexile DSM 1787] # 1 70 1 72 73 79 53.0 8e-14 MQNFRNNMTFYNPENLSCQRPDSLSSMPLAMAYVPWQRWENILDADKGFHCGTIFHDLRK PFEHGGGGCR >gi|330401009|gb|ADLB01000003.1| GENE 34 35042 - 36202 1281 386 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 36 380 26 362 366 192 38.0 6e-49 MTGTRIKKNREKKTVEYFDYSLLAVLIFLIGFGLLMLYSTSSYSAKMKFGDGMFYLKNQL KAYAVSFIAMWIVSNIDYHWYAKYSKAIFLAAMVVMALVFVPGVGIEAYGAKRWIKVPLM GQMQPSELMKIAIVLFIPAMICKIGNKIGRKEGLFCILGLGAIGAAGVLFLTDNLSTAII VMGMSCIMFFVAHRKTAPFIAIGAAMIAGVFIVAQVLGKVLTDSTDFRVRRILAWVNPEM YASEGSYQSMQALYAIGSGGFFGKGLGNSAQKIIIPEAQNDMILSIICEELGVFGMMIVL ILFGILLYRLAFIAQNAKDSYGSLIVTGIFSHIALQVIFNVCVVMNIIPTTGITLPFISY GGTAALFLMIEMGIAFNVSRTIKVAV >gi|330401009|gb|ADLB01000003.1| GENE 35 36366 - 37205 974 279 aa, chain + ## HITS:1 COG:SP0742 KEGG:ns NR:ns ## COG: SP0742 COG1307 # Protein_GI_number: 15900637 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 277 1 276 281 140 31.0 3e-33 MSYKIVVDSCGELTEEMKQSGKVESVALSIQIDNENIIDDDTFNQAEFLRKVAESSNSPK SSCPSPECYMDSYKCDADRVYVVTLSAELSGSYNSAVLGKSIYAEEHGEKKIHIFNSRSA SVGETLIAKKIMECEEQGYGFEEVIEAVDAYIQGQNTYFVLETLETLKKNGRLTGVKALV ASALNIKPVMASTSQGTICQLGKSRGINKALDKMVEYVIKDAVDMEEKTLAIAHCNCYER AMSVKAMLEERGHFKDVIILDTRGISTMYASDGGIIVVV >gi|330401009|gb|ADLB01000003.1| GENE 36 37259 - 37495 285 78 aa, 
chain + ## HITS:1 COG:no KEGG:no NR:gi|210608684|ref|ZP_03287961.1| ## NR: gi|210608684|ref|ZP_03287961.1| hypothetical protein CLONEX_00140 [Clostridium nexile DSM 1787] # 1 76 1 76 82 79 53.0 8e-14 MSQEKVNRYKEQKANRKAIMKKEKRMKIFRNTVTAVVVVAVLSWVGYSGYNSYVDSQPVE KTEVDYTSIAEYLTGLAE >gi|330401009|gb|ADLB01000003.1| GENE 37 37680 - 38117 460 145 aa, chain + ## HITS:1 COG:BS_yllB KEGG:ns NR:ns ## COG: BS_yllB COG2001 # Protein_GI_number: 16078577 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 138 1 138 143 174 58.0 4e-44 MLTGEFNHSIDAKGRLIIPSKFRENLGENFVITKGLDGCLFLYPDNEWKTFEEKLRTLPL TNKDARIFTRFFLGSAVDGGLDKQGRVLISSALRNFARLEKEVVLVGVLDRVEIWDKAKW EENNTVIEDNMDDIASHMEELGLGI >gi|330401009|gb|ADLB01000003.1| GENE 38 38136 - 39071 767 311 aa, chain + ## HITS:1 COG:CAC2132 KEGG:ns NR:ns ## COG: CAC2132 COG0275 # Protein_GI_number: 15895401 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Clostridium acetobutylicum # 1 310 1 309 312 360 59.0 2e-99 MTFEHKSVLLEETVNGLNIKPDGIYVDGTLGGGGHAFEVCKQLSDKGSFIGIDQDEAAIE AAGIRLRDFGERVTIVRSNYCDMKLQLQKLGIDKVDGIVLDLGVSSYQLDTAERGFSYRV DVPLDMRMDRRQEMTAKDIVNTYSEMELYRIIRDYGEDKFAKNIAKHIVLEREKGSIETT GQLIEIIKRAIPMKFQKNGGHPAKRTFQAIRIELNRELDVLRDSLDEMIDMLNENGRICI ITFHSLEDRIVKSIFRRNENPCTCPSHFPVCVCGNESKGKVITRKPILPSAEELEYNSRS KSAKLRIFERC >gi|330401009|gb|ADLB01000003.1| GENE 39 39119 - 39607 429 162 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2253 NR:ns ## KEGG: EUBREC_2253 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 15 161 5 158 158 69 29.0 3e-11 METNRRPAYRRQTYTYYDYGNTVRKPDIVPKRHEETHRPEQKKVSRQIRKNRKNALHMNK GYVVFLSIAVVIAFVVCLQYLQLQAEITNRSQNITSLQRELAEKKEVNTTRFNSALDSVN LEEVRRKATEELGMVHAEDNQIITYDNPKESDVKQYKDIPKK >gi|330401009|gb|ADLB01000003.1| GENE 40 39631 - 41502 1722 623 aa, chain + ## HITS:1 COG:CAC2130 KEGG:ns 
NR:ns ## COG: CAC2130 COG0768 # Protein_GI_number: 15895399 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 22 617 12 588 729 300 34.0 4e-81 MTKNRKKRGGLKFIRQKFPKRMQKKLVLVFMFTILAFVLFLGKAIKIVATKGEDYKKAVL DQQQNNSRVIPFKRGDILDANGTKLATSERVYNVILDAKVLLSGKEEELAKSKEQTVEAL ETCFEIDKKTVYTVLEEKSESRYNVLKKGVSYKQYQAFEKLKKDTKNYGNLKGVWLEEDY VRTYPYNTLASDLIGFTASGNVGNGGLEASYNDILNGTDGREYGYFGETSSVEKIVKEAT DGKSLVTSIDLNLQSIIEKHIREFNEEHKNGPNKTGLGSKNTAVIAMRPGTGEVLAMASY PNFDLNNPRDLSGLFSEEQLKNMSEEDKIKELNKLWRNFCVSDTYEPGSTMKPFTIAAGL ENGTLKGSETYVCPGYLDVGKWRIKCNKKDGHGTQTLKQAIANSCNVALMHVGAAIGPEE FSKYQHIFGFGEYTGIDLPGEASTDKLLFAPDKMGNADLATNAFGQNFNVTMLQLAAGFN SLINGGNYYEPHVVKQIRDADGNVIENKTPVLTKKTISEETSKLLRTYMESTVTEGTAKA AQVPGYAIGGKTGTAEKYPRNHGKNLISFIGYAPQENPEIMLYVVIDEPNVENQANSSYA IKLAQKIMAEAFPYMKITKTATN >gi|330401009|gb|ADLB01000003.1| GENE 41 41657 - 43324 1270 555 aa, chain + ## HITS:1 COG:BH2572 KEGG:ns NR:ns ## COG: BH2572 COG0768 # Protein_GI_number: 15615135 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Bacillus halodurans # 11 555 44 577 644 367 40.0 1e-101 MIVDAPYYQKRAEALHEREREIKAARGEIIDAKGKVLATNKTVCTISVIHSQIKDKEKVI RILSDELGIGETLVREKVEKVSSMERIKTNVDKKTGDKIREYELPGVKVDEDFKRYYPFG NVASKVLGFTGGDNQGIIGLEVKYEEYLKGKNGRILTTTDARGVELEGIAEDRMEAIPGN TLHISMDYNIQKYAQQAAEKVMKEKQADKVSVLVMNAQNGEILSMVNVPEFDLNEPFTLS GNENAEISDEKKQELLNQMWRNGCINDTYEPGSTFKIITSALCLEEGVVKESDRFSCPGY RVVENRRIRCHKAGGHGSETFVQGVQNSCNPVFIDIGLRLGADRFYDGFYKIGLMDKTGV DLPGEAGTIMHKKSNIGPVELATMSFGQSFQITPIQLATTVSSIINGGKRVTPHFAVCVT DREGKEVERFQYKTKRNGISEGTSEKMRTILESVVAQGSGKNAYIPGYKIGGKTATSQTL PRSANKYISSFIGFAPADNPKVVALVVIHNPKGIYYGGTIAAPVVKDIFSNILPYLGIEK NSKLEYTNEMKKQKR >gi|330401009|gb|ADLB01000003.1| GENE 42 43336 - 44292 1265 318 aa, chain + ## HITS:1 COG:CAC2127 KEGG:ns NR:ns ## COG: CAC2127 COG0472 # Protein_GI_number: 15895396 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Clostridium acetobutylicum # 5 318 4 316 317 247 49.0 2e-65 MEYNIVIPVLVAFGLSVLLGPVVIPFLRKLKMGQTERVDGVQSHLKKAGTPTMGGVIILV SVAITSVFYIGDYPKIIPILFVTLGFGLIGFLDDYLKVVMKRSDGLFPKQKMALQILVTA VFAYYMVNFTDVSLDMVIPFTNGKTWDIGWLAIPLLFIVVIGTVNGVNFTDGLDGLASSV TVLVATFFTVVAIGTKSGIEPITCAVVGALLGFLLFNVYPASVFMGDTGSLALGGFVAST AYMLQMPIFIVIVGLIYLVEVLSVMIQVTYFKKTGGKRIFKMAPIHHHFELCGWSETRVV AVFSIITAILCLIALIAM >gi|330401009|gb|ADLB01000003.1| GENE 43 44306 - 45661 1685 451 aa, chain + ## HITS:1 COG:BS_murD KEGG:ns NR:ns ## COG: BS_murD COG0771 # Protein_GI_number: 16078584 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Bacillus subtilis # 3 451 7 450 451 291 36.0 2e-78 MDLRNKKVLVFGSGISGIGAARLLEQQGADVILYDGKETLDKEELRKKIGAGSKAQILLG TLRDEVIETLDSVVMSPGVPTDLPIVNKIREKGICIWGEIELAYRVGGGDVLAITGTNGK TTTTALLGEIMKAWKDSVYVVGNIGTPYTSIAAETREDSVIVAEISSFQLETVDAFRPKV SAILNITPDHLDRHHTMRAYIEAKETIAKNQTAEDVCVLNYEDEETRKFGEKTEATVLYF SSRRKLDRGVYLEDGTIIYAENGNRTTICHVDELKLLGVHNYENIMAAVAMSLAYHVPVD IIRQSVKAFGGVEHRIEYVCEKNGVVYYNDSKGTNPDAAIKAISAMKKPTYLIGGGYDKN ASYEEWIESFDGKVQKLVLIGQTKEKIEETAKRCGFTDTVLAETLEEAVSICAELSREGE AVLLSPACASWGMFKNYEERGDKFKELVNAL >gi|330401009|gb|ADLB01000003.1| GENE 44 45677 - 46762 1138 361 aa, chain + ## HITS:1 COG:BS_spoVE KEGG:ns NR:ns ## COG: BS_spoVE COG0772 # Protein_GI_number: 16078585 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus subtilis # 28 357 26 361 366 215 39.0 1e-55 MSRKSKKGRYDYSLLTAVFLLVGIGLVILYSTSAYNGEVKFHDSFYYLKKQAFATVLGII LMFAMANIDYHIWQHFAVFAYIVALILSTAVLFIGDEYNGSKRWLSLGPFSFQPSEYAKV ALILFLSYIVMKNVKKIDKVRTLIKIIGSILPIVALVGSNNLSTAVIILGIAIILIFVSS PKYTQFITMGILAVGFLGIFLALESYRLERLAIWRNPEKYEKGYQTLQGLYAIGSGGLFG 
RGLGSSIQKLGFVPEAQNDMIFSIICEELGLFGAIFIIVLFMILIWRFFVIATHAKDLFG ALIATGAMGHIMIQVILNIAVVTNSIPNTGITLPFISYGGTSVMFLLLEMGLVLSVSNLI E >gi|330401009|gb|ADLB01000003.1| GENE 45 46784 - 47539 654 251 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2248 NR:ns ## KEGG: EUBREC_2248 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 15 239 22 248 248 90 29.0 3e-17 MKKQKKTSHTLYAFTVLILGILIIVMSVLLLFHVQTIEVTGNKYINSSEIGESIQKSSKT KNSLYLLGKNLMGKIDYPKAVVSAKIRLKTPWSIRVEVKEKEIMAYAVIDDEYVYFDEEG TVLSKSVVLMEGIPCIEGISANAELYKKLPVKEERLFRNIDTMLKALDEWKIKPDRIVSE GADLTIYIEKVCVTLGSGSMEEKISQLPPILTKLEGKTGTLDLRHYGEATEMITFKEGEL PKKEEEKTDKK >gi|330401009|gb|ADLB01000003.1| GENE 46 47678 - 48907 1672 409 aa, chain + ## HITS:1 COG:CAC1693 KEGG:ns NR:ns ## COG: CAC1693 COG0206 # Protein_GI_number: 15894970 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Clostridium acetobutylicum # 13 318 12 319 373 334 63.0 2e-91 MLEIKTNESDAAAKIIVVGVGGGGNNAVNRMIDEQIAGVEFIAINTDKQALQLCKAPTLM QIGDKLTKGLGAGAKPEVGEKAAEESAEEIASALKGADMVFVTCGMGGGTGTGATPVVAR IAKEQGALTVGVVTKPFRFESKARMNNALAGIEKLKENVDTLIVIPNDKLLEIVDRRTTM PEALKKADEVLQQGIQGITDLINVPSLINLDFADVQTVMVDKGIAHIGIGKGKGEEKALD AVKEAVASPLLETTIAGASHVIINVSGDISLMDASDAAEYVQELAGEEANIIFGAMYDDT KQDEATITVIATGLHNVGGATSKLKQRLEGQKAAFHQTIQPKEEVQSPKVDLTKNVAHTQ FQSQTANHTGSANNEQRTYGGATPTLQTPKVPTSTVKEQSIKIPDFFKR >gi|330401009|gb|ADLB01000003.1| GENE 47 49079 - 49861 574 260 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2470 NR:ns ## KEGG: Cphy_2470 # Name: not_defined # Def: peptidase U4 sporulation factor SpoIIGA # Organism: C.phytofermentans # Pathway: not_defined # 1 260 1 281 293 103 27.0 5e-21 MYYELYVDVFFLVNFMMDFLLLLIARKILKCSATHGNICLGSLVGSLLTCFVVVLPVRSA ILKLMLFHIVINVLMIYIGLRVHTLREIVRAWIALYIGGFLLGGVFTYFQQYLKMGSLFF AVAVFSHWIVQGIWAFVVCMQKVKQNECNVTLYQNGEKCTLHALIDTGNSLSDPLTKQPV CIVEYEAVKTWLNEDEVKNLRRIFFHSIGKECGTLPVMELEKMCIHNEKECWVMKPIVAV CENKISADEEYGMILNPDIF >gi|330401009|gb|ADLB01000003.1| GENE 48 
49874 - 50614 760 246 aa, chain + ## HITS:1 COG:BH2556 KEGG:ns NR:ns ## COG: BH2556 COG1191 # Protein_GI_number: 15615119 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus halodurans # 29 241 23 232 237 275 69.0 7e-74 MVIKVAVPNQFKLKVISNFRTFLFPEKRDIHYIGGSDVLPAPLETEEETAVIEKLGTETD DVAKKTLIEHNLRLVVYIAKKFDNTGVGVEDLISIGTIGLIKAINTFNPMKNIKLATYAS RCIENEILMYLRRNNKTRLEVSIDEPLNVDWDGNELLLSDILGTEEDTIYRDLETEVERK LLMKAINRLSSREKTIVQMRFGIGTKDGEEKTQKEVADILGISQSYISRLEKKIMQRLKR EIVRYE >gi|330401009|gb|ADLB01000003.1| GENE 49 50694 - 52301 1547 535 aa, chain + ## HITS:1 COG:PA3614 KEGG:ns NR:ns ## COG: PA3614 COG1236 # Protein_GI_number: 15598810 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted exonuclease of the beta-lactamase fold involved in RNA processing # Organism: Pseudomonas aeruginosa # 3 465 4 467 467 375 41.0 1e-103 MKLTFIGAAHEVTGSCHLLEACNKTILIDCGMEQGPDLYENQELPIVPGDIDCVLLTHAH IDHSGLIPMLCKQGFKGQIVTTFATSDLCNIMLRDSAHIQEFEAEWRNRKARRAGAALYE PLYEMQDALDAIELLAPCDYDQKIELYDGISVRFTDIGHLLGSACVEIWIREEGVEKKIV FSGDVGNINQPIIKDPTQVKEADYVVIESTYGNRLHGEETPDYIGEFTRILNETFARGGN VVIPSFAVGRTQELLYFIREIKEKNLVTAFPNFEVYVDSPLAIEATNVFTKNTKGCFDED AMKLVAEGINPLVFPGLKLSTTSDDSKAINFDERPKVIISASGMCEAGRIRHHLKHNLWR PECTILFVGYQAVGTLGRKLIEGAPLVKLFGEEIEVRAHIETLKGISGHADMRGLLNWLS GFETDIQHVFVVHGEDAVTDEFANTITEKFGYPAFAPYSGGSVDLAANVILSEGVKVRKK AEEKPSAIRAMTVFNRVVMAGKRLMSVIMKNEGLANKDLAKFESQIQNLADKWDR >gi|330401009|gb|ADLB01000003.1| GENE 50 52401 - 53180 619 259 aa, chain + ## HITS:1 COG:CAC1696 KEGG:ns NR:ns ## COG: CAC1696 COG1191 # Protein_GI_number: 15894973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 1 257 1 257 257 322 63.0 4e-88 MALNKVEICGVNTAKLPLLKEEEKEALFVRIKAGDEEAKEEYIKGNLRLVLSVIKRFSNS NENPDDLFQIGCIGLIKAINNFNTELNVKFSTYAVPMIIGEIRRYMRDNNSIRVSRSLRD TAYKAIYAKEGYMKTHLKEPTIQEIASEIGISKEEIVQALDAIQMPMSLHEPIYNDGGDT 
LYVMDQISDKKNKEENWVEELSLAEAMERLGERERKIIQMRFFEGKTQMEIAKEIHISQA QVSRLEKNALEFMKQYLVN >gi|330401009|gb|ADLB01000003.1| GENE 51 53288 - 53938 638 216 aa, chain + ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 1 214 4 218 222 168 43.0 8e-42 MYKVCIFDLDGTLTDTLESITYSVNKTLDELGFANITMEQCRQFVGDGARVLMERTLRAV GDVELEKIERAMEIYGRIFGENCTYHVTAYEGITDMLDQLKGRGIKTAVLSNKPHQQSVD VVAEILGKERFSCVNGQREGVEKKPDPAGVFEIMQFLGATKEECLYIGDSEVDMETAKRA GLVSVGVSWGFRGRDILQNAGADYIIDKPCELLKLV >gi|330401009|gb|ADLB01000003.1| GENE 52 53968 - 54231 464 87 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1570 NR:ns ## KEGG: EUBREC_1570 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 87 6 92 92 87 66.0 1e-16 MYEYDEECLQVFLKKQGQLFDEPVAETLEEAEAFLEDCMAAVVDSIEEVREYFDESGADV DGMSEEELEDASEVFALPDGRYLVVEA >gi|330401009|gb|ADLB01000003.1| GENE 53 54434 - 54916 452 160 aa, chain - ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS # 1 67 1 67 195 76 53.0 2e-14 MDFGEQIKSIRQKEKLTQEQFAMKLNVSRQAVSNWENNKNLPDIGMLILMSDVFQISLDY LIKGENEMNNMTEKVIKDGSETKRARYNMVCSIIGSFLILIGIVLLLVKGMSVEYIDEQG ILHENFFLIPIGFLCVFSGLISFITVGITTLISKLKNRNS >gi|330401009|gb|ADLB01000003.1| GENE 54 55149 - 55373 326 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|197302777|ref|ZP_03167830.1| ## NR: gi|197302777|ref|ZP_03167830.1| hypothetical protein RUMLAC_01506 [Ruminococcus lactaris ATCC 29176] # 18 74 1 57 57 93 89.0 4e-18 MGYEELDKFRVVKERYTMEKRKIVRGVILAVLLVCTMFLIYSIVTDPLGEHDIMFALAVT AGFIALGQDKKKER >gi|330401009|gb|ADLB01000003.1| GENE 55 55480 - 56346 663 288 aa, chain + ## HITS:1 COG:no KEGG:CLH_2455 NR:ns ## KEGG: CLH_2455 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # 
Pathway: not_defined # 1 220 1 220 272 134 41.0 4e-30 MKKQLKNLTIKDNFMFAAVMLDEENCKGFLERALQMKIDRVEVSAEKNIVYHPEYKGVRL DVYAKDENNTRYNVEMQVSTQSSLGLRSRYYQSQMDMEMLLSGSEYEELPKSYVIFICDF DPFGERKYKYTFEMECKETANAKLQDKRKIVFLSTKGKNAEEVPEELVRFLEFVKADIKE SQNDFQDEYVKQLQKFVEHIKKDREMEERFMLFEELLKEERKSGREEGRQEGREGMAEAI IVFLSKYGEVPDTLSEKINEEKDMEVLKQWIKLSVEVESIEEFISKIS >gi|330401009|gb|ADLB01000003.1| GENE 56 56515 - 56934 377 139 aa, chain + ## HITS:1 COG:no KEGG:CD1104 NR:ns ## KEGG: CD1104 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 139 1 139 139 188 74.0 4e-47 MNEREGIIRLWFDMWLCQKDLGIDEIFTEDVTYIESWGPKYTNRSTVKHWFEEWNTRGKV YEWDIKQYFHKDSQTVVEWYFRNKMSNGEIEEFDGVSLVEWTEENKICFLKEFGCNLNFY NPYENGDKPQFTNKKTNWF >gi|330401009|gb|ADLB01000003.1| GENE 57 56948 - 57127 167 59 aa, chain + ## HITS:1 COG:SP0628 KEGG:ns NR:ns ## COG: SP0628 COG0537 # Protein_GI_number: 15900535 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 1 55 12 66 167 85 67.0 2e-17 MCLICDRIEMIKQGTNPYFVKELETGYVVIGDNQHFKGYTLFLCKEHKTELFQLEYNQK >gi|330401009|gb|ADLB01000003.1| GENE 58 57172 - 57603 457 143 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3425 NR:ns ## KEGG: Cphy_3425 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 3 142 2 143 158 95 42.0 4e-19 MSSKTKIVVLHMKEIIYTAIFVILGILFILLLVFMFFPKNKKALDTTKEYAPGIYTSTVS LNNTDLEIEVTVDQSKITSIRCVNLSESVTTMYPLLQPTIENIAEQVCESQSTKNLTYPE DNPYTSQMIVSAIEKALEKAVVK >gi|330401009|gb|ADLB01000003.1| GENE 59 57760 - 58332 691 190 aa, chain + ## HITS:1 COG:CAC3217 KEGG:ns NR:ns ## COG: CAC3217 COG0193 # Protein_GI_number: 15896464 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Clostridium acetobutylicum # 1 186 1 183 187 202 58.0 
3e-52 MYIIVGLGNPTSQYEGTRHNVGFDVIDAIADKYNITMDIRKHRAFCGKGIIAGQKVILAK PQTYMNLSGESVRSLVDYFKIDEEQELIVIYDDISLDVGQLRIRKKGSAGGHNGIKNIIA HLGHSVFPRIKVGVGEKPKQYDLADYVLGHFSKAEREVMEDGYQNAVKAIELMVNDEIET AMNEFNKKTK >gi|330401009|gb|ADLB01000003.1| GENE 60 58347 - 61685 3041 1112 aa, chain + ## HITS:1 COG:CAC3216 KEGG:ns NR:ns ## COG: CAC3216 COG1197 # Protein_GI_number: 15896463 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Clostridium acetobutylicum # 1 1112 3 1169 1171 922 44.0 0 MQAFVEPLKELAEFEEIQKEKKKQKGMIQIAGCVNSQKTHLMYALGDDSTYRIIAVSSEA KAKQIYEEYKFLDEKVYYYPPKDLLFYQADLRGKALVKQRLEVIQAVLTEEKVTVITEFD GFMDSLLPLPKIQERIFTLKVGDTVDFDSLKEKVAALGYDREVQIDGMGQFAVRGGIIDI YPLTEEVPIRIEFWDDEIDSIRTFDVESQRSIENLEEIVIYPATDFPEEEGKRVSFLEYF PVEETMVFLDEPARLMEKGMGVEEEYLEAQKNRMEAGCEISDVRIPLYRTKQVIDKMNEY YCIGFSALEIRTREFKTNNLYSLHTKSVNPYNNSFEMLTRDLKRLKRNGYRVILLSGSRT RARRLAEDLRDYNLSSFYSEEKDRLVQEGEILVSFGHVAEGYEYPMLKFTVISETDIFGK GKKKRKRKVYEGQKIQSFSELKIGDYVVHENHGLGIYQGIEKIDVDKTSKDYMKISYAGG GNLYIPATQLDLIQKYASADAKKPKLNRLGTQEWTKTKTKVRGAVREIAKDLVELYAARQ REEGFVYGPDTVWQKEFEEMFPFEETEDQMLAIEATKRDMESNKIMDRLICGDVGYGKTE VAIRAAFKAVQENKQVVYLVPTTILAQQHYNTFVQRMKEFPVRVDLMCRFRTPAQQKKTI EDTKKGLVDIIVGTHRVLSDDLKFKDLGLLIIDEEQRFGVQHKEKIKKLKENIDVLTLTA TPIPRTLHMSLIGIRDMSVLEEAPMDRIPIQTYVMEYNDEMVREAIQRELSRQGQVYYVY NKVKDIEEITDRIQSLVPEAAVTYAHGQMSEHQLEKIMYDFINGEIDVLVSTTIIETGLD ISNANTMIIHEADKLGLSQLYQLRGRVGRSNRMAYAFMLYKRDKLLKEVAEKRLSAIREF TDLGSGFKIAMRDLEIRGAGNLLGAEQHGHMEAVGYDLYCKMLNEAVKHLKGEMEEEAYT TTVDLNVDAYIPSSYIPNEYQKLDIYKRIASIENEEEMDDMVEELIDRFGDIPKKVQQLL HIASVKALAHSAYIISIEQKGEQYKFTMYEKAKVCAERIPVLLQKYKGDLTFKIETNPYF IYQKKGKNKREKDENILECVKSVISGIKGLIE >gi|330401009|gb|ADLB01000003.1| GENE 61 61733 - 62773 1199 346 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0459 NR:ns ## KEGG: EUBREC_0459 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 216 4 214 317 99 34.0 2e-19 
MGNTRKKMVALMTVVAMSATAITGCGKINNEATLMTVGKDKVSMGVANFFARYQQAMAEA QYGTYMGDDMWESEMAESETMEDTMKKRILETLKTLYVLEDHMKDYKVELTAEEKQKIDK TAEEFLKANKDAAKEVVSADKETVSRVLELLTIQDKMREAMTADVDKNVSDEEAAQKSMQ YVFFTFTKTEENGTSSTISDDEKKTLKEKATKFQEGAKTQADFAAYAKESGYEAVTKTFD ADDVDPSEVLIKEVDKLKAGEMTGVVEAPSGYYVAKVTSVLDRKATDEKKQTIIAERENK KYEELVDKFLKDTKIDVDNKEWKKISFSKQRVTLKTVEQQESTEQK >gi|330401009|gb|ADLB01000003.1| GENE 62 62817 - 63263 267 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 148 1 147 147 107 38 2e-22 MSKIDEVRSAMVTAMKAGDKETKETLSMLLSALKNKAIEKRADLSAEEETQVIMKEIKQT KETLEMTPADRTEIVDECKKRLAVLEQYAPKMMGEEEIKSVIDATLAELGIDAPTPKDKG RIMKELMPKVKGKADGKLVNDLLLSLMN >gi|330401009|gb|ADLB01000003.1| GENE 63 63410 - 63958 673 182 aa, chain + ## HITS:1 COG:CAC3214 KEGG:ns NR:ns ## COG: CAC3214 COG2002 # Protein_GI_number: 15896461 # Func_class: K Transcription # Function: Regulators of stationary/sporulation gene expression # Organism: Clostridium acetobutylicum # 1 182 1 183 183 199 59.0 2e-51 MKATGIVRRIDDLGRVVIPKEIRRTLRIREGDPLEIFTDRNGEVILKKYSPIGELHSVSC EYADSLSAATGYTVCIADTEQIVAVSGSGKKSLLEKGITHELRKVMEMRKSFVAECGSKN YVKIVDGEMHEYTSQAVSPVICGGDVIGAVILLDKNDGKKFGDLERILAQTGAGFLGKQM EQ >gi|330401009|gb|ADLB01000003.1| GENE 64 64021 - 64395 378 124 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3435 NR:ns ## KEGG: Cphy_3435 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 124 1 124 124 73 35.0 3e-12 MGKKEKKESKGIWLLKALLAGYVVTGVLLMILALLLYKIDLDEQKVTMGIIATYVISTFT GGFIMGKLVEEQRFVWGLILGVIYFLLLFAVSFAVNHQLQSNGTNLITTLLLCAGGGMLG GMVS >gi|330401009|gb|ADLB01000003.1| GENE 65 64551 - 64694 173 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKHIKTLNAQTLNNTVKKGGCGECQTSCQSACKTSCTVGNQTCEQKK >gi|330401009|gb|ADLB01000003.1| GENE 66 64788 - 66191 1581 467 aa, chain + ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction 
only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 1 463 3 454 454 486 51.0 1e-137 MIHQYKNNGYDIVLDVNSGSVHVVDDMCYDIIDYLNKLGVEDNMELLEQNETLEKMETDL GETYPVEEIAEAFEDIKELVKAGQLFTKDVYEEYIGEVKKRKTVVKALCIHIAHDCNLAC KYCFAEEGEYHGRRALMSYEVGKKALDFLIANSGNRRNLEVDFFGGEPLMNWQVVKDLVA YGREQEKIHNKNFRFTLTTNGVLLNDEVQEFVNKEMDNVVLSLDGRKEVNDKMRPFRNGK GSYDLIVPKFQKLADSRNQQKYYIRGTFTRDNLDFSKDVLHFADLGFEQISIEPVVGEES DFYSIREKDLPQIFEEYDALAKEMVKREKEGKGFTFFHFMLDLDGGPCVAKRLSGCGSGT EYLAVTPWGDLYPCHQFVGEEDFLMGNVDEGITKPEIADEFRGCSVYSKEKCKNCFAKFY CSGGCMANSYKFHGSIHNAYDVSCEMERKRVECAIMIKAALADKERV >gi|330401009|gb|ADLB01000003.1| GENE 67 66195 - 68333 2669 712 aa, chain + ## HITS:1 COG:CAC2278 KEGG:ns NR:ns ## COG: CAC2278 COG0342 # Protein_GI_number: 15895546 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Clostridium acetobutylicum # 2 400 3 401 417 268 40.0 4e-71 MKKSKGVISLLLVTVLTGLLVFTTAVGFGSGHIGAAKNIKLGLDLAGGVSITYQAKGEKP TEKEMSDTVYKLQKRVEQYSTEAGVYQEGDDRISIEIPGVTDANKVLEELGKPGSLEFQD SEGNVVLDGTDVKSATAQIQEDNSGNKKNMVSLEMTKEGKKKFAEATEKNLGKPISIVYD GQVISSPTVNSAITNGQAVIEGSFTYEEAEELASTIRIGGLQVELEEINSSVVGAQLGEE AISTSLMAGAIGLAIIFVFMCAVYLLPGLASSIALCIYTGIILVLLNAFDITLTLPGIAG IILGIGMAVDANVIIFARVKEELGEGKNVKTALKKGFQKALSAILDGNITTLIAALVLGL KGTGTVKGFAQTLALGIVVSMFTALVVTRIIIFAFYAVGLKSEKLYGVQKERKTINFLGK RKICFAVSIALALSGFVAMGVQKSQGNDILNYSLEFKGGTSTNVTFNEDFSIKELDSKVT PVIEKVTGDKNVQMQKVKGTKQVIIKTQALDLDKREALNKALVDNFKVDEKKITYSNISS TVSSEMRSDAIVAVLIATVCMLIYIWLRFKDFRFASSAIIALLHDVLVVLAFYAIARVSV GNTFIACMLTIVGYSINATIVIFDRIREELKAGRRKDSLEDVVNKSITQTLTRSIYTSFT TFVMVAVLYVLGVSSVKEFAAPLMVGVVVGAYSSVCITGSLWYVLKKKFVKK >gi|330401009|gb|ADLB01000003.1| GENE 68 68400 - 70115 1510 571 aa, chain + ## HITS:1 COG:CAC2232 KEGG:ns NR:ns ## COG: CAC2232 COG0608 # Protein_GI_number: 15895500 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Clostridium 
acetobutylicum # 1 571 1 587 587 563 48.0 1e-160 MEKWFLTRKGADFQKISQKFHINPVIARLIRNRDIVGDDEIQFYLNGNVENLYDGILMKD MSVAVDIIREKIQERKSIRIIGDYDIDGINATYILLEGIEKLGGDVSVDIPDRMKDGYGL NKTLIDRAFEDGVDTIITCDNGIAAGGEIAYGKSLGMTIVVTDHHEVPYEETEEGIRYLL PHADAVIDPKREDCEYPFPHLCGAGVAYKLVETLFNVEGRDAGEIEYLIENVAIATVGDV MDLIGENRIFVKYGLEKIKKTKNLGLKSLIECTGVEVDRLSSYHIGFVIGPCLNASGRLD TAKRALELLRAKTKADADILAGDLKALNDSRKEMTAEAVEEAVEQVENTSLKEDTVLVIY LPDCHESLAGIVAGRIREKYHRPTFVLTNAEEGAKGSGRSIDAYHMYEEMNKCKDLLEKF GGHKLAGGLSIPIENIDLFRKRLNENCELSEDELYEKVSIDMQLPLSYVSETLIEELECL EPFGKGNPKPVFAEKDIQIRGTRILGKNKNVLKLQLSDMYGTEMEGMYFGDVGRVMDTIE NKNGKINITYYPTVNEYMGRKTLQIVIQNYQ >gi|330401009|gb|ADLB01000003.1| GENE 69 70185 - 70709 710 174 aa, chain + ## HITS:1 COG:TM1384 KEGG:ns NR:ns ## COG: TM1384 COG0503 # Protein_GI_number: 15644136 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Thermotoga maritima # 4 171 6 173 173 184 52.0 6e-47 MKPIEEYIRSIPDFPEKGIIFRDITSVLEDAEGLRLAIDLMQENIGEGDFDVVVGPESRG FIFGVPIAYNMGKPFIPVRKKGKLPCETVSIQYDLEYGSAEIEMHKNSIKPGQKVVIIDD LMATGGTMEAIIKLIEGMGGEVVKIVCLIELAGLEGRKRLKDYQMEAVITYPGN >gi|330401009|gb|ADLB01000003.1| GENE 70 70736 - 73036 2389 766 aa, chain + ## HITS:1 COG:CAC2274 KEGG:ns NR:ns ## COG: CAC2274 COG0317 # Protein_GI_number: 15895542 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Clostridium acetobutylicum # 38 766 7 740 740 708 48.0 0 MAQNNIVGNPYQNLEIVDGHAVKAPEDYQDSDQLYDALIARIRKYHPSTDISLIEKAYRI AKEAHEGQVRKSGEAYIIHPTWVGLILADLELDKETIAAGILHDVVEDTVMTEEEITEEF GSEVALLVDGVTKLGQLSYSADKLEVQAENLRKMFLAMAKDIRVILIKLADRLHNMRTLQ FMRPEKQKEKARETMDIYAPIAQRLGISKIKTELDDLALKYSQPDVFYDLVNQINARKTE REEFVQQIVEEVSTHMKNANIKAEVYGRVKHFFSIYKKMVNQDKTVDQIYDLFAIRIIVD SVKDCYAALGAIHEMYTPIPGRFKDYIAMPKPNMYQSLHTTLMGPAGQPFEIQIRTVEMH KTAEYGIAAHWKYKESGGSEKSVATRAEEKLSWLRQILEWQQDTDNREFLSLLKGDLDLF 
AEDVYCFTPNGDVKNLPNGSTPVDFAYAIHSAVGNKMVGARVNGKLVPIDYKIQNGDRIE VLTSQNSKGPSRDWLNIVKSTQARNKINQWFKKEFKEENIVRGKEMLAAYCRAKALVLSD LTKPKYMQVVQQKYGFRDWEAVLAALGHGGLKEGQIINRLLEEYNKEHKEAITDETILEK VSEAAKNKVHVTKSKSGIVVKGIDDMAVRFSRCCNPVPGDEIVGFVTRGRGMSIHRTDCV NIIHLSESERARLIDAEWEQEEEKGTGQYMAEIRMYAMDRTGFLMEISKVFTENKIDVKS MNARTSKQGKVTLEIGFIVHGREELAKVAEKLRQIDGVIDIERAVS >gi|330401009|gb|ADLB01000003.1| GENE 71 73046 - 73669 603 207 aa, chain + ## HITS:1 COG:CAC2272 KEGG:ns NR:ns ## COG: CAC2272 COG0491 # Protein_GI_number: 15895540 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Clostridium acetobutylicum # 1 206 1 199 199 159 41.0 4e-39 MRIKTIVTGIISTNCYVVHNETTKEAILIDPGACSKNLKDYLLEEKLNVKGILLTHGHFD HILGLDGLLQMFDVPVYVHEEEKELIEDAVLNQSKTYTEGYVFTKAQYVKDGQMLNLIGY DFQVIHTPGHTKGSACYYVKEEEVLFSGDTLFYASVGRTDFPTGSTSALIRSIKEKLMCL PDETIVYPGHMGATSIGYERQQNPFIQ >gi|330401009|gb|ADLB01000003.1| GENE 72 73753 - 75192 903 479 aa, chain + ## HITS:1 COG:BS_hemZ KEGG:ns NR:ns ## COG: BS_hemZ COG0635 # Protein_GI_number: 16081159 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Bacillus subtilis # 54 477 75 490 501 318 42.0 1e-86 MVDIYIKEEFYAYDAFHIVKAFLPNEEIKQHINRERNEIISIDVDKQTIIKIDDVNKECT KKENKSEVNKKIYKQMERYTNKSIAWGILTGIRPTKILMKKLEDGKSEEDIKEWFQQQYL VSEKKATLGMEIAKREKSLLEQLDYENGYSLYIGIPFCPTTCSYCSFTSYPIQQWKERTD EYLEALCKEISFVGETSKEKKLNTIYMGGGTPTTLEAEQLDRLLTHVEKTFSLEELKEFT VEAGRPDSITEAKLEVLKRHHITRISINPQTMQQKTLDSIGRRHTVEDVIRIFRLARKLG FDNINMDLIAGLTGENVDDMRDTLEKIKELNPDNLTVHSLAIKRASKLNQTETKKELMNQ SETLSEMIELAAKYAGEMELFPYYLYRQKNIAGNFENVGYAKVDKAGIYNILIMEEKQSI VAVGAGASTKIVLPKGKEIRDRKSGDLKNIVRIENVKDVDAYIMRIDEMIERKGEWLWR >gi|330401009|gb|ADLB01000003.1| GENE 73 75183 - 76439 1437 418 aa, chain + ## HITS:1 COG:APE0662 KEGG:ns NR:ns ## COG: APE0662 COG0124 # Protein_GI_number: 14600873 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Histidyl-tRNA synthetase # Organism: Aeropyrum pernix # 10 406 14 393 438 204 32.0 2e-52 MALKKKAMTGMKDMLPREMEIRDYVIHLIKETYSTFGFSSIETPCVEHIENLCSKQGGEN EKLIFKILKRGEKLKLETAQTESDVVDGGLRYDLTVPLCRYYANASNELPSPFKALQMGN VWRADRPQRGRFRQFMQCDIDILGEPTNLAEIELILATTTLLGKLDFKNFTIRINDRKIL KAMASYSGFEEKDYDNVFIILDKMDKIGLDGVAAELEANGYAKESVDKYLALFKEITNDV EGVRYCKEKLAGFLDAETADGLEMIITSVDSVKEAEFKISFDPTLVRGMSYYTGTIFEIA MDEFGGSVGGGGRYDEMIGKFTGQQTPACGFSIGFERIVMLLLERGYEVPTAKNKKAFLI EKGMNQEGLLKVLSLAKEERAAGNQVQISVMKKNKKFQKEQLSSEGYNDIQEFFKDKL >gi|330401009|gb|ADLB01000003.1| GENE 74 76458 - 78257 2007 599 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 599 1 595 595 721 60.0 0 MAESMQGLKRSHRCAELNSSHIGETVTVMGWVQKNRNKGGLVFVDVRDRSGIVQVVFEEG SVSSELIEKAGSLRSEYVVAVVGKVAKRSGAVNENIVTGDIEVLPTELRVLSESETPPFP IEENSKTKEEVRLKNRVLDLRRPDLQRNLIMRSQVATLTRQFLAEEGFLEIETPMLIKST PEGARDYLVPSRVHQGSFYALPQSPQIFKQLLMCAGYDRYFQIVKCFRDEDLRADRQPEF TQIDMELSFVDVDDVIDVNERLLAKVFKEILDVEVSLPIQRMTWQEAMDRFGSDKPDIRF GMELTDVSEVVKDCEFVVFKNALEQGGSVRGINAKGQGAMPRKKIDKLVDFAKGYGAKGL AYIAIQEDGTMKSSFAKFMIEEEMTALVNAMGGENGDLLLFAADCNKVVWDVLGNLRLEI ARQLELLDKNEYKFLWVTEFPLLEWNEEAGRYTAMHHPFTMPMEEDLHLIDTEPGKVRAK AYDIVLNGTELGGGSVRIFNQDIQNKMFEVLGFTKEQAQEQFGFLLNAFKYGVPPHAGLA YGLDRLIMLMAKQDSIRDVIAFPKVKDASDLMTEAPAGVDQKQLDELGLAIVAEEKTEE >gi|330401009|gb|ADLB01000003.1| GENE 75 78447 - 78800 289 117 aa, chain - ## HITS:1 COG:CAC0656 KEGG:ns NR:ns ## COG: CAC0656 COG3666 # Protein_GI_number: 15893944 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 11 117 83 189 189 123 62.0 9e-29 MNARIVPAAPIKKNSTKAKGNKRLYVSKSFLEKRQESYKNILSETGIKYRMNRSIQVEGA FGVLKNDYEFQRFLLRGKSKVKLEILLLCMGYNINKLHAKIQKERTGSYLFLVKETA Prediction of potential genes in microbial genomes Time: Tue May 24 20:52:54 
2011 Seq name: gi|330400460|gb|ADLB01000004.1| Lachnospiraceae bacterium 2_1_46FAA cont1.4, whole genome shotgun sequence Length of sequence - 3016 bp Number of predicted genes - 5, with homology - 5 Number of transcription units - 5, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 393 203 ## COG3666 Transposase and inactivated derivatives - Prom 454 - 513 2.2 - Term 446 - 504 6.3 2 2 Tu 1 . - CDS 517 - 1092 680 ## Bmur_0648 5TM receptors of the LytS-YhcK type transmembrane region - Prom 1225 - 1284 10.6 + Prom 1166 - 1225 9.7 3 3 Tu 1 . + CDS 1253 - 1822 552 ## COG5418 Predicted secreted protein + Term 1961 - 2011 -0.5 - Term 1774 - 1817 9.4 4 4 Tu 1 . - CDS 1825 - 2460 586 ## Cphy_3273 hypothetical protein - Prom 2496 - 2555 6.0 5 5 Tu 1 . - CDS 2631 - 3014 103 ## SZO_02320 transposase Predicted protein(s) >gi|330400460|gb|ADLB01000004.1| GENE 1 3 - 393 203 130 aa, chain - ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 130 9 131 334 166 66.0 1e-41 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACKRDINFMWLLAGQKAPDHSTIARFRTGFL ADACENLFYQ >gi|330400460|gb|ADLB01000004.1| GENE 2 517 - 1092 680 191 aa, chain - ## HITS:1 COG:no KEGG:Bmur_0648 NR:ns ## KEGG: Bmur_0648 # Name: not_defined # Def: 5TM receptors of the LytS-YhcK type transmembrane region # Organism: B.murdochii # Pathway: not_defined # 11 190 8 187 205 126 40.0 4e-28 MNKQFSPTARMAFCGLAIALNIVLGIVTAALKFPFYLDVMGTMFIAIFFGPWYGAVVGGA TNILTSIFSGSISGMPFMLVSIAIGLITGFAFRKINFTFVNAILIGVVTGIVAPLIGTPI GIAVYGGLTGTVSDVAVMFLKQSGASIFTASFIPKLFNNLLDKIGSMILVYLLVIALPKN LKPKVFTRTQK >gi|330400460|gb|ADLB01000004.1| GENE 3 1253 - 1822 552 189 aa, chain + ## HITS:1 COG:MJ0003 KEGG:ns NR:ns ## COG: MJ0003 COG5418 # Protein_GI_number: 15668175 # 
Func_class: S Function unknown # Function: Predicted secreted protein # Organism: Methanococcus jannaschii # 3 117 4 110 156 59 32.0 4e-09 MKQKILFVSHCFLNDGAKLKNQNLSEMEQERKDKREFLKKILDAGVEIIQLPCPEFILYG ANRWGHAASQFDTAFFRKESRKMLEPILLQIEEYSFYPERFEIIGIVGIDGSPSCGVTFT YDGEWGGEFSGNENLSDTLDSLKREEKPGIFMKVLKDMLAEKGYEISFYSLKELEERNIF RKKCLPEDM >gi|330400460|gb|ADLB01000004.1| GENE 4 1825 - 2460 586 211 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3273 NR:ns ## KEGG: Cphy_3273 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 211 3 212 212 230 49.0 3e-59 MNPKQKAHENMANTIIKNLSRRNIEGFYCPDKDSAVSLAMDIIGENATVSFGGSATLKES GMLEALRNSSVHVIETFSAQTPEEKHQAFLNCVASDYFFMSSNAITIDGELINIDGNGNR IACLIHGPKHVILLVGMNKVVSDVKSGIARIQNTASPVNAVSLNKKTPCGITGHCGNCHS ADCMCCQVVVTRHSRHDGRIKVILIGEELGY >gi|330400460|gb|ADLB01000004.1| GENE 5 2631 - 3014 103 127 aa, chain - ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 123 337 457 472 84 34.0 1e-15 EKQPDCETINQILAVLTERTVDCGHCIKFQKQYYKTINECGIQVHYHKGTKGLVIQTFDK RLLFSVNNKIYELEVVPLHEPLSKNFDLQPLKEKPRKRNLPSPKHPWRMSTFLQFKNHKI TETMLIC Prediction of potential genes in microbial genomes Time: Tue May 24 20:53:28 2011 Seq name: gi|330400063|gb|ADLB01000005.1| Lachnospiraceae bacterium 2_1_46FAA cont1.5, whole genome shotgun sequence Length of sequence - 87386 bp Number of predicted genes - 63, with homology - 60 Number of transcription units - 22, operones - 16 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 269 104 ## LGG_02944 integrase - Prom 316 - 375 3.2 + TRNA 483 - 554 78.1 # Gly GCC 0 0 + TRNA 602 - 673 78.1 # Gly GCC 0 0 + Prom 604 - 663 79.0 2 2 Op 1 . + CDS 712 - 1350 728 ## COG2188 Transcriptional regulators + Term 1360 - 1404 8.4 + Prom 1482 - 1541 7.2 3 2 Op 2 . 
+ CDS 1561 - 3021 1669 ## COG0531 Amino acid transporters + Term 3033 - 3077 10.2 + Prom 3033 - 3092 8.6 4 3 Op 1 . + CDS 3282 - 3842 733 ## CLL_A2964 hypothetical protein 5 3 Op 2 1/0.000 + CDS 3859 - 4290 405 ## COG1846 Transcriptional regulators 6 3 Op 3 16/0.000 + CDS 4303 - 5151 1047 ## COG0207 Thymidylate synthase 7 3 Op 4 1/0.000 + CDS 5167 - 5676 623 ## COG0262 Dihydrofolate reductase 8 3 Op 5 . + CDS 5660 - 7012 1742 ## COG0534 Na+-driven multidrug efflux pump 9 3 Op 6 . + CDS 7059 - 9008 1868 ## COG3855 Uncharacterized protein conserved in bacteria + Term 9030 - 9087 17.1 - Term 9025 - 9067 3.4 10 4 Tu 1 . - CDS 9079 - 10410 1566 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain) - Prom 10443 - 10502 2.2 + Prom 10776 - 10835 10.4 11 5 Op 1 1/0.000 + CDS 10865 - 11620 975 ## COG0500 SAM-dependent methyltransferases 12 5 Op 2 . + CDS 11623 - 12501 1199 ## COG1281 Disulfide bond chaperones of the HSP33 family 13 5 Op 3 . + CDS 12504 - 14417 1906 ## COG0171 NAD synthase 14 5 Op 4 . + CDS 14436 - 15707 809 ## 15 5 Op 5 . + CDS 15778 - 16446 287 ## PROTEIN SUPPORTED gi|87308954|ref|ZP_01091092.1| 30S ribosomal protein S1 + Term 16473 - 16512 1.0 16 6 Op 1 . + CDS 16541 - 18202 1730 ## COG1227 Inorganic pyrophosphatase/exopolyphosphatase 17 6 Op 2 . + CDS 18212 - 18784 410 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family + Term 18785 - 18822 3.5 + Prom 18787 - 18846 4.3 18 7 Op 1 . + CDS 18889 - 19569 597 ## COG0726 Predicted xylanase/chitin deacetylase 19 7 Op 2 . + CDS 19588 - 20304 730 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 20 7 Op 3 . + CDS 20315 - 21637 1186 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase - Term 21607 - 21652 3.0 21 8 Op 1 . - CDS 21658 - 22170 449 ## Cphy_0407 hypothetical protein 22 8 Op 2 . - CDS 22185 - 23114 394 ## COG1242 Predicted Fe-S oxidoreductase 23 8 Op 3 . 
- CDS 23114 - 24262 1087 ## COG0281 Malic enzyme - Prom 24377 - 24436 10.1 + Prom 24355 - 24414 9.0 24 9 Op 1 . + CDS 24484 - 25809 1235 ## COG1158 Transcription termination factor + Prom 25811 - 25870 6.1 25 9 Op 2 . + CDS 25897 - 26220 291 ## Ccur_02270 hypothetical protein + Prom 26254 - 26313 4.7 26 10 Op 1 . + CDS 26337 - 26540 339 ## PROTEIN SUPPORTED gi|240147058|ref|ZP_04745659.1| large subunit ribosomal protein L31 + Term 26559 - 26594 6.0 27 10 Op 2 1/0.000 + CDS 26617 - 27549 1035 ## COG3872 Predicted metal-dependent enzyme 28 10 Op 3 32/0.000 + CDS 27546 - 28370 303 ## PROTEIN SUPPORTED gi|237684513|gb|ACR11777.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific 29 10 Op 4 . + CDS 28409 - 29479 1134 ## COG0216 Protein chain release factor A + Prom 29568 - 29627 10.0 30 11 Tu 1 . + CDS 29705 - 34315 4542 ## CPE1523 hyaluronidase + Term 34325 - 34362 4.1 + Prom 34342 - 34401 5.4 31 12 Op 1 . + CDS 34428 - 35180 629 ## BHWA1_01124 hypothetical protein 32 12 Op 2 9/0.000 + CDS 35167 - 36957 991 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 33 12 Op 3 . + CDS 36954 - 37751 683 ## COG3279 Response regulator of the LytR/AlgR family + Term 37763 - 37802 5.1 + Prom 37811 - 37870 6.7 34 13 Tu 1 . + CDS 37928 - 49024 12775 ## Balac_0261 hypothetical fibronectin binding protein + Term 49039 - 49079 6.5 + Prom 49061 - 49120 5.1 35 14 Op 1 . + CDS 49186 - 50919 237 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 36 14 Op 2 . + CDS 50948 - 51841 908 ## COG4866 Uncharacterized conserved protein 37 14 Op 3 . + CDS 51842 - 53029 1003 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Prom 53036 - 53095 2.5 38 15 Op 1 . + CDS 53122 - 53970 1258 ## COG2035 Predicted membrane protein 39 15 Op 2 . + CDS 53977 - 54663 667 ## COG4912 Predicted DNA alkylation repair enzyme + Term 54669 - 54704 6.0 40 16 Tu 1 . + CDS 55018 - 56238 831 ## + Prom 56255 - 56314 8.6 41 17 Op 1 . 
+ CDS 56348 - 56809 477 ## gi|225568404|ref|ZP_03777429.1| hypothetical protein CLOHYLEM_04481 + Prom 56811 - 56870 3.8 42 17 Op 2 . + CDS 56892 - 57947 1228 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 43 17 Op 3 . + CDS 57949 - 62412 4214 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type) 44 17 Op 4 . + CDS 62426 - 62881 626 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 62916 - 62953 -0.6 + Prom 63228 - 63287 1.8 45 18 Op 1 . + CDS 63309 - 64172 685 ## DSY3852 hypothetical protein 46 18 Op 2 . + CDS 64160 - 65821 229 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 47 18 Op 3 . + CDS 65797 - 66735 769 ## DSY3851 hypothetical protein 48 18 Op 4 . + CDS 66755 - 67267 476 ## Pjdr2_2685 hypothetical protein 49 18 Op 5 . + CDS 67288 - 68217 910 ## COG0714 MoxR-like ATPases 50 18 Op 6 . + CDS 68226 - 69314 629 ## Cphy_3708 hypothetical protein 51 18 Op 7 . + CDS 69274 - 71400 1209 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 52 18 Op 8 . + CDS 71312 - 72229 1026 ## Pjdr2_2684 S-layer domain protein 53 18 Op 9 . + CDS 72213 - 73463 802 ## gi|291546094|emb|CBL19202.1| hypothetical protein 54 18 Op 10 . + CDS 73535 - 77191 3806 ## COG1520 FOG: WD40-like repeat + Term 77198 - 77247 9.1 + Prom 77297 - 77356 5.4 55 19 Op 1 9/0.000 + CDS 77400 - 78476 1442 ## COG1063 Threonine dehydrogenase and related Zn-dependent dehydrogenases 56 19 Op 2 . + CDS 78502 - 79704 1395 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Term 79718 - 79754 4.1 57 19 Op 3 . + CDS 79783 - 81042 1030 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 + Term 81045 - 81105 8.4 - Term 81037 - 81083 11.1 58 20 Op 1 1/0.000 - CDS 81098 - 82501 1220 ## COG0531 Amino acid transporters 59 20 Op 2 . - CDS 82523 - 83176 595 ## COG1309 Transcriptional regulator 60 20 Op 3 . 
- CDS 83073 - 83264 106 ## - Prom 83284 - 83343 5.2 + Prom 83173 - 83232 10.8 61 21 Op 1 . + CDS 83368 - 84672 1466 ## COG0477 Permeases of the major facilitator superfamily 62 21 Op 2 . + CDS 84700 - 86745 2650 ## COG2936 Predicted acyl esterases + Term 86749 - 86789 10.3 - Term 86726 - 86790 12.7 63 22 Tu 1 . - CDS 86883 - 87353 376 ## COG1943 Transposase and inactivated derivatives Predicted protein(s) >gi|330400063|gb|ADLB01000005.1| GENE 1 2 - 269 104 89 aa, chain - ## HITS:1 COG:no KEGG:LGG_02944 NR:ns ## KEGG: LGG_02944 # Name: tnp # Def: integrase # Organism: L.rhamnosus # Pathway: not_defined # 1 78 1 78 478 64 43.0 1e-09 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKAAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYTCEL >gi|330400063|gb|ADLB01000005.1| GENE 2 712 - 1350 728 212 aa, chain + ## HITS:1 COG:CAC2851_1 KEGG:ns NR:ns ## COG: CAC2851_1 COG2188 # Protein_GI_number: 15896105 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 12 122 8 118 118 85 39.0 5e-17 MKKELKNEVGIPKYQQIAADIAYKIVDGTYTEGMKVYARSSIGSRYGVSSETARRAMCVL SDWDVVEVAKNSGVVIKSEDNAKNFLQQHTHMQSIQGLKRNILENVEKQQKATKELYGYL DEVIKKIDNYRSINPFIPYELVITDKTPYLNKTVGSINFWQETFATVIAIRRNGTLIMSP GPEAVFRLNDIVYYTGDEDCPDRVRGFMYPNK >gi|330400063|gb|ADLB01000005.1| GENE 3 1561 - 3021 1669 486 aa, chain + ## HITS:1 COG:BH0994 KEGG:ns NR:ns ## COG: BH0994 COG0531 # Protein_GI_number: 15613557 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Bacillus halodurans # 71 457 1 384 395 318 44.0 1e-86 MNEKKSKYDKVLSSKDIFVIAFGAMIGWGWIVMTGEWINYGGSIGAMIAFGIGGFMVLFV GLTYAELTAAMPECGGEHIFSLRAFGKNGSFVCTWAIILGYVGVVAFEACAFPTVIKYIA PSLVTKGYMYTIGGFDVYASWVAVAVVAAIIITYINYKGAKTAANVQTILTVVIALVGIA LIAASVVRGDTANLDPMFLEGDEIGGVFKVAVMTPFMLVGFDVIPQAAEEINIPFKKIGK IIIFSIFMAVAWYVLIVLAVSLIMSKGDLSISELVTADAMKKAYFNSDIASTVCIFGGIM GIVTSWNSFFMGGSRAIAALSESHMAPGFLAKVNKKTKTPTNSILLVGFIAVIAPFFGKS MMTWITDAGSFAVCLAYLMVSASFLKLRKSEPNMNRPYKVKNAKLVGGMAVITTTIMCLL 
YIVPGNSQLITEEWVIVGAWILLGIAFYGYARKQYKEKFGTRQKLIYEDVVEEKPKKVRE RVAAKA >gi|330400063|gb|ADLB01000005.1| GENE 4 3282 - 3842 733 186 aa, chain + ## HITS:1 COG:no KEGG:CLL_A2964 NR:ns ## KEGG: CLL_A2964 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B_Eklund # Pathway: not_defined # 1 177 1 176 181 129 38.0 5e-29 MEQMKAQFKDSWQELKHLKTIVVTAMFIAIGVVLGFFFTIQVTDFLKIGFSFIANEMTAL LFGPVVGGIMGGVTDIIKFVLKPTGAYFFGFTFNAILGAVIYGMILYHRPISLKRILIAK IIVAIIVNLLLGTYWLHIMYGKAFWALIPVRLWKQVMSVPIESFLFYIVAKALSKARIFE LVKTKR >gi|330400063|gb|ADLB01000005.1| GENE 5 3859 - 4290 405 143 aa, chain + ## HITS:1 COG:CAC3413 KEGG:ns NR:ns ## COG: CAC3413 COG1846 # Protein_GI_number: 15896654 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 1 136 1 137 143 57 26.0 9e-09 MEERQPIGFIVKQINNIYEKELTKRLKVLGITSSQCAVLNYLFQSNQEHVTQREIEKNLQ LKNPTVTGLLKRLDEKGFVLCVQSPTDKRCKNIYLTEKAFDIQKKMESDRKRLDKRLTIG MTKKEVEVFRRGLEKVLYNISEP >gi|330400063|gb|ADLB01000005.1| GENE 6 4303 - 5151 1047 282 aa, chain + ## HITS:1 COG:BS_thyA KEGG:ns NR:ns ## COG: BS_thyA COG0207 # Protein_GI_number: 16078831 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Bacillus subtilis # 1 275 1 273 279 215 42.0 9e-56 MSYADKVFIEMCRDIIDNGVSTEGEKVRPKWEDGSFAYTMKKFCVVNRYDLSKEFPALTL RRTGIKSCVDEMLWIWQKKSNNINDLNSHIWDSWADEDGSIGKAYGYQMSVKHQYKEGMM DQVDRVIYDLKNNPYSRRIMTNIYVHQDLHEMNLYPCAYSVTFNVTKKNDSDKLVLNAIL NQRSQDVLAANNWNVCQYAVLVHMLAQVCDMQVGELVHVIADAHIYDRHIPLIEELISRE SYPAPAFWLNPEIKDFYEFTTDDVRLDNYQTGPQIKNIPIAV >gi|330400063|gb|ADLB01000005.1| GENE 7 5167 - 5676 623 169 aa, chain + ## HITS:1 COG:CAC3004 KEGG:ns NR:ns ## COG: CAC3004 COG0262 # Protein_GI_number: 15896256 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Clostridium acetobutylicum # 1 149 2 145 153 112 40.0 3e-25 MNLIVAVDKNWAIGCGNRLLVSIPADMKFFRETTTGKVVVMGRKTLESFPGGQPLKKRTN 
IVMTKDKNYKVKDAIVVTSLEEVLEELKKYEEEDIYVIGGESIYRQLLPHCKTAYVTKID HAYEADTYFPNLDEMEDWSLTGISEEQTYFDLEYVFARYERVKGNEEKH >gi|330400063|gb|ADLB01000005.1| GENE 8 5660 - 7012 1742 450 aa, chain + ## HITS:1 COG:TM0815 KEGG:ns NR:ns ## COG: TM0815 COG0534 # Protein_GI_number: 15643578 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Thermotoga maritima # 12 439 18 445 464 177 29.0 3e-44 MKKSINLLEGPILPSLSGLALPIMATSLIQMAYNLTDMIWIGRVGSSAVAAVGAAGMYMW LANGLATLAKMGGQVKVAQALGAKENKQAVSYAKSALQLGITLGLIYGILSVVLANPLID FFKLNSAKVVADAKIYLQITCGGVVFSFLNQIFTGIMTAMGNSRTSFVATAVGLVINIVM DPVLIFGIGPFAEMGVMGAGVATVFAQMIVTAVFILAALKDEIIFQKVRLLEKVDRESMM TIVKIGFPTGVQSMIFTSISMIIARMIAGFGDSAVAVQKVGSQIESISWMTAEGFGAAVN AFMAQNHGAKNKKRIVQGYKVAMRIEIVWGILCTFILIVFPEVIFKIFIPEADVLPMGVD YLKILGVSQLFMCIELTTAGAFSGLGKTVPPSISSIVLTAARIPMAVALTSTALGLNGIW WSITISSIFKGVILTGWFLIYLKKMMSAKE >gi|330400063|gb|ADLB01000005.1| GENE 9 7059 - 9008 1868 649 aa, chain + ## HITS:1 COG:CAC1572 KEGG:ns NR:ns ## COG: CAC1572 COG3855 # Protein_GI_number: 15894850 # Func_class: G Carbohydrate transport and metabolism # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 6 648 16 661 665 751 57.0 0 MKTLEMKYLERLAEKYPTIASASTEIINLQAILHLPKSTEHFLTDIHGEDEAFTHVLKNG SGSVRRKIDEVFGNTLTNRDKKSLATLIYYPKEKMELVRHTETDMADWYRITLYRLIEVC KCASGKYTRSKVRKALPKDFAYVIEELLTEKSDAEDKEHYYNSIIDTILRIGRAEEFIIA MSELIQRLVVDHLHIVGDIFDRGPGPHKIMDKLKKYHSLDIQWGNHDILWLGAAAGQLAC IANVIRICARYGNLDILEDGYGINLLPLATFAVNTYKDDPCGCFRIKGQQGCSSAEMNNE IKMHKAISIIQFKLEGQLIERNPQFHMKNRLLLHWIDYDRGTIQIGDKEYALLDTNFPTV DADNPYELTEEEQDIMERLQKAFVNCEKLQGHMKVLLSKGSLYKVYNGNLLYHGCVPLNE DGSFKEVQIYGKTYHGKELYDVLESYVRKAFFAADKVERERGKNLLWYIWLHEDSPLFAK DKMATFERYFLEEKETHEEKKNPYYYLLENGEVMNRILEEFGLPKEGSHIINGHVPVKSK DGENPIKCNGKVLVIDGGFAKSYQKETGLAGYTLIYNSYGMILAAHEPFESRESAIQKET DIHSDTFLVKRVEKRTLVEDTDTGKELKEQMADLERLLEAYRSGKIAEK >gi|330400063|gb|ADLB01000005.1| GENE 10 9079 - 10410 1566 443 aa, chain - ## HITS:1 COG:TP0917 KEGG:ns NR:ns ## COG: 
TP0917 COG2239 # Protein_GI_number: 15639902 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Treponema pallidum # 9 437 11 441 449 383 48.0 1e-106 MNTTIFMELLAKREFKVIRSILDVMNAVDIAILLSELDDKELALAFRLIPKDKAADVFSN MNNSMQTYLVDIFTEKELKELLDDLYMDDTVDMLEELPANLVTRILDTVDSTKRNSINQL LNYPDDSAGSIMTTEYVNLKKSMTVKDAMSHIKAVGIHKETIYTCYVLEHRRLIGIVSAK DLMTMDDETVIEEIMETEIISVSTHTDQEEVAKLFSKYGLLAIPVLDTGNLMVGIVTVDD AMGVMVDEATEDITIMAAVNPSEKSYFETSVFSHAKNRFLWLLILMLSSTITGAIITKYE NAFAAIPLLVSFIPMLMDTGGNCGSQSSTLVIRGLALEEIKFSDIFRVMFKEFRIALLVS AGLAIANGIRIFIMYHDAKLAIVIGLSLIATIIISKLIGCVLPLFAKKLHLDPAIMAAPL ITTLVDTCSIIIYFSIATKIFAL >gi|330400063|gb|ADLB01000005.1| GENE 11 10865 - 11620 975 251 aa, chain + ## HITS:1 COG:BS_yqeM KEGG:ns NR:ns ## COG: BS_yqeM COG0500 # Protein_GI_number: 16079615 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Bacillus subtilis # 4 248 3 241 247 159 35.0 4e-39 MEAYTSFASVYDTFMDNIPYEEWSVYVKELLEEYGVTEGLVLELGCGTGTMTELLASAGY DMIGIDNAEEMLEIALEKKLSSGHDILYLLQDMREFELYGTVKAAVSVCDSINYIMEEEE LEEVFRLVNNYLDPKGIFIFDFNTTYKYREILGDRTIAENREECSFIWDNYYYEEEEINE YELSLFIREGESDCYRKYEETHYQKAYCLETIRRLVERSGLEYITAYDAFTREKPTENSE RIYVIARERGK >gi|330400063|gb|ADLB01000005.1| GENE 12 11623 - 12501 1199 292 aa, chain + ## HITS:1 COG:BS_yacC KEGG:ns NR:ns ## COG: BS_yacC COG1281 # Protein_GI_number: 16077139 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Disulfide bond chaperones of the HSP33 family # Organism: Bacillus subtilis # 3 285 2 283 291 267 46.0 2e-71 MKDYIVRATAGNNQIRAFAATTKEAVEKARQAHNTSPVATAALGRLLTGGAMMGSMMKND TDMLTIQIKCDGPIKGLTVTADRHGNVKGYVENPEVMLPPNKQGKLDVAGALDLGVLSVI KDMGLKEPYVGQTILQTSEIAEDLTYYFATSEQVPSSVGLGVLMEKDNTVKQAGGFIVQV MPFIEEEVLSKLEENIKKISSVTSMLDKGYTPENILEEVLEGLDVEFTDTVPTQFYCNCT KERVEKAIISIGKKDIQEMIDDGKEIEVNCHFCNTNYTFSVEELKEMLKRSR 
>gi|330400063|gb|ADLB01000005.1| GENE 13 12504 - 14417 1906 637 aa, chain + ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 323 630 1 309 310 412 64.0 1e-115 MKQGFVKVAAATPDIRVADISYNTEQICKLIDETVANGAKIVVFPELCVTGYTCGDLFFQ DILLERSKEALNEIAEYTKDKDALVFVGVPLAINGKLYNVAAALNRGKIIGLVTKTFLPN YSEFYEMRTFQAGPEEARVILYEGEQVAFGPQILFKAKNMEELIVSAEICEDVWSPIPPS IMAATAGATILVNCSASSETYGKDGYRTSLISGQSARMIAGYVYANAGAGESTTDLVFGG HNIIAENGMVLKESRRYMNDVIYSEIDVHRLLNERRKNTTFQQMSGKHFLVTVPFEIEKE ETELTRSISQMPFIPEDEKERDECCEEILMIQALGLKKRLQHTNCRNVVIGVSGGLDSTL ALLVAVKTFDMLGIDRKNITAVTMPCFGTTDRTYQNACVLIEKLGVTLREISIQEAVRNH FKDIGQREDLFDTTYENAQARERTQVLMDIANMENGMVIGTGDMSELALGWATYNGDHMS MYGVNSSIPKTFVRFIVKHCAVECEDETLKKALFDILDTPISPELLPAKEGEIEQKTEDL VGPYELHDFYLYYMLRYGYEPSKIYRLARKAFAGVYEDETILKWLKSFCRRFFTQQFKRS CLPDGPKVGTVGLSPRGDLRMPSDACVTIWLEELERL >gi|330400063|gb|ADLB01000005.1| GENE 14 14436 - 15707 809 423 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDKKWAETYRKQVRFGTAAIYIIVVAWLFFTGCELFRTGDINGTWWLFTIVVLGLELGFR FWKRNQMKKIQDVLYEECDPFRFKEIYEYFRAKTKKERIKSLYSIQISTALLMQGYEDSA YELMNQIDFSDLPPRFELSYYDFMRMYFSCKNDEQQLAKVKGVFERRLGNAKAMERKLII QQIKYMDLQVAVHRRDYAIYDKLILECTSDMTRMLQKVSVYMLMARAEFGRGNADKAKEY CQFVLQHGNRTYYVENAKKLLAVLDGQEEEEQSATESVQWEENQLYKEKPDLKFFDGRKK YRRRQRMNYLLWIAFGYAIGFIIYGAMGLKLHQELSNMEVEVWLWGNRALFVLQFGFLGG LLIAGVINGVYLFLKVISKLSLAMKIVIIVFTIPFFLVAGSISVIPYCVYQIVMIVKERE KKK >gi|330400063|gb|ADLB01000005.1| GENE 15 15778 - 16446 287 222 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|87308954|ref|ZP_01091092.1| 30S ribosomal protein S1 [Blastopirellula marina DSM 3645] # 31 203 239 413 536 115 38 1e-24 MSEEQLQTETMADYEDQFDVANPWNIVKRYMEEGTTLSVKVEGIVNGGAIAYIEGIRGFI PVSRLSLSYVENVESYLLKELEVRVIDVDQAENKLVLSAREILKEKENKEKARRLSAVEV GSVLEGTVESLQTYGAFVRLENDLSGLVHISQISDKRIKSPKDVLSAGDAVTVKVIGIKD GKISLSMKALIEPEEEVVEEVEIPKAEEIGTSLGDLFKNIQL >gi|330400063|gb|ADLB01000005.1| GENE 16 
16541 - 18202 1730 553 aa, chain + ## HITS:1 COG:FN1824 KEGG:ns NR:ns ## COG: FN1824 COG1227 # Protein_GI_number: 19705129 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase/exopolyphosphatase # Organism: Fusobacterium nucleatum # 11 553 3 537 538 281 33.0 3e-75 MEFMESKRGRKVYVVGHKNPDTDSICSALAYANLKKQITGDDYVAKRAGQINEETQYVLK RFGVTPPGLLNNVQTQVKDIDINKMSGVESSVSIKKVWALMKENNAKTMPVTANGELEGL ITIGDIATSYMEVYDSNILASARTQYRSIANTLDGEIVIGNEHAYFIKGKVVIAASSPEL MENFIEQDDLVILGNRYESQLCALEMDASCLVVCQNATVSKTIKKIAEERSCVIIRTPHD TFTVARLINQSIPVKHFMTKENFTAFKTSDYIDDIKEVMTKKRYRDFPVIDKHGRFVGFI SRRRLLDAKKKQLILVDHNEKTQAVDGIEEADILEIIDHHRLGTGIETVGPVYFRNQPVG CTATIVYQMYLENHVEIDETTAGLLCSAITSDTLMFRSPTCTAVDEKAGRELAEIAHIDI EELAGNMFKAGSNLVNKTAKEICFQDFKKFEVSGVDFGVGQINSMSREELEEIKQRILPY LKTALVEKGIDMMFIMLTNIIEEATELLCVGNNARNLVLEAFDLSADTEDIILKGVVSRK KQLIPAIVVSLQS >gi|330400063|gb|ADLB01000005.1| GENE 17 18212 - 18784 410 190 aa, chain + ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 1 184 1 185 197 177 48.0 1e-44 MAKQTWKAGNMLYPLPAVMVSMADGEGKSNIITVAWAGTVCTNPPMLSISVRPERYSYDI LKRTGEFVVNLTTEKLAYATDYCGVRSGRDVDKFKEMKLTEEKASIINAPLIKESPVNIE CKVRKIEELGSHHMFLADVVAVNIDERYLNEKNKFELSKAKPLVYSHGEYYGVGNLLGTF GYSVRKKKKK >gi|330400063|gb|ADLB01000005.1| GENE 18 18889 - 19569 597 226 aa, chain + ## HITS:1 COG:L81453_2 KEGG:ns NR:ns ## COG: L81453_2 COG0726 # Protein_GI_number: 15672265 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Lactococcus lactis # 41 224 23 206 222 156 42.0 3e-38 MKKIKKLLCLCGFLYTLVLFQTLGISFLHKEKVVETGKVLSEKPQIAITFDDGPSEKYTP QLLDGLKERNVKASFFVIGKMAEKNPKLIKREQEEGHLIGNHTYNHVDISKMSDEAAVTE IEKTNQTIEKVTKKDVEYLRAPFGSWKKNLVARMNVFPVTWSVDPLDWTTENSDEIVNKV 
VTEVKENDIILMHDCYQSSVDAALRIIDILQKEGYEFVTVDKLIVD >gi|330400063|gb|ADLB01000005.1| GENE 19 19588 - 20304 730 238 aa, chain + ## HITS:1 COG:CAC0908 KEGG:ns NR:ns ## COG: CAC0908 COG1179 # Protein_GI_number: 15894195 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Clostridium acetobutylicum # 6 235 7 248 251 249 54.0 2e-66 MINEYSRTELLIGEEGVSRLKKSSVMVFGVGGVGSHCIEALARSGVGKLILVDNDVVSMT NINRQSIAYHSTVGKYKTEIMRDRIKDICPEIEVVTYETFVLPENIDTLFTEKVDYIIDA IDTVTAKLVLVEMAKEKNIPIISSMGTGNKLHPERFEVTDIYKTSVCPLCKVMRKELKAR GIRKLKVLYSKEQPVDTSGKVIEEKMGKRRSLPGSISFVPPVAGLIIAGEVVRELAGV >gi|330400063|gb|ADLB01000005.1| GENE 20 20315 - 21637 1186 440 aa, chain + ## HITS:1 COG:BS_yrvN KEGG:ns NR:ns ## COG: BS_yrvN COG2256 # Protein_GI_number: 16079807 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Bacillus subtilis # 17 438 3 420 421 426 50.0 1e-119 MDLFEYMRENTKEKESPLASRLRPTKLEEVVGQQHIIGKDKLLYRAIRADKLSSIIFYGP SGTGKTTLAKVIANTTSAEFMQMNATIAGKKDMEAVIEQAKNNLGMYGKRTILFIDEIHR FNKGQQDYLLPFVEDGTVILIGATTENPYFEVNGALISRSSVFELKPLEREDIKILLKRA VYDTEKGMGTYRAEIDEDALEFLSDISGGDARNALNAVELGILTTPRSDDGKIHITLEVA EQCIQKRVVRYDKTGDNHYDTISAFIKSMRGSDPNAAVYYLAKMLYAGEDIKFIARRIMI CASEDVGNADPNALTVAVSAAQAVERIGMPEARIILAQAVTYVASAPKSNASYMAINNAM DNVKRKKTTVPSHLQDSHYKGAGNLGHGIGYKYAHDYPEHYVKQQYLPDEIKEETFYHPT ENGHEKVIKEYLDKIKARSY >gi|330400063|gb|ADLB01000005.1| GENE 21 21658 - 22170 449 170 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0407 NR:ns ## KEGG: Cphy_0407 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 170 3 171 173 172 49.0 3e-42 MSETFTLYKLIILYMLEKVDFPLTNGQISEFILDKGYTTYFKLQQALSEMVEARFIHEET THNRTLYHLTEEGSETIQFFINKLSPAIRDDIDAFFKEHKYELKNDVSVKADYFPNANQE FSVRCQIIENNASLMELVLTVPSEHEAEAITNNWRQKNEEIYAFLMKNLL >gi|330400063|gb|ADLB01000005.1| GENE 22 22185 - 23114 394 
309 aa, chain - ## HITS:1 COG:CAC3238 KEGG:ns NR:ns ## COG: CAC3238 COG1242 # Protein_GI_number: 15896484 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 303 14 315 324 308 49.0 1e-83 MKRWGDKRYYSLDYYLKEVYGEKLYKLSLNGGMTCPNRDGTLDSRGCIFCSAGGSGDFSA PSSIDIREQIAYGKKLVERKYKGHSYIAYFQAYTNTYAPVSHLEKIFTEAIQEPEIKILS IATRPDCLGKDVLDLLAKLNQIKPVWVELGLQTIHEKSAEFIRRGYTLDVFESAVKNLRA IGITVIVHTILGLPNETEEMMLETISYLNQLDIQGIKLQLLHILKGTDMAVFFKNDKFYL PSLEEYMVLLSRCISLLRPDIVVHRLTGDGPKDLLIAPLWTGSKRFVLNSIQRYFKSEDI WQGKEYFPQ >gi|330400063|gb|ADLB01000005.1| GENE 23 23114 - 24262 1087 382 aa, chain - ## HITS:1 COG:SA1524 KEGG:ns NR:ns ## COG: SA1524 COG0281 # Protein_GI_number: 15927279 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Staphylococcus aureus N315 # 1 378 1 379 409 422 57.0 1e-118 MTTNEKALQMHEQWKGKIETVAKAKVNSREDLAIAYTPGVAEPCRVIAEDRDAVYKYTMK SNTIAVVSDGSAVLGLGNIGAYAALPVMEGKAVLFKGFGNVNAVPICLDTQDTEEIIKTV VNIAPAFGGINLEDISAPRCFEIEERLKELLDIPVFHDDQHGTAIVVLAGIINALKITGK EKENCKVVVNGAGSAGIAITKLLLTYGFKHITMCDINGMISKSSPNLNWAQEKMTDVTNL EEKTGSLADALNGADIFIGVSAPNIVTKEMVQSMNKDAILFAMANPVPEIMPDLAKEAGA RIIGTGRSDFPNQVNNVVAFPGIFKGALEGRATQITEEMKLAAANAIAGLVSDEELSDTN ILPEAFDPRVADAVSNAVKSLI >gi|330400063|gb|ADLB01000005.1| GENE 24 24484 - 25809 1235 441 aa, chain + ## HITS:1 COG:CAC2889 KEGG:ns NR:ns ## COG: CAC2889 COG1158 # Protein_GI_number: 15896143 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 6 440 10 483 483 430 52.0 1e-120 MREKYESLPLATLRDLAKARHLKGVSTMKKADIIELMLEEDEKDNTKAVREEVKNEKSGT ADIEQLDSGIVANGILEVLPDGYGFIRCANYLPGENDVYVSPSQIRRFNLKTGDIVEGNT RVKSPTEKFSALLYLTKINGMNPAEASKRGCFEDMTPIFPNERLHLEGPGSSAAMRIVDL ISPIGKGQRGMIVSPPKAGKTTLLKEVALSILKNNPEMHILILLIDERPEEVTDIREAIC GKNVEVIYSTFDELPEHHKRVSEMVIERAKRLVEHKKDVTILLDSITRLSRAYNLTVSPS GRTLSGGLDPAALHMPKRFFGAARNMREGGSLTILATALVDTGSKMDDVIYEEFKGTGNM 
ELVLDRKLQEKRVFPAIDIVKSGTRREDLLLTADEQEAVNNMRKALNGMKAEEAVDNILN MFSRTKNNEELVQTVRKTKFI >gi|330400063|gb|ADLB01000005.1| GENE 25 25897 - 26220 291 107 aa, chain + ## HITS:1 COG:no KEGG:Ccur_02270 NR:ns ## KEGG: Ccur_02270 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 2 98 18 124 130 102 43.0 6e-21 MNIEKKRENEKEMLRTMITVYCRGVHKTKGQLCPECKELLEYALFRTEKCPFMATKTFCS ACKVHCYTKEKQEKIKAVMKYAGPRMIFSHPLQAVKHMWVTIKGKIK >gi|330400063|gb|ADLB01000005.1| GENE 26 26337 - 26540 339 67 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240147058|ref|ZP_04745659.1| large subunit ribosomal protein L31 [Roseburia intestinalis L1-82] # 1 65 1 65 67 135 90 9e-31 MREGIHPDYYQATVTCNCGNTFVTGSTNEDIHVEICSKCHPFYTGQQKAAQARGRVDKFN KKYGIQA >gi|330400063|gb|ADLB01000005.1| GENE 27 26617 - 27549 1035 310 aa, chain + ## HITS:1 COG:CAC2886 KEGG:ns NR:ns ## COG: CAC2886 COG3872 # Protein_GI_number: 15896140 # Func_class: R General function prediction only # Function: Predicted metal-dependent enzyme # Organism: Clostridium acetobutylicum # 2 309 3 313 317 213 41.0 3e-55 MKSSNIGGQAVLEGVMMKNGEDYAVAVRKPDGEIEVKKDIYKGIVKWKFLTKTPFIRGIF NFIDSLVLGMKTLTYSSEFYEEEEEKIPLTEEEERKKKKKESILTGVTVGVSVVLAVGIF MVLPYYLISFAENIVHSKAALAALEGIIRILIFIAYILMISKMEDIKRVFMYHGAEHKCI NCIEHGMELNVDNVRKSSKQHKRCGTSFLLFVMIVSCVMIFFIRSDSQLVKVGLRIALIP VIAGISYELIKWAGNSDNPIVNLLSKPGLWLQNLTTKEPDDDMIEVAIQSVEAVFDWRTY QKETYGVSEE >gi|330400063|gb|ADLB01000005.1| GENE 28 27546 - 28370 303 274 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237684513|gb|ACR11777.1| protein-(glutamine-N5) methyltransferase, ribosomal protein L3-specific [Teredinibacter turnerae T7901] # 49 241 64 263 305 121 36 1e-26 MTLEEAYGFGKRTLEEHKIADASIDAWILLEYITKISRAMYYANPKREMTGEQKTQYKYF VEERAKRIPLQHLTKEQEFMGLSFEVNEHVLIPRQDTEVLVETVLEDLEENMRVLDICTG SGCILISLLKIMRGVKGVGVDISEEALEVARRNAQKHDMEAVFIQSDLFENVEGTYDVIV SNPPYIKTEEIEKLEEEVKLHDPMLALDGKEDGLYFYRKIIKESRKYLKRNGKLYFEIGN TQGEEVKTLMEEEGFTNVKIKKDLAGLDRVVCGV >gi|330400063|gb|ADLB01000005.1| GENE 29 28409 - 29479 
1134 356 aa, chain + ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 1 354 1 354 359 381 56.0 1e-105 MFDRLEDLIIRFEEIMGELQEPDVANDQNRFRKLMKEQADLTPIVDAYNEYKSAKQAIAD SEEMLEMESDEEMRELAKEELSASKKRVEELEHELKILLLPKDPNDDKNIMVEIRAGAGG DESALFAAEVYRMYVHYAESKRWKTELISFNENGIGGFKEVVFMVSGQSVYSRMKYESGV HRVQRVPETESGGRIHTSTITVAVMPEAEEVDVVIDDKDIRIDVMRASGNGGQCVNTTDS AVRLTHYPTGIVIYSQTEKSQLQNKEKAFRLLRSKLYELELEKQQSAEAEARRSQIGTGD RSEKIRTYNFPQGRVTEHRIKLTLYKIDNIMNGDLDEIIDPLIAADQAAKLANMNE >gi|330400063|gb|ADLB01000005.1| GENE 30 29705 - 34315 4542 1536 aa, chain + ## HITS:1 COG:no KEGG:CPE1523 NR:ns ## KEGG: CPE1523 # Name: nagL # Def: hyaluronidase # Organism: C.perfringens # Pathway: Metabolic pathways [PATH:cpe01100] # 59 1037 49 1019 1127 362 29.0 9e-98 MRKRKIWQSLVSAILTVAMVSTAVIPAKAVGNEKESIEAVKSPWVSEWKKGDAMTTGEEY QIYPIPQQVTYPKEKQSFEINKEVQTVIGADVDDATKKHLEKVLKKYDRTAKESKADKKE SKIILGIYGKNDEADKWFKDKNLDAKHFEKEDAYALYAKDGDIVILGKDSDAVFYGVSTL EMMFSSFAGKKFYPVEIMDYASIDARGYIEGFYGNWTHEQRKSLMKFAPEAKMNLYVYAA KTDPYHTDKWDQLYPAEQINQFKELADLQAETKAEFSWSVHLTNVLKGVTKEDATYNERK EKLKKKFDQLYNIGVRRFCVLNDDFGSGSNEIVVKLINDLTEEYIKPKNCKPIIFCPQGY NEAWARPAELEAMKKFNEEVLIFWTGKDVNSPFEQSTIKYVKEKTGHDPVFWVNYPCNEH AKSGIFLGSSTHYIRDNITGLAGAVSNPIFFAEADKVPLFQLGAYFWNVNNYSKNVEKVW EQCFKYLQPEVYDEYLTIARNVSDCPNSGRVPQGFEESLYLKDTLDKVKSAAENDTFKAD AKEVKELLAEFKHIQSAIADFQKNCTNEALVQELSNPGGTENGEGWLQALDNVAKAGEAL LQAEAELSAKTPDMSKVWTEFSKASKEMNAYNSRTYKFGQGNDRVTPKAGNKRLVPFVNA CMKDVQKELDKWLSSDNTEKKADRIYTNMPAYAKTPLTIAEKEFGVRDLSNVTLNKGDYI GVKKEAIAEISSIIVEVENTDKLTLEYSLYGDSWQTAKAGNQTKHLEAKYIRLINKGNTA VNVNLKRLAMVVENLPNDLRFKETNINEIKEGKWDNMVDGNKDTFVWTNRAQKVGDYITV DLGETKPIRDITFWTSDGNPRIKKASLSISTDNKNFTKITDFNDNGKVDPPLRSYSANAN GQRGRYIRLEVTEAYGYYLKIHEIEVNKGDKPNYSIPPESAVTNTKGDVKALDDKNLSTL FHVQNVKAGDKLEYRLTDNVNLDSFTVFQGTPCNADVTLVKADNTEQKLGKLTSGKQDFS 
GIDGNIHTIRFDFEQGKDVELHELVFTYGENPSDDIGVAVENIYIDSSEPDSEEVVNLAL NQKVEVSDVERVKGSGAETNYTGNLSVDGDKNTRWSSGALNDKSYGEPADQWLILDMGEK PVLMNEIQIDFYKKVWPTEYQIQVSNDKQNWVTLETIIRESANKEGISDKKTFENPLMAR YIRLYFPKEKLNTQAVGGSVSITELTVKGIRKTTPLNYLAVTDTFEDVKVENAATVKDLH LPEIVNVKMAKATGEEIAVQVLPKWNTEGFDEAKDVSLSLKGGLPTGALLINDKKVEAVQ TVVKGEPKEEDLTELRKALQDFIDTAEKMISDAESNHRLYDVSTIDALKPVLEKAKEVLA DTESTRDSLESAKTDLEKAIEQVNRLNLTAFDNAISQAQKVDKDLYTEESVQKLENVLAE AGKLFDKGVTQEDVDKMTGEVEKAYSGLETKPAPEPEPEPKPEPKPEPKPEPKPKPEQKP ENKKPNVPKTADTEPIFLFGGLFLLSGLAVVLTKRK >gi|330400063|gb|ADLB01000005.1| GENE 31 34428 - 35180 629 250 aa, chain + ## HITS:1 COG:no KEGG:BHWA1_01124 NR:ns ## KEGG: BHWA1_01124 # Name: not_defined # Def: hypothetical protein # Organism: B.hyodysenteriae # Pathway: not_defined # 1 250 1 254 256 211 42.0 2e-53 MDLKKIIFSATGGTEKVADILIKGLDESAEKIDLLDRNREFSKICLQKDDVCMLAVPSYG GRVPKEAAERIARIKGNGAKAILLVVYGNREYDDTLLELKEVAKKSGFLPVVAVAAVAEH SIMHQFAKGRPDKQDREELLSYVQFIRERIENPKPFKVPGKKPYRAYGKIPLKPQIKRTC NHCGKCAKECPAGAISMGEYIKLDKNLCISCMRCVAVCPRHARKLSPVLLKLASHKLKKV CRKRKKNEIY >gi|330400063|gb|ADLB01000005.1| GENE 32 35167 - 36957 991 596 aa, chain + ## HITS:1 COG:BS_ywpD_2 KEGG:ns NR:ns ## COG: BS_ywpD_2 COG2972 # Protein_GI_number: 16080688 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus subtilis # 394 583 1 198 218 135 38.0 3e-31 MKFIKKLAGSNRLKAIITEFTAGILVVMYVTFLLMPNTVGIVRPDKRIGKMDGWYYEENG EKIKPELPTVIDKVNHGTFVVENTLPEIAYRNAAVAFYSQQQKVKVFLDEEMIYEYPERK LAGDMIPSTWNFIRLPQDSSGKHIRIEISSPYKRFQGKMDSIYYGSYNSLYYGIREWKFP AFFTSLFIGSMGIIMLLISIFLRKFRGYRREETLGILFILVSLWLSGVSQMVYKSVGAEP RHFITMLSALFCNVFFLSYLEQRVEGESQKITRTAFYLSIVSAFGCLILQITGIKDLVET SNYMLMTLILSFIYACWIYGKKILAKEEGYNKGEFICMILILLAGMVEWGHFYSKKWHTF GVCIRGTVLIYTVYLFGYYILEIYGTAQLNRKLSKQLQDSEIQLMMSQIQPHFIYNTLGS IRALIKISPDEAYKMVYDFSNYLRGNIDAIGNQQNILLSEELRHVKSYVNIEKVRFQNQL EVQFDIQEENFYIPPLSVQALVENAIKHGIRKKSGGGCVWIRSYKRGENYIVEVEDNGVG 
FDLKEKRKKGSVGLRNIQFRLEKISRAKCELWSENGKGTKVTLTFPKIFRGGVRAN >gi|330400063|gb|ADLB01000005.1| GENE 33 36954 - 37751 683 265 aa, chain + ## HITS:1 COG:FN0219 KEGG:ns NR:ns ## COG: FN0219 COG3279 # Protein_GI_number: 19703564 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Fusobacterium nucleatum # 4 142 5 143 240 89 32.0 9e-18 MKIVIVDDEWLQIKQFEMECEGLKGVEIAATFTNPLEAYEYLKEHPVEVVFTDIEMPEMN GLAFGKELRELYPDLVLVFVTGYEEYAREALRLRADYYVLKPYNQEEIRDVLRRSYYFSK RQKKRVYFRMFGRFDVFIDEQVVYFANAKAKELLALCADHRGGNVTMEEVIDKLWGERLY DERVKTLYRKAVMYLKKLFQVYEVEDVFVNNRGSCNIHYEKVDCDYYAVLEGRTDTMREK QYVGQYLWEYSWAEETAIELEEKIK >gi|330400063|gb|ADLB01000005.1| GENE 34 37928 - 49024 12775 3698 aa, chain + ## HITS:1 COG:no KEGG:Balac_0261 NR:ns ## KEGG: Balac_0261 # Name: fbp # Def: hypothetical fibronectin binding protein # Organism: B.animalis_lactis_Bl-04 # Pathway: not_defined # 56 2713 51 2579 2696 690 30.0 0 MKRERKKFRDSFVAILLAISMVVGLVPVNSITSYAKDKSAGNVKSQPRAAFDEESVSDES SLETWTKVVENSTENIGRIWVDKTVSDKNIKLPASSQGKEIEIDKGTSDFLVGLSALSST SNISTVADKPLDIVFVLDTSGSMSDPMEYIYSPTYNVVTDGRVEYYAEVEGNYVKIDRIT GLFGFFKHWEVAGKEVTPKKNESDTNGIQFYTRREKPNSQSKMGALKIAVNQFAQETAKR NDSITDAAKQHRMSIVTFSSESYIRQSLKAYNSNTVSEFERTINGLNANGATYANLGMEK AKESLKNVREKAQKVVIFFTDGTPGRSGFDDDTANNTIQAAKSLKDDLTKIYSIGVFDQA NPDNTSSSFNAYMHGVSSNYPNATKWTELGERAENSNYYKAAQDADELNKIFEEIFEEMN SGSGFPTHIEDGYDAGKGGYITFTDELGKYMQVDGFKDLVFADQTFKPTGSEEKDGVVTY HYEGKASGDSELYPEGNINEIIVQVQKSKNLQKGDTVTVKIPAGLIPLRHFNVDTKNQET TMSIKEAYPMRIFYGASLKPEVVETLKDGLDDRNDTDRELNAYLKDNRIDGNKAAFYSNF YDGSKTSGAKKLGNTVASFTPSKSNSFYFFTEDTVVYTDEKCIQPLKTNPDTSGKTSYYY EKNYFVKNTDGTVEKKEESAKFNGADFDTSSTTWGKNNKGEVYMKAGAARLTRVDDLTLT KEKNPTGTATEVINPQWDNINNPQNVQVYLGNNGRLAVELPGALAITKDATVTPNKGLNE EEIVKDKKFKFQISIPEMAGKTGYVQVKNQQGDIAGEVQKVTFDGKGKLEYSLKNDETLL IYGLNANASYAVTEIEESMPKGFTLTSVDGNADTTEASGRIEAGAVKKHTFVNTYDVTPT TVNAEEFAKYRKDFDNWEVTDSFEITLRRYTKDAPMPEGSTGNEKTVTVTENSREGNFGK 
VEFTKPGEYRYTVVERRPNEAMPGMTYSAAVYRVIVTVTDNGDGTLSANAVMTKTSDNSG QDINPPQEIGDKTAVFVNTFNAESVSAGAVANKVYIDNSGKKPLTNGMFRFKIRAIGENA AEAPKPAATQPDKDGYYYVHNEGGRVQFEQFVFTAEHVNRTWTYEFSEVLPEGATDNKDG TYTLDGMKYDGQTYKVEFAVTSEEMNGKPTVKVVKTYKNAKGEVLPEGGAEFHNEYTPKE IVLPGGDSAAIRGSKTLAGRDSLKDEAFSFDMKAANQEAKTALEKDWIVFENDKTKDTMK VTVDALKNGVEKPFNFGSVTITRPGTYQFNVNEIIPVYGIGEGNKGMVYDTHSALVTIVV TDENGTLKASVAYNNGEGGVTDRAAFLNTYRAETTYGQGAALNVAKTLNGRALSAEEFNF EIKGNSEEAEKKLSAGDRAFTNTSGANDGVASLIYNKMSNVKFTQDDAGKTFSYTVREVE PKENKLGGVTYDKTTYTIDIEAVDNGNGTMHTVTTVTKNLANGTTEKIGTYNSADGADTF VVSFVNQYTAKPVDVNTATDATLTKLLTGREWKEKDNFEFTLQAKTPTDAPMPEGAQGEP KKKTITVGKPAEGNKASFDFGTIRFDKAGEYVYTVTEKNAGETIDGITYDEYAATILVTI SDAGNGQLIATVDMYDGDFINEYKAELNHDDAGGLVVEKTLTGHDMKEGQFTFQVKALDS DNVTAEEAAKRIGISEGVAGEFKNTAGKDGETVTMETENPIIFTQEDVNKHFRYKFTEKG ANGEFGKGGTKDGYTFDDATYTVELWVTDDGDGTLTLHTSVNDEEQISSENDKKPTVLSF NNRYDATGSLDGTTKLAGQKEIAGPWPSDDYSGFEFAIAGGDETTNDAIKNGDVVLPKDT TVTSEKDGSFHFGNITFNRKGEYVFHVTEKPGNRNGVTYDDSVKVVTVTVTDNNDGTLTA AIKEGSDELTFTNAYNTTENATFVPSITKQVKGHDAVEDITFIMKAKDDVTKNAIQAGEI TGYKEEATISKDMLKKGGDPQAVTFGELTFTKAGTYTFDVFEKAVSVPNGWTYDADHYEV KIKVTDENSKLVAMQDISDAANNLKRNFVNVFAADTTYGAEGGLNVTKELQGRKLKENEF AFTITGQTEEAEAKLDDSDRTFKNSKPGDNGIAVMSKLGKVTFNQDDIGKTFTYAVHEVV NARGAIAYDNKDVTVAVQVLEENGDIYTVTTVTKDGQQLGVYNSKDEKATAVAPFVNSYT PDIIEMNPAEMAGKVTKVLKGNREEALKDNEFTFQMKVDALEGDMNTVVLPELTATNKAD GSVSFGKVKFTAVGKYRFTISEVIPGTPDKQMTYDRHTFSYVVEVSYDSTEGMLSAKVIE KTGSPVFTNIYTDEKSKDVANGEDKNTSADGDMISVGDTLTYSIDWVNNAVDENGQPTVA DVIIEDTIPEGTRYVEGSASNDGKFADGKLTWSFKEQEAGAGGTVTFKVEVTEDAVKVDT IKNTASIKIGDNAPKQTNEVENFVPGKDVKNDNGSSDIKVGDTVTYTIKYRNTEDADATV TITDKLPKGLTFKDADNGGSYDEATGTITWKLTKVKPGTTGTVTFKAVVNEDALVEDGVN NTAGVQVGENNPVIDTNTESFNTKKGNLTISKEIKLTEDQGTVIDKNKAFEFKVVLKDTA GQELKGEYTYGDNQKIRSGQTINLKHGESITIKGLPEGASYEVTETKAAGYTPNESTITG TVPAEKTAEAHFINTYSVADGSLEGAANLKVKKEFTGREGNKWLEKDRFTFKLTMAKGTP EGAVVLPSNADEIVIDSTTPNKAAAFGNIIFKRAGEYTFYVTENPSGILGVVDDKNATRE VVVEVVDNGDGTLSAKVKNAAEGLLTFKNVYKIQNPVEVDTTPTDTSVFFEKVLEGRDWL 
DTDEFTFTIAPEGDAPVPEKTTATVKKSDVKNKKAPFGFGKIIFTPELMGTAHTKKFTYK VTENDINKDKMPGVSKDTHTATLEVTVSDNGEGKLTAATSVVNNGTFTNHYRSSIDYGAA GGLVITKVLEGRNAAKDQFTFIMKAKDDASAEKLGISKEGTTFKSPAAKDGETATVNVLP DKVEFTQADAGKTYRYTAEEVNDGQKGYTYDNTKHEVEISVKDNNDATLTVTTKVGDKTY TYKTNAEGEKAVIPFKNKYAASTDNAGGTKAPVVTKKTLTGRPLTAGEFTFSVVYKNSKD VVVKDVKNSADGTVNFGELSFTIDSLKAAAKKGDAVQNPNGTWTVSYTAVEKTDGLAENG VTPVKNTFDFTVTVTDNGDGTLKAVTNLPEGHGFTNAYSTGEPVEVNVNGSKVLSAEEGL TPPDITGKFTFTLKGEKDAPMPEKTTAQNDAQGNVDFGKVSFTLENVFGEKASAKAGEPR SKTFTYTVTESGKVAGVTNDKETKTFTLTVTDDGKGHMTVTKNPADKTLFTFTNTYSIES PITSSVTDQISITKELEGRKMQAEEFNFELLEGDKVVATGTNDADGKVTFTGLEYNKPGE HDYTVREVAPEDTYGVTYDTSEFMIHTSVTDNSDGTLSVKHTTEGEIVFRNVYKAAGTSV TLGASKVLKGNDLVKGQFTFRLKDANGNVVAEAKNNENGQIIFETLEYDKAGTYKYTISE VNDKQENITYDEKVYEVTVNVVDDEKGQLKATVEGDSAVFTNEYAEKAMVLTPAKPKDDT PVKTGDESNPMAMAGMMTIAGMVILLNVLTAVRRRRQR >gi|330400063|gb|ADLB01000005.1| GENE 35 49186 - 50919 237 577 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 355 561 16 226 245 95 33 6e-19 MKDKLKKLFAYYKPYKLLFYSDMFFAILGAGVTLVIPLIVRYITSEVVYFDGNRAMETIV ALGALMIVLVLVEFGCNYFIAFYGHMMGAKMEADMRSDIFSHYQKLTFAFYDNQKVGALL SRITSDLFDITELLHHGPEDIVISIIKIVGSFIILANINLRLAMVAFVFVPVMIVFAVYY NRKMKTAFKRNRARIADINSQIEDSLSGIRVVKSFANEEIEMKKFHKGNDGFLAAKKISY RYMAGYHSTLTALTTLVTVVVLLVGAEGLTTKTVLVADLVTFLLYINNFIEPVKKLINFT EMFQNGYSGFDRFQEMLAIAPDITDAKDAVSVGKLEGKITFEDVSFQYQDTNEPVLSHIN LDVKAGEYIALVGPSGVGKTTLCSLIPRFYEATAGKIEIDGMDIRKMKLDDLRNNIGIVQ QDVYLFAGTIMENIRYGNPDATDEDVIRAARNANAHEFIMSFPDGYDTDIGQRGAKLSGG QKQRLSIARVFLKNPPILIFDEATSALDNESEKVVQQSLEMLAKDRTTFVIAHRLSTIKN AERILVLTEKGIEEQGSHEELLHQKGIYESLYHMQFH >gi|330400063|gb|ADLB01000005.1| GENE 36 50948 - 51841 908 297 aa, chain + ## HITS:1 COG:jhp0277 KEGG:ns NR:ns ## COG: jhp0277 COG4866 # Protein_GI_number: 15611347 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 32 292 28 283 290 162 37.0 1e-39 
MKNIEFKRPEMEDRKLLDGYFKRYPSRSCERTFINVYLWSRHYKVKYAVIENTLVFKSED EDIAFTYPLGEVEDIKRALRFLMEYCEKLGKEFELYCVTAEQFEQLNSWYPDEFQITYDR DIADYVYESEKLITLSGKKLHGKRNHINKFKAEHENWSYETMTEDNVEECFQMALQWRKE NGCDADAEKNAEMCVTLNSLRLFRELEMTGGILRVDGKIVAFTMGEPINDDTFVVHIEKA FSEVQGAYPMINQQFVEHECAKYKYINREDDAGVEGLRKAKLSYRPAFLVEKGRVKR >gi|330400063|gb|ADLB01000005.1| GENE 37 51842 - 53029 1003 395 aa, chain + ## HITS:1 COG:CAC2832 KEGG:ns NR:ns ## COG: CAC2832 COG0436 # Protein_GI_number: 15896087 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 1 394 1 392 393 443 56.0 1e-124 MISEKMEKMVSNSSAIRQMFEEGNRMAKLYGKENVYDFSLGNPNLPAPKRVKEAIVEILN EEDTLTLHGYTNSNAGYMEVRDAVASSLNKRFKTSLAGNNIIMTVGAAGGLNVALKVLLN PDDEVIVFAPYFSEYRSYVNNFDGVLVEISPNTENFQPKLDEFERKITKRTKAVIVNTPN NPTGVVYPEETVVRMAEILEKKQKEYGNDIYLLSDEPYRELVYDGAEVPYLTKYYDNTIV VYSYSKSLSLPGERIGYVVVPDEVSESGRVIAAMNVANRILGFVNAPTLQQKIVQKCLEE QTDISYYNRNRETLYKGLTDCGFSCIKPEGAFYLFMKSPIEDEKEFCRQAKELHILLVPG SSFGCGGYVRIAYCVSYETIVNALPKFKELAKKYF >gi|330400063|gb|ADLB01000005.1| GENE 38 53122 - 53970 1258 282 aa, chain + ## HITS:1 COG:TM0164 KEGG:ns NR:ns ## COG: TM0164 COG2035 # Protein_GI_number: 15642938 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Thermotoga maritima # 6 251 7 227 264 105 34.0 9e-23 MIKQILQGMVVGIANIIPGVSGGTMMVAMGLYDKLIHAITHLKSEFKESLKLLLPIFAGA GIAIVALSRLFEFLLETYPIPTNFAFCGLIAGSLPFIFKKVKGHKVTVGKIIPFLIFFGV VIVMALLGETGGNSADVSFGLVNVIKLFGVGVIAAATMVVPGVSGSMMLMLLGYYDTILK TINDFVDALVKFDMGGLTTGVGVLAPFGIGVVIGIFLIAKLIEFIFSKAEIHAYYGIIGL ILASPIAILMKTNWSGASVLMVGCGVVTFALGWFVASKLGGE >gi|330400063|gb|ADLB01000005.1| GENE 39 53977 - 54663 667 228 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 6 228 23 251 251 132 37.0 5e-31 
MMNIREQLESLAEPDFQQFMAKLIPNVKPEKIIGVRTPILRKIAKQIAKEDWRFYLEEYP AESYEEIMLQGMVIGYIKEDFGEVLKQVERFLPKIDNWGICDSFCAGLKQTKKEPEKMWE FITRYIHDERTYFVRFSVVMMIFYYTDKGHIEQAFSYFDEICHEDYYVKMAVAWAISIYF IKMPEQTMVYLKNNRLDDFTYHKALQKIRESQKVDKETKEKIKEMKRK >gi|330400063|gb|ADLB01000005.1| GENE 40 55018 - 56238 831 406 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPTAMDEFSEQTKVPAVCVRANEKGEQGVWAMQEVVDIWGEKQYQVRFESDVVKEIVGES ATLKIRPEAVVVDNLFPLEDEKTVKKVYEKRGNTKPEPNHVLYVTDKEIDRAKLEKDIAD SQSVLDTAYFGHKKSVCKEEEVDVAYGNKSWLRARNIGLSEGETESFFQGERKAFLHKHT ADRLKVKKGDTISLDNTSYRVEKVYDNQEKTKGFGEEIYVNQMPKGEEIPITSVMVKSTQ GKTLRQNTAALMVEEYLHLSAKDGKQYNLQDMDKMVTEVKWICSVIVLLVILIFLFRYTV FWWTKAYEERKFRFVLAGVCLSGSGFGFWHTLLNQCGIPQTFFSSEGILNIGAYADRIRA FYESIQTGAGKNYFLCQIGYQSLLRIGKYACIGVLAVTVFLVGKKL >gi|330400063|gb|ADLB01000005.1| GENE 41 56348 - 56809 477 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225568404|ref|ZP_03777429.1| ## NR: gi|225568404|ref|ZP_03777429.1| hypothetical protein CLOHYLEM_04481 [Clostridium hylemonae DSM 15053] # 1 153 1 153 153 144 49.0 2e-33 MIIAKNDEEKRSRIWRMVMAGASIILLIVIVYFLFSLFAGNPLEGTWKSEESNLQLTIRG GDSATASWSEIAEASNVKVKINYTLDKENKTITFKVDDAEIAKAVKNSDAGLTKTALETE ISMLETTFDYNLEKSKLTLSEREYGEQIVFVKK >gi|330400063|gb|ADLB01000005.1| GENE 42 56892 - 57947 1228 351 aa, chain + ## HITS:1 COG:CAC1797 KEGG:ns NR:ns ## COG: CAC1797 COG0821 # Protein_GI_number: 15895073 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Clostridium acetobutylicum # 1 348 1 348 349 398 59.0 1e-110 MKREETKVVHIGNRSIGGGNPIAIQSMTNTKTEDVQATVEQILALEKAGCEIIRCAVPTM EAAEALKEIKKQIHIPLVADIHFDYRLAIKAIENGADKIRINPGNIGDVSRVQAVVNKAK EYGVPIRVGVNSGSLEKHLVEKYGGVTAEGIVESALDKVHIIENMGYDNLVVSIKSSDVL MCIKAHQLISEQCNYPLHIGITESGTVLSGNIKSSVGLGIMLYEGIGDTIRVSLTGDPLE EIKTAKLILKTLGLRKGGIEVVSCPTCGRTKIDLIGLANQVENMVADIPLDIKVAVMGCV VNGPGEAKEADIGIAGGIGEGLLIKKGEMVKKVKEEELLETLRQELLNWGK >gi|330400063|gb|ADLB01000005.1| GENE 43 57949 - 62412 4214 1487 aa, chain + ## HITS:1 
COG:CAC3442 KEGG:ns NR:ns ## COG: CAC3442 COG2176 # Protein_GI_number: 15896683 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Clostridium acetobutylicum # 98 1486 87 1450 1452 1328 50.0 0 MGKRFFDVFPTLQTEKEIEGLFEGVEVTKVTTTSLKDRLKIYFLSTHLIPKVYIYQMEQM IREQLFARTPISIQIVERYQLSEQYTPENLLREYRDSILEELKEKSAVEYNMFLNAKCTF ENESILCMTLEDTIVAKGKQDSIVELLQEVFNERCNVPVEVRVLYEEAKESKHKKYNEIR LQQEIRAIVEGNESRRKEHVEEEKEKKPEEKKERTVKPKEEKKPFQKKEFYSSVKRSDDP NVIYGRNFEDDTIELSQVVGEMGEITIRGKVIQFDTREIRNEKTIIMFAVTDFTDTIMVK MFTRNEQLPEILSEINKGAFLKIKGVTTIDKFDGELTIGSVTGIKKIPDFTTTRKDTSPV KRVELHCHTKMSDMDGVSEVKAIVKRAHDWGHKAIAITDHGVVQAFPDANHYIETLDKDD PFKIIYGVEGYLVDDLTDIAVNEDNQTLDDAFVVFDIETTGFSSVKDKIIEIGAVKVENG KIVSRYSTFVNPEVPIPFEITKLTSITDAMVIDAPKIETVLPEFMEYVGDAVLVAHNAGF DVGFIEENCRRLGMERKLTSVDTVALARVLLPTLSRYKLNIVAKTLGISLENHHRAVDDA GATAEIFVKFVEMLKEREITTLKGVNEFGAMNPDAIRKMPAHHVIILAKNDVGRVNLYTL VSMSHLKYFGRMPRIPKSELNRYREGLIVGSACEAGELYQAILNDKSEEQIAKIVNFYDY LEIQPLGNNAFMLESERISNVNTKEDLIQINKKIVKLGEEFQKMVVGTCDVHFLDPDDEV YRRIIMTGKGFSDADDQAPLYLRTTEEMMSEFSYLGSAKAEEIVITNPNRIADMIDPISP VRPDKCPPVIQNSDQELRDICYEKAHSMYGENLPDMVVERLERELNSIISNGFAVMYIIA QKLVWKSVEDGYLVGSRGSVGSSFVATMAGITEVNPLSPHYYCKKCHYSDFTSDEVRAYS GLSGWDMPDKKCPVCGEPLAKDGFDIPFETFLGFKGNKEPDIDLNFSGDYQSNAHKYTEV IFGTGQTFRAGTIGTLADKTAYGFVKNYYEERDSRKRKCEIERIVEGCTGIRRSTGQHPG GIIVLPHGENINSFTPVQHPANDMTTDIITTHFDYHSIDHNLLKLDILGHDDPTIIRMLE DLITDITGEKFDATKIPLDDPDVMALFAGTEVLGIKPEDIGGCPVGCLGIPEFGTDFVIQ MVVDTKPKTISDLIRISGLSHGTDVWLNNAQTLIQEGKATISTAICTRDDIMTYLINKGL DSEKSFTIMERVRKGLVAKGKCKEWPEYKKDMLEHGVPEWYIGSCEKIKYMFPKAHAAAY VMMAYRIAYCKVNYPLAYYAAYFSIRASAFSYEIMCQGKDKLEFYLRDYKKRSDSLSKKE QDVLKDMKIVQEMYARGFEFLPLDIYTAKADKFQIIDGKLMPPLNSIEGMGDKAAEAVEL ASKDGPYLSRDDFRQRTKVSKTVIDLMADLGMFGNLPESNQLSLFDF >gi|330400063|gb|ADLB01000005.1| GENE 44 62426 - 62881 626 151 aa, chain + ## HITS:1 COG:lin2499 KEGG:ns NR:ns ## COG: lin2499 COG0454 # Protein_GI_number: 16801561 # Func_class: K Transcription; R General 
function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Listeria innocua # 1 127 1 128 147 82 35.0 2e-16 MITFKEIDRSNYMECISLRIKEEQKGFVADNAQSLVEAKFEEGLYTRAIYSDDTMVGFVL FDYDAEIPGWSMSRFMIGEQYQGQGLGKDAVKQFFKYMKEEMKIQELYISVEVENTVACE LYEKIGFHFLKSVEYEFNGVVYKEKQMKIEL >gi|330400063|gb|ADLB01000005.1| GENE 45 63309 - 64172 685 287 aa, chain + ## HITS:1 COG:no KEGG:DSY3852 NR:ns ## KEGG: DSY3852 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: ABC transporters [PATH:dsy02010] # 3 251 9 259 302 197 41.0 3e-49 MKFSDFHPIVNAVYFAVVIGVTIFSMSPYFLVLSLSGAFIYSLVLKGKKAMKQNISIVFI TLVIMTGINALFTHNGETVLFYIRHNRITMEAVFFGIASSVMLSSVIILFDCFHVIMSEE KLIYLFGRIAPVFGLMVSMIFRFVPLLRCRFKEIQMGQACMGRNAQTGVIWKIRQLLKEL SILIAWSLEASIESADSMAARGYGLPGRSSFHLFKITVQDVILLIIVGVCGGITVAGCIF GRTTMQYYPKLLVPAMAWKDIICISSFGILVFMPSVIEYIGGHKWER >gi|330400063|gb|ADLB01000005.1| GENE 46 64160 - 65821 229 553 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 316 509 21 216 305 92 28 5e-18 MGKIKFEQVTFTYPLAEQAALKDVTFEIEESQFIVICGKSGCGKSTLLRQMKKNMMPYGE LKGEILYDGTSIKELEERKSVSEIGYVQQNPDNQIVTDKVWHELAFGLESLGYDNQTIKR RVAEMADYFGIENWFHKETNTLSGGQKQLLNLASVMVMQPKVLILDEPTAQLDPIAASEF LQTVYKINRDLGVTVIMSEHRLEEVFPIADRVLVLDKGKIVAIDIPERIGEYLSGRETGE AHPMFYGLPAVVRIFQNFRMGESPLTIREGRKKMEQLLKGKEVVGSSFDAVEKSKETVIQ LKNVWFHYEKGHHDVLKGVSFSVEKGRWSCVLGGNGSGKSTMLKVICGILKQQKGKVFIN GAQTGKKCSEKVVMLPQNPQAVFTEITVEEELFEGVAFMDMEDREKIEKVEMMLEKMEIT HLRKANPYDLSGGEQQRLALGKILLLEPTILLLDEPTKSLDPFFKRTLAKILKDLQRNGM TIFMVSHDVEFCASYTDYCAMFFDGEIVSEDCTRKFFSGNSFYTTTANRIARKWCNELIT CEEVQEWLQKNVN >gi|330400063|gb|ADLB01000005.1| GENE 47 65797 - 66735 769 312 aa, chain + ## HITS:1 COG:no KEGG:DSY3851 NR:ns ## KEGG: DSY3851 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: ABC transporters [PATH:dsy02010] # 7 159 647 799 885 135 44.0 2e-30 
MASEKRQLTQQEKAAKQRLHVAAELILIGIPIALIVGIKFLPREWYMLLSLSVLAMTIAP FFMVFESRKPKAREIVLLAMMSALVVASHLFFHIVFPIQIGTALIIISGISLGPEAGFFI GALSRLVCNFYMGQGPWTPWQMVCWGILGFLAGFAFNCGNKETLKSRNFKMIAGPVLTIL FSLLLAYLSFLLYPGQDSAVGWRFYLFGAVGLILGVAVQKKRLPADNITLSLFTFFTTFI VYGGIMNLSSALLSANGASGKGISLEGLRILYVTGVPYDFFHALTAAVCVYFIGGNVIQK LERIKIKYGIYK >gi|330400063|gb|ADLB01000005.1| GENE 48 66755 - 67267 476 170 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_2685 NR:ns ## KEGG: Pjdr2_2685 # Name: not_defined # Def: hypothetical protein # Organism: Paenibacillus # Pathway: not_defined # 41 168 182 308 324 110 42.0 2e-23 MDKDFEKLIRNKRKRKISVCIVILFIVLTTLFSFKTENGKEKKKSEVTLKIRCDELIDNP EKMTKPELWQYIPENGIILEETKCTIETGETVFDVLNRVCKEKDIQIEYSYTAGYDSYYV EGIQYLYEFDAGKRSGWIYLVDGKNTNYGCSQYKLKGGEKIVWEYVCDYK >gi|330400063|gb|ADLB01000005.1| GENE 49 67288 - 68217 910 309 aa, chain + ## HITS:1 COG:BH0604 KEGG:ns NR:ns ## COG: BH0604 COG0714 # Protein_GI_number: 15613167 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Bacillus halodurans # 3 307 6 310 318 286 45.0 3e-77 MKEKILKLKKEIGKVIVGKEEVVEKVLMAIIAGGHILLEDIPGVGKTTLALAFSKAMGLE FRRIQFTPDVVPSDITGFMIYDKQKNSFTYREGAVMCNFFLADEINRTSSKTQSALLEVM QEAKVTVDGETHRIPQPFIVMATQNPIGTAGTQMLPEAQLDRFMIKLSMGYPEFNAQIEI LKNREKEDPLDAVKEILSIEEVIEMKKEAKEIYVDDKIYEYVTILVEATRKHDYVRLGIS PRGALALCGIAKARAYISGRNYVTPEDVSESFLDVCRHRLILKPKANLTSMDTDSILQQI LDENKAPKL >gi|330400063|gb|ADLB01000005.1| GENE 50 68226 - 69314 629 362 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3708 NR:ns ## KEGG: Cphy_3708 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 65 349 64 358 404 108 26.0 4e-22 MVLKKEKIVFGLLFLAGTVLYFFTEISYAPLFLTGICVYGIVGFIVSGISGKKTDLHFDG EKQCEKGSQLEVTLHMKNNSKIPIWNGEILLRVKNKLTGEQTEVKKTMSLLPLQSKKAHF YLEAYFCGCIEVTVEKMKISDPFGIFTKERNFHMENRYYVFPKLMEVDFTSEQLNQYDME SYKYSALKSGEDTSETFGIREYKEGDSLKSIHWKLTGKMENLIVREAGLPVENSVMILLD KREREEISADKKDKLTEMFLSLSNTIIKNGIHHSIGWYNYEIEKFEQYRIQSTEDIYVII 
GELLSNPHCKDEISTVDRFIESDAGKEYSNYLYVTESDTAERETEKLMNYGNVEIYRTRE FS >gi|330400063|gb|ADLB01000005.1| GENE 51 69274 - 71400 1209 708 aa, chain + ## HITS:1 COG:PH0326_1 KEGG:ns NR:ns ## COG: PH0326_1 COG1305 # Protein_GI_number: 14590243 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Pyrococcus horikoshii # 419 528 208 309 310 67 35.0 1e-10 METLKFIERENLVRKKDIIKEILLSLFLGISTFLCFRDMFQFTLCFSGSYLGGMRSGLVY TWNRIADVLGNEKYILLPKFAGAGEGNTLFLITALILFVILYFFFVHSKNSWAIFLIFLL TVILSGATNLQISMGTAMLFTVSVIAGILYARDRNGLCNKLCLFILTAILCAAVFSSPLG RKINAKSETIYKIQNVVKTKAKEKYYGSNPLKNGDLKQRKREKVEGDALEITMTTPQSIY LRGFVGESYQDNRWETLPNYNYYKEKDLFYWLRKEEFQTFGQLGQAEELFHEEKSERENQ VTIQVKEADSRYAYIPYEMKGEISEGKAWGGSFITPSGLNRLKKYSYNTGNNAVKTWTDT ASKVFLRADTEQAKDYLRKESNYNTYVYENFTYLSKKEKSILNRYIGKAGDQSRGHIDYK PAIAGIQKYLEDNFIYTENLGEKAEQSQSMLEEFLVSKKGYDVQFATAATLMFRYYGIPA RYVEGYLITPKDAKKMESGKPEIIPKTNAHAWVEIYVDGTGFVPIEVSPPYRKIMEEADM EVGISNNTLLRPFDNEGGANNKAQETEMSGDDNKHITFPVIIIICGIIGLILLAIVIRLF WKYVICFKKFLNRRKLFNKAAPKQAVSAMYSYMEEQKYPIKEGVRALGNKAAYSQEELSD KERKEMLMALKSAAKEKKKNEKKNKRNAGRRACGRHAIVRLRGKKSGR >gi|330400063|gb|ADLB01000005.1| GENE 52 71312 - 72229 1026 305 aa, chain + ## HITS:1 COG:no KEGG:Pjdr2_2684 NR:ns ## KEGG: Pjdr2_2684 # Name: not_defined # Def: S-layer domain protein # Organism: Paenibacillus # Pathway: not_defined # 36 302 1215 1512 1517 171 35.0 3e-41 MKRKIKGMLVGALAAAMLLCGCGAKNLADNIDEQTKKTAEYVKENVPNPQVSSIGGEWAV KGIAESGIEPDDSYFEVYYDTVRAKVKSEKGAIHEEYYSDYARVIIALNAIGKDPTNVEG YDMTKPLEEYEELTQQGVNAVAYTLVAANESGISLEHEQAYVEFLVKEMEAMLSERKDTY TDYISMGLLGLSFYQDDDSVKKVTEDGIKYLSDMQQDNGTMGNCESTSEAVIALIQLGVD VFSDKRFVKNEGSLGESLMNYQAENGAFLHTEDGEKANEMATEKALLALCSMKKMEKGGL YDGQK >gi|330400063|gb|ADLB01000005.1| GENE 53 72213 - 73463 802 416 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291546094|emb|CBL19202.1| ## NR: gi|291546094|emb|CBL19202.1| hypothetical protein [Ruminococcus sp. 
SR1/5] # 15 406 9 443 456 141 29.0 8e-32 MTGKNSLRHFWRKYRLSLLGVAVILGFIFLSGTFLSTQEQVHGKNPLETENKESSRIAVT GKNYTLNYKQEEEYKREQAKREEKIKQQPAEPIEEIKSETESGKKKTSSILSENSSQAQT GDKNSVGNGDKTEEGQGKPETGGGNEEGNGEGESGENTGGDEEVSKLPEIICSLTEGQVV SGGFLGFTVQATTYKKEPLSSFYVTVKVNGRKLYSSGTQHNVISYRTSQELQDGNNEISV TATDKEGNTATKNYHVTVKKEDAQIEGGTMRVKLQADVLGLGTIFDETVVFYEGENLPYV VDRAFKKAGIKYKYTGTFDYGFYLQRIYKPGITNGYKIPAPILQKLEEENCSWVGYETDS LGEKDFYYWSGWLYRLDGYFPDGLSTIPAEDGSEVELLFTLNNGAEYNGTWFSGNW >gi|330400063|gb|ADLB01000005.1| GENE 54 73535 - 77191 3806 1218 aa, chain + ## HITS:1 COG:MA4285_1 KEGG:ns NR:ns ## COG: MA4285_1 COG1520 # Protein_GI_number: 20093074 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Methanosarcina acetivorans str.C2A # 15 486 5 417 418 119 28.0 5e-26 MKKLKQMAATILAVSLMITMLPANVFAKEAVRENVDTEWYNFRNNPENNGVTDRPTPIEA EETVQKWAEKYGTGWSAAPTPPLILDGKLYIGVANRILEIDKETGKEIRRSDEMIGNVGY AMNPLTYADGKLFVQVGNGAIQAVDYKTLKCVWYSEKIGGQTVSPISYAKIDGKGYIYSG TWNSEKRDGSYICMTTDDEGTTEVEGQTKGGGKEKKLTWQFTPSTDDPQVKNKRGFYWAG AYATENYIAVGSDDGSNEGDYTADAIFYTLDSRTGKIIDSISGIKGDIRTTTVYDNGYLY FATKGGVLYKTKVDSKGKLSETSSIDIGAVTNEKLMATAAPLVYGNKIYMGVSGSGGQFD PDAGHGFRIIDNRGPLTQDSIMYNLPIAGYPQAAALASTAYADKDFDGDGEADGRVYLYF TYNAMPGGIYYCYDTKDQTEAKPEQYGELFVPQKEQQQYCISTLCADRDGTLYYKNDSCY LMAVENNLAYLNNVTVKGGENESISWNKAFDSANAEYELKIPDTMKTANITLEMNEGMTA TINGSAYTGKNEVTLEEGDTKVQIEVSKTEGEKTYKRTYTLNFARVKAISTLESMKVGNS NSFTGFLQMTPEFTSEVTDYKVDTTRQESFWRVWLKPTDVNSKITVTPVENVDRITTSNG NTSTQGHDRWNVYKKDQTKTAKIRVDVLSENGKKTTSYNLTLEVPVKVTGIALDKQEAEL DMTEKLQLNAKIMPENATIQTVKWYSSDEETATVDEKGLVTPKKAGNAEITVISDDGASI TDKCHVTITDKAKEVDDLIDAIGTVTLESKAKIDIARDAYNTLTAEQKERVTKLNVLEES EKEYDRLKGEADKEEADKAAAKAVDDLIEKIGEVTIDSGKQIQQAREAFEKLTPEQKEKV EKEEILKVAEEKYAELLLVEEKENAKNLLDNYKNLTEYRQEQQEELKRIIREGKTQIENA TDKDGIDKVVKAAKEKMDAVKTDAELTAQENMEKAAEYVEKQIANIGEVRFTSKSRDAIL FARVSYDKLEKTAQEKVDNYSVLSAAEAKWKELESNAKVITLVDEKSKIAVSGKFMEDFE LHVEKADKEAEAVLQKEFVSLGDKLENLFVPYTISYEGGYVGEITVKIPVDAVYNGRNVI 
VKQLCGDNSIVSYETTVKENSVSVKTNTLGTFMAGVEKVKNLNGSAPKTADTSDMILWIG ICVLSMVTLAVRRRLKHN >gi|330400063|gb|ADLB01000005.1| GENE 55 77400 - 78476 1442 358 aa, chain + ## HITS:1 COG:Ztdh KEGG:ns NR:ns ## COG: Ztdh COG1063 # Protein_GI_number: 15804160 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Threonine dehydrogenase and related Zn-dependent dehydrogenases # Organism: Escherichia coli O157:H7 EDL933 # 6 346 1 341 341 449 61.0 1e-126 MNDNKMWALVKEEAAPGLALKRVPIPEVGTNDVKIKIHKTAICGTDVHIYQWNEWAQHTI AVGQTAGHEYVGEIVEMGSGVRGYKIGDLVSGEGHIVCGKCRNCLEGHKENCKDAKGVGV NRDGAFAEYLVIPAENVWPTNPAIPEEMYAIFDPFGNATHTALSYEVLGEDVLITGAGPI GIMAAAIAKFAGARHVVVTDFNQYRLDLAKKLGATRTVNLANEKLEDVMKEIGMTEGFDV GMEMSGAAAGFRDMIENMKHGGKIAVLGLQRPDAQINWETVIWNGLNIRGIYGRKVWDTW YKMTTMLQAGLDISDIITHRMNIKDYKAGFDAMISGQSGKVILNWEELDKLEPQSQEV >gi|330400063|gb|ADLB01000005.1| GENE 56 78502 - 79704 1395 400 aa, chain + ## HITS:1 COG:ECs4495 KEGG:ns NR:ns ## COG: ECs4495 COG0156 # Protein_GI_number: 15833749 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Escherichia coli O157:H7 # 12 398 10 396 398 504 64.0 1e-142 MARKNDILSIYTKEVEDIKAAGLFKGEAPIASAQGARVKLEDGREVINMCANNYLGLGDN QRLIDAAKRTYDERGYGVASVRFICGTQDIHKQLEKKISDFLGMDDTILYSSCFDANGGL FETILTADDAVISDELNHASIIDGVRLCKAKRFRYKNNDMEDLEAKLKEADEAGARIKLI ATDGVFSMDGIICNLKGVCDLADKYNALVMVDDSHAVGFVGAHGRGTAEHCGVEGRVDII TGTLGKALGGASGGYTSARKEIVDLLRQRSRPYLFSNSLAPAIAGASIELFDMLSESTEL RDHLEETTAYYRKLLVDNGFDIIMGTHPCVPVMLYDEVTAAEFAKRMMEKGVYVVAFSYP VVPKGKARIRTQVCASHTKEDIDFIVKCFIEVRQEMGLNK >gi|330400063|gb|ADLB01000005.1| GENE 57 79783 - 81042 1030 419 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 4 404 2 417 447 401 49 1e-111 MEKRKIIQVEEKVPFKLLVPLSIQHMFAMFGASVLVPFVFGINPAIVLFMNGVGTLLFIL ITKGKAPAYLGSSFAFLAPAGVVISKFGYEYALGGFVAVGFCGCILAFIIYKFGSDWIDI 
VLPPAAMGPVVALIGLELAGTAANNAGLLDKHIKPENAIVFLVTLGMAVFGSVVFKGFLS VIPILIAVIAGYISAIACGIVDFTAVSKASLFAMPNFSSPKFNLEAILIILPVILVITSE HIGHQIVTGKIVGRNLLKDPGLHRSLFGDNFSTMLSGFIGSVPTTTYGENIGVMAVTRVY SVYVIAGAAVLSIVCSFIGKLTVLIQTIPGAVIGGISFLLYGMIGTSGIRILVDSKVDYG KSRNLALTSVIFVTGLSGITVKFGNVELKGMVLACVVGMALSLIFYILDKFHLTNDAEE >gi|330400063|gb|ADLB01000005.1| GENE 58 81098 - 82501 1220 467 aa, chain - ## HITS:1 COG:STM2359 KEGG:ns NR:ns ## COG: STM2359 COG0531 # Protein_GI_number: 16765686 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Salmonella typhimurium LT2 # 6 465 9 464 473 166 27.0 7e-41 MTKSKISFKELVFMNISALYGIRWIAKSTSASFGLGLGAIPSWAVFMVLFFIPQAFMCAE LASTYQSDGGLYTWVREAFGTKYAFMVSWLNWTAKIFWYASFLTFFAVNFAYMLGEPSLS ENKILVLVLSVALFWILSWISTKGMSFGKFFTSIGSFGSTIPTILLISMAFIAIVLLDKA PSASTYTVSTLMPKLNPDSLVAISGIIFAYTGAEITANFITEMDQPKKNFPRAIIVSAAV VCVLYILGSISISMLLSPEEISSSTGILDSLSRGCELLGIPTVFIQLLAAGISLSIIGAL VLYIASPIKMLFGSVEKGLFPEKLTEANEHGIPVKAVYLQAIIVTVLLAATSLLPGVDTI YNVLVTMTALTSLFPYVLLFLAYIKIKKRQKTVDKDTYVMTKNKKLAVGIAIFELVICII AIICSAYPVMDTVKDNIVYEIEMIGGGLLVILSGLYIWRRSKLQNKL >gi|330400063|gb|ADLB01000005.1| GENE 59 82523 - 83176 595 217 aa, chain - ## HITS:1 COG:FN1803 KEGG:ns NR:ns ## COG: FN1803 COG1309 # Protein_GI_number: 19705108 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Fusobacterium nucleatum # 9 212 9 212 217 111 31.0 1e-24 MANRERHHQRVLKYFIEATQLIIQKEGMEAVTIRKVAEIAGFNSATLYNYFQDLDQLLLY ASLNHLTSYNQKIIQESFAGKDWYDVLLLTWEEFSDISFTYPDAFLQIFFNKHSDSLAAI CNSYYELFSEEKVDDANEWHSILTDFNLYNRNKAILTKIYSDPTVSEENLDITNELMISA YHLLLEYRVSNKEEYSHQICQQKMMRYITYLLSTLKS >gi|330400063|gb|ADLB01000005.1| GENE 60 83073 - 83264 106 63 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMILSKKSNIIALISKKRYNYNYEKRREKYGKQRTTPSESFKIFYRGYPVNYSERRHGSR HYP >gi|330400063|gb|ADLB01000005.1| GENE 61 83368 - 84672 1466 434 aa, chain + ## HITS:1 COG:yihN KEGG:ns NR:ns ## COG: yihN COG0477 # Protein_GI_number: 16131714 # Func_class: G Carbohydrate transport and 
metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli K12 # 8 431 4 420 421 159 28.0 1e-38 MKEKTGIKKNFFMICLLAFAGTIIYGLPYFRSYYYDTYQNLYHLTNTQMGLLGSAYGMLG VFSYMFGGVLADRFKAKKLLILSMIATGLGGFLHLFVTDFRALMAIYALWGFTSLLTFWP ALMKIVRTQGNEDEQSRAYGIFEGGRGVFNALQLAIATAIFGFFQAKMMPELGIKWIVIF YSAAPILCGIIFIFVLKEPGETKVKEKASEPEKEEPKEKFSWSNIGLVLKMPAVWLTILM MFCSYTFNMSIYYFTPYASNVLKTTAVVAAVLTVLQQYCRPFASPIGGFLADKIGRGQVM AGGFLLMGVGTLVLILSSGAGGAQMVFVVAACIIVYVGMYSNFGIYFSLLTEGGVPLKVS GLAIGIASTLGYLPEVIAPIVAGNILDTFAGAKGYHIYFGIMIAMAVIGFIASIVWANTY GKRYKEQMKKQKNK >gi|330400063|gb|ADLB01000005.1| GENE 62 84700 - 86745 2650 681 aa, chain + ## HITS:1 COG:mll5128 KEGG:ns NR:ns ## COG: mll5128 COG2936 # Protein_GI_number: 13474275 # Func_class: R General function prediction only # Function: Predicted acyl esterases # Organism: Mesorhizobium loti # 15 678 6 658 661 503 41.0 1e-142 MSVEVKEKKVLKKEDFPYDWTVEENHWIPLSDGTKLSSRIWYPKTDEPVPAVLEYIPYRK RDGMRGRDEPMHGFFAGNGYVVVRVDMRGTGESDGLLKDEYLKQEQDDALEVIDWISKQP WCDGNVGMMGKSWGGFNSLQVAARRPPALRAVICVGFTDDRYNQDIHYKGGCLLNDNFWW GAIMLAYQCRAIDCEIKPETWREEWLERLEDMPLWAENWLQHQTRDEYWKHGSVCEDYAD IQVPVFAIDGWSDSYTNTVLTLMNGLDVPRKAVIGPWAHVFAHDGYPAPAMNFLGEATKW WDKWLKGKDNDTLDGPMVDVWVEDSMLPEAIHDVSEGRWVGLENWPSKDVNDKVFSLTYG KLTEEENTKEEVVDLCTPLNHGLLAGEWMGAGVPGENACDQQLDDGLSMVFDSDVLEEDF DIVGYPKVEVELTSDKANAMLFAQLSDVHPNGQVTRVSYGVMNLTHLQGHDKVVPLVPGE KVKAFVGLDVCGHKFPKGHRVRISLATSHWPMFWVMPEIATLKLDLSTAKVTLPIFTGQD SEGPDMNPMCAPLTPMTTLSEGHVDRSVSYDILKDTWTCITDGVGGVFGEGVYRFDDIGT VVEHNLKRELTLSNKDPLSAKYTIYQKMKNGRDGWLMDTDIVVTQTADEEYFYLTGYMTA KMNDEEVFHRDYDRKVKRNGL >gi|330400063|gb|ADLB01000005.1| GENE 63 86883 - 87353 376 156 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: 
Clostridium acetobutylicum # 1 156 1 156 157 236 71.0 1e-62 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRHFWCRGYYVATVGKNAKK IQEYIVNQLQEDLEYDQMTLKEYIDPFTGEPVKPNK Prediction of potential genes in microbial genomes Time: Tue May 24 20:55:59 2011 Seq name: gi|330399981|gb|ADLB01000006.1| Lachnospiraceae bacterium 2_1_46FAA cont1.6, whole genome shotgun sequence Length of sequence - 18012 bp Number of predicted genes - 14, with homology - 11 Number of transcription units - 10, operones - 4 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 153 193 ## PROTEIN SUPPORTED gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 2 1 Op 2 . - CDS 35 - 304 131 ## COG1943 Transposase and inactivated derivatives - Prom 345 - 404 7.0 - Term 481 - 531 10.4 3 2 Tu 1 . - CDS 534 - 1946 1295 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase - Prom 1988 - 2047 5.1 4 3 Tu 1 . - CDS 2068 - 2868 798 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 2986 - 3045 4.0 5 4 Op 1 . + CDS 2933 - 4549 1613 ## COG0038 Chloride channel protein EriC 6 4 Op 2 . + CDS 4546 - 5028 562 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 5045 - 5069 -1.0 + Prom 5189 - 5248 7.8 7 5 Op 1 . + CDS 5277 - 10859 6273 ## CPE1264 sialidase-like protein + Term 10875 - 10912 2.1 8 5 Op 2 . + CDS 10938 - 11711 647 ## SAK_0062 acetyltransferase + Prom 12134 - 12193 3.5 9 6 Tu 1 . + CDS 12225 - 14648 2636 ## COG0495 Leucyl-tRNA synthetase + Prom 14730 - 14789 4.0 10 7 Tu 1 . + CDS 14825 - 14941 117 ## - Term 15174 - 15210 4.3 11 8 Op 1 . - CDS 15235 - 16284 511 ## EUBREC_0065 hypothetical protein 12 8 Op 2 . - CDS 16380 - 17231 414 ## COG0582 Integrase 13 9 Tu 1 . + CDS 17525 - 17872 85 ## - Term 17739 - 17772 -0.4 14 10 Tu 1 . 
- CDS 17869 - 18012 84 ## Predicted protein(s) >gi|330399981|gb|ADLB01000006.1| GENE 1 1 - 153 193 51 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP11-BS70] # 1 51 1 51 77 79 66 2e-14 IVEAEICPDHVHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGN >gi|330399981|gb|ADLB01000006.1| GENE 2 35 - 304 131 89 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 50 1 50 157 75 68.0 3e-14 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKNCRSGNMSRS CTHASRNSTEHKCIVLCRVLKRKKYTHDI >gi|330399981|gb|ADLB01000006.1| GENE 3 534 - 1946 1295 470 aa, chain - ## HITS:1 COG:MJ0204 KEGG:ns NR:ns ## COG: MJ0204 COG0034 # Protein_GI_number: 15668376 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Methanococcus jannaschii # 1 469 1 457 471 192 29.0 9e-49 MGGFFGVAGKSNCTLDLFYGVDYHSHLGTRRGGMATYGPQGFSRAIHNIENSPFRTKFER DVEELKGNIGIGCISDFEPQPLLIQSHLGSFAITTVGKINNMDDLLNRVYGHGHTHFQEM SGGQINATELIASLICHKNSLVEGIQFVQEIVDGSMTLLLMTEDGLYAARDRLGRTPLVI GKKKDAYCVSFEDFAYINLGYSTYKELGPAEIVHITPDKVETVSPAKEDMKICSFLWVYY GYPTSSYEGVNVEEMRYKCGSMLAKRDAEDDVKPDIVAGVPDSGIAHAIGYSNESGIPFA RPFIKYTPTWPRSFMPQNQEQRNLIARMKLIPVQSLIQDKSLLLIDDSIVRGTQLRETTE FLYNSGAKEVHVRPACPPLLFGCKYLNFSRSKSELDLITRQVIQKLEGDNAENVLHEYAD PTTEKYANMLEEIRKQQNFTTLRYHRLDDLIASIGLEPCKVCTYCFNGKE >gi|330399981|gb|ADLB01000006.1| GENE 4 2068 - 2868 798 266 aa, chain - ## HITS:1 COG:CAC3009 KEGG:ns NR:ns ## COG: CAC3009 COG0726 # Protein_GI_number: 15896261 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Clostridium acetobutylicum # 66 263 91 288 295 211 52.0 1e-54 MKKAKKFTAIVLLFALSFSIGHFTAKTVHTFSDKAVTASAEGNWGLSFQEEGKLPVGNAS 
VSELKKYDAYYAEPTEEKVLYLTFDAGFENGNTPAILDALKKHHAPATFFVVGNYLKTSP DLVKRMAEEGHIVGNHTLNHLDMSKLSSKEEFQKEISGVEDLYKEITGKPMTKFYRPPQG KYSTQNLQMAKDLGYKTFFWSLAYVDWYEDKQPTKEEAFDKLLKRIHPGAIVLLHSTSST NAQILDELLTKWEEMGYTFKSLNAFL >gi|330399981|gb|ADLB01000006.1| GENE 5 2933 - 4549 1613 538 aa, chain + ## HITS:1 COG:L113400 KEGG:ns NR:ns ## COG: L113400 COG0038 # Protein_GI_number: 15673646 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Lactococcus lactis # 22 533 6 512 512 357 42.0 4e-98 MVKVNVEGSEGRKVKKDTVHMLKRAERFHVILIGEGLLVGSVAGLIVLAYRIALKYAGQW LEAVRHYIGNSPVKAAMWFCLLFLMAIAVGLLIKWEPMISGSGIPQLEGEMSGKLNQTWW KVLPAKFIGGFLSLLAGLSLGREGPSIQLGAMTGKGISKALDRGKTEEKILLTCGASAGL SAAFHAPLAGVMFSLEEVHKNFSVSVLVSVMTASISADYLSSQFIGIEPVFQFDIGHVLP QNYYWLIIVLGVILGVLGSVYNRFTLKVQSLYKQSKRLNEVTKVMIPFMLSGVLILVMPE LLGSGHELIDSLTNHELLLGGAIVILIGKFLFSAVCFGSGAPGGIFFPLLVLGAFVGGIF GMIGVNYFGMDPDYINNFVMLAMAGNFTAIVRAPLTGIILIFEMTGSVSQMLSLSVVSIV AYVVASLLGSEPIYESLLERLLKNNGEETTEGIKGQKILISHVIMHGSPLAHRKVQEIDW PQNCLLVSIKRGDKEVIPKGKTVLLPSDMLVTLTDEKDEPYVRECVEKLCKMKQEEKR >gi|330399981|gb|ADLB01000006.1| GENE 6 4546 - 5028 562 160 aa, chain + ## HITS:1 COG:BH1021 KEGG:ns NR:ns ## COG: BH1021 COG0350 # Protein_GI_number: 15613584 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Bacillus halodurans # 40 150 58 168 175 163 65.0 1e-40 MKKVYVYSSPVGKLGIDVDEKGLRKIEFLKDDAIDVVDGDNELVMEVKRQLNEYFRGERK TFDLPLALCGTPFQLKVWEALQTIPYGDTRSYKEIAIQVGSPKGCRAVGMANNKNPIPII IPCHRVVGGNGKLVGYAGGLDKKEYLLEVENCNISGSVAK >gi|330399981|gb|ADLB01000006.1| GENE 7 5277 - 10859 6273 1860 aa, chain + ## HITS:1 COG:no KEGG:CPE1264 NR:ns ## KEGG: CPE1264 # Name: not_defined # Def: sialidase-like protein # Organism: C.perfringens # Pathway: not_defined # 1 1296 6 1316 1588 619 33.0 1e-175 MKRKLLSSMLAAAMVVTSVFSTALVSEAKTTNGSHGTVEVQQGTNSVTIGNGAIARTFST KDSKLSTTKIVNKRTGGGKTSFTPGKDSEEFIVRTTKEKSKPIDLPALDRKKWTAEADSY 
HNGATGAADGPAANLLDGDVNTIWHTNYGGGNNGKGDQEYPHNVVIKLDGSQKFKSFSYT PRKEGETTNGNIKGYELYASNETNPLSAEAKEWGEPIAKGNFEYDGANPIYVNLEKECTA TQIKFVATSSNNGKNFAGGAEFNLHKDEAPVVKDDREFKASDLTLKDGNDAVKIESTNAQ INGKNKTGKKVTFSFKPYEHKGVQYTIDEVLVMYEGDHFMRKYLEISIPEDKQASTAIDY IDLESFKVNQTDKQWTIPRGKGGIVQMEEFKANLGQPIYIQGMFFGCEFPATDTEIVDGT GYMRYYTGKTFDRLKKDNQLTKDGKYVTWQTVAGAARSTENEVIQADFYEYINSIATPSE FRIQYNSWFDNMMKIDDNNILESFIEIDRELNNAEVRPMDSYVVDDGWNAYNNGTIPERE HQKSGAKINESGFWEFNEKFPNELTPSSQLVQKFGSNFGVWVGPRGGYNFYGTLADIIQK SGKGSKAGGSIDVADRVYVENFKKMAIEWQKKFKVNYWKWDGFADMGQYDHFNNLGGADG VPVYSESNHHMTGGYHQMYHVTDLWEAWIDLMEAVRQSEKADGINNLWISLTCYVNPSPW YLQWANSVWLQCTHDQKDADFGTTKMNKQITYRDACYYDFLKNHQFQFPLQNIYNHDPVY GKEGTGMNLNTATDEDFQNYLYMLSTRGTAFWELYYSDSIMTDGKYEITGEFLEWAEKSY HMLKNSKMIGGMPDKTSLGSATSDASKAEAYGFSCFDGKDGIISLRNPSATENKAITFTF DRTIGVSENAGTLDYYLEHSYLLSDPAAQTGKLVYGQKYTVNLKPNEVRILRVSDKKDTT APEIDRIMTDGDRKVTVKFNEKVSGNLFSVNGLGVASIAKSADDTTYHITLTKAPKDGET ITVTPKDIKDMAGNAAGKPASVIYHKDNVVVEQGVKEEKGITPADKSLNSKNGFTVAVVK KTAGSNESLVKQNTEYELKITQEGKAAFTLNGATAVSGKKVNDNKEHIIVGVKENNGMLK LYVDGQLEGSAYNKENRFHEVRKADITVGENVTKADVYDTAYGYDKVEELFREGFGKLKL NDGMVTVSGTEEGNKNTIFDGNNTTYWTSQKVTNGAMNTSNAWLKVDLGGIYKLDQVDYT PRFHNNADNYWCCTGNIKKLVVEISKDGNTWTPVTENGGRDLSNKIVNKNDSSFFPEEIK FDEQEARYVRIGGTESYHWENANVNKQVTVADLAIYGEKTEKDNIAKDSTVTAKWTKDNT NAAANNDRPMSMVVDGQKTNVDGNYGEFGADGKDESSYMQIDLGSVCDVEALNLYRYWNN NRTYKDTVVAVAKTEKDFADGKATIVYNADENNFHKLNEAYKKDKYDEEYVESSAGKSWQ LPKNTKAQFVRVYMHGSESGKTNHIVEMEVIGTKQKDESASVDITKLIDRLAELAAVDTT NATPDSVAKFNALVKEGYGLVSTGVKTQDEVDNMLKKLEGAEDILVYEKPQINVVELEKA IAKAEKVNKELYTTDSYHAMKKVLDEAKALLEETAKDQPTVDAKAKALNDAITALQKRGN TDALKKLVEKYRELKAEDFTTATWNVFKKAFDEAEAIVNDNSNSTQEKVDAVKKALEDAY NSLEKADKTVDKSKLEEAYNKYKDLKNDGYTKDSWKVFEKALNKAKTVLDDKNATEEQVN NALAQLENAVKKLEKAPNGGNAGSGSDGGVKTGDSSNMMLWGILAAAALVSGVIAKKKKY >gi|330399981|gb|ADLB01000006.1| GENE 8 10938 - 11711 647 257 aa, chain + ## HITS:1 COG:no KEGG:SAK_0062 NR:ns ## KEGG: SAK_0062 # Name: not_defined # Def: acetyltransferase # Organism: S.agalactiae_A909 # Pathway: not_defined # 
10 252 7 247 250 196 39.0 5e-49 MVKEIKQNEISLIEKLFSGWSETMIWSCLQGCMGKAFAVAGKEEKSAMITIADFCFLAGE ADRELVAYISRTTEKDFILIVPQNEEWNTYIEKHFGERQEKTKRYAIKKKADMFDVEKLR QYAEQIPKGYILKQIDEELYDKVLKEEWSKDFCAMFSDYKQFRENGVGVVALLGEEVVAG ASSYTFYREGIEIEIDTKESHRRKGLATACGASLILECLKRDKYPSWDAIDLRSVALAEK LGYHRAEEYTTYFVTFS >gi|330399981|gb|ADLB01000006.1| GENE 9 12225 - 14648 2636 807 aa, chain + ## HITS:1 COG:BS_leuS KEGG:ns NR:ns ## COG: BS_leuS COG0495 # Protein_GI_number: 16080084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Bacillus subtilis # 5 805 3 802 804 874 52.0 0 MATPYNHKAIEKKWRENWEKNPVNVKADENGQREKYYCLDMFPYPSGNGLHVGHWRGYVI SDAWSRYKLLQGHYIVHPMGWDAFGLPAENYAIKMGVHPSVSTEENIKNFKKQLNEISAL YDWDMEVNTTDPAFYKWTQWIFVKMFKEGLAYEKEFPINWCPSCKTGLANEEVVNGKCER CGTEVTKKNLRQWMLRITKYADRLLNDLDKLDWPEKVKKMQAEWIGKSYGAEVDFKVDGK DEKITVYTTRPDTLHGATFMVLAPEHALAKELATDETREAVEKYILDSSMKSNVDRLQDK EKTGVFTGSYAINPINNAKVPIWLSDYVLADYGTGAIMCVPAHDDRDFEFATKFNIPIIQ VIAKDGKEIENMTEAYTDAAGTMINSGEWNGMESAVLKKEAPHMIEERGIGKATVNFKLR DWVFSRQRYWGEPIPIVHCEHCGAVPVPEDQLPLTLPDVDSYEPTGTGESPLAGIESWVN TTCPVCGKPAKRETNTMPQWAGSSWYFLRYVDNHNAEELVSKEKADEMLPVDMYIGGVEH AVLHLLYSRFYTKFLYDIGAVDFDEPFKKLFNQGMITGKNGIKMSKSKGNVVSPDDLVRD YGCDSLRMYELFVGPPELDAEWDDRGIDGVYRFLNRVWNLVMDSKDADVKATKEMIKIRN KMVYDITTRLESFSLNTVVSGFMEYNNKLLDIAKKEGGVDKETLGTFAILLAPFAPHMAE ELWEQLGNAGSVFHAGWPTYDEEAMKDDEIEVAVQINGKTRAVVTIPAEISKEDAIAAGK EAVADKLTGTIVKEIYVPGRIINIVQK >gi|330399981|gb|ADLB01000006.1| GENE 10 14825 - 14941 117 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKENGVNISRRYERNDADSRDIVCVIFGDYNGANNYEV >gi|330399981|gb|ADLB01000006.1| GENE 11 15235 - 16284 511 349 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0065 NR:ns ## KEGG: EUBREC_0065 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 349 35 383 383 634 81.0 1e-180 MNCKTGAYGANVSVCEDCGAVQIHYNSCRNRCCPMCQAVPKEMWMDAHREDVLDAPYFHL VFTVPDILNPIIYNNQKLLYDTLYHAASATISELTADPKHLGADVGYICILHTWGSEMNF 
HPHIHTILLGGGLTSKNEWKDNGTEFFLPIRAVSKSFRGKYIDELKHLWNTGQLEFHGTA EKYRNYYVFKDLLDSCYDTEWIPYCKKTFNGAGSVIDYLGKYTHRIAISNHRIICMDDEN VTFSVKDYRNKGQWKELTLSGVEFIRRFLMHVPPKRFVRIRHYGLLCSRSKHKKLTLCRN LLGCQKYLSKLRGKEMPEILKQLYEINICVCKSCGGHLGKPQLRIPQRC >gi|330399981|gb|ADLB01000006.1| GENE 12 16380 - 17231 414 283 aa, chain - ## HITS:1 COG:mll9328 KEGG:ns NR:ns ## COG: mll9328 COG0582 # Protein_GI_number: 13488149 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 14 275 19 281 299 155 32.0 5e-38 MYEKYLEQLEEAGKIRNLKERSINCYKNYVSYFLKYQAKPPEELTCQDVRNFLLAKKEEG LKATTLNLYNSSIRFFYRNVLHILWDDITVPRMILEHKLPTVLTVDEIDRLLEAVDDIKY RAMFATMYSSGMRVSEVIHLHYDDISRSNMQIHVRDTKNRMDRYTILSKRCLDILTQYWF EKGRPRGILFPNKFTGNYLTVSTLEQVMRRAVSDAELPKKATPHCLRHSFATHLMEQGVE RQNIQALLGHRDPKSTEVYLHVSNKSLMGIQSPFDRKEGADNE >gi|330399981|gb|ADLB01000006.1| GENE 13 17525 - 17872 85 115 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLEQELTNLQIYISKRNQGQTDEQVINHITKINNKTPLTQEEWHELIFPSCNNGYVEILR FILSNIQCLNNVKEYMRHTVYGRNKNINDERIEVLKEFMVLCQDLVQVKMRFSSS >gi|330399981|gb|ADLB01000006.1| GENE 14 17869 - 18012 84 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no AYRLIFVIFGRRSTIRNELHSHCDYMSPNDYEELYRRLQQDELQLAG Prediction of potential genes in microbial genomes Time: Tue May 24 20:57:03 2011 Seq name: gi|330399833|gb|ADLB01000007.1| Lachnospiraceae bacterium 2_1_46FAA cont1.7, whole genome shotgun sequence Length of sequence - 81087 bp Number of predicted genes - 85, with homology - 78 Number of transcription units - 41, operones - 20 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 153 102 ## EUBREC_0066 putative phage integrase/recombinase 2 2 Tu 1 . + CDS 447 - 971 430 ## COG1335 Amidases related to nicotinamidase + Term 1058 - 1091 -0.9 + Prom 1065 - 1124 3.7 3 3 Op 1 . + CDS 1146 - 1445 305 ## gi|210617254|ref|ZP_03291480.1| hypothetical protein CLONEX_03702 4 3 Op 2 . 
+ CDS 1403 - 1630 205 ## gi|210617254|ref|ZP_03291480.1| hypothetical protein CLONEX_03702 + Prom 1718 - 1777 5.8 5 4 Tu 1 . + CDS 1875 - 2345 468 ## LSL_1338 DNA-binding protein + Term 2451 - 2497 9.0 - Term 2436 - 2488 6.8 6 5 Tu 1 . - CDS 2529 - 2843 226 ## gi|153855556|ref|ZP_01996672.1| hypothetical protein DORLON_02690 - Prom 2994 - 3053 4.1 + Prom 3303 - 3362 5.0 7 6 Tu 1 . + CDS 3400 - 3852 413 ## JDM1_0563 transcriptional regulator, xre family + Term 3974 - 4017 8.3 + Prom 4131 - 4190 4.2 8 7 Op 1 . + CDS 4218 - 4742 536 ## CLD_2265 hypothetical protein 9 7 Op 2 . + CDS 4753 - 5232 528 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 5245 - 5307 20.1 - Term 5241 - 5287 4.2 10 8 Op 1 . - CDS 5307 - 6884 1433 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Term 6931 - 6966 4.2 11 8 Op 2 . - CDS 6971 - 7228 359 ## CPE0254 hypothetical protein 12 8 Op 3 . - CDS 7239 - 7421 302 ## Dhaf_0015 hypothetical protein 13 8 Op 4 . - CDS 7485 - 8195 536 ## COG2071 Predicted glutamine amidotransferases - Prom 8235 - 8294 5.7 + Prom 8171 - 8230 6.4 14 9 Tu 1 . + CDS 8263 - 9228 744 ## COG2355 Zn-dependent dipeptidase, microsomal dipeptidase homolog + Term 9252 - 9285 3.4 + Prom 9233 - 9292 5.8 15 10 Op 1 8/0.000 + CDS 9337 - 9711 347 ## COG1725 Predicted transcriptional regulators 16 10 Op 2 . + CDS 9708 - 10610 782 ## COG1131 ABC-type multidrug transport system, ATPase component 17 10 Op 3 . + CDS 10585 - 12582 1547 ## TherJR_0019 hypothetical protein 18 10 Op 4 . + CDS 12609 - 13877 1474 ## COG0726 Predicted xylanase/chitin deacetylase + Term 13879 - 13917 3.5 + Prom 13892 - 13951 2.5 19 11 Tu 1 . 
+ CDS 13979 - 14410 575 ## COG3238 Uncharacterized protein conserved in bacteria + Prom 14416 - 14475 2.9 20 12 Op 1 1/0.000 + CDS 14499 - 15059 585 ## COG1555 DNA uptake protein and related DNA-binding proteins 21 12 Op 2 40/0.000 + CDS 15071 - 15760 952 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 22 12 Op 3 . + CDS 15768 - 17177 1487 ## COG0642 Signal transduction histidine kinase 23 12 Op 4 . + CDS 17192 - 18112 923 ## EUBREC_1615 hypothetical protein 24 12 Op 5 4/0.000 + CDS 18113 - 20311 1312 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) 25 12 Op 6 . + CDS 20355 - 21341 879 ## COG1466 DNA polymerase III, delta subunit 26 12 Op 7 4/0.000 + CDS 21377 - 23191 2011 ## COG0481 Membrane GTPase LepA 27 12 Op 8 . + CDS 23274 - 24398 1010 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Term 24400 - 24455 11.5 - Term 24390 - 24438 13.4 28 13 Op 1 . - CDS 24451 - 25608 922 ## 29 13 Op 2 . - CDS 25612 - 26907 701 ## gi|254517725|ref|ZP_05129781.1| extracellular solute-binding protein 30 13 Op 3 . - CDS 26952 - 27083 68 ## - Prom 27104 - 27163 3.9 + Prom 26977 - 27036 10.5 31 14 Tu 1 . + CDS 27157 - 28176 1196 ## COG1304 L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases + Term 28186 - 28219 0.5 + Prom 28214 - 28273 6.6 32 15 Op 1 21/0.000 + CDS 28331 - 29374 1373 ## COG1420 Transcriptional regulator of heat shock gene 33 15 Op 2 29/0.000 + CDS 29390 - 30010 836 ## COG0576 Molecular chaperone GrpE (heat shock protein) 34 15 Op 3 31/0.000 + CDS 30149 - 32011 2658 ## COG0443 Molecular chaperone + Term 32037 - 32086 12.2 + Prom 32022 - 32081 1.8 35 16 Op 1 . + CDS 32109 - 33296 1412 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 36 16 Op 2 . + CDS 33299 - 33826 485 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 33828 - 33881 8.9 + Prom 33847 - 33906 7.0 37 17 Op 1 . 
+ CDS 33937 - 34692 768 ## COG0682 Prolipoprotein diacylglyceryltransferase + Term 34699 - 34731 3.3 38 17 Op 2 . + CDS 34746 - 36152 1431 ## COG0534 Na+-driven multidrug efflux pump 39 17 Op 3 . + CDS 36202 - 36579 571 ## COG0251 Putative translation initiation inhibitor, yjgF family 40 17 Op 4 . + CDS 36593 - 37429 906 ## COG0489 ATPases involved in chromosome partitioning + Term 37435 - 37487 10.2 + Prom 37468 - 37527 4.2 41 18 Op 1 . + CDS 37571 - 38887 756 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif 42 18 Op 2 . + CDS 38865 - 39632 501 ## Cphy_0819 hypothetical protein + Prom 39648 - 39707 4.3 43 19 Tu 1 . + CDS 39756 - 39986 304 ## gi|210609708|ref|ZP_03288094.1| hypothetical protein CLONEX_00278 + Term 39996 - 40045 13.1 + Prom 39993 - 40052 6.0 44 20 Tu 1 . + CDS 40150 - 42486 1730 ## COG1199 Rad3-related DNA helicases + Term 42647 - 42685 -0.9 - Term 42621 - 42662 8.2 45 21 Op 1 . - CDS 42731 - 43141 210 ## gi|169351324|ref|ZP_02868262.1| hypothetical protein CLOSPI_02104 46 21 Op 2 . - CDS 43221 - 44108 863 ## COG0583 Transcriptional regulator - Prom 44130 - 44189 10.0 + Prom 44154 - 44213 7.7 47 22 Op 1 . + CDS 44238 - 44678 414 ## 48 22 Op 2 . + CDS 44675 - 45388 476 ## Amet_3274 hypothetical protein 49 22 Op 3 . + CDS 45427 - 49266 4361 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain + Term 49279 - 49311 2.2 + Prom 49293 - 49352 8.4 50 23 Tu 1 . + CDS 49385 - 50086 765 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain + Term 50165 - 50203 2.0 - Term 50004 - 50052 6.1 51 24 Op 1 17/0.000 - CDS 50083 - 50727 588 ## COG0569 K+ transport systems, NAD-binding component 52 24 Op 2 . - CDS 50743 - 52104 970 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 52164 - 52223 6.5 + Prom 52133 - 52192 9.3 53 25 Tu 1 . 
+ CDS 52335 - 53669 1932 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + TRNA 53834 - 53910 82.6 # Met CAT 0 0 + TRNA 53914 - 53986 85.3 # Val TAC 0 0 + TRNA 54011 - 54084 82.0 # Met CAT 0 0 - Term 54080 - 54111 2.7 54 26 Tu 1 . - CDS 54120 - 54311 151 ## gi|167760484|ref|ZP_02432611.1| hypothetical protein CLOSCI_02858 - Prom 54437 - 54496 8.5 + Prom 54299 - 54358 7.8 55 27 Op 1 . + CDS 54488 - 55033 791 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase 56 27 Op 2 . + CDS 55033 - 56115 867 ## COG4552 Predicted acetyltransferase involved in intracellular survival and related acetyltransferases 57 27 Op 3 . + CDS 56135 - 57142 1199 ## COG1363 Cellulase M and related proteins + Term 57320 - 57360 -0.1 58 28 Op 1 4/0.000 + CDS 57484 - 58176 563 ## COG1045 Serine acetyltransferase 59 28 Op 2 8/0.000 + CDS 58181 - 59569 1486 ## COG0215 Cysteinyl-tRNA synthetase 60 28 Op 3 7/0.000 + CDS 59608 - 60003 250 ## PROTEIN SUPPORTED gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 61 28 Op 4 1/0.000 + CDS 60013 - 60753 707 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 62 28 Op 5 . + CDS 60780 - 61388 689 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 63 28 Op 6 . + CDS 61363 - 61455 76 ## + Term 61529 - 61585 1.7 + TRNA 61451 - 61523 66.2 # Thr GGT 0 0 + Prom 61816 - 61875 9.5 64 29 Op 1 . + CDS 61930 - 63597 1795 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases + Term 63601 - 63640 7.7 65 29 Op 2 . + CDS 63657 - 64175 261 ## PROTEIN SUPPORTED gi|148360238|ref|YP_001251445.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase 66 29 Op 3 . + CDS 64168 - 65574 1153 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Term 65583 - 65625 7.4 - Term 65571 - 65613 2.6 67 30 Tu 1 . 
- CDS 65623 - 65961 469 ## + Prom 66013 - 66072 7.5 68 31 Op 1 . + CDS 66099 - 67649 1470 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 67656 - 67703 8.6 69 31 Op 2 . + CDS 67717 - 68079 423 ## COG0346 Lactoylglutathione lyase and related lyases + Prom 68084 - 68143 6.1 70 32 Tu 1 . + CDS 68190 - 69470 1496 ## COG0104 Adenylosuccinate synthase + Term 69480 - 69520 4.3 - Term 69468 - 69508 5.1 71 33 Tu 1 . - CDS 69515 - 70525 692 ## COG2502 Asparagine synthetase A - Prom 70751 - 70810 5.4 + Prom 70483 - 70542 2.9 72 34 Tu 1 . + CDS 70688 - 70852 243 ## - Term 70806 - 70841 0.2 73 35 Tu 1 . - CDS 70857 - 71072 269 ## gi|210608917|ref|ZP_03288054.1| hypothetical protein CLONEX_00233 - Prom 71137 - 71196 9.9 + Prom 71156 - 71215 6.1 74 36 Op 1 . + CDS 71247 - 71747 745 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) 75 36 Op 2 . + CDS 71813 - 72538 734 ## COG1434 Uncharacterized conserved protein 76 36 Op 3 . + CDS 72586 - 73200 720 ## COG0572 Uridine kinase + Term 73427 - 73473 13.2 + Prom 73204 - 73263 8.4 77 37 Op 1 . + CDS 73511 - 74365 610 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 78 37 Op 2 . + CDS 74383 - 74835 576 ## COG0824 Predicted thioesterase 79 37 Op 3 . + CDS 74835 - 76796 2218 ## COG0441 Threonyl-tRNA synthetase 80 37 Op 4 . + CDS 76848 - 77177 321 ## CPE2333 hypothetical protein + Term 77190 - 77234 9.2 - Term 77178 - 77222 4.6 81 38 Tu 1 . - CDS 77229 - 78047 498 ## TherJR_1918 hypothetical protein - Prom 78128 - 78187 12.4 + Prom 78087 - 78146 9.0 82 39 Op 1 . + CDS 78172 - 79434 1395 ## COG0628 Predicted permease 83 39 Op 2 . + CDS 79440 - 80360 974 ## COG1897 Homoserine trans-succinylase + Term 80493 - 80539 7.5 + Prom 80520 - 80579 6.4 84 40 Tu 1 . + CDS 80611 - 80682 80 ## 85 41 Tu 1 . 
- CDS 80701 - 80958 248 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|330399833|gb|ADLB01000007.1| GENE 1 3 - 153 102 50 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0066 NR:ns ## KEGG: EUBREC_0066 # Name: not_defined # Def: putative phage integrase/recombinase # Organism: E.rectale # Pathway: not_defined # 1 50 1 50 282 93 90.0 2e-18 MYEKYLEQLEEAGKIRNLKERSINCYKNYVSYFLKYQAKPPEELTCQDVR >gi|330399833|gb|ADLB01000007.1| GENE 2 447 - 971 430 174 aa, chain + ## HITS:1 COG:lin0638 KEGG:ns NR:ns ## COG: lin0638 COG1335 # Protein_GI_number: 16799713 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Listeria innocua # 1 174 1 174 175 204 56.0 9e-53 MVLLVVDTQKALVNDELYNYNEFIDNIQLLIKKARENNVEVIYVVHDDGNGSGLTKGTDG FEVFEKFKPLSNEKLFVKSVNSAFKDTGLVEYLAENNEKDIIIVGLQTDKCINATVISGF EYGFNLIVPAFANSTINNNYMDSQKSYRYYNEFMWHERYAECISIEETIKRMDS >gi|330399833|gb|ADLB01000007.1| GENE 3 1146 - 1445 305 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210617254|ref|ZP_03291480.1| ## NR: gi|210617254|ref|ZP_03291480.1| hypothetical protein CLONEX_03702 [Clostridium nexile DSM 1787] # 45 77 10 42 126 70 90.0 5e-11 MEIIYTVLGAIACIVLVISVDAGITKLFDEKSIMQKIILRCRFLMKTGYSDDGSAWIGYV KTSKTKKTIYFNDHAFRNIMVVIQTIWILKTEMNIGFLD >gi|330399833|gb|ADLB01000007.1| GENE 4 1403 - 1630 205 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210617254|ref|ZP_03291480.1| ## NR: gi|210617254|ref|ZP_03291480.1| hypothetical protein CLONEX_03702 [Clostridium nexile DSM 1787] # 1 75 52 126 126 128 86.0 1e-28 MDIENRDEYWISGLKKKVSNRHWAGRGKIMIDHRAVNEYLALIGEKELPLNLFEVIDIED RFPVERVNELLNEKE >gi|330399833|gb|ADLB01000007.1| GENE 5 1875 - 2345 468 156 aa, chain + ## HITS:1 COG:no KEGG:LSL_1338 NR:ns ## KEGG: LSL_1338 # Name: not_defined # Def: DNA-binding protein # Organism: L.salivarius # Pathway: not_defined # 1 147 1 152 203 72 31.0 3e-12 MNQHEIGDFISCCRKEKGLTQVELAEMLGVSDKSISRWENAKTMPDISLYEPLCEALDIQ 
VAELLYAKKMTDNEKAKQGEQSALNIFKTKLQLKIFSIFSEILILIGIIITITFTKTLAV TIPQTVVTIICGSFVWGVGIILSVKIKKAILALENQ >gi|330399833|gb|ADLB01000007.1| GENE 6 2529 - 2843 226 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153855556|ref|ZP_01996672.1| ## NR: gi|153855556|ref|ZP_01996672.1| hypothetical protein DORLON_02690 [Dorea longicatena DSM 13814] # 1 102 132 233 234 192 95.0 8e-48 MHILPLFLKEDNSQFYECPSDFFYSDRHTIFTRDFESIHMVLTTATIYYYGIDTKTYEPK VTMLSYSQADDCFYIDNGIQDGHIKAFEKELTKEALYLKHNALQ >gi|330399833|gb|ADLB01000007.1| GENE 7 3400 - 3852 413 150 aa, chain + ## HITS:1 COG:no KEGG:JDM1_0563 NR:ns ## KEGG: JDM1_0563 # Name: not_defined # Def: transcriptional regulator, xre family # Organism: L.plantarum_JDM1 # Pathway: not_defined # 1 85 1 85 221 66 41.0 4e-10 MALSDNIKKFREEKNLTQQQLADKLYVSRQTICRWENGSRCPDLITAKKLALELDVSMDE LISDEDVNDTRVNYGIWKSERIQNRTRLQELRKKLLNFMEILGSIFMAIILLFRVQLERD IPLWLTISFFCIAIPLIAINLLISKKIREM >gi|330399833|gb|ADLB01000007.1| GENE 8 4218 - 4742 536 174 aa, chain + ## HITS:1 COG:no KEGG:CLD_2265 NR:ns ## KEGG: CLD_2265 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 1 138 1 138 174 93 34.0 4e-18 MKKQKKGFLTFICSLIPGAGELYMGFEKQGMSILITFWGIVAISALTGMTFILCLLPIIW FYSFFHTHNLKNLTEEEFMMEKDRYLINLGYVMENKEELLQKYRAVIAGALILIGAAVIG KSLVRVLWNVIPMYLYDVFSSAIHLIGGGVVGIGIIILGVYFLRKRDEAEEAEE >gi|330399833|gb|ADLB01000007.1| GENE 9 4753 - 5232 528 159 aa, chain + ## HITS:1 COG:FN1791_1 KEGG:ns NR:ns ## COG: FN1791_1 COG0494 # Protein_GI_number: 19705096 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Fusobacterium nucleatum # 3 149 2 148 158 172 56.0 2e-43 MELTTLCYIEKDDSYLMLHRVSKKNDVNKDKWIGVGGHFEAGESPEDCLLREVKEETGLI LTSYRFRGLLTFVFNDNEAEYICLYTADGFEGEITDCDEGTLEWVSKKKIPELNLWEGDK IFFELLNRNEPFFSMKLVYQGDTLVDCELNGEKILTRKE >gi|330399833|gb|ADLB01000007.1| GENE 10 5307 - 6884 1433 525 aa, chain - ## HITS:1 COG:BS_yfmM 
KEGG:ns NR:ns ## COG: BS_yfmM COG0488 # Protein_GI_number: 16077809 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Bacillus subtilis # 8 525 1 518 518 821 77.0 0 MYIKELSMSILNVEHLSHGFGDRAIFEDVSFRLLKGEHIGLVGANGEGKSTFLNIVTGKL QPDEAKIEWAKNVRVGYLDQHTVLEKGMTIRDVLKSAFAYLFELEKKMNDICDSLGSADE DEMMTLMEELGSIQDTLTLHDFYVIDAKIEEVARALGILDIGLERDVTDLSGGQRTKVLL AKLLLEKPDILLLDEPTNYLDEEHIVWLKRYLIDYENAFILISHDIPFLNDVINIVYHME NQELNRYVGDYEHFQQVYEVKKAQLEAAYKRQQQEISELKDFVARNKARVSTRNMAMSRQ KKLDKMDIIELAAERPKPEFHFKTGRTPGKYIFETKDLVIGYDEPLSKPLTLAMERGHKI ALVGANGIGKTTLLKSILGLIPALSGSVELGENLQIGYFEQETAQDNTTTCIEEIWSEFP SFTQYEVRSTLAKCGLTTKHIESQVRVLSGGEQAKVRLCKLINRETNILLLDEPTNHLDV DAKDELKRALQEYRGSVLLICHEPEFYRDVVNEVWDCTKWTTKIF >gi|330399833|gb|ADLB01000007.1| GENE 11 6971 - 7228 359 85 aa, chain - ## HITS:1 COG:no KEGG:CPE0254 NR:ns ## KEGG: CPE0254 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 82 1 82 82 65 46.0 4e-10 MDEKTMVADALVGANGELKMFGEMISQTENKELKQCLKQLRNECEMSQEKLYQIAREKSY YVPAAKATAEEKQHVKSILTQGTMK >gi|330399833|gb|ADLB01000007.1| GENE 12 7239 - 7421 302 60 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_0015 NR:ns ## KEGG: Dhaf_0015 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 60 1 60 61 63 56.0 2e-09 MEDISILDLQNLRHLIGGYSTTHCKMTDYASFAQDPEIKKLFQDSADSAMKNKQELLKFL >gi|330399833|gb|ADLB01000007.1| GENE 13 7485 - 8195 536 236 aa, chain - ## HITS:1 COG:CAC1764 KEGG:ns NR:ns ## COG: CAC1764 COG2071 # Protein_GI_number: 15895041 # Func_class: R General function prediction only # Function: Predicted glutamine amidotransferases # Organism: Clostridium acetobutylicum # 1 228 1 232 241 159 34.0 3e-39 MKPKIAIIVCGLSENRQFVTNSYVQAVRYSGGLPLIIPLVKSNVAIQEYISLCDGFLFCG GGDITPLLFGQEPVDKLGETNITLDIFQLRFMRHVLLSGKPVLAICRGMQLLNVACNGTV CQDISIKVKDSINHMQRSFSRKDISHKVTVKSGTHLHRIIGNIVYTNSYHHQTIDRLGKG 
LISCAHTSDGIIEGIELTGHAFSIGVQWHPEAMYRTTPVMRRLFSSFIHASRLADE >gi|330399833|gb|ADLB01000007.1| GENE 14 8263 - 9228 744 321 aa, chain + ## HITS:1 COG:lin2556 KEGG:ns NR:ns ## COG: lin2556 COG2355 # Protein_GI_number: 16801618 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent dipeptidase, microsomal dipeptidase homolog # Organism: Listeria innocua # 3 306 2 295 308 149 30.0 6e-36 MDKLIDLHCDTIWRLMEAGENATLENNPFCVNLKEMRRADTMAQFFACFVDMQWFQEENK FEEGYQYVKKMIERLRKEVEHFSDKIAFAKSGKEIEENREEGKISAILTVEEGGILNNKI ERIEELRNEGIRLMTVLWNYENCIGYPNSKNAEIMAKGLKPFGFEVIERMNDVGMLIDVS HLSDGGFWDILQTSRVPVVASHSNARALCPHPRNMTDDMIRGLGEKGGVIGVNFYPPFIR ESGKATAKNIVSHIQHIANVGGMESVCIGTDFDGFIGEEGEIGKVGQINILYEELKRAKF TEKQIEKIWRGNAMRVIKEVI >gi|330399833|gb|ADLB01000007.1| GENE 15 9337 - 9711 347 124 aa, chain + ## HITS:1 COG:BS_ytrA KEGG:ns NR:ns ## COG: BS_ytrA COG1725 # Protein_GI_number: 16080098 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 104 1 110 130 77 41.0 4e-15 MIAIDYQNRKPIYEQIVEKFQMLILKEILPSGSQMPSVRSLAVELSINPNTIQKAYATLE QQGYIYPIKGRGNFVAESTELKEQKKEAFSEKMRELVREGIELEISKKECLDVFENCWKE EKSS >gi|330399833|gb|ADLB01000007.1| GENE 16 9708 - 10610 782 300 aa, chain + ## HITS:1 COG:BS_ytrB KEGG:ns NR:ns ## COG: BS_ytrB COG1131 # Protein_GI_number: 16080097 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus subtilis # 1 287 1 285 292 144 29.0 2e-34 MIEVRDICKTFDGLKAIDHATTKIMEGQIFGLVGSNGAGKSTFLRMLSGVLKADAGEILI DGENVYENIEVKEKICFLSDVAYFPANATGKTMCDYYSIMYKDFNPSQFNKLAEKFQLDI YRKISTFSKGMQKQMSMLLGVCTNTKYLFCDETFDGLDPVMRQAVKSLFASEVINRGFTP IIASHNLRELEDICDSIGLLHRGGILFTKDLENMKFHIHKVQCVITNPVMEERLLAELDI LYHEKQGSILSFISRGTREEIMERVEEKEPIFMETVPLSLEEIFISEMEVEGYDIKNFLL >gi|330399833|gb|ADLB01000007.1| GENE 17 10585 - 12582 1547 665 aa, chain + ## HITS:1 COG:no KEGG:TherJR_0019 NR:ns ## KEGG: TherJR_0019 # Name: not_defined # Def: hypothetical protein # Organism: 
Thermincola_JR # Pathway: not_defined # 1 632 1 653 707 84 21.0 2e-14 MTSKISFSKLIKEDIKKRAWLLLLSITIFIIIIPVLTAIKIEGALPGGVSSSSETWREVQ TWFLSEMGFSNIYMWSAIILGGVLSGITSFSYLHSKNQIDFYHSFPIKRETWYFVNYIGG VIQIVVPYIIGYILLLAIGIVKGVASPQLFHQSSVIMGMMVLVFLLVYGTTALAMIITGK LLVGILGTLIFFSWGSVIVALKNYIMLQIFENYMTEEMVTGFLENMKEGGGWYSPILMYD KIRQYYELGKPMFPLVMVLIFVIALIFLLGLFAFKKRKMENVGKAIVYSKLESIVQIVIT VSAGVFFSVIVSSQNQTRGMKNGWLYGIAVVSVVIVYGIISFIYNGDVRILFEKKIPFFL SMGITLVVLTVFQFDILGYDEYVPEKDKIETMAIDSYDANYLLNYRGLWDNRSYKEHLVK LKTTQFEPMYSLIKDTLKEGSKPNAENTTVNIGYYLKNGRKIYRKYQVNREKLLLCLDRA MTDENYKREILKIGEIDADEKSSVSFENIRSENLPVKLNETERELLYDQYYKDMEKYPLS DILAGEIIGRLYYKSKKDERVLYVFKEYKDFISVLSQYCEVPDKITQQEVTSITVEDFRD SDNLKEITVQATEKEKIQSILDSLSYTEMGSYSGNIEPNVYVSIATEDDLISAFIKKGKI PEFLK >gi|330399833|gb|ADLB01000007.1| GENE 18 12609 - 13877 1474 422 aa, chain + ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 219 407 23 211 217 167 45.0 3e-41 MKNTRGRSRNIRSRRRRILISKIIICVVVCIGVIFGIKTAANTFFQKAAKIVLKADDAEM REGQTIPEIKVNVTVQGEKDIVLDKKTGYTVQKLIDDLKNGENYTVTNETDGKTEGSFPV KIVLKEEIKEKLKNEWKGKVSIETEESKLVVKNALGDWEGNKFKKADGQYAMNEFVKMKD GTYYFDEKGEKVTGQKDIGIKRCVFSADGKLESEKFIDLDPTKPMIALTFDDGPGAEAGR ILDVLEKYNARATFFMVGPMVNRYPETVKRISDLNCELGNHSTNHPKLTKLDAAGIKKEI QTTTDAIVKATGGKGPTVMRPPFGAVNETVKQTVGFPIIMWSVDTLDWKTRNTQKTIDNV LANAKDGSVVLMHDIHKPSVDAAVQLIPKLIEKGYQLVTVSELAEARGIDLQNGVKYSQF YK >gi|330399833|gb|ADLB01000007.1| GENE 19 13979 - 14410 575 143 aa, chain + ## HITS:1 COG:CAC3547 KEGG:ns NR:ns ## COG: CAC3547 COG3238 # Protein_GI_number: 15896783 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 14 143 14 142 143 83 38.0 1e-16 MVGFIIALLSGALMSVQGVFNTQVTKTTGMWVSNAWVQLTAFLLCVVAWFFAGRDSVMTI AKVEPKYVLLGGVIGAGITWTVIKSMEQLGPAKAALLIVISQLIVAYVIEVLGLFGVEKQ PLEWRRVIGMIIALVGVGIFQWK 
>gi|330399833|gb|ADLB01000007.1| GENE 20 14499 - 15059 585 186 aa, chain + ## HITS:1 COG:BS_comEA KEGG:ns NR:ns ## COG: BS_comEA COG1555 # Protein_GI_number: 16079613 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Bacillus subtilis # 4 186 12 205 205 110 41.0 2e-24 MHRILCYCLGICIAICILAGCQKKETVQLEEVTEKEDATEEEEVSENIQVVYVCGAVRKP GVYRLPAGSRIYEAIEMAGGMTEKADKAALNQAEKIKDEAQICVPEKAAEGEAAQESQTK DDGKINLNTATEEELMTLTGIGQSKAKSIIQYREEKGRFQSIEEIMEIEGIKSGVFNKIK EQIVVR >gi|330399833|gb|ADLB01000007.1| GENE 21 15071 - 15760 952 229 aa, chain + ## HITS:1 COG:BH4027 KEGG:ns NR:ns ## COG: BH4027 COG0745 # Protein_GI_number: 15616589 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 3 228 4 229 236 218 51.0 5e-57 MGKKVLVVDDEKLIVKGIRFSLEQDGMEVDCAYDGEEALEKAKAQEYDLILLDIMLPKYT GFEVCQMIREFSDVPVVMLTAKGDDMDKILGLEYGADDYITKPFNILEVKARIKAIMRRT GKSVKEKEDKNIILKNDMKIDRQSRRVYVDGKEINLTAKEFDLLELLATNPDKVYSREEL LNIVWGYEYPGDARTVDVHVRRLREKVEPNPSEPKYVYTKWGVGYYFRG >gi|330399833|gb|ADLB01000007.1| GENE 22 15768 - 17177 1487 469 aa, chain + ## HITS:1 COG:Cgl0398 KEGG:ns NR:ns ## COG: Cgl0398 COG0642 # Protein_GI_number: 19551648 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Corynebacterium glutamicum # 240 467 126 352 386 158 37.0 3e-38 MKNKFFKSLRFRILLLLVLFGIIPCAILKSAILDNYIDRTVSARTAEIQNQCAILSNQLK SYNYLIDAKSEVVNAELTQLSNIYNGRILVINDEFRIVKDTYELEEGKTVVSEDVLKCYQ GKGTTYHDKKNHYAEVTIPVKDGKKVVGVILMSVSTDAILNNQVELETKATIIVITMAII ILLLAFVLSALMVKPFARITKAIEDVNEGYDSDNLHENAYTETTLISEAFNKMLGRLKVV DDSRQEFVSNVSHELKTPLTSMKVLADSLLAQEDVPVELYQEFMGDIAEEIERENKIIND LLSLVKMDKTTMNLTVQPENINELVERILKRLRPIAAQRNIELVFESFRPVTAEVDEVKL TLALSNLVENAIKYNRDDGWVHVSLNADHKYFYVKVADSGIGIPKEDTEHIFERFYRVDK SHSREIGGTGLGLAIARNAIVMHRGAVKVYSEEGEGTTFTVRVPLTYIV 
>gi|330399833|gb|ADLB01000007.1| GENE 23 17192 - 18112 923 306 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1615 NR:ns ## KEGG: EUBREC_1615 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 304 7 310 320 167 35.0 4e-40 MKRQKITVIIGVLMLVLLSGCQSKEEKEGKGPFLYYVNMEGTALEKESYEIKEDTPEKAV NKMLEKLAETPETIEVKAPIPPEVKIKKVEVKADEVHIHMNEKYLELEEVEEILCRASIV QSLTQIDGIEKVAFYIGEEPFRTKSGDVVGFMKADSFVKDTKNALKSSQQTTLTLFFANE KGDGLISEKVNVRYSGDTSIEKLVIEQLMKGPETSGAKPVLPSQVKVLNVSVKDGICYVN FDKQFWEQKFDVEPKVVIYGIVNSLISNGKASRVQISVEGETSVKFQEAVYLNEPFDRNV ELVEHE >gi|330399833|gb|ADLB01000007.1| GENE 24 18113 - 20311 1312 732 aa, chain + ## HITS:1 COG:lin1517_2 KEGG:ns NR:ns ## COG: lin1517_2 COG2333 # Protein_GI_number: 16800585 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Listeria innocua # 470 711 1 246 259 143 35.0 1e-33 MRKRPLCIVCSILIAVQVFLFVAGIREMVPATESFKEFEGKQVGVSGQIYRKEKSSNGQI LYLKDARILHRNHRIKNIKITVYDKTMMKTALGNKISAVGTVRFFDVPRNFGNYDQKFYY EKQGISLSVFSTKVKMLSEETWAVREGLLKFREKCHQIVCGILGKEKGGMISGILLGEKS EMSPKWKEMYQVNGIGHILAISGLHLSFIGNFLYKGLRKMGFSYKISGGAVSVFLILYTV MTGAGVSTLRALMMFLIKIGADITGRVYDLPTSLSVAAAVIICRSPQYFFDAGFLLSFGA VLGIILLKPILEKLFPCEKKWAEGICFSVAIQLFLFPVTLYFFFEVPTYALLLNIFVIPL MPFLLGMALFGVAICFIFQSLGVWILKGSGFFLTLYNRLCELAMEFPHPRIVLGKPKWWQ VVVCYLLLLLFILYMQRKEEKVEKRYRKAVIFFSIFLFTVPNNLARGELEITMLDVGQGD GNFLCGPEKVTYLIDGGSSDVKQVGKYRIEPFLKAKGVETVDYVFVSHGDLDHLSGIDEM LARQKTGITIKNLVLPAKKVWDKTLTKLANKAIRFGTKVFVLQKGQQLTEGDMKIQCIFP SDTYEGEVGNASSMVLSLQYGEFDALFTGDVEGEGEKELEEEISGRYDVLKVAHHGSKHS TKEEILDKLRAKIGLISSGRKNSYGHPHKELLDRLEKANISAYGTKENGSVTLKTDGREM EIECYLFSLQKK >gi|330399833|gb|ADLB01000007.1| GENE 25 20355 - 21341 879 328 aa, chain + ## HITS:1 COG:BH1337 KEGG:ns NR:ns ## COG: BH1337 COG1466 # Protein_GI_number: 15613900 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus halodurans # 4 325 6 337 342 147 28.0 3e-35 
MKSLNEDLKTGNFKRVYLLFGEENYLKKQYKDRLTKALISEGDTMNYAYYEGKGIDVKEI IDLSETMPFFSERRLIVIENSGFFKNATAELAEYMKEIPETTYFIFVETELDKRGKMFKA VKDKGRIVELARQDEKTLVRWIYGNVKKEGKQIAESTIYYLLSKCGTDMENLQKEMEKLF CYTLNNDVINMEDIDAICTMQITNEIFDMVNAVAEKKQKRALDRYYDLLALKEPAMRILY LLSRQFRLLMEVKEMAGEGYDKKTIASKAGLHPYAVGKYIEQSRSFSQEELRKILEESVD IEERVKTGRLGDVLAVELFIVKYSSNSV >gi|330399833|gb|ADLB01000007.1| GENE 26 21377 - 23191 2011 604 aa, chain + ## HITS:1 COG:CAC1278 KEGG:ns NR:ns ## COG: CAC1278 COG0481 # Protein_GI_number: 15894560 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Clostridium acetobutylicum # 1 602 1 602 602 864 70.0 0 MAGIEQSKIRNFCIIAHIDHGKSTLADRIIEKTGLLTSREMQSQVLDNMELERERGITIK AQTVRTVYRAKNGEEYIFNLIDTPGHVDFNYEVSRSLAACDGAILVVDAAQGVEAQTLAN VYLALDHDLDVMPVINKIDLPSAEPERVIEEIEDVIGIEAEDAPRISAKTGLNVDDVLEQ IVEKIPAPTGDADAPLQALIFDSVYDSYKGVIVFCRIKEGTVKKGTTIQMMATGAKAEVV EVGYFGAGQFIPCDELSAGMVGYITASLKNVKDTRVGDTITNASNPCKEALPGYKKVQPM VYCGMYPADGAKYQDLRDALEKLQLNDAALQFEPETSIALGFGFRCGFLGLLHLEIIQER LEREYNLDLVTTAPGVIYKVHKTNGEVIDLTNPSNLPDPAEIDYMEEPMVKAEIMVTSEF IGAIMDLCQERRGVYQGMEYIEETRAVLHYHLPLNEIIYDFFDALKSRSRGYASFDYEML GYEKSELVKLDILINKEEVDALSFIVFSGSAYERGRKMCEKLKEEIPRQLFEIPIQAAVG SKIIARETVKAMRKDVLAKCYGGDISRKRKLLEKQKEGKKRMRQVGNVEIPQKAFMSVLK LDDK >gi|330399833|gb|ADLB01000007.1| GENE 27 23274 - 24398 1010 374 aa, chain + ## HITS:1 COG:CAC1279 KEGG:ns NR:ns ## COG: CAC1279 COG0635 # Protein_GI_number: 15894561 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 5 373 4 373 374 276 43.0 5e-74 MRRDLELYIHIPFCVKKCAYCDFLSGPADEETMEYYVRALIREIESIESMKEMYRVVTVF VGGGTPSVLGGEQIERIFAALREKFAMESVREVTIEANPGTVTREKLKAYRSAGINRISF GLQSANNGELKQLGRIHTYEEFLESYMLAREEGFDNINIDLISAIPNQTVESWKSTVDRI LKLQPEHISAYSLIVEEGTPFEKMYGEDGNRKEELPSEEEERLIYQKTKKWLQEAGYERY EISNYAKKGYACRHNLGYWERKEYLGLGLGASSLIGNVRFQNTEEMKTYLKYSDDVRKRK 
QNEEVLTKEEELEEIIFLGLRKKDGISKKEVDFFCGEQIEKMICQGFLEEKDGNIRLTER GIDISNYVFAEILA >gi|330399833|gb|ADLB01000007.1| GENE 28 24451 - 25608 922 385 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKKHTLTFLSLFLCCILCCSCGKEKEPQNKPNKQENYSQVHSNRMQITEEGIYYIDGYT KLMKFYDFALKDSIPLCDKPNCKHKNAKCNAYLENGFQSGMGCYRGKLYYFDINKPTLPL YQCDKNEKNRKVIAKLNDTESTQSCSISLPTFFADNKLILNVEYSTLLDEPIVHEDGTAD TIEYKWGIVEIALDTGDVTLLKEPEIYSNTNNSILLVGVNETSVIYTKIGTDGGFYIYDT TSKKEEKLFDLSEEKQLTFLNYDKKHRMAYHLNRDEKNCIVFQTNVETKETKEVFRTEHQ NSFMYFEVYDNVIYYIKEATGEMKSYDLKNKTEKELSKEEYTYLGRYENPSDWYISFAEE EGFVCISKKDYEKKNWKKVQVIGKF >gi|330399833|gb|ADLB01000007.1| GENE 29 25612 - 26907 701 431 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|254517725|ref|ZP_05129781.1| ## NR: gi|254517725|ref|ZP_05129781.1| extracellular solute-binding protein [Clostridium sp. 7_2_43FAA] # 104 428 113 477 477 70 21.0 2e-10 MKKYLLLGLLSFTFIGLLIGCDSNKDDKDTSDKKEIKTITWQTRRDLSDYQDYFNQILKE KGYPYQVEFRTKEENKKVDILDIGSNLWKKTYNDIKPIVDKKVIPLDSYFQTKEGKKLKA TLPQNVWDAYKVNGKQYGVLSTGYAPFQTAYIWDKTLADKYQIHPETWTEEIWKYKEDLE KVYNGENKNNFLTTAYLWETLSCPLEYTYVLGHCYPLVINEKEDLTTAQFLYDSEKYREN VKGISSLYNAGLYSSEQETALDNPKIFLEASSTFVSKDAFRLLCAVYTDKDLTNAILWGE KNVHFNVNGNFAVELGTKNRNAINNHLGNPLIGYTEVGQDPNRETLYPERMKNATPSKLL GFNFIGKNCTTELENIFQVWYKDYSALVTNPANSLLEQGNLRQQYKNAGIDKVIAEWNRE FQEWRETQKAR >gi|330399833|gb|ADLB01000007.1| GENE 30 26952 - 27083 68 43 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSTIELSVLFVNSLFLFIFYTYIEFLSKFLYNIFNKQLKKTMI >gi|330399833|gb|ADLB01000007.1| GENE 31 27157 - 28176 1196 339 aa, chain + ## HITS:1 COG:mll4732 KEGG:ns NR:ns ## COG: mll4732 COG1304 # Protein_GI_number: 13473966 # Func_class: C Energy production and conversion # Function: L-lactate dehydrogenase (FMN-dependent) and related alpha-hydroxy acid dehydrogenases # Organism: Mesorhizobium loti # 38 337 32 352 352 127 30.0 4e-29 MTYNELLEQARQEVGPYCKACPVCNGKACKNQMPGPGAKGVGDTAIRNYEKWQEIRLQMD TIGKNQKADTTFNVFGKTFRYPLFAGPVGAVNLHYGKKYNDESYNNILVSACAEAGIAAM TGDGVNENVMQVATEAIKKANGIGIPTIKPWDMEKIKEKMKLADASGAFAVAMDIDASGL 
PFLQAENSGAGKKSVEELHRIAKSTYAPFIVKGIMAVRGALKAESAGADAIVVSNHGGRV LDQCPATAEVLEEIANAVKGKMKIFVDGGIRSGADVFKAIALGADGVIICRPFVTALYGG GEEGVKLYIEKIGQELADAMEMCGANSLKEITKEMIWRP >gi|330399833|gb|ADLB01000007.1| GENE 32 28331 - 29374 1373 347 aa, chain + ## HITS:1 COG:CAC1280 KEGG:ns NR:ns ## COG: CAC1280 COG1420 # Protein_GI_number: 15894562 # Func_class: K Transcription # Function: Transcriptional regulator of heat shock gene # Organism: Clostridium acetobutylicum # 6 340 2 334 343 215 37.0 8e-56 MKSNQELDARKMKILQTIIKTYLETGEPVGSRTISKYTDLNLSSATIRNEMADLEDLGYI IQPHTSAGRIPSDKGYRLYVDMLMEEKEQEVTDMKEQMLEKADKMDQLLKQVARVLANST NYATMISAPTYNRNKLKFIQLSQVDENQIIAVIVMEGNIIKNKIVTVSECLGNETLLKLN MLLNTNLNGMSIEEINLGMIARLKEQAGIHSEVISEVLDAVAEAIQLDNDLEIYTSGATN IFKYPELSDNQSAQEIISAFEEKQQLVSLVTETLSSEDNKGIQVYIGNETPVQTMKDCSV VTATYELGEGMQGTIGIIGPKRMDYENVMRTLKTLMVELDAIFHKEH >gi|330399833|gb|ADLB01000007.1| GENE 33 29390 - 30010 836 206 aa, chain + ## HITS:1 COG:BH1345 KEGG:ns NR:ns ## COG: BH1345 COG0576 # Protein_GI_number: 15613908 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock protein) # Organism: Bacillus halodurans # 67 204 55 192 194 90 42.0 2e-18 MDKKMSNEEMVKEAVEEAKKNAQLENEETEEAVEEAEVEETEAEETEDKKDKKLFKRKPK KDKKDEQIEDLTDKLTRQMAEFDNYRKRTEKEKTAMYEIGAKEVVEKILPVVDNFERGLA AVPEDKKDDSFVAGMEMIYKQIMTSLEEIGVKPIEAVGKEFNPDFHNAVMHIEDEELGEN IVAEEFQKGYTYRESVVRHSMVKVAN >gi|330399833|gb|ADLB01000007.1| GENE 34 30149 - 32011 2658 620 aa, chain + ## HITS:1 COG:CAC1282 KEGG:ns NR:ns ## COG: CAC1282 COG0443 # Protein_GI_number: 15894564 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Clostridium acetobutylicum # 1 618 1 610 615 726 67.0 0 MGKIIGIDLGTTNSCVAVMEGGQPTVIANTEGARTTPSVVAFTKTGERLVGEPAKRQAVT NADKTISSIKREMGTDYKVTIDDKKYSPQEISAMVLQKLKADAESYLGEKVSEAVITVPA YFNDAQRQATKDAGKIAGLDVKRIINEPTAAALAYGLDNEKEQKIMVYDLGGGTFDVSII 
EIGDGVIEVLSTAGDNKLGGDDFDQKITDYMLAEFKRMEGVDLSTDKMALQRLKEAAEKA KKELSSATTTNINLPFITATAEGPKHFDMNLTRAKFDELTHDLVERTAEPVTRALSDAGL TASELGQVLLVGGSTRIPAVQDKVKQLTGKEPSKSLNPDECVALGASVQGGKLAGDAGAG DILLLDVTPLSLSIETMGGIATRLIERNTTIPTKKSQIFSTAADNQSAVDINVVQGERQF ARDNKSLGQFRLDGIAPAPRGIPQIEVTFDIDANGIVNVSAKDLGTGKEQHITITAGSNM SDSDIEKAVKEAAEFEAQDKKRKEAIDARNDADAMIFQTEKALSEVGDKIDANEKAAVEA DIQALKDILAKSTPENVTDAQVEEIKAGKEKLMTSAQQLFTKMYEQAGAQQAGPAPEQQA GPAPEGFNGDDVVDGDYKEV >gi|330399833|gb|ADLB01000007.1| GENE 35 32109 - 33296 1412 395 aa, chain + ## HITS:1 COG:YPO0469 KEGG:ns NR:ns ## COG: YPO0469 COG0484 # Protein_GI_number: 16120798 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Yersinia pestis # 5 395 3 376 379 323 47.0 5e-88 MAETKRDYYEVLGVDRNADDAALKKAYRVLAKKYHPDMNPGDAEAEKKFKEASEAYAVLS DPEKRRQYDQFGHTAFEGGGAGGAGGFGGFSSADFGDIFGDIFGDFFGGGRRSGRANNGP MKGANVRKGVRITFEEAIFGCEKELDVILKEPCKTCNGTGAKPGTSPETCSKCGGKGQVV YTQQSFFGTVQNVQTCPDCHGSGKIIKEKCSDCGGTGYVSTKKTIKVSIPAGIDNGQSVR IRDKGEPGVNGGPRGDLLVEVTVSRHPIFQRQDVHIFSTAPITFAQAALGGDVRIKTVDG EVIYTVKPGTKTDTKVRLKGKGVPSLRNPQLRGDHYVTLVIQTPEKLSAEAKEALRKFDA LTGNSLHQEEPVAEEEKKKPKKKGFMDKLKETFED >gi|330399833|gb|ADLB01000007.1| GENE 36 33299 - 33826 485 175 aa, chain + ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 8 174 4 166 167 114 35.0 7e-26 MEEKMICRRAEITDLPEIMEIIDNAKVFMRVFDMDQWQNGYPNEEVFRRDIILKECYVAI CDNETAGVMVVTPVPEECYKEIEGQGWLTDSEPYMTIHRMAVAEKYRGTKVAEEMISFAE NLCMKKGRKSLRTDTHKKNLAMQRFLEKQGFSYCGVVDYKDTAGDTLRIAYEKIN >gi|330399833|gb|ADLB01000007.1| GENE 37 33937 - 34692 768 251 aa, chain + ## HITS:1 COG:all4699 KEGG:ns NR:ns ## COG: all4699 COG0682 # Protein_GI_number: 17232191 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Nostoc sp. PCC 7120 # 5 250 16 273 283 107 32.0 2e-23 MQYKLFEIGPVTVYSYGLLIAIGIVLAFFVAEGRAKKQGLNGEEIYGLGILGLIGGVIGA KLLFFLTEIKSIMENPKILLSFSEGFVVYGGILGGILGAYIYCKWKKLPVLKYFDVAVPS LALAQGFGRIGCFLAGCCYGRETGAWYGITFHDSPFAPNGVSLIPTQLLSSGADFLHFFL LIYIAGKKKKDGIVVVSYMIFYSIGRFLIECLRNDPRGNVSILSTSQFISIFMLMAGLIA LVMLKKKEQDN >gi|330399833|gb|ADLB01000007.1| GENE 38 34746 - 36152 1431 468 aa, chain + ## HITS:1 COG:FN1789 KEGG:ns NR:ns ## COG: FN1789 COG0534 # Protein_GI_number: 19705094 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 16 462 11 457 459 369 46.0 1e-102 MFTGVRKRQTKNELGIDWKQFYKNVLVLVVPMALQNLINVGVTAADVMMLGAVGEDVLSG ASLAGQIQFIMTLIFMGITSGATVLTAQYWGKGDTRTIEKILGMGLRFGCIVALLFGTGA LFFPETLMRIYTSEPAVIAEGVKYLQIVGCSYIFMAFTQVYLNIMRSIERVVIATVVYMI SLCANITINAILIYGFLGFPALGVRGAAIGTLISRIIEFAIVMCYAHFKNKVVKIRLYDL WHTDRVLLKDYLVYSMPVVLNELMWGMGSSANTAIIGHLGSAAVAANSVAQVVRQLAQVV VFGISNATAIYLGKTIGEKKLEHAKAYGKRFTVLSLILGILGAAVILISAPIANATMALS EEAQGYLVFMFFVMSYFAIAQAVNCTLVVGVFRSGGDTKFGLALDVGTMWGCSILLGAAA AFIFHCSVPVVYVILMSDEIIKLPFTIKRFFSYKWLRDVTRDDLADVS >gi|330399833|gb|ADLB01000007.1| GENE 39 36202 - 36579 571 125 aa, chain + ## HITS:1 COG:L52644 KEGG:ns NR:ns ## COG: L52644 COG0251 # Protein_GI_number: 15673211 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Lactococcus lactis # 1 124 1 125 126 130 52.0 9e-31 MKVVSTEKAPKALGPYSQGYVHNGIFYSAGQIAINPETDTVEADDIAGQTEQVCKNAGEV LKAAGSSFDKVLKTTCFLSDMADFAAFNEVYAKYFTSKPARSCVAVKTLPKNVLCEVELI AVVEE >gi|330399833|gb|ADLB01000007.1| GENE 40 36593 - 37429 906 278 aa, chain + ## HITS:1 COG:FN2098 KEGG:ns NR:ns ## COG: FN2098 COG0489 # Protein_GI_number: 19705388 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Fusobacterium nucleatum # 39 
263 14 238 257 245 50.0 8e-65 MAEQEKNGSCSPESCSSCEHAGSCPSKMDLKVPANEYTHVKKVIGVVSGKGGVGKSMVTA SLARMMREQGYSVGILDADITGPSIPKMYGIHEHAVGNELGMFPCIAKDETRIMSVNLLL DSEDTPVIWRGPIIAGVVKQFWNEVLWGDLDYLFVDMPPGTGDVPLTVFQSLPLDGVVIV SSPQDLVQMIVKKAYYMARQMDIPILGIVENFSYLECPDCKKKISVFGESHVEEIAKELN IDVLGKMPIDPKLAEMVEQEKFYEVHHDYLKDAIDKLA >gi|330399833|gb|ADLB01000007.1| GENE 41 37571 - 38887 756 438 aa, chain + ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 1 430 1 432 440 526 58.0 1e-149 MSIEEKLNILTDAAKYDVACTSSGVDRRNDGTGIGNCVKSGICHSFSADGRCISLLKILF TNECIFDCKYCINRSSNDVPRTSFTPEEVCTLTMEFYRRNYIEGLFLSSGVLKSPDYTMG LLYETLYKLRTKYKFQGYIHVKAIPGASQELIQKTGFLADRMSVNLELPTAESLRLLAPH KTRENILKPMRLVQNVMNENKQEVALYRNAPRFVPAGQSTQMIIGATPETDFQIMHVAES LYKKFGLKRVFYSAFVQVNEDSNLPARTDEGPPLLREHRLYQADWLLRYYGFEAKELLSE DSPNFNVLFDPKCNWALKHLDNFPVEVNKADYYTLLRVPGIAHKSASRIIKARKTTVLGF DDLKKMGVVLKRALYFLTCNGKMMYPTKIEEDYITRNLLSAQGKTPAHIQQMTYRQLSLF DDVNFRKEQVFDEKSIYL >gi|330399833|gb|ADLB01000007.1| GENE 42 38865 - 39632 501 255 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0819 NR:ns ## KEGG: Cphy_0819 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 253 4 256 280 161 36.0 2e-38 MKKVFICENTMTGIYSGIYDAWKMKPTREQVGVALRGNIEQELFCEYVESEPSEKKADAV EHMIQKHLGMEAYREIYHAMLSHNTEKGNAVLGMMIEARNIPNSRRIAEHLGNEDVCKVF ELSRRVSNEAHFFKEIVRFRELSNGVLFAEIEPENQILTCIAEHFANRLPLENWLIYDAV HNMTLIHQREKHWFLVTGEKPDREKTALYSHKERQMETLWKGVCESISISERENYRLQRQ HLPLKYRKHMVEFTQ >gi|330399833|gb|ADLB01000007.1| GENE 43 39756 - 39986 304 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210609708|ref|ZP_03288094.1| ## NR: gi|210609708|ref|ZP_03288094.1| hypothetical protein CLONEX_00278 [Clostridium nexile DSM 1787] # 3 73 9 79 81 88 63.0 1e-16 MRREYKIRLSETSDVKDFVTSAEKCDFDIDISYNRFIIDAKSILGVLSLDLTKVLTVQCA EKNADFEKTIQKYCVA 
>gi|330399833|gb|ADLB01000007.1| GENE 44 40150 - 42486 1730 778 aa, chain + ## HITS:1 COG:CAC1672 KEGG:ns NR:ns ## COG: CAC1672 COG1199 # Protein_GI_number: 15894949 # Func_class: K Transcription; L Replication, recombination and repair # Function: Rad3-related DNA helicases # Organism: Clostridium acetobutylicum # 5 778 7 788 791 635 45.0 0 MNREQIRISVRNLVEFILRKGDIDNRISKTADKEAMQLGSKIHRKIQRQMGSSYHAEVSL KMMLHEEKYDLQVEGRADGIIVEEGVTIDEIKGVFRDLEQIEEPIEVHLAQAKCYAYIYG KQENLENISVQMTYCHLETEQVKRFKESYLLSELEKWFNNLIGEYKKWAEFQIDWKKKRN KSIRRIEFPYEYRKGQKELATSVYRTILRKKKLFIQAPTGVGKTMATIFPSVKAVGEELG EKIFYLTAKTITRTVVEQAFQLLKKQGLQYKVVTLTAKEKICFCEDKECNPEKCPYAKGH YDRVNQAVYEMITTTDDMSRENIEVYARKYEVCPFEMSLDVAVWVDAVVCDYNYVFDPNA HLRRFFADEVKGEYLFLIDEAHNLVERAREMYSATIYKEEFLEAKRIIKYLDKKLVRKLD ICNRQLLELKRECETYQVHQSVGHFSISMTNLLMEMERFMEECDRAEVKEELLEFYFHVR TFLNVYDVLDENYTIYTEMEEDGKFKLKLFCVNPALNLQNFLEKGNSTVFFSATFLPIHY YKQLLSTEKDDYAIYVDSPFDIKNREILIGSDVSSKYTKRNVQMYERIASYIIKTLEVKK GNYIAFFPSYQFMENVYEVLERRLSGETVCLLQEKMMTEEKREEFLEEFAKEREGNLLGF CVMGGIFSEGIDLTEEKLVGAFIVGTGLPQICYEREILRQYFEKKNGKGFDYAYLYPGMN KVLQSAGRVIRTETDRGVILLLDERFLQRQCQEIFPREWAGIKKCTLENISEQLEKFW >gi|330399833|gb|ADLB01000007.1| GENE 45 42731 - 43141 210 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|169351324|ref|ZP_02868262.1| ## NR: gi|169351324|ref|ZP_02868262.1| hypothetical protein CLOSPI_02104 [Clostridium spiroforme DSM 1552] # 1 136 1 136 136 153 73.0 3e-36 MMKLLLLTGIILAKVLIAKGVALVFIILSFIFKGKQLHKSTLEWDKYFLSLTDTFVDKYA KIIYIISTVLSSYVMFLLFKFFDFQYPTSLTLIILAVCSLISWYRYHKNGKSYIKSRVYE IKESIKSENVSNNVTP >gi|330399833|gb|ADLB01000007.1| GENE 46 43221 - 44108 863 295 aa, chain - ## HITS:1 COG:BH2712 KEGG:ns NR:ns ## COG: BH2712 COG0583 # Protein_GI_number: 15615275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Bacillus halodurans # 3 243 1 240 296 160 34.0 2e-39 MDINYELYKVFYYVATTLSFSEASKQLFISQSAVSQSIKALEKKLEQTLFIRSTKKVQLT PEGEILLRHVEPAMNLIKRGEAQIMDSVSTGGQIRIGASDTICRYFLVPYLEKFHKDFPN 
IHIKVTNQTSLKCVDLLEAGQVDLIVTNYPNTNLGNLATVKKIKEFKDMFIANNDFKELK GRKLSFKELLNHPILMLDRKSTTSEFLHSLFQQQQLDLVPEIELSSNDLLIDLARIGLGI AFIPDYCVPKSKNLFVVDTEYELPRRQLVVAHNEHVPTSKATLEFLKYFDYKNYD >gi|330399833|gb|ADLB01000007.1| GENE 47 44238 - 44678 414 146 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEYMEGMEFIHSCLEEDEYVLWRGKPGKKNTFSSRDRAVLPFALFWTAFALLLEGSVIMS GAPLMYCIFGAPFVLIIKRDKDIEVYLPDELPPMKIELHKNGNGTIIFSKETYSRRRSTY YTCCMLEDLPDIAQAQNAINIMKGKE >gi|330399833|gb|ADLB01000007.1| GENE 48 44675 - 45388 476 237 aa, chain + ## HITS:1 COG:no KEGG:Amet_3274 NR:ns ## KEGG: Amet_3274 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 18 209 18 209 229 132 40.0 1e-29 MRIIHRILICLTVISALIVLLVTSTELAIYMDFQFYEKEYEKYHVLDDLDMEMKDVMSVT HEMMDYLHGKRDDLVVMTTVSGEEREFFNDREKQHMVDVKNMFLAGMRTRNAAAVILLVS LTVLIFLKAEWKILLAKWYMGITCGLLAVAIGLGYLFTRDFNKYFVKFHEIFFDNDLWLL DPDTDLMIRMLPEGFFADFTVRIGIFAATAVIGCFIISLFVWHREKKISNKQDLLIN >gi|330399833|gb|ADLB01000007.1| GENE 49 45427 - 49266 4361 1279 aa, chain + ## HITS:1 COG:CAC1655_1 KEGG:ns NR:ns ## COG: CAC1655_1 COG0046 # Protein_GI_number: 15894932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Clostridium acetobutylicum # 16 972 3 959 985 1058 54.0 0 MQLAVNVLIKMEEIMSSVKRVYVEKKPEFAVQAKELRQEVENYLGIKTITNVRVLNRYDI ENLSEETFERACNGVFAEPPVDVLYHETFDVAEGSCVFSVEYLPGQFDQRADSAVQCVQF IKEDEQPVIKTATTYVIEGNISDEEFEAVKTHCINPVDSREADAEKPETLVTVFEEPEDV KVFDGFKDMEEAELKELYDSLGLAMTFKDFLHIQGYFKGEEDRDPSMTEIRVLDTYWSDH CRHTTFSTELKEVTFGEGDYKEPIVDSYKQYLADHSEIFRGREDKFVCLMDLALMAMRKL KREGKLNDQEESEEINACSIVVPVAVDGVEEEWLVNFKNETHNHPTEIEPFGGAATCLGG AIRDPLSGRTYVYQAMRVTGAADPTVSVKETMKGKLPQKKLVTGAAHGYSSYGNQIGLAT GAVKEIYHPDYVAKRMEIGAVLGAAPRRAVIRETSDPGDIIILLGGRTGRDGCGGATGSS KVHTEESIETCGAEVQKGNPPTERKIQRLFRREEVSKLIKKCNDFGAGGVSVAIGELAAG LKVDLDKVPKKYAGLDGTEIAISESQERMAVVVDPKDVEAFLGYAKEENLEAVEVAVVTE SPRLVLNWRGKEIVNISRAFLDTNGAHQETTVKVDVPNRADSILVRPEVNDVKEKWLNTL 
QDLNVCSQKGLVEMFDGSIGAGSVFMPHGGKYQMTETQAMVAKLPVLTGKCDTVTMMSYG FDPYLSSWSPYHGAVYAVLESVAKIVASGGDYRKIHLTFQEYFRRMTEDPSRWSQPFAAL LGAYSVQLGLGLASIGGKDSMSGTFQDIDVPPTLVSFAVDVAEQKDIITPELKAAGNKLV WLHIDTDRYDLPVYESVMNQYGKFRDDVQGGKIVSAYTLDRHGIAAAVSKMAFGNGMGVE ISTTVEKEDLFACYFGDIIAEVPANEVDNLTIAHTVIGEVTDKGAITYGDMAIGLDEAVS TWKAPLEKVFPSVSGENQEDGPTSYFVKPATEDAVIENELYKTSNIYVCKNKVAQPTVFI PVFPGTNCEYDSAKAFERAGAKVVTKIFRNLDASDIRDSVAEFEKIIAQSQMIMFPGGFS AGDEPDGSAKFFATAFQNAKLKEAVEKLINERDGLVLGICNGFQTLIKLGLVPYGKICGQ TEGSPTLTYNTIGRHISKMVYTKVVTNKSPWLQGAELGGVYTNPASHGEGRFVASEEVLN ELFANGQVATQYCDLDGNITMNEEWNPNGSYRAIEGITSPDGRVLGKMAHSERRGDGVAI NIYGEQDMKIFESGVKYFK >gi|330399833|gb|ADLB01000007.1| GENE 50 49385 - 50086 765 233 aa, chain + ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 1 233 1 233 240 281 61.0 7e-76 MNKPLILVVDDDKAIRSLITTTLETQEYRYHAVSNGMQGILEVASQKPDVLLLDLGLPDM DGIKVIQKVRSWSNVPIIVISSRSEDKDKIQALDLGADDYLTKPFSIDEILARIRVVLRR ERCMIYANPEEAEFINGELRINYVAGCAYLKERELHLTPIEYKLLVLLSKNVGKVLTHTY LTREIWGNSWESDVASLRVFMAALRKKIEENPQNPQYIQTHVGIGYRMIRVPD >gi|330399833|gb|ADLB01000007.1| GENE 51 50083 - 50727 588 214 aa, chain - ## HITS:1 COG:lin1022 KEGG:ns NR:ns ## COG: lin1022 COG0569 # Protein_GI_number: 16800091 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Listeria innocua # 6 204 7 206 219 155 40.0 5e-38 MKSILVIGLGRFGRHMTKKFIEEGNSVLAIEKNEERADNAIDILNDIQIADATNESFIKS LGVNNFDLCVVAIGDNFQSALEITVLLKDFGAKYILARACRDVHRKLLLRNGADHVVYAE REMAERLAVKYGSKNIFDYIELTSDTAICEIAVPPSWVGKSIVEKAIRNRYNVSILATKK EGEISPLPHPDHIFEKNETLIIMGQKKVIHSFTY >gi|330399833|gb|ADLB01000007.1| GENE 52 50743 - 52104 970 453 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # 
Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 11 439 11 432 445 243 38.0 7e-64 MSTIHKIQNYFLHLSPMKILLCGYCTIIFIGSCLLSLPIAVRQPQDADFFTGFFTATSAA CVTGLIRVDTYTHWSFFGQAVILGLIQVGGIGFMTLCISIMTITKRKIGLVSRSLMQNSV SAPQIGGIVKMTRFIIIGTFLIESIGAVLLSFHFCPMFGLKKGIWYSIFHSISAFCNAGF DLMGTVQPFSSLTGELGNWYVNIIIMLLIIIGGLGFFVWHDIVTAKWHFHKMHLHSKLVL SVSAFLIIGGTLLLFVCEQGTPMFENLSLSEQLAASSFQSVSARTAGFNTVNLTELTQAG RFIMILLMLVGGSPGSTAGGIKTTTFAVLFLSVITTFRNRKSTEVFGRRLEDFITRKATC VFILYITLSFSIGMFISRTEGINLIDSLFETVSAIATAGLTTGITPGLSTLSHSLLAFLM IFGRVGSLTMLLAFSFTKKQAPSTSPLEKIQIG >gi|330399833|gb|ADLB01000007.1| GENE 53 52335 - 53669 1932 444 aa, chain + ## HITS:1 COG:CAC0737 KEGG:ns NR:ns ## COG: CAC0737 COG0334 # Protein_GI_number: 15894024 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Clostridium acetobutylicum # 1 442 1 442 443 577 65.0 1e-164 MSYVDEVIELVVRKNPAEPEFHQAVKEVLESLRVVVEANEERYRRDALLERMVEPERQFK FRIPWVDDNGQVHVNTGYRVQFNSAIGPYKGGLRLHPSVNLGIIKFLGFEQVFKNSLTGL PIGGGKGGSDFDPKGKSDREIMAFCQSFMTELCKYIGADTDVPAGDIGTGAREIGYLFGQ YKRIRGVYEGVLTGKGLSYGGSLARREATGYGLLYLTEEMLKLNGIELAGKTVSISGAGN VAIYAAEKAQQLGAKVVTVSDSTGWVYDADGIDLEALKEIKEVKRERLTEYTKYRPNAEY HEGKGVWSVKVDIALPCATQNELDLDDAKALVANGVIAVAEGANMPTTLEATEYFQNNGV LFAPGKAANAGGVATSALEMSQNSERLSWSFEEVDGKLKGIMVNICHNMAAAAEKYGVKG NYVVGANIAGFEKVVDAMAAQGVV >gi|330399833|gb|ADLB01000007.1| GENE 54 54120 - 54311 151 63 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167760484|ref|ZP_02432611.1| ## NR: gi|167760484|ref|ZP_02432611.1| hypothetical protein CLOSCI_02858 [Clostridium scindens ATCC 35704] # 1 63 31 93 96 68 50.0 1e-10 MKEDKKNKMIIDSYDYLGNSASSNDCTGLIPSAPVSNSQVQSYEDVYHFTPPEVPIKKCP KDE >gi|330399833|gb|ADLB01000007.1| GENE 55 54488 - 55033 791 181 aa, chain + ## HITS:1 COG:CAC0434 KEGG:ns NR:ns ## COG: CAC0434 COG0245 # Protein_GI_number: 15893725 # Func_class: I Lipid transport and metabolism # 
Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Clostridium acetobutylicum # 1 154 1 154 155 204 68.0 8e-53 MRVGMGYDVHRLVEERELILGGVNIPYEKGLLGHSDADVLLHAIMDALLGAAGLGDIGTH FPDTDEKYKGISSIILLEHVGRLLEEHLYVIENIDATIIAQKPKMKPYIGQMRKNIASAL QIEEDQINVKATTEEGLGFTGTGEGISSQAICALENMTNYSYKVAGNESGCAGCGGCRRD K >gi|330399833|gb|ADLB01000007.1| GENE 56 55033 - 56115 867 360 aa, chain + ## HITS:1 COG:FN1041 KEGG:ns NR:ns ## COG: FN1041 COG4552 # Protein_GI_number: 19704376 # Func_class: R General function prediction only # Function: Predicted acetyltransferase involved in intracellular survival and related acetyltransferases # Organism: Fusobacterium nucleatum # 1 335 1 359 391 81 24.0 2e-15 MELRKLRKEEHIKTRELWEKIFDEDTPAFLDYYYTVKTVENEIYVMEDEEHICAMLHLNP YSMKIDGRSFMTHYIVAVATEEKYRQQGIMRSLLKKALADMKENGEPFTFLMPAAEQIYY PHGFRYIYKQRQGTAEKSGEKKEQWTLSHATEEECEEMALFANGILESEYEVYAERSRYY YEALLKEQKSEQGGILILRENRRLTGLFPYAEGEELEIREPLFLEKGEEILSYVSEYLCD KKQTKVLGYGDMEKPMIMAKILDVEKMFSCMRIKEEIHLKLKIWDTLTNASAGCFLLEGK EKIRAKKTNETKDCVCIDAGDLTGILFGMADMEKLNVPLQVKEELSKIIPLTKVYLNEVV >gi|330399833|gb|ADLB01000007.1| GENE 57 56135 - 57142 1199 335 aa, chain + ## HITS:1 COG:MTH437 KEGG:ns NR:ns ## COG: MTH437 COG1363 # Protein_GI_number: 15678465 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Methanothermobacter thermautotrophicus # 3 333 5 341 343 179 35.0 7e-45 MREDLLKMLLEVPSVSGDEFAVQKVLKEHMGNFYEVETDSIGDSVYTLEGLGEKRILMTS HIDEIGLLVTGATGDGFLKVTNSGGFSAKLYAGHGVQINTDNGVVYGSVVVTSKMLKNSE FSVSDILIDIGASTKEEALSQVQLGNSVVFDANIRQLMNERISARGLDDKAGVYVVMETM KRMKEKHHKATIIGAGTVGEETSKHGAGWVANRVNPTEAIVVDVTYTSDYDGMREAEWGE VELGKGPVLCINPICDRGINQKLMELANERGIPYQVEVSGGRSCTDADEIHMVGKGVPVV LVSIPLRYMHSPAEVADRKDLEYAVELLTAYLENA >gi|330399833|gb|ADLB01000007.1| GENE 58 57484 - 58176 563 230 aa, chain + ## HITS:1 COG:BH0110 KEGG:ns NR:ns ## COG: BH0110 COG1045 # Protein_GI_number: 15612673 # Func_class: E Amino acid transport and metabolism # Function: Serine 
acetyltransferase # Organism: Bacillus halodurans # 2 214 3 215 229 234 55.0 8e-62 MRLITYIKEELQVIKERDPAIKSNMEILLYPSFKVMIRYRMAHKLYLKKHYFLARWISQR AARKTGIEIHPGATIGKGLFIDHGSGVIIGETTVIGNNVTLYQGVTLGGTGKEKGKRHPT LKDNVMVSAGAKILGSFTIGENAKIGAGSVVLEEVPPNCTVVGVPGRIVRMGDQKIPRAD LDQIHLPDPVLNDIRELQNRNIQLQQELKEMEKDMRCRQHKHQEEKNEGI >gi|330399833|gb|ADLB01000007.1| GENE 59 58181 - 59569 1486 462 aa, chain + ## HITS:1 COG:BS_cysS KEGG:ns NR:ns ## COG: BS_cysS COG0215 # Protein_GI_number: 16077162 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Bacillus subtilis # 1 462 9 465 466 506 55.0 1e-143 MTRRKEEFVPLEEGKVKMYVCGPTVYNLIHIGNARPMIVFDTVRRYFEYKGYEVNFVSNF TDVDDKIIKKAIEEGVSANEISQRYIAECKKDMEGMNVKPATKHPLATEEICGMIEMISD LIDKGYAYEKNGTVYFRTRKFEEYGKLSHKNLDDLQSGGRSLLVTGEDEKEDSLDFVLWK PKKEGEPAWESPWCDGRPGWHIECSVMSKKYLGEQIDIHAGGEDLVFPHHENEIAQSEAA NGKEFAKYWLHNAFLNIDNHKMSKSLGNFRTVREIGEKYDLQVLRFFMLSAHYRSPLNFS AALMESSANGLERILTAVANLKHLLGVVSEENMTAEEREQVKETENFVRAYEEAMEDDFN TADAIASIFDLVKFANTTTSAQSSKEYLQTLYDLIVKLSDVLGLIVEKEEEMLAEDIEAL IEERQAARKERNFQRADEIRDELLKKGIILEDTREGVKWKKA >gi|330399833|gb|ADLB01000007.1| GENE 60 59608 - 60003 250 131 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764762|ref|ZP_02171816.1| ribosomal protein S13 [Bacillus selenitireducens MLS10] # 8 130 12 136 141 100 40 2e-20 MQEVDIREYSPLTLAYIGDSIYDLIIKSLVINRGNKQVQKLHKETSSLVQASAQSMMMRA MQEELTDEERAVYKRGRNAKSVSPAKNQSITDYRRATGFEALIGYLYLQEKWKRMLDLVK IGLDSLEKEEV >gi|330399833|gb|ADLB01000007.1| GENE 61 60013 - 60753 707 246 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 8 244 9 246 255 276 55 2e-73 MAEESLIIEGRNAVIEAFRSGKTIDKVFVLDGCQDGPVRTIVREAKKHDTILQFVTKERL DQLSETGKHQGVVAHAAAYEYAEVEDILALAKEKGEDPFIFILDNIEDPHNLGAIIRTAN LAGAHGVIIPKRRAVGLTATVAKTSAGALNYTPVAKVTNLVKTMEELKEKGLWFVCADMG GETMYRLNLTGPIGLVIGNEGTGVGRLVKEECDFVASIPMKGDIDSLNASVAAGVLAYEI VRQRLN >gi|330399833|gb|ADLB01000007.1| GENE 62 60780 - 
61388 689 202 aa, chain + ## HITS:1 COG:BH0115 KEGG:ns NR:ns ## COG: BH0115 COG1595 # Protein_GI_number: 15612678 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 4 194 14 208 217 182 49.0 3e-46 MCRYNMMADEELLRCLRNGEKEITEYIINKYKNLVKEKAKAMFLLGGDNDDLIQEGMIGL FKAIRDYDEEQETSFYHFAELCISRQMYTAIEASKRQKHIPLNSYVSIYDEKDEQPLIDT IQAVKETNPEALFLNKEYLQMIEVELKKNLSEFENKVLYLHLFGMDYQKIAKVLDKTPKS IDNALQRIKGKTEKIVGQKTRG >gi|330399833|gb|ADLB01000007.1| GENE 63 61363 - 61455 76 30 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDKRLGDKNELTNSMDILYINYVAYKYTC >gi|330399833|gb|ADLB01000007.1| GENE 64 61930 - 63597 1795 555 aa, chain + ## HITS:1 COG:PA1794 KEGG:ns NR:ns ## COG: PA1794 COG0008 # Protein_GI_number: 15596991 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Pseudomonas aeruginosa # 7 551 9 553 556 637 55.0 0 MENEVVSKNFIEQEIDKDLREGVYDTVCTRFPPEPNGYLHIGHAKSILLNYGLAQKYNGT FHMRFDDTNPTKEKVEFVESIKEDIKWLGADWKDNLFFASDYFDAMYECAVKLIKKGKAY VCDLTPEEIREYRGTLTEPGKNSPYRDRTVEENLELFEAMKNGEFEDGERVLRAKIDMAS PNINMRDPIIYRVAHMTHHNTGDKWCIYPMYDFAHPIEDAIEKITHSICTLEFEDHRPLY DWVVRECEFNPAPRQIEFAKLYLTNVVTGKRYIKKLVEDGIVDGWDDPRLVSIAALRRRG FTPESIKMFVDLCGVSKSNSSVDYAMLEYCIREDLKMKRPRMMAVLDPIKLIIDNYPEGQ VEYLDVANNLENEELGFRKVPFCRELYIDREDFMEEPPKKYFRLFQGNEVRLMHAYFVKC ESFVKDENGKVVEIHCTYDPETKCGTGFTGRKVKGTIHWVPAPYATKAEVRLYENIVDEE KGVYNKEDGSLNLNPNSLTILKDCYLEPSFDGVKAYDSFQFVRQGYFCVDAKDSKPDALV FNRIVSLKSSFKLPK >gi|330399833|gb|ADLB01000007.1| GENE 65 63657 - 64175 261 172 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148360238|ref|YP_001251445.1| nucleotidyltransferase PLUS glutamate rich protein GrpB PLUS ribosomal protein alanine acetyltransferase [Legionella pneumophila str. 
Corby] # 5 167 175 342 601 105 37 1e-21 MARKIEVVAYRPEWEEMFTEEAEKIRKILGENIIDIYHIGSTSVKNLWAKPIIDIMPVVK DISSVDEHNKEFEQIGYECKGEFGISGRCFYMKGGDNRTHHIHIFEESNQNEIQRHLAVR DYLRENPKKAEEYGLLKRKLAAEFTFDIDGYCDGKDAFVKNMEQQALEWKNE >gi|330399833|gb|ADLB01000007.1| GENE 66 64168 - 65574 1153 468 aa, chain + ## HITS:1 COG:BS_ydeL KEGG:ns NR:ns ## COG: BS_ydeL COG1167 # Protein_GI_number: 16077591 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Bacillus subtilis # 3 464 2 461 463 367 40.0 1e-101 MNELTIRLETKSQLPLYEQIYIYIKNDIQSGKIKYGDKLPSTRALAKYLDVSRSTIELAY EQLLAEGYVEAQAYRGFFVTKIDDLYQWKQLREADPIKEEKKKIEYAYDFTPNGIDLNSF PYRIWRKLAKDVLADDKAELFKAGDSQGEYELRNTICQYLYQSRGVNCEPEQIIVGAGND YLLMLLQTILEGNYKIAFENPTYRQAYRLFQQLSYETATVEMDTKGMRVDKLREIQADIA YVMPSHQYPLGIVMPIQRRMELLRWAAEKEGRYIIEDDYDSEFRYKGKPIPALQGYDRSE KVIYIGTFSKSIAPAIRVSYLVLPKPLLKVYQEKGRFLNSTVSRVDQQIIQRFIGEGYYE RHLNKMRGVYKNRHDIMIGSLKPLLKKCKMSGENAGVHLLLTFPADISENELIERAEKVN IKVYGLSAYDVCKEKKLNPTILLGYANMPEEEIREGVRKLIEAWNEMI >gi|330399833|gb|ADLB01000007.1| GENE 67 65623 - 65961 469 112 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANKWGKRLLGLVAVGAAVGGVVAYLNKKSGCCNSEDEFSDDFENEDFDLDEDLKEAASR EYVSLTPDTSSETESADEEAQPEETSVSDEETVEVSSDNEVSFDETEKSDNE >gi|330399833|gb|ADLB01000007.1| GENE 68 66099 - 67649 1470 516 aa, chain + ## HITS:1 COG:SPy1212 KEGG:ns NR:ns ## COG: SPy1212 COG1502 # Protein_GI_number: 15675176 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Streptococcus pyogenes M1 GAS # 5 516 15 525 525 482 47.0 1e-136 MKAKKAKSGIFKIIFSRTALVAALLLFQIGLMIATITSIEKYAFSVYVSFIVLSMIVIIY IINRKENPAFKMSWILFVTFIPIVGTVFYIFMQIQPGTRYIGKRLFELGEKTKPYMFQND KIIEDLRISKPANANLAHYLSKQVGFPVHRNTKVTYFRVGEEKFEELKRQLESAKHFIFM EYFIVEEGIMWDSILEILKRKVDEGVEVRFMYDGMCSIVLLPYQYPKKLEEMGIRCKMFS 
PIKPVLSTHQNNRDHRKICVIDGQAAFTGGINLADEYINEKERFGHWKDTAVMLQGEAVQ NFTMMFLQMWNVTEKEKENFEKYLTPKSKEFRRELGYVLPYGDSPYDNENIGEQVYFHIL NHAKKYVHIMTPYLILDNEMVTNLTYAAKCGIEVIIIMPHIPDKWYAFVLAKTYYEELIN AGVQIYEYTPGFVHAKVFVSDDDTATVGTINLDYRSLYLHFECGTFIYNNQVVRDIEKDF QDTLKKCQRVTITDLRMRGIVETVTGRVLRLIAPLM >gi|330399833|gb|ADLB01000007.1| GENE 69 67717 - 68079 423 120 aa, chain + ## HITS:1 COG:MA0108 KEGG:ns NR:ns ## COG: MA0108 COG0346 # Protein_GI_number: 20089007 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Methanosarcina acetivorans str.C2A # 1 120 10 131 163 82 39.0 3e-16 MRLKNVLIVVKDIEKSKKFYHDLFGLDTILDNEGNMILTEGLVLQDAKIWKDFVKKDIIS ESNSCELYFEERDIETFAQKLDELYPSIKYVNRLMTHSWGQRVMRFYDLDGNLIEVGTPM >gi|330399833|gb|ADLB01000007.1| GENE 70 68190 - 69470 1496 426 aa, chain + ## HITS:1 COG:MT0373 KEGG:ns NR:ns ## COG: MT0373 COG0104 # Protein_GI_number: 15839743 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Mycobacterium tuberculosis CDC1551 # 5 424 6 424 432 375 45.0 1e-104 MVKAVVGANWGDEGKGKITDMLGEKADIIVRFQGGANAGHTIINEYGKFALHTLPSGVFY EHTTSIIGNGVALNIPVLFKEIQSVIDKGVPMPKILVSDRAQIVMSYHILFDQYEEERLG GKSFGSTKSGIAPFYSDKFAKIGFQVSELFDDELLKEKVVRVCEQKNVLLEHLYHKPLLK PEDLYAELQEYKKMVEPYVCDVSLFLHNAIREGKEILLEGQLGSLKDTDHGIYPMVTSSS TLAAYGAIGAGIPPYEIKQIITVCKAYSSAVGAGAFVSEIFGEEADELRKRGGDGGEFGA TTGRPRRMGWFDVVASKYGCRMQGTTDVAFTVLDVLGYLDEIPVCVAYEIDGKITTDFPT THLLEKAKPVLETLPGWKSDIRGIKKYADLPENCRKYIEFIEEKLGYPITMVSNGPGRDE IIYRNK >gi|330399833|gb|ADLB01000007.1| GENE 71 69515 - 70525 692 336 aa, chain - ## HITS:1 COG:FN0776 KEGG:ns NR:ns ## COG: FN0776 COG2502 # Protein_GI_number: 19704111 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthetase A # Organism: Fusobacterium nucleatum # 9 336 2 327 327 414 63.0 1e-115 MQQLTIPEGYSSPLSIRETEVAIKEIKDYFERALAKSLHLTRVSAPLFVKPESGLNDNLN GVERPVSFGIKEQNDNTVEIVHSLAKWKRYALKRYGFHSGEGLYTDMTAIRRDEDTDNIH 
SLYVDQWDWEKVISREERNTATLEHNVRHVYSALKETEQHISRRYNYIDTILPDEIFFIT SQELENMYPDCTSKEREYRIAKAKGAVFISQIGKILSSGEKHDGRAPDYDDWELNGDIIV YYPVLDIALELSSMGVRVDEYSLKRQLKLADCEDRAELTFQKSLLNGELPYTVGGGIGQS RICMFYLRKAHIGEVQSSIWPEHISKKAEEFNIQLL >gi|330399833|gb|ADLB01000007.1| GENE 72 70688 - 70852 243 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQQEENRKNQEIISIEEYLYKRKKIKEQEERHLSPYKISIKEKSSALVLAELYM >gi|330399833|gb|ADLB01000007.1| GENE 73 70857 - 71072 269 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210608917|ref|ZP_03288054.1| ## NR: gi|210608917|ref|ZP_03288054.1| hypothetical protein CLONEX_00233 [Clostridium nexile DSM 1787] # 1 68 1 68 68 69 64.0 5e-11 MKKTKRILAIIGVILLVALYGSTLLFAFIDTSKSLGLFKASVALTILIPVLLYAYSLIYK LAKKQDNDKEN >gi|330399833|gb|ADLB01000007.1| GENE 74 71247 - 71747 745 166 aa, chain + ## HITS:1 COG:CAC3537 KEGG:ns NR:ns ## COG: CAC3537 COG0653 # Protein_GI_number: 15896773 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Clostridium acetobutylicum # 1 166 1 165 166 143 48.0 2e-34 MSLLEQWRESAYSKEMDKAALQKFWGTYFQIEKEIYEKLLENPDEEVKGTVKELAEKYEI EVFTMTGFLDGINDSLKEPNPIEEMEEDTVVSLAFDKEKLYKNMVDAKADWLYELPQWDK IFTAERKKELYREQKQSGTIRKEKKIGRNDPCPCGSGKKYKKCCGK >gi|330399833|gb|ADLB01000007.1| GENE 75 71813 - 72538 734 241 aa, chain + ## HITS:1 COG:CAC0441 KEGG:ns NR:ns ## COG: CAC0441 COG1434 # Protein_GI_number: 15893732 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 92 241 106 258 259 120 45.0 2e-27 MSRAFYSMGILSILYYLLIVIHTHNWKATFARFWAFFGVLQFVVGFGVERMAKWLYLPFQ IGFTVAFLVFLAVEILILSGMIPSSVDDFFTIIILGACVKGKRVTGSLRKRLDKGAEYLL GHENTKVIVSGGQGKGEDVTEAFAMKNYLLDKGIDGKRIIMEEKSHSTEENLKYSLQYIT DTKEKVGIVTNNFHVYRGIKLAKRVGYKDVNGIPAGSDPVLFLNYLVREFFAVVYLVLFH R >gi|330399833|gb|ADLB01000007.1| GENE 76 72586 - 73200 720 204 aa, chain + ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 
15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 2 204 4 208 211 193 47.0 2e-49 MKTVIIGIAGGSGSGKSTFTNRLKAAFKNQLTVIYHDNYYKCNDGVPFEVRKKTNYDHPE ALDTELLIKHLKALKSGKTIEGPTYDYSKHNRAKEKVTLRPTKIILVEGILVLENKKLRD LLDIKIFVEADADERILRRVIRDVRERGRYVEDIAEQYLTTVKPMHYIYVEPTKAMADIV INSGMNDVAFDLVKTKIETLLIDK >gi|330399833|gb|ADLB01000007.1| GENE 77 73511 - 74365 610 284 aa, chain + ## HITS:1 COG:CAC1958 KEGG:ns NR:ns ## COG: CAC1958 COG0656 # Protein_GI_number: 15895230 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Clostridium acetobutylicum # 1 278 1 270 274 258 43.0 1e-68 MKNIHDFYTLSNGVRIPCIAFGTYKAAQEDNVKTIQLAIEAGYRYFDTASFYDTERVLGE AIKQSGIPREEFFVVSKVWKTEMGYENTKRAFKQSLERLQMEYLDGYLIHWPKPYPEAED WKELDRETWRAMEELYLEGKIRTIGLSNFLPQHIEPLLQKCQIQPMMNQLEIHPGYSQTA AVQYCKEKGMILQAWSPIGRGRVLQDELIVELAKKYGVTPARICLRYLVQNGIIPLPKSS SLERMKENQNVFSFALSKEDMYRLETMPQTGWSGEYPDRERITE >gi|330399833|gb|ADLB01000007.1| GENE 78 74383 - 74835 576 150 aa, chain + ## HITS:1 COG:BH2288 KEGG:ns NR:ns ## COG: BH2288 COG0824 # Protein_GI_number: 15614851 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Bacillus halodurans # 8 133 13 138 143 107 40.0 1e-23 MKAYKHKVQYYETDQMGIVHHSNYIRWFEEARTDFMEQLGMGYDEMEKEGILSPVLEVEA TYLRMVRFGDTVTITTRIKEYNGIKLTVAYEIHNDRTGMIHCKGVTKHCFLTKMGKPISL KKDLIEFHNMFAKGLEDSQNNNKEEEEGEK >gi|330399833|gb|ADLB01000007.1| GENE 79 74835 - 76796 2218 653 aa, chain + ## HITS:1 COG:BH3141 KEGG:ns NR:ns ## COG: BH3141 COG0441 # Protein_GI_number: 15615703 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Bacillus halodurans # 1 640 5 644 645 579 46.0 1e-165 MKITLKDGSVKEYAESKSVYEIALDISEGLARAACAGEINGEVVDLRTMVEEDCSLNILT ANDKEGLAALRHTASHVLAEAVKRLYPEAKLAIGPSIDTGYYYDFEHAPFSNEDLATIEK EMKKIIKEGAKIERFELPRAEAIAYMEEKGEPYKVELINDLPEDAVISFYKQGEFVDLCA 
GPHLMSTKGVKAFKLISSSGAYWRGDEKNKMLTRVYGTAYAKKDELKAHMEHLEEIKKRD HNKLGREMGLFTTVDVIGQGLPLLMPKGAAIIQTMQRWVEDEEEKRGYVRTKTPLMAKKD LYVISDHWGHYKEGMFVLGDEDKEGEEVFALRPMTCPFQYYVYKQSQKSYRDLPCRYGET STLFRNEDSGEMHGLTRVRQFTISEGHLVITPEQVEDEFRGCVDLAKYCLTTLGVEEDVT YRLSLADPDNMDKYLGTREMWDEVEDIMRKMLNHLEIDYTEETGEAAFYGPKLDIQAKNV YGKEDTMITIQLDMFLAERFDMSFVDKDGEKKRPYIIHRTAMGCYERTLAWLIEKYAGLF PTWLCPEQVRVLPISEKYHDYAHQVLGELRANGIKCTVDDRAEKIGYKIRETRLDKVPYM LVVGAKEEEEKVVSVRSRFLGDEGQKSLDTFIADICKEIRTKEIRKIEVEENK >gi|330399833|gb|ADLB01000007.1| GENE 80 76848 - 77177 321 109 aa, chain + ## HITS:1 COG:no KEGG:CPE2333 NR:ns ## KEGG: CPE2333 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens # Pathway: not_defined # 1 106 1 102 105 61 40.0 9e-09 MPELKCTVQTCVHNKQFLCDLDAIEVGGSSAKNAEETCCDSFQERKGNGYSNSYSDVSGN TASDRSEIDCKATDCMYNEKCQCHAGKISVEGSNACDCDGTECATFTCK >gi|330399833|gb|ADLB01000007.1| GENE 81 77229 - 78047 498 272 aa, chain - ## HITS:1 COG:no KEGG:TherJR_1918 NR:ns ## KEGG: TherJR_1918 # Name: not_defined # Def: hypothetical protein # Organism: Thermincola_JR # Pathway: not_defined # 2 272 1 300 300 164 34.0 4e-39 MIIDILKDTILDSIHILPFLFLAFFLLEMFSHHSKRIHFTLNPFLAGLLGCIPQCGIPVL AVNLYSGGMLSPGTLLALLISSTDESSLILFKETQGQEIFLPLLITKFIISVIAGYFVDF FLKDKFTPPGQSAACDCHSCHHSHGIFLASLRHTLELFIYLFICSFGLGLLLYVTDIDTL SRMLLGGSVLQPLLTTLIGLIPNCAASLLLCELYLDGLIGFSSLVAGLCASCGVGLLVLL KTKMDKKEIGKLIGFLYLASAVSGVILSIFFK >gi|330399833|gb|ADLB01000007.1| GENE 82 78172 - 79434 1395 420 aa, chain + ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 16 403 1 383 383 142 27.0 2e-33 MEEKNKPKNITPRTQMKIKQYFMRGLTSFLVILAGIVCYFAFLRLDDIAGLLGKIGGILQ PIILGLVFAYLLNPLVTIIENQVIKMFEKKAKNREKLRKVSRSIGIAGSLLVAFAVVILL LNMVIPELYRSIRDLIMGLPHQINNAIDFLESEKIHDTAFSGTIKSVLENGAEAFQMWLK TDLLTRINQMMSFVTVGVFSVVETLFDIAVGVIVSVYVLNSKEKFTGQCKKITYALLSRE 
RANLMLQITRKSHKIFGGFVIGKIIDSIIIGILCFIGLSILDIPYTLLVSVIVGVTNVVP FFGPYIGAIPSAILILLSEPIKGIYFVIFILVLQQFDGNILGPKILGDSTGLSAFWVVFS ILLGGGLFGFVGMIMGVPTFAVFYYLVSMFIEQKLERKKLPKESKEYEEIQYIEEKKEEE >gi|330399833|gb|ADLB01000007.1| GENE 83 79440 - 80360 974 306 aa, chain + ## HITS:1 COG:BH2280 KEGG:ns NR:ns ## COG: BH2280 COG1897 # Protein_GI_number: 15614843 # Func_class: E Amino acid transport and metabolism # Function: Homoserine trans-succinylase # Organism: Bacillus halodurans # 1 301 1 301 303 390 61.0 1e-108 MPIRVQNDLPAKEILEKENIFVMDEHRAVHQDIRPIRIGILNLMPLKEDTELQLLRSLSN TPLQVDVIFVHVVSHQSKNTATSHLNRFYETFEEIKHQKFDGFIITGAPVEQMPFEEVDY WEELKKIMEWTKTNVTSTLHLCWGAQAGIYYHYGIDKEPLEEKLFGVFWHKVLNRKIPLV RGFDDVFLAPHSRHTDVPLENIRQDSRLTVLAESEKAGAFLVMAQEGRQIFVMGHPEYDR ITLDKEYKRDKEKNLPISMPENYYENDDDTKKPLLTWRATANNLYTNWLNYYVYQVTPYD MLGTPF >gi|330399833|gb|ADLB01000007.1| GENE 84 80611 - 80682 80 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTETMGVADLCRSNGPMSRFSTS >gi|330399833|gb|ADLB01000007.1| GENE 85 80701 - 80958 248 85 aa, chain - ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 1 72 206 277 279 81 48.0 4e-16 MQCSYSKKSYPWDNACIESFHSLIKREWLNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSH CDYMSPNDYEELYRRLQQDELQLAG Prediction of potential genes in microbial genomes Time: Tue May 24 20:59:32 2011 Seq name: gi|330399666|gb|ADLB01000008.1| Lachnospiraceae bacterium 2_1_46FAA cont1.8, whole genome shotgun sequence Length of sequence - 52890 bp Number of predicted genes - 56, with homology - 50 Number of transcription units - 24, operones - 14 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 65 - 349 465 ## EUBREC_2191 hypothetical protein + Prom 479 - 538 5.3 2 2 Op 1 . + CDS 617 - 898 350 ## CKR_2548 hypothetical protein 3 2 Op 2 . + CDS 962 - 1315 172 ## 4 2 Op 3 . 
+ CDS 1345 - 1575 304 ## gi|225026030|ref|ZP_03715222.1| hypothetical protein EUBHAL_00269 5 2 Op 4 . + CDS 1635 - 2234 520 ## COG1073 Hydrolases of the alpha/beta superfamily 6 2 Op 5 . + CDS 2297 - 2737 477 ## COG4405 Uncharacterized protein conserved in bacteria + Prom 2907 - 2966 8.3 7 3 Tu 1 . + CDS 3113 - 3622 380 ## gi|260587020|ref|ZP_05852933.1| putative toxin-antitoxin system protein + Prom 3662 - 3721 3.5 8 4 Op 1 . + CDS 3741 - 4235 198 ## PROTEIN SUPPORTED gi|229872047|ref|ZP_04491633.1| acetyltransferase, ribosomal protein N-acetylase 9 4 Op 2 4/0.000 + CDS 4315 - 4620 185 ## COG0640 Predicted transcriptional regulators + Prom 4650 - 4709 4.3 10 4 Op 3 . + CDS 4753 - 5706 841 ## COG0701 Predicted permeases + Prom 5712 - 5771 3.4 11 5 Op 1 . + CDS 5833 - 5928 129 ## 12 5 Op 2 . + CDS 5954 - 6385 278 ## COG0394 Protein-tyrosine-phosphatase 13 6 Op 1 . - CDS 6417 - 6707 271 ## 14 6 Op 2 . - CDS 6710 - 7006 321 ## Mbar_A0863 hypothetical protein - Prom 7032 - 7091 6.9 + Prom 7006 - 7065 7.9 15 7 Op 1 . + CDS 7210 - 9444 2549 ## COG1409 Predicted phosphohydrolases 16 7 Op 2 . + CDS 9486 - 10655 1092 ## COG5438 Predicted multitransmembrane protein + Term 10665 - 10703 2.0 + Prom 10686 - 10745 4.5 17 8 Op 1 . + CDS 10791 - 10862 80 ## 18 8 Op 2 12/0.000 + CDS 10878 - 12998 2169 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 19 8 Op 3 . + CDS 12999 - 13487 505 ## COG0602 Organic radical activating enzymes 20 8 Op 4 . + CDS 13502 - 14014 538 ## COG0756 dUTPase + Term 14034 - 14077 10.1 + Prom 14030 - 14089 4.2 21 9 Op 1 . + CDS 14126 - 15496 1309 ## Cphy_0751 GerA spore germination protein 22 9 Op 2 . + CDS 15558 - 16421 1119 ## COG1307 Uncharacterized protein conserved in bacteria + Term 16429 - 16486 12.3 - Term 16415 - 16474 14.3 23 10 Tu 1 . - CDS 16478 - 17833 1231 ## COG0733 Na+-dependent transporters of the SNF family - Prom 17864 - 17923 12.8 + Prom 17900 - 17959 9.8 24 11 Op 1 . 
+ CDS 18169 - 18663 405 ## PROTEIN SUPPORTED gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 25 11 Op 2 . + CDS 18720 - 18917 244 ## PROTEIN SUPPORTED gi|160878606|ref|YP_001557574.1| ribosomal protein L35 26 11 Op 3 . + CDS 18949 - 19302 517 ## PROTEIN SUPPORTED gi|240146873|ref|ZP_04745474.1| ribosomal protein L20 + Term 19524 - 19583 1.4 + TRNA 19436 - 19520 49.8 # Leu AAG 0 0 - Term 19420 - 19490 22.7 27 12 Tu 1 . - CDS 19575 - 19979 61 ## - Prom 20003 - 20062 8.6 28 13 Op 1 3/0.000 + CDS 20138 - 21145 932 ## COG0232 dGTP triphosphohydrolase 29 13 Op 2 31/0.000 + CDS 21225 - 23000 1676 ## COG0358 DNA primase (bacterial type) 30 13 Op 3 5/0.000 + CDS 23037 - 24149 1222 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) 31 13 Op 4 9/0.000 + CDS 24158 - 24847 645 ## COG2384 Predicted SAM-dependent methyltransferase 32 13 Op 5 . + CDS 24862 - 25638 845 ## COG0327 Uncharacterized conserved protein 33 13 Op 6 . + CDS 25676 - 27343 688 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 - Term 27276 - 27321 6.7 34 14 Op 1 . - CDS 27340 - 28224 324 ## PROTEIN SUPPORTED gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit 35 14 Op 2 8/0.000 - CDS 28243 - 30147 1362 ## COG0687 Spermidine/putrescine-binding periplasmic protein 36 14 Op 3 30/0.000 - CDS 30147 - 30965 756 ## COG1176 ABC-type spermidine/putrescine transport system, permease component I 37 14 Op 4 4/0.000 - CDS 30955 - 32028 1293 ## COG3842 ABC-type spermidine/putrescine transport systems, ATPase components 38 14 Op 5 . - CDS 32052 - 32591 396 ## COG1396 Predicted transcriptional regulators - Prom 32663 - 32722 13.2 - Term 32686 - 32743 8.3 39 15 Tu 1 . - CDS 32751 - 33851 800 ## COG0082 Chorismate synthase - Prom 33878 - 33937 9.0 + Prom 33878 - 33937 6.4 40 16 Op 1 . 
+ CDS 34052 - 35674 1597 ## COG1376 Uncharacterized protein conserved in bacteria 41 16 Op 2 15/0.000 + CDS 35694 - 36827 997 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase + Prom 36831 - 36890 5.5 42 16 Op 3 . + CDS 36916 - 37188 375 ## COG1862 Preprotein translocase subunit YajC + Term 37209 - 37252 11.9 + Prom 37356 - 37415 6.7 43 17 Tu 1 . + CDS 37446 - 38867 1312 ## COG0726 Predicted xylanase/chitin deacetylase + Term 38879 - 38919 6.5 - Term 38860 - 38914 10.1 44 18 Tu 1 . - CDS 38919 - 40091 1130 ## COG0462 Phosphoribosylpyrophosphate synthetase - Prom 40124 - 40183 7.1 + Prom 40153 - 40212 5.1 45 19 Op 1 1/0.000 + CDS 40247 - 41560 1285 ## COG1625 Fe-S oxidoreductase, related to NifB/MoaA family 46 19 Op 2 2/0.000 + CDS 41574 - 42902 1632 ## COG1160 Predicted GTPases 47 19 Op 3 1/0.000 + CDS 42904 - 43542 755 ## COG0344 Predicted membrane protein 48 19 Op 4 . + CDS 43554 - 44564 1301 ## COG0240 Glycerol-3-phosphate dehydrogenase + Term 44574 - 44612 4.2 - Term 44561 - 44599 4.2 49 20 Tu 1 . - CDS 44602 - 45528 699 ## COG3965 Predicted Co/Zn/Cd cation transporters - Prom 45554 - 45613 9.0 + Prom 45507 - 45566 8.6 50 21 Op 1 . + CDS 45655 - 47127 1567 ## Cphy_2385 stage IV sporulation protein A 51 21 Op 2 . + CDS 47187 - 48212 979 ## COG2008 Threonine aldolase + Term 48220 - 48257 6.4 - Term 48203 - 48251 6.0 52 22 Tu 1 . - CDS 48254 - 48829 803 ## COG0503 Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins - Prom 48981 - 49040 10.3 + Prom 48924 - 48983 6.9 53 23 Op 1 . + CDS 49011 - 49598 525 ## COG1896 Predicted hydrolases of HD superfamily 54 23 Op 2 . + CDS 49612 - 51306 1124 ## COG0249 Mismatch repair ATPase (MutS family) 55 23 Op 3 . + CDS 51309 - 51791 450 ## COG0328 Ribonuclease HI + Term 51943 - 51983 -0.3 56 24 Tu 1 . 
+ CDS 52304 - 52889 225 ## Predicted protein(s) >gi|330399666|gb|ADLB01000008.1| GENE 1 65 - 349 465 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK >gi|330399666|gb|ADLB01000008.1| GENE 2 617 - 898 350 93 aa, chain + ## HITS:1 COG:no KEGG:CKR_2548 NR:ns ## KEGG: CKR_2548 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 93 1 93 94 71 45.0 1e-11 MEYGIIVLMIIGGILLYIKINELQKQVKSQQIQIDKLCKETGNQQLATYFISDEEKEYIT HLKKSVKEVEAVKKVREVTSMDLVQAKQYVDTL >gi|330399666|gb|ADLB01000008.1| GENE 3 962 - 1315 172 117 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSNKLVKSKKPPQKTNEGIVFYNQIGRIFIFLGIISFIILGFVSFLPQDYEKYNIPLFYG IWAAISSISMTVGVLLLLLGKYCKKYQIWIEKEVKSHQNRMLKQAEKSIEKDMRKHK >gi|330399666|gb|ADLB01000008.1| GENE 4 1345 - 1575 304 76 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225026030|ref|ZP_03715222.1| ## NR: gi|225026030|ref|ZP_03715222.1| hypothetical protein EUBHAL_00269 [Eubacterium hallii DSM 3353] # 1 76 1 76 76 80 69.0 2e-14 MKCLDCGMEMEQGTVEGVGQGGGHWYEFTSDEEKKKVGLKGFFTRKTISVKTSVLESPAW YCPKCKKILMWLDSKE >gi|330399666|gb|ADLB01000008.1| GENE 5 1635 - 2234 520 199 aa, chain + ## HITS:1 COG:FN0852 KEGG:ns NR:ns ## COG: FN0852 COG1073 # Protein_GI_number: 19704187 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Fusobacterium nucleatum # 1 197 1 197 199 238 60.0 4e-63 MKKAILYIHGKDGNANEAEHYKSICSGYDIFGMDYSSKTPWEAKEEFPVIFDTLCGKYDS VIIIANSIGAFFAMNALQNKNIKKALFISPIVNMEKLIGDMMAWANVTETELCDKKEIPT EFGETLSWKYLCYVRKNPIDWKIPTNILFGTKDNLTSYDTVSMFADKIGATLTIMENGEH WFHTNEQIEFLDNWLRHSI >gi|330399666|gb|ADLB01000008.1| GENE 6 2297 - 2737 477 146 aa, chain + ## HITS:1 COG:SP0796 KEGG:ns NR:ns ## COG: SP0796 COG4405 # Protein_GI_number: 15900689 
# Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 144 1 144 146 162 54.0 2e-40 MSGKELWEIFVTENNIGECHYEEWSFGVEADLLVHLVATGEKTATASAYPLYELENEPLP AIGAYSVILDSKDNAICIIQTKKVTIVPFYAVTAEHAHKEGEGDKSLDFWREVHEKFFTE CLSEAGLKFTSDMKVVCEEFEVVYKQ >gi|330399666|gb|ADLB01000008.1| GENE 7 3113 - 3622 380 169 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260587020|ref|ZP_05852933.1| ## NR: gi|260587020|ref|ZP_05852933.1| putative toxin-antitoxin system protein [Blautia hansenii DSM 20583] # 1 169 1 170 170 275 83.0 6e-73 MIFNFNPWQLDADVDLTKQLYEEVDYSVDKTANIEFIESLSSEQQLFFNSLGIDLAKIEV DKVIYEIPKDEEISTSKIYRMSVNFLIRGKILALPKYQKDIYSDEEVFGKKIPESIKILS DDEDYLKTFDNGIGAGIVFKHPCFHYGDERFKEWDCGYILGTILIMKDM >gi|330399666|gb|ADLB01000008.1| GENE 8 3741 - 4235 198 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229872047|ref|ZP_04491633.1| acetyltransferase, ribosomal protein N-acetylase [Spirosoma linguale DSM 74] # 3 164 2 165 167 80 34 1e-14 MEIKTKRLKIVALTPEQLEMLVNNISKFERELHCSYQGEKVEGVFKHILFNQSLKAKKNY KDYLWLTFWIIVRKEDDIVVGMIDFKDVPNAKGEVEIGYGLGNHHEHLGYMTEAVEAFCS WGKKQNAVKTIVAETEMENFPSQRLLKKCGFIEFFRDKTIWWRL >gi|330399666|gb|ADLB01000008.1| GENE 9 4315 - 4620 185 101 aa, chain + ## HITS:1 COG:pli0034 KEGG:ns NR:ns ## COG: pli0034 COG0640 # Protein_GI_number: 18450316 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Listeria innocua # 3 73 2 72 97 80 47.0 6e-16 MTDFSKDAKIFKALSDTKRLTILEYLKSGEKCACVLIENMNIGQSALSYHMKILCDSGIV TARQDGKWTHYSLSKSGSEYASKRLLELTTSNTENQSNCCK >gi|330399666|gb|ADLB01000008.1| GENE 10 4753 - 5706 841 317 aa, chain + ## HITS:1 COG:MTH894 KEGG:ns NR:ns ## COG: MTH894 COG0701 # Protein_GI_number: 15678914 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Methanothermobacter thermautotrophicus # 10 311 17 325 327 259 45.0 4e-69 MNGLIGNGLSLLGLDINSRLGGSIQFFLYDVIKITILLCLLIFFISYIQSYFPPERSKKI 
LGRFHGIGANIISALLGTVTPFCSCSSIPLFIGFTSAGLPLGVTFSFLISSPMVDLGSLV LLMSIFGAKVAFAYVIVGLVIAVLGGTLIEKLHMEKYVEDFVKNASSVDISSPTLTKKDR VQYAKEQVVGTFKKVFPYILLGVGIGATIHNWIPETWIENILGSNNPFGVILATLVGIPM YADIFGTIPVAEALLAKGAQLGTVLSFMMAVTTLSLPSLIMLKKAVKPKLLTLFIAICTV GIIIVGYGFNIFSTLFI >gi|330399666|gb|ADLB01000008.1| GENE 11 5833 - 5928 129 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSTPALVLDGEVISYGKVLTVEEVKALLTKK >gi|330399666|gb|ADLB01000008.1| GENE 12 5954 - 6385 278 143 aa, chain + ## HITS:1 COG:CAP0105 KEGG:ns NR:ns ## COG: CAP0105 COG0394 # Protein_GI_number: 15004808 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Clostridium acetobutylicum # 3 131 2 130 136 159 58.0 1e-39 MRKLKVAFICVHNSCRSQIAEALGKYLASNAFESYSAGTEIKPQINQDAVRLMKEMYQID MEKTQYSKLLSDIPQVDIVITMGCNVDCPAIPCKYREDWGLNDLTGKDDIEFSKIIRTIH SNILELMERVLIYDKIAKGDGTK >gi|330399666|gb|ADLB01000008.1| GENE 13 6417 - 6707 271 96 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPTYLIIVIIAVVFALMGVQTILTKRKSPFWGLIIPVLIVATGIYFYFFRNIEVNYKNVA VFAVPLVWCLIECYQGRKRRTQEIEKQIQKMKAKDL >gi|330399666|gb|ADLB01000008.1| GENE 14 6710 - 7006 321 98 aa, chain - ## HITS:1 COG:no KEGG:Mbar_A0863 NR:ns ## KEGG: Mbar_A0863 # Name: not_defined # Def: hypothetical protein # Organism: M.barkeri # Pathway: not_defined # 1 96 4 99 102 127 75.0 8e-29 MEIDKIPQWILALEQEDATFLKNFVLKSGSLKEIAKIYGVTYPTVRLRLDKLIQKIEVSD QKEEEPFTTFVKGLAVDSRIDLETAKIIIEKYKSEKEK >gi|330399666|gb|ADLB01000008.1| GENE 15 7210 - 9444 2549 744 aa, chain + ## HITS:1 COG:CAC0205 KEGG:ns NR:ns ## COG: CAC0205 COG1409 # Protein_GI_number: 15893498 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 65 478 47 454 652 225 35.0 2e-58 MKQKSKKKMLLSLAVVTTVSMIGNTQVQAFGEYGQGDGTNKLNSGLTATQDYQTWYNTKW NQRESGEMDSGKVVLTPGKDERSLNFAWYSEEKGVPQIRISANQDMTQAKTFTGTAESIQ KNNLFKTYTSSNKVSVENYLQENTTYYYQCSTDGKNWSGTNKYQTHSFSQYQAVLVGDPQ IGASGSNGQGTEEDTDIAVNTYSWNKTLNCALGESGIAQNASFILSAGDQIDYSDDNYTI 
REQEYAGYLYPEVLRSVPVSTTIGNHESKGDDYSYHYNNPNASELGSTESGGDYYYSYGD ALYIVLNSNNRNMEEHRQLMAQADESHKDAKWKIVMFHHDIYGSGSPHSDVDGANLRILF APLMDEFDVDVCLTGHDHSYARTYQILDGKVIDTEGVGEGASNAVNPEGTLYIAAGSATG SKFYTLNTTKQYYLAERSNTPIPTFSTIDVSADKLTLKTYDYEGNKYANDVTIQKNNGAT SIIEEKDAAEKLDLSVMTSGSKVRVQDALKNVDTILDTRDDSKATGELVRKYNTAEDPVN YYAYAQNGYGDTSNNKVLKKGYSTLLDKTLYENDTNVSVSAEMITKAYANLTYAKEEIVT TTEFTELVNQFAEAEKELSNATIGNKKGEYAQETVDTFKKVLGELKVQADETKITKTEWT QLSAKLADERTKFAQSANQKDKEEKPIGSHNPESSTPDTGASGNGTSAKPGNTVKTGDTQ PIGLLGAAGFVSLAVAVLVKRKNK >gi|330399666|gb|ADLB01000008.1| GENE 16 9486 - 10655 1092 389 aa, chain + ## HITS:1 COG:CAC0206 KEGG:ns NR:ns ## COG: CAC0206 COG5438 # Protein_GI_number: 15893499 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 34 380 21 374 397 226 40.0 8e-59 MEEIVKKSIAAFGKLSRKEKCKHFCVWGAILLFLLFLFVFNQKIEKKQLIEENGNQFEKA EVVEIVSENRNEDGSQQGTQKIKVRLKSGEFKGEVVEATNIDSYLYGADCKVGTAVVVQI SEYNGKISASVYNYNRTTILFAMIGVFLLALVLIGRGKGVTSALGLIFTFVCILYLYLPM MYLGFSPFFSAVVVTILTTLVTMYFIGGFSKKTLCAIVGTVGGVVIAGIFANLFGKLGHI TGHNVSDIETLLYIGQNSKLQIGGLLFSGILIASLGAVMDVAMSISTTIEELHFHNPNLT RQQLFKSGIKIGGDMMGTMSNTLILAFTGGSLSTLMTFYAYDMPFLQMMNSYEMGIEIIQ GISGSLGVILTVPLVSLVAAVYMGRPAVE >gi|330399666|gb|ADLB01000008.1| GENE 17 10791 - 10862 80 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHSRTRYSDYAKKIILNIEYIDN >gi|330399666|gb|ADLB01000008.1| GENE 18 10878 - 12998 2169 706 aa, chain + ## HITS:1 COG:PM0940 KEGG:ns NR:ns ## COG: PM0940 COG1328 # Protein_GI_number: 15602805 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Pasteurella multocida # 4 703 6 709 713 313 30.0 9e-85 MIKVIKKDGTKEDFNVQKVIVAVNKSAYRALIKFTEEDLDFICKFVEEEVQKMGVEEIRI AQMHNVVEGALEKVNPVVAKSYRDYRNYKQDFVQMLDEVYKKSQSIMYIGDKENSNTDSA LVSTKRSLIFNELNKELYKKFFLTVEEIQAIREGYIYIHDMSARRDTMNCCLFDVKNVLT GGFEMGNLWYNEPKSLDTAFDVIGDIVLSAASQQYGGFTVPSVDDILEPFAEKSHKALLE 
KYKNLGLSEEKAHEVAWADLEKEMEQGFQGWEYKFNSVSSSRGDYPFITVTSGTNTSIYG KLATIKMLEVRKNGQGKEGHKKPVLFPKIVFLYDENLHGPGKPLEDVFEAGIECSRKTMY PDWLSLTGKGYVPSMYKQYGKIVSPMGCRAFLSPWYEKGGMNPADENDKPVFVGRFNIGA VSLHLPMIYAKAKQENRDFFEVLDYYLELIRQLHIRTYEYLGEMKASTNPLAYCEGGFYG GNLGLYDKIKPLLKSATASFGITAINELQQLHNKKSLVEDGEFAIKTLEYINKRVNEFKK EDGHLYAIYGTPAENLCGVQVQQFRKKYGIVENVSDREYVSNSFHCHVTEDITPIEKQDL EARFWDLCNGGKIQYVKYPINYNVEAIKALVRRAMEMGFYEGVNLSLSYCDDCGHEELSM DTCPKCGSTNLTKIDRMNGYLSYSRVKGDTRLNDAKMAEIAERKSM >gi|330399666|gb|ADLB01000008.1| GENE 19 12999 - 13487 505 162 aa, chain + ## HITS:1 COG:FN0312 KEGG:ns NR:ns ## COG: FN0312 COG0602 # Protein_GI_number: 19703657 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Fusobacterium nucleatum # 1 159 1 165 168 138 43.0 5e-33 MRYHNITKDDMLNGDGLRVVLWVSGCSHCCKECHNPITWDANGGLEFDSAAKEELFAELS KSYINGVTFSGGDPLHINNIYEIADLAKEIREKFPHKTIWLYTGSVWESIKDMELMRYID VLVDGEFECDKKDANLHWKGSSNQRVIDVQETLKKGKVVLHD >gi|330399666|gb|ADLB01000008.1| GENE 20 13502 - 14014 538 170 aa, chain + ## HITS:1 COG:SP0021 KEGG:ns NR:ns ## COG: SP0021 COG0756 # Protein_GI_number: 15899969 # Func_class: F Nucleotide transport and metabolism # Function: dUTPase # Organism: Streptococcus pneumoniae TIGR4 # 40 168 18 145 147 94 41.0 8e-20 MKRIAKFHKVSYEQFKEGWTDTFGKIEDGKIKEIYDGIRLPKRATSGSAGYDFYAPVTVV LKPGETIKIPTGIRVEMEENWVLKCYPRSSLGFKYRLQLNNTVGIIDSDYFYSDNEGHIF VKLTNDSNEEKTAEILKDTGFMQGIFVEYGITFDDDVTNVRNGGLGSTTK >gi|330399666|gb|ADLB01000008.1| GENE 21 14126 - 15496 1309 456 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0751 NR:ns ## KEGG: Cphy_0751 # Name: not_defined # Def: GerA spore germination protein # Organism: C.phytofermentans # Pathway: not_defined # 9 451 4 445 473 526 55.0 1e-148 MVEKMTTRKVSSSLSENVEYMNQVLPVDESFDLIRRDIVIGEKKSSFYFIDGFTKDDTMQ KLMTSFLGVNKYDMPEDATTFSQKFIPYVEVDVLTEFDEILRNVLSGVSCFFIEGYSACL AIDCRTYPARSVEEPDKDKSLRGSRDGFVETIVFNTALMRRRIRDPHLIMQMMDIGDSSR 
TDVVVAYMEDRVDKELLKNLTNRLNEIKVDALRMNQQSLAEQLFKRKWFNPFPKFKFTER PDTASACLLEGKVVLLVDNSPSAMILPTSIFDMIEEANDYYFPTVTNVYLKVSRAIITVA TVFVTPLFLLFMQNPQWLPKMFEFVLIRDVQNIPLLYQLLLLELAIDGLRLAAMNTPSML STPLSVIAGIVMGEFSVKSGWFNSEVMLYMAFVAVANYTQPNFELGYALKFMRLILLVLT AWFNWIGFVIGCVIVILSVSCNKTLSGRNYLNVKLN >gi|330399666|gb|ADLB01000008.1| GENE 22 15558 - 16421 1119 287 aa, chain + ## HITS:1 COG:SPy1936 KEGG:ns NR:ns ## COG: SPy1936 COG1307 # Protein_GI_number: 15675739 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 4 284 3 282 286 179 37.0 6e-45 MKEFIIATDSTVDLPKAFLEENHVLTISLSYVMDGVTYKDLDGLSHEEFFEKIRNGSLPT TSQINPEEARKALEPVVKEGKEILYLGFSSGLSGSYNSVRMAAEDLMEDYPETKIVTIDS LCASMGEGLLLYKTLQLKEQGKNLDEIAEWVEANKLHICHNVTVDDLNHLHRGGRISKTT AVLGTMVKIKPIIHMDNEGKLVVIGKERGRKKSLLTLLDKMEKQMQGYQNDVVMITHGDC IEDAKYVEEQIRERFGIENIIVNGIGSVIGSHTGAGVVAVFFMGSER >gi|330399666|gb|ADLB01000008.1| GENE 23 16478 - 17833 1231 451 aa, chain - ## HITS:1 COG:NMB1707 KEGG:ns NR:ns ## COG: NMB1707 COG0733 # Protein_GI_number: 15677555 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Neisseria meningitidis MC58 # 1 442 1 442 445 260 36.0 4e-69 MSQRSSFSSKIGFVLAAAGSAVGLGNIWRFPYLAAQYGGGTFLLVYLILAVTFGFTLMIA EIAIGRKTGLSAIGAFRKLDSRFGFLGILAAIIPIIIFPYYSVIGGWVIKYFAVFISGNG AKAATDTYFNEFISSTAEPIGWFFLFIAFTAVVVLLGVQKGIETVSKFMMPVLVLLSVGI ALYGLTVDGAMEGLIYYLKPSFENFSIKTVLAAMGQLFYSMSLAMGIMVTYGSYMKKDNN LESSVRQIEIFDTGIAFLAGLMIIPGVFAFSGGTAESLNQGPGLMFVTLPKVFDSMPMGD VIGTLFFLLVFFAALTSAISLMETIVSILRDKFHWKRKFTCGVVTIIALLMGLPSSLGFG MWSHIKFIGLSILDTFDFISNSVLMPILAFFTCIFVGFVIKPKTIADEVKIGNTQFKGER LFSVMIKWIAPVCLIAILVSSILNALGIIVL >gi|330399666|gb|ADLB01000008.1| GENE 24 18169 - 18663 405 164 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163801060|ref|ZP_02194960.1| 50S ribosomal protein L35 [Vibrio campbellii AND4] # 2 164 1 165 166 160 48 1e-38 MINEQIRDKEIRLIGEDGEQLGIMSSREAMKLAQEADLDLVKIAPTAKPPVCKIIDYGKY 
RYELARKEKEAKKKQKTVDVKEVRLSPNIESNDLNTKVNNAKKFIEKGNKVKVTLRFRGR EMAHVQTSKHILDDFAEMLKDVASVEKQPKLEGRSMSMVLTEKR >gi|330399666|gb|ADLB01000008.1| GENE 25 18720 - 18917 244 65 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160878606|ref|YP_001557574.1| ribosomal protein L35 [Clostridium phytofermentans ISDg] # 1 65 1 65 65 98 75 7e-20 MPKIKTNRAAAKRFKKTATGKLKRNKAYKSHILTKKSTKRKRNLRQATITDASNVKNMKK VLPYL >gi|330399666|gb|ADLB01000008.1| GENE 26 18949 - 19302 517 117 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240146873|ref|ZP_04745474.1| ribosomal protein L20 [Roseburia intestinalis L1-82] # 1 116 1 116 118 203 85 1e-51 MARIKGGMNAKKKHNRTLKLAKGYRGARSKQYRVAKQSVMRALTSAYAGRKQRKRQMRQL WIARINAAARMNGLSYSKFMHGLKVAGVEMNRKILADMAINDAAGFATLAELAKKAN >gi|330399666|gb|ADLB01000008.1| GENE 27 19575 - 19979 61 134 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTERESYTIKSIELHLFLGRIILKQISALSPAEDYNALSSFEKKYKTLLSKTAQAGNCLF SSEISLDCSQEIKYINQDVLLLLQKLICVLENVSLPVLEWEIHFYRQCLFCLETESTHAL SRYWIKQICKSEFS >gi|330399666|gb|ADLB01000008.1| GENE 28 20138 - 21145 932 335 aa, chain + ## HITS:1 COG:RSc2968 KEGG:ns NR:ns ## COG: RSc2968 COG0232 # Protein_GI_number: 17547687 # Func_class: F Nucleotide transport and metabolism # Function: dGTP triphosphohydrolase # Organism: Ralstonia solanacearum # 14 331 17 385 387 231 38.0 1e-60 MTIREQLELRELEYLSPFATLSSRSKGREREEEQCDVRPVFQRDRDRILHCKAFRRLKQK TQVFLLPKGDHYRTRLTHTLEVSQNARTIAKALRLNEDLVEAIALGHDLGHTPFGHAGER ALNEVCPHGFKHNEQSVRVVECLEKQGQGLNLTWEVRDGILNHKTSGRPSTLEGEIVRLS DKIAYINHDIDDAIRGGVLKPEDIPEEYREILGETTRMRLNTMIHNVIINSMDKPYIKMS EDIEKATAGLRKFMFEHVYFNPAAKSEESKAVEMIKNLYGYYICHMDKLPEKYLQRIDEG KSTKEQNVCDFIAGMTDAYAVKKFQEFFIPESWKN >gi|330399666|gb|ADLB01000008.1| GENE 29 21225 - 23000 1676 591 aa, chain + ## HITS:1 COG:CAC1299 KEGG:ns NR:ns ## COG: CAC1299 COG0358 # Protein_GI_number: 15894581 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Clostridium acetobutylicum # 4 501 7 499 596 338 39.0 3e-92 
MYYSEELVEEVRMRNDIVDVISGYVKLQKKGSSYFGLCPFHNEKSPSFSVSREKQMYYCF GCGAGGNVFTFIMEYENYSFVEALKFLAERAGIDLPEVEYSKEAKEKADLKLTILEINKL AAKYFYAQLKAEGGKTAYAYLKNRGLSEETIVAFGLGYANKYSDDLYRYLKMKGYGDELL LKAGLISADERKGAYDKFWNRVMFPIMDVNNRVIGFGGRVMGDAKPKYLNSPETIVFDKS RNLYGLNRARTSRKPYFLICEGYMDVIALHQAGFTNAVASLGTALTVGHASLLKRYVNEV YLTYDSDEAGTRAALRAIPILKDAGITAKVIRMDPYKDPDEFIKNLGAEAFEERISKARN GFMFGLEMLEKEFDMNSPEGKTEFHHEVARRLCNFEEEIERENYIEAVAQAYHIGYDNLR KLVTKMAVQSGLATPAVKPKQAVGKDKIREDGNLQSQKILLTWLLEDEKIFGQIKKYISP KDFTEKLYTTVADILYEQYEDGEVNPAKIMNHFTDEEEHREVASLFHTKIKELTTKQEQE KALKETIIRVKNNSIEYATKMLDPTDIVGLQKLMDAKRELQDLQRLHISID >gi|330399666|gb|ADLB01000008.1| GENE 30 23037 - 24149 1222 370 aa, chain + ## HITS:1 COG:BH1376 KEGG:ns NR:ns ## COG: BH1376 COG0568 # Protein_GI_number: 15613939 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Bacillus halodurans # 10 370 17 372 372 434 72.0 1e-121 MEENIVKFEEKLKELLALGKKKKNILEIQEINEIFADMELEPEQMEKVFEYLEGQNIDVL RINGDVDDDDDLEVVITEEDEVDVEKIDLSVPDGISIEDPVRMYLKEIGKVPLLSAEEEI ELAKRMAEGDEDAKKRLAEANLRLVVSIAKRYVGRGMLFLDLIQEGNLGLIKAVEKFDYE KGFKFSTYATWWIRQAITRAIADQARTIRIPVHMVETINKLIRVSRQLLQELGREPSPEE IAEELDMPVERVREILKISQEPVSLETPIGEEEDSHLGDFIQDDNVPVPAEAAAATLLKE QLGEVLNTLTDREQKVLRLRFGMNDGRARTLEEVGKEFDVTRERIRQIEAKALRKLRHPS RSRKLRDYLD >gi|330399666|gb|ADLB01000008.1| GENE 31 24158 - 24847 645 229 aa, chain + ## HITS:1 COG:CAC1302 KEGG:ns NR:ns ## COG: CAC1302 COG2384 # Protein_GI_number: 15894584 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferase # Organism: Clostridium acetobutylicum # 1 227 1 227 229 136 37.0 3e-32 MELSVRLQAVADMVTEGTKVADVGTDHAYIPIYLVEHDKNPSAIAMDINRGPLKKAEENI SSHNLENKIETRLSDGLKQLHLGEADSVVIAGMGGGLVVKIMEEGTLHKKYVKEWILQPQ SEISKVRQYLNENGYCIVEENMVIDEGKFYPMMRVTEGTIEEYTQEELCYGKCLLKEKNP ILKKFLEKEIDIKKEILEKLHQTRGGQVAKRIEEIEEETDRLQKTLSSF >gi|330399666|gb|ADLB01000008.1| GENE 32 24862 - 25638 845 258 aa, chain + ## HITS:1 COG:FN1316 KEGG:ns NR:ns ## 
COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 255 1 257 258 156 36.0 5e-38 MLCKEIIEKLEELYPSNAALTWDNVGLLVGRKQKEVKNVYVALDLTEEIIDRAIEEKADM IITHHPLIFSPLRTVTDMDFIGRRVVKLLQNDISYYAMHTNYDVLRMADLSADILGIDCQ EVLDKTSTEEDKGIGKIGIYENEMTLRQCSELVKKKFGLNAVQVFGAEEKTVRKIAVCAG SGKSVISLAIDKGADVLITGDIGHHEGIDAVEQGLTIIDAGHYGLEHIFIKDVAETLEKD FSELVVKQAEIVFPFQVY >gi|330399666|gb|ADLB01000008.1| GENE 33 25676 - 27343 688 555 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 124 553 115 544 546 269 35 2e-71 MDRKMCRIQIGNEIKLYEEGTTYKKIAEDYQGEYENDIVLVLVDGKLQELPKQCKHDCKL EFITTDTKIGNETYRRSMCLLMVKAIEDVTGKENVKKIRVQYSVSKGYYCTADGDFEITD TFLRQVEKRMEEMAEEKTPIVKKTVHIDDAMSLFKEQGMYDKERLFRYRRVSAVNLYEME NMADYYYGYMVPDASYLKYFALYRYDDGFVLQMPTKDEPKAVPPFEPQKKLFQILKESTQ WGDMMHIETVADLNEQITRQGMEEVILVQEAYQERKIAELAMQIVQRPSVKFVLIAGPSS SGKTTFSHRLSIQLQALGLKPHPIAVDNYFVNREFTPRDENGDYDFECLEAIDRELFNHQ LTELLEGKEVQLPTFNFKTGKREYNAPPTKMGKQDILVIEGIHCLNDALTYKLSSENKFK IYISALTQLNIDEHNRIPTTDGRLLRRMVRDARTRGTSAQNTIAMWHSVRRGEERNIFPF QEEADAMFSSALIYELAVLKQYAEPILFGIDKDCDEYVEAKRLLKFLDYFVGISAENVPK NSLLREFIGGGCFRV >gi|330399666|gb|ADLB01000008.1| GENE 34 27340 - 28224 324 294 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|161507907|ref|YP_001577871.1| ribosomal protein large subunit [Lactobacillus helveticus DPC 4571] # 34 284 33 279 285 129 34 3e-29 MRRVFTYRIPEEFENDTLLSFLKAECYSSPIITHLKRTEKGLLLNGNWARVRDILHKNDI LTVTLLETTSSDNIVPVSLPLDIVYEDEDLMVINKPANMPIHPSQNNYDNTLANAVAYYY AQKEVPFVYRCINRLDRDTTGLLILAKHMYSASLLSDMVKNRLIHREYIAIADGYVDDAG IIDAPIARTKDSTIEREVNFSEGDFARTHYKCLQRKNGYSLVSLKLETGRTHQIRVHMKY IGHALLGDFLYNPDYTFIGRQALHSHRLTFMHPITKQHLTFTAPLPYDMAKLFD >gi|330399666|gb|ADLB01000008.1| GENE 35 28243 - 30147 1362 634 aa, chain - ## HITS:1 COG:CAC0837 KEGG:ns NR:ns ## COG: CAC0837 COG0687 # Protein_GI_number: 15894124 # Func_class: E Amino acid transport and 
metabolism # Function: Spermidine/putrescine-binding periplasmic protein # Organism: Clostridium acetobutylicum # 284 633 5 352 354 304 42.0 3e-82 MKKTLQNIYLSLIIFLLYAPIVTLAVLSFNNSKTRAKWGGFTGKWYVSLFQNEQIMNALY TTLLIALLSAFIATLIGTAAAIGIQSMKRKPRTLMMGITNIPMLNADIVTGISLMLLFIA MGTALKFMGIQFSLGFATVLIAHITFNIPYVILSVMPKLKQTKRSTYEAALDLGASPVYA FFKVVFPDILPGVFSGFLLAFTMSLDDFVITHFTKGPGVDTLSTKIYSEVRKGIKPEMYA LSTLLFVSVLILLILINISPAKKDGEAKAKEMKKSTRVSRFVFRRVIPVAMALVVVAGGI FYSSKEDLSGDNQVIVYNWGEYLDPEVITMFEEETGINVVYEEFETNEIMYPKIQSGAIA YDVVCPSDYMIQRMIENDLLAELNFDNIPNIKNIGSQYMEQSKQFDAENKYSVPYCWGTV GILYNKKMVDEPIDSWSVLWDKKYKDNILMQDSVRDAFAVALKYLGYSLNSTDLDELQEA KELLIKQKPLVQAYVIDQVRDKMIGNEAAIGVIYSGEAIYTQLENPDLEYVIPKEGSNVW IDSWVIPKNAKHKENGEKFINFLCRPDIAKMNFDYITYSTPNTEGRKLIEDPAIRNSTIA FPDAKELERCETFKFLGDKNDAIYNELWREIKSN >gi|330399666|gb|ADLB01000008.1| GENE 36 30147 - 30965 756 272 aa, chain - ## HITS:1 COG:CAC0839 KEGG:ns NR:ns ## COG: CAC0839 COG1176 # Protein_GI_number: 15894126 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport system, permease component I # Organism: Clostridium acetobutylicum # 2 268 3 269 277 240 48.0 2e-63 MKNRKKLLAGPYLFWSVSFIIIPLFMILYYGLTNSKNEFTLSNLAKITTPENLKALGLAL LLSLISTVICLILAYPLAMILASKSMNQTSFVVLIFILPMWMNFLLRTLAWQTLLEKNGV INSILSFFHLPTQSIINTPSAIILGMVYNFLPFMVLPIYNVIVKIDKDVISAAKDLGANN IQTFTKIIFPLTTPGIISGITMVFVPALTTFVISDLLGGSKILLIGNVIEQEFKQSSNWN VGSGLSLVLMIFIIASMALIAKYDKDGEGTAF >gi|330399666|gb|ADLB01000008.1| GENE 37 30955 - 32028 1293 357 aa, chain - ## HITS:1 COG:CAC0840 KEGG:ns NR:ns ## COG: CAC0840 COG3842 # Protein_GI_number: 15894127 # Func_class: E Amino acid transport and metabolism # Function: ABC-type spermidine/putrescine transport systems, ATPase components # Organism: Clostridium acetobutylicum # 5 352 6 352 352 409 59.0 1e-114 MAKKLIDMINISKSYGDNLVLDELNLYIKENEFLTLLGPSGCGKTTLLRILGGFETANEG KIIFEGNDITNLAPNKRQLNTVFQKYALFSHMSIAENIAFGLKIKGKSKAYINDKIKYAL KLVNLDGYENRTPDSLSGGQQQRIAIARAIVNEPKVLLLDEPLGALDLKLRQDMQYELIR 
LKNELGITFVYVTHDQEEALTMSDTIVVMNQGYIQQIGTPEDIYNEPENAFVADFIGDSN ILPATMVEDKLVKILGVNFPCVDTGFGRNKPVDAVIRPEDIDLVKVDEGIMQGVVTHLIF KGVHYEMEVTANNYKWLVHSTDMFPVGTEVGIKVDPFDIQIMKKPESEDEEAVIVEE >gi|330399666|gb|ADLB01000008.1| GENE 38 32052 - 32591 396 179 aa, chain - ## HITS:1 COG:CAC0841 KEGG:ns NR:ns ## COG: CAC0841 COG1396 # Protein_GI_number: 15894128 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 179 1 179 179 204 56.0 5e-53 MQIGQKLKGLRIAKNLTQEELADRAELSKGFISQLERDLTSPSISTLVDILQCLGTSLKD FFQEESDEQIVFGNEDYFEKVDTELKNTVEWIIPNAQKNMMEPIRLTLQPGGSTYPDLPH EGEEFGYVLQGSIQIHVGKKVYHAKKGEAFYFTPHSEHYIKANRSTGAKLIWVSTPPNF >gi|330399666|gb|ADLB01000008.1| GENE 39 32751 - 33851 800 366 aa, chain - ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 1 350 1 353 365 367 55.0 1e-101 MAGSTYGNIFQVTTWGESHGKGIGVVIDGCPAGLSLCEEDIQKYLDRRKPGQNKYTTKRN ESDTVEILSGVFEGKTTGTPISLLVRNEDQRSKDYGNIASCYRPGHADYTFDVKYGFRDY RGGGRSSGRETIGRVAGGAVALKLLSELGITVSAYTKAIGPYEIPSSAYRFSEMEENVFC MPNLAYAEKASEFLNQCMENCDSAGGVIECQVKNLPAGIGEPVFHKLDACLAQAMFSIGA VKGFEIGDGFCAATSCGSENNDAFYTENGKIYKKTNHAGGILGGISDGNDIIFRTAFKPT PSIAKQQNTVTSSGENTVLSIHGRHDPIIVPRAIVVVETMTALTLIDLLLANMSSRLENI TRFYDR >gi|330399666|gb|ADLB01000008.1| GENE 40 34052 - 35674 1597 540 aa, chain + ## HITS:1 COG:CAC0747 KEGG:ns NR:ns ## COG: CAC0747 COG1376 # Protein_GI_number: 15894034 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 73 540 5 466 466 360 40.0 6e-99 MSDMSKDKKNVSSQNDTRQKELLAELEALETEDLSGFVSKEEKKQETNEGELVFLYDEDE KNPNYRPKKRKREKEKSKSGKKWIIASLSLAGVLLVVYLGFAFYFNSHFYFGTTINENKF GGKSVAHVEKSIEKQISDYELTLKRSDGQTEKIQGKEIDLKYNPGNELQKVLKSQNPFLW PKSLFKKDIREIEIGVTYDKQKLSAQISQLESVKNQNPVKSVSATPVFDGTEFTIQKEVY 
GNEVNVENLTKVIEEHLDGLNEEVDMKKSKAYYEPKYVSDSKEVVSAKDALNKCLTAEIT YDFKPFTEKVDKGSISDWLSVDENMNVSFKEKEVKAYIKKLASKYDTSGKPRSFVTATGK TVEVKNGVYGWKINQDGEYKKLTEDILAGNSVKREPEYKRKAVSHEGNDFGNTYAEVDLT TQHMWFFQNGKLMMESPIVTGKPSTGHATPQGTYTVTYTQKGAVLRGKILPNGKREYETP VDFWMPFNGGIGFHDATWQSSFGGNRYLTHGSHGCVNMPYDKAQQLFGYLKAGTPVICHY >gi|330399666|gb|ADLB01000008.1| GENE 41 35694 - 36827 997 377 aa, chain + ## HITS:1 COG:CAC2282 KEGG:ns NR:ns ## COG: CAC2282 COG0343 # Protein_GI_number: 15895550 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Clostridium acetobutylicum # 1 375 1 375 376 622 75.0 1e-178 MYKIIKKDGLAKRGRLETVHGTIETPVFMNVGTAAAIKGAVSTEDLQGIKTQVELSNTYH LHVRPGDEVVKKLGGLHKFMVWDKPILTDSGGFQVFSLSGLRKIKEEGVYFQSHIDGKKI FMGPEESMQIQSNLASTIAMAFDECPSSVASRQYIENSVARTTRWLERCKAEMNRLNQLP DTINKHQLLFGINQGAIYEDIRIEHAKRISELDLDGYALGGLAVGESHEDMYRILDATVP HLPENKPTYLMGVGTPANILEAVDRGVDFFDCVYPSRNGRHGHVYTNHGKLNLFNAKYEL DDRPIEEGCQCPACRKYSRAYIRHLLKAKEMLGMRLCVLHNLYFYNTMMEEIREAIEEGR YKEYKKQKLDSMSGKQI >gi|330399666|gb|ADLB01000008.1| GENE 42 36916 - 37188 375 90 aa, chain + ## HITS:1 COG:BS_yrbF KEGG:ns NR:ns ## COG: BS_yrbF COG1862 # Protein_GI_number: 16079823 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Bacillus subtilis # 1 83 9 85 89 58 38.0 2e-09 MLSIIIVYAVILGGFWFLLMRPQKKEQKRIKLMLSELEVGDTVLTTSGFYGVVIDVQEDD VVVEFGNNKNCRIPMQKAAISQVEKPGTEN >gi|330399666|gb|ADLB01000008.1| GENE 43 37446 - 38867 1312 473 aa, chain + ## HITS:1 COG:BS_yjeA_2 KEGG:ns NR:ns ## COG: BS_yjeA_2 COG0726 # Protein_GI_number: 16078275 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Bacillus subtilis # 264 458 17 211 217 160 42.0 6e-39 MIREKFKKDNHIWKVVICLLLCFVVMGVLQTSTAAEESEKAVIILRPESVEMVQEEEVPE ITAKASGKGDMKLVLNEDTGYTVQDLLDSFNKGNHYQISCSADGKVDGEFPIKIKLSEKI 
EDSLAKDWLGKVRIDLQDGKVTVKNKYGQWENDKFKKNDGQYAANEFVVSKGKTYYMGAD GKKQTGWQEKDGQIYYFDKKGVMQTGWQEKDGVKIYLKQDGSMAVGWQKIKDDKYYFDKD GKAVTGKQQIGTKKCEFAKDGKLISEENSIDSNKPMIALTFDDGPGPDTEKILDALKANN ARATFFMLAPKVEKYPNAVKKMQEIGCELANHSTTHTALTTLTPEEIRKEISTTAHAVSK ATGGFPTTLVRPPYGKKNDTVKATVGQPIIMWSIDTLDWKTRNVQSTINNVLKNAKDGDI VLMHDIHKQSVEAAVQLIPMLIQKGYQLVTVSELAEARGVTMENGVSYSQFRK >gi|330399666|gb|ADLB01000008.1| GENE 44 38919 - 40091 1130 390 aa, chain - ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 16 389 13 371 371 390 49.0 1e-108 MPNLKVLEQSLPIAPLKIAALESCTDLAKKVNDYIVQFRHAPVVDSIDPTLFANHQTDNY LASLSCPRFGSGEAKGMFHESIRGCDLFVMVDVCNYSLTYSVNGHLNHMSPDDHYQDLKR IISAATGKAHRINVIMPFLYESRQHKRTKRESLDCAFALEELTNMGVSNILTFDAHDPRV QNAIPLKGFDNFTPPYQFMKALFRSVPDLQVDKDHLMIISPDEGAMHRAVYFSNVLGVDM GMFYKRRDYSTVVNGKNPIVAHEFLGDDIRGKDLIIIDDMISSGESMLDVAKQLKTRGAG RVFVCTTFGLFTEGFAKFDEYYEKGYIDRVITTNLTYLPKEVHEKPYFVTADMSKFLALI IDSLNHDLTIGSVLNPTDRIHQLLAKHKQI >gi|330399666|gb|ADLB01000008.1| GENE 45 40247 - 41560 1285 437 aa, chain + ## HITS:1 COG:CAC1710 KEGG:ns NR:ns ## COG: CAC1710 COG1625 # Protein_GI_number: 15894987 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase, related to NifB/MoaA family # Organism: Clostridium acetobutylicum # 4 436 2 433 437 421 48.0 1e-117 MKKEHIIRSVEEGSIAWELGIEAGDKLISINDNEIEDVFDYHFLVNDEELIVLIEKPDGE QWELEIEKDYEEDLGISFEQGLMDEYRSCRNKCMFCFIDQMPKGMRDTLYFKDDDSRLSF LQGNYITLTNMSDDDVRRIVKYHLEPINISIHTTNPELRCKMLHNRFAGEALKKVDILYE GGITMNGQIVLCKGENDGEELERSIRDMTKYLPYLQSVSVVPVGLTKYREGLYPLESFEK EDAKKVLETIHKWQKKIYEEHGTHFIHAGDEWYILAEEEVPEEERYDGYLQLENGVGMLR LLQNEFEEEFDTLVGDDRRREISLATGVLAYPYLKRMVERLQTKYPNITVHLYKIINNFF GEKITVAGLITGQDLIGQLKGQPLGDTLLLPCSMLRDGEEVLLDDVTLTDLKESLQVDID IVKSSGQDLIEAIINNI >gi|330399666|gb|ADLB01000008.1| GENE 46 41574 - 42902 1632 442 
aa, chain + ## HITS:1 COG:CAC1711 KEGG:ns NR:ns ## COG: CAC1711 COG1160 # Protein_GI_number: 15894988 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 1 439 1 438 438 512 58.0 1e-145 MSKPIVAIVGRPNVGKSTLFNALAGEMISIVKDTPGVTRDRIYAEVNWLDKEFTLIDTGG IEPDSKDIILSQMREQAQIAIDTADVIIFLTDVKQGLVDSDSKVADMLRRSGKPVVLVVN KVDSFQKFMADVYEFYNLGIGDPFPISASSRLGIGDMLDEVVKHFPETTAEEAEDDRPRI AIVGKPNVGKSSIINKLQGDNRVIVSDIAGTTRDAIDTPITYNGKEYVFIDTAGLRRKNK IKEELERYSIIRTVTAVERADVVLIVIDATEGVTEQDAKIAGIAHERGKGIIIVVNKWDA IEKHDKTMYEYEKQVRQVLSYMPYAEIMYVSAHTGQRLNKLYEKIDMVIENQTLRVATGV LNEIMMEAVAMQQPPSDKGKRLKLYYITQVSVKPPTFVIFVNDKELMHFSYTRYLENKIR EAFGFKGTSLKFFIRERKDKDK >gi|330399666|gb|ADLB01000008.1| GENE 47 42904 - 43542 755 212 aa, chain + ## HITS:1 COG:BS_yneS KEGG:ns NR:ns ## COG: BS_yneS COG0344 # Protein_GI_number: 16078868 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 7 209 7 192 193 117 39.0 2e-26 MERFVCLILGYICGLFQTGYIYGRLHHIDIRKHGSGNAGTTNALRTLGWKAGAVTFLGDS LKCVLAVAIAYMIYGKSHNEMFELLAMYAGIGAVLGHNFPFYLNFKGGKGIAATAGLLLA VNPKVALVAMVTFIIVVGVTRYVSLGSIILVVIFGIGMVVQGEVFGTTLSQSQQYELYAL SVFLMVLAVYKHRENIKRLCNGTENKLKFGKK >gi|330399666|gb|ADLB01000008.1| GENE 48 43554 - 44564 1301 336 aa, chain + ## HITS:1 COG:FN0906 KEGG:ns NR:ns ## COG: FN0906 COG0240 # Protein_GI_number: 19704241 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Fusobacterium nucleatum # 1 332 1 334 335 430 65.0 1e-120 MANVGVLGAGSWGTALALLLHKNGHHVTVWSISKEEVEMLSKEREHKSKLPGVKLPEEMQ FTNVLEEAIQGKDFMVLAVPSPFTRATARNMKSYVAENQIIVDVAKGIEEDTLMTLSQQI HEEIPQADVAVLSGPSHAEEVGRGLPTTCVVGAKTKKTAEYLQEAFMNEVFRVYTSPDML GIELGGALKNVIALAAGIADGLGYGDNTKAALITRGIAEIARLGVAMGGKMESFTGLTGI GDLIVTCASVHSRNRRAGILIGQGKTMEEAMEEVKMVVEGVYSTKSAVKLAEKYNVSIPI IEQVNAVLFEGKNPGEAVKELMLRDKRIEIPTLPWE >gi|330399666|gb|ADLB01000008.1| GENE 49 44602 - 45528 699 308 aa, chain - ## HITS:1 COG:BH0427 KEGG:ns 
NR:ns ## COG: BH0427 COG3965 # Protein_GI_number: 15612990 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Bacillus halodurans # 15 289 10 276 301 105 23.0 1e-22 MHATRAKQKKERSALNISLIAGLVFALLEIIIALYTHSQAVLLDGFYDGVESVMIVISIG IIPLLYRSSNEKRPFGYLQIESFFVVIKGVTMVAVTIGLIATNIDIVLNGGRHISFNNIA YFELFAAVLSIFVIFLLKKKNAKTHSALVTMEIQEWIIDGIASLGMSVAFFLPLFIKANW FSNIVPYLDQIIAILLSLFMLPTPIRAIITGLRDIFLIAPEEETVTEIKNIVNPILDSYG YEELHYDIVRTGRKLWISVYITFDKDMISISRLRNLQGLIIDALKKEYQDFYFELLPDVE YLGNTEVR >gi|330399666|gb|ADLB01000008.1| GENE 50 45655 - 47127 1567 490 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2385 NR:ns ## KEGG: Cphy_2385 # Name: not_defined # Def: stage IV sporulation protein A # Organism: C.phytofermentans # Pathway: not_defined # 1 490 1 491 491 567 57.0 1e-160 MNTFHLYKDIQARTNGEIYIGVVGPVRTGKSTFIKRFMDLLVLPHMEDEHMRVRTRDELP QSASGKTIMTTEPKFIPKEAAAIELEEGITANIRLIDCVGYMVEGAVGHIENEEERQVKT PWFEHEIPFTKAAAIGTQKVIREHSTIGIVITTDGSIGELKRENYLEAERNTIRELQNIG KPFVVLVNSKKPHGEEANAVKEEVESKYGVTALTVNCEQLRQEDIHQIMQSVLFEFPIAE VQFYIPKWVEMLPMEHRIKQDLLSHVREMLDAFSQIKDVKKEWKQTESEYIEEMRVEEVK MDSGSVKVRLQIGEPFYYEVLSEVTGADITSEYDLIGTMKELSTLRKEYESVKDAMDSVR LKGYGVVSPAKEEIKLEDPVIIKQGNKFGVKIHSEAPSIHLIRANIETEIAPIVGSEQQA EDLIKYIKETKETEEGMWKTNIFGKSIEELVTEGMHNKMMMINDESQVKLQDTMQKIVND SNGGLVCIII >gi|330399666|gb|ADLB01000008.1| GENE 51 47187 - 48212 979 341 aa, chain + ## HITS:1 COG:CAC3420 KEGG:ns NR:ns ## COG: CAC3420 COG2008 # Protein_GI_number: 15896661 # Func_class: E Amino acid transport and metabolism # Function: Threonine aldolase # Organism: Clostridium acetobutylicum # 1 338 1 338 344 448 61.0 1e-126 MIAFNCDYNEGAHPRILEALIETNMEQTAGYGEDQYCREAEIIIKGLCQSPEAKVHFMVG GTQANLTVISSVLRPHQAALCAVSGHINVHETGAIEACGHKVMTVPSEDGKISAEQVIET YTLHKNDSSFEHTTQPKLVYISNPTELGTIYTKKEVEELRAACDECGMYLYLDGARLGYG LCAENNDLDLESIAKNCDVFYIGGTKVGALFGEAIVICNPALQEDFRYIMKQKGGMLAKG RLLGLQFRELFKDGLYFTMSNHAIRLAMRLKNGLAERGYKFLLDSNTNQQFVIVPDKKLE QLKEKYAYTYQEKYDEKNSVIRLCTSWATKEENVEELLSDM 
>gi|330399666|gb|ADLB01000008.1| GENE 52 48254 - 48829 803 191 aa, chain - ## HITS:1 COG:BH1514 KEGG:ns NR:ns ## COG: BH1514 COG0503 # Protein_GI_number: 15614077 # Func_class: F Nucleotide transport and metabolism # Function: Adenine/guanine phosphoribosyltransferases and related PRPP-binding proteins # Organism: Bacillus halodurans # 1 191 1 191 198 172 48.0 2e-43 MQLLKERILKDGVVKPGNVLKVDSFLNHQMDIDLINEIGKEFRRRFPSDKITKILTIEAS GIGIACIVAQYFNVPVVFAKKAQSINIDGDVYSTKIESFTHKRTYDVILSKKFLTEDDHV LILDDFLANGCALLGLIDIVQEAGATIEGAGIVIEKGMQDGGKQIREKGIHLESLAIVES MTDTELTFKED >gi|330399666|gb|ADLB01000008.1| GENE 53 49011 - 49598 525 195 aa, chain + ## HITS:1 COG:DR0704 KEGG:ns NR:ns ## COG: DR0704 COG1896 # Protein_GI_number: 15805731 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Deinococcus radiodurans # 3 185 4 189 198 107 37.0 2e-23 MTRLEKQTQFIVEVDKVKNIFRQTYLSDGERKENDAEHSWHLALSAILLKEYVSEEVDLL KVITMVLIHDLVEIDAGDTYAYDSAGAKDKREREEKAADRIFSILPTEQGQYFRELWEEF EEYETEEAKYAHLLDNFQPMLLNDAAKGKSWSEHQVKKQQIYKRNERIEETSETIWGEMQ RIVEKNIQLGNIHEK >gi|330399666|gb|ADLB01000008.1| GENE 54 49612 - 51306 1124 564 aa, chain + ## HITS:1 COG:CAC3563 KEGG:ns NR:ns ## COG: CAC3563 COG0249 # Protein_GI_number: 15896798 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 5 547 35 567 577 277 31.0 4e-74 MEIEIILAAVAALIVFSIVQIYLSEKKARGVWRAKFLKEWGKIPQREYDEEEVEKISRYF EYQLKKGKINEPYIDEITWNDLDMDNVFAYINNTKSSAGEEYLYYLLRTPVTEQKTLDER ERLIHFFATHESERADLEVCFRQIGKTRRFSINDYIDLLVTLKKESNVEHYIALLTIPIG LVLMTVFEVGVGLVVMMVFLVWNVMRYFKRRGEIQPYLTTFSHILKMLDASKEISKLKID EVQTYVDNIADIRKKFKKFRFGSSLLMSTQRETGSLAEAFLDYIRVATHIDLLKFNSMLE EIKQNAGEIDNLVENIGILDALISVASFRESLPFYSLPELRETEEAFIEVKDVYHPLIVE PVANSITEDRNVLITGSNASGKSTFLKTVAINAVLAQTVNTSISTFYRASFFQIYSSMAL KDNLQGNESYYIVEIKSLKRILSHADEDIPMLCFVDEVLRGTNTIERISASAQILKSLSK KQVLCFAATHDIELTHMLENEYSNYHFQEEIQENDILFNYELYRGRAVSRNAIKLLSIIG 
YDKAIIESAEQTATHFLKTGEWRV >gi|330399666|gb|ADLB01000008.1| GENE 55 51309 - 51791 450 160 aa, chain + ## HITS:1 COG:STM0263 KEGG:ns NR:ns ## COG: STM0263 COG0328 # Protein_GI_number: 16763646 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Salmonella typhimurium LT2 # 1 147 2 141 155 152 53.0 3e-37 MMKVKIYTDGAARGNPDGPGGYGTVLEYVDTKGELHTKELSQGYKKTTNNRMELMAVIAG FEALNRPCEIELYSDSKYVVDAFNQKWIDGWIKKRWKRGKNEPVKNIDLWKRLLKAKEPH QVTFIWVKGHDGHMQNERCDFLATSSADGENLIDDVVEVC >gi|330399666|gb|ADLB01000008.1| GENE 56 52304 - 52889 225 195 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDYLIRMKKRKRQVEKRTIISVIMFFLFVMGYYTVVFKKPDVILVESILIIMLLYFAIIV FNLHKCEKQVEMTHMKRICDMVSAEELREIDDFVCHVADTLWFKGSIVHMQCVLSEKTIY GWGYGTADFYCYKIADITKIEIKKNLLYITLQNGEEKLIGSYYSNGRGTFSFKMKDLKEQ IEKKVHKQLGVKTSL Prediction of potential genes in microbial genomes Time: Tue May 24 21:00:41 2011 Seq name: gi|330399660|gb|ADLB01000009.1| Lachnospiraceae bacterium 2_1_46FAA cont1.9, whole genome shotgun sequence Length of sequence - 2707 bp Number of predicted genes - 7, with homology - 3 Number of transcription units - 5, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 42 - 275 115 ## - Prom 335 - 394 2.8 + Prom 25 - 84 4.2 2 2 Tu 1 . + CDS 205 - 348 77 ## 3 3 Op 1 . - CDS 528 - 737 126 ## 4 3 Op 2 . - CDS 785 - 946 202 ## 5 3 Op 3 . - CDS 950 - 1297 381 ## COG1695 Predicted transcriptional regulators - Prom 1331 - 1390 8.7 + Prom 1327 - 1386 3.1 6 4 Tu 1 . + CDS 1407 - 2333 656 ## gi|224541451|ref|ZP_03681990.1| hypothetical protein CATMIT_00620 7 5 Tu 1 . 
- CDS 2352 - 2705 241 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|330399660|gb|ADLB01000009.1| GENE 1 42 - 275 115 77 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLIATFKYNGSSSSCTSSKVETKVYHSAWKITSKNSSHSKNVATAKVVAKLFKGSTVSQ TKTKTLTLTCSKTGVIS >gi|330399660|gb|ADLB01000009.1| GENE 2 205 - 348 77 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVSTLEEVQEEEEPLYLKVAISVMLHKGFPAGVLYVAVRETVLVWSL >gi|330399660|gb|ADLB01000009.1| GENE 3 528 - 737 126 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNADSLRKVLSRTRYIRIATIFIIFIAIFTGTFKITTDYMNYKEAQEAYIHREIIETTEY IEVPVSEDE >gi|330399660|gb|ADLB01000009.1| GENE 4 785 - 946 202 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKAQKKIIKNYKREIRKLLPIFTKNENIFSMIYLFLSKIMLLQTKVFPKTHL >gi|330399660|gb|ADLB01000009.1| GENE 5 950 - 1297 381 115 aa, chain - ## HITS:1 COG:CAC0571 KEGG:ns NR:ns ## COG: CAC0571 COG1695 # Protein_GI_number: 15893861 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 5 106 3 100 107 57 35.0 7e-09 MDAKSNFRRGSVELLVLHLLQEKDYYGYELSQNIKSRSNGIIDIPVGSLYPALYKLIDNG YITDYKQQAGKRLVRVYYHLEPEGKERLDLLLEDYYATNTGIQNILNYELKEGEE >gi|330399660|gb|ADLB01000009.1| GENE 6 1407 - 2333 656 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|224541451|ref|ZP_03681990.1| ## NR: gi|224541451|ref|ZP_03681990.1| hypothetical protein CATMIT_00620 [Catenibacterium mitsuokai DSM 15897] # 1 136 1 125 318 65 36.0 5e-09 MKKRKYILLIALSVVMLTGCKTSETKSELSDYYLSMTNSSRINENEISVENHTYDFEKDK MDTEKYKMSLTAKYSLAVYDNKGKAVLYSAKDTNGNDEVYRYDLKTKKSEQLTENLWGIN YIVPREKDYIVVGVPQIPHDSKVLMLWSIDRNTKEVRNIEIPHDKYKDISVWQVAYVPET DGIILQTYSDSETYELQDKWNSMEEHPKDKEFTNPFYYYQYVNGEMKYLFEKDMPQSMGI IANKNDILFAVQSEIDGNDVFRYNIKDNKVDKVKELETLEKVFYLDEESKNVYRFNGKIG PMSRFSTS >gi|330399660|gb|ADLB01000009.1| GENE 7 2352 - 2705 241 117 aa, chain - ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase 
and inactivated derivatives # Organism: Lactococcus lactis # 1 104 172 277 279 99 46.0 2e-21 AKARRNIDKPLILHSDRGSQYVSKEYKRVTATMQCSYSKKSYPWDNACIESFHSLIKREW LNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSHCDYMSPNDYEELYRRLQQDELQLAG Prediction of potential genes in microbial genomes Time: Tue May 24 21:01:19 2011 Seq name: gi|330405514|gb|ADLB01000010.1| Lachnospiraceae bacterium 2_1_46FAA cont1.10, whole genome shotgun sequence Length of sequence - 32585 bp Number of predicted genes - 32, with homology - 27 Number of transcription units - 17, operones - 8 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 29 - 313 453 ## EUBREC_2191 hypothetical protein - Prom 370 - 429 7.5 + Prom 609 - 668 10.0 2 2 Op 1 . + CDS 786 - 1301 448 ## gi|225378454|ref|ZP_03755675.1| hypothetical protein ROSEINA2194_04122 3 2 Op 2 . + CDS 1359 - 1625 272 ## + Prom 1645 - 1704 4.1 4 3 Tu 1 . + CDS 1742 - 2734 960 ## + Term 2736 - 2801 11.2 - Term 2725 - 2788 11.6 5 4 Op 1 19/0.000 - CDS 2789 - 3229 572 ## COG1781 Aspartate carbamoyltransferase, regulatory subunit 6 4 Op 2 . - CDS 3250 - 4170 1269 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 7 4 Op 3 . - CDS 4180 - 4716 430 ## COG0778 Nitroreductase - Prom 4744 - 4803 8.6 8 5 Tu 1 . - CDS 4820 - 5200 199 ## Cphy_0824 hypothetical protein - Prom 5220 - 5279 8.4 + Prom 5200 - 5259 10.0 9 6 Tu 1 . + CDS 5285 - 6412 1163 ## COG0077 Prephenate dehydratase + Prom 6429 - 6488 9.2 10 7 Op 1 . + CDS 6524 - 7033 525 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 11 7 Op 2 . + CDS 7017 - 7535 405 ## 12 7 Op 3 . + CDS 7588 - 9957 2454 ## Cphy_2123 transglutaminase domain-containing protein 13 7 Op 4 . + CDS 9977 - 11455 1549 ## gi|153816004|ref|ZP_01968672.1| hypothetical protein RUMTOR_02250 + Term 11465 - 11499 6.0 - Term 11360 - 11394 0.1 14 8 Tu 1 . 
- CDS 11475 - 13208 1150 ## COG1293 Predicted RNA-binding protein homologous to eukaryotic snRNP - Prom 13244 - 13303 7.1 + Prom 13165 - 13224 6.4 15 9 Op 1 8/0.000 + CDS 13357 - 14235 1119 ## COG1561 Uncharacterized stress-induced protein 16 9 Op 2 . + CDS 14248 - 14880 693 ## COG0194 Guanylate kinase 17 9 Op 3 . + CDS 14885 - 15127 386 ## Cphy_2879 DNA-directed RNA polymerase, omega subunit + Term 15138 - 15176 7.6 18 10 Op 1 2/0.000 + CDS 15189 - 16511 1112 ## PROTEIN SUPPORTED gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase 19 10 Op 2 6/0.000 + CDS 16493 - 17056 375 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 20 10 Op 3 . + CDS 17058 - 17531 298 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase 21 10 Op 4 . + CDS 17555 - 17629 99 ## + Prom 17648 - 17707 1.8 22 11 Op 1 . + CDS 17727 - 18419 970 ## COG0822 NifU homolog involved in Fe-S cluster formation 23 11 Op 2 . + CDS 18438 - 19448 1276 ## Cphy_3261 hypothetical protein + Term 19469 - 19501 3.3 24 12 Tu 1 . + CDS 19517 - 20062 272 ## PROTEIN SUPPORTED gi|229241266|ref|ZP_04365652.1| acetyltransferase, ribosomal protein N-acetylase + Prom 20086 - 20145 10.7 25 13 Op 1 . + CDS 20205 - 24566 4712 ## COG3669 Alpha-L-fucosidase + Term 24575 - 24617 8.9 + Prom 24568 - 24627 1.7 26 13 Op 2 . + CDS 24662 - 25453 629 ## COG2357 Uncharacterized protein conserved in bacteria + Prom 25472 - 25531 2.4 27 13 Op 3 . + CDS 25600 - 26871 1050 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 26872 - 26935 18.2 - Term 26857 - 26925 16.8 28 14 Op 1 1/0.500 - CDS 26933 - 29272 2282 ## COG1511 Predicted membrane protein 29 14 Op 2 . - CDS 29293 - 31395 2404 ## COG1033 Predicted exporters of the RND superfamily - Prom 31442 - 31501 6.9 + Prom 31474 - 31533 8.1 30 15 Tu 1 . 
+ CDS 31558 - 32154 617 ## Cphy_1910 TetR family transcriptional regulator + Term 32160 - 32218 6.3 31 16 Tu 1 . - CDS 32132 - 32257 118 ## - Prom 32292 - 32351 8.6 32 17 Tu 1 . + CDS 32271 - 32583 165 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|330405514|gb|ADLB01000010.1| GENE 1 29 - 313 453 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK >gi|330405514|gb|ADLB01000010.1| GENE 2 786 - 1301 448 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225378454|ref|ZP_03755675.1| ## NR: gi|225378454|ref|ZP_03755675.1| hypothetical protein ROSEINA2194_04122 [Roseburia inulinivorans DSM 16841] # 2 169 3 170 171 125 40.0 1e-27 MKKSKKITLLIGVLIILILVVWIVKKQGYSEEDAWEELMSLKSNVSMEDLKQKGYIDVSK VMDTENEEIQSFLQDTKNKKKGTLRIATVVDDRLCAKILVYNKEMNAIVMQTMYPEKQQG ESPDKCFDIETYFEEENGVTTVYLKNIPNRSIPNTDKVELEDERLYSYRVK >gi|330405514|gb|ADLB01000010.1| GENE 3 1359 - 1625 272 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDGVFHTICVIVVFGISAAKLHCLRKDREFLEKFRTEQMIHSLEVLCIGIALSCIFVVLA SLWNFPELSLVSLIIIGVVCWVIYKICH >gi|330405514|gb|ADLB01000010.1| GENE 4 1742 - 2734 960 330 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRGKQILLLTMTALMLTGCQTSEKKSELTDYRLVTSNAGRLNENQIVVENNTYDFEKDK MNSAKYKIDLATQFNFAVVDDKGKAVLYSAKDEKGNDEVYRYDLETKKTEQLTDNLWGIN SIIPRENDYIVTSVPQTPDNSKEFMLWSIDRGTNKIKQIEIPHDKHKDMSVWQVAYVPET DGLVLQTYSESEEYTLRDKWNSAENHSEGKELEIPFYYYQYENGKVKYLFEQKMPQSNGL VANKNDILFGVQSDVNGDSVYRYNLKEDRTEKVKGLERLDKVFYLDEASEYLYKFNGKIG KVDLKTGEEEWLDNQFKEYYFTNYRLVKEQ >gi|330405514|gb|ADLB01000010.1| GENE 5 2789 - 3229 572 146 aa, chain - ## HITS:1 COG:CAC2653 KEGG:ns NR:ns ## COG: CAC2653 COG1781 # Protein_GI_number: 15895911 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, regulatory subunit # Organism: 
Clostridium acetobutylicum # 6 143 2 138 146 130 50.0 9e-31 MTKNTLNVSSIEEGFVLDHIQAGKSMDIYRYLKLDKLDCCVAIIKNAKSSKMGKKDIMKI ECPIDIIDLDILSFIDHNITINIIQNDKVVEKKRLELPKEITNVIKCKNPRCITSIEQEL DHVFVLTDEETQEYRCKYCEEKYHRK >gi|330405514|gb|ADLB01000010.1| GENE 6 3250 - 4170 1269 306 aa, chain - ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 2 303 5 306 307 426 66.0 1e-119 MKHLLNPLDLSVEEIDEILDLANDIEAHPEKYAHACEGKKLATLFYEPSTRTRLSHEAAM MNLGGNVLGFSSADSSSATKGESVSDTIRMIACYADICAMRHPKEGAPMVASQHSTIPII NAGDGGHQHPTQTLTDLLTIRSEKGRLHDITIGLCGDLKFGRTVHSLIHALVRYENVKFV LISPEELRLPSYIRKDVLDKHNIPYKEVVRLEDAMGELDLLYMTRVQKERFFNEEDYVRM KDFYILDAAKMALAKDDMLVLHPLPRVNEIAVEVDKDPRAIYFKQAQYGVYVRMALILTL LEIKVS >gi|330405514|gb|ADLB01000010.1| GENE 7 4180 - 4716 430 178 aa, chain - ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 1 163 6 158 179 77 32.0 1e-14 MNTVECIKSRRSIRKYKSDKVDHSLLESIISTASYSPSWKNTQITRYIAIEDTSIINKIV TDFTPEYNANTIRQVPMLIAVTMVTGRCGFERDGSFTTKKGDRWQMFDVGVSCQSFCLAA HDAGLGTVIMGIFDEDGVTELLDIPEGQELVALIAIGYPDIEPVVPKRKSVEDLLTYK >gi|330405514|gb|ADLB01000010.1| GENE 8 4820 - 5200 199 126 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0824 NR:ns ## KEGG: Cphy_0824 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 5 126 24 145 147 139 56.0 3e-32 MKNKFILCGTVGWCMEIIFTSIHSLCRKEPKLIGKTSIWMFPIYSLAGFLSPLCRFLKRK SILIRGGIYTFCIFFCEFLTGSLLKKYDACPWDYTDAKLNIKGLIRLDYAPLWFGVGLLY EKLLNR >gi|330405514|gb|ADLB01000010.1| GENE 9 5285 - 6412 1163 375 aa, chain + ## HITS:1 COG:DR1147 KEGG:ns NR:ns ## COG: DR1147 COG0077 # Protein_GI_number: 15806167 # Func_class: E Amino acid transport and metabolism # Function: Prephenate 
dehydratase # Organism: Deinococcus radiodurans # 111 375 23 285 293 175 37.0 2e-43 MSLQEIREQLDDIDAQILALYERRMELCEQVGNDKLKTGKKVYDRKREEEKLAVLSEKAS NERNKKGIRELFEQIMSMSRKLQYQILGENGVYGKTAFVALKELDSKNARLVFQGMNGAY SQEALRKYFGDGENVFHVDTFRDAMEAIEEGSADFAVLPIENSSAGAVSQVYDLLVEFEN YIVGEVVIPIRHALAGIPGTTFSDIERVYSHPQGLMQSEKFLAEHRNWQQISVENTAVAA KKVLESGKRTEAAICSEYAAELYGLEVLAQSINHSENNSTRFIIVTNQKVFLEGAKKISM CFEIPHESGSLYHLLSHFIYNDLNMTKIESRPIEDRNWEYRFFVDFEGNMADSSVKNAIR GLRDETRNLRILGNY >gi|330405514|gb|ADLB01000010.1| GENE 10 6524 - 7033 525 169 aa, chain + ## HITS:1 COG:lin0443 KEGG:ns NR:ns ## COG: lin0443 COG1595 # Protein_GI_number: 16799520 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Listeria innocua # 14 163 23 182 182 87 33.0 1e-17 MVGQGRQVDFCAEQVLEQYADMVYRLAIIHMKNKADAEDVFQEVFLRLVKNADKISSEEH LKAWLIKVTVNCCKKQFDSAWNRHRASLEYDVEESYEMEEKDESVTAAVQKLPENYRTVI HLFYYEEYSVKEISTILEQSETAVKTRLSRARDMLRNYLKGEVEYAGTV >gi|330405514|gb|ADLB01000010.1| GENE 11 7017 - 7535 405 172 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQEQYRREIDNIHAPKDLIERTRKQMEAEVNQTKKKPKKPIWIGTAVAACICITVLGGYF YIGKIRTNIDIQNVSFEETSDWETGLSLGGRQDGESEKTTKIEWEELEGKEDIPEAVLKV KPSKIHGKSIYICKEKGTDNYYAAYEKGDNYFYIYGKNITEKEFLTFLEKIL >gi|330405514|gb|ADLB01000010.1| GENE 12 7588 - 9957 2454 789 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2123 NR:ns ## KEGG: Cphy_2123 # Name: not_defined # Def: transglutaminase domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 3 788 2 770 773 402 36.0 1e-110 MKRRKVTSICCAILTFSLLLTGCRTEQKSNKDAGKTKAQMTTSKQLSAKIKEKYADSEKY IYKDTIEDVKRNEKIPVQFGFDIKNGQFKNYTELVAVYQDPELTQRVGTHFEWDAESQML KIKPPRWSTAQISVTSKQVEEQKDLIFGYDKVSNVLFDKDEFEDWGNLSQYYMALYVNPK NGEKLEKPIVTPFTIKHEIKKAPEVKFRVNEEGKPEFYWEKIKGAKAYYVVQYDYDEEHG YSMVGMTRGITDKTSWSPDSNVMFNTFTVAEVSRNEKWVVDKYGQGTGPVVIDRDLKEQK YCVVAVNEDGTSAISDSFSKHQLAKMIASNVEGKKSREEEGSRYAKSFNELPSYAWVTMC DGRLVQKLITYDIENAKEKTETWGQYEKEDMSDLKNVQVDIVKIPYQIEGTPFKDVAVVE 
QYDKATVKADLKALKERQDTLRTKGGRTDAKFEVEDEKEKKSEEDIYATDYDITATSALS EYIGRNMVAGNTMIDLSDFKEAKDQSQLLDAWQEATYQNPLALGVRSASIAGDYLVVKYD DSPEVMRKKQEEIVKEVKRVVKEVVKDGMTELEKEIALNNYLCEIAKYDDNALKNAEKNN FRKVDKEFNDSFTPYGVLINKNGVCASYAGAYKLLAQEAGLECIVVTGYLEGNLPHAWNK VKVDGQWQIVDPTNNDNEILYNALLNLPNAEADRVLVEDDKYLVNSAIADYTAKEDKKEY YHINNKYFEKNAIANSLTKEVGTNGKATLRTDYGLTEEEFKQIAGQVVTGLGNNQLQGTY WLGVIHITK >gi|330405514|gb|ADLB01000010.1| GENE 13 9977 - 11455 1549 492 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|153816004|ref|ZP_01968672.1| ## NR: gi|153816004|ref|ZP_01968672.1| hypothetical protein RUMTOR_02250 [Ruminococcus torques ATCC 27756] # 1 268 1 269 461 187 41.0 2e-45 MKKRMKKLATLGIVAMMVMAMAAGCGKKATPESLLTDMNKKLKDIKSTEANMVLNMEMSQ EGQTAEMNIEMEAETIMKTGESHTEGEVGMKIMGQSASTEVESYVVKEDDDYINYINQEG KWKKEKIDEDEMTKASTDLFDEFGESFEKFKLSEDLVKVNDKECFELKGEVSSEIMEYML SSDMSKEADLDKYLSGEDEDIKVPCTLSIYKDKILPAKITLDMKEIFGKMSETTGMDIEK FKLEVSYDDFNKVKKIEVPKDVLKETEGFGSGLDLDDDDFESNEKDDDNQSSSGSANTNV SDAVDMTNTTITSPAPLNKWVKTTRYATEDKVYHTVYVRINKVTTMSEDANYVNQAIQQH NQFGTRKIDVNQMKLPSDVELCVFDYEVMVPKEFPAPDYGMVEPDISFTAKNVKGGGVPS ADGTQTYIGLGSVTEMKVRSDEEKFYPGNTYKMRGCFSMVKGFKDYAFEATAYPEGQSTS GGTLLKGYWAGH >gi|330405514|gb|ADLB01000010.1| GENE 14 11475 - 13208 1150 577 aa, chain - ## HITS:1 COG:BH2516 KEGG:ns NR:ns ## COG: BH2516 COG1293 # Protein_GI_number: 15615079 # Func_class: K Transcription # Function: Predicted RNA-binding protein homologous to eukaryotic snRNP # Organism: Bacillus halodurans # 1 570 1 559 570 378 40.0 1e-104 MAFDGITISTIIQELNTTLKDGRINKIAQPEADELLLTIKTPTGQRRLYISASASLPLIY LTDTNKPSPMTAPNFCMLLRKHINNGRITNIYQPKLERIICFEIEHLDELGDLCKKQLIV EIMGKHSNIIFCNDKGMIIDSIKHVSAQMSSVREVLPGRPYFIPDTMERHNPLTVTEEEF RTILTDKPMPLGKAVYTSFTGISPIVAEEICHLSGIPSEITPRELSDDMLLHFFNQFSLY FDSVKEKDFHPAIYYRGQEPVEFSCLPISHYSQYAVKSFESVSELLETYYSTKNTITRIR QKSSDLRRVVQTALERNRKKYALQTKQLRDTENREKYKVYGELLHTYGYNVEKGAKQLEA LNYYTNEMITIPLDTTKTPMENAQKYFDKYNKQKRTFEALSELIEETKDDITYLESVSNA LDIALSENDLLQIKEELIESGYVRRKFTKKKVKITNKPLHYISSDGFHIYVGKNNMQNEE 
LTFHFAVGNDWWFHAKGVPGSHVIVKTNGEELPDRTFEEAGRLAAYYSKNRGNEKVEIDY IEKKHVKKPAQAKPGFVIYHTNYSLIIDSDISKMEQV >gi|330405514|gb|ADLB01000010.1| GENE 15 13357 - 14235 1119 292 aa, chain + ## HITS:1 COG:CAC1716 KEGG:ns NR:ns ## COG: CAC1716 COG1561 # Protein_GI_number: 15894993 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Clostridium acetobutylicum # 1 292 1 292 292 231 45.0 1e-60 MIKSMTGFGRCEISEAERKFTVELKGVNHRYLDVNIRMPKKLNFFEASIRNLLKKYAQRG KVDIFITYEDFSENQVSLKYNETLAEEYLKYFKQMEEKFSLENDIRVSTLSRYPEVLTME EQMIDEEELWNVLKKALEGAFSQFVSTRITEGEALKKDLLAKLDEMLLLVDKVEERSPEI VAEYREKLEMKVNELLADTQIEESRIASEVVLFADKICTDEETVRLRSHIEHMKNTLEET EGIGRKLDFIAQEMNREANTILSKANDLEVSNYAIDLKTGIEKVREQIQNIE >gi|330405514|gb|ADLB01000010.1| GENE 16 14248 - 14880 693 210 aa, chain + ## HITS:1 COG:L149828 KEGG:ns NR:ns ## COG: L149828 COG0194 # Protein_GI_number: 15673881 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Lactococcus lactis # 1 191 1 191 205 200 54.0 2e-51 MKRKGILIVVSGFSGAGKGTLMKELLKQYDNYALSISATTRKAREGEEDGREYFFKTVEE FEKMIAKDELIEYARYVGNYYGTPRAYVEEQLEAGKDVILEIEIQGALKVKEKYPDTLLL FVTPPSAKELERRLVGRGTETMEVIASRMKRATEEAEIMSAYDYIVVNDELDICVKETHN IIQSEHNRVFRNKDFMNQIEEELKGKSKGE >gi|330405514|gb|ADLB01000010.1| GENE 17 14885 - 15127 386 80 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2879 NR:ns ## KEGG: Cphy_2879 # Name: not_defined # Def: DNA-directed RNA polymerase, omega subunit # Organism: C.phytofermentans # Pathway: Purine metabolism [PATH:cpy00230]; Pyrimidine metabolism [PATH:cpy00240]; Metabolic pathways [PATH:cpy01100]; RNA polymerase [PATH:cpy03020] # 1 79 1 78 82 90 63.0 3e-17 MLHPSYTDLMKVVNQDVEEGATKIVNSRYSIVLATSKRARQLIDGDTPLVHTKDGEKPLS IAIDELNNGKIKIIAEDSEQ >gi|330405514|gb|ADLB01000010.1| GENE 18 15189 - 16511 1112 440 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229230948|ref|ZP_04355465.1| SSU ribosomal protein S12P methylthiotransferase [Desulfotomaculum acetoxidans DSM 771] # 1 439 19 460 462 432 48 1e-120 
MNILFISLGCDKNLVDTEVMLGLLASKGHQMVNDEMEADVIVINTCCFIHDAKEESIQNI LEMAELKKEGRLKALIVTGCLAQRYKEEIIEEIPEVDAVLGTTSYDKILEAIDEALEGRH CVEMTDIDALPLVQSNRLVTTGGHFAYLKIAEGCDKHCTYCIIPKIRGNFRSVPMERLLK EAEGLAEQGVKELILVAQETTLYGKDIYGEKSLHKLLKELCKVSGIQWIRILYCYPEEIT DELIQVMKEEKKICHYLDLPIQHASDEILKRMGRRTSKAQLKEIIGKLREEIPDITLRTT LITGFPGETKEQHEELMEFVDEMEFDRLGVFTYSPEEDTPAALMDNQIEEEVKEDRQAEL MELQQDIAFDLAEDMIGKEVLVLIEGKVADENAYVGRTYKDAPNVDGLIFVNTEEELMSG DFAKVRVTGALEYDLIGEIV >gi|330405514|gb|ADLB01000010.1| GENE 19 16493 - 17056 375 187 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 1 182 474 669 904 149 40 1e-116 DRRDCIMNLPNKLTVIRVIMIIPFVVFMLNTGIAGDASKWIAVGIFIVASLTDLLDGKIA RKYNLVTNFGKFMDPLADKLLVGAAMICLVEMGRLQAWIVIIIISREFIISGFRLVASDN GIVIAASYWGKFKTTFQMLMIIFLIIDLGGVFATVETALIYISLALTIISLIDYIAKNKQ VLTQGGM >gi|330405514|gb|ADLB01000010.1| GENE 20 17058 - 17531 298 157 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 8 156 756 904 904 119 44 2e-26 MDLEEIIVQKLTEKKWKITTAESCTGGLLAGRILNVSGASSVYEEGHITYSNEAKEKILN VCHDTLEKYGAVSAETAKEMAVGAAATANAEVALSTTGIAGPTGGTKDKPVGLIYIACYI LGQVYSKELRLKGTREENRAETVEEALKLFLDSLPEQ >gi|330405514|gb|ADLB01000010.1| GENE 21 17555 - 17629 99 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYKNKTNYSVDCRIIKKYEKDID >gi|330405514|gb|ADLB01000010.1| GENE 22 17727 - 18419 970 230 aa, chain + ## HITS:1 COG:CAC2565 KEGG:ns NR:ns ## COG: CAC2565 COG0822 # Protein_GI_number: 15895825 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster formation # Organism: Clostridium acetobutylicum # 1 230 1 230 230 382 83.0 1e-106 MIYSHEVQTMCPVAQGVNHGPAPIPEEAKWVKAKEVKDISGLTHGVGWCAPQQGACKLTL NVKEGIIQEALVETIGCSGMTHSAAMASEILPGKTILEALNTDLVCDAINTAMRELFLQI VYGRTQSAFSEDGLPVGAGLEDLGKGLRSQVGTMYGTLAKGPRYLEMAEGYVTGIALDEN 
DEIIGYQFVSLGKLTDFIKKGDDPNTAWEKAKGQYGRVADAVKIIDPRAE >gi|330405514|gb|ADLB01000010.1| GENE 23 18438 - 19448 1276 336 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3261 NR:ns ## KEGG: Cphy_3261 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 336 1 336 336 578 91.0 1e-163 MALFESYERRIDKINSVLNSYGIASIEEAEKITKDAGLNVYDQIKGIQPICFENACWAYI VGAAIAIKKDCRRAADAAAAIGEGLQAFCIPGSVADQRKVGLGHGNLGKMLLEEDTDCFA FLAGHESFAAAEGAIGIAEKANKVRKKPLRVILNGLGKDAAQIISRINGFTFVETEMDYY TGEVKEVFRKSYSDGLRSKVNCYGANDVTEGVAIMHKEGVDVSITGNSTNPTRFQHPVAG TYKKECIEQGKKYFSVASGGGTGRTLHPDNMAAGPASYGMTDTMGRMHSDAQFAGSSSVP AHVEMMGLIGAGNNPMVGMTVAVAVSIQEAAEEGRF >gi|330405514|gb|ADLB01000010.1| GENE 24 19517 - 20062 272 181 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229241266|ref|ZP_04365652.1| acetyltransferase, ribosomal protein N-acetylase [Cellulomonas flavigena DSM 20109] # 11 181 12 179 182 109 38 3e-23 MKKIWEKDGYIIRQAEEKDKEAYYEQNFNPLDKEVARFTGCKETFTREEVLSFYEKCLIS EDRYDFLIFSPEGKIIGESVINDIDWKVKSANYRICIFHSVERGKGIGSWAVKTAICFAF EELKLHRLELDVFSFNERAKHLYEKSGFKVEGIRRDAVLDEGKYADDIFMSILEVEYSKN R >gi|330405514|gb|ADLB01000010.1| GENE 25 20205 - 24566 4712 1453 aa, chain + ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 56 540 10 449 559 350 39.0 1e-95 MKKKQVVSLLLAATLMGSSVFSGIATLDVSAATKEGVTAEDLADNNTNAPEKDKVIPSAN QYKYQKDELAAFCHFGMNTYTGSEWGNGKEKPEQFALKNDFDAETMVKTLHEAGFKKLIV TAKHHDGFCIWPSKYTEHDSQKAGYSGDVLAEISTACTKYNMDMGLYLSPWDVNAKSYGY YDKDGKPLCDDKGKPLNNKTWEEVEELDVDDYNEYYNNQLIEILNNEKYGNNGRFVEVWM DGAKGSGSAVQNYTFEKWFDTIQQYEGKKGGQKDDCLLFGAEAYTTVRWIGNENGFADTE TWSKSNVNKDENTIDSNTKSGYTKGYVDGNQWTVPEADARITSGWFWGEGKKSPKDMKAL SEMYFRSVGHNAPLLLNVPPNKEGKVDQAILNRVTEFGKGIKNTFKDNLAKGAKVTASTV RGNDKKFSPQNVLDGKDDTYWTVDDDKKTGTLTIDLGGLKTFDVVSVEESIEFGQRIGSY KIEYQTESGEWKKFDEGKTIGAKRLARKNAVKGKKVRITVTADEKAEHKVPMLSEIGVYK 
ATDDMALGNGIPEGLEITDDRKFTAKDWSQESGDQFIEGTGMWCEPNAQATVKFTGTKAW LVGTVDPNHGPADIYIDDKKVATINTKGTARKLGQRIFESETLEDKEHTLKIVNTGTGKQ AIGIDALLSLNNGGKGMLDIEYPSYRVNEDSKVPIKVKRIGGAKGETKVTFQVNPGSAWQ DHFDADGNMELTLKDGQTEAEAYVTTKRVPAKTGDLSFTAELVNPTNKVLTGFNTPTRIT IADSEEFDKSALLKKLDEVKQANYQEGNYTDASYKALQDAIQFAEKVAKKAKPSAQECAE AVGKLDVAVRSLAKRATYTEKDPYQLPKRKGGKKNLEAENFILDKGTSAADKYVRVQQDT TASNGAKIGWFEEGNKIKLPFYAAKEGTYTFKAKLQSGRNADNPNALNWSGTNVESGTLD VHNGDSNTYKEVEFDVEVTKAGAGELIFTADKKGSPNLDKFEVTAKEVAMGNFDITASAG EHGTISPLGKVTVKEGENKEFTMQPSKDYMVKDVLVDGESVGNILKYTFKDVDAAHTIKV EFEKEKLAADNRFEFPLKGEKRLEAERLELHNVGGNDEEWKLEVKDADWASNGKFVNSLN KDDEIKLYYNAEKAGDYAVIVTYRSGSAENGFLWSEKDGKIEAGTATVGATDGAKETHQK EITFKVKTAGEGVLTIKGGEKGAPQIDKFDIVSPELITFELEKAIADADKIVLDEYKDGA AKDAFVKALKDAKDMLIKAQEGKCTQDDIANVTKVLRDAMEKLEDKDQVPATDKSALKNA IDKANGIDLKKYQDGSEKEVFVKALDNAKAVYANDKATEEEIAKATLELNTAMDKLKPID NSNGGNNGNTGNTGNAGNSGSGEQNGNGQTSLGNNNQKPSAPVKTGDTAEPFGYMAGMAA GLAAVVAILRKRK >gi|330405514|gb|ADLB01000010.1| GENE 26 24662 - 25453 629 263 aa, chain + ## HITS:1 COG:FN0926 KEGG:ns NR:ns ## COG: FN0926 COG2357 # Protein_GI_number: 19704261 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 4 251 6 238 259 133 36.0 3e-31 MINKREFLKEYNITEQEFREAEMEWSELEAIYANYKEIEDKLRRLGKEFVNDYLYDIERA GIHSYRYRTKAAGHLIEKIIRKRREAKEKYAEINRDNYYKYNTDLIGIRVLFLYREDWVH FHHYITTVFENNPDCYVVDRLHDFDEDITHNYIAERPKVYKRSGDSRIYDENLIEIKAGG IYRSLHYIIKYKGYYVEIQGRTLFEEGWSEVDHDIVYPYFQDDAMLKDFSTLLNRLSGMA DEMSSYFRRMKMEREAERFLKED >gi|330405514|gb|ADLB01000010.1| GENE 27 25600 - 26871 1050 423 aa, chain + ## HITS:1 COG:CAC0282 KEGG:ns NR:ns ## COG: CAC0282 COG0402 # Protein_GI_number: 15893574 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Clostridium acetobutylicum # 5 422 9 423 428 417 47.0 1e-116 MSTFVLKGNICFSKSRTEFVTAENSYLVCENGICKGVFQSLPEKYKNLPCHDYGNKLIIP 
GLTDLHVHAPQYTFRGLGMDLELIDWLNAHTFREEAKYADLEYAGKAYEIFVDDLMKSAT TRACIFGTIHNEATLVLMEMLERRGFKGYVGKVNMDRNSPEELCEENDSVSAQRTIEWIE EAQKRFCKMRPILTPRFIPTCSDGLMKKLADIQEKYHLPVQSHLSENMGEIEWVRELCPN TGFYGEAYSQFGMFGNKYPAIMAHCVHSTEEEMELMAKQNVFIAHCPQSNTNLSSGVAPI RKYLDMGIKMGLGSDVAGGFDLSIFRAMADALQVSKLRWRLLDTDLTPLKVEEAFYLATK GGGQFFGKVGSFEDGFEFDAVVLDDSSLRTMRELNVKERIERLIYLADDRCVVDKYIAGE NVK >gi|330405514|gb|ADLB01000010.1| GENE 28 26933 - 29272 2282 779 aa, chain - ## HITS:1 COG:lin2460 KEGG:ns NR:ns ## COG: lin2460 COG1511 # Protein_GI_number: 16801522 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 272 765 206 706 927 135 27.0 2e-31 MKNRDYKTLVAIATASALVVGSLGVVAPSYAKDDKNITTETVENTVPEKTSEAASSDVKP FKSETVYAKIDGNGNVSSITVSDQLKNLDDMEQLKDTSILNDIENVKGEEKFSKKQDSII WDTDNKDICYQGTTTQPLPVGIKVTYVLDGKEITAKELKGKSGHLVMRYQYTNETASGSE YVPFAMVTGLIFDTDKISNIKLTNAKLISDGDRDVILGMGLPALNEQLDVKELDIPDYFE VEADVKDYELSEGMTVATNDIFNDLETDKFDNLSELKDSMKTLQDSANQLVDGSGELRKG LDTLLSSSGTLTDGIHQLVNGGNTLKKGTSSLAGGSKSLVNGSQSLASGTAQLQSGTNSL KAGASQVNAGLSSASSKTSSVLLPGAVQLDNGVADMQNKLTPGMNALAQGVQQLDAGLNS PLTSAQPGLKASISAVNDVMNKGTTETGGKSLSKITKDTADAADALAKSMQSSSNTTVSA ETVAKGNGETDKAISSLKSLLNREDLPEDVKASIQASITSLEKDKQTRQTSAAQLDKQIQ SQTSEQNKALANTQELLKQVVAGTKTTSAIVDNVAGNMTKINAGTTQLAQGAKELDAKVT DSKSGLIAQINGGVSTLKNGTSQLREGIGGKNGLANGLNQLADGASQVNSGAATLNEKMI IANSGAKELHKGAAQLSAGASQLDAGAGSLVSGLNTLNQGSSALIDGVKKLDAGAVALND GMIKFNKEGIEKLVSVFDGDIDGLLDKVNTILDSSKNYKNFSGISDDMDGEVKFIFVTE >gi|330405514|gb|ADLB01000010.1| GENE 29 29293 - 31395 2404 700 aa, chain - ## HITS:1 COG:BH0720 KEGG:ns NR:ns ## COG: BH0720 COG1033 # Protein_GI_number: 15613283 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Bacillus halodurans # 1 689 1 683 687 300 30.0 9e-81 MVKVGKKIVKFRVAILILGFLLLIPSALGYFKTRVNYDILYYLPDNIDTMKGQDILMDDF GKGAFAMEVVEGMSTKEVSDVKKKIEKVDGVADVIWYDSLADLSIPIDALPNKIKDVFQK DDATLMAIFFDDTTSADDTMNAITQIRKVTNKQCYLSSMSSVVTDIKSLSEKETPMYVLI 
AAVLTSIILALCTNCWILPIFFMLSIGMAIVYNMGTNYFLGEISYITKALSAVLQLGVTM DYSIFLWHSYQENQERFPNDKERAMAHAISNTFSSVLGSSVTTVAGFIALCFMSFTLGRD LGIVMAKGVVFGVISCVTILPSFILIFDKQIEKTSHRSLIPKMEKTSHFVTDHYKTFAVI FLILLVPAIWGYTKAEVYYNLDATLPKYLDSIKANEKLSNTFDMNATHMVLADANLSPKE AKEMLDEMSDVKGVKFALGLDSLLGSSIPRDVVPSELTETLKQGDWQLILVQSEYKAATD EVNKQCTELNKIIKSHDHSAMLIGEAPCTKDLITITDKDFKVVSAISIVAIFIIIAVVFK SISLPVILVAVIEFAIFINLGIPYFTGTKMPFIASIVIGTIQLGATVDYAILMTTRYKKE RCKGKVKKEAIYIALSSSVSSVIVSGLGFFAATFGVGLYSDIDMISALCSLMARGAIISM LTVIFILPSALMLFDKLICATTKDMNLKKIEARKNSTANA >gi|330405514|gb|ADLB01000010.1| GENE 30 31558 - 32154 617 198 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1910 NR:ns ## KEGG: Cphy_1910 # Name: not_defined # Def: TetR family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 1 194 1 195 207 187 49.0 1e-46 MGKVEENKQQKEDALFESAYDLFMTKGIAKTSIHDIVQNAGVAKGTFYLYFKDKYEIRDR LIAKTAGRLFHSANRELEKAQIPQFEDKIIFIVDYVLDEMQKNKAVLQFVSKNLSWGIFR QAIESKEENTGVKELFYKLLEESPEVKLQAPETMLFLIIELASSTSYSTILENDPISYEE LKPYLNASIRAIIRNHMC >gi|330405514|gb|ADLB01000010.1| GENE 31 32132 - 32257 118 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDCGLFGVRFLFLGHKNGTVAPVDYSSTFATAPLINTYDFL >gi|330405514|gb|ADLB01000010.1| GENE 32 32271 - 32583 165 104 aa, chain + ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 101 9 101 334 135 72.0 2e-32 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACKRDINFMWLLAGQ Prediction of potential genes in microbial genomes Time: Tue May 24 21:02:49 2011 Seq name: gi|330405451|gb|ADLB01000011.1| Lachnospiraceae bacterium 2_1_46FAA cont1.11, whole genome shotgun sequence Length of sequence - 26225 bp Number of predicted genes - 26, with homology - 24 Number of transcription units - 11, operones - 5 average op.length - 4.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . + CDS 54 - 257 192 ## COG3666 Transposase and inactivated derivatives + Term 297 - 347 6.0 - Term 341 - 400 11.3 2 2 Tu 1 . - CDS 405 - 2090 960 ## COG3044 Predicted ATPase of the ABC class - Prom 2126 - 2185 11.1 + Prom 2068 - 2127 6.4 3 3 Tu 1 . + CDS 2248 - 3474 827 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 3509 - 3548 -0.0 + Prom 3585 - 3644 7.5 4 4 Op 1 . + CDS 3673 - 4779 1254 ## COG1915 Uncharacterized conserved protein 5 4 Op 2 . + CDS 4811 - 5692 1040 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 5733 - 5767 4.7 + Prom 5747 - 5806 6.8 6 5 Tu 1 . + CDS 5836 - 6246 452 ## Cthe_1015 hypothetical protein + Prom 6296 - 6355 3.2 7 6 Op 1 38/0.000 + CDS 6446 - 8053 1807 ## COG0747 ABC-type dipeptide transport system, periplasmic component 8 6 Op 2 49/0.000 + CDS 8070 - 9053 1242 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 9 6 Op 3 44/0.000 + CDS 9046 - 9879 834 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 10 6 Op 4 17/0.000 + CDS 9891 - 10811 370 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 11 6 Op 5 . + CDS 10801 - 11415 236 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 12 6 Op 6 2/0.000 + CDS 11422 - 12174 815 ## COG1691 NCAIR mutase (PurE)-related proteins 13 6 Op 7 1/0.000 + CDS 12167 - 13453 1142 ## COG1641 Uncharacterized conserved protein 14 6 Op 8 . + CDS 13454 - 14281 844 ## COG1606 ATP-utilizing enzymes of the PP-loop superfamily + Term 14286 - 14334 11.0 + Prom 14285 - 14344 5.8 15 7 Op 1 2/0.000 + CDS 14369 - 15121 812 ## COG0500 SAM-dependent methyltransferases 16 7 Op 2 40/0.000 + CDS 15140 - 15835 780 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 17 7 Op 3 . 
+ CDS 15828 - 16910 971 ## COG0642 Signal transduction histidine kinase + Term 16973 - 17027 13.3 + Prom 16941 - 17000 4.7 18 8 Op 1 . + CDS 17043 - 18059 783 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 19 8 Op 2 . + CDS 18074 - 18805 520 ## COG1876 D-alanyl-D-alanine carboxypeptidase 20 8 Op 3 . + CDS 18809 - 19798 589 ## CDR20291_1526 serine/alanine racemase 21 8 Op 4 . + CDS 19795 - 20904 816 ## COG0787 Alanine racemase 22 8 Op 5 1/0.000 + CDS 20970 - 21776 651 ## COG0561 Predicted hydrolases of the HAD superfamily + Term 21790 - 21836 8.2 + Prom 21780 - 21839 9.2 23 9 Op 1 . + CDS 21884 - 23677 1625 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 24 9 Op 2 . + CDS 23678 - 25702 1566 ## COG3973 Superfamily I DNA and RNA helicases + Term 25705 - 25751 16.1 + Prom 25764 - 25823 2.0 25 10 Tu 1 . + CDS 25850 - 25945 164 ## 26 11 Tu 1 . - CDS 26077 - 26160 82 ## Predicted protein(s) >gi|330405451|gb|ADLB01000011.1| GENE 1 54 - 257 192 67 aa, chain + ## HITS:1 COG:CAC0656 KEGG:ns NR:ns ## COG: CAC0656 COG3666 # Protein_GI_number: 15893944 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 67 123 189 189 93 70.0 7e-20 MNRSIQVEGAFGVLKNDYEFQRFLLRGKSKVKLEILLLCMGYNINKLHAKIQKERTGSYL FLVKETA >gi|330405451|gb|ADLB01000011.1| GENE 2 405 - 2090 960 561 aa, chain - ## HITS:1 COG:VCA0786 KEGG:ns NR:ns ## COG: VCA0786 COG3044 # Protein_GI_number: 15601541 # Func_class: R General function prediction only # Function: Predicted ATPase of the ABC class # Organism: Vibrio cholerae # 1 560 1 544 549 375 39.0 1e-103 MYKLQKQLESIHRKSYPAYKSLKGVYDFKNYTLSIDHVQGDPFASPSAVSVRIPHKLAGF PREYFDRYDKRIALQDYLLRLFAKKISSFSFKAKGSGKSGLISTSLCGQKILERTACEFT DKEIIVRLEIGFPANGRTINATELEKILFDFLPKCIESIFYYKKLNADNVLSVIHLAEDQ TAVREQLNENNLICFIADGSILPRESGISDRPMKNAIPFKSPSSLLVELSLPRQRKIKGM GLKKGITLIVGGGYHGKSTLLNAIESGVYNHIIGDGREYVITDDTAVKLRAEDGRCIKNT 
DISLFINDLPNQKDTQHFSTENASGSTSQSANVIEAMESGSKLFLIDEDTSATNFMVRDE LMQSIVTRDKEPITPFLERVTDLYRQSGISTILVAGSSGSYFYTADTVLQMDCYHTKEIT DKVKEKCKKTTAPSLFAPNYKTPDFQRTLFAAKPNYRPNDRIKLKFHGKDSFQIDRQTVD LRYVEQLADAEQTACLAYLLKYALLHYSGSKKTVQEIVSALTKQIETKGLSSIFDSSYIS LGLALPRPQEIYACFNRYREL >gi|330405451|gb|ADLB01000011.1| GENE 3 2248 - 3474 827 408 aa, chain + ## HITS:1 COG:SA0622 KEGG:ns NR:ns ## COG: SA0622 COG2207 # Protein_GI_number: 15926344 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Staphylococcus aureus N315 # 188 287 152 251 716 62 31.0 1e-09 MGFIPIQSLYTSDNMNNCICNFEVEVFEKVTEPLIHPMSRLWLVNEGVGKILINSKEYII TKGSLVSVLPWQITEIVEVKKPLQFYVVQYHFERINELVKTFFNPDNVQLSFSQVLRKAP VITFGEEKYKEIQNMFHQIGNEIRYVNEDEKETGEEIFSNVYIINKLIEILVSFVRSNRR VPRETSIDASEILQFMYAHLNEKITISMLSQKFYMSESAIRSYIKNTTGLSFFDLLNEMR VGKMINYLLYTNLTVGELAEILGFTDDSHMCKVFKARMGIKTKEFRNTYQVIGDRCHIVD RREFYEIVEYIYRNHTEDLQLQAVSEQFKMSPKELNRVLSYQVEMSFNEFLNFIRVNHAA QLLLDSDKSVLTIALEVGYKTEKTLSRNFLQIKGRTAGEFRRKVKLEK >gi|330405451|gb|ADLB01000011.1| GENE 4 3673 - 4779 1254 368 aa, chain + ## HITS:1 COG:MJ1480 KEGG:ns NR:ns ## COG: MJ1480 COG1915 # Protein_GI_number: 15669673 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanococcus jannaschii # 24 357 80 401 423 196 37.0 6e-50 MSNFEMPKYHHPDFTQEMFVNAPDAKYEAAEKDGVVPEYYHSTSMYPEYFKINGEWKLAE ESRMDSCVILCDDGHLTVVEARNIKKGDKVLLGRTERCEEGVYMHCHGFEEAGEKLDDQF VFRQGRSRETSYARDYDKLFELLKHEKENGGKIVWVMGPAFAFDANARAAMQALVENGYA HGLLAGNALATHDLEGALLHTALGQDIYTQQSQPNGHYNHLDVLNKVRRSGSIPQFMKDY DLNDGIICSCVNNNVPIVLAGSIRDDGPLPEVYGNVYEAANAMRGIVKEATTVICLATML HTIAVGNMTPSFRVLKDGTIRPVYLYTVDADEFVVNKLLDRGSLAATTMVTNVQDFITRV AKGLGVME >gi|330405451|gb|ADLB01000011.1| GENE 5 4811 - 5692 1040 293 aa, chain + ## HITS:1 COG:FN1038 KEGG:ns NR:ns ## COG: FN1038 COG0697 # Protein_GI_number: 19704373 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: 
Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 1 290 2 297 303 121 30.0 1e-27 MKKNSMFGKLALLFVAIAWGSSMVVIKGSTDTLPAGTLLACRFTVAGVILALANFNKLKQ IDKDYLKSGIFIGVCLFMAYFTQTIGVMLEMPGKSHFLSSAYCVFVPFLGWLILKEKPKI YHVIAATMCAVGIILVSVAGNFSISYGDSISIISSVFWAAQIIAIAKWGKDKDPGLITML QFVVCAVLAWVFTLTMENPGAIEWNMGAVGGVLYLGIVCSGICFLLQTIAQKTENPTSVS IILSFENIFGLVFGAIFFNEKFTPRSIMGFVLIFAAIIIAETELSFLKKKKAK >gi|330405451|gb|ADLB01000011.1| GENE 6 5836 - 6246 452 136 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1015 NR:ns ## KEGG: Cthe_1015 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 133 1 134 140 117 47.0 1e-25 MKVLIIDAQGGGMGKQLVSAIKQEFPAAEITAVGTNSMATSNMLKAGADHAATGENAVVV GCRRAEIIIGPVGIAIADSMYGEITSKMAEAVGQSEARKLLIPMNHCNNIIVGVGNTSMS FLIASVVEELIEYSNC >gi|330405451|gb|ADLB01000011.1| GENE 7 6446 - 8053 1807 535 aa, chain + ## HITS:1 COG:MA1915 KEGG:ns NR:ns ## COG: MA1915 COG0747 # Protein_GI_number: 20090764 # Func_class: E Amino acid transport and metabolism # Function: ABC-type dipeptide transport system, periplasmic component # Organism: Methanosarcina acetivorans str.C2A # 49 533 67 549 553 303 36.0 5e-82 MKKKFMSLLCVTAMFAMAVSGCGGENGKKDSSTKKEDGIKAYVGTTLFEGSLDPIKGALP HGYPFINNALLRVDSNSKYVGDLAKSWEISQDSLTYTFHLNEDIKFSDGSDFDAEDVVFT YETVQKNQADNEYVDLTRLASVKEIDKNTVEFKLKEAYSPFLDTTALLQIVPSDAYDSKA FDTKPIGTGAYKVAQYDADQQIILEANENYFGKEPEIKKVTIVNMDTDAAFAAAKAGELD VVMVGTNYSKEKIPGMTLQKLETMDVRNVSLPVRRVTEMKNSDGKKVTVGNNVTSDLAVR KALSIGIDRQKIIDNSSDGIGLPSVNFTDNLVWASTDTYPDKKVDEAKKLLEDAGWKVGK DGIREKDGQKCTFDLYASSGDTDRYNLSVALAENAKELGIDIKVKTATWEEIVTLQNTSA IMWGWGQYSPTVLSSLFQSDLFLTGGYDNVVGYQNPAVDAKIKEALSANTQEKAVAAWKE VQKIADEDYTNLFLVNIQHCYFISDKLDISIDTQIPHPHGHGTPIICNMADWKMK >gi|330405451|gb|ADLB01000011.1| GENE 8 8070 - 9053 1242 327 aa, chain + ## HITS:1 COG:MA1913 KEGG:ns NR:ns ## COG: MA1913 COG0601 # Protein_GI_number: 20090762 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # 
Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 21 324 25 327 330 333 52.0 2e-91 MKNGKKLLGECVRMVLLLVAVSIVAFLLITKAPIDPLVSYVGTNSTLSEEAKEEIAEEWG LNDPLPERFATWVKHAVHGDLGMSITYKKPVIEVIKTRFSYSIVLMMLAWALSGIIGFIL GIVCGMRQGGIMDRIVKTFCLVIKSAPVFWIGLLILTIFAVQLGWFPIGMAVPAGKLASE VTLGDRIYHLILPVLTLTIVSISEVVLYTRQKVIEIMNSDFILYARARGENDRQLVKRHV LRNVALPAITVQFASFNELFGGMALAETVFAYPGIGNATTAAAMNADVPLLLGIAIFSAL FVFTGNLIANLLYGVFDPRIREGEKRG >gi|330405451|gb|ADLB01000011.1| GENE 9 9046 - 9879 834 277 aa, chain + ## HITS:1 COG:MA1912 KEGG:ns NR:ns ## COG: MA1912 COG1173 # Protein_GI_number: 20090761 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 8 277 17 285 285 246 48.0 3e-65 MDKKKFWNRRVMVAILVAIAVIYLAGIFFWGIFMDPSCYDTNYADKFMAPGLKHLFGTDF VGRDMFYRCIKGLSNSLIIGILASVVSSVIALVAGIASAVFGGWVDKFVNWCVDLCMGLP HLVLLMLISFMMGKGVKGVTVAVALTHWPSLTRIVRSEVMQIRSAQYVQAAYKMGKSKTQ VAVQHILPHVLPAYLIGLVLLFPHAIMHEASITFLGFGMPAEMPAIGVILSEAMNHIATG KWWLAFFPGLMLLIAVILFDVIGENLKKLWNPGSGNE >gi|330405451|gb|ADLB01000011.1| GENE 10 9891 - 10811 370 306 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 6 301 11 321 329 147 31 9e-35 MKEKSVLTVDDLVVSFSMYKGFFHKENLEVVHSLSLDVNRGEIVAVVGSSGSGKSLLAHA IMGLLPNNANISGKISYNGEELTQKRRKELLGRKMMFIPQSVDYLDPVMKVGKQVTGVYS TKQRQEEMFEKYKLSKDVEEMYPFQLSGGMARRVLVSSAVMESPELIIADEPTPGLSVDM AMDTLRHFREIADNGAGVLLITHDIDLAFHVADRIAVFYAGTIVESAPTEDFLAGAEKLR HPYSKAFISALPQNEFQAIDGVQPYAANLPGGCLFADRCPNVTKKCCGEIPMRKLRGGKV RCVHAT >gi|330405451|gb|ADLB01000011.1| GENE 11 10801 - 11415 236 204 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 22 195 40 232 329 95 29 3e-19 
MQLSAENISFRYTDKSSWILKDVNLKIETGERVGIVGPSGYGKSTLAKILAGYNRADSGQ VLLDGKPLQAKGFCPVQMIYQHPELAVNPRWKMEKTLNECWNPDEKILERFGIEKDWLTR WPRELSGGELQRFCIVRLLSPETKFLICDEITTMLDVISQAQIWNVLLQMAEERNYGMLI VTHNMDLAKRVCTRIVDLSEINRR >gi|330405451|gb|ADLB01000011.1| GENE 12 11422 - 12174 815 250 aa, chain + ## HITS:1 COG:CAC0776 KEGG:ns NR:ns ## COG: CAC0776 COG1691 # Protein_GI_number: 15894063 # Func_class: R General function prediction only # Function: NCAIR mutase (PurE)-related proteins # Organism: Clostridium acetobutylicum # 2 244 5 248 248 244 50.0 1e-64 MDTKKILEQVKNGAMSVEEAEEFFRRQPFEDLGYAKLDTHRKLRSGCAEVVFCSGKADEH LLSIYERLYAEEGEVLGTRASKEQAELVQSKLPEVEYDALSRILKIEKKEKEYVGKIVVC TAGTADIPVAEEAAQTAEFFGNHVERIYDVGVSGLHRLLSRVETIQSANCVITVAGMEGA LASVIGGLVDKPVIAVPTSVGYGASMNGISALLTMINSCANGIATVNIDNGYGAGYIASQ INKLGVHEHE >gi|330405451|gb|ADLB01000011.1| GENE 13 12167 - 13453 1142 428 aa, chain + ## HITS:1 COG:CAC0774 KEGG:ns NR:ns ## COG: CAC0774 COG1641 # Protein_GI_number: 15894061 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 6 426 4 417 420 271 39.0 2e-72 MSKQKLYLECYAGISGDMTVAALLDLGADQDVLNETLNSLSISGQFQTKISRVSKSGLDA CDFDVVLERENHDHDMHYLHGHDHHHEHTHDHDHHHEHTHEHGHDHHHEHRGLIEIQGII RCSRMTERAKELAEKIFDILANAEAKAHGKPKTEVHFHEVGAVDSIVDIAAVAICLDNLG IEDVIIPTLYEGSGTVRCQHGVLPIPVPAVANIVNAENLTLRITETEGEFVTPTGAAIAA AIRTEEKLPENFRILKTGIGAGKRNYERPSILRAMLIEEQANQDVEKDEIIKLESNIDDC TGEALGYTMEKLMEAGARDVHYFPVFMKKNRPGYQLNVICKENQISQLQKIIFEETTSIG IRIQRMERSVLPRRIETRDTSMGEVQIKICSLPTGERIYPEHDSIARICKESGKSWQEVY RKIIEECQ >gi|330405451|gb|ADLB01000011.1| GENE 14 13454 - 14281 844 275 aa, chain + ## HITS:1 COG:CAC0775 KEGG:ns NR:ns ## COG: CAC0775 COG1606 # Protein_GI_number: 15894062 # Func_class: R General function prediction only # Function: ATP-utilizing enzymes of the PP-loop superfamily # Organism: Clostridium acetobutylicum # 6 268 5 262 271 193 39.0 4e-49 MKSYEEKKKQLFEKINSLAKEDIVLAFSGGVDSSLLLKICCDSSKNYGTTVYAITVHTEL 
HPMKDVEIATKVAKEAGAKHLVVYIDELQDAGIEYNPIDRCYRCKSLLFGKLKEKAKELG VKNVVEGTNEDDLHVYRPGIRALRELNIISPLAESGFTKAEVRKLAAEYGISVANRPSTP CMATRFPYGAKLDYERMHQVEEGEEWLKTLGFYNVRIRVHGEIARIEVDEKDMPLLLNNR VKVIEKLKAFGYDYVTVDLEGFRSGSMDIHVTEKN >gi|330405451|gb|ADLB01000011.1| GENE 15 14369 - 15121 812 250 aa, chain + ## HITS:1 COG:FN1919 KEGG:ns NR:ns ## COG: FN1919 COG0500 # Protein_GI_number: 19705224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 5 250 3 249 249 333 63.0 2e-91 MSEGYQKINAESVDRWCAEGWEWGEPITHETFKKAERGEWGVYLTPTKYVPHEWFGELKD KNILGLASGGGQQIPVFSALGAKCTLLDYSHTQCESERMVAEREGYEVKIIEGDMTKPLP FSDETFDLIFHPVSNCYVEEVKPIFKECYRVLKKGGILLCGLDNGMNYIFDETETKLQHK LPFNPMKDKKIYEYSIKNDWGIQFSHTLEEQIGGQLEAGFILTNIYEDTNGEGNLHEYNV PTFFATRAIK >gi|330405451|gb|ADLB01000011.1| GENE 16 15140 - 15835 780 231 aa, chain + ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 231 4 230 233 237 52.0 1e-62 MKKQILIVDDEKEIADLLEVYLQNDGFEVHKYYTGKEAMDCIRTTRLDLAILDIMLPDID GFTICQKIREKYFYPVIMLTAKVEDADKILGLTLGADDYILKPFNPLEVLARVKTQLRRY TKYNQVETVVEEKEEHDIKGLVINKTTHECTLFGEKIELTPIEFSILWYLCENRGKVVSS EELFEQVWGEKYYDSNNTVMAHIGRLREKLHEPPRKPKFVKTVWGVGYKVE >gi|330405451|gb|ADLB01000011.1| GENE 17 15828 - 16910 971 360 aa, chain + ## HITS:1 COG:CAC0565 KEGG:ns NR:ns ## COG: CAC0565 COG0642 # Protein_GI_number: 15893855 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 111 357 250 497 499 166 38.0 9e-41 MNKEVKKIYHKHILKLIAAILICPCIVLLAYGLGKIILGQFIWYEDDFIYVLGKWIESKM VMFTTLATIFGGFCVGVFFVRKPFVYLSEVLQATGGLYQNREELIKFPPVLKDAENQLNH 
IRITMERNVRAAKEAEQRKNDLIVYLAHDLKTPLTSVIGYLTLLEDEPQISEELRQKYLQ IALTKSEHLEDLINEFFEITRFNLSNLTLEVSSVNLTRMLEQITYEFKPLLTEKQLSFRL QMPKDYMMKCDVGKMQRVFDNLFRNAVNYSFSGGEIVVTVTEKENRIHIACENQGNTIPK EKLARIFEQFYRLDTARGTGTGGAGLGLAIAKEIVELHHGTIQAYSEDERIRFEIELPVL >gi|330405451|gb|ADLB01000011.1| GENE 18 17043 - 18059 783 338 aa, chain + ## HITS:1 COG:ECs0431 KEGG:ns NR:ns ## COG: ECs0431 COG1181 # Protein_GI_number: 15829685 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Escherichia coli O157:H7 # 3 338 5 352 364 248 36.0 1e-65 MKKIAVIFGGCSTEYEVSLQSAYAVLTHLDREKYEIISIGITREGKWFRYYGKEKYISED TWQEKECNPAWILPDRVEQGILECTKEGNRLIKVDAVFPVLHGKNGEDGTIQGLIELAGL PVIGCGTLSSSLCMDKDKAHRLVEKEGIAVPKAVVLRKGEKADTLVYPVFVKPVRAGSSF GITKVDREEDLEKAVELAFVHDDEVIIEENIEGFEVGCAVLGNDKLIVGRVDEIELSDGF FDFKEKYTLQNSKIHMPARIDGEIEEKIKETAKKIYRTLSCKGFARVDMFLTPDKRIVFN EVNTIPGFTSHSRYPNMMKGIDMLFEDVLDNLISLEVK >gi|330405451|gb|ADLB01000011.1| GENE 19 18074 - 18805 520 243 aa, chain + ## HITS:1 COG:BH1810 KEGG:ns NR:ns ## COG: BH1810 COG1876 # Protein_GI_number: 15614373 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 1 241 54 288 290 105 30.0 6e-23 MKKEDIHRGTLVLVNRDYPLRVERERRLCQINTNYPEVVLEKEASVTLKKLLEEICGENE IALVSGYRSKKEQERIYQTSLWENGGEFTRKYVAIPNHSEHQTGFAIDVGEKREEIDFIC PAFSNEGAGKKFRAQSYQYGFIERYTKEKEEITKIAAEEWHFRYVGYPHSMIIKERGLCL EEYIELLKDYPYGQKGLSYLAADKQIEIFYIRAEEETVILLPENSLYKISGNNVDGFILT LWR >gi|330405451|gb|ADLB01000011.1| GENE 20 18809 - 19798 589 329 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_1526 NR:ns ## KEGG: CDR20291_1526 # Name: vanTG # Def: serine/alanine racemase # Organism: C.difficile_R20291 # Pathway: not_defined # 1 310 2 313 712 295 49.0 1e-78 MRNEYRGIDYFRFVSAILVITIHTSPLLSYQPYADFILTRIIARVAVPFFLMTTGFFLFK NLETTLEKMKVFTKKMAVLYGISIVIYLPVNWYMGDLHGKGLGKRIVEDIFFQGTMYHLW YFPATILGVWVVYFLLKRLGKRRMFIVGGALYVIGLFGDSYYGLTEGIPLLKNMYQFIFS 
LFGYTRNGLFYVPVFIMLGAWLSEEKTVKKWKIVTGMAVSLGGMIAEGVILHTFDMQRHD SMYILLLPCMVCLFQLLLFWKGKRNRHLGKVALCIYVIHPLVIIGVRGVAKVLHLENLFV ENSLCNFLIVTLVSIAASICITYGKKEKR >gi|330405451|gb|ADLB01000011.1| GENE 21 19795 - 20904 816 369 aa, chain + ## HITS:1 COG:alr2458 KEGG:ns NR:ns ## COG: alr2458 COG0787 # Protein_GI_number: 17229950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Nostoc sp. PCC 7120 # 2 364 31 395 401 233 36.0 6e-61 MRAWLEIDLEKIKWNVNCLRNMMPSGCEMMAVIKANAYGHGAVPVGKMLNGMGIYSFAVA TVDEGIELRENGIIGDILILGYTDLKRMEEVVRYDLIQTIVDERYGEEMNKCGFSIQAEL KIDSGMHRLGVLAENIEETEKVFNLQNLRIKGMFTHLYESDSLEKEAVEKTEKQIEKFDR LVDNLKQRGIVIPKLHIQSSYGLLNYPQLQYDYVRIGIALFGVLSSKQDKTKVKLPLQPA LSLKALVTSVREIKSGETAGYSGAYHAEADRKIAAVSIGYADGIPRNLSGTNNKVIVHGQ KTSIIGKICMDQLLIDVTMIKDVRAGDTVTLIGREGEEEVSVLDMADKAGTIANEIFSRL GKRLEMKVV >gi|330405451|gb|ADLB01000011.1| GENE 22 20970 - 21776 651 268 aa, chain + ## HITS:1 COG:SA0517 KEGG:ns NR:ns ## COG: SA0517 COG0561 # Protein_GI_number: 15926237 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Staphylococcus aureus N315 # 3 267 2 284 289 131 35.0 2e-30 MDIKLLAVDIDETMVNSHHKMTTVTQQALKKAMEQGITVAVVTGRCLEGLPAKLRTLEGI RYVISSNGAKIYDMKAKKVLYRKLISTKSVLDIWNVCKNFHVGLAFHSEGRCYDNRVLQM LYRRVAYHRDFKTHSPIINLEKWVKNNKKPLEKLQIFSGNREKLQTMNEQLRELPGLEIA YSSSNYIEITDEEANKGKALNYLCHSLGIPLEQVMAIGDNANDCSMLKIVGCPVAMGNAN EEVKRIAKIITSSHDDNGVAKVINEYLL >gi|330405451|gb|ADLB01000011.1| GENE 23 21884 - 23677 1625 597 aa, chain + ## HITS:1 COG:CAC3012 KEGG:ns NR:ns ## COG: CAC3012 COG0488 # Protein_GI_number: 15896264 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 593 1 630 632 503 45.0 1e-142 MNILNIEHISKIFGEKVIFDDVSCGIQEGEKIGVIGINGTGKTTLLKLIAGTEEADAGQI IKQNGIRIAYLPQSPAFPEDATVLSYANEGIRENDWTAKSEVKSALTKLGITDFDQKIEH 
LSGGQKKKVALAKTMTSSFEVLLLDEPTNHLDSDMIAWLEDYLRKYKGVVVMVTHDRYFL DKVTNKILEISRGKLYAYEANYSQFLELKAQREEMELASERKRQSVLRMELEWAKRGCRA RTTKQKARLERLEVLKGKNAPVQEQVAQIDSVETRMGKKTIELRYVSKSYGDKKLIDNFS YIALKNQNVGIVGSNGCGKSTLLKIIAGVIEADEGEVEVGETIKIGYFAQEVPNMDTKQK VIDYVKDIAEYIPTREGKITASQMLERFLFTPDMQYAPIEKLSGGEKKRLYLLGILQSAP NVLIFDEANNDIDIPTMTILEDYLNSFQGIVITVSHDRYFLDNVVDRIFEFDGNGHLQQY EGGYTDYVEAKKKRETAETEDKEKKVSVKNDWKQNREKKLKFSYKEQKEYETIDDDIAKL EEELENIDDEMMKNATNSAKLSELTKQKEEKEMLLEEKMERWVYLNDLAEKIKEQKN >gi|330405451|gb|ADLB01000011.1| GENE 24 23678 - 25702 1566 674 aa, chain + ## HITS:1 COG:BS_yvgS KEGG:ns NR:ns ## COG: BS_yvgS COG3973 # Protein_GI_number: 16080398 # Func_class: R General function prediction only # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 10 669 15 750 774 294 31.0 4e-79 MEKLDGTIFLKEITEKLNTRIRQLEEGIEKGQKEIENMHTYYWENYTEMDEYGYENYDNQ QALFRQADANHEKIQLKYRFEKMKDSPFFARVDFRYEDEEEAEVFYIGIGNFAETAGTMP LIYDWRAPVSSLFYDYDKGEASYEAPAGRMEGEICSKWQYKIKNRKMLYGFESDMKIDDD ILQQELGSNGDVQLKNIVRTIQKEQNEIIRNTRDKILVIQGVAGSGKTSVALHRIAYLLY HDRKNLRSANVLILSPNGVFADYISHILPELNEENIQEMSFDLFAYKELQEIVSDCEDRY HQIERQLREDDKEQEERYRTKQSAEFVGMAEGFLAQLEDELMDFTEVEFKGMKLTEQEII DLFYYKFQEIPLLARMGAVQEYFVDAWETLRGRDLSEEEKECLSSRFMKMYVTRDVYKIY NWLLEEMGYPLLPTVQYEKRQLQYEDVFPMLYLKYRLEGGKKHKQIKHLVIDEMQDYSYL QYVILDYLFSCKMTILGDKQQTIDTTERDVLTFLPKILGKDIRKIVMNKSYRNTVEIAQY ANSITANTDMELFERHGKAVEERKTDKKEAIDFVVNKIKEVGEKYETIAIITMTEKEAEE FYRQLQERGMQASYLDRDSMHFQKGVTVTTFYMAKGLEFDCVFGISSKWNEEKGKQGKYI CATRALHELYMMEM >gi|330405451|gb|ADLB01000011.1| GENE 25 25850 - 25945 164 31 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MANKPSSASLLKLIGYKKNASSEMLEISDTV >gi|330405451|gb|ADLB01000011.1| GENE 26 26077 - 26160 82 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNSVRILSIPNSEKTSSIGLKWPNLLD Prediction of potential genes in microbial genomes Time: Tue May 24 21:04:29 2011 Seq name: gi|330405144|gb|ADLB01000012.1| Lachnospiraceae bacterium 2_1_46FAA cont1.12, whole genome shotgun sequence Length of sequence - 252537 bp 
Number of predicted genes - 258, with homology - 248 Number of transcription units - 85, operones - 49 average op.length - 4.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 258 190 ## COG3464 Transposase and inactivated derivatives - Prom 310 - 369 6.0 + Prom 196 - 255 7.4 2 2 Tu 1 . + CDS 343 - 510 129 ## + Term 676 - 734 5.7 - Term 665 - 720 8.1 3 3 Tu 1 . - CDS 734 - 997 290 ## PROTEIN SUPPORTED gi|160880450|ref|YP_001559418.1| ribosomal protein S20 - Prom 1076 - 1135 6.6 + Prom 673 - 732 5.1 4 4 Op 1 . + CDS 924 - 1094 97 ## 5 4 Op 2 . + CDS 1101 - 1187 62 ## 6 4 Op 3 . + CDS 1180 - 2145 1128 ## Cphy_2317 germination protease (EC:3.4.24.78) 7 4 Op 4 . + CDS 2213 - 3409 936 ## EUBREC_1620 stage II sporulation protein P 8 4 Op 5 24/0.000 + CDS 3488 - 5410 1914 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 9 4 Op 6 . + CDS 5424 - 7664 2287 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 7734 - 7772 8.6 + Prom 7728 - 7787 6.0 10 5 Op 1 32/0.000 + CDS 7815 - 8282 504 ## COG0779 Uncharacterized protein conserved in bacteria 11 5 Op 2 22/0.000 + CDS 8301 - 9407 749 ## PROTEIN SUPPORTED gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 12 5 Op 3 8/0.000 + CDS 9425 - 9703 216 ## PROTEIN SUPPORTED gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein 13 5 Op 4 10/0.000 + CDS 9690 - 10013 310 ## PROTEIN SUPPORTED gi|240146074|ref|ZP_04744675.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 14 5 Op 5 32/0.000 + CDS 10031 - 12496 3187 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 15 5 Op 6 4/0.045 + CDS 12521 - 12892 379 ## COG0858 Ribosome-binding factor A 16 5 Op 7 1/0.091 + CDS 12927 - 13841 870 ## COG0618 Exopolyphosphatase-related proteins 17 5 Op 8 12/0.000 + CDS 13852 - 14763 888 ## COG0130 Pseudouridine synthase 18 5 Op 9 9/0.000 + CDS 14766 - 15683 393 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 
+ Prom 15729 - 15788 1.7 19 5 Op 10 . + CDS 15809 - 16075 396 ## PROTEIN SUPPORTED gi|238924297|ref|YP_002937813.1| ribosomal protein S15 + Term 16124 - 16160 4.9 + Prom 16118 - 16177 6.5 20 6 Op 1 . + CDS 16201 - 16629 599 ## COG1661 Predicted DNA-binding protein with PD1-like DNA-binding motif 21 6 Op 2 3/0.045 + CDS 16645 - 17775 685 ## COG5438 Predicted multitransmembrane protein 22 6 Op 3 . + CDS 17772 - 18527 720 ## COG5438 Predicted multitransmembrane protein + Prom 18530 - 18589 6.4 23 7 Op 1 . + CDS 18614 - 19339 525 ## COG0671 Membrane-associated phospholipid phosphatase 24 7 Op 2 . + CDS 19311 - 20342 875 ## COG0392 Predicted integral membrane protein 25 7 Op 3 14/0.000 + CDS 20356 - 20973 669 ## COG1183 Phosphatidylserine synthase 26 7 Op 4 . + CDS 20985 - 21854 672 ## COG0688 Phosphatidylserine decarboxylase 27 7 Op 5 . + CDS 21847 - 23163 1312 ## COG1362 Aspartyl aminopeptidase + Term 23164 - 23212 11.2 + Prom 23188 - 23247 7.3 28 8 Op 1 33/0.000 + CDS 23279 - 23974 938 ## COG0528 Uridylate kinase 29 8 Op 2 19/0.000 + CDS 24007 - 24558 810 ## COG0233 Ribosome recycling factor + Term 24592 - 24626 5.1 30 8 Op 3 32/0.000 + CDS 24647 - 25357 618 ## COG0020 Undecaprenyl pyrophosphate synthase 31 8 Op 4 15/0.000 + CDS 25357 - 26160 920 ## COG0575 CDP-diglyceride synthetase 32 8 Op 5 17/0.000 + CDS 26174 - 27316 1102 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 33 8 Op 6 . + CDS 27322 - 28347 902 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 + Term 28365 - 28410 5.2 - Term 28353 - 28398 8.2 34 9 Tu 1 . - CDS 28404 - 29669 1084 ## COG3409 Putative peptidoglycan-binding domain-containing protein - Prom 29839 - 29898 9.9 + Prom 30098 - 30157 8.1 35 10 Op 1 40/0.000 + CDS 30185 - 30856 742 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 36 10 Op 2 . 
+ CDS 30853 - 31851 868 ## COG0642 Signal transduction histidine kinase + Term 31868 - 31906 1.2 + Prom 31937 - 31996 6.0 37 11 Tu 1 . + CDS 32084 - 35089 2340 ## SGO_0107 LPXTG cell wall surface protein + Prom 35122 - 35181 8.9 38 12 Op 1 36/0.000 + CDS 35206 - 35973 314 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 39 12 Op 2 . + CDS 35966 - 37957 1623 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 37963 - 38011 10.4 + Prom 37978 - 38037 7.1 40 13 Tu 1 . + CDS 38072 - 39037 1261 ## COG0530 Ca2+/Na+ antiporter + Term 39044 - 39081 6.2 - Term 39027 - 39074 4.2 41 14 Tu 1 . - CDS 39077 - 40171 674 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase - Prom 40229 - 40288 12.1 + Prom 40182 - 40241 12.6 42 15 Op 1 . + CDS 40302 - 43565 2359 ## EUBREC_1826 hypothetical protein 43 15 Op 2 . + CDS 43584 - 44828 874 ## EUBREC_1825 hypothetical protein 44 15 Op 3 . + CDS 44803 - 45567 530 ## EUBREC_1824 hypothetical protein + Term 45657 - 45705 8.1 + Prom 45582 - 45641 9.0 45 16 Op 1 4/0.045 + CDS 45750 - 46457 1037 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 46 16 Op 2 . + CDS 46475 - 47908 1651 ## COG0015 Adenylosuccinate lyase 47 16 Op 3 . + CDS 47929 - 48012 84 ## 48 16 Op 4 12/0.000 + CDS 48009 - 48797 811 ## COG2966 Uncharacterized conserved protein 49 16 Op 5 . + CDS 48794 - 49267 563 ## COG3610 Uncharacterized conserved protein + Prom 49301 - 49360 5.1 50 17 Op 1 . + CDS 49385 - 50674 1453 ## COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs + Term 50686 - 50738 8.3 51 17 Op 2 12/0.000 + CDS 50752 - 51555 652 ## COG2966 Uncharacterized conserved protein 52 17 Op 3 . + CDS 51552 - 51995 361 ## COG3610 Uncharacterized conserved protein 53 17 Op 4 . 
+ CDS 52063 - 53520 1755 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 54 17 Op 5 . + CDS 53530 - 55365 1299 ## COG0210 Superfamily I DNA and RNA helicases + Term 55372 - 55407 3.0 + Prom 55456 - 55515 7.8 55 18 Tu 1 . + CDS 55582 - 60099 4964 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 60110 - 60163 4.4 56 19 Tu 1 . - CDS 60152 - 60484 595 ## EUBREC_1820 hypothetical protein - Prom 60573 - 60632 5.4 + Prom 60532 - 60591 8.4 57 20 Op 1 . + CDS 60632 - 61627 1464 ## COG1879 ABC-type sugar transport system, periplasmic component 58 20 Op 2 . + CDS 61649 - 63340 1012 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases 59 20 Op 3 . + CDS 63344 - 63988 506 ## EUBREC_1055 hypothetical protein 60 20 Op 4 . + CDS 64001 - 64540 590 ## Cbei_2682 hypothetical protein - Term 64524 - 64568 7.5 61 21 Tu 1 . - CDS 64575 - 65054 579 ## COG1854 LuxS protein involved in autoinducer AI2 synthesis - Prom 65075 - 65134 5.7 + Prom 65046 - 65105 7.0 62 22 Op 1 . + CDS 65188 - 65820 553 ## COG0406 Fructose-2,6-bisphosphatase 63 22 Op 2 . + CDS 65841 - 66845 1162 ## COG0180 Tryptophanyl-tRNA synthetase + Term 66848 - 66903 9.2 64 23 Op 1 . - CDS 66876 - 67547 498 ## gi|169350673|ref|ZP_02867611.1| hypothetical protein CLOSPI_01446 65 23 Op 2 7/0.000 - CDS 67535 - 68476 670 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 66 23 Op 3 . - CDS 68463 - 68876 299 ## COG2246 Predicted membrane protein 67 23 Op 4 . - CDS 68858 - 71290 937 ## COG4485 Predicted membrane protein - Prom 71432 - 71491 6.3 + Prom 71230 - 71289 7.3 68 24 Tu 1 . + CDS 71462 - 73837 2353 ## COG0457 FOG: TPR repeat + Term 73844 - 73893 10.5 - Term 73830 - 73881 7.1 69 25 Tu 1 . - CDS 73889 - 75526 1798 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 75577 - 75636 11.4 + Prom 75499 - 75558 6.6 70 26 Tu 1 . 
+ CDS 75739 - 76593 583 ## COG4667 Predicted esterase of the alpha-beta hydrolase superfamily + Term 76606 - 76654 11.1 71 27 Op 1 . - CDS 76644 - 77390 616 ## COG0101 Pseudouridylate synthase 72 27 Op 2 . - CDS 77457 - 77735 308 ## gi|210615872|ref|ZP_03290834.1| hypothetical protein CLONEX_03053 73 27 Op 3 . - CDS 77737 - 78141 356 ## gi|210615871|ref|ZP_03290833.1| hypothetical protein CLONEX_03052 - Prom 78218 - 78277 8.2 + Prom 78213 - 78272 5.8 74 28 Tu 1 . + CDS 78315 - 78455 233 ## + Term 78614 - 78653 -0.2 + Prom 78575 - 78634 5.5 75 29 Op 1 . + CDS 78695 - 79513 939 ## COG3711 Transcriptional antiterminator 76 29 Op 2 25/0.000 + CDS 79551 - 79808 519 ## COG1925 Phosphotransferase system, HPr-related proteins 77 29 Op 3 10/0.000 + CDS 79839 - 81458 1714 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) 78 29 Op 4 3/0.045 + CDS 81475 - 81969 627 ## COG2190 Phosphotransferase system IIA components + Term 81983 - 82011 -0.9 79 29 Op 5 1/0.091 + CDS 82051 - 83523 1885 ## COG1263 Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific + Term 83553 - 83602 12.5 + Prom 83537 - 83596 3.0 80 30 Tu 1 . + CDS 83617 - 84360 818 ## COG3142 Uncharacterized protein involved in copper resistance + Term 84365 - 84404 7.1 - Term 84354 - 84389 6.4 81 31 Tu 1 . - CDS 84409 - 84645 205 ## - Prom 84674 - 84733 8.8 + Prom 84633 - 84692 5.6 82 32 Op 1 . 
+ CDS 84797 - 85711 904 ## COG1404 Subtilisin-like serine proteases
    + Term 85713 - 85749 2.1
83 32 Op 2 8/0.000 + CDS 85775 - 86116 358 ## COG2739 Uncharacterized protein conserved in bacteria
84 32 Op 3 23/0.000 + CDS 86118 - 87467 1673 ## COG0541 Signal recognition particle GTPase
85 32 Op 4 19/0.000 + CDS 87505 - 87750 346 ## PROTEIN SUPPORTED gi|160880540|ref|YP_001559508.1| ribosomal protein S16
86 32 Op 5 12/0.000 + CDS 87774 - 88001 404 ## COG1837 Predicted RNA-binding protein (contains KH domain)
87 32 Op 6 30/0.000 + CDS 88045 - 88551 471 ## COG0806 RimM protein, required for 16S rRNA processing
88 32 Op 7 33/0.000 + CDS 88554 - 89276 771 ## COG0336 tRNA-(guanine-N1)-methyltransferase
    + Prom 89281 - 89340 1.5
89 32 Op 8 1/0.091 + CDS 89376 - 89723 465 ## PROTEIN SUPPORTED gi|238916980|ref|YP_002930497.1| large subunit ribosomal protein L19
    + Term 89735 - 89779 6.6
90 33 Op 1 2/0.045 + CDS 89793 - 90644 885 ## COG1161 Predicted GTPases
91 33 Op 2 4/0.045 + CDS 90664 - 91218 601 ## COG0681 Signal peptidase I
92 33 Op 3 1/0.091 + CDS 91211 - 91969 784 ## COG0164 Ribonuclease HII
93 33 Op 4 . + CDS 91970 - 92329 302 ## COG0792 Predicted endonuclease distantly related to archaeal Holliday junction resolvase
94 33 Op 5 . + CDS 92338 - 93183 452 ## COG1496 Uncharacterized conserved protein
95 33 Op 6 . + CDS 93194 - 94132 899 ## COG0668 Small-conductance mechanosensitive channel
    + Term 94236 - 94299 12.1
    + TRNA 94166 - 94238 85.7 # Thr CGT 0 0
    + Prom 94164 - 94223 75.1
96 34 Op 1 . + CDS 94385 - 94723 204 ## PROTEIN SUPPORTED gi|18309686|ref|NP_561620.1| 30S ribosomal protein
97 34 Op 2 . + CDS 94710 - 95456 802 ## CBO2305 membrane protein
98 34 Op 3 . + CDS 95472 - 96968 1262 ## COG0554 Glycerol kinase
    + Term 96977 - 97023 8.6
99 35 Op 1 24/0.000 - CDS 97014 - 97811 548 ## COG0600 ABC-type nitrate/sulfonate/bicarbonate transport system, permease component
100 35 Op 2 .
- CDS 97804 - 98571 215 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein
    - Prom 98652 - 98711 6.0
    + Prom 98602 - 98661 8.8
101 36 Op 1 . + CDS 98686 - 99309 453 ## COG1636 Uncharacterized protein conserved in bacteria
102 36 Op 2 . + CDS 99328 - 100413 1143 ## COG0136 Aspartate-semialdehyde dehydrogenase
103 36 Op 3 . + CDS 100438 - 101481 400 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase
104 36 Op 4 . + CDS 101486 - 103483 1469 ## COG0550 Topoisomerase IA
105 36 Op 5 . + CDS 103486 - 104151 806 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
    + Term 104174 - 104226 12.1
    + Prom 104157 - 104216 2.6
106 37 Op 1 . + CDS 104247 - 104942 260 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
107 37 Op 2 . + CDS 104942 - 106513 1402 ## CLB_3171 ABC transporter, permease protein
108 37 Op 3 . + CDS 106531 - 107007 667 ## COG1803 Methylglyoxal synthase
    + Term 107018 - 107057 3.5
    + Prom 107037 - 107096 6.8
109 38 Op 1 14/0.000 + CDS 107255 - 108277 1340 ## COG0468 RecA/RadA recombinase
110 38 Op 2 1/0.091 + CDS 108279 - 108893 583 ## COG2137 Uncharacterized protein conserved in bacteria
111 38 Op 3 . + CDS 108966 - 110519 1802 ## COG1418 Predicted HD superfamily hydrolase
    + Term 110528 - 110565 6.4
    + Prom 110530 - 110589 2.6
112 39 Op 1 . + CDS 110622 - 111164 625 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
113 39 Op 2 . + CDS 111216 - 111755 325 ## gi|167759050|ref|ZP_02431177.1| hypothetical protein CLOSCI_01397
    + Term 111759 - 111794 4.1
114 40 Tu 1 .
- CDS 111786 - 113105 133 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
    - Prom 113170 - 113229 6.2
    + Prom 113129 - 113188 6.0
115 41 Op 1 13/0.000 + CDS 113244 - 113702 547 ## COG1959 Predicted transcriptional regulator
116 41 Op 2 20/0.000 + CDS 113720 - 114901 1174 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes
117 41 Op 3 1/0.091 + CDS 114923 - 115363 556 ## COG0822 NifU homolog involved in Fe-S cluster formation
118 41 Op 4 . + CDS 115365 - 116447 985 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain
119 41 Op 5 . + CDS 116451 - 117251 743 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family
    + Term 117304 - 117352 1.3
    + Prom 117296 - 117355 7.3
120 42 Op 1 26/0.000 + CDS 117392 - 118405 1332 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase
    + Term 118434 - 118473 6.4
121 42 Op 2 13/0.000 + CDS 118492 - 119685 1402 ## COG0126 3-phosphoglycerate kinase
122 42 Op 3 . + CDS 119753 - 120502 887 ## COG0149 Triosephosphate isomerase
    + Term 120519 - 120554 5.1
123 43 Op 1 . + CDS 120579 - 121205 586 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes
    + Prom 121207 - 121266 2.7
124 43 Op 2 . + CDS 121295 - 122833 1656 ## COG0696 Phosphoglyceromutase
    + Term 122844 - 122888 7.2
    - Term 122832 - 122875 3.2
125 44 Tu 1 . - CDS 122881 - 127125 3025 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain)
    - Prom 127231 - 127290 7.4
    + Prom 127144 - 127203 9.7
126 45 Op 1 . + CDS 127353 - 128222 584 ## Cphy_2901 hypothetical protein
127 45 Op 2 . + CDS 128261 - 129697 1775 ## COG0469 Pyruvate kinase
    + Prom 129702 - 129761 7.2
128 45 Op 3 .
+ CDS 129789 - 130430 843 ## gi|210616214|ref|ZP_03290994.1| hypothetical protein CLONEX_03213
    + Term 130482 - 130513 1.8
    + TRNA 130507 - 130590 70.7 # Leu CAA 0 0
    - Term 130468 - 130503 0.6
129 46 Tu 1 . - CDS 130584 - 130733 64 ##
    - Prom 130795 - 130854 2.6
    + Prom 130516 - 130575 80.4
130 47 Op 1 4/0.045 + CDS 130682 - 133297 2396 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains
131 47 Op 2 . + CDS 133358 - 133888 507 ## COG0237 Dephospho-CoA kinase
132 47 Op 3 . + CDS 133893 - 134693 732 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I
133 47 Op 4 . + CDS 134713 - 135567 857 ## COG1307 Uncharacterized protein conserved in bacteria
    + Term 135575 - 135630 12.2
134 48 Tu 1 . - CDS 135594 - 136817 505 ## COG1323 Predicted nucleotidyltransferase
    - Prom 136841 - 136900 7.6
    + Prom 136794 - 136853 9.3
135 49 Tu 1 . + CDS 136985 - 138316 1530 ## COG2239 Mg/Co/Ni transporter MgtE (contains CBS domain)
    + Prom 138327 - 138386 6.6
136 50 Op 1 21/0.000 + CDS 138418 - 139413 1457 ## COG0280 Phosphotransacetylase
137 50 Op 2 1/0.091 + CDS 139468 - 140658 1446 ## COG0282 Acetate kinase
138 50 Op 3 . + CDS 140735 - 141262 567 ## COG1399 Predicted metal-binding, possibly nucleic acid-binding protein
139 50 Op 4 . + CDS 141266 - 141445 284 ## PROTEIN SUPPORTED gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32
    + Term 141461 - 141508 9.8
    + Prom 141492 - 141551 8.7
140 51 Op 1 3/0.045 + CDS 141572 - 142585 1154 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme
141 51 Op 2 7/0.000 + CDS 142611 - 142841 479 ## COG0236 Acyl carrier protein
    + Term 142851 - 142886 6.0
142 51 Op 3 6/0.000 + CDS 142893 - 143588 573 ## COG0571 dsRNA-specific ribonuclease
143 51 Op 4 10/0.000 + CDS 143596 - 147156 3648 ## COG1196 Chromosome segregation ATPases
144 51 Op 5 . + CDS 147167 - 148111 776 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28
145 51 Op 6 .
+ CDS 148127 - 148540 469 ## gi|210612579|ref|ZP_03289370.1| hypothetical protein CLONEX_01572
    + Term 148545 - 148595 8.7
    + Prom 148558 - 148617 12.4
146 52 Op 1 . + CDS 148654 - 149301 662 ## COG1802 Transcriptional regulators
147 52 Op 2 . + CDS 149321 - 150517 1470 ## COG0538 Isocitrate dehydrogenases
    + Term 150526 - 150565 3.0
    - Term 150509 - 150556 8.6
148 53 Tu 1 . - CDS 150557 - 152473 1840 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
    + Prom 152418 - 152477 6.1
149 54 Op 1 . + CDS 152603 - 153247 730 ## COG2344 AT-rich DNA-binding protein
    + Term 153256 - 153312 3.0
150 54 Op 2 1/0.091 + CDS 153344 - 154843 642 ## PROTEIN SUPPORTED gi|90022317|ref|YP_528144.1| ribosomal protein S15
151 54 Op 3 . + CDS 154840 - 156006 1155 ## COG0787 Alanine racemase
152 54 Op 4 . + CDS 156091 - 156438 239 ## COG2337 Growth inhibitor
153 54 Op 5 . + CDS 156456 - 157295 795 ## COG1253 Hemolysins and related proteins containing CBS domains
154 54 Op 6 . + CDS 157366 - 158094 928 ## COG0217 Uncharacterized conserved protein
155 54 Op 7 . + CDS 158129 - 158890 779 ## Hbal_0324 hypothetical protein
156 54 Op 8 . + CDS 158903 - 160186 1280 ## COG2873 O-acetylhomoserine sulfhydrylase
    + Term 160190 - 160248 13.2
    + Prom 160207 - 160266 5.8
157 55 Op 1 1/0.091 + CDS 160299 - 160979 824 ## COG0740 Protease subunit of ATP-dependent Clp proteases
158 55 Op 2 . + CDS 161044 - 163509 2006 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins
159 55 Op 3 1/0.091 + CDS 163528 - 165198 1834 ## COG2759 Formyltetrahydrofolate synthetase
160 55 Op 4 . + CDS 165216 - 166064 982 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase
161 55 Op 5 . + CDS 166064 - 167194 841 ## EUBREC_1956 hypothetical protein
    + Prom 167206 - 167265 8.2
162 56 Tu 1 .
+ CDS 167309 - 168490 1543 ## COG1454 Alcohol dehydrogenase, class IV
    + Term 168507 - 168539 3.0
    + Prom 168545 - 168604 6.9
163 57 Op 1 5/0.000 + CDS 168629 - 169519 995 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases
164 57 Op 2 . + CDS 169519 - 170907 1706 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases
    + Term 170945 - 170983 6.3
    + Prom 170936 - 170995 2.4
165 58 Tu 1 . + CDS 171035 - 171934 893 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
    + Prom 171942 - 172001 6.6
166 59 Op 1 . + CDS 172045 - 172569 657 ## Dhaf_1199 C_GCAxxG_C_C family protein
167 59 Op 2 . + CDS 172589 - 173509 706 ## COG1275 Tellurite resistance protein and related permeases
    + Term 173511 - 173554 3.0
    + Prom 173523 - 173582 7.3
168 60 Op 1 . + CDS 173629 - 174108 588 ## COG1576 Uncharacterized conserved protein
169 60 Op 2 1/0.091 + CDS 174123 - 174794 527 ## COG2357 Uncharacterized protein conserved in bacteria
170 60 Op 3 . + CDS 174871 - 175716 752 ## COG0789 Predicted transcriptional regulators
171 60 Op 4 . + CDS 175734 - 176516 720 ## COG0500 SAM-dependent methyltransferases
172 60 Op 5 . + CDS 176529 - 177209 507 ## Cphy_0177 hypothetical protein
173 60 Op 6 . + CDS 177213 - 178157 766 ## COG0042 tRNA-dihydrouridine synthase
174 60 Op 7 . + CDS 178163 - 178873 529 ## COG2176 DNA polymerase III, alpha subunit (gram-positive type)
    + Term 178874 - 178904 0.3
    + Prom 178889 - 178948 10.5
175 61 Op 1 . + CDS 179045 - 179725 283 ## Huta_1008 hypothetical protein
176 61 Op 2 . + CDS 179725 - 182340 1975 ## COG0480 Translation elongation factors (GTPases)
    + Term 182344 - 182391 9.1
    - Term 182322 - 182390 11.3
177 62 Tu 1 . - CDS 182397 - 183830 1552 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins
    - Prom 183860 - 183919 8.5
    + Prom 183861 - 183920 12.2
178 63 Tu 1 .
+ CDS 183947 - 184822 612 ## COG0583 Transcriptional regulator
    + Prom 184826 - 184885 6.3
179 64 Op 1 1/0.091 + CDS 185030 - 186895 1764 ## COG1032 Fe-S oxidoreductase
180 64 Op 2 2/0.045 + CDS 186879 - 187586 751 ## COG5011 Uncharacterized protein conserved in bacteria
181 64 Op 3 . + CDS 187519 - 188730 769 ## COG1530 Ribonucleases G and E
182 64 Op 4 . + CDS 188735 - 189217 543 ## COG2606 Uncharacterized conserved protein
183 64 Op 5 . + CDS 189230 - 189826 415 ## COG1011 Predicted hydrolase (HAD superfamily)
184 64 Op 6 . + CDS 189847 - 190260 441 ## COG0517 FOG: CBS domain
    + Prom 190269 - 190328 3.5
185 65 Op 1 . + CDS 190376 - 190684 428 ## PROTEIN SUPPORTED gi|238924055|ref|YP_002937571.1| 50S ribosomal protein L21
186 65 Op 2 . + CDS 190704 - 191036 184 ## PROTEIN SUPPORTED gi|116492579|ref|YP_804314.1| ribosomal protein
187 65 Op 3 14/0.000 + CDS 191040 - 191324 449 ## PROTEIN SUPPORTED gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27
    + Term 191335 - 191379 6.4
188 65 Op 4 1/0.091 + CDS 191405 - 192688 760 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27
189 65 Op 5 7/0.000 + CDS 192712 - 193005 225 ## PROTEIN SUPPORTED gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein
190 65 Op 6 9/0.000 + CDS 193030 - 193635 524 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase
191 65 Op 7 6/0.000 + CDS 193632 - 194219 454 ## COG1713 Predicted HD superfamily hydrolase involved in NAD metabolism
192 65 Op 8 . + CDS 194219 - 194572 467 ## COG0799 Uncharacterized homolog of plant Iojap protein
193 65 Op 9 . + CDS 194604 - 194996 317 ## Clos_1149 hypothetical protein
    + Term 195007 - 195044 6.4
    - Term 194993 - 195032 3.0
194 66 Tu 1 . - CDS 195062 - 195679 570 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases)
    - Prom 195702 - 195761 8.7
    + Prom 195757 - 195816 8.1
195 67 Tu 1 .
+ CDS 195870 - 196235 344 ## EUBREC_1702 hypothetical protein
196 68 Tu 1 . - CDS 196266 - 196412 100 ##
    - Prom 196458 - 196517 5.4
    + Prom 196354 - 196413 9.4
197 69 Tu 1 . + CDS 196455 - 197516 1029 ## COG0582 Integrase
    - Term 197500 - 197546 12.7
198 70 Tu 1 . - CDS 197548 - 197994 577 ## COG1490 D-Tyr-tRNAtyr deacylase
    - Prom 198128 - 198187 8.1
    + Prom 198027 - 198086 6.7
199 71 Tu 1 . + CDS 198130 - 198864 580 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control
    + Term 198870 - 198915 5.1
    - Term 198854 - 198907 13.2
200 72 Op 1 . - CDS 198912 - 199556 585 ## EUBELI_01147 cytidylate kinase
    - Prom 199583 - 199642 5.5
201 72 Op 2 15/0.000 - CDS 199654 - 200577 316 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family
202 72 Op 3 . - CDS 200579 - 201106 490 ## COG0597 Lipoprotein signal peptidase
203 72 Op 4 . - CDS 201122 - 201877 328 ## PROTEIN SUPPORTED gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e
204 72 Op 5 14/0.000 - CDS 201888 - 202433 559 ## COG1799 Uncharacterized protein conserved in bacteria
205 72 Op 6 . - CDS 202462 - 203136 585 ## COG0325 Predicted enzyme with a TIM-barrel fold
206 72 Op 7 . - CDS 203148 - 204530 1284 ## EUBREC_1735 hypothetical protein
207 72 Op 8 . - CDS 204547 - 205491 616 ## COG1242 Predicted Fe-S oxidoreductase
    - Prom 205667 - 205726 6.7
    + Prom 205420 - 205479 11.4
208 73 Tu 1 . + CDS 205564 - 205761 285 ## COG2155 Uncharacterized conserved protein
    + Term 205767 - 205814 12.8
    - Term 205757 - 205798 8.0
209 74 Op 1 . - CDS 205800 - 206738 728 ## COG1686 D-alanyl-D-alanine carboxypeptidase
210 74 Op 2 . - CDS 206749 - 207144 393 ## COG1803 Methylglyoxal synthase
211 74 Op 3 . - CDS 207162 - 208274 1182 ## COG0772 Bacterial cell division membrane protein
212 74 Op 4 .
- CDS 208289 - 208519 223 ## gi|210612730|ref|ZP_03289445.1| hypothetical protein CLONEX_01647
213 74 Op 5 22/0.000 - CDS 208532 - 209323 866 ## COG2894 Septum formation inhibitor-activating ATPase
214 74 Op 6 1/0.091 - CDS 209305 - 209994 584 ## COG0850 Septum formation inhibitor
215 74 Op 7 . - CDS 210015 - 212885 2375 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2
216 74 Op 8 . - CDS 212878 - 213408 601 ## EUBREC_1743 hypothetical protein
217 74 Op 9 22/0.000 - CDS 213422 - 214285 1039 ## COG1792 Cell shape-determining protein
218 74 Op 10 4/0.045 - CDS 214287 - 215309 979 ## COG1077 Actin-like ATPase involved in cell morphogenesis
219 74 Op 11 . - CDS 215329 - 215997 557 ## COG2003 DNA repair proteins
    - Prom 216060 - 216119 9.2
220 75 Op 1 . - CDS 216123 - 217715 1059 ## COG2509 Uncharacterized FAD-dependent dehydrogenases
221 75 Op 2 . - CDS 217708 - 218946 1138 ## COG2081 Predicted flavoproteins
222 75 Op 3 . - CDS 218959 - 220245 1532 ## COG4100 Cystathionine beta-lyase family protein involved in aluminum resistance
223 75 Op 4 12/0.000 - CDS 220239 - 221192 898 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase
224 75 Op 5 6/0.000 - CDS 221194 - 223110 1672 ## COG0323 DNA mismatch repair enzyme (predicted ATPase)
225 75 Op 6 . - CDS 223129 - 225780 2801 ## COG0249 Mismatch repair ATPase (MutS family)
    - Prom 225843 - 225902 8.3
226 76 Op 1 . - CDS 225930 - 226346 316 ## Cphy_3715 hypothetical protein
227 76 Op 2 . - CDS 226432 - 227889 678 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430
    - Prom 227914 - 227973 3.2
    - Term 227933 - 227966 -1.0
228 77 Op 1 . - CDS 227975 - 228151 277 ## BF0576 hypothetical protein
229 77 Op 2 . - CDS 228151 - 228702 387 ## COG0212 5-formyltetrahydrofolate cyclo-ligase
230 77 Op 3 . - CDS 228680 - 228901 343 ## EUBREC_1649 hypothetical protein
231 77 Op 4 . - CDS 228927 - 229286 409 ## gi|153853173|ref|ZP_01994582.1| hypothetical protein DORLON_00567
232 77 Op 5 .
- CDS 229333 - 230151 814 ## COG2240 Pyridoxal/pyridoxine/pyridoxamine kinase
233 77 Op 6 . - CDS 230152 - 231819 1986 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid
234 77 Op 7 17/0.000 - CDS 231839 - 232333 594 ## COG0319 Predicted metal-dependent hydrolase
235 77 Op 8 . - CDS 232333 - 233319 1160 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase
236 77 Op 9 . - CDS 233320 - 234555 678 ## Cphy_2612 putative stage IV sporulation YqfD
237 77 Op 10 . - CDS 234569 - 234835 205 ## TherJR_2425 sporulation protein YqfC
238 77 Op 11 . - CDS 234890 - 235591 700 ## COG0744 Membrane carboxypeptidase (penicillin-binding protein)
    - Prom 235612 - 235671 7.7
239 78 Op 1 . - CDS 235694 - 237040 1223 ## COG0534 Na+-driven multidrug efflux pump
240 78 Op 2 . - CDS 237076 - 237555 354 ## COG0394 Protein-tyrosine-phosphatase
    - Prom 237757 - 237816 7.8
    - Term 237802 - 237841 3.4
241 79 Op 1 . - CDS 237865 - 239127 493 ## COG0582 Integrase
242 79 Op 2 . - CDS 239105 - 239335 164 ## gi|163816186|ref|ZP_02207554.1| hypothetical protein COPEUT_02370
    - Prom 239355 - 239414 3.5
243 80 Op 1 . - CDS 239436 - 239669 260 ## CD0356 excisionase
244 80 Op 2 . - CDS 239716 - 240636 757 ## COG4653 Predicted phage phi-C31 gp36 major capsid-like protein
245 80 Op 3 . - CDS 240657 - 240896 157 ## gi|295108407|emb|CBL22360.1| hypothetical protein
246 80 Op 4 . - CDS 240912 - 243221 1745 ## COG3378 Predicted ATPase
247 80 Op 5 . - CDS 243258 - 243434 214 ## gi|295108405|emb|CBL22358.1| DNA binding domain, excisionase family
    - Prom 243634 - 243693 5.5
    + Prom 243449 - 243508 7.8
248 81 Tu 1 . + CDS 243577 - 244455 654 ## gi|291536674|emb|CBL09786.1| hypothetical protein
    + Term 244496 - 244549 6.3
    - Term 244568 - 244603 6.5
249 82 Tu 1 . - CDS 244620 - 245285 503 ## DET0065 virulence-related protein
    - Prom 245368 - 245427 3.8
250 83 Op 1 . - CDS 245440 - 245697 146 ##
251 83 Op 2 .
- CDS 245648 - 246451 650 ## Emin_0869 hypothetical protein
252 83 Op 3 . - CDS 246451 - 247386 669 ## BCAH820_1014 hypothetical protein
253 83 Op 4 . - CDS 247383 - 249314 1059 ## BcerKBAB4_0821 hypothetical protein
254 83 Op 5 . - CDS 249326 - 249559 280 ## gi|291172169|ref|ZP_06573343.1| conserved hypothetical protein
    - Prom 249637 - 249696 11.3
    - Term 249733 - 249776 4.3
255 84 Op 1 6/0.000 - CDS 249778 - 250905 737 ## COG0270 Site-specific DNA methylase
256 84 Op 2 . - CDS 250910 - 251323 298 ## COG3727 DNA G:T-mismatch repair endonuclease
257 84 Op 3 . - CDS 251307 - 251630 166 ##
    - Prom 251725 - 251784 10.2
    - Term 252111 - 252148 4.8
258 85 Tu 1 . - CDS 252199 - 252456 238 ## COG2801 Transposase and inactivated derivatives

Predicted protein(s)
>gi|330405144|gb|ADLB01000012.1| GENE 1 3 - 258 190 85 aa, chain - ## HITS:1 COG:BH1412 KEGG:ns NR:ns ## COG: BH1412 COG3464 # Protein_GI_number: 15613975 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 4 85 2 83 405 70 40.0 6e-13
MHSHYTNKLLNIEDVIIKKIHHADTFLKIYLETNPHEQVCPCCGSTTKRIHDYRYQTIKD
LPFQLKHCYLVLRKRRYVCKCGKRF
>gi|330405144|gb|ADLB01000012.1| GENE 2 343 - 510 129 55 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MINLKSSTPTFILEPFSFTPSLFEKHPVPFCLKIEKFYFYTMLKAYKYVISYLIF
>gi|330405144|gb|ADLB01000012.1| GENE 3 734 - 997 290 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160880450|ref|YP_001559418.1| ribosomal protein S20 [Clostridium phytofermentans ISDg] # 1 87 1 87 87 116 71 1e-24
MANIKSAKKRILVNETRAARNKAIKSKVKTCVKKVEAAVAANDKTVAAEALRVAIVEINK
AASKGVYHKNTAARKVSRLTKAVNGIA
>gi|330405144|gb|ADLB01000012.1| GENE 4 924 - 1094 97 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MIALFLAALVSFTKILFLADLILANLNTSKLCEFSIIFSLEPDTERSNETYSFIVG
>gi|330405144|gb|ADLB01000012.1| GENE 5 1101 - 1187 62 28 aa, chain + ## HITS:0 COG:no KEGG:no NR:no
MVVNTYFRKKCKTYEQKAKKERKEVWND
>gi|330405144|gb|ADLB01000012.1| GENE 6 1180 - 2145 1128 321 aa, chain + ##
HITS:1 COG:no KEGG:Cphy_2317 NR:ns ## KEGG: Cphy_2317 # Name: not_defined # Def: germination protease (EC:3.4.24.78) # Organism: C.phytofermentans # Pathway: not_defined # 7 316 8 303 307 310 53.0 6e-83 MIEKYSIRTDLALEQKERFESDNVEVQGVVLEEEYDEEREIKITKVIIETENGAKTMGKP VGTYITLEAPNLMIPDEDYHREISGELAKIVCELIKEKKKEYDVLVIGLGNREVTPDALG PYVVDNLIVTRHIIREYGKYAMGEENVNQVSAIVPGVMGQTGMETVEVVNGIVKETKPDF IIAVDALAARSTKRLNRTIQVADTGIHPGSGVGNHRGGITQETLGIPVIAIGVPTVVDAA TIVNDTMENFLTALESSEMLRGVGVVLQGYNAAEKYELIQELISPHLNGMFVTPKDIDDT IKRISYTISEALNILFSNGQA >gi|330405144|gb|ADLB01000012.1| GENE 7 2213 - 3409 936 398 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1620 NR:ns ## KEGG: EUBREC_1620 # Name: not_defined # Def: stage II sporulation protein P # Organism: E.rectale # Pathway: not_defined # 93 396 59 364 365 254 46.0 4e-66 MRANRKTARMLTQGAIICLGSYILYNGSKQIVGEWKGKINETLQLRAEKTFMSGLTYANR EEKSLEEWIANQALEMLPLGSYINGKAVCQSEVEDEMTYEMILAKQEADENEVDANGNLI GKEEKSVQQEAKQETKPVNKSRVDLSVEKLKDFEYLRSHFYTVDSSTFVNPEDLQAEKLL SKNMKLDKNKKGPKILLYHTHSQEAFADSVPGDVNTTIVGVGRYLAKLLNEKYGIETLHH EGIYDMKNGKVDRTQAYERAKGNIQKILKDNPSIEMVIDLHRDGVGKNTRLVTEIDGKPT AKLMFFNGMSRTRTNGKLTVLQNPYIQDNLALSLQMKLEAERRYPGLTRNIYLKGYRYNM HMKPKTLLVEGGAQTNTVAEIMNAMDYLAEILNAVVGE >gi|330405144|gb|ADLB01000012.1| GENE 8 3488 - 5410 1914 640 aa, chain + ## HITS:1 COG:BS_gyrB KEGG:ns NR:ns ## COG: BS_gyrB COG0187 # Protein_GI_number: 16077074 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Bacillus subtilis # 3 633 4 630 638 624 52.0 1e-178 MAKKNTYDADSISILEGLEAVRVRPGMYIGSVSTKGLNHLVYEIVDNAVDEHLAGFCNRI EVTLEKDGSATVSDNGRGVPVGMHQKGVSAARIVYTTLHAGGKFDDSAYKTSGGLHGVGS SVVNALSAYMDVKISRDGAVHHDHYERGIPTIELVDGLLPVIAKTKKTGTTVNFLPDDTI FEKTRFKAEEIKSRLHETAYLNPTLTIVFEDKRGAETERIEYHEPNGILGFIADLNQKKE TVHEPVYFKGEADGIEVEVAFQYVNEFHENVLGFCNNIYNAEGGTHLTGFKSTFTMVMNQ YAREIGVLKEKDANFTGADIRNGMTAIVSIKHPDPRFEGQTKTKLDNQDAGKATSKVTND EITRFFDKNLDTLKKVLSCAEKAAKIRKTEEKAKTNLLTKQKYSFDSNGKLANCESRDAS 
KCEIFIVEGDSAGGSAKTARNRQFQAILPIRGKILNVEKASIDKVLANAEINSMINAFGC GFSEGYGNDFDISKLRYDKIIIMADADVDGAHISTLLLTLFYRFMPELIFEGHVYIAMPP LYKAVTKKGEEEYLYDDKALEKYRRRQKGAYTLQRYKGLGEMDAEQLWDTTLNPETRMLK LIEIEDARMASGVTEMLMGTEVPPRRAFIYENATEAELDI >gi|330405144|gb|ADLB01000012.1| GENE 9 5424 - 7664 2287 746 aa, chain + ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 1 744 2 720 830 527 41.0 1e-149 MQDSQLIRTEYSEVMKKSYIDYAMSVIVSRALPDVRDGLKPVQRRTLYDMYELGIKYDKP YRKSARIVGDTMGKYHPHGDSSIYESLVVMAQDFKKGMALVDGHGNFGSIEGDGAAAMRY TEARLARLTQEAYLSDLDKDIVDFVPNFDETEKEPEVLPVRVPNLLVNGAEGIAVGMATS IPTHNLGEVVDAVKAYMKNNGITTKQLMKYVKGPDFPTGAIVVNKDDLLSVYETGSGKIK LRGKVEVEEGKGGKKKLVISEIPYTMIGAGIGKFLNDVASLVETKKTNDITDISNQSSKE GIRIVLELKKDTDVENLKNMLYKKTRLEDTFGVNMLAVADGKPETMGLKKIIEHHVDFQF ELATRKYKNLLGKELDKKEIQEGLIKACDVIDLIIEILRGSKSIKDVKACLTNGVTDNIS FKSNISKKMASMLRFTERQATAILEMRLYKLIGLEIEALQKEHEETLKNIARYEDILNNY DSMAEVIMEELDYYKKTYGRKRRTVIENAEEAVYEEKKVEEQDVVVLMDRFGYTRSIDTA TYDRNKESADSENKHIIFCKNTGKICMFTNTGKMHQVKVLDIPFGKFRDKGTPVDNLSNY DSAEENIIYICDEEQFRMGKMLFATKYGMIKKVDGAEFVVSKRTITATKLQEEDEVVSIQ PVNENQNIVLQTGGGYFLRFLAEEVAEKKKGAIGVRGIKLQKTDILENVYLFEEGTEIKV PYREKEVTLNRLKLAKRDGMGTKTRG >gi|330405144|gb|ADLB01000012.1| GENE 10 7815 - 8282 504 155 aa, chain + ## HITS:1 COG:lin1358 KEGG:ns NR:ns ## COG: lin1358 COG0779 # Protein_GI_number: 16800426 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 9 155 6 155 155 116 42.0 1e-26 MSKREVYEQKTEELLLPIVEEHGFELVDVEYVKEGGTWYLRAYIDKPGGIAVDDCEVVSR AFSDILDEKDYIEDTYIFEVSSPGLGRPLKKEKDFARSMGEEVEVRTYRAIDRQKEFVGI LKGYDKNTVTIEMEDGSERIFERNDIALIRLAFDF >gi|330405144|gb|ADLB01000012.1| GENE 11 8301 - 9407 749 368 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|17988250|ref|NP_540884.1| transcription elongation factor NusA 
[Brucella melitensis 16M] # 4 345 9 350 537 293 44 6e-78 MNTELLEALTILEQEKDISKETLLDAIENSLINACKNHFGKADNIKVIMNRETCDYSVFA EKTVVENVEDDVMEISLANAKMIDSKFELGDIVQIPVESKEFGRIATQNAKNLILQKIRE EERKVVYDQYFEKEKDIVTGIVQRYVGKNVSINLGKADAMLTENEQVKGEVFKPTERIKL YVVEVKNTTKGPKILVSRTHPELVKRLFESEVAEVKDGTVEIKSIAREAGSRSKIAVWSN DPDVDPVGACVGMNGARVNAIVNELRGEKIDIINWSENPAILIENALSPAKVISVMADPD EKTASVIVPDYQLSLAIGKEGQNARLAARLTGYKIDIKNETQAIESGELPEDYMNTQAVE ETEEIVEE >gi|330405144|gb|ADLB01000012.1| GENE 12 9425 - 9703 216 92 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|206900953|ref|YP_002250931.1| ribosomal protein L7Ae family protein [Dictyoglomus thermophilum H-6-12] # 1 87 1 87 98 87 44 4e-16 MSSVKKIPMRKCVGCGEMKSKKEMMRVLKTSENEFVLDATGKKNGRGAYLCQSKECLAKA IKNKGLERSFKQAIPKEVYEILEKEMETLETE >gi|330405144|gb|ADLB01000012.1| GENE 13 9690 - 10013 310 107 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|240146074|ref|ZP_04744675.1| ribosomal protein L7Ae/L30e/S12e/Gadd45 [Roseburia intestinalis L1-82] # 1 105 1 105 110 124 56 5e-27 MKQSKVLSLISLATKAGRTVSGEFATEKETKSGKAWLVIVANDASDNTKKKFKNMCDFYE VPICFYGDKDTLGHAMGKEFRASLAVTDGGFAKGIMKHLEAENNIIA >gi|330405144|gb|ADLB01000012.1| GENE 14 10031 - 12496 3187 821 aa, chain + ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 1 821 1 729 730 672 49.0 0 MSKIRVHELAKELNKTNKELLDFLKSKEIGVKSHMSSLSDEQIQMVKGAMAGKATKTDGE DAQKKKNIVQVFRPQNSKQNNRPKQNNHKQNQNQNNNNNQGNTKQMNNGQNNPRQNNNQG KRQNNNQGGRYQNNNQNGERKNNNQSGRPNKNFNNSQRRDDRRDDNRRNDKRTPSIPAPV LPEQKPQRQKAKDKDAYKKKDYRQDDREDRKVKQTKKGKPAPQPQKPQPKAEQKVEEKIS MITIPEVLTIKELADKMKIQPSAIVKKLFMQGKIVTVNQEVDFETAEEIAMEFEVLCEKE EVVDVIEELLKEDEEDETTMEKRPPVVCVMGHVDHGKTSLLDAIRNTNVIDREAGGITQH IGAYVANVNGERITFLDTPGHEAFTAMRLRGAQSTDIAILVVAADDGVMPQTVEAISHAK AAGIEIIVAVNKIDKPSANIERVKQELTEYELIPEDWGGSTIFVPVSAKTGEGLEDLMEM 
ILLTAEVLELKANPNRQARGLVIEAELDKGKGSVATVLVQKGTLRVGDAIAAGSAHGRVR AMIDDKGRRVKEAGPSTPVEILGLNDVPNAGEVFVGCANDKEARSFAETFISQGRVKMLE ETKSKMSLDDLFTQIQAGNLKELGIIVKADVQGSVEAVKQSLVKLSNDEVVIKIIHGGVG AINESDVTLASASNAIIIGFNVRPDATAKEIAEREGVDLRLYRVIYNAIEDVEAAMKGML DPVFEEKVLGHAEVRQTFKASGVGTIAGSYVLDGVFERNCSARVVRDGVVIFDGALASLK RFKDDVKEVKAGYECGFVFEKFNDVKEGDQVEAYKMVEIPR >gi|330405144|gb|ADLB01000012.1| GENE 15 12521 - 12892 379 123 aa, chain + ## HITS:1 COG:BS_rbfA KEGG:ns NR:ns ## COG: BS_rbfA COG0858 # Protein_GI_number: 16078728 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Bacillus subtilis # 5 104 2 100 117 84 45.0 6e-17 MRKNSIKNTRVNMEVQRELSNIVRGGIKDPRVAPMTSVVAVEVAPDLKTCKAYISVFGDE MAQEDTLKGLQSAEGYIRRELAHNLNMRNTPEIKFVLDQSIAYGVAMSKKIDDVTKDIKE EGE >gi|330405144|gb|ADLB01000012.1| GENE 16 12927 - 13841 870 304 aa, chain + ## HITS:1 COG:CAC1804 KEGG:ns NR:ns ## COG: CAC1804 COG0618 # Protein_GI_number: 15895080 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Clostridium acetobutylicum # 1 301 16 321 321 152 33.0 6e-37 MAIGGHIRPDGDCVGSCMAMYQYIRTWYPETDVDVYLEEIPNSFRFIGATKEISHEIKEK TYDLFISLDCGDVGRLGFSAPLFERAAETFCVDHHISNRSFGKNNYIKPDASSTCELVFE LMEEEKITKEIAECLYLGLVHDTGVFQYSCTSPETMIVASKLMAKGIDYSKIIHDTYYEK TYVQNQILGRALLESVLFMEGQCIVSVIDKKMMDFYEVSPKDLEGIVSQLRLTRGVKVAI FLYELQTHEFKISLRSGDEVDVSKVAGYFGGGGHKKASGLTMKGTAHDVINNIAKQIELQ LQKD >gi|330405144|gb|ADLB01000012.1| GENE 17 13852 - 14763 888 303 aa, chain + ## HITS:1 COG:CAC1805 KEGG:ns NR:ns ## COG: CAC1805 COG0130 # Protein_GI_number: 15895081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Clostridium acetobutylicum # 4 298 3 288 289 219 42.0 4e-57 MIHGVLNVYKEKGYTSHDVVAKLRGIVGQKKIGHTGTLDPDATGVLPVCLGKATKLCDML TDKDKTYETVLLLGQVTDTQDTGGEVLETKNTDALTEEQVREVICGFVGRYEQIPPMYSA LKVNGKKLYELARQGIEVERKARPVQIHEICIKEINLPRVKMEVTCSKGTYIRTLCHDIG 
QKLQCGGCMEELVRTRVSRFRIEESFRLEQITRLRDEGKLDNILVPIDEMFLQYKKVNVK EKYVSLVYNGNPFFKNQAEETDDLAEEEFVRVYDNQNRFIGVYGYNKEKKMFKLIKMFFD KGE >gi|330405144|gb|ADLB01000012.1| GENE 18 14766 - 15683 393 305 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 17 303 19 312 317 155 35 1e-36 MRYIRGIENYNCPSHTAITLGKFDGLHRGHQKLIEQVKKHADGQIKSVVFSFDMFPFFKE LGKQNYILMTSEEKCLRLEEQVDYLVECPFVEKIHTMDAETFIKEVLMKKFHAKYVVVGT DFRFGYQKKGDIYLLEKYQQICGYKLIVIEKEMYGEREISSTFVKEEIHKGNMELAEKLL GYPYTILGTVEHGEKLGRKLGFPTMNVIPAEEKLLSPNGVYVSSVMIDGKEYRGISNVGC KPTVSDKKEKLVETFLFDYDEDAYGKKIQIRLYSFVREEKKFSSVSALQKQMRSDIEMGK GFFKK >gi|330405144|gb|ADLB01000012.1| GENE 19 15809 - 16075 396 88 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238924297|ref|YP_002937813.1| ribosomal protein S15 [Eubacterium rectale ATCC 33656] # 1 88 1 88 88 157 87 5e-37 MIAKDKKQAIIAEYGRTEGDTGSPEVQVAILTARINELTEHFKANPKDHHSRRGLLKMVG QRRGLLAYLKKTDIERYRSLIERLGLRK >gi|330405144|gb|ADLB01000012.1| GENE 20 16201 - 16629 599 142 aa, chain + ## HITS:1 COG:SA0649 KEGG:ns NR:ns ## COG: SA0649 COG1661 # Protein_GI_number: 15926371 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with PD1-like DNA-binding motif # Organism: Staphylococcus aureus N315 # 1 137 1 137 140 94 35.0 5e-20 MEYRRMNNVIIARIDKGEEILEKIKELALAENIKLASVQALGAIGDFTVGVLRTEEKQYK SNHFHGDFEIVSLTGTINTMDDQFYTHIHLSAGNEKGEVFGGHLNRAVVSATCEMIVQVI DGRVDRKFDEETGLNLFSFFNE >gi|330405144|gb|ADLB01000012.1| GENE 21 16645 - 17775 685 376 aa, chain + ## HITS:1 COG:CAC0890 KEGG:ns NR:ns ## COG: CAC0890 COG5438 # Protein_GI_number: 15894177 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Clostridium acetobutylicum # 39 370 42 374 381 169 33.0 8e-42 MKENIRILLQKMRERKAICVMVFMYFSLLLFVAFDTWLYKTPIVKITEVKTEEKETEEGT RGGEEVHYRQFLKGVVLNGEHKKEIIQLKNEYAESEVTSQKYQKHDRIFVEIREKANTLS GNIKGLKRDTGIVALLGLMVLLLFAVTKRQGIRTIFTVGVNIAIYMVGFAFFLNGEDVLK 
ICNVMVFVFTIGTLIILNGMNRRTLASILSTLCVLAVIMGLFHLVMNTAGETDYAAMEYL GSLDNPSELFEAEVMLAGLGAIMDVTVTISATLGELIRKRPSIRWKELFRSGREVGYDIM GTMISVLLFTFGCGLIPTFLIRMNNDISFLTIAKLNIPFEICRFLIESIGIVIAIPISIF VAASLLKIKRREEAKA >gi|330405144|gb|ADLB01000012.1| GENE 22 17772 - 18527 720 251 aa, chain + ## HITS:1 COG:SA0427 KEGG:ns NR:ns ## COG: SA0427 COG5438 # Protein_GI_number: 15926146 # Func_class: S Function unknown # Function: Predicted multitransmembrane protein # Organism: Staphylococcus aureus N315 # 17 244 19 243 260 113 32.0 4e-25 MIGILALILFVLMLVIGGERGATSVMALAGNIIVLAFTIIFLARGHSPFFILFFATVAIS CVSLFGQNGKNIKTKSAFMAVCIVMFFVTALIYFVVWKYRAGGLNEIQSIQEDVMYYYNV NIQISMMQIAVSVTILSALGAAIDTALSVTSAVHEVAFHKQNLTEKELFHSGIQVGKEII GTTVNTLLFAYLGESLLLFAYLKREKYPLELIVNSKFLFQELSIMLFGAVTCLLIVPVSA YFAARYLAKKA >gi|330405144|gb|ADLB01000012.1| GENE 23 18614 - 19339 525 241 aa, chain + ## HITS:1 COG:CAC1489 KEGG:ns NR:ns ## COG: CAC1489 COG0671 # Protein_GI_number: 15894768 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 21 214 19 212 219 198 51.0 9e-51 MRKWIQGFIEKIMPVGRFFIVIFAFVFNSLLYCGARMIAGDWHHYILTSRLDEKIPLIPE SLFIYFGCYIFWVINYIIIAKQDEERAYQFFFADMISRIICFTIFILFPTTNIRPDIVGG GVWNEGMRFLYRIDAADNLFPSIHCLVSWFCYIGIRGDKRVPKWYQWTSCLIAVSVFVST LTTKQHVIIDVIAGVIIAEGTLWFARHTQFYKGYIRFWKRIADRTFKTGGTTDEKRKEEY I >gi|330405144|gb|ADLB01000012.1| GENE 24 19311 - 20342 875 343 aa, chain + ## HITS:1 COG:CAC3016 KEGG:ns NR:ns ## COG: CAC3016 COG0392 # Protein_GI_number: 15896268 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Clostridium acetobutylicum # 8 332 5 330 337 85 23.0 1e-16 MKNVKKNIFNLVFLFLVFGLTIYGVFKGEDITKVVRVIKEARLEFILFGVVCVVFFIWGE SIIIYYLMNSLSIKLKKWKCFLFSSVGFFFSCITPSASGGQPMQIYYMKKEKIPIPVSTL VLMIVTIIYKMVLVLIGVILLIIGQGFVKKYLTGILPVFYLGVGLNVMCVTFMLILVFHP VLAKAMMKKGLRLLEGLHILKRKEERMQKLEESMDLYNDTAKYLKSHAMVMVNVLLITFL QRIALFLTTYFVYCAFGLSGKSIIDIVLLQAVISIAVDMLPLPGGMGISEQLFLIIFVPI 
FSPKFLLPGMILSRGLGYYAQLFISAIMTVVAQFTLGKKEMIE >gi|330405144|gb|ADLB01000012.1| GENE 25 20356 - 20973 669 205 aa, chain + ## HITS:1 COG:CAC0798 KEGG:ns NR:ns ## COG: CAC0798 COG1183 # Protein_GI_number: 15894085 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine synthase # Organism: Clostridium acetobutylicum # 2 204 3 203 205 138 41.0 6e-33 MIGFYNYSVILTYIGLISSIIGMMFTVNGHYKLAIFCLAFSGLCDMFDGKIARAMKNRTA DEKKFGIQIDSLCDVVCFGAFPVILCYCLGLKDVFGIMILAFYGTASVIRLGYFNVMEEK RQQETTEARKYYEGLPITTMAIILPILYLIKPCMGAYFVLVLHIVMLAVGFLLILRFQLK KPGNKVLAVIVAVVALAVLKLTHII >gi|330405144|gb|ADLB01000012.1| GENE 26 20985 - 21854 672 289 aa, chain + ## HITS:1 COG:CAC0799 KEGG:ns NR:ns ## COG: CAC0799 COG0688 # Protein_GI_number: 15894086 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Clostridium acetobutylicum # 16 285 19 288 291 202 41.0 4e-52 MRYKDREGNLKGKNGFQDKLLKSLYGNVLGRVVIKCLTVPLVSELGGKILDSSFSKVLIP PFVKMNHIDLSLCEQNKFKSYNAFFKRKFKADAREINMQENVFISPCDAKLTVYPIGKES RFCIKHTSYTVEELLKNKKTAAKYEGGYAWVFRLSVEDYHRYCYVASGVKSRNVRIPGVF HTVNPVANDVYPIYKENTREYSLLKTEEFGTILMMEVGALMVGKIENRHEEREVKRGEEK GNFAFGGSTIILLTQKDKVCVDRDILKNTECAYETLVKMGEKIGEQKYV >gi|330405144|gb|ADLB01000012.1| GENE 27 21847 - 23163 1312 438 aa, chain + ## HITS:1 COG:CAC0607 KEGG:ns NR:ns ## COG: CAC0607 COG1362 # Protein_GI_number: 15893896 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 4 421 6 431 433 402 49.0 1e-112 MYRETAKELLSFIEKSYSCFHLIQNMKEELNENGFQELYEAQEWKLKAGGKYYVSRNESS LIAFQIPKENFVGFQIMASHSDFPTFKVKENPEICVENQYTKLNVEKYGGMLCAPWFDRP LSVAGRLLVKEEDGIATKLVNVDRDLVMIPNLAIHMNREVNDGYKYNAQEDMLPLYSNTT EKGGFMAVVAESAGVEKENILGSDLYLYNRMKGSIWGANEEFISSSKLDDTQCAFSSLKG FLQAENKQSVSVHCVLDNEEVGSGTKQGAASTFLKDTLGRINRGLGRTEDEYLRALANSF MISADNAHGVHPNYSNKTDLTNRPYLNGGIVIKYSANQKYTTDAVSAAMFKTICDRVGVP YQNFVNRSDMLGGSTLGNISNTQVAVNTVDIGIAQLAMHSPYETCGIKDTHYLVEVAKEF YQSSVVSDGSGKYHIEKA 
>gi|330405144|gb|ADLB01000012.1| GENE 28 23279 - 23974 938 231 aa, chain + ## HITS:1 COG:CAC1789 KEGG:ns NR:ns ## COG: CAC1789 COG0528 # Protein_GI_number: 15895065 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Clostridium acetobutylicum # 2 230 7 235 236 201 45.0 1e-51 MKRVLLKLSGEALAGDKKTGFDEATCIGVANQVKQLVDSGVQVAIVTGGGNFWRGRTSET IDRTKADQIGMLATVMNCIYVSDIFRHVGMQTEVFTPFVCGAFTSLFSKDAVVEALNSGK VVFFAGGTGHPYFSTDTGAVLRAIEIEADAMLLAKAIDGIYDSDPKLNPDAKKYDEISIQ ETIDKRLAAVDLTASILCMENKMPMLVFGLNEENSIVETMTGTFKGTKVTV >gi|330405144|gb|ADLB01000012.1| GENE 29 24007 - 24558 810 183 aa, chain + ## HITS:1 COG:CAC1790 KEGG:ns NR:ns ## COG: CAC1790 COG0233 # Protein_GI_number: 15895066 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Clostridium acetobutylicum # 1 183 2 185 185 180 57.0 1e-45 MNERVKAFDEKMTKSYNSLVGELATIRAGRANPHVLDKLTVDYYGVPTPIQQAANISVPE PRMIQIQPWEKSMVKEIEKAILTSDIGINPTNDGSVIRLVFPELTEERRKELAKDVKKKG EQAKVAIRNIRRDGNDTFKKLKGTEVSEDEIKDLEDELQKLTDKYIKDIDKAVEEKSKEV MTV >gi|330405144|gb|ADLB01000012.1| GENE 30 24647 - 25357 618 236 aa, chain + ## HITS:1 COG:BS_yluA KEGG:ns NR:ns ## COG: BS_yluA COG0020 # Protein_GI_number: 16078716 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Bacillus subtilis # 3 231 31 259 260 249 52.0 3e-66 MNIPQHIAIILDGNGRWAKAKGMPRNYGHAQGSKNVERICEEAYKMGVKYLTVYAFSTEN WSRPKEEVDALMKLLRNYMKTCLKTAAKNRMKVRVLGDKTKLDDDIRKRIEELEKATVDN DGLNFQIALNYGSRDEMIRAMKKMAADCKEGKLEVEDISESVFEAYLDTHDIPDPDLLIR TSGELRLSNYLLWQLAYTEFYFTDVLWPDFTKKELEKAILHYNNRDRRFGGTKEEK >gi|330405144|gb|ADLB01000012.1| GENE 31 25357 - 26160 920 267 aa, chain + ## HITS:1 COG:CAC1792 KEGG:ns NR:ns ## COG: CAC1792 COG0575 # Protein_GI_number: 15895068 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Clostridium acetobutylicum # 9 241 9 240 245 129 37.0 6e-30 
MFKTRLLSGIVLVIIALITVITGQDLLFGVLLVISLIGMSELYKVVDVHKKLLGFTGYLA GIAYYVCLRFGSKEQIVPLIIGFLVLLMAVYVFSFPKYIAQQVMFVFFGLFYVALMLSYV YQTRMLPQGAFLVWLIFLCSWGSDTCAYCVGMLMGKHKMAPKLSPKKSVEGGIGGILGAA LFGAVYGLAINRFASGADANVLHYAIICGIGSMISQVGDLTASAIKRNHDIKDYGKLIPG HGGILDRFDSVIFTAPIIYYLSVMLMK >gi|330405144|gb|ADLB01000012.1| GENE 32 26174 - 27316 1102 380 aa, chain + ## HITS:1 COG:alr4351 KEGG:ns NR:ns ## COG: alr4351 COG0743 # Protein_GI_number: 17231843 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Nostoc sp. PCC 7120 # 1 376 2 382 399 392 52.0 1e-109 MKKIAILGSTGSIGTQTLEVVRANKDIEVVGMAAGNNISLLEEQIREFSPKTVAVWSEEK ARELRVRISDLPVKVVAGMEGLIEIATIPETEILVTAIVGMIGIRPTIAGIEAGKNIALA NKETLVTAGHIIMPLAKKHGVSILPVDSEHSAIFQSLQGNEKNAIHKILLTASGGPFRNR KKEELEHIQVEDALKHPNWEMGRKITIDSSTLVNKGLEVIEAKWLFDVKMEQIEVVVQPQ SIIHSMVEYVDGAVIAELGTPDMKLPIQYALYYPERRYLPGDRLDFKKLSQITFEEPDME TFYGLRLAYEAGRAGGSLPTVFNAANELAVSKFLNREIRYLEIPEIIGECMRNHKNIMNP SVEEILETEQEVYKQIESRW >gi|330405144|gb|ADLB01000012.1| GENE 33 27322 - 28347 902 341 aa, chain + ## HITS:1 COG:CAC1796 KEGG:ns NR:ns ## COG: CAC1796 COG0750 # Protein_GI_number: 15895072 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 3 337 6 333 339 237 41.0 3e-62 MGIILALLLFSFIVFFHELGHFLLARKNGVYVEEFCIGMGPTIISKQGKETKYSIKLLPI GGACMMGEDDVENTDEKSFNNKSVWARISVIAAGPIFNFILAFILSVIVVAWVGYDKSEI GGIVPNSAAQEAGLQKGDVITEINGKNIHLFREISVYNQFHQGEKVTLEYKRDGKTYESV LTPQKNEQGQYLIGITQAKYKKANAFTALQYGLYEVEYWIETTLESLKMLVTGKIGMDQL SGPVGIVDVVGDAYETNKAYGVSSVIFSLINLSILLSANLGVMNLLPLPALDGGRLVFLF VEAIRGKRVPPEKEGMVHFAGLILLFGLMIFVLFNDIQRLL >gi|330405144|gb|ADLB01000012.1| GENE 34 28404 - 29669 1084 421 aa, chain - ## HITS:1 COG:CAC3244 KEGG:ns NR:ns ## COG: CAC3244 COG3409 # Protein_GI_number: 15896489 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative peptidoglycan-binding domain-containing protein # 
Organism: Clostridium acetobutylicum # 14 420 3 407 437 352 49.0 7e-97 MRSIHTMQNNTADKGSLQLNVTSEITSYPVSDATVDISYTGVPDSQLEQLTTDSSGQTET IELDTPPLEYSLNPSIESQPYSEYTFKISAPGYETMNISGAELLPTVKAIQNVALKPISP DTQQQEIYVIPGHTLYEEYPPKIAEDEIKPMDETGEIVLSRVVVPEYIVVHDGSPRDSTA KNYYVRYRDYIKNVASSEIYATWPASTIQANVLAIMSFTLNRVYTEWYRNKGYDFTITSS TAFDHKWIPNRNIYDTISAVVDDLFANYLSRPNVRQPILTQYCDGRQVTCPNWLTQWGSL SLGEQGYTAIEILRYYYGDNIYINTAQEISGVPSSWPGYDLENGSSGDKVRQLQEQLNVI AGAYPAIPKITADGIYGPATAAAVKKFQSIFGLPDTGITDYPTWYKISQIYVGVSRIAEL T >gi|330405144|gb|ADLB01000012.1| GENE 35 30185 - 30856 742 223 aa, chain + ## HITS:1 COG:SP1633 KEGG:ns NR:ns ## COG: SP1633 COG0745 # Protein_GI_number: 15901469 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 220 1 220 225 244 54.0 1e-64 MYKLLIVEDDMMIAKILENHLEKWGYEVKCITDFEKVYQEFLAFEPHLVLLDISLPFFNG FHWCGEIRKVSKVPIVFISSASDNMNIVMAINMGADDFVDKPFDLNVITAKIQALLRRTY SFQGSVNVIEHKGVVLNLNDASVSYEEKHLELTKNDYKILQLLLENVGKIVSREEIMVRL WENEEFIDDNTLTVNVTRLRKKLETIGVKDFIATKKRIGYIIL >gi|330405144|gb|ADLB01000012.1| GENE 36 30853 - 31851 868 332 aa, chain + ## HITS:1 COG:CAC0225 KEGG:ns NR:ns ## COG: CAC0225 COG0642 # Protein_GI_number: 15893517 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 8 325 6 329 339 194 38.0 2e-49 MMKLVLPYIKEKRKTIFVLLFCGVIIKGIAFLYGANTEDTNYALLICAVSVGIIAVTDFV KYAGKYKRLEHLKKQLAYEMGDFPEAEGLLEQTYQEIIVELHRNRKELISSMDISAKEAE EYYMMWAHQIKTPISVMNLLLQSGECDLKILSAELFKIEQYVEMVLHYLRKKQMSQDMVL NQYFLSDIVKAAIKKYSKLFILQKIKLNFEPMEITVLTDEKWLLFALEQIISNALKYTNE GSVSIYMDKNNANLLVIEDTGIGISAEDLPRVAERGFTGYNGRSNQKSTGLGLYLTKEVL GKLGHGLKLESQVGKGTKVIIDFSREFTPINE >gi|330405144|gb|ADLB01000012.1| GENE 37 32084 - 35089 2340 1001 aa, chain + ## HITS:1 COG:no KEGG:SGO_0107 NR:ns ## KEGG: SGO_0107 # Name: not_defined # Def: LPXTG cell wall surface 
protein # Organism: S.gordonii # Pathway: not_defined # 19 362 422 711 1058 85 28.0 8e-15 MSLLCGVIPWNTFAAEKRSVSKGSVSATVKARKVLQGHVLQDRQFTFELLENGIVIQTAQ NDAAGDITFRPITYKNAGTHRYTIRERNGGQTIDGIVYDSNEYEVNVTVREEDLKAPDQN KIYYGHAPDGEIFVGEYPGERGYEVYCIDQSKALPPKEPEARTKYIVLNDPDSKELESHV TMNRYGDKLAENLKKCFFYFQLFPDKYSSHDRREIVWVATGAYGDNDAQWKEVMKEIFQV SLPEEYHLVLFVPEDKDYHQTLGMGYGVAIKNDVGSDISFGTAPPIFVNKSENTAPQEAK AVIKAKKILEGEKLRDKQFQFELLDEGGGVLQEATNDAEGNIQFQPITYTAPGEYRYKIR EKYYGQTIDGIVHDAKVYDIVVTVTEKDLEGFQNAVKYYGVAPEGNYIFVGEKPGEQKYQ VFCIDGNKTLPPPVVNHHTYKVVTDPELTEMEHYVSINLWGDKLVENLKKIFYYFQVYPN KYGLEEQKNIVWAVTGYFGHDLGQYEKIVNEMFKIELPAEYHLVVFEPQGQDRDFYQPLG MGYGVEVGNRSAEDMEMSLETPVFVNKISNVKTEFTPLAFKKVDGRNPFANENFEFVLHK EDGSVISKAQNNKQYISFPPIAYTAKDAGKTFHYYIDEFTKEGYICDTERVEVTVTVAFS EFAGLVAEGSYKKGNVTSGIGSATFLNKTARDEITFVPEAMKMVDGKVPTAQEVFEFKMY TITEDGQKGALAGETSNIGKSVRFKPITYKKSDVGKTFFYRIEETKKDGFLCDTEPVDIK VTVTQNEMGLLKTEVTYTKGGVQGEKTFHNSSLENKGVISISKTVEGNGADQQEYFRFVM ETFRTDGGDSDGEYDVMFENQEDLEQGQKNDSTLNVTAKKIRFDDEGKAVFYLKHGQTLH IRLPYDYYVSVQETMNRDYDVSLSVEGVPTIPGNGQNGPVEFEYTRKYPDVNIKFVNTKY EVVVPTGISLPSNSMKILAVLALLGSGCMLVLGRKKRRKGI >gi|330405144|gb|ADLB01000012.1| GENE 38 35206 - 35973 314 255 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 225 1 220 245 125 33 2e-27 MSILEVKNIKKIYTTRFGGNQVKALSDVSFSVEPREYVAIMGESGSGKTTLLNILAALDK ATSGKVLLKGRDLSTIKEKEMAAFRRQNLGFVFQDFNLLDTFSLKDNIFLPLVLAGKKYD EMEKRLMPIAKKLGIESLLNKYPYEVSGGQKQRAAVARALITKPQLILADEPTGALDSKA SDELLKLFTEINQDGQTILMVTHSVKAASNASRVLFIKDGEVFYQLYRGNLSNEEMYQKI SDTLTALTTGGDRIE >gi|330405144|gb|ADLB01000012.1| GENE 39 35966 - 37957 1623 663 aa, chain + ## HITS:1 COG:SP0913 KEGG:ns NR:ns ## COG: SP0913 COG0577 # Protein_GI_number: 15900794 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 8 663 7 661 662 336 32.0 1e-91 
MSSSIYRKLAVTNIKNNRRTYVPYIITCVITVMMFYIIYGLTMNKGIGGVPGEVTFREML SMGVSIVGIFSAIFLFYTNSFLIKQRKKEIGIYNVLGLGKRHIAKMLTVEMLIIALISLV GGILGGMCFGKLVFLIMLKILHFDAHMEFAIEPKALVGTILLFSGIFLVSLLFNFFQISL ANPIELLHGSNQGEKEPKTKIIMTIIGIVSLGIGYYLALSTESVMNALGYFFVAVILVII GTYALFTAGSIAFLKMLRKNKNYYYKTKHFTSVSGMMYRMKQNAVGLGNICILSTMVLIM LSTTLAMYTGLEDILSTRFPKQCIVMGDVVKGEDKEQIIHTIDSGVEKALETYNLEAKDE VSYYYTETDATKGKNTLVTNIDDIGPKTLDVSSYCGVRFMTLEEFNQVEKRNEILGENEI LYYGAKGDNVSGDELTINGEKFHVKKMKMKNLDKELQMMIDMYYIVMPDQKAIDTLLGDD KTQKNTTECYYKSLDFKGSKENIMKASEELKKSFPMNADTIRGEFKETEREEFFSLYGVF LFLGLFFGGLFLMATVLIIYYKQISEGYSDKERFAIMQKVGMSKQEVKNSIRSQVMSVFY LPLICAVIHVIVAFKIITKLLSAFALENVKLIAGTTAVTVLVFGIFYTIVFMITTKEYYR IVK >gi|330405144|gb|ADLB01000012.1| GENE 40 38072 - 39037 1261 321 aa, chain + ## HITS:1 COG:BH0465 KEGG:ns NR:ns ## COG: BH0465 COG0530 # Protein_GI_number: 15613028 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # Organism: Bacillus halodurans # 18 321 14 317 318 216 44.0 4e-56 MDMFVAIICLVIGFVLLVKGADFFVDGASSIAKQLHIPAVVIGLTIVAFGTSAPELAVSV SAAMKGSNDIAIGNVVGSNIFNLLIVVGVSAFIYPLHVKKSMIKKDYPISIIAAVLLGVL AMDTLFGKTSMELSRVDGIILLVGFAVFMYLAIREGLKGRAEHKESGEEIEVKYTLGKSV LVCIIGLAGIIIGGNMVVDGAKEIARAFGLSEAFIGLTIVAFGTSLPELVTSIVAAKKGE SDISLGNVVGSNIFNIFFILGVSGTILPMAVANTYLYDIAILIVVSIVFFIPICRKQRVS KGMGAAMVATYAAYMAYLFIR >gi|330405144|gb|ADLB01000012.1| GENE 41 39077 - 40171 674 364 aa, chain - ## HITS:1 COG:FN0973 KEGG:ns NR:ns ## COG: FN0973 COG0079 # Protein_GI_number: 19704308 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Fusobacterium nucleatum # 4 363 1 354 357 237 34.0 2e-62 MSKLQQFHGSDLEAVARYFHLPQEEIICFGANVNPLGLSSSVKNTLAEHLEIISSYPDRD YSSLKKQISHYCHVCPQNIVVGNGCTELISLFIQLLKPRRALLPIPSYSEYEREIALSGG TVEYFPLSQEENFSLNLSSLYKVLEKNIDLLILCNPNNPTSTATCVDDLTRILSFCKEKH IFVMIDETYVEFVEDMDKITAMPLTESFDNLIVLRGVSKFFAAPGLRFGYGVTGNNGLLS QLNSIKNPWSLNSIGAYAGELLLSDKNYIQKSRSLISSERKRICKRLRAFSFIQVYEPAA 
NFILLKICKDNLTSYDVFVHAIQRGLMIRDCSSFVGLEGEFIRFCLMNPEDNDRLLDCLE EILA >gi|330405144|gb|ADLB01000012.1| GENE 42 40302 - 43565 2359 1087 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1826 NR:ns ## KEGG: EUBREC_1826 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 1086 1 1180 1181 382 25.0 1e-104 MKNKIKRISKGDFEVIQPDVQFSETHIAMSVSEGEVYEGSFTLQNRKEGDIRGLIYPSSF RIHFKEQGFQGNPVEIHYTFDGAGMAPGDVENGKFTIVCDGGEYDIAYTAVIERPFVITE YGKVQTLKEFKRLAKADFSEACKLFRSKKFAELLKYEDSRITALYGNMRKWSLDEQALEE FLVGIKQKEKIYLLFDEEKAEFSDYEESKKERAVVTKNTWGYLSIKISASGDFLHVSEES ITTEDFVGENYYLEYVIDAKKLHAGYNYGEIVVETPYETVTYPVTVHQQSSHREKHGEEK LMFGCLLKSYMACVSGRLELKQWTNQAVELVRQLQEIAPDNEFYELLMAHVYLRGGRKEE GKWFLENYNYNRFAIGKKTEIGSYYLFLTGLLAKDESHIKKVTDELNKAYAKHPESWQLL CMILNIDVRYRNYGDRIHILRQQFEKGANQIPLYIEAYICFQEKSSLLKKLGQFELRILD FAAKYSMITKELALYTANLASQEKAYKDRLYQILARSYQLYPEEMILNAICTLLIKGDKK EKSYFIWYERAVEAGLRIAKLYEYYIMSADKERMKKQFPRTVWLYFVHGSSLDYKNTAFL YENVLTYEDEESRLFASYREQMEAFAWRQLEERRINEHLRLLYKRFCNEPDLDMERMRAI YDISHVYIVKTAAPHMKYVLVIEKNGVISQRVAYRKEGTKILLYHKTSRIVWEADDGKHY ADSIPYELTRLCFESQFAEMYKDKELLLETGEEENQTELTLETVKRYGIDFFGEEKVFRF CSKHIREEEKEDDYLTYLCFSLLEREQYDKVTLTYLADYYCGATKDMKKVWNVARGYEVK TDALAERIITQMLYSETMFNEEQIFMDYYRGSVYFRLKQAYLSYISREYVVKDRETKEEI IQVILNEYEKGTELPEVCKIAVLKFYASHSFESEVEDRLKLFLQELCDRQLVFPFYLKYK EEWLREVLLHDKILIQYQAREEGKVKLTYKLNRGTVTEQLQPVYANIYVKEFVLFNKESL TYTFREQAGEAIICEESGVCRQERKVEPVGKYGRLNAMCKMEGDELKDSLFKYQLEEKMA EELFKIY >gi|330405144|gb|ADLB01000012.1| GENE 43 43584 - 44828 874 414 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1825 NR:ns ## KEGG: EUBREC_1825 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 394 8 403 412 177 27.0 8e-43 MQRGYLVGVELNEHTCQISYCEEGQTEPKTAERIPLVISKEGQEWVYGERAGEEGICDLL SLAEKHDMIQVGENTYEGIWLLSLYIQLVLKQFHHIETIVFSLPEIDVDIVHLLKSVGQK LGIAKEKVYVQDNRESFCYYMFHQPKELWQYESALFYCDRSGVRTYMLKKLRTGYEKGRD TFVTVDEVANADWEELAIVYPLANEDGQKEADEQFNQFIKSVFNRKLISSVFLIGEGFEK 
NWYPNSLRTLCNGRRAFQGDNLYSKGACYAACYKDDESRGGLVYLDETKMTEQICLKLRV NGMDAWYPIVQWGTKWYENDKQFEILLEDTEDIEIHVESLAKRSVRAIKVPLGNLPEREK YTVRLQVNVIFQDEHVCEITWKDVGFGNFFGSSGFQTQTVIELGGRDGQFNSLS >gi|330405144|gb|ADLB01000012.1| GENE 44 44803 - 45567 530 254 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1824 NR:ns ## KEGG: EUBREC_1824 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 182 1 182 259 107 36.0 4e-22 MGSLILCHSKRARQPYEITRVHKKIYTLEELCYYICNFPYLIDHTLVNRKLCDWIEEELE KEGLAKQLNDCMKSHGSAEQFVLYILKDSGIYTANELSHIEGTLRQLKNQKDVEKRKYKA DSLLQNGETEAAIRGYLSILHDERDESVEPKFYGKVYGCLGTAYGRLFLYREAGERFLSA FQICEEESMLRAYVYCCRQYMTVEEYEAFLSKHEVYRETDGYLVEREKEWENQISIEDPE KIFRSYKRFYSGQK >gi|330405144|gb|ADLB01000012.1| GENE 45 45750 - 46457 1037 235 aa, chain + ## HITS:1 COG:CAC1391 KEGG:ns NR:ns ## COG: CAC1391 COG0152 # Protein_GI_number: 15894670 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Clostridium acetobutylicum # 1 233 1 232 235 280 64.0 1e-75 MKKLEQLYEGKAKKVFMTDDPDVVIVDYKDDATAFNGEKKGTIVGKGVVNNRMTNHVFKL IEKEGVPTHLVEELSDRETAVKKVDIVPLEVIVRNVAAGSFSKRMGVEEGKELLCPILEF SYKNDDLGDPFINDDYALALGLATQEEIDTIKAYTRKVNEVLKAYFLQADMKLIDFKIEF GRLKDGTIILADEVSPDTCRLWDVHTNEKLDKDRFRRDMGNVEEAYNEVFKRLGI >gi|330405144|gb|ADLB01000012.1| GENE 46 46475 - 47908 1651 477 aa, chain + ## HITS:1 COG:CAC1821 KEGG:ns NR:ns ## COG: CAC1821 COG0015 # Protein_GI_number: 15895097 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Clostridium acetobutylicum # 4 477 3 476 476 644 66.0 0 MSTDRYQSPLSERYASKEMQYIFSPDMKFRTWRKLWIALAETERELGLNITEEQIEELKA NADNINYDVAKEREKLVRHDVMSHVYAYGQQCPKAKGIIHLGATSCYVGDNTDIIIMAEA LKLVKKKLVNVIAELAKFAEEYKALPTLAFTHFQPAQPTTVGKRATLWLQEFMLDLEDLD YVLKSLKLLGSKGTTGTQASFLELFDGNQEIIDKIDPMIAKKMGFEQCYAVSGQTYSRKV DTRVVNVLAGIAASAHKFSNDIRLLQHLKEVEEPFEKTQIGSSAMAYKRNPMRSERIASL SRYVMVDALNPAITSATQWFERTLDDSANKRLSVPEGFLAIDGILDLCLNVVDGLVVYPK 
VIEKRLMSELPFMATENIMMDAVKAGGDRQELHERIRELSMEAGRNVKEKGLDNNLLELI AQDAAFNLTLEDLQKTMDPTKYVGRSKEQVDAFLKNVVNPMLEENKNILGMKAEINV >gi|330405144|gb|ADLB01000012.1| GENE 47 47929 - 48012 84 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCYAHSIFHVRIYKSTDSFHEKGAGRR >gi|330405144|gb|ADLB01000012.1| GENE 48 48009 - 48797 811 262 aa, chain + ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 247 1 243 257 143 37.0 4e-34 MNNEQLMETAMLAGEIMLCSGAETYRVEDTMSHILKNADAEQKEVLVMMTGIMATLKKEG EKPSTIIKRVTDRGTNLNRVVRVNEVSRKLCSGELTVDKAYQELKKIASGRYKEFTSPLY HFSATIIAVIGFTMMFGGKVNEIWTSAIVGALLGFCMEAGRRLEVHDFMMDAISSMAITV LTILLKAYLPVSINMDTIIISAIMPLVPGVAITNAVRDTLQGDYITGSSRMLEAFIKAAS IALGVGIGMMLFGSKFIGRTVL >gi|330405144|gb|ADLB01000012.1| GENE 49 48794 - 49267 563 157 aa, chain + ## HITS:1 COG:SA0699 KEGG:ns NR:ns ## COG: SA0699 COG3610 # Protein_GI_number: 15926421 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Staphylococcus aureus N315 # 8 138 11 141 164 82 35.0 3e-16 MILQIIGAFIAIFGFAILLEIPKKYLVLAGVVAAIGWSIYLLSEAMGTGAVFASFFSALT VTLVSHLFARTMKTPVTVFLIAGIIPTVPGAGMYRIAYYMIMGDSEMYAHYFTETLKIAG VIALAIFIMDTIFKFFMKNGIKQNSLSYTKRKMNKRK >gi|330405144|gb|ADLB01000012.1| GENE 50 49385 - 50674 1453 429 aa, chain + ## HITS:1 COG:ML2336 KEGG:ns NR:ns ## COG: ML2336 COG1167 # Protein_GI_number: 15828259 # Func_class: K Transcription; E Amino acid transport and metabolism # Function: Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain (MocR family) and their eukaryotic orthologs # Organism: Mycobacterium leprae # 1 428 37 462 463 384 44.0 1e-106 MVAYSELSKEELLKLKNELEEQFSEVKAKGLNLDMSRGKPSAAQLDLAMGMMDVLNSKSD LKCQEGVDCRNYGVLDGIQEAKQLLADMMEVPKDNIVIFGNSSLNVMYDTIARSMTHGVM GSTPWCKLDKVKFLCPVPGYDRHFAITEHFGIEMINIPMTESGPDMDLVEKLVSEDEAIK 
GIWCVPKYSNPQGITYSDETVHRFAKLKPAAKDFRIYWDNAYGIHHLYEDKQDYLIEILM ECKKEGNPDMVYKFCSTSKVSFPGSGVSAIAASDANLVAIRKQMTIQTIGHDKLNQLRHA RFYKDIHGMVKHMKLHADILRPKFEAVLEVLEEELGGLGIGSWLAPRGGYFISFDAMEGC AKAIVAKAKEAGLVMTPAGATFPYGKDPKDSNIRIAPSYPTPEELKIASEIFVLSVKLVS IDKLLEEKK >gi|330405144|gb|ADLB01000012.1| GENE 51 50752 - 51555 652 267 aa, chain + ## HITS:1 COG:CAC2265 KEGG:ns NR:ns ## COG: CAC2265 COG2966 # Protein_GI_number: 15895533 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 14 267 5 257 257 152 37.0 7e-37 MEEKKVPEYEEMHKVLNFALDAGRILLKNGAEIFRVEETIDYICKRYHIKEVDTFVLSNG IFITAEKEGKEMFAKVKHIPLSGTHLGIVTAVNDLSREITARRISLDEAIEKLKEIENMP PKKRYFRIFAAGMGSGGFCYLLQGNLFESAVAFVIGMILYTFVTFTEKHPISKVIVNIVG AGLVTVLALLVCNYCFPFLRRDKIIIGSILPLVPGVAFTNAIRDIARSDFISGTVRMIDA VLVFVYVAIGVGVVLTLYQNMLGGFAL >gi|330405144|gb|ADLB01000012.1| GENE 52 51552 - 51995 361 147 aa, chain + ## HITS:1 COG:lin0586 KEGG:ns NR:ns ## COG: lin0586 COG3610 # Protein_GI_number: 16799661 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 145 7 152 152 77 33.0 1e-14 MISDIISSFMGTLAFSILYNVDKKFYFYCGLTGTAGWLCYRMAVDFSSPAVASFIGTLMV VLISRIFSVWKKCPITVFLISGIFPLVPGASVYYTAYYFVTGDVALASQMGISSVKIAFA IVLGIIFIVSMPRQWFSFRYWKQKIMK >gi|330405144|gb|ADLB01000012.1| GENE 53 52063 - 53520 1755 485 aa, chain + ## HITS:1 COG:CAC0990 KEGG:ns NR:ns ## COG: CAC0990 COG0008 # Protein_GI_number: 15894277 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 2 485 4 485 485 641 65.0 0 MSKVRTRFAPSPTGRMHVGNLRTALYAYLIAKHEDGDFLLRIEDTDQERLVEGAIDIIYR TLEKTGLIHDEGPDKDGGYGPYVQSERNASGLYLKYAKQLVEQGDAYYCFCDKERLESLK STVAEGGTEIVVYDKHCLHLSKEEVEKNLAEGKPYVIRINMPTEGTTTFHDDIYGDITVP NAELDDMILIKSDGFPTYNFANVIDDHLQEITHVVRGNEYLSSAPKYNRLYEAFGWEVPT YVHCPLITDESHKKLSKRCGHSSYEDLLEQGYLTEAIVNYVALLGWSPTDNREIFSLEEL 
VKVFDYHHMSKSPAVFDTVKLKWLNGEYLKAMDFDKFFELAKPYIEEVVTKDYDLKKIAA LVKTRIEIFPDIKEHIDFFEELPEYDTAMYTHKKMKTNEETSLEVLKEILPLFEAQEDYS NDALYATLKGYVEEKGCKNGYAMWPVRTAVSGKQNTPGGATEIMEILGKEESLRRIRRGI ELLSK >gi|330405144|gb|ADLB01000012.1| GENE 54 53530 - 55365 1299 611 aa, chain + ## HITS:1 COG:BS_yjcD KEGG:ns NR:ns ## COG: BS_yjcD COG0210 # Protein_GI_number: 16078247 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Bacillus subtilis # 1 609 134 750 759 336 33.0 9e-92 MGFNKAQTEAITHQNGPMLVLAGPGSGKTLVITKRIEYLIQKRKVRPEEILVITFTKAAT NEMRERFRKLMQGQHFPVTFGTFHGIYYGILKWAYGLSAENIFSEEEKIRLLKEILEHTD AEIEVDDEKDFLEGITGEISCIKNNQLEIKEFQSVYCPSTVFREIYYTYERERNRRKKLD FDDMLVLCHDLFIKRPDILEKWQERYKYILIDEFQDINKVQYDVIRMLALPQNNLFIVGD DDQSIYKFRGARPEIMLGFEKDYPNAKKVLLDINYRSTKAIVNGAKKIIQKNRNRYVKNI ITSNEQGADIHVQEVRNLSEESEYVLNEIREEMKKGVPASEIAVLFRTNMEPRMLAETFM EYNLPFQMKEHLPNLYEHFIGRNFCAYMRMAVGKRERKDFLEVMNRPVRYISRSSVEKAE VSFESLRRFYCDKEWLLDRIDQLDVDLRIMKTMTPYAAIQYIRKKIGYDDFLKDYAYTRK IKVEDLYEIANEIQTRAKEYKTIEDWFSYIEKYTEELKKRSKQTESNPNAVTFLTMHSAK GLEFHTVFIIEANEDICPYKKAEVVEDMEEERRMFYVAMTRAKKKLYISYVRERNGKTML PSRFVNDLLFE >gi|330405144|gb|ADLB01000012.1| GENE 55 55582 - 60099 4964 1505 aa, chain + ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 40 637 9 565 1014 431 41.0 1e-120 MKKRNQWVAAALALSVVMSGIPAESLSAGKPSDVFKGEEWFDQNDVFEVNREDAHASFTG FDNLESVKSPQKRQQKEKSPYYLSLNANKGDKDGWKFKMVSNPSERDMEFYKPETDVSGW DSITVPSNWQTEGYDSPKYTDTRLPWEGVEDPRANYGLPEDSPRGISPTIYNPVGHYRRT FTTPENWDGKEVFVSFQGVESAFYLWVNGHKVGYSEDSYTPAEFDISKYLNPAGEENTIA VQVYRWSDGSYLEDQDFIRLSGIFRDVFLYAKDKQASLFDFAYTTDLDENYTNAELSVET VLRNYDEKADTSGYKVKTFLYDASGKTVKEKELGDLKFEQVPGEKFRQAKVSYKENVEKP LLWSAENPNLYELAFVLYDNEGNIVETAGTHVGFREVEIVRKGTNKSQILINGAPIMLKG VNRHETSKETGRHISEESMIEDIKLMKQYNINAVRNSHYPNEARWYELCDEYGLYMIDEA 
NIESHGLNDYIPQSDPQWISVCKDRMTSTIERSKNHASIISWSLGNESYGGDVWAELGKL CKELDPTRFVHYEGWRDIEEVDVWSRMYRRVNDPTSTDKIKNPLGWWGENGTKPAFQCEY AHAMGNGIGNLDEYWDMYEKYPNLQGGFIWDWVDQTIELPTPTNKVLKDEGKNNLEVLMD GSLESGENGKAMRGYAKVYNDKSLHLAGNQAFTLEADVKPDSFKESRNNDGSGQLDPSDY NKTAPIITKGNDGWKCTESYGLRRLVEGDKDVLEFYIYNQNWNEEEGAYEKVSAQYPLPE NWADEWHHVAGSFDGTNLKLYLDGKEVASAKSSAGIAGGPNQVGIGADVTFDAQNPNVPD TFKGLIDNVHIYNKVLSLEEINDKTRKADDSTLLWLNFDNTTAKTYEQDKYYSFGGDWQS IPEGNPNNKNFCANGLVSADRTVQPELQQVKYIYQNVGIEDEDLMNGKVKISNKYLFQNL NEFKGKWELAEDGKVIQKGTFSDEDLNIAPVNDYPVNNAKQDVKEYNEKVVQIPFTKPDL KAGGEYFINISLELKEDTSWAKAGHEVAYRQLPLSYKVPEKETAELDKMSEIQVTEKETK TTVNGKDFTLEFDKTKGTIESFVYQGTALLENGPEPNFWRAPTDSDLGFYSGLEMDTWRY AGQDKVVTDVKTEKIDDKAVRFTVTSKLPTTNESLYKQTFTVYGTGDVKVDSLLQPGKDL PMIPVVGNMLTLPKEFSNVTWYGKGPDENYVDRQSGYEVGVYKKDVKDFFIDYIKPQETG NRTDTRWVSLTNDKGVGLLAKADGVMEFSALYYTPEQLSNALHSYLLEENDSISLRLNQK QMGLGGDNSWGARPYDPYLIKADKPYEYSFTLKPVNTADVDKTMADSKIQMPEAEDGEAQ KVNKDKLAEAIKEAEAVDTKLYTAKSVEKMNDALKKAKEVYADKEATQKEVDAAEKELRD ALKNLVKKTEGQNNNGQGGNKPNKPNKPVKTGDATPFTLLFGGLAVSLGTILGLKKRRDK LDTEE >gi|330405144|gb|ADLB01000012.1| GENE 56 60152 - 60484 595 110 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1820 NR:ns ## KEGG: EUBREC_1820 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 18 106 14 103 103 65 53.0 5e-10 MSEHNCNCGHDNCNCHDEEVTVTLTLDNDEEVECVVLTIYEAGGKEYIALLPIDEDGDNE EGDVYLYRYAEVDGEPTLDNIEDDDEYEVAADAFDEWLDAQEYEELSDED >gi|330405144|gb|ADLB01000012.1| GENE 57 60632 - 61627 1464 331 aa, chain + ## HITS:1 COG:CAC1453 KEGG:ns NR:ns ## COG: CAC1453 COG1879 # Protein_GI_number: 15894732 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Clostridium acetobutylicum # 4 331 3 325 325 168 33.0 2e-41 MDIKRVKLVVLAVLCAFTFAGCKKNVGAEEDNPVNTEQESDEEKVTYKFGFSCIAKENPY YVTLADSIRDSLKEEGHTVIGMNKDTHLSSQEQIEQINEMIEQGIDAIFLSPVDWKEITP ALQALREADVKIINIDTKVADSDYIDAYIGSDNKNAGTVCGKDLIEKCEKGGKIVILESP 
TMSSVNERITGFEEAIAGSPFEVVARKDVKGDLTLAMEETKKILKKNPDVVAIMCGNDPS AIGAHVAANELGIKGIKIYGVDGSPELKKELEKPNTLIAGTAAQSPINLGKLAVKTAVNM LNGEDYEEETYLDTYLITKENVEMYGTDGWQ >gi|330405144|gb|ADLB01000012.1| GENE 58 61649 - 63340 1012 563 aa, chain + ## HITS:1 COG:slr1857 KEGG:ns NR:ns ## COG: slr1857 COG1523 # Protein_GI_number: 16330244 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Synechocystis # 6 562 20 707 707 386 33.0 1e-107 MEHCKGMPFPMGTAIVGDSVNFSTTAASGKKCFLLLYKKGGRIPSYEFEMKEECSFGEIR SIMLKGMDWADWEYNYKIDNEIVTDVYAKAVTGKEEWAKEENLLTKGRIFADTYNWENDT PLCIPENEVIAYSLHVRGFTKHSSSKVKHGGTFLGVKEKLPYLKRLGVNQIHCMPIYEFE ENGRYTNYWGYGSAFCFAPKSSYAASKDSVSELKDLVKACHKSGIEIVFDLPFTAEMSKR MIADCIRYYKMEYHIDGFIVNPFHAPFEELQKDPLLKNTKILRKQDDFQNTMRRFLKGDE GMVESVMWWTRHFSKEEGIFNYITNHTGFTLSDLVSYDGKHNEKNGEKNQDGPDYNYSWN CGAEGPSRKRAVVTLRKKQIRNAFLLLLTSQGIPCILAGDECHNSQEGNNNVYCQDNEIG WVNWGNAEKDRELFNFVQSLIQFRKEHTVLHAPYELKGMDIVSCGIPDVSYHGEYAWQIP SEISSRQLGIYYCGETLGTDSCFVAYNMHWLKHSFALPALRKGRKWYRAVSTEEGVLKEM EELDNQREVTVAERTIVIFVGKE >gi|330405144|gb|ADLB01000012.1| GENE 59 63344 - 63988 506 214 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1055 NR:ns ## KEGG: EUBREC_1055 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 210 7 250 260 105 34.0 1e-21 MKIGLNSFHLKLIAVITMIIDHIGLFFFPEHILFRIIGRLSFPIFAFLIVEGFYHTRDIR KYMIRLAGLGVISEIPFDLLTTGKFFDLRHQNVFFTLLIGLILLYGYEKQYSTFSKVSFA FLILIAGDLFRVDYGAWGVLMIFCFFIFRERMWAKIVSVAVINVVVFGYIQAFAVLALLP ICLYNGEKGRGYKYFFYLVYPVHLWIIWIIKTMI >gi|330405144|gb|ADLB01000012.1| GENE 60 64001 - 64540 590 179 aa, chain + ## HITS:1 COG:no KEGG:Cbei_2682 NR:ns ## KEGG: Cbei_2682 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 176 1 174 185 213 63.0 2e-54 MKAWKHFCTITHHKKLVMQYCFRVGLYKQGLLHDLSKYSFTEFRVGCKYYQGTRSPNNAE RETIGYSSAWLHHKGRNKHHYEYWIDYGVGKEKGLIGMPMPKNYVVEMFMDRIAASKTYM 
RENYTDRKPLEYYEQGAKYVRKMLHADTRELLETLLHMLAERGEEVTFAYIRKEVLNKR >gi|330405144|gb|ADLB01000012.1| GENE 61 64575 - 65054 579 159 aa, chain - ## HITS:1 COG:CAC2942 KEGG:ns NR:ns ## COG: CAC2942 COG1854 # Protein_GI_number: 15896195 # Func_class: T Signal transduction mechanisms # Function: LuxS protein involved in autoinducer AI2 synthesis # Organism: Clostridium acetobutylicum # 1 159 1 158 158 249 71.0 1e-66 MEKITSFTIDHLKLKPGVYVSRKDSVGAEIITTFDLRLTSPNDEPVMNTAEMHTIEHLAA TFLRNHKIFGEKTIYFGPMGCRTGFYLLLAGEYESSDIVPLLIEMFRFIADFAGEVPGAS AKDCGNYLDMNLPMANYIAKKYLHDVLENITQEQLIYPD >gi|330405144|gb|ADLB01000012.1| GENE 62 65188 - 65820 553 210 aa, chain + ## HITS:1 COG:FN0808 KEGG:ns NR:ns ## COG: FN0808 COG0406 # Protein_GI_number: 19704143 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose-2,6-bisphosphatase # Organism: Fusobacterium nucleatum # 1 190 1 189 206 95 34.0 6e-20 MRLYIMRHGETDWNKEKRLQGQSDIELNEFGRNLAYKTKEGLKDVQFDLVITSPLKRARE TALIVKGDREIPVIEDARIEEMCFGEYEGLYCKGEKFNIPDEEFKRFFDAPESYKASKGG EDFSEFNERIEHFLDDLFHNRDYQDNTILISVHGAVLCAILRIVKKNPMKLFWQGGVHKN CAVTIIRVEDTIPTIEEENIVYYEDEVEDW >gi|330405144|gb|ADLB01000012.1| GENE 63 65841 - 66845 1162 334 aa, chain + ## HITS:1 COG:CAC0626 KEGG:ns NR:ns ## COG: CAC0626 COG0180 # Protein_GI_number: 15893914 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 329 9 338 343 396 56.0 1e-110 MINDKKVLFSGMQATGNLTLGNYLGALKNWITLSDEYECFYSVVDMHSITVRQDPATLRK RARALLTLYIAAGLDPEKNCIYYQSHVSGHAELAWILNCYTYMGELNRMTQFKDKAAKHA DNINAGLFTYPVLMAADILLFQADVVPVGIDQMQHLELTRDIAQRFNGIYGDVFTVPEAY IGKVGAKIMSLQEPTKKMSKSDENPNGSIYLMDDPDTIMRKCKRAVTDSEAQILYRDEQP GVKNLIDIYRACTNKTVDEVVKEFDGKGYGDFKMAVGEAVVSVLKPLQDEVARLEKDKGY IDSIIKNNAEKANYYATKTLRKVQRKVGFPDRIR >gi|330405144|gb|ADLB01000012.1| GENE 64 66876 - 67547 498 223 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|169350673|ref|ZP_02867611.1| ## NR: gi|169350673|ref|ZP_02867611.1| hypothetical protein CLOSPI_01446 
[Clostridium spiroforme DSM 1552] # 35 219 40 224 226 184 50.0 4e-45 MGILKKVFVSCLIASLFVLSGCQKEEMKEPTKQAKKLMIVAHPDDETIFGGKHISKGGYF IVCLTNQNNSVRRAEFNAMLDLSKNEGVILDFPDKTHGKRDNWKKSKEQIEEVITHYVKE KKWESITTHNADGEYGHIHHKMTHQLVTKVCEREKQTDNLFYFAPYYKKQDLKKHTLTPM SEKDLKTKTTLADVYISQEKVCNNLKHIFPYEEWTSYKDAYKK >gi|330405144|gb|ADLB01000012.1| GENE 65 67535 - 68476 670 313 aa, chain - ## HITS:1 COG:SP1606 KEGG:ns NR:ns ## COG: SP1606 COG0463 # Protein_GI_number: 15901446 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 4 308 2 306 320 274 44.0 2e-73 MKKISVIVPCYNEEDSLPLFHREVTKVLREIEDADYELLFVNDGSKDHTIDVIKLLAIKD SHISYFSFSRNFGKEAAMYAGLSHADGDYCVIMDADLQHPPSLLKDMFYSVDKEGYDCCA GKRLDRTGEGKVRNFLSHSFYRVIQRLTHMDMSDGAGDFRMMSRQMVNAILEMKEYNRYM KGLFSFVGFETKWVEFHNVERIAGETKWTLRSLFAYACEGIFSFSTKPLMISGMFGSFLL VISFILAGYIGVDTLFFHHKFNGLLAITLLILVLSSIQMIFVSILGQYVSKDYMENKNRP IYIIKESNKPWVS >gi|330405144|gb|ADLB01000012.1| GENE 66 68463 - 68876 299 137 aa, chain - ## HITS:1 COG:lin2694 KEGG:ns NR:ns ## COG: lin2694 COG2246 # Protein_GI_number: 16801755 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 10 129 22 145 145 67 35.0 8e-12 MEKKTLTEFIRYCFIGGCTTAINYLVYIGFLFFFQKHYLFANTVAWIFAVLFAFYANKYF VFQKTEKSDREALSFFSMRLLTLLIENVLLYICIQQLLIHSLIAKLVVSVITVLANYVLC KYKIFTEEKGGMLYEEN >gi|330405144|gb|ADLB01000012.1| GENE 67 68858 - 71290 937 810 aa, chain - ## HITS:1 COG:BS_yfhO KEGG:ns NR:ns ## COG: BS_yfhO COG4485 # Protein_GI_number: 16077927 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 472 795 465 808 819 65 19.0 4e-10 MRKLQKNYPYILLTFFSCCLIFIIWPNNSSFGSTTDWVQQHIKIADYFRTLFFENGQFFP DYAAHLGGGVNIYALSYYGLFRLDVLVSYLIPSVEMEYIVVTYILLEFVASINLCYYWLR KQNIEKEICFVSAILLATSSLLFQSHRQIMFVNYTPYLFCTLIAIDNFLQTKKIRWISLS IALVILHSYFFSVTAIILCTGYFLLRLENINGKIKLKEQKSVLVKFIGAISLGIGMCAVL 
LLPTAYYMLHIQKDRGSTLSPIEIFGITPSLKGLLYYSYGAGLTIIVLFVLILSLQYKRT KKMGIILLICLCSNLVSYLLNGTLYVRYKVYLVFLPFILYAFAKTLQEMYHSEKKIHFSP LLLACIPVVTIYLFDHKKEEVPLLLDVVVALFLLLLFYRKKNTRYLLLACVLPFIICVRL NANEVYPQKEKTVFSDNELADLCSAYPGRFLDTTTGLLNVNHIFSPDTYKTTMYSSLSNG EYNTFYYDYMRNPMSIRNRVVLSSNPNVLFQLLMNVTTMETRKETLPIGYEILEEKKETV LARTKDAMPPAYVTDSLYDKRQFEKLSYPENIDVLSQSAVVDTDKTDFQKRTKKENFSLA IDGKRNRKKIVRLPKTYKNTVLLLSFDISRKDGREVVIDINGIRNKLSGKYANYPNNNTT FTYILSSNQPIRELRISASKGNYHIKNVNLYSLPYGNIQKRNQSVDTVNLEKPIDREVLK GNVTVSKEGYLITSIPYEKGFTAYMDGQKIPVEKVNTCFVGFPIKEGKHDISIVFHAPAK RLGLMISGISLVIFIILFIYEEKLLWKRKH >gi|330405144|gb|ADLB01000012.1| GENE 68 71462 - 73837 2353 791 aa, chain + ## HITS:1 COG:FN0847 KEGG:ns NR:ns ## COG: FN0847 COG0457 # Protein_GI_number: 19704182 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 10 577 3 575 599 224 29.0 5e-58 MNEKNILRQLDLLFAEKKIYEVEPFLHNCILQAKQEKDKETLLTLYNELTGYYRSVGRTR EAIESGEQADKLIAELDLTGTEFHATTLVNTATAYRAGEHYERALELYQRAEKIYKKVSN YPDIKRAGLYNNMSMAYQGKNQCDKAITYLYKALEIIKALPEYRVETASTYTNLSTVYFE LEEYEKGITALQEALKLYEEEETKDSHYSALLASLAHGYYLKKDYGQSVAYYTNALKEIL EHYGECENYAITCENCAIVLEESGYGKQAQYLRQKAWAARVKQKKGMEISRMYYEMFGEK MLKEKFPKYFDKITVGLMGHGSECFGFDDTFSRDHDFGPAFCIWLEEEDYKKIGSDVQRE YERLPKAFGNLPPREITTHGKNRVGVLNVSTFYEEFLGKELYEVLLYPDKHTKEEKERAW FSVSETALAQVTAGELFKDGEGRFSYVQKELKKGYPREVWIRKIAQMTALTAQAGQYNYL RCVKRGEYAAGEMALREFVQAGCNLIYLLNNTFMPYYKWVFRKMENLPKLSEAKVWFDKL FSVQTPEEKSFIIEKICAMILQELKEQQLTEGEEDFLECHVERILRRENKMYVIEEIVKL EWTMFQKVRNTGGRASCQDDFDTFDIMRKSQFSVWNGELLNSYYKDLKEGEKCGRNLVME KYAYMMESASKEEYDGIKENLPAVGEQKLKIIEGIIPIQVEWREEFAKKYPHLSGQARLI HTSEDEMNNISFETYLRGELKTYSMETLVRYAAMVVEFAKKEINMVEEIMRQTTEYYGYT TMEEAEQKQMM >gi|330405144|gb|ADLB01000012.1| GENE 69 73889 - 75526 1798 545 aa, chain - ## HITS:1 COG:BS_ykpA KEGG:ns NR:ns ## COG: BS_ykpA COG0488 # Protein_GI_number: 16078507 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # 
Organism: Bacillus subtilis # 1 532 1 533 540 771 70.0 0 MISANNVTLRLGKKALFEDVNIKFTAGNCYGMIGANGAGKSTFLKILSGQIEPTSGEVVI SPGERLSFLQQDHFKYDDYLVLDTVIMGNARLYEIMKEKEEIYAKEDFTDEDGMKASELE GEFASMNGWEAESDAATLLNGLGIETDLHYKYLRDLTGAEKVKVLLAQALFGNPDILLLD EPTNHLDLDAIAWLEEFLINFENTVIVVSHDRYFLNKVCTQIADIDYGKIQLYAGNYDFW YESSQLIIRQMKEANRKKEEKIKELQEFISRFSANASKSKQATSRKKALEKIQLDDIRPS SRKYPYIDFRPNREIGNEVLTVENLSKTIDGVKVLDNLSFTITREEKVAFVGGNEFAKTV LFKILAGEMEADEGTYKWGVTTTQAYFPKDSSKEFDNDYTIVDWLTQYSENKDATYVRGF LGRMLFAGEDGVKKVKVLSGGEKVRCLLSKMMISGANILILDEPTDHLDMESITALNNGL IKFPGVILFASRDHQIVQTTANRIMEIVPGGKLIDKITTYDEYLASDEMARKRQTYSVDN SEVDN >gi|330405144|gb|ADLB01000012.1| GENE 70 75739 - 76593 583 284 aa, chain + ## HITS:1 COG:CAC2424 KEGG:ns NR:ns ## COG: CAC2424 COG4667 # Protein_GI_number: 15895690 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Clostridium acetobutylicum # 6 283 6 280 283 226 42.0 3e-59 MKKGTLILEGGAVRGVFTSGVLDYLMEKDLYFSHVVGVSAGTCNGVNYVSRQIERTKKCM IHQEDEYDYYMGIRKFIKEKSLLNMDMIFDKFPNEIFPFDYDTYFNSDIYTEWVTTNCLT GKAEYMDSRESKEQLMKICRASSSMPLISPIVNIDGIPYLDGGLSDSIPIRRAMKYGNKK MVIVLTKNKGYRKKFVTKAKRKLYESAYKKYPELVKTLIKRPVIYNRTLDKIEQLEEEGK IFVIRPEVAMISRLERNMEKLEEFYRHGHEEMERRYDELTEYLG >gi|330405144|gb|ADLB01000012.1| GENE 71 76644 - 77390 616 248 aa, chain - ## HITS:1 COG:RC1328 KEGG:ns NR:ns ## COG: RC1328 COG0101 # Protein_GI_number: 15893251 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Rickettsia conorii # 1 246 1 245 245 150 34.0 2e-36 MRNIKLTLEYDGSRYQGWQRLGKNESSNTIANKIIEVIKKMTNEDVELFCGARTEVGVHA YAQIVNFKTTSDMKLGEIKQYLNRYLPMDIAVTDIEEKPERFHASLNATSKIYMYRMAIG EVPSVFERKYTYYVFKKPDVDVMKQAALLLVGKHDFKNFSTVKKSKSTVKEIYDIDIYTD DEEIQITLHANDFLHNMARMVIGTLIDIGLGNRKKEEIEDIFNPESSVQASSPCDAKGLY LQEVLYER >gi|330405144|gb|ADLB01000012.1| GENE 72 77457 - 77735 308 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210615872|ref|ZP_03290834.1| ## NR: 
gi|210615872|ref|ZP_03290834.1| hypothetical protein CLONEX_03053 [Clostridium nexile DSM 1787] # 1 92 1 92 94 95 59.0 7e-19 MEDWLNRPELKQIDPVKLELMKTVITKTKGKSGNDLAPILLSLIMTANKKGIRFSTDEIT FIMDLMKEGKTSDEQARIDKTAQMIQSALKKK >gi|330405144|gb|ADLB01000012.1| GENE 73 77737 - 78141 356 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210615871|ref|ZP_03290833.1| ## NR: gi|210615871|ref|ZP_03290833.1| hypothetical protein CLONEX_03052 [Clostridium nexile DSM 1787] # 1 130 24 157 174 88 43.0 1e-16 MKLIIPYTPPANQHMLAVYVKFLELKRTMTMFRQTHQNIHTQTFEKTISSPLDIIDEIRP YLSEAERNSIDSILNVFNMMQMLSTMEQMSSDAGTFNPMDMVKEMLTPEQQNMFEMYSTM FQENEDNQNEEGDS >gi|330405144|gb|ADLB01000012.1| GENE 74 78315 - 78455 233 46 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKLIIDGNAVYELDENCMLKKKLSEKQNEEEKKNLQNKKKEEQNS >gi|330405144|gb|ADLB01000012.1| GENE 75 78695 - 79513 939 272 aa, chain + ## HITS:1 COG:CAC1355 KEGG:ns NR:ns ## COG: CAC1355 COG3711 # Protein_GI_number: 15894634 # Func_class: K Transcription # Function: Transcriptional antiterminator # Organism: Clostridium acetobutylicum # 2 269 8 277 287 129 30.0 4e-30 MYQIIKVLNNNAFLAKHDEGERILVGKGIGFGKKPGDTFTAIKDAKIYTLAVRENERSVI NAVKGIEPKYLEAAGRVIDEAQVVFKTINPDILIPLADHIAFAAKRAEENIYLPNPFIAD IKALFGKEYTVVVKSREIIEEMTGYRITDDEAGFIALHVHSGLSDAEVSETLKITQIIDE CMLDIEKMLDQHISRESLGGIRLMSHLYYMIARSKAKEKINIDLNQFVEENYAKAGMIAR HICREVERKLEVPVIREEEGFLAVHIQTIIIP >gi|330405144|gb|ADLB01000012.1| GENE 76 79551 - 79808 519 85 aa, chain + ## HITS:1 COG:MYPU_6030 KEGG:ns NR:ns ## COG: MYPU_6030 COG1925 # Protein_GI_number: 15829074 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Mycoplasma pulmonis # 1 84 1 84 87 62 40.0 2e-10 MKKFEYTIKDELGIHARPAGMLAKEAKNYTSVITITKEGKSAEATRLMAVMSLAVKCGQT VEVSVEGEDEDTAFEGVKAFFEANL >gi|330405144|gb|ADLB01000012.1| GENE 77 79839 - 81458 1714 539 aa, chain + ## HITS:1 COG:CAC3087 KEGG:ns NR:ns ## COG: CAC3087 COG1080 # Protein_GI_number: 15896338 # Func_class: G Carbohydrate 
transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Clostridium acetobutylicum # 5 535 4 536 539 454 47.0 1e-127 MECLKGKSVYKGIALGKISVLKKNDYVVKRTKADDVEGEIARVSQAKELACAQLQKLYEK ALKEVGEASAAIFEVHHMMLDDADFNEAIENIIRTQEVNAEYAVASAGDSFSEMFASMDD DYMRARAADIKDISERLVQNLIGGEENDMDFDEPVIVVADDLTPSETVQMDKEKILAFVT VHGSTNSHTAILARMMNIPALIGVDMNLEELKTGMEAVVDGFEGEMILEPTEEVRNATLT KIAEEEEKTRLLLELKGKENITLGGKKIEIYANIGSASDVGYVLENDAGGIGLFRSEFLY IGRNELPSEEEQFQAYKQVAQNMAGKKVIIRTLDIGADKQADYLDLGEEENPALGYRAIR ICLSQPEIFKTQLRAIFRASTYGNISIMYPMITSVEEVEKIQRIVAEVKKELYECDIPYK DVEEGVMIETPAAVMISDELAEMVDFFSIGTNDLTQYTLAIDRQNEKLDSFYNPHHKAVL KMIQMVVDNSHKAGKWTGICGELGADTELTETFVKMGVDELSVAPSMILKLRKIIREMK >gi|330405144|gb|ADLB01000012.1| GENE 78 81475 - 81969 627 164 aa, chain + ## HITS:1 COG:BH0296_3 KEGG:ns NR:ns ## COG: BH0296_3 COG2190 # Protein_GI_number: 15612859 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIA components # Organism: Bacillus halodurans # 16 164 13 161 161 132 44.0 4e-31 MLGSLKEKLGFGKKGEVLYAPIEGEAIELAKVSDPTFGEEILGKGIAIVPSVGKVYAPID ATIEMVFDTKHAISMKSESGIEILVHVGLDTVTLKGEPFKEHVAAGDKVKAGDLLLEFDI EAIKNAGLEVVSPVIICNTADYSEIKTTTGKTVNTKDEVMTIIK >gi|330405144|gb|ADLB01000012.1| GENE 79 82051 - 83523 1885 490 aa, chain + ## HITS:1 COG:YPO2628_1 KEGG:ns NR:ns ## COG: YPO2628_1 COG1263 # Protein_GI_number: 16122841 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system IIC components, glucose/maltose/N-acetylglucosamine-specific # Organism: Yersinia pestis # 1 394 4 393 410 352 48.0 1e-96 MKYLQKLGKALMLPVACLPICGILMGIGYFLCPATMQGGDINGMKEIIGMFLVKAGGALI DNMAILFAIGVGVGMSDDNDGTAGLAAMVSWLMITTLLSEGVVTVIRPSVAENAKYLLAF QKIANPFIGIISGIIGSTCYNKFKGTKLPSWLAFFSGKRSVAIVAGVVSILTSAVLLFVW PIIFAGLVTIGESIVKLDAVGAGIYAFLNRLLIPTGLHHALNNVFWFDTIGLGDLQAFWA GKTSADVSWSLGMYMSGFFPCMMFGVPGAALAMIHTAKDNKKKVAIGLVASAAVCAFLCG VTEPFEFGFMFLAPALYVVYALLYGIFVAVTAALGFRAGFSFSAGATDLFFSSSLPAAQK 
TWMIIPLGIAAFIIFYVVFRFAITKFDLKTPGREDDDDVEAESKAVLANNDFTEVAKIIL EGLGGKENVTSIDNCITRLRLEVKDNTLVNEKKIKTAGVAGVIRPGKTSVQVIIGTQVQH VADEFKLLCK >gi|330405144|gb|ADLB01000012.1| GENE 80 83617 - 84360 818 247 aa, chain + ## HITS:1 COG:PM0526 KEGG:ns NR:ns ## COG: PM0526 COG3142 # Protein_GI_number: 15602391 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Pasteurella multocida # 5 241 3 238 244 186 39.0 3e-47 MKSVLECCVDSVESAVAAKAGGADRIELCSGLVIGGLSPSKALFQKIRETVNIPIRTLLR TRFGDFCYTEYEHEILKEEVRMFRQLGADGVVIGSLTPDGNLNMNQMKELMEEAGEMKVT LHRAFDMCKNPLETLEQAKELGIDTILTSGQKNCAAEGTELLKQLVEKSENKIEILVGGG VDGSVIESLYRETNATSYHMSGKLSLESEMQYRNPDVNMGLASMSEYEIWRTSEERVRKA RRVLDSL >gi|330405144|gb|ADLB01000012.1| GENE 81 84409 - 84645 205 78 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSDLSATNCGCGCNEGRECGFNNNSCLWIILLLFFCGGNNGCGNFNNGGNDCCLLILLLL FCCGGNGFGNNNGGCGCC >gi|330405144|gb|ADLB01000012.1| GENE 82 84797 - 85711 904 304 aa, chain + ## HITS:1 COG:BH1930 KEGG:ns NR:ns ## COG: BH1930 COG1404 # Protein_GI_number: 15614493 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Bacillus halodurans # 22 278 145 414 444 215 46.0 9e-56 MEQVKEKVSYWCEDTKSYRVEGEGIGVAVLDTGIALHPDFDSRITYFKDFVNKQEMLYDD NGHGTHIAGIIGGSGKLSDGVYSGIAPKCNLIPLKVLDKRGNGEITYVIEGIKWILQNSE KYNIRVVNISVGTLPNTEKAEEEKLLTAVEELWNRGMVVVVAAGNYGPKRQSITVPGVSK KVITVGASDDGIELFTIEGPLQDYSGRGPTEECVMKPDLVAPGSKIYSCNSRYGKKGRAY THKSGTSMATPVVSGAVALLLSKYPEMTNVEVKLRLRQTCVDLGKCKNQQGWGELNIKKL LSVK >gi|330405144|gb|ADLB01000012.1| GENE 83 85775 - 86116 358 113 aa, chain + ## HITS:1 COG:CAC1753 KEGG:ns NR:ns ## COG: CAC1753 COG2739 # Protein_GI_number: 15895030 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 8 74 8 74 116 57 49.0 7e-09 MNNILEQALLYDFYGELLNEHQKNIYEQFILEDLSLSEIASSEGISRQGVHDLIKRCNKT 
LAGYEAKLHLVEKFLAIKEKVHQINEVLEAEQTEMGMLIDNIKKISTEILEEL >gi|330405144|gb|ADLB01000012.1| GENE 84 86118 - 87467 1673 449 aa, chain + ## HITS:1 COG:BH2484 KEGG:ns NR:ns ## COG: BH2484 COG0541 # Protein_GI_number: 15615047 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Bacillus halodurans # 1 434 1 433 451 451 56.0 1e-126 MAFDSLTEKLQNVFKNLRSKGRLTEDDVKTALREVKMALLEADVNFKVVKGFIKDVQERA VGQDVMSGLNPGQMVIKIVNEELVKLMGSETTEIKLEPGQATTVIMMAGLQGAGKTTTTA KLAGKFKLKGKKPLLVACDVYRPAAIKQLQVNGEKQGVEVFTMGENHKPVNIAKAALEHA AKNGNNIVILDTAGRLHVDEEMMEELQQIKEAVTVHQTILVVDAMTGQDAVNVAGTFNDK IGIDGVIVTKLDGDTRGGAALSIRAVTGKPIFYVGMGEKLSDLEQFYPDRMASRILGMGD VLSLIEKASAEIDEDKAKQMAQKMKKAQFDFEDYLESMKQMRNMGGLSSILSMMPGVKGN QLKDLDSEENEKKMARIEAIIYSMTVKERQNPDILNPSRKHRIAKGAGVDISEVNRLVKQ FEQSKKMMKQIPGLMGGKGGKKGKFRFPF >gi|330405144|gb|ADLB01000012.1| GENE 85 87505 - 87750 346 81 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880540|ref|YP_001559508.1| ribosomal protein S16 [Clostridium phytofermentans ISDg] # 1 81 1 81 81 137 79 3e-31 MAVKIRLKRMGQKKAPFYRIVVADARSPRDGRFIDEIGTYDPNQNPSVFKVDEEAAKKWL QNGAQPTETVGKIFKAAGIEK >gi|330405144|gb|ADLB01000012.1| GENE 86 87774 - 88001 404 75 aa, chain + ## HITS:1 COG:BH2482 KEGG:ns NR:ns ## COG: BH2482 COG1837 # Protein_GI_number: 15615045 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein (contains KH domain) # Organism: Bacillus halodurans # 1 74 1 74 76 67 59.0 5e-12 MRELVEVIAKALVDDPDSVVVTEKTEGKTIVIEVRVADADMGKVIGKQGRIAKAIRSVVK AAAAKEDKKVVVDIA >gi|330405144|gb|ADLB01000012.1| GENE 87 88045 - 88551 471 168 aa, chain + ## HITS:1 COG:CAC1757 KEGG:ns NR:ns ## COG: CAC1757 COG0806 # Protein_GI_number: 15895034 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RimM protein, required for 16S rRNA processing # Organism: Clostridium acetobutylicum # 1 165 1 161 166 124 45.0 8e-29 
MEQLLQVGIISSTHGVRGEVKVFPTTDDINRFKKLKQVILDTGREKMVLEVQGVKFFKQF AILKFKGIDNINDIEKYKGKSLFVDREHAVKLKKNEYFIADMIGMNVYTEEDEFFGVLKD VIETGANDVYVISSEKHGEVLVPAIRQCILNVDIEKSKMVIHLLEGLV >gi|330405144|gb|ADLB01000012.1| GENE 88 88554 - 89276 771 240 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 237 1 237 246 271 53.0 8e-73 MNFHILTLFPDMVMDGLNTSIIGRAIANNLLTIEAVNIRDFAVNKHNRVDDYTYGGGAGM LMQAEPVYLAYKSVEEKCKKAPRVIYLSPQGQTFSQEMAEEFANEEELVFLCGHYEGIDE RVLEEIVTDYVSIGDYVLTGGELPSMVMIDAISRLIPGVLHNNVSAEFETFQDNLLEYPQ YTRPEEWRGKKVPEILLSGHHANIEKWRREQSVLRTAKARPDLLEKAELTEKEKKMLANS >gi|330405144|gb|ADLB01000012.1| GENE 89 89376 - 89723 465 115 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238916980|ref|YP_002930497.1| large subunit ribosomal protein L19 [Eubacterium eligens ATCC 27750] # 1 115 1 115 115 183 80 5e-45 MNDIIKKIEAEQLKAEVPEFNVGDTVKVYGKIKEGNRERIQIFEGTVLKKQGGSTRATFT VRKNSNGIGVEKTWPLHSPNVERVEVVRRGKVRRAKLNYLRERVGKRAKVKELVK >gi|330405144|gb|ADLB01000012.1| GENE 90 89793 - 90644 885 283 aa, chain + ## HITS:1 COG:CAC1761 KEGG:ns NR:ns ## COG: CAC1761 COG1161 # Protein_GI_number: 15895038 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 5 279 8 283 283 281 51.0 9e-76 MHFQWYPGHMTKAKRMMQENIKLIDLVIELVDARIPMSSRNPDIDELGKNKARLILLNKS DLAEDRQTDAWAEYFRGKGYSVVKVNSKKGGGIKSIQGVIQEACKEKIERDRKRGILNRP VRAMVVGIPNVGKSTFINALAGKACAKTGNKPGVTKGKQWIRLNKNVELLDTPGILWPKF EDQQVGLKLAFIGSIKDEILNTEELAGELIKFLNAHYTGVLAEKYGIEEQEDSYACLMEI AKNRHCLLRGNELDMEKVAMILTDDFRNGRLGKITLEFPEDYE >gi|330405144|gb|ADLB01000012.1| GENE 91 90664 - 91218 601 184 aa, chain + ## HITS:1 COG:alr2975 KEGG:ns NR:ns ## COG: alr2975 COG0681 # Protein_GI_number: 17230467 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: 
Nostoc sp. PCC 7120 # 21 181 28 190 190 130 41.0 2e-30 MEDKEKSKSIFKEILDWVIYIGIILLFTYLIITYVGVRTRVSGQSMQPTLHDGDNLLVDK LTYRFRDPKRYEIVVFPYKYEEDTYYIKRIIGLPGETVQIIDGYVYINGEKLKKDYGAEV MQDSGIAEEPITLGEDEYFVLGDNRNHSSDSRVPNVGVLKRKDLLGRAWVRIWPLDRIGV VSHE >gi|330405144|gb|ADLB01000012.1| GENE 92 91211 - 91969 784 252 aa, chain + ## HITS:1 COG:BS_rnh KEGG:ns NR:ns ## COG: BS_rnh COG0164 # Protein_GI_number: 16078669 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Bacillus subtilis # 4 247 5 248 255 202 50.0 5e-52 MNKSISEIKHEFEQADKEKLPLLYEQYARDSRTGVVSLIAKYKKKEEKLQAEIKRIEEMS RYEQENCMEEYICGIDEVGRGPLAGPVVAGAVILPKDEMILYLNDSKKLSEKKREELYDI IMERAVAAGIGIVSPARIDEINILQATYEAMRMAIDNLKVQPTLLLNDAVTIPSVNIRQI PIIKGDAKSVSIAAASIIAKVTRDRLMKEYDKVIPGYDFASNKGYGSAAHIQALKENGAT LIHRKTFIKNFI >gi|330405144|gb|ADLB01000012.1| GENE 93 91970 - 92329 302 119 aa, chain + ## HITS:1 COG:VC0580 KEGG:ns NR:ns ## COG: VC0580 COG0792 # Protein_GI_number: 15640602 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease distantly related to archaeal Holliday junction resolvase # Organism: Vibrio cholerae # 1 114 1 118 122 85 37.0 2e-17 MMKSNKREIGSAFEKQAGEYLKKIGYSIIEYNFQCRMGEIDIVACDGKTLVFCEVKYRSD NQKGTPFEAVTVSKQKKICRAALYYMTKHQITDTACRFDVIGITGEKIEIIKNAFDYIS >gi|330405144|gb|ADLB01000012.1| GENE 94 92338 - 93183 452 281 aa, chain + ## HITS:1 COG:BS_ylmD KEGG:ns NR:ns ## COG: BS_ylmD COG1496 # Protein_GI_number: 16078601 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 29 281 25 276 278 172 35.0 7e-43 MNIIYKNKEKIFQENEKNGVYYLTYPLFEETKMVKHGFSTRLGGVSKGVCSTMNFSFSRG DREEDVRENFRRMSAALEVQEENIVFSNQTHTTNVRVVTEEDRGKGLIKPLDYDDVDGLV TNIPGICLTTFYADCVPLFFVDPVKRVIGLSHSGWRGTVGKIGKSTVEKMQEEYGSNPKD ILAAIGPSICQDCYEVSEDVIEEFKKAFDGEQWRDLFYRKENGKYQLNLWKANEYVFLEA GIQKEHMAVSNVCTCCNDEVFFSHRASCGRRGNLAAFLTIR >gi|330405144|gb|ADLB01000012.1| GENE 95 93194 - 94132 899 312 aa, chain + ## HITS:1 COG:VC0480 KEGG:ns 
NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 33 298 22 285 287 194 38.0 1e-49 MILLEKTANGEVTKEVTEVTSEAAKEMGKFAQYINDNIPALIGFGVKVLLAIIFFFIGRK VIKIIRKLIKRSFERSSADKGVQQFIDSMVKVVLYALLIFSIAAKFGVDTASVAALLASG GVAIGLALQGSLSNFAGGVLILLLKPFVVGDYIIEDSQKNEGTVKEIQIFYTKLSTVDNK TIVIPNGTLANTSLTNVTAKDVRRLDLSIDIAYEADLRKAKEIIEGLLKADGSILKDEEI LVFVDQLGASSVVIGTRAWVKTEEYWPTRWRLLEEIKLALDENHIEIPYQQITLHMKENE EKKQLQNSENRI >gi|330405144|gb|ADLB01000012.1| GENE 96 94385 - 94723 204 112 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|18309686|ref|NP_561620.1| 30S ribosomal protein [Clostridium perfringens str. 13] # 4 106 5 108 110 83 42 1e-14 MAEKSSLSGNNIMLLLQLLSEKDMYGYEIMETLRERSENVFELKAGTLYPLLHSMELKKL LASYEQEAEGKKRKYYQITEEGRKQLATKKREWNEYSKAVTQIIGGTCYGTA >gi|330405144|gb|ADLB01000012.1| GENE 97 94710 - 95456 802 248 aa, chain + ## HITS:1 COG:no KEGG:CBO2305 NR:ns ## KEGG: CBO2305 # Name: not_defined # Def: membrane protein # Organism: C.botulinum # Pathway: not_defined # 1 244 1 250 258 126 31.0 5e-28 MELHNYIESLIKPIQSQEVKLAVAEEIKGHIEEQKEAYMEEGMDEKAALEQAILSMGSPK EVGDAFATIYNLTSEKKNMIYYSIFSLLGLALIWCLRTLDTYTVMIKIVGVILMLGGVIA AASEKWANLSFYYFKNQSGGSTMNSYAICAAGIAFFSEEYWQWLILSVIIGLVVYFERDM IGKCQAKKTDRYLYKTGIAITDIDYRGKANIDGEVQKVRVTHDKIKKGQNVIISKVNGFH LIVEAISP >gi|330405144|gb|ADLB01000012.1| GENE 98 95472 - 96968 1262 498 aa, chain + ## HITS:1 COG:BS_glpK KEGG:ns NR:ns ## COG: BS_glpK COG0554 # Protein_GI_number: 16077994 # Func_class: C Energy production and conversion # Function: Glycerol kinase # Organism: Bacillus subtilis # 1 489 1 490 496 659 63.0 0 MGKYIMALDAGTTSNRCILFNEKGEMCSAAQKEFTQYFTQPGWVEHDANEIWSTQLGVAV EAMNKIGASAKDIAAIGITNQRETAIVWDKETGEPIYNAIVWQCRRTSDYCDKLKEDGMT EFFRRKTGLVIDAYFSATKVKWLLDNVKGAREKAEEGRLLFGTVETWLIWKLTKGKVHVT DYSNASRTMLFNINTLEWDDEILEKLQIPKKMLPEVKPSSCVYGMTDSSFFGDEIAIAGA AGDQQAALFGQTCFQKGEAKNTYGTGCFLLMNTGEEPVFSNHGLVTTIAWGLDGKVNYAL 
EGSIFVAGAAIQWLRDEMKLIDSAKESESMAQKVKDTNGCYVVPAFTGLGAPHWDQYARG TIVGITRGVNRYHIIRATLESLAYQVHEVLEAMKADSGISIPALKVDGGASANNFLMQVQ ADIMNAPVIRPRCVETTAMGAAYLAGLAVSYWADKEDVVKNWQKDESFVPEITEEEREKR IKGWNKAVKYSYEWAKEE >gi|330405144|gb|ADLB01000012.1| GENE 99 97014 - 97811 548 265 aa, chain - ## HITS:1 COG:CAC0618 KEGG:ns NR:ns ## COG: CAC0618 COG0600 # Protein_GI_number: 15893906 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport system, permease component # Organism: Clostridium acetobutylicum # 4 265 3 264 264 199 41.0 5e-51 MNDISTAQMLYLKKQKRHKYIVSFSRVLILILFLLLWEFTAHYEIIDSFVFSSPSKIVKC FWSMVLDKSIFSHIGITLYETIFSFLFVIFFSILIAILLWSSQKLSDITEPYLVVLNSLP KSALAPLLIVWLGANPTTIIVCGMSIAIFGSILNLYTNFVHVDEEEIKLIYTLHGNRFHV LTKVVLPSSVPAIMNMMKVNIGLCLVGVVIGEFLAAKSGLGYLIIYSSQVFKMDWLLMSI VLLCIMAMVLYTFISFIQRQYQKRL >gi|330405144|gb|ADLB01000012.1| GENE 100 97804 - 98571 215 255 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 1 211 1 211 311 87 29 5e-16 MENILEIKNLSYTYHTLEGETPALDNISFALKEGEFLSVVGPSGCGKSTLLSIICGLLTP DAPSVFFQGKSLKENSDKIGYMLQHDHLFEWRTIYQNVLLGLEIQHKLSVRTKEKAKKLL ETYGLKEFSNARPSELSGGMRQRAALVRTLVLDPELLLLDEPFSALDYQTRLKVGDDIGQ IIRKEKKTAILVTHDLSEAISLGDRVLVLSKRPATIQKTVELRFNLENDTPLQRRNTPEF KDYFNLVWKELNKNE >gi|330405144|gb|ADLB01000012.1| GENE 101 98686 - 99309 453 207 aa, chain + ## HITS:1 COG:CAC1577 KEGG:ns NR:ns ## COG: CAC1577 COG1636 # Protein_GI_number: 15894855 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 199 4 204 208 228 62.0 8e-60 MNYQKELEKLLVKLEKEGRTPRLLLHSCCAPCSSYVLEYLSKHFEITVFYYNPNIFPESE YTKRILEQQTLIGEMKYPVSFIAGNYDKEKFYEMAKGLEHLKEGKERCFKCYALRLEETA RLAREGEFDYFTTTLSISPMKNADKLNEIGTMLGRKYGVEYLQSDFKKKNGYKRSIELSK EYDLYRQDYCGCVYSMQERKINDNKRR >gi|330405144|gb|ADLB01000012.1| GENE 102 99328 - 100413 1143 361 aa, chain + ## HITS:1 COG:CAC0022 KEGG:ns NR:ns 
## COG: CAC0022 COG0136 # Protein_GI_number: 15893320 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Clostridium acetobutylicum # 4 361 3 360 360 472 61.0 1e-133 MTEKLRVGILGATGMVGQRFISLLENHPWYKVTVVAASPKSAGKTYEEAVGNRWKMDTPI PEKVKHLTVMNVHEVENIAAQVDFVFSAVDMKKEEIRDIEDAYAKAETPVVSNNSAHRWT KDVPMVIPEINSEHFKVIESQKKRLGTTRGFVAVKPNCSIQSYAPVLTAWKEFEPTEVVA TTYQAISGAGKTFREWPEMIENIIPFIGGEEEKSEQEPLKIWGKVVEGEIVKAQMPVITT QCIRVPVLNGHTSAVFVKFAKKPTKEQLIDRLLSFKGTPQELHLPSAPKQFIQYLKEENR PQVKLDVDFENGMGISVGRLREDTVYDFKFVGLSHNTVRGAAGGAILCAETLTAQGYIQA K >gi|330405144|gb|ADLB01000012.1| GENE 103 100438 - 101481 400 347 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 13 347 13 345 345 158 29 2e-37 MGMISGSFGVTSKGKEATLYTITNKSGMSISVTDFGATLVKVMVPDKNDEYLDVVLGYDN VTGYENGTLFFGASVGRCANRIGGAKFTINGKTYYLEKNDYSNNLHSGTDFYNKRMWEVR SKNYESITFTLHSRGGDQGFPGALDMEVKYTLTENNEIKIEYYSVPSEDTIINMTNHSYF NLNGHDSGSAMNQEVGIAAEYFTKTDIQSIPTGEIVAVEDTPMDFRVKKPLGRDIEEQYE PLIFGRGYDHNWVLENNGKFDKVAEMSSNESGIVMEVYTDLPGMQLYTGNFVENEIGKNG AVYDVRHGVCFETQYFPDAINKDNFKSPLCPAGENYRTITVFKFGVE >gi|330405144|gb|ADLB01000012.1| GENE 104 101486 - 103483 1469 665 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 1 635 1 588 709 372 37.0 1e-102 MGKAVYIAEKPSVAQEFAKALKLNTKRKDGYLESEEAVVTWCVGHLVTMSYPEVYDEGMK RWRLETLPFLPKEFLYEIIPAVQKQFKTVSGILTRDDVDTIYVCTDSGREGEYIYRLVEQ MAGDKVKNKLRKRVWIDSQTEEEILRGIKEAKDLSAYDNLSASAYLRAKEDYLMGINFSR LLTLKYGNSISNYRHTKFTVISVGRVMTCVLGMVVRREREIRNFVKTPFYRVLSTLKLDE QSFDGEWKAVKGSKYFESFDLYKENGFKTREKAEELIAYLSEETPLTAQVISIEKKKEKK NPPLLYNLAELQNECSKRFKISPDETLRIVQELYEKKLVTYPRTDARVLSTAVAKEISKN LNGLSKYETAKPYLSDILSFGNHKGLEKTRYVNDKQITDHYAIIPTGQGLSALQSVSSTS KQIYDTIVRRFLSIFYPPAVYQKVSIVTKIREEQFFSNFKVLAEEGYLKVVGIPQTKKDA 
ESEENTSDFGLFAVVQQLKKGSVLEIKALNLKEGETSPPKRYNSGSIILAMENAGQLIED EELRAQIKGSGIGTSATRAEILKKLLSIKYLSLNKKTQMITPTLLGEMIFDVVEHSIRSL LNPELTASWEKGLTYVAEGEITSEEYMVKLENFIKSRTHGVLGLNNQYQLRSCYDAAARF YKKGK >gi|330405144|gb|ADLB01000012.1| GENE 105 103486 - 104151 806 221 aa, chain + ## HITS:1 COG:CPn0749 KEGG:ns NR:ns ## COG: CPn0749 COG0110 # Protein_GI_number: 15618658 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Chlamydophila pneumoniae CWL029 # 9 197 8 204 208 117 33.0 2e-26 MEKLTVKELFNLNETIAKDLLNSVTYPWEALPKISAFIVELGETLDEEKYEKRDKDIWIA KNATIAPTAYIHGPAIIGENAEVRHCAFIRGNAIVGEGAVVGNSTELKNVILFNKVQVPH YNYVGDSILGYKSHMGAGSITSNVKSDKTLVTIRTEENIVETGLKKFGAILGDEVEVGCG SVLNPGTVVGSHTNIYPLSMVRGYVPAKSIYKKQGEIAEKQ >gi|330405144|gb|ADLB01000012.1| GENE 106 104247 - 104942 260 231 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 220 1 232 245 104 28 3e-21 MLEVQNLRKRYGKYLAVDNVSFTVPDGKVGILLGPNGAGKSTIIKSIAGLLKYEGGIGIQ RLSSKTLEAKKLFAYVPEIPAMFEALTVREHIEYIRKAYESRITDEEIKQVLDRFELWDK QDKLGNELSKGMMQKVSICCALALQPKVILFDEPMVGLDPKAIKELKEVVLELRDNNVTV LISTHMLEMVKELWDIMYVMEKGKIIGTYTKEELGEKDIDDVFFELTGGDE >gi|330405144|gb|ADLB01000012.1| GENE 107 104942 - 106513 1402 523 aa, chain + ## HITS:1 COG:no KEGG:CLB_3171 NR:ns ## KEGG: CLB_3171 # Name: not_defined # Def: ABC transporter, permease protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 1 520 1 521 523 139 27.0 4e-31 MKALWYVTKRSGINFMKKAVKKPTTYLFLAVIFIYAIILGSGAFAWRISGIFTENIALVT LLTIWTTFIFFTNFVTYAKKKGIIFKPSHTHFIFTAPISPKAILLMGAMKNYMFSFVMTL IIFYGELVVFQISFPKALLTSFGMLILELIFEMSLIIFLYANEKMSFKTTTWICRSIYVF LLGVVCVAVFYFRENGMSPETLTQFVDYPVIQMIPFLGWNIAVYRMVFLGATTLNVICSI LYFLSVAIMFFIAYKMKCTGGYYEEAAKFADDYVEFRNKSKNGEMVMSIGKKKTFKKAVK MDYKASGAKAIFYRQLLEYKKERFFILGGMTLVIIGIAFLFIKIIGVENGIPPQVMLLLV MAYIIFCATGYTGKWEKELKNPYLYLIPDTPVRKMWYATVIEHIRSFIDGSILAVSLGIA WKIPVFEIVLAIAIYVVLQANKLYMRVLADSILGASSGVNVRNMIYMLFQSATLGMGALF 
GAVAGIFISMKLLFPVILIYSILMVILIGLLASSRFGEMEQAG >gi|330405144|gb|ADLB01000012.1| GENE 108 106531 - 107007 667 158 aa, chain + ## HITS:1 COG:CAC1604 KEGG:ns NR:ns ## COG: CAC1604 COG1803 # Protein_GI_number: 15894882 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Clostridium acetobutylicum # 11 152 5 146 149 192 61.0 2e-49 MLQENFVTFEIGKRKNIALVAHDGKKKELVEWCERNKEILREHSLCGTGTTSRLITERTG LKVKGYNSGPLGGDQQIGAKIVEGNIDFMIFLWDPLEAQPHDPDVKALLRIAVVYDIPIA NNLSTADFMLHSKYMTTTYNRRVENFNKKVEERVSQMK >gi|330405144|gb|ADLB01000012.1| GENE 109 107255 - 108277 1340 340 aa, chain + ## HITS:1 COG:BH2383 KEGG:ns NR:ns ## COG: BH2383 COG0468 # Protein_GI_number: 15614946 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Bacillus halodurans # 5 330 3 327 349 456 70.0 1e-128 MATDDKLKALDAAISKLEKDFGKGTVMKLGDAGANVSVETVPTGSLSLDIALGLGGVPKG RVIEVYGPESSGKTTVTLHMIAEVQKRGGIAGFIDAEHALDPVYAKNIGVDIDELYISQP DSGDQALEIAETMVRSGAMDIVVIDSVAALVPKQEIEGDMGDSHVGLQARLMSQALRKLT PVISKSNCVVIFINQLREKVGVMFGNPETTTGGRALKFYASVRMDVRRIETLKQSGEMVG NRTRIKIVKNKIAPPFKEAEFDIMFGKGISKEGDVLDLATGIDLVNKSGSWYAYNGEKIG QGRENAKTYLHTHPEIMEELEAKVREHYGLGVTKEDKEEK >gi|330405144|gb|ADLB01000012.1| GENE 110 108279 - 108893 583 204 aa, chain + ## HITS:1 COG:CAC2410 KEGG:ns NR:ns ## COG: CAC2410 COG2137 # Protein_GI_number: 15895676 # Func_class: R General function prediction only # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 3 198 6 207 214 62 23.0 4e-10 MVVTQIEPLTKTKWKVYIDGKFAFVLYKGELSRFRIVQGEDVSEEIYEKIKNEVILKRVK LRALHLLNQMDRTEEQLRTKLRQGHYTDDMIDTAISYVKSFGYIEDSAYAKRFVLSRQDS KSRKEIYAKLCQKGIAKDVIANAMEECYEDGEELTAIKKLVEKKRFDPYNATDSERQKIY GYLARKGFSYDSIRQVIQISDWNA >gi|330405144|gb|ADLB01000012.1| GENE 111 108966 - 110519 1802 517 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD 
superfamily hydrolase # Organism: Clostridium acetobutylicum # 10 517 7 514 514 536 62.0 1e-152 MPSISIIIAVVITLVITAIVTYFATVSNLKKNADSKIGNAESKAREIIDDALKTAETTKK EALLEVKEESIKTKNELEKETKERRAELQRYEKRVLSKEEALDKKADVIEQREANFVARE EQIKQREVKVEELSKQRVQELERISGLTSEQAKEYLLKTVEEDVKHDTAKMIKELETQAK EEAGKKAREYVVTAIQKCAADHVAETTISVVQLPSDEMKGRIIGREGRNIRTLETLTGVE LIIDDTPEAVVLSGFDPIRREVARIALEKLILDGRIHPARIEEMVEKAQKDVEAMIREEG ENAALEVGVHGIHPELIRLLGRMKFRTSYGQNALKHSIEVAQLSGLLAGEIGLDVRMAKR AGLLHDIGKSIDHDVEGSHIQIGVDLCRKYKESAVVINAVESHHGDVEPETLIACIVQAA DTISAARPGARRETLETYTNRLKQLEDIANQFKGVDKSFAIQAGREIRIMVVPEQVSDAD MVLLARDISKQIEFELEYPGQIKVNVIRESRVTDYAK >gi|330405144|gb|ADLB01000012.1| GENE 112 110622 - 111164 625 180 aa, chain + ## HITS:1 COG:TM1181 KEGG:ns NR:ns ## COG: TM1181 COG0494 # Protein_GI_number: 15643937 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Thermotoga maritima # 6 173 6 171 179 124 42.0 9e-29 MGEQLKRTDRTLKYEGSILKVYTDEIQLPNGKKTHWDYIHHNGATAIIPVRNDGKILMVR QYRNALDRETLEIPAGKLEDKDEKPLLCAMRELEEETGYVAGKMSHLITLRTWVAFTNEK IEVYVAEELKSTAQCLDEDEFINVEAFTMEELKEKIFSGEIQDAKTISSLLAYEVWQKKE >gi|330405144|gb|ADLB01000012.1| GENE 113 111216 - 111755 325 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167759050|ref|ZP_02431177.1| ## NR: gi|167759050|ref|ZP_02431177.1| hypothetical protein CLOSCI_01397 [Clostridium scindens ATCC 35704] # 1 179 11 189 189 149 44.0 1e-34 MKVHQSKAYITAFCVLGFFTGILYANIVSNQIFSTSGIFSEYFLSQYPSIEIIPEEYLFY ILRIRVVPLLFLILLGQTKIRKAVISIFLFWTGFSGGILIVSAIFRMGMKGVLLFIIGIM PQFIFYILAYVVVIWYFYTYPMSRWNYGKTAFLVCMMLFGILTEIYVNPILMKMFVRII >gi|330405144|gb|ADLB01000012.1| GENE 114 111786 - 113105 133 439 aa, chain - ## HITS:1 COG:BH1233 KEGG:ns NR:ns ## COG: BH1233 COG2244 # Protein_GI_number: 15613796 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Bacillus halodurans # 
4 438 3 441 522 156 28.0 6e-38 MSKKNAIIKGTFVLTATGFLSRFIGFFYRIFLSHTFGEEGVGLYQLIFPVFALGFSLTTA GIEIAISRIVAQKHSLHRHAEARRTLFVGLTMSFCLSFCTMFFLQKHAFFIAKNFLHDTR CTNLLLAVSYAFPFSSIHSCICGYYLGLKRTRVPALSQLAEQISRVLFVYGVCYLMSKKH MAPSVSIAVFGLVCGEIISSLYCVFYFSRKKPSGKNTVSKMQGTLSAGIELFRLSAPLTA NRILLNILQSIEAVSIPSCLILYGLTNSDALSTYGVLTGMALPCILFPSAITNSVSTMLL PTIAEIQTQNDKMHLITLVKKVTLYIFTLGFLCTIGFLMFGNFIGNILFHSTLAGKFILT LSWICPFLYTNNTLISTLNGLGKTTTTFFINTTGLLIRIAGIFILIPKFSIIGYLWGLLV SQCLVTFLSVVFLRKYLSS >gi|330405144|gb|ADLB01000012.1| GENE 115 113244 - 113702 547 152 aa, chain + ## HITS:1 COG:CAC1675 KEGG:ns NR:ns ## COG: CAC1675 COG1959 # Protein_GI_number: 15894952 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Clostridium acetobutylicum # 1 141 1 138 139 133 48.0 1e-31 MKLSTKGRYGLRALIDLAQYSEVEPVSINSISARQGISERYLEQLMALLKKAGLVKSIRG AGGGYVLAKEKEDISVGDVLRALEGKLEPVECSGFSAEESCEAAGGCVTKYVWQRINESI NNTVDEIKLDQLVRESKAMNSIEKCEHRDCNK >gi|330405144|gb|ADLB01000012.1| GENE 116 113720 - 114901 1174 393 aa, chain + ## HITS:1 COG:MA2718 KEGG:ns NR:ns ## COG: MA2718 COG1104 # Protein_GI_number: 20091542 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 3 384 4 384 392 460 59.0 1e-129 MERFVYLDNAATTKTAPEVVEAMLPYFTEHYGNPSSVYSFASKNKEVITKQREIIADVLG ASANEIYFTAGGSESDNWALKATAEAYKDKGNHIITTKIEHHAILHTAEYLEKQGFEVTY LDVDENGIVKLDELKKAIRPTTILISIMFANNEIGTIQPIKEIGEIAKEHGILFHTDAVQ AFGQIPVNVDECHIDMLSASGHKLNGPKGIGFLYIRKGVKIRSFVHGGAQERKRRAGTEN VPGIVGFGTAVKLAVDTMKERTDKEIQLRDYMIERISEEIPYAKLNGDAKKRLPNNVNFS FRFIEGESLLIMLDMKGICASSGSACTSGSLDPSHVLLAIGLPHEIAHGSLRMTLSEETT KEDIDYVIDNLKDIVSKLRSMSPLYEDFSKKTK >gi|330405144|gb|ADLB01000012.1| GENE 117 114923 - 115363 556 146 aa, chain + ## HITS:1 COG:MA2717 KEGG:ns NR:ns ## COG: MA2717 COG0822 # Protein_GI_number: 20091541 # Func_class: C Energy production and conversion # Function: NifU homolog involved in Fe-S cluster 
formation # Organism: Methanosarcina acetivorans str.C2A # 2 125 3 125 128 168 62.0 3e-42 MYTEKVMDHFQNPRNVGEIEDASGVGTVGNAKCGDIMRMYLDIDENQVIQDVKFKTFGCG AAVATSSMATELVKGKTIYEALKVTNKAVMEALDGLPPVKVHCSLLAEEAIHAALWDYAE KNGIKIEGLEKPKSDIHEDEEEVEEY >gi|330405144|gb|ADLB01000012.1| GENE 118 115365 - 116447 985 360 aa, chain + ## HITS:1 COG:CAC2233 KEGG:ns NR:ns ## COG: CAC2233 COG0482 # Protein_GI_number: 15895501 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Clostridium acetobutylicum # 4 357 3 354 355 409 57.0 1e-114 MEKKKVVVGMSGGVDSSVAAHLLKEQGYDVIGVTMQIWQDEDRAIQEENGGCCGLSAVDD ARRVAGQLEIPYYVMNFKKEFKKEVIDYFIEEYRQGRTPNPCIACNRYVKWESLLKRSMD IGADYIATGHYARIVKLENGRYTLKRSATLAKDQTYALYNLTQNQLSHTLMPVGEYTKDE IRQIAEKIHLQVANKPDSQDICFVPDGDYASFIEETTGERIQTGNFVDLNGNILGQHKGI IHYTVGQRKGLGLSLGRPAFVLEIRPETNEVVIGTNEDAMSYTLRANHLNFMSISDLEGE MHVFAKIRYNHKGAWCTIKKTSEDEVLCTFDEPQRAITPGQAVVFYDDEYVIGGGTIIGV >gi|330405144|gb|ADLB01000012.1| GENE 119 116451 - 117251 743 266 aa, chain + ## HITS:1 COG:L37351 KEGG:ns NR:ns ## COG: L37351 COG1387 # Protein_GI_number: 15673198 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Lactococcus lactis # 2 259 3 259 269 132 32.0 5e-31 MRADFHMHTEFSTDAHCSPEEMVLGAIKKGLTTVCITDHMDKDYDKEGKQFVFCPEEYFS TLRKLQGEYRQKIDVRIGVEMGLQPHLKTFVGNYVKKYSFDFVIGSVHLIDGKDPHQEKE NQMEDSILYRNGFENILECVKQQMDFDVLGHMDYVVRYGKEKEKSYFLQDYAEITDEILR SLIMQGKGIELNTAGFKYGLGFAHPHISILKRYKELGGEVITVGSDAHQPENIAYDFDRA KEVLKDSGFKYYTEFKGRKPIFIQVG >gi|330405144|gb|ADLB01000012.1| GENE 120 117392 - 118405 1332 337 aa, chain + ## HITS:1 COG:NMA0246 KEGG:ns NR:ns ## COG: NMA0246 COG0057 # Protein_GI_number: 15793264 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate 
dehydrogenase # Organism: Neisseria meningitidis Z2491 # 1 334 1 331 334 446 69.0 1e-125 MAVKVAINGFGRIGRLAFRQMFGAEGYEVVAINDLTSPKMLAHLLKYDSSQGKYALADKV TAGEDSITVDGKEIKIYAKANAEELPWGEIGVDVVLECTGFYTSKAKAEAHIKAGARKVV ISAPAGNDLPTIVYNVNHDTLKPEDTVISAASCTTNCLAPMADALNKLATIKSGIMCTIH AYTGDQMTLDGPQRKGDLRRSRAAAVNIVPNSTGAAKAIGLVIPELNGKLIGSAQRVPTP TGSTTILTAVVEGNVTVEEINAAMKAAANESYGYNEDEIVSSDIVGMRYGSLFDATQTMA LPLDNGTTEVQVVSWYDNENSYTSQMVRTIKYFSELA >gi|330405144|gb|ADLB01000012.1| GENE 121 118492 - 119685 1402 397 aa, chain + ## HITS:1 COG:CAC0710 KEGG:ns NR:ns ## COG: CAC0710 COG0126 # Protein_GI_number: 15893998 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Clostridium acetobutylicum # 2 397 3 397 397 545 72.0 1e-155 MLNKKSVDDINVKGKRVLVRCDFNVPLIDGKITDENRLVAALPTIKKLVADGGKIILCSH LGKPKGEPKPELSLAPVAVRLSELLGQEVKFAADATVVGENAKAAVEAMNDGDIILLENT RYRAEETKNGEEFSKELASLCDVFVNDAFGTAHRAHCSNVGVTQFVDTAVVGYLMQKEID FLGNAVNNPERPFVAILGGAKVSSKISVIENLLDKVDTLIIGGGMAYTFSKARGGKIGLS LCEDDYMEYALNMLKKAEEKGVKLLLPVDHRIGDDFSNDANVQVVESGEIPDGWEGMDIG PKTEVLFADAVKDAKTVVWNGPMGCFEMPKFANGTEAVAKALAETDAVTIIGGGDSAAAV NQLGYGDKMTHISTGGGASLEFLEGKELPGVAAANDK >gi|330405144|gb|ADLB01000012.1| GENE 122 119753 - 120502 887 249 aa, chain + ## HITS:1 COG:CAC0711 KEGG:ns NR:ns ## COG: CAC0711 COG0149 # Protein_GI_number: 15893999 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Clostridium acetobutylicum # 3 248 2 248 248 258 56.0 8e-69 MSRRKIIAGNWKMNKTPSETVTLLNELKPLVATEDADVVFCVPAISIIPAMEAVKGTNIC IGAENMYFEESGAYTGEIAPNMLTDVGVKYVIIGHSERREYFAETDETVNKKVKKAFEHG ITPIICCGETLAQREQGVTIDFIRQQIKIAFLDVTAEQAKTAVIAYEPIWAIGTGKVATT EQAQEVCAAIRQCIAEIYDEATAEAIRIQYGGSVSADSAPELFAQPDIDGGLVGGASLKT DFGKIVNWK >gi|330405144|gb|ADLB01000012.1| GENE 123 120579 - 121205 586 208 aa, chain + ## HITS:1 COG:CAC3601 KEGG:ns NR:ns ## COG: CAC3601 COG0494 # Protein_GI_number: 15896835 # Func_class: L Replication, recombination and repair; R General function 
prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Clostridium acetobutylicum # 3 202 5 201 202 117 35.0 1e-26 MRRIKKISQTTNNRFLNMYTLQMKSDTGKESTYYVASRAKNIDELKITTRQNRADGVIIY SLYQDENDDEEKIVLIRQYRCPLDDYIYEFPAGLVDEGEDFKVAGVRELKEETGLDLTPI NAKDMFTKPFFTTVGMTDESCGTVYGYAKGKPSKAGQEENEEIEIVLADRKEVRRILREE NVAIMCAYMLMHFLHTPDGKVFDFLESI >gi|330405144|gb|ADLB01000012.1| GENE 124 121295 - 122833 1656 512 aa, chain + ## HITS:1 COG:CAC0712 KEGG:ns NR:ns ## COG: CAC0712 COG0696 # Protein_GI_number: 15894000 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Clostridium acetobutylicum # 1 510 1 509 510 648 63.0 0 MSKKPTVLMILDGYGLNDKHEANAVYEGKTPVMDKLMAECPFVKGNASGMAVGLPDGQMG NSEVGHLNMGAGRIVYQDLTKITKSIQDGDFFENEALLAACNNAKEHNSALHLFGLVSDG GVHSHNSHIYGLLELAKKQGLEKVYVHCFLDGRDTPPASGKDYVQELADKMKEIGVGEIA SVMGRYYAMDRDNRWDRVELAYNAMVKGEGETAECAVCAVQNSYDADKTDEFVLPTVVVK DGKPVATIKDNDSIIFFNFRPDRAREITRTFCDDNFDGFDRGERIKTTYVCFTEYDVTIE NKLVAFHKTEITNTFGEFLANNNMTQARIAETEKYAHVTFFFNGGVEEPNKGEDRILVKS PKVATYDLQPEMSAYEVCDKLVEAIKSDKYDVIIINFANPDMVGHTGVEAAAIKAIEAVD ECVGKTVEALKEVDGQMFICADHGNAEQLVDYETGEPFTAHTTNEVPFILVNADPSYKLR EGGCLADIAPTLIELMGMEQPKEMTGKSLLVK >gi|330405144|gb|ADLB01000012.1| GENE 125 122881 - 127125 3025 1414 aa, chain - ## HITS:1 COG:CAC2401_1 KEGG:ns NR:ns ## COG: CAC2401_1 COG1924 # Protein_GI_number: 15895667 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Clostridium acetobutylicum # 1 663 1 663 663 841 61.0 0 MQKTLHTLGIDIGSTTVKIAILNEENEVLFSDYERHFANIQETLSDLLGRAVHKLGTIEV SPVITGSGGLTLAKHLEIPFVQEVIAVSTALQDYAPQTDVAIELGGEDAKIIYFEGGNVE QRMNGICAGGTGSFIDQMASLLQTDATGLNEYAKHYQALYSIAARCGVFAKSDIQPLINE GATKEDLSASIFQAVVNQTISGLACGKPIRGHVAFLGGPLHFLSELREAFVRTLKLDDQH IIAPRHSHLFAAIGSALNSGRDLSIQLTSLQMRLTNRIEMEFEVERMTPLFASKEEYDAF EARHAKHQVPVKDLSSYKGKCFLGIDAGSTTTKAALVGEDGTLLYSFYHSNEGDPLGTTI 
SAIKDIYDKLPEGVEIVHSCSTGYGEALIKAALLLDEGEVETVSHYYAAAFFNPEVDCIL DIGGQDMKCIKIKNQTVDSVQLNEACSSGCGSFIETFAKSLNYSVEDFAHEALFAKNPID LGTRCTVFMNSKVKQAQKEGASVADISAGLAYSVIKNALYKVIKVSDASELGKHIVVQGG TFYNNAVLRSFETIADCEAVRPDIAGIMGAFGAALIARERYDESKTTTMLSIGEINALEY STSMTKCKGCTNNCRLTINKFSGGRSFITGNRCERGLGKSKTKNDIPNLFAYKNKRYFDY TPLSETDAKRGTVGLPRVLNMYENYPFWFTFFDALKFRVVLSPASTRKIYELGIESIPSE SECYPAKLAHGHVEWLIQNGVSFIFYPCIPYERQEFKDANNHYNCPIVTSYAENIKNNMD EITSGKVDFMNPFLSFKNEETLDERLTEEFSGKFSITEAEVKQAVHLAWEELKKCREDIR RKGEETIAYLDKTGKRGIVLAGRPYHIDPEVNHGIPELINSYDIAVLTEDSVSHLHPVER PLRVMDQWMYHSRLYAAANYVKTKDNLDLIQLNSFGCGLDAVTTDEVSDILTHSGKIYTS LKIDEVNNLGAARIRIRSLLSAIRVREKKHIEHCIQPSSITKVPFTKEMRETYTILCPQM SPIHFELVEPAFRSCGYNIEVLGNDSKSAINTGLKYVNNDACYPSLLVVGQIMEAILSGR YNTEKIAVIITQTGGGCRASNYIGFIRRALEKAGYAQIPVISLNLSGLEENPGFKLTLPL VLKGMYAVVFGDIFMRAVYRVRPYEVTPGSTDALHEKWKQKCITFLTQKHPSRRTFNKMC KEIIEDFDNLPMNNVKKVRVGIVGEILVKFHPTANNHLVELLEQEGAEVVVPDLLDFFLY CFYNSNFKVSHLGMKKSTAWKANLGIKALEWFRSPAKKAFAQSKHFTPPADITNLANMAK EIVSLGNQTGEGWFLTGEMLELIENGTTNIVCAQPFACLPNHVVGKGVIKELRHRHPLAN IVAIDYDPGASEVNQLNRIKLMLSTANKNLQENH >gi|330405144|gb|ADLB01000012.1| GENE 126 127353 - 128222 584 289 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2901 NR:ns ## KEGG: Cphy_2901 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 288 1 288 297 278 49.0 2e-73 MNWIQKLERKFGRYAIHNLMYYIIILYGVGFVISLVKPEFYYQYLAMDAGAVMHGQIWRV VTFLMQPPSSSPIFMVFALYLYYMIGQHLEATWGAFRFNLYFFTGVLFHVIAAFIAYFVT GISFPIGTAYLNLSLFFAFAAMYPDMEFLFFFVLPIKVKWLAIIDGVLFGYTIVQAFLPA YSTGIMGLITKSDAIAAGVSILNFVIFYLSSRSFKAHSPKQMHRRRQFQKDIRNANANSA AHASGAKHLCAVCGRTELDDPNLEFRYCSKCNGAYEYCQDHLFTHEHIK >gi|330405144|gb|ADLB01000012.1| GENE 127 128261 - 129697 1775 478 aa, chain + ## HITS:1 COG:BH3163_1 KEGG:ns NR:ns ## COG: BH3163_1 COG0469 # Protein_GI_number: 15615725 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Bacillus halodurans # 1 477 1 471 473 484 55.0 1e-136 
MRKTKIVCTMGPNTNDRELMKKMVEKGMNIARFNFSHGDHEEQKSRMDMLKGIREELGKP VAILLDTKGPEIRTGVLKGDKKVFLEEGDTFTLTTEEIEGDNKRVSVSYEGLVEDVEPGK KILIDDGLIELEVKGINGTEITCKVLNGGELGSKKGVNVPNVPVRLPALTQKDREDIIFG VEQGVDFIAASFVRSVEGVLEIKALLKECGAPFLPVIAKIENAEGIRNIDEIIHCADGIM VARGDLGVEIPAEEVPYLQKMLIKKCNDNFKPVITATQMLDSMMRNPRPTRAEVTDVANA VYDGTDAVMLSGETAQGKYPLEALEMMVHIVENTEEHLDYDILLEKAQEHRRKGISSAIG YSSVATAMNLNAKCIITPTLSGATARVVSKFKPKADIIGVTPNEATLRKMQIYWGVLPIK SIEYHTTEDICNDAIDLVNAKQLVETGDIVVLTAGIPSPVMKKTRDGVSNMMKIAVIE >gi|330405144|gb|ADLB01000012.1| GENE 128 129789 - 130430 843 213 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210616214|ref|ZP_03290994.1| ## NR: gi|210616214|ref|ZP_03290994.1| hypothetical protein CLONEX_03213 [Clostridium nexile DSM 1787] # 1 206 1 205 217 114 33.0 6e-24 MKKKKSILALMMVIVMCMLSACGTKFDAKAYIESCLDVQFKGEYDEYMKLTESSKKDSEK LYNDGIDTFMKGYESLSLPDELEGKFRDAYKKMLKSAKYSIKESKETDDGFVVTVAVKPM KCFENYEADLQKLQQEFLADMQEKIAKDGKMPSQQEIMEQMAEIVYNDLEERISKCEYDK EVTVDVKVKKDSNKKYTANETDLGKVAQKALGV >gi|330405144|gb|ADLB01000012.1| GENE 129 130584 - 130733 64 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYLKYYVHQLIQFFHSYYFSIYIFCYILYTIFRKSDSIKDTIAFYLTS >gi|330405144|gb|ADLB01000012.1| GENE 130 130682 - 133297 2396 871 aa, chain + ## HITS:1 COG:CAC1098_2 KEGG:ns NR:ns ## COG: CAC1098_2 COG0749 # Protein_GI_number: 15894383 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Clostridium acetobutylicum # 400 871 61 531 531 466 53.0 1e-131 MNEKIVLIDGHSILNRAFYGVPDLTNSEGLHTNAIYGFLNILFKILEEEKPEYLTVAFDV HAPTFRHEMYDEYKGTRKPMLEELRQQVPVMKEVLQAMGVKLIEQAGYEADDLLGTLSVL AEEKGMDVSIISGDRDLLQLATEKVKIRIPKTKKGKTEIENYYAKDVEEKYFVTPKEFID LKALMGDTSDNIPGVASIGEKTATKIITQFHSIENAYAHLDELKPPRAGKALQEQYDKAQ MSKELATIHVHVPVSYDFEEAKLGNLYTEEAYVYFQKLEFKNLLSRFDMSAPENSIEDSF KIIYAKEDADAVFENAMHAEVVGISFGKNMENVLPLFADAAEINHIGVSFDDEKIYTIYT GEEITFSYLKKKIEELTACVKTVSICNIKEALKWITFANETSAFDPIIASYLLNPLKSDY AFEDVAREQLNILIDDKMEEARKSCYTAYTAYKSVFFLSDRLKEKKMEHLFTDIEMPLAF 
TLFDMEQAGVKVKAEELKVYGQQLGGKINELETEIYEIAGETFNINSPKQLGVVLFEHLN LPNGKKTKTGYSTAADVLEKLAPDYPIVSKILEYRQLAKLKSTYADGLANFIQEDGRIHG KFNQTVTATGRISSTEPNLQNIPVRMELGRLIRKVFVPEEDYVFVDADYSQIELRILAHC SGDESLISAYKEARDIHRITASQVFHIPFDEVTDLERRNAKAVNFGIVYGISSFGLSQDL SITRKEAGEYIDNYFKTYPGIKTFLDDTVAHAKEQGYVVTLFGRRRPVPELESGNFMQRS FGERVAMNAPIQGTAADIMKIAMIGVNRTLKEKQMKSKLVLQVHDELLIEAYKDEVEEVK SILKREMEEAASLSVPLDIDMHTGNSWYEAK >gi|330405144|gb|ADLB01000012.1| GENE 131 133358 - 133888 507 176 aa, chain + ## HITS:1 COG:BS_ytaG KEGG:ns NR:ns ## COG: BS_ytaG COG0237 # Protein_GI_number: 16079958 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Bacillus subtilis # 6 174 26 194 197 98 34.0 7e-21 MEEIYGAVLCQTDVVAHQLQKKGETCYKEIVNVFGVNILTENKEIDRKKLGAIVFNDNDK LKKLNQIVHPAVKKQVKLEIEEARRKQKEFFLIESALLMEDHYEELCDELWYIYADERVR RDRLKTSRLMNEEKIDLIIKAQATEETFRKYCHITIDNSGTIENTKEQIEQAVNRR >gi|330405144|gb|ADLB01000012.1| GENE 132 133893 - 134693 732 266 aa, chain + ## HITS:1 COG:CAC3538 KEGG:ns NR:ns ## COG: CAC3538 COG1235 # Protein_GI_number: 15896774 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Clostridium acetobutylicum # 1 263 1 259 261 222 47.0 5e-58 MKLCSIASGSSGNCIYVGSDHTHIIIDAGISKKKIEEGLGKIGISGEELDGILVTHEHSD HIQGLGVFSRKYKVPIYATAGTICGIENYKSLGKMPEGLFHSIKADEEFSLGELTIKPFA ISHDAIEPTGYRIEKNQKSVAVATDLGNYDKYIVEKLKDVNALVLEANHDVHMVEVGPYP YPLKRRVLGEQGHLSNELSGRLLCDILHDDLQCVMLGHLSKENNYESLAYETVKLEVTLG DNKYKGDDIHIVVAKRDTVSELITVE >gi|330405144|gb|ADLB01000012.1| GENE 133 134713 - 135567 857 284 aa, chain + ## HITS:1 COG:BS_yitS KEGG:ns NR:ns ## COG: BS_yitS COG1307 # Protein_GI_number: 16078175 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 6 282 5 283 283 134 29.0 2e-31 MGRTAIVTDSNSGITQERARELGIRVLPMPFYINEKLYFEDITLTQEEFYEELKNDADIS TSQPSPADVLDLWDEVLKEYDEIVYIPMSSGLSSSCETAISLSREYDGKVQVVDNQRISV 
TQSKSVEDAMKLAEQGKNAKEIKEILETEKKESSIYITVDTLKYLKKGGRITSAAAAIGT VLNLKPVLQIQGGKLDAFAKVRGWKQAKKTMLEAVERDWQERFNGEEVYIQGAYTSSKEE AEQWKEEIENRFPNVTIEMRPLSLSVACHIGPGALAVTCTKKMR >gi|330405144|gb|ADLB01000012.1| GENE 134 135594 - 136817 505 407 aa, chain - ## HITS:1 COG:CAC1741 KEGG:ns NR:ns ## COG: CAC1741 COG1323 # Protein_GI_number: 15895018 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferase # Organism: Clostridium acetobutylicum # 1 405 1 401 402 248 39.0 1e-65 MKIVGLITEYNPFHNGHKHHLNEALKRTGADYAIVVMSGNFVQRGAPAIIPKHLRTQIAL EAGAAVVIELPVFFATGSAELFAAGAVSLLHSLNCVDSICFGTESGDIVSLSKIARILCD EPDDYKFFLQNHLKNGDSFPLARQKAFGEVTGDTSLSTILESPNNILGIEYIKALCRLNS SITPVAIQREESHYHDMELKTVYSSASAIRNVFAQAENSFSDISEILKDQIPRACISFLQ NTFQRRYPIYTNDFSLLLKYKLLTETKETLLTYMDINETLANRIYKNKNNFLTFEQFCEE IKTRDLTYTRVSRALLHILLHVKKADCAAPAYARILGFRQDSSTVLSVMKRQSRIPLITK LSSASLEEKESFYLKQDIFASDLYESVVTDKFQTPFINEHKQEIVRC >gi|330405144|gb|ADLB01000012.1| GENE 135 136985 - 138316 1530 443 aa, chain + ## HITS:1 COG:FN1480 KEGG:ns NR:ns ## COG: FN1480 COG2239 # Protein_GI_number: 19704812 # Func_class: P Inorganic ion transport and metabolism # Function: Mg/Co/Ni transporter MgtE (contains CBS domain) # Organism: Fusobacterium nucleatum # 4 436 8 442 449 384 46.0 1e-106 MTEEQMLELLEERKYKELKTELENLYPVDIAQILEEFNEKQRIIVFRLLTKEEAAETFTY MSSEMQEDLVNGLTDAEIEEFMEEMYLDDMVDVLEEMPANVVDRLLMATDEETRKQINTL LQYPEDSAGSIMNVEYIGLRKEMTVADAILKIRQVGINKETIYTCYVTEKRKLIGVVDVK DLLTTGENRLIEEIMETNMLYVNTHDDQEEVVSMINKYGLIAIPVVDHEMCMVGIVTVDD AMSVLQEETTEDMSVMAGIAPNEEPYFETSVWQHAKSRFPWLLFLMLSATVTGLILGHFE GALAVMPVLNTFVPMLTGTGGNCGSQSSTLIIRGLAVDEIEFKDIFKVVFKEVRIALIVG FLLAVVNGIRIILMYQNPMLAFAVGITLMCTVLLAKTIGCMLPLVAKKCGLDPAIMAAPL ITTLVDTGTILVYFTIVTRIFGI >gi|330405144|gb|ADLB01000012.1| GENE 136 138418 - 139413 1457 331 aa, chain + ## HITS:1 COG:MA3607 KEGG:ns NR:ns ## COG: MA3607 COG0280 # Protein_GI_number: 20092407 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Methanosarcina 
acetivorans str.C2A # 3 331 4 331 333 351 58.0 9e-97 MGFIDVIKAKAKANKKTIVLPETEDIRTYEAAEAVLREGTADLVLVGSKEEIEKNRAGFD ISGATIVDPATDERTAGYIAKLVELRQAKGMTEEKAKELLLSNYLYYGVMMVKMGDADGM VSGACHSTADTLRPCLQILKTKPGTKLVSAFFVMVVPDCEMGANGTFIFGDSGLEQNPDP EKLAAIALSSAESFKLLVGEEPKVAMLSHSTKGSAKHADVDKVVEATRIAKELAPDLMLD GELQLDAAIVPEIGESKAPGSAVAGHANVLIFPDLDAGNIGYKLVQRLAKAEAYGPLTQG IAKPVNDLSRGCSAKDIEGVVAITAVQAQAE >gi|330405144|gb|ADLB01000012.1| GENE 137 139468 - 140658 1446 396 aa, chain + ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 396 1 398 403 500 61.0 1e-141 MNVLVINCGSSSLKFQLINSETEGVLAKGLCERIGIDGSLTYQPAGKDKVTTEKPMPTHT EAIQFVIDALTDAETGVVKSLEEINAVGHRVVHGGEKFANSVVITDEVISAIEECNDLAP LHNPANLIGINACAKLMPNTPMVAVFDTAFHQTMPEAAYMYGLPYEYYDKYKVRRYGFHG TSHSFVSKRVAELLDKPYEESKTVVCHLGNGASVCAVKNGKSVDTSMGLTPLEGLVMGTR SGDIDPAILEFIAKKEDLDIAGLMNVLNKKSGVFGLSDNLSSDFRDLTAGAASGNKPAQI ALDVFCYRVAKYVGSYAAAMNGVDAIAFTAGIGENVCIVRSKVCEYLEFLGITVDEDANA KRGEEIMISTPDSKVKVLVVPTNEELAIARETVALI >gi|330405144|gb|ADLB01000012.1| GENE 138 140735 - 141262 567 175 aa, chain + ## HITS:1 COG:CAC1744 KEGG:ns NR:ns ## COG: CAC1744 COG1399 # Protein_GI_number: 15895021 # Func_class: R General function prediction only # Function: Predicted metal-binding, possibly nucleic acid-binding protein # Organism: Clostridium acetobutylicum # 35 171 13 139 140 86 36.0 3e-17 MLINLSDVLSEYHKTIEKDCLVEMEDFRSELGTFPITKKGNLHIVIEHIKGRELFIKGNV ELTIAIPCDRCLKDVPTEFRLDFEKTVDLCESTDAQKDELDEKNYIDGYHLDVDKLLYNE ILIEWPMKILCSDDCKGICNVCGQNLNEGTCDCEDTSLDPRMSVIRDVFKNFKEV >gi|330405144|gb|ADLB01000012.1| GENE 139 141266 - 141445 284 59 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227872300|ref|ZP_03990657.1| possible ribosomal protein L32 [Oribacterium sinus F0268] # 1 59 1 59 60 114 86 5e-24 MSICPKNKSSKARRDKRRANWKMSAPNLVKCSKCGELMMPHRVCKACGSYNKKEIISVD >gi|330405144|gb|ADLB01000012.1| GENE 140 141572 - 142585 1154 337 aa, 
chain + ## HITS:1 COG:CAC1746 KEGG:ns NR:ns ## COG: CAC1746 COG0416 # Protein_GI_number: 15895023 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Clostridium acetobutylicum # 7 334 3 329 331 314 47.0 2e-85 MSEITKVALDAMGGDNAPGAIVQGAIDVVAKRQDIHIFLVGQENVIEEELNKYEYPKEQI SIVNATEIIETAEPPVMAIRRKKDSSIVVAMNMVKNGKADAFVSAGSSGAILVGGQVLVG RIKGVERPPLAPLIPTQKGVSLLIDCGANVDARPSHLVQFAKMGSIYMRNVIGIENPKVA IVNIGAEEEKGNALVKETFPLLKECEDINFIGSIEAREIPYGKADVIVCEAFVGNVILKL FEGVGSALISEIKKGLMTNARSKMGALLVKPALKETLKKFDASEYGGAPLLGLNGLVVKT HGNSTAKEVANSIIQCVTFKEQNINDKIKETIQKKTE >gi|330405144|gb|ADLB01000012.1| GENE 141 142611 - 142841 479 76 aa, chain + ## HITS:1 COG:CAC1747 KEGG:ns NR:ns ## COG: CAC1747 COG0236 # Protein_GI_number: 15895024 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Clostridium acetobutylicum # 1 72 1 72 77 60 56.0 9e-10 MEFEKIKAIIEDVLNVGPEEITMDTTFVDDLGADSLDIFQIIMGIEEAFDIEIENEDAEK IVTVGDAVEQIKNAIN >gi|330405144|gb|ADLB01000012.1| GENE 142 142893 - 143588 573 231 aa, chain + ## HITS:1 COG:lin1919 KEGG:ns NR:ns ## COG: lin1919 COG0571 # Protein_GI_number: 16800985 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Listeria innocua # 6 228 5 229 229 191 48.0 8e-49 MRQKLKELEKKIGYKFQRFDLLCQAMRHSSYANEKHMEKNECNERLEFLGDAVLELVSSE FLFFENKKMPEGELTRTRASIVCEPSLAFCARELNLGSYLLLGKGEENTGGRYRESLTSD ALEALIGAIYLDGGFASAKEFIHRFVLNDLEHKKLFFDSKTILQEIVQGNFNEPIEYTLL KEEGPDHNKSFFISVSIGNEVYGNGKGRTKKAAEQEAAYQAILELHKRKLK >gi|330405144|gb|ADLB01000012.1| GENE 143 143596 - 147156 3648 1186 aa, chain + ## HITS:1 COG:BS_smc KEGG:ns NR:ns ## COG: BS_smc COG1196 # Protein_GI_number: 16078657 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Chromosome segregation ATPases # Organism: Bacillus subtilis # 1 1181 1 1180 1186 615 37.0 1e-175 
MYLKSIEVQGFKSFANKIVFDFHNGITGIVGPNGSGKSNVADAVRWVLGEQRAKQLRGGS MQDVIFSGTENRKPLSYASVAITLDNADHQLAIDFQEVTVTRKLYRSGESEYLINGSICR LKDVNELFYDTGIGKEGYSIIGQGQIDKILSGKPEERRELFDEAAGIVKFKRRKNMSVKK LEEETQNLLRVTDILSELEKQIGPLEKQSEKAKEYLKKKEELKSYDINLFLMESVRIRKQ IGEVERQLANAQEEFDAAQKKYNDTKVEYEEIESQLDEIDGTMEHTKSQLNETHLLKQQL EGQIELLKEQINSAKMSDEHFAQRSTHIHVEITERKKNEEEYLKEQNILQQQLDTQKTAE NEVNSQLAAIQLRITELTANIEQWKQDIMDMLNHRAVTKAKIQHFDTLLEQMKVRKAEMN KKLIEISSDVSVQDESIRGYEKDLQEISEEIQTYVAEVKSQEEKIQELQRTLAKRTEQLR AGQTAYHREASRLESLKNITERYDGYGNSIRKVMEKKEQEQGLLGVVADLIKVDKAYEIA IETALGGSIQNIVTDNENTAKRMIQYLKQNKFGRATFLPLTAITGGGGIRQPEVLREKGV IGLANTLVTVEDKFKVLADSLLGRTIVVEKIDDGIALARKYRQSLRIVTVEGELINPGGA MTGGAFKNTSNLLSRRREIEEFEKTVVQLKKEMTVMEEEIAVIKQERAGCYEKIEENNQK LQKKYVIQNTLKMNLNQANTKKENILKMTGDMHREGKELEEQAAELLENQESIMIELDTS EKLERELNKKIEEQQLLLEKEKVSENETMQRAESIHLTLANLEQKNEFILENLSRIHEEM AKFGEELEQLRQNKENSSLEIGEKEKQIEEIRTTITNSKELFEEIELKIQNLSQEKEILT QKNKDFLTKREELSKHMSDLDKESFRLNSKKETFEETLEKQINYMWEEYEITYSKARELR NETFTDLSEIKRQIQLLKSEIRGLGSVNVNAIEDYKNVSERYDFLKTQYDDLVEAKETLI QIIEELDTAMRKQFAERFKEIASEFDKVFKQLFGGGKGTLELMEDEDILEAGIRIIAQPP GKKLQNMMQLSGGEKALTAISLLFAIQNLKPSPFCLLDEIEAALDDSNVTRFAQYLHKLT KNTQFIVITHRRGTMTAADRLYGITMQEKGVSTLVSVDLLEKDLTK >gi|330405144|gb|ADLB01000012.1| GENE 144 147167 - 148111 776 314 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 15 306 25 320 336 303 50 5e-81 MGEEKKGFFKRLVSGLTKTRDSIVSSMDSIFNGFSKIDEEFYEELEEVLIMGDLGVQATY AVLDDLRAKVKEQRIKEPMECKQLLIDSIKEQMRVGETAYEFEDRTSVVLVIGVNGVGKT TTIGKLAGKLRANNKKVVLAAADTFRAAAGEQLVEWARRADAELIGGQDGADPAAIVYDA VAAAKARHADVLLCDTAGRLHNKKNLMEELKKINRVLEREYPEAYRETLVVLDGTTGQNA LAQAREFNEVADITGIVLTKMDGSAKGGIAVAIQSELKIPVKYIGVGESIDDLQKFDSDE FVNALFYTDESSDR >gi|330405144|gb|ADLB01000012.1| GENE 145 148127 - 148540 469 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210612579|ref|ZP_03289370.1| ## NR: gi|210612579|ref|ZP_03289370.1| hypothetical protein CLONEX_01572 [Clostridium nexile DSM 1787] # 1 
136 1 157 158 83 38.0 5e-15 MEYKTVTKGECYRFPEGKQCQVMELATDFQTGEEMVVYKENDKEEKVYVRTLVSFIKEME KVSDEDMRLIEDFLDIVENEQKLYFLQKHKKDITEKFMSIAAQSMDFAEKETSMEMRYQE LLHFIRMKMKYEGGRLH >gi|330405144|gb|ADLB01000012.1| GENE 146 148654 - 149301 662 215 aa, chain + ## HITS:1 COG:mll6782 KEGG:ns NR:ns ## COG: mll6782 COG1802 # Protein_GI_number: 13475658 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 4 209 9 214 224 109 30.0 4e-24 MEKDKGYSLSEKVFHKLRNDILSGVYQENDELREMTIGKELGVSRTPVREAFKKLELEGL VKTIRNKGTYVTGISKKDIHDIFIIRSLLENLCTEWVIDNITQEDMNELEEIILLSEFHL KKSGTDKMAILDGKFHRILYEACGSKILKHILSDLYQYVQFARVYSVKTEGRLEKSVAEH KAILEAIKKKDKLLAGELITEHMKCVLENLEKHGF >gi|330405144|gb|ADLB01000012.1| GENE 147 149321 - 150517 1470 398 aa, chain + ## HITS:1 COG:TM1148 KEGG:ns NR:ns ## COG: TM1148 COG0538 # Protein_GI_number: 15643905 # Func_class: C Energy production and conversion # Function: Isocitrate dehydrogenases # Organism: Thermotoga maritima # 1 398 1 395 399 493 58.0 1e-139 MEKIKMTTPLVEMDGDEMTRILWKMIKDTLLLPYIDLKTEYYDLGLEKRNATNDQVTIDS ANATKKYGVAVKCATITPNAARMTEYNLKEMWKSPNGTIRAMLDGTVFRAPIVVKGIEPC VKNWKKPITLARHAYGDVYKNTEMIIDAPGKVELVYTAEDGTEKRELIHQFNGAGIAQGI HNTDDSIESFARSCFNYALETKQDLWFATKDTISKKYDHTFKDIFQEIYEKEYDAKFKEC GIEYFYTLIDDAVARVMKAEGGFIWACKNYDGDVMSDMVSSAFGSLAMMTSVLVSPEGYY EYEAAHGTVQRHYYKHLKGEETSTNSVATIFAWTGALRKRGELDNNKELMQFADKLEKAT LDTIESGKMTKDLALITSIPNPTVLNSEEFIQAIAELL >gi|330405144|gb|ADLB01000012.1| GENE 148 150557 - 152473 1840 638 aa, chain - ## HITS:1 COG:CAC2714 KEGG:ns NR:ns ## COG: CAC2714 COG0488 # Protein_GI_number: 15895971 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 637 2 638 643 533 47.0 1e-151 MILACHNLNKAFGEQIIVKDGSFHIEEHEKAALIGLNGAGKSTILKMIMGELPVDSGEII LAKGKTIGYLSQHQKLESGNTIYEEVKTAKSDIITLENQIRSIELELKHLTGEELEKRLQ TYHHLTASFERANGYAYESEITGVLKGLGFSEEDFGKSVDTLSGGQKTRVSLGKLLLTKP 
DILLLDEPTNHLDLNSIAWLETYLSNYQGAVLIVSHDRYFLNRVVTKVIEVEQGNIMMFL GNYTAFAQKKKLVREAKLKEYMKQQQEIKHQEAVIEKLRSFNREKSIRRAESREKMLDKM TLVEKPTEVTSEFHISLEPSCVSGNDVLTVEHLSKSFGALKLFQDISFEIKRGEHVAVIG DNGTGKTTLLKILNEVQPADSGVFTLGTNVHIGYYDQEHHVLHMEKTIFDEISDDYPTLT NTQIRNTLAAFLFTGDDVYKQIKDLSGGERGRVSLAKLMLSEANFLILDEPTNHLDITSK EILEKALNDYSGTILYVSHDRYFINQTATRILDLVNHTFVNYIGNYDYYLEKKEELTAAY TNVNTDTLSASAPTESETKLDWKAQKEQQAKERKRQNDLRKTEERISALEERDGEIDELM TQEDVYTNSVKCRELSEEKANIMQELELLYEKWEELAE >gi|330405144|gb|ADLB01000012.1| GENE 149 152603 - 153247 730 214 aa, chain + ## HITS:1 COG:CAC2713 KEGG:ns NR:ns ## COG: CAC2713 COG2344 # Protein_GI_number: 15895970 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Clostridium acetobutylicum # 2 210 3 211 214 234 56.0 1e-61 MEEREISQAVIRRLPRYYRYLGELLEDNVERISSNELSRRMQVTASQIRQDLNNFGGFGQ QGYGYNVKYLYTEIGKILGLDIDHNMIIIGAGNLGQALANYTSFERRGFIFKGIFDVNPR LSGVSIRGVPIRMMDELKDFINEHNIDIAVLTIPKAKAVEVAKLLVDNGVRAIWNFAHTD LDLPENVIVENVHLSESLMRLSYNLSRYNKEHTE >gi|330405144|gb|ADLB01000012.1| GENE 150 153344 - 154843 642 499 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90022317|ref|YP_528144.1| ribosomal protein S15 [Saccharophagus degradans 2-40] # 6 490 12 499 500 251 31 2e-65 MRYVLTGEEMQYADRYTIEQMGVPSCVLMERAALKVVEILEQNNVDCSKTLIVCGSGNNG GDGLAIARLLYLKGIHVNVCYMGNKETASEENKRQYTIAENYGISIGNTLEKKEYSVIID ALLGIGLKRDVSGKYKEVIEQLNEMNGTKVAVDIPTGICDTTGEIKGCAFRADFTVCFAF EKIGMLFEKGRQYAGKIYVADIGISSEALPAGQKLYTYDKEDKDVSLPKRNPNGNKGTFG KVLIIAGSKGMAGAAYLNAKAAYSAGAGLVQIYTHEDNRVILQQALPEAIISTYETFNEN QLKQLFDRSDVLLIGSGLGKSDLSEKIFTYAIQYAKIPSVIDGDGLTLLAEDLSLLEHKK QVILTPHLKEMSRLLQCSVEEVQKNKITVLKNFVREYPIVCAMKDARTFVAADGEDIYIN TTGNHAMAKAGAGDVLAGIIVGFLAQKMKCKDACEIGVYVHGLCGDYAQKEKGSYSVLAN DLLDAISHVMRKLEKEQTR >gi|330405144|gb|ADLB01000012.1| GENE 151 154840 - 156006 1155 388 aa, chain + ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # 
Organism: Clostridium acetobutylicum # 7 386 8 385 386 319 42.0 6e-87 MKTYSRVYAKIDLDAVTYNMEQMKQRIDGDTKIMAVIKSDGYGHGAIQVAEVLEKYDYIW GFAVATLDEAVVLRTEGIQKPILVLGCIFPDQYLEMLDNDIRMNVYTEEMAKEISYMARR EGKTAYLHIKLDTGMARLGFAVNNESVDAITRISKLPNVNMEGVFTHFAKADETDKTFTK NQISQFVSMTEKLRERGVTFPYEHCANSAAIIDVEDARFDIVRAGISTYGLYPSEEVNQN AVHLKPALALKSHVAFVKEIEEGTPVSYGGTFVAEKKMKIATIPVGYGDGYPRSLSGKGY VLIRGKKANILGRICMDQFMVDVTDIEGVSFGDKVTLIGRDGNEAISVEKLSELSGRFNY EFICALGKRIPRVYVKNGKIAEQVDYFA >gi|330405144|gb|ADLB01000012.1| GENE 152 156091 - 156438 239 115 aa, chain + ## HITS:1 COG:BS_ydcE KEGG:ns NR:ns ## COG: BS_ydcE COG2337 # Protein_GI_number: 16077533 # Func_class: T Signal transduction mechanisms # Function: Growth inhibitor # Organism: Bacillus subtilis # 1 113 1 113 116 155 67.0 2e-38 MQVRRGDIYYADLSPVVGSEQGGIRPVLIIQNDVGNRHSPTVICAAITSKMNKAKLPTHV EIDAGKYHIVKNSVVLLEQIRTIDKQRLRELVCHVDKKLMLKIDEAIKISFELHT >gi|330405144|gb|ADLB01000012.1| GENE 153 156456 - 157295 795 279 aa, chain + ## HITS:1 COG:TM0845 KEGG:ns NR:ns ## COG: TM0845 COG1253 # Protein_GI_number: 15643608 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Thermotoga maritima # 26 277 178 428 455 166 39.0 6e-41 MEEGSSLWDRLMYNFSGEDKEQEDVEHEILSLIQEGRERGFFAGGEGEMISNIFSYNEKK AEDIMVHRKHIIALGSEKTVEEALEFMLEQKNSRFPVYEEDIDSIIGIFHLRDAVKCYFK ENLRNVPIKHLKEYMHPASFVPEAKSIDKLFKEMKLKKNHMVVVLDEYGQTAGIIAMEDI IEEIVGNIQDEYDDDEESIQKISNGVYIADGMTELEDLEKILRISFENEDFGTLNGYLVH CLEHIPTEEEECSVEYGGYRFYIISVSNNLIEKVRIEKI >gi|330405144|gb|ADLB01000012.1| GENE 154 157366 - 158094 928 242 aa, chain + ## HITS:1 COG:CAC2295 KEGG:ns NR:ns ## COG: CAC2295 COG0217 # Protein_GI_number: 15895562 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 239 1 237 246 242 56.0 5e-64 MSGHSKFANIKHKKEKNDAAKGKIFTIIGREIAVAVKEGGPDPANNSRLRDVIAKAKSNN MPNDTIERGIKKAAGDANSVNYESVTYEGYGPSGVAIIVDALTDNKNRTASNVRNAFTKG 
SGSVGTQGCVSYMFDKKGQIIVAKEEYETDADELMMIALDAGAEDFAEEEDSYEILTAPE DFSDVRLALEEAGVPMASAEVTMIPQTWVELTDETDLKNLQKTLDLLDEDDDVQVVYHNW DE >gi|330405144|gb|ADLB01000012.1| GENE 155 158129 - 158890 779 253 aa, chain + ## HITS:1 COG:no KEGG:Hbal_0324 NR:ns ## KEGG: Hbal_0324 # Name: not_defined # Def: hypothetical protein # Organism: H.baltica # Pathway: not_defined # 144 248 247 351 354 83 38.0 6e-15 MYSDAQETLINLFEDALILVKSFNRDVYKSDFETHYNKYKDFFVNVNESLREEEGKEAVI SELAGIIPNYIDGKLKKIGQKRKRENLILDYNMALVTYVLPVINYGVGEYGNAISEKIVE LWNNNISSVTIKNATYETITEGFKKRLCYITTAVCESLGKPDDCYELELLRNYRDKYLIE ENAGRDIVQQYYNIAPTIVKRIDKRDDAKGIYKRIWTDYLTPCIHLIEENEKEACKKVYS DMVYDLQKKYIYS >gi|330405144|gb|ADLB01000012.1| GENE 156 158903 - 160186 1280 427 aa, chain + ## HITS:1 COG:PM0738 KEGG:ns NR:ns ## COG: PM0738 COG2873 # Protein_GI_number: 15602603 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Pasteurella multocida # 9 424 5 418 422 523 59.0 1e-148 MSNNYKIETKCIQSGWEPKQGEPRVVPIYQSTTFKYDTSEQMGRLFDLEDSGYFYTRLQN PTNDVVANKICDLEGGVAAMLTSSGQAANFYAIMNICEAGDHIVCASALYGGTYNLYAHT IRKMGVEATFVDPDASEEEISSAFRDNTKALFGETISNPALVVLDLEKFAKVAHEHGVPF IVDNTFATPINCRPFEWGADIVTHSTTKYMDGHAMSVGGCIVDSGNFDWEAYGDKFSCLT APDETYHGIVYTKKFGKGAYITKATAQLMRDLGSIQSPQNAFLLNVGLETLHLRVPRHCE NAKKVAEFLKNHEKVAWVSYPDLEGDKYHALAQKYLPNGSCGVLTFGIKGGRDASVTLMD NLKLAAIVTHVADSRTSVLHPASHTHRQMNEQELIEAGVQPDLIRFSVGIENADDIIADL AQALELV >gi|330405144|gb|ADLB01000012.1| GENE 157 160299 - 160979 824 226 aa, chain + ## HITS:1 COG:BS_ymfB KEGG:ns NR:ns ## COG: BS_ymfB COG0740 # Protein_GI_number: 16078742 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Bacillus subtilis # 3 220 2 230 241 215 50.0 6e-56 MSENTEKQHEEIKEMGTLSLSHNKKRHNIQLLTIIGEIEGHEAVSGNTKATKYEHLLPKL AEVEDSEEIEGLLILLNTLGGDVEAGLAIAEMIASLSKPSVSLVLGGSHSIGGPLAVSAD 
YSFIVPSGTMVIHPVRSSGTFIGVMQSYRNMERTQNRITKFISEHANISQERIEELMLDP TQLVKDVGTMLEGEEAVKEGLIDEVGGMKEALQKLHELIEIQAKNK >gi|330405144|gb|ADLB01000012.1| GENE 158 161044 - 163509 2006 821 aa, chain + ## HITS:1 COG:CAC1812 KEGG:ns NR:ns ## COG: CAC1812 COG1674 # Protein_GI_number: 15895088 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Clostridium acetobutylicum # 16 813 20 765 765 565 45.0 1e-160 MAAKSNKSKKAAKRTPAKKAAKKQTRKKTAQQNTFLKNEIFILLSLAICILLLISNFGIG GFLGNIVSSFLFGTFGLPAYIIPILLFIGIAFTISNKGNSIAYIKLGAGIVLTIMVCTLT QLITFEYDSTAGLLSYYDTSALHKQGGGIIGGLFSKMLCHLVGVIGAYVIVIILSIICLV IITEKSFLRGVKKGSKKAYQTAKEDVRKRKEQSVIRKEEKRQNKELKRKDKQVKGVSFDT TLTPKETDIKEVFPEEMPPIYAEEPLESPFHTEAEEIVPEVVPETMVIHRATEEEPPKES VSKKKKEKNDVSAQIEEVEEVIAKESRQSSSTYTFPPIDLLNKGKQGNGDSDNYLRETAL KLQQTLKNFGVNVTVTNVSCGPSVTRFELQPEQGVKVSKIVGLSDDIKLNLAAADIRIEA PIPGKAAVGIEVPNRENTAVMLRDLLETKEFKSHPSNIAFAAGKDIAGKVVVADIKKMPH VLIAGATGSGKSVCINTLIMSILYKAKPDEVKLIMIDPKVVELSVYNGIPHLMIPVVTDP KKASGALNWAVVEMEKRYQLFAEYNVRDLNGYNDKVEQIKDIEDETKPEKLPQIVIIVDE LADLMMVAPGEVETAICRLAQLARAAGIHLVLATQRPSVNVITGLIKANMPSRIAFSVSS GVDSRTIIDMNGAEKLLGKGDMLFYPSGYQKPARVQGSFVSDKEVQAVVDFLVSQNGNVS YDEEITKQVNSASINGANSSAAAGNGNERDVYFADAGRFIIEKDKASIGMLQRVFKIGFN RAARIMDQLFEAGVVGEEEGTKPRKVLMSTEQFEQYIEENV >gi|330405144|gb|ADLB01000012.1| GENE 159 163528 - 165198 1834 556 aa, chain + ## HITS:1 COG:CAC3201 KEGG:ns NR:ns ## COG: CAC3201 COG2759 # Protein_GI_number: 15896448 # Func_class: F Nucleotide transport and metabolism # Function: Formyltetrahydrofolate synthetase # Organism: Clostridium acetobutylicum # 1 556 1 556 556 734 65.0 0 MKTDIQIAQEATMKHIKEVASSIGIQEDDLEFYGKYKAKISDELWEKAKDNKDGKLVLVT AINPTPAGEGKTTTSVGLGQAFAKLNKKAIIALREPSLGPCFGIKGGAAGGGYAQVVPME DLNLHFTGDFHAITSANNLLAAMLDNHIQQGNALQIDPRQVVWKRCLDMNDRVLRNIVVG LGNKMDGMVREDHFVITVASEIMAILCLANDLQDLKERLGKIIVAYNFAGEPVTADQLEA TGAMTALLKDAIKPNLIQTLEHTPALVHGGPFANIAHGCNSVRATKTALKLCDIAITEAG 
FGADLGAEKFMDIKCRSANLKPDAVVLVATVRALKYNGGVAKKDLAEENLDALKKGIVNL EKHIENVQKFGVPVVVTLNSFVTDTDAENEYIKNFCEERGCEFALSEVWEKGGEGGIALA EKVLETLEKKESNFHTLYSDELSLKEKIETISKEIYGAGSVEYAPAAEKQLAKIESMGFG NLPICMAKNQYSLSDDATLLGRPENFTIHIREVYVNAGAGFVVALTGAVMTMPGLPKVPA ANGIDVNEEGKITGLF >gi|330405144|gb|ADLB01000012.1| GENE 160 165216 - 166064 982 282 aa, chain + ## HITS:1 COG:lin1397 KEGG:ns NR:ns ## COG: lin1397 COG0190 # Protein_GI_number: 16800465 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Listeria innocua # 1 279 1 279 284 291 56.0 9e-79 MPRLIDGKKIALEIKNELKVKVEELKQQEKDVCLAVIQVGNDAASTVYVNGKKKDCEYIG IKSLSYELPEETTENELLDLIGKLNGDINVHGILVQLPLPSHMNEEMITQAISPEKDVDG FHEMNIGKLCIGENGFECCTPAGIIQLLKRSGIEIAGKECVVIGRSNNVGKPTALLLLRE NGTVTIAHSHTENLKEITKRADIVVAAIGKPKFITKEYLKQGAVVIDVGIHRNADNTLCG DVDFDDVAETVSAITPVPGGVGVMTRAMLMYNCVSAVYREAL >gi|330405144|gb|ADLB01000012.1| GENE 161 166064 - 167194 841 376 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1956 NR:ns ## KEGG: EUBREC_1956 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 367 1 433 433 158 26.0 3e-37 MRCRKCGAEIREGSLYCDMCGEEVRIVPDYSSLDEILAEHVRNELQSGNRTKKTTGRPKK KSRKHHKKMILILSSLLFIILIGLLLYQTSYAGYIKRGYNALDKKKYDTAIRRFESAIEK DEKRADGYSGMADVYLAKKDFYSAEKIFLDEIERQESNVGLYKGLIRVYMKSDKKEEIPK LLEECKSDSVLSALSGYIVETPSFSLTKDSYDNEQELKFYSNGNDIYYTLDGSTPTESSG AKYRKDTPILLGEGEWVVRAVAVNKKGIPSLLTKHTYKIQTPITEAPIVSPSSGLYEQQE KITIQVKEGFTAYYAFGTTEPTEENWKKYIGSIPMPQGNLIFSAVLVENSTGKASAITVR NYDLELDTQTEENIQQ >gi|330405144|gb|ADLB01000012.1| GENE 162 167309 - 168490 1543 393 aa, chain + ## HITS:1 COG:YPO2180_2 KEGG:ns NR:ns ## COG: YPO2180_2 COG1454 # Protein_GI_number: 16122410 # Func_class: C Energy production and conversion # Function: Alcohol dehydrogenase, class IV # Organism: Yersinia pestis # 6 390 6 416 441 330 44.0 2e-90 MSRFTLPRDIYHGKGCLEELKNLKGKRAILVVGGGSMKRQGFLDKAVNYLKEAGMEVQLF 
EGVEPDPSVETVMKGAEAMRAFEPDWIVAMGGGSPIDAAKAMWAFYEYPDVTFEDLCIPF NFPELRQKAKFAAIPSTSGTATEVTAFSVITDYAKGIKYPLADFNITPDVAIVDSELVEG LPAHQVAYTGMDALTHAIEAYVSTLNCAYTDPLAIQAIEMVFDYLPASFRGNMTARAKMH DAQCLAGMAFSNALLGIVHSMAHKTGAAFSTGHITHGLANAMYLPYVIKYNAKDPVAAKR YAEIARRVGLEGSSEQSLINSLCEKINEFNVILGIPATLKEFGIKEDEFKEKVSAVAENA VGDACTGSNPRAIDPATMERLFTCIYYGTEVDF >gi|330405144|gb|ADLB01000012.1| GENE 163 168629 - 169519 995 296 aa, chain + ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 272 1 264 278 257 50.0 2e-68 MYKILKAEELADKIYLMDVEAPRVAKHCEPGQFVIVKIDDKGERIPLTICDYDREKGTIT IVFQIVGASTQKMAELKVGDSFRDFTGPLGCASEFVSEDLEALKNKKMLFVAGGVGAAPV YPQVKWLREKGIDADVIVGCKTKNMLILEKEMEAVAGNLYVCTDDGSYGHAGMVTSMIEK LVKEEGKKYDVCVAIGPMIMMKFVCLLTKQLDLPTIVSMNPIMVDGTGMCGACRLQVGDE IKFACVDGPEFDGHLVDFDQAMKRQQMYKTEEGRAMLKLQEGDTHHGGCGHCGGDN >gi|330405144|gb|ADLB01000012.1| GENE 164 169519 - 170907 1706 462 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 6 462 4 462 468 509 57.0 1e-144 MADVLKKVPVREQEPEVRAKNFEEVCLGYNKEEAMEEATRCINCKNAKCIEGCPVSINIP AFIKEVKEGNIEEAYKVIGQSSALPAVCGRVCPQESQCEGKCIRGIKGEPVSIGKLERFV ADYALENNIKPVGAEKTNGHKVAVIGSGPAGLTCAGDLAKLGYEVTVFEALHELGGVLVY GIPEFRLPKHAVVAKEIEKVKELGVKFETNVVIGKSTTIDQLIEEEGFEAVFIGSGAGLP KFMGIPGENANGVFSANEYLTRSNLMKAFDESYDTPIAAGHKVAVIGGGNVAMDAARTAL RLGAEVHIVYRRSEEELPARVEEIHHAKEEGIIFDLLTNPKEILVDDAGYVKGMVCVKMK LGEPDASGRRRPIEVPDSEFTIDLDTVIMSLGTSPNPLISSTTEGLEINKWKCIVAEEET GKTTKEGVYAGGDAVTGAATVILAMGAGKAGAKGIHEYLSNK >gi|330405144|gb|ADLB01000012.1| GENE 165 171035 - 171934 893 299 aa, chain + ## 
HITS:1 COG:CAC0076 KEGG:ns NR:ns ## COG: CAC0076 COG0697 # Protein_GI_number: 15893372 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 1 294 1 287 303 239 44.0 4e-63 MQKEKIRNSILLLLTAVIWGTAFVAQSVGMDYIGPFTFNAARFLIGGTVLIPLIVYRSKK NPLLKNQTLEEKRKNQKTEWIGGVCCGIALCGASLLQQMGIQHTTVGKAGFITTLYIIIV PLIELFFGKKIAKKIWLGAVMAVIGLYLLCINENFSIGKGDFLILVCAILFAIHILIIDH FSPKADGVVLSAIQFFVSGFISVIGAILVENPNPAAMLDAIVPILYAGVMSCGVAYTLQV IGQKNISPTVASMILSLESVISVLAGWIILGEALSAKEIVGCVIVFMAVVLVQLPEKKK >gi|330405144|gb|ADLB01000012.1| GENE 166 172045 - 172569 657 174 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_1199 NR:ns ## KEGG: Dhaf_1199 # Name: not_defined # Def: C_GCAxxG_C_C family protein # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 17 158 23 160 160 141 52.0 8e-33 MKIGVTEISPRSVQQVAEKKFKEGYYCCEALVSAIRDEFKLDIPEEVIAMASGMAVGAGK SGCVCGAFNGGILALGLFFGRTEQDGPTNPKSIKCMELTKELHDWFKEANGKNAICCRVL TKEFDKGRGEHKEQCIFFTGLCTWKVAQIICRELGIKNLDEIDEPAPRRTLDEI >gi|330405144|gb|ADLB01000012.1| GENE 167 172589 - 173509 706 306 aa, chain + ## HITS:1 COG:L181867 KEGG:ns NR:ns ## COG: L181867 COG1275 # Protein_GI_number: 15672360 # Func_class: P Inorganic ion transport and metabolism # Function: Tellurite resistance protein and related permeases # Organism: Lactococcus lactis # 4 301 9 307 324 131 35.0 2e-30 MKKLIQKVPIPLCGVMLGFAALGNLLQSYGEGVRYACGIAAAFLLILILLKLVMFPGMIK EDMQNPIMASVSGTFPMALMLLSTYVKPFIGTGAKFIWFFAIGLHIVLIIYFTLKFMLKI QMVKVFASYFIVYVGIAVAAVTAPVFEEIKIGTYAFWFGLITLLLLLVLVTIRYVKYTDV PESAKPLFCIYAAPTSLCIAGYVQSVTPKSKEFLLAMLLVATVLYILTLVKAIGYLKLKF YPSYAAFTFPFVISAIATKQTMACLANMKQPLPILQYVVLIETVIATLFVIYTFVRFMQF IFKGNN >gi|330405144|gb|ADLB01000012.1| GENE 168 173629 - 174108 588 159 aa, chain + ## HITS:1 COG:CAC3536 KEGG:ns NR:ns ## COG: CAC3536 COG1576 # Protein_GI_number: 15896772 # Func_class: S Function unknown # Function: 
Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 159 1 159 159 185 61.0 3e-47 MKITVITVGKLKEKYLKDAIAEYAKRLSKYCKLEIIEVADEKTPDTASGIVEDQIRSKEA ERIMKYVKEDAHVITLEIAGKQLTSEEFADKIEKLGVQGTSHITFIIGGSIGLGKEVLKR SDYALSFSKMTFPHQLMRVILLEQIYRGYRIINGEPYHK >gi|330405144|gb|ADLB01000012.1| GENE 169 174123 - 174794 527 223 aa, chain + ## HITS:1 COG:CAC3340 KEGG:ns NR:ns ## COG: CAC3340 COG2357 # Protein_GI_number: 15896583 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 20 220 14 214 217 204 53.0 8e-53 MLKHEEEELMTLFLKENSENVVENLMEFNKLMMKYRSAIREVTTKLEVLNDELSLEGDRN PIESIQSRIKKPISIAGKLQRLGKKFNIEDIQENLNDIAGIRVICPFIEDIYRVSEMLTK QDDISVVKVKDYIKNPKTNGYRSYHLILEIPVFFSDCKQPMRVEVQIRTVAMNFWASLEH QIRYKKNISNQQEICEELKKCADTIASTDIRMQELRQWIESEQ >gi|330405144|gb|ADLB01000012.1| GENE 170 174871 - 175716 752 281 aa, chain + ## HITS:1 COG:SMb21579 KEGG:ns NR:ns ## COG: SMb21579 COG0789 # Protein_GI_number: 16264767 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Sinorhizobium meliloti # 2 69 1 68 147 60 38.0 4e-09 MLNIKSVEEQTGITKQNIRYYEKKGLLSVKRNEENSYREYDDEDIRTLKIIKLLRKLDMP IEEIRKVLAEEISLSKAINTQKEYLEKEREKLQDAISFCDKVEPNTLSSLDIDKYLAAME REEKNGSVFADIIEDFKKMAQAEAVCAFSFRPDTMCMNPAEFTEELLKYAEDNHLSIFIT KESMYPEFELNGIAYTAYREFGRFGANIVCKAKDADKIKPKDVSGKRTILYSIFKTLFPI MVIALPIYLAFMPKTGITEIFLLIYIIVLAGGFVLYNRLKK >gi|330405144|gb|ADLB01000012.1| GENE 171 175734 - 176516 720 260 aa, chain + ## HITS:1 COG:CAC0728 KEGG:ns NR:ns ## COG: CAC0728 COG0500 # Protein_GI_number: 15894015 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Clostridium acetobutylicum # 1 257 15 271 272 391 72.0 1e-109 MTVELEKYYNKFCEEKRLTRRHGQVEFITSMTYIHEYLKGKGQASILDVGAGTGRYSVSL AEEGYDVTAVELVKYNLGILKSKGSSVKAYQGTALNLSRFQDNTFDIVLVFGPMYHLYTM 
EDKIQALKEAKRVAKKDGIILVAYCMNEYSILTYGFKENHIEESIKTGKVNEEFHVVSQP EDLYDYVRIEDIDEVRKGAGLERIKLISADGPANYMRPVLNAMTPETFETFIQYHLSTCE RPELLGAGAHTLDILKKTEV >gi|330405144|gb|ADLB01000012.1| GENE 172 176529 - 177209 507 226 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0177 NR:ns ## KEGG: Cphy_0177 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 224 1 224 224 295 60.0 7e-79 MRREVSMEEISDGKLYTANDLVKADCNDCRGCSACCQKMGQSIVLDPLDIYRLTFGLRQK FEELLADKLELNVVEGIILPNLQMSGTEEKCAFLNAEGRCSVHAIRPGICRLFPLGRYYD GKSFQYFLQIHECKKENRTKVKVKKWIDTPDLKKNEKYISDWHYFLTDMQKTLATLQNDE KVKKINMLILQTFFLEPYHQEEDFYVQFYDRLEKLKTSLQLLTRRE >gi|330405144|gb|ADLB01000012.1| GENE 173 177213 - 178157 766 314 aa, chain + ## HITS:1 COG:CAC3454 KEGG:ns NR:ns ## COG: CAC3454 COG0042 # Protein_GI_number: 15896694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-dihydrouridine synthase # Organism: Clostridium acetobutylicum # 1 306 1 306 311 351 55.0 1e-96 MNYYLAPLESVTTYIFRNAYHRYFLPLDKYFTPFIVPHPNKKFNTREKKELSTEHNHGLY VVPQLLTNKAEDFITTAKEIENMGYKEINLNLGCPSGTVVAKGKGSGFLAYPEELDRFLD NIYSKLDMKISVKTRIGKTSPDEFYRLIEIYNKYPLEELIVHPRLQTDFYKNKPNLQIFK EAVEMSRHSLCYNGDIYTVEDFQKFSEHFPTVEKIMCGRGAVANPSLFDAIANGKKLDIN TLKKFHDDIYMDYQEISSGERNVLFKMKELWSYLIQSFDHAEKYMKKIKKSEKLLAYEKA VSDLFANCPLKKGI >gi|330405144|gb|ADLB01000012.1| GENE 174 178163 - 178873 529 236 aa, chain + ## HITS:1 COG:BH2418 KEGG:ns NR:ns ## COG: BH2418 COG2176 # Protein_GI_number: 15614981 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit (gram-positive type) # Organism: Bacillus halodurans # 5 166 419 581 1433 123 41.0 3e-28 MRKTYISFDLETTGLSEEKDYIIEIGAVKVNNGKVVDRFARFLKPPISISNTITNLTGIT DDMVKNAGDTQETIKDFIKFCDGYFLLGHNIMFDYKFMKTYAARFGLHFEKEGIDTLKIA RKVHKNLESKSLESLCTYYNIVNASAHRAYHDALATAKLYHMLAHDFEVSEPSLFTPETL CYKPKKVQKITKKQLIYLTSLLEYHSLSEETDLETLTRSEASKLIDKIILTHGYMR >gi|330405144|gb|ADLB01000012.1| GENE 175 179045 - 179725 283 226 aa, chain + ## HITS:1 COG:no 
KEGG:Huta_1008 NR:ns ## KEGG: Huta_1008 # Name: not_defined # Def: hypothetical protein # Organism: H.utahensis # Pathway: not_defined # 10 222 10 213 221 100 28.0 6e-20 MFFFFFSKPYISIVGEVLPAKHSEENKEQHRKLITVLEKINTEHDEDISAKFMPTIENEF QGLICKGCNLTKLLLEIKNAMHPDKIRFGIGIGEIETDFQSDNAITVSGPGYEKAREALR FLQSREKKKQMAFSDTYLLTAEERNDRIHLINTIFSLLSTLELSWSDRQREIIYNMIKYD DTQTSAAGRLGVTQSTIQKSLTAGNYYAYAEASSTLETIFSEIGDK >gi|330405144|gb|ADLB01000012.1| GENE 176 179725 - 182340 1975 871 aa, chain + ## HITS:1 COG:CAC0854 KEGG:ns NR:ns ## COG: CAC0854 COG0480 # Protein_GI_number: 15894141 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 7 636 4 640 644 546 44.0 1e-155 MKSNKLTIGILAHVDAGKTTLAESILYTTGHIRKLGRVDHKNAFLDTDALERSRGITIFS KQVNFVLKEKEITLLDTPGHIDFSAEMERTLQVLDYAILVISGADGVQSHVQTLWKLLKK YNIPVFLFVNKMDQQGTDKKRLIKELQNRLNEHCIAFDGEEAKESFMENVAMCDELLLEK YLETESISRFDIQKLIENRKLFPCYFGSALKLEGIETFLEGIYQYTDKKDYGEEFGAKVY KISRDQQGNRLSHLKITGGTLKVKEKIKEDEKVEQIRIYSGTSYQAVNEVAAGTICAVMG LTSSKAGEGLGKELQSEKPMLEPVLTYRIQFPEGSDIHGMFLNLKQLEEEEPELHIVRNK ESGEIHAQVMGEIQLEILKSLISERFHTDVSFGQGQIIYKETIKETVEGAGHFEPLRHYA EVRLRLEPAKLGSGLHFSTECSEDILNRNWQRLILTHLEEKQHRGVLTGSEVTDMNIILT TGKAHIKHTEGGDFRQATYRAVRQGLKRAESILLEPVYEYRLEIPSEMVGRAMSDLQRMQ ADFSLPESEGEMSVLTGTAPVVHMQEYQKEVLAYTKGHGRLFCSLRGYEPCHNAEEVIEK IQYDSESDLENPTGSVFCSHGAGFNVSWDNVEEYMHTSPQEKKTEKEISHQSAYLPPVID EEELLEIFERTYGPVKRKKNNFKKQISSPVIAGRTQKKKEYDKEYLLVDGYNIIFAWEDL KELSEVNIEGARNKLMDILSNYQGYRQCTLILVFDAYKVPGNVGEVQKYHNIHVVYTKEA ETADQYIEKTVHELGKHYKVTVATSDGLEQMIIMGQGANRLSAKGLLEEILIANEEIRKE HLNQPVKNKQYLFDNVSEDMAEYIEQVRKGE >gi|330405144|gb|ADLB01000012.1| GENE 177 182397 - 183830 1552 477 aa, chain - ## HITS:1 COG:mlr4653 KEGG:ns NR:ns ## COG: mlr4653 COG0076 # Protein_GI_number: 13473905 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Mesorhizobium loti # 2 471 32 509 517 216 28.0 
7e-56 MINSENILNSAQLEEAIENFVHEFCAEKHDIHSQKVTWYTSEGQIEKIKQIGISKDGRPV DEVIAEMSKEVYRYRGDANHPRFFGFVPGPASSVSWLGDIMTSAYNIHAGGSKLAPMVNC IEQEVLRWLCEQVGFTKNPGGVFVSGGSMANITALTAARDCKLNDENLHLGVAYVSDQTH SSVAKGLRIIGIPNSRIRSVATNSAFQMDTDMLKEMIAKDKENGLIPFVVIGTAGTTNTG SIDPLEEIADICANNNMWFHIDGAYGASVLLSPKYKHLLKGTELADSISWDAHKWLFQTY GCAMVLVKDIKHLFHSFHVNPEYLKDVEGDMEHINTWDIGMELTRPARGLKLWLTLQVLG TRLIGSAIEHGFQLAEWAEEALNELDNWEVISKAQLAMLNFRYAADDLTDEQMDLLNEKV SEKIVESGYAAVFTTILNGKKVLRICALHPETTRDDMRTTIHLLDTYARELHEKMKK >gi|330405144|gb|ADLB01000012.1| GENE 178 183947 - 184822 612 291 aa, chain + ## HITS:1 COG:CAC0023 KEGG:ns NR:ns ## COG: CAC0023 COG0583 # Protein_GI_number: 15893321 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 1 291 1 291 299 183 31.0 4e-46 MELRNINTFLKVAGTQNFSKAAEQLGYSQSAVTVQIQQLENELQTQLFERIGKRVYLTEK GQEFVSYANDIMRVTDNARDFAKQSDILEGTLRIGGVESICTALLPDLLLKFYQICPNVQ VTIKSGTTNELMEMAKSNELDLIYTLDKKILGREWTRATIMEEEIVFVTLADRTENFSDK VPVQKLIEKPFILTEMGAAYQYELERLLSEKDLEINPILEIGNTETIINLLKRGMGVSFL PKFTVQRELEKNVLSQVRTNLPGVKMYSQLFYHKNKWVTKQMRNFIELVKE >gi|330405144|gb|ADLB01000012.1| GENE 179 185030 - 186895 1764 621 aa, chain + ## HITS:1 COG:CAC1254 KEGG:ns NR:ns ## COG: CAC1254 COG1032 # Protein_GI_number: 15894536 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 8 619 6 615 622 713 53.0 0 MRKLALDDEILLKIEKPARYIGNEVNSVMKDKNEVDIRFAMCFPDVYEIGMSHLGIQILY DMFNRRDDVWCERVYSPWVDLDKIMREEHIPLFALESQDAVKDFDFLGITIQYEMCYTNI LQVLDLSRIPLFAKDRGEEDPIVIGGGPCTYNPEPLADFFDIFYIGEGETVYNELLDAYK ENKKNGGTRQTFLEMAAEIEGLYVPAFYDVEYKEDGTILSFTPNNTHAKEKIRKQMVSDL SESSYPVKPVVPFIKVTQDRVVLEIMRGCIRGCRFCQAGMIYRPTREKNLEKLKDYAYQM LKNTGHEEISLSSLSSSDYTQLEGLVTFLIEEFKGKGINISLPSLRIDAFSLDVMGKVQD IRKSSLTFAPEAGSQRLRDVINKGLTEEIILEGAGQAFEGGWSRVKLYFMLGLPTETEDD MKEIAHLAEKVAKRYYEVPKENRNGKCQIVASTSFFVPKPFTPFQWASMCTSEEYIGKAH IVNNEMKAQLNKKSLKYNWHEADVTVLEGVFARGDRKVGKALLEAYKLGCLYDSWSEYFK 
NDLWLQAFENTGIDIGFYNLRERELDEIFPWDFIDIGVTKKFLIREWTRAMEGKVTPNCR MGCSGCGAAIYGGGVCIEGKN >gi|330405144|gb|ADLB01000012.1| GENE 180 186879 - 187586 751 235 aa, chain + ## HITS:1 COG:CAC1255 KEGG:ns NR:ns ## COG: CAC1255 COG5011 # Protein_GI_number: 15894537 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 213 1 224 238 119 37.0 5e-27 MKARIKFKKYGVMKFIGHLDIMRYFQKAMRRADIPIAFSGGYSPHMIMSFANPLGVGVTS DGEYFDIELTEPIASDVAVKHLNDVMVEGMEIVSFVEISEDKKKTGMAIVAAADYVSTVK NGELPADWKEKAKDFFAQDEIIITKKTKKSEKEVNIKPMIYQFEVIENSLYSFVATGSVE NLKPGLVMEAFLNYLGIDSSSITFSHHRLEVYANAGSEEKREFVSLESLGTKIGS >gi|330405144|gb|ADLB01000012.1| GENE 181 187519 - 188730 769 403 aa, chain + ## HITS:1 COG:XF1125 KEGG:ns NR:ns ## COG: XF1125 COG1530 # Protein_GI_number: 15837727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Xylella fastidiosa 9a5c # 36 401 17 409 497 187 32.0 4e-47 MPDQKKKENSFLWKVWVQKLDRKLIMTKLQENVLTSVKENDEIVELHISNTEDKYRLGNI YIGKVKKIVSNIQAAFIEIDKGVECYYDMTEHGNKPLRVGEELIVQISKEAIKTKQPAVT RKISFTGKYCVLTVGDNRISFSSKIDKEKREELRQFMEPYHSEKYGFILRTNAKGAAVCD IAEEINRLIEEYEYIFRIAPTRVCFSCLKENEKPYIADLKNIYQEGLTDIVIEDKEIYNH VYSFLQREQPEDLDKLRLYEDKQLPLAKLYSIETVLQNALKERVWLKSGAYLVIQPTEAL TVIDVNTGKCIGKKRDEVAYLKINIEAAKEVARQIRLRNLSGIILVDFINLDDKEKWEEL LNYLKIHLRKDPIQTVLVDVTKLQLVEITRKKVRKPLHENIRG >gi|330405144|gb|ADLB01000012.1| GENE 182 188735 - 189217 543 160 aa, chain + ## HITS:1 COG:BS_yjdI KEGG:ns NR:ns ## COG: BS_yjdI COG2606 # Protein_GI_number: 16078271 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 158 1 159 159 167 52.0 1e-41 MAKKETKTNAVRILERNKITFEMMTYECDEFIDGIHIADKLGIPHEIVYKTIVTVGKSKG YYVFVLPIEKEIDFKKAAKSVGEKSLEMLPLKDLTPLTGYVRGGCTSIGMKKQFPTVIDR SAFELPHIIVSGGKLGLQLKLSPHDLANVVRAEFADVIFT >gi|330405144|gb|ADLB01000012.1| GENE 183 189230 - 189826 415 198 aa, chain + ## HITS:1 COG:CAC3581 KEGG:ns NR:ns ## COG: 
CAC3581 COG1011 # Protein_GI_number: 15896815 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Clostridium acetobutylicum # 6 184 5 184 201 102 35.0 3e-22 MKKTTVIFDLGNVLVAYDWKSYLKTFSFDDNVFHTIANAMFLNSDWEEGDRGADAEQWLS LFIENAPEYEKEIRMVYENLEKCVYLFPYTLKLIQFFRKNGYRILFLSNYSEYLYEKTKG TLSFIETFDGGVFSFEETCIKPDKLIYERLLEKYHIAPEEALFFDDREENVRAAEELGIH GILFTSETADKILKGQIV >gi|330405144|gb|ADLB01000012.1| GENE 184 189847 - 190260 441 137 aa, chain + ## HITS:1 COG:CAC3674 KEGG:ns NR:ns ## COG: CAC3674 COG0517 # Protein_GI_number: 15896906 # Func_class: R General function prediction only # Function: FOG: CBS domain # Organism: Clostridium acetobutylicum # 1 137 1 138 140 140 50.0 7e-34 MNILFFLKPKSEIAFIHKEDTLRQAIEKMEYHKYSSIPMINIDGKYVGSITEGDLLWGIK NKYNLNLKEAELIPITEIDRRTDYQAVNINADIEDLVEKAMDQNFVPVVDDQGNFIGIIT RKDIIGYCYNKMTNIEK >gi|330405144|gb|ADLB01000012.1| GENE 185 190376 - 190684 428 102 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|238924055|ref|YP_002937571.1| 50S ribosomal protein L21 [Eubacterium rectale ATCC 33656] # 1 101 1 101 102 169 82 1e-40 MYAIIATGGKQYKVAEGDIIKVEKLGVEAGETFTFDQVLAVSDSELKVGNPTVAGATVEA SVIGDGKAKKVIVYKYKRKTGYHKKNGHRQQYTAVKIEKINA >gi|330405144|gb|ADLB01000012.1| GENE 186 190704 - 191036 184 110 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116492579|ref|YP_804314.1| ribosomal protein [Pediococcus pentosaceus ATCC 25745] # 1 103 1 105 108 75 42 2e-12 MITVSIYKNEKHEYVGFKTLGHAGYSEPGQDIVCAAVSVLTINTINSIDKFTEEKTSLVS DEESGFIDYKIDGRPGKEAALLLKAMILGLREMASDESYEEYIDLTFEEV >gi|330405144|gb|ADLB01000012.1| GENE 187 191040 - 191324 449 94 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160880681|ref|YP_001559649.1| 50S ribosomal protein L27 [Clostridium phytofermentans ISDg] # 1 92 1 92 95 177 91 4e-43 MMKLNLQFFAHKKGVGSTKNGRDSESKRLGAKRADGQFVKAGNILYRQRGTKIHPGLNVG RGGDDTLFALVDGVVRFERKGRDKKQVSIYPVVE >gi|330405144|gb|ADLB01000012.1| GENE 188 191405 - 192688 760 427 aa, chain + ## PROTEIN SUPPORTED ## NR: 
gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. AzwK-3b] # 2 347 3 342 345 297 46 3e-79 MFADRAKIYIRSGKGGDGHVSFRRELYVPNGGPDGGDGGRGGDVIFEIDEGLNTLADYRH RRKYVAKDGEQGGKRRCHGKNAEDIILKVPEGTIIKEAESDKIIADMSGDNRRQVILKGG KGGLGNQHFATSTMQIPKYAQPGQPAQELWVKLELKVIADVGLIGFPNVGKSTLLSRVTN ANPKIANYHFTTINPNLGVVDIDGADGFVIADIPGLIEGASEGVGLGHEFLRHIERTKMM IHVVDAASTEGRDPIDDIYKINAELKAYNEDIAKRPQVIAANKIDAIYSEDEDPVERIRK EFEPQGIKVFAISGVSGEGIRELLYYVSEQLKTLDQETIVFEQEYFPEEELIHIDLPYTV EKEDDMYVVEGPKIEKMLGYTNLDSEKGFAFFQKFLKDTGILDELENAGIQEGDTVRMYG LQFDYYK >gi|330405144|gb|ADLB01000012.1| GENE 189 192712 - 193005 225 97 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|212638657|ref|YP_002315177.1| Predicted RNA-binding protein containing KH domain, possibly ribosomal protein [Anoxybacillus flavithermus WK1] # 1 97 2 97 97 91 48 4e-17 MTTKQRAYLKSLAMTMDPIFQIGKSSMTPALTEAIGEALTARELIKISVLKNCADDPKEL AAMIAERTHSQVVQVIGKKIVLYKEGKDDKKKIELPK >gi|330405144|gb|ADLB01000012.1| GENE 190 193030 - 193635 524 201 aa, chain + ## HITS:1 COG:CAC1262 KEGG:ns NR:ns ## COG: CAC1262 COG1057 # Protein_GI_number: 15894544 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Clostridium acetobutylicum # 2 196 3 199 200 131 38.0 7e-31 MKIGIVGGTFDPIHNGHLMLGAYAYDNFQLDKIWFMPNGNPPHKSKEINVDFRLDMVKLA IEGKEEFCLSTFEIEEEKHSYSYETLEKLHQLYPQDTFYFIIGADSLFTIEFWKEPARIM HSCIILAACRDDKDMDKMYKQISYLTEKYSAKIELLKMPLIDISSSDIRQKRENGENIDN LVPQKVSAYIESHGLYEVKNK >gi|330405144|gb|ADLB01000012.1| GENE 191 193632 - 194219 454 195 aa, chain + ## HITS:1 COG:CAC1263 KEGG:ns NR:ns ## COG: CAC1263 COG1713 # Protein_GI_number: 15894545 # Func_class: H Coenzyme transport and metabolism # Function: Predicted HD superfamily hydrolase involved in NAD metabolism # Organism: Clostridium acetobutylicum # 1 169 1 168 189 137 44.0 1e-32 MKSLEEIEKILEETLTVSRFRHTLGVMYTAGALAMKYGVDLDQAMPAGLLHDCAKCLPVE EQKNLCKEYGISLSESEEKNPALIHAKLGAYLAKSKYGVTDEDVLSAIMYHTTGRPNMTM 
LEKILYIADYIEPGRHHAENLSFVRRLAFEDIDKAILQVSGDILNYLLENRNSVIDALTK ETYEFYKNQLELEEK >gi|330405144|gb|ADLB01000012.1| GENE 192 194219 - 194572 467 117 aa, chain + ## HITS:1 COG:ML1453 KEGG:ns NR:ns ## COG: ML1453 COG0799 # Protein_GI_number: 15827762 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Mycobacterium leprae # 2 111 5 115 129 94 45.0 6e-20 MEQAKNMARLAYEALSEKKGEDIRVINISEISTLADYFIIANGTNESQVNALVENVEEKL EKAGYTVKQREGYGLGNWVLLDFGDIIVHVFDKDNRLFYDLERIWRDGRVVENIDEL >gi|330405144|gb|ADLB01000012.1| GENE 193 194604 - 194996 317 130 aa, chain + ## HITS:1 COG:no KEGG:Clos_1149 NR:ns ## KEGG: Clos_1149 # Name: not_defined # Def: hypothetical protein # Organism: A.oremlandii # Pathway: not_defined # 10 129 8 121 122 119 52.0 3e-26 MGFDSAILTIMDLLFPLVFLFIIGVFIMTIIRGVGTWHKNNNSPRLTVDASVVAKRTSVS HHTHANAGDQTGMSGSYTTTSTSYYVTFQVSSGDRIEFSVSGKEYGILAEKDYGKLTFQG TRYLSFEREY >gi|330405144|gb|ADLB01000012.1| GENE 194 195062 - 195679 570 205 aa, chain - ## HITS:1 COG:BS_lexA KEGG:ns NR:ns ## COG: BS_lexA COG1974 # Protein_GI_number: 16078848 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Bacillus subtilis # 5 204 3 204 205 230 58.0 2e-60 MSYGKISQKQSEILEYIKSEILKRGYPPAVREICEAVNLKSTSSVHSHLETLEKNGYIRR DPTKPRAIEIIDDMFNLTRRDLVQVPMIGRVAAGEPLLAQENIEDYFPIPAELMPNNQVY MLQVQGESMINAGILDGDYVLVEQCNTVSNGQMVVALVEDGATVKTFYKEEGIYRLQPEN DTMSPIIVQEVTILGKVIGVFRMMK >gi|330405144|gb|ADLB01000012.1| GENE 195 195870 - 196235 344 121 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1702 NR:ns ## KEGG: EUBREC_1702 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 58 120 54 116 116 66 39.0 3e-10 MSERNYNQKKRREKTIRRNIKVFFVLPILLAVLAGTIYCGGVLSNAHGNLEEEPVGFKYY KSIKIEQGDTLWGIAQKYMTDEYDSPQQYIKEIKQLNGLTSDNIQESKHLLIAYYDTEFK E >gi|330405144|gb|ADLB01000012.1| GENE 196 196266 - 196412 100 48 aa, chain - ## HITS:0 COG:no KEGG:no 
NR:no MDIIAYPEKKVYIYLKNRYLSYRYIDKEKKEKTTLYQQTTHTVFHMCA >gi|330405144|gb|ADLB01000012.1| GENE 197 196455 - 197516 1029 353 aa, chain + ## HITS:1 COG:CAP0080 KEGG:ns NR:ns ## COG: CAP0080 COG0582 # Protein_GI_number: 15004784 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Clostridium acetobutylicum # 43 346 19 306 323 115 29.0 2e-25 MAKESKTYHELMYIDNTLRLRDVLNTMPGFVRDYFRAIEPTTSVKTRISYAYDIRIFFKF LLENNPVYKSYKMTDFQVSDLERIAPVDIEEYQEYLKVYQNEDKQITNAEKGLARKMSAL RSFYNYYYKHQMIEKNPSLFVDMPKIHDKAIVRLDTDEVALLLEYVENCGNQLTGQKKVY YEKNKTRDLAILTLLLGTGIRVSECVGLDIPDIDFKNNGIKVMRKGGNEMVVYFGPEVRK ALIDYLETTRNTITPLPEHENALFLSTQKKRMGVQAVENMVKKYAREVTPNKKITPHKLR STYGTSLYKETGDIYLVADVLGHRDVNTTKKHYAAIDDARRRQAASAVKLREP >gi|330405144|gb|ADLB01000012.1| GENE 198 197548 - 197994 577 148 aa, chain - ## HITS:1 COG:TM0730 KEGG:ns NR:ns ## COG: TM0730 COG1490 # Protein_GI_number: 15643493 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Thermotoga maritima # 1 148 1 148 149 164 56.0 4e-41 MKFVIQRVTEASVSVEGEVIGKIGKGFLVLIGVGESDTKEIADKLVKKLVGLRIFEDENG KTNLALKDVDGELLLISQFTLYANCKKGYRPSFTEAGAPDKANELYEYIIEECRKAVPSV QKGQFGADMKVSLINDGPFTILLDSEQL >gi|330405144|gb|ADLB01000012.1| GENE 199 198130 - 198864 580 244 aa, chain + ## HITS:1 COG:FN0868 KEGG:ns NR:ns ## COG: FN0868 COG0037 # Protein_GI_number: 19704203 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Fusobacterium nucleatum # 12 230 40 260 277 145 35.0 7e-35 MKLQQLFSFTRKAIDEYNMIQEGDHIAVGISGGKDSLTLLYALHGLKRFYPKKFELSAIT VDLGYSDFDLTPVENLCRELDVPYKIVKTDIGRILFEERKESNPCSLCAKMRKGALNEAV KEMGCNKVAYAHHKDDIIETMLLSLIFEGRFHSFSPKTYLDRMDLTVIRPMMFVDEADVI GFKNKYNLPVVKNKCSVDGHTKRQYAKELVKQLNTEHNGAKERMFTAILNGDIAGWPERT LHTR >gi|330405144|gb|ADLB01000012.1| GENE 200 198912 - 199556 585 214 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01147 NR:ns ## KEGG: 
EUBELI_01147 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: Pyrimidine metabolism [PATH:eel00240]; Metabolic pathways [PATH:eel01100] # 6 212 5 212 212 273 66.0 2e-72 MKEMDTIITIGRQFGSGGREIGNLLAEDLHVKLYDKEMLAIAAKESGICEELFETHDEKP TNSFLYSLVMDTYSMGYSQNAFLDMPINHKIFLAQFDAIKKIANQGPCILVGRCADYALE SYKNRVSVFIHADLSARIKRIARLYDLTDAKAKDLIIKTDKKRASYYNYYSNKKWADAES YDLCLDSSKLGVRGTADAILSYIEVKNNVKDIKL >gi|330405144|gb|ADLB01000012.1| GENE 201 199654 - 200577 316 307 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 28 290 27 279 287 126 35 1e-27 MQEEQFIINQEENGMRIDVFLSKKMDNVSRSYIQKLIKEKEISVNGMAVKANYKVSANDI VQLTIPDLSEPDILPENIPLDILYEDADILIVNKPKGMVVHPSPSHYTGTLVNALMYYCK DDLSGINGVMRPGIVHRIDMDTTGSLLVCKNDFAHQKLAEDLKVHNIKRIYHAIVHGVIK EDEGTVEGPIGRHPIDRKKMSINYKNGKPAVTHYRVLKRFSNYTYIECQLETGRTHQIRV HMASIHHALVGDTVYGPAKSPFHLQGQTLHAKILGIHHPRTNEYLEIDAPLPEYFTDLLE KLERMSK >gi|330405144|gb|ADLB01000012.1| GENE 202 200579 - 201106 490 175 aa, chain - ## HITS:1 COG:SPy0826 KEGG:ns NR:ns ## COG: SPy0826 COG0597 # Protein_GI_number: 15674864 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Streptococcus pyogenes M1 GAS # 13 150 7 142 152 91 40.0 8e-19 MKKERIKYYACAVLAVIAGIVFDQYTKFLAVEHLKGQQPFVLIKNVFELNYLENRGAAFG MLQNQKAFFIFSFVLIIAAVLYLYVKLPLEKRFLPLHICSILIVAGAVGNMIDRLKLGYV VDFFYFKLIDFPIFNVADIFVVVSVILLAILVLFVYKDEEINQIFSFRVNGRGMK >gi|330405144|gb|ADLB01000012.1| GENE 203 201122 - 201877 328 251 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227874237|ref|ZP_03992436.1| possible ribosomal protein S4e [Oribacterium sinus F0268] # 14 250 13 253 254 130 35 4e-29 MQKEEIVLQKRLLELSRVAYQKGIVTYSDFLNLNELNILHTTPKNEFFTQYETFGGYSDS ERQMVAFLPDALYYAHFYPIKVLKISPLQKKFAESLSHRDYLGAILNLGIDRCKLGDILL IDGDAYLFVQESLADFICRNLTRIRHTSVYVEVEDEQTFSYTPAIEEIKGTVASVRLDSL 
LSLVFPASRSKLVPLIEGGRVFVNGKLITTNSYNVKENDIVSVRKLGRFRYKGVLSQTKK GRYYVLLEKYI >gi|330405144|gb|ADLB01000012.1| GENE 204 201888 - 202433 559 181 aa, chain - ## HITS:1 COG:SA1032 KEGG:ns NR:ns ## COG: SA1032 COG1799 # Protein_GI_number: 15926772 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Staphylococcus aureus N315 # 39 166 50 177 187 74 33.0 1e-13 MGVLDKFLNIMKLDDDYDEEEDFLGEEEEFEEYEPKPKKSLFKKEKEEYEDFDLQEDHRG KSTSVAGNSNNKVTPMRQPNAKRSASMEVCVIKPTAVDDAREITETLLSGRTVILNLEGI DLEIAQRIIDFTSGATYAISGNLQKISNYIFLVTPTNVEISGDLQDLLNTSFDVPSIRTR F >gi|330405144|gb|ADLB01000012.1| GENE 205 202462 - 203136 585 224 aa, chain - ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 5 223 6 219 221 205 55.0 6e-53 MLKDNLNHVIENIHKSCKNEVTLIAVSKTKPAEMIQELYDAGCRNFGENKVQELIDKYEI LPKDICWHMIGHLQRNKVKYIVDKVSLIHSVDSLRLAQTIEKEAEKKNCVVDILIEINMA REESKYGIYPEELEALLREISHLSHIRVKGLMTVAPNVKNPEENRKIFTEMKKLSVDIAK KNIDNIIMSILSMGMSNDYNIAVEEGANMVRVGTSIFGARNYNI >gi|330405144|gb|ADLB01000012.1| GENE 206 203148 - 204530 1284 460 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1735 NR:ns ## KEGG: EUBREC_1735 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 460 4 462 462 282 35.0 2e-74 MAENKKILVYRKKWKMNIGILLFGIIFIYLIAMIFSYATKDKVTSYEVHMGSILNDENYT GMAIREEKVVSAEKDGYVNYYSMENKKIKAGAGVCGISPEKLSFQKSDSDTNTGSETELT DEQQNSLGLTVQAFTDNFTESMFAEVYSFKDSVRNALSDFSSPDGENVLDTMLNGQPGSS LLYSTDDGVIAYETDGFETFTEEQVTLENFNREKYYAKELHNNMKVAVGEPIYKLITSEE WSVVVPITEKTAEELKDRTNITVRFLKDNETMVGNLSIKKQGEQYMAYIGFSGGMIRYAD ERFLDLELIITDYTGLKIPKSSIVKKPFFVVPTEYVIQGGNSDEKGVLRRGKDNQNTTEF VPMDIYYTKDNLAYFDLTELNKGDILIKPDSNITYVVGEQKKLTGVYQINKGYAVFKQIQ ILCESKEYYIVQAKNEFGLTNYDHIALDSSKIKENDIITQ >gi|330405144|gb|ADLB01000012.1| GENE 207 204547 - 205491 616 314 aa, chain - ## HITS:1 COG:TM0515 KEGG:ns 
NR:ns ## COG: TM0515 COG1242 # Protein_GI_number: 15643281 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Thermotoga maritima # 7 307 3 302 303 261 45.0 1e-69 MKEKFLYNKYSDYLKERYGEKVYKLPINLPLTCPNRINGHGCSFCADVGTGFEAMESSVS VTEQLEKTKEYISKRYHAKKFIAYFQNYTNTYMPLEIFKKYILEAVSIPDIVEISISTRP DCVREDYLQFLKVIEKTYHISIHLELGLQTVNYHTLDFISRGHGLAEYIDAVRRIQKYGF SICTHLILNLPHDTLRDTIETAKTMSALEIDIVKLHSLYIPKNTPLCEAYEHGTINLCSK EEYIERLATFLAYLSPQIIVERLFSRVPEKDAVFSNWNTSWWKLQDEALAYMEEKEYYQG KFCNYLNGAGLNSL >gi|330405144|gb|ADLB01000012.1| GENE 208 205564 - 205761 285 65 aa, chain + ## HITS:1 COG:CAC0976 KEGG:ns NR:ns ## COG: CAC0976 COG2155 # Protein_GI_number: 15894263 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 64 1 64 69 63 59.0 8e-11 MKWFDNTALTLVIVGAINWLLIGIFQFDIVAFLFGELSFLSRAVYTIIGLCGLYLISLYG RIKDI >gi|330405144|gb|ADLB01000012.1| GENE 209 205800 - 206738 728 312 aa, chain - ## HITS:1 COG:CAC2057 KEGG:ns NR:ns ## COG: CAC2057 COG1686 # Protein_GI_number: 15895327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 35 291 4 245 351 150 40.0 4e-36 MLLCVALCSSLLIGCTSSGKNIDAYESTDYKKTAYRAEGFAEQLCVTPEEEVSYKSVNTD MALSAAGLFSLDDKEVLYGRNIYKKIYPASTTKILTALVALKYGNLDDVVTVSKTATEFP PAASLCGIKEGDKLTLRELLYGLLLPSGNDAAVAIAEHISGSVEEFAKLMNKEAYSIGAT HTHFVNPHGLHDDNHYTTAYDLYLIFNQCIQYDEFIKIVSETKYDLPVTSADGTSRTLNL ETTSLYGLGEAEKPEHMTVIGGKTGNTGEAKRCLILLSKDEQGHSYVSIVMGAETKSLVY TNMNKMLNAISE >gi|330405144|gb|ADLB01000012.1| GENE 210 206749 - 207144 393 131 aa, chain - ## HITS:1 COG:BH1681 KEGG:ns NR:ns ## COG: BH1681 COG1803 # Protein_GI_number: 15614244 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Bacillus halodurans # 1 129 1 129 138 155 54.0 2e-38 MNIGLIAHDAKKTLMQNFCIAYRGILKKHNLFATGTTGRLIEDVTILNITKYLAGPLGGM 
QQLGAQIAQNDIDALIFLRDPMNVKPHEPDVNDVVRLCDMHNIPIATNVATAEVLILAIE RGDLDWREMYR >gi|330405144|gb|ADLB01000012.1| GENE 211 207162 - 208274 1182 370 aa, chain - ## HITS:1 COG:BH3275 KEGG:ns NR:ns ## COG: BH3275 COG0772 # Protein_GI_number: 15615837 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Bacillus halodurans # 9 370 8 390 398 176 31.0 5e-44 MKLTKNYQFKNLKISLIVLVLAISTIGILVVGSAKPSYQGKQIMGVVLGVIAMLVVSMID YNWLLNLSWIMYAVNVGLLVLVKLIGKEVNGAQRWIDLKVISVQPSDLTKIFMIIFFAKF LMDHEEDLNEPKNIIKAILLILPSLILIVAQPNLSNTICVATLFCVLMFIGGLSYKFIRN VLLIAVPLVVIFLVIAVQPNQKLLKPYQQKRILSWLEPDKYADQEAYQQINSLMAIGSGQ ATGKGLNNQGSTSVKNGNFISEPQTDFIFAIIGEELGFVGCCITIILLLLIVIQCIIIGT KAQNLAGQIICGGVAALIGIQSFINISVATRIFPNTGIPLPFVSYGLTSLVTFFIGIGLV LNVGLQPKKY >gi|330405144|gb|ADLB01000012.1| GENE 212 208289 - 208519 223 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210612730|ref|ZP_03289445.1| ## NR: gi|210612730|ref|ZP_03289445.1| hypothetical protein CLONEX_01647 [Clostridium nexile DSM 1787] # 1 74 1 74 76 94 67.0 2e-18 MNVFESGQKKNSVVIAKDRLKVLLISDRVNCTPDTFEKLRNELYLTVSKYIEVTPEIFDV EITQSNISIRLSGECN >gi|330405144|gb|ADLB01000012.1| GENE 213 208532 - 209323 866 263 aa, chain - ## HITS:1 COG:BH3027 KEGG:ns NR:ns ## COG: BH3027 COG2894 # Protein_GI_number: 15615589 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Septum formation inhibitor-activating ATPase # Organism: Bacillus halodurans # 1 260 1 261 264 323 62.0 1e-88 MGEIIVVTSGKGGVGKTTTTANIGVGLAKLHKKVVVIDTDLGLRNLDVVMGLENRIIYNL VDVIEGNCRLKQALIKDKRYEELYLLPSAQTKDKTAISPEQMKKLTAELKEDFDFILLDC PAGIEQGFQNAIAGADRAIVVTTPEVSAIRDADRIIGLLEQNKIKQLDLIINRIRIDMVK RGDMMSVDDVTEILAIHLLGAIPDDENIVVCTNQGEAVVGGESLSGQAYENICHRILGEE IPLLDLDEHKGFFKKLANLFQKN >gi|330405144|gb|ADLB01000012.1| GENE 214 209305 - 209994 584 229 aa, chain - ## HITS:1 COG:FN0175 KEGG:ns NR:ns ## COG: FN0175 COG0850 # Protein_GI_number: 19703520 # Func_class: D Cell cycle control, cell 
division, chromosome partitioning # Function: Septum formation inhibitor # Organism: Fusobacterium nucleatum # 1 178 1 177 216 85 30.0 9e-17 MNRLVTIKSNKYGLLIRLQPEVPFQELLAEVNRIFSDTVKFFHHAKLAVSFEGRILTKDE EHQLIEMISKTAQIEIVCIIDNDREREQMYKRYVEESLAETTLKDGLFCKGTLHKKQVLE SEKSVVLLGDVEEGATIAAKGSIVVLGTLKGNAYAGVTGKSNSFVFSLSMQPNQLKIGNV AVQPDTVTKNEYMSMMPQIAIQNGGNIMIQSIFDDEAYIQEEMKWAKLL >gi|330405144|gb|ADLB01000012.1| GENE 215 210015 - 212885 2375 956 aa, chain - ## HITS:1 COG:RP565 KEGG:ns NR:ns ## COG: RP565 COG0768 # Protein_GI_number: 15604419 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Rickettsia prowazekii # 602 929 253 585 594 140 30.0 8e-33 MFNRIKDAISEVVKSRLFIIIIVFIIMFGILIQRCFSLQIVNGQDYLDNYKLQIQKTREV QGTRGIIYDRNGQVLAENKLAYTVTIEDNGTYEDKEEKNEKINKTIEKVIGIIEKNGDSI INDFGIVLDENEEYSFLNPEGTSRKRFIADVYGKAKIDDLTKEQKASTPNDIIRYLCEDE KYGYGINRKKYSKEMVLKLVNVRYAMGLNSFQKYIPTTIASDVSNETVAAIMENLDSLQG VDIQEDSLRHYPDSKYFASILGYTGKISTEEYDNFKAEGKDYSKTDIIGKAGLEQSMDSI LQGKNGKEVLYVNNVGKIIERDKTTKAEAGNNLYLSIDKNLQIATYKILEEQLAGIILSN MRNTMNFEKRAGTEDVNILPFDDVLNSFFANNILDIKHFAEKDAQSREKAVYQKFLTRRE TVLKRITEELNSSSAKPYKNLSKEMQAYMSYIANDVLTENTGILVKDKIDTDDKTYKAWK EDDSISLREYLQYAISKNWVDTSKLQEYIEGKSDYSDANEVYEGIVAFLKEYLSSNTGFD KLIYKYMIKDGSITGREVCLMLYEQGVLKYDEKQVSALNAGTVSAYDFIRGKIKSLEITP GQLGLEPCTASAVVTDVNSGAVLACVSYPGYDNNRLANTMDSEYYNELVTGLSRPLYNNA TQETTAPGSTFKPVSAIAGLTEGVIGGGTVFNCSGKFTKITPSPKCWIYPGAHGGLNITG AISHSCNVFFSEMAYRMSMDDKGNYSSKKGTEILEKYAKMFALDQKSGIEIPESESTIST EDAVRSAFGQGTNNYTVTQLSRYVSAVANKGTVYDLTLLNKVETVDGKTVKEYEPKVNNE IKEVSKTTWNLVHQGMESMVSSNRIFNDLKKSNFKMSGKTGTAQQSKLHPDHALFVGFAP SDAPQISVAIRIANGDKSAFAAEIGRDIVRYYFNLADSSEIIHNGASTVTSATAGD >gi|330405144|gb|ADLB01000012.1| GENE 216 212878 - 213408 601 176 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1743 NR:ns ## KEGG: EUBREC_1743 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 175 1 171 173 158 52.0 7e-38 
MKIKRKMITLLIILVCFLLETTVFPSISLAGITPNLLVVVVSSFGFMRGKNSGMVVGFIC GLMTDIFFGLQGVIGFYALIYTLIGYGNGFFKRLFYDEDIKLPLALIAGSEFLYGIVIYI CVYLMRSKFDFIYYLSHIIMPELMYTILVTLVLYQIILHINRKLETEEKRSASKFV >gi|330405144|gb|ADLB01000012.1| GENE 217 213422 - 214285 1039 287 aa, chain - ## HITS:1 COG:CAC1243 KEGG:ns NR:ns ## COG: CAC1243 COG1792 # Protein_GI_number: 15894526 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Clostridium acetobutylicum # 71 275 67 273 283 100 31.0 3e-21 MKGKNQLSVPSKYWLLIITTICIIFIGLSLVSDKTSGPFKVVADYTVVPMQKGINRAGMF LNDLTENFDTLQDVKAENKSLKKKVDSLTISNNRLQQEKYELERLRELYKLDQTYSDYEK IGAHVIGNNGSNWFSTFTIDKGSEDGIKVDMNVMAGAGLVGIVTKVGPHSSEVRAIIDDQ SNVSGMVLSTSDTCVVRGDLKLLADGRIRFEQLANNNTKIKEGEQVVTSQISSKYVQGIL IGYISEINVDDNNLTRSGYITPVVDFKKLQEVLVITSTKKDLVEKKK >gi|330405144|gb|ADLB01000012.1| GENE 218 214287 - 215309 979 340 aa, chain - ## HITS:1 COG:CAC2858 KEGG:ns NR:ns ## COG: CAC2858 COG1077 # Protein_GI_number: 15896112 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Clostridium acetobutylicum # 2 317 5 322 340 201 36.0 1e-51 MLGNIYGLDLGTYEIKVYDKKKDTIWTEKNAIAIENEDTIFSVGDEAYEMYEKAPDNIEI VFPMKEGEISHFNDLQYLLQNLLKKGKRFARGSKYIIAVPTDVTEVQKRAFFDLVIHSTA KAKEVNIVERGIADAVGIGLDVQKEKGLFIANFGGEITELSVLSYGGLVLNRIVKVGGVT LDEAIVNRVRRNHDFLIGRLTAETLRKEFGIFDDASDHTLSVAGRNLISGVPTQYDVSIS LVRAAIKESLEEISRSIKSLIERTPPEVLTEVKKNGIYMTGGLANLRGLSTYIEGRTGLK VTLAKNPELCAVNGLKKIILSKELQKLAYSMLDENFRWMR >gi|330405144|gb|ADLB01000012.1| GENE 219 215329 - 215997 557 222 aa, chain - ## HITS:1 COG:CAC1241 KEGG:ns NR:ns ## COG: CAC1241 COG2003 # Protein_GI_number: 15894524 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Clostridium acetobutylicum # 5 222 15 229 229 181 46.0 6e-46 MLEEERPYEKCERLGVSSLTDIELLAVLLRNGAINSNSLELARRILYPLNQRGGLINLHQ YSLAQLKKIHGIGRVKALQIVCIVELSRRLSKASAIKGLDFSSAAKIAEYYMEDLRHHQQ 
EHMKLLMLNTKARLIGETDISKGTVNASLVSPRELFIEALQKNAVSIILLHNHPSGDPTP SKEDILITKRIKEAGSLIGIELLDHIIIGNNCYISLAEKNLV >gi|330405144|gb|ADLB01000012.1| GENE 220 216123 - 217715 1059 530 aa, chain - ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 1 526 1 528 535 471 47.0 1e-132 MIRIHQIKLPITHTQEDLKKKIGKILRISVSEIREVKIVKQSIDARKKPQIFYTYTVDAA ITGEKKILQKVKSNQIMPSPSKAYRFNAVGEKEMRFRPIIVGSGPAGLFCAYMLSVHGYR PILLERGASVDERIKDVESFWKSGKLNLNSNVQFGEGGAGTFSDGKLNTLVKDSFGRNQK VLEIFAKHGAPEDICYTNKAHIGTDILTDVVKQMRQSIISHGGEVRFHSQVTDICVEDNK ITHLIINREEKIPCEVVVLAIGHSARDTFEMLNKRKVPMEAKSFAVGVRVEHLQSMINLS QYGMEGNSLLSAASYKVAEQLDNGRGVYSFCMCPGGYVVNASSEEKRLAVNGMSYHSRSG KNANSAIIVTVTPEDYGGTDALSGVEFQRKLEEKAYNLAEGKIPVQRYEDFCKNQISSQF GKVTPSMKGNYAFANVRSIFPEEISQSIEDGIKKFNNKIRGFSQEDTLLSGVESRTSSPV RIKRGEGLMSEIAGLYPCGEGAGYAGGITSAGMDGLKVAEEIAKTYRAFD >gi|330405144|gb|ADLB01000012.1| GENE 221 217708 - 218946 1138 412 aa, chain - ## HITS:1 COG:CAC3590 KEGG:ns NR:ns ## COG: CAC3590 COG2081 # Protein_GI_number: 15896824 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 6 403 2 403 405 316 41.0 5e-86 MKKSKKIVVIGGGASGLVAAITAARNGADVIVLEHKEKVGKKILATGNGRCNFTNEYMEI SCFRGEDVERIDAVLKKFGTKETLHFFDELGVFPKSRNGYYYPKSNQAAAILDVLNMELK RQQVEIVCNSHVTKIEKTKQFKVLSSTGTYMADAVILAAGGKASAVLGSDGSGYSLAKMF GHTISPVVPALVQLRGNGTFFKQISGVRADAKISLYVDGELLGEDTGEVQFTDYGLSGIP VFQISRFAALALYQKKTPKVRVDFFPEFTGEELTLFFEKRIRQNGEKKAGEFLVGLLNKK LVPILLRASGVRERTLISEVEKERLERLIDKCKGFEIEITETNSFEQAQVCAGGVRLNEI DIETMESLYEKGLYLAGEILDADGICGGYNLQWAWATGYLAGKNALKGKKYD >gi|330405144|gb|ADLB01000012.1| GENE 222 218959 - 220245 1532 428 aa, chain - ## HITS:1 COG:BS_ynbB KEGG:ns NR:ns ## COG: BS_ynbB COG4100 # Protein_GI_number: 16078807 # Func_class: P Inorganic ion transport and metabolism # Function: Cystathionine beta-lyase family 
protein involved in aluminum resistance # Organism: Bacillus subtilis # 23 425 17 421 421 461 54.0 1e-129 MLETKTQYEQLGISEEVYAFGKKIEEKLTERFQKVDEVAEFNQLKVIKAMQDRRVSEACF NYASGYGYNDLGRDTLEEVYASVFHTESALVRPQITCGTHALALALSANLRPGDELLSPA GKPYDTLEEVIGIRESKGSLAEYGISYRQVDLLENGTFDYESIEKAINEKTKLVTIQRSK GYQTRPSFSVSQIGELIAFVKKIKPDVICMVDNCYGEFVETIEPSDVGADMVVGSLIKNP GGGLAPIGGYIAGKNECIENCAYRLTSPGLGKEVGASLGVMQSFYQGLFLAPTVVSGALK GAIFAANIYEELGFPVIPSGSESRHDIIQAVEFGTPDGVIAFCKGIQAAAPVDSYVSPEP WAMPGYDSDVIMAAGAFVQGSSIELSADGPIKPPYAVYFQGGLTWTHAKLGILMSLEKLV QGGLVKLP >gi|330405144|gb|ADLB01000012.1| GENE 223 220239 - 221192 898 317 aa, chain - ## HITS:1 COG:CAC1835 KEGG:ns NR:ns ## COG: CAC1835 COG0324 # Protein_GI_number: 15895110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Clostridium acetobutylicum # 4 305 2 303 309 312 50.0 7e-85 MSKKPLIILTGPTAVGKTKASINLAKALGGEIISADSMQVYRHMDIGSAKITKEEMQGVK HYLIDVLEPDEEFHVVKFQQLAKAAMEEIYAKGKIPIVVGGTGFYIQALLYDIDFTENNE DTAYRRELEEISRTKGAEYLHEMLKAVDEKSAQSIHANNVKRVIRALEFFRQTGQKISEH NEKEREKESPYQFCYFVLNDVRGHLYKRIELRIDEMIKEGLVSEVEALKEKGYTREMTSM QGLGYKEIFDYLNGVTSLEEAVYILKRDTRHFAKRQITWFKREKDVIWVNKDEFDYDEEK ILAFLLEKIKERITFPC >gi|330405144|gb|ADLB01000012.1| GENE 224 221194 - 223110 1672 638 aa, chain - ## HITS:1 COG:CAC1836 KEGG:ns NR:ns ## COG: CAC1836 COG0323 # Protein_GI_number: 15895111 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Clostridium acetobutylicum # 4 637 3 621 622 399 36.0 1e-110 MPNIQVLDQITIDKIAAGEVIERPASIVKELVENAIDANATAVTVEIKEGGISFIRITDN GCGIPKEEVSLAFLRHSTSKIRSEKDLATVSSLGFRGEALSSISAIAQVEVITKTKENDF GVKYTIEGGVEKSIEEVGAPDGTTFLVHQIFYNTPARRKFLKTPMTEASHVNELMVRLAL SHPEVSIQFINNGQSKLHTAGNGKVKDVIYHVFGREIANNLLEVNRDEGKMRVSGYIGKP LISRGNRNYENYYINGRYVKSNIIAKAIEDAYKDFTMQHKYPFTVLHFWLDGNDIDVNVH PTKMELRFSHRQEVYDFVYRAVKETLIEPELIPRVEISKPTEEKNVPEKKVPEVHDEAYF 
MKKMRERVQSYHRQASQAEVKDTTELHRGNLQIDRIKEAVTYNKNREREERSAQPVMQPA QPDMSQEMKAEQLNFFEEKLLTKKAVQEYKLIGQVFDTYWLVEFQEQLYIIDQHAAHERV LYEKTLHGMKDRTFTSQYLSPPIILNLSMQEARLLTEHMDLFSKIGFEIENFGGDSFAVR AVPDNLFSIAKKELLMEMLDNLSDDITSAEAPDLIGEKIAAMSCKAAVKGNAKLSSAEVN ALIGELLELENPYHCPHGRPTIIAMTKRELEKKFKRIV >gi|330405144|gb|ADLB01000012.1| GENE 225 223129 - 225780 2801 883 aa, chain - ## HITS:1 COG:CAC1837 KEGG:ns NR:ns ## COG: CAC1837 COG0249 # Protein_GI_number: 15895112 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 4 880 3 867 869 838 49.0 0 MSELTPMMQQYVEMKKQYQDCILFYRLGDFYEMFFDDAVTASQELELTLTGKNCGMEERA PMCGVPYHAVEGYLTRLVSKGYKVAICEQVEDPKVAKGIVKREVVRIVTPGTNLNTQSLD ESKNNYIMCIVYIADRYGLSVADISTGDYFVTELDTGRKLLDEIAKFAPSEIICNEPFYM SGLDIDDLKNRLGIAIYSLDAWYFDDAMCTKILKEHFKVSSLEGLGLGDYNCGVIGAGAL LKYLYETQKTTLSHLTGIISYTTGKYMLLDSSTRRNLELCETLREKQKRGSLLWVLDKTK TAMGARTLRSYVEQPLINKEDILARLDAVGELKDNAIAREEIREYLTPVYDLERLISKIT YQSANPRDLTAFQSSLAMLPHIKYILSDMTSPLLMSLYRELDTLEDLCELVQSAIKEEPP LAMKEGGIIKDGYDAEVDKLRNAKTEGKTWLAELEAEEREKTGIKNLKIKYNKVFGYYLE VTNSYKELVPDYYTRKQTLANAERYIIPRLKELEDTILGAEDKLYALEYEIYCKIRDKIA DEVVRIQKTAKAIAKIDVFASLALVAERNNYVRPKINEKGVIDIKNGRHPVVEKMIPNDM FIANDTLLDDKKNRVSIITGPNMAGKSTYMRQTALIVLMAQIGTFVPAESANVGIVDRIF TRVGASDDLASGQSTFMVEMTEVANILRNATNRSLLILDEIGRGTSTFDGLSIAWAVVEH ISNAKLLGAKTLFATHYHELTELEGKIDSVNNYCIAVKEKGDDIVFLRKIVKGGADKSYG IQVAKLAGVPESVILRAKEIVSELSEADITTRVKDIKIQGQESKARTKQKKYDEVDLAQM SLFDTVKDDDVLKELEELDVSRMTPMDALNTIYRLQNKLKNRW >gi|330405144|gb|ADLB01000012.1| GENE 226 225930 - 226346 316 138 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3715 NR:ns ## KEGG: Cphy_3715 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 135 9 143 146 99 37.0 3e-20 MKRLSEYLFIWALGGTLYYTFEMFFRGFSHWTMFVLGGICAVFCVWQGLVLKWREPLWIQ IIRCTIFVVAGEFITGIVVNKWLRWQVWDYTDQPFQLFGQICAPFAIIFSGLCALGIIGG GYFVHWIYGEEKPDFHVL >gi|330405144|gb|ADLB01000012.1| GENE 227 226432 - 227889 678 485 aa, 
chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 168] # 49 471 3 423 451 265 34 1e-69 MLKDLDKVIGEIDLNKEAPVNEPDRQYYYMAKAKKYVAEQAENLGRSLTFCVTTFGCQMN ARDSEKLVGILEQIGYAEETDEEKADFIIYNTCTVRENANMRVYGRLGQLKRVKKENPHM MIGLCGCMMQEPEVVEKLKKSYRFVDLIFGTHNIYKFAELIVTRFESERMVIDIWKDTDK IVEDLPSERKFSFKSGVNIMFGCNNFCSYCIVPYVRGRERSRNPKDIVREIERLVADGVV EVMLLGQNVNSYGKNLETPMTFAQLLQEVEKIDGLKRIRFMTSHPKDLSDDLIEVMKNSK KICKHLHLPVQSGSSRILQKMNRRYTKEQYLELVRKIKTAIPDISLTTDIIVGFPGETEE DFLETMDVVKQVRYDSAFTFIYSKRTGTPAATMENQIADDVVKDRFDRLLKEVQGISTEV CGVHTGTTQEVLVESLNDHDDSLVTGRLSNNILVHFPGDEDLIGKIVDVKLEECKGFYYI GTRVE >gi|330405144|gb|ADLB01000012.1| GENE 228 227975 - 228151 277 58 aa, chain - ## HITS:1 COG:no KEGG:BF0576 NR:ns ## KEGG: BF0576 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 57 1 57 58 66 57.0 3e-10 MNLPKDAILLLSVVNTKLRDYYKSLDDLCEDMNAEKKDIIHKLEGIGYEYDGIKNQFV >gi|330405144|gb|ADLB01000012.1| GENE 229 228151 - 228702 387 183 aa, chain - ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 3 179 2 176 186 107 36.0 9e-24 MEIKKQIRRRILKRRESLSKDEWRRKSDKVIEELIAQPIYKESNTIYSYVSYRNEPDTWR FIIYSLKAGKKVAVPKVIGNEMKFYYIRGIEELAEGYKGIFEPPETNEEATEDNALLIMP LVAFDREKHRLGYGGGFYDRYLQKYPNHFKIGIGFSFQEAESVPAEKYDVSPDSIWTDKG EII >gi|330405144|gb|ADLB01000012.1| GENE 230 228680 - 228901 343 73 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1649 NR:ns ## KEGG: EUBREC_1649 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 73 1 73 74 65 64.0 7e-10 MLEEKIQRINELYRKSKAEGLTAEELQEQKVLRAEYIDAFKRNLRGQLNNISIQEKDGTI TNLGEKFGNKKAN >gi|330405144|gb|ADLB01000012.1| GENE 231 228927 - 229286 409 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153853173|ref|ZP_01994582.1| ## NR: 
gi|153853173|ref|ZP_01994582.1| hypothetical protein DORLON_00567 [Dorea longicatena DSM 13814] # 1 119 1 123 123 113 50.0 4e-24 MKKKYSIRFLVATVTCLIALTVAYQMSYHHARMELEKETEAEENNPSVATEGEAFQSDIY YLQELNGFIAVYQSDKKTVFEYTDIRMEELPSDLANEIKKGKKLPGIEEVYGFLENYSS >gi|330405144|gb|ADLB01000012.1| GENE 232 229333 - 230151 814 272 aa, chain - ## HITS:1 COG:CAC1622 KEGG:ns NR:ns ## COG: CAC1622 COG2240 # Protein_GI_number: 15894900 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal/pyridoxine/pyridoxamine kinase # Organism: Clostridium acetobutylicum # 2 271 3 278 290 169 33.0 4e-42 MKPIKKVAVLHDICGVGKAGAMNMIPILSVMGMEVCLVPTMLLSTHTGGYGKPVICPVSP DYLSDCAKHYKKEKIEFDFIFVGYLGNCDMVDGVLDFIRHFPKAKVVTDTIMGDNGEFYG NFDSSYLQAVKKLLAFSDLILPNYTEACFLAGMEYRKNPSQTYQNELCERLMKLGAKDMV ITSVTAAEGTGILYCEKGKKDCLYLDCEPHNIHGTGDVFDGVILGNCMRGLPLKENIVKA HQFVKTCIAETYRYDYNKREGVLLEKMLPMLV >gi|330405144|gb|ADLB01000012.1| GENE 233 230152 - 231819 1986 555 aa, chain - ## HITS:1 COG:CAC3213 KEGG:ns NR:ns ## COG: CAC3213 COG2244 # Protein_GI_number: 15896460 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Clostridium acetobutylicum # 13 524 6 497 512 167 27.0 5e-41 METKGKKHEGAFLVQGMILASAGIITKIIGVIYRIPLINIMGDQGQGYYGIAFEIYSLAL LLTSYSLPLAVSKLVSARVSKGERRNAFKVFKSALIFGLVSGSLIGLIVFFGADFIATKI MAMGPSQYALRVLAPCLLVVAIMGVVRGYFQGLGTMIPTAVSQILEQIVNAIVSVVGASY LFEFGKKAAEAKGKEYLGPAYAAAGGTLGTFVGAVSGLLFLLFVLFIYRKIIRKQLRRDH TTYQEEYGTIFPILLMTIIPVILSTAVYQSTKILDAGIFSNIMSIQGMSKEKYETLWGMY TGKFNTLVNVPLAIANAIGASVIPSLTAAMTSGDRRLVHSKIQLATRFSMLISIPSTIGY MVLAKPIMNLLFNGDNSTPALLLITGAITIAFYSLSTITNAMLQGIDRMTTPIKNAAISL VIHLVSLFIMLVVLKMNIFAVIGSTIVFSLSMCILNGRALRKEIGYHQEYYKTFILPLIA SIIMGVIAFVIQVAFANIMPEKVATIISVLVAVLVYSLALLLLGGLTESEILAMPKGRKV VNVLKKFHLLREEEE >gi|330405144|gb|ADLB01000012.1| GENE 234 231839 - 232333 594 164 aa, chain - ## HITS:1 COG:CAC1293 KEGG:ns NR:ns ## COG: CAC1293 COG0319 # Protein_GI_number: 15894575 # Func_class: R 
General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Clostridium acetobutylicum # 8 164 8 166 166 100 37.0 1e-21 MTIYFEEEGELKLDLECETLAETVVEGVLDYEKCPYEAEVNLLLTMNKEIQEMNAEFRHI DRATDVLSFPMIDYEKAGEFAFLEEDDSYFNCDTGELMLGDIVISKEKVIAQAEEYGHTI KREYAFLIAHSMLHLLGYDHMEEQERLEMERKQKEILEQLGITR >gi|330405144|gb|ADLB01000012.1| GENE 235 232333 - 233319 1160 328 aa, chain - ## HITS:1 COG:BH1361 KEGG:ns NR:ns ## COG: BH1361 COG1702 # Protein_GI_number: 15613924 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Bacillus halodurans # 13 317 15 319 320 320 54.0 2e-87 MGILERVMEVSAEHEKNVFGEFDKFAKKIERTLHVTLIARNGEVKILGESSYVEKAENVL SQLLELSRRGNVIQEQNVDYALSLSFEDNTDGLLQIDKEIICHTLQGKPIKPKTLGQKKY VDAIRKQMITFGLGPAGTGKTYLAMAMAITAFKNNEVGRIILTRPAIEAGEKLGFLPGDL QSKIDPYLRPLYDALYQIMGAESFIKNSEKGLIEVAPLAYMRGRTLDNAFIILDEAQNTT PAQMKMFLTRIGFGSKVVVTGDSTQKDLPSGTTSGLDIAKKVLKNIEDISICNLTSKDVV RHPLVQKIVKAYEEFEKKNTSTHEKRRR >gi|330405144|gb|ADLB01000012.1| GENE 236 233320 - 234555 678 411 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2612 NR:ns ## KEGG: Cphy_2612 # Name: not_defined # Def: putative stage IV sporulation YqfD # Organism: C.phytofermentans # Pathway: not_defined # 19 387 4 384 412 215 34.0 2e-54 MLLSIIRYIKGYLRIKIIGYSPERFLNLCRHHHIYLWGLTPRMHDYEMFISISGFRKLKP ILKKTKTKVVILNRYGLPFFLHKYRKRKVFFIGAVGCMLSIYLFSMIVWNIHIEGNYSQT DEVILEFLETKKVYHGMPKAKINCERIAKDIRKQFDDIIWVSVSIQGSRLMIQVKENTDT VPQKVNKEEKVTDLVASENGEVVEIIARSGVPLVSVGDKVKKGDVLVSGRVEIKNDAKEV VEYRYQPADADIRLKVSHSYQSETPTDYMKKKYTGKTRSQMFLKTEQHFFGLGFMHNKFK YKETYTSESRLRLGEHFYFPLSYGKMKVKEYKPERKKRSDKEIQRILSEEFQNFCKDLEE KGVEIIEKDVKIYKGSKTAKAKGELTLIKSEKMRKDTTILQIKEKESKGVD >gi|330405144|gb|ADLB01000012.1| GENE 237 234569 - 234835 205 88 aa, chain - ## HITS:1 COG:no KEGG:TherJR_2425 NR:ns ## KEGG: TherJR_2425 # Name: not_defined # Def: sporulation protein YqfC # Organism: Thermincola_JR # Pathway: not_defined # 2 86 7 91 94 71 35.0 1e-11 
MKESLKKKMACTTNLPKDVIFGVPIITMTGQLEVCVENYKGIMEYTDTLIRIRSKVGQIR VTGRNMQIEYYTNDEMKITGQIKSVEYS >gi|330405144|gb|ADLB01000012.1| GENE 238 234890 - 235591 700 233 aa, chain - ## HITS:1 COG:BS_pbpF KEGG:ns NR:ns ## COG: BS_pbpF COG0744 # Protein_GI_number: 16078075 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase (penicillin-binding protein) # Organism: Bacillus subtilis # 45 228 57 241 714 167 44.0 1e-41 MRTVRKIIKNIFIVLLVACIGICSYLIYQGYTMYKDAVNEVSIEEKVAEIRSKKQYTSVE ELPQIYKDAIISVEDHRFYSHPGIDILATARAAIHDIQAMSLVEGGSTITQQLAKNLYFT QEKKFVRKIAEVFVAVEMERKYTKDEILELYVNSIYFGNNCYCVKDASMTYFNKLPKDMT DYESTLLAGIPNAPSAYSLNVNPTLAKQRQRQVIEKMVEFGHLTQKEANAIIK >gi|330405144|gb|ADLB01000012.1| GENE 239 235694 - 237040 1223 448 aa, chain - ## HITS:1 COG:CAC3354 KEGG:ns NR:ns ## COG: CAC3354 COG0534 # Protein_GI_number: 15896597 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 1 440 1 441 452 265 38.0 2e-70 MKTKNIDLTQGNEMKSIVLFSLPLIAGNLFQQLYNLVDTIVVGKFVGADALAAVGSSYMT MTFLTSIIIGLCMGSGVIFSYFYSARDSKQLSQSFFQSFIFIFGVTVVINGLSFLFIDEI LRLLNIEKSIFAMTKDYLLIIFAGMFFVFLYNYFAAVLRSMGNSFIPLIFLVISSVVNIV LDYLFVVPFQMGVQGAGYATVIAQIISAVGIAWYTLKCVPEMKFQKEMFVIKKSLMKKII DQSVMTSIQQSIMNFGILMVQGLVNSFGITVMAGFAVAVKIDSFAYIPVQDFGNGFSTYV AQNRGAGKTERIHTGTKAAVKLILMTCIVISTIVFLFASQLMNCFVSAGETEVIAEGVRY LQIVCPFYCLIGFLFMFYGLYRGMGRPQMSIVLTVISLGTRVALAYILSDIKWISVVGIW WAIPIGWFLADLVGLVYYRRYRGKGNLE >gi|330405144|gb|ADLB01000012.1| GENE 240 237076 - 237555 354 159 aa, chain - ## HITS:1 COG:BH2238 KEGG:ns NR:ns ## COG: BH2238 COG0394 # Protein_GI_number: 15614801 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Bacillus halodurans # 3 152 2 153 160 145 47.0 3e-35 MRIRVLFVCHGNICRSTMAEYVMKDLVKKSNLTEEFYIDSAATSREEIGNPVHRGTRQKL KEKGIFCGDHRARQVTKKEYAEYDYILGMDTWNMKNMMRIFGSDPDEKIYKLLDFSSAPR DIADPWYTGNFDATYDDVSEGCHTLLNHILEKGQVAQRQ >gi|330405144|gb|ADLB01000012.1| GENE 241 237865 - 239127 
493 420 aa, chain - ## HITS:1 COG:SA1835 KEGG:ns NR:ns ## COG: SA1835 COG0582 # Protein_GI_number: 15927603 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Staphylococcus aureus N315 # 116 399 98 387 390 85 25.0 3e-16 MATRRDSKHRVLRRGESIRQDGRYQFKYHVNGKAHFVYSWRLEPTDKLPVGRKPCLSLRE LEKQIGYDLDNRLDPVGKNITVNELVDRYLATKTGVKYNTQMNYNFVKNILKAHPFGDTK ISRVKTSDAKLFLIKLQQEDGRGYSSVKTIRGVLRPAFQMAVDDGVLNKNPFGFQLAGVV VNDSVTREAITKEQMNKFLKFVHDDNVYCKYYEVIYILFHTGMRISEFCGLTISDIDLEN NIVNIDHQLQRTSDMKYILDTTKTDAGTRKLPITQDVADCFRSILEDRKKPRYEKMIKGH TGFLFLDKNGNPEVAMHWQHRLNHMVKRYNDIYRVQMPNITPHVCRHTYCSNMAKSGMNP KTLQYLMGHSDISVTMNTYTHWGLEDAADELKKMEDVEKVRREMEKGQEKPMNQKMFRAI >gi|330405144|gb|ADLB01000012.1| GENE 242 239105 - 239335 164 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|163816186|ref|ZP_02207554.1| ## NR: gi|163816186|ref|ZP_02207554.1| hypothetical protein COPEUT_02370 [Coprococcus eutactus ATCC 27759] # 2 68 21 87 87 85 65.0 9e-16 MKADLSQKDVLNPLETIELFVLSRRKFYDLLKHNKGLEFLAKYGTRNLIIRTEFEKYLQA HPELRRRGTNGDEERF >gi|330405144|gb|ADLB01000012.1| GENE 243 239436 - 239669 260 77 aa, chain - ## HITS:1 COG:no KEGG:CD0356 NR:ns ## KEGG: CD0356 # Name: xis # Def: excisionase # Organism: C.difficile # Pathway: not_defined # 12 77 1 67 67 65 53.0 8e-10 MKVEKHDTSGSIENTLVPVSEKYTLTIKEAASYFNIGTKKMRRLAEDNRGRFAVFSGNRY LIIRPQFEKFISASSEI >gi|330405144|gb|ADLB01000012.1| GENE 244 239716 - 240636 757 306 aa, chain - ## HITS:1 COG:CC2783 KEGG:ns NR:ns ## COG: CC2783 COG4653 # Protein_GI_number: 16127015 # Func_class: R General function prediction only # Function: Predicted phage phi-C31 gp36 major capsid-like protein # Organism: Caulobacter vibrioides # 121 305 140 340 341 91 28.0 2e-18 MINSTAYYNDFWNEMRGMEVISDSLSNVRESKTNACPLLEESEKKYMAALKKESVVRQLA TVVNATKSDSRLWTFDYTGQAEWSDIVNLEGLDTEEDFQRFDIQAHRLSEFIKLGLEFAA DQSFAIEDYLIGKMARCFGTSEELAFVNGTGENQPIGILHDAEGAETGVTTENASSISYD EIIKLYLSVDKKYRKHGTWLMNDETALALRTLKDSAGNYLWKDINETIFSKPVQIVDSMP 
SIGKGQKMIAFGDFSNYWLVQRFPMTIRTLRELFAVRGQVGYLGCEYLDGKLIRKDAVKV LQMAAD >gi|330405144|gb|ADLB01000012.1| GENE 245 240657 - 240896 157 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295108407|emb|CBL22360.1| ## NR: gi|295108407|emb|CBL22360.1| hypothetical protein [Ruminococcus obeum A2-162] # 1 76 1 76 86 128 80.0 1e-28 MGKQKEVLPMKFEPSDFSTDKYRCVNVINFRDRDPVIILVSETCDPPYYRVVDGTMQMCY LSYSEAVEYCRQSGYIAQK >gi|330405144|gb|ADLB01000012.1| GENE 246 240912 - 243221 1745 769 aa, chain - ## HITS:1 COG:lin2587_2 KEGG:ns NR:ns ## COG: lin2587_2 COG3378 # Protein_GI_number: 16801649 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Listeria innocua # 308 741 2 427 456 238 31.0 3e-62 MYEKLPEELKKDGRFCLWKYEERNGRITKVPYQINGKKASSADKNTFSDFRMAVNAMDDY DGIGMGAFDDFCMVDIDHCVCGGKLTRMAENIIEKMDSYTEISPSGTGVRIVCKASSLFY DKGRYYINNSKIGLEIYVSDVTKKFCTLTGNAIRNCGVEERSGQLGIILETYMLRPIPKE KDREQDIPGSYLSDDSVVRLASDSRQGRKFKSLWNGEIPEGKSHSDADMSLASILAFWCG GDLEQMDRLFRKSGLMRSKWDRVQSGSTYGALTLEKAVAQAVEFYRPYATTSAESDFDDM LQKLVEWNVSDNRRYPWNDNGSGRLFADVYKDIARYVPERKKWYVYDGTRWIPDIGGLKT MELAKSLADTLVRYALTIADERRRKDYLEYSAKWQSRNYRNTYISDAQSVYPIAMSEFDR NIYYLNCQNGTLDLQTGEFHLHTPQDKLTKIAGAAYDPNAKSPRFIRFISEVMSGDKEKV RFMQKSLGYGLTGDTRYECMFFYYGATTRNGKGTLMESTLHVMGDYGLTVRPETIAAKPS VNSQNPTEDIARLAGIRFANISEPRRGLVLNEAQIKSMTGNDTLNARFLHENSFDFKPQF KLYVNTNYLPAITDMTLFSSGRIVIIPFDRRFEEWEQEQNLKAEFSKPEIASAILNWLIE GYTLLQEEGFDQPTAVKDAILSYQHDSDKMQLFVEEFLEKEKDAECRTSAVYQAYRNWCN NNGYFAENSRNFNQALRTIGMVVRKRPKDGGEKTTLLTGYRLLCRDFLC >gi|330405144|gb|ADLB01000012.1| GENE 247 243258 - 243434 214 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295108405|emb|CBL22358.1| ## NR: gi|295108405|emb|CBL22358.1| DNA binding domain, excisionase family [Ruminococcus obeum A2-162] # 1 58 1 58 58 92 96.0 9e-18 MDKVTMSVQEMAMQMGISLSKAYALTREEGFPIVRVGKRVLIPVSEFKVWLSARATEK >gi|330405144|gb|ADLB01000012.1| GENE 248 243577 - 244455 654 292 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291536674|emb|CBL09786.1| ## NR: gi|291536674|emb|CBL09786.1| 
hypothetical protein [Roseburia intestinalis M50/1] # 1 292 1 292 292 564 94.0 1e-159 MNEDFSLGKMNADARLTATFKYYEKKEQDGKIMITAYLGEGDVPKLKRGYGLDGNELWLS LNNLYQRIRGKDDNTAADDIISWCQQYAHPYYASEDIEEYRWDIEKDTEYWDFSTNILGN FTFDVHTMRKDLESLYRDTLVILMFKKCLERLDVSTDLAQITWTNEFANFNSFPMQKYLG KISAYLNQMNGVTMKLGLDENGELKVMPDFHSVFDAARFALSQYVSIPTDYPIAYADRVG VATCECCGRLFIKNGNRQKYCDNPECKKERNRRKSRTAYHRKIQEENDNRWA >gi|330405144|gb|ADLB01000012.1| GENE 249 244620 - 245285 503 221 aa, chain - ## HITS:1 COG:no KEGG:DET0065 NR:ns ## KEGG: DET0065 # Name: not_defined # Def: virulence-related protein # Organism: D.ethenogenes # Pathway: not_defined # 4 220 7 250 286 177 39.0 4e-43 MRNEIRFTLEPKQRPKLAQEIGKILGTAPHYERVPSCAYDIAGYRLEKEGVLHIPEGVAV EMVEHLIHQLRERGFQDDAEFTEEVLLPKDKLTIAVPREIFIDMALENLQKIIANKQILF QRAFRTDSTEIGITKEKINITWFPYTTNVDEIAAYTQFITRLCDMAKNAKRVSSKPTETD NDKYAFRCFLLRLGFIGNDYKTARKILLRNLTGNSAFRYGE >gi|330405144|gb|ADLB01000012.1| GENE 250 245440 - 245697 146 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METVVPVRRAERRKKDNPIQTAVASFRCGFSVVGVSKSLQTILLKTDGPLRANFRRIKQG GYPQNPYYMKISAEYRFFSRFCFAG >gi|330405144|gb|ADLB01000012.1| GENE 251 245648 - 246451 650 267 aa, chain - ## HITS:1 COG:no KEGG:Emin_0869 NR:ns ## KEGG: Emin_0869 # Name: not_defined # Def: hypothetical protein # Organism: E.minutum # Pathway: not_defined # 63 190 11 145 147 100 43.0 4e-20 MLLGYFTEEAYNKLLHDIQRNTENYSSPDEWLSTYFGGNDYFKMSSVDVSVFVPDYTPGK KDDAQKSREDLVNTRLIYDAFKSLTPLQASNKYLWTYLCHAEPAYSAYIRDRWLQEEREN TIRSRFFVTTPGSLLNDNALSRLWWYGYLTYDRKNSNHYWLTEILFTNQTICTDVMDTFN RMNFDRMRGVLMAIRDFKNVIGDNEGITEYFRECKKYLNHYAAVTTLEFLDSDEIRDLAF NYMMKLREEKQNGNGGTGPKSRKKKKR >gi|330405144|gb|ADLB01000012.1| GENE 252 246451 - 247386 669 311 aa, chain - ## HITS:1 COG:no KEGG:BCAH820_1014 NR:ns ## KEGG: BCAH820_1014 # Name: not_defined # Def: hypothetical protein # Organism: B.cereus_AH820 # Pathway: not_defined # 8 307 6 308 308 71 26.0 4e-11 MKSVKNEFPNPVLAAGRDDYIESCRFYTTINEEEIVVDTENIVFPMRYVLECNGLSAMVQ SGQAVAVVTVKSSAASYSKLFRFSADSKELTISVPKFAVVNKMDITGSIIAACNIDRFCC 
EGEFNDLYFGGSTFEIRKGDILTTEEIRSIFVDDSELEKPISSIFDISRNDEQDSEVVPN FYGEKIEIFLKSELYDLYYKFKDFNNGSLRRYAAGIIVYPVLVEAIGYVIGHYQNDGDVG DGTNFSEKRWFRAIDHKADVKGIDLRSYDGCPTTLANDILGDIALDALKSFKDTLDSEIN SGETQMIGGVD >gi|330405144|gb|ADLB01000012.1| GENE 253 247383 - 249314 1059 643 aa, chain - ## HITS:1 COG:no KEGG:BcerKBAB4_0821 NR:ns ## KEGG: BcerKBAB4_0821 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 4 637 3 607 612 219 29.0 3e-55 MTQVGWRFPPLSGGTRQGYTNNDIEVFKGQELIDNLAREICQNSLDAHIEGTDTPVRVVF ELRQISRSGYDVFSQYSRCLKGCRKYWGTEMDAKLCRFVEGAEATLTEDDIPVLIASDYN TKGLSGSHNGKLSSSWEALTGSDGVSVKSDENSAGSYGIGKNAPFACSSLSMVFYNTVAE NNESAFIGVARLATLLNEDGKPTQRVGKYQNNDEENEKWFPIYDTDENDFRDCFHRIERG TDVIIVGFTQATNWVTNVTKAVLKNFFVAISEGRLVVELKNGNDIRIIDESTLSQLFNDF SDDTEMLATSQLYKAFTSPDCKKTMEVLEADDVEVYIKSDSSYKRTIANFRATGMLVGTY YKRIFQHYAAVVIVRGSKLGELLKDTEPPRHNRWDYKQIESSDRKKRKLARESIQKIDDF VLNLLKSQFEVVTEDTVDAAGVGEYIPDDIDGLGGQSEGDDILKVKIKIGKIKTNHTHQG FTTEVAVQEEGTEEEGDVHNHERNPNPVPPRPRPPKPVTPDPDAPDPQPGATPGKGVKTV NTPNLSAQRAFPVSSSQGLYKIVIKPSETYENLYVECFAVGEDGKADSLDMESFTFNSKN IKISNGKAGPIKVEADTPAVFFAKFFRKEKMKLRLSLTEVVKK >gi|330405144|gb|ADLB01000012.1| GENE 254 249326 - 249559 280 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291172169|ref|ZP_06573343.1| ## NR: gi|291172169|ref|ZP_06573343.1| conserved hypothetical protein [Filifactor alocis ATCC 35896] # 1 77 19 95 95 112 81.0 9e-24 MATKETKRERFVRLAEARTNKIIDMMKLLGNCSSTANYEYTEEDVKKIFSAIEHELKNTK AKFNGTDSPKEERFTLE >gi|330405144|gb|ADLB01000012.1| GENE 255 249778 - 250905 737 375 aa, chain - ## HITS:1 COG:NMA1500 KEGG:ns NR:ns ## COG: NMA1500 COG0270 # Protein_GI_number: 15794400 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Neisseria meningitidis Z2491 # 3 364 20 332 337 165 33.0 2e-40 MPFTTIDLCAGIGGMRKGFELTGYFHNVLAAEIDKYACMTYQHLYGDDANHDLTSEEFKA ELDTIQYDVLLAGFPCQTFSKAGLEEGFNDTEKGIIFNHIAEIIRRTRPRAVFLENVDNL VRHDKGNTFRVIINTLEKTLNYKVIGVTYDMLGEPTYNGKDFIRNSRNFGIPQNRPRTYI 
MAFNRQRYGAAALAGIENCLPIENDWYLYEDLNELLEFHAEAKYYMASGYLDTLIRHRER EHGKGNGFGYRIVNEPGIEHPVANTIMATGGSGKERNLVFDPQDNIAGMILSTKKTPLND KGIRVMTPREWGKLQGFINYAFLDEEGIDHFSFPKEVAPTQQYKQFGNSVTIPVIETMAE FMLNCFRVLGDIPNE >gi|330405144|gb|ADLB01000012.1| GENE 256 250910 - 251323 298 137 aa, chain - ## HITS:1 COG:RSc3439 KEGG:ns NR:ns ## COG: RSc3439 COG3727 # Protein_GI_number: 17548156 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Ralstonia solanacearum # 13 137 13 139 154 76 32.0 1e-14 MIKKTKEQISYNMRQVKNKDSEIELMLRRELWKRNIRYRKNVTRIFGKPDIAFIRKKVAV FVDSEFWHGFNWEVKKNEVKSNRDFWIAKIERNMERDAEVNQYLQKEGWLVLRFWGNEIK KDVVTCADKIESALKER >gi|330405144|gb|ADLB01000012.1| GENE 257 251307 - 251630 166 107 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTQEQLAKAIGRSKMTVSQFEKGKNAPPQGELLEKIISTLTLTTEQENRLRFLSSESRRT VPCDIEDYFFENPSICKAIRAAQASSANDAFWNELSGRLKKNNDQED >gi|330405144|gb|ADLB01000012.1| GENE 258 252199 - 252456 238 85 aa, chain - ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 1 72 206 277 279 81 48.0 4e-16 MQCSYSKKSYPWDNACIESFHSLIKREWLNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSH CDYMSPNDYEELYRRLQQDELQLAG Prediction of potential genes in microbial genomes Time: Tue May 24 21:09:40 2011 Seq name: gi|330404621|gb|ADLB01000013.1| Lachnospiraceae bacterium 2_1_46FAA cont1.13, whole genome shotgun sequence Length of sequence - 115094 bp Number of predicted genes - 126, with homology - 117 Number of transcription units - 33, operones - 19 average op.length - 5.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 100 - 384 453 ## EUBREC_2191 hypothetical protein - Prom 441 - 500 6.6 2 2 Op 1 . - CDS 699 - 848 131 ## 3 2 Op 2 . - CDS 845 - 1378 343 ## DSY0717 hypothetical protein + Prom 1550 - 1609 8.2 4 3 Tu 1 . 
+ CDS 1665 - 3791 1773 ## SGO_1415 LPXTG cell wall surface protein, X-prolyl dipeptidylaminopeptidase, putative (EC:3.4.14.11) + Term 3814 - 3852 5.1 - Term 3847 - 3884 -0.1 5 4 Op 1 . - CDS 3906 - 5018 1216 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family - Prom 5045 - 5104 4.9 6 4 Op 2 . - CDS 5108 - 5386 399 ## gi|167758134|ref|ZP_02430261.1| hypothetical protein CLOSCI_00472 - Prom 5423 - 5482 3.4 - Term 5413 - 5477 9.3 7 5 Op 1 . - CDS 5487 - 6740 815 ## COG0107 Imidazoleglycerol-phosphate synthase 8 5 Op 2 . - CDS 6730 - 7296 457 ## COG3331 Penicillin-binding protein-related factor A, putative recombinase 9 5 Op 3 . - CDS 7298 - 8260 952 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 10 5 Op 4 . - CDS 8276 - 8851 518 ## EUBELI_01028 hypothetical protein 11 5 Op 5 . - CDS 8870 - 9622 790 ## COG0300 Short-chain dehydrogenases of various substrate specificities 12 5 Op 6 . - CDS 9640 - 11538 1622 ## COG1032 Fe-S oxidoreductase 13 5 Op 7 1/0.000 - CDS 11539 - 12207 589 ## COG0637 Predicted phosphatase/phosphohexomutase 14 5 Op 8 . - CDS 12201 - 12905 624 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 15 5 Op 9 . - CDS 12902 - 14287 1325 ## COG0144 tRNA and rRNA cytosine-C5-methylases 16 5 Op 10 . - CDS 14340 - 15179 579 ## COG0739 Membrane proteins related to metalloendopeptidases 17 5 Op 11 . - CDS 15242 - 16690 1641 ## COG2195 Di- and tripeptidases - Prom 16770 - 16829 5.6 + Prom 16708 - 16767 8.8 18 6 Tu 1 . + CDS 16803 - 17735 337 ## COG3314 Uncharacterized protein conserved in bacteria + Term 17777 - 17806 -0.2 - Term 17659 - 17703 7.6 19 7 Op 1 . - CDS 17730 - 18326 931 ## EUBREC_1679 hypothetical protein 20 7 Op 2 14/0.000 - CDS 18343 - 18819 424 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 21 7 Op 3 . - CDS 18813 - 19382 525 ## COG0742 N6-adenine-specific methylase - Prom 19403 - 19462 4.1 22 8 Op 1 . 
- CDS 19464 - 20123 497 ## CD3160 putative ABC transporter, permease protein 23 8 Op 2 . - CDS 20056 - 21450 733 ## CD3161 putative ABC transporter, permease permease 24 8 Op 3 3/0.000 - CDS 21451 - 22326 668 ## COG1131 ABC-type multidrug transport system, ATPase component 25 8 Op 4 40/0.000 - CDS 22400 - 23590 534 ## COG0642 Signal transduction histidine kinase 26 8 Op 5 . - CDS 23578 - 24258 537 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 27 8 Op 6 . - CDS 24327 - 24953 412 ## BHWA1_00599 acetyltransferase, GT family - Prom 25009 - 25068 6.3 + Prom 24968 - 25027 7.2 28 9 Tu 1 . + CDS 25057 - 25878 530 ## COG0789 Predicted transcriptional regulators + Term 26023 - 26060 5.5 - Term 25792 - 25828 1.0 29 10 Op 1 . - CDS 25872 - 26291 450 ## Apre_1653 hypothetical protein 30 10 Op 2 . - CDS 26309 - 27031 496 ## gi|169349649|ref|ZP_02866587.1| hypothetical protein CLOSPI_00387 31 10 Op 3 8/0.000 - CDS 27033 - 27893 834 ## COG1131 ABC-type multidrug transport system, ATPase component 32 10 Op 4 . - CDS 27874 - 28257 297 ## COG1725 Predicted transcriptional regulators 33 11 Tu 1 . - CDS 28360 - 29019 627 ## CD1901 putative phage repressor - Prom 29240 - 29299 5.9 + Prom 29131 - 29190 7.9 34 12 Tu 1 . + CDS 29270 - 29476 325 ## Cphy_1320 small acid-soluble spore protein alpha/beta type + Term 29491 - 29556 17.4 - Term 29468 - 29505 3.4 35 13 Op 1 . - CDS 29549 - 30067 576 ## FMG_0453 hypothetical protein 36 13 Op 2 . - CDS 30067 - 30285 235 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 37 13 Op 3 . - CDS 30295 - 30774 471 ## EUBREC_2826 hypothetical protein - Prom 30838 - 30897 5.8 - Term 30872 - 30916 1.3 38 14 Op 1 4/0.000 - CDS 30924 - 32984 1954 ## COG1200 RecG-like helicase - Term 32999 - 33037 2.0 39 14 Op 2 9/0.000 - CDS 33058 - 34716 2156 ## COG1461 Predicted kinase related to dihydroxyacetone kinase 40 14 Op 3 . 
- CDS 34735 - 35094 563 ## COG1302 Uncharacterized protein conserved in bacteria - Prom 35135 - 35194 6.2 + Prom 35184 - 35243 7.6 41 15 Tu 1 . + CDS 35289 - 35474 239 ## PROTEIN SUPPORTED gi|160881022|ref|YP_001559990.1| ribosomal protein L28 + Term 35484 - 35534 3.9 - Term 35479 - 35514 3.1 42 16 Op 1 . - CDS 35526 - 35672 153 ## 43 16 Op 2 . - CDS 35723 - 36124 457 ## EUBREC_1666 hypothetical protein 44 16 Op 3 . - CDS 36148 - 37023 614 ## EUBELI_00953 hypothetical protein 45 16 Op 4 . - CDS 37032 - 37370 402 ## gi|210615640|ref|ZP_03290686.1| hypothetical protein CLONEX_02904 46 16 Op 5 19/0.000 - CDS 37396 - 38994 1584 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 47 16 Op 6 . - CDS 38843 - 40303 1401 ## COG0772 Bacterial cell division membrane protein 48 16 Op 7 . - CDS 40312 - 42678 1476 ## COG0826 Collagenase and related proteases - Term 42686 - 42715 1.4 49 16 Op 8 . - CDS 42718 - 43128 516 ## EUBELI_00944 hypothetical protein 50 16 Op 9 29/0.000 - CDS 43201 - 44202 1243 ## COG2255 Holliday junction resolvasome, helicase subunit 51 16 Op 10 . - CDS 44219 - 44824 699 ## COG0632 Holliday junction resolvasome, DNA-binding subunit 52 16 Op 11 . - CDS 44844 - 45335 654 ## COG4769 Predicted membrane protein 53 16 Op 12 . - CDS 45346 - 45702 512 ## EUBREC_1653 hypothetical protein + Prom 45637 - 45696 6.4 54 17 Tu 1 . + CDS 45828 - 46820 759 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis - Term 46789 - 46830 3.2 55 18 Op 1 . - CDS 46845 - 47189 499 ## gi|210615649|ref|ZP_03290695.1| hypothetical protein CLONEX_02913 56 18 Op 2 . 
- CDS 47233 - 47670 300 ## gi|210615650|ref|ZP_03290696.1| hypothetical protein CLONEX_02914 57 18 Op 3 12/0.000 - CDS 47680 - 48471 933 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 58 18 Op 4 3/0.000 - CDS 48484 - 49059 776 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA 59 18 Op 5 13/0.000 - CDS 49071 - 49769 907 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 60 18 Op 6 12/0.000 - CDS 49762 - 50370 893 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 61 18 Op 7 12/0.000 - CDS 50370 - 51302 1110 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 62 18 Op 8 . - CDS 51321 - 52655 1453 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 52724 - 52783 5.5 - Term 52753 - 52810 4.0 63 19 Op 1 4/0.000 - CDS 52830 - 53921 1180 ## PROTEIN SUPPORTED gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 64 19 Op 2 2/0.000 - CDS 53902 - 54756 572 ## PROTEIN SUPPORTED gi|168180380|ref|ZP_02615044.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase / ribosomal protein S1 homolog 65 19 Op 3 1/0.000 - CDS 54767 - 55423 701 ## COG0283 Cytidylate kinase 66 19 Op 4 1/0.000 - CDS 55439 - 56668 1085 ## COG2081 Predicted flavoproteins 67 19 Op 5 . - CDS 56669 - 57598 789 ## COG1737 Transcriptional regulators - Prom 57631 - 57690 6.2 - Term 57689 - 57720 2.7 68 20 Op 1 4/0.000 - CDS 57732 - 59141 1742 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase 69 20 Op 2 9/0.000 - CDS 59159 - 60310 1584 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 70 20 Op 3 . - CDS 60374 - 60733 700 ## COG0511 Biotin carboxyl carrier protein 71 20 Op 4 . - CDS 60758 - 61522 1029 ## EUBELI_00921 hypothetical protein 72 20 Op 5 . - CDS 61537 - 62973 1474 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) - Prom 63171 - 63230 8.1 + Prom 63390 - 63449 5.2 73 21 Tu 1 . 
+ CDS 63474 - 64844 1352 ## COG3104 Dipeptide/tripeptide permease + Term 64883 - 64912 0.5 + Prom 64894 - 64953 9.4 74 22 Tu 1 . + CDS 64989 - 66776 1783 ## COG0006 Xaa-Pro aminopeptidase + Term 66805 - 66842 1.2 - Term 66705 - 66739 3.0 75 23 Op 1 . - CDS 66832 - 68034 909 ## COG1686 D-alanyl-D-alanine carboxypeptidase 76 23 Op 2 21/0.000 - CDS 68092 - 68661 652 ## COG1386 Predicted transcriptional regulator containing the HTH domain 77 23 Op 3 . - CDS 68676 - 69425 612 ## COG1354 Uncharacterized conserved protein 78 23 Op 4 . - CDS 69437 - 70306 642 ## COG1408 Predicted phosphohydrolases 79 23 Op 5 . - CDS 70364 - 71143 538 ## COG1686 D-alanyl-D-alanine carboxypeptidase 80 23 Op 6 . - CDS 71179 - 72738 854 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs 81 23 Op 7 . - CDS 72728 - 72853 69 ## 82 23 Op 8 . - CDS 72921 - 73862 828 ## EUBELI_20121 endonuclease 83 23 Op 9 . - CDS 73884 - 74135 258 ## gi|160914469|ref|ZP_02076684.1| hypothetical protein EUBDOL_00473 84 23 Op 10 . - CDS 74166 - 74345 122 ## - Prom 74417 - 74476 8.2 85 24 Tu 1 . - CDS 74497 - 74874 383 ## COG1686 D-alanyl-D-alanine carboxypeptidase - Prom 74902 - 74961 6.9 - Term 74951 - 74999 5.8 86 25 Op 1 . - CDS 75001 - 75189 239 ## 87 25 Op 2 . - CDS 75202 - 75789 571 ## COG4905 Predicted membrane protein 88 25 Op 3 . - CDS 75761 - 75829 62 ## 89 25 Op 4 . - CDS 75859 - 76035 288 ## PROTEIN SUPPORTED gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 - Prom 76108 - 76167 7.9 90 26 Tu 1 . - CDS 76179 - 78818 3247 ## COG0013 Alanyl-tRNA synthetase - Prom 78871 - 78930 9.5 - Term 78924 - 78960 6.4 91 27 Op 1 . - CDS 78967 - 79089 167 ## 92 27 Op 2 . - CDS 79152 - 80660 1848 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain 93 27 Op 3 . - CDS 80670 - 83048 2195 ## COG1410 Methionine synthase I, cobalamin-binding domain 94 27 Op 4 . - CDS 83052 - 83696 477 ## EUBELI_01383 5-methyltetrahydrofolate--homocysteine methyltransferase 95 27 Op 5 . 
- CDS 83705 - 84571 740 ## COG0685 5,10-methylenetetrahydrofolate reductase 96 27 Op 6 . - CDS 84590 - 86026 1302 ## COG0297 Glycogen synthase 97 27 Op 7 2/0.000 - CDS 86091 - 86879 793 ## COG0784 FOG: CheY-like receiver - Prom 86917 - 86976 4.5 - Term 86941 - 86984 -0.7 98 27 Op 8 3/0.000 - CDS 86988 - 88043 1013 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 - Prom 88073 - 88132 4.3 - Term 88055 - 88105 3.2 99 27 Op 9 8/0.000 - CDS 88135 - 89811 1413 ## COG0497 ATPase involved in DNA repair 100 27 Op 10 1/0.000 - CDS 89821 - 90273 630 ## COG1438 Arginine repressor 101 27 Op 11 5/0.000 - CDS 90276 - 91085 680 ## COG0061 Predicted sugar kinase 102 27 Op 12 6/0.000 - CDS 91099 - 91908 953 ## COG1189 Predicted rRNA methylase 103 27 Op 13 13/0.000 - CDS 91917 - 93782 1753 ## COG1154 Deoxyxylulose-5-phosphate synthase 104 27 Op 14 . - CDS 93795 - 94685 1064 ## COG0142 Geranylgeranyl pyrophosphate synthase 105 27 Op 15 . - CDS 94675 - 94878 286 ## gi|167758089|ref|ZP_02430216.1| hypothetical protein CLOSCI_00427 106 27 Op 16 1/0.000 - CDS 94847 - 96082 1254 ## COG1570 Exonuclease VII, large subunit 107 27 Op 17 10/0.000 - CDS 96093 - 96491 475 ## COG0781 Transcription termination factor - Prom 96535 - 96594 13.1 - Term 96577 - 96619 8.1 108 28 Op 1 . - CDS 96624 - 97004 493 ## COG1302 Uncharacterized protein conserved in bacteria 109 28 Op 2 . - CDS 97078 - 98670 1580 ## COG4108 Peptide chain release factor RF-3 110 28 Op 3 . - CDS 98739 - 99341 794 ## EUBREC_2230 hypothetical protein 111 28 Op 4 . - CDS 99361 - 99909 698 ## Cphy_2518 hypothetical protein 112 28 Op 5 . - CDS 99890 - 100222 324 ## gi|167758082|ref|ZP_02430209.1| hypothetical protein CLOSCI_00420 113 28 Op 6 . - CDS 100235 - 101380 1019 ## Cphy_2520 sporulation stage III, protein AE 114 28 Op 7 . - CDS 101386 - 101772 563 ## Cphy_2521 stage III sporulation protein AD 115 28 Op 8 . - CDS 101783 - 101977 198 ## EUBREC_2235 hypothetical protein 116 28 Op 9 . 
- CDS 101989 - 102507 522 ## Cphy_2523 hypothetical protein 117 28 Op 10 . - CDS 102522 - 103445 840 ## COG3854 Uncharacterized protein conserved in bacteria - Prom 103477 - 103536 2.3 - Term 103499 - 103544 7.3 118 29 Tu 1 . - CDS 103564 - 105654 2708 ## COG1185 Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) - Prom 105761 - 105820 5.1 - Term 105822 - 105872 8.7 119 30 Op 1 . - CDS 105874 - 106329 544 ## Lebu_1182 GCN5-related N-acetyltransferase 120 30 Op 2 . - CDS 106326 - 106988 439 ## COG4845 Chloramphenicol O-acetyltransferase 121 30 Op 3 . - CDS 107023 - 107571 229 ## PROTEIN SUPPORTED gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase - Prom 107662 - 107721 10.6 - Term 107655 - 107719 10.8 122 31 Op 1 . - CDS 107728 - 111861 4359 ## COG3250 Beta-galactosidase/beta-glucuronidase 123 31 Op 2 . - CDS 111945 - 112010 89 ## - Prom 112059 - 112118 3.4 124 32 Op 1 . - CDS 112123 - 114087 1715 ## COG3250 Beta-galactosidase/beta-glucuronidase 125 32 Op 2 . - CDS 114118 - 114240 78 ## - Prom 114303 - 114362 5.9 - Term 114352 - 114382 5.0 126 33 Tu 1 . 
- CDS 114493 - 114963 400 ## COG1943 Transposase and inactivated derivatives - Prom 115004 - 115063 6.2 Predicted protein(s) >gi|330404621|gb|ADLB01000013.1| GENE 1 100 - 384 453 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK >gi|330404621|gb|ADLB01000013.1| GENE 2 699 - 848 131 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLEDMIEGWDDKTNGVTDDYKYCKSCKIQYFKRKNIMIQVLFVGYGNL >gi|330404621|gb|ADLB01000013.1| GENE 3 845 - 1378 343 177 aa, chain - ## HITS:1 COG:no KEGG:DSY0717 NR:ns ## KEGG: DSY0717 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 177 1 187 193 102 34.0 6e-21 MILWTIQNKLAYESMKETGVLRVDENYILDDLFKEPYLWMASQMKKRIGNPPEGVIFPVW AWYRWEGNRKRLDMRFHGCNWGEKGTPLVLLTIDVPEQSILLSDFDYWHCVLNDGEIIFP YNESVVYSEEEKQKSWENIFDIECSFDGESHKFLTTQATLWEIKSEWVKKVEYFISR >gi|330404621|gb|ADLB01000013.1| GENE 4 1665 - 3791 1773 708 aa, chain + ## HITS:1 COG:no KEGG:SGO_1415 NR:ns ## KEGG: SGO_1415 # Name: not_defined # Def: LPXTG cell wall surface protein, X-prolyl dipeptidylaminopeptidase, putative (EC:3.4.14.11) # Organism: S.gordonii # Pathway: not_defined # 10 705 259 929 1057 441 38.0 1e-122 MTNKNYSNVPVFENGMAQPVFPFTDGKTGEKYNPNTSSIVRYCVYVESDYDMDGDGKRDL VKVLVQVPRSAVEGNYKAASLFEARPYCAGVQEDGYAHMKEVESKEYRKFDFADLNKKVP ARVPTDYISAMDLALKADPADWYYPDKGNNNNMVYENLDNFNYYLVRGFAVIVSAGFGAL GSDGFNYVGSDYERDAFKFVVEWLHGDRTAYADREGTIATKADWSNGNVAMTGRSYAGTM PFAVATTGVEGLKTIIPVAGISDWYSQQNQQGAQRYWPKEMLNSFLAYFCSSRYNDTTLT EKQLDDIAAFHHELSLQQLKCGFDYDKDFWGSGNYRLNADKIKCSALIVQGLNDENVSTK QFEMMYKSFEKAGQTVKAIIHQGPHITPTMPNKNYGILIDGKFYDDIVNEWISHYLYGID NNAETMPTVLAQDNINQKKWETEHSWTTEHTIKLESAETGVTVIDTDWEKANISAENFDD IVAQKSTNMNKRYVTAPFKKPMTIQGTICVNLKAALKDGDVENDFNPENRNDADILTMKL GSSAISGKMDDVEITVLLCDVCDEKFDSIQTVDPERNIIPVVTVKENGIINGGDLPAFNE 
VEFSTVHKNYRVITRAFADLCNPEAGYEPETATNSIKLKKGEYHNYHVYLNATRYTVEPG HRLAVVIATEDPVNCLIHKTYSIEIENASVNAEVPVTKVFENCTLNTK >gi|330404621|gb|ADLB01000013.1| GENE 5 3906 - 5018 1216 370 aa, chain - ## HITS:1 COG:CAC0767 KEGG:ns NR:ns ## COG: CAC0767 COG1453 # Protein_GI_number: 15894054 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Clostridium acetobutylicum # 10 368 1 372 376 253 38.0 5e-67 MRYKEFKDGVQLSRLGMGTMRLPILDNDNAKIDYEKAAKLVDDCMEQGVNYYDTAYIYHG GKSEEFLGKALAKYPRESFYVTDKYNFQAEPDYRKQFKEQLSRLNMDRIDFYLLHSIQDS FADEMIGNGCIEYFDQMKKEGRIGYLGFSFHGSPEVMKKLLPLYPWDFVQIQLNYYDWYF EDAKELYEMLAEAKIPVMVMEPAHGGLLVNLTEDAAKELKELNSENSIASWAMRWVMSLD SVQVVLSGMSDENQVNDNVKTFAEADPLTTEEQERIEKAAKIQHAAITIPCTACNYCTPN CPKGLDIPTLLKCYNEAKIGGAWRVRHLKDMPEEKGPAGCIGCKACTRHCPQGFEIPRYM EELAKMLEEV >gi|330404621|gb|ADLB01000013.1| GENE 6 5108 - 5386 399 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167758134|ref|ZP_02430261.1| ## NR: gi|167758134|ref|ZP_02430261.1| hypothetical protein CLOSCI_00472 [Clostridium scindens ATCC 35704] # 1 90 1 91 100 89 50.0 7e-17 MNETEKRRRQLLEETRRKYGDTRTPPAVHPRYGAIYSDLYEGKQAQNDGLFMRTIIAVLL FALFVAMDYSGEKVATVDSKRVVREIQAEMGD >gi|330404621|gb|ADLB01000013.1| GENE 7 5487 - 6740 815 417 aa, chain - ## HITS:1 COG:SA2465 KEGG:ns NR:ns ## COG: SA2465 COG0107 # Protein_GI_number: 15928259 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Staphylococcus aureus N315 # 1 131 1 132 252 85 33.0 2e-16 MSYKRLIPCIFIYNGKAVKWFNDKEILTENVVELAQRYSNQGADELLVFDLSDNDEEHTE SIELIQKMNRVIPIPMIAGGNIKRLEDVKQILYAGVKRVMLNFSKPDMFSLAQEAANRFG KEKVAVSLNDFDALFKQQQLIEQYSSELIFMHRLDLDSVMNITNIPCIIVTDTVREEELF NILRCPGVKGLSGRFVSNVEMDFSQFKDKCAERDIKITSFESLMDYSEFHTDVNGILSVI TQDYKTGEILSVRKMNREAFETTVKTGKMTYYNPDSCGMESKSNQYVKSMTVNESRDTLL AKIDFREKTATVFTQTLIGSDKSEKNPLQIFEYVYQTILERKQNPKEDSYTTYLFDKGID KILKKIGEEATEIVIAAKNPSQDELKGEFIDFLYHAMVLMVEKGVTWEELTEEMAKR >gi|330404621|gb|ADLB01000013.1| GENE 
8 6730 - 7296 457 188 aa, chain - ## HITS:1 COG:BS_yppB KEGG:ns NR:ns ## COG: BS_yppB COG3331 # Protein_GI_number: 16079288 # Func_class: R General function prediction only # Function: Penicillin-binding protein-related factor A, putative recombinase # Organism: Bacillus subtilis # 10 171 31 195 206 112 39.0 6e-25 MGTWNSRGLRGSTLEDFINRTNEKYLEKGLALIQKIPTPITPVRIDKEHRHITLAYFEQQ STVDYIGAVQGIPVCFDAKECSAGTFPLQNIHEHQVTFMKNFEKQGGIAFLLIYYSEKNI LYYMRLKELLQYWERSLNGGRKSFRFDELDSEFFLTLSSGCFVPYLNAIQKDLDMREIAD RIGERHEL >gi|330404621|gb|ADLB01000013.1| GENE 9 7298 - 8260 952 320 aa, chain - ## HITS:1 COG:CAC1015 KEGG:ns NR:ns ## COG: CAC1015 COG0564 # Protein_GI_number: 15894302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 3 314 2 311 318 196 35.0 7e-50 MQEIQIAANEAGQRLDKLLAKYLSEAPKSFLYKMLRKKNIVLNGKKASGNEKLVAGDSVK LFLSDETIQKFSKEITVCKSNTKLDILYEDDDILLVNKPAGMLSQKAEAKDISLVEHLIS YLLESGQLTRENLKSFKPSICNRLDRNTSGLVVCGKSLKGLQTMGQLFKERKLKKYYRCI VAGNVTEKQYVKGYLIKDEKKNQVTVSDTYFPESQEIETEYRPIQQLKQGTLLEVHLITG KTHQIRAHLASQGHAVIGDYKYGSREINDRYKKEYQLSHQLLHAYRLEVPKTKELPQLSE KMFVAPLPKQFQKIIEGESR >gi|330404621|gb|ADLB01000013.1| GENE 10 8276 - 8851 518 191 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01028 NR:ns ## KEGG: EUBELI_01028 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 191 3 191 191 92 30.0 7e-18 MRIAKGKSGYIQSKKKWEITKTCLEFGMVLAVFLTGYFTTKTRLNLLTVAAVLGCLPAAK ALVGVIMLFPHHSMEREKVEEIEEKAPTLVKAYDMILTSYEKIMPIDSIVIFENIVCGYS GSNKIDVNDTASYIKKMLRNNQYDKVSVKIFTDYKTYMTRVEGMENMAVAERKSSQEHEE GIRRTILSLSM >gi|330404621|gb|ADLB01000013.1| GENE 11 8870 - 9622 790 250 aa, chain - ## HITS:1 COG:lin2083 KEGG:ns NR:ns ## COG: lin2083 COG0300 # Protein_GI_number: 16801149 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Listeria innocua # 2 242 8 252 263 123 36.0 3e-28 
MKIAMITGASSGFGREFVRQIDGLYKELDEIWVVARRVERLHELQSRTMTKLRVFGGDLL EEEIYQEISEALSDENPNIRMLVNAAGFGKMGTVEEIDEKTQLEMIDTNCRSLTRMTKIC LPYLTKGSRVVNIASAAAFCPQPSFAVYAATKSYVLSFSRALRAELEDREIYVTAVCPGP AKTEFFDIAGMSANILKKMTMAQPEKVVKQALIDAKNRKEISVYGGAMKGARVLTKIIPH KIAVEVMKQF >gi|330404621|gb|ADLB01000013.1| GENE 12 9640 - 11538 1622 632 aa, chain - ## HITS:1 COG:MA4618 KEGG:ns NR:ns ## COG: MA4618 COG1032 # Protein_GI_number: 20093399 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanosarcina acetivorans str.C2A # 5 602 126 742 742 604 49.0 1e-172 MQHGFLPISKEEMKQRGWEQVDFVYVSGDAYVDHPSFGHAIITRVLEAHGYKVGIIAQPD WKNKESIMVFGEPRLGFLVSSGNMDSMVNHYTVSKKRRKADAFTPGGVMGKRPDYAAVVY SNLIRQVYKKTPIILGGIEASLRRLAHYDYWSNKLKRSILLDSGADILSYGMGERSIVEI AEALDSGMEVSDITYIDGTVCKVKSLDCVYDAITLQSYDELQKDKLNYARSFYVQYCNTD PFTAKRLVEPYGEHLYIVQNPPSKPLSQSEMDRVYSYPYMRTYHPSYEELGGVPAIEEVK YSLISNRGCFGACSFCALTFHQGRIIQTRSHESLIEEAKKFIWDKDFKGYIHDVGGPTAN FRAPACEKQLTHGACKNKQCLFPKPCKNMRVDHKDYLRLLRKLRELPNVKKVFIRSGIRF DYLMADKDDTFFKELCEHHVSGQLKVAPEHISDAVLQKMGKPENSVYQAFTKKYKQINER IGKNQFLVPYLMSSHPGSTMKEAVELAEYLRDLGYMPEQVQDFYPTPSTISTCMYYTGVD PRTMEKVYVPVNPHEKAMQRALIQYRNPKNYDLVLEVLKVADRMDLVGYDKKCLIRPRQQ KQQQWEKTHQQPQNKKPRKKKAIRNVHKKKNK >gi|330404621|gb|ADLB01000013.1| GENE 13 11539 - 12207 589 222 aa, chain - ## HITS:1 COG:CAC3231 KEGG:ns NR:ns ## COG: CAC3231 COG0637 # Protein_GI_number: 15896477 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 1 214 1 213 215 164 43.0 1e-40 MLENIKAVIFDLDGSLADSMWVWTTIDMEYIKEYQLEVPPHFYTDMEGKSFTETAQYFLE TFPCLPLTLEELKQDWVNRAYGKYTKEVKLKEGALEFLQMLKEKDIRTGIATSNGRQLVE EFLRANHIDTFFDTVWTSCDVCIGKPAPDVYLKAAESLQTSPENCLVFEDVPMGILAGKN AGMKVCAVEDEFSKVQEKKKRELADYYIQNYNDIQNKTYEVL >gi|330404621|gb|ADLB01000013.1| GENE 14 12201 - 12905 624 234 aa, chain - ## HITS:1 COG:L109527 KEGG:ns NR:ns ## COG: L109527 COG1187 # Protein_GI_number: 15674222 # Func_class: J Translation, ribosomal 
structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Lactococcus lactis # 2 233 32 271 273 230 53.0 2e-60 MMRLDKYLCEMNVGSRSEVKKYIRQGRITVDGKTAVKPEEKIDEHSQKVCVDSRVIGYEA FEYYMLYKPSGVVSATTDRKEKTVLDLIEEKKRKDLFPVGRLDKDTEGLLLITNDGEMAH RLLSPKKHVDKVYYAKIDGKVTEREVQIFSEGLSIGNDEYAKPSALKILKEGNISEIELT IQEGKFHQVKRMFHAVGMEVIYLKRISMGNLVLDESLKKGEYRRLTEKEVEQLC >gi|330404621|gb|ADLB01000013.1| GENE 15 12902 - 14287 1325 461 aa, chain - ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 1 301 1 280 280 201 40.0 3e-51 MNLPTIFEEKMKMLLGEEFSDYIKCYDEPRFYGLRVNTKKISVEKFKEICPFEIRPIPWI ENGFYYDGDKVQPAKHPYYFAGLYYLQEPSAMTPANRLPIEPGDKVLDVCAAPGGKATEL GAKLKGQGVLIANDISNSRAKGLLKNIEIFGVGNVLVLSEEPGKLEEYFPEYFDKILIDA PCSGEGMFRKDKKMVKAWEEHGPSFFAKIQRSIVMQAARMLKPGGMILYSTCTFDAEENE GTIEYLLKECPEFFLKDIVPYEGFAQGKPEVTESKLEDFRKTVRIWPHKMHGEGHYLALL QKGERREEDGVVKKQKKAKKVPTELEEFFKDVTWEMDWSRLEIHGERVYYMPENLPNVKG IRFLRTGLYLGDVKKNRFEPTQAFAMCLKKEEYAHTLSLPADDERVIKYLKGETLEVDDL VSPKEKGWQLVCIEDYPLGWGKLSNGTLKNKYLPGWRWQSA >gi|330404621|gb|ADLB01000013.1| GENE 16 14340 - 15179 579 279 aa, chain - ## HITS:1 COG:BS_yunA KEGG:ns NR:ns ## COG: BS_yunA COG0739 # Protein_GI_number: 16080287 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Bacillus subtilis # 122 275 201 346 349 112 41.0 6e-25 MQRKMGGNLLLLVLLFSFFIVLGKHKIISLSQMNRLNEKTVTSEEFRKQQLDKGVLSVLE KEENKGKIAGLYLLESRFSEQNFSEKNKRNTYKQLYMKWAVNKEWTEYHKVCKAIWNDLK YFPVPKSSQNKKLTVSYVDSWQAKRTYGGERGHEGTDIMAKKNVAGVYPVVSMTDGIVSE KGWLEKGGYRIGITGSSGGYFYYAHLDSYSDLKVGQKVKAGEVLGYMGDTGYGKEGTKGK FPVHLHVGVYFYKNGKEISVNPYWLLRYLENHKLKYAFS >gi|330404621|gb|ADLB01000013.1| GENE 17 15242 - 16690 1641 482 aa, chain - ## HITS:1 COG:YPO3230 KEGG:ns NR:ns ## 
COG: YPO3230 COG2195 # Protein_GI_number: 16123389 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Yersinia pestis # 3 479 4 480 486 345 38.0 1e-94 MALENLEPKVVFHYFEELTKIPHGTYNTKEISDYCVNFAKDLGLEYMQDETNNVIIKKAG TKGYENAEPIIIQGHLDMVCEKIPGSAHDFTKDPLDVYEEDGMVQARGTTLGGDDGIAIA YAMAILASDDLEHPPIEAVFTVDEEEGMGGANAIDLSVLKGRMLLNLDSDVEGTIVAGCE GGYENVIRIPIEREEKEGTLISICIGGLKGGHSGLEIHEQHGNSNKLMGRLLMTLAAEDV DFSLVEFNGGSKPNVITSLTEGKIILCPELVQKAKEVIQECEEVMKEEFGQDEPDLTITV SEEAGQKVNAMTKESTKKTVFLVTATPDGVQCFSRNIKGLVETSLNLGIAKTEEKQVTAV YRVRSAVQSKKTNMKKVLAMWAEYLGGTPLVEGEYPAWAYKTDSKIRPIVVDTYKELFGK EPIVTTIHAGLECGLFAGKLKGLDCVSIGPEMLSIHSPNEKLSVASTQRTWELLKGILKN CK >gi|330404621|gb|ADLB01000013.1| GENE 18 16803 - 17735 337 310 aa, chain + ## HITS:1 COG:BS_ylbJ KEGG:ns NR:ns ## COG: BS_ylbJ COG3314 # Protein_GI_number: 16078567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 10 304 10 342 408 100 26.0 5e-21 MKRKLLPFALISIFFFMLVKPNETFSGASEGLLLWFQIILPTLFPFMIITNLLVRTNTMF YFSKCLSPILSPLFHVSANGTFAILTGFLCGYPMGAKVTADLVRSGKISKKEGSYLLSFC NNTSPMFIMNYVVLKSLKQENLLIPSLLILFLSPILCSFLFRIYYLKEEKHYSSSALTSP HFYFHFQILDSTIMNGFEMITKVGGYIILFSVLITLLREIPCSSPLWTTFVLPSLEITNG IPMIISLCHSFPLSFILVMALVSFGGLCSIAQTKCMLEGTGLSIVSYTIEKLITAMVTSL LSLVYLYFIL >gi|330404621|gb|ADLB01000013.1| GENE 19 17730 - 18326 931 198 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1679 NR:ns ## KEGG: EUBREC_1679 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 182 14 195 231 164 62.0 2e-39 MSSRIEQIIEEIEEYIDSCKFQALSTSKIIVNKEQIDELLRELRMKTPDEIKRYQKIIAN KDAILADAQAKAQSMIEEAQIQTNELVSEHEIMQQAYAQANEIVMTATDQAQEILDNATN DANDIRIGAMQYANDILANAEGIIAHSLDTYTTKYESLINSLRECYEVVRTNRAELEVPD KSDRGIESEFGNENTNTL >gi|330404621|gb|ADLB01000013.1| GENE 20 18343 - 18819 424 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 5 156 7 158 164 
167 52 2e-40 MLRAIYPGSFDPVTLGHMDIIKRSCKIVDELIVGVLNNNAKTPLFSVEERVKMLREATKE LKNVKVVEFEGLLVDFAKAIDAKVVIRGLRAITDFEYELQMTQTNHKLEPDVETLFLTTS IEYSFLSSTTVKEVAAFGGDITQFVPEVVVKEIEEKMK >gi|330404621|gb|ADLB01000013.1| GENE 21 18813 - 19382 525 189 aa, chain - ## HITS:1 COG:FN1329 KEGG:ns NR:ns ## COG: FN1329 COG0742 # Protein_GI_number: 19704664 # Func_class: L Replication, recombination and repair # Function: N6-adenine-specific methylase # Organism: Fusobacterium nucleatum # 1 149 1 150 182 127 44.0 1e-29 MRVIGGSAKRLQLKTLDGLETRPTTDRIKETLFNMISPYLCDCMFLDLFSGSGGIGIEAL SRGAKEAVFVENNPKAMQYIKENLAFTKLDKKAVTMQTDVITALKRLEGTTQFDYIFMDP PYAKGLEEQVLEYLAESDLVNEDTVIIVEAAKETEFTYLERFGFYIIKTKEYKTNKHLFI ERERGEEVC >gi|330404621|gb|ADLB01000013.1| GENE 22 19464 - 20123 497 219 aa, chain - ## HITS:1 COG:no KEGG:CD3160 NR:ns ## KEGG: CD3160 # Name: not_defined # Def: putative ABC transporter, permease protein # Organism: C.difficile # Pathway: not_defined # 31 206 2 176 184 76 30.0 8e-13 MITSYKSILDFLRIDFPLVFKNSLLLGIVYLFLIPVIRGTSNLDAIHSADVFGQSLVLTG AILLIPITKWELETSVKEIVCTKAWSYIKSVSIRLICGFLLITVMIVTFAFIMQFKNCVF PFWEYVSVTILYAVFLGILGLCFSQFAGNVIVGYLAVLGYWSLCQFRIICEGDMLYMFPF VKGIIDMERVWILLGTDIVLISSFQVMAKWSKKDFFCRK >gi|330404621|gb|ADLB01000013.1| GENE 23 20056 - 21450 733 464 aa, chain - ## HITS:1 COG:no KEGG:CD3161 NR:ns ## KEGG: CD3161 # Name: not_defined # Def: putative ABC transporter, permease permease # Organism: C.difficile # Pathway: not_defined # 2 442 3 407 429 273 37.0 1e-71 MLFFKEVKKICFSLVYVLFIGLLLFSWHDNFYGVTSKEISALQENDTSISSELAGGSILK KPEKEAESYGMKNKEVPEKIMCGGTDMLIIEYLKNSYATYPFTYYKEVVLNEDEQKEILD IIKKITGLNEQQINNLPDDYFPAVNGNIIHFGSGEEQSKSGTFSFEMGNDEDITTNNDYT KHFIAQVSYEEFQELMKKAENIIGRGSNYSMEMLSEYYGLSEMTYEEAMDEYNKTIYDDK VSVAFARLFCDYLTRDLGLYPVFLAVIFWLKDRRNRMNELIDCKQIGTTKLVTVRFLAML FAVMLPIIILSFESFIPLIKFSADTGIAIDMFAFLKYIIWWLLPTAMIVTSLGMFLTILT STPIAILIQFAWWFIDTSMTALSGDTKIFTLMIRHNLLRGSEIINQDFTLICLSRTLFAI LSFVLAGLSIAIYNRKRGGELNYDYFLQKHFGFFKNRFSSRLQK >gi|330404621|gb|ADLB01000013.1| GENE 24 21451 - 22326 
668 291 aa, chain - ## HITS:1 COG:CC3566 KEGG:ns NR:ns ## COG: CC3566 COG1131 # Protein_GI_number: 16127796 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Caulobacter vibrioides # 5 284 4 286 294 234 42.0 2e-61 MEIKIEHLNKNYGKQSALKDISLHIPVGMYGLLGENGAGKTTLMRILATMLELSTGTVKI NGIDIKNKKEIRKIIGYLPQEFSVYPNMSVYSALDYLGILAELPNNIRRQRIDELLKQVN LEQEKNKKFKNLSGGMKRRFGIAQALLNNPKILIIDEPTAGLDPEERLRFYNLLSELAMD RIVLLSTHIVGDIEATCSKVAVLCSGEVVFEGQIEKLLQLGNEKVYTTTIMQNELNNFKK KYRVISIHQNGTDVSCRFLSSASPQKDWVSCQPTIEDSYMYLLSCFKQGVE >gi|330404621|gb|ADLB01000013.1| GENE 25 22400 - 23590 534 396 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 114 378 204 475 498 103 26.0 4e-22 MGKIKNTFQKMSLKKSLVMLAALCLGGVSILSIITILIFSDMQQKILDTRPIVVTDYTMK NYEETGSETTVVPQKYIYKELSKDNQIYYWIVTILMVILPVIYIIAASLMVAKFYYKLKL QIPLQNLKNGMYHISQQDLDFQIQYTTDDELGKLCDTFECMRNEIHKSNCKMWDMLQERK ALTASISHDLRTPITVVNGYLDYLEKSMERKMLTEELLQTTIKNMAGAMSRLERYVDCVK DIQKMEEIEIRKEQYNLKEIIADITREFSVLAAQYGRKLIIHDLSQDAFIETDKDMLSKI LENIFDNALRFSTDKIVLSIDEKKDYICFSIQDDGVGFTAEELNSATSFFYSSPTNGGNF GIGLSICKILCEKLGGNMCLGNNFDCGAIITIKIKK >gi|330404621|gb|ADLB01000013.1| GENE 26 23578 - 24258 537 226 aa, chain - ## HITS:1 COG:BH1944 KEGG:ns NR:ns ## COG: BH1944 COG0745 # Protein_GI_number: 15614507 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 1 224 1 225 229 165 41.0 5e-41 MKEKYKILIVDDEQSILDMLKLQLEFEGYAVFTAGNAKETLDKVSYGPDIILLDINMAGM NGLDLCASIRDFISCPIIFLTARVSEQDKINGLMAGGDDYITKPFSVNELLARISAHLRR EQRSHNKTKSKFLEELIIDYSDRSLYIKGNKIELSNKEFEIIQLLSANAGQVFDREKIYD CIWGIDGSGDSVVIKEHIRKIRLKFSEFTDKTYITTVWGVGYKWVK >gi|330404621|gb|ADLB01000013.1| GENE 27 24327 - 
24953 412 208 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_00599 NR:ns ## KEGG: BHWA1_00599 # Name: not_defined # Def: acetyltransferase, GT family # Organism: B.hyodysenteriae # Pathway: not_defined # 1 208 1 206 206 132 37.0 8e-30 MSEQIILREYQETDRQALEDIIRETWKYDRFCCAKVAKKMAKVYLNSCLIEQTFTRVAVI NHIPVGIIMGKNIQKHKCPLKLRIKWLKSIVSLFVSKEGRKISKIFGGVKGIDEELLADC DKEYKGEVVFFAISEKCRGKGLGRRLFQTVTDYMKSQHISEFYLFTDTSCNYPFYEHLGL IRRCEKKQVIDVNDEKEEMTFFIYEYKI >gi|330404621|gb|ADLB01000013.1| GENE 28 25057 - 25878 530 273 aa, chain + ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 1 108 1 108 124 85 41.0 1e-16 MLKEKNTLFTIGQFAALHEINKKTLMWYDEIGLLKPACIKENGYRYYSYQQSAELETILM LRELNVSLEEIRKFMRNRTVDNFECLLQEKITELNQTISHLKSIQKVLVNHQKDMETLHS LDLSAITLIEKQSHYLVSVDVTADLPFEKEIEQVISEAKKYQLRRLHDASYGAMIPVDNL YEKKFSEYTALYIEMPYPVSKKGLHLQPAGTYLRAFCQGNWDKLSNRYKEILAYANKQGL TLHGYAYETGINELVIDNMNDYITQIEIPVKVY >gi|330404621|gb|ADLB01000013.1| GENE 29 25872 - 26291 450 139 aa, chain - ## HITS:1 COG:no KEGG:Apre_1653 NR:ns ## KEGG: Apre_1653 # Name: not_defined # Def: hypothetical protein # Organism: A.prevotii # Pathway: not_defined # 10 139 7 134 157 62 36.0 5e-09 MSKKKVIGVIVIIGLLLIGCAVAFFMNRTTYTLNLPKADDLKSISLSNPNKEEEVTDIKE IEDMLYVLKENGRTTKKESINDSPVNTEEPIKVEFNFKKEGSSVVYIYEDDGKYYLEQPY NGIYQISGDEYNDIAKYFQ >gi|330404621|gb|ADLB01000013.1| GENE 30 26309 - 27031 496 240 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|169349649|ref|ZP_02866587.1| ## NR: gi|169349649|ref|ZP_02866587.1| hypothetical protein CLOSPI_00387 [Clostridium spiroforme DSM 1552] # 1 240 1 240 240 242 62.0 1e-62 MKYLKSVVQYECMTSFKYIWIFYAIQYAIVSLITLIIGISMGTFENIGTNALEINTLIYV SILGVLGFKEDFKMLIQNGFTRKYIFAATFSLFCFISGIMAFVDMATGNIIHHFNDRYIS LYGGIYGYGNLFMNWLWLFLVYVLICCLLYLVILVINKVGKTTSIYLGVILGGIVLLTVA LFRYVFSDETVNNILKFFMKTMGFMESGSVNYILPVLTLFILVGIFSVVSFAIIRHTELK >gi|330404621|gb|ADLB01000013.1| GENE 31 27033 - 
27893 834 286 aa, chain - ## HITS:1 COG:BH3493 KEGG:ns NR:ns ## COG: BH3493 COG1131 # Protein_GI_number: 15616055 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Bacillus halodurans # 19 282 20 283 283 152 31.0 9e-37 MNAIQIKNVTKRYDNLTAVDNVSFSFGFGKIYGFLGRNGAGKSTLINIIANRIFADSGEV LIDDIPAKDNMQVHEKIFCMSESDLYDTSLKIKDHFKWIERFYDSFDLNKAFEISEKFNL DTSKRFKALSKGYQSIFKLTVALSLNVPYVIFDEPVLGLDANHRELFYDLLLKDYEDNER TIIIATHLIEEVANIIEEVVLIDKGKVLLQEKVENLLETGYSISGVAGEVDDYCSDKKVI GYDELGNLKIAYILGEKTPLSPNSNLQISTMNLQKLFVKITEKGGK >gi|330404621|gb|ADLB01000013.1| GENE 32 27874 - 28257 297 127 aa, chain - ## HITS:1 COG:BH3492 KEGG:ns NR:ns ## COG: BH3492 COG1725 # Protein_GI_number: 15616054 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus halodurans # 1 123 1 123 129 102 40.0 2e-22 MNINANIEKPIFIQIAEQLEDSIFTGVFPEESKIPSTNEISALLNINPHTVLKGMNMLVD EEIIYKKRGLGMFVKEGAVKKIKSKRQGQFYEQYIATLIEEASKLQMSKEEIISLIERGY AHERNSN >gi|330404621|gb|ADLB01000013.1| GENE 33 28360 - 29019 627 219 aa, chain - ## HITS:1 COG:no KEGG:CD1901 NR:ns ## KEGG: CD1901 # Name: not_defined # Def: putative phage repressor # Organism: C.difficile # Pathway: not_defined # 1 95 1 96 169 91 53.0 2e-17 MIGENIKKYRRRKGMSQEELAVKLHVVRQTVSKWEQALSVPDAEVLMRMASLLEVSVSTL LGSEIEDENSEHLTEKLAQLNEQLAEIQREQIVQKRAGEKIGVGIVTSVLVALNVISFSE YDERLFAMLLIACIMVFTGIVSPKLPFTRHTDLRLPWTVRDQDTWNVAHKVMGVISLPIA CLYVGCSLTIADFEAVTLFAMLLWIGIPAVISYCFYMKK >gi|330404621|gb|ADLB01000013.1| GENE 34 29270 - 29476 325 68 aa, chain + ## HITS:1 COG:no KEGG:Cphy_1320 NR:ns ## KEGG: Cphy_1320 # Name: not_defined # Def: small acid-soluble spore protein alpha/beta type # Organism: C.phytofermentans # Pathway: not_defined # 5 68 3 66 66 84 75.0 1e-15 MTSRSSNRAAVPEAKTALDKFKYEVASELGVPLTDGYNGDLTSRQNGSVGGYMVKKMIEE QEKQMAGK >gi|330404621|gb|ADLB01000013.1| GENE 35 29549 - 30067 576 172 aa, chain - ## HITS:1 COG:no KEGG:FMG_0453 NR:ns ## KEGG: FMG_0453 # Name: not_defined # 
Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 1 172 1 157 157 89 33.0 7e-17 MKKHFVMVGIILILMFQMSACGSTNFSEEISEKTGVDVSGGKEVTVSDTHGGFHGDGTTY VVLEFSSDKLEEDIKNNEKWSRLPLDKGAETLAYGTRKETDDTIEIFGPYMTDDKGNGLM TKVENGYYFLFNKQNGKTGMTREEIANASSLNVILAIYDTDTQRLYFCEEDT >gi|330404621|gb|ADLB01000013.1| GENE 36 30067 - 30285 235 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 5 69 3 67 67 95 66 1e-18 MEGEIIFNIDVMLAKRKMSVTQLSEKVGITMANISILKNGKAKAIKISTLTKLCEALECQ PGDILEYREREV >gi|330404621|gb|ADLB01000013.1| GENE 37 30295 - 30774 471 159 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2826 NR:ns ## KEGG: EUBREC_2826 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 159 1 159 159 79 27.0 3e-14 MKRDTISKLLKIMILGMGIGGFAIFLLVFPTVSKELVSSYPEFSNYYLPWLIFIWLTAIP CYIVLGLMWKFATSVRQGTIFTFINGKRMKKIFFWTVSDTTFFLMGNIFLFLLNRSHPSI FILSMGIIFIGYSIAVISVAFAEFLNGAAALQEENALTI >gi|330404621|gb|ADLB01000013.1| GENE 38 30924 - 32984 1954 686 aa, chain - ## HITS:1 COG:BS_ylpB KEGG:ns NR:ns ## COG: BS_ylpB COG1200 # Protein_GI_number: 16078650 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Bacillus subtilis # 2 675 4 672 682 523 42.0 1e-148 MNELSNISELKGVGEKTEKLFYKLGIFTVGDLIRYFPRTYDVYEKPVEISEVEEGRTVTV TGTIWGVVQVTGTRAMQITTLTLKDISGTLKVIWFRMPFLKNTLRSGSVITLRGKVTKRK RVLTMEQPEIFYPSSKYQEKENTLQPVYPLTAGITNHMIAKMEQQALENLDLKKEILPAE IRLRYHLAEYNYAVRGIHFPKNKEEYYYARERLVFEEFLLFILSLRQLKEKKERMQNTFS FVMPKEIEEFLDQLPYELTNAQKKVWEEIKRDITGEKVMTRLVQGDVGSGKTIIALLGLM LVGLNGYQGAMMAPTEVLAKQHYHSIREMLDKYHIPLKVELLTGSMAAKEKRIAYEKIGT GEADLIVGTHALIQEKVEYKNLALVITDEQHRFGVKQREKLSEKGNTPHILVMSATPIPR TLAIILYGDLDISVIDELPANRLPIKNCVVDTSYRKTAYEFMKKQIGQGRQCYVICPMVE ESETLEAENVTDYAKALQEELGEQIHVQYLHGKMKQAQKDEIMERFARNDVQVLVSTTVI EVGINVPNSTIMMIENAERFGLAQLHQLRGRVGRGEHQSYCIFMTGTKSKETKKRLEILN KSNDGFKIAGEDLKLRGPGDLFGIRQSGILDFKLADVFQDAKTLQNANEAANQLLEDDPD 
LEREENENLRKYLQKYMRDILLETTL >gi|330404621|gb|ADLB01000013.1| GENE 39 33058 - 34716 2156 552 aa, chain - ## HITS:1 COG:CAC1735 KEGG:ns NR:ns ## COG: CAC1735 COG1461 # Protein_GI_number: 15895012 # Func_class: R General function prediction only # Function: Predicted kinase related to dihydroxyacetone kinase # Organism: Clostridium acetobutylicum # 1 552 1 547 547 464 46.0 1e-130 MATKTINAEVLAKMFLAGAGNIEAKKEFINELNVFPVPDGDTGTNMSLTIMAAAKEVTAI EQFDMASLAKAISSGSLRGARGNSGVILSQLFRGFTKGIKEHKEVDTVILAKACMKAKDT AYKAVMKPKEGTILTVARGIAEKAVELAETTDDLEIFIPQVIEHAEYVLSQTPDMLPVLK EAGVVDSGGQGLLEVLKGAYDAFLGKEIDYSKIAPSTGKATLKVSNDVPADIKFGYCTEF IIMTEKEFTEKDEREFKAYLESIGDSIVCVADEEIVKIHVHTNDPGLAIQKALTFGQLSR MKIDNMREEHQEKLIRDAEKLALEQAKKKEEPRKEMGFIAVSIGEGLNEIFRELGADYII EGGQTMNPSTEDMLAAIDAVNAEHIFILPNNKNIIMAANQAQSLTKDKDIIVIPTKTVPQ GITAIINYMPEADAKTNEETMLEEIKNVKTGQVTYAVRDTHIDDKEIHEGDIMGIGDTGI LSVGTSVEETTKDMLAQLVDEDSELISLYFGQEVSEEEAERLSAEIEELYPDADVDTHFG GQPIYYYVLAVE >gi|330404621|gb|ADLB01000013.1| GENE 40 34735 - 35094 563 119 aa, chain - ## HITS:1 COG:BS_yloU KEGG:ns NR:ns ## COG: BS_yloU COG1302 # Protein_GI_number: 16078646 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 119 1 119 120 98 47.0 3e-21 MKGSMSTDFGIITIDPEVIAKYAGSVAVECFGIVGMAAVNMKDGLVRLLKKESLTHGIQV YISEDNKITLDFHVIVSYGVSISAVTENLISNVKYKVEEFTGMSVDKINIYIEGVRVID >gi|330404621|gb|ADLB01000013.1| GENE 41 35289 - 35474 239 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160881022|ref|YP_001559990.1| ribosomal protein L28 [Clostridium phytofermentans ISDg] # 1 61 1 65 65 96 69 5e-19 MAKCAICEKGAHFGNNVSHSHRKTPKMWKSNVKSVRVKTEGGSKKMYVCTSCLKSGRVER A >gi|330404621|gb|ADLB01000013.1| GENE 42 35526 - 35672 153 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRVIGFALFWVAVGMLISMFISSTFLKVTVIIICFLLGYILFCCDGK >gi|330404621|gb|ADLB01000013.1| GENE 43 35723 - 36124 457 133 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1666 NR:ns ## KEGG: EUBREC_1666 # Name: not_defined # Def: hypothetical protein 
# Organism: E.rectale # Pathway: not_defined # 1 121 1 120 131 149 69.0 4e-35 MAENNLKNTVESLFKGMDSVVSSKTVVGEAIHIGETIILPLVDVSFAVGAGAFEQDKKNK GAGGLGGKMTPSAVLVIQDGKTKLVNIRNQDTITKILDMVPDVVERFSNQNKEKKMTEDD VAEILDGKSEKES >gi|330404621|gb|ADLB01000013.1| GENE 44 36148 - 37023 614 291 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00953 NR:ns ## KEGG: EUBELI_00953 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 281 1 323 327 98 26.0 2e-19 MLHILCILLKIIGILILVLLGLIVLIAGIMLFVPIHYRIEMKGEGTLQSLEVKVRFSWLL HLLSGYVSYADKKADWQVRTAWKKWNVPVSVQKKEETKKTEAEPEKTAEVTEQPKQIESK EDRKKLPPREVVVKKEKKRKKVSLLDKLKKIFQKIKYTIQNLCGKIKLLSEKKDKVLEFL EDELHRTAWGKTKQEIIRLLKYLRPKKIKGRVRFGFSDPYLTGKALAILSMWYPFYANML SIAPDFEKAVLEGNISVKGYIRGVYFLIPVFNIIRSKEIRQTYKHIRAFQL >gi|330404621|gb|ADLB01000013.1| GENE 45 37032 - 37370 402 112 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210615640|ref|ZP_03290686.1| ## NR: gi|210615640|ref|ZP_03290686.1| hypothetical protein CLONEX_02904 [Clostridium nexile DSM 1787] # 1 112 1 112 112 94 45.0 2e-18 MEFYKNLYAGESLKEKWESVVEKLEMGKVQFSCYLIVLPNNPVNQLECYDSILLLQKRWI EKPALVVGVASSYTESLEVIKKITEDTYLRYGDADLRRYILETQQEFQESKV >gi|330404621|gb|ADLB01000013.1| GENE 46 37396 - 38994 1584 532 aa, chain - ## HITS:1 COG:CAC0506 KEGG:ns NR:ns ## COG: CAC0506 COG0768 # Protein_GI_number: 15893797 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Clostridium acetobutylicum # 77 511 10 459 482 268 35.0 2e-71 MREKRKDRNQKSRRLTDDDIMILMDDDLDARLERLPSSQDEERGQRRQPAKKTTPKKAVE KKPKKSRKKVTNKEFARITYLFVSLFLVLMGYIVYFNAVKAKTIINSPYNMRQDIFADRV IRGKILDKNGEVLAKTTVGEDRKETREYPYGDLYSHVVGYASKGKAGLESVENFNLLTSN AFVLEKIMKEFKGEKNIGDNVVTTLDTDLQQAAYDALGDNKGAVVVMEADTGKILSLVSK PTFDPNFVAENWAELNQDKDSAMLNRAMQGKYAPGSTFKIVTALEYIRENSDYDSYQYKC TSKIEHEGTVIHCNNNKVHGMEDLRSSFANSCNTSFSNIGLQLDILKYKKTAEDLLFNKG LPSPLMYSESKFVLSPKDASSEVMMTAMGQGKTQVSPYHMALITSSIANGGKLMKPYLVD 
KVVNYSGTTVTKNMPEEYGTLMTREEASQLTEYMEAVVEEGTAKTLSGQGYSVAGKTGTA EYSVDKEKVHSWFVGFTNVENPELVISVVVEGSDATGTKAVNVAKKVFNSYY >gi|330404621|gb|ADLB01000013.1| GENE 47 38843 - 40303 1401 486 aa, chain - ## HITS:1 COG:MT0020 KEGG:ns NR:ns ## COG: MT0020 COG0772 # Protein_GI_number: 15839391 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Mycobacterium tuberculosis CDC1551 # 126 423 122 437 469 171 34.0 3e-42 MVNIIVELSKYLMIICIAIYTYLCFSVFGYEDEYKKRKLLRRQNRLMFAIHFMAYLVLYL KMDEVKVIAFYLIQIALLSGIILLYTKIYPKISRLVVNNMCMLLTIGFIILTRLSFQKAI KQCIIVAGSVLLSLVIPVAIRKMKKLADWTYLYAGIGLCALLLVLLLASTSGGAKLGFTI AGIGIQPSEFVKISFVFFVAGRLQKSTEFKDVVVTTIIAAIHVIILVLSTDLGAALILFV VYLIMLYVATRQPLYLMAGAAGGGVAAVAGYFLFSHVRTRVAVWRGPVSPQTPGGHQVAQ SLFAIGTGSWFGMGLMQGAADKIPVATEDFVFAAIAEELGLIFALCMMLICVSCYVMFLN IAMQLRNTFYKLVALGLGTCYIFQIFLTIGGGTKFIPLTGVTLPLVSYGGSSVLSTIMMF AIIQGLYVLREDEEEIIERKKERQKSKKQAFNGRRYNDSDGRRSGRPSRAASEQPRRRAG AKKTTR >gi|330404621|gb|ADLB01000013.1| GENE 48 40312 - 42678 1476 788 aa, chain - ## HITS:1 COG:MA0538 KEGG:ns NR:ns ## COG: MA0538 COG0826 # Protein_GI_number: 20089427 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Methanosarcina acetivorans str.C2A # 6 788 7 855 855 421 35.0 1e-117 MKREIEILAPAGSYESLKAAIAAGADAVYVGGNRFGARAYANNFSEEELLEAIDYVHLHG KKIYLTVNTLLKEKELEKELYSYLLPYYRQGLDAVIVQDPGVLRFVREYFPSLPIHASTQ MTITGADGARFMEEQGVERIVTARELSLQEIKEISEKTSLEIESFVHGALCYCYSGQCLY SSLLGGRSGNRGQCAQPCRLPYKVDGKTSYVMSLKDLCTVEFIPDLAEAGIYSFKIEGRM KKPEYVAAVTSIYRKYADLYLQKGRKDYSVSREDKQILMDLYNRGGFHTGYYQMRNGKEM LSLHRPNHAGIKAVKVVSQQGKEIKAEAMTELHKGDILELPDKENYTLGQDVRIGQQFSI GLRKGKKIGKGEVLNRTRNEKLLQELRKNFVEQKLQEKINGNLILSTEKDATLTLRFGDT EVTVVGEKAEQALNQPMSEERIDKQIRKTGNTPFVFDTLTISLENSLFMPMQKLNELRRS GLEMLEKEICAKYRRRDEKQEEEKQEKQDIKKNKQHDIQSLFVYVEKKEQFLKAIGNESV KRIYIDCTIIEKAWENKELSDLTKQARDRGKEIYFAMPYIFRGDTEKKYAETFDATVFDG 
ILIRNYESFYFLKERFPQAHIVLDCNMYQFNQEAKRFWEENGVHEFTAPLELNYRELQEL GCDKSELVVYGYLQMMVSAQCIHKTTGKCTHRSGYTKMTDRYNKQFTSKNCCDYCYNIIY NAEPLCLLEQKEDILQLAPKELRLHFTIENGKETEEVIKQFEEVFIHNKEIGTYEKAFTR GHFKRGVK >gi|330404621|gb|ADLB01000013.1| GENE 49 42718 - 43128 516 136 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00944 NR:ns ## KEGG: EUBELI_00944 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 3 136 2 135 135 114 57.0 9e-25 MASSKNYTEVLIGGKVFTLSGFESEEYLQKVSTYLNRKMDECSSVEGYKKQSAETRSILL ALNIADDYFKTKSMATTSEADIELKDKEMYDLKHELISAQLSCENAEKEIDRLKEENTEL QKQIVKLETELKNREK >gi|330404621|gb|ADLB01000013.1| GENE 50 43201 - 44202 1243 333 aa, chain - ## HITS:1 COG:BS_ruvBm KEGG:ns NR:ns ## COG: BS_ruvBm COG2255 # Protein_GI_number: 16081161 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Bacillus subtilis # 1 326 3 328 336 439 65.0 1e-123 MGKRIITTESLEEDLKIENHLRPQLLTDYIGQEKAKEMLRIYIQAAKERNEALDHVLFYG PPGLGKTTLAGIIANEMGVNLKITSGPAIEKPGEMAAILNNLQEGDVLFVDEIHRLNRQV EEILYPAMEDYAIDIMIGKGANARSIRLDLPQFTLVGATTRAGMLTAPLRDRFGVVQRLE FYTEKELQTIIIRSAGVMGVEVDENGAMEMARRSRGTPRLANRLLKRVRDFAQIKYDGII TEEAANYALDLLEVDKYGLDHIDRNILLTMIQKFNGGPVGLDTLAASIGEDSGTIEDVYE PYLIKNGFLHRTPRGRVVTSFAYHHLGISEENH >gi|330404621|gb|ADLB01000013.1| GENE 51 44219 - 44824 699 201 aa, chain - ## HITS:1 COG:CAC2285 KEGG:ns NR:ns ## COG: CAC2285 COG0632 # Protein_GI_number: 15895553 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Clostridium acetobutylicum # 1 201 1 200 201 133 34.0 2e-31 MISYIRGELVAVEEEKAVIDVGGVGYGIFMPAQSMGKLPPLHEEVRLHTYLHVKEDAMQL YGFLTRDDLKVFKLVIGVSGIGPKGALNILSNLSADDLRFAVLSNDVKAISAAQGIGKKT AEKLIIELKDKLSMDDVLEHMVQEEEVAVTGQNSGVQAEAVQALVALGYGNTESLRAVRR VEVNADSDVETVLKQALRFMM >gi|330404621|gb|ADLB01000013.1| GENE 52 44844 - 45335 654 163 aa, chain - ## HITS:1 COG:lin2789 KEGG:ns NR:ns ## COG: lin2789 COG4769 # Protein_GI_number: 
16801850 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 6 161 8 166 180 89 32.0 3e-18 MKNKGVYFGMFTALALIFSYVESLIPFHIGIPGVKLGLANLVIVVAMYKMNKKQVYLLSV TRVVLAGFMFGNLFSIVYSLAGSLLSLAVMYGLKRKESFSIMGISMAGGVFHNIGQLIVA MIVLESLNLVYYASVLLISGLITGIVIGIVSDEIMKRIKKIHF >gi|330404621|gb|ADLB01000013.1| GENE 53 45346 - 45702 512 118 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1653 NR:ns ## KEGG: EUBREC_1653 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 26 117 35 131 132 86 47.0 3e-16 MKKNDYKLMIIIIGIAVLLFLAAALKDNGKENVIEIKIDGVVKGRYALSEDRELNLNGTN RLVIKDGKADITEADCPDKVCVKQKPISKAGESLVCLPNKIIITVVEGEENELDGVAN >gi|330404621|gb|ADLB01000013.1| GENE 54 45828 - 46820 759 330 aa, chain + ## HITS:1 COG:CAC2761 KEGG:ns NR:ns ## COG: CAC2761 COG1477 # Protein_GI_number: 15896017 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Clostridium acetobutylicum # 38 324 21 310 327 219 39.0 6e-57 MKRRYISIVLTFCLLLLSGCTTPVSKDSISKSGIFFDTLISIKLWGTNDSSILEHCFKLC SEYENKFSRTIPESEISKINQSNGVPVEVSDETVELIKKALYYSELSNGAFDITIAPLSS LWDFKNNDGNVPAPTLIQEAKSHVNYKNVVVDGNTVQLLDPKAALDLGGIAKGYIADRLN EYLEKEGVKHAMINLGGNVLTVGNKPDGKPFHIGIQKPFGEQNETIASIPVDDRSVVSSG VYERYFKKNGQIYHHLLDSSTGYPKESNLLSVTIISDSSADGDALSTACFILGLEKGLKL INQLDNVDAVFITDDYQLHCTEKINDTILK >gi|330404621|gb|ADLB01000013.1| GENE 55 46845 - 47189 499 114 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210615649|ref|ZP_03290695.1| ## NR: gi|210615649|ref|ZP_03290695.1| hypothetical protein CLONEX_02913 [Clostridium nexile DSM 1787] # 5 114 30 149 149 92 50.0 1e-17 MEENYNNEQENEQQYQYNYNTDYNYNQPQNYNQQNDNGVMSVGEWLLTILATIIPCIGPI IYLVWAFGKGGNENRRNYCKAWLIYWVIQTILAIILVIVLFALIIPASSQYYYY >gi|330404621|gb|ADLB01000013.1| GENE 56 47233 - 47670 300 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210615650|ref|ZP_03290696.1| ## NR: gi|210615650|ref|ZP_03290696.1| hypothetical protein CLONEX_02914 
[Clostridium nexile DSM 1787] # 3 136 10 143 151 128 47.0 9e-29 MWKQAGKLFWQDIKKLKTALILICIYLAFMKFVLHSGCPFVVVTGFPCAACGLTRAGVHF LKGEWIQAWNVHPAIFPIVLLVVVFIIQRYFRQASLKKLKKYAILLIVGMLVLYVYRFAT QFPGNPPMSYYRDNLMMRIVKWIKM >gi|330404621|gb|ADLB01000013.1| GENE 57 47680 - 48471 933 263 aa, chain - ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 6 262 5 261 264 182 43.0 8e-46 MSITGIVIAAVIVGGTGLFIGAFLGLAGKKFAVETDEREEAIIEVLPGNNCGGCGYAGCS GLASAIVKGEAEISGCPVGGAAVAAEIGKIMGVEAKEQTRMTAFVKCAGTCDKAGKDYVY TGIEDCTMMKSMQNGGPKSCNYGCHGFGTCVKACPFDAIHIENGVAVVDKEACKACGKCI AVCPQNLIELVPYEQKHLVQCNSKDKGKDVMSACKAGCIGCKMCQKVCEYDAITVEDNIA HIDPEKCTNCGACAEKCPKKIIM >gi|330404621|gb|ADLB01000013.1| GENE 58 48484 - 49059 776 191 aa, chain - ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 19 190 21 192 194 183 59.0 1e-46 MKELLIIAVGSALVNNVILSQFLGICPFLGVSKKVETAAGMGGAVVFVITISSFCTGLIY KFILSPLHFEYLQTIVFILVIAALVQFVEMFLKKMMPPLYKALGVYLPLITTNCAVLGVA LTNVQKEYGILESVVNGFATALGFTIAIVIMAGIREKTEHNDVSPSFQGTPIVLVTASLM AIAFFGFSGLI >gi|330404621|gb|ADLB01000013.1| GENE 59 49071 - 49769 907 232 aa, chain - ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 5 202 4 198 200 206 57.0 2e-53 MNKCMERLYNGLIKENPTFVLMLGMCPTLAVTTSAINGVGMGLTTTVVLVMSNMLISMLR KVIPDSVRMPAFIVVVASFVTIVQFLLEGFVPVLYDALGLYIPLIVVNCIILGRAESYAS KNPVLPSIFDGIGMGLGFTVGLTSIGIVREVIGAGEIFGFRVMPASYEPITIFILAPGAF LVLAGLVALQNKIKANAKKKGKEVVEASCGEGCASCGNHGCKGKVFPVESDK >gi|330404621|gb|ADLB01000013.1| 
GENE 60 49762 - 50370 893 202 aa, chain - ## HITS:1 COG:FN1594 KEGG:ns NR:ns ## COG: FN1594 COG4659 # Protein_GI_number: 19704915 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Fusobacterium nucleatum # 2 194 3 173 177 72 30.0 4e-13 MNKIVKNALILTAITLVSGLLLGGVYEITKEPIQASKEKAKQEAYKSVMSDAEEFEPDNS FDEKKAEKILEKNKINGCYLSEVAVGKDKSGKEIGYVITSTSKEGYGGEIQISVGVSMDG TVTGIEILSINETAGLGMQATEPEFKKQFENVKTDKFEVKKDNPKGNVDALSGATITTRA VTNAVNAGLSYFQDALGGSVNE >gi|330404621|gb|ADLB01000013.1| GENE 61 50370 - 51302 1110 310 aa, chain - ## HITS:1 COG:TM0245 KEGG:ns NR:ns ## COG: TM0245 COG4658 # Protein_GI_number: 15643017 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Thermotoga maritima # 10 306 8 315 318 206 44.0 6e-53 MSEQFNVSSSPHIRSKTTTANIMLCVVLALLPATAFGVYNFGVQALGVVLITVGSAVLTE YIYEKLMHKKVTICDFSAAVTGLLLALNLPPTAPWWLCVLGSVFAILIVKQLFGGLGQNF MNPALAARCFLLISFTGRMTTFVYNDAVTSPTPLAALKAGEDVNLKDMFIGNIAGTIGET SVIAILIGAIFLLVMGIIQIHIPLTYIVTFVIFLILFSGHGFDVNYIAAHLCGGGLMLGA WFMATDYVTAPITKRGQILYGVCLGVLTGTIRLFSGSAEGVSYAIIISNLLVPLIEKVTI PKPFGKGGEK >gi|330404621|gb|ADLB01000013.1| GENE 62 51321 - 52655 1453 444 aa, chain - ## HITS:1 COG:FN1596 KEGG:ns NR:ns ## COG: FN1596 COG4656 # Protein_GI_number: 19704917 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Fusobacterium nucleatum # 1 443 1 441 441 342 42.0 9e-94 MKGRDKMKLLTFKGGIHPDDGKQLAKDKKIVSILPKEELVYPLSQHIGAPAKPVVKAGDR VLKGQKIAEAGGFVSASIFASVSGEVKGIEKRFNPSGAKVDSIIVENDFQYEEVSYLPVK PLEEMTKEEIIDRIKEAGIVGMGGAGFPTHVKLSPKEPEKIDYVIANCAECEPYITADYR RMLENPEELISGMRVVLSIFPQAKGIFGVEDNKPDCIEKLRKLTEKESRMEVKALKTKYP QGAERQLIFATTGRAINSSMLPADAGCVVDNVETMIAIHTAVVEGKAVTERIVTLSGDAV KEPGNFKVLFGTNHQEVIEAGGGFKNPPEKIISGGPMMGFAMFTTDTPITKTSSSILGFT KDEVKANEPTACINCGRCVEVCPSRIIPSRLADLAERHDEEGFKKLEGLECIECGSCSYV CPAKRQLKQSIGTMRKIALANRKK 
>gi|330404621|gb|ADLB01000013.1| GENE 63 52830 - 53921 1180 363 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872165|ref|ZP_03990534.1| possible ribosomal protein S1 [Oribacterium sinus F0268] # 1 362 1 362 367 459 62 1e-128 MSELTFEQMLEESLKTIRNGEVVDGTVIDVKPDEIILNIGYKADGIITRSEYTNEPNVDL TTLVSVGDTMTVKVLKVNDGEGQVLLTYKRLAAEKGNERLKEAFENKEVLKATVNQILGG GLSVVVDEARVFIPASLVSDTYEKDLSKYQGQEIEFVISEFNPRRNRVIGDRRQLLVAEK AKLQEELFANLKVGDVVEGTVKNVTDFGAFIDLGGVDGLLHISEMSWGRVENPKKVFQVG ETIKVLVKDINDTKVALSLKFPETNPWANAEEKYAVGTEVTGKVARMTDFGAFVELAPGV DALLHVSQISKVHVEKPADVLNVGQEITAKIVDLNVADKKISLSMKALETEETAEETEES VEE >gi|330404621|gb|ADLB01000013.1| GENE 64 53902 - 54756 572 284 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168180380|ref|ZP_02615044.1| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase / ribosomal protein S1 homolog [Clostridium botulinum NCTC 2916] # 1 283 1 278 638 224 40 1e-135 MEIKVAKTAGFCFGVKRAVETVYDQVEKEKGKQIYTYGPIIHNDEVVKDMQKRGVKVIQT EEELENLENGVVIIRSHGVPKRIYDKLAEKNITCVDATCPFVKKIHNIVKRESEAGSQIV IIGNNEHPEVEGIKGWSTSPVTVIQTMEDVNNFEPDRSGKVCVVSQTTFNYKKFEELVEI ISKKRYDISVLNTICNATTERQTEARSIAEGVDAMIVIGDKHSSNTQKLFEICDKACNNT YYIQTLDDLDLNQLGSAKTVGITAGASTPNNIIEEVQNNVRINF >gi|330404621|gb|ADLB01000013.1| GENE 65 54767 - 55423 701 218 aa, chain - ## HITS:1 COG:SP1603 KEGG:ns NR:ns ## COG: SP1603 COG0283 # Protein_GI_number: 15901443 # Func_class: F Nucleotide transport and metabolism # Function: Cytidylate kinase # Organism: Streptococcus pneumoniae TIGR4 # 5 213 6 215 223 177 48.0 1e-44 MGYNIAIDGPAGAGKSTIAKKVAKKLGYIYVDTGALYRGMAVYFLENGVSADETEQIGKM CRQAVVTLGYEDGVQQVYLNGENITAKLREEEVGKMASVSSAIKEVRLQLLELQRDLAAK ENVVMDGRDIGTNVLPQAETKIYLTASVETRAKRRFLELQEKGVPCVLEEIAKDIEDRDY RDMNRDVAPLKQAEDAVYIDSSEMSIDEVVAEILKYCK >gi|330404621|gb|ADLB01000013.1| GENE 66 55439 - 56668 1085 409 aa, chain - ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum 
# 14 408 1 391 393 421 54.0 1e-117 MSKVVVIGGGAAGMFASVFAARNGNEVHLFEKNEKTGKKLFITGKGRCNITNAGDMETLF RSVVSNPKFLYSSFYGYTNEDVISFFEEIGVKTKIERGNRVFPVSDHSSDVINGLEREMK KLGVKVHLNTKVKSVEGNEEGFREIVLSNGEKVTADACIVATGGCSYQPTGSTGDGYRMA ESVGHTVTEVYPALVPLEIKEDFVGELQGLSLRNVEATIYDGKKEIYSDFGEMLFTHYGV SGPLMLSASSCITKKLQDREMKLVIDLKPALSEEQLNQRVLRDFEENKNKQFRNAITKLF PAKLIPVMVMLSGIDAEKKVNEVSKEERMEFVRLIKHFTLTINGTRSFREAIITQGGVKT KEVNPATMESKLVPNLYFVGEVLDLDALTGGFNLQIAWSTAYMAGSSIW >gi|330404621|gb|ADLB01000013.1| GENE 67 56669 - 57598 789 309 aa, chain - ## HITS:1 COG:CAC1850 KEGG:ns NR:ns ## COG: CAC1850 COG1737 # Protein_GI_number: 15895125 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 8 288 7 286 293 291 54.0 1e-78 MNSNNTNELLSKINEKYARLSKGQKLLAEYIAENYDKAVFLTAAKLGKTVGVSESTVVRF ATQLGYDGYPGFQKALEELVRNKLNSIQRMEVAYGRIAQSEILETVLQSDIEKIKLTLAG IEQGAFDLAVDTMLNAKRIYIVGIRSCAPLASFLGFYLNLIFDNVTLVNTNSASEIFEQL IRINEEDVIIGISFPRYSMRTLKALEFASNRKAKVITLTDSVHSPMNLYSSCNLIARSDM ASIVDSLVAPLSVVNALVVALCMKKQTEVIDTLGTLEKIWDEYQVYSSDELNPVNDKMSL KKVERDGEK >gi|330404621|gb|ADLB01000013.1| GENE 68 57732 - 59141 1742 469 aa, chain - ## HITS:1 COG:FN1376 KEGG:ns NR:ns ## COG: FN1376 COG5016 # Protein_GI_number: 19704711 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Fusobacterium nucleatum # 8 452 4 448 448 533 58.0 1e-151 MEIEKKPIKITETILRDAHQSLIATRMTTEQMLPIVDKMDKVGYHSVECWGGATFDASLR FLKEDPWERLRKFRDGFKNTKLQMLFRGQNILGYRPYADDVVEYFVQKSVANGIDIIRIF DCLNDIRNLKTAVKAAVKEKADAQVALSYTLGDAYTLEYWMDIAKKIEDMGATSICIKDM AGLLVPYKATELITAMKEATSLPIQLHTHYTSGVASMTYLKAVEAGVDVIDTAMSPFAMG TSQPATEVMVETFKGTPYDTGLDQNLLAEIADYFRPIRDEALESGLLNPKNLGVNIKTLL YQVPGGMLSNLTSQLKEQGAEDKFYEVLEEVPRVRKDLGEPPLVTPSSQIVGTQAVFNVL MGERYKMATKETKDVLCGKYGATVKPFNKEVQKKCIGDAEVITCRPADLIPDELDTLKDE MKQWSQQDEDVLSYALFPQVATDFFKYREAQQTKVDQKVADTENGSYPV >gi|330404621|gb|ADLB01000013.1| GENE 69 59159 - 60310 1584 383 aa, chain - ## HITS:1 COG:PAB1772 KEGG:ns NR:ns 
## COG: PAB1772 COG1883 # Protein_GI_number: 14521092 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Pyrococcus abyssi # 4 381 3 400 400 366 56.0 1e-101 MEYITSTLGNLMDQTAFFNLTWGNYVMIAVACGFLYLAIRKGFEPLLLVPIAFGMLLVNI YPDIMLSIEESSNGVGGLLHYFYLLDEWAILPSLIFMGVGAMTDFGPLIANPKSFLLGAA AQFGIFAAYLGAMAMGFSDKAAAAISIIGGADGPTSIFLAGKLQQTAILGPIAVAAYSYM SLVPIIQPPIMKLLTTEEERKIKMEQLRPVSQLEKILFPIIITIVVSLILPTTAPLVGML MLGNLFKECGVVRQLTETASNALMYIVVILLGTSVGATTSAEAFLNMDTLKIVALGLVAF AFGTAAGVLFGKLMCKVTGGKVNPLIGSAGVSAVPMAARVSQKVGAEADPTNFLLMHAMG PNVAGVIGTAVAAGTFMAIFGVK >gi|330404621|gb|ADLB01000013.1| GENE 70 60374 - 60733 700 119 aa, chain - ## HITS:1 COG:SPy1176 KEGG:ns NR:ns ## COG: SPy1176 COG0511 # Protein_GI_number: 15675148 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Streptococcus pyogenes M1 GAS # 1 118 1 116 116 67 40.0 5e-12 MKNYTITVNGNVYDVTVEEKVGGAAPVQRAAVPAAPVAAPVQKQASAAGNIEVKAGAAGK VFKIEASVGQKVSRGDTVIVIEAMKMEIPVVAPEDGTVASIDVAVGDAVEAGAVMATLN >gi|330404621|gb|ADLB01000013.1| GENE 71 60758 - 61522 1029 254 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00921 NR:ns ## KEGG: EUBELI_00921 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 82 247 83 257 263 107 45.0 6e-22 MKKRISLLLCVFVLTFSFIGCGSEKKTVEYDKEMLEQSAEIIIQSFSEMDDETFAQYEEG SELQVNMMLLQSGLPIEKEEFVSMIHSWQASIDECGDYKKHGEYKVEAKTNEVILSTEAT YEERKATIEFKFDDELKMESMDVSAKYSTAEILEKAGMNTVLGMGTVFVVLIFLAFIISL LKYIPIIMGKFGKKEEKKETVVTKAVAVEQNETPAMDDLELVAVITAAIAAETGTSSDGF VVRSIKRRTSNNWN >gi|330404621|gb|ADLB01000013.1| GENE 72 61537 - 62973 1474 478 aa, chain - ## HITS:1 COG:TM0716 KEGG:ns NR:ns ## COG: TM0716 COG4799 # Protein_GI_number: 15643479 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Thermotoga maritima # 9 478 37 513 515 357 41.0 4e-98 
MSNTTANPASRRIAALLDSGSFVEIGGAVTARTTDFNMQTKDTHADGVITGYGVIDGNLV YVYSQDATVLNGAIGEMHAKKIAAIYDMAMKMGAPVIALIDCAGLRLQEATDALHAFGSL YLKQTMASGVIPQITAIFGTCGGGLAMVPALTDFTFMEEKQGKLFVNSPNALAGNSVEKC DTASATFQSEETGLVDGIGTEEEILSQIRSLVTILPANNEDDQSYDECLDDLNRACEGIE NASGDARMTLAMISDEYVFFETKKEYAKEMVTGFIRLNGMTVGAVANCSKHYDEEGNLQE EYATDLTVNGCRKAKEFVDFCDAFNIPVFTLTNVTGFEATKCSEKNIAREVAKLTYAFAD ATVPKVNVIVGKAYGSAYVAMNSKSIGADMVYAWPTAEIGMMDAELAAKIMYADADADTI QEEAKKYKELQSSPLSAARRGYVDTIINPADTRKYVIGAFEMLFTKREERPSKKHGTV >gi|330404621|gb|ADLB01000013.1| GENE 73 63474 - 64844 1352 456 aa, chain + ## HITS:1 COG:CAC0751 KEGG:ns NR:ns ## COG: CAC0751 COG3104 # Protein_GI_number: 15894038 # Func_class: E Amino acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Clostridium acetobutylicum # 7 450 21 519 521 213 30.0 6e-55 MGATAKKGRYPIGYWICCITFTFERFAFYGTKPLLALFLATSVAEGGIGLSEVDAAPIAA ALTSFTYITPIVGGWICDRFLGARYAVTLGCILMGIGYLLGWQSHSVGMVWAMIIVVAIG TAFFKGNLAAIIGRLFDDEKLLDTAFSIQYSFVNIGAFFGSMICAALYMSTFKKGDVLGY RQVFFLCAILIFIGGAIFTACYGLLQGQGKLPFKYLTDTQGNVIGQESHEKKEKTTAPLT SLQKKNVIAILLVTLFSVVFWLAYYQQDIFLTFYIRDNVMRTVGGFELSPAHLTTTWNGL LCIFMSLAAAKLWEKLSNRPKGDLSMFQKVTLSFVFLGLSYGALALMDIVRGAGKAHFLW ICLFGVLITCGEICFSPLGNSFVSKFAPKKYLSLLMGIWTMASFIASTINGEIMKLVEKM GDFNIMLSFMIISFVCAIVMFFLIKPLNGLTSEGEE >gi|330404621|gb|ADLB01000013.1| GENE 74 64989 - 66776 1783 595 aa, chain + ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 595 1 584 584 506 42.0 1e-143 MKVTERIANLRSLMTEKGIDAYVVPTADFHQSEYVGEHFKSRKFITGFSGSYGTAVIMQE DAGLWTDGRYFFQATNELEGSGIRLMKMFVGDTPSVTEFLASNVKEGGKVGFDGRVLSMG EGQEYEEALLPKNISIDYSEDLIDEVWTDRPPLSDKPAFFLPEKYSGESTSSKLERVRQV MRDHGATVHAIASLDDVCWLLNVRGDDIDFFPLLLSYAVVKMDCVDLYVDENKLNDEILA ELAKNNVHIHPYNDIYEDIKTLSADETIMIDPMKMNYALYKNIPCKIVEHANPTILFKAM KNPVELENIRQAHIKDGVAITKFMHWVKTRYDKETITELSSADKLTGFRAEQEGYIRDSF 
EPLCAFKDHAAMMHYSPSPESDVKLESGAFFLNDTGGGYFEGSTDITRTFVLGSVDDEMK KYFTAVVRAMMNLSRAKFLYGCYGYNLDILARGPIWDLGLDFQCGTGHGVGYLGNIHEAP TGFRWYVVPSKNEHHQLEEGMVITDEPGIYEDGKFGIRIENEFIVKKAEQNKYGQFMEFE TITFAPIDLDGIDTQYMTKFEIDWLNNYHAQVYEKIAPHLTDEEREWLKEYTRAI >gi|330404621|gb|ADLB01000013.1| GENE 75 66832 - 68034 909 400 aa, chain - ## HITS:1 COG:CAC2057 KEGG:ns NR:ns ## COG: CAC2057 COG1686 # Protein_GI_number: 15895327 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 7 289 5 270 351 165 38.0 1e-40 MRVILKKRIGIVLIVTIFFSITAKAVEKPKELYAQSAVLMDADSGRILFEKNGDQRRAMA STTKIMTCILALERGGLEGEVVASENAAKQPKVHLGVEKGETFRLRDLLYSLMLESHNDS AVMIAEKVGGSVEKFAKMMNEKAKEIGCINTYFITPNGLDAKDDTGKHSTTAGELAKIMS YCITQSPKKEEFLAITGQKSYQFTNMEKTRTYICHNHNAFLEMMEGAISGKTGFTGEAGY CYVGALRRNDRTFVVALLGCGWPNNKTYKWSDTKKLMQYGLDSFQYKDIEPAIYFETIPV ENGIPQDEDFTEKAQLRFCVKGKKKISERILMSDEESVRVETKIKKEWKAPIEKGERAGT VTYYLNDIPVREYEVVTENGVEEMTWNWCLRQMIEKYMTL >gi|330404621|gb|ADLB01000013.1| GENE 76 68092 - 68661 652 189 aa, chain - ## HITS:1 COG:CAC2060 KEGG:ns NR:ns ## COG: CAC2060 COG1386 # Protein_GI_number: 15895330 # Func_class: K Transcription # Function: Predicted transcriptional regulator containing the HTH domain # Organism: Clostridium acetobutylicum # 7 184 21 198 202 135 39.0 6e-32 MEIKETEAIVEAILFTMGESVELDKIANTIGHDKETTRKIIQNMMMRYEEEDRGIRIIEL ENAFQLCTKQEMYEYLIRIAKQPKRHVLTDVLLETLSIIAYKQPVTRLEIEKIRGVKSDH AVNKLIEYNLVTEVGRMDAPGKPLLFGTTEEFLRRFSVQSVDELPVFDTEKLEDFKAEAE DEVQLKLDI >gi|330404621|gb|ADLB01000013.1| GENE 77 68676 - 69425 612 249 aa, chain - ## HITS:1 COG:CAC2061 KEGG:ns NR:ns ## COG: CAC2061 COG1354 # Protein_GI_number: 15895331 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 243 1 242 249 143 37.0 3e-34 MGIPVKLEAFEGPLDLLLHLIEKNKIDIYDIPIVEITNQYMDYIRDMETKDLNVMSEFLL MASTLLDIKCKMLLPKEVNEEGEEEDPRQELVEQLLQYKMYKYMSYELRDRQIEGERLMF 
RAPSIPKEVSEYTEPVDLDVLLDGMTLAKLKAVYNEVLKRQEGKIDPIRSKFGRIEKEEV ELPEKISFLHQYAKAHKTFSFRQLLERQKSKIHIVVTFLAVLEMMKTGEIRVVQENTFSE IMITSTVCR >gi|330404621|gb|ADLB01000013.1| GENE 78 69437 - 70306 642 289 aa, chain - ## HITS:1 COG:CAC2775 KEGG:ns NR:ns ## COG: CAC2775 COG1408 # Protein_GI_number: 15896030 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 14 288 13 287 287 148 31.0 1e-35 MKNLLIVLGIIFLLILIEIIREITTFKVTHYEIVSGKLKKNMGEKKVAFISDLHNYSYGK DNGKLLKAIEREEPDIILSAGDLLVGDKNKSSETAEKFVASLASEFPVYCANGNHEQRMK EKPATYGEKYIQYKKELLKAGVHLLENDSVLLDWAGSKIRISGLEIPLRYYGKFTKEKLK IDEIESRLGKADEDIFEILLSHNPVHAETYAKWGADLTLSGHLHGGMVRIPFKRGVITPQ ACLFPKYSGGIYDVGKKKIVVSKGLGVHTIPIRLFDEAEVIILHVKGKG >gi|330404621|gb|ADLB01000013.1| GENE 79 70364 - 71143 538 259 aa, chain - ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 1 257 130 383 387 164 33.0 1e-40 MAEHICGDEQTFVKKMNERAEKLGMKNTHFVNCNGLDADGHLTTARDIALMSRELITKYP EVHKYTKIWQENITHETKKGTSKFGLTNTNKFIRQYPYATGLKTGSTGKAKFCLSATAEK DKVNLIAVVMASPNAKQRVKDAVTLMNYGFGKCRKYEDEAPIKIKEVQVKRGQEEAVDIQ AEKKFKYIDSSGSDLSNIKKKVKMKENISAPIKKGEKLGRVIYYLNKKEIGNVSVVAKED IKEADYPYAVRSVLKTFFL >gi|330404621|gb|ADLB01000013.1| GENE 80 71179 - 72738 854 519 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 8 305 11 301 301 170 34.0 7e-42 MEGKRAVLYLRLSKEDKDKLCKGDESASIINQRLLLKEYASRNGYEIIDVYADDDESGLY DDRLEFSRMITDAKKNKFDVIIAKTQSRFSRNIQHIEKYLHHEFINWGIRFIGVVDGVDT NDFANKKTRQISGLVNEWYCEDLSKNIRSAFAAKRKEGQFLGASCPYGYCKDAGNHNHLV VDTYASQIVKEIYRLYLQGHGKSGIGKILTERGVLIPSIYKQEILGLSYKNGRKPNTTRA 
WCYQTIHTILNNRTYTGCVVQHKSETLSYKDRKKCVVPKNEWVIVENKHEPIISEEMFEQ VQRIQQIRTREVKGEEKGVFAGKILCADCKKTMVRLYDRRGGKACIGYVCKTYKNQGRKF CEGHRINIDELEEAVLEDLQRQAEEILTDKDIKELRSWKKQLETVRKGQDSLKIWKKKEE KIKTFKEKTYQNYLEELLTKEEYIRYTKNYDRQLEEIEFQKEKIGKDKELTEKKQENKWL NRFLEYIHVEKMTREMVLELIDFIEVNRDGSIQIQYRFK >gi|330404621|gb|ADLB01000013.1| GENE 81 72728 - 72853 69 41 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKGEKKVYIAKVTYGEKTLLECMKRIIENHCNKESRDNDGR >gi|330404621|gb|ADLB01000013.1| GENE 82 72921 - 73862 828 313 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20121 NR:ns ## KEGG: EUBELI_20121 # Name: not_defined # Def: endonuclease # Organism: E.eligens # Pathway: not_defined # 10 313 14 316 316 274 46.0 4e-72 MNRIKALLLALLLSVNIVACGNTNTTTNQKKNEITLSGGTFDYQSVPEYSGKPFVEVNGN IPYFTEEEITDKSFETYGELDSLGRCTTSVASLSKDTMPKEQEKRGKIGNVKPTGWHSVK YDCVQGKFILNRAHCIGWQLGAENDNEKNLVSGTRYMNLEMLEYENLVADYIKETGNHVM YRVTPVFVDKEMLCRGLLMEGYSVEDKGEGVSYCVFFYNVQPGIEMNYQTGESKYTGVFL DTDSVAVEYDVKTSGNAEEDKEGSYILNTNTKKFHFPTCKGVKDIKEENKKTFSGKREEL LQESYSPCKICNP >gi|330404621|gb|ADLB01000013.1| GENE 83 73884 - 74135 258 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160914469|ref|ZP_02076684.1| ## NR: gi|160914469|ref|ZP_02076684.1| hypothetical protein EUBDOL_00473 [Eubacterium dolichum DSM 3991] # 1 77 136 212 218 70 45.0 3e-11 MLHEKNQDILKGLYKAALFVIQADYYQKKGVYVSKHKTLGTLVEDREKEIIEQYDRMKKK EKPDFQEVSERIFAWAKEMLVRV >gi|330404621|gb|ADLB01000013.1| GENE 84 74166 - 74345 122 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MARVLGKRGSIENFGVLESAEFLLEMDKVCVVNYNIYLLGAEIFCQTEQKENDRIKEEV >gi|330404621|gb|ADLB01000013.1| GENE 85 74497 - 74874 383 125 aa, chain - ## HITS:1 COG:BH1535 KEGG:ns NR:ns ## COG: BH1535 COG1686 # Protein_GI_number: 15614098 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Bacillus halodurans # 1 125 1 129 387 128 56.0 3e-30 MKVIVSVIISCLLLGECVYAAPQKEEAKIEAVSGIVMEASTGKMIYEKNADEKLPPASVT KVMTMLLIFDAIESGEIKLEDNVTVSEFAASMGGSQVFLEPGETQTVETMLKCIAVASAN DACVA 
>gi|330404621|gb|ADLB01000013.1| GENE 86 75001 - 75189 239 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDSKMDVKKFLIYYGIMSVIIIVFGKMFPEWVKDNTMLYYIILVGAAFPADILSRRGNKK DK >gi|330404621|gb|ADLB01000013.1| GENE 87 75202 - 75789 571 195 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 15 165 7 157 270 130 43.0 1e-30 MWNATLFGIESYKMILWFLTYSILGWIVESIYMSICNRKLTNRGFAKGPICPIYGVGALT VFVILKPYSHNMLLLFLFGSILATTLEYVTAIIMNKMFGEIWWDYTDKPFNYKGILCLES SIAWGFYTIILFTFLHGFVDGLVNRIPFTVGRTIGTIILALVAVDYTIMLYKQKKESLPK RVIRWKESILLKIRR >gi|330404621|gb|ADLB01000013.1| GENE 88 75761 - 75829 62 22 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLPFYTSTNEDKMYVECNVIWN >gi|330404621|gb|ADLB01000013.1| GENE 89 75859 - 76035 288 58 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227872333|ref|ZP_03990687.1| ribosomal protein S21 [Oribacterium sinus F0268] # 1 58 1 58 58 115 98 1e-24 MSNVIVKENETLDSALRRFKRNCAKAGIQQEIRKREHYEKPSVRRKKKSEAARKRKFN >gi|330404621|gb|ADLB01000013.1| GENE 90 76179 - 78818 3247 879 aa, chain - ## HITS:1 COG:CAC1678 KEGG:ns NR:ns ## COG: CAC1678 COG0013 # Protein_GI_number: 15894955 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 875 1 877 881 913 52.0 0 MQPYGVNELRKMFLDFFESKGHLAMKSFSLVPHNDNSLLLINSGMAPLKPYFTGQEIPPR TRVTTCQKCIRTGDIENVGKTARHGTFFEMLGNFSFGDYFKREAIKWSWEFLTEVVGLDA DRLYPSVYEEDEEAYEIWNKEMGIAPERIFKFGKADNFWEHGSGPCGPCSEIYYDRGEKY GCGRPDCTVGCECDRYMEIWNNVFTQFDNDGKGHYEELEQKNIDTGMGLERLASIVQDVD SIFDVDTLKALREHICTLAGTEYGKEYSKDVSIRVITDHIRSVTFMISDGIMPSNNGRGY VLRRLLRRACRHGRILGIEGQFLTKLCETVINGSKDGYPELEEKKDAIFNVIRQEEEQFN KTIDQGLGILSEMIQEMTEKKETTLSGENAFKLYDTYGFPLDLTKEILEEKGFDIDEEGF KTSMEEQRVKARKARGVSNYMGADATVYDEIDPSVTTEFVGYDHLTYDSDITVLTTENEI VNTLSEGEKGTIFVNRTPFYATMGGQEGDKGLITTEHAEFAVEETVKLLGGKVGHVGKVT 
KGSFTVGDAVNLSVEENGRANTCKNHSATHLLQKALKTVLGSHVEQKGSLVTPDRLRFDF VHFSAMTPEEIAKTEELVNAEIAKNNAVITEVMNIEQAKATGAMALFGEKYEEDVRVVSM GDFSKELCGGTHVANTGMITTFKILSEAGVASGVRRIEALTGDGVFAYYREMEKELNEAA KVAKATPAQLKDKVEHMLSEIKSLQSEVEALKSKLAKDALGDVMNQITEVKGIKLLATAV EDVDMNGLRDLGDQLKEKLGEGVVVIASSANGKVNLIAMVTDGAMEKGAHAGNLIKGIAA LVGGGGGGRPNMAQAGGKNPAGIPDAIAKVQEVLEGQIS >gi|330404621|gb|ADLB01000013.1| GENE 91 78967 - 79089 167 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAFWATLIQTIIKMVIVGAAAFGGIMLGKTLRKRKNEKNK >gi|330404621|gb|ADLB01000013.1| GENE 92 79152 - 80660 1848 502 aa, chain - ## HITS:1 COG:CAC3230 KEGG:ns NR:ns ## COG: CAC3230 COG4624 # Protein_GI_number: 15896476 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Clostridium acetobutylicum # 138 455 50 379 450 193 33.0 6e-49 MRGLKTPVRYIRRKVFEEVARVGFTSTPETLIEDIEEIPYRLVNEETEKYRESVYRSRAI VRERLRLAMGMSLRPEDKPVHLTAGIEESNISAKYYEPPLMQVIPSACASCEEKEYEVSN MCKGCVAHPCKEVCPVGAISMKDGHSFIDQTKCIKCGKCKANCPYDAIAKKERPCQKSCG VNAIVSDKYGRAKIDNDKCVSCGMCMVSCPFGAISDKSQIFQLARALREGGEIIAEIAPA FVGQFGANITPRNIKAALQELGFKEVYEVALGADIGAISEAHHYVEKVVTGELPFLLTSC CPSWSILTKKYFPDLVDQVSQELTPMVATARTIKQEHPNAKVVFIGPCASKKLEASRRTV RSDVDFVITFEELQGMFDAKGINLEKYEAESSFHDATGAGRGYAAAGGVAAAIEKCINEY YPDVEVNIEHAEGLAECKKMLVLAKAGKMNGCLIEGMGCPGGCIAGAGTNIEIPKAKKAL GEFVSKSSKDIPAKELEEIELK >gi|330404621|gb|ADLB01000013.1| GENE 93 80670 - 83048 2195 792 aa, chain - ## HITS:1 COG:TM0268_2 KEGG:ns NR:ns ## COG: TM0268_2 COG1410 # Protein_GI_number: 15643038 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Thermotoga maritima # 289 791 3 479 483 273 34.0 8e-73 MILEQLGKKLLFLDGGMGTLLQAEGLLPGELPETWNAKRRETVINIHRQYFEAGSDIVLT NTFGANAIKFHDDFYSLENIIKQAVANVHEGAKKCNKSKEEYCVALDIGPTGKLMEPMGD LSFEDAYNTFAEVMQYGKEAGADLIHIETMSDTYELKAAVLAAKENTNLPVFATMIFNEN GKLLTGGDVPSAVSMLEGLRVDAIGLNCGMGPKQMLPILKEMRKYTSLPIIVKANAGLPK 
QKNGETYYDVNPEEFANAMAEIVEEGGCVIGGCCGTTPEHISEMKKMCGNKQIKVPQKRS ETIVSSYGKAVVFGEKPIIIGERINPTGKSKMKQALKENQLEYLLKEAITQQEKGADILD VNVGLPDINEPEMMKKVIPEIQSVTNLPLQIDTVNVAALEGAMRLYNGKPMVNSVTGKQE SMDKVFPLIQKYGGVVVGLTLDEAGIPKTAEGRLEIAKKIIREAEKYGIDKKDIVIDVLT MTISSEPEGAKTTLEALEMVRTICGVHTVLGVSNISFGLPTRPIINANFYTMAMQKGLSA GIINPSSEEMMNSYYAFCALMNLDENCENYIANCMPKTVETKPVTTLTLKMAIEKGLKEE TVQSVKSLIQTEKPLEIINNYLIPALDTVGKGFEKGTVFLPQLLMSAESAKEAFAILKEE LSKSGQTDAKKEKVILATVKGDIHDIGKNIVKVLLENYAFDVVDLGKDVPPEKVVEAAKE NNVKLVGLSALMTTTVVSMEETIKQLRKEVPDCKVMVGGAVLNQEYADMIGADFYGKDAM QSVHYARKILKH >gi|330404621|gb|ADLB01000013.1| GENE 94 83052 - 83696 477 214 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01383 NR:ns ## KEGG: EUBELI_01383 # Name: not_defined # Def: 5-methyltetrahydrofolate--homocysteine methyltransferase # Organism: E.eligens # Pathway: Cysteine and methionine metabolism [PATH:eel00270]; One carbon pool by folate [PATH:eel00670]; Metabolic pathways [PATH:eel01100]; Biosynthesis of secondary metabolites [PATH:eel01110] # 6 212 13 219 227 130 36.0 3e-29 MEKRVREAIRYLGFGKTAVDDRTFALIIKSFKELEETADERVVYHIFSLQTEPDNKIKIG NMTIDSIHLGKNLRGCQEVILFGATLGTGVDMLMKRRSITDMAGAVVLQACAAAMLEEFC DRCVEEISEEFKREGKYLRPRFSAGYGDFSISYQEEILRMLEASKKIGLTMTKGSMLTPI KSITAVIGVSDKNISCHREGCEVCDKTDCDYRRG >gi|330404621|gb|ADLB01000013.1| GENE 95 83705 - 84571 740 288 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 287 1 291 296 233 40.0 3e-61 MKIIDRLNEDRINISFEVFPPKTEAGFDSVIKAVDEIAMLEPAFISVTYGAGGGTSGNTV HIASHIKHDLQVESLAHLTCVSSTKEQVHEMISTLKEEGIENILALRGDIPKETEFPIAG RYRYACELIREIKQQGDFCIGAACYPEGHVENEHKKDDIQYLKEKVDSGVDFLTTQMFFD NSILYNFLYRIREKGITVPVLPGIMPVTNGKQIRRICELSGTVLPQRFKAIVDRFGDDPK AMQQAGIAYATDQIIDLLANGIENIHIYSMNKPEVAKAIMMNLKEIIK >gi|330404621|gb|ADLB01000013.1| GENE 96 84590 - 86026 1302 478 aa, chain - ## HITS:1 COG:CAC2239 KEGG:ns NR:ns ## COG: 
CAC2239 COG0297 # Protein_GI_number: 15895507 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Clostridium acetobutylicum # 3 475 2 473 477 475 51.0 1e-134 MRKILFAASESVPFIKTGGLADVVGSLPKYFDKKKYDVRVMLPKYMCMKQEWKDKMEYKT HFYLDLSWRKQYVGILETKLDGITFYFIDNEYYFAGPTPYGNIYEDVEKFAFFSRAVLSA LPLIDFRPDIIHCHDWQTGLIPVYLDNFRFGGEFYQGIKTVMTIHNLKFQGIWDKKTIQD ITGLPDYYFTSDKLEAYKDANYLKGGIVYADRVTTVSNSYAEEIKTEFYGERLDGLMNAR ANCLSGIVNGIDYDVYNPKTDKMITKKYSVDTFRRDKKKNKLALQEELGLEVNDKKMMIG IVSRLTDQKGFDLIAYIMDELCQDDIQLVVLGTGEERYENMFRHFAWKYEGKVSANIYYS EEMSHKVYASCDAFLMPSLFEPCGLSQLMSLRYGTVPIVRETGGLRDTVEPYNEFEKTGT GFSFKNYNAHEMLGIIRYAEKVYYEKKRDWNKIAEHGMKKDFSWKNSAKQYEALYEEM >gi|330404621|gb|ADLB01000013.1| GENE 97 86091 - 86879 793 262 aa, chain - ## HITS:1 COG:BH2773_1 KEGG:ns NR:ns ## COG: BH2773_1 COG0784 # Protein_GI_number: 15615336 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Bacillus halodurans # 1 121 1 120 120 109 41.0 7e-24 MGNLNVAIADDNERILELLDEIISTDNELTVVGKANNGEDMCHIIKNKEPDVVLLDLIMP KMDGISVMEKINTDASIKKRPDFIVVTAVGQERITEDAFRKGANYYIMKPFNNEMIINRI KSTARGRVRGTVQSSAQIQQPALTKERLEEYVTDMLHEIGIPAHIKGYHYLRDSILMAVD DMDVLNAITKVLYPTVAKKHQTTASRVERAIRHAIEVAWGRGKPDTLEELFGYTISNGKG KPTNSEFIALIADTIQLKYKRR >gi|330404621|gb|ADLB01000013.1| GENE 98 86988 - 88043 1013 351 aa, chain - ## HITS:1 COG:CAC2072 KEGG:ns NR:ns ## COG: CAC2072 COG0750 # Protein_GI_number: 15895342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Clostridium acetobutylicum # 47 349 86 391 395 210 39.0 4e-54 MRKYWYRRCLIIILVFTMAMGSGMLITEQQRKQEVSVESISNDLLIPGGVPVGIYMETDG VMVLGTEKVKSVDGAKYEPAKRLVRPGDYIQEIDGVKVKNKKELMEEVSKVNPSGVVLKL KRGKDTLSVKIRPVKCKGKSYKLGIWVRDNTQGLGTITFLNGNSKYGALGHGIHDMDTGK LLQMSSGRLYETSIQDIKKGVDGEPGGMEGIIVYNKYNVLGNIEKNTEAGIYGRMDTIPQ ELKKQEPLRIGKKEEIEEGSATIRCSVDGELKEYSVRITKVDLHEREVNKGIVLEVTDKE LLEKTGGIVQGMSGSPIIQNNKIIGAVTHVFVQDSAKGYGIFIENMLRNVK 
>gi|330404621|gb|ADLB01000013.1| GENE 99 88135 - 89811 1413 558 aa, chain - ## HITS:1 COG:BH2776 KEGG:ns NR:ns ## COG: BH2776 COG0497 # Protein_GI_number: 15615339 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Bacillus halodurans # 1 558 1 562 565 342 38.0 1e-93 MLQNLHIKNLALIDEIEVDFEEGLNILTGETGAGKSIILGSVHLALGGKYNADMLRKGAS YGLVELTFQIREEECKEALEKLDIFPEENEVVFSRKLMEGRSVSRINGETVSINKVKEAA GILIDIHGQHEHQSLLYKKNHLDILDAFAKEDISELKAEVKRKYTVYKALREEVEQAKTD EAERAKEIDFIQFEVAQIRDAGLKPGEDSDLEEQYKKMINGKRIVENVDEVAGYMGAYDG NGISDMLNRAIRCLGEISHMDTSVAALHGQLSDVDNLLNDFNRELSEYRHTLEFSEEEFY ETETRLNEINRLKAKYGQTIEEILRYCEEKENRLERLNDYDNYVEQLQKKYRAAEKELEK VTEELSEGRKKAATLLRDAIEEGLLELNFLEVRFEIAFEEMKEFTANGKDAVEFLISMNP GEMVKPLGEVASGGELSRIMLAIKTVLADKDAIETLIFDEIDVGISGRTAQKVSEKMCVI GRKHQVICITHLAQIAAMADTHFAIEKKVEGGKTKTEITKLSEEQSVQELSRILGGAKIT DKVIENAVEMKKLALNVK >gi|330404621|gb|ADLB01000013.1| GENE 100 89821 - 90273 630 150 aa, chain - ## HITS:1 COG:BS_ahrC KEGG:ns NR:ns ## COG: BS_ahrC COG1438 # Protein_GI_number: 16079481 # Func_class: K Transcription # Function: Arginine repressor # Organism: Bacillus subtilis # 2 148 3 149 149 119 42.0 3e-27 MKVNRHAKIVELIHKYDIETQEELADYLNEAGFQVTQATVSRDIRDLRLTKVPTGTGKQK YIVLRTPDEDLRDKYRRVLQDGYISMDKAQNILVIKTVPGMAMAVAAALDAMKWHEVVGC IAGDDTIMCAIRSEDATTEVMEKISKIVFQ >gi|330404621|gb|ADLB01000013.1| GENE 101 90276 - 91085 680 269 aa, chain - ## HITS:1 COG:RSc2650 KEGG:ns NR:ns ## COG: RSc2650 COG0061 # Protein_GI_number: 17547369 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Ralstonia solanacearum # 49 266 74 290 302 152 36.0 9e-37 MDKFYVITNHTKDENYEVTRAIKKYIEDKGKICILDSERTIPDDTEGVLVIGGDGTLIQA SRELLDKKMQLIGINLGTLGYLTEIEMQTVYPALDSLIEDKYTVEERMLLKGILPNGRED VALNDIIVTRYGSLRLIAFRVYVNGELLNTYQADGIILSTPTGSTAYNLSAGGPIVEPTA SLIVLTPICSHALNTSSIILSVEDEIVIEIGSRRENEVEEAVVAFDGTDILKMRTGERIR VKKADETMKLMKINQVSFLETLRRKMKGN >gi|330404621|gb|ADLB01000013.1| GENE 102 91099 - 91908 953 269 
aa, chain - ## HITS:1 COG:CAC2076 KEGG:ns NR:ns ## COG: CAC2076 COG1189 # Protein_GI_number: 15895346 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase # Organism: Clostridium acetobutylicum # 2 267 5 266 267 306 60.0 3e-83 MKERLDVLLVKRNLVESREKAKAIIMSGNVFVEGEREDKAGTTFSDEVQIEIKGHTLPYV SRGGLKLEKAVANFDVDLEGKVCTDVGSSTGGFTDCMLQNGAKKVFAIDVGRGQLAWKLR QDDRVICMEKTNIRYVAPEDLGERIDFSSIDVSFISLTKVLLPIRNYLKEDGQIVALIKP QFEAGREKVGKKGVVREKSTHHEVIETVVSYAVSIGFKVLNIDFSPIKGPEGNIEYLLHI QKTENAGTLEDVDVDLKSVVDNAFDTLAK >gi|330404621|gb|ADLB01000013.1| GENE 103 91917 - 93782 1753 621 aa, chain - ## HITS:1 COG:CAC2077 KEGG:ns NR:ns ## COG: CAC2077 COG1154 # Protein_GI_number: 15895347 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Clostridium acetobutylicum # 2 616 4 617 619 649 53.0 0 MVLERIQKENDIKKLNGQELSVLADEIRTFLIEKISVTGGHLASNLGVVELTMAMHLAFD LPQDRMIWDVGHQAYTHKLLTGRKAGFDDLRKHGGMSGFPKRKESDCDAFDTGHSSTSIS AGLGYVEAREILGEKHHVISVIGDGSLTGGMAYEALNNASHLNSNFIIVLNDNNMSISPN VGGMSKYLDSLRTADAYTGLKKGVESALKQVPVAGKPLVSHLKKTKSSIKQLFVPGMFFE DMGITYLGPIDGHDIKALYRTFEEAKKLNNAVLVHVITKKGKGYLPAEEFPSKFHGTGPF AIETGESLEKKKKDTYTDVFGKVMCDLAAKDSKVVAITAAMSDGTGLSEFSRRYKKRFFD VGIAEEHAVTFAAGLAAGGLKPVFAVYSSFLQRAYDQLIHDVALQNLPVVFAVDRAGIVG NDGETHQGIFDLSFLSSIPNMTIISPKNRWELADMIRYAIQYEAPIAIRYPRGTAYTGLK EYRKPIAFKTSETIYEEDGIAIFSVGHMMEVAEKVRERLKATGYNCSLINSRFVKPIDEH ILEEMAEEHTLFVTIEENVLSGGYGEKVQDYVMEQQLSVEVLKIGVPDEYVEHGNIDVLR KEIMLDEESIVKQIITQYIGR >gi|330404621|gb|ADLB01000013.1| GENE 104 93795 - 94685 1064 296 aa, chain - ## HITS:1 COG:alr0213 KEGG:ns NR:ns ## COG: alr0213 COG0142 # Protein_GI_number: 17227709 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Nostoc sp. 
PCC 7120 # 28 266 41 279 309 219 50.0 7e-57 MNFNELREAKVTEIENILRKYLPDSSGHQEKIMEAMEYNLMAGGKRIRPMLMKETYEMFG GTGKIIEPFMAAIEMIHTYSLIHDDLPAMDNDDYRRGRKTTHIVYGEAMGILAGDALLNY AFETASEAFDMDRDNGYLIGKALQILGRKAGIYGMIGGQVVDVAASGKAVDKDVLDFIYD LKTGALIEASMMIGAILAGATEQEVKIIESAAKKVGLAFQIQDDILDVTSTKEVLGKPIH SDEKNEKTTYVTLEGFDQAKEQVEILSTEAIDLMKGLNRENLYLMTLLEKLIYREK >gi|330404621|gb|ADLB01000013.1| GENE 105 94675 - 94878 286 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167758089|ref|ZP_02430216.1| ## NR: gi|167758089|ref|ZP_02430216.1| hypothetical protein CLOSCI_00427 [Clostridium scindens ATCC 35704] # 2 67 9 74 74 65 71.0 1e-09 MEEKTLEEVFSQLDTVIQDMEREDISLEESFRLYHEGMQMLKVCNEKIDTVEKKMLILDE EGTEHEF >gi|330404621|gb|ADLB01000013.1| GENE 106 94847 - 96082 1254 411 aa, chain - ## HITS:1 COG:BS_yqiB KEGG:ns NR:ns ## COG: BS_yqiB COG1570 # Protein_GI_number: 16079486 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Bacillus subtilis # 6 400 8 447 448 304 41.0 2e-82 MRNVYSVKQVNSYIKNMFTQDFMLNRIYVKGEVSNCKYHTSGHIYFSLKDESGTLACIMF AGQRAGLTFRMREGQQVIVLGSVTTYERDGKYQLYAKEIILDGAGLLYEKFEALKKELEE MGMFAPEYKQPIPFYAKRIGIVTAPTGAAIRDIMNIASRRNPYVQLILYPALVQGKDASE SIIKGIGMLEAKGVDLIIVGRGGGSIEDLWAFNEESVARAIFNCAVPIISAVGHETDTTI ADYVADLRAPTPSAAAELAVTEYSRLEETMYDYEVQLKRNLRQFLAAKRLLLRQYAIRMK YLQPYNRLREQRQQLINMEEKIQSLMQKKLDRAKYNFAVRLENMKALSPLQKLNQGFGYV TDESGKTVKSVKQTEKGNTLKIQMKDGILYAKVMDKAEEEIDGREDAGRSF >gi|330404621|gb|ADLB01000013.1| GENE 107 96093 - 96491 475 132 aa, chain - ## HITS:1 COG:CAC2084 KEGG:ns NR:ns ## COG: CAC2084 COG0781 # Protein_GI_number: 15895354 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Clostridium acetobutylicum # 1 132 1 130 135 91 42.0 4e-19 MGRTELRQHIFKILFLIEFNGRDEMSEQIELYLDNLEELSEKDRAYIENKYRSVVEKVEE IDELLNANATGWKTARMNKVDLTILRLATYELKWDEDVPVGVAINEAVELAKKYSSEEGP SFVNGVLGKLVN >gi|330404621|gb|ADLB01000013.1| GENE 108 96624 - 97004 493 126 aa, chain - ## HITS:1 COG:BS_yqhY KEGG:ns NR:ns ## COG: 
BS_yqhY COG1302 # Protein_GI_number: 16079489 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 12 126 11 125 135 81 38.0 4e-16 MGKEERNAYTIQSDENLGEVKIADEVVAIIAGLAAMEVDGVASMAGNATREIIGKLGMKT LSKGVKVDVLEGIVTVCMNLNLKYGYSIREISGKVQEKVKTAIENMTGLTVADVNIRIAG VEMDAQ >gi|330404621|gb|ADLB01000013.1| GENE 109 97078 - 98670 1580 530 aa, chain - ## HITS:1 COG:CAC0630 KEGG:ns NR:ns ## COG: CAC0630 COG4108 # Protein_GI_number: 15893918 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Clostridium acetobutylicum # 1 529 1 526 526 721 67.0 0 MSNRINEIKKRRTFAIISHPDAGKTTLTEKFLLYGGAINQAGSVKGKATAKHAVSDWMEI EKERGISVTSSVLQFNYDGYCINILDTPGHQDFSEDTYRTLMAADSAVMVIDASKGVEAQ TRKLFKVCVMRHIPIFTFINKMDREARDTFELLDDIEKELGIATCPINWPIGSGKEFKGV FDREHGEIELFSDTKKGTSVGEVRKVALNDSSVETLISEEQRVQLEEEIELLDGASAEFD QELVNKGELSPVFFGSALTNFGVETFLQHFLAMTSSPLPRKSDEGEIDPIKEEDFSAFVF KIQANMNKAHRDRIAFMRICSGEFDAGMEVFHIQGGKKVRLSQPQQMMASERKMVDKAYG GDIIGVFDPGIFSIGDTLTTSPRKFAYEGIPTFAPEHFARVRQVDTMKRKQFIKGINQIA QEGAIQIFQEYNTGMEEIIVGVVGELQFDVLKYRLENEYNVEIRMERLPYEHIRWIENEE LDLDKLIGTSDMKKIKDLKDRPLLLFVHSWSIRMTEERNEGLKLSEFGRS >gi|330404621|gb|ADLB01000013.1| GENE 110 98739 - 99341 794 200 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2230 NR:ns ## KEGG: EUBREC_2230 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 196 1 207 208 112 42.0 1e-23 MKRLFKKNQMIITTLAIMIAIAGYLNYSGKIFGDKPATETSGELASKDLLDISEETTGDI ESHDGEVADGSVEGTPGEAVLTNGVVAEAKVTREQVRAKNKETLQAIIDNKNISEEQKKD AIAQMVEMTALAEKEVAVETLLASKGFKNAVVSLTKDSADIVVGKSELTDANRAQIEDIV TRKTDIAPANIVITPIHEKK >gi|330404621|gb|ADLB01000013.1| GENE 111 99361 - 99909 698 182 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2518 NR:ns ## KEGG: Cphy_2518 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 11 182 52 238 239 95 34.0 8e-19 
MGLKLNKILQKKQQWLIVLLVGVLLVVIAIPSEKQEIEEEQMQKTTVKTDTTSYTSQMER KLEQVLGQVAGVGEAKVMITLKSTAEKVIEKDGENTNQSVEEEDAQGGVRTTKDNNKKET TIYEGGSEEQKPYVKKEMTPEIEGVIVIAEGGGNPTTVQNITEAVLALFDVDTHKIKIMK MN >gi|330404621|gb|ADLB01000013.1| GENE 112 99890 - 100222 324 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167758082|ref|ZP_02430209.1| ## NR: gi|167758082|ref|ZP_02430209.1| hypothetical protein CLOSCI_00420 [Clostridium scindens ATCC 35704] # 1 106 1 115 116 109 53.0 4e-23 MFAFLYEWIRNVAFYIVIITAVIQILPNNTYKKYIHFFTGLVLILLLMTPVLKILGMDNL SNPLQQGKEFEQKMKEIEKETEYLNEVRLEDYLEEENIEVEEIQIGAETK >gi|330404621|gb|ADLB01000013.1| GENE 113 100235 - 101380 1019 381 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2520 NR:ns ## KEGG: Cphy_2520 # Name: not_defined # Def: sporulation stage III, protein AE # Organism: C.phytofermentans # Pathway: not_defined # 34 380 29 376 380 243 42.0 1e-62 MKKIVLGVLLFFLCFHIQAVATEIQDKDREQEQAEDTLMKAFDFGEVNQVLKDILPEDKI DFGETVKSVINGDMKFSVELLEKLISDQLFYEVRQNKKTIVHILLIAIIAAVFTNFSNVF QNQQIGEISFYVLYLLLITICLSTFQILIDSVGAHLENLTMFMRVLGPLYFLAVAISTGS GTSIIFYNLLLFLIYIVELLILNFLLPLLHIYMVIRVLNNLSAEEYLSKFAELIEFIVTW TLKTLLAAIIGLNVIQGLISPAIDSLKRGVLTKGMEAIPAIGDALGGTAEIILGTSVLVK NGIGVTGAIVCITICVIPILQMFVMAFLYKLTSAFIQPISDKRIVGCVSGMGEGCQILLR VVFTTGMLFLLTIAIVTATTT >gi|330404621|gb|ADLB01000013.1| GENE 114 101386 - 101772 563 128 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2521 NR:ns ## KEGG: Cphy_2521 # Name: not_defined # Def: stage III sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 5 127 15 137 139 102 50.0 5e-21 MGIIQIGAIGVIGAILAIQFKSGKSEYGIYISIALSLLIFFSIVGKLETIVQVIKSIGEK IQIKSTYITALLKMLGVTYVAEFSSAICKDAGYQTIAQQIEIFSKLTILALSMPILLALL ETIQAFLG >gi|330404621|gb|ADLB01000013.1| GENE 115 101783 - 101977 198 64 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2235 NR:ns ## KEGG: EUBREC_2235 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 64 2 65 65 68 64.0 8e-11 MSVNLIFKIAAVGILVSVLSQVLKHSGREEQAFLTSFAGLLLVLFWILPYIYELFESIKR LFSL 
>gi|330404621|gb|ADLB01000013.1| GENE 116 101989 - 102507 522 172 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2523 NR:ns ## KEGG: Cphy_2523 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 172 6 176 176 114 33.0 2e-24 MLKLIGCVLIIFASSGMGYLKGMELKKHLVEVEKMRQMFLMLRSEIRHIKSPLPEAFRHI GKRMGGVYESWLLDLSEQLIRKSGVTFMELWSNSIEKHWKGGNLKEGDMEKLKAAGENMG YLDEEMQVGTIDLYVEQLEQEIQRLQNEFAVEKKLYHCLGVMGGIFLAVVLI >gi|330404621|gb|ADLB01000013.1| GENE 117 102522 - 103445 840 307 aa, chain - ## HITS:1 COG:CAC2093 KEGG:ns NR:ns ## COG: CAC2093 COG3854 # Protein_GI_number: 15895363 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 6 295 4 295 305 260 46.0 2e-69 MKEQKKQILQVLGKKIQCCVEREQFDFSNLQEIRLRVGKPLTVLYKGKEKLLMKVEQEDV RETMEYISNYSLYAYENEMRQGFITIEGGHRVGMAGKVVMEEGKVKSLKYISSINIRVAH EVRGCADKLFPYITKERQICHTLIISPPRCGKTTLLRDMIRQISDGNRWVKGVPVGVVDE RSELGGCYMGTAQNELGIRTDILDCCPKADGMLMLIRSMAPQVIAVDEIGAREEIRAIEY ALHCGCKMLATAHGVSMEEMKKKPFFEQLIREKRFERYVVLGNEHHMGEVLGIYDENGER IFEHVAI >gi|330404621|gb|ADLB01000013.1| GENE 118 103564 - 105654 2708 696 aa, chain - ## HITS:1 COG:CAC1808 KEGG:ns NR:ns ## COG: CAC1808 COG1185 # Protein_GI_number: 15895084 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) # Organism: Clostridium acetobutylicum # 1 696 1 694 703 702 53.0 0 MFKKFEMELAGRTLRVDVGRVAAQANGAALMHYGETVVLSTATASEKPREGIDFFPLSVE YEEKLYAVGKIPGGFNKREGKASENAILTSRVIDRPMRPLFPKDYRNDVTLNNLVMSVDP DCSPELTAMLGSAIATAISDIPFDGPTSTTQVGLIDGEFVFNPNAAQRAVSDLQLTVAST RDKVIMIEAGANEVPEDKMIEAIFAAHEVNQQVIQFIDTIVAECGKPKHEYTSCAVPEEL FEAMKEIVTPEEMEVAVFTDEKQVREENLRVIREKLEEAFADNEEWLEILGEALYQYQKK TVRKMILKDHKRPDGRAIDQIRPLAAEIDLIPRVHGSAMFTRGQTQICTVTTLAPLSEAQ RIDGLDEAETSKRYMHHYNFPSYSVGETKPSRGPGRREIGHGALAERALVPVLPTEEEFP YAIRTVSETFESNGSTSQASICASSMSLMAAGVPIKSAVAGISAGLVTGDTDDDYLVLTD 
IQGLEDFFGDMDFKVAGTHKGITAIQMDIKIHGLTRPIIEEAIARTKQARTYILDEVMHN AIAEPREEVGPYAPKIRQMQIDPAKIGDVVGQRGKTINAIIEQTGVKIDITDEGSVSVCG VDAKAMEEAMKMIAIIVTDFEAGQVLEGKVISIKEFGAFLEFAPGKEGMVHISKISKERI NHVEDVLTLGDKVKVVCLGKDKMGRISFSMKDVVEE >gi|330404621|gb|ADLB01000013.1| GENE 119 105874 - 106329 544 151 aa, chain - ## HITS:1 COG:no KEGG:Lebu_1182 NR:ns ## KEGG: Lebu_1182 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: L.buccalis # Pathway: not_defined # 3 142 4 147 157 78 36.0 8e-14 MTIRVAEEKDYIQVENLMKQVQNLHVELRPDIYKPEEVVLPKKEFMEQAKKEEILVAVEE KKVVGLLSYILRMISGKTVVEKKVLFIECLVVDEEYRGRGIGSRLLDCAKEIYREKKCNG LELQVNARNIGAWELYKKCGFREKSINMELE >gi|330404621|gb|ADLB01000013.1| GENE 120 106326 - 106988 439 220 aa, chain - ## HITS:1 COG:CAC0235 KEGG:ns NR:ns ## COG: CAC0235 COG4845 # Protein_GI_number: 15893527 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Clostridium acetobutylicum # 5 204 3 202 212 151 36.0 1e-36 MKYHYLDMNAYKRKKHFEYFSTMAYPYVSVTVNVDITEFLTEIKQKKFPFFLSFCYCAAK AANSVPEFRQRIQENKIIEYDRCRTSHTVFLEDGTYCYCMLDYDKSFEEYLRYAVRTQEN AKQEKSVEDKQEDLNELIFISTLPWFSYTALNNPVPIPADSNPRITWGKYIKEDLKTVIP VTVQCNHALVDGVHISQFLEALKKELCKVTESVRKQEGNK >gi|330404621|gb|ADLB01000013.1| GENE 121 107023 - 107571 229 182 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase [Brachyspira murdochii DSM 12563] # 18 159 4 144 166 92 34 7e-18 MKFNALKVEDKTGREVILRSAEKNDAKSLLEYLKITAAETPYLIREPEEITLSIEQEEEF IQRIEDSEKELMLVATVDGKHVGNCSLMSMGGFRRYGHRCDVAIALYQEYCGRGIGKKML ETVLKVAKEIGYEQAELEVAADNKNAIALYEKLGFKQYGCFPDNMKYSNGKYADAYWLMR KL >gi|330404621|gb|ADLB01000013.1| GENE 122 107728 - 111861 4359 1377 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 8 395 612 983 1087 147 29.0 2e-34 
MNERKVNVYNESNFTNLNKYKVTWELLENGKKIDAGVVENTDVAPRTTGTIDVPFNLPKE IPAGSEYYLNISVSLKEAERWAEKGAEMSWGQIAVPVEVKQAAPAVSEKEVTVSDETDGY AVEGENFSFTIDKKTGILKNYVYDGETLVKQGPTPNFWRGRLENDKDGWYSGNTFDWGWE KVEKNIKVESVTASEKDGQHVITAELVFPDGGNTKETIVYTINGDGQVTVNMKVDPTKSG MGNMLRVGSRMTLPEGFENVTWYGNGPVETFNDRKTNGRQGIWENTVNRFFYPYLRVDDS GNLTDVKWISVENPEMKNALLVAAKDTVEAQALHYTPDELNAVEHVYELQPQDGKETYLN VNYGSSGTGGATCGPATLVQYRLPSNRVYEWEFTMIPVSKDADTEEISNLAKTYHTVDVF DQAEYDAQVAADIIKRVDEFVVYDYSQLKEVEKLQADYNALTDTQKKLVNEGKDRAALIK KYVEGVKALELQETYLQDESKNNMKIPYQTTAKFTNCGEGVVMSGQLQVPYNNILTPVFS GDDTSFMVEVNAIPTAYSTYDMFAGKGDYAFALRAREESGDFHVFADGNWRALECKMTPE FAENWLGKEHQIVGVYDAQTDKIAVYVDGVLLGEKETDKGVDPSDYNLTVGGCPETGRGS TAEFKNIHVYNKALTAAEVKGQYAETPAIGKDSENVELWVDLTDIKHREKANIYDVSIEP DLAVVKAGTTKEFKVVPDNAEAVVENAQWSVADENGKAIKGMDIVEGDDGTATLIVGKNV ADRTKAKVIVSNVNGKETLTAEAEVTVQNAEEKQIIKDESKNKLNTKLPETAQFVNGEQG KETTLKGHFTVDDEKKIVNAAMTNGKPFTVSSRVYVPASCKSDEGKFDGNGDKHNMIASL GDGSFAYRIFHKRGSAETRIDAFISDGSGWNMISSKALADDFFDKWHEISVAYTGDTLKL YVDGTELVTGATTLKVADNEFGFSVGYDPQVTSRTSDLTFEQVRVFSEALTTDELNNAAE PANDNVVLWLNFDGRLEDDIATDVDKRMLEALVDYCLSLESGDYLEAGWSAMQKPLETAK TVLADKQATRKEVADAEKALSEAKEALVYVKDLKDAIDVADKEIVPNKDKYTKDSYKVFS DALKEAKAIRNKKDATQAEVNKAKITLLDAQNALVNIADKSDLSKAIKDAEQLLKKESLT PSSEQELKAAIEAAKKVNDDENATQEAVDAATEALKEAMGAIRTMADFKELEKTVNRIDE MKLDKYTEESVQILKKALADAKAVLANKESTQKEVDDALSTLLAAEKGLVKKQDGGNNGG NNGGSNGGNNQNNGNHGNPNRPVKTGDTSPVMAFGLAAVATGLAGAVAMYTKRRKRS >gi|330404621|gb|ADLB01000013.1| GENE 123 111945 - 112010 89 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTAVTGEMYQTTTVSARMVY >gi|330404621|gb|ADLB01000013.1| GENE 124 112123 - 114087 1715 654 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 140 654 62 540 1087 327 36.0 3e-89 MKRKKLLPLLLSAAMVVTMFPTGSLTVQAAGTSSSKMKVSDKVKADVDKTKFTHKEWTGT DYQDVNGKEVTGEDVFGINREDASTVLIPYQNADAAKKAVWNYNARTESSYFQLLTGEEN 
DWDLTVVQNQEEAEKFMGKDGFMTEKFQPEKQDGWKTVQLPQSWTTQGFDKSIYTNTQMP WQTGEEGTPCPEAPTKYNPVGLYRKTFQVNQAMRDSGRRIYLDFQGVESAYYVYVNGKEV GYSEDTFSPHRFDITDYLKDGENLLAVKVHKFCDGTWFEDQDMIYDGGIFRDVYLTSAPL VQISDYVVQTDLDSEYKDATLNLSVDVRNLSNEAQKGWTIDVAAFDEAGKNILGETGISV DEVVSNKTKTFNLSQKVENPKLWSAENPNLYALVLTLRDGNGKEVETVSTQLGFREVEFT STKVDSNYKVLTKKWDPIRINGQNLLLKGANRHDTDPFYGKAVPQATMEEDVKLMKQNNL NAIRTSHYSNDEYLYWLCNTYGLYMMGETNMECHAIQSDSERAGLFYELGMDRTETAFKR LRNNPSIVMWSIGNEMAYTQNPNDAKGLYRDMIWYFKDNDATRPVHSEGQTDAMGVDMGS NMYPGVDTVQSRVGEGKIPYVMCEYVHGMGNSVGNLKEYWDAVRSADNMLGGFV >gi|330404621|gb|ADLB01000013.1| GENE 125 114118 - 114240 78 40 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPKLCVKYTKIVRKKTTKMRKINNRSCFIGFTVNKMYVYK >gi|330404621|gb|ADLB01000013.1| GENE 126 114493 - 114963 400 156 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 156 1 156 157 237 71.0 8e-63 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRHFWCRGYYVDTVGKNAKK IQEYIANQLQEDLEYDQMTLKEYIGPFTGEPVKPNK Prediction of potential genes in microbial genomes Time: Tue May 24 21:12:58 2011 Seq name: gi|330404459|gb|ADLB01000014.1| Lachnospiraceae bacterium 2_1_46FAA cont1.14, whole genome shotgun sequence Length of sequence - 56618 bp Number of predicted genes - 61, with homology - 58 Number of transcription units - 26, operones - 14 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 286 - 327 7.4 1 1 Op 1 38/0.000 - CDS 378 - 1298 557 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts - Term 1330 - 1369 6.0 2 1 Op 2 . 
- CDS 1386 - 2132 1109 ## PROTEIN SUPPORTED gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 - Prom 2238 - 2297 5.2 3 2 Op 1 7/0.000 - CDS 2314 - 3507 1172 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 3553 - 3612 7.3 - Term 3622 - 3661 6.1 4 2 Op 2 . - CDS 3669 - 4694 1207 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) - Prom 4731 - 4790 6.2 5 3 Op 1 . - CDS 4798 - 5649 795 ## EUBREC_2275 CAAX amino terminal protease family protein 6 3 Op 2 . - CDS 5680 - 6843 1104 ## COG2872 Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain 7 3 Op 3 . - CDS 6846 - 8612 1740 ## COG0018 Arginyl-tRNA synthetase - Prom 8745 - 8804 6.6 + Prom 8619 - 8678 13.5 8 4 Op 1 . + CDS 8792 - 9430 437 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 9 4 Op 2 . + CDS 9459 - 10361 830 ## COG0648 Endonuclease IV + Term 10365 - 10402 3.1 + Prom 10370 - 10429 5.9 10 5 Op 1 8/0.000 + CDS 10477 - 10845 439 ## COG1725 Predicted transcriptional regulators 11 5 Op 2 . + CDS 10846 - 11544 229 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 12 5 Op 3 . + CDS 11538 - 12335 502 ## DSY1434 hypothetical protein + Term 12341 - 12399 -0.1 - Term 12403 - 12453 8.0 13 6 Tu 1 . - CDS 12486 - 13856 1595 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Prom 13960 - 14019 6.6 - Term 13986 - 14031 7.6 14 7 Op 1 . - CDS 14037 - 14117 83 ## - Prom 14151 - 14210 4.3 - TRNA 14072 - 14143 64.5 # Arg CCG 0 0 15 7 Op 2 . - CDS 14212 - 15186 789 ## Cphy_2931 hypothetical protein - Prom 15367 - 15426 11.8 + Prom 15207 - 15266 11.7 16 8 Tu 1 . + CDS 15295 - 16062 658 ## gi|291541113|emb|CBL14224.1| Beta-lactamase inhibitor (BLIP). + Term 16065 - 16115 13.1 - Term 16056 - 16098 7.0 17 9 Op 1 . 
- CDS 16109 - 16666 654 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) - Prom 16690 - 16749 5.2 18 9 Op 2 . - CDS 16760 - 17260 576 ## COG2179 Predicted hydrolase of the HAD superfamily 19 9 Op 3 . - CDS 17263 - 18231 844 ## COG0524 Sugar kinases, ribokinase family 20 9 Op 4 . - CDS 18259 - 18933 497 ## COG5632 N-acetylmuramoyl-L-alanine amidase - Prom 18961 - 19020 5.0 - Term 19042 - 19085 8.1 21 10 Tu 1 . - CDS 19100 - 20554 2027 ## COG0516 IMP dehydrogenase/GMP reductase - Prom 20619 - 20678 6.3 22 11 Tu 1 . - CDS 20687 - 21175 562 ## Cphy_3288 hypothetical protein - Prom 21199 - 21258 5.5 - Term 21219 - 21264 9.0 23 12 Op 1 41/0.000 - CDS 21293 - 22915 1586 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 24 12 Op 2 . - CDS 22946 - 23230 506 ## COG0234 Co-chaperonin GroES (HSP10) 25 12 Op 3 . - CDS 23311 - 23637 269 ## gi|210610051|ref|ZP_03288230.1| hypothetical protein CLONEX_00416 - Term 23655 - 23700 6.0 26 13 Op 1 1/0.000 - CDS 23708 - 24490 1112 ## COG4465 Pleiotropic transcriptional repressor - Prom 24565 - 24624 4.0 27 13 Op 2 13/0.000 - CDS 24632 - 26698 1933 ## COG0550 Topoisomerase IA 28 13 Op 3 2/0.000 - CDS 26733 - 27818 872 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake 29 13 Op 4 . - CDS 27821 - 29329 865 ## COG0606 Predicted ATPase with chaperone activity - Prom 29416 - 29475 6.2 - Term 29440 - 29491 12.0 30 14 Op 1 . - CDS 29495 - 30532 905 ## gi|166031604|ref|ZP_02234433.1| hypothetical protein DORFOR_01304 31 14 Op 2 . - CDS 30543 - 31115 685 ## Shel_22530 hypothetical protein - Prom 31215 - 31274 5.4 + Prom 31169 - 31228 4.3 32 15 Tu 1 . + CDS 31252 - 31872 328 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit - Term 31836 - 31886 11.4 33 16 Op 1 4/0.000 - CDS 31890 - 33119 1510 ## COG0826 Collagenase and related proteases 34 16 Op 2 . 
- CDS 33135 - 33776 774 ## COG4122 Predicted O-methyltransferase - Term 33788 - 33826 4.3 35 17 Op 1 1/0.000 - CDS 33833 - 35500 1745 ## COG0595 Predicted hydrolase of the metallo-beta-lactamase superfamily 36 17 Op 2 . - CDS 35514 - 35990 430 ## COG0735 Fe2+/Zn2+ uptake regulation proteins 37 17 Op 3 . - CDS 36014 - 36274 388 ## gi|255282414|ref|ZP_05346969.1| conserved hypothetical protein 38 17 Op 4 6/0.000 - CDS 36296 - 36724 657 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 39 17 Op 5 . - CDS 36729 - 36992 394 ## COG4472 Uncharacterized protein conserved in bacteria 40 17 Op 6 . - CDS 37049 - 38341 938 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 - Prom 38362 - 38421 3.2 + Prom 38383 - 38442 7.0 41 18 Tu 1 . + CDS 38473 - 38703 354 ## EUBELI_01159 hypothetical protein + Term 38724 - 38758 6.0 - Term 38712 - 38744 5.6 42 19 Op 1 7/0.000 - CDS 38762 - 39946 1397 ## COG0301 Thiamine biosynthesis ATP pyrophosphatase 43 19 Op 2 . - CDS 39984 - 41141 1256 ## COG1104 Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes 44 19 Op 3 9/0.000 - CDS 41185 - 41922 781 ## COG1385 Uncharacterized protein conserved in bacteria 45 19 Op 4 . - CDS 41933 - 42886 999 ## PROTEIN SUPPORTED gi|240145923|ref|ZP_04744524.1| ribosomal protein L11 methyltransferase 46 19 Op 5 . - CDS 42976 - 43536 686 ## COG1971 Predicted membrane protein 47 19 Op 6 . - CDS 43536 - 44696 1212 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 48 19 Op 7 . - CDS 44700 - 45911 1632 ## COG3681 Uncharacterized conserved protein - Prom 45935 - 45994 3.0 49 20 Op 1 . - CDS 46049 - 47221 789 ## Cbei_2712 FliB family protein 50 20 Op 2 28/0.000 - CDS 47214 - 49940 2452 ## COG0419 ATPase involved in DNA repair 51 20 Op 3 . - CDS 49944 - 51017 754 ## COG0420 DNA repair exonuclease 52 20 Op 4 . 
- CDS 51089 - 51616 438 ## gi|210611502|ref|ZP_03288922.1| hypothetical protein CLONEX_01112 53 20 Op 5 . - CDS 51640 - 52689 1077 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis 54 20 Op 6 . - CDS 52714 - 53124 404 ## gi|239624705|ref|ZP_04667736.1| conserved hypothetical protein 55 20 Op 7 . - CDS 53096 - 53386 304 ## gi|283797219|ref|ZP_06346372.1| putative FMN-dependent (S)-2-hydroxy-acid oxidase - Prom 53549 - 53608 10.8 + Prom 53251 - 53310 4.6 56 21 Tu 1 . + CDS 53367 - 53486 107 ## + Prom 53491 - 53550 7.7 57 22 Tu 1 . + CDS 53624 - 54799 1026 ## COG0270 Site-specific DNA methylase 58 23 Tu 1 . - CDS 54816 - 55358 575 ## COG1592 Rubrerythrin - Prom 55408 - 55467 5.1 - Term 55431 - 55476 1.7 59 24 Tu 1 . - CDS 55560 - 55646 174 ## - Prom 55671 - 55730 2.2 60 25 Tu 1 . - CDS 55777 - 56040 290 ## Ccel_1541 hypothetical protein - Prom 56196 - 56255 5.0 + Prom 56174 - 56233 7.4 61 26 Tu 1 . + CDS 56290 - 56574 453 ## EUBREC_2191 hypothetical protein Predicted protein(s) >gi|330404459|gb|ADLB01000014.1| GENE 1 378 - 1298 557 306 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 3 305 4 280 283 219 43 4e-56 MAITASMVKELREMTGAGMMDCKKALNETNGNMDEAVEFLRKNGQAKADKKAGRIAAEGI VKAIVKDDKVAAIVEVNSETDFVAKNADFQTYVEEVANQALNTNATDIDAFLAEAWVSDN SKTVKEVLTEKISVIGENLNIRRFEKVETEGCVVSYIHGGGRIGVLIEADADVVNDEIKG CLRNVAMQVAAMSPKYVSRDEVAADYLEHEKEILLAQAKTENPEKPDNIIEKMIIGRLNK ELKEICLLDQVYVQDSDYTVAKYVEKVAKENGANVTVKRFVRFETGEGLEKKNEDFAAEV AAQMGN >gi|330404459|gb|ADLB01000014.1| GENE 2 1386 - 2132 1109 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145469|ref|ZP_04744070.1| ribosomal protein S2 [Roseburia intestinalis L1-82] # 1 248 1 247 247 431 87 1e-120 MSVISMKQLLEAGVHFGHQTRRWNPKMAEYIYTERNGIYIIDLQKSVGKVDEAYKAMSDI AAEGGTILFVGTKKQAQDAIKTEAERCGMFYVNERWLGGMLTNFKTIQSRIARLKDIERM SEDGTFDVLPKKEVIEIKKEWDKLEKNLGGIKDMKKAPDAIFVVDPKKERICVQEAHTLG 
IPLIGIADTNCDPEELDYVIPGNDDAIRAVKLIVAKMADAVIEANQGTTGVEDVYEEVAE EVEAAEEA >gi|330404459|gb|ADLB01000014.1| GENE 3 2314 - 3507 1172 397 aa, chain - ## HITS:1 COG:L161266 KEGG:ns NR:ns ## COG: L161266 COG0791 # Protein_GI_number: 15672918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Lactococcus lactis # 279 372 84 177 197 67 40.0 5e-11 MKKIKKIFIATVITSSLIVVPALASPNVNQIEQNKEETEKKLEDANSKLVTLLTDFEVLK GDIRNQEAKITLADKDLQEAKKKEEEQYKQMLIRIKYMYENGDGNEISALLGAESFGEVI NQVEYMKNVHSYDRKMLNEYKKSKEKVATLKQDLESGKADMEFMATTYEQQEKELKTTIE TMKTQVADFDVQLANAQRQAEEEARRLEQQTQQMEAELSDTKPSKPQTNRPSSNNDNNDK DEEDKPSTPKPDKPSKPSKPDKPSKPTEPEAEDKPSNASKGQQIANEGLKYVGNPYVWGG NDLYNGIDCSGFTSKIHAICGISGVPRNSKEQRYGGKAVNGLANALPGDIICYDGHVAIY IGGGRIVHASNPKPYPQGGIKTGTATYRTILAIRRYW >gi|330404459|gb|ADLB01000014.1| GENE 4 3669 - 4694 1207 341 aa, chain - ## HITS:1 COG:L161266 KEGG:ns NR:ns ## COG: L161266 COG0791 # Protein_GI_number: 15672918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Lactococcus lactis # 231 331 84 186 197 90 48.0 6e-18 MKKFGKVLGVSVLAASLVVTPVLATPSVNDLEQNKKATQKEVDSLQSELASLMNKINKVE EDLVTKGKEVTEATAKLTEAEKKEKEQYEKMLYRIKYMYEAGDTSFVEGLLSSDGMGEVL SEAEYVENVHDYDRKMLTEYEKTKKQITKLKSGLEKELSDIEKMQKDFEKDKQKLDDTIK EKQGEIANFDTQIEAAREEAARKAAERSNNGGNKGGSSNNDEAVNIPSDSSKASIIVSTA YSYLGTPYVWGGESKSGIDCSGLTMRAHQAAGIYLSHSSGAQGAGGKRVANMASALPGDL VCYSGHVGIYIGGGQMIHAPKPGDVVKVASVYGSPWFKRYW >gi|330404459|gb|ADLB01000014.1| GENE 5 4798 - 5649 795 283 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2275 NR:ns ## KEGG: EUBREC_2275 # Name: not_defined # Def: CAAX amino terminal protease family protein # Organism: E.rectale # Pathway: not_defined # 68 273 49 253 259 87 28.0 4e-16 MEKKRRIGVFAIISPFLVYYAVSLVVEIIASLIIMIPEIQKGGVINVEETTAAVMKVFMS HLTLITSVVALCVIPLFWLMYRKDTKYEKEIGIAERERVPILKYAVIIGLGITACIGLNN 
LLVLGNVAAYSGTYEETTAALYKENFLIQLIGLGLVVPIAEEFMFRGIIYKRLSFIMKRE KAMLFSALMFGLYHGNLVQAIYGFVLGYLAVYIYEKYGSLKASILFHTVINLTSVIGTEC GMYTWIFGNVIRVGIVTVICGAAASSFFVLMQKLFKKEAKTET >gi|330404459|gb|ADLB01000014.1| GENE 6 5680 - 6843 1104 387 aa, chain - ## HITS:1 COG:CAC0906 KEGG:ns NR:ns ## COG: CAC0906 COG2872 # Protein_GI_number: 15894193 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolases related to alanyl-tRNA synthetase HxxxH domain # Organism: Clostridium acetobutylicum # 1 264 1 264 387 175 36.0 2e-43 MTEKLFYKDSHLAEFTASVESCEKEEKYYKVVLNRTAFFSEGGGQSADTGTLDGVRVFDV QEKDGILYHMTEKPLEQGKTVTGKIDWEERFSKMQQHSGEHIVSGLIHRKYGYNNVGFHL GQDTVTMDFDGVLTKEQLREIELLANGAVVKNLDIQVDYPTKEELHTIEYRSKIEIEGQI RIVTIDGYDVCACCAPHVKKTGEIGLIKLTNVQNYKGGVRITMLCGFRALADYNEKEASV RKVSVLMSAKENEIDEAVEQLKEEKAQLKNEIALLQDKLLRSKASRIEPGQETVCLFDSE LSGNAPRELVNLLLEKDVKICGVFFGNDSEGYRYVVGSRSVDTRPIAKSLNETFSGRGGG KPEMVQGSLKGMESEIYQFFACSVLGK >gi|330404459|gb|ADLB01000014.1| GENE 7 6846 - 8612 1740 588 aa, chain - ## HITS:1 COG:CC3359 KEGG:ns NR:ns ## COG: CC3359 COG0018 # Protein_GI_number: 16127589 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Caulobacter vibrioides # 20 588 20 600 600 474 42.0 1e-133 MKKIADIISGELEKAFEESGYEAKYGKVTLSNRPDLCEYQCNGAMAAAKTYKKAPIMIAG DVVEKLQNSEAISKVEAVNPGFINIKLSEEFVADYLNKMSEDDHLGLEQNEEPKTIVIDY GGANVAKPLHVGHLRTAIIGESVKRIARFMGDEVIGDVHLGDWGYQMGLIITELKKRKPE LPYFDENFAGEYPEEAPFTIDELEEIYPTASAYAKEHDDYKEEALHATYLLQNGHKGYTA IWKHIMRVSVTDLKKNYANLNVDFDLWKGESDAQPYIPDMVEYLKKEGYARYDDGALVID VKEETDTKEIPPCMILKSDGASLYGTTDLATLIEREKLYNPDEVIYVVDKRQELHFVQVF RSARKAHIVKDDTKLTFIGFGTMNGKDGKPFKTREGGVMRLENLIGDIEERMYGKIMENR AIEEEEARKTAKIVGLSAIKYGDLSNQASKDYVFDVDRFTSFEGNTGPYILYTIVRIKSI LAKYTEMKTTPARASVKKAVSDSEKALMLEITKFNSVIETAYEEKAPHKICAYVYELANA FNRFYHETKILAEENEERQKSFISMLLLTKEVFEACVDMLGFEAPERM >gi|330404459|gb|ADLB01000014.1| GENE 8 8792 - 9430 437 212 aa, chain + ## HITS:1 COG:AF0830 KEGG:ns NR:ns ## COG: AF0830 COG1853 
# Protein_GI_number: 11498436 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Archaeoglobus fulgidus # 1 163 1 165 169 127 36.0 2e-29 MNKNVFRNFSYGVYLISSLDKERPTGCIANSAMQITSAPATIAISINHDNFTNECIRQSG QFAISILSENTSPDLIGQFGFQSGKDVNKFDTIPFQTVSGLPIPDDCCGYVTCKVIDTME TSTHTVFLGEVTDGDILKDEPPMTYAYYHKVVKGKSPKNAPTYIPEDSSSSTKHWVCGVC GYIYDGDTLFEELPDTYQCPICKVTKDKFEYK >gi|330404459|gb|ADLB01000014.1| GENE 9 9459 - 10361 830 300 aa, chain + ## HITS:1 COG:BS_yqfS KEGG:ns NR:ns ## COG: BS_yqfS COG0648 # Protein_GI_number: 16079568 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Bacillus subtilis # 1 297 2 296 297 394 62.0 1e-109 MKLGSHVGMSGKEMLLGSAKEAVSYGANTFMFYTGAPQNTKRKEISELNIEPAWAYMKEH NINEIVVHAPYIINLANTVKPETFELAVDFLRLELERASACKSNTLILHPGAHVGAGTDA GIEQIAKGINEVLTKDTTCNIALETMAGKGTEIGRTFEELAQIYDKVVYNDKLRVCFDTC HTNDSGYDIVNDFDGVIEKFDKLIGKDQIAVFHINDSKNVLGAKKDRHANLGFGEIGFDA ISYIVHHKDFEEIPKILETPYIPSPTKAKKSYAPYRYEIEMLRANAFQTNIIDTILQDNE >gi|330404459|gb|ADLB01000014.1| GENE 10 10477 - 10845 439 122 aa, chain + ## HITS:1 COG:SP1714 KEGG:ns NR:ns ## COG: SP1714 COG1725 # Protein_GI_number: 15901548 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 119 1 119 121 122 47.0 2e-28 MAWDLDNDRPIYLQIMERISHDIIAGTYHAGDKLPSVRDLALEAAVNPNTMQKALSELER QGLVYSQRTSGRFITEDTKMLNKLKADMAEEHIREFLEKMKHLGFPKEEILALIEQTMKE ER >gi|330404459|gb|ADLB01000014.1| GENE 11 10846 - 11544 229 232 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 1 222 1 231 305 92 26 4e-18 MKPIIECRSLTKKYGDFYALDNLNLTLSRGEIIGLLGPNGSGKTTLIKLLNGLLTLTSGE IYIDGEPLGIKTKKIVAYLPERTYLNAHRKVKDIISYFEDFYDNFDSERAYRMLKHLEIN PDARLKSLSKGTKEKVQLILVMSRDADLYILDEPIGAVDPAARDYILNTILTNYNENATI LLSTHLIHDIENILDRVIFIQNGHVILNSTVDEIRTEKRQSVDTLFREVFKC 
>gi|330404459|gb|ADLB01000014.1| GENE 12 11538 - 12335 502 265 aa, chain + ## HITS:1 COG:no KEGG:DSY1434 NR:ns ## KEGG: DSY1434 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 265 1 265 265 99 32.0 1e-19 MLTKLIKYDLKSLSRVLIPANILLLIYSFLARCVITSELYNDLPYFVIGLGITTYIILLV LINYITLFAVLYRFYKNLFTDEGYLTLTLPVTPSQHLLAKTISGTFWVVLSYAVITISIL MLVLVPDVIKHSDMIMSELTKSLEMPASFFFISTFMVGLISCFFALPFYYVCIALGQLFN KHRILAAVILFFALSSVVSVISLIILFVVGAFPLLFGPASNLSEMTNVSHLLSTSYGVSC IIMVIQGVLSYVVTLYIMKKKINLE >gi|330404459|gb|ADLB01000014.1| GENE 13 12486 - 13856 1595 456 aa, chain - ## HITS:1 COG:CAC2723 KEGG:ns NR:ns ## COG: CAC2723 COG0624 # Protein_GI_number: 15895980 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Clostridium acetobutylicum # 4 455 2 465 465 342 40.0 8e-94 MRNEKLDERMLELKEDIFASIRESVAIESVKSEAKEGAPYGEGPKAALDHLLALGEKLGF RTGNVDNRVGWIEYGEGEEMVGVLGHVDIVPLGEGWDYDPLGCEVHDGKMYGRGVLDDKG PTIGAIYAMKAIKDLGLPIDRRIRVMIGTDEENGSSCVQHYIKVGGEKPTIGFTPDAEYP VIFCEKGQIFWEVTKKVENPSTVKVISITGGTAKNVVTPKCTMVVEGDFDFPATDHISVT KEDGKTVIVSSGRGAHGSLPHLGQNAAIQLFTALKENGIDLGGDLQKMIDFVLDKINTET KGETLGVYANDEETGETSVCFGVVNCTEDKLFFTLDVRYPNNADNVKITETIKEKAKEYG LDAEVEQTGKLLYVSKESELVQKLMGVYREYTGSTEEPIAIGGGTYAKAFDNMVAFGPIF PGDDDVIHQPNEYAEIDKLMKSFQIVATAMYELAQR >gi|330404459|gb|ADLB01000014.1| GENE 14 14037 - 14117 83 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPSGGMSGGSNPPRCVSYHYFNKIFL >gi|330404459|gb|ADLB01000014.1| GENE 15 14212 - 15186 789 324 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2931 NR:ns ## KEGG: Cphy_2931 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 34 317 1 295 306 162 31.0 1e-38 MKTHNILFIQINDAIRIRFTMMIIEDGLQGGKMITYCQFIAAVKQKVESNINRNVRVEVH ITLKNNGKERRGLVFIEEGINISPTIYLEEFYQNFREGKALDEIIKDILEVYANSKFSKK WETEKLRVFENVRHNVLCKMINREKNREFLKETPYIPFFDLAIVCYVLIELNEHGIATMP 
VKKAQLEMWGIKENELFQVAKRNVQKQFPAELRQMKDVIAEMIGMEVQDAEDDFMYVLSN EMRSFGAVCITYDGVPELVGIELEENYYIIPSSVHEMIIVPESKAPSREEMERMVTEINE TQVEEEEVLSNRVYFYDIRAKKMS >gi|330404459|gb|ADLB01000014.1| GENE 16 15295 - 16062 658 255 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291541113|emb|CBL14224.1| ## NR: gi|291541113|emb|CBL14224.1| Beta-lactamase inhibitor (BLIP). [Roseburia intestinalis XB6B4] # 85 243 74 250 259 81 27.0 5e-14 MKCSRCGSEMKIKSVKVDTDIHGNPIYNKYAYCYTCKIKRNLGQLKAPAKRSHKRKRKKG KLVVALILILLLIIGIGTFFFIKNLKEKQSQNESTVVTPTGNNKISTDSYEQLETGMPYD EVKSLIGNAGNKLLQVTSDENSAERYQWITKDGEGTVLLSFEDGNLISISQAGMETGGSV SLSDEVKKEIKSDMSYKEVTSVLGEKGVLLSETLQNGTTSKLYEWKDSNSGKSFSIVFVE DKLRSYNFNDKKTEQ >gi|330404459|gb|ADLB01000014.1| GENE 17 16109 - 16666 654 185 aa, chain - ## HITS:1 COG:CAC2094 KEGG:ns NR:ns ## COG: CAC2094 COG0231 # Protein_GI_number: 15895364 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Clostridium acetobutylicum # 1 185 1 185 186 218 57.0 4e-57 MISAGDFRNGVTLEIEGNIYQILEFQHVKPGKGAAFVRTKLKNIISGGIVEKTFRPTEKF PKAHIERKDMQYLYSDGELYHFMDVETYDQIALNEDTIGDALKFVKENEMVKICSHNGNV FAVEPPLFVELQITDTEPGFKGDTAQGATKPATVETGAVVYVPLFVEQGDVLKIDTRTGE YLSRV >gi|330404459|gb|ADLB01000014.1| GENE 18 16760 - 17260 576 166 aa, chain - ## HITS:1 COG:BH1322 KEGG:ns NR:ns ## COG: BH1322 COG2179 # Protein_GI_number: 15613885 # Func_class: R General function prediction only # Function: Predicted hydrolase of the HAD superfamily # Organism: Bacillus halodurans # 1 163 1 164 171 117 35.0 9e-27 MFKKFFPDNYEASTYIIPFEDLYKEGYRGLIFDIDNTLVPHGAPADERAKKLFARLQEIG FQCCLLSNNKEGRVKMFNEEIGVNYIYDAHKPSTKNYKKAMEIMGTDLDNTIFIGDQLFT DVYGAKRTGIRNILVKPIHPKEEIQIILKRYLEKIVLYFYKKERKK >gi|330404459|gb|ADLB01000014.1| GENE 19 17263 - 18231 844 322 aa, chain - ## HITS:1 COG:BH1857 KEGG:ns NR:ns ## COG: BH1857 COG0524 # Protein_GI_number: 15614420 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar 
kinases, ribokinase family # Organism: Bacillus halodurans # 6 320 5 317 319 225 39.0 9e-59 MKTYDVIALGELLIDFTMNGENERGNSMFEACPGGAPCNVLAMLNKLGRKTSFLGKVGKD AFGIQLRKTLETAGIDTSKLYEDREVHTTLAFVHTLPDGDREFSFYRNPGADMMLVEEEV TEEYIRQARVIHFGTISMTHDGVRNATKKAVELAKKNGLLITFDPNLRPPLWESLESAKE QMEYGFSQCDVLKISDNELQFASGKEDYDEGICILQEKYKIPLIFLTLGKEGSRAYYKGM CVEEKGYNVQAVDTTGAGDTFCGSVIHAVLKYGLETLTEEKLREILAFANSAGALVTTKK GALCSMPKQEEIMELMKKEKRI >gi|330404459|gb|ADLB01000014.1| GENE 20 18259 - 18933 497 224 aa, chain - ## HITS:1 COG:lin2374 KEGG:ns NR:ns ## COG: lin2374 COG5632 # Protein_GI_number: 16801437 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Listeria innocua # 70 224 9 177 316 104 35.0 2e-22 MKKKRMTKRRRKRILQLYTRMGILLMSVILIGLSVMKIFHFGVFSSEAKYMEESGADIDK SKPDIDVQLLTKNPYSRPGTNTKRITGIVIHYTANPGSTAMQNRNYFEGLKDSHQTKASS HFVVGIDGGIVQCVPTWEEAYASNTRNEDTVSIETCHKEADGKYTRQTYKSMVQLTAWLC KKFDLTENDVIRHYDITGKICPKYFVEDEKAWEQFKKDVGNALR >gi|330404459|gb|ADLB01000014.1| GENE 21 19100 - 20554 2027 484 aa, chain - ## HITS:1 COG:CAC2701_3 KEGG:ns NR:ns ## COG: CAC2701_3 COG0516 # Protein_GI_number: 15895958 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Clostridium acetobutylicum # 205 484 1 280 280 381 67.0 1e-105 MGKIIGEGITFDDVLLVPAYSEVIPNQVDLTTNLTNSIQLNIPMMSAGMDTVTEHRMAIA MARQGGIGIIHKNMSIEEQADEVDKVKRSENGVITDPFYLSPEHTLADANELMAKFRISG VPITEGKKLVGIITNRDLKFEEDFTKKIKESMTSEGLITAPEGITLEEAKQILAKARKEK LPIVDKDFNLKGLITIKDIEKQIKYPLSAKDSKGRLLCGAAVGITANCLERVDALVKAQV DVVVMDSAHGHSANVLKTVRMVKEKYPELQVIAGNVATGEATRALIEAGVDAVKVGIGPG SICTTRVVAGIGVPQISAIMDCYEVAKEYNIPIIADGGIKYSGDMTKAIAAGANVCMMGS IFAGCDESPGTFELFQGRKYKVYRGMGSISAMENGSKDRYFQADAKKLVPEGVEGRVAYK GTVEDTVFQLMGGLRAGMGYCGAPTVDTLKETGKFVKISAASLKESHPHDIHITKEAPNY SVDE >gi|330404459|gb|ADLB01000014.1| GENE 22 20687 - 21175 562 162 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3288 NR:ns ## KEGG: Cphy_3288 # Name: not_defined # Def: hypothetical protein # 
Organism: C.phytofermentans # Pathway: not_defined # 1 161 1 167 169 92 32.0 4e-18 MNDFYTEQLVKRKTTGKVMLAKAGLIILTLVSLILLLKTPFALILTMVLIALDIFLFRNM DLEYEYLFVNGELDVDKIIARSKRKKVFSANVEELELLAPTGSSELRLVQAEKTYNYSSM AEGRKTYELIVSQKGQKIKVIFEPNDTILNGFKTLAPRKVII >gi|330404459|gb|ADLB01000014.1| GENE 23 21293 - 22915 1586 540 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 540 3 543 547 615 58 1e-175 MAKEIKYGAEARAALESGVNQLANTVRVTLGPKGRNVVLDKSFGAPLITNDGVTIAKEIE LEDAFENMGAQLIKEVAAKTNDVAGDGTTTATVLAQAMVHEGIKNLAAGANPIVLRKGMK KATETAVEAIASMSSKVESKDRIANVASISAGDTEVGAMVADAMEKVSNDGVITIEESKT MKTELDLVEGMQFDRGYISAYMATDMEKMEANLEDPYILITDKKISNIQELLPLLEQIVQ SGARLLIVAEDIEGEALTTLIVNKLRGTFNVVAVKAPGYGDRRKAMLEDLAILTGGQVIS EELGLDLKETTLEQLGRAKSVKVQKENTVIVDGLGDKNAINARVAQIKAQIEETTSEFDK EKLQERLAKLAGGVAVIRVGAATETEMKEAKLRMEDALSAARAAVEEGIVAGGGSAYIHA SKKVAKLAETLEGDEKTGANIILKALEAPLFHISANAGLEGSVIINKVKESQVGIGFDAY NEEYVDMVEAGILDPAKVTRSALQNATSVASTLLTTESVVANIKEDVPPMPAGAGGMGMM >gi|330404459|gb|ADLB01000014.1| GENE 24 22946 - 23230 506 94 aa, chain - ## HITS:1 COG:CAC2704 KEGG:ns NR:ns ## COG: CAC2704 COG0234 # Protein_GI_number: 15895961 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Clostridium acetobutylicum # 1 94 1 94 95 94 69.0 3e-20 MKLVPLGDRVVLKQLIAEETTKSGIVLPGQTKEKPQQAEVVAVGPGGNVDGKEVVMQVAV GDKVIYSKYSGTEVELDDEEYIVVRQNDILAVIK >gi|330404459|gb|ADLB01000014.1| GENE 25 23311 - 23637 269 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610051|ref|ZP_03288230.1| ## NR: gi|210610051|ref|ZP_03288230.1| hypothetical protein CLONEX_00416 [Clostridium nexile DSM 1787] # 2 108 3 109 109 96 50.0 4e-19 MEISKAKIKMKKSLLANGTLDEKFYDCDVIRYEKKEESIYLLLRDENLMDISLDGIYECI IYRGSVGIKCEGRIVERYVNQDEKIVRFQVEKGFYKINIKYVDKEETE >gi|330404459|gb|ADLB01000014.1| GENE 26 23708 - 24490 1112 260 aa, chain - ## HITS:1 COG:CAC1786 KEGG:ns NR:ns ## COG: CAC1786 COG4465 # 
Protein_GI_number: 15895062 # Func_class: K Transcription # Function: Pleiotropic transcriptional repressor # Organism: Clostridium acetobutylicum # 5 255 4 258 258 206 49.0 4e-53 MSVQLLDKTRKINKLLHNNDSSKVVFNDICEVLTEILDSTVLVISKKGKVLGVSECDKVE PINELVQEEIGAYIDTLLNDRLLNILSTKENVNLQTLGFTPEVVKGYQAMITPINTAGER LGTLFMYKKDALYEIDDIILSEYGTTVVGLEMLRSVNEESAAETRKEQIVKSAISTLSFS ELEAIVHIFDELEGTEGILVASKIADRVGITRSVIVNALRKFESAGVIESRSSGMKGTYI KVINDYVFEELDRIRKNRSR >gi|330404459|gb|ADLB01000014.1| GENE 27 24632 - 26698 1933 688 aa, chain - ## HITS:1 COG:BH2467_1 KEGG:ns NR:ns ## COG: BH2467_1 COG0550 # Protein_GI_number: 15615030 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Bacillus halodurans # 1 551 1 550 550 581 55.0 1e-165 MARNLVIVESPAKVKTIKKFLGSSYVVMASNGHVRDFPKSQFGIDVENDFEPKYITIRGK GEILANLRKEVKKADKVYLATDPDREGEAIAWHLYVALKLEGKKVYRISFNEITKNAVKS AIKEAREINMDLVDSQQARRALDRIVGYRISPLLWAKVKRGLSAGRVQSVALRIIADREE EIDAFIPEEYWTIDTVLKVEGEKKPLVAKFYGTEKQKMTIRSKEELDNILGQIENEEYKV TEVKKSERLKKAPLPFTTSTLQQEASKALNFATQKTMRIAQQLYEGIDIKGQGTVGIITY LRTDSTRVSDEAESSVREYIKDIYGDGFVSETEPKKETGKKIQDAHEAIRPTDVTRTPAS LKESLTREQFRLYQLIWKRFVASRMKPAKYETTSVSIGAGEYRFHVSASKIVFEGFRSVY IESDEEKAENNVLVKSITTDTKLTKEDIETKQHFTQPPAHYTEASLVKILEELGIGRPST YAPTISTIIARRYVSKEQKNLYLTEIGEVVNNMMKTSFPSIVDVNFTANMEGLLDRVEEG AVEWKSVIRNFYPDLEEAVEKAEKELESVKIEDEVTDVICEECGRNMVIKYGPHGRFLAC PGFPECRNTKPYLEKIGVPCPKCGKDIVLRKTKKGRKYYGCEDNPDCDFMSWQKPSKEKC PKCNGYMVEKGNKLLCGDKECGYITDKK >gi|330404459|gb|ADLB01000014.1| GENE 28 26733 - 27818 872 361 aa, chain - ## HITS:1 COG:STM3405 KEGG:ns NR:ns ## COG: STM3405 COG0758 # Protein_GI_number: 16766694 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Salmonella typhimurium LT2 # 64 356 60 358 374 197 38.0 2e-50 MEREEKYEYWFANIKGLTGRNKRQIREKIQSVEAFYHMKEDERKNYVEEEASTYIRKSIE 
EWKVDEQYNLLKEKKIHFLPFFHEEYPDMLKNLDNPPYAIYLKGKMIRKDTLKVAIVGAR KCSPYGESMAIRFAERLAEQGIEIISGLARGVDGAGQRGALNVGGCSYGVLGSGVDVCYP RENIGLYEDLQEKGGILSEQPLGTAPLSRNFPARNRIISALADIVIVIEAKEKSGSLITA DMALEMGKEVYALPGPVNSELSKGCNMLIRQGAGILLSPQDLLDELRISTKEDVKKKLKS KISLETEENMVYSCLDLYPKNVNQLMMETKMEIPQLINQLVSLEMQGYIREISKNYYVKS N >gi|330404459|gb|ADLB01000014.1| GENE 29 27821 - 29329 865 502 aa, chain - ## HITS:1 COG:slr0904 KEGG:ns NR:ns ## COG: slr0904 COG0606 # Protein_GI_number: 16331658 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Synechocystis # 1 500 1 507 509 447 42.0 1e-125 MYSTVLSAAIHGLEVKCIHVEADVSNGLPLFHMVGYLSSEVKEAGERVRTAIRNSGFLLP SKKIVINLSPADVRKRGSAFDLPIAISILSSLGYIPHKKLQQTLFIGELGLDGKVQKIAG VLPIIIEAKQQGYKACIVPKGNGLEGSIIEGIEVYGVSTLKEAADYLNGNIKLIPEKKQL LAKEETREGDFSEVRGQAMLKRATEIAVAGNHNILFIGSAGIGKTMIASRIPTICPPMTE EECIEVTKVYSILGMIDENHPIIRERPFRSPHHTSTKASLIGGGNIAMPGEISMANHGVL FLDELSQFQKSVLDALRQPMEDRVIRLSRKSGVYVFPSNFMLVGSCNPCPCGNYPDLNKC TCTPGQIQQYYNHLSQPFLDRFDLSIEVSKIQYDDLEGDREEETSEVIRGRVCEARGRQY KRYGKTKTNAELSSTEVETYCKLGQKEKKFMGQIFEQMELTARTYHKVLKVARTIADLAE EEKIQLSHLREAVGYRMFDKRR >gi|330404459|gb|ADLB01000014.1| GENE 30 29495 - 30532 905 345 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|166031604|ref|ZP_02234433.1| ## NR: gi|166031604|ref|ZP_02234433.1| hypothetical protein DORFOR_01304 [Dorea formicigenerans ATCC 27755] # 6 275 32 294 400 90 28.0 2e-16 MYEEWNEDEDIFEKLEEESKKTKINGKKLIIAEFIGVFILLCVVLGTYFGLNNTKRIAKE YMRDVLQGNWNEVYDYLYFKKDDNTFLNKQMFISAQELNDEARTLQTRIGDVREIGKGNN DYRQFEVSYRDRKEDKKMIVPMIRRNGEWYVDGNQQFVEEVVRLEVPKDTKVYLDKIPLD SSYQVKTEENIDFYEIPNLFRGVHYIALEKENMEKYEDLVRFEKDKPVQVTMQYGKDVLE KAAQQAQGEIRKEYEKATGNVTGDKEFRYLQLKNNQIKVKPSEENADLIEVTVESEYEYQ YKEKVFYFFHINKTDRGKCTSKCVFSYQSGMLKLENKEINTSFLK >gi|330404459|gb|ADLB01000014.1| GENE 31 30543 - 31115 685 190 aa, chain - ## HITS:1 COG:no KEGG:Shel_22530 NR:ns ## KEGG: Shel_22530 # Name: not_defined # Def: hypothetical protein # Organism: 
S.heliotrinireducens # Pathway: not_defined # 93 178 303 387 395 71 45.0 1e-11 MTKEKKSKKTRMIELILAASICILLVTEGVLLINQKKESKNEKPILTQEEAKKIDRKNME SAKDTVVEKEGADITIGTKDVAEVQETEQTRDYIIEDSYARALTMEELTLLTAEELQIAR NEILARHGRKFEDEKMQMYFESKSWYSGTMEAAEFDAYYDSILSETEKANVQLIQSMETM GETETTETEE >gi|330404459|gb|ADLB01000014.1| GENE 32 31252 - 31872 328 206 aa, chain + ## HITS:1 COG:CAC1689 KEGG:ns NR:ns ## COG: CAC1689 COG1191 # Protein_GI_number: 15894966 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Clostridium acetobutylicum # 3 201 26 224 234 225 59.0 6e-59 MKTFPEPLTASEERYYLQKYTEGDLEAKHILIERNLRLVAHIVKKYQHLDEDPEDLLSIG TIGLIKAVSTFNLEKGNRLATYAARCIENEILMMLRGKKKTSKEVSLYEPIGTDREGNEI QLFDVIESDEEDAPSKVALKDDIKMLYEKVSSELSPRERLVLKMRYGLYEGEEYTQREIA RQLGISRSYVSRIEKSAIEKLRTFFS >gi|330404459|gb|ADLB01000014.1| GENE 33 31890 - 33119 1510 409 aa, chain - ## HITS:1 COG:CAC1687 KEGG:ns NR:ns ## COG: CAC1687 COG0826 # Protein_GI_number: 15894964 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Clostridium acetobutylicum # 5 383 4 381 406 441 55.0 1e-123 MARHPELLIPASSLEVLKIAVIFGADAVYIGGEAFGLRAKAKNFSKKDMEEGIKFAHEHG VKVYVTANILAHNGDLEGAREYFRELGEMNPDALIIADPGMFMLAGEECPHIERHISTQA NNTNYETYRFWHKLGATRVVSARELSLEEIKEIRANIPDELEIETFVHGAMCISYSGRCL LSNYFTGRDANQGACTHPCRWKYSIVEETRPGEYMPVYENERGTYIFNSKDLCMIEHIPE LLEAGIDSLKIEGRMKTALYVATVARTYRKAIDDYQKDPELYKKNMPWYLEQISNCTYRQ FTTGFFFGKPDETTQIYDSNTYVKEYTYLGIIGEEREGTYRIEQRNKFSVGEKIEVMKPN GDNVEVTVKRILTEDGEEQESAPHPKQVLYVDLGIPVDQYDILRRQEEK >gi|330404459|gb|ADLB01000014.1| GENE 34 33135 - 33776 774 213 aa, chain - ## HITS:1 COG:SP0980 KEGG:ns NR:ns ## COG: SP0980 COG4122 # Protein_GI_number: 15900857 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 2 212 17 226 237 157 39.0 2e-38 MIVDERMVTFINSLETENSEILETIEKEALETFVPIIRKEMQSFMKVLLAIQKPLNILEV 
GTAVGFSALLMSEYAPEECRITTIEKYEKRIPIAKENFRRAGKEDKITLLEGDALEILKG LEEKYDFIFMDAAKAQYINYMPEVIRLLEKGGVLVSDNVLQDGDIIESRFAVERRNRTIH SRMREYLYRLKHEEQLLTSIIPLGDGVAISTKK >gi|330404459|gb|ADLB01000014.1| GENE 35 33833 - 35500 1745 555 aa, chain - ## HITS:1 COG:CAC1683 KEGG:ns NR:ns ## COG: CAC1683 COG0595 # Protein_GI_number: 15894960 # Func_class: R General function prediction only # Function: Predicted hydrolase of the metallo-beta-lactamase superfamily # Organism: Clostridium acetobutylicum # 1 555 1 555 555 655 58.0 0 MKKENNSKLKIIALGGLEQIGMNITAFEYEDSIIVVDCGLAFPEDDMLGIDLVIPDITYL KDNIQKVKGFVITHGHEDHIGALSYVLKDLNFPIYATKLTMGIIENKLKEHNLLRSTRRK VVRHGQSINLGQFRIEFIKTNHSIADASALAIYSPAGIVVHTGDFKVDYTPVFGDAIDLQ RFAELGKKGVLALMSDSTNAERPGFTMSERTVGKTFDHIFAEHRNTRIIIATFASNVDRV QQIINSAYKYDRKVVVEGRSMVNIIETASELGYLNIPDKTLITIDQLKNYPDEKTVLITT GSQGESMAALSRMAADIHKKVSIKPGDTVIFSSSPIPGNEKAVSRVINELSQKGATVIFQ DTHVSGHACQEEIKLIYSLVKPKYAIPVHGEYRHLRANAGIAESLGIPKENIFLLQSGDV LALDGKGAEVVDKVHTGAILVDGLGVGDVGNIVLRDRQHLAEDGILIVVLTLEKGTNQVL AGPDIVSRGFVYVRESEGLMEEARQILSEALENCLMQNKNADWSRIKLVIRDTMNEFIWK RTKRRPMILPIIMDV >gi|330404459|gb|ADLB01000014.1| GENE 36 35514 - 35990 430 158 aa, chain - ## HITS:1 COG:CAC1682 KEGG:ns NR:ns ## COG: CAC1682 COG0735 # Protein_GI_number: 15894959 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Clostridium acetobutylicum # 6 149 10 150 151 125 47.0 3e-29 MSVNQEKFKAMLKEKGLKVTNQRLVVLKVLAEHKDRHMTAEEIYDLVRNEFRDIGIATVY RTVQLLLEMKLIDRIELNDGCVRYEIGHQFFGDTKHYHHHLICKECGKIIPFDDDLLEDL EKHIEKTLGFCVLDHELKLYGKCKDCIEKEKNVRKHIE >gi|330404459|gb|ADLB01000014.1| GENE 37 36014 - 36274 388 86 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255282414|ref|ZP_05346969.1| ## NR: gi|255282414|ref|ZP_05346969.1| conserved hypothetical protein [Bryantella formatexigens DSM 14469] # 1 86 19 104 104 82 58.0 7e-15 MEKIQFMFDDGTESADFFVLEETKVNGVSYILVTDSEEDDAECMILKDTSKPEESESVYQ VVEDDTELEAVLKIFEELLEDIDIEM >gi|330404459|gb|ADLB01000014.1| GENE 38 36296 - 36724 
657 142 aa, chain - ## HITS:1 COG:lin1537 KEGG:ns NR:ns ## COG: lin1537 COG0816 # Protein_GI_number: 16800605 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) # Organism: Listeria innocua # 1 137 1 136 138 150 60.0 8e-37 MRIMGLDYGSKTVGVAISDALCLTAQGIETIQRKEENKLRKTLARIETLIKEYEVDKIVL GFPKHMNNDIGDRAEKSLAFKDMLERRTGLEVIMWDERLTTVEAERTLIESNVRREERKK YVDKIAAIFILQGYLDSVYLKK >gi|330404459|gb|ADLB01000014.1| GENE 39 36729 - 36992 394 87 aa, chain - ## HITS:1 COG:CAC1679 KEGG:ns NR:ns ## COG: CAC1679 COG4472 # Protein_GI_number: 15894956 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 5 81 7 83 86 91 55.0 3e-19 MKDLNSTQFFQVEPAPQIQAKDILEIVYQALREKGYNPVNQIVGYIMSGDPTYITSHNGA RSLIMKMERDELVEEMLKTYIEHHSWE >gi|330404459|gb|ADLB01000014.1| GENE 40 37049 - 38341 938 430 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 1 422 1 423 451 365 42 1e-100 MKKVALHNLGCKVNAYETEAMQELLEKEGYEIVPFKEGADIYIINTCTVTNMADRKSRQM LHRAKKMNPNAIVVAAGCYVQAKSESKETDESIDIIIGNNKKQDLISILKEYQEKHDGIQ KEIIDINHTKEYEELHLSKTAEHTRAYLKVQDGCNQFCTYCIIPYARGRVRSREKENVVA EVKQLVANGYQEVVLTGIHLSSYGVDLQGEDLLSLILAVNEIEGLKRIRLGSLEPRIITE EFAKTISGLEKICPHFHLSLQSGCNGTLKRMNRRYTAEEYFEKCELLRKYFDNPALTTDV IVGFPGETEEEFEESRAFVEKVNFYETHIFKYSRREGTKAAVMENQVPEQIKTKRSNILL ELDERKRKEYEEKFIGKTVEVLMEEEVEKEGKRYQTGHTKEYIKVALESDENMQNQLVKI KIDNHSQIIR >gi|330404459|gb|ADLB01000014.1| GENE 41 38473 - 38703 354 76 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01159 NR:ns ## KEGG: EUBELI_01159 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 76 1 76 77 108 88.0 5e-23 MKTVQISLNSIDKVKSFVNEITKFDYDFDLVSGRYVIDAKSIMGIFSLDLSKPIDLNIHA EDDAETVLEVLKPYIV >gi|330404459|gb|ADLB01000014.1| GENE 42 38762 - 39946 1397 394 aa, chain - ## HITS:1 COG:CAC2971 KEGG:ns NR:ns ## COG: CAC2971 COG0301 # Protein_GI_number: 15896224 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine biosynthesis ATP pyrophosphatase # Organism: Clostridium acetobutylicum # 6 393 5 382 384 345 49.0 7e-95 MKFQSFLIKYGEIGIKGKNRYIFEDALMRQIRFALKDVDGIFNVHKSQGRVYVDCEGDFD YDEAVAGLKRVFGIVGICPVIHVEDKGFEELKKVVVNYMDEMYPDKNTTFKVEARRGKKS YPKNSMEINCEIGEAILEAFPEIKVDVHKPAIKLNIEVREEIYIYSEIIPGPGGMPVGTN GKAMLLLSGGIDSPVAGYMISKRGVGLEATYFHAPPYTSERAKQKVIDLAKLVSKYSGPI KLNIVNFTDIQLYIYDKCPHDELTIIMRRYMMKIAEHFAKESGCLGLITGESIGQVASQT MQSLMATNDVCTLPVYRPVIGFDKQEIVDIAEKIDTYETSILPFEDCCTIFVAKHPVTKP NLNIIRRSEENLAEKIDELFEQAIQTVETIVVKP >gi|330404459|gb|ADLB01000014.1| GENE 43 39984 - 41141 1256 385 aa, chain - ## HITS:1 COG:CAC2972 KEGG:ns NR:ns ## COG: CAC2972 COG1104 # Protein_GI_number: 15896225 # Func_class: E Amino acid transport and metabolism # Function: Cysteine sulfinate desulfinase/cysteine desulfurase and related enzymes # Organism: Clostridium acetobutylicum # 1 381 1 376 379 316 45.0 4e-86 MEVYLDNSATTKCYDSVKDIVQKVMCEDYGNPSSMHKKGVEAERYIKEAKEILAKLLKVQ 
EKEIFFTSGGTESDNLALIGAARANHRAGKHLITSSIEHPAILNTMQYLEQEEGFRVTYL PVDSEGKIRLDALREALCEETILVSVMYVNNEVGSVQPIEEAAKLVKDYNKNILFHVDAV QGFGKYKIYPKKLGVDLLSVSGHKIHGPKGIGAIYINEKAKVRPIIFGGEQQKNIRSGTE NVPGIAGLGVAAREIYTDLDKKVAHMRELKQRFIDGIMKIEDTTIHGKYDDTSAPHIISV GIAGIRSEVLLHTLEEKGIYVSSGSACASNHPAISGVLKGIGAKNEFLDATIRFSLSEFT TEEEIDYTLETLYNCIPMLRRYTRH >gi|330404459|gb|ADLB01000014.1| GENE 44 41185 - 41922 781 245 aa, chain - ## HITS:1 COG:CAC1285 KEGG:ns NR:ns ## COG: CAC1285 COG1385 # Protein_GI_number: 15894567 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 243 1 243 250 197 48.0 1e-50 MHHFFVTPNQVSGEEIYIEGSDVNHICNVLRMKKGEKLQISDGNNKKYICRIEDMTAEKV LLQIVEEKLGDTELSSKIYLFQGLPKSDKMEWIVQKAVELGAYEIIPVSTKRAVVKLDAK KAAKKVERWNSIAEGGAKQSGRTVIPKVREVMTYREALAYAETLDIVLVPYELAEGMDKT KEFIGQIEKGQSVGIFIGPEGGFEKEEVEQAMEMGAEPITLGKRILRTETAGLTVLSILM YHLES >gi|330404459|gb|ADLB01000014.1| GENE 45 41933 - 42886 999 317 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145923|ref|ZP_04744524.1| ribosomal protein L11 methyltransferase [Roseburia intestinalis L1-82] # 1 316 1 326 327 389 56 1e-107 MKWNKFRLKTTTEAEDIVSSMLMDLGIEGVEIEDKVPLTQADKEQMFVDILPEIEADDGV AYLSFYLDDGEDTDTVLANVKKELEEMKSFLDIGECAIEESETEDVDWVNNWKKYFHQFY VDDVLIIPSWEEVKPEDEDKMIIHIDPGTAFGTGMHETTQLCIRQIRKYVTPQTRILDVG CGSGILGMLALKFGAEYSVGTDLDPCAIDATYENMEVNGIEKSQYEVMIGNIIDDKEVQD KVGYEKYDIVVANILADVLVPLTPVIVHQLKKGGIYITSGIIDMKEETVVNAVKEAGLEV LEVTYQGEWVSVTARKN >gi|330404459|gb|ADLB01000014.1| GENE 46 42976 - 43536 686 186 aa, chain - ## HITS:1 COG:Cj0167c KEGG:ns NR:ns ## COG: Cj0167c COG1971 # Protein_GI_number: 15791554 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Campylobacter jejuni # 1 185 1 184 187 157 54.0 1e-38 MGLIELICIAVGLSMDAFAVAICKGLSLRKCTWQKQGIVGLYFGVFQAGMPLLGYLLGMQ FKEMITSIDHWIAFVLLGIIGANMIKEGFSKEEISDEKTDRLSVKEMLGLAVATSIDALA VGVTFAFLQVEIIPAVCIIGITTFILSAAGVKIGNIFGTRYKSKAEIAGGVILIFMGVKI LVEHLM >gi|330404459|gb|ADLB01000014.1| 
GENE 47 43536 - 44696 1212 386 aa, chain - ## HITS:1 COG:FN0625 KEGG:ns NR:ns ## COG: FN0625 COG1168 # Protein_GI_number: 19703960 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Fusobacterium nucleatum # 8 381 13 387 398 357 43.0 3e-98 MEKIIYQDRKKTNCEKWDGIFQKFGKEDLLPLWVADMDFQAPECVGKALQEYAKKGVFGY YKVPESFFEAFINWEKERHGYHVEKEWIRFAPGVVPALFWLVQCFTEKEDSVMISTPAYP PFFHCIEDNGRKVTDVPLKCTDGHYEMDYDKMEKIMREEQVKMYILCSPHNPIGRVWKEE ELRRLISLCEKYHVLLVADEIHQDIILKGNRQIPAGSIGQYTDKIVALTAASKTFNLADC QNAFILLADEKLREEFDDFAEKIHVMQGSGFGYIAVQNAYEGGLDWLNGVLKIIEENYVY ARHFIERELPEAVVSELQGTYLMWIDLKPYLQDKDVVQTLQEKCHLAIDYGDWFGGEDWK YYIRVNLATKTENIAQAMENLKRGLE >gi|330404459|gb|ADLB01000014.1| GENE 48 44700 - 45911 1632 403 aa, chain - ## HITS:1 COG:FN1147 KEGG:ns NR:ns ## COG: FN1147 COG3681 # Protein_GI_number: 19704482 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 2 400 5 409 411 342 46.0 1e-93 MGCTEPIALAYGAAKAREVLGALPDKVKIEASGSIIKNVKSVIVPNTDHLKGIPAAATAG IIAGKADKKLEVIAEVSKEEIQKMREFMENTEITVEHINSGITFDIVITVYHGDSYAKVR IANYHTNIVLVEKDGEVLESTVVEGEKEEGLTDRSLLNVKDIVDFADSVDVEEIKEVLDR QIKYNTAISAEGLRGDYGANIGSVLLETYGDDIRTRAKAAAAAGSDARMNGCELPVIINS GSGNQGMTASLPVIEYAKELKVTDEKLYRALALSNLVTIHQKTGIGRLSAYCGAISAGAG AGAGIAYLLGGTYEEIIHTVVNALAIVSGIVCDGAKASCAAKIASAVDAGILGYNMYKCG QQFYEGDGIVTKGVEATIKNIGRLGKEGMKETNEEIIRIMVGE >gi|330404459|gb|ADLB01000014.1| GENE 49 46049 - 47221 789 390 aa, chain - ## HITS:1 COG:no KEGG:Cbei_2712 NR:ns ## KEGG: Cbei_2712 # Name: not_defined # Def: FliB family protein # Organism: C.beijerinckii # Pathway: not_defined # 7 386 1 384 386 239 34.0 1e-61 MDKRGKMRYTVPHYYCKFQCTASECQDTCCAGWKIVIDEKTLKKYREVKGPFGNRLQNSI DWEEETFLQYDGRCAFLNEENLCDIYTEAGEKMFCKTCRLYPRHIEEFQGEREISLSLSC MEAGKLILGCKESVKFVSLEKDVREKEDETFDFLLYGKLSDTREVIFRILQNRTLPIKLR LSMVLALAHDMEIKVAKQQLFQVDDVLRRYEDESAPERFAKKLGNSISGSRQRKKLMQEM 
FKGFEQFEVLNKTWPAYLKELKKTIFNQSEEMYEENRRHFTTEVLEKEVWTEQLMVYFVF TYFCGAVYDGEIYAKMQMAVASTLLIEELAMAVWQQNHCILTFSAFVDIAHRYSREMEHS DINLNRIEETVKQKECFQLKNLLKVINEIQ >gi|330404459|gb|ADLB01000014.1| GENE 50 47214 - 49940 2452 908 aa, chain - ## HITS:1 COG:VCA0521 KEGG:ns NR:ns ## COG: VCA0521 COG0419 # Protein_GI_number: 15601281 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Vibrio cholerae # 1 907 1 1012 1013 259 28.0 2e-68 MKPLQLKLSAFGPYADETVIDFTLLGEKGLYLITGDTGAGKTTIFDAITFALYGEASGNN RGADMLRSQYANPGTPTFVEMRFLYRQEEYEIRRNPEYIRPAKKGGGMTTEKADAMLSYP DGRIVTKMKEVTKAIIELIGLDRNQFTQIAMIAQGDFLKLLFAKTEERSKIFREIFDTKK YQILQDRLKAEKNSLDKEYQDISKSIQQYMEGIQGEKEEQVTAEVRLGNLSALLEEENNE IARLTEELAHMENELEKLNRQIGKAETEEKNRKELQKLEENLQFQEERLAQAKQKLSEET KREPEREKLRAEIILSEEKLPLYDEADQVKAEYNSIQVRLQSVKNELAQSKEREEKYNQN VVLMKEQLEKLASADTEKVRLENERAVLEENKNKVRLLKKMLVEYKGISKRLTELQQQYQ TAMLKGQKKQQEYEKLERNFLDSQAGLLAQKLKEGEPCLVCGAIHHLNPAKITRRVCTEE ELREVKRQSTILMEKATMLSVEAGKAKGELDTARRNIEERADEIFGKRVESIYTELENRI GELAKADEEMRKKWSVLEKKAEEKIRYEQQLPQIEKLQKEEEQKGKELEHLQVKCHTECD TLQKQMMKLKESLLFSSKAEILDDIELKKRTRKSMEQAYESARQETEKIAGQIKEDCARI STLKEQLEKVNENSLEELMKKQRELQKERTTLLADRENLVSMYKSNGQIKEAVERQYGAM KTIEKKQVLVKALSDTANGNVAGKDKVMLETYIQISYFHRIIARANTRFMVMSGGQYELK RREETKDQRSQTGLELDVIDHYNGSVRNVKTLSGGEAFQASLSLALGLSDEIQSLAGGIQ LDTMFIDEGFGSLDEEALEQAMKALYRLADGNRLVGIISHVSELKERIDKQIIVKKEKSQ GSSVRVIG >gi|330404459|gb|ADLB01000014.1| GENE 51 49944 - 51017 754 357 aa, chain - ## HITS:1 COG:lin1687 KEGG:ns NR:ns ## COG: lin1687 COG0420 # Protein_GI_number: 16800755 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Listeria innocua # 1 357 20 371 374 215 33.0 1e-55 MIEDQKYILRQVIELVQEERPDGVLLAGDIYDKPVPSAEAVQVFDEFLNGLAEQKVSIFI ISGNHDSPERLAFGGNLLKNTGVYVSPVFREIPKAIAMYDEYGEVNVYLLPFIKPAYVKQ VMPKENVETYQEAVQTIIEHMEIDNEKRNIILAHQFVTGAARCDSEEVSVGGLDNIDASV FAPFDYVALGHIHGRQSISREEVRYCGTLLKYSFSESRQKKTVTIVEMKEKGNVAIREIS 
LSPKHDMREIKGTYGEVTAREFYQGSATDDYLHITLTDEEDILDAINRLRVIYPNIMKMD YDNTRTRNKQCVGEIEQVEKKTPSELFGELYLLQNNQEMSEEQNVFLKNLIEKIWEE >gi|330404459|gb|ADLB01000014.1| GENE 52 51089 - 51616 438 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210611502|ref|ZP_03288922.1| ## NR: gi|210611502|ref|ZP_03288922.1| hypothetical protein CLONEX_01112 [Clostridium nexile DSM 1787] # 6 174 4 172 172 158 46.0 1e-37 MEFEELRKEHKIFAYHISPSYGVEGDEGGFSIEIYGNGNLRYCIYKLFEEINTLEMFKLT KEQVYQLFRIIQENEKKLESIPTYLEDGKEHKNCNEFEFLGRGKIYTGDVSKTFLPLERL KNKEYYKAYKETMKWNNLLYDIFEKVSACLKKSGILLSPESCELSEDCKIRVTWK >gi|330404459|gb|ADLB01000014.1| GENE 53 51640 - 52689 1077 349 aa, chain - ## HITS:1 COG:FN1711 KEGG:ns NR:ns ## COG: FN1711 COG0275 # Protein_GI_number: 19705032 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Fusobacterium nucleatum # 52 314 9 262 314 163 40.0 6e-40 MENQEKQHKRRVRYKGTHPRNYKERYKELQPEKYPETIEKVIRKGSTPVGMHISICVKEI LDFLQIQPGQRGLDATLGYGGHTLEMLKCLKGEGHLYALDVDTVESAKTKERLQKLGYGE EILTIKNINFRDIDKVCEEAGKFDFVLADLGVSSMQIDNPERGFTYKKEGPLDLRLNQKK GVSAAERLKEVTMEELEGMFVENADEPYAHEIAEKIFAKLRRGEAIDTTTKLQQIIEEAL KFIPEKERKEAVKKSCQRTFQALRIDVNSEFEVLYEFMEKLPNVLKEGGRVAILTFHSGE DRIVKRAFKELKRAGVYSDIANDVIRPSAEECNYNSRAKSTKMRWAIKA >gi|330404459|gb|ADLB01000014.1| GENE 54 52714 - 53124 404 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|239624705|ref|ZP_04667736.1| ## NR: gi|239624705|ref|ZP_04667736.1| conserved hypothetical protein [Clostridiales bacterium 1_7_47_FAA] # 1 124 1 126 204 153 64.0 4e-36 MNGKKIRIIKKNDEYSMEYQIGDIFTVDGTWYGGVNVRSASGVPLSLDKEEYEEVEERQA RKIDLYSYQLGVMDCLCEMVGEGIKPTAVSRKFDTEEERDSCEEEVKKLCDKYGILYRKE EKYFYIFFTDENKLKE >gi|330404459|gb|ADLB01000014.1| GENE 55 53096 - 53386 304 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|283797219|ref|ZP_06346372.1| ## NR: gi|283797219|ref|ZP_06346372.1| putative FMN-dependent (S)-2-hydroxy-acid oxidase [Clostridium sp. 
M62/1] # 1 85 1 85 91 67 43.0 3e-10 MIELKGMISIQLLKKEAFCGSDEKMRYRLYKEETDGEVKLAVAYWFTPYCWQTTPKEEKT IRYFEFNTEGIWQAVEWLNAISEERKNDEREEDSDN >gi|330404459|gb|ADLB01000014.1| GENE 56 53367 - 53486 107 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPFNSIMMSLPYPYHLNLCYSNNYDELYYRFEVCTTGGL >gi|330404459|gb|ADLB01000014.1| GENE 57 53624 - 54799 1026 391 aa, chain + ## HITS:1 COG:STM1992 KEGG:ns NR:ns ## COG: STM1992 COG0270 # Protein_GI_number: 16765328 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Salmonella typhimurium LT2 # 5 174 91 295 476 163 44.0 6e-40 MLKPFKFIDLFAGIGGFHQAMHNLGGECVFASEIDKYAIETYKTNYGVDAGINIRDVHEE DIPEHDVLCAGFPCQAFSKAGYQKGFEDETRGTLFFEIVRILKYHRTPYIILENVRNLTS HDHGNTWRVIKSALHELGYIITEEPIIISPHQLGVPQFRERVVILGIHSSLGIQNLNIEL PDKSKDDINFLTSGILETTDVDEKYYISSHEEKVLICWDEFIKGIKEKVIGFPIWFDEFG KTYDYTDLGYADWKVKFIQKNRQLYENNKEFIDSWREKWNNLEDFTMTEKKFEWQCGKDC SSVWEALIQFRPSGVRVKRPSVFPALVAMVQIPVIGSVRRRLTPREAARLQSFPDTFICN PNDRQAYKQFGNAVNVRVIQYMGEQLFHCKK >gi|330404459|gb|ADLB01000014.1| GENE 58 54816 - 55358 575 180 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 8 180 6 195 195 191 55.0 7e-49 MEKNKYNGTETEKNLRTAFSGESEARNKYTFFASVAKKEGYEQIAALFQKTADNEKEHAK MWFKELNGIGNTTHNLASAAEGENYEWTDMYAGFAETAEKEGFPELAAKFRLVAEIEKHH EERYRALLKNVETAKVFEKSEVKVWECRNCGHIVIGTSAPQICPVCAHPQSYFEVRAENY >gi|330404459|gb|ADLB01000014.1| GENE 59 55560 - 55646 174 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVQALDRKKPTMILKGMIEQAELKDKVD >gi|330404459|gb|ADLB01000014.1| GENE 60 55777 - 56040 290 87 aa, chain - ## HITS:1 COG:no KEGG:Ccel_1541 NR:ns ## KEGG: Ccel_1541 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 86 19 158 158 81 39.0 8e-15 MPKFQIITGQKSIEKYESDYRWEYTRILAKLSGADFTEPITAGIFVSIVAWASYQRLEEE GHTVIQKGRKNIRYYVKDYENSLFDLK 
>gi|330404459|gb|ADLB01000014.1| GENE 61 56290 - 56574 453 94 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK Prediction of potential genes in microbial genomes Time: Tue May 24 21:14:44 2011 Seq name: gi|330404222|gb|ADLB01000015.1| Lachnospiraceae bacterium 2_1_46FAA cont1.15, whole genome shotgun sequence Length of sequence - 42824 bp Number of predicted genes - 57, with homology - 49 Number of transcription units - 23, operones - 14 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 377 249 ## COG2801 Transposase and inactivated derivatives 2 2 Tu 1 . - CDS 396 - 1040 274 ## CDR20291_2961 hypothetical protein - Prom 1062 - 1121 3.4 3 3 Op 1 . - CDS 1144 - 1947 579 ## 4 3 Op 2 . - CDS 1996 - 2136 210 ## 5 3 Op 3 . - CDS 2192 - 2677 413 ## gi|210613137|ref|ZP_03289612.1| hypothetical protein CLONEX_01814 - Prom 2707 - 2766 4.1 6 3 Op 4 . - CDS 2771 - 3559 892 ## COG0428 Predicted divalent heavy-metal cations transporter - Prom 3700 - 3759 9.3 7 4 Op 1 . - CDS 3952 - 5172 927 ## COG0582 Integrase 8 4 Op 2 . - CDS 5133 - 5261 75 ## 9 4 Op 3 . - CDS 5277 - 5471 186 ## EUBREC_3614 hypothetical protein 10 4 Op 4 . - CDS 5464 - 5529 145 ## - Prom 5551 - 5610 3.8 11 5 Op 1 . - CDS 5911 - 6159 361 ## CD0358 hypothetical protein 12 5 Op 2 . - CDS 6163 - 6582 265 ## CD0359 hypothetical protein - Prom 6765 - 6824 6.9 + Prom 6987 - 7046 9.6 13 6 Tu 1 . + CDS 7105 - 7464 443 ## CD2002 hypothetical protein - Term 7402 - 7443 -0.5 14 7 Tu 1 . - CDS 7535 - 7723 169 ## CD0369 hypothetical protein 15 8 Op 1 . - CDS 7801 - 8595 574 ## Cphy_1635 hypothetical protein 16 8 Op 2 3/0.000 - CDS 8592 - 9527 249 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 17 8 Op 3 . 
- CDS 9479 - 10105 340 ## COG1309 Transcriptional regulator - Prom 10133 - 10192 5.6 - Term 10180 - 10224 -0.9 18 9 Op 1 . - CDS 10240 - 11142 797 ## CD0371 conjugative transposon protein 19 9 Op 2 . - CDS 11159 - 12166 813 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 20 9 Op 3 . - CDS 12163 - 14364 797 ## CD0373 conjugative transposon protein 21 9 Op 4 . - CDS 14364 - 16814 1710 ## CD0374 conjugative transposon protein 22 9 Op 5 . - CDS 16792 - 17190 374 ## CD3384 conjugative transposon membrane protein - Term 17216 - 17251 2.1 23 10 Op 1 . - CDS 17306 - 17719 413 ## CDR20291_3463 conjugative tranposon protein 24 10 Op 2 . - CDS 17814 - 17915 125 ## 25 10 Op 3 . - CDS 17915 - 18130 196 ## gi|210613177|ref|ZP_03289620.1| hypothetical protein CLONEX_01822 26 10 Op 4 . - CDS 18185 - 18652 341 ## - Prom 18704 - 18763 7.8 27 11 Op 1 . - CDS 18919 - 19089 318 ## CD3387A conjugative transposon protein 28 11 Op 2 1/0.000 - CDS 19104 - 20300 978 ## COG2946 Putative phage replication protein RstA - Prom 20409 - 20468 2.0 29 11 Op 3 . - CDS 20484 - 21458 480 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins - Prom 21480 - 21539 1.6 - Term 21892 - 21933 6.5 30 12 Op 1 . - CDS 21958 - 22149 243 ## COG1476 Predicted transcriptional regulators 31 12 Op 2 . - CDS 22152 - 22448 244 ## gi|225406863|ref|ZP_03761052.1| hypothetical protein CLOSTASPAR_05084 - Prom 22497 - 22556 7.5 32 13 Op 1 . - CDS 22568 - 22954 390 ## CD3390 conjugative transposon protein 33 13 Op 2 . - CDS 22970 - 23296 281 ## CD3391 conjugative transposon protein 34 13 Op 3 . - CDS 23244 - 23522 83 ## CD0385A hypothetical protein 35 13 Op 4 . - CDS 23535 - 26585 2860 ## CD3392 putative collagen-binding surface protein - Prom 26655 - 26714 4.5 - Term 26811 - 26846 -0.7 36 14 Tu 1 . - CDS 26869 - 27063 102 ## Ccel_2739 hypothetical protein 37 15 Op 1 . - CDS 27123 - 27869 567 ## DSY0090 hypothetical protein 38 15 Op 2 . 
- CDS 27902 - 28258 375 ## COG3070 Regulator of competence-specific genes 39 15 Op 3 . - CDS 28258 - 28797 339 ## COG3797 Uncharacterized protein conserved in bacteria - Prom 28837 - 28896 10.3 - Term 29036 - 29080 -0.3 40 16 Tu 1 . - CDS 29284 - 30606 877 ## MCRO_0335 hypothetical protein - Prom 30678 - 30737 5.9 - Term 30743 - 30784 4.4 41 17 Op 1 . - CDS 30822 - 32186 1222 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase 42 17 Op 2 . - CDS 32207 - 32773 644 ## COG0558 Phosphatidylglycerophosphate synthase - Term 32782 - 32818 8.2 43 17 Op 3 . - CDS 32829 - 33071 442 ## gi|210608460|ref|ZP_03287836.1| hypothetical protein CLONEX_00015 - Prom 33184 - 33243 7.5 44 18 Op 1 . - CDS 33245 - 35470 2548 ## COG0210 Superfamily I DNA and RNA helicases 45 18 Op 2 . - CDS 35546 - 36067 603 ## COG1335 Amidases related to nicotinamidase 46 18 Op 3 . - CDS 36076 - 36396 446 ## gi|167746261|ref|ZP_02418388.1| hypothetical protein ANACAC_00966 47 18 Op 4 . - CDS 36393 - 37238 496 ## gi|210608462|ref|ZP_03287838.1| hypothetical protein CLONEX_00017 - Prom 37262 - 37321 6.6 + Prom 37279 - 37338 5.5 48 19 Op 1 . + CDS 37362 - 37967 472 ## EUBREC_1394 hypothetical protein 49 19 Op 2 . + CDS 38031 - 38756 692 ## COG1387 Histidinol phosphatase and related hydrolases of the PHP family 50 19 Op 3 . + CDS 38828 - 39115 355 ## COG4496 Uncharacterized protein conserved in bacteria + Term 39122 - 39175 7.4 + Prom 39127 - 39186 12.5 51 20 Tu 1 . + CDS 39223 - 39699 280 ## EUBELI_00140 hypothetical protein + Term 39932 - 39974 -1.0 52 21 Tu 1 . - CDS 39700 - 39993 363 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components - Prom 40127 - 40186 3.0 53 22 Op 1 . - CDS 40265 - 41218 1119 ## COG0860 N-acetylmuramoyl-L-alanine amidase 54 22 Op 2 . - CDS 41265 - 41729 543 ## gi|153815855|ref|ZP_01968523.1| hypothetical protein RUMTOR_02100 55 22 Op 3 . - CDS 41749 - 41865 253 ## 56 22 Op 4 . 
- CDS 41897 - 42301 482 ## gi|153815854|ref|ZP_01968522.1| hypothetical protein RUMTOR_02099 - Prom 42334 - 42393 4.1 + Prom 42274 - 42333 5.1 57 23 Tu 1 . + CDS 42378 - 42737 291 ## Predicted protein(s) >gi|330404222|gb|ADLB01000015.1| GENE 1 3 - 377 249 124 aa, chain + ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 1 111 165 277 279 103 46.0 8e-23 VIETIKKAKARRNIDKPLILHSDRGSQYVSKEYKRVTATMQCSYSKKSYPWDNACIESFH SLIKREWLNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSHCDYMSPNDYEELYRRLQQDEL QLAG >gi|330404222|gb|ADLB01000015.1| GENE 2 396 - 1040 274 214 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_2961 NR:ns ## KEGG: CDR20291_2961 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 18 205 1 187 215 125 38.0 1e-27 MLMKLILIKLNLTEVLKIKSFLEKVRSPKKDIPPNKQIAETAGIILFGFALGVLQKWLDS TTANAFPSVIQQLDISNYFGRLAIWILLVTIISVYAKSPLRTAINTFFFFVSMLAGYYLY CNYILGFLPKTYMMIWIAISFASFFMAYICWYAKGEGVIAIFISSMIMGALLAQAINLNI TQGVYVYHIMEVLTWIIGVILLRRNGPMSRFSTS >gi|330404222|gb|ADLB01000015.1| GENE 3 1144 - 1947 579 267 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKTIKIKPILIFTIIFCICAGGILYAYIGNRPKPANVTQVENALAEQGFQAFNITDNAQ NNFPNMGLENCIIAEQDDLYFEFYQFDNVKSARKVYTQAYNKIIGNRTTQRVEFDERKLN YRVYILDVETDYYIAMYAENTAVYAYCDSENSSKINGVLNSLDYIDTGNTDWNSETPFDN IFRVLAYALCISVMYITRIWIWPVVYKSAGVTRKKALELGDDRKEIIPKLIQCSKAPKQT KLFALIQLYIITGLYCCNISNNKLFHR >gi|330404222|gb|ADLB01000015.1| GENE 4 1996 - 2136 210 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLWGTIIWDILYHIECMRFYNSLHIKQLGLTAFFIPGKEMTKLITA >gi|330404222|gb|ADLB01000015.1| GENE 5 2192 - 2677 413 161 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613137|ref|ZP_03289612.1| ## NR: gi|210613137|ref|ZP_03289612.1| hypothetical protein CLONEX_01814 [Clostridium nexile DSM 1787] # 1 161 1 161 161 285 93.0 7e-76 MIRTIIFAVFVVPVISLVLDKTQKATKKEKFTNTVFVSHAYQTVCHVCSLIMVGIALVLG 
FFMGFDNTIGHSVVFAIFVLLIEFCAWSLKRHKVVIDGNNLRITPAVGRTREISFNDISH CIEKEQIGLKIFVGNKKICTVSCDCVGYKEFLEMLKERNLF >gi|330404222|gb|ADLB01000015.1| GENE 6 2771 - 3559 892 262 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 16 262 25 269 269 202 51.0 6e-52 MTGNIWVGILIPFLGTTLGAGCVFFLKRDLRPIVQKSLLGFASGVMVAASVWSLLIPALD MAEKTLGKMSIIPASTGFLLGIGFLLLLDRLIPHLHLGEDKPEGMSKKLKKTTMLILAVT LHNIPEGMAVGAVFAGIVSKDAEITLMGAFALSIGIAIQNFPEGAIISIPLRSETNMNKG KAFTLGALSGIVEPIAAVCMFFLADMLESILPYILSFAAGAMIYVVIEELIPETTEGEHS NSGTVGFALGFVLMMIMDVTLG >gi|330404222|gb|ADLB01000015.1| GENE 7 3952 - 5172 927 406 aa, chain - ## HITS:1 COG:DR0513 KEGG:ns NR:ns ## COG: DR0513 COG0582 # Protein_GI_number: 15805540 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Deinococcus radiodurans # 9 391 6 384 444 84 22.0 5e-16 MAISRKDPKGRKLREGENWRNDGRYSYRYTDVRTGKRLTVYAQDLPELREKEKQIAKDME DNILTDGAIKKMTLNTLFERYMATRELADTTRVSYVRAWENRVKDEIGNIKVVQLLPSHI KAYYAKLSKAGYAYSTIKYIHNLLYPALEMAVDDDIIRKNPAKSSISDYGKPAEEKEALT VSQQERFMEFVKQSNVYNTYYPMLTIMIGTGVRCGELIGLTWKDVNIKAKTVSVDHQLIY KNLGDGCKFHISTPKTESGIRIIPMTQEVAKAFEEQRKINFMLAKDKSIEVDGYSGFVFT AKSGRPLMPSAVNSVLYNIVDAYNKTEVERAKKEHRKAELLLKFSAHVMRHTACTRMAEC RMDVKVLQYIMGHAHIDVTMEVYNHIGELARIENEIARLDSMALNA >gi|330404222|gb|ADLB01000015.1| GENE 8 5133 - 5261 75 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSGVCSIIRKDKARCENKKRGFFRKEWFSWQLVEKTQKEEN >gi|330404222|gb|ADLB01000015.1| GENE 9 5277 - 5471 186 64 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3614 NR:ns ## KEGG: EUBREC_3614 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 59 1 59 64 62 55.0 6e-09 MNKTKIRATEQPIAVDIEGLSAMLSCGRATARKIGEQAGAKIVIGRRVLYSIEKVKKYLL YLEE >gi|330404222|gb|ADLB01000015.1| GENE 10 5464 - 5529 145 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MLNTDVLNDVMFQRGKVLEYE >gi|330404222|gb|ADLB01000015.1| GENE 11 5911 - 6159 361 82 aa, chain - ## HITS:1 COG:no KEGG:CD0358 NR:ns ## KEGG: CD0358 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 75 1 75 84 119 88.0 5e-26 MKYTMKKPSYYLLSQAMDGDEKAIEKILAFYDPYISKCCLRPLYDEYGNVYIVVDMELKG LIREALIKMILGFDIALEIEEE >gi|330404222|gb|ADLB01000015.1| GENE 12 6163 - 6582 265 139 aa, chain - ## HITS:1 COG:no KEGG:CD0359 NR:ns ## KEGG: CD0359 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 138 4 141 142 189 71.0 4e-47 MKPSDFQKTVQCRFENCLKKVVRHVVKDYQQGLKRRKDKEISFCELPEIVVEKLAVWDEY DSDYTFFNVCGNDIRICDDELAEALKHLSERNRENLLMYYFLEMSDTEIAKRQNISRSGV FQNRHNSLELMKKILKEKR >gi|330404222|gb|ADLB01000015.1| GENE 13 7105 - 7464 443 119 aa, chain + ## HITS:1 COG:no KEGG:CD2002 NR:ns ## KEGG: CD2002 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 119 1 119 119 184 85.0 7e-46 MRKTKETHTFDFRPLGLAIREARERAGLSRNDLGDKVFYGERHIADIENTGSHPSFQLFH DLVTMFNISVDEYFYPEKKIAKSTTRRQIETSLDLLSDSELKIIQGTIDGILNSRESKK >gi|330404222|gb|ADLB01000015.1| GENE 14 7535 - 7723 169 62 aa, chain - ## HITS:1 COG:no KEGG:CD0369 NR:ns ## KEGG: CD0369 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 9 62 28 81 81 94 94.0 8e-19 MKVGAKVGGVSLADIRADIEATIDEAMNSTDPEVQANFKKYFGNKRPTPEEYIYKITKKP KV >gi|330404222|gb|ADLB01000015.1| GENE 15 7801 - 8595 574 264 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1635 NR:ns ## KEGG: Cphy_1635 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 264 5 266 266 165 37.0 1e-39 MIITLIKKEIKSNSKIYIIFLSVIAIYIFSLLSMYNPALEESLIALEKSMPEILAIFGMQ NRGTTLLDFIVNYLYRFVLIVTPFIYTAIMCYKLVTKYEEKGAMAYLLNSHYSRKQIIIT QGINLLLGISVMIVFATALTILSCAIMFRGELDIIGFLTLNFGLLILQIFLATFCFMFTC AFSEIKYSVGLGAGIGSLFIMIQMVSQVYDNELLKYCNPLSLFNPDKIIEYNPISLLCIG ILFVLSILFFVVAVKAFKRKDLNL >gi|330404222|gb|ADLB01000015.1| 
GENE 16 8592 - 9527 249 311 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 19 308 12 310 318 100 25 1e-20 MARYIETSKLQGGVLMNMIEVQNLTKDYGNNRGIFHLNFSIKQGETVAFLGTNGAGKTTT IRQLMGFVKPQNGTAKINGLDCFQQEKDIQKQVGYLSGEISFLDENMTGHDFIKFMSDIK KIKNYQRIEQLITYFELDTQIKIKKMSKGTKQKVGLIVAFMQDAPVLILDEPTSGLDPIM QNKFIELIMKEKSKGKTIIMSSHIFEEVEHTCDRILCIKDGKLIADENVEDIKRNRGKQY SITFSNERDALNFAKKFPNISLVNLKTVLLMQTGSINDLLHELYHYEVTDIDVRNQTLEE LFLQYYGGNKQ >gi|330404222|gb|ADLB01000015.1| GENE 17 9479 - 10105 340 208 aa, chain - ## HITS:1 COG:lin2076 KEGG:ns NR:ns ## COG: lin2076 COG1309 # Protein_GI_number: 16801142 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Listeria innocua # 13 116 3 105 206 63 34.0 3e-10 MQINEKFYSLSKEKQQMIINAGLECFGKYGYQKANTEKIALKAGISKALLFHYFINKKNF YLFLCDFCKEVSANLLEVDEMMKITDFFELIDFSMQTKWKIMTKYPYMANFALNAFYSQK EKTTSDVNKYIQTELNSSFDIYFRNIDFSKFKENAEPKFIYQMLVMLSEGYLSEKQRTNT PIIFDEATKELKKWQDILKQASYKEEYL >gi|330404222|gb|ADLB01000015.1| GENE 18 10240 - 11142 797 300 aa, chain - ## HITS:1 COG:no KEGG:CD0371 NR:ns ## KEGG: CD0371 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 300 1 303 303 511 96.0 1e-143 MFKKNKKQTENIKEKKVRTVKVGTHQKTVIALWAVLIASVSFGVYKNFTAIDQHTTHEKE IIELRLQDTNGIENFVKNFAKSYYTWNNSKEAIEARTQAINGYLTKELQDLNVDTIRTDI PTSSTVTDVIVWHIEQSGTDTFSATYEVDQQIKEGEQTSNVKATYTVKVHMDADGDMVIV QNPTLASAVEKSDYEPKTPEADASVDADTVNDATAFLETFFKLYPTATEKELAYYVSGNV IEPIGRDYLYSELVNPIFTKDGDNVKVKVAVKFLDNQTKATQVSQYELVLHKDSNWKIVG >gi|330404222|gb|ADLB01000015.1| GENE 19 11159 - 12166 813 335 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 212 334 2 124 124 138 54.0 2e-32 
MKLKHIAIIGSLFPILFSLVLFFGVLISADSDDENSNFSSGITGMNLSAEVLKHQPMVEK YAREYGISEYVNVLLAIIQVESGGTAEDVMQSSESLGLPPNSLDTESSIKQGCKYFASLL SSCKNQGIDDLNVAIQSYNYGGGYVGYVAGKGKKHTFNLAENFAREKSGGKKVTYTNPIA VAKNGGWRYGYGNMFYVELVNQYLAVPQVSGELAQKVMNEALKYQGWKYVYGGSNPNTSF DCSGLTQWCYGKAGISLPRTAQAQYDATQHLPLSQAKAGDLVFFHSTYNAGTYVTHVAIY VGNNQMYHAGNPIGYADLNSSYWQQHLIGAGRVKQ >gi|330404222|gb|ADLB01000015.1| GENE 20 12163 - 14364 797 733 aa, chain - ## HITS:1 COG:no KEGG:CD0373 NR:ns ## KEGG: CD0373 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 733 2 736 736 1303 95.0 0 MKERIKGAFTKKKIFHFLKMALFVVALSLILLSLLGTVAHATGLVDDTINAENLYSKYPL SNYQLDFYVDNSWSWLPWNWLDGIGKSVQYGLYCITNFVWTISLYLSNATGYVVQEAYKL DFINDMADSIGKSIQTLAGVTENGFSSTGFYVGFLLLIILVVGMYVAYTGLIKRETSKAL HAVINFVVVFILSASFIAYAPDYIKKINEFSSDISTASLDLGTKIMLPNSDSEGKDSVDL IRDSLFSIQVEQPWLLLQFGNSNAEEIGTDRVEALVSASPEDEDGKTREEVVKTEIEDND NNNLTIPQVVNRLGMVFFLLFFNLGITIFVFLLTGMMLFSQILFIIFAMFLPISFLLSMI PSYESMAKQAIVRVFNTIMTRAGITLIVTVAFSISSMFYNISTDYPFFMVAFLQIVCFAG IYMKLGDLMSMFSLNANDSQSMGRRIFRRPYLFMRHRARRMEHRIARVVSAGGISGGVAG AVAGSAVANKRAERKNNVSKENRGNTTSSMGQRAGSKVGAVLDTKNKVKDKANAVKENIK DMPTQTAYAVYSAKEKAKSSVSDFKRGMVQEQQSRQTGRLEKQEQHKKNIADKRMELQKT QEARQAQRKADGSATTGATRPHERPATASKPSAEKMQEIKRPATAPTPKASEPVKTTVIK ERPLSSGASDRKATQSAQPVHRQNVEKVVSQETRQNYTKDRRTKVQQTQNVQKNQQTTEK TRNLVTKKGQKKK >gi|330404222|gb|ADLB01000015.1| GENE 21 14364 - 16814 1710 816 aa, chain - ## HITS:1 COG:no KEGG:CD0374 NR:ns ## KEGG: CD0374 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 816 1 816 816 1541 98.0 0 MFPIKYIDNNLVWNKDNEVFAYYELIPYNYSFLSAEQKFIVHDSFRQLIAQSREGKIHAL QIATESSIRSMQEQSKKLVTGKLREVAVQKIDEQTEALVSMIGDNQVDYRFFLGFKLMVT EEQFNLKNIKKSAWLTFKEFLHEVNHTLMNDFVSMSNDEINRYMKMEKLLENKISRRFKV RRLEINDFGYLMEHLYGRDGIAYEDYEYQLPKKNYKKETLIKYYDLIRPTRCVIEESQRY LRLEHEDKESYVSYFTVNAIVGELDFPSSEIFYFQQQQFTFPVDTSMNVEIVENRKALTT VRNKKKELKDLDNHAYQAGSETSSNVVDALDSVDELETDLDQSKESMYKLSYVIRVSAPD 
LDELKRRCDEVKDFYDDLNVKLVRPAGDMLGLHSEFLPASKRYINDYVQYVKSDFLAGLG FGATQQLGESTGIYMGYSVDTGRNVYLQPSLASQGVKGTVTNALASAFVGSLGGGKSFCN NLLVYYSVLFGGQAVILDPKSERGNWKETLPEIAHEINIVNLTSDKGNAGLLDPFVIMKN VKDAESLAIDILTFLTGISSRDGEKFPVLRKAVRSVTQSDSRGLLHVIDELRREDTPVSR NIADHIDSFTDYDFAHLLFSDGTVENAISLDNQLNIIQVADLVLPDKDTTFEEYTTIELL SVSMLIVISTFALDFIHSDRSIFKIVDLDEAWAFLNVAQGETLSNKLVRAGRAMQAGVYF VTQSSGDVSKESLKNNIGLKFAFRSTDINEIKQTLEFFGIDKDDENNQKRLRDLENGQCL LQDLYGRVGVVQIHPVFEELLHAFDTRPPVQRNEVE >gi|330404222|gb|ADLB01000015.1| GENE 22 16792 - 17190 374 132 aa, chain - ## HITS:1 COG:no KEGG:CD3384 NR:ns ## KEGG: CD3384 # Name: not_defined # Def: conjugative transposon membrane protein # Organism: C.difficile # Pathway: not_defined # 1 132 1 132 132 239 99.0 3e-62 MKKIKSYTGIWNVEKVLYAINDFNLPFPVTFTQITWFVITEFIIILFGDMPPLSMIEGAF LKYFGIPVALTWFMSQKTFDGKKPYSFLKSQITYALRPKITYAGKAVKLHKQILNETITA VRSVNYVPDKIY >gi|330404222|gb|ADLB01000015.1| GENE 23 17306 - 17719 413 137 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3463 NR:ns ## KEGG: CDR20291_3463 # Name: not_defined # Def: conjugative tranposon protein # Organism: C.difficile_R20291 # Pathway: not_defined # 6 135 36 165 167 147 61.0 8e-35 MRWSNERIGLNGRYEEYAIHDSENFPCELEEYTSIEELNRIYELIQDFPEEVLDKLDDFI SYYGDLEELADHIGDIICYSGCETMEDVAYHKIYEENVLGEIPPAFFHYIDCEAYGRDIE IQDYFVKTRYGMCEIKR >gi|330404222|gb|ADLB01000015.1| GENE 24 17814 - 17915 125 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQELRVQIEAVPTVEGDILSAWFTLPLQRLTNA >gi|330404222|gb|ADLB01000015.1| GENE 25 17915 - 18130 196 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613177|ref|ZP_03289620.1| ## NR: gi|210613177|ref|ZP_03289620.1| hypothetical protein CLONEX_01822 [Clostridium nexile DSM 1787] # 1 71 1 71 98 119 87.0 9e-26 MNDIFKDMQAKVGCNYLSDLPSYKRKVWQEMKRLNPADYPKKQLEDFSIYVFGMSYQTLK DVMNQQRGSEK >gi|330404222|gb|ADLB01000015.1| GENE 26 18185 - 18652 341 155 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRIQYYRHHSWNLLNYLGMIKDGIDKLNEIEVSKDLLEYDFEKSTRNLIVHIYEKMNKIN 
ETSGLLHGFNVIMDEKDVFLRDIKEPIFNLDLLKEEITMVIIDRRKTPPIEHNYKVELNH LKNSIDMLKAQVEKVTQYIREAWYPNQSFRVSVKQ >gi|330404222|gb|ADLB01000015.1| GENE 27 18919 - 19089 318 56 aa, chain - ## HITS:1 COG:no KEGG:CD3387A NR:ns ## KEGG: CD3387A # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 10 56 27 73 73 71 97.0 9e-12 MGISTRLPAGIYLGFKREFSKLIGFLVVALVAVGLVFNAGGVKDVLLELFNKIIGA >gi|330404222|gb|ADLB01000015.1| GENE 28 19104 - 20300 978 398 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 65 396 25 351 352 226 38.0 6e-59 MVLNEEQWIKELREKRIAYGISQGRLAVASGITREYLNKIESGKMKPSKELLETLHKELA RFNPEAPLTMLFDYVKIRFPTLDIQHIIKDILKLNINYMLHEDYGRYSYTEHYSLGDIFI YTSADEEKGVLLELKGRGCRQFESYLLAQQRSWYDFLMDALVDGGVMKRIDLAINDHTGI LDIPELAEKCRKREYIGKSRSYKFYQSGELIKHREDDREYMGRTLYLGSLKSDVYFCIYE KDYEQYVKLGTPLEEADIINRFEIRLRNERAYYAVRDLLTYYDAEQTAFSIINQYVRFVD EEPDKRKNDWKLNDRWAWFIGDNRQSLKLTTKPESYTLDRTLRWVQRQVAPTLKMLKKID KGNGTDYMETIEKQAVLSEKHEMIIKQQTTPAKDLVES >gi|330404222|gb|ADLB01000015.1| GENE 29 20484 - 21458 480 324 aa, chain - ## HITS:1 COG:BS_ydcQ KEGG:ns NR:ns ## COG: BS_ydcQ COG1674 # Protein_GI_number: 16077553 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Bacillus subtilis # 6 324 149 467 480 349 51.0 5e-96 MEKGLLHIRCEITLGKYQDQLLRLEDKLESGLYCELTDKTLHDGYIEYTLLYDMIANRIT IDEVRAENGCLRLMKNLVWEYDALPHALIAGGTGGGKTYFLLTLIEALLHTNAVLYVLDP KNADLADLGTVMGNVYHTKEEMIDCVNAFYEGMVRRSEEMKRHPNYKTGENYAYLGLPPC FLIFDEYVAFFEMLGTKESVSLLSQLKKIIMLGRQAGYFLIVACQRPDAKYFSDGIRDNF NFRVGLGRISELGYGMLFGSDVKKQFFQKRIKGRGYCDVGTSVISEFYTPLVPKGHDFLQ TIGSLAQARQDGTATCEAKGDGTD >gi|330404222|gb|ADLB01000015.1| GENE 30 21958 - 22149 243 63 aa, chain - ## HITS:1 COG:L126409 KEGG:ns NR:ns ## COG: L126409 COG1476 # 
Protein_GI_number: 15672309 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 1 60 1 60 97 68 65.0 2e-12 MENLIRNRRKELGLSQVELAKKCGVSRQTVNAIENNKYDPTLTLAFNLAKELKTTVDELF IHN >gi|330404222|gb|ADLB01000015.1| GENE 31 22152 - 22448 244 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225406863|ref|ZP_03761052.1| ## NR: gi|225406863|ref|ZP_03761052.1| hypothetical protein CLOSTASPAR_05084 [Clostridium asparagiforme DSM 15981] # 3 97 1 95 96 87 47.0 3e-16 MAMPVGAMIFLGSILIGFTILDILMLVSLLKPGDERNQIIVWKASSFTLLSITGSLVLDI IESYVRAQPLTINPLIHLEVIAIVYFLSLMFFKKRHGG >gi|330404222|gb|ADLB01000015.1| GENE 32 22568 - 22954 390 128 aa, chain - ## HITS:1 COG:no KEGG:CD3390 NR:ns ## KEGG: CD3390 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 126 1 126 127 240 96.0 1e-62 MRLSNGFVIDKEKTFGELKFTVVRDVFLQNEDGTPSTQLKKRIYDLKCSLHGGIIPVSVP PEVPLREFPYNAVVELVNPVADTVSRKTFTGADVDWYVKAEDIVLKNKGNQNAGSPQNHT PQGQPKNK >gi|330404222|gb|ADLB01000015.1| GENE 33 22970 - 23296 281 108 aa, chain - ## HITS:1 COG:no KEGG:CD3391 NR:ns ## KEGG: CD3391 # Name: not_defined # Def: conjugative transposon protein # Organism: C.difficile # Pathway: not_defined # 1 108 1 108 108 196 94.0 3e-49 MEMKYVVPDMAQSFGTLEFAGESEPVFERDKNNRRVLARRSYNLYSDVQKGENVVVEIPV QAGEKHFKYEQKVKLVNPKLYGRGYAIGDMGHTDYVLLADDIVAVEEK >gi|330404222|gb|ADLB01000015.1| GENE 34 23244 - 23522 83 92 aa, chain - ## HITS:1 COG:no KEGG:CD0385A NR:ns ## KEGG: CD0385A # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 59 1 59 61 108 94.0 5e-23 MRKGKSRSPPKLSNGRLPLILACLPTTCKNTDSQGCVSIYFTLYYRWDCWYFPNRQNKKS QSKNLSNIERNEVKQYGNEICCARYGTVFWNS >gi|330404222|gb|ADLB01000015.1| GENE 35 23535 - 26585 2860 1016 aa, chain - ## HITS:1 COG:no KEGG:CD3392 NR:ns ## KEGG: CD3392 # Name: not_defined # Def: putative collagen-binding surface protein # Organism: C.difficile # Pathway: not_defined # 1 1016 1 1014 1014 1766 91.0 
0 MKKMLKRLCTGFLALATVVTALPTTPVHAESKQYWTESKERVGIVEKVMNDGSIGSTFNE GHLTVEGEDAYCIDINTDFKNGYKTRADASTRMSADQISDVALSIEYVKQYTDSHSGISK NHAYLLRQLVVWQRLSVHLGWQCDNVRASYDEIPKATQDEVFAGAKAFVKENKGRYECGG YIYSGEGQELGQFWAKLNVGNAKLQKTSSNTSITEGNGNYSIAGATYGVFSDKDCTKQLA TLTTDENGNTDVVEVTAGTVYIKELSAPAGYKVDKTVYPLTIKAGETATLKVSDTPKVTD TLIELFKIDMETQKDNPQGDASLAGAEFTWKYYDGYYNKNNLPEKATHTWTTKTIAEKDS DGTIHYVTKLADAYKVSGDSFYMQDGKAVLPLGTLTVEETKAPNGYLLDGAYMQAGDKSE QIKGLYVTQITEDGDLVVLSGSNQFSVSDKVIRGGVKIQKRDLETGDTKPQGSATLKDTA FDIISLNDNEVLVEGKLYKKNEVVKTIHTDIEGVASTSADLLPYGKFRIVESKAPNGYLT DGAKPIDFAITENGKIVDLTDEAHSIYNQIKRGDIEGVKIGAGTHKRLADVPFRITSKTT GESHVVVTDDNGQFSTSSDWASHKHNTNAGKTSEDGVWFGTSEPDDSKGALPYDTYIIEE LRSDGNKGFELIPPFEIVVSRNNLVIDLGTLTDEYEKEISIHTTATSKDGEKTILAGKEV TIVDTVKLDGLTKGTKYQLKGWQMLKEENAELIIDGKRVENDYTFIADDEEMKVEISYTF NASALGGKNLVTFEELYDLSNPDEPVKVAEHKDIEDDGQTVLITERIIKIHTTATDKDGN KEFEAGKDVTIIDTVTLEGLEVGTQYKLVGWQMLKEENAELLINEKRVESDYTFTADSEI MKVEVAFTFDATSLDGKQLVTFEELYDLSNPDEPKKVIEHKDIEDEGQTITFKEKPEVPE EPEQPETPQTPDTPHKTDSPKTGDSTNLYGLLALLLTSGAGLAGIFFCKRRKMKKS >gi|330404222|gb|ADLB01000015.1| GENE 36 26869 - 27063 102 64 aa, chain - ## HITS:1 COG:no KEGG:Ccel_2739 NR:ns ## KEGG: Ccel_2739 # Name: not_defined # Def: hypothetical protein # Organism: C.cellulolyticum # Pathway: not_defined # 2 64 76 143 155 69 45.0 3e-11 MDKEGLLFNIDKVHTTEMGIGRIKKNLKLDTDDVVEWCKNRVLDEGCNIYKQGKNWYCEI VITA >gi|330404222|gb|ADLB01000015.1| GENE 37 27123 - 27869 567 248 aa, chain - ## HITS:1 COG:no KEGG:DSY0090 NR:ns ## KEGG: DSY0090 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 248 1 249 271 274 53.0 2e-72 MEYIKVTSENIEKEHICCAISNNNDIQVSSKKAWLSERFEDGLVFLKSTERGKCFIEYIP AENAWNPISADGYMYINCLWVAGSFKGHGYSADLLNACIEDSKNKGKKGLCILAAAKKKP FLVDSKFLKHKGFQVCDEADNGIQLWYLAFVENATLPQFKECAKHPHTDEMGYVLYYTSQ CPFNAKYVPVIEGIAKEKAIQFTTIHLQSKEEAQKAPTPITTYALFLDGNYITNEQMNDK RFLKLLEK >gi|330404222|gb|ADLB01000015.1| GENE 38 27902 - 28258 375 118 aa, chain - ## HITS:1 COG:SP0951 KEGG:ns NR:ns ## COG: SP0951 
COG3070 # Protein_GI_number: 15900829 # Func_class: K Transcription # Function: Regulator of competence-specific genes # Organism: Streptococcus pneumoniae TIGR4 # 13 87 1 75 75 97 62.0 5e-21 MILEILKLGVGIMASSKQYLEFILGQLSELDEITYRAMMGEFIIYYRGKIVGGIYDDRLL VKAVKSAISYMPTALYELPYEGAKEMLLVDEVDNKEFLAVLFNAMYEELPTPKPKKKK >gi|330404222|gb|ADLB01000015.1| GENE 39 28258 - 28797 339 179 aa, chain - ## HITS:1 COG:SP0830 KEGG:ns NR:ns ## COG: SP0830 COG3797 # Protein_GI_number: 15900717 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pneumoniae TIGR4 # 1 179 1 177 180 82 32.0 5e-16 MKRYIALLRGINISGKNRITMSDLKAGFIELGYTAVSTYLNSGNVIFDSGIDNKEELSNA IRFMINKQFELEIPVFVILQEELEEILKNAPEWWGDSSKEIYDNLIFMFPTISYDRFYNE VGNPKEAYEKIYHYKNVIFWSFSRKDYQKTNWWLKTANSKVSNEITIRTANTVRKIVNM >gi|330404222|gb|ADLB01000015.1| GENE 40 29284 - 30606 877 440 aa, chain - ## HITS:1 COG:no KEGG:MCRO_0335 NR:ns ## KEGG: MCRO_0335 # Name: not_defined # Def: hypothetical protein # Organism: M.crocodyli # Pathway: not_defined # 63 438 57 422 432 101 28.0 6e-20 MANALPRQFKMLNDELIVEYSPDTKKTRNEIETLLKNALGDSFDVIQFFKGQKVVTGVIK GDDGRATYIMAANLTFMGGKEGQHPKDLKRIQYNNNWKLFYDKYSCDGDVFWLGLYSYDD VNVWAYFKPESYLKKHEGKEMISAGGHKAQYSCHIFLNDLYQGYINGYFEKVDKNNNIVG AIKNEYLKEFFSSAEREERNPIITAIQKLNKEKIRWNEWITAKEAITYMKDLKDKTGFGM WKQNLWNGWLIEAYYSEFFHDNSSEYADYIATSENPSIISEYGKMGLDLAFPHCKYRFIG DLKAVCEGGGNTLLNDETKVTDALNKYHRIWFIMYIHDKKQGKTNDYEMVKWRNHFIKDM GEWDLKKPFNELSARNTPHSISYSEMVIIELNEITKEKYFSISKQFGLNSDGKPRNLKFT VDKNLLDTSDDSFVIYRYRP >gi|330404222|gb|ADLB01000015.1| GENE 41 30822 - 32186 1222 454 aa, chain - ## HITS:1 COG:CAC1435 KEGG:ns NR:ns ## COG: CAC1435 COG2265 # Protein_GI_number: 15894714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Clostridium acetobutylicum # 1 453 4 453 456 407 45.0 1e-113 MKKGQIYEGIIEKVEFLNKGIVPVESEERKVIVKNGIPGQKVKFCINKMRKGKAEGRLLE 
VLEKSPYETNEPVCSIFPQCGGCMYQTMSYDEQIKMKSQQVKEILDAVVKDDYVFEGVKK SPKQFAYRNKMEFSFGDEYKDGPLSLGLHKKGSTYDVLTVSDCKLVHEDMTKILNCVLTY FQEKGVGYYKKMQHIGYLRHLLLRRGDKTGEILINLVTTTQEEHDMQPLVDRLLALQLEG KIVGILHILNDSLSDVVQSDKTILLYGQDYFYEELLGMEFKITPFSFFQPNSVGAEVLYD TVREYIGDIDNMTIYDLFSGTGTIGQILAPVAKEVIGVEIIEEAVEAAKENAEHNGLSNC KFIAGDVFKVLDEIEEKPDVIVLDPPRDGIHPKALPKILDYQVDKIVYISCKVTSLARDL EMIQERGYRVEKCVAVDQFANTAHVETVALLSKV >gi|330404222|gb|ADLB01000015.1| GENE 42 32207 - 32773 644 188 aa, chain - ## HITS:1 COG:CAC3596 KEGG:ns NR:ns ## COG: CAC3596 COG0558 # Protein_GI_number: 15896830 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylglycerophosphate synthase # Organism: Clostridium acetobutylicum # 14 147 2 133 174 72 34.0 5e-13 MVLGDVPLKKQIFSIPNLMGYFRIILIPVILWRYLTADSIADYRMAAVIIGISGITDFLD GFVARKFHMVTQLGKAIDPIADKLTQIAIVCALSVRFEWFGVVACLLIVKEGFMGIMGYI LIRRGKMLNGARWFGKVSTAVLYVIMFALILIPDINMDIANGMILVSGILLTLSLILYIP EYRKILKE >gi|330404222|gb|ADLB01000015.1| GENE 43 32829 - 33071 442 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210608460|ref|ZP_03287836.1| ## NR: gi|210608460|ref|ZP_03287836.1| hypothetical protein CLONEX_00015 [Clostridium nexile DSM 1787] # 1 58 28 85 107 66 75.0 5e-10 MSKFEDIIAASKVSDLLHKKDEDKTKNTVLWVLAIVGVIAAVAGIAYAVYRFFTPDYLED FEEDFEDDFDDEFFGDEDEI >gi|330404222|gb|ADLB01000015.1| GENE 44 33245 - 35470 2548 741 aa, chain - ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 2 740 3 727 730 682 50.0 0 MSIYDTLNTEQKEAVLHTEGPLLILAGAGSGKTRVLTHRISYLIEEKGVNPWNILAITFT NKAAGEMRERVDKIVGFGSESIWVSTFHSMCVRILRRHIDRLGYDTNFTIYDTDDQKTLM KDVCKMLQIDTKVYKERMFLGEISSAKNELVTPEEYELNAAGDYVKGKVAKVYKEYEKQL RDNNALDFDDLLLKTVQLFQTQADVLDYYQERFRYIMVDEYQDTNTVQFQLIRILAGKYK NLCVVGDDDQSIYKFRGANIQNILNFEKVFADARVIKLEQNYRSTANILNAANAVIRHNT GRKDKTLWTDNEEGEKIGFRQFDTAFDEAEYIVDEIRKGVSGKGYTYSDNAILYRTNAQS 
RMFEEKFVAANIPYKIIEGVNFYARREIKDLLSYLKTIDNGKDDLAVRRIINVPKRSIGL TTVNRVQENALERGISFYDALSSADLIDNIGRSLSKIESFVALIEHFKEQSEKMSLSQLM EEIIEMTGYIESLEAESEIEAETRIENIEELKSKIIAYEESCEDEKPTLSGFLEEVALVA DVDTLDENSDYVVLMTLHSAKGLEFPNVYLAGMEDGLFPSYMTITADDPMEIEEERRLCY VGITRARKHLTMTCARRRMIHGETQYNKLSRFLKEIPLELLDTGNIVEKNTMDVPKQTAY AQARQTFKTKAFSTAKPIKQFGTASGEGPGYDVGDRVKHMKFGEGLVTAMTAGGRDYEVT VQFDTVGVKKMFATFAKLQKI >gi|330404222|gb|ADLB01000015.1| GENE 45 35546 - 36067 603 173 aa, chain - ## HITS:1 COG:L67226 KEGG:ns NR:ns ## COG: L67226 COG1335 # Protein_GI_number: 15672251 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Lactococcus lactis # 4 170 6 168 171 177 54.0 6e-45 MNILVVIDMQNDFIDGALGTKEAVSIVPKVMEKIRMFDGKILATRDTHEENYLETQEGRK LPVKHCIRNTKGWQINGEIQSLLSETPVDKETFGSRELPEILKKYDEKEKIESITLVGLC TDICVISNAMVLKAYFPEVPIIVDASCCAGVTPQSHKQTLEAMKVCQIEVVEE >gi|330404222|gb|ADLB01000015.1| GENE 46 36076 - 36396 446 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167746261|ref|ZP_02418388.1| ## NR: gi|167746261|ref|ZP_02418388.1| hypothetical protein ANACAC_00966 [Anaerostipes caccae DSM 14662] # 3 106 16 121 122 92 51.0 8e-18 MKIAFMIWAVIAFVFMGIGIYDYFSEKPAGFWANAKTVPIDDVKAYNRAVGKLFVCFGIG FILLGLPLLVSKQNSPVILFSVIGLMFESIGMMIVYMKIESKYRRK >gi|330404222|gb|ADLB01000015.1| GENE 47 36393 - 37238 496 281 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210608462|ref|ZP_03287838.1| ## NR: gi|210608462|ref|ZP_03287838.1| hypothetical protein CLONEX_00017 [Clostridium nexile DSM 1787] # 180 278 29 133 144 87 42.0 9e-16 MREEKVLRTCKIVRKITVIIGLAVTILPILFWNYIPDKIPAHYGVSGKVDRVGGKEELIL LFFVLWLVLGSLSVVSYYLKTSGVSKYANEKDNEHLQTIYPMITWITLITTFTPIVVYIG KARKHSAGNPSEKAKFVQAESREEGIAYRTAVDWWLALLFILVIGGELWIFFQSLLNKGK VEWITLVAVILILLLVVPLVRIKYILYSEHFLISAGYLGKQRIAYSSIVNIKETHNPLSA PACSLDRVEIDYVVSGMHKFALISPVHLKMFKKELESRCEK >gi|330404222|gb|ADLB01000015.1| GENE 48 37362 - 37967 472 201 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_1394 NR:ns ## KEGG: EUBREC_1394 # Name: 
not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 198 1 198 217 198 58.0 1e-49 MTIDKNDFLNSILSSLSRINYVKLEDIPNIELYMDQVTTFMETQLAASKRYEEDKILTKT MINNYAKNNLLPPPEKKKYSKEHMLVLIFIYYFKNILSIKDIETLLKPITDNYFHSDSDF NLTTIYEKICKSEKERINFLQEDIKNIYTTSMENFQEIDTENKEFLQLFSFICTLSFDVY VKKQLIEKLIDLLPEPEKKKK >gi|330404222|gb|ADLB01000015.1| GENE 49 38031 - 38756 692 241 aa, chain + ## HITS:1 COG:CAC0509 KEGG:ns NR:ns ## COG: CAC0509 COG1387 # Protein_GI_number: 15893800 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Histidinol phosphatase and related hydrolases of the PHP family # Organism: Clostridium acetobutylicum # 1 236 1 233 244 193 43.0 3e-49 MNLCIDTHSHTIASGHAYSTIREMAKAAAEKELSALAITEHAPAMPETCGNFYFSNLKVI PRTMYGVNLLFGVELNILDEEGNIDLPPSLLKSVDIAIASIHTPCFQEKRGIEENTRAYI RAMQNPYIDIIGHPDDSRFPIDYEQVVYAAKETGTLLEVNNSSLSPGSFREGAEENLKIM LSLCKKHRVPITTGSDAHVDIDVGNFTYSSKILASCDFPEELIVTTNLDKLRKFLKRNKN M >gi|330404222|gb|ADLB01000015.1| GENE 50 38828 - 39115 355 95 aa, chain + ## HITS:1 COG:BH0639 KEGG:ns NR:ns ## COG: BH0639 COG4496 # Protein_GI_number: 15613202 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 4 93 5 94 100 112 66.0 1e-25 MSKKIRTESVDYLFDAILSLKDKEECYTFFEDVCTINELLSLSQRLEVAKMLREQKTYLE IAEKTGASTATISRVNRSLNYGNDGYDMVFERLHK >gi|330404222|gb|ADLB01000015.1| GENE 51 39223 - 39699 280 158 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00140 NR:ns ## KEGG: EUBELI_00140 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 11 125 19 133 178 89 43.0 3e-17 MKNNLKHYTFIGILFVSISGTLSHFVYEWSGKKPFLALFFPINETTWEHMKLIFFPALAF ALFMAYRLRFLYPHIISALSAGILLGTFSIPAIFHAYSGILGFHTLFLDIATFLLSVFIT FISAYFFTISDKLKKQSFWIFLIVCLALCFFFFSIREI >gi|330404222|gb|ADLB01000015.1| GENE 52 39700 - 39993 363 97 aa, chain - ## HITS:1 COG:BS_ytlB KEGG:ns NR:ns ## COG: BS_ytlB COG0715 # Protein_GI_number: 16080112 # Func_class: P Inorganic 
ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 5 94 11 100 105 61 37.0 5e-10 MLQGFTNALQKGMDYVQSHTPEEIAKIIKPQFKDADMETITTIVERYAEQETWKENLIFE KESFELLQDILESSDELEKRVPYKELVTTEFAKKAYK >gi|330404222|gb|ADLB01000015.1| GENE 53 40265 - 41218 1119 317 aa, chain - ## HITS:1 COG:BH0810 KEGG:ns NR:ns ## COG: BH0810 COG0860 # Protein_GI_number: 15613373 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 1 208 1 207 263 103 33.0 4e-22 MAKVFLSAGHGGNDPGASAYGLVEKTINLNIMLACRNVLMAHNITVVCSRVTDENDPVSQ EVREANASGADLAVSFHTNAGGGDGSESYYHPTSEKGKKLAQLCEKHTQALGQNSRGVKT KDLAFPRDTRMTAVLCECAFVDNDKDNDIVDTLEEQQAFGIAYAEAILEYLDIKYTGTNN PVAPPQKPTQAPQNIKIKEDGILGKNSNEETQRHYGTFVDGVISRQPLCNKKYFIYIDAR YWEFVKNYGEGSPVVKAWQRDLKNRGFYHGEIDGLMGPKMIIALQQFLSMLGLYVGKIDA YLGEKCGRGWQKYLNTH >gi|330404222|gb|ADLB01000015.1| GENE 54 41265 - 41729 543 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815855|ref|ZP_01968523.1| ## NR: gi|153815855|ref|ZP_01968523.1| hypothetical protein RUMTOR_02100 [Ruminococcus torques ATCC 27756] # 1 154 3 156 156 237 86.0 2e-61 MNYAETIIDTYNVIVGSIVAVLSYVLGEHWCLFVAFLLLNVADWLTGWMKSRMAGKENSV KGWKGVLKKLGYWLMIMVAFGASAVFVEIGKTINVDLQITTLLGWFVLASLLINEIRSIL ENFVEAGYNVPKILIKGLEVADKVVNKDNISEGE >gi|330404222|gb|ADLB01000015.1| GENE 55 41749 - 41865 253 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAVVYATLILKGKKRLEDVPERIREEVKAVLKDLEVQM >gi|330404222|gb|ADLB01000015.1| GENE 56 41897 - 42301 482 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815854|ref|ZP_01968522.1| ## NR: gi|153815854|ref|ZP_01968522.1| hypothetical protein RUMTOR_02099 [Ruminococcus torques ATCC 27756] # 1 107 1 107 253 98 46.0 1e-19 MILKFNDATELQIQSAEPVEQGLRILAINTTPKQLRTLFSDKVKTKVMKVEERGQLIATY ENYTEYDHTEEYTGQIYGIVMNRVGKSLEEQLQEKDEQIHNLTEELKNANAQITDLQLAM CELYEGMGVSAKSK >gi|330404222|gb|ADLB01000015.1| GENE 57 42378 - 42737 291 119 aa, chain + 
## HITS:0 COG:no KEGG:no NR:no MNNKYLSKLTVQKTSTQGGWVVEKYANGWCKLYIKAQVYRASKEQTVYKLPFPNGIQLKN VRVQMTPAQNGWNVANFYHNTSGQNDDNAVISEVVIIFTAKDTTALTYCFDVCVSGFLV Prediction of potential genes in microbial genomes Time: Tue May 24 21:17:59 2011 Seq name: gi|330403967|gb|ADLB01000016.1| Lachnospiraceae bacterium 2_1_46FAA cont1.16, whole genome shotgun sequence Length of sequence - 92006 bp Number of predicted genes - 102, with homology - 81 Number of transcription units - 34, operones - 22 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 1 - 144 177 ## 2 2 Op 1 . - CDS 158 - 928 764 ## Ccur_02620 hypothetical protein 3 2 Op 2 . - CDS 916 - 1929 831 ## Apar_0590 hypothetical protein 4 2 Op 3 . - CDS 1914 - 2765 618 ## Ccur_02650 hypothetical protein 5 2 Op 4 . - CDS 2762 - 6484 3757 ## COG5283 Phage-related tail protein 6 2 Op 5 . - CDS 6503 - 7108 372 ## CLK_1325 putative protein GP15 7 2 Op 6 . - CDS 7105 - 7440 340 ## 8 2 Op 7 . - CDS 7520 - 8050 686 ## CLK_1327 hypothetical protein 9 2 Op 8 . - CDS 8056 - 8484 181 ## CLD_2452 hypothetical protein 10 2 Op 9 . - CDS 8481 - 8852 358 ## CLD_2451 hypothetical protein 11 2 Op 10 . - CDS 8852 - 9262 240 ## gi|153853386|ref|ZP_01994795.1| hypothetical protein DORLON_00784 12 2 Op 11 . - CDS 9262 - 9621 334 ## gi|154504713|ref|ZP_02041451.1| hypothetical protein RUMGNA_02220 13 2 Op 12 . - CDS 9635 - 9949 608 ## gi|295091507|emb|CBK77614.1| Collagen triple helix repeat (20 copies). 14 2 Op 13 . - CDS 10032 - 10886 1225 ## Shel_14060 hypothetical protein 15 2 Op 14 . - CDS 10907 - 11455 777 ## lin0108 putative scaffolding protein 16 2 Op 15 . - CDS 11476 - 11562 58 ## 17 2 Op 16 . - CDS 11595 - 11861 304 ## gi|166031066|ref|ZP_02233895.1| hypothetical protein DORFOR_00747 18 2 Op 17 . - CDS 11854 - 13518 1300 ## EUBREC_1286 hypothetical protein 19 2 Op 18 . - CDS 13524 - 14861 1122 ## LKI_09670 minor capsid protein 20 2 Op 19 . 
- CDS 14848 - 16173 853 ## cauri_1927 phage terminase large subunit 21 2 Op 20 . - CDS 16148 - 16615 470 ## gi|153815839|ref|ZP_01968507.1| hypothetical protein RUMTOR_02084 - Term 16642 - 16670 1.0 22 3 Op 1 . - CDS 16675 - 17340 382 ## PMI0517 hypothetical protein 23 3 Op 2 . - CDS 17342 - 17737 293 ## gi|296443490|ref|ZP_06885533.1| conserved hypothetical protein 24 3 Op 3 . - CDS 17730 - 18587 475 ## COG1192 ATPases involved in chromosome partitioning - Prom 18659 - 18718 3.4 - Term 18614 - 18663 8.1 25 4 Op 1 . - CDS 18729 - 19157 381 ## gi|154504699|ref|ZP_02041437.1| hypothetical protein RUMGNA_02205 26 4 Op 2 . - CDS 19163 - 19522 241 ## gi|225574763|ref|ZP_03783373.1| hypothetical protein RUMHYD_02840 27 4 Op 3 . - CDS 19488 - 19652 156 ## 28 4 Op 4 . - CDS 19710 - 20252 507 ## 29 4 Op 5 . - CDS 20256 - 20462 191 ## 30 4 Op 6 . - CDS 20462 - 20902 300 ## gi|225575248|ref|ZP_03783858.1| hypothetical protein RUMHYD_03337 31 4 Op 7 . - CDS 20905 - 21126 160 ## 32 4 Op 8 . - CDS 21140 - 21349 122 ## 33 4 Op 9 . - CDS 21351 - 21746 437 ## COG4570 Holliday junction resolvase - Prom 21878 - 21937 6.0 34 5 Op 1 . - CDS 21944 - 22024 68 ## 35 5 Op 2 . - CDS 22024 - 24249 1992 ## COG3598 RecA-family ATPase 36 5 Op 3 . - CDS 24252 - 25835 1068 ## COG1061 DNA or RNA helicases of superfamily II 37 5 Op 4 . - CDS 25835 - 26290 594 ## Ccel_3308 phage protein 38 5 Op 5 . - CDS 26306 - 27445 978 ## Ccel_3309 phage protein 39 5 Op 6 . - CDS 27445 - 28743 1212 ## GALLO_0437 hypothetical protein 40 5 Op 7 . - CDS 28831 - 28932 150 ## 41 5 Op 8 . - CDS 28950 - 29105 312 ## 42 5 Op 9 . - CDS 29128 - 29247 189 ## - Prom 29476 - 29535 4.4 + Prom 29215 - 29274 4.4 43 6 Tu 1 . + CDS 29445 - 29759 301 ## gi|255283158|ref|ZP_05347713.1| conserved hypothetical protein + Term 29765 - 29823 16.1 - Term 29522 - 29557 0.6 44 7 Op 1 . - CDS 29749 - 29937 272 ## 45 7 Op 2 . - CDS 29949 - 30170 186 ## - Prom 30303 - 30362 3.5 + Prom 30159 - 30218 8.6 46 8 Tu 1 . 
+ CDS 30244 - 30792 244 ## CD0910 hypothetical protein + Term 30837 - 30891 7.8 - Term 30670 - 30699 -0.4 47 9 Op 1 . - CDS 30793 - 31056 320 ## 48 9 Op 2 . - CDS 31133 - 31324 71 ## CPF_0929 hypothetical protein - Prom 31407 - 31466 9.8 + Prom 31405 - 31464 10.7 49 10 Op 1 . + CDS 31484 - 31837 156 ## CD2950 putative phage repressor 50 10 Op 2 . + CDS 31852 - 32337 325 ## CD2951 phage protein 51 10 Op 3 . + CDS 32382 - 32732 360 ## 52 10 Op 4 . + CDS 32803 - 32976 259 ## 53 10 Op 5 . + CDS 33035 - 34114 646 ## CPF_0390 hypothetical protein + Term 34217 - 34266 4.0 + Prom 34209 - 34268 4.7 54 11 Tu 1 . + CDS 34304 - 35668 875 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Term 35630 - 35676 5.2 55 12 Op 1 . - CDS 35705 - 36295 685 ## COG0715 ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components 56 12 Op 2 . - CDS 36364 - 36834 324 ## BcerKBAB4_5499 hypothetical protein 57 12 Op 3 . - CDS 36882 - 38702 1038 ## Cthe_1185 hypothetical protein 58 12 Op 4 . - CDS 38704 - 39612 774 ## COG0657 Esterase/lipase - Prom 39745 - 39804 9.6 - Term 39791 - 39833 8.7 59 13 Op 1 . - CDS 39846 - 41228 1731 ## COG2252 Permeases - Prom 41255 - 41314 7.3 60 13 Op 2 . - CDS 41331 - 42263 818 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 - Prom 42384 - 42443 13.2 + Prom 42278 - 42337 9.5 61 14 Tu 1 . + CDS 42455 - 42655 256 ## COG1278 Cold shock proteins + Term 42738 - 42791 5.0 62 15 Op 1 3/0.000 - CDS 42755 - 43219 355 ## COG1714 Predicted membrane protein/domain 63 15 Op 2 . - CDS 43200 - 44183 1214 ## COG0616 Periplasmic serine proteases (ClpP class) 64 15 Op 3 . - CDS 44187 - 44795 621 ## gi|210608470|ref|ZP_03287846.1| hypothetical protein CLONEX_00025 - Prom 44820 - 44879 7.3 + Prom 44639 - 44698 9.1 65 16 Tu 1 . + CDS 44935 - 46074 692 ## EUBELI_20336 hypothetical protein + Term 46083 - 46120 7.8 - Term 45912 - 45943 0.2 66 17 Op 1 . - CDS 46110 - 47612 1254 ## COG0714 MoxR-like ATPases 67 17 Op 2 . 
- CDS 47603 - 48949 1118 ## EUBREC_1241 hypothetical protein 68 17 Op 3 . - CDS 48964 - 49890 628 ## Cphy_1220 hypothetical protein - Prom 49944 - 50003 4.6 - Term 49991 - 50019 1.0 69 18 Op 1 . - CDS 50020 - 51045 1010 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 70 18 Op 2 . - CDS 51061 - 51729 321 ## PROTEIN SUPPORTED gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase - Prom 51771 - 51830 11.8 - Term 51801 - 51845 9.0 71 19 Tu 1 . - CDS 51863 - 57433 5835 ## CPE0191 hyaluronidase - Prom 57502 - 57561 5.2 - Term 57642 - 57688 10.3 72 20 Op 1 40/0.000 - CDS 57692 - 60109 2980 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit 73 20 Op 2 . - CDS 60147 - 61166 1202 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit - Prom 61216 - 61275 3.7 - Term 61215 - 61248 5.2 74 21 Op 1 . - CDS 61365 - 61442 112 ## 75 21 Op 2 . - CDS 61515 - 62372 888 ## COG1295 Predicted membrane protein 76 21 Op 3 . - CDS 62446 - 63759 1217 ## COG2385 Sporulation protein and related proteins - Prom 63810 - 63869 5.3 - Term 63824 - 63859 5.1 77 22 Op 1 . - CDS 63906 - 64070 151 ## 78 22 Op 2 . - CDS 64071 - 66182 1598 ## COG1305 Transglutaminase-like enzymes, putative cysteine proteases 79 22 Op 3 . - CDS 66195 - 67268 685 ## EUBREC_1423 hypothetical protein 80 22 Op 4 . - CDS 67270 - 68226 1036 ## COG0714 MoxR-like ATPases - Prom 68253 - 68312 10.3 - Term 68265 - 68299 5.5 81 23 Tu 1 . - CDS 68316 - 70004 1874 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) - Prom 70103 - 70162 6.6 - Term 70132 - 70181 4.1 82 24 Tu 1 . - CDS 70257 - 71570 1548 ## COG1757 Na+/H+ antiporter - Prom 71605 - 71664 5.4 - Term 71659 - 71712 14.4 83 25 Tu 1 . - CDS 71718 - 72404 789 ## COG1378 Predicted transcriptional regulators - Prom 72439 - 72498 4.8 84 26 Op 1 . - CDS 72509 - 74155 2237 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 85 26 Op 2 . 
- CDS 74173 - 75162 1065 ## COG2423 Predicted ornithine cyclodeaminase, mu-crystallin homolog 86 26 Op 3 . - CDS 75241 - 77205 1484 ## COG4932 Predicted outer membrane protein - Prom 77242 - 77301 5.9 87 27 Op 1 25/0.000 - CDS 77434 - 78429 742 ## COG0438 Glycosyltransferase 88 27 Op 2 . - CDS 78386 - 79591 931 ## COG0438 Glycosyltransferase 89 27 Op 3 . - CDS 79604 - 80197 598 ## COG0398 Uncharacterized conserved protein - Prom 80280 - 80339 5.0 - Term 80306 - 80353 9.5 90 28 Op 1 . - CDS 80359 - 81756 1040 ## Bsph_3214 hypothetical protein 91 28 Op 2 . - CDS 81785 - 82573 587 ## COG4905 Predicted membrane protein - Prom 82597 - 82656 6.8 + Prom 82617 - 82676 6.9 92 29 Op 1 . + CDS 82711 - 82950 316 ## gi|167761152|ref|ZP_02433279.1| hypothetical protein CLOSCI_03557 93 29 Op 2 . + CDS 82934 - 84976 1744 ## COG0370 Fe2+ transport system protein B + Term 84991 - 85037 11.6 - Term 84970 - 85033 18.2 94 30 Op 1 . - CDS 85046 - 86260 1415 ## COG3919 Predicted ATP-grasp enzyme - Prom 86281 - 86340 5.2 95 30 Op 2 . - CDS 86345 - 86611 357 ## - Prom 86684 - 86743 5.2 - Term 86765 - 86802 6.4 96 31 Op 1 . - CDS 86806 - 87645 881 ## DKAM_1054 predicted metal-dependent phosphoesterases (PHP family) 97 31 Op 2 . - CDS 87718 - 89346 1848 ## COG1283 Na+/phosphate symporter 98 31 Op 3 . - CDS 89372 - 89773 186 ## - Prom 89872 - 89931 5.8 + Prom 89849 - 89908 6.4 99 32 Tu 1 . + CDS 89954 - 90580 835 ## COG2116 Formate/nitrite family of transporters + Term 90586 - 90636 7.5 - Term 90346 - 90380 -0.9 100 33 Op 1 1/0.000 - CDS 90549 - 91202 747 ## COG0637 Predicted phosphatase/phosphohexomutase 101 33 Op 2 . - CDS 91205 - 91498 442 ## COG0607 Rhodanese-related sulfurtransferase - Prom 91524 - 91583 11.9 102 34 Tu 1 . 
- CDS 91669 - 92004 99 ## SZO_02320 transposase Predicted protein(s) >gi|330403967|gb|ADLB01000016.1| GENE 1 1 - 144 177 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no NVANFYHNTSGQNDDNAVISEVVIIFTAKDTTALTYCFDVCVSGFLV >gi|330403967|gb|ADLB01000016.1| GENE 2 158 - 928 764 256 aa, chain - ## HITS:1 COG:no KEGG:Ccur_02620 NR:ns ## KEGG: Ccur_02620 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 1 147 1 147 272 114 38.0 5e-24 MEIVTGYRGKPHITSEKWADLNRGIIGAEEYVLGVGRMFESELVSNNLLKIYDGCGVFQG REFSTSAGQSDEITIENGTQGEKRIDLIVARYTKNEDTKIETIEPVLIKGTPSASDPAVP KYTEGNIRQGDLIADMPLYEVELNGINVVEVRPLFRALMDMNKINKYLSNKENPVIMEKI VKTPGITLNAFEGKALSSSAITPPTVEGYRCIGLASGWGEGQVGLVVSPNGWAANCTNVK KTYNAVALKFLYLKSF >gi|330403967|gb|ADLB01000016.1| GENE 3 916 - 1929 831 337 aa, chain - ## HITS:1 COG:no KEGG:Apar_0590 NR:ns ## KEGG: Apar_0590 # Name: not_defined # Def: hypothetical protein # Organism: A.parvulum # Pathway: not_defined # 24 326 22 309 521 129 32.0 2e-28 MEKKMIVVNDAGRQIGYLDYSIEMDMDLGDTNDFAFEMKLASWDKEKMNYGFIIALPDTE YGGIIGDIQSSTGSAKVTLTGDTWRGMLAKKIIEPPSDQDYKKVSGELNSILRSLLDGQF GDLFLVPQKDTGVSVQNYQFKRYCTLIEGIEDMLSSMKYRLDIQYKQGGAGVPGWVEVQA VPTEDFSEQKEYNQDNRINFIARDYRRGINHLICAGTGEGTDRTVLHLYVQKDGTIGGNK FYKGLDERVALYSYTSQSDIEQLRKDGTKRLQELMNYKEFGMKVSDVDLSIGDIVSGRDF VTGILVQKPVVQKILKIQKGKINVEYILKGDDGIWKS >gi|330403967|gb|ADLB01000016.1| GENE 4 1914 - 2765 618 283 aa, chain - ## HITS:1 COG:no KEGG:Ccur_02650 NR:ns ## KEGG: Ccur_02650 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 83 280 72 274 275 76 26.0 9e-13 MMEIYYVNSKGIRLDLLKPPYLLQTGDFFDYIWGYESVDTSALSGKITDFIRGITEKTML LSILNYSKDDYYDAINYFHETVEYDVLHKQAGRLYIGEQYWQCYIIASDITEWENDIELL DNQITFVAEHPFWIKEEKHEFQPSTNADKTGEYLDFDFDIPFDLTGDAVGVGTIEFEHYS PCDFLMTIYGPCTNPRITINNHTYEVKTKLDKGEYLVIDSFAGTVYRMRTNGIQVNEFDN RNNENGSIFEKIQPGYNLISWDGSFGFDILIYLKRSTPKWKRK >gi|330403967|gb|ADLB01000016.1| GENE 5 2762 - 6484 3757 1240 aa, chain - ## HITS:1 COG:ECs2641 KEGG:ns NR:ns ## COG: 
ECs2641 COG5283 # Protein_GI_number: 15831895 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Escherichia coli O157:H7 # 60 365 204 511 696 177 36.0 1e-43 MADGKVVIETGLDSSGIEQGLKKLGGITKKGLKVAATAVAGTSAALTGVATVAIKTGSDF EAQMSRVKAISGATEQEFAKLKEQAIELGADTAFSSGQAAEGMENLAAAGFTTNEILEAM PGLLDLAAASGEDLSNSSDIAASTLRGFGLAAEDVGHVADVLAENANRTNSSVAETGEAM KYIAPLARAAGISMEETAAAIGIMANAGIQGGQAGTTLRGALSRLSRPTEDMQEAMSELG VSFYDSEGKMLSLTDQVDVLGKAMEGMTDEQKNNYLVTLYGQEALSGMLALINEGPESLE SLTKAYTTCDGSAKKAAETMQDNLKGAVEQLSGSAESLGIVFYESVSDNLKNAAVTATDS INEITDAFNNGGLDKAIETAGDEFAGLATKAAEHAPDMVDTAVSFIQSFAQGVIDNREEI LNAAEETARAFVGGIAKILPASEGIEEAGETVIDTMDNVIDITGKVAKVALPPLTKALDF AGEHLELIASSATTAFVAFKGYKVVNKASSAMEKGAKAWKVAKTAVDKYNEAQLIAMETG VASNATLTIGQAVVGKMTGKITLATAAHNIWNKVMNANPINLLITAVGALVAGLGVYALT TENASEKAGKLTAEQKKSIEESHKIAEEYAELNKQRQESMASVEAEYGHYTELNKELSGL IDSNGKVKEGYEERANFIVNELSTALGLEKEQIWDIIKSNGNLEESIGQVIEMKKAEALL QANEQAYTEAIQKQNEALAQYQESLKTLDEVEKEYNRTKKEASEVMDMYNELLKSNPKDA KEYLDVNSKIIDANEEAKKSYEKALKGVEDSESAYVGYVSTIENYEGLSSAIISKDAGKI KTAMENIKASFITAETGTKETLEKQVKNMKSHYEQLKQAVANGTPGVTQEMVTQAEQMVT RSKEELKKQSEMALPEITTALGNLGITATQAMLDSLSSKQPDVQSKTVELLNQIKAGTAL KEPEIKTLLSNLGIKSADGLIGAISGKKASVQAEGIKLIAQLETAEGAKREEILKKLREL GISAGSGLETGTSSKKKDVNSAAEEIVSEPERVADKHDLYTKAYGLGQSFGAGMEKGIGS YVGIVGDMGASMVANTLERVRQAQNSHSPAKETEYLGDDFGAGYALGIEKKKKLVGRTSS ELAETALDSLNMSDISSRMRNTMALNTGRIARSFSLETNTNIMNRQETNSMLKLSDEEIG KLAKEIGTVAGKEFSDRVNGMSVEVFEREFGRVVRKVDRQ >gi|330403967|gb|ADLB01000016.1| GENE 6 6503 - 7108 372 201 aa, chain - ## HITS:1 COG:no KEGG:CLK_1325 NR:ns ## KEGG: CLK_1325 # Name: not_defined # Def: putative protein GP15 # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 1 201 1 197 197 108 33.0 2e-22 MSVLTVPFPTWLQIDGVNCPIHSDFRTVLRCYEILGDKKELSKDEMVEMLSMFYVGKRCH TEEHIDKMFWFFSCGREKERKVFPRKIAGINNKQSFDFVEDAELIYAGFMQQYGIDLQTE EMHWWKFMILLENLGSDTKLQRVMEYRTIDTKNKDLSKKEREFYSAMQRYFALERKISEM PDRVKRIEEALLKGEDISGLL >gi|330403967|gb|ADLB01000016.1| GENE 7 7105 - 7440 340 111 aa, chain - 
## HITS:0 COG:no KEGG:no NR:no MVIHGICLDFALYNADKDTKEKYFRELEKMQHLLKNVPHGNEQEKNKYLCDSIKQMFDSI FGEGAGVEVCGEENDLLLHMDAYEQLVNEQIRQQEQYNGIMERLGSMRKRK >gi|330403967|gb|ADLB01000016.1| GENE 8 7520 - 8050 686 176 aa, chain - ## HITS:1 COG:no KEGG:CLK_1327 NR:ns ## KEGG: CLK_1327 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A3_LochMaree # Pathway: not_defined # 15 176 2 160 169 132 50.0 5e-30 MKMNLQFFAKAESTGVEQRFQQPDYLDVTGGSESPTFELLGFGVTQLDNSPSAQTTSKRY VNQKSATQRIGSYEWSAPLEFDLIRSEKALAFITDIGENEKTGADAETLYVKVYLNKPVP EQATQFEAKQRRVAIELSEFADNDGEIQGSGNLIAVSDWVAGTFDTSTKKFTPKGE >gi|330403967|gb|ADLB01000016.1| GENE 9 8056 - 8484 181 142 aa, chain - ## HITS:1 COG:no KEGG:CLD_2452 NR:ns ## KEGG: CLD_2452 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 7 141 6 141 144 108 41.0 4e-23 MSVAVQVKKFIETCPFLEEFENLFPSVSINKLDESPTMYSIEETPAEPIVKRYANGDSVR QYVFSLCSCELYGAVENEKTSEFYERFSDWLEDCTRKNVLPELSGQLQSKSIRATTNGYL YDNQSSICQYRIQCQFLYFKRR >gi|330403967|gb|ADLB01000016.1| GENE 10 8481 - 8852 358 123 aa, chain - ## HITS:1 COG:no KEGG:CLD_2451 NR:ns ## KEGG: CLD_2451 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 2 123 3 114 114 96 47.0 2e-19 MTKIRLDMDPTDKILLKRKLNKNGKGQRFFTHEVRRLSDPYTPMLSGHLKNNVTEEPAKI IYNAPYARRQYYENRGRGKQGTTKRNNHNYKCLRGKRWTERMWADRGKEIVKATAKYCGG KAR >gi|330403967|gb|ADLB01000016.1| GENE 11 8852 - 9262 240 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153853386|ref|ZP_01994795.1| ## NR: gi|153853386|ref|ZP_01994795.1| hypothetical protein DORLON_00784 [Dorea longicatena DSM 13814] # 1 134 1 133 134 72 33.0 7e-12 MLTNASITIFNQYPERMERRVVFIPHHIEKVWLHTKQKTAVVEGGLKSADEYLIRIPYEE CKEWLPSNDFRELGNPNENWTVQNGDFFIVGRWDSGEKVNGIEEIKKEFSGVTGKILSHS ENFFGSSKHIRIGGGS >gi|330403967|gb|ADLB01000016.1| GENE 12 9262 - 9621 334 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154504713|ref|ZP_02041451.1| ## NR: gi|154504713|ref|ZP_02041451.1| hypothetical protein RUMGNA_02220 
[Ruminococcus gnavus ATCC 29149] # 5 119 6 125 126 63 35.0 6e-09 MVKVDFLFYQEKYAGTIIPDATSLKQPLLKANLYLSQMLCEDGEGSAEELVKLCLCEVAE LLYQDSIIRQGYGGRQAQSENTDGYSVSYVSDESSLETKIYCVIDRYLSHTGLLYMGVE >gi|330403967|gb|ADLB01000016.1| GENE 13 9635 - 9949 608 104 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295091507|emb|CBK77614.1| ## NR: gi|295091507|emb|CBK77614.1| Collagen triple helix repeat (20 copies). [Clostridium cf. saccharolyticum K10] # 1 54 1 57 120 62 70.0 9e-09 MAYEPTVWKDGEVITAARMNKLEQGVKNEQVGPQGPAGAKGPAGERGPQGPAGPSYTLPA ANKTTLGGVKQAALVAEATGESVTKAEFKALLDALKAAGIMASI >gi|330403967|gb|ADLB01000016.1| GENE 14 10032 - 10886 1225 284 aa, chain - ## HITS:1 COG:no KEGG:Shel_14060 NR:ns ## KEGG: Shel_14060 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 280 1 340 345 169 31.0 1e-40 MANTIALRKAYSTALDEVYKLASLTAVLDGPNELVKEGANANEILIPKMSMQGLANYSKQ TGYVAGDVTLEYETKKCSYDRGRMFTIDAMDNIESAGIAFGRLSGEFLRTKVVPELDAYR LAAYASIPGVTTVAANLTDGKAALAALRAAKGKIENAEANVSTCYLFINPTVLGMIEDLD TTASKKAIEGFASVIKVPEGRFYSKIDLTASGAGGYAKNSAGKNLNFLIVDKQAVIQYQK HTVPKIIDPEVNQNADAWKFAYRTAGIAEYYDNKKDGIYVHTVE >gi|330403967|gb|ADLB01000016.1| GENE 15 10907 - 11455 777 182 aa, chain - ## HITS:1 COG:no KEGG:lin0108 NR:ns ## KEGG: lin0108 # Name: not_defined # Def: putative scaffolding protein # Organism: L.innocua # Pathway: not_defined # 1 178 1 178 189 69 32.0 5e-11 MKAEFLKNFGLEQDAIDKIMAENAKDVSAEQEKTKNAESEANSYKEQLETATASLEKFKN VDPEAMKGEIESLNQQLKDQKAEYEAQEADRVFKESVKTAIREAGGRNEKSVMALLDMDA LKESKNQSEDIKKALETVKESDAYLFGSKEPIDNAVTITGGAGSGSNLDAVRAAMGLPAK KD >gi|330403967|gb|ADLB01000016.1| GENE 16 11476 - 11562 58 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYSFLRRWGSSVSIDYITGQLIRPKTVN >gi|330403967|gb|ADLB01000016.1| GENE 17 11595 - 11861 304 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|166031066|ref|ZP_02233895.1| ## NR: gi|166031066|ref|ZP_02233895.1| hypothetical protein DORFOR_00747 [Dorea formicigenerans ATCC 27755] # 1 88 3 90 90 107 70.0 2e-22 
MFEKIMNYINNFLKNTPDDIYEFSIVLEDALVDDYDEMYKDQPNATDVLAEEVPYICASA EPGMTQEEIEEFKRKLKIEYDKAMEAVV >gi|330403967|gb|ADLB01000016.1| GENE 18 11854 - 13518 1300 554 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1286 NR:ns ## KEGG: EUBREC_1286 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 15 375 16 394 524 152 30.0 3e-35 MQPNDLIKISLQIEALFEEMQVRVMQDIVRRILKAGKITSTADYQIEKKILLGNSTEFIE AEIKRLTEKTQEQIWQMYEDVIDWEYVRNKGIYEQINGHFIPYEDNEQLQQWVYAIYEQT NNEIKNITRSMGFALNYAGRVVFTPFSEYYQKYLDRACMDVVTGVFDYNTVLRRVVKELT ASGIRTVDYASGCSNRVTVAGRRAVMTGVNQLSAKINEKVAKDLGTDTFEVTWHAGARPT HWWGGMVFTKKELENICGLGSVDGLCGANCRHNYMAFVPGVSVRTYTDEQLAEMNHQERQ TKEWKDKKYTTYEATQKQRKMETSMRAQRERIKLLQDGKADKDTIMLEKAKYQGQLNEYA KFCKKMGLPQERERIYLDGFGKVATNTKWQNMKYTSEMIRNARKDSKQYEKYKEVLGNSV GTLADFRQMKYNKPREFELLTKKFDTYSAINKKNWSEEFKQKSKDAYVRFEKEGIYLSDH ALSRLPRLNKKGFLEISENDVKKVLKGNPNYKEGETKLVYFSEELQLSVIKNIETGDIVS IIRAKKPKGDWENV >gi|330403967|gb|ADLB01000016.1| GENE 19 13524 - 14861 1122 445 aa, chain - ## HITS:1 COG:no KEGG:LKI_09670 NR:ns ## KEGG: LKI_09670 # Name: not_defined # Def: minor capsid protein # Organism: L.kimchii # Pathway: not_defined # 44 439 43 476 504 110 24.0 2e-22 MWMSRLWKAVRKMFGYTEIKRIIGRDVTLSQDMIDAINDWKNMLNGKADWVSKYVKSLKI EQGICREFADVVLTEMESNLSNDKLDTYYQKALLDLNENMQDGLGLGSFILKPLGNGKSE FVSADKFIPIHFDDTGKPDDCAFMTVKTVGENQHYTKIERHRLENKVLIIENTVYYSESR NDIGRRVGLDSVEEWAKLPEAVSYPGMDRMDFGFYRNPMKNSIDESPCGISIFEIAKELI KKADIQGARLDWEYESGERAIHVDDKALKKSGSNFRLPRLNERLYRGMNLEDGKDKELLR EYSPLMRDEAFQRGLEKYYRQIEFSVGLSYGDLSDCQEVEKTATEIKVAKQRKYNRVNAI QAKLKECLEDYAAGLAFREGMYTSGYEFNCKFNDSILTDEEAERKQDMQDVSMGVMQLWE YRMKWYNEDEETAKQNLPEQHRDME >gi|330403967|gb|ADLB01000016.1| GENE 20 14848 - 16173 853 441 aa, chain - ## HITS:1 COG:no KEGG:cauri_1927 NR:ns ## KEGG: cauri_1927 # Name: not_defined # Def: phage terminase large subunit # Organism: C.aurimucosum # Pathway: not_defined # 1 428 1 401 414 118 26.0 4e-25 MQLSRKQNEYIVNATHRWNVKSGAVRSGKSYVDTAFVIPFRIRERAGKPGLNVILGVSKE 
SIERNVLQPMREIYTDKLIGNINNRNVARICGEDVYCLGAEKISQVAKIQGASIKYCYGD EVAKWNKEVFQMLKSRLDKPYSCFDGSCNPEHPTHWLKEFLDNDELDIYLQRYTIFDNPF LPEEFVQQLCKEYEGTIYYDRLILGLWKRAEGAIYKRFADNPDKFRCEVVDELTMETDIK QFRKDDIVSIEIGLDFGGNQSGHSFVARGYTDDYREVIALKSKRIMAKDENEDIDSNTLD RLFCEFVQEVIDKYAVIVRRGDYVEYCNVESVYYDNAETVLGNSIRNAVEKKYPWVSVRK AKKKTINDRIRCTVKLMGAGRFFITDDCESLETALSDAVWNKEVKEKDERLDDGSTDIDS LDAFEYTIERDMKELIQNVDE >gi|330403967|gb|ADLB01000016.1| GENE 21 16148 - 16615 470 155 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815839|ref|ZP_01968507.1| ## NR: gi|153815839|ref|ZP_01968507.1| hypothetical protein RUMTOR_02084 [Ruminococcus torques ATCC 27756] # 47 155 1 109 109 140 84.0 2e-32 MAGYDNIKDYGFDKRTADERRELAIKAGKASGEARRRKADFRKTLNMLLTAEIDNEEYKP ILESLGVACTLESAILMAQIKEAMAGDTKAATFVAKYSGQSPEPEENRRNRDADTELKQA RKQAVTGENETDEALDKLDSILKELHENAVKQETE >gi|330403967|gb|ADLB01000016.1| GENE 22 16675 - 17340 382 221 aa, chain - ## HITS:1 COG:no KEGG:PMI0517 NR:ns ## KEGG: PMI0517 # Name: not_defined # Def: hypothetical protein # Organism: P.mirabilis # Pathway: not_defined # 28 216 26 215 222 65 25.0 9e-10 MLRRYDDLREIKIEIEKYISQLQGNSKIKDAHIIKGIAKYILFLKRIQKSKAYGHYGSCF IYDMLMLMHSLTQNSERDFYATYRSAIENFVRCFLEIEDNDETGVRNLFSNLRVLCINGE SEEIITYIEGEYGKCCNFVHSNIKANMEIYEYYADILKQNELSKEKTDRLINCVMTFVKK ITELMIENKILWVDETFYKDKQTLKFLIGNQMYSRFEEKVS >gi|330403967|gb|ADLB01000016.1| GENE 23 17342 - 17737 293 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|296443490|ref|ZP_06885533.1| ## NR: gi|296443490|ref|ZP_06885533.1| conserved hypothetical protein [Clostridium lentocellum DSM 5427] # 3 131 5 141 141 72 36.0 8e-12 MNNKLGDKKLVSSLMIRMKQKNIDKLLMLGIYCQLVYSKTIFLRNSIVAEFIHEVLEIDF PTYVVSSRTLMTARTIKIIYNFDDAEIEKLRRKTLYLLENTEIEENVKKSAKSSKKKKNE NDKLESWLKGL >gi|330403967|gb|ADLB01000016.1| GENE 24 17730 - 18587 475 285 aa, chain - ## HITS:1 COG:DR2040 KEGG:ns NR:ns ## COG: DR2040 COG1192 # Protein_GI_number: 15807034 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome 
partitioning # Organism: Deinococcus radiodurans # 13 220 20 246 331 101 31.0 1e-21 MYGGIRMKKDGKVVSFINMKGGVGKTTLCIGIGEYLAHYLNKRVLFIDLDPQFNTTQSLM NLFELEDEYMTNYSVKNKTVRRLFESPTTVSEMPKLPEKEDVIIDLDYNISIIAGTINLI FDDNNKSTSASRRVKKFIEENALRNEYDYIFIDCPPTISLYTDSALIASDYYLVPVKVDR YSILGIKLLDQVIERLKFDETLNIKPLGIVYTMLDNTITQKTKKIMETLESSEIVNKIGL FETKTFFVRDLMVGLGGNIASRYNSSRIDIEALCSEFMKEVEKNE >gi|330403967|gb|ADLB01000016.1| GENE 25 18729 - 19157 381 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154504699|ref|ZP_02041437.1| ## NR: gi|154504699|ref|ZP_02041437.1| hypothetical protein RUMGNA_02205 [Ruminococcus gnavus ATCC 29149] # 1 135 4 138 140 122 57.0 5e-27 MDKGRLKKHKKNKDRLKRIDEKIEELCGREVQVVSGKVTGSSKDFPYTEVRTSVLMYEPY ENDRINKRIREYEAERLVLLQEVEEVDRYIEGIKDSEVRETFELSFVEGKKQREVAEELS IDRSYVSKKINNYLKLSHNSQK >gi|330403967|gb|ADLB01000016.1| GENE 26 19163 - 19522 241 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225574763|ref|ZP_03783373.1| ## NR: gi|225574763|ref|ZP_03783373.1| hypothetical protein RUMHYD_02840 [Blautia hydrogenotrophica DSM 10507] # 4 119 6 121 121 104 55.0 2e-21 MEVAQSLKNFLDFVDECHSLNSMAKNGISTEEKRQQDLLHAIEFETNGKKRGPLDTKLHK CRVARRGYKDIFEVTEEVVKFFRDPQHKKTLDHMRQLLGSVRKVEEYQKNRFYIPRIKE >gi|330403967|gb|ADLB01000016.1| GENE 27 19488 - 19652 156 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAELEEEQASESAVKAFHRKAYPTYSVGDCLEKWGVNLKGGIGDGSSTKPKEFS >gi|330403967|gb|ADLB01000016.1| GENE 28 19710 - 20252 507 180 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNVLEKILEEIENHAIEFESFGMCDDYVSVGWVKEIIRKHMNEKEKVTSAEIVSREVDGK TYYAIKYKEVGKDYYTVGYGSYKLDYVVCWLNEYFEFCGEAKVVVGAGKDTNVPNNDGWI PVEERLPEDERMVLVTYMTKSGISSVDRARFDGKYWRGSGFMGRVIAWQPLPEPYKPKKK >gi|330403967|gb|ADLB01000016.1| GENE 29 20256 - 20462 191 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICSRCGKEIIETYRSEYFNCIRGQTWRTNIAFAKLGLWQPEQNKAEKPQVIALCSECYE EFVDFMEA >gi|330403967|gb|ADLB01000016.1| GENE 30 20462 - 20902 300 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225575248|ref|ZP_03783858.1| ## NR: gi|225575248|ref|ZP_03783858.1| 
hypothetical protein RUMHYD_03337 [Blautia hydrogenotrophica DSM 10507] # 2 142 3 137 142 91 42.0 1e-17 MREILFKAKRKNWKELPKEDWWVEGYYLNIAEINHFICTGKIKLNGAIKGIIAPEMYAID INTLCQYTGLTDKNGKRIWENDICDRKEKYPEIVTYNKGDWQLDYSYVFGKEIHHDACNL GFYAYERNCVEVIGNIFDNPELSEVE >gi|330403967|gb|ADLB01000016.1| GENE 31 20905 - 21126 160 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEVLAKYEETGLTPEQVQELKERDTAKRPIKTTDETGIKYTDSYRCPNCGGNFTGTGIAD YCYHCGQRLDWSE >gi|330403967|gb|ADLB01000016.1| GENE 32 21140 - 21349 122 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTNEQMFKATNEIYNCFWLKWRNKKISGNDWNILIQESRELAKRYPFKFVREWIIEMQDE LEKRNKILN >gi|330403967|gb|ADLB01000016.1| GENE 33 21351 - 21746 437 131 aa, chain - ## HITS:1 COG:SPy0673 KEGG:ns NR:ns ## COG: SPy0673 COG4570 # Protein_GI_number: 15674739 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvase # Organism: Streptococcus pyogenes M1 GAS # 1 129 2 131 131 144 56.0 5e-35 MKFFIAMNPPTKTYQQKKVAIVKGKPKFYEPPELLAARCKLRDHLAGYVPDKMFNGPVRL VVKWCFHCTGNHKDGEYKYTKPDTDNLQKMLKDVMTELKFWKDDALVVSEIVEKFWAELP GLYIQVESLEK >gi|330403967|gb|ADLB01000016.1| GENE 34 21944 - 22024 68 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCLSLGTKTENDVLVPKDKKVGQKHD >gi|330403967|gb|ADLB01000016.1| GENE 35 22024 - 24249 1992 741 aa, chain - ## HITS:1 COG:SPy0671 KEGG:ns NR:ns ## COG: SPy0671 COG3598 # Protein_GI_number: 15674737 # Func_class: L Replication, recombination and repair # Function: RecA-family ATPase # Organism: Streptococcus pyogenes M1 GAS # 5 741 6 753 757 860 58.0 0 MERGYDIKQILDFLNPSELNYQEWIYVGMALKEDGYDVSVWDSWSRSDSRYHSGECQRKW AGFHGNGYPVTAGTIVQMAKDRGWKPEIQGYELDWNDDIFIEKDDKVVVDKTWIENVEVQ QPKAWNPASQLIKYLETLFESSENVGYVTQSWKNEDGKFLPSKGNTDRTAGQLIEQLSKC NDDIGSVLGDYNTEAGAWIRFNPIDGKGVKNENVTEFRYALVESDELEIEKQNAIIRELE LPVACLVHSGGKSIHAIIKVDASDYNEYRKRVDYLYAVCQKNGFKVDTQNKNPSRLSRMP GIIRNGNKQFLIDTNIGKQSWNEWYEWIESVNDDLPEPESLESVWEDLPELAPCLIDEVL RKGHKMLIAGPSKAGKSFLLIELCCAIAEGRQWLDWQCEQGKVLYVNLELDRASCLHRFK 
DVYQALGWNAEHLNNIDIWNLRGKSVPMDKLAPKLIRRAAKKDYVAVIIDPIYKVITGDE NSADQMSAFCNQFDKICTELGTSVIYCHHHSKGSQGGKRSMDRASGSGVFARDPDALLDL IELELTDAVKKQEVNKAICRACTMYLDSHFAWQEDVSQDDMLSSRAMQEYCKEKLTRRQY AELQVIIEKEIKLAEGKTAWRIEGTLREFQKFEPKNTWFDYPVHRADTVGILKDVQPDEE KPAWKRAMESRKSKNTKAEERKKSVEMAVAACGIEGDITLQALAEYMGTSEKTARRRVNE HGGYQILGSKVTERTGTKTEK >gi|330403967|gb|ADLB01000016.1| GENE 36 24252 - 25835 1068 527 aa, chain - ## HITS:1 COG:SPy0669 KEGG:ns NR:ns ## COG: SPy0669 COG1061 # Protein_GI_number: 15674735 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA or RNA helicases of superfamily II # Organism: Streptococcus pyogenes M1 GAS # 1 525 1 525 527 766 69.0 0 MELRPYQQEAREAVEQDWKSGIKRTLLVLPTGCGKTIVFAKIVEDCVREGRRVLILAHRA ELLEQAADKLSRSTGLKCAVEKAEESCLGSWFRVVVGSVQSMQRPKRLQQFERDYFDTIV IDEAHHCISDGYQTVLQYFSEADILGVTATPDRGDMRNLGSYFENLAYEYTLPKAIKEGY LVPIKALTIPLKIDMSGVGMQSGDFKAGDIGTVLDPYLQSIADEMIKYCMDRKTVVFLPL VKTSQKFCQILNEKGFAAAEVNGNSQDREEILSDYENGKYNVLCNSMLLTEGWDCPEVDC VIVLRPTKVRSLYCQMVGRGTRLYPGKDHLLLLDFLWHTERHELCHPASLICENEDVARQ MTKNMEEEPGIAIDIEEAERSAAEDVVVQREEALAKQLSEMKKRKKKLVDPLQFEMSIQA EDLSGYIPSFGWEMSPPSDKQKKTLEKLGIMPDEIENAGKAAKLLDRLDKRRAEGLTTPK QIRFLEGRGFQHVGTWKFETAKNLIDRIAGNGWRIPRDINPQEYKGA >gi|330403967|gb|ADLB01000016.1| GENE 37 25835 - 26290 594 151 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3308 NR:ns ## KEGG: Ccel_3308 # Name: not_defined # Def: phage protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 151 1 160 160 142 48.0 4e-33 MNNEGRELNWDDQIEQDSQEFILLEEGDYDFKIEKYERARSQGSGKLPACNMAKVFFTIE SAKGSTTITENYILHSSLEWKLSELFAAIGLKKKGERISMNWNQVSGAAGRAHIIVDTYE NRDGEERKINRIKKLYPKEEEPQKTFKAGAF >gi|330403967|gb|ADLB01000016.1| GENE 38 26306 - 27445 978 379 aa, chain - ## HITS:1 COG:no KEGG:Ccel_3309 NR:ns ## KEGG: Ccel_3309 # Name: not_defined # Def: phage protein # Organism: C.cellulolyticum # Pathway: not_defined # 1 377 1 388 388 426 56.0 1e-117 MQIIRGIIPSAQKVLVYGPEGIGKSTFVSQFPDPIFIDTEGSTKHMNVARTPKPSSFQML 
LEQVTYFRDNPNELKTLVIDTADWAEKLCSENLCAKNKKKSIEDFGYGKGYVYLMEEFGG LLNLLDELIERNINVVMTAHAKMRKFEQPDEMGAYDRWELKLSKQVAPLLKEWADMVLFA NYKTYVVNVDGQGAEKGKNKAQGGTRVMYTTHHPCWDAKNRHNLPPETKFSYEEIAHIFQ GNASCSTEKVVESKCENVVQNPIEQSLPERKENVMPEFMTQEQMPVEVQSNQETVSPDRS SFDSNPEHLPKALLDLMKEHNVNEWHIQSAVAARGYYPENTPIERYDPEFIQGVLVGAWP QVYEMIKEIKEKEEIPFSE >gi|330403967|gb|ADLB01000016.1| GENE 39 27445 - 28743 1212 432 aa, chain - ## HITS:1 COG:no KEGG:GALLO_0437 NR:ns ## KEGG: GALLO_0437 # Name: not_defined # Def: hypothetical protein # Organism: S.gallolyticus # Pathway: not_defined # 5 423 4 422 430 492 65.0 1e-138 MTMKINKLEIENVKRIRAVKIEPTQNGLTVIGGNNNQGKTSVLDSIAWALGGENFRPTEA MRHGSNVPPNLKIVMNNGLVVERKGKNSSLKVTDPSGEKAGQTLLNTFVETLALNLPKFM ESSGKEKANTLLKIIGVGDKLLLLEKEEKELYNQRLTIGQIADQKEKYAKEQVYYAEAPK DLISPTDLIKKQQEVLARNGENQRKREKVSQYQQSVAFLNQEVLAMREQLQKKEAELEEA KASLNVALMTAQDIHDESTAELENSIANIEEINRKVRANLDKEKAEEDAKEYRKQYTELT EKIEKTRTKKQDLLNAAELPLPELSVKEGELIYKGQKWDNMSGSDRLKVSTAIVRKLNPE CGFVLLDKLEQMDMATLKEFGEWLEKEGLQAIATRVSTGDECSIIIEDGYVVGQETAEPI PTKTNKWREGVF >gi|330403967|gb|ADLB01000016.1| GENE 40 28831 - 28932 150 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKWKEDFLEYLVYSLPAWFCVGMFLYWLIIGY >gi|330403967|gb|ADLB01000016.1| GENE 41 28950 - 29105 312 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANEKKSEFSKRRAKAYRKRWLYKRLEETVIVLVIVLALFGTFALGVLIAG >gi|330403967|gb|ADLB01000016.1| GENE 42 29128 - 29247 189 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNEETKKREDFIKKLLVTLIEHQENSEWTQIHKKEDKTA >gi|330403967|gb|ADLB01000016.1| GENE 43 29445 - 29759 301 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|255283158|ref|ZP_05347713.1| ## NR: gi|255283158|ref|ZP_05347713.1| conserved hypothetical protein [Bryantella formatexigens DSM 14469] # 2 104 10 116 116 80 37.0 4e-14 MQLTKDTDKMLCLIYEEFLERRKNGLSKSNAKTFERPAALQEQFLQGIHKDDIYDALVEL KRNNLIRAYYDMGFQLNDSAIIYMENRFKNGLKEVTDFISKFIP >gi|330403967|gb|ADLB01000016.1| GENE 44 29749 - 29937 272 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MKAKKLESVHIDVKNGVGHVEVNGKDISANGIYLCLTFENGEWSLMTTKDEFYSTSDQEV KE >gi|330403967|gb|ADLB01000016.1| GENE 45 29949 - 30170 186 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGFKMSNDTVVLNYTKHNAVSFLCQAMLDETLRLRKCTRRKERDSIRNFLVESSRILIL GVQDKENNNEKAG >gi|330403967|gb|ADLB01000016.1| GENE 46 30244 - 30792 244 182 aa, chain + ## HITS:1 COG:no KEGG:CD0910 NR:ns ## KEGG: CD0910 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 179 20 205 208 100 40.0 2e-20 MLTKESKLILNYLISAFSENETMISYSEIIENTGLPLHEVDTSLQFLFEENYLKLKRYKN GGFVHSLTHKGFHYEEFDSSSPVSQTNIFNAPVSNSAVGNTGNITINNGISFSEIRSFIE SMNISVNDKKEALQVIDYVETLTENEAPLKKGFLSKFKDTLSKLDWLPDLIGKCLVAYFS SI >gi|330403967|gb|ADLB01000016.1| GENE 47 30793 - 31056 320 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKCLKCGAEIKESESDKFCGKCGFPLKIAARDGKITKLESIFLDVQSGILLVNGKEIDNV TALNLVFDKGKYGLDIQFDKTFKAYIC >gi|330403967|gb|ADLB01000016.1| GENE 48 31133 - 31324 71 63 aa, chain - ## HITS:1 COG:no KEGG:CPF_0929 NR:ns ## KEGG: CPF_0929 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 1 62 1 62 78 65 59.0 8e-10 MSFDKLKGKMAEIKMSQEKLSRCLGITVQSLNAKLNGRSQFTLEEVVKITKELNIKDPVD IFF >gi|330403967|gb|ADLB01000016.1| GENE 49 31484 - 31837 156 117 aa, chain + ## HITS:1 COG:no KEGG:CD2950 NR:ns ## KEGG: CD2950 # Name: not_defined # Def: putative phage repressor # Organism: C.difficile # Pathway: not_defined # 9 78 8 77 151 84 64.0 1e-15 MNEKEVSDKMQEIMIRMKNRREELDMSYQTLSDKVGISKSTLQRYETGYIKNMPVDKLED IADALQVSPAYLMGWETNSTTNEPTTIAAHFDGTEYTEEQLDRIKAFAAFIKTEDNK >gi|330403967|gb|ADLB01000016.1| GENE 50 31852 - 32337 325 161 aa, chain + ## HITS:1 COG:no KEGG:CD2951 NR:ns ## KEGG: CD2951 # Name: not_defined # Def: phage protein # Organism: C.difficile # Pathway: not_defined # 1 157 1 151 157 156 51.0 3e-37 MNKYEELLQDASDDNIRVYESFDLNGDSPSTIKIDGLYIDGNIALDKNLKTTAEKACVLA EELGHHYTSHGNIIDLTHVQSRKQEHQARFHGYNRLIGLCGIISAFKAGCQNTYEIAEHL HVTEDYFQQCINCYREKYGICTTVDNYVIYFIPNLTVGERI 
>gi|330403967|gb|ADLB01000016.1| GENE 51 32382 - 32732 360 116 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKKLLTLLMALTLSASIIACGGKKLPDGMSQDTYDTGIKALEIMDKYNDADIDADEADK RLEALSDKLDSLELSDDESAENSFVQSNILSFQSKLFLGGDTYTTADDLRKLLELD >gi|330403967|gb|ADLB01000016.1| GENE 52 32803 - 32976 259 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYIFENENALELYYHITKEMNVKLVPHGFEDEVYISEEFMNECMKDIITKDKRFKK >gi|330403967|gb|ADLB01000016.1| GENE 53 33035 - 34114 646 359 aa, chain + ## HITS:1 COG:no KEGG:CPF_0390 NR:ns ## KEGG: CPF_0390 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 218 358 73 201 203 70 35.0 1e-10 MGLRFRKSIKIAPGLKINLNKNSISATVGTKGAHYTVNSKGKRTASVGIPGTGISYTQTT GVKSKGSKKADTSTNSYTYSQPNNTEPNNNDKKWYQKTGWIIAWLILFFPVGLFLMWKYS DWKKAIKVIVTALFACFTIAAITSPTLEDVSLSADTSKTYDIKQDIPIKATVSPSDYELS DTDFNISGGELKISDGKITFSATKGGAYEVWVEHDDIKSNTLKFKIEDKKAIAKKKAEEA KKKAEEEAKRKAEEEAKRKAEEEARIAAEQEAKRKAEEEARLKAEQEAKRKAEQNTPPAN QTHTQAPQQSSQSGGTVYWVSGGSVYHSTPNCATLKRSSNIRSGSITESGKSKPCKVCH >gi|330403967|gb|ADLB01000016.1| GENE 54 34304 - 35668 875 454 aa, chain + ## HITS:1 COG:lin1231 KEGG:ns NR:ns ## COG: lin1231 COG1961 # Protein_GI_number: 16800300 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 8 397 4 402 471 135 29.0 2e-31 MAKTKRCAIYIRVSTEEQFINGLSLQAQQKALTEYATTNNYTIVDVYADEGISARKSMKH RKELLRLLDDVKHNKIDMILVTKLDRWFRNIKDYNVTEEILQAHNCNWKTIFENYDSSTA NGQMVINIMLSVNQAECDRTSERIKAVFDYKRSQGKVVSGMAAPYGYKVVDGYIQKDLEV APIIEDAYDYYFKCFSQRKVLEYIQDKYGDKAPTVYKLDKLFKNEKYAGKFQDNLNFCEP YISPSQFERIQMISQSKTYTRQHNHTFIFSSLLKCPVCKSSLSGFIKKQTLKDGTVAEYY RYRCSNKLSYHSGGPCITESIVEKYMIENVFQNLKKEYIDFNVKRKAVKKKKDNSKKITE ELERLNSMYQKGRISEAFYDSEYERLQAELNKTNDIATITVIERYSELLGKFSGNWLSLY NKLDNQHKNTFWKSIIQEIYFDENNKLSGFKFLI >gi|330403967|gb|ADLB01000016.1| GENE 55 35705 - 36295 685 196 aa, chain - ## HITS:1 COG:BS_ytlA KEGG:ns NR:ns ## COG: BS_ytlA COG0715 # Protein_GI_number: 16080111 
# Func_class: P Inorganic ion transport and metabolism # Function: ABC-type nitrate/sulfonate/bicarbonate transport systems, periplasmic components # Organism: Bacillus subtilis # 20 192 20 188 229 194 53.0 1e-49 MKKRVISLMMTAVVVLSAVSMTACKKEEEPKDKLQKVRLNEVAHSIFYAPMYVAIEEGYF EEEGIDLILKTGFGADKTMTAVLSGDAEIGFMGSESSIYTYNEGAKDYVVNFAQLTQRAG NFLVAREEMPDFKWEDLKGKEVLGGRKGGMPEMVFEYILKKNGMIPQKDVKINQSIDFGS TAPAFSEGKGDFTVVL >gi|330403967|gb|ADLB01000016.1| GENE 56 36364 - 36834 324 156 aa, chain - ## HITS:1 COG:no KEGG:BcerKBAB4_5499 NR:ns ## KEGG: BcerKBAB4_5499 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 1 156 1 162 164 125 42.0 7e-28 MENLNAITGTVQRIQPISEDCCRQMITIQNAEGVHNFIISPETYVIDMIRIRPGMTVTVF YDANLPVPLIYPPQYQAVIIGRNIPNENIFVGYFDENLTAVDEQLKLNISRGTEIITSNG QQYMCPVGQNMLIVYYTITTRSIPPQTTPRKIIVMC >gi|330403967|gb|ADLB01000016.1| GENE 57 36882 - 38702 1038 606 aa, chain - ## HITS:1 COG:no KEGG:Cthe_1185 NR:ns ## KEGG: Cthe_1185 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 1 415 1 422 426 318 38.0 5e-85 MSETKRASWRKLDNAAKIFPATSGRADTRVFRFYCELQEEVDEGILQKALDKTVLKYPVF LSVMRKGLFWHYLEKSHLRPAVEEEWKEPCSDIYIRDKKSLLFEVTYYKNRINFEVYHAL TDGTGATLFLRELVKNYLLIAHKSEGIKDVPLGNESVTVTDQEIDGFEKYYSPKIKKPKE EKKKAFQLKKQKQNRFQMQITEATLSVSEVLEKSREYGVTMTVLLTSVLLCAIHREMSKT EEKNPVKLMIPVNLRKLFPSESMLNFFGWIEPCYQFGQEKDSLEDVLAYVKQFFQEELNT KRMAEHMNEWISLEKNPFLRIAPLELKNVCMQAGAKLAEKELTAVFSNMSVVSMPEEYRK YIYRFGVYTSTGKMELCACSFEDKLSLGFTSRSDTENVIRNFFDILSQVGIEAKIETPTY PETKKQTELAQKLFQWFSFLCIAGVTLTFGINTMINSGIYWAVFVTGLICSTWLVFVTGF KKRHNILKNGMWEMMLITVGCVLWDIGTKWKGWSVGYVFPIAVICIITFMLTAIRVQKLK AKDYMIYLLMAGSYGMTVPFIFLLTRVVTNTIPSLLCVMFSFLLLVALIIFKKEEVLQEI HKKFHI >gi|330403967|gb|ADLB01000016.1| GENE 58 38704 - 39612 774 302 aa, chain - ## HITS:1 COG:lin2194 KEGG:ns NR:ns ## COG: lin2194 COG0657 # Protein_GI_number: 16801259 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Listeria 
innocua # 51 301 97 346 347 169 38.0 8e-42 MRRVVRKVLRGLSKGGIKVKPARYMADLKGIDLLKIFHRTTDYYVENNGYKIPIRLYLPK EMQGERKALLFFHGGGWVTESVDSYERICAKLTEATNQIIVSVEYRLAPEYPFPAGFEDC YRVAEAMFNEKLIKDMPPENITLIGDSAGGNLVAGVCQKARDTGDFSPRRQILIYPAVNS DYSEKSPYMSVQENGTDFFLTVQKMQDYIELYKSEEKDKDNPYFAPIKAKTFANLPRTLI LTAELDPLRDEGEDYGRRLQEENVDVQIYRIQNAIHGFFALGFKHYHVQESLEYICKFLE EC >gi|330403967|gb|ADLB01000016.1| GENE 59 39846 - 41228 1731 460 aa, chain - ## HITS:1 COG:CAC2772 KEGG:ns NR:ns ## COG: CAC2772 COG2252 # Protein_GI_number: 15896027 # Func_class: R General function prediction only # Function: Permeases # Organism: Clostridium acetobutylicum # 1 459 1 429 429 345 50.0 1e-94 MLEKIFKLKENKTTVKTEILAGITTFMTMAYILAVNPSILSAAGMDQGAVFTATALAGFL GTMLMALFANYPFALAPGMGLNAYFAYTVVIGMGYTWQVALAAVFVEGIIFILLSVTNVR EAIFNAIPMNLKSAVSVGIGLFIAFIGLQNAKIVIGGSTLVQLFSVKGYNELNKVSASMN DVGITVLLAVIGIIITAILVVKNVKGNILWGILITWILGIICQLTGIYVPNAELGMYSLL PDFSNGISVPSLSPIFAKLSFSGINIGQFMVVVFAFLFVDIFDTLGTLIGVSTKANMLDK DGKLPRIKGALMADAVATTAGAVLGTSTVTTFVESASGVSEGGRTGLTAVTTAVLFGASL LLSPIFLAIPSFATAPALVVVGFYMLTNVANIDFSDFTEGLPCFICIAAMPFFYSISEGI AMGVITYVIINLVAGKAKEKKISVLMIVLAVLFIGKYFLL >gi|330403967|gb|ADLB01000016.1| GENE 60 41331 - 42263 818 310 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 4 310 3 306 308 319 55 3e-86 MAKIYKNILELVGNTPLLEVTSIEEAENLKAKVYVKLELFNPAGSVKDRVAKKMIEDAEE MGKLKKGATIIEATSGNTGIGLAMAAASKGYKAIFTMPETMSVERRKLLQGYGAEIVLTD GKLGMKGALQRAKELEEEIEGAIILGQFTNPSNPKAHYGTTGEEIWEDTDGKVDMFIAGV GTGGTITGTGRCLKEKNKNIEVVAVEPAQSPVLSGGNPGPHGIQGIGAGFIPDILDTNIY DRILTVENEQAFKGARLLAEKEGVLVGISSGAAVYAAIEEAKKEENAGKVIVTLLADTGE RYLSTELFKY >gi|330403967|gb|ADLB01000016.1| GENE 61 42455 - 42655 256 66 aa, chain + ## HITS:1 COG:SA1234 KEGG:ns NR:ns ## COG: SA1234 COG1278 # Protein_GI_number: 15926982 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Staphylococcus aureus N315 # 1 66 1 66 66 87 68.0 7e-18 
MNKGTVKWFNNQKGYGFISDEAGNDVFVHYTGLNMEGFKSLEEGQAVEFEITEGAKGPQA VNVVKL >gi|330403967|gb|ADLB01000016.1| GENE 62 42755 - 43219 355 154 aa, chain - ## HITS:1 COG:BH3197 KEGG:ns NR:ns ## COG: BH3197 COG1714 # Protein_GI_number: 15615759 # Func_class: S Function unknown # Function: Predicted membrane protein/domain # Organism: Bacillus halodurans # 9 147 40 178 178 81 36.0 5e-16 MQNNYDNPVMYGGFFARLSAYIIDSIIVWVGLLFIRIPLHFLFSQIDGNILFHYTLKDIV LYVCGVSYFVLMTYCTGTTLGKRVMNLQVVNANGGKLSLFNIIYRETIGRFLSGFCMGIG YVLVGVDREKRGIQDMLGDTRVIYRNSYMKNPIR >gi|330403967|gb|ADLB01000016.1| GENE 63 43200 - 44183 1214 327 aa, chain - ## HITS:1 COG:BS_yteI KEGG:ns NR:ns ## COG: BS_yteI COG0616 # Protein_GI_number: 16080005 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Bacillus subtilis # 1 319 1 327 335 196 40.0 5e-50 MNKKQIIGAVVAAVLFIAVGVTSVFTNTIADSFLKRTSSEVLSGGTELQLPNREYVGVVN VVGTIQEQTTSDGIFDTSEGYQHLDTLEYIDRMKEDSSNKGILLRVDSPGGTVYESEELY LKLKEYQKETNRPVWTYMEHYAASGGYYISAPSDKIYANPNTTTGSIGVIISGFDMTGLY EKLGIRSYSITSGKNKDMSQMNEEQTAIYQSIVDESYGRFVEIVADGRKMSEEEVRKLAD GRIYSAKQAKANGLVDEIGLYDDMKKDMSKEIGENVIFYEPSTEPSMLASFFGKLESLKT KSEAEVLKETEAELGSGVPMYYAEQLR >gi|330403967|gb|ADLB01000016.1| GENE 64 44187 - 44795 621 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210608470|ref|ZP_03287846.1| ## NR: gi|210608470|ref|ZP_03287846.1| hypothetical protein CLONEX_00025 [Clostridium nexile DSM 1787] # 1 200 1 209 216 77 33.0 5e-13 MICKRCGKEMQIKPIEVGKDKQGNPIYHTYAFCYECKVKMDLDKQREKEKKEHSENYAGS KKNSGRSKKKRKKGGSIKLPFKFSGKKKKKTKEKKGHGFLKAILFILILAVIGTAAYYNR ETLKKWAKIGIEKLNDKDESTKEPVKEPEEEQTEEQTEEQTEEQEQKPVEEQKEEENQGA GQDKVPDNSTQEGENTSDKGEE >gi|330403967|gb|ADLB01000016.1| GENE 65 44935 - 46074 692 379 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20336 NR:ns ## KEGG: EUBELI_20336 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 
1 370 5 375 393 411 55.0 1e-113 MNKKEIAEIKKQFTPANCAITRICGCYVNGEKEKQSTFKQAFLSLPEEEMFKYFDIFRKG LSGTLGKNLINMDFSLEAELNGGTHEFLLKLRESRLTDDMLLNTFYDKIIESYDYAEHYL ILLIHSAYDIPGKSSDNEEMFDASDEVYDYIVCSICPVKLSKPGLSYDAELNTFHDRIRD WIVEMPNHAFLFPAFNDRSTDIHSTLYYSKKPEELHMELVDSLLGCSSPLTAGTQKESFN TLITETLGDECSYEVVKNIHENLNELIEEQKDSPVPVELDKEEVKYLFAKSGVDQEKLEN FDTQYETVAGEKSTLLATNITNTRTFEIKTPDIIIKVNPERTDLVETKVIDGRQCLVIGV DDHVEINGISARTILKKSE >gi|330403967|gb|ADLB01000016.1| GENE 66 46110 - 47612 1254 500 aa, chain - ## HITS:1 COG:DR1171 KEGG:ns NR:ns ## COG: DR1171 COG0714 # Protein_GI_number: 15806190 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Deinococcus radiodurans # 18 233 3 207 340 90 26.0 5e-18 MDIKRAKEEIKNAISAYLLKDEYGEYVIPSVRQRPIFLIGAPGIGKTQIMEQIAKECGLG IVSYTITHHTRQSAIGLPFISKHVYGEKEYSVTEYTMSEIVASVYDKIEKTGLKEGILFV DEINCVSETLAPAMLQFLQYKSFGNHQIPEGWIIVTAGNPPEYNKSVREFDVVTMDRVKK IEVQPEFSVWKEYAYKSNIHGAVISYLIAKPSYFYQMEMTVDGMLFSTPRGWEDLSNMLY AYEKQEKEADWEVVVQYIQYPKIAKDFANYLELYHKYRTDYRMDEVLEGTIDERVKKKLN HASFDERLSIVSLLLAKLTEQFRQTYRREGETEVLFKCLKRYKSNLEFGEDAVSCLEEVK EKMIETLEYKKKAELLNREEKYSFSRTIQKLESYIMVLKTENLYDKEEGFEKVRSLFAKE KEIYEKLEEASGKMLEYAFDFMEIMFGTSQEMVAFVTELNANYYSAQFLQEYSCERYYQY NKNVLFEEREQNLLKRIERL >gi|330403967|gb|ADLB01000016.1| GENE 67 47603 - 48949 1118 448 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1241 NR:ns ## KEGG: EUBREC_1241 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 20 435 14 448 451 323 40.0 9e-87 MAREWNVLQREQQRELEKIEICMEVWAQAKQELYVSMRFFHVALSRLEFEPQYSSAEIGT NGETVYFSPDTLISLFRENRLKVNRMYLHMILHCLFLHFALPSAADKRYWNLASDIVVEK MIDSLTARSVRRYVSPYRKNLYQTIEKQKQVLTPKNVYEFLRQMNYQEEEIRRMENEFYM DNHSMWSDNLTPKSTMERNKRWKEDSEKMQTEIETFGKEQSDEVKDVLEQIQIENREKYD YRKFLRKFSVLKEETQVDLDSFDYVFYHYGMELYGNMPLIEPQETKEVYKVEDFVIAIDT SMSCKGELVKQFLRETYSVLSESESFFRKVRIHIIQCDDRIQSDTVITNEKELQAYMESF EVRGQGGTDFRPVFAYVNELIKKKAFHHLRGLIYFTDGYGTFPVKRPLYETAFVFMKEDY RDVDVPIWAMKLIIEPEEFSGKVEAEWT >gi|330403967|gb|ADLB01000016.1| GENE 
68 48964 - 49890 628 308 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1220 NR:ns ## KEGG: Cphy_1220 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 297 3 288 297 144 29.0 4e-33 MTNQFEEYQIEKKNQAVTIRRCYVCGGVIEIPEEINGYPVEEIGDYAFSAYGPAGEEDAS VCGAKLEEIILPKTIKRIGKYAFYGCSRLEKISFYSNISDIGAGAFTGCHRVSELDATLV AGEKSCLRELLLELREKQSVSLHAVEGEAKLVFPEFFEEAVENTPARILETHTHGSGMWY RNCFIQTEVQFDLYDKRFPWAKEDEKQEVILEMAFGRLLYPIGLHKEAEQKYKEFLTEYF EEACLWAFSRESLQETKYLAREIAKEKAMMQQMIHIAGKQRKEEIFSYLMNEEHERYPSG KKEDKFLL >gi|330403967|gb|ADLB01000016.1| GENE 69 50020 - 51045 1010 341 aa, chain - ## HITS:1 COG:CAC2283 KEGG:ns NR:ns ## COG: CAC2283 COG0809 # Protein_GI_number: 15895551 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Clostridium acetobutylicum # 1 341 1 341 341 469 64.0 1e-132 MKREDFYFELPEELIAQDPLEDRSGSRLLVLDKETGKTEHHIFKEIVNYLEEGDCLVIND TKVLPARLIGSKVGTDAKIEVLLLKRRENDVWETLVKPGKKAKVGTKIRFGEGLLEGEVI DIVDEGNRLIQFHYEGIFEEVLDQLGQMPLPPYITHQLEDKNRYQTVYAKHTGSAAAPTA GLHFTEELLEEIAKKNVKIARVTLHVGLGTFRPVKVDNILEHHMHSEFYQIDEEAARKIN ETKEQGHRVICVGTTSCRTVESAADENGRLQATSGWTDIFIYPGYQFKILDCLITNFHLP ESTLIMLVSALAGRENVLAAYEEAVRERYRFFSFGDAMFIR >gi|330403967|gb|ADLB01000016.1| GENE 70 51061 - 51729 321 222 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239830964|ref|ZP_04679293.1| Ribosomal protein L11 methyltransferase [Ochrobactrum intermedium LMG 3301] # 6 221 30 242 245 128 31 1e-28 MRSLLEQPAMHNYMPDIFGKKVLEIGCGCGKNCRYFAENGAVKVLGTHMSGRMLKIAKRK SVGLPIEYRLLAPEKAAGLDEKFDVIYSSLVFHYVEHFDKFIAGLSSLLHKDGILLFSQK HPITTASLDRQPGWNYDERGKEISYTFSNYHQSGRRGSNWFIDDEVIYHRTVGEIVTALG QHGFYVEAADETRPSSTMIKQYPQLEKEMLKPAFLVIRARKI >gi|330403967|gb|ADLB01000016.1| GENE 71 51863 - 57433 5835 1856 aa, chain - ## HITS:1 COG:no KEGG:CPE0191 NR:ns ## KEGG: CPE0191 # Name: nagH # Def: hyaluronidase # Organism: C.perfringens # Pathway: Metabolic pathways [PATH:cpe01100] # 1 1452 1 
1479 1628 724 33.0 0 MKKILKRILTGTLAVAMVLTMMPISTPTVQAAKDKTAYEIYPNPHDMSYQDGEFVIRQEV NVVFEKTIDSVTQKRMNEVLASKNKKVTTSQEKVEGKTNILVGTYNSKEYVDTYVKEHYN VEASLFEKFGANFVASNNGEIVVLGRDTDAAFYGITSLKHIFNQMNGSTIRNFTIKDYAD TNIRGFIEGYYGIPWSNEDRMSLMKFGGDFKMTSYIFAPKDDPYHKEKWREEYPAKELAE ITEMVKVGNEAKCRFVWTAHPFMGGFNANNADAEIQALLKKFEQLYKAGVRQFGVLGDDV GQLNKDIVVKMMQEVSKWAKAKGDVYDTVFCPAGYNHSWQGDYSELNKYDKEFPEDIKIF WTGEAVCQPVEQKTLNHFRKHNLNGQATRRSPLFWLNWPVNDINGQRLMMGKGSLLHTDI TLEDLAGVVTNPMQEAEASKVAIFAVADYAWNVKAFDDDKSWEDSFKYVDAQASEALHTL AKHMSNPQPNGHGLVLAESEELQPLINKINTQLTAGKLVDADAKQMIAEMQTIIDACKEF HAQSKNANLKKELKPFTGSLADLAKAIQEYVKAEMAVEAKDMFTAFNHYNTGYSALLSSK KHERKMLNGSAMVSPGSTHLIPLAKKLQTKLSGPVNDYATGGVGSQQVEMTATSNISGWH QGPIGNIIDGNDKTHAWANTEEKVGQYFQVAFNKPVTVYGIHIINGANQDNKHQDTFGTA QLKYKVQGSEEWKLVSQDTYRDYAEFVDVSQIELENVVAVRYECTEKGSGNKWPAMREFK VSTIPENAENFTKTVIRTSNAEGWSVSSGKETNVVDGDLNTNVHYNVRQQGDPVNTTIVG DYVGVELSKPIVLGKINITQGRDDNDGDYFTDADLEYSMDKNKWTTIKNFKNARKISVDV SDQNITAKYVRLVNKKQQPTWIAMREFDVDAKIFHNGKVFTNVDKYKKHTADYLADTAEV TPIDNVTLAKNEYIGLKLDRIHELAEITSKLSDENLTVQLSKNNVEWKDINVKESKATKV TDDARYVRIINKTENAITFNVEKFAVKTVEIYPKSVKETNYSEIENPLNVFDGDITTATH YKNSQTQGKFFTYDLGQEIDLKTFKVVCRDSEHDFPRHAKFSVSTDGKTWEEIMTLGNQD KDNEGEAAGSDHINTVLPTHEISYNVKKETNINKKARYLKFEITRNKSGSDKWVRFNELE INDGEFMPSENNPTYKSDAEETRDGLFRNMSDGDIATMFVPSKDNGYVQYSLSDNTDVNT IKVLQNASAISNATVKARVWEDKKEKWVTVGKLSQSYNEFVLPNDTVLLDVKVEWDKTTP NITELLTYKSDYKAADKNALKALIDSKEDTTKWTTSTAKVYADAYKAGQEVMSSENVSQD SVNNAVTAINKAIEGKQLKGDISVLETIVKDAYKDADTYTAKTWAVYEKAINACKAAIEN KDDTSEKDVETLKKAVEDAKAGLVYNPTNQEMASIAVEEANAFIASITNPENVYTQNSWK AYVDAKKEVESLLEENKTTPQHPDTFEAALNKLTAAKENLVALGELPTLIADFDAIKDSS IYTKDSFDAYKTAVDNARQLLVNGTKETIAEAIKNIKTAKDNLKLSGNVDASKVNALLNE LKNLTAGNYTEKSFNELQKVVTEVGNKDLSALSQEELQECLNQLNGAKQNLVSVKALKDA IASAGEYAADKYTANSYKVLTDAVSDGKALFAAGTKEEISKATDTIYKAMKGLVVRANAD EVKAYVESIVEKDLSKYTSESTKAYKDALAVLKNMLNDLDNVSATGFAQAKANFEKAEAG LVVKDTTVPDKKPNKKPGNKPVKTGDAANSVMPIALMIASMAIAGGVLVFRRKRTK >gi|330403967|gb|ADLB01000016.1| GENE 72 57692 - 60109 2980 805 aa, chain - ## HITS:1 
COG:CAC2356_2 KEGG:ns NR:ns ## COG: CAC2356_2 COG0072 # Protein_GI_number: 15895623 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Clostridium acetobutylicum # 152 804 1 654 654 578 46.0 1e-164 MNTSLSWIKAYVPGLDVTAQEYTDAMTLSGTKVEGYEVLDADLEKIVIGQIEKIEKHPDA DKLVVCQVNVGEETVQIVTGAKNVFEGAKIPVVLDGGRVAGGHDGKKTPGGIKIKKGKLR GVESFGMMCSIEELGSTTEMYPEAAEDGIYIFEDDAVVGSSAIEALGRNDVVFEYEVTSN RVDCFSVVGIAREVAATFRKEFCPPVVKETGNGEEVNDYVKVTVEDADLCPRYCARMVKN VKIAPSPKWLQRRLASVGIRPINNLVDITNYVMEEFGQPMHAYDMSTIAGNEIVVKTAKN GEKFVTLDGQEREVDESVLMICDGEKAVGIAGIMGGENSMITENADTVLLESACFDGTNI RKSSKKVGLRTDASGKFEKGLDPNNAKAAIDRACQLIEELGAGEVVGGTVDVYKKVKEPV RVPFDAEKINVMLGTDISEEEMLGYFEKIGLEYDAEAKEVIAPTFRHDLFRLADLAEEVA RFYGYDNIPTTLPRGEATTGKLSFKLRIEEVARNIAEFCGFSQGMTYSFESPKVFDKLLI PADSKLREAVEIMNPLGEDYSIMRTTSLNGMLTSLATNYNRRNKDVRLYELGNIYLPKQL PVTELPEERMQFTLGMYGEGDFFTMKGVVEEFFEKIGLHEKETYDPNAEKTFLHPGRQAN IIYAGKVVGYLGEVHPQVADTYGIGTRAYIAVLDMPEIVGLATFDRKYEGIAKYPAVTRD ISMVVPKEILVGQIEEVIEKKGGEYLEDYQLFDIYEGSQIKEGFKSVAYSIVFRAKDKTL EETDVTTAMTRILKALEEMGIELRQ >gi|330403967|gb|ADLB01000016.1| GENE 73 60147 - 61166 1202 339 aa, chain - ## HITS:1 COG:CAC2357 KEGG:ns NR:ns ## COG: CAC2357 COG0016 # Protein_GI_number: 15895624 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Clostridium acetobutylicum # 1 339 1 339 339 452 61.0 1e-127 MKEKLQSIQKEALQQIQASEVPEKLNEVRVKFLGKKGELTAILKGMKDVAVEDRAKVGQL VNETRAAIESLLEERKAKMEKEILERKLREEVIDVTLPAKKNTVGHRHPNTIALEEVERI FVGMGYEVVEGPEVEYDEYNFEKLNIPADHPAKDEQDTFYINKDIVLRTQTSPVQARVME QGKLPIRMIAPGRVFRSDEVDATHSPSFHQIEGLVIDKNISFADLKGTLEVFAKELFGPE TKTKFRPHHFPFTEPSAEVDVTCFKCGGKGCRFCKGSGWIEILGCGMVHPHVLEMCGIDP EEYNGFAFGVGLERIALLKYEIDDMRLLYENDIRFLKQF >gi|330403967|gb|ADLB01000016.1| GENE 74 61365 - 61442 112 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVENRFLAERGNHWLEVSLSSDGN >gi|330403967|gb|ADLB01000016.1| GENE 75 61515 - 62372 888 285 aa, 
chain - ## HITS:1 COG:lin1818 KEGG:ns NR:ns ## COG: lin1818 COG1295 # Protein_GI_number: 16800885 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 32 276 28 273 289 129 30.0 7e-30 MPILIFMKESGEWPVKKWLKKINQLLAADANDHTSAYAAMSAFFFVLSLIPIILLLLTLV QYTSLTKVDVMSAVAQVVPDSITPTILAIVNQVYNQSAAVIPITILVALWSAGRGVLAVT SGLNWIYDSRETRNYFYLRIRATFYTVLFIVVIVLTLVVLGFGNSISLFVEAHIPLASHI TKFMIEIRTIAAFFALLVFSLCIYKFLPNRRDKFLSQLPGSLFTSVGWLLTSFFVSKYME IFKGFEDMYGSLTTIVLIMLWLYFSMYIMLLGGKVNVYFQGKGDK >gi|330403967|gb|ADLB01000016.1| GENE 76 62446 - 63759 1217 437 aa, chain - ## HITS:1 COG:sll1283 KEGG:ns NR:ns ## COG: sll1283 COG2385 # Protein_GI_number: 16329811 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Synechocystis # 141 432 127 383 391 132 33.0 1e-30 MKKWCKSNVVYYGLGLCMCVCLSVGLLSLLVDKEIKERKRPYPIKSVQEEENESGKEKAI RVVIKTNGFKEVEHKEVVVRAKSGMCIRSDAQTRECKPNEEIKILPDDKMFQKGAIQILP KEGGDKINIVSLKRGYGAPEYRGKLELHSTEKGIAIVNELPLEEYLYAVVPSEMPASYSL EALKVQAVCARSYAEKQTKGFAYPQYQAHVDDSTEYQVYGNSREQESTIRAVNETKGEKV WYNGEVATTYYFSTSCGKTTTSEAWGSQRTKQNAYLCSIDVAENGKAYEKDLPWYRWSAK IPEKVLENLIEKNTKTDIGKLCNVQVTKRGEGNVALQLVATGDKKKVTVDTENKIRRALG GDGYLLERQDGKKVKSMELLPSAFFEIKKQGKEYVLTGGGYGHGIGMSQTGANEMAKRGK NYKEILSIFYKNIEVRK >gi|330403967|gb|ADLB01000016.1| GENE 77 63906 - 64070 151 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYNVFTDLKCSITSVILNLQYFKRDSYISVICQASILAVEDRKADAVAHLADC >gi|330403967|gb|ADLB01000016.1| GENE 78 64071 - 66182 1598 703 aa, chain - ## HITS:1 COG:lin0469 KEGG:ns NR:ns ## COG: lin0469 COG1305 # Protein_GI_number: 16799545 # Func_class: E Amino acid transport and metabolism # Function: Transglutaminase-like enzymes, putative cysteine proteases # Organism: Listeria innocua # 233 581 214 588 721 92 24.0 3e-18 MKKTKIWEEIIQIVYIIIAQLGLLFSFTETFDISYQERSVYIAVTVLAVLMFIVLKKARH GLLLLGGGIVLAEGAFLLWKRGTVFKEIKCIVEIVRTHMENYVKTGKIEHLYENSKFTLG 
ILFFLILTTGVLACIIVKMKRAFGIMTVSLITFCIPFMAGEEPGIKTLICVGVTIITAGF AGGKGITGKDKIIVRRLGIAVGILSVIVGGLLFEPSIEKELGHTKKYKEDILSFLNRKKS DIFPGDRGVGGINGGKLGRVGKLEQDNTADLKVTVGKKPTQRMYLQGFIGENYTGHEWEE IEGYVETGRTYSLISYLGIFDYAMDTIEIEYIDVNKKYDYQPYGSSFYSSYNKMIGNKRT MNYYTYELLKTFPALEEWNKLKDVYGENLESKYTTVEREVIDNFSGQVESFIHGTDIDTV AREIATLLDNRTDYSLNPGKTPVDRDFAEYFYFENRKGYCTHYATTAALLFRLKGIPARY VEGYAIEPDEFKKQKDGRYTAVATGRNAHAWAEVFYPEGSWLPVETTPGYTKNETPMEEE NIDRETNGPTSQKNEQQKEENLEKQKKEKKEESKEEKKKEQTNMFPIVISAVGGVLIAAG ISSVFLLKRKKKKRRRVTGFNEEIQELFHQIYRKLLRKKKITGNEELNQEFVEKICTAYP SISTEMGEQMLDIVYRANYGKDSLPKEDCMLLKRILLLLEKEK >gi|330403967|gb|ADLB01000016.1| GENE 79 66195 - 67268 685 357 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1423 NR:ns ## KEGG: EUBREC_1423 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 314 5 313 345 129 29.0 2e-28 MKKIIYLIVMGTTVYLNVLYDWKEGAMVLSVELAFLFFCLLEAWFVRRKVKTRIEVKKEI AEQGEEVPICVWIQNRSSFPAGVKVTLFIKYAGENKKIKQTWRVYLGGRKEETFSKETFA QKCGKMKMELGKITVSDWWGGISFPKKVREQESILVMPKPCPVNLTVSHKTKWFPIESDE YATDRSGDDNTEIYEIREYRNGDRLQKVHWKVSAKQDDLYVKEYSYPLGAGVLILLEGGK SENTPSFLMGVVSVSIAMLEKKCPHYIAWKMKEDPFIQRRLIKDEESFYQFLTEFLEIET NLLETDMEERYRYQYRNAIYSSIIIFTNALEMQINHQEKMKVGPEIERFFETTEIIV >gi|330403967|gb|ADLB01000016.1| GENE 80 67270 - 68226 1036 318 aa, chain - ## HITS:1 COG:PAB0848 KEGG:ns NR:ns ## COG: PAB0848 COG0714 # Protein_GI_number: 14521486 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Pyrococcus abyssi # 9 313 8 312 314 280 48.0 2e-75 MAKYMETVQNIVGEVKKAVIGKDECIKKVMAGILAGGHILIEDIPGVGKTTLALAFAKSM GMNWHRIQFTPDVLPADITGFSMYDKYMGEFRYQEGAVMCNLLLADEINRTSPKTQSALL EVMEEGTVTVDKVTREVPKPFVVIATENPIGSSGTQMLPESQLDRFYISMTIGYPEIRHE IEIIKGNQTGGLLDGINPVISAEVLTFIQQEVEEVYVHDVIYEYIAELITMTRNHAMIEV GISPRGTIALTKMTKAAAYLSGREYCIPKDVQEMFYSVARHRIQLNSKARINHVKVEDIM EEILKKVEAPTPKKKVEK >gi|330403967|gb|ADLB01000016.1| GENE 81 68316 - 70004 1874 562 aa, chain - ## HITS:1 COG:APE0033 KEGG:ns NR:ns ## COG: APE0033 COG1866 # 
Protein_GI_number: 14600399 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Aeropyrum pernix # 216 494 154 434 493 63 25.0 9e-10 MSTKAYYPLSEIGAGKTGFSKTRSIIEAAFYGNNVVKVSTLREAYELAKNSPGTIVTDMP VYRGEEFGLDRDAKVLLFNDGAVTGRYAAARRIKGEPGVDAAKLDKVTMDAVYESRWKTM YHAEVYVGLDPEFMVKAHLLIPEGEENIMYNWMLNFQYMSDEYVKMYKESKAVGDGKEPD IYIFSDPQWIPQERPDVDYSCLSDPLTLCYFDTAENCACILGMRYFGEHKKGTLTMAWAI ANRNGYASCHGGQKEYTLSDGSKYVASVFGLSGSGKSTLTHAKHGGKYPITVLHDDAFII NTDTCSSVALEPTYFDKTADYPTGCPDNQYLLSAQNCSATLDEDGKVQLVTEDIRNGNGR ALKSKLWSPNRVDKIDAPVNAIFWIMKDPTIPPVVKLKGSALASVMGATLATKTSTAERV AAGTDLNALRIVPYANPFRTYPLVNDYEKFKKLVEEKNVDCYIINTGDFMGKKVQPKDTL GILETIVEGKAEFKQWGNFEDIEIMEWEGFVPDLSDEDYKAQLKNAMQNRVNAVEGFATK KDGYDKLPDEALEALQKLVAEV >gi|330403967|gb|ADLB01000016.1| GENE 82 70257 - 71570 1548 437 aa, chain - ## HITS:1 COG:SA2117 KEGG:ns NR:ns ## COG: SA2117 COG1757 # Protein_GI_number: 15927906 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Staphylococcus aureus N315 # 5 430 28 444 459 345 47.0 1e-94 MEKNKGRAVALLPIGVFLIIFLGAGIVTGDFYTMPAIVAFLIALLVAFLQNKNLTFTEKL QVISKGVGDENIITMCLIFLCAGGFSGAVTAAGGADSTVNFGLSILPSNIAVVGLFIIGC FISVSMGTSMGTIAALAPIAVGISEKTGFPMAVCIGAVVCGAMFGDNLSMISDTTIAAVK TQGCDMKDKFKANFFIVLPAAIATIAIFFFMTRNGNFTLTKELPYNVWQILPYIVVLIGA LIGINVFVVLISGTVLSLIVGVATGSIALGDIFIAVGGGTIGDKAIGGVMGMYDITVISI VVACIVSLVKEYGGIQFILNLIKGHIKGRRGGELGIAGLALLVDACTANNTVAIVMTGPI AKEICDEYHISAKRSASLLDIFTSVGQGIIPYGAQLLSAATLTGLTPFAIMPYLYYPFLM AVSAVLFILFRKEKENR >gi|330403967|gb|ADLB01000016.1| GENE 83 71718 - 72404 789 228 aa, chain - ## HITS:1 COG:BS_yrhO KEGG:ns NR:ns ## COG: BS_yrhO COG1378 # Protein_GI_number: 16079765 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 2 228 40 263 275 114 32.0 2e-25 MSGVPRSKIYNILETLVTKGFVRFTEGEGTNQYLAVPIEEVSERIQKETKDTLEELTAQL KEYQTSTDLEYIWHIREYKNVFAKCRNIIRHTEDELLIQIWEEDLPQIEKELKGLEEKGV 
RLGIVYFSEDENSSIPFKNYSRHGMLAEKRKEMGGRFITLVSDEEEVVFGQIINETVAEV IWTKSKPMIAMSAECVRHDMYFYKNVGKFKEEMQEEFGKDYIKIREIF >gi|330403967|gb|ADLB01000016.1| GENE 84 72509 - 74155 2237 548 aa, chain - ## HITS:1 COG:MJ1276 KEGG:ns NR:ns ## COG: MJ1276 COG0129 # Protein_GI_number: 15669462 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Methanococcus jannaschii # 8 546 17 559 561 420 43.0 1e-117 MGQEYWNGAEAAHRRVMMKAAGYSDEDIRKKPHIGVPNSFMEGSPGSAHLRQIAEAVKQG IWEAGGVPVEFGIPATCGNIANGAEELKYEQVGRDIVAMSVEFVTKVHNFDGLVMIASCD NIIAGCYLAAARLDIPSMVVTGGSMQPGHHCGKAIVEADLDVARFSGAGEEYLDELEESV CPSFGACPSMGTANTMQMLGEVFNLVMPGTATVPASDNKKLRQARMAGKYAVELVKSGRK PSELITKEVLLNAIMFDMAVAGSTNAVLHILTMAYELGIDITLEDFEKYAKEIPCINAVI PSGPYTVVDFHYAGGVPNVLKMLESKLYKDAPMMTGITLGEFLSQLTAKPNEVIHSLEKP LFNEPGLKVLRGNIAPNGAIVRPTGVPKEVKYIKGKAKVFDGDRMAFEAIESGKIVAGDI IVIRYEGCKGAPGMKELMLSIDALIGLGLHTSVGLITDARFSGFNYGAIVGHVSPEAYDG GVIALVEDGDEIILDTINGEATLCVSDEELAARREKWVCPPLKEQKGCLNLFARNCRPAE EGGAMQPW >gi|330403967|gb|ADLB01000016.1| GENE 85 74173 - 75162 1065 329 aa, chain - ## HITS:1 COG:MTH1495 KEGG:ns NR:ns ## COG: MTH1495 COG2423 # Protein_GI_number: 15679492 # Func_class: E Amino acid transport and metabolism # Function: Predicted ornithine cyclodeaminase, mu-crystallin homolog # Organism: Methanothermobacter thermautotrophicus # 2 326 14 334 339 210 36.0 4e-54 MLLLSKKDIQKVFTMKDAVEADKEAFTLFSEGKSVVPLRTNIGAPKYDGAFLFMPSYVED LECSAIKIVNVFPKNIEKGIPTTPAQVLLIDGKTGVVISVLDGTYVTQLRTGAASGAAFD VLANPTAKKGALIGTGGQAATQLEAMLAVRDLEEVKVYDINLQRTEEFVEKMNEELASYG TRIIAAKSSDEAIEDADMIVTVTPSSKPVFDGNKVKAGATVSCVGSYQPHMQEMDSVILQ RAGKIYFDSEEAVLSEAGDILIPLADGLITKEDFTGDLGDVLLGKVVGRETEEEIIVFKT VGIGTQDLVTAKHIYDKAVEQGIGTEWNS >gi|330403967|gb|ADLB01000016.1| GENE 86 75241 - 77205 1484 654 aa, chain - ## HITS:1 COG:BH0361 KEGG:ns NR:ns ## COG: BH0361 COG4932 # Protein_GI_number: 15612924 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Predicted outer membrane protein # Organism: Bacillus halodurans # 301 615 1083 1380 1661 76 29.0 2e-13 MKKVKKEWISVLLVLCMVISFPIYKKVVHAQEKSDVVIPKLVRITTSPYKITQYGKESRG TYGVMFFVKEKNSQRIGYCMDFGKELPMNKPLKVLTEKQNELLKAALEFGYQAMTETPTN KQKAQYGATQVMVWNIMEGVYGTPKARKAMEEYSRSLKNASDGLAFYDELNRKINEVGKL PSFLSDKKDNAGSHMLKWNSQKKRYEIILTDTNKVNCKLQIAGNSSLTLECINQEKKEYC LFSAKDFQGEQTVEIKRTDELGGVRPYLIYGAEGTKYQRALTYNPQGVKDMAEGYLKVRT EKGRIEIIKIDEETGDTESQLPDISFAGTQYELLSSEKERIEILTINEEGRAFSGELPAG TYYVREIKAPEGYNIDTVLHKVTLPSEENILYTKVQSKENVIRGDVEIKKIGEGRPMPDV EFTLTNKKTGEKTVIKTDENGVAATKREDSERGSLVYGTYTIEETKYPEGYIPMKAFDVV IEQENVKLTYTLENQIIKGKIIVEKRDKDTKQMLNGAVFEIIAKEDVKAADGSVLVEAGQ TADKIITKDGIAESKELYLGTYLVKEIKAPEGYIPSEKSYEVHLKSQGEEIPVVMERLSI ENQKEKKIMKLIKAEEVKTGDTDKMGMYGIISLLSIAVILNGYRLYRQKKKNFS >gi|330403967|gb|ADLB01000016.1| GENE 87 77434 - 78429 742 331 aa, chain - ## HITS:1 COG:L189090 KEGG:ns NR:ns ## COG: L189090 COG0438 # Protein_GI_number: 15674119 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Lactococcus lactis # 1 329 1 328 332 282 45.0 5e-76 MKVLLYAKNQQVVSKSGVGKAMSMQKEALLANGVEVTENSEDSYDVVHINTIFPSDYRMA KRAKKEGKKVVYHAHSTKEDFQNSFTGSNLIAPLFKKWIMKCYRLGDIILTPTEYSKTLL KGYGIENPVYAISNGVDTSLFSKNILAREDFRKKYGLRQDDKVIMSVGLYFERKGILDFV ELAEQMPEYKFIWFGYTPDIQIPSKVRRAVHKSLTNLIFAGYVPKEELINAYSSCDLFFF PSYEETEGIVVLEALSSEIPVLLRDIPVYENWLENRKDVYKGKTNKEFSSLIKDILEKRL PVLTWNGRKRALERDVNCQARKLNRYYEQLV >gi|330403967|gb|ADLB01000016.1| GENE 88 78386 - 79591 931 401 aa, chain - ## HITS:1 COG:CAC3594 KEGG:ns NR:ns ## COG: CAC3594 COG0438 # Protein_GI_number: 15896828 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 1 387 1 387 398 286 40.0 4e-77 MRVLITTDLFKPTINGVVTSILNLEQELKEQGHEVKILAVSQNVYSYREDNVYYVRSVPS HIYPEVRVPVSRAASFVEELIEWKPEVVHSQCEFFSYGFAKRIAKATNARLVHTYHTLYE QYTEYIPVGKRLGRAALGKWIKMRLKDTNLIIAPTKKVEQTLYQYGMAKEIRIVPTGICL EKFKNPVDDKTVEQLRERYEIKKEDKVLLSLGRLGYEKRIDELLYGMKEIVKMEENIKLL 
IVGGGPARESLEKLTDELQLREYVRFAGMVSPEEVQTYYRLGDVFVCASTSETQGLTYIE AAASGLPLICRKDACLYGVLEEGGNGFSYQDIYRFAKYVRMYATDEEWLEKAGNHSEKIA EKYNTNLFGKAVSGIYREEIIKGGVEEYESIALCEKSAGRI >gi|330403967|gb|ADLB01000016.1| GENE 89 79604 - 80197 598 197 aa, chain - ## HITS:1 COG:SP1720 KEGG:ns NR:ns ## COG: SP1720 COG0398 # Protein_GI_number: 15901553 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Streptococcus pneumoniae TIGR4 # 6 187 15 196 205 164 43.0 1e-40 MSKKTKRILNIVTAVSLISIIAFVLYGVHTGVLTDRQQMEMLVKKSGLWGPILFIVIQMI QVVIPIVPGGITCGVGVVIFGAWSGLLYNYIGIVAGSLINFYLARRYGTCFVKYFVKEET YEKYIGWLDKGKKFDKFFALAIFFPCAPDDVLCLIAGLTKMTWKKFTTIILLGKPLSIAM YSMALVYAGSWIERFIA >gi|330403967|gb|ADLB01000016.1| GENE 90 80359 - 81756 1040 465 aa, chain - ## HITS:1 COG:no KEGG:Bsph_3214 NR:ns ## KEGG: Bsph_3214 # Name: not_defined # Def: hypothetical protein # Organism: L.sphaericus # Pathway: not_defined # 14 456 13 463 472 135 27.0 3e-30 MRISSLIAMGKEYLLLGVIGISIVIVLALVVYKFLLKGQKKIKPLRIFGWLLLGSYIVIV LGATLLARSEAGEAGSILPLFYSYRDAWVNFSATAWRNIILNICMFVPLGLLLPMMIKSF RSFWKTYLAGFLFTLCIEGIQLLGRRGIFELDDILHNAVGMMIGYGIYAVAAAAIQKYRK EKVSLRNVLLLQLPLVVCVISFSVIFISYHTKELGNVGGQYIVTYDTDKLQVETDKDYDK KSKELPVYQFEVMSEAEASDFAGQLFQKLGTELDEERNDFYEDIALFHAKDSYNLWVNYA GGTYNLTDMDTQFSADEKLKTGASEEEVRKIFEEYGIDIPQEVAFSEKEGKYTFTVNQLV ENGVMLDGEINGELNEENKFLDINYGLKKCKFYKSFQGISEREAYEQICSGKFFYPYGRE EELSVLVKDCKTVYERDSKGYYQPVYLFSCEINGEESEIKIPILQ >gi|330403967|gb|ADLB01000016.1| GENE 91 81785 - 82573 587 262 aa, chain - ## HITS:1 COG:lin2818 KEGG:ns NR:ns ## COG: lin2818 COG4905 # Protein_GI_number: 16801879 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 1 247 1 250 270 164 35.0 1e-40 MNIYEMILCFFIYGFLGWCTEVAFAAVKERRFVNRGFLNGPICPIYGVGVVMVAQLLMPY QSNLLLLYITSVIVVTALEWLTGFALDKIFHNKWWDYSDMPFNLNGYVCLLFSLIWGVAC VVIVKWIHPLIYKGVLFLPIWLVITLDVILVIAVFTDLFVTVSKILKVNQHLEKMEEIAT ELHRISNEIGQNISKDILGVIERRDDISGEMSEKITELRKKYSELAVRNAKNMKRLVKAF PRMTPRKHKEIIEELKNYLKKK 
>gi|330403967|gb|ADLB01000016.1| GENE 92 82711 - 82950 316 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167761152|ref|ZP_02433279.1| ## NR: gi|167761152|ref|ZP_02433279.1| hypothetical protein CLOSCI_03557 [Clostridium scindens ATCC 35704] # 1 74 1 74 75 104 75.0 2e-21 MYLKQGLNHHIYRVENIDLELQLERRLEALGLTHGTLITILNNNKKGSLTIKFRGTRFAI GKRIAEHITVTEVENGQCN >gi|330403967|gb|ADLB01000016.1| GENE 93 82934 - 84976 1744 680 aa, chain + ## HITS:1 COG:MJ0566 KEGG:ns NR:ns ## COG: MJ0566 COG0370 # Protein_GI_number: 15668746 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanococcus jannaschii # 6 680 5 667 668 483 39.0 1e-136 MDNVIKVGFIGNPNCGKTTLFNAFTGANLKVANWPGVTVEKKEGTTTYKGDTFKLIDLPG IYSLTSYTMEEKVSRECILSDDVDVIVDVVDASSLERNLYLTLQLIELGKPVILALNMMD IVEERGMEIDLHRLPEMLGIPAIPVSARKKTGLSILLHAIAHHKDYAQQSKFIHHHHDRT PKHHHNHHHEYAMVYSDEIEDKIDLIQDEFDKHYPESNLKRWHAIKILENDSEVLKRHPL DLTHILTKNYEAEIINQKYDFIEEIIKEVLVNKVQKEERTEKIDFYLTHKIWGLPIFLGI MALVFFFTFTIGDWLKGYFEIFLEFFSGNVSHLLASVHASDMVISLIVDGIISGVGGILT FLPNIFILFLALAFLEDSGYMSRVAFVMDDIMSKLGLSGRAFIPLLLGFGCSVPAIMASR ALEHKKDRLKTILVTPFMSCSARLPIYVLFSQMFFPKHAMLVSYSMYLIGIIVAILTAYI ISKFDGSKAEHALLIELPEYKTPNAHTIAIYVWEKVKDYLTKAGTVIFIASVIMWILLNF GPTGYVTDITQSFGSIIGKFIVPVFKPLGLGYWQIIVALIAGIAAKEVVVSSCGVLFGIQ NITSTGGMSALAATLGGLGFGTANAYALMVFCLLYVPCTATIATINREVHSKKLTLGIIS FQLIVAWLMSFIVYHIGLLL >gi|330403967|gb|ADLB01000016.1| GENE 94 85046 - 86260 1415 404 aa, chain - ## HITS:1 COG:L93420 KEGG:ns NR:ns ## COG: L93420 COG3919 # Protein_GI_number: 15674209 # Func_class: R General function prediction only # Function: Predicted ATP-grasp enzyme # Organism: Lactococcus lactis # 5 404 6 406 408 259 38.0 6e-69 MEKVEFLPVLLASDINVYSMARAFHEEYGIKSLVVARSSSNIIANSKILVYRERAGLDKT EVFLKEMNEIYKKYGKTKKLILIGCADHYVRLIVENKAELKDKFILPYTDKEVLDNIVLK ETFYELCEKYNLDYAKTFIYKPEMNFEFDLKDMLFPVVLKASDSVKFHRNKFEGFHKAYF IDTKEELVDTLKLIYKNGYDDNMIIQEKVPGEDAAMYDLQIYVGSDHKVKMMNLGNVVLE EHTPTAIGNNAATITMSDYNEELMKKIQFLMEDIGYEGLADCDLKYDYRDGKYKMFEINI 
RQGRSHYRVTGGGYNLAKYIVDDYVYHKEIPTTYVKDEYFWHVVPLGVVYKYVKDKDKIA KIKELVKQGKVCDSFYYKEDMCLKRRIMYWLRCMNHYKKYKKYF >gi|330403967|gb|ADLB01000016.1| GENE 95 86345 - 86611 357 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAGVAGIVFFLILFFTPGSFGGGDVKLSLALGYYLGAEKWFYSFAIAVFLAGMVIVGKY ISGKINKKEEIAFGPYLCLGAVLVKIFM >gi|330403967|gb|ADLB01000016.1| GENE 96 86806 - 87645 881 279 aa, chain - ## HITS:1 COG:no KEGG:DKAM_1054 NR:ns ## KEGG: DKAM_1054 # Name: not_defined # Def: predicted metal-dependent phosphoesterases (PHP family) # Organism: D.kamchatkensis # Pathway: not_defined # 4 224 9 219 232 80 31.0 9e-14 MKIDMHCHVKEGSIDSKVSLEEYITILKQQGFQGMLITDHNTYKGYRYWKEHIRGKKHTD FVVLKGIEYDTRDAGHIIVIMPEGVKMRLLEVKGLPVSVLIDFVHKNGGILGPAHPCGEK YLSYTNTKRFFKSPELIKRFDFFEGFNACESEVSNEAAMKLVTRYHKPGIAGSDSHKPEC VGKAYTILPEYVTCETELITLIRKKVPIEIGGTLYEKTTKEKLGKASKILTYSFWIYNKG GNLLKAYKRKGKEKEENPVDAIDPIEMELLAKKKLKNIS >gi|330403967|gb|ADLB01000016.1| GENE 97 87718 - 89346 1848 542 aa, chain - ## HITS:1 COG:BH1407 KEGG:ns NR:ns ## COG: BH1407 COG1283 # Protein_GI_number: 15613970 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Bacillus halodurans # 4 536 7 536 543 311 36.0 3e-84 MGMTEVFMLLGGLALFLYGMQMMSNGLEAAAGNKMKKILERVTSNRIIGVLVGAAITAVI QSSSATTVMVVGFVNSGLMTLNQAVWVIMGANIGTTITGQLIALDIGAIAPLIAFIGVAS IMFIKNEKVKHISEILAGLGILFIGMETMGSAMAPLQESETFIGFMANASNPIVGILVGA IFTAIIQSSSASVGILQALAQTGVVPLSSAVYILFGQNIGTCITAVLASIGTKTNAKRTT VIHLMFNIIGSILFTIICLLTPFVHLVETLTPGDTVAQIANVHTIFNVTTTLLLLPVGTY MAKLAVKILPESKEEEEGEHKLAYIEKFESSYAVGNAALAISQVEGEVDRMLTMVKKNTE TGFEALLHKDKIDEEKMLDREEYIDYLNKEISKYIVGLMGTEMADEDLRRINAYYKIISN IERVGDHAMNFLGYAKDLKIWDMKLSGHALKEIEEMKGLCAKALEDITVSDISKAGNILE MAAKNEQKIDDMKEIYLKEQIERMKTGDCKAETGIIFSEILTDFERIGDHILNIGEQYNE MM >gi|330403967|gb|ADLB01000016.1| GENE 98 89372 - 89773 186 133 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKIVMFFLIMILAFANIDEWIVAEEPLSIYTQDKNTMKKRFLERVVEKESAVMMVRGKV QTSTPAVRKNGVKLLFCFLIMCFLYFQSKKGSRGETFLEIQDFLEYHIRRMRQIHQMDGK KKGICCIGNKSIG 
>gi|330403967|gb|ADLB01000016.1| GENE 99 89954 - 90580 835 208 aa, chain + ## HITS:1 COG:FN1141 KEGG:ns NR:ns ## COG: FN1141 COG2116 # Protein_GI_number: 19704476 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Fusobacterium nucleatum # 8 201 34 244 256 73 32.0 3e-13 MKQNFFSGLMAGAYISLGAMTFLIIPNQIVASLFFATGIFLVFNFSNMLFTRVCPLMAAT KEYHLKDLIVAWFGNGVGAFLVATLVHFCRFESKILERLQPIVDTKLADSPLSLFIMGII CALFVSYAVLLGKKYPVGSFPQIFYVWLLITAFVFGGYDHIVANMYYLSAYGWAFGVQVM PFLKVLGIVTAGNVVGGLLIGSLEKKHL >gi|330403967|gb|ADLB01000016.1| GENE 100 90549 - 91202 747 217 aa, chain - ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 2 213 7 216 219 125 33.0 7e-29 MRAVIFDMDGVLIDTEKYLVKFWCQAGREFGYDMKREDALMIRSLAGKYAKPKLQSIYGK DFDYAAVRSRRKELMKDWLEQNGIEKKKGVDEILPYIKKQGLKIAVATATDEERAIHYLK EIGIYHWFDKVICANMVENGKPMPDIYLYACEQIGEEPKDCYAVEDSPNGVRSASAAGCR TIMVPDLTEPDKETEKLLYVKAASLSELKDVFSRENR >gi|330403967|gb|ADLB01000016.1| GENE 101 91205 - 91498 442 97 aa, chain - ## HITS:1 COG:lin0618 KEGG:ns NR:ns ## COG: lin0618 COG0607 # Protein_GI_number: 16799693 # Func_class: P Inorganic ion transport and metabolism # Function: Rhodanese-related sulfurtransferase # Organism: Listeria innocua # 1 93 1 95 99 70 39.0 9e-13 MNETLIIEEGLQKLNETENAVLLDVRSTEEYHEGHLEGSLNIPINRLPTISLPKETPIFV YCLSGARSKRAADFLNKIGYTATNIGGIAGYHGRLVL >gi|330403967|gb|ADLB01000016.1| GENE 102 91669 - 92004 99 111 aa, chain - ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 107 353 457 472 76 33.0 3e-13 TERTVDCGHCIKFQKQYYKTINECGIQVHYHKGTKGLVIQTFDKRLLFSVNNKIYELEVV PLHEPLSKNFDLQPLKEKPRKRNLPSPKHPWRMSTFLQFKNHKITETMLIC Prediction of potential genes in microbial genomes Time: Tue May 24 21:23:32 2011 Seq name: 
gi|330403673|gb|ADLB01000017.1| Lachnospiraceae bacterium 2_1_46FAA cont1.17, whole genome shotgun sequence Length of sequence - 69545 bp Number of predicted genes - 68, with homology - 61 Number of transcription units - 23, operones - 15 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 315 127 ## LGG_02944 integrase - Prom 489 - 548 5.8 + Prom 443 - 502 8.0 2 2 Op 1 . + CDS 549 - 995 348 ## CLH_1634 putative transcriptional regulator 3 2 Op 2 . + CDS 997 - 2061 688 ## EUBELI_01556 hypothetical protein + Term 2067 - 2111 11.1 - Term 2059 - 2094 5.0 4 3 Op 1 . - CDS 2099 - 2743 632 ## EUBELI_01147 cytidylate kinase - Prom 2767 - 2826 10.2 5 3 Op 2 . - CDS 2829 - 3860 942 ## COG0722 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 6 3 Op 3 35/0.000 - CDS 3881 - 5770 183 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 7 3 Op 4 . - CDS 5763 - 7505 231 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 - Prom 7543 - 7602 5.5 - Term 7580 - 7621 8.1 8 4 Op 1 . - CDS 7646 - 9670 1236 ## Shel_01150 PAS domain-containing protein 9 4 Op 2 . - CDS 9690 - 10418 444 ## COG4509 Uncharacterized protein conserved in bacteria 10 4 Op 3 . - CDS 10420 - 10881 286 ## COG0681 Signal peptidase I 11 4 Op 4 . - CDS 10922 - 11542 458 ## 12 4 Op 5 . - CDS 11635 - 12714 1386 ## SpyM50106 putative surface-anchored protein - Prom 12807 - 12866 5.0 13 5 Op 1 2/0.000 - CDS 12871 - 13026 143 ## COG1592 Rubrerythrin 14 5 Op 2 . - CDS 13001 - 13252 421 ## COG1592 Rubrerythrin - Prom 13297 - 13356 5.0 - Term 13371 - 13425 11.3 15 6 Op 1 36/0.000 - CDS 13429 - 15894 2613 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 16 6 Op 2 . 
- CDS 15898 - 16578 326 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 17 6 Op 3 3/0.000 - CDS 16632 - 17876 762 ## COG2207 AraC-type DNA-binding domain-containing proteins 18 6 Op 4 40/0.000 - CDS 17895 - 19016 689 ## COG0642 Signal transduction histidine kinase 19 6 Op 5 . - CDS 19018 - 19695 570 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 19724 - 19783 8.6 + Prom 19765 - 19824 11.4 20 7 Tu 1 . + CDS 19864 - 20286 253 ## CKR_0679 hypothetical protein + Term 20293 - 20336 6.2 - Term 20276 - 20328 9.5 21 8 Op 1 . - CDS 20333 - 21925 1405 ## Ccur_02400 hypothetical protein - Prom 21958 - 22017 4.9 - Term 21999 - 22035 3.2 22 8 Op 2 . - CDS 22045 - 24117 2239 ## COG0480 Translation elongation factors (GTPases) - Prom 24149 - 24208 12.1 + Prom 24270 - 24329 6.5 23 9 Op 1 6/0.000 + CDS 24359 - 25459 790 ## COG0287 Prephenate dehydrogenase 24 9 Op 2 . + CDS 25473 - 26768 810 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase + Term 26800 - 26844 1.7 - Term 26636 - 26669 1.4 25 10 Tu 1 . - CDS 26726 - 27181 372 ## Nther_0252 response regulator receiver protein - Prom 27234 - 27293 4.0 26 11 Op 1 16/0.000 - CDS 27309 - 28034 803 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 27 11 Op 2 . - CDS 28031 - 29548 1273 ## COG2205 Osmosensitive K+ channel histidine kinase 28 11 Op 3 15/0.000 - CDS 29603 - 31606 1616 ## COG2205 Osmosensitive K+ channel histidine kinase 29 11 Op 4 18/0.000 - CDS 31651 - 32283 757 ## COG2156 K+-transporting ATPase, c chain 30 11 Op 5 20/0.000 - CDS 32298 - 34361 2307 ## COG2216 High-affinity K+ transport system, ATPase chain B 31 11 Op 6 . - CDS 34383 - 36095 1674 ## COG2060 K+-transporting ATPase, A chain 32 11 Op 7 . - CDS 36125 - 36202 203 ## 33 11 Op 8 . - CDS 36199 - 36312 188 ## - Prom 36343 - 36402 6.0 - Term 36430 - 36465 5.1 34 12 Op 1 . 
- CDS 36480 - 37274 1023 ## COG0428 Predicted divalent heavy-metal cations transporter 35 12 Op 2 . - CDS 37327 - 38688 1365 ## COG0534 Na+-driven multidrug efflux pump 36 12 Op 3 . - CDS 38675 - 39133 460 ## Cphy_3318 MarR family transcriptional regulator 37 12 Op 4 . - CDS 39078 - 39191 58 ## - Prom 39415 - 39474 5.0 + Prom 39234 - 39293 13.1 38 13 Tu 1 . + CDS 39332 - 39754 582 ## COG0071 Molecular chaperone (small heat shock protein) + Term 39761 - 39814 7.0 - Term 39746 - 39803 3.5 39 14 Op 1 1/0.000 - CDS 39850 - 40401 519 ## COG1827 Predicted small molecule binding protein (contains 3H domain) 40 14 Op 2 13/0.000 - CDS 40415 - 41269 552 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 41 14 Op 3 10/0.000 - CDS 41281 - 42444 1102 ## COG0029 Aspartate oxidase 42 14 Op 4 . - CDS 42460 - 43365 993 ## COG0379 Quinolinate synthase - Prom 43388 - 43447 9.2 + Prom 43386 - 43445 8.6 43 15 Tu 1 . + CDS 43671 - 44336 795 ## COG3859 Predicted membrane protein + Term 44346 - 44401 8.1 - Term 44329 - 44389 15.5 44 16 Op 1 . - CDS 44444 - 44977 181 ## DSY0717 hypothetical protein 45 16 Op 2 . - CDS 45010 - 45369 407 ## gi|294640107|ref|ZP_06718121.1| hypothetical protein CUS_1512 46 16 Op 3 . - CDS 45385 - 46959 1828 ## COG0661 Predicted unusual protein kinase 47 16 Op 4 . - CDS 46973 - 47299 512 ## EUBREC_0665 hypothetical protein 48 16 Op 5 . - CDS 47334 - 47618 406 ## COG0818 Diacylglycerol kinase - Prom 47730 - 47789 6.8 - Term 47798 - 47848 -0.4 49 17 Tu 1 . - CDS 47850 - 48200 124 ## - Prom 48387 - 48446 5.4 + Prom 48120 - 48179 6.7 50 18 Op 1 . + CDS 48199 - 48540 377 ## CDR20291_2714 hypothetical protein 51 18 Op 2 . + CDS 48498 - 49043 330 ## CDR20291_2714 hypothetical protein 52 18 Op 3 . 
+ CDS 49040 - 49198 186 ## + Term 49305 - 49345 7.2 - Term 49290 - 49334 9.8 53 19 Op 1 1/0.000 - CDS 49381 - 52005 3398 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 52027 - 52086 3.9 - Term 52056 - 52090 1.3 54 19 Op 2 . - CDS 52147 - 53442 1044 ## COG0673 Predicted dehydrogenases and related proteins 55 19 Op 3 . - CDS 53518 - 54048 572 ## Teth514_0984 hypothetical protein 56 19 Op 4 4/0.000 - CDS 54053 - 55435 1554 ## COG3225 ABC-type uncharacterized transport system involved in gliding motility, auxiliary component 57 19 Op 5 24/0.000 - CDS 55448 - 56314 834 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 58 19 Op 6 . - CDS 56298 - 57281 1161 ## COG1131 ABC-type multidrug transport system, ATPase component - Prom 57310 - 57369 3.8 59 20 Tu 1 . - CDS 57470 - 58237 477 ## COG1943 Transposase and inactivated derivatives - Prom 58287 - 58346 3.2 60 21 Op 1 1/0.000 - CDS 58373 - 59761 1460 ## COG0423 Glycyl-tRNA synthetase (class II) - Prom 59831 - 59890 6.9 61 21 Op 2 . - CDS 59898 - 60644 415 ## COG1381 Recombinational DNA repair protein (RecF pathway) 62 21 Op 3 . - CDS 60580 - 60744 172 ## 63 21 Op 4 . - CDS 60835 - 61731 993 ## COG1159 GTPase 64 21 Op 5 . - CDS 61754 - 64678 3272 ## COG1026 Predicted Zn-dependent peptidases, insulinase-like - Prom 64729 - 64788 10.5 - Term 64773 - 64824 11.3 65 22 Op 1 . - CDS 64834 - 66621 2121 ## COG0006 Xaa-Pro aminopeptidase 66 22 Op 2 . - CDS 66698 - 67054 203 ## Kole_0724 protein of unknown function DUF952 67 22 Op 3 . - CDS 67074 - 68681 1985 ## COG1757 Na+/H+ antiporter - Prom 68702 - 68761 2.1 + Prom 68993 - 69052 8.9 68 23 Tu 1 . 
+ CDS 69212 - 69496 453 ## EUBREC_2191 hypothetical protein Predicted protein(s) >gi|330403673|gb|ADLB01000017.1| GENE 1 3 - 315 127 104 aa, chain - ## HITS:1 COG:no KEGG:LGG_02944 NR:ns ## KEGG: LGG_02944 # Name: tnp # Def: integrase # Organism: L.rhamnosus # Pathway: not_defined # 1 103 1 103 478 72 39.0 4e-12 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKTAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYACELLAKNDGIFISPSALT >gi|330403673|gb|ADLB01000017.1| GENE 2 549 - 995 348 148 aa, chain + ## HITS:1 COG:no KEGG:CLH_1634 NR:ns ## KEGG: CLH_1634 # Name: not_defined # Def: putative transcriptional regulator # Organism: C.botulinum_E3 # Pathway: not_defined # 16 141 13 137 142 73 38.0 2e-12 MKDYHWTEMIEKMQDIRLFSSLHIRRSKNEGITSSQELDLLSRIVLSDTPLTPHDLCSSM GLSKSAVSRLIENLEKKKFLYKESNPTDKRSYSLLITKKGNEELNLTYAYYLEPIYWLRR VLGDDSFESLTTKIKEANELLLKEKGGN >gi|330403673|gb|ADLB01000017.1| GENE 3 997 - 2061 688 354 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01556 NR:ns ## KEGG: EUBELI_01556 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 340 1 340 353 337 56.0 3e-91 MTFYQELQLNQSGSKTLLKSINEPKEKVKHWGIYLFKILLTVAFCVLFVTLYTKLFGADN SIVGVVVLLSVMVFRQADLGIHTPHGSAVLLGIFCILAFGPRLTNMVSPVPAFFINAICI FLLMFFGCHNVIMSNHSTFVLGYLLLQGYDVSGHAYKMRLAGLLVGAVMTSLILYRNHKK YTYKRNFKHLFQEFDLTSSRTQWYLRFTFGVSTALLIASLLNIPRAMWIGISAMSVLLPF RKDLVYRVKFRAPGNILGAVLFLVLYTVLPESAYGCIGMIGGIGVGLSASYGWQTVFNSF GALAIAVPMFSLYGAIFLRIFNNAFGSLYGLVFDKLFHSVFNFFLKRDEETVSC >gi|330403673|gb|ADLB01000017.1| GENE 4 2099 - 2743 632 214 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01147 NR:ns ## KEGG: EUBELI_01147 # Name: not_defined # Def: cytidylate kinase # Organism: E.eligens # Pathway: Pyrimidine metabolism [PATH:eel00240]; Metabolic pathways [PATH:eel01100] # 1 199 1 197 212 158 41.0 1e-37 MTDNIIITIGRQFGSGGHEIGNRLATRLDIPLYDHNLVKMAARELHLDESVAEEADESML GKFLSAYVVGVGSYTSFITNEAVVEPVSDRLYAEQTEILKRLARRSSCVIVGRCADYILG DYSNYIHTFIYAFWEDRVRRIMSIYGMTEKQAKEKIRQVDKERRLYYEAHTGRKWGDIES HQVLLNSSLLGIDGTVDILEAAYRKKVEQIKKSR 
>gi|330403673|gb|ADLB01000017.1| GENE 5 2829 - 3860 942 343 aa, chain - ## HITS:1 COG:SP1700 KEGG:ns NR:ns ## COG: SP1700 COG0722 # Protein_GI_number: 15901534 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Streptococcus pneumoniae TIGR4 # 15 339 15 337 343 376 53.0 1e-104 MGMRVNKQLPLPSELKEEYPMSKDIVELKRKRDKEIRDIFTGKSDKFIVLVGPCSADNEI AICDYVNRLSKLNEQVSDKLMIIPRIYTNKPRTTGAGYKGMLHQPEPDKAPDLLGGIIAI RKMHIHAIEESGLTSADEMLYPENRSYLDDVLSYEAIGARSVENQQHRLTASGMDIPVGM KNPTSGDFSVMMNSVIAAQSSHTFIYRGMDVTTDGNDLAHVILRGGVDKYGTCIPNYHYE DLVRLQSTYEKMSLQNPAAIVDANHSNSNKQFKEQIRIVSEVLHSRNYNPEIKKLVKGVM IESYLEEGCQSIAEDRIYGKSITDPCLSWKDTEILIHKIAENC >gi|330403673|gb|ADLB01000017.1| GENE 6 3881 - 5770 183 629 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 402 625 34 264 329 75 27 1e-12 MNKKKKKGTLGRILKTLFEFYPVRMPIVFFCIIFSAVVSSIPAIFMQNIISIVETSWKTG DWDAVGGKIVTYVGILLIFYILSLLSAVTFTQMMATITQGFLKKLRVKMFHGMQNLPIKY FDTHNHGDIMSYYTNDIDTLRQMVSQSIPQLLISSVTVLTVFSIMLYFSLWLTLVVIIGV VFMFIVTKKIGGNSAKYFIRQQTSLGKAEGYIEEIMNGQKVVQVFCHERECEEAFDEINE KLFSDSESANKFANMLMPILNNIGNVLYVTVALTGGILLLAKAPNVSLSGMAIGISIVVP FLNMTKQFCGNVSQVSNQVNSVVMGLAGAERIFELIDEEPEEDEGYVTLVNVREENGEIV ECKERTGMWAWKHPHQEDGTVTYTKLAGDVRMFDVDFGYEKDKTVLHNITLYAEPGQKVA FVGSTGAGKTTITNLINRFYDIADGKIRYDGININKIKKSDLRRSLGVVLQETNLFTGTV MDNIRYGRLDATEEDCIAAAKLAGADDFITRLPEGYDTLLTENGANLSQGQRQLIAIARA AVADPPVMILDEATSSIDTRTEAIVQKGMDALMEGRTVFVIAHRLSTVRNSDVIMVLEQG HIIERGTHEMLIEEKGKYYQLYTGAFELE >gi|330403673|gb|ADLB01000017.1| GENE 7 5763 - 7505 231 580 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 352 560 37 251 329 93 30 3e-18 MIKELSKSIREYKKPSALAPVFVSLEVIMECIIPFVIARLVNEIKAGCEFKTIAIYGGIL ILMAGCALLAGAIAGSVCATGSCGFAKNLRKDMFYKIQTYSFTNIDKFSTSSLVTRLTTD VTNVQMAYMMLIRVAIRCPLMLIFSFTMAFIMGGKMAWIFVILVPILAVGLFLIIRMVMP 
LFKQVFKKYDKLNNSIQENIKGMRVVKSYVREEYEKEKFEKAADDICKDFTKAEKILAFN NPLMQFCMYTVMIFVLYFGSYVIITSRGLDLDVGQFSALLTYSFQILNSLMMMSMVFVMI TMASESSKRIMEVLQEEGTIKNPKNPVYEVKDGSVDFENVNFKYAETAAKMALSNIDLHI RSGETIGVIGGTGSSKSSLIQLISRLYDATEGSVKVGGVDVREYDLTSLRDEVSVVLQKN VLFSGTIKENLRWGNKEATDEELVEACKLAQADDFIRQFPRQYDTYIEQGGTNVSGGQKQ RLCIARALLKKPKILILDDSTSAVDTKTDALIRKAFKEFIPETTKIIIAQRISSVEEADR IIVMDKGTINAIGTHKQLLAENEIYQEVYISQNKAGEKDE >gi|330403673|gb|ADLB01000017.1| GENE 8 7646 - 9670 1236 674 aa, chain - ## HITS:1 COG:no KEGG:Shel_01150 NR:ns ## KEGG: Shel_01150 # Name: not_defined # Def: PAS domain-containing protein # Organism: S.heliotrinireducens # Pathway: not_defined # 527 659 379 511 524 155 50.0 6e-36 MRAEKGIQSEWIERMSKQVGLLSLYQDKFEVSYLNELLITELKYKNLEDFKTHIGNSVLE IVHPEDLEEFRRFLLLKTEYPENPKFKCRLRCKDNSFTWYEVSRVHCVGREQKPMLLCTL INVSGYMKFPKKMYQAATLQNGEIVIRIEVEEKTAYIPVELAAKYNLPEKFLNMPYSLIE SGYILDESVVDYLKFYGKIVKGEEGTVEVQSRGIDGVTRWLYGKSVVKCNEEGKAVTAVV SFLDITEKKKKEIKINRLQQSEMFFKKVAELSERIILKYEFHTDCFIPVTGRAGKILEKF PEPLSPLKIINAQFIEDEYLWEVHRAFYNMKKEARDGSLCVRIKSLSAEKRWKWYKFTYY VIVDSEKQPTHAIIFCDDITKMRSGELAEQCLENNLNRKSQKPVFNLVYNLSLDAFEKSE GMVPGYYADGVIASYSKAFDWICGQVLPEYREKFRKCFSKNILLKKGEDRGEEIFPVKYK ERTISVKAAYQVFKDVYTESLIIWIQCQETDVGTTEAEKTNIICKGIYIRTFGHFDIFIN GKAVPIQNAKARELLALLVDKRGGFISAEEILSVLWEDEPMSTKALAKIRQTVMILKNVL KKYTEEEVIESQRGLRRLNIDIVKCDLYDYLSGASEYVSLYQGAYMPNYSWGEMMIPELN RRKVRFSEQDFYEE >gi|330403673|gb|ADLB01000017.1| GENE 9 9690 - 10418 444 242 aa, chain - ## HITS:1 COG:SPy0129 KEGG:ns NR:ns ## COG: SPy0129 COG4509 # Protein_GI_number: 15674344 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Streptococcus pyogenes M1 GAS # 7 241 6 236 237 229 49.0 5e-60 MRVQTWIRAVDKVVNCIIWVLICILLLYSGLGLWDTYSVYRGAENSELIYKYKPSGEGAN PSLKELQKLNPDVCAWLTVDNTKIDYPIVQGKTNMEYINKAVDGSFSLSGSLFLDYRNER NFHDFYSLIYGHHMAGNVMFGILPSFQKKEFFESHTSATLFLPDSTKKIEIFACVYTDAY DSIVFNPKCNDADARTKVLNRIREKAVRYRDTELKDGDRIVGFSTCYNTTTNGRIIVFGR LR >gi|330403673|gb|ADLB01000017.1| GENE 10 10420 - 
10881 286 153 aa, chain - ## HITS:1 COG:CAC2646 KEGG:ns NR:ns ## COG: CAC2646 COG0681 # Protein_GI_number: 15895904 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Clostridium acetobutylicum # 2 145 20 172 184 72 33.0 2e-13 MIKLAIVFAVVVAMFTLVFGVLFCKGETMYPRLRDGDVAIYYRLTTDYQVGDVVVFESGG QSIAARIVAREGDTIELDKEGRLLVNGNIQQEEVFFPTEPIAGGITYPYRIEKDSFFLLC DNRPAASDSRFFGAVSQKKIKGKVINLFRRRGI >gi|330403673|gb|ADLB01000017.1| GENE 11 10922 - 11542 458 206 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKVNRNKRKNRKIFFAFCILLLSVICMGGLRTEAEENKIPLNLEIENDIGLKAGTDKNI GQDIPFVFEIIPEEAKNPVPPTREVKIIGAGKAAFKTIYFTTPGNYKYIIRQKTEGNKNW NLDNREYRVNVSVKKAENNRLSLSVWGFIKGSDEKADVFRFENFYRGKTGTPHPIKTGDE SHAEPAILVLLGAGAVITCIIAKRKH >gi|330403673|gb|ADLB01000017.1| GENE 12 11635 - 12714 1386 359 aa, chain - ## HITS:1 COG:no KEGG:SpyM50106 NR:ns ## KEGG: SpyM50106 # Name: not_defined # Def: putative surface-anchored protein # Organism: S.pyogenes_Manfredo # Pathway: not_defined # 31 355 38 351 352 144 32.0 4e-33 MGLKRKLLGVTTGAMLLCSMIGGSVLAAGHVTTGGNLQIEKDVVLDGGSIVPNQKFSFKI EPLDVQPNTKENNLEVKKGKELQGADAIQSGQFDKDTQTTDEGGKKVAKDTTAQFDFSGV QFEHNKPAIYRYKVSENAGQGTGMSYDNTKYQLDVYVDKDGNPTSLVAKKLNDQNQAEGE KVPLKFVNTYKTESLTVEKNVTGASGETEKAFEFHITVKENDSLKNGSAIQAKLHKAGTQ TEDVQIVVGTEATFSLKSGEKLVVPELPKDTDYTLYETEHGQNGYTTTVTVNDQNKDNAD TNREYKVVENTNKVVFTNNRDEITPTGILLNVAPYAAGVILAAFAAVLFLAKKRRNRRA >gi|330403673|gb|ADLB01000017.1| GENE 13 12871 - 13026 143 51 aa, chain - ## HITS:1 COG:MTH756 KEGG:ns NR:ns ## COG: MTH756 COG1592 # Protein_GI_number: 15678781 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Methanothermobacter thermautotrophicus # 6 51 152 197 197 59 52.0 1e-09 MPLREKTFKGDAPKGWKCNNCGYIHEGDEAPEECPICAHPKAYFERKAENY >gi|330403673|gb|ADLB01000017.1| GENE 14 13001 - 13252 421 83 aa, chain - ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and 
conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 82 1 99 195 72 47.0 2e-13 MSKYAGTETEENLKKAFAGESQARNKYTYFASAARKNGFVQIANIFEETANQEKEHAKMW FKELLGIGTLEENLTDAAQGENI >gi|330403673|gb|ADLB01000017.1| GENE 15 13429 - 15894 2613 821 aa, chain - ## HITS:1 COG:CAC0527 KEGG:ns NR:ns ## COG: CAC0527 COG0577 # Protein_GI_number: 15893817 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 1 454 1 469 863 249 35.0 2e-65 MNILNELTVKSLKKNKRRTVLTIFGILLSVALITAITTFVSSMQGSLVDFAKKNSGNYHI LVENVPKDKQKYLLHNEKAEKKIVIQTIGNAKPLSLQYEEGVNEENATKIKVRAVKKENF SDLGMILQSGKFPQSENEIVIPEHFRGDYTTKIRVGDKLKLNIDGKEKEYTVSGISGMSA VEFHSEEGMLGCSLFTISDDEKISENIDVAMLMKNPKDTFAFREMLEKELGLHELQINNI LLDFQGAVVNANVLVVLKVLAGMVIGIILLTSIFVIKNSFDISITERLKQYGMLVSVGAT SKQIRKNVLFEGVVLGIIAIPLGVLLGVGAIWCTLQVVMKILEGTSFGGEVELKMYVSVV AILIAVAIAIVMIYISSLIPAKKAQKVSPMEAIREAKDTKLEAKKLRTSKLFRKVFGIEG EIARKNLKRSRKKYRTTVFSLFISIVLFLSISSVRIYGQEMQNMKFAKMDYNLIAHYDED DLGKQGEIFRRASKVDGVKKAEIVKVWIGDPKNISFTKTAEKSRGIEEFTQEEKDKEAQY AFYSLSNETYKAFAKELGLSYEEVKDKGILCDTSISFVRDEEDEKAKKTKYHELAVKEGE KLQFDGGEIEIVKRADRLPFTQNFYYGVNVIVSEEWMSHRDFRYDGLYIDAEDTMKVREN MEKITGKDGWTYMDFAEQARENNSINLIISIFLYGFVAVISIIGITNVFNTITTNVALRS REFATLRSVGMTDKEFKKMIWYESFLYGTKSLLYGVPVGLGLSYIFYRQFTNILEMSYIV PYQQVIICILFVFIIVCLTMQYAVKKVEKQNIIETIRSENI >gi|330403673|gb|ADLB01000017.1| GENE 16 15898 - 16578 326 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 3 221 1 219 245 130 36 3e-29 MEILKVENLCKTYGVNDTEVRALDNVSFSVERGEFVAIIGASGSGKSTLLHLIGGVDKAT SGKIYIDGTEISELNQDKMAIFRRRQLGLVYQFYNLIPILTVEENIVLPCRLDGKQVEKE KLEEMLSILNLTDRRGHLPNQLSGGQQQRVSIGRALINHPAIVLADEPTGNLDSKASREI IDLLKMTNRKYDQTILLITHDENIALEADRIITIADGKIVKDEKVR >gi|330403673|gb|ADLB01000017.1| GENE 17 16632 - 17876 762 414 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # 
Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 293 391 180 278 284 74 38.0 3e-13 MRINEQDLKKIKALHVALKMPVWVLDEKTNEILKSYTSIYKYPISYEFKKRIIAQNAVEF YSGILNEIFLCLKYKNVKIVMGAFRINNVSRKTFSLVYNNMTEKNKKMLKEEECWEYYST LPVYPLGDIRDYLSLLGFLFDMELEDGYSAELHKQVEENQFELKKPFSEGKLYNTFRVER YTFYYENQIMNLVSQGDLEKLKSGLAELGTSVLPILTSSSLQTEKNYTITILEKLSSLAI QMGKDILSVIKLRNYYIRKLEKQEDFMGVLVTRDSAIIHFTKELHGVSVRAKSSLIQCVL QYINLKIYDSIKISKLAEQFYLSESSLRRKFKEEVGMSVNEYINHRKIEESKMMLQSGIP IGEISKRLSFYDLSHFYKTFKKHTGITPQYFRDTIATPEMPPEKNNNKISTESL >gi|330403673|gb|ADLB01000017.1| GENE 18 17895 - 19016 689 373 aa, chain - ## HITS:1 COG:CAC0525 KEGG:ns NR:ns ## COG: CAC0525 COG0642 # Protein_GI_number: 15893815 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 84 369 44 327 329 152 30.0 1e-36 MEKKKWLGILCCCMIFNFSVCILFGMAKERKINGWLYDTTEKILSAYGEEQGETDILRKY GVEKSDFSPISMALEITGICLLNGILFFSVSWYDKKKQSEKLREMERYCEDILSDRETLQ LHDNEEGEASILKNKVYDITMLLREKNAYLEESKCELEKFLADISHQLKTPITSLHMANE LLRMDLPQEKKDLFLDNMQKDMMKIEWLVKGILNLAKLDSRTLTLKKEEILVKDLIAEVE DRFCALCEITGSIIEKNGKENSKAWCDFRWTKEGICNIVKNAIEHGATKVKISWEENYIY TKISIWDNGEGIDEEDLPHIFERFYKSKNAKEDSVGLGLAFTKSIVKHQGGDIEVHSEKQ KGTEFVLKFFAVY >gi|330403673|gb|ADLB01000017.1| GENE 19 19018 - 19695 570 225 aa, chain - ## HITS:1 COG:CAC0524 KEGG:ns NR:ns ## COG: CAC0524 COG0745 # Protein_GI_number: 15893814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 224 1 222 228 211 46.0 8e-55 MIQILIIEDDRHIGEGLQFLLESEGYCAELAMSVSEGREKLEKGKTQLLILDVNLPDGSG FEFFQKIQSKKIPVIFLTALDEEKNIVKGFNLGADEYITKPFRPRELVSRVKNVIRLTGI EDSRKQELIIGNVSIHLEEKVAYKNGKQLELTALEYKVLLLFFENRGRILTRNQILSHIW DDSGNFVNDNTLTVYIKRLREKIEDNPYEPDIIKTVRGMGYKIGG >gi|330403673|gb|ADLB01000017.1| GENE 20 19864 - 20286 
253 140 aa, chain + ## HITS:1 COG:no KEGG:CKR_0679 NR:ns ## KEGG: CKR_0679 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 122 4 124 140 82 32.0 3e-15 MVNLLVGPKGSGKTQQMIDFANEKAKTLDGNVVFIKNSHNDTYNVSFDIRAICMADYPSI RNIDEYTGFIYGMVSCNHDIECIFIDGILKQSDISLENMPRFVERLKTISKLHHIDFYLS ISAKREDLSDIDLTDCQILN >gi|330403673|gb|ADLB01000017.1| GENE 21 20333 - 21925 1405 530 aa, chain - ## HITS:1 COG:no KEGG:Ccur_02400 NR:ns ## KEGG: Ccur_02400 # Name: not_defined # Def: hypothetical protein # Organism: C.curtum # Pathway: not_defined # 5 530 130 669 697 556 50.0 1e-156 MKMKIIGVIAFLIVAGIYYYVTLPAINIHSKDFWVFLIILMVLLIVWYGWKKKIRTKEEV KSSKGMRTLITLGVAVVVIYAVGTLLSSPIINAKKYRNLLKVEEGEFTKDIEELSFDQIP LLDKESASLLGNRKMGSMVDMVSQFEIDDIYTQINYNDRPVRVSPLKYANIIKWFTNRSE GIPAYVRIDMANQTTELVKLKEGMKYTTSEHFNRNIYRHLRFKYPTYIFNDLSFETDDDG TPYWVCPVKKFNIGLFGGETIGRVVLCNAVTGETKDYAIDEAPSWIDRAYSADLLVQLYD YYGSLKHGYFNSILGQKDCLKTTDGYNYLAIDDDVWVYTGVTSVSGDESNVGFVLMNQRT METKFYPIEGATEKSAMSSAEGQVQHLKYKATFPLLLNISGEPTYFIALKDDAGLVKKYA MVNVQKYQLVAIGDTVSGCEKQYNQLLVTDGVVKEEEKTSEVQTVSGKISKIAQGVIEGN SHFYIMLEGNNEIFDVSVVDFIDVIKYEVGQNVTIEYTKGKNANTVLSMK >gi|330403673|gb|ADLB01000017.1| GENE 22 22045 - 24117 2239 690 aa, chain - ## HITS:1 COG:FN1546 KEGG:ns NR:ns ## COG: FN1546 COG0480 # Protein_GI_number: 19704878 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Fusobacterium nucleatum # 1 690 3 689 690 583 44.0 1e-166 MKVYRTDEIRNVVLLGHGGSGKTSLVEAMAYVSGAVNRMGKISDHNTISDFDKEEQKREF SISTTLTPIEWEKAKINILDTPGYFDFVGEVEEAVSAADAAVIVVSGKAGVEVGTEKAWE LCDKYNLPRMVYVTEMDVDDASFREVVKQLTDRYGKVIAPHFQPIRENEKLVGYVNVIKN AGRRYTGIGQREECEIPEYCLPNLQILRDALMEAVAETSEDFMERYFAGEEFSIEEIRSA MRTEVMDGSIVPVAMGSNIEAQGAANLLSDIVRFFPSPDKRECAGFNRKTNEIFEANYNF SKAKTAYVFKTMVDPFIGKYSFVKVCSGVLKGDDVLYNAESDTEEKPGKLYVMNGNKPIE VSELHAGDIGAIAKLTATKTGDTLSTKNTPVLYGKTEYSTPYTYMKYVVKTKGDEDKVSQ GLAKMMAEDMTMKVVNDSENHQTLLYGMGEQHLEVIVSKLASRYKVDVELERPKVAFRET 
IRKTSDVDTKYKKQSGGHGQYGHVKMKFEPSGDLETPYVFEETVVGGAVPKNYFPAVEKG LQESVLKGPLANYPVVGVKATLYDGSYHPVDSSEMAFKTATMQAFKKGIMEASPVLLEPI ANIKVVVPDEYTGDVMGDLNKRRGRVLGMTPLPKGKQEIEADIPMTCVFGYCTALRSMTG GRGTYSYEFVRYEQAPSDVQEKEIANRSEE >gi|330403673|gb|ADLB01000017.1| GENE 23 24359 - 25459 790 366 aa, chain + ## HITS:1 COG:BH1666 KEGG:ns NR:ns ## COG: BH1666 COG0287 # Protein_GI_number: 15614229 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Bacillus halodurans # 7 364 6 364 366 264 42.0 2e-70 MNIPEKVGFIGLGLIGGSIAKAIRLYYPDTKIIAFDKSRETLALAMSESIIDVSCNTIDN NFANCSYIFLCTPVAFNNAYLKQIQPLLHDDCILTDVGSVKTSIHEEIVSLGLEKYFIGG HPMAGSEKSGFSNAKALLIENAYYILTPSAAVSGEKVLQYTAFIQSLKALPVEMDYQEHD YITGAISHLPHIIASVLVDFVRNSDTEEEFMKRLAAGGFKDITRIASSSPIMWQHICTQN SKNISYILGKYIDALQEIKCSVEQKEDGLLYDLFLRAKNYRDSVPEHSAGPIKKVFAIYC DILDETGGIATVATILASNHISLKNIGIVHNREFEEGVLRIEFYDESSSLKAVKLLEKHR YTVYKR >gi|330403673|gb|ADLB01000017.1| GENE 24 25473 - 26768 810 431 aa, chain + ## HITS:1 COG:BS_aroE KEGG:ns NR:ns ## COG: BS_aroE COG0128 # Protein_GI_number: 16079317 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate synthase # Organism: Bacillus subtilis # 5 429 5 424 428 410 52.0 1e-114 MIIRKTKGLRGVISIPGDKSVSHRAVMFGSLAGGTTEITNFLQGADCLATINCFREMGIE IENSPEKILIHGKGLHGLHVPQTILNTENSGTTTRLLSGILAGQTFESTLSGDASLNTRP MGRIIKPLSLMGGKIKSIHGNNCAPLHISPSALHGIHYHSKVASAQVKSAILLAGLYADG ITSVTEPVLSRNHTELMLAGFGGKLRSFTQPDTNLPTVSVEPEPKLEGQKIVIPGDISSA AYFIAAGLLIPDSEILIQNVGINPTRAGILTVCQNMGGDITFLNEKTVGGEPVADLLVKT SNLHGTVIEGSIIPTLIDEIPIIAVLASFADGTTVIRNAEELKVKETDRIETVSENLKQM GGEVIPTEDGMIITGKGYLSGAKIDSFLDHRIAMAFSIAGLAADGETEIINSHCIDVSYP TFFETLESLFL >gi|330403673|gb|ADLB01000017.1| GENE 25 26726 - 27181 372 151 aa, chain - ## HITS:1 COG:no KEGG:Nther_0252 NR:ns ## KEGG: Nther_0252 # Name: not_defined # Def: response regulator receiver protein # Organism: N.thermophilus # Pathway: not_defined # 25 93 139 207 239 62 44.0 6e-09 
MKEQMINTKEIMEERELRQLCFDVKQCCSQHDFIKAEKMAALAMEKYPDAAQPHNLYGAI LEMTDQHVSAMKHFRAALALDPTYEPASYNLEQYGTLTLEGKHCAFDEEDLETKKRPSHG PAKKQAGNSTLRMILYPLYRNRDSNVSKNVG >gi|330403673|gb|ADLB01000017.1| GENE 26 27309 - 28034 803 241 aa, chain - ## HITS:1 COG:pli0051 KEGG:ns NR:ns ## COG: pli0051 COG0745 # Protein_GI_number: 18450333 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Listeria innocua # 8 240 6 236 240 250 53.0 2e-66 MMKGTKPNVLVVEDDEPIRNLIVTTLEMEKYKFDTAENGKQAIMLAAANNPDIILMDLGL PDMDGIEVIRKIRSWSVVPIIVISARSDERDKVAALDVGADDYLTKPFSVVELSARLRAT IRRVQYIMHTDNTNESMFQNGDLMIDYSAGVVKVSGVEIHLMPLEYSLLCLLAQNVGKVL TYQFILEKVWPNGIGSDISSLRVYMTSLRKKLEKDKANQTYIQTHIGVGYRMLNVEESGE E >gi|330403673|gb|ADLB01000017.1| GENE 27 28031 - 29548 1273 505 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 12 505 387 886 888 402 42.0 1e-112 MEETKYTYIGKDIFYCMISLVLSTLIAYIFSILGFTDANLIMIYILGVIMVALLTSGYCW GLCASALSVLTFNFFFADPIFTFSVYNPDYLITFGVMLVTSITCSVLTKKVKNYAQESEM KSYRSELLLQASRSLQEASTAREIMQKTVEQLGNLLEKNIYCYMGEPSVKEIPISYKKDD NPRQLDESDVAIAKWCYQNKEDAGFSTKILRESAYMFLAIQSEEKMFAVIAVDMEKEKIG TFEKGIMGTIIHESVLALEKEQLLKQRSESEMRLEKERLRANLLRSISHDLRSPLTSISG NAENLMTNETKLKQEKRQKIYQDIYDDSVWLINLVENLLSVTRIENGTMELNIQGEVAEE VIEEALKHINRRGKQQKIMIDCEEILIAKMDVKLIFQVLINLVDNAVKYTPKGAEIVVGA KKCGDKVVFFVKDNGQGLTAEQKKRVFDMYYTVNNTVSDSRRGMGLGLPLCQAIIQAHGS ELKIYDNAPTGTIFRFALEQEEIVL >gi|330403673|gb|ADLB01000017.1| GENE 28 29603 - 31606 1616 667 aa, chain - ## HITS:1 COG:pli0050 KEGG:ns NR:ns ## COG: pli0050 COG2205 # Protein_GI_number: 18450332 # Func_class: T Signal transduction mechanisms # Function: Osmosensitive K+ channel histidine kinase # Organism: Listeria innocua # 24 661 13 640 888 529 43.0 1e-150 
MGGERANPEELLKRIQAAEKQKENDKKGKLCIFLGYAAGVGKTCAMLDMAHELQKEGMDI VAGYIEPHARPETSAREEGLEKLPPLMVEYKGIQLRELDIDGVLKRKPQAVLIDELAHTN AKGMRHQKRYEDIEEILDAGINVYTTVNIQHLESLNDVVESITKIHVKERIPDFVFDEAT RVKLIDIEPDELITRLKEGKIYKTVQAERALQNFFAKEKLIALREIALRRMADKVNRLGL QERILNENVDGSEGYQGEHIMTCISTAPSCEKVIRSASRMAYAFHAKFTALYVETPELQN ADSSVKQIRDKNIHLAEALGAKIVTVFGEDVAFQIAEYARVSSVTKLVLGRTNHRIWFGQ KKGTITDKISEYMPELDIFIIPDTTKYRRKTREIFTLPKKSEGLAKDLLKEFLMLASATG IGLFFRQNRLMEADIIMVYLMGILLLSLYTKRRYVAVSSSIISLLLFDWFFVPPFQSFHF YSGKYSATFALMLIFSIIITTIISRERRQVRESAKMVYRTQLLLDNSRRMRRIETVRELL MELSEKVLKLMNLSVVFYIRKGEKTTGPWLFPKPEMSKSELKKMMTPRERAVVDWVMANK KRAGCCTHTLPEADAMYLPIKTGDAIYGVMGIVLEEKREIPTFEYGLLTAMLNEAALVFA RIYLVQK >gi|330403673|gb|ADLB01000017.1| GENE 29 31651 - 32283 757 210 aa, chain - ## HITS:1 COG:pli0055 KEGG:ns NR:ns ## COG: pli0055 COG2156 # Protein_GI_number: 18450337 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, c chain # Organism: Listeria innocua # 1 207 1 203 212 208 50.0 5e-54 MKNFGKYIKRAIVITVALFVICGVAYPFLLTGIGQVVFPKQANGSIVKAEGEAVGSEVVG QKFEGDKYFHGRVSVVDYNTYTEEEKENGDYTGVASGSYNYSATNEDLKKRVEEDVKAFK ERYKKATGEEFKGEIPADMLTASGSGLDPHISPESAKIQLPIVAASSGLSEKEVEEIVEK NTTHKLFGIFGEETVNVLQCNIDIAEAIGE >gi|330403673|gb|ADLB01000017.1| GENE 30 32298 - 34361 2307 687 aa, chain - ## HITS:1 COG:CAC3681 KEGG:ns NR:ns ## COG: CAC3681 COG2216 # Protein_GI_number: 15896913 # Func_class: P Inorganic ion transport and metabolism # Function: High-affinity K+ transport system, ATPase chain B # Organism: Clostridium acetobutylicum # 6 687 2 684 685 995 78.0 0 MKMEKQEKKTKFITKDILKTSIIGAFQKLNPKYMMKNPVMFVVEVGFVISLIMTFFPTIF GDDSGLRIYNGIVTVILFITILFANFAESVAEGRGKAQADSLKKTKKDTMATLLLKDGTE KLINASELKKGDIVIVRTNEVIPNDGEVIEGVASVDESAITGESAPVTREAGGDFSSVTG GTTVVSDWLKIKITSEPGESFLDKMISLVEGASRQKTPNEIALNTLLVSLTIIFLIVVVT LYCFSDYSGVKIPMATMIALLVCLIPTTIGGLLSAIGIAGMDRVTRFNVIAMSGKAVEAC GDVDTMILDKTGTITYGNRLAADFREVKGKSKEDLIDYSVMTSLCDATPEGKSVVELGKR LGTKIKDEVKDQMEFVEFTAQTKMSGVNLKDGTRIRKGAYDAIKKYVQENNGEIPEDLES 
IVTDISSLGGTPLVVCVGNVIYGVIYLKDTVKPGLVERFERLREIGIKTIMCTGDNPLTA ATIAKEAGVDGFIAECKPEDKIDAIKKEQAEGKIVAMTGDGTNDAPALAQADVGLAMNSG TTAAKEAANMVDLDSDPTKILEVVEIGKQLLITRGSLTTFSIANDVAKYFAIIPAMFTLV IPQMEVLNIMHLSTPFSAILSALIFNAIIIPCLIPIAMRGVKYKPMRSEKMLMKNMGIYG LGGVIVPFVAIKLIDILVTPLLALLGL >gi|330403673|gb|ADLB01000017.1| GENE 31 34383 - 36095 1674 570 aa, chain - ## HITS:1 COG:pli0052 KEGG:ns NR:ns ## COG: pli0052 COG2060 # Protein_GI_number: 18450334 # Func_class: P Inorganic ion transport and metabolism # Function: K+-transporting ATPase, A chain # Organism: Listeria innocua # 1 569 1 572 573 697 65.0 0 MLQIALTLVIFLVLVIPMGTYMYHIATKQRTFADPLFDRLDGGIYKVLKISREGMNWKQY ALHLLITNAVMVLVGYVILRLQGVLFANPNGIEAMDPTLSFNTIISFMTNTNLQHYAGES GLSYLSQMTVIIFMMFVSAASGYSACMAFCRGLAGKQKDVGNFHEDMIRVTTRILIPFSI IIGILLIWQGVPQTLDANQTIKTLEGNYQDLAMGPVAALESIKHLGTNGGGFFGANSSMP FENPTIISNLIELLSMMILPGACVVTFGKMTMQRRKEKKKTQRVLLGNQGRTIFAAMSIL FLVGLILCFQAEKAGNPVLEQAGVNQDVGNMEGKETRFGVAQSALFTTTTTSFTTGTVNN MHDSLTPLGGLVPMLHMMLNCVFGGKGVGLMNMIMYVILAVFLCGLMIGRTPEYLGKKIE GKEMKLVALVLIIHPLLILGFSALAVMTGAGIEGITNGGFHGLSQVLYEYASSAANNGSG FEGLADNTAFWNITTGLAMFFGRYITMIAQLAIAGSLLAKRRVNETVGTLRTDNVIFVVV LVIVVYIFAALTFFPALALGPIAEHLSLWL >gi|330403673|gb|ADLB01000017.1| GENE 32 36125 - 36202 203 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILLIILILLLAGYLFYALIYPEKL >gi|330403673|gb|ADLB01000017.1| GENE 33 36199 - 36312 188 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDLLMILILAVCFGSMKLLVDWCEKQVGTAAIREEEK >gi|330403673|gb|ADLB01000017.1| GENE 34 36480 - 37274 1023 264 aa, chain - ## HITS:1 COG:lin0435 KEGG:ns NR:ns ## COG: lin0435 COG0428 # Protein_GI_number: 16799512 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Listeria innocua # 1 264 8 269 269 177 43.0 2e-44 MNGIFFAFLGGLVTFTVTTLGAAGIFFVKRQISENWQNGFLGFAGGVMIAASVWSLLLPG IEFAEENDQVGWLVITGGFLLGVMTLLTADFLLQKWYKRQRKKPITLKRSTSMLVLAITV HNIPEGMSVGLAFALAGQNREDIALMSGAIALTIGIAIQNFPEGTAVALPLMKEGMTKKK 
AFLIGSMTAVVEPVFAVLAAVFANITQSSIAVFLAFAAGTMIYVVVEELIPEAHMGADGE EGKAGTLGFIVGFLVMMILDVALG >gi|330403673|gb|ADLB01000017.1| GENE 35 37327 - 38688 1365 453 aa, chain - ## HITS:1 COG:FN1726 KEGG:ns NR:ns ## COG: FN1726 COG0534 # Protein_GI_number: 19705047 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Fusobacterium nucleatum # 1 447 1 444 457 268 36.0 2e-71 MEQNKDFLGTEPVGKLLLKLALPTVAAQIINMLYNIVDRIYIGHIKDVGALALTGVGVCM PLIMIVAACAALISNGGAPRATIFMGKGDKESAEKTLGNCFSLQIVISILLTTILLVFNR QFLMAFGASENTIEYGVSYMNIYAVGTIFVQLTLGMNAFITAQGFAKTGMLSVLIGAVSN IILDPIFIFGFNLGVKGAALATVISQALSCIWVLCFLFGKKTQLKIKRGYMKWERKIILP SLALGLAVFIMQASESVISVCFNSSLLKYGGDIAVGAMTILTSVMQFAMLPLQGLGQGAQ PIISYNYGAKNKKRVKSAYKLLLQASFGYAVLLWGLVMLFPQMFAKMFTSDAALLNFTST ALRVYMGVLFIFGVQMACQMTFNSLGNAKASILVAVMRKFILLIPLIYIMPHIFTADKTI AVYMAEPIADILAVTFTVILFVFQFKKALKEIS >gi|330403673|gb|ADLB01000017.1| GENE 36 38675 - 39133 460 152 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3318 NR:ns ## KEGG: Cphy_3318 # Name: not_defined # Def: MarR family transcriptional regulator # Organism: C.phytofermentans # Pathway: not_defined # 28 143 25 139 141 63 33.0 2e-09 MQRATDILLLIRGLGKLNDKCLETVRKEYELSQIEVTIMGFLHNNPGKDTVGEIAYWRML PKGNVSQGVESLIQKRLLQRICDKDDRRKIHLQITEQAKDIVRGIDKARKKYDEKIFLGF SEEEKKLYFQLNGKILENVFEGLERKEKNGAE >gi|330403673|gb|ADLB01000017.1| GENE 37 39078 - 39191 58 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKVLKRTILKRTIKGGDCYAACDGHTLANPRVGQIK >gi|330403673|gb|ADLB01000017.1| GENE 38 39332 - 39754 582 140 aa, chain + ## HITS:1 COG:CAC3714 KEGG:ns NR:ns ## COG: CAC3714 COG0071 # Protein_GI_number: 15896945 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Clostridium acetobutylicum # 33 125 46 137 151 70 40.0 6e-13 MLLANRNYDLFDEMFKDPFFTRPFENASSQIMKTDIHEQDGNYLIEMELPGFAREDIKAD LKNGYLTITAEKNQTNEEKDAKGNCIRKERYTGSCNRSFYVGEQVAQEDIKAAFKDGILR LQVPKDTPKAIEEPRLITIE >gi|330403673|gb|ADLB01000017.1| GENE 39 
39850 - 40401 519 183 aa, chain - ## HITS:1 COG:SP1234 KEGG:ns NR:ns ## COG: SP1234 COG1827 # Protein_GI_number: 15901096 # Func_class: R General function prediction only # Function: Predicted small molecule binding protein (contains 3H domain) # Organism: Streptococcus pneumoniae TIGR4 # 5 168 4 170 171 136 39.0 2e-32 MTGSERREEIISQIQNSTTAVSGKKLAAAYDVSRQVIVQDIALIRAMGYDIISTNKGYIL NAPKSISKIFKVRHTDEQLEEELCAVVDNGGCIENVMINHKVYGHMEANLQINSRRKIKE FMEEIRSGKSSPLKNITSGYHYHKVSADSRETLEMVEKELKRKGFLIQTEYGEKEDGINV SRL >gi|330403673|gb|ADLB01000017.1| GENE 40 40415 - 41269 552 284 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 9 278 10 283 286 217 43 2e-55 MNTITMNLQADKLIREALCEDISSEDVTTNSVMKEAVMGEVDLLCKEDGVIAGLEVFERV FHLLDENVKVELYCKDGDKVKNGQLMGKVTGDIRVLLSGERVALNYLQRMSGIATYTNSV SALLEGTKTKLLDTRKTTPNMRIFEKYAVRVGGGYNHRYNLSDGVLLKDNHIGAAGSVTK AIEMAKEYAPFVRKIEVEVESIEMVKEAVKAGADIIMLDNMSPEEMEEAVRIIDGRAETE CSGNVTKENIGRLTSIGVDYISSGALTHSAPILDISLKNLHAVS >gi|330403673|gb|ADLB01000017.1| GENE 41 41281 - 42444 1102 387 aa, chain - ## HITS:1 COG:FN0009 KEGG:ns NR:ns ## COG: FN0009 COG0029 # Protein_GI_number: 19703361 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Fusobacterium nucleatum # 4 385 5 384 435 382 52.0 1e-106 MELNADTVIVGTGVAGLFSALNLSKKKKIIMITKADMESNDSFLAQGGICVLRNEKDYDS FFEDTMRAGHYENRKESVDIMIRNSREVIEDLVGYGVEFEKRDGDFAYTREGAHSKPRIL YHEDVTGKEITSKLLAQVKQLENVTIYEYTTMTDIIEENGKCIGIEAKREEEELRIYAKN TILATGGIGGNYKHSTNFPHLTGDAIDISKKHGIRLEHLDYVQIHPTTLYSKKEGRRFLI SESVRGEGAVLYNKSKERFVDELLPRDVVAKAIQNQMEKDGTEYVWLSMEHIPKETILNH FPNIYQKCLDEGYDVTKECIPVVPAQHYFMGGIWVDGDSRTSMPNLYAVGETSCNGVHGK NRLASNSLLESVVFAKRAAKKIEREAV >gi|330403673|gb|ADLB01000017.1| GENE 42 42460 - 43365 993 301 aa, chain - ## HITS:1 COG:FN0008 KEGG:ns NR:ns ## COG: FN0008 COG0379 # Protein_GI_number: 19703360 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Fusobacterium 
nucleatum # 5 300 3 296 298 308 55.0 8e-84 MDIIEKIKKLKEEKDAVILAHYYVSDEVQEIADYIGDSFYLSKVAVGLKEQTIVFCGVSF MGESAKILNPKKTVLMPDMSADCAMAHMADVETIQRMRDTYDDLAVVCYINSTGELKQHS DVCVTSANAVKIVKALPNKYIFFIPDRNLARYVAEQVPEKQFVFNEGYCPIHEQIRLEEV REEKELHPNAQILTHPECPKAICDLSDYIGSTSGIISYVGKSDCKEFIICTENGVRYELE KQNPDKKFYFTKTEPVCRDMKQITLEKIAHVLETGENEVQVEETLREESQKALERMLELA K >gi|330403673|gb|ADLB01000017.1| GENE 43 43671 - 44336 795 221 aa, chain + ## HITS:1 COG:CAC2928 KEGG:ns NR:ns ## COG: CAC2928 COG3859 # Protein_GI_number: 15896181 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 18 192 19 183 210 94 36.0 2e-19 MFQLLVNQDGGLTTTGYAVTIIVAILVLVAAFIFAGKASDRKKISTKQLVYCAMALALGF VTSYIKIFELPFGGAVTACSMLFIVLIGYWYGAKTGILIGFVYGIMQFLQGPYVLSFFQV CCDYLLAFAALGLAGFFVHKKNGLVKGYIVAILGRGAFHVLGGYLYWMDYMPKNFPKSLA ALYPFIYNYGFILAEGLITILILSIPAVKKALEQVKRNALS >gi|330403673|gb|ADLB01000017.1| GENE 44 44444 - 44977 181 177 aa, chain - ## HITS:1 COG:no KEGG:DSY0717 NR:ns ## KEGG: DSY0717 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 177 1 187 193 102 34.0 4e-21 MILWTIQHRLSYENMKKTGVLRADELYIWDNYLKKSYLWMAEQMKKRIGNAPEGVVFPVW AWYQWEGKRKRPDMRKHGCRWGEKGSSIVLLTVNVPDDCVLLSDFDYWHCVLNDTDIIFP YNETNVYSENEKRKSWENIFDVNCSFDGEVHHTLSTQATFWEIKSEWIIKVEYFISR >gi|330403673|gb|ADLB01000017.1| GENE 45 45010 - 45369 407 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|294640107|ref|ZP_06718121.1| ## NR: gi|294640107|ref|ZP_06718121.1| hypothetical protein CUS_1512 [Ruminococcus albus 8] # 1 113 1 113 123 98 47.0 2e-19 MWTYKELTREEFEKTKDKVEQFIREFGGRVVDIQLPYEQKSTSYSGNFIVEHTSLRPIYE LNKEYFRVDEICFRNKPFIVIEFGSYEELMSNTMEDAGAFPYDLSERDIRNEVLEILWN >gi|330403673|gb|ADLB01000017.1| GENE 46 45385 - 46959 1828 524 aa, chain - ## HITS:1 COG:CAC0033 KEGG:ns NR:ns ## COG: CAC0033 COG0661 # Protein_GI_number: 15893331 # Func_class: R General function prediction only # Function: Predicted unusual protein kinase # Organism: Clostridium acetobutylicum 
# 27 524 34 531 532 289 33.0 1e-77 MNRQEYKERLKEMTEVLRKHNIKRGVSPEKLRLILEDLGPTFIKLGQIMSMHSDILPKRY CDELMRLRSEVPPMLFAEVEEVLQGAYGCPWREIFEKIDEKPLGSASIAQVHKALLKTGE EVVVKVQRKGIYDKMARDIGLLHRAVKLLPPVSLKGMVDLDMVLEELWAVTREEMNFLTE AANLEEFARRNKDIAFVGVPKLYHEYTNHYVLVMEYIEGYAVDDKENLLADGYDLEEVGV KLVDHYIKQVMEDGFFHADPHPGNVKIRGGKIIWIDMGMMGRLTERDRELIGKAIEGIAM SDIGMIQDAVMALGDFKEAPDQSCLYEGIGNLLSQYGNAEMGNIDIAKVTMSLMEVMKEN KIIMPHGLTMLARGLTHMEGVIADIAPELNMVEIASRHMAGKLLKEKDWKQELKNSGKTI YRSFHKAMNIPSLISDILQGYLKGQTKINLDLHASAELERLLRRLVRNVVMGLWVMALLI SSSIICTTDMKPKVWGIPALGAFGYLIAFAIVMYVFLKHIFSKK >gi|330403673|gb|ADLB01000017.1| GENE 47 46973 - 47299 512 108 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0665 NR:ns ## KEGG: EUBREC_0665 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 108 1 108 150 102 66.0 3e-21 MEGFGENIKKLLLAGIGAVATTAEKSKELLEDMVEKGELTVEQGKVLNEELKHNIKKTVK ENVNVSVKPSSPEELDELLEKMTPEQIATLKERLHSMEKVQSDNGQAE >gi|330403673|gb|ADLB01000017.1| GENE 48 47334 - 47618 406 94 aa, chain - ## HITS:1 COG:SP0968 KEGG:ns NR:ns ## COG: SP0968 COG0818 # Protein_GI_number: 15900845 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Diacylglycerol kinase # Organism: Streptococcus pneumoniae TIGR4 # 1 94 36 130 131 69 45.0 2e-12 MKIHCLAILCVVAAGFLVRISVVEWCMCLILFGLILSLELVNTAIEAVVDLVTEERKPLA KLAKDTAAGAVLVAAIMAAGVGILIFVPKILALI >gi|330403673|gb|ADLB01000017.1| GENE 49 47850 - 48200 124 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLCTPFKQLYFILRISYKQGLQSSIFSKITPFSIFLNFFFIPKKVDREHINLVILHQIIS LKLLDFSLNDIKNALTEQADSMREKIDVVKCFDDDTFNHIAPITLDIQLLRREVIR >gi|330403673|gb|ADLB01000017.1| GENE 50 48199 - 48540 377 113 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2714 NR:ns ## KEGG: CDR20291_2714 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 100 1 100 277 130 66.0 2e-29 MTIKEVEEQTSLSRSNIRFYEKEKLIEPSRNESNGYKDYSENDVENIKKIAYLRTLGISI EDIRSIISEKVTLQEIIEKQNEVLKNQITNLNKAKLMCEKCWRKKVLVMKNYR 
>gi|330403673|gb|ADLB01000017.1| GENE 51 48498 - 49043 330 181 aa, chain + ## HITS:1 COG:no KEGG:CDR20291_2714 NR:ns ## KEGG: CDR20291_2714 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 173 101 272 277 184 52.0 1e-45 MLEEKSISYEKLQVEQYVTELQDYWKDNQTVFKLDSVSFLYIWGSMLTWTTITALCLIIG VLSYSKLPTEIPVQWSNGVATSLVNKNWIFIYPVICIIIRYLLKSFIYAKLQMNNYYGEI ITEYLTNYMCFIVLSIEIFSILFTFGVVKNIVMLLFVNTVVFIGLLVVGLTKMDLRGKGG L >gi|330403673|gb|ADLB01000017.1| GENE 52 49040 - 49198 186 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIQISNLTKEYGKTVIFSNLNYTFEPSFFNELTIYEKIAALPSYVKATSIIA >gi|330403673|gb|ADLB01000017.1| GENE 53 49381 - 52005 3398 874 aa, chain - ## HITS:1 COG:SMc00025 KEGG:ns NR:ns ## COG: SMc00025 COG0574 # Protein_GI_number: 15964685 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Sinorhizobium meliloti # 1 874 1 887 898 1001 57.0 0 MAKWVYMFTEGDATMRNTLGGKGANLAEMTKLGLPVPQGFTITTDACTQYYEDGRKINDE IMEQIMDAIVKMEEVTGKKFGDKENPLLVSVRSGARASMPGMMDTILNLGLNEEVVKVLA EKSGNPRWAWDCYRRFIQMYSDVVMEVGKKYFEELIDKMKEEKGITQDVDLTAEDLEVLA NQFKAEYKEKIGEEFPIDPKEQLMGAVKAVFRSWDNPRANVYRRDNDIPYSWGTAVNVQM MAFGNMGETSGTGVAFTRDPATGEKHLMGEFLMNAQGEDVVAGVRTPQKIDQLKEVMPEV YEQFVGICNTLEDHYRDMQDMEFTIEDKKLYMLQTRNGKRTAQAALKIACDLVDEGMITE EKAVAMIDPRNLDTLLHPQFDAAALKAAAPVAKALGASPGAACGKIVFTAEDAKEWAERG EKVVLVRLETSPEDIEGMKAAQGILTVRGGMTSHAAVVARGMGTCCVSGCGDITMDEANK KFTLAGKEYHEGDSISLDGSTGNIYDGIIPTVDATIAGEFGRIMGWADKYRTLKVRTNAD TPADARKARELGAEGIGLCRTEHMFFEGNRIDAFREMICSETLEEREAALEKILPEQQGD FEALYEALEGHPVTIRFLDPPLHEFVPTTEEDIKKLADTQGKTVEQIKAIIDSLHEFNPM MGHRGLRLAVTYPEIAKMQTRAVIRAAINVQKAHSDWTVKPEIMIPLSCDAKELKYVKDM VVATADAEIAAAGVELAYEVGTMIEIPRAALTADEIAKQADFFCFGTNDLTQMTYGFSRD DAGKFLDAYYDAKIFENDPFAKLDQTGVGTLMETAIKLGKPVNPNLHVGICGEHGGDPSS VEFCHKIGLDYVSCSPFRVPIARLAAAQAAINNK >gi|330403673|gb|ADLB01000017.1| GENE 54 52147 - 53442 1044 431 aa, chain - ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 
# Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 1 418 1 398 403 265 37.0 1e-70 MNKKMKVALAGLGSRGKDTYAPTAKLFPDKMEIVAIADIDPAKVQEVAKEYHIPEEMCFS SAEELIAQDKLADIMFITTQDKQHVEQAIPALKKGYHLLLEKPISPKPEECREIVKVAKE YNRQVIVCHVLRYTPIYSKLKDILDTGVIGDIVSVMSIENVGYWHQAHSFVRGNWRNSET TSPMILQKCCHDMDLLLWLTGKTCESVSSFGDTYLFKEECAPKGAALRCMDGCKAREDCP YDAEKIYLEHHKIGAKTGYTEWPLDVLTLHPSVETITEAIKTGPYGRCVYHCDNNVVDHQ IVNMKMTDGTTISHTMCGFTATGSRYAKFMGTKGEIIADMTENTIKITVFGQKDTEVIDI SKIATDFSGHGGGDNRMVEEFLDMLINETGATKRTTSVEQSVESHYCALAAEESRINGGK VIFLDKYRNGK >gi|330403673|gb|ADLB01000017.1| GENE 55 53518 - 54048 572 176 aa, chain - ## HITS:1 COG:no KEGG:Teth514_0984 NR:ns ## KEGG: Teth514_0984 # Name: not_defined # Def: hypothetical protein # Organism: Thermoanaerobacter_X514 # Pathway: not_defined # 4 156 5 154 300 62 32.0 5e-09 MKKKSLWILLGILILLIGVYFGIGHWQKVQKEKDDLEKEKSKITLVKGNDWKKLSYTKDG SELRFEKKEDTWYVEGDDKTKLTQSYVTAIAEAFSDLQAVRELKGGDELSDYGLDNPTYT VTLTDKDGKQIVCYIGNGAGENYYFTMGEKKKVYLIPGEVVQTLQYDLEQMKSTEE >gi|330403673|gb|ADLB01000017.1| GENE 56 54053 - 55435 1554 460 aa, chain - ## HITS:1 COG:slr2105 KEGG:ns NR:ns ## COG: slr2105 COG3225 # Protein_GI_number: 16330592 # Func_class: N Cell motility # Function: ABC-type uncharacterized transport system involved in gliding motility, auxiliary component # Organism: Synechocystis # 13 309 71 351 595 89 28.0 2e-17 MQKIKELFKSVHTKKGTYSMALTAVVIAIAIVVNMLAGKLPSSAKTIDVSGNQLYEITNT SKKVLKELDKEITFTILAEKKSVDDRIKTFVKKYAALSDKIDVEYVDPVLHPSALDKYDG EENSIVVQCKETERSIVIPFTDIIVYDEMSYYSGTMQEKEFDAEGQLTSAVNYVTSTVNK TVYRLAGHGESTLSTEVSALLGKSNISVNELNLMMKKEIPEDCDLIFIYAPTTDITADEQ KTLSAYLKEGGKVLFMSGGADADTPNLKNLLKEYGLQQAEGYIADTERSYQGNYYYLIPN LSVSGKMANDISSETVLLVNSVGFTETTPASENVSVDSFMTTSEKAYAVTEKEQKEGTYI LGAVAEDSETKGRLTVIGASTMIDSNLTEMFSNTENLTLFVNAVTSNFDDVENVSVKPKS LEVTYNTVRHAGAFSLLAIFGIPACILIIGASRCWKRRKA >gi|330403673|gb|ADLB01000017.1| GENE 57 55448 - 56314 834 288 aa, 
chain - ## HITS:1 COG:PA4038 KEGG:ns NR:ns ## COG: PA4038 COG1277 # Protein_GI_number: 15599233 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Pseudomonas aeruginosa # 4 194 7 205 244 70 28.0 3e-12 MKAIYKRELKSYFDSMTGYVFIAAIIIFTGIYFMVYNLNMGYPYFSYALYGAGFIFILSI PILTMRCFAEERKNKTDQLLLTSPVSVLSVVLGKYFAMTTVFLVPNLIFMAFPIIIKSQG KANFLIDYAGIFTFFVMGCVYIAIGMVISALTESPVIALLSTFGVLLVMYLGDALLAYLP TSAVGNLAGILVLIALLSCIVYRNTKNWVIGGGLGIVGFGITIAGYVLKKTWFENRLAEL LKKVLLTEGFENIVFNHLLDIGSLILYISLIFVFVFLTMQTIQKRRWN >gi|330403673|gb|ADLB01000017.1| GENE 58 56298 - 57281 1161 327 aa, chain - ## HITS:1 COG:sll0489 KEGG:ns NR:ns ## COG: sll0489 COG1131 # Protein_GI_number: 16331772 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Synechocystis # 1 310 1 313 342 256 45.0 6e-68 MIEVKNLVKKYGEYLAVDHLSFTVNKGEIYGFLGPNGAGKSTTMNIMTGYLGATEGEVIV NGHNILEEPEAAKKSIGYLPEIPPLYTDMTVLEYLCFVSDLKKIPKSERNEQINKVVELV KLEEVEMRLIKNLSKGYKQRVGLAQAILGFPEIIILDEPTVGLDPKQIIEIRELIRELAK EHTVILSSHILAEVREVCDYIMIISNGKLVASDTPENLEKRISGTEMLHLVAKADEVGRV ENILEKVEYIEHIECKQRGEFVEVDIEIENGRDIREEVFFAFSKNECPLVEITRQKVSLE DVFLELTEEEKHEAIVEEEELEDESDL >gi|330403673|gb|ADLB01000017.1| GENE 59 57470 - 58237 477 255 aa, chain - ## HITS:1 COG:mll6836 KEGG:ns NR:ns ## COG: mll6836 COG1943 # Protein_GI_number: 13475699 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Mesorhizobium loti # 14 140 24 150 230 68 29.0 9e-12 MPRQRRRKSATGFYHVIAKGINKERIFNQTREKLYFKRILREYLKEDEVEIYAYCIMSNH VHLIIKSDMAVLAAFMAKCLAKYAEYYNYKHNRNGHVFQNRFKSECIESERYFWNCVRYI HLNPMKCHNENTYLQYKYSSIGEYQLLKNDILHENALTIFQRKFEEWEDVVNFHNKKQYV VIDDVKEDMHLQRQEIALNILEQMKMEKNLSKIEEIIEEKSLREEYKERLIDKMKISSTR SENLYLFVKKKIIGE >gi|330403673|gb|ADLB01000017.1| GENE 60 58373 - 59761 1460 462 aa, chain - ## HITS:1 COG:CAC3195 KEGG:ns NR:ns ## COG: CAC3195 COG0423 # 
Protein_GI_number: 15896443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 2 449 4 450 462 693 71.0 0 MEKTMEKIVAVAKSRGFVYPGSEIYGGLANTWDYGNLGVELKNNVKRAWWKKFIQENPYN VGVDCAILMNPQTWVASGHLGGFSDPLMDCKECKERFRADKLIEDFAAENGIELTESVDA WSKEQMEAWIAEHEVPCPTCKKHNFTEIRQFNLMFKTFQGVTEDAKNTVYLRPETAQGIF VNFKNVQRTSRKKIPFGIGQIGKSFRNEITPGNFTFRTREFEQMELEFFCEPGTDLEWFA YWKKFCLDWLYSLGLKEDEVRARDHSPEELCFYSKATTDVEFLFPFGWGELWGIADRTDY DLSQHQEVSKQDMTYFDDEKKEKYIPYVIEPSLGADRMVLAFLCSAYDEEELEGGDVRTV MHFHPALAPIKIGVLPLSKKLNEGAEKVFAELSKTYNCEFDDRGNIGKRYRRQDEIGTPF CVTYDFDSEEDGAVTVRDRDTMEQERIKIEDLKAYFEKKFEF >gi|330403673|gb|ADLB01000017.1| GENE 61 59898 - 60644 415 248 aa, chain - ## HITS:1 COG:BH1369 KEGG:ns NR:ns ## COG: BH1369 COG1381 # Protein_GI_number: 15613932 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Bacillus halodurans # 9 244 7 241 254 114 28.0 2e-25 MNQPVIVTGIVLSVTPIGEYDKRVVILTKEKGRISAFAKGARRPNSQFGGAVSPFCFGEF TLYEGRNSYNIMQVHISNYFQELKNDMEATYYGFYFMEFAEYYTREQNDETQMLKLLYQT FRALTKKNISNELIRCIYELKAVYINGEGPQVFQCVKCGSADAPVVFSVRQGGIICKNCI NEVKDAVEIDTSTLYAMQYIAGSSIEKLYTFTVSENVLKKLKKIVSGYLSFYVDKSFKSL EILEICLK >gi|330403673|gb|ADLB01000017.1| GENE 62 60580 - 60744 172 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELTAWDKFSKSGKIEDYLEYCESRMEKTGEYNRESAGDSDRYCAVSDSDRGVR >gi|330403673|gb|ADLB01000017.1| GENE 63 60835 - 61731 993 298 aa, chain - ## HITS:1 COG:CAC1295 KEGG:ns NR:ns ## COG: CAC1295 COG1159 # Protein_GI_number: 15894577 # Func_class: R General function prediction only # Function: GTPase # Organism: Clostridium acetobutylicum # 5 294 2 291 296 320 54.0 3e-87 MRADYKSGFVTLIGRPNVGKSTLMNYLIGQKIAITSNKPQTTRNRIQTVLTTEQGQVVFV DTPGIHKAKNKLGEYMVNVAERTLNEVDVVLWLVEPSTFIGAGEKHIAEQLKRVKTPVIL VINKVDMVKKDEVFAFIDAYQKIYDFAAIVPVSAKNGENTEELLKVLFQYLPYGPQFYDE DTVTDQPQRQIVAEIIREKALHALNEEIPHGIAVSIEKMQFKRKIVEIEATIICERDSHK 
GIIIGKQGSMLKKIGTNARYEIEKMLEMQVNLKLWVKVKKDWRDSDFLIKNFGYTKDE >gi|330403673|gb|ADLB01000017.1| GENE 64 61754 - 64678 3272 974 aa, chain - ## HITS:1 COG:CAC3006 KEGG:ns NR:ns ## COG: CAC3006 COG1026 # Protein_GI_number: 15896258 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases, insulinase-like # Organism: Clostridium acetobutylicum # 9 974 13 975 976 837 46.0 0 MKLQELTTYELIQEHHLKDLQSEGYILKHKKSGAKVVLLSNDDENKVFSIGFRTPPKDST GLPHILEHSVLCGSKRFPSKDPFVELVKGSLNTFLNAMTYPDKTVYPIASCNDKDFQNLM HVYMDAVFYPNIYEHDEIFRQEGWSYKLDSADAKLEYNGVVYNEMKGAFSSPEGVLDRVV LNSLFPDTSYRNESGGDPEVIPELTYEQFLDFHRKYYHPSNSYIYLYGDMDMAEKLEWLD KEYLSHFDAMEIDSKIQKQEAFAERKEVEIAYSVSSNESEEDNTYLSYNKVIGTSLDRNL YLAFEILDYALLSAPGAPLKKALVDAGIGKDIMGSYDNGIYQPVLSIVAKNANQEQKEEF ISVIEDTLKSIVENGMDKKAVEAGINYHEFRYREADFGNFPKGLMYGLQIFDSWLYDDEK PFIHLDAIETFKFLKEQVNTNYFEQLIQKYLLDNTHASIVVVKAEKGRTARLEKELDEKL QAYKASLSKEEVDRLVERTAQLIAYQEEPSTEEELKTIPVLEREDISREIAPIYNEEKYY DDTLMVYHNIETNGIGYVDLLFDLSAVPAELLPYVGILQSVLGIIDTKNYEYGELFNEIN VHTGGIGTSLEMYPNVTKVKEKEFKATFEMKAKALYDKLPTAFAMMKEILVNSKLDDEKR LKEILDITKSRLQMRFQSAGHTTSALRAMSYASPLAKFKDITNGIGFYQTVNDICEHFEE KKEELIQNLQKLCKMLFRAENMMISYTASEEGLADMEKLIADLKTDLYKETVESVPCILQ CEKKNEGFKTSSKVQYVARAGNFIDQGVDYTGALHILKVILSYDYLWQNVRVKGGAYGCM SSFNRLGDGYFVSYRDPNLEKTNEIYEGITEYLRQFDVSDRDMTKYIIGTISNIDQPMNP AAKGDRSLNLYMNHVSKEMIEKERKEILDATQEDIRKLADVVDAVLKANQLCVIGSEEKI EEQKTLFDETKDLF >gi|330403673|gb|ADLB01000017.1| GENE 65 64834 - 66621 2121 595 aa, chain - ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 595 1 584 584 535 45.0 1e-152 MKVTERIAKLRALMEEKNIDMYIVPSADNHQSEYVGEHFKAREFITGFTGSAGTAVITKT EAGLWTDGRYFLQAEQQLEGSGVDLYRMGNPGVPTVLEFIADKLNENGTLGFDGRLVAVD EGKEYAKAASKKGGNVNYAYDLVDEVWEDRPALSTEKAFALDEKLVGESTESKLARIRKV MEEVGANVHVITSLDDVAWTLNVRGNDVAYSPLLLSYLVITMDQVDLYVDETKLNDEIRA 
NFNKVNVVLHPYNDIYEAMKVYDANDTLLIDPDRLNYALYYNIGANVNTVERQNPTVLMK AMKNEVELANTRNAHIKDGVAMTKFMKWVKENVGKMTITEMSASDKLEAFRAEQEGFLWP SFEPICGYGEHAAIVHYTSTPETDVELKEGALLLTDTGGNYYEGSTDITRTFALGEVSDV EKLHFTTVAKSMLNLANAKFMYGAMGVNLDILARKPFWDMNLNFNHGTGHGVGYLLNIHE GPSGIRWQYRPGESTPFEEGMVVTDEPGIYIAGSHGIRTENELIVRKGEANEYGQFMYFE TMTFVPIDLDAINPDIMSAEDKKMLNDYHKQVFEKISPYLNEEETEWLRKYTREI >gi|330403673|gb|ADLB01000017.1| GENE 66 66698 - 67054 203 118 aa, chain - ## HITS:1 COG:no KEGG:Kole_0724 NR:ns ## KEGG: Kole_0724 # Name: not_defined # Def: protein of unknown function DUF952 # Organism: K.olearia # Pathway: not_defined # 2 115 60 172 177 68 35.0 1e-10 MLILHCIKEKLYEEMREDRYFGKVLLKNNPFIHCSSIEYFWRVSPHFDNEKEKLVLLVID TDNLDVPVKWEDLEGCGREYPHIYGLVKMEAIKNVLPYLKKEDGTWIKNEELKHIADK >gi|330403673|gb|ADLB01000017.1| GENE 67 67074 - 68681 1985 535 aa, chain - ## HITS:1 COG:BH3449 KEGG:ns NR:ns ## COG: BH3449 COG1757 # Protein_GI_number: 15616011 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Bacillus halodurans # 40 530 3 506 516 329 41.0 7e-90 MKKVSRRKIVSVVSMVLLCVLMCSMTVFAAEEAPKPELYATFWALVPPVVAIALALITKE VYSSLFVGILVGALFYSGFSFEGTVTHIFQGGIISVLSDEYNVGILVFLVILGAMVSLMN RAGGSAAFGRWASQHIKTRVGAQLATVVLGVLIFIDDYFNCLTVGSVMRPVTDKHNVSRA KLAYLIDSTAAPVCIIAPISSWAAAVTGFVKGEDGFSIFIQAIPYNFYALLTIVMMIAIV TMKVDYGPMKIHEDNAKKGDIYTTPDRPYENASTEVVNHNGKVIDLVIPIISLIVCCVIG MIYTGGFFSGTDFVTAFSQSDASVGLVLGSFFGLVITIVLYMVRKVLSFSDCMSCIPEGF KAMVPAIMILTFAWTLKAMTDSLGAAVYVEGLVKSSAGSLMNFLPAIIFLIGCLLAFATG TSWGTFGILIPIVVQAFANDPKLMIISISACMAGAVCGDHCSPISDTTIMASAGAQSNHV NHVSTQLPYAMTAAAVSFVTYIVAGFVQSAWIALPFGIVLMLAVLFVIKKKTKSA >gi|330403673|gb|ADLB01000017.1| GENE 68 69212 - 69496 453 94 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK Prediction of potential genes in 
microbial genomes Time: Tue May 24 21:25:21 2011 Seq name: gi|330403600|gb|ADLB01000018.1| Lachnospiraceae bacterium 2_1_46FAA cont1.18, whole genome shotgun sequence Length of sequence - 32682 bp Number of predicted genes - 28, with homology - 27 Number of transcription units - 10, operones - 3 average op.length - 7.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 301 78 ## SZO_02320 transposase + Term 404 - 442 5.2 - Term 382 - 436 9.8 2 2 Tu 1 . - CDS 444 - 3983 4363 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 4026 - 4085 11.8 - Term 4043 - 4106 4.2 3 3 Op 1 . - CDS 4114 - 5034 769 ## COG0682 Prolipoprotein diacylglyceryltransferase 4 3 Op 2 . - CDS 5056 - 6153 1577 ## COG0012 Predicted GTPase, probable translation factor 5 3 Op 3 6/0.000 - CDS 6180 - 6839 622 ## COG1564 Thiamine pyrophosphokinase 6 3 Op 4 10/0.000 - CDS 6832 - 7494 622 ## COG0036 Pentose-5-phosphate-3-epimerase 7 3 Op 5 7/0.000 - CDS 7497 - 8375 883 ## COG1162 Predicted GTPases 8 3 Op 6 17/0.000 - CDS 8391 - 10424 2384 ## COG0515 Serine/threonine protein kinase 9 3 Op 7 5/0.000 - CDS 10424 - 11164 703 ## COG0631 Serine/threonine protein phosphatase 10 3 Op 8 4/0.000 - CDS 11175 - 12218 677 ## COG0820 Predicted Fe-S-cluster redox enzyme 11 3 Op 9 2/0.000 - CDS 12220 - 13569 1090 ## COG0144 tRNA and rRNA cytosine-C5-methylases 12 3 Op 10 1/0.000 - CDS 13562 - 14269 884 ## COG2738 Predicted Zn-dependent protease 13 3 Op 11 26/0.000 - CDS 14296 - 15225 1159 ## COG0223 Methionyl-tRNA formyltransferase 14 3 Op 12 4/0.000 - CDS 15227 - 15700 646 ## COG0242 N-formylmethionyl-tRNA deformylase 15 3 Op 13 . - CDS 15714 - 17942 1716 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 16 3 Op 14 . - CDS 17920 - 19404 1341 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 17 3 Op 15 . 
- CDS 19424 - 20620 995 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 18 3 Op 16 . - CDS 20622 - 23207 1799 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 23345 - 23404 6.6 19 4 Tu 1 . - CDS 23409 - 23582 161 ## gi|210617226|ref|ZP_03291470.1| hypothetical protein CLONEX_03692 - Prom 23652 - 23711 3.1 - Term 23714 - 23775 6.5 20 5 Op 1 . - CDS 23783 - 24535 717 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) - Prom 24623 - 24682 2.5 - Term 24568 - 24612 -0.9 21 5 Op 2 . - CDS 24698 - 24826 182 ## - Prom 24971 - 25030 4.9 - Term 25008 - 25050 6.1 22 6 Op 1 24/0.000 - CDS 25242 - 26039 590 ## COG1277 ABC-type transport system involved in multi-copper enzyme maturation, permease component 23 6 Op 2 . - CDS 26036 - 26923 733 ## COG1131 ABC-type multidrug transport system, ATPase component 24 6 Op 3 . - CDS 26955 - 27386 206 ## EUBELI_20627 hypothetical protein - Prom 27525 - 27584 6.1 + Prom 27498 - 27557 8.8 25 7 Tu 1 . + CDS 27596 - 28456 731 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family + Term 28466 - 28505 2.2 - Term 28448 - 28498 12.2 26 8 Tu 1 . - CDS 28501 - 30027 1584 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase - Prom 30067 - 30126 7.2 + Prom 30132 - 30191 11.7 27 9 Tu 1 . + CDS 30222 - 31439 1322 ## COG2195 Di- and tripeptidases + Term 31456 - 31496 7.4 - Term 31444 - 31484 7.4 28 10 Tu 1 . - CDS 31489 - 32010 359 ## COG1633 Uncharacterized conserved protein - TRNA 32168 - 32238 52.9 # Trp CCA 0 0 - LSU_RRNA 32303 - 32682 92.0 # CP000885 [D:78327..81226] # 23S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 
Predicted protein(s) >gi|330403600|gb|ADLB01000018.1| GENE 1 2 - 301 78 99 aa, chain + ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 95 365 457 472 61 31.0 8e-09 FQKQYYKTINECGIQVHYHKGTKGLVIQTFDKRLLFSVNNKIYELEVVPLHEPLSKNFDL QPLKEKPRKRNLPSPKHPWRMSTFLQFKNHKITETMLIC >gi|330403600|gb|ADLB01000018.1| GENE 2 444 - 3983 4363 1179 aa, chain - ## HITS:1 COG:CAC2229_1 KEGG:ns NR:ns ## COG: CAC2229_1 COG0674 # Protein_GI_number: 15895497 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Clostridium acetobutylicum # 3 402 2 404 413 563 67.0 1e-159 MGRKMKTMDGNHAAAHASYAFTDVAAIYPITPSSVMAEATDEWATQGRTNIFGQTVQVTE MQSEAGAAGTVHGSLAAGALTTTYTASQGLLLMIPNLYKIAGEQLPGVFNVSARALASHA LSIFGDHSDVYACRQTGCAMLCESSVQEVMDLTPVAHLSAIKGKIPFINFFDGFRTSHEI QKIETWDYEDLKDMADMDAIDAFRKHALNPNHPCQRGSAQNPDIFFQAREACNPYYDAMP AIVQEYMDKVNAKIGTNYKLFNYHGAEDAEHVIVAMGSVCDTIDETVDYLMAAGRKVGVV KVRLYRPFCAEALVNAIPESVKQITVLDRTKEPGALGEPLYLDVVAALKGTKFDAVPVFT GRYGLGSKDTTPAQIVAVYDNNAKQKFTIGIVDDVTNLSLEVGAPLVTTPEGTINCKFWG LGADGTVGANKNSIKIIGDNTDMYAQAYFDYDSKKSGGVTMSHLRFGKSPIKSTYLIKQA NFVACHNPSYINKYNMVQELVDGGTFLLNCPWDMEGLEKHLPGQVKAFIANHNIKFYVID GIKIGKEIGLGGRINTVLQSAFFKLANIIPEDHAIELMKAAAKASYGKKGDKIVQMNYDA IDAGAKQVVEIAVPESWKNAEDEGLFTPEVKGGKPEVVDFVKNIQAKVNAQEGNTLPVSA FTEYVDGSTPSGTSAYEKRGIAVDIPVWKPENCIQCNRCAYVCPHAVIRPVALTEEEAAN APEGLETIDMIGMPGLKFTMTVSAYDCTGCGSCANVCPGKKGEKALVMENMEANAGKQDF FDYGREIPVKPEVVAKFKETTVKGSQFKQPLLEFSGACAGCGETPYAKLITQLFGDRMYI ANATGCSSIWGNSSPSTPYTVNEKGQGPAWSNSLFEDNAEFGYGMLLAQKALRNGLKTKV EKLVENGDNEAVVAAGKEWLDTFNCGATNGTATDNLVAALENCDCGCELRADILNNKDFL AKKSQWIFGGDGWAYDIGFGGVDHVLASGQDINIMVFDTEVYSNTGGQSSKSTPTGAIAQ FAAAGKEVKKKDMASIAMSYGYVYVAQIAMGADFNQTVKAIAEAEAYPGPSLIIAYAPCI NHGIKKGMNKAQTEEELAVKSGYWHNFRFNPAAEGNKFTLDSKEPTESYREFLDGEVRYN ALARMNPERAEELFAKSEQAAKERYAYLNKLITLYGNEE 
>gi|330403600|gb|ADLB01000018.1| GENE 3 4114 - 5034 769 306 aa, chain - ## HITS:1 COG:all4699 KEGG:ns NR:ns ## COG: all4699 COG0682 # Protein_GI_number: 17232191 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Nostoc sp. PCC 7120 # 26 279 22 265 283 137 31.0 3e-32 MNKMINFPNLNIHWENVGEKINLFGIEITYFGILLGVAILFGIALTMWLAHRSGQEAEEY LNLSIYVIIFGIVGARLYYVLFGFEQFKGHWTKIFNIRSGGLAFYGALIAGIITVFIFCK RQGLLVWKVLDTIVPALLIGQILGKLGNFFNREAFGECTGGIFAMQLPIDSVRAVDVTEK MRQNVEKIDGINMIQVSPSFLYEIILCLVILIFILIYRRFEVYRGEIFLLYIISYGAGRF FIEMGRVDKLRTFIFHLPVSQVVAVLSVIIAMGLLCITYRSGDGQGKGFFRTNKAKKKKT KLKFSK >gi|330403600|gb|ADLB01000018.1| GENE 4 5056 - 6153 1577 365 aa, chain - ## HITS:1 COG:CAC2134 KEGG:ns NR:ns ## COG: CAC2134 COG0012 # Protein_GI_number: 15895403 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted GTPase, probable translation factor # Organism: Clostridium acetobutylicum # 1 365 1 365 365 465 66.0 1e-131 MKLGIVGLPNVGKSTLFNSLTKAGAESANYPFCTIDPNVGVVTVPDKRLDVLGEMYHTKK IVPAAIEFVDIAGLVKGASKGEGLGNQFLANIREVDAIVHVVRCFEDSNIVHVDGSIAPL RDIETINLELIFSDIEILERRIAKAVKVARNDKTVAKELALLERLKAYLEEGNMARSFEA EDDDELEWLESYNLLTYKPVIFAANVAEDDLANDAQDNEGVCAVREYAKKENCEVFVVCA QIEQEIAELDDDEKKMFLEDLGLEESGLEKLIRASYSLLGLISYLTAGEPEVRAWTIKKG TKAPQAAGKIHTDFERGFIRAEIVSYDDLMECGTYTAAKEKGLVRLEGKDYVVKDGDIIL FRFNV >gi|330403600|gb|ADLB01000018.1| GENE 5 6180 - 6839 622 219 aa, chain - ## HITS:1 COG:CAC1731 KEGG:ns NR:ns ## COG: CAC1731 COG1564 # Protein_GI_number: 15895008 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine pyrophosphokinase # Organism: Clostridium acetobutylicum # 4 217 2 211 211 137 36.0 2e-32 MNRRAVIVSGGTIQEEFVYAKLKEWEHDVLIGVDKGVEFLYRHQIKPDYIVGDFDSLSEE IVQFYKEKTDVFIREFNPEKDYSDTEIAVYQAMELNCEEIILLGATGSRIDHVLANIQVL AIPHKRGIHAEILDENNRISLIEHEKVLEKEKMYGKYFSVFPLDRCIEKFSIAGAKYPLH NHRLCPYDSLCVSNQAQEEEVKITFSEGIVILIEAKDEN >gi|330403600|gb|ADLB01000018.1| GENE 6 6832 - 7494 622 220 aa, chain - ## 
HITS:1 COG:CAC1730 KEGG:ns NR:ns ## COG: CAC1730 COG0036 # Protein_GI_number: 15895007 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Clostridium acetobutylicum # 5 217 4 213 216 217 51.0 1e-56 MKYILAPSILAADFKKLGEEIQATEKSGAEYLHFDVMDGLFVPSISFGMPVLQSIKSCTN QVIDAHLMINEPIRYVEAFQKAGADLITVHLEACEDVHKTLQKIKECGMKAGISICPDTP VSALEEYIEEADMILIMSVHPGFGGQKFIEDSLTKIRMTREMLNRYGLETDIQVDGGIYT SNVENVLKAGANIIVAGSAVFKGDAKENTKEFMEILKKYE >gi|330403600|gb|ADLB01000018.1| GENE 7 7497 - 8375 883 292 aa, chain - ## HITS:1 COG:lin1933 KEGG:ns NR:ns ## COG: lin1933 COG1162 # Protein_GI_number: 16800999 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Listeria innocua # 1 292 2 291 291 254 44.0 2e-67 MQGKIIKGIAGFYYVHVVEFGVYECKAKGAFRKEKVKLLVGDNVEIDILNEEKKLGNVVK LLPRENELIRPAVSNIDQALVVFAITKPKPHFNLLDRFLVMMESKKIPIVLCFNKKDIAE EKEIRELEEIYSSCGYHIIFTSAKEEENIKLVKEVLKGKTTAVAGPSGVGKSSLINLFQS EIQMETGCISEKIERGKHTTRHSELIPIDDNSYIMDTPGFSSLYVNDFEKEELKYYFPEF EQYEGLCKFNGCDHIHEPNCAVKQAVEEGKIHKVRYENYIEMYKELQEKRRY >gi|330403600|gb|ADLB01000018.1| GENE 8 8391 - 10424 2384 677 aa, chain - ## HITS:1 COG:BH2504_1 KEGG:ns NR:ns ## COG: BH2504_1 COG0515 # Protein_GI_number: 15615067 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Bacillus halodurans # 4 316 3 307 343 279 49.0 1e-74 MKDGIFIGDRYEVLSKVGSGGMADVYKGKDHKLNRFVAIKILKKEYREDELFVRKFQSEA QAAAGLMHPNIVNVYDVGEESGLYYMVMELVEGITLKDYITKKGVLSSKEVISIAIQVAN GIEAAHLHNIVHRDIKPQNIIISKEGKVKVTDFGIAKATSSNTISTNALGSVHYTSPEQA RGGFSDFKSDIYSLGITMYEMATGHLPFDGDTAVSIAIKHLQEEITPPSEYVEDIPFSLE QIILKCTQKSADRRYQDVASLKEDLKHSLVDPDGDFVVIGPMRTMDETVMFNPDDVRKAK RETSYDEDDFSEEDYDDYDDFEDDDDYDKKGRKKNGDVNPKMAKMMKVLTIVAVVIIVFI VVFIIGKATGLLKFGPSINLEDDADKNEKVKVPDLRGMTEEEAKKVLKDKKYDMTLVVDS EEESTTYEKGQIIKQTPGSNKEVKKKTVIKVVISSGKKEETKAIPDVSGKDQSEALQLLS 
EAGFTNVKVDKSRFETSSEYDQDEVIRTEPSANTEVALDEEINLIVSLGENKPEVPNVIG MTQSSAVSALEAKGLEVSVEEDYHDSTEKGKVFYQSKRAGTRVTEGSTIIIRVSKGKKQV EVTVPDIIGKSKTYAKNSLEAQQLGYREVYQDSDLPKDTVISCDPGSGKNVVQGTTITVY ISRGPSTDGGTGTGSDN >gi|330403600|gb|ADLB01000018.1| GENE 9 10424 - 11164 703 246 aa, chain - ## HITS:1 COG:lin1935 KEGG:ns NR:ns ## COG: lin1935 COG0631 # Protein_GI_number: 16801001 # Func_class: T Signal transduction mechanisms # Function: Serine/threonine protein phosphatase # Organism: Listeria innocua # 7 238 7 241 252 172 43.0 7e-43 MKIFSLTDIGRKREVNQDYVYVTDKPVGHVPNLFVVADGMGGHKAGDFASKYAVQVLEEH VRNHSGMGPELIITDAVREANRKIVEKAKQDTGLEGMGTTLVVATIIEHTLYFANVGDSR LYLIRDEIKQLSKDHSLVEEMVRLGGINEEEAKHHPDKNIITRAIGAKDDVEVDFFEYRL QKGDIILMCTDGLTNMVDDDEIFRIVKGGRDVVETAMQLVEKANENGGKDNIGIVLVEPF ACEVNV >gi|330403600|gb|ADLB01000018.1| GENE 10 11175 - 12218 677 347 aa, chain - ## HITS:1 COG:CAC1726 KEGG:ns NR:ns ## COG: CAC1726 COG0820 # Protein_GI_number: 15895003 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Clostridium acetobutylicum # 4 346 2 343 345 356 51.0 3e-98 MEKKDIRSYTLEELKKELESIGEKPFRSKQIYSWLHEKLADSFEEMTNLSKALREKLEKD YEIYPVTMVERQVSKLDGTNKFLFALRDNHVVESVLMRYKHGNSVCISSQVGCRMGCRFC ASTLDGLARNLTPSEMLGQIYQIQKITGERVSNIVIMGTGEPLDNYENFVRFIKLISDEN GLHVSQRNITVSTCGIVPNMKRLAEEKFQITLALSLHGSTQEKRRELMPVANKYELKEVL EACDNYFDRTGRRITFEYSLVQGVNDREEDAGELISILKPRNCHLNLIPVNPIKERNFEK PTRQNAEKFKNKLEKSGINVTIRREMGSDIDGACGQLRRRYSEKSKI >gi|330403600|gb|ADLB01000018.1| GENE 11 12220 - 13569 1090 449 aa, chain - ## HITS:1 COG:SA1060 KEGG:ns NR:ns ## COG: SA1060 COG0144 # Protein_GI_number: 15926800 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Staphylococcus aureus N315 # 8 449 32 460 461 283 37.0 7e-76 MTKQISEREIVLGILLEITKEGGYSHIVLRNTLEKYQYLDKKERAFITRVVEGTLEHLIE IDYIINQFSKVKVNKMKPVIRTIIRNSVYQIKYMDSVPNSAVCNEAVKLAEKKGFFQLKG FVNGVLRNIDRNVSNIQYPDKKNMVEYMSIKYSMPIWILEEWGKSYDFSVIERILQGFLE 
EKGTTIRCNLNKISKEELVEKLKKENVRVKEHAYLPYALEISGYNYLGDMESFREGDFQV QDISSMLVSEIADPKKEDYVIDVCSAPGGKALHMADKLGETGYVEARDLTDYKVSLIEDN IERSGMKNIRAVKQDALICDEESKEKADIVLADLPCSGLGVLGKKTDIKYKMTRETQQEL KKLQRKILSVVQNYTKPGGTLVYSTCTINEEENIGNVRWFLKQYPQYQLVSVEDRLCDEL KPSVKEGCLQLLPGIHNSDGFFIAKFRKD >gi|330403600|gb|ADLB01000018.1| GENE 12 13562 - 14269 884 235 aa, chain - ## HITS:1 COG:TM1511 KEGG:ns NR:ns ## COG: TM1511 COG2738 # Protein_GI_number: 15644259 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Thermotoga maritima # 7 229 5 224 230 213 50.0 3e-55 MYYPMYFDPTYILIIAGMIISLIASARVKSTFAKYQRVANHSGMTGRDAAERILHGANIY DVRIERVSGHLTDHYDPRDKVLRLSDSTYQSASVASVGVAAHECGHAIQHATGYAPLRFR GALVPVANFGATIAWPLILLGFFINSRSSDLFIHLGIIAFSLSVLFQLVTLPVEFNASRR AVRILGSSGMLYQNEVAETKKVLDAAALTYVASAAGSILQLLRILIIAGGRRGND >gi|330403600|gb|ADLB01000018.1| GENE 13 14296 - 15225 1159 309 aa, chain - ## HITS:1 COG:CAC1723 KEGG:ns NR:ns ## COG: CAC1723 COG0223 # Protein_GI_number: 15895000 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Clostridium acetobutylicum # 1 304 2 305 310 312 52.0 4e-85 MRVIFMGTPDFSVGTLEELIHAGHEVVLAVTQPDKPKGRGKEMQFTPVKEVALQHGIPVF QPKKIRDAESIEKLREYPADVMVVVAFGQIVPKEILEMTPYGCINVHASLLPKYRGAAPI QWSLIDGESVTGVTTMQMDEGLDTGDMLLKTEIPISPKETGGSLHDKLAEAGAKLCVETL KALEEKTVTGEKQGETPTAYARMLDKKLGNIDWSKGAAEIERLIRGLNPWPSAYTMWEGK VMKIWEAEIFEGQKENEPGTVVKVEKDGFFVQTGNGLLKITQLQIPGKKRMEAGAFLRGY SIKEQTKLG >gi|330403600|gb|ADLB01000018.1| GENE 14 15227 - 15700 646 157 aa, chain - ## HITS:1 COG:CAC1722 KEGG:ns NR:ns ## COG: CAC1722 COG0242 # Protein_GI_number: 15894999 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Clostridium acetobutylicum # 1 144 1 144 150 164 59.0 5e-41 MAIREIRKIGDEVLTKKCKEVKTITPRTVELIEDMFDTMYDEMGVGLAASQVGILKRIVT IDVGEGPILLINPEIIETSGEQTGDEGCLSVPGKAGQVTRPFYAKVRACDIDMQPFEIEG 
EGLLARAFCHEIDHLDGHLYVDKVEGELHDVSYGEEE >gi|330403600|gb|ADLB01000018.1| GENE 15 15714 - 17942 1716 742 aa, chain - ## HITS:1 COG:CAC1721 KEGG:ns NR:ns ## COG: CAC1721 COG1198 # Protein_GI_number: 15894998 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Clostridium acetobutylicum # 2 742 4 733 733 566 42.0 1e-161 MYADIIIDITHENVDRVFQYIVPPQLEGKLKEGMEVLVPFGAGNREKKGYIVGFSEKCNY DSAKMKAIHDISDTAVQIEGRLVSLAGWIKENYGGTMIQALKTVLPVKQKSQAKINRTIK LLLTQEEAEQKLEEFLHKNMKARARVLAELMDNKVLPESIVIKKLNVARNVIKALEEQKI LSVESERIYRNPVKAGERKSAEITYTKEQENAIRLFKDDYDKGIRKTYLIHGVTGSGKTG VYMEMIEKVIREGKQAVFLIPEIALTYQTVMRFYQRFGDRVSIMHSRLSDGEKYDQMMRA KNGEIDVMIGPRSALFTPFEKLGLIVIDEEHEATYKSEQIPRYHARETARRRAEMEGASV VLGSATPSVDSYYKALHGEYVLLELKCRATGQQLPEMYTVDLREELKSGNRSIISDRLRT LMSDRLKKREQIILFINRRGYAGFVSCRSCGYVAKCPHCDVALSAHKNGKLVCHYCGYEE PMHTSCPECGSRHIGGFRAGTQQIEELVKKEFPTAGVLRMDMDTTKAKDGHEKILSAFAN EEADILIGTQMIVKGHDFPNVTLVGILAADLSLYADDYRAGERTFQLLTQAGGRAGRGEK KGEVVIQTYSPEHYCIQTASRQDYEAFYEAEINYRELMGYPPAEQLLAIYLTSEDEKLLE RGAHFLKEYAERVNKRKEYRIIGPVSPYVEKINDVYRKLIYIKQERYGMLIDMKDFLEKY IEVNSGFQRIRIQFDFNPVNVF >gi|330403600|gb|ADLB01000018.1| GENE 16 17920 - 19404 1341 494 aa, chain - ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 1 484 1 480 482 335 39.0 1e-91 MKLTNLLRPIKYSILQGESEPDIEKVTHDSREVERDTLFVCIKGSATDGALYVGEAVEKG AVAILADRRIEVPREVTFILVENVRETLAKIASVFYGEPSKEMTVIGITGTKGKTTTAYM IYHILEQSGYKVGLIGTIEIRIGEKRIPAKNTSPESVNIQRFLKEMVLQKCDIAVMEVSS QGLKLHRTDEIQFDIGVFTNLGKDHIGKFEHKDFEEYKNCKAKLFSQSDFGIANMDSPYW REMFLRANCDVETVGFSKEADYYASDEVVTRKSGRLGIRYRLNGKDRGDIVLSMAGRFNI YNSLCAIAVCRRLGVSFENIRQSLCKTQVKGRVEQISTNKGTIILIDYAHNAMSLQSILE MVHSYRPHRIICIFGCGGNRAKERRYEMGKVSGALADFTIITSDNPRYENPLQIMSEIRE 
GVEDGNGNYLEICNREEAIRYGIKYAGAADIIVIAGKGHEEYQEIEGIRYHMTDREVVCH IIDEEKKNVCRYYY >gi|330403600|gb|ADLB01000018.1| GENE 17 19424 - 20620 995 398 aa, chain - ## HITS:1 COG:CAC0285 KEGG:ns NR:ns ## COG: CAC0285 COG0389 # Protein_GI_number: 15893577 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Clostridium acetobutylicum # 1 398 1 392 396 370 46.0 1e-102 MTPIIFHIDVNSAYLSWTAVEKLKQGADIDLREIPAIIGGNQESRHGVVLAKSVPAKRYG IRTGEPVANAFRKCPNLIMEPPNHKMYSIYSKELMEFLRQYTSDIEQVSVDECYMDFTGI AYKYTSPVEAAVEIKNKILEQFGFTVNIGISTNKLLAKMASDFEKPNRVHTLFPEEVSKK MWPLPVSSLYMAGRASVEVLRKLEINTIGELAGTNPRILELHLKSHGKMLWEFANGIDHA KVQSEETNAKGIGNSTTLARDVADKETAKEVLLWLSESVGKRLRKAGQKAQMISVEIKYY NFRSVSHQRQMRRAAYEDKTLYENACALFEELWNGEPIRLLGIRTSKLVEETAPEQLSIF DIQQEKPKDEKHRKLDKALDEIRKRFGEDAVKRGTFLK >gi|330403600|gb|ADLB01000018.1| GENE 18 20622 - 23207 1799 861 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 854 1 806 815 697 44 0.0 MNINKFTQKSLQAVQGCEKIAYDYGNQEVEQEHLLYSLLTQEDSLILKLLEKMNIQKELF INRVEEALRKRVKVQGGQIYIGQDLNKVLIHSDDEATQMGDEYVSVEHIFLSLLKYPNRE IKTIFKEFGITRELFLQALSTVRGNQRVTNDNPEATYDTLNKYGTDLVERAREQKLDPVI GRDAEIRNVIRILSRKTKNNPVLIGEPGVGKTAAVEGLAQRIVRGDVPEGLKDKTIFALD MGALVAGAKYRGEFEERLKAVLEEVKSSDGKIILFIDELHTIVGAGKTDGAMDAGNMLKP MLARGELHCIGATTLDEYRQYIEKDAALERRFQPVMVGEPTVEDAISILRGLKERYEVFH GVKITDGALVSAAVLSNRYISDRFLPDKAIDLVDEACALIKTELDSMPTELDELRRRIMQ LEIEEAALKKENDRLSQDRLVNLQKELGELRDEFAGRKAQWDNEKASVEKVQKIREEMEA LNKEIQKAQRSYDLEKAAELQYGKLPQLQKQLAEEEEKVKARELTLVHESVTDEEIAKII SRWTGIPVAKLNESERNKTLHLEEELHKRVIGQDEGVTKVTEAIIRSKAGIKDPSKPIGS FLFLGPTGVGKTELAKALAESLFDDENNMVRIDMSEYMEKYSVSRLIGAPPGYVGYDEGG QLTEAVRRKPYSVVLFDEIEKAHPDVFNVLLQVLDDGRITDSQGRTVDFKNTILIMTSNI GSAYLLDGIDDDGNIQKESEELVMRDLRNHFRPEFLNRLDETIMFKPLTKSNICDIIDLL VKDINHRLEEKEISIELTQAAKAMITERGYEAMYGARPLKRYLQKHVETLAAKLILADKV KSQDIILIDVENNELKASVRG >gi|330403600|gb|ADLB01000018.1| GENE 19 23409 - 23582 
161 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210617226|ref|ZP_03291470.1| ## NR: gi|210617226|ref|ZP_03291470.1| hypothetical protein CLONEX_03692 [Clostridium nexile DSM 1787] # 2 57 279 334 409 87 75.0 2e-16 MQAKVGCEYISDLPFEVDGYSGFVFKAKSGRPFMPNGVNSVLYNIVDAYNKTEVENA >gi|330403600|gb|ADLB01000018.1| GENE 20 23783 - 24535 717 250 aa, chain - ## HITS:1 COG:MA3611 KEGG:ns NR:ns ## COG: MA3611 COG0596 # Protein_GI_number: 20092411 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Methanosarcina acetivorans str.C2A # 3 124 6 125 263 70 31.0 2e-12 MSYFKYQSKRIFYKEIGCGKPLIMLHGDAVSSTMFEMLLPLYQEHFRVILIDFLGNGKSD RVKKIPADLWSSQAHQVITLIEHLKLEKVNLLGTSGGAWVAVNTALERPDLIEKVIADSF DGRRLDQNFVENLLKEREYAKHDTYGKQFYEWCQGEDWETVVDLNTQALVECAKKKIPLF SKPLESLSIPILFIGSKQDDMCRKDMLEEYAEMKKLVKHGSIHTFDTGGHPTIVTNAEKF ADIVLHFLND >gi|330403600|gb|ADLB01000018.1| GENE 21 24698 - 24826 182 42 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEMKYVVSDMAQSFGTLEFPGEREPIYERTKNNRRDNRKINN >gi|330403600|gb|ADLB01000018.1| GENE 22 25242 - 26039 590 265 aa, chain - ## HITS:1 COG:TM1029 KEGG:ns NR:ns ## COG: TM1029 COG1277 # Protein_GI_number: 15643787 # Func_class: R General function prediction only # Function: ABC-type transport system involved in multi-copper enzyme maturation, permease component # Organism: Thermotoga maritima # 6 237 8 236 263 59 21.0 8e-09 MSIPLLKTELKSNCKIIVLFLAIITMYAGIITAMYDPSLSEGINELAKSMPQFFAAFGME NPGTTLLEFLNNYLYGFILILIPFIYTIIMCYKLVAKYIDKGSMAYLLNTHYSRMQIILT QFVILLIGLLILIGYATGLIIAASGMLFEGELEITKFLFLNLGLFLLELFLATLCFMFAC MFDELKFSVGLGAGLGMMFFLVQMLSQVSEDIEFLKYFTPLTLFNPESLVKYKTDSFICL GVLFGSAIVFFAVGAMRFQKRDLSL >gi|330403600|gb|ADLB01000018.1| GENE 23 26036 - 26923 733 295 aa, chain - ## HITS:1 COG:TM1028 KEGG:ns NR:ns ## COG: TM1028 COG1131 # Protein_GI_number: 15643786 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Thermotoga maritima # 
1 290 1 288 293 226 40.0 4e-59 MDVIEISNLTKDYGNNKGIFDVSFSIKQGECVGFLGSNGAGKTTTIRHLMGFIQSECGSA KILGMDCFKESASIQEKIGYLPGEIAFMDNMTGEEFIQFMAKMKNIKDLHYARELISYFS INTKVKIKKMSKGMKQKIGLIVAFMQNTPILILDEATTGLDPLMQNKFIELIKKEKDKGK TILMSSHIFEEIENTCDRIVMIKEGRIVADENINMIKSNMSKHYEIMFANTEDAVHFQKL YADSSERQENVVYLISNDNVNLLLSELCKYDIKDINARHQTLEELFLKYYGGNKQ >gi|330403600|gb|ADLB01000018.1| GENE 24 26955 - 27386 206 143 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_20627 NR:ns ## KEGG: EUBELI_20627 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 137 32 167 172 119 42.0 4e-26 MTENKVPPHITVSSFEMGEEREIIGKLRENIEVLSRGEVIWCSVGMFLPSVIYIAPVLNN YLQNLSKFIYNIFLQTGNISVSPYYRPMQWFPHATVGKKLSREEMIGAVRALQDEFGMFR GEVTHIGLAKTNPYEELWKVELK >gi|330403600|gb|ADLB01000018.1| GENE 25 27596 - 28456 731 286 aa, chain + ## HITS:1 COG:CAC1054 KEGG:ns NR:ns ## COG: CAC1054 COG0010 # Protein_GI_number: 15894341 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Clostridium acetobutylicum # 2 285 3 298 299 173 34.0 3e-43 MLNIFGCPMHCGVGEKGLVHSLDYLNEHNPSLNLTTIPEISVSDENSAHMKNFRSVLETC TSIAEYSYNNILSKGNTPLFLGGDHSAAIGTVSATATQYDNIGLIWIDAHPDINTDLTSA SGNIHGMPISALLGNLSPAGADLSKILTDNPKVKPENIVHIGLRDIDPPEAVILKELNIK HFTYDDIKEKGLDFCLNEAISHLSHLKHIHLSFDIDVVNPQLLPGVSVPVNDGFTIPEIY HVFERILGELPISALDIVEYNKEYDKDDVTADFVSELITFIQKHFN >gi|330403600|gb|ADLB01000018.1| GENE 26 28501 - 30027 1584 508 aa, chain - ## HITS:1 COG:SPy0388 KEGG:ns NR:ns ## COG: SPy0388 COG0769 # Protein_GI_number: 15674533 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Streptococcus pyogenes M1 GAS # 28 500 28 481 481 249 34.0 7e-66 MGQMNKYTINVYSELVKEQDNLLEEELYGKEQTTIEQLTFDSKKVTENTLFICKGMAFKE EYLKEAVKNGAVCYISEQKYELDEDVPYILVKDIRKAMPVLANKFFNEAWKELNVIGIGG TKGKSTSVYYMKSILDDYLAANGKKESAVISSIDIYDGVEKIESHITTPEAVELHGYFRN AVNCGMEFVEMEVSSQALKYGRVNEMMFDVAVFLNICEDHISPIEHKDFEDYFSSKLLIF 
QKTKTAVVNLDADFSDRILEAAQCAPKVVTISTKNENADFYAYNIHKEGDGTIFNVRCAE FDKEFELTMPGFFNVENALATIAVANQFNIPVEFMYSGLKKARSSGRMELSTSDDNKIIA VVDYAHNKLSFEKLFSSIKEEYPDREIISIFGCPGKKAFTRRRDLGTIAGKYAKKVYLAA EDPGYEPVEEISRDIAEYVEAQGCPYEMIEDRGEAIKKAIESVENPSILLITGKGNETRQ KYGCEYVDCPSDTDYVKKYLDEYNKNHQ >gi|330403600|gb|ADLB01000018.1| GENE 27 30222 - 31439 1322 405 aa, chain + ## HITS:1 COG:FN0733 KEGG:ns NR:ns ## COG: FN0733 COG2195 # Protein_GI_number: 19704068 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Fusobacterium nucleatum # 5 405 11 411 412 365 48.0 1e-100 MRAYERLLNYVVVRTPSDENCENVPSSQCQFDLAKILVEELKQLGIDNAYVNDKCFVYAK LPATPGYEDKTKLGFIAHMDTVADFCDHDIVPVVTENYNGEDLVLGTSGRTLAVADFPHL PSLKGRTLITSDGTTILGADNKAGIAEIMTMLEKIISENIPHGQISVAFTPDEEIGTGAQ SFDVETFDADFGYTLDGDLEGEIQYENFNAYKATFDITGFSVHPGSSKNTMINASLVAME INSMLPAGETPRGTEEYEGFYHLVSMSGDVCKATLNYIVRDHDASLMNARLNTLRLIEKL MNEKWGEGTVKLEIAEQYRNMAEIVGTCMHLIENAKIACENVGVTPNIIPIRGGTDGCQL SFKGLPCPNLGTGGHGYHGPYEHITVEGMDLMTDVAVEIVKLYAK >gi|330403600|gb|ADLB01000018.1| GENE 28 31489 - 32010 359 173 aa, chain - ## HITS:1 COG:CAC1633 KEGG:ns NR:ns ## COG: CAC1633 COG1633 # Protein_GI_number: 15894911 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 15 172 65 231 236 67 33.0 1e-11 MCESQEKYQYVDEAPYPPVQVCGRNQDYAIAILGNIGACNSEMSAIALYFYNSIILGESY EEVGRIFHKISMVEMHHLNIFGRLSEMLGAEPRLWSVNRGRFTYWTPKCNNYPRKIDSLL KNAYKGELDAIDKYTKQLSWIEDEHIRANLERIILDEKCHLKIFESLYEKYCE Prediction of potential genes in microbial genomes Time: Tue May 24 21:28:07 2011 Seq name: gi|330403359|gb|ADLB01000019.1| Lachnospiraceae bacterium 2_1_46FAA cont1.19, whole genome shotgun sequence Length of sequence - 122174 bp Number of predicted genes - 135, with homology - 128 Number of transcription units - 49, operones - 28 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 312 164 ## COG3666 Transposase and inactivated derivatives - Term 365 - 423 6.3 2 2 Tu 1 . - CDS 434 - 2434 1259 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 - Prom 2456 - 2515 7.5 + Prom 2521 - 2580 12.5 3 3 Tu 1 . + CDS 2629 - 4107 1151 ## COG4868 Uncharacterized protein conserved in bacteria + Term 4114 - 4178 2.1 - Term 4102 - 4164 4.3 4 4 Tu 1 . - CDS 4167 - 4919 843 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 4951 - 5010 4.2 - Term 4967 - 5027 8.2 5 5 Op 1 2/0.000 - CDS 5028 - 6431 1821 ## COG2610 H+/gluconate symporter and related permeases 6 5 Op 2 3/0.000 - CDS 6443 - 7441 405 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 7 5 Op 3 1/0.000 - CDS 7458 - 8762 1182 ## COG3395 Uncharacterized protein conserved in bacteria 8 5 Op 4 5/0.000 - CDS 8802 - 10220 1294 ## COG3395 Uncharacterized protein conserved in bacteria 9 5 Op 5 . - CDS 10236 - 11123 1086 ## COG2084 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases - Prom 11183 - 11242 7.6 - Term 11225 - 11280 11.2 10 6 Op 1 . - CDS 11282 - 13090 1865 ## DSY1218 hypothetical protein 11 6 Op 2 . - CDS 13101 - 13796 652 ## DSY1217 hypothetical protein 12 6 Op 3 . - CDS 13789 - 14490 429 ## DSY1216 hypothetical protein - Prom 14515 - 14574 8.6 - Term 14550 - 14599 10.6 13 7 Op 1 40/0.000 - CDS 14602 - 15831 983 ## COG0642 Signal transduction histidine kinase 14 7 Op 2 . - CDS 15828 - 16502 644 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 16531 - 16590 7.4 - Term 16581 - 16628 14.3 15 8 Op 1 35/0.000 - CDS 16631 - 18475 172 ## PROTEIN SUPPORTED gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein 16 8 Op 2 4/0.000 - CDS 18468 - 20636 212 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 17 8 Op 3 . 
- CDS 20650 - 21072 572 ## COG1846 Transcriptional regulators - Prom 21107 - 21166 9.4 + Prom 21123 - 21182 11.8 18 9 Op 1 . + CDS 21276 - 22196 715 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 19 9 Op 2 . + CDS 22208 - 23008 300 ## COG0584 Glycerophosphoryl diester phosphodiesterase + Prom 23012 - 23071 7.1 20 10 Op 1 . + CDS 23107 - 23763 379 ## COG0671 Membrane-associated phospholipid phosphatase 21 10 Op 2 . + CDS 23835 - 24080 155 ## + Term 24105 - 24148 6.3 - Term 24085 - 24142 9.8 22 11 Op 1 . - CDS 24144 - 25295 1421 ## COG1775 Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB - Prom 25325 - 25384 7.2 23 11 Op 2 . - CDS 25386 - 27146 532 ## PROTEIN SUPPORTED gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase 24 11 Op 3 . - CDS 27147 - 27497 487 ## COG1393 Arsenate reductase and related proteins, glutaredoxin family 25 11 Op 4 . - CDS 27512 - 27907 190 ## EUBREC_2353 hypothetical protein - Prom 27934 - 27993 5.1 26 12 Op 1 34/0.000 - CDS 28013 - 28735 603 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 27 12 Op 2 31/0.000 - CDS 28732 - 29421 809 ## COG0765 ABC-type amino acid transport system, permease component 28 12 Op 3 . - CDS 29442 - 30203 1057 ## COG0834 ABC-type amino acid transport/signal transduction systems, periplasmic component/domain - Prom 30237 - 30296 11.1 - Term 30254 - 30318 14.8 29 13 Op 1 . - CDS 30321 - 31154 1081 ## COG1968 Uncharacterized bacitracin resistance protein 30 13 Op 2 . - CDS 31171 - 32001 938 ## COG0561 Predicted hydrolases of the HAD superfamily 31 13 Op 3 . - CDS 31985 - 32572 545 ## gi|167759822|ref|ZP_02431949.1| hypothetical protein CLOSCI_02185 - Prom 32600 - 32659 9.4 + Prom 32596 - 32655 8.9 32 14 Tu 1 . + CDS 32715 - 33773 1042 ## COG3773 Cell wall hydrolyses involved in spore germination + Term 33781 - 33840 0.1 - Term 33777 - 33821 2.0 33 15 Tu 1 . 
- CDS 33832 - 34587 921 ## Emin_0754 aminoglycoside phosphotransferase - Prom 34688 - 34747 9.1 34 16 Tu 1 . - CDS 35138 - 37750 3027 ## COG1012 NAD-dependent aldehyde dehydrogenases - Prom 37979 - 38038 8.2 + Prom 37848 - 37907 7.7 35 17 Tu 1 . + CDS 38011 - 38283 356 ## COG4545 Glutaredoxin-related protein + Term 38290 - 38320 3.4 - Term 38275 - 38311 5.6 36 18 Tu 1 . - CDS 38314 - 40062 1397 ## GYMC10_3134 peptidase S41 - Prom 40082 - 40141 4.3 - Term 40089 - 40127 8.4 37 19 Op 1 50/0.000 - CDS 40143 - 40670 736 ## PROTEIN SUPPORTED gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P - Term 40768 - 40821 3.0 38 19 Op 2 26/0.000 - CDS 40833 - 41792 1213 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 39 19 Op 3 36/0.000 - CDS 41856 - 42449 858 ## PROTEIN SUPPORTED gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 40 19 Op 4 48/0.000 - CDS 42467 - 42862 630 ## PROTEIN SUPPORTED gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 41 19 Op 5 . - CDS 42962 - 43330 542 ## PROTEIN SUPPORTED gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e - Prom 43357 - 43416 5.3 + Prom 43347 - 43406 6.8 42 20 Tu 1 . + CDS 43492 - 43596 57 ## - Term 43585 - 43617 -0.4 43 21 Tu 1 . - CDS 43632 - 43745 189 ## PROTEIN SUPPORTED gi|160881761|ref|YP_001560729.1| ribosomal protein L36 - Prom 43776 - 43835 2.3 44 22 Op 1 . - CDS 43864 - 44082 262 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 45 22 Op 2 . 
- CDS 44084 - 44335 337 ## gi|167759839|ref|ZP_02431966.1| hypothetical protein CLOSCI_02202 46 22 Op 3 12/0.000 - CDS 44347 - 45102 826 ## COG0024 Methionine aminopeptidase 47 22 Op 4 28/0.000 - CDS 45107 - 45751 894 ## COG0563 Adenylate kinase and related kinases - Prom 45797 - 45856 6.3 48 22 Op 5 53/0.000 - CDS 45858 - 47174 802 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 49 22 Op 6 48/0.000 - CDS 47175 - 47615 654 ## PROTEIN SUPPORTED gi|238922848|ref|YP_002936361.1| 50S ribosomal protein L15 50 22 Op 7 50/0.000 - CDS 47639 - 47818 214 ## PROTEIN SUPPORTED gi|168334334|ref|ZP_02692521.1| 50S ribosomal protein L30 -related protein 51 22 Op 8 56/0.000 - CDS 47833 - 48342 716 ## PROTEIN SUPPORTED gi|240145861|ref|ZP_04744462.1| ribosomal protein S5 52 22 Op 9 46/0.000 - CDS 48357 - 48725 528 ## PROTEIN SUPPORTED gi|238916281|ref|YP_002929798.1| large subunit ribosomal protein L18 53 22 Op 10 . - CDS 48743 - 49285 791 ## PROTEIN SUPPORTED gi|160881770|ref|YP_001560738.1| ribosomal protein L6 54 22 Op 11 . - CDS 49381 - 49530 152 ## 55 22 Op 12 50/0.000 - CDS 49565 - 49966 610 ## PROTEIN SUPPORTED gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 56 22 Op 13 50/0.000 - CDS 50085 - 50270 307 ## PROTEIN SUPPORTED gi|158319556|ref|YP_001512063.1| ribosomal protein S14 57 22 Op 14 48/0.000 - CDS 50288 - 50827 818 ## PROTEIN SUPPORTED gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 58 22 Op 15 57/0.000 - CDS 50851 - 51162 471 ## PROTEIN SUPPORTED gi|240145867|ref|ZP_04744468.1| 50S ribosomal protein L24 59 22 Op 16 50/0.000 - CDS 51175 - 51543 569 ## PROTEIN SUPPORTED gi|160881775|ref|YP_001560743.1| ribosomal protein L14 60 22 Op 17 . - CDS 51565 - 51819 379 ## PROTEIN SUPPORTED gi|160881776|ref|YP_001560744.1| ribosomal protein S17 61 22 Op 18 . 
- CDS 51860 - 52075 278 ## PROTEIN SUPPORTED gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 62 22 Op 19 50/0.000 - CDS 52056 - 52493 663 ## PROTEIN SUPPORTED gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 63 22 Op 20 61/0.000 - CDS 52496 - 53152 1004 ## PROTEIN SUPPORTED gi|240145872|ref|ZP_04744473.1| SSU ribosomal protein S3P 64 22 Op 21 59/0.000 - CDS 53164 - 53550 594 ## PROTEIN SUPPORTED gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 65 22 Op 22 60/0.000 - CDS 53581 - 53862 463 ## PROTEIN SUPPORTED gi|238922834|ref|YP_002936347.1| 30S ribosomal protein S19 66 22 Op 23 61/0.000 - CDS 53892 - 54734 1308 ## PROTEIN SUPPORTED gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 67 22 Op 24 61/0.000 - CDS 54800 - 55099 413 ## PROTEIN SUPPORTED gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 68 22 Op 25 58/0.000 - CDS 55099 - 55719 829 ## PROTEIN SUPPORTED gi|238916266|ref|YP_002929783.1| large subunit ribosomal protein L4 69 22 Op 26 40/0.000 - CDS 55748 - 56428 958 ## PROTEIN SUPPORTED gi|238922830|ref|YP_002936343.1| 50S ribosomal protein L3 70 22 Op 27 . - CDS 56503 - 56820 512 ## PROTEIN SUPPORTED gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 - Prom 57045 - 57104 8.9 + Prom 57120 - 57179 8.1 71 23 Tu 1 . + CDS 57206 - 58303 1057 ## CLJ_B2686 hypothetical protein 72 24 Op 1 . - CDS 58321 - 58788 479 ## CDR20291_3099 hypothetical protein 73 24 Op 2 . - CDS 58800 - 59564 683 ## Clos_0031 D-proline reductase (dithiol) (EC:1.21.4.1) - Prom 59589 - 59648 9.7 - Term 59672 - 59716 9.1 74 25 Op 1 . - CDS 59724 - 59969 262 ## CLD_2160 D-proline reductase, PrdB subunit, selenocysteine-containing (EC:1.21.4.1) 75 25 Op 2 . - CDS 59997 - 60449 516 ## Clos_0030 D-proline reductase (dithiol) (EC:1.21.4.1) 76 25 Op 3 . - CDS 60449 - 60736 425 ## CLD_2159 hypothetical protein 77 25 Op 4 . - CDS 60752 - 62581 2091 ## Clos_0028 glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits 78 25 Op 5 . 
- CDS 62650 - 63039 653 ## Clos_0028 glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits 79 25 Op 6 . - CDS 63057 - 64373 1348 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC - Prom 64396 - 64455 8.3 - TRNA 64497 - 64594 20.7 # SeC TCA 0 0 + Prom 64622 - 64681 7.9 80 26 Op 1 . + CDS 64926 - 65957 1139 ## COG3938 Proline racemase 81 26 Op 2 . + CDS 65973 - 66560 607 ## COG1309 Transcriptional regulator + Term 66565 - 66607 4.4 - Term 66544 - 66603 12.0 82 27 Op 1 . - CDS 66606 - 66899 413 ## gi|167758842|ref|ZP_02430969.1| hypothetical protein CLOSCI_01185 83 27 Op 2 . - CDS 66915 - 68066 1689 ## Shel_25180 fatty acid/phospholipid biosynthesis enzyme 84 27 Op 3 . - CDS 68080 - 69612 1913 ## Shel_25190 hypothetical protein - Prom 69654 - 69713 5.0 - Term 69705 - 69748 10.1 85 28 Op 1 . - CDS 69797 - 70123 413 ## Shel_25200 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) 86 28 Op 2 . - CDS 70139 - 70270 199 ## Amet_3592 glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) - Prom 70302 - 70361 4.0 - Term 70311 - 70344 1.5 87 29 Op 1 11/0.000 - CDS 70398 - 70715 493 ## COG0526 Thiol-disulfide isomerase and thioredoxins 88 29 Op 2 . - CDS 70735 - 71682 594 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 89 29 Op 3 . - CDS 71707 - 71937 265 ## Shel_25230 glycine reductase, selenoprotein B 90 29 Op 4 . - CDS 71965 - 73011 1432 ## Shel_25230 glycine reductase, selenoprotein B 91 29 Op 5 . - CDS 73031 - 74314 1664 ## Shel_25240 glycine/sarcosine/betaine reductase component B alpha/beta subunit 92 29 Op 6 . - CDS 74383 - 74766 388 ## Amet_3596 GrdX protein - Prom 74805 - 74864 3.2 93 30 Tu 1 . - CDS 75011 - 76405 852 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 - Prom 76450 - 76509 7.0 - Term 76596 - 76631 5.1 94 31 Op 1 . - CDS 76643 - 76879 264 ## Cphy_1490 hypothetical protein 95 31 Op 2 . 
- CDS 76881 - 77084 318 ## gi|226322978|ref|ZP_03798496.1| hypothetical protein COPCOM_00750 96 31 Op 3 . - CDS 77138 - 78235 1424 ## Shel_25130 hypothetical protein 97 31 Op 4 . - CDS 78251 - 79390 1090 ## COG0520 Selenocysteine lyase - Prom 79426 - 79485 6.0 - Term 79464 - 79522 14.5 98 32 Op 1 . - CDS 79525 - 80424 807 ## COG0583 Transcriptional regulator 99 32 Op 2 7/0.000 - CDS 80424 - 82337 1874 ## COG3276 Selenocysteine-specific translation elongation factor 100 32 Op 3 1/0.000 - CDS 82368 - 83780 1210 ## COG1921 Selenocysteine synthase [seryl-tRNASer selenium transferase] 101 32 Op 4 . - CDS 83793 - 84749 883 ## COG0709 Selenophosphate synthase - Prom 84769 - 84828 3.7 102 33 Tu 1 . - CDS 84859 - 85461 727 ## EUBREC_3158 hypothetical protein - Prom 85708 - 85767 4.5 + Prom 85544 - 85603 5.4 103 34 Tu 1 . + CDS 85639 - 86499 821 ## COG1092 Predicted SAM-dependent methyltransferases + Term 86502 - 86563 11.1 - Term 86501 - 86541 6.6 104 35 Op 1 . - CDS 86545 - 87207 661 ## COG0637 Predicted phosphatase/phosphohexomutase 105 35 Op 2 . - CDS 87225 - 87842 541 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 106 35 Op 3 . - CDS 87847 - 88623 675 ## COG0703 Shikimate kinase - Prom 88672 - 88731 11.5 - Term 88699 - 88741 6.7 107 36 Op 1 . - CDS 88984 - 89142 168 ## 108 36 Op 2 . - CDS 89178 - 89411 347 ## gi|210618046|ref|ZP_03291881.1| hypothetical protein CLONEX_04114 109 36 Op 3 10/0.000 - CDS 89420 - 89887 513 ## COG0691 tmRNA-binding protein 110 36 Op 4 . - CDS 89892 - 92018 1778 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 - Term 92045 - 92085 7.0 111 37 Op 1 . - CDS 92104 - 92346 395 ## Cphy_2867 preprotein translocase, SecG subunit 112 37 Op 2 . - CDS 92398 - 93693 1277 ## COG0148 Enolase - Prom 93768 - 93827 9.7 + Prom 93666 - 93725 8.2 113 38 Tu 1 . 
+ CDS 93817 - 94659 842 ## COG1284 Uncharacterized conserved protein + Term 94663 - 94715 18.3 - Term 94651 - 94703 18.3 114 39 Tu 1 . - CDS 94713 - 96059 1523 ## COG1109 Phosphomannomutase - Prom 96266 - 96325 7.0 - Term 96168 - 96222 9.1 115 40 Op 1 . - CDS 96377 - 97753 1502 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases - Term 97777 - 97821 9.0 116 40 Op 2 . - CDS 97835 - 99010 1343 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 117 40 Op 3 . - CDS 99022 - 99735 897 ## EUBREC_1566 hypothetical protein 118 40 Op 4 . - CDS 99792 - 99935 138 ## 119 40 Op 5 . - CDS 99859 - 100362 651 ## COG0782 Transcription elongation factor - Term 100380 - 100433 13.1 120 41 Op 1 . - CDS 100441 - 102069 1307 ## COG1316 Transcriptional regulator 121 41 Op 2 1/0.000 - CDS 102179 - 102949 975 ## COG2013 Uncharacterized conserved protein 122 41 Op 3 . - CDS 102968 - 103753 898 ## COG1028 Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) - Prom 103801 - 103860 4.2 - Term 103814 - 103866 12.4 123 42 Op 1 . - CDS 103885 - 110325 6366 ## BLD_1258 subtilisin-like serine protease - Prom 110355 - 110414 5.0 - Term 110477 - 110514 6.2 124 42 Op 2 . - CDS 110535 - 111902 1084 ## COG2509 Uncharacterized FAD-dependent dehydrogenases - Prom 112000 - 112059 8.4 - Term 111963 - 111993 -0.5 125 43 Op 1 . - CDS 112181 - 112774 546 ## Amet_1784 hypothetical protein 126 43 Op 2 . - CDS 112761 - 113252 327 ## BcerKBAB4_5408 hypothetical protein - Prom 113420 - 113479 8.6 - Term 113511 - 113550 -0.4 127 44 Op 1 59/0.000 - CDS 113556 - 113948 604 ## PROTEIN SUPPORTED gi|238925338|ref|YP_002938855.1| 30S ribosomal protein S9 128 44 Op 2 . - CDS 113975 - 114403 627 ## PROTEIN SUPPORTED gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 - Prom 114426 - 114485 11.4 - Term 114448 - 114489 1.1 129 45 Op 1 . 
- CDS 114574 - 115677 1363 ## COG3853 Uncharacterized protein involved in tellurite resistance 130 45 Op 2 . - CDS 115690 - 116850 1002 ## Cbei_0753 hypothetical protein 131 45 Op 3 . - CDS 116840 - 119215 1972 ## COG0474 Cation transport ATPase - Prom 119293 - 119352 11.9 + Prom 119252 - 119311 11.5 132 46 Tu 1 . + CDS 119446 - 119637 153 ## - Term 119639 - 119684 0.4 133 47 Tu 1 . - CDS 119695 - 120918 836 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 - Prom 120938 - 120997 3.3 - Term 120971 - 121008 7.1 134 48 Tu 1 . - CDS 121060 - 121131 57 ## - Prom 121251 - 121310 5.6 + Prom 121333 - 121392 4.3 135 49 Tu 1 . + CDS 121441 - 122169 374 ## LGG_02944 integrase Predicted protein(s) >gi|330403359|gb|ADLB01000019.1| GENE 1 3 - 312 164 103 aa, chain - ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 101 9 101 334 135 72.0 1e-32 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACKRDINFMWLLAG >gi|330403359|gb|ADLB01000019.1| GENE 2 434 - 2434 1259 666 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 16 647 2 625 636 489 44 1e-137 MDNQNNNNKNKNQKNNRQGWGIILFTTLITLFLVLALYRFMPGSNPKEITYDQFLKMVDE GKVEKVKINSDKIEITPKKKDSSKDSELMFPSVAPKYYTGVVKDDKLSERLYKAGVKYTQ EIPDSGSDIFWNIMLTLVLPTLLLFGVFSFFMRKMSKGGGMMGIGKSNAKMYVEKQTGVT FKDVAGQDEAKESLQEVVDFLHNPGKYTSIGAKLPKGALLVGPPGTGKTLLAKAVAGEAK VPFFSLSGSAFVEMYVGVGASRVRDLFKQAQSMAPCIIFIDEIDAIGKSRDNAMGSNDER EQTLNQLLAEMDGFDTDKGLLLLAATNRPEILDPALLRPGRFDRRIIVDAPDLKGRVDVL KVHAKDVKMDETVDLEAIALATSGAVGSDLANMINEAAINAVKNGRQVVSQADLFEAVEV VLVGKEKKDRIMNAEERRIVSYHEVGHALVSALQKDAEPVQKITIVPRTMGALGYVMQTP EEEKFLNTKKELEAMLVGMLAGRAAEEIVFDTVTTGASNDIEKATKIARAMITQYGMSEK FGLIGLESVQHKYLDGRPVTNCGEETAAEIDREVMKMLKDAYEEAKRLLSENREALDKIA AFLIEKETITGKEFMKIFREVQGIEEEEEPQKQRIEMKVEEPEQLEQKDVEEPVQNEPEV SETTEE >gi|330403359|gb|ADLB01000019.1| GENE 3 2629 - 4107 1151 492 aa, chain + ## HITS:1 COG:Cgl2942 KEGG:ns NR:ns ## COG: Cgl2942 COG4868 # Protein_GI_number: 19554192 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Corynebacterium glutamicum # 1 492 1 493 495 592 57.0 1e-169 MKIGFDNDKYLKMQSEHIRERINQFDNKLYLEFGGKLFDDYHASRVLPGFAPDSKLRLLM QLSDQAEIVIVISAGDIEKNKIRGDLGITYDTDVLRLMEAFKENGLFVGSVVITQYSGQN SANLFKSKLEHLGIKVYIHYCIEGYPSNIPLIVSDEGYGKNDYIETSRPLVIITAPGPGS GKMATCLSQLYHEHKRGVRAGYAKFETFPIWNIPLKHPVNLAYEAATADLNDVNMIDPFH LETYGVTTVNYNRDVEIFPVLDAIFEKIYGENPYKSPTDMGVNMVGNCISDDEVCQNASK QEIIRRYYDALNGLAEGTHSEEEIFKIELLMKQAHVTTDDRVVVSAALQRAEETGAPAAA IELDDGTIITGKTSNLLGASAALLLNALKHLSKIDDELHMISPEAIEPIQTLKTKYLGSK NPRLHTDEVLLALSISAASNPNAELALEQIPKLKGCQVHTTVRLSSVDVKMFKRLGIQLT TESKDENKRIYQ >gi|330403359|gb|ADLB01000019.1| GENE 4 4167 - 4919 843 250 aa, chain - ## HITS:1 COG:BH0801 KEGG:ns NR:ns ## COG: BH0801 COG1349 # Protein_GI_number: 15613364 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Bacillus halodurans # 6 230 13 241 259 160 38.0 2e-39 MLAAERREKITEFLYRYGSVQVEELAQELGVSTMTIRRDLLKLQEDGKIERCHGGAVAKQ 
EVTYMDKQTSHCEEKAAIAEKCAEYVVSGNTVFLDAGTTTYEIARKIMDIPDILVVTNDL EIAQLLKNSPVELFVCGGHVQKATGSMFGHYATEMLKDFKFDVGFFGAASINDEFEVMTP TVDKMWLKRETPKQCRNAYLAVDGSKFDRQAMAKINCLGDYTGVITEKKFSKDEQVYLNG MGAEIIEVSI >gi|330403359|gb|ADLB01000019.1| GENE 5 5028 - 6431 1821 467 aa, chain - ## HITS:1 COG:FN0225 KEGG:ns NR:ns ## COG: FN0225 COG2610 # Protein_GI_number: 19703570 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism # Function: H+/gluconate symporter and related permeases # Organism: Fusobacterium nucleatum # 8 456 3 440 452 409 56.0 1e-114 MVNGLSGERMLIGLAIGIAVLIFLVLKTKVQAFLALIICTVIVGVVGGMPLNNITLEDGT TFGILNSITSGFGGTLSSIGIIIGFGVMMGQIFEVTGAAKRMAHTFLKLFGKKREEEALA FTGFLVSIPIFCDSGFVVLAPIAKAISRVTKKSVIGLGVALASGLVITHSLVPPTPGPLG VCGIFGIDVGSFILMTILLAIPMAIACIAYSRLYLSKKYYRIPNEEGEIVEATYQEPDYE AAFSMDMTGLPGVFESFMPLLIPIILILVNTVATAMGQTKGIMNVLIFLGQPIIAVGLGL IVAIFTLGRKLDRHTALMEMEKGMMSAGIIMLVTGGGGALGQIIKDSGLGNFMAEGLAKT AIPIIILPLLIATAMRFIQGSGTVAMTTAASISAPIILASGASPLLGAIACCVGSLFFGY FNDSYFWVVNRTLGVSEAKDQLRVWSVTSTVAWAVGVVEVLILNIFM >gi|330403359|gb|ADLB01000019.1| GENE 6 6443 - 7441 405 332 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium ALC-1] # 6 320 9 320 346 160 30 3e-38 MKRSFIAVPIGDPAGVGPEIVAKSVASKDVTAAAKCVIVGNKEVMDNAIKITGENLTVNV IDDVEQGDYREGILNLIHIDNMDMSQFAFGKVSGMCGKAAYDYIEKSIELANNGKVDAVA TTPINKESLRAGGINFIGHTEIFGALTNTEDPLTMFETNGLRVFFLTRHVSLREMLDMIK KDRIIDYVKRCTAALEKLGVTDGVMAIAGLNPHSGEHGLFGWEEVEEIVPAIEELKKQGY KVEGPIGADSVFHQAAQGKYSSVLSLYHDQGHIATKTLDFEKTIAITNGMPILRTSVDHG TAFDIAGKGVVSAVSMIEAILLAAKYAPNFTK >gi|330403359|gb|ADLB01000019.1| GENE 7 7458 - 8762 1182 434 aa, chain - ## HITS:1 COG:FN0227 KEGG:ns NR:ns ## COG: FN0227 COG3395 # Protein_GI_number: 19703572 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Fusobacterium nucleatum # 1 432 1 425 425 259 34.0 9e-69 MPQCVVIADDLTGANATGVLLKKMNYKAYTVMNMERIELEMLSDCDCVLYPTDSRGVDAE 
ISYNRVKNVCELLKHEEVKVYSKRIDSTLRGNLGSETDAMLDVLGEDYVAIVAPCFPSSK RIAIGGYMLVNGIPLHKTDIAIDPKTPVTKSEIAVLFEEQSKYKVSLICMKDLMHGKHYL ADKMKECVNAGSRIIVLDCVTQEDLDLIADAVITSRLKTVAVDPGVFTATLSRKLIVPAE KQEKSRILAVVGSVNPNTKTQMEELWLSQRTHNVFVHTKELLEGDSRRENEIERVTEEIL SESSRNIVSTVVGDGIYPENRIDFIPYMEKYNCSMDEVTERINSAFAEITYRIFQKETSF KGLYTSGGDITVAVCRKFKTAGLLLKDEVLPLAAYGQFLKGEFDGIHIITKGGSQGESDA INRCITYLKEKLYI >gi|330403359|gb|ADLB01000019.1| GENE 8 8802 - 10220 1294 472 aa, chain - ## HITS:1 COG:alr3454 KEGG:ns NR:ns ## COG: alr3454 COG3395 # Protein_GI_number: 17230946 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Nostoc sp. PCC 7120 # 28 472 2 439 441 269 35.0 1e-71 MLSAEILNSFEAIDEEYVDNLLRREIENNGKKIVVLDDDPTGVQTVHDISVYTNWDKDSM LQGFEEKNSLFYILTNSRGFTAEQTTKVHHEIASVIDEVSKETGKEYIFISRSDSTLRGH YPLETEILKEDYEKNTGKIIDGEIICPFFKEGGRFTIDNVHYVKYGNELIPANETEFAKD STFGYTAGTMPEYVEEKTKGKYKASDVCRIPLEDMRSMNIDKIEQTLSEVNGFNKIIVNA VDYADIKVFCVALYRAMAKGKVFMFRTAAAIVKVMGGISDQPLLTKEQMIVKETDNGGVV IVGSHTDKTSRQVEELRKLENIEFIELDVSLVHDENAFAKEVERCLEKEENCIRNGKTVC CYTTRSLITADTGDKEDELRLSVRISDAVQSLVGRLQVTPRFVIAKGGITSSDIGTRALA VKKANVLGQIKPGIPVWQTGGESKFPMTPYVIFPGNVGEDTTLRETVEVLTK >gi|330403359|gb|ADLB01000019.1| GENE 9 10236 - 11123 1086 295 aa, chain - ## HITS:1 COG:STM3248 KEGG:ns NR:ns ## COG: STM3248 COG2084 # Protein_GI_number: 16766546 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxyisobutyrate dehydrogenase and related beta-hydroxyacid dehydrogenases # Organism: Salmonella typhimurium LT2 # 1 293 3 296 296 276 51.0 5e-74 MKVGFIGLGIMGKPMAKNLLKAGIELWVFDLNREAVREVEEAGAYATTYAEIGKKCDIVF IIVPSGDIVKSILFGEDGVASTLKEGSIVCDMSSVTPVESRECYERLQKIGVGFVDAPVS GGEPGAIAGTLAIMAGGDEKDFEALQKYFDILGSTALLIGGSGSGSVTKLANQIIVNNTI AVVSEAFVLAAKAGADPQKVYEAIRGGLAGSAVLDAKIPMIIERNFKPGGPIRINHKDIK NVVNTAHSLDVPIPYTAQLYEILQTLKVHGHMNDDHGGIVQYFEKLADVEVKKSE >gi|330403359|gb|ADLB01000019.1| GENE 10 11282 - 13090 1865 602 aa, chain - ## HITS:1 COG:no KEGG:DSY1218 NR:ns ## KEGG: DSY1218 # Name: 
not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 7 578 11 606 644 406 43.0 1e-111 MRKRTIFLTVFLAGTLIVSGCGNKNSNSENTDTAKTSTTQNIDIANMFSDRDMETGYDEE KSVNIKLADNGTTSDSDTVKISDNIVTITEEGTYVLTGALTDGMVIVEVKDTEKVQLVLN GVEITNETSAAIYVRSADKVFITTTENAENVLENGGEYTAIDDNNIDAVIFSKADLTLNG TGSLMVNAEAGHGIVSKDDLVLTSGTYEIQSASHGLSGKDSVRVANGTYTITSEKDGIHA ENADDDSLGFVYLAEGNFEIVAKGDGISAGNWLQSDGGTYTIIAGDGSENVQKNGQKWEF GPRGEQENTAEENSVSMKGIKAAGDLTVTAGKYDLNTADDSIHSNANTTISGGEWTIASG DDGIHGNSATTISGGTIEITQSYEGIEGLSIDITGGDIQLVSSDDGLNAAGGNDSSGFEG PGGDQFAAEDGAYIHISGGKLKVNASGDGIDSNGELTISGGETYVSGPTNNGNGTLDYSG TAKITGGIFVGAGSSGMAQSFGEDSTQGVIFVAMNSQSSKSAITLLDADKKEIVSWTPDK EYTSVIISAPSVKEGGQYTLTAGETTENITMDSLIYGNAQGMGQAPGNGQNDRKQGTPPD MR >gi|330403359|gb|ADLB01000019.1| GENE 11 13101 - 13796 652 231 aa, chain - ## HITS:1 COG:no KEGG:DSY1217 NR:ns ## KEGG: DSY1217 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 219 1 220 232 220 57.0 3e-56 MVNKIFGGIFDTEMTNVIALSDFLLCVGCALIIGFILSMAYMYGTRYTTSFVVTLAVLPA IVCVVIMMVNGNVGAGVAVAGAFSLVRFRSVPGSAKEIGAIFLAMGTGLVAGMGYLGYAL LVSVLLGGITLLYYRLDFGNGKKKALYKTLHITIPEDLDYTEVFDGILKQYTQTCDLVQV KTTNMGSLFKLTYNLTLCDMNKEKELIDSLRCRNGNLEITVSKQETIIGEL >gi|330403359|gb|ADLB01000019.1| GENE 12 13789 - 14490 429 233 aa, chain - ## HITS:1 COG:no KEGG:DSY1216 NR:ns ## KEGG: DSY1216 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 3 224 4 225 242 197 47.0 2e-49 MAHQMTFKRYELKYLLTRREKEKILCSMAPYMKLDKYGRTTIRNIYFDTDTYRLIRRSLE KPIYKEKLRVRSYHTAGSEDSVFVELKKKYKSVVYKRRLVLPRDKAMDSFLNGKPFPENS QIGKEIQYFWKYYGSLRPTVFLSYEREAYYSLDGSDFRVTFDENILYREEDLSLGSPIYG KALLGEDETLMEIKTSGGIPLWLSRVLTENKIFKTSFSKYGLAYKQMGGRQYG >gi|330403359|gb|ADLB01000019.1| GENE 13 14602 - 15831 983 409 aa, chain - ## HITS:1 COG:SPy0875 KEGG:ns NR:ns ## COG: SPy0875 COG0642 # Protein_GI_number: 15674900 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: 
Streptococcus pyogenes M1 GAS # 1 396 1 402 410 205 31.0 1e-52 MIKKLRIKLIAASMISLLSVLLIIEGIAGVLNYRNIVSEADRILTLLEDNGGRFPRMEPN GNMGDLGQRSPEIPYESRYFSVLLDDSGNVVMADTGKIARIDTSDAIDYAKEVWANGKER GFIGEYRYAVSSEESDVRIVFLDCGRSLNTFYNFIFTAVGVSVIGLLAVLILMIFLSAHI VKPFLENYEKQKRFVTDAGHEIKTPLTIIDADAEVLKMDFGENEWVNDIQNQTKRMAELT NSLILLSRMEEERTKQLMTEFSLSDVVEETVGTFQTLAKTQDKLLESKILPMISLKGDEM AIRRLTAILLDNAVKYADDNGKIMVTLEKKKKRIYFTVFNTTEEISKKQTEHLFDRFYRT DTSRNSETGGYGLGLSIAAATVEAHKGKIQASTEDEKSLAVTVTFPLAT >gi|330403359|gb|ADLB01000019.1| GENE 14 15828 - 16502 644 224 aa, chain - ## HITS:1 COG:MT1009 KEGG:ns NR:ns ## COG: MT1009 COG0745 # Protein_GI_number: 15840406 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Mycobacterium tuberculosis CDC1551 # 1 222 3 227 230 169 37.0 3e-42 MRLLLVEDEKPLSRALKAILERNHYSVDAVYDGRAALDYLEADNYDGVILDIMIPKIDGI TVLKTIRSCGNSIPVLLLTAKSEVEDKVLGLDSGANDYLSKPFHAEELLARIRAMTRTQT AQNDSKLQIGNIILDRATFELSTPSGSYNLANKEFQMLELMMLNPNHLISAEKFIENIWG YDSEAEMNVVWVYISYLRKKLAALHANVRIKVTRNVGYSLEEIK >gi|330403359|gb|ADLB01000019.1| GENE 15 16631 - 18475 172 614 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169795303|ref|YP_001713096.1| ABC transporter ATP-binding protein [Acinetobacter baumannii AYE] # 396 609 28 232 311 70 26 3e-11 MSNHQPKQRGPMGQGHGGMGTGEKAKDFKGTIKKLMAYLGNYKIAIGFVIIFAIGSTIFT IVGPKILGDATTEIFKGLVGKVSGGKGIDFDKIARILLTVLTLYVVSALFAYIQGFIMTG VSQKLTYRLRKEISEKINRMPMNYFDTKTHGEVLSRVTNDVDTLSQSLNQSATQVITSFT TIVGVLIMMLRISPLMTLVALLILPVSMTLISFIVKRSQKHFKNQQEYLGHVNGQVEEVY SGQNIVKAFNKEEDVIREFDETNESLYQSAWKSQFFSGMMMPIMQFVGNLGYVAVAILGG YLTIKDKIAVGDIQSFIQYVRSFTQPITQVAQVANLLQSTAAASERVFEFLEEEEEDQFA ENPVSVEGLEGNVEFKNVHFGYNKDRIIINDFSAKVKQGQKIAIVGPTGAGKTTMIKLLM RFYDVNSGKILVDGHNLKDFNRSELRQMFGMVLQDTWLFHGSIKDNIRYGKLDATDEEVI EAAKAAHAHRFIQTLPNGYDMELNEEASNVSQGQKQLLTIARAILADPKILILDEATSSV DTRTEVRIQKAMDNLMKGRTSFIIAHRLSTIRDADLILVMKDGDIVEQGNHEELLLRNGF YAELYNSQFERTSA 
>gi|330403359|gb|ADLB01000019.1| GENE 16 18468 - 20636 212 722 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 469 692 132 354 398 86 28 6e-16 MRKILKFLKPHAVSVVAIVAVLILQAYCDLSLPSYTSDIVNVGIQQGGIDEHIPEKISKA DMENLFLFMAKEDIETVKKAYGTDKDTYEKEAYVLKETENTEELEKILNRPMLLNTEMAA GESEKKMNPEELKMLPEQQRMAILEQMNEKLKDMPDTMLEQASTLYIKAAYERLGVDVDK IQTTYILTTGGKMLALALLGMLVSVMVGLLASRIAASTGRDLRSGIFKKVVGFSSGEFDK FSTASLITRSTNDIQQIQLVIVMLLRMVLYAPIIGIGGVIKVLNTNVSMSWIIALAVVLI SILVFTLFIVAMPKFKMLQNLVDKLNLVTREILTGLPVIRAFSTEKHEEERFDKANRDLT RTNLFVNRAMTLMMPTMMLIMNGVSVLIVWVGANGINDGQMQVGDMMAFIQYTMQIIMAF LMICMISIMLPRASVSASRVDEILTSETMIHDAKNPKKIEDKKDGSVEFDHVSFQYPNAE EDVLHDISFVAEAGKTTAFIGSTGSGKSTLVNLIPRFYDVTGGTIRINGVDIRDVSQHDL REKLGYVSQKGILFSGTIASNIMYGNTKGTEEEMKEAARIAQAEEFIDAKPEQFESPISQ GGANVSGGQKQRLSIARAIAKNPDIYIFDDSFSALDYKTDVTLRKALQEHTSNSVVLIVA QRISTILHAEQIIVLDEGKIAGKGTHKELLKNCEVYRQIAMSQLSEEELRRDMEGKEEKD NE >gi|330403359|gb|ADLB01000019.1| GENE 17 20650 - 21072 572 140 aa, chain - ## HITS:1 COG:L114370 KEGG:ns NR:ns ## COG: L114370 COG1846 # Protein_GI_number: 15672690 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Lactococcus lactis # 32 135 42 145 291 70 36.0 8e-13 MEQCTQLLMNQVMHLYIQRSMQLFRDLKVHPGAGGMLWALRKNNGMAQKEIAKKLGITPP SMTVMIKKLEGEGYIVKKQDEKDQRMTRIFITEKGKEIAHHMESVLKTLEQEAFANMSEE EIMLLRRLLIQMKENLLDKN >gi|330403359|gb|ADLB01000019.1| GENE 18 21276 - 22196 715 306 aa, chain + ## HITS:1 COG:BH1953 KEGG:ns NR:ns ## COG: BH1953 COG1597 # Protein_GI_number: 15614516 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus halodurans # 1 285 1 278 295 155 34.0 1e-37 MYTFIVNPHSRSGEGQKTWEEIHLLLEKNHIPFEVYFTRYQKHATTIMKSLTSDSVTRTV IVLGGDGTINEVINGIEDFKHLTLGYIPTGSGNDFARGYSLPSNPKEVLDYILHSPHIHY MDIGEVSYKNKKRRFAVSCGIGYDAGICHEAVVSKIKPVLNRVHLGKLTYALIAVRQWLF 
LTPYECTITLNHSETHHFSHCYFVSAMNHRYEGGGVQFAPHADPEDRLLSICAVADVPKW RLLLVLAAALFGKHTHLKGSHLYNCETISIQYGGAMPVHTDGEPVFLQKDISVSCLSERL RIITAR >gi|330403359|gb|ADLB01000019.1| GENE 19 22208 - 23008 300 266 aa, chain + ## HITS:1 COG:CC2157 KEGG:ns NR:ns ## COG: CC2157 COG0584 # Protein_GI_number: 16126396 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Caulobacter vibrioides # 36 264 25 253 257 119 32.0 9e-27 MPVLILLLIFLLFLYMIAPRFRFRKNLKPFIRYDYAHRGYWNIHKAIPENSLSSFRLAVS HGFGIELDLHLTKDNKIVVFHDDFLTRMCGENLSVENTTYEELSRHTLAGTSEKIPLFSE VLALVNGQTPLLIELKLPTSDTTLCSLVCRELQNYNGAYLIQSFNSLGLLWFRKHAPHVI RGQLSSNLTKTEPDIPFIACFFVKYLLSNLITRPDFISYKLHDEKNISLFLHKYLFGTPI AVWTLRTEEDFTYAKSKYSIMIFEKS >gi|330403359|gb|ADLB01000019.1| GENE 20 23107 - 23763 379 218 aa, chain + ## HITS:1 COG:CAC1489 KEGG:ns NR:ns ## COG: CAC1489 COG0671 # Protein_GI_number: 15894768 # Func_class: I Lipid transport and metabolism # Function: Membrane-associated phospholipid phosphatase # Organism: Clostridium acetobutylicum # 36 203 42 204 219 71 34.0 1e-12 MWKKLWEKYRHGWVFLYIFIYLPWFFYLEKHITTDYHLIHASLDDKIPFIEYFIIPYTLW FAFIAVTIGYFFFFGEKSEFYRLIILLFSGMTIFLIVSTVYPNGLQLRPETFARDNIFVD MVRQLYAVDTSTNVLPSIHVYNSLGVYIAISHCSKLKQYRWVQILTLLLTISIILSTMFL KQHSVIDVVAGFIMAGVLYLIVYKPVKVRGKNSLENLA >gi|330403359|gb|ADLB01000019.1| GENE 21 23835 - 24080 155 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLTLSFALLVGIILALILLIFIDEESSIKVPFLHRKVSIVDMSTKTAFISAFIFSMVKHI KDWGTVTISNICRLTNLHKEV >gi|330403359|gb|ADLB01000019.1| GENE 22 24144 - 25295 1421 383 aa, chain - ## HITS:1 COG:ECs5298 KEGG:ns NR:ns ## COG: ECs5298 COG1775 # Protein_GI_number: 15834552 # Func_class: E Amino acid transport and metabolism # Function: Benzoyl-CoA reductase/2-hydroxyglutaryl-CoA dehydratase subunit, BcrC/BadD/HgdB # Organism: Escherichia coli O157:H7 # 1 383 8 390 390 505 60.0 1e-143 MELIKDLPEVFEEFAEQRQKSFLAVKELKDKGVPIVGVYCTYFPQEIVMAMGAAAVSLCS TSDETIPDAEKDLPKNLCPLIKSSYGFAKTDKCPFFYFSDLVVGETTCDGKKKMYEYMAE 
FKPVHIMELPNTQSENALELWKKEIIRFKEVLEERFQVEITEDKIREAIRDNNKARLSLR RLYETMRHDPVAMYGHDLFKVLYGSTFKFDRKEIPSEVDPLVEKIEAEYAKGQTIPKKPR ILITGCPIGGATEKVIRAIEDNGGVVVTYENCSGAKSIDKLVDEENPDVYDALARRYLAI GCSVMTPNPNRLELLGRLIDEYKVDGVVEVVLQACHTYNVETLGIRKFVNEKKHIPYISV ETDYSQADIGQLNTRLTAFIEML >gi|330403359|gb|ADLB01000019.1| GENE 23 25386 - 27146 532 586 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|46129221|ref|ZP_00155777.2| COG1194: A/G-specific DNA glycosylase [Haemophilus influenzae R2846] # 238 576 15 350 378 209 34 5e-53 MKYERIKVGTFVERPNRFIAYVNIAGRKETVHVKNTGRCRELLIPGATVYVQESDNPNRK TKWDLIGVKKGNRMINMDSQITNAVVKEWIEKGNLFANPSLIKPETTYGQSRFDLYVEEG ERKAFIEVKGVTLEEDGIVRFPDAPTERGVKHVEELCHAVSEGYEAYVIFVIQMDKVSYF TPNERTHKAFGDALRKAKKAGVHILAYDCKVKKDSISLYKEVPVILEQEQLKEIVQPIVS WYRENKRQLAWRENVSAYRVWVSEIMLQQTRVEAVKPFYDRFLKELPTVKDLAEAEEDKL LKLWEGLGYYNRVRNMQKAAVQVMEEFHGEFPKTYEEVLSLSGIGNYTAGAICSFAYGIP KPAVDGNVLRVISRVIASEEDIMKPAVRTKIEYMLDGVIPKDSASDFNQGLIELGALICT PKGMAKCEKCPLGSVCQAKKENKVEELPVKAKTKERRIENRTVFVFRDGENIAIRKRKSK GLLAGLYEFPNEEGHFTEDEVISYSKTKGLYPMRIKKLGEAKHVFSHVEWHMTGYAVHVD ELEKSCEKDMLFIHPEEIQKEYPIPSAFEYYTNQVNIKLGIEKNYK >gi|330403359|gb|ADLB01000019.1| GENE 24 27147 - 27497 487 116 aa, chain - ## HITS:1 COG:BS_yusI KEGG:ns NR:ns ## COG: BS_yusI COG1393 # Protein_GI_number: 16080333 # Func_class: P Inorganic ion transport and metabolism # Function: Arsenate reductase and related proteins, glutaredoxin family # Organism: Bacillus subtilis # 1 113 3 115 118 125 64.0 2e-29 MLFIEYPKCTTCKKAKKWLEENGVEFTDRHIVEDNPSKEELKEWYEKSGLPLKRFFNTSG MKYKEMKLKDKLPEMSEEEQLELLATDGMLVKRPLLIGETFAIPGFKEQAWEEVCK >gi|330403359|gb|ADLB01000019.1| GENE 25 27512 - 27907 190 131 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2353 NR:ns ## KEGG: EUBREC_2353 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 131 1 131 131 119 46.0 4e-26 MKEKIQRFMMGRYGSDTLNKFLFGVTLVLLITSILFDSNFLNLLALLLLVVCYVRMLSRD VQKRYNENVKFLNMKKDFFGFFRREKSYFEQRKTHHIYKCPTCKQKIRVPKGKGRICITC PSCKTEFIKNS 
>gi|330403359|gb|ADLB01000019.1| GENE 26 28013 - 28735 603 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 239 1 242 245 236 48 3e-61 MIITKNLQKSFHGNQILKGIDIHIEQGEKVVIVGPSGSGKSTFLRCLNLLETTTGGEVWF EGNNITGKECNINKLRQKMGMVFQQFNLFPHLSVKENITLAPIKLGLMSKEEADKKAEEL LVRVGLPDKADQYPKQLSGGQKQRIAIARALAMNPEVMLFDEPTSALDPEMVGEVLTLMK ELADEGMTMVVVTHEMGFAREVATRVIFMDEGQIKEENAPEEFFSNPKDERLKEFLSKVL >gi|330403359|gb|ADLB01000019.1| GENE 27 28732 - 29421 809 229 aa, chain - ## HITS:1 COG:FN0802 KEGG:ns NR:ns ## COG: FN0802 COG0765 # Protein_GI_number: 19704137 # Func_class: E Amino acid transport and metabolism # Function: ABC-type amino acid transport system, permease component # Organism: Fusobacterium nucleatum # 7 227 3 232 236 160 47.0 2e-39 MDLIQNFIDGFYNAVIYDQRYLVYLDGLKVTILISLFAIAIGVAIGLLIAVIKVSASEKK VLVGLKWLCNVYITVIRGTPVMVQLLIIYNLVFTARDTNEIVVGAICFGINSGAYVAEII RAGIESIDKGQMEAGRSLGFNYIQTMRYIIVPQAIKNILPALSNEFIVLIKETSVAGIIA VTDLTKAAQYIGSTDYNILPPLYVAAVCYLIIVLGLTKLQGLLERRLAR >gi|330403359|gb|ADLB01000019.1| GENE 28 29442 - 30203 1057 253 aa, chain - ## HITS:1 COG:TM0593 KEGG:ns NR:ns ## COG: TM0593 COG0834 # Protein_GI_number: 15643359 # Func_class: E Amino acid transport and metabolism; T Signal transduction mechanisms # Function: ABC-type amino acid transport/signal transduction systems, periplasmic component/domain # Organism: Thermotoga maritima # 3 237 1 231 246 132 36.0 6e-31 MKMKKIISLALAGVVALSITACGEKKSESDTLIVGTEAGFAPYEYMKGKEVVGIDMDIAK AIADDMGKELKIKNMEFDGALASVASGKVDFVAAGVSINEKRKKTMDFSHEYVDSTEVIV VNGENPAISGKVSGESLNGKVVAVQQGNIADSWVSNEENCKPKEVKRYTKFAQASEDLKN NKVDCIVMDKYPAEKLISENKELKILDGVLFEDKYAIAVKKGNKELLDKINKVIDKLIAD GKIEEFTANHTQK >gi|330403359|gb|ADLB01000019.1| GENE 29 30321 - 31154 1081 277 aa, chain - ## HITS:1 COG:SPy0280 KEGG:ns NR:ns ## COG: SPy0280 COG1968 # Protein_GI_number: 15674457 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: 
Streptococcus pyogenes M1 GAS # 3 271 2 270 279 286 56.0 3e-77 MEIFFEMLKTVFLGIVEGITEWLPISSTGHLILVNQIVQLDASKQFMEMFNVVIQLGAIM AVVILYFHKLNPFSKRKSPKQKTQTLQLWIKVAIACLPAMIIGIPLDDWMDEHLHNSVVV AAMLIIYGILFIVVENRNKNKTPSVTKFSQLSYQMALVIGVWQVLALVPGTSRSGATIVG ALLLGTSRYIAAEFTFFLAIPVMVGASGLKILKFMLSGIGITGLEVAILIVGCLTAFIVS VLAIKFLMGYIKKNDFKVFGVYRIVLGIIVLGTFFMM >gi|330403359|gb|ADLB01000019.1| GENE 30 31171 - 32001 938 276 aa, chain - ## HITS:1 COG:lin0298 KEGG:ns NR:ns ## COG: lin0298 COG0561 # Protein_GI_number: 16799375 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 6 275 3 268 270 124 31.0 1e-28 MGKQIKMIGLDCDGTLLNNNKELTDHSREVLCRAIERGIVIVVATGRPLTGIPKEVLDID GIRYALTSNGARIVDIKENRIISKCTMSTEKTKELLQLFSAYDTYKEIFIDGQGYTNRSD LENIENYAESESMARYARSCRKPVENMEELLEDKEVRVDKVHVMFSGLKEREQAFEEVKT VQGVTATSAISNNIEVNMFGVNKGEGLLRLGTLLGIERDEIMACGDGMNDFEMLKAVGFG VAMENAVEKVKQAADYVTDTNENDGVAKAIEKFALA >gi|330403359|gb|ADLB01000019.1| GENE 31 31985 - 32572 545 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167759822|ref|ZP_02431949.1| ## NR: gi|167759822|ref|ZP_02431949.1| hypothetical protein CLOSCI_02185 [Clostridium scindens ATCC 35704] # 1 191 1 191 193 247 59.0 2e-64 MRKYEEEHYRLDKIKDKIYPWVKGELIDCQVINGKEISAKDTPLITFVGDLHIVLVINRG NDTYEIIRDKMLPDDCNIEELYHIACANLVRDVKFVISQTLYGGFGIVADGFHEASALCF RQIWTMCAEKLKDDLLIIAPAKDTVVFLPMKEKEKLSYMKEYARQAYERNKDKISLQIFR FVRKRKELTVYGETD >gi|330403359|gb|ADLB01000019.1| GENE 32 32715 - 33773 1042 352 aa, chain + ## HITS:1 COG:CAC1232_2 KEGG:ns NR:ns ## COG: CAC1232_2 COG3773 # Protein_GI_number: 15894515 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Clostridium acetobutylicum # 240 302 2 68 120 61 41.0 3e-09 MNIIQKRRKKVFALLTAATLCFSSPFTYVLHAENTTQKQQDLENNLNELNKDKDATGKNL SAATSSLSKTMKEISTTQQELALAKVRENGQYELMKKRIKYIYESGSASLLSMLFESESM ADLLNKAEFISTVSEYDRTMLEKLQTLHSSIAKKEKELRQQQKNFEKQKKDLEEKYEQLS 
SKITDTSNELNDYKKQLAEAEALAKATQSLLDKKPAEEPSKPDSKPEEKPDNKPAPPSAA EDLALFAGILECEAGSSDYNALLAVATVIMNRVESPKYPNTLSGVIYQSGQFSPTWTGKL DKVLERGPKKLCYQVAQDALAGARLDSVRNCYSFRAATSGHDGIIVGGNVFF >gi|330403359|gb|ADLB01000019.1| GENE 33 33832 - 34587 921 251 aa, chain - ## HITS:1 COG:no KEGG:Emin_0754 NR:ns ## KEGG: Emin_0754 # Name: not_defined # Def: aminoglycoside phosphotransferase # Organism: E.minutum # Pathway: not_defined # 1 251 1 249 249 286 60.0 4e-76 MKLDKSNLIVKRAYKEVYKCDEGIVKVFEKTHLKSDVFNEALNTARVEETGLDIPKVKSV SEVDGKWAVVIEYKEGKTLAEMMQVDSKNLEKYMEDFVDLQLQVHSKKAPLLNKLKDKLA RQINSLKELDATARYELLTRLESMPKHDKVCHGDFNPSNVIVGKNGKMTVVDWAHATQGN ASADAAMTYLLFALKDQETADLYLKLFCKKSDTAKQYVQQWLPIVAAAQLTKNNEMEKEF LMKWIDVFDYQ >gi|330403359|gb|ADLB01000019.1| GENE 34 35138 - 37750 3027 870 aa, chain - ## HITS:1 COG:CAP0035_1 KEGG:ns NR:ns ## COG: CAP0035_1 COG1012 # Protein_GI_number: 15004739 # Func_class: C Energy production and conversion # Function: NAD-dependent aldehyde dehydrogenases # Organism: Clostridium acetobutylicum # 18 456 9 448 448 597 67.0 1e-170 MTKKKEIVPEIIDSVEALTAKMEAMREAQKVFATFTQEQVDKIFFEAALAANKQRIPLAK MAVEETGMGIVEDKVIKNHYAAEYIYNAYKDTKTCGVIEEDKAYGIKKVAEPIGLIAAVI PTTNPTSTAIFKTLIALKTRNAIIISPHPRAKASTIAAAKVVLDAAVKAGAPEGIIGWID VPSLELTNEVMKGADTILATGGPGMVKAAYSSGKPALGVGAGNTPVIIDDTADVKMAVNS IIHSKTFDNGMICASEQSVTVLEGVYKEVKEEFALRGCYFLKKDEIEKVRKTIIINGALN AKIVGQSAHTIAALAGVKVPEETKILIGEVESVDISEEFAHEKLSPVLAMYKAKTFDEAI AKAEQLVADGGYGHTSSIYIHPAQTEKLAKHAAAMKTCRILVNTPSSHGGIGDLYNFKLA PSLTLGCGSWGGNSVSENVGVRHLINIKTVAERRENMLWFRAPEKVYFKKGCMPVALDEL GTVMGKKKAFIVTDSFLYHNGYVKGIEEKLDEMGIQHTCFYEVAPDPTLACAKAGAEAMR AFGPDVIIALGGGSAMDAGKIMWVMYEHPEADFMDMAMRFMDIRKRVYTFPKMGEKAYFV AIPTSSGTGSEVTPFAVITDEKTGVKYPLADYELLPKMAIIDADNMMSQPKGLTSASGID ALTHALEAYASIMATDYTDGLALKAMKNIFEYLPSAYENGANDPKAREKMADASCMAGMA FANAFLGVCHSMAHKLGAFHHLPHGVANALLINEVMKFNATDVPTKMGTFSQYQYPHALE RYAECARFCGVVGKDDKETFEKFLVKIDELKEKVGIKKTIKEYGVDEKYFLDTLDEMVEQ AFDDQCTGANPRYPLMKEIKEMYLRVYYGK >gi|330403359|gb|ADLB01000019.1| GENE 35 38011 - 38283 356 90 aa, 
chain + ## HITS:1 COG:TP0072 KEGG:ns NR:ns ## COG: TP0072 COG4545 # Protein_GI_number: 15639066 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutaredoxin-related protein # Organism: Treponema pallidum # 3 76 5 78 90 66 39.0 1e-11 MKLIIYGTKTCKDCVDALEVLEQKNVRYLFLEFSDSIGNLKRFLKIRDTNPMFNPVKEKG GIGVPLFVFEDGTMTFELDDVLARIDAASE >gi|330403359|gb|ADLB01000019.1| GENE 36 38314 - 40062 1397 582 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_3134 NR:ns ## KEGG: GYMC10_3134 # Name: not_defined # Def: peptidase S41 # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 2 581 5 581 583 371 36.0 1e-101 MKKKKVIFLGVIGVSVVILCIGLILSQDKNKLERDKDKEFASGTDIKIEEISSEQNNSLY KLCKVWGYIKYRHPDIVNGKINWDAELFRVMPEILKSENQEKTNKIVYDWLKRFPFEEKS DKKAEKYLTYPQKDNVGEPESKWIYDEQFLDKDVGNYLQKLSKTFIGERKNAYASFYDNQ PLTFFTNEKNYPLREDDDGMKLLSLFRFWNIYEYYSPNVQIARKDWDEVLKEEIHRMVQT KTYREYALVIAEMAAQTGDPHLEVIDNEDILINFYGKYYLQCEFMYVENKVIINQVGKDE KNLKVGDILLSIDEIELEKRIKELKKYKAIPEEDKFVAKIKYQLMQTEGKTSEVIVLRDG EEKKLKVNTSEEPYMYENPQKNGFIEEEKIGYIDPSDLKEGDLEKLMEKFADIEGIIVDL RYYPSSVIAYSMAEYINPEMEEVLQFVFPNPAVPGRYYAIPMSVGKGMMKDKEYPQYSGK VILLMDERSVSQSETTIMSLRQSPNAVVIGSESRGANGDLVEVKLPGDLRINMSGLGVYT AEGEQTQRVGLKPDIEIFPTIQGIKEGRDELLEKGIEEILNN >gi|330403359|gb|ADLB01000019.1| GENE 37 40143 - 40670 736 175 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145848|ref|ZP_04744449.1| LSU ribosomal protein L17P [Roseburia intestinalis L1-82] # 1 175 1 178 178 288 84 1e-76 MAKYRKLGRTSSQRKALIRNQVTALLNNGKIVTTEAKAKEIRKVAEGLIALAVKEKDNFE EVKVMAKVARKDQNGKRVKEVVDGKKVTVYDEVEKTIKKDMPSRLHARREMLKVLYPVVE VKDGKKRSAKEVDLVAKLFDEIAPKYANRNGGYTRIVKIGLRKGDAAMEVLLELV >gi|330403359|gb|ADLB01000019.1| GENE 38 40833 - 41792 1213 319 aa, chain - ## HITS:1 COG:BH0162 KEGG:ns NR:ns ## COG: BH0162 COG0202 # Protein_GI_number: 15612725 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus halodurans # 1 318 1 314 314 382 67.0 1e-106 
MFDFNKPNIEITEISEDKKYGRFVVEPLERGYGTTLGNSLRRIMLSSLPGSAISYVKIDG VLHEFSSIPGVKEDVTEIVMNLKSLAIKNTSETDEPKTAYIEFEGEGVVTGADIQVDSDI EIMNPETVIATLNGGADSKLYMELTITNGRGYVSADKNKSSELPIAVIPIDSIYTPVERV NLTVENTRVGQITDFDKLTLDVYTNGTLAPDEAVSLAAKVLNEHLKLFIDLSEVAQAAEV MIEKEDDEKEKVLEMSIDELELSVRSYNCLKRAGINTVEELTNRTSEDMMKVRNLGRKSL EEVLAKLKELGLELSQGEE >gi|330403359|gb|ADLB01000019.1| GENE 39 41856 - 42449 858 197 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145850|ref|ZP_04744451.1| 30S ribosomal protein S4 [Roseburia intestinalis L1-82] # 1 197 1 197 197 335 84 8e-91 MAVNRVPVLKRCRSLGMDPIYLGIDKKSNRQLKRANRKMSEYGLQLREKQKAKFIYGVLE KPFRNYFAKASKMNGMTGDNLMILLESRLDNVVFRMGLARTRREARQIVDHKHVLVNGKQ VNIPSYLVKAGDTIEIKEKHKGSQRYKDILEVTGGRLVPEWLDVDAENLKGSVKELPLRE AIDVPVDEMLIVELYSK >gi|330403359|gb|ADLB01000019.1| GENE 40 42467 - 42862 630 131 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623373|ref|ZP_04666404.1| ribosomal protein S11 [Clostridiales bacterium 1_7_47_FAA] # 1 131 1 133 133 247 93 2e-64 MAKKVTKKVTKRRVKKNVERGQAHIQSSFNNTIVTLTDAQGNALSWASAGGLGFRGSRKS TPYAAQMAAETAAKAALVHGLKTVDVFVKGPGSGREAAIRALQACGIDVTSIKDVTPVPH NGCRPPKRRRV >gi|330403359|gb|ADLB01000019.1| GENE 41 42962 - 43330 542 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922854|ref|YP_002936367.1| ribosomal protein S13p/S18e [Eubacterium rectale ATCC 33656] # 1 122 1 122 122 213 88 3e-54 MARIAGVDLPRDKRVEIGLTYIYGIGRVSANNILEKAGVNPDTRCRDLTDEEVKVISAVI DENYQVEGDLRREIALNIKRLQEIGCYRGIRHRRGLPVRGQKTKTNARTRKGPKRTVANK KK >gi|330403359|gb|ADLB01000019.1| GENE 42 43492 - 43596 57 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTPGIISNIHRRMISLYINSIYTPPGVSILQPPN >gi|330403359|gb|ADLB01000019.1| GENE 43 43632 - 43745 189 37 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881761|ref|YP_001560729.1| ribosomal protein L36 [Clostridium phytofermentans ISDg] # 1 37 1 37 37 77 94 3e-13 MKVRSSVKPICEKCKIIKRKGSIRVICENPKHKQRQG >gi|330403359|gb|ADLB01000019.1| GENE 44 43864 - 44082 262 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation 
initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 105 66 1e-21 MSKADVIEIEGTVVEKLPNAMFQVELENGHQVLAHISGKLRMNFIKILPGDKVTLELSPY DLSKGRIIWRDK >gi|330403359|gb|ADLB01000019.1| GENE 45 44084 - 44335 337 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167759839|ref|ZP_02431966.1| ## NR: gi|167759839|ref|ZP_02431966.1| hypothetical protein CLOSCI_02202 [Clostridium scindens ATCC 35704] # 1 82 1 81 83 79 58.0 8e-14 MERFEVGMLAKSEAGHDKDNVYVIIAMDDSYVYLADGRVKTVEKPKKKKYKHIQIIRKIN EDIPVTDNVAIKRILKLYNREEA >gi|330403359|gb|ADLB01000019.1| GENE 46 44347 - 45102 826 251 aa, chain - ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 6 248 5 246 248 258 52.0 6e-69 MPVSIKSAREIELMTEAGRILAIVHEEIEKALRPGMTTLEIDKLGEEVIRSYGCIPSFLD YNGYPASICVSVNDEVVHGIPSGKRILKEGDIVGIDAGVIYKGYHSDAARTHGIGEISKE AEELIKVTRESFFEGIKYAKEGHYLFEISAAIGRYAQERGFGVVRDLCGHGIGVNLHEAP EIPNYEMNRKGMKLRAGMTLAIEPMINAGGYQVDWLDDDWTVVTRDGSLSAHYENTVLIT ENEPKLLTLLK >gi|330403359|gb|ADLB01000019.1| GENE 47 45107 - 45751 894 214 aa, chain - ## HITS:1 COG:CAC3112 KEGG:ns NR:ns ## COG: CAC3112 COG0563 # Protein_GI_number: 15896362 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Clostridium acetobutylicum # 1 214 1 214 215 268 61.0 5e-72 MKIIMLGAPGAGKGTQAKKIAEKYGIPHISTGDIFRANIKNGTELGKKAKTYMDEGLLVP DELVVDLVVDRVQQEDCKNGYVLDGFPRTIPQAEALDKALAELGEKMDYAINVEVPDENI VNRMSGRRACVGCGATYHIVHAPTKVENICDTCGGELILRDDDKPETVLKRLGVYHEQTQ PLIQYYTDKDILVEVDGTVDLEDVFKAIVNILGE >gi|330403359|gb|ADLB01000019.1| GENE 48 45858 - 47174 802 438 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 13 436 22 437 447 313 39 2e-84 MLKTFRRAFQIDDVRKRIFYTFLMLIVVRIGSILPTPGVDPTYIQNFFKSQTGEAFNFFS 
AFTGGSLERMSIFALSITPYITSSIIMQLLTIAIPKLEEMQKDGEDGRKKIASITRYVTV ILALVESTAMAVGFGRSGLLVKYNFVNVAVAVLTLTAGSAFLMWVGERITEKGIGNGISI VLLINIISRMPSDFMTLYTKFMKGKSLAKAGLAGLIILAILLVVVIFVIVLQDGQRKISV QYSQKVQGRKSVGGQTTFIPLKVNTAGVVPVIFASSLMQTPVVIAGFLGKGNGSGIGSEI LKGLSSNNWCNPENLKYSWGLVLYIVLTILFAYFYTSITFNPLEIANNMKKSGGFIPGIR PGKPTVEYLTKILNYIIFIGAVGLVIVQVIPFVFNGWLGANVSFAGTSLIIIVGVVLETL KQIESQMLVRNYKGFLNN >gi|330403359|gb|ADLB01000019.1| GENE 49 47175 - 47615 654 146 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922848|ref|YP_002936361.1| 50S ribosomal protein L15 [Eubacterium rectale ATCC 33656] # 1 146 1 146 146 256 87 4e-67 MNLSNLRPADGSKHSDNFRRGRGHGSGNGKTAGKGHKGQKARSGGTRPGFEGGQMPLYRR IPKRGFTNRNTKEIIGINVDALEAFDNDTVVSVETLIEAGIVKNPRDGVKILGNGELTKK LTVQANAFSASAVEKIEALGGKAEVI >gi|330403359|gb|ADLB01000019.1| GENE 50 47639 - 47818 214 59 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168334334|ref|ZP_02692521.1| 50S ribosomal protein L30 -related protein [Epulopiscium sp. 'N.t. morphotype B'] # 1 59 1 59 61 87 74 4e-16 MANLKITLVKSTIGAVPKHKKTVEALGLKKLNKTVELPDNAATRGMVQQVRHLVKVEEV >gi|330403359|gb|ADLB01000019.1| GENE 51 47833 - 48342 716 169 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145861|ref|ZP_04744462.1| ribosomal protein S5 [Roseburia intestinalis L1-82] # 1 169 1 169 169 280 85 2e-74 MKQNRIDASQLELTEKVVSIKRVTKVVKGGRNFRFTALVVVGDGNGHVGAGLGKAAEIPE AIRKGKEDAMKKMVSVALDEHNSVTHDLIGKFGSASVLLKKAPEGTGVIAGGPARAVIEL AGIKNIRTKSLGSNNKQNVVLATINGLSQIKTPEEVAKLRGKSVEEILG >gi|330403359|gb|ADLB01000019.1| GENE 52 48357 - 48725 528 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916281|ref|YP_002929798.1| large subunit ribosomal protein L18 [Eubacterium eligens ATCC 27750] # 1 122 1 122 122 207 86 1e-52 MVSKKSRSEVRVNKHRKLRNRFSGTAERPRLAVFRSNNHMYAQIIDDTVGNTLVSASTLQ KEVKAELEKTNNVDAAAYLGTVIAKRAIEKGINTVVFDRGGFIYQGKIKALADAAREAGL EF >gi|330403359|gb|ADLB01000019.1| GENE 53 48743 - 49285 791 180 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881770|ref|YP_001560738.1| ribosomal protein L6 [Clostridium phytofermentans 
ISDg] # 1 180 1 180 180 309 84 5e-83 MSRIGRLPVAIPAGVTVEIAENNKVTVTGPKGTLVRELPVEMEIKKEGEEVVVTRPSDLK RMKALHGLTRTLINNMVVGVTQGYEKVLEVNGVGYRAAKSGNKLTLNLGYSHPVEMTDPE GVETVLEGQNKITVKGIDKEKVGQYAAEIRDKRRPEPYKGKGIKYADEVIRRKVGKTGKK >gi|330403359|gb|ADLB01000019.1| GENE 54 49381 - 49530 152 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAELSARSFFEISEVLSNLAKRTSLDNQCNIIATRFERTSPSGRDVLSY >gi|330403359|gb|ADLB01000019.1| GENE 55 49565 - 49966 610 133 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145864|ref|ZP_04744465.1| ribosomal protein S8 [Roseburia intestinalis L1-82] # 1 133 1 133 133 239 88 5e-62 MTMSDPIADMLTRIRNANTAKHDTVDVPASKMKIAIADILFNEGYIAKYDIVEDGSFKTI HITLKYGADKNEKVISGLKRISKPGLRVYANREELPRVLGGLGTAIISTNQGVITDKEAR KLNVGGEVLAFVW >gi|330403359|gb|ADLB01000019.1| GENE 56 50085 - 50270 307 61 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|158319556|ref|YP_001512063.1| ribosomal protein S14 [Alkaliphilus oremlandii OhILAs] # 1 61 1 61 61 122 86 6e-27 MAKTSMKIKQQRKQKFSTREYNRCKICGRPHAYLRKYGICRVCFRELAYKGQIPGVKKAS W >gi|330403359|gb|ADLB01000019.1| GENE 57 50288 - 50827 818 179 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145866|ref|ZP_04744467.1| 50S ribosomal protein L5 [Roseburia intestinalis L1-82] # 1 179 1 179 179 319 87 3e-86 MSRLKETYQNEIVDAMIKKFGYKNIMEVPKLDKVVINMGVGEAKDNHKVLESAVADLEKI AGQKAVLTKAKKSVANFKLREGMAIGCKVTLRGEKMYEFVDRLINLALPRVRDFRGVNPN AFDGRGNYALGIKEQLIFPEIEYDKIDKVRGMDIIFVTTAKTDEEARELLTQFNMPFTK >gi|330403359|gb|ADLB01000019.1| GENE 58 50851 - 51162 471 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145867|ref|ZP_04744468.1| 50S ribosomal protein L24 [Roseburia intestinalis L1-82] # 1 103 1 103 103 186 87 6e-46 MSTMKIKKGDTVKVIAGKDKDKEGKVIAVNRKDNTVLVEGVNMLTKHTKPSATNQNGGII HQEGPIDASNVMYLHKGTATRVGFKMDGDKKVRFAKSTGEVID >gi|330403359|gb|ADLB01000019.1| GENE 59 51175 - 51543 569 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881775|ref|YP_001560743.1| ribosomal protein L14 [Clostridium phytofermentans ISDg] # 1 122 1 122 122 223 93 3e-57 
MIQQESRLKVADNTGAKELLCIRVLGGSTRRYASIGDVIVASVKDATPGGVVKKGDVVKA VVVRTVNKTRRKDGSYIRFDENAAVIIKDDKNPKGTRIFGPVARELREKQFMKIVSLAPE VL >gi|330403359|gb|ADLB01000019.1| GENE 60 51565 - 51819 379 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881776|ref|YP_001560744.1| ribosomal protein S17 [Clostridium phytofermentans ISDg] # 1 84 2 85 85 150 86 3e-35 MERNLRKTRVGKVVSNKMDKTIVVAVEDHVKHPLYNKIVKRTYKLKAHDEANECNIGDKV KVMETRPLSKDKRWRLVEVMEKVK >gi|330403359|gb|ADLB01000019.1| GENE 61 51860 - 52075 278 71 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623355|ref|ZP_04666386.1| 30S ribosomal protein S17 [Clostridiales bacterium 1_7_47_FAA] # 1 70 1 70 70 111 82 1e-23 MITVKINTFVKDLKAKSVAELNEELVAAKKELFNLRFQNATNQLDNTSRIKEVRKNIARI QTVITEASKAE >gi|330403359|gb|ADLB01000019.1| GENE 62 52056 - 52493 663 145 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881778|ref|YP_001560746.1| 50S ribosomal protein L16 [Clostridium phytofermentans ISDg] # 1 145 1 145 145 259 88 3e-68 MLMPKRVKRRKQFRGSMKGKALRGNKITNGEYGIVATEPCWIRSNQIEAARIAMTRYIKR GGKVWIKIFPDKPVTTKPAETRMGSGKGTLEYWVAVVKPGRVMFELAGVPEETAREALRL AMHKLPCKCKIVSREDLEGGDNSEN >gi|330403359|gb|ADLB01000019.1| GENE 63 52496 - 53152 1004 218 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145872|ref|ZP_04744473.1| SSU ribosomal protein S3P [Roseburia intestinalis L1-82] # 1 215 1 215 218 391 89 1e-108 MGQKVNPHGLRVGVIKEWDSKWYAEDKFADYLVEDHKIRTFLKKKLYSAGVSKIEIERTS DRVKIIIYTAKPGVVIGKGGAEIEKVKAELQKFTDKKLVVDIKEIKRPDKDAQLVAENIA LQLENRISFRRAMKSCMSRTMKSGALGIKTSVSGRLGGADMARTEFYSEGTIPLQTLRAD IDYGFAEADTTYGKVGVKVWIYKGEVLPTKETKEGSDK >gi|330403359|gb|ADLB01000019.1| GENE 64 53164 - 53550 594 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145873|ref|ZP_04744474.1| 50S ribosomal protein L22 [Roseburia intestinalis L1-82] # 1 128 1 128 128 233 89 3e-60 MAKGHRSQIKRERNANKDTRPSAKLSYARVSVQKACFVLDAIRGKDVQTALGILTYNPRY ASSLIKKLLESAIANAENNNGMNVENLYIEECYANKGPTMKRIKPRAQGRAYRIEKRMSH ITLVLNER >gi|330403359|gb|ADLB01000019.1| GENE 65 53581 - 53862 463 93 aa, chain - ## PROTEIN 
SUPPORTED ## NR: gi|238922834|ref|YP_002936347.1| 30S ribosomal protein S19 [Eubacterium rectale ATCC 33656] # 1 93 1 93 93 182 93 5e-45 MARSLKKGPFADESLMKKVEAMNAAGDKSVIKTWSRRSTIFPQFIGHTIAVHDGRKHVPV YVTEDMVGHKLGEFVATRTYRGHGKDEKKSRVR >gi|330403359|gb|ADLB01000019.1| GENE 66 53892 - 54734 1308 280 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145875|ref|ZP_04744476.1| 50S ribosomal protein L2 [Roseburia intestinalis L1-82] # 1 280 1 281 281 508 87 1e-143 MGIKTYRPYTPSRRQMTGSDFSEITKTTPEKSLVVSLNKTAGRNNQGKITVRHRGGGAKR KYRIIDFKRRKDGIPATVIGIEYDPNRTANIALICYADGEKAYILAPEGLTDGMKVMNGP EAEVRVGNCLPLSEIPVGTQIHNIELYPGRGGQLVRSAGNSAQLMAKEGKYATLRLPSGE MRMVPIVCRASIGVIGNGDHNLINIGKAGRKRHMGFRPTVRGSVMNPNDHPHGGGEGKTG IGRPGPSTPWGKPALGLKTRKKNKQSNKLIVRRRDGKGIK >gi|330403359|gb|ADLB01000019.1| GENE 67 54800 - 55099 413 99 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145876|ref|ZP_04744477.1| 50S ribosomal protein L23 [Roseburia intestinalis L1-82] # 1 99 1 99 99 163 81 3e-39 MANIQYYDVILKPVVTEKSMELMGDKKYTFLVHTEATKNQIKEAVEKMFEGTKVKSVNTM NLDGKTKRRGMTFGKTAKTKKAIVQLTADSADIEIFEGL >gi|330403359|gb|ADLB01000019.1| GENE 68 55099 - 55719 829 206 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916266|ref|YP_002929783.1| large subunit ribosomal protein L4 [Eubacterium eligens ATCC 27750] # 1 206 1 206 206 323 77 2e-87 MANVSVYNIEGNEVGTIDLNDAVFGVEVNEHLLHMAVVNQLANKRQGTQKAKTRSEVSGG GRKPWRQKGTGHARQGSTRSPQWTGGGVVFAPTPRDYSFKLNKKEKRAALKSALTSRVLE NKFIVLDEINFGEIKTKNFQNVLNNLNVSKALVVLEDDNRNAELSARNIAAVKTARTNTI NVYDILKYNTVVTTKAVVEKIEEVYA >gi|330403359|gb|ADLB01000019.1| GENE 69 55748 - 56428 958 226 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922830|ref|YP_002936343.1| 50S ribosomal protein L3 [Eubacterium rectale ATCC 33656] # 1 225 1 225 227 373 80 1e-102 MKKAILATKVGMTQIFNEDGALTPVTVLQAGPCVVTQIKTVENDGYSAVQVGFVDKKDKI VNVDKNGKKEIVHRHGVTKAEKGHFDKAGVSGKRFVREFKFENADEYTLAQEIKADIFAA GDKVDATAISKGKGFQGAIKRHNQHRGPMTHGSKFHRHAGSNGAASDPSKVFKGKKMPGH MGSKKITIQNLEIVRVDAENNLILVKGSVPGPKKSLVTIKESVKAI >gi|330403359|gb|ADLB01000019.1| GENE 
70 56503 - 56820 512 105 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240145879|ref|ZP_04744480.1| 30S ribosomal protein S10 [Roseburia intestinalis L1-82] # 1 105 1 105 105 201 96 1e-50 MASQVMRITLKAYDHQLVDSSAKKIIETVKKNGAQVSGPVPLPTKKEVVTILRAVHKYKD SREQFEQRTHKRLIDIITPTQKTVDALSRLEMPAGVYIDIKMKNK >gi|330403359|gb|ADLB01000019.1| GENE 71 57206 - 58303 1057 365 aa, chain + ## HITS:1 COG:no KEGG:CLJ_B2686 NR:ns ## KEGG: CLJ_B2686 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_Ba4 # Pathway: not_defined # 243 364 164 285 292 145 63.0 2e-33 MNLVRFMQIAIFIYALFFAFILITDCLKHKSEFTRNKIFPLAIIGFFSDLLDTWGIGSFA TCQAGFKFTNSCPDEKMPGTLNVAHTLPTIAEFLLFLNLIKIEGITLITLIGGAVIGAIY GASLVSKWSPKIIRLALGSALVVLAVILSLKLAHIGPFEKPLAPQDIVLELKANGFTVNN GEEWIYLIENNQNLSDTEKEIYTNPSITASELSSLIPKLKLEGIYTLNARNLTSTITGHL TYGLKGYKLIIGIIINTLLGALMTIGVGLYAPCMAMVSGLGLNVSAAFPIMMGSCAFLMP SAGIQFIQSGAYDRKAAAVITVFGLMGVFLAYKVAAALPMNILTAIVICVMLYTAFTFFK DAKRL >gi|330403359|gb|ADLB01000019.1| GENE 72 58321 - 58788 479 155 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3099 NR:ns ## KEGG: CDR20291_3099 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 153 1 153 155 207 66.0 9e-53 MGIGPSTKETSLHHFRDPLLDVVSADEDLDLMGVLLVGTPDGNEEKMLVGRRTAVWAEGM RADGVIISSDGWGNSDVDFTNTVEELSTRGIEVTGLNFSGTVGQFVVVNEYLDGIIDINK SESGTETCVVGENNATALDAKKAVAMLKLKMRRKE >gi|330403359|gb|ADLB01000019.1| GENE 73 58800 - 59564 683 254 aa, chain - ## HITS:1 COG:no KEGG:Clos_0031 NR:ns ## KEGG: Clos_0031 # Name: not_defined # Def: D-proline reductase (dithiol) (EC:1.21.4.1) # Organism: A.oremlandii # Pathway: Arginine and proline metabolism [PATH:aoe00330] # 1 253 1 252 253 303 60.0 4e-81 MSQEVRDLRRLVIKAFHIENIEAGSENEVYPDGKMKVHAGIERTDDEKSAIDKIEISVIK PGEHNRWTNSIMDIIPISTKVLGKIGEGITHTLTGVYVILTGVDVNGIQTAEFGSSEGNL SEKLYLGRAGTPSPGDYIISFDVTFKAGMGQERHGVTEAHRKCDEFIQKFREQMKKFKGE ECTERHEYHDIVRPGKKRVLIVKQVAGQGAMYDTHLFGKEPSGVEGGRSIIDMGNMPVLV TPNEYRDGIIRSMQ >gi|330403359|gb|ADLB01000019.1| GENE 74 59724 - 59969 
262 81 aa, chain - ## HITS:1 COG:no KEGG:CLD_2160 NR:ns ## KEGG: CLD_2160 # Name: prdB # Def: D-proline reductase, PrdB subunit, selenocysteine-containing (EC:1.21.4.1) # Organism: C.botulinum_B1 # Pathway: Arginine and proline metabolism [PATH:cbb00330] # 1 81 161 241 241 122 83.0 3e-27 MQRAIEEAGIPTIIIAALPPVVRQSGTPRAVAPLVPMGANAGEPNNKEMQTAIVKATLEQ LVEIQTPGKIVPLPFEYVAKV >gi|330403359|gb|ADLB01000019.1| GENE 75 59997 - 60449 516 150 aa, chain - ## HITS:1 COG:no KEGG:Clos_0030 NR:ns ## KEGG: Clos_0030 # Name: not_defined # Def: D-proline reductase (dithiol) (EC:1.21.4.1) # Organism: A.oremlandii # Pathway: Arginine and proline metabolism [PATH:aoe00330] # 1 150 1 150 241 245 78.0 4e-64 MSLTTVKGIQSEIFVPITPPSVWTPVTKEIKDMTVALATAAGVHHKDDKRFNLAGDFTWR KITDTMKSEDLMVSHGGYDNGDVNKDINCMFPIDRIHELAKEGFIKACAPVHAGFMGGGG NQEKFKHETGPAIAQMFKEEGVDAVLLTAG >gi|330403359|gb|ADLB01000019.1| GENE 76 60449 - 60736 425 95 aa, chain - ## HITS:1 COG:no KEGG:CLD_2159 NR:ns ## KEGG: CLD_2159 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_B1 # Pathway: not_defined # 1 80 3 82 91 108 68.0 7e-23 MKYGDKLIRMEGVIVEVHDGCVAVDLKGRLGYLKIPMRMLITDYPLAVGQEVAWNMSFIE QLGPEANDKYVSNLDVYKRRQEEARLKNENNKEAE >gi|330403359|gb|ADLB01000019.1| GENE 77 60752 - 62581 2091 609 aa, chain - ## HITS:1 COG:no KEGG:Clos_0028 NR:ns ## KEGG: Clos_0028 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits # Organism: A.oremlandii # Pathway: Arginine and proline metabolism [PATH:aoe00330] # 1 608 1 625 627 759 64.0 0 MSITAETAKEHAKDPAVLCCRAEEGIIIGPENLEDPAIFDDLVDSGLLNLDGVLTIEEIL GAKLTKTCDSLTPITRDVIDAVQTVAEEETAAPEEAPAAAPVAPVAAKSNGGVLKIHIGE GKDIQIELPMNVGAGTAKAVEIPAGAEVAVPVKEQEPKLIRSLTRKHFKITEVKRGPETK IEGTTLYIREGIEEEAVASQELVNSLKIDIITPAEYHTYSETIMDVQPIATKDGDDEIGS GATRVLDGVVMMVTGTDANGVQIGEFGSSEGYLDENIMWGRPGAPEEGEIFIKTEVVIKE GTNMERPGPLAAHTATDVITQEIRTALKKLDESLVVDTEEFNQYRRPGKKKVVIVKEIMG QGAMHDNLILPVEPVGVLGAKPNVDLGNVPVMVSPLEVLDGCIHALTCIGPASKEMSRHY 
WREPLVLEALHDEEIDLCGVIFVGSPQINAEKFYVSKRVGMMVEALDVDGAFITTEGFGN NHIDFASHHEQIGMRGVPVVGLSFCAVQGALVVGNKYMTNMVDNNKSEAGIENEVLACNT LCHEDAIRALAMLKTAMAGEEVKKPEKKWNPNVKATNVELISNVTGKPVELVKNEQSLPM SDKRREKYN >gi|330403359|gb|ADLB01000019.1| GENE 78 62650 - 63039 653 129 aa, chain - ## HITS:1 COG:no KEGG:Clos_0028 NR:ns ## KEGG: Clos_0028 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein B alpha and beta subunits # Organism: A.oremlandii # Pathway: Arginine and proline metabolism [PATH:aoe00330] # 1 129 1 137 627 67 37.0 1e-10 MLITNEMVKEYADKPAVFAVKVPAGTEISAEHLEHPMFLDDMVSSNLFTLDTALTIGQVI GVKTAKDCDALTPVTEEAVGGAVKLAAKEAPAEENVQEVQVQTEVAAQDGNTIIISIKEG KDIYLELPM >gi|330403359|gb|ADLB01000019.1| GENE 79 63057 - 64373 1348 438 aa, chain - ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 2 304 45 363 451 137 30.0 4e-32 MKDVQILLRQHVGAPCQATVKSGDKVKKGTLIAEPTGLGANIFSSVYGEVTEVLEDRIVI RPDEVQSEEFEKIEEGSKLDMVKAAGIVGMGGAGFPTGVKIGTDLQGGYILINAAECEPG LRHNIQQIEEECDKVIRGVKYTMEISNAAKAIFAIKKKNEKAVQTLKEALKDEEAISIHL LPDIYPMGEERAVVRECLGIELKPEELPSVAKSIVINVETLARITEAIEERKPCFSKNLT VIGKLNGGNEPHVFKDVPVGTSVAELIERAGGIDGEYGEIIMGGPFTGKATTLDAPITKT TGGIIVTMEFPDLHGATMGVLVCACGGDEIRMRDIAEKMNAKVVSVARCKQAQEMKSGAL KCERPGNCPGQVKNSMQFKKDKCEYIIIGNCSDCSNTVMGSAPKMNLKVFHQTDHVMRTI GHPLYRYLTVSKQVEQDI >gi|330403359|gb|ADLB01000019.1| GENE 80 64926 - 65957 1139 343 aa, chain + ## HITS:1 COG:SMb20268 KEGG:ns NR:ns ## COG: SMb20268 COG3938 # Protein_GI_number: 16264006 # Func_class: E Amino acid transport and metabolism # Function: Proline racemase # Organism: Sinorhizobium meliloti # 16 342 6 332 333 240 40.0 2e-63 MSLNPTLNKERFEAAFTVVDSHTVGEFTRIVLSGFPELEGNTMMERKNFLVNHCDDYRTA LMLEPRGHHDMFGAILTEPISEEADLGVIFMDTGGYLNMCGHGTIGSATVAVETGLVPVT EPYTEVVLEAPAGIIRTKVKVENGKAVEVTLTNVPAFLYKDNLTVEVDGVTIPYCISFGG 
SFFALVDASKLGYNIGPAAVPALQTLGMKMLEKINKEVSVQHPTLDINTVDLVEFYGPTP NPEKADMRNVVIFGEAQADRSPCGTGTSAKLATLYAWDEIKIGEEFRYESFTGSVFRGVI KEEASVGDFKAVIPQITGSAYITGMGTYVIDPTDPLKYGFSIG >gi|330403359|gb|ADLB01000019.1| GENE 81 65973 - 66560 607 195 aa, chain + ## HITS:1 COG:SMb20367 KEGG:ns NR:ns ## COG: SMb20367 COG1309 # Protein_GI_number: 16264101 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Sinorhizobium meliloti # 1 83 1 88 230 66 40.0 3e-11 MPRKRQSPKSRIVKAAWNLFYKNGYEETTVNDIITASKTSKGTFYHYFKGKDALLSSLSY LLDDKYEELTGIINPDLSAYDKLLFLNHELFYMIETSIDVNLLAYLYSSQLVTKDKKSLL DKKRYYFTWITEIIEDALKNGEFHPANSASDLVKLYTMYERSLLYDWALCKGKYALTEYS DRLLPHVLDRFKEEI >gi|330403359|gb|ADLB01000019.1| GENE 82 66606 - 66899 413 97 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167758842|ref|ZP_02430969.1| ## NR: gi|167758842|ref|ZP_02430969.1| hypothetical protein CLOSCI_01185 [Clostridium scindens ATCC 35704] # 1 97 1 97 97 135 72.0 9e-31 MKVYDTVNKVELEATEEELIKLMAPGGRQVDLYLNGKKTDEDGYLTWDVEHWSSIDGKRF IRCYSLEGRVLGESTGHNIYDLRNDFKPEEAKEVKLS >gi|330403359|gb|ADLB01000019.1| GENE 83 66915 - 68066 1689 383 aa, chain - ## HITS:1 COG:no KEGG:Shel_25180 NR:ns ## KEGG: Shel_25180 # Name: not_defined # Def: fatty acid/phospholipid biosynthesis enzyme # Organism: S.heliotrinireducens # Pathway: not_defined # 3 382 2 381 382 480 71.0 1e-134 MANSVEKMIASTFMDIAQGLETGSFGKRPKIALTGMGSEHGEENSMAAAVEAEKDGIDVY YIGTLEAEGVTTIKVADDEEGHKKMEEMLKNGEVDGAVTMHFPFPIGVSTVGRCVTPATA KEMFIANTTGTSSTDRIEGMIKNAIYGIITAKACGKKNATVGILNVDGARQTEKALKQLQ ENGYDITFAESGRADGGCVMRGNDVLQASPDIMVTDSLTGNILVKMLSSFTTGGSFEATG FGYGPGIGEGYEQLVMIVSRASGAPVIANAIRYAGQLVRGKVFEVAKQEFEAANKAGLKD ILNAHKNASKPAEAEEVKEPPKEVVTAQIPGIEVMDLEDAVKTLWKINIYAESGMGCTGP IVLVSEANLVKAEEELKKAGYIN >gi|330403359|gb|ADLB01000019.1| GENE 84 68080 - 69612 1913 510 aa, chain - ## HITS:1 COG:no KEGG:Shel_25190 NR:ns ## KEGG: Shel_25190 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 508 1 508 514 711 66.0 0 
MNSVIKGASYVLAHTPQMVVYNGTTQTTERVVNPESEYLKELNSHLRSYEECVNYWPNQV YIGNAHPDELAEVEFPYYDKVKEGAERFGKYGEIMPENEFLLLVQTCDMFEVVKLDKEFV AATKEAFAANPIITEDIVEKIEEGVELSEVEHLVNDEHAEGLYIDGKLVGCIKRAHDIDV NLSAHVMHENMMSKATSVLALLYAVRNTGIDKSEIEYVIDCAEEACGDMNQRGGGNFAKA AAEVAGLVNATGSDSRGFCAGPSHALIEAAALVKSGAYKCVAVTAGGCTAKLGMNGKDHV KKGLPILEDCLAGFCVIICENDGESPEINLDILGRHTVGTGSAPQNVIGSLVATPLEKVG LKITDIDKYSPEMQNPDITKPAGAGDVPLANYKMIAALAVKKGELDRKELPAFAAKHGLT GWAPTQGHIPSGVPYIGFAVEDIKAGKIKNAMIVGKGSLFLGRMTNLFDGVSFVIQANTG AEDNAGVSEEEVKGLIAKAMKDFAETLMAE >gi|330403359|gb|ADLB01000019.1| GENE 85 69797 - 70123 413 108 aa, chain - ## HITS:1 COG:no KEGG:Shel_25200 NR:ns ## KEGG: Shel_25200 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.4) # Organism: S.heliotrinireducens # Pathway: not_defined # 1 107 50 156 156 164 80.0 1e-39 MDLENQKRVKDFADEFGAENLVVIVGAAEGEAAGLAAETVTAGDPTFAGPLTGVELGLTV YHVCEPELKEEFDEAVYDEQVGMMEMVLDVDDIVSEMSSIREQFCKYL >gi|330403359|gb|ADLB01000019.1| GENE 86 70139 - 70270 199 43 aa, chain - ## HITS:1 COG:no KEGG:Amet_3592 NR:ns ## KEGG: Amet_3592 # Name: not_defined # Def: glycine/sarcosine/betaine reductase complex protein A (EC:1.21.4.3 1.21.4.4 1.21.4.2) # Organism: A.metalliredigens # Pathway: not_defined # 1 43 1 43 158 71 74.0 1e-11 MAILDGKKVIIIGDRDGVPGPAIAECVKTAGGEVVFSSTECFV >gi|330403359|gb|ADLB01000019.1| GENE 87 70398 - 70715 493 105 aa, chain - ## HITS:1 COG:Cgl3031 KEGG:ns NR:ns ## COG: Cgl3031 COG0526 # Protein_GI_number: 19554281 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Corynebacterium glutamicum # 6 100 9 103 107 64 31.0 4e-11 MLDLDKTTFQPEVLEAEGYVFVDFYGDGCVPCQALMPFVHEMADKYGDKLKFTSLNTTKA RRLAIGQKVLGLPVMAIYKDGEKIDEVVKDDATQENIEAMIQKYI >gi|330403359|gb|ADLB01000019.1| GENE 88 70735 - 71682 594 315 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 
[Streptococcus pneumoniae SP6-BS73] # 4 314 1 306 306 233 42 3e-60 MSKIYDVIILGAGPAGLAAGLYAGRARMSTLIVEKGKDGGQIAITDEIENYPGQIVEGES GPSLIARMTEQCEKFGVERVTDVINDVVLEGDVKKLISAKGEYCGKTLIIATGAFARPIG CKGEKEFMGKGVSYCATCDANFFEDFEVYVVGGGDSAVEEAMYLTKFARKVTIIHRRDEL RAAKSIQEKAFKNPKLHFMWDTVVEEVSGDGILSEMTVKNVKTGELTKIEADEEDGMFGV FGFIGTLPSTKIFEGKGLELDERGYIPTDDNMRTNIPGVFAAGDVRVKSLRQVVTAAADG AIATTQCEKYINEME >gi|330403359|gb|ADLB01000019.1| GENE 89 71707 - 71937 265 76 aa, chain - ## HITS:1 COG:no KEGG:Shel_25230 NR:ns ## KEGG: Shel_25230 # Name: not_defined # Def: glycine reductase, selenoprotein B # Organism: S.heliotrinireducens # Pathway: not_defined # 1 76 361 436 437 122 78.0 6e-27 MVKGVEGAGLPVVHVCTVTPISMTVGANRIVPAVAIPHPLGNPALEADEEKKLRRNLVEK ALHALTVEVEDQTIFE >gi|330403359|gb|ADLB01000019.1| GENE 90 71965 - 73011 1432 348 aa, chain - ## HITS:1 COG:no KEGG:Shel_25230 NR:ns ## KEGG: Shel_25230 # Name: not_defined # Def: glycine reductase, selenoprotein B # Organism: S.heliotrinireducens # Pathway: not_defined # 2 348 4 350 437 507 74.0 1e-142 MIKVVHYINQFFANIGGEEMAHIPAELHKGEVMGPGLAFKASFKDEAEITATIVCGDSWF NENLEEAKKTILAWVKEENPDVFVAGPAFNAGRYGVACGTIADAVQEELGIPAVTGMYVE NPGADMFKNKVYTVPTKNSAAGMRAAVADMAPLALKLAKGEAIGASCQDHYMPNGVRVNF FEEKRGSRRAVDMLLKKLADKPFTTEYPMPAFDRVPPMPAVKDLSKATIALVTSGGIVPK GNPDHIESSNASHYGEYDITGVMDLTEDTYETAHGGYDPVYANEDADRVLPVDVLRDMEK EGVIGKLHNKFYTTVGNGTAVASAKAFSEEFAQKLIDDGVDAVILTST >gi|330403359|gb|ADLB01000019.1| GENE 91 73031 - 74314 1664 427 aa, chain - ## HITS:1 COG:no KEGG:Shel_25240 NR:ns ## KEGG: Shel_25240 # Name: not_defined # Def: glycine/sarcosine/betaine reductase component B alpha/beta subunit # Organism: S.heliotrinireducens # Pathway: not_defined # 3 427 2 426 426 624 76.0 1e-177 MRRLELGHINIKDIQFGPESKIEDGVLYVNEDAVKAIVLEDEKIKSCKLDIARPGESVRI TPVKDVIEPRVKVEGRGGVFPGVINKVDTVGEGKTYALKGMAVVTAGKIVGFQEGIIDMT GPGADYTPFSKLNNLVVVCEPVDGIKQHDYEPAVRFAGFRVAVYLGELARELTPDSTEVF ETYGIMEGAEKLPGLPRVAYVQMLQSQGLLHDTYVYGVDAKKTLSTMMTPTEVMDGAIVS 
GNCVSACDKNPTYVHENNPVVHDLFAEHGKTLNFVAHILTNENVYLADKQRSSDWTAKLC RLLDLDGVVVSQEGFGNPDTDLIMNCKKIEAEGVKTVIITDEYAGRDGKSQSLADADAAA DAVVTGGNANQVIVLPKLDKVIGTLDYVNTIAGGNEHSLREDGTIEVEIQAITGATNETG FGYLSAR >gi|330403359|gb|ADLB01000019.1| GENE 92 74383 - 74766 388 127 aa, chain - ## HITS:1 COG:no KEGG:Amet_3596 NR:ns ## KEGG: Amet_3596 # Name: not_defined # Def: GrdX protein # Organism: A.metalliredigens # Pathway: not_defined # 1 127 1 127 127 110 49.0 1e-23 MKAEFIIITNNPLIVEKLGQEYTVEYEDISYEDTLKKVRNYVYKGHELLTHPLSGSVKPN ETPYKSVMVSEKATALNSESVKIIEQAIQSCGKFEFKSDKYALHVYDDFQYIDYTLISSA LPSATAW >gi|330403359|gb|ADLB01000019.1| GENE 93 75011 - 76405 852 464 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 1 459 1 447 456 332 40 4e-90 MDLNKIVLAVQGVLADKILIIALVGTGLWFTFNLGFIQIRGFGEGWKRTFGGLFKKSGKA GKDGMSSFQALATAIAAQVGTGNLAGAATAIAMGGPGAIFWMWVSAFLGMATIYAEALMA QKYKQVGDDGVVTGGPVYYIRAAFKGTFGKVLAGIFAVLIVFALGFMGNAVQSNSISDAF GTAFGINPWVVGIVIAAISLFVFVGGISRIASLTEKIVPIMAGFYIIGALIVIIANGDML PKAFHDIFVGAFSPQAVGGGVLGWTVQKAMARGVGRGLFSNEAGMGSTPHAHAVAKVDHP VEQGFVAMMGVFIDTFVVLTLTALVILTSGMVGEIDPLTKAAYTGTALTQVGFSSVFGKF GEVFIAICMFFFAFSTIIGWYFFGEANIKYLFGSKAVKIYAVLVCVCIVVGSAQRVDLVW NMADCFNSAMVIPNVIGLWALSGMVKKVHKDYYQNFRPNQMKKK >gi|330403359|gb|ADLB01000019.1| GENE 94 76643 - 76879 264 78 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1490 NR:ns ## KEGG: Cphy_1490 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 75 1 75 78 79 54.0 4e-14 MRKKELKLVVTFHTTADAMAMEKACKEHQVSGRLIPVPRTISAGCGLSWCAELSELEKIK NMMQEAGIEEEEIHECMV >gi|330403359|gb|ADLB01000019.1| GENE 95 76881 - 77084 318 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226322978|ref|ZP_03798496.1| ## NR: gi|226322978|ref|ZP_03798496.1| hypothetical protein COPCOM_00750 [Coprococcus comes ATCC 27758] # 1 67 9 75 75 103 89.0 3e-21 MKEVDARGLSCPEPLMLTAEALKTAKGPIKVLVSEPHQKTNVEKYAKDHGKKAISTQKGS EFEIVIE >gi|330403359|gb|ADLB01000019.1| GENE 96 
77138 - 78235 1424 365 aa, chain - ## HITS:1 COG:no KEGG:Shel_25130 NR:ns ## KEGG: Shel_25130 # Name: not_defined # Def: hypothetical protein # Organism: S.heliotrinireducens # Pathway: not_defined # 1 365 1 363 369 364 61.0 4e-99 MNLSDSKVKLGIAGVVCGIVAVVLAMTGNPGNMAICIACFIRDTAGALGMHQAAVVQYAR PEIIGIVLGAFIISVVTKEYRSTAGSSPMIRFLLGIMIMIGSLLFLGCPLRMVLRMASGD LNAYVALIGFVAGIITGSLALKKGFSLGRAHETNKTSGAVLPLLMVGILVLILVGSSLLI SSKTGPGSMHAPIWIALIGGLVFGAFAQKSRMCFAGSIRDIFLMKNFDLFTVIAGLFVVM LVYNVATGNFAFSFTGKPVAHSQHLWNILGMYIVGFAATLAGGCPLRQLVLAGQGSSDAA VTVAGLFVGAALCHNLKLASAAAAPATKEAAAVIGGPSAEGKIAAVICIILLFAIAFVGN KRKTQ >gi|330403359|gb|ADLB01000019.1| GENE 97 78251 - 79390 1090 379 aa, chain - ## HITS:1 COG:CAC2354 KEGG:ns NR:ns ## COG: CAC2354 COG0520 # Protein_GI_number: 15895621 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Clostridium acetobutylicum # 1 377 1 376 379 325 45.0 6e-89 MIYMDNAATTMQKPREVIDAVVEAMNSMGNAGRGAHSASLDASRTIYGVRDILAHFFGAE SPKQIVFTNNSTESLNIAIKGLLQPKDHVITTEMEHNSVLRPLYEMEEKGVELTILPADK KGIVSCEDFEKEIRPNTKAIICTHGSNLTGNMLNIKRIGEMAKEHGLRFVVDASQTAGVY PIDVQEMNIDILCFTGHKGLLGPQGTGGMYVRSGLELKPLKCGGSGVDTYNKHHPKEMPT ALEAGTLNGHGIAGLGAGVSYIRKTGMETIREKELAHMWRFYHGVKDIPNVKIYGDFETE KRCPIVTLNIGDYDSSEVSDELLMTYNISTRPGAHCAPLMHRALGTVEQGAVRFSFSHYN TEEEITVAIKAIEELAKEE >gi|330403359|gb|ADLB01000019.1| GENE 98 79525 - 80424 807 299 aa, chain - ## HITS:1 COG:MJ0300 KEGG:ns NR:ns ## COG: MJ0300 COG0583 # Protein_GI_number: 15668475 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Methanococcus jannaschii # 6 296 8 296 296 140 31.0 3e-33 MNLKQLEAFVYVAEKKSFSKAARELFLTQPTISAHISSLEKELNVRLVVRNTKEVNLSAE GEKLYKCAKKMVELEREIEETFFREEGEKACITISASTIPAQYLLPKILAKFSDKYPEEQ FKIKETDSAQVVEQVVNRMIDIGFTGTMLDKKHCVYLPFYEDELIVITPNQERFRKLQKD DENAEWIQREPMIMREEGSGTRKEAEKQLKKKGLSVESLNIVASIENQETIKRSVENGMG ISIISKLAAKREMESGKLLGFPLTEGDSVRNIYMIYNKDIRLSHITEKFIKMVKEMYHY >gi|330403359|gb|ADLB01000019.1| GENE 99 80424 - 82337 1874 637 aa, chain - ## 
HITS:1 COG:PA4807 KEGG:ns NR:ns ## COG: PA4807 COG3276 # Protein_GI_number: 15600001 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Selenocysteine-specific translation elongation factor # Organism: Pseudomonas aeruginosa # 4 631 1 622 641 308 31.0 2e-83 MKNIIIGTAGHIDHGKTTLIKALTGRNTDRWEEEQRRGITIDLGFTYFDLKSGDRVGIID VPGHEKFINNMVAGVVGMDLVMLVVAADEGIMPQTREHMDILGELGIEKSILVLNKCDLV DEEWLELVEEEIQEELEGTFLENAPVVKVSAATGQGIPELIETIERLTADEVVEKDIHTI PRLPIDRVFSLSGFGTIITGTLLAGTICKEDTLQMYPIGKECKIRSIQVHGENVEKCYAG QRVAINISNLKKSEIHRGCVLAPPNNMKNTMLLDVKMNILPSSMRILTNHTRLHLFTGTS EILCRAVLLDKDEIGPGESGYVQLRLEEEIALRRGDKFIVRFYSPMETIGGGVVLEPNPT KKKRFHEETIEELKRKESGSTEDIIEMHIKNSKDTMLTVAELAKMTALSSDEVKQDAESL EERGIIKIFHMKKDSYTWHRENDKALQEDMAKTLQEYHRKHLYRYGMQKAEIHMTFLKKV KPNVFDLYLESLVEEGVMKRRNEFLSLPQHEIVKDEVYQKVESKLLTTFEKAGLDFPKVS EVDFGTVPFETVEDILLLLMEEKKIVKLGEELYTLTSYIEHSKEIICSMFKEKDLITMAE IRDALETSRKNAKLIVEYMDSIKITKRNGTESERVAY >gi|330403359|gb|ADLB01000019.1| GENE 100 82368 - 83780 1210 470 aa, chain - ## HITS:1 COG:HI0708 KEGG:ns NR:ns ## COG: HI0708 COG1921 # Protein_GI_number: 16272648 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine synthase [seryl-tRNASer selenium transferase] # Organism: Haemophilus influenzae # 6 462 4 460 461 386 46.0 1e-107 MNKNLLYRNIPKVDILLENIKIQELIETYGRETVVETIREALESLRAYIKECDAEENALE RINSLVSDIEKKVISINTPDIRSVINGTGTILHTNLGRAPISPKHMQYVAEIATGYSNLE YNLEEGKRGERYSHFEKLLCKITGAEAAMAVNNNAAAVMLILNTLGKGKEIIVSRGELVE IGGKFRVPDVMEQSGASLVEVGTTNKTHFADYEQAITEETGALLKVHTSNYRIVGFTDTV SIEELVPLGEKYHLPVIEDLGSGVLIDLSKYGLTYEPTVQDSIRHGADIVCFSGDKLLGG AQAGIIVGKKAYIDKMKKNQMTRALRIDKFTAATLDVILHEYLSEERAIQNIPALHMITE SLEDVTKRAKSLQRMLRQMKPKAEIVLEKCESQIGGGSLPLERIESMAVTIKPNCITTAE LEERMRHLPIPIIPRTMNDKIILDVRTIEQRFFKTIVQELKESHIFEEKA >gi|330403359|gb|ADLB01000019.1| GENE 101 83793 - 84749 883 318 aa, chain - ## HITS:1 COG:Cj1504c KEGG:ns NR:ns ## COG: Cj1504c COG0709 # Protein_GI_number: 15792818 # Func_class: E Amino acid transport and metabolism # Function: 
Selenophosphate synthase # Organism: Campylobacter jejuni # 12 315 7 305 308 241 45.0 2e-63 MGKLPKFHDDNLLVGIETSDDAAIYKVTDDIALIQTVDFFTPIVDDPYMFGQIAAANSLS DVYAMGGEPKIALNIVGFPNCLDPSVLGEILAGGADKVKEAGAVLVGGHSVQDDEPKYGL CVSGFVHPDKIFKNYGCKPGDVLILTKQIGNGIVNTAIKAEMASERAVKEVTVAMASLNK KAKEVVENHQVNACTDITGFGLLGHCVEMAVASDVTFEINVKDIAYFEDAISYAKMGLVP AGAYKNRGYSGKQVDMSQVEEHYVDLLYDPQTSGGLLISVPPEEVESIMKEFEEKKMDTK VSIIGKVTEKGEKLIRLR >gi|330403359|gb|ADLB01000019.1| GENE 102 84859 - 85461 727 200 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3158 NR:ns ## KEGG: EUBREC_3158 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 200 1 205 205 210 56.0 2e-53 MIKVNAMGDNCPIPVIKTKKAMQEVTGPEVIEVAVDNEVAVQNVTKMASSAGGTVTSEKI AEKEYRVTIQMNGAIKEDAEEACCPTQPEENTVVVISSDRMGSGNDELGKVLIKGFIFAV TQLDTLPKTMLFYNGGATLTAEESDCLEDLKHLAEQGVEILTCGTCLNYYGLTEKLAVGS VTNMYTIVEKMAGASKIVQP >gi|330403359|gb|ADLB01000019.1| GENE 103 85639 - 86499 821 286 aa, chain + ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 9 283 58 336 338 227 41.0 2e-59 MWIADGWKDYEVIDSSKGEKLERWGNYTLVRPDPQVIWDTPKAHKGWKKMNGHYHRSKKG GGEWEFFDLPEQWQIHYKDLTFNLKPFSFKHTGLFPEQATNWDWFSEKIRNAGRPIKVLN LFAYTGGATLAAASAGASVTHVDASKGMVTWAKENAVSSGLKDAPIRWLVDDCVKFVERE IRRGNHYDAIIMDPPSYGRGPKGEIWKIEDAIHPLIKLCTQILSDDPLFFLINSYTTGLA PSVLTYMLATELKKFNGVVDSQEIGLPVSGNGLVLPCGASGRWERK >gi|330403359|gb|ADLB01000019.1| GENE 104 86545 - 87207 661 220 aa, chain - ## HITS:1 COG:VCA0102 KEGG:ns NR:ns ## COG: VCA0102 COG0637 # Protein_GI_number: 15600873 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Vibrio cholerae # 2 213 6 215 219 125 34.0 5e-29 MLKALLFDMDGLIFDSEKVVQRSWNMAGNTLGYRNIGEHIYNTLGMNAKGRGEYFQKVFG EDFPNDRFRDLARENFYGIVEKEGLSVKPGARELIRYAKQLGYCMAVVTSSRKEYVREMF 
QRAGLYEYFDLFVCGDMVTKSKPAPEIYEKACKLLEVRPEYCVAFEDAPAGVESATKAGV DVIMVPDLVQPDMETRRRAWRVIRTLDEAIDILRREAREQ >gi|330403359|gb|ADLB01000019.1| GENE 105 87225 - 87842 541 205 aa, chain - ## HITS:1 COG:BH3033 KEGG:ns NR:ns ## COG: BH3033 COG0424 # Protein_GI_number: 15615595 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus halodurans # 17 204 1 182 190 150 43.0 1e-36 MKTPAMQVFFEQGGKKMKKIILASASPRRKEILEQIGMQFEILVSDKEEIYESSAPEEIV KELSLLKANHVSEMVGKNDIIIIGADTIVSHEGKVLGKPKSRAEAFQMIQGIQGKVHKVY TGIAILCYDESGEKKMINDAVETKVFVYPMSEKEINNYLDTGEYMDKAGAYAIQGLFAPF IEKIEGDYYNVVGLPVSRIYQAIKS >gi|330403359|gb|ADLB01000019.1| GENE 106 87847 - 88623 675 258 aa, chain - ## HITS:1 COG:NMA0648 KEGG:ns NR:ns ## COG: NMA0648 COG0703 # Protein_GI_number: 15793634 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Neisseria meningitidis Z2491 # 87 258 2 167 170 107 37.0 2e-23 MNDLELYREQLALCDEKLIEDLTERSDIFEKMLEYKEEHGMLILQPMQREKRVKRLEEKL KGNPYKEEIMNVFSCIAWNWKRIQGKKLFPHNIVLTGFMGTGKTTLAEYMGERFAMDVVE MDKEIENRAGMSVEEIFSQYGEEHFRQLETELLEELQSREHIVISCGGGIVLREENIDKL KKQGKVVLLTASAEVILERVKENGERPLLKGNKNIEWICKMMEERADKYAEVADVTVNTD GKTVLQICEEMIQKLEER >gi|330403359|gb|ADLB01000019.1| GENE 107 88984 - 89142 168 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLYKTKKKYPYLGLVLVSTGVLKFGKPSAVRNRITIRKLKLNADENLAYAA >gi|330403359|gb|ADLB01000019.1| GENE 108 89178 - 89411 347 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210618046|ref|ZP_03291881.1| ## NR: gi|210618046|ref|ZP_03291881.1| hypothetical protein CLONEX_04114 [Clostridium nexile DSM 1787] # 1 77 1 77 77 101 87.0 1e-20 MVPTTKEALSKLVTETTVEIYEELTPQLIQLIDQTKHNDELTEAQKQDEISLHMMGYIKS CTNEIMIEVLAEILGLD >gi|330403359|gb|ADLB01000019.1| GENE 109 89420 - 89887 513 155 aa, chain - ## HITS:1 COG:BS_yvaI KEGG:ns NR:ns ## COG: BS_yvaI COG0691 # Protein_GI_number: 16080413 # Func_class: O Posttranslational 
modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Bacillus subtilis # 1 150 1 150 156 171 58.0 7e-43 MAKTEGKLIANNKKAYHDYFILDTYEAGISLHGTEVKSLRMGKCSIKESFIRIENGEVFI YGMHISPYEKGNIFNKDPLRVKKLLLHKSEINKMLGKTKEKGMAIVPLKVYFKGSLVKVE IGLARGKKLYDKRDDIAKKDQKREAQREFKIRNLG >gi|330403359|gb|ADLB01000019.1| GENE 110 89892 - 92018 1778 708 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 12 708 7 705 730 689 50 0.0 MNEIFEKRKKMIYDFICDEFYVPMKIKEFAILLQVPKEERGELKKILDSLEAEGKIRVSK KGKYVKGEAKSLVGTYQAHQRGFGFVTVEGEDEDIFISEDDINGAFHGDSVEVVIKATPE GKRREGKITKILSHGTTQLVGYFQRNKNFGFVIPDNAKFLQDVFVPLERSKGAVTGHKVV VELTSYGKSGKKPEGKVVEIIGHINEPGTDIMSIVKDFDLPVDFPEKVMNQAERVGDKII PADMAGRMDIRDWDMVTIDGEDAKDLDDAISIRREGNNYLLGVHIADVANYVQEKSALDR EAYKRGTSVYLADRVIPMLPHALSNGICSLNEGEDRLALSCIMTITPKGEVIDHKIAETV ICVNRRMSYTSVKKILEDNDEQEIEKYKEFCPMFKLMEELAGILREKRKKRGSIDFDFPE TKMVLDENGKPLELKPYDRNVATKIIEDFMLIANETVAEDYFWQEIPFVYRTHETPDEEK IKKLAIFINNFGHSLHIANNEVRPKEVQKLLTKVEGTPEEMLISRLALRSMKQAKYTPDN TGHFGLAAPYYCHFTSPIRRYPDLQIHRIIKENIRGRMNANRREHYEGILTEVAKHSSEM ERRAEEAERETVKLKKAEYMESRIGETFEGVISSITKWGMYVELSNTIEGLVHVTNMYDD HYNYYEERYEMVGEHTNKVYKLGQTVNVRVLDVDKLQRTIDFELAEGE >gi|330403359|gb|ADLB01000019.1| GENE 111 92104 - 92346 395 80 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2867 NR:ns ## KEGG: Cphy_2867 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: C.phytofermentans # Pathway: Protein export [PATH:cpy03060]; Bacterial secretion system [PATH:cpy03070] # 1 76 1 77 81 83 62.0 2e-15 MEILKMVITILFAIDCIALTAIVLLQEGKSAGLGTISGAADTYWGQNKGRSMEGALVKST KFLAILFVVLAAVLNLKVFA >gi|330403359|gb|ADLB01000019.1| GENE 112 92398 - 93693 1277 431 aa, chain - ## HITS:1 COG:BS_eno KEGG:ns NR:ns ## COG: BS_eno COG0148 # Protein_GI_number: 16080443 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Bacillus subtilis # 7 424 4 422 430 530 64.0 1e-150 
MYKNLPIQDIYAREILDSRGNPTIEVEVLVGENIVGRASVPSGASTGKYEAVELRDGDER YNGLGVTRAVHHVNDAIAREIIGINVFEQEKIDGILLKIDRTENKSNLGANAMLGVSLAV AKTAAKALDIPLYRYLGGVNADTMPIPMMNILNGGRHADNSIDIQEFMIMPAGAKCFREA LRIGAEVYHCLKQILKEEKKATAVGDEGGFAPELSDAKEALFFIVKAIETAGYKAGKDVV LALDVAASELYDKKLNKYVFAGEGKVYSAEEMIDYYEELLTEFPIVSIEDPLDEEDWEGW ELLTTRLGQDIQLVGDDLFVTNTQKLERGIRQHIANAILIKVNQIGTLTEAVKAVEMAKN AGYKAIISHRSGETEDSTIADIAVGLNAGQIKTGAPCRSERTAKYNQLLRIEEELAETGK YKNLFKENLAI >gi|330403359|gb|ADLB01000019.1| GENE 113 93817 - 94659 842 280 aa, chain + ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 277 3 276 279 176 38.0 7e-44 MKTKLKSFSLLTGSILLMAVGIYFFKFSNNFTFGGITGLAVLVAKTGFMSAGDFTFIMNM VLLLIGFLILGKDFGIMTAYCSILLSVALSTLERLFPMTKPMTDQPMLELIFAIALPAIA SAALFNMGASSGGTDILAMIVKKYTSFNIGNALLLSDIVITLLGFFVFDIKTGLYSVLGL AVRSLMIDNVIESMNLSKYFNVVCSDPEPICDFIVHELNRSATTCLAKGAFSGDDRYIIF TALNRSQAVKLRNFIKHQEPGAFILISNTSEIIGKGFHYV >gi|330403359|gb|ADLB01000019.1| GENE 114 94713 - 96059 1523 448 aa, chain - ## HITS:1 COG:BS_ybbT KEGG:ns NR:ns ## COG: BS_ybbT COG1109 # Protein_GI_number: 16077245 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Bacillus subtilis # 1 443 1 444 448 422 50.0 1e-118 MGKYFGTDGFRGEANVVLTVEHAFKVGRFLGWYYGQNHKAKVIIGKDTRRSSYMFEDALS AGLTASGADVYLLHVTTTPSVSYVVRTEDFDCGIMISASHNVFYDNGIKVINGKGHKLEA EVEEKVEAYIDGEFGELPLATRENLGRTVDYSAGRNRYIGHLIAMATRSFKDKRVGLDCS NGSASSIAKSVFDALGAKTYVIHSEPDGTNINRNCGSTHIETLQEFVKEKKLDVGFAYDG DADRCLAVDENGNVVDGDLILYVCGKYLKEQGRLNNDTIVTTIMSNIGLYKACDKVGMKY EKTAVGDKYVYENMVQNNHSLGGEQSGHIIFSKYATTGDGILTSLLLMEVMLEKKETLGK LTEEVKIYPQLLKNVRVSDKKTARENPAVVREVEKVTEALGNDGRILVRESGTEPVIRVM VEASTHELCEEYVNQVVDVMEKEGLIIE >gi|330403359|gb|ADLB01000019.1| GENE 115 96377 - 97753 1502 458 aa, chain - ## HITS:1 COG:FN0278 KEGG:ns NR:ns ## COG: FN0278 COG0624 # Protein_GI_number: 19703623 # Func_class: E Amino 
acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Fusobacterium nucleatum # 12 458 12 452 452 371 41.0 1e-102 MEERRVLQNDVEKIVEDIIKLCSINSVEGEGKEGMPFGEGPAKALQCALKLGEDMGFISE NFDNYAGHIDFGEGEETLGILCHADVVPCGEDWICDPYNPQIIDGRLYGRGVLDNKGPMV VCLHAMRILKEMNIPLKKKVRLIIGTNEETNWKCMEYYFGKKKVETPQIAFTPDADFPLK YAEKGLLQYSLKMKISEEISLQGGNAVNSVPERASVCLDAKYLPEIERQKEQWKEKSGCA YEISGKGDKITLTVFGKAAHGAWVENGINAITGVMLAVDELKVGGDLERIASLYMKYIGL CLHGENLGLRFSDEESGILSFNVGTVGVSDGEVRFAVDNRVPVTYKCGEVMAQLEKALEG SGIEAEILDRIEGIHIDKDSFLVQTLMQVYRDVTGDMKAEPEVDGGCTYARTMQNCVAFG ALLPDQENVMHEKNEYLEISKIETWLRIYLEAIYRLAK >gi|330403359|gb|ADLB01000019.1| GENE 116 97835 - 99010 1343 391 aa, chain - ## HITS:1 COG:CAC2445 KEGG:ns NR:ns ## COG: CAC2445 COG0138 # Protein_GI_number: 15895710 # Func_class: F Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Clostridium acetobutylicum # 3 391 5 391 391 518 62.0 1e-147 MKELELKYGCNPNQKPSRIYMEEGELPIKVLCGKPGYINFLDAFNGWQLVRELKKATGLP AATSFKHVSPAGAGVGLPLDDTLAKIYWVDDMGELSPLACAYARARGADRMSSFGDFISL SDVCDVATAKLIKREVSDGVIAPGYEPEALELLMQKKRGNYAIIEIDPAYEPNPIEHKEV FGITFEQGRNELVIDDELLSNVVTENKEIPKQAKIDLAIALITLKYTQSNSVCYAKDGQA IGIGAGQQSRIHCTRLAGQKADNWWLRQSPQVLGLQFVDKIGRADRDNAIDLYIGDEYED VLAEGTWQNIFKVKPEVFTREEKRAWLDKMTDVALGSDAFFPFGDNIERAHKSGVKYIAQ PGGSVRDDNVIETCNKYDMAMAFTGIRLFHH >gi|330403359|gb|ADLB01000019.1| GENE 117 99022 - 99735 897 237 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1566 NR:ns ## KEGG: EUBREC_1566 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 237 50 286 286 389 78.0 1e-107 MEMLSIEKELQENAYPGRGIIIGRTPDGKKAVTAYFIMGRSENSRNRVFVEEGRGIRTQA FDPSKLTDPSLIIYAPVRVLGNKTIVTNGDQTDTIYEGMDKQLTFEQSLRCREFEPDAPN YTPRISGVLHIENGKYSYAMSILKSDNGNPEACNRYTFAYENAIAGEGHFIHTYKCDGNP LPSFEGEPKRVAMSDDMEAFAEMLWNSLNEDNKVSLFVRYIDIETGKEETKIVNKNK >gi|330403359|gb|ADLB01000019.1| 
GENE 118 99792 - 99935 138 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQGIMLLLNPYKNWRTETILKYDSFNGVITLYYNNLLVKSTGKQVIL >gi|330403359|gb|ADLB01000019.1| GENE 119 99859 - 100362 651 167 aa, chain - ## HITS:1 COG:BS_greA KEGG:ns NR:ns ## COG: BS_greA COG0782 # Protein_GI_number: 16079786 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Bacillus subtilis # 5 139 9 143 157 90 39.0 2e-18 MYNQLTEEDIKKMEEEIEYRRLVVRKEALEAVKEARAHGDLSENFEYHAAKKDKNKNESR IRYLQRMIRTAQIVSDDSKEDEVGINKAVEVYFEDDDECETFKLVTTVRGNSIDNRISTE SPIGKAILGHKAGDRVFVKVNENAGYYVVIKSIQKLEDGDDIEIRQF >gi|330403359|gb|ADLB01000019.1| GENE 120 100441 - 102069 1307 542 aa, chain - ## HITS:1 COG:SP0346 KEGG:ns NR:ns ## COG: SP0346 COG1316 # Protein_GI_number: 15900275 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Streptococcus pneumoniae TIGR4 # 158 532 113 480 481 252 39.0 1e-66 MGKKGMSPQERREYERLQERKKRQMEKRYAKSMGWDNPAKSVSENGRIPNPDKKREKKGR NILGNILLLLQILASAGLMAMLILSNIIPLKYLVAAGCVLLVLWGIGLLSQIKRKKRGII GKIYIVLITLCIVAGTYQIGFMTGALGKVTGGNSKVDTMVVAVLADDKAEDIADVSDYTF GVQYKLDKGDMKETVSHINEELDMKIKTKEYKNVNEQAKALHEGEVKAIIYNEGYKEILE EEFHGYSDKVKIVYHYNIKTKLDDVSSNVKVKTEPFSVYVSGIDVYGEITKNSRSDVNII ATVNPKTRQILLVTTPRDYYVEIPGVSHGQKDKLTHAGIYGVDTSIKTLSELYDTEIPFY ARVNFTSLIDIVDELGGVDVMSDYTFRTGKESGAVVKVTKGLNHFNGKQALAFSRERHNL PDGDNQRGKHQQAVLTAMIQKMLSPSMLIKANSIINKVSDGVETNMSQEQLQTLIKMQLN QGGSWNIKSVAAEGTGDKQKCYSSGSMRLYVCQPDEESVAEIKQLIKQVENGEILQDSEA TQ >gi|330403359|gb|ADLB01000019.1| GENE 121 102179 - 102949 975 256 aa, chain - ## HITS:1 COG:CAC0556 KEGG:ns NR:ns ## COG: CAC0556 COG2013 # Protein_GI_number: 15893846 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 251 1 251 252 299 58.0 4e-81 MYKISNFTDNDDIKILNTLGPFTVVEYQRDLSVAPSDAQKAYFCNAMNVRKRQVICELSK ANITLQAGAMQWTVGDVNATTGLKGVGDLLGKAIRGKVSGESAIKPEYKGSGTLVLEPTY KHILLLDLADWNHSIVLDDGLFLACDAQLKHKAVMRSNFSSAVAGNEGLFNLGIVGDGVL 
CLETPCPREELIEITLQEDVLKIDGNMAIAWSGSLDFTVERSGKSLAGSAASGEGLVNVY RGTGKVLLAPVENDVI >gi|330403359|gb|ADLB01000019.1| GENE 122 102968 - 103753 898 261 aa, chain - ## HITS:1 COG:MA0415 KEGG:ns NR:ns ## COG: MA0415 COG1028 # Protein_GI_number: 20089308 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: Dehydrogenases with different specificities (related to short-chain alcohol dehydrogenases) # Organism: Methanosarcina acetivorans str.C2A # 2 257 15 252 256 166 40.0 5e-41 MLVTGATSGIGRAVALRGAKEGATVIAVGRNEERGNAVVEAIENKEGKAVFKKCDVSDKE AVKKLFAEIKEEFGKLDVAVNNAGIVGASKTVEELEDDDWSKVIDANLNSCFYCCREEVK LMKENGGAIVNVSSVAGMRGFPSAAAYVASKHAVSGLTKAVAVDYATKGITCNAVCPAGT DTPLTERSSADIKTRMAEIAAQGKDPMEWLKNSMLSGKTETLQKRNATPEEQAATILFFA SDEAKHITGSIVASDGGFTTY >gi|330403359|gb|ADLB01000019.1| GENE 123 103885 - 110325 6366 2146 aa, chain - ## HITS:1 COG:no KEGG:BLD_1258 NR:ns ## KEGG: BLD_1258 # Name: aprE # Def: subtilisin-like serine protease # Organism: B.longum_DJO10A # Pathway: not_defined # 225 2014 41 1887 1937 771 32.0 0 MKRIAKKFASLILVFAMVCTAFVGYLPSLDVLAAPKKMAHLKSGSGNGNAHFGGGTPEAF VLSDKDNIRGEDLSVQFKVASEQKKTRLRFVTKYVDDTHWGFIAYDGASGWLYQYKNGNQ EQWPSLKGLPAVNKDDVVNISTSYETDGVRIKVENETTGESGTAVANDQNFVGLKDSAGK IGVGAATFGTEYTDIYFADVTAGATKFSDYSTWTLYRKDAAGQVWEPAVEIPDGNEPAGE GRAWIELKGGKNNSGGHAYGNPNVAAPVLLLDNDKKMEASGELSLALKPSDNWGVFHTYV DDNNWLYVGYDSNSKWYYQYKLNGSESYPKISGLPAPVAGEEMQMKISLNNETLSVTVND VTVRVTNQTLINFSKQTAGKGRFGVKTNGQSTISFADVTYNGKNCMEDNWVYCAQRDGQS FKKTYSKLTSLTGRVTEKGKDGLEKATVRVGNKAVKTDASGNYQLDRLEVGKYQMSVSMP GYESYTDEITLKETDNVKNVELQLKAPLDLTKYDTISSDTMKVYIGKQFPVVARYQLLSG GKEVADTYFRGNENELDTIVINGVSVKPEVTVAETTADSRTYAMKIRKDKLNLRMKVKVN VEGNNLTWQVTELKKENGCDKIATIDIPNLNLLTVDAVETGANFAGANTSTTTTATGDEF INFEEGFMPSETKGYLYGFLTNGKLSAGLYSNSEVEGDKRVIRNNGADTMSLTSAPWYHE MGDKNGQNKASKYAAYPVSELPCTKVAIAADKNEDGAIDWNDGALAFRDIMNIPYGSEDM KDLVNYRIVMNFASMAPNPYHTTADNIKKVYLATDGLPQAVMLKGYGNEGHDSANSEYAD 
IAEREGGVEDFQELIKIAHDYNTEIGIHVNAQEIYPEAASFNENMIEGEKSYGWGWLDQS VTIDKLWDLSSQARWKRFVQLYDRINETNFYSRKWPEAVENSKGEVKASKEEIKKDAEKR KDNMDFIYLDVWYQDAWETRQIAKEINSLGWRFSTEFSAEGEYDSTWQHWSTDAVYGGAS SKGYNSDIIRFIRNDQRDSQVLNYPEFGGTADNPLLGGYRLYGFEGWGGDKDYNNYILQT FNQNLPTKFLQHYYITDWENYEEGKSPVENHEKQITLKNDAGDVVVVQRNEKQRSDTNIE RTITLNNKKVLDDVTYLLPWTDNQDGSEKLYHWNLEGGKTTWELPKGWEGLGNVVMYELS DQGRINEKNVAVSGNKVTLDAKPATAYVLVKGSAVKTLKEDFGEYDYVKDPGFNGYAAGE KLSADDWSGDIADESVVVEKANTGDQRLAFNSPLDDVSVTTTISGLKKGTDYVAEVYVEN NSKAKASIEVNAGDKKVSNYIEKSILNNYVKCDQKNGSQMQRMQVSFTAESDTAKVTLSR GYGEGSTYMDDIRIVEKSLNNFREDGVFKQDFETVVQGLYPFVLSSAQGISDPVTHLSQK NAPYTQAGWGDRVIDDVLDGEWSLKHHGSNKGIIYQTLPQNFRFEPGKVYTVEFDYQSGP DKAYAMVVGDGTNYTAPAADEYLVQARGKNAHVKMQVVGSGSGQTWIGLYANEPSGNTKL AETDFVLDNLVIKEEKNASSVTLSTTELYKGETAKIYGSNLDKITWETSNDKVATVDKKA GVVKAVGQGEATLTAKLPGDKEVTFEITVLGGVSTDVPREELEGMTSTANTEQASGEPAG SGVASAATDGNSSTYWHSQWSGFTVSKENPAILTVDLGKEMSIGGFKFQQRPGTNNGIVY KYRYEVLDASGKTVASANTISLPDSQRAGGQWITNEFDSNVQAKQIKIYVEEGQGNFAAI AEVVPLRIQRVAESVTLKDVTVKANTKVQLQPEHEEGTIVKGLVWSSSNKDIVKVNQNGV VIGLEKGTATITVSNAAGLKAECTVTVTLNTGELENLVEEYKNLDLTKYEDGKAKDTFKA TLKEAESLIGKATTQKEIDDMTKALKDAKAALKVLDYEELENLVNEYENLDLSKYEDGEA KDTFKATLKEAKALIGKATTQKEINDMVTALKEAKDNLKVIEKPEPPKVDKSKLEKFYKE CLAYYKEADYSKENWKKYQKALADAKAVLEDEDATQEEVNKALKALIEITQLMNKENAGF SNPSNPPKAPEVPKTGDTAPWLPLTMLLILAGGTAIIVVRRRKRAK >gi|330403359|gb|ADLB01000019.1| GENE 124 110535 - 111902 1084 455 aa, chain - ## HITS:1 COG:CAC3595 KEGG:ns NR:ns ## COG: CAC3595 COG2509 # Protein_GI_number: 15896829 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Clostridium acetobutylicum # 2 454 3 455 457 595 62.0 1e-170 MYDVLIIGAGPGGIFTAYELMEKKPDLKIAVFEMGHELTKRKCPIDGDKIKSCIHCKSCS IMSGFGGAGAFSDGKYNITNDFGGTLHEYIGKKNALELMDYVDKVNLAFGGEGTKLYSTA GSNIKNSCMQNGLHLLDASVRHLGTDINYIVLEHLYNYLKENIEFHFDCFIDKVEKLENG YRIYSNDSFYEGKECVISAGRSGSKWMEQVCADLDINTNSNRVDIGVRVELPAGIFAHLT DELYESKIVYRTSKYEDLVRTFCMNPKGEVVNENTNGIVTVNGHSYEDPEKQTNNTNFAL 
LVAKHFSEPFKDSNGYGESIARLSNMLGGGVIVQRFGDLIRGRRSTEERISESFTVPTLN AAAGDLSLVLPKRILDGIIEMIYALDKIAPGTANDDTLLYGVEVKFYNMEVELDNNLMTC HEGLYVIGDGSGITHSLSHASASGIYVARQIIEKL >gi|330403359|gb|ADLB01000019.1| GENE 125 112181 - 112774 546 197 aa, chain - ## HITS:1 COG:no KEGG:Amet_1784 NR:ns ## KEGG: Amet_1784 # Name: not_defined # Def: hypothetical protein # Organism: A.metalliredigens # Pathway: not_defined # 12 195 6 186 190 103 32.0 4e-21 MITISKKEVHKFLDEHPDIPFSAAKFEYSFYLEPEAVEYANAISEKMFVESEEDLYSKKM IEQAATAEELLKLMRKSLSGGNRSRLRKKVLEYEKEIMPLIKEKTMKSGQDIFIENTLYF FLHCEGNCCDWILKEYSNIWNEYLKSMLCLVLGFRGEVEMIPFLMKETVRLERMYPEETY AQAPILAIQELAVRFLN >gi|330403359|gb|ADLB01000019.1| GENE 126 112761 - 113252 327 163 aa, chain - ## HITS:1 COG:no KEGG:BcerKBAB4_5408 NR:ns ## KEGG: BcerKBAB4_5408 # Name: not_defined # Def: hypothetical protein # Organism: B.weihenstephanensis # Pathway: not_defined # 2 160 8 171 190 66 33.0 3e-10 MKFYNIKDEYINYLKKYESKVADNKKGKRPYVGVVLEIDGIKYYTPFTSPKEKHRKMKNT KDFRKINQGIYGAINFNNMIPVVESALVLIDIDELEDSKYQRLLQNQYKCIKADREQIEL TAKRLRDTLFKKDEELNGNDKRIKERCCDLPLLEEVAKHYDNH >gi|330403359|gb|ADLB01000019.1| GENE 127 113556 - 113948 604 130 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238925338|ref|YP_002938855.1| 30S ribosomal protein S9 [Eubacterium rectale ATCC 33656] # 1 130 1 130 130 237 90 2e-61 MSKVKFYGTGRRKKSIARVYLVPGTGKITINKRDIDEYLGLETLKVVVRQPLVATETADK FDVLVNVHGGGYTGQAGAIRHGIARALLQADPEYRPVLKKAGYLTRDPRMKERKKYGLKA ARRAPQFSKR >gi|330403359|gb|ADLB01000019.1| GENE 128 113975 - 114403 627 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|239623392|ref|ZP_04666423.1| ribosomal protein L13 [Clostridiales bacterium 1_7_47_FAA] # 1 142 1 142 142 246 81 5e-64 MKTYMANPDKIERKWYVVDAEGCTLGRLASEVAKVLRGKNKPEYTPHVDTGDYVIVVNAD KISVTGKKLDQKVYYHHSDYVGGMKETTLREMMAKKPEKVVELAVKGMLPKGPLGRAMIK KLHVYAGPEHANQAQKPEVLTF >gi|330403359|gb|ADLB01000019.1| GENE 129 114574 - 115677 1363 367 aa, chain - ## HITS:1 COG:BS_yaaN KEGG:ns NR:ns ## COG: BS_yaaN COG3853 # Protein_GI_number: 16077094 # Func_class: P 
Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in tellurite resistance # Organism: Bacillus subtilis # 12 367 25 384 386 230 40.0 3e-60 MGQESPVLTFEPFAEENHQVEEQPKMEEVVLTSEEEKMVADFSSKIDLTNANMILQYGAG AQKKMADFSENALENVKTKDLGEVGKMLTDVVSELKDIETDEDEKGFFGFFKKNTNKLAN MKAKFDKAETNVERICKALENHQIQLLKDIALLDKMYELNTTYFKELSMYIEAGKRKIKE VQERELPALRNKAQLSGLPEDVQEANDLAGRCERFEKKIHDLELTRAISLQMAPQIRLVQ GNDTLMSEKIQSTLVNTIPLWKSQMVLAIGVENSSRAAEAQREVTDMTNELLRKNAEKLK MATIDTAKEAERGIVDMETLKATNESLISTLDEVMKIQKEGRVKRRTAEAELNRIEGELK QKLLEVR >gi|330403359|gb|ADLB01000019.1| GENE 130 115690 - 116850 1002 386 aa, chain - ## HITS:1 COG:no KEGG:Cbei_0753 NR:ns ## KEGG: Cbei_0753 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 386 1 398 399 214 35.0 6e-54 MDNRDFSNIGEQIRRTVEDAMNSMNYQQLNQKINQTVDMAMNEAKRHMNPHVHVQPPKQE QKRQAKPKVEIRLKEKGRYSSIPLIILGGAGILVFIGFLLTSVIGFLISGSIITKTFSSV VLAIGLIVCILSTFFLIKGILGRKRYRCFKKYVEILDGREFCSVKEFAEKTRTPEKKVCR NIKMMIRKEMFVDGFLDASQTCFTATEEAYRQLQQAEESRRQREEEQKRRQEKMQEEAEH ISDSAVAEMIRKGNEYIEVIRLANDAIPGEVISAKLDRLENVIRKIFDSVKQHPEQMPEM DKFMEYYLPTTKKLVDAYKEFDALSIKGENVTKSMNEIENTLDTISNAFEQLLDDLFQDT AFDISADISVLQTMLAREGYKEKDFK >gi|330403359|gb|ADLB01000019.1| GENE 131 116840 - 119215 1972 791 aa, chain - ## HITS:1 COG:L168650 KEGG:ns NR:ns ## COG: L168650 COG0474 # Protein_GI_number: 15672557 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Lactococcus lactis # 9 769 5 772 775 689 51.0 0 MEQIRTQTGLSEREVEERIQQGKQNTPVEAPSKSVKEIVLTNIFTYFNLIFGVIAGLLIL VGAFRELTFLPIIIANTLVGIIQECRSKKILDKLAVLHAPKANVIRGGKKQVIRAEELVL DDVVIFTAGNQIPADAVVISGEAQVNESLITGEADEITKVQGDSLLSGSYIVSGRCLARL TKVGADSYVSKLTLEAKAVKEGEQSEMIRSLNKLVKAVGIIIIPIGIMMFYQQYILSGNS LRGSVTSMTAAIIGMIPEGLYLLASVALVVSVMKLAKKKVLVHDMKCIETLARVNVLCVD KTGTITENVMKVSEILSLGEDEKELEEQIGDFVFNMEKDNTTMEALKNHFHMSMGRTADK ITTFSSEYKYSSAVFGGVTYLLGAPEFVLREDFETYRKQIEEQSEKGYRVLVFAKYNGNA 
DGKALTEKANPLGLILLANPIRKNAKETFRYFAEQGVDIKVISGDNPMTVSKVALEAGIA NAENYVDARQLQNEEDIAEAVRKYTVFGRVTPKQKRMFVQALKKNGKTVAMTGDGVNDVL ALKDADCSIAMASGSDVASQASQLVLLESDFAKMPSVVAEGRQVVNNIERSASLFLVKNI FSFLLSALSLIFMITYPLGPAQVSLVSMMTIGTPAFLLAMEPNKNLIKGRFLTNVLLKAL PAGLTDVIMVGLLILFGSVFGVSAEEISITATLLLAIVGLQILFELSKPMNIFRWIVWFG MLIGLLICVLFMGELFGISQVTLKSGLLLLVFAFATMPVFNHLCRFVKKAEDFFIRINEK KKKRKADKYGQ >gi|330403359|gb|ADLB01000019.1| GENE 132 119446 - 119637 153 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEQLNNTFLKYQRLIITTCIIIFLIIILVFLYPLARYHYEYPTDRNAVIKTDRFTGTVDT VIY >gi|330403359|gb|ADLB01000019.1| GENE 133 119695 - 120918 836 407 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 10 377 16 388 418 326 44 3e-88 MKIYDELKARGLIAQVTDEELISNLVNEGKATFYIGFDPTADSLHVGHFMALCLMKRLQM AGNKPIALLGGGTGYIGDPSGRSDMRSMMTPEQIQHNCDCFKKQMSKFIDFSEGKALMIN NADWLLGLNYVELLREVGPHFSVNRMLTAECYKQRMEKGLSFLEFNYMIMQAYDFYALYQ NYGCNLQFGGDDQWSNMLAGTELIRRKLGKDASAMTITLLLNSEGKKMGKTQSGAVWLDP NKTSPFEFYQYWRNVADNDVLKCIRMLTFLPLEEIDEMDKWEGSQLNKAKEILAFELTKL VHGEEEALKAQESSRELFTSGAAANMPTAELEEGDFVEGKIDILTMLVKSGLVPSKSEAR RAVQQGGVAVDGEKVEDIKKEYAKDDLSGEGVVLRRGKKNFRKVIVK >gi|330403359|gb|ADLB01000019.1| GENE 134 121060 - 121131 57 23 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKLVGCNGVALVIVLEVLIDTC >gi|330403359|gb|ADLB01000019.1| GENE 135 121441 - 122169 374 242 aa, chain + ## HITS:1 COG:no KEGG:LGG_02944 NR:ns ## KEGG: LGG_02944 # Name: tnp # Def: integrase # Organism: L.rhamnosus # Pathway: not_defined # 1 242 1 245 478 187 43.0 3e-46 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKAAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYACELLAKNDGIFISPSTLTKIMYENFIPSPRTTKT VRKRLAKELRTQQKYVSTKKKQDELQAAIVTVETPHSRRPRCVYFGEMLQMDASVHLWFG NAKTNLHIAIDDSTSRIVGAFFDEQESLNGYYNVFHQILTTYGIPAMFYTDRRTVFEYRN KK Prediction of potential genes in microbial genomes Time: Tue May 24 21:31:28 2011 Seq name: gi|330403074|gb|ADLB01000020.1| Lachnospiraceae bacterium 2_1_46FAA cont1.20, whole 
genome shotgun sequence Length of sequence - 3640 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 340 - 375 6.7 1 1 Op 1 . - CDS 428 - 1795 1277 ## Cphy_3620 sodium/hydrogen exchanger 2 1 Op 2 . - CDS 1813 - 2262 620 ## Cphy_3620 sodium/hydrogen exchanger 3 1 Op 3 . - CDS 2265 - 2897 352 ## COG1309 Transcriptional regulator - Prom 2920 - 2979 6.2 - Term 2959 - 2989 5.0 4 2 Tu 1 . - CDS 3100 - 3570 359 ## COG1943 Transposase and inactivated derivatives Predicted protein(s) >gi|330403074|gb|ADLB01000020.1| GENE 1 428 - 1795 1277 455 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3620 NR:ns ## KEGG: Cphy_3620 # Name: not_defined # Def: sodium/hydrogen exchanger # Organism: C.phytofermentans # Pathway: not_defined # 1 249 157 405 412 357 76.0 7e-97 MAALDDIVGCIVFFTTIAIVAGNLSAGSLPAYMIALVVVLPLIIGAVCGFISGFILKRKN SDAVTMLLLVITILFTSAAGFYFNNQVMPKPVLNFMLIGMAFSATFANMVSEERLEQIMG VFNPVLGIAMIIVILNLGAPLDYHLIMGAGIFTAVYILARGFGKYFGAYFGAAITKSPQT VKKYLGFTLLPHSGVSLVFTGIAVSVLAQPAPECAKIVQGTIAAAAVINEIIAVIIAKKG FEWAGEFNKNSTEDSAKAEEKNPKIITISRQYGSGGRDIAMQLSKKLNIPFYDKEIIELA AESSSLDRSLFENYEKNKLNSVLVDLANSVSKDSSIDDRIFAHHAKVIQEIVNKGSCIIV GRCADYLLKDRKDVLKIFIYGDMDTRKKRIVEVYKEASEEAADLIKKTDKRRSAYYNHYL GNVFGNAENYDICLNSSVLGIEECVRIIERLYAEV >gi|330403074|gb|ADLB01000020.1| GENE 2 1813 - 2262 620 149 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3620 NR:ns ## KEGG: Cphy_3620 # Name: not_defined # Def: sodium/hydrogen exchanger # Organism: C.phytofermentans # Pathway: not_defined # 13 128 13 128 412 172 79.0 3e-42 MFLLIVRLLITIVLAFLVGKLVSKIKLPAILGWLIAGMLLGPHAFSVINQEILDAGWYQI LVHILECAVGLMIGTELVWNKIKKSGKAIIITTLTQSLGTFLLVSAVFGIVFYITDIPIY LSFIFGGIALATARRRLCPLSVNLKQTVL >gi|330403074|gb|ADLB01000020.1| GENE 3 2265 - 2897 352 210 aa, chain - ## HITS:1 COG:lin2076 KEGG:ns NR:ns ## COG: lin2076 COG1309 # Protein_GI_number: 16801142 # Func_class: K Transcription # Function: Transcriptional regulator # 
Organism: Listeria innocua # 13 209 7 206 206 60 27.0 3e-09 MKKEEKTRRTCEKIIQAATVEFGMKSYETASLNTICSENNISKGLIYHNFQNKDELYLIC VKRCYDELTSHMQKTVFESDNVRENLEKILKERQKFFEEYPHYKNIFFNSILLPPGHLIK KLKEIRSEYDNYLKRQYIDLVGEMNLRQGVSVERATQYLMTFQEMYNGYFRERYGENKDF ETLVKTHETKLSELLDIMLYGIAVKDKGEE >gi|330403074|gb|ADLB01000020.1| GENE 4 3100 - 3570 359 156 aa, chain - ## HITS:1 COG:CAC3531 KEGG:ns NR:ns ## COG: CAC3531 COG1943 # Protein_GI_number: 15896768 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 156 1 156 157 239 71.0 2e-63 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRHFWCRGYYVDTVGKNAKK IQEYIVNQLQEDLEYDQMTLKEYIDPFTGEPVKPNK Prediction of potential genes in microbial genomes Time: Tue May 24 21:33:07 2011 Seq name: gi|330402635|gb|ADLB01000021.1| Lachnospiraceae bacterium 2_1_46FAA cont1.21, whole genome shotgun sequence Length of sequence - 292849 bp Number of predicted genes - 278, with homology - 241 Number of transcription units - 110, operones - 65 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 309 206 ## PROTEIN SUPPORTED gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 - Prom 350 - 409 7.0 - Term 486 - 535 8.8 2 2 Op 1 . - CDS 538 - 1044 596 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 3 2 Op 2 12/0.000 - CDS 1065 - 2171 1513 ## COG1820 N-acetylglucosamine-6-phosphate deacetylase 4 2 Op 3 . - CDS 2183 - 2914 1034 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase - Prom 2984 - 3043 7.6 5 3 Op 1 . - CDS 3068 - 5233 1974 ## COG0550 Topoisomerase IA 6 3 Op 2 . 
- CDS 5262 - 6020 770 ## COG1924 Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) 7 3 Op 3 8/0.000 - CDS 6033 - 6770 575 ## COG0101 Pseudouridylate synthase 8 3 Op 4 34/0.000 - CDS 6866 - 7690 625 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 9 3 Op 5 15/0.000 - CDS 7687 - 8538 442 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 10 3 Op 6 . - CDS 8514 - 9365 590 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 11 3 Op 7 . - CDS 9418 - 10350 892 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 12 3 Op 8 . - CDS 10335 - 10514 72 ## - Prom 10542 - 10601 4.9 + Prom 10399 - 10458 6.0 13 4 Op 1 . + CDS 10493 - 11011 591 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family + Term 11013 - 11049 5.8 14 4 Op 2 . + CDS 11057 - 11605 535 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 15 4 Op 3 . + CDS 11598 - 12281 343 ## gi|225027075|ref|ZP_03716267.1| hypothetical protein EUBHAL_01331 16 4 Op 4 . + CDS 12286 - 12600 225 ## + Prom 12603 - 12662 3.5 17 5 Tu 1 . + CDS 12753 - 13019 163 ## + Term 13097 - 13141 5.1 + Prom 13037 - 13096 6.7 18 6 Op 1 . + CDS 13239 - 13400 247 ## 19 6 Op 2 . + CDS 13436 - 13687 226 ## 20 6 Op 3 . + CDS 13723 - 13980 306 ## 21 6 Op 4 . + CDS 13995 - 14309 306 ## + Term 14312 - 14344 1.0 22 7 Tu 1 . - CDS 14333 - 15070 736 ## COG3022 Uncharacterized protein conserved in bacteria - Prom 15090 - 15149 8.7 + Prom 15009 - 15068 8.8 23 8 Op 1 . + CDS 15307 - 15558 277 ## Cbei_4440 hypothetical protein 24 8 Op 2 . + CDS 15568 - 16605 742 ## COG0502 Biotin synthase and related enzymes 25 8 Op 3 . + CDS 16621 - 18042 1461 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 26 8 Op 4 . + CDS 18053 - 19252 701 ## COG1160 Predicted GTPases + Term 19430 - 19473 -0.8 27 9 Tu 1 . 
- CDS 19290 - 19559 292 ## COG3326 Predicted membrane protein - Prom 19585 - 19644 4.5 + Prom 19535 - 19594 14.7 28 10 Op 1 9/0.000 + CDS 19626 - 20294 845 ## COG1760 L-serine deaminase 29 10 Op 2 . + CDS 20307 - 21179 902 ## COG1760 L-serine deaminase + Term 21194 - 21239 1.4 - Term 21179 - 21230 9.4 30 11 Op 1 . - CDS 21233 - 22063 726 ## gi|260588003|ref|ZP_05853916.1| conserved hypothetical protein 31 11 Op 2 . - CDS 22060 - 22779 618 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 32 11 Op 3 . - CDS 22779 - 23321 453 ## COG0622 Predicted phosphoesterase 33 11 Op 4 . - CDS 23402 - 23920 456 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 34 11 Op 5 . - CDS 23921 - 24079 84 ## - Prom 24102 - 24161 5.7 35 12 Tu 1 . - CDS 24327 - 25940 1498 ## COG1961 Site-specific recombinases, DNA invertase Pin homologs - Prom 25989 - 26048 7.6 36 13 Tu 1 . - CDS 27528 - 27692 165 ## gi|255280646|ref|ZP_05345201.1| transcriptional regulator, Cro/CI family - Prom 27733 - 27792 1.9 37 14 Tu 1 . - CDS 27794 - 28741 652 ## COG2946 Putative phage replication protein RstA - Prom 28767 - 28826 2.1 - Term 28752 - 28792 5.0 38 15 Op 1 . - CDS 28829 - 29803 373 ## Dde_0201 hypothetical protein - Term 29861 - 29905 -0.4 39 15 Op 2 . - CDS 29921 - 30025 124 ## - Prom 30050 - 30109 10.0 - Term 30241 - 30279 -0.3 40 16 Tu 1 . - CDS 30308 - 30838 265 ## gi|160914642|ref|ZP_02076856.1| hypothetical protein EUBDOL_00649 - Prom 30869 - 30928 7.8 - Term 31049 - 31087 -0.8 41 17 Tu 1 . - CDS 31249 - 31770 207 ## - Prom 31790 - 31849 9.2 42 18 Tu 1 . - CDS 31989 - 32075 153 ## - Prom 32100 - 32159 9.0 + Prom 32056 - 32115 6.3 43 19 Tu 1 . + CDS 32227 - 32652 236 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 32544 - 32576 1.0 44 20 Op 1 . - CDS 32654 - 33082 379 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 45 20 Op 2 . 
- CDS 33084 - 33596 156 ## gi|160916275|ref|ZP_02078482.1| hypothetical protein EUBDOL_02302 46 20 Op 3 . - CDS 33599 - 34156 576 ## PROTEIN SUPPORTED gi|229859876|ref|ZP_04479533.1| acetyltransferase, ribosomal protein N-acetylase 47 20 Op 4 . - CDS 34177 - 34557 412 ## COG0346 Lactoylglutathione lyase and related lyases - Prom 34586 - 34645 6.5 + Prom 34791 - 34850 3.3 48 21 Tu 1 . + CDS 34913 - 35149 252 ## gi|260589982|ref|ZP_05855895.1| conserved hypothetical protein - Term 35206 - 35248 2.5 49 22 Op 1 2/0.000 - CDS 35302 - 35739 464 ## COG1846 Transcriptional regulators 50 22 Op 2 . - CDS 35775 - 36854 1120 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 51 22 Op 3 . - CDS 36867 - 38558 1279 ## COG0825 Acetyl-CoA carboxylase alpha subunit 52 22 Op 4 4/0.000 - CDS 38564 - 39829 889 ## COG0439 Biotin carboxylase 53 22 Op 5 4/0.000 - CDS 39839 - 40267 611 ## COG0764 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases 54 22 Op 6 4/0.000 - CDS 40283 - 40708 621 ## COG0511 Biotin carboxyl carrier protein 55 22 Op 7 11/0.000 - CDS 40712 - 41956 1564 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 56 22 Op 8 26/0.000 - CDS 41968 - 42708 243 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 57 22 Op 9 3/0.000 - CDS 42702 - 43637 864 ## COG0331 (acyl-carrier-protein) S-malonyltransferase 58 22 Op 10 4/0.000 - CDS 43630 - 44568 1047 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase 59 22 Op 11 6/0.000 - CDS 44589 - 44807 462 ## COG0236 Acyl carrier protein 60 22 Op 12 . - CDS 44862 - 45785 569 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III - Prom 45828 - 45887 8.8 + Prom 45791 - 45850 7.5 61 23 Tu 1 . + CDS 45953 - 46945 663 ## COG4129 Predicted membrane protein 62 24 Op 1 . - CDS 46965 - 47399 600 ## COG1585 Membrane protein implicated in regulation of membrane protease activity 63 24 Op 2 . - CDS 47444 - 47602 104 ## - Prom 47671 - 47730 8.1 + Prom 47541 - 47600 6.1 64 25 Tu 1 . 
+ CDS 47718 - 47951 239 ## DSY0564 hypothetical protein + Term 48087 - 48136 0.4 - Term 48081 - 48118 5.5 65 26 Op 1 . - CDS 48168 - 49586 1474 ## gi|153815962|ref|ZP_01968630.1| hypothetical protein RUMTOR_02207 66 26 Op 2 . - CDS 49619 - 50146 486 ## gi|153815961|ref|ZP_01968629.1| hypothetical protein RUMTOR_02206 67 26 Op 3 . - CDS 50184 - 50525 224 ## Cphy_3526 hypothetical protein 68 26 Op 4 . - CDS 50533 - 51360 527 ## gi|153815959|ref|ZP_01968627.1| hypothetical protein RUMTOR_02204 69 26 Op 5 . - CDS 51375 - 51926 361 ## gi|153815958|ref|ZP_01968626.1| hypothetical protein RUMTOR_02203 - Prom 51964 - 52023 11.6 + Prom 51945 - 52004 9.8 70 27 Tu 1 . + CDS 52029 - 52349 267 ## COG1396 Predicted transcriptional regulators - TRNA 52440 - 52519 65.2 # Leu TAG 0 0 - TRNA 52584 - 52656 86.6 # Lys TTT 0 0 - TRNA 52666 - 52737 69.0 # Gln TTG 0 0 - TRNA 52749 - 52822 64.8 # His GTG 0 0 - TRNA 52830 - 52903 80.2 # Arg TCT 0 0 + Prom 52858 - 52917 2.8 71 28 Tu 1 . + CDS 52938 - 53159 177 ## - TRNA 52951 - 53021 75.8 # Gly TCC 0 0 - TRNA 53026 - 53100 86.4 # Pro TGG 0 0 - Term 53212 - 53261 4.3 72 29 Op 1 7/0.000 - CDS 53283 - 53762 527 ## COG0622 Predicted phosphoesterase 73 29 Op 2 . - CDS 53767 - 54366 368 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 74 29 Op 3 . - CDS 54359 - 55522 1057 ## COG0116 Predicted N6-adenine-specific DNA methylase - Prom 55543 - 55602 5.8 - Term 55568 - 55609 10.4 75 30 Tu 1 . - CDS 55618 - 55773 196 ## - Prom 55933 - 55992 10.1 - Term 56053 - 56097 12.6 76 31 Op 1 . - CDS 56102 - 56539 544 ## Cphy_0363 hypothetical protein 77 31 Op 2 . - CDS 56559 - 58670 2512 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 58702 - 58761 4.9 - Term 58746 - 58787 7.4 78 32 Tu 1 . - CDS 58803 - 59666 1215 ## COG0191 Fructose/tagatose bisphosphate aldolase - Prom 59702 - 59761 5.8 79 33 Op 1 . - CDS 59778 - 65327 4875 ## SAV_7268 hypothetical protein 80 33 Op 2 . 
- CDS 65302 - 65382 102 ## - Prom 65411 - 65470 8.0 + Prom 65393 - 65452 9.5 81 34 Tu 1 . + CDS 65517 - 65678 116 ## + Term 65758 - 65813 6.0 - TRNA 65679 - 65750 63.5 # Glu CTC 0 0 82 35 Op 1 . - CDS 66037 - 67407 1567 ## COG0733 Na+-dependent transporters of the SNF family 83 35 Op 2 . - CDS 67422 - 68357 882 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase - Term 68369 - 68402 3.1 84 36 Op 1 2/0.000 - CDS 68410 - 68667 437 ## COG1925 Phosphotransferase system, HPr-related proteins 85 36 Op 2 . - CDS 68660 - 69622 719 ## COG1481 Uncharacterized protein conserved in bacteria 86 36 Op 3 1/0.071 - CDS 69636 - 70496 828 ## COG1660 Predicted P-loop-containing kinase 87 36 Op 4 . - CDS 70498 - 71409 954 ## COG0812 UDP-N-acetylmuramate dehydrogenase - Prom 71600 - 71659 6.7 + Prom 71476 - 71535 9.4 88 37 Tu 1 . + CDS 71568 - 72851 1383 ## COG1253 Hemolysins and related proteins containing CBS domains + Term 72877 - 72932 0.1 - Term 72783 - 72838 0.5 89 38 Op 1 38/0.000 - CDS 72921 - 73766 584 ## COG0395 ABC-type sugar transport system, permease component 90 38 Op 2 35/0.000 - CDS 73766 - 74644 916 ## COG1175 ABC-type sugar transport systems, permease components 91 38 Op 3 . - CDS 74646 - 75902 1730 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 75924 - 75983 6.0 92 39 Op 1 . - CDS 76029 - 76721 881 ## Cphy_2059 hypothetical protein 93 39 Op 2 . - CDS 76740 - 77501 1040 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 77590 - 77649 7.9 - Term 77629 - 77670 6.6 94 40 Op 1 . - CDS 77674 - 78114 576 ## COG0698 Ribose 5-phosphate isomerase RpiB - Prom 78143 - 78202 3.2 95 40 Op 2 35/0.000 - CDS 78212 - 78994 279 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 96 40 Op 3 33/0.000 - CDS 78991 - 79977 1133 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 97 40 Op 4 . 
- CDS 79974 - 80864 984 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 98 40 Op 5 . - CDS 80868 - 81563 817 ## gi|210614877|ref|ZP_03290376.1| hypothetical protein CLONEX_02590 99 40 Op 6 . - CDS 81566 - 83164 1563 ## LSEI_0291 lacto-N-biosidase - Prom 83301 - 83360 7.4 + Prom 83107 - 83166 5.1 100 41 Tu 1 . + CDS 83186 - 83263 74 ## 101 42 Tu 1 . - CDS 83388 - 84158 708 ## gi|210614875|ref|ZP_03290374.1| hypothetical protein CLONEX_02588 - Prom 84402 - 84461 7.0 + Prom 84236 - 84295 6.0 102 43 Tu 1 . + CDS 84330 - 85103 735 ## COG0561 Predicted hydrolases of the HAD superfamily - Term 85098 - 85147 5.6 103 44 Op 1 . - CDS 85149 - 85781 498 ## COG0546 Predicted phosphatases 104 44 Op 2 . - CDS 85784 - 87571 1830 ## COG0006 Xaa-Pro aminopeptidase - Term 87599 - 87634 0.4 105 44 Op 3 . - CDS 87649 - 89040 1612 ## COG1362 Aspartyl aminopeptidase - Prom 89069 - 89128 9.7 - Term 89125 - 89168 9.4 106 45 Op 1 . - CDS 89173 - 89355 267 ## gi|167759308|ref|ZP_02431435.1| hypothetical protein CLOSCI_01655 107 45 Op 2 . - CDS 89403 - 94286 5038 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 94412 - 94471 6.1 + Prom 94368 - 94427 10.5 108 46 Tu 1 . + CDS 94520 - 95641 645 ## FMG_1520 putative multidrug-efflux transporter + Term 95649 - 95692 0.1 - Term 95625 - 95692 16.2 109 47 Op 1 . - CDS 95694 - 96725 1310 ## COG2855 Predicted membrane protein 110 47 Op 2 . - CDS 96736 - 97635 371 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 - Prom 97707 - 97766 9.5 + Prom 97700 - 97759 10.0 111 48 Op 1 3/0.000 + CDS 97827 - 98669 425 ## COG0583 Transcriptional regulator 112 48 Op 2 . + CDS 98659 - 99360 461 ## COG0500 SAM-dependent methyltransferases - Term 99367 - 99398 2.5 113 49 Op 1 . - CDS 99592 - 99900 326 ## gi|291549834|emb|CBL26096.1| hypothetical protein 114 49 Op 2 . - CDS 99875 - 102013 1268 ## Nther_1522 hypothetical protein 115 49 Op 3 . - CDS 102046 - 102963 697 ## Pfl01_3049 hypothetical protein 116 49 Op 4 . 
- CDS 102991 - 103869 764 ## MHO_0360 cytosine-specific DNA methyltransferase/type II site-specific deoxyribonuclease 117 49 Op 5 . - CDS 103870 - 105102 786 ## COG0270 Site-specific DNA methylase - Prom 105132 - 105191 5.2 + Prom 105056 - 105115 6.1 118 50 Tu 1 . + CDS 105284 - 105742 95 ## COG3727 DNA G:T-mismatch repair endonuclease + Term 105757 - 105799 2.5 119 51 Op 1 . - CDS 105792 - 106166 319 ## CDR20291_1765 hypothetical protein 120 51 Op 2 . - CDS 106171 - 107586 1398 ## EUBREC_0392 hypothetical protein 121 51 Op 3 . - CDS 107662 - 107934 129 ## 122 51 Op 4 . - CDS 107829 - 108056 316 ## gi|253579252|ref|ZP_04856522.1| conserved hypothetical protein - Prom 108105 - 108164 4.2 123 52 Op 1 1/0.071 - CDS 108174 - 109550 1415 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 124 52 Op 2 . - CDS 109456 - 110130 488 ## COG0358 DNA primase (bacterial type) 125 53 Op 1 . - CDS 110231 - 110305 83 ## 126 53 Op 2 . - CDS 110318 - 110488 310 ## 127 53 Op 3 . - CDS 110508 - 111560 982 ## COG0582 Integrase 128 53 Op 4 . - CDS 111574 - 111765 293 ## EUBREC_0388 hypothetical protein - Prom 111865 - 111924 7.8 + Prom 111811 - 111870 8.3 129 54 Op 1 . + CDS 111968 - 112501 471 ## EUBELI_20054 hypothetical protein 130 54 Op 2 . + CDS 112530 - 112781 152 ## 131 54 Op 3 . + CDS 112785 - 113330 309 ## DMR_24760 hypothetical membrane protein + Term 113365 - 113405 4.2 - Term 113441 - 113478 7.1 132 55 Op 1 30/0.000 - CDS 113547 - 114740 1487 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 114813 - 114872 6.3 133 55 Op 2 51/0.000 - CDS 114891 - 117008 2334 ## COG0480 Translation elongation factors (GTPases) 134 55 Op 3 56/0.000 - CDS 117026 - 117496 736 ## PROTEIN SUPPORTED gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 - Prom 117540 - 117599 4.5 135 55 Op 4 . - CDS 117693 - 118112 665 ## PROTEIN SUPPORTED gi|160878393|ref|YP_001557361.1| 30S ribosomal protein S12 - Prom 118204 - 118263 5.8 136 56 Op 1 . 
- CDS 118281 - 118412 136 ## gi|255281969|ref|ZP_05346524.1| two-component response regulator 137 56 Op 2 2/0.000 - CDS 118357 - 118998 682 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 138 56 Op 3 1/0.071 - CDS 119095 - 120378 1109 ## COG1653 ABC-type sugar transport system, periplasmic component 139 56 Op 4 . - CDS 120357 - 122213 1318 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain - Prom 122234 - 122293 2.1 - Term 122238 - 122289 -1.0 140 56 Op 5 . - CDS 122295 - 123716 1298 ## COG2195 Di- and tripeptidases - Prom 123883 - 123942 8.0 141 57 Tu 1 . + CDS 123953 - 124546 394 ## TTE1304 CheY-like domain-containing protein + Term 124558 - 124623 17.9 - Term 124546 - 124611 18.4 142 58 Op 1 . - CDS 124619 - 125704 1313 ## COG0006 Xaa-Pro aminopeptidase 143 58 Op 2 . - CDS 125771 - 125869 91 ## 144 58 Op 3 . - CDS 125856 - 125957 154 ## 145 58 Op 4 . - CDS 125998 - 126282 335 ## Cphy_1196 hypothetical protein 146 58 Op 5 . - CDS 126294 - 126887 494 ## EUBREC_1586 hypothetical protein - Prom 126919 - 126978 13.4 - Term 127053 - 127098 10.1 147 59 Tu 1 . - CDS 127122 - 127496 387 ## - Prom 127520 - 127579 4.9 148 60 Op 1 8/0.000 - CDS 127597 - 129231 209 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 149 60 Op 2 . - CDS 129224 - 130963 185 ## PROTEIN SUPPORTED gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 150 60 Op 3 . - CDS 130953 - 131156 185 ## - Prom 131177 - 131236 7.7 - Term 131230 - 131277 10.4 151 61 Tu 1 . - CDS 131287 - 132123 1066 ## COG1307 Uncharacterized protein conserved in bacteria - Prom 132161 - 132220 6.3 152 62 Op 1 . - CDS 132245 - 132382 223 ## 153 62 Op 2 22/0.000 - CDS 132397 - 134556 2705 ## COG0370 Fe2+ transport system protein B 154 62 Op 3 . - CDS 134625 - 134846 368 ## COG1918 Fe2+ transport system protein A 155 62 Op 4 . 
- CDS 134861 - 135070 335 ## EUBELI_00572 hypothetical protein - Prom 135113 - 135172 5.3 + Prom 135140 - 135199 8.4 156 63 Tu 1 . + CDS 135244 - 135609 485 ## COG1321 Mn-dependent transcriptional regulator + Prom 135638 - 135697 5.9 157 64 Tu 1 . + CDS 135725 - 136042 64 ## - Term 135981 - 136023 6.1 158 65 Op 1 . - CDS 136030 - 136644 630 ## gi|210633344|ref|ZP_03297757.1| hypothetical protein COLSTE_01670 159 65 Op 2 . - CDS 136631 - 137233 487 ## gi|295106189|emb|CBL03732.1| hypothetical protein 160 65 Op 3 . - CDS 137248 - 142257 4979 ## Ctha_1865 filamentous haemagglutinin family outer membrane protein 161 65 Op 4 . - CDS 142257 - 142832 444 ## gi|210633347|ref|ZP_03297760.1| hypothetical protein COLSTE_01673 162 65 Op 5 . - CDS 142893 - 143591 774 ## gi|210633348|ref|ZP_03297761.1| hypothetical protein COLSTE_01674 163 65 Op 6 . - CDS 143566 - 144114 364 ## COG0681 Signal peptidase I - Term 144543 - 144580 2.2 164 66 Op 1 . - CDS 144591 - 145253 648 ## COG3619 Predicted membrane protein 165 66 Op 2 . - CDS 145273 - 146613 406 ## PROTEIN SUPPORTED gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 - Term 146623 - 146655 4.0 166 66 Op 3 58/0.000 - CDS 146663 - 150316 4142 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 167 66 Op 4 . - CDS 150331 - 154161 828 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 154320 - 154379 5.0 - Term 154347 - 154384 3.5 168 67 Tu 1 . 
- CDS 154392 - 160379 7273 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 160470 - 160529 6.8 - Term 160528 - 160565 6.1 169 68 Op 1 44/0.000 - CDS 160589 - 161539 966 ## COG4608 ABC-type oligopeptide transport system, ATPase component 170 68 Op 2 44/0.000 - CDS 161542 - 162540 1206 ## COG0444 ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component 171 68 Op 3 49/0.000 - CDS 162547 - 163548 1086 ## COG1173 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 172 68 Op 4 21/0.000 - CDS 163553 - 164506 1130 ## COG0601 ABC-type dipeptide/oligopeptide/nickel transport systems, permease components - Prom 164647 - 164706 4.0 - Term 164673 - 164719 8.4 173 68 Op 5 . - CDS 164762 - 166414 2255 ## COG4166 ABC-type oligopeptide transport system, periplasmic component - Prom 166549 - 166608 8.7 - Term 166657 - 166702 10.5 174 69 Op 1 47/0.000 - CDS 166716 - 167093 477 ## PROTEIN SUPPORTED gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 175 69 Op 2 . - CDS 167137 - 167673 581 ## PROTEIN SUPPORTED gi|240143816|ref|ZP_04742417.1| 50S ribosomal protein L10 - Prom 167705 - 167764 8.0 + Prom 167849 - 167908 8.4 176 70 Tu 1 . + CDS 168009 - 169415 747 ## PROTEIN SUPPORTED gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 + Term 169430 - 169486 18.3 - Term 169478 - 169528 2.1 177 71 Op 1 55/0.000 - CDS 169560 - 170249 1007 ## PROTEIN SUPPORTED gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 178 71 Op 2 45/0.000 - CDS 170310 - 170735 633 ## PROTEIN SUPPORTED gi|238922786|ref|YP_002936299.1| ribosomal protein L11 179 71 Op 3 . - CDS 170789 - 171304 761 ## COG0250 Transcription antiterminator 180 71 Op 4 . - CDS 171337 - 171543 298 ## EUBELI_00277 hypothetical protein 181 71 Op 5 . - CDS 171568 - 171717 239 ## PROTEIN SUPPORTED gi|160881814|ref|YP_001560782.1| ribosomal protein L33 - Prom 171773 - 171832 5.8 182 72 Tu 1 . 
- CDS 171883 - 172251 295 ## COG2832 Uncharacterized protein conserved in bacteria - Prom 172287 - 172346 8.4 + Prom 172320 - 172379 6.9 183 73 Tu 1 . + CDS 172420 - 173844 1217 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes + Term 173856 - 173902 15.4 - Term 174121 - 174179 21.5 184 74 Op 1 . - CDS 174181 - 177477 3175 ## COG3534 Alpha-L-arabinofuranosidase 185 74 Op 2 . - CDS 177492 - 182264 5197 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 182308 - 182367 6.2 186 75 Tu 1 1/0.071 - CDS 182408 - 184168 1879 ## COG3525 N-acetyl-beta-hexosaminidase - Prom 184248 - 184307 10.8 - Term 184276 - 184331 8.5 187 76 Op 1 . - CDS 184347 - 189917 5643 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 189954 - 190013 7.6 188 76 Op 2 . - CDS 190078 - 190419 287 ## EUBREC_3110 hypothetical protein - Prom 190489 - 190548 5.2 - Term 190496 - 190537 7.4 189 77 Op 1 21/0.000 - CDS 190550 - 190813 334 ## PROTEIN SUPPORTED gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 190 77 Op 2 24/0.000 - CDS 190883 - 191317 473 ## COG0629 Single-stranded DNA-binding protein 191 77 Op 3 1/0.071 - CDS 191339 - 191626 402 ## PROTEIN SUPPORTED gi|160881892|ref|YP_001560860.1| ribosomal protein S6 - Prom 191659 - 191718 2.7 192 78 Op 1 . - CDS 191722 - 191916 256 ## COG4481 Uncharacterized protein conserved in bacteria 193 78 Op 2 . - CDS 191928 - 192752 907 ## COG4509 Uncharacterized protein conserved in bacteria - Prom 192780 - 192839 5.7 194 79 Tu 1 . - CDS 193091 - 193684 439 ## EUBREC_3502 hypothetical protein - Prom 193847 - 193906 4.4 - Term 193833 - 193869 0.0 195 80 Tu 1 . - CDS 193909 - 194388 513 ## COG0073 EMAP domain - Prom 194408 - 194467 6.6 - Term 194520 - 194557 0.0 196 81 Op 1 . - CDS 194574 - 195815 752 ## COG0582 Integrase 197 81 Op 2 . - CDS 195849 - 196031 233 ## gi|210610680|ref|ZP_03288561.1| hypothetical protein CLONEX_00751 - Prom 196053 - 196112 5.3 198 82 Tu 1 . 
- CDS 196144 - 196335 310 ## EUBREC_0076 hypothetical protein - Prom 196575 - 196634 6.0 - Term 196570 - 196605 1.0 199 83 Op 1 . - CDS 196747 - 196986 242 ## CD3328 hypothetical protein 200 83 Op 2 . - CDS 196995 - 197405 321 ## CDR20291_1773 hypothetical protein - Prom 197574 - 197633 7.5 - Term 197792 - 197827 6.0 201 84 Op 1 . - CDS 197963 - 199204 684 ## COG0535 Predicted Fe-S oxidoreductases 202 84 Op 2 . - CDS 199198 - 200841 1066 ## COG1132 ABC-type multidrug transport system, ATPase and permease components - Prom 200934 - 200993 4.6 203 85 Op 1 . - CDS 201031 - 202179 391 ## CDR20291_3112 putative serine protease 204 85 Op 2 . - CDS 202115 - 202270 204 ## 205 85 Op 3 . - CDS 202335 - 203339 417 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase) 206 85 Op 4 . - CDS 203341 - 203910 291 ## CLK_0453 putative AIP processing-secretion protein 207 85 Op 5 . - CDS 203914 - 204063 83 ## - Prom 204166 - 204225 4.1 208 86 Op 1 . - CDS 204230 - 204313 91 ## 209 86 Op 2 . - CDS 204391 - 204537 58 ## 210 86 Op 3 . - CDS 204500 - 204670 163 ## 211 86 Op 4 9/0.000 - CDS 204684 - 205955 897 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 212 86 Op 5 . - CDS 205933 - 206673 401 ## COG3279 Response regulator of the LytR/AlgR family - Prom 206769 - 206828 8.0 + Prom 206664 - 206723 8.0 213 87 Tu 1 . + CDS 206874 - 207191 255 ## Dhaf_2022 transcriptional regulator, XRE family + Term 207312 - 207354 9.1 + Prom 207543 - 207602 4.6 214 88 Tu 1 . + CDS 207634 - 208164 266 ## DSY0900 hypothetical protein + Prom 208315 - 208374 2.0 215 89 Tu 1 . 
+ CDS 208448 - 208981 411 ## DSY0900 hypothetical protein 216 90 Op 1 36/0.000 - CDS 209176 - 211659 1136 ## COG0577 ABC-type antimicrobial peptide transport system, permease component - Prom 211689 - 211748 2.8 217 90 Op 2 4/0.000 - CDS 211753 - 212430 233 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P - Prom 212502 - 212561 4.0 - Term 212552 - 212586 0.4 218 90 Op 3 40/0.000 - CDS 212587 - 213510 535 ## COG0642 Signal transduction histidine kinase 219 90 Op 4 . - CDS 213513 - 214187 379 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 214244 - 214303 3.4 220 91 Op 1 . - CDS 214314 - 214496 124 ## 221 91 Op 2 . - CDS 214509 - 214751 330 ## gi|210610640|ref|ZP_03288540.1| hypothetical protein CLONEX_00730 - Prom 214773 - 214832 9.2 + Prom 214810 - 214869 6.5 222 92 Op 1 . + CDS 214889 - 215311 288 ## gi|210610639|ref|ZP_03288539.1| hypothetical protein CLONEX_00729 223 92 Op 2 . + CDS 215298 - 216929 499 ## EUBREC_3583 hypothetical protein 224 92 Op 3 . + CDS 216877 - 217020 129 ## 225 92 Op 4 . + CDS 217020 - 217964 584 ## EUBREC_2192 hypothetical protein + Term 217974 - 218018 4.1 - Term 217929 - 217967 0.0 226 93 Op 1 . - CDS 218013 - 218849 549 ## EUBREC_3563 hypothetical protein 227 93 Op 2 . - CDS 218880 - 221543 1988 ## CD1105 putative DNA primase - Term 221556 - 221591 6.5 228 94 Op 1 . - CDS 221599 - 229374 5342 ## COG4646 DNA methylase 229 94 Op 2 . - CDS 229396 - 231477 1525 ## COG0550 Topoisomerase IA - Term 231490 - 231523 0.5 230 95 Op 1 . - CDS 231535 - 232197 828 ## EUBREC_0776 hypothetical protein 231 95 Op 2 . - CDS 232178 - 232453 250 ## gi|210610624|ref|ZP_03288524.1| hypothetical protein CLONEX_00714 232 95 Op 3 . - CDS 232496 - 234919 1820 ## COG0791 Cell wall-associated hydrolases (invasion-associated proteins) 233 95 Op 4 . - CDS 234956 - 235327 184 ## EF2320 TraE protein, putative - Prom 235471 - 235530 2.5 234 96 Op 1 . 
- CDS 235544 - 238348 1729 ## COG3451 Type IV secretory pathway, VirB4 components 235 96 Op 2 . - CDS 238329 - 238706 142 ## EUBREC_3575 hypothetical protein 236 96 Op 3 . - CDS 238737 - 239345 473 ## COG4725 Transcriptional activator, adenine-specific DNA methyltransferase 237 96 Op 4 . - CDS 239359 - 240231 842 ## CD1112 hypothetical protein 238 96 Op 5 . - CDS 240254 - 240466 235 ## CKL_0289 hypothetical protein - Prom 240497 - 240556 5.0 - Term 240497 - 240535 1.3 239 97 Op 1 . - CDS 240743 - 242677 1695 ## COG3505 Type IV secretory pathway, VirD4 components 240 97 Op 2 . - CDS 242674 - 243192 376 ## CD1116 hypothetical protein 241 97 Op 3 . - CDS 243226 - 243372 82 ## 242 97 Op 4 . - CDS 243438 - 243584 159 ## gi|210610583|ref|ZP_03288509.1| hypothetical protein CLONEX_00699 - Term 243595 - 243633 7.5 243 97 Op 5 . - CDS 243659 - 249580 5400 ## COG4932 Predicted outer membrane protein - Prom 249614 - 249673 1.9 + Prom 249923 - 249982 11.6 244 98 Tu 1 . + CDS 250027 - 250791 596 ## Ent638_4316 hypothetical protein + Term 250793 - 250825 2.5 - Term 250781 - 250813 2.5 245 99 Op 1 . - CDS 250820 - 251263 473 ## gi|210610579|ref|ZP_03288505.1| hypothetical protein CLONEX_00695 246 99 Op 2 . - CDS 251280 - 251897 379 ## gi|210610578|ref|ZP_03288504.1| hypothetical protein CLONEX_00694 - Prom 251927 - 251986 4.4 - Term 251966 - 251995 -0.3 247 100 Tu 1 . - CDS 252076 - 252477 353 ## gi|210610577|ref|ZP_03288503.1| hypothetical protein CLONEX_00693 - Prom 252497 - 252556 2.0 - Term 252536 - 252568 2.2 248 101 Op 1 . - CDS 252594 - 252917 181 ## gi|210610576|ref|ZP_03288502.1| hypothetical protein CLONEX_00692 249 101 Op 2 . - CDS 252914 - 254155 886 ## Amet_3992 replication initiator A domain-containing protein 250 101 Op 3 25/0.000 - CDS 254136 - 255041 782 ## COG1475 Predicted transcriptional regulators 251 101 Op 4 . - CDS 255031 - 255816 866 ## COG1192 ATPases involved in chromosome partitioning 252 101 Op 5 . 
- CDS 255819 - 255992 232 ## gi|210610572|ref|ZP_03288498.1| hypothetical protein CLONEX_00688 253 101 Op 6 . - CDS 256005 - 256439 354 ## MGAS2096_Spy1123 hypothetical protein - Prom 256460 - 256519 4.0 - Term 256466 - 256517 13.7 254 102 Op 1 2/0.000 - CDS 256656 - 258581 2238 ## COG1190 Lysyl-tRNA synthetase (class II) 255 102 Op 2 . - CDS 258598 - 259080 798 ## COG0782 Transcription elongation factor 256 102 Op 3 . - CDS 259100 - 259207 64 ## - Term 259216 - 259251 5.1 257 102 Op 4 . - CDS 259257 - 262796 3132 ## COG4409 Neuraminidase (sialidase) - Prom 262866 - 262925 7.1 - Term 262900 - 262972 19.7 258 103 Op 1 . - CDS 262979 - 263968 602 ## PROTEIN SUPPORTED gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase 259 103 Op 2 . - CDS 263941 - 264282 519 ## Cphy_0423 hypothetical protein 260 103 Op 3 1/0.071 - CDS 264275 - 265912 1121 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 261 103 Op 4 . - CDS 265913 - 266158 427 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Term 266167 - 266212 9.2 262 104 Op 1 1/0.071 - CDS 266222 - 267151 1287 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs 263 104 Op 2 40/0.000 - CDS 267239 - 268576 943 ## COG0642 Signal transduction histidine kinase 264 104 Op 3 . - CDS 268515 - 269267 706 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 265 104 Op 4 . - CDS 269269 - 269703 372 ## - Prom 269729 - 269788 4.6 - Term 269772 - 269835 23.2 266 105 Op 1 . - CDS 269839 - 270870 1206 ## COG3773 Cell wall hydrolyses involved in spore germination - Prom 270907 - 270966 6.4 267 105 Op 2 . - CDS 271028 - 271801 632 ## COG0566 rRNA methylases 268 105 Op 3 17/0.000 - CDS 271806 - 272456 821 ## COG0569 K+ transport systems, NAD-binding component 269 105 Op 4 . 
- CDS 272469 - 273809 1378 ## COG0168 Trk-type K+ transport systems, membrane components - Prom 273854 - 273913 5.6 - Term 273862 - 273918 6.7 270 106 Op 1 40/0.000 - CDS 273923 - 274612 1012 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 271 106 Op 2 . - CDS 274605 - 276098 1286 ## COG0642 Signal transduction histidine kinase - Prom 276143 - 276202 4.8 - Term 276213 - 276277 17.1 272 107 Tu 1 . - CDS 276290 - 277477 1472 ## COG0192 S-adenosylmethionine synthetase - Term 277561 - 277603 3.0 273 108 Op 1 . - CDS 277793 - 282487 4807 ## COG3525 N-acetyl-beta-hexosaminidase 274 108 Op 2 . - CDS 282519 - 284573 1966 ## COG3291 FOG: PKD repeat 275 108 Op 3 . - CDS 284595 - 289952 4789 ## CPF_2129 fibronectin type III domain-containing protein - Prom 290014 - 290073 10.3 - Term 290112 - 290159 11.2 276 109 Op 1 . - CDS 290161 - 291453 1504 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 291539 - 291598 13.0 - Term 291592 - 291631 1.1 277 109 Op 2 . - CDS 291677 - 292078 422 ## gi|153816497|ref|ZP_01969165.1| hypothetical protein RUMTOR_02750 - Prom 292104 - 292163 6.9 - Term 292166 - 292227 5.6 278 110 Tu 1 . 
- CDS 292231 - 292713 541 ## CPE1279 hyaluronidase Predicted protein(s) >gi|330402635|gb|ADLB01000021.1| GENE 1 1 - 309 206 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP11-BS70] # 51 103 1 53 77 84 67 6e-15 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRH >gi|330402635|gb|ADLB01000021.1| GENE 2 538 - 1044 596 168 aa, chain - ## HITS:1 COG:TM0564 KEGG:ns NR:ns ## COG: TM0564 COG1853 # Protein_GI_number: 15643330 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Thermotoga maritima # 25 167 17 159 159 94 36.0 1e-19 MAFVEIAINELQINPFTKIGKEWTLITAGDIQKHNTMTASWGGVGVLWNKNVATVYIRPQ RYTKEFVDGNEKFTLTFFSEQYRNALTYLGRVSGKDGDKIKESGLTPMEMEGTIAFKEAQ LVLVCKKLYADTLKPECFLDRETERKCYPEKDYHTMYIAEIEKVFVRE >gi|330402635|gb|ADLB01000021.1| GENE 3 1065 - 2171 1513 368 aa, chain - ## HITS:1 COG:SA0656 KEGG:ns NR:ns ## COG: SA0656 COG1820 # Protein_GI_number: 15926378 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetylglucosamine-6-phosphate deacetylase # Organism: Staphylococcus aureus N315 # 1 368 4 387 393 246 37.0 4e-65 MIIKNVQVFTEEGKFEPGEIIIENGIFQDTATNQDEVIDGQGGYAIPGLVDVHFHGAVGH DFCDGTEEAIREIAKYEASQGITTMVPATMTLSEEELMHISKTAGAYKNEEGAVLAGINM EGPFISPGKKGAQAGTHIVKPDVEMFRRLQEAANGLYRLVDIAPEAEGAMDFIEALKDEV HISFAHTLADYDTAKKGYDLGADHATHLYNAMPAFTHRAPGVIGAAHDSSHCFVELITDG VHIHPSVVRTTFDMFKDRVVLISDSMRATGLDDGEYTLGGQAVKVEGNVATLVSDGALAG SVTNLMDCVRVAVQKMDIPLETAIAAATINPAKSVGLDDKYGSIKVGKVGNVVLLNEDLS LKEVIVNG >gi|330402635|gb|ADLB01000021.1| GENE 4 2183 - 2914 1034 243 aa, chain - ## HITS:1 COG:CAC0187 KEGG:ns NR:ns ## COG: CAC0187 COG0363 # Protein_GI_number: 15893480 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Clostridium 
acetobutylicum # 1 237 1 237 241 261 53.0 1e-69 MKICVAKDYNEMSRKAANVIGAQVIMKPNCVLGLATGSSPIGTYKELIKRYEEGDLDFSQ VKSVNLDEYKGLPRDNDQSYYYFMNDNFFDHINIDKANTNVPNGMEPDAEKECARYEELI ASLGGVDLQLLGLGHNGHIGFNEPAPQFDKITHCVDLQESTIEANKRFFASADDVPRQAY TMGIGTIMSAKKIVVVVSGEDKAEIVKKAFFGPVTPEVPASILQMHPDVTVVCDEAAYSK VAQ >gi|330402635|gb|ADLB01000021.1| GENE 5 3068 - 5233 1974 721 aa, chain - ## HITS:1 COG:CAC2947_1 KEGG:ns NR:ns ## COG: CAC2947_1 COG0550 # Protein_GI_number: 15896200 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 2 617 3 618 618 702 58.0 0 MKSLVIAEKPSVARDIARVLHCKKNISGAIEGEKYIVTWALGHLVTLADPEGYDKKFKEW KMDVLPMMPKKMELVVIKKTAKQYNAVKTQLFRKDVDNIIIATDAGREGELVARWILAKV NCHKPIKRLWISSVTDKAIQDGFAHLSDGRKYNDLYMAAVSRAEADWLVGINATRALTCK YNAQLSCGRVQTPTLAMIAKREEEIRTFKPQTYYGLTVQAGKIKWTWQDMKSGSYRTFSK ERIEELEKGVANTGLQVEQVEKSAKKSFAPGLYDLTELQRDANKRFGFSAKETLNIMQRL YENHKVLTYPRTDSKYIGTDIVDTIKDRLRACGIGPYKKLAGSLLMKPLQTNKSFVDDKK VSDHHAIIPTEQYVQLEHMTNEERKIYDLVVRRFLSVLYPPFVYEQTTLKGKVGKETFVA KGKTVQQIGWKAVYEYEEEQEDELSEQSLPQIEKGSTFAIEKVFASEGKTKPPARFTEAT LLSAMENPVKYMESKDKQAVQTLGETGGLGTVATRADIIDKLFNTFLMELKGKEIYLTSK GKQLLSLVPEELKKPELTADWEMKLSRIAKGELAKETFMTEIETYTEDIVEEIKHGEGNF RHDNVTTKKCPKCNKPMLAVNGKNARMLVCQDRECGHKEVIARLSNARCPNCHKKMEIIK KGDEEQFVCVCGHKEKMSAFKARREKEGAGVSKKDVQRYLKQQKEEPVNTAFADALAKLK I >gi|330402635|gb|ADLB01000021.1| GENE 6 5262 - 6020 770 252 aa, chain - ## HITS:1 COG:AF1959 KEGG:ns NR:ns ## COG: AF1959 COG1924 # Protein_GI_number: 11499541 # Func_class: I Lipid transport and metabolism # Function: Activator of 2-hydroxyglutaryl-CoA dehydratase (HSP70-class ATPase domain) # Organism: Archaeoglobus fulgidus # 1 248 1 251 251 173 42.0 3e-43 MNYVGIDIGSTASKVVVRGEKELHFVLPTGWSSKETTMTIRERLLKEGIDVEDEKTRVVA TGYGRVAVDFADKVVTEITCHARGGRELAEGDCTIIDVGGQDTKVIQVNGGMVMDFLMND KCSAGTGKFLEIMANRLGITLDELFELAKTGEMIPISSLCTVFAESEVISYIGEGRPRED IAMGVIESVVSKVAQLAQRQTLAEKIILTGGLSHSAYFAERLSKKLGAAVTPTAFGRFAG ALGASYLAEEKK >gi|330402635|gb|ADLB01000021.1| 
GENE 7 6033 - 6770 575 245 aa, chain - ## HITS:1 COG:CAC3099 KEGG:ns NR:ns ## COG: CAC3099 COG0101 # Protein_GI_number: 15896350 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Clostridium acetobutylicum # 1 244 1 244 244 234 48.0 1e-61 MRRVKLTIAYDGTNYCGWQIQPNGITIEEIVNKALSKLTGEKIVVIGASRTDSGVHAMGN VAVFDTETTIPPERVAMAVNRILPEDIVVVKSEEVPLDFHPRYCDCEKTYEYHIVNTRIP IPTKRLTNYFVSYELDLDKMREGASYLIGEHDFASFCNIKTDVESTVRTVKELEILENGE EITIRISGNGFLYNMVRIIVGTLIRVGRGFYEPVQVKEILEAKNRKAAGVTAPPHGLMLM EIRYN >gi|330402635|gb|ADLB01000021.1| GENE 8 6866 - 7690 625 274 aa, chain - ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 1 248 1 249 267 271 57.0 1e-72 MIRDITIGQYYPANSIVHRLDPRVKIMCTLFYLISLFLFKSVLGYALCTVFLFAVIRLSK VPFKFITKGLKPIIVLLMITVLFNLFLTNQGNVLVSFWIFKITDEGLRTAVYMAIRLIYL IVGSSLMTLTTTPNELTDGIEKLLKPFNKIKVPVHEIAMMMSIALRFIPILLEETDKIMK AQIARGADLEGGNIIQKAKNMIPILVPLFVSAFRRANDLAMAMEARCYQGGEGRTKMKPL VYQKRDYITYAVTILYIVVMVLVGRYISFRVWIF >gi|330402635|gb|ADLB01000021.1| GENE 9 7687 - 8538 442 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 1 283 112 396 398 174 35 3e-42 GGIMSIKLEHINYIYGEGTAYRKQALKDVSLEIPHGQFVGIIGHTGSGKSTLIQHLNGLI RATDGKFYFNGEDVYAEGYSMRNLRNEVGLVFQYPEHQLFEVDVLTDVCFGPKNQGLPKE ECEKRAKEALEAVGFPEKYYGQSPFDLSGGQKRRVAIAGVLAMQPKVLVLDEPTAGLDPK GRDEILDQIAELHRQGDMTVVLVSHSMEDIAKYVERLIVMNRGEKMFDGTPKEVFSHYKE LEQIGLSAPQVTYVMNALKDKGFAVSTEATTIEEAVEEIMKAL >gi|330402635|gb|ADLB01000021.1| GENE 10 8514 - 9365 590 283 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 28 280 146 397 398 231 49 2e-59 
MSMIKTKHLVFEYDKLDETGNVIGKHRAIDGVDLNIEEGQFIAILGHNGSGKSTLAKHIN AILTSTEGTMWVDGKDTKDLDKLWEIRQSAGMVFQNPDNQIIGTVVEEDVGFGPENLGVP TEEIWQRVDKSLHAVGMTEYRHHSPNKLSGGQKQRVAIAGVIAMCPKCIVLDEPTAMLDP IGRKEVLRTVDELRKREKVTVILITHYMEEVIGADRVFVMEQGRVVMDGTPREIFSQVDE LKKHRLDVPQVTMLAHELRKRGVKLPKDILKTEELVEALCQLN >gi|330402635|gb|ADLB01000021.1| GENE 11 9418 - 10350 892 310 aa, chain - ## HITS:1 COG:CC3033 KEGG:ns NR:ns ## COG: CC3033 COG0697 # Protein_GI_number: 16127263 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Caulobacter vibrioides # 14 305 17 309 310 236 45.0 3e-62 MEKTMRKTSVVAILALICCALWGSAFPAVKIGYELFEIKTVGSQILFAGYRFFLAGVLTF LLACIYEKRFVTLKKSSIPYVFAQGLLQTTVQYLFFYIGMANTTGAKGSVINASNGFVAI IVATMLIPAEKMTWRKALGCVVGFAGVIFINLSPGAWGSGFSLQGEGMVMLCSIAYGTSS VTLKLISHREKPMTLTAYQLLFGGAVLTLIGIFAGGSITNFTVKSTLLLLYMALLSTIAF SLWTVLLKYNPVSKVAIFGFSIPIFGVLLSGIFLGEEIFSIKNILALLCVCAGIIVVNRE SRSREQIIAK >gi|330402635|gb|ADLB01000021.1| GENE 12 10335 - 10514 72 59 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLRLDLPFDYFLSSEKIMLYPIVMGEKYFVNVRFLLYEWEKLRYNREMKKWSVRKWKKQ >gi|330402635|gb|ADLB01000021.1| GENE 13 10493 - 11011 591 172 aa, chain + ## HITS:1 COG:CAC2769 KEGG:ns NR:ns ## COG: CAC2769 COG0652 # Protein_GI_number: 15896024 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Clostridium acetobutylicum # 1 168 1 168 174 244 71.0 5e-65 MANPIVTFEMESGDIIKAELYPEIAPNTVNNFVSLVNSGYYDGLIFHRVIRGFMIQGGCP DGTGMGGPGYTIKGEFNQNGFSNHLKHEPGVLSMARAMHPDSAGSQFFIMHETSPHLDGS YAAFGKVTEGMDIVNKIAETATDYNDRPLETQRMKTVTVETFGENYPEPDKM >gi|330402635|gb|ADLB01000021.1| GENE 14 11057 - 11605 535 182 aa, chain + ## HITS:1 COG:SMb20592 KEGG:ns NR:ns ## COG: SMb20592 COG1595 # Protein_GI_number: 16265252 # Func_class: K Transcription # Function: DNA-directed RNA 
polymerase specialized sigma subunit, sigma24 homolog # Organism: Sinorhizobium meliloti # 16 177 21 190 227 65 26.0 8e-11 MLVYLEMLETPEEKLKFEQIYFLYRDKMYAVAFKILHNENDAEDIVHESFKAIIENFEKI NDISCHKTWNYIVTIVKNKSFTLYQKKKHHETSAYEDWVEVEMDFTPEKVTEEREIAEIL VELIRSLPFPYKEVLYLQYYNALSGEEIAKFIDKTPAHVRKISQRAKAMLKEELLKRGIQ NE >gi|330402635|gb|ADLB01000021.1| GENE 15 11598 - 12281 343 227 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225027075|ref|ZP_03716267.1| ## NR: gi|225027075|ref|ZP_03716267.1| hypothetical protein EUBHAL_01331 [Eubacterium hallii DSM 3353] # 3 225 4 219 219 92 32.0 1e-17 MNDKLLENAVKIAVEKDYKKEVTDFSERDSHTFSPAFEEKMQPLLHRESPIRYRKRLKFR YLLVAILVLLFGGMTVLANPDVRERIGKYFEQLFDDHTDVSFDAPENHKEKEFVKVKPTY IPEGCTLVDAEYDSTFETYSLIYEDKNERTFYYEQSSVKFFEDSSVSISSDGTPAKKIDI SGFPCYLLSDEYGFHTLMYITDNYVFQVGGESSSEELIKVIKSLRKE >gi|330402635|gb|ADLB01000021.1| GENE 16 12286 - 12600 225 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKLSLKTYKPLVLIWEIMCILSWCKAIFLPAKIEYNYSGVFFALNFIYCVYSIVLIWLMP KDFTLSSKQKIAASILMLGLFISIFSKSKKILKFFVTFSSLDCY >gi|330402635|gb|ADLB01000021.1| GENE 17 12753 - 13019 163 88 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTIPRKTFRFLRIITIIMTFIFWANMLFVPDKILVTYNTSFFIFDIIYFILAFIVSYFEP KDYHVPKIEKLIGFIMIIVIMIKLFFFS >gi|330402635|gb|ADLB01000021.1| GENE 18 13239 - 13400 247 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMDFFNDVLFDLIMFAKSLPAFLITIIVLYFVIKSAVKKGLQEYHDSLEKKND >gi|330402635|gb|ADLB01000021.1| GENE 19 13436 - 13687 226 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTLSTKQFNVLKIIWFMMTFVLAYMMFALPKGVNHSASTSVFDTAQFIFTCILVKLRPKN YEFSLVEKGIVIFIIITMIISLF >gi|330402635|gb|ADLB01000021.1| GENE 20 13723 - 13980 306 85 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLMSIVLIFLNFALFFKSLWNWFALTGIDSNMISLFALLWCIIIAFVPAAYCAEREKQYL EASVNRKKQIIILCIFAFIVAGLYV >gi|330402635|gb|ADLB01000021.1| GENE 21 13995 - 14309 306 104 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTKKFFVLLVCSMVLTGCASSKQENTQKQEFVREESSLINSPEMMNYKYRVKLTGKSNNA KQKSYYTVLTNDKSLTFETVDRKFWSSSIEDKNDFYIVAHGIAE 
>gi|330402635|gb|ADLB01000021.1| GENE 22 14333 - 15070 736 245 aa, chain - ## HITS:1 COG:ECs0006 KEGG:ns NR:ns ## COG: ECs0006 COG3022 # Protein_GI_number: 15829260 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 1 242 1 252 258 155 37.0 7e-38 MLTIISPAKNMKIRKRDDLSLSKTAFPEETEEIIKELKKYQPYELETLMKINEKLAVQAF MDIQNFDLHKAGTPAILTYDGLVFKNIDAETFSVEEMEFINTHLRILSACYGAVRPLDEI LPYRLEMQCKLKIDDKSLYQFWGDLLYKEVYRENQIVLNLASEEYAKAIRKYAKYDKFID VEFLTYRKGKLRTITTSAKMARGQMVRFIAKNRIDEPEQIKAFDWNDYEFEESMSNEKKY VFIQR >gi|330402635|gb|ADLB01000021.1| GENE 23 15307 - 15558 277 83 aa, chain + ## HITS:1 COG:no KEGG:Cbei_4440 NR:ns ## KEGG: Cbei_4440 # Name: not_defined # Def: hypothetical protein # Organism: C.beijerinckii # Pathway: not_defined # 1 82 1 82 83 104 73.0 1e-21 MNTRIALIGIIVESNESVSELNRLLSEYSQYIIGRMGIPYREKNISIISVAIDAPNDIIS SLSGKLGMLSGINTKTTYAKTSV >gi|330402635|gb|ADLB01000021.1| GENE 24 15568 - 16605 742 345 aa, chain + ## HITS:1 COG:CAC1631 KEGG:ns NR:ns ## COG: CAC1631 COG0502 # Protein_GI_number: 15894909 # Func_class: H Coenzyme transport and metabolism # Function: Biotin synthase and related enzymes # Organism: Clostridium acetobutylicum # 3 341 6 339 350 288 42.0 8e-78 MKQLIQKLKETQTLTKEEWISLIDNRTPELAEYLFSEAREVQKFYYGNKVYIRGLIEFTN FCKNDCFYCGIRKSNGNACRYRLTKEEILECCRIGYPLGFRTFVLQGGEDGYFTDAKMTD IISSIKAVYPDCAITLSVGEKSYDSYKAFFDAGADRYLLRHETYVSEHYKKLHPHTLSAK KRQQCLWDLKDIGYQVGTGFMVGSPYQTTENLAEDMLFLKKLNPQMVGIGPFIPHHDTPF KSSPQGSLELTLFMLALIRLMLPKVLLPSTTALGTIAPDGREQGILAGANVVMPNLSPLS VRKNYLLYDNKICTDSEAAEGLEQLTTQIQKIGYEISVSRGDSLN >gi|330402635|gb|ADLB01000021.1| GENE 25 16621 - 18042 1461 473 aa, chain + ## HITS:1 COG:CAC1356 KEGG:ns NR:ns ## COG: CAC1356 COG1060 # Protein_GI_number: 15894635 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Clostridium acetobutylicum # 1 473 1 472 472 
703 73.0 0 MYNILSENPEEFINHEEILDTLDFAGKNKHNVELINQIIEKAKLRKGLTHREALVLLDCD IEEKNQEIYALAQQIKKDFYGNRIVMFAPLYLSNYCVNGCTYCPYHMKNKHIARKKLTQE EIRKEVIALQDMGHKRLALETGEDPVNSPIEYVLESIKTIYGIKHKNGAIRRVNVNIAAT TVENYRKLKEAGIGTYILFQETYHKESYLQLHPTGPKHDYDYHTTAMDRAMQGGIDDVGI GVLFGLDKYRYELAGLLMHAEHLEAVFGVGPHTISVPRLRHADDIDADSFDNGIDDDTFA KIVACIRIAVPYTGMIISTRESKACRERVLHLGVSQISGGSKTSVGGYAEPEPDDNKSEQ FDVSDTRTLDEVVKWLMDLGYIPSFCTACYREGRTGDRFMSLCKSGQIQNCCQPNALMTL KEYLMDYASEATKKVGENVIQEEIGKIPNEKVRQVVIENLKAIANDNRRDFRF >gi|330402635|gb|ADLB01000021.1| GENE 26 18053 - 19252 701 399 aa, chain + ## HITS:1 COG:CAC1651 KEGG:ns NR:ns ## COG: CAC1651 COG1160 # Protein_GI_number: 15894928 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Clostridium acetobutylicum # 3 398 4 400 411 403 52.0 1e-112 MSLNSTPSANRIHIGIFGRRNCGKSSLINALTGQALSIVSDTKGTTTDPVLKAMELLPIG PVVFIDTAGLDDEGELGQLRITKTYQMLNKTDIALLVIDSSVGMTVEDENILKKIQDKQI PYLIIHNKCDLPEKSTLSLAEHTEHTIEISAKTGFHIRELKEKISTLLPKDVNGTPIVSD LISPNDFVVLVVPIDSSAPKGRLILPQQQTIRDILDANAVSIVVKDTELKETLENLGKTP RLVITDSQAFKQVSEIVSKSVLLTSFSILFARYKGNLDTVTQGAFAIDQLKDGDTVLISE GCTHHRQCGDIGTVKLPNLLKKYTQKNLHFVFTSGEEFPSDLTKYHLIIHCGGCMLNERE MKYRLRCAKDMNVPITNYGIAIAKMNGILERSLEPFTHR >gi|330402635|gb|ADLB01000021.1| GENE 27 19290 - 19559 292 89 aa, chain - ## HITS:1 COG:BH3136 KEGG:ns NR:ns ## COG: BH3136 COG3326 # Protein_GI_number: 15615698 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus halodurans # 6 77 10 81 94 72 50.0 2e-13 MTNIIIYLCAINIMTFLLYGLDKQKAKRHKWRIPEATLLGVALAGGSIGAFLGMYIFHHK TKKAKFYIGVPMFFVMQAVGIIVIQTRLL >gi|330402635|gb|ADLB01000021.1| GENE 28 19626 - 20294 845 222 aa, chain + ## HITS:1 COG:CAC0673 KEGG:ns NR:ns ## COG: CAC0673 COG1760 # Protein_GI_number: 15893961 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 1 220 1 220 227 187 45.0 1e-47 MNFLSIFDVIGPNMIGPSSSHTAGAVAIALLARKMFAQPIDKVTFTLYGSFAKTYRGHGT 
DKALLGGTLGFSTDDERIRDAFTIADREGLEYHYIIDEETMTEHPNTADIVLENKAGHIL SIRGESIGGGKMKIVRINNIDVEFTGEYSTLIVQQTDKPGVVAHITQCLSEENVNIAFMR LFREDKGATAFTIVESDEKIPEEILDKIRVNEHVKDLMLIQM >gi|330402635|gb|ADLB01000021.1| GENE 29 20307 - 21179 902 290 aa, chain + ## HITS:1 COG:CAC0674 KEGG:ns NR:ns ## COG: CAC0674 COG1760 # Protein_GI_number: 15893962 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Clostridium acetobutylicum # 5 290 5 290 290 230 48.0 2e-60 MDFLNGQTLLNLCEKEQISISSAMKEREITMGTLSSAEVDDKLTEVLTIMKNSAHKPIQN PGKSIGGLIGGEAKQVADHAKTNASVCGSMLSKAISYSMAVLEVNASMGLIVAAPTAGSS GVVPGTLLALQEEKELSDSTLYSGLLNASAIGYILMRNASVSGAEAGCQAEVGAASAMAA SAVVEMMGGTPEMCLTAAGIALSNLLGLVCDPIAGLVESPCQSRNAIGVANAITSAELAL SGVTHPIPFDEMAEAMFRVGKSLPFELRETAMGGCAGTPTGCSLGCGICK >gi|330402635|gb|ADLB01000021.1| GENE 30 21233 - 22063 726 276 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260588003|ref|ZP_05853916.1| ## NR: gi|260588003|ref|ZP_05853916.1| conserved hypothetical protein [Blautia hansenii DSM 20583] # 42 275 49 267 274 105 29.0 2e-21 MRKKAGISLLFILVLFCVCLAACKKEEKKEKQNSTKENHKPIEAMYVPFGEDGYIFFDTS TESPFYAIIPEDKLYDTNDKKMKQKDLSAGDIIACYGDGKMLESYPGQYPGVTKMVRVKK GTVEDTKKYEKEVAKFAQKPGKADIPYLDIENMQKDALVTTAATEGNYEWKYEDSSGNEQ VETVENLSLSDKEGLADVICDGENSDLKLIFSAEPNHVKVRRWPLDTAKEQNSYGQRVRV TLDGNTAYIKRAKKTYIYEVIGTWDNGNVHYGFYIP >gi|330402635|gb|ADLB01000021.1| GENE 31 22060 - 22779 618 239 aa, chain - ## HITS:1 COG:CAC0284 KEGG:ns NR:ns ## COG: CAC0284 COG0846 # Protein_GI_number: 15893576 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Clostridium acetobutylicum # 1 237 4 244 245 315 63.0 6e-86 MEKLSLLQKMIDDSHRIVFFGGAGVSTESNIPDFRSSDGIYQEKYAYPPEQVVSHTFFQK KPELFYEFYKEKMMFLDAKPNKAHLKLAEMEEKGKLSAIITQNIDGLHQLAGSKNVLELH GSIHRNYCQRCGKFYGAKYVKESEGIPICECGGTIKPDVVLYEESLDSEVIQKSVREIAQ ADMLIIGGTSLVVYPAAGFIDYFRGKHLVVINKSATPRDEQADLCIQKPIGEVLEGITV >gi|330402635|gb|ADLB01000021.1| GENE 32 22779 - 23321 453 180 aa, chain - ## HITS:1 COG:CAC2749 
KEGG:ns NR:ns ## COG: CAC2749 COG0622 # Protein_GI_number: 15896006 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 177 1 177 180 182 48.0 2e-46 MKLMIASDIHGSAYYCEKMIDAFHREKADRLLLLGDILYHGPRNDLPKEYQPKKVIEMLN NMKEELLCVRGNCDTEVDQMVLEFPILADYAILSLNDRTIFATHGHQFNRDNLPMLKEGD ILLNGHFHVPAGERIGKYIYLNPGSISIPKENSEHSYMIIEDNVYLWKNLEGEVYKKEVF >gi|330402635|gb|ADLB01000021.1| GENE 33 23402 - 23920 456 172 aa, chain - ## HITS:1 COG:SP0950 KEGG:ns NR:ns ## COG: SP0950 COG0454 # Protein_GI_number: 15900828 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Streptococcus pneumoniae TIGR4 # 5 166 2 162 166 148 42.0 6e-36 MSLQINQISSKYTVKELTERDIDAIYELELGNPLYYQYCPPTVTRESILEDMEALPPKKT YEDKYYIGFWEQEKLVAIMDLIVEYPNKNTAFVGLFMLEKSVQGNGVGTNIIAEFCSYMK NAGYSFVRLGYVKGNPQSRAFWLKNGFEETGVVCHNEEYDIIILEKYINSFL >gi|330402635|gb|ADLB01000021.1| GENE 34 23921 - 24079 84 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNWLGDRAFISYLKYWIIINPVLLHLVTVIPFVLSIAFFVKAKSKVEKKEEE >gi|330402635|gb|ADLB01000021.1| GENE 35 24327 - 25940 1498 537 aa, chain - ## HITS:1 COG:lin1623 KEGG:ns NR:ns ## COG: lin1623 COG1961 # Protein_GI_number: 16800691 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinases, DNA invertase Pin homologs # Organism: Listeria innocua # 8 302 9 299 301 223 41.0 7e-58 MKQQIYNTALYLRLSRDDELQGESSSITTQRSMLRLYAKEHHLNVIDEYIDDGWSGTNFD RPSFQRMIEDIEAGKINCVVTKDLSRLGRNYIMTGQYTELYFPSHNVRYIAIDDGVDSEK GESEIAPFKNIINEWVARDTSRKVKSAFKTKFAEGAHYGAYAPLGYKKHPDIKGKLLIDD ETKWIIEKIFSLAYQGYGSAKITKQLRAEKVPTASWLNFTRYGTFAHIFEGKPESKRYEW TVAHVKAILKSEVYIGNSVHNMQSTVSFKSKKKVRKPESEWFRVENTHEPIIDKEVFYRV QEQIKSRRRQTKEKATPIFAGLVKCADCGWSMRFGTNKANKTPYSYYACSYYGQFGKGNC SMHYIRYDVLYQAVLERLQYWAKAVQQDEEKVLNKIQKVGNAERIREKKKKASTLKKAEN RQNEIDRLFAKMYEDRACEKITERNFVMLSSKYQKEQIELEQQITSLREELSKMEQDMIG 
AEKWIELIKEYSVPKELTAPLLNAMIEKILIHEATTNEDNERIQEIEIYYRFIGKVE >gi|330402635|gb|ADLB01000021.1| GENE 36 27528 - 27692 165 54 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255280646|ref|ZP_05345201.1| ## NR: gi|255280646|ref|ZP_05345201.1| transcriptional regulator, Cro/CI family [Bryantella formatexigens DSM 14469] # 1 54 351 404 404 103 100.0 4e-21 MVQALDRENHTTILKDMIEQAELKDKHKHLLQLEKSTIEERIDTAVPQENDGIF >gi|330402635|gb|ADLB01000021.1| GENE 37 27794 - 28741 652 315 aa, chain - ## HITS:1 COG:BS_ydcR KEGG:ns NR:ns ## COG: BS_ydcR COG2946 # Protein_GI_number: 16077554 # Func_class: L Replication, recombination and repair # Function: Putative phage replication protein RstA # Organism: Bacillus subtilis # 63 299 25 259 352 192 44.0 6e-49 MNDKKFIEKLRQKREEYGVTQTRLAVACGISREYYNRIEKGKQPLNDELKEVIEKQIERF NPQEPLFLLIDYFRVRFPTTNALKIIREVLQLKADYMLYEDFGKYGYESKYVLGDINIMC SMQEHLGVLLELKGRGCRQMESYLLAQERSWYDFMLDCLTAGGRMKRLDLAINDKAGILD IPKLKEKYKAGECISYFRMQKDYSGTEKCGSDLPKNTGETLYLGSTSSELYMCAYQKNYE QYVKNGIEVEDTEIKNRFEIRMKNERAYYAVVDLLTYRDAERTAFSIINHYVRFVDREDT SQKVNGKQMMIGRGL >gi|330402635|gb|ADLB01000021.1| GENE 38 28829 - 29803 373 324 aa, chain - ## HITS:1 COG:no KEGG:Dde_0201 NR:ns ## KEGG: Dde_0201 # Name: not_defined # Def: hypothetical protein # Organism: D.desulfuricans # Pathway: not_defined # 4 314 5 323 326 71 22.0 4e-11 MEKKELIKLYLQNVDKMFGYANMNAYIDERLKKYTKYCQSKKPEEQIIIWLKLLHENFGK KIVYLGSYLALQEKDMSYLNNAFNSAVTWGQLTITNSGCDHSIHTWNILPHIFCANRFRD IEKIFPKENGLSKNGLKSACSITNLVMYLYYQEPTWKQYVIDESKEFLQNKHTAEEKAVI NGFLALIERNWEKFSLELANLCKAHRKSKDYGENPFTRKISFFAFGLYNFARYLYKEEVK NITLPQNEFLFEDFRIYQESTGYQIGQPFCIFEEPLLLLNDFEKIDLPIMYLTAGKKRLL DIENYRQEVVKKIQALHSNQSSWS >gi|330402635|gb|ADLB01000021.1| GENE 39 29921 - 30025 124 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNIAMILIGGLIFLSFPFVVLQKIKKIPMKLRKN >gi|330402635|gb|ADLB01000021.1| GENE 40 30308 - 30838 265 176 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160914642|ref|ZP_02076856.1| ## NR: gi|160914642|ref|ZP_02076856.1| hypothetical protein EUBDOL_00649 
[Eubacterium dolichum DSM 3991] # 3 174 9 181 187 196 61.0 4e-49 MDFLKLLIVGTVVCGILDLLTARATKKSISSLTHFTVRAPIDFAFLGTIGMAFGIGIFLF AKHENRSIPLLVMLILTIGLIVPSALLMIAPIKYVWDVIVENDDITIIKGFIYRRHWKFS SIQYAKAGRGGLKVYVDSRKRKAFFVDTMCPASQNFIERMEKENKPIIYPQEDKTN >gi|330402635|gb|ADLB01000021.1| GENE 41 31249 - 31770 207 173 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHYLFWKKQSVREFWENELLAEKVFVIVYCIIALLFGGMFIVCIWSTDNKLSAGILLGA CWIFIFLGIPFLTHTGEKSYWVIFEDTFVLRRGNCKGYSHIEKIFYKEAKYLVIGSVDPY ISPRMVSPKWYHKKWGNYINVVSSDKKALFSVRYSEEMLKMLQKKCINAHVVK >gi|330402635|gb|ADLB01000021.1| GENE 42 31989 - 32075 153 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGLAERLKENRERLGLSQGDVAEKLNIT >gi|330402635|gb|ADLB01000021.1| GENE 43 32227 - 32652 236 141 aa, chain + ## HITS:1 COG:VCA0382 KEGG:ns NR:ns ## COG: VCA0382 COG0454 # Protein_GI_number: 15601145 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Vibrio cholerae # 5 138 3 135 141 102 35.0 3e-22 MQIILRDYHITDKKYLALLFYQIQQEEFPWIDKNTLFLTDFERSTEGERIFVAEVNGKIA GFISVWTPDKFIHNLFVVKNFRHLHIGQALIEKVLLTFGKPLTLKCIAQNKTALQFYLSH GWIIQEDGICDEGTYYLMILE >gi|330402635|gb|ADLB01000021.1| GENE 44 32654 - 33082 379 142 aa, chain - ## HITS:1 COG:MA3570 KEGG:ns NR:ns ## COG: MA3570 COG0454 # Protein_GI_number: 20092376 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Methanosarcina acetivorans str.C2A # 1 141 1 140 140 98 38.0 5e-21 MIRNLEKTDVDSVAEIWLDTNIKAHNFVPAKYWESNFTLVKEMFLQAEIYVYEGETSKKI EGFIGLDDNYIEGIFVCDKVQSCGIGRQLLDFVKERKTFLKLSVYQKNVRAIQFYEREGF QITSENIDENTGEKEYEMAWKR >gi|330402635|gb|ADLB01000021.1| GENE 45 33084 - 33596 156 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160916275|ref|ZP_02078482.1| ## NR: gi|160916275|ref|ZP_02078482.1| hypothetical protein EUBDOL_02302 [Eubacterium dolichum DSM 3991] # 1 167 1 165 173 102 37.0 8e-21 
MKKGKRVLTTISAAILILAMACLYIGAQRFFAIRPATEYEDKGVYTFSPYRVLPTQVKNT AGGRQGRLHPTKTVYYLYYRETGGTGYKWKEEVSSRGVGQKKISEGREVKRRVLSIKKEG KYITIEPEQTAKSYVAKQRRIYGITSICSIIVLLGYLGVWIRKRKKKAEM >gi|330402635|gb|ADLB01000021.1| GENE 46 33599 - 34156 576 185 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229859876|ref|ZP_04479533.1| acetyltransferase, ribosomal protein N-acetylase [Streptobacillus moniliformis DSM 12112] # 1 184 1 184 187 226 57 8e-58 MEHKGTGRLETERLILRKFAESDIESSYRNWTSDDKTTKYLTWSTHENIGVTKQVLKSWI EEYKNPSFYQWAIELKEIGEAIGTISVVRSNESIDMVEIGYCIGSKWWNQGIVTETFKKV ITFLFEEVKVNRIQSYHDQENVGSGKVMSKCGMIYEGTLREADRNNRGIVDMVTYGILAK EYFKK >gi|330402635|gb|ADLB01000021.1| GENE 47 34177 - 34557 412 126 aa, chain - ## HITS:1 COG:CAC0249 KEGG:ns NR:ns ## COG: CAC0249 COG0346 # Protein_GI_number: 15893541 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Clostridium acetobutylicum # 1 126 1 126 126 165 63.0 2e-41 MKLDEIHHIAIIGSDYEKTKHFYVEILGFEVIAENYREEREDYKIDVRRGNIELELFIIK GRPKRVSYPEAYGLRHLAFKVDSVEEAAQELKNLGVETEPIRMDEFTGKKMTFFFDPDGL PLELHE >gi|330402635|gb|ADLB01000021.1| GENE 48 34913 - 35149 252 78 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260589982|ref|ZP_05855895.1| ## NR: gi|260589982|ref|ZP_05855895.1| conserved hypothetical protein [Blautia hansenii DSM 20583] # 1 78 211 287 287 63 44.0 4e-09 MLFEELLKEERETGRKEGLQEGESIGMINSLTMLLKNFGEVPESLQKRISEEKNLETLNL WFQLAIQADSLEEFISKM >gi|330402635|gb|ADLB01000021.1| GENE 49 35302 - 35739 464 145 aa, chain - ## HITS:1 COG:CAC3579 KEGG:ns NR:ns ## COG: CAC3579 COG1846 # Protein_GI_number: 15896813 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Clostridium acetobutylicum # 3 145 4 147 154 115 47.0 3e-26 MNTYETINDILVHLFNEIWELEEKAIITEEFKDITNNDMHIIEVIGLGEGSPMSAIAKKM NVTAGTLTTSMNSLVNKKYAVRERSEEDRRVVYIKLTEKGKKAYEHHAQFHHQMTEAVIK KLNEEEIPVLLKTLEGLSEFFRNYH >gi|330402635|gb|ADLB01000021.1| GENE 50 35775 - 36854 1120 359 aa, chain - ## HITS:1 COG:CAC3580 KEGG:ns 
NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 10 358 8 353 355 328 48.0 8e-90 MKREEVSFAVGEKKAYIPIVQGGMGVGISLHRLAGTIAKEGGVGLISTAQIGFREKNFDR NPLQANLHAMGKELKSARNLAPNGIIGFNIMVAARHYEEYVREAVRLEADLIVSGAGLPT DLPLYAKGTKTNIAPIVSTEKSAHVILKYWDKKYGVTADMIVIEGWKAGGHLGFHLDELG KWDDISYETEIGKIIGIVRFYEEKYERKIPIILAGGMYSAKDLKRAKELGADGIQVGSRF VTTEECDADIRYKEAYLKAGQEDICIVKSPVGMPARAIYNTLIRRVENGEQIPHTPCHQC VKGCNPKTIPYCITDKLIEAVKGNVENGLVFCGAEVYRAAKIETVKEVIDSFLLAEECV >gi|330402635|gb|ADLB01000021.1| GENE 51 36867 - 38558 1279 563 aa, chain - ## HITS:1 COG:CAC3568 KEGG:ns NR:ns ## COG: CAC3568 COG0825 # Protein_GI_number: 15896802 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase alpha subunit # Organism: Clostridium acetobutylicum # 306 563 8 265 274 334 61.0 3e-91 MNLKQMFKKTNYISLKGDVKKEQTPEVPEGILKKCNACKSAIFTEDVKNGYYICPKCHNY FRVHAKHRLEMVADEGSFEEWDRGLCTRNPLQYKGYEEKIRLTQEKTGLEEAVITGKAEI GGKTVALGICDSRFIMASMGEVVGEKITRMVERATKEKLPIIMFACSGGARMQEGITSLM QMAKTAAALKKHSEAGGFYISVLTNPTTGGVTASFAMLGDVILAEPNALIGFAGPRVIEQ TIGQKLPKGFQRSEFLLEHGFIDAIVERENLKKVLFQLIDLHEQKKICEKSNDIYAEETE KKAVGTAWDCVTKSRMKDRPVGSDYISRLFADFIELHGDRYFQDDKAIVGGIAKFHGIPV TVIAQEKGRTTKENLERNFGMPSPEGYRKALRLMKQAEKFNRPVICFVDTPGAFCGIGAE ERGQGEAIAKNIYEMSALKVPVLSIIIGEGGSGGALAMAVANEVWILENAVYSILSPEGF ASILWKDSKRAKEAAEVMKLTARELKEANIVEKVIEEPKHFTVETLSSVCDNLAAKISCF LWKYKDKAGNDIAEERYQRFRKM >gi|330402635|gb|ADLB01000021.1| GENE 52 38564 - 39829 889 421 aa, chain - ## HITS:1 COG:CAC3570 KEGG:ns NR:ns ## COG: CAC3570 COG0439 # Protein_GI_number: 15896804 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxylase # Organism: Clostridium acetobutylicum # 1 418 1 426 447 521 61.0 1e-147 MIKKVLIANRGEIAVRIIRTCREMGIETVAVYSEADREALHTQLADEAICIGPAPSRESY LNMERIISATITSGADAIHTGFGFLSENSKFAELCEKCHIIFIGPKSEVIRKMGNKSQAR 
NTMIEAGIPVIPGSSEIILDMERGKEVAVKVGYPVIIKAVLGGGGKGMRVAYTEEEFSKS FQMAQEEAKASFGDNSMYIEHFVENPRHVEFQILADKYGNVVHLGERDCSIQENHQKLIE ESPCTILSEELRRKMGEVAVKAAKAVQYENVGTIEFLLENNNQFYFMEMNTRIQVEHPVT EWVTGLDLIKEQIRIASGLPLPYDQSDIQINGHAIECRINAKSSGKITDVHFPGGEGVRI DTAVYHGYTVPPYYDSMLAKLIVHGKTREDAIAKMRSALGEVIIEGIGTNIDYQYDIIEK M >gi|330402635|gb|ADLB01000021.1| GENE 53 39839 - 40267 611 142 aa, chain - ## HITS:1 COG:BH3735 KEGG:ns NR:ns ## COG: BH3735 COG0764 # Protein_GI_number: 15616297 # Func_class: I Lipid transport and metabolism # Function: 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratases # Organism: Bacillus halodurans # 4 141 2 139 140 172 60.0 2e-43 MKGLTVQEICEIIPHRHPFLLVDYIEDYEPGEFAVGYKCVTYREDFFRGHFPEEPVMPGV FTVEALAQVGAVAILSKEENRGKIAYFGGIQKCKFKGKVVPGDKLKLETKIIKQKGPLGI GEAIASVDGKVVASAELTFMIG >gi|330402635|gb|ADLB01000021.1| GENE 54 40283 - 40708 621 141 aa, chain - ## HITS:1 COG:slr0435 KEGG:ns NR:ns ## COG: slr0435 COG0511 # Protein_GI_number: 16331454 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Synechocystis # 5 141 7 153 154 101 43.0 4e-22 MEKEQLIELIHTVSSSNLTEFQYEENGMKISMKRECQVVYAGEKELPAPQINVEEIQKTE EAQEGKLVTSVLVGTFYTAPSEDAEPFVRVGDAVTKGQTLAIVEAMKLMNEIESEFDGVV AEIFVENGQAVEYGQPLFRLK >gi|330402635|gb|ADLB01000021.1| GENE 55 40712 - 41956 1564 414 aa, chain - ## HITS:1 COG:CAC3573 KEGG:ns NR:ns ## COG: CAC3573 COG0304 # Protein_GI_number: 15896807 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Clostridium acetobutylicum # 1 411 1 411 411 455 57.0 1e-128 MKRRVVVTGMGAITPIGNSVEEFWTGIKAGQVGIGEITKFDTTEYKVKLAAEVKDFVAKE RMDFKAAKRMETFSQYAVAAAKEAFENAGLCPEEEDAFRMGVIVGSGIGSLQVVEREYDK IQTKGPSKVNPLMVPLMISNMAAGNVAIQLGLKGKCTSITTACATGTHAIGDAFRAIQYG DADVMLAGGAESSICPTAVAGFSALTALTASTDKNKASIPFDKDRSGFVIGEGAGIVVLE ELEHAKARGAKIYGEIAGYGATCDAYHITSPAEDGSGAAKAMELAMQEAGVLPKDVDYIN 
AHGTSTHHNDLFETRAISLAFGEDAQKPIINSTKSMIGHLLGAAGAVEFITCIKTIQEGF IHQTMGTENLDEECTLNYAIHSPIEKEVNCAISNSLGFGGHNATILVKKYREEE >gi|330402635|gb|ADLB01000021.1| GENE 56 41968 - 42708 243 246 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 2 242 1 238 242 98 30 3e-19 MLKGQTAVVTGGSRGIGRAIALELAEQGANVVINYHGSTEKAEAVKHEIEEKGGTAEIMQ CNVADFEESEKFFQQVIEKYKRIDILVNNAGITCDGLLMKMSEKDFDTVMNTNLKGTFHC IRFVARQMIKQRYGRIINISSVVGVAGNAGQANYAASKAGVIGLTKSAAKELASRKITVN AIAPGFIETDMTKVLPDKVKEDSIEKIPLGYYGKPEDVAGAVAFLASDKAGYITGQVLHV DGGMVI >gi|330402635|gb|ADLB01000021.1| GENE 57 42702 - 43637 864 311 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 1 301 1 306 308 275 48.0 6e-74 MSKIAFVFSGQGVQKAGMGKDFYEQSEIAKHIFDEASERIGLDLRKLCFEENDLLDRTEY TQAALVTTCLAIARVVEENGLHADVTAGLSLGEYTAIAVSGGMTDMDAISTVKKRGKFMQ EAVPQGEGAMSAILGMTAEQVESVITNMENVFVANYNCPNQTVITGKTEEVQSANEKLLS AGAKRAMLLNVSGPFHSPFLVDAGEKLRKVLADISFTKLKIPYVTNVTGKYITNIEETKE LLTKQVASSVLWQQSVENMIADGVDTFVEIGVGKTLSGFIKKIDRNVKTYSISNWQEMKK VTDELGGRDKC >gi|330402635|gb|ADLB01000021.1| GENE 58 43630 - 44568 1047 312 aa, chain - ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 300 2 301 310 381 67.0 1e-105 MKTRVTELLRIEYPIIQGGMAWVAEYHLAAGVSEAGGLGMIGAANAPAEWVRGQIREVKK LTKKPFGVNIMLMSPNADEVAKVIVEEEVPVVTTGAGNPEKYIKMWKEAGIKVIPVVASV AMAKRMERCGADAIVAEGTEAGGHIGENTTMVLVPQIADAVSIPVISAGGIGDGRGMAAA FMLGAEAVQLGTCFVVTKESVVHENYKKCIIKAKDIDSRVTGRTTGHPVRALRNQMTKEY LKLEKEGASLEELELLTLGGLRKAVVEGDVINGSVMAGQIAGLVKEEMSCRKFIEKLVSE TDRLMKGKVFYE >gi|330402635|gb|ADLB01000021.1| GENE 59 44589 - 44807 462 72 aa, chain - ## 
HITS:1 COG:aq_1717a KEGG:ns NR:ns ## COG: aq_1717a COG0236 # Protein_GI_number: 15606797 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Aquifex aeolicus # 3 71 5 73 78 63 62.0 8e-11 MLEKVKEIVADSLGAEVEELTAETSFKEDLGADSLDLFEMVMAFEEEFEKEIPTEDLEKL TTIGDVVAYLEA >gi|330402635|gb|ADLB01000021.1| GENE 60 44862 - 45785 569 307 aa, chain - ## HITS:1 COG:CAC3578 KEGG:ns NR:ns ## COG: CAC3578 COG0332 # Protein_GI_number: 15896812 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Clostridium acetobutylicum # 4 307 5 323 325 311 48.0 9e-85 MIGKICGTGAYAPPNVMDNNDLSKLVETSDEWIRERTGIVRRHIADGENTVSMAVKSAES ALKNANISAEEIGFIIVSTISSNVILPCAACEVQKEIGAVNAVCFDLNAACSGFLFAYNT AQAYIASGIYRTGIVIGAECLSNLIDWKDRGTCILFGDGAGAVVVKAKEGTQPSIITHSD GEKGKALTCEQNDFIHMDGQEVFKFAVKQVPKCICELLEANGLKKEEIHYFILHQANRRI VEAVAKRLKIEIEKFPMNLQEYGNTSSASIPILLDEMNQKGLLRKGMKIVMAGFGAGLSW GASLIEW >gi|330402635|gb|ADLB01000021.1| GENE 61 45953 - 46945 663 330 aa, chain + ## HITS:1 COG:lin1983 KEGG:ns NR:ns ## COG: lin1983 COG4129 # Protein_GI_number: 16801049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 7 314 5 312 327 163 32.0 3e-40 MKTLDYIKTIKISFGCCIAFFIAEVFSLNFSTSVITITLLSILNTKKDTFLVAGKRLLSF FIAVFVAILFFPFLNYSLLSLGIYLAVYQLLCQFWHLTEGFSMSTVLMLHLWKTKKMSLP LLANELGLMLIGISMGILMNLYMPNKVEKIRKAQKDIEEKMSQILFTMSRAIFQEYLSES IAQNLSELENLLSTSLTHAQYTQKNFFLKDMSYYVSYIRMRSEQTALLKQIHRNLPRLQE SYLQTNLVSKFMRVVAISMENYDNAEDLLKYLSLLRQKFRAASLPSSRQEFESRAVLYEI VNELREMLILKRDFTKNLSSHQIETFWVQK >gi|330402635|gb|ADLB01000021.1| GENE 62 46965 - 47399 600 144 aa, chain - ## HITS:1 COG:TM0865 KEGG:ns NR:ns ## COG: TM0865 COG1585 # Protein_GI_number: 15643628 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Membrane protein implicated 
in regulation of membrane protease activity # Organism: Thermotoga maritima # 4 142 5 140 140 71 37.0 6e-13 MESVIWLGLILVFLIVEIITVGLTSIWLAGGALVALILNQLGVNLIWQIVAFFVVSVVLM VFTRPFAKKYINAGRTKTNYESIIGKTAKVTETIDNLNETGAAVVDGQEWTARAKDTVDV IEKDEIVKVLDIIGVKLIVEKVQK >gi|330402635|gb|ADLB01000021.1| GENE 63 47444 - 47602 104 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNNLEEKQMMQAIHILLQDDKEYNARIIRDLTEEEFKSFLRQCPEFLEEIP >gi|330402635|gb|ADLB01000021.1| GENE 64 47718 - 47951 239 77 aa, chain + ## HITS:1 COG:no KEGG:DSY0564 NR:ns ## KEGG: DSY0564 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 5 70 4 69 70 65 46.0 4e-10 MYLKRLKDLRTDHDLTQENMADILKCHREVYRRYESGIRTIPIDYLVTIAKYYNTSTDYL LELTDVKTPYPPTKKKS >gi|330402635|gb|ADLB01000021.1| GENE 65 48168 - 49586 1474 472 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815962|ref|ZP_01968630.1| ## NR: gi|153815962|ref|ZP_01968630.1| hypothetical protein RUMTOR_02207 [Ruminococcus torques ATCC 27756] # 224 472 140 387 387 319 71.0 2e-85 MFCENCGKEIKDNDVFCPNCGKLIQEEKASNSSDVKNRSDIDKPKKKKSGKAIALVGIIA VIIAVIILVMKMFGGNSNTKEIVEKLKTSTTVSKTENGYEFTYGDMWTLCYTKESAFVII PDSTLSEYFEEAWGIGELGDDGTVRYNLKTENLNMYCDAEVEGENLSVITYDVEEDEITF IVDGERYELKKKYAQQIKKDDIAIMLKSGIDNFVDDLNYIDLTKEDVAHLTYKDIKSNVD DKELKTVKKDKESEVPKKEETEPKEEPVKEPQEESSESSEEENVLTDPITMGWAGTYKDD DSGERLVIASIGYEWSYVMYTASGQVEQKESGCSAGKDYLEGQYYTFYKNENGLLGVTSG AGQGWGNYRRISGTSTPDLEIEGTYENDTTIITVSEQMAEASTEYVPTGADIAAIHVKYN NGLEINGRLYTQGDGGLAIVDSISKEIYGTATFKNGQVEVTGSGFDGVLKSK >gi|330402635|gb|ADLB01000021.1| GENE 66 49619 - 50146 486 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815961|ref|ZP_01968629.1| ## NR: gi|153815961|ref|ZP_01968629.1| hypothetical protein RUMTOR_02206 [Ruminococcus torques ATCC 27756] # 1 175 1 175 175 217 66.0 3e-55 MFCENCGKEIKDGERFCRNCGSEIKKRGTGSEPESQKTFAIKGASKKKNWKTIAIIALVI VVVIGIFKWSSKEETLKFNVRVVNNTGFDICALYASEPNVDNWEEDLLDENILCDGEKMN IEFIITEDNLDWDFAIEDVEGNILEFYGLSFAECDVDGATLILEYDGYETTATLY >gi|330402635|gb|ADLB01000021.1| GENE 67 
50184 - 50525 224 113 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3526 NR:ns ## KEGG: Cphy_3526 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 109 12 115 124 98 46.0 7e-20 MDKKRIYKKVIMIGRATGCHQLPERSFFYHGKQFPVCARCTGVFIGEILGVALFKIAEIS NLTAILFCLIMFVDWFVQYKFQRESNNLRRLVTGFVCGYAFGNFIMKYLRYFM >gi|330402635|gb|ADLB01000021.1| GENE 68 50533 - 51360 527 275 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815959|ref|ZP_01968627.1| ## NR: gi|153815959|ref|ZP_01968627.1| hypothetical protein RUMTOR_02204 [Ruminococcus torques ATCC 27756] # 1 275 1 275 275 414 74.0 1e-114 MKYCGKCGNEINEEDQYCPNCGNRVNDEQVERTENSFQGNNTDKTLWESLVEFFKVLIEC IKDTGINIIRTIDKVIQTEPNIPKDSMCPYCNSEDTFPIVKSETEIKTKGYSIGRGCCGM CLLGPFGVLCGSFGSGSKVKSISQTWWGCKNCGKQHLAQHDAVEMMRVFIDKMVINCLCY GSIGSLLLYPILDELVNGFLRTLIIVFIAVIVGVSVPLYFFYQLCEDIDKQLGYSVWDIL EVEKKKEFWDSIKYSMISLAVTLIIVCPILIYFAE >gi|330402635|gb|ADLB01000021.1| GENE 69 51375 - 51926 361 183 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153815958|ref|ZP_01968626.1| ## NR: gi|153815958|ref|ZP_01968626.1| hypothetical protein RUMTOR_02203 [Ruminococcus torques ATCC 27756] # 3 183 45 224 224 263 74.0 4e-69 MIKCSNCGNEIREGTKFCSKCGSNILEQQEKRNDEISNEEKNKVEQSKAAFYFEKVKMIG RLRYKIIQTKVNNDGDGLEINQKTHRFLRKDKANHLEIKLSEISSMELKTKMDFWDTMYA ALFGVIFLLDTTDIAWLLLIAIFLYTGYGKILNLKMKNGLNFEIPVNGMTEDVERFRALV RKE >gi|330402635|gb|ADLB01000021.1| GENE 70 52029 - 52349 267 106 aa, chain + ## HITS:1 COG:CAC1578 KEGG:ns NR:ns ## COG: CAC1578 COG1396 # Protein_GI_number: 15894856 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 4 105 3 104 110 59 35.0 1e-09 MANLNFERIGEKLRIIRLSKNLTQEYIANVADVNTSHISNIENNRVKVSLSTLVQICNAL NTTVDYVLSEEYNDASSAIEQEIIHELHACDTKTKEQILKIIKALQ >gi|330402635|gb|ADLB01000021.1| GENE 71 52938 - 53159 177 73 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MALRKRVMGIEPTYLAWKASVLPLNYTRKVGVTGFEPATSWSQTRRSSQAEPHPDVKFFV VISVTQWLLYMLV >gi|330402635|gb|ADLB01000021.1| GENE 
72 53283 - 53762 527 159 aa, chain - ## HITS:1 COG:CAC2664 KEGG:ns NR:ns ## COG: CAC2664 COG0622 # Protein_GI_number: 15895922 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Clostridium acetobutylicum # 1 147 1 149 155 77 32.0 1e-14 MRVLIVSDTHRLHKNLDIALEQAGKIDLLLHMGDVEGGEDYIEAVAGCPVRIVAGNNDFF SQLNREEEVQLGNHRIFMTHGHYYYVSVGMERIKEEGLSRGADIVMFGHTHRPYLDIGEK ITVLNPGSLSYPRQEGRRASYMIMEIHEDDSVEYELHYI >gi|330402635|gb|ADLB01000021.1| GENE 73 53767 - 54366 368 199 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 1 191 1 192 200 146 44 1e-33 MSNKILFATGNENKMKEIRMILSDLGMPIQSMKEAGIDVDIVEDGSTFEENAIIKATAIA KMTGDIVLADDSGLEIDYLNKEPGIYSARYAGVDTSYDIKNRMLLDRLEGVPDEKRTARF VCVIACAFPDGTVETARGTIEGIIGHEIAGENGFGYDPIFYLPEYQCTTAELEPEKKNEL SHRGKALRAMRAIMEKKLG >gi|330402635|gb|ADLB01000021.1| GENE 74 54359 - 55522 1057 387 aa, chain - ## HITS:1 COG:BH1771 KEGG:ns NR:ns ## COG: BH1771 COG0116 # Protein_GI_number: 15614334 # Func_class: L Replication, recombination and repair # Function: Predicted N6-adenine-specific DNA methylase # Organism: Bacillus halodurans # 1 379 1 378 385 404 53.0 1e-112 MERIELIAPCHFGLESVLKREVQDLGYEISVVEDGRVSFYGDVEAICRANIFLRTAERVL LKVGSFKATTFDELFEKTKQIAWEKYITKDGKFWVTKAASVKSKLFSPSDIQSIMKKAMV NRLKEHYDAEWFEESGASYPVRVFIKKDIVTVGIDTSGVSLHKRGYRQLSSKAPITETLA AALIMLTPWKKDRILVDPFCGSGTFPIEAAMIAANIAPGMNRSFTAEEWTNFIPKKEWYN AVNEANDLVNDDIEVDIQGYDIDKSVIKAARENAREAGVEHLIHFQERAVKDLRHAKKYG FIITNPPYGERLEEKKDLPKLYGEFGESFKLLDSWSAYMITSYEDTEKYFGRKADKNRKI YNGMLKTYFYQFLGPKPPRKQREDTNE >gi|330402635|gb|ADLB01000021.1| GENE 75 55618 - 55773 196 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAKTNGTNNKTRNAYSSEADYQSKKNKTSNKNTNASKNKMNNAESNEYNNN >gi|330402635|gb|ADLB01000021.1| GENE 76 56102 - 56539 544 145 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0363 NR:ns ## KEGG: Cphy_0363 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 144 1 169 
169 64 28.0 1e-09 MYRNKFLSSLREALEGNMSEQAVKENLQYYKTYIEDEVKKGRTEKEVVEELGDPWIIAKT LIESPGGEQTYEEAEEDNVSRYEDRRQNVHILGLDTWWKKAALILAIVGIIFGIGTLLVG VVRIVLPILIPFLAIMFLIKFFHKK >gi|330402635|gb|ADLB01000021.1| GENE 77 56559 - 58670 2512 703 aa, chain - ## HITS:1 COG:CAC2658 KEGG:ns NR:ns ## COG: CAC2658 COG3968 # Protein_GI_number: 15895916 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Clostridium acetobutylicum # 7 703 4 696 696 863 60.0 0 MREFINVAEIFGENVFNDTVMQERLPKKVYKDLKKTIEDGKELDLVTADVIAHEMKEWAI EKGATHYTHWFQPLTGVTAEKHDSFISAPMPNGKVLMSFSGKELIKGEPDASSFPSGGLR ATFEARGYTAWDCTSPAFVRQDAGGATLCIPTAFCSYTGEALDQKTPLLRSMEALNKQTL RLLRLFGNTTAKRVIASVGPEQEYFLVDREKYLKRKDLVFTGRTLFGAMPPKGQELDDHY FGSIRERIAAYMKDVNEELWKLGVSAKTQHNEAAPAQHELAPIYAQCNVAVDHNQLIMEV LKKVAARHDLHCLLHEKPFAGVNGSGKHNNWSLTTDDGKNLLEPGKTPHENIQFLFILMC VLKAVDTYAELLRESAADPGNDHRLGAAEAPPAIISVFLGEQLEDVLEQLVSTGSATHSL KSGKLETGVRTLPNLAKDATDRNRTSPFAFTNNKFEFRMVGSRDSVAAPNVVINTIVADA FSDACDILEKAEDFDLAVHDLIKEYAIKHQRIVFNGNGYSEEWVEEAKRRGLPNLKSMVE SIDAMVSDKAVKLFEKFNVFTKAELESRAEIQYETYAKAINIEARAMIDIASKHILPAVV KYTKQLADTVIAVKEAGADAEVQSEILNEVSRLLKEAKEALVKLGEVTNQAAEKEEGKEQ AFWFYEEVSPAMQALRDPVDKLEMIVAKEVWPMPSYGDLMFEV >gi|330402635|gb|ADLB01000021.1| GENE 78 58803 - 59666 1215 287 aa, chain - ## HITS:1 COG:CAC0827 KEGG:ns NR:ns ## COG: CAC0827 COG0191 # Protein_GI_number: 15894114 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Clostridium acetobutylicum # 1 287 1 287 287 436 77.0 1e-122 MLVSAKEMLQKAKAGHYAVGQFNINNLEWTKAILLTAQENNSPVILGVSEGAGKYMGGYK TIVGMVNGLLEELNITVPVALHLDHGSYEGCLKCIDAGFSSIMFDGSHYPIEENVAKTKE LVAIVKEKGMSLEAEVGSIGGEEDGVVGAGECADPAECKMVADLGIDFLAAGIGNIHGKY PENWQGLSFETLDAIQQLTGELPLVLHGGTGIPADMIKKAIDLGVSKINVNTECQLAFAA ATREYVEAGKDLEGKGFDPRKLLAPGFEAIKATVKEKMELFGSVNRA >gi|330402635|gb|ADLB01000021.1| GENE 79 59778 - 65327 4875 1849 aa, chain - ## HITS:1 COG:no KEGG:SAV_7268 NR:ns ## KEGG: 
SAV_7268 # Name: not_defined # Def: hypothetical protein # Organism: S.avermitilis # Pathway: not_defined # 48 678 46 625 625 366 35.0 5e-99 MKKNRKRSRIMSLLLIFALLLQPTVSLAKDGEESAKKNNGVVIDVRDFGADPTGVNDSAE AIWKALEEAKKVSDGGKKEVTLNFPKGEYHIYKDKAQKREYHTSNTNSIENPVKTIGILI EEQKNLTIEGNGSLFMMHGNMMALAVAKSENITLHNFSWDFAVPTVSEMTVTNMGKSGGR DYTDFYIPNCFPYEIQGNTILWKSENSPYTNQPYWTEKGIHNSYAVVAYQPDDEMTRNYY TNQGPFQNAIKIEKLENNQVRVHYAVTRPAMQEKGMVFELASSAVRETAGAFTWESKNVR AEKVNVHFMHGFGWLIQMSENVYYKDCNLMPRENSGHLTVSYADGIHASGASGEIVIENC NFANTHDDPINLHGTFTRVEKRIDSHTLELKYIHNQQGGFPQYHVGDKVQFFTRDLLSST DGEKQYEVAEVISNPGENGNDLRTMKIKFKEDLPENLSDKIGGQPKYVAENVTYAPKVTI RNCTFRNVPTRGILCTTRKEVIIENNVFHNMSMATIFLSNDSNDWYESGPIRDMKIRNNT FYIKDIGRTSWEYAPAIYIHPVTKGGQFPDASNPIHKNISIEGNTFYMDEDTVVKAESVE NLTFKNNKVFRMNPDVTVGITLDNKTIQTGQSVQLKTDAKGNTNNKKVDNVFEFTKSKNI KIENNVYDDGLKLYAVAEDDATEKNISIKGDDIKVIRNKNQAPAEPVRNICYASTNPDVA TVDKNGQISGIKSGKTTVFAYYRWNDTIVKSNEIEVTVGGEAAETEKIEIKEEDNKEVKV NEKVQFTLKKGTGATWSVTDFLTGEETNSATISSTGEFTAKANGVVWVNAVKGTAKDRKA VVIYGGEVKALNADFEIKREDKAKYKLTDNSIEVTMQPGDLYTNTNTVKNLFLYNPKNVE KNNLRAVVKVEGLPIKENNQWDTASFILYRDDDNYVTVGKKSHYNGIATVEEKNGAAQEY HGNEADNNVTSAYLGITKKNDKITLSYQKENGEWQTAKEFTNASFGNDYKIGMACWESNA RAKKVTFSDFHVGSADTEFNKLLEGEEISVQQTKTVRPTVSDVKTKVENGKAKVEYKFAD TQGAKEGETVYRWYWKEGEVEKTTVTKEKELSVAGKKDIFCQVYPKDQYGIVGMPSEKVK VAVSPEQTDELQEISVNNGIVYKRGDNKEQNILVPQSLKKIMLSYVSVNDSVGKTQIYIN GTEVAGELNNTDTLILEVQDKDELRIVRGKNTYVLTVQLKGESAIQVNKLSMPQLQFEQN KPIETKSFYMNADGSKATSDIVVETEGKIGKVEVLAGEYRTPLNVSNEGNRYTAKADFRN GLNSFYVRVYAEDNTTYEQYILNVTYMPSTKMQVNGITLNGKGLTNFDAEKEEYFFALDK KEKNLNVKVNTDASAVKISLNGEVKNGTEATFNKLKEGENTLLIQCVAGDKLFKKNYRVT VYKHFDVNTGIQEVILDGKNITSDLLNEKKTSEQFVDSDKTTLKVTAIDPDATVEVKQTR EKAAKGNVFEGELNVYDGLQNVDIVVTAADRKTKETYRIELRKGAYLSDLEWESATVGYG TQVNRDRNHLGSTIQLANEKGEPVAFKKGLGTHAESVIVYDITGKGYKALEGFVGIDYCQ YNADDGKVQFCIEVDGQIVFDSGEMAQKTPMKKFAIELPENAKKVTLKALKGENNFNDHA DWADAKFLGEFPEKTYTITANANDDKMGVVEIVPEKEVYQIGEEVKVIANVNEGYKFTGW LENNEIISTEKEYTFTVDKDRELTAQFEKEEKPVEPDKPVVPEKPNKPVAPQIPPKPDRP 
NEQKPSQNGNAGQPQKPVKTGDDSQSTLALFGFGISAIALAYVLGRKRK >gi|330402635|gb|ADLB01000021.1| GENE 80 65302 - 65382 102 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKLLKTYKMKSKEKEEYYEKEQKTK >gi|330402635|gb|ADLB01000021.1| GENE 81 65517 - 65678 116 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYAFTKYIFYCIGKLCTTKNCPNPFFKELDSFYMLKMPTPENIQMLGISFSFV >gi|330402635|gb|ADLB01000021.1| GENE 82 66037 - 67407 1567 456 aa, chain - ## HITS:1 COG:MA0901 KEGG:ns NR:ns ## COG: MA0901 COG0733 # Protein_GI_number: 20089780 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Methanosarcina acetivorans str.C2A # 3 455 8 456 459 467 56.0 1e-131 MEKREKFSSRLGFILISAGCAIGLGNVWRFPYITGQYGGGAFVLIYLFFLLVLGLPIVIM EFAVGRASQKSVAKSYEVLEPKGTKWHIHKYFAIAGNYLLMMFYTTVGGWMVAYFVKMAK GEFAGAGAEKISEIFTALTSNRNEMLFWMLVISVIGLIVCSLGLEKGVEKITKAMMVSLF LIMIVLVVRAVTLPNAMEGLSFYLKPDLSKMKEVGIGTTVFAAMGQSFFTLSIGIGALAI FGSYIGKERTLAGEAISVTLLDTCVAIMAGLIIFPACFAFGINPGEGPGLVFVTLPNVFN EMAGGRLWGALFFLFMTFAALSTIIAVFQNVITCARELWGWSLKKSTWINGILLIVLGIP CVLGMTDWAGFTVAGKNIMDLEDFIVSNNLLPLGSLVYLLFCVTRYGWGWDNFLKEANAG TGLKFPKNKAVKFYLTFILPVVVLFIFVQGYWTMMK >gi|330402635|gb|ADLB01000021.1| GENE 83 67422 - 68357 882 311 aa, chain - ## HITS:1 COG:BS_yerQ KEGG:ns NR:ns ## COG: BS_yerQ COG1597 # Protein_GI_number: 16077740 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Bacillus subtilis # 1 275 1 276 303 175 34.0 1e-43 MKKLLFIYNPNAGKGLLKPRLSDIFDIFVKAGYEVTVYPTQKYRDGYRKVADFTGDYDLL VCSGGDGTLDEVVTGMMQRENKIPIGYIPTGTTNDFAKSLHIPRELLKAADVAVNGEIFS CDVGRFNKDIFVYIAAFGLFTDVSYQTKQEIKNVLGHLAYVLEGTKRLFNIPSYNIRVTH DGEVIEDEFIFGMVTNSRSVGGFKNMVGKKVVFDDGEFEATLIKKPKNPLELQGIILALV SEQQDSKYMYSFKTKEVTFESLEEISWTLDGEFGGEHDKVVIRNAQKELQIMVDAECKSS LMEYPQEKEEE >gi|330402635|gb|ADLB01000021.1| GENE 84 68410 - 68667 437 85 aa, chain - ## HITS:1 COG:BS_crh KEGG:ns NR:ns ## COG: BS_crh 
COG1925 # Protein_GI_number: 16080527 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Bacillus subtilis # 1 77 1 77 85 65 51.0 2e-11 MIKQPVTIQAEDGICARVTAKAVQVANEFKSQVFIEVGTKKINAKSIMGMMSLQLSTGDE LTVVADGDDETMATEKMKELLVSGK >gi|330402635|gb|ADLB01000021.1| GENE 85 68660 - 69622 719 320 aa, chain - ## HITS:1 COG:CAC0513 KEGG:ns NR:ns ## COG: CAC0513 COG1481 # Protein_GI_number: 15893804 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 313 1 314 317 247 43.0 3e-65 MSFSGEIKEELGGQISTARHCQIAEISALISMCGSVMIDSNNRYAIKIHTENLIVARKCF TLLEKTFNIRTEISIRKNLIRQSVSYWIIVKKHEEAIKVLQATKLINKDGEVFEELSIVK NVIVQQYCCKRAFIRGAFLASGSISDPEKSYHFEIVCAVRAKAEQLQKIMNSFGIDAKVI LRKKSYVVYVKEGAQIADLLNIIEAHVALMKFENVRIIKDMRNTVNRKVNCETANINKTV SAAVKQVEDIVYIRDTIGLENLSDALRDVALTRLEYPEATLKELGDLLTTPIGKSGVNHR LRKLGEMADKLRENKEVRYD >gi|330402635|gb|ADLB01000021.1| GENE 86 69636 - 70496 828 286 aa, chain - ## HITS:1 COG:CAC0511 KEGG:ns NR:ns ## COG: CAC0511 COG1660 # Protein_GI_number: 15893802 # Func_class: R General function prediction only # Function: Predicted P-loop-containing kinase # Organism: Clostridium acetobutylicum # 1 281 1 286 294 290 53.0 2e-78 MRVVIVTGMSGAGKSTALKVLEDVGYFCVDNLPIPLFPKLSELLKVPGPEFNKIALGLDI RGGQEFSELEEMLKEISCEILFLDASDSVLIKRYKETRRNHPLTAQGSIGDGLLQEREIL EPIKKRADYIIDTSLLLTRELKQALQNIFVNNQEYKNLYLTVVSFGFKYGIPNDADLVFD VRFLPNPYYIDELRPRTGNEKEVQDYVMSNGKADIFLQKLKDMVEFLIPNYISEGKHQLV IGIGCTGGKHRSVTLANALYDYFKEKNQYGVRIEHRDIEKDALRKM >gi|330402635|gb|ADLB01000021.1| GENE 87 70498 - 71409 954 303 aa, chain - ## HITS:1 COG:CAC0510 KEGG:ns NR:ns ## COG: CAC0510 COG0812 # Protein_GI_number: 15893801 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Clostridium acetobutylicum # 3 302 5 303 305 286 47.0 3e-77 MDRKFLDRVTEIIDKERVFTDEPMSRHTTFRIGGPADCFVCPETVDEVQKVVRLCNEMDM 
PFYLLGNGSNLLVGDKGFRGVIVRLYKQMDKIEVSGTKIRAQAGALLVKVASEACRNGLT GLEFAGGIPGTLGGAVVMNAGAYGGEMKNVLEEVTVLTREGELLTLSKEELELGYRTSIV GRRGYIALEAVLQLEKKDAKEIREYMNELREKRTTKQPLEYASAGSTFKRPEGHFAGQLI EQAGLRGFRVGDAQVSEKHCGFLINRGNATAEDVVELMREVTVKVEEKFGVTLEPEVKKL GEF >gi|330402635|gb|ADLB01000021.1| GENE 88 71568 - 72851 1383 427 aa, chain + ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 19 422 17 426 426 250 40.0 3e-66 MDSSVVTQSIVLIILLALSAFFSSAETSLTTVNRIRMRSLAEEGNKRAKMVIRITEDSGK MLSAILIGNNLVNLSASSITTSIAYVFGGSAVAIATGIITLLILLFGEISPKTVATIHSE KLALVYAYPISFLMKIATPFIFIVNSLSKVILFILRVDPNAKNDSMTESELRTIVDVSHE DGVIESEEKEMIYNVFDMGDAKAKDVMVPRVHVTFADVESTYEELIEIFREDKFTRLPVY ENTTDNVIGTINMKDLLLFDNRKEFHVRDILREAYFTYEYKSISELLVEMREASLNIVIV LDEYGETAGLITLEDLLEEIVGEIHDEYDENEEEFFKEINEHEFIVEGSMNLDDLNDRLD LGLTSEDYDSLGGFIIEQLDRLPELHDEVVTDSGIRLVVETLDKNRVEDVHIYLPENFRE EKDEEEE >gi|330402635|gb|ADLB01000021.1| GENE 89 72921 - 73766 584 281 aa, chain - ## HITS:1 COG:AGl3273 KEGG:ns NR:ns ## COG: AGl3273 COG0395 # Protein_GI_number: 15891757 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 14 281 14 276 276 139 33.0 5e-33 MKRRKKVSLLYIAAVILFLILTLGPFVWAFIVSVTPEYEMFKNTVGMLPKEMTWNNYEML LDVNSSQHIRLFTGLANSMKAVGCTILFGVPIALVSGYVLTRMEFPGKKFIRNSLLITMV IPIFATIIPLFRIFVDKGLLNNTFWLSMVYVSSFLPMNVWLISNYFATIPKELEEAARVD GCGRIHTFFKIILPISYPVIFSSILIMFLSGWSQFQIPLILASSAETKPVAIVTSEFMTK DMVQYGVTAAAGLLAVIPPAIVALIFRKFLVAGMTKGSTKG >gi|330402635|gb|ADLB01000021.1| GENE 90 73766 - 74644 916 292 aa, chain - ## HITS:1 COG:BH1245 KEGG:ns NR:ns ## COG: BH1245 COG1175 # Protein_GI_number: 15613808 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus 
halodurans # 44 290 189 437 445 143 30.0 3e-34 MLKRNKEKLLAYVTLTPVILIMCTLVLYPIVMTFSYSLHKMKLTAPNDIKLIGLENYISI LKSDTFWYSLQNTLFLLVIVMAVTTVCGFAVALILNVDTKISGILTAIAILPWAMPPIVN GIIWKFVFHSGYGFMNKLLISLKMIDKPIEWLTDRWLLLLVVALVVAWRSVPFCAIVTVS GLRAIPEELYEAAKIDGADRKSRFFHITLPLVKPFLIIGMTSASITAINIFDEIVAISGY KDLGKNLLIENYLTTFSFLDFGKGSALTYIIMLLSAILGFFYLRSLNKEVEY >gi|330402635|gb|ADLB01000021.1| GENE 91 74646 - 75902 1730 418 aa, chain - ## HITS:1 COG:mlr3639 KEGG:ns NR:ns ## COG: mlr3639 COG1653 # Protein_GI_number: 13473140 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 62 408 58 405 412 115 27.0 1e-25 MRKKIVSILLCMALVGTMAAGCGKKETSKDKADGKKEITFMIPDWGNPGEELLKEFEEES GIKVNVEEVSWDDIRDKVSIAAAGGEAAADVVEVDWSWVGEMNSAGWLEPIEMSEEDKKD MPTLETFTIDDKVLAVPYANDYRIAYYNEEHFKKAGIEKAPETWDEVYAALKKLKESGVT KYPFTLPVNADESATTSMIWFAFARNGVAFNEDGTLNKEAVLDALKFEQKLVDEELIDPA AKTFSGMDCYRKLTSGEASFMVGPTKFVGISNNPEESSVVGQIKPILLPGKEGTSQVTMP LPEAIGITAFSKNKEEAKEFVKWYSSPEVQKKLYAQNSSIPTRNSVIEELINDGTIKNAG AMLDQAKLIKSPFPKGVPDYYSEMSNTIYNNVNKMVLGEISPEKAFDTMDKKIKELVK >gi|330402635|gb|ADLB01000021.1| GENE 92 76029 - 76721 881 230 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2059 NR:ns ## KEGG: Cphy_2059 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 229 1 254 254 70 28.0 6e-11 MSGKLQFREKLNGILELGKEKHRVLTMEDVEKYFEEDTLSEEQMELVYDYLLSQKIAVTG YVKKGGTVVEEEEEKNFPKEDELYLETYLEEMSSLVPLNEREEKLKQYFPKVVEIAKELH TPEFFIGDLIQEGNVGLMLALEKEEAEEQEILEGIRQMMQLLVEEQEEVKHQDNKMVEKV IFLDESLKNLTEDLNRKPTLEELADYMEMTEEEVNDILRLTGEEDDEEEE >gi|330402635|gb|ADLB01000021.1| GENE 93 76740 - 77501 1040 253 aa, chain - ## HITS:1 COG:BS_cwlC KEGG:ns NR:ns ## COG: BS_cwlC COG0860 # Protein_GI_number: 16078804 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus subtilis # 3 252 2 252 255 118 30.0 1e-26 
MPISIMLDAGHGGTDPGAVYKGRREADDALQLTLAVGEILQNHGFDVEYTRTTDVYESPY QKAMEANEAGVDYFVSIHRNSYPTDNEVSGVESLVYDLSGIKYEMAKNINAQLEDIGFVN LGVHARPNLVVLKRTKMPAVLVEAGFINSDTDNQLFDESFDDIANAIADGIMDTVMKPGM NQVMQYRVQVGAYRQRRYAEALLNELQAQEYPAFIEQGNGFWRVQVGEYPTLKEAVVMER RLKQAGYPTIIVS >gi|330402635|gb|ADLB01000021.1| GENE 94 77674 - 78114 576 146 aa, chain - ## HITS:1 COG:rpiB KEGG:ns NR:ns ## COG: rpiB COG0698 # Protein_GI_number: 16131916 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Escherichia coli K12 # 2 146 3 147 149 176 60.0 2e-44 MRIGIGNDHAAVEMKEQIVAFLEELGHEVVNYGTDTNDSCDYPVYGEKVGRAVVDGDVEC GILICGTGVGISIAANKVKGVRAVVCSEPYSAKLSKQHNNTNILAFGARVIGIELAKMIV EEWLNTEFEGDRHQRRVDMISDIENR >gi|330402635|gb|ADLB01000021.1| GENE 95 78212 - 78994 279 260 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 1 260 14 265 329 112 30 2e-23 MSEQKYVYEVRDLYFAYGEHKVLQEVSFSLQKGKITTLIGANGCGKSTLFQLLTRNQKPE SGEIKLYGENLRDIKLKDLAKKSAIVHQYHTVLEDLSVEKLVSYGRIPHRKAGSFRLSEK DREKVEWAMKVTRITKYRNRPVSKLSGGQRQRVWIAMALAQDTELLLLDEPTTYLDVRYQ IQILQLVSKLNQEYGLTIVMVLHDINQALYYSDEILAMKNGKIIGQGMPKEVLTEELLEE VYGVRLHVFTVGEKSFISAV >gi|330402635|gb|ADLB01000021.1| GENE 96 78991 - 79977 1133 328 aa, chain - ## HITS:1 COG:SPy1794 KEGG:ns NR:ns ## COG: SPy1794 COG0609 # Protein_GI_number: 15675632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 23 324 36 337 340 255 51.0 6e-68 MIGRRRKIISFLVTGAALIVLFLVAVNTGSLKASPGQIIKGLFVEYDETVATIYSLRFPR ILIAMLAGAAVAVSGVMLQAVMKNPLADPGIMGISSGAGMAAVLVTVFAPSLYLAVPVFA FLGGIIACLIVYLLSWKGELSPLRVILTGVAVNAFFSGLMSAGESMMGTDYSGAASIVNG NITMKTWGDFQMLSVYTVIGLILAFIFAVKCDILGLEDKTILSLGIRVNTVRLLVSGVAV LLAAVSTAVVGSISFLGLLVPHIARLLVGNSHKVLIPYTMLLGAFTLLLADTVGRTIAYP YEINAAVIMAVIGGPSFILLLRRAKTEL >gi|330402635|gb|ADLB01000021.1| GENE 97 79974 - 80864 984 296 aa, chain 
- ## HITS:1 COG:SPy1795 KEGG:ns NR:ns ## COG: SPy1795 COG0614 # Protein_GI_number: 15675633 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Streptococcus pyogenes M1 GAS # 1 286 2 283 294 271 50.0 9e-73 MKKKPLIAVILSALFLTGCVNQSGEQNVNNSAKEKRIVATSVATCEILDRLEWDNVVGVP KTDSYSIPKRYEKAKNVGSPMAPDMEIVKSLKPDIVLSPNSLEGELKSQYENIGVESYFL DLKSTEGMYESILTLGKMLGKEARAEELYKEFEDFKTDFAKKHTGEEAPTVLLLMGIPGS YVVATESSYAGSLVKLAGGINVYGDGAGQDFLNINPEDMVKKKPDIILRTSHALPEQVKK MFAEEFKTNDIWKHFEAVQQGYVYDLNNEKFGMSANFRYKEALQELETYLYQEGTN >gi|330402635|gb|ADLB01000021.1| GENE 98 80868 - 81563 817 231 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210614877|ref|ZP_03290376.1| ## NR: gi|210614877|ref|ZP_03290376.1| hypothetical protein CLONEX_02590 [Clostridium nexile DSM 1787] # 1 231 1 239 239 312 69.0 1e-83 MKRWKEVVLGLGMAAMLAVFHIPVHAMEDGAYTVGRTTSYANPETNKTVDGGTNIALGDS MADSIVEDSVLVEQSQGKTYVTLGIGLMSNVNNVRMQIKGNDGNYRNVELTQTGTCERDG DTCNHYRFEVDSAENYISPILFVEPMGRDVQFFIKLDMTSAKKGNGNFVSEMIKTEDEQP VKEESVKKTSKETSKETPKESKKKVPVGEIAVGAAVVLVIGSLVVFVKKRK >gi|330402635|gb|ADLB01000021.1| GENE 99 81566 - 83164 1563 532 aa, chain - ## HITS:1 COG:no KEGG:LSEI_0291 NR:ns ## KEGG: LSEI_0291 # Name: not_defined # Def: lacto-N-biosidase # Organism: L.casei # Pathway: Other glycan degradation [PATH:lca00511]; Amino sugar and nucleotide sugar metabolism [PATH:lca00520]; Metabolic pathways [PATH:lca01100] # 255 319 382 446 569 64 50.0 9e-09 MKNRTIKSMAKRLAALAMAAVCTFSLTQVPAVKAAEVLEAGDYVVPITSLKSKAPIPAVN EAFNKAFGENANVSVDEKGNMTATVENRHMVINMMGEYHANVLTVEGAEYLSYKTEQSST LFGKPTEITQIEVPEKMKFPITPSGNGECSLTITVDFMNNMMGGGKPYPTTVTLTLDFSK AQADTSGLNALLQSYDGLKKEDYTEESWKAFELKKKEAKELLAGGKASASEISGMIASLK DLKGKLQLVSELPEADYSKVDAAIATVPKDLTKYTDGTVKTLENALQAVIRGLKQNEQDK VNQMAANIEAAVKGLKKKPTSDNNTNNGTNNDTNSQPNLNTGNTTLDKNNLKDGIYEVPV WLWHATKNQASMAAQSLNNTARIVVKNGVKTMYIYTKSMKYGNIEASLQELKVEGADGKY ISAKVEERDAAGNPVCFSFTLPHTKEYLNVKVNPHVAIMGNQDIDARFRINYSSLKQVSS 
VQVTKPSAQSSYGASNAPKTADYSNYSALLLAMLLSAAASVATLVYRKRQAR >gi|330402635|gb|ADLB01000021.1| GENE 100 83186 - 83263 74 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVALNYFLLVSSNPRALKKATEIAD >gi|330402635|gb|ADLB01000021.1| GENE 101 83388 - 84158 708 256 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210614875|ref|ZP_03290374.1| ## NR: gi|210614875|ref|ZP_03290374.1| hypothetical protein CLONEX_02588 [Clostridium nexile DSM 1787] # 1 256 1 256 256 404 76.0 1e-111 MDVLRVKILICFQNPKEQWNVTNLAITLGVEKYVVSRILSALEKEGLLDKSDRRNPVLTR KGRAVAKEYSQKVELVIGHLLGTGVSQETAREDAVIIASYCKEETLEVLKKEEIAKRVKY SFREGMEFGGERLSGRYPDGNYPIPFAIFQKELDLEKEVSTWNERFENPCILNVQNKNGK VFLQMLGEYREYQFAYWDGEAWCDMERQGRLLSFRADWIRFQSMADDRGRILLGKIWIKI YRGRDMKKLLFTMYIA >gi|330402635|gb|ADLB01000021.1| GENE 102 84330 - 85103 735 257 aa, chain + ## HITS:1 COG:Cgl0894 KEGG:ns NR:ns ## COG: Cgl0894 COG0561 # Protein_GI_number: 19552144 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Corynebacterium glutamicum # 3 250 17 264 277 137 35.0 3e-32 MIKLVASDIDGTLLRYGEQKLNPELFDIIRQLKKKGIHFIAASGRQYASIVNLFEPVKDE ISYITENGSLCIHNGEVLSRGTIDRELGIRIFEEIRKREHCDAVLSCEEACYIETADKEF FRHLVEDMANKVIRVDDLSTVASPFLKIAIFDNVTPEETFVHFREEFKSDIKVVTSGNEW VDFISPEANKGTALQVFLDYFNILPEECMAFGDQHNDIEMLHLAGEGYAMSDAAPGVASH STYVTDSVIETLKELLE >gi|330402635|gb|ADLB01000021.1| GENE 103 85149 - 85781 498 210 aa, chain - ## HITS:1 COG:CAC2496 KEGG:ns NR:ns ## COG: CAC2496 COG0546 # Protein_GI_number: 15895761 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 3 210 1 207 208 159 40.0 5e-39 MKMQIESIILDVDGTLWDSTELVAKSWNRAIRDIGITHIDVTADRLKTLFGRTMKEIADV VLAGCSEEEKAQVMDLCCQYEHEDLENDECDMLYPNVRETIIELSKTHRVFIVSNCQSGY IEMFLKKTNLEKYVTDIECFGNTGRGKGENIRLIMERNHSESACYVGDIRGDYEASKQAG VPFVFAEYGFGDVPESEWKIKEFSELLKLE >gi|330402635|gb|ADLB01000021.1| GENE 104 85784 - 87571 1830 595 aa, chain - ## HITS:1 COG:FN0453 KEGG:ns NR:ns ## COG: FN0453 COG0006 # 
Protein_GI_number: 19703788 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 595 1 584 584 483 43.0 1e-136 MSIQERISMLREQMKSHGVDMYIVPTADFHQSEYVGEYFKARKFITGFSGSAGTAVITLE EARLWVDGRYFIQAAEQLQGTEIQMMKMGQPNVPTLDKYIEETLQNGQTLGFDGRVVSMG NGQKYAKIVEEKQGKIVYDMDLIGEIWEDRPSLSKEPVFALEEKYTGESTESKLSRIREV MKENGATVHILTTLDDICWTLNIRGNDIEFFPLVLSYAIITMDKMHLYIDETKLSDEIRV NMEADGVILHKYNAIYEDVKQIGEEEVLLIDPMCLNYAIYSNISADVKKVEKRNPEVLFK AMKNPSEVENMRQAQIKDSVAHVKFMKWLKENVGKITITEMSASDKLDEFRAEMGNFIRP SFEPISSYGEHAALCHYTSSPETDVELKEGNIFLTDTGAGFYEGSTDITRAYALGEIPEN MKEDFTVVAMCNLELSNAKFMEGCTGVNLDILARKPLWERNKDYNHGTGHGIGYLLNIHE DPANFQLRYREGNTAVLQEGMILTVEPGIYIEGSHGVRLENEVLVCKGEKNEYGQFMYLE TITYVPFDLDAIKPEMLTEEYRRYLNTYHKTVYEKVSPYLNEEEKEWLKKYTREI >gi|330402635|gb|ADLB01000021.1| GENE 105 87649 - 89040 1612 463 aa, chain - ## HITS:1 COG:CAC1091 KEGG:ns NR:ns ## COG: CAC1091 COG1362 # Protein_GI_number: 15894376 # Func_class: E Amino acid transport and metabolism # Function: Aspartyl aminopeptidase # Organism: Clostridium acetobutylicum # 2 459 8 463 465 585 62.0 1e-167 MERKNMWKQYTVEQLNEVEEVSKRYKDCLDAGKTERECVSLAVSMAEEHGYRNLEECIAE NTELKAGDKVYVNQMGKALVLFQIGQKPLEEGMNILGAHVDSPRIDIKQNPLYEDTDLAY LDTHYYGGIKKYQWVSIPLAIHGTVAKKDGTTVDIVVGEKEEDPVLVISDLLIHLAGEQM DKKASVVIEGEGLDVLVGSKPLSTEDKEEKELVKATILSFLKDNYGIEEEDFLSAELEIV PAGKARDCGFDRSMIMGYGHDDRICAFTSLFAMLEVENPERTSCCLLVDKEEIGSVGATG MESKFFENAVAELVALTVGESSLKVRRALQNSYMLSSDVSAAYDPMFANAFEKKNTAYFG RGLCLNKYTGARGKSGSSDANAEYIAKIRKVFDDNEIGFQTAELGKVDFGGGGTIAYIMA DYGMNVIDSGMAVLSMHAPWEIASKADIYEGYKGYKAFLREMK >gi|330402635|gb|ADLB01000021.1| GENE 106 89173 - 89355 267 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167759308|ref|ZP_02431435.1| ## NR: gi|167759308|ref|ZP_02431435.1| hypothetical protein CLOSCI_01655 [Clostridium scindens ATCC 35704] # 1 59 1 59 61 75 69.0 8e-13 MKAKVNEGCIACGMCVSLCPEVFRFNDEGLAEAYTDITEETRETAEEARESCPVSVIDLN >gi|330402635|gb|ADLB01000021.1| GENE 107 89403 - 94286 5038 1627 aa, chain - 
## HITS:1 COG:AGc4704 KEGG:ns NR:ns ## COG: AGc4704 COG3525 # Protein_GI_number: 15889853 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 488 875 204 610 639 86 23.0 4e-16 MKKWKKKVLALALSMSLAVTGNVTIFAETKAPKANDNLALGKTVTASSDHANFPKSNLTD ENQESRWSAEAAPEQWAYVDLGAKKSMNYFSMLWESDTEYGTDFKIYVSDSTTDWGKPVA EKTGNTTRKTEVKLDHTVEGRYVKLSVTKVSKYPNVSCCDFQVKLVDDTTEPEEPEKPQK PQNPQENVAQGKTAVADSVEANNLTVDKAFDGNTKDTRWASAMNSNPHWIYVDLGAVKDV KTIRLFWETRKATNYRIQTATTLSSQMKDSDWTDAKVMNTRPASKTEKIVLDNKVQARYV RLMIDKFTAEDVDGGATWNTISVYEMEVYGGEPKAGMDDLAREIQIKTPERTDKKLQVTL PTSEDYTIKYNGTDYEQVVGPNLEIYQPLVDTVVKVSFKLTDKKNDKNYMFEERNVVIPG RLAKEEGDNKAPKVLPELREWKGSTGNFTPSASSRIVIQSDALKDMANAFAKDYKEITGK EISVVKGEAKPGDFYFTLTTDKTKGLQKEGYSMEIADKVTVEAETTTGAFWATRTILQSL KLTNNIPKGQTRDYPLYEVRGFILDVGRKTFTMDYLQQVVKEMSWYKMNDFQVHLNDNLI GLENVKDPMKAYSGFRLESDIKEGGNGGKNKADLTSKDVFYTKNEFRDFIEESRTYGVNI VPEIDVPAHSLALTKVRPDLRHGTSGRQNDHLNLIGDKYNDSLEFVKSIFGEYMTGANPV FDKQTTVHVGADEYTADGNAYRKFANDMLGYVKESGRKARIWGSLSTIKGNVDVQSENVE MNLWNFGWANMDKMYEEGFDLINCNDGNYYIVPNAGYYYDYLSNETCYNLAINTIGGVTI PAGDEQMIGGAFAVWNDMTDYLNNGISEYDVYDRIKEPIALFGAKLWGKGDMNTNQALDL SKQLNDAPQTNFGYETAKDKEGAIANYPMDDKSDYSANKHNLVDGKNVEITNVDGKSALK LNGKESYINTGVGTVGLGNDLRVKVKRTSESVDEQVLFESSYGSIKAVQKETGKVGISRE NFDYSFNYELPVNEWVELEIKNVNNMKNNRIQTRIELYVNGELKDVLGDDEQVEGKSMIA TTMFPVDTIGSKTKAFEGYVDDVRVGTAVEGTYTSTITLDRAVWTATEVAKDNEALKPLL EEAKQLLTKFNPTKAEVENLTNRINEVISSTQFKKADYSRVDKYLALVPEDLSAFTKESV DNLKYVIGSIRRDLPASMQDVVDGYEKALAKALKELKIYEAANVNYVDNRTVKATACSKN GSEGPDKAIDGNKNTIWHTNYSGEDKCASNQHWINFEMKEPTAVSGLTYTPRANGGNGNL TSYEIKASDDGKNYTTVKTGTLADDAKEKVIDFGKTVTTKHIRVIYKASHGGFGSAAEFL IHRAGVKADVEGLNALINEAKAMKNEGYTEDSWKALQDKIADAEELAKAENPDANDVEVM KKQLRKAMMSVVLEEKGDEPTDDEKAALRDAVKKYSKYDEKDYTAESWKAFAKALADAKK VLDNKNATKEQIDNALNTLEEAADALVPADNGNGNDNGNNNGNGNNNGNGNNNGNGNNNG NGNNNGNGNNNGNGNNSGNGNNNGNSNNKPTGTKPPKTADATPIGLWLALLVVAGGSIFF SKKRKHS >gi|330402635|gb|ADLB01000021.1| GENE 108 94520 - 95641 645 373 
aa, chain + ## HITS:1 COG:no KEGG:FMG_1520 NR:ns ## KEGG: FMG_1520 # Name: not_defined # Def: putative multidrug-efflux transporter # Organism: F.magna # Pathway: not_defined # 12 355 31 383 403 205 37.0 2e-51 MKHRNITTLKYISLFGGLIFYAPVALLVRTRTGMTYSQFFVLQAVLSISIFVFEIPCGYI TDKIGYQKTLILNSLFTFLARICMVFAGNFPLFFIEAVLEGISIAFYSGTMSGYLYQLSE EHFAFSLSVVDNYSNLGFILSTLTFPYIYSAWGIDGLLVLTALCSFIGFLLAFHLPNEKR HPVKKKISFSFHLTDIPISLFIGIINISFLIINFFYIGILTQFHMDERYMSILILLYTTI QLSVSKLTQKFGENNLFKKIYVCSGMCIFCIFSISLCRSRAIFFPMLILPTLLSLLNVYF EKYQNNYIDYRGYGQHRATILSCYNMSANIVEVFFLFGSAKLKNIYVFDIFTILGCIFTI ILIFILIFQARKK >gi|330402635|gb|ADLB01000021.1| GENE 109 95694 - 96725 1310 343 aa, chain - ## HITS:1 COG:SP0034 KEGG:ns NR:ns ## COG: SP0034 COG2855 # Protein_GI_number: 15899980 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 1 343 1 336 336 332 60.0 7e-91 MKFIEKNWKGLLLCLAIAVPSSMLGKKFAIIGGPVFAIIIGMILALLVKNKTAFESGVKF TSKKILQYAVVLLGFGMNLKVVMQTGKQSLPIIICTISTSLIIAYIMHRVMRVPTKISTL IGVGSSICGGSAIAATAPVIDAEEEEVAQAISVIFLFNVLAALIFPMLGTGLGFSTTSGE AFGIFAGTAVNDTSSVTAAASTWDSMYQLGSQTLDKAVTVKLTRTLAIIPITLVLAFMRA RKAENESGEKVSLKKVFPFFILFFIGASLITTVATSMGVPATMFNPLKELSKFFIVMAMA AIGFNTDIVKLVKSGGRPILLGMSCWVGITAVTLGMQHFMHIW >gi|330402635|gb|ADLB01000021.1| GENE 110 96736 - 97635 371 299 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 2 295 1 297 306 147 31 5e-34 MMYDVVIIGAGPAGISAGIYAVSRGKKVLIVEKNIVGGLIGKVSTVTHYAGIMSEETGVT FANRLKEQAESAGIEIAYGEVKRVELTGEVKKIYTEDKAYSAKKIVLANGTTPRKLGIIG EKELAGKGMGMNAQRDGERYQGKHVYVVGGADGAVKEALYLSKYAEQVTIIHFEETLGCI AEFREKVARTPNMKLRLGSRLHAVYGVDKVESLEISDEKTGAIETISDSGCGIFVYAGTT PNTELYTELKLENGFIPVDENMETEIEGVYAAGDIRVKQVRQVSTAVADGTVAGVHLAK >gi|330402635|gb|ADLB01000021.1| GENE 111 97827 - 98669 425 280 aa, chain + ## HITS:1 COG:Cj1000 KEGG:ns NR:ns ## COG: Cj1000 COG0583 # Protein_GI_number: 15792327 # Func_class: K Transcription # Function: 
Transcriptional regulator # Organism: Campylobacter jejuni # 4 273 19 292 293 99 29.0 1e-20 MNFTHAGEALNLTQPAVSQHIKYLEKKYDSPLFIRDKKRLLLTPAGEILRSALETMRNDE NTLKKRMKESLAGKNILTFGVTMTIGEYALVPALTKLIKSHENTDFHVRYGNTQSLLSYL YEGSIDFAIVEGYFKPDNYNTRIYKTEEYIAVASTKHIFQNPIRTLTDLTSERLIIREHG SGTRAILTKTLAMKNMSVNDFRHIVEVENIHTIVNLLRQDCGISFLYKAAVEEEIANGTL MQIPLSDFMIKHDFTFLWNKDSVFSTEYERIFEELKEYEE >gi|330402635|gb|ADLB01000021.1| GENE 112 98659 - 99360 461 233 aa, chain + ## HITS:1 COG:MA3534 KEGG:ns NR:ns ## COG: MA3534 COG0500 # Protein_GI_number: 20092341 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Methanosarcina acetivorans str.C2A # 9 211 6 212 241 129 34.0 4e-30 MKSRKDNIITEWNFAADNYMKQQEQSSFVSINKQIIMKRFPKLNNENVLDLGCGYGVYTN YFRTVNANAIGIDGSKEMLRLAKEQYPDCHFELADFNQPLPFSDNSFDIILCNQVLMDIE NIDLIFSECQRILKKNGIFFYAIVHPAFYDAEWLEDENHFKYAKVISNYIEPYQFKNTFW GETTHFHRPLSEYLNTASKNGFMLIHTEEPVSYDGKTKTKDLPLFFIAEYRKI >gi|330402635|gb|ADLB01000021.1| GENE 113 99592 - 99900 326 102 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291549834|emb|CBL26096.1| ## NR: gi|291549834|emb|CBL26096.1| hypothetical protein [Ruminococcus torques L2-14] # 1 101 1 101 102 158 97.0 1e-37 MNQESKAINVHLEKRENKDYLVFGFEEVAEVCLNDDESQNNLKSIFVKLLTEITKYPVEL QFLENPEYKTGLYIDVCKEYIKDLNKEITNVRKNMPEKLKIQ >gi|330402635|gb|ADLB01000021.1| GENE 114 99875 - 102013 1268 712 aa, chain - ## HITS:1 COG:no KEGG:Nther_1522 NR:ns ## KEGG: Nther_1522 # Name: not_defined # Def: hypothetical protein # Organism: N.thermophilus # Pathway: not_defined # 1 707 1 708 709 397 35.0 1e-108 MRIDLHCHTKKVKTGDAYTRNVTKDKFFQKVIEAEVKIIAITNHNQFDYMQYKEFKDVTE GYCDIWPGVELDIIGKADQKGNCKRGHLIVIANPKNVELFNTQVQELVNDEDVNTFQIGV KKVYETLGKCDCIYIPHFHKEPKLSDEDIQELGELLPDSSRLFKETSDYRSLGVFSNFDY SVIIGSDVQDWNKYETSKFADIRLPVQTFEQFCLLAKKDVQIIDTLLNQKRKKEIPVSPY KKVNFKLPFYEDINIIFGQKGTGKTEILESLKKYYIENGIAMESYKGNEKDSDFSKMLKV 
NDIIATPDKLQLDSMRQQFIDVYNWKEELPTSFEKYISWMETKDNNKNKGRMKITECVHI EEGVRDRKLESDYKYLKEFTESTFEKIDIEKYLDEQERTTLMLLLGKLCENINDAKMQKW NSDKSIKLTNWSIDKIKAIADKCSDTISKPSSVGFYDFAMGRFKLFENVEEICSTFSVED KVEKEYLGNLEEKGDIYIQTRYRMLTKESRTDEFKQGITVLRNCKLVIDGIKKAVLAENI SEEVSKFQEFYDDGIKDIGAFIGVSKETALENGEIYRPSNGERGILLMQKLLDSESDVYI LDEPELGMGNSYITSNILPKLTDLAKRRKTVIIATHNANIAVGTLPYISILRTHENGIYK TYVGNPFYDELRNIDDETDTKNWTQESMHTLEGGKTAFYDRKDIYESGKQSD >gi|330402635|gb|ADLB01000021.1| GENE 115 102046 - 102963 697 305 aa, chain - ## HITS:1 COG:no KEGG:Pfl01_3049 NR:ns ## KEGG: Pfl01_3049 # Name: not_defined # Def: hypothetical protein # Organism: P.fluorescens_PfO1 # Pathway: not_defined # 5 302 14 325 328 287 46.0 3e-76 MSTSTHDLFDFFKKANSNISDEYTRICRRVNEDPGTAGDQGEENWKELLESWIPPYFQIV TKGRILSDSGETSPQVDVIVLSPDYPKSLLNCKEYLSGGVVAAFECKTTLRRKHIGEFIE HSKKIEKLAINENGTPRKDLQSKIYYGLLAHSHEWKKENSNPKENIEKAIWEYDEKYVTH PREIPDIICVADLGVWKSQKMIVPNYDNGKGLKGNYVMSGYIASDNKEEFFTPVGSCVFD LLQNIAWRYPSVRDIVTYMRKLNISGSGSGHSRPWEWSILSEETQENIYRLKNGGFWNEW GMVID >gi|330402635|gb|ADLB01000021.1| GENE 116 102991 - 103869 764 292 aa, chain - ## HITS:1 COG:no KEGG:MHO_0360 NR:ns ## KEGG: MHO_0360 # Name: dcm # Def: cytosine-specific DNA methyltransferase/type II site-specific deoxyribonuclease # Organism: M.hominis # Pathway: not_defined # 5 219 343 548 553 239 53.0 1e-61 MAWNLDFISEEDFKKHVRATIMKYGEKLESYDLKRFNSNLIDPIKLIFDKSVYRTSWEEI VNNEIFRQRDKSNNNDIGYFHQNIFSYFKGCEVPQAGWDVIYRNPDGIQMPDGDIVHTIY VEMKNKHNTMNSASSAKTYIKMQGQILEDDDCACLLVEAIAKKSQNIKWSTKVDGKNVQH RLIRRVSMDQFYAILTGEEDAFYKMCMALPEVINSVVNEEGGVEVPHDTVIDELRKVASL YGDENDELSMAMAVYMLGFNTYMGFGDKIRGELGEDKDGMLKRIYEYVKRLK >gi|330402635|gb|ADLB01000021.1| GENE 117 103870 - 105102 786 410 aa, chain - ## HITS:1 COG:all0934 KEGG:ns NR:ns ## COG: all0934 COG0270 # Protein_GI_number: 17228429 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Nostoc sp. 
PCC 7120 # 76 408 72 431 477 219 34.0 8e-57 MAICYDKLWKLLIDKKMNRTELKEASGISFNVLARLGKNEPVSFESIEKICFTLNCKIED VVEIQKEEPLQIDSDAFTTIELFAGAGGLALGIEKAGFEPVGLIEFDKDAAESLKTNRPN WRVIHDDIANISCLDLEDYFGIKKGELDLLSGGAPCQAFSYAGKRLGLEDARGTLFYHYA TFLQKLQPKMFLFENVRGLLTHDKGRTYATITNIFEQAGYTIQKKVLNAWDFGVPQKRER LITVGIRNDLVGKVSFSFPKEHDYKPVLRDVLLDCPEGPGVPYGENKRKIFELVPPGGYW RDIDPEIAKAYMKSCWDMEGGRTGILRRMSLDEPSLTVLTSPSQKQTERCHPLEARPFTV RENARCQTFPDEWQFCGSVQSQYKQVGNAVPVNLAYEIGLEIHKSLEGVK >gi|330402635|gb|ADLB01000021.1| GENE 118 105284 - 105742 95 152 aa, chain + ## HITS:1 COG:NMA0429 KEGG:ns NR:ns ## COG: NMA0429 COG3727 # Protein_GI_number: 15793434 # Func_class: L Replication, recombination and repair # Function: DNA G:T-mismatch repair endonuclease # Organism: Neisseria meningitidis Z2491 # 12 116 14 115 140 82 44.0 3e-16 MSRDSATVSNNMRKIHSKDTSIELLLRKALWHKGYRYRKNYKALPGSPDIVLTKYKIAIF CDSEFFHGKDWEILKLRLEKGKNPDFWIKKIERNRNRDYENDKKLLFLGYTVLHFWGQDI SKHTDECLQAIEEAIWDTKFSDTATDYDISEE >gi|330402635|gb|ADLB01000021.1| GENE 119 105792 - 106166 319 124 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1765 NR:ns ## KEGG: CDR20291_1765 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 10 124 11 125 125 103 44.0 2e-21 MKKQIYDEKNGLNYTLHGDYYLPDLEINEEEPTYGKYGIMRKQFLKEHRSARYQYLVLIG KLTEHLNQVDKEAREKVEMLVEQMAEQWGVTEELKMQNQMEWVRRMNNIKATAEEIVYKN MIFM >gi|330402635|gb|ADLB01000021.1| GENE 120 106171 - 107586 1398 471 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0392 NR:ns ## KEGG: EUBREC_0392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 9 443 1 446 449 230 37.0 9e-59 MMKRTISGMIGAGSLAHNRRDFVAENVNPDRVQLNICYKNENLKEVYKELFDDAVERYNV GKRKDRQIANYYEKIRQGKQEKLFHEVIFQIGNREDMAVGTSEGNLAVKILDEYVKDFQK RNPTLRVFSCYLHQDEATPHLHIDFVPYVTNWKGKGMDTRVSLKQALKSLGFQGGNKHDT ELNQWINHEKEVLAEIAKQHGIEWEQKGTHEEHLDVYNFKKKERKKEVQELEQEKENLIA ENEELTSQIADARADIKLLEEEKIQFQKDKETAEKRAEKAETELKKLEDRREFLQPVLDN VSKEIKEYGMIKTFLPEATTLERAVTYRDKKIKPLFIEMKNKIGAMAAQVKELTRERDSW 
KSKFQKKKQEHEKTKKELAEVQKDYQKLSGEKEILQGIADRYNRLLRMLGKDMVERLVQD DIRMQTELEAKKQKEQMPKKIGDRIQWARERSEEHNAKIKKNKAKYRGMEL >gi|330402635|gb|ADLB01000021.1| GENE 121 107662 - 107934 129 90 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGTDQNGDESGKRMVSGGDTLLWRSGICAGKDTWINRSMIRFGWCKADVHRTLCPRQGAR GLLMDGTAVESAFALLLTKVTKPPYPAKVS >gi|330402635|gb|ADLB01000021.1| GENE 122 107829 - 108056 316 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579252|ref|ZP_04856522.1| ## NR: gi|253579252|ref|ZP_04856522.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 1 74 1 74 75 140 91.0 3e-32 MKQGRLGYNSSNRRYGLLSSDLWIDTGFHCGEGLEVLVGDEWVQTRMEMNLAREWYLVGT PYCGDLEYVQARIPG >gi|330402635|gb|ADLB01000021.1| GENE 123 108174 - 109550 1415 458 aa, chain - ## HITS:1 COG:XF2121 KEGG:ns NR:ns ## COG: XF2121 COG5545 # Protein_GI_number: 15838712 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Xylella fastidiosa 9a5c # 27 333 29 334 501 154 32.0 5e-37 MKKELTNISEKSLMELEDIVNEERLTTENVNNMLEHTDKGRTKQTIRNCVTVLQNDPVLK KAIKRNELSGRMDIVKEVPWERRNNSPTVTDTDENNLKMYLEENYELTSERVIKAGIDIV SNENKYHPIRDYLESLVWDGIPRIENMLPHFLGAEKSKYTNGVMKMHMLAAISRIYEPGI KYDIMLCLVGSQGAGKSTFFKYLAIKEEWFSDNLDHLDDENIYRKLQNHWIIEMGEMKAT ITAKNIEQIKSFLSRQKETYKVPYEVHPEDRPRQCVFCGTSNDLNFLPLDRTGNRRFAPV MTDMSKAEVHILDNEAESKAYIEQAWAEAMVLYRQGNVFLGFTKEIEEEAKRLQKEFMPE DTNAGIIQAFLDDYDDDYVCTRILFDDALHRTGEMKQWEGKEIANIMNNAIEGWKPHGTH RFGKEYGIQRSWKRMEDSVKKDKDGFMEVPEQLEIPFE >gi|330402635|gb|ADLB01000021.1| GENE 124 109456 - 110130 488 224 aa, chain - ## HITS:1 COG:MPN353 KEGG:ns NR:ns ## COG: MPN353 COG0358 # Protein_GI_number: 13508092 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Mycoplasma pneumoniae # 2 85 5 94 620 63 37.0 2e-10 MTLFEQVKECVTARQAAEHYGIKVKRNGMACCPFHKDRHPSMKADKIYHCFACGVGGDAI DFTARLFGISQYEAAKKLVEDFGLDIKVGNRSKYERTPRSRASLKQKQSKVVMIREKLEQ WLKHTTDVLIRYLKWIQFWKEFYRPEPDEEWHELFTEALANERKINDYLDVLMFGTGEEI 
VEFFKMKRREVEKIEERINEYQREVLDGIRRYCERGKAYNRKCQ >gi|330402635|gb|ADLB01000021.1| GENE 125 110231 - 110305 83 24 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTDRQKEKAQIKYLSKFSDNQIH >gi|330402635|gb|ADLB01000021.1| GENE 126 110318 - 110488 310 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTKKVVGKIFDAINYSKKLKISSILPDRDSDYVILLELEDGSKFEIIIIPTRRFV >gi|330402635|gb|ADLB01000021.1| GENE 127 110508 - 111560 982 350 aa, chain - ## HITS:1 COG:BS_ydcL KEGG:ns NR:ns ## COG: BS_ydcL COG0582 # Protein_GI_number: 16077547 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Bacillus subtilis # 14 348 15 366 368 157 33.0 2e-38 MSAYKDKTQGTWYVSFRYIDWTGKKTQKLKRGFKTKKEALNYEKEFIRKTAADMKMEMNS FIQVYFEDKKNELKENSIRNKQHMMNKHIVPYFGTRKMNEITPAEIIQWQNIIQEKGYSK TYERMIQNQLNALFNHAQKIYNLKENPCKKVKKMGKSDANKLEFWTKAEYDRFIAGIEPG SEDYLIFEILFWTGIREGELLALSLSDFDMSGNLLHINKTYNRIKKRDVIDTPKTENSVR TIDIPNFLKEEVQEYAKKHYGFPEDQRLFPIVARTLQKRLKKYEALTGVKPIRVHDIRHS HVAYLIYQGVEPLIIKERLGHKDIQMTLNTYGHLYPSQQKKVAEMLDNKR >gi|330402635|gb|ADLB01000021.1| GENE 128 111574 - 111765 293 63 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0388 NR:ns ## KEGG: EUBREC_0388 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 60 3 61 66 67 45.0 2e-10 MRTNYMMTVDDVMEELGVKRSKAYSILKQLNDELAKEGYVAVRGKIPRPYWETKFYGCSQ RAV >gi|330402635|gb|ADLB01000021.1| GENE 129 111968 - 112501 471 177 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_20054 NR:ns ## KEGG: EUBELI_20054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 2 110 14 121 154 73 37.0 3e-12 MVGKKIRAYREFRGYSQIQLAELSGINVGTIRKYELGIRNPKPDQLEKIATALGLNVSVF LDFNIETVGDVLSLLFSIDNSVNLSLAETPDQKVSLTFDNPTMQDFFRKWCQFKNIYEKE KAEILAIEDEDKRQEEMDKLNVTQEEWKLRAMGTTIGCHTIVKKGTEGNTIKAYSLT >gi|330402635|gb|ADLB01000021.1| GENE 130 112530 - 112781 152 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKIDSNEIINFLNSKWQGSTCPMCHAGNWNVSNKVFELREFNNGDLVIGGSSSAITPVV PVTCSNCGNTIFINVLSTGLLKE 
>gi|330402635|gb|ADLB01000021.1| GENE 131 112785 - 113330 309 181 aa, chain + ## HITS:1 COG:no KEGG:DMR_24760 NR:ns ## KEGG: DMR_24760 # Name: not_defined # Def: hypothetical membrane protein # Organism: D.magneticus # Pathway: not_defined # 8 161 44 197 203 70 33.0 4e-11 MSEIHTNDSKIISETNNTFLHSKEKKSQITIADDTWLPDKAKNNYHTQRLKQSKWAFWLS FWGAIAGFAVLIITTFIYIASDNPSAVGYISGIIIEATSALFFTLSNKANEKISEFFDKL TLDSNTTHAMHMTKEISNSDVRDQLLVKLSLHLAGINEDKICKQTIEICTNSQNESSNQA D >gi|330402635|gb|ADLB01000021.1| GENE 132 113547 - 114740 1487 397 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 397 1 407 407 577 71 1e-163 MAKAKFERSKPHANIGTIGHVDHGKTTLTAAITKTLAARVEGNTATDFENIDKAPEERER GITISTSHVEYETENRHYAHVDCPGHADYVKNMITGAAQMDGAILVVAATDGVMAQTREH ILLSRQVGVPYIVVFMNKCDMVDDEELLELVEMEIRELLDEYEFPGDDTPIIQGSALKAL EDPNGEWGDKIMELMAAVDSYIPDPERATDKPFLMPVEDVFSITGRGTVATGRVERGTLH VSDEVEIVGIKEETRKVVVTGIEMFRKLLDEAQAGDNIGALLRGVQRTEIERGQVLVKPG SVTCHTKFTAQVYVLTKDEGGRHTPFFNNYRPQFYFRTTDVTGVCHLPEGTEMCMPGDNV EMSIELIHPVAMEQGLRFAIREGGRTVGSGRVASIIE >gi|330402635|gb|ADLB01000021.1| GENE 133 114891 - 117008 2334 705 aa, chain - ## HITS:1 COG:CAC3138 KEGG:ns NR:ns ## COG: CAC3138 COG0480 # Protein_GI_number: 15896387 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 703 2 686 687 937 65.0 0 MAGREYPLERTRNIGIMAHIDAGKTTTTERILYYTGVNHKIGDTHEGTATMDWMAQEQER GITITSAATTCHWTLQENCKPKAGALEHRINIIDTPGHVDFTVEVERSLRVLDGAVGVFC AKGGVEPQSENVWRQADTYNVPRMAFINKMDILGANFYGAVEQIKSRLGKNAICIQLPIG KEDEFKGIIDLFEMKAYIYNDDKGDDISITDIPEDMKDDAELYRTELVEKICELDDELMM EYLEGEEPSTEALKAALRKGTCTCEAVPVCCGTAYRNKGVQKLLDAVIEFMPSPVDVPAI QGVDEDGNDVVRESSDEGPFSALVFKIMTDPFVGKLAYFRVYSGTMNSGSYVLNATKGKK ERVGRILQMHANKREELDKVYSGDIAAAIGFKFSTTGDTICDEQHPVILESMEFPEPVIE LAIEPKTKASQGKLGESLAKLAEEDPTFRAHTDQETGQTIIAGMGELHLEIIVDRLLREF KVEANVGAPQVAYKESFTKAVDVDSKYAKQSGGRGQYGHCKVKFEPTDVNGEETFFFEST 
VVGGAIPKEYIPKVGEGIEEAMKSGMLGGFPVVGIKATVYDGSYHEVDSSEMAFHIAGSL AFKDAMAKAAPVLLEPIMKVEVTMPEEYMGDVIGDINSRRGRIEGMDDLGGGKIVRAYVP LSEMFGYSTDLRSRTQGRGNYSMFFEKYEPVPKNVQEKVLANKAK >gi|330402635|gb|ADLB01000021.1| GENE 134 117026 - 117496 736 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240146972|ref|ZP_04745573.1| 30S ribosomal protein S7 [Roseburia intestinalis L1-82] # 1 156 1 156 156 288 89 2e-76 MPRKGHTQKRDVLADPMYNNKVVTKLINNIMLDGKKGVAQKIVYGAFERIAEKSDKPAIE VFEEAMNNIMPVLEVKARRIGGATYQVPIEVRADRRQALALRWITLYSRKRGEKTMEERL ANELLDAANNTGASVKKKEDMHKMAEANKAFAHYRF >gi|330402635|gb|ADLB01000021.1| GENE 135 117693 - 118112 665 139 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160878393|ref|YP_001557361.1| 30S ribosomal protein S12 [Clostridium phytofermentans ISDg] # 1 139 1 139 140 260 93 4e-68 MPTFNQLVRKGRQTAVKKSTAPALQKGYNSLRKKATDASAPQKRGVCTAVKTATPKKPNS ALRKIARVRLSNGIEVTSYIPGEGHNLQEHSVVLIRGGRVKDLPGTRYHIVRGTLDTAGV AKRRQARSKYGAKRPKDAK >gi|330402635|gb|ADLB01000021.1| GENE 136 118281 - 118412 136 43 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255281969|ref|ZP_05346524.1| ## NR: gi|255281969|ref|ZP_05346524.1| two-component response regulator [Bryantella formatexigens DSM 14469] # 1 42 307 348 348 70 71.0 3e-11 MEDSSLRIGDVAREVGFLDFAHFSRVFKKVEGISANEYRNHMD >gi|330402635|gb|ADLB01000021.1| GENE 137 118357 - 118998 682 213 aa, chain - ## HITS:1 COG:SP0661 KEGG:ns NR:ns ## COG: SP0661 COG4753 # Protein_GI_number: 15900562 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Streptococcus pneumoniae TIGR4 # 1 192 6 198 245 105 30.0 5e-23 MIIDDEPIIVEGLSRSIPWEKWNCEVVATANDGYEGRELIQKYKPNMIFCDISMPELDGL TMIAGLKSEFEDMEISILTGYRDFDYAQQAVNLGVTRFLLKPSNMKELEEAVETMTENLK KKNILVEPHETANSFIVKNAMKYIDEHYSEKLTLPEVAEKTYVSQWHLSKLLNKEMKKSF SEILNEIRIKRAKKTVGRFFTSHRRRGERSWFS >gi|330402635|gb|ADLB01000021.1| GENE 138 119095 - 120378 1109 427 aa, chain - ## HITS:1 COG:BS_yurO KEGG:ns NR:ns ## COG: BS_yurO COG1653 # 
Protein_GI_number: 16080313 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus subtilis # 4 347 2 339 422 78 22.0 3e-14 MGRKKGLILAVFCTLILLLIVGCDGKVEEEEVQEEIVLKTVSMFGGTTPSAKVYQDINEE FMKEYENVTIEDDSQICDEEWKTKVAANFAVGNEPDIIQFFTDSAAESILATGKLVSLEE IQKEYPEYAKDSRKTALEAVRNNDGVIRAVPTTGYWEGLFCNKELFEKYHVKIPTDWDSL VKAIRTFRKKGITPISVSLNHMPHYWIEHLLLYTAGEEEYTSIPETAPETWIKAFETLKE LRDMGAFPENIDACADNFAIQLFQEGKAAMQLEGSWYVSSLIEAGIADKVTVVAFPGVRE QKAKQGALVTRISSGFYITRRAWDNPEKRELAVRFVEAHTKRESLIKYWGGNGIATFKGN VQTKITPLIKDGRKLLDNAPSLNQSTDSRICPEAYKLIIEGTAEVSEGKKDAETLLNEAL TLEKDRR >gi|330402635|gb|ADLB01000021.1| GENE 139 120357 - 122213 1318 618 aa, chain - ## HITS:1 COG:BH3678 KEGG:ns NR:ns ## COG: BH3678 COG2972 # Protein_GI_number: 15616240 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Bacillus halodurans # 311 594 313 597 605 140 32.0 9e-33 MGKKRKRKASIRNRILFLVIMCWVVPVVAVFVFTAFSYRIEITKRTDSMIENEMQNISSL VAGRLDDLIESCLQISYERSCEEVWREYRKGVKTKTELIDTITILSKQQFNLDRRFSMFA FYEVETPYVISYNTKIKHQFDEYVRDIEPEISEIRQEDTSNVQLKVIERRLFIVRNLYTV FDYEKFGTLVAEVDIKRLFHGLEIREKENVALSIGEELDLVQVHGCVGEKSQRYVIDKLF NQYQKSQRKNLFLVKDLNYNAYMNMKEYNYYDMGVLFLVNATEVYSGLYEFYRVISIIFI AILPIIIGIFYFLEKQIGTPIHELVNVSKSIERGELGAVLEIENMPNKEFDYLVDSFNNM SKKVKYLFDCVYDEKMARKDAKILALQTQINPHFINNTLEMMNWQARMTGDSTVSEMIEA LGTVIDYRMNRENQKLTALKEELKCSEAYLFITKKRFGQRLTFEKEIDESVLEVKVPRLV LQPILENAVVHGIERVKKGIIKIRIYREEDKVFLEVINTSGRVSEEEKERIHEILYGPPE NIQGVGKHTSLGLRNVNERIQLIYGQEYGLHIEIDEEHTSFKIVIPYEWKPEDEEEKKHQ VEIELREKREMIWEEKKD >gi|330402635|gb|ADLB01000021.1| GENE 140 122295 - 123716 1298 473 aa, chain - ## HITS:1 COG:STM0316 KEGG:ns NR:ns ## COG: STM0316 COG2195 # Protein_GI_number: 16763698 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Salmonella typhimurium LT2 # 10 471 16 482 485 285 35.0 1e-76 
METEKIISYFKEISSIPRQSGDEKAVSDYLTAFAKELGLWVKQDKDYNVLVKKPATKGME KRDTIILQGHMDMVYVVAQGMEHCYQNGIEVIDDGVFLHANGTTLGADNGIALAYAMMLM AEKDIPLPPLEFIFTSKEEIGLLGGASADISELEGKMIINLDSEEEGMFCSGCAGGIGAC IKLPIEKEKAEIPLVPLHISIGGLKGGHSGMEIQLERGNAIQLIGRVLQRLEKYDARIGE TESIGKFNAIANTGEILCYVKEEALSKVKEEIAELERELKNELSPADDVQFIVREEEIVS DCEVFTKKTAETLKNFLVLMLQGVMNMSAAVEGLVQTSINTGCMEEKEGCLLFHSALRSS VETQKQFLISRIQTIVDVLGMECKWSGEYPGWQYRENSKLREIATEKYRELFGKEPKIEA VHAGLECGYWAEKIKDADILSIGPDMIDVHTPNEKVSKQSIENVWELLKAILS >gi|330402635|gb|ADLB01000021.1| GENE 141 123953 - 124546 394 197 aa, chain + ## HITS:1 COG:no KEGG:TTE1304 NR:ns ## KEGG: TTE1304 # Name: cheY7 # Def: CheY-like domain-containing protein # Organism: T.tengcongensis # Pathway: Two-component system [PATH:tte02020] # 70 187 138 255 260 156 61.0 4e-37 MKENRLGTSFPTEKERIMLKLNTAVATAVNGQLQSNEPCIIQLNLEISKDSIMELLGRLG KGKSSDTNEVQQTNDNTLTSQITSVIHEIGVPAHIMGYRYLKDAISLAITDPDSITAITK IIYPEVASKNHTTPSRVERAIRHAIEVAWSRGNMDVLNHFFGYTVSADSGRPTNSEFIAL VADYIRLKNQDEHVSLH >gi|330402635|gb|ADLB01000021.1| GENE 142 124619 - 125704 1313 361 aa, chain - ## HITS:1 COG:CAC2788 KEGG:ns NR:ns ## COG: CAC2788 COG0006 # Protein_GI_number: 15896043 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Clostridium acetobutylicum # 1 358 1 357 358 336 46.0 4e-92 MDAGKVGRILKAMEEQGMPQMIISDPPSIYYLTGKWIHPGERLLALYLNVNGNHKLVINK LFPQEKDLGVDLVWYDDIEDGVEILSRFVEKDKPIGIDKVWPAKFLLRLQELGAGSEFRN GSMIVDYVRMIKDENEKQLMREASAANDKIMEELVPLVVKGYTEKELNAIVREKYAQTGH SGVSFDPITAYGKSGADPHHVTDDTKGKRGDSVVLDIGGILNDYCSDMTRTVFIGEVSDR AREVYEVVKEAQARGIAASKPGNRMCDVDLACRNYIEEKGFGQYFTHRTGHSIGMEDHEY GDVSSVNEDIIKVGQCFSIEPGIYLPDEGIGVRIEDLVIITEDGCEVLNSFTKDLIVVPE E >gi|330402635|gb|ADLB01000021.1| GENE 143 125771 - 125869 91 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRVHEVVEATVKVVWEVVLTTEKRNNCFVDTM >gi|330402635|gb|ADLB01000021.1| GENE 144 125856 - 125957 154 33 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSLADIILIGLIAVVVVLVIRNQIKKHKNGEST >gi|330402635|gb|ADLB01000021.1| GENE 
145 125998 - 126282 335 94 aa, chain - ## HITS:1 COG:no KEGG:Cphy_1196 NR:ns ## KEGG: Cphy_1196 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 91 1 91 95 124 67.0 1e-27 MSVVIVGGHDRMVGQYKKLCKSYKCKAKIFTQMTANLSEQIGSPDLVILFTNTVSHKMVR CAVSEAQKCNANIVRCHTSSKNALNEILQNECAS >gi|330402635|gb|ADLB01000021.1| GENE 146 126294 - 126887 494 197 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1586 NR:ns ## KEGG: EUBREC_1586 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 195 1 196 205 181 46.0 1e-44 MSREIFELVREMDLKNIEIQLALQCAPLITGLKVSNLLIVSLENAGQVTAIIEKSGLSYY KMLQTKEKVTFLLYRRKQLEEFLCRREIRQFFRQEGYQVFQFGRILRTFQMRYATYMEER GEFPHEMGLLLGYPLEDVRGFMENEGKYFLYAGYWKVYENMTEKLKLFRKFEVAKETLLS LIYNGVCMEEIMTSYAG >gi|330402635|gb|ADLB01000021.1| GENE 147 127122 - 127496 387 124 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSPISDEEYQEFENDSSIFSSSGWLGERFKDSMTEEAYNTFLETGTYQIMMLSYENGCN IEMEKMKINEKKDYDEFNAKLHVSHKNGDDEKFEVNGTAQFDENGKVRYFNLNNMDKLVE ILKK >gi|330402635|gb|ADLB01000021.1| GENE 148 127597 - 129231 209 544 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 353 544 38 235 329 85 28 3e-15 MNKSRSGSQVMLGLIGMIRPLITVMLVAIFMGCIGNLMAIFITVLGGMGIGNVLGIDRTM SLTTIFVLAIICAVLRGILRYAEQASNHYIAFKLLASIRHQIFSALRKLAPAKLDGKEKG NLISIITSDIELLEVFYAHTISPIAIAVLTSIVMVIFIGRYHWLLGAIALFFYIIVGAVI PIINSKAGQKYGQKYRKLYGNLNTTVLDNLYGLDEILQYGKENDRKEKMDKFTDELEEVN DKLKQQEGRQRVSTDIVILLAGVVMLLASAALVPAGETIICTIAMMSSFGPTGALSALSN NLNQTLASGNRVLNLLEEEPVVAEVESNVSVPKGELKLSDVTFGYPNAEKNILENFSLQI EENTIHGILGESGCGKSTILKLLMRFYETKEGQVLFGNQNVNEIDTKQLRKQIAYVTQET FLFEDTIENNIKIAKADATREEVIEAAKKASLHEFILSLPLGYDTKLAELGDSVSGGERQ RIGIARAFLHDAPIILLDEPTSNIDSLNEGIILKTLLEEKENKTIILVSHRKSTMSIADK VSRM >gi|330402635|gb|ADLB01000021.1| GENE 149 129224 - 130963 185 579 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163765018|ref|ZP_02172066.1| ribosomal protein L9 [Bacillus selenitireducens MLS10] # 
348 551 38 249 329 75 29 2e-12 MFHKRLLSEFKDTKKKVVQMVFSQWIMLIANILFLFTIAGFIQSMVSGEDLREKRLALLV TLGIVVLIRSVMSLWNTKLSFEASKEVKKKLRTMVYEKLMRMGNQYKKYFQTSEIVQIST EGVEQLEIYFGKYVPQFFYSMLAPITLFLVVGTMSIKVAIVLLVCVPLIPLSIVAVQKFA KKMLAKYWGTYTEMGDSFLECLQGLTTLKIYEADGRYAKKMDEEAEKFRKVTMRVLIMQL NSISIMDLIAYGGAGVGMILSVLEYRSGNVSLVQCFFMIMVSAEFFLPLRLLGSFFHIAM NGNAAADKIFRLVDLTEEEKEEQAEEISGDEIAFAGVDFSYDEEKQILSGIALRAGHGLT ALVGESGCGKSTITSIIMGDYLADSGEVTIDGIDLQMISPKALRKKVTRIRHDSYLFAGT IRENLLMGKENATEEEMRNALKKVHLLDFVEKSGGLDYILQEKASNLSGGQKQRLALARA ILHDSAIYIFDEATSNVDVESENQIMEVVRELAKTKTVLLISHRLANVVSADKIYVLKDG RIVEQGKHEELCEKNGYYYKIFNQQSALEQFGKGAVSYE >gi|330402635|gb|ADLB01000021.1| GENE 150 130953 - 131156 185 67 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKNKRVDILLIATMILMAVSGFLIPVLQTWVWISAVHKISSVLFCILCVLHISQYKRGR RAKKNVS >gi|330402635|gb|ADLB01000021.1| GENE 151 131287 - 132123 1066 278 aa, chain - ## HITS:1 COG:BS_yviA KEGG:ns NR:ns ## COG: BS_yviA COG1307 # Protein_GI_number: 16080601 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 277 3 281 281 122 30.0 8e-28 MMIITDSASDITLQEAREMGVEMVSLQIKFSDGDFIQEKEEDFCRFFEKLAKEKDLPVTS QPSPEDFLKLYREAKEKEEDVLVITLSGGLSGTVNAANLAKQISEYDRVWIVDSEQAIIT QRFLVQRAVSMRAEGKSVEEIVACLDNLKKRLVVCGMLDTLTYLKKGGRIPAPLAVIGNM LQIKPIIELKDKVLVMLGKARGRNGGKKCLWKEFESYEIDEKEPVYFGYTSNKEIVQQFM EETVEKYGIKQYAMYAVGGIIGTHVGPSCIAISFVKKK >gi|330402635|gb|ADLB01000021.1| GENE 152 132245 - 132382 223 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGTMIAGLIVLAVVGLSVRSMIHDKKNGKSLQCGGNCKNCGGHCH >gi|330402635|gb|ADLB01000021.1| GENE 153 132397 - 134556 2705 719 aa, chain - ## HITS:1 COG:L190009 KEGG:ns NR:ns ## COG: L190009 COG0370 # Protein_GI_number: 15672169 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Lactococcus lactis # 5 712 4 699 709 698 49.0 0 MAIKIALAGNPNCGKTTLFNALTGANQFVGNWPGVTVEKKEGKLKGHKDVIIMDLPGIYS 
LSPYTLEEVVARNYLVGERPDAILNIIDGTNMERNLYLSTQLMELGIPVIMAVNMMDIVE KNGDKIHIDKLSRKLGCEVVEISALKGKGITKAAEKAVGIAQKKKSAERIHEFSKEAEDI IEKVENKLSGIVPKEQERFFAIKLLEKDDKIAGVMKKNVDVSAEIKEMEDAFDDDTESII TNERYIYISSIIEECLTKSKKEKMTTSDKIDHFVTNRWLALPIFAVVMFLVYYVSVTTVG DWATAWANDGVFGEGWHLFGIKSITVPSIPALVESGLNAIGCADWLQGLILDGIVAGVGA VLGFVPQMLVLFFFLAFLEACGYMARVAFIMDRIFRKFGLSGKSFIPMLIGSGCGVPGIM ASRTIENDRDRKMTIMTTTFIPCGAKLPIIALIAGALFDGAWWVAPSAYFVGIAAIICSG IILKKTKMFAGDPAPFVMELPAYHMPTVRNVLRSMWERGWSFIKKAGTIILLSTIVLWFL MNFGWIDGTFKMLEPEQLDSSILSAIGGVIAPIFAPLGWGDWKMAVAAVTGLIAKENVVG TFGILFGFAEVNADNGIEIWGQLAASMTALAAYSFLIFNLLCAPCFAAMGAIKREMNNAK WFWFAIIYQTLLAYVVSLCVYQVGMLVTNGVFGIGTVAAFILIAGFLYLLFRPQREMKK >gi|330402635|gb|ADLB01000021.1| GENE 154 134625 - 134846 368 73 aa, chain - ## HITS:1 COG:L192240 KEGG:ns NR:ns ## COG: L192240 COG1918 # Protein_GI_number: 15672170 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein A # Organism: Lactococcus lactis # 4 72 80 148 152 67 47.0 5e-12 MKTLKQIPCKATAKVKKLHGEGPVKRRIMDMGITKGVEVLVRKVAPLGDPIEVTVRGYEL SIRKADAEMIEVE >gi|330402635|gb|ADLB01000021.1| GENE 155 134861 - 135070 335 69 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00572 NR:ns ## KEGG: EUBELI_00572 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 69 15 83 83 84 69.0 9e-16 MPLSMMSTGQETIIKKVGGKEETRRFLENLGFVTGGTVTVVSEIGGNLIVNVKNSRVAIG RDMANKIMV >gi|330402635|gb|ADLB01000021.1| GENE 156 135244 - 135609 485 121 aa, chain + ## HITS:1 COG:CAC1469 KEGG:ns NR:ns ## COG: CAC1469 COG1321 # Protein_GI_number: 15894748 # Func_class: K Transcription # Function: Mn-dependent transcriptional regulator # Organism: Clostridium acetobutylicum # 7 117 3 114 122 93 50.0 8e-20 MKTAINESAENYLETILVLSKTLPVVRSVDIANELGFKKSSVSVAMKNLREKNQITVTDA GFIYLTDAGREIAEMIYERHELLTAWLIRLGVPEEIASEDACKMEHVISKESFEAIKRHV H >gi|330402635|gb|ADLB01000021.1| GENE 157 135725 - 136042 64 105 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MYHSFKTPKIEKKSHTLYGFLLTDKNQERLPVTATIDFEIGTYPFQNEKTSIQGKITVSY AQKEIYTTDFYSHNLERKYISIEDISKQKSVATVFSVCNTFYYGL >gi|330402635|gb|ADLB01000021.1| GENE 158 136030 - 136644 630 204 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210633344|ref|ZP_03297757.1| ## NR: gi|210633344|ref|ZP_03297757.1| hypothetical protein COLSTE_01670 [Collinsella stercoris DSM 13279] # 11 189 9 198 205 85 28.0 2e-15 MKINKWKTAFLVCILALLFVPREAYAKETPTIIYNGQKKTFDLKNISENDLFKELKGLMP GDSVEQEIIIQTKDLTKETSLFLEADCKDEQELLKDMQFSVKQDGREISQSAVSFNQIRL GQFKGNSTVKVTVTLEVPVTVGNEIAEKEYNTEWTVIAQEDGKDINSKPIKTGDDFPIAM SMGILVLCAATIVYVLKRKNHHNP >gi|330402635|gb|ADLB01000021.1| GENE 159 136631 - 137233 487 200 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|295106189|emb|CBL03732.1| ## NR: gi|295106189|emb|CBL03732.1| hypothetical protein [Gordonibacter pamelaeae 7-10-1-b] # 31 181 48 191 213 78 32.0 2e-13 MKEKKIGKKISIALLLLTVIIAGVTIAYLQSKGNLTNQFDVGKVSAKIEENFDGQVKSNV RVTNTGNMPVYVRCNLSVYYETSKGKEQDSEAETKTDNISINVPEQGEDYTLEFPDGFKD NWLEIDGIYYYRKPLQNGESVQFIKNCKEVKKKDGEKLVVDISTQSIQAEPERAVSDAWK KVVVDSETKELKEAVRNENK >gi|330402635|gb|ADLB01000021.1| GENE 160 137248 - 142257 4979 1669 aa, chain - ## HITS:1 COG:no KEGG:Ctha_1865 NR:ns ## KEGG: Ctha_1865 # Name: not_defined # Def: filamentous haemagglutinin family outer membrane protein # Organism: C.thalassium # Pathway: not_defined # 84 356 343 610 1692 80 29.0 8e-13 MDKQKSSVRKYIAGILVLAMLLPIFSGFATTVKAEEPVRKISSFDELLVFAAASQSFDFK DQTIVLTEDIEIGEIEQSILDHYGIKHLTIGTKDLPFQGTFDGQGHTIKGLKYDPNIIKD ANSGLFSFIENATIKNLIVENADLDCIFQGGVIVGYAENSKLENITVLNSKLKISPANNV ISLVTNGGFSGGGIAGIVENSLLYNCEVSGTEVVNNSTSGVTGVGGEGLYMGGLVGWASS STIEYCRARANYEGEGDSRELRNTVVRNDYKIAVGALGGKSVYAGGIIGGVNNGCHIIDC FSTAKVSFDVANYVAVGSGIAGYAGGITGALRGNSEIVRCHYAGDISSQQYNAVLVIPII QHNVNLSGIARIVDSDCEVSYSYFKPSAIASGVSIRAVGDNDDNEVHGPRDDATYEDIDF WESKFYDFIGDEPRQTYLNLKTHYNKWVMDYDLGIPVHGKSVMATFDFPGAGKVKIDKTA LVNVPAETSDPLSFAIQGVHPREEQAVTLTMELNDRYRLTNWYKKANMEKHEVSDMKEIL ALTQNDGAKLEGVENPKKVSIQDRDLFAAGVEAEVTFYELDGKTTVADEWYKYDAALKEH 
MPNPIEGSKFYGWTTIPNPSEEEKGYSAITSTQLKDIKQQGEFYPNGAAVKKEMKLYPIY INSLANVLTEFEGNEQDENPDVTLREGVGRTLVESDKNNVYIDVEAEGGTKEFPKGYQFR GWYKKQADGTEVCVSREYRYKVPDLTEKVTYVARFNYEVEYWAKAFEQDNGDEFAESKFY TKVIHTYGEGFQNIAGPAYCMEEVIGWGKEHIEHADQGDCNESYKEDMKITEPIKVYSHN LPFGNSVNNDYAISVNSDFPNAGEIKNVPTGSSIHYQFQYVPIDDARYHFQFWTLERVKS KDGWTYKNNPMDTGNLWEAVIDVYKGCAMISTDVVFHDEMGNVKTTVQRRYDDAIFMSPD NEHQYKYPLSGTDVSVDTEDGGKISGKLERKASPTDEQMKRNGYQFLGWISGNEVAKDSA EWNYIYDAKEEYCTSNVEKAEPYLLKEYDGEHSSYLDERLKVKETMDLYPVYAKYDVTTT TNIHQIENLPSGINLPNKPSYTVEELPDEFGKAKVTVTAESGKVSVLSSEPDGKKYELMS LICETDGEQQILKTVSNGSEYVYTGEIVAGKSYKFIAVYSPVLVLYHMNDNDTINPVVKD MGERLGVSPNPDFDKIADVEQSYFAGWTQQKPQAGHVWKFASKEALDASKISFVSKDAVV KQFMDLWPVFVKTSIKVNSNIDTVIAGEGGKPEQYRKLVRNTDGSMQLEAYEYDGYVFKG WYTDYKDDGSMGTKVSDKMVVKLKPEQLFEEKTYTAVFNSANVVTYHDMEGNPICKVNVE EGTRSFVNENGDVLDTEPILKMNELLGNSGTFIEWQWKSDDKMITWDTFKNTLITADMEL YPCIVKTTAKDSQHSDYTDKVEFHVSSLPEENPLAQKGAYMLNGLFKEEYTQPNLTLHTE RQVWNPAEDKKQSIPITELPTKMYTTITEDGNSTYVEASQGPVYTDTYGDALHEFFGKLK LTKEYVNADEDGVVYIDVVKKDSKETRRIPIDVSGGKGTTTVHLPVGEYQIAENLDWNWR DNIKSTSNVDGQGNIAIRIGAEEEVVIRNERVNEKWLDGADRKKNIYQK >gi|330402635|gb|ADLB01000021.1| GENE 161 142257 - 142832 444 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210633347|ref|ZP_03297760.1| ## NR: gi|210633347|ref|ZP_03297760.1| hypothetical protein COLSTE_01673 [Collinsella stercoris DSM 13279] # 29 188 73 233 237 97 32.0 5e-19 MTKNRKRVVMGLFLILCLGIFSSSAFLIRKIKTDNVITFGNVKVQLINHTLDENGKEVEV KDGKEELLKYEDVSRIIKVKNVCNHPVYVRVKLNTTGKKNQEIFPAEDYVNYKFADEKWR EKDGWIYYTDVLEPNKTTEDLMRGIEFDVNRLTSEYAGSDIEFKAEVQAVQSENNSKNVF AVEGWPEEVAK >gi|330402635|gb|ADLB01000021.1| GENE 162 142893 - 143591 774 232 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210633348|ref|ZP_03297761.1| ## NR: gi|210633348|ref|ZP_03297761.1| hypothetical protein COLSTE_01674 [Collinsella stercoris DSM 13279] # 8 155 5 160 236 107 38.0 8e-22 MERKQTSRKTKRTAIIATLLACVLLIGGTVAWLTTHDSLSNQFTVGNINPIDPTEPDGGP DDKPIDKDESKLNGNLYEPDWVANSKIFPGETVVKNPYVGVGAGSEKCYVYVYVNNTMKN 
DNKVYFTINDGWEAVEATTTGTDGEYVGGIFKYTAGLDGSQATKNVWTTTPLFSDVVAKD TATGDDFMTADGTKAGAIEVHSFQHQMFDGEGQVIDEANVVIPAAKAAFGIK >gi|330402635|gb|ADLB01000021.1| GENE 163 143566 - 144114 364 182 aa, chain - ## HITS:1 COG:BH2130 KEGG:ns NR:ns ## COG: BH2130 COG0681 # Protein_GI_number: 15614693 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Bacillus halodurans # 43 170 47 173 191 65 31.0 4e-11 MKTQKSNIFNHIFKYISQFFFIICIVLIFALGGVRLIGLNPYVITSGSMVPKYKVGSIVY IQKVNPEELKVGDDISFYLDDKVVATHRIREIDEPNRQVKTYGIHNKDSNGNQINDANPV DFDHIIGKVKFSLPILGYIYLFARTTTGKAIIITLLAMVLISSTIHYIYKRRRDYGEKTD KS >gi|330402635|gb|ADLB01000021.1| GENE 164 144591 - 145253 648 220 aa, chain - ## HITS:1 COG:lin0467 KEGG:ns NR:ns ## COG: lin0467 COG3619 # Protein_GI_number: 16799543 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Listeria innocua # 3 219 5 219 221 251 60.0 8e-67 MKKQMSESIRLGVVLAIAGGFMDAYSYMCRGKVFANAQTGNILLLGIHLSEKNWSVALRY AVPVVAFVVGIALSDMIRLKLKEKSLLHWRQISILIEAFVLVGVCFIPQEYNLPANSLTS FACGIQVESFRKIHGNGIATTMCIGNLRSATQNMCEYWHTKEKKKLKKGFLYYGVILCFV LGAILGNVFVEVLREKAIVICSGILFIGFIMMFVDWEKRR >gi|330402635|gb|ADLB01000021.1| GENE 165 145273 - 146613 406 446 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|168182407|ref|ZP_02617071.1| 50S ribosomal protein L18 [Clostridium botulinum Bf] # 20 443 10 426 447 160 27 4e-38 MKEQQKKEVCVENIYKLDGRVPVAKAIPFGLQHILAMFVANLTPIILIAAASGVSEKQQA LLLQNAMFMAGIATLIQLYPLGRVGARLPIVMGVSFTFVTLLSYIGTEYGYATVVGSVIV GGILEGTLGLLAKYWRKIITPLVAGIVVTTIGYSLLSVGVRSFGGGYTEDFGSAKNLLIG TVTLVACLLFNIVAKSFWKQLSVLFGLAVGYILSLALGVVDLSGVFDGGMVAIPRILPVT PEFNLNAIIAVFIIFMVSAAETIGDTSAVVATGLGREVTEKEISGSLACDGFLSSVTGLF GCPPITSFSQNVGLVAMTKVVNRFTIMMGALCMILAGFFPPIANFFSSLPESVLGGCTIM MFGTIMVSGVQMLARAGFSQRNVIITAISLAIGIGFTTASEADIWRIFPQIVRDIFGGNC VAIVFVVSILLSILLPQNMEVEKLGQ >gi|330402635|gb|ADLB01000021.1| GENE 166 146663 - 150316 4142 1217 aa, chain - ## HITS:1 COG:CAC3142 KEGG:ns NR:ns ## COG: CAC3142 COG0086 # 
Protein_GI_number: 15896391 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Clostridium acetobutylicum # 15 1181 6 1161 1182 1662 69.0 0 MSATNNEATYQPMTFDAIKIGLASPEKILEWSRGEVTKPETINYRTLKPEKDGLFCERIF GPSKDWECHCGKYKKIRYKGVVCDRCGVEVTKASVRRERMGHIDLAAPVSHIWYFKGIPS RMGLILDISPRTLEKVLYFASYIVLDKGETDLQYKQVLSEQEYQEAREKWGSAFRVGMGA ESIQELLQEIDLDKEYNELQGGLKGATGQKRARIVKRLEVVEAFRESGNKPEWMILNAIP VIPPDLRPMVQLDGGRFATSDLNDLYRRIINRNNRLKRLLELGAPDIIVRNEKRMLQEAV DALIDNGRRGRPVTGPGNRALKSLSDMLKGKSGRFRQNLLGKRVDYSGRSVIVVGPELKI YQCGLPKEMAIELFKPFVMKELVANGTAHNIKNAKKMVERLQTEVWDVLEDVIKEHPVML NRAPTLHRLGIQAFEPILVEGKAIKLHPLVCTAYNADFDGDQMAVHLPLSVEAQAECRFL LLSPNNLLKPSDGGPVAVPSQDMVLGIYYLTQERPGVKGEGKFFKNLNEAILAYENEVIT LHSRITVRVTKKLPDGRTLTGNVESTLGRFLFNEIIPQDLGFVDRSIPGNELLLEVDFLV GKKQNKQILEKVINTHGATVTAEVLDKIKATGYKYSTRAAMTVSISEMTVPPQKPELIQN AQDTVDRITKNFKRGLITEEERYKEVVETWKNTDDALTKALLDGLDKYNNIFMMADSGAR GSDKQIKQLAGMRGLMADTTGRTIELPIKSNFREGLDVLEYFMSAHGARKGLSDTALRTA DSGYLTRRLVDVSQDLIIREVDCVEKGAEIPGMYVKAFMDGNEEIESLQERITGRYVCET IYDKDGNIIVKANHMVTPKRAEQVMKYGVNKDGGPITEVKIRTILTCKSHIGVCAKCYGA NMATGEPVQVGEAVGIIAAQSIGEPGTQLTMRTFHTGGVAGGDITQGLPRVEELFEARKP KGLAIITEFGGTATINDTKKKREIIVTNNETGESKAYLIPYGSRIKIQDGAVLGAGDELT EGSVNPHDILKIKGLRAVQDYMIQEVQRVYRLQGVEINDKHIEVVVRQMLKKIRIEENGD SEFLPGTMVDVLDFDDVNEQLIAEGKEPATGEPVMLGITKASLATNSFLSAASFQETTKV LTEAAIKGKVDPLIGLKENVIIGKLIPAGTGMRKYRDVTLDTGKKQEIEFDDAEDFEDSN LADNVEELTEEFETIEE >gi|330402635|gb|ADLB01000021.1| GENE 167 150331 - 154161 828 1276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 842 1208 994 1391 1392 323 47 5e-87 MEKNRIRPITNGKSSRMSYSRQKEVLQMPNLIEVQKDSYQWFLDAGLKEVFEDISPIADY SGHLSLEFVDFTLCEDDVKYTIEECKERDATYAAPLKVKVRLYNKETDEINEHEIFMGDL PLMTKTGTFVINGAERVIVSQLVRSPGIYYAIAHDKIGKELYSSTVIPNRGAWLEYETDS NDVFYVRVDRTRKVPITVLIRALGFGTNAEILDLFGEEPKILASFGKDTAESYQEGLLEL YKKIRPGEPLAVDSAESLITSMFFDARRYDLAKVGRYKFNKKLALKNRITGQVAAEDIVS 
AITGEVIVEEGQVITRELAEAIQNGAVPFVWVKGEERNIKILSNMMVDLESVVDVDAKEL GIAELVYYPVLADLLEETGGDIDELKEEIRRNIHELIPKHITKEDILASINYNIHLEYGI GTDDDIDHLGNRRIRAVGELLQNQYRIGLSRLERVVRERMTTQDLEGISPQSLINIKPVT AAVKEFFGSSQLSQFMDQNNPLGELTHKRRLSALGPGGLSRDRAGFEVRDVHYSHYGRMC PIETPEGPNIGLINSLACYARINEYGFVEAPYRKIDKTDPKNPVVTEEVVYMTADEEDNY HVAQANEPLDAEGHFIHKNVSGRYREETQEYEKTAFDYMDVSPKMVFSVATALIPFLEND DPTRALMGSNMQRQAVPLLMTEAPVVGTGMETKAAVDSGVCIVAEQAGVVDRSTSKEITV KHDDGTRKTYKLTKFLRSNQSNCYNQRPIVVKGERVEAGQVIADGPSTSNGEMALGKNPL IGFMTWEGYNYEDAVLLSERLVQDDVYTSIHIEEYEAEARDTKLGPEEITRDIPGVGDDA LKDLDDRGIIRIGAEVRAGDILVGKVTPKGETELTAEERLLRAIFGEKAREVRDTSLKVP HGEYGIVVDAKVFTRENGDELSPGVNQAVRIYIAQKRKISVGDKMAGRHGNKGVVSRVLP VEDMPFLPNGRPLDIVLNPLGVPSRMNIGQVLEIHLSLAAKALGFNIATPVFDGANEVDI MDTLDLANDYVNLEWEEFEAKHKEELLPEVLEYLSENREHRKLWKGVPLSRDGKVRLRDG RTGEYFDSPVTIGHMHYLKLHHLVDDKIHARSTGPYSLVTQQPLGGKAQFGGQRFGEMEV WALEAYGASYTLQEILTVKSDDVVGRVKTYEAIIKGENIPEPGIPESFKVLLKELQSLGL DVRVLRDDNTEVEIMETIDYGETDLHSIIEGDRTYAPEDESYGEHGFSQKEFEGDELIDI EEEPEDIEFDEILDEE >gi|330402635|gb|ADLB01000021.1| GENE 168 154392 - 160379 7273 1995 aa, chain - ## HITS:1 COG:BH2723 KEGG:ns NR:ns ## COG: BH2723 COG3250 # Protein_GI_number: 15615286 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Bacillus halodurans # 36 679 2 581 1014 403 37.0 1e-111 MKKRNRLVSLLLSGALVVTPVLGMDGLELEAKAADKQFAGEEWYDQIETVEVNREAARAT FTPYESAEKALKNEKSALDNVDETGSKWYKTLNGEWNFKYAEKPADRLNTKRGEDAKNYK EDWNTEGWDKIQVPSNIQTQKDEKGNFKYDKPIYVNQTYPWANYEKVEYNTNGKNKPVAP TVKNSVGQYKRTFEIPADWDGREVFVSFQGVESAFYLYVNGERVGYAEDSYTADEFNITD YLKEGENTIAVEVYRWSTGSYLENQDFIRLSGIFRDVYLYSKDKVEIRDYFVKTDLDENY ENATLTLDADIRSLDKNVSGKYTVKADLYEMDSDKKVWDSPLSFDVDVKAGKATVEESAD DKGQRGSGSKEVVNPKKWFADTPNLYRLLIQLVDEDGKVIETTCQRVGFREIDKVDINEE GQEQAQINGKKIMFRGTNRHETDNVDGRALTKEDIRTDLMTMKQFNVNAIRTSHYPNHPY TYALADELGLMMCDEANIESHKGSFENGADIPSGAPVWNASVMDRTINMVERDKNHASVV IWSLGNESTYKDHTMDENYCFNNSTKWILKRDPSRLRKYERDNRYTKGDRENSMVDIYSS 
QYWGVSSVEGHVTNTNNKTPYIQSEYAHSMGNGLGNFKEYWDVFRKYDNAQGGFIWDWMD QSILTNAVNKTDYYVKNDDGTKSAIKGTLTDGQKDKALDGYVLVPNKSANSKAITLGAWV KYNGGTGSDQAIIAKGDSGYNLKITRADDKIEFFVDGWSAGTLTAPFPKDKIGKWTYVAG TYENGKYTLYVDGTKIGEKDVTKGEKVDTEPFKIGIGEDPEYSGRQFNGLIDGVRVLNVA NPNPDYQPKDSEVVYSMDFKDDQIVAEGTDYPEGTTFFGYGGDWGEKVTDRDFCCNGLIN ADRTPSPELYEVKKVHQEISFYDDGEAKDGKVRIVNEFLNTNLKKYDITWKLLKDNGIVK EGTLSDEEKDVEAGAEKVIELKDFPEIKAVEGSDYILEFNATLKENQDWAGAYGGKKGSE IAFEQLQLSYENETARPTIDVEDADNIKVEDGEKNLVLSGSEKDGDKFSVTIDKTTGYIT NYMVNDEVLLKDGPKPDYWRARISNDPNFTDAMKNAAANFKVTDCKVETKDKVVNVHVEG TIEGIDSPNVIDYQIYANGDIVVTNSFTPSNSSAIGDIAQVGMRMVVPEGYENVTYYGRG PQENYVDRKTGARVSIYKDTVENMFEDKYVRPQENGNRSDVRWATLTKGENGKGIMVAAE DTMNMSALHYTSEDIHKVWNDFGHPYQVPKTKDVVLSVGTAQRGLGNASCGPGPLGQYIL QKGQTYTQTFRITPITKAAADANAFVKERMENSKLDVNSTMPVKNITLDGKALEGFEVSK TEYTHQLFNKEDVKLPKVDVVKNAEDVKVEITQPTLEKPVATIRATSGYGIEKVYTINFE LVDQMYISDMNWTVDKAGYSANMRDTCTCGGELGVWVDGKATKFDKGVGSHAPSEVTVNV EGLDATMFKAVAGIGKEQGGNGEVNFVVKVDGKEVFRKDAVKFKTSVPVNVAIRGAKTVS LIAETNGSDGNDHAVWADAKLVNEVVFTELEKAIQEYEKIAENSADYSKKTFADYTTAYE SGKAVLEKESAKQDEVDKAVLALNEAKSALANVAELRAKAEEYKKLDANLYVKDSYDALQ KVVEEALTLANDDNATKDQVNEMMESLKDAFAKLIPLDENRKSLIDAIKEYEELAEKQKE ENCYTEGSWNAYKDLIDDAKEMLNNPDATGEDITNLVDKINKEKDALVDISALRKAIADL NYDEVNYTAKSYEEYEALVKVAEDVLKKADATKNEVEKAIESLSKENVKKVLVDISELKT LVSEAEKLDKDDYKEDAWKTLQDEIAKAKDLYKEATKEQIAEQIKTLRAAMDDVKEPDKP TDPDPQPPIPPTDDGNGSGNGNGNGSGNGNGSGAGGSHGAAKTGDTAPIALWGALFAAAA AGIGGIFVKRRRKDR >gi|330402635|gb|ADLB01000021.1| GENE 169 160589 - 161539 966 316 aa, chain - ## HITS:1 COG:CAC3628 KEGG:ns NR:ns ## COG: CAC3628 COG4608 # Protein_GI_number: 15896862 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, ATPase component # Organism: Clostridium acetobutylicum # 4 316 10 322 322 431 63.0 1e-121 MEKLITLENVSKHFHVGRKQTLVAVNNVSMDIYKGETLGVVGESGCGKSTLGRTVMGIYS ATSGKITYNGEEVSVKRTKDRYEYSKKAQIIFQDPYASLNPRMTVGSIIEEGMEIHNMYT KEKRRERVYELLETVGLNREHANRFPHEFSGGQRQRIGIARALAVEPEFIVCDEPISALD 
VSIQSQIINLLKNLQEEHGFTYMFIAHDLNIVKYISDRIAVMYLGNLVELADSDEIYEHT LHPYSQALLDAVPIPDPEKESLKERKLLSGDVPSPINPKPGCPFAGRCPKATEACRTTSP ELKEVRPNHFVACHLY >gi|330402635|gb|ADLB01000021.1| GENE 170 161542 - 162540 1206 332 aa, chain - ## HITS:1 COG:BS_oppD KEGG:ns NR:ns ## COG: BS_oppD COG0444 # Protein_GI_number: 16078211 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport system, ATPase component # Organism: Bacillus subtilis # 4 326 7 329 358 408 62.0 1e-114 MEHILNIEDLHVSFDTYAGEVHAVRGVSLHVDEGEVLAVVGESGCGKSVTAQTIMKLNPM PPARIKEGSIDLCGKDIVAMSDKEMQEIRGQLVSMIFQDPMTCLNPTMKVGKQLTETLKK HKKISKEEEKAEAIKLLNMVQIPNAEERVNQYPHEFSGGMRQRAMIAMALACKPKLLIAD EPTTALDVTIQAQIMKLLAELGKETKTAVILITHDLGVVANLADRVSVMYAGKVVEEGTV GDIFYRYAHPYTEALLKSLPTVDTKKDEELVSIPGTPPDLYAPPQGCAFASRCEKCMKIC KKNQPPVFDLGEGHKASCWRLHPDFPDKKEEN >gi|330402635|gb|ADLB01000021.1| GENE 171 162547 - 163548 1086 333 aa, chain - ## HITS:1 COG:BS_oppC KEGG:ns NR:ns ## COG: BS_oppC COG1173 # Protein_GI_number: 16078210 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components # Organism: Bacillus subtilis # 25 333 4 305 305 303 50.0 2e-82 MRVKKEKALEVAQMPRKAYDVSGKIPAEAFEHIGTDAESMESIARPSVSFWKDAFNRVKK SKVAVVCLILLLLLLLGSIFIPMISPFDYSSQNVAFANQPIMSEDPVTGQTHIFGTDALG RDVFVRVWMGARVSLTVAIVVALIDCFIGVIYGGISGYFGGAVDNVMMRIVEIISGIPYL IIVILLMTVLPRGIGTIIVAYSLTGWTGMARLVRGQVVSLGEQEFVIAAKSMGAKPARII SKHLVPNILSVIIVNITLDIPAVIFTEAFLSMLGLGIAPPKASWGIMANDGILTFQMYPS LLIVPAIFICITMLSFNLLGDQLRDAFDPKLRR >gi|330402635|gb|ADLB01000021.1| GENE 172 163553 - 164506 1130 317 aa, chain - ## HITS:1 COG:lin2299 KEGG:ns NR:ns ## COG: lin2299 COG0601 # Protein_GI_number: 16801363 # Func_class: E Amino acid transport and metabolism; P Inorganic ion transport and metabolism # Function: ABC-type dipeptide/oligopeptide/nickel transport systems, permease components 
# Organism: Listeria innocua # 5 317 2 309 309 268 45.0 1e-71 MSNRVKFILKRIVYSFITIVVLIGFTFTLMHMLPGDPFSGGKAIPEATKAALFEKYGLDK PVLVQFFIYLGNVFKGDLGVSLPDGRQVTDIIAQAFPVSFELGMRALIFAFIMGILLGVV AAVKRGTFWDSASMFFALVGVSVPSFIIGSVLQYFLGLKLYEATGVEVFAILGWGSENSK ILPAFALAFGSMATISRLMRTSMLDVLNSDYIKTAKAKGLSQKAIIWKHAVRNAIMPVIT VMGPLVASVLTGAFVAENIFAIPGLGKYFVQCVQTNNYPVIAGTTIFYGTFLILANLVVD LIYGFIDPRVKLTGGKE >gi|330402635|gb|ADLB01000021.1| GENE 173 164762 - 166414 2255 550 aa, chain - ## HITS:1 COG:lin0200 KEGG:ns NR:ns ## COG: lin0200 COG4166 # Protein_GI_number: 16799277 # Func_class: E Amino acid transport and metabolism # Function: ABC-type oligopeptide transport system, periplasmic component # Organism: Listeria innocua # 3 538 1 537 549 291 34.0 2e-78 MKMKKLVALCLAGAMALSVVACGGGNKGGDSKTAGSFPGTKDKDMVTVDIRTEPAELNSI KTSDVPSGNILRMVISGLYKLDKDDKPVPDLAEDTKVSEDGKTYTMKIRQDAKWSNGEPV TAHDFVFAYQTISKKDTGSVYGFIVYQNLLNGQEVFDGKKDPSELGVKALDDYTLEVTFV NPIPYALHLFSFSSFYPMNQKGYEEIGAENYAKDADKIVTNGAYKIEEWVHDDHITLAKN EEHYNAKNVSIPKVKYLMLRDGNARMNAFKAGQIDSINIAGEQLEQAKKEGMKLTTYVDN SNWYLQFNTQKKELSNPKIRQALGMALDTDSVCKNVIKTGAEPATGIVPTSIAGADGNSY AKTLGDITEYNPEKAKQLFDEGLKEIGMKAEDFKISYTTEDSAGAQKEGAFYQEQWKKAL GIQVEIKPMPFKAKLAAMDAGEFDIVFAGWSPDYNDPMTYLDMFATGNGNNYGKYSSPEY DKLIADATKEVDKVKRQEMLMAAEKLVCQTDAAVFPLYFQSVTYTVSDKLEGMTRTGFQE FDFTDGAKIK >gi|330402635|gb|ADLB01000021.1| GENE 174 166716 - 167093 477 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143815|ref|ZP_04742416.1| 50S ribosomal protein L7/L12 [Roseburia intestinalis L1-82] # 1 125 1 125 125 188 80 2e-46 MAKLTTAEMIEAIKELSVLELNELVKACEEEFGVSAAAGVVVAAAGGDAAGAAEEKDEFD VELVSAGASKVKVIKVVREITGLGLKEAKEVVDGAPKVLKEGVTKAEAEEIKTKLEAEGA EVNLK >gi|330402635|gb|ADLB01000021.1| GENE 175 167137 - 167673 581 178 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|240143816|ref|ZP_04742417.1| 50S ribosomal protein L10 [Roseburia intestinalis L1-82] # 1 177 1 183 187 228 68 2e-58 MAKVELKQPIVQAIAEDIAGAQSAVLVDYRGLTVAEDTELRKQLREAGVIYKVCKNTMMK 
RAFEGTEFAGLEEYLEGPSALAISKDDATAPARILCKFAKDAKALEVKGGVVEGAVYDVA GIQELSKIPSREELLSKLLGSIQSPITNFARVIKQIAEQDGEAVETPAEEAAEAPVEE >gi|330402635|gb|ADLB01000021.1| GENE 176 168009 - 169415 747 468 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145632256|ref|ZP_01787991.1| 50S ribosomal protein L27 [Haemophilus influenzae 3655] # 1 447 1 445 456 292 35 1e-77 MMLNQFLSSLNDFLYSYILIFLLVIAGLYFSFRTKFVQFRLFKDALHALKEKSENSGSKK SVSSFQALMISTASRVGTGNIAGIATAIVAGGPGAVFWMWVMALIGGASAFVESTLAQVY KVKDGKDFRGGPSYYIEQALGKRWLGVLFSILLIACFAYGFNGLQTYNMSSALEYYIPNY NESILPAVVGLVIALATAFVIFGGVHRISFITSVVVPIMAGIYILMGIVITLMNFNQIPE MFSQILEGAFNFKAIFGGFAGSTILIGIKRGLFSNEAGMGSAPNASATASVSHPVKQGMV QVISVFIDTLLICTTTAFILLLSNVEGAAGKLDGIPYVQAALQANVGEIGIHFITFSIFA FAFTSLIGNYYYAESNILFIKDNKILLNLFRVSCLVAIFLGAQADFGTVWNLADVLMGFM AIENILVIFVLGGISFKVLDDYMRQKKQGLNPVFKAENIGLKNTDCWK >gi|330402635|gb|ADLB01000021.1| GENE 177 169560 - 170249 1007 229 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238916246|ref|YP_002929763.1| large subunit ribosomal protein L1 [Eubacterium eligens ATCC 27750] # 1 228 1 228 231 392 84 1e-108 MKRGKKYTEAAKLIDRGNLYDKEEAIALVKKSAVAKFDETIEAHIRTGCDGRHADQQIRG AVVLPHGTGKTVRILVFAKDAKAEEAKAAGADYVGAEDLIPKIQNENWFEFDVVVATPDM MGVVGRLGRVLGPKGLMPNPKAGTVTMDVTKAINDIKAGKIEYRLDKTNIIHVPVGKASF TEEQLADNFQTLIDAINKAKPAAVKGQYLRSVTLTSTMGPGVKLNPVKL >gi|330402635|gb|ADLB01000021.1| GENE 178 170310 - 170735 633 141 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238922786|ref|YP_002936299.1| ribosomal protein L11 [Eubacterium rectale ATCC 33656] # 1 141 1 141 141 248 87 2e-64 MAKKVQGYIKLQIPAGKATPAPPVGPALGQHGVNIVEFTKQFNAKTADQGDLIIPVVITV YNDRSFSFVTKTPPAAVLIKKACNIKSGSGVPNKNKVATITKAQVQEIAEMKMPDLNAAT IEAAISMVAGTCRSMGVTVTE >gi|330402635|gb|ADLB01000021.1| GENE 179 170789 - 171304 761 171 aa, chain - ## HITS:1 COG:CAC3149 KEGG:ns NR:ns ## COG: CAC3149 COG0250 # Protein_GI_number: 15896397 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Clostridium acetobutylicum # 1 171 1 172 173 190 56.0 1e-48 
MSEAKWYVVHTYSGYENKVKANIDKTIENRHLEDQILEVRVPMQDVVELKNGVQKAVSKK MFPGYVMIHMIMNDDTWYVVRNTRGVTGFVGPGSKPVPLTDEEMAPLGIQKEEIVVDFEE GDTVTVTAGAWEGTVGVIQAMNAQKQSLTINVELFGRETPVEISFKEVKKM >gi|330402635|gb|ADLB01000021.1| GENE 180 171337 - 171543 298 68 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00277 NR:ns ## KEGG: EUBELI_00277 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: Protein export [PATH:eel03060]; Bacterial secretion system [PATH:eel03070] # 7 68 6 67 67 73 64.0 2e-12 MGEKSSKEKAQKKSWFKGLKAEFKKIIWPDKKTLAKETTAVVAVSVLLAALISVIDVIVK YGVDFLIK >gi|330402635|gb|ADLB01000021.1| GENE 181 171568 - 171717 239 49 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881814|ref|YP_001560782.1| ribosomal protein L33 [Clostridium phytofermentans ISDg] # 1 49 1 49 49 96 81 9e-19 MRTRITLACTECKQRNYNMTKDKKTHPDRMETKKYCKFCRTHTLHKETK >gi|330402635|gb|ADLB01000021.1| GENE 182 171883 - 172251 295 122 aa, chain - ## HITS:1 COG:PM0679 KEGG:ns NR:ns ## COG: PM0679 COG2832 # Protein_GI_number: 15602544 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pasteurella multocida # 1 120 1 120 120 85 45.0 3e-17 MKIIWITIGFVTMAIGGLGVILPVLPTTPFLLVSSFCFAKGSKRFHNWFTGTQLYKKHLD SFVKDRSMTLKTKWCVLLPASFMLIIAMIMMQNIYGRIFILILIGFKYIYFFTKIQTKKE KI >gi|330402635|gb|ADLB01000021.1| GENE 183 172420 - 173844 1217 474 aa, chain + ## HITS:1 COG:mlr8077 KEGG:ns NR:ns ## COG: mlr8077 COG1502 # Protein_GI_number: 13476687 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Mesorhizobium loti # 58 422 16 405 466 101 22.0 3e-21 MEEKNFMLFKHLSIGKILLFLFLIYVALLVVPYISHKKVPTSYRENFNPNSFYSDITGNE RVAYINDNNDALLYRLHMIENAKEEIILSTFDFNPSQSGKNVMAALIQASKRGVSVKVIV DGCHGFIDMYGNDYFQALASYPNISVRIYNTINLLKPYDIQARLHDKYVIVDNKMYLLGG RNTTNLFLGDYSSSKNIDRELFVYNTKEDKNSSLSQLRTYFESVWVLPDSKDYLCQKETD NVIKAKNVLEDRYNYLRKTYEKAYCEWDFEELTMPTNKITLLSNPIETENKEPLLWYSLQ
QLMLESKNVTIYTPYIICGKEMYQGLEDLKAKNIPVEIITNDVASGANPWGCTDYLNQKE KIWKTGVKVYEFMGEYSCHTKVLLLDDRMSIVGSYNFDMRSTYQDTELMLAVDSPELNAI IRKEAERDKTYSKTMGDDGKYIYGDNYKKKDLGIGKKIFYGVLRIVVLPIRRFL >gi|330402635|gb|ADLB01000021.1| GENE 184 174181 - 177477 3175 1098 aa, chain - ## HITS:1 COG:mll3591 KEGG:ns NR:ns ## COG: mll3591 COG3534 # Protein_GI_number: 13473100 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-arabinofuranosidase # Organism: Mesorhizobium loti # 281 442 40 189 501 83 32.0 2e-15 MKSRALKTLLCSALAFGMIIPSALPAYAKNPERMDLTNGLKDYWNFDSLKSSEGDKTTAT LHGKVEIVESGDAVFGKVLRFGAGTDNYLRLNDYINTGSGSTSFSMWYRYDTSVVANEGN KPTVLLQHEGEGRSILTLKGDGKYNTYLNGQDVASTKTVAKGQWQHITVTFDQNAKKVKY YINGEKDSEQDLGSNVINEKLALRLGAHKTSGSTDPHPMRGDVDEFCVYNKVLTDEEAKA LYNEKAEELNRVSLTLNLDEVNRTIEPESIFGINHRYAFNGYGTFDSKAMKVKEDFKELY ENAGFGSIRYPGGTISNLFNWKTTLGPKENRKKQIHGFYNNRNQGGIEPNFGIKEIADFA DDVDSEIVYVYSLGRGSKMDAADLVEYLNAKVGTNPNGGIDWAQVRANNGHEKPYNVRYF EIGNEMQQAFGKAADGTSSQGYWTEYVEGGAEKAYTEGGVAKFIQRYAVDEENWNKEASK TNGKPNLVRYLRYANGNPGKMENGKIVADPEFRAVNDGVEVFVGVDGNLQPWKVVDSFAN SGANDKHCVVDYSTGAIHFGDGKHGKIPDANQNVYATYSVKRDGFLDVSKAIKNTTDKIN EIEGTSHEAKVYTSFESEGFITRMNNLNANDWYDGMTIHPYSGTVGATGTPEAFYDEAMK KAETKGVDHVKKFANLMPKGKVPVISEYGIYNNRQLQVRSQTHAVYIAKVLMEYVRLGSP YIQKHCLSDWYSDGGDSLGPTQQAVIQVVPQKGADTKTGEGEFKFFSTPSAHVFEMLNAG FGENIVDAQFNRVPTMNNGVKTLSALASKDASGNVYVAMVNVDRQNDRKVTLNIPNYDLT GRKVELQTLSSKTITDENSLEKPDNVKINKTEMVLEENQSITLPKHSFVVAKICNNVDKT NLQNKVDEINGLNKNNYTNFESIEETLNKAETILNDMFATQTEVAEVLKELTDAQNKLVL KKADYTKVDEAISKADALNKDEYVDFSEVEKAIKAVVRDLDITKQAEVDKMAEAIETAIK NLKRKPEKPNNDPDNKPDNKPDDKPDGKPQQKPSNKPNTKPDTTNKPVKTGDEAPILPFA VTIAGAGAVIALLLKKRK >gi|330402635|gb|ADLB01000021.1| GENE 185 177492 - 182264 5197 1590 aa, chain - ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 516 750 203 433 637 79 26.0 5e-14 
MKKKQVVSGLLSAAMIMNVALVDMGPVQAATQMNLALNRPTSASSYEKPAGQPEKTAPSK AVDGNLETWWGTDQNKAKNEHIEVTLDGVQTVKQINVHFERDDAGQNIKKFKVEIKNEQD QYEEVYKNSTDRAKQLETITLPEAKKAKAVKVTILDADGGKINWVNVGIREIEVYAQDIE ATENKNHMRTAKSVTASSKEVDKFGADKLNDGKYGKPNRWASQAHTYQNQWVKAELAKPT VVKQMKVKLFNRDVAPSPSNVKGFNLKYTDINGEEKTVKVNNVQETGKEGYKTDLVYTFE EPICATKIELSDFDVLVKDSSNNGYNNISIEEIELYSNEQAQQPTDDTLDSVIAGLKGGT VEKDAKTFTLPTVPQGFTIKINGADFEQIIAKDGKVQHPLTDKLVKVSFEVSDSKGNKKV TGDYDYTVKGLHETVEGKNAKPVVIPEIQEWYSNSDKKVSTDKLTKVVYNDEKLKPIVDE FVKDYKDFSGKTLTVSKGEAQANAFNFTLKAPDELLGEEGYTMDIKEDRINVDSVSVTGN MYGMQTVLQMYKENSKEYKVGQMRDYPRFQTRGLLLDVARKPISLEMMKEITRTMRYYKM NDFQAHLSDNYIFLENYGKHNKENEAFKAYEAFRLESSLTNDKGESPTAKDYSISKREFK KFIQEERELGMKVVPEIDVPAHATSFTKVWPELMVKNQVSSLNANRPLVDHFDLTNQKAI DKIEEIFDDYTKGANPTFDKDTTVHIGADEFLYNYKSYRDFVNKLVPHVKETNPVRMWGG LTWIKDNPITPIKQEAIENVEMNLWSSDWADGIEMYNMGYKLINTIDNYGYMVPDGSKTR KDAYGDLLNVNRVFDSFEASKVKTKRNGYQSVPSGDDQMLGAVFAIWSDNIDKNASGLTE SDLYWRFFDALPFYAEKTWAATGKEKGTADKLAKLAQDKGTGPNTNPYYQEDKKGDNYAE YKFEGNLNDSSENKRNLENGKNAEVKDGALHLKDKESYVTSPIEQLGNGNALSFDIKLDK PSKPGDIIFEETAPYGTHDIRVMNNGKLGFTRELYDYYFDYELPVGKQVNIQIVVRQQSA KLYVDGQFVSDAKGRFFHNNMVKKDNISHATFALPLERIGSKSDSIEAEIDNVVVTTAPE VKDEYNKSAWTGKTNSETLNGGDKEGEITKAFDKKANTHWHSNWKGVADKVENIDGKKGN KDEIWAEIDFHKGYTINQFSFTPRTDTNSGYVTRASLYVKNSADGAWKEVAKDQKFANDA SKKTFTFDEQEVFAVKFVATQSSDGWVTVSEFDMANAPQRTYTVFVQAEEGGRVEGGKDV AVGENVTVKAIANAGYEFAGWYNSLGAKVSDAAEYTFKVTGNTALTAKFEKTDAPVDPVE KYTVTVQSSDEAMGTVSGGGEFNKGENVTVTAEAKEGYKFVNWTVNGAEVSTDAAYTFEV TEKVVLTANFEKVEEQKPEKPSKDALSDALAAAGKLEEKDYTADSWKVFADALKDAQEVY LNENATEKEIADALNALKDAQAQLVKADTEEPQDPQEPQKPETPDNKPNQKPNTKPNKPV KTGDEAPILPLTATIAGLGAAIALLFKKRR >gi|330402635|gb|ADLB01000021.1| GENE 186 182408 - 184168 1879 586 aa, chain - ## HITS:1 COG:SP0057_2 KEGG:ns NR:ns ## COG: SP0057_2 COG3525 # Protein_GI_number: 15900002 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Streptococcus pneumoniae TIGR4 # 40 479 2 437 782 383 47.0 1e-106 
MKFMKKLLAVTLAFGVVLSGASLTTGEAATVKAAEEQKLKSVFSIDAGRKYFSKDQLKTI INKAYQNGYTDVQILLGNDALRFVLDDMSMEVEGVSYSSEEVINAIKVGNNLYHNDPNGN VLTESEMNEILSYAKELNVGIIPVINGPGHMDGILNAMKELGIENPEYSYGGKTSTRTID LNNETAMAFNKTLLSKYVSYFSGKCEIFNFGCDEYANDIDSSKEGYYNGWSRLQGTGDYP KFVEYVNEVAGMIKDAGMTPLCFNDGIYYNNDDQFGTFDKDIIISYWTAGWWEFYVAKPS YLASKGHKILNTNDAWYWVLGNIDGKINGQGVPYPYNGVVEKIQTVSFRQIAGDKENTPI IGSMQAVWCDVPSVEHDMDRVLSLMDQFSEKHKDYLIRPADYTKVDAAISKIPSDLSIYT EETVNALNTAKDAVVRNKRVTEQEVVDGYADAIYAAIDALVYKTADYTKVDEAIAKAEAL NKADYKDFSAVEKAIEAVKRDLDITKQEEVDAMAKAIEDAIAALEKKAPTTSGTTNGTTN GGNTTDGTSDTIKTGDTTNMTLWFMMLVVSAGLAGMFYARKRKENR >gi|330402635|gb|ADLB01000021.1| GENE 187 184347 - 189917 5643 1856 aa, chain - ## HITS:1 COG:SPy1586 KEGG:ns NR:ns ## COG: SPy1586 COG3250 # Protein_GI_number: 15675473 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Streptococcus pyogenes M1 GAS # 29 885 21 891 1168 644 40.0 0 MNKFTKGTARFGATLLTAVLTMSCFSPLATANASIDVRTQSQTQMSSDKEVVYVNTVNNA TERTQNFDANWKFYLGDAGNAQTSNFDDSKWRNISLPHDYSIEQEYSKQMEAESGYLPGG TGWYRKHFTVGNELKGKEIRLDFDGVYMNSTVYINGEELGTHPYGYTPFSFDLTDHIKFG EENVIAVKVDHKTPSSRWYSGSGIYRSVNLTVTDAVNVALNGTKVETPELATNQTTVKTT VKTTVENDGKEAKDVVLTHKVFEKGNSKNVIGETTTQKTTIQPNGKADINAEFTVSNPKL WDIESPNLYTVRTEVKVGDKVVDTYDTEYGFRYTKFDKDTGFYLNGKPVKLKGVCMHHDQ GALGAVANRRAIERQVEILQEMGCNSIRVTHNPAAQHLIDICNEKGILVVEEIFDGWHHA KNGNTEDFAKYFEKQVGENNKLIGATPTMTWAEFSLKATLKRDCNAPSVIMWSLGNEIQE GNYKSGFLERTPNMIKWAQAVDKTRELTIGSNAVKNEGVNGEHSKIADKITEAGGVSGTN YSDGNSYDRLRNLHPNWKIYGSETASSVNSRGVYHTKDRDNNTQELSSYDTSKVNWGAFA SDAWYDVITRDFVAGTYVWTGFDYIGEPTHWNGTNPGVVGKWPSPKNSFFGIVDTAGMPK DSYYLYQSQWNDDVNTLHVLPTWNRDEIILDNQNKAEVVVYSDAPKVELWLTPAGSKEAK KVGEAQELETRTTEKAKHSYQVVKGSNKLYRTWKVPYEAGTLEARAFDANGKPITDTKGR SSVTTAEKEAKLQVKADRNEIKANGTDLSYLQIDVVDAKGNIVPNADDKVKVEVTGNGTL VGLDNGWQTDHDSYKGKERRVYNGSGIAIVQSTKNAGEIKVKVSAEGLGEKTVTLNTTAD TTGGEQQVVVDSFFMEKNYYVKVGNKPSLPKTIEVRYSNGTKEQKPVKWNTISDEQTNKA GAFTVKGTVENAGEVSVRVNMIDEVGGLLNYSTAVPVGTTPTLPESRTAYLSNGEVLDVS 
FPVKWEEKDASAYNKVGTVIVNGTANVLGKEVTVTATVRVAKEDVALGDSISGAAVVLEQ DIPENMQSDHLPAIKDGNTGKNANNEGGTNKSVWTNYKNSNEGNDNKAEIIMGYDTQVVL GEIVIHFFEDSYSARFPDANKTKIYVAETKNGPWTQVQAKEQIGTAKDGIKPYTYKLDAP VSATFIKFELTNKDEQLDGRKPCTGITEIELKKATTSFKTNTTAELGKLTVNGVELTKQQ LASGKYTTSALVADVTPEAKDNAAVTVLPKNENNQIKIIIESEDHKTTNTFTIFLNDKDP DVYYPNKDITPSAPFSLPSPDAHEGDVRYVLDGDVNTHWHTNWRNGATASDIAKREITLT LKEAATIDAMNYHPRVYGGGNGRVTKYKVLYSVDGTTFNESDVCAEGTISQDKADWTLIE FTKPVKAKAFKLIGVHTYTDQGADKHMAVAELRLRMNRETTDISDEANKVTIDPIAKQKV DVVDEKHPVEPQVTVKQNGKALTYGIDYKVSYADNTKEGTAKAIVTGIGKYSGTLETTFA IEKNPVVLTSIAVKTSPKTDYHVGDTFNPEGLVLSVFYSDNTSSEVAYTAEIKDKFTFAP SLETALKETDEVVTVTYEGKTTEVKVNVTEKQESTVRKALEERVKFAQTITSNGYTADSY SAFKAALKAAEETLANENATDAQLQSALDNLNTGINGLKEESVTPNPNPNPNPDENGDRD ELRAKLQKFYDECLAYYKEGNYSKDNWKVYEDAMAQAKAVLDNENATEKELKDALSDLIN ATKRLNDEGKAEEQNPPTPPTSIETGDQAPITMIVVLMVVAVVAIVGIVVYKRKKR >gi|330402635|gb|ADLB01000021.1| GENE 188 190078 - 190419 287 113 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3110 NR:ns ## KEGG: EUBREC_3110 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 109 1 109 116 75 29.0 5e-13 MKKAVLIWEMPLILGTPSNPISHILYHKTAERYVERFTRFCKENSLDWCIILDETHGDVE KLLQNEIDMLVFAPGRRSRSFVYNKELKETSVPIYMLTEEEYQNGKFEKWLNI >gi|330402635|gb|ADLB01000021.1| GENE 189 190550 - 190813 334 87 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160940069|ref|ZP_02087414.1| hypothetical protein CLOBOL_04958 [Clostridium bolteae ATCC BAA-613] # 1 87 1 89 89 133 73 9e-30 MAYDKGNKADSPMKRRGGRRRKKVCVFCGKENNEIDYKDVAKLRKYISERGKILPRRITG NCAKHQRALTVAIKRARHLALMPYVQD >gi|330402635|gb|ADLB01000021.1| GENE 190 190883 - 191317 473 144 aa, chain - ## HITS:1 COG:CAC3723 KEGG:ns NR:ns ## COG: CAC3723 COG0629 # Protein_GI_number: 15896954 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Clostridium acetobutylicum # 1 138 1 133 144 108 45.0 3e-24 MNKVILMGRLTRDPEVRYSQGANATAVARYSLAVDRRFKRDGEPTADFINCVSFGRTAEF 
AEKYFRQGLKVVVTGRIQTGSYTNKDGVKVYTTDVVVEEQEFAESKAASENSGYQASPMP SPSADIGDGFMNIPDGIDEELPFS >gi|330402635|gb|ADLB01000021.1| GENE 191 191339 - 191626 402 95 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881892|ref|YP_001560860.1| ribosomal protein S6 [Clostridium phytofermentans ISDg] # 1 95 1 95 95 159 76 1e-37 MNKYELAVVVSAKIEDDARTETIEKVKEIIARFGGNVTDVDEWGKRRLAYEIQKMKEGFY YFIHFESDSTVPAEVESRLRIMENVIRFLCIRQDA >gi|330402635|gb|ADLB01000021.1| GENE 192 191722 - 191916 256 64 aa, chain - ## HITS:1 COG:lin2921 KEGG:ns NR:ns ## COG: lin2921 COG4481 # Protein_GI_number: 16801980 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 59 1 59 65 61 45.0 5e-10 MEKLEYEVGDVVRLKKKHPCGSFEWEILRVGADFRLKCTGCGHQIMIARKIVEKNTKELK KGLK >gi|330402635|gb|ADLB01000021.1| GENE 193 191928 - 192752 907 274 aa, chain - ## HITS:1 COG:BH3294 KEGG:ns NR:ns ## COG: BH3294 COG4509 # Protein_GI_number: 15615856 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 26 263 2 241 254 115 30.0 1e-25 MDAKEKRKLHRQKMKEEGKYQNKQKKGKKKNSIFSTIVLIVAICVFCFSAFQLYQIFSSY KKGNDEYDKIKNLAISVEKNEQGEEKFKVDFNKLWEINPDTIGWIRFEEPSRINYPVVHS KDNKEYLTKLFGTGKNTYGTLFVDKDNSGDFQDKNTIIYGHRMKSGSMFGQLEKYMEESF YKEHPYFYIYTPDGKESKYQVISAAVVKDTSRTYTKTFQNDEEFMDYIDYVRSISNYQTD AEVTKDSRIVSLSTCTIDSNEDRFVLQAVKISER >gi|330402635|gb|ADLB01000021.1| GENE 194 193091 - 193684 439 197 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3502 NR:ns ## KEGG: EUBREC_3502 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 11 195 11 195 197 130 37.0 2e-29 MKLNAVILETLKNNNNIITTSQAVELGFSRYLLSKYEKAGLLERERQGVYVLPDSIHDDM YTLTLRSKKIIFSHDTSLFLNGLSERTPFVHSITIPNNTRVSKAIQAECICYHIKPELYQ IGMTTRKTTLGNEVRYYNLERTICDLLRSRSRGDEETILSAIKNYAETSNKDLNLLAVYA SKFGVDKILKRYMEVTV >gi|330402635|gb|ADLB01000021.1| GENE 195 193909 - 194388 513 159 aa, chain - ## HITS:1 COG:TM1085_2 KEGG:ns NR:ns ## COG: TM1085_2 COG0073 # 
Protein_GI_number: 15643843 # Func_class: R General function prediction only # Function: EMAP domain # Organism: Thermotoga maritima # 47 158 2 108 109 88 43.0 5e-18 MKSQGAAKNEANNAAQATSTVAKAVEEPTAAKVEIDFSNVEVEPLFEDMVDFDTFSKSDF RAVKVKECEAVPKSKKLLKFVLDDGSGEDRVILSGIHDYYEPEDLVGKTLIAIVNLPPRK MMGIDSCGMIISATHLEEGRGGLNVLMVDDRIPAGAKLY >gi|330402635|gb|ADLB01000021.1| GENE 196 194574 - 195815 752 413 aa, chain - ## HITS:1 COG:SA1835 KEGG:ns NR:ns ## COG: SA1835 COG0582 # Protein_GI_number: 15927603 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Staphylococcus aureus N315 # 24 411 24 383 390 86 25.0 8e-17 MAYSRKDKKGRVLRKGETQRKCDGKYVYTYTDTLGRRRSIYSKDIKILREREEQLIKDQL DGLDTYVAGNATVNFVFDRYISTKSELRATTRTNYKYMYDRFVRNGFGKKKIATIKFSDV LQFYHHLLKDKKIQVNTLEIIQTVLHPTFQLAVRDNIIRNNPSDGVMAQIKKLPGRNHGI RHALTLEQQRAFIRYVEENEKFESWVTLFKFLLGTGCRIGEAVGIRWEDIDFKKRIISIN HSLVYYSREYKEHSTCSFSISLPKTEAGIRIIPMMDTVYEALKKEYAFQEENGFNETEID GMTGFVFSNRFGNVHNPQAINRAIKRIYEAYNAEEVVKAAKEKREPILIPHFSCHHLRHT FCSRFCENETNLKVIQSIMGHADIETTMDIYAEVTETKKYESIQELARKLDVF >gi|330402635|gb|ADLB01000021.1| GENE 197 195849 - 196031 233 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610680|ref|ZP_03288561.1| ## NR: gi|210610680|ref|ZP_03288561.1| hypothetical protein CLONEX_00751 [Clostridium nexile DSM 1787] # 1 60 6 65 65 96 96.0 5e-19 MAKMGRPKSANPKNTLIGLKLTEEEATKLKEYASKHDMTITDVLQKGIDLQYAMEDSERS >gi|330402635|gb|ADLB01000021.1| GENE 198 196144 - 196335 310 63 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0076 NR:ns ## KEGG: EUBREC_0076 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 63 1 63 70 70 60.0 2e-11 MQRTPKKLERKRLVRYKEGAEMYSMGMNKFQTLAKDARAILKIDRMVLVDLDIFDEYLET FRV >gi|330402635|gb|ADLB01000021.1| GENE 199 196747 - 196986 242 79 aa, chain - ## HITS:1 COG:no KEGG:CD3328 NR:ns ## KEGG: CD3328 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 3 78 2 76 79 73 47.0 3e-12 
MNSKKHRQLPEFETIQAAIKGDVDAINQILCYFQPFIQSECKREYKDEFGRTYYVTDEYM KRRLETKLITKILDFEIQL >gi|330402635|gb|ADLB01000021.1| GENE 200 196995 - 197405 321 136 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_1773 NR:ns ## KEGG: CDR20291_1773 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile_R20291 # Pathway: not_defined # 1 136 1 137 145 85 36.0 5e-16 MTLNRQDKEIVRRRFCAYCIKVMHGEALNYFDELERQRTREVTFSELLQKDMDSLYYCDD YNMADYFTVMGRQIPVRDEWISKALKKLPSKKREIILMLYFLDMTEKEIAACLKLVQSTV HYHKDDSLKLLRKLME >gi|330402635|gb|ADLB01000021.1| GENE 201 197963 - 199204 684 413 aa, chain - ## HITS:1 COG:MT0720 KEGG:ns NR:ns ## COG: MT0720 COG0535 # Protein_GI_number: 15840098 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Mycobacterium tuberculosis CDC1551 # 82 335 13 284 391 95 27.0 2e-19 MLIAKSTEIIENQNKIILANRISGEWMRMSREVYEVIQEIIKSGKKIEELENSFEEKEDY IFVKNTIQTLEDCGILCESDEEKRLDNKIVSIQMTNRCNLRCKHCCVSAGDNVIEELTTL QMKRALDTVIEWNPLNIMLSGGEPMIRDDFFELLEYLDQRYSGNIILSTNALLINERNVE SLVKHCEHFEISIDGVDEESCAIIRGKGVFEKVCSKIKLLKEYGAEHINLSMVFSDKNEH LKERFYELNKQLNTTPVCRIFSPKGRGAENRNIISETSKTDYYIPNDYLVEDYKKTFSIG ICSAGKRELFIGCNGNIYPCPNFILDTCILGNILNVESLFDITDKSADQYLCDFIAEVAP INLPRCQKCPVKLFCWTCPGELFDINTEAAFNSRCERLKPILMRRVWEKEINY >gi|330402635|gb|ADLB01000021.1| GENE 202 199198 - 200841 1066 547 aa, chain - ## HITS:1 COG:sll1276 KEGG:ns NR:ns ## COG: sll1276 COG1132 # Protein_GI_number: 16330165 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Synechocystis # 21 537 6 527 534 218 29.0 4e-56 MILTSALNVIQPKILSYLIDKGIGGKSINILFIMTGSYILCNIIVRILNIRTNNYFETLK LVTATKLKKTLMEKLARAQGTYISGQGTGEILYILDNDIYQLESFGIEFFLELIMNVITA VVVFVILLRLNIEMLFLVGIIQILMIIFQKFMSVKITTAIRNVRDVIGELSNLQEQFVSN LKTVILSDVADFFLRIFNAEQQNYCEKSKKTNKFILYQREIIGFLQSFSLVASYLVGGIM IINARLTLGELIAFIQYVNMLIAPCVLIANSNIEIKQIEVALNRIYGELDRIDTIEDDKK KERIDKIDKISFENVSFRYSEKDVLKNINIEFEKNSITAIVGESGCGKSTILNILYGLWK 
PKSGNIYFNFWPYEKLNIKSVRNNITIVCQEPFLFNSSIIDNIKMGNEGNDQEKIDQVIK CVGLEMLIEEKEEENVGEKGNNLSGGQKQRVAIARALLRDSDVIVFDEATSNLDNLSQKE IMNNISSYFKDKIVIIIAHRLSTIKSADRIIVMHKGEIVESGTEDYLLSKNGFYKKLDNT DEGEKIC >gi|330402635|gb|ADLB01000021.1| GENE 203 201031 - 202179 391 382 aa, chain - ## HITS:1 COG:no KEGG:CDR20291_3112 NR:ns ## KEGG: CDR20291_3112 # Name: not_defined # Def: putative serine protease # Organism: C.difficile_R20291 # Pathway: not_defined # 42 381 6 342 343 177 35.0 7e-43 MIVLMTAKRVENPVITKHGNISKHVITQKRRKAAGGSCLFENMKNIKIALVDTGVNVKHN FLKDIIKQQYYVKNYNGKYEVLKKKEQVSDDLIGHGTGCASVIKKECKNIEIYSFCIVNK DGSSNLCILETVLNYILNLPIDIVNLSLSVNQWTDLRNLKHIIKGLSLQGKIVVVSLENN KRISYPAAFRECIGVQGAILDTVDAIWFNPYRRIQGVVDSTPYLHCNIENTYSMFGKSNS YAAARLTGIIALLMKMYGLKNKELIFAKLQQMSEKKIWCNLNLRKSRRLPEREPLLDKVD KRLCSYVVQILENYLNNDKEEMLKGMLLLSNRGRIHYSNCFEILKLLEKEIGFTVPDYTM ISRENFYTVYHITELVESYINN >gi|330402635|gb|ADLB01000021.1| GENE 204 202115 - 202270 204 51 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLKIKAIKDKKLAQVTMQSDPCKYAGQEGAYDCTYDCKTSGKPRYYKTWKY >gi|330402635|gb|ADLB01000021.1| GENE 205 202335 - 203339 417 334 aa, chain - ## HITS:1 COG:CAC2279 KEGG:ns NR:ns ## COG: CAC2279 COG0641 # Protein_GI_number: 15895547 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Clostridium acetobutylicum # 11 323 105 437 454 132 30.0 1e-30 MAYMIWVTGACNLKCKYCYEGIEKINKHMSPEISKQTIQFIIDDFDPSSHNELLINFHGG EPFLQVELMEYFIHNIKKQFDNQCSLMFSATTNATLLSQQVIDFIVENNMDITVSLDGKK ESHDKQRVFGDGRGSFDLALKNSLKLLNYNRDLRVRMTVNTNNVMELYDNIEFLILQGFR VIVPGIDIFDKSWNEESVEKLRLEILKIKKKYGKSNDLKISLCEPLNYCGTYCSAGMNTK HIYYTGDLYPCTVVCGYKEFSIGNVYEGTDRKKIQNLQKNISSLYESCQYCDVKNYCSAA RCSLINKLATGDFHRPTEIECNLTNILYEVNGIC >gi|330402635|gb|ADLB01000021.1| GENE 206 203341 - 203910 291 189 aa, chain - ## HITS:1 COG:no KEGG:CLK_0453 NR:ns ## KEGG: CLK_0453 # Name: not_defined # Def: putative AIP processing-secretion protein # Organism: C.botulinum_A3_LochMaree # Pathway: Two-component system [PATH:cbl02020] 
# 1 183 5 187 194 100 32.0 2e-20 MDKKIVDVLVKADVIDPANKEVVSFGIRQVEYFFINVITILFLGKIFNETVSSIIFLLAF IPLRTYAGGYHAKTILRCYIITIVLMVFTIFSFKQSMWSQGNSCVILGFAGALLYKLVPV DNYNHRLTLHERKVFRRKALLVMCLEIFISLIGILMNDKFLYQGIVAAIVMTTVLVCVGN QKNYYVGGR >gi|330402635|gb|ADLB01000021.1| GENE 207 203914 - 204063 83 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINNIIKKYGKYIVAIALCFTNLTVNSTCPFVTFQPELPEDAEKLRKYK >gi|330402635|gb|ADLB01000021.1| GENE 208 204230 - 204313 91 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MMSKWRGYTLIFYVINHDFLKSEYSLE >gi|330402635|gb|ADLB01000021.1| GENE 209 204391 - 204537 58 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MILQRKWSTVNTDELVPLHVQERLNFQSYVLEVNRKLYRLSKTKEARI >gi|330402635|gb|ADLB01000021.1| GENE 210 204500 - 204670 163 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTSIDVKYKILAHRKKHRFEGAFSYGLVFCAKYVGGRFSKYVHVDDSSAQMEHCKY >gi|330402635|gb|ADLB01000021.1| GENE 211 204684 - 205955 897 423 aa, chain - ## HITS:1 COG:lin0802 KEGG:ns NR:ns ## COG: lin0802 COG2972 # Protein_GI_number: 16799876 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Listeria innocua # 212 416 226 431 433 102 30.0 1e-21 MGIKYVSEYLILNIFFFYIIIRFMAIFFVDDREEKKIKMLILIFSYLINSICYLLLNNPI VNMITSIIAIFLISCLYKSSWLKRAIVTFMIYCICSISDIFVATIIGQYTLGERLGIVNN ITSYFMIFIFEVVIEKYIQLKYDYFGSYSTVIIAAGVPISSLLIICVLVVNRIDNEAVIV IVSMGLIFVNMLVFKLYDILFSTYEQKYKNDVLEREINEYREQLVLIKKSQTKINALKHD FKHHILALTKLSSTRDYNGIIEYLNNIKTFIQDEKTYVDSGNNDIDSILNYFIQNAKNKN IEVLVNIKIGKEMKINFFDLNIILGNLMDNAIEGTLNAEKKIIMLSMELDRKVLYINIKN SYDQVVRIKDNNLMTRKSDKISHGIGLENIRFIVNKYSGQLDINYDENFFEVDILLYLNM NNE >gi|330402635|gb|ADLB01000021.1| GENE 212 205933 - 206673 401 246 aa, chain - ## HITS:1 COG:lin0801 KEGG:ns NR:ns ## COG: lin0801 COG3279 # Protein_GI_number: 16799875 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 4 233 5 234 240 113 25.0 3e-25 
MYYIAICDDETFTCSAIEQQLLDLSEKANIPMQVDIFYSGNGFINYLRNKIQYDILFLDI ELQDINGIEIARYLRNSMNDEHIQIIYISSKTEYALKLFKTRPMDFLVKPFTVDDIENTF LQAKKLIDCGTGNFECCIKNMYYKIPLDKIMYFQSYERKIYINMEDQQLEFYDKLDNVQK SIQSLEFWRIHKSYFINYKYVIEYTYEWVKMKNGIQLPISQINRKEIRNKLLKLKKEKNY GYKVCE >gi|330402635|gb|ADLB01000021.1| GENE 213 206874 - 207191 255 105 aa, chain + ## HITS:1 COG:no KEGG:Dhaf_2022 NR:ns ## KEGG: Dhaf_2022 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 1 99 1 97 102 72 43.0 5e-12 MNKKFIQNRISELKDYNNYSEYQLSYELGKSKGYIQSITSGRSLPSMQVFLDLCDCFEIT PSEFFSDDSLQCNSQTAKRISSKLSKLSQNDLEILEEIVDRFLKT >gi|330402635|gb|ADLB01000021.1| GENE 214 207634 - 208164 266 176 aa, chain + ## HITS:1 COG:no KEGG:DSY0900 NR:ns ## KEGG: DSY0900 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 4 165 38 200 396 141 41.0 1e-32 MSILISIFISGYHGKTTNFAKNSSCHRTTIAHFLNSGKWDDSLLSDTLKCSVIEIIYSEA ARTGKPVFCIVDDTIASKTKPSSQALHPIEDAYFHQSHLKGKQDYGHQAVAVMLSCNGIV LNYAFVMYNKSISKIDIVQSIAKELPVPPVMSYFLCDCWYVSEKMNIPDGFTSAVR >gi|330402635|gb|ADLB01000021.1| GENE 215 208448 - 208981 411 177 aa, chain + ## HITS:1 COG:no KEGG:DSY0900 NR:ns ## KEGG: DSY0900 # Name: not_defined # Def: hypothetical protein # Organism: D.hafniense # Pathway: not_defined # 1 174 220 391 396 160 42.0 2e-38 MLYPSGMKKKLRELATELSVTPREFDLVTVKKRNYYVYRYEGNLNGIENAVVLLSYPEKA FGNPKALRAFISTNAALSTQEILSWYACRWPIEVFFRQCKEKLALDGYQIRSAQGIKRYW LLMSLAHFMCAVGTGRFCSFETGYHEICDTIQLEKYRYLFQCAKESNDFDSFMKFAV >gi|330402635|gb|ADLB01000021.1| GENE 216 209176 - 211659 1136 827 aa, chain - ## HITS:1 COG:CAC0454 KEGG:ns NR:ns ## COG: CAC0454 COG0577 # Protein_GI_number: 15893745 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Clostridium acetobutylicum # 22 777 16 782 832 117 23.0 8e-26 MIKVDNKHIIRDVSKKTYNANKKRNLLTIFAIFLTTFSILTIIGIGIGYWEMISERQMKM 
NGMDYDIEVSEPTEDQVNIARNSDLIKYAGISVKCAVIESANDRILSKIQLYWVDETCWE KQCLPAMDSLVGDYPQKKNEIMLSKEALRNMGINNPNIGMKIKVEYTALNQNSTDGNSQE INFILSGYYIDFSGDSRGYVSDIFYKDTGAKQTDFTQGTMKLTLKNPLYSEKTIQTLQKQ FHLENGQVINADYESINTFIKTVFVLFGVLIIIFLSGYLFIYNTLYISVSKDIRYYGQLK TLGMTFVQLKTLVLSQALRNACFGIPIGLIVGCLVSIKIIPTILKVQNPDLASNLTFSYY PILIGLTTVFSLLTVIISSRQPATVAGKCSPIEAIRYTNEKSKSYKRTNGVRWMAWRNIF RDKKKTIIVLSSFVISIVIFFTINVVIKENDAESILNKTYSYDLQLVDETYFDANRVDSI TPECIKKIENINGVSDVRPIYSTVIKVPYQEEVFGDFYKDLYDSRYSPGDYEKDIQKYKD GDSEGLFESKLIGIDDKELELLLKESGIEINKDEFEKGNIALTAGWLSIMPTNVVGKKVE FSLLNEEKVQSLDIEGVVKDPTYFSGGYNPTLIVSTSKFFEIIDNPMVELAYVDYKTSFD NSTEKQIKKIFEDSKTVSFDSKLDLYTDMSDSEMKIKVWGNGIGIIIAIIAIMNYINMMA AGVENRQKEFAILESIGMTKKQTRKTLICEGAGYAIISSVSAIIIAIPISYIVFSSMNIY GIEYSIPILPNSILLCIILVICVSVPPIMYNIFCKGSVLERIRRSEE >gi|330402635|gb|ADLB01000021.1| GENE 217 211753 - 212430 233 225 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 20 217 144 340 398 94 31 5e-18 MEILRAENVKKYYMMGENIVKALDGVSLNVERGEFVSIVGSSGSGKSTLLHILGGLDVAT EGKVEISNIDISKMSSDELTIFRRRKIGFVFQNYNLVPLMNVYENIVLPIQLDGIKPDEM FISEIIRLLGLEEKMYAMPNQLSGGQQQRVALARALATKPAIILADEPTGNLDSKTSQDV LGLIKMTSKKFCQTVVMITHNDEIAQMSDRIIRIEDGKIYGDNMV >gi|330402635|gb|ADLB01000021.1| GENE 218 212587 - 213510 535 307 aa, chain - ## HITS:1 COG:Cgl0398 KEGG:ns NR:ns ## COG: Cgl0398 COG0642 # Protein_GI_number: 19551648 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Corynebacterium glutamicum # 62 304 96 349 386 100 25.0 3e-21 MDMEIIIILVLFVCNIVLLICGIRLWLNCKDYRTTLNLVLNYLDMAMSEKELPIIYDEKI ESAIQAKLNEIIEISRIHRAEENSNQKMVLSLLSNIVHQVRTPLSNIMLYTGILEETLQI EDSKRIVEKIYLQSRKLDYFMKELVRSSYLETEMISVNIQEEEVDKLILESCQEVELSAL KKKIIFDIRECGKKAKFDLKWTKEALINLLDNSIKYSSENSLITIRVFFYESFFCIQVTD QGIGIDEKEQGAIFKRFYRSEKVKSQDGLGIGLYLVREILERQGGYVKVKSSLNNGSTFY MYLPNVL >gi|330402635|gb|ADLB01000021.1| GENE 219 213513 - 214187 379 224 aa, chain - ## HITS:1 
COG:CAC0450 KEGG:ns NR:ns ## COG: CAC0450 COG0745 # Protein_GI_number: 15893741 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 222 4 222 227 164 41.0 1e-40 MNILILEDDKDLAVGIEISLKQENYRFVISNTIKDAKKQLNSYVFDLLILDINLPDGSGF DLCKEIRSVSKIPILLLTARDMDVDIVRGLESGADDYITKPFSMMVLRSRIRSLLRRSNW EQMESKYDKGEFKFDFFRMEFYKNGELIELSRTEQRILYILVTNEGHILTRERLLEWVWS GGGEYVEDNALSVGIKRLRTKLESIPSKPEHIKTIYGKGYIWEC >gi|330402635|gb|ADLB01000021.1| GENE 220 214314 - 214496 124 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFIAMKSVAVDIFFITLMEQSRCDYQDRSQYTYLKNMMSRVWGAKSPSSLPSRKKSRSEF >gi|330402635|gb|ADLB01000021.1| GENE 221 214509 - 214751 330 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610640|ref|ZP_03288540.1| ## NR: gi|210610640|ref|ZP_03288540.1| hypothetical protein CLONEX_00730 [Clostridium nexile DSM 1787] # 1 80 20 99 99 81 98.0 2e-14 MPRGRKKQNVLTLEEQLVTVNERIQEVENELKQLRSQRKEIQAKIEEQEKETLFRAVVAS GRTVEDVLAILKAQDQKEEQ >gi|330402635|gb|ADLB01000021.1| GENE 222 214889 - 215311 288 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210610639|ref|ZP_03288539.1| ## NR: gi|210610639|ref|ZP_03288539.1| hypothetical protein CLONEX_00729 [Clostridium nexile DSM 1787] # 1 140 1 136 136 215 88.0 6e-55 MARPKKEKALRHTHQIMLRLTDTEYEIISSQAKAANLPLAEFARRQIMNKRTILKYELVA DLPELKKLISEFGKIGSNLNQIARYFNSGGIHSQEVQKSLQQSLSKIYEMKYEVLKLAGN FSAQNISAPTHSRGNEYGNS >gi|330402635|gb|ADLB01000021.1| GENE 223 215298 - 216929 499 543 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_3583 NR:ns ## KEGG: EUBREC_3583 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 538 3 539 543 325 36.0 3e-87 MAILKHIAIKNMDYGEAQRYLLFQHDEHTKKPVLDENGELIPRKEYYIDGINCDPFSFDL ECKELNAQYHKNQNFNEIKSHHYILSFDPKDTEDHGLTGEKAQQIGLEYAQKNFPGHQAL VCTHTDGNNESGNIHVHIIINSLRKYDVERQTFMERPCDARAGYKHHLTPTYLIYLKRSL 
MDICHREHLHQVDLLSPAEKKITDREYRAKHTGQKKLDALNKEMLSKGIAPRKTVFQTQK EYLRTAIEYAAASSCNIEEFQATLLKNYQITLKVSRGRFSYLHPERSKFITGRTLGSHYT EDYLLSIWEKRKEPQTQSHTGSELFTPHSASTVTEGKETTAFCFIKSDLRLVIDLQQRVK AQNNPVYARKVKLSNLQQMAKTIAYVQERGYSTEEDLMTALTTSQSETANCRKNLHSTQQ ELKRINEQIHYTGQYLANKSVYREFLHAKNKKQFRQEHLSEITLYETARNFLKQHSENET LLSMKLLKSEKEKLLSTKNMQRQEYQKRKQYEKELRTACKNVEMILHGKSDFEKNSEKDI PPR >gi|330402635|gb|ADLB01000021.1| GENE 224 216877 - 217020 129 47 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESLILKKILKRIYHHANPYVPKFLSIFVVFFCDVPYTKGKIEKVVI >gi|330402635|gb|ADLB01000021.1| GENE 225 217020 - 217964 584 314 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2192 NR:ns ## KEGG: EUBREC_2192 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 193 24 206 211 74 31.0 6e-12 MELETTLSQTIRTAGSNSDYDSACKRLLSEKIILAWIMKNCLEEYRECSISEIIEKYIEG EPQVAEVPVLPDQTNPSLIKGERTEDKTITEGTITYDIRFTATAPKSGEFIRLIINVEAQ NNFYPGYPLLKRSIYYCSRMISAQYGTVFSSAQYDKIRKVYSIWICMNPPKSRENTITQY YIAEKSLVGHVTEKVENYDLMSAVMICLGKPDSENYHGVLKLLNVLLSSETTPKEKERIL HDDFDIAMTQNLKREVSLMCNLSQGIVESTTERNTLDSIRNLMETLNLTTEEAMAALKIP EAERPKYASMLKEL >gi|330402635|gb|ADLB01000021.1| GENE 226 218013 - 218849 549 278 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3563 NR:ns ## KEGG: EUBREC_3563 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 6 238 260 485 918 67 25.0 4e-10 MAGNQFAVYQLREIPENRKKRFRSYAELQKEHIQIRYTDYKQVYRSWMRYDETPDNIRNR LAKQLPKNFSGHALSIGDVLILDKNGEAAAYYLEKEGFTVLDDFIRNGSSNTLISLATTN FQLEGKEGSWLAFDNLVVEGKEFFLMEHTTYGKNAAWVVVDGTGKLIVDQVTAGFDETVK EQIVSYLHPQETQEKRGKPVLENWQKSYENGEYLRSAEITEEQNYNMIDGRMNNLPTKPR KIGTRTSVLDRLHLKQAALNAKNLPQSVMENEIERKRK >gi|330402635|gb|ADLB01000021.1| GENE 227 218880 - 221543 1988 887 aa, chain - ## HITS:1 COG:no KEGG:CD1105 NR:ns ## KEGG: CD1105 # Name: not_defined # Def: putative DNA primase # Organism: C.difficile # Pathway: not_defined # 8 298 408 693 1343 243 47.0 2e-62 MAGNREQQMYEITKQLEEGVKALFTSERYTEYLKTMSKFYNYSFNNTVLIALQRPEATLV 
AGYSAWQKNFHRQVKKGEKGIQIIAPSQRKEKELVEKFDPETNEPILGPDGQPETEVVEH VVSDFRVVRIFDISQTYGEPLPELAIPDLTGQVQNFPLFLQAVKELSPVPIRFGETEGEA KGYYSNKKKEIVVKEDMSESQTIKTLIHEIAHAKLHDREVLEQTGEEKDQRTKEIEAESI AYTVCQYFGLDTSDYSFPYIAGWSDNLKMWELRTFMDAIRRTAGEFIKELEEKMKELETD RGERDGIQNEKEIKIGMDTKAVELEQHEGLWHTVEEVEIEKEHFYLMEHNEYGASVAPVL VNGDGKVVAQDLENGLDQEAMKAIREYLEEKETDKKQEKVEISNASLFDEAEREAAYQIE ATGQFFFIQETEEGYDYTFYNQEFQELDGGVYDTFDVTLQEAAKTLLLEEGADLTACRKI DSEMLQEQVERAEYFPQKSYEALKPLMESEENNVAFRSGYGYVMLQKILEGYECMIYDQA FREIGGQFYENITASKEEVFSQIFQEEGIGELPCEPFSYEELKKSVIEEEKKRFYEGELT PTSQIGIREEKLLRGESRQNIEEAILCYAQAELEEAGYEEITLLAARVYESRTRETLYRE DSDLDVALSYTGDIREDAFFNLLQENGMRIAGLKIDINPISLEKTGTLQEYIKRAEQYLD EREAEKQGQIPQETLQEPQISFYAAECMEFPSMGEYYENLTLEEAVEKYKMIPSDRIHGI KGIGFCLKDGSIYDGEYELMSGGKISKDLIDLVPHYKESPLVQKAMQDLERILAEKQREN TAEQTKAEGTMGRKVSVLKALKERKELLKKQEKKEETSHTRKKEAEL >gi|330402635|gb|ADLB01000021.1| GENE 228 221599 - 229374 5342 2591 aa, chain - ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1259 2523 1 1311 1315 671 32.0 0 MGKLYEILGLAQYEAEKISSSPRDWLHYLDTASQLYRYSFSDTLLIHAQSPGVTACAELE TWNHKMKRWVNRGAKGIALIDDRYPKKRLRYVFDVSDTHKVQGGKTPYLWQLKEHQKEPL LEHLSEVYGLSGEDTVALTSALWKIARDMTEENLEEAMEGLEYEVEGTFLEDLDEQTIQT EFRELLQNSIFYSLSRRCGLDPMDVLEEGDFTAITDFRSMSVLPFLGNATNQIVEPILRD IGRTVWRMDWEEQKEKSQKRVENSSQTHYNEFNTLMRESKENGGNENGTDILPQRGLPVS EPDNRKRTSEHREVRNASEDIPEGKPEELVSEHAADRKIGETSDGNRESSTGENGSSDEW TSYSISGSGQGGRSDEVGGTHEQSPEDGRGERFDGIGIQLSEEQTEQDFDGAEEEIASAL SLPQLPPVEKQIREIEERQAALYAKEVTIPAEIVDEVLRYGGNRTGSHLRIIYNFMIEQP EEAYTEFVQKEYGKGGIGLQISGREYAVWYDELGMQIAVGHTVHDTVLDKVFLSWEEASE RIHQLLRQGEYAPQVVLDAARENALKEHADALIYMERDMAEGVAELVFEDTEVFHNTFPT VTEKVCVLLEQPGYLADLNERLEALAEYYQEDKEIMRFHFYGPDKVLEQFQKFAKEAVPY QAREGFAWEEHPIFITQDEIDAFLVGGGAYRDGRLSIYAFFIQDKSEREKADFVKESYGI GGRSHALSGADHSHADYDSKGLKLERGSYSHPEATVFLKWTQVAKRIEFLIDQDFYLKAA 
DYTRMPSYEREQLAKRIVSFYYRLPEEIERPFPDSFLGEEARRELPLLLEQEETAEELVK KMDEALAALPLEFENYEKRVETLAILHQYIEGTYTLFPERQKEISIESGNQQLSLFDFMQ EEPFDMEKLEEPNLSQPDNEVADTIDVIGTDGKIITAVPKKQNQTKEDTIPQQKEHAEER NNFRITDDALGIGSPKEKFRGNIEAISLLKKLEAENHLATAEEQEILSRYVGWGGLSAAF DDRKEEWSQEYQELKSLLSESEYKEARSSTLNAFYTPPTVIKAMYQILENMGLSTGNVLE PSCGVGNFMGLVPESMQNIQMYGVELDPISGKIAGQLYQKNRIKVKGFEKTEYPESFFDC VIGNVPFGNYQVSDRKYDKYSLMIHDYFIVKSLDLIRPGGVVAVITSSRTMDKESEKVRL QFAEKADLLGAIRLPENAFRKNAGTDVVSDILFFQKRDRAVLQKPSWVEVGETEEGYKIN SYFVEHPEMVLGDFALESNQYGRMEVTVKPRKENSLEEQLKAVIPFIQGQITVREFDELD LNETEDSILADPSVKNFSFAKVGEQVYYRENSRMNRMDLPAVTTERVIGMIEIRDITRKL IDSQMEDAGEEEIKSLQKELNQTYDTFTEKYGLLNSAANRRAFSQDSSYCLLASLEFLDE EGRLKRKADIFHKRTIRKAEPIERVDTASEALALSLAERAKIDLSFMTELTGKPEAEIIE ELTGVIFQNPLTEKWETSDEYLSGNVRVKLEVAKQFAENHDRYQVNVQALEKVQPKELEA SEIEIRLGATWVDAEYITEFMGELFQTPDYYLGSRIEVKYAPVNGQWNISGKSLDTYGNT RVTATYGTARANAYRLLEDALNLRDTKIYDTIQDADGEHRVLNKKETMLAQQKQDMIKEA FKEWIFRDIDRREALCKKYNEIFNSIRPREYDGSHIQFVGMTPEITLMPHQKNAVAHVLY GGNTLLAHCVGAGKTFQMIAAGMESKRLGLSQKNLYVVPNHLTEQWGSDFLRLYPGANVL VATKKDFEPANRKKFCSRIATGDYDAIIIGHSQFERIPLSKERQMAMIEQQIEEIMLAIE EAQSDDSPRYTIKQMERTRKGLEKKLEKLNDDTRKDDVVTFEQLGVDRLFVDESHFYKNL FLYTKMRNIAGIAQTDAQKSSDMFMKCRYMDELTGGKGVTFATGTPVSNSMTELYTIMRY LQYDTLKKLQLGHFDSWAATFGETVTAVELSPEGTGYRAKTRFARFFNLPELISLFKESA DIQTADMLHLPVPEAEYINEVLKPSEEQQELVAAFGERAEIVRSGCVDASVDNMLKITND GRKCALDQRLINDMLPDYEDSKINRCVENTFSIWEETKEQLSTQLIFCDLSTPKSDGSFN VYDDIREKLVTKGIPKEEIVFIHEAGTETKKAELFAKVRSGQVRVLLGSTPKLGAGTNIQ DRLIALHHLDCPWKPADLEQQEGRILRQGNQNKKVKIFRYVTENTFDAYMWQILENKQKF ISQIMTSKSPVRACEDVDDTALSYAEIKALATGNPYIKEKMDLDIQVSKLKLLKANHTSQ KYKLETDIAKNYPMQIAAQKERLAGLRSDAEAVRPIFENEDFSMIVSNKTFTDKKEAGTA LLAACEGLKAIHTEGKIGTFHDFSLYAKFDAFNQRYIMTIKRKCSYLIEMGKDILGNLQR ISNALGGIEKKMEEAEQKLETLQKQLETAQEEAAKPFPKEKELQEKMERLAELNSLLNMD EKEMPQDQEKEVDQVADLSRRSVNYAGRVSEGRKGKIEKSSVLGRLKTTQRNMEKGWKNR NMQKKKPEQQL >gi|330402635|gb|ADLB01000021.1| GENE 229 229396 - 231477 1525 693 aa, chain - ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # 
Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 3 649 5 675 709 433 39.0 1e-121 MFLVLGEKPSVAQALAEVLGAKKRADGYLEGEDCIVSWCLGHLAEYAAPEVYDERYKKWE FTDLPILPKAWELRIAKDKKEQFDVLKELLNRKDLEYVVNACDAGREGELIFKRVYDLSK SSLPVKRLWISSMEEQAICDGFSNLKNGKEYERLCEAAVCRAKADWLVGMNATRAFTTTY YKRMVVGRVQTPTLAMLVERQKKVENFEKEAYYQVELPVLDFFVTSEKMKKEEEADALVK MCEGNPIHILKVERKQKKTNPPKLYDLTTLQREANRFFGYTAQETLTELQKLYEEKWVTY PRTDSQYITEDMEQSVLELLDILSDMLPFCSRELVGRNVGAIINNQKVSDHHALLPTKEA WKQDIGTLSKKQKDIFYLIGQRLAQAVSEAAVFEETEVSAECAEHLFATKGKRVVEPGFQ KILKAFQEKVRKTEEAGEKEEKVSISENISDGMEIAGRIPEKRKRFTAPPKPYSEDTLLS AMETAGNHSFDVETEKKGLGTPATRAGIIEKLVSSGYAVRKGKQLLPTKEGIALISVLPE ELKSAALTAEWENELLRMERGEVTSETFMEDITEFVQKLIAGCGEIPKEERYRFYEKESI GKCPVCGSPVYEGKRNFYCSSHDCNFALWKESRYLVGMKKTLDQKMAKELLEHGKTRVTD FYSQRTGKKFTADLLLELQGDRTSFKMEFPKRK >gi|330402635|gb|ADLB01000021.1| GENE 230 231535 - 232197 828 220 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0776 NR:ns ## KEGG: EUBREC_0776 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 42 220 29 211 215 82 33.0 2e-14 MVNVKFKKWLAACIVSMGIVCSTGATAFAYTGEGTPTEQGTEAPTPIPEQTPAPQEGQTE EGTPFSVPGNGQILDDKNNDSTKQFLTVQTKNGNTFFLVLDRSSSTENVYMLSMVDENDL AEFVPEKKPQTETPSVVIPETEKKPEAEKPEKKADKAENQTGALLAIGLLAAGGAGAYYY FKVAKPKKAETDAEDEDLEFYNGPYINEDEDAEDAEETEE >gi|330402635|gb|ADLB01000021.1| GENE 231 232178 - 232453 250 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610624|ref|ZP_03288524.1| ## NR: gi|210610624|ref|ZP_03288524.1| hypothetical protein CLONEX_00714 [Clostridium nexile DSM 1787] # 1 91 8 98 98 90 90.0 3e-17 MLDEIEKTETKIAQWQEHLKMLLVQKKQLEDTEILKSIRSMRLESRELLQVLEKLQKGEF QFTGNRYENPRKSDFSEKEETKEETNGEREI >gi|330402635|gb|ADLB01000021.1| GENE 232 232496 - 234919 1820 807 aa, chain - ## HITS:1 COG:BS_yddH_2 KEGG:ns NR:ns ## COG: BS_yddH_2 COG0791 # Protein_GI_number: 16077564 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell 
wall-associated hydrolases (invasion-associated proteins) # Organism: Bacillus subtilis # 685 806 4 123 124 101 40.0 7e-21 MQEHKYQAREKPVREIEREEKTKIRQEGRGQARDRPIEKALLEKNMADKETVPEEKEKQE VSAGQRKKKQNTLRRFASHREEGIDGKFTREPVTISEPEEKLEGENGKEELAEHSLRMGN LRVKAKNQSLEGESSQPGSRKKKLVQEHARRRKTEEENAVSPEDGKGRKPQDSSLESFRE KKQKQKRLQEEAGRSQKKISFSDEGNGMIRGAGMGIAKKGISVTASAIENSLQTQETDEN AATEGAKQSGILSEQALRYAVKKSRRQVQYHKGRYKEEESVKNSLHFDPMQGTADDIKTE RAGKQAQERKHLLNRFWQKQRYKKARQAAQQGEKTTAEAAKTAQTFLEKAKSIVTSFFKT HKGFFGIVIAILLFVLMIGVGMSSCSAVFQGTVSTVVGTTYASDDEDIYAVEDAYSKLEA ELNQQINSMEARHEGYEEYRYQVDEISHNPYHLISYFTALYGDFTYEQVAGEIEEIFQQQ YHLTTEATKEVVTETKKVKVGESLGKVVTSGYCNCSICCGSWSGGPTASGVYPKANHTIA VDANNPFVPIGTKVVMNGVEYTVEDTGAFARYGVQFDVYYDNHAAASAHGHQTWEAFIAD SNGKNEIEVTTTQEVNRMDITLTNKGLDAVLRSRMDENEEKRYDIYNTTYGNRDYLFDKN TIPMGGGNGGFGYEIPPEALTDERFANMIREAEKYLGYPYVWGGASPSTSFDCSGFVSWV INNCGNGWNVGRQTADGLRSCCAYVPPEQAKPGDLIFFQGTYDTPGASHVGIYVGNQVMI HCGDPIQYANISTPYWQQHFMAFGRLP >gi|330402635|gb|ADLB01000021.1| GENE 233 234956 - 235327 184 123 aa, chain - ## HITS:1 COG:no KEGG:EF2320 NR:ns ## KEGG: EF2320 # Name: not_defined # Def: TraE protein, putative # Organism: E.faecalis # Pathway: not_defined # 21 118 728 825 825 163 80.0 1e-39 MKMSENSGGTDKKLSVNHKQIPTGITQNIKDLLASREIENIFENSDFIYMLNQAAGDRQI LAKQLNISPHQLSYVTNSGEGEGLIFYGNVIIPFKDRFNHNLRLYSLMTTRPSDLEKHAG KGV >gi|330402635|gb|ADLB01000021.1| GENE 234 235544 - 238348 1729 934 aa, chain - ## HITS:1 COG:CAC2047 KEGG:ns NR:ns ## COG: CAC2047 COG3451 # Protein_GI_number: 15895317 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Clostridium acetobutylicum # 674 906 261 525 617 63 22.0 2e-09 MVDRMDKRKPIKLTRSEKKAMRRTEKLAVKLEKEQERADKIRRKKEAKRAAREEKRRKKG NRKSGSQRKDKKPTPNKRTEKKHSYSEQKSSTGNVRCTNVKRKQERNISAQNSIPYREMA KDGICRVQEKYYSKTIRFYDINYQLAQNEDKNAIFENWCDFLNYFDSTIHFQISFINHHS NMKEFESVIQIQPQNDAFDDVRMEYAQMLRDQLAKGNNGLVRTKYITFGIEAENIREAKP 
KLERIEADILNNFKVLGVSAYPLNGEERLQILYETFNPEEKVPFQFSYDRILRSGMGTKD FVAPTSFVFKEGKTFQMGNTIGAASYLQILAPELTDKMLAEFLDMNRNLIVNLHIQSIDQ MKAIKLVKNKVTDINRMKIEEQKKAVRAGYDIDIIPSDLNTYGGEAKRLLEDLQSRNERM FLVTVLFLNTAKTKQELDNAVFQTAGIAQKYNCSLRRLDYMQEQGLMSSIPLGMNMIPIK RALTTTSTAIFVPFTTQELFMGGESLYYGLNALSNNMIMVDRKKLKNPNGLILGTPGCFT GETKLLLPDGRKVSFLELLAKKEEVLVNSFDFQKQELVKARGYDVRCTKEVTELVEVELE NGETVRCTPEHWFLTQSAGYVEACNLKVGAKFIPEHEVKAVRFLSLEEAVPVYDISVEGY QNFLLSCGVVVHNSGKSFAAKREIANVFFATQDDIIIGDPEGEYYPLVHALGGQVIHISP TSHDYINPMDINLDYSDDDNPLGFKSDFILSLCELIMGSRNGIEAEEKSVIDRCLPLVYQ KYFENPIPENMPVLGDLYDCLRKQEEVQAQRIATALEIYVNGSLNVFNHHTNVELNNRIV CFDIKDLGKQLKKLGMLIVQDQVWNRVTVNRVAHKSTRYYIDEFHLLLKEEQTAAYSVEI WKRFRKWGGKQNIILKILQRSIVYAVLIEAEGNN >gi|330402635|gb|ADLB01000021.1| GENE 235 238329 - 238706 142 125 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3575 NR:ns ## KEGG: EUBREC_3575 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 124 1 120 154 127 54.0 2e-28 MAFVSVPKDLTKVKNKVVLNLTKRQLLCLSIAAAMGLPFYFLTKDWIGTSNAATGMVILM VPAFLFAMYEKDGMPLEKILFHIIQVKLQRPAVRKYEIENFYEKEENLQTERKKKGGKPD GRQNG >gi|330402635|gb|ADLB01000021.1| GENE 236 238737 - 239345 473 202 aa, chain - ## HITS:1 COG:all7280 KEGG:ns NR:ns ## COG: all7280 COG4725 # Protein_GI_number: 17233296 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Transcriptional activator, adenine-specific DNA methyltransferase # Organism: Nostoc sp. 
PCC 7120 # 2 173 8 190 210 108 34.0 7e-24 MGKYQVIYADPPWAYRVWSKKGNGRSAESHYPTMSIEEIADLPVKELADENCALFLWVTF PLLREIWKVIDAWGFTYKSVAFVWIKQNKRADSLFWGMGYWTRANAEICILATKGSPKRY SKRVHQAIVSHIEEHSKKPEEARRRIEQLMGDVPRIELFARRETPGWDVWGNEVACSLGT EILQENKRKETSEKKMGELLNL >gi|330402635|gb|ADLB01000021.1| GENE 237 239359 - 240231 842 290 aa, chain - ## HITS:1 COG:no KEGG:CD1112 NR:ns ## KEGG: CD1112 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 5 290 4 289 289 359 68.0 5e-98 MFDGIFEAIEEWMRGLLTGMVESNLTTMFTDVNEKTGEIAAQVGQTPQGWNGSIFSLIQN LSDSVIVPIAGMIITFVLCYELISMFTEKNNMNEVDTWMFFKYFFKMWVAVYLVSHTFDI TMAVFDVGQHIVNAAGGVIESDTAINVDTMLEQMKTAMESMEIGELVILALETMLVSLCM KIMSVLITVILYGRMIEIYLYTSVAPIPFSTMSNREWGQIGNNYFRGLFALGFQGFFMMV CVGIYAVLIATIEISDNMHSALFGVAAYTVILCFSLMKTANLSKSIFNAH >gi|330402635|gb|ADLB01000021.1| GENE 238 240254 - 240466 235 70 aa, chain - ## HITS:1 COG:no KEGG:CKL_0289 NR:ns ## KEGG: CKL_0289 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri # Pathway: not_defined # 1 70 1 70 70 93 82.0 2e-18 MDFFSSAVDILQTLVVAIGAGLAVWGVINLLEGYGNDNPGAKSQGIKQLMAGGGVVLIGT QLIPLLSGLF >gi|330402635|gb|ADLB01000021.1| GENE 239 240743 - 242677 1695 644 aa, chain - ## HITS:1 COG:CAC1969 KEGG:ns NR:ns ## COG: CAC1969 COG3505 # Protein_GI_number: 15895240 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Clostridium acetobutylicum # 166 593 157 556 591 203 32.0 1e-51 MKKKEQKKEKQFKKVWEKLWQNVIGRLPEVDIKRLFLTNLPYVLVFYIVNKLAWLYQYCQ GNSLLERLSVLTLNVSLAFENILPSLRFSDIGVGFTGAVLLKGIVYVKGKNAKKFRQGVE YGSARWGTAKDIAPFMDSAFENNIILTQTERLTMNSRPKKPKYARNKNVMIIGGSGSGKT RFYVKPNLMQMTPNVSYVVTDPKGTILVECGKMLQKGTPKIKDGKPVLDKNGKVIYEPYK IKVLNTINFKKSMHYNPFRYIRSEKDILKLVNTIIANTKGDGEKSGEDFWIKAERLLYCA LIGYIYYEAPEEEQNFSTLLEFINASEAREDDEEFKNAVDELFEELEKEKPEHFAVRQYK KYKLAAGKTAKSILISCGARLAPFDIAELRELTSYDELELDMLGDQRTAMFVIISDTDDT FNFLVAIMYTQLFNLLCDRADDVHGGRLPYHVRLLLDEFANIGQIPKFDKLIATIRSREI 
SASIILQSQSQLKTIYKDAAETILGNCDTMLFLGGKEGSTLKEISETLGKETIDLYNTSD TRGQSRSYGLNYQKTGKELMSRDELAVMDGDKCILQLRGVRPFYSNKFDITKHSRYRELS DYDKKNTFDVEAYLKHQFEIRKKEEFELFEVEGKQMDTETSEQV >gi|330402635|gb|ADLB01000021.1| GENE 240 242674 - 243192 376 172 aa, chain - ## HITS:1 COG:no KEGG:CD1116 NR:ns ## KEGG: CD1116 # Name: not_defined # Def: hypothetical protein # Organism: C.difficile # Pathway: not_defined # 1 171 1 157 158 94 40.0 1e-18 MQDEVNEKVVSLCIRGGKISAQILKSALLKTLTKLEQQKSQRKQKKGVSKEEKNLAVYKG KQSMEKLKEQNCELSNIEITDGNIKSFEKYARKYNVDYCLKKDRSSEPPRYYVFFKARDV DSMTAAFKEYTGWSMKQSKKVSIRKKLSQVKERQAQHRERQKTKSKERDAAR >gi|330402635|gb|ADLB01000021.1| GENE 241 243226 - 243372 82 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAAPENSAIIFSQNSQRKIRKKQDNVWVALMENILLVLDTVCRKYYRT >gi|330402635|gb|ADLB01000021.1| GENE 242 243438 - 243584 159 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610583|ref|ZP_03288509.1| ## NR: gi|210610583|ref|ZP_03288509.1| hypothetical protein CLONEX_00699 [Clostridium nexile DSM 1787] # 1 48 1 48 48 84 100.0 2e-15 MVGKIIFGVAVILTFGLYCCVRVGGQADRQMEEMLQKKRKDQENGESG >gi|330402635|gb|ADLB01000021.1| GENE 243 243659 - 249580 5400 1973 aa, chain - ## HITS:1 COG:L148778 KEGG:ns NR:ns ## COG: L148778 COG4932 # Protein_GI_number: 15672133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Lactococcus lactis # 1070 1973 1184 1982 1983 122 24.0 6e-27 MKKTVKRIFSGVLSALTILTSVVQPMTAYAGETNPKSYEMQYPALEQVKAQLQEDEIVTV QDYEVEKGSDFDIKTDFSGMKFHSDQVKVTFYEAKNQKGQDFHTDHADTYQAVYFVEPVR EAPNYHVIRNITVAEHKEKENPSPSKEHGKQEKNSEEEEDASQKEEQILTEDEFSQVLEE SKKQDTYDEESGLGLYDVLSQAEEQEIDLFQIKEGETVTFVANASDQAARATQTVTITKG PLYRYQDYDLGTYVTEPYFISYGNVHATAYCIQPSLPGPGTGTYTITKIEDNQALAKVCY YGTEAAGKESYFAKHHSDFSEGKQFILTHIAAAYAYGSADAFYGANETAKSLAMEVYNYC VGKPEIPEVSMSFSKNHVKAYQEGQEQRTEDITFTADSLQSITMQLPKGVVFHNLTTGKN SAAGARVTLSGGTKFYLSAPLTQTEDVEGSWSATMQGSITKDYSAYKLVTNDSVQDLAFV FGEGVEQEQYVEFSVEWLKNAKIELIKKDQSSQKKMEGAVYGVYRDKDCTDLIVEMPATD 
KNGVSSVTIEKTQDTVYLKEISAPQGYVLDTKYYGVKLEVGKTVQIEVADQEQLASLTVY KEGEVLTGADVTEEGVTFSYTKEKQKGAVYDVYAEKEIVRADGTVAYKKGAVVKKGLKTG EDGSVTLSELPLGTYKVVEVKAPENFVCKGESQTVTLSYAGQNEEVVFKTITFVNDRQKA SVSVRKEDQDTKNPLAGGIYGLYTAEDIHGKAGNLLVKKDTLIEKVTTGENGTASYQADL PIGYSYYIKELQAPAQYVKKETEKFFFDFQYAEGEEHCSFSHTFQNERVKAEIILEKEDA ETGKTPQGDASLKGAVYGLYARENIVHPDGKTGVLYQKDEQVATLKTDDKGKAEVKDLYL GSYYIKEITPPAGYLLDETEHDVVCDDEGDTVAIVKRVCNVKEQVKKQPFQLIKAGNNGN TDAELLKGAGFQAYLESSLKKKEDGSYDFASATPVVIGENGATELFTDERGYACSIPLPY GTYVVRETTTPPNYKPVKDFIVRITEHKPDTPQIWRVLLDKEFEAKLKIVKKDDESKKPV LKKGTEFKVYDLDRKKYVEQVTTYPTTVVHKSYFTDETGYLILPQNLDAGHYRIEEVTAP DGYTLNENYYEVSVASDTAYQMDSISGDVIIEMVYENHPVKGELQIVKKGEILEDFDKDF SYQEINLSGAVFDLYAAEDIYTADAQKDEQGNRILEYAEGTKVATLTTDQEGKASVSDLP LGTYELVEITAPEGFVKNPEPQTVIFKYEDQKTPVIEQEVLLKNERQKAKISVVKRDAET KKEIKGAVFGLYAKEDIQVGNQTLVKADTLIGKAETGENGKATFSFDLPFGKYYVKELEA PAGYVSSEQVLDVDFSYQGQEIDVVELTSEFLNQPTKVSITKVDVTTGVELSGATLMVLD KDGEVVDSWKSVKGEAHVIRGLKVGETYTLREETAPYGYLRAEEVSFTVEDTEEIQKVEM KDDVPTGSILINKKGEFLEKVSVFEQIGGWISHVFEYLSGSLKEVTFAVYVREDIQAADG ESKDYYKKDELVAEITTDHTGVARISGLPLGKYYVKEKETAHGFVLDKEAREIDLLYRDQ ETAEITYSTDWQNKRQKAEVIVWKKEKDVDRMLQGAVFALCVKEDIKNAQGNVIMKADTV IEEQATNAQGMLRFEADLPVGYTYYIKETASAQGFSMTGERKEFTFDPENSQEACVSYEF TFENVPTVVEFTKTSLTDGKEVEGAKLQVKDEQGKVIDEWISAKEPHVIKELVAGHTYVL EETLPADGYVTAESISFTVADTGEVQKVEMKDDITKVEISKTDISGKELPGAKLTILDQE GNEVESWTSGEKPHYIEMLPIGKYTLHEVSAPDGYLVAEDIIFEVKDTGEIQKVVMKDER KPSENVEQPQESKPTGNAPKTGDTSQVELWVLLSAMAVLLAGSSYYWIKKRKK >gi|330402635|gb|ADLB01000021.1| GENE 244 250027 - 250791 596 254 aa, chain + ## HITS:1 COG:no KEGG:Ent638_4316 NR:ns ## KEGG: Ent638_4316 # Name: not_defined # Def: hypothetical protein # Organism: Enterobacter_638 # Pathway: not_defined # 1 238 21 269 285 179 42.0 9e-44 MEKETILNELDELLIEGNSILSTKWTESMGIYTYVDEENYYSWRTKVLSFLKLFLSIESD YINVFSELSESSFSNANICVQTLQNIKEYIKKNYISIEQKNNIKIDEILDLIFTRFHKIA RQLQSRYNHRDTLTIDDEYDVQDLLHALLQLYFDDIRAEEWTPSYAGKCARVDFLLKNEK IVIEVKKTRKGLTDKEIGDQLIVDIDRYKSHADCQKLICFVYDPEARIGNPIGIMKDLNT QHNGFVTVYIRPLI 
>gi|330402635|gb|ADLB01000021.1| GENE 245 250820 - 251263 473 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610579|ref|ZP_03288505.1| ## NR: gi|210610579|ref|ZP_03288505.1| hypothetical protein CLONEX_00695 [Clostridium nexile DSM 1787] # 1 147 1 147 147 248 97.0 9e-65 MRGKKPKERKSYVLRMPDGSLIEVTREVYLEWYQSRRKERYQMERQKKQRVCSLEALNGY GDLEEEKYDTEETVLCKLYEAKVREGIARLAKEDARLIYLLFFEEVTIKDAAQIMGCSRK TIYNRRKRILSELRSILHDLGIVGGAF >gi|330402635|gb|ADLB01000021.1| GENE 246 251280 - 251897 379 205 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610578|ref|ZP_03288504.1| ## NR: gi|210610578|ref|ZP_03288504.1| hypothetical protein CLONEX_00694 [Clostridium nexile DSM 1787] # 1 205 7 211 211 322 85.0 6e-87 MLTLAELKHFMPVPVLEENYPTAKYLKTHGTIFVSKSIATNVKISVYQNGYALYEISGLA TVFPIWDCQNYRYEMEQNEISEQWFEKEAWYLRLILEGEDRINRNLETRQQRKSISYSAV SEEWGVLGSLEATVLETAIRKEMIQELLGLLTERQKEVVFEYFWEQKSQSEIAEKLGVSH AAVSKILTRAVTRLRKKCMEWKEAV >gi|330402635|gb|ADLB01000021.1| GENE 247 252076 - 252477 353 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610577|ref|ZP_03288503.1| ## NR: gi|210610577|ref|ZP_03288503.1| hypothetical protein CLONEX_00693 [Clostridium nexile DSM 1787] # 1 133 15 147 147 194 87.0 1e-48 MGNRAVSQELIILGNRIRECRKEREFSQEILAEKSGVSTNTISRIEGGQMAMSVGILQKI VKALGVDANTLLGVSTEVNETKIWVSAFSSRVQELKENEQEILKHTMNALIESMEKNRLK ISTDKSSQVLHTR >gi|330402635|gb|ADLB01000021.1| GENE 248 252594 - 252917 181 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610576|ref|ZP_03288502.1| ## NR: gi|210610576|ref|ZP_03288502.1| hypothetical protein CLONEX_00692 [Clostridium nexile DSM 1787] # 1 107 4 110 110 150 90.0 2e-35 MRQEIFCRMIFVTEVLLWTVIFYPFCVEGGNCDYLKLWILVGIPFGIRKMQIWIVPRKRD VGSTIAMLVFQMLIGGIIGGVVLVWKLFLLFVFMVLQLYKAIKKLIK >gi|330402635|gb|ADLB01000021.1| GENE 249 252914 - 254155 886 413 aa, chain - ## HITS:1 COG:no KEGG:Amet_3992 NR:ns ## KEGG: Amet_3992 # Name: not_defined # Def: replication initiator A domain-containing protein # Organism: A.metalliredigens # Pathway: not_defined # 1 397 1 319 319 242 37.0 2e-62 
MEKIQFDYYRGMEAEQYSFYRVPKILFTAECFKELSCEAKVLYGLLLDRMSLSMKNHWLD EEERVYIIFTIEEIAELLNCGTQKAVKLLKELDSEKGIGLIEKKRLGLGRPNVIYVKNFL VQKNDEENSDTSDLQNCENHNSGVVKTTIQEFPKSQFKNDENHNSEDAEPIDIETEWLEK EPYLSDGKEILENMEIKMQENEGIGEENFQNCENHNSRVVKTTIQEYPKSQFKNDENHNS GVVKTTIQECPKSQSNNTDINKTENNETESSSILSNLICPEKEKTIDEIEQRNTYREIIR ENISYECFRNDTPHAREEVDELVELMVEVMVMPDQGKIRIAGEDKLVSLVKSQFMKLTHA HIEYVGLCLNKNTTKVGNIKSYLLTALYNSVLTINHYYQAEVNHDLYGGGWGK >gi|330402635|gb|ADLB01000021.1| GENE 250 254136 - 255041 782 301 aa, chain - ## HITS:1 COG:PA5562 KEGG:ns NR:ns ## COG: PA5562 COG1475 # Protein_GI_number: 15600755 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pseudomonas aeruginosa # 20 176 25 175 290 99 35.0 9e-21 MRSKSAEKVKLSSFDDLFGKEETVSGEVVTSVPIDLLHPFKNHPFHVQDDEKMEETVESV KQYGILMPGIVRPYPDGGYEVVAGHRRWRACELAGLEEMPVIIRDIDDDTATVIMVDTNI QREDILPSEKAFAYKMKYEALKHQGSKGEKYTAEMVGETAGDSGRTVQRYIRLANLISGL LDLVDVKKVPMIVGEKLSYLTREEQELVLEAVRNCEIMPTAAQAEAIKMFSEEKKLDGNA IYGLLLKKKNSGNGVTISAKKISSYFPPAYTKQEIENVIYTLLEEWKNGREEELEHGEDT I >gi|330402635|gb|ADLB01000021.1| GENE 251 255031 - 255816 866 261 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 258 1 251 253 199 46.0 5e-51 MCKVIAIANQKGGVGKTTTCVNLGIGLAREGKRVLLIEADAQGSMAASLGIQEPDELEVT LVTIMEKVINDEDVEPNEGIIWHDEGIAFIPANIELAGLETALVNVMSREMILKQYLDTV KAEYDYILIDCMPSLGMITINALVASDYVLIPVEAAYLPVKGLQQLIKTIGRVHRKLNPQ LSIMGILFTKVDRRTNFARDIAEQIRQVYGTRVHIFKNCIPLSVRAAETTAEGKSIYLHD PKGIVAKGYISLTEEVLAYEE >gi|330402635|gb|ADLB01000021.1| GENE 252 255819 - 255992 232 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210610572|ref|ZP_03288498.1| ## NR: gi|210610572|ref|ZP_03288498.1| hypothetical protein CLONEX_00688 [Clostridium nexile DSM 1787] # 1 57 1 57 57 100 98.0 3e-20 MLKVRVQGTKSDIRWFREKLEQHPQIAVLQVSDLLSNKGTTKYFRMYAEIEEKKGGR 
>gi|330402635|gb|ADLB01000021.1| GENE 253 256005 - 256439 354 144 aa, chain - ## HITS:1 COG:no KEGG:MGAS2096_Spy1123 NR:ns ## KEGG: MGAS2096_Spy1123 # Name: not_defined # Def: hypothetical protein # Organism: S.pyogenes_MGAS2096 # Pathway: not_defined # 16 143 4 120 134 117 43.0 2e-25 MNNTALRAETEITNTIIFKNEKHKEFYQTYMKKCRRQDVYHKALIYCLGISEDTRVNIDR IYDLRTGYVKTECLQEGWQTSGSVRIVRMAFNLYCNGTPSVYDYESEEGKLQEYAGYTPE ELFCCGYARYFWEAIKIRYPEYCF >gi|330402635|gb|ADLB01000021.1| GENE 254 256656 - 258581 2238 641 aa, chain - ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 8 492 22 509 515 563 59.0 1e-160 MAEQKNAQEQDINQLRKVRREKLADLQENGKNPFLITKYDVTHHSMEIKDNFEEMEGKDV SIAGRIMSKRVMGKASFCNVQDLQGNIQSYVARDNIGEEAYKEFKKMDIGDIVGIKGDVF RTKMGEISIHAQEVTLLSKSLQILPEKFHGLTNTDLRYRQRYVDLIMNPDVKDTFIKRSK IISAIRKYLDGQGFMEVETPILVSNAGGAAARPFETHFNALNEDFKLRISLELYLKRLIV GGMERVYEIGRVFRNEGLDTRHNPEFTLMELYQAYTDYNGMMDLTENLYRYVAQEVLGTT KITYNGVEMDLGKPFERITMVDAVKKYAGVDWNEVQTLKEARALAKEHKVEYEERHKKGD ILALFFEEFAEEHLIQPTFVMDHPIEISPLTKKKPDNPEYTERFEFFMNGWEMANAYSEL NDPIDQRERFKAQEELLAQGDEEANTTDEDFMNALEIAMPPTGGIGFGIDRMCMLLTDSA AIRDVLLFPTMKSQGAAKNEANNATQVKTEEKSAEKIDFSNVKIEPLFEETVDFDTFSKS DFRAVKIEACEAVPKSKKLLKFTLNDGTDRKRTILSGIHEYYEPETLVGKTAIAIVNLPP RKMMGIDSEGMLISAVHEEDGREGLNLLMVDDRIPAGAKLY >gi|330402635|gb|ADLB01000021.1| GENE 255 258598 - 259080 798 160 aa, chain - ## HITS:1 COG:CAC3198 KEGG:ns NR:ns ## COG: CAC3198 COG0782 # Protein_GI_number: 15896445 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Clostridium acetobutylicum # 3 157 4 158 158 129 57.0 3e-30 MADKKTILTYAGLKKLEEELENLKVVKRKEVAGKIKEAREQGDLSENAEYDAAKDEQRDI EARIEELEKILKNAEVVVEDEVDLDKISIGCTVTVYDNEFEEEIEFKIVGSTEANSLEGK ISNESPVGKALIGRKVDDVVEVETQAGVMEYKVLKIERSI >gi|330402635|gb|ADLB01000021.1| GENE 256 259100 - 259207 64 35 aa, chain - ## HITS:0 
COG:no KEGG:no NR:no MNNLDNPLFVRYNTEIVKVYSNEEMHEKIHEKIKW >gi|330402635|gb|ADLB01000021.1| GENE 257 259257 - 262796 3132 1179 aa, chain - ## HITS:1 COG:SP1326 KEGG:ns NR:ns ## COG: SP1326 COG4409 # Protein_GI_number: 15901180 # Func_class: G Carbohydrate transport and metabolism # Function: Neuraminidase (sialidase) # Organism: Streptococcus pneumoniae TIGR4 # 46 463 283 739 740 159 31.0 4e-38 MKKRRRILALLLATTLTLSSMPAVLAKEGTKPQDSTTKEQPFWSGTGESDKFRIPCLVSL DDGTLVAGCDARWTTHMDGGGLDTIVSYSKDKGQTWRYTFANYLGDNGNEHNVNSTAFID PAMATDGERVYMIADLYPAGYALNGAKKQPVPGKSHDDKGNILLADARNWNDVWGSERTN SANYTYRLIKNEGEDGRPTYSIQDANGKIMEGYTVDAYFNVKGENVNANLFEADSPFQVW PTDYLYLTTSEDGGRTWSVPSILNLRRDSEQSLLVGPGRGMVTSEGRIVFTAYEFTNGDK NSAAIYSDDGGRTWKRGKSVSGLSSEAVVTEADGKLYMFTRHGGYYVSNDWGETWSGRKD MGINYNLGCQLTAVTYPEKIDGKTAILFAAPSNTTTRAAGKIFVGLVQEDGTLKWTYDYS INGSDYYAYSCLTVLPDGTVGLLYENGATEIKYIDLDIKDIAKGATIGNIWCTDDAGKTV SAVTMKSNESKTFTVNGLKDSAEVTVNSDNESAVEATYENGKLKLTSSKVTGLKQAVVTV KCGEADTELHVNVTDSEKYEIVNLRVGDTKTYTDKTGNYSDNKLEELDKDIAEVILEGED AKDEEIKPEAEIQLGTNAQFDGEKKKISSSLFTFDKKDNGKYAISATTAKGEKVYLTPKT AATANTPLTKTSADITVAKGEDGTFSFTQEAAGGAGGILYFHKEAGKLYFDRNGTMHNNC KFDIYKAAENAKDSEIPGYIKVKDLSEITDNGQYVIASKAEDRRNYLLNPSQETDKYSYV AKVTGKMYEGQPTPAKTDITIKGKAEGETEVTIGKTTYYIIVKNDVKEVNLKVGETVNIP GKIVQAKGETDSISKVERNDMPPYKAITELKEGTYLFGSGTHILLNTPSSKDNPTGLAMQ AANFNVGEYNESLWTIEKSENGYTMKDVNGKYVNIDNQNVTLKDTPQTLTISKRSQGGFA VSRNGYYLNNWEGANNKVAAYTGDDNAWSFYKASVGNVVTGEKAGKVTLVTEGTTYKITV EQDPVTCKHTWGDWEVIKEATCTEGGEETKTCTKCHATETRQTKALGHKFSDWEVVEEPT ATEDGYEERVCERCKARQTKVLPATGENNNNGGDTGNQNNGETAGNSGNHDNSGNSGNSG NQNQKPVKTGDMADPLTTALGMAGCLGAIIAILRRKSGK >gi|330402635|gb|ADLB01000021.1| GENE 258 262979 - 263968 602 329 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145632364|ref|ZP_01788099.1| ribosomal protein L11 methyltransferase [Haemophilus influenzae 3655] # 1 325 21 342 353 236 39 8e-61 LLGRGNRGMILKIGNVELENRYILAPMAGVTDLPFRLLCKEQGAGLLCMEMVSAKAIQYK NKNTQALLKIHPEECPVSLQLFGSDPDVMSEVAKSIEELPFAILDINMGCPVPKIVKNGE 
GSALMQNPKLVEQIVKKVSGAIQKPVTVKIRKGFNDESVNAVEIAKIIEASGGAAVAVHG RTREQYYSGKADWDIIRQVKEAVSIPVIANGDVTCGQDAIEIQKQTNCDGIMIGRGAQGN PWIFSELLYYEKHGEMPERPDVHEVKKTMLRHAKLQMQYKGDYLGIREMRKHIAWYTTGM KNSAKLRDDINRVESFEELETLLDERLTD >gi|330402635|gb|ADLB01000021.1| GENE 259 263941 - 264282 519 113 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0423 NR:ns ## KEGG: Cphy_0423 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 111 2 107 107 129 64.0 2e-29 MYNKEERTVLCASNAYEKKYYFNPDFDALPDSIKDELKIMCVLYTEEIGGILALEFEEDG TLVFEVSSEENDFLYDDIGSRLKIKELQRTKEDMLRSLELYYRVFFLGEEIGE >gi|330402635|gb|ADLB01000021.1| GENE 260 264275 - 265912 1121 545 aa, chain - ## HITS:1 COG:MTH1916 KEGG:ns NR:ns ## COG: MTH1916 COG0340 # Protein_GI_number: 15679898 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Methanothermobacter thermautotrophicus # 283 535 4 257 261 195 41.0 2e-49 MEKLFRKETPLTEVLRELWKTLADGGVDVTPLRKLIHENVDEKKIRDSGIEFCIMTFSVS DMKELDLSMEDIPEGMLEDFLLASAYLVGFKNEKLHGKTYIDGGVINNVPMGALVDRGYE NIIQIRIFGPGREPKVRITDEMNVYRIAPHVKLGSIIEFHQRRSRQNMRIGYYDAQRMLY GLKGRIYYIEQTEEECYYKTRISRLSEKERIETAFELRMAVGYTEEELYLTMLEACAKLL HIQKYKIYTEQELYAQICRRYERAKEKEEFPGFVSLLVRIGRDYVTDLMEMNTRWASGTV YSYEEIDSTNAEALRLAKAGESHGTLVVAKKQYAGRGRRGRTWESEDEENIYMSLLLRPE FSAGKAPMLTLVMAYSVAKVLREQENLDVIIKWPNDLVIGKKKICGILTEMKMEENKISS VIIGVGINVNVESFPRELRDKATSLRREAGREFCCTDLIAKIMESFEQNYNYFSEVEDLS FIQEEYNEILVNCGKQVRILEPHNEYEAVALGINEEGELLVEKETGEIERVFAGEVSVRG MYEYV >gi|330402635|gb|ADLB01000021.1| GENE 261 265913 - 266158 427 81 aa, chain - ## HITS:1 COG:ECs1399 KEGG:ns NR:ns ## COG: ECs1399 COG1752 # Protein_GI_number: 15830653 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Escherichia coli O157:H7 # 8 79 4 80 356 60 42.0 1e-09 MRPVIDLEKEYAIMLDGGGARGAYQIGAWKALKEAGVKINAVAGTSVGALNGAFICMDDV EKAEKVWSEIAFSKVMDVDDT >gi|330402635|gb|ADLB01000021.1| GENE 262 266222 - 
267151 1287 309 aa, chain - ## HITS:1 COG:FN1549 KEGG:ns NR:ns ## COG: FN1549 COG0330 # Protein_GI_number: 19704881 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Fusobacterium nucleatum # 9 305 8 293 294 307 59.0 2e-83 MSIGAILGIALLVLIILILISCIKIVTQAQALVVERLGAYQATWGVGLHFKIPIIERVAR KVDLKEQVADFPPQPVITKDNVTMRIDTVVFYQITDPKLFCYGVANPLMAIENLTATTLR NIIGDLELDETLTSRETINAKMRSSLDVATDPWGIKVNRVELKNIIPPAAIQDAMEKQMK AERERRESILRAEGEKKSTILVAEGNKESAILDAEAEKQAAILRAEAQKEKMIKEAEGQA EAILKVQQANADGIRFLKEAGADEAVLTMKSLEAFAKAADGKATKIIIPSEIQSVAGLVK SVTEIGADK >gi|330402635|gb|ADLB01000021.1| GENE 263 267239 - 268576 943 445 aa, chain - ## HITS:1 COG:BS_yrkQ KEGG:ns NR:ns ## COG: BS_yrkQ COG0642 # Protein_GI_number: 16079695 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 19 439 8 427 432 150 25.0 5e-36 MFGERVMRLSMKRMRKNYLNRELFILSLVSFGIALLVYFVLSVSSTYLITRYYNTEEHIE KLEKRYAKELQGYITKEGITVKNIGEVDEWVFNRDDVFLKIFINGNLVYDAMYGATDNVV GSSENREHFLEMNSYELTLGEDKAEAILFCYDFAAENYGRYVSLLISFLVFFISMIAGVR KKMKYLVTLQRELHMLSEDLDSSITLQGNDEITDVAKGIESLRLSVKDKMEKEKMAYEAN HRLITSLSHDIKTPLTVMIAYLELAKSKGRECAELEKYIGISLDKANHLKELTNELFEHF LLHGGMQEVIFDRVNGNELIMQMLEENLFDLEMQGVDIRRDINDITSVLEVNVNMVHRLF HNLFSNLNKYGDLSKPIFIHYALENNHLVLSMQNTKAQMPDKRESAKIGLHNCEAIMEKH KGTFEVCENENTFSVRFAFPVCTKK >gi|330402635|gb|ADLB01000021.1| GENE 264 268515 - 269267 706 250 aa, chain - ## HITS:1 COG:BS_yrkP KEGG:ns NR:ns ## COG: BS_yrkP COG0745 # Protein_GI_number: 16079696 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 14 237 4 226 231 196 44.0 2e-50 MKKAKEGEHMQGSKIMVVDDDADIRQALRVLLETENYEVSEADNGEMALELLDETVDLMI LDIMMPEKDGLSTCREVREKYTFPILFLTAKTTENDKYVGFLSGGDDYLTKPFSKMEVLT 
RVSAMLRRYHVYQGKGSESKEKYICIKDLKIDRNVSRVFQGDKEIVLTNTEYDILVFMAK NPNRVFTLEELYEKIWGEPYHFSVNATVMVHIRNLRRKLNDTSQASRYIKNVWGKGYAIV DETNEKKLSE >gi|330402635|gb|ADLB01000021.1| GENE 265 269269 - 269703 372 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYNYFIFFLNILRRYLCFAAADVMLGEGDSLLQPSTKTDYVLVGLAVFVSIMIEREYWKS KENDREQLFLEVPKNLKQKAIFGVWVLLKILYNYLIIGVLQIIFLKSDSLLKPNTEGEYF IAFISILCSFPIAYFYQRKRRSMV >gi|330402635|gb|ADLB01000021.1| GENE 266 269839 - 270870 1206 343 aa, chain - ## HITS:1 COG:CAC3081 KEGG:ns NR:ns ## COG: CAC3081 COG3773 # Protein_GI_number: 15896332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell wall hydrolyses involved in spore germination # Organism: Clostridium acetobutylicum # 233 330 71 163 194 74 38.0 2e-13 MKREIMKVAVCLLLTTCLTIPAVFTSEAADAQSETQGIVELLGDEPVLNDITANLPAAAS VAVETQEFHQKALVNTDGEMSIYAAADENSEVAGKVYRNTVVHIEETGEMWSKVSSGDVV GYIKNDNLVLGTKAVERAKTVCPAKTTVNGKEYTVVDTNGKQLTLKGADNEQITVAAADV KVTRNTQNGKTMKQIAEEEAKKRAEEEARKKAEEAKKKAAAASSQTRNAPMSVSASDRDL MAAIIYCEAGAEPYQGKVAVGAVIMNRVRSSRFPNTISGVIYQKGQFGPAITGKLGRVLA SGKATAECYKAADAALAGENPIGNRLFFGNGNTGYKIGSHYFH >gi|330402635|gb|ADLB01000021.1| GENE 267 271028 - 271801 632 257 aa, chain - ## HITS:1 COG:CAC2358 KEGG:ns NR:ns ## COG: CAC2358 COG0566 # Protein_GI_number: 15895625 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Clostridium acetobutylicum # 1 255 4 259 261 165 37.0 9e-41 MITSTGNAQVKELLQLQKKSKVRNERNIFLVEGIKMFFEAPRNRIEKVFISETLFDRKKQ ELNLDGLKVEILSDKVFSHVSDTKTPQGILCIMRQKKTKLEEIFAQKPEHLMILDNLQDP GNLGTIVRTAEGAGVSGIILSKDCVDIYNPKTIRSTMGSIYRMPFLYVEDLENTIEEVKK QDIKVYAAHLQGKNNYDEENYKTGCAFLIGNEGNGLRDEIAEKADIWVKIKMHGEVESLN AAIASSILMFEVCRQRR >gi|330402635|gb|ADLB01000021.1| GENE 268 271806 - 272456 821 216 aa, chain - ## HITS:1 COG:BH0597 KEGG:ns NR:ns ## COG: BH0597 COG0569 # Protein_GI_number: 15613160 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Bacillus 
halodurans # 1 216 1 216 218 159 37.0 3e-39 MGRQYAVFGLGNFGKSVALSLERLGCEVLVVDKSMEKIQEISDEVSYAMRADIEEAEVME TIGARNLDGAIVALSENLEASIVATIMAKEMGIPYVIAKAQTELQGRILKKVGADVIVYP EKEMGIRIARGIVATNFAEWIDLSPEYSLVEWKIPGKWAGKTLLELRIRERKGINVVGII QDGNVNLSFEPTEPLPEDGIIILIGSNKVLQKLKEE >gi|330402635|gb|ADLB01000021.1| GENE 269 272469 - 273809 1378 446 aa, chain - ## HITS:1 COG:BS_yubG KEGG:ns NR:ns ## COG: BS_yubG COG0168 # Protein_GI_number: 16080162 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Bacillus subtilis # 2 428 18 429 445 271 39.0 2e-72 MQLIAAGFLGTIILGGVLLHLPMCNTKPIRFTDALFTATSAVCVTGLVSIVPATQFTVLG KGILLVLIQIGGLGVIACTFSFFLILKKKITLRERIIIQETYNMDTLSGLVKFLIKIIKG TFIVEGVGAILFAFQFVPEYGVVEGICYSIFHAVSAFCNAGIDILGDSSFIKYATNPLIN FTTMSLIVIGGIGFTVWGDMLKNVFHGIGKEGYSVRKAFTKLTLHSKIAISMTIALILVG MVSFFVMEYANPETFGNMTFEEKLMASGFHSVSTRTAGFATVPQAGLTNGSKLITCLLMF VGGSPAGTAGGVKTTTVAMLVLTCRSVIKGGKDTECFGRKIVEDNIRTGLAVIFLSMGFL IMGTVVVSVLEPDVEFMNILYETTSAMGTVGLSANLTPELETASKYIIMTLMYIGRIGPV TMALIFGKGHTKDRVRELPEKRILVG >gi|330402635|gb|ADLB01000021.1| GENE 270 273923 - 274612 1012 229 aa, chain - ## HITS:1 COG:CAC0321 KEGG:ns NR:ns ## COG: CAC0321 COG0745 # Protein_GI_number: 15893613 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 1 228 1 229 230 304 67.0 1e-82 MNKILIVEDEEAIADLEKDYLELSGFEVEVANDGEMGLSKALGEDYNLIILDLMLPGVDG FEICRRVRDEKNTPIIMISAKKDDIDKIRGLGLGADDYMTKPFSPSELVARVKAHLARYE RLIGSTIEENKVIEIRGLKIDTTARRVWVNGEEKTFTTKEFDLLTFLASHPNHVYTKDQL FSEIWDMESIGDIATVTVHIKKIREKIEFDTSKPQYIETIWGVGYRFKV >gi|330402635|gb|ADLB01000021.1| GENE 271 274605 - 276098 1286 497 aa, chain - ## HITS:1 COG:CAC0317 KEGG:ns NR:ns ## COG: CAC0317 COG0642 # Protein_GI_number: 15893609 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: 
Clostridium acetobutylicum # 1 488 1 492 498 365 39.0 1e-100 MKLKTRLITAFVTVSVLPILLTAIIIFGLSQYQINAIEENYGITGTTYKNFSNSVQVMNS LTEQPYHELKQVASENPSKLADITYLEEINEKLLSKNSYLIVRNNKKIIYIGGDAEQAKQ MIYQLPEYGEHDSTSKNGVYYGNAQSLVKQIDFRQADGTQGSAFLVTDVSNVIPEVRQLI TDALLAIIIILVVTAALLIFWIYKGVTGPLGKMQKAAKNIKEGNLEFRLEAEADDELGQL CQDFEDMRKRLKTNAEEKLQYDKEGKELISNISHDLKTPVTAIKGYAEGIMDGVADTPEK MEKYIRTIYNKANEMDRLINELTFYSKIDTNRIPYNFSTISVNDYFDDCAEDLSLELESK GIVFGYSNYVEGEQKIIADAEQLKRVINNIVTNSIKYMDKEQGRINLRIKDVGDFIQVEL EDNGKGISSKDLPNIFDRFYRTDASRNSSKGGSGIGLSIVKKIVEEHGGKIWATSKEGTG TVMYFVIRKYQEVPINE >gi|330402635|gb|ADLB01000021.1| GENE 272 276290 - 277477 1472 395 aa, chain - ## HITS:1 COG:BS_metK KEGG:ns NR:ns ## COG: BS_metK COG0192 # Protein_GI_number: 16080107 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Bacillus subtilis # 3 393 5 395 400 620 75.0 1e-177 MERLLFTSESVTEGHPDKICDQISDAVLDALMEQDPNSRVACETAITTGLVLVMGEITTN AYVDIQKIVRETINEIGYNRGKYGFDAYTCGVITAIDEQSSDIAMGVDKALEAKENKMSD EEIEAIGAGDQGMMFGYASNETEELMPYPISLAHKMALQLTKIRKDGTLTYLRPDGKTQV SVEYNEEGKPERLDAVVLSTQHDPDVTQEQIHEDIKKHVFDVILPKEMIDENTKFFINPT GRFVIGGPHGDSGVTGRKIIVDTYGGMARHGGGAFSGKDCTKVDRSAAYAARYVAKNIVA AGIADKCEIQLSYAIGVAQPTSIMVDTFGTGKLSEEKLVEIIRENFDLRPAGIIKMLDLR RPIYKQTAAYGHFGRNDLDLPWEKLDKVEDLKKYL >gi|330402635|gb|ADLB01000021.1| GENE 273 277793 - 282487 4807 1564 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 499 830 43 343 757 87 27.0 2e-16 MKKKRVMALLLAATLTIGNMSSLGMTQAAGKEDSQTQEFVRTRAADKNIALNRPARASKY LDATGGNPARLPKLAFDGKGDNQTTDAQNSRWQSGDDDTFTEQWLEVDLGNSAKVSDVTV KFFAKLYGSFVIETSDSNGAKATWKEVGKLDIPSSNDQNITRKVDLTKNGQPTEVSRYLR LRFTSYNTNAAKKSIGVYECEVHGELDTPDKPNTIKGNIAKGKKVTASGVEAEAPQYKPE LAVDGDKTSDTSRWSAPKMKNGNSANQNQTMQWFEINLGNEVTDITSIDLYFYKKVFSTD FVIKTKENKDDDWGEPIKTFTTESSDVSDKRVSITTVPKLKKYVRFEFNKVNTNAGGNSV 
SVREIEINGTQVQVPYEPESAQEVMDSVKGLDKITADMAEVPLPSVPEGYEIRVVGSEFP QVITDEGKITDYNIYDYNNMEIMLEVVNKEDENDKAQKTFQVSVPKKTQKHTDLFPQVER PNAEPKVIPSIQEWYGYNGDVKLGKDSRIIVKDNGNVNAEKVAKQFQKDMKEITGMELEI VSGQSGDANDIVIESLKKDIYSLGDEGYLLQADDKGIHITANGYNGCLYGAMTLEQVFYT QKGEFIFPKGVARDFSKYEVRGVMIDIARTPYRMDALEDIVKALAFYKINEVQFHLNDNR HVPGDANRGDYEHWKDVEGMFRLESDTYPGLATTPKKDEYYNEVYGGAPQYTKEEYKELQ KMAMDYGINPISEIDAPGHSLLFTKYVKENLDEVKKKVPGVTTNINADKDWELLAMSGPK KDSAFAFMDALFDEYLDENVFLGDTVNIGADEYWKITNEERPGVQQYIRQTADNVKKHNK KVRMWGSTKQFFTNPEAAKAYNDIEIDFWANSWEDAGKRIEQDFKIVNVDSFHLYGNPGR DKRDIVNVEHVFNNWDPTVMTGSTVKKSEPNLLGAKTALWADIADMGVTERDNFERILRQ AAVLSEKTWGGTDATQSFEEYSFKYESLQQGPGVSLGSDVKSETGLVLDYDFANVKDGKV HDASGNDYDGKITDAKVEEENGKTWLKLDGNGKVETGLRSMDYPYTVQFALKLPKDTEVK GDVCLFDGRDGRLTVKENGNLGLNRSYFSQDFGYNIPKGEEVQLTVVGTPQVTKLYVNGK LSKTLLRTSASETDYAHLLSTFVFPLTTIGEGLNGKIADIKVYNKALSPEKIGEIAEGKV VKELNVSQATAAAGTAQHKGDTGQDVDWKKLRVGWKAIDGDGNTLDGRHDTTVSEKDSYF EGAYADSAFAVDMLKEQQISKIVLQWDRAPKSFKIQTSADGSTWTDVKTVTGEKVNTITF ETPLNTRYLKMQGVSLNGGTFKLREFKAYESVDKTKLGKLVKTAEEKLEELGVTFADRKG YDALIEAYAEANSLYENALAEQENVKAAEEALEQAFADLPEKQEKVTVTFDVDGKETSVE VEKGKALGDKLPEAPVKEGFTFKEWNTEKDGTGTKVTENTVVNEEMTVYAVFEKNSEPTE PENPDSGKPEKPEKPNSGKPGKPDTKPNKQNPVKTGDTQSPIWYLGLMAAAGALIAGKRK KKED >gi|330402635|gb|ADLB01000021.1| GENE 274 282519 - 284573 1966 684 aa, chain - ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 231 445 820 1061 1995 79 25.0 2e-14 MKRKKIWIKLGVLVSIVSISVSGVPYRQQVEAKSGYVQAKENGKERISAEEYNKVALDFI SLLKPEDFKTWDQSLGVMTEKEAEEIKTFLETTVIKDEKDDYQKAKKIYEWITKNIKYAT LQDQNIGLRPYDVFKYKVAVCGGYSNLYKAMLNLAGIPAILVTGDTTAGAHAWNIVYADG KWFFSDSTWGSADAQNFDYGLEMFSKSHTSVDLYGVSVTDKNGIILGFHKGLAVMGETGG AKTITVPEKFNDLDVISVAQSVFEEKSAVEHLVLTEKVSNVETQVHSKTLKSVTVPEESQ YYASKDGVLFTKDLSKILIYPYQKEETSFVMPKEVTMYDEKETFQNPYLSEILVEEGNEK FSSYDGVIYNAEKTRLLTVPEGKQKIYVAGTVELDNIALNGKSNLKEIVLEDGVKAIPPY 
ALNGCSGLTKIYIPKSVGEISEDAFFGVNMSGLTILGEKESAAEKFANAHGIRFVDVKEI EIKLEETKELIQKAKEYDDTSVYTDESVETLRKAIEEAENIAAQSDVTVEQLNKAIENLN TAITNMEKRKEEPEKPDISEAIIAKKKEAKELLEKAKKYDNVFVYTEDSVQALRSAMKNM ENVLNQEDVTLDRLEKAIDSLNDAVLGMKKKGEVEKPEKPNGQIPPKQEKPETPKTADTS APGAAGVGMMLSAALITLLQRKKK >gi|330402635|gb|ADLB01000021.1| GENE 275 284595 - 289952 4789 1785 aa, chain - ## HITS:1 COG:no KEGG:CPF_2129 NR:ns ## KEGG: CPF_2129 # Name: not_defined # Def: fibronectin type III domain-containing protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 39 832 40 748 1479 590 43.0 1e-166 MKKKRMKRILAGACACLLLPWGAYGNDVGMEVKAAQEKNVNSINDALKLWYKTPANIHSA ETNGGEWTRQSLPLGNGNLGNLIFGGISKERIHFNEKTLWTGGPSETRPDYQFGNKKTAY TDKEIEAYRKLLDDKSKNVFNDDTSLGKPGMSGKIKFPGEDNLNKGSYQDFGDIWIDFSE TGIRDDNVKNYRRELDLQTGVAATTFSHQGVDYKREHFVSSPDQVMVTELSASKEKKLDV SIKMELNNSGLEGTAKFDAEQNMYTIFGKVKDNGLKFRTTMKIVQSGGDITADEKNQLYK VENADKIMIVMAAETDYKNDYPTYRDTKKDLEKVVVERVKRASEKSYQELKENHIEDHQG LFDRVSLDLGENRSNIPTNELIDAYRKGSYSKYLEVLAFQYGRYLTIAGSRGTLPSNLVG LWTMGASAWTGDYHFNVNVQMNYWPVYVTNLAECGTTMVDYMENLREPGRLTAERVHGIE DATTKKNGFTVHTENNPFGMTAPTNNQEYGWNPTGAAWAIQNLWAHYEFTQNKDYLKNTI YPIMKEAAQFWDNYLWTSDYQKVHDKNSKYDGQPRLVVVPSFSAEQGPTAVGTTYDQSLV WELYNECIKAGKIVGEDETVLKSWEEKMQRLDPIEMNATNGIKEWYEETRVGTETGHHQS YAKAGNLAEIPVPNSGWNIGHLGEQRHASHLVGLFPGTLIHKDNEEYMDAAIQSLEERGE YSTGWSKANKINLWARTGNGDKAYRLLNNLIGGNTSGLQYNLFDSHGSQGGDTMMNGTPV WQIDGNYGLTSGVAEMLLQSQLGYVQFLPAIPSAWTDGEVKGLKARGNFTISEKWKNNMA EKFTVRYDGEEKESTFTGEYKDITNAKVYQDGKEVRVQKDNEKGRISFAAQKGKTYTIDF TEANMEELKQQAEKFLPELHRDLSKIKEELQTAIKNSANNLGDVLAKAKQMNRLYRMYLE EVENIYCLTNREGLSYEKIDVMYRQMRSLRKTLLENTGDLKYYQEADTMLNDMAATMSEQ MKNRTISFSKESGFVSPQENDLSLSKSDLARKYEIRYTMDGSIPRKTSPLYENVLKLPEG KDCTVKAALFYGEQRVSPVYTKKYVRHAVSIQAVEVSQKDNWGAEYVKEKMIDGNASTRW ASKKPNASLPIEITLTFAKEETVNQIKFDQFVSNHNGIAGFEIQALQNGKYVTVYEGKKL GDINDKVGDADGTSAGYHAYYLAEFPKVNTTSVKVILKPGYTGEPSLYEIIPLNLSDDTE KKGNAEELSEMIKQAEAIDKNAPNYVEAEKTLKDAFAESIKDGKEMLESSQEYIDARTAF IHSRYERLGFGRTDKTELNELIAKAEKELEGAYTKNSLYRLKKMVAQAKDVQKDEAVKQP 
EVDRTVKQLQEALEQLESIGYIEQQIPAKDLQGGQGWKLINGYQATDSDNAGALSYQFTG HFVSGITVKGDDHGLLKVKISDLAGKTVYEETIDTYAQKRAEGVELMKKELPEGTYTIEF ERGGKSPNGQNKRGWVEVGTLTIRQQAEPEEVDRSSLQAELKKCENLKEADYTKESWEKF RLVIEKAEELLKKDDEETCTAEMEDMAKEVSEARDALQIRVDIRELQDTLKKAKEISGEG YTKESYWALRDCIQEIEAFLNGTYTQEEVNGKTALLKQRISELQADKTKLLEKYNAIKDM EQGKVTDWSWKEFVQLRNEVKEILDKADATPKEVQDILSKLDAFHFTYESEEEGHEQAGG TDAKKPQGQSDTVKTGDNSRGMFYLTLLLGASAVVMVKCRKKKQK >gi|330402635|gb|ADLB01000021.1| GENE 276 290161 - 291453 1504 430 aa, chain - ## HITS:1 COG:CAC3539 KEGG:ns NR:ns ## COG: CAC3539 COG0766 # Protein_GI_number: 15896775 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Clostridium acetobutylicum # 1 415 1 415 418 444 56.0 1e-124 MEQYIIKGGTPLVGEVQIGGAKNAALAILAAAIMTDETVTIDNLPDVNDINVLLEAMSGI GATIQRIDRHTVKINGSGVRDFSIEYDYIKKIRASYYLLGALLGKYNKAEVALPGGCNIG SRPIDQHLKGFKAIGADVIIEHGKIIAEAEQLKGTHLYFDVVSVGATINVMMAAAMAEGT TIMENVAKEPHVVDVANFLNSMGANIRGAGTDVIKIKGVPKLHKTEYSIIPDQIEAGTFM MAAAATRGDVTVMNVIPKHLEATTAKLVEIGCEVEEFDDAVRVVSKGRLKHTQVKTLPYP GFPTDMQPQMGVTLALCSGTSTITESIFENRFKYLDELARMGANIKVEGNSATIEGVEKF SGARVSAPDLRAGAALCIAGLAAEGITIVDDIVYIQRGYERFEEKLRSLGGVIERVASEK EIQKFKFKVG >gi|330402635|gb|ADLB01000021.1| GENE 277 291677 - 292078 422 133 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153816497|ref|ZP_01969165.1| ## NR: gi|153816497|ref|ZP_01969165.1| hypothetical protein RUMTOR_02750 [Ruminococcus torques ATCC 27756] # 1 133 1 135 136 111 55.0 1e-23 MDYKLPYYMAYPMPFEYDDERKERQDMEYMKSLYPNAVKRILPFVEDECERMEYEGSMIY DEYPDMLGIRLMCNRICERVEAMDRREDIEDELEMQQKRQNNRGNIRDIVEVLLLNELMK RRNEHRRRRRRFY >gi|330402635|gb|ADLB01000021.1| GENE 278 292231 - 292713 541 160 aa, chain - ## HITS:1 COG:no KEGG:CPE1279 NR:ns ## KEGG: CPE1279 # Name: nagK # Def: hyaluronidase # Organism: C.perfringens # Pathway: Metabolic pathways [PATH:cpe01100] # 21 157 1019 1163 1163 66 36.0 3e-10 MDNAAEVLQNAMDNLKEKEPEFIEADKKALQSVIAQAEKKAEKDYTSESWKAFAEALAKA 
QEVNEDVKASQEVVDTAAKVLQEAIEELQKKPAKVPEEESGTTKPDGSTNINPDSNTNVK PDKNQPVKTGDAQIPFAFVLAMMASAAGVFVIKRRRKEDE Prediction of potential genes in microbial genomes Time: Tue May 24 21:43:54 2011 Seq name: gi|330402193|gb|ADLB01000022.1| Lachnospiraceae bacterium 2_1_46FAA cont1.22, whole genome shotgun sequence Length of sequence - 142006 bp Number of predicted genes - 149, with homology - 131 Number of transcription units - 58, operones - 29 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 15 - 5615 6146 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 5703 - 5762 13.0 - Term 5734 - 5800 14.4 2 2 Op 1 42/0.000 - CDS 5810 - 6211 635 ## COG0355 F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) 3 2 Op 2 42/0.000 - CDS 6224 - 7615 1589 ## COG0055 F0F1-type ATP synthase, beta subunit 4 2 Op 3 42/0.000 - CDS 7657 - 8517 894 ## COG0224 F0F1-type ATP synthase, gamma subunit 5 2 Op 4 41/0.000 - CDS 8534 - 10036 1902 ## COG0056 F0F1-type ATP synthase, alpha subunit 6 2 Op 5 38/0.000 - CDS 10064 - 10609 711 ## COG0712 F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) 7 2 Op 6 37/0.000 - CDS 10597 - 11112 725 ## COG0711 F0F1-type ATP synthase, subunit b 8 2 Op 7 40/0.000 - CDS 11156 - 11422 506 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K 9 2 Op 8 . - CDS 11461 - 12180 658 ## COG0356 F0F1-type ATP synthase, subunit a 10 2 Op 9 . - CDS 12196 - 12597 410 ## EUBREC_0117 hypothetical protein 11 2 Op 10 . - CDS 12587 - 12820 270 ## gi|210613768|ref|ZP_03289882.1| hypothetical protein CLONEX_02089 12 2 Op 11 . - CDS 12899 - 13396 408 ## COG0219 Predicted rRNA methylase (SpoU class) 13 2 Op 12 3/0.000 - CDS 13389 - 14570 925 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 14 2 Op 13 . - CDS 14570 - 15052 464 ## COG1522 Transcriptional regulators 15 2 Op 14 . - CDS 15064 - 16053 1009 ## COG0309 Hydrogenase maturation factor 16 2 Op 15 . 
- CDS 16068 - 17663 1622 ## COG2720 Uncharacterized vancomycin resistance protein 17 2 Op 16 . - CDS 17696 - 17797 95 ## - Prom 17818 - 17877 2.0 18 2 Op 17 . - CDS 17880 - 19604 1542 ## COG1032 Fe-S oxidoreductase - Prom 19843 - 19902 2.2 19 3 Op 1 . - CDS 19963 - 20409 337 ## COG0590 Cytosine/adenosine deaminases 20 3 Op 2 . - CDS 20427 - 21170 519 ## EUBREC_0291 hypothetical protein - Prom 21202 - 21261 5.3 - Term 21246 - 21281 5.3 21 4 Op 1 . - CDS 21284 - 21928 771 ## gi|210613781|ref|ZP_03289895.1| hypothetical protein CLONEX_02103 22 4 Op 2 . - CDS 22003 - 22806 737 ## COG1876 D-alanyl-D-alanine carboxypeptidase 23 4 Op 3 . - CDS 22809 - 23450 364 ## COG2091 Phosphopantetheinyl transferase 24 4 Op 4 . - CDS 23447 - 23962 535 ## Cphy_0209 hypothetical protein 25 4 Op 5 . - CDS 23953 - 24579 722 ## COG2214 DnaJ-class molecular chaperone 26 4 Op 6 . - CDS 24576 - 25421 745 ## Cphy_0207 hypothetical protein - Prom 25457 - 25516 5.2 + Prom 25394 - 25453 6.1 27 5 Tu 1 . + CDS 25514 - 25696 134 ## - TRNA 25627 - 25702 89.3 # Lys CTT 0 0 - Term 25570 - 25625 11.5 28 6 Tu 1 . - CDS 25783 - 26664 937 ## COG1284 Uncharacterized conserved protein - Prom 26684 - 26743 3.8 - Term 26705 - 26742 -0.0 29 7 Op 1 . - CDS 26792 - 27061 362 ## COG0526 Thiol-disulfide isomerase and thioredoxins 30 7 Op 2 . - CDS 27076 - 27213 168 ## 31 7 Op 3 . - CDS 27256 - 27909 668 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 32 7 Op 4 . - CDS 27931 - 29847 1577 ## COG0296 1,4-alpha-glucan branching enzyme - Prom 29874 - 29933 6.9 - Term 29891 - 29930 5.3 33 8 Tu 1 . - CDS 29940 - 32495 2784 ## COG0464 ATPases of the AAA+ class - Prom 32604 - 32663 9.8 + Prom 32509 - 32568 7.3 34 9 Tu 1 . + CDS 32619 - 32717 64 ## + Term 32733 - 32773 -1.0 35 10 Tu 1 . - CDS 32908 - 32991 185 ## - Prom 33039 - 33098 12.4 + Prom 33071 - 33130 14.3 36 11 Tu 1 . 
+ CDS 33232 - 33786 184 ## PROTEIN SUPPORTED gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 + Term 33796 - 33851 15.7 - Term 34144 - 34201 11.7 37 12 Op 1 . - CDS 34208 - 36463 2715 ## COG1048 Aconitase A 38 12 Op 2 . - CDS 36457 - 37563 861 ## COG2508 Regulator of polyketide synthase expression - Term 37583 - 37617 6.2 39 13 Op 1 . - CDS 37643 - 38992 1760 ## COG0166 Glucose-6-phosphate isomerase 40 13 Op 2 . - CDS 39057 - 39824 761 ## gi|210613855|ref|ZP_03289949.1| hypothetical protein CLONEX_02162 - Prom 39849 - 39908 5.5 41 14 Tu 1 . - CDS 39947 - 40642 778 ## COG0775 Nucleoside phosphorylase - Prom 40679 - 40738 9.2 + Prom 40609 - 40668 6.8 42 15 Tu 1 . + CDS 40767 - 41009 254 ## EUBELI_00054 hypothetical protein + Term 41013 - 41063 12.5 - Term 41054 - 41090 -0.8 43 16 Op 1 . - CDS 41303 - 41761 550 ## gi|226325859|ref|ZP_03801377.1| hypothetical protein COPCOM_03672 44 16 Op 2 . - CDS 41830 - 42663 829 ## gi|225570068|ref|ZP_03779093.1| hypothetical protein CLOHYLEM_06164 - Prom 42719 - 42778 7.5 + Prom 42678 - 42737 7.8 45 17 Tu 1 . + CDS 42768 - 43832 857 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase + Term 43833 - 43878 9.2 - Term 43826 - 43860 5.5 46 18 Op 1 . - CDS 43890 - 44456 408 ## gi|225570070|ref|ZP_03779095.1| hypothetical protein CLOHYLEM_06166 47 18 Op 2 . - CDS 44488 - 44895 428 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases - Prom 45020 - 45079 5.5 + Prom 44836 - 44895 5.7 48 19 Tu 1 . + CDS 45001 - 45750 409 ## COG0642 Signal transduction histidine kinase - Term 45524 - 45570 -0.6 49 20 Tu 1 . - CDS 45679 - 46290 350 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) - Prom 46313 - 46372 7.3 50 21 Op 1 . - CDS 46376 - 47083 574 ## Cphy_0040 FHA domain-containing protein 51 21 Op 2 . - CDS 47073 - 47468 230 ## gi|226325866|ref|ZP_03801384.1| hypothetical protein COPCOM_03679 52 21 Op 3 . 
- CDS 47465 - 47869 381 ## gi|210613906|ref|ZP_03289970.1| hypothetical protein CLONEX_02183 53 21 Op 4 . - CDS 47857 - 48735 689 ## Cphy_0037 hypothetical protein 54 21 Op 5 . - CDS 48756 - 49985 916 ## EUBREC_0046 hypothetical protein 55 22 Op 1 . - CDS 50122 - 50298 256 ## 56 22 Op 2 . - CDS 50310 - 51497 911 ## Cphy_0034 hypothetical protein 57 22 Op 3 . - CDS 51523 - 52278 508 ## Cphy_0033 Flp pilus assembly protein TadB-like protein 58 22 Op 4 . - CDS 52223 - 53437 928 ## COG4962 Flp pilus assembly protein, ATPase CpaF 59 22 Op 5 . - CDS 53391 - 54380 526 ## COG1192 ATPases involved in chromosome partitioning - Prom 54467 - 54526 5.9 60 23 Tu 1 . - CDS 54539 - 54748 301 ## gi|225570084|ref|ZP_03779109.1| hypothetical protein CLOHYLEM_06180 + Prom 54811 - 54870 2.7 61 24 Tu 1 . + CDS 54907 - 55146 147 ## gi|225570085|ref|ZP_03779110.1| hypothetical protein CLOHYLEM_06181 + Term 55150 - 55207 6.3 - Term 55225 - 55286 6.5 62 25 Op 1 1/0.250 - CDS 55288 - 55776 575 ## COG2131 Deoxycytidylate deaminase 63 25 Op 2 . - CDS 55793 - 56422 842 ## COG0035 Uracil phosphoribosyltransferase 64 25 Op 3 . - CDS 56440 - 57960 1595 ## COG0009 Putative translation factor (SUA5) 65 25 Op 4 . - CDS 58007 - 58723 633 ## COG0860 N-acetylmuramoyl-L-alanine amidase 66 25 Op 5 . - CDS 58735 - 60546 1559 ## COG3858 Predicted glycosyl hydrolase - Prom 60571 - 60630 5.6 67 26 Tu 1 . - CDS 60659 - 60880 263 ## gi|226325561|ref|ZP_03801079.1| hypothetical protein COPCOM_03366 - Prom 60932 - 60991 8.6 + Prom 61230 - 61289 11.5 68 27 Tu 1 . + CDS 61327 - 62064 793 ## Cphy_3752 negative regulator of genetic competence sporulation and motility-like protein + Term 62081 - 62118 6.1 - Term 62063 - 62113 11.8 69 28 Tu 1 . - CDS 62142 - 62339 348 ## EUBREC_0036 hypothetical protein - Prom 62364 - 62423 6.8 + Prom 62921 - 62980 10.3 70 29 Op 1 . + CDS 63027 - 63431 347 ## COG0735 Fe2+/Zn2+ uptake regulation proteins + Prom 63433 - 63492 6.8 71 29 Op 2 . 
+ CDS 63535 - 64077 770 ## COG1592 Rubrerythrin + Term 64091 - 64126 6.0 - Term 64077 - 64114 6.4 72 30 Op 1 . - CDS 64131 - 64721 861 ## CLH_1589 hypothetical protein - Prom 64745 - 64804 5.2 73 30 Op 2 . - CDS 64812 - 66398 1684 ## COG0513 Superfamily II DNA and RNA helicases - Prom 66487 - 66546 8.6 + Prom 66554 - 66613 7.4 74 31 Tu 1 . + CDS 66642 - 67304 636 ## COG1272 Predicted membrane protein, hemolysin III homolog + Term 67315 - 67366 12.0 75 32 Op 1 . - CDS 67332 - 67583 236 ## COG2827 Predicted endonuclease containing a URI domain 76 32 Op 2 . - CDS 67586 - 67864 264 ## gi|291171378|ref|ZP_06572553.1| conserved hypothetical protein - Prom 67884 - 67943 4.3 - Term 67909 - 67958 7.2 77 33 Op 1 . - CDS 67961 - 68704 719 ## EUBREC_0243 hypothetical protein 78 33 Op 2 . - CDS 68747 - 69562 635 ## COG2385 Sporulation protein and related proteins - Prom 69629 - 69688 5.2 - Term 69734 - 69778 3.1 79 34 Op 1 . - CDS 69786 - 70451 753 ## gi|154505671|ref|ZP_02042409.1| hypothetical protein RUMGNA_03210 80 34 Op 2 . - CDS 70464 - 72926 2532 ## COG0058 Glucan phosphorylase 81 34 Op 3 . - CDS 72937 - 76044 3768 ## COG0060 Isoleucyl-tRNA synthetase - Prom 76087 - 76146 6.6 - Term 76085 - 76132 7.1 82 35 Tu 1 . - CDS 76255 - 76335 146 ## - Prom 76410 - 76469 6.3 + Prom 76310 - 76369 9.8 83 36 Tu 1 . + CDS 76500 - 77102 464 ## BDI_1616 hypothetical protein + Term 77109 - 77150 7.0 - Term 77097 - 77138 7.0 84 37 Op 1 . - CDS 77145 - 78416 1273 ## COG1686 D-alanyl-D-alanine carboxypeptidase 85 37 Op 2 . - CDS 78488 - 79105 681 ## COG1145 Ferredoxin - Prom 79162 - 79221 4.7 + Prom 79101 - 79160 6.0 86 38 Tu 1 . + CDS 79193 - 79861 229 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 87 39 Op 1 . - CDS 80048 - 81253 876 ## EUBELI_20462 hypothetical protein 88 39 Op 2 . - CDS 81246 - 81896 749 ## - Prom 81916 - 81975 5.7 89 40 Tu 1 . 
- CDS 82132 - 82305 161 ## - Prom 82466 - 82525 3.1 + Prom 82337 - 82396 7.3 90 41 Tu 1 . + CDS 82578 - 83762 583 ## COG3547 Transposase and inactivated derivatives 91 42 Tu 1 . - CDS 83907 - 84089 112 ## gi|167760939|ref|ZP_02433066.1| hypothetical protein CLOSCI_03327 - Prom 84128 - 84187 3.1 92 43 Op 1 . - CDS 84349 - 84438 79 ## 93 43 Op 2 . - CDS 84491 - 84748 418 ## 94 43 Op 3 . - CDS 84761 - 88048 1819 ## - Prom 88120 - 88179 5.2 95 44 Op 1 . - CDS 88186 - 89202 803 ## gi|167765782|ref|ZP_02437835.1| hypothetical protein CLOSS21_00273 96 44 Op 2 . - CDS 89233 - 89754 460 ## gi|160914689|ref|ZP_02076903.1| hypothetical protein EUBDOL_00696 97 44 Op 3 . - CDS 89771 - 90286 681 ## gi|160914690|ref|ZP_02076904.1| hypothetical protein EUBDOL_00697 98 44 Op 4 . - CDS 90322 - 91083 662 ## Elen_2849 hypothetical protein 99 44 Op 5 . - CDS 91085 - 91858 486 ## Elen_2849 hypothetical protein - Prom 91888 - 91947 2.2 - Term 91863 - 91901 6.0 100 45 Tu 1 . - CDS 91960 - 92556 746 ## - Prom 92593 - 92652 6.0 101 46 Op 1 . - CDS 92683 - 92970 134 ## 102 46 Op 2 . - CDS 92981 - 96268 2764 ## - Term 96418 - 96454 -0.7 103 47 Op 1 . - CDS 96631 - 97029 428 ## COG1959 Predicted transcriptional regulator 104 47 Op 2 . - CDS 97086 - 97232 72 ## - Prom 97286 - 97345 5.1 - Term 97304 - 97358 7.6 105 48 Tu 1 . - CDS 97419 - 97814 343 ## COG1959 Predicted transcriptional regulator - Prom 97989 - 98048 6.4 - Term 97969 - 98014 -0.9 106 49 Op 1 . - CDS 98172 - 100070 1759 ## SP_1654 hypothetical protein 107 49 Op 2 . - CDS 100090 - 101049 662 ## COG4932 Predicted outer membrane protein 108 49 Op 3 . - CDS 101108 - 101203 64 ## 109 49 Op 4 . - CDS 101210 - 101506 331 ## gi|253581110|ref|ZP_04858370.1| conserved hypothetical protein 110 49 Op 5 . - CDS 101517 - 102515 725 ## Amet_3992 replication initiator A domain-containing protein 111 49 Op 6 . - CDS 102605 - 103513 714 ## COG1475 Predicted transcriptional regulators 112 49 Op 7 . 
- CDS 103503 - 103721 274 ## EUBREC_3612 chromosome partitioning protein ParA 113 49 Op 8 25/0.000 - CDS 103737 - 104288 639 ## COG1192 ATPases involved in chromosome partitioning 114 49 Op 9 . - CDS 104304 - 105179 887 ## COG1475 Predicted transcriptional regulators 115 49 Op 10 . - CDS 105172 - 105276 72 ## - Prom 105352 - 105411 5.5 - TRNA 105587 - 105658 50.3 # Arg CCT 0 0 116 50 Op 1 . - CDS 105960 - 106844 782 ## Cphy_3931 hypothetical protein 117 50 Op 2 . - CDS 106863 - 108140 1469 ## COG0172 Seryl-tRNA synthetase 118 50 Op 3 . - CDS 108165 - 108686 709 ## Cphy_3933 hypothetical protein 119 50 Op 4 25/0.000 - CDS 108688 - 109578 970 ## COG1475 Predicted transcriptional regulators 120 50 Op 5 . - CDS 109578 - 110345 895 ## COG1192 ATPases involved in chromosome partitioning - Prom 110367 - 110426 9.0 + Prom 110383 - 110442 10.7 121 51 Tu 1 . + CDS 110528 - 111490 754 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Term 111496 - 111530 6.2 122 52 Op 1 24/0.000 - CDS 111504 - 112217 537 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division 123 52 Op 2 11/0.000 - CDS 112219 - 114117 1841 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division 124 52 Op 3 4/0.000 - CDS 114083 - 115456 1426 ## COG0486 Predicted GTPase - Term 115467 - 115505 3.9 125 52 Op 4 16/0.000 - CDS 115513 - 116268 982 ## COG1847 Predicted RNA-binding protein 126 52 Op 5 18/0.000 - CDS 116281 - 117543 1504 ## COG0706 Preprotein translocase subunit YidC 127 52 Op 6 16/0.000 - CDS 117565 - 117759 138 ## COG0759 Uncharacterized conserved protein 128 52 Op 7 . - CDS 117749 - 118141 308 ## COG0594 RNase P protein component - Term 118161 - 118188 -0.8 129 52 Op 8 . 
- CDS 118189 - 118323 199 ## PROTEIN SUPPORTED gi|160882064|ref|YP_001561032.1| ribosomal protein L34 130 53 Op 1 16/0.000 + CDS 118705 - 120075 1329 ## COG0593 ATPase involved in DNA replication initiation + Prom 120088 - 120147 6.4 131 53 Op 2 6/0.000 + CDS 120260 - 121369 1242 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog) 132 53 Op 3 9/0.000 + CDS 121373 - 121582 333 ## COG2501 Uncharacterized conserved protein 133 53 Op 4 9/0.000 + CDS 121579 - 122664 684 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 134 53 Op 5 24/0.000 + CDS 122675 - 124591 1994 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 135 53 Op 6 . + CDS 124616 - 127096 2910 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit + Term 127107 - 127157 -0.9 + Prom 127098 - 127157 2.1 136 54 Op 1 . + CDS 127212 - 128795 1382 ## COG1574 Predicted metal-dependent hydrolase with the TIM-barrel fold 137 54 Op 2 . + CDS 128779 - 129696 981 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 138 54 Op 3 7/0.000 + CDS 129684 - 131288 1494 ## COG4753 Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain 139 54 Op 4 1/0.250 + CDS 131299 - 132996 1482 ## COG2972 Predicted signal transduction protein with a C-terminal ATPase domain 140 54 Op 5 3/0.000 + CDS 132983 - 133951 973 ## COG1879 ABC-type sugar transport system, periplasmic component + Prom 133976 - 134035 12.4 141 55 Op 1 16/0.000 + CDS 134065 - 135189 1355 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 135219 - 135257 3.0 + Prom 135209 - 135268 3.9 142 55 Op 2 10/0.000 + CDS 135291 - 136790 1736 ## COG1129 ABC-type sugar transport system, ATPase component 143 55 Op 3 . + CDS 136806 - 137876 1301 ## COG4211 ABC-type glucose/galactose transport system, permease component 144 55 Op 4 . 
+ CDS 137885 - 138139 266 ## gi|226325936|ref|ZP_03801454.1| hypothetical protein COPCOM_03749 + Term 138140 - 138196 13.6 + Prom 138164 - 138223 4.8 145 56 Tu 1 . + CDS 138271 - 138732 555 ## COG2032 Cu/Zn superoxide dismutase - Term 138730 - 138774 1.3 146 57 Tu 1 . - CDS 138800 - 139240 390 ## BHWA1_01097 C-GCAxxG-C-C, putative redox-active protein (C-GCAxxG-C-C) - Prom 139282 - 139341 8.8 + Prom 139319 - 139378 7.2 147 58 Op 1 11/0.000 + CDS 139407 - 140252 548 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 148 58 Op 2 . + CDS 140254 - 140805 408 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase 149 58 Op 3 . + CDS 140811 - 141521 645 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase + Term 141591 - 141626 6.7 Predicted protein(s) >gi|330402193|gb|ADLB01000022.1| GENE 1 15 - 5615 6146 1866 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 43 1084 7 984 1087 581 36.0 1e-165 MRKKRFVSTLLAGSLVLGTFLSGFGELGKDMMNVLAADTDTREWTNNPEVFQVNREKARA TFYRYDSVDKAKKGDKQSSSYYQLLNGDDWKFSWAVKPADRIAAKDKNFNKKDYDDSNWD NITVPRNWQTYLNKDGSFKYDPVIYSNQNYPWINAEGKNYKDYKVGQAPEECNPVGTYRK TFTVDSSWEGKEIFLNFEGVESAMYLWVNGEYIGYSEDSFTRNEFNITDALDFSKGNENV ITVEVYRWCDGSYIENQDFLRLAGIFRDVYMTAKEQTEIRDFTVVTDLDEKYENAALNVD VDLRNFDGKKDGYSVKGYLYDAAGNLVTDTPLSAEASFGDKKETTVKLSGQITNPKKWTA EHPNLYNLVLVLEKDGKETEITSVKVGFREVEITDKGTNDARLRVNGQVITLYGVNRHEN DPETGRYLTEEDMREEIELMKSLNINAVRTAHYPDDPVFYELCDEYGLYVMDEANVESHN GRSQYNVPGSISGYVEAAEDRAINMLERDKNYPSVIMWSPGNETGTGDSLQAEIDYFQNN DDTRVVHYQGWNANEGVDVESNMYPEIDKIKTNYTKPYIMCEYLHTMGNSGGGMIDYWDK IRKYGILQGGFIWDWVDQTFNTPLIEDGKWDGKTTYWGYDGDWNKGDYSSWKSGNADFCV NGIVSADRTLQPEAYEVKRIYQALQMTMADEKAQTVSIHNEYTDTNASEYKMLWSLEKDG TSVQKGEITDVDIAPLATKEVKIPYTVPTDAKEGDEYFLHIQFVTKADTSWGKAGDIVAE AQFDLNFEKEAADRGLDTGAMKAFEDSAVKETEKEVTIKQDDWSVAFNKEKGSLSSFEVN 
GKEMIAEDLLPNYWRAYTDNDKKEAVDANWKKANENAKVDAVSVTKSEKAIYVTIDRTLT NCSDSKDSLTYTIYSSGDIFVKSTLIPSNQMGDLLRVGNRVQLDESLENMTWYGRGEADS YSDRKTGYDVGIYESTVKDQFINFVYPQETGNKTDVRFMALTDKDGNGLLVDATDHLLEM SALHYTQEDLDKAGHPYQLEGTKNTVLTIDYAQMGLGTASCGPATLSKYRLPSSQTYTYT YHLKALSGETKEQMVEESKVTVKDETVLLKGIKVGEKDLPGFNNEVTSYTFDAYGTEGQV PQVTATAASEDVKVEITQAEAIPGTATVKATAKNGYSRTYTIQMEMSKEIPLTEVGYDTK RSTSGYAGIHVNEDNGGTALDLWVDGKRTTFSTGYGVNAESKLYFDISDLNVSRLQVYGG IDCHKTQSKDGVELAVFVDGNEVERSKVLKHGQNAYYFDIDVKGAEEIMLYADMLGNNGH DMVTWGDPKLVKGDGEQENDQFELKDDAKAKMDRENNILYGIASGTTAGELKKMFREIDG ATITMSEAMGGEQSDSSPAATGYKLTLTVNGTIKDTLTLAVSGDIDGSADGLVTEADVEK MESYLAGEEADTLYLRAGDLDNDGKLTENDLKLLKELAGEKPENVPVTKLTMKGIPGKVS AGDTFTVTVSAEPENATDKSVTFESSNLAVLRIDENGKVRALGNGVATLKATAKDGSKVQ TTTEVTVGEDMLKLDTYYLTGDGVEGAMEAGEGYQVLKYEDAVSGWGGVHINKADTGTDK STTDISMKIDNQQTTFANGLSANTNAQITYDLSKVAGTEKYFQTWLGIDYIKAGKTGRDG AKFLIYKEEVSEENLLYDSGIIKQADEAQFVNLDVSSVNKLILVADEVENKNDDCVDWAD AKLYVKPEVKADKEALQSVIAQAEKKAEKDYTSESWKVFAEALAKAQEVNADDKASQEEV DNATEVLQKAMDNLKEKEPEFIEADKKALQSVIAQAEKKAEKDYTSESWKAFAEALAKAQ VSKCRR >gi|330402193|gb|ADLB01000022.1| GENE 2 5810 - 6211 635 133 aa, chain - ## HITS:1 COG:BH3753 KEGG:ns NR:ns ## COG: BH3753 COG0355 # Protein_GI_number: 15616315 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, epsilon subunit (mitochondrial delta subunit) # Organism: Bacillus halodurans # 6 132 4 131 133 94 39.0 5e-20 MAENVFELRIISPDEVFYEGESSFLEFVSVEGEMGVYKNHIPLTTILEPCVMKIHKGAEV KKAVILGGFLEILQEKITVLAEDAQWPEEIDVARAEEAKKRAEERLNSKSDSVDVIRAEA ALKRAVARINGAK >gi|330402193|gb|ADLB01000022.1| GENE 3 6224 - 7615 1589 463 aa, chain - ## HITS:1 COG:CAC2865 KEGG:ns NR:ns ## COG: CAC2865 COG0055 # Protein_GI_number: 15896119 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, beta subunit # Organism: Clostridium acetobutylicum # 4 463 3 461 466 667 74.0 0 MAEKNIGKITQIIGAVLDIKYADGNLPEINEAIDITRSNGEKLVVEVAQHLGDDTVRCIA MGPTDGLTRGMDAVATGAPISVPVGENTLGRIFNVLGEAIDEKPAPTDVEYAPIHRKAPA 
FEEQATEAEMLETGIKVIDLLCPYQKGGKIGLFGGAGVGKTVLIQELITNIATEHGGYSV FTGVGERTREGNDLYYEMMESGVINKTAMVFGQMNEPPGARMRVGLTGLTMAEYFRDKGG KDVLLFIDNIFRFTQAGSEVSALLGRMPSAVGYQPTLQTEMGALQERITSTKNGSITSVQ AVYVPADDLTDPAPATTFAHLDAKVVLSRAITELGIYPAVDPLESSSRMLDPHIVGEEHY KVARGVQEILQKYKELQDIIAMLGMDELSEEDKLTVSRARKIQRFLSQPFFVATQFTGFE GRYVPISETIQGFKEILEGKHDDVPEGYFLNAGNIDDVLARVK >gi|330402193|gb|ADLB01000022.1| GENE 4 7657 - 8517 894 286 aa, chain - ## HITS:1 COG:BH3755 KEGG:ns NR:ns ## COG: BH3755 COG0224 # Protein_GI_number: 15616317 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, gamma subunit # Organism: Bacillus halodurans # 1 286 1 284 285 184 37.0 1e-46 MASMREIKRRKSSVQSTQQITKAMKLISTVKLQKARMRAENSKAYFECMYSTVTSMLAKA GNIEHPYLKAGDSKKVGIVAVTSNRGLAGGYNSNIVKLITESGIAKEDVRLYLVGRKGAE SLVRKGYDVALDCSDMIEEPTYADAQALSKRLLTDFANGEIGEIYVAYTFFKNTVTHIPT FKKLLPVDTSAVAEENESENSNVLMNFEPKEERAISLLVPKYMTSILYGAFVEAVASENG ARMQAMDSATNNAEEIIEDLALKYNRARQGAITQELTEIIAGADAL >gi|330402193|gb|ADLB01000022.1| GENE 5 8534 - 10036 1902 500 aa, chain - ## HITS:1 COG:CAC2867 KEGG:ns NR:ns ## COG: CAC2867 COG0056 # Protein_GI_number: 15896121 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, alpha subunit # Organism: Clostridium acetobutylicum # 1 500 1 500 505 688 69.0 0 MNLRPEEISSVIKEQIKRYASDLEVSDVGTVIQVADGIARIHGLENAMQGELLEFPGEVY GMVLNLEEDNVGAVLLGDNRNINEGDTVKTTGRVVEVPVGDCMLGRVVNALGQPIDGKGP IQAKAYRQIERVASGVISRKSVDTPLQTGIKAIDSMVPIGRGQRELIIGDRQTGKTAIAI DTIINQKGQGVKCIYVAIGQKASTVANIVQTFEEYGAMDYTTVVASTASELAPLQYIAPY AGCAMGEEWMEKGEDVLIVYDDLSKHATAYRTLSLLLRRPPGREAYPGDVFYLHSRLLER AARLSDELGGGSLTALPIIETQAGDVSAYIPTNVISITDGQIFLETEMFNAGFRPAINPG ISVSRVGGSAQIKAMKKIAGPIRIDLAQYRELAAFAQFGSELDADTTERLAQGARIKEML KQPQYQPMPVEYQVIIIYAATQKYLLDIPVERVLEFEKGLFDFIQTKYPEIPEAIKVEKV ISDKTEEALVNAINEFKKEF >gi|330402193|gb|ADLB01000022.1| GENE 6 10064 - 10609 711 181 aa, chain - ## HITS:1 COG:all0006 KEGG:ns NR:ns ## COG: all0006 COG0712 # Protein_GI_number: 17227502 # Func_class: C Energy 
production and conversion # Function: F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) # Organism: Nostoc sp. PCC 7120 # 5 176 10 180 183 99 34.0 4e-21 MAKLVSKTYGDALFSVAMEENRVDAFAEEAKGLSVIFSENQELRKLMDNPKIIKEDKIKL IEETFTSHVSKEVIGLIALLISKGHSKDIPAVFDYFIALVKEEKKIGTAFVTTAVALSDA QKSAVEKRLLETTRYESFEMNYSVDESLIGGMVIRIGDRVVDSSIKTKLYELSKQLRSIQ I >gi|330402193|gb|ADLB01000022.1| GENE 7 10597 - 11112 725 171 aa, chain - ## HITS:1 COG:SA1909 KEGG:ns NR:ns ## COG: SA1909 COG0711 # Protein_GI_number: 15927681 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit b # Organism: Staphylococcus aureus N315 # 16 166 22 172 173 69 29.0 3e-12 MERLFTLDAQFLFDAVVLALSMLVMFTFLSYLLFEPVRNLLEKRRQRVLDEQETAKKERT DATAYKEEYEKKLKEVDKEAQEILSAARKKAMQNEAKIIAEAKEEAARIIERGNAEIELE KKRALDEMKQEMITIASMMAGKVVSSSIDTNVQESLIEETLKEMGDSTWLN >gi|330402193|gb|ADLB01000022.1| GENE 8 11156 - 11422 506 88 aa, chain - ## HITS:1 COG:FN0363 KEGG:ns NR:ns ## COG: FN0363 COG0636 # Protein_GI_number: 19703705 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 1 88 1 88 89 84 63.0 4e-17 MNGISNEALILACSAIGAGLAMIAGIGPGIGQGVAAGHGASAVGRNPGAKSDITSTMLLG QAVAETTGLYSLVIALILLFANPLLGKL >gi|330402193|gb|ADLB01000022.1| GENE 9 11461 - 12180 658 239 aa, chain - ## HITS:1 COG:CAC2871 KEGG:ns NR:ns ## COG: CAC2871 COG0356 # Protein_GI_number: 15896125 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit a # Organism: Clostridium acetobutylicum # 19 230 7 214 221 117 35.0 2e-26 MGNGLGILAAKEADFMIHSLVKFKLFGQELYLTTTHVSIFIICVGLIIFALVARIKLKDT DGKPGKFQNAVEYIVEMLDGMVHGGMGKKGIPYRNYIGTLFLFILFSNISGLLGLRPPTA DYGVTFPLGIITFFLIQFNNIRYNKFGAFTDLFKPLPFLFPINLIGEIAVPFSLSLRLFG NVLSGTVIMALLYGLLSNIAIAWPGFLHIYFDIFSGAIQTYVFCMLTMVFVSDKIPDRK >gi|330402193|gb|ADLB01000022.1| GENE 10 12196 - 12597 410 133 aa, chain - ## HITS:1 COG:no 
KEGG:EUBREC_0117 NR:ns ## KEGG: EUBREC_0117 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 125 6 124 143 70 38.0 2e-11 MTNNSANETLKELITGIIVYGAIAQIPVCFIATGNLSYLSFGLWIGVVVAVGMAVHMKRS IEDALDLGEAGAAKHMRKMYALRYTAVVIAFGATVYFDVGSPITLLIGVMGLKIGAYLQP YTHKVLLKLKKSK >gi|330402193|gb|ADLB01000022.1| GENE 11 12587 - 12820 270 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613768|ref|ZP_03289882.1| ## NR: gi|210613768|ref|ZP_03289882.1| hypothetical protein CLONEX_02089 [Clostridium nexile DSM 1787] # 1 67 1 67 82 114 82.0 2e-24 MKYKKSVHRTFALITQVGISMIVPILMCTWLGSYLEEKFSLPVFIPLVILGVLAGGRNVY YLVRHANEDSEEEKDDK >gi|330402193|gb|ADLB01000022.1| GENE 12 12899 - 13396 408 165 aa, chain - ## HITS:1 COG:CAC0700 KEGG:ns NR:ns ## COG: CAC0700 COG0219 # Protein_GI_number: 15893988 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted rRNA methylase (SpoU class) # Organism: Clostridium acetobutylicum # 2 153 8 160 160 178 54.0 5e-45 MLNIVLFEPEIPANTGNIGRTCVATNTRLHLIEPLGFRLNEKALKRAGMDYWEDLDVTTY INYEDFLEKNPNAKIYMATTKAPNVYTDVSYEPDCYIMFGKESAGIPEEILVLHKEDSIR IPMSQDIRSLNLGNSVAIVLYEALRQNGFANMETAGHLHRLEWKE >gi|330402193|gb|ADLB01000022.1| GENE 13 13389 - 14570 925 393 aa, chain - ## HITS:1 COG:BH3350 KEGG:ns NR:ns ## COG: BH3350 COG0436 # Protein_GI_number: 15615912 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Bacillus halodurans # 2 387 8 393 393 446 54.0 1e-125 MRNPLSEKIVNIAPSGIRRFFDLVSEMDDAISLGVGEPDFDTPWRIREEGIYSLERGRTF YTSNAGLKELKQEICRYLSRKINVEYDFRNEVVVTVGGSEGIDIAMRAMLDKGDEVLIPQ PSYVSYLPCTVLADGVPVIIPLQRKNEFKLTVEELENAVTEKTKILVLPFPNNPTGSIMT REDLEPIAKFVVEHDLFVLSDEIYSELSYKGEHISIASFPGMKERTILINGFSKGFAMTG WRLGYACGPKKIIEQMIKIHQYAIMCAPTNSQYAAVEALRNCEGEVEEMRNAYNQRRRFL VSEFKRMKLDCFEPYGAFYIFPSIREFGMSSEEFAMKFLAEEKVAVVPGTAFGECGEGFL RISYAYSLEDLKEAIGRLERFINRLRERTEENA >gi|330402193|gb|ADLB01000022.1| GENE 14 14570 - 15052 464 160 aa, chain - ## HITS:1 
COG:BH3351 KEGG:ns NR:ns ## COG: BH3351 COG1522 # Protein_GI_number: 15615913 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 4 160 7 164 164 135 48.0 2e-32 MRKEILSYIETHSRVELKELAVLLGSDEITVANEIADMEKEKIICGYHTLIDWDKAGVEK ITALIEVRVTPQRNQGFDRIAERIYNYPEVSAVYLISGGYDLLVTLEGKTLKEVSMFVSE KLSPIESVISTATHFILKKYKDHSTILVNKSKGERIPVMP >gi|330402193|gb|ADLB01000022.1| GENE 15 15064 - 16053 1009 329 aa, chain - ## HITS:1 COG:PH1573 KEGG:ns NR:ns ## COG: PH1573 COG0309 # Protein_GI_number: 14591353 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Hydrogenase maturation factor # Organism: Pyrococcus horikoshii # 1 326 3 325 326 134 29.0 2e-31 MKIGNVSQTVIRRSVLKQLKTKREEAVIQPSVEETCSGIAVPDGQQAIFTNVTLYGDEKD IAVFGLAQALNDLYTRGASPVGANLSILLPPYAYESRLKSMVEFAEQTAQRQHIQILNVK AQVSPAITKAVVTVVGIGLVAAGELIQSNMGKANQDIVLTKWIGLEGMLRIVREKGEELS QRFVPAFMTQVNALEENLFADRELKLAKEFGVSAMCQITDGGILAALWNMAEASGVGVEV DLKKIAVKQETIEICEYFQLNPYQMTSAGSVLFMTDNGKELTDLLLRKGIQAEVIGRSTA DRERVIYNQEEKRYIDRPAQDELLKIYGE >gi|330402193|gb|ADLB01000022.1| GENE 16 16068 - 17663 1622 531 aa, chain - ## HITS:1 COG:CAC0691 KEGG:ns NR:ns ## COG: CAC0691 COG2720 # Protein_GI_number: 15893979 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Clostridium acetobutylicum # 49 409 28 387 411 139 30.0 2e-32 MRKGIKDSKRRNENEKKHRKTAKICGLIFVCFLFVYGGIYAVLHYQVNKQAEDKICEGVF IGRVDVSGMNAEQAQNAVDKQAEVYGKQKISLTAEGKKAEVTLEELGFDISKEEKLIEEA VDYGKKGNVFSRYAEIRKLKKQKKTLKPVYQIQKEKTEKVLKEKVESFLPKAADATITRK DGAFVLTKEQQGKQLDTNKTIDAVNQYLNKDWNGGAGKVEVVSTIKKPRVQKEDLESIKD TLGTFSTYCGSGQSRVVNIINGVKKINGSVVMPGEEFSAGKAMQPFEKSNGYVEAGAFEN GELVQSIAGGICQVSTTLYNAVIESELEVTSRQPHSMTVNYVKPSKDAAIAGDYKDFKFK NNLDTPIYVEGYVSKGNVVFTIYGKETRKKDRKIEYVSEVISTEAPKKKFVAQPGSAIGA MSVTKGTHKGMTARLWKVVYEGGKEVSREIFNKSTYKPSVTTVSVGTASSNPEYTKQMQN AIKTQDESKIKAAIADIKAKEAAAQAPVPAPQPQPQPNPPAAPQTAPQGQQ >gi|330402193|gb|ADLB01000022.1| GENE 17 17696 - 17797 95 33 aa, 
chain - ## HITS:0 COG:no KEGG:no NR:no MNFYDKKFRKIVSGVILVIIVAMLATSVLPYIM >gi|330402193|gb|ADLB01000022.1| GENE 18 17880 - 19604 1542 574 aa, chain - ## HITS:1 COG:CAC1021 KEGG:ns NR:ns ## COG: CAC1021 COG1032 # Protein_GI_number: 15894308 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Clostridium acetobutylicum # 2 554 3 539 548 407 43.0 1e-113 MKIILTAINAKYIHSNLAVYSLKAYAKKYAEHIFLAEYTINQQLDEILMDIYKKKPDVLC FSCYIWNISYVETLIHEIHKIMPNLPIWLGGPEVSYDAQEVLRRLPEVRGIVKGEGEETF SEVVSCYIEGKSEDYLREIKGISYRKKDGRIVENPWREIMDLSRVPFVYEDMEEFKNKII YYETSRGCPFSCSYCLSSIDKCLRFRDLDLVKKELQFFLDNEVPQVKFVDRTFNCKHSHS MEIWRYIAEHDNGITNFHFEISADLLNEEELELLGTMREGLVQLEIGVQSTNPKTIGEIK RTMRFDKVADAVKRVNAGRNIHQHLDLIAGLPYEDYESFGKSFNDVYALAPEQLQLGFLK VLKGSYMEEHTEEYGLVYKSLPPYEVLYTKWLPYEDVLKLKRIEEMVEVYYNSGQFSYTL EHLVREFETPFAFFEKLGEYYEEHAFHMVSHSRITRYEILLAFAEKYAGKRVDLYRELLI FDLYLRENIKSRPTFAGENTTDKEWLGEFYEEESKEHKYLLGYEKYDKRQLRKMTHIEKF TYDVLGNGEKKETVILFDYQNRSKLNYQATVFRL >gi|330402193|gb|ADLB01000022.1| GENE 19 19963 - 20409 337 148 aa, chain - ## HITS:1 COG:BH0033 KEGG:ns NR:ns ## COG: BH0033 COG0590 # Protein_GI_number: 15612596 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Bacillus halodurans # 1 147 9 155 159 166 53.0 2e-41 MREAIKQAKKAYEINEVPIGCVIVCEDKIISRGYNRRTTDKNPLAHAEMIAIKKASKKVG DWRLEDCTMYVTLEPCQMCSGAIVQSRMKKVVVGCMNAKAGCAGSILNLLQMDEFNHQVE LETGVLEEECSLLMKNFFKELRKAKQKH >gi|330402193|gb|ADLB01000022.1| GENE 20 20427 - 21170 519 247 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0291 NR:ns ## KEGG: EUBREC_0291 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 215 1 230 253 95 32.0 1e-18 MKRFTQYLYEYRQREKVKNIGFVKVEQGKREGKVEIQTRGLFFSRGGGLKVYLFYRNNDM YEGVLQGEIGQNKPMIQYMLTFSDEDIGNADSVEAICGIFLEDEEKKRYMAVWNGEEINV EDISTEPLKEMEIREEPESEPEPEKNFYEKIQRKDLVRLPRREWRLANNNFLLHGFYNYH 
HLLWIEEGGELYIGVPGVNHAREVQAAKAFGFTQFRKMENEEVELSEEERNPREDFGYFC RRIEKNE >gi|330402193|gb|ADLB01000022.1| GENE 21 21284 - 21928 771 214 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613781|ref|ZP_03289895.1| ## NR: gi|210613781|ref|ZP_03289895.1| hypothetical protein CLONEX_02103 [Clostridium nexile DSM 1787] # 1 214 1 235 235 217 58.0 5e-55 MMNLTFGEQVKIVLNRRGMTIKELAEVIEEKTGKKMSRQNLTQRLNRDNFQEQDMRQIAE ILGCPFQLSILSESEPVVKPRKEKKVEREVTIGDLVDLEEEPKVKPVEKKTTVSSASGVF KGFRKPSASYKSTAVQETRKREEKQQIVGDINPYTRQEYETNSVRMHPKRIGYVQVYDRG EHKWTDMTEWAFLGYQERKKKLMGKDYVPPIYLD >gi|330402193|gb|ADLB01000022.1| GENE 22 22003 - 22806 737 267 aa, chain - ## HITS:1 COG:CAC3297 KEGG:ns NR:ns ## COG: CAC3297 COG1876 # Protein_GI_number: 15896541 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 49 267 22 236 241 134 37.0 1e-31 MKEKLQKIGENLKGVRLSKRAWLIIGAVLLLVAGTVAGYFMGKHFADGKAEENKEALISK HKKREKELEEQLREMKIQKPEERPWYLTLVNDTNPMQEGYVPELTKVVGEYQVDSRIAEP LKQMLADAKKDGMKLSVCSAYRSAEKQKELYNLYMGNEVRSGKNYWQALEETTKSTAYPG KSEHSLGLAVDIVSAKYTALDEKQKDTPEAKWLAKNCYKYGFILRYPVDKTDITGIIYEP WHYRYVGKEDAKKIMERGITLEEYLGE >gi|330402193|gb|ADLB01000022.1| GENE 23 22809 - 23450 364 213 aa, chain - ## HITS:1 COG:BS_sfpm KEGG:ns NR:ns ## COG: BS_sfpm COG2091 # Protein_GI_number: 16081163 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantetheinyl transferase # Organism: Bacillus subtilis # 67 203 86 215 224 75 33.0 5e-14 MIRTWIAEVTPLLEESVYRKCYEKLPTFRKEKADRIVPEADRALSVGAWLLYECVRKECG VSAETPFNLSHSGKMVLCSIEDSGEKEIKIGCDIEEIKKLHTKLIKRYFFKSEENFILSK PSEEEKKVAFYRYWVLKESFMKATRYGMKMGLDTFEIMCDETGAKLLRQPKEICEQFYFK EYETDLPYRVAVCSTSDVFAEKMERMDLSIYWR >gi|330402193|gb|ADLB01000022.1| GENE 24 23447 - 23962 535 171 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0209 NR:ns ## KEGG: Cphy_0209 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 171 1 188 189 107 34.0 2e-22 
MHVGAKKIAVLGLLAAFSVLLIVLGAVIETSTLFFICGASFCVGIAIREWGLRYGFSFLV ATTLVGLLVAPNKMYCVTFAGMGLYLCLAEILWNKFGNKRTLLWCGKYIVFNLMYVPTII FMPQLLIAKEISSTLLMIVWAAGQAGLFIYDKAHDYFQIFIWNKLRKYLLK >gi|330402193|gb|ADLB01000022.1| GENE 25 23953 - 24579 722 208 aa, chain - ## HITS:1 COG:CAC0648 KEGG:ns NR:ns ## COG: CAC0648 COG2214 # Protein_GI_number: 15893936 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone # Organism: Clostridium acetobutylicum # 2 202 1 193 195 131 38.0 7e-31 MMANPYEVLGISPSATDDEVKKAYREMSRKYHPDSYTNNPLSDLAEEKFKEVQEAYDQIM KQRENGGYQGGFSGQQSGYSYSGNSGDANVQMQAVANYINARHYREALNVLNGISNRNAQ WYYYSAAANYGMGNNIVAMQHAQQAAAMEPNNPEYANMVNQMQWRGQRYQNTGYGYGRQS YGTGNLCCDLWCADTLCECMGGDLCTCM >gi|330402193|gb|ADLB01000022.1| GENE 26 24576 - 25421 745 281 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0207 NR:ns ## KEGG: Cphy_0207 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 281 1 280 285 313 57.0 6e-84 MFGYVTICKPELKMKDYYTYRAYYCGLCKVLKEKYGFLGQMTLTYDMTFLVLLLTSLYEE KPTHEQNRCIVHPAKKHDMFFNEITEYAADMNIVLTYFHFADDWQDEKSKVGLAGMRALR KTYLKIREKYPNKCEKIRRCLVRLQKAEKMREENIDVVSGYFGELMGELLLYKDDVWKKT LKRLGFYLGKYIYILDAYDDLEKDRESGSYNPLLTLYNDERYEEKCGQMLTLVLAECSSA FEKLPCIEYADILRNILYVGVWNKYDDKQKQNTVNEEGIKE >gi|330402193|gb|ADLB01000022.1| GENE 27 25514 - 25696 134 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSYNVKTDSIVFQYCLSHFNKKKHRPHNSKIAKTVLPSARSGVRTLDTLIKSQVLYQLS >gi|330402193|gb|ADLB01000022.1| GENE 28 25783 - 26664 937 293 aa, chain - ## HITS:1 COG:CAC0848 KEGG:ns NR:ns ## COG: CAC0848 COG1284 # Protein_GI_number: 15894135 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 15 292 17 291 292 137 33.0 2e-32 MKAETKKELKDIFDLKRLFFVILGNTIYAAGIAAFVLPTGLITGGTTGLGLIANQYFQIP IELFAAAFNVTMFILAVLVLGKSFALTSLISTFYFPVILGVFQKVEVLQNLTDDLMLCTI FSGLCIGVGIGMVIKAGASTGGMDIPPLIFNKKMGVPVSVGLYVFDFSILIGQMFFRDTE 
KSLYGILLVMIYTVLVDKVLLMGKNQMQVKIISDEYEKINAMIHQKLDRGTTLYKTETGY LHKDGFAIFTVVSNRELPKLNEMVLEIDSKAFMVINQVSEVKGRGFTLGKIKM >gi|330402193|gb|ADLB01000022.1| GENE 29 26792 - 27061 362 89 aa, chain - ## HITS:1 COG:MA3212 KEGG:ns NR:ns ## COG: MA3212 COG0526 # Protein_GI_number: 20092028 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Methanosarcina acetivorans str.C2A # 1 81 6 85 93 65 38.0 2e-11 MKPELVVIFGNWCAKCNMMMPIVEEIAQEYKDVLRVTKIEVENENEDLQKNYGVDIVPTF LIQKDGKELGRMSGLIDKKLMLRRIFSVL >gi|330402193|gb|ADLB01000022.1| GENE 30 27076 - 27213 168 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSSKRKVKNKVKCVLYDRPKLNRKLSCTKRSFDEVVRILNPRCKC >gi|330402193|gb|ADLB01000022.1| GENE 31 27256 - 27909 668 217 aa, chain - ## HITS:1 COG:BS_ytmQ KEGG:ns NR:ns ## COG: BS_ytmQ COG0220 # Protein_GI_number: 16080042 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Bacillus subtilis # 1 215 1 210 213 212 50.0 5e-55 MRLRNIPGARETIDANPIAIKNEREHKGKWNELFGNDNLLHIEIGMGKGQFLLTLAKQNP DINYIGVERYSSVLLRALEKYETEEFCGLENIRFICMDANGIAEVFAPEEVDKIYLNFSD PWPKVRHARRRLTSKEFLSRYEKVLAKDGTLEFKTDNRPLFEFSLEQAEEAEGWKVKEHT FDLHHNEKMNEGNVMTEYEQKFSSQGNPIHKLIAERQ >gi|330402193|gb|ADLB01000022.1| GENE 32 27931 - 29847 1577 638 aa, chain - ## HITS:1 COG:all0713 KEGG:ns NR:ns ## COG: all0713 COG0296 # Protein_GI_number: 17228208 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Nostoc sp. 
PCC 7120 # 9 637 106 752 764 629 47.0 1e-180 MTNKNIRYSYEIGELDHYLFGQGTHYEIYKKMGAHKVKRGAKEGVYFSVWAPHAKEVSVV GEFNDWDREKNPMERGEPLGIYTCFVPEAKEGMLYKFCIETHKGELLYKADPFANYAEVR PGTASRITDITNLKWTDKEWLEKRKSWNHKENPMSIYEVHMGSWKRHEGYGKDNGFFTYR EFAKEAVAYMKEMGYTHIELMGIAEYPFDGSWGYQVTGYYAPTSRYGTPEDFAYMINLFH KNKIGVILDWVPAHFPKDAHGLADFDGTPTFEYADSRKGEHADWGTKIFDYGKSEVQNFL IANALFWIEHFHIDGLRVDAVASMLYLDYGKEDGQWVANKYGGNENLEAIEFFKHLNSVV LGRNPGTLMIAEESTAWPKVTGSPEENGLGFSLKWNMGWMHDFTEYMKLDPYFRRPNHHL MTFAMTYAYSENYILVLSHDEVVHLKCSMLNKMPGLGFDKFANLKVGYAFMMGHPGKKLL FMGQDFAQLREWSEERELDWYLLREDNHQAIQNFTRDLLQLYKKNKAMYECDNDPEGFEW VNANDGNRSIYSFIRHSKNGKKNLLFVCNFTPMEWADYRVGVSRRKQYKLVLDSDEKQYG GMGKKRPSVYKATKGECDGRPYSFAYPLSPYGVAVFEF >gi|330402193|gb|ADLB01000022.1| GENE 33 29940 - 32495 2784 851 aa, chain - ## HITS:1 COG:SMb20196 KEGG:ns NR:ns ## COG: SMb20196 COG0464 # Protein_GI_number: 16263937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATPases of the AAA+ class # Organism: Sinorhizobium meliloti # 648 823 80 272 311 65 23.0 5e-10 MDKHEYRVKTEQMLEYMEKKQYKKAMEIADTIDWHRVKSLSMLSTVSEIYEYNGENQKSR DILFIAYDRSPGSRKIVYRLGTLAIKIDDLQEAMDCYEEFLSIAPKDPNQYILKYKILKA ENAPLEEQIEALVDFKKMEYVEKWAYELAKLYNAAGMIPECLEECDDLILWFSEGKYVLK AMELKRQHKPLTPLQQEKYDRFVEKKEQVKKAKEDLDATRKIEIPKGIAEPVQEKAVEEV AEEPVEETSGEVAEEPVEETSEETAEEPVEETSEEIVEEPVEETSEEVAEEPVEEITEEP VEEVQEEGTEETVEEESVEEPVEEVSEESETEPEEVTEEKEPESIIKETVVGSTLEEAVA RGVAAAVNVTAREKEEEAMTAQEKLEGILQNWEETQRAVEQEIERKKQEAEAEREAKRKE RAAQSMFNTTPIIPDDIQQLLDDLESEMKESEEAEEDVEINEVVEEPTVEETETTKPAEE IQEEQVEEVTEMEAEVTEEPIMEEEPEEVVEETVEQPVEEPQPQEEIFADDDDIIEKQFD ISLQEEIEEPVEKDDELEKLQNETKKDVDKETDGVSFDTGFIVQGKYDLEAQSEIGTKAG LTEEQKKLFSYFVPVRGMSEQIVEVLENDKNCNTRYGTSRTGNILVVGRQGTGKTVLAVN VVKAIQKARKLKQGKVAIVTGEALNRKDLSTVLGKLKGGALIVEKAGKMNQSTIEALNEK MEEQTGELLVVLEEQRKPLEKLLSENPGFRRKFTSKLELPIFINDELVTFGQTYAKENGY RIDEMGILALYSRIDILQKEDHAVTVTEVKEIMDHAIEKAQRVNIKHFFKRLFGKHTDNA DRIILTEQDFK >gi|330402193|gb|ADLB01000022.1| GENE 34 32619 - 32717 64 32 aa, chain + ## HITS:0 
COG:no KEGG:no NR:no MAILVFSFILYIYIGGIICFLFLIALVFIIQS >gi|330402193|gb|ADLB01000022.1| GENE 35 32908 - 32991 185 27 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTIENPIQYISGMANESSYVESLNHR >gi|330402193|gb|ADLB01000022.1| GENE 36 33232 - 33786 184 184 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764517|ref|ZP_02171573.1| ribosomal protein L32 [Bacillus selenitireducens MLS10] # 8 178 2 175 190 75 29 1e-12 MNKQIEHTFATKNLVRIGLMTAILCVIAPFAFPLPFSPVPISFATFILYLGTYTLGRKYG TVSCLIYLLLGIVGLPVFSGFSGGIGKLAGPTGGYLIGYLLLAFVGGWFVENYEGRLIPS AIGLVLGTALTYTFGTIWIAQQMSLSFTQGLAIGVLPYLPGDAVKIIAALIIGPVLKKRL QQIK >gi|330402193|gb|ADLB01000022.1| GENE 37 34208 - 36463 2715 751 aa, chain - ## HITS:1 COG:ybhJ KEGG:ns NR:ns ## COG: ybhJ COG1048 # Protein_GI_number: 16128739 # Func_class: C Energy production and conversion # Function: Aconitase A # Organism: Escherichia coli K12 # 1 750 9 757 761 917 59.0 0 MVKLYPHGVYLLNGKEISDEAQMSKEEARKNTIAYGILKEHNTSSDMEHLQIKFDKLTSH DITFVGIIQTARASGLEKFPVPYVLTNCHNSLCAVGGTINEDDHMFGLTCAKKYGGMYVP PHQAVIHQFAREMLAGGGKMILGSDSHTRYGALGTMAMGEGGPELVKQLLNKTYDIKMPE VIGIYLDGEVMAGVGPQDVALAIIGATFANGYVNNKVMEFVGPGVDKLSADFRIGIDVMT TETTCLSSIWKTDDTIKEFYETHKRADEFKELNPGAVAYYDGMVYVDLSKIKPMIAMPFH PSNVYTIDEVNANLKDILHDVEQKALVSLDGAVEYSLQDKIRNGKLYVEQGIIAGCAGGG FENICEAAHIMKGHNIGADEFTFSVYPASTPIYMELVKNGAAADLLAAGAVVKTAFCGPC FGAGDTPANNAFSIRHSTRNFPNREGSKLQNGQISSVALMDARSIAATALNKGYLTSATE VVDREYRTPKYHFDASIYANRVFDSKGIADPSVEIKFGPNIKDWPEMSALPENLIVKVVS EIHDPVTTTDELIPSGETSSYRSNPLGLAEFALSRKDPAYVGRAKEVQKAQKAIEEGRCP LEELEELKPVMDKIFEKNPQLDKKTLGIGSTIFAVKPGDGSAREQAASCQKVLGGWANIA NEYATKRYRSNLINWGMLPLLTEEKDTELSFKNGDYIFFPEIRKAIIEKRSAIKAYVVGD EWKELTLTMGELTDNEREIILKGCLINYYRG >gi|330402193|gb|ADLB01000022.1| GENE 38 36457 - 37563 861 368 aa, chain - ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: 
Clostridium acetobutylicum # 142 347 111 307 312 86 31.0 7e-17 MLPVQILKKTVCDIQDITGTACSIWNVDGNCLAHSGGKEKELSDFIKKLPEEEEWEENNA SFFGVSSEEGLIYVLVLHSVLPEMKMAGRLGVKQLEGYLRLYQEKMNRNHFIQNLLLDNL LLVDVYNRAKNMHIPIEARRAVIMIEPKNPNDNLILEVLKGLYETSTKDFVTAVDESHII LVKMLEDREGYKELNKIAKYIVDILNTEAMVNVRVAYGTIVDEIKDVSKSYKEAGMALDV GRIFYEEKNVLAYDQLGIGRLIHQLPISLCDMFLSEIFEGKEIEFDEETLTTVHKFFDNN LNISETARQLFLHRNTLVYRLEKIQKKTGLDVRVFDDALTFKIALMVSNHLKSLQKSVQS NAKEETKW >gi|330402193|gb|ADLB01000022.1| GENE 39 37643 - 38992 1760 449 aa, chain - ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 1 449 1 449 450 616 66.0 1e-176 MNKKVTFDYSKAGAFIKEHEVEYMGKLVADAKELLVSREGAGNDFLGWIDLPVDYDKEEF ERIQKAAEKIQGDSEVLLVIGIGGSYLGARAAIEFLRHSFYNMVSKEVRKTPEIYYVGNN ISSTYIKHLIDVIGDRDFSINMISKSGTTTEPAIAFRVFKELLENKYGKEEAGKRIYATT DKARGALKNLATEEGYESFVVPDDVGGRFSVLTAVGLLPIAVSGADITKLMEGAKEGRRL ALESDFAENDALKYAAIRNILHRKGKSVEVLANYEPNLHYVSEWWKQLYGESEGKDQKGI FPASVDLTTDLHSMGQFIQDGNRIMFETVLNVEKSTEEIVLKEEPVDLDGLNYLAGKTVD FINKSAMNGTILAHTDGNVPNLMVSIPEQNEYYLGQLFYFFEFACGVSGYLSGVNPFDQP GVESYKRNMFALLGKPGFEKEREELLKRL >gi|330402193|gb|ADLB01000022.1| GENE 40 39057 - 39824 761 255 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613855|ref|ZP_03289949.1| ## NR: gi|210613855|ref|ZP_03289949.1| hypothetical protein CLONEX_02162 [Clostridium nexile DSM 1787] # 1 255 17 269 269 179 43.0 2e-43 MLEKIMEANVILWIMGAIIAIGVLGKFITVVTLGKLVRASGEMGKSTHKLMKLVKAKFEH TIMVSEKVENIHAFVEKYIFEYRVCGLHLHTWRYLEKQSIWFCGVAGCIGAFLSYRLGGV QEEMFRYTALAGVGMVVLFLVYISTDESYQMDTIQVYMVDYLQNICAPRYQKQQALQMQR MEEMMKEPEEIKESEEVQEQEIKETVEEEVKAAEEVEETSMEQAEIRQEEIKEEKEEKKV SQEVILREILEEFLA >gi|330402193|gb|ADLB01000022.1| GENE 41 39947 - 40642 778 231 aa, chain - ## HITS:1 COG:CAC2117 KEGG:ns NR:ns ## COG: CAC2117 COG0775 # Protein_GI_number: 15895386 # Func_class: F Nucleotide transport and metabolism # Function: Nucleoside 
phosphorylase # Organism: Clostridium acetobutylicum # 2 230 3 230 230 210 49.0 2e-54 MIGIIGAMQNEVVLLQEEMTVEETVEKAGMVFYKGELCGQKAVIVKSGIGKVNAALCAQI LVDMFHVDTLINTGIAGSLNAEINIGDIVISTDAVQHDMDTTIFGDPLGQVPQMDTFSFP ADEKLAKLAKEVNEEENPDIQTFMGRIVSGDQFVSSGEVKERLVSQFDAMCTEMEGAAIA HAAYLNKVSCVIIRAISDKADNSAVMDYPAFERQAIVHSVRLVRGLMKKLG >gi|330402193|gb|ADLB01000022.1| GENE 42 40767 - 41009 254 80 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00054 NR:ns ## KEGG: EUBELI_00054 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 76 1 76 82 104 65.0 8e-22 MEYKPHGVCSQKITFDIQNDTVHAVQFTGGCSGNLQGIAKLIEGMNVDEAISRIEGIKCG YKSTSCPDQLAQALKEATGR >gi|330402193|gb|ADLB01000022.1| GENE 43 41303 - 41761 550 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226325859|ref|ZP_03801377.1| ## NR: gi|226325859|ref|ZP_03801377.1| hypothetical protein COPCOM_03672 [Coprococcus comes ATCC 27758] # 51 151 46 146 146 100 53.0 2e-20 MRKKIAITVAALLLSLTLSGGEGKEVKAVENVQKVSVQQENKLSEGITEPKGKYLQKGVS SIGQVGPGKIKVSGTTVAQQLVSTIKISVMVERQVNGDWLSYTSWNASDTNAYALTTSKT MYVPRGYYYRVRCVHSANSDVGDSNTSAILVD >gi|330402193|gb|ADLB01000022.1| GENE 44 41830 - 42663 829 277 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225570068|ref|ZP_03779093.1| ## NR: gi|225570068|ref|ZP_03779093.1| hypothetical protein CLOHYLEM_06164 [Clostridium hylemonae DSM 15053] # 1 277 1 274 274 116 31.0 2e-24 MNKKEDITAQYSDEELQREADEIRAIIENQPELKKMEMPKEAHLSLMNRIKEYEEEKLRN ALPEEDREALRIGREVQKKRERAKQNRKKIRRIMGVAATVVLFIGIGVTSVGGKKLFIDT FDKKFGGEDKTYVNSIEPEDVGEVTEEEAWEQIKEALGEEPVRMVYKPRNTKFLNAVVDK EVQEAMLYYSVNDKTMSLQVVSRYVKSSTGIEISDKILQEYTIQLPETKVLVREYKVLET GEKEYTAQFSYKNSKYFLVGIMNKAEFEEIIKNLHFF >gi|330402193|gb|ADLB01000022.1| GENE 45 42768 - 43832 857 354 aa, chain + ## HITS:1 COG:CAC2231 KEGG:ns NR:ns ## COG: CAC2231 COG0707 # Protein_GI_number: 15895499 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Clostridium acetobutylicum # 3 351 5 353 359 
398 53.0 1e-110 MKRIILTGGGTAGHVTPNIALLPRLKELNYDIHYIGSYTGIEKELIEQLGIVYHGISSGK LRRYFSLKNFSDPFRIVKGLNEANKLMKSLKPDVVFSKGGFVSVPVVMAAKRHHIPTIIH ESDMTPGLANKLSIPSATKVCCNFPETLEYLPKEKALLTGSPIRQELLAGDRAAALKFCG LTEDKPVILIIGGSLGSVVVNDAVRAILPELLQKFQVIHLCGKNKVDPSLNHLNGYVQFE YVQNELKDIFALTDIVISRAGANAICELLALRKPNLLIPLSANASRGDQILNAHSFERQG FSIVIEEEDLSNEKLLASIHSLYDNRDSYVNAMSQSLQQNSIDTIIKLIEEVSQ >gi|330402193|gb|ADLB01000022.1| GENE 46 43890 - 44456 408 188 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225570070|ref|ZP_03779095.1| ## NR: gi|225570070|ref|ZP_03779095.1| hypothetical protein CLOHYLEM_06166 [Clostridium hylemonae DSM 15053] # 10 184 4 175 180 104 35.0 3e-21 MNEIESAKRKFTGNIKFDSIYGRNRNKVFQTAMYYTKRMEIAEEITQEIFVEMYVGIDKI DEAKVDNWLLTVTKNKSLNWLHKLEAEVRRLEYLQENGELLSESAEYSVLEKERKQDFGL LSQEIFCELYKENERWYEVVSGIYCSGKTYTEVADELHVSKDVVYAIIYRARRWIKRNYG KRYEKIFE >gi|330402193|gb|ADLB01000022.1| GENE 47 44488 - 44895 428 135 aa, chain - ## HITS:1 COG:BH1189 KEGG:ns NR:ns ## COG: BH1189 COG0537 # Protein_GI_number: 15613752 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Bacillus halodurans # 3 117 4 119 142 130 52.0 9e-31 MRDNNCIFCKIANGEIPSATIYEDEDFRVILDLSPASKGHALILPKEHYANLFELDDEKA GKVLVVAKKVITKMKEILNCDGYNLVQNNGEAAGQTVNHFHLHLIPRYEGDNVGLQWNPG TLTEEVKEEILLKFQ >gi|330402193|gb|ADLB01000022.1| GENE 48 45001 - 45750 409 249 aa, chain + ## HITS:1 COG:atoS_3 KEGG:ns NR:ns ## COG: atoS_3 COG0642 # Protein_GI_number: 16130156 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Escherichia coli K12 # 35 241 63 270 278 115 31.0 5e-26 MFTLDDEKQLHRLMNENSDNYMLIQKLLDFQQLTISKISHEIRNPLTLISSTLQLIQKQH PEVTHFAYWKQMLEDISFMKTLLEELSLYNNGTRLHKTVIHTTDFFNRLILSFATTLTES SIELTSNIEQNLPNIAGDPIKLQEALLNLLTNARDAVKPDGKIYFSAQSKDESIILHIKD NGCGIPQSHLSDIFTPFTTYKENGTGLGLAIVKAVTEAHHGTVYVKSIPFHSTSFTLTLP 
IQDKCEKDS >gi|330402193|gb|ADLB01000022.1| GENE 49 45679 - 46290 350 203 aa, chain - ## HITS:1 COG:RSp1525 KEGG:ns NR:ns ## COG: RSp1525 COG0705 # Protein_GI_number: 17549744 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Ralstonia solanacearum # 6 197 204 406 569 108 33.0 1e-23 MPIKKRPIITIGIIAINILVFVWLSFFGMTEDGSYMLEHGAMFVPLVLGNHEYYRLITSI FLHFGFAHLMNNMVMLFFLGSILEEEIGSFKYLLLYFVSGVAGNILSAFMDLKTGEFVIS AGASGAIFGVIGALLIIVTKNHGHLRTLDGRGMVFMVVCSLYHGFTSTGVDNMAHIGGLL SGILLAFILYRKRQSKRSTVEWN >gi|330402193|gb|ADLB01000022.1| GENE 50 46376 - 47083 574 235 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0040 NR:ns ## KEGG: Cphy_0040 # Name: not_defined # Def: FHA domain-containing protein # Organism: C.phytofermentans # Pathway: not_defined # 1 193 6 200 580 78 25.0 2e-13 MESKYKRDCKHMYFILERQEFYEEDYQMKMMKENRIEGLLQVTGQGINGKSQYNYEISGK VSVKSIYEKMVIEKEELENFLRQFLNLLSQIENYLLNVNCILLEPEYIFYEEGKYFFCYL PIEEEEFCYRFHCLTEYFVSKINHKEQEAILLAYELHKATMEENYSLEKIIEQAMAASVK ETKKEREMEEEYEDDWIDFEEQQEKQGNIIEEKWKSWKKKWGRRKTEKWGEWEQD >gi|330402193|gb|ADLB01000022.1| GENE 51 47073 - 47468 230 131 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226325866|ref|ZP_03801384.1| ## NR: gi|226325866|ref|ZP_03801384.1| hypothetical protein COPCOM_03679 [Coprococcus comes ATCC 27758] # 1 126 1 126 138 102 42.0 7e-21 MKMMKGSLMVEAAYIMPVIFLSFIAGLYMLFYFHDKNILLGAGYETVVVGSEKMRWNEEN IEEKMEEFFHKRVKGKMILFSKPKVTVRYEKEELVLRAYAKKKRMALKIEQKKTVRMPER YIRKKRNVYGK >gi|330402193|gb|ADLB01000022.1| GENE 52 47465 - 47869 381 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613906|ref|ZP_03289970.1| ## NR: gi|210613906|ref|ZP_03289970.1| hypothetical protein CLONEX_02183 [Clostridium nexile DSM 1787] # 4 132 10 138 140 76 37.0 4e-13 MWGMIFLLICAMWDLRVKRIPTYLLWIGTAGTMIFNIFFYKGNLFNPLGGLIIGGVCLFI SKVTDEALGYGDSFVICLLGSYAGFLKTLWTVTLAFTGVGLFSLIFLMRKGDYRKRTIPF IPFLAISYMGVMYV >gi|330402193|gb|ADLB01000022.1| GENE 53 47857 - 48735 689 292 aa, chain - ## HITS:1 COG:no 
KEGG:Cphy_0037 NR:ns ## KEGG: Cphy_0037 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 50 290 7 298 304 108 26.0 2e-22 MSFLFINSQKTLKHQKKNNRRKEEDSTKTQSITSYFPKKVERASLFTSFKGSITVEASMA VPLFFLAICCLCYLLEVMSIQMTIRSALDEIGRKTAEEIYVKPFIFPGELEEKMVAHIGK ERLDRSVIADGSQGLDLSGTTLSGKSKIIKMCVEYKIKLPVYTFGKLALSCKEEFRLKSW TGYEKEFGFQDKEDIVYVTETGMVYHRDNQCTYLDLSIRTAGKKDIAQLRNENGEKYRAC EKCGKKAGKNVYITNQGNRYHSSIGCSGLKRKIYAIPISEAQGKGECSRCGE >gi|330402193|gb|ADLB01000022.1| GENE 54 48756 - 49985 916 409 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0046 NR:ns ## KEGG: EUBREC_0046 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 4 406 45 497 524 163 25.0 1e-38 MALESVFAEYQKEMMETYDVFALDGTYESSTFSEKNILDRLKFYGAGNGRYKIERIQLLS DDSGRAFREQAIACMKQKTGIGIVEELAGETSEWEEQERNQEESQKEEKEVTDSLEESLA EAEQELPTEDNPIEVVSDIKSRGLLQVAVPKDIHISEKAIRLQELPSNRKLKKGRGTFKI RKEKSETMSKLYFTSYLLEKFSAVDKPDDKKKLSYELEYVIGGKESDRENLDIVVTRLVA MRFPANYGFLLGDSTKKAEAEAMAATLAGVIALPALVGIIKQAILLAWAFGESIMDVRSL LAGGKVELVKRKENWQLQLSSLLNLGKEEGDVQKKETGLSYREYLRMLLFLQKEDECTMR SLDIIEMNIRQKKGEFFRIDSCVSKLEIKSICKIRERITYQFSTLYGYQ >gi|330402193|gb|ADLB01000022.1| GENE 55 50122 - 50298 256 58 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSRIRLFWSTLKNDRGIRVVEVILILVVLIGLVIIFKSQLTELVESIFEKITSESSGI >gi|330402193|gb|ADLB01000022.1| GENE 56 50310 - 51497 911 395 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0034 NR:ns ## KEGG: Cphy_0034 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 36 395 108 485 485 197 31.0 5e-49 MFLAGGVTLILSIGCLLVETGSRNVSTVERNGYGKGERQETYKVTVGDILKNESLDIEVG EEEYTVEEIRRVFKRTMDQLEKIVLADNKSADHVESNLNFIGEMPDVPIRIVWETDRSEI INAQGEIQEENLSKEGEQVEIKGCLSYGEEECMYVMNVMVYPKTLSKKEKVLTEIQNKVR SSDRKSKNKSAMHLPDEVGGEKIIWKKTTDYTFLYILVLGGVCTVAIYTREKEEKKKREK QRREQMIIDYPEIISQLSLLIGSGMTIKNAWKKILFHYENRKKKEVRFAYEEIYCAMREM QGGMTESESYERFGKRCGISSYMKLGTTLSQNVRKGAKGLTELLEKETKEALETRKQRAK QYGEKAGTKLLLPMSMMLIVVLVIVIVPAFLSISI 
>gi|330402193|gb|ADLB01000022.1| GENE 57 51523 - 52278 508 251 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0033 NR:ns ## KEGG: Cphy_0033 # Name: not_defined # Def: Flp pilus assembly protein TadB-like protein # Organism: C.phytofermentans # Pathway: not_defined # 7 251 3 247 247 165 37.0 1e-39 MRWKTEENYWQQDMRNRDRLLVIVKGLGGVGILAYLFYHSWIGYAALLPLLFLYYRTWKK EYAKRRKEKFQKQFKEALQSLRTVLNVGYSIENAIREVKKEMTTLYGEDALITKEFSYLV RQLEMNIPVEQAWKTFADRVEMSEVTHFVTILSTIKRSGGDMIAVLRQTIETICMKLEVQ QEIYTIIVAKQMEFRIMSAIPMGIIVYLKMSFPEFLSVLYQELVGKVVMTVCLLIYFAAY QWGKKILEIEV >gi|330402193|gb|ADLB01000022.1| GENE 58 52223 - 53437 928 404 aa, chain - ## HITS:1 COG:YPO0690 KEGG:ns NR:ns ## COG: YPO0690 COG4962 # Protein_GI_number: 16121013 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Flp pilus assembly protein, ATPase CpaF # Organism: Yersinia pestis # 65 395 75 406 428 301 47.0 1e-81 MIQAEQFHGKILEKIDLSEEIDDEELTEVIYQVLEEYSEKQYISLQDKVEVGQELFNAFR KLDILQNLIEDENITEIMINGNQNIFIEREGKIILSDKRFISKKKLEDVIQQIVAETNRI VNEATPIVDTRLSDGSRVNIVLYPIALNGPIVTIRKFPKESITLEKLRQWGAVSQEIISF LQILIQSKYNIFISGGTGTGKTTFLNALSQYIPKDERVITIEDNAELQIKELPNLVSLEA RMANIEGVGEISIRDLIKTALRMRPDRIIVGEVRGAEVIDMLQAMNTGHDGSLSTGHANS SKDMITRLETMVLMGMDLPLSAIQRQIASGVDILIHLGRFRDKSRKLISIEEITGCGSEG VRLNMLYQFQETEEKGGRVKGKWVKIHEMENRGKLLAAGYEKQG >gi|330402193|gb|ADLB01000022.1| GENE 59 53391 - 54380 526 329 aa, chain - ## HITS:1 COG:CAC0037 KEGG:ns NR:ns ## COG: CAC0037 COG1192 # Protein_GI_number: 15893335 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Clostridium acetobutylicum # 29 313 31 321 361 85 25.0 2e-16 MRSRCFVICDSEIVYAEKLSFILGKKIDFPVYVCSAKEQIEKIANEIPIEILLLDERFAE KVQTDIAKKVIYLTRERGKVEEKQAIYKYQSADNILLDILNVYAEEDRDILRQNRYKECT FIGVYSPVHRVGKTAFAIALGRELAKRERVLYLNLEEYSGWEERMMVKTSQTLADLLYYA RQENSRIGTKIGVMAEKTGQLEYIAPMKISEDLKQVTYEEWQELFRQLSHLRLYRKIIID FGECVQGLWSLLNICHKIYMPVSRQRESSAKILQFEQNADILGYGGVLDKTVQLEIPENL 
EEYVRDLLKKEGKQSDTGRTVSWEDIGKD >gi|330402193|gb|ADLB01000022.1| GENE 60 54539 - 54748 301 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225570084|ref|ZP_03779109.1| ## NR: gi|225570084|ref|ZP_03779109.1| hypothetical protein CLOHYLEM_06180 [Clostridium hylemonae DSM 15053] # 1 69 17 85 85 75 59.0 8e-13 MEKEHSGFIINFIIRAVIGIGLIFFINEYLSYRQIEIAVGINPVSVLLSGFLGAPGVALL YGILVYQIL >gi|330402193|gb|ADLB01000022.1| GENE 61 54907 - 55146 147 79 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|225570085|ref|ZP_03779110.1| ## NR: gi|225570085|ref|ZP_03779110.1| hypothetical protein CLOHYLEM_06181 [Clostridium hylemonae DSM 15053] # 1 78 1 78 78 75 51.0 1e-12 MKLFQKKRKAKVVYREHTNTPIQDTVLLKEIEKVKADMDNAYINFQNVLDPDLIDCYIFE SNAALKRYHFLLKQAKKIS >gi|330402193|gb|ADLB01000022.1| GENE 62 55288 - 55776 575 162 aa, chain - ## HITS:1 COG:FN1902 KEGG:ns NR:ns ## COG: FN1902 COG2131 # Protein_GI_number: 19705207 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Fusobacterium nucleatum # 5 162 15 170 174 204 62.0 5e-53 MSEKRNDYISWDEYFMGVAMLSGMRSKDPSTQVGCCIVSQDNKILSMGYNGLPKGCSDDE FPWTREGEDPLETKYVYTVHSELNAILNYSGGSLAGAKLYVSLFPCNECAKAIIQSGIKE VVYDSDKYADTASVMASKRMMDCAGVRYHQYHRTGRKIEIEL >gi|330402193|gb|ADLB01000022.1| GENE 63 55793 - 56422 842 209 aa, chain - ## HITS:1 COG:BS_upp KEGG:ns NR:ns ## COG: BS_upp COG0035 # Protein_GI_number: 16080742 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Bacillus subtilis # 1 209 1 209 209 297 67.0 9e-81 MAKVHVMEHPLIRHKISYIRRADLGSKDFREMISEIAMLICYEATRDLKLQDVTIQTPIC ETVEKELAGKKLAVVPILRAGLGMVEGMLAMIPAAKVGHIGLYRDPDTLEPVEYYCKLPA DCEERDVFVVDPMLATGGSSVAAIQMLKDKGVKNIRFLCIIAAPEGVERMQKEHPDVDIY IGALDKQLNEHGYIVPGLGDAGDRIFGTK >gi|330402193|gb|ADLB01000022.1| GENE 64 56440 - 57960 1595 506 aa, chain - ## HITS:1 COG:CAC2882 KEGG:ns NR:ns ## COG: CAC2882 COG0009 # Protein_GI_number: 15896136 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 
Putative translation factor (SUA5) # Organism: Clostridium acetobutylicum # 1 348 1 349 350 332 48.0 8e-91 MKTIITKVDKNQIEKEVIEQAGKIIQEGGLVAFPTETVYGLGANALDEEAAKKTYEAKGR PSDNPLIIHIADISHLNKIVQNISKKAMDLAEVFWPGPLTMIFEKSDIVPYGTTGGLDTV AVRMPDDIIARELILEGGGYISAPSANTSGRPSPTTAKHVEEDMSGRIDMILDGGSVDIG VESTIVDMTAEPPMILRPGAVTQRMLEEVIGEVQVDKTLLGEETSEAPKAPGMKYRHYAP KAQLKIVQGTLKDEVHAIKQLAYEKTKKGKKVGIIASTETAPQYTAGLIKPIGTRENQSS IAKNLYKVLREFDDENVDCIYSESFSVDGIGSAIMNRLEKAAGHQIIDANEIAAKQKFRR VVFVSKSDNSRGPMAAELLRNHDLIQEYDIESRGMIVLFPEPANQKAEAIMKSQQMTLEG HEAVAFSEADLDEDTLVLTLEENQKWKIVTDYEYVKHVYTLGEYVEIDKEVPSAYGQPLV EYGKSFEILKEMIEKLAEKLNAEARK >gi|330402193|gb|ADLB01000022.1| GENE 65 58007 - 58723 633 238 aa, chain - ## HITS:1 COG:BH0239 KEGG:ns NR:ns ## COG: BH0239 COG0860 # Protein_GI_number: 15612802 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 53 230 42 227 238 118 35.0 1e-26 MAYILTINVTEGIFLYRKTKLLMAVVLIGASFLMGKTLSVHVSGNQIKKGKSIVVLDAGH GYTDPGKIGINKKKEKDINLSIAKKVEKKLKKQNIIVKMTRREDKGLGDTKIADMKERVK RINNTKPNLAVSIHQNSYTQESVKGAQVFYFTHSKEGKEAAEVMQEMFRLFDKENKRVCK GNNTYYMLRKTEVPTIIVECGFLSNWEEAEKLSTKEYQEKVAQVICDGIIHILQKQEE >gi|330402193|gb|ADLB01000022.1| GENE 66 58735 - 60546 1559 603 aa, chain - ## HITS:1 COG:BS_ydhD KEGG:ns NR:ns ## COG: BS_ydhD COG3858 # Protein_GI_number: 16077638 # Func_class: R General function prediction only # Function: Predicted glycosyl hydrolase # Organism: Bacillus subtilis # 348 587 174 400 439 96 27.0 2e-19 MNERDRREQIRRRRYSRKRRQRSLVIKVCIFIILALVVLCGTILWKKYSPSDKKANRKEY YGITNEEQLAVIVNNQILEPKGKMIDGTLYLEYSAVRDYVNERFYWDANENLLLYTLPND MVSVSVGSREYAVSKEKKSEEYVILKTEGSTAYIALDFVKKYTDIEYKMYKSPNRIVIQS DWSDAKVATTKNNSAIRMRAGVKSPILTETKSDDVVRIIEKEGKWRKVRTNDGFVGYIKQ SDLKDEKTKTYKREFKEEKFTNIKKDYTINMAWHQVSSRSANNQVLEMIANTKGLTTLSP TWFSVKDTKGNITSIASSEYVNYAHQSNIEVWGLVRNFDPLNQEGIKTSEETHELLSRTS SRENLTNQLISEALQVGLDGINVDFETVAEKTGEHYIQFIRELSVKCRKNGIVLSVDNYV 
PKGYNEHYHRKEQGIVADYVIIMGYDEHFAGSYESGSVASIGYVKEGIQETLKDVPASKV INAMPLYTRLWKEVPKTEAELSEQAGTEAGKYSTKVTSETVSMKRTDKVLEQFGAEAKWD SKTGQNYAQWEKDDATYKIWLEDEKSIEEKLKVMKANKLAGTAVWKLGFESQGLWELILK YVN >gi|330402193|gb|ADLB01000022.1| GENE 67 60659 - 60880 263 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|226325561|ref|ZP_03801079.1| ## NR: gi|226325561|ref|ZP_03801079.1| hypothetical protein COPCOM_03366 [Coprococcus comes ATCC 27758] # 1 71 39 109 115 79 52.0 8e-14 MEQEDKAYRELFTELSQYERYGVPMSIDGESASPMQIAAAHMVKESGGYMRDYIWDKDGH MEELHFHEIKSVE >gi|330402193|gb|ADLB01000022.1| GENE 68 61327 - 62064 793 245 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3752 NR:ns ## KEGG: Cphy_3752 # Name: not_defined # Def: negative regulator of genetic competence sporulation and motility-like protein # Organism: C.phytofermentans # Pathway: not_defined # 1 243 1 259 261 218 46.0 2e-55 MKIEKLNDNQIRCTLTRADLAERELKLSELAYGTEKAKSLFQDMMQQAAFEFGFEADDTP LMIEAIPASSDTIVLIITKVEDPEELDTRFSKFAPSSSGGTTGKMQNVLDKLEGAEEFLD LLGKVKKAVSEKPEMEGTESTPADRKTKKDVRLFTFASLDNVIDASHLLTTMYTGANTLY KDPKDDLYILVLSQSGHSASDYNKICNMLSEYGSFEKTGSATLAYLEEHCDTIIGGNAIQ QLALI >gi|330402193|gb|ADLB01000022.1| GENE 69 62142 - 62339 348 65 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0036 NR:ns ## KEGG: EUBREC_0036 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 2 62 4 64 69 75 62.0 4e-13 MISKDMTIGELLATNPQVAPILMEIGMHCLGCPSAQAETLGEAAMVHGLDADLLVEKINA FLAAN >gi|330402193|gb|ADLB01000022.1| GENE 70 63027 - 63431 347 134 aa, chain + ## HITS:1 COG:lin1791 KEGG:ns NR:ns ## COG: lin1791 COG0735 # Protein_GI_number: 16800859 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+/Zn2+ uptake regulation proteins # Organism: Listeria innocua # 3 125 19 141 148 98 36.0 3e-21 MPALKYSRQRESIKEYLMSTKEHPTADMVYLHVKQEFPNISLGTVYRNLNLLADIGEIIK ISTLNGGDRFDGTTTPHYHFFCTDCGAVLDLDMDNIETINHIAGKNFDGEIDTHNITFYG RCGDCVKNAKVLKM >gi|330402193|gb|ADLB01000022.1| GENE 71 63535 - 64077 770 180 aa, chain + ## HITS:1 COG:CAC3598 
KEGG:ns NR:ns ## COG: CAC3598 COG1592 # Protein_GI_number: 15896832 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 1 179 1 180 181 214 65.0 9e-56 MKKFVCTVCGYVYEGEAAPAECPQCHVGAEKFKEMSGEREWAAEHVVGVAKGVSEDIVAD LRANFEGECSEVGMYLAMARVAHREGYPEIGLYWEKAAWEEAEHAAKFAELLGEVVTDST KKNLEMRVEAENGATAGKFDLAKRAKAANLDAIHDTVHEMARDEARHGKAFEGLLNRYFG >gi|330402193|gb|ADLB01000022.1| GENE 72 64131 - 64721 861 196 aa, chain - ## HITS:1 COG:no KEGG:CLH_1589 NR:ns ## KEGG: CLH_1589 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_E3 # Pathway: not_defined # 1 196 1 196 196 264 68.0 1e-69 MNKKKSTLMKISQTAVMAALCFVVFTFLQIKIPTPGGDASSLHLGNAVCVLAALMLGGVY GGLAGAIGMGIADIMDPIYITVAPKTFILKLCIGLITGFVAHRVAKINQSTDKKYILKWS TLAAAAGLGFNIIADPLVSYFYKMIIFGQPQKMAEVLAKLSAVTTFVNAVVSIILVAFLY NALRPVLIRSGVIVKE >gi|330402193|gb|ADLB01000022.1| GENE 73 64812 - 66398 1684 528 aa, chain - ## HITS:1 COG:BH2384 KEGG:ns NR:ns ## COG: BH2384 COG0513 # Protein_GI_number: 15614947 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Bacillus halodurans # 4 528 6 527 539 476 48.0 1e-134 MENIKFEQLGICPEIQKAVKYMGFEEASPIQAKAIPVILEGKDIIGQAQTGTGKTAAFGI PLLQKVNPKNKKLQGIVLCPTRELAIQVADEIRNLAKYMHGIKVLPIYGGQEIVKQIRSL KNGTQIIIGTPGRVMDHMRRKTIKMEQVHTVVLDEADEMLNMGFREDIETILEGVPEERQ TVLFSATMPKAIMEITKKFQKKAEVIKVTKKELTVPNIEQFYYEVKPKNKEEVLARLLDI YTPKLSVVFCNTKKQVDLLVTSLLGRGYFAAGLHGDMKQVQRDRVMQGFRSGKTDILVAT DVAARGIDVDEVEAVFNYDLPQDDEYYVHRIGRTGRAGRVGRSFSFVSGKEVYKLKEIQR YCKTKIYAQKVPSLNDVATTKMENILDDIDQIIVNEDLTLMINAIQERVNDSDYTAMDMA AAFLKMCTGMTSEEEQEEMDFGDTGAEEPGMVRLFINIGKKQKAKPGDILGALAGETGMP GKLIGTIDMFDKYTFVEVPREYAKDVLLAMKNVKIKGKTIVIEPANQK >gi|330402193|gb|ADLB01000022.1| GENE 74 66642 - 67304 636 220 aa, chain + ## HITS:1 COG:CAC0882 KEGG:ns NR:ns ## COG: CAC0882 COG1272 # Protein_GI_number: 15894169 # Func_class: R General function 
prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Clostridium acetobutylicum # 7 219 4 214 214 176 46.0 3e-44 MLMKAKKLKDPASAITHFIGMVAAIIAAAPLLFRAWHKPDKIYFVSFAVYIISVILLYTA STVYHSLDISEKINKRLKKFDHMMISVLIAGSYTPICLIALKGKVGYTLLAIVWGIAILG ILIKAFWIYCPKWFSSVLYIAMGWTCVLAFPQIYHTLSTPAFIWLLVGGIIYTIGGVIYG LKLPIFNKKHHNFGSHEIFHLFVMGGTACHFVVMYVYLLG >gi|330402193|gb|ADLB01000022.1| GENE 75 67332 - 67583 236 83 aa, chain - ## HITS:1 COG:VNG2274C KEGG:ns NR:ns ## COG: VNG2274C COG2827 # Protein_GI_number: 15791086 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease containing a URI domain # Organism: Halobacterium sp. NRC-1 # 3 77 2 76 77 84 56.0 5e-17 MENYTYIVRCSDGTLYTGWTNDLEKRIQSHNEGKGAKYTKTRTPVTLVYYETFPTKQEAM KREYAIKQMRRKEKEKLIDGRSV >gi|330402193|gb|ADLB01000022.1| GENE 76 67586 - 67864 264 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291171378|ref|ZP_06572553.1| ## NR: gi|291171378|ref|ZP_06572553.1| conserved hypothetical protein [Filifactor alocis ATCC 35896] # 17 92 7 82 82 75 51.0 9e-13 MDEKDYISMGLNGTAPLKLILCGEVEEAESRKTGVVSVVYATENKEIAEEKIKELSEKYP EKFYMVYSVPLNVDLTTLSHYPSIEISQEDLQ >gi|330402193|gb|ADLB01000022.1| GENE 77 67961 - 68704 719 247 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0243 NR:ns ## KEGG: EUBREC_0243 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 70 247 51 224 225 143 42.0 5e-33 MKNNLRPSKRGHGLAIGAVICFVAVIGMVGTYTINQYKAGVKKELAKVEKKQKEEKIEST GGGQDLVINTPQKEQNVIEEQPKVTEQVETVEEQPEVVAEQPVKSQFSFSEKDLILWPIE NGEVILNYSMDKTVYFPTLDQYKYNPAMIIKGAAGTQVMSVARGVVKSIDVTSQTGTTVT VDMGNGYEAIYGQLKEVPVHVGDTVEAKTVLGYLSEPTKYYSVEGCNLYFEMRKDGQPIN PNDFLGE >gi|330402193|gb|ADLB01000022.1| GENE 78 68747 - 69562 635 271 aa, chain - ## HITS:1 COG:BH3748 KEGG:ns NR:ns ## COG: BH3748 COG2385 # Protein_GI_number: 15616310 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Sporulation protein and related proteins # Organism: Bacillus 
halodurans # 8 264 69 325 336 124 32.0 1e-28 MEKKVLWDEYLTGVVAKEIQPDYHEEMLKVQTIITRTNLYRKIEEKKDVVFSGEYMTRGD MEKLWGKDNGRKIYKKLKKAVSDTEGKAILYKGKPASASFHQLSNGKTRNGNEVLGGENY PYLKTKDCPKDIENKWQLQTLELSYEEVKEYCQPFLAAVDKKEAEKPFTKEDFEIKQTDT AGYVTKVRIGKHIYAGEEFRNALELPSSCFSFQESNGMLKITTKGVGHGIGLSQHTANEM AKEGKTYEEILQYFFEGTEIREVAEILLNIE >gi|330402193|gb|ADLB01000022.1| GENE 79 69786 - 70451 753 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154505671|ref|ZP_02042409.1| ## NR: gi|154505671|ref|ZP_02042409.1| hypothetical protein RUMGNA_03210 [Ruminococcus gnavus ATCC 29149] # 2 208 14 215 318 139 36.0 2e-31 MYYNETDTVAVTVLAIYLVVLGLCLLGWIISFIFRGIGMYKMGKAQGKTNSWLAFIPFAR TYFHGELSGEIPLKKRSIKSPGGWLLIVPIIYGVIFAVMYFFMIISILISAISAESRMRD YMGYHVPNSEMSGLLMVFVVFLVFVIIISVIYAAIKGGLEILINRQIYERYTTVNMATLH AVFSMIIPFYESVCMFIFGRRAEQNTKENMAENQLTIEEEE >gi|330402193|gb|ADLB01000022.1| GENE 80 70464 - 72926 2532 820 aa, chain - ## HITS:1 COG:CAC1664 KEGG:ns NR:ns ## COG: CAC1664 COG0058 # Protein_GI_number: 15894941 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Clostridium acetobutylicum # 7 818 4 807 812 824 53.0 0 MRQEKFNKEEFQKTVKNNVKRLYRRTIDEASQQQLFQAVSYAVKDEIIDRWLLTQEQYKK DDPKTVYYMSMEFLMGRALGNNIINLTAYKEVAEALDEMGIDLNVIEDQEPDAALGNGGL GRLAACFLDSLASLGYAAYGCGIRYHYGMFKQKIENGFQVETPDDWLKEGNPFEIRREEY AKVVRFGGHIRINYNEKTKRSEFIQEDYESVLAIPYDMPIVGYNNNIVNTLRIWDAKAIT DFHLDSFDRGEYQKAVEQENLAKTIVEVLYPNDNHYAGKELRLKQQYFFISASLQEAIEK YLREHDDVRKFHEKVTIQMNDTHPTVAVAELMRLLMDEQGLEWDEAWEVTTKTCAYTNHT IMAEALEKWPIDLFSRLLPRVYQIVEEINRRFVAQIRAKYPGNEEKVRKMAILYDGQVKM AHLAIVAGYSVNGVARLHTEILKHEELKDFYEMMPEKFNNKTNGITQRRFLLHANPLLAG WVTKHIGDGWITDLSQMAKLKPLADDVKEREKFMDIKFRNKERLAKYILEHNGIEVDPRS IFDVQVKRLHEYKRQLLNILHIMYLYNKIKEHPELSFYPRTFIFGAKAAAGYKRAKQTIK LINSVADVINNDRSINGKIKVVFIEDYRVSNAELIFAGADVSEQISTASKEASGTGNMKF MLNGAVTLGTMDGANVEIVEEVGSENAFIFGLSSDEVIQYENYGGYNPVDIYNSDWEIKR VVDQLVDGTYANGDHEMYRDLYNSLLNTQSSDRADTYFILKDFRSYAAAQEEVEKAYRDV DRWSKMALLNTASCGKFTSDRTIQEYVDEIWKLDKVTIDV 
>gi|330402193|gb|ADLB01000022.1| GENE 81 72937 - 76044 3768 1035 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 1035 1 1033 1035 1239 56.0 0 MYQKVSTDLKFVDREKKIEKFWEENDIFKKSMENRKEGETYTFYDGPPTANGKPHIGHVL TRVIKDMIPRYRTMKGYMVPRKAGWDTHGLPVELEVEKLLGLDGKEQIEEYGLEPFISHC KESVWKYKGMWEDFSATVGFWADMEHPYVTYDNNFIESEWWALKQIWDKGLLYKGYKIVP YCPRCGTPLSSHEVAQGYKDVKEKSAIVRFKVKDEDAYILAWTTTPWTLPSNVALCVNPE ETYVKVKNDGYTYYMAEALLDSVLEGDVEVVERYMGKELEGKEYEPLFPFVELNKKAYYV TCDTYVTLTDGTGVVHIAPAFGEDDAQVGRKYDLPFLQLVDEKGEMAEETPWKGVFCKKA DPFVLEDLEKRGLLFSAPNFEHSYPHCWRCDTPLIYYARESWFIKMTAVKDDLIRNNNTI NWIPESIGKGRFGDWLENVQDWGISRNRYWGTPLNVWECECGHMHSIGSIAELKEMSDNC PDDIELHRPFIDKVTIKCPHCGKEMHRVPEVIDCWFDSGAMPFAQHHYPFENKELFEQQF PANFISEAVDQTRGWFYSLLAISTLVFNEAPYKNVIVLGHVQDEKGQKMSKSKGNAVDPF EALEQYGADAIRWYFYINSAPWLPNRFHGKIVTEGQRKFMGTLWNTYAFFVLYANIDEFD ATKYTLEYDKLTVMDKWLLSKLNTLIKTVDNNLENYRIPESARALQEFVDDMSNWYVRRS RERFWAKGMEQDKINAYMTLYTALVTVAKVAAPMIPFMTEDIYQNLVRSIDTNAPESIHL CDFPAANEEWIDTELERNMEEVLEIVVMGRACRNSANIKNRQPIGTMFVKAEEELSEFYK EIVAEELNVKEVKFTDDVRDFTSYTFKPQLKTVGPKYGKLLGGIRTALSELDGNKAMDEL NEKEELKLDINGEEVVLSREDLLIETAQMEGYVSESDNGITVVLDTNLTEELLQEGFVRE IISKIQTMRKEADFEVTDKIYVTYEGSEKAESIFAQYGSQIGEEVLALSVEKKTPAGYTK EWNINGEKVTMGVEK >gi|330402193|gb|ADLB01000022.1| GENE 82 76255 - 76335 146 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTKRVELFQKAKRVGGGENPMRVGIV >gi|330402193|gb|ADLB01000022.1| GENE 83 76500 - 77102 464 200 aa, chain + ## HITS:1 COG:no KEGG:BDI_1616 NR:ns ## KEGG: BDI_1616 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 194 1 200 215 145 38.0 1e-33 MDKKIITIGRQFGSNGRIIGRELANRLGINCYDKGLIKLAAEHTDIPYEQLKLVDEKKEK PWQYQVDIDDNLDKQYRYGHIDEVLFDLQSKVIQDLASKENCIFVGRCADYVLRNEKHCK NIYLFAPTDFRVKTVMERYGIPEKKACTLVKKVDKDRSYYYNFYTDKDWHDAKSYHLAID TSTFTLKEVVDILELIYNNL 
>gi|330402193|gb|ADLB01000022.1| GENE 84 77145 - 78416 1273 423 aa, chain - ## HITS:1 COG:CAC1267 KEGG:ns NR:ns ## COG: CAC1267 COG1686 # Protein_GI_number: 15894549 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine carboxypeptidase # Organism: Clostridium acetobutylicum # 48 312 17 290 425 186 39.0 1e-46 MRNRIKRGIGILLSMVLIFQCGGYVQATEQMTPEQTKEAEIQASYNKTIESNAIQNWPQG PQVYGDSAIVMDMKTGAILYAKGIDEERYPASITKILTALVAIENSQMTDMVKFSEESIQ SLEPGYAHIAMKAGEEITMKDALHALMLASANEVAYAIGETVGGTHENFIKMMNDKAKEL GCTHTHFINTNGMFDEQHYTSARDMALIARAAFSHQELLDIVQTLQYTIPPTNMESEPRI FQQKHKMLLNGKYHDNRCIGGKTGYTEKAYNTLVTVMEQGDMEIVAVIMRSRRDTFEDTK KICDYAFNNFREVNISANEKSDKVSDIDGEACVTLPKNIEFNQLSRQYEKDKVNYCYEGQ FVGETKAKIKEVKKETKKEPETKKSDTFLMKTIIAVILLLIIGFTVLLILNTIRRKKRKR NRS >gi|330402193|gb|ADLB01000022.1| GENE 85 78488 - 79105 681 205 aa, chain - ## HITS:1 COG:CAC0885_1 KEGG:ns NR:ns ## COG: CAC0885_1 COG1145 # Protein_GI_number: 15894172 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Clostridium acetobutylicum # 1 81 1 84 115 101 58.0 1e-21 MIRKIIKINEEKCNGCGACASACHEGAIEMVDGKAKLMREDYCDGLGDCLPACPVDAIRF EEREAPAYNEAAVMEAKRKKEHGAGQLSQWPVQIKLVPVNAPYFDNANLLVAADCTAYAY GNFHNEFIKNRITLIGCPKLDEGNYTEKLTEIIANNNLKSVTVVRMEVPCCGGIENAVKN ALQKSGKFIPWRVVVISTDGKIIED >gi|330402193|gb|ADLB01000022.1| GENE 86 79193 - 79861 229 222 aa, chain + ## HITS:1 COG:CAC0884 KEGG:ns NR:ns ## COG: CAC0884 COG0664 # Protein_GI_number: 15894171 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Clostridium acetobutylicum # 2 220 3 217 229 98 29.0 1e-20 MKEYLTVLKKSQLFSGIRENEIELIFDCLKAKLRTYKKGEFVFRQGEYIHYITILVEGTL HIQKDDFWGNRSIVNQINAGEMFGEAYAAPGSEPLLNDVIAIEESKVMFLDIQKILTVCS SACTFHSLLIQNLFYAISAKNRGLVQKLGHMSKRTTKEKLLSYLSEQAKKHKSAAFDVPF NRQQLADFLSVDRSAMSNELCKLRDEGLLSFSRNHFVLLEKG >gi|330402193|gb|ADLB01000022.1| GENE 87 80048 - 81253 876 401 aa, chain - ## 
HITS:1 COG:no KEGG:EUBELI_20462 NR:ns ## KEGG: EUBELI_20462 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 59 385 53 386 394 157 34.0 6e-37 MSKAQDLKKVRDNNGLGIRTREFIIDNKEETIQSYYENILSAEEINKSVEEYRVNMDIVS KEFKERTKLQDKDIAFLTVATMLQCARIYLVNYFTKIEKAGSGNKKEAFLHEKQEKILGK FGFEENEHGRLYYAPLKQIVLGRGVPYDATAYFSENYKLFEGANHRFSTLGHDPLFGLIF GTTNILTNTITCNQKIVLKTNHVIYDENMKNPKISRYASTPKMFESVKDRIINEKEAVVA AVIKQLIHIGTDLYTPCGIQLPGASLILDKANVEKLTKYVSTGDVIKFGASAAIDCLINM IIRIVHSCKWLYEEDGDFSKELYEIRTRKIILYSNLFATSSNVIYTGLTGNMKDFDFAGF AITAYRLFSDTNFMDKIKYEFLNSQVSKVYEEKLGEDSLYY >gi|330402193|gb|ADLB01000022.1| GENE 88 81246 - 81896 749 216 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGYLGKALLGAAIGVGTVAAIPVTGGGSVLAGTSLIASLTGAGAIASAAGVGGAVTGAVV QGVENIKKEKDIKKAKSASFQDGMNEGKAKTVDELKKVYDFYLATTALSYYIAKCDGEIS EEEQLEIDFDLDAIKKNADIPDAIKREMDRIARNNYLRFDDVEIYLDKISIENLQGLEND VQEIIEANGHISEEEEFAKNLFLNYIEQRKRCENYE >gi|330402193|gb|ADLB01000022.1| GENE 89 82132 - 82305 161 57 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSQLQYTRKLYSFFKGKDAVTDENWNKVLFTIEMKKVDLIDRFESFGRKKNLIENEI >gi|330402193|gb|ADLB01000022.1| GENE 90 82578 - 83762 583 394 aa, chain + ## HITS:1 COG:FN1676 KEGG:ns NR:ns ## COG: FN1676 COG3547 # Protein_GI_number: 19704997 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Fusobacterium nucleatum # 1 388 1 385 391 193 32.0 6e-49 MILIGIDIGKNQHTFSIVNKETGEILSSPSLFNNNQEGIQSLIRKLSSFAKSELLIGMED TGHYHFPLLKYLLDRRYTVALINPTTTDLTRKLQGGITKNDPLDSLTICDVIGSNQRKKP YRITKVNRFDLYEQKQLTRHHHNLKEELNIYKNRLQKCIDIVFPEFNSLFRSKYGIVYMN VLKIFASAEKIANSDIRTIRKCFEYNGRGKRIQLSAEQLKTTAKGSVGIPSVAEEIQIRH LVSQIEMIEEQLSEIDKKIEEFSLQNNSPILSIPGISHFSGTSILAELGDICNYTKASQI IKFAGVAPYHYESSQFQAQHTAITKKGSRYLRKTLYQIILPVINHNKVFTAYYNKKLAEG KGHRCAQGHCVRKLLRIIYHLLSTGQSFDPALLI >gi|330402193|gb|ADLB01000022.1| GENE 91 83907 - 84089 112 60 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167760939|ref|ZP_02433066.1| ## NR: gi|167760939|ref|ZP_02433066.1| 
hypothetical protein CLOSCI_03327 [Clostridium scindens ATCC 35704] # 1 57 1 57 72 94 87.0 1e-18 MANDNHLLSVPCYNGTMLKQDYYNEFFELGQQKINFSFFELCLPDDDPVYTLKKVMELTI >gi|330402193|gb|ADLB01000022.1| GENE 92 84349 - 84438 79 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGCYKSHSIIVWNYGSNFTHKKITAPHSF >gi|330402193|gb|ADLB01000022.1| GENE 93 84491 - 84748 418 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKILVYMTAIMLVVSAYINPTMVQASDLQTEIGYTTSTVAQIPSGNKTPVTGDVSNLNG ILIVLGISTGIISILLIKRMKEELE >gi|330402193|gb|ADLB01000022.1| GENE 94 84761 - 88048 1819 1095 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKHKKVKALLSAALVLSLVGGNNLSYVGAKNQQDIENTLKIEEQTNNAEQESEAEKEIPK EENEIPKEENDQMQSTPEESVEKDHIEDVEVIPAMPETESEKAETLEKNPEKVSKYGDEI PEITDRAVLMEGRNNTVGNYDDVCLGGGNDKPTKKTYNQNPKVRNPITGNYYWRYDNPTY PFEPYYSEQRLIDKIWPKNIGDANNHGYRLPDAVIDPDSQCFYANRENGFGESLGNNIGN ERYYVKYNSKWNNPVRKSHTPNPLGSATIGGKPISPPVFTDLSYTGNWRWQQGGTNRPGG YLFTIVTRPILKMGSATYYYDKEAKKSYPQFYADGKSHPRVYDNTEDVSSLVNPEEALFK FEGYRGNAKDKVFYSGPGGGYIYDGKGYYNAAPRDPGWYNITIQSKPREDMFMLPGVTYE TAVYMKKGYLVDFVQTIDRKSEQPASWRPSDEPTAFDADTKNMFGVDMYLEGSENKIPIP QYDESKFIFEGWDVEEQYWEEPTSQHPHGRILTKTKSLDRSDEIFIYRPSAIGRNQYFVL QAKLIARLKTRKTDTINPVTKVIDGETIQGGIAVSPSNITVTEEIPNAQMFEASVNNGYA YEFVGWSYNENGSHSDLGVTKTYKPNPDFTQQTEVHKTSTLYATFKPKEYTITFDGNGGT DSTKSSTTSTQTAKYYNDVQLAKNPFVREGFEFKGWSKMKNGSVDFADQGNGRNIPLNGL DKAPNNDIPTEITLYAVWERNIAEVDIEKYKDTLWIEGASQNAMSNNIYVHKSSQGYNAY EITRFYVNVDVMGAGNVPDKLKKDKFYVVWERSIDNGNSWEKIPLSVYNKFFTAPFTNNR QVGYSKVVYDEQKGRWFVPLLVRANQQPTQANVGGQYRVNVAYDDPQATKRVASEQDFFD GTGNIGWTTSEPTEIKMISTFDTVIKVPSTITLAETTEVSQSGEKTDVIKSVPYVNEVTV NPVRHTIDGKTDYDWHTPNTAMDGKTTQSNSYNGGQYTEYTKQKPFKVSVNWDKTLTDTT KQYTVNTVEMYSAENIGTMKKDQKITTGTQSTFSYDGQESGKKLFSFYLKGEKSKHLPEG VLFKGQITFVIDNVR >gi|330402193|gb|ADLB01000022.1| GENE 95 88186 - 89202 803 338 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167765782|ref|ZP_02437835.1| ## NR: gi|167765782|ref|ZP_02437835.1| hypothetical protein CLOSS21_00273 [Clostridium sp. 
SS2/1] # 97 337 1 224 239 219 46.0 3e-55 MIDVNVSISGILMDCDESVCALQLGNGYKIEKCNLDSLFFKNRITNGRGYLGTDYFGTQI KEGNETYFICVTKDEVMQIESPWIHEFSRFETDEKELCKTRIKKYTEKEIDYLYEQIDLL RIFRPGNIGLKDVFFQYSFTVLDHVTNTIEHRSHNQARNTVAGGYFKLDKAEIVLCNRWM HNFSRIPYILMKSCIDEFSWGLEQIDCINGFKQYIKTLKMILLRDEHIGENLLLARRISL LLGNTESGVQLIYQNTMDILEYYAQSLSESKGATVLENISENYSKNVLESVLKNELHKLE NITREVVKNCLIRCKAEHAMNRSITWNEIKERIINELA >gi|330402193|gb|ADLB01000022.1| GENE 96 89233 - 89754 460 173 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160914689|ref|ZP_02076903.1| ## NR: gi|160914689|ref|ZP_02076903.1| hypothetical protein EUBDOL_00696 [Eubacterium dolichum DSM 3991] # 62 170 43 152 156 101 40.0 2e-20 MKKSPKKWGILLLAILVIAVCSILVIAAMKFDPAKKNTKQDAVKEEVVNNENNEKSSDNQ VELVSVNLDQIKEKIESKESFFVVITNQECPYCEELYEILDEYKPNEKVQLYDLNFEDFS YDREKVKEVFPSFVGTPNMFYVEDGTIISQYDNMDGDLTVDLFGRWIEKYYKK >gi|330402193|gb|ADLB01000022.1| GENE 97 89771 - 90286 681 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160914690|ref|ZP_02076904.1| ## NR: gi|160914690|ref|ZP_02076904.1| hypothetical protein EUBDOL_00697 [Eubacterium dolichum DSM 3991] # 6 165 17 176 180 169 53.0 4e-41 MIMMFMSGCAKNDYEGTELEEVNGKKFVIDELELEVKSADIKNEVKPNNPQGYYPHYKEQ EGYHYIVLNGTLKNLSDKVYQMDRLRVMGEADEEMTQAKLVLINEIESYFWDEIGPGVSL DFYLFAIVDKDTDKIDNFYFYYDDDHKIDKEQKEFDYKVKYVVPEGLESNE >gi|330402193|gb|ADLB01000022.1| GENE 98 90322 - 91083 662 253 aa, chain - ## HITS:1 COG:no KEGG:Elen_2849 NR:ns ## KEGG: Elen_2849 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 132 246 47 163 170 87 36.0 4e-16 MRYEVRKLKDCVKLWKDKNRKRWPYLTKKEAEVQGIDLNNVFADVGSEKDYNGDCLFYAG RKLVLTDKREELKKSEKIYGYIPCMDKYGEEGYVRIVKKRKKWLALLLLLLLLIPLMFGL KYFIDQNKKVDLDEAAISYQMPNGVKNENPEEIMMPVFGELKMVEGTTEITAGLANPEGN PCYFKYSIVLKEGNKVLYESKWIEPGTAVVELNISEKLPKGSHPILIKIDTGTLADPEVA MNGGEIESVLKVE >gi|330402193|gb|ADLB01000022.1| GENE 99 91085 - 91858 486 257 aa, chain - ## HITS:1 COG:no KEGG:Elen_2849 NR:ns ## KEGG: Elen_2849 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta 
# Pathway: not_defined # 131 247 44 163 170 97 37.0 5e-19 MIYKLLENENTKEFWIDSKKQFWPYISAKHMQELYPNGGYMTVGDVKKKIKKDSPSVLVG NGQKIPVLPFQPKLKSGEKCVGYIGVESEESAHTYIRIIEKKKGWLFLLPLLLLLLGLLI LLGLKLMDDKGPNLDNSAISYHIEGLENKDDSNISIPIFGKITVDSETMESEIHLANPKG NPCYFQYKIIMKDSKEELFKSGLIEPGTAIPKVSINQQLTPGEYPAMIIVNTFALEDHTV AMNGGEMDVTLVVEEEN >gi|330402193|gb|ADLB01000022.1| GENE 100 91960 - 92556 746 198 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKNITKLGAMALVGIMGMGLAAVPVSAAPTNASKTTDVFYTTSSAAIDAEGKVVMVVPA RVDLTKEVPKKEIDIVMQTSNKDDKLPENFSATVKVLSLNKGKLKEVTDKGPGTKEYEYE LQKDGQKINLAQEGEFHTFTVGKPVPPAGVNAIVQKATITAEKSVDKMEKAEKPGTRFND TLTFKVTDLEGNGLTPKA >gi|330402193|gb|ADLB01000022.1| GENE 101 92683 - 92970 134 95 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSNKKIRSFVLTFLLLLALSSPVYAASGSNQTEVGYTTSTIATIPSGKPTIPGGTPTTGD DIYIKKYIIALGISISAITCIILLKRKEDEKRKRG >gi|330402193|gb|ADLB01000022.1| GENE 102 92981 - 96268 2764 1095 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIKKKLAKVLLSAALVLSILGNDFFYVGAGNPQEDTTSEVAEGVTNDVVENEQENEIQAE EVQPEENIVPDTPLPEVETEGGQVDSEVTVPDSEPVIGESGEVLEEAVAILSDEGEEEIP EITDEDLLRNQNSRYDDRVNYNDVCQGGGYGNKGPSNPNQTNKPNQNPDSDWKYWDYDTN SYLGHYSEWKLIDKYWPRNKGEAVHHGYRLNDAVIDPKDRCYYANPENGWGESVLNSPDF YYRYDKDWANGSGQKITPRPSGSDSEGEKRVDVNDIRYCGNYKWTNAKGGHLFTIVTRPV LEKANFGHQHWWSGKEWVRDIGQYYIDGNSYPRVKDRTGDLEKELDPRGSMYWFKQVQDY KGTGMPGGQEFKGIPDKKWGQSPNKPGWYNVRIQTPVDQKRFILPGQWSEVVMYLNKGYK IKFQKQFDGNTLQDISDNDWKPGRRPDSVDEQTKNDFNIDMYLHKREYKIPEPAYDSNKY IFEGWEAREKKWVPGDPNDPKDSGKMITITVPIDSKDSVGNYLYTPSEITDEQFFVLDST IVAKFKTRKTNTINVKLNFIEDESGLNEDEKATLDKNSIRLVEQIDNNEEFMVTLKDKPF DFLGYSRKPDGTDLLSTNLTWKPGEKNSQLLPDNQLETHVSTDVKATIRPKKIKIHLDAN GGELAPNQSQPTDIETYYYKSPRIHTHAFTHKEGLVLKGWAETQNGDVVVLNGADYKIKD EKVGSGYPSEKILYAVWGKATATVHATKHKDVLFYPTADEKMSATDIHTPEKERLNAYEL GRFYFDVKPANHNIGTEQSKLDASKFYIVFEHSDDGKTWKKLEINDTSKAIITKGFSKSA TGPYAKVIYDKNKQKWFAPLLGRVEMMNDENAYKGFYRVNVAYDDDNALEEVSTEQEFVK EGKNKGWSESDRLQLRVVQNASAFINVPSSITLEEKTVVDGGSGTSKEIIESLHKSNKVM VEAFEHEYEKKTDYDWITTNNALVNGKTGSYNEQQHEEFIKNKPFYVSMSWNKTLADSTG 
RYQVNNIEMYSASNIGGMQTDQIIQPGTKRSFTYDGTNSDKTLFDFYLKGDKPKGLPEGL QLKGTITFTVSPVAQ >gi|330402193|gb|ADLB01000022.1| GENE 103 96631 - 97029 428 132 aa, chain - ## HITS:1 COG:FN0893 KEGG:ns NR:ns ## COG: FN0893 COG1959 # Protein_GI_number: 19704228 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 132 1 133 140 58 25.0 3e-09 MQLSVTTDYGIRIVLYLIKNNEVTSSAILSEKLDIPKTYVLKVTKKLEAANIVACYQGVN GGVNLLPRPEEISLWDVVIATETTTAINKCLDTDGCCNRHAEDTCKVREVYLVLQKAVEE RMQNIKLIDLAE >gi|330402193|gb|ADLB01000022.1| GENE 104 97086 - 97232 72 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTLTRQIKTFCGEHKTRQPDAIKLPNKLSVYNSIYILVYTNPELSDFD >gi|330402193|gb|ADLB01000022.1| GENE 105 97419 - 97814 343 131 aa, chain - ## HITS:1 COG:BH0656 KEGG:ns NR:ns ## COG: BH0656 COG1959 # Protein_GI_number: 15613219 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Bacillus halodurans # 1 131 1 126 132 63 33.0 7e-11 MQLSATTDYGIRIVLYLIKNNEIISSGVLAEQLEIPKTYVLKVTKKLELANIVTSYQGVN GGIGLAGDPAQITLWDVISAVESTTVINKCMEQENEYDVKAEDYCKLRKVYGLLQQAVED RLQSIKLIDLT >gi|330402193|gb|ADLB01000022.1| GENE 106 98172 - 100070 1759 632 aa, chain - ## HITS:1 COG:no KEGG:SP_1654 NR:ns ## KEGG: SP_1654 # Name: not_defined # Def: hypothetical protein # Organism: S.pneumoniae # Pathway: not_defined # 4 485 296 752 803 401 44.0 1e-110 MTKAAEKSYEQLKEAHINDYQQLFDRVKLDLGGDCPNIPTDQLMKNYRNGDYQIAVEEMV YQFGRYLTIAGSRQGDDLPTNLAGIWCIGDPAWGGDFHFNVNVQMNYWPAYTTNLMECGT VFNDFMESLVIPGRVTAEKSAGVKTENHASTPIGQGNGFLVHTQNNPFGCTAPFGSQEYG WNIGGSSWAMQNVYDYYLFTGDKKTLEESIYPMLKEMATFWNQFLWWSEEQNRLVVGPSV SAEQGPTVNGTTYDQSMVWELYKMAIEASEILGVDEDEREVWKEKQSQLNPIIIGEQKQV KEWYEETSLGKGQAGNLPETDIPNFGAGGNANQGAVHRHTSQLIGLFPGTLINKDTKEWM DAAIKSLEQRSLNGTGWSKAMKINMYARTGLADDTYAMVRAMCAGNTNGILDNLLDSHPP FQIDGNYGLTAGITEMLLQSQLGYTQFLPALPSAWPNGSVEGIKSRGNFTIGETWKEGKA KNFTVCYEGPEGSSTFTGNYEGIADAAVLENGKKIETVIDAENHRLSFEAEKGKVYTIDL SGEAITDPEQPEQKPGGGTEGETEGSGDANKPENKPGNKPDKKPEIHHGKEQIGTPKTGD 
EVGIALPISMMFAIAAGSCVILIAKKKRNKNM >gi|330402193|gb|ADLB01000022.1| GENE 107 100090 - 101049 662 319 aa, chain - ## HITS:1 COG:BH0361 KEGG:ns NR:ns ## COG: BH0361 COG4932 # Protein_GI_number: 15612924 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted outer membrane protein # Organism: Bacillus halodurans # 29 311 1197 1427 1661 63 28.0 4e-10 MQFEEAKGNLDIIKSSANAEITNGNSNYSLQGAVYGLYQGGKEVGRATTDKNGKISFTNL KKGNYTLKELTAPKGYALDITTYNVTVESGKTVTKRVTDYPQSDPVTILLGKVDAETNAN KPQGTATLEGAEFTVKYYATQSSTDPATSGHKAVRTWVLKTNEDGKTAFVDRLQVSGDEF YKDSTGANTLPLGTITIQETKAPKGYLLNSEVFVRQITSKGTAEGVQTYNMPTVPEIVQK GIIELQKVDSETQKNDAQGAASLEGAVYDIFLKSEYQAGDNTESASYRQTLTTDEEGKAK STELPLESFLFHLFQEQYL >gi|330402193|gb|ADLB01000022.1| GENE 108 101108 - 101203 64 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTVKVALVISDKNYRASPSGKGSIRKVQTEI >gi|330402193|gb|ADLB01000022.1| GENE 109 101210 - 101506 331 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253581110|ref|ZP_04858370.1| ## NR: gi|253581110|ref|ZP_04858370.1| conserved hypothetical protein [Ruminococcus sp. 
5_1_39B_FAA] # 1 97 1 97 98 70 50.0 3e-11 MGILKVAGKVLLIPLWFIISIIGAGVKLLVHMVAVAKKILGFGIMALFIGTVICYQDWLQ AAFLACMGGVLIFILFAGEFIDTVIDLLREKICTLILG >gi|330402193|gb|ADLB01000022.1| GENE 110 101517 - 102515 725 332 aa, chain - ## HITS:1 COG:no KEGG:Amet_3992 NR:ns ## KEGG: Amet_3992 # Name: not_defined # Def: replication initiator A domain-containing protein # Organism: A.metalliredigens # Pathway: not_defined # 3 318 2 319 319 238 40.0 2e-61 MDKKMNFKYFYGTEADQFSFYRIPKALFTNDCFKDLSSDAKILYGLMLDRMSLSIKNQWF DEENRAYIYFSIEDIMELLNCGRNKAVKSLQELDDEKGIGLIEKRRQGFGKVTIIYVKSF VQEECEEQKKEKSKMVKFINQTSVEEEETEEVYISNFKKSQKQTSRSPENKLQEVYISNS NSTNINNTNLSENKSNHIVSADGIGSEEDEMETLHAYQSLIKENLDYDSLLVSHPHDKNQ IDEIVDLIVETVMCKSDKVLIASNWYSGALVRGKFMKLDYSHVEYVLHCLEGNTSKIKNI KKYLLAALFNAPSTISGYYRTEVNHDMPWLAR >gi|330402193|gb|ADLB01000022.1| GENE 111 102605 - 103513 714 302 aa, chain - ## HITS:1 COG:MT4036 KEGG:ns NR:ns ## COG: MT4036 COG1475 # Protein_GI_number: 15843551 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mycobacterium tuberculosis CDC1551 # 28 185 62 203 344 83 37.0 4e-16 MKSRSGEKIKLTSIDELLGVVNEESAMEIEINRIHAFKDHPFKVLDDEKMADLVESVKTY GVLTPVLLRSDGENGYEMISGHRRMHAAVIVGLATIPAIVRELSDDDAVIAMVDANIQRE ELLPSEKAFAYKMKLDAMKRQAGRPSQKNSGQNDQNFGKVSRDVLAEEVVESSKQIQRYI RLTELIPELLDMVDAKKLNFTIAVDISYIEKEIQKWIYEYIRDTGFVKPKQITALRKQLE EGSVNQGFTISIFNSCIAVKAPERKVVLSGKKLTKYFPEDYSETDMEKVIEALLEQWKRE QE >gi|330402193|gb|ADLB01000022.1| GENE 112 103503 - 103721 274 72 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3612 NR:ns ## KEGG: EUBREC_3612 # Name: not_defined # Def: chromosome partitioning protein ParA # Organism: E.rectale # Pathway: not_defined # 1 70 190 259 261 68 47.0 6e-11 MVDFRTNYAKDIASRVRETYGSKISIFENVIPLSVKVAEASAEGKSIYCHCPNGKVSMAY ENLTQEVLENEK >gi|330402193|gb|ADLB01000022.1| GENE 113 103737 - 104288 639 183 aa, chain - ## HITS:1 COG:BH4058 KEGG:ns NR:ns ## COG: BH4058 COG1192 # Protein_GI_number: 15616620 # Func_class: D Cell cycle control, cell division, chromosome 
partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus halodurans # 1 183 1 178 253 151 43.0 5e-37 MCRVISVANQKGGVAKSTTTLNLGVGLARQEKKVLLIDADPQGSLTASLGYVEPDDIGTT LATIMMNIINDEEIAEEEGILHHEEQVDLLPANIELSALEVTMSNVMSRELIMKEYIDTM RSRYDYILIDCMPSLGMMTINALVASDTVLIPVQAAYLPVKGLQQLIRTISMVKKRLNRK LTI >gi|330402193|gb|ADLB01000022.1| GENE 114 104304 - 105179 887 291 aa, chain - ## HITS:1 COG:SA2498 KEGG:ns NR:ns ## COG: SA2498 COG1475 # Protein_GI_number: 15928294 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Staphylococcus aureus N315 # 9 207 21 202 279 76 30.0 5e-14 MTEVTKEFIQKNEDEKIIEIEIERLRSFKNHPFQVKDDNEMHLLKESIEKYGILTPLIVR PVPDGVYEIIAGHRRRHAAELLGYRKVPVIIRVMNEDEAILNMVDSNLHREKISFSEKAF AYKMKNDVLKRKSGRKKGQIDHETKKKRTVEIISEECGDSPKQVQRYISLTKLIPEFLQK LDDELISFNPAVEISALKEEQQKQLLEAMDYAQAVPSLSQAQRIKKLSKENQLTLEKMQE IMSEIKKGEITRVAFTNEQLHKYFPSRYTPAMMKREIIALLKIWQNENWEK >gi|330402193|gb|ADLB01000022.1| GENE 115 105172 - 105276 72 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANRKRDARLELISEIIGSRRAFFSLTEEGGNDD >gi|330402193|gb|ADLB01000022.1| GENE 116 105960 - 106844 782 294 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3931 NR:ns ## KEGG: Cphy_3931 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 291 1 290 291 254 43.0 3e-66 MHEFLKAIGFSDLKTKKELKEILSDIEKDFSAQQTITVEENVDYSERKKEFGDNIGITVC GELNEEGNFEREYYFPYFEGTGVTTYSDVILEKRMEREMYAGVCEEMKVGVSLIFHLQNM AEYRREKRLGHIKKNSVSVTLSALAEGGTILFPIQKDEFQEQKSQEDARNRMMLMSAAKS GDQQAIESLTLDDIDIYSQVSKRLITEDVFSIVDTYFMPYGVECDRYSIMGTILDIKEIH NFYTNEELYIFTLEVNELVFDVCVPKKSVMGEPEIRRRFKGNIWLQGRINFGII >gi|330402193|gb|ADLB01000022.1| GENE 117 106863 - 108140 1469 425 aa, chain - ## HITS:1 COG:PH0710 KEGG:ns NR:ns ## COG: PH0710 COG0172 # Protein_GI_number: 14590588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Pyrococcus horikoshii # 1 421 6 446 460 317 39.0 3e-86 
MLDIKFLRSNPEIVKQNIKNKFQDAKLPLVDEVIELDKRNREIKQEVEALRANKNQISKK IGACMAQGKKEEAEEYKRQVAQNAGRTEELSAEEKEVEEKIKTIMMTIPNIIDPSVPIGK DDSENVELEKFGEPVVPDFEVPYHTEIMDKFNGIDLEAAGKVAGQGFYYLMGDIARLHSA VISYARDFMIDRGFTYCIPPYMIRSNVVTGVMSFAEMDSMMYKIEGEDLYLIGTSEHSMI GKFIDTLNDEEKLPYTLTSYSPCFRKEKGSHGIEERGVYRIHQFEKQEMIVVCKPEESME WYEKMWKNTVDLFRSLDIPVRTLECCSGDLADLKVKSVDVEAWSPRQQKYFEVGSCSNLG DAQARRLGIRVKGKDGKYFAHTLNNTVVAPPRMLIAFLENNLQADGSVKIPEALRPYMGG KAEIR >gi|330402193|gb|ADLB01000022.1| GENE 118 108165 - 108686 709 173 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3933 NR:ns ## KEGG: Cphy_3933 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 4 169 2 168 170 142 47.0 4e-33 MNSKILESIGIDPFYILVVMLLLIILLFVLVIGINMKYNRLKVSYNSFMRGKDGKTLEKS ILSKFKEIDTISSMTKKTRQEVKELSQKMAGSYQKVGIVKYDAFNEMGGKLSFALTLLDG NDSGYIINAMHSREGCYLYIKEIVKGESYIELAEEEAESLERAIYEETYGLDV >gi|330402193|gb|ADLB01000022.1| GENE 119 108688 - 109578 970 296 aa, chain - ## HITS:1 COG:CAC3729 KEGG:ns NR:ns ## COG: CAC3729 COG1475 # Protein_GI_number: 15896960 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 4 288 3 281 283 226 49.0 5e-59 MAVKRGGLGKGLDSLIPEKKETKTKQEKPAKAGEQMIKLSLIEPNREQPRRMFEEDSLLE LADSIKQYGVLQPLLVQKKDDFYEIIAGERRWRAAKLAGIKEVPVIVRKYTEQEMVEIAL IENIQRENLNPIEEAMAFKRLLTEFSLKQDEVAERVSKSRTAVTNSMRLLKLNEKVQQMI IDDMISTGHARALLAIDDEEQQYILANKIFDEKLSVRETEKLIKELKNPKKEKKKKIVEN DFIYRDSEERMKSLMGTKVSINQKANGKGKIEIEYYSEKDFERIYDLIMSIEKGDA >gi|330402193|gb|ADLB01000022.1| GENE 120 109578 - 110345 895 255 aa, chain - ## HITS:1 COG:BS_soj KEGG:ns NR:ns ## COG: BS_soj COG1192 # Protein_GI_number: 16081149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Bacillus subtilis # 1 253 1 253 253 302 59.0 5e-82 MGRIIAIANQKGGVGKTTTSINLSACLSALGKKVLAIDMDPQGNMTSGLGIDKDNVEYTV YDLIIGETEIEKVICKDTLENLDVLPTNIDLSAAEIELIGVDNKEYIIRDAVDTVKEMYD 
FIIIDCPPSLSMLTINAMTTADTVLVPIQCEYYALEGLSQLIHTIDLVKERLNPKLEMEG VVFTMYDARTNLSLQVVENVKENLNKAIYKTIIPRNIRLAEAPSHGLPINLYDPKSAGAE SYMLLAEEVINKGEE >gi|330402193|gb|ADLB01000022.1| GENE 121 110528 - 111490 754 320 aa, chain + ## HITS:1 COG:slr1235 KEGG:ns NR:ns ## COG: slr1235 COG0596 # Protein_GI_number: 16330114 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Synechocystis # 43 243 13 215 293 88 29.0 1e-17 MNWKKKIMNCALFIGITTFVMFIINKLIYFISTVENVLDKIKGNYYEWRFGKIFYTKHGT GKPVLLIHDTTSMGSGYEWNKIVKQLSKTNTVYTIDLLGCGRSEKPNITYTNYLYVQLIS DFIKQVIGKKTDVVSSGLSSSFVLMACHMDNNIINKVVMVAPHSFTSLNKSPNRNSKALK ILLNTHIVGTLVYNLIFTRKNITSLLENEYYYNNNKVSEELIQNYYESAHTCNSASKYVF ASIKGRYLNANVLNALQSLNNSIFVISGDEHIDECKCIGKQYQEKLPSIEVVHIQKSLLA PHIEQAELFTEQLQILLETE >gi|330402193|gb|ADLB01000022.1| GENE 122 111504 - 112217 537 237 aa, chain - ## HITS:1 COG:lin2934 KEGG:ns NR:ns ## COG: lin2934 COG0357 # Protein_GI_number: 16801993 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Listeria innocua # 5 236 6 237 238 225 47.0 5e-59 MSKIFEEKLAHLGIVLNDTQKEQFHQFYEILVEWNSFMNLTGITEYEEVNEKHFVDSVSL IKAVDLSKVKTVIDVGTGAGFPGIPLKIAFPHLEIVLLDSLNKRVKFLNEVIEQLGLTGI RAIHGRAEDYAKQKEYREQFDLCVSRAVANLSTLSEYCIPYVNVGGMFIPYKSGEIDEEV ETSKKAIHILGGKLSEVVKFRLPGTDIGRSFVKIEKIQNTSKKYPRKAGLPAKEPLL >gi|330402193|gb|ADLB01000022.1| GENE 123 112219 - 114117 1841 632 aa, chain - ## HITS:1 COG:CAC3733 KEGG:ns NR:ns ## COG: CAC3733 COG0445 # Protein_GI_number: 15896964 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Clostridium acetobutylicum # 14 630 8 622 626 790 61.0 0 MKYSANSVWENKDEYDVVVIGAGHAGCEAGLATARLGLKTIVFTVSINSVALMPCNPNIG GTSKGHLVREIDALGGEMGKVIDKTFIQSKMLNKSKGPAVHSLRAQADKAQYSKTMRCVL 
ENQENLELKQAEVTDILVEDGKITGVQTYSGAVYHCKAVILCTGTYLKSRCIYGEISNET GPDGLQAANYLTDSLKRLGIEMYRFKTGTPARIDKNSIDFSKMEEQFGDERVVPFSFTTN PEDVQIEQASCWLTYTNEKTHEIIRQNLDRSPLFSGMIEGTGPRYCPSIEDKVVKFADKN RHQVFIEPEGLDTNEMYIGGMSSSLPEDVQYEMYRSVPGLENARIMRNAYAIEYDCIDAR QLYPTLEFKKIEGLYSGGQFNGSSGYEEAAAQGLIAGINAALKLLGREQIVIDRSEGYIG VLIDDLVTKESHEPYRMMTSRAEYRLLLRQDNADQRLTEIGYKVGLISNDRYSHLKEKER QIEQEVERVKHTNIGANGKVQEVLEKYGSTPLNSGTTLAELIRRPELNYEALAPIDVKRE KLSDEVIEQVNISIKYEGYITRQMKQVEQFKKLETKRIPDEIDYDDVKSLRIEAVQKLKQ YRPISIGQASRISGVSPADISVLLVYMEQWGR >gi|330402193|gb|ADLB01000022.1| GENE 124 114083 - 115456 1426 457 aa, chain - ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 457 5 459 459 401 49.0 1e-111 MKKDTIAAIATAMSSAGIGIVRISGREAIEIIQKIFRGKKEKNFAEEKTYTIHYGYIADG EEIIDEVLVMLMKAPHSYTGEDTVEIDCHGGIYVVKKIMETVIKYGARPAEPGEFTKRAF LNGKMDLSQAEAVIDIIDSKNEYALKSSVSQLKGSVQKKIGEIREEILYHTAFIETALDD PEHISVDGYGETLKKVVDNLLEEIRRLLISADNGRIIKEGIKTVIVGKPNAGKSSLLNVL VGEERAIVTDIEGTTRDVLEENIQLQGVSLNIMDTAGIRETKDVVEKIGVDKAKNHANEA DLIIYVADASRPLDDNDEEIIEMIRDKQAIVLLNKSDLDMVVTKKELQEKLNKPMIVISA KEEQGIKELEETLKEMFFHGDISFNDEVYITNIRHKTALQDAAESLEKVLISIENGMPED FYSIDLLDAYESLGSITGETIGEDLVNEIFSKFCMGK >gi|330402193|gb|ADLB01000022.1| GENE 125 115513 - 116268 982 251 aa, chain - ## HITS:1 COG:CAC3735 KEGG:ns NR:ns ## COG: CAC3735 COG1847 # Protein_GI_number: 15896966 # Func_class: R General function prediction only # Function: Predicted RNA-binding protein # Organism: Clostridium acetobutylicum # 1 249 1 208 209 150 43.0 2e-36 MNEITVSAKTLDDAITEALIQLGATSDQVEYDVIEKGSAGFLGIGSKQAVIKAWRKVETP VVEEIKEEIKQEIREEKELVEKKVEEEHRSKEHKLGAVENTTITACETFLHDVLKTMGME VEITSQVDEEGALNIDMKGENMGILIGKRGQTLDSLQYLTNRVANKSQEEYVRVKLDTEN YRQRRKETLENLAKNIAYKVKRTKKPISLEPMNPYERRVIHSALQGDKYVSTHSEGEEPY RRVVVTLARRY >gi|330402193|gb|ADLB01000022.1| GENE 126 116281 - 117543 1504 420 aa, chain - ## HITS:1 COG:BH1169 
KEGG:ns NR:ns ## COG: BH1169 COG0706 # Protein_GI_number: 15613732 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Bacillus halodurans # 25 132 45 153 280 77 40.0 5e-14 MTGTLLTQATTPIIGQIAWLLGKLMDGIFNVMSNLFGIENIGICIIIFTIIIYTLLIPLT IKQQKFSKMSAAMNPEIQAIQKKYKGKRDQTSVLKMQEETQMVYDKYGTSPTGGCGSLLI QFPILLGLWQVIQNIPAYVGGVKDAYLPLVNSIMATDGYQKIMEGIGKASPILIDPKHFD YTKTNTIIDVLYKFQDSTWDTLADKFSNLQPIIQSTQENVKHLNSFLGINLAETPKTMFM DGVKTGAIALIIVALIIPILSGLTQWISIKLTSTASATQTADNPMASQMKMMNVMMPMLS VFMCFTFQAGLGLYWIASAVVRCVQQLIINKHLEKISVDDLIAQNVEKVKKKREKKGASA KNINEMAQRNVRNIQEPSMKKMTDAEKEARLKKAAEANKNAKAGSLASKANMVKNFNENN >gi|330402193|gb|ADLB01000022.1| GENE 127 117565 - 117759 138 64 aa, chain - ## HITS:1 COG:CAC3737 KEGG:ns NR:ns ## COG: CAC3737 COG0759 # Protein_GI_number: 15896968 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 3 63 10 70 71 87 68.0 5e-18 MLIKFYRKYLSGLKVRTHCIYIPTCSEYGLEAIEKYGAFKGSLLTIWRIMRCNPFSKGGY DPVP >gi|330402193|gb|ADLB01000022.1| GENE 128 117749 - 118141 308 130 aa, chain - ## HITS:1 COG:CAC3738 KEGG:ns NR:ns ## COG: CAC3738 COG0594 # Protein_GI_number: 15896969 # Func_class: J Translation, ribosomal structure and biogenesis # Function: RNase P protein component # Organism: Clostridium acetobutylicum # 7 103 6 104 119 85 50.0 3e-17 MQFSESLKKNRDFQIVYKRGKSYANKYLVMYKKPNGLNKNRIGISVSKKVGNSVVRHRLT RLIRESYRLQESKMQCGYDIVVIVRASANGKSYGEMDSALIHLAKLHHIYNPMEKKNGDG ISEETAYNAN >gi|330402193|gb|ADLB01000022.1| GENE 129 118189 - 118323 199 44 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160882064|ref|YP_001561032.1| ribosomal protein L34 [Clostridium phytofermentans ISDg] # 1 44 1 44 44 81 90 2e-14 MKMTFQPKKRSRAKVHGFRARMSTAGGRKVLAARRAKGRKKLSA >gi|330402193|gb|ADLB01000022.1| GENE 130 118705 - 120075 1329 456 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L 
Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 4 455 5 445 446 420 50.0 1e-117 MNIVKEKWSEIIEKLRIEYGLSNVSFNTWIKPLKVHEVKDNTVFLLCELKASIDHIKHKY ELPLRVCIAEVTDEEFNVAFICEDDIKQTETKEENQKIKTIIEQANLNHKYTFDTFVVGN NNQFAHAASLAVAESPGEVYNPLFLYGGVGLGKTHLMHSIAHFILEEDPTKKVLYVTSET FTNELIEAIKSGRTGNESTMTSFREKYRNIDVLLIDDVQFIIGKESTQEEFFHTFNHLHV SGKQIILSSDKPPKDFETLEARLRTRFEWGLIADISSPDYETRMAILQKKIELDNLDVYH IPDDVVEYIANNVKSNIRELEGSLNKLIALYKLNYNSSKEIDIPLAAEALKDIISPNENR TITIDLIIDVVAEHFNVSVADLKSGKRNSEIVMPRQIAMYLCRTMIDTPLKSIGIILGGR DHSTINHGVDKIKAELKTNDTLNNTIDIIKKKLNPV >gi|330402193|gb|ADLB01000022.1| GENE 131 120260 - 121369 1242 369 aa, chain + ## HITS:1 COG:CAC0002 KEGG:ns NR:ns ## COG: CAC0002 COG0592 # Protein_GI_number: 15893300 # Func_class: L Replication, recombination and repair # Function: DNA polymerase sliding clamp subunit (PCNA homolog) # Organism: Clostridium acetobutylicum # 1 368 1 366 366 239 38.0 6e-63 MKIICTKSNLVKGVSIVSKAVPSKTTMPILECILIDATTDIIKLTANDMELGIETIIEGD IIEKGLIAIDAKIFSEIVRKLPDNEVVIETDSNLQTLITCEKAKFNISGKSGEEFSYLPY IEKEEPVELSQFTLKEIIRQTIFTISDHDSNKLMTGELFEINENILKVVALDGHRISIRK VELMNNYASKKVIVPGKTLIEVSKILSGEVESKANIYFTPNHIVFEFDQTVVVSRLIEGE YFKIDQMLSSDYETKVRVNKKELLSCIDRAMLLVKEGDKKPIIINIGDEAMELKIQSQLG SMNEDILITKEGKDLLIGFNPKFLMDALRVIEDEEIQIYLTNAKAPCFIRDEKESYIYLI LPVNFNAVV >gi|330402193|gb|ADLB01000022.1| GENE 132 121373 - 121582 333 69 aa, chain + ## HITS:1 COG:Cgl0147 KEGG:ns NR:ns ## COG: Cgl0147 COG2501 # Protein_GI_number: 19551397 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Corynebacterium glutamicum # 2 58 4 60 97 62 57.0 3e-10 MEKIKLREEFIKMGQALKAAGLVESGVEAKEVIQDGLVQVNGEIDTRRGRKLYDGDVVFF DGEEIQIEK >gi|330402193|gb|ADLB01000022.1| GENE 133 121579 - 122664 684 361 aa, chain + ## HITS:1 COG:CAC0004 KEGG:ns NR:ns ## COG: CAC0004 COG1195 # Protein_GI_number: 15893302 # Func_class: L Replication, recombination and repair # Function: 
Recombinational DNA repair ATPase (RecF pathway) # Organism: Clostridium acetobutylicum # 1 359 1 362 363 278 41.0 8e-75 MIIKSLKLKNFRNYDLLNLDFDSATNILYGDNAQGKTNILEAIYLSGTTKSHRGTKDRDM IRFGQEESHIETVIEKKGIEFKTDIHLKKNSPKGIAINKMPIRKASELFGVIHLVFFSPE DLNIIKNGPAERRRFIDMELSQLDKVYLNDLANYNRIINQRNKLLKDIYGREDLISTLDI WDMQMAHYGDRVMQRRAKFIAQINGIIENVHGKLTGGKEKLNLFYEKSIGDADFSEAILK NRERDIRMKSTSVGPHRDDICFKAGDLDIRKFGSQGQQRTAALSLKLSEIELVKLLINDT PILLLDDVLSELDKNRQNYLLDSISNIQTIVTCTGVDEFVNRRFLINKIFHVNGGQVKKE N >gi|330402193|gb|ADLB01000022.1| GENE 134 122675 - 124591 1994 638 aa, chain + ## HITS:1 COG:CAC0006 KEGG:ns NR:ns ## COG: CAC0006 COG0187 # Protein_GI_number: 15893304 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Clostridium acetobutylicum # 5 638 8 637 637 811 64.0 0 MSTEYGADQIQILEGLEAVRKRPGMYIGSTSVRGLHHLVYEIVDNAVDEALAGYCDTIYV SINKDNSITVIDNGRGIPVGINHKAGLPAVEVVFTVLHAGGKFGGGGYKVSGGLHGVGAS VVNALSEWLEVEIYKDGKIYKQRYERGHVMYKLEVTGTCEPEKTGTKVTFLPDKEIFEET IFDYDTLKQRFREMAFLTRNLKIILADERPEERIEKTFHYEGGIKEFVSYLNKSKTPLYE QIIYCEGEKDGVAVEVAMQHNDSYSDNTYGFVNNITTPEGGTHIVGFRNALTKTFNDYAR KNKLLKDSEPNLSGEDIREGLTAIVSVKIGDPQFEGQTKQKLGNSEARGAVDNIVSTQLQ IFLEQNPSVAKMTVEKSVMAQRAREAARKARDLTRRKSALETMSLPGKLADCSDKDPANC EIYIVEGDSAGGSAKTARDRATQAILPLRGKILNVEKARLDKIYANAEIKAMITAFGTGI HDDFDISKLRYHKIIIMTDADVDGAHISTLLLTFLYRFMPDLIKEGYVYLAQPPLYKLEK NKKVWYAYSDDELNQILTEVGRDGNNKIQRYKGLGEMDAEQLWETTMSPEHRVLLRVTMD EETSSELDLTFTTLMGDKVEPRREFIEENAKYVQNLDV >gi|330402193|gb|ADLB01000022.1| GENE 135 124616 - 127096 2910 826 aa, chain + ## HITS:1 COG:CAC0007 KEGG:ns NR:ns ## COG: CAC0007 COG0188 # Protein_GI_number: 15893305 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Clostridium acetobutylicum # 8 826 6 819 830 941 59.0 0 MEDNIFDKVHDVDLKKTMESSYIDYAMSVIASRALPDVRDGLKPVQRRILYSMIELNNGP 
DKPHRKCARIVGDTMGKYHPHGDSSIYGALVNMAQEWSTRYPLVDGHGNFGSVDGDGAAA MRYTEARLSKISMELTADINKDTVDFVPNFDETEKEPAVLPARFPNLLVNGTSGIAVGMA TNIPPHNLKEVINAVVKIIDDQIEETGETTIEDILQIVKGPDFPTGATILGTRGIEEAYR TGRGKIRVRAVTDIEPMQNGKSRIVVTELPYMVNKARLIEKIAELVRDKKIDGITDLSDQ SNREGMRICIELRRDANANVILNQLYKHTQMQDTFGVIMLALVNNEPRVMNLLDMLNHYL KHQEEVVTRRTKYELNKAEERAHILQGLLIALDNIDEVIRIIRGSKTTQIAKEGLIAAFD LTEVQAQAIVDMRLRALTGLEREKLETEYAELMKRIGELKAILADRKLLLGVIKEEIIVI RDKYGDERRTAIGFDEYDISMEDLIPREDVVITMTKLGYIKRMTVDTFKSQNRGGKGIKG MQTLEEDYIEDLFMTSTHHYVMFFTNTGRVYRLKAYEIPEAGRTARGTAIINLLQLMPEE KITAIIPIKEYVQGEYLFMATKKGLVKKTPITDYANVRKTGLAAITLREEDELIEVKATD NEKDIILVTKYGQCIRFNEKDVRSTGRTSMGVRGMNLGDRDEVVGMQLNTQGEHLLIVSE KGMGKRTEMSEFTSQNRGGKGVKCYKITEKTGNVVGVKAVSEDNEIMIINTEGIIIRMEC NSISVLGRITSGVKLINLHDDEKVASIAKVREASSELEENAEETVE >gi|330402193|gb|ADLB01000022.1| GENE 136 127212 - 128795 1382 527 aa, chain + ## HITS:1 COG:MA0761 KEGG:ns NR:ns ## COG: MA0761 COG1574 # Protein_GI_number: 20089646 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase with the TIM-barrel fold # Organism: Methanosarcina acetivorans str.C2A # 1 520 27 552 553 346 36.0 7e-95 MNESNLYAEAVCVENGRIVSVGEYNEIIKLQEEPDEVIDLQGKTMLPGFIDAHSHFVGAA NAMTQCDLSFCKNFDDIIQTMRAFKEKRKLSDESWIIGSNYDQNFLEEGRHPDKDVLDQI SEVNPILLIHASSHMGVTNSKGLEIQQIDEKTEDCAGGRYGRVEGTNIPDGYMEEKAFLT FQSKLPMQSVETLLSCIQDTQKMYAGYGVTTVQDGMVGKPLFELLKYVSSTGKLQLDVVG YADIMTASDLVEEEPEYIGKYKNHFKLGGFKVFLDGSPQGKTAWLTKPYENEKEYKGYPI HGNDELKKYICLALDKKQQLLAHCNGDAAAEQYISMFEQALRERKEKNIYRAVMVHAQLV RKDQLERMADMGMIPSFFVAHTYYWGDIHLKNFGEKRGSRISPVKDAVELKMKYTFHQDT PVVPPDMMRTVSCAVNRVSRTGQSIGENQRISVLEALKAITSYGAYQYFEENDKGTIEKG KYADFVVLDKNPLETEKENLADIKVLMTIKENKVIYKNGEDTNESAE >gi|330402193|gb|ADLB01000022.1| GENE 137 128779 - 129696 981 305 aa, chain + ## HITS:1 COG:MA3243 KEGG:ns NR:ns ## COG: MA3243 COG0697 # Protein_GI_number: 20092059 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: 
Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Methanosarcina acetivorans str.C2A # 4 276 2 273 291 105 30.0 8e-23 MNQRSKGIILAVTTAVMWGIMGIFVRDLSHNQFTNIEISFFRCALAGAAYFIFLLLTKPS ALKVNIKGLVICVLYGAVAYSISFVSYSVSVSRIPVGVATVLMFMSPIWVVILGTVMFKE KLQKSKIVTIVICFIGAILVANLIGGGEIKLDAIGILAGIINGIGVALQILLPKFFAKEY DRDTLLVYGFLGAASVLLFGMDFGSVAEHISNTPMTNLIWDLFGIGILCTMVANAACVKS TQYVEASTTSILSALEVVVGTLVGFFIFHEHLTFLQIAGAVIIVVGAIGSEIYQPKKKGK NECIQ >gi|330402193|gb|ADLB01000022.1| GENE 138 129684 - 131288 1494 534 aa, chain + ## HITS:1 COG:BH2109 KEGG:ns NR:ns ## COG: BH2109 COG4753 # Protein_GI_number: 15614672 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver domain and AraC-type DNA-binding domain # Organism: Bacillus halodurans # 1 526 1 520 525 159 25.0 2e-38 MYTVIIADDEEEIRRSLIRKIDWESIGFKVVGEAGNGEDALELVEKLEPDLLLTDVKMPF ISGIELARQVREIRPTIQIAFLSGYDDFSYAQQAIQYNIVSYMLKPISSGDITKELKNIK LKIERKFEEFATKTPSSEKMEKSEFLMPLLLDSFQKERSEEEKKRLVEKAVECGVLEPGN LDVLKYVVLVTRILNEDGEVCTDRTSVNAIDSILKKYVKHSSVYMHGKVVSLLSATKTGF NKYLHIIVEDISQSVERIMELNCTIGVSRVGEDLSCIHECYLEAMNAISYSKKNRRNIHF IGDEERMNTLDQELVQQTIQDIENLIRSGSQEEIQEYFVSFGKLIEQRKVTKVAINFIFV QLIASVFQVVYAVAGNDAVQRIQKNFAFYSMPVLSGSTKSTLKYYEEICLMARNIISEQR KKSSEVLCENTIRIINEQYMDPNLSLSDVSSENSVSPNYLSAVIKRENGSTFKDLLTKKR IEKAQELLFCTSMKIREIAEKCGYNDQHYFSYCFKKYTGMSPNVCRRNYEENKE >gi|330402193|gb|ADLB01000022.1| GENE 139 131299 - 132996 1482 565 aa, chain + ## HITS:1 COG:SP0662 KEGG:ns NR:ns ## COG: SP0662 COG2972 # Protein_GI_number: 15900563 # Func_class: T Signal transduction mechanisms # Function: Predicted signal transduction protein with a C-terminal ATPase domain # Organism: Streptococcus pneumoniae TIGR4 # 156 557 160 560 563 216 30.0 1e-55 MKKISLSQMLTGIVVMIVLVLTVTFFFAFVNIYKNTTEENVVTSSEQAVLQVKNTVENYT EDMEILMEMIQKNISRGKENTDTYLKNLISIRKDIVAITTYDEQGNLLKAWSDRGKLKEN IVENLSFNPNVPKEKEEMLNVTKPHVESLFEDYYPWVVTISEYMKNSEGDTIQVALDIQF SQIANYMDDVGIGQHGYCYIADKKGDIIYHPQQQLIYSGLKEEEYNYLKDGSHVEKDAIY 
SVRSLDNCEWKIIGVCYIDEMITNKVNHIMKTLSVIFLIVVLLTVLIIRFFSKLFSNPAR ELANAMQEFEKDTNNFEFKSIEGTAEITSLTESFEHMVVQIKELVEKVRQEEITLRKTEL KALQAQINPHFLYNTLDAIAWLCEEERHKDAVEMVNSLAKLFRISISRGHELITIEKEMQ HAKSYLKIQNFRYKNQFTYSFDVDEECLNYLCNKITLQPIIENAIYHGIDRMVDEGKINI GIHQKGDKIIFTVEDNGVGMTEEQCEEILHKDAGDRVGIGIKNVNDRIKIYFGEEYGLTI QSELDEGTRVTISMPKITENDYGEK >gi|330402193|gb|ADLB01000022.1| GENE 140 132983 - 133951 973 322 aa, chain + ## HITS:1 COG:AGc5109 KEGG:ns NR:ns ## COG: AGc5109 COG1879 # Protein_GI_number: 15890064 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 17 319 52 357 357 125 28.0 1e-28 MEKNKKKVIAFFCICMLFISGCGMENSENIQKRKVAVIMKSTDSAFFKAVSLGANAAGTA YNMEILFEGPKNEEDYKTQNKMIEQAIQEEYDAIVFSAVDYNANAEAIDKAAHAGIKTIV IDSDVNSKKVNCRISTNNYKAGKMAGQAVLENKSEELNIGIVNFDKNSRNGQERERGFKD FVKKDNRVKNVETINVISTVEQAKNQTKEFLEEHQEMNVLVTFNEWTSLGVGYAVRELGL QDKIQVVAFDNNVISVGMLETGEVDALIVQNPYAMGYLGVESAYNLLNEIPIKGKQVDTE TICATRENMYDDEYQKTLFAFE >gi|330402193|gb|ADLB01000022.1| GENE 141 134065 - 135189 1355 374 aa, chain + ## HITS:1 COG:YPO1507 KEGG:ns NR:ns ## COG: YPO1507 COG1879 # Protein_GI_number: 16121780 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Yersinia pestis # 1 349 1 309 335 160 35.0 6e-39 MKKKLLAAVLCVAMVAATLVGCGSSDDGKKDGGKKSDLKVGVFYYTYADTYISSVRTALD KQLDDMGVDYQDYDGNSNQTTQNEQIDTAIQSGTNLLIVNIVTSGSNDASTQVIEKAKAA DVPVIFFNRAVEDPKGKEGVVLNTYDKCAFVGTDAPEAGHMQGEMIGDYLVKNYDKVDLN KDGKISYAMFMGQLGNVEAIYRTQYAVEDANKKLVEAGKPELQFFDASNKDKYQVDQDGN WSAGAANNYMTTNLSQYNEGNNNMIELVICNNDGMAEGAIAALNDKGWNKGDAAKTIPVF GVDATDAAKKLIADGKMVGTIKQDAEGMAKCIADLTKNASEGKDLMEGTDSYNKSKEVST KIYIPYGIYTGEGK >gi|330402193|gb|ADLB01000022.1| GENE 142 135291 - 136790 1736 499 aa, chain + ## HITS:1 COG:FN1166 KEGG:ns NR:ns ## COG: FN1166 COG1129 # Protein_GI_number: 19704501 # Func_class: G Carbohydrate transport and metabolism # 
Function: ABC-type sugar transport system, ATPase component # Organism: Fusobacterium nucleatum # 6 499 7 500 500 610 64.0 1e-174 MANDVLLQMSHISKEFPGVKALNDVSLTVKKGTVHALMGENGAGKSTLMKCLFGMYKKDG GTIKLEGKEVDFKNSKEALENGVAMVHQELNQALKRNVMDNIWLGRYPKVAGLMVNEKKI YQDTMDIFKELEIDVDPRRVMSTMPVSQRQMVEIAKAVSFNSKVIVFDEPTSSLTEEEVE HLFKIINMLRDRGVGIIYISHKMAEIKRISDEITIMRDGTWVATKPADELSMDDIIRLMV GRELTNQFPPKTNKPGEIALEVEGLTAQYSSLKDVSFNVRKGEIVGLAGLDGSGRTETLE NIFGIATRHSGTIKLDGKVVKNRNSRESIKNGFALLTEERRATGIFGILSIRENTVISSL KKHKKAGVYLSEKSMKKDTQWSIDAMHTKTPTQETKIRSLSGGNQQKVILGRWLLTEPEV LLLDEPTRGIDVGAKYEIYQLILDLANKGKVVMVVSSEMPELLGICDRILVMSGGRLAGE VDAKETTQEEIMTLAAKFV >gi|330402193|gb|ADLB01000022.1| GENE 143 136806 - 137876 1301 356 aa, chain + ## HITS:1 COG:HI0824 KEGG:ns NR:ns ## COG: HI0824 COG4211 # Protein_GI_number: 16272765 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type glucose/galactose transport system, permease component # Organism: Haemophilus influenzae # 33 356 16 336 336 256 43.0 7e-68 MRLVANIKERQASYKALDSKDKRTYWKEFALNNALYFLIIVAVIYTAVQNPKFVSTSSIV NIISLSAANIPIAVGIAGCIVLTGTDLSAGRVVGLTACISASLLQAVDYPTKIFPDLPVL PIPLVILIVVLVGAIVGVVNGFFVAHFQLHPFIVTLATQLIVYGVLLMYIMLNGNNGQPL SGLDQSFKDFVTGSVLKIDGVPIPNYVWYACIIVILMWFIWNKTSFGKNMFAVGSNQEAA NVSGVNVAKTIILVHTLAGAMYAVTGFIEAARIGSNQANTGLNYECDAIAACVIGGVSFV GGTGKIRGVVMGVFILRIIFVALNFLSINQNLQYVIKGLIILVACAIDMRKYLVRK >gi|330402193|gb|ADLB01000022.1| GENE 144 137885 - 138139 266 84 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|226325936|ref|ZP_03801454.1| ## NR: gi|226325936|ref|ZP_03801454.1| hypothetical protein COPCOM_03749 [Coprococcus comes ATCC 27758] # 2 82 9 89 89 65 41.0 8e-10 MSDKAREKGRLAIYGMAGLYILYMAYSIFKGLSDSAGSEQFVLFVAMIAFAVIGVVLIVF SVVRGYKISKAEYDDKKNKENENK >gi|330402193|gb|ADLB01000022.1| GENE 145 138271 - 138732 555 153 aa, chain + ## HITS:1 COG:CAC1363 KEGG:ns NR:ns ## COG: CAC1363 COG2032 # Protein_GI_number: 15894642 # Func_class: P Inorganic ion transport and metabolism # Function: Cu/Zn superoxide dismutase # Organism: 
Clostridium acetobutylicum # 10 151 32 178 182 95 37.0 4e-20 MRNINGMPDAYAQVRGDAAHSKIYGNVYFYGMYNGTLVVAEIYGLPNTEEKDNGNFYGFH IHEGEYCTGDMEDSFKNAGAHYNPDKTEHPKHRGDLPPLLANEGIAWSAVYTDRFYPEEV VGKTVIIHGMADDFKTQPSGNAGMKIACGEIVE >gi|330402193|gb|ADLB01000022.1| GENE 146 138800 - 139240 390 146 aa, chain - ## HITS:1 COG:no KEGG:BHWA1_01097 NR:ns ## KEGG: BHWA1_01097 # Name: not_defined # Def: C-GCAxxG-C-C, putative redox-active protein (C-GCAxxG-C-C) # Organism: B.hyodysenteriae # Pathway: not_defined # 1 146 1 145 148 165 52.0 6e-40 MTTNIDMQALRNDAVEIFNHGFACSESVIYAIKKHFDLDLSDDAIAMSSGFPWGLGGGGC ICGALAGSTMCIGYFFGRRTPNDPAIQKCFELTNELHDYFKQTCGATCCRILTKGLEKNS PERKAQCTRFVADIVQKTAEIILREL >gi|330402193|gb|ADLB01000022.1| GENE 147 139407 - 140252 548 281 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 1 276 1 282 508 215 44 1e-96 MVRTIDVNEITKQIKEMCIEANHFLSSDMVCAMENAKNTERSPLGKQILEQLQENLEIAG EEMIPICQDTGMAVVFLEIGQDVHFKGGLLEDAVNEGVRQGYQEGFLRKSVVNDPIIREN TKDNTPAVIHYKMVEGNKVKIKVAPKGFGSENMSRVFMLKPADSLEGVKNAILTAVKDAG PNACPPMVVGVGIGGTFEKCALMAKEALTRETGVHSEIPHVKELEEEMLKKINRLGIGPG GLGGTSTALAVHINTYPTHIAGLPVAVNICCHVNRHIIREV >gi|330402193|gb|ADLB01000022.1| GENE 148 140254 - 140805 408 183 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 9 176 321 489 508 161 49 1e-96 MKKQIQIPIQKEIIRELKAGDYVYLTGEIYTARDAAHKRMNETLDRNEELPVNIKDKIIY YMGPSPAREGKVIGSAGPTTASRMDKYAPRLMDLGLSGMIGKGKRSQEVIDAIIRNEGVY FAAVGGAGALLSKCIKKSEVVAYDDLGTEAIRRLEVEDFPVIVVIDSEGNNLYEMAIEEY KEG >gi|330402193|gb|ADLB01000022.1| GENE 149 140811 - 141521 645 236 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 56 207 59 210 241 88 36.0 9e-18 MKRILLMVFRNIILVPYMWCKLCYYASHVEKYSEQKRYDMLRFIVFRANKGGNMHIDVFG 
KENIPKENGFMFYPNHQGLYDVLAIVEACDVPFSVVAKKEVGNIQFLKQVFACMKAYLID RDDVRQSMQIIIDVSKEVASGRNYLIFAEGTRSKLGNKLLDFKGGSFKAATKAKCPIVPV ALVNSFKPFDTNSIKPVNVEVHFLKPMLYEEYKDMKTTEIAIEVKKRIAKVVEREE Prediction of potential genes in microbial genomes Time: Tue May 24 21:50:33 2011 Seq name: gi|330401904|gb|ADLB01000023.1| Lachnospiraceae bacterium 2_1_46FAA cont1.23, whole genome shotgun sequence Length of sequence - 78473 bp Number of predicted genes - 77, with homology - 71 Number of transcription units - 32, operones - 18 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 150 - 208 98.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 1 1 Tu 1 . - CDS 427 - 513 77 ## - Prom 556 - 615 5.0 - Term 552 - 613 4.2 2 2 Tu 1 . - CDS 629 - 1981 1211 ## COG1066 Predicted ATP-dependent serine protease - Term 1992 - 2035 6.0 3 3 Op 1 . - CDS 2050 - 2463 622 ## Cphy_3460 hypothetical protein 4 3 Op 2 . - CDS 2518 - 4971 2163 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 5 3 Op 3 . - CDS 4997 - 5365 373 ## COG1725 Predicted transcriptional regulators - Prom 5399 - 5458 8.6 + Prom 5426 - 5485 6.5 6 4 Op 1 . + CDS 5536 - 5874 215 ## COG1695 Predicted transcriptional regulators 7 4 Op 2 . + CDS 5871 - 6560 396 ## + Term 6561 - 6620 5.0 - Term 6707 - 6771 16.2 8 5 Op 1 . - CDS 6784 - 8055 1721 ## COG0151 Phosphoribosylamine-glycine ligase 9 5 Op 2 21/0.000 - CDS 8066 - 8692 739 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 10 5 Op 3 . - CDS 8686 - 9711 811 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 11 5 Op 4 . - CDS 9753 - 10265 716 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 10328 - 10387 5.7 12 6 Op 1 . - CDS 10405 - 12456 2144 ## EUBELI_00547 hypothetical protein 13 6 Op 2 . 
- CDS 12458 - 12859 399 ## gi|210615476|ref|ZP_03290603.1| hypothetical protein CLONEX_02819 14 6 Op 3 . - CDS 12846 - 13310 231 ## COG4767 Glycopeptide antibiotics resistance protein 15 6 Op 4 . - CDS 13326 - 13475 228 ## 16 6 Op 5 . - CDS 13493 - 14173 956 ## COG0461 Orotate phosphoribosyltransferase - Prom 14245 - 14304 8.4 - Term 14276 - 14318 10.6 17 7 Tu 1 . - CDS 14340 - 19454 5528 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 19542 - 19601 5.2 - Term 19570 - 19630 16.1 18 8 Op 1 13/0.000 - CDS 19632 - 20534 1014 ## COG0167 Dihydroorotate dehydrogenase 19 8 Op 2 1/0.100 - CDS 20534 - 21304 699 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 20 8 Op 3 . - CDS 21317 - 22243 1297 ## COG0284 Orotidine-5'-phosphate decarboxylase 21 8 Op 4 . - CDS 22256 - 23545 1378 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 23576 - 23635 6.5 - Term 23590 - 23635 9.1 22 9 Op 1 1/0.100 - CDS 23639 - 24814 1399 ## COG0793 Periplasmic protease 23 9 Op 2 2/0.000 - CDS 24828 - 25961 1321 ## COG0739 Membrane proteins related to metalloendopeptidases 24 9 Op 3 28/0.000 - CDS 25978 - 26883 842 ## COG2177 Cell division protein 25 9 Op 4 . - CDS 26888 - 27559 336 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 26 9 Op 5 . - CDS 27581 - 28666 1159 ## COG2508 Regulator of polyketide synthase expression - Prom 28720 - 28779 4.5 27 10 Op 1 . - CDS 28781 - 29464 581 ## EUBREC_0711 hypothetical protein - Prom 29489 - 29548 11.9 28 10 Op 2 . - CDS 29563 - 29751 331 ## gi|153855921|ref|ZP_01996883.1| hypothetical protein DORLON_02908 - Prom 29821 - 29880 4.1 - Term 30075 - 30125 15.5 29 11 Op 1 . - CDS 30135 - 30317 307 ## COG1983 Putative stress-responsive transcriptional regulator 30 11 Op 2 . - CDS 30335 - 31168 1108 ## gi|225570780|ref|ZP_03779803.1| hypothetical protein CLOHYLEM_06883 31 11 Op 3 5/0.000 - CDS 31168 - 31836 802 ## COG4709 Predicted membrane protein 32 11 Op 4 . 
- CDS 31833 - 32156 440 ## COG1695 Predicted transcriptional regulators - Prom 32192 - 32251 6.0 33 12 Op 1 8/0.000 - CDS 32271 - 35864 2760 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 34 12 Op 2 . - CDS 35867 - 39199 2578 ## COG3857 ATP-dependent nuclease, subunit B 35 12 Op 3 2/0.000 - CDS 39210 - 39770 351 ## COG0500 SAM-dependent methyltransferases - Term 39782 - 39818 6.6 36 12 Op 4 . - CDS 39830 - 41077 653 ## PROTEIN SUPPORTED gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 37 12 Op 5 . - CDS 41082 - 41609 690 ## EUBREC_2423 hypothetical protein 38 12 Op 6 . - CDS 41621 - 41980 399 ## COG0640 Predicted transcriptional regulators - Prom 42100 - 42159 8.8 + Prom 41936 - 41995 4.0 39 13 Tu 1 . + CDS 42151 - 44610 1873 ## COG1409 Predicted phosphohydrolases + Term 44613 - 44664 11.2 - Term 44605 - 44647 4.1 40 14 Tu 1 . - CDS 44683 - 46920 2384 ## COG2217 Cation transport ATPase - Prom 46985 - 47044 6.1 + Prom 46963 - 47022 10.1 41 15 Tu 1 . + CDS 47077 - 47523 373 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases + Term 47525 - 47554 2.1 - Term 47513 - 47542 2.1 42 16 Op 1 . - CDS 47543 - 48217 826 ## COG0692 Uracil DNA glycosylase 43 16 Op 2 40/0.000 - CDS 48242 - 49717 1268 ## COG0642 Signal transduction histidine kinase 44 16 Op 3 . - CDS 49717 - 50394 633 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 45 16 Op 4 . - CDS 50460 - 51353 501 ## COG2333 Predicted hydrolase (metallo-beta-lactamase superfamily) - Prom 51378 - 51437 2.9 - Term 51437 - 51477 3.9 46 17 Op 1 3/0.000 - CDS 51485 - 52501 1275 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 47 17 Op 2 3/0.000 - CDS 52498 - 53502 962 ## COG0337 3-dehydroquinate synthetase 48 17 Op 3 . - CDS 53523 - 54284 708 ## COG0710 3-dehydroquinate dehydratase - Prom 54309 - 54368 5.3 + Prom 54262 - 54321 8.6 49 18 Tu 1 . 
+ CDS 54407 - 55282 487 ## COG0169 Shikimate 5-dehydrogenase - Term 55260 - 55307 14.5 50 19 Tu 1 . - CDS 55312 - 56661 1096 ## COG0534 Na+-driven multidrug efflux pump - Prom 56789 - 56848 5.2 + Prom 56671 - 56730 6.6 51 20 Op 1 . + CDS 56765 - 57238 324 ## COG1905 NADH:ubiquinone oxidoreductase 24 kD subunit 52 20 Op 2 . + CDS 57262 - 57483 234 ## Cphy_2061 hypothetical protein - Term 57464 - 57496 4.0 53 21 Tu 1 . - CDS 57507 - 58754 1505 ## COG0112 Glycine/serine hydroxymethyltransferase - Prom 58798 - 58857 5.0 54 22 Tu 1 . - CDS 58871 - 59434 326 ## - Term 59442 - 59487 10.2 55 23 Op 1 . - CDS 59495 - 60685 1583 ## Cphy_2845 hypothetical protein 56 23 Op 2 . - CDS 60704 - 60769 79 ## 57 23 Op 3 . - CDS 60756 - 61145 393 ## EUBREC_0207 hypothetical protein 58 23 Op 4 . - CDS 61153 - 61893 919 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 59 23 Op 5 . - CDS 61920 - 62552 211 ## PROTEIN SUPPORTED gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 - Prom 62575 - 62634 6.9 + Prom 62528 - 62587 5.4 60 24 Op 1 . + CDS 62693 - 63187 395 ## COG4767 Glycopeptide antibiotics resistance protein 61 24 Op 2 . + CDS 63234 - 63716 309 ## PROTEIN SUPPORTED gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 62 25 Op 1 . - CDS 63713 - 64855 1218 ## COG1929 Glycerate kinase 63 25 Op 2 . - CDS 64861 - 65202 325 ## - Prom 65223 - 65282 3.4 64 26 Tu 1 . - CDS 65340 - 66635 639 ## COG3344 Retron-type reverse transcriptase 65 27 Tu 1 . - CDS 67126 - 67299 282 ## gi|197302777|ref|ZP_03167830.1| hypothetical protein RUMLAC_01506 - Prom 67422 - 67481 5.1 + Prom 67445 - 67504 7.5 66 28 Tu 1 . + CDS 67535 - 68425 699 ## Acfer_1503 tetracycline resistance leader peptide - Term 68576 - 68612 5.1 67 29 Op 1 . - CDS 68813 - 69475 712 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) 68 29 Op 2 . - CDS 69488 - 70681 575 ## COG0438 Glycosyltransferase 69 29 Op 3 . 
- CDS 70739 - 71176 450 ## HMPREF0424_0946 hypothetical protein - Prom 71387 - 71446 6.9 70 30 Op 1 1/0.100 + CDS 71433 - 71834 337 ## COG1959 Predicted transcriptional regulator 71 30 Op 2 1/0.100 + CDS 71895 - 73778 1996 ## COG1151 6Fe-6S prismane cluster-containing protein 72 30 Op 3 4/0.000 + CDS 73790 - 74242 271 ## COG1142 Fe-S-cluster-containing hydrogenase components 2 73 30 Op 4 . + CDS 74239 - 75465 1254 ## COG0446 Uncharacterized NAD(FAD)-dependent dehydrogenases - Term 75737 - 75796 12.7 74 31 Op 1 . - CDS 75804 - 76001 180 ## gi|253579274|ref|ZP_04856544.1| conserved hypothetical protein 75 31 Op 2 . - CDS 76041 - 76853 594 ## COG0122 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase 76 31 Op 3 . - CDS 76858 - 77901 964 ## COG1396 Predicted transcriptional regulators - Prom 77998 - 78057 5.9 77 32 Tu 1 . - CDS 78123 - 78473 65 ## SZO_02320 transposase Predicted protein(s) >gi|330401904|gb|ADLB01000023.1| GENE 1 427 - 513 77 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTEEIFDDIIDKHSRDGQTSQKQNSKK >gi|330401904|gb|ADLB01000023.1| GENE 2 629 - 1981 1211 450 aa, chain - ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 448 1 453 458 505 57.0 1e-143 MAKGKKSVFFCQNCGHEEKKWLGQCPMCKEWNTFVEETISDGVSFSKTVSRAEVVTLSSV KTDSEERTKTGIEELDRVLGGGIVPGSLILVGGDPGIGKSTLLLQVCQKLSQEKHQVLYI SGEESLKQIKLRAMRMGEFTDELSLLCETSLNTIRGVIERRKPEIVIIDSIQTMYSEDVT SAPGSVSQVRESTNVLMQLAKGLNISIFIVGHVTKEGTVAGPRVLEHMVDTVLYFEGDRH ASYRILRGVKNRFGSTNEIGVFEMRQSGLQEVKNPSEFMLNGKPEGASGSVVACSMEGTR PILLEIQALVCQSGFGMPRRTAAGTDYNRVNLLMAVLEKRIGLPLFNYDAYVNIAGGIKM NEPAIDLGIILAIVSSYKNKPIGDKVIAFGEVGLSGEVRAVSMPEQRVSEAKKLGFQTCI VPEVSVETVKNIKGIEIVGVKNISEAIQFV >gi|330401904|gb|ADLB01000023.1| GENE 3 2050 - 2463 622 137 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3460 NR:ns ## KEGG: Cphy_3460 # Name: not_defined # 
Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 136 1 136 137 155 61.0 3e-37 MAAVKELLRVEEDGTLSFGDYTLDKKTKLEDVEFQGDLYKVKTFAEITKLERNGMFLYES VPGTAVEHLKVTEDTVSFSVSGRQDVQFTLEMEADTEYTVYMDDTNVGRMKTNLSGKLSI SAELNENVPVEIKVVRA >gi|330401904|gb|ADLB01000023.1| GENE 4 2518 - 4971 2163 817 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 5 795 6 798 815 837 53 0.0 MQNYYTKQAEEVLKTARTLARKMGNRYVGTEHLLVALRKVYRGVAGQVLAQNGLKEEDLL KVVNELISPIGEKSEVAKPEESPRLSYILENSKAQAIRLRAREIGTEHMLLAIVEDTECV GARILATLNIQPQRITQDILQAIGAEPEEYKMFQGGEKGQTAVLEQYSTDLTRQAMDGKL DAVIGRDEEIARIMQILSRRTKNNPCLIGEPGVGKTAIVEGLALRIANGIVPNAMKNKKI VTLDLAGMVAGSKYRGEFEERMKKLIQEVKSVGNVILFLDEVHTIIGAGGAEGAIDASNM LKPSLARGEIQLIGATTITEYRKHIEKDAALERRFQPIMVEEPTKEQCMAILEGVKSKYE IHHDVSIEKEALAAAVELSKRYVTDRNLPDKAIDILDEACSKVSLDGFKVPEHIRELEET VDRLAKEKEDAVKNGQMQEASLLNQEQETVREKLESARKRFQRGNRKKQICVTESAVAEV VSAWTKIPVQKLAESETARLQKLDKILHKRVIGQEEAVVAVTKAVKRGRVGLKDPNRPIG SFLFLGPTGVGKTELSKTLAEALFGNEEALIRVDMSEYMEKHSVSKMIGSPPGYVGHDDG GQLSEKVRRNPYSVILFDEIEKAHPDVFNILLQVLDDGHITDSQGRKVDFKNTVIIMTSN AGAKAIVEPKKLGFVANEDPDSDYKKMKSNVMDEVKQLFKPEFLNRIDEIIVFHALNREQ MKKIVNLMCKELSDRVQKQLNIKLVIRDSVKKEIVEKGTDLKYGARPLRRAMQNILEDKM ADAILEGEIQAGTQVEVGMSKKGIKFNVKADKELAKE >gi|330401904|gb|ADLB01000023.1| GENE 5 4997 - 5365 373 122 aa, chain - ## HITS:1 COG:CAC0599 KEGG:ns NR:ns ## COG: CAC0599 COG1725 # Protein_GI_number: 15893888 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 121 1 123 125 107 44.0 4e-24 MLIEIDFNSDEAIYMQLRNRIIMGIATAEIQEGDSLPSVRQLADNIGINMHTVNKAYSVL KQEGFIRLDRRRGAVISLDVNKIQALEDMKDQLRIILAKGRCKNITKDEVHELVEEIFKE YR >gi|330401904|gb|ADLB01000023.1| GENE 6 5536 - 5874 215 112 aa, chain + ## HITS:1 COG:CAP0153 KEGG:ns NR:ns ## COG: CAP0153 COG1695 # Protein_GI_number: 15004856 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: 
Clostridium acetobutylicum # 1 106 1 107 115 93 48.0 8e-20 MYDKAQLLKGLQEGCVLQIINTKDTYGYEIVTTLQKKGFQSVKEGTIYPLLLRLEKNGYI SSNFRISERGPARKYYQITDLGKEHLAEFVLAWQEVTQLVDFVFSDEKGEEK >gi|330401904|gb|ADLB01000023.1| GENE 7 5871 - 6560 396 229 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTNELKLLKKENTLRASCLSSQSQDALQKWREYAKLSDINPYDFELCYKELIGIASQKEL EGSNLEEFFGENMQTVTEDILQSCPKKSLKDYFLFDFGNMMGHLAISFAILYFVSGILWK EITLASFLGQFVIIGMWFVYLRFVTGDRHRSGFFTPKLNQKIVKTFSVYFMFLLATAIIS YGFKEMLSSVSLQIPNAFFCVIYGVLWAVLTVWRNRYVKEMSLRKRWVG >gi|330401904|gb|ADLB01000023.1| GENE 8 6784 - 8055 1721 423 aa, chain - ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 422 1 419 428 416 51.0 1e-116 MKVLIVGSGGREHAIAWSVAKSDKVDKIYCAPGNAGITEFAECVNIGAMEFDKLVQFAKE KEIDLTVIGMDDPLVAGVVDAFEAEGLRVFGPNKAAAILEGSKAFSKDLMKKYNIPTAAY ENFDNPEDALAYLETAKFPIVLKADGLALGKGVLICNTLEEAKDGVKSIMLDKKFGTAGN TMVIEEFMTGREVSVLSFVDGKTIKTMTSAQDHKRAKDGDEGLNTGGMGTFSPSPFYTKE VEEFCEKYVYQATVDAMRKEGREFKGIIFFGLMLTADGPKVLEYNARFGDPETQVVLPRM KTDIIDVFEACIDGRLDEIELEFEDNAAVCVVLASDGYPLAYEKGFEIAGLEEFNKHDGY YCFHAGTKWEDGKIVTNGGRVLGITAKGADLKEARANAYKATEWVQFANKYKRNDIGKAI DEA >gi|330401904|gb|ADLB01000023.1| GENE 9 8066 - 8692 739 208 aa, chain - ## HITS:1 COG:CAC1394 KEGG:ns NR:ns ## COG: CAC1394 COG0299 # Protein_GI_number: 15894673 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Clostridium acetobutylicum # 1 206 1 202 204 215 55.0 4e-56 MLKVVVLVSGGGTNLQAIIDAINTKTITNTEIIGVISNNKNAYALERAKQHNIFAKCISP KDYETREAFNDAFLEELNGLNPDLIVLAGFLVVIPKEMIKQYENRIINIHPSLIPAFCGK GYYGLKVHEKALERGVKVVGATVHFVDEGTDTGPIILQKAVSVQQGDTPEILQRRVMEEA EWKILPEAIHLIANGKIKVENRQVRIES >gi|330401904|gb|ADLB01000023.1| GENE 10 8686 - 9711 811 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| 
phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 1 338 12 344 356 317 48 2e-85 MDYKKAGVDIEAGYKSVELMKEHVKKTMRPEVLGGLGGFSGAFSLEKIKEMEEPVLLSGT DGCGTKVKLAMILDKHDTIGIDAVAMCVNDIACAGGEPLFFLDYIACGKNEPEKIATIVS GVAEGCLQSEAALIGGETAEHPGLMPEEDYDLAGFAVGVCDKKDMITGQDLKAGDVLIGM ASSGVHSNGFSLVRKVFEMSEESLNTYHDRLGRTLGEALLVPTKIYVKALKSIKEAGVRV KACSHITGGGFYENIPRMLKEGTHAVVKKDSYPILPIFPMLAEYGDIAEEMMYNTFNMGL GMIVAVDEKDVESTLSAIKAAGEEGYVVGHIEAGDKGVTLC >gi|330401904|gb|ADLB01000023.1| GENE 11 9753 - 10265 716 170 aa, chain - ## HITS:1 COG:TM0446 KEGG:ns NR:ns ## COG: TM0446 COG0041 # Protein_GI_number: 15643212 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Thermotoga maritima # 1 169 1 169 171 197 59.0 6e-51 MAKVGIVMGSDSDMKVMSKAADMLEKFGVDYEMTIISAHREPDVFYDYAKSAEEKGFKVI IAGAGMAAHLPGMCAAIFPMPVIGIPMSGKNLDGVDALYSIVQMPPGIPVATVAIDGGAN AAILAARILAASDPELLERLKAYTQEMKETVQAKAEKLEKLGHKEYLKQM >gi|330401904|gb|ADLB01000023.1| GENE 12 10405 - 12456 2144 683 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_00547 NR:ns ## KEGG: EUBELI_00547 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 5 677 6 684 690 716 54.0 0 MEKQIQERYTLSIERIRKIETEKTVGEKFQGFFSKVAGFLLEIDRVNQLLENGGWKDLSF EEMQRENKKLYEDVLPENYGKSYANPDYAVEMLGEEYGKILSFLYTEMRGEIAYVFERKQ LYLTICNELFIEMYNCFEGEEPSYESLKEIVYWYASDYADVFMADRIVDQIDWEKDFATR IIMDSNLEDLRYLYSYGEYVSENERRTAEHMNELPEETICKMADTYTEGYRMCFVNGRKD LSKKGSVNVRYTLGFERVVRKAIANFEKMGLKPVIYRAGVSVLTKRKHFKTGYYGAIANK QYEYDHGADAGLFLDKKFVERKLDVLKNTYEKHKELAGKMAGPAVMEVFGEAPFAPESKK NAVHLTEKQEQLELLYDNKSGQITNEYIKGEERGFTIIAYPVPEIAENYEEIFDEVIRIN TLDAGLYERVQQTIIDALDQGEYVHILGKGANKTDLKVQLHKLQNPEKETIFENCVADAN IPVGEVFTSPVLEGTEGTLYVSKVYLNELQYRDLEITFKDGKIQDYTCSNFDEEEENKKY IHANILYNHETLPLGEFAIGTNTTAYVVAKRYGIEDKMPILIAEKMGPHFAVGDTCYSWS EDNQIYNPNGKEIVAKDNSVSILRKEDLSKAYFQCHTDITIPYEELEEISVIKPDGEKIV ILKDTRFVLEGTEMLNEPLAELN >gi|330401904|gb|ADLB01000023.1| GENE 13 12458 - 12859 399 133 aa, chain - ## 
HITS:1 COG:no KEGG:no NR:gi|210615476|ref|ZP_03290603.1| ## NR: gi|210615476|ref|ZP_03290603.1| hypothetical protein CLONEX_02819 [Clostridium nexile DSM 1787] # 17 117 3 103 119 107 57.0 3e-22 MRKDREKDRKKDRAKEREKERDRKKDKKKRGNRKYGQAKLKHSKKGITSCVIAGAVAFVI LSSVVITYLYEGKAAGYIGGLGVSALIFAGAGIMQGVRGFKERDKDYRTCKVGIGLNIFF LVSLLVFFLGGFL >gi|330401904|gb|ADLB01000023.1| GENE 14 12846 - 13310 231 154 aa, chain - ## HITS:1 COG:CAC0829 KEGG:ns NR:ns ## COG: CAC0829 COG4767 # Protein_GI_number: 15894116 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Clostridium acetobutylicum # 16 149 18 155 308 75 33.0 3e-14 MKAKNKKRIKILGKVFFILYIVFIFYFLLLSDWYGRGQEMKMYHYNFVLFKEIKRFWNYR EQLGIMAVMSNLLGNVLIFVPFGFFMPIGSRFRSFLATAFYGFGISLCVELFQLVTKVGS FDVDDLLLNTVGSILGYMLFILCDGIRRKFSAKR >gi|330401904|gb|ADLB01000023.1| GENE 15 13326 - 13475 228 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKLYKTTGTVGAGNIALGIIMIVTGVVAGILTIVGGGRLLAGRKKIMF >gi|330401904|gb|ADLB01000023.1| GENE 16 13493 - 14173 956 226 aa, chain - ## HITS:1 COG:CAC0027 KEGG:ns NR:ns ## COG: CAC0027 COG0461 # Protein_GI_number: 15893325 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 1 226 1 224 224 266 59.0 3e-71 MEQYKREFIDFMVESKVLKFGEFTLKSGRKSPFFMNAGGYVTGSQLKRLGEYYAKAIHDK YGDDFDVLFGPAYKGIPLGVVTAIAYSELYGKEVRYCSDRKEEKDHGADKGSFLGSKLQD GDRVIMIEDVTTSGKSMEETVPKVKGAADVEIVGLMVSLDRMEVGKGGEKCALEEVKDLY GFETNAIVTMAEVVEHLYNRECNGEIVIDDTIKAAIDAYYEQYGAK >gi|330401904|gb|ADLB01000023.1| GENE 17 14340 - 19454 5528 1704 aa, chain - ## HITS:1 COG:SMb21655 KEGG:ns NR:ns ## COG: SMb21655 COG3250 # Protein_GI_number: 16263752 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sinorhizobium meliloti # 45 824 3 743 755 208 24.0 6e-53 MKRKVVALLTALTLVGGMFSPYGVVKTQAASEVKTEKEAVKSTERNVINFNTDWRYKKGD VSGGEAVDFADEEWGYVNLPHSTTFYTVENKDAYLGISWYRKDFQVDESLKGKKLLLTFE 
AAMQKADVYINGTKVMTHEGGYIPFVIDISDKVNYGTENTIAVKIDSRANTSFAPGKAQP DFQYFGGIYGNSYITVTDELHITDAVEANETAGGGVFITAPEVAKDKATVKVKTHVENEG KENKETTLLTELLDEGGNVVVSKEDKLSIESGKAQYFNQSLQVTNPRLWSPGTPELYTVR STVKADGVVKDTIETKYGIRKVEWKRDGLYINDKKTDVEGANLHGETYMFGNAMPDNAIY EEVKRFKESGFDIMRMSHYPHRQAYYDACDKYGVMVVECPSGWQYFNNTEAFKNSTYREL QTDIRHKRNHPSIVVWESSLNETGYNKEWAKEMNRIVKEEYPEDGDMYGYTAGCYHWDVW DIGLSTPQAGIFGTGNEGAENPKYKDKPMIIAEYGDWNYGGSGSTTRVTREDKNNAGKKG GDEGMLIQSDNIQESVQTNRSRGKNWLGAAMYWDYADYAGFDAGILTNCGVVDLYRLPKH GAYFFQSQRDASVDMSQYGVKTGPMVYIANTWDAKADKEVRIFSNCDTVELFLNGKSLGE KGHDKTTWGPHGDGDPMHYPQDGAGKEISTDAMKNAPITFDLDKYEAGELKAVGKIDGKE VTEYIRKSPEAAMQIQLRPESDAKVPLDGSSAKLVWIDITDKNGTVVPSAYTDVNLEVEG PGLVVGPKTVTTKGGQLAVWVKSKRGSGDITLTATSENLKATSVVIPTSEVQGLPEVPEG GDADEYENTKDDENAQDGNIFLNKVSRASTENTQNGKHEVKEYGNDGNDNTKWCANSGSY PQWWEVDLGASYKLDTMKLSFETSGSAYHYTIAVSENPMTDENYAKNIVIDNSKGSTDTE LTFEKGKVQGRYVRVTFTEATNNEWAVLRDVSGTGETDNLALNKPVTASSVNTGVGNKVE KAEYAVDGKMDTWWCAKGGEGTKDHWIQVDLKDTYKLSEVKIAFEKEDAAYKFVLQGSVD GVHYKDIKNFREGEGCGQEVTVQSDDIVQYLRVYDITTKNMVNQWPTIREIEAYGEKVDY KLYNVSREKEAFASSCKEGSEPGHGSNGVPGWYWYPATMGDEWWYIDTKGIYDLDNIQMT WNAEETHKYIIDISTDGKNWTTVADRSEKGTNEIMPYEAVEGTARYIRMRLPAGRTTEQG FGFFNAYAPVPTGRQVKEVKATEKVTAFVGTEFGELQLSKEVDVVLEDDIKTTLPVKWNE ADYDKTKEGETTINGTLTEIEGVKVGEELKNTTVVVELKKDTEVPVIKKQPKSQTVEIGE NVTFTVEAETNDGGNLTYQWQVNNGTDWQNIEGATTSSYTKENVSLEENGMKFRCLVVNT KEGMEAKQVITDEAVLTVTEKPIQVNEIVSVETIQDKEVEFGTPVSALELPQTVKVQLDN NEERTLSVQWDTSSYNGNVANTYKLSGELDLTDGIANTKGLKATVEVTVKEEPVKPEVDK EALKDIIAKAKHEAASGKYTNESVSALNEVIAKAWEIVEDKEATEEEVQQAIKDVEKAMK NLKEKETDDNNNEGNGNEGEDQNGGSHGNNGGNGNDNNGSGNGNNGQNGHGNGTKVPKTS DDFHILIWLAGMLGAIKVISGRRK >gi|330401904|gb|ADLB01000023.1| GENE 18 19632 - 20534 1014 300 aa, chain - ## HITS:1 COG:BS_pyrD KEGG:ns NR:ns ## COG: BS_pyrD COG0167 # Protein_GI_number: 16078618 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus subtilis # 4 297 3 297 311 315 51.0 9e-86 MNTKINIAGVEWKNPVTTASGTFGSGAEFAEFVDLNKIGAVTTKGVANVAWEGNPTPRIA 
ETYGGMMNAVGLQNPGIDVFCKRDIPFLRQYDTKIIVNVCGRTTEDYCEVVERLSHEDVD MLEINISCPNVKEGGIAFGQNPKSAEEITKEIKKYAKQPVIMKLSPNVTDITEMAKAVEA GGADAVSLINTLTGMKIDVHKRTFALANKTGGVSGPAIKPIAVRMVYQTANAIKLPIIAM GGIATAEDAIEFILAGATAVAVGTANFINPSVTMEIAEGIEQYMKRYDFENIRDMVGLVK >gi|330401904|gb|ADLB01000023.1| GENE 19 20534 - 21304 699 256 aa, chain - ## HITS:1 COG:SP0963 KEGG:ns NR:ns ## COG: SP0963 COG0543 # Protein_GI_number: 15900840 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Streptococcus pneumoniae TIGR4 # 11 249 3 243 250 197 42.0 2e-50 MEKREKERAVVVSQEQIATDIYSLWLRSEASVSAKAGQFISMYTNDSGKLLPRPISICEI DKENRQLRVVYRVTGENTGTEQFSKLTEGDEIEILGPLGNGFPLKEKKAFLIGGGIGIPP MLELAKQLHAEKQIVVGYRDELFLTEELSANGEVYVATEDGSAGTKGNVLDAIRENGLTA DIIYACGPTPMLRALKTYAEENNIECYISMEERMACGIGACLACVCKSKEKDHHTNVHNK RICKDGPVFLATEVDL >gi|330401904|gb|ADLB01000023.1| GENE 20 21317 - 22243 1297 308 aa, chain - ## HITS:1 COG:CAC2652 KEGG:ns NR:ns ## COG: CAC2652 COG0284 # Protein_GI_number: 15895910 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Clostridium acetobutylicum # 1 305 2 286 286 204 39.0 2e-52 MINKLVENIKRTKAPIVVGLDPMLNYIPQHIQKKAFAEFGETLEGAGEAIWQFNKEIIDK TYDLIPAVKPQIAMYEQFGIPGLVAFKKTVDYCKEKGLVVIGDIKRGDIGSTSSAYAVGH LGRVQVGSKSYVPFDEDFATVNPYLGSDGINPFIDVCKEENKGLFILVKTSNPSSGEFQD RLVDGRPLYELVGEKVAQWGESHMGDEYSYIGAVVGATYPEMGKVLRKVMPKSYILVPGY GAQGGQGKDLVHFFNEDGLGAIVNSSRGIIAAYKQEAYAEFGEENFADASRKAVETMIAD INGALENR >gi|330401904|gb|ADLB01000023.1| GENE 21 22256 - 23545 1378 429 aa, chain - ## HITS:1 COG:lin1951 KEGG:ns NR:ns ## COG: lin1951 COG0044 # Protein_GI_number: 16801017 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Listeria innocua # 4 424 3 420 426 359 46.0 7e-99 MRTLIKYGHVLDPKTQRDEVCDVLIEDDRIVKVDSDIVSEADRTVDATGCYVMPGFIDLH 
VHLRDPGLEYKETLQTGGQAGARGGITTICAMPNTKPVMDTGEKVSEVHRRSRTESPVNV IQLGAITVGQKGEELADIEGMAKAGCRAVSEDGKSVMNASLYRQAMKQAKENNISIFAHC EDITMVEGGVMNADEKAKKLGLKGITNAVEDVIVARDILLAKETGVQLHLCHCSTADSVK MVAEAKKEGLPVTAEVCPHHFILTTEDIPEDDGNYKMNPPLRSREDVETLKEGLRNNIMD VIATDHAPHSAEEKNKSMKDAMFGIVGLETSAALTYTELVDKGILTPMQMAEKMSYNPAK ILGIAGEKGSVSEGMIADIVIFDPKREYRIDKNTFLSKGKNTPFDNYAVKGDVRYTFVGG KMVYENTAY >gi|330401904|gb|ADLB01000023.1| GENE 22 23639 - 24814 1399 391 aa, chain - ## HITS:1 COG:CAC0499 KEGG:ns NR:ns ## COG: CAC0499 COG0793 # Protein_GI_number: 15893790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Clostridium acetobutylicum # 42 391 48 403 403 265 45.0 1e-70 MGDKRGFLKGVLCGTLVTLVVAGAAGMYLKNTGYGETIDKKTARKLEQIKDIIDDSYLKG DEIKESDLENFLLKGYVNGLKDPYSVYYNEEETKELYETTEGEYSGIGAVMSQNLETGVI TMVNVYEDSPAEKAGFKNNDILYKVNGKDISTEDISKVVNKIKGEEGTEVTITVLRDGKE YEATTTREKLEAHTIEYEMKEDHIGYVRVTEFDMVTYNQFVEAITDLKKQGMKEIIIDLR GNPGGNLSTVCQMLDYILPEGLIVYTEDKDGKREVMTSDEEHKLEMPMTVLVDGRSASAS EIFAGAIQDYQLGTLVGTTTYGKGVVQQLFDLKDGTCLKLTIAEYFTPKGRNIHEKGIKP DVEVEYQYDKNNADADNQLDKAIEVLKEKMK >gi|330401904|gb|ADLB01000023.1| GENE 23 24828 - 25961 1321 377 aa, chain - ## HITS:1 COG:VCA0079 KEGG:ns NR:ns ## COG: VCA0079 COG0739 # Protein_GI_number: 15600850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Vibrio cholerae # 260 361 277 379 430 108 52.0 1e-23 MAGKRAISLVLATVLIAGSVFKPVHAESIHSAEQKGQQLEEQKKQAEKEQESLSERLNSI ITEMQTAQANVEKKEDEIAVAEDELVEAEIKENKQYESMLIRIKYVYENGNSEMMQILLE AEDMGDFLNKTEYVEQVSQYDRKMLDQFAKLVKEVEKKEKSLKKEKESLVVLQKDLETKQ KEVQTLLDSKKEEIAKLDAQIGENTKTLNALIEKAKEAERRKQEAANANKKPSGGSGNKP GAPVISGGNGYLSNPCPAGRITSEFGPREAPVPGASTFHEGRDYGAPIGTPIYAAAEGTV IRTGNQAVRGNYVVIRHPNGLTTWYQHCTDIFVSKGQKVSRGQNIATVGKTGRVSGPHLH FIVEEASGELVDPRKYL >gi|330401904|gb|ADLB01000023.1| GENE 24 25978 - 26883 842 301 aa, chain - ## HITS:1 COG:BS_ftsX KEGG:ns NR:ns ## COG: BS_ftsX COG2177 # Protein_GI_number: 16080578 
# Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus subtilis # 3 301 2 296 296 138 29.0 1e-32 MRISTVGYTMKQGVKNIGRNKMFSLASIATMSACIFMFGLFFSIIVNFQNIVKSAEEGVA ITVFFDEGLEQNRIDEIGKLIKDRKEVSRVEYTSGEQAWEDYKKKYFGDKEYLAEGFADN PLVNSDNYQVYMKDVSKQKDLVKYVSGLEGVRQVNKSDTVAKTLSGINKLILYVSAAIII ILLAVSIFLISNTVTMGISIRREEIAIMKYIGAKDAFVRMPFIIEGLLIGLVGAIIPLAA LYFLYEKAVGYILIKFKILNNLLTFLPVTEVYKTLLPIGIALGIGIGFVGSFLTIRKHLK V >gi|330401904|gb|ADLB01000023.1| GENE 25 26888 - 27559 336 223 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 218 245 134 36 2e-30 MIELKDVTKEYSKGVAALNGINLKIEQGEFVFIVGDSGSGKSTLIRLLMKELDPTSGTII VNGINLNRMKHRHIAKYRRHLGVVFQDFRLLKDRNIYENIAFAQRVIETPTRVLKQKVPA ALSLVGLAQKYKAYPKQLSGGEQQRVAIARAVVNEPAILLADEPTGNLDPTNAWEIMKLL EEANERGTTVIVVTHNHEIVAEMKKRVITMQKGVIVSDEKEGE >gi|330401904|gb|ADLB01000023.1| GENE 26 27581 - 28666 1159 361 aa, chain - ## HITS:1 COG:CAC3236 KEGG:ns NR:ns ## COG: CAC3236 COG2508 # Protein_GI_number: 15896482 # Func_class: T Signal transduction mechanisms; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Regulator of polyketide synthase expression # Organism: Clostridium acetobutylicum # 169 354 134 312 312 82 32.0 9e-16 MISNQILQNTIEGLKAITRIDLSILDTDGKVLASTFVEQEHYENSLDSFVESPADSQVIQ GYQFFKIFDEHQLEYILVVNGSSDDVYMVGKIAAFQIQNLLVAYKERFDKDNFIKNLLLD NLLLVDIYNRAKKLHIETDVRRVIFVIETAREKDTGTMEHIRSLLGMKSKDFVTAVDERN IIIVKELGANEGYPEMEKVANEILEILKTEGEEVHIAYGTIVNDIKEVSKSYKEAKLALD VGKIFFDEKDIVAYSTLGIGRLIYQLPLPLCKMFIREIFEGKSPDEFDEETLTTINKFFE NSLNVSETSRQLYIHRNTLVYRLDKLQKSTGLDLRVFEDAITFKIALMVVKYMKYMETLE Y >gi|330401904|gb|ADLB01000023.1| GENE 27 28781 - 29464 581 227 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0711 NR:ns ## KEGG: EUBREC_0711 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 190 1 190 227 77 22.0 3e-13 
MKSRVNVLEIELNNNTAKEAMQKVMEYMQTEPMNIVEIISANTLVKSKEDENLKDNIAQA DLVLAGEQAVLEVAEVTDRKKLQEADSQLFVKMLLRFFHKNHNRIFLLTESEDEKIHLKE YLQEYYYGIQIVGGEVVPTDSSADDMILNSVNGAEADCVLSMISSPFQEAFVVRNRILLN TRMWLGLGKNTPLPLDKKKGKKPIREFMIKRFLKREVEKERQKRGNV >gi|330401904|gb|ADLB01000023.1| GENE 28 29563 - 29751 331 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|153855921|ref|ZP_01996883.1| ## NR: gi|153855921|ref|ZP_01996883.1| hypothetical protein DORLON_02908 [Dorea longicatena DSM 13814] # 1 62 1 62 72 69 66.0 5e-11 MAKKEKLPIILSEIEPEEKLKYEVAEELGLLDKVLSEGWRSLTSKETGRIGGIITKKKRQ KK >gi|330401904|gb|ADLB01000023.1| GENE 29 30135 - 30317 307 60 aa, chain - ## HITS:1 COG:BH3592 KEGG:ns NR:ns ## COG: BH3592 COG1983 # Protein_GI_number: 15616154 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Bacillus halodurans # 1 59 1 60 65 67 51.0 6e-12 MKKLYRSTVDRKVCGVCGGVAEYFNIDPTLVRLGMVLATCFSFGTMIFGYFVAAVIIPDR >gi|330401904|gb|ADLB01000023.1| GENE 30 30335 - 31168 1108 277 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225570780|ref|ZP_03779803.1| ## NR: gi|225570780|ref|ZP_03779803.1| hypothetical protein CLOHYLEM_06883 [Clostridium hylemonae DSM 15053] # 1 275 1 276 277 150 32.0 8e-35 MKKGWKIFAIICAVLAGLGVVLCIAGFAIGVTDEDVRGAVHRGIGFITDDDDDDNDDFDN RTLQSAKTVKEGEFPNAKGIDVEVNKVKVEVVATDENVIRVKNATDEEAGIVIREEGSTL EIETSRRGNRTDNGGRLTIYVPRNMKLTEANVQVNEGAIDLSGIQSNELELSVERGEITA ADFTVGELSVSCGQGAANIKGTAQADIDLECGVAAINLETKGAQTDYNYEIEVGAGEVNL GGQTFSGLAVEKTINNHAGKEMGIDCATGKVNVTFVQ >gi|330401904|gb|ADLB01000023.1| GENE 31 31168 - 31836 802 222 aa, chain - ## HITS:1 COG:SP0099 KEGG:ns NR:ns ## COG: SP0099 COG4709 # Protein_GI_number: 15900042 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 1 222 1 192 197 74 28.0 1e-13 MNRIEFMTELEKLLKEIPEEERKEAMQYYEDYFADAGLENEQHVISELESPKKVAQTIKA GLRGRGEESSEYRETGYTDTRFEEKETPARREENKPKTNDWLKIILIVAIAIFTLPIIFP 
LALAGLAVIFAIAVSGFAVLGTLVLLAATVMICGFCITICGILEMFHFPALALVVMGGGI LAFVAGLVATVLMVKACMIVFPVLFRGLVNIFRKFVHRKESR >gi|330401904|gb|ADLB01000023.1| GENE 32 31833 - 32156 440 107 aa, chain - ## HITS:1 COG:SP0100 KEGG:ns NR:ns ## COG: SP0100 COG1695 # Protein_GI_number: 15900043 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 1 105 1 103 108 84 39.0 6e-17 MVFNTGAALLDAIVLAVVSREQQGTYGYKITQDVRQALDVSESTLYPVLRRLQKDECLEV YDMQFDGRNRRYYKVTEKGMAQLNLYRIEWKNYSRKISEMFEGGVAV >gi|330401904|gb|ADLB01000023.1| GENE 33 32271 - 35864 2760 1197 aa, chain - ## HITS:1 COG:CAC2262 KEGG:ns NR:ns ## COG: CAC2262 COG1074 # Protein_GI_number: 15895530 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Clostridium acetobutylicum # 4 1197 7 1251 1252 726 37.0 0 MGVNWTPEQEKVISLRNRNILVSAAAGSGKTAVLVERIITMLTKDEPPINVDELLIVTFT EAAASEMKERILSAIEKKLEENPDNVHLQKQSTLIHSAMITTIHSFCLSVIREYFHTIDL DPSFRIGEEGELKLLQKEVLQELLEEQYEKADKKFLSFVERFATGRDDKKLEEFILQLYT FSGSYPNGEKWLDSCIQPYDVKDVVDLENTDYCKRAVENIKRYGLDAVSILESGISACEE PDGPYMYKETLEKEAEGWKSFLASDSLSSFYEKRNCPTGERLKPNRDKSVSEELSAFVKE TRKQTKDLLTDIVKTYLFQPIEEMAVDMAKTKETMEELVFLVKEFGCLYTEAKRSRNLID FGDMEHLALQILTVEEGDAILPSPVAREYQKRFKEVMIDEYQDSNLTQEAILTSVSTVGC GKNNIFMVGDVKQSIYRFRLSRPELFMEKFDTYDIEDSTKQRIDLHKNFRSRREVLDSTN FIFEQIMTKGLGKVEYDDKAALYVGADYKEKEGNETEVLVIDVDMDGESEEQVEETAREL EARAVARRIKELMAQHRVQDKQTGEFRQIRYRDIVILTRSLKGWTDVFARILNREGIPTY TGSKEGYFETQEIQTVLNYLRVLDNPNQDIPLVSVLTSGMGKFSGEELAEIRTLDKEVPF YRCVRRYIKEGKNTLLANRLEKLFAQIEYFREMIPYTAIHELLWRVIEETGYGYETAVMA GGEQRKANLDMLLEKAKAFEGTSYKGLFNFVRYMEQLKKYDVDYGEANIADEQADAVRLM SIHKSKGLEFPIVFVSGMNKRFNKQDITGEIVTHAELGIGLDRVDLEKRMKSPTFLKKTM QREVMLENIGEELRVLYVALTRAKEKLIMTGTLSQVEKKFMSYAVLENREEKQLPFTMLS KAGSYFDWVLPALLRYPEQRRVGKVSVINLEELVQKEVQEETGAEYTKEKLQNWDTEQIF DSDFKERLEKQFSYRYPYENQNDLKLKFTVSELKKRKNLAEESGELLIREEEIIPILPKV 
LTEEEGIKGVLRGTTYHKVLELLDFTKEYTYESLKEYLEHLTEEGRISEEMKKCVRIKDL LTFLRTDIAERMKKCSKCGKLFKEQPFVLGVPAKEIYPEEKIEETLLIQGIIDVWLEEED GLVVVDYKTDRVKNGLELKAKYSGQLNYYGEALERLTGKKVKEKIIYSLFLQKEISI >gi|330401904|gb|ADLB01000023.1| GENE 34 35867 - 39199 2578 1110 aa, chain - ## HITS:1 COG:CAC2263 KEGG:ns NR:ns ## COG: CAC2263 COG3857 # Protein_GI_number: 15895531 # Func_class: L Replication, recombination and repair # Function: ATP-dependent nuclease, subunit B # Organism: Clostridium acetobutylicum # 1 1110 1 1143 1153 617 31.0 1e-176 MSLQFIFGSSGAGKSHYLHRYIVEESIKNPAENYFVIVPEQFTMQTQKDLVMAHERKGIM NIDVLSFGRLAHRIFEEVGGNQRIILDDTGKSLILRKIAGDNEDELKVLRGNLKKHGYIG EMKSVLSEFAQYNIDVDTLDKMADSVEEGTYFQWKLKDIAKVYEGFSRYLSDKYITNEEL LDVLCHVIGKSDILKDSVIAFDGFTGFTPVQDKVMKELLHVCKKVVVTVTIDERENPYTF THPYQLFALSKRLVTGLTALAKEERVEIVEPVCLPQNPAYRFRENPPLAFLEKHLFRYTT DSFSGEQESLSLHCAKMPKDETAFVMQEIRHLVRIKGYRYRDIAIIAGNIATYADYMEKM AEVYTIPIFMDYKRSVLLNSFVEYIRSLLAMAEQNLTYESVFRFLRTGLTGFSREDIDIL ENYVIAMGIKGYKKWNEKWVRTYTGLSEEELEKVNEIRSEFVKMIQDVMSVLKKRRKTVT EITLTLHSFFMQEKLQEKLVKYEETFQENNEPALAKEYAQIYRIVIEIFDKFIELLGDEY LPLKEYCALLDAGFEEAKVGVIPPGIDEVIVGDMERTRLKEVKVLFFVGANDVFIPGNQA GNGLLSEQDRKIFEARKIPLAPNGKEKIYIQKFYLYMNLTKPSEKVYVSYSKVSGDGKGI RPAYFLQDLKRLYPMLHVNEEEEKKFTELELTKKAGITYIISGLQRRREEITDEWLELYT SYRKDEDWEGIIDNLIEAGFYKMPEDRLSKQMAEKLYGEKIEGSVTRLERFAACAFAHFL SYGLRLKERKEYRFEAVDLGNIFHRAIEIFSRNLVKSGYTWTTLPKEVAEELIEESVETS ITDYGNTVLYSSARNEYMITRIKRMMKRTVWALAKQLEKGDFVPSGYEVSFGKGKIDRID IYETEDRVYVKVIDYKTGTKTFSLASLYHGLQLQLAMYMNMAVELERRKQPEKKVVPAGL FYYQMKDPIIDKVEEDKWEETVLSKLRLDGIVNANEEVINHLDREFSGNSLAIPVGKNKN GSLSKTSKAFSEEEFSIISRYVEKEVKDIEGRMQEGEIEVSPYEMGTSYGCDFCQFKGVC HFDEKIEGCEYRKLGRLSDEEVLKKMMEEL >gi|330401904|gb|ADLB01000023.1| GENE 35 39210 - 39770 351 186 aa, chain - ## HITS:1 COG:NMB0747 KEGG:ns NR:ns ## COG: NMB0747 COG0500 # Protein_GI_number: 15676645 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Neisseria 
meningitidis MC58 # 6 186 6 185 188 147 40.0 1e-35 MKKYQITEWCHHFIKNHVKEGETCVDATAGNGNDTLLLAELVGEKGKVYAFDIQEQALRR TKERLEEKKLDSRAELILQSHEEIGEIVKEEVSCIVFNFGYLPGGDHSLSTKKESSIKAI ETGLSLLKKGGLMSLCIYSGGDSGFEEKDGILQYLRTLDSKKYLVIVSEYYNRPNNPPIP AMIIKL >gi|330401904|gb|ADLB01000023.1| GENE 36 39830 - 41077 653 415 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149914878|ref|ZP_01903407.1| 30S ribosomal protein S2 [Roseobacter sp. AzwK-3b] # 26 399 26 407 425 256 40 3e-67 MAEMIELSKEIEKVILVGVSVEENDDTEKSLDELEELARTAGAVTVGRVIQKREQIHPGT YVGKGKIDEIKELLWETEATGIICDDELSPAQLGNLQDALDTKVMDRTLIILDIFAERAS TSEGKIQVELAQLKYRQSRLVGLGKSLSRLGGGIGTRGPGEKKLEMDRRLIKGRIAQLNR ELKEVKRHREVTREQRNRNHLPVVAIVGYTNAGKSTLLNKLTGASVLEEDKLFATLDPTT RGLKLQSKQEILLTDTVGFIRKLPHHLIEAFKSTLEEAKYADIILHVVDASNPQLDEQMH IVYETLQQLEVVNKPIITAFNKQDKADGEMIIRDFKADYIVKISAKTGEGLSGLLETIEE VLRQQKVLVERIYSYAEAGKIQLIRKYGELLTEEYREEGIFVKAYVPVEIYGKIQ >gi|330401904|gb|ADLB01000023.1| GENE 37 41082 - 41609 690 175 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2423 NR:ns ## KEGG: EUBREC_2423 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 37 175 187 320 320 71 33.0 1e-11 MIKKRKTAILLCFVCTVLMTGCAGKVDKGIEHLEKKEYKEAIKSFDKAIDGGEQVAEAYH GQGIAYWELKEYDKAKEAMLNYLKEKGKPTATVYQIFGDCEMKSGNYEEALSYYLKGMSS DGLNEKQLQEMSYNEIVAYEKLNDWSSAKAKISAYVEKYPDDEKAKKEAEFLETR >gi|330401904|gb|ADLB01000023.1| GENE 38 41621 - 41980 399 119 aa, chain - ## HITS:1 COG:CAC2242 KEGG:ns NR:ns ## COG: CAC2242 COG0640 # Protein_GI_number: 15895510 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 7 119 9 121 122 127 55.0 6e-30 MGKLEVECCGSTHIHTKAIHQVEEHMPKEETLTDLADFFKVFGDATRVKILYVLLQSEMC VCDLAEVLQMTQSAISHQLRVLKQVKLVKNRREGKTVFYSLADGHIQTILNQGMEHISE >gi|330401904|gb|ADLB01000023.1| GENE 39 42151 - 44610 1873 819 aa, chain + ## HITS:1 COG:lin2791 KEGG:ns NR:ns ## COG: lin2791 COG1409 # Protein_GI_number: 16801852 # Func_class: R General function prediction only # Function: Predicted 
phosphohydrolases # Organism: Listeria innocua # 63 362 35 317 443 132 31.0 2e-30 MLKFGKFSYKEILICYNPVCKHQNEHERKAKMKMNMKIGAVTLACAITIGSTPLSAMAAE VPQKKEPLKIGVMSDTHYFSKSLYGDCEDFTTAMNSDRKMLKESDAILTGTLNQLVKDEP DVVMISGDLTKDGEQVNHEAVAEKLSEAKDALKKKGVDTNFFVINGNHDINNPHGKDFSS KTAQDADRTTVEEFREIYKEFGYGENTVQYNPDSNRGGSLSYVTQLAEGYTLIAVDTGKY SSDQTDSKEDLQETGGVISPKLLDWVTAQAEKAKAKGDTVMVVQHHGVIPHFEQEQTLMA DYLVDNWEEVREAYADAGISYVFTGHMHANDIASYTSKNGNTLYDIETGSLVTYPSLFRS ITVQNGTDKTKDGNTLTTKMETPGTISYEDFDTGNVQKIENLTEYGKKLTLSNEVIRTMI TEGLLSPMIDFTLANGGSRALVADLLQVTPEQTSRTLVEMLTQLLPTTKENGLPLSVSGF NFRIYYDAAEKCIRISQDTSKTISAKQEGVLEIPLENGETISITLPETFRKTLAEQIQTA AMTEEKAATIELFVSNEKLSNFFDQLFADVDNHLLGDKDALFSIVETLVNRILDSKVDDT HNVFDLVNYVYQLHLAGNESCDAWAEAAIQKIQQGNLLPDILKESIKATQPTIKNVLSKI NMNLETVLDKGNNSLTTNLAYGVITGMIKNAGDIVDMVDLSTLLPESILKEINTLAYNAA YTMSHDENYQEDLDTSILMEGKTSWETPEVPETPETPEIPEIPETPQKPQTQKPVQHQQN VATKKPAQTVKTGDSSKISLTLLMLTFSVGAMGLIRKKR >gi|330401904|gb|ADLB01000023.1| GENE 40 44683 - 46920 2384 745 aa, chain - ## HITS:1 COG:CAC2241 KEGG:ns NR:ns ## COG: CAC2241 COG2217 # Protein_GI_number: 15895509 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Clostridium acetobutylicum # 47 744 1 699 699 569 46.0 1e-162 MSGHKTEKCNCGHDHHDHHEHHECCGGHDHHECCGHHHHDVEEKKELTSHAKKTIYTIEN IDCANCASKIEKKINELPEVEEAVLTFATKQLQVASDSEKDLLPLLQKACDSVEDGVVIK EKKRSVKALREEKKEENKKGEIISIGIGAALFIAGVICNGKMEIVPIVLFVLSYLILGGG IVLTAGKNLLKGRVFDENFLMSLATLSAFAIGEFSEAVGVMLFYRIGELFEHIAVERSRS QIMEAVDLRPEVVNRMTGEKVEVIPSEDAQVGDIVLVRAGDRIPLDSVIVEGESQIDTSA VTGEPVPVKVKVGEKVTSGCVNLTGVLKVKVEKVLEDSMVTRILNSVENAASSKPKIDRF ITKFARVYTPFVVGLALATAILPSIFTGNWEHWIYTAITFLVISCPCALVLSVPLAFFSG IGAGSKRGILFKGGVALEALKDVKQVVVDKTGTITEGNFVLQKCVAVEGTEEELLRLAGN CELISTHPIGNSIVTAGKEKGLKLERPQKAEEISGKGIRASLKEGEILCGNRTLMEENKI DLQQYEKKSYGTEVLVALNGKFIGYLVISDTIKEDAISAVRQIKGEKVDITMLTGDAKKS ADAIAKEVGIDNVYAQLLPQDKLEKLQQIRKEKGAVMFVGDGINDAPVLAGADVGAAMGS GADVAIEVADVVFMTSSVEAIPTSIRIAKQTGKIAVQNVVFALAIKVLVMILGLLGFANM 
WLAVFADTGVAMLCVLNAIRILYKK >gi|330401904|gb|ADLB01000023.1| GENE 41 47077 - 47523 373 148 aa, chain + ## HITS:1 COG:L50174 KEGG:ns NR:ns ## COG: L50174 COG0454 # Protein_GI_number: 15672030 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 1 147 1 146 149 107 38.0 8e-24 MILRKYNSTDCEQLAELFYDTVHTVNAEDYTEEQLNVWATGQIDLEKWNASFEKNHTVIA VEEDKIVGFGDMDKTGYLDRLYVHKDYQRMGIASSICDELEKHFSSSFTTHASITARPFF EKRGYTVVKEQQVERGGILLTNFVMKKN >gi|330401904|gb|ADLB01000023.1| GENE 42 47543 - 48217 826 224 aa, chain - ## HITS:1 COG:BH3850 KEGG:ns NR:ns ## COG: BH3850 COG0692 # Protein_GI_number: 15616412 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Bacillus halodurans # 1 221 1 221 224 285 60.0 5e-77 MGAINNDWLDALKDEFRKPYYAELHKKVLEEYHTHLIFPPADDIFNAFHLTPLKNVKVVI LGQDPYHNVNQAHGLCFSVKPEVEIPPSLVNIYKELHDDLGCVIPNHGYLTKWAEQGVLM LNTVLTVRAHQANSHRGIGWEEFTDAAIRAVNTQDRPIVYILWGKPAQAKKSMLTNPKHL ILEAPHPSPLSAYRGFFGSKPFSKTNAFLEEHGVEPIDWQIESR >gi|330401904|gb|ADLB01000023.1| GENE 43 48242 - 49717 1268 491 aa, chain - ## HITS:1 COG:CAC2434 KEGG:ns NR:ns ## COG: CAC2434 COG0642 # Protein_GI_number: 15895699 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 1 490 3 491 492 252 31.0 1e-66 MKYSIKRQMATVFIGIVMFILLAVFVMNSRFLEEYYVQNKMQVLTNIYQSLNKLMEENKE MDSESLNKTLGPLVEKSNVSLIVTDATGSIRIKMRTDSENEELKTRLVGYRFNINQEEST LLKSSDNYELRKAKDPRNKMEYLEMWGNFSNGNVFLFRSPLESIRESVMIFNRFLMYIGV GVILISIVIIWYFSKKITDPILELAMLSEKMADLDFDAKYTSGGKNEIGILGCNFNKMSQ RLETTISELKSANNELQKDIEKKEQIETMRTEFLGNVSHELKTPIALIQGYAEGLKEGIS DDVESREFYCDVIMDEANKMNQMVRNLLTLNQLEFGKEEISFERFDIVELVRGVIQSCDI LIQQKGVKINFIMEESVYVWADEFKAEQVIRNYISNALNHVAGENIIEIKIVKKEETVRV SVFNTGSPIPEEDIGHIWDKFYKVDKARTREYGGNGIGLSIVKAIMESFHKSYGVKNYDN GVEFFMELDLK >gi|330401904|gb|ADLB01000023.1| GENE 44 49717 - 50394 633 225 aa, chain - 
## HITS:1 COG:CAC2435 KEGG:ns NR:ns ## COG: CAC2435 COG0745 # Protein_GI_number: 15895700 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 224 5 222 224 230 52.0 1e-60 MERLKILIVDDESRMRKLVKDFLERDGFTVLEAGDGLEAMDVFYDNKDIALLILDVMMPK MDGWQVCREVRQVSQVPIIMLTARADEKDELQGFDLGVDEYIAKPFSPKILVARVEAILR RTNVLFSEESIDAGGIVIDKTAHLVTIDNQPIELSYKEFELLSYFVENQGIALSREKILN NVWNYDYFGDARTIDTHVKKLRSKLGDKGDYIKTIWGMGYKFEVS >gi|330401904|gb|ADLB01000023.1| GENE 45 50460 - 51353 501 297 aa, chain - ## HITS:1 COG:CAC0946 KEGG:ns NR:ns ## COG: CAC0946 COG2333 # Protein_GI_number: 15894233 # Func_class: R General function prediction only # Function: Predicted hydrolase (metallo-beta-lactamase superfamily) # Organism: Clostridium acetobutylicum # 7 292 5 288 320 220 38.0 2e-57 MNRNRRIRRISIIITIFCLTVIFFRFQEKWETKKVLQDASGELKVHFMDTGQSESILIQS GGKNMLIDGGTNATGGKVVEYLRKKGITYLDYVVGTHGHEDHIGGIDAVLYNFEVGELFL PKQTYGTHTYRDIRETAQAKQIPITVPKWKEVRTLGKVKMMFLAPNPNKIYEDVNDSSLI LRISNGKHSFLFCGDISKTLEKELSEKDVYLKSDVLKLNHHGSSDTNSMKFLKAVSPEYA VVCCAEGNPFGHPHKSVLKRVKKRKIKLFRTDEQGTIILTSDGKKLMSNVKPIRENL >gi|330401904|gb|ADLB01000023.1| GENE 46 51485 - 52501 1275 338 aa, chain - ## HITS:1 COG:CAC0892 KEGG:ns NR:ns ## COG: CAC0892 COG2876 # Protein_GI_number: 15894179 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Clostridium acetobutylicum # 1 338 1 337 337 389 58.0 1e-108 MIITCKKNAPQEEVDKVVKSFEQQGLEVNFIQGANYNVFGIVGDTTIVDERKIRANKWIE DVTRIGAPYKKVNRLFHLEDSVVDVAGVKIGGKEKVVVIGGPCSVEGKEEICDLAQKVKH AGADMLRGGAYKPRTSPYAFQGLGTEGIEAMVEARRRTGLPIVSEIMSADKIDEFVENVD LIQVGARNMQNFDLLKALGKIRKPILLKRGLANTIEEWLMSAEYIMAGGNDQVILCERGI RTFEKYTRNTLDLSVVPIIKRKSHLPIIIDPSHATGDWELVESMGLAAVAAGADGLIVEV HDNPECAWSDGAQSLKPEKFKAMIDKCKVVAEVVGRSM >gi|330401904|gb|ADLB01000023.1| GENE 47 52498 - 53502 962 334 aa, 
chain - ## HITS:1 COG:L0060 KEGG:ns NR:ns ## COG: L0060 COG0337 # Protein_GI_number: 15673738 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Lactococcus lactis # 1 330 1 351 356 204 37.0 2e-52 MEMNIQLAEKTSRIYIERDCRLHLEKYISLDCKVMVITDEGVPKQYYEEVLSQCKEGCLF VAKQGEETKSLTVYEQILNRLLELDFSRKDLVIALGGGVIGDLSGFVAGTFKRGMRFASI PTTTLSQIDSSIGGKVAVNMGEVKNVVGTFYHPEAVLIDLNTLSTLPERHYHNGLVEAVK AGLIADEKLFELFEKTENIEEDLEEIIVRALHVKKNVVEQDEKEENLRKILNFGHTIGHA IESIYHLEGYYHGECVGIGMMMILENEKIRERLRRVLCKLNVPLSADYDVEEVIHFIKKD KKANGKYVTVVQVDKVGEAKLLPTEIESLRGNKI >gi|330401904|gb|ADLB01000023.1| GENE 48 53523 - 54284 708 253 aa, chain - ## HITS:1 COG:lin0494 KEGG:ns NR:ns ## COG: lin0494 COG0710 # Protein_GI_number: 16799569 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase # Organism: Listeria innocua # 1 252 1 252 252 221 49.0 1e-57 MNRVRVRQTVIGEGKPKIVVPIVARTEAEVEEQVKEARQQKADMLEWRGDWFDGIFDWER VENVLRQLRKYVGEMPVLFTFRTANEGGEKEISRDEYRRLNIEVAKSKMADLIDVEGVAF ENADSLIEELKELGSKVIVSSHDFNKTPQKNEMIERLCEMQKMGADIVKLAVMPQSVKDV LELLVATEEMNRLYAKQPIVTMSMAGQGVISRLAGELVGSAMTFGTAGKASAPGQIDVKE LDFILTTIHNQLR >gi|330401904|gb|ADLB01000023.1| GENE 49 54407 - 55282 487 291 aa, chain + ## HITS:1 COG:lin0493 KEGG:ns NR:ns ## COG: lin0493 COG0169 # Protein_GI_number: 16799568 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Listeria innocua # 1 287 5 291 291 350 56.0 2e-96 MNRQIDGHTILTGLLGSPVSHSISPLMHNESFRQLGLNYIYLAFDVDVNRLETAVEGLRA LNVRGFNLTMPNKNEMCRLCDKLSPAAKISGAVNTVVNDDGVLTGYTTDGIGYMQSLKEA GHDIIGKKMTMLGAGGASTAILVQAALDGVREISVFNNRSASFHRMEKVIEDLKNVSDCE MHLYDYSDEHILRREISESVLLTNGTSVGMSPHTDASIITDADMFHKNLIVSDVIYNPRE TKLLRMARESGCPTLNGLYMLLFQGAEAFKLWTGKEMPVEKIKEKYFQNGT >gi|330401904|gb|ADLB01000023.1| GENE 50 55312 - 56661 1096 449 aa, chain - ## HITS:1 COG:lin2873 KEGG:ns NR:ns ## COG: lin2873 COG0534 # Protein_GI_number: 16801933 # Func_class: V Defense mechanisms # Function: 
Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 445 1 443 450 203 29.0 6e-52 MNKTKNVMGTKPVFPLLMSMAIPPMVSMLIQSMYNIVDSIFVAQLGEDALTAVSLAFPLQ NLVLSVAVGLGVGLNAGIARNMGAGKQDMVDRTATHGIVFTAIHSLMFILIGIFCIKPFF RLFTDDPDIFKWGCQYTYIVICLSFGSLFQIAIEKMFQATGKMIMPMIMQAVGAIINIIL DPILIFGLFGFPEMRVQGAAVATVIGQISACTLAIILFCKDSGGIHIKFKGFRFDKQITK QLYMVAIPSGIMTSMPSILIGILNGIISAVSGTAVAVLGIYFKLQTFVYMPANGVIQGMR PLISYNYGAGLHHRLRQVIKVSVMVSGAIMIFGTVLFVGIPEQILKMFNANEEMMKMGVE TLRIIGLGFIVSTIGTVYSGVFESLGRGIESLMISLLRQLIIIVPLSIVLLKVMGLTGVW ITFPIAETVSAVVAVFLLRRTMKKMSIGK >gi|330401904|gb|ADLB01000023.1| GENE 51 56765 - 57238 324 157 aa, chain + ## HITS:1 COG:TM1424 KEGG:ns NR:ns ## COG: TM1424 COG1905 # Protein_GI_number: 15644175 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 24 kD subunit # Organism: Thermotoga maritima # 28 156 22 151 164 69 28.0 3e-12 MINNIEDKYLQFKETTEYYQSLQMQNDQETLVQFLRETQDIFGCIPADAKMQIGEIMGMK PSLIDKIISMYPSLTAEKFQTEIIVCSGASCSSKNAQKVLSEIQTLLQIRPGQVTKDGRY KLTAKPCMKQCKKGPNLMIGSTIYHNIDSEKLKTLLL >gi|330401904|gb|ADLB01000023.1| GENE 52 57262 - 57483 234 73 aa, chain + ## HITS:1 COG:no KEGG:Cphy_2061 NR:ns ## KEGG: Cphy_2061 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 2 72 4 74 79 79 57.0 5e-14 MLNPMSLIKYKDMWNKFTANHPKFPHFLKAVSQNALSEGTLIEINVTTPDGKSYSSNLKV KAEDMELLRELRG >gi|330401904|gb|ADLB01000023.1| GENE 53 57507 - 58754 1505 415 aa, chain - ## HITS:1 COG:PA4602 KEGG:ns NR:ns ## COG: PA4602 COG0112 # Protein_GI_number: 15599798 # Func_class: E Amino acid transport and metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Pseudomonas aeruginosa # 9 414 8 415 417 509 60.0 1e-144 MVNEIMERIAAYDAEVGQAVQAECNRQRRNLELIASENIVSEEVMMTMGTVLTNKYAEGY AGKRYYGGCEQVDVIENIAIERAKKLFYCDYANVQPHSGAQANMAVQIAMLKQGDTVMGM NLEHGGHLTHGSPVNFSGMYFHIVPYGVDDNGFIDYDEVERLAVESKPKLIIAGASAYAR EIDFKRFREIADKVGAYLMVDMAHIAGLVAAGLHQSPIPYADVVTTTTHKTLRGPRGGMI 
LANQEAAEKFHLNKAVFPGTQGGPLEHVIAAKAVAFGEALKPEFKEYQEQVVKNAKVLSE ALQRKGFDILTGGTDNHLMLLDLRNLDLSGKELQRRCDEVYITLNKNTVPNDPRSPFVTS GVRIGTPAVTSRGMKEEDMEKIAELVWLTATEFEEKADLIRSEVEKLCGKYPIYK >gi|330401904|gb|ADLB01000023.1| GENE 54 58871 - 59434 326 187 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNWNIFWGIYILAFALACWVCFYIFGIRKWKKAERCSEHIVGRVVGVSPVCYAGIHIPLI EYMVKGKIYKVSGPKFRSGSAVQINTPFDSPTAQMETNLTTRENLPLHLKVKMRKNSSVN MQESPLFRLYPIGTPADVFYNPKKPKESFVQRYEGASKLLGILLFALALVLTGVGIFVLF GPPIIMS >gi|330401904|gb|ADLB01000023.1| GENE 55 59495 - 60685 1583 396 aa, chain - ## HITS:1 COG:no KEGG:Cphy_2845 NR:ns ## KEGG: Cphy_2845 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 396 1 396 396 669 81.0 0 MERKVGTISRGIRCPIIREGDNLADIVVNSVLDAASSEGFELRDRDVISLTESIVARAQG NYAPISAIASDVKAKLGGETIGIIFPILSRNRFSICLKGMAMGAKKVVLMLSYPSDEVGN ELVSLDKLDEAGINPYSDVLTLEKYRELFGENKHQFTGVDYVEYYGELIREAGAEVEIIF ANNPKVILDYTKNVIHCDIHTRARTKRILKANGAEKVFGMDEILDASVDGSGFNEKYGLL GSNKSTEESVKLFPRECFDLVEEIQKSILEKTGKHVEVMVYGDGAFKDPQGKIWELADPV VSPAFTDGLRGTPNEVKLKYLADNDFKDLSGEALKEAISKRIKEKDDNLVGNMVSQGTTP RQLTDLIGSLCDLTSGSGDKGTPVVLVQGYFDNYTN >gi|330401904|gb|ADLB01000023.1| GENE 56 60704 - 60769 79 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYNNKILYYNSLVAKTYYIIL >gi|330401904|gb|ADLB01000023.1| GENE 57 60756 - 61145 393 129 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0207 NR:ns ## KEGG: EUBREC_0207 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 13 124 16 127 127 66 28.0 4e-10 MRETVLLINFQDAKQLREIKMILLSLKIKMKTIEKKDYLQTIGCLAGMKDMEETQEIYEG EELEKEMMIFSNLRESQLEQILYRIRRNGVRKVDYKAILTDTNKDWNILQLYEELASEHE KMKKNNVQQ >gi|330401904|gb|ADLB01000023.1| GENE 58 61153 - 61893 919 246 aa, chain - ## HITS:1 COG:CAC0279 KEGG:ns NR:ns ## COG: CAC0279 COG0760 # Protein_GI_number: 15893571 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Clostridium acetobutylicum # 1 
243 1 242 247 182 46.0 5e-46 MSEKVLATVAGQPITEEELQAFLNNVPREQQPYINNPQFRDQCLEQLISLHLFAQMGEEM KLEETEEFQQILKNAKKDILAQLAMRETMKGVEVSDEEVKAYYDANSQQFKKGATVSAKH ILTDSEEKCQTILESILNGEKTFEDSAKEFSTCPSGTRGGDLGQFGRGQMVKEFEDVAFE AEIGEVKGPVKTQFGYHLIKVENRTEESVAAFDEVKETIRRSLVQQKQNAKYMEQVNVLK EKYLEK >gi|330401904|gb|ADLB01000023.1| GENE 59 61920 - 62552 211 210 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229845805|ref|ZP_04465917.1| 50S ribosomal protein L31 [Haemophilus influenzae 7P49H1] # 53 206 58 213 378 85 33 6e-16 MRKKELALEVIERLKKEYPVADCTLDYDEAWKLLVGVRLAAQCTDERVNIVVEKLYAKFP DVESLANAPVEEIEEIVRPCGLGKSKARDISACMKILHEKYNDQIPTTFDEILALPGVGR KSANLIMGDVFGKPAIVTDTHCIRLVNRIGLVNGIKEPKKVEMELWKIIPGEEGSDFCHR LVYHGREVCTARTKPYCDRCCLADICAKKI >gi|330401904|gb|ADLB01000023.1| GENE 60 62693 - 63187 395 164 aa, chain + ## HITS:1 COG:SP0049 KEGG:ns NR:ns ## COG: SP0049 COG4767 # Protein_GI_number: 15899994 # Func_class: V Defense mechanisms # Function: Glycopeptide antibiotics resistance protein # Organism: Streptococcus pneumoniae TIGR4 # 1 150 2 157 169 89 42.0 2e-18 MNSKKLSIGLFSFYCAALTWIILFKFSFSLEELPHLRNINLIPFAESVIVNGKVDFSEIF QNLFAFIPFGVLISTISENKSFLQKLLPAFFVSLTFEVLQFIFSIGASDITDVITNTLGG LIGIGIFMVLRKLFKRKYKTVINIVSLIFAILFSGLMLMLFLAN >gi|330401904|gb|ADLB01000023.1| GENE 61 63234 - 63716 309 160 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15902812|ref|NP_358362.1| hypothetical protein spr0768 [Streptococcus pneumoniae R6] # 11 158 5 152 165 123 41 3e-27 MKNIDIVYPKEKKKLYSLLRTQLIALTDGVPHTLANFSNASALLNQALSDINWVGFYFIK ENQLVLGTFQGKPACIEIQIGSGVCGTAVAKNETIVVENVHDFPGHIACDSASNSEIVIP IHAHGEIIGVLDIDSPIFSRFDEEDKNGLEMLVKVLEEMF >gi|330401904|gb|ADLB01000023.1| GENE 62 63713 - 64855 1218 380 aa, chain - ## HITS:1 COG:STM0525 KEGG:ns NR:ns ## COG: STM0525 COG1929 # Protein_GI_number: 16763905 # Func_class: G Carbohydrate transport and metabolism # Function: Glycerate kinase # Organism: Salmonella typhimurium LT2 # 1 375 1 375 382 272 42.0 6e-73 MNVAVIMDSFKGSMTSREAGNAVREGIKGVYPNANIAVKPLADGGEGTGEILTESLGGKK 
ISVSVSGPLGEKNIAYYGYLETEKVAVIEMAQAAGLTLLEEKNPLQADTFGVGEMILDAV QKGCREIWIGIGGSGTNDAGVGMLQALGYTFLDKEKNPIGRGGKEVGRIEFISDENVSSA LKKCCFRIMCDVENELYGENGATYVFGKQKGVTEEQKEKLDKGIKHFAYQTEKFIGKDYS MKKGAGAGGGIGFAFLSYLQAEFFSGIQFVLEKLSIEEVIQNADIVITGEGRLDAQTGMG KAPLGVAELAKKYHKNVIAFAGTVTEGAKQCNEKGIDAFFPIVREIVTLEEATAKEKAIE NLKDSVEQVFRVIQMKRGDV >gi|330401904|gb|ADLB01000023.1| GENE 63 64861 - 65202 325 113 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIQRKLEQLGVQVQEGIEKLGEETYFIELNTFIKGMDFMRIGQAASRRQWSSAMMTLRRM TQKGKELEIKNMEMLFIRLRKEIGEKNIVAVKNTLALLTQKRVQIIKVLQEEE >gi|330401904|gb|ADLB01000023.1| GENE 64 65340 - 66635 639 431 aa, chain - ## HITS:1 COG:BH0224 KEGG:ns NR:ns ## COG: BH0224 COG3344 # Protein_GI_number: 15612787 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Bacillus halodurans # 31 376 11 326 418 143 30.0 6e-34 MGTKLERIAEISAETKKPVFTSIYHLINAELLKQCHKELDGKKAVGVDKVTKEEYGKNLD RNIEDLVQRLKNKSFKTLPSLRVYIPKGNGKMRPLGIASYEDKIVQMAVKKVLEAIYEPR FLNCMYGFRPNRGCHEAVKEVYQRISYGKISYIVDADIKGFFDHINHDWMMRFLEWHIQD PNLLWLIKKYLKAGIMEDGKFESTEEGSAQGSVMSPMLANIYMHNVLTLWFKFVVKKTLK GDCFLVNFADDFVAGFQYKSEAEWYYSALKERMEQFNLELESSKSRLIEFGRFAETNRLS RGIGKPEAFDFLGFTFYCGKTRKGTFALKLQTSRKKFERKLKEYKLWIYDNRNRPVREII KELNVKLVGHYRYYGVTWNYRRLCAFLHRVQQFLFKAVNRRGCRRAYTWGGFWEMLKCYP LAKPKTYYCLY >gi|330401904|gb|ADLB01000023.1| GENE 65 67126 - 67299 282 57 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|197302777|ref|ZP_03167830.1| ## NR: gi|197302777|ref|ZP_03167830.1| hypothetical protein RUMLAC_01506 [Ruminococcus lactaris ATCC 29176] # 1 57 1 57 57 91 91.0 2e-17 MEKRKIVRGVILAVLLVCTVFLIYSIVTDPLGEHDIMFALAVTAGFIALGQDKKKER >gi|330401904|gb|ADLB01000023.1| GENE 66 67535 - 68425 699 296 aa, chain + ## HITS:1 COG:no KEGG:Acfer_1503 NR:ns ## KEGG: Acfer_1503 # Name: not_defined # Def: tetracycline resistance leader peptide # Organism: A.fermentans # Pathway: not_defined # 3 244 5 242 278 135 37.0 2e-30 MKKQLKNLTIKDNFMFAAVMLDEENCKGFLERALQMKIDRVQVSAEKNIVYHPEYKGVRL 
DVYAKDENNTRYNVEMQVSTQSSLGLRSRYYQSQMDMEMLLSGSEYEELPKSYVIFICDF DPFGERKYKYTFEMECKETAKAKLQDKRKIVFLSTKGKNAEEVLEELVRFLEFVKADIKE SQNDFQDEYVKQLQKFVEHIKKDREMEERFMLFEELLKEERKSGREEGREEGRQEGRQEG RKKMAEITTALLSKYGEIPDALSEKINEEKDMEMLKRWISVATEVSTIEEFISKIS >gi|330401904|gb|ADLB01000023.1| GENE 67 68813 - 69475 712 220 aa, chain - ## HITS:1 COG:MA3621 KEGG:ns NR:ns ## COG: MA3621 COG0613 # Protein_GI_number: 20092421 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Methanosarcina acetivorans str.C2A # 1 195 1 195 224 97 34.0 2e-20 MFIDTHMHECTYSKDSYLKLEDMVKVAKRRGLGGICITDHDDMGLKQYAEEYAKKVGFPI FVGIEFFSLQGDIVAFGIEEYPKERVDAQEFIDLVKRQGGICFSAHPFRNNNRGLEENLR MVKGLDGVEVLNGSTLPEANKKAADYAKELGLVTVGSSDCHIIEKIGFFATYFPEEVRTM KDFIRVFKSGKCRPAYYTENGYQIWDMETPIHKPYLLKDS >gi|330401904|gb|ADLB01000023.1| GENE 68 69488 - 70681 575 397 aa, chain - ## HITS:1 COG:MTH450 KEGG:ns NR:ns ## COG: MTH450 COG0438 # Protein_GI_number: 15678478 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 111 374 123 402 411 85 25.0 2e-16 MKILSITAQKPHSTGSGIYLTELVKNWDTQGNKQAVITGIYEDDRLDFPQEVKVYPVYYQ SEELPFSIFGMSDEMPYESLKYSEMSGEILEKFSRAFQEKISLAVQEMNPDVIVCHHLYL LTALVREWYPDKKIYGVCHGSDLRQFSKNPLQRERIRKNIGKLNGVIVLHREQKQIVQKM FSLPTDRIHIVGVGYNQTVFRRRKTEKLRKKQIAFAGKVTEKKGIFSLLRAVEKLPYDEN ELAIKIAGGIGTKQEFEEICKLAKESRYPITFLGLLSQQELAETFCQSDVFVLPSFYEGL PLVVLEAMSCGCKVVCSDIPGVKEWLSENVRREQAAFVKLPSMKNTDEPNREELPYFERR LADSIYEKLEQREEENPDLSKVSWNGVSLRILELLKK >gi|330401904|gb|ADLB01000023.1| GENE 69 70739 - 71176 450 145 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0424_0946 NR:ns ## KEGG: HMPREF0424_0946 # Name: not_defined # Def: hypothetical protein # Organism: G.vaginalis # Pathway: not_defined # 1 136 5 139 147 114 47.0 1e-24 MKKTVNTSFIYAILAMVGGVFYREFTKFQEFDGKTTLSVLHTHLFMLGMVMFLLVTICIR LFAIDRAKKYSAFFVIYNAGVCLTSVMLTVRGIVQVLGVELSSAQNGMLSGVAGIGHILI GVGMLFFFFSLKESVSTLQTSQKEV 
>gi|330401904|gb|ADLB01000023.1| GENE 70 71433 - 71834 337 133 aa, chain + ## HITS:1 COG:FN1093 KEGG:ns NR:ns ## COG: FN1093 COG1959 # Protein_GI_number: 19704428 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Fusobacterium nucleatum # 1 130 1 127 142 66 31.0 1e-11 MLFTKECDYAIRIMRALSSGEFINVSTICTMENLSSAMTYKITRKLEKAGLLKSLRGANG GYALNRSLKDISLYDVCTSIEPDILLLECMKKDYQCSMNTQNTPCLVHGEFCRIQRILLQ ELQHKNLAELFRK >gi|330401904|gb|ADLB01000023.1| GENE 71 71895 - 73778 1996 627 aa, chain + ## HITS:1 COG:CAC0116 KEGG:ns NR:ns ## COG: CAC0116 COG1151 # Protein_GI_number: 15893412 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 1 625 1 626 629 846 64.0 0 MNQCYGCNTCKSADKPLESYIRSLPMETSHHRVEGQSTKCGFGLQGVCCRLCSNGPCRIT PDAPRGICGANADTIVTRNFLRAVASGSGCYIHIVENTALNLKKTAQIRGELKGIQSLNH LAEIFGITDEDIYVKAEKVADAVLADLYKPEYVSAELIEKIAYAPRVKRWKELGIMPGGA KAEVFKGVVKCSTNLNSDPVDMLTDCLKLGISTGVYGLTLTNLLNDVLLGQPELRLAPVG LRVIDPDYINIMITGHQHTIFVDLQERLISDEAKAKAQAAGAKGFKLVGCTCVGQDLQLR GAHYTEVFDGHAGNNYTSEAILATGAIDVVLSEFNCTLPGIEPICDELKIKQICLDDVAK KANAELMPFSFENREEQSEIIIDKIIEAYKERRGTVPMNLFEDHGNDCTLTGVSEGSLKD FLGGKWTPLIELIVSGEIKGIAGVVGCSNLTAGGHDVLSVELTKELIAKDILVLTAGCSS GGIENCGLMTPEAAKLAGPKLRAVCEKLNIPPVLNFGPCLAIGRLEIVATEIARELGVDL PELPLVLSAAQWLEEQALADGCFGLALGLPLHLGLPPFITGSKTAVKVMTEDMKELTGGQ VIINGDAKETADILEKIIEEKRNHLHI >gi|330401904|gb|ADLB01000023.1| GENE 72 73790 - 74242 271 150 aa, chain + ## HITS:1 COG:TM0396 KEGG:ns NR:ns ## COG: TM0396 COG1142 # Protein_GI_number: 15643162 # Func_class: C Energy production and conversion # Function: Fe-S-cluster-containing hydrogenase components 2 # Organism: Thermotoga maritima # 1 148 5 150 152 91 37.0 6e-19 MRRIMIDASKCDGCKNCTVACIQAHRETPGTIYDLDLTDIRNESRNHIEKMPDGSYRPIF CRHCDLPECVMSCMSGALTKDAKTGLVSYDEKKCGSCFMCVMNCPYGVLKADTATHTKVI KCDFCIQDNSEPNCVKSCPKKAIYVEEVTL >gi|330401904|gb|ADLB01000023.1| GENE 73 74239 - 75465 1254 408 aa, chain + ## HITS:1 
COG:TM0395 KEGG:ns NR:ns ## COG: TM0395 COG0446 # Protein_GI_number: 15643161 # Func_class: R General function prediction only # Function: Uncharacterized NAD(FAD)-dependent dehydrogenases # Organism: Thermotoga maritima # 1 395 1 406 425 221 32.0 3e-57 MMHVIIGVGPAGITAAQTILKTDKDAQITMISTDEFVHSRCMLHRYLSHERNEETLSFVD KDFFSSNRIHWVKGVTVEKIDVDEKKVCLSNHEDYLFDKLLIATGANSFIPPVAQFREAK NVFGLRHLSDAQKIRPLADDADHILIVGSGLVGMDAAYAFLEQGKKVTVVEMADRILPIQ LNETAGKPYRKLFEEHGCKFIVGKKASDTHMDENGYIDAVFLDDGTKIDCDLIIVAAGVR PAVECVEGSKIHVDRFIQVDDTMKTNCNDIYSAGDVTGLSGIWPNAMKQGQVAGLNMCGV ESHYTDRYAMKNTMNFYGLVTLSLGRGVAEEGDTVLEQEDAKNYKRAIIRNGKLDSILLQ GSIDYAGIYQYLIKNEIDVSHFDKDIFHLSFADFYGIDEKGKYYYEVR >gi|330401904|gb|ADLB01000023.1| GENE 74 75804 - 76001 180 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|253579274|ref|ZP_04856544.1| ## NR: gi|253579274|ref|ZP_04856544.1| conserved hypothetical protein [Ruminococcus sp. 5_1_39B_FAA] # 11 65 6 60 60 65 65.0 8e-10 MKQTEKKMVSNCESCMNYEYDEEYEYYVCTKNLDEDEMYRFVKGEFRDCPYYQFGDEYRI VRKQM >gi|330401904|gb|ADLB01000023.1| GENE 75 76041 - 76853 594 270 aa, chain - ## HITS:1 COG:CAC2707 KEGG:ns NR:ns ## COG: CAC2707 COG0122 # Protein_GI_number: 15895964 # Func_class: L Replication, recombination and repair # Function: 3-methyladenine DNA glycosylase/8-oxoguanine DNA glycosylase # Organism: Clostridium acetobutylicum # 4 264 16 284 292 181 36.0 1e-45 MTTKQIPHFNIQQICDSGQCFRMTPVEENKYSVVASGRYLEITQEKENCTFSCGEEEFET FWKNYFDLDRNYEDYLCQINPNDRYLSEAGRFGYGIRILHQDLWEMIVSFLISQQNNIAR IRKCIQNICEEYGERQLNFRGEVYYTFPTAETLEKLGDDELKACNLGYRSKYVVRTAKSV VAGEISLEQVKKMSYRKAKAELLNLFGVGEKVADCICLFALHHLQAFPVDTHIRQVLDEH YKRGFPNRRYNGFQGVMQQYIFYYELFGKK >gi|330401904|gb|ADLB01000023.1| GENE 76 76858 - 77901 964 347 aa, chain - ## HITS:1 COG:CAC3472_1 KEGG:ns NR:ns ## COG: CAC3472_1 COG1396 # Protein_GI_number: 15896711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 1 117 1 117 125 111 46.0 2e-24 
MKIGEVIRKYRKEKQMTQEEMAEYLGVTTPAVNKWENGNSMPDISLLAPIARLLGISTDT LLSYQNELTDKEISRKIELLAEKMKTESYEDVFRWTEQQIREYPNCDKLILVAAQTLDGY RTILNVENHEKYDEKIYQLYLRVADSREYTLAMSAWISLFHYQIGKKNFEEAQRYLDRFP KIEVSAKQMRALLFAKQGKTEDAYRLYEEILYTGGMNMAQSLSGIYTLAVKENDLEKAEY ITEKQGELAKTLEMGRYMEVTAGLELAVYKKDKEKTLEILTEMAHSLQHMGDFRYAPLYS HMQFKDGEMSNIFFMFKKAFENDGEFAFLQDDERYKKLLEEVKEITG >gi|330401904|gb|ADLB01000023.1| GENE 77 78123 - 78473 65 116 aa, chain - ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 112 349 457 472 70 31.0 3e-11 VAVLTERTVDCGHCIKFTKTSTNKTINECGIQVHYHKGTKGLVIQTFDKRLLFSVNNKIY ELEVVPLHEPLSKNFDLQPLKEKPRKRNLPSPKHPWRMSTFLQFKNHKITETMLIC Prediction of potential genes in microbial genomes Time: Tue May 24 21:52:18 2011 Seq name: gi|330401853|gb|ADLB01000024.1| Lachnospiraceae bacterium 2_1_46FAA cont1.24, whole genome shotgun sequence Length of sequence - 10751 bp Number of predicted genes - 15, with homology - 12 Number of transcription units - 9, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 363 142 ## SPCG_1975 putative IS1202 transposase 2 1 Op 2 . - CDS 420 - 539 63 ## - Prom 563 - 622 1.8 + Prom 794 - 853 3.3 3 2 Tu 1 . + CDS 900 - 2234 1502 ## COG1114 Branched-chain amino acid permeases + Prom 2238 - 2297 3.9 4 3 Op 1 . + CDS 2322 - 2417 162 ## 5 3 Op 2 . + CDS 2430 - 2621 130 ## + Term 2664 - 2699 -0.8 + Prom 2652 - 2711 6.8 6 4 Tu 1 . + CDS 2731 - 3078 197 ## CKR_0494 hypothetical protein 7 5 Op 1 . - CDS 2993 - 3412 367 ## COG1468 RecB family exonuclease 8 5 Op 2 2/0.000 - CDS 3342 - 3482 236 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit 9 5 Op 3 . - CDS 3495 - 3917 451 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase 10 5 Op 4 . - CDS 3932 - 4132 98 ## COG0716 Flavodoxins + Prom 3934 - 3993 3.1 11 6 Tu 1 . 
+ CDS 4200 - 4502 224 ## Cbei_3181 LysR family transcriptional regulator + Term 4508 - 4566 10.1 - Term 4503 - 4546 5.5 12 7 Op 1 . - CDS 4558 - 8223 2614 ## Hlac_1576 hypothetical protein 13 7 Op 2 . - CDS 8227 - 9138 767 ## Cphy_0967 hypothetical protein - Prom 9272 - 9331 6.5 + Prom 9274 - 9333 5.4 14 8 Tu 1 . + CDS 9365 - 10255 1009 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) + Term 10288 - 10340 12.5 + Prom 10291 - 10350 6.0 15 9 Tu 1 . + CDS 10402 - 10750 131 ## COG3464 Transposase and inactivated derivatives Predicted protein(s) >gi|330401853|gb|ADLB01000024.1| GENE 1 3 - 363 142 120 aa, chain - ## HITS:1 COG:no KEGG:SPCG_1975 NR:ns ## KEGG: SPCG_1975 # Name: not_defined # Def: putative IS1202 transposase # Organism: S.pneumoniae_CGSP14 # Pathway: not_defined # 1 119 8 129 423 80 40.0 1e-14 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKAAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYACELLAKNDGIFISPSALTKIMYENFIPSPRTTKT >gi|330401853|gb|ADLB01000024.1| GENE 2 420 - 539 63 39 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MICKKFASGFYKKIIFYKPFLIKSLYITFDYVPSDKYGV >gi|330401853|gb|ADLB01000024.1| GENE 3 900 - 2234 1502 444 aa, chain + ## HITS:1 COG:CAC1610 KEGG:ns NR:ns ## COG: CAC1610 COG1114 # Protein_GI_number: 15894888 # Func_class: E Amino acid transport and metabolism # Function: Branched-chain amino acid permeases # Organism: Clostridium acetobutylicum # 2 433 3 436 440 335 45.0 1e-91 MKLKKRQVLLVSFMLFSLFFGAGNLIFPPFLGQNAGEHTPLAILTFLITAVALPVLGVIV VARFDGLERLSRKVNPKFAIVFTILIYLSIGPGLGIPRAASLPFEMAVAPYLPEGTNLTL SMLAFSFVFFLLAGWLAMTPNKLVERIGKVLTPSLLCLLLFLFFSFLFKGTVDVAPAQES YNANPMVKGFLEGYLTMDTIAALNFGLVIATTIRGLGVKEKKGVMHYTVTTGIFAGTILA AVYLMLAYMGMSSSGVFDIQENGAWTLRCIVHELFGDTGAILLAAIFTLACLTTCVGLIT SISQYFSTLTSKFSYRKWVCVIAGFSFIVCNQGLNMILSISVPVLDAIYPISIMLIVLGL CDTWLKRNPYIYPCTIGAVGVISVIYALEKVGVKFGFVNKLCHSLPLYSLGLGWVVVAVA VMAFCCVVCVFKKVDDTQAIPTAE >gi|330401853|gb|ADLB01000024.1| GENE 4 2322 - 2417 162 31 aa, chain + ## HITS:0 COG:no KEGG:no 
NR:no MMNIQLDDRLKNYMQENQLQNILITSMMCHT >gi|330401853|gb|ADLB01000024.1| GENE 5 2430 - 2621 130 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEISARFVDKNETERLKADNFKSFPHKLGEVLIRRIPEKADDTVHLGLSRFFKKIIVEGI YSA >gi|330401853|gb|ADLB01000024.1| GENE 6 2731 - 3078 197 115 aa, chain + ## HITS:1 COG:no KEGG:CKR_0494 NR:ns ## KEGG: CKR_0494 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 1 94 1 94 94 108 61.0 6e-23 MAREKKPVHRVQMTEGKRNIIHQLLEEYDIRSAEDIQDALKDLLGGTIKEMMEAEMDDHL GYEKSERSDSDDYRNGYKRKRVNSRYGSMEIEVPQSLQPHVFLFPHKTRLPLKFL >gi|330401853|gb|ADLB01000024.1| GENE 7 2993 - 3412 367 139 aa, chain - ## HITS:1 COG:SPy1563 KEGG:ns NR:ns ## COG: SPy1563 COG1468 # Protein_GI_number: 15675455 # Func_class: L Replication, recombination and repair # Function: RecB family exonuclease # Organism: Streptococcus pyogenes M1 GAS # 9 131 32 154 224 117 47.0 6e-27 MYCLVKFGQWEENVHTVVGELMHKKAHDPYLVEKRKDLLVVRSLPIVSREMGVSGECGVV EFHKCEDGVKLHGHRGLFSVYPIEYKKGKPKVTEEDRLQLAAQALCLEEMFSTEISEGAL FYGETRRREVVEIVELQSP >gi|330401853|gb|ADLB01000024.1| GENE 8 3342 - 3482 236 46 aa, chain - ## HITS:1 COG:L178933 KEGG:ns NR:ns ## COG: L178933 COG0599 # Protein_GI_number: 15673518 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Lactococcus lactis # 1 32 1 32 102 59 78.0 1e-09 MVKQTAGRDALNDFAPKFAELNDDVLFGEVWSMGRKCTYCCRRINA >gi|330401853|gb|ADLB01000024.1| GENE 9 3495 - 3917 451 140 aa, chain - ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 40 133 191 284 286 116 55.0 1e-26 MKKVLIISGSPRKNGNSDILCNQFAQGAKSGGHQVEKVFLGEGRKGTFDNPVLIKIGEKY GKTPAQVMLRWNIQRGVIVIPKSTHKERMEENFAVFDFALSDEDMTVIAELDKKESSFFS HQDPDTVEWFAKLVAERKNK >gi|330401853|gb|ADLB01000024.1| GENE 10 3932 - 4132 98 66 aa, chain - 
## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 14 66 120 172 179 71 56.0 4e-13 MATIFLYLPRGRKGKTIKPFCTHEGSGMGSSLHDIKRLCPDAKVEKGLAICGGSVRQAKQ EISEWI >gi|330401853|gb|ADLB01000024.1| GENE 11 4200 - 4502 224 100 aa, chain + ## HITS:1 COG:no KEGG:Cbei_3181 NR:ns ## KEGG: Cbei_3181 # Name: not_defined # Def: LysR family transcriptional regulator # Organism: C.beijerinckii # Pathway: not_defined # 1 97 197 293 297 100 41.0 2e-20 MMVRRGDSGINDFMRNDLEKNHSQIHIEDTPQFYDLSVFNRCAETGNVLLTIECWQDVHP GLVTLPVNWEYSIPYGILYSLNAPEDVLHFIDVVKEITVI >gi|330401853|gb|ADLB01000024.1| GENE 12 4558 - 8223 2614 1221 aa, chain - ## HITS:1 COG:no KEGG:Hlac_1576 NR:ns ## KEGG: Hlac_1576 # Name: not_defined # Def: hypothetical protein # Organism: H.lacusprofundi # Pathway: not_defined # 11 1206 58 1361 1525 560 31.0 1e-157 MELCQYPCQCRIRYFTIEAIQIPIVYNQYGDYDPDGLLYVLEEDSQRIQEEALRRFKQIP PQPYEEVQPLVIRVNCGDIVKVRFRNHLNRRLSIHVQGVAYDVMTSDGTGAGFNRDSTTD NEIVYTWYADTEGVFLFHDMADAGSSEKATNIHGLFGAVIVEPPEAKWFDPQTGEEIKSG LMADIYQPGKPAFREYSVFFHDELEILDKNGDTPMDHRTGLPSSTTAISYRSEPMRNRMP LTHDPADSGEEISMSSWVYGDPAPPILRVYVGDPAKIRLIHGGIKETHVFHLHNHQWRLE AENPVSTIIDSITISPQECYTLDILYGAGSRNGVIGDVIFHCHLYPHFHEGMWTLWRIHD RLEDGSGKLPDGTLIPPLQPLKDRERPPKKDALHPGYPNFICGEAGMPPRQPPCGVLDSE GNVVVYPTPLEEANFVERAVPGALYTDTCPCHTAGECDDPCENVKVFEIALVQAKLTYNR YGWHEPEGRFFVLKEELERYGGLDAYIRMVEEEKIQVEPLVIRANAGDCIELRTTNLLPE FLEENAFQLKTKTDIVGHHVHLVKFDTITSDGAANGWNNIAGARKYETLVERFFADEELR TVFFHDHLFANAHQLHGMFGALIIEEAGATFHNIRDGEELRFGTQAVIQRRDGTSFREFA LFVHDFTFLFDKEGKPLNPPAVPGSHDDPGVMGINYRAEPLRERLKTGEDPAYIFSSFVH GDPATPVLETYPGDELMIRLIDGAHEEQHAFNITGMSWQKEVADDKSPLVASQTIGVSEA FNLNVKESYQSGDYLYYFGGTDDVWLGLWGIIRAYDKYQKCLKPLCKGKDMIQPLPPCPG KDDVVRRYEIAAIQTELIYNRYGDYDPDGLIFVPLEDADRVLSGKCKPKPLILRANVGDW LEITLHNMWERPVPYFDYPKVPLDLKHKPSNRVSLNPQFLKYDPISDSGINVGYNPKEQT VGIGESKRYLWKAEREYGSCILQSFGDMRNHRYHGLFGAVIVEPAGAEWYENFTKKKNPF 
AEQAVITAPGIESFREYVLFIQNGIRLLDADGNLIQTAAAGSDEPVDAEDTGEKGYNYHS ERFANRLLKDARVWKVFSSKIHGDPATPIWKAYSGDRVIFRTMMPADKPRNVSLTIHGHL WREQPKDALSREIPLQGGISIGNRFDMELKGGAACPGDYLYRSGSFGWDVESGMWGIFRV LKRSIRCKCKEMCGRILNCMK >gi|330401853|gb|ADLB01000024.1| GENE 13 8227 - 9138 767 303 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0967 NR:ns ## KEGG: Cphy_0967 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 298 1 302 305 284 51.0 2e-75 MKEYVTTSLDTHLFFGRIMKEHSLFLLAAFPAKEMEYRNKANWFRDRFEKALERTVQLAN GMVSEDVLCSGEVFTEFTEMAERQTRKLTKIPINVQITQAEKRLRSGHGNCQDKRMEQYI RHLNQNILHLLNGLIAFKEKILQEVLSCNLYTANYPLLIEHIIREAKLYRQIVTTLEEKG CMPSKDLRETEMFWNQIMMEHALFIRGLLDPTECELVGTADTFAGDYCRLLEEAREQDCR TMDTLTRETLETTEKYRDFKAAGTKGITCCDIRSVILPLLADHVLREANHYLRILQTEGK WRG >gi|330401853|gb|ADLB01000024.1| GENE 14 9365 - 10255 1009 296 aa, chain + ## HITS:1 COG:mll0601 KEGG:ns NR:ns ## COG: mll0601 COG0596 # Protein_GI_number: 13470803 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Mesorhizobium loti # 6 282 9 286 301 172 35.0 5e-43 MAKITEGYMPYLGYQTYYRIVGERKNNGKAPLICLHGGPGSTHNYYEVLDNVADDDDRMI VMYDQIGCGNSYLDGHPELWNQKVWLDELEALRKHLGLDECHIIGQSWGGMMQIAYALDY KPQGVKSFIISSGHPSSSLWEREGLRRIKMMPQDMQDAINHALETGDFTGEAYDAAVAEY MDRYCNYWLGDDVPECCKRPKKSGSESYVEGWGPNEFAPTGSLRDFEYIDRLGEIEIPSL ICSGISDLCSPLVAKTMADGIPNSKWILWERARHTCFVDRHDDYCVELIKWMNQYD >gi|330401853|gb|ADLB01000024.1| GENE 15 10402 - 10750 131 116 aa, chain + ## HITS:1 COG:BH1412 KEGG:ns NR:ns ## COG: BH1412 COG3464 # Protein_GI_number: 15613975 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 4 99 2 97 405 79 39.0 2e-15 MHSHYTNKLLNIEDVIIKKIHHADTFLKIYLETNPHEQVCPCCGSTTKRIHDYRYQTIKD LPFQLKHCYLVLRKRRYVCKCGKRFYESYSFLPRYFQRTARLTVFYCLLLCILYKH Prediction of potential genes in microbial genomes Time: Tue May 24 21:53:23 2011 Seq name: 
gi|330401653|gb|ADLB01000025.1| Lachnospiraceae bacterium 2_1_46FAA cont1.25, whole genome shotgun sequence
Length of sequence - 102053 bp
Number of predicted genes - 101, with homology - 89
Number of transcription units - 48, operones - 22 average op.length - 3.4
N Tu/Op Conserved S Start End Score pairs(N/Pv)
1 1 Tu 1 . + CDS 37 - 120 73 ##
2 2 Tu 1 . - CDS 252 - 347 164 ##
- Prom 405 - 464 2.1
- Term 444 - 495 12.1
3 3 Op 1 . - CDS 527 - 1987 837 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
4 3 Op 2 . - CDS 2003 - 3229 977 ## COG0477 Permeases of the major facilitator superfamily
- Prom 3289 - 3348 6.1
+ Prom 3173 - 3232 6.8
5 4 Tu 1 . + CDS 3443 - 3622 164 ## gi|166032256|ref|ZP_02235085.1| hypothetical protein DORFOR_01959
6 5 Tu 1 . - CDS 3537 - 3701 77 ##
- Prom 3777 - 3836 6.9
- Term 4193 - 4251 11.0
7 6 Op 1 . - CDS 4262 - 4492 108 ## gi|268611353|ref|ZP_06145080.1| hypothetical protein RflaF_17869
8 6 Op 2 . - CDS 4452 - 5231 512 ## COG1484 DNA replication protein
9 7 Tu 1 . + CDS 5268 - 5339 87 ##
10 8 Tu 1 . - CDS 5331 - 6779 912 ## COG4584 Transposase and inactivated derivatives
- Prom 6823 - 6882 6.2
- Term 7182 - 7211 1.4
11 9 Tu 1 . - CDS 7336 - 7746 411 ## LSL_1338 DNA-binding protein
- Prom 7797 - 7856 4.7
12 10 Tu 1 . - CDS 7949 - 8545 402 ## ACL_0934 hypothetical protein
- Prom 8615 - 8674 4.1
13 11 Op 1 . - CDS 8718 - 8810 155 ##
14 11 Op 2 . - CDS 8886 - 10331 943 ## HPG27_1439 hypothetical protein
15 11 Op 3 . - CDS 10335 - 10772 266 ## COG3600 Uncharacterized phage-associated protein
16 11 Op 4 . - CDS 10825 - 11241 337 ## CKR_3111 hypothetical protein
- Prom 11383 - 11442 4.9
- Term 11393 - 11431 -0.6
17 12 Tu 1 . - CDS 11455 - 11871 356 ## EUBREC_0392 hypothetical protein
- Prom 11978 - 12037 6.2
+ Prom 11682 - 11741 2.4
18 13 Tu 1 . + CDS 11924 - 12007 57 ##
+ Term 12029 - 12075 4.5
19 14 Op 1 .
- CDS 12073 - 13614 1868 ## COG0519 GMP synthase, PP-ATPase domain/subunit
- Prom 13641 - 13700 3.1
20 14 Op 2 . - CDS 13702 - 15063 1258 ## COG0534 Na+-driven multidrug efflux pump
21 14 Op 3 . - CDS 15080 - 16273 798 ## COG1408 Predicted phosphohydrolases
- Prom 16300 - 16359 6.8
- Term 16338 - 16378 8.1
22 15 Tu 1 . - CDS 16402 - 18042 1239 ## COG1151 6Fe-6S prismane cluster-containing protein
- Prom 18133 - 18192 11.6
- Term 18130 - 18184 8.1
23 16 Op 1 15/0.000 - CDS 18203 - 19012 230 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34
24 16 Op 2 34/0.000 - CDS 19015 - 19869 388 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P
25 16 Op 3 . - CDS 19857 - 20612 709 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters
26 16 Op 4 . - CDS 20613 - 21236 763 ## FMG_1446 hypothetical protein
27 16 Op 5 . - CDS 21249 - 22211 1132 ## COG1957 Inosine-uridine nucleoside N-ribohydrolase
28 16 Op 6 2/0.250 - CDS 22230 - 23129 1079 ## COG0524 Sugar kinases, ribokinase family
29 16 Op 7 . - CDS 23136 - 25046 1629 ## COG0524 Sugar kinases, ribokinase family
- Prom 25212 - 25271 7.9
- Term 25229 - 25273 9.0
30 17 Tu 1 . - CDS 25284 - 25535 174 ## Cphy_0210 sporulation transcriptional regulator SpoIIID
- Prom 25599 - 25658 5.5
31 18 Tu 1 . - CDS 25706 - 26143 377 ## gi|291086876|ref|ZP_06571709.1| putative lipoprotein
- Prom 26164 - 26223 1.6
32 19 Op 1 . - CDS 26244 - 27248 901 ## EUBREC_1720 hypothetical protein
33 19 Op 2 . - CDS 27302 - 27622 384 ## EUBREC_3436 hypothetical protein
34 19 Op 3 . - CDS 27619 - 27828 71 ##
35 19 Op 4 . - CDS 27770 - 28405 477 ## gi|266623894|ref|ZP_06116829.1| conserved hypothetical protein
36 19 Op 5 . - CDS 28410 - 28853 276 ## COG3279 Response regulator of the LytR/AlgR family
- Prom 28873 - 28932 10.3
- Term 28941 - 28980 9.1
37 20 Tu 1 .
- CDS 28992 - 30104 1252 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor)
- Prom 30172 - 30231 6.9
+ Prom 30157 - 30216 8.3
38 21 Tu 1 . + CDS 30339 - 30782 531 ## Cphy_3788 hypothetical protein
+ Term 30991 - 31036 4.1
- Term 30768 - 30800 4.2
39 22 Op 1 . - CDS 30805 - 31257 516 ## COG4506 Uncharacterized protein conserved in bacteria
40 22 Op 2 . - CDS 31266 - 32096 1033 ## COG0796 Glutamate racemase
- Prom 32129 - 32188 6.6
- Term 32161 - 32204 8.7
41 23 Op 1 3/0.250 - CDS 32217 - 33191 1223 ## COG0205 6-phosphofructokinase
42 23 Op 2 . - CDS 33268 - 36741 3269 ## COG0587 DNA polymerase III, alpha subunit
43 23 Op 3 . - CDS 36814 - 37053 287 ## gi|154503520|ref|ZP_02040580.1| hypothetical protein RUMGNA_01344
44 23 Op 4 . - CDS 37101 - 37928 311 ## PROTEIN SUPPORTED gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains
45 23 Op 5 . - CDS 37928 - 38641 676 ## COG3884 Acyl-ACP thioesterase
46 23 Op 6 . - CDS 38641 - 39816 1144 ## COG0053 Predicted Co/Zn/Cd cation transporters
47 23 Op 7 . - CDS 39836 - 40468 673 ## Cphy_3792 stage II sporulation protein R
48 23 Op 8 . - CDS 40530 - 41204 955 ## COG1802 Transcriptional regulators
49 23 Op 9 1/0.250 - CDS 41207 - 42082 1044 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase
50 23 Op 10 . - CDS 42127 - 43689 1213 ## COG1388 FOG: LysM repeat
51 23 Op 11 . - CDS 43700 - 44107 273 ## Cphy_3813 hypothetical protein
- Prom 44165 - 44224 5.4
+ Prom 43982 - 44041 7.0
52 24 Op 1 . + CDS 44217 - 45743 853 ## COG1640 4-alpha-glucanotransferase
53 24 Op 2 . + CDS 45758 - 46006 228 ## EUBREC_0547 1-acyl-sn-glycerol-3-phosphate acyltransferase
54 24 Op 3 . + CDS 45963 - 46493 530 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase
+ Prom 46495 - 46554 8.7
55 25 Op 1 .
+ CDS 46616 - 48073 1379 ## COG1488 Nicotinic acid phosphoribosyltransferase
56 25 Op 2 6/0.000 + CDS 48087 - 49139 1059 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes
57 25 Op 3 . + CDS 49154 - 50527 1185 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase
58 25 Op 4 . + CDS 50536 - 51342 689 ## COG0561 Predicted hydrolases of the HAD superfamily
+ Term 51353 - 51398 7.1
- Term 51338 - 51385 7.7
59 26 Tu 1 . - CDS 51429 - 52298 720 ## TDE0948 hypothetical protein
- Prom 52492 - 52551 5.0
+ Prom 52196 - 52255 2.8
60 27 Tu 1 . + CDS 52316 - 52516 137 ##
+ Term 52535 - 52583 4.1
- Term 52523 - 52571 -0.8
61 28 Op 1 . - CDS 52581 - 52904 368 ## gi|197303201|ref|ZP_03168243.1| hypothetical protein RUMLAC_01924
62 28 Op 2 . - CDS 52923 - 53201 344 ## gi|197303200|ref|ZP_03168242.1| hypothetical protein RUMLAC_01923
- Prom 53265 - 53324 2.6
+ Prom 53319 - 53378 7.2
63 29 Op 1 . + CDS 53405 - 54529 728 ## gi|260589015|ref|ZP_05854928.1| hypothetical protein BLAHAN_06100
64 29 Op 2 . + CDS 54417 - 54857 97 ## gi|260589014|ref|ZP_05854927.1| conserved hypothetical protein
+ Term 54867 - 54905 3.5
- Term 54854 - 54892 6.0
65 30 Tu 1 . - CDS 54897 - 56291 615 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21
- Prom 56345 - 56404 3.2
- Term 56367 - 56418 10.1
66 31 Op 1 . - CDS 56431 - 56805 443 ##
67 31 Op 2 . - CDS 56820 - 57824 854 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF
68 31 Op 3 . - CDS 57828 - 59066 1491 ## COG2270 Permeases of the major facilitator superfamily
69 31 Op 4 . - CDS 59084 - 59872 887 ## TDE1263 hypothetical protein
70 31 Op 5 . - CDS 59940 - 61193 1357 ## COG0785 Cytochrome c biogenesis protein
- Prom 61222 - 61281 7.7
- Term 61260 - 61313 11.6
71 32 Op 1 . - CDS 61320 - 64316 2666 ## GYMC10_2894 Ig domain protein group 2 domain protein
- Term 64337 - 64375 2.7
72 32 Op 2 .
- CDS 64378 - 65112 739 ## gi|210616234|ref|ZP_03291014.1| hypothetical protein CLONEX_03233
- Prom 65222 - 65281 6.0
+ Prom 65286 - 65345 11.0
73 33 Tu 1 . + CDS 65386 - 66255 694 ## COG2207 AraC-type DNA-binding domain-containing proteins
+ Term 66259 - 66309 6.0
- Term 66252 - 66292 8.0
74 34 Tu 1 . - CDS 66301 - 68262 2071 ## COG5520 O-Glycosyl hydrolase
- Prom 68378 - 68437 4.4
75 35 Op 1 . - CDS 68439 - 69932 1065 ## Cbei_0747 BNR repeat-containing glycosyl hydrolase
76 35 Op 2 . - CDS 69960 - 70376 313 ## SSUBM407_0294 lipase
- Prom 70402 - 70461 4.6
- Term 70415 - 70463 -0.7
77 36 Tu 1 . - CDS 70655 - 70813 197 ##
- Prom 70865 - 70924 5.0
- Term 71201 - 71246 3.3
78 37 Op 1 . - CDS 71254 - 71532 161 ##
79 37 Op 2 . - CDS 71535 - 71762 203 ## HMPREF0573_11212 TnpV family protein
- Prom 71788 - 71847 8.3
80 38 Op 1 . - CDS 71856 - 74231 1942 ## COG1479 Uncharacterized conserved protein
81 38 Op 2 . - CDS 74279 - 75943 1250 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase
- Term 75963 - 76021 0.6
82 39 Op 1 . - CDS 76030 - 76971 846 ## COG3481 Predicted HD-superfamily hydrolase
83 39 Op 2 . - CDS 77016 - 78296 1394 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain
+ Prom 78339 - 78398 9.1
84 40 Op 1 . + CDS 78438 - 80813 1906 ## COG1193 Mismatch repair ATPase (MutS family)
85 40 Op 2 40/0.000 + CDS 80831 - 81532 756 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain
86 40 Op 3 . + CDS 81534 - 82940 1055 ## COG0642 Signal transduction histidine kinase
+ Term 82942 - 82981 6.0
- Term 82930 - 82967 5.6
87 41 Tu 1 . - CDS 82972 - 84132 1242 ## COG1879 ABC-type sugar transport system, periplasmic component
- Prom 84215 - 84274 8.4
+ Prom 84278 - 84337 11.7
88 42 Op 1 .
+ CDS 84428 - 86539 2558 ## COG1882 Pyruvate-formate lyase
89 42 Op 2 11/0.000 + CDS 86569 - 86811 546 ## COG1882 Pyruvate-formate lyase
+ Term 86838 - 86885 8.7
+ Prom 86923 - 86982 2.6
90 42 Op 3 . + CDS 87077 - 87835 557 ## COG1180 Pyruvate-formate lyase-activating enzyme
+ Prom 87843 - 87902 3.0
91 43 Tu 1 . + CDS 87936 - 88106 359 ## gi|210613377|ref|ZP_03289697.1| hypothetical protein CLONEX_01904
+ Term 88138 - 88175 5.5
+ Prom 88197 - 88256 4.6
92 44 Tu 1 . + CDS 88294 - 88869 658 ## EUBREC_0112 hypothetical protein
+ Term 88874 - 88918 9.2
- Term 88863 - 88903 8.4
93 45 Tu 1 . - CDS 88920 - 94112 6332 ## COG5492 Bacterial surface proteins containing Ig-like domains
- Prom 94133 - 94192 7.6
- Term 94357 - 94404 10.1
94 46 Op 1 16/0.000 - CDS 94405 - 95745 1387 ## COG0305 Replicative DNA helicase
95 46 Op 2 9/0.000 - CDS 95758 - 96204 528 ## PROTEIN SUPPORTED gi|160881875|ref|YP_001560843.1| ribosomal protein L9
96 46 Op 3 . - CDS 96206 - 98248 690 ## PROTEIN SUPPORTED gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein
- Prom 98278 - 98337 12.6
- Term 98306 - 98358 10.0
97 47 Op 1 . - CDS 98367 - 99383 1209 ## COG1087 UDP-glucose 4-epimerase
98 47 Op 2 . - CDS 99424 - 100536 1328 ## EUBREC_0139 hypothetical protein
99 47 Op 3 . - CDS 100536 - 101462 1062 ## EUBREC_0139 hypothetical protein
- Prom 101488 - 101547 6.8
100 47 Op 4 . - CDS 101555 - 101650 92 ##
- Prom 101738 - 101797 7.4
+ Prom 101452 - 101511 7.2
101 48 Tu 1 .
+ CDS 101735 - 102053 185 ## COG3464 Transposase and inactivated derivatives Predicted protein(s) >gi|330401653|gb|ADLB01000025.1| GENE 1 37 - 120 73 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSVRILSIPNSEKTSSIGLKWPNLLD >gi|330401653|gb|ADLB01000025.1| GENE 2 252 - 347 164 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANKPSSASLLKLIGYKKNASSEMLEISDTV >gi|330401653|gb|ADLB01000025.1| GENE 3 527 - 1987 837 486 aa, chain - ## HITS:1 COG:CAC0528 KEGG:ns NR:ns ## COG: CAC0528 COG0488 # Protein_GI_number: 15893818 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Clostridium acetobutylicum # 1 486 1 492 492 608 61.0 1e-174 MSMIRVENLTFSYPSSYDTIFDNVNFQIDTDWKLGFVGRNGRGKTTLLNLLLGKYEYSGK IISSVQFDYFPYPISDKKQMTEDILREVCPLAEEWELMRELSYLDVNADVLWRPFETLSN GEQTKVLLAALFLNEGHFLLIDEPTNHLDATAREKVSEYLKKKKGFILVSHDRRFLDGCV DHILSLNRANVEVQSGNFSSWIANFERQQEFELAQNTRLRKDISRLQKSAKRSAVWSDRV EASKRGAADKGYVGHKAAKMMKRSKAIEERKNQAIEQKSALLKNAETVDTLKIQPLEYHT NMLMSFSDVSVIYDGRQVCKPVTFELQSGEQVIIDGKNGSGKSSLLKLLSNYPIEHTGTV TIGSGLVISYVPQDTSHLKGRLSEFAEKNNLDESLFKAILRKMDFERVQFEKDIKDFSGG QKKKVLIAKSLCERAHLYVWDEPLNFIDVYSRMQIEQLIESFSPTMLLVEHDIAFRDAIA TKIVNL >gi|330401653|gb|ADLB01000025.1| GENE 4 2003 - 3229 977 408 aa, chain - ## HITS:1 COG:CAC2753 KEGG:ns NR:ns ## COG: CAC2753 COG0477 # Protein_GI_number: 15896010 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 1 402 1 406 410 173 27.0 4e-43 MKYIRQYMGMRKELYILFWGRVVTNMGALIWPMMTLILKSKLGYSASQIAGILLVLGIAQ LPCTLIGGKLADRFNKRNLIIICDLVTVVSYFICAFLPMSEKVIPLLALAAIFAQMEWPS YDALVADLSSAEERERAYSLNYLGVNLGLVLAPTIGGFLFANHLSLAFLISSVATFSSTV LIFFFIRDITPVKSQEVFGQYEEVREGHSIWRVLRENKLLILFMLCGGIWTLVYSQFNFL IPLNLEQYYGEQGAVWFGTLTSVNAFVVIVGTPILTRVMTRIRDVERLLIGQILVVIGLV 
AYAVVQNVLVVYFVSMIVFTIGEICETLGRQPYLTRRIPSSHRGRFSSCYTVFAGAFQLY GQQIVGNMADVMPMQKVWLFVVIIGALNSAGYFVLRQRDKKVFSLLYK >gi|330401653|gb|ADLB01000025.1| GENE 5 3443 - 3622 164 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|166032256|ref|ZP_02235085.1| ## NR: gi|166032256|ref|ZP_02235085.1| hypothetical protein DORFOR_01959 [Dorea formicigenerans ATCC 27755] # 1 55 83 138 228 75 71.0 1e-12 MLKEHYAPYSERMKMVIDVLKQYKIDINDFDSIDRLIEEAKLAQVQSDYIAPLKKHSKL >gi|330401653|gb|ADLB01000025.1| GENE 6 3537 - 3701 77 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVNISVWVATSPIFWAIYAIIGIIIAPKVLSVFLKGQCNHFVLAQVLLPRLACQ >gi|330401653|gb|ADLB01000025.1| GENE 7 4262 - 4492 108 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|268611353|ref|ZP_06145080.1| ## NR: gi|268611353|ref|ZP_06145080.1| hypothetical protein RflaF_17869 [Ruminococcus flavefaciens FD-1] # 12 74 2 64 65 88 69.0 1e-16 MAFMLPGKELPMTDYCAFSKDHKCVKFTDYEITRHELEEADELCHGNWIEIQHLREYIDR LQALLIEHGITIPDEY >gi|330401653|gb|ADLB01000025.1| GENE 8 4452 - 5231 512 259 aa, chain - ## HITS:1 COG:SMa0776 KEGG:ns NR:ns ## COG: SMa0776 COG1484 # Protein_GI_number: 16262872 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Sinorhizobium meliloti # 1 239 1 234 245 152 35.0 7e-37 MIRQSTIDKLHDMRLSAMSDAFECQCKDPDTYQGLSFEDRFGMLVDKEWDKRKSTKLQKL IRSAEFRYPNACMEDIEYHPDRNLDKGQMLEFSTCRYISDDHHIILKGASGNGKTYIACA LGIAACRNFIKVRYVRLPDLLNELAVAHGDGTLKKVIKAYQKIDLLILDEFLLSPVSTEQ TRELLEIIEARSVKGSVIFCTQFEPKGWYSRIGNDCDATICEAIIDRIIHNSYEVMLDGR ISMRERHGIHASRKGAAND >gi|330401653|gb|ADLB01000025.1| GENE 9 5268 - 5339 87 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MMPYLELSLLAALETTFATATLS >gi|330401653|gb|ADLB01000025.1| GENE 10 5331 - 6779 912 482 aa, chain - ## HITS:1 COG:AGl6 KEGG:ns NR:ns ## COG: AGl6 COG4584 # Protein_GI_number: 15890093 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 1 462 38 501 530 275 34.0 
2e-73 MTDYREIIRLHSLKFSNVSIANSLCCSRNTVSEVLKLAETHSLEWPIPETLSNKDIEYLF YPNRGNNEGRRLPDYEYVYNELAKPGVTLSLLWAEYCAKCEAEHTIPYQHSQFNDKYHAY AASKKATLRIKRKPGETMEVDWVGDTLKVYDAASCYDIPACIFVAVLPCSLYGYAEAFPD MKSNHWIEAHIHAYSFFGGVTRILVPDNLKTGVIKNTRAELVLNRSYHEMAEYYGTAIIP ARPVKPKDKPNAEGTVKVLETWILAALRNRKFFTFEELNKAIHEKLEEFNAKPFQKKKGS RLSAFLEEEKDFLMPLPASPYETAVWSTATIQPDYLIKIGDCKYSVPYEFIGKKVDIRAT ENSIEVFYHSNRIASHVRRSYSPEPIYVPEHMPENHRKFLEYNTDSFLNWGKSVGHSTLI VVKHFLYMHKVEQQGYKSCASLMKLADRYGTQRLENACIKALSYTPSPSLKKYQYHFEKW SG >gi|330401653|gb|ADLB01000025.1| GENE 11 7336 - 7746 411 136 aa, chain - ## HITS:1 COG:no KEGG:LSL_1338 NR:ns ## KEGG: LSL_1338 # Name: not_defined # Def: DNA-binding protein # Organism: L.salivarius # Pathway: not_defined # 2 129 20 149 203 64 31.0 9e-10 MTQEQLAEKLGVSNKTISKWETGKCMPDYSIIKNLCDELEITVAELFDGETSGEKSVRVY DKEQFLNLLRRIQELEKQKNTLYGVVLIVMGIAMQALSYAVGGSDMKDFISGVLLGMSIA VMLIGVYVVGKSLSGK >gi|330401653|gb|ADLB01000025.1| GENE 12 7949 - 8545 402 198 aa, chain - ## HITS:1 COG:no KEGG:ACL_0934 NR:ns ## KEGG: ACL_0934 # Name: not_defined # Def: hypothetical protein # Organism: A.laidlawii # Pathway: not_defined # 27 192 8 172 177 133 43.0 4e-30 MMKETLKILSFQTAMVMQNEREKYDQTKWIYVPDFYTDYRYILGTVGNNPLITIGINPST AEPEKMDNTMKSVEKIAMGNGFDSFIMFNVYAQRATDPNQMNKEINPMLHKENMQAFQWI LENSGKKPIIWAAWGTNIERRKYLKECLREMINISNRYNAVWCKVGKCSVKGHPHHPLYL KKDSRIEAFDVKTYLTSL >gi|330401653|gb|ADLB01000025.1| GENE 13 8718 - 8810 155 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSFCGAILALKNRTTRNMRGYFEEYGKVVR >gi|330401653|gb|ADLB01000025.1| GENE 14 8886 - 10331 943 481 aa, chain - ## HITS:1 COG:no KEGG:HPG27_1439 NR:ns ## KEGG: HPG27_1439 # Name: not_defined # Def: hypothetical protein # Organism: H.pylori_G27 # Pathway: not_defined # 32 240 16 188 236 72 28.0 5e-11 MNYEQKSNALLELLSQLADDTEYSKKDRGREIQILSQIYSEGFRHSYAKISTKVQAILED DIDKGECLSQNLQMLKKSIEKLTYNKSISMEICNKVRKLCDHVNLEIGRYNLIMNKIETR ISNLQDKQNTGGIGSSAEELNKRITEMENKVSGVVNKAYEATKELEKVDGKLERNSMSSI TTLTIFSAVILAFTGSITFTSGVFSGMSNVSPYRIVFVTAMIGTIVFNLIFMLLFIVGKM 
VGKNICCRCPYYSESLEGVVNVCGSGVCEKKEHIPNFGCIMLHKYPYILFVNTILGVIMY YDMILFFINNSEYIQFVIVSDVLKYLVLFFPFLILLLAYIVCQIKKNILYCRTVNAVSLA IAEEYFDDDDDESIVASVFKKFSETISRLFSSSQDRKKEIEKILENIPDLTEKKKQRYLI KQLKDISKIKVIKGNKNLMRISFSENRYNKILLKKMAADILEGEQYSTDVDEDIVDVVDS E >gi|330401653|gb|ADLB01000025.1| GENE 15 10335 - 10772 266 145 aa, chain - ## HITS:1 COG:Cgl0313 KEGG:ns NR:ns ## COG: Cgl0313 COG3600 # Protein_GI_number: 19551563 # Func_class: S Function unknown # Function: Uncharacterized phage-associated protein # Organism: Corynebacterium glutamicum # 2 118 3 124 148 63 33.0 1e-10 MIYSALNIAKYVVTFCFKRGNPISNLQLQKILYYIQGYSLALRDEEAFDEEIVAWQYGPV IKAVYDFFSIYAAMPIENSFRIEIEDQEFEKMIRVIALDKMNIPVWKLVEQTHNEAPWKY TTELFGMGSVIPKEYIRRYFKEKCI >gi|330401653|gb|ADLB01000025.1| GENE 16 10825 - 11241 337 138 aa, chain - ## HITS:1 COG:no KEGG:CKR_3111 NR:ns ## KEGG: CKR_3111 # Name: not_defined # Def: hypothetical protein # Organism: C.kluyveri_NBRC # Pathway: not_defined # 3 137 26 161 162 117 52.0 1e-25 MDKVVGRLTVFFEQPFWVGVFECISEGKLSVCKVTFGAEPKDYEVYDFVLKNYYRLRFSP AVETDVKETGRNPKRIQREVRKQVQNTGIGTKSQQALKLQQEQLKTERKIISREQRETEK QRQFELKQQKKKEKHRGR >gi|330401653|gb|ADLB01000025.1| GENE 17 11455 - 11871 356 138 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0392 NR:ns ## KEGG: EUBREC_0392 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 130 314 443 449 62 30.0 6e-09 MERFAEKYSADPEEVLPEAGMLETGKTYREKKAKPLIKKIVVVLRSVYRAYLDLSRKFSD MQKSYERALSKVNSLTARVEELWSENKVLGEKLGDLNRVEWALGRDTVETIVQREKSLEE AQRKQNRERKRKIDRGGR >gi|330401653|gb|ADLB01000025.1| GENE 18 11924 - 12007 57 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTYNNGAIDTIQISFPPVILKYKNIVI >gi|330401653|gb|ADLB01000025.1| GENE 19 12073 - 13614 1868 513 aa, chain - ## HITS:1 COG:CAC2700_2 KEGG:ns NR:ns ## COG: CAC2700_2 COG0519 # Protein_GI_number: 15895957 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Clostridium acetobutylicum # 195 513 1 316 316 444 68.0 
1e-124 MNNETVIVLDFGGQYNQLIARRVRECNVYCEVHPYTLSLEKIKEMNPKGIIFTGGPNSVY DEASPRYERAILELGIPVLGICYGSQLMAYLLDGDVKTAPVSEYGKTEVEVDSSSKLFEG VQKSTICWMSHTDYIAAAPEGFKITATTPVCPVAAMENEAEKLYAVQFHPEVLHTQEGTK MLRNFLYNVCECTGDWKMDSFVEKSIEAIREKVGDGKVLCALSGGVDSSVAAVMLSKAVG KQLTCVFVDHGLLRKNEGDEVEAVFGPEGNYDLNFIRVNAQERFYNKLAGKTEPEEKRKI IGEEFIRVFEEEAKKIGAVDFLVQGTIYPDVIESGLGKSAVIKSHHNVGGLPDCVDFKEI IEPLRDLFKDEVRKAGLELGIPEHLVFRQPFPGPGLAVRIIGEVTAEKVKIVQEADAIYR EEIEKAGLSRNIGQYFAALTNMRSVGVMGDFRTYDYAVALRAVLTSDFMTAEAAEIPWEV LSKVTTRIVNEVKHVNRVMYDCTGKPPATIEFE >gi|330401653|gb|ADLB01000025.1| GENE 20 13702 - 15063 1258 453 aa, chain - ## HITS:1 COG:yeeO KEGG:ns NR:ns ## COG: yeeO COG0534 # Protein_GI_number: 16129928 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Escherichia coli K12 # 17 447 92 527 547 182 28.0 1e-45 MRTKVKTDRTHYLFDNKAIIALIIPLIIEQLLAVLVGMADSVMIASVGEAAVSGVSLVDN VMVLLINLFGALATGGSVVAGQYLGKKQEEKADRASNQLIWFITICAIGITILVYLGKSF MLQTVFGEIAPDVRGYANTYLLIVTASIPFIALYNGGAAIFRTMGNAKVTMIVSLIMNAV NVIGNATLIYGFHMGAEGVAIPTLVSRMVAAVLIVILLLNPKQVLHLQKTLKYRPDWKMI KNILGLGIPNGLENSMFQLGKIMVLSLVSTFGTYAIAANAVSNAVAMFQILPGMAISLAV TTVISRCVGAKDYEQVSYYTKKLMTITYACMIVTNVIIYLLLPVIMQIYNLSAKTSQVTE QILIFYSVSCVLIWPIAFTLPTTLRAAGDAKMSMIISILSMWIFRIGFSYLLGKYFGMGV FGVWVAMVIDWVFRAILFVARYIGGKWKEIRTV >gi|330401653|gb|ADLB01000025.1| GENE 21 15080 - 16273 798 397 aa, chain - ## HITS:1 COG:CAC3027 KEGG:ns NR:ns ## COG: CAC3027 COG1408 # Protein_GI_number: 15896279 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 44 395 60 390 392 180 32.0 3e-45 MVTLFLIPIYILFSVYLCRWLFRWLNSCHCWFQNKVTKIVIISAYIFVSSTVIISFLLPV SPVQRVLKQISNYWLGTLVYIFLVVLLADVGRIILKRTKWISNEKLSSGRTFVITGGICI VLIAGISIYGILNARNIRTTAYDVTIEKDAGELEDLKIALVADLHLGYSIGDYHMEQMVE KINKMDADIVLIAGDIFDNDYDALYPPDYLIKTLRGIKSKYGVYACYGNHDIQEKILVGF TFPGGEKKQSDLRMDEFLKKANIKLLRDEAVLIDNAFYLVSRPDYKCPGRGIEQRKTPEQ ITAELDKTKPILVMEHEPKEIEELEEAGADMQLSGHTHDGQIWPGTWLIKCFWKNPYGYM 
KEGNLHSIVTSGIGVYGPAMRVDSKSEICEIHVSFQQ >gi|330401653|gb|ADLB01000025.1| GENE 22 16402 - 18042 1239 546 aa, chain - ## HITS:1 COG:CAC3428 KEGG:ns NR:ns ## COG: CAC3428 COG1151 # Protein_GI_number: 15896669 # Func_class: C Energy production and conversion # Function: 6Fe-6S prismane cluster-containing protein # Organism: Clostridium acetobutylicum # 5 541 3 564 567 786 69.0 0 MEEKMFCYQCQETAGCKGCTMSGVCGKKPDVAAMQDLLVYVTKGISAVTTMLRKEGKNIS AEVNHLITLNLFTTITNANFDKESIEERIRTTLEVKDSLLEELHNPTDLPEAAKWSGLGD WEEKAKTVGVLSTENEDIRSLRELITYGLKGLSAYSKHANVLLHDSEEVDEFLQRALAAT LNDTLSVDDLVALTMETGKYGVSGMALLDKANTETYGNPEITKVNIGVRKNPGILVSGHD LRDLEMLLQQTQGIGVDVYTHSEMLPAHYYPAFKKYQNFVGNYGNAWWKQKEEFESFNGP ILMTTNCIVPPKDSYKDRLYTTGAAGYSGCTHIPGEIGEQKDFSALIEHAKKCAAPNEIE TGEIIGGFAHAQVLALADKVVEAVKSGAIKKFVVMGGCDGRAKSRNYYTEFAKALPRDTV ILTAGCAKYKYNKLPLGDINGIPRILDAGQCNDCYSLAVIALKLKEVFGLDDINDLPIIY NIAWYEQKAVIVLLALLYLGVKNIHLGPTLPAFLSPNVAKVLVENFGIAGIGTVEDDLKL FFSENN >gi|330401653|gb|ADLB01000025.1| GENE 23 18203 - 19012 230 269 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 23 251 20 245 245 93 28 5e-18 MALVEVKNISFQYPNGYLAVDDVSFSIDAGENIAIVGQNGAGKTTTVKMLNGLIKACKGD VLIDGESTQKYTTAQMAKRVGYVFQNPDDQIFNSTVRKEIEYGLKKMKIDPDECERRIKD AAELTGMDKYLEVNPYDLPLSIRKFVTIASVIASNCEVMIFDEPTAGQDLEGLARLSKLN KILTERGKAIVTITHDMEFVAENYERTIVMCQKKVLADGKTKDVFFQKDIMEKAMLKQPV IVRIANQIGMKENTLDIKKVAKFVADNRG >gi|330401653|gb|ADLB01000025.1| GENE 24 19015 - 19869 388 284 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 9 270 135 390 398 154 35 2e-36 MDSIKITDLSYRYPTAEEEVLRHVSLTIKKGELCAIVGANGSGKTTLCNAIRGFVPKFYK GEISGEVLVNGKDVQKEDIGSTALEVGFVFQNPFTQISGIADTVYEELAFGLENMGIDPT EINERIEKMMKLTKIEEFRDRDPYQLSGGQQQRVALASILVMGQDILVIDEPTSQLDPQS TDDVFEMIQLMKNMGKTIVLVEHKMEQIAEYADHVVVMDKGKIVLEGTAKEVFSNPKCME YHTRLPQSTKIAMELMKEGIPFQEIPVTVDETIQILNSTMGKGV >gi|330401653|gb|ADLB01000025.1| GENE 25 19857 
- 20612 709 251 aa, chain - ## HITS:1 COG:CAC3100 KEGG:ns NR:ns ## COG: CAC3100 COG0619 # Protein_GI_number: 15896351 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Clostridium acetobutylicum # 8 227 17 247 267 59 28.0 8e-09 MNKKQKGIKLLNPLTVLYLVILLAVVSALFDYKITIVSVLVMMLLAGISGEGKPYFLLWL KSIFLICVICFVLQSLFIPGEQIIWKVWVFSIKVESVQKAIILCSRILGIGSAILLGGKL IDIKKLMIVLEKKGMSSSVTYVLLSTTNIIPQMSKKMGAILEAQKSRGIETDSNMIVRAK AFFPSVGPLLLNSLVNAEERAITLEARAFSAPCKKTSLKNVEDSSRDKMLRIVFIAATIL AIGGKIVLWIV >gi|330401653|gb|ADLB01000025.1| GENE 26 20613 - 21236 763 207 aa, chain - ## HITS:1 COG:no KEGG:FMG_1446 NR:ns ## KEGG: FMG_1446 # Name: not_defined # Def: hypothetical protein # Organism: F.magna # Pathway: not_defined # 5 203 2 201 207 133 38.0 5e-30 MTNEKKGLKADFNLITILIIPIAIAINFIVGNLVLTLKLPLYLDSIGTFLVAILAGPWVG CLTGVLSIAINSITDPSLFPFAIISGVLGLVIGTMARKGMFIKFSRFIVSSIIVSVIAVA LSVIISYAFFGGFDSSGNSIMIGAMVSAGIPFWPAQIIGNLISEVPDKFISLLVPYLVIR GMSDRYLYKFSNGSVFINARKDKKGSK >gi|330401653|gb|ADLB01000025.1| GENE 27 21249 - 22211 1132 320 aa, chain - ## HITS:1 COG:SSO2243 KEGG:ns NR:ns ## COG: SSO2243 COG1957 # Protein_GI_number: 15899016 # Func_class: F Nucleotide transport and metabolism # Function: Inosine-uridine nucleoside N-ribohydrolase # Organism: Sulfolobus solfataricus # 1 307 1 304 307 194 37.0 3e-49 MQKIIIDTDPGIDDALAILLALAAKDELDVLALTTVNGNVGVEQVTKNAFKILEIAGRTD IPVYMGNGKPLMRENEHCEEFHGDDGMGNLEMPDCHKIPENENAVDFLIRKVREEKGEIT LVPIGPLTNIAEAIQKNSDFVKNVKEVVIMGGAEHGGNMSPHAEFNFWTDPEAAKIVFQA GFEKVTMIGLDATSYVFLSPTLRELLYLINTPISRFIHKITRVYADGHWEVEKKLGCELC DVLTIAYLLDRSVVEKTDAFVDVETQGLCDGASVVYRQKYYPDKMKNCEVAIKADTKKFF ELFFTYLFPEHKESWKEFIF >gi|330401653|gb|ADLB01000025.1| GENE 28 22230 - 23129 1079 299 aa, chain - ## HITS:1 COG:PA1950 KEGG:ns NR:ns ## COG: PA1950 COG0524 # Protein_GI_number: 15597146 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # 
Organism: Pseudomonas aeruginosa # 3 299 4 300 308 254 44.0 1e-67 MKKILIIGSLNMDIVLETPRIPKAGETISGKNIMQAPGGKGANQAYAIGKLGGKVEMIGA VGDDSFGYKLKANLESVGVSTVGVETFSGEPTGQAYIAVDEEGENSIILIAGTNGMVTKD MIKKNLKKIQESDIIIMQLEIPLEVVEYVKKLAVGLGKTVIVDPAPAVPNIPDSFWKGID YIKPNETELEILAGKEMKNLEQLKEGARIMLEKGVKNVVVTLGGDGCLFVSAEKEEFFPA NKVKVVDTTAAGDSFTAGMALALSQGKTCEEAIAFGQKVSAIVVSRKGAQISIPAMEEI >gi|330401653|gb|ADLB01000025.1| GENE 29 23136 - 25046 1629 636 aa, chain - ## HITS:1 COG:TM0960 KEGG:ns NR:ns ## COG: TM0960 COG0524 # Protein_GI_number: 15643720 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Thermotoga maritima # 330 623 5 295 299 179 40.0 1e-44 MNIQDIAKLAGVSASTVSKVMNGKDKDISEETKKKVLKIIEKENYIPYSKFREKAGMKSH LIGLILKTKNRERESIVLHVEKKARENGYGLLISYADTQEDIELCVEEMRQKKAEGILVD SDKCVSAHGFENTCVYLNQTKEFDEKQKVTFYYRLSEAGRLAVERLQREGHEKIACIILN TDKTIRDGYETAMRSMQSPIQPMWIYEGKSIEDVEKYGIRQCLSENVTAVVCGSPEIAYC IYKVMERTRMSIPDTLSIISIGESRLLELVGDGITGVSLPSACMSSDAVEYLVEMIQKEK KIELMRKFSPTLIERNSVTKPAQEKQGEKIVVVGSMNMDVTVEVSRIPVDGETQLAERLY VFTGGKGGNQAVGVGKLGGQVYMIGCLGNDMDGKQLYSTLVENHVHMDGVLFDTELPSGK AYINVDKNGESTIVVYQGANRNLSIEQINRCKYLFQNAKYCLLSLEIPETIAEYTIKFCK RNNTQVILKPSAADKIKEELLKDIAYFIPNEKELHNFVQGKGAIEKKAQILMDKGVENVI VTLGERGCYLRNKEYSLYFEGSGFEAVDTTGGADSFISALAVCLSEGKNIIQSIGFAIYA SGISVTRHGVQPALPDRKAVEIYEDEIRSKYGKGGE >gi|330401653|gb|ADLB01000025.1| GENE 30 25284 - 25535 174 83 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0210 NR:ns ## KEGG: Cphy_0210 # Name: not_defined # Def: sporulation transcriptional regulator SpoIIID # Organism: C.phytofermentans # Pathway: not_defined # 5 82 12 89 96 106 76.0 3e-22 MKDYIEERAVEIATYIIEHQATVRQTAKEFGVSKSTIHKDVTERLQQINPALALQARKVL DTNKSERHIRGGMATREKYLHQH >gi|330401653|gb|ADLB01000025.1| GENE 31 25706 - 26143 377 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291086876|ref|ZP_06571709.1| ## NR: gi|291086876|ref|ZP_06571709.1| putative lipoprotein [Clostridium sp. 
M62/1] # 17 144 235 364 367 70 36.0 2e-11 MIERKEKIGEDFIETPEDEEEGSQELSVKERETQNVAKVEAEAKRRDQTLSDTATQMEMN EYSEQLYKLWDDELNRLWKVLKEELSSTEMAKLLEEQRTWIAEKEKAINEIGEISGGGTA TTMNKNMTGEDLTRKRVYELLEYLP >gi|330401653|gb|ADLB01000025.1| GENE 32 26244 - 27248 901 334 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1720 NR:ns ## KEGG: EUBREC_1720 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 328 20 345 349 326 50.0 1e-87 MPKGTNQKFKLYRLAQIMLEMTDDEHYITMPEIIEALGKYEITADRKSIYNDLRDLEVLG IEVEGESAGKGYHYHVVNRPFELAELKLLVDAIQSSKFITERKTNTLIKKLEKMISKYET MKLQRQVFVSGRIKTMNESIYYTVDAIHNAIAENRKIYFQYYQWNVKKEMELRHSGAVYH ISPWGLSWDNENYYLIGYDSEAEKMKHYRVDKMLHIKLSDEKREGKESFKQLDLADYAKK SFGMFGGKEQQVKLLVDNSLTGVIIDRFGKDVMIIPADKEHFTVNVTVHISNQFLAWVIS LGEGVKIISPDEVVNRLKKEVERLAGQYGVNRRM >gi|330401653|gb|ADLB01000025.1| GENE 33 27302 - 27622 384 106 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3436 NR:ns ## KEGG: EUBREC_3436 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 95 1 95 121 126 67.0 2e-28 MKKIFFTLAGTKHYYGSEFFKPEMRVQLEKEEDNEYDTEAIVVKVEGLGKVGYVANSPFT VLGESMSAGRIYDKIGKKAFGEVVFVMSQGVLCKLDEKSLLDWQQD >gi|330401653|gb|ADLB01000025.1| GENE 34 27619 - 27828 71 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNIAVYPMGLRMRWNFDVSREFLERICIVKRKVFAISIDFVYQLRRWYLYGTVRRLDYKY QKTLGGIER >gi|330401653|gb|ADLB01000025.1| GENE 35 27770 - 28405 477 211 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|266623894|ref|ZP_06116829.1| ## NR: gi|266623894|ref|ZP_06116829.1| conserved hypothetical protein [Clostridium hathewayi DSM 13479] # 47 209 45 203 206 77 28.0 8e-13 MVKKEQIQEVGKSVWIGVILFFSAVLFLMQSGDINHLKLFSDNLLWIIGAYIFGTIAVRF LFSLEICRKGAFLFQGEAIIWWVFLVFTVLSVILPFIPEYMRDIMLVNIFVLILMMLGNY LHMRYISKELNGGAWQKDFVLIEDLKRKPRNEEEFIKFIILYCEKNALELEILQYGEPAK VKMGGRVYTVKIVEYCSLSNGVAYAMEFRCE >gi|330401653|gb|ADLB01000025.1| GENE 36 28410 - 28853 276 147 aa, chain - ## HITS:1 COG:lin0983 KEGG:ns NR:ns ## COG: lin0983 COG3279 # Protein_GI_number: 16800052 # 
Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Listeria innocua # 26 145 24 144 151 64 33.0 6e-11 MKVKISYIEPEKTERAELYVVRGHQSLETLVQVIEKESYKEKVLYVTDKDEKYQISCTHI FFIESIGEKILVHTEKKVFQCKKRLYELEKELPEYFSRISKSVILNLRQVEYYRPQLNGI MKANLYNREEVYISRKYLREIRSKIGG >gi|330401653|gb|ADLB01000025.1| GENE 37 28992 - 30104 1252 370 aa, chain - ## HITS:1 COG:SPy1896 KEGG:ns NR:ns ## COG: SPy1896 COG0544 # Protein_GI_number: 15675709 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Streptococcus pyogenes M1 GAS # 23 336 107 418 427 101 28.0 3e-21 MKKQRVIAAMLTLCLVVSAVGCSKKEISNEYVKISQYKGVEAEKVTPNEVTEDDITNYIN AKMAEQTKDVTDRELRKGDLAQFEYTGKLKSTGKVFDEGTLTLGNGEQYVDGFEEGIYGH KLNETFDLPVKFPEGYGGTEQPELSGADVIFTIKITGIKERAYSELNDEFVKAVSKKSET VEEYKEEVKKEIEKMNKKQAEAELRDNAWNEVMDNAEIVKYPEKRLKEKEKAIEEQLKKL VEQSYGVSYEEFLKQSGQSEDDFKKQVKEMSKESLKPALVAELIAKEEGLEISDKEMKKE MKVFAKENGYPDADTLVELNGEDMVKEAILQNKVMEWAADNCKQVEKKKENKQDEKKEEN KEGNKEKTTK >gi|330401653|gb|ADLB01000025.1| GENE 38 30339 - 30782 531 147 aa, chain + ## HITS:1 COG:no KEGG:Cphy_3788 NR:ns ## KEGG: Cphy_3788 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 7 140 7 140 151 138 55.0 8e-32 MSLAVKILLIIVVVLAIVLVVLYFLGKKAEKRQAEQQEKLDAAAQTVSMLIIDKKRMKLK EAGFPAVVIENTPKYLRRSKVPVVKAKIGPKIMTLMCDASVFPILPIKKEVKVVVSGIYI TEVKGARGGLEVPVKKKGFFSRFRKEK >gi|330401653|gb|ADLB01000025.1| GENE 39 30805 - 31257 516 150 aa, chain - ## HITS:1 COG:CAC2894 KEGG:ns NR:ns ## COG: CAC2894 COG4506 # Protein_GI_number: 15896147 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 18 141 12 134 137 66 33.0 2e-11 MTKEVLINITGIHMDMIEKGEEDEPVAVITPANYFKKNGKHYIIYDEVAEGMPGVTKNTI KITGDNMIEIMKKGLTNTHMVFEKGKNHMTGYGTPFGQLVMGIRTKDLHVSETEEEIGAS 
ILYHLEVNEEAVAECRIKIRITPKSAGIEI >gi|330401653|gb|ADLB01000025.1| GENE 40 31266 - 32096 1033 276 aa, chain - ## HITS:1 COG:NMA2026 KEGG:ns NR:ns ## COG: NMA2026 COG0796 # Protein_GI_number: 15794906 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Neisseria meningitidis Z2491 # 7 273 6 269 270 234 45.0 9e-62 MSIDCRKDAPIGVFDSGVGGLTVVKEIMQNLPNENLVYFGDTARVPYGSKSVDTIIRFSK QIVRFLQTKKVKAIVIACNTVSALALDILRQETDIPIIGVVKPGSKVAAEVTENKRVGII GTEATVKSGVYTKFIQQYKPDIEVLAKACPLFVPLVEEGFVDNRIAREAIDYYLHEMKQS GIDTLILGCTHYPLLYNPIMAYMGEGIRLVNPAYETALELKQLLEEKNLENVSQEAGKVP TYEFYSSDAPVRFKTFANAVLPYSVDWIEHINIEEF >gi|330401653|gb|ADLB01000025.1| GENE 41 32217 - 33191 1223 324 aa, chain - ## HITS:1 COG:CAC0517 KEGG:ns NR:ns ## COG: CAC0517 COG0205 # Protein_GI_number: 15893808 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Clostridium acetobutylicum # 5 323 1 318 319 375 62.0 1e-104 MANHIKTIGVLTSGGDAPGMNAAIRAVVRTALGKGLNVQGIRRGYHGLLNEEIINMTARD VSDIIQRGGTILQTARCSEMRTEEGQRKAAGILKKYGIDGLVVIGGDGSFAGAQKLSNLG INTIGIPGTIDLDIDCTEYTIGFDTAVNTAMEAIDKVRDTSTSHERCSVIEVMGRDAGYL ALWCGIANGAEQILLPEEHDYDEQQIIKNIVENRKRGKKHYIIINAEGIGDSINMAKRIE EATGMETRATILGHMQRGGSPTCKDRVFASIMGAKAVDLLCEGKTNRVVGFSHGEYIDFD IDEALNMKKEIPEYQYKIAKDLSI >gi|330401653|gb|ADLB01000025.1| GENE 42 33268 - 36741 3269 1157 aa, chain - ## HITS:1 COG:CAC0516 KEGG:ns NR:ns ## COG: CAC0516 COG0587 # Protein_GI_number: 15893807 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Clostridium acetobutylicum # 3 1149 11 1167 1167 1167 53.0 0 MKFAHLHVHTEYSLLDGSNKIKEYVARVKELGMDSAAITDHGVMFGVIDFYRAAKAEGIK PILGCEVYVAPQSRFDKEVGTREDRYYHLVLLAENNKGYENLTKIVSKAFVEGYYYKPRV DYELLEQYHEGIIALSACLAGEVQKNLLRGMYEEAVKSAKRYEKIFGKGNFFLELQDHGI PEQATVNQRLLRLSQDTGIELVATNDVHYTYAEDEKPHDILLCIQTGKKLQDENRMRYEG GQYYVKSPEEMAELFPYALQALENTQKIADRCDVEIEFGVTKLPRYDVPSGYTSWEYLNK 
LCFDGLASRYQPVTEELGERLTYELSVIQKMGYVDYFLIVWDFIKYARDNDIMVGPGRGS AAGSIVSYCLGITDIDPIRYQLLFERFLNPERVSMPDIDIDFCFERRQEVIDYVVRKYGA DRVVQIVTFGTMAARGVIRDVGRVMDLPYAFVDTIAKMIPTELNMTLTKALSVNPELKKV YEEDAEVKELIDMSLRLEGLPRHTSMHAAGVVISQKPVDEYVPLSLGADGAVTTQFTMTT LEELGLLKMDFLGLRTLTVIQNAAKLAEKSSGHKIDMHKINYNDKEVLESIGTGKTDGIF QLESAGMKNFMKELKPQNLEDIIAGISLYRPGPMDFIPQYIKGKNDRGSITYDCPQLEPI LESTYGCIVYQEQVMQIVRDLAGYTLGRSDLLRRAMSKKKADVMEKERQNFVYGNEEEGV PGCVANGIPEETANKIYDEMIDFAKYAFNKSHAAAYAVVAYQTAYLKYYYPVEFMAALMT SVIDNPGKVSEYIYASRQMGIHILPPDINEGEGNFSVDNGNIRYGLAAIKSIGRPVIEAI ITEREIGGKYQNLKNFIERLSGKEVNKRTIENFIKAGAFDSLPGTRKQLMMIYIQILDQV NKDRKNSMSGQMSLFDLVDEEQKKEFDIPLPDVKEYEKETMLAFEKEVLGVYVSGHPMEE YEELWKKSISATTMDFQPGEETGRSKVRDGSKEIIGGLIVNKTIKYTKNNKVMAFLTIED LLGTVEVVIFPKDYEKNKLYLEEDSKVFVKGRVSEEDERASKLICESIVPFREIKKELWI QFPDKKTFLQEEEQLYRMISGSEGNDTVVVYCQAEKAIKRLPVNRNIGIDSVILSRLTNY FGEKRVKVVEKAIEKTI >gi|330401653|gb|ADLB01000025.1| GENE 43 36814 - 37053 287 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|154503520|ref|ZP_02040580.1| ## NR: gi|154503520|ref|ZP_02040580.1| hypothetical protein RUMGNA_01344 [Ruminococcus gnavus ATCC 29149] # 3 79 1 77 78 124 77.0 2e-27 MRMKIQNITNIEKFFSVVDQCKGKVELVTGEGDRLNLKSKLSQYVSMANVFTNGEIPQLE LISYEPEDTDKLLDFMMNG >gi|330401653|gb|ADLB01000025.1| GENE 44 37101 - 37928 311 275 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|212640476|ref|YP_002316996.1| Uncharacterized protein conserved in bacteria containing two ribosomal protein S1-like RNA-binding domains [Anoxybacillus flavithermus WK1] # 7 263 4 270 285 124 32 2e-27 MKLGKKQVLTVVKKVDFGVYLGGEEEKVLLPKKQVPEGIELGDPVEVFLYKDSDDRLIAT TNEPKLTLGELAVLTVADVGRFGAFLDWGLEKDLFLPFKQQTAKVQKGDKCLVTLYIDKS ERLCATMKVYDMLRKDSPYQKDDMVEGIIYDRSDEFGLFVAVDNRYSARIPKKEAYGKLF VGMEIKARVTAVKADGKLDLSVREKIPAQMDKDADLVWKTICEYDGELPFTDKADPEIIK RELNLSKNAFKRAVGRLLKEGKIEIREKTIAIRER >gi|330401653|gb|ADLB01000025.1| GENE 45 37928 - 38641 676 237 aa, chain - ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid 
transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 12 204 16 211 248 119 32.0 6e-27 MYSIKERVRYSETDKTGHLTLTGIVNYFQDCSNFQSEELGVGMKFLNERHHGWILSAWQI IVERYPKLCEEIEVGTWPTAFNGLYGTRNFVMNDKHGKCVAYANSIWVFMDTERKRPAKP SKEDIEKYETEPELQMEYAPRKIALPETWEEKEPFSVQKSDIDTNGHVNNSRYVQMALEV ADENMAVVQIRVEYKKSALYGDKIFPKMHSEERKVTVKLCDAEDKTFAVVELTGEEK >gi|330401653|gb|ADLB01000025.1| GENE 46 38641 - 39816 1144 391 aa, chain - ## HITS:1 COG:CAC0606 KEGG:ns NR:ns ## COG: CAC0606 COG0053 # Protein_GI_number: 15893895 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Clostridium acetobutylicum # 1 391 11 403 403 315 39.0 1e-85 MTEFLVKHFVKRYEYVEEVSVRTAYGVLSGVIGIICNIILFVGKGIIGLLMHSVSILADA FNNLSDSGSSIISLIGVKMASKPADEEHPFGHGRIEYISALIVSFLVIQVGLTFFKDSVG KIMHPEQLKFQWISIIVLVLSIGMKLWLGAFNRKLGKRIDSKVLQATATDSMGDAITTSV TVLSILFWKITGVNIDGIVGLVVSGIVIWAGIGIAKDTVEALIGPAIDPEIFRQITEFVE GYEGIEGTHDLIVHNYGPGRSMASIHAEVPNTADIEVSHEVVDKIERDAQKFLGIFLVIH MDPIETKNESILKIKEMANAKIEEIDDKVSIHDFRVVEGKQRINLIFDMVVPHEYDKEKQ VEVSNTLRKELQEVDNRYQCVIHVEKSYGGM >gi|330401653|gb|ADLB01000025.1| GENE 47 39836 - 40468 673 210 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3792 NR:ns ## KEGG: Cphy_3792 # Name: not_defined # Def: stage II sporulation protein R # Organism: C.phytofermentans # Pathway: not_defined # 40 199 64 223 240 174 51.0 1e-42 MEKERKWICAAVAILIAGIVTGMTIYRQSVLVEAKVEKTQEKLAEEVLRFHVLANSDSEE DQQLKMKVKEAVIAYMKRELPNAESVEETKEWTGSHEKELEEVAGRVISEEGYTYPVKAE LTESYFPQKTYGDVTFPAGEYEALRIEIGKAKGHNWWCVLYPNLCFVDATNAVVPKKSKQ KLKSVLDEEEYEMVTATSKFKIRWFFFGGE >gi|330401653|gb|ADLB01000025.1| GENE 48 40530 - 41204 955 224 aa, chain - ## HITS:1 COG:SMb20773 KEGG:ns NR:ns ## COG: SMb20773 COG1802 # Protein_GI_number: 16265213 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Sinorhizobium meliloti # 15 220 23 228 228 111 33.0 9e-25 METNFQVNMNEYLPLRDVVFNTLRQAILRGELKPGERLMEIQLANKLGVSRTPIREAIRK 
LELEGLVLMIPRKGAEVAEITEKSLRDVLEVRKALEELAVKLACDRMTKQQMSQLKEAAK EFEETLKTNDVTKFAEADVKFHDVIFMATDNQRLIQLLNNFREQMYRFRVEYLKKKCFHS ILIAEHEDIINRIEKRQKEEAAEVVCRHIDNQVEAVIDTIRTKK >gi|330401653|gb|ADLB01000025.1| GENE 49 41207 - 42082 1044 291 aa, chain - ## HITS:1 COG:BS_yabH KEGG:ns NR:ns ## COG: BS_yabH COG1947 # Protein_GI_number: 16077114 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Bacillus subtilis # 7 281 6 277 289 235 43.0 7e-62 MDKVELKALAKINLGLDVLGRKENGYHDVRMVMQTIYLYDDVLMQKTEKEGIHLETNLFY LPVNENNIAYKAAKLLMDEFHIEGGVSIRLNKFIPVSAGMAGGSSNAAAVLFGMNRMYEL GLSQKELMERGVQLGADVPYCIMRGTVLAEGIGEKLTPLPALPKCYVLVAKPPVSVSTKT VYEKLDALDIVNHPNIDGILEGLEEQNLEKIASNMGNVLEEVTIGDYPVIEKIKQTMKDA GALNAMMSGSGPTVFGIFTDRKAAKHAYTEIRRKRLAKQVYVTNVHNTRGR >gi|330401653|gb|ADLB01000025.1| GENE 50 42127 - 43689 1213 520 aa, chain - ## HITS:1 COG:CAC2903 KEGG:ns NR:ns ## COG: CAC2903 COG1388 # Protein_GI_number: 15896156 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: FOG: LysM repeat # Organism: Clostridium acetobutylicum # 23 512 25 514 520 121 21.0 3e-27 MEFMKKQMHVNCMEKGITDQFFVDDDYNVPDAKRDISKIVMSKGTAKVEEMKPLENYIRV TGNVYFQILYVTEEGETKLTSLEGKCPFEEMVYVENGAEDGTYTVRNLRTEFTVSMIHSR KLSIKAMVELEIGKEISEDTSLTTDIESEDTLYKKKKEHQLLQLHTGKRDTYRIKEEVIL PGTKENIGTILWTDISNRKLDTKLEDDALCLSGELSVFCLYESPEGKADWIEQAVSYEGR VNCSGVEPSMYHYITADLDEVNVDMRTDEDGEMRSIGIEGTLNLKIAVYEEQTVEVLEDV YSLRHECNLKKEKVRNEELILQNHSKCKLSEQLSLPELKEDILQICHNSGEVQVQKMERA EQGIQIEGILHLQFLYVKESDEVPFDTWQGMIPFSYLIECDEMNEDTVYDINYGLEQLSV SMLGSGEVEVKAVLAFHSFIRRQLWQNVITDMEALPFDMEKIGKEPGITGYIVKEGDELW DLAKQYRTTTERIQATNGLNNGEIKEGDKILIFKENMSIL >gi|330401653|gb|ADLB01000025.1| GENE 51 43700 - 44107 273 135 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3813 NR:ns ## KEGG: Cphy_3813 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 3 135 6 139 139 88 37.0 6e-17 MKKAVLCFICIGILVLYTGCCVKKNQTEKGKEAKFTIVKEELIPEELESEIKKQKGKPFR 
LTYEDKGMLYIARGYGKKATTGYSVKVKKCEETENTIYFHTNLIGPSKQEEIVKKANNPH IVIALPVSDKTVIFE >gi|330401653|gb|ADLB01000025.1| GENE 52 44217 - 45743 853 508 aa, chain + ## HITS:1 COG:alr3871 KEGG:ns NR:ns ## COG: alr3871 COG1640 # Protein_GI_number: 17231363 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Nostoc sp. PCC 7120 # 12 507 2 497 502 523 52.0 1e-148 MIFETAEKTYEPPARISGILLHPTSLPSPYGIGDLGDEAYRFADFLEKSGQHLWQILPLG PTGFGDSPYQSFSAFAGQPLLISPKHLEELGLLTEQDLENCPCTDREKVNYGDVIVWKTE ILHKAYANYCHTADKMLLEEYDSFYENNRFWLDDYALFMACKEVHNGQSWLEWEEEYRSP SKLFVKELENSLAKEIRYHQFVQFIFFKEWYSLKEYANKKKIQIIGDIPIFVSLDSADVW ANQELFQLDSKGFPIEVAGVPPDYFSETGQLWGNPLYNWSAHKKTGFKWWISRIQNQLGL SDYLRIDHFRGFEAYWSVPYGEETAVNGKWKPGPKEDLFLAIEKALGENLPIIAEDLGVI TPEVERLRDRFHFPGMKVLQFAFESETESSFLPHQFTTTNCICYTGTHDNNTTKGWYEAV SETARDKVRRYMNTDGSSVHFDFIRTCLGTIATYAIFPLQDALGIGNEGRMNCPGVAMDN WSWRYKKETLTDELAEKLLELSHLYGRY >gi|330401653|gb|ADLB01000025.1| GENE 53 45758 - 46006 228 82 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0547 NR:ns ## KEGG: EUBREC_0547 # Name: not_defined # Def: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: E.rectale # Pathway: Glycerolipid metabolism [PATH:ere00561]; Glycerophospholipid metabolism [PATH:ere00564]; Metabolic pathways [PATH:ere01100] # 2 82 1 82 243 78 48.0 9e-14 MIRFILVSTTVILFLVLFIPVLIVEWIIGKFNRKAKDYSSLRIVQGAFKLILWITGVKVT VIGEENIPDEPVLFIGNHRSFF >gi|330401653|gb|ADLB01000025.1| GENE 54 45963 - 46493 530 176 aa, chain + ## HITS:1 COG:CAC0965 KEGG:ns NR:ns ## COG: CAC0965 COG0204 # Protein_GI_number: 15894252 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Clostridium acetobutylicum # 15 170 83 236 241 102 34.0 3e-22 MNLSFLSAITEVFFDILLTYSRCKRLTGYVAKKEMLKFPLLRDWMKNLYCLFLDRENAKE GLKTILTAIDYVKQGISICIFPEGTRNKGEELSMLPFKDGALKIATKTGCPVIPISMNNT SEIFENHFPKIRKTHVILEYGKPIDYKSLDREDQKHFGAYCQNIIQETIKKNAALL >gi|330401653|gb|ADLB01000025.1| GENE 55 46616 - 48073 1379 485 aa, chain + 
## HITS:1 COG:CAC1780 KEGG:ns NR:ns ## COG: CAC1780 COG1488 # Protein_GI_number: 15895056 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 2 483 11 489 489 529 55.0 1e-150 MNTRNLTLLTDLYELTMMQGYFESGENDVVVFDMFFRSNPCNNGYSIAAGLEQVIEYIKN LNFSYEDVEYLRSLNIFSEDFLHYLSGFHFSGNIYAIPEGTVVFPKEPLIKVIAPIMEAQ LVETAILNIINHQSLIATKTARVTFAAGNDRVLEFGLRRAQGPDAGLYGARAAMIGGCAA TSNVLAGKEFGVAISGTHAHSWIMSFPDEYTAFKKYADLYPNSCCLLVDTYDTLKSGVPN AIRVFKEMRESGMELKNYGIRLDSGDLAYLSKKARKMLDKAGFEDAYISASNDLDEYLIH DLKMQGAAITSWGVGTHLITSKDCPSFGGVYKLAAIQNEDGEFIPKIKLSENTEKITNPG NKTIFRIYEKTTGKIKADLICFADEVINCEEDLLLFDPVDTWKKTLLKGGTYEVRELLKP IFINGACVYETPSVMDIADYCRKEKETLWDETKRLVYPHRVYVDLSEKLYTVKKDLLNQM SLKEQ >gi|330401653|gb|ADLB01000025.1| GENE 56 48087 - 49139 1059 350 aa, chain + ## HITS:1 COG:PA4201 KEGG:ns NR:ns ## COG: PA4201 COG1181 # Protein_GI_number: 15599396 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Pseudomonas aeruginosa # 1 348 6 337 346 239 42.0 4e-63 MKIIVLAGGLSTERDVSLSSGAGICRTLKERGHDVFLLDAFMGLEYDSDKLEEVFTLENS GLEIAEGIKTTEPDLEAVKASRPDQSPSLLGPNVIELCRMADIVFMGLHGDIGENGKLQA TFDILGIKYTGPNSLGSALAMDKGVTKQIFKMSGVPTPAGTWLKKADKDTTLNELGLSLP VVVKPCSGGSSIGVYIPQTEAEYKTAVEESFKYEDEIIIEPYIKGREFAVGIIDGKALPV IEIIPKSNFFDYTNKYQAGCTEEICPAHIDDAIAKRMQEATEKAFAALKLDVYSRADFLL TEDGEIYCLEVNSLPGMTSASLLPKEAQAAGISYGDLCELIIQKSLEKYN >gi|330401653|gb|ADLB01000025.1| GENE 57 49154 - 50527 1185 457 aa, chain + ## HITS:1 COG:CAC2128 KEGG:ns NR:ns ## COG: CAC2128 COG0770 # Protein_GI_number: 15895397 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Clostridium acetobutylicum # 4 456 1 448 452 303 40.0 6e-82 MKNMSLQEIATACGGTYFGDSTKLSLEVSGVAIDSRKVEKDFLFVAIKGARVDGHDFIPQ VMENGALCALSEKDLGDVPYSYILVKSCEQALKDLAEYYRLALDIKVVGITGSVGKTSTK 
EMIASVLEQKYSVLKTEGNFNNEIGLPLTIFNIREEHEVAVLEMGINEFEEMHRLAKVAR PDICVITNIGFCHLENLIDRDGVLRAKTEMFDFMRDGAQIILNGDDDKLITVQNVKGIIP VFFGLTERHDYYATDVHSLGLKGTSCILHLPNGTVNANIHLPGAHMVYNALAGACVGYSL GLTNEEIQKGIESLLPVSGRNNLIETDDFLIIDDCYNANPVSMKASLDVLSNALGRKVAI LGDMGELGEDEKALHYGVGVYAAKKETDLICCIGTLAQEFVNGANSVTSKSETLYFATKE EFLSQMHSIVKKGDTILVKASHSMEFPEIVEALQKLR >gi|330401653|gb|ADLB01000025.1| GENE 58 50536 - 51342 689 268 aa, chain + ## HITS:1 COG:CAP0070 KEGG:ns NR:ns ## COG: CAP0070 COG0561 # Protein_GI_number: 15004774 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Clostridium acetobutylicum # 2 268 3 279 283 185 35.0 6e-47 MDYQILVLDLDGTLTNSKKEITEPTKQALIDIQEKGKKVVLASGRPTPGILPLAKELHLE KYGSYILSFNGARIIDCRSKELLYNRTIPSEVIRPIYEMLKSYDMDLLTYTDTHILSGMK TNQFTELESRINQMPIVQTDDFLSEITFPVDKLLGTGNKEIIAKALETVKSHFHSQLNIY LSEPFFLEIMPQRIDKAHSLQKLLNSIGLTADSMICCGDGYNDLTMIEYAGLGVAMENAQ PLVKESADYITKSNDEDGVLHVINEFMR >gi|330401653|gb|ADLB01000025.1| GENE 59 51429 - 52298 720 289 aa, chain - ## HITS:1 COG:no KEGG:TDE0948 NR:ns ## KEGG: TDE0948 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 6 250 5 249 292 232 49.0 2e-59 MQDKENREVKNSVFVDLFYADESAEENDIALFNAIHDESLPEGTTIRRFKVDTTIYMNFQ NDISFDAGGKLLIFGEHQSTVNENMPLRSLLYIGRAYEQLVSIRDRYKRKQVPLPTPEFY TFYNGKEKWDKEKELRLSDAYMIKDSQPMLDLKVKMININPSEQHEILEKCQVLREYSQF IDTVTKYQEMGIDEPYKRAIRECIEKGILADYLKRKGSEVVNMLTAEYDYEMDIEVQREE AFESGKELGLEVSRELYAKLKQLGRTDDILKAIEDAEYQKKLLEEYKII >gi|330401653|gb|ADLB01000025.1| GENE 60 52316 - 52516 137 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHENLAEMQRNLMFRVYAKKEFSARYFMQTFVINLLFGKVEGFPQDKPIGRFTDRMSVSD VMIAHL >gi|330401653|gb|ADLB01000025.1| GENE 61 52581 - 52904 368 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|197303201|ref|ZP_03168243.1| ## NR: gi|197303201|ref|ZP_03168243.1| hypothetical protein RUMLAC_01924 [Ruminococcus lactaris ATCC 29176] # 10 106 13 108 109 90 47.0 3e-17 
MYVSRNSMDFWEIVPMFFAIMVGILVLLMLAYVFYQNKDKKKELISKKVTVLEKPIQQGN LEWYVMEDENGERMKLRSFQGNSLFISVGDKGIVSYRGETIESFKRE >gi|330401653|gb|ADLB01000025.1| GENE 62 52923 - 53201 344 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|197303200|ref|ZP_03168242.1| ## NR: gi|197303200|ref|ZP_03168242.1| hypothetical protein RUMLAC_01923 [Ruminococcus lactaris ATCC 29176] # 1 92 17 107 108 72 43.0 7e-12 MSKKVIRNCMLIMAVCFLILGMLFKSNAERQKGIQATTGIMIDGKYQPTSSGRIGANQEK YEAANTTGVTFYILAGVTGVIGVVMLIRDRKK >gi|330401653|gb|ADLB01000025.1| GENE 63 53405 - 54529 728 374 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260589015|ref|ZP_05854928.1| ## NR: gi|260589015|ref|ZP_05854928.1| hypothetical protein BLAHAN_06100 [Blautia hansenii DSM 20583] # 36 374 1 337 337 392 58.0 1e-107 MSEFSERCRQLLINSGSNVYQMAKHSSLDKTSIQRMVTGKRLPSLDFVKDFCSYLRISPI EKKELLELYEIEKVGKSEYKSRCYIKSLIESLSFLDEENKKNISFSETLDYYEPFSPVPN VENKILSILQSELKNNNKPEILFNLPTSYRYIFFILKGLFADFKGEASVKHLITLNKNPL NTVYPCQNLEALSHILPLSEILKDIYRPYYLYSNLTPSDEKMLIMPYYIITSEYLLTISS DFKTVSLHDTDTAIEKYRKEFYRILDMAQPLIKYADNTFKIINHLQENYIEYGLPSHSLE FHPCLFFMDTSFELSKESFKQIPHAEEHIAALQEMYKVVSASTPPPKKIGNDNFFLFRAG TGNVLPDGKMFRTV >gi|330401653|gb|ADLB01000025.1| GENE 64 54417 - 54857 97 146 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|260589014|ref|ZP_05854927.1| ## NR: gi|260589014|ref|ZP_05854927.1| conserved hypothetical protein [Blautia hansenii DSM 20583] # 26 145 1 120 121 150 67.0 2e-35 MFPLLPPPRKKSGTTISFFSEPGLEMFCQTGKCFGQYKNLEMGFSPEERKVMLKSYFRNR DAEAFTPYILKPSFHTPTYLNIELHESHTVTIFSLKENFQFSFIQIKESSICTAFYSFFQ YLTDSDFVYTAKETGSILENYFHSLL >gi|330401653|gb|ADLB01000025.1| GENE 65 54897 - 56291 615 464 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 6 464 3 444 456 241 32 1e-62 MKEDLVMKIISTLYNLMWGDLITIPLPGGSSIGLSLLVMILVPAGIFFTIRTRFILLRKF PYILQIVKEKKSETQKNSISGVQALIVSTATRVGMGNLVGVVAAISAGGAGAVFWMWISA LIGSSTAFVEATLAQMYKEKDPLYNGYRGGPAYYIHSYMTRKKERRYSLIAVLFAISGLV 
CWCGISQVISNSVTSSMENAFHIPPIYTTVILVVVSAIIVLRKNATVKVLDVVVPIMAGG YLLITLFIIGKNIGMLPEVFSRIFAEAFGIRQVVAGGFGAVLMNGVKRGLFSNEAGSGSA PCAAAAADVSHPAKAGLLQAFGVFIDTIVICSCSAMIMLLVPQGKVEGLVGMEFLQKAME YHMGEFGVIFITFTLLLFSFSTFIGVLYYARSNVAYLFGDNWTSQTIYKIVALAMLFIGG LATYTFVWDLGDVGVGLMTIFNMLVLIPMSKQVINVLKDYEKKV >gi|330401653|gb|ADLB01000025.1| GENE 66 56431 - 56805 443 124 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKKMLLCLLTIFLFTVFVTGCDVEKEKSQSYLNAKVIEVRENTLIVKPTDNKETKAPKE VKEAEKLTLDLTGFDIKAMPENLTEGEKIRVVFNEDSVEKGDISKIKIVFAIYRMDKNGD IITE >gi|330401653|gb|ADLB01000025.1| GENE 67 56820 - 57824 854 334 aa, chain - ## HITS:1 COG:VCA0439 KEGG:ns NR:ns ## COG: VCA0439 COG1619 # Protein_GI_number: 15601202 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Vibrio cholerae # 14 315 13 314 334 357 53.0 2e-98 MKVKKLDTDRQIGIGVFSSSSPISSTTPVRYMNGKKYLEEKGYNVIDGVLFQKKDFYRSG SIQERAKEFNELLYRDDVQILMSSIGGNNTNSILPYIDYEYLRKHPKMIVGYSDTTALLL AIYAKTGLVTFYGPALAASFGEFPPFVDMTYEYFEKMLRYTEGTFSYKQPEYWTDEFVDW STQNRSKEKRDNEWVCVKEGKCAGRVVGGNLNTMEGFFGTEYMPEIRKGDILFIEDSLKD ACTIERSFSLLKLAGVFDKIGGLILGKHEKFDDNGTGRKPYDILMEVIGEADFPILAEFD CCHTHPMFTLPIGCKIELNATEKEVTLLETPFDI >gi|330401653|gb|ADLB01000025.1| GENE 68 57828 - 59066 1491 412 aa, chain - ## HITS:1 COG:CAC1585 KEGG:ns NR:ns ## COG: CAC1585 COG2270 # Protein_GI_number: 15894863 # Func_class: R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 6 411 4 414 425 328 44.0 1e-89 MNKSKLTKLEKYWILYDVGNSAFILLVSTIIPIYFDAMAEKAGISEVDYLAYWGYATSVA TFLVAIIGPILGTIADTKGYKKPIFAVSMMVGVLGCAGLSLPTSWIVFLAVFVVAKVGYS ASLIFYDAMLADVTTPERMDTVSSHGYAWGYLGSCFPFIISLIFVLFHEKIGISMTAAMI FAFCLNAGWWLLVTLPLLKNYEQTHYAERESNPIRSSFRRLGQSFSEIKEHKKIFLFLIA FFFYIDGVYTIIDMATAYGSALGLDTTGLLVALLVTQIVAFPFALFFGKASKKYETAKLI KICIVAYTGIAVFAIQLDKQWEFWVLAVVVGMFQGAIQALSRSYFAKIIPPEKSGEFFGL FDICGKGASFMGTALMGAFAQWTGSPNGGVIILAIMFLIGLAVFVKANRLEG 
>gi|330401653|gb|ADLB01000025.1| GENE 69 59084 - 59872 887 262 aa, chain - ## HITS:1 COG:no KEGG:TDE1263 NR:ns ## KEGG: TDE1263 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 27 261 9 249 250 131 33.0 2e-29 MPALYAHNQFGNRVLQKLDAKKREFLLRNLRQFRIGLQGPDYLFFYKPLSPNPVSQIGFQ IHERPASEFIEYARKVIRENGTESPEYAYILGFICHFALDSECHSFVAEEMERTGIGHVE LESEFEKFLMRKNGEEPLSYPVGKTFPTDKDTAKCVAKFYEGVEPEEAHKALKSMRKYKS ILVAPGKVKRTILNRGMKLSGQYDALQGHLFKVEDNPKCVESNAGLYERFKNAVPVAEYL IQKFDEYLEGNELDERFERDFE >gi|330401653|gb|ADLB01000025.1| GENE 70 59940 - 61193 1357 417 aa, chain - ## HITS:1 COG:BH1194 KEGG:ns NR:ns ## COG: BH1194 COG0785 # Protein_GI_number: 15613757 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Cytochrome c biogenesis protein # Organism: Bacillus halodurans # 7 241 2 231 238 139 32.0 1e-32 MGFSFDVSVPVVTVFFQGLLSFFSPCVLPLIPIYISYLSGGAGVKGEDGKIHFKKSKVML HTLFFVLGISFTFVVLGLGVSAFGSFFKDNQLIFARVGGVLVVGFGLYQLGAFGKSSFLG KERRLPMKADVLAMSPIVAFVMGFTFSFAWTPCVGPTLASVLLMAASATTKWTGFALIGV YTAGFVLPFLAVGLFTTTVLEFFKSHMKVVRYTVKIGGILMILMGVLMFTGKMNAFTGYL SGTSSNTETKKPEKKAEETEAEKQEKESEDSEMSAIDFTLYDQYGKKHSLADYKGKTIFL NFWATWCPPCKAEMPDIQKLYESYGDDGEVAILGVAAPNIGNETDEEGIKTFLKENGYTY PVLMDTEGELFMQYGISSYPTTFMIDKEGNIFGYVSGQLSEDMMKSIIRQTIEGKRE >gi|330401653|gb|ADLB01000025.1| GENE 71 61320 - 64316 2666 998 aa, chain - ## HITS:1 COG:no KEGG:GYMC10_2894 NR:ns ## KEGG: GYMC10_2894 # Name: not_defined # Def: Ig domain protein group 2 domain protein # Organism: Geobacillus_Y412MC10 # Pathway: not_defined # 391 930 31 541 1071 258 33.0 6e-67 MKGKKVKRLIAVALATAMTVTMFPGITPENAGASRVKAASTVTNENLLKNPSFEEATPFT AHSEQNQNNKTGNWFYFQNSVKKEKGNAHSGEWSASLEKANDALEQDIPNLQKGATYKVS VWAKNTNPSGVKAYLCVKSYGGSEKKVPITSGDYKKYEVEFVYTGDNSGKNTRTAIWVEQ ANTGNVYVDDWSFTIASDLKSLSVENGTLTAEYNEDYTGKMSSGDFDFSFTSSLESEKAE KLTITEEKVNGKILTMKFAPIKKEPVEQKITVTATYKPKKQPLVVDFTVASSGETVTEAK LVGISAENGKVIGNLDVVPTIAPVKKDFVLEYKINDGAFTKAEVKEFTYDKANQNVALTF 
DKISSAIDEKKVTVKVSYKNVAKTAQFVVKLGSGVKYYVDATKGKDTNDGTSPEKAWKSI DRVNKEVFQPGDEILFKAGEEWTGALKPQGSGVEGAPIVIASYGEGAKPLLKPGKDWKNS HMDIANKVVSSPTVNNVITFFNQEYWEVRDLELYDPTYSQNAQTRVFRRGINVSAEDAGD LRYFKFDNLTIHGFRGPNDNDGKSSGGIIMTVTTNRYDASKRVRTAIHDISVTNCEMYDL GRSGINFISPWTTRKGDEWNKYGRFDYVGKGEWKPYENFRLSNNTIYNIDGDGTIVDGCK NAVVDHNTVYRTVLNCWYGVGLFNWNSDNTVFEYNEVYEASPSDALLGAGDGQGIEIDAL NKNTLVQYNYLHNNAGGIFMWCCTATLRGFNGIYRYNISQNDGAKHGVIDWREGHEGSMA YNNTIYLDDSVKREWLKNGYTGGKSDAKFWNNIVVNRGEMHATNFNEQEIDYERNIFVGF DAVPSNDETVIQEDPMFVAEGTGGKGIDSLKGYKLQENSPAIDAGINIENNGGKDYYGTL LSDGKTDIGAAEYVKVKPEEPEKPQKPEQPEKPEKPEQPEKPEQPQKPNKPNTQQTKPQT QNESIKTADVSPIIPMMIVSIMALGVVVFLTMKRKNNQ >gi|330401653|gb|ADLB01000025.1| GENE 72 64378 - 65112 739 244 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210616234|ref|ZP_03291014.1| ## NR: gi|210616234|ref|ZP_03291014.1| hypothetical protein CLONEX_03233 [Clostridium nexile DSM 1787] # 2 244 30 272 277 207 53.0 4e-52 MKKKEQEEVPQVVTTISEEEYNFYCEIVQKNYKGKDKKKLEELTKQYAENVYAQYTLGQK CGVCEPYSYEYLKMKMETENQQRKAKKEAGEVVYGTLEFKLDGYLQYQLSNLRLSIIDSV VKNSDKTLEKKAKKYWEEHQDKFKRIASVEYRLDKETKVVTWKDFPMLEKTDSQLFTYLY EGKEGDTFSIEENGQIVEGEIIKKTMEKTEFNENKTEIIKNYVGSVYYDELIEQLKEENV LEYE >gi|330401653|gb|ADLB01000025.1| GENE 73 65386 - 66255 694 289 aa, chain + ## HITS:1 COG:CAC1451 KEGG:ns NR:ns ## COG: CAC1451 COG2207 # Protein_GI_number: 15894730 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 31 284 33 286 295 82 23.0 8e-16 MVSSTSSDYFDISACAVSGGCLQSHSGPYLFGRHIHTSFEVYLITQGNCFMNINGKDISC KKDDFIMILPNTVHSFEVKENEQCEFLHIHFEPELFAHILVKKTPEFSISLSDALLFHCD FYYKQPSNITLFNLVTSIVGVYQNPNSYSAISTNLYVAQLMLYVLEQVSKDEPLTPSVHV QNRYISYTLQYVSEHFQEKILIPDIAKELNISTRYLGKLFSRHMNLTLANYINIYRINQA IQLMSSTDMPLVDIALSIGLNDSQHFSRLFYKIINMTPGKYRKMIQKGE >gi|330401653|gb|ADLB01000025.1| GENE 74 66301 - 68262 2071 653 aa, chain - ## HITS:1 COG:STM4426 KEGG:ns NR:ns ## COG: STM4426 COG5520 # Protein_GI_number: 16767672 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: O-Glycosyl hydrolase # Organism: Salmonella typhimurium LT2 # 65 502 31 440 447 203 29.0 1e-51 MKRKVMAGMLTFAMLGMSLGGFQTPQVQAAEDLGRVGVHVSAKEGGYDETLPDLHFKKAD ATKTSGYVKVYPEEKRQTFLGVGGAMTESAAYNLQKLSKEKQEEVYEAYFGENGAKYSVL RSTIGSADFSTRSYSYNDTEEPDPELKNFSIEKDWDYIIPAVKKAQSYRSDIKFFAAPWA PPAWMKKSGVRRGQTGTAGINFVDNSVKPEYYESYANYLVKYIQEYEKAGIDVYSLSMQN EAQNNPKWEAATWSTDAVIDFVGNHLGPALERNNLDPQLLIWDWDKGNDPMHHDGFIDFN TKVLSNEKARKYIDGIAFHWYAGDVWHEIQGVPMWSKDFYSLDTVKEKFPDIHLYATEAC QEKGAWFGSFDPADRYIYDILNDFEHGTETWIDWNLVLDREGGPTQGVVNKCHAPVMLDE NNNVCYQPSYYILKQISRTVQPGTVSIKTTTDTDIVKTAVMDDEGMVSVMLGNITDSEKK VTVIDDDRSVDVTLKPHSLTTVKYDSDYQPDGDIDETLPDVAVKPIAATASSYEKNPIYN YQAVSAIDDSMKTRWASDWTNQEDITFELSSRATVSGIQLYFENGHDALYDIQVSDDGKN FKTVKTVMIEEMKSPQVTVRFNPVKARYVRFQGIQRNDKYGYSIYDAKVLVKQ >gi|330401653|gb|ADLB01000025.1| GENE 75 68439 - 69932 1065 497 aa, chain - ## HITS:1 COG:no KEGG:Cbei_0747 NR:ns ## KEGG: Cbei_0747 # Name: not_defined # Def: BNR repeat-containing glycosyl hydrolase # Organism: C.beijerinckii # Pathway: not_defined # 67 494 149 564 564 263 35.0 1e-68 MKYIYKISGKISLMFYIFTLYRIWHLCQYGGLRSHVPTLAVGMTGLVGTFILWVISRKVI RKTDPAYREKRTVFYTEIAIFIMATLFFGGRIVYSAIPYHGALSWKMEEWLHKKEIKLEH DNVFENGVEGILTDLDEELELPEELYIADKVQIGFDKNGKIQNIDAFIYGKDKQGEKRTY LIDYDADKSEDMTVWLDGNVNGKYDEDMRLSPMTQILKRAEWKKKVNVWAKTDYEDNIYE IFYLGRRAFGSEEGLRYVPGDADGDGKEAGADSLLQLRRGGEIVGFEVSLHIPEKSSVTP VRYIMDPEYISQEKLNEENDKQQIDKAKDAKKWTVDRGNGTMYFLLDNRSGWRLVEKDAA AGSRFYGLEKTADGGKTWQSINETPFGEQFGVAEGLLFFDESFGVAGLTGASQSESELYI TRDGGISFKQIELPMDTVTELQESAKEYDYMHMPEEENNILTITVTKDGAETDGIVFQST DKGVTWTYKEMTEKSGD >gi|330401653|gb|ADLB01000025.1| GENE 76 69960 - 70376 313 138 aa, chain - ## HITS:1 COG:no KEGG:SSUBM407_0294 NR:ns ## KEGG: SSUBM407_0294 # Name: not_defined # Def: lipase # Organism: S.suis_BM407 # Pathway: not_defined # 6 126 32 157 307 85 37.0 6e-16 MAEEGYWDKVKTSAEIETKYMRLGDCIVKTAEYPIAGERWTNYAVWYPEEMEYEDKSYPL IVMSNGTGVQNSKYEPVFKHLSSWGFIVIGNDDISSGLGDSASKSLAYIMDLSENKECVF YKKIDMRSLISMNLCYII 
>gi|330401653|gb|ADLB01000025.1| GENE 77 70655 - 70813 197 52 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVNSKRKFCSHECYIKNRFWSKEGATQLVDALMVGEDIVIPQWLKEKLKERL >gi|330401653|gb|ADLB01000025.1| GENE 78 71254 - 71532 161 92 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQQIVQQFQDEVMEAHLRKLRAEDEGYQEIERELLRLSHKVQEHVKTISDSQQKILMEYS DTRNSQESAHYNCLYRSGMKDAVRILQYLEVL >gi|330401653|gb|ADLB01000025.1| GENE 79 71535 - 71762 203 75 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0573_11212 NR:ns ## KEGG: HMPREF0573_11212 # Name: not_defined # Def: TnpV family protein # Organism: M.curtisii # Pathway: not_defined # 3 75 4 76 124 75 52.0 4e-13 MITLFEQQGGIYYNQGDYFIPCVETKEQGDLHIGVWANRHRQYLKQYHRVRYYNLLTSEK LYEYLADIEEQGGLR >gi|330401653|gb|ADLB01000025.1| GENE 80 71856 - 74231 1942 791 aa, chain - ## HITS:1 COG:jhp0572 KEGG:ns NR:ns ## COG: jhp0572 COG1479 # Protein_GI_number: 15611639 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori J99 # 10 553 9 557 683 230 31.0 1e-59 MQPYKVWLFSDLLEKNKRVFKVPVYQRNYDWNNIQCEKLYQDIMIANERDHKHFTGTIVY IVGIDGSKLNEVLIIDGQQRLTTVYILLKALYDAAKGVSVRIETEIEEVMFNRNCDEKYK LKLKPVKTDNAQLCLLIQDKVDEMDRNSNIYKNYICFKNLINKTLQSGLEVNDILNGIKK LEMVEIVLDKSQGDEPQKIFESINSTGLELSLADKIRNYLLMDDNNQDELYENYWSVIEK NVGYRNLGDFAINFLNSQISKSVNGKNAYRLFKEHCEENHLSHEDVLKKLKRMSKSYGAF IGENHYYSKEILDYLRAFYTIKQTTVLPFLFRVFDDYEDGNIDETTLLKVLNYLLTYFVR VTACEINKNLSKFMKAMYDRFFDGNYEKYYEKLVIFLNDLRANDRMPTDAEFEKALIHTP LYKKPICKFVLSVIENSSKEHIDISNLTIEHILPQKENAAVWKKEVGNDYRRVYEIYLHT LGNLTITGHNSELGTKSFADKKKIIRENSKANILNKEVLAAEKWDEESIRNRAKVLANIL IGKFNYVELHSDLNELNELSFGVNSGIDFSNTKPEGFAFVGEYTKTLSWVDLLTKFISIA YDLDEDTFSDLAASNYSLPNADRTYISNDERKLRKAKQIDNSGIYFETNLSTNNIISFIK NLLEKMNLDIDDFSFSLSEAPFDINDENTWSEGMLPVAKLFYNLIEELIKQTKITEEEIE QLKTKEYTKSLFRATDYPAVANNRTDNMGNSLHKRYRTKELQFNGKNIYISTQFFETDRS AVIQWYKAHLS >gi|330401653|gb|ADLB01000025.1| GENE 81 74279 - 75943 1250 554 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: 
J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 1 459 8 454 458 449 49.0 1e-126 MRKNETAVVKIEDIGVNGEGIGKVDGYTLFVKDAVIGDVVDVKVMKAKKNYGYAKLINVL EPSKDRVQAKCSVARQCGGCQIQELSYEKQLEFKEKKVRGNIERIGGFSSEFLDSVMEEI CGMDNPFHYRNKAQFPFGTDKNGQIVTGFYAGRTHQIIPNMECALGCEENGKILKIIVDF MNKYHISAYDEKTGKGFVRHALLRFGFTTKEIMVCLVVNGDNFPHSEKLVDNLRKIEGMT SITYSVNKENTNVIMGKSIHLLWGQTYITDYIGNVKYQISPLSFYQVNPKQTENLYQYAL EYAGLTGEETVWDLYCGIGTISLFLAQKAKKVYGVEIVPQAIEDAKRNADINGIENAEFF VGKAEEILPEYYENYAREHDGEKAYADVIVVDPPRKGCDETLLRTMTDMSPERIVYVSCD SATLARDLKYLCENGYELQKVRAVDMFPNTVHVETVVKLVRKKPDTYIDITVDMNEIDLI ASEAKATYQGIKDYIKEKYDVKVSSLYIAKVKQKYGIIERENYSTSKSENAKQPQCPPEK EKVIEEALRYFKMI >gi|330401653|gb|ADLB01000025.1| GENE 82 76030 - 76971 846 313 aa, chain - ## HITS:1 COG:BS_yhaM KEGG:ns NR:ns ## COG: BS_yhaM COG3481 # Protein_GI_number: 16078057 # Func_class: R General function prediction only # Function: Predicted HD-superfamily hydrolase # Organism: Bacillus subtilis # 7 297 7 295 314 172 35.0 6e-43 MKYINTLHEGETVRDIYLCKTKRSAETRNGKPYDNLLLQDKTGTLDGKVWDPNSSGIADY DEMDFIEVFGEVISYNGTLQLNIRQIRKAFEDEYNPADYMPTSEKSVDVMYDELLGYIRQ VNNPYLHKALEHYFVDDEAFIRQFKAHSAAKTVHHGFAGGLLEHTLSIVKLCEYYVGAYP ILNKDLLFTAAIFHDIGKTKELSAFPENDYTDDGQLLGHIVIGVEMIHDAIRTIDGFPEK LASELKHCILAHHGELEYGSPKKPALAEAVALNFADSTDAKMQTLTEIFKDKQGSEWIGY NRLFESNLRRTSL >gi|330401653|gb|ADLB01000025.1| GENE 83 77016 - 78296 1394 426 aa, chain - ## HITS:1 COG:TM0571 KEGG:ns NR:ns ## COG: TM0571 COG0265 # Protein_GI_number: 15643337 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Thermotoga maritima # 160 418 89 349 459 100 28.0 6e-21 MSTESNKIQREDPNEQKDKEFSFLQETIKSEQMTGRKLAGRVVKIAACGLFFGLMSCVGF FALKPWAESAFHQNTNQVTIPKDEENVQPKEETEQGQANIPKMTLQNQEELNEALFEVAK 
KAEKGVVEVRGIHGKEGWIEETYDTVNSVSGTIIADTGVEVLILANNSVLKDAESLTCTF CDGSTYAAEIKKQDKNLGIVIFSVKKAAMNSTTLRQVSTLTLGNSNLVTQGDTLIALGKP FGYTSGISYGIASAVDKEVSFADGDYRMILSDIPGSGSGSGILINTAGEVVGMIKPNLSG SENVTTTNALAISDLKTVIELLSNGRGVSYVGITGTEITGKISEEQGLPEGVYVKNVDSD SPAMKAGIQCGDIITSVGKTKVTTQDAYQSAILEYEPGRQIELSGKRRSNNGYVEIKFTV TVGSKE >gi|330401653|gb|ADLB01000025.1| GENE 84 78438 - 80813 1906 791 aa, chain + ## HITS:1 COG:CAC2340 KEGG:ns NR:ns ## COG: CAC2340 COG1193 # Protein_GI_number: 15895607 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 1 791 1 788 788 607 45.0 1e-173 MNKKAQLKLEYNKIIDLLEEQASSPSGKNRCKKLSPMLNIGDINLAQEQTASAFTRIVKK GRISFSGCYPIEDSLMRLEVGGVLTCNELLRIAKLLQVTNRVKSFGRHDTVDDLEDCLDT YFNQLEPLGVLSNEISRCILGEDEISDDASSTLKHIRRSIQLLNERVHSTLTSLVSGSLK SYLQDSLITMRGDRYCIPVKAEYRSQVPGMIHDQSSTGSTLFIEPMAIVKLNNDLKELYG KEQEEIQVILSRLSEEAGGYIQELRTNFAILTELDFIFAKGMLALSMNAGKPIFNTKGYI HIREGRHPLLDKKKVVPITLTLGGDFDLLIVTGPNTGGKTVSLKTVGLFTLMGQAGLHIP ALDRSELAVFHEVYADIGDEQSIEQSLSTFSSHMTNIVSFLKDVDEHSLVLFDELGAGTD PTEGAALATAILSHLHQRGIRTMATTHYSELKIFALSTEGVENACCEFDVETLRPTYRLL LGIPGKSNAFAISGKLGLPDYIIDEAKRQLSEHDESFEDLLSDLEESRKTIEKERAEIAS YKQEIQQLKSRVEKKQVRLDEQKERILREANEKANAILRDAKEVADETMKNFRKFGKENI SVAEMERERERLRQKISKTQEKSSIQPKKPKKQHKPGDFKLGESVKVLSMNLTGTVSSLP DAKGNLFVQMGILRSQVNISDLEIIEEPMTITAKQMRRTSSGKMKMSKSLSVSPEINLLG KTVDEAIAELDKYLDDAYLAHLTPVRIVHGKGTGALRQGIHNYLKRLKYVKSYRLGAFGE GDAGVTIVEFK >gi|330401653|gb|ADLB01000025.1| GENE 85 80831 - 81532 756 233 aa, chain + ## HITS:1 COG:CAC3220 KEGG:ns NR:ns ## COG: CAC3220 COG0745 # Protein_GI_number: 15896467 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 6 233 7 228 228 255 55.0 4e-68 MVTKQKILIVDDDENIAELISLYLTKECFDTMMVHDGEEALTTFDSYQPNLILLDLMLPG IDGYQVCREIRAKSNTPIIMLSAKGEIFDKVLGLELGADDYMIKPFDAKELVARVKAVLR 
RYQPAKQEIPVADKGKCVVYPGIEINLTNYSVTVDDLPVEMPPKELELLYFLAASPNQVF TREQLLDQIWGYDYMGDTRTVDVHIKRLRAKIKDHPTWSLGTVWGIGYKFEVK >gi|330401653|gb|ADLB01000025.1| GENE 86 81534 - 82940 1055 468 aa, chain + ## HITS:1 COG:CAC3219 KEGG:ns NR:ns ## COG: CAC3219 COG0642 # Protein_GI_number: 15896466 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 194 465 201 471 475 216 40.0 9e-56 MRSTLYLKFILIYIAFGFLSIFTVATLGSTLLATNLETKIGGELYKEANLMGKEYLPGFF TNTLSLNDTQVQLNGLTAYLHADVWITDRNGKLLLSSHSNPSTSSPAQITDFNPAEAGNE QYLIGNYHNYFSEEMITVIAPVTQGFYTKGYILIHKSYSDLTETKDSILNVVYITLLVIY ILSFSILLGLHFFIYRPLRKITEAAKQYASGNLDHEIHINTDDEVGYLSASLNYMSNQLK DMNSYQKKFIANVSHDFRSPLTSIKGYVNAIADGTIPVEMYDKYLNIILFETERLTDLTQ DLLTLNEFDTKELLLDKSSFDIHEIIRTTAISFEGRCTAKKISIELLFASKTIFVYADKR KIQQVLYNLLDNAVKFSNADSSIYVETTEHGGKVFVSVKDAGIGIPKKSINQIWERFYKT DLSRGKDKKGTGLGLAIVKEIINAHGESINVISTEGVGTEFIFSLSKK >gi|330401653|gb|ADLB01000025.1| GENE 87 82972 - 84132 1242 386 aa, chain - ## HITS:1 COG:mlr3334 KEGG:ns NR:ns ## COG: mlr3334 COG1879 # Protein_GI_number: 13472894 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Mesorhizobium loti # 52 385 30 326 331 100 27.0 7e-21 MKRKLQVLLSLVCTAVLFSSCASTDTQKENVTQKEEGRKVNEKQSNLDVLNPIAYSDVSG IKLEPGSYISIIGKSNNGEFWEEVKAGAERAAEDLNAALGYKGEDKIKVNYSGASKGENI EDQINILDEELARNPVAVGIAIIDSTACEVQFDLAAENGIPIVAFDSGSDYKNIQAMCSA NNSEIGKTGATKLASVLDDTGEVALFVHDKESTSAKLREEGFLKEIQENHPNIKVPLVYH LDDLEETAKTIASERNKNKKEGEKDVLPEEITQTEAIQYVLKKNPNIKGCFSTNISANEE LLTALEGEDRDLKIISVDGGESQLQALKDGKVNGLIVQNPYAMGYATVISAARASLGMGN QAFIDSGFIWVTKDNMNKKTIKRMLY >gi|330401653|gb|ADLB01000025.1| GENE 88 84428 - 86539 2558 703 aa, chain + ## HITS:1 COG:lin1443 KEGG:ns NR:ns ## COG: lin1443 COG1882 # Protein_GI_number: 16800511 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Listeria innocua # 9 672 5 668 743 1053 75.0 0 
MEKNWNASWDGFKPGRWNRTSVNVRNFIQENYTPYEGDDSFLAGPTEATTKLWAQVMDLS KQEREAGGVLDMDTKIVSTITSHGPGYLNKDLEQIVGFQTDKPFKRSLQPFGGIRMAQNA CHENGYEVDPEIVKIFTEYRKTHNQGVFDAYTPEMRLARKSAIITGLPDAYGRGRIIGDY RRVALYGTDWLIEDKKQQLATSLVRMTGDNIRLREELSEQIRALSDLAKLGEIYGYDITK PASNAKEAIQWLYFGYLAAVKEQNGAAMSLGRTSTFIDIYIKRDLDRGLITEEQAQEYID HFIMKLRLVKFARTPEYNALFSGDPTWVTESIGGVGVDGRPLVTKTSFRYLHTLDNLGTA PEPNLTVLWSTRLPHAFKEYCAKMSIKSSSIQYENDDLMRQTHGDDYAIACCVSSMRVGK EMQFFGARANLAKCLLYAINGGIDEKLKIQVGPKYRAVEGDVLDYDDVMQKFDDMMEWLA GLYVNTLNIIHYMHDKYSYEKLQMALHDRDVKRYFATGIAGLSVVADSLSAIKYAKVTPV RDEDGLVVDYKVEGDFPKYGNNDDRVDEIAVDIVRSFMDKVRKHHTYRHGVPTTSILTIT SNVVYGKKTGNTPDGRKMGEPLAPGANPMHGRDSHGALASLASVAKIPFRHAQDGISNTF SIIPGALGKEDQIFAGDLDLDRIEECGNQACNIPNIMESIDNE >gi|330401653|gb|ADLB01000025.1| GENE 89 86569 - 86811 546 80 aa, chain + ## HITS:1 COG:SA0218 KEGG:ns NR:ns ## COG: SA0218 COG1882 # Protein_GI_number: 15925929 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Staphylococcus aureus N315 # 5 80 674 749 749 123 75.0 6e-29 MAVNQQQIDNLVNMLDGYVEQGGHHLNVNVFTRETLLDAQKHPENYPQLTVRVSGYAVNF IKLTKEQQDDVISRTFHEGM >gi|330401653|gb|ADLB01000025.1| GENE 90 87077 - 87835 557 252 aa, chain + ## HITS:1 COG:SPy0379 KEGG:ns NR:ns ## COG: SPy0379 COG1180 # Protein_GI_number: 15674526 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Streptococcus pyogenes M1 GAS # 4 239 11 248 263 255 50.0 5e-68 MTKGYIHSIESCGTVDGPGIRYVVFLQGCPMRCQYCHNPDTWKVNTGEQHTVAEVLEGFY TNRPFYRNGGVTVTGGEPMMQMDFLIELFTQLKKDGIHTCIDSSGVMFQPENEVFMNKLN ILLGLTDLVMLDIKHIDDEKHKELTGHSNKNILAFAKYLDGKNVPVWIRHVVVPGITLYQ EYLERLGKFISTLNNVKALDVLPYHSMGKVKYDNLGMDYPLKDTREATKEEAAAAKNIIL SAYKKGKKENLY >gi|330401653|gb|ADLB01000025.1| GENE 91 87936 - 88106 359 56 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210613377|ref|ZP_03289697.1| ## NR: gi|210613377|ref|ZP_03289697.1| hypothetical protein CLONEX_01904 [Clostridium nexile DSM 1787] # 1 56 1 56 56 65 
96.0 9e-10 MAHVISDECVSCGACEAECPVGAISQGADHYEISADACVDCGACAAQCPTGAISAE >gi|330401653|gb|ADLB01000025.1| GENE 92 88294 - 88869 658 191 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_0112 NR:ns ## KEGG: EUBREC_0112 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 7 165 1 191 238 91 32.0 1e-17 MGEVHYMISEASKRVNVETHVLRYWEEELELPIGRTEMGHRYYTEENIQLFRCIKELKEQ GMFLKDLKAMIPDIIRLKEQKMSQAKNMHEVTVSKAEVVQDHKREQVQALLGQAFQNILA QNNKILEESVTRAVSDKVTKEVGFLLQAKERQEEERFKKLDSLIRQQQTLRKEAGKSPAR KLRRLLGAGEV >gi|330401653|gb|ADLB01000025.1| GENE 93 88920 - 94112 6332 1730 aa, chain - ## HITS:1 COG:CAC3086 KEGG:ns NR:ns ## COG: CAC3086 COG5492 # Protein_GI_number: 15896337 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 952 1205 64 304 498 80 29.0 2e-14 MKRKNLKKVLAGLCALTLTATSIPFSTQAEATKETPKKESGLRLWYDEPASKGKNILNGG SFGTTEEDNTWQQHTLPIGNSFMGANVYGEIGKERLTFNQKTLWNGGPSTSRPNYKGGNK DTADNGKKMSDVYKEIIELYKKGEDAKANELAKKLTGEVAGYGAYQSWGDIYVDFKFDES QAKNYVRDLNMENAVASVDFDYKNTKMHREYFVSYPDNVLAMKFTADGNEKLNLDISFPI DNAEGVTGKKLGKNVQTTVKDNTITVAGEMQDNQLKLNGKLKVETENGTVEAKDGDKLHV ANASEVTVYVSADTDYKNDYPKYRTGETKEQLNDSVQKTIDKASKKGYEKVKEDHIADYT EIFDRVDLDLGQSVPTKTTDVLLNDYKAKKNTAAEDRALEVMLFQYGRYLTIASSRAGDL PSNLQGVWQNRVGDHNRVPWASDYHMNVNLQMNYWPTYSTNMAECATPLVDYINSLVEPG KVTAKTYFGVENGGFTAHTQNTPFGWTCPGWNFSWGWSPAALPWILQNCWEYYEYTGDVK YMEEHIYPMLKEAALLYDQILIEDTKTGRLVSAPAYSPEHGPVTAGNTYEQSLIWQLYED AATAAEILNVDKDKAAQWRERQAKLKPIEIGDSGQIKEWYTETTLGSMGQKGHRHMSHLL GLFPGDLISVDNPEFMDAAIVSLKERGEKSTGWGMGQRINAWARTGDGNQAHKLIQNLFN DGIYPNLWDTHTPFQIDGNFGMTSGVSEMLLQSNMGYINMLPSLPDVWANGSVKGLVARG NFEVSMKWADKNVTEATILSNNGGTATVQVKNASLATVLDENGKVVDVKPVSADRISFET TKGQTYTIKNIPEGVETPTGLAAERVDDNSVDLTWDEAKTEGRTFNVYRKVENGDVQLIA SDVKTNSYKDITADKRLGAMKYQITAVVAGQESKKSEFVSVNDMSDMAGMIDDTDSRVVY QGAWGDWKEDVNYNGTIKYLNTPQGGETASLTFVGTGIEVITCTNHDRGKIEVLIDGKSY GEVDTYSAQTKRQQTIFTKKDLSEKPEKHTITLKVLNKSNQGEGKSTKVELDAFNVLDST 
TVKPSSVKVATVSGITTVGKANSTVQMKAEVLPKDAKDKSVTWSSSDNSIATVDEKGVVT VKEQNGDVKITATSKADASKKGEMTLKVAIKKNNEVQETVVEDGTKNGSTGTRNDKITWN GTWTTWAGETKHHGGTKTETGSGADAVGTYFEYKFTGTGIEVYSHKNTTQASFAVYIDGN LVQDNVSLDGNDVAQSLVYSNKNLTNAEHTIKCVVKARDGKNQANLDYLKIFSPVESVKV DKSKLQDTITEASKLSEDAYAKDKWATFKEVYDKAVEVMNKDAATQEEVDKAQKDLADAI KALGKAQAPVVTDETGKAILVESKLVALEWDTVRGATSYEIVDEEHNVKAEATDTFVKVT GLKPGTTYNFKIYAVNEGGKSAKAIEVNHVTTTDPNADNTLPSVTDIVKTVIGKDSVKLT WKAPAGTEAAGYTVYVDGVKKGTTDKEEFTLEGLEKDKTYVVKIIAFDNEGHQSLPAQFA FVFSEEKEEQVIVNVSETAGLTVEKGTAFDKLNLPKTVMVQLESGLNKELSVKWQKGDYN EKEDGVYTLKGKLELKDNITNPNNISATIKVTVKSETITPPNPDGSGDNNNGNNNNGNNS GNNGGNANGGNNNPNSPVKTGDTAPIALWGLIGLCAIAGIGFVVRKRRVK >gi|330401653|gb|ADLB01000025.1| GENE 94 94405 - 95745 1387 446 aa, chain - ## HITS:1 COG:CAC3715 KEGG:ns NR:ns ## COG: CAC3715 COG0305 # Protein_GI_number: 15896946 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Clostridium acetobutylicum # 8 440 6 434 442 463 54.0 1e-130 MDEALIKRVLPHSIEAEQSVVGAMLMDKDAIMTAAEIISGEDFYQTSYGVIFDAIVELFN EGKPVDLITLQERLKEKDVPAEISSLEFVRDLVTAVPTSANIKYYAEIVSEKAMLRRLIK LNEEIANTCYLAKEPLEAVLEMTEKKVFELVQKRNSGDFVPIKQVVLNALERIEKASKNK GTVTGIPTGFLDLDYKTSGLQPSDLILVAARPSMGKTAFVLNIAQHVAFKENKSVAIFSL EMSKEQLVNRLFSLESQVDAQLLRTGNLKDSDWEKLIEGAGIIGKSNLIIDDTPGISISE LRSKCRKYKLEKGLDIIIIDYLQLMTGRVGGRSESRQQEISDISRSLKGLARELNVPVIA LSQLSRAVEQRPDHRPMMSDLRESGAIEQDADVVMFIYRDDYYNKDTEMKNVAEIIIAKQ RNGPIGTVNLTWLPNYTKFANYLKKE >gi|330401653|gb|ADLB01000025.1| GENE 95 95758 - 96204 528 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160881875|ref|YP_001560843.1| ribosomal protein L9 [Clostridium phytofermentans ISDg] # 1 145 1 145 148 207 72 1e-52 MKVILLEDVKSLGKKGQIVNVNDGYARNFILPKKLGLEATGKNLNDLKLQNANKEKLAQE ALEAAQELAKKIEAGKIVVPIKVGEGGRTFGSISTKEIAIEVKNQMGYDIDKKKIQLKDA IKTLGTHNVPIKLHQKVVAELKVNVTEG >gi|330401653|gb|ADLB01000025.1| GENE 96 96206 - 98248 690 680 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|85057286|ref|YP_456202.1| exopolyphosphatase-related protein [Aster 
yellows witches'-broom phytoplasma AYWB] # 182 679 197 697 849 270 31 4e-89 MKKEIQFKGQLRLYMQWPAIMMILLVAMNIWIYQIDKKAGFVMTAFVLIYGVIVGILYFY NKSLLLSELVEFAGQYGAVQNVLLKELSIPYVILLEDGNVIWGNEQFSEMMEGRKEKYLS KYIPELNRGLFPSKEGVKTKTEISYGDREYEVEFSKISVQDFNEAEPLVEIPKGKDYFIT AYFRDITELKRYIRENDEQRLVSGLIYIDNYDEIMDSVEEVRQSLLVALIDRKINQYIAN ADGIAKKLEKDKYFIVIKKHFFEQMKEDRFSVLEDVKNVSIGNDVPATLSIGFGLSSDTY AQSYNYARVSIDLALARGGDQAVVKSPEGIEYFGGKREQTSKNTRVKARVKAEALREFIA AKDRVIIMGHKIGDVDSFGAGIGIYRAVTEMEKEAHIVINDITSSVRPLYESFTKNPAYP DDMFLTSKNVEDYVDNNTMVVVVDTNKRELTECEDLLDWAKMIVVLDHHRQGSDVIEGAV LSYIEPYASSTCEMVAEILQYIVDDVKLPSMEASSMYAGIMIDTNNFMNRAGVRTFEAAA FLRRCGADITFVRKMFREDMETYRAKAEIISSAEVYRDIYAIARSENLKIDSPTIVGAQA ANELLDIGNVKASFVLTQYNGKIYVSARSIDEVNVQIIMERIGGGGHMTTAGAQFAHTDM DKAIDVLKQTIDVMIEEGDI >gi|330401653|gb|ADLB01000025.1| GENE 97 98367 - 99383 1209 338 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 1 336 1 335 339 496 70.0 1e-140 MRILVTGGAGYIGSHTCIELIQAGYDVVVVDNLCNSCAEALNRVEKIVGKPIKFYEADIR DAKAMEDIFAKEDIDAVIHFAGLKAVGESVQKPLEYYDNNIAGTLVLCDAMRKAGVKNII FSSSATVYGDPAFVPITEECPKGQCTNPYGWTKSMLEQILTDFHTADPEWNVVLLRYFNP VGAHKSGTIGENPKGIPNNLMPYITQVAVGKLECLGVFGDDYDTHDGTGVRDYIHVVDLA VGHVKALKKIEEKAGVCVYNLGTGNGYSVLDMVKAFSKACGKEIKYEIKPRRAGDIAACY ADPAKAKAELGWEAERGLDEMCEDSWRWQSNNPNGYAE >gi|330401653|gb|ADLB01000025.1| GENE 98 99424 - 100536 1328 370 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 8 368 306 667 668 492 62.0 1e-137 MTAEMNKGQREAIESYQFAGELVDVRPYGSGHINDTYLVTLKEDGSEKKVILQRMNKNIF TKPVELMENVLGVTSYLRERIIENGGNPDRETLNVIPTSEGKPYFVDSEGEYWRAYKFIT GATSYDAVETPEDFYQSAVSFGNFQRLLAEYPAETLHETIEGFHDTKARFAVFKKAVEED ACGRAASVQKEIDFVLAHEDVANVFGDMLAKGELPLRVTHNDTKLNNIMIDDETRKGICV 
IDLDTVMPGLAMNDFGDSIRFGASTAAEDEQDLSKVSCDMGLFEIYTKGYIEGCGGRLTE KEIEMLPMGAKVMTFECGMRFLTDYLEGDHYFKIHREGHNLDRCRTQFKLVEDMEAKWDT MQEIVKKYSK >gi|330401653|gb|ADLB01000025.1| GENE 99 100536 - 101462 1062 308 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0139 NR:ns ## KEGG: EUBREC_0139 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 307 1 307 668 424 63.0 1e-117 MKKPVLVIMAAGMGSRYGGLKQIDPVDEQGHIIMDFSIFDAKRAGFEKVVFIIKKENEKD FREVIGKRMAKHMEVAYVFQDLHNLPEGFSVPEGRVKPWGTAHAVLSAIDEIDGPFAVIN ADDYYGRHAFQTIYDYLSTHEDDDKYRYTMVGYELQNTVTDNGHVARGICQVSADGKLEG IQERTRIEKREGKIAFTEDDGETWTYVPGETLVSMNMWGFTRSILEEIQKGFPAFLEKGL AENPMKCEYFLPTVVDNLEKAGRAEVSVLSSKDKWFGVTYKEDKPVVVEAIRKMKEDGLY PEKLWGEE >gi|330401653|gb|ADLB01000025.1| GENE 100 101555 - 101650 92 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINLKSSTPTFILEPFVFEKKFGKVRTIVQR >gi|330401653|gb|ADLB01000025.1| GENE 101 101735 - 102053 185 106 aa, chain + ## HITS:1 COG:BH1412 KEGG:ns NR:ns ## COG: BH1412 COG3464 # Protein_GI_number: 15613975 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 4 99 2 97 405 77 39.0 5e-15 MHSHYTNKLLNIEDVIIKKIHHADTFLKIYLETNPHEQVCPCCGSTTKRIHDYRYQTIKD LPFQLKHCYLVLRKRRYVCKCGKRFYESYSFLPRYFQRTARLTAFY Prediction of potential genes in microbial genomes Time: Tue May 24 21:57:00 2011 Seq name: gi|330401545|gb|ADLB01000026.1| Lachnospiraceae bacterium 2_1_46FAA cont1.26, whole genome shotgun sequence Length of sequence - 36597 bp Number of predicted genes - 35, with homology - 29 Number of transcription units - 18, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 5 - 334 104 ## COG3464 Transposase and inactivated derivatives + Term 426 - 467 1.6 - Term 53 - 94 5.2 2 2 Tu 1 . - CDS 275 - 370 164 ## - Prom 397 - 456 2.0 - Term 469 - 507 7.2 3 3 Op 1 . 
- CDS 515 - 682 234 ## 4 3 Op 2 38/0.000 - CDS 696 - 1571 1085 ## COG0395 ABC-type sugar transport system, permease component 5 3 Op 3 35/0.000 - CDS 1587 - 2477 869 ## COG1175 ABC-type sugar transport systems, permease components - Term 2492 - 2526 4.4 6 3 Op 4 . - CDS 2560 - 3927 1550 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 4034 - 4093 9.0 - Term 4075 - 4113 9.3 7 4 Tu 1 . - CDS 4126 - 9810 5713 ## SAV_7268 hypothetical protein - Prom 9882 - 9941 9.3 + Prom 9867 - 9926 7.0 8 5 Op 1 7/0.000 + CDS 9997 - 10572 634 ## COG2059 Chromate transport protein ChrA 9 5 Op 2 1/0.000 + CDS 10572 - 11117 395 ## COG2059 Chromate transport protein ChrA 10 5 Op 3 8/0.000 + CDS 11167 - 11994 792 ## COG2207 AraC-type DNA-binding domain-containing proteins 11 5 Op 4 . + CDS 12031 - 12918 479 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 12930 - 12973 7.2 - Term 12918 - 12961 7.2 12 6 Op 1 . - CDS 12971 - 13519 489 ## COG1396 Predicted transcriptional regulators - Term 13563 - 13612 5.1 13 6 Op 2 . - CDS 13628 - 15793 2429 ## Cphy_0577 hypothetical protein - Prom 15820 - 15879 7.6 14 7 Tu 1 . - CDS 15902 - 16933 1165 ## COG1363 Cellulase M and related proteins - Prom 16972 - 17031 8.3 - Term 17021 - 17063 3.2 15 8 Tu 1 . - CDS 17078 - 17596 546 ## - Prom 17655 - 17714 4.2 16 9 Op 1 . - CDS 17807 - 17989 113 ## 17 9 Op 2 . - CDS 17992 - 18300 185 ## Dhaf_3112 transcriptional regulator, XRE family - Prom 18325 - 18384 6.2 - Term 18345 - 18395 13.7 18 10 Tu 1 . - CDS 18399 - 19634 1155 ## COG2966 Uncharacterized conserved protein - Prom 19664 - 19723 6.5 - Term 19695 - 19759 15.0 19 11 Tu 1 . - CDS 19762 - 22131 2254 ## COG5492 Bacterial surface proteins containing Ig-like domains - Prom 22380 - 22439 8.1 - TRNA 22275 - 22345 63.3 # Cys GCA 0 0 - Term 22234 - 22272 4.4 20 12 Tu 1 . - CDS 22445 - 22588 56 ## - Prom 22670 - 22729 5.9 + Prom 22620 - 22679 6.8 21 13 Tu 1 . 
+ CDS 22700 - 22993 443 ## gi|154484766|ref|ZP_02027214.1| hypothetical protein EUBVEN_02484 + Term 23030 - 23080 0.6 + Prom 23011 - 23070 5.0 22 14 Tu 1 . + CDS 23251 - 23847 233 ## COG2801 Transposase and inactivated derivatives 23 15 Tu 1 . - CDS 23863 - 24522 667 ## Blon_1431 hypothetical protein - Prom 24559 - 24618 8.2 - Term 24938 - 24991 11.1 24 16 Op 1 . - CDS 24993 - 25820 1171 ## COG0005 Purine nucleoside phosphorylase 25 16 Op 2 . - CDS 25837 - 26565 741 ## HMPREF0868_0659 cyclic nucleotide-binding domain protein 26 16 Op 3 . - CDS 26567 - 27526 733 ## COG1816 Adenosine deaminase - Term 27534 - 27583 8.6 27 17 Op 1 . - CDS 27602 - 28702 1320 ## COG1744 Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein 28 17 Op 2 7/0.000 - CDS 28724 - 29389 723 ## COG0274 Deoxyribose-phosphate aldolase 29 17 Op 3 . - CDS 29402 - 30724 1698 ## COG0213 Thymidine phosphorylase 30 17 Op 4 26/0.000 - CDS 30739 - 31692 1101 ## COG1079 Uncharacterized ABC-type transport system, permease component 31 17 Op 5 24/0.000 - CDS 31697 - 32830 1444 ## COG4603 ABC-type uncharacterized transport system, permease component 32 17 Op 6 . - CDS 32820 - 34349 336 ## PROTEIN SUPPORTED gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 33 17 Op 7 . - CDS 34373 - 34786 460 ## COG0295 Cytidine deaminase 34 17 Op 8 . - CDS 34801 - 35973 1096 ## COG1015 Phosphopentomutase - Prom 36117 - 36176 9.3 + Prom 36006 - 36065 7.1 35 18 Tu 1 . 
+ CDS 36224 - 36334 146 ## Predicted protein(s) >gi|330401545|gb|ADLB01000026.1| GENE 1 5 - 334 104 109 aa, chain + ## HITS:1 COG:BH1307 KEGG:ns NR:ns ## COG: BH1307 COG3464 # Protein_GI_number: 15613870 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 7 106 324 429 441 73 37.0 1e-13 MLLYNDDLRRAHYLKKNFNELCQNTKYSEQRKDFFDWIKMAESSGLNEFEKVAKTYRAWS KEILNAFKYAHITNGPTEGFNNKIKVLKRTSYGIRNFQHLRTRIFLITD >gi|330401545|gb|ADLB01000026.1| GENE 2 275 - 370 164 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MANKPSSASLLKLIGYKKNASSEMLEISDTV >gi|330401545|gb|ADLB01000026.1| GENE 3 515 - 682 234 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNSVRNIIMCIVFVISLALIIIGQRNISVSGLMMEIVGLIGLLTLLFVYNHRFK >gi|330401545|gb|ADLB01000026.1| GENE 4 696 - 1571 1085 291 aa, chain - ## HITS:1 COG:BS_yurM KEGG:ns NR:ns ## COG: BS_yurM COG0395 # Protein_GI_number: 16080311 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, permease component # Organism: Bacillus subtilis # 11 291 31 300 300 143 30.0 5e-34 MGKKNRKPTSFIYKLFIYVALFTLAISIIIPVAWVFMASLKQNSEFIGVNVSPWALPKEF FFQNFVIAFRDANMGTFFLNSVIVTALALILLLVLALPASYALSRFNFRGKKILNLGFMA GLFVNISYIVIPIFLMLSEVDKALGMEFLLNNRFVLALVYASSALPFTVYLLSGYFQTLP KGFEEAAYIDGCGYFKTMVKIMIPMAKPSIITVILFNFLSFWNEYIIAYTLMDGNDTLAM GLKNLMAVEKTATNYGIMYAGLVIVMLPTLILYIAVQKRLTEGMTLGGLKG >gi|330401545|gb|ADLB01000026.1| GENE 5 1587 - 2477 869 296 aa, chain - ## HITS:1 COG:BS_yurN KEGG:ns NR:ns ## COG: BS_yurN COG1175 # Protein_GI_number: 16080312 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, permease components # Organism: Bacillus subtilis # 15 282 13 279 292 166 36.0 4e-41 MNRKRSERRFIFACLAPATILLVLFIFIPTIDVFRMSLYRMGGITNQKTFIGFENFKTLM GDKSFLQAMQNTILVIVLVMLCTIVLAVLFAALLNRGTFKGKNFFRVIFYIPNILSIVVI AGIFGAIYNPSTGLLNTFLEAIHLDGLTHKWMAEPNIVIYSVIFALVWQAIGYYMVMYMA 
SMAAIPPDYYEAASLDGATELQMFFKITFPLIWSNIRTTLTFYIISTINLSFLFVQIMTN GGPNGKTEVGLNYMYKQAYGNGAYGYGMAIGVVIFLFSFILAGIVNKITDREVYEF >gi|330401545|gb|ADLB01000026.1| GENE 6 2560 - 3927 1550 455 aa, chain - ## HITS:1 COG:PM1762 KEGG:ns NR:ns ## COG: PM1762 COG1653 # Protein_GI_number: 15603627 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Pasteurella multocida # 1 346 1 342 451 79 24.0 1e-14 MKMRKVTALLLSVAMIATLAAGCGSKDKDAKKDGKTVLKVAAFEGGNGAQIWKNIEKAFE EENKDVDIELHLSSELDKDLTKDIKNGNVPDVVYYNLGQPSGFTETMLKEKAIADISDVF DDELKEKLVPGITDNAAAQPYGDGKIYLAPITYTPTGLWYNKDLFEGPNKKYDLPTTWDE FFALGDKAKADGVSLLTYPIPSYFDTTMYQMLAQAGGIDFYNKAVNYDKDTWTSEAGKKV SDTISKIAQPKYVHPDTVANANAEGGFKLNQQNVIDGKALFMPNGNWVIGEMANSTPEGF NWGMMAPPKFSANDEKHFIYTFTEQMWVPAEAKEMDLAKKFVKFMYSDKVVDLMLANETT NKETGEKSAAPIVAPVVGASDKLPAGTIKDTYTLANSEGYETVSGAWATTKPIEGFDMKA TVYGSIDALNSGEMKAADYNKLLVDTWKKLAENVE >gi|330401545|gb|ADLB01000026.1| GENE 7 4126 - 9810 5713 1894 aa, chain - ## HITS:1 COG:no KEGG:SAV_7268 NR:ns ## KEGG: SAV_7268 # Name: not_defined # Def: hypothetical protein # Organism: S.avermitilis # Pathway: not_defined # 47 693 46 624 625 326 35.0 5e-87 MRMKKRLLAGVVTVCLLVAGFPVGFFSTGKPVLAKEKTGTVIDVTDFGADPGGKTDSTKA VQKALEKAKETKGKVTLNFPKGEYHFWKDYATKRNYHTSNTSSLSYPEKSIAILLEGVDN LTLEGNGSSLIMHGDMMAIAAVDSHNIKLHDFVLDYKDPDTVDLSVVGNGVDESGKQYAD FYVPANYNYEINADKTGITWQGEMSPVTKRPYWEKSNADFCAYLVIYKGYDQTVARASNK AASNPFANVASIEKAGENVLRFTYNGERPKDQEVGNIFLLSDSATRKTTGAFFWESEDVL VENIDVHYLSGFGWLTQMCKNVEFKGVDFLPRYATGKYTTSNADQLHVAGCGGYFKVTDC NFSMSHDDPINVHGSYMRVEEIIDNRTLKLKYIHSQQGGFRQFHKGDEVLFYSRTYLELP DGSTEENPYIVESSVAPGEEYNGEKLDMRTEVVTFKEAFSEETLNDLKIKVRRNNSEEEE GLYVAENVSYTAAVTISGNHMKSIPTRGILCTTRKPVVIEDNIFDNMTMANIYLSNDANY WYESGPIRNMKIRNNKFYIRPTGQAEWGDVSGIFVDPVVLTSVQDAPASKGNIPVHRNIS VEGNEFHMANDNVVTAEGVDGMTIKNNKIIRDNPGIEITLNDMEGLGIGETQDIKKEVKE KTLPKDIFKFTNCKNVEISGNTYDDGLNLNVTTSGDKMTEDDVEIKDDVLTLNKEGNNLI TSESKVHYITSDPTVAYVNEQGQLVGVADGTVTLTAYVEWNGTMIKSNSVEIPVGEGKGT 
TVSIEADKTEITEEGGTAHLTITDGAEIEVLDPLTGKASNAATVKGNVYTALKEGAVLLK ADKNGKSASVLMINRFPKSYGDVSKLNSSEVSVDNMHEKMLSGAKNEITIEAESNPKSDL YGANTYVTNLVKLPILEEYKNDLRIQVDVDGLVVNGDGYNNAGIVLYKDGDNYYSVGKKG HMSGVTAVYEEKASPSERGGDSADNKLTSTTFEIEIQNDTVTVRFKDASGKWKTANTQNA ISKITDGKLYLAFTAWENGGKDFSSTFSNVKIAKASETTTENMNTVVPKQIFSAVSNERP TVSSVDLKADKVNTEAKVTAKVSDSDGTVEKIIYEWTLENESGVKDVTYTSEGKYIPKEA GKLSVRILAIDNLGKPSAVAKSEEVSVTVDKSDAETLRNLYINGNAVADFDKKDAKFVLP DGASSKIRVSYDPENAGVKTVIKDKDGKVTLATLTDECAAVVDMQDGLTIERGSKVYKVA LHTQKEGNVNIQKLSVDGKDVNLSEDEIKSGTNSYFINTDKDDVSIRMETEDKGTAFSVK RSFFNIPVENASKDKGVFEADVDLTAGINAFYLTVKGADGITERTVRLYLFRDGFNNSAL KDLKVNGKSLPDFAPDKTEYTMYVDNPQSVKVEAVQGAQEQQTNITKNSVRTDGVSASYK LEDGLNQIVVANTSENMWTTTYYTVNVVVRSGSNADLLSLKTDETLLPGFDPLKPDVTEY AVTMNNDSIHIEAQALMDNAKIRVFSDTEEKSGAGSIVTDMNIYEGKNVIGIEVTSPDGK VKRVYTLDIDAEGFDYASDRMDISTKDEVGYGELGLDISSSGGAIALADEKGQRVEFKKG LGAHASSEVAYNIEGQGYTAFESYIGIDYYQVSQGNVPSSVTFRVLVDGEEKYNSGEMTV KDPMKKIKIDLTGASTIQLFADMGENNYNDHADWADAKFVKSLPEKPEKPKVNKITSVDK VKDVEVAYGTKLKDIPLPKEVKVELDNGTTVELPVEWSCEKYNAEVAGKYVFTGTLTVPE GIENPDGLMAETEVTVKEKPNDTTNPDNSGNNGSHSNNNSNSNSNGSSADKGENHQGAVE TGDSFGMMMWISLLAILTSAGVIVVYFLRRKNRR >gi|330401545|gb|ADLB01000026.1| GENE 8 9997 - 10572 634 191 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 13 189 8 182 186 87 28.0 2e-17 MDKNLSKKQILKKLFFSTLYLSTFTFGGGYVIVSLMKNKFVDELHWIDETEMMDLISIAQ SSPGAVAVNGAIVVGYKLCGLAGVGLAILGAIIPPFVIISVISGFYTLFQSNPVIQSLLT GMRAGVGAVIVSVVWDMGFGVVKGKQFLPIVIMISAFIANYYFKINIIFIIFICIVFSVC KTVLDCGKEKR >gi|330401545|gb|ADLB01000026.1| GENE 9 10572 - 11117 395 181 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 1 148 1 148 176 93 37.0 3e-19 
MIYIQLFLSYLQIGAFSFGGGYAAMSLIQAQVVEKYHWLTMGQFTDLVTIAEMTPGPIAI NSATFVGTQVGGIFGAFCATIGCILPSCIIVTLLAKVYVKYRNVTVMQNILSTLRPVVVS MIAIAGISILLSVFFPNGADFSIRGTIIFLVALVLLRKTKLQPVSVMVGCGIFELIWQFI C >gi|330401545|gb|ADLB01000026.1| GENE 10 11167 - 11994 792 275 aa, chain + ## HITS:1 COG:lin0157 KEGG:ns NR:ns ## COG: lin0157 COG2207 # Protein_GI_number: 16799234 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Listeria innocua # 22 272 20 276 277 101 29.0 1e-21 MSNQNYRLEEKALEKIDARLLYITEARYGNDWHSIVHTHHFTELFFVTHGSGNFIIENES FPVKENDLVIVNPNVSHTESGKDGTPLEYIVLGINGLQFKTEDRQIDYNIHNFLHNREEV YFYLKALLHEVQTKEENFESICQNLLEILIWNIMRKTKTTLSVAPTKKITKECRFIEQYL DEHFMEDITLQTLSDLTYLNKYYLVHAFKHYKGVSPINYLIEKRISEAKHLLDTTNYPIA KVASVVGFSSQSYFSQVFRKETGFSPNEYRKSMEN >gi|330401545|gb|ADLB01000026.1| GENE 11 12031 - 12918 479 295 aa, chain + ## HITS:1 COG:PA0748 KEGG:ns NR:ns ## COG: PA0748 COG2207 # Protein_GI_number: 15595945 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 155 285 168 300 306 68 32.0 2e-11 MSYKSVNLEDTISIENLFTIHYFEYMSDFSFAGESHDFWEFVFVDKGEIDIYMDNIHTRL KKREVAFHKPNEFHRLQATGSSAPNLVVISFSCNSPAMSFFREKILTVDDFGKTLLGNII KEAKKLFDCRLDDPYLTEMIKKEEKPLGAEQLIKIQLEHFLLHLLRCSQNRNASALPKPA ASKGSSEIFNRVLEYMENHLDSRLTIDLLCKENMVGRSQLQKIFQQKTGLGVIEYFSHMK IDAAKQMMRTDFMNFTQISESLGYSSIHYFSRQFKKITGMTPSEYVSSVKALAEN >gi|330401545|gb|ADLB01000026.1| GENE 12 12971 - 13519 489 182 aa, chain - ## HITS:1 COG:TM0656 KEGG:ns NR:ns ## COG: TM0656 COG1396 # Protein_GI_number: 15643421 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Thermotoga maritima # 6 175 2 172 176 93 34.0 2e-19 MSDISKIGYNVRSLRKERGITLQQLAEATGLSTGYLSNLERNVNSPTLVNIQKICEALDI SLGDMVERNREESIIVRKDEREVVIDEQNSIRLENIDYGLDKTTFTYMSIEPGGAGEGLR WEHDFDEVGTVIKGELTLGLGNQMFHLKEGDTILVKAHTSHCYFNKGTEESVSFWARCRE SE >gi|330401545|gb|ADLB01000026.1| GENE 13 13628 - 15793 2429 721 aa, chain - ## HITS:1 COG:no 
KEGG:Cphy_0577 NR:ns ## KEGG: Cphy_0577 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 6 721 7 723 723 1181 78.0 0 MSKEKLTGRVTIPTDVDVVPETLELVKRWGADAIRDCDGTDYPEELRDVDAKVYSTYYTT RKDNAWAKANPDEIQQMYIMTSFHTAVDSELSIHLMDHLYPDMLKVNSHDDITRWWEVID RTTGEVVPVTEWDYSEETGDVTIHGAKPFHDYTVSFLAYIMWDPVHMYNAVVNDWKDVEK QITFDVRQPKTKEFTMRRLRKYIEDNPHIDVIRYTTFFHQFTLIFDEMAREKYVDWYGYS ASVSPYILEQFEKEVGYKFRPEFIIDQGYMNNQYRIPSKEFRDFQAFQRREVAKLAKEMV DLTHELGKEAMMFLGDHWIGMEPFMDEFKSIGLDAVVGSVGNGSTLRLISDIPNVKYTEG RFLPYFFPDTFHEGGDPVKEAKVNWVTARRAILRKPIDRIGYGGYLKLALQFPEFIDYVE SVCDEFRTLYENIKGTTPYCVKTVAVLNSWGKMRAWGNHMVHHALYQKQNYSYAGIIEAL SGAPFDVKFISFDDIKENPAILDNIDVVLNVGDAYTAHSGGANWIDETVVTAVKKFIYNG GGFIGVGEPTAYQWEGRFFQLANALGVEKENGFNLNTDKYNWDEHDHFIKEDCTKDIDFG EGEKNIFALDGTTILRQVDKEVQLAVNEFGKGRCVYISGLPYSFENSRLLYRSIIWSSHD EDNLHKWFSSNFNVEVHAYVKNGKYCVVNNTYEPQSTTVYKGDGTSFDLDLDANEIIWYE I >gi|330401545|gb|ADLB01000026.1| GENE 14 15902 - 16933 1165 343 aa, chain - ## HITS:1 COG:CAC0690 KEGG:ns NR:ns ## COG: CAC0690 COG1363 # Protein_GI_number: 15893978 # Func_class: G Carbohydrate transport and metabolism # Function: Cellulase M and related proteins # Organism: Clostridium acetobutylicum # 16 341 11 339 343 278 44.0 2e-74 MTENNAVYHDYAVEQIEKLISIDSPTGYHYNVTAYLQEELEHMGFEVRTLRKGGVIANLG GEGNPLMMMAHVDTLGAFVHYIKGDGRLAISNGTLNQNNIETENVRILTRDRKVYEGTIQ LANASIHVNPDINEERKFTNLEVVLDENVSSKEDVEKLGISAGDFIMVEPRFRITESGYI KSRFLDDKASSGILLALAKYVSEHKGCLKRNVQIFFTVHEEVGHGAASGVTDDIKDILAI DMGCVGENLTCTERQVSICAKDGRGAYHFEMVNELIDTAKKNELDYAVDIYPFYGSDTAA ALIAGRDVRHALIGSGVYASHGYERTHKEGVRNTFELALHYVI >gi|330401545|gb|ADLB01000026.1| GENE 15 17078 - 17596 546 172 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRKQYKYIIVIAVIVLLIGSCYGAYRWYKSSISENYYVAKTKEGDIEVSFQRYAKAEYF IKDVPDYIGLSGGEKSFKEEEENGYGFILGQVKMAVDTYVEEAEYKMYLIQKNGKEQEVP LLFTTVFNDAVEKNKMVKADFYFTEQKSSLYTEVRLERDGKEIMGLELKVVE >gi|330401545|gb|ADLB01000026.1| GENE 16 17807 - 17989 113 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MIGNVGVLIYLCVNRKRKYFKMFILLSVFLLLMAARNTYAEISYYIAVHNLETTVEKIIE >gi|330401545|gb|ADLB01000026.1| GENE 17 17992 - 18300 185 102 aa, chain - ## HITS:1 COG:no KEGG:Dhaf_3112 NR:ns ## KEGG: Dhaf_3112 # Name: not_defined # Def: transcriptional regulator, XRE family # Organism: D.hafniense_DCB-2 # Pathway: not_defined # 7 86 3 82 208 68 40.0 7e-11 MEEQTNIGEKIKRIRRENNLSQEDLARSLHTSRQTVSRWETNRSVPDLEMLEKVAFIFEM EVKDLLDVKMVNKNQEKQETLDSYLPLSILMMFSVISVVIPF >gi|330401545|gb|ADLB01000026.1| GENE 18 18399 - 19634 1155 411 aa, chain - ## HITS:1 COG:FN0781 KEGG:ns NR:ns ## COG: FN0781 COG2966 # Protein_GI_number: 19704116 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 248 5 253 256 102 26.0 1e-21 MDYKKIVQGILDIGEAMLVNGAENFRLEDSLYRMCKSYGFVRYDVFVIPSNIQITVETPE GEIITQIRHIENADIDFDQLDYLNNLSRYVCQHTPDAKELREKYLEVVNRPAQHPAIKYF AAVMGGTGFAVFFGSDFQDAIVAVIVSLMIVVVGGWLSRREENLLIYNAILSFLSEVIIL LCVRAGLGSHPERIMIGIVMLLISGLSTTNGIRELLQKDYISGAINIMNSMLGAAGIAFG TAFAMLLLDGVSAEGFILNYNIPIQLMACTVACVGFAFWFKIRGKQVLYSGVGAFITWGI YLIVYEWKPSNFFATLVAAIFVAGYAFVMSRKNKAPSTIFLTASVFPLIPGPNLYYMMYG CVSRDIGMAFSEAIILLATCLAIAFGFNIVDVIARGTMRALKKEYHVGKKS >gi|330401545|gb|ADLB01000026.1| GENE 19 19762 - 22131 2254 789 aa, chain - ## HITS:1 COG:CAC3272_1 KEGG:ns NR:ns ## COG: CAC3272_1 COG5492 # Protein_GI_number: 15896517 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 653 737 395 480 482 69 47.0 2e-11 MRKRKWTAVLLSVAMMLTMLPSTAFAKETKAEAWDGKAVDTSWYAEDKTSYEISSADQLA GLSQLVNEGKSFKGKTVSLTDDIDLNNKEWTPIGKSGKTFQGTFDGKNHVISNLSINTPK KNDIGLFGYTSGGAIKNFQVKDATVKGRMCVGVIAGTPHTADYANITVSGKVIVEGYAYV GGACGKNAYGDITNVDITGEAGSYVKAESEQYRTYVGGLVGFMGEGNQTVKDCDVKIDVI GSTCDVGGILGILHYGNTLQNCTYEGSLTITKPDAEDGSEFGALVGTIYNGKDKVSTTIK DCEATVNQATSGGKDVTDTITPHGDFYNKTESGSLEDITIDAIINGSHVAITNAVAKIGN TEYKTLAEAFRAAKSGDTVEILRDVDVKVWNQINNVKNITVKGNGHTLNIAEVESKQNGD 
YLFYNADGLNVSDVNINFTKSGNGFCMNSGKLENVHMTGAPEQKSNYAVFVGTGSKVEID NSSFKNFGVAVYSQPTANDKATSDIDVTDSQFTDCKMTICSYAANTVFTGNTVTGSEELS FAAKADAVNKDANVTYVITGNTFKDAGKIWFYGANANDVVFKKNKVTGNTYISTEDAAEG VLDLNYNYWGGNAPSDKQIVDNNDNVSVENSIYYINERMSDKDLNTYVPLDKIILNKTEL KLHVNESEKLIAAFVPDNATDKALTWSSSDEKVAIVDENGLVTAKGLGTAEITVTTEDGK TAVCKVTVEKKKDENPSTKPDKPNQPDKPQKPESPKTADNSNVFFYMIMLLAAGAGIIVY AKRRRNWIK >gi|330401545|gb|ADLB01000026.1| GENE 20 22445 - 22588 56 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKDLFVNFIGAVVFSIIGYFYVKRRGEERENLRRNSFRGESRRKVII >gi|330401545|gb|ADLB01000026.1| GENE 21 22700 - 22993 443 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|154484766|ref|ZP_02027214.1| ## NR: gi|154484766|ref|ZP_02027214.1| hypothetical protein EUBVEN_02484 [Eubacterium ventriosum ATCC 27560] # 1 92 1 89 182 84 51.0 3e-15 MATRYNENFKIEVVKAYMAGDRSISEIAADYNVAKSTLTGWVKKHGEECQYKTTTKTSLE EFESAKEIRRLNQLLHEKDKEIDFLKKAAAFFAKEID >gi|330401545|gb|ADLB01000026.1| GENE 22 23251 - 23847 233 198 aa, chain + ## HITS:1 COG:pli0072 KEGG:ns NR:ns ## COG: pli0072 COG2801 # Protein_GI_number: 18450354 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Listeria innocua # 15 193 72 255 274 115 36.0 4e-26 MNKDLNLSAIVMRKKPGYKSCKKHKIFDNLLKQKFNVDEKNKIWCTDFTYMRQPNGKFRY NCTIMDLFDRSAIASVNSDYINTELAIETLTEALKKEKPNNGILLHNDQGVQYTSWEFVN FCKDNGVIQSMSKAGCPYDNAPMERFYNTFKNCFYYRYTFESVEQLDEMTKQYINWYNYV RPHSTNNYMTPMEFRYRA >gi|330401545|gb|ADLB01000026.1| GENE 23 23863 - 24522 667 219 aa, chain - ## HITS:1 COG:no KEGG:Blon_1431 NR:ns ## KEGG: Blon_1431 # Name: not_defined # Def: hypothetical protein # Organism: B.longum_infantis_ATCC15697 # Pathway: not_defined # 1 212 34 245 358 267 60.0 2e-70 MRQRFKIHWKAMKMEIKEHKSSFIVYMVLRILVILIMILQILNKNYENVFYCVLTLLLLI VPSFLQVELKIELPTTLEIIILLFIVAAEVLGEIRSYYTKVPGWDTMLHTINGFLAAAIG FSLVDILNRNKKAKFNLSPLYMAIVAFCFSMTIGVMWEFYECTVDTFLGMDTQKDTVVHS ISTVMLDPNGETNVVHIDDIDEVIVNGQELGLCTGQAFW >gi|330401545|gb|ADLB01000026.1| GENE 24 24993 - 25820 1171 
275 aa, chain - ## HITS:1 COG:CAC2064 KEGG:ns NR:ns ## COG: CAC2064 COG0005 # Protein_GI_number: 15895334 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Clostridium acetobutylicum # 15 273 13 271 271 271 50.0 7e-73 MNPVYEKLMTCVESVKAKVDFRPEVALILGSGLGDYADEIEIVQTVDYTDIEGFPTSTVA GHKGRFVFGYVDKTPVVIMQGRVHYYEGYPMSDVVLPTRLMGLLGAKKLILTNAAGGVNF DFKPGDFMMLTDHITTGVPSALIGANLEELGVRFPDMSEVYSTKMQEIVRETAKEQGIDL KEGVYAQFTGPNYETPAEIRMARAWGADAVGMSTACEAMAARHMGMEICGISCITNLAAG MSKEELNHKEVQETADRVAKQFKTLITGIITRIGE >gi|330401545|gb|ADLB01000026.1| GENE 25 25837 - 26565 741 242 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0868_0659 NR:ns ## KEGG: HMPREF0868_0659 # Name: not_defined # Def: cyclic nucleotide-binding domain protein # Organism: Clostridiales_BVAB3 # Pathway: Two-component system [PATH:clo02020] # 4 241 5 242 242 296 58.0 6e-79 MECKYLYNVLPFLAKCKKETQKQFENYFKSAPAWLMDSFQIEEMAKGEIFVRENTPIDTI YFVGKGMIKATDYRIYGITYDFMSFDCVHALGGMEIIMDLDTYMTTLETATPCTVVKIPK AQYEKWLLSDLRVLKLEAKLTGKSLLEDGRKSRAYLFLQGSDRLAMLLTERYELYAKNGI LRVQGGRQGLSDTTGLCVKTINRAVKKFSEEGLVSKEGNKITVSYEQYKKLKDKIMRIID IE >gi|330401545|gb|ADLB01000026.1| GENE 26 26567 - 27526 733 319 aa, chain - ## HITS:1 COG:CAC3005 KEGG:ns NR:ns ## COG: CAC3005 COG1816 # Protein_GI_number: 15896257 # Func_class: F Nucleotide transport and metabolism # Function: Adenosine deaminase # Organism: Clostridium acetobutylicum # 3 316 10 333 334 231 38.0 1e-60 MVIPKVELHCHLDGSLPIQTVSELLGREVRQSELQVSEDCRNLAEYLEKFDLPLQCLQTE EGLKKASKAFLMDLQKDNVQYVEVRFAPLLSVNEHLNCRRVIQSVIEGLEEAKKECNIFY NVIACAMRHHSEEENLEMMKVAREFLGEGLCAVDLAGNEAAFPMENYVELFGEAKKLGLP FTIHAGECGRVENVIQSVECGAARIGHGIALRGNRDGIALCKEKGIGIEMCPISNLQTKA VQNPSEYPLREFIDAGLRVTINTDNRTVSNSSLQKEMEFVKKQYGITDDELIKLTGNAID VAFAEDSVKEELWRRLRGV >gi|330401545|gb|ADLB01000026.1| GENE 27 27602 - 28702 1320 366 aa, chain - ## HITS:1 COG:TM0102 KEGG:ns NR:ns ## COG: TM0102 COG1744 # Protein_GI_number: 15642877 # Func_class: R General function prediction only # Function: 
Uncharacterized ABC-type transport system, periplasmic component/surface lipoprotein # Organism: Thermotoga maritima # 37 356 18 353 359 135 30.0 1e-31 MKKRLISLILAGAMVLGMAGCAKKPGESDDKKGDKGGYELALITDVGTIDDKSFNQGSWE GLKKYADEKDISCKYYKPSEKSDKACLTAIDLAVKGGAKVVVTPGFLFEKPIFEAQTKYP DVTFIIIDAAPIGEDKKPHIEKNVLSIFYAEEQAGYLAGYAAVKEGYKQLGFMGGIAVPA VIRYGYGFVQGADAAAKELGLAKDAIDIKYTYVGNFDASPENQTKAAAWYSEGTECIFAC GGGVGNSVMKAAEGAGKKVIGVDVDQSVESDTVITSAMKKLSDSVYNAVASYYDKSFEGG KSITLSAADDGVQLPMETSKFEKFTKEEYEKVYSGLKDGSIKIANDTAAKDAKDLPTSVV KVESIK >gi|330401545|gb|ADLB01000026.1| GENE 28 28724 - 29389 723 221 aa, chain - ## HITS:1 COG:SPy1867 KEGG:ns NR:ns ## COG: SPy1867 COG0274 # Protein_GI_number: 15675686 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Streptococcus pyogenes M1 GAS # 1 219 1 219 223 232 54.0 4e-61 MKTSEILSHVDHTLLKAFATWEDIQKLCDEALEYQTASVCIPPNYIKRIHDTYGDKINIC TVVGFPLGYSTTKAKLAEVEQAVEDGVGEVDMVINITDVKNGDFDKVTEEIRSLKQAVGD RILKVIIETCYLTKEEKIAMCKAVTEAGADYIKTSTGFGTAGATMEDILLFKEYIGPHVK MKAAGGVKSVEDMEAFLEAGCDRIGTSSAISLIQGQNVKDY >gi|330401545|gb|ADLB01000026.1| GENE 29 29402 - 30724 1698 440 aa, chain - ## HITS:1 COG:BH1533 KEGG:ns NR:ns ## COG: BH1533 COG0213 # Protein_GI_number: 15614096 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine phosphorylase # Organism: Bacillus halodurans # 1 405 1 406 433 437 57.0 1e-122 MRMYDLIMKKRNGEALTKEEIQYMITEYVDGRIPDYQMSAFLMAVYFQGMTEEETLAMTL AVAHSGDMVDLSGIEGMKVDKHSTGGVGDKTTLIIAPIVASCGVKVAKMSGRGLGHTGGT VDKMESIPGMQTSLDRESFFAVVNKTGLSVIGQSGNLAPADKKLYALRDVTATVDSIPLI AVSIMSKKLAAGNDGILLDVKTGSGAFMKTVEDSIALAKEMVSIGENAGKKMVALITNMD IPLGHNIGNSLEVIEAVETLKGNGPEDLTEVCLQLAANMLYLAEKGSIEECRQMAEEAIR SGNALERLIAMVEAQGGDAGVIRDTDKFAKAPYQREVLAEKDGHITHMNAEGCGIASSML GAGRETKDSEIDYTAGIVLKKKTGDSVKAGDVLAVLYTSNDSLFEAAEEKYQQSVVIQAE APKEEPLIYARVTPNGVEKY >gi|330401545|gb|ADLB01000026.1| GENE 30 30739 - 31692 1101 317 aa, chain - ## HITS:1 COG:SPy1225 KEGG:ns NR:ns ## COG: SPy1225 COG1079 # 
Protein_GI_number: 15675189 # Func_class: R General function prediction only # Function: Uncharacterized ABC-type transport system, permease component # Organism: Streptococcus pyogenes M1 GAS # 3 314 6 316 318 226 44.0 4e-59 MDIVYFIMQQTMFFAIPLLIVAIGGMYSERSGVINIALEGIMVIGAFFGILFINMAQGSM SGQGLLLLAVVIAGVTGGIFSLLHAFASINMKADQTISGTALNMFAPAFAIFVARMIQGV QQIQFTDTFQIDKVPLLGDIPVIGPLFFQNSYITTYLGILIFLIAAFVIKRTRFGLRLRA CGEHPGAADSVGVNVYKIRYAGVIISGVLAGIGGLVFIVPTSTNFNASVAGYGFLALAVL IFGQWRSNKIFMAAFFFGVMKTLASAYSAIPVLKSLPIPNEVYKMIPYIATLIVLAFFSK NSQAPKAEGIPYDKGMR >gi|330401545|gb|ADLB01000026.1| GENE 31 31697 - 32830 1444 377 aa, chain - ## HITS:1 COG:lin1427 KEGG:ns NR:ns ## COG: lin1427 COG4603 # Protein_GI_number: 16800495 # Func_class: R General function prediction only # Function: ABC-type uncharacterized transport system, permease component # Organism: Listeria innocua # 21 356 12 344 350 202 40.0 9e-52 MEYKKKGLFQREGMISFSSSILAIICGLLFGLVILLLSNPSQASGGFMMILQGGFTDGIQ GIGQMFYYATPIIMTGLSVGFAFQTGLFNIGAAGQFTIGAFVAVLIGVKCTFLPAGIHCL VAIAGAALAGALWGMIPGLLKAFRNVNEVISSIMMNYIGMYLVNLLIQKLIYDHVKNQSM AVASTANLPKAGLDKIFPNTDINVGIIIAIVFVILVYIILNRTTFGYELKACGKNQNASK YAGINAKRNIVLSMVISGALAGIGGALLYLANSGKYLQVLDIIAPEGFSGISVALLGMSN PIGILFAGLFIGHLTVGGYNMQLFDFAPEVIDMIIAAIIYCGALSLLFKGLISKIQFRKE KDKAVEKSEKDNLKKEE >gi|330401545|gb|ADLB01000026.1| GENE 32 32820 - 34349 336 509 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149915877|ref|ZP_01904401.1| 50S ribosomal protein L17 [Roseobacter sp. 
AzwK-3b] # 23 481 31 499 563 134 25 1e-30 MEYVIEMNHITKQFGEFKANDDITLRVRKGEIHALLGENGAGKSTLMSVLFGLYQPEKGQ IKINGKEVKINNPNDANALGIGMVHQHFKLVHNFTVLESIVLGRETTKGGVLKMENARKK IMELSERYKFKIDPDAYISDITVGMQQRVEILKMLYCDNDILIFDEPTAVLTPQEIHELM KVMKQLVEEGKSIIFITHKLNEIKEVADRCSVLRKGKYIGTIEVADTSKEEMSEMMVGRK VNFIIDKKESEPGDAILEVKNMTIRAKHGKDAVKDVSFTVRKGEIVTIAGIDGNGQSELV YGITGMMPIDAGQIFLKGKDITHESIRKKCLDGLAHIPEDRHKDGLVLDYNLQENLVLQT YFTERFQKHGFLKFDAIRKYAEELIKKFDIRSGQGADSIVRGMSGGNQQKAIIARELDRN PEIVIAVQPVRGLDVGAIEYIHRQLIAQRDAGKAVLLVSLELDEVMNVSDRILVMYEGEI VADVNPKEVTTEELGLYMAGAKRSVNHGV >gi|330401545|gb|ADLB01000026.1| GENE 33 34373 - 34786 460 137 aa, chain - ## HITS:1 COG:BS_cdd KEGG:ns NR:ns ## COG: BS_cdd COG0295 # Protein_GI_number: 16079584 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus subtilis # 1 136 1 130 136 144 55.0 3e-35 MTNRELLTEAKKARLKSYAPYSKFQVGAALLTKDGKVYHGCNIENASYTPTNCAERTAFF KAVSEGDTEFEKIAIVGGKEGTDADELCAPCGVCRQVMMEFCNPEEFQIILADGKDGMKV MSLKEMLPYGFCPSDLE >gi|330401545|gb|ADLB01000026.1| GENE 34 34801 - 35973 1096 390 aa, chain - ## HITS:1 COG:TM0167 KEGG:ns NR:ns ## COG: TM0167 COG1015 # Protein_GI_number: 15642941 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphopentomutase # Organism: Thermotoga maritima # 5 390 2 390 390 408 55.0 1e-113 MGRQRVFLIVLDSFGIGNAPDAADFGDEGANTLHTITQSPEYDTPNMNKLGLSRIDGIDY LEKVENVIGSYGRLQEASRGKDTTIGHWEIAGIISPQALPTYPNGFPKELLDEFSKRTGR GVLCNLPYSGTDVIRDYGEEHIRTGKLIVYTSADSVFQIAAHEDVVPVEELYRYCEIARE LLTGEHGVGRVIARPFTDAAPNFRRTSNRHDFSLLPPKDTMLDILQREGYDTYGVGKIYD IFAGKGIAHTQRIKDNVDGMEKTIALQDEDFNGLCFVNLVDFDMMYGHRNDIEGYAKAAT TFDEQLGTFMERMKPEDILMITADHGCDPGFKGTDHSRECVPFLVYGDSVKEGVDLGTRK TFADISATVLDMFGIENTTDGTSFKGEIYK >gi|330401545|gb|ADLB01000026.1| GENE 35 36224 - 36334 146 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MHSHYTNKLLNIEDVIIKKNSSRRYFFENLFRNQPS Prediction of potential genes in microbial genomes Time: Tue May 24 21:58:31 2011 Seq name: gi|330401367|gb|ADLB01000027.1| 
Lachnospiraceae bacterium 2_1_46FAA cont1.27, whole genome shotgun sequence Length of sequence - 64324 bp Number of predicted genes - 66, with homology - 62 Number of transcription units - 19, operones - 11 average op.length - 5.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 323 - 395 77.7 # Asn GTT 0 0 + TRNA 431 - 502 63.0 # Glu TTC 0 0 + TRNA 517 - 589 80.3 # Thr TGT 0 0 + TRNA 641 - 714 85.3 # Met CAT 0 0 + TRNA 734 - 807 87.1 # Asp GTC 0 0 + TRNA 850 - 922 85.3 # Val TAC 0 0 + TRNA 927 - 1013 65.5 # Leu TAA 0 0 + TRNA 1029 - 1102 73.6 # Arg ACG 0 0 + Prom 1031 - 1090 79.0 2 2 Op 1 . + CDS 1267 - 2499 1316 ## COG0205 6-phosphofructokinase + Term 2509 - 2547 4.5 + Prom 2517 - 2576 3.7 3 2 Op 2 . + CDS 2671 - 3183 305 ## PROTEIN SUPPORTED gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase 4 2 Op 3 . + CDS 3210 - 3308 73 ## + Prom 3314 - 3373 2.7 5 3 Op 1 . + CDS 3418 - 4653 1036 ## gi|166032834|ref|ZP_02235663.1| hypothetical protein DORFOR_02550 6 3 Op 2 . + CDS 4665 - 6596 1554 ## CPF_1123 cell wall-associated serine proteinase (EC:3.4.21.96) + Term 6801 - 6843 1.0 7 4 Tu 1 . - CDS 6781 - 8001 1070 ## COG0205 6-phosphofructokinase - Prom 8058 - 8117 8.3 + Prom 8078 - 8137 6.1 8 5 Op 1 . + CDS 8162 - 9178 788 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 9 5 Op 2 . + CDS 9241 - 10965 1583 ## COG1404 Subtilisin-like serine proteases + Prom 11004 - 11063 9.5 10 6 Op 1 . + CDS 11117 - 13402 1959 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 11 6 Op 2 . + CDS 13416 - 13571 287 ## Clos_1144 anaerobic ribonucleoside triphosphate reductase (EC:1.17.4.2) 12 6 Op 3 . + CDS 13564 - 14079 558 ## COG0602 Organic radical activating enzymes + Term 14081 - 14112 3.6 13 7 Tu 1 . 
- CDS 14121 - 14654 482 ## COG3760 Uncharacterized conserved protein - Prom 14681 - 14740 6.7 + Prom 14694 - 14753 7.4 14 8 Op 1 35/0.000 + CDS 14780 - 16519 209 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 15 8 Op 2 . + CDS 16512 - 18248 187 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 + Term 18431 - 18472 6.2 - Term 18198 - 18237 2.3 16 9 Tu 1 . - CDS 18245 - 19057 632 ## COG0789 Predicted transcriptional regulators - Prom 19084 - 19143 4.2 + Prom 19048 - 19107 4.5 17 10 Op 1 . + CDS 19145 - 20473 1219 ## COG0534 Na+-driven multidrug efflux pump + Prom 20475 - 20534 2.2 18 10 Op 2 7/0.000 + CDS 20554 - 21327 325 ## PROTEIN SUPPORTED gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 19 10 Op 3 . + CDS 21328 - 22551 1287 ## COG1668 ABC-type Na+ efflux pump, permease component 20 10 Op 4 1/0.222 + CDS 22609 - 22869 278 ## COG1873 Uncharacterized conserved protein 21 10 Op 5 . + CDS 22925 - 23377 605 ## COG1327 Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains + Term 23383 - 23413 1.0 - Term 23364 - 23406 7.9 22 11 Tu 1 . - CDS 23411 - 24604 899 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase - Prom 24730 - 24789 6.9 + Prom 24671 - 24730 6.4 23 12 Op 1 . + CDS 24779 - 25438 797 ## COG4684 Predicted membrane protein 24 12 Op 2 . + CDS 25449 - 26213 748 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 25 12 Op 3 . + CDS 26223 - 26978 609 ## COG0101 Pseudouridylate synthase 26 12 Op 4 . + CDS 26975 - 27727 696 ## COG0708 Exonuclease III 27 12 Op 5 . + CDS 27737 - 28267 647 ## COG1335 Amidases related to nicotinamidase + Term 28268 - 28301 2.3 - Term 28255 - 28290 4.3 28 13 Tu 1 . - CDS 28304 - 29035 230 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 - Prom 29061 - 29120 9.0 + Prom 29082 - 29141 6.8 29 14 Op 1 . 
+ CDS 29170 - 29703 528 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 30 14 Op 2 . + CDS 29781 - 30791 1008 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D 31 14 Op 3 . + CDS 30804 - 31664 912 ## COG1284 Uncharacterized conserved protein + Term 31666 - 31703 1.0 32 14 Op 4 . + CDS 31714 - 32364 602 ## COG0546 Predicted phosphatases 33 14 Op 5 23/0.000 + CDS 32378 - 32743 426 ## COG1380 Putative effector of murein hydrolase LrgA 34 14 Op 6 . + CDS 32740 - 33432 736 ## COG1346 Putative effector of murein hydrolase + Prom 33700 - 33759 8.7 35 15 Op 1 . + CDS 33825 - 36473 3037 ## COG0525 Valyl-tRNA synthetase 36 15 Op 2 . + CDS 36508 - 38685 2234 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 37 15 Op 3 . + CDS 38678 - 39976 865 ## COG0285 Folylpolyglutamate synthase 38 15 Op 4 . + CDS 39963 - 40100 245 ## 39 15 Op 5 . + CDS 40123 - 41472 1602 ## EUBREC_2575 hypothetical protein + Term 41486 - 41524 6.4 40 16 Op 1 8/0.000 + CDS 41538 - 41855 213 ## COG1366 Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) 41 16 Op 2 6/0.000 + CDS 41852 - 42307 481 ## COG2172 Anti-sigma regulatory factor (Ser/Thr protein kinase) 42 16 Op 3 . + CDS 42286 - 42999 812 ## COG1191 DNA-directed RNA polymerase specialized sigma subunit 43 16 Op 4 . + CDS 43071 - 43226 167 ## 44 16 Op 5 . + CDS 43217 - 43882 668 ## EUBELI_01606 stage V sporulation protein AA 45 16 Op 6 . + CDS 43842 - 44258 473 ## EUBELI_01605 hypothetical protein 46 16 Op 7 . + CDS 44273 - 44734 617 ## Cphy_0479 stage V sporulation protein AC 47 16 Op 8 . + CDS 44736 - 45740 1107 ## Cphy_0480 stage V sporulation protein AD 48 16 Op 9 . + CDS 45754 - 46110 497 ## EUBREC_2431 sporulation protein 49 16 Op 10 . + CDS 46198 - 47832 1824 ## COG1283 Na+/phosphate symporter 50 16 Op 11 . + CDS 47836 - 48879 1063 ## COG1316 Transcriptional regulator 51 16 Op 12 . 
+ CDS 48929 - 49531 776 ## gi|210611074|ref|ZP_03288716.1| hypothetical protein CLONEX_00906 + Term 49553 - 49593 7.6 + Prom 49535 - 49594 1.6 52 17 Op 1 2/0.222 + CDS 49666 - 50352 699 ## COG3944 Capsular polysaccharide biosynthesis protein 53 17 Op 2 3/0.000 + CDS 50359 - 51084 806 ## COG4464 Capsular polysaccharide biosynthesis protein 54 17 Op 3 1/0.222 + CDS 51087 - 51758 547 ## COG0489 ATPases involved in chromosome partitioning 55 17 Op 4 4/0.000 + CDS 51804 - 53666 1675 ## COG1086 Predicted nucleoside-diphosphate sugar epimerases 56 17 Op 5 25/0.000 + CDS 53671 - 55515 1354 ## COG0438 Glycosyltransferase 57 17 Op 6 25/0.000 + CDS 55512 - 56711 706 ## COG0438 Glycosyltransferase + Prom 56723 - 56782 5.7 58 17 Op 7 . + CDS 56813 - 57925 704 ## COG0438 Glycosyltransferase 59 17 Op 8 . + CDS 57918 - 58853 514 ## THA_745 putative glycosyltransferase protein 60 17 Op 9 . + CDS 58873 - 60027 467 ## SP70585_0419 putative repeating unit polymerase 61 17 Op 10 . + CDS 60020 - 61018 591 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 62 17 Op 11 . + CDS 61015 - 62151 750 ## BVU_2948 hypothetical protein 63 17 Op 12 . + CDS 62156 - 63259 774 ## COG0381 UDP-N-acetylglucosamine 2-epimerase 64 17 Op 13 . + CDS 63262 - 63897 304 ## gi|237727084|ref|ZP_04557565.1| predicted protein 65 18 Tu 1 . + CDS 63974 - 64132 270 ## EUBREC_2191 hypothetical protein + Term 64219 - 64249 -0.9 - Term 64136 - 64173 6.4 66 19 Tu 1 . 
- CDS 64195 - 64323 177 ## gi|210611402|ref|ZP_03288896.1| hypothetical protein CLONEX_01086 Predicted protein(s) >gi|330401367|gb|ADLB01000027.1| GENE 1 246 - 335 121 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLVRGGSQSLTSVSEFSWADIPNINLAS >gi|330401367|gb|ADLB01000027.1| GENE 2 1267 - 2499 1316 410 aa, chain + ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 2 407 13 413 427 268 41.0 1e-71 MKGNVIVGQSGGPTAAINSSLAGVYRTAKDRGAKKVYGMRYGIQGLLEEKYIDLSEHITN ELDAEILKRTPAAYLGSCRYKLPEIHEDMTLYEKIFEILNRLEIEAFIYIGGNDSMDTIK KLSDYAIITGQPTRFIGCPKTIDNDLALTDHTPGFGSAAKYIGTSVKEIICDSFCLEYHK GLVTIVEVMGRNAGWLTGAAALAKGEDCAGPDLIYLPELTFNVEKFMDKIKELLKKKSSV VVAVSEGIKLEDGRYVCEVGANVDFVDAFGHKQLTGTASYLASRVAGEIGCKTRAIELST LQRAASHCASRVDILEAYQVGGAAVKAADEGDSGKMVVLERRSDDPYQCGTEVKDVHKIA NDEKVVPREWVNKEGTYVTDEFVTYVRPLIQGDVSPVMVDGIPRHLNRPR >gi|330401367|gb|ADLB01000027.1| GENE 3 2671 - 3183 305 170 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228000081|ref|ZP_04047083.1| acetyltransferase, ribosomal protein N-acetylase [Brachyspira murdochii DSM 12563] # 1 169 1 166 166 122 37 7e-27 MQNVTFREPVIEDAKEIVDFYNYVGGETSFLSFEKDEYPLDVEGQKESIQATNESPINTM LLAIVDGKIVGIGTISSSHKIKSRHSGELGIVVAQEYQGKGIGTSIIQQLIDWAKGNEVT TRIQLDTRTDNELAVKLYQSFGFEIEGCLKNSTLLDGKYYDLYVMGMMIK >gi|330401367|gb|ADLB01000027.1| GENE 4 3210 - 3308 73 32 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLCSFISILSCANKFDVFLCKKTSCFYGEFAV >gi|330401367|gb|ADLB01000027.1| GENE 5 3418 - 4653 1036 411 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|166032834|ref|ZP_02235663.1| ## NR: gi|166032834|ref|ZP_02235663.1| hypothetical protein DORFOR_02550 [Dorea formicigenerans ATCC 27755] # 1 356 6 418 508 120 27.0 1e-25 MSEKEGYEVIKFVERGNDCYIVADYVEGVTLYHWIQLHKELEKVILNKWIRELIRQLSLF RRQKDNPKYTYLNPHCIVITEEKEIMLFYPGENVPRLKGKFAKQFQSDSPTQDVDLYCLG RTIQFIMAHEECVPHLKKREEIRLLTFIKKCLESNPKKQLKNIFSLEKKSIKKKIKWIPI 
VLGSAIFVIGITVSTEKEKKPCKNETDYFELGMYYFLEQKDYTQSKLYFQKAKKQEKSAD AYVKLSEFMLNQSDDKDMEKVLRKIEKEADKEKKIEKKLMLARGCVLLEMDWAYEMIVEM YLNNPQEVSEERLKEWNEYKALAYEKLSLWKEAGTQYRILNEQEKKREEKEKMYREKYME MDINYLKELWKEESFTMEEKQSILQNMVKENPQMKEEESFIRFLQDNHIQI >gi|330401367|gb|ADLB01000027.1| GENE 6 4665 - 6596 1554 643 aa, chain + ## HITS:1 COG:no KEGG:CPF_1123 NR:ns ## KEGG: CPF_1123 # Name: not_defined # Def: cell wall-associated serine proteinase (EC:3.4.21.96) # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 189 612 1100 1485 1570 69 24.0 4e-10 MNRRKLGVIIFILLILYGLSAMAVAEEGEEELHIEYEEPNGKNQYYTKKPTVTVYHGGKD VITKIQLKQEEEILLEKTIDNGEKEFRIEGEEFKEGRHILCVWQENEKGEVIEETKIERE FHIDTVSPEKTVFQYEGSSEGEILYFQKETVVKLQAKDNGAGVENIFYQIGKGETKKESA DEIQIELPKEFKGEITAWVEDKAGNRGDTAVSKYIVCDAKLPQIEIHTDLEEGKWYSQNV NAEIIVEENGVSSGVASIRCYFNGKLVKEIEKLPEFCQSERVKLQIKELGELLVEVRDYA GNISWKKQKMLIDKEKPEITVSGSYPNMITSRTVHLEISAKDRKQLQEASFIVKRKNDAG ESVFSEKKETKFVKNSWITDLEIEEDGIYTIELKAVDMAGNVERKEEKIIVDKTNPIIRY VEDLDGKWIKSFEWRYPLSEFIFDLTKFRYEIRLDGKMQTLFLTERKEGKHIFYVKATDV AGNEAVARAEFVIDHTKPEIFVEGAADNESYEEKTEIKLGVKDSTERLEEVWINGEKQKT DADSKLFRFTADEIRDYNIVARAVDLAGNQSVKNINFSIMEKKNILEKVFAPTETEKPKN QKEQKEHKKQEIYLIAGIVILAMMGIMFGYKMKKRPHKREDAE >gi|330401367|gb|ADLB01000027.1| GENE 7 6781 - 8001 1070 406 aa, chain - ## HITS:1 COG:XF0274 KEGG:ns NR:ns ## COG: XF0274 COG0205 # Protein_GI_number: 15836879 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Xylella fastidiosa 9a5c # 2 400 14 409 427 238 38.0 1e-62 MKNVIVGQSGGPTAVINGSLYGVVSESLKKSDDIEIVYGMINGIEGFLHDEVMEMNSLVS TGELELLKTTPGAYLGSCRFKLPEDLSDPVYPALFEKFNSKNIGYFFYIGGNDSMDTVSK LSRYAESVQSDIRIIGIPKTIDNDLVETDHTPGFASAAKYVATTVREIAIDASVYDNKKS VTIVEIMGRHAGWLTAASALARKFEGDNPVLIYLPEVAFNQEDFIEKVKHSLETTTNLIV CVSEGIHDGNGTFICEYASDVGVDTFGHKMLTGSGKYLENLVKEKVGVKVRSVELNVSQR CSSSCLSKTDLDEAVAGGAFGVNSALAGETGKMISFKRVSSNPYVMECTTADVTAICNKE KGVPIDWITENGSNISDEFINYASPLIQGEVEIPMQNGLPHFAFRK >gi|330401367|gb|ADLB01000027.1| GENE 8 8162 - 9178 788 338 aa, chain + 
## HITS:1 COG:CAC1488 KEGG:ns NR:ns ## COG: CAC1488 COG0463 # Protein_GI_number: 15894767 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 338 1 338 338 384 57.0 1e-106 MKYLTFAIPCYNSAEYMSHAVDSILVGGEDVEILIVNDGSQDETLQIAKKYEEKYPDIIR VIDKENGGHGDAVNAGLRNATGKYFKVVDSDDWVDEQSLKKILDTLHSWEEEKQEVDMLI SNYVYEKVGVKNKRVIHYRNVLPENKIFGWSDVGHFGIAQYILMHSVIYRTDLLKLNQLE LPKHTFYVDNIYVYYPLPHVKKIYYLDVNFYRYYIGREDQSVNEQVMIGRIDQQIFVTKT MIDMYYLRRIGNKKLRNYMINYLAIMMTVSSILCIRSKKEENLEKKKELWNYLKTKDYKV YWKIRYSILGQTINLPGKSGRKISSFVYIVARQLIGFN >gi|330401367|gb|ADLB01000027.1| GENE 9 9241 - 10965 1583 574 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 51 571 11 518 1118 335 37.0 2e-91 MNGQKLENMLNLALDATEEERAKSLELDVGYNPVERIWDLIVKYSGNLDRVRELGAEVVE LQNEFAIVTIAETLIESLALLPEIEYIEKPKRLFFETANGRRVSCISSVQNARPFLFGKG VLVGVIDSGIAYANEDFRKEDGSTRIRVLWDQSISGNPPEHYSIGTEYTEEQINDALRQE SRSEQLKIVPSIDTSGHGTAVAGIAAGNGRGSVGNVQKGVASESELVIVKLASPRKDGFP RTTELMQAIDYVVRKAREFEMPIAINISFGNTYGSHDGTSLLERFIDDISNYWKSVICIG SGNEGASAGHTSGQVQRGVDTVVQLAIQESEQTVNLQLWKQYSDVMEISLISPSGVRIGP FQEILGSQRFSIGETELLVYYGEPSPYSTNQEIYIDFLPKDTYITAGIWEIVLTPRQIVT GEFEMWLPSENVLNKGTAFLYPSERKTLTIPSTADRVISVGAYDALTFTYADFSGRGGDG KPDLVAPGVRVVAPTREGTYEAVSGTSFATPFVTGGAALLMEWGIVQGNDLYLYGEKVKA YLQRGARPLPAFTEYPNRQVGYGALCVRDSLPFF >gi|330401367|gb|ADLB01000027.1| GENE 10 11117 - 13402 1959 761 aa, chain + ## HITS:1 COG:CAC1209 KEGG:ns NR:ns ## COG: CAC1209 COG1328 # Protein_GI_number: 15894492 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Clostridium acetobutylicum # 1 740 1 663 699 466 39.0 1e-131 MVKSIKKRDGRIVLYDESKIASAVLKAMEAAGEGDATDAAQVANDVHMELEKLPEDMPPQ 
IEQVQDAVERELMKNGFEQSAKKFILYRAKRTNIREANTSLMRTIDEITNVDARLSDMKR DNANIDGNTSMGSMLQIGAAGAKSYNAAYLLTEEQAKAYLNGDIHIHDFDFYSLTTTCTQ IDIVRLFKGGFSTGHGVIREPQSIQSYAALAAIAIQSNQNDQHGGQSIPNFDYGLAKGVA KTFAKLYTVKLTESLEDALESDNESEIVKEILQQATEEFGHRPCLEKDEEAFDAFIAEEL IGRCGLDTHKAKQVVATTRRRAVRQTENETYQAMEGFVHNLNTMHSRAGAQTPFSSINYG TDTSPEGRMVIRNVLLATDAGLGNGETCIFPIHIFKVKEGVNYNEGDPNYDLFRLSCKVS AKRLFPNYVFLDSPFNLQYYKPGHPETEVATMGCRTRVIGNYCHKDKEVSYSRGNLSFTS LNLPRIAIESNHDEKVFYEKLDKMMELVAQQLYDRFEIQAKKHVYNYPFLMGQGVWLDSD KLGPNDEVREVLKHGTLSIGFIGLAETLTALYGHHHGESEEMQKKGLEIIGHMREFTDRE SERRNLNYTLLATPAEGLSGRFVKMDKKRYGVIKNVTDREYYTNSFHVPVWYDISAYDKI YKEAPYHALTNAGHISYVELDGDTAKNVEAFESVIRCMHDAGIGYGSVNHPVDRDPACGY VGVIDDVCPRCGRREGEAISEEKLEELRKLYPNMPQFYGCE >gi|330401367|gb|ADLB01000027.1| GENE 11 13416 - 13571 287 51 aa, chain + ## HITS:1 COG:no KEGG:Clos_1144 NR:ns ## KEGG: Clos_1144 # Name: not_defined # Def: anaerobic ribonucleoside triphosphate reductase (EC:1.17.4.2) # Organism: A.oremlandii # Pathway: Purine metabolism [PATH:aoe00230]; Pyrimidine metabolism [PATH:aoe00240]; Metabolic pathways [PATH:aoe01100] # 6 44 731 769 773 63 76.0 1e-09 MEQRNRLGQDVGFERIRRITGYLVGTMDKWNDAKKAEEKDRVKHGLGKTNA >gi|330401367|gb|ADLB01000027.1| GENE 12 13564 - 14079 558 171 aa, chain + ## HITS:1 COG:CAC0481 KEGG:ns NR:ns ## COG: CAC0481 COG0602 # Protein_GI_number: 15893772 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Clostridium acetobutylicum # 2 147 3 150 153 111 41.0 6e-25 MLRLAGIVNDSIVDGPGIRITIFGQGCPHHCHGCQNPETWSDTDGTLVDEREVVQMIETN PLAKGVTFSGGEPFAQAESFNVLAELLKDKGYEVASYSGYTFEELLNGTEEQRGLLKKID VLIDGRYEEDKKSLNIIYRGSVNQRIIDVPESLKQGTAVEITGGRWAGEYL >gi|330401367|gb|ADLB01000027.1| GENE 13 14121 - 14654 482 177 aa, chain - ## HITS:1 COG:AGl2275 KEGG:ns NR:ns ## COG: AGl2275 COG3760 # Protein_GI_number: 15891247 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) 
# 10 177 2 167 169 85 31.0 3e-17 MDEIFTTAPTDNHRLPKEMRVYQLLEKLQIPFERMDHEVADTMEACEKIDEKLQVTMCKN LFLCNRQKTTFFLLLMPGDKKFLTKNLSKQLGISRLSFAEAEYMEKFLDITPGSVSVLGL MNDVEKQVTLIIDKDLLEGEYIGCHPCVNTSSLKLKTSDILEKFLTYTGHTPTLVTL >gi|330401367|gb|ADLB01000027.1| GENE 14 14780 - 16519 209 579 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 337 558 134 354 398 85 28 9e-16 MRKLLKYLRHYKKESIIGPLFKLLEACFELTVPIIMARMIDVGIKNQNMSFVWKMGGILV AFGVLGLACSLTAQYFAAKAAVGFGTELREDFFRHIGRLSYRELDMVGNATLVTRITSDI NQVQSGVNLVLRLFLRSPFIVVGAMAMAFTISVKLALIFVVAVPLLSVVIYGIMIWTIPL YRKVQKRLEKVMLMTRENLSGVRVVRAFRMQEREKEEFRQETKNLMHMQIFVGKISALLN PITYLIVNGATILIVWFGGQEVYYGNLSQGEIVALVNYMSQILLALVALANLIITFTKAL ASAERINEVLALETSIHGGDGVAEETKNGEMKIELDKVSMMYYGDKEEVLSDITLSVKRG ETVGIIGGTGSGKTTLINLLPRFYDVFSGSVKIDGRDVKEYDLTALREKFGIVPQKAVLF HGTIRDNMLFGNENASDEEIWEALRIAQAGNIVKDKEKELDTIVQEGGKNLSGGQKQRLT IARALVRKPEILILDDSTSALDFATDAKLRKAIAKHTDSMTVCIVSQRVMSMMKADKIFV LDKGKLVGSGTHGELLKNCEVYREICSSQLSKEEVSDHA >gi|330401367|gb|ADLB01000027.1| GENE 15 16512 - 18248 187 578 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 341 565 12 233 312 76 27 3e-13 MRKQYDTIKKVIYLLKPYLPILILSLGFAVATVLCTLYAPILIGQGVDLILGKGKVDFQG ITEILMKLAAVTILTSITQWLMNLCNNRITYRVVNDVRTDAFEHLQKLPLKYIDSHQYGE IISRVITDVEQFSDGLLMGFSQLFTGIVTIVGTLIFMVSIHFQISLIVILITPLSFFVAG FVAKKTYMMFRKQSEVRGELTSLTDEIVGNQKIVQAFGYQDRAMKRFKKVNEELKDCSVK AIFFSSITNPTTRFINGLVYAGVGITGAIFAIHGRISVGQLSSFLSYANQYTKPFNEISG VVTELQNALACARRVFDFMEEQEEKDDAKEKEILEKADGNVELKEVSFSYQPDKELLKNL NLNVKAGQKIAIVGPTGCGKTTLINLLMRFYDIDSGKMVVSGRDVKDITKDSLRANYGMV LQETWLKSCSIADNIAYGKPDATREEIERAAKKAFAHGFIQRMENGYDTIITEEGGNLSQ GQKQLICIARVMLQLPSMLILDEATSSIDTRTEVKVQDAFRQMMEGRTSFIVAHRLSTIQ EADCILVMRDGNIIEQGKHEELLQKEGFYAELYKSQFH >gi|330401367|gb|ADLB01000027.1| GENE 16 18245 - 19057 632 270 aa, chain - ## HITS:1 COG:BS_bltR_1 KEGG:ns NR:ns ## COG: BS_bltR_1 COG0789 # 
Protein_GI_number: 16079711 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Bacillus subtilis # 5 120 7 122 124 101 44.0 2e-21 MNSNRYLTTGEFAKLTGVTKHTLFYYDKIQLFSPDIKAENGYRFYSLEQLEAFDVIQILR ELNMPLEEIKTYMNSRSPHHLLELFQKESQLIDKQIKKLTHMKNWIEKKRCHIDTLLQTN IEKVSVVHESEKYMIQSESTLDDERVWAEKIGELFNYCEENGVKSLDPIGYRQNTTDIQN GVFHRYRIFYELLDEKPKKIPYQIKPEGDYLVAYHRGHWKTMGNTYKKMLSFAKDNSLSL GDYFYEDCIFDSLTFQKEEDYITRLTCQIL >gi|330401367|gb|ADLB01000027.1| GENE 17 19145 - 20473 1219 442 aa, chain + ## HITS:1 COG:BH4045 KEGG:ns NR:ns ## COG: BH4045 COG0534 # Protein_GI_number: 15616607 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Bacillus halodurans # 1 441 3 446 447 338 48.0 1e-92 MSNSISKEFKFFSLLRFAFPTMVMMVFMSLYTIVDGIFISRLVGTDALSATNIVYPAISL LIAIGVMLATGGSAIIAKKLGEKKEKEACEDFSFLVLAAVVIGIIFMVFGNLFINPIVRL LGATDVLMDYCVGYLSVCLFLAPACILQLVFQTFFVTAGKPVLGLALTISGGIANMILDY VFMGVFHMGVEGAALATGIGQLIPALAGIIYFLFVRHSLYLVKPKIRWSVLAESSFNGSS EMVTNLSGAIVTFLYNIMMLKLVGEAGVAAITIVLYGQFLFNALYMGFSMGVAPVISYNY GSGNQRLLKRIFKICVGFIAISSIIVTVFALVMSPVIVEIFTPAGTETYALAKTGFFLFS LNFVFAGINIFSSSMFTAFSDGKTSAIISFVRTFVLIVINILVLPTLIGVNGVWLSVPLA EFMSVFLSGYFFIRKKGKYNYM >gi|330401367|gb|ADLB01000027.1| GENE 18 20554 - 21327 325 257 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|119503196|ref|ZP_01625280.1| Ribosomal protein S16 [marine gamma proteobacterium HTCC2080] # 22 240 12 230 305 129 30 3e-29 MIEIKNLTKVFKLNKKQMKESKTKSNTKVAVDNLSLTANNGEIYGLLGPNGAGKTTTLRC IATIIRATDGEIYVAGHEVQKEAEAVRNSIGFLTSDIKLDPQFSPDYMFDFFGRLHNVPK DVLRERKEELFTYFGIKDFAHKQIKELSTGMKQKAAIAVSLVHDPEIVIFDEPTNGLDIV TARSVTDYLKKLRDEGKLVIVSTHIMSEAEKLCDRIGVIIDGKKVAEGTLAELLERTHTD DLEDAFFELYKARKGEA >gi|330401367|gb|ADLB01000027.1| GENE 19 21328 - 22551 1287 407 aa, chain + ## HITS:1 COG:CAC3550 KEGG:ns NR:ns ## COG: CAC3550 COG1668 # Protein_GI_number: 15896786 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: ABC-type Na+ efflux pump, permease 
component # Organism: Clostridium acetobutylicum # 7 390 8 370 392 105 25.0 2e-22 MSGIKEIFKKEIVRVFKDKKMVFSVFILPVAAMLGVLYLMGNIMSNMQDDIEAHKSKVYI QNEPAGFQAFCESAKLDMQVENSGNITEKQEKKIKQELKDGKADLFIVFPEGFEEDIKNY KNGDKVPQVTVYHNPSEEYSQAAYNKIGVQTLEAYRQTLLNERVGDMSRISIFTVNADTK ETIIQDDNKAGAKALGTMLPYLLTLLLFAGAMSIGTDMIAGEKERGTMASLLVTPIKRSS IIFGKVFALMVISGISAVVSVVGMVVSMPIIQDQMMGGATQGMDMKWSTQQIIMLAVLII ALAFLYATIIALVSVFAKTIKEANSFVMPVYMIVLVVGLLTMYTTKDPTMSSYFIPFYNS AITLQGILTQEVTMVQYGITLAITLGAGLVLTGVIAKAFESEKVMSA >gi|330401367|gb|ADLB01000027.1| GENE 20 22609 - 22869 278 86 aa, chain + ## HITS:1 COG:BS_ylmC KEGG:ns NR:ns ## COG: BS_ylmC COG1873 # Protein_GI_number: 16078600 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 77 2 78 81 73 44.0 1e-13 MRLCELREKEVINSCDCKRLGCVVDLEINLCTGNVEAIIIPGPGRICGFLGNDSEYVIPF ECIKKIGEDIILVEIKEEKFLKNCKY >gi|330401367|gb|ADLB01000027.1| GENE 21 22925 - 23377 605 150 aa, chain + ## HITS:1 COG:lin1597 KEGG:ns NR:ns ## COG: lin1597 COG1327 # Protein_GI_number: 16800665 # Func_class: K Transcription # Function: Predicted transcriptional regulator, consists of a Zn-ribbon and ATP-cone domains # Organism: Listeria innocua # 1 150 1 150 154 160 57.0 8e-40 MKCPYCNHADTRVIDSRPAEDGTSIRRRRSCDECGKRFTTYEKIETIPLIIIKKDNNREQ YDRSKIEAGVLRACYKRPVSAEEIQKTIDAVETEIFKREEKEISSTVIGEIVMEKLKELD AVAYVRFASVYREFKDVNTFMAELKKILDK >gi|330401367|gb|ADLB01000027.1| GENE 22 23411 - 24604 899 397 aa, chain - ## HITS:1 COG:BH2510 KEGG:ns NR:ns ## COG: BH2510 COG0452 # Protein_GI_number: 15615073 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Bacillus halodurans # 1 396 1 397 404 364 49.0 1e-100 MLKGKTVVLGVTGGIAAYKIANLASMLVKLHCDVQVIMTENATNFIHPIAFETLTKNKCL IDTFDRNFQHKVEHVSIAKRADVVLIAPATANVIGKMANGIADDMLTTTVLACRCKKIVS PAMNTNMYCNPIVQDNLKKLQHYGFEIIEADTGFLACEDIGQGKMPSPEVLLQYILKEIA FEKDMKGKRVLITAGPTREAIDPVRYITNHSTGKMGYALARNCMLRGAEVTLVTGPTSID 
PPPFVHVIPVVSAKEMFDTVCSHAKTQDIIIKSAAVADYRPITVSEEKMKKSDSELSIPL ERTDDILQYLGQNKKENQFICGFSMETENMLENSRRKLEKKNADMIVANNLKQSGAGFGT DTNIVTLITKDSERSLDVMSKDDVANEIVNTILEKLS >gi|330401367|gb|ADLB01000027.1| GENE 23 24779 - 25438 797 219 aa, chain + ## HITS:1 COG:L166479 KEGG:ns NR:ns ## COG: L166479 COG4684 # Protein_GI_number: 15672553 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Lactococcus lactis # 20 215 9 193 196 94 38.0 2e-19 MNNTTNGSKPNTQVKGMVQVALFAALIIIMAFTPFLGYIPLGFTRATIIHIPVIIGALIL GPRKGAILGFVFGLTSLINNTMNPTATSFVFSPFYSLGEIHGGIGSIIICFLPRILVGII PYYVYRFMTKITEKWKSSMIFSLGVCGVIGSLTNTLLVMNLIFVFFKDAYAMANGVASNA VYGFILSVIGINGVPEAIVAGVLVAFVGRVLLKSRSVIG >gi|330401367|gb|ADLB01000027.1| GENE 24 25449 - 26213 748 254 aa, chain + ## HITS:1 COG:BH0086 KEGG:ns NR:ns ## COG: BH0086 COG1521 # Protein_GI_number: 15612649 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Bacillus halodurans # 1 253 1 253 254 229 48.0 3e-60 MILAIDIGNTNIVIGCCDREKIYFVERLSTNNTKTELEYAISFKNVLELYHINPKEIEGG IISSVVPPITNIVRASAEKVLGKSVKIVGPGVKTGLNILMDNPAQVGSDRIVNAVAAINE YTAPLIIIDMGTATTFCVVDEKKNYIGGMILPGVRISLDALTAGTSQLSRISIEAPKKII GKNTIDCMKSGIINGNAACIDGMISRINRELGQEATVIATGGLAKKIVPHCEQKICIDDE LLLKGLRLIYEKNK >gi|330401367|gb|ADLB01000027.1| GENE 25 26223 - 26978 609 251 aa, chain + ## HITS:1 COG:BH0167 KEGG:ns NR:ns ## COG: BH0167 COG0101 # Protein_GI_number: 15612730 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Bacillus halodurans # 1 237 1 238 263 166 41.0 3e-41 MRNYKMIIAYDGTKYQGWQRQSGTPMTIQGIIEKTIGEVVGYPVEIDGSGRTDGGVHAGG QVANVRLSGKVEEAVFQEKLNEKLPEDIRICHVELVLNRFHSRLDAVAKTYEYVVDMREK ADVFTRKYCYHYTKPLEPEKMRKAAALLIGKHDFSAFTDKKDEKSAVRELYDIQIETYRQ KLYFRFYGNGFMYHQVRIFVGTLLEAGAGERSVESVLEALKSKKRVNAGFLAPAHALTLK EVEYKVKESLK >gi|330401367|gb|ADLB01000027.1| GENE 26 26975 - 27727 696 250 aa, chain + ## HITS:1 COG:FN0047 KEGG:ns NR:ns ## COG: FN0047 COG0708 
# Protein_GI_number: 19703399 # Func_class: L Replication, recombination and repair # Function: Exonuclease III # Organism: Fusobacterium nucleatum # 1 250 1 250 253 400 75.0 1e-111 MKLISWNVNGIRACVQKGFLDFFHQADADIFCIQESKMQEGQLQLELEGYHQYWNYAEKK GYSGTAVFTKEEPLSVTYGMGIEEHDKEGRLITLEFPEFYFATVYTPNSQNELARLDYRM QWETDFLAYMKKLEEKKPVIFCGDLNVAHREIDLKNPKTNRKNAGFTDEERGKFGELLDA GFIDTFRYFYPDMEGIYSWWSYRFRAREKNAGWRIDYFCVSESLRERLADAKILTYIFGS DHCPVELDLK >gi|330401367|gb|ADLB01000027.1| GENE 27 27737 - 28267 647 176 aa, chain + ## HITS:1 COG:PA3846 KEGG:ns NR:ns ## COG: PA3846 COG1335 # Protein_GI_number: 15599041 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Amidases related to nicotinamidase # Organism: Pseudomonas aeruginosa # 1 171 1 174 180 125 39.0 3e-29 MKIKAKDTIAIVVDYQERLMPVMYEKEALLKNTAILLNGLKELGVPIAVTQQYTKGLGPT VKEIENVLGEYEPLEKIAFSSFDAVKEAIKGKKFVIVCGIEAHICVLQTLLDLKENGYIP VLVEDCLSSRTLHNKQIALERAKQEGVILTTYESILFELLEKAGTETFKKIQALIK >gi|330401367|gb|ADLB01000027.1| GENE 28 28304 - 29035 230 243 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 1 241 1 238 242 93 30 3e-18 MDKRNVLVTGASRGIGKAIALLFAEKGFRVFINCTRSVSDLEKVKEEIDAMHMGECHLAV GDVGNPDDVADMFAHIRRHCNGIDILINNAGISYIGLLTDMTNAEWNTLLQTNLSSAFYC AKEVIPHMVSQKSGKIIQISSIWGSVGASCEVAYSAAKSGLHGFTKALAKELAPSNIQVN AIACGAIDTVMNKQLDATERLALEEEIPAGRFGLPGEVARLTFDIATGHDYMTGQVIGMD GGW >gi|330401367|gb|ADLB01000027.1| GENE 29 29170 - 29703 528 177 aa, chain + ## HITS:1 COG:lin0808 KEGG:ns NR:ns ## COG: lin0808 COG0317 # Protein_GI_number: 16799882 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Listeria innocua # 1 176 1 178 180 110 36.0 1e-24 MVEKAIRFATKAHEGQFRKGTDRPYIVHPLEVGKILSEMTADEQVIAAGILHDTIEDCKE VTEEILRQEFGERITTMVLHESEDKTKSWKERKSATIEKLKTAPLEVQYIGLADKLSNIR DIDRDYPIYKEELWNRFRMKDKQIIGWYYKEIQKSLQSLKGIPQYEEYCRLVERNFG 
>gi|330401367|gb|ADLB01000027.1| GENE 30 29781 - 30791 1008 336 aa, chain + ## HITS:1 COG:BS_ansA KEGG:ns NR:ns ## COG: BS_ansA COG0252 # Protein_GI_number: 16079415 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Bacillus subtilis # 3 309 2 306 329 263 45.0 4e-70 MRKRILMIGTGGTIASKQTENGLKPGLSSEDILSYIPQVEQVCEVRTMQVCNIDSTNVTP EHWTILSKTIEEQYDKYDGFVICHGTDTLAYTASALSYMIQHSRKPIVITGAQKPINMDV TDAKTNLLDSFIYAADDESQDVNIVFDGKVIVGTRAKKVRAKSYNAFESINFPYVAVIQD GNIIRYIPTIPHKSRVRFYHEMKNSIYVLKLIPGMKADILSYIFKDYDCIVIESFGVGGM PESIVHEFYEQMSAWKEKEKLVVMTTQVANEGSNMTVYEVGQKVKQDFDLIEGYDMTLES IITKLMWIMTLPGLHFDEIKKTFYQTMNHDILFKEN >gi|330401367|gb|ADLB01000027.1| GENE 31 30804 - 31664 912 286 aa, chain + ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 4 277 2 274 279 166 36.0 5e-41 MDKKKFIREYGIITLGIFIISMAVYFFMIPSNVIVGSLSGLVIVLANFIPLPISVMTFIL NAALLIIGFIFIGKEFGAKTVYTSVMLPVFLFVFEKIVPNNKSLTGDILIDTICYVLVVS VGLAMLFNANASSGGLDIVAKIVNKYLHVEIGKAMTITGMCVAVSSILVYGTKEVILSIL ATYVNGIVIDNFIGGFNRRKRVCILSDHYEVVQKFITEELNRGVTLYTAMGGYNKTERIE VVTILTQNEYGKLLEYLHTVDMRAFVTVSTVNEVIGEWNIKKKARG >gi|330401367|gb|ADLB01000027.1| GENE 32 31714 - 32364 602 216 aa, chain + ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 4 216 2 213 216 203 52.0 2e-52 MKNYQYILFDLDGTLTNSQLGITTCVAYALESFGIHTENPEELRKFIGPPLKESFVKYYN MTDGEGDRAVEKYRERFATVGLYENEVYAGIPELLQKLKAQGKTLLVATSKPTVYSDKIL EHFGLKEYFSYIAGSELDGTRVNKAEVIQYALEQMKITESEKIVMIGDKEHDMIGAGICG VDSIGVLYGFGEREELENHGATYIAETVSDLEKILL >gi|330401367|gb|ADLB01000027.1| GENE 33 32378 - 32743 426 121 aa, chain + ## HITS:1 COG:MA3263 
KEGG:ns NR:ns ## COG: MA3263 COG1380 # Protein_GI_number: 20092079 # Func_class: R General function prediction only # Function: Putative effector of murein hydrolase LrgA # Organism: Methanosarcina acetivorans str.C2A # 3 120 1 118 165 76 36.0 1e-14 MKLLYQFGVILTVTFIGEVLYSIIPLPIPASIYGLLLMLFCLCTKIVKLSQVKIAGDFLI DIMPPMFIPAAVGIIAAWADLKEILVPVVVITFVTTVIVMVCTGKVTQAVIRLKQREEME S >gi|330401367|gb|ADLB01000027.1| GENE 34 32740 - 33432 736 230 aa, chain + ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 11 220 15 224 238 172 47.0 3e-43 MKELLTESVYFGVTISIVSYGIGLFLKKKLKWGILNPLLVSILFVVGFLILFDIDYDMYN QTAKYLSYLLTPATVALAIPLYQKITLLKKNGLAVFLGILSGVLSSLLSVLAMAWLFGLS HREYVTLLPKSITTAIGMGVSDELGGITTITVAVIIVTGVLGNVIGQSVCKLFKIYEPIA VGLALGTSAHAIGTAKALELGEVEGAMSSLSIVVSGLITVVGASVFAMFL >gi|330401367|gb|ADLB01000027.1| GENE 35 33825 - 36473 3037 882 aa, chain + ## HITS:1 COG:CAC2399 KEGG:ns NR:ns ## COG: CAC2399 COG0525 # Protein_GI_number: 15895665 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 4 881 6 881 881 1096 59.0 0 MSKKLEKTYNPKEIETKLYEKWCENKYFHAEVDRSRKPFTIVMPPPNITGKLHMGHALDN TLQDILIRYKRMQGYNALWIPGTDHAAISTEVKVTNQLKEEGIDKKELGREGFLKRTWEW KEEYGGTITSQLKKLGTSCDWDRERFTMDEGCSKAVEEVFIKLYEEGYIYKGSRIINWCP VCKTSLSDAEVEHEEQAGHFWHIKYPIAGTDRFLEIATTRPETMLGDTAIAVHPDDERYK DIVGKNVILPLVNREIPIVADYYVDKEFGTGAVKITPAHDPNDFEVGKRHNLPEINIMND DATINEHGGKYAGMERYEARKAIVADLEAEGYLVKIEEHTHNVGTHDRCHTTVEPLIKQQ WFVKMEELAKPAINALKTGELKFVPERFNKTYLHWLENIRDWCISRQIWWGHRIPAYYCD ECGEFVVAREMPEKCEHCGCTHFTQDEDTLDTWFSSALWPFSTLGWPDSTEELDYFYPTD VLVTGYDIIFFWVIRMVFSGFAHTGKSPFHTVFIHGLVRDSQGRKMSKSLGNGIDPLEII EQYGADALRMTLVTGNAPGNDMRFYDERVEANRNFANKVWNASRFIMMNMEEKEITTPSE SDLTATDKWILSKVNTLAKDVTENMDKFELGIALQKVYDFIWDEFCDWYIELAKYRIYHA 
DDDAVSANAALWTLKTVLANGLKMLHPYMPFVSEEIYSALVPEEESLMMSSWPEYKEEWN YVKEENVLEHMKEVIRGVRNVRAEMNVAPSRKAKAFVVCENGELCEGFEEIKVSCASLMN ASEIVIQKTKEGIAEDAVSVVVTDAVVYLPLAELVDFEQEIERLTKEEKRLEKELARVNG MLSNEKFISKAPEAKINEEKAKLEKYTQMMEQVKERLAGLKK >gi|330401367|gb|ADLB01000027.1| GENE 36 36508 - 38685 2234 725 aa, chain + ## HITS:1 COG:AGc3053 KEGG:ns NR:ns ## COG: AGc3053 COG1368 # Protein_GI_number: 15888966 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 230 597 226 627 646 152 27.0 3e-36 MKHIQFKKIKKEDIKAYWKARKERRQAILEKRRNSAFAQKMKPVYKVMNQFSLVFHVLYA CVLNLTIEAISRHSLFKAWDYMIESPWTFLFNAYLIFITFLLAYLVKRRVFTRILLSVFW LGLGITNGYMLLIRVTPFNAQDLKVAGDAVTLFDKYFTGFEGIMLAIGIVAVVIWLISMW KRGGQYTGKMHRIPALIAIVAAFGTVGLLTNLAIDKRVVSNYFGNIAFAYEDYGFPYCFS ASVFNTGIEEPNGYSEETMNKITDNGAITESKTGRKEMPNILFIQLESFFDPYEVEFFKT SQDPIPNFRKLSEQYSSGYFKVPSVGAGTANTEFEVLTGMSMRYFGPGEYPYKTILKKTQ AESAATALKKFGYGAHALHNNGGNFYSRADVFNNIGFDSYTSKEFMNILQLTENGWAKDS VLTQHIKNAMDSTEQQDFVFGITVQGHGDYPEEKKIENPRIRVTGIEDEGRTNAWEYYVN QLYETDQFIGDLIQMLKDRDEPTVLVLYGDHLPTMGLEAKDLKGRYLYNTNYVIWDNIGL PKQDENIAAYQAMADVFDRLDIHSGTVFNYHQNRRKTKNYLADLELLQYDMLYGEQYVYG GKDKNPVKTGYMQMGILDVTLTGIVQQVEDTYSLYGENFTKNSKVYINDDKQDATFLNNT RIDLKETKLKNGDKIKVCQVGSSERIFRESAEYVYYDGKLLTPEEWKAKETQEAEAQKTE AQANE >gi|330401367|gb|ADLB01000027.1| GENE 37 38678 - 39976 865 432 aa, chain + ## HITS:1 COG:BS_folC KEGG:ns NR:ns ## COG: BS_folC COG0285 # Protein_GI_number: 16079860 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Bacillus subtilis # 6 422 8 427 430 212 34.0 1e-54 MNKVQAEAYIADLPKFTTKNSLEHTREFLVRLGNPEKNKKIIHVAGTNGKGSVCAYLQAM LLADGKKTGMFISPHLEKINERIVIQGEEISDEKFLKAFHFVMDTIKEMKKDGIAHPSYF EFLFGMGMYAFSEADVEYIILETGLGGRLDATNSIQHPIMTVITSITLDHTDILGDTIEK IAAEKAGIIKEGIPVVCMGDREYTHVIEKRAEKLHSPCIKISKNAYEIIESTDKYIDFFS VNAYDKYVTWKLHNSGVYQVENAMLAIKALEYLFPEKEDMSLWRRALANVVWQGRMEEIM 
PGIVVDGAHNIGAVEAFVKSVKQMKDTKNVILFSSVSDKDYEKMIQYLCLHMKAECYIIT EIPDTRGEEPEQLAEVFRKYVKSEVIVEPDLLKALSCAVEKKGEEGKVYCLGSLYLAGKL KELQGGRKDVKF >gi|330401367|gb|ADLB01000027.1| GENE 38 39963 - 40100 245 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLNFEEEIKKFKPSLEVEQIEDAVYQEDLTDMTDILREMMQQQTK >gi|330401367|gb|ADLB01000027.1| GENE 39 40123 - 41472 1602 449 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2575 NR:ns ## KEGG: EUBREC_2575 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 5 434 27 451 456 248 36.0 5e-64 MEYTKKLIYQSNYWYNDGLAKAKIHDLSGATVSLRKSLQYNRENVAARNLLGLVCYAKGE IAEALVEWIISKNFKNHENIANYYIKKVQESAGELDTINNAIKKYNQCLSYCLQDGEDLA LIQLKKVVAAHPTFLKALQLLALLYLQTEQYAKARQVLKRAHRIDTTNEVTLRYLHELSK ISSKKAVKKEQEEKDQTVTYNLGNETIIQPAAAGVKDNGAMSTIINIFIGIVVGAAVVWF LIVPTTNQIKTNKLNKEIVKYSDQISGKNAEISALKKEIEGYKDTTKKTEEAQATAEATK TSYEALLSVYEHSKTSGYNKESLAKELKAVKKDSLGDTGVAVYDKIYDEYVSPLCAKKYR LAIRNYEVKNYTTAVTALEFVTSIDEKYEDGKALLNLAKAYEKAGKTEQAKTTYKRVAEL YPDSELASQAQQALTGTATSGESSTDTEE >gi|330401367|gb|ADLB01000027.1| GENE 40 41538 - 41855 213 105 aa, chain + ## HITS:1 COG:BS_spoIIAA KEGG:ns NR:ns ## COG: BS_spoIIAA COG1366 # Protein_GI_number: 16079404 # Func_class: T Signal transduction mechanisms # Function: Anti-anti-sigma regulatory factor (antagonist of anti-sigma factor) # Organism: Bacillus subtilis # 1 102 5 106 117 84 40.0 3e-17 MEYQIQENCLTICLPNELDHHNAEQIRKLSDQLVEKNHIKYIFFDFQNTNFMDSSGIGVI MGRYKQVCLFGGEVWAMNTNERLKKILRMSGVTKIIHIYEEELCI >gi|330401367|gb|ADLB01000027.1| GENE 41 41852 - 42307 481 151 aa, chain + ## HITS:1 COG:CAC2307 KEGG:ns NR:ns ## COG: CAC2307 COG2172 # Protein_GI_number: 15895574 # Func_class: T Signal transduction mechanisms # Function: Anti-sigma regulatory factor (Ser/Thr protein kinase) # Organism: Clostridium acetobutylicum # 5 137 4 136 143 146 60.0 2e-35 MKDTNEMELIFDSRSSNEGFARVAVASFMTQLNPTLEEVSDVKTAVSEAVTNAIIHGYEN EIHKIEILCRTEGKTICVEVRDRGKGIENVEKAMEPLFTTKPELERSGMGFAFMEAFMDK VEVSSTVGEGTTVKMEKTIGKGRSSWNTQSL 
>gi|330401367|gb|ADLB01000027.1| GENE 42 42286 - 42999 812 237 aa, chain + ## HITS:1 COG:BS_sigF KEGG:ns NR:ns ## COG: BS_sigF COG1191 # Protein_GI_number: 16079402 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit # Organism: Bacillus subtilis # 7 234 22 247 255 223 53.0 3e-58 MEHTIALIKRSHKGDEKARTQLVEENVGLVWCVVKRFYGRGTEPEDLFQIGSIGLLKAID KFDLSYDVKFSTYAVPMISGEIKRFLRDDGMIKVSRSLKELAYKAYLAKEEMKERLNREP TVEELAEKLEVEKEELALALESGGEVESLHKPIYQKDGNEIQLMDKLEEKEEQEEKILNH MLLQQLLEQLKKEERQLIYMRYFANKTQSEIGKTMGISQVQVSRLEKKILGYLREKI >gi|330401367|gb|ADLB01000027.1| GENE 43 43071 - 43226 167 51 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKQRKKVRVCLFCVVLIAVAVGLVYYFSDVRHAESTDAGVLITGVNYLWQ >gi|330401367|gb|ADLB01000027.1| GENE 44 43217 - 43882 668 221 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01606 NR:ns ## KEGG: EUBELI_01606 # Name: not_defined # Def: stage V sporulation protein AA # Organism: E.eligens # Pathway: not_defined # 3 199 18 213 218 154 38.0 2e-36 MAVTKDTVYIKGEQNVEVTKSEVTLGDILSIECANLEMIPKIKALKLLKVQNNGKHRYVV SVLKIIECIHRQYPNVDIENLGKEDIIITCEDQKTPNKFIHWTKVAVVTAITFVGAAFSI MAFNNDVETTKLFGQIYTLLMGKQSSGFTVLELMYCVGLIVGILIFFNHFGGKKFSVDPT PMEVEMRLYENDIQTTLIENYSRKEKELDVGKTTSAGTHRA >gi|330401367|gb|ADLB01000027.1| GENE 45 43842 - 44258 473 138 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_01605 NR:ns ## KEGG: EUBELI_01605 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 7 135 4 132 135 78 34.0 6e-14 MWVKQLLLALIGLSAGVTVAGGLFSFVVSLGVISDFADRTHTGNHILLYETSVALGGILG NILFIYQIPVPVGTVILMLFGIFAGIFVGCWSMALAEILNVFPIFIRRVKVLRGIPYIIL SIALGKGIGACLFFFKGW >gi|330401367|gb|ADLB01000027.1| GENE 46 44273 - 44734 617 153 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0479 NR:ns ## KEGG: Cphy_0479 # Name: not_defined # Def: stage V sporulation protein AC # Organism: C.phytofermentans # Pathway: not_defined # 2 146 18 162 167 187 64.0 1e-46 MEKEKKEKAYEQYVKEKTPVHNVWLNMAKAFVTGGGICVLGQVILNYCSSQGLSKEISGA 
WTSVVLVFLSVLLTGLNVYYKIAKWGGAGALVPITGFANSVAAPAIEYKKEGQVFGIGCK IFTIAGPVILYGVVTSWVLGVVYYLLIKSGVTL >gi|330401367|gb|ADLB01000027.1| GENE 47 44736 - 45740 1107 334 aa, chain + ## HITS:1 COG:no KEGG:Cphy_0480 NR:ns ## KEGG: Cphy_0480 # Name: not_defined # Def: stage V sporulation protein AD # Organism: C.phytofermentans # Pathway: not_defined # 4 334 13 355 355 419 59.0 1e-116 MIKGSQSIQFETAPYIISSGSIVGKKEGEGPLGKRFDMVGEDDLFGEETWELAESTMQKE ACLLALRKAGVTPKEVRYLYGGDLLRQGVATSMGTEELKIPMFGLYGACSTSGEALALAS MAVAGGYGDLMLAVTSSHFGSAEKEFRFPLGYATQRPLSAHWTVTGSGAFLVGTRKSHVR ITGVTVGKIVDYGLKDSQNMGACMAPAACDTIVRNLEDFSRRETDYDRIITGDLGYIGQS ILFDLVGKQGRDIRSNHMDCGMTIFDQKTQDTHAGGSGCGCAATTLSAYILPKLESGEWR RVLFVPTGALMSTVSYNEGESVPGIAHGIVLEHC >gi|330401367|gb|ADLB01000027.1| GENE 48 45754 - 46110 497 118 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2431 NR:ns ## KEGG: EUBREC_2431 # Name: not_defined # Def: sporulation protein # Organism: E.rectale # Pathway: not_defined # 1 117 22 138 139 152 66.0 3e-36 MDYINAFWVGGLVCAIVQILLDRTKLMPGRVMVLLVCSGTVLGFLNLYEPFQNYAGAGAS VPLLGFGNALWKGVKEAIQQYGFIGTFMGGLKAAAVGTSSALIFGYGASLIFEPKMKK >gi|330401367|gb|ADLB01000027.1| GENE 49 46198 - 47832 1824 544 aa, chain + ## HITS:1 COG:SP0496 KEGG:ns NR:ns ## COG: SP0496 COG1283 # Protein_GI_number: 15900410 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/phosphate symporter # Organism: Streptococcus pneumoniae TIGR4 # 1 541 3 538 543 318 33.0 1e-86 MSYADIIIPFAGGLGMFIYGMQIMAQGLENAAGNKMKSLLEVLTKNKFFGVLLGALITAV IQSSSATTVMVVGFVNAGIMNLGQAMGVIMGANIGTTVTGWLVSSVEWAKALSPTTLAPI AIIIGVIIMLTGKRHSSKEVASIIIGFGILFVGISTMSSAVSPLKESPAFREVFITLGRN PILGILAGTAVTAIIQSSSASVGILQSLAAAGLVPMSAAIYIIMGQNIGTCVTAMLSSVG AKKNAKTAALMHLLFNIIGTILFSVIAIVFFTVINPAMGTGMITQTQISTVHTIFNIGTT ILLFPVSDLIIKLAKKIGRVDESEQEDSAVLLDDRMLETPSIALQSTITEIARMGHIVMD SLEKAKTVLFTLSSDDIKALKEEESVVDKLSAGITTYAIKISSLQVSEKEHQEVAHLLQI VSDMERISDYCENISEFAETLREKKLKFSDMGTDGIKEMLDVCAKSYKYALEAFEEESQD KALRVIEKETQADQLELMLRTKHIKRLANNQCNAEAGIVFLDALVCLERISDHARNIAEE LMER 
>gi|330401367|gb|ADLB01000027.1| GENE 50 47836 - 48879 1063 347 aa, chain + ## HITS:1 COG:SA2103 KEGG:ns NR:ns ## COG: SA2103 COG1316 # Protein_GI_number: 15927890 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Staphylococcus aureus N315 # 63 305 39 271 315 83 27.0 5e-16 MNRKSRYRLRIGFSRIRRNRSEKVKKRKYIILGIVTVLLVAVTVVLAYRMQQEKKKEADY DEYTSGNQERNYITYKGEKYKYNYDLRTVLFMGIDEENEIEERKVGDGGQTDSLVLFAMD TEKKTTKALSISRDTMTDIKTYDMNGNPLSTERAQIALQYAYGDGGRRSCVLTREAVSNL LYQIPINAYVALTMEGFSKITDELGGVKITMAEDYTHINPAFKKGETVTLNGEMAQQYVR YRDTNVSGSNNERMERQSLFIEALVQQLQEEMQETKDVVQLYKKMEPYMVTDISEGELKK MINYKLEGEVDKVPGTVRQGEKYDEFVVDNEKLQEKVVKTFYKREQR >gi|330401367|gb|ADLB01000027.1| GENE 51 48929 - 49531 776 200 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210611074|ref|ZP_03288716.1| ## NR: gi|210611074|ref|ZP_03288716.1| hypothetical protein CLONEX_00906 [Clostridium nexile DSM 1787] # 4 200 1 198 198 164 53.0 3e-39 MQKLGKKFLGMVLAGTLSIAMGITAFATPSPEKNAVVTEYTKAVDKDGKNVEIVIEELTK EGKTAAKLLQSKETLKDIIGDNYVDGMEVVDVREVRAVGNPSFPVTITFKVPGVLASTNV AILHYENGAWTEEPSKAGKGTITATFDSLSPVAFVVDRNTSASSTESPKTSETPRVAVAG VAAIAALAGAVVFRKRTAFK >gi|330401367|gb|ADLB01000027.1| GENE 52 49666 - 50352 699 228 aa, chain + ## HITS:1 COG:Cgl0343_1 KEGG:ns NR:ns ## COG: Cgl0343_1 COG3944 # Protein_GI_number: 19551593 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Capsular polysaccharide biosynthesis protein # Organism: Corynebacterium glutamicum # 13 211 1 211 232 103 29.0 2e-22 MEERTWENEETGIDLIEILNLLLRKWWLILLSLAVGVCAAFGYTKICVTPLYQASSMIYV LGTQGGESININLSRQLTSDFITLSKSRPVIEDAINRVNLDMTYEEVAGMISVENPTDTS MLKTTVTSADPQLAKSLSNAMSDTLAERIQEVMGKDKPSTVEKAIEPKYPVTPNTTKNMI VGGLLGAVLMMGILVLLFLMDDRIKNQDDVERYLQLNVLASVPIERKR >gi|330401367|gb|ADLB01000027.1| GENE 53 50359 - 51084 806 241 aa, chain + ## HITS:1 COG:SP0347 KEGG:ns NR:ns ## COG: SP0347 COG4464 # Protein_GI_number: 15900276 # Func_class: G Carbohydrate transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Capsular 
polysaccharide biosynthesis protein # Organism: Streptococcus pneumoniae TIGR4 # 4 236 1 238 243 149 36.0 6e-36 MRGLTDIHNHILPMVDDGAKTVVEALKMIKMQENDGVKNIILTPHYRKGMFETSAEEIEW KFHKFQEYVEKKGFDVNLYLGRECYADTALMETLLKNPSFQMNGTRFVLVEFSYHYDFQK IRNSVYELVSEGYNPILAHIERYACLNKHKEYIEELIQMGAYMQINTSAVLGKMGFSQKH FCKKVLQEHLIHFIASDAHSLQKRRPNMKECKKYVEKKVGVEYANEIFVRNPQLILEENR R >gi|330401367|gb|ADLB01000027.1| GENE 54 51087 - 51758 547 223 aa, chain + ## HITS:1 COG:SP0349 KEGG:ns NR:ns ## COG: SP0349 COG0489 # Protein_GI_number: 15900278 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Streptococcus pneumoniae TIGR4 # 1 220 1 227 227 142 37.0 5e-34 MKKIEITQDELSYELEEEIKTLRTNLLFCGDDKKAILLTSCFQGEGKTNTALQLAQSLAA MQKKVLLVDADLRKSVLISRLNVGKVEYGLSHFLSGQCSLGEAVVSTNIPRFHIMFSGPS VKSSAELLTNERFEKMMDSFREIYDYIIIDSAPLGMVIDAAIAAKQCDGAIMVIESEKVK YRIAQEVKKKLEKSGCQILGVVLNKVPRRNRKKYYGYGEKYYG >gi|330401367|gb|ADLB01000027.1| GENE 55 51804 - 53666 1675 620 aa, chain + ## HITS:1 COG:BH3718 KEGG:ns NR:ns ## COG: BH3718 COG1086 # Protein_GI_number: 15616280 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Predicted nucleoside-diphosphate sugar epimerases # Organism: Bacillus halodurans # 10 601 3 582 608 480 45.0 1e-135 MKRTEKNTQWHMRVILLLAGDIIAIMLSSFLALWLRFELVFADIDRIFLESIYQYAGINI ICTIAIFYAFHLYTSLWKYASVNELVNITLAVVISGIVNAVGMRAFQNPVPRSYDILYTM FLLAMMIGIRFFYRFVRFMKTEYFIRNRKEMSNVMVIGAGDAGAAIVKELRLNETLKRRV CCMIDDDPAKKGKYVQGCLVVGGKKDIISAVERYGITKIIIAMPSAPKSTIKEIVDLCQH TECKIRILPGMYQLVSGEVSVSQLRDVEIGDLLGREQIKVNLNEILGYVQNKVVLVTGGG GSIGSEICRQLAGHGVKQLIIVDIYENNIYEIQQELKRKYPELDLVALIASVRNTHRINE IMDKYRPNVVYHAAAHKHVPLMEDNPNEAIKNNVFGTYKTASAAGKHGVERFVLISTDKA VNPTNIMGASKRMCEMIVQTLDKFYPTEFVAVRFGNVLGSNGSVIPLFKKQIAEGGPVTV THPDIIRYFMTIPEAVSLVLQAGAYAKGGEIFVLDMGEPVKIADLAKNLIRFSGFKVGED IEIKYTGLRPGEKLYEELLMDEEGLQATDNNLIHIGKPIDMNEAVFMRQLKELKAASDKD SELIRQMVKEMVPTYVMKEE 
>gi|330401367|gb|ADLB01000027.1| GENE 56 53671 - 55515 1354 614 aa, chain + ## HITS:1 COG:SP0351 KEGG:ns NR:ns ## COG: SP0351 COG0438 # Protein_GI_number: 15900280 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Streptococcus pneumoniae TIGR4 # 216 607 2 402 409 419 51.0 1e-116 MEQNHYLKIKRSIDIICSLVGLILLSPVYLIICIAIKITSPGPVLFKQKRVGLHKKYFNI LKFRTMRIDTPKDMPTHLLSNPDQYITKIGKFLRKTSLDELPQIINILKGDMSIIGPRPA LWNQYDLLAERDKYGANDVTPGLTGWAQVHGRDELEIEVKAKLDGFYVKHMSFRMDAICF VRTIFSVLKSEGVVEGGTGEIGKNSEDADSANEKKKILVVCQYYKPEPFRISDICEEMVR RGHEVQVVTGYPNYPEGILYNGYGKGKKIDEVINGVKVHRCYTIPRQTGAVKRMFNYYSY VISSVKYVLSKKCKTSDGKSFDVVFCNQLSPVMMAHAAIAYKKKYKVPVIMYCLDLWPES LIAGGVKRESILYTYYHYVSKRIYKKMDEILITSRMFSTYLQQEFDIAKEKIAYLPQYAE GIFEKVPKKQESGAFDFMFAGNIGAAQSVETIIKAASLLKNEPVKIHVIGSGTELEKLQK ISENEDNVVFYGRRPLEEMPSFYEKADAMLITLQADPILSMTLPGKVQSYMAVGKPIIGA IDGETKKVIEDAKCGYCGKAEDAEILAENIRRFIKNPDKNMMGENAKIYYEKHFEQEKFM DKFEAYLFLEDSNI >gi|330401367|gb|ADLB01000027.1| GENE 57 55512 - 56711 706 399 aa, chain + ## HITS:1 COG:all4426 KEGG:ns NR:ns ## COG: all4426 COG0438 # Protein_GI_number: 17231918 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Nostoc sp. 
PCC 7120 # 87 336 94 345 417 102 29.0 2e-21 MKILMINVVCGIRSTGRICTDLAVELEKQGHEVRIAYGREEVPEKFQKYAIRIGTDIDVR LHGVKARLLDGAGFGSKRATIKFVEWVKKYDPDIIHLHNIHGYYVNIEVLFEYLKKYGKK IIWTLHDCWAFTGHSAYCDAVQCERWIQGCYNCPQMKEYPKTIIDASQKNWGKKKKIFTG IPNMTIITPSNWLAGLVKKSFLSKYPVKVIHNGIDTSQFKLLNNDFRKVYHLENKFLVLG CATSWNDMKGYSDFIELSGRLNKNYKIVLVGLADKQIKSLPENILGIGRTSSIKELAYIY STSDIFLNLSYCENYPTVNIEANACGLCVLTYNTGGSPESASNNCIVVERGNIDQIVEKI EQVYKKKLPKRQVSVIDKQTTVKIYLSEYLADDKKVLNE >gi|330401367|gb|ADLB01000027.1| GENE 58 56813 - 57925 704 370 aa, chain + ## HITS:1 COG:MTH450 KEGG:ns NR:ns ## COG: MTH450 COG0438 # Protein_GI_number: 15678478 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanothermobacter thermautotrophicus # 181 359 220 404 411 90 32.0 4e-18 MKLLFLSNVPSPYRVDFFNELGKYCDLTVLFEKTTSDERDESWKKYQFNTFRGIFLRGKT ISVDTAFCPEVTKYIRDKSFEHIICTTFTTPTGMWAIQYMKRHKIPYYLECDGGFAKNGQ GIKEKIKKFFISGARRYFSTGKTCDDYYLQYGANVDKLVLYPFSSLKEKDIMERPTPFGT KREIRSKLGICEEKVVLTVGQFIYRKGFDLLLEAAKELPNDIGIYFVGGLPTEEYMQIKK KYELHNIHFIGFKDKYELKEYYNAADIFVLPTREDIWGLVIEEAMACGLPIISTERCAAA LELVKNNENGYIIPVENVDKLTASILSILNSKETIERWGMRSINIVHSYTIEKMVEKHLI ALEGVKENNE >gi|330401367|gb|ADLB01000027.1| GENE 59 57918 - 58853 514 311 aa, chain + ## HITS:1 COG:no KEGG:THA_745 NR:ns ## KEGG: THA_745 # Name: not_defined # Def: putative glycosyltransferase protein # Organism: T.africanus # Pathway: not_defined # 10 310 8 306 318 99 27.0 1e-19 MSNKKLAVSVLIPTMNRPKALERTLKSYLSAKYIPCQIVVVDQSEKKMQEKVKQVVEKYR NIVEMKYVYQEEASLTKARNTAFGYAKAEIIICSDDDIDVYEDTVKNVYDFMQNRSFAMI AGIDDNMTTSSSKIGYFLGTKSFINKNKGHVTLSMLGRFPNEITENTQTMWAMGFFFVIR KSLVENWKIKWDEKLIGYAYAEDLDFSYSYYKKAKKECLKCVMSSNIHVKHLASQEFRTP SRKSTFMYVIHRAYLSYKHQMGWKSELAMRWCNLGIYFERRIKNNCSQDMKDAMHYLKKH RSEIRKGEFNF >gi|330401367|gb|ADLB01000027.1| GENE 60 58873 - 60027 467 384 aa, chain + ## HITS:1 COG:no KEGG:SP70585_0419 NR:ns ## KEGG: SP70585_0419 # Name: not_defined # Def: putative repeating unit polymerase # Organism: S.pneumoniae_70585 # Pathway: 
not_defined # 41 372 51 390 401 95 25.0 4e-18 MIKRSSLYRDLIILGCVLCNLTQIPTLYNNRILSLGYSGTWFLLLIVMILHDHRIKIQYF IIPVFFDIYCIILSLGKGGYTSSDLFRPINLCTFILLIGILAGRFLDEANLQKISGAFIL SALIVAIYLYFDIFRGVDWAGTGGYLYGAKNSAGQIFLTAIILLILFFMERYKIISIVLC SIFCALIIMMKSRSTLLTLVLITVYIVLFVIKKPLYKAIGMCIISIIAIIILTDDSLYNL YVNQIMLNNKDINDFSAITSDRDIHYEFFARYFGQYWGVGTGGTYLEAMPLAVLMSYGIV GGIPVLLYSLFPLYVGIKNINKRKDYRIFCHIIISLGIVMWVNGIFEEQSPFGPGAKCYF LWLVTGIFIGYKEKKKKKEMEHYE >gi|330401367|gb|ADLB01000027.1| GENE 61 60020 - 61018 591 332 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 9 267 5 260 344 130 32.0 3e-30 MNKENKIKLSFVVPVYNMEKYLENCIESILSQNVEQSEIILVDDGSTDKSVVLCDKYAEQ YENLKVIHKVNGGISSARNAGLLAAQGEYICFVDSDDFFKKEFANDFLEICEKENLDIIR GWYGIYEEDTKTYQKHLFPEISYSNKTLSGYDFLKKSVSEHANEVVPWLGFFRRDYLIRN KLIFPVGIAYEEDQLFFLESLVCDKKCRVYQSDVEFYSYRKRTGSATKTPKFKQIEDILY VVEKETKVLDKYQLPKDIKKSVLKYVCSSFYQLTSIYGRLKREDAIKTAKITPFEVKWQC ICHPYDSHQFFKIFLFTFARWVVDLEYKRRRR >gi|330401367|gb|ADLB01000027.1| GENE 62 61015 - 62151 750 378 aa, chain + ## HITS:1 COG:no KEGG:BVU_2948 NR:ns ## KEGG: BVU_2948 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 367 5 381 388 158 28.0 4e-37 MKLVIVLLTHIDNLPPARNLLISLSKIEIKVELITMYSAALPEQIRKAGNITIHDVQSEI ANNKLQALKNRFKRRRKVRTLIKEIMSDRDVLWTVTDYDAMEVGDILNNYRHVMQLMELI QDIPIFDELPFYKANLQKYAQTAEMVIVPEYNRAYIQQAYWQLKNTPKVLPNKPTVYKQE YDIEAISPKGAEVLSEIGNRKIVLYQGVFGYERVLDQFIEAVEQLGDDYCMLLMGRDDEE LQKLLEKYPETFFVPFIAAPNHLAITSKAHIGVLSYVNTNNIRHYDPLNALYCAPNKLYE YACFGIPMIGNDIPGLRIPFEQNNIGRCSELKANEIAEAIRKIEMDYSQMSKNCIQFYES IDMDKLVGELVKDLISKE >gi|330401367|gb|ADLB01000027.1| GENE 63 62156 - 63259 774 367 aa, chain + ## HITS:1 COG:SP0357 KEGG:ns NR:ns ## COG: SP0357 COG0381 # Protein_GI_number: 15900286 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-N-acetylglucosamine 2-epimerase # Organism: Streptococcus pneumoniae TIGR4 # 4 364 2 362 365 568 74.0 1e-162 MDKKKIMVVFGTRPEAIKMCPLVKELKSRKGLETIVCVTGQHRQMLDQVLEKFNVIPDYD LSIMKDKQTLFDVTINILEKIKSILEETKPQIVLVHGDTSTTFVTALACFYLRIPIGHVE AGLRTYNIYSPYPEEFNRQAVSIISQYNFAPTELSKQNLLNEGKKEDTIYVTGNTAIDAL KTTVNKDYSHSELEWVGGSRLILITAHRRENLGEPMKNMFRAIRRVMDEHPDVKAIYPIH MNPVVREMANDILRKDDRIHIIEPLDVIDFHNFQNKSYLILTDSGGIQEEAPSLGKPVLV MRDTTERPEGIAAGTLKLVGTSEEVIYREFTELLDNSEAYNKMAQAANPYGDGHACERIA DILEESL >gi|330401367|gb|ADLB01000027.1| GENE 64 63262 - 63897 304 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237727084|ref|ZP_04557565.1| ## NR: gi|237727084|ref|ZP_04557565.1| predicted protein [Bacteroides sp. D4] # 2 173 12 189 250 117 38.0 4e-25 MKKKELSISLLRLISMSMIVVCHILQFYGNELAYWFNVGVQIFLIISGYLYGQKSRINSI EFYKKNFKKILCDYWICLIVVLLFYQLYTPQYINFENVIKAIFGVSNGIPGLGHYWFIST ILICYLVTPMLSKYLNGKKDIVNFLFIICFNELIFHFLPYFDGAWINCYCASFYYARMKE NIKNDKLFIANVCSITILANSIKILLVLEYK >gi|330401367|gb|ADLB01000027.1| GENE 65 63974 - 64132 270 52 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 45 1 45 94 66 68.0 4e-10 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKERKRLVSG >gi|330401367|gb|ADLB01000027.1| GENE 66 64195 - 64323 177 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210611402|ref|ZP_03288896.1| ## NR: gi|210611402|ref|ZP_03288896.1| hypothetical protein CLONEX_01086 [Clostridium nexile DSM 1787] # 2 41 41 80 83 63 72.0 4e-09 TYNLNLYEYLKFLFEHRPNKDMSDEEFENLAPWNEHVQELCK Prediction of potential genes in microbial genomes Time: Tue May 24 22:00:27 2011 Seq name: gi|330401250|gb|ADLB01000028.1| Lachnospiraceae bacterium 2_1_46FAA cont1.28, whole genome shotgun sequence Length of sequence - 54970 bp Number of predicted genes - 52, with homology - 48 Number of transcription units - 28, operones - 13 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . 
- CDS 2 - 206 138 ## EUBREC_3236 transposase 2 1 Op 2 . - CDS 196 - 609 229 ## EUBREC_3235 hypothetical protein - Prom 683 - 742 6.5 + Prom 783 - 842 5.5 3 2 Op 1 . + CDS 925 - 1290 365 ## BBR47_10940 hypothetical protein 4 2 Op 2 . + CDS 1341 - 2150 782 ## TDE1991 hypothetical protein + Term 2170 - 2220 1.2 + Prom 2245 - 2304 6.7 5 3 Tu 1 . + CDS 2334 - 2801 516 ## COG5652 Predicted integral membrane protein + Term 2942 - 3000 -0.7 6 4 Tu 1 . - CDS 2807 - 3070 306 ## Cphy_0044 hypothetical protein - Prom 3098 - 3157 12.5 7 5 Op 1 . + CDS 3184 - 4248 1309 ## COG3641 Predicted membrane protein, putative toxin regulator 8 5 Op 2 . + CDS 4266 - 4787 534 ## COG2109 ATP:corrinoid adenosyltransferase + Prom 4790 - 4849 6.9 9 6 Op 1 . + CDS 4886 - 5605 793 ## COG2357 Uncharacterized protein conserved in bacteria 10 6 Op 2 . + CDS 5626 - 6501 620 ## COG1284 Uncharacterized conserved protein + Term 6512 - 6570 13.6 + Prom 6571 - 6630 6.0 11 7 Op 1 39/0.000 + CDS 6684 - 7586 1103 ## COG0226 ABC-type phosphate transport system, periplasmic component 12 7 Op 2 38/0.000 + CDS 7588 - 8478 1148 ## COG0573 ABC-type phosphate transport system, permease component 13 7 Op 3 41/0.000 + CDS 8471 - 9334 795 ## COG0581 ABC-type phosphate transport system, permease component 14 7 Op 4 32/0.000 + CDS 9327 - 10076 331 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 15 7 Op 5 7/0.000 + CDS 10079 - 10753 823 ## COG0704 Phosphate uptake regulator 16 7 Op 6 40/0.000 + CDS 10750 - 11430 903 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 17 7 Op 7 . + CDS 11444 - 12733 909 ## COG0642 Signal transduction histidine kinase + Term 12739 - 12765 -1.0 - Term 12726 - 12752 -1.0 18 8 Tu 1 . - CDS 12765 - 12923 401 ## COG1592 Rubrerythrin - Prom 12950 - 13009 7.2 + Prom 12928 - 12987 12.8 19 9 Op 1 . + CDS 13061 - 13801 787 ## COG4509 Uncharacterized protein conserved in bacteria 20 9 Op 2 . 
+ CDS 13856 - 14176 204 ## 21 9 Op 3 . + CDS 14207 - 14338 67 ## 22 9 Op 4 . + CDS 14289 - 15908 1572 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 15913 - 15982 16.1 + Prom 16013 - 16072 4.6 23 10 Op 1 . + CDS 16113 - 20066 3716 ## COG4886 Leucine-rich repeat (LRR) protein 24 10 Op 2 . + CDS 20108 - 20482 360 ## gi|210613459|ref|ZP_03289718.1| hypothetical protein CLONEX_01925 + Term 20501 - 20549 5.3 - Term 20491 - 20535 7.4 25 11 Tu 1 . - CDS 20587 - 21177 356 ## COG1309 Transcriptional regulator - Prom 21263 - 21322 7.7 + Prom 21223 - 21282 10.4 26 12 Op 1 . + CDS 21351 - 24629 2847 ## BBR47_26510 hypothetical protein 27 12 Op 2 6/0.000 + CDS 24659 - 25339 866 ## COG3819 Predicted membrane protein 28 12 Op 3 . + CDS 25355 - 26299 1281 ## COG3817 Predicted membrane protein 29 12 Op 4 . + CDS 26320 - 27306 789 ## COG2309 Leucyl aminopeptidase (aminopeptidase T) 30 12 Op 5 . + CDS 27318 - 27962 627 ## COG2039 Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) + Term 28083 - 28140 11.2 - Term 28070 - 28126 13.1 31 13 Op 1 . - CDS 28143 - 28259 236 ## 32 13 Op 2 . - CDS 28337 - 28909 559 ## COG1396 Predicted transcriptional regulators - Prom 29071 - 29130 11.5 + Prom 29069 - 29128 8.0 33 14 Op 1 11/0.000 + CDS 29167 - 30078 606 ## COG1180 Pyruvate-formate lyase-activating enzyme 34 14 Op 2 . + CDS 30082 - 32454 2652 ## COG1882 Pyruvate-formate lyase + Term 32472 - 32517 11.6 - Term 32460 - 32505 8.4 35 15 Tu 1 . - CDS 32508 - 33155 517 ## COG1802 Transcriptional regulators - Prom 33186 - 33245 6.5 + Prom 33066 - 33125 6.0 36 16 Tu 1 . + CDS 33341 - 38116 5061 ## CPF_0859 alpha-N-acetylglucosaminidase family protein + Term 38132 - 38181 10.8 + Prom 38185 - 38244 10.9 37 17 Tu 1 . + CDS 38287 - 39081 1105 ## COG0345 Pyrroline-5-carboxylate reductase + Term 39094 - 39134 8.3 + Prom 39108 - 39167 5.5 38 18 Tu 1 . 
+ CDS 39189 - 40184 1006 ## COG0667 Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) + Term 40193 - 40231 2.2 + Prom 40186 - 40245 8.5 39 19 Op 1 . + CDS 40314 - 41078 785 ## COG0253 Diaminopimelate epimerase 40 19 Op 2 . + CDS 41066 - 42286 706 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase - Term 42075 - 42122 2.6 41 20 Tu 1 . - CDS 42281 - 43072 519 ## COG0789 Predicted transcriptional regulators - Prom 43100 - 43159 7.3 + Prom 43062 - 43121 7.6 42 21 Tu 1 . + CDS 43150 - 43788 578 ## COG1284 Uncharacterized conserved protein + Term 43846 - 43887 1.9 + TRNA 43880 - 43950 60.1 # Gln CTG 0 0 + TRNA 43977 - 44047 60.1 # Gln CTG 0 0 + TRNA 44061 - 44134 72.7 # Pro CGG 0 0 - Term 44047 - 44119 31.2 43 22 Tu 1 . - CDS 44207 - 44296 86 ## - Prom 44459 - 44518 4.3 + Prom 44063 - 44122 80.0 44 23 Tu 1 . + CDS 44295 - 44915 690 ## COG0020 Undecaprenyl pyrophosphate synthase + Term 44920 - 44970 15.4 - Term 44906 - 44958 12.0 45 24 Tu 1 . - CDS 44964 - 45392 301 ## gi|210617536|ref|ZP_03291618.1| hypothetical protein CLONEX_03840 - Prom 45449 - 45508 7.5 46 25 Op 1 36/0.000 + CDS 45513 - 46217 283 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 47 25 Op 2 . + CDS 46224 - 49523 2926 ## COG0577 ABC-type antimicrobial peptide transport system, permease component + Term 49565 - 49603 8.1 + Prom 49575 - 49634 7.3 48 26 Tu 1 . + CDS 49665 - 51695 2286 ## COG0326 Molecular chaperone, HSP90 family + Term 51713 - 51772 4.4 + Prom 51798 - 51857 10.8 49 27 Op 1 . + CDS 51885 - 52760 940 ## gi|295109548|emb|CBL23501.1| hypothetical protein 50 27 Op 2 . + CDS 52787 - 53341 629 ## gi|295107973|emb|CBL21926.1| hypothetical protein 51 27 Op 3 . + CDS 53353 - 54588 1010 ## EUBELI_00297 hypothetical protein + Term 54592 - 54651 11.3 - Term 54645 - 54695 6.0 52 28 Tu 1 . 
- CDS 54735 - 54899 63 ## COG3666 Transposase and inactivated derivatives Predicted protein(s) >gi|330401250|gb|ADLB01000028.1| GENE 1 2 - 206 138 68 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3236 NR:ns ## KEGG: EUBREC_3236 # Name: not_defined # Def: transposase # Organism: E.rectale # Pathway: not_defined # 1 48 1 48 113 68 60.0 7e-11 MLGDISLATNIYLVTGYTDMRKSIDGLCAIIMKNFKHEPDGHSIYLFCVSVATGSKSCSK GRMAISSY >gi|330401250|gb|ADLB01000028.1| GENE 2 196 - 609 229 137 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3235 NR:ns ## KEGG: EUBREC_3235 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 134 1 124 125 98 43.0 8e-20 MKAKRVNREEQLKLIMECRSSGLSDYQWCEAHGIHAGTFYNWVSKLRKVGVTIPNSESKH LGTPVHQEVVKLDLVPEPAPAATIMEKNTRILTMPDTDASVAVEIIMGNSTIRFFNNTNP DLIRTTLQCLGGMSHAW >gi|330401250|gb|ADLB01000028.1| GENE 3 925 - 1290 365 121 aa, chain + ## HITS:1 COG:no KEGG:BBR47_10940 NR:ns ## KEGG: BBR47_10940 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 11 111 130 235 251 63 35.0 2e-09 MTLYGRQPGMENEYRFQLFASYIEVFKEKNVEVLLQIIMNIFMFIPIGFLLPYCFKKFEK NKTVFFTAILFSGIIECTQGIFRMGMFEADDILGNVLGVELGFFLFCLVRKICQKFVGFH E >gi|330401250|gb|ADLB01000028.1| GENE 4 1341 - 2150 782 269 aa, chain + ## HITS:1 COG:no KEGG:TDE1991 NR:ns ## KEGG: TDE1991 # Name: not_defined # Def: hypothetical protein # Organism: T.denticola # Pathway: not_defined # 9 263 4 252 254 132 32.0 1e-29 MEEHMRKTRLEELTIKHNFMFGAVMIDPENCKGFLERVLQIKIDHVEVSLEKSIVYHPEY KGVRLDVYAKDERNTRYNVEMQVLRRAALGKRSRYYQSQMDMEILLAGSEYEDLPDSYVI FVCDFDPFGKKKYVYTFQTQCRESEETELEDGRNILFLSTHGENESEVSKELVSFLKFVK ADLQGSEEDFHDVYVKRLQEFIHHVKENRKMEERFMVFEEMLRDERKEGRKDALHQIVMK MFLQGFDDEKIMEICEISKEELESYKQEK >gi|330401250|gb|ADLB01000028.1| GENE 5 2334 - 2801 516 155 aa, chain + ## HITS:1 COG:BH3707 KEGG:ns NR:ns ## COG: BH3707 COG5652 # Protein_GI_number: 15616269 # Func_class: S Function unknown # Function: Predicted integral membrane protein # Organism: Bacillus halodurans # 1 153 1 141 
146 86 38.0 2e-17 MKKGIFTTLVVFWCGLIFWFSAQPAVESAKMSHSVGKVIGEVLVPDFKTLSAEEQEKIAE KIDYPIRKTAHTMEYAVLGGLLVLMYGSYGIIGKKGMAYGILTGVAYAMTDEIHQLFVPG RSCQVTDVLIDSAGVLFGSVIGVLIFISTVKQRRR >gi|330401250|gb|ADLB01000028.1| GENE 6 2807 - 3070 306 87 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0044 NR:ns ## KEGG: Cphy_0044 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 87 1 88 89 112 69.0 6e-24 MFRLWGKIFKDNRMLQDTVICDDSSDTRTHKIFHALDEICYAFDLSKPIWLDSTISEFKK HDKARFYQDNFVDSIDFDYLEIHVIEE >gi|330401250|gb|ADLB01000028.1| GENE 7 3184 - 4248 1309 354 aa, chain + ## HITS:1 COG:BH3254 KEGG:ns NR:ns ## COG: BH3254 COG3641 # Protein_GI_number: 15615816 # Func_class: R General function prediction only # Function: Predicted membrane protein, putative toxin regulator # Organism: Bacillus halodurans # 8 353 14 334 336 235 47.0 9e-62 MNWLKKILDKIFIDGLSGMAQGLFATLIVGTIVQQIGTLVGGETGNYIFLVGKVAASLTG AGIGVGVAHRFQESSLVVLSSAAAGMIGGYAGNLLAGSMVSGGNVILAGPGEPLGAFIAA FIGIQAGHLISGKTKIDILLTPIVTIGIGGTTGILVGPSISKFMMGLGDMINWGTEQQPL IMGIVVSVLMGMALTLPISSAALGVILNLNGLAAGAATVGCCCNMIGFAVASFRENKLGG LLAQGIGTSMLQVPNIVRNPLIWLPAILSSAILGPVGTILLKMTNNATGAGMGTAGLVGQ IMTWQTMTAVESPTIVVIKIVAIQILLPAIVTVCISEFMRKKGWIKSGDMKLDL >gi|330401250|gb|ADLB01000028.1| GENE 8 4266 - 4787 534 173 aa, chain + ## HITS:1 COG:TM1465 KEGG:ns NR:ns ## COG: TM1465 COG2109 # Protein_GI_number: 15644214 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Thermotoga maritima # 4 172 3 169 170 119 42.0 4e-27 MSTGLIHIYCGDGKGKTTAAVGQAVRSAGYGYKVLFYQFLKNNNSSERRSLEQLENITCI RGRDEVKFSFQMSEEEKKEIRSYHREQLRMLRNEVPNYDVLILDEAVCVTGIGLLDEEEL VSFLENKPKKLEVVLTGHTVSDRILQMADYVTEMKKVKHPFDRGVQARKGIEM >gi|330401250|gb|ADLB01000028.1| GENE 9 4886 - 5605 793 239 aa, chain + ## HITS:1 COG:lin0794 KEGG:ns NR:ns ## COG: lin0794 COG2357 # Protein_GI_number: 16799868 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria 
innocua # 12 219 7 211 212 218 50.0 1e-56 MQEAREAIQNYDDVDSWKTVMFLYNSALKEVGTKLEILNDEFKHVHSYNPIEHIKTRIKT PESIVKKLKRYGYEISIDNMIRYINDIAGVRLICSFTSDIYRLAAMIGNQSDLKVLTIKD YIKNPKDSGYKSYHMLVSVPIFLSDSVVNTKVEIQIRTIAMDFWASLEHKIYYKFEGNAP EYISRDLKDCADMVSMLDEKMLSLNNAILECVKQKNEEENRKEAPKVTSQDTAKLGVAD >gi|330401250|gb|ADLB01000028.1| GENE 10 5626 - 6501 620 291 aa, chain + ## HITS:1 COG:BS_yitT KEGG:ns NR:ns ## COG: BS_yitT COG1284 # Protein_GI_number: 16078176 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 11 289 7 279 280 142 32.0 1e-33 MEKENKKRGMEQMLYTVLGGFLFAFGVNLIIMPMNLYNGGFMGVAQLLRTFVVSVLHIDL GSFDLAGIIYYMINIPLMYLAWKNIGKGFFVRTIISVTLQSVFLTLIPIPKEPLISDMLT ACIIGGLISGCGTGLLLRGGSSGGGQDIIGVICAKKYPGFSVGKIGMLMNVGVYGICLLM FDVEIVVYSLIYTTVLSLTIDRIHIQNINTSVMIFTKQPGVAKAIMEETGRGVTNWKGVG AYTNENTEVLSVVISKYEVNQIKRIVQRIDENAFMIFNEGSYVIGNFEKRL >gi|330401250|gb|ADLB01000028.1| GENE 11 6684 - 7586 1103 300 aa, chain + ## HITS:1 COG:SP2084 KEGG:ns NR:ns ## COG: SP2084 COG0226 # Protein_GI_number: 15901900 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, periplasmic component # Organism: Streptococcus pneumoniae TIGR4 # 1 298 1 290 291 188 41.0 1e-47 MKLRKLLAITAITTMAIVSAAGCGSSDKKEYGKKAEGWDSANDISVVSREDGSGTRGAFI ELFGIEEKDASGEKVDNTTEEASVTNSTSVMMSTVEGNEYAIGYVSLGSLNDTIKSVKID GAEATAENIKSGTYKISRPFNIVTKDKVSDVAQDFINYIMSPEGQKVVEEAGCIPMDDVK EYKSNGATGKIVIGGSSSVSPAMEKLKEAYTALNPKAEIEIQTSDSTTGVTSTIDGVFDI GMASREIKDSELSKGIKPTVIALDGIAVIVNNNSEVEELTSEQVKDIFTGELLTWDEVVK >gi|330401250|gb|ADLB01000028.1| GENE 12 7588 - 8478 1148 296 aa, chain + ## HITS:1 COG:SP2085 KEGG:ns NR:ns ## COG: SP2085 COG0573 # Protein_GI_number: 15901901 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 2 293 3 284 287 305 58.0 7e-83 MKKAWKEKFMRGVFFTAACASVLAVALICIFLFANGVPAMKEIGFGDFLLGEKWKPGNDI 
YGIFPMIIGSIYVTVGAIVIGVPIGILTSIFMAKYCPKKIYPFLKGATELLAGIPSVVYG FFGLVVIVPIVHEICLDLRRAGILERSGDGKSILTASILLGMMILPTIIGVTESAIRAVP DQYYEGALALGATHERSIFRVVLPAAKSGVLAGVILGIGRAIGETMAVIVVAGNQPRLPE EITQGVRTLTANIVMEMGYATDLHREALIATAVVLFVFILIINLCVSILNRRSRNV >gi|330401250|gb|ADLB01000028.1| GENE 13 8471 - 9334 795 287 aa, chain + ## HITS:1 COG:SP2086 KEGG:ns NR:ns ## COG: SP2086 COG0581 # Protein_GI_number: 15901902 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type phosphate transport system, permease component # Organism: Streptococcus pneumoniae TIGR4 # 16 285 2 271 271 251 55.0 2e-66 MSRWKNKLRSYLKTPASFITMLLVMLSALITFTILLFLIAYILIKGIPHLTTDLFALKYT SDNGSLMPALLNTLFMTALSLIIAVPLGIFSAIFLVEYSGKGNKFVEVIRITTETLSGIP SIVYGLFGMLFFVSTLKWGYSLLSGAFTLAIMILPLIMRTTEEALKSVPDTYREGSLGLG AGKLRTVFRVVLPSAVPGILAGVILAVGRIVGETAALIYTAGTVPKVAANPMDSGRTLAV HMYNLSNEGLFMDKAYATAVVLLVMVVGINWLSGFIAKKITKGRSNE >gi|330401250|gb|ADLB01000028.1| GENE 14 9327 - 10076 331 249 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 4 244 2 239 245 132 36 6e-30 MSKISIQNLDLHYGDFKALKNVNLEIEKNKITAFIGPSGCGKSTLLKSLNRMNDLVEGCK IDGKILLDGEDIYGRMDVNLLRKRVGMVFQKPNPFPMSIYDNIAFGPRTHGIRSKSKLDD IVEKSLRDAAIWEETKDRLKKSALGMSGGQQQRLCIARALAVQPEVLLMDEPTSALDPIS TSKIEDLAIELKKDYTIVMVTHNMQQAVRVSDNTAFFLLGEVIEYNNTEKLFSIPSDKRT EDYITGRFG >gi|330401250|gb|ADLB01000028.1| GENE 15 10079 - 10753 823 224 aa, chain + ## HITS:1 COG:lin2637 KEGG:ns NR:ns ## COG: lin2637 COG0704 # Protein_GI_number: 16801699 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate uptake regulator # Organism: Listeria innocua # 1 209 3 212 219 124 35.0 2e-28 MRNKFDMQLDKMNEMLIQMGELCEKAIANATKAVADGNLEMAKIVIGEDEEIDQMEKDIE RLCLKLLLQQQPVAKDLRQISAALKMITDMERIGDQASDIAEILLSANESEAAPIPYLTE MAAATSKMVTKSVQAFVEKDLELTRKVILADDTVDELFDKVKEKLVSFLAKNDNSGERAI DMLMIAKYLERIGDHATNIAEWVEFSITGVHREQKEHEKVRRTE >gi|330401250|gb|ADLB01000028.1| GENE 16 10750 - 11430 903 226 aa, chain + ## HITS:1 
COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 2 219 6 226 232 209 51.0 3e-54 MIYCVEDDSNIRELVVYTLESTGMKARGFEDGRKFTEALAFETPELVLLDIMLPGEDGIE ILKKLRNNAKTKNIPVIMVTAKGSEYDKVVGLDTGADDYITKPFGMMELVSRIKAVLRRT TREKEETKYQIGNLVIDVEKHKVKVDGKSVTLTLKEFELLEKLMKNRHIVLTRDRLLEEI WGYDFSGETRTVDVHVRTLRQKLGDAGELIETVRGVGYRIGGDNEE >gi|330401250|gb|ADLB01000028.1| GENE 17 11444 - 12733 909 429 aa, chain + ## HITS:1 COG:BH3156 KEGG:ns NR:ns ## COG: BH3156 COG0642 # Protein_GI_number: 15615718 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 203 428 352 579 589 166 42.0 1e-40 MVLILCITLLATYSLLAVVVYNQSFLNLQEEVQQEADYIKAAIDISGEQYLRAMDNVRQK ARVTVVRPNGTVTYDSGAGQEKLENHRERKEIKEALRNGKGSDFRKSDTTKERSIYYAIQ LENGNVLRVARTVDNVWPMLLNVFPYMAGISILMMGIAFLLAKWQTARLIRPINELNLEE PLENEVYEELQPLLEGLDKRNREKEAVAQMRKEFSANVSHELKTPLTSISGYAEIMKSGL VKPEDMQNFSERIYTEASRLITLVEDIIKLSKLDEGKVELEKEEVDLFMLCREVCSRLSL QAEKKRIKIEVTGEPVFYRGIRQILSEMIYNLCENGIKYNVEGGKLTVWAGNTLQGKKVI IRDTGIGIPEEDRERIFERFYRVDKSHSKQSGGTGLGLSIVKHGAMLHQAEVQVKSEVGK GTTMEIVFK >gi|330401250|gb|ADLB01000028.1| GENE 18 12765 - 12923 401 52 aa, chain - ## HITS:1 COG:alr1174 KEGG:ns NR:ns ## COG: alr1174 COG1592 # Protein_GI_number: 17228669 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Nostoc sp. 
PCC 7120 # 2 49 181 228 237 80 62.0 1e-15 MKKYVCEPCGYVYDPELGDPDGGIAPGTAFEDIPDDWVCPICGLGKDVFVEE >gi|330401250|gb|ADLB01000028.1| GENE 19 13061 - 13801 787 246 aa, chain + ## HITS:1 COG:lin2285 KEGG:ns NR:ns ## COG: lin2285 COG4509 # Protein_GI_number: 16801349 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 33 229 42 226 246 106 34.0 4e-23 MKKKRDKIFFWICIIICICCVGYIAYYYMQKADSKKDYKEVKKKVIKEETQNEEKDVIPI DFAELKKVNDEIYAWIQVPDTNIDYPILQSKTDDAFYLDHAFDKKYDVFGSVFTERINAK DFTDFNTVIYGHNIKDGSFFQNLHKFEEEAFFNNHDTFTIYTETEKKTYKIFAAVEYSSK HILYNYDNESPEERKAFLQSLRESRSMKNHYRDDVKVDENSKIVTLSTCIRWEPKKRYIV AGVEVE >gi|330401250|gb|ADLB01000028.1| GENE 20 13856 - 14176 204 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTEKKRTVLATVLVLYVFEMILLKFRGGDMLFGNMIFMFPIGVILPKLWERCKERKKLIG GGIVFNIILELIQLSGGGEIHIFYAVSTGMAGLLLGYAGYYISEKT >gi|330401250|gb|ADLB01000028.1| GENE 21 14207 - 14338 67 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDMHKIQTVEIKIRRKGENGFVFESDTNEKVINYRCRRFRTYD >gi|330401250|gb|ADLB01000028.1| GENE 22 14289 - 15908 1572 539 aa, chain + ## HITS:1 COG:NMB1820_2 KEGG:ns NR:ns ## COG: NMB1820_2 COG0110 # Protein_GI_number: 15677656 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Neisseria meningitidis MC58 # 25 160 1 141 190 106 42.0 1e-22 MKKLLIIGAGGFGHMIKETAAELGYEEIVFLDDAVKGADVIGKCCDYQAFLSEYDTAVAG LGDNGMRLYWTEKLAEAGYNVPAVIHPSAVVSPSAKIGLGSFVMQRAVVNTNVVIEDGVL VNSGAVVDHDSYVASGAHIGLGSVVKANCTINKKKKVEAGEVVFSTRRKIDGVDNLNLED ALYAFGFGPQCSYVKPFGEGHINETYAVYMPGEEGDEFCYILQRVNSNVFKDPAGVMENI FNVTEYLRNVIREEGGDPDRETLCAIKTKDGSTYFEDSEGQPWRSYHYIPNSVCYQLVEE PEQFYQSGNSFGHFLKQLGQYPASDLNETIPDFHNTVKRFEAFAFALKRDIKNRAVSCRP EISFVLDRKEDCGVLVKQQEEGVLPLRVTHNDTKLNNILFDAETGKGLCIIDLDTIMPGL AANDFGDSIRFGAATAAEDEKNLDLMHFDISLYETYVKGYLEATKDVLTPEEVASLPWGA RLMTLECGIRFLTDYLQGDTYFKTAYPEHNLVRARTQFRLVDEMEQQFDKMQELVRKYC >gi|330401250|gb|ADLB01000028.1| GENE 23 16113 - 20066 3716 1317 aa, 
chain + ## HITS:1 COG:lin0372 KEGG:ns NR:ns ## COG: lin0372 COG4886 # Protein_GI_number: 16799449 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 1152 1306 450 592 656 61 30.0 1e-08 MKKKRKYAKSILSGALAVVLAATAVPFPEVSSVKAASIETEQAPMQIRFDEPLSKGKLTG GSGSFTNGGSETDWWQQLSLPIGNSYMGANVYGEVGKEHLTFNHKTLWNGGPTADKPHTG GNINKVGDKSMAAYLESVQQAFLDGKSNASEMCNQLIGQNTREYGAYQGWGDIYLDFDRE SAKEDATIISDKSDKIKYGQGWGEWPQPTWEAGSEHYAMNPARLEIEFEGTGIQMIGVQY IDMGDFTATVDDTKTVSGSMYSETKKEGVALFEISGLQYGKHKLVFESKSKNGKSKTSFD YFKVLEGETIDWNPENTTDKVTFTGGWERWDRTNDSDANQWFGKDEVFIDPSRAERATLT CKFTGTGFELFGAKSDQVGKFQYQIDNGAWKEVDTRSNTFARQRLLKVNGLEKREHTLTI KGIKNNKVSFDGIVTSMDKSKPEHTEVTNYERALDIDSALATVSFDRDYTHYYREYFASY PDNVIAMKLTAEALKGSQKEMKPLEFEVSFPVDQPSEAALGKEVKYETTEDGTIVVSGHM RDNGLLFNGRLQVVTKDGKVEQIANKEGTLLVSGATEVYIYVTADTDYKMTYPKYRSGIT ADELSTQVKTVLDKAVKKGYKAVKDDAVADYKKIYDRVKLDLGQGAYKKTVDELIASYKS NKASAEEKAYLEAILFQYGRYLQISSTREGDKLPANLQGVWLDCTGKANAPIAWGSDYHM NVNLQMNYWPTYVTNMAECAEPMIKYIEGLREPGRVTASTYFGIDNSNGQKNGFTAHTQN TPFGWTCPGWEFSWGWSPAAVPWMLQNVYEAYEYSGNIEKLEKDIFPMMQEQAKFYMSIL KKVTTADGKERYVTIPAYSPEHGPYTAGNVYENVLVWQLFNDCIEAADALNANKAGTVSE EQITQWKEYRAGLKPIEIGQSGQIKEWYDETTLGHNTKGNIPKYQKGHRHMSHLLAVYPG DLVTVDDEKTMDAAKVSLNDRGDNATGWGIAQRLNTWARTGDGNHAYKIIDSFIKNGIYS NLWDAHPPFQIDGNFGYTSGVAEMLLQSNAGYINLLPAMPENQWQSGSVSGLVARGNFVV SENWDKGVLTEATIESRNGGDCTVQVNGWEKVFVQDSEGKKVQAKEVEGKTGRVTFATEE GKTYFITKTEGAVDSATVTFTAEGKTVKEISIVKGETIGNELPTAPEKEGYTFKGWNTKE DGKGEVVTEKTVVNADMTIYAIYEKDETPEPQPEEKMVTFTAEGKLVKEVSIVKGETIGN ELPTAPEKEGYTFKGWNTKEDGKGEVVTEKTVVNADMTVYAIYEKNETPEPQPEEKR >gi|330401250|gb|ADLB01000028.1| GENE 24 20108 - 20482 360 124 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210613459|ref|ZP_03289718.1| ## NR: gi|210613459|ref|ZP_03289718.1| hypothetical protein CLONEX_01925 [Clostridium nexile DSM 1787] # 2 57 1739 1794 2065 72 62.0 7e-12 MKGEAIGNELPTAPEKEGYTFKGWNTKEDGKGEVVTEKTVVNANMTAYAIYEKVQTPITP TPDNKPTTKPDTHPTQKPDNKPVQKPDNKPVKTSDTANVAIPFAFMLIAAAAGVIVLGKK RKTK 
>gi|330401250|gb|ADLB01000028.1| GENE 25 20587 - 21177 356 196 aa, chain - ## HITS:1 COG:CAC2605 KEGG:ns NR:ns ## COG: CAC2605 COG1309 # Protein_GI_number: 15895863 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Clostridium acetobutylicum # 11 194 15 192 194 80 28.0 3e-15 MRKKAINTTIKEQLISAAWELFLEKGYDATTVNEIIERSQTSRGSFYHHFRGKEDLIFSL AYFFDNDYDEWLKSLPSNLSAVDKLITFDEFILSNLEHSPYISFFQTLYGLQVMTQGTRY ILNPKRRYYQILNQLVKEGLDSGEIISSESYTEISEKIASLERGLTYDWCLQEHRYSLLH YAHGIVCTYLESLRKH >gi|330401250|gb|ADLB01000028.1| GENE 26 21351 - 24629 2847 1092 aa, chain + ## HITS:1 COG:no KEGG:BBR47_26510 NR:ns ## KEGG: BBR47_26510 # Name: not_defined # Def: hypothetical protein # Organism: B.brevis # Pathway: not_defined # 201 1088 271 1054 1066 256 27.0 4e-66 MEKQKKIQLPKKMGKTERNRLIEFALLKGFQSVSAEFPVVSTTEGEEVPESFSQYELQLR NIEAPEDYVSATKSAKQIPDLDWRKKKGLEAVFEKGEFLKDNNLDMLPDSLDMKIILPEN ADESMVIAACNIAFRFGMETTGYEGMIVADESYTGNAFVIEEGDECRVRFDKQDHRTIVY LSGKGKELEVFSAWLCEKFPLLPNAERWTDRLMELTDSFSMKNLDGQLSYLKAFQKELGD SVTAYVSPEISSVKKGIEKEFPNVLFENYKGMKKVYEKEYDIPWEVDVCKEILESKVYPL LRNGDKVKVYAALSEEMDVRMALKQNINQRLEEKGAKLLDAEILCAYKQGFSWIDEIIIP KLLNEKVGKLHIAFRPFLPEGQTEWLDENGATPSYTNIDFSNPDKWYDLPIRYLQELYPI VDIIESKLGIEKDNIVFSVYEGEKDCTYYLEAEDINGNTILEEDYLAINAERPYLDAFPE MGKVHPATGYVKVYLNGQEVLNERVRTDVECIWEIYQKEVLADLQKFIEGKLGEHICLET QPFFSRLSLEVVASEPDYKLPCREDLISTLDALHEDMYFVGTDFFKNYGVRHSGEMLDAP GLILPILKKGTGKPYFKVTLYDIMAEKPMLVSAQKEITSFANRQDIDIYMDSLTYESDKI TAHICVDGGNAKVAEAYALLFNKGILSSCKGYAEVDIVQLDVDGQIFEMYVPEYEEQAKD LDIREIDLLEHTLIGYDQYLEIIKQLKRVEGISVYKTATSYAGRDIYAIELLPKCKGYIS RTKRINLLPSEIINSRHHANEVSSTNSAFILLKEILTNEKYKDLSEKLNLVIVPMENVDG TAIHYELQKENPYWKFHVARFNAIGKEFYHEHFKEDTIHTEAMGLTRLWYRMLPDIVVDN HGVPSHEWEQQFSGYTSPSFKGFWLPRSLLYGYFWYVTNEEYKSNYKVNKKMEDVIADAI GAKEDIYKWNKEWMQQFEKFAHGWMPKLFPANYYKDMINYWIPFEYQPEHRYPSIRFPWI TTVAYTSEVADETAQDDYLNLCAKAHVTHDLATIDMFVEGESVFEVREEKADDSVSICYI RQRPLIIGSFEN >gi|330401250|gb|ADLB01000028.1| GENE 27 24659 - 25339 866 226 aa, chain + ## HITS:1 COG:SP0858 KEGG:ns NR:ns ## 
COG: SP0858 COG3819 # Protein_GI_number: 15900742 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 1 220 1 221 229 144 41.0 1e-34 MEYITLIGIVIVIVGFALKLDNILTILVAAVVTALVGGLGVDGLLETLGSSFVANRSMAI FIIVMLVTGTLERSGLKEAAAALIGKIKGATAGAVVVAYGIMRAIFAAFNVGFGGVAGFV RPVIMPMADAAIENTVGKANEEHEEQVKGMASGMENITWFFFQVLFIGGSGGLLVQSTLE GLGYKVDLVDLAKVEIPVAIFALVVTIIYYIFRDKKLMKKYYKKEK >gi|330401250|gb|ADLB01000028.1| GENE 28 25355 - 26299 1281 314 aa, chain + ## HITS:1 COG:SP0859 KEGG:ns NR:ns ## COG: SP0859 COG3817 # Protein_GI_number: 15900743 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 11 310 4 302 307 211 41.0 2e-54 MSFFMSSEELLGTKLLEIVYILIGLITIYTGVKNVMDKKNPSRIGTGIFWIVLGIVLAFG RWIPAKVNGVLIIVMCIPAILQKVKIGKIDKPTEQESRANYDKVGMKIFIPALTIGVCAL GFALFTDLGALVGVGVGVLISALLLILFLKSNTPKVLLNDSERMLSTVGPLSMLPMLLAS LGAIFTTSGVGDVIAQIVGNVIPKGNVNVGIIVYAVGMMVFTMIMGNAFAAITVMTVGIG APFVLAYGANPVLIGMLALTCGYCGTLLTPMAANFNIVPVALLDMKNRFTVIRNQVVIAV LMLVFQIAYMIMFK >gi|330401250|gb|ADLB01000028.1| GENE 29 26320 - 27306 789 328 aa, chain + ## HITS:1 COG:PH1048 KEGG:ns NR:ns ## COG: PH1048 COG2309 # Protein_GI_number: 14590885 # Func_class: E Amino acid transport and metabolism # Function: Leucyl aminopeptidase (aminopeptidase T) # Organism: Pyrococcus horikoshii # 78 321 81 319 320 98 29.0 2e-20 MKGIDRGAEIVIKKWVKLKPWEKLLIVTSEETIEEAQALKKFALRRSASVDLMIVEKTGM KIGVFFDQHEDIFANYHVIIGATNYSLVTTKAAKKAIKKGKKFLSLPLCTNDGKAMLSYD FMTMDTKKSKMLAMVIMKYLNSASVIHITSKNGTDLKVYKRNRKAGFFNGVLKDGKGYSS ASIEVYVPIEETKSEGVMVLDGSLGYIGKADYPTKIDISQGKITKIEDTITGCQLQEYLK SYDDERVWVASELGIGLNSQSKCRGKCYIEDESAYGTFHIGFGRNLALGGVHEASGHYDL VCREPNIYADNRQIMMEGKIITPEPEVY >gi|330401250|gb|ADLB01000028.1| GENE 30 27318 - 27962 627 214 aa, chain + ## HITS:1 COG:SP0860 KEGG:ns NR:ns ## COG: SP0860 COG2039 # Protein_GI_number: 15900744 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: 
Pyrrolidone-carboxylate peptidase (N-terminal pyroglutamyl peptidase) # Organism: Streptococcus pneumoniae TIGR4 # 1 205 1 204 214 220 52.0 1e-57 MKVLVTGFDPFGGETVNPAYESVKLIHDTVKNAEIIKVEIPTVFGEAGKKVEKAIEEHHP DVVLCIGQAGRRADIQVERIAINLCEAPIPDNAGNQPIDEKSQKDGETAYFATVPVKAMV KNINEHGIPASVSYTAGTYVCNNVMYDLLYTLNKKYPNVRGGFIHVPYSTQQGVGKPLGT ATMSIETMAKALEYAIEAIVETREEVEISMGITD >gi|330401250|gb|ADLB01000028.1| GENE 31 28143 - 28259 236 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGFLVCIIAVSIIAAIVAAVVSASAGALIGAQEIDDEE >gi|330401250|gb|ADLB01000028.1| GENE 32 28337 - 28909 559 190 aa, chain - ## HITS:1 COG:mlr7385 KEGG:ns NR:ns ## COG: mlr7385 COG1396 # Protein_GI_number: 13476145 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Mesorhizobium loti # 2 186 39 220 227 87 31.0 1e-17 MDTNAIIGLRIKNLRTEKKYTLKYLSENTGLSIGFLSQLERGMTSIAIDSLDKISKVLDV ELSSFFTTNIKPNDVHIMRRYEQKSTVVNPEIIEHALTNHVEEFDLLPRLVELMPALQDE SDLELYSHGGEEFIYILEGILTLQIEEDIYHLYPGDSAHVHSNISHNWKNDTNTVVKFLT VHTPNPLRHK >gi|330401250|gb|ADLB01000028.1| GENE 33 29167 - 30078 606 303 aa, chain + ## HITS:1 COG:AF1450 KEGG:ns NR:ns ## COG: AF1450 COG1180 # Protein_GI_number: 11499045 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Archaeoglobus fulgidus # 6 279 8 272 302 204 38.0 1e-52 MRTPNVINIQKFSVHDGDGIRTTIFFKGCYLNCWWCHNPESQNFAPEVMINTEKCTGCRA CEQVCPEKAIHIDNCKQCTDRTKCKVCSTCLDYCVNNAREVVGKQYTIAELVKEVDKDYM FYEESFGGVTLSGGEVMAQDMEYIEELLKKLKRKGYNITIDTCGFAPEENFQIVLPYVDT FLYDIKLMDNEKHKKYMGQSNELIFTNLKYLSDNGARIYIRIPVIGGVNDSDEEIQAIIS YLKENISVAQVNLLPYHDIASSKYQRLDVTYKGKEFTVPSKERMEELKEMFQKNGFTNTK IGG >gi|330401250|gb|ADLB01000028.1| GENE 34 30082 - 32454 2652 790 aa, chain + ## HITS:1 COG:STM4114 KEGG:ns NR:ns ## COG: STM4114 COG1882 # Protein_GI_number: 16767379 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Salmonella typhimurium LT2 # 5 785 1 761 765 444 35.0 
1e-124 MERGMNERIRRLRKQSLETEPHIYMERADLETDAYMMYEGSVSVPELRALAFKHFMANKT LCINDGELIVGEKGDGPQAAPSFPELCCHTVEDMKIMNARDLIYFRVSEEDLKLQEEKII PFWEKRSVRHKILANMSQEWKDAYAAGIFTEFMEQRGPGHTVGSEKIYKKGFLDYKNDII EARNNLDFLNDKEALDKKAQLNAMEICCDAIMILGERYAKYARELADKETDETRKAELLQ IAANCDVVPAHAPQTFWQAIQMYWFVHLGVTTELNPWDAYSPGRFDQHLNPFYQKDTEEG ILDDEKALELLECLWVKFNNQPAPPKVGVTLKESGTYTDFANLNTGGITPEGENGVNEVS YLILDCMDEMKLLQPSSNVQISKKTPTKFLKRACEISRKGWGQPAFYNTEAIIQELLNAG KTIEDARRGGTSGCVETGAFGNEAYILTGYFNLPKILELTLYNGYDIVSKKQIGLPLGYA KDFKSYEELYDAYKKQIEYLVDIKIEGSNIIEKIYAEYMPAPFLSIITNDCISKGKDYNA GGARYNTNYLQGVGIGTITDSLAAIKYNVFDEQKFTMEELIEAMEHNYEGYERIANLVRN KTPKYGNDDDYADGIMKDVFNFYQKTVTGRPNMKGGTYRVNMLPTTCHVYFGEVMNASPN GRLAQKPVSEGISPEKGADVNGPTAVIKSCAKMDHLRTGGTLLNQKFTPSVVAGEEGLTH MADLVRAYFNMDGHHIQFNVIDKETLIQAQKNPDEYKDLIVRVAGYSDHFRNLSKALQDE IIERTEQSFN >gi|330401250|gb|ADLB01000028.1| GENE 35 32508 - 33155 517 215 aa, chain - ## HITS:1 COG:mlr7144 KEGG:ns NR:ns ## COG: mlr7144 COG1802 # Protein_GI_number: 13475949 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Mesorhizobium loti # 4 205 38 236 253 62 24.0 6e-10 MATQSKSLADKAYHTIKTNILNLTYPPGMPLTEAILTEELEMSRNPVRTAIKMLQAEGLI VSGYHKSMTVKEITDKDIEEIYQLRELFEGSAFKLIFSSNRFEEYSYRIEEKVVRMCAAT NNPYEWEVADTKMHLEIVSIFDNTRINKFYESNLFELIRMGLYSLKNGMQIEKTNANLKK MVQYMRNNDYENAYAILTQDHFMTGKDSALGEYHH >gi|330401250|gb|ADLB01000028.1| GENE 36 33341 - 38116 5061 1591 aa, chain + ## HITS:1 COG:no KEGG:CPF_0859 NR:ns ## KEGG: CPF_0859 # Name: not_defined # Def: alpha-N-acetylglucosaminidase family protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 351 1309 40 995 2095 716 41.0 0 MCSLLLSFCLISGTLFSVPLSVEAREELVNVARKEGVKITVQNTGSGSASDMVDGDDKTS WQQNGWSINKDAIVDMKLAENGTNVKKVVVKVGGNDYANRKVKVTVQRAQNGITSDWLTI GEKVVTTTGDANDTADAVFELEQTASSSDIRVILSEPVAHDGGEVYFWPSIHEIEVYEMQ EVHLSDYNNIASQAEITTDGNENPSEGKDRLVDDNDTTLYKFHNAAQDSEKYINLSFAEE RMMNACEIVFEHVGAADEYTYEFTYSILGKAKGAAEYTKIVDHAKANRTDNYVQGYKFDE TAYSDVKIVIHSTTNSQGNGWPAVAEFRVYGAEEEIDDSDSIAFKKPVHSNSNQKNASKV 
NDGSKNTMWTGEYYPGYVDIDLKENYNLNTVEIFTQEKGYSQYSIYTSMNGRDFDKLAEK TSTESCTDKGEIYQAKGKEARIIRVYMEYNSTSPAAVIKEIRATGEKSGTKVQTRPEVNV KDFKDSSYANTEKVTPEQTYEEVRGIIERRLGAEYKSWFDLKLQENPNETGYDYFELNDK DGKVQITGNDGVSLAMGLNHYLKYFCNVNISQVGDQADMPAKIVPIGEKVHKETKVGTRY SYNYCTLSYSMAFWGEKEWRNELDWLALNGVNVVLDATAQEEVWRRFLEDLGYTHEEIKD YIAGPAYYAWAYMANLSGFGGPIHDSWFEERTELARKNQLSMRRLGMQPVLQGYSGMVPT NIREKDSSAEVIEQGTWCSFRRPDMLKTDSASFDKYAKLFYQAQKEVYGESAHYYATDPF HEGGDTGGLNPTVIAGKVLDAMLEADKDGIWIIQSWQGNPTTALLKGLEGRKEHALVLDL YAEKTPHWNETNPNEYGGGEFNDTPWVFCMLNNFGGRLGLHGHLDNLAKNIPAALNSAKH MEGIGITPEASVNNPLLYDFLFETVWTDNAKEKLPVIDLDKWLKDYAKRRYGKESQSAYE ALLIMKDTVYKAELNMKGQGAPESVVNARPALDIGAASTWGNAVISYDKAKLEKAAELLL KDYDKLKDSDGYMYDLATMLQQVLSNSAQEYQRKMANAFKENNKEEFNTYADKFLSIIDS MEKVTSTSKYYLLGTWVEQAKALAKNADDFTKDLYEFNAKALVTTWGSINQAEGGGLKDY SNRQWSGLLKDFYKVRWQKWIQARNDELDGKQPENINWFEWEWKWVRENTEYTNTPNKEN LKNLDKIGETILKEFSVKDPNADDSNDIKVEGIKATAGSEQSKTPGEEGAAVNVLDDNQA TIWHSSWNGAERKDLWLQLELPEAQKVNGVRLQTRNSYANGFITKYRIETSMNGTNFTEA VSGEWDTIPGWKKASFEEREAKFVRIYAVESYSNGSNNYASAAEMRLTQEKADIPALDKA LLEGKLAEAKELKTEGYTTSSVNVLNKAIEKAEKVLVEAKEQSELNAMVEELKNAMKLEE KADAEDSRTEFDKIIAGVKEESVYTEETWNVYAEAKEKVENALKDTSDVSEAKMKELLAD LKSAVAGLKEAETGEPQQPEKPENPEIPQQPEKPENPEIPQQKPQKPQQKPDNTVPETGD TTNGLGFMLLAVMAGGYIVFAETKKRRNADK >gi|330401250|gb|ADLB01000028.1| GENE 37 38287 - 39081 1105 264 aa, chain + ## HITS:1 COG:STM0386 KEGG:ns NR:ns ## COG: STM0386 COG0345 # Protein_GI_number: 16763766 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Salmonella typhimurium LT2 # 2 264 4 266 269 271 57.0 1e-72 MKLGFIGAGNMAKAMMGGILKNGILTAEEVIASDMYVPGLEKTKEELGIHVTTDNKEVAT KAEIVVLSVKPQFYADVINEIKDCITENQIIVTIAPGKTLEYLADAFGHPVKIVRTMPNT PALVGEGITGVCHNELVTKEELDTVCNILSGFGKAEVLSEKLMDVVVSVSGSSPAYVFMF IEAMADAAVADGMPRQQAYKFAAQAVMGSAKMVLETGKHPGELKDMVCSPGGTTIEAVRV LEEKGFRSSVIEAMKACVKIARGL >gi|330401250|gb|ADLB01000028.1| GENE 38 39189 - 40184 1006 331 aa, chain + ## HITS:1 COG:STM2406 KEGG:ns NR:ns ## COG: STM2406 COG0667 # 
Protein_GI_number: 16765732 # Func_class: C Energy production and conversion # Function: Predicted oxidoreductases (related to aryl-alcohol dehydrogenases) # Organism: Salmonella typhimurium LT2 # 1 328 1 327 332 411 59.0 1e-114 MSYIADEKRYEKMNYFRSGNSGLKLPAVSLGLWHNFGTNDSFENMVNMCEAAFDSGITHF DLANNYGPQNGSAEENFGKIINGELKPYRNELVISTKAGYEMWDGPYGNWGSRKYLLSSL DSSLQRMGLDYVDIFYHHRMDPATPLEESMMALDTAVKQGKALYAGISNYDAKTTKKAME ILKELRCPFIINQVRYSIFDRWIEEDGLKTFASENGCGLIAFSPLAQGLLTDKYLRGIPE NSRIKKDGRFLKESAITKERLEQIEALNQIAKERGQTLAQMALSWVIRDGDVTSVLIGAS SPEQIKENVKIVENTAFTQEELDAIDSISKK >gi|330401250|gb|ADLB01000028.1| GENE 39 40314 - 41078 785 254 aa, chain + ## HITS:1 COG:slr1665 KEGG:ns NR:ns ## COG: slr1665 COG0253 # Protein_GI_number: 16332245 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Synechocystis # 2 251 9 279 279 185 37.0 7e-47 MQALGNDYIYVNAMEEKIENPSVLAQKLSDRHFGVGGDGLILIDFSEKADVKMRMFNADG TEGEICGNGIRCVAKYVYDKKIVKNKNIKIETYNGIIETKIFQKKYSETILADMGQPSIG EENKKIVMGEEEYTADYISMGNPHLVVYVDDPKRIQMKEWQEYEKVNVEFVTVLNRNKIS MRVKERGTGETLACGTGACAAAVSCMKKQYVDNYVEVEMPGGNLFVEWEKDSNHIFAAGE AKTVFEGEVDICLP >gi|330401250|gb|ADLB01000028.1| GENE 40 41066 - 42286 706 406 aa, chain + ## HITS:1 COG:MTH52 KEGG:ns NR:ns ## COG: MTH52 COG0436 # Protein_GI_number: 15678081 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Methanothermobacter thermautotrophicus # 1 404 1 408 410 478 57.0 1e-135 MFAVNENYLSLSHDYLFAEIRNRVEEYKKKHRGKEIIRLGIGDVTLPLIPAVTEALHKAV EEMGKKETFRGYGGSRGYEFLRNAIAENDYKKRGCHITADEIFISDGAKSDIGNLQEIFS RENKVAVCNPVYPVYTDINIMAGRAGKYDSRIDKWRNIIYMDCTKETNFLPDIPKRVPDM IYLCFPNNPTGAVMKRERLQEWVDYANQVGAVIFFDNAYEGYISEKDVPHTIYECVGART CAIEVRSFSKKAGFTGLRLGATVIPKEIKRKGITLHSLWERRQETKYNGTSYIVQRAGEA TYTKKGQRQIDEQIGYYMRNAGKIRETLSELGYMAEGGKNSPYVWLQTPNHMKSWEFFTY LLENANVVGTPGVGFGMAGEGYFRLTGFGSREDTEKALQKISALPL >gi|330401250|gb|ADLB01000028.1| GENE 41 42281 - 43072 519 263 aa, chain - ## HITS:1 
COG:SP0739 KEGG:ns NR:ns ## COG: SP0739 COG0789 # Protein_GI_number: 15900634 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pneumoniae TIGR4 # 5 130 2 118 246 67 33.0 3e-11 MKKYYHIGEISKLYHIGTDSLRYYEKLGILTPTRSEKGYRLYSLNELWRLNVIRDLRSLG FSMEQIKTYLNNRTIHSTEQLLTEELAVITEKLNLLANLRENVEERLQTIHNAVLQPIGE IRKVYYEKRYCHVIHSPYKTDEEMDILIKQLLNKDEERLYIIGNNRIGSSISAENVNKKL FCNYENVFIIDRNGKEYLEEGYYLTLCYRGKSKQNAVYIPQLFRYAEEHHLTPAGPVREL LWIDIHQSKNTEEHITELQVRCL >gi|330401250|gb|ADLB01000028.1| GENE 42 43150 - 43788 578 212 aa, chain + ## HITS:1 COG:lin2023 KEGG:ns NR:ns ## COG: lin2023 COG1284 # Protein_GI_number: 16801089 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 11 206 5 198 288 109 33.0 5e-24 MTIIQRITDGLSWKKIYYIIIGAIICSFGIYNIHQQTGVTEGGVIGTMLLIHHWIGLEPS VITPILDISCYIFAYKYLGGRFLKISFVSTLSVSFFFKIWEQFPPVLLDLSSYPFVSAVL GGIFVGVGVGLIVRQGGSSGGDDALALAISKVTHCRLSKAYMFTDFTVLILSLSYIPFSR IIFSIITVTISSYIIDFVQQFGKYEALELEEG >gi|330401250|gb|ADLB01000028.1| GENE 43 44207 - 44296 86 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFATTLSIASVSMKQEFIQGINKIFIIFI >gi|330401250|gb|ADLB01000028.1| GENE 44 44295 - 44915 690 206 aa, chain + ## HITS:1 COG:CAC1432 KEGG:ns NR:ns ## COG: CAC1432 COG0020 # Protein_GI_number: 15894711 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Clostridium acetobutylicum # 1 206 1 213 213 257 57.0 1e-68 MRIPKHIAIIPDGNRRWAVQHKLEKKDGYQKGLDPGVQVLRKAKEYGISEITYYGFTTDN CKRPTEQKEAFKKACVDALRLIKEEGNVSLLVLGNAESENFPKELIQYTTRQEIGNPEIK VNFLVNYGWDWDLSGIAHRKDAYSKEISRIDLVIRWGNMRRLSGMLPIQSVYADFYVVED LWPDFDERDFDAAISWYDKQDITLGG >gi|330401250|gb|ADLB01000028.1| GENE 45 44964 - 45392 301 142 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210617536|ref|ZP_03291618.1| ## NR: gi|210617536|ref|ZP_03291618.1| hypothetical protein CLONEX_03840 [Clostridium nexile DSM 1787] # 1 131 1 148 166 124 41.0 1e-27 
MNHLRIIPVTNEELVYSLCAVTEELLDAPLLPETICEDISNDFEYFLLSYDYTFAGFSKI IESDNLLHLDTIHIHEDFRGLKIFSSLLKNYVELCRLRKLSKIILTCDKQNRNAMDCFKH LGFQETHKDVSVSPSAIMELSI >gi|330401250|gb|ADLB01000028.1| GENE 46 45513 - 46217 283 234 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 1 220 1 222 223 113 33 2e-24 MSAFVELRDVKKIYKMGEVEIQAVAGIDFKINKGEFTVVVGASGAGKTTVLNILGGMDTA TSGEVLVDGNDITKYNRKKLTAYRRDDIGFVFQFYNLVPNLTALENVELALQICKNPLDA ESVLKEVGLGERLNNFPAQLSGGEQQRVSIARALAKNPKLLLCDEPTGALDYNTGKAILT LLQDTCRKKGMTVILITHNSAIAPMADRVIHIKNGKVSKVEENRNPVSIEEIEW >gi|330401250|gb|ADLB01000028.1| GENE 47 46224 - 49523 2926 1099 aa, chain + ## HITS:1 COG:lin1187 KEGG:ns NR:ns ## COG: lin1187 COG0577 # Protein_GI_number: 16800256 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Listeria innocua # 1 1099 1 1136 1136 680 36.0 0 MKKKALRKDFYMEIKRSLGRFLSIFFIVALGVSFFAGIRAAEPDMRFSADEYFDEHHLAD LKVMSTFGLTEEDAEAIRQVEGVKKAEYGYSTDAICQLEDSEKVVHVMSDLPTMNQPDVL EGRMPKKDNECFMDIDFMKMAGYKVGDTVQLESGTEEGLDKTLKHSSYRIVGAGSSPCYI SFERGSSTIGTGEVSGFIVVTPETFLTEVYTEGLIEVEGAKQETAFTGEYDSKVEKVSEK IKEIEDVQCRRRQQEIVSEAQDKISKSETELNKAKQTAKEELDSAQKTISDGEKKLTEGK KQLSDGQKAISEAKSKLSGKEKELLQAKKEYETGISQLQRGKETLEAKEKEFQKQYAEAQ KGIQTLQGTIKTLEKTLAEQRVQYNIIQKQIESLNPDIPEEKAQIEELKKKQAVIANAIQ KMEGQVSALKGQLQEIEAKLSAGQTAINQAKKEIAHNEEMLKKAGKQIADGEKQLQQGKT ELTSKEKEFKKAEEAIKKSEKDLAKGKEEYLAAKKEADAKISDGEKKIADAKKEIGTIGK PKWYVNNREVFPEYSGYGDNAERIKAIGKVFPLIFFLVATLISLTTMTRMVEEQRTQIGT MKALGYSRLSIAGKYINYALIATVGGSIAGVLIGEKLYPYVIVFAYQIMYVHMPNIVIPY NVKYAVMATVAAVACTMFATILACYKEMAEHPAVLMRPPSPKQGKRVFLEKIPFIWKRLS FIWKSTIRNLIRYKKRFFMTIFGIGGCMALILVGYGLKDSIFSISVIQYGEIQKYNLQSL LNEDASDEEKKALETYKETQSEIKDSLQIYMKNIKVGKGSTEKDVYMYVPSDTKEFDKFV TLKDRKTDEEYKLTETGAVLTEKMAKTLDVKKGDKVYIKDEDGRKKEVTVTAICENYMGH YIYLTKGMYEKMYGESPSFNSELYKLKNAAEKTTLQIGEDMLETKAVMNMQYTDNLQTRI 
DDMLKTLNSVIVVLIVAAGMLAFVVLYNLNNINITERKRELATIKVLGFYDGEVAAYVYR ENILLTLIGSIAGCGMGYLLHRFVIVTVEIDDVMFGRNIDFVSYVIALLFTFGFSIFVNW VMYYKLKKINMVESLKSVE >gi|330401250|gb|ADLB01000028.1| GENE 48 49665 - 51695 2286 676 aa, chain + ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. PCC 7120 # 278 676 216 654 658 291 38.0 4e-78 MGAKHGNLSINSENIFPIIKKWMYSDHDIFVRELISNGCDAITKLKKLDMMGEYSIPEDY KAKIEVVVNPEEKTLKFIDTGIGMTADEVEEYITQIAFSGATEFLEKYKDKTTEDDMIGH FGLGFYSAFMVADEVHIDTLSYKEGATSVHWTCDGGTEYDMEEGTKDTVGTEITLFLNED SVAFSNEYRVREVIEKYCSFMPVEIFLSKENAEQEYETILESELREDDVVVEHIHEEAKM EEKENENGEKEMVEVSPAVDKVKINKRPVSLSDTTPLWSKHPNECTKEEYIDFYRKVFMD YKEPLFWIHLNMDYPFNLKGILYFPKINTEYESIEGTIKLYNNQVFIADNIKEVIPEFLM LLKGVIDCPDLPLNVSRSALQNDGFVNKIADYISKKVADKLSGMCKTDRESYEKYWDDIS PFIKFGCLKDEKFCDKMSDYILFKNLEHKYMTLKDLVPETEKQENAEEKSEEKEKTTIYY VTDEQQQSQYINMFKKEGMDAVLLTHNIDSAFISQLEQRNENVKFQRIDADLTEGLKEEV SEEDAKVLEEKKETLTEVFRKALSNDKLSVKVEKLKHENVAAMVTLSEESRRMQEMMRMY NMYGMDPSMFGTDVTLVLNANHPLVQYVLENKDGEHTQMFCQQLYDLAMLSNKPLNPEEM TAFINRSNEIMMLLAK >gi|330401250|gb|ADLB01000028.1| GENE 49 51885 - 52760 940 291 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295109548|emb|CBL23501.1| ## NR: gi|295109548|emb|CBL23501.1| hypothetical protein [Ruminococcus obeum A2-162] # 110 277 1 168 182 99 34.0 3e-19 MAKFCTKCGRKLEEGEVCTCQQENNNYSEPQKVAANEVESQTTEPQPERWDGGQQTANQT QNETVSQTSEKTKEAEWISQKSTIVVNETKNVFAKIIPLLKHPVTETKKIADGKSSIPGI EFIAIKAAVVLIFTIIMLAKMDSALGGFVEIPTVTIIIMAILLTLGADCLEALMLKVFSG VLNGVTEQSAMFSVVGTRALYETLIYIVAGIVCFISANFGIIIIALTSILLPIVEFGSYR VLVQTSEDRKVYAYFIAKVIMAVISYIVIYLCGKEVLSSILGTLMGGLLNF >gi|330401250|gb|ADLB01000028.1| GENE 50 52787 - 53341 629 184 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|295107973|emb|CBL21926.1| ## NR: gi|295107973|emb|CBL21926.1| hypothetical protein [Ruminococcus obeum A2-162] # 79 182 202 304 308 64 38.0 2e-09 
MGGKKKIVAILLGIVDVLLIVALVLTFFIAGNEGKSVKANWNSDFKPNASRYEEVDETKE YKADKQIAFDGKKITAADSPTEKNVTEPQSELVFPDSDKTLLTDEMINEKVNDKQTLRLA INEIYARHGYQFTSEEYINHFNQFDWYKNMTKEPDMNKVSAGFSEIEKKNVEKLQAYSDA KGWS >gi|330401250|gb|ADLB01000028.1| GENE 51 53353 - 54588 1010 411 aa, chain + ## HITS:1 COG:no KEGG:EUBELI_00297 NR:ns ## KEGG: EUBELI_00297 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 1 402 1 433 440 85 24.0 5e-15 MFCPNCGTKCNDDDLFCGECGTSLAEYREEETDSAAAVKEADEQFERIVEMPEKKPKGSK INFLLIGEILVMVAVLVGIYFSLDKKYSAETTALEYWKAKSDCEWSKVYDYYSFGNEKEL SKQMYVNAHSQDNETIKYQSVTAKKNGGQKDSDSVTCGITYRESGSGKEEYETVSLVKGE KKFLFWNEWNVIPTDEVVDDWSLTVPENATVKLNGEEIKESKSSGREGMKKVTAPSLFQG QYQLEVSEEGMEPYRDIIQIDSNTLGERIELVPKEEEKEKLAEQFGKDLEKILTAAVSKE DFSKMKEIFSEEAMKRSFVKSSYEDLYRIMDTEGVVIRSFNIADVKLTLNNYYENKLTFD VSMKIKTTYKRFWSEQMETEEKENSGTVTYVKEEDGWKLQSLPISYYDLIY >gi|330401250|gb|ADLB01000028.1| GENE 52 54735 - 54899 63 54 aa, chain - ## HITS:1 COG:CAC0656 KEGG:ns NR:ns ## COG: CAC0656 COG3666 # Protein_GI_number: 15893944 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 1 54 136 189 189 66 61.0 1e-11 MKNDYEFQRFLLRGKSKVKLEILLLCMGYNINKLHAKIQKERTGSYLFLVKETA Prediction of potential genes in microbial genomes Time: Tue May 24 22:02:07 2011 Seq name: gi|330401197|gb|ADLB01000029.1| Lachnospiraceae bacterium 2_1_46FAA cont1.29, whole genome shotgun sequence Length of sequence - 17500 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 8, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 47 - 349 194 ## COG3666 Transposase and inactivated derivatives + Prom 269 - 328 8.6 2 2 Tu 1 . + CDS 363 - 467 90 ## - Term 402 - 460 6.3 3 3 Op 1 . - CDS 472 - 1257 364 ## COG0860 N-acetylmuramoyl-L-alanine amidase 4 3 Op 2 . - CDS 1281 - 1769 311 ## Cphy_3370 hypothetical protein 5 3 Op 3 . 
- CDS 1776 - 3086 995 ## COG0534 Na+-driven multidrug efflux pump - Prom 3113 - 3172 7.9 + Prom 3054 - 3113 7.6 6 4 Op 1 . + CDS 3199 - 3930 828 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases 7 4 Op 2 . + CDS 3946 - 5907 1876 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 8 4 Op 3 . + CDS 5909 - 7312 250 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 + Prom 7319 - 7378 5.9 9 4 Op 4 . + CDS 7428 - 8696 1487 ## COG0475 Kef-type K+ transport systems, membrane components + Term 8711 - 8777 18.3 + Prom 8764 - 8823 7.4 10 5 Op 1 . + CDS 8849 - 10561 1149 ## COG1404 Subtilisin-like serine proteases 11 5 Op 2 . + CDS 10580 - 10867 330 ## gi|210617495|ref|ZP_03291598.1| hypothetical protein CLONEX_03820 12 5 Op 3 . + CDS 10870 - 11328 366 ## COG0597 Lipoprotein signal peptidase + Term 11333 - 11364 3.4 13 6 Tu 1 . - CDS 11354 - 12289 678 ## COG1052 Lactate dehydrogenase and related dehydrogenases - Prom 12315 - 12374 10.9 + Prom 12268 - 12327 9.0 14 7 Op 1 . + CDS 12375 - 13214 709 ## COG2816 NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding + Term 13253 - 13289 1.4 + Prom 13220 - 13279 4.8 15 7 Op 2 . + CDS 13332 - 14186 1004 ## COG1284 Uncharacterized conserved protein + Term 14205 - 14251 8.5 + Prom 14241 - 14300 5.8 16 8 Tu 1 . + CDS 14418 - 16670 2208 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases + Term 16696 - 16731 3.1 + SSU_RRNA 17181 - 17500 99.0 # EF403348 [D:1..1494] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. 
Predicted protein(s) >gi|330401197|gb|ADLB01000029.1| GENE 1 47 - 349 194 100 aa, chain - ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 93 9 93 334 124 70.0 5e-29 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACKRASTLCGC >gi|330401197|gb|ADLB01000029.1| GENE 2 363 - 467 90 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDCGLFGVRFLFLGHKNGTVAPVDYSSTFATAPF >gi|330401197|gb|ADLB01000029.1| GENE 3 472 - 1257 364 261 aa, chain - ## HITS:1 COG:BH3665_2 KEGG:ns NR:ns ## COG: BH3665_2 COG0860 # Protein_GI_number: 15616227 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Bacillus halodurans # 57 260 5 179 180 73 32.0 3e-13 MRKVKKFFILIGSCLLLTGCSSSLVPSEITSFLQPAKHISPEITLSADAFVKPKQYTVAI DAGHQQKGNSELEPIGPGASERKAKVAGGTSGISTKVPEYQLTLDISLLLKQELLNRGYE VIMIREQNEVNISNAERAEIANRSGADIFIRVHANGDNNRSVSGALTIAPSQNNRYVKDI AVASRKLSESIIASYCDATGFKNRGVLTSDTMSGINWCKIPVTIIELGFMTNPAEDKNMQ NTSTQNLMVSGIANGIDAYFK >gi|330401197|gb|ADLB01000029.1| GENE 4 1281 - 1769 311 162 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3370 NR:ns ## KEGG: Cphy_3370 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 160 1 163 164 123 40.0 3e-27 MILLSISEVKEFMSKLLLSDTFDSFLFIEGEIVTFNTFSINGYLQKDFFDKDMIPERNYS LWKELREYCFSLIRGKRTPLRFKFVFGLSEPNIEKLLRQQGLSFTPQDVQGLYLNISYDG HSLRCVTGTSMNLFTLDKSLEEAWDKMVQKFFVQKEISFELM >gi|330401197|gb|ADLB01000029.1| GENE 5 1776 - 3086 995 436 aa, chain - ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 2 436 4 438 443 214 31.0 2e-55 MNIKSFTEEPVSKIFFRYLFPSICGTMVTSIYVLADTIIIGKGIGADAMAALNIVLPLFN 
VFFGTGLLFGVGGSVLMSIARGKGDYQTGNRYFSTAVLLNAVISIIYMIIFLIFMEDMAR FLGATEITLPYIMDYAPYVIWGLGFFSFSTFLQTFVRNDGAPKLSMIAVVTGGITNVFLD ILFVFTFDMEMAGASIASVIGTIITTCILLFHFFSKRNGLHFTLKGLSFSFVKQIFKNGF TSFILEMSAGIVMFFFNIQILKYIGDIGVTMYGVIANTSIVVVCLCNGINQASQPIISIN HGAGLDERIRTVKKLGLKTAFIICSVPAILGLIVPNMFTNIFLNPNAEILALSSTAIRIY FTGFFVIGINMFIVGYFQSTVKPQLSLSLCLIKSCVLSVLFVYILPLFFGVTGIWAAVPL AEFITLGLGIYFLKRR >gi|330401197|gb|ADLB01000029.1| GENE 6 3199 - 3930 828 243 aa, chain + ## HITS:1 COG:STM4193 KEGG:ns NR:ns ## COG: STM4193 COG1187 # Protein_GI_number: 16767443 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # Organism: Salmonella typhimurium LT2 # 7 239 8 242 289 254 57.0 9e-68 MKEELIRLNKFLSEAGVCSRREADRLIEAGKVFVDGRRAETGMRVSPAQEIKVGKKVIHK GNEMVLLAVNKPAGIVCTEEKKEKHNIIRFLNYPTRITYIGRLDKDSEGLLLMTNNGDII NKMMRAGNRHEKEYKVTVNKPVTREFIEKMGQGVPILDTVTRPCKVKAIGKYKFNIVLTQ GLNRQIRRMCEYFGYKVTRLERVRVMNIQLGSLKPGEYREVTDDEIRELYELIKDSSNET VIE >gi|330401197|gb|ADLB01000029.1| GENE 7 3946 - 5907 1876 653 aa, chain + ## HITS:1 COG:SPy0751 KEGG:ns NR:ns ## COG: SPy0751 COG0272 # Protein_GI_number: 15674800 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Streptococcus pyogenes M1 GAS # 5 650 2 652 652 349 34.0 1e-95 MTDKKKRMQELIALLNEAGKAYYQGATEIMSNFQYDKLYDELAELEKELNVTLSNSPTVN VGYEVLSELPKERHEKPMLSLDKTKEVSSLKEFLGNQKAVISWKLDGLTIVLTYRDGELQ KAVTRGNGEVGEVITNNAKVFKNIPLHISYKGELVLRGEAVIGYKDFERINEEIEDADAK YKNPRNLCSGSVRQLNNEITAKRNVRFFAFSLVKAEGVDFKNSRNEQMRWLSQQGFDVVE HEEVTSADIEEKVEMFAHKIEKNDFPSDGLVLVYDDIAYGQSLGTTAKFPRDSFAFKWAD EIRETTLREIEWSPSRTGLINPVAIFDSVELEGTTVSRASVHNISIMEELELGIGDTIQV YKANMIIPQIAENLTRSGVQDIPDVCPVCREKTEIRKVNNAKALYCTNPECQAKRIKSFA LFASRDALNIDGMSEATLEKFILEGFIKEYADIFHMDRYEEKIKSMEGFGEKSYNKLQAS VEKARTTTLPKVIYSLGIANIGLANAKMICKEFQNNADEIMNATPERLNQIDGVGEVIAG TFVEYFSVEKHRKEFQNLLRELTIPKESAEEGKQIFDGIQFVITGSVNHFANRNEVKEVI 
ESKGGKVTGSVTSKTNYLINNDTASTSSKNKKARELGIPIISEEEFLEMLRKG >gi|330401197|gb|ADLB01000029.1| GENE 8 5909 - 7312 250 467 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 221 447 267 448 466 100 31 5e-21 MLENENNVKDTELQETNDKQEEYEKFCFMCRRPESVAGKMIELPNHIHVCSDCMQKSFDA MNNGQIDYSKLMNIPGIQVMNVEDLENMMPKQQKVKKKKPKEEAKPALDIKNIPAPHKIK ASLDEYIVGQEHAKKAMAVAVYNHYKRVATNTMDDIEIEKSNMLMIGPTGCGKTYLVKTL ARLLDVPLAITDATSLTEAGYIGDDIESVVSKLLAAADNDVEKAEQGIIFIDEIDKIAKK KNTNQRDVSGESVQQGMLKLLEGSDVEVPVGANSKNAMVPLTTVNTKNILFICGGAFPDL EEIIKERLNKQASVGFMADLKDKYNNDKNLLEKVEVEDLRNFGMIPEFIGRLPVIFTLQG LDRDKLVKILKEPKNAILKQYQKLLALDEVKLEFDEEALGAIAEKAMKKDTGARALRAII EEFMLDIMYEIPKDDNIGQVTITKAYIEGTGGPVITLRGQAAIEQKL >gi|330401197|gb|ADLB01000029.1| GENE 9 7428 - 8696 1487 422 aa, chain + ## HITS:1 COG:CAC0444 KEGG:ns NR:ns ## COG: CAC0444 COG0475 # Protein_GI_number: 15893735 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Clostridium acetobutylicum # 5 390 3 379 393 236 41.0 9e-62 MEEYRYLLDVAIILLSTKVLGLATRRVQMPQVVGALLAGLVLGPAMLGILTETSFIHEIA EIGVIVLMFCAGMETDIKELKESGKASFVIALCGVIVPLIGGFAIAYFFNKPELISSDAS CNIFLQNIFIGIILTATSVSITVETLKELGKLKTRSGNAILGAAIIDDILGIIALTIVTS MADSSVSIAVVLLKIAGFFVFAGIIGVLFYKLFKKWSETSNKPLRRHAIIAFVFCLLMAF IAEVGFGVADITGAFIAGLIISNTSRSAFVASKFDTLSYMLLSPIFFASIGLKVVLPKMS VAVVVFSVLLCIIAILTKIVGCGLGAKICGYENYQCKRIGVGMISRGEVALIVASKGEAL GLLGSNFLGPVIIVVVITTIITPILLKFVFKSSPANAQPSDYAGSNKFAEYDDSVRGMRE EK >gi|330401197|gb|ADLB01000029.1| GENE 10 8849 - 10561 1149 570 aa, chain + ## HITS:1 COG:CAC3245 KEGG:ns NR:ns ## COG: CAC3245 COG1404 # Protein_GI_number: 15896490 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Clostridium acetobutylicum # 8 569 536 1099 1118 282 33.0 1e-75 MLEDAALCKEQILSEDFRDFIYERENPYFLEQISTSRVCEQEIGFYYRAIYINKIFGDPV 
SLERFRYGSIPKCYTLLGTETLNQAGITQVQNYPTLRLKGEGVMIGIIDTGIDYENPLFR NADGTTRIAGIWDQTIQTGREPEGFLYGSEYVREEIDEALRSDNPKEIVPTTDTDGHGTF VASIAGGGENVEEDFLGAAPRATFGIVKLKEAKNYLKQFYFINEDAKCYQENDIMLGMRY LEILADRNQMPLIFCIPLGTNLGDHNGTSPLGNLLSYYSNEHNMAVVLGGGNEANQRHHY YGKLEERRDKDSVEIRVERGGEGFTLEMWTDIPNIFMVSIISPAGEKIPFPTVKQSESTF VHSFVFERTTVYVDYRLLVERTNSELAFIRFENAVEGIWRIEVEAIQVADGIFHMWLPVT EFLKGEVYFLLSNPDWTITEPGCVWTVMTAANYNGNNNSIAIDSGRGYTRNGTLKPDFAA PGVNVTGVTNRNQFVERSGSSISAAITAGASALLMEWLYYQIGRRNVDTVQIKNLLILGT NRIDSEEYPNRLWGYGTLDLYKTFDELRSF >gi|330401197|gb|ADLB01000029.1| GENE 11 10580 - 10867 330 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210617495|ref|ZP_03291598.1| ## NR: gi|210617495|ref|ZP_03291598.1| hypothetical protein CLONEX_03820 [Clostridium nexile DSM 1787] # 1 83 1 83 158 74 51.0 2e-12 MKDLWEDIVDRISETAETVGKRAGEVVETQKIKGKIRNLERSNRRDFRDLGRIVYERYQR GEVQDEDFLELCENIAEREQEIKVCEYEMKDIFED >gi|330401197|gb|ADLB01000029.1| GENE 12 10870 - 11328 366 152 aa, chain + ## HITS:1 COG:VC0683 KEGG:ns NR:ns ## COG: VC0683 COG0597 # Protein_GI_number: 15640702 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Lipoprotein signal peptidase # Organism: Vibrio cholerae # 42 151 53 162 171 59 35.0 3e-09 MLIICCLSIVGIAILDLWIKSRVEKKIKRGEEIPVCRGKILLRHVHNEGMALNMLDRYPK AVKWISIVMTGIVAVYALCLFGKHKNGLEKISLSFLIGGAVSNLYDRIKRKYVVDYIGFK TKWKKLTDITFNIGDFSIFAGAIGFILSGFKK >gi|330401197|gb|ADLB01000029.1| GENE 13 11354 - 12289 678 311 aa, chain - ## HITS:1 COG:Cj0373 KEGG:ns NR:ns ## COG: Cj0373 COG1052 # Protein_GI_number: 15791740 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Campylobacter jejuni # 1 308 1 306 311 261 46.0 1e-69 MKIVFLDAKTIGDDINLSPFEKLGEIVKYDFSTPEQIPDRVKDADVLILNKVAINHSTIH TAKHLKLVCVTATGTDNLDKEYLEKNHIAWRNVAGYSTNSVAQHTFAMLFYLLEKLNYYD 
GYVKGGHYVNDRIFTHFEEKFSEIEGKTWGIIGLGNIGRRVAQIAECFGANVIYYSPSGS PAQKGYHQVDFDTVLTQSDILSVHAPLTDKTKNLMNKSAFAKMKSSAIFLNLGRGAIVVE SDLAYALENQLIAAAGLDVLCTEPMEAENPLLQIEDSRKLLITPHIAWASIEARTRLMHI ILNQIKEFFHI >gi|330401197|gb|ADLB01000029.1| GENE 14 12375 - 13214 709 279 aa, chain + ## HITS:1 COG:CAC3396 KEGG:ns NR:ns ## COG: CAC3396 COG2816 # Protein_GI_number: 15896637 # Func_class: L Replication, recombination and repair # Function: NTP pyrophosphohydrolases containing a Zn-finger, probably nucleic-acid-binding # Organism: Clostridium acetobutylicum # 95 255 91 250 271 142 40.0 8e-34 MIQDIYPHIYRNEYKPKAPQKESFLLIYKGREALLKESEEICFPTFGEMELYDSALCEKC TYLFSVDNQSFYLAEDFYGDIEGYCWKDAEMFRRLKPQHLAFAGITGCQLHRWYQSHKFC GRCGRKMVKDDRERMLKCEHCGQMEYPKISPAVIVGVTDGNRLLMSKYAGREYKKYALIA GYAEIGETIEETVKREVMEEVGLKVKNIRFYKSQPWSFTDTLLLGFFADLDGEDKITLDK EELALAEWFEREDIPITERNISLTNEMILYFKEKKNIDI >gi|330401197|gb|ADLB01000029.1| GENE 15 13332 - 14186 1004 284 aa, chain + ## HITS:1 COG:CAC0848 KEGG:ns NR:ns ## COG: CAC0848 COG1284 # Protein_GI_number: 15894135 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 281 8 285 292 127 30.0 2e-29 MEKNRRKEFAIDILVDIVGNLLIAIGVYNFAANSGFPVAGISGIAMIFYHLFGLPIGTMT IILNIPIVIMCYKLLGKGFLLRSIKTMIIAWPLMDYVAPMLPVYSGDRMLSAICVGVFSG LGYAMIYMRNTSTGGADFIIMSVRALRPHLSIGKITFITDVIIVGLGGLLFGDVDSIIYG LILTYILSVVVDKVMYGIDAGKMTLVVTDHGHEVAKRIDELTQRGSTFLRGVGSYSGEEK LVVMCACSNKEMHMVQKAVKEVDEEAFLITMESNEVRGEGFKPH >gi|330401197|gb|ADLB01000029.1| GENE 16 14418 - 16670 2208 750 aa, chain + ## HITS:1 COG:TM1878 KEGG:ns NR:ns ## COG: TM1878 COG0737 # Protein_GI_number: 15644621 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Thermotoga maritima # 39 506 22 459 508 120 27.0 1e-26 MKRKQKTLLSFLMIMAMLLVNAPQVQAENNNPYANWDTIKVFETTDVHGYITDVSTYKEE TFQYRLAYIAKLVNDARANSEYEGVLLLDGGDIYQSTPHSNLTFGAPMRAAYDKMKYDAV 
ALGNHEFDWDVKKYAADEKGTMAPYEIGNYKGDSDIPVLANNLYYKENGQRVPFTQDYTV VKKGKYDIAIVGYVEDYSNDIKASQIAPYKIDGDLNKLSERTKEVKEKTKADVVMVLAHG EPGPIAEAMNPDVVDLVLGGHSHEKNVGTAKNGIDYIQGNSKAYGYATAEVKINPENNDV EVIDPTYVDTTAKENLQNLYYQNGNNPHLDKEITKISQAAWDEVKDEMYEVLCTVDKNIT KTPIQEGTTTSIAGNWLAGLMLDATKEQNTVAAFTNSGGIRSDILLEEGASTRNVTVADI YTITPFGNRLYTYEITGKQMAQQLENALLGTKDSSNFGDQFSGITATYEKVNGKVKVTSI TTDRGEPISIYDTEKKYPVCVNEYCATLEGSVFKELKPIVSPDDAPIDNLSAIQALRDRR EAQGLKMELDTKVRAIEQPAGTYLAEEITRLEKDAQAYRVETVKSSDTEKIRLLDEKGKE LLANPDISEEQKNQVNKVLEHTKALLERISMAQQALQTENIKKADKINTDNVKVTDKTVL ENAKRDLETAEKLYKQNYTEEELAQLKTKLQQVQKNLDAIKKMQNTHAPENNGGGAVNPK TADATQVTIWLIMMMAATGSLTALRYKKKN Prediction of potential genes in microbial genomes Time: Tue May 24 22:02:25 2011 Seq name: gi|330400975|gb|ADLB01000030.1| Lachnospiraceae bacterium 2_1_46FAA cont1.30, whole genome shotgun sequence Length of sequence - 13085 bp Number of predicted genes - 16, with homology - 13 Number of transcription units - 8, operones - 4 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 1 - 319 99.0 # EF403348 [D:1..1494] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. 1 1 Tu 1 . - CDS 676 - 762 69 ## - Prom 818 - 877 2.6 - Term 839 - 888 9.3 2 2 Op 1 . - CDS 940 - 1659 630 ## Cphy_0053 hypothetical protein 3 2 Op 2 . - CDS 1668 - 2792 1076 ## EUBELI_01971 hypothetical protein 4 2 Op 3 . - CDS 2845 - 3840 883 ## EUBELI_01972 hypothetical protein - Prom 3888 - 3947 3.8 - Term 3919 - 3957 0.8 5 3 Tu 1 . - CDS 3964 - 4476 225 ## COG1418 Predicted HD superfamily hydrolase - Term 4486 - 4526 9.2 6 4 Op 1 12/0.000 - CDS 4533 - 5471 1365 ## COG3958 Transketolase, C-terminal subunit 7 4 Op 2 . - CDS 5459 - 6295 1141 ## COG3959 Transketolase, N-terminal subunit 8 4 Op 3 . - CDS 6363 - 7016 870 ## COG0176 Transaldolase - Prom 7072 - 7131 8.6 - Term 7090 - 7144 4.0 9 5 Op 1 . 
- CDS 7243 - 8331 824 ## COG1609 Transcriptional regulators 10 5 Op 2 23/0.000 - CDS 8347 - 8943 765 ## COG0353 Recombinational DNA repair protein (RecF pathway) 11 5 Op 3 30/0.000 - CDS 8960 - 9313 172 ## PROTEIN SUPPORTED gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 12 5 Op 4 . - CDS 9327 - 10877 1736 ## COG2812 DNA polymerase III, gamma/tau subunits - Prom 10997 - 11056 7.1 + Prom 10956 - 11015 7.2 13 6 Tu 1 . + CDS 11038 - 11901 967 ## COG0024 Methionine aminopeptidase + Term 11905 - 11946 6.6 - Term 11893 - 11934 10.4 14 7 Op 1 . - CDS 11952 - 12203 200 ## 15 7 Op 2 . - CDS 12280 - 12528 181 ## - Prom 12603 - 12662 5.3 - Term 12675 - 12712 6.4 16 8 Tu 1 . - CDS 12734 - 13084 225 ## COG3436 Transposase and inactivated derivatives Predicted protein(s) >gi|330400975|gb|ADLB01000030.1| GENE 1 676 - 762 69 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLTEEIFDDIIDKHSRDGQTSQKQNSKK >gi|330400975|gb|ADLB01000030.1| GENE 2 940 - 1659 630 239 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0053 NR:ns ## KEGG: Cphy_0053 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 238 1 228 233 130 31.0 3e-29 MNWLLIGVGIIFFICMAIGYAKGFIKIIVSFGTTIASLALVIFLTPYTGKAIVAMTPIDE MVKEKCISMMIPEGVDIDISEIPLDKIELPRQKQMEILEKADIPEFLKKGLMENNNNEAY KQLGVSNFMDYIGAYVSDIIVKIISFLITFLVVTIFIRAIIFALDIITALPVINGLNRVA GIAVGGLIAVILVWIAFLIITLLYNSDIGKECFRCIENSEFLTFLYEKNVILDWVTKLR >gi|330400975|gb|ADLB01000030.1| GENE 3 1668 - 2792 1076 374 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01971 NR:ns ## KEGG: EUBELI_01971 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 4 372 47 421 422 181 27.0 5e-44 MEGKKKKKFLIKLLLIGAALILFAAIGIYIFLTHRTYNNIRVLNTVTLERDNGEKYAEFA DGILKYSKDGVALLDRKGKERWNFSYQMQNPVVSVSEKSAVIADKGANDLVVLQSKGVKG EIHTSLPVEKAEVSRQGIVSAILRDGVSAKIVCYDAEGNLLVEHVTSPTTTGYPLDAAIS DDGYTLLVSYLFIEQGKMTTKVAYYSFDEKTEKKENHLIMEEKYENTIAPSVFFLDNETS AVVSDNQLHIYEGKEKPKKKQTVQIKEQIQNVFHDDEYIGMLLKNEKNYIVKLYDKNGKE 
LLSKTVEQEYKNVKVEDGQVIMYDGKKCQILTKAGVRLFDGEMENDIANIISVFGVNKYL VINANGLEEIRLAN >gi|330400975|gb|ADLB01000030.1| GENE 4 2845 - 3840 883 331 aa, chain - ## HITS:1 COG:no KEGG:EUBELI_01972 NR:ns ## KEGG: EUBELI_01972 # Name: not_defined # Def: hypothetical protein # Organism: E.eligens # Pathway: not_defined # 6 331 34 392 392 158 30.0 3e-37 MERKIPKNVRQIGNVSDSLKIYVEDYVDTYLNQLCEKEAEIPVGAFLIGDIVREEELEYQ YIYGAIQMEELGQDENGVFVNEETWRNACEVCQTFFENGEIIGWFAAIPGIPFAMNGNLK KVHQEIFAKDKGIFILKDPVCKDEMYFAPKFNDLMQMNGHYIYYEKNPSMQNYMISSRKK IGVTPSEIVVDRAAKSFRDLVQTKMLQQEKKEISKWTYGAATFLVLVILVIGVTMINNYD RMKSVQTAIETISNSVGKEEEVHAKEENKKEEKTEEKKEEAQNVYIVEKGDTLEKISKKA YGDTQHIEAISKMNGLENGNLIYIGQKLLLP >gi|330400975|gb|ADLB01000030.1| GENE 5 3964 - 4476 225 170 aa, chain - ## HITS:1 COG:CAC1667 KEGG:ns NR:ns ## COG: CAC1667 COG1418 # Protein_GI_number: 15894944 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 20 166 21 164 173 142 48.0 3e-34 MKREKLVQKIYKLREEWKYDFYECVQDIAEHPVVLRMKLYPHHGTTSCYKHCMNVAYYNY QWCRFFHLDAKSAARGAMLHDLFLYDWHTHAKKTGDKFHGLTHPKTALKNAEKFFELNDI ERDIIYSHMWPVTFFRFPKTKEGFITTLTDKYCGACESSKRRKREKRREK >gi|330400975|gb|ADLB01000030.1| GENE 6 4533 - 5471 1365 312 aa, chain - ## HITS:1 COG:FN0295 KEGG:ns NR:ns ## COG: FN0295 COG3958 # Protein_GI_number: 19703640 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, C-terminal subunit # Organism: Fusobacterium nucleatum # 4 310 1 306 309 344 59.0 1e-94 MSDVKKIATRESYGNALVELGKEHEDLVVLDADLAAATKTAMFQKVFPERHIDCGIAECN MVGVAAGLAATGMVPFASSFAMFAAGRAFEQIRNSVGYPKLNVKIGATHAGISVGEDGAT HQCNEDIALMRTIPGMVVINPSDDVEARAAVKAAYEHHGPVYLRFGRLAVPVINDNEEYK FELGKGITLKDGKDVTIIATGLPVSESLEAAEMLEKDGISVRVINIHTIKPLDEEIIEKA AKETGKLVTVEEHSVIGGLGSAVCDVVAEKAPAKVMKIGINDVYGESGPALELIKKYGLD AESIYKKVKEFV >gi|330400975|gb|ADLB01000030.1| GENE 7 5459 - 6295 1141 278 aa, chain - ## HITS:1 COG:FN0294 KEGG:ns NR:ns ## COG: FN0294 COG3959 # Protein_GI_number: 19703639 # 
Func_class: G Carbohydrate transport and metabolism # Function: Transketolase, N-terminal subunit # Organism: Fusobacterium nucleatum # 6 268 7 269 270 329 59.0 4e-90 MNKLELMKVANEVRKGAVTAVYNAKSGHPGGSLSAADIYTYLFFEEMNIDPADPKKDDRD RFVLSKGHTAPGYYAALANRGFFPVEDLKTLRKVGSYLQGHPDMKHIPGVDMSSGSLGQG ISAAVGMAISGKLRKKDYRVYTLLGDGEIQEGQVWEAAMLAAHHNLDNLVVIVDNNNLQI DGSIDEVNSPYPIDKKFEAFNFHVINVDGHDFDALDAAFKEARETKGQPTAIIAKTIKGK NVSFMENQASWHGAAPNEEQYAVAMADLEKVGEALCQM >gi|330400975|gb|ADLB01000030.1| GENE 8 6363 - 7016 870 217 aa, chain - ## HITS:1 COG:CAC1347 KEGG:ns NR:ns ## COG: CAC1347 COG0176 # Protein_GI_number: 15894626 # Func_class: G Carbohydrate transport and metabolism # Function: Transaldolase # Organism: Clostridium acetobutylicum # 1 217 1 215 215 284 71.0 1e-76 MKFFIDTAKVEEIKKANDMGVICGVTTNPSLIAKEGRVFEEVLAEIATIVDGAISGEVKA TTVDAEGMIEEGRKIAAIHKNMVVKIPMTEEGLKACKVLSSEGIKTNVTLIFTANQALLA ARAGATYVSPFLGRLDDISVRGVDLISEIAEIFAVAGIDTQIIAASVRNPMHVTDCALAG ADIATVPYKVLEKMIHHPLTDQGIEKFQADYRAVFGE >gi|330400975|gb|ADLB01000030.1| GENE 9 7243 - 8331 824 362 aa, chain - ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 5 357 2 333 337 132 29.0 1e-30 MGNRKITSSDIAREAGVSQSTVSMILNKKYNVSFSKETVKRVEETANRLGYAVPKRKIRK NSRKEKMLVAFCPNLTNPYYVMLLQGIESRANEKGYGLFVCNTRRNLKMEERYLKMMPTL QPSGIIYMCNPSSCFMEQVEQLAERVPIVIINNQNEQLKVDAVELNNEKLGRMMAKHLLE LGHRKVAYIAPPLTRRQKQRFKRVDGFLKEFEKRGLSRSVIVKAAAEEFDRNVQDIDSEY RIGYELTKELLKEEKDLTAIVGLNDMIAFGILEALEEEKYKVPNDMSVMGCDNTLFARMK KISLTTIEHFAIYKGRDACDIIMQKILSQAGKYSEIQPVSIYHVEYEPKVIPRGTTSYAK VN >gi|330400975|gb|ADLB01000030.1| GENE 10 8347 - 8943 765 198 aa, chain - ## HITS:1 COG:CAC0127 KEGG:ns NR:ns ## COG: CAC0127 COG0353 # Protein_GI_number: 15893423 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # Organism: Clostridium acetobutylicum # 1 198 1 198 198 242 56.0 3e-64 
MDYYSSQISKLIEELSRLPGIGAKSAQRLAFHIIHMPKEQVEQMANAMVEARNNVRYCSK CYTLTDREVCPICSDDSRDEKTIMVVETTRDLVAYEKTGKYRGVYHVLHGAISPMLGIGP GDIKLKELMERLQGDVDEVIIATNSSLEGETTAMYISKLIKPTGIKVSRIASGVPVGGDL EYIDEVTLLRALEGRTEL >gi|330400975|gb|ADLB01000030.1| GENE 11 8960 - 9313 172 117 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149916415|ref|ZP_01904934.1| 30S ribosomal protein S21 [Roseobacter sp. AzwK-3b] # 10 109 6 105 114 70 35 5e-12 MAKRGGFPGGMPGNMANLMKQAQKMQRQMEEQAKEMESKEFTATAGGGAVEVTVSGTRAV TKVKLQEEVVDPDDIEMLEDLIMAATNEALRKVDEESSSAMAKLTGGLGGGMPGMPF >gi|330400975|gb|ADLB01000030.1| GENE 12 9327 - 10877 1736 516 aa, chain - ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Bacillus subtilis # 1 430 1 430 563 362 44.0 1e-99 MSYTALYRKFRPSEFDDVKGQDHIVTTLKNQIKADRIGHAYLFCGTRGTGKTTVAKIFAK AVNCQHTEDGSPCGECEMCRSIALGASMNVIEIDAASNNGVDNIREIREEVAYRPTEGKY KVYIIDEVHMLSIGAFNALLKTLEEPPEYVIFVLATTEAHKIPITILSRCQRYDFKRISI DTIADRLRELMTEENVEVEEKAIRYVAKMADGSMRDALSLLDQCIAFYLGQKLTYDHVLE VLGAVDTDVFSRLLRQILKRDVPAVLKSVEELVMQGRELTQMVNDFTWYLRNLLLAKTSD NMEEVLDVSSENLQQLKEEAEMIEVDILLRYIRIFSDLSNQIKYAVQKRILLEVTLIKLC KPAMETNPDTLLDRIRAVEEKVEQAGEIQVVREVAEPKREPMPKPVLEKAIPEEIKAVVK NFRSIVNETSGLLKGYLKNARLSLGGENKLLVILPDELSADFVRQEQHKRELTELIENRI GKAITIEVRQIEEGRRFEESFVDIEQVVHMDITIED >gi|330400975|gb|ADLB01000030.1| GENE 13 11038 - 11901 967 287 aa, chain + ## HITS:1 COG:PA3657 KEGG:ns NR:ns ## COG: PA3657 COG0024 # Protein_GI_number: 15598853 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Pseudomonas aeruginosa # 42 286 5 248 261 241 48.0 8e-64 MERNDACWCGSGKKYKKCHMFIDEKIALLEEQGHIVPTRDILKTPEQIEGIKKSAELNTA VLDHVAKHICAGMSTAEIDKLVYDYTTEHGGIPAPLGYDGFPKSVCTSLNNEICHGIPDE NIILAEGDIINVDVSTILNGYFSDASRMFTIGKLSERAEKIVRVTEECVERGLEMAKPWG HLGDIADAINSHAQANGYSVVEDIGGHGIGLEFHEDPFVSYVTPKGSEMVLVPGMIFTIE 
PMINEGSPDFFVDEDNGWTVYTEDDGLSAQIEYMVLITEDGAEVLTK >gi|330400975|gb|ADLB01000030.1| GENE 14 11952 - 12203 200 83 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNIKQYFPYIIDIIIYIIIGVGLVRNWIAINLVLLILLPKITFDEFREWKENRKISTFI SFLALFVALILQVVLIIKTGKLL >gi|330400975|gb|ADLB01000030.1| GENE 15 12280 - 12528 181 82 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNKKQYLSLIVNLIVVIIIGTILVYHWLPENAVFFIIFLGLLFNLFCERKEKKTFLNYL SVFVTLLYFITVLVLIIKNVKL >gi|330400975|gb|ADLB01000030.1| GENE 16 12734 - 13084 225 116 aa, chain - ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 3 116 335 450 450 92 40.0 2e-19 PGSNGKLKKAITYINNYQEYLQTYLEDGRCSLSDNLSENAIRPVTIGRKNWLFSDTAEGA KANALYLTIVEMAKTYNLNLYEYLKFLFEHRPNKDMSDEEFENLAPWNEHVQELCK Prediction of potential genes in microbial genomes Time: Tue May 24 22:03:09 2011 Seq name: gi|330400709|gb|ADLB01000031.1| Lachnospiraceae bacterium 2_1_46FAA cont1.31, whole genome shotgun sequence Length of sequence - 56688 bp Number of predicted genes - 61, with homology - 53 Number of transcription units - 26, operones - 16 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 173 144 ## EUBREC_3236 transposase 2 1 Op 2 . - CDS 163 - 576 238 ## EUBREC_3235 hypothetical protein - Prom 723 - 782 7.8 - Term 853 - 888 4.3 3 2 Op 1 . - CDS 904 - 1317 392 ## 4 2 Op 2 . - CDS 1307 - 1396 57 ## - Prom 1438 - 1497 11.8 - TRNA 1702 - 1772 75.8 # Gly TCC 0 0 - TRNA 1794 - 1866 81.3 # Phe GAA 0 0 5 3 Tu 1 . - CDS 2015 - 3178 1380 ## COG0426 Uncharacterized flavoproteins - Prom 3214 - 3273 5.5 - TRNA 3338 - 3424 65.5 # Leu TAA 0 0 + Prom 3388 - 3447 5.2 6 4 Tu 1 . + CDS 3512 - 3604 72 ## + Term 3643 - 3685 2.5 - TRNA 3513 - 3594 54.9 # Tyr GTA 0 0 7 5 Op 1 . - CDS 3652 - 3801 115 ## 8 5 Op 2 . - CDS 3839 - 4249 388 ## CPF_1302 hypothetical protein 9 5 Op 3 . 
- CDS 4313 - 4873 519 ## gi|160914690|ref|ZP_02076904.1| hypothetical protein EUBDOL_00697 10 5 Op 4 . - CDS 4876 - 5328 476 ## Elen_2849 hypothetical protein - Prom 5435 - 5494 4.6 + Prom 5299 - 5358 5.7 11 6 Op 1 . + CDS 5516 - 5980 400 ## gi|167767649|ref|ZP_02439702.1| hypothetical protein CLOSS21_02182 12 6 Op 2 . + CDS 5980 - 6183 252 ## SMU.1031 putative transposon excisionase; Tn916 ORF1-like 13 6 Op 3 . + CDS 6262 - 7455 794 ## COG0582 Integrase - Term 7525 - 7584 1.1 14 7 Op 1 . - CDS 7832 - 9874 1742 ## COG0366 Glycosidases 15 7 Op 2 11/0.000 - CDS 9890 - 11725 1252 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 16 7 Op 3 10/0.000 - CDS 11743 - 12270 675 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 17 7 Op 4 1/0.143 - CDS 12254 - 13666 850 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 18 7 Op 5 . - CDS 13740 - 15131 1245 ## COG2208 Serine phosphatase RsbU, regulator of sigma subunit - Prom 15163 - 15222 6.3 - Term 15182 - 15226 5.5 19 8 Op 1 . - CDS 15238 - 15564 366 ## gi|210613794|ref|ZP_03289908.1| hypothetical protein CLONEX_02119 20 8 Op 2 . - CDS 15623 - 15952 310 ## gi|225571160|ref|ZP_03780158.1| hypothetical protein CLOHYLEM_07248 21 8 Op 3 . - CDS 15959 - 16228 291 ## EUBREC_0530 hypothetical protein - Prom 16265 - 16324 2.0 22 9 Op 1 1/0.143 - CDS 16331 - 16552 294 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) - Term 16576 - 16618 10.5 23 9 Op 2 . - CDS 16633 - 16935 167 ## PROTEIN SUPPORTED gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 - Prom 16972 - 17031 2.5 24 10 Op 1 . - CDS 17041 - 18774 2376 ## COG1109 Phosphomannomutase 25 10 Op 2 . - CDS 18808 - 19812 687 ## Cphy_3824 CotS family spore coat protein - Prom 19832 - 19891 1.8 26 11 Op 1 . - CDS 19893 - 21227 1308 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 27 11 Op 2 . 
- CDS 21242 - 21484 333 ## COG0526 Thiol-disulfide isomerase and thioredoxins 28 11 Op 3 . - CDS 21493 - 22431 898 ## COG0523 Putative GTPases (G3E family) 29 11 Op 4 . - CDS 22435 - 22773 350 ## COG1733 Predicted transcriptional regulators - Prom 22842 - 22901 6.5 + Prom 22803 - 22862 6.7 30 12 Tu 1 . + CDS 22914 - 23546 706 ## COG2910 Putative NADH-flavin reductase + Term 23551 - 23582 3.4 - Term 23539 - 23570 3.4 31 13 Tu 1 . - CDS 23575 - 25017 1395 ## COG0442 Prolyl-tRNA synthetase - Prom 25043 - 25102 2.8 - Term 25056 - 25110 11.1 32 14 Tu 1 . - CDS 25157 - 25249 104 ## - Prom 25313 - 25372 7.5 - Term 25429 - 25468 5.0 33 15 Op 1 . - CDS 25500 - 26390 879 ## Acfer_1503 tetracycline resistance leader peptide 34 15 Op 2 . - CDS 26463 - 27116 264 ## PROTEIN SUPPORTED gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family - Prom 27136 - 27195 8.5 - Term 27150 - 27202 11.8 35 16 Op 1 . - CDS 27216 - 27500 482 ## COG2088 Uncharacterized protein, involved in the regulation of septum location 36 16 Op 2 7/0.000 - CDS 27518 - 28636 1330 ## COG0448 ADP-glucose pyrophosphorylase 37 16 Op 3 . - CDS 28639 - 29910 1144 ## COG0448 ADP-glucose pyrophosphorylase - Prom 30052 - 30111 10.1 + Prom 30011 - 30070 9.5 38 17 Tu 1 . + CDS 30126 - 31511 1065 ## COG0773 UDP-N-acetylmuramate-alanine ligase + Prom 31550 - 31609 5.8 39 18 Op 1 3/0.000 + CDS 31651 - 32700 856 ## COG3935 Putative primosome component and related proteins 40 18 Op 2 . + CDS 32716 - 33711 566 ## COG1484 DNA replication protein 41 18 Op 3 . + CDS 33711 - 34895 1224 ## COG0462 Phosphoribosylpyrophosphate synthetase + Term 34904 - 34946 9.7 - Term 34888 - 34934 1.8 42 19 Op 1 5/0.000 - CDS 34935 - 36422 1064 ## COG4717 Uncharacterized conserved protein 43 19 Op 2 . - CDS 36429 - 37484 858 ## COG0420 DNA repair exonuclease 44 19 Op 3 40/0.000 - CDS 37420 - 38610 1029 ## COG0642 Signal transduction histidine kinase 45 19 Op 4 . 
- CDS 38600 - 39289 609 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Term 39308 - 39348 7.3 46 19 Op 5 . - CDS 39360 - 40751 1651 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 40806 - 40865 8.2 + Prom 40721 - 40780 8.9 47 20 Tu 1 . + CDS 41017 - 41517 523 ## COG4708 Predicted membrane protein + Term 41525 - 41565 5.1 - Term 41513 - 41553 9.1 48 21 Op 1 . - CDS 41559 - 44066 2203 ## EUBREC_0675 hypothetical protein 49 21 Op 2 3/0.000 - CDS 44067 - 45218 990 ## COG0500 SAM-dependent methyltransferases - Prom 45242 - 45301 4.1 - Term 45270 - 45309 4.4 50 21 Op 3 40/0.000 - CDS 45327 - 47870 2049 ## COG0642 Signal transduction histidine kinase 51 21 Op 4 . - CDS 47883 - 48581 944 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 48808 - 48867 7.9 + Prom 48656 - 48715 8.1 52 22 Op 1 . + CDS 48742 - 50112 1275 ## COG0372 Citrate synthase 53 22 Op 2 . + CDS 50118 - 51056 615 ## COG2267 Lysophospholipase 54 22 Op 3 . + CDS 51111 - 51989 905 ## COG0030 Dimethyladenosine transferase (rRNA methylation) 55 22 Op 4 . + CDS 52017 - 52697 770 ## COG2013 Uncharacterized conserved protein + Term 52811 - 52845 6.0 - Term 52689 - 52738 -0.7 56 23 Tu 1 . - CDS 52808 - 52924 99 ## - Prom 53078 - 53137 7.2 - Term 52972 - 53011 1.0 57 24 Tu 1 . - CDS 53144 - 54889 1477 ## COG4805 Uncharacterized protein conserved in bacteria + Prom 54688 - 54747 6.9 58 25 Tu 1 . + CDS 54848 - 54976 56 ## 59 26 Op 1 . - CDS 54954 - 55205 180 ## gi|291172357|ref|ZP_06573530.1| cation/multidrug efflux pump 60 26 Op 2 . - CDS 55208 - 55345 223 ## 61 26 Op 3 . 
- CDS 55332 - 56024 394 ## gi|210612869|ref|ZP_03289502.1| hypothetical protein CLONEX_01704 - Prom 56103 - 56162 6.9 Predicted protein(s) >gi|330400709|gb|ADLB01000031.1| GENE 1 2 - 173 144 57 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3236 NR:ns ## KEGG: EUBREC_3236 # Name: not_defined # Def: transposase # Organism: E.rectale # Pathway: not_defined # 1 56 1 56 113 86 64.0 3e-16 MLGDISLATNIYLVTGYTDMRKSIDGLCAIIMKNFKHEPDGHSIYLFCGKRCDRIKV >gi|330400709|gb|ADLB01000031.1| GENE 2 163 - 576 238 137 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_3235 NR:ns ## KEGG: EUBREC_3235 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 134 1 124 125 98 43.0 5e-20 MKAKRVNREEQLKLIMECRSSGLSDYQWCEAHGIHAGTFYNWVSKLRKAGVTIPNSESKH LGTPVHQEVVKLDLVPEPAPAATIMEKNTRILTMPDTDASVAVEIIMGNSTIRFFNNTNP DLIRTTLQCLGGMSHAW >gi|330400709|gb|ADLB01000031.1| GENE 3 904 - 1317 392 137 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQNKRWKIMLLNVSVVTALSLSVISCKKDVSPLEDVNKDNLVKISMADEEMIRHGAEYIV LDENASEALLEDLKNISYKKMNNDSIKGSIYHMEIQCEVEKEIVSKKLYFYNPDKVKIDD ELYQIPKETYKKFEKYF >gi|330400709|gb|ADLB01000031.1| GENE 4 1307 - 1396 57 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIVNNGWGQNNVWVRENYTNLDGVIYFAK >gi|330400709|gb|ADLB01000031.1| GENE 5 2015 - 3178 1380 387 aa, chain - ## HITS:1 COG:FN1423 KEGG:ns NR:ns ## COG: FN1423 COG0426 # Protein_GI_number: 19704755 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 5 386 6 400 405 352 43.0 6e-97 MKEMKITDTVKYVGVDDKTLDLFESQYVIPNGVSYNSYVILDEKVTVMDTVDARATEEWL ANLEEVLDGREVDYLVVSHMEPDHASNVQNLIEKYPSMQVVGNAKTFNLVSQFFNVDLEG RKVLVKEGDTLNIGSHTLQFFMAPMVHWPEVMVTYEQSEKILFSADGFGKFGALDTDEDW ACEARRYYFNIVGKYGMQVQNLLKKASALDIEMICPLHGPILKENLAYYIDKYNTWSSYE PEEEGVLVAYASIHGNTAAAAKKMVEILGEKGAKKVAVMDLSRDDMSEAIEDAFKYDKLV VAAATYDNGVFPCMENFLTHLKAKNFQKRKVAIMENGSWAPTAARQMKAVFESMKDITIC DTVVTIKSVMNDDTVEVMEKMADELMA >gi|330400709|gb|ADLB01000031.1| GENE 6 3512 - 3604 72 30 aa, chain + ## 
HITS:0 COG:no KEGG:no NR:no MDGEGFEPSKAVPTDLQSAPFGHSGIHPYA >gi|330400709|gb|ADLB01000031.1| GENE 7 3652 - 3801 115 49 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEKKKKSGKKIKYFNVDTCVCCGEIVPEGQMICEKCRQSLNEEKKSNTD >gi|330400709|gb|ADLB01000031.1| GENE 8 3839 - 4249 388 136 aa, chain - ## HITS:1 COG:no KEGG:CPF_1302 NR:ns ## KEGG: CPF_1302 # Name: not_defined # Def: hypothetical protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 9 135 25 149 149 68 31.0 9e-11 MVSGCGKFYNRDTRPGEIIKISMEEMETMMKEKETFTLVVTRDYCKYCAEFYELMESYLK NHHVKIYDVNIDENFGAKSDENIEKIEDMFPDFVGTPGIFYIEDGKLVNQLDNESQDLTK TLFDNWVQDYKMDKKK >gi|330400709|gb|ADLB01000031.1| GENE 9 4313 - 4873 519 186 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|160914690|ref|ZP_02076904.1| ## NR: gi|160914690|ref|ZP_02076904.1| hypothetical protein EUBDOL_00697 [Eubacterium dolichum DSM 3991] # 12 182 10 180 180 150 45.0 2e-35 MKKNWRKEIICLCLAVLLLSGCGNTKYEAKDTVEREGNTFQIGNLQLEVKSWDKREEVKP KNPQGYYNHYKKEEGYYYHVLYGTLKNTGDKKVNVNQIKVEGLSNKEHYQGKIVLINEIQ SYFWEEIEPGVELDFYIFSIVEKKEKAPSEYIFYFDEDGKIDKEQVSFDHKIKYSIPSEL KRDKEN >gi|330400709|gb|ADLB01000031.1| GENE 10 4876 - 5328 476 150 aa, chain - ## HITS:1 COG:no KEGG:Elen_2849 NR:ns ## KEGG: Elen_2849 # Name: not_defined # Def: hypothetical protein # Organism: E.lenta # Pathway: not_defined # 25 145 43 165 170 94 39.0 1e-18 MLLPLILIFALLVGGGIWWALNRQPKVDLDKEAIAYQMPNGMKNENPNEIMIPVFSELTM KTGKNKVEAGLVNPEGNPCYFKYQIYLKEGNKLLYESKWLEPGTAIVEIEIKEKFEVGNY PITISVKTGSLKDPEVEMNGGEVETILKVE >gi|330400709|gb|ADLB01000031.1| GENE 11 5516 - 5980 400 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167767649|ref|ZP_02439702.1| ## NR: gi|167767649|ref|ZP_02439702.1| hypothetical protein CLOSS21_02182 [Clostridium sp. 
SS2/1] # 1 154 183 336 336 320 100.0 2e-86 MQLRDDKAHAFAMTFKDRPLELGELAFGLLANNLRFVVPNRNESNKSRWKTCRFWERFLG AVEVLKLQVPKLHNSLEETQQWLTEGGVISAVKSFYFLEEHDALGGLEKVGTMLDKARYS NSLSSKLTAHLQRIDRTDLIPYIQYDTKHGKGGI >gi|330400709|gb|ADLB01000031.1| GENE 12 5980 - 6183 252 67 aa, chain + ## HITS:1 COG:no KEGG:SMU.1031 NR:ns ## KEGG: SMU.1031 # Name: xis # Def: putative transposon excisionase; Tn916 ORF1-like # Organism: S.mutans # Pathway: not_defined # 1 67 1 67 67 105 85.0 6e-22 MNNNDIPVWEKYTLTIEEASKYFRIGENKLRRLAEENKDAGWLIMNGNRIQIKRRQFEKV IDKLDAI >gi|330400709|gb|ADLB01000031.1| GENE 13 6262 - 7455 794 397 aa, chain + ## HITS:1 COG:SP1129 KEGG:ns NR:ns ## COG: SP1129 COG0582 # Protein_GI_number: 15900995 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 186 392 176 377 387 79 29.0 2e-14 MKEKRRDSKGRILHTGESQRTDGKYLYKYVDAFGNTKYVYAWRLTPTDPTPKGKREKPSL RELEQQIRRDIEDGIDSTGKKMTLCQLYAKQNAQRANVKKSTQKQREQLMRLLKEDKLGA RSIDTIKPSDAKEWALRMKDKGFSYNTINNHKRSLKASFYIAIQDDCVRKKPFDFKLSEV LENDTKEKVALTEEQEQALLSFIKTDNVYHKYYDDVLILLKTGLRISELCGLTIMDVDFI HEVVVIDHQLLKSKEQGYYIETPKTKSGTRQVPLSKETIRAFQRVMKKRPKAEPFVIDGR GNFLFVNHKGKPKVAIDYNMLFVRMVKKYNKHHKDNPLPHITPHTLRHTFCTRLASKNMN PKDLQYIMGHSNISITMNWYAHASIDTAKSEVQRLIA >gi|330400709|gb|ADLB01000031.1| GENE 14 7832 - 9874 1742 680 aa, chain - ## HITS:1 COG:ECs0453 KEGG:ns NR:ns ## COG: ECs0453 COG0366 # Protein_GI_number: 15829707 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Escherichia coli O157:H7 # 104 626 109 569 605 247 31.0 7e-65 MNKQALFSDGTGSYVWPPDPSVNSRITIRFRTAKDDVENVWLVSGNERLPMTKRETEGNF DYYTVEKQLGSEIFYYYFEVKSKDDYCYYNTYGVCKNLIHSYSFAVAPGFSTPDWAKGAV MYQIFVDRFCNGDKTNDVETREYLYIGEPCQKVTEWNRYPREMDVRDFYGGDLKGVMDKL DYLQDLGIEAIYFNPIFVSPSNHKYDIQDYDYIDPHYAVILHDGGELVGEHAKNNVHATK YQKRTTDKENLEASNRFFAQLVEEIHRRGMKVILDGVFNHCGSFNKWLDREHIYERQQGY EKGAYISKDSPYREFFHFNENKDSDWPYNTRYEGWWGHDTLPKLSYEDSPKLEEYILNIA 
KKWVSSPYNVDGWRLDVAADLGYSNEYNHEFWKKFRSAVKEANPNAVIIAEHYGDPKEWL LGDEWDTVMNYDAFMEPITWFLTGMEKHSDEERQELWGNADHFAETMKYHMSKMFTPSRL VAMNELSNHDHSRFLTRTNHMVGRAAQLGAKAAEEYVSKAVFRVAVTMQMTWIGAPTIYY GDEAGVCGFTDPDNRRTYPWGLEDKLLIAFHKEMIRIHKKYEAFKSGSILLLHGAEHILA YARFSDTEQFVVVINNRSERAEIVVPVWRAGVPKEGRMRRLMYTYDDDYTTEYEEYIVKN GEMVVNMGKHSALVLKTMEF >gi|330400709|gb|ADLB01000031.1| GENE 15 9890 - 11725 1252 611 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. McKiel] # 3 603 2 593 636 486 45 1e-137 MNNGKSRGVNGFPFILLIVVFLVTWWILNDIPQRESAYTYSAFKSAVQNDEVKSVTVKQN KAIPTGQVEAVLKDDSKETFNVSDVKEVEKFLEKENVDYYVNDVPKDSWFATTGFSTLVS VGLILFLFMMMNRQAGGAGAKAMSFGKSRAKMSTDKDRKITFSQVAGLQEEKEDLEEIVD FLKEPKKYIQVGARIPKGVLLVGPPGTGKTLLAKAVAGEAGVPFFTISGSDFVEMFVGVG ASRVRDLFEEAKKNAPCIIFIDEIDAVARKRGTGMGGGHDEREQTLNQLLVEMDGFGVNE GIIVMSATNRVDILDPAILRPGRFDRKVMVGRPDVKGREEILRVHAKGKPLGDDVDLLQI AQTTAGFTGADLENLLNEAAINAAKEDRVYVKQNDIRKAFVKVGIGAEKKSRIISEKEKK ITAYHEAGHAILFHVLPDVGPVYSVSIIPTGTGAAGYTMPLPEKDEMFNTKGKMLQDITV SLGGRIAEEIIFDDITTGASQDIKQATAMAKSMVTKFGMSETLGLINYDNDSEEVFVGRD FAHTSRGYGEEVAGQIDREVKRIIDECYAKAKAIIKEHQSVLHVCADALLEKEKITREEF EALFEETVSEG >gi|330400709|gb|ADLB01000031.1| GENE 16 11743 - 12270 675 175 aa, chain - ## HITS:1 COG:YPO3408 KEGG:ns NR:ns ## COG: YPO3408 COG0634 # Protein_GI_number: 16123557 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Yersinia pestis # 1 171 1 173 178 195 58.0 3e-50 MSEKIKVLISEEEVNARIEELGKKISEDYAGKQVHLICVLKGGVFFMCELAKRITVPVSM DFMSVSSYGDGTSSSGVVKIAKDLDETLEGKDVIVVEDIIDSGRTLSYLLEILQKRGPKS MGLCTLLDKPDRRVRDVKVDYVGFEIPDEFVVGYGLDYAQKYRNLPYIGVVEGVE >gi|330400709|gb|ADLB01000031.1| GENE 17 12254 - 13666 850 470 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated 
in cell cycle control # Organism: Clostridium acetobutylicum # 1 463 1 461 461 244 34.0 2e-64 MIKKVARFVAKWEMLNKADKVVVGVSGGADSVCLLFVLMELQKTIGFDIIVVHVNHLLRG EDAESDEKYVRELCEKYHLTCEVYRENVEWIAKNKRESLEEAGRNVRREAFEQTLKKYGG TKIALAHHRNDNAETLLMNLARGSGLAGLGGMRPVKENRIRPLLCLERKEIEEYLLKRQI AYCTDETNYSDAYTRNRVRNHVIRTLEQGVNEKTVAHFSETMERIWELQDYMEEEAERRY DRVSLEKENKTVLLKEEYEKLPPILRGMVIKKAIVSMTKKEKDIHAVHIEEVQKLMGNQV GRQIHLPYGVCACRCYEGICITFAREMRQKEEAEKQEIDLLQSNHRQIGEWNISYEIKNA EEISREEVENPYTKCFDCDIMDRTIVIRTRAAGDYITINESGAKQKLKSYFINEKIPKEE RDRVLLLADGSHILWVVGYRRGCAYQIGKNTKHILKITIDKGEKDNVRKN >gi|330400709|gb|ADLB01000031.1| GENE 18 13740 - 15131 1245 463 aa, chain - ## HITS:1 COG:CAC3205 KEGG:ns NR:ns ## COG: CAC3205 COG2208 # Protein_GI_number: 15896452 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Serine phosphatase RsbU, regulator of sigma subunit # Organism: Clostridium acetobutylicum # 10 463 342 794 795 186 24.0 6e-47 MGKIEVDNPYVTQIDKFANSLRHLSDTFLHLEEKRETLTQAEMEEIFAEIGENVCRECEN REWCMGENAIYTYQMVYEILSAVEEYGVELNTEIKRKLQKRCIQAPRFLRETLDTFQGAK KTLLWNNKLAQNRQGCAIQLDSFAHMIQHATRELDASIFSDESLEKRIQAQLKKIDVRLL SSVFLITPEGKYEIHLTVRSKKGQCVTTKELVWNLSKVVGRKMVLPEGERLLVGQEYCTI VAMEGPKYHTLQGVAKIGKGCDKISGDNFSMMELPGGKHGVILSDGMGSGEKAFQESAMV VEMLEELLIAGFPKETAIQMMNTALVMGREEISFSTIDMSVFDLYNGNCEFVKVGASTTY IKRGDKVERISSTSLPIGVVQYLEIETTRKRLENGDFVIMVTDGVLDALPVGEQDLLMEM IISGTNKNNPKEMAHHILEQVLEWTGEEPLDDMTVLAVGFWSL >gi|330400709|gb|ADLB01000031.1| GENE 19 15238 - 15564 366 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210613794|ref|ZP_03289908.1| ## NR: gi|210613794|ref|ZP_03289908.1| hypothetical protein CLONEX_02119 [Clostridium nexile DSM 1787] # 1 107 2 108 109 84 51.0 3e-15 MAKINKKITARQKKQQLQRHKRSMMIVTFVLVLLVGVVSVSSISLQAKNKTYKVQEAELK KQIEEENERLEDIKEFEEYVQTDEYVEDKAKDELGLARKNEILFQSEK >gi|330400709|gb|ADLB01000031.1| GENE 20 15623 - 15952 310 109 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|225571160|ref|ZP_03780158.1| ## NR: gi|225571160|ref|ZP_03780158.1| hypothetical protein 
CLOHYLEM_07248 [Clostridium hylemonae DSM 15053] # 1 109 1 117 117 95 45.0 9e-19 MLGLEKELEIFVHAVIAGMAVYGTYTLLRVIRRIVKHNLLSISIEDFLFWVGTSFYLFIE IYYTSDGSVRWFFILGVVLGMILLSFALFLAKKICEKIKKSVDKRAKTR >gi|330400709|gb|ADLB01000031.1| GENE 21 15959 - 16228 291 89 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0530 NR:ns ## KEGG: EUBREC_0530 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 3 89 8 94 94 97 55.0 2e-19 MPKAHKIVINNRKTSVVTGVLDVLSFDLNEILLETEQGMLMVKGSDIHVNRLSLEKGEVD LSGNIDSIAYSEIHAKAKQGENLIAKLFR >gi|330400709|gb|ADLB01000031.1| GENE 22 16331 - 16552 294 73 aa, chain - ## HITS:1 COG:L1001 KEGG:ns NR:ns ## COG: L1001 COG1188 # Protein_GI_number: 15671997 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # Organism: Lactococcus lactis # 1 73 23 95 105 62 50.0 2e-10 MKVSRLIKRRTVANEACDAGRVLVNGLVAKASQKVKVGDIIEIQFGTKSVKAEVLALQET TKKEEAKELFRYL >gi|330400709|gb|ADLB01000031.1| GENE 23 16633 - 16935 167 100 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148826039|ref|YP_001290792.1| 50S ribosomal protein L35 [Haemophilus influenzae PittEE] # 9 98 3 92 96 68 33 6e-11 MIQLEEDISMNKTELITAVAENAELSKKDAEKALKAFVEVVTEELKKGEKIQLVGFGTFE VSERAAREGRNPQTGKTMSIAACKVPKFKVGKALKDAVNC >gi|330400709|gb|ADLB01000031.1| GENE 24 17041 - 18774 2376 577 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 1 575 1 572 575 600 54.0 1e-171 MDYQKRYDQWLNDSYFDEDTKAELKGIANDENEIKERFYKDLEFGTAGLRGIIGAGTNRL NIYTVRKATQGLANYIAKRDAKERGVAIAYDSRHMSPEFADEAALCLAGNGIKAYVFDAL RPTPELSFAVRQLNCIAGINITASHNPPEYNGYKVYWEDGAQITPPHDKGIMAEVEAVED FHCVKTMSLAEAKEMGLYQTIGAEIDDAYIAALKQQVIHQDAIDAMNKELKIVYSPLHGT GNVPVRRVLKELGFENVYIVKEQELPDGDFPTVSYPNPESEEAFELGLKLAREVDADLIL 
ATDPDADRLGVYVKDTVSGEYKVLTGNMSGCLLADYEIGQRKEKEGLPEDGYLIKTIVTS NMADAIAKGYGAGLIEVLTGFKYIGQKILGFETTGKGHYLFGFEESYGCLIGTHARDKDA VVATMALCEAAAYYKTKGKTLWDAMIDMYEKYGYYKDGIQSITLKGIEGLEKIQEILNTL RNDTPSSFGPYKVLKARDYQADTIKDMETGETKQTGLPSSNVLYYDLNDEAWLCVRPSGT EPKVKFYYGVKGTSLEDADKKSEELGKEVLEMIDKML >gi|330400709|gb|ADLB01000031.1| GENE 25 18808 - 19812 687 334 aa, chain - ## HITS:1 COG:no KEGG:Cphy_3824 NR:ns ## KEGG: Cphy_3824 # Name: not_defined # Def: CotS family spore coat protein # Organism: C.phytofermentans # Pathway: not_defined # 1 325 1 345 347 208 33.0 2e-52 MKEYSLNVLEQYDIKVSATRKVRGAILCETDKGVLLLKETVDSEKKISMLEELTTQMIHR GYERVDNMFINKEGNFVSMSEDGTRYVLKKWYLGRECDVKRETEVLEATRNLARIHLMMK IPNEDWKLFVGNDLAEEYTRHNRELRKIRKFMRNKTAKSQFENVALKHFGEMYMWAEYAD KKLACSSYKMLLEKSIEEGAITHGDYNYHNLIITQQGMATTNFEHSCLDIQAADLYYFMR KVLEKHHWSVELGRAMLRTYQEIKPLGEGEKEYLCLRFVYPEKFWKVLNSYYHSNKAWIS EKNVEKLSVAISQTEEKKKFLKNIFAFHLENTVV >gi|330400709|gb|ADLB01000031.1| GENE 26 19893 - 21227 1308 444 aa, chain - ## HITS:1 COG:FN0243 KEGG:ns NR:ns ## COG: FN0243 COG0617 # Protein_GI_number: 19703588 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Fusobacterium nucleatum # 12 443 16 450 451 228 34.0 2e-59 MKITLPEKVKKIIHTLKSHGYEAYAVGGCVRDSILGREPEDWDITTSALPEETKALFERT FDTGIEHGTITVLMGKEGFEVTTYRIDGKYEDNRHPKQVTFTRCLREDLLRRDFTINAMA YNEEEGIVDIFGGMEDLKEGVIRCVGNAKERFGEDALRILRAVRFSAQLGFSIEEETKAG IVALAPTLQQISAERIQVELVKMLVSPNPERIRLAYELGITKVILPEFDEMMNTEQETPH HMYTVGEHTIKALTIIRADKILRLTMLLHDVGKPMMKTVDRDGVAHFKYHDVEGEKIVRN ILKRLKFDNDTLRKVAKLVKYHDYRMPARTKNVRRAVNLIGEDLFPFYLKVRMADTLAQS EYQRTEKLANLDGIRDCFEEIQEKGQCVSLKTLAVTGNDLIQLGMKPGKEIGEMLNTLLQ LVLDNPEYNNEEMLKLYVKENRKI >gi|330400709|gb|ADLB01000031.1| GENE 27 21242 - 21484 333 80 aa, chain - ## HITS:1 COG:MA3938 KEGG:ns NR:ns ## COG: MA3938 COG0526 # Protein_GI_number: 20092734 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: 
Thiol-disulfide isomerase and thioredoxins # Organism: Methanosarcina acetivorans str.C2A # 1 75 1 75 77 62 48.0 2e-10 MVIKVIGTGCDKCDRLYENVTQAVEETGVEAEVEKVEDLMEIVKLGVMTSPALMIDGKMI SAGRVMKKDEVKTYIQSTGK >gi|330400709|gb|ADLB01000031.1| GENE 28 21493 - 22431 898 312 aa, chain - ## HITS:1 COG:SMc03799 KEGG:ns NR:ns ## COG: SMc03799 COG0523 # Protein_GI_number: 15966935 # Func_class: R General function prediction only # Function: Putative GTPases (G3E family) # Organism: Sinorhizobium meliloti # 6 195 8 197 329 114 34.0 3e-25 MEEKRTKLYVFTGFLGAGKTTVLSRILQESKEEKIGVIQNEFGKLGIDGEILKHNDIQMV EINRGSIFCSCLKLQFVQALSEMAQYEFDKLFVESSGLGDPSNVEEILKASEMVSGRSYD FRGVICFVDSVNFLSQLEDLETVHRQLKHCHLAVITKTDLVSEEEVQNLCKVIRTINPVC KIVESANRDLDITVLEEDLLLYQWAEGEETTNTEETKPKTLFMELESEVEEEGLRSFLEE IAPSLYRAKGFVQFKGKGWNQVDLVGKKIDLQKTDALEKGQLVFISKIGAAVIKEIFAVW AKYNDTEMKLRN >gi|330400709|gb|ADLB01000031.1| GENE 29 22435 - 22773 350 112 aa, chain - ## HITS:1 COG:FN0589 KEGG:ns NR:ns ## COG: FN0589 COG1733 # Protein_GI_number: 19703924 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Fusobacterium nucleatum # 4 108 1 105 107 141 68.0 2e-34 MGIIMEKNLPICPVETTLMLIGDKWKVLILRDLMPGTKRFGELKKSIGNVSQKVLTANLR SMEESGLLTRKVYAEVPPRVEYTLTETGYSLQPVLSSMVDWGTEYKKKRKKG >gi|330400709|gb|ADLB01000031.1| GENE 30 22914 - 23546 706 210 aa, chain + ## HITS:1 COG:SP1627 KEGG:ns NR:ns ## COG: SP1627 COG2910 # Protein_GI_number: 15901463 # Func_class: R General function prediction only # Function: Putative NADH-flavin reductase # Organism: Streptococcus pneumoniae TIGR4 # 1 209 1 209 209 231 55.0 7e-61 MKIAVVCANGKAGKLITKEAADRGLDVTAIVRGENRSAAGKVVLKDLFELTSADLADFDV IVDAFGAWTPETIDNIPRAVKHLCDLLSGKNTRLLIVGGAGSLYINKEHTLCVADGADFP EAFKPLAEAHDKALRHLRTRNDVNWTYISPAGDFQADGERTGKYILGGEELTLNSKGESV ISYADYAIAMVDEIIKGKHVQQRISVVSEY >gi|330400709|gb|ADLB01000031.1| GENE 31 23575 - 25017 1395 480 aa, chain - ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J 
Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 11 480 5 488 488 456 48.0 1e-128 MAKEKKFVEAITSRDEDFAQWYTDVVSKAELCDYTGVKGCMVYLPNGYAIWEKIQADLDK RFKETGVENVYLPMLIPENLLQKEKDHIEGFAPEVAWVTHGGNEPLQERMCIRPTSETLF CDHWSKTVQSYRDLPKIWNQWCSVLRWEKTTRPFLRSREFLWQEGHTIHATYEEAEERTI LMQKIYKDFIEESLAIPLVAGKKTDSEKFAGAEDTYTVEALMHDGKALQAATSHFFGSGF ADAFDIKYLDKNNELHSVYETSWGLTTRMIGAIIMVHGDDSGLVLPPRIAPVQTRVIPIA QHKEGVLEKATELMNTLKAAGYAVNMDSSDKSPGWKFSEQEMQGIPTRIEIGPKDIENNQ VVVVRRDTREKIVVSLDEITTKLGEILETIQKDMFDKAKAFLDSHIDTAVTMDEMIEKFQ ENRGFIKAMWCGDEACEDEIKAQTGGASSRCIPPEQEHLSDVCICCGKPAKHMVYWGKSY >gi|330400709|gb|ADLB01000031.1| GENE 32 25157 - 25249 104 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVEYFYGRYIEPAFELCMKIQRISALKGRK >gi|330400709|gb|ADLB01000031.1| GENE 33 25500 - 26390 879 296 aa, chain - ## HITS:1 COG:no KEGG:Acfer_1503 NR:ns ## KEGG: Acfer_1503 # Name: not_defined # Def: tetracycline resistance leader peptide # Organism: A.fermentans # Pathway: not_defined # 3 275 5 278 278 131 33.0 3e-29 MKKQLKNLTIKDNFMFAAVMLDKENCKGFLERALQIKIDHVEVCTEKNIVYHPEYKGVRL DVYAKDENNTRYNVEMQVSSQSALGLRSRYYQSQMDMEMLLSGCEYSELPNSYVIFICDF DPFGEKKYKYTFAMECTEIKTVELQDKRNIVFLSTKGENDAEVPKELVRFLKFVKADLKE SQNDFEDEYVRKIQEFVKHIKQNREMEEKYMLFEELLKDEREVGRKEGLEQGRNEGISEG RNLGMSEILKMFLSKFDILPKELQNKISEEEDEEILKYWIRIASEVKSLDEFISKM >gi|330400709|gb|ADLB01000031.1| GENE 34 26463 - 27116 264 217 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855152|ref|ZP_04645474.1| pseudouridine synthase, RluA family [Lactobacillus jensenii 269-3] # 7 216 83 278 287 106 35 3e-22 MDIKDYIVYEDEHLLVCHKLPKIATQTSRIGEKDMVSLLKNYLHQQSSKKQEPYLAVIHR LDQMVEGLLVFAKTPFAAKELSRQLQQKGFGKEYRAVVCKTPQQQEGTLVNYMVKDKRSN TSRVCGENEMGAKKAVLHYKVTEEQGDGTAVLHICLETGRHHQIRVQLSNIGCGIMGDAK YNAGNYEQEKWDKIALCAYRLRFVHPKTKKEMVFELE >gi|330400709|gb|ADLB01000031.1| GENE 35 27216 - 27500 482 94 aa, chain - ## HITS:1 COG:CAC3223 KEGG:ns NR:ns ## COG: CAC3223 COG2088 # Protein_GI_number: 15896470 # Func_class: M Cell 
wall/membrane/envelope biogenesis # Function: Uncharacterized protein, involved in the regulation of septum location # Organism: Clostridium acetobutylicum # 1 83 1 83 95 116 72.0 1e-26 MQITDVRVRKVAKEGKLKAVVSITMDEEFVVHDIKVIEGEKGLFIAMPSKKSLDGEYRDI AHPINSETRDRIQSTILAKYEEVLQEEPEVEEVV >gi|330400709|gb|ADLB01000031.1| GENE 36 27518 - 28636 1330 372 aa, chain - ## HITS:1 COG:TM0239 KEGG:ns NR:ns ## COG: TM0239 COG0448 # Protein_GI_number: 15643011 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 1 366 1 366 370 306 41.0 5e-83 MRAVGIILAGGNNNRMKELTNKRAVAAMPVAGSYRCIDFALSNMTNSNIQKVAVLTQYNA RSLNEHMNSSKWWNFGRKQGGLYVFTPTITKNNGYWYRGTADAIYQNLDFLKKSHEPYVI ITSADAVYKLDYNEVLEYHIKKQADITVVCKDLEPGDNPRRYGTIKMNEDSLIEEFEEKP MLAKSRTISTGIYVLRRRQLIDLIEKAAEEENYDFVTDILIRYKNVKKIYGYKIKNYWSN ISTVDAYYKTNMDFLKPEIREYFFHELPRVYSKVSDLPPAKYNPGAVVKNSLVASGTIIN GTVENSILFKKVFVGNNCIIKNSIILNDVYLGDNTYIENCIVESRDTIRANTRYIGEDGV KVVVEKNDRYTL >gi|330400709|gb|ADLB01000031.1| GENE 37 28639 - 29910 1144 423 aa, chain - ## HITS:1 COG:TM0240 KEGG:ns NR:ns ## COG: TM0240 COG0448 # Protein_GI_number: 15643012 # Func_class: G Carbohydrate transport and metabolism # Function: ADP-glucose pyrophosphorylase # Organism: Thermotoga maritima # 7 418 5 419 423 473 54.0 1e-133 MIKKEMIAMLLAGGQGSRLGVLTAKVAKPAVSFGGKYRIIDFPLSNCINSGIDTVGVLTQ YQPLRLNTHIGIGIPWDLDRNVGGVTVLPPYEKSTSSEWYTGTANAIYQNLDYMSAYNPD YVLILSGDHIYKMDYEVMLDFHKENNADVTIAAMPVPLEEASRFGIVITDDEGKIEDFEE KPAQPRSNLASMGIYIFSWPVLKEALQELSSQPNCDFGKHIIPYCHSKNQRLFAYEYNGY WKDVGTLSSYWEANMELIDIIPEFNLYEEFWKIYTNSGVLPPNYVSEQSVIERSIICNGA SIYGEVHNSILGSRVRIGKGAIIRDSIIMNETEIGENCVVDKAIIAENVKVGDNVTIGIG SDIPNKMRPDIYNSGLTTIGEKSVIPSGVQIGKNTAISGVTSKEDYIGGVLESGETLIKA GER >gi|330400709|gb|ADLB01000031.1| GENE 38 30126 - 31511 1065 461 aa, chain + ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 12 460 13 457 458 436 51.0 1e-122 MYKIDFKKPIHVHFIGIGGISMSGLAEILLKEGFTVSGSDNKESALTDHLAGKGATIFYG QKASNIIDGIDVVVYTAAIHEDNEEFAEAVRQNIPMLSRAELLGQLMTNYECPIAVSGTH GKTTTTSMLSHVLLEGEKDPTISVGGILKAINGNIRVGNSGVFVTEACEYTNSFLHFLPK ISIILNVEEDHMDFFKDIDDIRQSFHRFAQLLPQNEEGTLIINGEIEKIEAITDGLTCRV LTYGLHGDYDYSAKNISYNETGCTSFDFYRFGEFADRISLSVTGEHNVSNALSAIAVGEL TNVPLESIKKGLLSFSGTDRRFEYKGELNGFTIIDDYAHHPTEITATLSSVKHYPHRETW CIFQPHTYTRTKAFFHEFAEALSLADHVLLVDIYAARETDTLGMSSELLAEEIKKLGTDA HYFPDFKSVENFVLKHCIHDDLLITMGAGDVVNIGESLLST >gi|330400709|gb|ADLB01000031.1| GENE 39 31651 - 32700 856 349 aa, chain + ## HITS:1 COG:CAC3587 KEGG:ns NR:ns ## COG: CAC3587 COG3935 # Protein_GI_number: 15896821 # Func_class: L Replication, recombination and repair # Function: Putative primosome component and related proteins # Organism: Clostridium acetobutylicum # 15 347 14 321 328 150 31.0 3e-36 MKALKLHNHLQMNSTWIENNFIDNYMAQANGEYVKVYLFLLRHLQNPCTTMTISTIADCL ENTEKDILRALKYWEKEGLISLTYDSSNEIDGISMGGNHEDVHEKVIVEEPEPIVESSPV IQEELAEEKPVAKASSIEQFKNRKELKSLLFIAEQYLGKTLSGTDIEAITYFYDTLHFSA DLVEYLIEYCVENGHKSIHYIQKVALAWSEEHITTVEQARNSSKFYNKNCYSILNAFGIK GRAAATPELAYIKKWTEEYGFSLDLIIEACNRTMAKLHQPSFEYTDSILKKWIASNVHHL SDLQRLDSAHRQTVVRRRVETKPVIKNKFNNFEGRTYDYDSLEKQLLSQ >gi|330400709|gb|ADLB01000031.1| GENE 40 32716 - 33711 566 331 aa, chain + ## HITS:1 COG:CAC3588 KEGG:ns NR:ns ## COG: CAC3588 COG1484 # Protein_GI_number: 15896822 # Func_class: L Replication, recombination and repair # Function: DNA replication protein # Organism: Clostridium acetobutylicum # 1 327 1 326 329 184 35.0 2e-46 MALRNAQHDMIMRNYEQKQLRSQNILTQRYNEIYEKIPKFKELEDSISMLGVKYARKLLD GDSRALEEFRVKLNSLCQEKTDLLTQSGYPEDYLEPVYECPLCKDTGYIGNEKCICFKQA AIDLLYTQSNLKEILKKENFDTFSFDYYSDNFVDAKSGLSSSAAIRKAHQTCLDFVNRFG EDFENLFLYGDTGVGKTFLSNCIAKELIDKSYSVIYFTAFELFNIFEKSKFGRDETAEVM NSHIFDCDLLIIDDLGTELSNSFTTSQLFLCLNERLLKKKSTIISTNLALETFSDYYSQR TFSRITSNYKLLKLIGDDIRLKKKLMNMEAN >gi|330400709|gb|ADLB01000031.1| GENE 41 
33711 - 34895 1224 394 aa, chain + ## HITS:1 COG:CAC0819 KEGG:ns NR:ns ## COG: CAC0819 COG0462 # Protein_GI_number: 15894106 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Clostridium acetobutylicum # 15 375 11 356 371 379 50.0 1e-105 MLHRKERNLENIPVGALGIIAVDGCQELGNKVNDYLVKWRNESTKDFSASEVFTGYKKDS YLIDAKVPRFGSGEAKGIINESVRGKDLYLMVDVCNYSLTYSLSGHTNHMSPDDHFQNLK RIIAAIGGKGRRINVIMPFLYESRQHKRSSRESLDCALALQELVHMGVDNIITFDAHDPR VQNAIPLSGFETIRPTYQFIKGLLRAVPDLQIDSKHMMAISPDEGGTSRAVYLANVLGLD MGMFYKRRDYTQIIDGRNPIVAHEFLGSDVEGKDVIIIDDMISSGDSIIDVATELKRRKA KRIFAAATFGLFTNGLAKFDKAYEDGIIHGILTTNLIYQTPELLSKPYYINCDMSKYIAL IIDTLNHDGSLSSILSPNERIQHIVEKYRNGERI >gi|330400709|gb|ADLB01000031.1| GENE 42 34935 - 36422 1064 495 aa, chain - ## HITS:1 COG:lin2324 KEGG:ns NR:ns ## COG: lin2324 COG4717 # Protein_GI_number: 16801388 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 293 487 702 905 908 62 24.0 2e-09 MKLLETIIKNFGKFSGKSWEFTEGINVIYGENESGKTTLYTFIKSMLFGLERGRGRASLN DEFSQYEPWENPNFYSGILRFSCGNRNFRLERNFDKYSKSVTLICEDDGEELSVEQGDLQ VLLDGLTQDAFENTVSIGQLKVKTEQSLANRLQDYATNYYTTGDGEMHLEEALKYLNNKK KEVEKQFKEELMKEDRRREEIALESSYVWRDIHKIEQKLEAVDEEIRVKELQEKKTSEKP KWRVHPIEVLGMFAVLFAIFVLIPRPWNYLIVIVGALAEGIYVWNRTKEGRKKEECIENE NAIRKLRWERNHVFEELKEKQVQHSNLEEQLEELQEEQEEHKDWKRKKEALDLASAKLIE VSRHMQKSLGQALNERTSEIISRITGGKYTRLFVEDNLKISLLTKERKIPLERVSRGTIE QIYFALRMASVEIMNEEEFPVILDETFAFYDEKRLQYTLKWLAENKKQVLLFTCQKREEE LLKKMEIPYTYYECR >gi|330400709|gb|ADLB01000031.1| GENE 43 36429 - 37484 858 351 aa, chain - ## HITS:1 COG:MA2363 KEGG:ns NR:ns ## COG: MA2363 COG0420 # Protein_GI_number: 20091196 # Func_class: L Replication, recombination and repair # Function: DNA repair exonuclease # Organism: Methanosarcina acetivorans str.C2A # 1 228 22 267 443 79 28.0 7e-15 MKFIHIADVHLGAKADAGKKYSANRGEEIWESFAKIIDICEKKETDLLLIAGDLFHRQPL 
LRELKEVDYLFSKLTKTKVVFIAGNHDYIKWNSYYRTFRWSKNVFPLLSDRLGCVEFKEH ETAVYGFSYYQREITERKYAKAKAWKKQKNEILLAHGGDEKHIPINRNELISLGYDYIAF GHIHKPITIEENKIVYSGALEPIDKNDVGLHGYIEGEIEDGKVQTKFVPFAMREYVHMHL EIDDTVTNMQLKEIIRDSIEERGIKNIYKILLCGFRNADIEFDVSNTDPFGNILEIVDET KPAYRFEKLKEKNKDNLLGKYIESIGDCESDSLEYMALYEGVEALLETKKG >gi|330400709|gb|ADLB01000031.1| GENE 44 37420 - 38610 1029 396 aa, chain - ## HITS:1 COG:BH1809 KEGG:ns NR:ns ## COG: BH1809 COG0642 # Protein_GI_number: 15614372 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 68 372 40 347 351 178 37.0 1e-44 MKNKYRKLKLSVLLQTVFVTALTVLVGAFLLDYVIDGVYNESFGRIFVDVLMDMGVERET AIAWYWKFIGNNKDLFMMVGFLSLFALFFYIALSQMTKYLHQIEAGIENIISDSTEPVHL ITELRPIEKRLNKIKETLKKQELEAIEAENRKNDLVVFLAHDLKTPLTSIVAYLSMLDSH PDMSVEERARYTHISLEKAIRLGELISEFFEITRFNLQNIELEKVKLNLSMMLEQLADEL YGVLREKNLICKVYMEDNLVVEGDPDKLARVFDNILRNAVTYCYPDTLIEIQAKDIGDNV EIIFINKGKQIPPEKLDRLFEKFYRVDDARSSSTGGAGLGLAIAKEIVELHSGTIRAESN EEETKFIVSLPKGEKDEVYSHSGRTLGGKGRRRKKI >gi|330400709|gb|ADLB01000031.1| GENE 45 38600 - 39289 609 229 aa, chain - ## HITS:1 COG:CAC0564 KEGG:ns NR:ns ## COG: CAC0564 COG0745 # Protein_GI_number: 15893854 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 3 229 6 231 233 224 49.0 1e-58 MNILVVDDEREIADVVELYLQSDQYHIFKFYTGQEALDCIDNTKIDLALLDVMLPDIDGF EILKRIREKYTFPVIMLTAKIEYMDKITGLTLGADDYIPKPFNPLELVARVKAQLRRYTQ YNDVGKNEGDIIDFGALFLNRTSHECVYYDKELTLTPIEFDILWLLCENRGRVISSEELF RTVWKEQYYKNSNNTVMVHIRHLREKMSEPTGKSDFIKTVWGVGYKVEE >gi|330400709|gb|ADLB01000031.1| GENE 46 39360 - 40751 1651 463 aa, chain - ## HITS:1 COG:CAC3260 KEGG:ns NR:ns ## COG: CAC3260 COG0017 # Protein_GI_number: 15896505 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Clostridium acetobutylicum # 1 463 
1 463 463 655 66.0 0 MELVTVRELYKNREQYLDKEISVGGWVRSIRDSKTFGFIVVNDGSFFETLQVVYHDTMEN FAEISKLNVGAAIIVKGTLVATPQAKQLFEIQATEVIVEGASAPDYPLQKKRHSFEYLRT IAHLRPRTNTFQAVFRVRSLIAYAIHQFFQERGFVYVHTPIITGSDCEGAGEMFRVTTLD MENLPKNEDGTVDYSQDFFNKETSLTVSGQLNGETYAQAFRNIYTFGPTFRAENSNTTRH AAEFWMVEPEIAFADLDDNMMLAEAMLKYIINYVLENAPEEMNFFNSFVDKGLLDRLHNV ANSEFARVTYTEAIELLEKNNDKFDYKVSWGADLQTEHERYLTEEVFKRPVFVTDYPKDI KAFYMKMNDDNKTVAAVDCLVPGIGEIIGGSQREDDYEKLLARMNEMGLKPEEYQFYLDL RKYGSTRHAGFGLGFERCVMYLTGMSNIRDVLPFPRTVNNCEL >gi|330400709|gb|ADLB01000031.1| GENE 47 41017 - 41517 523 166 aa, chain + ## HITS:1 COG:CAC2413 KEGG:ns NR:ns ## COG: CAC2413 COG4708 # Protein_GI_number: 15895679 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 2 153 3 155 166 92 39.0 4e-19 MKNQKVTFLTQAAMIAAIYVVLTVVFAPFGFGEVQVRIAESLTILPFFTPAAIPGLFIGC LIGNILGGAIIPDIIFGSLATLVAACFTYALRKKSSLFAPLPPIIANTIVVPFVLAYGYG VPLPIPFMMLTVGIGEIISCGVLGMIVLKALSKYEYVIFKNKKTVS >gi|330400709|gb|ADLB01000031.1| GENE 48 41559 - 44066 2203 835 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_0675 NR:ns ## KEGG: EUBREC_0675 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 806 1 812 842 390 31.0 1e-107 MKKKIIIGSVAAVIVIAAAIILGIILNRGTSDKTEEMGGATLPIVSFQEGEYKINSLYGY TDSMEITSVRDMVLPISVGTTIKADIQGYSGKIKSFTYEICTLDGKESIKQETIKKVKDT VNIELGDTVSDGEEKVLKMKLTLDNGQDVYYYTRVLKATEFHVQQCLEFADDFHKKTFDA DQVNALKPRLERGSNPNPSSIQKVTLNSDIKQIGWSGLNPEIVGDVVWQIKESTETYTSI YLKYQVKCKGEGDSEDLYNVTEFFKVRFLKGKGSLEDYERTMNEVFSGDESDFSAKGVEL GIAGEEIPYMVNEKENIISFVQEREVWNYNKEKNEISLVFSFADGEEDDIRSRNDNHEIE LVSMDNKGNTTFVVCGYMNRGMHEGKVGAGIYYFNSEKNYVEEKAFIPSEKSGEIAAEDL IQLAYYSEEENKLYAMFEGTLYEINLETQKRSVLVKNLEEGQYVVSNDGKYIAYQSDGKL TEATEMTIQNLQTGKKYKVKAEKGEVIRPLTFVMGDAVYGIGKKEDIGKNVSGNQIIPLY KVEIRNQKNKVVKTYQEDQSYVVSANTEHNMITLERVQKSGDTYAGITQHYITNNKEQEE PSVTVETYTSETKGTVTRLEDKKASSETKPKRLVQKQLAKGKPIVIEFNLADEENCYYVY GRGKLQGIYKTAGAAILKADEVRGVVVTSHQTKVWERGNRQLRYNIENTSAFSVQQGETS 
LQACLRQLLEYKGKKVNVAKEMQDGKSVMDILNKYSGGEAVDLSGCTVEQLCYIIGKGTP VIAMTGTDSAILLVGYDDKMITYLNPSTSKKPTEAISVIENMAKGSQGTFFGYVK >gi|330400709|gb|ADLB01000031.1| GENE 49 44067 - 45218 990 383 aa, chain - ## HITS:1 COG:FN0778 KEGG:ns NR:ns ## COG: FN0778 COG0500 # Protein_GI_number: 19704113 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Fusobacterium nucleatum # 2 377 4 387 412 271 42.0 2e-72 MEEIRRILEESLHIDFIQATISNPKKKGGIMKIKVRPVMVKEELLFQCEIFENNQAFHRN YVREEAVSYLAEMMEQFKQMQLETKQSQYTVLVSKKGKVTVKKKQQSGCTKQVDLSHNRS KKYILEEGKRVPFLCDLGVMTQEGKIVRTRFDKFRQINRFLEFIEDVLPQLDKDREITIL DFGCGKSYLTFAMYYYLHELKQYDVRIIGLDLKKEVIRHCNELSEKYGYEKLKFLEGNIA DYTGVEEVDMVVTLHACDTATDYALAKAVGWNAKVILSVPCCQHEANNQIKNETLEPILK YGLIKERISALVTDALRAEYLEREGYESQILEFIDMEHTPKNILIRAIKTGKKGKNADRI ADCEQQLNLSLTLKKLLDNKGEV >gi|330400709|gb|ADLB01000031.1| GENE 50 45327 - 47870 2049 847 aa, chain - ## HITS:1 COG:BH1154_2 KEGG:ns NR:ns ## COG: BH1154_2 COG0642 # Protein_GI_number: 15613717 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 593 835 31 271 274 218 44.0 4e-56 MKKWYKSAFCKGILILLAGISVTVLSICLSIGLVYPGQNYEELFTKKTEKHYEDTRGFSK QVWNETGELLRFLESKNLLEKDGKEDLTQAVTVEDLKNTGELKKGEGEFSFRLQDLLDRL DSGQNEIVVAQREDGTYHYMEFSEFIHASRQNEYLFPNEAETKKQQEAALKSMEEGIYLP SGKVLDKDGNTVYKEVWFYNEPFDATIKTTAGKTMVELANNSEIFNGSLEQMQEALDAMK TRVASLKEVYDSYKSYFAQGNTNLSYLYVNYDTKQLESNQEATWENVGNGLPNMQETDKY VVVAPKLADCSTNMAKDGVALSDWQHMVQSSLNYPENFRLIISVDTDYPIQDSFYNWAES YERYAPYVDSAIVIGAGAVLILLLSVIWLTAVAGRSNQQKELALTRFDKWKTEIWLCLIG CLGAPLVTEVIMAIEYYLYQINNYAESGYMESLRNGRISIAAFMGIGAGTFAAGLLFLVA YLSFVRRIKARNLWKNSICKWFLQSVAYIYKNRSSITKIVFIGSGMLILNLLMTSNSGFF ILLGIAADVYVVIRIAKSNIEKEKIKKGITRIANGDAKYKISSAKLSKDNQEMAMQVNRI GEGIQNAVEKSLKDERLKTDLITNVSHDIKTPLTSIINYVGLLKQEKFDDPKIQRYLDIL DAKSQRLKTLTEDVVEASKISSGNINLEFINLNLVEMIHQTTGEFTEKFEKKDLQAVVNI 
PEEPVIVRVDGRRMWRVIENIYNNAAKYAMPSTRLYVDMVANGKIVVLSLKNVSEYPLNI TADELTERFIRGDVSRSTEGSGLGLSIAKNLTEMQGGKFKIYIDGDLFKVTITFPQMQPT VTKEGEM >gi|330400709|gb|ADLB01000031.1| GENE 51 47883 - 48581 944 232 aa, chain - ## HITS:1 COG:BH1153 KEGG:ns NR:ns ## COG: BH1153 COG0745 # Protein_GI_number: 15613716 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus halodurans # 2 231 5 232 232 275 63.0 5e-74 MFNILVCDDDKEIVEAIEIYLRQEGYNVLKAYDGEEALKVLKREEKVDLLVIDVMMPRLD GIRATLKIREENSLPIIILSAKSEDADKILGLNVGADDYVTKPFNPLELVARVKSQLRRY TKLGSTVVQNNNTVYTVGGLSIDDDLKEVTVDGEPVKLTPIEYNILLLLMKNQGKVFSID QIYESIWNEDAIGADNTVAVHIRHIREKIEINPKEPRYLKVVWGVGYKIEKL >gi|330400709|gb|ADLB01000031.1| GENE 52 48742 - 50112 1275 456 aa, chain + ## HITS:1 COG:L67186 KEGG:ns NR:ns ## COG: L67186 COG0372 # Protein_GI_number: 15672652 # Func_class: C Energy production and conversion # Function: Citrate synthase # Organism: Lactococcus lactis # 25 456 12 441 441 518 58.0 1e-146 MANSIDKVLFEITPKIRQLAELCEQNNAIDKELYTKYEVKRGLRDLNGKGVLAGLTNISD VCAKKEVDGVEVPCEGNLYYRGYNIKDLVRGFLKAEHFGFEEAAYLLLFGNLPNQQELSD FHETLIERRTLPPNFVRDVIMKAPSRDMMNSLSRSILNLYSYDEKADDTSIPNVLRQCLN LISEFPMLMVYSYHAYNYRRGDDLFIYAPSPKLSTAENILMMLREDKQYTKFEATVLDMA LVLHMDHGGGNNSTFTTHVVTSSGTDTYSTIAAALASLKGPKHGGANIKVSQMFDDMKQH ISDWEDEDAVRQYLNDLLDKKAFDQKGLIYGMGHAIYSISDPRADIFKGFVKRLAIEKGR EKEYNLYEMVERLAPEVIGEKRKIYKGVNANVDFYSGLVYGMLGLPKELYTPIFAAARIV GWSAHRLEELKNVDKIIRPAYRPLCPYREYINMEDR >gi|330400709|gb|ADLB01000031.1| GENE 53 50118 - 51056 615 312 aa, chain + ## HITS:1 COG:PA3301 KEGG:ns NR:ns ## COG: PA3301 COG2267 # Protein_GI_number: 15598497 # Func_class: I Lipid transport and metabolism # Function: Lysophospholipase # Organism: Pseudomonas aeruginosa # 5 304 6 301 316 206 34.0 6e-53 MKQHFYYPSQDGKTKIHAIEWIPEGSISAILQISHGMVEYIERYDEFARFLNEQGYYVVG HDHLGHGKSIRSNEDWGYFHEEKGNEYVIGDIHRLRQITQEKYPDIPYFLLGHSMGSFLA 
RQYLTLHGNGLSGAIIMGTGNQPLFLIKSGKILCRLIAFFRGWRYRSHFINNMAFGGYNK KFSPARTPMDWLSLNPENVDKYLSEDCCTFIFTVNGYYHMFRGMEHLAKKQNFNRIPKNL PVFFVAGQDDPVGDFGKGVASVCQKYKDGGIKDVSLKLYKDDRHEILNETDRETVYEDIY QWLETKRISCTQ >gi|330400709|gb|ADLB01000031.1| GENE 54 51111 - 51989 905 292 aa, chain + ## HITS:1 COG:CAC2986 KEGG:ns NR:ns ## COG: CAC2986 COG0030 # Protein_GI_number: 15896238 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Clostridium acetobutylicum # 13 290 5 276 276 267 48.0 1e-71 MNTPKCTLGNPQNTIEILQKYQFSFQKKFGQNFLIDTHVLDKIISSAEITKDDFVLEIGP GIGTMTQYLASAAREVVAVEIDKALIPILSDTLSGFDNVTIINNDVLKVDIGALAQEHNN GRPIKVVANLPYYITTPIIMGLFESNVPIESITVMVQKEVAERMQVGPGTKDYGALSLAV QYYAKPYIVANVPPNCFMPRPKVASAVIRLERHKEPPVSVVDEKLMFKIIRASFNQRRKT LANGLNNSPEIHLPKDVITEAIKELGKGAGVRGEVLTLQEFATLSDNISKRL >gi|330400709|gb|ADLB01000031.1| GENE 55 52017 - 52697 770 226 aa, chain + ## HITS:1 COG:CC1115 KEGG:ns NR:ns ## COG: CC1115 COG2013 # Protein_GI_number: 16125367 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Caulobacter vibrioides # 1 219 23 254 280 163 40.0 2e-40 MQYQIKGETLPVVICHLEAGEKMITEKGSMSWMSPNMLMETGTNGGLGKAFGRMFSGESM FQNTYTSQGGNGTIAFASSFPGSIKAFEISTGNEMIFQKSAFLAAEAGVQLSVHFQKKLG SGLFGGEGFILQRVSGQGTMFAEFDGHVIEYELQPGQQIVVDTGHLAAMTPSCQMDIKTI KGVKNIVFGGEGLFNTIITGPGRVWLQTMPASNVAGALIPYLPTGN >gi|330400709|gb|ADLB01000031.1| GENE 56 52808 - 52924 99 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLMLVSEKVGFLQSRNQPFSLCKKLGTNGGKADISDVG >gi|330400709|gb|ADLB01000031.1| GENE 57 53144 - 54889 1477 581 aa, chain - ## HITS:1 COG:CC1085 KEGG:ns NR:ns ## COG: CC1085 COG4805 # Protein_GI_number: 16125337 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Caulobacter vibrioides # 102 577 51 553 564 128 24.0 4e-29 MKFSFTKKSNRTRHISFLLAVLLAVSLLANHFIPSSIDINKTFEKFTLSLFQQEVSANTI NLHYTLQNPAEYGVLNTPVTYGGFSTDKVSAMASVENCESILKKFPYRKLSKENRLTYDV 
LESYLQLQKEGCNYLLYEEPLNAVTGIQAQLPVLLAEYSFTKEEDIDTYLQLLEATPDYF DALIDFEKEKAEKGLFLSDEVADSVIEQCESVVSQGKRHYLYSTFQNRLETISSLSQEER KQYLLENEKMIQKHVFPSYEKLIRELKSMRGKGEKMTGVCHFPDGKKYYQYIAKRETGSS RKISELKSLIQKQMTEDLLAISQTGVSTEASATVFSLQDNTPESYLKDLEKKITKVFPAA PEVDTEIKFVPTDMESYLSPAFYLIPCIDNTEKNVIYINRSHDMEKLHLYTTLAHEGYPG HLYQTTYYASQNPSPIRSLLNFGGYVEGWATYAEMCSYYLTPLEKKQAVLAQKNNSLLLG LYARADIGIHYDGWSLQDTVKFFSKYGIREEETIAEIYELIVSTPANYLKYYVGYVEFLE LKKEVAGRKGEDFSQKQFHKEILDIGPAPFSVVRKYVLQGK >gi|330400709|gb|ADLB01000031.1| GENE 58 54848 - 54976 56 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTCSVRFFCKREFHKIHPTFFTVYEKKQGVILSKSIILQFLF >gi|330400709|gb|ADLB01000031.1| GENE 59 54954 - 55205 180 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291172357|ref|ZP_06573530.1| ## NR: gi|291172357|ref|ZP_06573530.1| cation/multidrug efflux pump [Filifactor alocis ATCC 35896] # 1 83 91 177 177 63 39.0 3e-09 MPQYQPTHFWCMEENVTLNVTNYNKKEVSVPEKLEGVTSNVQNFCVNVTKDNNVSFKWVN VNECGKTFAEKKADYLLKIKIVK >gi|330400709|gb|ADLB01000031.1| GENE 60 55208 - 55345 223 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKNKKILIGSLLVLTVCLLSGCKKEKDTEFPKDKQGYRKMKHLK >gi|330400709|gb|ADLB01000031.1| GENE 61 55332 - 56024 394 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210612869|ref|ZP_03289502.1| ## NR: gi|210612869|ref|ZP_03289502.1| hypothetical protein CLONEX_01704 [Clostridium nexile DSM 1787] # 1 221 1 215 226 155 43.0 3e-36 MQEKRSLTIVTISTILFGIISKWLVGIPYMMWGHFDLHFILSLVLWVLYSTSLYVSWKMD SRSNESMVKIGLYGLGFGMIVSCIKMGIDAIIILMVGRTNNQILLTFAMEIGIVLFGSMI MIFLSCILQKRKFVWDKSLNKYAGVLGGTIGIYAIAFFYFYSKYQALAPYTDIHSLTESG NINLNIMLGMESVMQYSRTFTMLSMITYVIFFIVFWIILQKEGGRSYEKE Prediction of potential genes in microbial genomes Time: Tue May 24 22:05:04 2011 Seq name: gi|330400656|gb|ADLB01000032.1| Lachnospiraceae bacterium 2_1_46FAA cont1.32, whole genome shotgun sequence Length of sequence - 30792 bp Number of predicted genes - 28, with homology - 20 Number of transcription units - 16, operones - 6 average op.length - 3.0 N Tu/Op Conserved S 
Start End Score pairs(N/Pv)
1 1 Tu 1 . - CDS 2 - 197 131 ##
2 2 Op 1 8/0.000 + CDS 843 - 2738 2027 ## COG0143 Methionyl-tRNA synthetase
+ Term 2739 - 2797 16.1
+ Prom 2818 - 2877 5.0
3 2 Op 2 . + CDS 3088 - 3867 711 ## COG0084 Mg-dependent DNase
+ Term 3989 - 4024 1.4
- Term 4036 - 4083 5.5
4 3 Tu 1 . - CDS 4122 - 4256 109 ##
- Prom 4299 - 4358 2.9
5 4 Tu 1 . - CDS 4445 - 4705 273 ##
+ Prom 4503 - 4562 7.2
6 5 Tu 1 . + CDS 4719 - 4907 138 ##
+ Prom 5078 - 5137 6.5
7 6 Tu 1 . + CDS 5219 - 5392 184 ##
+ Term 5412 - 5452 -0.2
- Term 5400 - 5440 4.3
8 7 Tu 1 . - CDS 5450 - 5968 657 ## COG0700 Uncharacterized membrane protein
- Prom 6094 - 6153 9.4
+ Prom 6071 - 6130 7.9
9 8 Op 1 . + CDS 6293 - 6730 427 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases
10 8 Op 2 . + CDS 6770 - 6847 72 ##
+ Term 6910 - 6965 10.8
- Term 6899 - 6952 16.3
11 9 Op 1 . - CDS 6967 - 8304 1435 ## COG0534 Na+-driven multidrug efflux pump
12 9 Op 2 . - CDS 8384 - 8974 515 ## COG2715 Uncharacterized membrane protein, required for spore maturation in B.subtilis.
- Prom 8994 - 9053 5.3
- Term 9011 - 9069 4.2
13 10 Op 1 . - CDS 9073 - 10224 1247 ## COG0628 Predicted permease
14 10 Op 2 . - CDS 10242 - 12842 2632 ## COG0474 Cation transport ATPase
- Prom 12867 - 12926 7.1
- Term 12890 - 12940 10.5
15 11 Tu 1 . - CDS 13068 - 14180 1619 ## COG3839 ABC-type sugar transport systems, ATPase components
- Prom 14214 - 14273 9.5
- Term 14222 - 14263 6.0
16 12 Op 1 . - CDS 14298 - 14672 417 ## Phep_0230 hypothetical protein
17 12 Op 2 1/0.000 - CDS 14733 - 15572 915 ## COG0313 Predicted methyltransferases
18 12 Op 3 1/0.000 - CDS 15584 - 16324 632 ## COG4123 Predicted O-methyltransferase
19 12 Op 4 1/0.000 - CDS 16314 - 17225 955 ## COG1774 Uncharacterized homolog of PSP1
20 12 Op 5 . - CDS 17228 - 18217 1066 ## COG2812 DNA polymerase III, gamma/tau subunits
21 12 Op 6 1/0.000 - CDS 18260 - 18844 627 ## COG0194 Guanylate kinase
22 12 Op 7 . - CDS 18837 - 20234 1182 ## COG1982 Arginine/lysine/ornithine decarboxylases
- Term 20251 - 20281 -0.3
23 12 Op 8 . - CDS 20304 - 20396 76 ##
- Prom 20442 - 20501 2.2
- TRNA 20307 - 20389 56.5 # Leu CAG 0 0
- Term 20446 - 20494 10.1
24 13 Tu 1 . - CDS 20508 - 23732 3737 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ)
- Prom 23769 - 23828 11.5
- Term 23780 - 23818 3.2
25 14 Op 1 . - CDS 23855 - 27070 2774 ## Sez_0186 alpha-galactosidase
- Prom 27093 - 27152 2.2
26 14 Op 2 . - CDS 27171 - 28655 1586 ## COG0531 Amino acid transporters
- Prom 28900 - 28959 11.8
+ Prom 28787 - 28846 9.9
27 15 Tu 1 . + CDS 28880 - 29578 417 ## HMPREF0868_0216 DNA-binding protein
+ Term 29580 - 29611 1.0
- TRNA 29650 - 29722 86.6 # Lys TTT 0 0
- TRNA 29758 - 29830 81.3 # Phe GAA 0 0
- TRNA 29889 - 29962 75.2 # Met CAT 0 0
- Term 29848 - 29882 -0.8
28 16 Tu 1 . - CDS 29966 - 30130 116 ##
- TRNA 29978 - 30059 54.9 # Tyr GTA 0 0
- TRNA 30064 - 30136 80.3 # Thr TGT 0 0
- TRNA 30162 - 30234 85.3 # Val TAC 0 0
- TRNA 30244 - 30317 87.1 # Asp GTC 0 0
- LSU_RRNA 30387 - 30792 93.0 # CP000885 [D:78327..81226] # 23S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium.
Predicted protein(s) >gi|330400656|gb|ADLB01000032.1| GENE 1 2 - 197 131 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVTEAENLYILLQNLKKLEILTLYNNKYYDATFTYACELLAKNDGIFISPSALTKIMYEN FIPSP >gi|330400656|gb|ADLB01000032.1| GENE 2 843 - 2738 2027 631 aa, chain + ## HITS:1 COG:CAC2991_1 KEGG:ns NR:ns ## COG: CAC2991_1 COG0143 # Protein_GI_number: 15896243 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 2 512 3 524 536 652 59.0 0 MENKKFYMTTAIAYTSGKPHIGNTYEFVLADAIARYKRSQGYDVFFQTGTDEHGQKIELK AEEAGVSPKEFVDGVAAQVKEIADMMNTSYDKFIRTTDEYHEKQVQKIFKKLYDQGDIYK GHYEGMYCTPCESFFTESQLVDGKCPDCGRKVQPAQEEAYFFKMSKYADRLIEHINTHPE FIQPVSRKNEMMNNFLLPGLQDLCVSRTSFKWGIPVDFDPKHVVYVWLDALTNYITGIGY DCDGENNEFFNKNWPADLHLIGKDIIRFHTIYWPIFLMALDLPLPKQIFGHPWLLQGDGK MSKSKGNVIYADELVNMFGVDAVRYFVLHEMPFENDGVITWELMVERMNSDLANTLGNLV NRTISMTNKYFGGNVTDKGVTEEVDADLRAITENTPKAVDEKMDKLRVADAITEIFNLFK RCNKYIDETMPWALAKEDDKKDRLETVLYNLIQSISAGAELLSSFMPETAEKILAQLNGG NVTDKPEILFQRLDLEEVMKKVEELHPPVQEEKGDDVIDIEPKEEITFDDFMKLQFQVGE IIACEEVKKSKKLLCSQVKIGSQVKQIVSGIKAHYTAEEMVGKKVMVLVNLKPAKLAGVL SEGMLLCAEDAEGNLALMTPEKTMPAGAEIC >gi|330400656|gb|ADLB01000032.1| GENE 3 3088 - 3867 711 259 aa, chain + ## HITS:1 COG:CAC2989 KEGG:ns NR:ns ## COG: CAC2989 COG0084 # Protein_GI_number: 15896241 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Clostridium acetobutylicum # 1 250 1 250 253 266 53.0 2e-71 MIFDTHTHYDDEQFNEDRETLLAHLQDNGIGTIVNVGASLQGCQNSIELAEKYPFIYATV GVHPDEVGSLNDETFTWLRQQCDHEKVVAVGEIGLDYYWDNESRDTQKKWFIKQLNLARE KNLPVVIHSREAAQDTLEIMKEYAKGLNGIIHCFSYSVEMAKEYIKMGFYIGIGGVVTFK NSKKLKEVVSEIPLERIVLETDCPYLSPVPNRGKRNSSLNLKYVVEEIASIKDISPEEVI RQTEANACRIYTKCGKSIS >gi|330400656|gb|ADLB01000032.1| GENE 4 4122 - 4256 109 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPERGIQYNNGEKYSCFECNIEDKSVYVYQIKHPSRSCKVIEEK >gi|330400656|gb|ADLB01000032.1| GENE 5 4445 - 4705 273 86 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MTKIEYNNFGEYEGYVGNEKEFWFRKIVLNEVKCHENLRKIGICDNDIARSIKSATKFKR RFKGILTQLNVQTGIEEDLKKGCILQ >gi|330400656|gb|ADLB01000032.1| GENE 6 4719 - 4907 138 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPIKTYYKRFLILLLLENNPAEKQRFQKYHNLKYYIMHSMYLLKQIHHSSDVELFYYPNY LN >gi|330400656|gb|ADLB01000032.1| GENE 7 5219 - 5392 184 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MESQIEETTLLENKYIGFNSGYHGGCCFVTLSDFSNDGVIQYDTLNDAKDNPLFEVY >gi|330400656|gb|ADLB01000032.1| GENE 8 5450 - 5968 657 172 aa, chain - ## HITS:1 COG:CAC0470 KEGG:ns NR:ns ## COG: CAC0470 COG0700 # Protein_GI_number: 15893761 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Clostridium acetobutylicum # 6 171 3 168 173 127 43.0 7e-30 MKLLLYVSDFIVPAVILGIVMYGLMQNVNVYDEFIKGAKNGFLTVIKIMPTLIGLMVAVG ILRASGFLEFLGKLLGQFTSYIGFPGELVPLAIVRMFSSSAATGLVLDIFKQYGTDSQIG LIASIMMSCTETIFYTMSVYFMAVKVKKTRYTLAGAIVATLAGIVASVWLVG >gi|330400656|gb|ADLB01000032.1| GENE 9 6293 - 6730 427 145 aa, chain + ## HITS:1 COG:mlr1969 KEGG:ns NR:ns ## COG: mlr1969 COG0454 # Protein_GI_number: 13471860 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Mesorhizobium loti # 7 145 1 138 141 81 35.0 4e-16 MISLERVNEQNYQHVFALELADNQKGFVSSNMKSLAQAWIYYDRARPYAIRNDNDIIGFI MFDYKPSEKKAEIWRFMIGKDFQGKGYGTEALSSAIRLLANENLFSTIQINYVKGNFPAK HLYEKLGFQETGEMEENEIVMKLYL >gi|330400656|gb|ADLB01000032.1| GENE 10 6770 - 6847 72 25 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNSTLKELFKKRTAHFYTEEVKTFS >gi|330400656|gb|ADLB01000032.1| GENE 11 6967 - 8304 1435 445 aa, chain - ## HITS:1 COG:lin0003 KEGG:ns NR:ns ## COG: lin0003 COG0534 # Protein_GI_number: 16799082 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 1 425 1 425 447 261 38.0 1e-69 MERDMTVGSPGKMILQFTFPLFIGNVFQQLYNMADTIIVGKFVGANALAAVGSTGTIMFL IIGFLQGLTAGFTVPTAQKFGAGDLKAMRKTVGSAAILSAIVSVIMTVVSMLGMKSLLHL 
LNTPSDIFQDAYSYIMVICAGIFAQVLYNLLASILRALGNSKIPLYFLILAAGLNVVLDL LFIIVFHMGAVGAAYATVISQGVSGALCLIYIVKKVPVLKMKKDDWKMNGHLVKVQFGVG FPMALQYSITAIGSMMMQSALNILGSVVIAGFTAGSKIEQLVTQAYVALGTTMATYSAQN MGAGKIDRIRKGFKNAMIIGTIYSIVTSIFVMTAGKYLTPLFLSENLGEIMGYVDIYLKC VGMFFIPLAVVNIFRNGIQGMGYGLLPMMAGVAELIGRGVVAIIASQKRSYVGVCMASPV AWILAGGLLLVMYYFIMRRAKSFEK >gi|330400656|gb|ADLB01000032.1| GENE 12 8384 - 8974 515 196 aa, chain - ## HITS:1 COG:BH1574 KEGG:ns NR:ns ## COG: BH1574 COG2715 # Protein_GI_number: 15614137 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein, required for spore maturation in B.subtilis. # Organism: Bacillus halodurans # 1 183 1 181 197 159 46.0 3e-39 MLNYLWGGMIIVGIIFGVFSGKMPDVTNGALDSAKEAVDLCVMMIGIMSMWVGIMEIATK SGLIDTISRKIRPLIHLLFPGVPKGHLAEKYITTNMIANFLGLGWAATPAGIMAMKHLQE INPNRDKRVASCDMCTFLIVNISSLQLIPINVIAYRSQYGSVNPAGIIGAGIVATTISTV AGILFAVIMAKKSAQQ >gi|330400656|gb|ADLB01000032.1| GENE 13 9073 - 10224 1247 383 aa, chain - ## HITS:1 COG:CAC0730 KEGG:ns NR:ns ## COG: CAC0730 COG0628 # Protein_GI_number: 15894017 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Clostridium acetobutylicum # 5 383 4 383 383 216 37.0 7e-56 MNLDKETIRKIRGLIVFTIIVLMIIWNYKVFFELLGYVLNIILPFLIGAAIAFVLNVPMH FIEEKVFGNKTAKSSKFLQKIARPISLVMTLLFVFGIVCLVVFIVAPELANTIAGLGKTI QTFIPEVQKWAEDIFHNNKEILSWIQNLEFNWDKMIAGGIDFFKSGAGSVLDSTFAVAKS IVSGMTTFAIAFIFACYILLQKEKLNVQVKKLGYAFIPRDWVEILIAISSMAYKTFSNFL TGQCVEAIILGSMFFVAMTIFNMPYALLVGVLIAFTALIPIFGAFIGCAIGAILILMVNP VQALGFLVLFFVLQQIEGNLIYPHVVGNSVGLPSIWVLVAVSVGGSLMGIVGMLVFIPIS SVVYTLLKGIVNRRLERMKIKVE >gi|330400656|gb|ADLB01000032.1| GENE 14 10242 - 12842 2632 866 aa, chain - ## HITS:1 COG:FN1022 KEGG:ns NR:ns ## COG: FN1022 COG0474 # Protein_GI_number: 19704357 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Fusobacterium nucleatum # 23 866 22 862 862 816 54.0 0 MEKYYQLSPEEVRMKVNGKLEPLTDEEVLKHQEQYGKNELVEGKKKSTFQIFLEQFKDFL 
VIILIVAAIVSGFLGDVESTAVILIVITMNAILGTVQTVKAEQSLNSLKKLSGPTAKVLR NGTVIQIPSSELTIGDEVMLEAGDYVPADGRILQNASLKIDESALTGESLGVEKSEEIIS KEVPLGDQTNMVFSGSFVTYGRGSFIVTSIGMETEVGKIATLLKTTSEKKTPLQMNLDQF GQKLSIIILIFCAILFGISVFRGDSVGDAFLFAVALAVAAIPEALSSIVTIVLSFGTQKM AKEHAIIRKLQAVEGLGSVSIICSDKTGTLTQNKMTVEDYYVNGQRIFAKDIDRQAQGQK QLLRYSILCNDSTNVDGVEIGDPTETALINLGSKLGDEAGYVREKFPRMSEIPFDSDRKL MSTAHVLENGPVMITKGAVDVLLTRMNRIWKNGEVHELTDEEKNAIEKQNQEFSRGGLRV LAFAYKDIEEGHQLTLEDEQDLIFVGLISMMDPPREESAQAVAECIRAGIKPIMITGDHK ITAAAIAKRIGILKEESEACEGSEIDNLSDEELKNFVEGISVYARVSPEHKIRIVRAWQE KGNIVSMTGDGVNDAPALKQADIGVAMGITGSEVSKDAAAMVLTDDNFATIIKAVENGRN VYKNIKGSIQFLLSGNFGGILAVLYAAIAGLPVPFAPVHLLFINLLTDSLPAIALGLEPH TKAVMDEKPRPMNESILTKDFITKIGIEGLSIGVTTMIAFMIGYRSGNAVLASTMAFGTL CTARLVHGFNCKSDRPLIFTKRFFNNIYLIGAFLLGLVLITSVLMIPALDSIFKVQTLSI SQLMTVYGLALVNLPIIQFLKFIRKK >gi|330400656|gb|ADLB01000032.1| GENE 15 13068 - 14180 1619 370 aa, chain - ## HITS:1 COG:BH1140 KEGG:ns NR:ns ## COG: BH1140 COG3839 # Protein_GI_number: 15613703 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport systems, ATPase components # Organism: Bacillus halodurans # 1 368 1 364 365 472 65.0 1e-133 MASLSLKNINKVYPNGFVAVKDFNLEIEDKEFIIFVGPSGCGKSTTLRMIAGLEDISSGE LRIGDKLVNDVEPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRKVPKEEIDKMVREAAK ILDLEALLERKPKALSGGQRQRVAMGRAIVRNPKVFLMDEPLSNLDAKLRGQMRIEISKL HQRLGTTIIYVTHDQTEAMTLGTRIVVMNAGVVQQVDTPQVLYDTPCNLFVAGFIGSPQM NFVDAVCKVNGNKVALQAGPASIELPPAKAKKLIEGGYDGKTVVLGIRPEDIHDEQMFIE SSPNTVIEATIRVYEMLGAEVYLHFDYENASMTARVNPRTTARTGDTVKFALDAEKIHVF DKETERTITN >gi|330400656|gb|ADLB01000032.1| GENE 16 14298 - 14672 417 124 aa, chain - ## HITS:1 COG:no KEGG:Phep_0230 NR:ns ## KEGG: Phep_0230 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 19 120 23 122 161 87 48.0 1e-16 MLLEALGIVTTGVGLVISLGVLVLTIVASWKLFEKAGLQGWKAIIPIYSTYCLYQMAFGK GKGWYIICLLVPCVNVILSIVYCVNLAKSFNKGTGYALGLIFLNTIFMMILAFGDAQYVG ERLR >gi|330400656|gb|ADLB01000032.1| GENE 17 14733 - 15572 915 279 aa, 
chain - ## HITS:1 COG:CAC0307 KEGG:ns NR:ns ## COG: CAC0307 COG0313 # Protein_GI_number: 15893599 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Clostridium acetobutylicum # 3 275 4 276 282 276 50.0 3e-74 MAGKLYLCATPIGNLEDMTYRVVRTLQEADLIAAEDTRNSIKLLNHFEIKTPMTSYHEYN KIEKGKKLVEKLQSGMNIALITDAGTPGISDPGEELVKMCYESGIEVTSLPGAAACITAL TLSGLSTRRFAFEAFLPTDKKEKQEILKELTNETRTMILYEAPHRLIKTLKELRDTVGNR KITICRELTKKHETAFATTLEEAISYYESNEPKGECVLVLEGKSRTEIREEEISRWEEMS VEEHMEYYLSQGIEKKEAMKRVAKDRGVGKREIYQALLR >gi|330400656|gb|ADLB01000032.1| GENE 18 15584 - 16324 632 246 aa, chain - ## HITS:1 COG:CAC0306 KEGG:ns NR:ns ## COG: CAC0306 COG4123 # Protein_GI_number: 15893598 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Clostridium acetobutylicum # 5 244 4 243 244 224 46.0 2e-58 MENNLKPEERLDDLQVKGYHIIQNPSKFCFGMDAVLLSNFARVKKGEKVLDIGTGTGIIP ILLEAKTEGEHFTGLEIQEESADMARRSVAYNHLEDKIDIVTGDVKEAVNLFGSVFFDVV TTNPPYMIGAHGLQNKDSAKAIARHEVLCDLDDILRESAKVLRPGGRFYMVHRPFRLAEI LSKMCAYKIEPKRMRLVHPYIDKEPNMVLIEGSRGGNSRMTVEPPLIVYREKNVYSEELL GEYGLK >gi|330400656|gb|ADLB01000032.1| GENE 19 16314 - 17225 955 303 aa, chain - ## HITS:1 COG:CAC0301 KEGG:ns NR:ns ## COG: CAC0301 COG1774 # Protein_GI_number: 15893593 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Clostridium acetobutylicum # 1 285 1 291 303 330 59.0 3e-90 MTKVIGVRFRTAGKIYFFSPAKLNIKKGDKVIVETARGVEFGSVVMNPKDVEDDEITQPL KSVIRVATEEDKRIEERNKEKEKEAFEICLEKIRKHGLEMKLIDAEYTFDNNKVLFYFTA DGRIDFRELVKDLAAVFRTRIELRQIGVRDETKIRGGIGVCGRPLCCHTYLSEFAPVSIK MAKEQNLSLNPTKISGVCGRLMCCLTNEEETYEYLNSRLPAIGDTVTTVDGLRGEVQNLS VLRQLVKVIVTLDNDEKEIREYKVSELKFKPRRKKADVKLSKEEMRELAALEKGEGESKL DGE >gi|330400656|gb|ADLB01000032.1| GENE 20 17228 - 18217 1066 329 aa, chain - ## HITS:1 COG:BS_dnaX KEGG:ns NR:ns ## COG: BS_dnaX COG2812 # Protein_GI_number: 16077087 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, 
gamma/tau subunits # Organism: Bacillus subtilis # 4 275 15 305 563 141 32.0 2e-33 MKGFKDVIGHNDIIQYIQNAVSQDKVSHAYILNGERGSGKKMLADLFAMTLQCEEHTPNP CGECHSCKQAKSGNHPDIIHVTHEKPNTISVDDIRTQVNNDIVIKPYSSPYKIYIIPEAD LLSVQAQNALLKTIEEPPAYAVIFLLTENAESLLPTIMSRCVMLKLRNIKTTLIKKYLME QMQIPDYQADICAEFAQGNMGRAIMLASSEHFNEIKEEALQLLKHINEMEISEIVSAIKK IGTYKLSINDYLDIIMIWYRDVLIYKATKDVNGIVFADQLRYIKDRANKSSYEGIETILE SLEKAKARLKANVNFDLVMELLLLTIKEN >gi|330400656|gb|ADLB01000032.1| GENE 21 18260 - 18844 627 194 aa, chain - ## HITS:1 COG:CAC0298 KEGG:ns NR:ns ## COG: CAC0298 COG0194 # Protein_GI_number: 15893590 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Clostridium acetobutylicum # 1 194 1 195 195 192 53.0 3e-49 MGKIFYVMGKSSSGKDTIFKKIQERLPELKTIVLYTTRPIREGEKDGVEYYFVGDKELEM FQREGKIIELRSYNTVHGKWNYFTVADNQIQLEESSYLVIGTLVSYEKMKEYFGGENLVP VYIEVEDGERLARAVERERKQESPKYAELCRRFLADTEDFSEENLKKQGIKRRFYNENVE KCVDEIVLCIKEKL >gi|330400656|gb|ADLB01000032.1| GENE 22 18837 - 20234 1182 465 aa, chain - ## HITS:1 COG:FN0501_1 KEGG:ns NR:ns ## COG: FN0501_1 COG1982 # Protein_GI_number: 19703836 # Func_class: E Amino acid transport and metabolism # Function: Arginine/lysine/ornithine decarboxylases # Organism: Fusobacterium nucleatum # 17 453 27 476 503 289 35.0 6e-78 MLYDRLKAYRESDYYGFHMPGHKRNENKFPIGLPYGIDITEIEGFDDLHHAAGILKSAEE RASTLYKAEESHFLVNGSTVGILSAILGVTERGDKILVARNCHKSVYHAIYMNELEPIYI YPKYEKDLQINGEISCKEIEQILDKEKDIKAVVIVSPTYDGVVSDVENIAKTVHKHDIPL IVDEAHGAHFGFHKKFPKNANEKGADIVIHSVHKTLPAMTQTALIHLNGNRVNRESIGNY LHMLQSSSPSYVLMASIDYCMDILEKSGESLFDEYVEEIEQLRQELEELKHLHIIRTENF DISKFIISVKDTNMTSGELSRILLEKYHLQMEMTAGTYVLAMTTVGDTKEGLQRLKKAMF EIDEELKSEKKVGEDLELPHLPLIYKSAEVLKKAQKSGLSYVKWEEAEGKVAGEYAYVYP PGIPFLVPGEKITKEAVRCLKRYEELQFTIEGIAVEKHIGVWKNG >gi|330400656|gb|ADLB01000032.1| GENE 23 20304 - 20396 76 30 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTCGSVGMADELDSGSSVGYYVWVQVPSSA >gi|330400656|gb|ADLB01000032.1| GENE 24 20508 - 23732 3737 1074 aa, chain - ## HITS:1 COG:AF1274 KEGG:ns NR:ns 
## COG: AF1274 COG0458 # Protein_GI_number: 11498873 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Archaeoglobus fulgidus # 1 1070 1 1075 1076 1079 51.0 0 MSQRKDIHKVLIIGSGPIIIGQACEFDYSGTQACKALKSLGYEIVLVNSNPATIMTDPET ADVTYIEPLNVERMEQIIAKERPDALLPNLGGQSGLNLCAELAQAGVLEKYNVEVIGVQV DAIERGEDRIEFKKTMDSLGIEMARSEVAYSVDEALAIADKLGYPVVLRPAYTMGGAGGG LVYNKEELKTVCARGLQASLVGQVLVEESILGWEELELEVVRDSEGNMITVCFIENIDPL GVHTGDSFCSAPMLTISEEVQKRLQEKSHKVVDSVQVIGGCNVQWAHDPKTDRDIIIEIN PRTSRSSALASKATGFPIALVSAMLATGLTLKDIPCGKYGTLDKYVPGGDYIVLKFARWA FEKFKGVEDKLGTQMRAVGEVMSIGKTYKEALQKAIRSLETGRYGLGHAKDFDKKSKDEL LKMLVTPTSERHFIMYEALRKGATIDEIHERTKVKAYFIEQMKELVEEEEELLKSKGNLP SDEALITAKKDGFSDKYLSQILDVTETEIREKRIALGVEEAWEGVHVSGTKDKAYYYSTY NSEDKNPIREEKPKVMILGGGPNRIGQGIEFDYCCVHASLALRKLGFETIIVNCNPETVS TDYDTSDKLYFEPLTLEDVLSIYKKEKPVGVIAQFGGQTPLNLAADLEKNGVKILGTAPS VIDLAEDRDLFRAMMEKLEIPMPESGMAVNVEEAVEIAERIGYPVMVRPSYVLGGRGMEV VYDSESMAGYMKAAVGVTPDRPILIDRFLNHALECEADAISDGTHAFVPAVMEHIELAGV HSGDSACIIPSKHISEENIKTIKEYTRKIAEEMHVKGLMNMQYAIENGKVYVLEANPRAS RTVPLVSKVCNIKMVPLATEIVTSELTGKPSPVPQLKEQYIPHYGVKEAVFPFNMFQEVD PVLGPEMRSTGEVLGLSSFYGEAFYKAQEATQKKIPLEGTVLMSINDKDKPEVVEVAKEF IECGFKILASKNTCKLIQDAGMEAERVYKLNEGRPDMLDLITNGKVDLIINTPIGQDRQE DDSYLRKAAIKAKVPYMTTMAAAKATVSGIKSMRKPGCGEVKSLQELHAQIRDL >gi|330400656|gb|ADLB01000032.1| GENE 25 23855 - 27070 2774 1071 aa, chain - ## HITS:1 COG:no KEGG:Sez_0186 NR:ns ## KEGG: Sez_0186 # Name: not_defined # Def: alpha-galactosidase # Organism: S.equi # Pathway: not_defined # 47 1070 148 1105 1326 337 28.0 1e-90 MKKKKLMRLFAGLLVITMLAPNSIVSVYAEEPVQNEIQQNKNTKIQYLSDKQEVEKRVGY GSFGKDRNTEMKQNGEGLQVKTRGEKVTFPKGIFAHAPSTVIYDVSELKDKYPNFAGYLG IDSRSGGGNGVIFKISVSDDKNSWEEIYNSGVVTINDAVYVNLSMKGKKYIKLEADSNGE NSRDHVVYADAGFVTNDYTPSENYNAPIKTVAEYDEELSKINFRDKNAVNNNTQTIYQRE LVNQAGFYTINKVYEMEDERGNKLYKDAIEYLLKDKEALYYYINGGPKPSQGTYENSLIA 
FGKLYNAYKGELDNNSENDLNIRLAVSVASAYANPKTVRFWTENDNPNSGVEEVQKEDPV RRYATYKKLSEQGNYMDKISTLAKEPSRNMGNAIWSGKQFKELSVPMMRWVVDSRMHEDE FEWLAQYITQWASEEENKNKNFLDSYMYVHNKQGNWQYTDEKYYSQQKRAEWNKKYKFDD FKSFDDNAKYGTKDLIRSWIVWEEGGVCGAYAKTYANLAEVAGRPSIVTGQPAHAAALTW QWVSDGGPDHKGQYEWRIQNNAWSLRETSSEYEDYLLGWGNRRKMGNIDTNRNRASSYTL LATDVIQDWDAYVMAKKYTLLANSLKDFSAKREVYYRALTESPRYLDATYGMLDMYLNKP DLTSAELNRFMKETASRYTYYPMVMADLLREIELSGKLTDPVHMAELYMERQRALEEAYT LRKDTKNLTADDYEKTRQPWYAGDVARAILEKDHSRIASFSFDGDNAGKIVLTKNLTEKG LKMKYSLDGGKTWKTSDKDVISLTNDELKSITVENDIQVTVEGATEDMYGRLPICVIDIM EQEAPKNIEANDKEDLLLGDVSNLEYSEDGGNTWKPYEQNGLKNQNRFTGEKKVTVRRGS HGQYVASKTAEYQFKNADDKNEEKYLQLQYVSMHEFSSQQNEGNQAAKNLIDGQRNTNWH SKWDVQDKKEYSVKFDKARRISKLEYVPSVGGSNGRWEEVEIYGSNDGKTWTKIGKSGQL ANDVNAKEIKVDSSKAWQYIKVKGLHSYSHDGNKDKYFSGSMLNFFEDTTK >gi|330400656|gb|ADLB01000032.1| GENE 26 27171 - 28655 1586 494 aa, chain - ## HITS:1 COG:yjeM KEGG:ns NR:ns ## COG: yjeM COG0531 # Protein_GI_number: 16131981 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Escherichia coli K12 # 1 492 15 504 514 454 49.0 1e-127 MSEKQKKLSMGALILMIFTSVYGFNNMPRSFYLMGYGAIPFYLLSAVVFFVPFAFMIAEY GSAFKDEKGGIFSWMQICVNTKYAFVATFMWYTSYVIWLVNIASGIMVPLSNAIFGKDTT QEWSIFGLSSTQTLGILGVLFIVFVTFFASKGMKNIQKVASIGGIACMFLNVMLIGGALL VLIGNKGELAQPVVSAASLVESPNPEYAGSLAMLAFLVYALFAYGGTEAVGGLVDETENP EKNFGKGLTIAAIIVAVGYSIGIFCVGIFTNWSETLSAATVHKGNASYVIMNNLGYQIGA AFGASQSVCIQMGDWAARIMGISILLSLAGAVFTLCYSPLKQLIEGSPKGLLPEKFCKME NGMPKFALKIQAVVVVVFILLISMGGDTMTQFFNILVSMTNVAMTLPYMFISAAFIKFKK NKDIHKPFVIFKNDTISIVFTILVTAIVGFANFFTIIQPAMAGDVATTVWSILGPVLFTV TALILHARYEKRNK >gi|330400656|gb|ADLB01000032.1| GENE 27 28880 - 29578 417 232 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0868_0216 NR:ns ## KEGG: HMPREF0868_0216 # Name: not_defined # Def: DNA-binding protein # Organism: Clostridiales_BVAB3 # Pathway: not_defined # 7 226 14 243 243 87 28.0 3e-16 MTMNETVGNNIKKYRIAHNYTLKEMSALIHKSCSTLSKYEKGIIPITADTIEEFAQIFHI 
LPSQLLAVPYEDMLPIEKTDFISKHYMYSYDGKRKRIMKSILEEFSISNSENTFVELFYD IENFKNPEKCKVIYAGESVKFGPWQNYHLRNQSHSNEEIWLCSMDTFSHNNKIGILAGIS SVTMYPCARKILIASDIQKENEVLNKLIFSKEDIQLFKKYNFFSISQFMTEE >gi|330400656|gb|ADLB01000032.1| GENE 28 29966 - 30130 116 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MAQLAEQLICNQQVIGSSPIVGFMDGFPSGQRGQTVNLLAPPSKVRILLHPLIR Prediction of potential genes in microbial genomes Time: Tue May 24 22:06:01 2011 Seq name: gi|330400618|gb|ADLB01000033.1| Lachnospiraceae bacterium 2_1_46FAA cont1.33, whole genome shotgun sequence Length of sequence - 14060 bp Number of predicted genes - 16, with homology - 15 Number of transcription units - 5, operones - 4 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 216 - 274 98.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. - TRNA 602 - 691 61.7 # Ser GGA 0 0 - TRNA 743 - 828 61.8 # Ser TGA 0 0 1 1 Op 1 . - CDS 985 - 1467 229 ## COG3663 G:T/U mismatch-specific DNA glycosylase 2 1 Op 2 . - CDS 1460 - 2167 401 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 3 1 Op 3 . - CDS 2169 - 3197 832 ## PROTEIN SUPPORTED gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase 4 1 Op 4 . - CDS 3208 - 4119 886 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 5 1 Op 5 . - CDS 4130 - 4360 238 ## Cphy_0349 hypothetical protein - Prom 4401 - 4460 4.1 6 2 Op 1 20/0.000 - CDS 4471 - 4917 207 ## PROTEIN SUPPORTED gi|238926143|ref|ZP_04657903.1| SSU ribosomal protein S18P alanine acetyltransferase 7 2 Op 2 12/0.000 - CDS 4914 - 5636 219 ## PROTEIN SUPPORTED gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase 8 2 Op 3 . - CDS 5641 - 6069 545 ## COG0802 Predicted ATPase or kinase 9 2 Op 4 . - CDS 6083 - 6589 580 ## Cphy_0345 hypothetical protein - Prom 6613 - 6672 3.9 - Term 6611 - 6653 7.6 10 3 Op 1 . 
- CDS 6678 - 7013 429 ## COG3870 Uncharacterized protein conserved in bacteria 11 3 Op 2 . - CDS 7076 - 7678 703 ## COG3601 Predicted membrane protein - Prom 7736 - 7795 5.1 12 4 Op 1 . - CDS 7950 - 9962 1293 ## EUBREC_1883 hypothetical protein 13 4 Op 2 . - CDS 9975 - 10844 839 ## COG1131 ABC-type multidrug transport system, ATPase component 14 4 Op 3 . - CDS 10857 - 11981 991 ## 15 4 Op 4 . - CDS 11986 - 13401 1105 ## COG1653 ABC-type sugar transport system, periplasmic component - Prom 13565 - 13624 5.9 + Prom 13470 - 13529 10.0 16 5 Tu 1 . + CDS 13659 - 13943 453 ## EUBREC_2191 hypothetical protein + Term 14018 - 14057 -0.6 Predicted protein(s) >gi|330400618|gb|ADLB01000033.1| GENE 1 985 - 1467 229 160 aa, chain - ## HITS:1 COG:Cj1254 KEGG:ns NR:ns ## COG: Cj1254 COG3663 # Protein_GI_number: 15792578 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Campylobacter jejuni # 6 158 7 155 160 144 43.0 7e-35 MTKEVHHIEPVFDENSRILILGSFPSVKSREGNFFYHHPQNRFWKVIAAVTGCEIPCTIE EKKTLLLKNRIAIWDVIASCEIEGSSDSSIKNVVPNDIDRILNMGKIENIYTNGGTASRL YRKYCQKKTGMEDVKLPSTSPANASYSLERLVKEWHIITG >gi|330400618|gb|ADLB01000033.1| GENE 2 1460 - 2167 401 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 4 229 2 223 234 159 41 1e-38 MSKKPCTAVVLSAGQGRRMGTSVQKQYIQLDGKEIICHTLETFQQSSVIDDIVLVVGKNQ EEYCQKELVEKYHFTKVQKIVVGGEERYHSVFNGLREITHQGYVFIHDGARPFVSEEIMQ RAYDAVCRFGACVVGMPVKDTVKIADKETFISETPNRSFVWQVQTPQVFQIDMVKEAYEK MMKSGYTQATDDAMVVEKMLGKKVKLVEGSYENIKITTPEDLDIAKIFLMRKHHD >gi|330400618|gb|ADLB01000033.1| GENE 3 2169 - 3197 832 342 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229232313|ref|ZP_04356740.1| (SSU ribosomal protein S18P)-alanine acetyltransferase [Cryptobacterium curtum DSM 15641] # 2 341 512 858 860 325 48 1e-88 MSETKDVLILAIESSCDETAAAVVKNGREVLSNVISSQIELHKLYGGVVPEIASRKHIEK INQVIEEALEEANVTLDDVDAIGVTYGPGLVGALLVGVAEAKAIAYAKRKPLIGVHHIEG 
HISANFIENKELEPPFICLVISGGHTHLVCVKDYGEYEIIGRTRDDAAGEAFDKVARAIG LGYPGGPKIDKLSKEGNPDAITFPKAHINDAPYDFSFSGVKSAVLNYINGCQMKGETFNQ ADVAASFQKAVTEVLVENAMRAVDEYDMKKLAIAGGVASNSTLRQAMKDACEKKEIEFYY PSPIFCTDNAAMIGVAAYYEYLKGTRHGWDLNAVPNLKLGER >gi|330400618|gb|ADLB01000033.1| GENE 4 3208 - 4119 886 303 aa, chain - ## HITS:1 COG:CAC1584 KEGG:ns NR:ns ## COG: CAC1584 COG1234 # Protein_GI_number: 15894862 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Clostridium acetobutylicum # 1 292 5 302 313 346 52.0 3e-95 MLDVCLLGSGGMMPLPYRWLTSLMTRFNGSSLLIDCGEGTQIAIKEKGWSFKPIDVICFT HYHGDHISGLPGLLLTMGNADRREPLTLIGPKGLERVVTALRVIAPELPFPINYVEIEGA EQTFELNGYRLTAFRVNHNVLCYGYTLEIDRAGKFDVKRAMEQEIPKQYWKFLQRGETIR EDGKIYTPDMVLGAQRKGIKLTYTTDTRPTESIKTHAAKSDLFICEGMYGEKEKAEKAVE YKHMTFTEAGKLAKEADVKEMWLTHYSPSLTKPEEFMDDVRKIFPNARAGKDCMSVTLEF AEE >gi|330400618|gb|ADLB01000033.1| GENE 5 4130 - 4360 238 76 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0349 NR:ns ## KEGG: Cphy_0349 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 9 74 5 72 80 65 48.0 4e-10 MKQYKTKEILEAKKVICNKCGKEIVVENGRLSEDMLQVEKRWGYPSEKDNEVHEFDLCEQ CYDEFTATFLIPVERR >gi|330400618|gb|ADLB01000033.1| GENE 6 4471 - 4917 207 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238926143|ref|ZP_04657903.1| SSU ribosomal protein S18P alanine acetyltransferase [Selenomonas flueggei ATCC 43531] # 1 141 1 141 163 84 34 4e-16 MIVLQEMEEKDSEQIAKLEREIFKDSWTQSGIEETWRETHSFIVVAKEEDEVIGYCIVYC VLDEAEIARIAVRETKRHEGVGSKLLAKTEENCRRLNVERMLLEVRESNAAARKFYAKQQ FAEDGIRKNFYNNPRENAVLMSKMICLQ >gi|330400618|gb|ADLB01000033.1| GENE 7 4914 - 5636 219 240 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|238855674|ref|ZP_04645973.1| ribosomal protein ala-acetyltransferase [Lactobacillus jensenii 269-3] # 42 231 1 186 380 89 31 3e-27 MRILALDSSGLVATVAIVEEEQTVAEYTVNYKKTHSQTLLPMLDEIVKMTDMDLQTIDAI AIAGGPGSFTGLRIGSATAKGLGLALDKPLIHIPTLEGMAYNLYGSSSVICPIMDARRNQ 
VYTGVYRFSEGKLEVLEEQTAIAVEELIQKLNEKGEKVIFLGDGVPVYAEQLKEGLTIPF AFAPANMNRQRASSVGLLGIEYFKQGKTETAREHQPDYLRVSQAERERNEREKAKELKNA >gi|330400618|gb|ADLB01000033.1| GENE 8 5641 - 6069 545 142 aa, chain - ## HITS:1 COG:CAC2838 KEGG:ns NR:ns ## COG: CAC2838 COG0802 # Protein_GI_number: 15896093 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Clostridium acetobutylicum # 6 138 6 137 152 131 50.0 4e-31 MIIETNNAKETFELGVQIGREAKAGDVYTLVGDLGVGKTVFTQGLAKGLEIEEPISSPTF TIVQVYEEGRLPFYHFDVYRIGDVEEMDEIGYEDYIYGQGVCLIEWSNLIEEILPEKRRE ITIEKDLEKGFDYRKITIAERE >gi|330400618|gb|ADLB01000033.1| GENE 9 6083 - 6589 580 168 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0345 NR:ns ## KEGG: Cphy_0345 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 1 165 5 167 168 88 30.0 1e-16 MTIKFKVNMTDRVMYDFLLYHNYTSMTGLIGSILGVLLLGVAITKGMAGDIQTAVLFFAI SIFVLMSTPMTLKATAKNQVKNTPMFQEPLEYEISEEGITVSQHEESALNEWKDFAKVVS TSKSLILYITRVRAIILPREAMGDDYMKVVEMISKNVPAKRVKIRHTR >gi|330400618|gb|ADLB01000033.1| GENE 10 6678 - 7013 429 111 aa, chain - ## HITS:1 COG:lin2840 KEGG:ns NR:ns ## COG: lin2840 COG3870 # Protein_GI_number: 16801900 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 1 108 1 106 109 98 45.0 2e-21 MKLIFAVVHAEDAKGAAKELNKKQFGVTKLSSSGGFLRKDNCTLMIGTEDEKVGEVMDVL KETCAKREEVEIVTPYLSEGMQIPNYAYTPIKVEAGGAIVFVVDVAEFWKI >gi|330400618|gb|ADLB01000033.1| GENE 11 7076 - 7678 703 200 aa, chain - ## HITS:1 COG:CAC2841 KEGG:ns NR:ns ## COG: CAC2841 COG3601 # Protein_GI_number: 15896096 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Clostridium acetobutylicum # 11 196 9 192 209 144 40.0 1e-34 MSTKKVNARKLVSIGMLSAVAVILMQFEIPLPFAPAFYQIDFSEIPVLVGSFALGPLASV TIEFVKVLLNMLISGTTTAGVGDVANFLIGCGLSVPAGIIYHQTKTKKGAFIGMIAGTGC MTILGCFLNAYVLLPVYAKAFELPIDSLIQMGTEVNDSITGLATFVIFAVAPFNLLKGVL VSLVVLLIYKKISPIFKMKR >gi|330400618|gb|ADLB01000033.1| GENE 
12 7950 - 9962 1293 670 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_1883 NR:ns ## KEGG: EUBREC_1883 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 62 523 60 533 701 159 27.0 3e-37 MWKYEWKKLLGNKAILSMIIGCFLLNGFLVFRQGTQYDEEKRSSPDNIHRIYEEMDSIEN KQEWLQKEIEKEENAEDMGNYARRSALIYVLESVEEVENYDGYLENIKIQAKRISRSALF SDTDSFSKRNASQIPKKYDKFEGMTLEVADSQGALLATQSEMTDVLFVVLAVLLTYFFIS MEREEGTLAFVRCTKNGGRELGVRKIAIILAGSLVGMLLLYVQNFIIMANLYGFGNLGRP VQSVNGFIGSAWKINLGQYFLLFLCGKLFVACVFVGVITWICLKGKNILRTSVVLVLLVG VEWVFYAKIPSNSWLGILKWCNLFSFIRTENFFQTYETVNLFSYPVSSVIVCSIAGGILF IICIVQSILCYEKVSRGEYAQKKEKKKREGRQTKGHGIFFYEFRKVMWINGAGVVLLLFT IGYFMTVSGEKIYFSQDELYYKHYLKELEGEMTAEKTEFIQKEDERIKKLEESEESFDKE RMLQCKPAFDQVKMQAERIGEKGVFLNEIGFNYLFDKTTFIGRTGLLLLIIMLAFFQTFV MEQIAGMDTVWNTIPEGRKKIQRRKWVILIVSVLLLSVFIEGMFIVHGVKGQQLSGLSEQ IKYISAFEGFGNMPIAVYLFLRWGFECLAGIGGLIVIAGVSKKVKNTATVLLVSGGIAGL FYLCMRFIII >gi|330400618|gb|ADLB01000033.1| GENE 13 9975 - 10844 839 289 aa, chain - ## HITS:1 COG:all2672 KEGG:ns NR:ns ## COG: all2672 COG1131 # Protein_GI_number: 17230164 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Nostoc sp. 
PCC 7120 # 3 284 2 279 293 180 37.0 2e-45 MELEIRNLSKRYGKKQALSNVSCKLNDGVYGLLGANGAGKSTLMNIITGNIEADEGEVLL DGKNILKSENQREFREYLGYMPQFPIQYRGFTVLDFLFYMASLRGISKKKAKEKIDELLD MLALTEVRNKKMTALSGGMKQRLMLAQSVIGNPKVLILDEPTAGLDPKQRIAVRNLISEI AFQKIVLIATHVVTDVEFISKEIVLLKDGQVLKKAERETLTANLEGKVYEIKVPVSELKR IQQTYLVGNIVKEREWIYVRIVSDTLPQEKNVTIVRPSLEDVYLYYFRE >gi|330400618|gb|ADLB01000033.1| GENE 14 10857 - 11981 991 374 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKKICLIISAALFITMSAACKREEKAKEKPENYSVVTNMRMQVKEDGIFYLDGYDKLMK FYDYELKENMPVCDKPNCKHRSLKCNAYIENGFNSTVGNYRGKLYFFKPASEEFSLYVSD ANGSNRKELAKLNEGGKHMTCMIKAPMWFIEDKIYLGVEYSDLTNNPENPEKLVWQFISI SLEDGKIDVLKETEEIDEMTGYVDVLSYKDGQLLYYAGNSLYLYDMETKQTDMILTDVTK TRWCLGTDKEQKNMYYSEENEKYSEVYKLDLQTKEKTSIVKKEKNGKELVWNYWGGKFYY SLFDKEGSPVDGELRIYDAEKKTEKKISKEEYLYSPQYVAGDWYVSMTEEGIVCIKKQDY EKKNWDKLQVMGRF >gi|330400618|gb|ADLB01000033.1| GENE 15 11986 - 13401 1105 471 aa, chain - ## HITS:1 COG:BH0796 KEGG:ns NR:ns ## COG: BH0796 COG1653 # Protein_GI_number: 15613359 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Bacillus halodurans # 44 468 94 497 500 76 25.0 9e-14 MKKALAGILILSLICCTGCSTTKKNKDNEEKRIKLVWQSEQSFWEHQDYFNQVLKEKGYP YEVEFVTSETVQKGQRVDLLETGIDTWENPYDTNKDALEGRLLPLDDYLNTKEGKKIKDA VPEKIWDSYKINGKQYSVLSPGFLLDKTVYVWDKELAEKYNVHPETWDGDLWEHREELLK VAEGEKKAGRENFTTVSGLLLYAEEIPETTKVLGLLYPIIFRENTDKAETEFLYETPEYK RNLAGMREFYKMGLWRPEVETVRTAESFLSIETQFWSKNAYLGFREPNFWDTHEVKELYT EKLWELSLNCIETGITTESKHPEEAFDLLCALYTDKDLVNAIEWGDEENYEVVDGKAVKP MSEGEYIPRLYAGNQLLGYVEANEDNKLKENFPKEIERAKVSKISGFRFSGKGIEKELEK VVALNSLVYSGSAEEVIKDMEQTIEKYKQAGIDKVIAEWNRQFAEWNKERR >gi|330400618|gb|ADLB01000033.1| GENE 16 13659 - 13943 453 94 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY 
ASDEAKEIARLKRELRDAQDALEVLKKAINILGK Prediction of potential genes in microbial genomes Time: Tue May 24 22:06:34 2011 Seq name: gi|330400597|gb|ADLB01000034.1| Lachnospiraceae bacterium 2_1_46FAA cont1.34, whole genome shotgun sequence Length of sequence - 25715 bp Number of predicted genes - 16, with homology - 11 Number of transcription units - 10, operones - 5 average op.length - 2.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 361 246 ## COG2801 Transposase and inactivated derivatives 2 2 Op 1 . - CDS 380 - 778 379 ## 3 2 Op 2 . - CDS 845 - 925 88 ## - Prom 1121 - 1180 7.4 + Prom 925 - 984 7.0 4 3 Tu 1 . + CDS 1013 - 1462 349 ## - Term 1220 - 1290 13.1 5 4 Op 1 . - CDS 1457 - 1945 356 ## LJ1831 hypothetical protein 6 4 Op 2 . - CDS 1990 - 2436 192 ## COG1396 Predicted transcriptional regulators 7 4 Op 3 . - CDS 2440 - 4596 185 ## PROTEIN SUPPORTED gi|82703189|ref|YP_412755.1| 30S ribosomal protein S1 - Prom 4626 - 4685 6.7 - Term 4663 - 4712 10.1 8 5 Tu 1 . - CDS 4719 - 11762 8109 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 11805 - 11864 5.0 - Term 11911 - 11969 11.1 9 6 Op 1 4/0.000 - CDS 11973 - 13463 1561 ## COG4468 Galactose-1-phosphate uridyltransferase 10 6 Op 2 . - CDS 13486 - 14652 1491 ## COG0153 Galactokinase - Prom 14698 - 14757 8.6 + Prom 14790 - 14849 6.0 11 7 Op 1 . + CDS 14899 - 16731 1792 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 12 7 Op 2 . + CDS 16725 - 18467 1430 ## COG4624 Iron only hydrogenase large subunit, C-terminal domain + Term 18473 - 18512 8.2 - Term 18457 - 18500 9.1 13 8 Tu 1 . - CDS 18505 - 19125 572 ## Cphy_0670 hypothetical protein - Prom 19157 - 19216 9.4 - Term 19189 - 19224 5.1 14 9 Op 1 . - CDS 19250 - 24892 5771 ## BLD_1258 subtilisin-like serine protease - Prom 24933 - 24992 7.9 15 9 Op 2 . - CDS 25077 - 25211 84 ## - Prom 25338 - 25397 80.4 + Prom 25080 - 25139 6.3 16 10 Tu 1 . 
+ CDS 25228 - 25332 78 ## - LSU_RRNA 25321 - 25715 93.0 # CP000885 [D:78327..81226] # 23S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. Predicted protein(s) >gi|330400597|gb|ADLB01000034.1| GENE 1 2 - 361 246 119 aa, chain + ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 35 106 206 277 279 82 48.0 2e-16 KERPKTRRNIDKPLICIQTGEASSIKEYKRVTATMQCSYSKKSYPWDNACIESFHSLIKR EWLNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSHCDYMSPNDYEELYRRLQQDELQLAG >gi|330400597|gb|ADLB01000034.1| GENE 2 380 - 778 379 132 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKCKKVLAVAGAVTCLTVGGIVSQASSYVQGSMPGFPEMSCTGSLGFANHAMTATTNCNK TPGGYKTTLEGKLKKGSQILEKRTASGKQKATFKDVEATGFHYGRATHKVYYRDKTWAQN TFIGPMSRFSTS >gi|330400597|gb|ADLB01000034.1| GENE 3 845 - 925 88 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFSIFAGNVTFYNRKTINVQYFCYAN >gi|330400597|gb|ADLB01000034.1| GENE 4 1013 - 1462 349 149 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKIFATLFTIFFLTTSITASATSNIAEYDLNSQQKEQHFTLLEENGEISYVTITPISKV SNGSYEVKYTKPKKWTGKFIVDVKSNKFSSVHSPSATALSGSIKGHSLKKNSSTKATLEI SWKPSTISPTVKVGVKAYISSGTLKVAPL >gi|330400597|gb|ADLB01000034.1| GENE 5 1457 - 1945 356 162 aa, chain - ## HITS:1 COG:no KEGG:LJ1831 NR:ns ## KEGG: LJ1831 # Name: not_defined # Def: hypothetical protein # Organism: L.johnsonii # Pathway: not_defined # 1 93 1 94 161 66 34.0 3e-10 MTLEEKLCKYRKERKLSQAEIAEKLNVTRQKVSRWEHGTSVPNIETMKQLAEIYGVNVTE MLQQEEKKSIEQPIVKKVKEVSSSKHEDIVALLCGILLLAACSIPIINIILPIAILIKYK KRKYSFFVKAVAVMCLAVAAYNIYLFVDIYFVDDGIVEIRQL >gi|330400597|gb|ADLB01000034.1| GENE 6 1990 - 2436 192 148 aa, chain - ## HITS:1 COG:SPy1834 KEGG:ns NR:ns ## COG: SPy1834 COG1396 # Protein_GI_number: 15675661 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Streptococcus pyogenes M1 GAS 
# 1 63 1 63 195 57 38.0 9e-09 MKLGEKLKKYRKERKLSQREVAEKLNVTRQVVSYWECDLTIPDIQILQQLAELYEIDMEE MLQEENSTTQNEEIAALICGIILAVAFKIPVVNIIVSLIVLWKYKNRKYSFFIRTVAIVC LVITVYNTYGTFYTCFNPNSTIKNRYLW >gi|330400597|gb|ADLB01000034.1| GENE 7 2440 - 4596 185 718 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|82703189|ref|YP_412755.1| 30S ribosomal protein S1 [Nitrosospira multiformis ATCC 25196] # 639 718 195 272 571 75 46 3e-13 MDINQKLTEELGVKKWQVEAAVKLIDEGNTIPFISRYRKEVTGSLNDEQLRQLFERLTYL RNLEEKKEQVLSSIEEQEKLTEELRRQIMMAETLVVVDDLYRPYRPKRRTRATIAKEKGL ESLAALITMQKTNVPVEESAKAYLSEEKGVQTVEEALAGAKDIIAEYISDEADYRIYLRN FTAKQGKLISTAKDDTVESVYEMYYDFEEPVNKLAGHRVLALNRGEKEKILQLKIEVPEE KVIQYLEKKVIHSDNAYTTPILKEAIEDSYKRLIAPAIEREIRNELTEKAEDGAISVFGK NLEQLLMQPPIANKVVLGWDPAFRTGCKLAVVDETGKVLDTTVIYPTAPTTPQKIAQAKE TLKKLIRKYGITLISLGNGTASRESEMIIVELLKEIPENVQYMITNEAGASVYSASKLAT EEFPQFDVGQRSAASIARRVQDPLAELVKIEPKSIGVGQYQHDLNQKKLSETLSGVVEDC VNKVGVDLNTASAPLLSYISGISSAVAKNIVAYREENGRFKSRKELLKVAKLGPKAFEQC AGFMRIQGGKNPLDSTSVHPESYEVAEKLLEKQGFSLQDIAEGKLSSLSKTIKDEKKLAN ELEVGEITLKDIIKELEKPGRDPRDEMPKPILRTDVLEMKDLTEGMVLKGTVRNVIDFGV FVDIGVHQDGLVHISQITDKKFIKHPLEVVSVGDIVDVKVMSVDVKKKRIQLTMKGIS >gi|330400597|gb|ADLB01000034.1| GENE 8 4719 - 11762 8109 2347 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 77 670 19 558 1087 340 35.0 2e-92 MKKKKITALLLVAAMIGTSFPQGTLTVRAQESLNPSEKIRADVDKTKFTHKEWTGTDYED VDGKQVTGEDVFGINREDAGVSIIPYQDTASAAGAVWDYNARENSEYFQLLTGKNQKWDL TVVQNQEQAQKFMGENGFMTKDYQKQEADGWKSVELPKSWTRDDFDFSIYTNVQMPWQSK YDQNVSAPNAPTNYNPVGLYRKTFTVDDKMLNDGRRVYLNFAGVESAYYVYVNGKEVGYS EDSYSPHHFDVTDYLQEGENILAVKVHKFCDGTWFEDQDMIYDGGIFRDVYLTSAPLMQI KDYTVRTDLDSSYKNATLKISADIRNLSDAAQNGYKIQAKALDKDGNDILGGVSVPVEEV VSTKTKTVELKTKVANPKLWSAENPNLYALVLTLVDESGNAVETVSTQLGFREIEFTRTE VDKNYNVTTKKWDPVKINGERLLLKGANRHDTDPFYGKAVPQSTILEDVKLMKQNNLNAV 
RTSHYSNDDYFYWLCNSYGLYMIGETNMESHAIMNDHNAKGLFYELAMDRTETSYKRLKN NPAIVIWSIGNEMVYTSDPNTSNGMFRDMIWYFKNNDPTRPVHSEGQNDKMGTDMGSNMY PSVDTTQGRAGEGKIPYVLCEYAHGMGNSVGNLKEYWDAIRSADNMLGGFVWDWVDQARE VDLDELGYEYGVTDKTGVSGDAIGEEENWIDNAGEGSMNGGKAFSGYTIMDQNEKYNAAL SGTGKFFTFETIVKPASSSQNSVLLSKGDTQVALKTQSSGKGLEFFIYADNSWKSVSCDF PANWENQWHQVAGVYDKGKISIYVDGKQLATNNVNDKIAASDSPVGVGYDNIHGRKVDGQ ISVARIYNKALTAEELKAQNSTTPAISSKDKSVVLWLDYADEHTKAQASGWDYYATENAH KNLYADEIKGKFYGYGGDWGDVPNDNSFCQNGLVSPDRNPQPELMEVKYQYQNFWFSADV ADLDARQIKVYNESNFENLNEYDVTWQLMENGKEIQSGTVADTDVAPQTNGTIKVPFTMP EKIPAGTEYYLNISVSLKADTEWAKKGAEMSWSQIEVPVTVEQAAPQVSDKAVTVNETES AFEVKGEKFSFSIDKATGTMKNYTYNGETLVKEGPVPNFWRGLVENDKVSFDRAWQNVEK NIKVESVKAEKNEAGQNVITADLVFPDAKNTKETIVYTINGSGEVSVKMSVDATKSGMGN FIRVGSMMTLPEGFEDVTWYGNGPVETFNDRKTNARQGIYTNTVTDFFYPYLKVDDTGNL TDVKWLSVKSDNTNTALLISAKDKVEASALHFTPDDLNAVTHVYGLTPRKETILSVNYGS LGTGGATCGPGPLAQYQLPASKVYEWEFTMIPADKNANAESLGNMAKSYHEVSSFNREDY DKEYAQSLIDRIDSFVVYDYSQLEEVEELQADVNAMTKEQAAIVNKDKDRSKLVEGYVKE VKALEDKETYIQDSSKNALEIPYESSAKFKKNGETVVMNGKLAVPFNEVLNPVLEGNKSF TVETKVTPTGSLDYNMFAGKGDEAFALRTRGTSYIDFHIYAGGSWRSIEYQMTEEQKANW LGKEHQIVGVYDDANHKIQLYADGKLMKEKATETTEGVAHSDYDFTIGGCPSTNRSSAAE FSSVHVYGKALTADEVAAQNSDKLAIEAKDENVALWVDMKDIKHRDKENIYKVTIDPAKA AIEVGNSKEFTLVPDNANAKLTSVKWSILDADGFNAEGIEVNYSADDYTKATVVVSENAE AGNYVLKAENVNGKEELTAEAQITVTAKPEQEDQTIIDSSKNKLDTILPNTAQFTKGETK ETGALKGYFSVDDKNKVVNDAISGNNAFTVSSRVYVPASVKSADTGVWDQTNPHEKHNMI ASMGDNSFAYRIYYDKNRNDVHIDAFISDGTTWLQATTEQLPNDFFDKWHTLSASYDGKT LKVYVDNEVTEKAGTKSVHKSENTFSVGYEPQKTGRQSELTFEQMRVFNQALNADQLNKA TDPKAENVVLWLDFDAEEDNSASEADKQNLQYLIDECEKLQKDNYFDKGWNEMQTALNAA KEVLKKDAPTKEEVQNAYDTLKDAKDKLVYIKDLKDAVANTKDVVENKDSYTKKSYEAFG KALNKAKEVLENQDATQKEVNDAKVALLDAQNKLVEKADVTALKDAIEKAETIKEDVTPS SWDRVQAAKKAAEDVLKAFEKDDESVTQEQIDSAAKALNDAIDSAQKRADFSKLQEAVNR IEKLDLDGYTKDSVKTLKEALKEAKAVLKNEEATQKEVEDALTKLLAAEAGLVKEDEKPN PDPNPGDPGNPGNPGNSGNTDKPDNSGKPSGGNHKPVKTGDASPVAETGMLMLAAATVLL VWRKRNR >gi|330400597|gb|ADLB01000034.1| GENE 9 11973 - 13463 1561 496 aa, chain - ## HITS:1 COG:CAC2961 
KEGG:ns NR:ns ## COG: CAC2961 COG4468 # Protein_GI_number: 15896214 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose-1-phosphate uridyltransferase # Organism: Clostridium acetobutylicum # 1 496 1 497 497 553 53.0 1e-157 MLSKSIKKLVEYGVETGLTPECERIYTTNLLLELFKEDEYEDVSIEGEELNLEEILKELL DEACARGIIEDSIVYRDLFDTKMMNCLVPRPAQVQETFAKKYEVSPKEATDYYYKLSQDS DYIRRYRVCKDRKWKVDSPYGEIDITINLSKPEKDPKAIAAAKNSKSSSYPKCQLCVENE GYAGRVNHPARENHRIMPITVNDSAWGFQYSPYVYYNEHCIVFNGQHTPMKIEKQTFIKL FDFVKLFPHYFLGSNADLPIVGGSILSHDHFQGGNYTFAMAKAPMEETFSVEGFEDVEAG IVHWPLSVIRLRGKDEIRLIELGAHILDKWRGYTDEDAFVYAETDGEPHNTITPIARKVG DTYELDLALRNNITTDEHPLGVYHPHAQWHNIKKENIGLIEVMGLAVLPARLKEEMEILA DYIVEGKDISSNEKIAKHKEWADTFLPKYAEITKENVMDILQEEIGIVFTHVLEDAGVYK CTEEGRKEFRKFISVL >gi|330400597|gb|ADLB01000034.1| GENE 10 13486 - 14652 1491 388 aa, chain - ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 5 384 7 385 389 447 60.0 1e-125 MKAKLIEQFKEIFGTDEEVRAYFAPGRVNLIGEHTDYNGGHVFPCALTIGTYAIVKKRTD NVLRFYSANFSSLGIIESDLNDLVPSEAAGWTNYPKGVMWAFEKRGYKLTGGMDILIYGN IPNGSGLSSSASLEVLTGVVLKDLFGFDVSMVDIALIGQYSENNFNGCNCGIMDQFASAM GKKDNAIFLDTNTLHYEYAPVVLEDAKIVIVNSKVKHSLVDSAYNDRRNECETALKELQE VTDIKTLGDLSEEEFEAHKDAIKSPVRQKRAKHAVYENQRTIKAVAALKANDIETFGKLM NASHTSLRNDYEVSCEEIDILVDLAWATEGVIGSRITGGGFGGCTVSIVKNDAVDNFIET IGQKYEEKVGHKAEFYVVDIGDGAGVLA >gi|330400597|gb|ADLB01000034.1| GENE 11 14899 - 16731 1792 610 aa, chain + ## HITS:1 COG:TM1640 KEGG:ns NR:ns ## COG: TM1640 COG0493 # Protein_GI_number: 15644388 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Thermotoga maritima # 119 532 29 459 468 253 38.0 1e-66 MSRLEIATPNKAQVVVEGLYKDLERRIEASPPGLCPVDMARAFLELCHAQTCGKCVPCRV GLWQLKNLLTDVLNGNATLDTLDLMEDSALSIMQSSDCAIGYEAAHMVYKALVGYREDFE 
EHIREGRCTCTYNQPVPCVSLCPAKVDIPGYIALVGEGRYEDAIRLIRKDNPFPTTCGFI CEHPCEARCRRNMVDDAINIRGLKRFAADYAGKVPPPKCAESTGKRIAVVGGGPGGLSSA YYLQLMGHQVTVYEMLPKLGGMLRYGIPNYRLPKERLDDDIDAILQTGVEVKYGLRIGKD ITIQSLRAEYDAVLITIGASTDKKLGIEGEQADGVLSAVQFLRDVGKNQNPDLTGQEVAV IGGGNVSMDAVRTAVRLGAKKVSIVYRRRVADMTALPDEIEGAVAEGIEVQTLKAPSKID VDENGHVKGLYVTPQMISKIKDGRASVKPTGEEDIYIPCTTLIVAIGQDIEFQHFEEAGV PVKRGRIMSEKYGGFDDIPGVFAGGDCATGPASVIKAIAAAKVIAANIDEYLGFNHIISC DVEIPEPNLDDRTPCGRVNMTEREVCQRIHDFEGVENCMSEAEAKQESSRCLRCDHFGFG IFKGGREKLW >gi|330400597|gb|ADLB01000034.1| GENE 12 16725 - 18467 1430 580 aa, chain + ## HITS:1 COG:TM0201_2 KEGG:ns NR:ns ## COG: TM0201_2 COG4624 # Protein_GI_number: 15642974 # Func_class: R General function prediction only # Function: Iron only hydrogenase large subunit, C-terminal domain # Organism: Thermotoga maritima # 214 579 7 368 372 330 50.0 5e-90 MVNLTIDNKQISVPEGTTILEAAEQNGIPIPRLCYLKGLNEIGACRVCVVELVGKERLIP SCNNVVEEGMIIYTNSPKVRMNRKKTVEFLLSQHDCQCATCARSGNCTLQTIANDLNIID IPFKQRLEKMAWNKDFPLIRDSAKCIKCMRCVQVCDKVQNLHVWDLESTGSRTTVHVSKN RKIEEADCSLCGQCITHCPVGALRERDDTDKAWEAIADKDKITVVQVAPAVRTAWGESLG LSREEATIGKIVDSLRQMGFDYVFDTTFSADLTIMEEGNEFLERFLSGELKTRPMFTSCC PGWIRFIKSQYPHLVPQLSTAKSPQQMFGTVMKTYFAKSIGVNPENICTVSIMPCVAKKG ERNMELYYEEYAGHDVDIVLTTRELTRMIRSSHIKPSTLSDVECDRLMQDGSGAGVIFGA TGGVMEAALRSAYYLLMGKNPDADAFSVVRSEKFNQGVISAEFSIGDAKIQTAVVSGLGN TRKLIDAIEHGDVHYDFVEVMACPGGCVGGGGQPIHDGEELAHIRGQNLYYLDKNAKIRF SHENQDVLKLYEDFMEKPLSHKAHMLLHTDHNAWEMPKRR >gi|330400597|gb|ADLB01000034.1| GENE 13 18505 - 19125 572 206 aa, chain - ## HITS:1 COG:no KEGG:Cphy_0670 NR:ns ## KEGG: Cphy_0670 # Name: not_defined # Def: hypothetical protein # Organism: C.phytofermentans # Pathway: not_defined # 34 204 2 167 170 120 39.0 5e-26 MNYTIQRISTEQDIEKCNLFEINNYQWKNVYKPKTYGYAGYLEGKGIYVKFICEESNPKR EYTKDRDSVCLDSTVEIFMAFPEKGEKLSNDVMYINFEMNSNGVMYSKYGKGRKGRTFIS EELCAKSNCKSVIGEDKWEITVTIPEELLREICDFDSILNGETFYCNFYKIAESPEVEHY GTFSPIENETPNFHLPIYFAQANIEK >gi|330400597|gb|ADLB01000034.1| GENE 14 19250 - 24892 5771 1880 aa, chain - ## 
HITS:1 COG:no KEGG:BLD_1258 NR:ns ## KEGG: BLD_1258 # Name: aprE # Def: subtilisin-like serine protease # Organism: B.longum_DJO10A # Pathway: not_defined # 43 1815 26 1809 1937 1581 51.0 0 MKKRNLKVLSLLLVASMTVPTVAESTYYLNAPTMVQAAEKNNTKVEKEAVLEDGVNDQWT EDNKVGTDSVCTVEKGWLHVKSGSGNGNDVTPSGSKRPAMFVNPTEFDFTKPGYFEFTMK SNNTNKDQNNADRFGIYLGYNTDKNGMFLGYDNGGWFWQKYQNGNGNYAQVGQAAPGKDV ETKVRIEWTGDYKATLKLNGTAVLTNEDFSGIKDNLGKKIAIKAGSFGDNRTDVFLKDIH YTGQKEAKTYDVTGKVVDDKGKAVEKATVKVGDQTVTTNAEGKYTLKLKNGTYTVEVTKE GFETATGSVTVNGATATVADIKLSPEAAVETEKISSKDMDVYVAKNFPSVVKYEMQGDNK GKTFYGQTKALDTIKINDVAIKLNKDDVKATFADDKATYVIKLKADGEKKVDAEITAELK VKKNEVHFDITKVKNNLLKEGVKETKDNAIQTIEIPNHSLISVRSTQEGANLKGALMSSN TTISGDSYIEVNDSTPVKNQDYMYAFVSNNELSAGLWSNSENEGSAKAVGVSGGSHNTRV MSTVEKGNGYVSMGLGSTKWYYHRAVESAPKKKADGSLYKKEYVVEETEMPSMKVAIAGN INGDKNVDWQDGAVAFREIMHNPYKSEEVPELVAWRIAMNFGGQAQNPFLTTLDNVKRVA MHTDGLGQSVLLKGYGSEGHDSGHPDYADIGKRIGGAKDMNTLMEKGKKYGARFGIHVNA GEMYPEAKAFTDESVRRDDSGNLRYGWNWLDQGVGLDSVYDLGEGHRRQRFNDLNEKVDN MDFVYVDIWGNRTGGSDDSWQTRKLSKEINDNGWRMATEWGSANEYDSTFQHWATDLTYG GFELKGENSEVMRFLRNHQKDSWVGDYPSYKGAAMAPLLGGYNMKDFEGWQGRNDYDAYI QNLYTHDLTTKFIQHFEIMDWEDGKTFTDTNPKGQSYTWTPETKITLKGDMGTLVLERGS DNPTDPQYRDRTMTLNDKVIATGAVSRGDNGEGGTESYLVPWNWDAETGDKVSEDKEKMY HWNTKGGTTTWELQDNWKNLKDVKVYELTDLGKVNEKTVPVKEGKITLEAKAQTPYVVCR GKEDNLKITWSEGMHIVDAGFNTGDAGLKKYWEKSGNGEAVIAKSQYSNPMLKLDGKVSM TQELTDLKPGQKYAVLVGVDNRSNAKASVTVKSGDKVLDSNYTTKSIAKNYVKAYTHSNS SATVDGTSYFQNMYVYFTAPESGKVTLTLAKEAGDGSAYFDDVRVTETTMDVLKKDKDGN VTGMKQDFEDNVQGVYPFVIGAIEGVEDNRTHLSELHGKYTQAGWDVKKMDDVLTKNDPD SNWSVKANGLCQRGNLVMQTIPQNFRFEEGKTYKVSFDYQSGSNGIYGVVVGNGEVKGSE KVTELEMAMGSKAEKTYEFEITGATDGQTWFGIYSTTKPADLQGKGGAAANFGGYQDFVL DNLKIEEVKEDVTRADLEALVKEAEQLKAEDYLPADWKVLQDALINAKVALNKDKDSQKD IENAYYKLKGAIESLKYAGGTESDERGDITVENYEVQAGSFEPNNTSEGDPKYAQDNNAG TLWHTKYSDTDIKNAWYQFNFKEAQTVNGMRYLPRSGAANVNGKLKKVDIQISTDNAKTW TTVADDLEVSPDTKWQKIAFDKEYTGVTNVKFIAVETGGNTPNQFAAAAELRVTTPFTPE KPEVDKTGLANAIAEAEKLQKDNYTEESWNNMIKYLDAAKAVNANPEATQYDVALAIANL 
GDAVKDLVAVEKPGKPDKSALKDAIDKYSKYTQGKYTDESWKKFEDALNHARKVYENENA TQGEVDAAISALNIAENGLTTGEGSDSGNGNGNNGNTGKPSKPGKPNKPVKTGDEAPVLP LTATVAGLGAAIALLFKKRK >gi|330400597|gb|ADLB01000034.1| GENE 15 25077 - 25211 84 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRRIPWNSFQHIWEKYTITLKKVKIVYYDGRKMKKYTDEIEIKG >gi|330400597|gb|ADLB01000034.1| GENE 16 25228 - 25332 78 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTFLVCLFNNFGVPSKPHTRKITTESNLFLTLAS Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:27 2011 Seq name: gi|330400591|gb|ADLB01000035.1| Lachnospiraceae bacterium 2_1_46FAA cont1.35, whole genome shotgun sequence Length of sequence - 2763 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + TRNA 198 - 285 58.7 # Ser GCT 0 0 + SSU_RRNA 546 - 2040 99.0 # EF403348 [D:1..1494] # 16S ribosomal RNA # uncultured bacterium # Bacteria; environmental samples. + 5S_RRNA 2248 - 2361 88.0 # CP000885 [R:4137971..4138087] # 5S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. Predicted protein(s) Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:29 2011 Seq name: gi|330400585|gb|ADLB01000036.1| Lachnospiraceae bacterium 2_1_46FAA cont1.36, whole genome shotgun sequence Length of sequence - 4141 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + 5S_RRNA 21 - 134 88.0 # CP000885 [R:4137971..4138087] # 5S ribosomal RNA # Clostridium phytofermentans ISDg # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 
+ TRNA 155 - 228 88.3 # Ile GAT 0 0 + TRNA 317 - 389 86.4 # Ala TGC 0 0 + LSU_RRNA 1055 - 3439 90.0 # CP001107 [D:12954..15836] # 23S ribosomal RNA # Eubacterium rectale ATCC 33656 # Bacteria; Firmicutes; Clostridia; Clostridiales; Eubacteriaceae; Eubacterium. + Prom 3451 - 3510 80.3 1 1 Tu 1 . + CDS 3644 - 3733 121 ## + Term 3742 - 3808 31.6 + TRNA 3721 - 3793 77.7 # Asn GTT 0 0 + TRNA 3829 - 3900 63.0 # Glu TTC 0 0 + TRNA 3944 - 4014 63.3 # Cys GCA 0 0 Predicted protein(s) >gi|330400585|gb|ADLB01000036.1| GENE 1 3644 - 3733 121 29 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNLVRGGSQSLTSVSEFSWADIPNINLAS Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:33 2011 Seq name: gi|330400575|gb|ADLB01000037.1| Lachnospiraceae bacterium 2_1_46FAA cont1.37, whole genome shotgun sequence Length of sequence - 2491 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 131 - 190 3.7 1 1 Op 1 5/0.000 + CDS 257 - 601 200 ## PROTEIN SUPPORTED gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 2 1 Op 2 . + CDS 673 - 2262 1334 ## COG3436 Transposase and inactivated derivatives + Term 2284 - 2321 6.4 - Term 2208 - 2238 -0.9 3 2 Tu 1 . 
- CDS 2325 - 2483 197 ## EUBREC_2191 hypothetical protein Predicted protein(s) >gi|330400575|gb|ADLB01000037.1| GENE 1 257 - 601 200 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148984516|ref|ZP_01817804.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP3-BS71] # 1 105 1 103 107 81 38 6e-16 MLGDISLATNIYLVTGYTDMRKSIDGLCAIIMKNFKHEPDGHSIYLFCGKRCDRIKVLFK EPDGYILLYKRLDVLSGRYRWPRNSSEVKPITWQQFDWLMSGLEIEQPKALRMA >gi|330400575|gb|ADLB01000037.1| GENE 2 673 - 2262 1334 529 aa, chain + ## HITS:1 COG:SPy0131 KEGG:ns NR:ns ## COG: SPy0131 COG3436 # Protein_GI_number: 15674346 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Streptococcus pyogenes M1 GAS # 80 529 14 450 450 242 33.0 2e-63 MTKSSKDIQLVELKDMIIQLNTTIKMLNDTISRQQAENDNLKAELAWFRQKMFGSSSERR TDDIAGQLSLFGETVEEEKPVELIEPEIVVPAKKSRKKRPTLAEQFKDIPTRQVIAGTLT DEDKLCSLCSAQMLPIGTEVIRSEIVYTPPKLERIEYIATTYACPECKDTEEPQFIKDNG VPALLAGSYVSPSLLAHIAYQKFGLYIPLNRQEKDFLQLNTPITRASMAKWIINCGMEYL QPMYDYFHRELLKRRFLMMDETPTQVLKEDGRRAETKSYFWVIRTGEDELNPIILYNYTP TRAGENIKKFLKGIAPGFYFMTDGYRGYNKLKEANRCCCWAHVRRYLLEAIPKGMEKDYS NPAVQGVLYCNKLFEYERTYKEKKLSYKQIEKRRLKDQKPVIEGFLSWLKQVNPGSNGKL KKAITYINNYQEYLQTYLEDGRCSLSDNLSENAIRPVTIGRKNWLFSDTAEGAKANALYL TIVEMAKTYNLNLYEYLKFLFEHRPNKDMSDEEFENLAPWNEHVQELCK >gi|330400575|gb|ADLB01000037.1| GENE 3 2325 - 2483 197 52 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 45 1 45 94 66 68.0 4e-10 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKERKRLVSG Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:36 2011 Seq name: gi|330400567|gb|ADLB01000038.1| Lachnospiraceae bacterium 2_1_46FAA cont1.38, whole genome shotgun sequence Length of sequence - 2213 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score 
pairs(N/Pv) 1 1 Tu 1 . + CDS 111 - 1718 751 ## COG3666 Transposase and inactivated derivatives + Term 1758 - 1808 6.0 - Term 1802 - 1861 11.3 2 2 Tu 1 . - CDS 1946 - 2092 173 ## gi|210622895|ref|ZP_03293418.1| hypothetical protein CLOHIR_01366 - Prom 2112 - 2171 3.6 Predicted protein(s) >gi|330400567|gb|ADLB01000038.1| GENE 1 111 - 1718 751 535 aa, chain + ## HITS:1 COG:CAC0657 KEGG:ns NR:ns ## COG: CAC0657 COG3666 # Protein_GI_number: 15893945 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Clostridium acetobutylicum # 9 332 9 333 334 347 56.0 5e-95 MTNKLKYHKNYTEFGEPYQLVLPLNLEGLIPDDDSVRLLSHELEDLDYSLLYQAYSAKGR NPAVDPKTMFKILTYAYSQNIYSSRKIETACKRDINFMWLLAGQKAPDHSTIARFRTGFL ADACENLFYQMVRRLEEAGELSKETVFIDGTKLEACANKYTFVWKKSVGKWEEKMFRKIQ EAIQHLNREYMKDFSVASETKTPDLQKIVCFLEERCRTDHSVFVHGRGKRKSRNQRYFEL FHSFLERQTVYDWHTANFQGRNNYCKTDPDATFMHMKDDHMRNAQLKPGYNVQIAVDSEY IVATDIFQDRNDVWTLVPFLKTMERNLGFRYPSVTADSGYESEEVYTYLRSAKQKPYIKP QTYEKWKKRNFKQDISKRENMGYDEKTDTYTCHAGKTLFPIFMKKQKSKSGYESEVTVYE CEDCTGCPYKEKCTKAKGNKRLYVSKSFLEKRQESYKNILSETGIKYRMNRSIQVEGAFG VLKNDYEFQRFLLRGKSKVKLEILLLCMGYNINKLHAKIQKERTGSYLFLVKETA >gi|330400567|gb|ADLB01000038.1| GENE 2 1946 - 2092 173 48 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|210622895|ref|ZP_03293418.1| ## NR: gi|210622895|ref|ZP_03293418.1| hypothetical protein CLOHIR_01366 [Clostridium hiranonis DSM 13275] # 1 48 10 57 57 78 79.0 1e-13 MYQETGLKSEKTVYVDKETGVNYLFIANGFGGGLTPLLDAEGKPIITK Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:41 2011 Seq name: gi|330400561|gb|ADLB01000039.1| Lachnospiraceae bacterium 2_1_46FAA cont1.39, whole genome shotgun sequence Length of sequence - 1899 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 18 - 77 6.0 1 1 Tu 1 . 
+ CDS 129 - 1307 473 ## COG3464 Transposase and inactivated derivatives + Term 1399 - 1440 1.6 - TRNA 1515 - 1585 63.3 # Cys GCA 0 0 - TRNA 1629 - 1700 63.0 # Glu TTC 0 0 - TRNA 1736 - 1808 77.7 # Asn GTT 0 0 - Term 1754 - 1793 -0.5 2 2 Tu 1 . - CDS 1799 - 1885 83 ## Predicted protein(s) >gi|330400561|gb|ADLB01000039.1| GENE 1 129 - 1307 473 392 aa, chain + ## HITS:1 COG:BH1170 KEGG:ns NR:ns ## COG: BH1170 COG3464 # Protein_GI_number: 15613733 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Bacillus halodurans # 4 389 38 429 441 217 34.0 2e-56 MHSHYTNKLLNIEDVIIKKIHHADTFLKIYLETNPHEQVCPCCGSTTKRIHDYRYQTIKD LPFQLKHCYLVLRKRRYVCKCGKRFYESYSFLPRYFQRTARLTAFIATSLHNSQSIKETA RQANISTATVGRVLNTIAYSRRKFSTSISIDEFRGNASTGKFQCILVDPVKHKVLDILPD RQYSHLVSYFSSIPKDERHRVKHFVCDMWKPYIELAYSYFPNAEVIIDKYHFIRQTTWAI EGVRKRLQKTMPARLRKYYKRSRTLILSRYNKLKDENKAACDLMLLYNDDLRRAHYLKEK FYELCQNTKYSEQRKDFFDWIKMAESSGLNEFEKVAKTYRAWSKEILNAFKYAHITNGPT EGFNNKIKVLKRTSYGIRNFQHLRTRIFLITD >gi|330400561|gb|ADLB01000039.1| GENE 2 1799 - 1885 83 28 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLVRGGSQSLTSVSEFSWADIPNINPQ Prediction of potential genes in microbial genomes Time: Tue May 24 22:07:46 2011 Seq name: gi|330400434|gb|ADLB01000040.1| Lachnospiraceae bacterium 2_1_46FAA cont1.40, whole genome shotgun sequence Length of sequence - 1815 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 80 - 364 465 ## EUBREC_2191 hypothetical protein - Prom 421 - 480 6.7 + Prom 367 - 426 1.8 2 2 Tu 1 . + CDS 463 - 681 227 ## + Term 862 - 895 -0.1 + Prom 927 - 986 4.6 3 3 Op 1 . + CDS 1006 - 1536 278 ## 4 3 Op 2 . 
+ CDS 1564 - 1737 261 ## gi|210622895|ref|ZP_03293418.1| hypothetical protein CLOHIR_01366 Predicted protein(s) >gi|330400434|gb|ADLB01000040.1| GENE 1 80 - 364 465 94 aa, chain - ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK >gi|330400434|gb|ADLB01000040.1| GENE 2 463 - 681 227 72 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKYLTDDKEECLNETMIYAGWFGETDIVKFLIENGANKEYKTQNGLGLLECSERIEKQFK DSSLKEFLANNQ >gi|330400434|gb|ADLB01000040.1| GENE 3 1006 - 1536 278 176 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNMVISFSGRENGNCDNISKYIASKGDSIIYFRNLNIHDCSGCKYECFKGYCKYRQDDIY NLYESMLSYDKIFLIVPMYCGNPSSLYFKFNERCQDFFLHNEDSYETIISKVYIIGVYGD KKKTPDFIPCLTRWFECTQFNNHVLGIERHRYNQRIADNILDIEEVRNMITCFIQQ >gi|330400434|gb|ADLB01000040.1| GENE 4 1564 - 1737 261 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|210622895|ref|ZP_03293418.1| ## NR: gi|210622895|ref|ZP_03293418.1| hypothetical protein CLOHIR_01366 [Clostridium hiranonis DSM 13275] # 1 57 1 57 57 92 78.0 6e-18 MSKNNRFEVVYQETGLKSEKTVYVDKETGVNYLFIANGFGGGLTPLLDAEGKPIITK Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:05 2011 Seq name: gi|330400428|gb|ADLB01000041.1| Lachnospiraceae bacterium 2_1_46FAA cont1.41, whole genome shotgun sequence Length of sequence - 1654 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 2, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 13 - 72 5.3 1 1 Op 1 . + CDS 96 - 380 453 ## EUBREC_2191 hypothetical protein 2 1 Op 2 . + CDS 377 - 1291 498 ## COG2801 Transposase and inactivated derivatives + Term 1388 - 1421 -0.4 3 2 Tu 1 . 
- CDS 1288 - 1635 74 ## Predicted protein(s) >gi|330400428|gb|ADLB01000041.1| GENE 1 96 - 380 453 94 aa, chain + ## HITS:1 COG:no KEGG:EUBREC_2191 NR:ns ## KEGG: EUBREC_2191 # Name: not_defined # Def: hypothetical protein # Organism: E.rectale # Pathway: not_defined # 1 94 1 94 94 135 77.0 4e-31 MAKHYDKQFKLDAVQYYHDHKNLGLQGCATNLGISQQTLSRWQKELRETGDIESRGSGNY ASDEAKEIARLKRELRDAQDALEVLKKAINILGK >gi|330400428|gb|ADLB01000041.1| GENE 2 377 - 1291 498 304 aa, chain + ## HITS:1 COG:L0434 KEGG:ns NR:ns ## COG: L0434 COG2801 # Protein_GI_number: 15672639 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Lactococcus lactis # 12 291 1 277 279 199 41.0 8e-51 MTEAIYLEVTEMAETAHKAKRRVSVSGMLKHLGVSRSGYHAWLKRVPSNTEKRREAVKAK IKDIYDESKQNYGAPKITRELRKSGEIISERTVGKYMKQMGIKAQWVKPWTITTKDSDFS SELKNILDEQFNPERPNAVWCSDITYIWTIDGFVYLTSIMDLYSRKIIAWTLSKTLEVSC VIETIKKAKARRNIDKPLILHSDRGSQYVSKEYKRVTATMQCSYSKKSYPWDNACIESFH SLIKREWLNRFKIRDYDHAYRLIFEYLEAFYNTKRIHSHCDYMSPNDYEELYRRLQQDEL QLAG >gi|330400428|gb|ADLB01000041.1| GENE 3 1288 - 1635 74 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLEQELTNLQIYISKRNQGQTDEQVINHITKINNKTPLTQEEWHELIFPSCNNGYVEILR FILSNIQCLNNVKEYMRHTVYGRNKNINDERIEVLKEFMVLCQDLVQVKMRFSSS Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:13 2011 Seq name: gi|330400424|gb|ADLB01000042.1| Lachnospiraceae bacterium 2_1_46FAA cont1.42, whole genome shotgun sequence Length of sequence - 1576 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 7 - 66 4.7 1 1 Tu 1 . 
+ CDS 143 - 1513 600 ## SZO_02320 transposase Predicted protein(s) >gi|330400424|gb|ADLB01000042.1| GENE 1 143 - 1513 600 456 aa, chain + ## HITS:1 COG:no KEGG:SZO_02320 NR:ns ## KEGG: SZO_02320 # Name: not_defined # Def: transposase # Organism: S.equi_zooepidemicus # Pathway: not_defined # 1 452 8 457 472 341 42.0 4e-92 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKAAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYACELLAKNDGIFISPSALTKIMYENFIPSPRTTKT VRKRLAKELRTQQKNVSTKKKQDELQAAIVTVETPHSRRPRCVYFGEMLQMDASVHLWFG NAKTNLHIAIDDSTSRIVGAFFDEQESLNGYYNVFHQILTTYGIPAMFYTDRRTVFEYRN KKMNKTELDTYTQFSYACKQLGVEIKTTSIAQAKGRVERAFQTLQQRLPIALRLAGITTI SEANIFLNSYIKEYNAKFALPINNNKSVFEKQPDCETINQILAVLTERTVDCGHCIKFQK QYYKTINECGIQVHYHKGTKGLVIQTFDKRLLFSVNNKIYELEVVPLHEPLSKNFDLQPL KEKPRKRNLPSPKHPWRMSTFLQFKNHKITETMLIC Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:19 2011 Seq name: gi|330400420|gb|ADLB01000043.1| Lachnospiraceae bacterium 2_1_46FAA cont1.43, whole genome shotgun sequence Length of sequence - 738 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 301 142 ## LGG_02944 integrase + TRNA 536 - 623 58.7 # Ser GCT 0 0 Predicted protein(s) >gi|330400420|gb|ADLB01000043.1| GENE 1 1 - 301 142 100 aa, chain - ## HITS:1 COG:no KEGG:LGG_02944 NR:ns ## KEGG: LGG_02944 # Name: tnp # Def: integrase # Organism: L.rhamnosus # Pathway: not_defined # 1 99 1 99 478 69 39.0 4e-11 MNEQLKYEVIKSLVDHNGNKKAAALKLGCTTRHINRLIQKYKQNGKAAFIHGNRGRKPLH SFTESQKLEILTLYNNKYYDATFTYACELLAKNDGIFISP Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:21 2011 Seq name: gi|330400202|gb|ADLB01000044.1| Lachnospiraceae bacterium 2_1_46FAA cont1.44, whole genome shotgun sequence Length of sequence - 637 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:22 2011 Seq name: gi|330400198|gb|ADLB01000045.1| Lachnospiraceae bacterium 2_1_46FAA cont1.45, whole genome shotgun sequence Length of sequence - 618 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 340 312 ## 2 2 Tu 1 . 
- CDS 354 - 617 218 ## Predicted protein(s) >gi|330400198|gb|ADLB01000045.1| GENE 1 2 - 340 312 112 aa, chain + ## HITS:0 COG:no KEGG:no NR:no KLTVQKTSTQGGWVVEKYANGWCKLYIKAQVYRASKEQTVYKLPFPNGIQLKNVRVQMTP AQNGWNVANFYHNTSGQNDDNAVISEVVIIFTAKDTTALTYCFDVCVSGFLV >gi|330400198|gb|ADLB01000045.1| GENE 2 354 - 617 218 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no NKENPVIMEKIVKTPGITLNAFEGKALSSSAITPPTVEGYRCIGLASGWGEGQVGLVVSP NGWAANCTNVKKTYNAVALKFLYLKSF Prediction of potential genes in microbial genomes Time: Tue May 24 22:08:33 2011 Seq name: gi|330400194|gb|ADLB01000046.1| Lachnospiraceae bacterium 2_1_46FAA cont1.46, whole genome shotgun sequence Length of sequence - 561 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 89 - 148 7.0 1 1 Tu 1 . + CDS 189 - 560 272 ## PROTEIN SUPPORTED gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 Predicted protein(s) >gi|330400194|gb|ADLB01000046.1| GENE 1 189 - 560 272 124 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148996730|ref|ZP_01824448.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP11-BS70] # 51 124 1 74 77 109 63 4e-25 MDSNSLSHTKWNCKYHIVFAPKNRRKVAYGKIKQDIANILSMLCKRKGVKIVEAEICPDH VHMLVEIPPSISVSYFVGYLKGKSTLMIFERHANLKYKYGNRHFWCRGYYVDTVGKNAKK IQEY