Prediction of potential genes in microbial genomes Time: Wed Jun 22 10:55:32 2011 Seq name: gi|313159829|gb|AENZ01000001.1| Alistipes sp. HGB5 contig00034, whole genome shotgun sequence Length of sequence - 8871 bp Number of predicted genes - 9, with homology - 6 Number of transcription units - 8, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - TRNA 52 - 124 77.4 # Phe GAA 0 0 - Term 265 - 305 -0.3 1 1 Tu 1 . - CDS 348 - 545 169 ## gi|167754155|ref|ZP_02426282.1| hypothetical protein ALIPUT_02448 - Prom 792 - 851 6.1 - Term 860 - 922 1.1 2 2 Tu 1 . - CDS 958 - 3147 273 ## COG3291 FOG: PKD repeat + Prom 2927 - 2986 4.3 3 3 Tu 1 . + CDS 3172 - 3282 64 ## - Term 3691 - 3719 -0.9 4 4 Op 1 . - CDS 3754 - 4029 205 ## gi|291515326|emb|CBK64536.1| hypothetical protein AL1_22740 5 4 Op 2 . - CDS 4026 - 4412 429 ## ACP_1223 hypothetical protein - Prom 4432 - 4491 2.4 - Term 4477 - 4512 5.0 6 5 Tu 1 . - CDS 4518 - 4871 148 ## - Prom 4923 - 4982 3.8 7 6 Tu 1 . - CDS 5001 - 5693 -134 ## gi|313159831|gb|EFR59186.1| KAP family P-loop domain protein + Prom 5950 - 6009 6.6 8 7 Tu 1 . + CDS 6138 - 6242 65 ## + Term 6459 - 6496 -1.0 9 8 Tu 1 . 
- CDS 7790 - 8011 175 ## gi|291515324|emb|CBK64534.1| hypothetical protein AL1_22700 - Prom 8065 - 8124 3.9 Predicted protein(s) >gi|313159829|gb|AENZ01000001.1| GENE 1 348 - 545 169 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754155|ref|ZP_02426282.1| ## NR: gi|167754155|ref|ZP_02426282.1| hypothetical protein ALIPUT_02448 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02448 [Alistipes putredinis DSM 17216] # 1 65 1 65 65 126 95.0 5e-28 MDGIIGTMTANEAQIEGTGEKVNLFDYHVKADTILFQYTTPDGRDEVVKFPLAGFCENYL MQFAK >gi|313159829|gb|AENZ01000001.1| GENE 2 958 - 3147 273 729 aa, chain - ## HITS:1 COG:MA4292 KEGG:ns NR:ns ## COG: MA4292 COG3291 # Protein_GI_number: 20093081 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 460 679 926 1116 1995 96 36.0 2e-19 MKKLLAFAALFAVVALTSCKYDDDDLWNSVHGLENRVAKLEELCKQMNTNISSLQTIVTA LQNNVYVTGTTPLMKDGKEIGYTITFSKGNPITIYHGKDGQDGEDGITPTISVKKDTDGV YYWTLNGEFIVVDGGKIQAEGKDGTNGTTPQFKIENDYWFVSYDNGANWTQLGKATGEDG IGGDSMFSGVDYETSTDYVIFTLSNGTQIKLPTWSAFEALQRLCNETNTNLSALQTIVTA LQNNDYITSVDPLTENGKVVGYTIKFAKSNPIVIYNGKDGADGVNGNTPVIGVKKDTDGI YYWTLDGEFIVVDGQKIKAQGTDGNNGADGSDGVTPKLEIREGYWWISYDNGVNWTQLGK ATGEDGKDADSIIITQDENNVYFELADGTVITISKTGQSTPNIIEFKDPYVKTICVAAWD TDGDSELSYVEATAITTLGTKFKGNTLIESFEELKYFTHLTSIDDDTFNACTALTSIQIP ASVETIGLRAFKSCSSLANITFEKGSILRDIKGGSKMASDYMSIDYYGAFSDCSALTAIE IPASVESIGVAAFSNCKRLASVTFEHSSKLKSIGGGWSSPYGGFSYGAFLYCSSLTSIKI PASVETIGASAFKGCIKLTTVTFEKESRLTTIEGYYDYGYKAAHGTFTNCFALTTIELPA SIKTIETYSFSGCGKLANIYSRSTTPPTLEATLPSEAKIYVPIGSENAYKIANEWRNYAD NIIGYDFNE >gi|313159829|gb|AENZ01000001.1| GENE 3 3172 - 3282 64 36 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTHVLPYNSYAVDKYKKESVELFVICTEVLTKPKDE >gi|313159829|gb|AENZ01000001.1| GENE 4 3754 - 4029 205 91 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515326|emb|CBK64536.1| ## NR: gi|291515326|emb|CBK64536.1| hypothetical protein AL1_22740 [Alistipes shahii WAL 8301] # 1 91 1 91 91 172 100.0 8e-42 
MTAQRLVENCVLTNQTAVVDEMLNKHLLPEEYIYPFLGDVMEWWLIDSWLAERLKREGEV IIEEYGCCWWGRLASGQAICMDSVIQEIAAG >gi|313159829|gb|AENZ01000001.1| GENE 5 4026 - 4412 429 128 aa, chain - ## HITS:1 COG:no KEGG:ACP_1223 NR:ns ## KEGG: ACP_1223 # Name: not_defined # Def: hypothetical protein # Organism: A.capsulatum # Pathway: not_defined # 23 124 22 126 138 80 45.0 2e-14 MKTLDRKAAEIFRALLALQTTKIDNSDGTYMPVYLELIGRIDNYNFFSLAHYGQQNGDAM RDPEMLFALHKETQQFIPYYYRNDYCGIEQNSVKWSEDGIALNPRLQAEHTTFANQWLRN IAAQQGIL >gi|313159829|gb|AENZ01000001.1| GENE 6 4518 - 4871 148 117 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNLAVNSINKTNIENYKKYPDFLTYLAQNGNMQQKSYLAGMFSDMLNKNENVNVVLGIIQ SFNSLKQVDCKLLTTQIQRIMEGLDSLPDDNEYLTAMSYLNSLLQPLTRAKEDKKTA >gi|313159829|gb|AENZ01000001.1| GENE 7 5001 - 5693 -134 230 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159831|gb|EFR59186.1| ## NR: gi|313159831|gb|EFR59186.1| KAP family P-loop domain protein [Alistipes sp. HGB5] # 1 230 623 852 852 447 99.0 1e-124 MVNDDLLSYAIEQWPDLDNDNIYIKDVVYISQKRQISENVVIKIYEKIHHILGDLRGVSQ SIILEKLKAINIVTENLSLTNNLPSIQNIVTLIAGTRPMPHPSYPSHRNYDQNIRFIDEC TNDDTAIETVVKFGCITYKVSSGRIAIDNIIDPFFAQSSNIILDQLLWLKNRWGFNLYPL ANYILKFKSYNNDKLMELFADHFTAKPNNRYILSEEIIQTKLDEMLKFAI >gi|313159829|gb|AENZ01000001.1| GENE 8 6138 - 6242 65 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRNIFSKPYWKLLQCICNIMRIYYILYFLWYIGK >gi|313159829|gb|AENZ01000001.1| GENE 9 7790 - 8011 175 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515324|emb|CBK64534.1| ## NR: gi|291515324|emb|CBK64534.1| hypothetical protein AL1_22700 [Alistipes shahii WAL 8301] # 1 73 1 73 73 122 100.0 7e-27 MSVLLIKYNAIGLSRTEPISITYQLDKRCKTLPEKVSLAYFIKVLDMQEQYKAVTERAEQ AKQFKEEFKGFEF Prediction of potential genes in microbial genomes Time: Wed Jun 22 10:56:42 2011 Seq name: gi|313159762|gb|AENZ01000002.1| Alistipes sp. 
HGB5 contig00033, whole genome shotgun sequence Length of sequence - 92013 bp Number of predicted genes - 65, with homology - 61 Number of transcription units - 24, operones - 15 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 81 - 1061 835 ## Phep_1385 endonuclease/exonuclease/phosphatase 2 1 Op 2 . - CDS 1079 - 2134 1015 ## COG0240 Glycerol-3-phosphate dehydrogenase 3 1 Op 3 . - CDS 2162 - 2977 501 ## COG0584 Glycerophosphoryl diester phosphodiesterase 4 1 Op 4 . - CDS 3012 - 5207 1648 ## Sph21_2903 glycerophosphoryl diester phosphodiesterase 5 1 Op 5 . - CDS 5230 - 7299 1977 ## gi|313159789|gb|EFR59145.1| conserved domain protein 6 1 Op 6 . - CDS 7323 - 8828 1281 ## Sph21_5172 RagB/SusD domain-containing protein 7 1 Op 7 . - CDS 8855 - 12442 3670 ## Sph21_5171 TonB-dependent receptor plug 8 1 Op 8 6/0.000 - CDS 12483 - 13484 1104 ## COG3712 Fe2+-dicitrate sensor, membrane component 9 1 Op 9 . - CDS 13665 - 14228 497 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 10 1 Op 10 . - CDS 14275 - 15663 1323 ## COG2271 Sugar phosphate permease - Prom 15683 - 15742 1.6 - Term 16027 - 16086 -0.5 11 2 Tu 1 . - CDS 16327 - 16464 74 ## gi|313159264|gb|EFR58632.1| ISPsy11, transposase OrfB family protein - Prom 16599 - 16658 3.6 12 3 Tu 1 . - CDS 17582 - 17749 63 ## - Prom 17908 - 17967 7.2 + Prom 17827 - 17886 3.2 13 4 Op 1 4/0.000 + CDS 17944 - 19272 806 ## COG0477 Permeases of the major facilitator superfamily 14 4 Op 2 . + CDS 19274 - 20134 497 ## COG0524 Sugar kinases, ribokinase family 15 4 Op 3 . + CDS 20160 - 21746 1274 ## BF4324 hypothetical protein 16 4 Op 4 . + CDS 21764 - 24700 2550 ## Sph21_4291 TonB-dependent receptor plug 17 4 Op 5 . + CDS 24712 - 26310 1585 ## Cpin_7233 RagB/SusD domain protein 18 4 Op 6 . + CDS 26294 - 27910 1360 ## STAUR_2023 hypothetical protein 19 4 Op 7 . + CDS 27934 - 29457 1539 ## nfa54520 hypothetical protein + Term 29475 - 29522 10.6 20 5 Op 1 . 
+ CDS 29528 - 30025 410 ## gi|313159766|gb|EFR59122.1| carbohydrate binding domain protein 21 5 Op 2 . + CDS 30043 - 32766 1758 ## COG1879 ABC-type sugar transport system, periplasmic component + Term 32958 - 33012 3.2 - Term 33091 - 33132 8.1 22 6 Op 1 . - CDS 33189 - 33380 224 ## PROTEIN SUPPORTED gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 23 6 Op 2 . - CDS 33480 - 34535 1168 ## COG1555 DNA uptake protein and related DNA-binding proteins 24 6 Op 3 . - CDS 34577 - 36388 1849 ## COG0043 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases 25 6 Op 4 . - CDS 36369 - 37184 975 ## COG2103 Predicted sugar phosphate isomerase 26 6 Op 5 . - CDS 37147 - 37995 765 ## BF0368 hypothetical protein 27 6 Op 6 . - CDS 38020 - 38682 246 ## gi|313159827|gb|EFR59183.1| hypothetical protein HMPREF9720_1549 - Prom 38742 - 38801 8.7 - Term 38784 - 38821 0.4 28 7 Tu 1 . - CDS 38981 - 39313 168 ## Bacsa_0048 hypothetical protein - Term 39369 - 39409 8.2 29 8 Tu 1 . - CDS 39445 - 41625 3544 ## COG3968 Uncharacterized protein related to glutamine synthetase - Prom 41780 - 41839 4.2 + Prom 42100 - 42159 4.8 30 9 Op 1 7/0.000 + CDS 42208 - 44058 2680 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit 31 9 Op 2 . + CDS 44061 - 46199 3176 ## COG1884 Methylmalonyl-CoA mutase, N-terminal domain/subunit + Term 46219 - 46258 9.1 32 10 Tu 1 . - CDS 46473 - 47615 1560 ## COG1979 Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family - Prom 47642 - 47701 2.7 - Term 47652 - 47704 16.7 33 11 Op 1 . - CDS 47718 - 49751 2811 ## Odosp_2490 PpiC-type peptidyl-prolyl cis-trans isomerase - Prom 49773 - 49832 1.6 34 11 Op 2 . - CDS 49842 - 51104 1916 ## COG1253 Hemolysins and related proteins containing CBS domains 35 11 Op 3 . - CDS 51108 - 51983 915 ## Odosp_3342 hypothetical protein 36 11 Op 4 . - CDS 51987 - 53321 1863 ## Odosp_3341 tetratricopeptide TPR_1 repeat-containing protein - Prom 53341 - 53400 3.9 37 12 Op 1 . 
- CDS 53438 - 54811 1945 ## BDI_0180 putative outer membrane protein 38 12 Op 2 . - CDS 54786 - 55526 615 ## COG1521 Putative transcriptional regulator, homolog of Bvg accessory factor 39 12 Op 3 . - CDS 55523 - 58849 5006 ## COG1197 Transcription-repair coupling factor (superfamily II helicase) 40 12 Op 4 . - CDS 58854 - 60404 2392 ## COG1530 Ribonucleases G and E - Term 60686 - 60741 12.1 41 13 Op 1 . - CDS 60765 - 60854 83 ## 42 13 Op 2 . - CDS 60890 - 61162 410 ## BDI_1995 DNA-binding protein HU - Prom 61224 - 61283 6.8 + Prom 61551 - 61610 5.9 43 14 Op 1 . + CDS 61637 - 63187 2647 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 44 14 Op 2 . + CDS 63188 - 63604 745 ## COG0802 Predicted ATPase or kinase + Prom 63704 - 63763 3.0 45 15 Tu 1 . + CDS 63803 - 64219 634 ## Odosp_2074 hypothetical protein 46 16 Tu 1 . + CDS 64328 - 65911 1942 ## COG0249 Mismatch repair ATPase (MutS family) - Term 65913 - 65942 1.2 47 17 Op 1 . - CDS 65957 - 67090 1728 ## COG1835 Predicted acyltransferases 48 17 Op 2 . - CDS 67096 - 67722 657 ## Odosp_2077 NUDIX hydrolase - Prom 67751 - 67810 4.8 + Prom 67695 - 67754 2.5 49 18 Tu 1 . + CDS 67798 - 69909 3329 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases + Term 69933 - 69968 7.2 + Prom 70046 - 70105 2.0 50 19 Op 1 . + CDS 70128 - 70292 296 ## 51 19 Op 2 . + CDS 70307 - 72661 2577 ## BDI_2795 putative lipoprotein + Term 72704 - 72741 7.8 + Prom 72829 - 72888 4.3 52 20 Op 1 . + CDS 72912 - 73082 130 ## 53 20 Op 2 . + CDS 73096 - 75555 1405 ## BDI_2795 putative lipoprotein + Term 75598 - 75636 8.5 + Prom 75750 - 75809 2.9 54 21 Op 1 . + CDS 76022 - 76801 866 ## gi|313159786|gb|EFR59142.1| hypothetical protein HMPREF9720_1575 55 21 Op 2 . + CDS 76823 - 78553 2151 ## gi|313159779|gb|EFR59135.1| putative lipoprotein 56 21 Op 3 . + CDS 78553 - 80682 2353 ## Bache_3264 lipoprotein 57 21 Op 4 . + CDS 80700 - 81521 1213 ## Bache_3263 hypothetical protein 58 21 Op 5 . 
+ CDS 81535 - 82536 1277 ## Bache_3263 hypothetical protein 59 21 Op 6 . + CDS 82540 - 84225 1970 ## Bache_3264 lipoprotein + Term 84262 - 84299 9.4 - Term 84409 - 84451 12.6 60 22 Op 1 . - CDS 84489 - 85181 753 ## COG4623 Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein 61 22 Op 2 . - CDS 85226 - 87340 2575 ## Palpr_0191 hypothetical protein - Prom 87361 - 87420 1.5 62 22 Op 3 . - CDS 87425 - 88159 871 ## COG1183 Phosphatidylserine synthase - Prom 88228 - 88287 3.9 + Prom 88089 - 88148 4.0 63 23 Op 1 . + CDS 88249 - 89322 1832 ## COG0836 Mannose-1-phosphate guanylyltransferase 64 23 Op 2 . + CDS 89360 - 90604 1825 ## COG0770 UDP-N-acetylmuramyl pentapeptide synthase 65 24 Tu 1 . + CDS 90850 - 91779 1220 ## COG1073 Hydrolases of the alpha/beta superfamily Predicted protein(s) >gi|313159762|gb|AENZ01000002.1| GENE 1 81 - 1061 835 326 aa, chain - ## HITS:1 COG:no KEGG:Phep_1385 NR:ns ## KEGG: Phep_1385 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: P.heparinus # Pathway: not_defined # 13 323 8 274 278 84 26.0 9e-15 MKTAIKAAAAAGLFILTLFPGCAQRQTDEANLRLLYWNIQNGMWSGQGDNYTRFVNWVSA QNPDVCVWCESVTNYKTASDEKIKPEEAYLPEHWGELARRYGHEYWYKGGHRDNFPQVIT SKYPIENVARLIGAVPDSIVSHGAGWAQIRKNGHTVNIVSLHTWPQVYGFGLADKLERKA SAAAREGDRFRRMEIEYVCNHTIGSVAGAEKQLWMMMGDFNSKSRVDNYEYGFPEDTTAF LVHDYICQHTPYVDVIAAKYPGEFQTTTGRRFRIDYVYCTRPLYDRIVSARVVSDEYTTP VRDPQELRNFWHPSDHRPIIIDFDMR >gi|313159762|gb|AENZ01000002.1| GENE 2 1079 - 2134 1015 351 aa, chain - ## HITS:1 COG:AF0871 KEGG:ns NR:ns ## COG: AF0871 COG0240 # Protein_GI_number: 11498477 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Archaeoglobus fulgidus # 4 309 2 299 335 134 32.0 2e-31 MKKVIAIIGSGMMGSALAFPAAENGHEVRLVGTHLDRDIIDECRRSNKHPKFDRAFPVGV KYYQIEEYREAVAGADFVIGGVSSFGVDWFLNEILVNLDPRLPVLSVTKGLINLEDGTLI SYPDYWRSELKKRGVDREVCAVGGPCTSYELVFHDQTEVAFCGRDTASLRLFKETLSTSY YHISLTHDVIGLESAVALKNAYALAVAMTIGLVNRHHGADAGLHYNSQAGAFYQAVKEMR 
RLLKIQQATEDCENIGIGDLYVTVYGGRTRRIGILLGEGKSYSEAMDILAGVTLESLVVA RRVAKAIYRKAELGQVSLQEFPMLVHANDVLDNGKDAELPWEKFTFDHTAN >gi|313159762|gb|AENZ01000002.1| GENE 3 2162 - 2977 501 271 aa, chain - ## HITS:1 COG:FN1891 KEGG:ns NR:ns ## COG: FN1891 COG0584 # Protein_GI_number: 19705196 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Fusobacterium nucleatum # 22 253 23 260 261 64 23.0 2e-10 MKKLLLTGIALVGAALVFAQPRLVAHRGFYTTPGSDENTISSLINAQQLGVYGVEFDVNR TSDGELIVVHGPKVGDRLDAQRDTYAEISRVVLPGGNRIPTLREFLEQGRKDPGTKLILE LKKHKTPEIETRIVEEIVSLCKKLNMLDQMEFTSFSEHACREFRRLAPQNKTLYISNSLW TPINADVAKKEGFQLSYSMYVFMNRPELIDRMNEIGVESTLWIVDNPEVVDWAVKHNVGF ISSNFPDRIKAYLDALRTVETARNGACNLIR >gi|313159762|gb|AENZ01000002.1| GENE 4 3012 - 5207 1648 731 aa, chain - ## HITS:1 COG:no KEGG:Sph21_2903 NR:ns ## KEGG: Sph21_2903 # Name: not_defined # Def: glycerophosphoryl diester phosphodiesterase # Organism: Sphingobacterium_21 # Pathway: Glycerophospholipid metabolism [PATH:shg00564] # 51 230 27 195 259 90 35.0 3e-16 MIRKIICMLTLAAAFVGCTKDEWPDQPDWSRIPDPSIPVDDGFMKPAACSNTVVAHRGGA AECGAPDNSMAALEYAMSLGCYGMECDIYWTKDNDIIVAHANGDCKVNNLQPWTATVAEL RAAGRLSNGEELPTLEEFIRRVMVEGNCTRLVLDVKRVDKPYAQPEYVINAARRACEIVT EMKAKHFVELICTGFNLDAMKAAHNCAVIAEVPIGMNSSRSGKEYGTLGFGWANLSAASG MDAAAGGKGSCSLEEYEKAGVALSVYNVDQRAGDGNAVYSTAAVNYYIANYKRFRTLCSN YPKWLIGKIDHAYKVYDGIRSEADFEAFAESLASDPTGRRFLDGNGEVVLHCDLTLNGFV PLSNFSGTFNGNGKTLTIGYRGDAQQIGLFKRLSGTVRNLTVAGRFESVRSDDSEIHLGA FAAETDNAAIENCTNRAEIVVADAADVTPRTMILSGFVGKAFNGVTLRNCRNTGNISFSS PALYMIGGFVGAVQEDDGLYTIADCHNTADFDNAGSNSGWNFMGGIAGKTISRQLVPGET SNYRLIVEECSSTGTISIAGPSKVRASGIVAQTQGAYRISGCTFSGAIESTDATKRDVVI GGIMAMADKECVGLVEGCTFSGRISAAQAGANNFFGGIYGNNGGAASVVNDCRTTASAYV GCPIGKSVGMLAGRPNKKGFTVSNCRIAGTVTNKQGAAVVITADNLEDWMFAGYGTSVAV TLKNNGYNDGK >gi|313159762|gb|AENZ01000002.1| GENE 5 5230 - 7299 1977 689 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159789|gb|EFR59145.1| ## NR: gi|313159789|gb|EFR59145.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 689 1 689 689 1304 100.0 0 MRLNFKKYMAGLATAATLLLTACYEDHTDLEDRTTRIRISPEIEAFEADGTASVSGNSNV TSVVLVKVGTRVSRMEWEAELVDWMPSVDETGPWATIQRVPVVSQYTEIAGPNTVAVTQS GFELTALPNTGYGRRCTVRITAEDGTVQDYELFQYGELADAEVVPALESVEISFQGDPVD LAYTTNMGDKYAYAIAYEEGGAEGWLTAEHRGEGLLTLTAEPWDDMERDRRATLTITVGD DETSKAAVDIPVVQLRKAEYYYVYGSALGEEAATALQMKRESDDSYRVSGFFFATEDNLI CINKDSRAGDPSWYLGADGELKNWARNVTASDLKIEANGYRTLTVDFAAGTWNWIRQNST PNCMPDEELANYPTKDYPTASGGVKTWMTVSLHWNGGPGIGAVKLGSGLVGGSKTGGYGK PSDTTVPYDVRNPAYDTAENGGGIEELRANDGTPLASKYGRLYSSFEAVTGQPNGALNKA YQIASPYGEPGAVLVDATGTAVTIENILMATLAAADDDAKAEAEHPVLKMQIQGICPYGW HIANLQDWKDLIWAAAQASKGSRYEIAESSASYKAIGGGSIANLSTILFDASWNTYSSGS PISPLAPDFGFNMFIQGWRLYDTGYNYGATSGDPRFYAWIPLLGQYTSKKTSFWRIYISG HTKTDMTLNDGFDLGNGSGAAIRCVKNYK >gi|313159762|gb|AENZ01000002.1| GENE 6 7323 - 8828 1281 501 aa, chain - ## HITS:1 COG:no KEGG:Sph21_5172 NR:ns ## KEGG: Sph21_5172 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 20 500 20 484 484 219 30.0 2e-55 MNYKNIRSLLLVWIAAGVVSCHGDLDIMQDNKLSVSNMWKTSIQVENSTYGIYSAMRDNF VQDQVNVLTWGELRVGEYMWRNSRTETIWLADLRSVVQNTMSSTTNAVSWSKLYTAIDQA NAVLKYAPAVAMTDTKRSWAIGQASFARAYLYFWAVRLWGDVPLNLNPIESVAQPEAYPI RAPKASVYAQIGTDIDRAVENAGSLGTDKYLATADAVNMLKAEYALWMYAVQKGGDDYLT MAEEALAAVGVKAGDNRLLTDYAKIFDGYGSNNKNSNEVVFALYNSRDENKTGGFSTYFA FSDGAIDPNAQATVPTNTTVQYLDYGDEYLELLHKSQTENNDSRVKTNLGEGPYGKENRA VVTWPNKFIGNADSSPMIRDCDILYYRYALAVMLDAELKYYRRDYAGALASLNIIAKRAY GKENFHTSATQGDVLKALCNEYLLEFPCEGVVWWALIRLDRIWEYNADLKARKDAANILL WPISKSARDKNSNLAQTEGWY >gi|313159762|gb|AENZ01000002.1| GENE 7 8855 - 12442 3670 1195 aa, chain - ## HITS:1 COG:no KEGG:Sph21_5171 NR:ns ## KEGG: Sph21_5171 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: Sphingobacterium_21 # Pathway: not_defined # 118 1164 24 1039 1056 768 43.0 0 MKKTFTFHTRSRLWILCAAVCMVWAVPLHAQTRRISLNLKNATIQQAVIALQQQGYSLSV KADDVDMKAPVDIHARNEELQAVVDRIFAGQDVNCIINDKSILITKVPSQPSSEDSPQES SVTGQVKGPNGLPLVGVTVLIDGTMVGTTTDASGNFRIRAKSGDTLVFSYIGYDERREKV 
GVRTEIDVTLAESSTLVNEVVVIGYGTQSRRTLSTAISKVSGEVFDNAPAATVGDALKGR VAGLHVQTSNAIAGESPRMMIRGGSSITMGNDPIYIVDGALREDLTGINTNDIESMEVLK DAASASIYGARASNGVILVTTKKGSLAKGPQIVFDLQLGCSSPSRKWDFMNAREYISFLR PVISKAYANGIEHNAKTMLSGPNAYGTGNTAPNANYSTRYLDYGQPVPAGYQWMYDPLDA NKVIIFTDTDWQSQWFTDAFWHKEYIGVNGGTDKLKYAASVSYHGDDGMVAMSSYKVFTL HGSTSFKITKNLEASTTFDLSRQKKHPLIDNYYQAIGRGIIAAPTAREFDDNGDWCQLSS NGNAHSAGWYESFYDREQAINRASGTFGLKWNITDGLTAFAQYNYFDNSYRGSYYAYGER NGTPNNVSLERNTTETREQTIRDIFTAHLNYNKTFRDVHTLNVTGGYEYMSQNYWKVKAN AKGASSDDVPVLGSGSIFTASNQDEKQAMISYFARASYNYDDRYIVSGTFRADGSSKFAV GNQWGYFPAGSVAWVISEEPFWRNAKKTANTFKLRASYGQTGNNGIGLYDAYGAFATGTY HGHTTLLPSAMANSALKWETTTQLDLGMDLGFFGDRLRFVFDYYNKHTDNMLFSISVPDT GPYSSVKANVGSARFYGVEVELSSANIRREHFSWTTDITYSFTRNKVLSLPGEYAYDEID ADGRATGRRAYRIGGYKASESGYRFGGTAVGEPLGRIYGYKIDHIIQTEAEADAALYDEQ SNGYRVGDGRMVKGRKDAGDYEWRNRKGSARRDGEEIINDEDQFYLGNVMPHSTGGINNT FRYKRLSLNVYCDFALGHSIINGMKAHLLRNTMGNCNSTIGRIAYDCWQHPGDTDAKYAR FFPNDSDWGNRNWRQSNFMVEKADYLCLRDVSLYYDLPEHWLRKIGIRKVTVGVTGNTLC YWTGVTGAISPESGIGSNAGAGMYTPVSTSNSNSDITANMMPVPRKIIFSLKLVF >gi|313159762|gb|AENZ01000002.1| GENE 8 12483 - 13484 1104 333 aa, chain - ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 15 292 12 284 323 67 25.0 3e-11 MSRIRKIISIYLSYRQPRSTRKEFAEWFAAPFDGETKQRLMREYWDALSPEMSQDRVQKA YARVRERIGFRSVSLPSPQGVRPVFRRAAVMVAMIAIPVLAVALYALIGHMNSAPKWQEI YAPYGQTRSVVLADGSQIVINSGSRLIYPDRFAGKERRVFLCGEAYADIAKNPKRSFVLS ADDVDILVHGTSFRVSSYVNDSEVEVALLSGAIDMQTKNLQQNCKIQMTPGDMVKVDKRS GRVTSMRFPGGTFANGIDDGHLTFINSRLSDIARQLERTFDVKIVIDSQQLADERYYSVF INHETLDEILSILEQNGDMKHRREGEMIHLYKK >gi|313159762|gb|AENZ01000002.1| GENE 9 13665 - 14228 497 187 aa, chain - ## HITS:1 COG:VC2302 KEGG:ns NR:ns ## COG: VC2302 COG1595 # Protein_GI_number: 15642300 # Func_class: K Transcription # Function: DNA-directed RNA 
polymerase specialized sigma subunit, sigma24 homolog # Organism: Vibrio cholerae # 8 182 16 194 194 71 25.0 8e-13 MPLDSEYELMQRLSAGNEDAFGELFGRYYPRVCAFVGCIVKEESTAEDIAQDIFLKIWER RDIFQGRVASFNGYVYRMARNAALNAIRQMTNIDWEQYIQIEETLPDESFEREYYSREKE LFIRLVVCRMPEQRRRIFEMSRYAGMDNQAIADELKISKRTVENHLTLALKKLREALAVF SLLFFMH >gi|313159762|gb|AENZ01000002.1| GENE 10 14275 - 15663 1323 462 aa, chain - ## HITS:1 COG:RC0082 KEGG:ns NR:ns ## COG: RC0082 COG2271 # Protein_GI_number: 15892005 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Rickettsia conorii # 7 446 17 430 431 258 35.0 2e-68 MTPEQTKKFKYWQTRTIVVTMVGYALYYFVRKNFNTAMPSIEVTFGITKAQLGLFLTLNG IIYGLSRFINGFIADRVSARKVMSLGLALSALVNIGFGFSDKMAMLVAGTAGGQEYITAL TIIMGSILLLNGYFQGMGVPPVPPLMTHWVPANELARKMSIWNMSHSIGAGLIFVMGAVL VHYFDNSAWRLCFLIPAAFSLLGAVALYLALRDKPSSVGLPELEQMKVPGEEQKPVRKSD KAYHAAFLRRMVFGNPIVWVLSVSNFFVYIVRFSLLDWGMMLLPHTKGISVAVAGIMVAA FEFIGGNLGMVIAGWATDRLFGSRAHRTCVFCMLGTIVMTIVFWCIPDTVSPWVMAVPFM LIAFFIYGPQALLGIAMSNQATKEASATANGILGVFGYASTLISGVGLGFIADRYGWNSI YAVILVFAVLGLLTLTTIWKAAPDGYGAAKKFTAEYERNSSR >gi|313159762|gb|AENZ01000002.1| GENE 11 16327 - 16464 74 45 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159264|gb|EFR58632.1| ## NR: gi|313159264|gb|EFR58632.1| ISPsy11, transposase OrfB family protein [Alistipes sp. 
HGB5] # 1 45 4 48 106 87 93.0 2e-16 MTDNGIRINMTEKGDLYENAVAERVNGILKSEWIDEECFESFQAA >gi|313159762|gb|AENZ01000002.1| GENE 12 17582 - 17749 63 55 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQMVPSDDGKNSRRRISAQSFQNSPMLLPRVGLFPMQYYNSLKMKKSPPNGLERS >gi|313159762|gb|AENZ01000002.1| GENE 13 17944 - 19272 806 442 aa, chain + ## HITS:1 COG:BS_yxcC KEGG:ns NR:ns ## COG: BS_yxcC COG0477 # Protein_GI_number: 16081032 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 1 438 1 441 461 256 36.0 8e-68 MKNKLLLFGYVFAASFGAFVVGLNLGGISGALEFITAEFGLSAMAMGLVTSAIMIGCLIG ALLGGRYSDKYGRRNMMIISAVMLILSAVGCAMASNAAWLIAARFLGGCGMGVLSAVIPI YISEISPAKWRGTFVSFYQLFIVIGILAAYCADFGMISWGNNWRWMLGLPLLFAAGNLLM LLFLPESPRWLIKQGEYEVARKAIARMGISSEDAAVMLETPKSSQKGGPKLSELFRGSTT HIVLLGSLLAVFQQITGINVIINYAPEILRQTGIGGDTALMQAIYVGIVNFLFTIVAVWL VDRLGRKKLLLWGCAGLVVSLAYLTYAFAQPLPGSIGILIVLLVYIAFFAVSLSPLMFVV TAEIYPSAIRGTAMALSTGISWACAFLVVQFFPIMLESFGAAIVFAGFGVLCLAAWLFIY IWIPETKGRSLEEIEKQLLKKE >gi|313159762|gb|AENZ01000002.1| GENE 14 19274 - 20134 497 286 aa, chain + ## HITS:1 COG:MA1840 KEGG:ns NR:ns ## COG: MA1840 COG0524 # Protein_GI_number: 20090690 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Methanosarcina acetivorans str.C2A # 11 282 35 312 326 160 36.0 3e-39 MYSKPLVIGIGEFLWDILPTGRKAGGAPVNFAYHASQNGVEGWAVSAVGNDDAGRDLLAV TASYGIRTLVATVDKPTGTVDVSLCDGQPTYTIHEHVAWDYIPLTEEMLSLARRAGAICF GTLAQRGEVSHRTTCAMVETVPADAYRIYDINLRQHFYSKELIDRSLRIANVLKINDDEL VRLQEMFALPEDTDAACRQLAEQYALRMVVLTGGDRFSSIYTREYISTLPTPRVEVVDTV GAGDAFSGTLIGALLSGRTIAEAHRTAVETAAYVCTCAGAWPPPRK >gi|313159762|gb|AENZ01000002.1| GENE 15 20160 - 21746 1274 528 aa, chain + ## HITS:1 COG:no KEGG:BF4324 NR:ns ## KEGG: BF4324 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # 
Pathway: not_defined # 9 518 20 526 554 481 46.0 1e-134 MKRLTTILMALCTIGCGTHGSPSDAASQVTMASLLDEMISYDAVTGYPAVNYRAAQVSSY DRRTVDLHEPGWFANDDGAGFERLDTIRGRVEKVLFDEKGPGAITRIWMTTNDKRGTLRF YFDGASTPEIEIPAYDMARFPVTVGEALSLTHTHYEDELSKTGGNTFFLPLPYARSCRIT LEESDYTVKIPRYYHVGYRTYDNGTNVRTFSLREVRKLTDKIAAVNRKLFSPETYTDGTE CTRTVTPMSNGATSLHLPDGNKAIRSLTITLSEFDPADLVEIMQRTWLRIDFDGIRCVWC PLDCFFGAGTGAPASSNWYVSSDGKGTFTSRWVMPYAEHADLRLEKRTDIPFTAVITGYV DDFDWTAQTLYFHATYHDETSIPVNNDYNSPDNLDWNFTTITGRGVYCGDVLSLYNHCPD WYGEGDEKIWIDNDTFPSFMGTGTEDYYNCSWAPVVPFATPFGGAPRADEASSHGYNTFV RTRNLDVIPFGQRLQFDLEMLSWNPGAVDYRAAAFWYGTAASAATEKN >gi|313159762|gb|AENZ01000002.1| GENE 16 21764 - 24700 2550 978 aa, chain + ## HITS:1 COG:no KEGG:Sph21_4291 NR:ns ## KEGG: Sph21_4291 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: Sphingobacterium_21 # Pathway: not_defined # 28 978 145 1108 1108 706 40.0 0 MKQLYFMLCMALMLLYGMPASAQTSHGISGKVTDQTGNGIAGASVIVKGTTTGTSTGSDG SYSLNVKDGNATLVCSFIGMKSAEAAVNGRTEVNFALEADAIGLEEVIAIGYGTIRKEEI TSSITRVSSDEFLQGSVSNPLQLLQGKVAGLGISKTSGNPSGELSIMLRGISTLAASSAP LIVIDGVAGGSLNAISPEDIESIDVLKDGSAAAIYGTRGTNGVIIINTKRPQSGRVALEY KGYVTIDQMLDETSDYPDATELRSLKSRLAKEDSRFDLINDFKGDTDWVREITRTPVSQT HYVSLQGGNARTNYLASVTYNDKQGIYKGSFDESLTAKLSINHAMFDDRFKIALNVSNKI VTQGIVPDDLYMQALSRNPTIPVYNADGSYYENSNGANPVGLLKEQNTENKYDQLMMSGR ISVEPVKDLTISATGIYMGDFNDYAYSTTQKHYSSTMGSTQGQARLSGGHGEDKTLELQA DYSRKFGRHTLQATVGYSYNDYIHQSSEMYAYDFPIDGFGAWNIGSANSTLDGTSKLTSY KYRIKLIGFYGRVNYNFDNKYLFMASLRHEGSDKFGKNNRWGTFPAVSAGWRITQEKFMR DAYWVSDLKLRVGYGITGTAPSSPYQYVPLYNFSTAYMGYDGGKWVNGIVPTNNANDDLK WERKKELNVGLDFALFNSRFRGSIDFYNRRTDDLLYTYTVPTPPNITNSILANVGSMRNQ GVEVMLSGDLISRKDMALTLSGNFSYNKNKLISLSNDRYTMEYLTLGSTGEPMQTYTHRL EDGWAVGNFYGWRVDGLKNSTTWNVVGAENSDPSEDNKTIIGNGMPKMFAALQASYRWKG LDVSVSFRGAFGFDILNAYRMKYETLAWLSSFNVPKSAYQKIGEYYNFAPSIYSDRYIEP GDYVKLDNLTIGYTFDLSQRNKVIRSIRVFCTGMNLLTISGYSGLDPEVAIVGLTPGVDP INKYPNLRSYIVGASFNF >gi|313159762|gb|AENZ01000002.1| GENE 17 24712 - 26310 1585 532 aa, chain + ## HITS:1 COG:no KEGG:Cpin_7233 NR:ns ## KEGG: 
Cpin_7233 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 530 1 538 538 332 37.0 4e-89 MKRKTIHLFGACALLLASCTNLDPDIYSDMTIDKILEDSEQSSAYLLTPMYGQMRWFNED RSIWDLTELGTDAWVIPTNSDGGWYDGGIWLRLDNHEWKSTDPHFSTCWSHLWYGITTCC NRVLHQFEEAGLELDEQTLAEVRAVRAFYYYYLLSLFGNVPIMETYDVPKDYMPTTRDRK EVYDFVVRELRESMDKLPEEQRYSRFNKWAAKHLLAKVYLNAESWLGSEYAAKRDSTLIL CNEIIGSAKYQLDDSFSNIFSLQSETSPEIIFAVPYDETTGSPIFHCIYAKTQHWACKPI YNAGSGGYNGLRAVPSYVKETFDATDPDHYNDKRYRQSYCMGQQYDYLTKQPLYFTDGVT PFNYVNEIASIDAAREFDGYRFGKYEVKIGQKWETDQDWVVFRFGETLMMKAECMLRDGD AQGAADIVNEVRRRSFDPSLPRSVRELTAEQLSALTVVDGVSVPYGEFLKELGREFTGEG MRREQLIRWNLYVNGSWTFHTPKQKKYLELYPIPSSELLSNINLRQNDGYSY >gi|313159762|gb|AENZ01000002.1| GENE 18 26294 - 27910 1360 538 aa, chain + ## HITS:1 COG:no KEGG:STAUR_2023 NR:ns ## KEGG: STAUR_2023 # Name: not_defined # Def: hypothetical protein # Organism: S.aurantiaca # Pathway: not_defined # 55 532 28 507 657 286 38.0 2e-75 MDILIKTDRRPVSGIRSMLFLIAAIFLLPGCSSASDTEIPDPEPPGPEPLETGTLLPDNI TLVARVTGRSESGETIPNPNRTDARFNIGRTDYNNMWDAGNGTVMCAFGDNFDYGGGNWK SNAIALSSDRDLTDGLYYSGMLMDGNAVKEIVVSRAKTGQYPDGSEYEVTCIPTGGIAVG TRQYLNYMSIHDWTPTGDNDYWSVNYSEIVYSDNYGTAWTRSGVKWNADSNFTQVAYLKE NGLIYMYGTHSGRYGNVYLARVSETKLLDKSAYEYWTGGGWERNEQAAEPVARGTASEMT VAYNSRYKRYMMMYLSVNQRAVVYRDAPSPEGDWSGEKIIMYEDGNALYAPYIHPWFNED DELWFVISHAVPTWNVFLMRADLNWDQAGINLLAEGGFEEHPTQALSYKTMWHVNAAALT SRDAHSGKIACRFSNTNSGQWQDVCTQTVSLQQNADYSIECWVKSEEVLPEGAYVGVRLT DGTIHDITGTAQAGEWTRFGCEFNAGQAVSADVFFGVWGAPEMTLTVDDICLKPVKTE >gi|313159762|gb|AENZ01000002.1| GENE 19 27934 - 29457 1539 507 aa, chain + ## HITS:1 COG:no KEGG:nfa54520 NR:ns ## KEGG: nfa54520 # Name: not_defined # Def: hypothetical protein # Organism: N.farcinica # Pathway: not_defined # 263 505 113 347 361 135 33.0 5e-30 MKNCYFAMLLLLSAAFAAGCADDKEEGGNDIKLNPDPTQLSNITLMKATFAADEFNMISE PGFELFPDEAIDYRSLWYFEGAYKEPSLVSKHTQTHGGQVALKLENPNDGSWCDACLQSI ALKKGKDYTFSCFGQASWAGMNAFTGVRLEGGPIYDGQAGDWNPDVWTEFTKEFNSGDYT 
QGNVFCGAWGYPGVWVALDDFRLVPTGTTQTSTKLGACQTTGTVSNASFTDISFAGKAVI WAGPNNTVSMALADVSAGGKHYANAFALSNDMNPDDGFYIAQVSPGNDGIVPILEPSGDG ETACVPTAGVTVNGKQYIHYYSFKSLDPEDSDAWTANFSGLLSSADGGKSWTREIRGRWS GAGHFVQAAFYLSDKYLYMFGSDAGRSTPKIYVARIKSDGDIATAANWTYWDGTDWVAGD PESAAAITYGNASEMCVAYNATHKRYIMIYRSGTTGGLVYRDAGSPEGDWSGEKLLALDN ANQGGWFSPSVYPYSSGDNMYFIVSQL >gi|313159762|gb|AENZ01000002.1| GENE 20 29528 - 30025 410 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159766|gb|EFR59122.1| ## NR: gi|313159766|gb|EFR59122.1| carbohydrate binding domain protein [Alistipes sp. HGB5] # 1 165 1 165 165 351 100.0 1e-95 MYKTILPLISYILFACTPHGRENLLADGGFEESAKMWDIGKSELTSEAHAGEFACALYVE TPRYIPVWSDICIQKIIVMPGKEYRLSAFARMSLPPLTNGVYIGVRRPDGSVLKDRLFLL SGRDWTRMALDFDSGEEQELTVFCGVWTDRNSTYFVDDFSLNMIK >gi|313159762|gb|AENZ01000002.1| GENE 21 30043 - 32766 1758 907 aa, chain + ## HITS:1 COG:SMb20671 KEGG:ns NR:ns ## COG: SMb20671 COG1879 # Protein_GI_number: 16265126 # Func_class: G Carbohydrate transport and metabolism # Function: ABC-type sugar transport system, periplasmic component # Organism: Sinorhizobium meliloti # 31 301 33 313 322 167 36.0 1e-40 MPHCTIRLFLLLAICMMTNGCRRESISEPKRHYVIGFSQCTNDLWRQIMMIQMQAEAAKY PELSIVVSNAHNNTQLQIDQIREFIEAKVDLLIISPNESEPVTPIAVEAFDMGIPTIIWD RKIHSDHYTTCISADNYDIGRDVGRYINTILSEGSTILEITGLMSSSPAVERHKGFLAEA SESYSVHTIPGDWKPDVAKERVAAIGHYNDIDLVFAHNDDMALAAYNVINAADSLCAQRI KFIGIDALVGVDAVLDGRLQASFLYPTGGDKVMAIARRILLGKRVEKSYQLQSALVDSHN AYTLKAQQEQIVSYQEQINKQKTVLEQYDRSVDNLKYSLWAVIIIALVAGGMGIYAIRLN LRLRRRNEILTAKNAEIEIATRELMDKHAQIENVTAHKLQFFTNITHEIRTPLTLILNPL DSIVKREKDPEIQRNIWTIQRNARHLLNVVNQILDFRKIENNKMMLTLKQVDIVQFTQEI LSYFETYAETEKIVYKFRTDTSHQQLWIDTDKIEQVLLNLLSNAFKYSPKYGTILVSVTD NGPSVLIEVEDNGKGIDKESLPRIFDRFYALKNTSNYSTGIGLHLTREYVELHHGYITAD SLPGAYTVFRVELFKGKSHFGENVTFDETLPTNYLADEVIDDNKVNELLAKKYNETIVIA EDDSEILAYLKDELSTNFRVIAVNNGYDAVKAVMDDEVSLILSDVLMPALNGFQLCSNIK SNIATCHIPIVLLTALSDDNQRIYGIAEGADEYIHKPFNMQYVRLKIIRIIEERKRLAST FAQKFGRLTRHEAANLPCVDDVFRDKLFNLIEAQHSDSNFKIELMSDMLGMSRILLYRKI 
TSIFGMSPSDLLRNYRLQKAVQLLTDQRRNVSEVAYIVGFSSPAYFAKCFKSVYNMTPTE YMYRTRE >gi|313159762|gb|AENZ01000002.1| GENE 22 33189 - 33380 224 63 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883083|ref|ZP_02064086.1| hypothetical protein BACOVA_01051 [Bacteroides ovatus ATCC 8483] # 1 63 1 63 63 90 63 2e-17 MIIMPVKEGENIERALKKFKRKYERTGVLKELRRRQYFTKPSIAKRVAKQHAIYVENMYR DED >gi|313159762|gb|AENZ01000002.1| GENE 23 33480 - 34535 1168 351 aa, chain - ## HITS:1 COG:TM1052 KEGG:ns NR:ns ## COG: TM1052 COG1555 # Protein_GI_number: 15643810 # Func_class: L Replication, recombination and repair # Function: DNA uptake protein and related DNA-binding proteins # Organism: Thermotoga maritima # 213 324 46 163 181 66 33.0 7e-11 MGKFFTEREIRAVAVFLPLAGLLVLGIVLVRPKADPAAALRVEAEMEERADTVVMQPFDP NTVDYDGLRRLGLTKHEAVSLLKYRAAGKIFRIPEDVTLCYGISDSLFYRLEPYIRIGRK YAIAPQEYRTGRVVSEPMPPAPFRIDTVSARYLRAIGALSKRQAEAFVRWRDLSGIYDME ELRACYVVSDSVASALEPYVIFPERKAEPIDVPVEINTADSAALRSVAGIGEKTVVSIIG YRDRLGGFLRAEQLAEVPGVTERNYEKILKQIYCDSCEIRKIDINFASPKELGRHPYIPP QTLRKLLKRRQLKGGWSTAEELIEDDIMTREEAARLVPYLRFGARSGPDDE >gi|313159762|gb|AENZ01000002.1| GENE 24 34577 - 36388 1849 603 aa, chain - ## HITS:1 COG:aq_1612 KEGG:ns NR:ns ## COG: aq_1612 COG0043 # Protein_GI_number: 15606727 # Func_class: H Coenzyme transport and metabolism # Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase and related decarboxylases # Organism: Aquifex aeolicus # 2 451 5 451 486 433 49.0 1e-121 MYRSLNEYIARLEREGELIRIGAPVSPVEEIAEITDRISKTPGGGKALLFENTGTAFPVL TNLFGSERRMALALGVGTLDELSERIDVLLKQAAAPRNSLSDKLRALPMLAEMVRWFPRT VSGRGACQQVVLTGAEAALSALPVLKCWPADGGHFVTLPMVNTVDPETGVRNVGMYRMQV FDDRTTGMHWHVHKTGARHYDAYKRLGRRMPVSVALGGDPAYTYAATAPMPDNMDEYLLA GFLRRRPVKLVKCVTNDIYVPADCDFVIEGYVDPAEEKAAEGPFGDHTGFYSLEDRYPRF HVTALTRRRDAVYPATVVGIPPQEDAWIARATERIFLAPIRMALQPEVRDLTMPEAGTAH NVAVVSIDRRYEGQAVKVAQSLWGAGQMMFNKYLLVVPSGTDVRDSDALARLLRRIDPVR DLVRSEGILDVLDHATATPGFGGKIAIDATTAGAAPNPASQSESVPQLSLPEGCRTDLLE KWGAALAFAEPGAGVRAPEGVRYFVLFDPAAAELTPEELLWLAAANTDPRRDVRTEGATL 
VIDARSKRPGDGGNPARFPNIVTASEATVGLVDRRWTEYGLGAFVESPSRRYRRLLLSGG AQW >gi|313159762|gb|AENZ01000002.1| GENE 25 36369 - 37184 975 271 aa, chain - ## HITS:1 COG:YPO2925 KEGG:ns NR:ns ## COG: YPO2925 COG2103 # Protein_GI_number: 16123112 # Func_class: R General function prediction only # Function: Predicted sugar phosphate isomerase # Organism: Yersinia pestis # 14 254 17 257 295 237 53.0 2e-62 MENRITEQASAYDDLQRMSVHDILTGINREDARVHEAVRTTIPVMERLVERIVERMECGG RMFYIGAGTSGRLAVTDASELPPTYGVPFDRVIGLIAGGDGALRRAVEHAEDDMEGAWRD MAPYAPTGKDVLIGIAASGTTPYVIGGLRTARKHGLLTGSITCNPASPVAVEAEYALEAV VGPEFVTGSTRMKAGTAQKLMLNMLSTAVMIRLGRVEGNRMVNMQLTNDKLVARGTRMVA EASGLDQTRARELLLRYGSVKKALEHVPESE >gi|313159762|gb|AENZ01000002.1| GENE 26 37147 - 37995 765 282 aa, chain - ## HITS:1 COG:no KEGG:BF0368 NR:ns ## KEGG: BF0368 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 270 1 277 278 229 45.0 1e-58 MKIIADSGSTKCTWLLTDGVHTREVRTRGINAVQHSPEQIREALAELPPCGAAEAVYFYG AGCGHTFPDATAKMVREFGEHFGAARIEAESDLLGAARALFGRGEGVACILGTGSNSCWC RGGEIVENVPPLGYVLGDEGSGATLGRNLVNGIFKGHISLRGEFLAAHGLTYEEIIRRVY REPYANRFLASFAPFVHAHLDRPDIRDMVERSFADFAERNLSGYPVHLPVACVGGVAAAF GDLLRETLTRCGREVVTIVRSPAAGLTEYHYGKQNNRTGFGL >gi|313159762|gb|AENZ01000002.1| GENE 27 38020 - 38682 246 220 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159827|gb|EFR59183.1| ## NR: gi|313159827|gb|EFR59183.1| hypothetical protein HMPREF9720_1549 [Alistipes sp. 
HGB5] # 1 220 1 220 220 451 100.0 1e-125 MLTVCVTAAAYAQGPLSAADRTRNAKTAARIERYRLDSKENNLFVTAASGITAHNQATAG VKVEYSRQLKGNFYWGADFSARWHLGSLMDYDWSGRGPDPYRNTVAQDIYKLDAMAYYRL PVVKGRLFLRFGAGVGAGYHRMIRDVDGEKDMDKDKVLPYFNVECAWILRVTKGFELKFS PTILLAPSEFSVSPVKLGAPTDVTPWLTDAGFSLTCGWRF >gi|313159762|gb|AENZ01000002.1| GENE 28 38981 - 39313 168 110 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0048 NR:ns ## KEGG: Bacsa_0048 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 4 107 3 106 145 114 53.0 1e-24 MEYLKSWFRGFEQGIADLQPQQREVLFRACAVNCVHGGPFGLYRSLFEAAEGDLDRFFVK IDELDGVRGEIVCAGREYNLCFEACSCALHRAGCVNTPMLCECSRQSCFT >gi|313159762|gb|AENZ01000002.1| GENE 29 39445 - 41625 3544 726 aa, chain - ## HITS:1 COG:slr0288 KEGG:ns NR:ns ## COG: slr0288 COG3968 # Protein_GI_number: 16331104 # Func_class: R General function prediction only # Function: Uncharacterized protein related to glutamine synthetase # Organism: Synechocystis # 27 726 27 724 724 619 46.0 1e-176 MSSLLRFKMVDAAINHTAVEVNAPEGRPSDYFGKKVFGRAAMRKYLNKATYEALVDTMEN GTRLTREVADSIAAGMRQWALEHGADHYTHWFQPLTGGTAEKHDAFADPDGCGSVLEEFS GKLLVQQEPDASSFPNGGIRNTFEARGYSAWDPTSPAFIVDTTLCIPTVFIAYTGEALDY KVPLLRSITAVNKAATEVCRYFDKNVQRVVSYLGWEQEYFLVDESLWAVRPDLMLTGRTL MGHESAKNQQLEDHYFGAIPTRVMAFMKDLEYECLKLGIPVKTRHNEVAPNQFELAPVYE EANLANDHNQLLMTVMDKIARRHRFRVLLHEKPFKGINGSGKHNNWSLGTDTGVNLFGPG KTASENLQFITFLVNAISAVYKFNGLLKASIMSATNAHRLGANEAPPAIISTFLGTQVSA VLDKLAASKGDDAIRFDAKNVFKMSGISHIPTLLLDNTDRNRTSPFAFTGNRFEFRAVGS SDNCAEAMIVLNTAMAYELTEFRKKVDAKIEAGMKKEKAIYEVLKQMIKACKAVRFDGNG YSDEWKAEAKKRGLDCETSTPLIFERYLDKATLEMFGSMGVFTDVELEARTEVKWETYTK KIQIEGRVLGDLAMNHIVPIASKYEALLLDKVYKMSQIPGLNASADIALIKKIQYHTAEI QRLTGEMIDARKKANKIEQMREKAIAYHDTVSVFFDEIRRHIDKLEEIVDDQMWPLPKYR ELLFLR >gi|313159762|gb|AENZ01000002.1| GENE 30 42208 - 44058 2680 616 aa, chain + ## HITS:1 COG:BH2956_1 KEGG:ns NR:ns ## COG: BH2956_1 COG1884 # Protein_GI_number: 15615518 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal 
domain/subunit # Organism: Bacillus halodurans # 9 479 9 477 525 240 31.0 5e-63 MANTKREKLFTEFPPVPTEKWEEVITADLKGADYERKLVWKTGEGFNVRPYYRAENLEGI KFLGSQAGEFPYVRGTRAHNRWRVHQTVSVVCPKEANAEALKILNAGVDSLGFCIASEAF TAADLDTLLGEICIPAVQLTFCGQKTADVAELVLAKIEKEGIAKEDVRIAFCIDPLVKGL STKGDFCSPNGEKCFARIAELIRKTKEYKHIRVVTVSGQIFGNSGSTIVEELAFVLSAGH DYLVRLMDAGLTIEEAARKLRFSFSVSSNYFMEIAKFRAARMLWANIVKGYNPEKNCACK MQIHAETSKWNQTVYDPYVNMLRGTTEAMSAAIGGVYSLEVTPFDASFENPTEFSKRIAR NVELLLKHESHFDQVVDPAGGSYYIENLTQSIAAEAWKLFLEIEEKGGYTEAYKAGFIAE RIKASAAAKDKNIATRRQILLGANQYPNFTEVAGKEITAESVTRKQAEGNVLVPYRGAMA FEEMRLHVDRSGKEPKAFMLTCGNLGMARARSQFSCNFFACAGIKVIDNTYFKSIEEGVK AALESKAQIVVVCASDDDYAEAAPKIKELLCGKAILVVAGAPACAPELEAQGITNFINVK SNVLETLKFYLKEMGI >gi|313159762|gb|AENZ01000002.1| GENE 31 44061 - 46199 3176 712 aa, chain + ## HITS:1 COG:BH2955_1 KEGG:ns NR:ns ## COG: BH2955_1 COG1884 # Protein_GI_number: 15615517 # Func_class: I Lipid transport and metabolism # Function: Methylmalonyl-CoA mutase, N-terminal domain/subunit # Organism: Bacillus halodurans # 32 585 29 582 582 860 74.0 0 MRAKFSELTYDAGQQKSCCASKGCGQVEPWLTAERIPVKGAYTAEDLEGMEHLNYAAGIA PFLRGPYSTMYVMRPWTIRQYAGFSTAEESNAFYRRNLAAGQKGLSVAFDLATHRGYDAD HPRVVGDVGKAGVSICSVEDMKVLFNGIPLDKMSVSMTMNGAVLPVLAFYIVAGLEQGCT LDQLAGTIQNDILKEFMVRNTYIYPPEFSMRIIADIFEYTSKNMPKFNSISISGYHMQEA GATADIELAYTLADGLEYLRAGINAGMSVDAFAPRLSFFWAIGMNHFMEIAKMRAARMLW AKIVKQFDPKNPKSLALRTHSQTSGWSLTEQDPFNNVARTAIEAMGAALGHTQSLHTNAL DEAIALPTDFSARIARNTQIYIQEETNVCREVDPWAGSYYVESLTQEIADKAWERIQEVE KLGGMAKAIETGIPKMRIEEAAARKQARIDSGTEKIIGVNEYRLEKEAPIDILAVDNTAV RESQIKRLKELRANRDEAAVKKALAAITECVKTKQGNLLELAVEAAKVRASLGEISDACE VVVGRYKAVIRSISGVYSSEVKNDKQFERAKELCAEFAKKEGRQPRVMIAKLGQDGHDRG AKVVATGYADIGFDVDMGPLFQTPEEAAKQAVENDVHVVGVSSLAAGHLTLVPQIIAELK KLGREDIIVIVGGVIPAQDYDQLYKDGAAAIFGPGTPIATAAIKILEILLAD >gi|313159762|gb|AENZ01000002.1| GENE 32 46473 - 47615 1560 380 aa, chain - ## HITS:1 COG:alr4566 KEGG:ns NR:ns ## COG: alr4566 COG1979 # Protein_GI_number: 17232058 # Func_class: C Energy production and conversion # Function: 
Uncharacterized oxidoreductases, Fe-dependent alcohol dehydrogenase family # Organism: Nostoc sp. PCC 7120 # 1 376 1 380 384 410 55.0 1e-114 MNNFIYHNPTKLVFGKGQIARLGKLIPADKKIMITFGGGSVRRNGVYDQVLKALEGRDWV EFWGIEPNPSVETVREAVALGRENGSDFLLAVGGGSVIDGTKLIAAGLLYDGDPWDIVLK GQADKTVPLGTVLTMSATGSEMNSGSVISRQETKEKYAFYGDYPVFSILDPETLYSLPQR QIACGLSDTFVHVLEQYMTTPGQSRLMDRWAEGILHTVIEIAPKIRENQHDYAVMSEYML GATLALNDMIRMGVTQDWATHMIGHELTALHGLTHGATLAIVINGTLRVLREQKRGKLLQ YGERIWGIADGSEEERIDRTLEATEKFFRSLGLSTRLSEEGIGETTIAEIERRFNASGVK YGEAQNVDGAAARRILEACM >gi|313159762|gb|AENZ01000002.1| GENE 33 47718 - 49751 2811 677 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2490 NR:ns ## KEGG: Odosp_2490 # Name: not_defined # Def: PpiC-type peptidyl-prolyl cis-trans isomerase # Organism: O.splanchnicus # Pathway: not_defined # 1 677 1 708 708 313 30.0 1e-83 MASLNTLRTKFGIVLSIVIAGALLAFILSLKTEMGFSGNDPRVGVIDGEKINYSEYYNQY EQVKAQSGAQESNEQQSAMLANAAWQALIGKYVLTPGFDKMGLRVTEPERMSMVSGQHPS QAFYNAFADPRTGEYNVAAVHQFLSEAEANAQAQQAWAQLNEQARMEREVAKFLGLIKGG VYVNSLEVANGVNSANNTYAGKWAGKKYSAVPDSLIQLKSSDIKAYYNSHKNMFKQTPSR ALSYVVFEVSPTDDDMLALEKSVAEVGAQFAATEELKSFVRANRNGKIADNYVSAKQLSE EEAKALLDGATYGPVLKNNEWTMARALDTKIVPDSMGIRHIVLPYTQEALADSLLTVLKG GADFAQVAAQYSVYDATAANGGEVGVMPFSAFSGEFAAALANAKTGDIVKIASGDAIQLM QVYRADKPSKHVQVASITYPVEASAATRRDIHNQAGTFSVNAKGSVEAFNDAASAAAVTP RIASLAQGERTIRGLEDSRDVARWAYGAEVGDVSEIFPVGKDYVIAMLTEIDDNEFAPLE KVSAQIRAQVLRDKKYDYIVKELSGSTLDEQAKSLGTEVADFDNVTFGAFYVNGPGFEPR LIGAISSTTEKGVLSAPVKGLSGVYVFEVDDIQTSDKQTAEGEKVRAQAMAESMAQQFSV QAIQQMAKIQDLRGKYF >gi|313159762|gb|AENZ01000002.1| GENE 34 49842 - 51104 1916 420 aa, chain - ## HITS:1 COG:TM0845 KEGG:ns NR:ns ## COG: TM0845 COG1253 # Protein_GI_number: 15643608 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Thermotoga maritima # 9 420 19 430 455 168 27.0 2e-41 MLTSVILIVLMLLLSAFFSGMEIAFTSKNRLKLEIDRKQSRMFDRIADIFSRHPGQYITT ILVGNNIALVIYSLYMSLLLRGIFYALGWESIARNGSVAIETAVSTVIIIFFAEFLPKSV 
FRNNPNFYYRALAPVIYFFYLLLYPIARLTTLISHGILRLTGRRVEERTTTHSFDREDLA SLLDTNSSEPRPEPDNELKLFQNALDFADLRVRDCMVPRVDVEAVDIDDTTIEQLTARFV DSKYSRIFVWRKSIDNIIGYINSKSLFTRPAGISDVMMQVNFVPETMPLQLVLQNFIKHR TNIAVVIDEFGGTAGVISLEDVLEQIFGEIEDEHDVPDLTEKQVGPDEYVLSCRLEVKYL NEKYGLGIEESREYDTLAGFIIFNYEGIPTAGETVFVGGLQVRILRTTRSRIDLARVRKL >gi|313159762|gb|AENZ01000002.1| GENE 35 51108 - 51983 915 291 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3342 NR:ns ## KEGG: Odosp_3342 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 10 194 10 192 192 92 31.0 2e-17 MNRLIARYAMVALSVMGSAILLFSCERDEAEDAAASEETMRTEYSENLSIVESQNGRRSY HFVTPLVEGYSLAREPYREFRKGVKITTYKDDSLSTVDAVLTANYAIYYENRKLWEAKGN VVVVKSDGKNLYTQQLFWNQQTKKIYSNVDSKIVQNDGRDVFIGEGFESDEEFKDWRFRR MKGRMEVEVKPTEQEVEVVPDSTAGGRTGASGPVPAQQSGKTAVASAGEPMPEPAVKPAV APVPVRRPVRRDAPGRPASAGEPAEARPRTEQQQMKPVDLKLDDDEIQSTQ >gi|313159762|gb|AENZ01000002.1| GENE 36 51987 - 53321 1863 444 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3341 NR:ns ## KEGG: Odosp_3341 # Name: not_defined # Def: tetratricopeptide TPR_1 repeat-containing protein # Organism: O.splanchnicus # Pathway: not_defined # 1 444 1 440 441 163 29.0 1e-38 MKNVKILLSAALVFVGATALAQDFSAPQYEKWGDTPEQREKNILNSNFLKESCDNRDYNA AAHYLNELINSVPTASVSFYQRGAVIYKNKINRAKSVAEKNTYIDSLMLMYDLRAKYFGD HPKQGTAFILDQKAREFLRYKPNDRKGIREAFRAAIEAGGDNTDPETVVAYFSNLCDDYK NTDEVLPDEIIAEYDRLTPFFEKNPNANEYKAQFDAAFGLSGAASCENLEKLFRGKLEAA PDDEALLAQAVALMSRAKCDGDFYFTLAEKYYAVKPSSETAMFLAQAFQSKGDYAKAIKY LNEALAVEQDPAERQLLLVRIALIDLVANDIPGAASAARQARDLNPEDGVPYYVLAQCYA ISAANCGGFAGQATFWAAYDTMSKAIELLPADSEYIESAKTSLNAFRSRFPTSEECFFNE LQAGSRYTVTCGTAAGVVTSVRPR >gi|313159762|gb|AENZ01000002.1| GENE 37 53438 - 54811 1945 457 aa, chain - ## HITS:1 COG:no KEGG:BDI_0180 NR:ns ## KEGG: BDI_0180 # Name: not_defined # Def: putative outer membrane protein # Organism: P.distasonis # Pathway: not_defined # 30 457 25 426 426 159 29.0 3e-37 MQVKNTLIKLVVIAAVMIPCAAAAQTSSINAFSPYTMYGIGEQNTPGTLPMRSMGGVGVA 
MRSSGVVNLLNPAAFSAAPQKSFLFNFGLEGQNYYNSQKVDGMSKSTAYNTFNFHDIAFQ LPLAKKLGLGFSLTPYSSVGYRTKYTHDYDPSDPVWGNVGRAQYIYQGEGDVTEVKLGVG WELFKNFSLGIAAQYYWGDIDRSFVMTPTPITGDGSYSSTVGTDNYSISSIKGQIGVQWN AILNQKRALTLGAAYDFGGDLNPTVTKNIYVGDLYNSTVKGDTTHLALVLPRQLSAGVFY QTSKWTMGVDYVYQDWGGRNRQTESTGVSGPDRSAFAVAYADTHTIKAGVEYTPNRYDVR NFLKRWSYRAGFRYGTHNQTYNGDKLGQYAVTLGIGVPVKFLAISSIDVGVEYGRRGYNL AERLGLVRQQYFKFSIGFTLFAGSENGEYWFLRPKYD >gi|313159762|gb|AENZ01000002.1| GENE 38 54786 - 55526 615 246 aa, chain - ## HITS:1 COG:TM0883 KEGG:ns NR:ns ## COG: TM0883 COG1521 # Protein_GI_number: 15643645 # Func_class: K Transcription # Function: Putative transcriptional regulator, homolog of Bvg accessory factor # Organism: Thermotoga maritima # 1 222 1 230 246 92 31.0 7e-19 MNLVVDIGNTLVKLAVFDRGEIVFQRCVERLHPSMLEELFAVWPVRRAVVASTRGEVGEV ADLLRPRVEYLLEFSSQTPVPIGNAYLTPETLGRDRLAAAVGATVLYPGRNVLIVDFGTA VTIDLVTADATFRGGCISPGMKTRFRALHDYTAKLPLCAATESEELAGLTTRQAIELGVM NGIAFEIEGYAARMRSQIDDLCVIFTGGDANFFVKRIKNTIFANCNLVFCGLNRILEYNA SEEHLN >gi|313159762|gb|AENZ01000002.1| GENE 39 55523 - 58849 5006 1108 aa, chain - ## HITS:1 COG:SPy0008 KEGG:ns NR:ns ## COG: SPy0008 COG1197 # Protein_GI_number: 15674256 # Func_class: L Replication, recombination and repair; K Transcription # Function: Transcription-repair coupling factor (superfamily II helicase) # Organism: Streptococcus pyogenes M1 GAS # 36 1108 31 1147 1167 623 35.0 1e-178 MTAREQLAQFAAGSKALPRLCREYKKERATVHLKELVGGALSFYAAAAAAKTGGVHVFVA EDRDAAAYLMNDFYNLLDEKQVYFFPSSYKRSVAYGAEDAQGVVQRTAAMNAVKGFTKGY LIVCTYPEALAERVADAETLRRDTIAVKVGDSISIAVLEDALVDANFTRVDFVYEPGQYS VRGGIVDVFSFSESKPYRIDFFGDEVDSIRRFNISSQLSADKLDRVEIIPNLNAGDGAAK VSFVQFAGDAAAYWFYDADYVLRRVNDIRRKTLADMERPEEIDRLMTSRNALLGDLANCR MLALRDNLPERPATAAVEFDTAPQPKFNKNFEMLADDMIRNSLRGYTTYILSENKVQVER LENIFHQIGRGQAVVRPLSVTLHEGFVDNDLKLCLYTDHQIFDRYQRYRINGEIRRDEQM TVAELNQLRPGDYVVHIDHGVGRFDGLVKINENGKAHEAIKLVYKDGDVLFVNVHSLHRI SRYKSGDGEPPKVYKLGNGAWQKLKNATKKAVKDISRELIALYAKRKASKGFAFTPDSYL QHELEASFQWEDTPDQQTTVAAVKKDMESDQPMDRLVCGDVGFGKTEVAIRAAFKAAVDG 
KQVAVLVPTTILALQHYRSFSERLRNFPVRVEYLNRTKTAKEVSQIRADLEAGRIDILIG THKILGKQIVFRDLGLLIIDEEQKFGVAAKEKLTQLAVNVDTLTLTATPIPRTLQFSLMG SRDLSVISTPPPNRQPIVTESHVFSEEIIRDAVETELARGGQVYFVHNRVEDLMTMQGLI TRICPKARVGVGHGKMPAEKLERLIMDFIYGEFDVLVATTIVENGIDIPNANTIIVNNAQ NFGLSDLHQLRGRVGRSNQKAYCYLLSPPDELLSSDARRRLRAIEEFSDLGSGFNIAMQD LDIRGAGNLLGAEQSGFIADIGFETYQKIMNEAIAELRAEGLHVAGLNDSEQEVVEQLRY IDDAHIEIEVEAGLPDSYVAQQAERLKLYRELDSTKNEEALLAFGERLVDRFGPLPRETE ELLNVVRLRWEAIRLGMERVKVKNGLMIVHFVGEQDSPYYKSDTFMDLLRKVTQNPGRFV LKQHNNRLAMTVRNVKDIEDAYKTLQQL >gi|313159762|gb|AENZ01000002.1| GENE 40 58854 - 60404 2392 516 aa, chain - ## HITS:1 COG:CT808 KEGG:ns NR:ns ## COG: CT808 COG1530 # Protein_GI_number: 15605542 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribonucleases G and E # Organism: Chlamydia trachomatis # 1 473 1 468 512 259 31.0 1e-68 MNRELIVNVNPTEISIALCEDKVLVELNKEQCQTGFAVGDIYLGKVRKIMPGLNAAFVNI GHEKDAFIHYLDLGGQFLSLQKLVASQQPGKRGLRVESMKLEPPVEKSGKIGDYLQVGQN IMVQIAKEAISTKGPRLTADISLAGRNVVLVPFSSKVFLSQKIRSVDEKKRLKRIAAAVL PKNFGVIIRTAAMEAKDEDIEHDIQTQIDRWRKTCAAIKKNTAPAQLMSEMNRANTIIRD SLNGSFSQIAVDDEAMYNDIRGYIRQIEPEKEKIVKLYRGNVPIFDNFDISKQIKSLFAK YVSLRRGAYLIIEHTEAMNVIDVNSGNRTKAEDNQEQTAMDVNLAAAAEIARQLRLRDLG GIVIIDFIDLRKAQNRQALYDEMVKLMATDKAKHTVLPLTKFGLMQITRQRVRPVAVEEV SDVCPTCNGSGKIEPTVLLDKKIENQISFLTHDRGHKYIKLVVSPYVASFLKQGLWSLRR RWQWKYKVRLHVVADQSLGIVEVHYHDRKDNDLINK >gi|313159762|gb|AENZ01000002.1| GENE 41 60765 - 60854 83 29 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPNGKKHKRHKMATHKRKKRLRKNRHKKK >gi|313159762|gb|AENZ01000002.1| GENE 42 60890 - 61162 410 90 aa, chain - ## HITS:1 COG:no KEGG:BDI_1995 NR:ns ## KEGG: BDI_1995 # Name: not_defined # Def: DNA-binding protein HU # Organism: P.distasonis # Pathway: not_defined # 1 90 1 90 90 111 72.0 9e-24 MTKADIVSEIAKSTGVEKVQVQAIVEAFMDSIKTSLTNKNNVYLRGFGSFIVKKRAKKVA RNISKNTTITIPEHNIPAFKPAKSFAAKVK >gi|313159762|gb|AENZ01000002.1| GENE 43 61637 - 63187 2647 516 aa, chain + ## HITS:1 COG:BS_resD KEGG:ns NR:ns ## COG: BS_resD COG0745 # 
Protein_GI_number: 16079369 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Bacillus subtilis # 3 114 8 118 240 83 39.0 1e-15 MVKILWVDDEVELLKPHVLFLKQKGYEVDTCNNGYDAIDMASEGAYDLIILDEMMPGMTG LETLPKIKEVRPTTPVIMVTKSEEENIMDKAVGSKIADYLIKPVNPNQVLLSIKKNVHQQ QLVTEQTTADYRSEFGRISSSLQMAETFGDWCSLYRKLANWEVDLSESTDQSIKEVLTYQ KTEANQEFCKFVRRNYYNWINRRSEDTPVMSHTLMRTNIFPVADENPKTTLLLIDNFRYD QWRSISSLLRGYYDVAQDDFYCAILPTATQYARNAIFAGLMPLAIDKLMPNKWLNDNEEG GKNQYEEEFLKRLMAQNGKNWKFSFDKLVRPEQGRKLVDNIQKVYDADFSVIVYNFLDIL SHARTETDIIRELTEDDAAFRSLTRSWFEHSDLYTILKLLSERGHTVVITSDHGTIRVDN PVKVTGDRETSANLRYKTGRNLAYNSRDVYEVLKPEDVQLPSSNLTSSYIFAYNSDFLVY NNDANRHIRYYRNTFQHGGISMEEMIVPYIVLKPKQ >gi|313159762|gb|AENZ01000002.1| GENE 44 63188 - 63604 745 138 aa, chain + ## HITS:1 COG:SA1857 KEGG:ns NR:ns ## COG: SA1857 COG0802 # Protein_GI_number: 15927627 # Func_class: R General function prediction only # Function: Predicted ATPase or kinase # Organism: Staphylococcus aureus N315 # 4 137 2 132 153 93 35.0 9e-20 MKTIHITSQDDLPDVAEAVIEALGRRTVVAFRGEMGAGKTTLIREIAAQLGATDTVTSPT FAIVNQYKGKGGRRIHHFDFYRINDVREAYDFGYEEYFYSGDLCLVEWPEKIEQLLPDNA MTVRITVDSDTARTFEID >gi|313159762|gb|AENZ01000002.1| GENE 45 63803 - 64219 634 138 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2074 NR:ns ## KEGG: Odosp_2074 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 10 130 12 149 160 84 33.0 2e-15 MQEYISKQQQILRPAEQIYAVISRFDNLTPAVADKVEEWQATEDTCSFKAKGFTVKLRME EREAPRHVKIVADGGGVPMDFAFWIQLHKVSDTDTRMRLVLHIELNMMMKMMIGGKLQGG LDQIAEGIAKAMNAAPQY >gi|313159762|gb|AENZ01000002.1| GENE 46 64328 - 65911 1942 527 aa, chain + ## HITS:1 COG:CAC3563 KEGG:ns NR:ns ## COG: CAC3563 COG0249 # Protein_GI_number: 15896798 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Clostridium acetobutylicum # 31 512 84 566 577 234 30.0 
2e-61 MKPTAYLTTFLGLPGASGEKRAACNFDLTSRYFYNVSSEESSFQTIDAETTADLDLNAVF ERIDRTSSKPGQQYLYARLRTLRGTEEVREFGRRTDHFAGNAGHAEQCAKHLSRLSDDDA YDLQNLIFDKPEQVRRIALVYLLSAAAVVSLLLSFLYPLLLLLFLAVFAANMYIHYSNKL NISIYASAVKQLSLALRTARHLAAEEVPGTEEALKVIRQVSEVERRSRVVGTQGDSANEL AAAVWLVLELLKTAFNVEVILFHRFIGSIIARRDAIHGLFRFIGETDAAISAARLRGETP TCRPEFTDGKYLRAEEVVHPLIGDCVPNTLELDGTGLLLTGSNMSGKTTFIRTLVLNALL AETLDICFAEAYTAPYMKIYSSIRISDDIAEGTSYYLQEVLTVKRFIDASQEPAPCLFAL DELFKGTNTTERIAAGKAVLAHLNRGPHLVLVSTHDVELADLLRKDGYELHHFREEVIDG KLVFDYRLHTGPLTTRNAIRILEMYDYPRDLIAEAYEVQEKLTEKSN >gi|313159762|gb|AENZ01000002.1| GENE 47 65957 - 67090 1728 377 aa, chain - ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 16 366 12 333 337 132 35.0 1e-30 MLSTSASAYPDSKPHYAILDGLRGVASLVVVAFHLFEAHAASHADQIINHGYLAVDFFFV LSGFVIGYAYDDRWGRMTYRDFFKRRLIRLHPMVVMGMLIGAAAFYFGAGGPYEMIAGVP VGRMLLILLLGCLMIPVPPSMDIRGWSETYPLDGPAWSLFFEYIANICYALVLRRLSKLL LGALAVVAACFTVRLAVTQGDMIGGWALDGEQLGVGFTRLAYPFIAGLLLSRVGKLIRLR GAFWLCSLCVVAVLAVPHLGTDRLWLNGLYDAVCIIVLFPLVVAAGAGGKVTDRVSKKVC GFLGDISYPLYITHYPFVYIYTAWVIDTRPAWPEALGYGALVYGGSILLAWLCLRLYDEP VRGWLKRRFMQRKPVQG >gi|313159762|gb|AENZ01000002.1| GENE 48 67096 - 67722 657 208 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2077 NR:ns ## KEGG: Odosp_2077 # Name: not_defined # Def: NUDIX hydrolase # Organism: O.splanchnicus # Pathway: not_defined # 55 199 57 199 202 114 39.0 3e-24 MNHTVYFADKSVIFTSDAPGGECYAVVPADPADLSRAKVTKILESHNCVAVVSADPDAAF RSFAADFTQIEAAGGVVVNDRGEWLMMRRNGRWDLPKGHLECGERIEACAAREIAEETGV CAEPVRPLCRTWHAYYFPKTERWELKRTHWYELRAAACGGLVPQTEEGIETVVWCAPAEA MAHAAGSFPTVRTVLERLEACLSDDKKR >gi|313159762|gb|AENZ01000002.1| GENE 49 67798 - 69909 3329 703 aa, chain + ## HITS:1 COG:XF2260 KEGG:ns NR:ns ## COG: XF2260 COG1506 # Protein_GI_number: 15838851 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 
# Organism: Xylella fastidiosa 9a5c # 40 702 50 707 709 306 29.0 8e-83 MAIAATGLSACDDKRPQPLTIDNALTADEIAAGILTPEVMWKMSRAGSSSLSPDGRTLLY AQTDYNMAENRGVTTIWVEDLATKAVTRLTDTASNNADPKWSADGEKIYFLSDRSGSMQV WEMTPAGGNARQLSAFDKDVEGFGISPRGDKAWYVQRVEVCDRKSSDVYKDMDKSKARIY DDLMARHWNYWDEGSYLHIFVGDFGAEGLKPGVDIIGKDAAWDAPLAPYFDMAEIAWNNA GTMLAYTCKPLTGTEYAVSTDSDIFVYVLENGVTQNICKPVNVNTGEPIADMATMAGYDK YPVWSPDDRQIAFLSQRRAGNESDKARLFLYDCQTAQMQDLTEDFDYNAMNVVWSGSDML YFIAPIEATHQICRIAPSVGEVEVVTRGDHDINAFSMAGDRIAAEMCTISMATEFFDVNP ADGTLTQISAINKPVYDNIRMGEVQKRWVRTTDGKQMLTWVILPPDFDPAKKYPTLLYCQ GGPQSVVSQFWSYRWNFQLMAAQGYVVVAPNRRGLPSFGQEWLDQISGDYSGQNIRDYLS AIDDVAKEPWADENRMGCVGASYGGYSVYFLAGCHEKRFKAFISHCGIFNFESMYGQTEE LFFINNDYGGPYWDRSNEVAQRSYANSPHKFVGKWDTPMLIFTGEYDFRIPYTQSLEAFT AARVRGIPARLVEFENEAHQVFKPQNSLVWNREFFGWLNKYVK >gi|313159762|gb|AENZ01000002.1| GENE 50 70128 - 70292 296 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNVTTHRAGKLCYVTPLLEQIDVQPEKGFANSFPGGGVEPTGASYDDFYLDNE >gi|313159762|gb|AENZ01000002.1| GENE 51 70307 - 72661 2577 784 aa, chain + ## HITS:1 COG:no KEGG:BDI_2795 NR:ns ## KEGG: BDI_2795 # Name: not_defined # Def: putative lipoprotein # Organism: P.distasonis # Pathway: not_defined # 453 784 221 517 518 103 27.0 4e-20 MKNFHKALSILAAAALFAGCSSYDETNDPARAGNTSVKLLADVADDAETRAGISVGESTF TGYWEENDAMGILFTAPGSTPELRPFTYNNNDFAFEGELPTQSGAWQYMAFYPHATVNGT KASIPFGNLRTQSGNAFNCASDALVAPCLNFANAEPGKTDEGDPVRFTLNRLTSILNLEV KGGDNADKVRYLLLTSENETQTLSAASIDFDISDMGSGLAFSQTAPSNVIALGFEQGTAP DANDINAFFNIMPGSYDKLTVDVITATRIGTVAIERGADKPFAAGKLYKRAETPVFAPLE APSFDWPGHEIDQAHEITVDDNNQLTYSAAINIHVPGGIAGLEVDVSSPVLSMLGISKLD LFNDSDVIEGISFADLGLSCQTEIQYKKECVFDITSLVPIILLLGPDPGSEHVFDVKVTD LAGQVTEQPLTFTTPALVRVDQTDLWKNTASLTISNQFADAGSVVLEYRIKGESAWNAAS VSGPNDDGSRTALISPTWTAGTNEAGLTIHSVDPKTGIFARKSYEYRLLADGATVASGEF TPKNNLGDVIPNAGMESWSTKSMKKMFSGSANAPYPNAYMTSSGTDKLCTQATYPGMVGD YCAQLAAKYAGIAFAAGNLYTGDFVMDGTVGYAQFGQPYTYSARPAALKLKYAAEIGEIN RVKNDPPVSTGIDKGRIFVCIVEWSDRHAVQSGTTVDKTTFWDPETVSSLNEGKIIGYGP 
AYITESHTGSMKDLALPIVYYEKTDTPPTGNYTLVISTATSYLGDYLTGCDANKLWVDDF EWVY >gi|313159762|gb|AENZ01000002.1| GENE 52 72912 - 73082 130 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIVSTTKGGGQYSAPMIETIEVAAEKGFATSGDVPSSDGISPLSCYDDFYLGENE >gi|313159762|gb|AENZ01000002.1| GENE 53 73096 - 75555 1405 819 aa, chain + ## HITS:1 COG:no KEGG:BDI_2795 NR:ns ## KEGG: BDI_2795 # Name: not_defined # Def: putative lipoprotein # Organism: P.distasonis # Pathway: not_defined # 476 819 207 517 518 76 25.0 6e-12 MKKQVSLLTLTAALLLGACSTYEEVPATSGDGATAVGFLADIAPREQSRADINVDFGTGL TGTWNGEDKLGVLANDFSKLLQFTYTTDSKAFTGSLFGSAGTWAYRAFYPHNGNATVSGT TVTVPFSALRTQNGNKYNSEFDIMAADAITHNNAKPGKTPEGNAVKFNLHRITSILALKL QGGAASEKIASVMLTSKKPIASEKLTFTVPSDPNATYDPSAITPKLVVEGTSAMGSPISI NSEHITVTYADGTAPSADYSETFFNVLPDDSYGDLTFSACTDKGNAASFTITRTTPVVAN WVYTVERTASFTKAAAPTVKWIGHEDLTTPTELLESGNSANIRVSAPGGIKSMQVDITSS VLTTPMEGSEQNLLEAVKLAPSMELTNPANNDMAAALAGFGFPTPAQLLNQQHVYFQIGG LIDMLAMVCAEVTETTNSDFKFTVTDNAGQKTVITLKYVKTVAPAITYNNDADLWANTAS FTLSNIPADATSVSVQYKKSTASDTKWQTAEITGNNTKAEIKPDWGTQFTAADWTTPNSV QPFWRITEGTGVFANNTYDYKLTVDGTEYKGQFKTGNGDVIPDGNMENTNLPCFTQNGSE SSTFWGSGNNDQTPSLCTQITTEGNKYAVLQSKKAVIVLAAGNLFTGTFKYTSARLGGTG AVNFGQKYTYTARPSALQVKYRAKINAVDMNVLNGPLAKKEQDKARIFVAIINWTSQHTV SSTYSLEGSNNGCVGGWDPETTTNPGEGKIIGYGSLWLTESAETDWQNADIQIQWYDKAI KPTEGNYSLIISCAANAYGDYMNGCTSNFLHIDDIQWVY >gi|313159762|gb|AENZ01000002.1| GENE 54 76022 - 76801 866 259 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159786|gb|EFR59142.1| ## NR: gi|313159786|gb|EFR59142.1| hypothetical protein HMPREF9720_1575 [Alistipes sp. 
HGB5] # 1 259 1 259 259 414 100.0 1e-114 MASAALLAAGCSTENKAEGTEYGTLKVSCTADGSIDAASDDTSRMPAAPSVPQAGDFTLT VTGESGTQKWDTLTEFEQSDAVFRMGTYTVSVAHGDPDAEGIDKPYYAAEKRVEVLPRRA ATVEMTATIANSQTVVRATEQFLNYFHDAQFTVTTASGNQFDFTPGSTPADEPVFVKSAT TLKVTGTARRQSQTGTDEGPKVTFSEQTLEATSPRTCHVLTFDAKDAGSVTLTVTLSDDY TETRPIDCEVNEGAIDDTK >gi|313159762|gb|AENZ01000002.1| GENE 55 76823 - 78553 2151 576 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159779|gb|EFR59135.1| ## NR: gi|313159779|gb|EFR59135.1| putative lipoprotein [Alistipes sp. HGB5] # 1 576 1 576 576 1115 100.0 0 MKTFTRLFTMLAVAAFVAAGCVNEEPPYKEEPKPTPGDATGFLSVSGLSMRVVYDETDVR PDDTSDQTQSPQAVSGTRAEQPDVDGFIVEILDADNAQVFKKTYAELKQQLAEPMELPVG AYRMEVRSKESTPDVAWEHPVYGATSSFTISKAQTTQLEEVVCTLQNIKVTVDYSSELAG MLADTSKATISLGQTSQEFLKTETRAAYFKSLDIENTLVFNFDGVFAGTDIPAQFSKQIT GVKAGQWRKISVVINYADKGTLLFQVTVDNNIIQDNSFVVDGTDNLLEELLEDPSAPALA WPGHDMSKPFTLTDAMFDDNGNCIEPFVFDLASPNGIESLRVNIASTSSQFMASMAAIQL PETFDLCALDASSPAGIILKGFGYPVGGELKGQQAKSFNIAGQIKALYEFDGTHTFSFDM TDAKGVSTAAALTLVVDKSAGQEGPAIVWRGYDIDQQYEVQKDMVIDIDVTATAGIKSFW VTIDSEALKDLLPVINMPEKFDICDIPADLAAILHDEFGFPINEQVKNQTAVMFSITKFV EILLEIPGEHNFVLDVTDNNNVLTHKTVKLIVKAAE >gi|313159762|gb|AENZ01000002.1| GENE 56 78553 - 80682 2353 709 aa, chain + ## HITS:1 COG:no KEGG:Bache_3264 NR:ns ## KEGG: Bache_3264 # Name: not_defined # Def: lipoprotein # Organism: B.helcogenes # Pathway: not_defined # 426 709 281 550 551 102 29.0 5e-20 MKRHITYIAMLLTALTGASCSQDETPAGDGRTGVMAMTISTSRAEDNGEYDPLQYQKVYI YNSEGGLLRKYAAKDDIPERLELLSGTYRVAVEAGEQVPADFSKRFYKGEETFTVKPGET THAEVVCKIANTVVEVKFDASIVENLDPGYFVWIAGTDKFDEAEAESGAVPALKFTDEGT GYYTLPAGTTSLAWMFRGTHTSGKEVEMENTLTEVKAGGKYVFTFRYSPDLPGYIDALLI RVDTETEDKDDEIIFSPDPALLTEGFDNDEVQKYTSGEKKYRIMAFAELKKFTVSVGDDS YDLLTGTHEGISFTPTDKYNTELTLSDAFFAGLAGGDHAVTIHVEDANGGSNEISTTYRL QGLVPVTEKSYDLWANTVTLQVVDFTRSANVTFGLCGSDGQWKYLTGTSQGDDFISATFA PQWQESTNANGHTVYTREAGTGVVATYGYEYKVTIDGTEKSGSFTAGAAQTIPNADMSGW SMTSGFAFPNAEGGAFWSSGNNTMTKTLCTSTDVKFGKAAPAAKLTSTNMLVLAAGNLFT GTFAYKSFTGTVQFGQPYDFTVRPSALKVKYHAKVGTVDKVRTSGDICPYIKKGDPDMAR 
IFVAIVDWNAPHQVVSAMTTTKGAWDPEKGIDNVSEGKILGYGSLWITQSTEGDDMQDAE LDIVWYDHETRPSGGGYSLVISCACNAYGDYFTGCSTNVMYVDDFEWVY >gi|313159762|gb|AENZ01000002.1| GENE 57 80700 - 81521 1213 273 aa, chain + ## HITS:1 COG:no KEGG:Bache_3263 NR:ns ## KEGG: Bache_3263 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 16 273 21 283 283 195 39.0 2e-48 MRKITISTLLTLLYLAGTAQALRAQQTTEDASLPADSVGQERALTINEMRQQRGLTDTHN LFVPRGQWIFGGTASYSTHSNETYRFLVVEGIESKGYTFKVSPMIAYALRDNMAIGGRFI YSRSLLKLDKADLNLGGEDSDLNFEVNDYYSLRHSYSVAVIWRQYIPLGRNKRFALFNEM QLSGGGTQARFAKDSPVKGTYETGYTFSLGVSPGIVAFATNNMAVEVNVGVMGITYTHTK QVHNQVTVGKRDASMMNFKVNIFSIGLGMAFYL >gi|313159762|gb|AENZ01000002.1| GENE 58 81535 - 82536 1277 333 aa, chain + ## HITS:1 COG:no KEGG:Bache_3263 NR:ns ## KEGG: Bache_3263 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 89 327 54 283 283 105 32.0 3e-21 MKRIILSIFSVFITLGAAAETEQARPAVQQSETAAAAETPSVTAESAAAPAAQTAAPAAE QTSGNQLTADAFATPAPSPIAADRQKRPKEQRFLPTRKRIDREINKLKFAYKGEVMMGLT ASYGTLSSDDTDIMLILDNINADGTVATVKPFVGYFYRDNNCIGVRFGYRHMSGTLDNTY FDAGEGNDLSGQLPYIDLSSDNYSFGIFHRSYAGLDPKGRFGLFAEFELALSTGTSNFLY DPDRKEGGNPNRTYSDNTQVKLSFNPGAAVYIFPNVCATLSFGLGGIQYNSVTQKDADGN KVGSRKASNMRFRLNIAAINFGMTIHLWDKKKQ >gi|313159762|gb|AENZ01000002.1| GENE 59 82540 - 84225 1970 561 aa, chain + ## HITS:1 COG:no KEGG:Bache_3264 NR:ns ## KEGG: Bache_3264 # Name: not_defined # Def: lipoprotein # Organism: B.helcogenes # Pathway: not_defined # 13 561 14 550 551 160 26.0 1e-37 MKYFRILFSAAALLLAASCIDNDVPYPVVELRIAGVEGSGFTVSGISIANRTVTLTLDEK TDIRKVGIDKVTFDAATSNPMMTDTESFIGQIKTSRPLSGEFDLRSPLYVTLSLYQDYEW TIVAEQPIERAFTVAGQIGATVIDAQKRTATAYVAKGTNLGDITVTRLKLGPADITTYSP TAEELSASGFETMRFVDATYHGATERWTLHVEHTDLKIVFRETDLWNNTGVITAMATEEE YPDAVIQYRVKGTDEWQATQKGERDETGLFTAAVTPEWTSSTNAAGLPVKRLVRTKGVYA GQTYELRLLVNGEATETSEYTVPAGDVIPDGNMENPGLSCFTQENQNAEFWASGNNGFAK ELCTSAAFDGMGGSRCALLKASAPPIVGLAAGNLMSGIFYKDGPTTGVVEFGQPYTWTAR 
PSGMKVKYHATVGIVNQAKHSGAPIGKGDQDKARIFVAIVDWNARHRVASGTGAPTGTWD PTETTATDEGKIIACGSLFIDRSTAGDRMTEITLPLDFYDTQARPTGKYSIVISCSTSAY GDFMVGCTTNTMYVDDFEWVY >gi|313159762|gb|AENZ01000002.1| GENE 60 84489 - 85181 753 230 aa, chain - ## HITS:1 COG:HI0232 KEGG:ns NR:ns ## COG: HI0232 COG4623 # Protein_GI_number: 16272195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted soluble lytic transglycosylase fused to an ABC-type amino acid-binding protein # Organism: Haemophilus influenzae # 69 194 281 402 462 85 35.0 5e-17 MLKKTVLTFAFLTILTTFYGFNAKFSTPVTDDMLNEEADFGNSLLSDGDYVISAYDNIIR NISEKEGHDWRLMSAIAYHESRFTPDITSRSGAKGLMQIMPSVARQFDVPAGDIANPETN IWLANKLMSKIKSTLRFPAETSDKDRMSIILACYNSGIGHVNDARRLARVNGEDPNSWEV VARYLQLKAQPEYYENEVVKCGRFTGSRQTLAYVNDVIGRYDKYCRVAVR >gi|313159762|gb|AENZ01000002.1| GENE 61 85226 - 87340 2575 704 aa, chain - ## HITS:1 COG:no KEGG:Palpr_0191 NR:ns ## KEGG: Palpr_0191 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 85 704 32 624 624 211 25.0 1e-52 MLGLFLTAPGVVRSQGFDAKTLMRAQQRGQTGSLYGNNPFEQPAEEGEEGEQQPQDTTKK ERKIRKPLESYFFSDSVRSLNNFRWNVRRDYNRVEIGPLDTTLTDWRIDYPFYREEVGDI AQGALGQASLPMNYFRRPQNFDFGFASPYYAYTYDMENVPFYNTKKPIIRMTYLESGQKR FREENFNIMHAQNVSPTTGFNVDYKARGTRGQYIWSRTKNHNLAVAFNHTGKRYSVHAAY YNNHIEQQENGGVVGTWAIADTTFQMPSGVPMKLADAEAQNTYRNNAFFVTQSYALPLQR VTDSDFSLADLSAVFIGHSFEYSSWSKVYTDIKAGYTNERGERDPETGEFKPTEGIYYKD WFINPRDTRDSIYERVISNRFFVQAQPWDRNGVVGTIDAGIGIDMHTYSQFEMRDFLTGK YTKVNKTSYFAYGSVGGKIKKYVDWDANLKFYPSGYRGGDLTLGAHLALTGYLRGHPLIL EGRFTMDRRSPNYWQENLFSNHYIWSTPLGKENETRFEVKFSIPDFAFEAGAWQGVVSDK IYYDADSQITQQGGNVSLTSVYARKDFRLGGLHLDNRVLLQWSTNQEVAPVPLLSAFLSY YYEFWVVRNVLRLQIGLDGRYNTRYYAPGYNPALSAFYNQREVEVGNYPYMDAFVMGKWK RMRIFLKYQHVNKGLFGNGEYFSAANYPLNPGMFKIGISWGFYD >gi|313159762|gb|AENZ01000002.1| GENE 62 87425 - 88159 871 244 aa, chain - ## HITS:1 COG:RSc2073 KEGG:ns NR:ns ## COG: RSc2073 COG1183 # Protein_GI_number: 17546792 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine 
synthase # Organism: Ralstonia solanacearum # 5 150 44 182 291 75 37.0 6e-14 MKIKLFTIPNLLTLSSLLCGSFAAVSALVYHDLELTFWLTVAAGVFDYCDGFAARLLKCP SAIGVQLDSLSDMVSFGFVPASVLFVLWNDALAADAEPWMRYGGSVLCFVVAAFSALRLA KFNIDETQHTEFCGLPTPANALFFAALGWMNAKTGFSLGVWVLLLMPVMSWLLISPVRMF AFKFRSFSFKGNEIRYLFIALSVVLLVALGVRAVPVIILLYIIVSAVNWIVTLKSRDSAA CPEE >gi|313159762|gb|AENZ01000002.1| GENE 63 88249 - 89322 1832 357 aa, chain + ## HITS:1 COG:CAC2968 KEGG:ns NR:ns ## COG: CAC2968 COG0836 # Protein_GI_number: 15896221 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Mannose-1-phosphate guanylyltransferase # Organism: Clostridium acetobutylicum # 6 349 3 347 356 254 36.0 1e-67 MASNKYCVIMAGGIGSRFWPKSRQSMPKQFLDILGTGKSFIRHTYERFAKIVPAENFLVV TNHKYKDLVLEHIPEIGERQVLCEPIGRNTAPCIAYAAYTLMKQNPDAEMIVTPADHLIL NEEDFRQIIGECLEFADKHDALLTVGIKPTRPDTGYGYIQVSDTNVISKVKCFTEKPNLE LAQTFLQTGEFFWNSGIFIWKVQTIVEAFRKYLPEHHAMFSGVMKALGTDAEKTTVEIVF SECRAISIDYGIMEKADNVYVRCGDFGWSDVGTWGSVYQHSRKDRYANAVPEQGCYLYDT RSSIVSLPKDKVAVISGLKEYIVVDTDDVLMICPRSEEQNIKKFIDEVKFHNGDKHI >gi|313159762|gb|AENZ01000002.1| GENE 64 89360 - 90604 1825 414 aa, chain + ## HITS:1 COG:BS_murF KEGG:ns NR:ns ## COG: BS_murF COG0770 # Protein_GI_number: 16077524 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide synthase # Organism: Bacillus subtilis # 14 414 30 451 457 209 32.0 1e-53 MSELYDIFRRHPRISTDTRRIEPGSVFFALHGATFDGNRFAADALSKGAAYVVVDDPAAV SGDRTVLVGDTLEALQELAREYRRELAIPILAISGSNGKTTTKELVSRVLAERFEVYATH GNLNNHIGVPLTLLAMTRDTEFGVVEMGASACGEIALLASIAEPNYGILTNIGRAHLEGF GGPEGIRRGKGELLDYLAANGGRAFVPREDQTLTNMAAERDGLAAEYYSVTLADGIAHHL EGDYNRFNIAAAVAVGRYFDIDGERIRHAVGSYMPDNNRSQRTETERNTLIVDCYNANPS SMRASVLNFLAEPAGGRSRRVLILGDMLELGAWSEQEHRDIVALAARNPDAELMLVGGEF ARACAGMEPQPANTALFPSREELIAELRRRPVENAFVLVKGSRGIGLGKALEEL >gi|313159762|gb|AENZ01000002.1| GENE 65 90850 - 91779 1220 309 aa, chain + ## HITS:1 COG:PA1680 KEGG:ns NR:ns ## COG: PA1680 COG1073 # Protein_GI_number: 15596877 # Func_class: R General function 
prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Pseudomonas aeruginosa # 13 306 22 317 327 181 40.0 1e-45 MKRLLSALLLLAAANAAAGEKSVSLERNFGTLYGTLLTPDEGAETVAVLIAGSGPTPRNG NTNNYLYLAQELEKAGIATLRYDKRGIGSSKFDDPDKMADATLDDFIGDAAAWAEYLSRQ DFRRIVLIGHSEGALIAFCAAQQCPEVDAVISLAGAGYPLDEILQLQLAAQLAPTHMELL MQARAITAALKRGERVESCPPELTPLFAPPLQTFWISSLSYDPREEIRKVTVPVLIVNGD IDFQVTPDNADALAKAQPRSRKTIIEGMTHTLKKASGPTRTEQIAAYTDDTLPVAPELVA AVTGFIEGL Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:02:06 2011 Seq name: gi|313159667|gb|AENZ01000003.1| Alistipes sp. HGB5 contig00067, whole genome shotgun sequence Length of sequence - 115147 bp Number of predicted genes - 97, with homology - 91 Number of transcription units - 36, operones - 17 average op.length - 4.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 1006 241 ## COG0732 Restriction endonuclease S subunits 2 1 Op 2 . - CDS 1010 - 1570 154 ## gi|313159750|gb|EFR59107.1| hypothetical protein HMPREF9720_2336 - Prom 1593 - 1652 3.4 3 2 Tu 1 . - CDS 1747 - 1854 116 ## - Prom 1982 - 2041 5.4 + Prom 1662 - 1721 4.7 4 3 Tu 1 . + CDS 1886 - 2059 96 ## + Term 2184 - 2219 -1.0 5 4 Op 1 . - CDS 2795 - 3268 312 ## BF1460 hypothetical protein 6 4 Op 2 . - CDS 3261 - 4277 185 ## BF1459 putative DNA-binding protein 7 4 Op 3 5/0.000 - CDS 4293 - 5801 404 ## COG0286 Type I restriction-modification system methyltransferase subunit 8 4 Op 4 . - CDS 5804 - 6880 80 ## COG4096 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 9 5 Op 1 . + CDS 7499 - 7726 101 ## 10 5 Op 2 . + CDS 7787 - 7936 145 ## 11 5 Op 3 . + CDS 7970 - 8323 65 ## + Term 8397 - 8443 8.1 - Term 8059 - 8095 -0.5 12 6 Op 1 . - CDS 8132 - 8341 151 ## CHU_2935 hypothetical protein - Prom 8364 - 8423 3.3 13 6 Op 2 . - CDS 8439 - 9185 343 ## gi|198277097|ref|ZP_03209628.1| hypothetical protein BACPLE_03305 - Prom 9307 - 9366 6.0 14 7 Tu 1 . 
- CDS 9501 - 9596 91 ## + Prom 10223 - 10282 3.8 15 8 Op 1 . + CDS 10322 - 13273 4633 ## BT_0206 hypothetical protein 16 8 Op 2 . + CDS 13283 - 14833 2516 ## BT_0207 hypothetical protein 17 8 Op 3 . + CDS 14857 - 15801 1436 ## BT_0208 hypothetical protein 18 8 Op 4 . + CDS 15831 - 17864 2762 ## gi|313159684|gb|EFR59041.1| putative lipoprotein + Term 17882 - 17921 8.0 + Prom 17867 - 17926 1.7 19 9 Op 1 . + CDS 17959 - 20607 4036 ## COG4886 Leucine-rich repeat (LRR) protein 20 9 Op 2 . + CDS 20627 - 22270 2757 ## BT_0211 hypothetical protein 21 9 Op 3 . + CDS 22279 - 24327 2964 ## COG1404 Subtilisin-like serine proteases 22 9 Op 4 . + CDS 24365 - 25153 953 ## BT_0213 hypothetical protein 23 9 Op 5 . + CDS 25189 - 25650 722 ## BT_0214 hypothetical protein + Term 25683 - 25723 8.8 24 10 Tu 1 . + CDS 25836 - 26405 738 ## AHA_0769 acetyltransferase - Term 26310 - 26339 1.9 25 11 Tu 1 . - CDS 26430 - 27497 1560 ## COG2855 Predicted membrane protein - Prom 27563 - 27622 4.0 26 12 Tu 1 . - CDS 27813 - 28700 1074 ## COG0583 Transcriptional regulator - Term 29297 - 29342 4.0 27 13 Tu 1 . - CDS 29346 - 30377 1568 ## COG0618 Exopolyphosphatase-related proteins - Prom 30561 - 30620 5.1 + Prom 30486 - 30545 2.0 28 14 Tu 1 . + CDS 30571 - 32442 2813 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains + Term 32661 - 32707 14.1 - Term 32649 - 32695 14.1 29 15 Op 1 . - CDS 32733 - 33509 1204 ## Fluta_0605 PKD domain-containing protein 30 15 Op 2 . - CDS 33544 - 34197 966 ## COG0177 Predicted EndoIII-related endonuclease 31 15 Op 3 2/0.200 - CDS 34194 - 35135 1088 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 32 15 Op 4 . - CDS 35141 - 37678 4063 ## COG0210 Superfamily I DNA and RNA helicases - Prom 37716 - 37775 2.7 + Prom 37953 - 38012 2.8 33 16 Op 1 23/0.000 + CDS 38063 - 39061 1879 ## COG0714 MoxR-like ATPases 34 16 Op 2 . 
+ CDS 39064 - 39948 1404 ## COG1721 Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain)
35 16 Op 3 . + CDS 39945 - 40925 1701 ## BDI_0947 hypothetical protein
36 16 Op 4 . + CDS 40930 - 41718 1285 ## Cpin_0905 hypothetical protein
37 16 Op 5 5/0.000 + CDS 41731 - 42723 1532 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
38 16 Op 6 . + CDS 42737 - 43759 1501 ## COG2304 Uncharacterized protein containing a von Willebrand factor type A (vWA) domain
+ Term 43801 - 43838 8.1
+ Prom 43986 - 44045 3.8
39 17 Op 1 . + CDS 44091 - 45104 1238 ## Odosp_3517 hypothetical protein
40 17 Op 2 . + CDS 45053 - 47716 3379 ## Cthe_1890 cellulosome enzyme, dockerin type I
+ Term 47754 - 47791 10.1
- Term 47742 - 47779 10.1
41 18 Tu 1 . - CDS 47808 - 49868 2662 ## COG3525 N-acetyl-beta-hexosaminidase
- Prom 49900 - 49959 3.4
- Term 50018 - 50075 4.5
42 19 Op 1 . - CDS 50111 - 51268 850 ## BT_4710 hypothetical protein
43 19 Op 2 . - CDS 51299 - 53209 1618 ## BF1328 putative secreted endoglycosidase
44 19 Op 3 . - CDS 53237 - 54847 2194 ## BF3612 hypothetical protein
45 19 Op 4 . - CDS 54861 - 57797 4594 ## BF1326 hypothetical protein
46 19 Op 5 . - CDS 57811 - 58227 292 ## Fjoh_1213 TonB-dependent receptor
- Prom 58254 - 58313 1.9
- Term 58261 - 58291 3.0
47 20 Op 1 . - CDS 58354 - 59355 1385 ## COG3712 Fe2+-dicitrate sensor, membrane component
- Prom 59382 - 59441 3.0
48 20 Op 2 . - CDS 59443 - 60012 929 ## BF1324 RNA polymerase ECF-type sigma factor
49 21 Tu 1 . + CDS 60811 - 62334 2323 ## COG0477 Permeases of the major facilitator superfamily
+ Term 62362 - 62403 5.1
- Term 62346 - 62394 9.6
50 22 Tu 1 . - CDS 62410 - 63672 1821 ## COG0641 Arylsulfatase regulator (Fe-S oxidoreductase)
- Prom 63692 - 63751 2.9
+ Prom 63656 - 63715 3.3
51 23 Op 1 . + CDS 63937 - 65451 2106 ## COG2986 Histidine ammonia-lyase
52 23 Op 2 11/0.000 + CDS 65448 - 66188 207 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
53 23 Op 3 . + CDS 66193 - 67416 1712 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase
54 23 Op 4 . + CDS 67431 - 67679 478 ## BVU_1013 hypothetical protein
55 23 Op 5 . + CDS 67669 - 68550 1215 ## COG4261 Predicted acyltransferase
56 23 Op 6 . + CDS 68520 - 70022 1717 ## COG1032 Fe-S oxidoreductase
57 23 Op 7 . + CDS 69998 - 70438 503 ## BVU_1016 putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase
58 23 Op 8 . + CDS 70438 - 70881 561 ## COG0824 Predicted thioesterase
59 23 Op 9 27/0.000 + CDS 70866 - 72830 2124 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase
60 23 Op 10 27/0.000 + CDS 72836 - 73090 524 ## COG0236 Acyl carrier protein
61 23 Op 11 . + CDS 73087 - 74250 1427 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase
62 23 Op 12 . + CDS 74247 - 75149 827 ## BVU_1021 3-oxoacyl-[acyl-carrier-protein] synthase
63 23 Op 13 . + CDS 75140 - 75826 555 ## COG0726 Predicted xylanase/chitin deacetylase
64 23 Op 14 . + CDS 75823 - 76437 1026 ## BVU_1024 hypothetical protein
65 23 Op 15 . + CDS 76456 - 76824 445 ## Weevi_1569 beta-hydroxyacyl-(acyl-carrier-protein) dehydratase, FabA/FabZ
66 23 Op 16 . + CDS 76821 - 77975 1469 ## COG0463 Glycosyltransferases involved in cell wall biogenesis
67 23 Op 17 . + CDS 77968 - 79362 1826 ## COG1033 Predicted exporters of the RND superfamily
68 23 Op 18 . + CDS 79233 - 81809 3648 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase
69 23 Op 19 . + CDS 81811 - 83310 1429 ## COG1233 Phytoene dehydrogenase and related proteins
70 23 Op 20 . + CDS 83307 - 84965 2128 ## BVU_1032 putative choloylglycine hydrolase
71 23 Op 21 . + CDS 84962 - 86269 1740 ## COG1541 Coenzyme F390 synthetase
72 23 Op 22 . + CDS 86281 - 86853 967 ## BVU_1034 hypothetical protein
+ Term 86874 - 86913 9.1
+ Prom 87006 - 87065 3.1
73 24 Tu 1 . + CDS 87129 - 89336 3626 ## COG3345 Alpha-galactosidase
+ Term 89372 - 89410 8.2
+ Prom 89661 - 89720 2.5
74 25 Tu 1 . + CDS 89823 - 90260 758 ## BT_0961 hypothetical protein
75 26 Op 1 . + CDS 90423 - 90935 745 ## BT_0959 hypothetical protein
76 26 Op 2 . + CDS 90947 - 91909 1417 ## COG2768 Uncharacterized Fe-S center protein
77 26 Op 3 1/0.400 + CDS 91911 - 92579 437 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family
78 26 Op 4 . + CDS 92468 - 93058 683 ## COG1453 Predicted oxidoreductases of the aldo/keto reductase family
79 27 Tu 1 . + CDS 93163 - 95280 185 ## PROTEIN SUPPORTED gi|121997364|ref|YP_001002151.1| 30S ribosomal protein S1
+ Term 95286 - 95325 4.3
- TRNA 95635 - 95705 56.6 # Cys GCA 0 0
+ Prom 95798 - 95857 2.5
80 28 Op 1 . + CDS 95929 - 97032 1305 ## COG0758 Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake
81 28 Op 2 2/0.200 + CDS 97029 - 97817 923 ## COG0084 Mg-dependent DNase
82 28 Op 3 . + CDS 97837 - 98865 1597 ## COG0252 L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D
+ Prom 99062 - 99121 6.2
83 29 Op 1 . + CDS 99228 - 99986 1003 ## COG0811 Biopolymer transport proteins
84 29 Op 2 . + CDS 100018 - 100437 502 ## gi|313159725|gb|EFR59082.1| hypothetical protein HMPREF9720_2421
85 29 Op 3 . + CDS 100452 - 101057 779 ## BDI_3731 hypothetical protein
86 29 Op 4 . + CDS 101061 - 101531 681 ## Odosp_2820 biopolymer transport protein ExbD/TolR
+ Term 101551 - 101599 12.3
- Term 101545 - 101580 6.0
87 30 Tu 1 . - CDS 101643 - 102242 788 ## COG0164 Ribonuclease HII
- Prom 102295 - 102354 2.0
88 31 Tu 1 . + CDS 102750 - 103730 1150 ## gi|313159682|gb|EFR59039.1| putative membrane protein
+ Term 103772 - 103809 2.0
+ Prom 103777 - 103836 9.3
89 32 Tu 1 .
+ CDS 103889 - 104806 1273 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 104862 - 104896 -0.8 + Prom 105040 - 105099 8.9 90 33 Op 1 13/0.000 + CDS 105294 - 106181 692 ## COG0320 Lipoate synthase 91 33 Op 2 . + CDS 106178 - 106936 761 ## COG0321 Lipoate-protein ligase B 92 33 Op 3 1/0.400 + CDS 106945 - 108711 2587 ## COG0608 Single-stranded DNA-specific exonuclease + Term 108717 - 108778 23.6 93 34 Tu 1 . + CDS 108866 - 112015 4447 ## COG0642 Signal transduction histidine kinase 94 35 Tu 1 . - CDS 112030 - 112557 553 ## COG0778 Nitroreductase - Prom 112613 - 112672 1.9 + Prom 112556 - 112615 4.8 95 36 Op 1 . + CDS 112635 - 113474 1362 ## Cphamn1_2546 hypothetical protein 96 36 Op 2 . + CDS 113540 - 114121 310 ## PROTEIN SUPPORTED gi|71274727|ref|ZP_00651015.1| Ham1-like protein 97 36 Op 3 . + CDS 114203 - 114811 1031 ## COG0302 GTP cyclohydrolase I + Term 114874 - 114927 16.5 Predicted protein(s) >gi|313159667|gb|AENZ01000003.1| GENE 1 1 - 1006 241 335 aa, chain - ## HITS:1 COG:SP0508 KEGG:ns NR:ns ## COG: SP0508 COG0732 # Protein_GI_number: 15900422 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Streptococcus pneumoniae TIGR4 # 1 334 1 333 522 193 36.0 4e-49 MNGKQLKNSILQWAIQGKLVPQDPNDEPASVLLERIRAEKARLVKEKKIKKDKNESIIYR GEDNSYYEKFLATREVKCIDEEIPFEIPATWEWCRLLSIVSLLGDGIHGTPEYSEGGSVY FINGNNLFDGQILIKPDTKTVSKEEAVKHSRLLNESTVLVSINGTIGNIAFYSGENVILG KSACYFNLLNGIERKYIKIVLQTDYFLEYTKRVATGSTIKNVPLSGMRNVLIPIPPKDEQ QVIIDKLSSLKLLIEKFNIEQSQLNKLNAELRSVLKKSILQEAIQGKLLPQITEEGTAQE LLEQIRQEKQRLVKEGKLKKSALLDSIIYKGDDNK >gi|313159667|gb|AENZ01000003.1| GENE 2 1010 - 1570 154 186 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159750|gb|EFR59107.1| ## NR: gi|313159750|gb|EFR59107.1| hypothetical protein HMPREF9720_2336 [Alistipes sp. 
HGB5] # 1 186 405 590 590 367 100.0 1e-100 MQRPNLSYGETKIFDKYFNTLFKNDYLPEDAYALSFWMRKIMDAWTQENPLGLEEELLTM KAYAPYHLLFAISMVFAKCNNQTNVPSPSECLKVASENNLVDSIINIAANCLNSAISAEK NNCEQNNRSFIPQNWVKNKSCNAGIMSAIQNYFSFLPTMNKEMDSKLKNGVKIDSKYFSY RVQAED >gi|313159667|gb|AENZ01000003.1| GENE 3 1747 - 1854 116 35 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYESVMMLMFYSDFMKFHKETGLILLALTQTRKVL >gi|313159667|gb|AENZ01000003.1| GENE 4 1886 - 2059 96 57 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTSIYNIQSIKFKLTIATFFNFVTNGSDSIMKKEEVTTFIRIVYGFLDLFIYGIVNA >gi|313159667|gb|AENZ01000003.1| GENE 5 2795 - 3268 312 157 aa, chain - ## HITS:1 COG:no KEGG:BF1460 NR:ns ## KEGG: BF1460 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 148 1 148 148 276 95.0 1e-73 MSKEFIYIDYDEALNIYGKMIDASDGGFEGVRDEGGIRATLDFVQNDLYYPTFADKLTYL MYRFCSGHFFNDGNKRIALTVGTYFLHKNNYYWHACICMRTLESIIYHVAASNIDQDLLL RIVNSFLTSKDYDEELKIDIANAMSKGELGIQGEDYE >gi|313159667|gb|AENZ01000003.1| GENE 6 3261 - 4277 185 338 aa, chain - ## HITS:1 COG:no KEGG:BF1459 NR:ns ## KEGG: BF1459 # Name: not_defined # Def: putative DNA-binding protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 338 1 338 338 570 89.0 1e-161 MKDLTISNIERQNVLNNRFAVSKVQEHLDIEGMLFEGEYRFTKKMVADFYQVDTSTIDRY LQSNSDELKHNGYILCKGKQLKEFKLEFAHLINEASKTTQLGLFNFRSFLNIGMLLTESE KAKKVRSLILDFVITTINEKTGGGTKYINRRDVNYLPAAITEENYRKNLTSAINQYVDGH PTYKYSQITDFIYKAVFKENAKEYREVLKLDSKDNVRHTLYSEVLLVISSFENGVGAAIS ERFKENGGKLLTIDEVERIVNELAEHPMQKPYLNDARTKMASRDFSFRDAYHGNIADYLQ AVTPEEFERFIGDQSIDFDRILADNKDVLKRLKQAEDE >gi|313159667|gb|AENZ01000003.1| GENE 7 4293 - 5801 404 502 aa, chain - ## HITS:1 COG:SP0509 KEGG:ns NR:ns ## COG: SP0509 COG0286 # Protein_GI_number: 15900423 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Streptococcus pneumoniae TIGR4 # 1 494 1 484 487 573 55.0 1e-163 MAVNNIIKRIQNIMRQDAGINGDAQRIEQMTWMFFLKVYDTQEETWEYKDENYKSIIPED 
LRWRKWAVDEKDGEALTGEALLSFVNEKLFPTLKNLPIDANTPRAKSIVQETFADLNQYM KNGTLLRQVVNIVNEIEFDDADDRHTFGDIYEGILKDLQSAGNAGEFYTPRALTDFIVMM LDPKLGETFGDFTSGTGGFLTSALNYVSKSVSSAEDGEKLQNAVVGQEWKPLPYLLSITN LLLHDIEAPNIANCDSLGTNITDFKESDKVDVIGMNPPYGGSTEDSVKSNFPMQYRSSET ADLFIALIMYRLKAGGRCGVIIPDGFLFGTDGAKLALKENLLRKFNLHTIIRLPGSIFSP YTSIATNILFFNNEEAEGCKEGFKTKETWFYRLDMPEGYKHFSKTKPMKVEHTLPIQEWW KDRKEIISDEVGEKSRVFTAQQLIDLDCNFDQCKFPKEEEEILPPAELLKQYFEKRAALD HEIDKTLSEIQKILGIDIKSCN >gi|313159667|gb|AENZ01000003.1| GENE 8 5804 - 6880 80 358 aa, chain - ## HITS:1 COG:alr3618 KEGG:ns NR:ns ## COG: alr3618 COG4096 # Protein_GI_number: 17231110 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Nostoc sp. PCC 7120 # 7 358 422 776 776 316 46.0 4e-86 MLHDIGRMTKTIVFCSDIEEAAEMRTLLINMNSDLCKKSPYYVTRIVGEDKEGKKQLDNF ISVDEPYPVIVTTSELLSTGVDCKTCGLIVIDKEIGSMTEFKQIIGRGTRLRKDKGKWHL EILDFRNATAKFKDPAFDGDPEPPKGGGKKPKPYKIIDAPETPIVSEPREKYLISGKNIR IAHEIVSVLGEDGKTMRTESVQSFAKKQLLRHYQTLDDFIQTWTEAERKQAIMDELKEYA ILIDAVREANPALKDADIFDVICHVAFDQPPLTRKERANNVKKRNYFGKYEGKAREVLEA LLDKYAENGILDFEKANILEIPPFNSIGKPTKIIKLFGGKVAFEQAIRELEYQIYKSA >gi|313159667|gb|AENZ01000003.1| GENE 9 7499 - 7726 101 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTMDNLESSKSLTSTGSHYHKYTLLPSCDGLYCAINCYTLIISWFMAMGIIVERSFYYI EFIRCIVLRCQITVV >gi|313159667|gb|AENZ01000003.1| GENE 10 7787 - 7936 145 49 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFQECITITGVGKFQVKDLGIIHSLLNATSNCMFIIFSFNYRNWLTFMV >gi|313159667|gb|AENZ01000003.1| GENE 11 7970 - 8323 65 117 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLVANNKHSTICEIILFTHIYTYRPSFLLECRSNEQSAYLLFFEFFLVYFHDNNSFVACA TNNQIFYINIKGIGYIIKRFEVWLNRITTPFTYGTIGFPNLFCKPFSGFLLFCENYL >gi|313159667|gb|AENZ01000003.1| GENE 12 8132 - 8341 151 69 aa, chain - ## HITS:1 COG:no KEGG:CHU_2935 NR:ns ## KEGG: CHU_2935 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 4 65 7 68 70 74 64.0 1e-12 
MANFNRLKIVLAEQQKTGKWLAEQIGKSNCTVSKWCSNSVQPDLKTLNDIANALNIDVKD LIVSSASHK >gi|313159667|gb|AENZ01000003.1| GENE 13 8439 - 9185 343 248 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|198277097|ref|ZP_03209628.1| ## NR: gi|198277097|ref|ZP_03209628.1| hypothetical protein BACPLE_03305 [Bacteroides plebeius DSM 17135] hypothetical protein BACPLE_03305 [Bacteroides plebeius DSM 17135] # 1 248 1 247 247 367 83.0 1e-100 MKQKQHILHSLTIEVMAVLLASMIAFQVCNMLGIRMSLLPFVMATGYIILKLLYHLCIIV ARYIIEAIPSSHFALANEKTDTSSSVVLPPSAKDCVEVQKKRMELFHYEYQREHQQYQQR KEEEENKKLNAILRYTRETFKRFDLNETEIFQICESVRYFVTNHQVFSMTEVHIKKHSSL TQISLKNFAWNIAFQYNIGRDMTTSFVMATFSEWFANSTFDTVRKNLRTTTGRHKIEIDE NILSKYLI >gi|313159667|gb|AENZ01000003.1| GENE 14 9501 - 9596 91 31 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKWRNLAAEHPAVVERLRGELRKLGFLPQR >gi|313159667|gb|AENZ01000003.1| GENE 15 10322 - 13273 4633 983 aa, chain + ## HITS:1 COG:no KEGG:BT_0206 NR:ns ## KEGG: BT_0206 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 983 1 980 980 1162 60.0 0 MQKTKLLLNILALFLCILPAVSMAQGGGNVTVSGIVTSADDKQPLIGVNVISGATSGVST LADGTYRITVSAGTTLTFQYIGYKPFEFTVPAGRSSVTCDVALQGDSQTLDDVVVIAYGV RKKGTIAGSVSTVKAEKIEDTPAAAFDQALQGQVPGLTVLSDSGEPSEAAAMAIRGTNSI NSGTAPLYILDGVAISSRDFNTINPSDIESLSVLKDASSTSIYGARAANGVIVITTKRGR MADRAKINYRMQLGFSQIAYGNWDLMNTSERIQYEKELGLTDGKNYNVLAKTDVNWLDEV FSNAALLQNYELSVSGASEKTNYYVSGGYYSQDGTAVGSSFERYSIRANVEQRAAKWLKL GTNTMMNYQTIEKADDGEYTLTTPISAARFMLPYWNPHRAGGSLASVNDGSWKGEGQNPL EWLKNNPVDYKKYKIISTVFAEATPIEGLTLRSQFGVDYSHSTGRGISYPSYAPNLGSGT VQRNSSDGMSLSVTNTITYRFDIDNKHMFTFLAGQEGLDYRYEAFSLQTKDQNNDKLVNI SSSPRATRWSDTTDDNYAYLSFFGRAEYSYSDRYYADFSLRTDGSSRFGASNRWAGFWSV GFMWNLRNEKFMENSRRWLTLAQIAVSTGTSGNSEIPNYEHLALVTGGQNYFGNAGIAPT QPGNEELSWEKLWTTNVALHLGFWNRLNLDVELYNKKTTDMLMEVPESYADKGYGYHWSN VGGMVNRGVELSLSGAVVASKDFLWSVNANVSYNKNKITELYNGVTEYEMSSTNTKLVVG HPVGEFYINRFAGVNPANGDALWYTKQGELTTELRDEDKVLTGKSFFAPWQGGFGTTLTW KGLSLSAQFAWVADRWMINNDRYFDESNGRFVSFNQSKRLLNRWKKPGDITDIPRHGVYT 
EFDDRLLEDASFLRLKNLMLSYTFPQEWLRKTRFIGGARIYAQAQNLFTFTKFTGLDPEG NSNMYQAQYPMARQFTFGLDLTF >gi|313159667|gb|AENZ01000003.1| GENE 16 13283 - 14833 2516 516 aa, chain + ## HITS:1 COG:no KEGG:BT_0207 NR:ns ## KEGG: BT_0207 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 516 1 511 511 463 47.0 1e-128 MNMLRNIKFRLSALAVLLLATSCLDKYPESAIPEKDAMKTFADAEQTLTGIYATFKSSSL FSGRLTLLPDIQADLVYAVEGNSNQFGNFWRWDVRPTDLDLEGVYADLYTVISRCNFFLE RIDEVMRNEISDDNLEDLEEYTGEVYAIRALCYSELLKNYCKAYDPATAQSELGVVIRTK YSTPEPIRRASLYDSYKFVLDDLAEAEKRLDKEEDAYSNEYITSAAAQALHARVALYMQD WDTAIEYASILIDDKKNTFKLADAKTKYNADYTFFDYMWAYDLSYEIIWRIGFTETSYGS PLGTVFLNFTKDLTYYYPDYVPATAALNLYDAADLRYSGYFAGKEQGLTIGYTNGLDWPM LVKYYGNRNFTSKLIFHVSMPKPFRLAEQYLIRAEAYCRKNDFTKAGNDLTALRKMRYAS GGTLNVSKNNWLQTISDERVRELYMEGFRLHDLKRWGKEYADLNGGYSIRRTPQSCSQPE GSSLKITPDNPLFVWPIPQHELESPGSEILPNESNR >gi|313159667|gb|AENZ01000003.1| GENE 17 14857 - 15801 1436 314 aa, chain + ## HITS:1 COG:no KEGG:BT_0208 NR:ns ## KEGG: BT_0208 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 300 19 297 299 253 44.0 7e-66 MKQIITVLFATGLLAALFTGCKEEYKTYSDREFVMFADTAATYMVLQDKEYFSVPVTSTV ACDYDRTFGVEIIDKGSNAIEGLHYALRSNTITIKAGQRAANVEVRGLYDNIEPTDSLGF ILKLVMPEQVNWNLYHDRTKVVMMKSCPYDVNDFTGWCVVTSLFLNQYPGIENTSLQRLI RTEKHPTEENMIILHDWLFSGYDVTIRLDPADPAEPLVTMDKNQILADEGSVFGQILGDN KILVTNSPLYDSYFNSCQKYVTLWIKVHVENMGENVGLVGHFYNIIEWVSDEEADRLQRE EGMKPGGVITPHAR >gi|313159667|gb|AENZ01000003.1| GENE 18 15831 - 17864 2762 677 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159684|gb|EFR59041.1| ## NR: gi|313159684|gb|EFR59041.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 677 1 677 677 1327 100.0 0 MKRHYLHLVMSLLAGACLAAASCSDSDKEIKAPSPNFPEAVSPTIGAGETFTIEIDPNQD WEASVPTATAAWFWIEDGTQKVWTLRGGAGPARIVIAAAELEEFDDNRVCEVTLTMGGQS KVIATITRGTLDRGFTLRTCQLDDYGDFIYNPNSTETGLTYLYGTEPAENVALTWPEGRT GFSMPVLVEANFEWVLGKLPEWLEVPAKTIGEPGAQLELRLQGDASRYPLDGAEETLVFT DKNNSGVSYEIPISIPACRDIFSVSGMSAETKFNTKAEYFNSMNGDWVPGSAMGSVQSID GAKFFLFAEVTQQWGAPYLSAEAEDLAWILLTEEPWDSTPGSDVVQSRQFTVGVTENGGK ARKAYLLALPAAVAETISDPYQLIDAEIRDEYRQYLVTTINQEANPGSISANNPAGMTEI GAAFEKLPADDWYIGEFGVRDGYKLIYTKEWSNDPESTLTVDREYTGVAFFDYDLQPMSG EESWLTVRKTAEGIIVDMDPAKDKCGGSSMTNEGIAHIGFAVFTDADGRFALIQCIYDEN YPIGGGGEGFEVKFAMPDLVSGATLVEITKENYKTVAGGNTDLLASFAENLGMGVPQYML TYTSAEPSNAVLKVSPYQQIMLMPMSGAEWLDYEPLSDTQIQVSMAKPEAEQAPYGMLQF LDGSWNMKCIVYCMPAF >gi|313159667|gb|AENZ01000003.1| GENE 19 17959 - 20607 4036 882 aa, chain + ## HITS:1 COG:alr0124_1 KEGG:ns NR:ns ## COG: alr0124_1 COG4886 # Protein_GI_number: 17227620 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Nostoc sp. PCC 7120 # 445 836 42 378 461 79 23.0 2e-14 MKLLKIFVAAILALAAALTGGCSDDDGIDNRDLDYGYVQFKLYKEASYDAVSRAQKPQLD YLYEAAKVQVSLGCNGTTVTQTLTLSASDKESAEFGLRSEKLRLLTGQYQIITFSLYNAN DELIYNGMPLDDRLVVEAGGLQVHDLTVDVEPRGKVRFTIRKDLSDFTETPVIRAANRQY TFDEIAFISLTVKQGETNEQTTFKMLPTKFSIHFDEDDDTFGYQTSSLECDTLLSLKAGS YKMMRYETYDKNKLLLETNKRPKNLDFVIEDNKTTETKVGVTLYEADEYIRDYYALYEIW KALDGPNWYFSGENYPEGTNWDFNKDPDLWGIQPGVEVHSNGRVAKITLSDFGFRGHLPA AIGQLTQLIELYLGSHKDANLLEYDPSIAPDKSLSERKRTRMERNKEFLRLIHTPTQTSE PIARALKEHNLTIPGISLYEENYTEDEIIDRKTGLQKRIRPYDVVHGKLTNGLKSIDPAI GKLENLEYLFVANGELETLPDELANLKSLTDVEVYNCPKMTQFPMVLAKLPEMISLNISN NSQWSAEEVYKGLNAIANGPAKEKLQILYVRDNNLREVPESFRNLKKIGLLDLAQNQIGT IHPFGKEVAPAQLYLDNNKLTSLPADGNGLFCTMDDVETFSATYNKFTKFPNIFSAKSKF TIKSVDFSYNEIDGFEGEEDGTFKGLNIEQLSLAANKFTKFPKCLGTTGSLVAYIILRGN MIDEIPEGSFSGKYSTSMTSLDLTYNKLSKLPKDFTAEQLPYLYGLDVSYNAFDKFPWSP LNCAGLTVYAIRGQRDAQGNRCLREWPTGLYQHTGLRGFYIGSNDLRKIDDTISHLIYHL DISDNPNITFDASAICYYWQQGAYNLIYDKTQNILNCDKMLE >gi|313159667|gb|AENZ01000003.1| GENE 20 20627 - 22270 2757 
547 aa, chain + ## HITS:1 COG:no KEGG:BT_0211 NR:ns ## KEGG: BT_0211 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 540 11 639 660 269 32.0 3e-70 MKITNYIPVILCSIALCGCSDAAKTVGFGTDSDAIEAPATGGTHTVRVSAEKEWVATTDE PWITVSPANGRGTTECRVLIDSALTDQPRRGVIRIMEQNTWVKKEIAVSQKGFDYLIGID DKEVTVANYAAYGMRHFDVKIKTNVDFYVKVPESAENWLKFEKPAVEFDRGIRPREVTVR FNWNINSQPNPRIADVTFTSKKEVELARHDNLVVTQQAAEPIEENTRGGDSIALLAIART LETIASWENGERMDNWDNVILWEEGMEGYTPEKKGRVKYARFFMFNTKEELPFEVQYLTA ADELSFYSNVNAFLKDLTTGEYITKLTQLRRLTISAYGLVSLDKNFTALKNLEYLDLSSN NFQKVPEEINPTNFPKLRTLNMGANTRKNIYDLSNTIETNYGGLVEEQGFPRRLIEWDLD TLLLSVNYLQGPLPKMDDWEKYTEQDIIDADTLPRALIGTPKVMPHTKRFAINLNRLTGE LPDWLLYHPALDWWVPSVLVFTQEGKDATGTLAGFTNEPANMNYYYEFYEGYKKDPGTEE EEEETTK >gi|313159667|gb|AENZ01000003.1| GENE 21 22279 - 24327 2964 682 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 195 538 33 315 416 125 33.0 2e-28 MKNYLYGILATALLAFTACTKEELSAPEPAPGPDVPAEYQSGEVLVKFAPYVSDILDRAG ITRSGGPATRSGILSVDEILDIVGGYEIERVFPVDPRNEERTRESELHLWYVVRFGEEFT AAEVAEKLSALGEVQHVSFNRTIHRAYNAGKKAMPLSRKTLEAIQHQRATRTGEASAYPF DDLLLPQQWHLVNRGNMFGGKSIVDADVQCEQAWELSTGDESIVVAVLDEGVMIEHPDLK NNMWVNENEIYRSHEDNDGNGYQGDVYGYNFVKNTGVISWDDVNDTGHGTHVAGVIAAQN DNGIGISSIAGGRGSIPGAKIMSCQIFSGSTASNALATVKAIKYAADNGAVILQCSWGYV SGLANSYEWGEPGFKTQEEWEKSMPLEKEALEYFIHNAGSPNGPIEGGLAIFAGGNENAP MAGFPGAADYCISVSATAADYTPAVYTNYGPGVTIAAPGGDQDYYYEYFDDDHKRGEIGT VLSTLPYNVSESGYGYMEGTSMACPHVSGVAALGLSYAAKLRRHFTADEFKALLYETATP IDDYMSGMKFYYRYVADVGLNQPMQLNKSNYRGQMGVGQANAAKLLNAVAGNGTQVSFPN LYINLGGEVTAIPANYFLGGETMTYTVSISDTTVATASIEGRKLTVKGLKSGTTKASITS SGNETHTFNITVRKVANGNGWL >gi|313159667|gb|AENZ01000003.1| GENE 22 24365 - 25153 953 262 aa, chain + ## HITS:1 COG:no KEGG:BT_0213 NR:ns ## KEGG: BT_0213 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 257 27 266 271 151 36.0 2e-35 MLLCCGALWAALAAAPLRAQIRIVPRAKLDSLAHPATAAGGAEAMRFERTLIDAGHIGED DVPPVYTFRWRNTGGKPLVVTRVQTTCGCAAASWDKKPVAPGGEGAVTVTYRPKGHPGVF DRRIFVYTQLSDKEPTAILSLRGAVTASVRTDDDYPHAMGALRLKQRTVRFGRTDKVQSE RIECLNAGDKPLTPLCDEGFAATGLSFACEPATLAPGAKGDLVVRFDPAKAPQPLRRTPL PVGGPDVPPSQRMLWIEFGETE >gi|313159667|gb|AENZ01000003.1| GENE 23 25189 - 25650 722 153 aa, chain + ## HITS:1 COG:no KEGG:BT_0214 NR:ns ## KEGG: BT_0214 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 146 1 149 158 147 48.0 1e-34 MKRFLRFAALLSALALLGACDDDMYVTPLEVTNNNIAGTWQLAEWNGAPLAEGSFVYIEF IRKDAKYVMYQNLDSFGTRKLTGRFVIDNDAELGAVIRGSYDYGAGDWAHRYIVTDFTET SMVWTAKDDRSDVQVFVRCDGIPDEVTGGKTEE >gi|313159667|gb|AENZ01000003.1| GENE 24 25836 - 26405 738 189 aa, chain + ## HITS:1 COG:no KEGG:AHA_0769 NR:ns ## KEGG: AHA_0769 # Name: not_defined # Def: acetyltransferase # Organism: A.hydrophila # Pathway: not_defined # 14 177 11 175 178 141 43.0 2e-32 
MTQELQFIPFESRSDKGWAEAWDLYEQSFPRCERWNAQGYDRAFGDPRFEADGIWRGEEF IGILFHWNAGAYRYVEHLAVSPALRGRNMGSQALTAFCRKVGRVILEIDPPVDDISIRRR HFYERLGFVANPYEYIHPSFRKPFTPHQLILMSYPGPLTYDEARGFADFVRETVLGYSEH EAPTLPKLP >gi|313159667|gb|AENZ01000003.1| GENE 25 26430 - 27497 1560 355 aa, chain - ## HITS:1 COG:PM1461 KEGG:ns NR:ns ## COG: PM1461 COG2855 # Protein_GI_number: 15603326 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pasteurella multocida # 7 347 2 328 336 211 40.0 2e-54 MLEKGNRANTLHGILLIALFSFAAFYIAGFPIVKRLSFSPLIVGIVLGMLYANSLRNKLP ETWVPGIKFCTKQLLRWGIVLYGFRLTLTEVAAVGIPAVAVDLVIVTVTIFGGVLLGRLL KIDRDTALMTSTGSAICGAAAVLGAEPVVKCEGYKTAIAVSTVVIFGTLSMFLYPLMFRM GLLGGLTDTGVAVYTGSTLHEVAHVAGAGNAMDPTDALGIAGTATITKMIRVMMLAPVLV VMGFVLGRRKSSGACREKSKIAVPWFAFGFIGIICLNSLLQYLFGVESVREIPLNGAIEY ADTFLLTMAMTALGTETSIDKFRQAGAKPFVLAGLLYVWLVAGGYFVTKYLVGIL >gi|313159667|gb|AENZ01000003.1| GENE 26 27813 - 28700 1074 295 aa, chain - ## HITS:1 COG:aq_638 KEGG:ns NR:ns ## COG: aq_638 COG0583 # Protein_GI_number: 15606065 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Aquifex aeolicus # 1 290 2 291 303 132 28.0 8e-31 MDDFRLKVFITAARTLSFTRTAERLFISQPAVSKHIGELESQYKVQLFMRRGSRLELTGA GETLLACAERLADDYRCMQYEMSLCADAPVGELRLGASTTIAQYLLPPILARFTTRFPGV RVAMMSGNSDQVEQALGGHGIDLGMVESLSRRQGLHYTVFAPDELVLVARTGGPYSRTES ITADELREIPLVLREGGSGTLEVIKAALGRAGIRIPQLNVVMRLGSTEGIKAFVRNSDAM AIVSVISVVDELRSGVLRIIDLEDLPLRRDFAFVHVEAEPSALVRQFIEFARADR >gi|313159667|gb|AENZ01000003.1| GENE 27 29346 - 30377 1568 343 aa, chain - ## HITS:1 COG:aq_1630 KEGG:ns NR:ns ## COG: aq_1630 COG0618 # Protein_GI_number: 15606737 # Func_class: R General function prediction only # Function: Exopolyphosphatase-related proteins # Organism: Aquifex aeolicus # 3 335 1 316 325 118 26.0 1e-26 MELSKERLERLRELLAPAHQQIVIVSHTNPDGDAVGSSLAWAEALRSMGHEVTCVVPNKY PYFLDWMQGIEEVVVFKNDTEGRAARAIADADILFCLDFNAVSRLEILSETIGANTTARR VLIDHHLSPDEGFDLSFSHPDSSSTCFLVYSIVEALFGAQAVTRRMAEALYVGIMTDTGN 
FAYSFLTPELFRAVAVLVGTGISIPDIHNNVYNAYTEGRARLFGYAINRKMEIIQDGTVA YMSLLENEMRRFQFQQGDSEGFVNYALTVKKMKMSAMFLAHRKFIRVSLRSRGDVDVNLF ARKYFSGGGHKNAAGGKSFVSMQETIAHYIKSVKEFADEGHLG >gi|313159667|gb|AENZ01000003.1| GENE 28 30571 - 32442 2813 623 aa, chain + ## HITS:1 COG:lin1989 KEGG:ns NR:ns ## COG: lin1989 COG0488 # Protein_GI_number: 16801055 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Listeria innocua # 5 622 4 627 630 477 43.0 1e-134 MACCLQVENLTKSFGDLVLFENISFAIEDGRRVALVAKNGTGKTTLLNIIAGREDHDGGS VVPRRDLRIAYLEQSPVYPAEMTVLEACFLSENPALKAIAEYERAIEEPSGEGLQEAMAR MDALEAWDYEQRAKRILSRLKIRDFGQRVGTLSGGQLKRVALANVLISESDLLILDEPTN HLDTDMTEWLEEYLAASGAGLLMVTHDRYFLDKVCSDILEIEDRQLYHYAGNYSYYLEKK QERETAQAARRESETNLYRRELEWMRRQPQARATKARSRIDAFHELEARLQSTRSRGEVR LDVKASRIGTKIFEAKEVGKRFGELVLLDKFNYNFTRYEKLGIVGDNGCGKSTFLKLLTG IERPDSGTIEVGETVRFGYYSQQGLEFDEGKRVIDVVTEIAEQIDLGDGRRMSASQFLQH FLFTPETQYSFVARLSGGERRRLYLCTVLMRNPNFLILDEPTNDLDIVTLGILEEYLRAF KGCVIVVSHDRYFVDKVADHLLVFCGGGEIKDFTGTYSEYVAWKREYEAARHAQEAQARP KPQAVRTQAAETAPRKLSYNEKRELEMLEREIPVLEAEKAALEESLSSGSLGVDELTAQS QRIAELIGLIDEKTMRWLLLSER >gi|313159667|gb|AENZ01000003.1| GENE 29 32733 - 33509 1204 258 aa, chain - ## HITS:1 COG:no KEGG:Fluta_0605 NR:ns ## KEGG: Fluta_0605 # Name: not_defined # Def: PKD domain-containing protein # Organism: F.taffensis # Pathway: not_defined # 3 251 2 247 411 139 32.0 2e-31 MTKNLFRILLFLLLPLAGQAKDADSLRVLWVGNSYTYYNDMPAIVQQIAATQKVKVSCTR FLKGGERFSGHLTNQKLLKALAAGGWDYVVLQEQSSAPAMPTRQVAREVYPQARTLDSLV HAGSPDARVIFYMTWGHKNGNRFPIPEYPPINRYETMQERLITSYLEMAYDNDAWCAPVG MAWRRVRAERPDCVLYAQDCFHPALPGSYLAANVIFTTIYGKPYQTDCTMGLPPEQAEYL QRVAQQTVLENRTLLNLE >gi|313159667|gb|AENZ01000003.1| GENE 30 33544 - 34197 966 217 aa, chain - ## HITS:1 COG:CAC0689 KEGG:ns NR:ns ## COG: CAC0689 COG0177 # Protein_GI_number: 15893977 # Func_class: L Replication, recombination and repair # Function: Predicted EndoIII-related endonuclease # Organism: Clostridium 
acetobutylicum # 4 213 3 211 211 179 42.0 3e-45 MTTKQRYDGVIAWFSEHMPVAESELHYTDPYQLLVAVILSAQCTDKRVNMTTPALFEAFP TPFDMAAATAEDIYPYIKSISYPNNKARNLAGMARMLCSEFGGEVPSDLQQMQRLPGVGR KTANVLGAVLWQKEVMPVDTHVFRVSNRIGLTTNSKTPLQTELTLEKNIPPHLLPVAHHW LILHGRYVCTARAPKCGECGIAVWCRKYAADHKAPKG >gi|313159667|gb|AENZ01000003.1| GENE 31 34194 - 35135 1088 313 aa, chain - ## HITS:1 COG:alr0557_1 KEGG:ns NR:ns ## COG: alr0557_1 COG0463 # Protein_GI_number: 17228053 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 8 218 3 211 270 120 31.0 3e-27 METSPPLISVCMTTYNHAPYLRQAIESVLSQQTSFGVELVLGEDCSTDGTAELCREYAAK YPGRVRLVTGGRNVGWRANYRRTFDACRGKYVAYCDGDDWWTDLCKLQMQADLMESDPGC GMCYTGADEYWQAEGTLKSDPDRHYTDFEQMLLGISVPNCTALARRELIAAYYAEIRPEE HPEWLTDDWPMWLWFACRSRIRFIDRVTAVHRRMVGSVSHGDSYRGRLASRDSAMETSLW FDERFTASRNRFRILRRRHVVGLWLLSWKGAGVGEYLARWWADLRRTPRLVFCPEGAIFL VKKILFRRKKQHL >gi|313159667|gb|AENZ01000003.1| GENE 32 35141 - 37678 4063 845 aa, chain - ## HITS:1 COG:SA1721 KEGG:ns NR:ns ## COG: SA1721 COG0210 # Protein_GI_number: 15927479 # Func_class: L Replication, recombination and repair # Function: Superfamily I DNA and RNA helicases # Organism: Staphylococcus aureus N315 # 8 712 4 681 730 487 40.0 1e-137 MESKESPILQGLNPAQRAAVVNYDAPSLIIAGAGSGKTRVLTSRIAYMIEQGVAPFNILA LTFTNKAAEQMRERIAQMIPDNRSRYIRMGTFHSVFSRILRENADRIGFPDSFTIYEPSD CKNLLKTIVRELNLPDEKYKPNLLASRISYAKNCLVTPGAYLANSVYAAEDRQAQIPEFG NIYNIYCQRCKRNGAMDFDDLLLQTNILLRDAPDVLARYQEQFKYILVDEYQDTNYAQYV IIRRLSQLHSKVCVVGDDAQSIYSFRGAKIENILSFQKDFPDAMVFKLEQNYRSTRTIVD AANSVIVRNSKRMEKHCFSEGDKGEPIRILKAYTDREEAEMVVSDLRDKIRQTGDEWSEA VILYRTNSQSAVLEDNLRRRGVPYRIYKGSSFYDHKEVKDMLAYIRLVINPRDDEAFKRI VNYPARGIGDTTVQRIAELAAGRGVSMWEAVDTLVAEPAADPVQKTIARKVSDFVAMIRG LSLARNEKSLYDFGLEIATRSGIIAAYRTENTPEATSALDNIEELLNSMQLFKEQRDAEI RSGERQEDEEATIDEWLQNVMLMTDMDKDDPEDRNKVTLMTVHSAKGLEYKYVYIVGLEE NLFPSQRAAESPDGIEEERRLFYVALTRAKVEATISYAEMRFKWGNMEFSRPSCFLREID 
PKYVRADFDAGEERPRRPQDGAGPSAIDELRRRFDYRFQQKRQADRAGGNNGGQQGGFGG RGSSFGGRSADDQSGPARRFARAAAPQGARPAGRPDPALVQTPRPSTDGMRRVGVRQAVD GGVPVGPSMPVSGDYTVGQRVEHPKFGVGIVQRIETLATDHKLVVAFDSVGEKTLLAKFA KLTKL >gi|313159667|gb|AENZ01000003.1| GENE 33 38063 - 39061 1879 332 aa, chain + ## HITS:1 COG:Rv1479 KEGG:ns NR:ns ## COG: Rv1479 COG0714 # Protein_GI_number: 15608617 # Func_class: R General function prediction only # Function: MoxR-like ATPases # Organism: Mycobacterium tuberculosis H37Rv # 27 332 49 352 377 321 51.0 1e-87 MSEVINIKELNERIERESVFVDTLRTEMGKVIIGQSHLVDTLLIGLLSNGHILLEGVPGL AKTLAITTLAKAVDAAFSRIQFTPDLLPADLIGTLIYSQKNEEFVVKKGPVFANFVLADE INRSPAKVQSALLEAMQERQVTIGDNTYSLPQPFLVLATQNPLEQEGTYPLPEAQVDRFM LKAKISYPKKQEERDIVRMNLAGGGLPAVNKVISPEDIVKARKVVEDVYMDEKIEKYIID IIFATREPAEYNLQKLQNLIAYGGSPRASISLAKAARAYAFIRRRGYVIPEDVRAVCHDV LRHRIGLTYEAEAENITSEEIITDILNNVIVP >gi|313159667|gb|AENZ01000003.1| GENE 34 39064 - 39948 1404 294 aa, chain + ## HITS:1 COG:BB0175 KEGG:ns NR:ns ## COG: BB0175 COG1721 # Protein_GI_number: 15594520 # Func_class: R General function prediction only # Function: Uncharacterized conserved protein (some members contain a von Willebrand factor type A (vWA) domain) # Organism: Borrelia burgdorferi # 11 256 14 252 291 130 31.0 3e-30 MQQTENDILKRVRKIEIKTRGLSNEIFAGKYHTAFRGRGMSFSEVREYRAGDDVRDIDWN VTARSRKPHIKIYEEERELTMMLLVDVSASRMFGSTDRLKKNIITEIAAVLAFSAAQNND KVGCIFFSDKVEKFIPPKKGRSHILMIIRELVGFRPESTGTKLSEPVRFLTNVNKKRCTT FILSDFMDSSKDKAALDDALKIAGGKHDLVGIRVYDPRETELPDVGIVELKDAESGRKVW VDTSSRAVREHYARTWERRSAEIDSTLKHNRIDSTMISTDGDYVAELLKLFKQR >gi|313159667|gb|AENZ01000003.1| GENE 35 39945 - 40925 1701 326 aa, chain + ## HITS:1 COG:no KEGG:BDI_0947 NR:ns ## KEGG: BDI_0947 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 324 11 321 369 128 33.0 3e-28 MKRLFAIALLLAGAAVRAQNAPTVTARIEPDSIMIGDRFDLTVEVDKDLVQVVQFPEFEN KEGSPLELVKDHPIDTLERDGRHLRLRKRYTLAAFEEGKWNLGLPAVLYADKNIVDTLRA RDSLYLEVATFQIDSTSQSIYDLKPQRNLPFRFAEVSGYAKWGVLALLLALVALYALKRI 
LARYGKGLGDLFKPAPPQPPHVVAIKALEALHNQKLWQNNRHKQYYSGLTDILRTYIAAR WGVGAMEMTSDEIIEAMRSEELPDKARMDLTAILRDADLVKFAKATPDGEQNEADYLKAF YFVEETKLVVEEEEPEQPDPMQNQNA >gi|313159667|gb|AENZ01000003.1| GENE 36 40930 - 41718 1285 262 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0905 NR:ns ## KEGG: Cpin_0905 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 27 262 32 253 253 87 30.0 6e-16 MYRLLYIVFLFAALGASAQDMPERSEVRRGNRQYNKGNYEKSIERYERALEAAPESFEAR YNLGNALYKAERFDKAEQTMRQAAADTLRTDDERAQAFYNLGNAQFKQQKYKEALESYKQ SLRLNPSDQEAKYNYAYTKRLIDDDENGGGGGGDDKNQDKDQNKEQQGGQDRQNGDQQKD DQQKDDKGQGDDKEQQGDPQQNPAQPDKEQEGDQQGEPQPVPAGISPQEQEQMLDAIQAQ EDRTQDKLKEKQGVVVRGSKNW >gi|313159667|gb|AENZ01000003.1| GENE 37 41731 - 42723 1532 330 aa, chain + ## HITS:1 COG:PA3073 KEGG:ns NR:ns ## COG: PA3073 COG2304 # Protein_GI_number: 15598269 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Pseudomonas aeruginosa # 73 317 75 317 340 159 39.0 6e-39 MHFASPYYLWLLSALVPMIAYYVWRTLQGGASIQISSVEGVVRAPKTVRYWLRHLPFAQR LAALALLIVALARPQDVERLSRTNTEGIDIMLAIDVSGSMLARDFRPDRITAAKEVAGSF IADRYGDRIGLVAFAGEAFTQSPLTTDQGTLQTLLARIRSGLIEDGTAIGNGLATAINRL RESEAKSKVIILLTDGVNNRGEIAPQTAAEIAKAQGIRVYTIGVGTEGMAPYPAVDIYGT PTGGTVMAKVEIDEKTLRSIAEQTGGQYFRATDKAKLKAIYDQINQLEKSKVEVTEHVTY HEQFLLWALAGLGLLVLEFLFSNLVLKRIP >gi|313159667|gb|AENZ01000003.1| GENE 38 42737 - 43759 1501 340 aa, chain + ## HITS:1 COG:VCA0172 KEGG:ns NR:ns ## COG: VCA0172 COG2304 # Protein_GI_number: 15600942 # Func_class: R General function prediction only # Function: Uncharacterized protein containing a von Willebrand factor type A (vWA) domain # Organism: Vibrio cholerae # 68 328 62 318 318 98 31.0 2e-20 MFRFANPQYFWLLPVIPALILLFWLAARNRRRRLERFGRMQVLEELMPEVSTGRVTLKFI LFCTAVTLLILAAARPQFGSKLREEKTQGVEMMLAVDVSNSMLAEDFEPNRLERTKYAIN KLFDGLHQDRVGLIVFAGEPKVQLPITSDYRMAKAFAKRIDPSLVPVQGTAIGKALSQAL MSFSGETEENHSRVVILITDGENHEDDALAAARHAAEMGIRIYTIGIGTPEGAPIQIGGE 
FIKDEKGDMVVSKLNEEMLAQIADITGGAYVRSSKQSIGLDEIVKSINEMEQSELSVMRF EEFNEQYQYLLLAAIVLLLAEFLLLDRRNPLLAHLNIFRE >gi|313159667|gb|AENZ01000003.1| GENE 39 44091 - 45104 1238 337 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3517 NR:ns ## KEGG: Odosp_3517 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 2 184 12 207 250 114 35.0 4e-24 MKKILLMIVPLMALIGCSYDDTDIQKRLDDLDGQLTELQALVKALNSDVTTLKELVAGKR FISDVQPGEDGGYVITLVTAAGETSTITISDGKDGTSSVIGVKQDSDGVYYWTLNGDFIL DNGKKLPVSGVTPEFKIENSHWWVSYDNGATWTDCGQAQTDQNLFKSVATSEDGKLVYLT LADGTVLTFELYVQFGIAFDTASATIRVGETAEIPFTLTGADAKTDIQALADGAWKAEVK RTGNEGGTIAVTAPGDSSTGKVIVLVSDGDAKTLMRTLTFVSGVLNVSTSSKEATAAGGP VTVEVETDLDYTVAIPEAAKAWITLAETRGGRFGPKP >gi|313159667|gb|AENZ01000003.1| GENE 40 45053 - 47716 3379 887 aa, chain + ## HITS:1 COG:no KEGG:Cthe_1890 NR:ns ## KEGG: Cthe_1890 # Name: not_defined # Def: cellulosome enzyme, dockerin type I # Organism: C.thermocellum # Pathway: not_defined # 86 560 140 606 710 186 26.0 6e-45 MDYAGRDEGGEIRTETLTFNVAANTETQSRSANIELVAGETVIETILLYQEAYYDPAQMV LKVEAKEYSSSAYNNKVYLPLYGAVDVTINWGDGQSEPVAATISTAAAMVNHTYAESGIY YVTISGTTEKLSGRLTNKTVAPAILEVRQWGKLGMTTLDYAFANTSLVTVPQPEEDAFAA VTTVENMFSGCTKLETVPEGLLAGAPEITSVASMFYNCTALKSVPEKLFEKTTKATSASG LFSGCKALESVPAGLLADMPALTNLGSIFNNCASLKSVPETFFTNQTEANSLTSGFFGCA ALETLPAELFKNMTKLTNVASLFKGCANLKSVPAGLLDTFTEVTNMTSLFSGCKMLEDLP DDIFKNMGKTKSGGYLYEGCISMTQFPSLKNCVSLEAVPAIWKDCTQLVEAPADYFPESV KKGISAAYIFSGCTALKTVPQGLFKDFESVTTISQMFLNCTSLESLPADIFDGMKKINSA GAAFSGCTAFTGESPYTMVQDGEQQVKVHLYERTNYPDIFGAKIFTSASSYKDTFKGCTQ MADYADIPIPWGGISDGTKAKPTLTLTAAPAEGKEYYQLTGIVKSTEMKSGKVLCTTKAL LPELIEQMETLEKVMNRYGNPISSAAVTQANSEAGATFNFNVQADTEYIFLASGTNAHGT TIEQTEVKIPAVPTGEADYERYIGTWTVTSTSSEINKQPQTYTVEITPYRTNESFRVKGW GITTLGDDYPFLLKYNEDGNVTIPTFDPQGMYGLTAYVYLKYHFYDPAQTPPYPIYTTDQ ELMKGSYDAAGGSVRFEGQKFPHNNTEYTVCGIDYAIYSGGQYVIWPDLFKAGYTLQDYA IGPYTMTKNSASVTKSEVKAETGIAPLAGTEMPKAATPTGVQRLIRK >gi|313159667|gb|AENZ01000003.1| GENE 41 47808 - 49868 2662 686 aa, chain - ## HITS:1 COG:CC0447 
KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 26 547 28 542 757 366 41.0 1e-101 MKPRLLLFTAACMLFAACGNSRGDLSTGIIPAPQQVAWGDGAFRMPSTLLFATNLEGEDK ADLAAWMRRSGDGFFPVSFAEAAGDDIPVLYLLLAEGGAPESYRLDVARGKITVTAPDAA GLFYGLQSLGQLAERCGRRIPAVVIEDAPRFGYRGFMLDVSRHFRDKEFVKRQLDLLARY KFNRFHWHLTDGAGWRIEIKKYPVLTDIAAWRPYPDWEGWNFGGKRYCRRDDPAADGGYY TQDEIREVVEYARALHIEVIPEIEMPGHSEEVLAVFPELSCSGKPYLNSDFCIGNEQTFE FLENVLSEVIGLFPSEYIHIGGDEASKQGWRTCPKCAARMRKEGLKDVDELQSYMIRRIE TFLNAKGRRLLGWDEILEGGLAPDATVMSWRGTEGGIAAAAAGHHAVMTPSNYCYLDFCQ DDPTTEPVAAAAFLTLEQAYSYDPAPDSLGAGVVPMILGVQGNLWCEHVPTAEHAEHMMW PRLLAVAEVGWSAPERKDYDDFYARALDAVAWLQERGYHPFDQKNAVGPRPESLEPLRCL STGKTVIYHTPYSPKYAAAGDGSLTDGLRGGWNYGDRRWLGWLDTDVDLVVDLGERQPVR RIAADFMQGFYADIWMPRAVEFSVSDDNEHFTPLATVGNDIPVEFKQDCYREFGWTGQTE ARYVRLKARHNGRPGGWIFTDEIIVE >gi|313159667|gb|AENZ01000003.1| GENE 42 50111 - 51268 850 385 aa, chain - ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 369 13 364 402 158 28.0 3e-37 MKLKYLIAPFVLAAAALCGCQETAEIHPVAYMTDAQKTVSKSVTIDTPPAGTEISVSSSV PVNKDTKIKISLRPDLLEEYNAKYQKNYVVPPADSYTLSGDEVVISAGYSRSSEVDFTVT SLDDFQEGTTYCMPVSIERIENSGLAVLEPSRTLFIVLKTPVISKAIYLGSSNIYKVPSF QEDANLAALKAITLEARVYMDGFQNYDPYISSIMGIEGECGVRFGDVKVPKNVLQICHGD YQPAATSKPFDTGKWYHVAAVWSGKSWDIFIDGVYVTGVATQGETINLSSDNSGGFYLGA SYGGGRPLNGYVAECRVWTRALTEAEISNNMNYVNPESDGLLAYWRMNAWEAKDGGGNIV KDLTGHGYDAVGGSSNPRMMDTKWM >gi|313159667|gb|AENZ01000003.1| GENE 43 51299 - 53209 1618 636 aa, chain - ## HITS:1 COG:no KEGG:BF1328 NR:ns ## KEGG: BF1328 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.fragilis # Pathway: not_defined # 1 330 1 348 350 229 40.0 3e-58 MKNILKHAVVCLVLAAAVASCTKETTPERINIQHPDQQSPILRDNAYYQNLRAYKQTKHK LAFGWYGSWTAVGASYQSRLISAPDSMDIISIWSQWHTLTPQQMADKEFVQKTLGTKVTY 
TIFSDKLPEPFLEIGNGEYTDEAIEAYAKAYCKDSMDKYQYDGIDIDYEPGYGASGPFVG HDNALFTKLINAMSKYVGPKSGTGRLLIIDGVPYAVDRSVVDCFDYGIVQAYASSGYTDL QNRFNNADAKGWKPEQYIFAENFESYWKTGGVNFTDREGNRMPSLYGMATFNPTQGAGAG FGAYHMEYEYGNSAMPYQFMRNAIQMANPAGGWKTPIDVAFSSNQSSNFSFVVEDDGSVT GTMQDKVSLSFSRPVVSGMQLTLGVDNSLVAVYNDENGTEYETVDPSLVKMEPIQCAENQ VFSPDATITLDPKSIEKGYYLIPVVISPISDAGYAVKEGSVHYIFVTKVAMDVEIGATTL DGSKIAPTSAWTITCCQGTATSGATGVWNCDSAAQKAAMFDGKLDGNCWYASSASYSWGN GGNFTIDMGEVNDVTGLRWHIYYQDSEPQCTDLTYSEDGNVWYSLSAGVPFTPVLSDNWK WFKFKRTVKARYIRVYVGRVTGHTSMNEAEIYGPAN >gi|313159667|gb|AENZ01000003.1| GENE 44 53237 - 54847 2194 536 aa, chain - ## HITS:1 COG:no KEGG:BF3612 NR:ns ## KEGG: BF3612 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 536 1 541 542 355 40.0 2e-96 MKRNNIKQGFLSLAAACLAFTACTGDFERFNTDPNAAQDVDLKMFITTMQMDAVYSCSGS DTDPNNRYQAAYNLIGDVYAGYMSGTNNWNGGTNTQIYALNGTDWCNVPFSESFTNVMPA WLQLKYAHANGLLSDDIFAVANILKVMSLHRVSDIYGPLPVLHFGETENPYNSQEECYMH FFETLDSAIAVLKDVVENNPDAKPLEEVDALYHGDYSKWLKLANSVKLRLAMRICYVKPD LAKQYAMEAVRDGVMETADDSALFQSWGSITIINPLEKIWNGYSDTRMGASMDSYLNGYN DPRLPVYFQPVTEGDNKGEYRGVANGMPNPQQEDYKAMSAPNVQEKSTDSPLRWMMASEV AFLCAEGAMRGWTAEMGGTAEEFYNKGIELSFLENGLSADAAARYAADETSKPTDFTDVS AANTKYNISALGKIAIKWNAGASEEEQLERIITQKWIAMFPNGTEAWSEFRRTGYPRLFP VRYNYPSSGVDTNKQIRRMVFPQSEYSNNGAAVQEAVKLLGGPDNGGTKLWWDKKN >gi|313159667|gb|AENZ01000003.1| GENE 45 54861 - 57797 4594 978 aa, chain - ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 978 129 1101 1101 844 49.0 0 MPLIGATVLIQGTTTGVAVDLDGNFMLPQAKVGDTLEISLIGYAKQTLPVTGSAPLQVVM LEDNEVLDAVVVTALGIKRAEKSLTYNVQEVAGDIVTSVKDANFMNSLSGKVAGLQINAS ASGAGGSTRVVMRGVKSINGENNALYVIDGIPLPSLRSSQTSGTYETPDGGDFEGIANFN PDDIESMSILSGATASALYGAQGANGVILITTKKGKEGRVRVNYSNSTTFSHPFVMPKFQ NTYGTSEIEPMMSWGSKLATPTSYDPADFFQTGFNETNSVGLSAGTERNQTYASASAVNS RGIVPNNVYNRYNFSIRNTTMLIKDRLTLDVGATYMKQYTRNQTIQGQYRNPLVGIYLFP 
RGNDIRKYEIYERPGSDSRYMEQFWDLEFLKGVENPWWITNRELNENRQHRYSFNATMKL DITDWMSLTGRIRTDNATINYTRKLSASTNTLFASKYGSYLNRTMTHNNLYGDVLLSIDK SFFDNAFSLQFNLGASIMDDKNQLTGYEGYLAGIPNKFTYNNIISDNSQSFPTQENYHDQ IQAVYATMQLGWRGMLYVDVSVRNDWASMLAFTPKQNIFYPSVGLSAVISSMADLSKAGI SFLKVRGSYAEVGNAPERYITGPNYTLTNGVVSTATIAPAKHLEAERTKSFEAGLDVKFL GNKISATATYYNTNTYNQLFLYDAPPSSGYKQKYINAGKVNNWGIEASAGYKNTWRDFSW ATTLNFSMNRNKIKELVPEGTRDPDTGSLVDIKEVNKDYGGYRVKLVEGGSIGDFYVKGL LTDDKGHIYVDPNSNTVLLDPDPEAYIFAGNTEARFRWGWNNQFSYKGVSLGVLIDARIG GRGVSATQALMDRFGVSQDSADARENGGVWISDDQQVPDAKTFYANSGDGMAMLSHYVYS MTNVRLRELTLGYDLPSKWFRNKVNMSVSFVGRNLFMFYNKAPFDPEVTASTSTYYQGVD FFMQPSVRTLGFSVKLQF >gi|313159667|gb|AENZ01000003.1| GENE 46 57811 - 58227 292 138 aa, chain - ## HITS:1 COG:no KEGG:Fjoh_1213 NR:ns ## KEGG: Fjoh_1213 # Name: not_defined # Def: TonB-dependent receptor # Organism: F.johnsoniae # Pathway: not_defined # 39 120 37 118 1090 68 35.0 1e-10 MKKTYLAAGRGVHAVSACFRCLFLIPIFVLGCFGAHAAAQTQQSVSRVSLDVRDAAIVNV FQTVQQQTGCSFVYNTSDIDTDRKVTVSAQDEPLTALLDKLFAGSDIAYTLRDKHIVLSK KTKNSPPPQRSGRSQRNR >gi|313159667|gb|AENZ01000003.1| GENE 47 58354 - 59355 1385 333 aa, chain - ## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 120 297 71 243 280 92 37.0 8e-19 MTNLQISEEVLAAYLRGELNAAEAAAVEAWYDASAANRKLLGEVYYILYVNDRINDTAGI DVERSLRQFKRRMHAGRRISLRRIAVRAAAAAAVAVILLAGGVTTVSLSKRLAQPLTVIT HLGERSQVVLPDGTKVWLNSASSVEYVAPFFSRERRVKMDGEAYFEVQHDAQAPFVVSTN GLDIKVLGTRFNIRNDDSDHRITTVLLEGAVKAYASGDEKAAVRLRPSQQLVFDTRTGAM RLTDEPSADRSINWIDGRFCFEHDTFGEIVAELKRYYNVDIRFMDEALRSERFSGDFRVE DGIYHIMSVLQLTYKFTYKVVGNDIEIYPNPKR >gi|313159667|gb|AENZ01000003.1| GENE 48 59443 - 60012 929 189 aa, chain - ## HITS:1 COG:no KEGG:BF1324 NR:ns ## KEGG: BF1324 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.fragilis # Pathway: not_defined # 1 189 10 
198 199 209 53.0 5e-53 MQDNRTVVGLLRAGDEKCFDALFRRYYKPLCTYATRFVTPARAEELVQDTLMWVWENRTS LLPEMPLKSLLFTIVKNKAINHMSRDTIKNRVIRQLAEYYEEEFDDPDFYLEGELVERLT AALRKMPPEFQQTFRMHRLEGRTHKEIAAALGVSPQTVNYRIGQTVRLLREELKEYWPLV VLLLWPDLH >gi|313159667|gb|AENZ01000003.1| GENE 49 60811 - 62334 2323 507 aa, chain + ## HITS:1 COG:BS_yhcA KEGG:ns NR:ns ## COG: BS_yhcA COG0477 # Protein_GI_number: 16077966 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Bacillus subtilis # 18 452 15 448 532 233 29.0 9e-61 MANLREHIRQSSGYKWWILGMIMLGTFMAVLDVTVVNVGLPAIMSAFGIGISSAEWVITA YMITMTIMLPSAGWFADRFGNKRVYMLGLVLFTLGSWLCGKASSDPFLIGARALQGVGSG IIQALGLAIVTREFKPAERGLALGLWSMAAAASISFGPLLGGYLVDAYSWHKIFDVNVPV GVLAVLLAAFVQKEWKNQTRSPFDWQGFAAIVLFMPLAIYALARGNSPSNPGGWSAPEVI GCFIVAGAALAWFIVAELRNPAPLLQIRLLAERNFGISMAVLTLFAIGMLGGTYLLPLYM QRGLGYTALAAGSMFLPVGVIQGVLSAVSGYLTRYVRPLLLSAAGILIMALSFWLASRFT LHTTHGHILFILYLRGFGMGLTFAPLNLFSLKNLTQQDMAAAAGISNSIKQLAGSIGIAL LTVIFSARTSYHAAHETVSAAQTYVEGVTDALRVVAVITLAALLPLLGVFRKKRPGKSGV KGAATPSPGEAVSGKTDKTTIYRTKPE >gi|313159667|gb|AENZ01000003.1| GENE 50 62410 - 63672 1821 420 aa, chain - ## HITS:1 COG:STM1287 KEGG:ns NR:ns ## COG: STM1287 COG0641 # Protein_GI_number: 16764638 # Func_class: R General function prediction only # Function: Arylsulfatase regulator (Fe-S oxidoreductase) # Organism: Salmonella typhimurium LT2 # 19 409 3 385 398 381 45.0 1e-105 MKSKDIFTFRDAEKQAGPVAFSTMLKPAGSACNLDCHYCYYLDKAVQYGGRQAVMSDELL ELYVKQYICANEVDTVQFCWHGGEPLLLGVDFYRRAMEFQRKYADGKRIENTLQTNGTLV DEAWCDLFASNNFLVGVSLDGPEDIHDAFRLTKGGKPTFARVMETVRMFERSGVEFNTLS VVNRRCEGRGAEIYRFFRDTVHSKYMQFLPAVEHVVDKPGFHRPLIVSPDREGARVAEWS VTAEGYGRFLCDVFDQWVVGDVGRYYVQMFDASLAQWCGVQPGVCSMGETCGDALVVEHN GDVYSCDHFVYPEYKLGNIAQTPLDEIYRTAKRREFGLNKRNTLPTECLRCKFYFACRGE CPKHRFDRGADGSPKNSLCEGLKIYFRHVEPYMEYMRDLLSKQQAPAWVMPFARKRMGLE >gi|313159667|gb|AENZ01000003.1| 
GENE 51 63937 - 65451 2106 504 aa, chain + ## HITS:1 COG:BS_hutH KEGG:ns NR:ns ## COG: BS_hutH COG2986 # Protein_GI_number: 16080986 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Bacillus subtilis # 13 504 18 493 508 293 36.0 4e-79 MHYSDLDSINGRIFNRTEVSLGEAELREVEACYRFLEDFEQGKVIYGINTGFGPMAQYRI GDADLNSLQYNIIRSHSCGAGETLPDICVRAAMLARLQTFLNAKSGVHPDVVRILADFLN NEIYPLVPRHGSVGASGDLVQLAHIALALIGESEVSFRGETVSAAEAMKACGITPLRLRI RDGLALTNGTAVMTGIGMVNLAQARRLLDWAVKASVMLNEVVESYDDFMAEPLNACKLHA GQIEIARRMREICASSRRLRKRETELYNGDTGEAPTFKHKVQPYYSLRCTPQILGPVLET LEQAEKILVEEFNSVDDNPVVDPKTGTIYHGGNFHGDYVSLEMDKLKIAVTKMTMLAERQ LNYLFHDRINETLPPFVNLGKLGLNYGLQAAQFTATSTTAESQTLSNPMYVHSIPNNNDN QDIVSMGTNAAMLTRQVIENGYQVLAIEMLALVQAADYLECAAELSAQTRQLYDDIRRIV PKFIDDTPKYREIAAIENFLKQQA >gi|313159667|gb|AENZ01000003.1| GENE 52 65448 - 66188 207 246 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 6 242 4 238 242 84 30 2e-15 MTEPTKYALVTGGSRGIGREICLKLARSGYYVLINYRSNADEAQRTLEGVRAAGGDGETL RFDVAAGEETAAAIAAWQERHKGAVIEVVVNNAGIRSDTLMMWMEPQQWHSVVDTGLGGF YNVTRPLLKDMLVKRFGRIVNIVSLSGIKGLPGQTNYSAAKGGVIAATKALAQEVAKKGV TVNAIAPGFVRTDMTADLDEAELKKQIPAGRFCEAEEVADLAMFLISDKASYITGEVISI NGGLYT >gi|313159667|gb|AENZ01000003.1| GENE 53 66193 - 67416 1712 407 aa, chain + ## HITS:1 COG:fabB KEGG:ns NR:ns ## COG: fabB COG0304 # Protein_GI_number: 16130258 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Escherichia coli K12 # 4 407 3 405 406 268 40.0 2e-71 MENRVVITGLGIWSSIGTSLGEVTESLREGRSGIGLDPERRELGYISALTGIVKRPELKG ALDRRKRLCLPQQGEYAYMATAEAMRNAGMDEAFLAANEVGIIYGNDSSAAPVIEGIDII RAKRNTAMVGSGNIFQSMNSTVTMNLSVIFNLRGVNLTLSAACASGSHAIGMGYLMIRQG LQERIVCGGAQEVNLYSVGSFDGLGAFSKRESDPAAASRPFDKDRDGLVPSGGAATVILE SYDSAVRRGAAILGEVAGYGFSSNGEHISVPNVDGPRRSLLRCLADAGMQPGEIPYVNAH 
ATSTPLGDYNEALAIDEVFGANRPYVASTKSMTGHEMWMAGASEIVYSMLMMEHGFIAPN INFAQGDEATSKLNIPARKVDLAFDRFLSNSFGFGGTNSTLIIKKCK >gi|313159667|gb|AENZ01000003.1| GENE 54 67431 - 67679 478 82 aa, chain + ## HITS:1 COG:no KEGG:BVU_1013 NR:ns ## KEGG: BVU_1013 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 77 1 77 83 90 67.0 2e-17 MTQQELIEKINTVLADEFEVEQEVITPDAPLLETLDLDSLDLVDVVVLVDKNFGVTLTGP DFKELKTFQDFYDLIISRTNGK >gi|313159667|gb|AENZ01000003.1| GENE 55 67669 - 68550 1215 293 aa, chain + ## HITS:1 COG:RSp0366_2 KEGG:ns NR:ns ## COG: RSp0366_2 COG4261 # Protein_GI_number: 17548587 # Func_class: R General function prediction only # Function: Predicted acyltransferase # Organism: Ralstonia solanacearum # 8 293 3 302 310 104 26.0 2e-22 MANNDKRWSGRSRGGSFGYGFFIFLMNVFGIRAAYAFLSLVVVYFIPFAPRATRAVWFYN RRILRYGVFRSAVKLYAHYYALGQTIIDRVAIGSGMADRYKFEFENYDAFLRLLDGGRGA VIIGAHVGCWGTGAGFFGDYARKMHLVMYDAEYRRIKNVMDKHCKQEGYKVIPVNEGGIE SILRIKEVLDSREYVCFQGDRFVEGGATAPVTFMGRKALFPAGPFVVAEKFRAPAVFYYA MRERGRRYRFIFDIPETPDGKTPNAVLESYVRSLEAVVRRYPQQWFNFYRFWS >gi|313159667|gb|AENZ01000003.1| GENE 56 68520 - 70022 1717 500 aa, chain + ## HITS:1 COG:MK0895 KEGG:ns NR:ns ## COG: MK0895 COG1032 # Protein_GI_number: 20094331 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Methanopyrus kandleri AV19 # 30 440 23 389 457 103 26.0 7e-22 MVQFLSVLVVSACSILLISANRHTSPYPVYPLGISYLKTYLERTISGIRVDTADCNLLTD EELAERIRTLAPRYIGVSLRNVDGANSLDRRGFLPEYKALIDVIRAASDAPLIIGGAGFS IYPQAFMRELGADYGIHGEGEGPLAELIGALERGETGADIPAVYTRDGRTGNGPMGNGPA ENRLSENERTENTPTENERTNSPTGNERTGNGRRSYLPAIEVEFEPELTGYYWKRSGMLN IQTKRGCPYECIYCSYPHIDGRCVRTMDPEIIAENILRAKRDYGINYLFFTDSVFNIRPE YNVRLAETLIRRGTNVAWGAYFSPRGIDAEQMRLFRASGLTHIEFGTESFCDRTLEAYGK RFTFGDVVRASRLALDNGVYYAHFLILGGYGDTREHVRETIENSRRLEYTVMFPYAGMRI YPHTRLAELAAREGVIGPDDDLLAPAYYIAPDFDLEEARSAAAATGKAWVFPDDPQSALV DTLRLKRNKKGPLWEYLRKP >gi|313159667|gb|AENZ01000003.1| GENE 57 69998 - 70438 503 146 aa, chain + ## HITS:1 COG:no 
KEGG:BVU_1016 NR:ns ## KEGG: BVU_1016 # Name: not_defined # Def: putative 3-hydroxymyristoyl/3-hydroxydecanoyl-(acyl carrier protein) dehydratase # Organism: B.vulgatus # Pathway: not_defined # 11 146 10 144 145 130 49.0 2e-29 MGVPEKALIAADEIVEYIPQRPPIVMADAFYGIGEDGCARSGLTVCEDNIFVEEGALSEC GLIEHIAQSAALRAGYMDRSRGEKVRLGYIGAVNDLKVHALPPVGSRLVTTIAVEQAVMN VTLLSARTECGVKPVAECRMKIYMEE >gi|313159667|gb|AENZ01000003.1| GENE 58 70438 - 70881 561 147 aa, chain + ## HITS:1 COG:CAC0271 KEGG:ns NR:ns ## COG: CAC0271 COG0824 # Protein_GI_number: 15893563 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Clostridium acetobutylicum # 15 142 6 136 138 77 34.0 1e-14 MVRRKTAEASLVGKTSLRVRFSEVDSMQIVWHGEYVRYFEDGREAFGREFAGLGYMDIHA SGYTAPIVELQLQYKKPLRVNDTAVVETRYIATEAAKICFEYTIRSGTDGEIVAEGSSTQ VFLDARGELQLLAPEFYRKWKERWDVK >gi|313159667|gb|AENZ01000003.1| GENE 59 70866 - 72830 2124 654 aa, chain + ## HITS:1 COG:RSc0427 KEGG:ns NR:ns ## COG: RSc0427 COG0304 # Protein_GI_number: 17545146 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Ralstonia solanacearum # 118 371 141 395 405 137 37.0 6e-32 MGREVAVWWGPDSILSALGFGTRENMEAVRAGRTNLSAWHDGTPVCQIDPERFAQLTAER AVAEYTPAERLALLTLGEVIARSGVSPANERTLVLLSTTKGNIGLLNGDPAKCDLNDTAE VVGRHFGAANRPLVISNACISGVSAIVVASRLIRSGEYDHVFVAGFDLLCDFIVSGFNAF KSVSPALCRPYDAARDGLTLGEAGGAVLLTTDRELSATGITVAGGGISNDANHISAPSRT GDGLAFAIGAALREASLGAAAIGMVNPHGTATLYNDEMESRALHLAGLCGTPCNSLKPYF GHTLGASGVIESIVTVHGLAEGTVFGVKGYAECGVPYPLNISAEHRTTRTDAALKTASGF GGCNAAAVFRRGTGRNAYGTDETGSTDENAAAERSDVFGGRYCDGENPARQDGTNGAETD RDDAQNKRRQETAGGQPGFTGADESNAAANVGQKECAAANENNVITNAGQAGGDETEEIR TARDTAHVAITRHPEMPFGVFIRERYRALADPNMKFSKMDDLCKLAYVASCELLAGRRPD CPAERIGVVLANRSASLDSDMRHQAVIDADDGGGASPAVFVYTLPNIMLGQIAIKHGLKG ESTFFAFPDKSCNFIRKYAEGLIAQGRMDAVVWGWCELCGGEYDCELTLTEKLE >gi|313159667|gb|AENZ01000003.1| GENE 60 72836 - 73090 524 84 aa, chain + ## HITS:1 
COG:ECs4328 KEGG:ns NR:ns ## COG: ECs4328 COG0236 # Protein_GI_number: 15833582 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Escherichia coli O157:H7 # 1 83 1 84 85 76 54.0 1e-14 MENLELQLKQQIIEALNLEEITAEEIATDAPLFGEGLGLDSIDALEITLLLEKHYGIRLA NPAEAKPIFHSVATLADYIRKNRK >gi|313159667|gb|AENZ01000003.1| GENE 61 73087 - 74250 1427 387 aa, chain + ## HITS:1 COG:RSc0427 KEGG:ns NR:ns ## COG: RSc0427 COG0304 # Protein_GI_number: 17545146 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Ralstonia solanacearum # 12 387 13 395 405 187 37.0 4e-47 MNIAVRGIGIISALGNGAGETLAALRAGRSGIGKPTLFRSAVDVPVGEVRRDNRALGELL GIPARETPSRTALLGMIAAAQAVADAAIPAAARVALVSGTSVGGMDLTENFYRDFRSDNG RGRLRSVAGHDCADSTRRIAEYCGITGYTATVSTACSSAANAVITGALLLENDMADYVVA GGTDALCRFTLNGFNSLSILDRERCRPFDATRAGLNLGEGAGYVVLAREEPGMRSYCRLA GYANANDAHHQTASSETGEGAYRAMAEALARSGLERVDYINAHGTATPNNDLTEGTALRR LFGELVPPFSSTKGFTGHALAAAGGIEAVLSALAIGHGLRYGNPGFAEPIPELGLRPVAE TEAAAVNSVLSNSFGFGGNCSSLIFAK >gi|313159667|gb|AENZ01000003.1| GENE 62 74247 - 75149 827 300 aa, chain + ## HITS:1 COG:no KEGG:BVU_1021 NR:ns ## KEGG: BVU_1021 # Name: not_defined # Def: 3-oxoacyl-[acyl-carrier-protein] synthase # Organism: B.vulgatus # Pathway: not_defined # 21 192 1 170 285 170 51.0 8e-41 MKAYVNCITSGAELRSDIKVLIPEMNLRRRMSHVVKSGVAAGIESLLEFGARAPIDAIIT ATGLGCIADSEKFLDGLIAGDETMLNPTPFIQSTFNTVGAQIALLRGLHCYNTTYAHRWT SFENALTDAALRIGAGWSQAVLVGAFDETTPSVEKVLQRLRVAQEGRWGESSVFFVLTAE RFDCSVAEITGIRIPAGTPAADGGYPAGGETNGETGGEEKSRAEAPSGKAYGNENACVGE SGAEASEGRAIRASQTGVKPHFTAIAEVFRQAVAENAGHKITLYNDFSGRTESAMELTCM >gi|313159667|gb|AENZ01000003.1| GENE 63 75140 - 75826 555 228 aa, chain + ## HITS:1 COG:alr1793 KEGG:ns NR:ns ## COG: alr1793 COG0726 # Protein_GI_number: 17229285 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted 
xylanase/chitin deacetylase # Organism: Nostoc sp. PCC 7120 # 38 228 100 290 290 126 34.0 4e-29 MYVAAVIIAAGFVLFYCSYQIRLGAYVRTLCRNRAAGRVVALTFDDGPDPEMTPRVLDLL RERGVRATFFLVGAKAERHSELVRRIAAEGHDTGIHTWEHAAGFPMRSSRAMTADILRCR ESLRQIAGVETHLFRPPFGVTNPMVARAVKRTQSRCIGWSVRSFDTLLHRSREAVAERIA RRLGDGKVILLHDDRPGADRLLRLVLDDLKRRGYGTATVCELFKIEKP >gi|313159667|gb|AENZ01000003.1| GENE 64 75823 - 76437 1026 204 aa, chain + ## HITS:1 COG:no KEGG:BVU_1024 NR:ns ## KEGG: BVU_1024 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 202 1 205 208 103 30.0 4e-21 MKITSLLIPFALLSALTAQGQLPDSFRERLAQASRDNKTIQCDFTQHRQVRRMKGEIELK GRFYYDNTKAMALEYTVPAGDKVIIRDDRIILRTAGQVTETATSANPMLQQVALMIRASM TGDLSQFGEGWQIGYDETEGAGTVRMVPLSRRARKYIDSITLRFDLENMTLDRMELNETE GGHSVYEFRDKRFNLAVDPAKFNP >gi|313159667|gb|AENZ01000003.1| GENE 65 76456 - 76824 445 122 aa, chain + ## HITS:1 COG:no KEGG:Weevi_1569 NR:ns ## KEGG: Weevi_1569 # Name: not_defined # Def: beta-hydroxyacyl-(acyl-carrier-protein) dehydratase, FabA/FabZ # Organism: W.virosa # Pathway: Fatty acid biosynthesis [PATH:wvi00061]; Metabolic pathways [PATH:wvi01100] # 1 88 4 91 125 74 35.0 2e-12 MQKLFSDIDNYLSAEDRLAFRVRLDAAHPVYAGHFPGNPVLPGVCTLQIVRECLERGTGR RHRFTAIRECKFLGMIVPQADTLLDIDIRLADDGTAAKKVTCVVTNNEKTVLKLKATATP EP >gi|313159667|gb|AENZ01000003.1| GENE 66 76821 - 77975 1469 384 aa, chain + ## HITS:1 COG:Z4858_1 KEGG:ns NR:ns ## COG: Z4858_1 COG0463 # Protein_GI_number: 15803996 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Escherichia coli O157:H7 EDL933 # 10 229 10 235 246 129 34.0 1e-29 MNDRNDILAVIPTYNNEKTVARVIADVRRYCAHVLVVNDGSTDSTAQILAAEGVGTISYA PNRGKGYAIRRALRYAEEHGYRYMITIDSDGQHFASDIPKFVEEIEKTPDALLVGARNLR SDNMPGKNTFANKFSNFWFRVETGMRLDDTQSGFRLYPVRRMKGMRFLTRRYEFEVEVLV RAAWRGIAVRNIPVNVFYPEKDERVTHFRPGKDFTRISILNTFLVLGALLFYYPWRFLRS LTKENIRRFVADNITRSRDSNPQLAASIGLGIFFGIAPLWGYQMIAAGVTAHFTRLNKAV 
AILSSNISIPPMIPFILYGSYWTGAQVLRRAMPLSLSDITLERAAADMFQYVVGSFVMAA VCAVAAAAVSYALLVFCKRTPRHE >gi|313159667|gb|AENZ01000003.1| GENE 67 77968 - 79362 1826 464 aa, chain + ## HITS:1 COG:TM0817 KEGG:ns NR:ns ## COG: TM0817 COG1033 # Protein_GI_number: 15643580 # Func_class: R General function prediction only # Function: Predicted exporters of the RND superfamily # Organism: Thermotoga maritima # 231 416 487 672 674 67 23.0 7e-11 MNKFFIALYDFFESRRTLLYALLGVLVVAMASAALRLRFSENITGFFPDGERKAAAAFSN LKIKDKIAVMINAGEDAADKTDEMMACADSLAARLNADTLFRRYAEVEATFGRELADGMR SFLQGNLPLLLSEADYARMDTLVTPRGIAQAMEGNYRRLLSPVGGFIDEYIYDDPLGLSF GALGKLQELNIGGNYTLCDDYLFSKDMTTLLVFISPHYQSGDTGVGDRLIERIESALEGL NAEYAAAGITADYYGGPAVAAYNARQIKRDMMLTLNVAILIIVAFITFSFRNKFAVLLAL IPVAGGALFALALMSLTCHTISSIAVGAGTVVMGIALSYSIHILCHANHCHDPRQIIRDL AYPLTIGSITTIGAFAGLLFTDSQLLRDFGLFASLTLVGTTLFSLVLLPHLIRKEKRGGG PPCWNGWSGSRECVPTGTARWWRRSCCSRSYVSSSSTASALTAT >gi|313159667|gb|AENZ01000003.1| GENE 68 79233 - 81809 3648 858 aa, chain + ## HITS:1 COG:RSp0371 KEGG:ns NR:ns ## COG: RSp0371 COG0204 # Protein_GI_number: 17548592 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Ralstonia solanacearum # 408 580 19 196 268 88 33.0 6e-17 MLERVERLTGMRPDRNRPLVAAILLLTFICLFFFNRIGFDSDMMHLNYDPPRLAAAQQRL SRLTDEDGERSKVLFITTADTPGEAVASYLRMGRRLDSLKQAGKIDSHAGIASFVVDSAE QQRRLERWRRFWTPQRREAVRAGIREAEKRYGFADGAFDGALKLAEKEYGPLDYASAAAR EVFREWIDGQEKSPIFLSHVTLADSCKHGVYAAFSATDDIVAADRAFYAGKMARSVNRNF YLILSISSILVTAALFLCYGRIELTLMSLLPMAVGWVIILGLMAMLGIEFNIVTIILSTF IFGIGDDFSIFIMDGLLSEYKTGRRMLDTHKTAIFFSAFTVVVGLGALIFARHPALHSLA LISLFGIVAVVLVSYTVQPVLFRMLVSSQTEKGGAPYTLGSLINTLYAFGLFVTGCQLLQ ALIFTLWPLPMARRRKQRIVQWSIHHMTRGFLRAMVTTKTIRLNDTGETFAEPAVVIANH QSFIDILVLLSICPKAVMVTNGWVWRSPVFGRIVRYLGFYHAADGYERLAPALAQKVAEG YSVIVFPEGTRSADGRIKRFHKGAFYLAAELGLDILPICLYGNGMISSKRQPIYIKHGLV VSRILPRTACGDPAAYSAQAKSACRQMRREYRKLYETYNRPCNPYFRDMLIKSYTYKGPV LEWYMRVKIRLEKCYTLFDRIVPREGTVVDLGCGYGPLSYMLAMLSDRRRIVGVDYDAEK 
IETARHSFLRRPETEFVHADLRTAELPEADAFLLLDVLHYMQPEEQRGLIARCAARLKAG GRIIIRDGDAGKAERHKTTAMTEVWSTKIVGFNKTDGELHFTSTPELQQTARRLGLEIHA AHTDAHTSNTVYILTRPE >gi|313159667|gb|AENZ01000003.1| GENE 69 81811 - 83310 1429 499 aa, chain + ## HITS:1 COG:alr4631 KEGG:ns NR:ns ## COG: alr4631 COG1233 # Protein_GI_number: 17232123 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Phytoene dehydrogenase and related proteins # Organism: Nostoc sp. PCC 7120 # 3 474 15 505 533 132 26.0 1e-30 MQEEFDIIVIGSGLGGLECGVMLSREGVGVCVVEQAAVPGGCLQSFRRRGHSIDTGMHYV GSMQQGGIMRRYFDYFGIGDSLEIRPLDEAFDIVSPGGDGEFAYMHGYDEFRRHLTSLFP REAAGIARYCDKIREIGDSIGVEVHRSGRLSSGGVKYLGASAAEFIGECVADPLLRSVLA GTNPLYGGVREHSPLYHHAMINHSNIEGACRFAGGTQHIADALAARIREHGGTILTGCRA SALHTEGRRITGVELADGRILRAKCVISAIHPAATFGLIGPTPVLRPAFRDRMAAMPGTY GLFSVYLLLRPQSFPYINRNLYYYAEGGDVWDTLFDTEAMRPKMVLFSAQPPCADPAWSE VVTLMAPIATDIWSAWSGSAPGRRPEAYTALKTAVAEHLTDFVCERLPGLRAAVKQTYAA TPLTYRHYTGIPHGAAYGLQKDCRNVMATCIPVRTKFENLLLTGQNLTVHGAIGVTLSAA ATCAELLGTEYLAKKIGDA >gi|313159667|gb|AENZ01000003.1| GENE 70 83307 - 84965 2128 552 aa, chain + ## HITS:1 COG:no KEGG:BVU_1032 NR:ns ## KEGG: BVU_1032 # Name: not_defined # Def: putative choloylglycine hydrolase # Organism: B.vulgatus # Pathway: Penicillin and cephalosporin biosynthesis [PATH:bvu00311]; Metabolic pathways [PATH:bvu01100]; Biosynthesis of secondary metabolites [PATH:bvu01110] # 1 544 5 549 554 589 51.0 1e-166 MKRFLKYTLITLLALPLLAAAFLWGLYLTADMEPPAMTVDTAAYRVADHGGYTSCRGSFL RHNRYGLWELYTAGSPEESGAAAGALTAGLMRYQEQVFVDQIRQFIPSDGYLKFLRGLIV IFNRNLGRHVPEEYRREIYARSLYCSHDFDAIGTPYERQLNYHAAHDIGHAMSQYMLVGC SSFAAWGGMSADGELVVGRNFDFYMGDDFARHKIVTFCRPQAGHPFVSIGWAGMIGVLSG MNDKGLTVTINAAKGPVPLSAATPISILAREILQHAATIGEALEIARRRDTFVSESLLIA SARDGRAAIIEKTPRKTALYESGGEYLLCTNHYQSAELAGDEHNLENLARTDSPCRFARL EELTAANAPLTPQAAVAMLRDQRGEGGADIGVGNDCSVNQSIAHHSVVFRPSALKMWVST SPWQGGAYICYDLGAVMRDPDPAGELCDAAQEIAADTAYLAHDYPRVVAYRRLGGEIRKA MREETAADAATLDRFERTNPQNFHTWKLLGEYYRSQGDDVRAAQCFDKALQSGIPRADER EAVERLKSECKP 
>gi|313159667|gb|AENZ01000003.1| GENE 71 84962 - 86269 1740 435 aa, chain + ## HITS:1 COG:MA3853 KEGG:ns NR:ns ## COG: MA3853 COG1541 # Protein_GI_number: 20092649 # Func_class: H Coenzyme transport and metabolism # Function: Coenzyme F390 synthetase # Organism: Methanosarcina acetivorans str.C2A # 4 432 6 433 434 218 32.0 2e-56 MIRNPEIQFASADAIAAFQQEYLREEIAYLYENSPYYRRMFDHCGARPDDIRTQEDLRRL PVTTKTDLQLHPQDFICVPRSRIIDYVTTSGTLGDPVTFALTEEDLDRLAYNEAESFTTA GCTPEDVLQLMTTIDRRFMAGLAYFLGARKLRCGVIRVGNGIPELQWDTIRRIGPTGCIV VPSFLTKLVAYAEEHGIDYRRSSLRRAICIGEALRDTAFGNNTLGAKITELWPELELFST YASTEMQTSITECGHHCGGHVPADLILVELLDEQNNPVAEGEEGEVVITTLGVRGMPLLR FRTGDICIGYTERCACGRSTMRLSSVIGRKGQMIKFKGTTLYPPALYDILENIPGVSNYI IEVFTGSLGTDQIVLRIGSARRDEAFEKEIKDTFRSKVRVAPEVVFEPVEYIAKKQMPQM SRKQIKFVDLREQTK >gi|313159667|gb|AENZ01000003.1| GENE 72 86281 - 86853 967 190 aa, chain + ## HITS:1 COG:no KEGG:BVU_1034 NR:ns ## KEGG: BVU_1034 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 22 183 25 189 191 105 35.0 8e-22 MKKLLAMALCLVAVSGYAQKGKEKIAADFYGVDYSCVDVKGADEEPGAFIKAFEAINRLF LSEPKKYDVAGFTGIDILSTGVEQANESLGALAAEQFTPRSAATDWQTQLPQIVARYDNG SGNKGLVLVATTLDKGNGLGYYTAVLFDPATHEIITQMDMVGKPGGFGLRNYWAGSVYNA LKKQGKYLTK >gi|313159667|gb|AENZ01000003.1| GENE 73 87129 - 89336 3626 735 aa, chain + ## HITS:1 COG:BH2223 KEGG:ns NR:ns ## COG: BH2223 COG3345 # Protein_GI_number: 15614786 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-galactosidase # Organism: Bacillus halodurans # 30 720 13 716 748 410 34.0 1e-114 MKKLLKPLFTALFAGMVCTGAKAAEPQVINIRTNDLSMVMSVAPNGEVLFHHFGGRIDDA SPAAGIKSYRRTDHGTDNLAYSTMGGRNFREPALRVTHADGDMNTELRYVSHATRTLADK NVSETVVKLTDTNQALDVELVYTAYAKENVITTHTVIRNREKGDAVLHAFYSSSLPVKAR SYLLTHLYGSWARESQTDHTLLTHGSKSIESRKLVRTTHTENPAFMITLDSEAFDENYGE VIAGALAWSGNFRLNFEVDEFNVLNILAGVNPYASDYTLAKGESFTTPEMVYTYSFEGAG GASRNLHDWARNYGVWHGHTYAPTLLNSWEGAYFTFDAKTLTDMIDDAADMGLEMFVLDD GWFGNKYPRNDANAGLGDWQINAKKLPEGIDYIASHAHRKGLKFGIWIEPEMVNPKSELA 
EKHPDWIVRSGDREMPRMRNQWLLDLTNPKVQDFVFSVFDNTMKLSPNIDYIKWDANRHA ENAGSEFLPKDRQSHFWIDYVQGFYKVMERIRAKYPDVLIQACASGGGRVEYGAMKYFNE VWTSDNTEALSRARIQYGTSLFYPATVMGSHVSATPNHQTSNITPIKFRFDMACAGRLGM ELQPKQMNAEEKAFARKAIESYKGYRDIVMEGDLYRIGTPYDDTGYYGVMYVSKDKKKAV LFTYCTLYQSRTLVPKYRLYGLDAQTRYKIREQNVDKKRFWFDGGTFTGEYLNHAGINPN LNKIYDSAVFVLEAE >gi|313159667|gb|AENZ01000003.1| GENE 74 89823 - 90260 758 145 aa, chain + ## HITS:1 COG:no KEGG:BT_0961 NR:ns ## KEGG: BT_0961 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 141 2 136 144 138 51.0 6e-32 MTKEQKEQTIGLLFAEKCSCVVRNGDEVRIFRERGVKDLYRLLREEPQLLDGAFVADKVV GKGAAALMILGGVEELFADVVSRAALDLFAAAGKAVAYTVAVPHIINRAGDGICPVERLC ASAQTAGECLPLIEGFIRKMQERNE >gi|313159667|gb|AENZ01000003.1| GENE 75 90423 - 90935 745 170 aa, chain + ## HITS:1 COG:no KEGG:BT_0959 NR:ns ## KEGG: BT_0959 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 170 1 169 169 219 71.0 3e-56 METTAVKLYSLDYRDAKTYLAAALFVAGNILLPQICHSVPQGGLRWLPIYFFTLVGAYKY GWRVGLLTAVLSPLVNSALFGMPAAAALPAILLKSVLLAAAAGFAARRFGRATLPLLAGV VLFYQTVGALGEWAFTGSATAAVQDFRIGIPGMLAQAAGGYLIINYLLRK >gi|313159667|gb|AENZ01000003.1| GENE 76 90947 - 91909 1417 320 aa, chain + ## HITS:1 COG:MA0367 KEGG:ns NR:ns ## COG: MA0367 COG2768 # Protein_GI_number: 20089264 # Func_class: R General function prediction only # Function: Uncharacterized Fe-S center protein # Organism: Methanosarcina acetivorans str.C2A # 30 319 64 353 355 352 58.0 7e-97 MINRMYVITLLSVLPFTACGSAQGGEKTEAPEVYFIREITPRNMVRIYEAQERPATGKVA VKLSTGEPDGHNYLKPEQIKEIVTKVGGTIVECNTAYGGGRAHTEAHLKAAADHGFTAIA PVDIMDADGGVSLPVKGGKHLKENFVGSHYPRYDFTVVLSHFKGHAMGGFGGAVKNISIG IASSAGKAWIHSAGKTKNAAEMWSDLPAQDDFLESMAESAKSVADHCGDRILYISVMNNL SVDCDCDAHPEEPKMGDIGILASLDPVALDKACVDLVYASPDPGKSHLIERMESRHGIHT LEHAEAIGLGSQQYRLVELK >gi|313159667|gb|AENZ01000003.1| GENE 77 91911 - 92579 437 222 aa, chain + ## HITS:1 COG:MA0404 KEGG:ns NR:ns ## COG: MA0404 COG1453 # Protein_GI_number: 20089299 
# Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 1 180 1 180 364 139 38.0 5e-33 MEYRTLGRTGLRVSAVALGCEGFMNRPEEEVRADFDFAIENGINFLDLYASNPELRSSIG AALARRREGFVIQGHLCSVWEEGQYLRTRDPEKAQAAFDDLLARLGTDYVDIGMIHYVDA EADLRKVLDGPILQLALRLRSEGRIRHIGISSHNPAVARQAAETGLIDVLMFSLNPCYDL CRRATTWTRSGPTKAMRRSCTTSTPNASGSTNSANAKAWRST >gi|313159667|gb|AENZ01000003.1| GENE 78 92468 - 93058 683 196 aa, chain + ## HITS:1 COG:MA0404 KEGG:ns NR:ns ## COG: MA0404 COG1453 # Protein_GI_number: 20089299 # Func_class: R General function prediction only # Function: Predicted oxidoreductases of the aldo/keto reductase family # Organism: Methanosarcina acetivorans str.C2A # 20 194 192 362 364 171 46.0 1e-42 MDTLWADESYAQELHNIDPERQRLYEFCEREGVAIDVMKAYGGGDLLSGTNSPFGKPMTP VQCIEYALTRPGVAAVMAGCRSRTEIEAALAWCTATAAQRDYTQALTGLGRFSWEGHCMY CGHCAPCTAGIDIASVNKYYNLTLAQDEVPETVREHYNLLAHHASECIACGRCQSNCPFG VDIVGRMQLAAAKFGY >gi|313159667|gb|AENZ01000003.1| GENE 79 93163 - 95280 185 705 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|121997364|ref|YP_001002151.1| 30S ribosomal protein S1 [Halorhodospira halophila SL1] # 630 705 189 263 560 75 50 9e-13 MTEQIIAQRLGITLRQVQGTVRLLRDGATIPFISRYRKEATGSLDELQVGAVKEQLDKLT ELEARKQTVLSTIGEQGKMTDELRKRIEACWEAVELEDIYLPYKPKRRTRATVARERGLE PLANSIMAQNSHDIARQAQRFVTADVPTPEEAIAGACDIIAERVSEDERARNSVRRTAAR EGAVHSRLVKGKEQEGVKYSDYFDATSPLRTVSSHRFLAMRRGMDEGILKISVEMDADRI TEGLCRQFVRHGSATREWMEAAVADSFKRLIRPSIETELLAAAKEKADDEAIRVFAENLR QLLLSAPLGQKRIIAVDPGFRTGCKVVALDSQGNLLHNTTVYPHPPQNRYDEAAETLRSL AARYKAEAFAIGDGTAGRETEQLVRGLGLEGVGVFMVSEDGASVYSASAAAREEFPDHDV TVRGAVSIGRRLMDPLSELVKIDPKSIGVGQYQHSVDQSKLRARLVTVVESCVNKVGVNI NTASKHILTYISGLGPALAENIVAYRAANGDFKDRKSLLKVPRLGAKAYEQAAGFLRVVG GRNPLDNSAVHPESYHIVERMAADAGVTVEQLLADEELRKGIKPERYVDAEVGLPTVTDI LEALDKRGLDPREQLQAFSFDPDVHSVTDLREGMTLPGIVTNITAFGAFVDIGIKQDGLV HISQLADRYVASPADVVHLGQHVEVRVTGIDTVRNRIALSMRKNG >gi|313159667|gb|AENZ01000003.1| GENE 80 95929 - 97032 1305 
367 aa, chain + ## HITS:1 COG:FN1068 KEGG:ns NR:ns ## COG: FN1068 COG0758 # Protein_GI_number: 19704403 # Func_class: L Replication, recombination and repair; U Intracellular trafficking, secretion, and vesicular transport # Function: Predicted Rossmann fold nucleotide-binding protein involved in DNA uptake # Organism: Fusobacterium nucleatum # 78 367 10 288 288 157 33.0 2e-38 MTIEDIALQMTPGIGVKGAVHLLGVFGSARDIFAAAPDELAGEAGLREEIAREIVRRRGF AAAEKELEHCRRNGIAAIASTDPEYPPLLREIPDYPHVLYIKGDTAALSARCLSMVGTRN ATPYGQTMCNRLVEGLAAQVPGLCIVSGLAFGIDVAAHRAALAAGVPTVAVLANPLPEVT PAQHTAVARDILDHGGALVTELHSQTRQNGTAYIARNRIIAGLSAGCIVVESPDSGGSLV TAHCADDYDRSVMAVPGRATDRMSAGTNHLIRNRKAQLVLTADDIVRELMWDLGAEPATL RPKPATPQLTPDETGLLGCFRTDDPLSHETLSELSGLDPGELATLLVGLELAGAVRQLPG NRYMKLI >gi|313159667|gb|AENZ01000003.1| GENE 81 97029 - 97817 923 262 aa, chain + ## HITS:1 COG:VC0103 KEGG:ns NR:ns ## COG: VC0103 COG0084 # Protein_GI_number: 15640135 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Vibrio cholerae # 3 261 1 252 255 226 42.0 3e-59 MRLTDTHSHLYAPEFDADREEALARAADAGVERALLPAIDSESHERLFGLCRSHPERCAP MMGLHPTSVNDNPRWREELALVERYLETPPAGIRFCAVGEIGLDLYWSRDFRAEQLEAFR RQIDLSLEYGLPIAVHTRDAWPETVELMREYKGRGVRGVFHAFSDTADTYRELKEYGDFV FGIGGVVTFKKSKLADAVREMELRDIVLETDCPYLTPAPHRGERNESAYVRYVCEKVAEL KGLTPGETAEATTANAERIFGK >gi|313159667|gb|AENZ01000003.1| GENE 82 97837 - 98865 1597 342 aa, chain + ## HITS:1 COG:ECs2474 KEGG:ns NR:ns ## COG: ECs2474 COG0252 # Protein_GI_number: 15831728 # Func_class: E Amino acid transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: L-asparaginase/archaeal Glu-tRNAGln amidotransferase subunit D # Organism: Escherichia coli O157:H7 # 2 341 3 335 338 271 43.0 2e-72 MKSSILIIYTGGTIGMKNDAETGALVPFDFSAIYDEFPSLKRLNVDIDVLTMDPVIDSSN VTPANWAALAELIRDNYARYDGFVVLHGTDTMSYTASAMSFMLENLAKPVVFTGSQIPIG VLRTDGRENLITAIEIAGAHIGGRPEVPEVSLYFQNRLFRANRTTKRSAEALSAFRSYNY PPLAEVGVNIAYNLPAILHPTEVSPQLRIATRLADGIEVVKLFPGLGENILRAMLSAPGL 
RAVVLETFGAGNAPTNEWFIRVLKEAIGRGIIILNITQCGGGKVSMELYETGLRLQEIGV LCGHDMTTEAAVTKLMYVLGLDLPDDRTRALLRRPLRGEFTA >gi|313159667|gb|AENZ01000003.1| GENE 83 99228 - 99986 1003 252 aa, chain + ## HITS:1 COG:PA2983 KEGG:ns NR:ns ## COG: PA2983 COG0811 # Protein_GI_number: 15598179 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Pseudomonas aeruginosa # 56 246 8 197 211 88 29.0 9e-18 MKKLFLILSVALFALGTVNAFAQEAAAAAEETAVAAETALAGDSMHHIMMQKFLEGGWEW MLPVLLCLVIGLAVAIERILYLSLATINSKKLIAAVEDALKNGGIEAAKEVCRNTRGPIA SIYYQGLDRYDQGLDSVEKAVVSYGSVQTGQMESGLSWIGLFIALSPMLGFMGTVVGMIA AFDAIQAAGDISPTLVAGGIKVALLTTLMGLIAAVILQLFYNYIVSKIDSLVNDMEDSSI TLMDILTAYSKK >gi|313159667|gb|AENZ01000003.1| GENE 84 100018 - 100437 502 139 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159725|gb|EFR59082.1| ## NR: gi|313159725|gb|EFR59082.1| hypothetical protein HMPREF9720_2421 [Alistipes sp. HGB5] # 1 139 1 139 139 188 100.0 9e-47 MKKILNILLGILMAITVVLLVYAIATGGSDAAISLNLVWGYFLFVFAVAAALFCAIFGMI QNPAGIKGTILSLALIIIVVGVSYFYAAGHTVNIVDLQTNGFFGHGETVITETSILVTYV AFVAAFLTAVVTEIWGAFK >gi|313159667|gb|AENZ01000003.1| GENE 85 100452 - 101057 779 201 aa, chain + ## HITS:1 COG:no KEGG:BDI_3731 NR:ns ## KEGG: BDI_3731 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 198 4 191 196 168 49.0 1e-40 MATDKRKIQEINAGSMADIAFLLLIFFLVATTMNVDTGLVRMLPPMPPDEKQQEDIKVKE RNLFLVLISGSGNIMAGPSGKQEIIDLHQLKNRTKEFILNPMDDENLPEKVEKEIELSDG SKWMYPESQGVVSLQTTRDTGYQSYIMVQNELTRAFNEVRDEVAMRKFGSKFADLPEENR NAVSKAVPLKISEAEPRNIKK >gi|313159667|gb|AENZ01000003.1| GENE 86 101061 - 101531 681 156 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2820 NR:ns ## KEGG: Odosp_2820 # Name: not_defined # Def: biopolymer transport protein ExbD/TolR # Organism: O.splanchnicus # Pathway: not_defined # 1 152 1 152 155 162 54.0 4e-39 MAVMKKKGNKGLPPISTASLPDVIFMILFFFMVSTTMRDQELLVRYKLPEATEVQKLEKK SLVSYIHIGQPSLAMQAKFGTAPRIQLNDSYKTTKDILDFVAAERDKLSEADRASMTICL 
KADQTTKMGIVTDVKQELRRANALKVSYAASKSLGY >gi|313159667|gb|AENZ01000003.1| GENE 87 101643 - 102242 788 199 aa, chain - ## HITS:1 COG:NMA0075 KEGG:ns NR:ns ## COG: NMA0075 COG0164 # Protein_GI_number: 15793104 # Func_class: L Replication, recombination and repair # Function: Ribonuclease HII # Organism: Neisseria meningitidis Z2491 # 13 188 6 181 194 161 49.0 7e-40 MLENCFQYDLPEAGCDEAGRGCLAGPVFAAAVMLPPDFHDPLLNDSKQMTERNREKLRTI IEHEAVAWAVEAVSAERIDEINILNASFEGMSLAAARLDPAPGFLAIDGNRFRTRLELPY RCIVKGDGKYADIAAASVLAKTHRDEYMLRLAEEFPQYGWRKNKGYPTREHRLAIREYGL TPHHRLTFNHEIGQLELVF >gi|313159667|gb|AENZ01000003.1| GENE 88 102750 - 103730 1150 326 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159682|gb|EFR59039.1| ## NR: gi|313159682|gb|EFR59039.1| putative membrane protein [Alistipes sp. HGB5] # 1 326 1 326 326 493 100.0 1e-138 MKIDIARQPLVPAFVTLFALAVTAMWGGAGNGVSAGAPETMPLLGGALTRFQAAYPVWAR LAAGFMILFTGMCTGRIAIRYNLYGVSTCLPIPLYAVVACGIFSGGNYLTAFAASMLLAL AAKNYCRSYCNGYGFDAIFRASLYLGLLPLVYAPATPPVLILPLAILLFKRTFREAVVAA AGLILPLLTACYVSWGMGDEFTAPVMTLADALVSGVPLWIFKGLPLPSLVMLCVIAALAL TALLFFLSDIYAAGTKPRFILIFNTGILAMTLALLCTPSVTPEAFTLTAVPAALLFPVFF VRIDRRIALPVYLILLAASVYIATLQ >gi|313159667|gb|AENZ01000003.1| GENE 89 103889 - 104806 1273 305 aa, chain + ## HITS:1 COG:Cj1042c KEGG:ns NR:ns ## COG: Cj1042c COG2207 # Protein_GI_number: 15792369 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Campylobacter jejuni # 198 280 203 285 296 69 37.0 9e-12 MKNSLNSSAQQPMIVKYVEALHNGIQSQALSRYAIGYILRGTKYIYEGDKRQTLTRGDVF YLGIGHHYIENFPENGQPFEQVLFYYTPADLQRILMHLNITYGLNISNEHSCENCRNRTH VAMPAWNSIRNFFVNTNNDLRDEDFHRDETAENIKMTELIYLIASHEDCCIKSKLLSNMD AAKENFEQIVYDHIFKDISIEELSKLTNRSLTSFKKEFRRHFQMPPHKWYIRQRLMHSRL LLISTSKSISEIGNACTFPNTSHFIKLFKKEYQMTPAAYRNRHITSLTPEKQPETVQNAE IREAL >gi|313159667|gb|AENZ01000003.1| GENE 90 105294 - 106181 692 295 aa, chain + ## HITS:1 COG:BH3435 KEGG:ns NR:ns ## COG: BH3435 COG0320 # Protein_GI_number: 15615997 # Func_class: H Coenzyme transport and 
metabolism # Function: Lipoate synthase # Organism: Bacillus halodurans # 7 291 4 291 303 292 50.0 5e-79 MPKFYDKVDVLKKPGWLKIRLHRTPEYAEVQQIVRKHALHTICSSGMCPNKAECWSRRTA TFMILGDICTRGCRFCATRTGHPLAPDAEEPRRVAESVSLMKLRYVVVTSVTRDDLPDGG AAHWAETVRAIRKQNPDAAIELLIPDLDARPDLLDTVIASKPDIIGHNIETVERLTPVVR SRAKYRTSLETLRHMSRQGVATKSGLMVGLGESDDEVLQTLHDLREAGVGIVTLGQYLRP TLEHYPVAAYITPEKFEWYRLKALEMGFAYCASAPLVRSSYLAEEALKSVKSLQR >gi|313159667|gb|AENZ01000003.1| GENE 91 106178 - 106936 761 252 aa, chain + ## HITS:1 COG:ML0859 KEGG:ns NR:ns ## COG: ML0859 COG0321 # Protein_GI_number: 15827384 # Func_class: H Coenzyme transport and metabolism # Function: Lipoate-protein ligase B # Organism: Mycobacterium leprae # 6 247 14 217 235 151 38.0 1e-36 MNVYCRDLGQMDYKACWDLQQTLFDALTAGKTNRTAAPESATPSSEEAGTILLVEHPPVY TLGKSGHAENLLVGREALEAMGAQFFHIDRGGDITFHGPGQLVCYPILDLERLGIGLRAY IDALEESVIRTVAEYGIRAERIAGASGVWVGAGIAQEGDHRTADGGGKCTPAGAPRKICA IGVRSSRYITMHGFALNVTTDLEWFSRINPCGFTDRGATSIERETGAKVPMDDVKRLVVK FLSEILNVRIYK >gi|313159667|gb|AENZ01000003.1| GENE 92 106945 - 108711 2587 588 aa, chain + ## HITS:1 COG:BH1240_1 KEGG:ns NR:ns ## COG: BH1240_1 COG0608 # Protein_GI_number: 15613803 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-specific exonuclease # Organism: Bacillus halodurans # 4 582 5 562 562 372 37.0 1e-103 MPIEKRWVVKPQGDSEAVAKLASVLRISPVLANLLVQRGIDTVEKADKFFKPSLADLHDP FLMKDMDKAVERVEQAVRNNEKIMVYGDYDVDGCTAVALVYKFLRQIGHKNLMFYIPDRY TEGYGISVKGIDLAARKGVGLIIALDCGIKATEKIVYAKSKGVDFIICDHHLPAEEIPRA VAVLDPKRVDCSYPFDELSGCGVGFKLVQAYAQKNRIPFEQISPLLDLLVVSIASDIVPL VDENRILAHFGLKDLNREPSKGLLSIIKICGLDKHNITIDDIVFKIGPRINAAGRMRMDE NDENASPSGGHAAVELLIEGNESIAAEFGNVIDAYNQDRKSIDRSVTQEAHDFIERNPEM KQLKSTVIYNPKWMKGIVGIVASRLIETYYRPTVVLTMSNGFVTGSARSVPGFDLYQAVE SCSDLLENFGGHMYAAGLTMRPENVDEFTRRFNAYVEENIDPQMLMPQVDIDSELLFSDI TPAFRKDLNRFQPFGPGNTAPVFATYGVSNHGDAKLVGVECEHLRMDLIQRQKPNTKIQS IAFQQPTHYEWVRSGRPIDVCYQIVENHYRGTVTTQLRIKDIKPVLNK >gi|313159667|gb|AENZ01000003.1| GENE 93 108866 - 112015 4447 1049 aa, chain 
+ ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 644 899 2 267 280 178 40.0 6e-44 MNLLKKPFGLTLQALLWVMIFGCLLAHPFTNATSSPPGDKRDYVLIINSYNESSSWGWEI ITDITARIEQIENLEVYVEHMNTLLMDQQSDLDNFRTNLSREYGKNPPRMLIYIGAPAFI MRDFAEKEWGKGIPSIICAEEDFIGPDKYYVSKRAIPHSERIPLRELSGEYNLTLLYAPI YLEQTIELMRRMIPEMNRLVFVRDGRYINQQYEDELKKLLDTDYPGMRFTSYKASNMSVD QLLDSLNVTDTRHTGVLFSSWHYTRNIAGNTEITANPFKIIASSTAPLFALWPSSVKNSG VVGGFTYDVSVYNRQIIKTFNTILEGRQPRDIPFYTPTDAHPTFNYPAMLRNKISPEICP PDSVFIDKPETFAERNRTAIILCGGFFLLFILFQQWRIRVMRKVEAARRRESESQIKYAN LFNRMPIIYIQERVIFDEKHYPADTEFSDINAYFERTFLPRNQVVGKRGSELFADSFSEF LRLVNIVLTENRTVTYPFYYKPLDIFFEVVLSRSYLTDHIDIFCLDGTALHKVQQKLDLI NHKLSLALDVANIVPWKCDLNNRTILCDINKPVKAASGEAILQNEASLSVSDECYFSKIH KEDRDRIRQAYDDLIAGRTDKVCEEYRVISNESGHWHMEWVEALAAVESFDAQGRPKTLV GTSQVITERKRMEQELLSARDRAEESNRLKTAFLANMSHEIRTPLNAIVGFSAILASTDE EQEKQEYVSIIENNNALLLQLIGDILDLSKIEAGTLDFIHTDFDLNELMREKENVIRMKV REGVELLFEQKYEQCSVCTDRNRLSQVIINLLTNAAKFTQTGSIRFGYEMRDRELYFWVS DTGSGIPADKTEQVFERFVKLNTFKQGTGLGLSICKMIVEHLGGRIGVESEEGKGSKFWF ILPYVPGKAAAQRTEEQYPVISVEKDKLVILIAEDNDSNYRLFESILRKEYTLLHAWNGL ETVEIYRQRRPHLVLMDINMPVMNGYEATAEIRKLSESVPIIAVTAFAFASDEQKVLQSG FDGYMAKPINALQLRTQIVDMLRKHVVLL >gi|313159667|gb|AENZ01000003.1| GENE 94 112030 - 112557 553 175 aa, chain - ## HITS:1 COG:CAC3555 KEGG:ns NR:ns ## COG: CAC3555 COG0778 # Protein_GI_number: 15896791 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 5 169 3 163 174 111 32.0 7e-25 MEFKELIEKRRSIRKFSDRAVPRETVDRILRETLTAPSARNTRTTRLMVVDDPALVARMA EMRDYGSAFMKGAPLAFVVLGDTSKSDLWRENAAISATLLQLACVDEGLGSCWVHINGRP RRKDEPDGQSAADYLRSFLPVPADCEPLCAIAAGYSDFTPAPLPDADDAARIIRL >gi|313159667|gb|AENZ01000003.1| GENE 95 112635 - 113474 1362 279 aa, chain + ## HITS:1 COG:no KEGG:Cphamn1_2546 NR:ns ## KEGG: Cphamn1_2546 # Name: not_defined # Def: hypothetical 
protein # Organism: C.phaeobacteroides_BS1 # Pathway: not_defined # 6 272 7 273 277 324 53.0 3e-87 MDYRFNIALIYDFDGTLAPGNMQEYDFIPAVGKSNKEFWTEANTLAEEQDADMVLTYMAR MLQEAKSKGLSLRREAFQESGRRVTLYKGVREWFGRINAYGAARGIRILHYINSSGMKEI IEGTEIAHEFRKIYACSFLYDVDGIAYWPAVAVNYTNKTQFIFKINKGVESVFDSKLVNR YIPENERPIPFKHMIYVGDGTTDIPCMRLVKNSGGHSIAVYNPDQKGARREMASLIHDNR VSHVCPADYSEGSDMDVLVKTIIDKIDLDDKLEKLEVVK >gi|313159667|gb|AENZ01000003.1| GENE 96 113540 - 114121 310 193 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|71274727|ref|ZP_00651015.1| Ham1-like protein [Xylella fastidiosa Dixon] # 3 190 5 197 200 124 42 3e-27 MNILFATNNAHKLMEVQAVLGPGFRLVTPRECGVTEEIPEEQQTLEGNASQKAHYLNDRT GLDCFADDTGLEVKALGGAPGVHSARYATDGHDFAANNRLLLKNLEGATDRRAQFRTVIS LILNGEEHLFEGVVEGRIIDREEGHEGFGYDPLFIPDGYDKTFAQMSTEEKNEVSHRARA VRKLAAYLHSAGK >gi|313159667|gb|AENZ01000003.1| GENE 97 114203 - 114811 1031 202 aa, chain + ## HITS:1 COG:all4721 KEGG:ns NR:ns ## COG: all4721 COG0302 # Protein_GI_number: 17232213 # Func_class: H Coenzyme transport and metabolism # Function: GTP cyclohydrolase I # Organism: Nostoc sp. PCC 7120 # 31 199 43 211 216 223 63.0 2e-58 MEERTYNKEERFDARTIEALKLHYAEILRLLGEDPSREGLLKTPERVAKAMAFLTKGYEE NPLEIIRSATFREEYKQMVLVKDIELYSLCEHHMLPFYGKAHVAYIPNGHITGLSKIARV VECYARRLQVQERLTVQIRDCIQEALNPMGVAVVIEASHMCMQMRGIEKQQSATTTSAFT GIFLSDHRTREEFMTLISHKYR Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:06:56 2011 Seq name: gi|313159627|gb|AENZ01000004.1| Alistipes sp. HGB5 contig00022, whole genome shotgun sequence Length of sequence - 46505 bp Number of predicted genes - 39, with homology - 37 Number of transcription units - 22, operones - 10 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 970 737 ## COG3525 N-acetyl-beta-hexosaminidase + Term 1045 - 1080 2.0 - TRNA 1079 - 1155 76.1 # Ala GGC 0 0 2 2 Tu 1 . 
+ CDS 1731 - 2090 372 ## COG0526 Thiol-disulfide isomerase and thioredoxins + Term 2163 - 2195 4.2 - Term 2332 - 2370 10.2 3 3 Op 1 23/0.000 - CDS 2599 - 3630 1425 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 4 3 Op 2 . - CDS 3635 - 5479 2618 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit - Prom 5507 - 5566 2.9 5 4 Tu 1 . - CDS 5978 - 6940 1134 ## COG1432 Uncharacterized conserved protein 6 5 Tu 1 . - CDS 7302 - 7973 1035 ## Pedsa_0979 peptidase S24/S26A/S26B, conserved region - Prom 8080 - 8139 8.2 + Prom 7994 - 8053 4.1 7 6 Tu 1 . + CDS 8117 - 8401 286 ## gi|291513609|emb|CBK62819.1| hypothetical protein AL1_00960 + Term 8530 - 8571 0.7 8 7 Tu 1 . + CDS 9180 - 10538 1779 ## COG5410 Uncharacterized protein conserved in bacteria 9 8 Op 1 . - CDS 10561 - 11109 543 ## 10 8 Op 2 . - CDS 11120 - 11347 104 ## - Prom 11367 - 11426 2.8 11 9 Tu 1 . + CDS 11380 - 12474 1654 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Term 12481 - 12510 1.2 + Prom 12494 - 12553 4.7 12 10 Tu 1 . + CDS 12748 - 13896 1042 ## Odosp_0525 hypothetical protein + Term 13949 - 13984 3.1 13 11 Tu 1 . - CDS 14044 - 15066 609 ## PROTEIN SUPPORTED gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase - Prom 15102 - 15161 5.0 + Prom 15051 - 15110 10.9 14 12 Op 1 . + CDS 15147 - 19709 6662 ## Palpr_2903 hypothetical protein 15 12 Op 2 . + CDS 19776 - 20450 1051 ## COG0811 Biopolymer transport proteins 16 12 Op 3 . + CDS 20457 - 20864 637 ## BF3738 putative tansport-like protein 17 12 Op 4 . + CDS 20909 - 21598 777 ## gi|313159661|gb|EFR59019.1| TonB family C-terminal domain protein 18 12 Op 5 . + CDS 21617 - 22069 646 ## Palpr_0285 hypothetical protein 19 13 Op 1 . 
+ CDS 22319 - 23626 1862 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases 20 13 Op 2 1/0.000 + CDS 23637 - 24086 411 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 21 13 Op 3 . + CDS 24108 - 25133 1338 ## COG0673 Predicted dehydrogenases and related proteins + Term 25155 - 25213 23.6 - Term 25156 - 25188 6.3 22 14 Tu 1 . - CDS 25406 - 27469 3283 ## COG0296 1,4-alpha-glucan branching enzyme - Prom 27496 - 27555 2.7 + Prom 27438 - 27497 3.9 23 15 Tu 1 . + CDS 27567 - 29117 2466 ## COG3104 Dipeptide/tripeptide permease + Term 29226 - 29261 8.1 + Prom 29219 - 29278 3.2 24 16 Op 1 . + CDS 29338 - 29874 644 ## BF0456 two-component system sensor histidine kinase 25 16 Op 2 . + CDS 29878 - 31350 2118 ## COG0642 Signal transduction histidine kinase + Prom 31400 - 31459 6.2 26 17 Op 1 . + CDS 31621 - 32253 981 ## BF3965 putative TetR transcriptional regulator 27 17 Op 2 13/0.000 + CDS 32275 - 33612 1885 ## COG1538 Outer membrane protein 28 17 Op 3 27/0.000 + CDS 33640 - 34650 1630 ## COG0845 Membrane-fusion protein 29 17 Op 4 . + CDS 34665 - 37805 4731 ## COG0841 Cation/multidrug efflux pump 30 17 Op 5 . + CDS 37822 - 38115 461 ## BVU_0236 hypothetical protein + Term 38155 - 38188 6.1 + Prom 38381 - 38440 2.7 31 18 Op 1 . + CDS 38536 - 39321 1019 ## COG1235 Metal-dependent hydrolases of the beta-lactamase superfamily I 32 18 Op 2 . + CDS 39323 - 40138 245 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 + Term 40210 - 40243 6.1 - Term 40196 - 40233 7.8 33 19 Op 1 28/0.000 - CDS 40251 - 41087 994 ## COG0805 Sec-independent protein secretion pathway component TatC 34 19 Op 2 . - CDS 41094 - 41336 464 ## COG1826 Sec-independent protein secretion pathway components - Prom 41390 - 41449 1.7 - Term 41419 - 41459 8.0 35 20 Op 1 . - CDS 41693 - 42634 1432 ## COG0462 Phosphoribosylpyrophosphate synthetase 36 20 Op 2 . - CDS 42718 - 43404 940 ## COG0218 Predicted GTPase 37 21 Tu 1 . 
- CDS 43586 - 44338 1057 ## COG3568 Metal-dependent hydrolase + Prom 44624 - 44683 3.0 38 22 Op 1 . + CDS 44711 - 45760 1523 ## COG3049 Penicillin V acylase and related amidases + Term 45780 - 45835 15.8 + Prom 45777 - 45836 2.5 39 22 Op 2 . + CDS 45857 - 46505 -223 ## cce_1072 type II site-specific deoxyribonuclease Predicted protein(s) >gi|313159627|gb|AENZ01000004.1| GENE 1 2 - 970 737 322 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 1 159 436 598 757 73 31.0 6e-13 SLARICRFDMLPDSLSDAQKRQIKGVQGNLWTEKIPTWSRAEYMFYPRLLVLAEKGWTEP GKFDEDAFMSRLLDYCRRLDDEGVNYRIPSLTNYHAVSVFTDSVRTAVVCPLPGAVLRYT LDGSVPTVNSPRYDAPLVIRDDCMLHIRPYHADGRTGDWITVRYEKQDYARPLEPRDTEP GLCVDWYFRRFPGCDAIGIPETLYAAGGRCVVDSVCFPRAAAGRRAVGMIFRGYIDIPAT GIYTFALASDDGAILRIGGRVVVDNDGEHPLLEKSGQVALSAGLHPLELRYFDYNGGEIR LVLLGDDGARHPLDTDLLRHDP >gi|313159627|gb|AENZ01000004.1| GENE 2 1731 - 2090 372 119 aa, chain + ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 3 112 2 111 117 119 43.0 2e-27 MKAIELTTTEFQTRIYDFRANPEWKYEGDLPAVIDFYAPWCGPCRMMGPVMESLAEEYAG KVRMYKVNVDKEKRLAAIFRVRSIPTFLFIPVSGEPKHANGAMEIAQMRRIIDTTLLEH >gi|313159627|gb|AENZ01000004.1| GENE 3 2599 - 3630 1425 343 aa, chain - ## HITS:1 COG:Rv2454c KEGG:ns NR:ns ## COG: Rv2454c COG1013 # Protein_GI_number: 15609591 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Mycobacterium tuberculosis H37Rv # 10 343 46 373 373 307 46.0 2e-83 MAEYKYTPADFKSDQEVRWCPGCGDHAILNAVQRALPEIADATDTPHNMFTFVSGIGCSS 
RFIYYMKTYGFHSVHGRANAVATGVKVANPRLNVWVTTGDGDSLAIGGNHFIHAIRRNVD LNVILFNNEIYGLTKGQYSPTSKLGKITKTSPYGTVEKPFNPGELVIGAKGTFFARSVDM EVGLSKECMVAAAMHKGMSVVEVLQNCVIFNDKTHGEFAADKATRAERTVTLRHGEKMLF GADKQKGIVFENMKLKVVTVGEDGYTLDDVLTHDAHQRDTTLHSMLAAMKYPDYPVALGV IRAVEDATVYDREVARQVEEVKAQSKIHSVDELLHSGATWEIE >gi|313159627|gb|AENZ01000004.1| GENE 4 3635 - 5479 2618 614 aa, chain - ## HITS:1 COG:MT2530_2 KEGG:ns NR:ns ## COG: MT2530_2 COG0674 # Protein_GI_number: 15841979 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Mycobacterium tuberculosis CDC1551 # 217 581 4 366 425 366 54.0 1e-101 MEKTNVRELQDVVIRFSGDSGDGMQLTGTLFSDTSALLGNGISTFPDYPAEIRAPQGTVA GVSGFQVHFGSHRELNPGDYCDVLVAMNPAALKANRKWLKQGATVIIDGDSITEEHLKKA GFATLDPIAELGLDEYNVVIPDITSMTREALKQTGLDNKAVVKCKNMFALGICFYLFDRP EAYAYKYLETKFARKNPVVAEANKLAIDAGYNYAANTHQFANTYTVSPAPLEKGTYRTIN GNVATAWGLCAAAEKAGLPLFCGSYPITPATVILEELAKRKDLGVKTVQAEDEIAGICTA IGAAFAGNFAVTTTSGPGLSLKSEALGLAVMTELPLVVVDVQRGGPSTGLPTKTEQSDLS QALYGRNGECPVIVVAASSPSDCFHYAFEAGRLAMEHMTPVILLTDGFIANGSEPWRIPS MKDYPAIQPPIVDDAPEGGFMPYVRDEKLARGWAFPGKVGLEHRVGGLEKDCVKGCISHD PSNHQKMVSTRATKVAKAADDIPAQTVWGDPEGDLLVVGWGGTRGHLQNAVDQMRAEGKR VSLAHFNYINPLPHGVRDIFAKFRRIVVCELNEGQFANYLRQNLQEFRYEQYNKCEGLPF TIVELKEKFESLLK >gi|313159627|gb|AENZ01000004.1| GENE 5 5978 - 6940 1134 320 aa, chain - ## HITS:1 COG:slr1870 KEGG:ns NR:ns ## COG: slr1870 COG1432 # Protein_GI_number: 16330259 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Synechocystis # 1 320 1 242 249 158 36.0 1e-38 MELSDKKFAVLIDADNISHRKIKDILDEIANYGTPTIKRIYGDFTDPKFAAWKSILLENS ITPIQQYAYTTGKNATDSALIIDAMDILHKESVNGFCIVSSDSDYTRLASRIRESGREVL GFGEKKTPKPFIKSCDKFIYVEILGQPAAPTPALAPAPGTAPAATQAPAAKAAQKKTSAK RRQTSDTQAPGTPEAAAGKKTAPAPGQTPGVVPASELPQVTVQAQTLDPAQSQSIIRPID DDFKDLLANTIEDAADDSGWAFLGIVGSVLTKKMPDFDPRNYGYKKLSLLVKALSEAIET EERLSANKEQKFIYVRNRTR >gi|313159627|gb|AENZ01000004.1| GENE 6 7302 - 7973 
1035 223 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0979 NR:ns ## KEGG: Pedsa_0979 # Name: not_defined # Def: peptidase S24/S26A/S26B, conserved region # Organism: P.saltans # Pathway: not_defined # 4 221 6 227 229 197 45.0 2e-49 MNTRVKLIRKQLGMTQEQLAQHLGIGKAALSMIETGKAGLSARNRNILVQELNVNPEWLE SGKGNMFNAEPDLTAYRLRTDNSLPLQSVPLYSIEGTAGLVPLFTDQAQAKPVNFIHIPN LPKCDGAIYIVGDSMYPLLKSGDIVLYKQLGNIDDIFWGDMYLLSIDIDGEEYVTVKYIQ KSDREGYVKLVSQNPHHADKEVALSRIRAIALVKASIRMNSIR >gi|313159627|gb|AENZ01000004.1| GENE 7 8117 - 8401 286 94 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513609|emb|CBK62819.1| ## NR: gi|291513609|emb|CBK62819.1| hypothetical protein AL1_00960 [Alistipes shahii WAL 8301] # 1 91 1 91 94 126 68.0 7e-28 MYSVSPELYQEVAARLADAVDGENYFSGSLAFRFGDTDCRFTASVIVYRTRLSQPEGDAE PVSDLVPVWWEFHTFSAEGEMLNDFDFSEMKRFV >gi|313159627|gb|AENZ01000004.1| GENE 8 9180 - 10538 1779 452 aa, chain + ## HITS:1 COG:lin1732_1 KEGG:ns NR:ns ## COG: lin1732_1 COG5410 # Protein_GI_number: 16800800 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Listeria innocua # 12 288 53 305 310 90 27.0 7e-18 MQPFHRAYYRVLEAFAAGRVRRLIVTMPPQHGKSVGATTLLPAYVLGLDPDQRVAIASYS GALASKFNRRVQRIIESREYAAFFPATTIKQGSKPPSYIRTADEVEIIGCRGGLLSVGRE GSLTGNRVDCFILDDLYKDALEANSPLIRANCWEWYTSVVRTRMHNASRELIVFTRWHEE DLIGTLTAREPVAELKEWAQLDGLPADTWLHLNFEALKSSPPTGIDPRMPGEALWEQQQG RALLEAKRRLDPLQFESMYQGHPSSREGLLYGLNFAEYDDLPHEIVRRGNYTDTADTGDD YLCSLSYAVDADGAIYITDAVYTREPMEVSEPLVAEMLLRSDTRQAAVESNNGGRGFARA VQSLSPGVRIEWFHQGGNKEARILSNSATALHLLRWPRGWNFRWPELYAHLTTYRRRFRA NRWHDAADVVTGIVEREAADRSRSRVRGVRFL >gi|313159627|gb|AENZ01000004.1| GENE 9 10561 - 11109 543 182 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKLLFPALALMLCIPTLSAQSRGDKYVGGIIGITTTSISIDGSSASQTTFGFAPEFGYF ASDRLRVGGSIGYQLISSDGETTHGLTAGPSLAYYVRLCDRFYYTPQLAVGFAFASTDGT SGYGFDAGLSLGAFEIRPSAHIGLSVSLLTLDYSYLSYSGTGVNGVSFQLGISPTVGFKY YF >gi|313159627|gb|AENZ01000004.1| GENE 10 11120 - 11347 104 75 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MGPKVRKIEAIGSAREDNLTQASHCVAAAAAFSGREAAAGRLSGSEKRHFLQAKKRRISL FFLTWKRKRHSTANL >gi|313159627|gb|AENZ01000004.1| GENE 11 11380 - 12474 1654 364 aa, chain + ## HITS:1 COG:CAC3580 KEGG:ns NR:ns ## COG: CAC3580 COG2070 # Protein_GI_number: 15896814 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 347 3 347 355 314 46.0 2e-85 MKALKIGNLLATVPIVQGGMGVGISLSGLASAVANQGGVGVISSAGLGAIYNNYSKDYRA ASIWGLKEELRKAREATRGIVGVNVMVAMSNFADMVKTAVAEKADIIFSGAGLPLNLPSF LTEGAKTKLAPIVSSARAAKLLCQKWFSEYRYIPDAIVVEGPKAGGHLGYKEEQLADEHF SLEALVPEIVAEVRAFGAEHDCHIPVIAGGGIYTGEDIYRIMQLGADGVQMGTRFVTTEE CDADPAFKQTYLDAKPQDIEIIKSPVGMPGRAIHSSFLDRVKEGLKRPKNCPFDCIKTCD VTHSPYCIMLALYNAFKGKLQNGYAFCGANAWRAEKIQSVRDLMAALRDEYDNFSLKEKL FGAR >gi|313159627|gb|AENZ01000004.1| GENE 12 12748 - 13896 1042 382 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0525 NR:ns ## KEGG: Odosp_0525 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 112 329 307 532 575 103 35.0 2e-20 MRNFLRKSALLTAASLVFFACDDDEKKVPETVVTLDRTELNLNVGFSETLVATVTPPLQD GVAVVWSSDDEVVATVEDGVVTALAAGEATITASVGESKATCTVTVAYVAPKIGDYYYSD GTWSDGGLVSIEADGLNPVWADTKPAPVAGKTVIGIVCQTDENRIAAGDKEKGYTHGYVV AVKNAHSADSQTVQYSTDNDFASTPKAKIASTWYGNVNGYEETMKTVSDYGPNLATWCPA FDLTVNNFSLPAPETSSGWFLPSTGQLWDMVANLCGHEAALLLKEWQTSSYNVYYGYNSE NVSYDVIAKFNETLAMIPADQKEELFVTDGTHYNTCTLWATTCFEPGETACIIHIGGSEK HLVELMCEYIDYDGIARPILAF >gi|313159627|gb|AENZ01000004.1| GENE 13 14044 - 15066 609 340 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227425790|ref|ZP_03908856.1| SSU ribosomal protein S18P alanine acetyltransferase [Atopobium parvulum DSM 20469] # 5 337 480 817 832 239 39 3e-62 MDITILGIESSCDDTSAAVIRNNVLLSNVIASQAVHIKYGGVIPELASRAHQQNIIPVVD TALKEAGLTPEEIDAIAFTRGPGLLGSLLVGVSFAKGLSIAHNIPMVEVNHLQGHILSHF IDLQDRALPHPGFPFLCLLVSGGHTQIVRVDSPLEMEIIGTTIDDAAGEAFDKCAKVMGL PYPGGPVIDKLAKEGDPKAFRFARPHVDGYDYSFSGLKTSFLYTLRDAVANDPGFIENNK 
ASLCASLQRTIVEILLDKLVRASKDLGIRDIAIAGGVSANSGLRNGIVEEGSRRGWRTFL PEFKFTTDNAAMIAMAGYYRYQHGDFSSLAVSPVARLQEL >gi|313159627|gb|AENZ01000004.1| GENE 14 15147 - 19709 6662 1520 aa, chain + ## HITS:1 COG:no KEGG:Palpr_2903 NR:ns ## KEGG: Palpr_2903 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 27 1485 7 1461 1468 433 24.0 1e-119 MGKVLSAAVLLLIILPLALSLLLDIPAVQNYVVQKAVRMASKKLETTVSIDRVDIGLFSR VKIKGFYVEDYQRDTLLYVGRLDAFVTGLGIFGGGLSLSRGEIADAKLYLRETPGGEMNI KQIVNRISNPDKPKKGNFRLSLRKASIENTDLCLERLDRRDPEFGIDFSHMHLYGITAYV NDFTIDGQSIYTTIETLSARERSGFALDRFAGRFYLTNGCMGFEDVSVVTGRSNVQIPYI SLAGDSWAEYKDFIGEVRIDGALRNTTVSTDDIAYFAPKLRGWHTDFSNVNVEVAGVVAD FTAKVKSMQIGEGTWLIADASVRGLPDIRQTRFDLNVPRLKSSAEAVDELAAGIGGRELS DKLVGILGNTGQIDVNARFKGLLSSFDMQLGASTGVGEVTCNLRMTPLKAGRSSVRGDVE THNLRLGELLGRRDLLGNATLSAYIDGVVGKGYTDANVVGNVTQLGFNDYIYDSLRLDGR LRNRQFDGRVTARDPNLDFDFSGLVDFNDSIPRYDFTMDLRHADLARLHINRRDSVSQLA ARIEANAGGRSLDDLNGRIHVTDAVYRYNDKRITSKTMTVTGENSARSKLVELRSDFADA TFRSKTSYREVFEYLRRSAWKYLPLLRRGEGDPTPRGRKTAVANDYSLLSVNIRHIDPIT DAITAGLQIADGSSMQLLFNPASDQFSLKATSEYVERKRMLATRLSINASNRGDSLTVYA SAEDLYAGVLHLPQLSLTGGAKQGRMQVSAGFNDTLRKVSGLLGVQAHVVDEHGPNGRVV DLRILPSHVTRGSKTWQIYARKIQLDTAQVVIDRFFVMNREQELLLDGVASRSREDSVTL RLRNFDLAPFMQVAERLGYVVEGRTNGSATMKSVLHAGEISADILLDSLEVNDIPAPPLR LSSRWDFSRNRAGVTVTDRNKRDTLIRGFYAPSQVRYYARLDIDSLDMGLLDPVLSGVIS QTEGLASAELVLQGQRRDADLTGRIHVTGLSTKVDFTQVAYSMPEAVLDVRGNRFRASNV PVFDPEGNRGRFDIDLSLQHLSNIAYDVRVAPQQMLVLNTTAQDNDFFYGKVYASGTARI SGDKGAVNMDIAASTDDHSSFFMPLSNKSNISYADFVTFKEKPKVDTVDNLARRKMMFER QRQQKTVAGSQMNIALALNVRPGVEVELSVSGNTLKGRGDGTLNLQINPRSNVFEMYGDY TITEGSFLFSLQNIINKKFIIENGSTIQWTGAPMDAMLNIDAIYKLKASLQPLLQGTAEN VTADRSVPVECIIHLGDRLSNPAVTFDVNVPGTDPETQAVVANALTTPETVDTQFAYLLL FNSFMSENNAASSNIGASVSAATGLEFLSNMVSNLLSNDDYNIVIRYRPKSELTSDEVDF GLSKSLINNRLFVEVEGNYLIDNKQAINSSMSNFMGEAYITYLIDRAGTLKAKAFTQTID RFDENQGLQETGVGIYFKEDFDNFRDLRRRIRERFTNKKRKARREARREAARREKELRRA AADTLQAFRYVKHEKDVSAP >gi|313159627|gb|AENZ01000004.1| GENE 15 19776 - 20450 1051 224 aa, chain + 
## HITS:1 COG:PA2983 KEGG:ns NR:ns ## COG: PA2983 COG0811 # Protein_GI_number: 15598179 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Biopolymer transport proteins # Organism: Pseudomonas aeruginosa # 21 218 1 197 211 99 29.0 6e-21 MMTFLQAAEAVATSEETRMGLWTLFTKGGWLMWPLLALGGVTIFIFVERFMAIRKASVLD MNFMNRIRDYISDGKIQTAVNLCKKTDTPIARMIEKGIERIGRPMSDVQTAIENVANLEV SKLENGLPFLATIAGGAPMIGFLGTVLGMVQTFMDMSAAGGTVDLGLLSSGMYVAMVTTV MGLIVGIPAYFGYNYLVARIEKLVFQMEANSIAFMDILNQPVQK >gi|313159627|gb|AENZ01000004.1| GENE 16 20457 - 20864 637 135 aa, chain + ## HITS:1 COG:no KEGG:BF3738 NR:ns ## KEGG: BF3738 # Name: not_defined # Def: putative tansport-like protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 134 6 139 145 95 38.0 5e-19 MAIKHGSKVDKSFSASSMTDLMFLLLLFLLIATTLINPNALKLMLPKSSNQLKDKAMTTV SIQDAGHGKYRYYVELQEVGSIEGVERALKTRLDGQKDATVSLHCDETVAVGETVKVMNI AKDNNYKLILATAPN >gi|313159627|gb|AENZ01000004.1| GENE 17 20909 - 21598 777 229 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159661|gb|EFR59019.1| ## NR: gi|313159661|gb|EFR59019.1| TonB family C-terminal domain protein [Alistipes sp. 
HGB5] # 1 229 1 229 229 371 100.0 1e-101 MYYYDPNNRNPRRWATVATAVYALLLIGSFALVSFDFRQIHDKPGDTITIDFTEPPAPEP PKPRVRTATEPRVHDRTAPVEQTAQVSGKDETTQTPNPRALFNMNKGGADEPDNAGNPRA PEGEDRASGTGPGLNPDGLDQLDQGLQGRGLVGDLPKPSYPGSKSGKVIVRVTVDASGRV TSAAYEPKGSTTDAAELVEAAKAAARKARFTESRAAVQGGTITYVFRME >gi|313159627|gb|AENZ01000004.1| GENE 18 21617 - 22069 646 150 aa, chain + ## HITS:1 COG:no KEGG:Palpr_0285 NR:ns ## KEGG: Palpr_0285 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 1 136 1 126 360 86 38.0 4e-16 MRFLLLTLFGCLTFAAGRAQQPAKIEGAHLHMEVASHDFGDVPRKGGDLVREFTFTNDGT VPLVVTRVITSCSCLKASYFKRPVAPGESGTISIIYEPHKSEPGVFNKVIQIYSNSVDGR DVITVQGNSIDPGPRKVKTGEVKIKYKKDK >gi|313159627|gb|AENZ01000004.1| GENE 19 22319 - 23626 1862 435 aa, chain + ## HITS:1 COG:MTH1505 KEGG:ns NR:ns ## COG: MTH1505 COG0402 # Protein_GI_number: 15679502 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Methanothermobacter thermautotrophicus # 25 410 22 403 427 328 44.0 2e-89 MAILFSNATILPMTADEGSPKTFTGFVGVAGSRIALVTESASEADAFRAAHPDARIIDCT GRLLMPGLVNTHCHAAMTLQRSYADDIALMEWLHDYIWPFEARQTADDVALGMTLGVVEM LLGGVTSFVDMYYHENRCVEVAERLGIRAMLGCNYFDSNVEEVLPEVGQAVELAACCDRV RIALAPHSPYTVSPENLRRGKETAERYGLHLMTHIAETQDEVRIVREKYGMTPVEHLDAL GMLDARTIGAHCIHLTDSDIATLAARGVAVSHNPQSNMKISSGVAPVERLRAAGALVTIG TDGTCSNNDLDMFEEVRTAAFLQKSATGDPVALPAYEALKLATVNGARALGYAEGELGVV REGALADLIVVDLQKPHLQPVHDLVSNIVYCGKASDVDTVMVDGRIVVENRRVAGVDLPA LYAAVAAAVKRITAK >gi|313159627|gb|AENZ01000004.1| GENE 20 23637 - 24086 411 149 aa, chain + ## HITS:1 COG:L69304 KEGG:ns NR:ns ## COG: L69304 COG0454 # Protein_GI_number: 15673990 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 3 146 4 148 152 146 45.0 1e-35 MRIVSVREHPEFADTAIGYISACWPEVRPVLYEDCIRHAVSAAGPLPQWYLLMSGSEPVG 
CAGLIANDFISRMDLWPWACALYVEERLRGHAYGRLLLDRAAEDARRAGFGKLYLSTDHA GLYEKWGFRYLGQGYHPWGAESRIYERTL >gi|313159627|gb|AENZ01000004.1| GENE 21 24108 - 25133 1338 341 aa, chain + ## HITS:1 COG:BS_yulF KEGG:ns NR:ns ## COG: BS_yulF COG0673 # Protein_GI_number: 16080169 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Bacillus subtilis # 5 340 2 328 328 333 50.0 4e-91 MKGKIRFGMVGTGFIADWVLAGARQDPRFEAAAICSRNQATADAFAAKHGIPHTFTSLEE MARSPLVDAVYIASPNALHASQSILCMSCGKHVLCEKPLASNAREARAMIEAARRYGVVL MEAMIATLNPNFRIVREQLPRLGTIRRYFASYCQYSSRYDKFREGVVLNAFDPSLSNGAM MDIGVYTVYPMVALFGRPQAVDAQGVVLSSGADGQGAVNFRYEGMNATVLYSKIADSRLP SEIEGEEGTLLLDTIHDIHRVTRFPRRGAASDRGPETVGESVGVEPDRDRYYYEIAEFID LIEQGRPESAVNSHANSLATLEIIDEVRRQLGVVYPADGAR >gi|313159627|gb|AENZ01000004.1| GENE 22 25406 - 27469 3283 687 aa, chain - ## HITS:1 COG:YEL011w KEGG:ns NR:ns ## COG: YEL011w COG0296 # Protein_GI_number: 6320826 # Func_class: G Carbohydrate transport and metabolism # Function: 1,4-alpha-glucan branching enzyme # Organism: Saccharomyces cerevisiae # 11 672 12 699 704 529 45.0 1e-150 MEKTTRRLPIVERDEWLLPAEQELNNRYERYMDKMNAIVQAAGSLVDYANGYRYFGWQRD ETLDGWWLREWLPGAHDVYVFGDFNNWQRTEIRMQRDRHGVWSAFFPTAMYRDRLVHGSL YKLHVHGDNGWLDRIPAYATRVVQDEETKNYTAQFWAPEPFDWRGDAFDISKNGNLLIYE AHVGMAQEKEGVGTYREFTEKILPIIKKDGYNAVQLMAIAEHPYYGSFGYHVSSFFAPAS RCGTPEELKELVRRAHELGLGVIMDLVHAHYVKNLNEGINELDGTDHHYSLPGKAGYQPY WDSMLFDYGKDEVQHFLLSNVKYWLDEFHFDGYRFDGVTSMIYHHHGYVDFDCRERFFDA GVNGDALTYLTLANRLVHDFRAGDVTIAEDVSGMPGMCIPDTDGGIGFDYRLGMAIPDFW IKQLKEVPDEEWNIWEMWNVMTDRLPEVKTVAYAESHDQALVGDKTLAFRLMDKEMYFNM DRASQSVVIDRGMALHKMIRLMTISTGGQAYLNFMGNEFGHPEWIDFPREGNGWSYAHAR RQWSLAGNGFLRYAWLGDFDKAMIKLVKRYKVLADGYAWNLVMDECNKTMAFAHGDLLFV FNWHPSASIPDYELPVQAPGKYVPLLSTDERRFGGQERQAMDGEHFSFPAQDGDNTERPH IRIYNTSRTATVYLRTDGRQPSAVTGE >gi|313159627|gb|AENZ01000004.1| GENE 23 27567 - 29117 2466 516 aa, chain + ## HITS:1 COG:XF1891 KEGG:ns NR:ns ## COG: XF1891 COG3104 # Protein_GI_number: 15838489 # Func_class: E Amino 
acid transport and metabolism # Function: Dipeptide/tripeptide permease # Organism: Xylella fastidiosa 9a5c # 3 473 28 458 510 155 29.0 2e-37 MKGQPQGLIAAALANMGERFGFYIMMAILTLFISAKFGLDEATAGYIYSGFYASIYLLAL VGGIIADKTKNYKGTIMWGLVVMSLGYLLIAIPTPTPVPSLPLYLTLTCLGLLVIAFGNG LFKGNLQALVGQMYDNKEFSAKRESGFQIFYMFINIGGFFAPFVAIGVRNWWLKVNNFDY NAKLPELCHEYLAKGQDMAGEGLANLTSYANSAYLDGSPVGDLHAFCNSYLDVFNRGFHY AFIAAIVMMLVSLMIYMSNKKRFPDPSQKAAAKGGKASAAEITMSAQEIRQRIYALFAVF GVVIFFWVSFHQNGYSLTYFARDYVNLDVINIDLGFTVIKGAEIFQSVNPFFVVTLTPLI MWFFGWLRKRNVEVSTPMKIAIGMGIAATAYLFLMFFSLALPGKAALAGMKADEMQSILV TPWVMVGLYFILTVAELFISPLGLAFVSKVAPPHMQGLMQGFWLAATAVGNALLFVGGWL YLHTPMWATWFVFVAACGLSMLVMLSMVRWLERVTR >gi|313159627|gb|AENZ01000004.1| GENE 24 29338 - 29874 644 178 aa, chain + ## HITS:1 COG:no KEGG:BF0456 NR:ns ## KEGG: BF0456 # Name: not_defined # Def: two-component system sensor histidine kinase # Organism: B.fragilis # Pathway: not_defined # 10 176 9 167 671 134 44.0 2e-30 MRIPSSLAAFAVCILMAFSHAFPMRAGERALQTPSQPADSALTAYYKLCKAHRADAEAPA MCDTLFRRAEAARNVRMQAIALCVRLDHFYYKNDRAEILEGVRRVQEFCRRHPKEDLRYF YYFVWSSRLITYYIKQNQSNTAIYETRKMLAEAEDDDYPEGVASCYRMLGNLYLTQGA >gi|313159627|gb|AENZ01000004.1| GENE 25 29878 - 31350 2118 490 aa, chain + ## HITS:1 COG:sll1228_2 KEGG:ns NR:ns ## COG: sll1228_2 COG0642 # Protein_GI_number: 16330678 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 267 478 1 225 267 110 35.0 4e-24 MAYDNFRRQIEVLEHNGIDDINLPTQYASLAQCALELDMPDSAFVALQKAASLPKRTTYQ EFTVNKGFGLYYIRTENFAEAKKRLEASEELFRRDPSLRFHTAGLSYLRTAYFKASGQYG KALETILETQRDTVIRSSGFNNYALTKELGDVYWHLREMERAAANYREYIRLSDSVRNRE IRTATDDFSGILEISRLHNETKELQYDLQRKRLRNTYLIICLLAGVLVTGGVGYARMMKL NRRLKASEATVLAQNEHLRISGEELLEAKEQAEQASRMKTEFIQHMSHEVRTPLNSIVGF SQVLASEFRDKPATGEYASIIEANSATLLRLFDDVLEVAYLDQTGDLPRCDVTALNNLCN DCVESTLPELHPGVSLLDELPVSDPVVRTNLKRIEQVLLHLLRNAAKFTLSGHITLTYDC LPAERLLRFSVTDTGPGIPADLREEVFGRFVKLDPFSQGTGLGLPICRLIAVKLGGGVCS SIRSMRPAAA >gi|313159627|gb|AENZ01000004.1| GENE 26 31621 
- 32253 981 210 aa, chain + ## HITS:1 COG:no KEGG:BF3965 NR:ns ## KEGG: BF3965 # Name: not_defined # Def: putative TetR transcriptional regulator # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 6 203 15 208 214 98 33.0 2e-19 MTPPQKERIIDQAMQMFATQGIKSVRMDDIAQHLGVSKRTLYELFGDKEGLLYLAMERYF QRDRQRWTELTANARNVLEAMFMVLAQVMDKAEVSSRMMDNLKKFYPAVHDKLTREGMEK NRRSLRGMLDQGIVDGLFVDNINIDLAISVLYYTASALVTRKELMLPAGMTEREAFVQII SNFFRGISTAKGLRLIDDNLKRYELTKYGK >gi|313159627|gb|AENZ01000004.1| GENE 27 32275 - 33612 1885 445 aa, chain + ## HITS:1 COG:PA0427 KEGG:ns NR:ns ## COG: PA0427 COG1538 # Protein_GI_number: 15595624 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Pseudomonas aeruginosa # 26 442 66 464 485 78 25.0 3e-14 MKKLILTVCLAATGVFGAAAQMRLTLGEALDLALSENPTVKVAEMEVQRFDYVKRQTWGS WIPQISVGGTYTRSIVKQSMTKGLSFGADNTLAAQGDATWTLFAPAVFRTLKMNDVQRAA AVESARSSRITLVAEVKKAFYNILLAEQSLEVLRESQATVQRTVDDTRLQYDNGLASEYD LLTAQVQLSNLKPSILQTENSIRLAKLMLKMYLSIPENVEIEVVGELDAMRDAVLAGTDG LTTDVQDNSDLRTLELQEDLLKRQLKVANAGRMPTVTAFFTATYTGNDIDMTKLNFGGAE GGQPGDVLDGSKFWWQHPMSAGLQVSIPIFSGLTKMNRSREIKNQISQIGLQRDYARQQI DVEVRAALNDLLTARETMYAQELTVAQARKAYSISDTRYRAGAGTILELNSAQLAQTQAE LNFSQAIFDYLTAKAEYDRIVGKEN >gi|313159627|gb|AENZ01000004.1| GENE 28 33640 - 34650 1630 336 aa, chain + ## HITS:1 COG:aq_698 KEGG:ns NR:ns ## COG: aq_698 COG0845 # Protein_GI_number: 15606100 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Aquifex aeolicus # 60 331 71 372 374 101 25.0 3e-21 MKNLTVMLIAAAALAGCTGKKTADVQQETRVLTKTETAVSAPVQLAEEFTSEIEPYKQND ITPAASGVHIDRILVEVGDPVRQGQLIVTLDPTQYTQQLVQLKTVEDDYNRLLPVYEAGG ISAQQIEQAKAQLDVQREVVANLKKNIEVHSPITGVVTARNYESGDLFAAQPILHIMQID PLKVVANISEQYFRNVKVGMPVDLKVDIFPGETFPGTVSLIYPALDPATRTFKVEVKVPN AKRTLRPGMFARTGFNMGEKEGVMVPDVAVQKQVGTAERFLYVIKGDSVAERRRVEVGRQ VGDRVDILSGVAPGERVAVTALSKLFDGAKVEVKQN >gi|313159627|gb|AENZ01000004.1| GENE 29 34665 - 
37805 4731 1046 aa, chain + ## HITS:1 COG:SMa0875 KEGG:ns NR:ns ## COG: SMa0875 COG0841 # Protein_GI_number: 16262933 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 7 1028 7 1027 1065 501 31.0 1e-141 MKIYESAVRKPVSTVLLFVGVMVFGLFSLMNLAVDQYPEIEIPQISVITMYPGANAAEIE TNITRVLEDNLNTVSNLKKLTSKSQDNVSMITVEFEYGSDLNEGANEIRDVVSRVQSMLP DDIDYPTIFKFSTSMIPVMMIAVTAEESYPALSKLLDDKLVNVLNRVDGVGAVSVIGAPE REVQVNVDPAKLDAYNLTVEQLGQIIAAENVNIPSGTIDIGNNTFNIKADGEFKLSDELR KVVVSNAGGRTVMLSDVAEIRDTLEKFTMDERVNGQRGVRVMFQKQSGANTVNIVHEIQK RLPEIQKTLPKDVKMELIFEGSQEITDAIGSLSETIMYAFIFVVLVVMIFLGRWRATLII CMTIPVSLICSFIYLFATGSTLNIISLSSLSIAIGMVVDDAIVVLENITTHIERGSSPKE AAIYATNEVWLSVIATTLVVVAVFLPLTMVPGMAGILFRELGWIVTIVVCVSTTAAISLT PMMSAYLLKLDGGVHDYKGVGIVYKPIDRALEWLDDIYARSLNWVVRHRRITIFSMMGLF VVSLGLITRVPTEFFPPSDNSRISAMVELEQNIGVEYTARIARQIDSIIYAKYPEIVLVS ASAGANSSDNAFAAMQTTGSHIINYNMRLTDVEGRERSIYVVSDLLREDLDRIPEVRQYT VTPGGMSGSMSGSATVNVKVFGYDMDVTNAIANDLKEKMRGMKGVRDVKLSRDDLRPEYN VVFDRDRLSYYGMNSATASQAVKNRIDGLVASKYREDGDEYDIVVRYGEPFRTSVDDVEN ITLYNAHGRPVKLKEVGRVQEEYAAPMIERENRQRVISVQSTLGAGVALGDVVAEVEKLI SEYHIPDGVDLEVGGTVEDQGDAFGDLGVLFILIVILVYIVMATQFESLLFPFIIMFTIP FAFTGVFLALWMTSTPLSLIALIGAIMLVGIVTKNGIVMVDYMNLLVERGSGVFDAVIAG GKSRLRPVLMTSFTTILGMLPLAIGTGAGSETWQPMGIAVIGGLTCSTLLTLFIIPALYS VFVNRSQRKEKEKLARLSAEHQASVH >gi|313159627|gb|AENZ01000004.1| GENE 30 37822 - 38115 461 97 aa, chain + ## HITS:1 COG:no KEGG:BVU_0236 NR:ns ## KEGG: BVU_0236 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 96 1 96 97 126 57.0 3e-28 MKAVFLSYNQALTDRVNAILDEQGIRGFTRWALTEGRGSVDGEPHYGTHAWPSMNASIMA IVDDEKVAPLMAAFREMDAATKLQGSRAFVWNIEQTY >gi|313159627|gb|AENZ01000004.1| GENE 31 38536 - 39321 1019 261 aa, chain + ## HITS:1 COG:BB0533 KEGG:ns NR:ns ## COG: BB0533 COG1235 # Protein_GI_number: 15594878 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily I # Organism: Borrelia 
burgdorferi # 4 259 4 253 253 188 38.0 1e-47 MKLTFLGTGTSQGVPVIGCRCRVCTSSDRRDDRLRTSAMVETQGVRIVIDAGPDFRCQML RTGVRRIDAILLTHEHKDHIGGLDDVRAFNFVDYPPTIHRIDLYAAPHTLDVVRKDFDYA FAQDKYRGVPEIELHEIDVTRPFSVKGVEILPVSGHHSERFAVTGFRIGRLAYLTDFKTI ADAEVEKLTGLDVLVVNALRFAEHYSHFNVAEALELIARVSPREAYLTHMSHDIGLHAET EPTLPPHVHMAYDTLEIEIND >gi|313159627|gb|AENZ01000004.1| GENE 32 39323 - 40138 245 271 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 9 231 7 220 259 99 30 5e-20 MKNFKDKVVIVTGASSGIGEAMAREFAAQGARVVLGARSVQKLQLIAGEIRSQGGQAAYC GVDVTNVDECRRLIETAVNEFGGIDVLVCNAGLSMRAIFDDVDLGVLHRLMDVNFWGTVN CCKFALPYLQQSHGSIVGISSVAGLHGLPGRTGYSASKYAMTGFLETLRIENLKKGLHVM IACPGFTASNVRFSALTADGSAQGETPRNEAKMMTSAEVARIVARGVLKRKRLCLMESEG RATHFVKKFAPGFLDRMFYLVMSKEPDSPLK >gi|313159627|gb|AENZ01000004.1| GENE 33 40251 - 41087 994 278 aa, chain - ## HITS:1 COG:Cj0578c KEGG:ns NR:ns ## COG: Cj0578c COG0805 # Protein_GI_number: 15791938 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # Organism: Campylobacter jejuni # 17 262 8 233 245 119 31.0 5e-27 MSGQKRSDTDEMTFAEHIDALRPHLVRGVLAILALAVAAFLCKGLLIDGVLFGPMSAGFP TNRLLVWMAGLAGIEFVPGIERMQLINTAMAGQFNLHLKISMVAAFCIGFPYLLWELWRF ARPALTERELRGCRRFVFYVSAGFFAGLLFGYFVIAPLTIGFLSQYSVSEQVTNMIDVGS YLSTVLNVSIACAVVFQLPLLVYFLTRMGLVSPDFLRRYRRHALVALAILSAVITPPDVF SMILVLLPLFGLYEYSIRISERTLRRSSGTEGTTTEGK >gi|313159627|gb|AENZ01000004.1| GENE 34 41094 - 41336 464 80 aa, chain - ## HITS:1 COG:DR0805 KEGG:ns NR:ns ## COG: DR0805 COG1826 # Protein_GI_number: 15805831 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway components # Organism: Deinococcus radiodurans # 12 56 29 73 132 57 53.0 5e-09 MMFIHPLFLGKIGLTEILVILLIVVIFFGGKKIPELMRGMGRGVREFKDAMENPAAGDEK NGKEQAAPQESRKETPEEKK >gi|313159627|gb|AENZ01000004.1| GENE 35 41693 - 42634 1432 313 aa, chain 
- ## HITS:1 COG:Cj0918c KEGG:ns NR:ns ## COG: Cj0918c COG0462 # Protein_GI_number: 15792247 # Func_class: F Nucleotide transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoribosylpyrophosphate synthetase # Organism: Campylobacter jejuni # 7 311 5 308 309 299 52.0 4e-81 MAIHKIKIFSGRGSEYLAEKIAASFGTTLGQSEVLRFSDGEFQPCYNESIRGCTVFIIQS TFPPSDNLMELLMMIDAARRASAYKVVAVIPYFGWARQDRKDRPRVPIGAKLVANLLMAA GVDRVMTMDLHADQIQGFFDVPVDALYASGIFIPYIESLKIEDLSIAAPDMGGAKRANTY AKLLGTPIIISHKERAKANVVGKMTAIGEVEGRNILIVDDMIDTAGTICMAADMLMSRGA KSVRAAITHPVLSGPAYERINDSALEEVIVTDTIPLNPDKDLHKFTVLTIADMFADVIER VHNYKEISSIFFK >gi|313159627|gb|AENZ01000004.1| GENE 36 42718 - 43404 940 228 aa, chain - ## HITS:1 COG:FN2013 KEGG:ns NR:ns ## COG: FN2013 COG0218 # Protein_GI_number: 19705309 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Fusobacterium nucleatum # 32 227 1 193 194 145 42.0 5e-35 MLKCYLCSRHPLRQTAARRPARTVFAANSDNMQINKAEFKCSSERISQVPKDALKDIAFI GRSNVGKSSLINMLTGRQGLAKVSGTPGKTRLINHFLIDNAWYLVDLPGYGYARTSKTKR SEFSKIITDYVLKCEKMHFLFVLADLRLEPQKIDLQFIEMLGMNGIPFAVIFTKADKLSK TQREKSVERYRAALAEQWEELPRMFVSSSQHGTGREEILGFIGECLNV >gi|313159627|gb|AENZ01000004.1| GENE 37 43586 - 44338 1057 250 aa, chain - ## HITS:1 COG:SMb20092 KEGG:ns NR:ns ## COG: SMb20092 COG3568 # Protein_GI_number: 16263840 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Sinorhizobium meliloti # 28 248 12 245 252 77 31.0 3e-14 MKQYFSVALAVLFCALMPAACTTQTTAVKFMSYNIRNGRGADDVQDLGRIAEVIGRVAPD VVALQEVDSVTGRMNGRFIPEELGRMTGMHARFCRAIDYDGGGYGIGLLSRAEPLSVRRI PLPGREEARVLLMAEFPGYVVCVTHLSLSPEDQRASLPIIRQATDTCRKPVLLAGDFNMD DAEEVLGGLGGEFRPLSDTAQLTFPSDRPSIRIDYILGRGLPQSAKIAERTVDYTTVASD HCPLWVSLVW >gi|313159627|gb|AENZ01000004.1| GENE 38 44711 - 45760 1523 349 aa, chain + ## HITS:1 COG:mlr8141 KEGG:ns NR:ns ## COG: mlr8141 COG3049 # Protein_GI_number: 13476735 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and 
related amidases # Organism: Mesorhizobium loti # 13 346 14 348 350 365 55.0 1e-101 MLMKTQFLTVFGAAALAVFPAEPEACTRAVYLGPDGMTVTGRTMDWREDPLTNLYIFPRG TARRGANTDDTVFWTSKYGSLSAAGYDIGITDGMNEAGLVANLLFLPESVYERPGDTRPV MGLSIWTQYVLDNFATVDEAVAELSKEKFRIDAPDLPNGVQSRLHLAVSDPSGDSAIFEY IDGRLVIHHGRQYQVMTNSPFYDQQLAILDYWRQIGGLTMLPGTNRAPDRFVRASFYINA VVKSPDPKIAVPAVMSVMRNVSVPYGISTPDKPHISSTRWRTVCDQKNRVYYFEPTLAME TFRVDLAKIDFGKGTPERVLKLVGGRTYTGDATAEFRRSDKPFVFLFGV >gi|313159627|gb|AENZ01000004.1| GENE 39 45857 - 46505 -223 216 aa, chain + ## HITS:1 COG:no KEGG:cce_1072 NR:ns ## KEGG: cce_1072 # Name: avaIR # Def: type II site-specific deoxyribonuclease # Organism: Cyanothece_ATCC51142 # Pathway: not_defined # 5 216 13 216 331 172 46.0 9e-42 MIEMDKYQYIQCAEDLITSREQTRAGFIEAARAKNYKAQPYIEQARTLKSLSSQAASPSD LLNIEEIRNSLLTASGLSDKAFKYFTEEDKTEAIRILISEFLAPAGENFVDELENRFLLI KGDSLGGSMRNYVGSAAQVKLVRKILSILSMQQIAFQILFKDDKKNNEWQTLSYEDVFER VDDVTAIYWNIVPQDKSYDRVLFFNATIPLVKNNID Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:08:17 2011 Seq name: gi|313159625|gb|AENZ01000005.1| Alistipes sp. HGB5 contig00077, whole genome shotgun sequence Length of sequence - 703 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 15 - 702 735 ## COG0021 Transketolase Predicted protein(s) >gi|313159625|gb|AENZ01000005.1| GENE 1 15 - 702 735 229 aa, chain + ## HITS:1 COG:PM1242 KEGG:ns NR:ns ## COG: PM1242 COG0021 # Protein_GI_number: 15603107 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Pasteurella multocida # 2 228 19 251 668 210 47.0 2e-54 MVEKAKSGHPGGAMGGADFMTVLYTEFLRYDPDNTSYPYRDRFFLDPGHMSPMLYSTLSL AGYYTTEDLQSLRQWGSVTPGHPEADFARGVENTSGPLGQGHAMALGAAIAERFMAARFG EWMAHKTYAYISDGAVEEEISQGVGRIAGHLGLSNFIMYYDSNNIQLSTKVDEVMTENVA MKYEAWGWNVLSVDGHNINEIREALVAANSETERPTLIIGRTVMGKGAK Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:08:20 2011 Seq name: gi|313159611|gb|AENZ01000006.1| Alistipes sp. HGB5 contig00007, whole genome shotgun sequence Length of sequence - 9982 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 8, operones - 6 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 2 - 439 250 ## gi|313159624|gb|EFR58984.1| hypothetical protein HMPREF9720_0265 2 1 Op 2 . + CDS 456 - 1445 1141 ## COG4227 Antirestriction protein + Term 1473 - 1508 6.3 3 2 Op 1 . + CDS 1514 - 2344 497 ## BDI_3903 hypothetical protein 4 2 Op 2 . + CDS 2356 - 3405 949 ## gi|313159622|gb|EFR58982.1| hypothetical protein HMPREF9720_0268 + Term 3433 - 3469 5.0 5 3 Op 1 . + CDS 3475 - 4017 568 ## gi|291513806|emb|CBK63016.1| hypothetical protein AL1_03360 6 3 Op 2 . + CDS 4014 - 4256 135 ## 7 3 Op 3 . + CDS 4213 - 4578 359 ## Arad_15013 YagA-like protein 8 4 Op 1 . + CDS 4693 - 5052 289 ## gi|313159615|gb|EFR58975.1| hypothetical protein HMPREF9720_0270 9 4 Op 2 . + CDS 5071 - 5361 290 ## Bacsa_1125 hypothetical protein + Prom 5530 - 5589 3.7 10 5 Op 1 . + CDS 5628 - 6008 277 ## Coch_1598 hypothetical protein 11 5 Op 2 . + CDS 5963 - 6670 607 ## Coch_1599 hypothetical protein + Term 6767 - 6797 -0.9 + Prom 6717 - 6776 4.4 12 5 Op 3 . 
+ CDS 6838 - 7116 190 ## Palpr_0257 histone family protein DNA-binding protein + Term 7123 - 7171 16.6 - Term 7104 - 7160 4.0 13 6 Tu 1 . - CDS 7184 - 7840 718 ## COG0176 Transaldolase - Prom 7892 - 7951 4.1 + Prom 8157 - 8216 3.8 14 7 Tu 1 . + CDS 8275 - 8496 380 ## Bache_1018 hypothetical protein + Term 8518 - 8569 18.0 15 8 Op 1 . - CDS 8880 - 9224 117 ## Mmol_0134 hypothetical protein 16 8 Op 2 . - CDS 9265 - 9405 96 ## 17 8 Op 3 . - CDS 9413 - 9889 390 ## BVU_3726 mobilization protein BmgA Predicted protein(s) >gi|313159611|gb|AENZ01000006.1| GENE 1 2 - 439 250 145 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159624|gb|EFR58984.1| ## NR: gi|313159624|gb|EFR58984.1| hypothetical protein HMPREF9720_0265 [Alistipes sp. HGB5] # 3 145 1 143 143 239 100.0 4e-62 TVMNKISEIPEQESIPENPAVETSADPWRCEECGSLEVSYRTWVDSNTGQVAPAAPEQDD LWCDGCEEHTYQIRESELMSDTVEPWWNDGTTEEDREIITGLNPENFSPKDDRKAFRDAC DMWWNGRTNDEKIRLWRQATAPEEE >gi|313159611|gb|AENZ01000006.1| GENE 2 456 - 1445 1141 329 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 13 328 223 520 522 106 28.0 8e-23 MNKNLIEKIAPQLTELMIKKMETLTEEWRKPWIADLAHGLPRNLRGTHYRGGNILMLLFL SEIAGYRTPLFMTFKQAKEEGLNILKGSGSFPVFFWKLYIRHKETRKKIELAEYYRLPQE QRRQYDVLPVMRYYPVFNIDQTDMQERHPERYSSLTTPTGPKDYSDGLACEPLDRMLMEQ SWLCPILLKSGDRASYSPTLDRIVCPEKRQFPEGAAFYTTLLHEVTHSTGHAERLNRSFG ACYGDADYIREELVAELTAALCGAMLGFATTPREESAAYIKDWLAEFHKEPTYLFDILTD VNRSARMISERLAVEQEPETPDAIPSEAA >gi|313159611|gb|AENZ01000006.1| GENE 3 1514 - 2344 497 276 aa, chain + ## HITS:1 COG:no KEGG:BDI_3903 NR:ns ## KEGG: BDI_3903 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 268 64 320 320 155 34.0 1e-36 MQQSKTEDRYSCETLLPMNTLYDHEHHLTQEDVDMANKLVRHIEHTRNPRIPQVGDRVRY TTRHGDFHSNALIEAVREDGMRSICLCPYVPFVWATAGGIGCAVSGGPFTAMMPQELKPS 
GAVPGDFCAWGHCGACGNGVVRFSAEVPLWEFREGDPLYGDFTTEKWRKIRLYKDAECRN GDLYRGDCISFRTEKEFQMFLSDYEGTAFAAPGPKSVIVWCYRDEQTAVSQEKWDALDAP VTERRIYNAPQPVKLVKDHGRHTTVCYFVRPEFSYK >gi|313159611|gb|AENZ01000006.1| GENE 4 2356 - 3405 949 349 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159622|gb|EFR58982.1| ## NR: gi|313159622|gb|EFR58982.1| hypothetical protein HMPREF9720_0268 [Alistipes sp. HGB5] # 1 349 1 349 349 678 100.0 0 MTIKNKETLEKLRTYQIGVIRSVIASRQGRLGTYSADATDRERLADTDGLAYEEYYESPL TIRYDGLYRNVISLEFCEDGLTPVCLMNGGDDFPLPLENLSCDTLQGIVEWLEEYDFIPS AEAAPQTTSDLTADFVECSLPEYPTRYDVFRLGELQNFLDGHESPEFGLDCDEAAAERDR LQLRIYAEAITAFMKRQPAALPGNVSLRDYAEALVDIAYEAGRRKFRPSDNFRDTAATLI AWGDEFSRQHGDANRTEKEYLDEIYRFTDEKLRSVPSKNDPNAEFDIAFLNRAALERHGY DTATITDGDLQELAGRMGDYYCESSKFSEDLRTACGDFGLKLRDTTNPK >gi|313159611|gb|AENZ01000006.1| GENE 5 3475 - 4017 568 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513806|emb|CBK63016.1| ## NR: gi|291513806|emb|CBK63016.1| hypothetical protein AL1_03360 [Alistipes shahii WAL 8301] # 18 180 1 164 164 279 94.0 7e-74 MSYQIITRITITPDLRVMVRMAANNIRPLDFRYDEVVSLTETLRTKGRPTLELELLSLFF KGLWQGRTRYDRAVGYTLLTDGIDKYEAWERCRGDKEYERGLLLRMRGFLHYRPVPCRCH LEYQRSPVRRIYVGYISFSRQRRRIFPSVLDAQAALFAKGWNPDKFQIVEEETNPKSEIQ >gi|313159611|gb|AENZ01000006.1| GENE 6 4014 - 4256 135 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTYRINKNAARLAQGTGFAPELIYNISLVRFQGRNGRCIAAWTPGMKRPPPVCLQGPHP GRIRQGYGAYPAGGRALPAP >gi|313159611|gb|AENZ01000006.1| GENE 7 4213 - 4578 359 121 aa, chain + ## HITS:1 COG:no KEGG:Arad_15013 NR:ns ## KEGG: Arad_15013 # Name: not_defined # Def: YagA-like protein # Organism: A.radiobacter # Pathway: not_defined # 10 121 66 184 184 75 39.0 5e-13 MERIRQEAERFRRHDEAVARSSEEFRRSLRVGDILYSSWGWEQTNIDFYQVIAIRGSAVD LRQLDQRTTEDGYMCGTTVPLPDVFKGKTHTHRLSKNYIRIDSYRTAWKWDGQPLRCSWY A >gi|313159611|gb|AENZ01000006.1| GENE 8 4693 - 5052 289 119 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159615|gb|EFR58975.1| ## NR: gi|313159615|gb|EFR58975.1| hypothetical protein HMPREF9720_0270 [Alistipes sp. 
HGB5] # 1 119 1 119 119 249 100.0 4e-65 MNDPHWTEGLLRPVMAEIVRLTPEIDWENNDEFYPIDLRGAITVFGRTKRGRPVCITFTE SGHDLQFDSGQIHNSFSLKVLKDIGGTNNIMESVGDGEPLLHYIRQRMLFLEQHPGMGK >gi|313159611|gb|AENZ01000006.1| GENE 9 5071 - 5361 290 96 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_1125 NR:ns ## KEGG: Bacsa_1125 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 96 1 97 97 142 69.0 5e-33 MKIMCQEHYDKVVQYAESIGDSTLRECLERLERREQNPHHPCQIELYRDFAPYSFLFKER YPDGSLGVVGGLVYHGCPDRSCCFIDRPFHGWATHT >gi|313159611|gb|AENZ01000006.1| GENE 10 5628 - 6008 277 126 aa, chain + ## HITS:1 COG:no KEGG:Coch_1598 NR:ns ## KEGG: Coch_1598 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 21 124 2 104 106 117 51.0 1e-25 MVYSRLIRIFVKRNELRLMTIFDDYIRNKGCCKVSKTLLWDYDLTQFDWQRSRKVVVQRI IERGWLRDYFAAFDLYGGIEGFREIIKEVPTLSAQDMNFVCTAFGLKKEELRCYTRRQLR RRHLGC >gi|313159611|gb|AENZ01000006.1| GENE 11 5963 - 6670 607 235 aa, chain + ## HITS:1 COG:no KEGG:Coch_1599 NR:ns ## KEGG: Coch_1599 # Name: not_defined # Def: hypothetical protein # Organism: C.ochracea # Pathway: not_defined # 1 213 1 213 213 188 50.0 2e-46 MLHTQTVAPQTLGLLKQLEAEPRLAAFNLAGGTALALYLGHRVSVDLDLFTPESFDAGEL EAFLSQRYGFQTAFRRPDTLKGMIDGVKIDCIAHKYAYLRQPYAESGIRLYSIEDIVAMK LSAIADDGSRLKDFVDIACLSTRIPFYEMLKCYERKFPQANVIRPFKALTYFDDIDFGED IVMLNFEYDWKQIARRLKEMTVRQEHIFSQWPLKEQKRPVAEDKDMTSMKRGRKR >gi|313159611|gb|AENZ01000006.1| GENE 12 6838 - 7116 190 92 aa, chain + ## HITS:1 COG:no KEGG:Palpr_0257 NR:ns ## KEGG: Palpr_0257 # Name: not_defined # Def: histone family protein DNA-binding protein # Organism: P.propionicigenes # Pathway: not_defined # 1 85 1 85 90 103 63.0 2e-21 MTKADIVKQLAQETGVEAVTVLAVVEGFMEEVRAAQIRKENVFLRGFGTFLIKHRKEKTA RNITKNTTIKIPAHDIPAFKPSPAFQRLLEKK >gi|313159611|gb|AENZ01000006.1| GENE 13 7184 - 7840 718 218 aa, chain - ## HITS:1 COG:TM0295 KEGG:ns NR:ns ## COG: TM0295 COG0176 # Protein_GI_number: 15643064 # Func_class: G Carbohydrate transport and metabolism # 
Function: Transaldolase # Organism: Thermotoga maritima # 1 212 1 208 218 243 53.0 3e-64 MKFFIDTANLEQIRKAHELGVLDGVTTNPSLMAKENIRGTENCNRHYVEICNIVEGDVSA EVIATDFEGMVREGEALAALHPRIVVKLPCTAAGIRAVKYFAAKNIRTNCTLVFSVGQAL LAAKAGATYVSPFVGRLDDISEDGVALVAHIVKVYRTYGYKTQVLAASIRHTQHIIQCLD AGADVATCPLAAIEGLLRHPLTDSGLEKFLADHARLNA >gi|313159611|gb|AENZ01000006.1| GENE 14 8275 - 8496 380 73 aa, chain + ## HITS:1 COG:no KEGG:Bache_1018 NR:ns ## KEGG: Bache_1018 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 73 19 91 91 116 93.0 3e-25 MYWTLELASKLEDAPWPATKDELIDYAVRSGAPLEVLENLQEIEDEGEIYESIEDIWPDY PSKDDFFFNEDEY >gi|313159611|gb|AENZ01000006.1| GENE 15 8880 - 9224 117 114 aa, chain - ## HITS:1 COG:no KEGG:Mmol_0134 NR:ns ## KEGG: Mmol_0134 # Name: not_defined # Def: hypothetical protein # Organism: M.mobilis # Pathway: not_defined # 11 100 4 93 116 137 65.0 1e-31 MIENTHAAESFDALFARLATSEFRRRFRLKAEDIAYIERKGMETIRRHAADFVRTRLAPA VIPNDGRQTPMRGHPVFIAQHATACCCRGCLAKWHGIPPGGNWRLKNNDISLRY >gi|313159611|gb|AENZ01000006.1| GENE 16 9265 - 9405 96 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYIHLLKAFCLTTGILSLPAAVGLAWLILFPEYPERKSPQERIKTH >gi|313159611|gb|AENZ01000006.1| GENE 17 9413 - 9889 390 158 aa, chain - ## HITS:1 COG:no KEGG:BVU_3726 NR:ns ## KEGG: BVU_3726 # Name: not_defined # Def: mobilization protein BmgA # Organism: B.vulgatus # Pathway: not_defined # 1 154 1 148 316 150 50.0 2e-35 MIGKLSKSESLRTVIEYVLKNDHSRLIACEGILVGDYHNEEWLEQMISDFEDQTLLHDPV AKAAGHISLSFHPNDAPQMTDFRMVKIAREYMQCMGIDNTQYVIVRHTNTAHPHLHIVYN RICNDGKLIPDRNERFRNVRFCKALSFKYGLTLGERQK Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:09:38 2011 Seq name: gi|313159607|gb|AENZ01000007.1| Alistipes sp. 
HGB5 contig00088, whole genome shotgun sequence Length of sequence - 2090 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 272 - 313 9.2 1 1 Op 1 . - CDS 320 - 619 396 ## gi|313159608|gb|EFR58969.1| hypothetical protein HMPREF9720_3042 2 1 Op 2 . - CDS 632 - 844 289 ## gi|313159610|gb|EFR58971.1| conserved hypothetical protein 3 1 Op 3 . - CDS 851 - 1504 588 ## BF1784 hypothetical protein 4 1 Op 4 . - CDS 1501 - 1701 229 ## - Prom 1905 - 1964 5.4 Predicted protein(s) >gi|313159607|gb|AENZ01000007.1| GENE 1 320 - 619 396 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159608|gb|EFR58969.1| ## NR: gi|313159608|gb|EFR58969.1| hypothetical protein HMPREF9720_3042 [Alistipes sp. HGB5] # 1 99 1 99 99 181 100.0 2e-44 MNNLCFNRFTLRTDDAKTRRKILRWLALNYRLYEYIPSEAEIRGRFISRKPFPAEALRRQ IGKLRGDRTLFLRIVSTDLYSLYVEGNVYLLGAWHRIFL >gi|313159607|gb|AENZ01000007.1| GENE 2 632 - 844 289 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159610|gb|EFR58971.1| ## NR: gi|313159610|gb|EFR58971.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 70 1 70 70 121 100.0 2e-26 MERNIIIENICTACRCGERRAEEYLTAELRNLRELRDAGALCYGDLETACSGLGLDFDYT DYFCQALSLN >gi|313159607|gb|AENZ01000007.1| GENE 3 851 - 1504 588 217 aa, chain - ## HITS:1 COG:no KEGG:BF1784 NR:ns ## KEGG: BF1784 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 20 215 20 212 212 88 31.0 2e-16 MRKWIDQSTFDELLLANAEDNIERAIRSASAVLEAVQEFVPEAALFYRVTHAADSLKNYF EEACTGFWRNGILFLVRQYFYPKPYYDFRFDWSFFRYADNYSYGKAFDKQTAPNRIGVFT KKKIDDWVEYLTQGFRNLERIDAENERKMTGYRNRLEALSDVVWVHDKSHGQIIRNGLTY TFDIRQTDYSEKISLDYRCRTLDDFLALSDNKFTPKP >gi|313159607|gb|AENZ01000007.1| GENE 4 1501 - 1701 229 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNQKFEIAVAGYDYRFKTYARDGVEASVKVKCFLGRPDAECTILIPTLNDDLRETESCR RQVAQR Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:10:05 2011 Seq name: gi|313159588|gb|AENZ01000008.1| Alistipes sp. HGB5 contig00027, whole genome shotgun sequence Length of sequence - 21530 bp Number of predicted genes - 17, with homology - 17 Number of transcription units - 12, operones - 5 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1282 1505 ## COG0021 Transketolase + Term 1368 - 1414 8.0 + Prom 1376 - 1435 4.8 2 2 Op 1 . + CDS 1492 - 1983 741 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 3 2 Op 2 . + CDS 1995 - 2579 934 ## gi|313159597|gb|EFR58959.1| hypothetical protein HMPREF9720_1137 + Term 2630 - 2673 11.6 - Term 2618 - 2659 5.6 4 3 Tu 1 . - CDS 2697 - 3140 624 ## BT_0641 hypothetical protein - Prom 3329 - 3388 6.3 + Prom 3336 - 3395 11.8 5 4 Tu 1 . + CDS 3464 - 4768 1951 ## COG0192 S-adenosylmethionine synthetase + Term 4788 - 4826 8.4 + Prom 4784 - 4843 3.1 6 5 Op 1 1/0.000 + CDS 4944 - 6416 2006 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain 7 5 Op 2 . 
+ CDS 6471 - 7331 1574 ## COG0568 DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) + Term 7371 - 7412 7.4 - Term 7260 - 7311 -0.3 8 6 Tu 1 . - CDS 7528 - 8421 1247 ## COG1575 1,4-dihydroxy-2-naphthoate octaprenyltransferase 9 7 Tu 1 . - CDS 8622 - 9284 1031 ## COG0692 Uracil DNA glycosylase - Prom 9304 - 9363 2.2 10 8 Tu 1 . + CDS 10120 - 11559 1374 ## Pedsa_3099 hypothetical protein + Term 11670 - 11711 7.7 - Term 11656 - 11699 8.1 11 9 Op 1 . - CDS 11934 - 14120 2180 ## BF3975 putative transcriptional regulator 12 9 Op 2 . - CDS 14156 - 14530 656 ## COG3682 Predicted transcriptional regulator - Prom 14554 - 14613 5.0 13 10 Op 1 . - CDS 14702 - 15451 1169 ## COG0220 Predicted S-adenosylmethionine-dependent methyltransferase 14 10 Op 2 . - CDS 15457 - 15891 448 ## gi|313159606|gb|EFR58968.1| putative lipoprotein 15 11 Op 1 . - CDS 16212 - 17876 2285 ## COG1774 Uncharacterized homolog of PSP1 16 11 Op 2 . - CDS 17898 - 19259 2354 ## COG0534 Na+-driven multidrug efflux pump - Term 19322 - 19354 3.0 17 12 Tu 1 . 
- CDS 19364 - 21421 2886 ## COG0642 Signal transduction histidine kinase - Prom 21448 - 21507 5.7 Predicted protein(s) >gi|313159588|gb|AENZ01000008.1| GENE 1 2 - 1282 1505 426 aa, chain + ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 9 424 258 663 666 249 39.0 6e-66 PAGESFEDKVSTHGQPLTAAGADFAATVKNLGGDPNDPFAVFSESREAFSERREALRAWA SRQAEVEKTWRAEHGDLARKLDMFLSGRLPEIDYKSIEVKAGVATRAASAAVLGVFAETI ENMIVASADLSNSDKTDGFLKKTKAFTKGDFSGKFFQAGVSELTMACVANGMALHGGVIP ACGTFFVFSDYMKPAVRLSALMRLHVIYIWTHDSFRVGEDGPTHQPIEHEAQIRLMEHLR NHHDERSMVVLRPADGDETVMAWKLAVEEQRPVALVLSRQNIKSLPALGASRREEAAQVA KGGYVVLDSAKPEVVMVATGSEVSTLVEGAELLAAEGIAVRVVNVPSEGLFRDQPRSYQE SVLPVGVVRYGLTSGLPVNLMGLVGEKGMIHGLDHFGYSAPYTVLDEKFGYNGKTVAEEV KKLLGK >gi|313159588|gb|AENZ01000008.1| GENE 2 1492 - 1983 741 163 aa, chain + ## HITS:1 COG:PA1776 KEGG:ns NR:ns ## COG: PA1776 COG1595 # Protein_GI_number: 15596973 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 11 160 5 158 165 65 28.0 3e-11 MSLQRTKQYEFETFVREHRRIVGKVCYLYAVDSDDFDDLYQEVLINLWRGFDGFEGRAKV SSWVYRVALNTCISYYRRNRRHTGRLPLTDSLGAADEDPERGERLRDLYALINRLDALEK AVVMLWLDELPYEEIAAITGFTRNNVASKLHRIKLKLREQANH >gi|313159588|gb|AENZ01000008.1| GENE 3 1995 - 2579 934 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159597|gb|EFR58959.1| ## NR: gi|313159597|gb|EFR58959.1| hypothetical protein HMPREF9720_1137 [Alistipes sp. 
HGB5] # 1 194 1 194 194 353 100.0 5e-96 MELDELKTKWHELDGRLSCADTKIDRLTAEVSAGKITSAKQRLARTIRLGIFLLAVLPLC FANIFRTGEADPGTAVKATLVLFIFAMLARQIALLVLLARIEPGAGTVRETCAAVLRFRT CFLWGVGAGIVLGVPLLILLGFYVGSLTSPYVFYGFVAGLIVGLPLGVRIFLRMMGDINA LRAALRETEGDVEN >gi|313159588|gb|AENZ01000008.1| GENE 4 2697 - 3140 624 147 aa, chain - ## HITS:1 COG:no KEGG:BT_0641 NR:ns ## KEGG: BT_0641 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 138 4 112 133 75 35.0 9e-13 MRPTFTIFIPVEFNKPYMKRFLLAACMLLAAVGFSSCENDDDDLYDTLTGRVWAGDLGFY QDGYALDSYVYFGADGFGSDELRYADNGRLLDTLNIQWDAYDDTVYIDYGRVDLPRELRR VHIRRGMLTADLYIGGRYYDRITLYMQ >gi|313159588|gb|AENZ01000008.1| GENE 5 3464 - 4768 1951 434 aa, chain + ## HITS:1 COG:SA1608 KEGG:ns NR:ns ## COG: SA1608 COG0192 # Protein_GI_number: 15927364 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylmethionine synthetase # Organism: Staphylococcus aureus N315 # 4 429 7 393 398 390 48.0 1e-108 MGFLFTSESVSEGHPDKVSDQISDAILDEFLRRDANSKVACETLCTTGLVVVAGEVRSEA YVDVQGVARRVIDRIGYTKSEYQFDCNSCGILSAIHEQSSDINQGVVRAEEDEQGAGDQG IMFGYACNETREMMPATLILSHVILKELAVIRREGEVMTYLRPDSKSQVTIEYDEKTNRP LRVHTIVVSTQHDEFILPGNGLTEKEAEERMQATIREDVRTILIPRVKARLERAGDKLAG LIGDDYILHVNPTGKFVIGGPHGDTGLTGRKIIVDTYGGRGAHGGGAFSGKDSSKVDRSA AYAARHIAKNLVAAGVADEVLVELSYAIGIAQPLSIYVDTYRSPRPAALEGMTDGEIARR IGKLFDLRPAAIVRRFGLKNPIFEATASYGHFGNRPYTERVKVWENGRETEREIEFFGWE KLDAVELIRKEFGL >gi|313159588|gb|AENZ01000008.1| GENE 6 4944 - 6416 2006 490 aa, chain + ## HITS:1 COG:PA0766 KEGG:ns NR:ns ## COG: PA0766 COG0265 # Protein_GI_number: 15595963 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Pseudomonas aeruginosa # 57 488 26 470 474 268 40.0 2e-71 MKKAFLILGAVAVSAAAGGLTAWAVAGGREGSVQYIEREVERTPALGTQFTSYQAEQYPD LTYAAENAVKAVVNIEAVQQVEMPRRRGYDPFLEFFGIPQGYDEGPQFREQRAGGSGVII 
SADGYVVTNNHVVDGASKLRVKLNDGRSFDAKLIGKDSATDLALLKIEASDLPTLTFGSS DALRLGEWVLAIGSPFDLQSTITAGIVSAKARNLGVIPNDFRIEAFIQTDAAVNPGNSGG ALVNTHGELVGINTLIKSQTGSYIGYSFAIPESIVRKVVVDLKEFGIVQRALLGIRFAIV DQDFIDREGKELGITELGGAYVAGVVEGGSASEAGIRKGDVILDIDGVKINDNATLSEQI GRRRPNDKVKLSVKRDGAVKQIEVTLRNKAGKTELMTKEDVDVVEVLGGKFADAGTKLCR ELDIKGGVQVVGIKADGILARARVKQGFVITHINDRPVYSVADMQRMDEKVRSIDGIYPN GRAASYTLVE >gi|313159588|gb|AENZ01000008.1| GENE 7 6471 - 7331 1574 286 aa, chain + ## HITS:1 COG:lin1491 KEGG:ns NR:ns ## COG: lin1491 COG0568 # Protein_GI_number: 16800559 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) # Organism: Listeria innocua # 21 285 108 373 374 225 46.0 6e-59 MRQLKITKSITNRESASLDKYLQEIGKEELITVEEEVELAQRIKKGDQEALEKLTKANLR FVVSVAKQYQNQGLSLPDLINEGNLGLIKAAEKFDETRGFKFISYAVWWIRQSILQALAE QSRIVRLPLNQVGSLNKINKAFARFEQEHERTPSPEELATELELPKEKVTDTLRVAGRHV SVDAPFADGEDNSLLDVLVNPDSPNADRGLINESLSTEVDRALETLTDRERDIIKYFFGI GTSEMTLEEIGEKFDLTRERVRQIKEKAIRRLRHSSRSKLLKSYLG >gi|313159588|gb|AENZ01000008.1| GENE 8 7528 - 8421 1247 297 aa, chain - ## HITS:1 COG:VNG1075G KEGG:ns NR:ns ## COG: VNG1075G COG1575 # Protein_GI_number: 15790173 # Func_class: H Coenzyme transport and metabolism # Function: 1,4-dihydroxy-2-naphthoate octaprenyltransferase # Organism: Halobacterium sp. 
NRC-1 # 3 295 2 309 311 151 36.0 2e-36 MKTQTATNSFRAWMLAARPKTLTGAAVPVMLGCALAASDGDFHLQPAVLCLLFAFLMQID ANLINDLWDYLKGSDGKDRLGPERACAQGWITPRAMRRGIALTTAAACITGCGLLFYGGW WLIAVGALCVVFAFLYTAGPYPLAYHGWGDLLVLVFFGFVPVGCTSYVLSGSWTWQTGVI SAACGLAIDTLLMVNNYRDREQDARSGKRTLVVRLGARAGSGLYLALGFAAATLCLPLLC DGRIGAAALPLLYLAPHTATWRRMVRIGRGRELNAILGATSRNILLFGILLSIGLLL >gi|313159588|gb|AENZ01000008.1| GENE 9 8622 - 9284 1031 220 aa, chain - ## HITS:1 COG:PA0750 KEGG:ns NR:ns ## COG: PA0750 COG0692 # Protein_GI_number: 15595947 # Func_class: L Replication, recombination and repair # Function: Uracil DNA glycosylase # Organism: Pseudomonas aeruginosa # 3 220 8 226 231 262 57.0 4e-70 MDVKIAPDWKELLAPEFEKPYFADLTQFVRQEYATRRIYPRGSNIFRAFDKCPFDKLKVV IIGQDPYHGPGQANGLCFSVGDGVPFPPSLQNIFKEVADDTGTPPPATGNLDRWAEQGVL LLNAVLTVRAHEAASHAGRGWETFTDAVVRAISERKQGVVYMLWGSYAQKKGAIADPQRN FILKSVHPSPLSVYRGFFGCRHFSRANEYLRSIGKEPIVW >gi|313159588|gb|AENZ01000008.1| GENE 10 10120 - 11559 1374 479 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_3099 NR:ns ## KEGG: Pedsa_3099 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 3 468 4 436 445 125 26.0 4e-27 MKKLILGCLLALFASGCNDNEPEQKPVELKISFAGENPVEFTVGEQKRLAFTVTGDDLEG LTFAAEPDEALSGWTVVVSAGREGDAYAGALDVTASATPSETAVTLTVTDRNGKPWSATT PTLTAVQAPVTAIDLSRDGTANCYIVSEAGDYMFDATVRGNGSGDDAAIALADGMKADWL WVTKGLEQEISAVSLDAGKGRIFFTAAGAAKGNAVIALADAAGEIVWSWHLWFTPEPRMV TYANGRVLLDRSLGAVGTTPGSAEAYGLYYQWGRKDPFCGGTAPETSATAFAQAAENSVV NPAFADTHAWKQESGAAVSTLEYAAAHPLSFLSNKGATGVYDWLAKPRADLWNTAKTCYD PCPVGYKVPDRDTWDDFADDQDRYVDGTSEWDGEKYGMTYIFGDLRDWYPTSGYRNRDKG NLAGLATTRTGHYWSNYRSGNTISCFYISKKLSTGKLLQPQSSSKSDAAYGYNVRCCRE >gi|313159588|gb|AENZ01000008.1| GENE 11 11934 - 14120 2180 728 aa, chain - ## HITS:1 COG:no KEGG:BF3975 NR:ns ## KEGG: BF3975 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.fragilis # Pathway: not_defined # 8 295 11 334 730 149 30.0 7e-34 MKPAIVYMLEVLVCSGVLLAAYAILLERRVRFRWCRLYLLLTTFTAALIPLLRIPVWPGR 
VVAAAPAIPPDLPVWTGEVLPDESGGFAVTPESLCLGLYLLGAVLIAGVMAWQVVRIHRL RRGAEISRTGDYTLVRTRQEIASFSFLRTIYVWDQTSAGELAAILAHESSHIAHRHSIER ILMELMKALLWWNPFAWIAARRLTEAEEFEADNDVLNSGYNRAEYMQTIFRQLFGYSPEI ANGLRNSLTKKRFKMMTMQTKSRHGLLRLAGTLPAVIGLLCAFGFTSRAAVIVAPATAPA TGTTEEPTGIRTAADAMQKQDKTCTATLSVVDKKDKRPIEGALVQAAGTPKGTVTGKDGR AVITVPEGSELKISHIGYEPRTIGVKDENPVVVSLIRNPKAGFTPTDDLPDAPQKQQQKQ QQEQVTVSVATYKDNAPLAGALVKVSGSQIGAVTNDEGLAALQVPANSALEISHIGCKPH IAQVGDKARQMFMFDMQAETAGKSETEGTTVTTTVTDKPLYIVNGIETKEIGKLDPDRIE SMSVLKDKAATALYGEKARNGVIVITLKGADMGPATSESNGKRDDAEQTAANAIAIKNGT PEKEEAFLVTETMPLFPTEDPAAPYGDLGNFRAWVQKNVKYPAEAFEKGIGGRVVLSFVV EKDGSVSDIEKLQAPDASLWEEARRVIASSPKWKPGEQRGQIVRVKYTLPVDFRLTKASD TPAPESGK >gi|313159588|gb|AENZ01000008.1| GENE 12 14156 - 14530 656 124 aa, chain - ## HITS:1 COG:Cgl0019 KEGG:ns NR:ns ## COG: Cgl0019 COG3682 # Protein_GI_number: 19551269 # Func_class: K Transcription # Function: Predicted transcriptional regulator # Organism: Corynebacterium glutamicum # 12 120 14 117 123 62 34.0 3e-10 MEKKSTELTRGEEEIMQILWRLGDAVVNEIIAQTEEPRPKYTTVATFLKILENKGFVGHT PEGKSHRYYPLVDREQYARGVMSSVLTSYFDGSLARMVSFFSRNEDISVKEMDEILEIMR NAKK >gi|313159588|gb|AENZ01000008.1| GENE 13 14702 - 15451 1169 249 aa, chain - ## HITS:1 COG:CAC2627 KEGG:ns NR:ns ## COG: CAC2627 COG0220 # Protein_GI_number: 15895885 # Func_class: R General function prediction only # Function: Predicted S-adenosylmethionine-dependent methyltransferase # Organism: Clostridium acetobutylicum # 34 216 27 205 211 115 34.0 8e-26 MGKDKLRRFRENLTFDCFVQPEFDEVFHCDHPLKGRWRSDFFRNDNPIVLELGCGKGEYT VALAEHDPSRNYIGIDIKGARMWRGAKSVTERKIANAGFLRTRVEFINGLFAENEVDEIW ITFPDPQLKSRRAKKRLTSPLFLEYYARLLKPAGWINLKTDSKHLFAYTNEVVRRYTLPC VVSNPDIYGSGYADEVLSVKTAYEKRFLEMGLPITYTRFSLAGRQEFPWFDWEEDEKEEK DNEQERRIH >gi|313159588|gb|AENZ01000008.1| GENE 14 15457 - 15891 448 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159606|gb|EFR58968.1| ## NR: gi|313159606|gb|EFR58968.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 144 1 144 144 252 100.0 5e-66 MRAACRTALAAAALLAAGCMSPHGAIATDVNSASWSDPAPLTLANADTTTLRDVNLFLRC NDRFAEDSLTVHIRVRTPDSLQHEEPFVMVIPPAHTPAAISREADIPYRRRVRFDRTGDY HLTITPCRPVEGIEAVGIHIVKSQ >gi|313159588|gb|AENZ01000008.1| GENE 15 16212 - 17876 2285 554 aa, chain - ## HITS:1 COG:BH0045 KEGG:ns NR:ns ## COG: BH0045 COG1774 # Protein_GI_number: 15612608 # Func_class: S Function unknown # Function: Uncharacterized homolog of PSP1 # Organism: Bacillus halodurans # 66 310 4 241 275 163 37.0 7e-40 MRIFTQIIPSMDTEKQHTKAFKPEIGRGCSFCNCGSECEAKLCDGCFKLHETAWLEEYPE NLPTDIVEVRFKNTRRAFYQNVNNLPLKRGDIVAVEASPGHDIGIVSLTGDLVARQMRRT GFNPFNGEFKKIYRKAKPYDIEKWQEAIELEHETMIASRQIAADMGLNMKIGDVEYQGDK IKAIFYYIADERVDFRELIKVFAERFHIRIEMKQIGARQEAGRIGGLGACGRELCCASWM SSFSSVTTGAARVQEISLNPQKLAGQCSKLKCCMMYEYDTYVDARKEFPRLREPLQAMDG EYYLVKNDILAGTMTFSSSKDAMVNVTTLSIARVKEIVSQNRAGKKVDRLQAAEDIAPAA EEPTYRSEEDSITRFDQAKRRSKRSRGKGGRQNGQDSQEAGQQAAGRQEQPDQQGRQQER RPRRNDRPRNGEGRSSEGRNNEGRNGNPERRNNGERRDGENRENRNANNGGRRGNSGSEN RTGENGDTRNGNNNEARNGNTNNGNGAERREGNSRGRNNNRNRRRPNNDRPRGEGSENGN NSGNGNNGGNGANE >gi|313159588|gb|AENZ01000008.1| GENE 16 17898 - 19259 2354 453 aa, chain - ## HITS:1 COG:YPO2392 KEGG:ns NR:ns ## COG: YPO2392 COG0534 # Protein_GI_number: 16122615 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Yersinia pestis # 7 442 4 441 457 212 32.0 2e-54 MYSFSTYKDQYKSNLKLALPVVLTQLGQILTQVADNLMVGRYGGDDPVPLAAVSFGGSVF FILFIAAIGIALGMTPLVGELYAQGDREKSAGLLQNGILFYTLLGFAMAVVQYSVIPLMY RLGQPVDVVDAAIPYYRMLVFSMPFVMLFFAFKQFLEGVGNTKVEMVVTIVANLANIGFN WVFIYGRFSLPEMGAEGAGLGTLLSRIIAPILMVGYFYSRERYRGYLDGFSPRNYSWATV KKLLHMGLPISMQMFLEASAFVGTGIMMGWFNKETMSANQIAVTIGNCAFMIVMSIGAAT TIRVSHCYGARNIGELSLAAKASYHLVLAWNAFAALVFITMRNVIPTFFTTNAEVIAIAS QLMVFAALYQLSDGIQNVSVGILRGIQDVKIIMPIAFVSYWLLNLPVGYLFGFTLGMGPS GLFLGFSFGLSAAAVMMILRIRRSIRRLYLSSN >gi|313159588|gb|AENZ01000008.1| GENE 17 19364 - 21421 2886 685 aa, chain - ## HITS:1 COG:CC0586_2 KEGG:ns NR:ns ## COG: CC0586_2 COG0642 # Protein_GI_number: 16124840 # 
Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 466 669 14 220 232 105 32.0 4e-22 MRQMLLKALASGLIMSLFPFAAISAPAGPSHADSLLRLLSKTHDAVGRERIYVQLADLSG DSLELAAPYWDAALAEARKSGDLYGCKDALDFLVRKFAGRDSQRAEKYIALADSILPGPR HALFRSSLYAYYIWKLMNDNNAVETVKHELDRLKTKIHNELSPEERIEWEFLTGLSLDFS SLATEAYDNIGKAIPYVEQALKKLEAYPLEERLHMERICRDELSELYMLSKDKRAEKQIQ QCIDLHRAWLAMDDRFERPYRDTTGYTMRAYSKMLYLRELISKEKATQYYGKCMELARAR GDLAEIYSTSARYYQYMEEYERAVAYIDSAVTVYKRNGTKADFASIYAVQSWLYEHLGDY KNALEALRESNTIRHNDRVEEAQNSLAEMQTLFEVGQLELEKSRLANRMKFIALLAGGVL LLLLVGWSVYQYVMVRRLKQIRRQLTDANQEITRQSRRATESEKMKTAFINSMCHEIRTP LNAINGFSELLLDDTLDAHTRREFREQIWTNTTALTTLLENMLELSSLVCSEAPLPQTDT DIGLLCAERLEYQKRQSNNPQVEYIFKGGGEGSCIIPTNVFYMTRVIDNLLQNAAKFTAA GSVTLSCKKDDSKRRLRIRVADTGIGIDPDKREWVFERFAKVDPFKPGSGIGLYLCRLIV TRLGGEIRVCPEYRAGCCIDIVLPC Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:11:50 2011 Seq name: gi|313159403|gb|AENZ01000009.1| Alistipes sp. HGB5 contig00018, whole genome shotgun sequence Length of sequence - 220068 bp Number of predicted genes - 192, with homology - 179 Number of transcription units - 73, operones - 39 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 53 - 949 368 ## COG0338 Site-specific DNA methylase 2 2 Op 1 . + CDS 1874 - 3610 1464 ## COG0270 Site-specific DNA methylase 3 2 Op 2 . + CDS 3645 - 3914 314 ## + Term 4107 - 4144 0.0 + Prom 4084 - 4143 3.9 4 3 Tu 1 . + CDS 4168 - 4488 176 ## + Prom 4531 - 4590 4.0 5 4 Tu 1 . + CDS 4678 - 5145 270 ## + Prom 5392 - 5451 4.1 6 5 Op 1 . + CDS 5573 - 5869 192 ## 7 5 Op 2 . + CDS 5860 - 6384 496 ## 8 5 Op 3 . + CDS 6423 - 6752 176 ## 9 5 Op 4 . + CDS 6804 - 7685 267 ## COG0616 Periplasmic serine proteases (ClpP class) 10 5 Op 5 . + CDS 7697 - 9250 1551 ## Odosp_2605 hypothetical protein 11 5 Op 6 . + CDS 9275 - 9832 420 ## BF2335 hypothetical protein 12 5 Op 7 . + CDS 9834 - 10619 275 ## 13 5 Op 8 . 
+ CDS 10619 - 11302 273 ## FP0797 hypothetical protein 14 5 Op 9 . + CDS 11280 - 12221 535 ## gi|313159442|gb|EFR58805.1| hypothetical protein HMPREF9720_0689 15 5 Op 10 . + CDS 12313 - 13962 1020 ## FP0793 hypothetical protein + Term 14144 - 14184 3.0 + Prom 13996 - 14055 2.4 16 6 Op 1 . + CDS 14237 - 14479 267 ## BF1503 hypothetical protein 17 6 Op 2 . + CDS 14547 - 15365 344 ## BF1504 hypothetical protein + Term 15386 - 15426 7.0 + Prom 15632 - 15691 2.7 18 7 Op 1 . + CDS 15813 - 16085 114 ## gi|313159575|gb|EFR58938.1| conserved hypothetical protein 19 7 Op 2 . + CDS 16082 - 17446 783 ## FP0787 hypothetical protein 20 7 Op 3 . + CDS 17451 - 18389 331 ## BF2432 hypothetical protein 21 7 Op 4 . + CDS 18389 - 19270 350 ## FP0785 hypothetical protein 22 7 Op 5 . + CDS 19254 - 19685 253 ## + Term 19706 - 19766 10.8 + Prom 19692 - 19751 8.0 23 8 Op 1 . + CDS 19785 - 20021 158 ## Poras_0316 hypothetical protein 24 8 Op 2 . + CDS 20029 - 20442 406 ## BVU_3457 hypothetical protein + Prom 20573 - 20632 5.7 25 9 Tu 1 . + CDS 20656 - 21084 283 ## Sph21_3507 hypothetical protein + Term 21183 - 21220 1.4 + Prom 21089 - 21148 4.4 26 10 Op 1 . + CDS 21343 - 21543 141 ## + Term 21565 - 21598 6.1 27 10 Op 2 . + CDS 21608 - 21967 256 ## 28 10 Op 3 . + CDS 21972 - 26243 3199 ## COG5283 Phage-related tail protein 29 10 Op 4 . + CDS 26240 - 28021 508 ## gi|313159484|gb|EFR58847.1| hypothetical protein HMPREF9720_0701 30 10 Op 5 . + CDS 28080 - 29228 311 ## gi|313159520|gb|EFR58883.1| hypothetical protein HMPREF9720_0702 31 10 Op 6 . + CDS 29283 - 30083 129 ## gi|313159408|gb|EFR58771.1| hypothetical protein HMPREF9720_0703 32 10 Op 7 . + CDS 30096 - 31208 971 ## gi|313159503|gb|EFR58866.1| hypothetical protein HMPREF9720_0704 33 10 Op 8 . + CDS 31213 - 31818 511 ## gi|313159417|gb|EFR58780.1| hypothetical protein HMPREF9720_0705 34 10 Op 9 . + CDS 31832 - 32164 341 ## gi|313159446|gb|EFR58809.1| conserved hypothetical protein 35 10 Op 10 . 
+ CDS 32161 - 32808 91 ## gi|313159419|gb|EFR58782.1| hypothetical protein HMPREF9720_0707 36 11 Tu 1 . - CDS 32774 - 32962 141 ## - Prom 33008 - 33067 2.3 + Prom 32932 - 32991 3.6 37 12 Op 1 . + CDS 33015 - 33569 688 ## BF2340 hypothetical protein 38 12 Op 2 . + CDS 33594 - 33806 133 ## gi|313159462|gb|EFR58825.1| hypothetical protein HMPREF9720_0709 39 12 Op 3 . + CDS 33821 - 34321 589 ## Coch_0633 mannosyl-glycoproteinendo-beta-N-acetylglucosami dase 40 12 Op 4 . + CDS 34318 - 34932 431 ## Bache_2807 hypothetical protein 41 12 Op 5 . + CDS 34929 - 35402 448 ## + Term 35408 - 35440 5.4 - Term 35382 - 35420 -0.5 42 13 Op 1 . - CDS 35460 - 36176 166 ## gi|313159487|gb|EFR58850.1| hypothetical protein HMPREF9720_0712 43 13 Op 2 . - CDS 36241 - 36837 169 ## Halhy_5893 integrase family protein - Prom 37017 - 37076 3.5 - TRNA 37578 - 37650 80.6 # Met CAT 0 0 - Term 37684 - 37726 1.1 44 14 Op 1 . - CDS 37876 - 38784 551 ## PROTEIN SUPPORTED gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 45 14 Op 2 . - CDS 38811 - 40559 2337 ## COG0247 Fe-S oxidoreductase 46 14 Op 3 . - CDS 40595 - 40930 448 ## Odosp_0420 hypothetical protein 47 14 Op 4 2/0.000 - CDS 40934 - 41938 1285 ## COG1148 Heterodisulfide reductase, subunit A and related polyferredoxins 48 14 Op 5 7/0.000 - CDS 41938 - 43026 1505 ## COG2048 Heterodisulfide reductase, subunit B 49 14 Op 6 . - CDS 43030 - 43728 1058 ## COG1150 Heterodisulfide reductase, subunit C - Prom 43870 - 43929 2.8 - Term 43878 - 43907 1.4 50 15 Tu 1 . - CDS 43945 - 44430 787 ## COG0262 Dihydrofolate reductase - Prom 44597 - 44656 2.8 51 16 Op 1 . + CDS 44913 - 45398 671 ## COG0591 Na+/proline symporter 52 16 Op 2 . + CDS 45423 - 45941 509 ## Phep_3587 Na+/solute symporter 53 16 Op 3 . + CDS 46010 - 46315 327 ## gi|313159554|gb|EFR58917.1| hypothetical protein HMPREF9720_0726 54 16 Op 4 . + CDS 46319 - 48601 2758 ## Sph21_1996 fumarate reductase/succinate dehydrogenase flavoprotein domain protein 55 16 Op 5 . 
+ CDS 48633 - 51221 2897 ## Slin_2415 hypothetical protein 56 16 Op 6 . + CDS 51218 - 53023 912 ## Phep_3589 hypothetical protein 57 16 Op 7 . + CDS 52774 - 53913 988 ## Phep_3589 hypothetical protein 58 16 Op 8 . + CDS 53920 - 55845 2235 ## Phep_3590 hypothetical protein 59 16 Op 9 . + CDS 55905 - 56342 81 ## COG1609 Transcriptional regulators 60 16 Op 10 . + CDS 56233 - 57027 825 ## BVU_2283 ATP/GTP-binding site + Term 57053 - 57091 7.5 - Term 57041 - 57079 8.3 61 17 Op 1 . - CDS 57144 - 58043 1090 ## COG0207 Thymidylate synthase 62 17 Op 2 . - CDS 58126 - 58872 798 ## COG2908 Uncharacterized protein conserved in bacteria - Term 58955 - 58985 -0.6 63 18 Op 1 . - CDS 58990 - 60132 1626 ## Odosp_2266 alkyl hydroperoxide reductase/thiol specific antioxidant/Mal allergen 64 18 Op 2 . - CDS 60196 - 60882 938 ## Sph21_2930 hypothetical protein - Prom 61021 - 61080 2.4 + Prom 60833 - 60892 4.5 65 19 Tu 1 . + CDS 61076 - 62215 1715 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 62220 - 62263 10.4 - Term 62193 - 62260 15.2 66 20 Op 1 . - CDS 62274 - 62942 905 ## COG0546 Predicted phosphatases 67 20 Op 2 . - CDS 62939 - 64399 2114 ## COG1757 Na+/H+ antiporter 68 20 Op 3 . - CDS 64463 - 65560 1251 ## BT_0445 endoglucanase E 69 20 Op 4 . - CDS 65557 - 66876 1827 ## BT_0447 sialic acid-specific 9-O-acetylesterase 70 20 Op 5 . - CDS 66890 - 68749 2130 ## BT_0434 hypothetical protein 71 20 Op 6 . - CDS 68760 - 70190 1634 ## Cpin_5275 hypothetical protein 72 20 Op 7 1/0.000 - CDS 70208 - 71371 1875 ## COG2942 N-acyl-D-glucosamine 2-epimerase 73 20 Op 8 . - CDS 71393 - 72787 1681 ## COG0477 Permeases of the major facilitator superfamily 74 20 Op 9 . - CDS 72797 - 73894 1586 ## FB2170_00105 hypothetical protein 75 20 Op 10 . - CDS 73901 - 74836 1594 ## Pedsa_0025 hypothetical protein 76 20 Op 11 . - CDS 74860 - 76260 1772 ## BT_0446 hypothetical protein 77 20 Op 12 . 
- CDS 76273 - 76938 998 ## COG2755 Lysophospholipase L1 and related esterases 78 20 Op 13 . - CDS 76941 - 78326 2107 ## BT_0449 putative S-layer related protein 79 20 Op 14 . - CDS 78391 - 80052 1896 ## BT_0450 hypothetical protein 80 20 Op 15 . - CDS 80074 - 81738 2135 ## BT_0451 hypothetical protein 81 20 Op 16 . - CDS 81757 - 84633 3145 ## BT_0452 hypothetical protein 82 20 Op 17 . - CDS 84641 - 84949 288 ## BT_0452 hypothetical protein 83 20 Op 18 . - CDS 84998 - 86308 1923 ## BT_0447 sialic acid-specific 9-O-acetylesterase 84 20 Op 19 . - CDS 86324 - 87352 1024 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase - Prom 87460 - 87519 8.4 + Prom 87720 - 87779 3.9 85 21 Op 1 . + CDS 87810 - 88022 435 ## Bacsa_2647 hypothetical protein 86 21 Op 2 31/0.000 + CDS 88033 - 89604 2474 ## COG1271 Cytochrome bd-type quinol oxidase, subunit 1 87 21 Op 3 . + CDS 89611 - 90762 1975 ## COG1294 Cytochrome bd-type quinol oxidase, subunit 2 + Term 90766 - 90813 12.1 - Term 90759 - 90796 9.1 88 22 Tu 1 . - CDS 90824 - 91693 1408 ## COG0623 Enoyl-[acyl-carrier-protein] reductase (NADH) - Prom 91843 - 91902 4.5 + Prom 91653 - 91712 7.2 89 23 Tu 1 . + CDS 91876 - 93843 2971 ## COG0513 Superfamily II DNA and RNA helicases + Term 93855 - 93890 7.4 - Term 93947 - 93977 -0.9 90 24 Tu 1 . - CDS 94139 - 94699 135 ## Ajs_3985 hypothetical protein - Prom 94723 - 94782 4.9 + Prom 95058 - 95117 2.4 91 25 Tu 1 . + CDS 95255 - 96268 1030 ## gi|313159463|gb|EFR58826.1| hypothetical protein HMPREF9720_0765 + Term 96309 - 96343 4.3 - TRNA 96354 - 96441 61.1 # Ser TGA 0 0 + Prom 96483 - 96542 2.8 92 26 Op 1 . + CDS 96610 - 96870 389 ## COG0776 Bacterial nucleoid DNA-binding protein 93 26 Op 2 . + CDS 96902 - 98098 1805 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase + Term 98122 - 98158 8.0 + Prom 98214 - 98273 5.4 94 27 Tu 1 . 
+ CDS 98325 - 99776 2043 ## COG2067 Long-chain fatty acid transport protein + Term 99800 - 99840 9.1 + TRNA 99837 - 99911 46.2 # Pseudo CCT 0 0 + Prom 99837 - 99896 77.5 95 28 Tu 1 . + CDS 100124 - 101329 1640 ## COG1760 L-serine deaminase + Term 101574 - 101606 -0.1 - Term 101839 - 101878 9.1 96 29 Op 1 58/0.000 - CDS 101905 - 106164 6256 ## COG0086 DNA-directed RNA polymerase, beta' subunit/160 kD subunit 97 29 Op 2 28/0.000 - CDS 106201 - 110061 2963 ## PROTEIN SUPPORTED gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 - Prom 110122 - 110181 2.2 - Term 110229 - 110274 11.5 98 30 Op 1 47/0.000 - CDS 110306 - 110683 441 ## PROTEIN SUPPORTED gi|86131816|ref|ZP_01050413.1| putative 50S ribosomal protein L7/L12 99 30 Op 2 43/0.000 - CDS 110725 - 111249 501 ## PROTEIN SUPPORTED gi|120436717|ref|YP_862403.1| 50S ribosomal protein L10 100 30 Op 3 55/0.000 - CDS 111267 - 111965 950 ## PROTEIN SUPPORTED gi|150008877|ref|YP_001303620.1| 50S ribosomal protein L1 101 30 Op 4 45/0.000 - CDS 111972 - 112412 625 ## PROTEIN SUPPORTED gi|53715481|ref|YP_101473.1| 50S ribosomal protein L11 102 30 Op 5 . - CDS 112429 - 112989 809 ## COG0250 Transcription antiterminator 103 30 Op 6 . - CDS 113004 - 113201 277 ## gi|313159420|gb|EFR58783.1| preprotein translocase, SecE subunit - Prom 113264 - 113323 4.0 - TRNA 113221 - 113294 62.8 # Trp CCA 0 0 - Term 113289 - 113327 10.0 104 31 Tu 1 . - CDS 113343 - 114530 1391 ## PROTEIN SUPPORTED gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 - Prom 114747 - 114806 3.0 - TRNA 114660 - 114735 72.2 # Gly TCC 0 0 - TRNA 114830 - 114915 63.7 # Tyr GTA 0 0 - TRNA 115041 - 115114 86.1 # Thr TGT 0 0 + Prom 115342 - 115401 6.3 105 32 Tu 1 . + CDS 115453 - 115974 229 ## PROTEIN SUPPORTED gi|229235658|ref|ZP_04360081.1| acetyltransferase, ribosomal protein N-acetylase - Term 115967 - 116009 6.3 106 33 Tu 1 . - CDS 116082 - 116810 879 ## BT_2320 transcription regulator - Prom 116983 - 117042 10.4 - Term 117199 - 117236 9.4 107 34 Tu 1 . 
- CDS 117281 - 118138 1169 ## Bacsa_3078 hypothetical protein - Prom 118182 - 118241 2.6 - Term 118380 - 118423 9.2 108 35 Tu 1 . - CDS 118444 - 118710 344 ## PROTEIN SUPPORTED gi|150004229|ref|YP_001298973.1| 50S ribosomal protein L31 type B - Prom 118743 - 118802 2.6 + Prom 118911 - 118970 4.0 109 36 Tu 1 . + CDS 119027 - 119854 1312 ## GFO_2348 S1/P1 endonuclease family protein (EC:3.1.30.-) + Term 120038 - 120087 7.3 + TRNA 119962 - 120035 65.5 # Met CAT 0 0 110 37 Op 1 . + CDS 120335 - 120703 559 ## Plut_0531 hypothetical protein + Term 120724 - 120758 7.4 111 37 Op 2 . + CDS 120770 - 121267 360 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Term 121303 - 121344 14.1 112 38 Op 1 . - CDS 121367 - 122500 1622 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 113 38 Op 2 . - CDS 122591 - 123304 1081 ## COG0545 FKBP-type peptidyl-prolyl cis-trans isomerases 1 114 38 Op 3 . - CDS 123306 - 124685 2143 ## COG3004 Na+/H+ antiporter 115 38 Op 4 . - CDS 124686 - 125747 1290 ## COG0010 Arginase/agmatinase/formimionoglutamate hydrolase, arginase family - Prom 125767 - 125826 1.9 + Prom 125715 - 125774 3.6 116 39 Op 1 . + CDS 125821 - 126816 1104 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 117 39 Op 2 . + CDS 126875 - 128152 1619 ## COG0167 Dihydroorotate dehydrogenase 118 39 Op 3 . + CDS 128169 - 129410 1562 ## COG1058 Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA 119 39 Op 4 . + CDS 129423 - 129971 256 ## PROTEIN SUPPORTED gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase + Term 129986 - 130026 0.3 120 40 Op 1 . + CDS 130241 - 130507 280 ## Bache_0368 integral membrane sensor signal transduction histidine kinase 121 40 Op 2 . + CDS 130375 - 131958 1847 ## COG0642 Signal transduction histidine kinase + Prom 131965 - 132024 3.9 122 41 Op 1 . 
+ CDS 132086 - 132319 199 ## PROTEIN SUPPORTED gi|227396859|ref|ZP_03880178.1| LSU ribosomal protein L28P 123 41 Op 2 . + CDS 132336 - 132518 278 ## PROTEIN SUPPORTED gi|228474099|ref|ZP_04058840.1| ribosomal protein L33 124 41 Op 3 . + CDS 132531 - 132689 393 ## gi|313159457|gb|EFR58820.1| conserved hypothetical protein + Term 132718 - 132749 3.2 125 42 Tu 1 . + CDS 132932 - 134941 2172 ## COG1442 Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases + Term 134971 - 135007 8.7 - Term 134942 - 135013 21.6 126 43 Tu 1 . - CDS 135022 - 135930 1212 ## Lbys_1555 hypothetical protein - Prom 135997 - 136056 3.6 + Prom 135910 - 135969 6.0 127 44 Tu 1 . + CDS 136013 - 137023 719 ## PROTEIN SUPPORTED gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 + Term 137102 - 137143 9.0 + Prom 137250 - 137309 6.2 128 45 Op 1 . + CDS 137380 - 140598 3889 ## Dfer_2245 TonB-dependent receptor 129 45 Op 2 . + CDS 140610 - 142241 2257 ## Celly_2444 RagB/SusD domain-containing protein 130 45 Op 3 . + CDS 142267 - 143091 1262 ## RB4127 hypothetical protein 131 45 Op 4 . + CDS 143123 - 143968 1442 ## gi|313159427|gb|EFR58790.1| putative lipoprotein + Term 143990 - 144029 10.1 132 46 Op 1 . + CDS 144072 - 145370 1106 ## PROTEIN SUPPORTED gi|229200236|ref|ZP_04326798.1| SSU ribosomal protein S12P methylthiotransferase 133 46 Op 2 . + CDS 145435 - 145728 402 ## gi|313159453|gb|EFR58816.1| conserved hypothetical protein 134 46 Op 3 . + CDS 145825 - 146124 526 ## gi|313159494|gb|EFR58857.1| hypothetical protein HMPREF9720_0819 + Prom 146368 - 146427 5.8 135 47 Tu 1 . + CDS 146447 - 147994 2600 ## COG1418 Predicted HD superfamily hydrolase + Term 148047 - 148082 5.1 - Term 147984 - 148038 2.3 136 48 Op 1 . - CDS 148112 - 148843 988 ## BF2504 hypothetical protein 137 48 Op 2 12/0.000 - CDS 148846 - 149343 613 ## COG3610 Uncharacterized conserved protein 138 48 Op 3 . - CDS 149340 - 150152 926 ## COG2966 Uncharacterized conserved protein - Prom 150208 - 150267 1.8 139 49 Tu 1 . 
+ CDS 150179 - 150844 866 ## BVU_3025 hypothetical protein + Term 150928 - 150974 8.0 + Prom 150874 - 150933 4.1 140 50 Op 1 6/0.000 + CDS 151083 - 151709 1016 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 151733 - 151768 3.0 141 50 Op 2 . + CDS 151817 - 152818 1575 ## COG3712 Fe2+-dicitrate sensor, membrane component 142 50 Op 3 . + CDS 152863 - 156414 5934 ## Slin_4979 TonB-dependent receptor plug 143 50 Op 4 . + CDS 156481 - 158262 2848 ## Slin_4978 RagB/SusD domain protein + Term 158290 - 158328 9.3 144 50 Op 5 . + CDS 158355 - 159152 1334 ## ZPR_1186 S1/P1 endonuclease family protein + Term 159176 - 159217 13.1 - Term 159163 - 159204 13.1 145 51 Op 1 . - CDS 159254 - 160183 1289 ## COG0584 Glycerophosphoryl diester phosphodiesterase 146 51 Op 2 . - CDS 160204 - 163923 4522 ## Fluta_0605 PKD domain-containing protein 147 51 Op 3 . - CDS 163936 - 168177 5728 ## BT_4606 hypothetical protein 148 51 Op 4 . - CDS 168208 - 170652 3421 ## Bache_0357 fibronectin type III domain protein 149 51 Op 5 . - CDS 170687 - 173959 4677 ## BT_4606 hypothetical protein - Term 174124 - 174168 16.0 150 52 Op 1 . - CDS 174196 - 175170 1501 ## COG0167 Dihydroorotate dehydrogenase 151 52 Op 2 . - CDS 175260 - 176360 1821 ## COG0404 Glycine cleavage system T protein (aminomethyltransferase) 152 52 Op 3 . - CDS 176378 - 177586 1961 ## Slin_5553 hypothetical protein - Prom 177608 - 177667 5.4 + Prom 177572 - 177631 4.9 153 53 Op 1 . + CDS 177761 - 178021 324 ## gi|313159506|gb|EFR58869.1| conserved hypothetical protein 154 53 Op 2 . + CDS 178078 - 178437 396 ## Galf_1289 methyltransferase type 11 + Prom 178648 - 178707 4.1 155 54 Op 1 . + CDS 178728 - 181721 5041 ## BF3633 phosphoenolpyruvate synthase + Prom 181754 - 181813 6.5 156 54 Op 2 . + CDS 181865 - 183202 2054 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 183216 - 183268 14.7 - Term 183452 - 183486 0.4 157 55 Op 1 . 
- CDS 183513 - 184763 1498 ## Bache_3206 phosphoesterase PA-phosphatase related protein 158 55 Op 2 1/0.000 - CDS 184778 - 187351 3364 ## COG0474 Cation transport ATPase 159 55 Op 3 19/0.000 - CDS 187365 - 188765 2258 ## COG0772 Bacterial cell division membrane protein 160 55 Op 4 . - CDS 188788 - 190614 2858 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 161 55 Op 5 . - CDS 190611 - 191129 679 ## GFO_2960 hypothetical protein 162 55 Op 6 22/0.000 - CDS 191130 - 191984 1316 ## COG1792 Cell shape-determining protein - Prom 192006 - 192065 1.5 - Term 191993 - 192025 4.0 163 55 Op 7 . - CDS 192079 - 193098 1774 ## COG1077 Actin-like ATPase involved in cell morphogenesis - Prom 193226 - 193285 5.5 + Prom 193201 - 193260 11.4 164 56 Op 1 . + CDS 193296 - 194303 1655 ## COG0240 Glycerol-3-phosphate dehydrogenase 165 56 Op 2 . + CDS 194311 - 195642 2281 ## COG0166 Glucose-6-phosphate isomerase + Term 195661 - 195700 9.1 + Prom 195644 - 195703 3.4 166 57 Tu 1 . + CDS 195779 - 196585 570 ## gi|313159546|gb|EFR58909.1| hypothetical protein HMPREF9720_0854 - Term 196416 - 196473 2.4 167 58 Tu 1 . - CDS 196607 - 197497 1002 ## COG0613 Predicted metal-dependent phosphoesterases (PHP family) - Prom 197519 - 197578 3.9 - Term 197554 - 197598 10.5 168 59 Tu 1 . - CDS 197622 - 198347 1248 ## COG0217 Uncharacterized conserved protein - Prom 198378 - 198437 3.7 + Prom 198343 - 198402 4.7 169 60 Tu 1 . + CDS 198500 - 198877 173 ## PROTEIN SUPPORTED gi|15900839|ref|NP_345443.1| lactoylglutathione lyase 170 61 Op 1 . + CDS 198994 - 199575 855 ## COG0009 Putative translation factor (SUA5) 171 61 Op 2 . + CDS 199581 - 199976 553 ## COG0824 Predicted thioesterase - Term 199695 - 199749 4.6 172 62 Tu 1 . - CDS 199973 - 200926 1549 ## Palpr_1888 hypothetical protein + Prom 200878 - 200937 2.3 173 63 Tu 1 . 
+ CDS 200976 - 201914 1319 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 202064 - 202092 -1.0 - Term 201937 - 201964 -0.1 174 64 Op 1 . - CDS 201972 - 203381 2157 ## COG0617 tRNA nucleotidyltransferase/poly(A) polymerase 175 64 Op 2 . - CDS 203386 - 204039 936 ## COG2755 Lysophospholipase L1 and related esterases 176 64 Op 3 . - CDS 204067 - 204780 1033 ## COG0854 Pyridoxal phosphate biosynthesis protein - Prom 204934 - 204993 3.1 + Prom 204733 - 204792 4.9 177 65 Op 1 . + CDS 204961 - 205839 1063 ## COG0061 Predicted sugar kinase 178 65 Op 2 1/0.000 + CDS 205918 - 206631 1065 ## COG0020 Undecaprenyl pyrophosphate synthase 179 65 Op 3 . + CDS 206658 - 209231 4476 ## COG4775 Outer membrane protein/protective antigen OMA87 + Prom 209290 - 209349 5.5 180 66 Op 1 . + CDS 209415 - 209921 913 ## Sph21_5200 outer membrane chaperone Skp (OmpH) 181 66 Op 2 . + CDS 209952 - 210506 983 ## BF0502 putative outer membrane protein OmpH + Term 210541 - 210573 5.3 182 67 Tu 1 . + CDS 210690 - 211598 1360 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 211743 - 211779 8.2 - Term 211731 - 211767 8.2 183 68 Op 1 . - CDS 211788 - 214163 3195 ## BDI_0453 putative surface membrane protein 184 68 Op 2 . - CDS 214172 - 214918 1123 ## COG0778 Nitroreductase - Prom 214943 - 215002 4.0 + Prom 214975 - 215034 5.2 185 69 Op 1 . + CDS 215056 - 215862 977 ## COG0253 Diaminopimelate epimerase 186 69 Op 2 . + CDS 215884 - 216084 241 ## gi|313159472|gb|EFR58835.1| conserved domain protein 187 69 Op 3 . + CDS 216087 - 216896 1065 ## COG0596 Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) 188 70 Tu 1 . + CDS 217035 - 217313 512 ## Fisuc_0607 hypothetical protein + Term 217320 - 217352 6.3 + Prom 217320 - 217379 6.0 189 71 Op 1 . + CDS 217415 - 217957 877 ## COG0501 Zn-dependent protease with chaperone function 190 71 Op 2 . 
+ CDS 217970 - 218224 219 ## COG0501 Zn-dependent protease with chaperone function + Term 218359 - 218406 11.5 - Term 218354 - 218385 3.1 191 72 Tu 1 . - CDS 218402 - 219232 598 ## COG0266 Formamidopyrimidine-DNA glycosylase - Prom 219334 - 219393 7.2 192 73 Tu 1 . - CDS 219660 - 219869 92 ## - Prom 219970 - 220029 3.2 Predicted protein(s) >gi|313159403|gb|AENZ01000009.1| GENE 1 53 - 949 368 298 aa, chain + ## HITS:1 COG:XF0935 KEGG:ns NR:ns ## COG: XF0935 COG0338 # Protein_GI_number: 15837537 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Xylella fastidiosa 9a5c # 9 287 40 309 311 156 35.0 5e-38 MSGNNSKKIAFNYFGGKFSFIEYLYENFPPKFTHLVDLFAGSMVVSLNYRGNVIRTANEI NGEITNFFEQLREHEDELIRMLELTPCSELEYRKCWGGAEGITDLERARRFYVRVRQSFF GLGVQRQNKGWHMSKQHVNAQGGETVSRWKNGVEKLHEVAQEIRRNFQIINSDFSAAIDK LDFPEAFFYCDPPYPLESRGSKHKGGDYLFDFTDADHERLAWKLHQIKGFAMVSGYDCLL MDYLYDDWHKIKFPMKRNNIRSNIVNGSGTLMQECIWCNYEIPVRNQNLFLHETAYSI >gi|313159403|gb|AENZ01000009.1| GENE 2 1874 - 3610 1464 578 aa, chain + ## HITS:1 COG:mlr8517 KEGG:ns NR:ns ## COG: mlr8517 COG0270 # Protein_GI_number: 13477024 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Mesorhizobium loti # 8 302 29 330 667 132 30.0 2e-30 MAIKVIYIDLFCGAGGTSTGVHFARHAGDPCAKVIACVNHDANAIASHAANHPDALHYTE DIRTLELGPLAAHAARMRRQYPDAFVVLWASLECTNFSRAKGGLPRDADSRTLAEHLFRY IEALNPDYIQIENVEEFMSWGDLDERGKPVSRDAGRLYRKWIDNVRGYGYDFGHRILNAA DFGAYTSRRRFFGIFARRGLPIVFPKPTHTKNPAQGDLFGQQLRKWRPVREVLDLEDEGE SIFDRKRSLVEASLARIHAGLVKFVAGGREAFLVKYNSMNQSGKYVAPGIDEPCPTVATQ NRLGVAKVYYLCKHFGGSPEGKCTAVDAPAGAITCRDHHAFVSAYYGNGFNSSIERPSPT VTTKDRFQLVRPFFANYYSGGGQTSGTDGPAPAVMTNPKQRLVTPWIMNTNFSNVGSSLD APAQTVTANRKWQYLMNPQFASAGTATDRPCFTLIARMDKRPPRLVTAVADGKELPSFIR LEDETLVYEIYPEDTPMLVRIKEFMALYGLVDIRMRMLRIPELKRIMGFPENYVLIGTQE EQKKFIGNAVEVNMARVLCEALVAKLYELNIVPQRFAA >gi|313159403|gb|AENZ01000009.1| GENE 3 3645 - 3914 314 89 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MKSEKAKSFIDNDAIQMNNGDRMIDASTAYTAVELAEQEAEERMRAKAIGAFDDMWFENG EDGEFEPDYEYHRRNFIRKLNKEDEVQEA >gi|313159403|gb|AENZ01000009.1| GENE 4 4168 - 4488 176 106 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKQIEERGGDFAANYEILDDNGRRKNECEIARESYISGAKCEHELLTRWHDPKEPPEPGR VVLVKRNPSSIIPYDLGHIDNDGNWVDSWCGSPIDDKIIGWRKIHE >gi|313159403|gb|AENZ01000009.1| GENE 5 4678 - 5145 270 155 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEMLTNYISRYSAFGLKMEVQKRDKTGFSIIGQQLPILRPGDDVRNEITHGGHRLVPLV GCAAIVFKCSPDQVAFDKEIGVAYRLGPLHIPAIKLMYLPETGDFSACYLNGEHYPIYSY HRLYDYLDSLLIDYRGLIGEGLAVSNHAVKCDPYE >gi|313159403|gb|AENZ01000009.1| GENE 6 5573 - 5869 192 98 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKTYYDLSDEEFLRYLRLLVAQLSRLPPYRWQQIAAIAGNPEEFIEIVQSMCNDGLFDWI DDFRNRCCFIDIKDDSLIRLDPMYVRTLKTGIDKRIWK >gi|313159403|gb|AENZ01000009.1| GENE 7 5860 - 6384 496 174 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEMRLTFDVDDRIHMDYLAYLFERDTSGAYIVTARNCFGKLIIGHTQAASLPPKEACGKF AVTFILPINEATQNFQNKFIYLSAQATKQLNISLRAYFELDFWEFVQRRKALRQRKEEII EAFILSRKLFSAEYFESLHKRAYRRELKELEALKRKLTRRAYYIESLVDEPKKV >gi|313159403|gb|AENZ01000009.1| GENE 8 6423 - 6752 176 109 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLKIITSIALKPAACPAAEFRRVPLISPFGNLKTATTREDAGELLTITLTATLRSDDAFL HEPAIVRVKWRGGSLVFGSKDIPALLTLTEEETLVATCKYQTSIEAVKG >gi|313159403|gb|AENZ01000009.1| GENE 9 6804 - 7685 267 293 aa, chain + ## HITS:1 COG:HI1541 KEGG:ns NR:ns ## COG: HI1541 COG0616 # Protein_GI_number: 16273441 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Haemophilus influenzae # 51 285 303 532 615 73 27.0 4e-13 MKRLITPENNVKLALELRRGRWLVHDISTLIPTAISLLQQQDILLQAAEREFDIILYDVD ARAVSANAARSGDPCVAVIPISGTITKYDSCGTIGTATYAQALLAAAADTNVVACVLDID SGGGNSTAIPLMLDAIRKFKATGKPLLAHADFCASAAYWIAAQCDAVYCDNAITSEVGSI GAYTYFIDDREALEKSGQKVHEIYAEESSDKNLAYRQALTGEYTLIRMRLSHLVAAFHAD VKAGRQALRADAPGVLTGATFFADKAIENGLADGIATLQECVDHAFIRASIHS 
>gi|313159403|gb|AENZ01000009.1| GENE 10 7697 - 9250 1551 517 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2605 NR:ns ## KEGG: Odosp_2605 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 166 482 1 312 320 96 25.0 3e-18 MKLNWNFISKIFARTVGVENLSRDAEGHEVLSADQRKILEQKFGPEALQHYDAYAASDAN DDAQQEQLLLNFLDAIGRSGGDDKDKLRQELAKAEEKIASMSALMTILQQEKEKLAKARE DKPAATEVFGRSEGRTFTIDAKASHNRLALEALATGQLPTFSAATIDVDDLKKELGTYSS QGNSLELMQDIYRGFTSAKFMTPKRAIETYKAVRSDYTSVVQEFSAKWTPSGDARFTAIK IQNYRHKINFAIVPADVANSWLLSLYNERLSPDQMPITRYIVQKILLPSILQDIEMKMIG KGKYKAKENPTDAGKPEESMNGIETLLVEAAKSGDKGINFYPNAKDLRTATDAEVVEYID DFAHKILAKYQSLKMNIFLSADLYVKYKRGYKDKWGAGSGTENPDFGKDRVDFTNFSLQV LDCLYGSPIIFCTPKSNFIMLQNLNQPQVITDIQKVDYEVRYYGEFWLGVGFAFGEMLFA SVPADYDPQAAISDDGSTDFWKKTNAAAEVEDTHEGA >gi|313159403|gb|AENZ01000009.1| GENE 11 9275 - 9832 420 185 aa, chain + ## HITS:1 COG:no KEGG:BF2335 NR:ns ## KEGG: BF2335 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 19 183 22 183 272 80 31.0 3e-14 MYNPVSILKTDEGAGCPRPKNPTIILVPVEFVAAEPTREIGDIVMKSDLELIAEKKAIGI YATPSTIECTEESEGDPDARGVKVGIAFEHPGDSAEIAGFTEYARNRGFIALNRDCANGN TAEYRGSKCNPLFLTTEYTNNKDARKRKLTFKQEQRDLIGASIYKGKMPELADEATSAPT EEEGA >gi|313159403|gb|AENZ01000009.1| GENE 12 9834 - 10619 275 261 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGNNNTARQTGAKGAPAVSATPTAEVEKRAADAGVVSPDRNTPEETPIPEPAPLSPLPEK DAAPAPASSDAEGVPADDSVIVIAAYPGTKDLLQALWQKAAPGHCTLVFVDDAPFAKQFP ALIADASIPDEFVFVPANCAPVAPVDFADLAQLKVYVRKDGSRHYAERLPMLLNKVALVE LCGTMQEDADNETLVADYAKEYRRGVRATEVSHDFGNFVTLVLRSNPCENVVIAGLCQRK FIAASAEGWNAVSPLLTKIRS >gi|313159403|gb|AENZ01000009.1| GENE 13 10619 - 11302 273 227 aa, chain + ## HITS:1 COG:no KEGG:FP0797 NR:ns ## KEGG: FP0797 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 70 215 117 278 288 81 30.0 2e-14 MRDAVQKWLESGAEVQAGLRLLSLYAKNEHLAQLVTARPDRYKSLLIDTLCKTAGTSPLT TPPPSSRPAFREQWAFLSEPDCPPELKILAADKITAYRDYVDAHRRLFDCTTLDECFATA 
ENLIKSFSENRKILYEFTYYAEHHALLGRHSIFREMQELATLRKMGPVALVARQKNLKGS IWRIKHEITRGSKPHLDIERRNRLKAKERELAAVNKMIEEYERTIRP >gi|313159403|gb|AENZ01000009.1| GENE 14 11280 - 12221 535 313 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159442|gb|EFR58805.1| ## NR: gi|313159442|gb|EFR58805.1| hypothetical protein HMPREF9720_0689 [Alistipes sp. HGB5] # 1 313 1 313 313 579 100.0 1e-164 MNELSDHKSIGENLAPEEEQRLVDLGALGWPPEDIARSINANVEQFTQEYNDPNSRVAVL IARGLLQKRALLEIRLMDEALGGNIPATTQLYKVQRDRSFEMTKLDVFGGFTDEQSYQRI MDYIDKGCEDDLSAKEQLYIDLLNLVFSLSKQHDRRNVIKFLTKPPFKLSYARAVDIYDE AVNLFYSNRRVTKEALRQKYADDLEAWSSIVAQRATCAADYAVAADMRAKAAKILRLDQP DPEQLPSSQYIKPIRLLSLDSSDVSLPPANRDALSRQIDSVLAPDAVKKRLRMEAGIEDV DFIEIINVAQEEN >gi|313159403|gb|AENZ01000009.1| GENE 15 12313 - 13962 1020 549 aa, chain + ## HITS:1 COG:no KEGG:FP0793 NR:ns ## KEGG: FP0793 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 2 533 39 552 569 384 38.0 1e-105 MGRGSAKTTDFQAERLAEIIFDMPGAPLVWVADTFSNLTSNILPGVLEGLERKGLHEGVH YVIEKEPPTYTEKEKEHLPTWLKPHFWKPFNRLVSYKRTIIFHTGTNIRFGSLDRPSTLA GSSYVYVFGDEVKYFKEEKIANLMKAVRGYSVQYGNSPFYRGYSFTTDMPDVSHIGEYDW ILKGASAMNTERLTLVMQAGLVYNQALHEYVAAKLEWQRTRTPEAEREYKNKLRTARLWR SRWVELRRLPETATFFMLASSYINVDILTAEWFDDAFAEQFSDYKAAILSMKPTLEGGER FYANLGERHFYYDGIDEEAYERFGLLEQEDSRVLRYCNPQRTLSLGVDFGNMCSMCVAQE EPRILRVLKFIYTLSPESIRQLADKFVTYFAHHRNKYVMLYYDRAGNNYRKMGADTASQL KHAIEFDSNEHPTGWSVQLMSIGQGNIGQGDEYIFMQELLAGHNKYLPQVLIDAYQCKPL KASLENARTRVSKGKICKDKRSEHLPVAELPMRSTNPSDSFKYLMMTPPLIQIVKGSPRS DIIYEPTFG >gi|313159403|gb|AENZ01000009.1| GENE 16 14237 - 14479 267 80 aa, chain + ## HITS:1 COG:no KEGG:BF1503 NR:ns ## KEGG: BF1503 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 76 1 76 80 87 53.0 2e-16 MKLYEVISFNRELLRRLSSVGIRLDDCKYIDLYEDYRKLREGGGKMTYIAAVLAEKYGVC ERQVYLIIARFEKDCTDISV >gi|313159403|gb|AENZ01000009.1| GENE 17 14547 - 15365 344 272 aa, chain + ## HITS:1 COG:no KEGG:BF1504 NR:ns ## KEGG: BF1504 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 271 7 276 282 323 57.0 4e-87 MKNYTSAPLPFMGQKKRFIREFRKALREFDHATVFVDLFGGSGLLSHVTKRERPDARVIY NDFDDYHVRLENIKRTNALLHDIRSIVGDYPTAKRLTPQMRTTILDTVRSAEKTGYVDYI TLSSSLLFSSKYVTDYTELQNAGLYNNLRASDYTCEGYLDGIEVVHADYRELFNQYKDIP GVVFLVDPPYLSTEVGVYKCRWRLSDYLDVLTLLSSTSYFYFTSNKSSIIELCEWISEAK VNANPFLNAVRKEMGAQLNYNSRYTDIMIYRR >gi|313159403|gb|AENZ01000009.1| GENE 18 15813 - 16085 114 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159575|gb|EFR58938.1| ## NR: gi|313159575|gb|EFR58938.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 90 1 90 90 175 100.0 1e-42 MDLWDAIKEMRRLSAEGVPFGFTFMSYDATARVSKGVIEVRHARLLKREKQENHRDAEFV EAYLDLDTCQARRFYQPLLMSFNGQKVVLQ >gi|313159403|gb|AENZ01000009.1| GENE 19 16082 - 17446 783 454 aa, chain + ## HITS:1 COG:no KEGG:FP0787 NR:ns ## KEGG: FP0787 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 39 454 32 452 456 269 35.0 3e-70 MNKTGKATQISDFSYALPVDDCVYTISTAQANNLDTLLWQAERENWEYMPQYVGGQKIVP YGNNNRLPVQIRDLMDENNLAPGILARQKGLLYGEGPFLRSLRFENGEITKEFKDDREIM AWLKDWDYLKYIDAAMTDYLYLKGFFDIKLLERRGRISGQRPRIAALEFVSAKNARLEWA ATRRLEDVRHIFVGNFENDCIDTGIQTYPVFDSSNPARYPASAAYNSSYSFARDFYSIPE YWGTLRWIMRGSEVPAIFKYVTDNGFNAAYHVHSPDGYWEKKRTYIRKNNPAWSDKQVEE EIGKLTATMLTKLTEVLSGAKNAGKFFHTVDVFDPLSQQTNIWKVEAIDQKIKDFIESQL KVMEAASSAITSGLGLHASLSNIMVAGKLASGSEMLYAYKLFMMSNTAKPTYDILEPINQ AIRFNFPDTDLQLDFYHSKPLTESETSPNDRIKN >gi|313159403|gb|AENZ01000009.1| GENE 20 17451 - 18389 331 312 aa, chain + ## HITS:1 COG:no KEGG:BF2432 NR:ns ## KEGG: BF2432 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 298 7 303 317 91 28.0 4e-17 MIFNKTKKGSAELYNLTGTWYKANDFTGISEDIVLAQNEVIKLIGKATFDRAHSRYMTDE YDPEVSSDDPEDMLVRRVQLPVAYKAMHHFYQRNLVSHEDSGRKVKISENEKLPWAWQIE KDDAVLRDTFFRTLDELYLFLEQTDIKEWKDSPLRTQQQQSILRTLDQFESIYPLDGSFY TFYTLIPFILEVQQRFVRPIAGDRYMSLLSDIDSDTALAARRFVALKAMVIAVQRLSVSV 
FPIGISQRFTDSFQGKGAGKTPSTDALKFYLSALDHQAATALEEFHEALSATVEKYSLLP DNDPRNKFFSVQ >gi|313159403|gb|AENZ01000009.1| GENE 21 18389 - 19270 350 293 aa, chain + ## HITS:1 COG:no KEGG:FP0785 NR:ns ## KEGG: FP0785 # Name: not_defined # Def: hypothetical protein # Organism: F.psychrophilum # Pathway: not_defined # 1 286 1 280 293 71 22.0 5e-11 MNTIEIPAIGVCREIPSKWSEMTPEQARVTMRLLWDMESGHISPLEFHVRVLYLLLGIKR TWRSVMWEKLNRAAAEQKNANIFLLCENLLGWLFTDTEDGLLPTFDTIQNPLPEIHIGSH RLRGPADALQDLILVEFRNALIARDEFLNTRQPAALDRMIAFLYRQPSKQANRAGRCIVP IRHETFERDVCHARRIAPWQKQTILMWFSACVKFLQTGTIVVAGEQIDMRTIFSSEETAH DTGPKFTLTDVAYELARDRALGTLNDIDEEGLYTIFQILHHNVKQAKRNAKTH >gi|313159403|gb|AENZ01000009.1| GENE 22 19254 - 19685 253 143 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKLINLLRYLVNMRVTENEVIPVVDDTHGTERLKSADGRQVVVSYPSLRQTGETSNSYQ DQLPAAIFVLEKAMAGQRTDKDELEQYLSMLATVDMILKTLRADTLGYNACPRLAGMTIK VANTVPVYKVFGSWVGWMIEIEF >gi|313159403|gb|AENZ01000009.1| GENE 23 19785 - 20021 158 78 aa, chain + ## HITS:1 COG:no KEGG:Poras_0316 NR:ns ## KEGG: Poras_0316 # Name: not_defined # Def: hypothetical protein # Organism: P.asaccharolytica # Pathway: not_defined # 1 77 1 77 85 67 42.0 2e-10 MPILFYYLGLKFFFYSNDHEPVHVHVSNGECDAKFMIEPETKLIENFGLKPRELKHALMG IEENKEVIIERWKEFFDK >gi|313159403|gb|AENZ01000009.1| GENE 24 20029 - 20442 406 137 aa, chain + ## HITS:1 COG:no KEGG:BVU_3457 NR:ns ## KEGG: BVU_3457 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 4 135 5 138 138 119 44.0 5e-26 MKAQKIWFENGRIFLTTDDGRTGSLLLRAFPRLARASDEQRMKYELSRSGIHWPELDEDL SFEGFFDQPAETSDNRVAMAFAQFPEINVRQMARRMGINETLLAKYICGYSKPSEKRAKE IEAALHDLGHKLTQISI >gi|313159403|gb|AENZ01000009.1| GENE 25 20656 - 21084 283 142 aa, chain + ## HITS:1 COG:no KEGG:Sph21_3507 NR:ns ## KEGG: Sph21_3507 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 1 117 2 110 142 75 33.0 7e-13 MKKLLFLFMAVLACNYVIAQQKVYCEIVGTQKFAKSQVTVGIDFGQEDAMRTNALGGVAS 
RNKLVGDDGKPLSFNSMVDAMNYMGSLGWEFEQAYVVTMPGMGGGQNVYHWLLSKYIGEN ESGTNGFKTRGQYETETAPCMQ >gi|313159403|gb|AENZ01000009.1| GENE 26 21343 - 21543 141 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKNTKIGETCLAVSRIESEFLSKLIETVKSQSEMIARLTAERAELQRRKMQADRRRQEVF RLPTDR >gi|313159403|gb|AENZ01000009.1| GENE 27 21608 - 21967 256 119 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTVKGEYIRRTLLDESNRWLKNQNTVLATKLHSRTGRLVNERSMSVSEQGEMSATMTYQH TIEERFLDMRVLRYGSKLVRRARKIHTRFAYGHYESIASRLMYGLTDDVVAEIKQQLTD >gi|313159403|gb|AENZ01000009.1| GENE 28 21972 - 26243 3199 1423 aa, chain + ## HITS:1 COG:XF2482 KEGG:ns NR:ns ## COG: XF2482 COG5283 # Protein_GI_number: 15839072 # Func_class: S Function unknown # Function: Phage-related tail protein # Organism: Xylella fastidiosa 9a5c # 116 558 74 529 739 68 22.0 7e-11 MGKAIRDEDLRLNIIVNGDESQKRIGDLSRQTRDLANANFELRQEQRRLKAEGKENTELY RQNAAEIKKNAAIIKSNKEQMQQLRSEMKLDTLTISDLTQEQRRLRAVFNNAIPGTTEWE KYRAELKQVDARIKTLKGSARDTGNVVQRMAGGFSKFFGAITAGFASMSFAVMGTKKARA AFLEYDEALTDAQKTTSTTKTEIREVSEELKKIDTRTPLNSLLDIVRVAGKLGIEGKQNL IEFARAGDKIGVALGRDLGGNAEAAIQQIGKLVDIFHLREQYGIEQSMLKVGSAINEIGM ASTAAEGFVVDFAKRAAGTAPNVNISIQSVLGLAGTLDKLGQQAETAGTSYGQVITAMYK RTEVFAKIAKMSLGEFQKLMGEDMNEAFIRVLEGMGKSGQGMQSIVNALNSMKLDGQRSV QVLGVLAANTDELRRQQEIANHAFETGTSVIEEFNTKNESATAQYEKQKKALHEQAVLLG ETLNPAFTSTTSITVTFLKALTGLVKFLYQTKGAIIPIVAAVAAYKTIMFAAHKAMVIYR AAHIAYIAITKQATLVTQVFNSVVKAGPWGWIATAISVAIGAVTLFSDKIFKTHEQVRNM AAEAAVEIDNEKRKLNELQDAATRAASGSRERAEAIMIINERYGKYLPKLLTEKSTNEDI AIALQSVNTELEKNIKLKFRQQEAERIAGDEMTAMKKAIAEITTKYKEWGDESSLTADKQ RLIAAAVVDFTGKIKAAGNDSAKQSQAVSDLNMELRNLGLNLTGSAWNPAGRNIALQEIT TELRNTVTEGQQALSMLDTLYGKFATPLNLNTTTTTNTTPTAPDDTGKWSLEKDKEFLTA KLKLKEKYQNGEIVSASQFNEELLKLEIAALEKRLAKNIDEGATRLKIQNQLADKRLEQK KAAAAKEEEINKLLQQSETDRIKRENADYELKKKRYAGNAAALEALEKAHQRNITKIRLD EIDDRLKREEETYELQKKQLKNRHKQELLDFHGSAAERKQLRKAQAEEESRLELEHLTQL SAQLKTLIESGMFDGIQLDTQLLSAQEKQNLVNRFEDVRTAIIGVLEVLGDGQQKTFSFA GNDPTFMGLPQSDWMQFFDVLKNGAASTEEALQAVHAAMTAIGAATETALQVYTTYDNMM 
TKKENAELKKYQKNQDKKKKANEKRVKAGLMTEEQAQAEEERMAADLEAKQEEMQIKQAK RQKAMNITSAIINTATGVTKTLAEWGWPLGAIFAAIVGAMGAAQIAMIASTPITTGAEEG GQVIERMQDGKKFNARYSPDKRGFISSPTVLVSENGKEYVVPAAAMDNPSLIPVLNTIEA ARRQGTLGSFDFNAVYRQNTPVPGFVSGGPTGDIPSFDTTDTGLSSLDAGTAGKFIAAVD RLCDVLKNPILAYVTMLGENGIVAKMKEYNRTRERGQIGGKKR >gi|313159403|gb|AENZ01000009.1| GENE 29 26240 - 28021 508 593 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159484|gb|EFR58847.1| ## NR: gi|313159484|gb|EFR58847.1| hypothetical protein HMPREF9720_0701 [Alistipes sp. HGB5] # 1 593 1 593 593 1176 100.0 0 MIQFRSEGIILDVQPDQDVTFTLDNPIFEDDRVPVAVSTNVEFKLSPKNCKFFGFTPGIR RRPSRKTAACEALFNGIVAFQGELKYDDYSDKSLQYSFVGAEFDHIVTGKLTDIPFSGFE DIKFSTMVENARKGLYDEFGLPQIMRKAMSASIEYVTSGPTKAECSTVDKYANWLYTTRP YVVPAIKVRYILDKILPELELGEPELEKLLNMLAILGLYKSSEYDNRYGVKDTSPGARPG LYPTQCTLDLADSMPDMDLSDFLISLLKIPCCTLFFSGKKYFLMSNKSILASNKFVDWTA KVSDNYSLPAYEKKGYTLAFRNEDDNYTKSYEEDLGQEPTDEIIESYSLQDIINKYRVSP DYKNIRLAQTGDIYSGKKIDALLYYSGRGAAWNKYTTPIATLDIVHQAGFTKMEPAADSE DQNYDCSIDFTCPKCVPLAVYPSKQIHADGSIEEAIGLNAITPIVDFPTAGGNRPSEVYI GLLIKNNFSDKGYYFEGGIPDLSAGSEDRSEYSLAIGGENGLYNRFHKPYAEWLARDKEM LKVDLNLTASDIANLRLWVKVLVFNVRYLIKTLEITGNTTHDILHSNAELVEV >gi|313159403|gb|AENZ01000009.1| GENE 30 28080 - 29228 311 382 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159520|gb|EFR58883.1| ## NR: gi|313159520|gb|EFR58883.1| hypothetical protein HMPREF9720_0702 [Alistipes sp. 
HGB5] # 1 382 28 409 409 762 100.0 0 MVDTFEIIEKPEICQFSENLGKMIIRNKNSSLTCVYMTVRLDDATICDKLTLYYDAESLI TINLRDIVHTLLECEFPRQTGVTDFTYLYITLTDTATTKTYRFQVIAGGVAAPRKVGLDW WARNFLTWQGQIVTMPAWQPQWLSVVKLNRDPQFLRIKSRLYTAEGIERTQDIFTASEEG IVRINVSFELLWRNICVSEELTPIAYDIYGLGANSVSEPDTAGAKNYPFAQRYILRSGNF RDRCFLFQNSLGGFDTIIASGLSTLLPEGEVDTFINQGREAELSNDYTSIWQQNTGYISS SSIARQWQEFLHSSNRYLYADGEWKQIIVTEYEVKHKEAALNSYTFKYHLSEKDEANYYD RAELPEPELPTDFWQIPSIRRE >gi|313159403|gb|AENZ01000009.1| GENE 31 29283 - 30083 129 266 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159408|gb|EFR58771.1| ## NR: gi|313159408|gb|EFR58771.1| hypothetical protein HMPREF9720_0703 [Alistipes sp. HGB5] # 1 266 1 266 266 531 100.0 1e-149 MTSYRRWNILTDLTYIETFYERQSDGTLVKAAIPDAGIDFTIDYFTDGATHFKASRIDGV YKDCRQVDEYSLEVFIPLSRRRMCKGELRRELTLTIPDDNFLNQIKSICFPAKTGLFLWF GPSDNFKQTAYGEAIIATIINASYDIIDLTSSAYPATCLKIMEKLEKGITANIYIKERAE LPQQSIIQVQKADGGYQLFTGIFETSEEDGTIKIYQIYYLVNRTDGVVTRKRYDRFPAVA SAGDYIKKNDRLILGGYSPADLKSNS >gi|313159403|gb|AENZ01000009.1| GENE 32 30096 - 31208 971 370 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159503|gb|EFR58866.1| ## NR: gi|313159503|gb|EFR58866.1| hypothetical protein HMPREF9720_0704 [Alistipes sp. HGB5] # 1 370 1 370 370 678 100.0 0 MAENQTLAVLQEILLKSRIKFVTGTEAEWTAANPVLLDGEYGLIRGSSPLKYKVGDGTKT WSALGWGNVTSLAQLTADATHRLVTDAEKTGWNEKADVFTFNYEGYLNPPTGGLNPQGQV AKNIVAAINANKKCVVVAQNVTVQEIEDSISGFVSITEVSASAVTGLIDTIRMASDDSGR TVFLATASITFKSDGTVTTAAVPYTGRIVMEDALPEYSTEKAPTVSGFAATYYLTRNGSR IGVPINIPLDQVLRGSSIKTVVTANSPYSGAKVGDKYIEFLFQNNNTPQYLPVQDLVDVY TGDDQYIQVTESNVIKLNYSVLSLKLAADLKKSYDNFYDPKGAGEAAAKAAIDEFKASTF VIQCTIPGMN >gi|313159403|gb|AENZ01000009.1| GENE 33 31213 - 31818 511 201 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159417|gb|EFR58780.1| ## NR: gi|313159417|gb|EFR58780.1| hypothetical protein HMPREF9720_0705 [Alistipes sp. 
HGB5] # 15 201 1 187 187 349 100.0 8e-95 MAATEKITGRVQFPMFTAAALAAANPVLLKGEVVYESDTRRRKIGDGVTAWKSLPYESDG EMAGSIHASQITTDATHRFVTDSEKKTWGDKAAKDLSNVTLTKALSSNGYYKAPDGLMFQ WGISPGGAYQYYFSPAFIAKPFGCFLTAYYGNGNVITAASYVELTAQYLRYQSRWANLTD KNGGLASSTETVHWLVIGRWK >gi|313159403|gb|AENZ01000009.1| GENE 34 31832 - 32164 341 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159446|gb|EFR58809.1| ## NR: gi|313159446|gb|EFR58809.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 110 1 110 110 190 100.0 4e-47 MKYWKQGFYDEPVEGGVEITDERWLELIDGQAAGMLITEDEQGSPVLTEYVNSVPVPTYE QRVQQSIRERYSVDDELAILRQRDTKPDEFAAYYEYAEQCKAQAKKQMQL >gi|313159403|gb|AENZ01000009.1| GENE 35 32161 - 32808 91 215 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159419|gb|EFR58782.1| ## NR: gi|313159419|gb|EFR58782.1| hypothetical protein HMPREF9720_0707 [Alistipes sp. HGB5] # 1 215 12 226 226 445 100.0 1e-123 MIGRIQHPKYTAAALKAANPLLLDGEVVYESDTGRHKIGDGVNKWTELPYPMNAEAVPAV TWKVQGGMLCVKPATDLKNPILKQCFVGILHYKNAKKRYRRNPQTGQTQNRPLNAGFKLV QDSFSRDEVNWTSVRINPVPFDATKVNAAGWMPIISVADLLERWVVCIADRGFVGGGEIR AASRHQYRRPRRKDGGFREAQDAGFILRRSSIVYG >gi|313159403|gb|AENZ01000009.1| GENE 36 32774 - 32962 141 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSHNTTISRKGIFVIGHTPSKKLYMYYRCRFIVVSGYYSEIGACPLGSVLRITRKQYYSS VK >gi|313159403|gb|AENZ01000009.1| GENE 37 33015 - 33569 688 184 aa, chain + ## HITS:1 COG:no KEGG:BF2340 NR:ns ## KEGG: BF2340 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 15 156 35 177 178 90 34.0 4e-17 MIDHIFAAIRPQLIILTIVYLLVLFVIFLDLWAGIRKARKRGELRSSLGYRKTVEKIAKY FNLIFVVTAIDAVQMLTVWQINEQTGSRLPLIPILTVLGAMFIGFIELKSVYEKSEDKEK AKIADAAAALGSALKNRETQGIVAAVLEYMERAGRQSGSPRGPARSEQAPGPEFMPNPDI YDEE >gi|313159403|gb|AENZ01000009.1| GENE 38 33594 - 33806 133 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159462|gb|EFR58825.1| ## NR: gi|313159462|gb|EFR58825.1| hypothetical protein HMPREF9720_0709 [Alistipes sp. 
HGB5] # 1 70 1 70 70 133 100.0 4e-30 MKANYTLEKVETEDGFAATYLLMCDGYQVGSAINIPDIVKGAKIISAAGGREALVINVKG RIGRPRHIKL >gi|313159403|gb|AENZ01000009.1| GENE 39 33821 - 34321 589 166 aa, chain + ## HITS:1 COG:no KEGG:Coch_0633 NR:ns ## KEGG: Coch_0633 # Name: not_defined # Def: mannosyl-glycoprotein endo-beta-N-acetylglucosamidase # Organism: C.ochracea # Pathway: not_defined # 4 166 58 221 222 104 40.0 2e-21 MTPKEFKKTYWPDIAASCEETGLNPLFVAAQAALETGWGKSAIGHNLFGITATKKWRGAV KYVRTFEYFDDDKQGHRFPKVHSITRMPDGRYKYVVDRAFRDYTSVRECLTDHSRILLTE RYAPAMPYKDDVYQFAYRVAACGYCTAKPADYAGLILKISKTLEKA >gi|313159403|gb|AENZ01000009.1| GENE 40 34318 - 34932 431 204 aa, chain + ## HITS:1 COG:no KEGG:Bache_2807 NR:ns ## KEGG: Bache_2807 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 4 201 6 183 183 91 34.0 2e-17 MKRFLLIALIIAGGLLWVQSARLRSEKRERRRLESNQTALMSDVEIYRTKAGKAAASNMV LNLRVSELERLRAADAESIRDLGIKLKRVESTAKTATATVVELRAKLRDTAVVRETSAGA VIIDSMRTFRWRDPWVTVEGLIERDSVACRVESIDTLRQVVHRVPRRFLFIRWGTKAIRQ EVMSSNPHTRIVYTDYIELKKRNR >gi|313159403|gb|AENZ01000009.1| GENE 41 34929 - 35402 448 157 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKFLKTTWAVLLYLWQLPQNLLGLTYLAFCFDRVKITEQRGAVFYATKHVRGGMTLGRY VFIAPGNIDREPVYDHEFGHVRQSRRWGWLWLPVFAIPSGLHSLFCRAANYYHFYTERSA NRLGGVPNYAGEYHYHMDGLIVTYWDKLVELKDKYFK >gi|313159403|gb|AENZ01000009.1| GENE 42 35460 - 36176 166 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159487|gb|EFR58850.1| ## NR: gi|313159487|gb|EFR58850.1| hypothetical protein HMPREF9720_0712 [Alistipes sp. 
HGB5] # 1 238 1 238 238 455 100.0 1e-126 MHTDDFDGFVVTINGTTHYVDGTVPDKTLNTVSDYNSLLRDMINIPTIHEHYYSNLSREH WAQLCAAMDVIRDSQRAIDKYQEIDQIDYIHGDALYIYGLLNAFYLQQDAATTIFKIILK KKELNYFDDYPEINKIRRIRDDVCHATDRNIIKGKIIAQIFISPSTVTKNGFCYLKYTTS DGNRKVVTIHVDIDDCIAKQANDIKDILCTICKKILSSISPDVQKEYLESWSQAMKAL >gi|313159403|gb|AENZ01000009.1| GENE 43 36241 - 36837 169 198 aa, chain - ## HITS:1 COG:no KEGG:Halhy_5893 NR:ns ## KEGG: Halhy_5893 # Name: not_defined # Def: integrase family protein # Organism: H.hydrossis # Pathway: not_defined # 21 182 174 331 347 66 28.0 7e-10 MVSIGKALLKKERKVIAVDDMQRLHDWLQENNRYFLLVCYFLHYMLIRPKEIAKLRLCDI SVSKQTVYIDDTISKNKRSACVTIPQKIIELMAELGYFDAPSTYYIFSKDFRPGPEWVNE KTYRDFWSRKIRPALHFPKEYKFYSLKDTGITAMLRAGYDTLSVKEQARHSSLLMTDVYT PQDIRDANPLLLNYQGVL >gi|313159403|gb|AENZ01000009.1| GENE 44 37876 - 38784 551 302 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|148988856|ref|ZP_01820271.1| 50S ribosomal protein L9 [Streptococcus pneumoniae SP6-BS73] # 1 301 1 307 308 216 40 5e-55 MKAKKTILDTIGSTPMVRINALSPNPKVKIFAKLEGFNPTGSIKDRIAVKMIETAEREGR LTKGKTIIEPTSGNTGIGLAIVGIVKGYPVEIVMSEAVSIERRKIIRAYGGTVRLTPAAE GTDGAIRLARKLVAENPDKYFMPDQFANAANYLAHYENTALEIWQQTGGQIDYLVCAIGT SGTLMGLSRFLKVMNPAIKVVCAQPTKGHYIQGLKNMEEAIVPDIYDPSKIDVQELVESE EAIDMARRIISAEGIFAGMSSGAAMLAAVRTAARIESGNIVVVFPDRAEKYLSTTMFREF ED >gi|313159403|gb|AENZ01000009.1| GENE 45 38811 - 40559 2337 582 aa, chain - ## HITS:1 COG:TVN1395 KEGG:ns NR:ns ## COG: TVN1395 COG0247 # Protein_GI_number: 13542226 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma volcanium # 263 581 297 657 659 147 28.0 6e-35 MVGAAVMFVVLLWKWGSWLYLLPRADKKRILFGLPTRRTLAAVWEVISESLLHRRIFKVN PLLGYMHMSLAFGWFLLIAVGWIETIAYLGFRYVPLQGHVFFKYFATGLEHKPVFDFLMD LLLLFVLSGVALAWGKRVYSRAMGMRRTTKHLLGDRVALSALWFVFPVRLIAESTTCALY GGGGFLTGAVGAWMAEHVSTLALMNLESAAWWAYSACLGIFFVALPFSRYMHIFTEIPLI FLRHYELRSTEKEGAFDHFQVEACSRCGICIDPCQLQSVLGINDVQSVYFLRDRRYRMLR LATADNCLMCGRCAEKCPVDIDLNTLRLNSRDTMRNVPDEKRYDYFKGLDRSSGEGKVGY 
FAGCMTLLTPRTMSAMDKVFRAAGEEVWWADREGGVCCGRPLKLAGETDSARRMMRYNTD LFRKHGITTLVTSCPICLKVFREDYELAGIEVLHHSEYILRLIRAGRLDVVHGPTRFTYH DPCELGRGSGIYDEPRAVIEAVGELLEPAQTRENAPCCGSSVANTAISDSQQVRLAQAVA EELEATGAEVIVTACPLCKKAIGRGTRGEVRDLAEIVAAGLK >gi|313159403|gb|AENZ01000009.1| GENE 46 40595 - 40930 448 111 aa, chain - ## HITS:1 COG:no KEGG:Odosp_0420 NR:ns ## KEGG: Odosp_0420 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 6 102 4 100 109 108 51.0 7e-23 MAAINFGYTISKPRAIDIDRNNLRKSDEILHEMPELQACIGCGACTAVCTAGNLTEFNFR KVHTLVRRGEYQGAYEEMNKCMLCGKCRLVCPRGINTRGVVMLIKRKLGDF >gi|313159403|gb|AENZ01000009.1| GENE 47 40934 - 41938 1285 334 aa, chain - ## HITS:1 COG:SSO1131 KEGG:ns NR:ns ## COG: SSO1131 COG1148 # Protein_GI_number: 15897989 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit A and related polyferredoxins # Organism: Sulfolobus solfataricus # 2 332 4 346 368 232 40.0 6e-61 MKRVAVIGGGVAGMQTALRLAEQGIEPVIIEKEAELGGKLRGWHVLFPSFTPASEILTEL RRRVAERGIEVLTQTEAAGFTREGVRLTDGRTIACDSVVMCSGFTLFDASIKEEYGYGIY DNVFTTVDIERMLNEGRVAKADGSRPKRIAFLHCVGSRDEKVCQQHCSKVCCITGVKQAM EMKQLFPDADVFNFYMDIRMFGPGYEEMYREAQQKYNIHFIRGRISEASPTIDGRVQIKA EDTLTGRPLRMSVDMLILIVGMRANDDNAVLAEGAGLHRAPSGFMAPRDMFLGNVKSNVE GIFYAGTVTAPKNIGESLNEATAAADAAARYLGA >gi|313159403|gb|AENZ01000009.1| GENE 48 41938 - 43026 1505 362 aa, chain - ## HITS:1 COG:MK0572 KEGG:ns NR:ns ## COG: MK0572 COG2048 # Protein_GI_number: 20094010 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit B # Organism: Methanopyrus kandleri AV19 # 25 325 11 304 305 119 30.0 7e-27 MIKRKSWQDYQKEIADDHYYYARSCIRQNFFPGSEKLFIDMLRNDLGKDLTDDPLHSSCT GIGYHSDIVPLETIMTVVARQFALMTEAGYENLVSSCITSFGVYTEILATWHEFPETEAK TRENLYKATGREFKKPASLAHTSDVVFHFREQIAARAKHRLVNAQTGEPLKVVEHIGCHY AKIFPKSGIGGSEFPYVLAGMVESWGGECVDYPERRHCCGFGFRNYLVQANRGYSIANSH KKLESMAPYKPDFIVANCPGCAMFLDKWQYTIAEMEGTTYGQNGHGIPVLTYEEMAGLVL 
GYDPWVLGMQMHQVDVEPLLDKMGVEYDPAAKYLGRNGKYIGKPASAVVNCCPTDTIYDI RE >gi|313159403|gb|AENZ01000009.1| GENE 49 43030 - 43728 1058 232 aa, chain - ## HITS:1 COG:SSO1127 KEGG:ns NR:ns ## COG: SSO1127 COG1150 # Protein_GI_number: 15897987 # Func_class: C Energy production and conversion # Function: Heterodisulfide reductase, subunit C # Organism: Sulfolobus solfataricus # 3 142 48 182 280 94 32.0 1e-19 MAKYFDMLMEDVRMKEGLQSCMNCGVCTGVCPAAEFYNYDPRQIVNIVQTRDDDAIEELL KSDTIWYCGECMSCRPRCPRGNTPGYVIQALRTLSQKLGFFVESEKGRQQLALKRIIGEN ILRTGYCIVPRLVKPELHPEQGTVWQWIYDNDKEIYGQFTPVYMRHGAGALRRLDEKSLD EIHRIFEVSGGKEMFDAIEEHSDRKARELGYEEGADQKYMMDVFLNNSNEHY >gi|313159403|gb|AENZ01000009.1| GENE 50 43945 - 44430 787 161 aa, chain - ## HITS:1 COG:BS_dfrA KEGG:ns NR:ns ## COG: BS_dfrA COG0262 # Protein_GI_number: 16079240 # Func_class: H Coenzyme transport and metabolism # Function: Dihydrofolate reductase # Organism: Bacillus subtilis # 1 161 1 162 168 173 47.0 1e-43 MVSIIVAVAENGVIGDKNALLWHISEDLKYFKSVTSGHPVVMGRKTYESLGRPLPNRTNV VVTRQEMEIPGCRVAHSLEEAVALFPAEEEVFVIGGAQIYAQALPLAGRFYLTRVFHAYE GDTHFPAWNDAEWRLVSSESFASGANYPYPFAFETYERRRN >gi|313159403|gb|AENZ01000009.1| GENE 51 44913 - 45398 671 161 aa, chain + ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 9 120 85 206 512 65 38.0 4e-11 MAAAGFAVGLLIAPRWNKTGCLTAAEYITRRYGASTQKLYTYIFLFISIFTTGSFLYPIA KIIEVAAGIPLSSSILIIGVFCMIYVSLGGLRAVVVTDVLQFIILFAAVIIVTPLAFGEV GGVPEFLARVPEGFFTLFAGEYNWVFIVAFMLYNLFFLGGN >gi|313159403|gb|AENZ01000009.1| GENE 52 45423 - 45941 509 172 aa, chain + ## HITS:1 COG:no KEGG:Phep_3587 NR:ns ## KEGG: Phep_3587 # Name: not_defined # Def: Na+/solute symporter # Organism: P.heparinus # Pathway: not_defined # 3 164 250 412 556 177 57.0 2e-43 MRTPRDAKKVGVLFGVLYTFSPILWMLPPMIYRVFEPGLSGLENENAYLLMCKQTRPAGM 
LGLMLGGMIFATASSLNATLNISAGVFTNDIFKRLRPAAGETTLMKVARISTLGFGVLAI AVALLIPNMGGIVNVVISVAALTGVPLYLPLIWSLFSKNLNGRAISARPSPV >gi|313159403|gb|AENZ01000009.1| GENE 53 46010 - 46315 327 101 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159554|gb|EFR58917.1| ## NR: gi|313159554|gb|EFR58917.1| hypothetical protein HMPREF9720_0726 [Alistipes sp. HGB5] # 1 101 1 101 101 137 100.0 3e-31 MLVGVLFPALLLTGYEIYHKLSGSRVCVPMPAAENAAAPAAEDAGTDNNNSIRIIGAGAA ISGCFISLIALLSGGETLVLLTTGIVLTVLGGLIFLTTKNK >gi|313159403|gb|AENZ01000009.1| GENE 54 46319 - 48601 2758 760 aa, chain + ## HITS:1 COG:no KEGG:Sph21_1996 NR:ns ## KEGG: Sph21_1996 # Name: not_defined # Def: fumarate reductase/succinate dehydrogenase flavoprotein domain protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 18 732 20 733 764 810 53.0 0 MFIGEFEKSRKPAEVHSRHSLVIGGGGLTGVCCAIAAARAGADVALIQDRPVLGGNASSE VRLWALGATSHMGNNNRWSREGGVIDEILVENTFRNREGNPVLFDMVLIDKVLSEPNITL YLNTCVYDAEMSGRTVRSVTAFNPQTGTHHTIEGDLFADCSGDGALAYMAGAEYRMGAED REEYGEKFAPDKTRYGELLGHSILFYIKDTGRPVRFEAPEFALKEVEELIPRLRNPEYFS TSQHGCKYWWLEYGGRLDTIRDTEKIKFELWKIVYGVWNHIKNSGKYPEMENYTLEWVGL FPGKRESRRFKGYYMLTQQDIIEQHEHYDAVSFGGWSIDLHPADGVYGAGRACNQWHSKG IYQIPYRCLVTPDADNLFIGGRIISVSHVANGSTRVMCTAAHGGQAIGTAAAIALRDHLK PADLIGRERIGQLQSALLRTGHFLPGERFGRGMLPPKARISASSEFALERLHPDGTRFRL DCSAAELIPVGGPVPAVGLTVQADKATRLTVELRSSSRRGNYTPDTTDKRLVFDLRKGEN RLTVDFGMRYDAPQYVFICFMANPDVSIPMSSEIVSGLTSVFNTVNPAVSNFGRQTPPED IGVDAFEFWCPKRRPEGKNIALEFSAPLTCFGAENLLNGYFRPYIQPNGWIPAPDDRTPS VRIDLGEERTVRQIRLFFDTDFDHALENVQMEHFDSVMPQCIRNYRITDGDGNLLFRTEG NHQSCNTIALDAPHTLRSIRFWPEDSEPGRHKALMGIIVE >gi|313159403|gb|AENZ01000009.1| GENE 55 48633 - 51221 2897 862 aa, chain + ## HITS:1 COG:no KEGG:Slin_2415 NR:ns ## KEGG: Slin_2415 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 25 859 25 855 857 784 45.0 0 MKPHFKIFSAFVLLLSAACTPSDELAVRNKGLEIRWEQTAEGWKISRIADRKGRCWGVPD GAYTILFSSEKPSAEPETITNRGGDTMNFDIARFHYIAKDFNRAVSSVPMNRAGEALRFY 
PEKGWKEGRAVCFEARNELGRLRTTWSPDPDYPADIRIESVFYPARKGYYSVSTPTLAVL PEERLGWSVVPGFFQGDYIQPVFHLAYMYGQGLPHLPVICNDNTVTTMITSMTEKAGNTL ALIPDSGYSRSEYSGGKRTHGISWRCGLSHMNRQGELSPTLYYPVLGQEGSEKQAGDSVR FGFRVSMTDKGWYEAHKHAVYDIYGLGNSLALKHTTLPLYKRMEAIWDYILDDSLSFWRT AGYKGLTIGAQDYLGGVVEADRDAMKNSDIGASWMLASMTGDPRLTEERLPYMRNFKLMQ QAPAGDPNHGAAMGQYYLWKKQKFVEEWGDHIEPIGITYYTLMDLGNILLFERNDSLLRS SFRAGAERLLSLQQADGGFAVAYGKHDGEPLFTDLKDLRPTFYGFVVAYKLLGEHKYLDA AVRGADWFVRNAVDKGAFTGVCGDARFINDFATGQAVQALLDMHGLTGEARYRDAALRTA RMYATSIYTHPTPGDREIEYKGRKMQEWQISQVGLCFEHGGCAGSAVKSGPILLTSHCGM FVRLYEETADRLFLDLARAAATAREAHLAPDTHMATYYWSQFDRGPGPFPHHAWWQLGWI ADYVFAEAEMRSGRRISFPRGFMTPKVGPQRIFGFEPGTVYGEQANPIMVKGLFDADNTD IEILSALTTDRNRLCLILMNSTPRKLHATLTVHPAAIPGRRIDTASAADPATGRKITPGD DGAFGITLPGYGIQTLKLDLEP >gi|313159403|gb|AENZ01000009.1| GENE 56 51218 - 53023 912 601 aa, chain + ## HITS:1 COG:no KEGG:Phep_3589 NR:ns ## KEGG: Phep_3589 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 45 550 25 524 924 353 37.0 2e-95 MTRAGRILLIILAAAGACRPAQAQEFRVKPDHTVVFGGQHLLPDYLPEFTVLFSAADPQL RMRPAGIPDVQYNVATWLSDRPELNRTDRRADQSGDGFDDEILEGSVESRTADLFNAGRI IVVRPVSHRAEETDKVTFGYPENDLFSIAAELTADTATGRPLLTYRLTAKTGGWYSVGFT GFAGRDPKRTDEIWQPLIWQEKRFPDKSYLTLAFRCPVPSAFVTYRNATYGILAHPAEFP FDPLPTAANSRFGVAVRDRQGLASPMLFAPVPGGIGSETAPGGELVFSALLCGERGDTTD ALQNVAESLFGFTGNRNNALGSLNGTIENMISYILGPYSRFLDKEKGCTYATDVPGAVKN VSSLNPLEIAVLNDNRTMLLERALPVYEYVLSREKLLFCADSTQDIQYPSRRMNGPCCPV SELTALNSILKDNAPYLLSFAGKEYHSTRVRNLDVVEQGGTWRNAMHLYRATGEPGYLKK AVAGADLYIRRRIDTPAAAGRNSTRSTSGTPPPSRTGIFWSIPTGKPRTTNTSSRKDMRR WTLPPNGSRHGGFRKSDSPRNLPARAPDTGVSSWPTTLRGCCGWRITRATGFWHARRTTP S >gi|313159403|gb|AENZ01000009.1| GENE 57 52774 - 53913 988 379 aa, chain + ## HITS:1 COG:no KEGG:Phep_3589 NR:ns ## KEGG: Phep_3589 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 2 379 542 924 924 315 44.0 2e-84 MVNPDGKAPHYQYLKSKGHAQMDAPAERVPAWRLSEIGLTPESSGTSTGHRGIFMANYAA 
WMLRLAHYTGDGFLARTADHAVIGRYRNFPGYHINTARTTVYEKADYPLRGHMELGANSF HYNHIMPHISLLYDFLVTQALFRSGGKVDFPGEFIEGYAYLQSKFYGHLPGTIYGREARL WMPTGLATPADRQLNYISAVDDDALYIVFMNQSAETVETDVALDYALSGHSDGGHYRTQV IGGSGASGQLADGRFTVRVAGNGIAAVRIETPTPVRPQLAALFTAETRWQTPYAATQDDA VRCMRVGFGRAANNVFVYLTQDDTQLRKVWFTVDGRTTEDTDYPFEHTAPIDGERTEITV KALTRQGRIIEWETLELKK >gi|313159403|gb|AENZ01000009.1| GENE 58 53920 - 55845 2235 641 aa, chain + ## HITS:1 COG:no KEGG:Phep_3590 NR:ns ## KEGG: Phep_3590 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 23 641 36 626 630 413 39.0 1e-113 MRTTAFFAFMALLSAGCGGTAQKSPSGNIRAEVFAGKEGLYYTVTHDGRTVIDTSALGIT IDGTELFRDAALRPAGTRKIDETYSVTGRKAVSEGRCNEYLFDVTHRPSGYKYRLQTRLY DDGFAYRFVLPGDGTRTVNGEAAQWRPPHQSRVWFAERNSGWKLLTYAGEWISTTADSLH IVSRQGPVQTMPLLYRTPDEYVMIAEAALYDYSGMRLRAEPGASLRADFTEQEGFELSGD IVTPWRVTIIARDLDALVNTDIITSLNPAPDPELFADTSWIVPGRSLWSWWSGIDGGFMT LEGEKRVIDTAASLGIEYSTVDDGWEERPDKWKFLHELAGYAQSRGVGLFVWRHWEKLND PADDYRQMRGFLDSVAAIGIKGVKVDFMNGEGIRQIDFNTHLLKNAARRRLMVNLHGCQK PTGEIRTYPNEITREGVRGIELNRITANYEKRMRDEGRTPDPDRYVPGDENQNIPASHNA ALPFTRGVLGAADYTPVAFTMPGGTTAAHQLAFALLLDSPLLTIAENPFVLSGDERYRPA LDIIRRLPTVWDETVVLPQSEIGRLAVIARRKGGTWYIAGVNTAPAAVTIPAGLIPATAR KAQIVRDAADGSLQTAEVALPAPEPLVMQLPENGGFIVVLE >gi|313159403|gb|AENZ01000009.1| GENE 59 55905 - 56342 81 145 aa, chain + ## HITS:1 COG:Cgl1222 KEGG:ns NR:ns ## COG: Cgl1222 COG1609 # Protein_GI_number: 19552472 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 8 102 11 105 369 79 40.0 2e-15 MKKLEMIKHATIKDLAQALGISKSTVSRALADHSDVKPETKRLVLEMAEKMNYRPNPYAQ NLIRRRSKVIGVVVPEFVNSFFPRIIIQIQKVFEKEGFNVLIEECVFREHARRCDQRTRL LRPRDRHHGARSGHRPHQPPPAGRV >gi|313159403|gb|AENZ01000009.1| GENE 60 56233 - 57027 825 264 aa, chain + ## HITS:1 COG:no KEGG:BVU_2283 NR:ns ## KEGG: BVU_2283 # Name: not_defined # Def: ATP/GTP-binding site # Organism: B.vulgatus # Pathway: not_defined # 1 264 324 605 605 176 39.0 7e-43 
MLDDATNVHGSYVRVTGITAPDQVIARINHPQQAGYEFAGKGDEIDVVDAETLLAKHTLR VKKSERINAHYIRLTFTAPVEGRLTVGDGLENMSWYPELIFRNNVVRNNRARSILVSTPR KVVVEGNTFSSMMSAILFEGDMDHWYESGAVRDVTIRNNRFLDGTYGGADFPTIFINPHQ KKEVPGHPYERNITIEGNLFRTFNEQLLRAKSVGGLIFRDNTIELSEKYKPYNDLPTIDI RSSQQVVIENNRYKKPGKISIVRK >gi|313159403|gb|AENZ01000009.1| GENE 61 57144 - 58043 1090 299 aa, chain - ## HITS:1 COG:NMB1709 KEGG:ns NR:ns ## COG: NMB1709 COG0207 # Protein_GI_number: 15677556 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate synthase # Organism: Neisseria meningitidis MC58 # 1 298 1 263 264 380 61.0 1e-105 MKQYLDLLRRIKTEGVVRGDRTGTGTKGVFGHQMRFDLAEGFPLLTTKKVFLKGVIHELL WFLAGDTNIKYLVDNGVHIWDNDAFRYYNELCVRHGVLPVDRDTFLRAAQEGVESPVEGY KFGDLNHVYGYQWRSWPKPDGTVVDQIAQAVGLIRSNPESRRILVSAWNVAEVEDMALPP CHVLFQFYVAEGKLSCQLYQRSADTFLGVPFNIASYALLTLMTAQVCGLQPGEFVHTLGD THLYLNHLEQVDEQLAREPRPLPVMRLNPAVKSLFDFRYGDFTLEGYDPWPAIKAPMSF >gi|313159403|gb|AENZ01000009.1| GENE 62 58126 - 58872 798 248 aa, chain - ## HITS:1 COG:HI0735 KEGG:ns NR:ns ## COG: HI0735 COG2908 # Protein_GI_number: 16272676 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Haemophilus influenzae # 1 238 1 231 237 75 26.0 7e-14 MNHYFASDIHLGAGGEAFAGETERRFVAWLDDAAKDAESIFLVGDLFDFWFEYREVVPKG FVRTLGKLAELTDRGVRVVFFTGNHDMWVGDYLARECGVEVYTSPQRLCLNGKHLFIAHG DNMKIDGQPVLKLLNTVFRSRTLRWLFSWLLHPDWAMRFGHWWSGKSRKSHAADTLDVSL TEPLIQYAREYAAMHDVDHFVFGHMHFPRDFREGNLHVINLGCWEQYPSYAVLDASGEMT LRRLEAFR >gi|313159403|gb|AENZ01000009.1| GENE 63 58990 - 60132 1626 380 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2266 NR:ns ## KEGG: Odosp_2266 # Name: not_defined # Def: alkyl hydroperoxide reductase/thiol specific antioxidant/Mal allergen # Organism: O.splanchnicus # Pathway: not_defined # 1 380 1 385 385 161 28.0 5e-38 MNKQTLFVSLTAAVMLCACQSSKVKISGRIVGNDAKNVYLEQVSPLSQSVIDSAVLDKEG NYRFELKGVTRTPSLYNIIYNGERIPLFLAGGDRLSVNSVGSFIRNYTVEGSKETELLRQ FYQAFVGGAQRLDNIAGQFARTNLSEEERKALVKEYTDEYYRIRREQLRFIIENKSSLAA 
VYALYQRLPGDTYLFNGDSDVVYYRTVAEALEQSYPDSPYLQSLLAEITRMDARISLTSR ISEAGYPDLELSDIYGKKVRLSSLTGKVVLLDFWSAELGNSNTLNAELKEVYKKYADAPT PFEVYQVAVDSSKPLWITAVQEQQLPWISVSDLRGQASTAPRLYNVQKLPANFLIDREGT IVAKDIYGKSLERKLDELTK >gi|313159403|gb|AENZ01000009.1| GENE 64 60196 - 60882 938 228 aa, chain - ## HITS:1 COG:no KEGG:Sph21_2930 NR:ns ## KEGG: Sph21_2930 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 1 228 1 226 227 153 40.0 7e-36 MKKNYNYTRRKLYLPEYGRHIQEMIDSLQLIEDRRERNRQARAVIAVMGNLNPLLRDTAD FTHKLWDHLFIMSDFQLDVDSPYPQPTRQELTIAPRRMAYSQGRIAFKHYGKYVERMIGS LAGDKDRREVSRTVDNLARYMRSKSYEYNQEHPNNEVIVKDIKRMSGGAIEIDEVALNNL RSDYKQHFTARPQKGPQQRNQQRQQKNRNQQHNRNFSKNNGPHRNSSK >gi|313159403|gb|AENZ01000009.1| GENE 65 61076 - 62215 1715 379 aa, chain + ## HITS:1 COG:STM4174 KEGG:ns NR:ns ## COG: STM4174 COG2204 # Protein_GI_number: 16767428 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Salmonella typhimurium LT2 # 4 365 8 371 441 284 42.0 2e-76 MAKVLILDQSKSMRNTLRERLEFEGFSTEAAEDEEAASTMCEKIPFDVILSDRGCKIPDA GIPFIVLSAEASIDSAVEAVRNGAQDYLTKPIDMNRLLQSIHRTIDNPAPAPACAPAAAP ARRSRRSRNMHTEQIIGTSPQMEHVRQLIDKVAPCEARVLITGENGTGKELVARWLHAKS SRAAAPFVEVNCAAIPSELIESELFGHEKGAFTSAIKQRKGKFEQADGGTLFMDEIGDMS LAAQAKVLRALQENKICRVGSDKDIDVDVRVIAATNKNLRDEITKGNFREDLYHRIGVIV VRVPALRDHAQDVPLLADHFIRTICAEYGIPPKRIESNALRELQAMRWSGNIRELRNVIE RLIILSEERITLDDVKTYC >gi|313159403|gb|AENZ01000009.1| GENE 66 62274 - 62942 905 222 aa, chain - ## HITS:1 COG:TP0554 KEGG:ns NR:ns ## COG: TP0554 COG0546 # Protein_GI_number: 15639543 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Treponema pallidum # 9 222 5 220 222 145 39.0 7e-35 MIMETTVNTDLVIFDLDGTLLDTIGDLAVACNAVLALRGLPQHSYEEYCHFVGNGIMRLV ERALPEELRTPYTVAAVRADFVKYYTEHIDAYTKPYDRIPELVAGLVRRGVRIAVASNKF QAGTEKLVRLFFPDVPFAAVFGQREGVPLKPDPAVVEEILSLTGAAKERVLYVGDSGVDM 
QTAAAAGVRSVGVTWGFRDREELVESGAAHIVDKPAEILDLL >gi|313159403|gb|AENZ01000009.1| GENE 67 62939 - 64399 2114 486 aa, chain - ## HITS:1 COG:FN1422 KEGG:ns NR:ns ## COG: FN1422 COG1757 # Protein_GI_number: 19704754 # Func_class: C Energy production and conversion # Function: Na+/H+ antiporter # Organism: Fusobacterium nucleatum # 62 472 42 439 473 271 38.0 3e-72 MESHKNKLPSAWVSALPLAVLTLLLYVVIRCFGGDAINGGSQIALLSATSVCVMLSIGIY RCKWSVLEDAIIDNIRASASAIIILLLIGAIAGSWMVSGVVPTMIYYGLKILHPSFFLVA SCVICAGVSLMTGSSWTTIATIGVALMGIGQAMGFPEGWIAGAIISGAYFGDKISLLSDT TVLASSTVGVPIFTHIRYMLYTTVPSFVVALAVFVVAGLTLSHTGGAHAELYADSLRGAF RITPWLLLVPVATGVLIVRKLPAIVTLFCAAVFACVAMLLAQPELVVRVAGVEGLNFMSG FKGVLMSCFGPTALETGSPQLDELVATRGMAGMLNTVWLIICAMCFGGVMTGSGMLGSLT AVFLKFVRHSFSAVASTVGAGIFFNLCTADQYISIILSGRLFRELYAERGLESRLLSRSV EDSATVCSVLVPWNSCGMTQATVLGVSTFVYMPYCIFNIVSPLMSLCVAALGWKIKKAAS SAGRRQ >gi|313159403|gb|AENZ01000009.1| GENE 68 64463 - 65560 1251 365 aa, chain - ## HITS:1 COG:no KEGG:BT_0445 NR:ns ## KEGG: BT_0445 # Name: not_defined # Def: endoglucanase E # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 357 30 358 366 349 51.0 9e-95 MSRRLIFAVLALLCAGAVSAQKRTEAPAASSEVRYVGRTQTNGGDVSFDWSGTSFECRFT GGSLAMRVSDTKKNYYNLTVDGRDAGVVTTFGTDSVVVLAEKLGRGEHTVRMQKRTEGEQ GRTTIHAFLLDRGGRLLPASPAPGRHIEFIGNSLTCGYGTEGLSKDEPFKPQTENCNKAY ACIIARYFGADYTLIAHSGRGAARNYGDKNTTSQNTMADRIANTFDEAAEPAWDFAASPY RPDLVVINLGSNDFSTLPHPSRDEFAAAYTRILQTLRGAYGDEMPILCVAPRVSEPAFTY IRDLCQSAVVPNLHFAAILPGYCNDGSELGSSAHPNYAGQRKMAMLLIPYVSTLTGWEAE IKPVE >gi|313159403|gb|AENZ01000009.1| GENE 69 65557 - 66876 1827 439 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 430 475 878 884 426 49.0 1e-117 MKKLLTVCLLAFAAAGATGCRSAAEGVKPKLMWLDCSANWVRFSYPDSIRYYVNKCREAG MTALVLDVKGTSSEVVYPSEHAPQVREWKGFARPDFDFVGTFVEAAHDAGLEIYGSFNTF AEGNGVFRRGLIYDGHPEWQAVNYIPGRGLVPQLEIPEKKVLFANPALPAVQDHEIAIFK 
EVAQKYDFDGLLLDRGRYDNIQSDFSDFSRGKFEAYIGKKLDRFPEDIYTWEEDGDGGWK RIDGPYFKQWIEWRASVIYDFFKRTKEELKAVKPGLKFGAYTGAWYPSYFEVGVNWASNT YDPSQDFAWATPGYKNYGYAELLDIFTNGNYYWNVTVDEYRRSNGLHKNETDSEMSKGDH LSVEGGCRYSRRLLGGRPFFGGMYVEDYKRDTTQFKRAVEMNLRESDGLMVFDIVHIINR DWWGPLQRAVSAYEAEAKQ >gi|313159403|gb|AENZ01000009.1| GENE 70 66890 - 68749 2130 619 aa, chain - ## HITS:1 COG:no KEGG:BT_0434 NR:ns ## KEGG: BT_0434 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 616 27 625 627 726 59.0 0 MRFTGLKYLQLLACLGLFACGSAERYDVVIVGGGASGTAAGLQAARMGARTLIVEEFDWL GGMLTSAGVSATDGNYRLRGGIWDEFRTELARHYGCDSALITGWVSNVMFEPSVGDSIFK RLVAREPNLTVWYRSAAETAERGKDVWRLGVRRDGRLRQVEAGVLVDATELGDVARMAGV PYDVGMDSSAVTHEDIAPAEANGIVQDLTYVAVLKDYGRDVTIPRPDGYDPSLFACCCVN DLCIAPKEPHRMWSREMMITYGKLPDGKYMINWPIEGNDYYVDMIDMTPEERADAVRRAK NHTLSFVYFLQHELGFNTLGLADDEFPTEDRLPFIPYHRESRRIHGVVRFTLNDITDPYA GTLYRTAVGVGDYPVDQHHTRYSGWDDLPDLYFHPIPSYGFPLGIVIPAGFPGLLVAEKS VSVTNLVNGSTRLQPVVLQIGQATGALAALAAAAGVDPSEVAVRRVQEAVLDAGGYLLPY LDLPADDPRFGAMQRIGVTGILKGRGANVGWENQTWFDADRTIAESELREGLREVYPSVR PSVLETPVDGALLTAMLAEALGKPAGDVAARTERAAAGLLSGYDPARPLTRLECALLIDA VADPFHADEVDIYGNYKRK >gi|313159403|gb|AENZ01000009.1| GENE 71 68760 - 70190 1634 476 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5275 NR:ns ## KEGG: Cpin_5275 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 22 468 41 491 496 381 43.0 1e-104 MKRYLLTILLSLWCACAWAAGSVTVKGRVLCGGRGVEGVWVSDGEEFACTDKRGNYRLQA GADNRFVFVCVPAGYDAPVEKGVVRYFHPLPADGKSCDFTLLRRAGDDSRYGFIAIADPQ IWAPKEFAKLAAAADDIAATVRSYGGMPFHGICCGDIVSHDHSLYGRYNEVMERTGITFR NAMGNHDMKVYGRSYETSFSKFEQMYGPVYYSFDVGRIHYVVLDDNFFIGRDYFYIGYLE ERQMRWLEKDLARVQPGSTVVVCLHIPSTCEEQDRKQFRYDRAGSTMTNHRGLYEILKPY RAHIISGHTHTTFNQPIAPGLYEHVTPALSGAWWQGPLCTDGTPAGYGVYEVNGDRIDWY YKSTGYPADYQMKIYSGREYPQFEGYAVANVWASDPAWEVQFTIDGVPCGPAERFQAYDP AAKQMYSDTSQMDHKWIYPSISDHYYRVALPEGAKRVEVSATDRFGRISRAAADLK >gi|313159403|gb|AENZ01000009.1| GENE 72 70208 - 71371 1875 387 aa, chain - ## HITS:1 
COG:slr1975 KEGG:ns NR:ns ## COG: slr1975 COG2942 # Protein_GI_number: 16330802 # Func_class: G Carbohydrate transport and metabolism # Function: N-acyl-D-glucosamine 2-epimerase # Organism: Synechocystis # 4 384 7 387 391 461 55.0 1e-129 MDFKRLAEQYKTELLDGVLPFWLEKSQDREFGGYFTCLDRDGSVYDTDKFIWLQGREVWM FAMLYNKVERRPEWLECAVQGAEFLRKYGHDGNYNWYFALTREGRPLVQPYNIFSYTFAA MAFAQVAVATGSDEYALIAKRTFDRILEKRANPKGEWCKAYPGTRSLKSFALPMILCNMA LEIEPMIDKQLLADTIGECLHEVMEVFYREELGLIVENVTEDGRLSDTFEGRQMNPGHSL EAMWFIMNLGVRLDDRALIDKAVKIALNTAEYGWDKEYGGIFYFLDRKGAPRVELEWDQK LWWVHIESTIAMIKGYQLTGSKECLEWFGKLHEYTWAHFKDTEHPEWFGYLNRRGEVLLP LKGGKWKGCFHVPRGLFQCWQILEQCK >gi|313159403|gb|AENZ01000009.1| GENE 73 71393 - 72787 1681 464 aa, chain - ## HITS:1 COG:CAC1339 KEGG:ns NR:ns ## COG: CAC1339 COG0477 # Protein_GI_number: 15894618 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 9 462 13 453 469 282 36.0 1e-75 MKRRGNISYMLFLAVTAAVGGLLFGYDTAVISGTVELVTARFGLDSLQQGWYVGCALAGS IAGVLCAGVLSDRLGRRRTMLVSAVLFTVSAAGCALCADFTQLVVYRIVGGLGIGVVSVV SPLYISEVSAARRRGMLVSFYQLAVTMGFLAAYTVNYLLLRMGQTAGFETAWMQRIFVDE VWRGMLGAETLPALVFFVIIFFIPESPRWLALRGHTDRALRVLGRINGDRAEAAAELAAI ETAVGGGQAPQWRLLLSKGIRTAVLVGAAIAILGQFMGVNAVLYYGPSIFERAGWSGSDS LFAQILVGAVNMLTTVLALAIIDRVGRKKLVYWGVSGMVLSLLLIGTCFLTGERSGMPDG VLLAAFLCYIFCCAISVCAVVWVLLSEMYPIRVRGAAMSIAGFALWVGTYLVGQLTPWML ANLTPAGTFFLFAAMCVPYMLLIWKAVPETSGRTLEEIERYWTR >gi|313159403|gb|AENZ01000009.1| GENE 74 72797 - 73894 1586 365 aa, chain - ## HITS:1 COG:no KEGG:FB2170_00105 NR:ns ## KEGG: FB2170_00105 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium_HTCC2170 # Pathway: not_defined # 56 357 3 308 309 374 55.0 1e-102 MNGHYPKPGGLAAALAVLIAVVLAACSPKGAAGGDAACGLSDDDFRTEGILQAIPVRATF LDEVSWDIPHQNWGVREWDADFRAMKNMGINTVVLIRAGLGRWIAAPFECLLESEEVYYP 
PVDLVEMFLTLADKYDMAFYFGMYDSGKYWQEGLFQREIDLNLKLIDEVWARYGHHESFQ GWYLSQEISRRTKNMSRIYAEVGRHAKAVSGGLKTMISPYIHGVKTDQVMAGDKALSVEE HRREWNEILGNVAGAVDILAFQDGQVDYHELYDYLVVNKALADKYGMECWTNIESFDRDM PIRFLPIKWEKLLLKLDAARRAGMRNVITFEFSHFMSPNSAYPQAGHLYDRYCDYFKIDN PYRKR >gi|313159403|gb|AENZ01000009.1| GENE 75 73901 - 74836 1594 311 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0025 NR:ns ## KEGG: Pedsa_0025 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 1 310 1 310 312 436 65.0 1e-121 MRIKGTFLDEISHDIPHQNWGEAEWDRDFGYMREAGIDTVILIRCGYRRWQTFPSRVLTA EEGCYEPPVDLVDMFLRLSDKWGMNFWFGLYDSGKYWDRGEYEQEVALNKRVIDEVWSRY GHYKSFAGWYISQECSRNTGKIVDLYASLGRYCKEVSGGLPTMISPYIDGSKNVSQYNAS TSKQNVVTLESHEREWDAIFAGIKGAVDIVAFQDGHVEYDELVDFLKVNKKLADRYGLEC WTNSETFDRDMPIKFLPIKWEKLRLKMGLAAQAGYQNAITFEFSHFMSPQSAYLQAGHLY DRYMEYLKTLE >gi|313159403|gb|AENZ01000009.1| GENE 76 74860 - 76260 1772 466 aa, chain - ## HITS:1 COG:no KEGG:BT_0446 NR:ns ## KEGG: BT_0446 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 466 1 469 469 291 41.0 5e-77 MEKRAYTVDLLRGLAIVGMVLSGQILWHAELPAWLFHAQVPPPSFRFDPSVPGITWVDLV FPFFLFSMGAAFPLALRRRLEQRGESVALILTVVARRWALLALFAIALANLRSGVTGTLP GWGSSLLQLAGWGCFCALFMRFGRLSDRQNRMVCLAGVAGIAALLAAARWIWGLPVSAER SDVIILVLANMALFGSLVWLYTRNNLLARLGVLALLAALRLGSGVEGSWNEALWNWSPAP WLFRFDYLKYLCIIIPGTIAGDRIYEWMTQSGEDAPGASRRREVWILVLLVTLICLNMWG LFARQLVVNLAAGVLICLLLRRLLRGDGSATGRLHGSLFGWGFFWLMLGLALEAFEGGIK KDYATFSYFFVTSGLASFVLIAAGIAMRRLNVRFSALVKCGQNPMVAYTAAGFLIMPLLT LLHLAPYLQTFAELCPWMGVVRGVLVTAAVMAVTVFFTNRRLFWRT >gi|313159403|gb|AENZ01000009.1| GENE 77 76273 - 76938 998 221 aa, chain - ## HITS:1 COG:all1288 KEGG:ns NR:ns ## COG: all1288 COG2755 # Protein_GI_number: 17228783 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Nostoc sp. 
PCC 7120 # 69 210 231 364 383 95 33.0 7e-20 MKRIVILFLAALFAASGASAQKFHNAYYDTRRAAHDEEGLQQGAIVFLGNSITEQGWWSL LLKRGDVENRGIGGDNTFGMIDRLPDILKSKPRKIFLMAGINDLTGGQPVDTIVMNITRM ADMVHEAVPGCRLYIQSVLPVNTRRLAYPGLKGHNPQVRALNARLVRLCDAKPWCTFVDL APLLSDADGELRIDLTKDGIHLHPAGYVIWTDYLKKQKYLK >gi|313159403|gb|AENZ01000009.1| GENE 78 76941 - 78326 2107 461 aa, chain - ## HITS:1 COG:no KEGG:BT_0449 NR:ns ## KEGG: BT_0449 # Name: not_defined # Def: putative S-layer related protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 459 6 449 453 430 51.0 1e-119 MKKNVKWFAAALVVLTLGYGAVSCGSDSKEPEWEWPDPDPDPDPEPGVEKPCFIWVDAAA NFPDFANSKENILRDLTKAKEAGFTDIVVDVRPTTGDVLFRTSVVDQVEWLGAWLPGGYS KVERTATWDYLQAFIDAGKSLDLRIHAAINTFTGGNQTLLGGAGVVFRDEAKRAWTTDLN LAGGITNIMSTSQSAKFFNPVLPEVQEYLCSMLRDLAAYDGLAGIFLDRGRFDGFTSDFS NYTRKEFEKYIGQSVAGFPADILPAGHTSGIPSPVPVHMKQWLEFRAKVIHDFMEKARAA VKSVNPSVKFGVYVGGWYASYYDVGVNWASPNYDTSSKFSWATKKYMNYGYADLMDQMLI GAYASPTRVYGTTEWTMQGFCLLAKERTMGACPMVAGGPDVGNWDADDKVPQEEENRAIT ASVAACINACDGYFLFDMIHLKKADQWSYVKTGIDGVIKKD >gi|313159403|gb|AENZ01000009.1| GENE 79 78391 - 80052 1896 553 aa, chain - ## HITS:1 COG:no KEGG:BT_0450 NR:ns ## KEGG: BT_0450 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 553 1 565 565 347 38.0 9e-94 MKNSIKLFALAALLGLGGCHEPDELTPSVVDQGLNTISAQFATGEYKNDPLAKFTAAVTE GQERIVIDVPYYYPENSSNTTSITQMRVTANLDDNCFITPMLGVLDLTKENWFTLTRVDG SKRNFCVTGNIKKSDKCEIISFATDEPAIQGVIDNDRNTISLITVDDLSSVTAQVILSPH ATISPDPAVAANYEGEGVKFTVTAHDGVTKRVYSVSKEIPPKIKYGWRAGSETEVWQNVS LDKLGIVNAGGKIYSLAASGNKLVLSTGADKFLFNRATGDFLGTHDMKGVAADGGMTSDD AGNILYANLANPNAEFKVYAAASTDEMPAELLSYTNATGASMGKHISVQGNVKGDAIVTA VIYTWNGAVCKFLRWVITGGVPAKPQMISVTGATAGWNGNGHADVEAYSANPDDPYFLAY YSANALYRVDATGAVTHKIATATWGANSNYNCVDVCTFNNAKYAAIYEGAHFTYGEFKAY MFDVTTPDQMTGACDTSPSKVFVSNEYKPAGAVVNAAGDVLMTASADGYKMNLYYTDANT NALVAWEFDCIDK >gi|313159403|gb|AENZ01000009.1| GENE 80 80074 - 81738 2135 554 aa, chain - ## HITS:1 COG:no KEGG:BT_0451 NR:ns ## KEGG: BT_0451 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 553 1 552 553 528 51.0 1e-148 MKKKIINIVLLAATVAMTGCNDWLDMSPTNKVSEKIVWSDVNYVTQYVNGFYPYISRYGA FETGDSQVGLTDGLTETLKFGSATPGTNVGFANIIAFAEGGLAASTAAFHFGAWDDTYTR IRRVNEFLYGLKKYGSNFDGDTYKRFEAEARFFRGFLYFQLLKRTPEVILYNEDLLAITE NKALSTEEEGWNMVESDLSYAAKTLPAKWDNEAGRITSGAAYAMLSRAMLYAKRWQSAKT AAEEVFKLGYVLMPGKTAADYAKAFTSMRDGNTESILEYNYLVGGPNHSWDKLFMPGGDN TTMGGRATPTQEMVESYELATTGGYPDWTPWHNTTTGTTDTPPYAQLEPRFHASVLYNGC EWKGRALEPYVSGKDGYAVYDDGSALNGRTTTGYYLRKMLNEGYTDYSKACTQPWIAIRL AEVILNHAEACYMLGADGANDDLRQIRERVGLPYSNKSGEALMAAIRQERKIELYCEGHR YWDMRRWKLAHTAYSGQNSRVHGLKIEKQGAEFIYTYVDCDRKDRLFEEKLYRIPLPETE LSNNSAVKQFPEWL >gi|313159403|gb|AENZ01000009.1| GENE 81 81757 - 84633 3145 958 aa, chain - ## HITS:1 COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 958 106 1074 1074 959 52.0 0 MKEDAARLDEVVILGYGSQRRADVTAAVSQIDGKELQKMPMSNISQGLAGRLPGLISVQN SGQPGQDQASMTIRGAKSGILYIVDGVQRSINDIDPNDVESVSLLKDGAAVAVYGLEAAG GVMIVTTKKGRQGAMNLTYKGSYGASFNTSYPEFLDGPGYAYWYNKALELDGQEPIFTAK HVQMMREGTNGWGNTNWIDEIFGTGTIQQHSVTSTGGSDRISYFASLGYMNQTGNVKNFD YDRYNLRTNIEAKIAKSLTFTFGVAGQIGDRKEPGFPAGGTSGTSVSTAPWLSIAEQAAY AHPYLPKTYDGLPTASQNNYGNTINPIAATELSGYSKSQTVAVQTNAALQWDLPWVKGLS LKVMGAYDRSSTTSKILSTPYFLMLASTPNSTSQDITYSKASDPRNDANVTTGKKTNNLT EGFSQWRRITSQASISYKNTFAEKHKVDLLALIETRDYKTNKFSASGKDIPFPQLPELDQ ATTPADKPIGGSSDASRKVGLVFRGQYNYADKYLFEFSGRYDGSYKFIGNVSGRRWGFFP SVSLGWRMSEENFFAPLRNAVDNFKLRASFGSLGDDSGSAAYAFLSTFSNVSTAPSVVLG GVGQNGIMTSLVANELLTWERNYSYNVGFDLAMWSGKLGVEFDAFYNYIFDILGSNTGKP ASMGGYYPTYINNNAQDVKGIDVKLTHRNHIGKDFQYGVTVNMTWAKDRWLRYQDSPNTP DYAKRTGKSRYMTMGFMADGLFQSEEEIDNSPWISGSRPRVGDIKYKDLNGDGAITYDQD RAYLGRGQRPQFNGSVNLNASWKGLEFDILFVGAAVCDVSLTGTYYNWNEDNTIFTRPFK AGANSPRYLVEKAWTPENPNAEYPRLSLNPPNTNNSYASSFWYRDGKYLRLKSVHLGYTL PKNWTNKLRIQKVKVYIEGNNLCTWSGLPEGVDPEWPGVTNGYYPQQRTVVGGLEVSF >gi|313159403|gb|AENZ01000009.1| GENE 82 84641 - 84949 288 102 aa, chain - ## HITS:1 
COG:no KEGG:BT_0452 NR:ns ## KEGG: BT_0452 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 97 1 98 1074 75 45.0 5e-13 MKEFFSRFRDSLAARTCLWTLFLLLAGAGTAFAQPTVGGTVTDEQGNPLVGVTVVVVDSN KGTTTDINGQYSIAVPKGGILEFTYIGMSSQRITCGGGDFQP >gi|313159403|gb|AENZ01000009.1| GENE 83 84998 - 86308 1923 436 aa, chain - ## HITS:1 COG:no KEGG:BT_0447 NR:ns ## KEGG: BT_0447 # Name: not_defined # Def: sialic acid-specific 9-O-acetylesterase # Organism: B.thetaiotaomicron # Pathway: not_defined # 23 436 473 883 884 422 48.0 1e-116 MKKLLFICLAAFAAASCTQEKQQSKPLYMWFDCEGNYATLSHPDSIRYYVSKIHDMGFTD VVVDVKSIMGETLYKSDIAPYMGEWDGVTRPENYDLLGYFIEEGHKLGMRVHGSLNVFAG GHNFFDRGIIYGDHADWQSQVYTEGKIVPISEIKSNYNGMLNPANPEVQEYELAILREFA EKYPDVDGIVFDRVRFDNITSDFSPLSKELFEAYAGTKVADYPDDILRWTQDADGKWDWS QGPLFRKWIEWRASVIKDFVTEAHRQLKEINPRLLVGDYTGAWYPTYYYVGVNWASEQFD PARYFDWATPEYRNTGYADLLDIYMTGLYYTLVTKAEVDKANGVVGQRTEAGMSDEQNYW YCIEGGAEWAKKITCGVVPVTGSIYVEQYEGDAAQFTRAVAQALRDTDGLMIFDIVHIIN RGWWSELAAGIAEGSN >gi|313159403|gb|AENZ01000009.1| GENE 84 86324 - 87352 1024 342 aa, chain - ## HITS:1 COG:CAC2630 KEGG:ns NR:ns ## COG: CAC2630 COG4632 # Protein_GI_number: 15895888 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 105 320 151 331 347 67 29.0 3e-11 MKLKFLGFALLAVCVCAVCCMCGSDSKTPDYEFPDGPDPDPDPQPGDYPAGLTVTEFTDD LGGGKQCLGFVAVADLKANPKLRFNAVHLPQQKIPSRIHAEFASANRGTACVTINAGYWW AGNSLSLLVTGGTVKSIENQTVTRNNQTVYPVRSSFGQMASGGFETHWIYCVLDDGNKPY AFPSALDNDERTNTYMSAPPTSKTPGAVLWTPQEAVGGGPMLVKEGKNVAVENYWKEVFD GGGIAGTSRQPRTAVGATADGKLILLVCDGRNMRGSAGFTLAELADKLIELGAVDAVNLD GGGSSTMVGSDGKVLNRPSDTGSAEVIVERKISTAVVISEVN >gi|313159403|gb|AENZ01000009.1| GENE 85 87810 - 88022 435 70 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2647 NR:ns ## KEGG: Bacsa_2647 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # 
Pathway: not_defined # 9 66 5 62 72 73 63.0 2e-12 MAVQTLKKVFRFYVDGFRSMTVGKTLWAIILVKLFIMFAILKLFFFPDFLAGQSPEERSR SVMKELTPEK >gi|313159403|gb|AENZ01000009.1| GENE 86 88033 - 89604 2474 523 aa, chain + ## HITS:1 COG:Cj0081 KEGG:ns NR:ns ## COG: Cj0081 COG1271 # Protein_GI_number: 15791471 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 1 # Organism: Campylobacter jejuni # 6 517 4 504 520 560 54.0 1e-159 MLSDYLQTVDWSRAQFAMTAIYHWLFVPLTLGLGFILAIMETLYVRTGDEFWKRTTKFWM RLFGINFAIGVATGIILEFEFGTNWSNYSHFVGDIFGAPLAVEGIMAFFMESTFIAIMFF GWNKVSKGFHLTATWLTAIGANLSALWILVANAWMQYPVGCTFNLETVRNEMTSFWDVLF SPVAMNKFFHTVTSSFVLASLFVVGVSAWYLLRRREQKMARKSIAIASAFGFIFALITAT TGDRSGAVIARVQPMKLAAMEALYDGSQGAPLTAIGIVKPEAERTSNEDAFYFKIDIPKL LSIMSFRDADAYVAGINDLVNGNERYGVMSAGEKIERGRVAVGELARYRKALDEGDQATI DEVTAKFDPATPQGEEFLREYFAYFGYGYLDAPQDIVPDVPLLFYSFRVMVGAGCFFILL LGVVWWLNRKDRLADKRWLLWIAVWSVPLAYLASQAGWVLAEVGRQPWAIQDLMPVGIAA SKIGSTSVTATFFIFLALFTALLIAELSILFRQIKIGPEQEKE >gi|313159403|gb|AENZ01000009.1| GENE 87 89611 - 90762 1975 383 aa, chain + ## HITS:1 COG:Cj0082 KEGG:ns NR:ns ## COG: Cj0082 COG1294 # Protein_GI_number: 15791472 # Func_class: C Energy production and conversion # Function: Cytochrome bd-type quinol oxidase, subunit 2 # Organism: Campylobacter jejuni # 7 383 10 374 374 318 51.0 8e-87 METITLLQHYWWLIISLLGALLVFLLFVQGGQGLLYSLGRTEEERNMLVNSLGRKWEFTF TTLVTFGGAFFASFPLFYSTSFGGAYYVWMTILLSFVIQPVAYQFRRKAGNVFGEKTFDV FLTINGIAGPLLLGTAVGTLFTGANFTVDRLNLANGAGSGSATVISQWTTPWHGLEALGD MRNVALGLAVMFLAGMLACQYFMNNIADETLFARARRRMLTLAAPFLVFFLTFFVWLLFS DGLAVDAAGRISAEPYKYLHNMLEMPYVAAALLIGVVSVLWSIYSGWRGKRNAVWFGGAG TVLTVLALLLCAGWNNTAYYPSLAEMQSSLTIYNSSSSEFTLKVMSVVSLMIPFVAAYIW YAWRAMNRKPITREEIRGNDHMY >gi|313159403|gb|AENZ01000009.1| GENE 88 90824 - 91693 1408 289 aa, chain - ## HITS:1 COG:SMc00326 KEGG:ns NR:ns ## COG: SMc00326 COG0623 # Protein_GI_number: 15963999 # Func_class: I Lipid transport and metabolism # Function: Enoyl-[acyl-carrier-protein] reductase (NADH) # 
Organism: Sinorhizobium meliloti # 5 260 4 256 268 124 31.0 3e-28 MAYNLLKGKKGLIFGALNEQSIAWKVAERAVEEGAEIVLTNTAVSIRMGTIGRLAEKCNT IVVPADATSVEDLENLIDKTMEHFGGKFDFMLHSIGMSPNVRKGRTYDDLDYDYLSKTLD ISAISFHKAIQVARRKDAINDWGSIVALSYIAAQRTLYGYNDMADAKALLESIARSFGYI YGREKHVRINTVSQSPTPTTAGSGVLGLGDLMEFAENMSPLGNASAEDCADYVLTLFSDL TRKVTMQNLYHDGGFSSMGMSRRAMKTYEKGLRFDDVHQNQYPFGDGEK >gi|313159403|gb|AENZ01000009.1| GENE 89 91876 - 93843 2971 655 aa, chain + ## HITS:1 COG:FN1975 KEGG:ns NR:ns ## COG: FN1975 COG0513 # Protein_GI_number: 19705271 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Fusobacterium nucleatum # 8 525 9 525 528 530 51.0 1e-150 MHMTITDDFSALGLSEQMLTAVRAKGFETPTAIQKLTIPHLLTKTNDIIAQSQTGTGKTA AYGLPILQSLEPARGPVQAIILVPTRELALQAAEELLSYNREKRLSITAIYGGAAMSEQL RRLAKGVDIVVGTPGRVRDHIRRGTLKLENVRYLVLDEADEMLNMGFVEDVEEIMSHTSE ERRVLLFSATMPERIIRLSKTYMRDTEIVRVENKQLTTDLTEQIYFEVREADKFDALTRI IDVEPEFYGIIFARTKIGADETVSRLAARGYAAEVLHGDVSQAQREKILRKFRDRSVNIL VATDVAARGIDVGNLTHVINYSLPQDSESYVHRIGRTGRAGKQGTAITFVSPSEFRGLNN LMRDIKVEIKRETLPSPQDIVEMKRLKIKDEMQEIVENESYDGYREFAEELLAEYTPDVA LGALLRLAFRSELDQSNYPEIRSFSVDRKGTARLFLAVGRRDGYTARKLVDMLKFKCGLR DKYINDVQISDNFSFVSVPFHDAEEIVRKLNRLNRGRRPIAEIARDGEEAAARKPRRAKT ADAGGEEYAPAPKKSVRREKPAAPAQTAPAAGDESAAPRRKLKTKIQTEPRPEIPDFSKM SNEGFDWSAFMKFDDGTAWGRDENGKGKGAKSVRKTPKRTVTAAQRIAAKGKKRK >gi|313159403|gb|AENZ01000009.1| GENE 90 94139 - 94699 135 186 aa, chain - ## HITS:1 COG:no KEGG:Ajs_3985 NR:ns ## KEGG: Ajs_3985 # Name: not_defined # Def: hypothetical protein # Organism: Acidovorax_JS42 # Pathway: not_defined # 8 144 4 145 179 113 42.0 3e-24 MNEEKECIVIIDNSNVWIEGMKLSALKEGIKSSPIKEKEPCDYSWRLSFGNLLNKVSEGK KVVSTLLVGSRPPKNDSLWTSAKKQGFQVSVFDRNTQGKEKAVDAQIVAQGTKMICTHPN KGVLILLSGDSDFIPLLEICNELGWESEIWAFKSALPCAKKMIQYVTRVNYLDSIFSDIG YRQKKK >gi|313159403|gb|AENZ01000009.1| GENE 91 95255 - 96268 1030 337 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|313159463|gb|EFR58826.1| ## NR: gi|313159463|gb|EFR58826.1| hypothetical protein HMPREF9720_0765 [Alistipes sp. HGB5] # 1 337 1 337 337 659 100.0 0 METNQNPAHENPAQPIVPETSATVQPGQQLVPGQPNAQSASEQPATTPEQSVPELQPDHP APAGQLYDPNEAAQIVLHLTDGYFDPEYILLFGKLVGGTPHSDAMAYDLLMVVRETPEYD WIRAKRILRYRMPYRHRKITYINLYILPLNYVESNWTPFLYFARSEGELLYCSDRHHFRR PKHPCNFAASYADARFHFDTFRTLGNDLLEQAQDAFVESRNVRLAAQFTAQAVVYFYHTL YYVYHGMEFDIHDPVVMHDRMRTLSTKLMLVFDDNHIENIFTLPCLKQILMKTPYSAEFY MAPQELEMHMDRVQKAAEIIENYCELRLERYKELSER >gi|313159403|gb|AENZ01000009.1| GENE 92 96610 - 96870 389 86 aa, chain + ## HITS:1 COG:YPO3154 KEGG:ns NR:ns ## COG: YPO3154 COG0776 # Protein_GI_number: 16123316 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Yersinia pestis # 1 80 1 80 90 71 48.0 4e-13 MNKAQLVEAIALDANISKIDARKAVDAIIRVTVQSLREGERLTLTGLGTFSVQQKAARVG RNPRTSAAVKIPPRKAVKFRPTIELE >gi|313159403|gb|AENZ01000009.1| GENE 93 96902 - 98098 1805 398 aa, chain + ## HITS:1 COG:CAC1001 KEGG:ns NR:ns ## COG: CAC1001 COG0436 # Protein_GI_number: 15894288 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Clostridium acetobutylicum # 5 395 4 394 395 366 46.0 1e-101 MPEISQRAELMPASPIRKLVPLAEAARSRGIRIYHLNIGQPDLPSPQTGLAALKKIDRKV LEYSPSDGYRSLREKLAGYYQQYQIKLSPEEIIVTTGGSEAVLFAFMSCLNPGDEIIVPE PAYANYMAFAISAGAVIRPVVSSIEQGFALPDVAEFEKLINERTRGILICNPNNPTGYLY TRKEMERIRDLVKKYDLFLFSDEVYREFIYTGSPYISACHLEGVEQNVVLIDSVSKRYSE CGIRIGALITKNKKLRDAVMKFCQARLSPPLIGQIVAEASIDAPRSYSTEVYEEYIERRK CLIDGLNRIPGVYSPIPMGAFYTVARLPVEDSDDFCAWCLSDFEYEGETVMLAPASGFYS DPTRGRNEVRIAYVLKKEDLERSLLILGKALEAYNNRK >gi|313159403|gb|AENZ01000009.1| GENE 94 98325 - 99776 2043 483 aa, chain + ## HITS:1 COG:XF1053 KEGG:ns NR:ns ## COG: XF1053 COG2067 # Protein_GI_number: 15837655 # Func_class: I Lipid transport and metabolism # Function: Long-chain fatty acid transport protein # Organism: Xylella fastidiosa 9a5c # 17 444 27 433 447 90 22.0 9e-18 
MKKIFLLLAAAAAASGAYAEGYQVNNLSSKQNGMGHVGTAMKLNSESIWFNPAAASFQDS EFDISVGITGIASKATYTSLPDYTGKTKPQHYTSDNSIATPLYAYFNYKPLDWMSVGLAF NTPFGSSMDWGNNWPGAQLVQSINLKAYNVQPTVSFKLCEHLSVGGGLMMTWGKFDLSRS LLPIGADNAQNNALMGALQMPNLFEMAGDKNLMSLGLEGKAKMAVGVNLGMMWDINEQWS LGFTYRSKLKMKVDSGSIDLRMIDNAQIAQAIGQLLPQLGLDPAKLTSAVVKTELPLPAS LTWGVSFRPVPKWEFAVDLQYVLWSAYDQLDVRILDPDSNTPVLNIPVSDKKYSNTLAFR FGGQYHALDWLTARMGMYVDESPVRSDYLNPETPSMTKLAYTAGVTFRPIKWMNIDIAYG YVNSADPERTGSYPYTNSVLGIVNEKLTAAGVPESQLTSPIADFSGNYTARAHTFSIGVG FSF >gi|313159403|gb|AENZ01000009.1| GENE 95 100124 - 101329 1640 401 aa, chain + ## HITS:1 COG:FN1106 KEGG:ns NR:ns ## COG: FN1106 COG1760 # Protein_GI_number: 19704441 # Func_class: E Amino acid transport and metabolism # Function: L-serine deaminase # Organism: Fusobacterium nucleatum # 1 398 1 400 408 399 50.0 1e-111 MESLKELYKIGNGPSSSHTMGPKKAAERFAERCHDADAYRVTLYGSLAATGKGHLTDAAI LSVLASLAPTEIVWKPEVVLPFHPNGMLFEGLKAGNVADSWTIYSIGGGALANETSRLET PHSIYPLTTVSEIKAWCSHEGKTYWEYVTDCEGPEIWGYLDEIWTVMCETIQRGLNNDGV LPGGLKVARKASTYWVKSKSYTDSLKSRAQIYAYALATSEENASGGTVVTAPTCGSCGVV PAVLYHLANSRNFLRIRILRALATAGLFGNVAKTNASISGAEVGCQGEVGVACAMAAAAA CQLFGGTPAQIEYAAEMGLEHHLGLTCDPVCGLVQVPCIERNAIAAARAFDANAYATLSD GSHMVSFDKVVEVMNETGHNLPSLYRETSTGGLAKRYNDKK >gi|313159403|gb|AENZ01000009.1| GENE 96 101905 - 106164 6256 1419 aa, chain - ## HITS:1 COG:mlr0277 KEGG:ns NR:ns ## COG: mlr0277 COG0086 # Protein_GI_number: 13470543 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, beta' subunit/160 kD subunit # Organism: Mesorhizobium loti # 13 1397 18 1356 1398 1306 50.0 0 MSISKDNKTNNGYSRISIGLASPEEILAQSSGEVLKPETINYRTYKPERDGLFCERIFGP VKDYECHCGKYKRIRYKGIVCDRCGVEVTEKKVRRERMGHISLVVPVVHIWYFRSLPSKI GYLLGIPSKKLEAIIYYERYVVINAGAAAEQGIERLATLSEKEYLDVLAALPKGNQALDD SDPNKFVAQMGAEAIYTLLKTVDLDSMSYALRHKASTETSQQRKSEALKCLNVIESFRAS EGKNKPEWMVLNVIPVIPPELRPLVPLDGGRFATSDLNDLYRRVIIRNNRLKRLIEIKAP EVILRNEKRMLQEAVDSLFDNSRKSNAVKNESNRPLKSLSDSLKGKQGRFRQNLLGKRVD YSARSVIVVGPELKMHEMGIPKDMAAELYKPFVIRKLIERGIVKTVKSAKKIIDRKDPVI 
WGILENVIKGHPVLMNRAPTLHRLGIQAFQPKLIEGKAMQLHPLACTAFNADFDGDQMAV HLPLGNAAILEAQLLMLGSHNVLNPANGAPITVPSQDMVLGLYYITKPRKGVKGEGRVFY GPEEAIIAYNERQADLHAIVKCLVDDIDENGNPITELKETTIGRILFNQVVPKEVGYINE ILTKRSLRDIIAVVMKKAGADKVAAFLDDIKHMGYQMAFRGGLSFNLDAVIIPEEKEKLV QEGYERSDSIMEDYNMGLITNNERYNQIIDVWTNINTKLTKVVIDTLIKDDDGFNPVYMM LDSGARGSKEQIRQLSGMRGLMAKPQKSGVEGGQQVIENPILSNFKEGLSVLEYFISTHG ARKGLADTALKTADAGYLTRRLVDVAQDVIINEEDCGTLRGLTATAIKRNDDVVQTLYDR ILGRTALNDVIHPLTGEVICKAGEEITESIAEAIEKSPLESVEIRSVLTCEARRGVCAKC YGRNLATARMVQKGEVVGVIAAQSIGEPGTQLTLRTFHVGGVAGGTAVETNVVSKYEGRL EIDELRTVKGKNAMGEAIDIVISRQSEFRIVDPKTEIVLYTHNLPYGATLFMADGAEVKK GDMICEWDPYNAVIISEYEGRAVYENIIEGVTYRDERDEQTGLSEKVVTESKDKTKNPVI KIVNKEGEEVKQYNLPVSAHVVVKDNAKIKAGDILIKIPRAVGKSGGDITGGLPRVTELF EARNPSNPAIVSEIDGEVSFGKIKRGNREIIITSKQGDMKRYLVPLSRQIIVQENDYVKA GSPLSDGAITPSDILNILGPTKVQEYIVNEVQEVYRMQGVKINDKHFEVIVRQMMNKVRI EDPGDTRFFEDQTVDKWEFMDVNDELYDKVVVTDAGDSTSLQPGQIVSLRKLRDENSSLK RRDMKPVQVRDIIPATSTQVLQGITRAALQTSSFISAASFQETTKVLNEAAIQAKVDPLE NLKENVICGHLIPGGTGLREYDNLVVGSKAELESLQQAQ >gi|313159403|gb|AENZ01000009.1| GENE 97 106201 - 110061 2963 1286 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796927|ref|ZP_02190884.1| 30S ribosomal protein S12 [alpha proteobacterium BAL199] # 3 1285 10 1387 1392 1145 46 0.0 MSTAKTQQRISFSTVKNRVPYPDLLEVQLKSFRDFFQMDTTAENRKNEGLYKVFQENFPI TDTRNNFVLEFIDYYIDPPRYSIEECLERGLTYSVPLKAKLKLYCTDDEHEDFGVVVQDV YFGTIPYMTERGTFVINGAERVIVSQLHRSPGVFFGQSMHTNGTKLYSARIIPFKGSWIE FATDINNVMYAYIDRKKKLPVTTLLRAIGYEADQQILEIFDLADEVKVTKANLKKAVGRK LAARVLSTWVEDFVDEDTGEVVSIERNNVIVDRETVLEEEHVDQILESGAKTILLHKESL SGIDFSIIYNTLQKDPCNSEKEAVVYIYRQLRASEPPDEATARDVIEKLFFSDKRYDLGD VGRYRINKKLDLDTDPAIRTLTREDIIAIIKYLIQLINSKADVDDIDHLSNRRVRTVGEQ LSNQFSVGFVRMARTIRERMNVRDNEVFTPVDLINAKTLSSVINSFFGTSQLSQFMDQIN PLAEITHKRRLSALGPGGLSRDRAGFEVRDVHYTHYGRLCPIESPEGPNIGLISSLCVYA KISPMGFIETPYRKVENGQIDMNNDDIRYYSAEEEEGKIVAQANMPIDDEGHFLQPDRIK AREGADFPVVTAEEVNLMDVAPNQIASIAASLIPFLEHDDANRALMGSNMMRQAVPLVTT DAPIVGTGIEKDMISDSRIQIVAEGDGEVTFADATKIQIRYERTEDEILASFEPEVTTYE 
LPRYRRTNQNTSITLKPIVLTGDKVVKGQILTEGYSTQHGELALGRNLKVAFMPWKGYNF EDAIVISERIQREDIFTSVHVDEYIMEVRDTKRGVEELTSDIPNVSEDATKDLDANGIIR IGANVHPGDILIGKITPKGESDPSPEEKLLRAIFGDKAGDVKDASLKAQPSLHGVVIDTK LYSRANKEGKKGKSAEKVQIEKLDEKFAAEIAELTKRLVAKLWTLLQGKTTTGVTDYFGV ELYPAGTKFSQKMLEEIARKTTDEKTGVVMGYLNLGTCKWTGDAHTDALIEATINNYTIE WKKADAAIKREKYNITNGDELPQTGVIQMAKVYIAKKRKLKVGDKMAGRHGNKGIVARIV RDEDMPFLEDGTIVDICLNPLGVPSRMNLGQIYETVLGWAGRELGLKFATPIFDGASLDQ INEYTAKAGIPHSGRTYLYDGGTGEMFDQPATVGVIYMLKLGHMIDDKMHARSIGPYSLI TQQPLGGKAQFGGQRFGEMEVWALEGFGAANILQEILTIKSDDVMGRAKAYEAIVKGENL PKPGIPEAMNVLLHELRGLALSVKLE >gi|313159403|gb|AENZ01000009.1| GENE 98 110306 - 110683 441 125 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|86131816|ref|ZP_01050413.1| putative 50S ribosomal protein L7/L12 [Cellulophaga sp. MED134] # 1 125 1 124 124 174 75 3e-42 MADVKKLAEELVNLKVTEVNELATILKEEYGIEPAAAAVAVAAPAAGAGEAAAAEKSTFD VILKTAGQAKLQVIKVVKDLAGLSLGDAKALVDGAPKAVKEGVSKEEAESIKAQLEEAGA EVELK >gi|313159403|gb|AENZ01000009.1| GENE 99 110725 - 111249 501 174 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|120436717|ref|YP_862403.1| 50S ribosomal protein L10 [Gramella forsetii KT0803] # 1 172 1 171 173 197 61 3e-49 MTKEEKLVVINAIAEQLQAYPHFYIADIAALDAEQTAALRRKCFENDVKLVVVKNTLLGK ALEKVEKADADLVKVLEGPTSIMFANVAKAPAVLIKEFRKKSDKPVLKAAFAEGCVYVGD DQLDALCNIKSKEELIGDIIALLQSPAKNVISALQANAGQKIAGIVKTLESRNN >gi|313159403|gb|AENZ01000009.1| GENE 100 111267 - 111965 950 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150008877|ref|YP_001303620.1| 50S ribosomal protein L1 [Parabacteroides distasonis ATCC 8503] # 1 230 1 230 232 370 77 1e-101 MSKLTKNQKIAYAKVEANKAYKLSEAAALLKEITFTKFDASVDIDVRLGVDPRKANQMVR GVVTLPHGTGKQVRVLVLCTPEKEAEAQAAGADYVGLDEFVDKIKGGWTDVDVIICTPNV MGKVGALGRILGPRGLMPNPKTGTVTMEVGKAVQEVKSGKIDFKVDKFGIIHTSVGKVSF SADQIVDNAKEVLNMILKLKPAAAKGSYVKSIYLSTTMSPGLQIDSKSVETK >gi|313159403|gb|AENZ01000009.1| GENE 101 111972 - 112412 625 146 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715481|ref|YP_101473.1| 50S ribosomal protein L11 [Bacteroides fragilis YCH46] # 1 144 
1 144 147 245 81 1e-63 MAKEVAAFIKLQIKGGAANPSPPVGPALGSKGVNIMDFCKQFNARTQDKAGKVLPVIITV YSDKSFDFVVKQPPVAIQLKEAAKVQKGSAQPNRDKVGQVTWDQVREIAQDKMPDMNCFT LEAAMRMIAGTARSMGINVVGEFPNM >gi|313159403|gb|AENZ01000009.1| GENE 102 112429 - 112989 809 186 aa, chain - ## HITS:1 COG:BMEI0744 KEGG:ns NR:ns ## COG: BMEI0744 COG0250 # Protein_GI_number: 17987027 # Func_class: K Transcription # Function: Transcription antiterminator # Organism: Brucella melitensis # 7 185 4 174 175 138 42.0 7e-33 MSEIKKQWYVVRAIGGKENKVKEYIEAEIRHNNLEEYISQVLIPTEKVYTIRNGKKVSKE KVSYPGYVLVEAAFVGQIPIIIRNTPNVLGFLGDTKEDSRKMNATPLRPQEVARILGRVD EMNAMEEENEIPFFVGETVKVTDGPFSSFQGTIEAVDNERKKLTVSVKIFGRKTPMELGF TQVEKE >gi|313159403|gb|AENZ01000009.1| GENE 103 113004 - 113201 277 65 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159420|gb|EFR58783.1| ## NR: gi|313159420|gb|EFR58783.1| preprotein translocase, SecE subunit [Alistipes sp. HGB5] # 1 65 1 65 65 105 100.0 1e-21 MFNYVKESYNELVNKVTWPTFPQLQSSTIVVMVASVIFAIVVLAMDLTFENLMAVIYKTL GNLGR >gi|313159403|gb|AENZ01000009.1| GENE 104 113343 - 114530 1391 395 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|119502908|ref|ZP_01624993.1| Ribosomal protein S19 [marine gamma proteobacterium HTCC2080] # 1 395 1 407 407 540 65 1e-152 MAKEKFDRSKPHVNIGTIGHVDHGKTTLTAAITTVLAKKGLSELRSFDSIDNAPEEKERG ITINTSHVEYQTANRHYAHVDCPGHADYVKNMVTGAAQMDGAILVVAATDGPMPQTNEHV LLARQVNVPRIVVFLNKCDMVDDPEMLDLVEMEVRDLLSKYEYDGDNAPVIRGSALGGLN GEPAWEDKIMELMNAVDEYIPIPQRENEKPFLMPVEDVFSITGRGTVVTGRIETGIIHVG DPVEIVGLEEKTLTSTCTGVEMFRKLLDEGEAGDNVGLLLRGIDKKEVKRGMVVAKPGSI TPHTEFEAEVYILKKEEGGRHTPFHNNYRPQFYLRTMDVTGEVHLPAGVDMVMPGDHVTI TVKLIYPVAINEGLRFAIREGGRTVGAGQILKIVK >gi|313159403|gb|AENZ01000009.1| GENE 105 115453 - 115974 229 173 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229235658|ref|ZP_04360081.1| acetyltransferase, ribosomal protein N-acetylase [Chitinophaga pinensis DSM 2588] # 1 165 1 167 180 92 33 1e-17 MKTIPLFDDFSLRSLRSGDAPAIFGAIDTQREHLGRWLPFVAATHRVEQTQEVVAGMLND TANPVFTIRSGDAFAGLIGFKSADSTTRTIEIGYWLRSEFQGRGIMTSAVQALCRLAFEE 
MGMERIEIRCALGNYRSNRIPQRLGFALDRVEVRGERLADGEFVDLNVYLLER >gi|313159403|gb|AENZ01000009.1| GENE 106 116082 - 116810 879 242 aa, chain - ## HITS:1 COG:no KEGG:BT_2320 NR:ns ## KEGG: BT_2320 # Name: not_defined # Def: transcription regulator # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 234 43 272 291 110 30.0 5e-23 MVRILFVMSGSLMMEHGDTVRLVTSKQYVCLARGERFVTTAQDDAHVVVLSLIHRIEFCE QDIFDKVMPYDTVIPTEAVPRLSMHPTIERLLCDLFAVPELSECARYHKMKATELFMMIK VLYTPTEHAYFFQSMIQPQDNFRVFVCNNYDKAQGVAELAALAGMSLSVFKRRFAEHFND SVYHWMMRQKALKIFTDIRDGEDSTKVLMNKYGFRHYTQFSRFCKNYLQATPAQLIASIK EG >gi|313159403|gb|AENZ01000009.1| GENE 107 117281 - 118138 1169 285 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_3078 NR:ns ## KEGG: Bacsa_3078 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 285 1 284 284 117 32.0 5e-25 MKHVIVMALGAALALASCVSKGTVVKVEEQRDSLSVVVSAKDSLISAVFEDISSISENLA LIRTRENLLSVAGDAEGGRRPVEEINNDIAAIDRLLRENKVKIASLQSAVAQLRKANLRL DGLEKMIRDMSAQLAEKKNEIARLRENLTQMGVEVETLTEQVAVRSEQVETLNTEKVELE NQLNTVYYIVGTEKELREAQIINKQGFIGRTLTVNKTNNLDSFTKTDSRLLSEVPIGQKK VSVVTTHPEESYELVTDADKVVLKLLITDPVRFWESSKILIISYK >gi|313159403|gb|AENZ01000009.1| GENE 108 118444 - 118710 344 88 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150004229|ref|YP_001298973.1| 50S ribosomal protein L31 type B [Bacteroides vulgatus ATCC 8482] # 1 84 1 84 84 137 79 5e-31 MKKGIHPENYRLVAFKDMSNDHVFLCRSAVSTKETIEVNGETYPVYKMEISNTSHPFYTG KMKLVDTAGRVDKFMSRYAKRYDKKGAK >gi|313159403|gb|AENZ01000009.1| GENE 109 119027 - 119854 1312 275 aa, chain + ## HITS:1 COG:no KEGG:GFO_2348 NR:ns ## KEGG: GFO_2348 # Name: not_defined # Def: S1/P1 endonuclease family protein (EC:3.1.30.-) # Organism: G.forsetii # Pathway: not_defined # 1 272 1 259 260 109 31.0 1e-22 MKRLLLIAAFAALTSAQGAFAWGRLGHAAVARLAEQHLTKKAKANLDKLLDGRSIVYYAS WMDDYKPQMLVDLGYTPTNGPRMHMLPHTFSVDENGEVIRGNRLPGDKYLANCLYYVERA ADRLKNRMHEMNDSTRLACIQVIVHCLGDMHCPGHVRWPDNQEIGYFNVVLKGSEIRYHT IWDTPIVATTHPWSFSDLAFQLDRYTEEQQRAAIAGDIYDWGRESAANSKCIYDVKPGDK 
LGHDFILKYKPLAEEQLAKAGYRLAKVLNDIFDRQ >gi|313159403|gb|AENZ01000009.1| GENE 110 120335 - 120703 559 122 aa, chain + ## HITS:1 COG:no KEGG:Plut_0531 NR:ns ## KEGG: Plut_0531 # Name: not_defined # Def: hypothetical protein # Organism: P.luteolum # Pathway: not_defined # 1 122 1 118 118 86 39.0 4e-16 MEFEGTVYKIMPVTKGTSARGEWQRQDVVFEMNEGSFARKICVTFFNKPDDVARLKEGAA YTVSVNIESREYNGRWYTDIRAWRLQPKQAEVPGAGPMPDMPPIAEEPSYASSPAEVDDL PF >gi|313159403|gb|AENZ01000009.1| GENE 111 120770 - 121267 360 165 aa, chain + ## HITS:1 COG:FN1295 KEGG:ns NR:ns ## COG: FN1295 COG0454 # Protein_GI_number: 19704630 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Fusobacterium nucleatum # 30 165 3 130 135 95 37.0 4e-20 MQPIVPDTTPPPATDIRPLRKAEIPAASALVRRVFAEFEAPDYAQEGIDVFHRFIAPEAM TEQFRTGALMLWGAFRQDTLAGVAAVRNHSHISLLFVDASCHRQGIARALFSAVRDFCST DPAVSRITVNSSPYAVEIYRRLGFTATDAERVTDGIRFTPMTYLL >gi|313159403|gb|AENZ01000009.1| GENE 112 121367 - 122500 1622 377 aa, chain - ## HITS:1 COG:NMB1567 KEGG:ns NR:ns ## COG: NMB1567 COG0545 # Protein_GI_number: 15677418 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerases 1 # Organism: Neisseria meningitidis MC58 # 139 363 53 267 272 117 36.0 3e-26 MKKILFAAALAGAAFMTACGGKDGGVRMGSLSDFDSLSYSLGANIGYGMNHEMKDIPFDF KAIDKGIKEGAMGKAAQEHDKSLDMLREYFMSKRGERAQEIAAKRAEQDSIRLAGGDTTK VEYPAADPAMFESEEERAEISYAFGNDIGYNISQSGMPIQLVWIGQAMQDVRDGKAKMAE DAVNQYLQYYFMVKRPAENAEASKAWLEKIEKKSGVKKTESGLLYKVTKQGDASMMPKDP RDVVKVHYTGRTREGKVFDTSKFANRPKEQQEMIRKQNPDGFNEDGTPKEADTEAEFPLN RVIKGWTEGLQLVGKGGTITLWIPADLAYGARGAGRDIGPNEALEFEVELIDVVPFEEPA PADSTATAETPEVAPAK >gi|313159403|gb|AENZ01000009.1| GENE 113 122591 - 123304 1081 237 aa, chain - ## HITS:1 COG:YPO0195 KEGG:ns NR:ns ## COG: YPO0195 COG0545 # Protein_GI_number: 16120534 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type 
peptidyl-prolyl cis-trans isomerases 1 # Organism: Yersinia pestis # 4 221 15 243 266 71 27.0 1e-12 MRKTLVVLTAAALLAGACSKKTGGGVKLKTDTDSVAYVIGMNVGMNLMKMDSTINVNAVC EGIRDVFRQGTKFSAADAEVFYLRYMNYALPEKARAYEEQFLQDILKANRNYARTASGVT YTVEVLGDQEQIPASDRDSIALRYLIRTADGRDVYSSYERGDTLRTTLGGLLKGMKESVK LVGRGGKINAWIPSSAAYGTEGDKELGIQPNTTLYYEIELVDVDKYANWSRRSNLRR >gi|313159403|gb|AENZ01000009.1| GENE 114 123306 - 124685 2143 459 aa, chain - ## HITS:1 COG:jhp1447 KEGG:ns NR:ns ## COG: jhp1447 COG3004 # Protein_GI_number: 15612512 # Func_class: P Inorganic ion transport and metabolism # Function: Na+/H+ antiporter # Organism: Helicobacter pylori J99 # 21 457 15 433 438 256 38.0 5e-68 MRLFSMYRWVRERHMIGIRGRNFMEAPWAGGIVLLGCVIIAMLLANLPATSLYYRHLLET DLSLMVHSPDGLIDWVFPRGMTVEKLINDGLMVIFFFAVGLEIKREIVCGQLSSAKKALL PVLAAAGGMLFPAIIFTAFNHGTMAANGWGIPTATDIAFAIGILSMLGDRVPVSLKIFLT ALAIADDLGAILVIALFYGGQVQIGCLLAALVIMLGVYFMKEMGEKRMFFYLVPAVVVWG LFYYSGVHSTISGVAMAMLIPMTPRYSKEYFVHKMRHLKALMLAAGTSGEDFPNENHRFY LRRMRNLAADSVGMSYRLEHALAPSVTFLVMPIFALANAGVEITSLEYLNIFHMSPDIGS VGMGVFFGLLVGKPLGIFLASWGAVKSGLAELPEGATWRMLFAVGCLGGIGFTMSLFVDA LAYTEPDLIDRGKIAILMGSTAAAVAGSLLILIFSKKRT >gi|313159403|gb|AENZ01000009.1| GENE 115 124686 - 125747 1290 353 aa, chain - ## HITS:1 COG:alr2310 KEGG:ns NR:ns ## COG: alr2310 COG0010 # Protein_GI_number: 17229802 # Func_class: E Amino acid transport and metabolism # Function: Arginase/agmatinase/formimionoglutamate hydrolase, arginase family # Organism: Nostoc sp. 
PCC 7120 # 11 346 6 339 346 273 42.0 3e-73 MTFKTDLMEQKTFDPDGVGVDNGTYFGLPFEPETAELVIVSAPWDVTVSYGAGTAYAPDA IIEASTQLDFHEPLAPGAWRRGIATADVDYSLLDESQRLRGDAAKVIDHLEGGGSPEDDY VVRKICRVNEGCAAMNANIGAQAARWLDAGKLVGLVGGDHSTPYGLIRALGERHAEFGIL HIDAHCDLRDAYEGFEFSHASIMFNVLRDVPAVTKIAQVAVRDFSEREAALAASSGRVVL FDDLSLAAAGFRGETWDTQCLRIVETLPQEVYVSFDIDGLSYENCPHTGTPVAGGLGFNQ AVWLLDTLVRSGRRIVGFDVVEVTPAREERIDAITGARVLWKLCNLTLKSNVR >gi|313159403|gb|AENZ01000009.1| GENE 116 125821 - 126816 1104 331 aa, chain + ## HITS:1 COG:YPO2763 KEGG:ns NR:ns ## COG: YPO2763 COG0111 # Protein_GI_number: 16122967 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Yersinia pestis # 1 330 1 349 375 223 36.0 5e-58 MKIIADSAIPFLQGVLEPWAEVRYLPGAEIAPEEVRDADALLIRTRTRCDERLLGGSRVR LIATATIGFDHIDTAWCAAHGIRVCTAAGCNARGVLQWAGAVLAHLARTQGWQPAGRTLG VVGVGHVGSLVKEYAETWGFRVLCCDPPREAREHCGFLPLEEVARQADILTFHTPLDDTT RRMAGDALFARMKPGAVILNSSRGEVVDGAALLRSGLACVLDVWEHEPAIDRRLLARTLL ATPHIAGYSEQGKANATAMSVDTLARFFGLPLAGWYPPQAAPSTPRPISWQELCTTIGGA FDIEAQSRSLKARPEDFEPMRDHYRYRREYF >gi|313159403|gb|AENZ01000009.1| GENE 117 126875 - 128152 1619 425 aa, chain + ## HITS:1 COG:PM0617 KEGG:ns NR:ns ## COG: PM0617 COG0167 # Protein_GI_number: 15602482 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Pasteurella multocida # 1 340 1 338 339 245 40.0 1e-64 MYRSIIKPILFSLTIERAHRAVLILLRAIGLIPGGRWLLRKCYAVKHPALEREVFGIKFA NPVGLAAGFDRNGEAFRELAALGFGFVEVGTVTPRPQAGNPRPRVFRLPKDEAIINRIGL SNRGLEKTIQHLRRPHEGFIVGCNIGRNTATLAENAAADYLKLFRNLYQYADYFTVNISC DNSCREGATHTREHILRILDPLFDFRRGQNQYRPIMLKVSPDMPDAVIDEISDILLETPL DGIVATNGTHNREGLHTSRTTLDKIGSGRLSGAPLTQRAVEVVRRIHTRSGGNFPIIGVG GIMTPDDAKAMLDAGADLLQLYTGYIYNGPGLVRDICRALVADAEAKAAAERAATERAAT ERAAAERAAAEKTTAKPAAQQLPAKHTPQPAATAPAPENAENAEPAPAESRTPAQSGPDS EQPES >gi|313159403|gb|AENZ01000009.1| GENE 118 128169 - 129410 1562 413 aa, chain + ## HITS:1 COG:CAC3586_1 KEGG:ns 
NR:ns ## COG: CAC3586_1 COG1058 # Protein_GI_number: 15896820 # Func_class: R General function prediction only # Function: Predicted nucleotide-utilizing enzyme related to molybdopterin-biosynthesis enzyme MoeA # Organism: Clostridium acetobutylicum # 1 241 1 238 245 139 34.0 8e-33 MKATIITIGDEILIGQIVDTNSVSIARHLNAAGIVVHEKVSIGDDSAQIVGCVKRALGQS DIAIITGGLGPTKDDITKKTLAEMFGSELILNQTVSDHVKRMLEERGIEFNELNRGQALV PACCTVLFNAHGTAPGMWFERGGKVVVSLPGVPYEMEHLMQDEVMPRLKAHFELRQIVHR TMITAGLPESMLAKAIETWENALPPYLKLAYLPNPGAVRLRLSAYEVEGESVSKEIERQF EALRRIIPHNIIGYETATMQELIHKLLTERRQTLATAESCTGGTIAARFTAMPGASAYFL CGVVSYSNASKQAVLGVDPDTIARYGAVSEQVARQMAEGARRISGADYAVSTTGIAGPAG GTAEKPVGTVWIAVAGPRRTVALLKQCGSDRGQIIDRAGAFALGLLRDELNGK >gi|313159403|gb|AENZ01000009.1| GENE 119 129423 - 129971 256 182 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229231897|ref|ZP_04356325.1| SSU ribosomal protein S12P methylthiotransferase [Cryptobacterium curtum DSM 15641] # 27 175 755 902 904 103 39 8e-21 MNKILLSTATAFILSLTACSSHAPEKRVHRILTDRIQTLAVAESCTGGTIAARFTALPGA SAYFKCGVVAYSLDTKQEILQISCDTIARYGAVSEQVVRQMAEGVRRASNSHYAVATTGI AGPTGGTPEYPVGSVWIAVSSPLRTTTRLIRAGGSRNAVIRKAGTAAIELLEKELQASNK AE >gi|313159403|gb|AENZ01000009.1| GENE 120 130241 - 130507 280 88 aa, chain + ## HITS:1 COG:no KEGG:Bache_0368 NR:ns ## KEGG: Bache_0368 # Name: not_defined # Def: integral membrane sensor signal transduction histidine kinase # Organism: B.helcogenes # Pathway: not_defined # 1 74 97 171 700 67 48.0 2e-10 MQAVALSFKADHYYFNNEPDSLKAWIPRVQAFARANEQPTYYFFTWSRLILYYTKHGQYT LAQYELERYMAQADRTTTSPPSPKPTSS >gi|313159403|gb|AENZ01000009.1| GENE 121 130375 - 131958 1847 527 aa, chain + ## HITS:1 COG:lin2643 KEGG:ns NR:ns ## COG: lin2643 COG0642 # Protein_GI_number: 16801705 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 308 525 365 588 591 109 30.0 2e-23 MVAAYSLLHETRAIHAGAVRAGTLHGAGRQDDYKPAIAEAYKQLGHIYRTRGLKETAADY YRKAIGFIEENNLNKFSLPILYSELATVLIDTGRYEEAAEELEKGKARLTLPEYIWPLKL 
KQVILYSRTGRSAEARTLFDQIRQGHDGYLTTASLTEAQLAINLAEHEFGRALTTLDKLI GLFKETGHSETYFYELFQTRAETYAETGNYEAAYKSQKHYLDLYRKKVGDDNERNLGEFA TLLDVSRLDVEKAELQRQTQEVRLHRTQLGIAALSVILLLTVLFLLFTARMNRRLAHAKR AAEASSRMKGVFIRNITHEINTPLNSIVGFAELAAAPDADDEERQSYIEIIRENSGYLQK LVDDVLYIAGLESSDTPPALGPVDINVCCMQCIQTVRDYSLRKLDIRFEPECAQLPVNTS CLLLSKALTELLRNAARFAPDGRITLAYTLISHKKRIAFTVTDRGPGIPAAEADRIFDRF VKLDPFSQGMGLGLAVCRLIADALGGGVELDTSYTEGARFTLTIPIV >gi|313159403|gb|AENZ01000009.1| GENE 122 132086 - 132319 199 77 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227396859|ref|ZP_03880178.1| LSU ribosomal protein L28P [Haliangium ochraceum DSM 14365] # 13 72 1 60 65 81 60 3e-14 MKVCEITGKVAIVGNNVSHSHHKTKRKFSPNLKTKRFWSEQEGRWITLKVSAAGMKTINK KGLAVALREAAAPKSVY >gi|313159403|gb|AENZ01000009.1| GENE 123 132336 - 132518 278 60 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228474099|ref|ZP_04058840.1| ribosomal protein L33 [Capnocytophaga gingivalis JCVIHMP016] # 1 60 1 60 60 111 86 2e-23 MAKKGNRVQVILECTEQKESGVPGMSRYITTKNKKNTPDRLERKKYNPFLKKVTVHREIK >gi|313159403|gb|AENZ01000009.1| GENE 124 132531 - 132689 393 52 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159457|gb|EFR58820.1| ## NR: gi|313159457|gb|EFR58820.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 52 1 52 52 75 100.0 9e-13 MAKKVVATLKTGTGKNFTKCIKMVKSDKTGAYMFKEGIIPNDQVKEFFEEKK >gi|313159403|gb|AENZ01000009.1| GENE 125 132932 - 134941 2172 669 aa, chain + ## HITS:1 COG:PM1139 KEGG:ns NR:ns ## COG: PM1139 COG1442 # Protein_GI_number: 15603004 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipopolysaccharide biosynthesis proteins, LPS:glycosyltransferases # Organism: Pasteurella multocida # 9 245 3 254 302 114 32.0 7e-25 MQESQRIAIAFTVNDSYAQHCCVAMASALENNKSEFLDIYIFTDYFSDRNRQLFEQTVRR YGDGNRLSFIIVEDSALKKLRLNISYITHHTYYRFLLPELLPELDRILYLDSDLVVNGPL RPLWETSLDGYYCAGVKDSWIEQISYKYRLGMSQAELYINAGVILMNLEQMRKDQAADQL LQTATQKGTQFEYQDQDVINTTFRGGKIKELPGIYNYTATDDKFHTVKDPVIIHYNGDVK PWSTERRCPNSMASRYFTYLRLTPYKKFRSEFVRSRILRTLKKAAGIDVYPSRVKVALIV DEYFGACGTAYGGYGFLARNYIARYLPDTNISVEVLLSNKHRWKFAPNALTEKVDNRKVI IPPGRFFARQWLKKKKYDIYLTVELTHNILKYERRSKPIIHWIQDPRPWSEWLEIQTVKL FPENSYWSSSLYDSVNKLYGKGLVTFVSQGYFLNDKAIELYRLPGDAPIAYMPNPIDIDY GFDPESYPKKNHILFIGRIESIKRGWLFCEIAKQLPQYEFFMLGQTFREKSQNESIMAGY RDIPNLHFVGHVEGEEKMKYIREAKILVNTSIHEALPITFLEALAYGTLLVSCRNPEDLT SKFGRYTGPVLGDGFDKVPLFTSAIEELIENEGLRKELSCKAVEYIREIHPVHKFQKHMR KLIHDLAHK >gi|313159403|gb|AENZ01000009.1| GENE 126 135022 - 135930 1212 302 aa, chain - ## HITS:1 COG:no KEGG:Lbys_1555 NR:ns ## KEGG: Lbys_1555 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 21 117 19 116 303 72 36.0 2e-11 MMKKTILLLVAALCGVLTAAAQDLIVKTDATKVEAKVTEITPDAVRYKRFSNPDGPTYVL PVADIDYIQYANGEKERFRAAETVPATPLTPATPVGEAPVAAAPAAQAPVQQAAPVQYVA KEYQIGEFYDQNGVKGVVCMLTEDRRHGLIISLDEIYLPWSEFRKPDLRVAGADDRIDGM GNMEKVAAYIAENNLAWDDFPAFKWCRDKGEGWYLPAIDELLTIGHNYNGGTRIHSNRQA RNRFNDALKDNGGKRMDRLVYYFSSTEKDEKEAYTTHTGIEPPYVIEIPKYNKFLVRAVR KF >gi|313159403|gb|AENZ01000009.1| GENE 127 136013 - 137023 719 336 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762490|ref|ZP_02169555.1| ribosomal protein L28 [Bacillus selenitireducens MLS10] # 1 331 1 322 336 281 43 2e-74 MAFFDIFKKKQDNSAAQASEERKQQQEELNAGLEKTKTGLFSKLARAVAGRSTVDADVLD 
DLEEVLISSDVGVETTVKIIRGIEERVARDKYMGTSELQSILREEVTALMEESHGSEKHF GLDAKEGEPYVVMVVGVNGAGKTTTIGKLAAQLTKAGQKVWIGAADTFRAAAIDQLKVWA DRAGATMIRQEMGSDPASVAFDTLTSAKANGADVVLIDTAGRLHNKINLMNELTKIRNVM GKVIPGAPHEVMLVLDGSTGQNAFEQARQFTQATQVTSLTITKLDGTAKGGVVIGISDQF HIPVRYIGIGEGIDQLRIFDRKEFVKALFGDGSGQS >gi|313159403|gb|AENZ01000009.1| GENE 128 137380 - 140598 3889 1072 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2245 NR:ns ## KEGG: Dfer_2245 # Name: not_defined # Def: TonB-dependent receptor # Organism: D.fermentans # Pathway: not_defined # 27 1072 35 1040 1040 545 35.0 1e-153 MKMNLTTSEKNRGLVRFLLMSLLLSILILPAHAQDRVTVSGQLTDSSQTPLIGASVIEKG TTNGVTTDADGRYQISVKPDGVIDFSFIGYKPQSLAVMNRTEINVTMEEDATMIGEVVAI GYGSQRKEDLSMAVSTVKVDDAARSRAADLGTLLQGRMPGVTIQQSGDPLQKASFTVRGR GSKGNDDGTDKSAAYSRDNIGPTSGDGVLVVVDGVPGAPYMAEDIETITILKDAASAAIY GASVGSSGVILITTRQAQAGKTRVNVNVSVGFERAMNLPTMLNAQQFCDVWGKAVENSPG SSLPNLANPQVYAGANVTRTDWLDEIFRTGLTQHYAVSITGGSETLSSILSVTYDKKEGT LLNTWSQALGAKLHTEFKPVKWLKLSERVNFEYSNGQGNISTSHTGPIYGAMWFPASASV YDRDANGNIVYGNDGKPKWGGIASSADMASGVEGPNIVNPVAQLETMHRRYPRTKIFSTT SLEIKPITSLTLKSEFTADLDMREEDEFSPVIDVPGGSSTSSREQFNYNNFHYLWETTAT YAQVFGRHHISAMAGFTTDYKKLHFRDFRTQGYGSNEDHNLVWGAAGSWTKNPNEDIVDY SMVSFLARLGYSFDDRYFLTGSIRRDASSKLPAAKNYDWFPAISGSWKLSSEKFFQNAGL NKVFDLVKFRAGWGKVGNVDLYNLPNPTSIPLSIYPDGSLIGGTTHYGTYLATIPNTDAR WETTVQTSAGLDLTMLHNTLEVSVDYYNKETKDLVDYLTIPPQMGVENAPLGNMGHVINK GWEFSVSYRNSAAQGKFNYNVWGTFSTNKGYVEDYGPQPIVMHKYPTFNGIQLFASGAGH PWYSYYIYRTDGIFRSQDEIDRHISKDPDTGEVKMLQPDAKVGDLRFIDTNGDGVINDND RVLSRSYTPKQTYSFGGSLDWKGFDFSFMFQGVAGNYVYNGTKQLGMNGRQDLGNLTTDV YNTWDFKPMTSKYPRLGLSDDMNGNYLKVSDIFLEKGDYLRLKNITLGYTLPKHISRHIG MEKGSVRVYLSIDNVATITGYSGTDPEVGNYGLDSGIYPTSRFFNFGVNINF >gi|313159403|gb|AENZ01000009.1| GENE 129 140610 - 142241 2257 543 aa, chain + ## HITS:1 COG:no KEGG:Celly_2444 NR:ns ## KEGG: Celly_2444 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: C.lytica # Pathway: not_defined # 8 543 11 528 528 327 38.0 7e-88 MKKTIFSLCLAISAFCLSGCSDFLDREPYGKDATWKTKEDVDQAIYALYHFVSPYWSEEI 
CGRGHMWLECASDNILIGRARPSVDEIREFRMSPSNDNDVSRVWEVMYQNVAKANNIIKM VPDMKLDQGYKNQAVGTAYFFRGFSMLWMVPYYGDDTNGGIPIILDTTPAAEIDSPRPAH VLQNYDQIISDMRTAGEMLPYLSQLDASQIGLPHKAAAWGYAARAALYAAQYDAKYYDIV LEMCNKIMALTGSDKRELYIDPSNKLNSYANLWRKEQNHSKEYIFSLEGDYTNGARYHGV TFVNAGWGLYNTWGYFTPSKNLWDAFEEGDQRRDVTILYPGQHVKFVGRDITFGGYNTDG SVSDYTTGHVSAGLICRKFISPWEGADCKGKEVSALRDKLWNSLNCCLLRYADVMLMKAE ALIWTKGEGDAEAKQLLNQIRDRAGLPQNSQATKAQLMNERRCELAFEFQPSRHIDVVRW GIATTTYAKPLYSVISKKVNGQIVAEEVEVYPGRTFNPTYNKVFPIPQRAFTGTVYLKQN KGY >gi|313159403|gb|AENZ01000009.1| GENE 130 142267 - 143091 1262 274 aa, chain + ## HITS:1 COG:no KEGG:RB4127 NR:ns ## KEGG: RB4127 # Name: not_defined # Def: hypothetical protein # Organism: R.baltica # Pathway: not_defined # 1 266 11 271 279 83 27.0 8e-15 MKINKYILGAGALLCMMLPAASAQAQDVKLKCMEWNIKSFEYDDNSNAVYFEIAEAMEKI KAENPDIVCFNEFETATGRMSSVEKLTECAQSLGMFPFFVFSYNKDNGAYGNGILSKYPI VNSASVLLGMYTGADQRSAGWADILVPTDSKPEGVKVRIVCTHLDAFGGDETCLGQAKEV IEHAIAPAVAENIPVLIMGDMNCGPSSSAIREYEKTGTRLCNNDGTFGGYSKLDYFISFP QGKWSCSDYKVVKGDRLNVISDHYPIAGTAVLKN >gi|313159403|gb|AENZ01000009.1| GENE 131 143123 - 143968 1442 281 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159427|gb|EFR58790.1| ## NR: gi|313159427|gb|EFR58790.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 281 1 281 281 539 100.0 1e-152 MKNTKRYLSIFLALALAAPVFTGCDNGLDEYKSGETIAPPSDKFNAEIQFVTRLTDGALA GSDADYAAIDNYMVNTLEGRGKSWLTVLDRADGANLPKTMQTALNTKRWTAFAFNRIANK SSYQGSMLYFNGPTTEVKGTPSGSGCYVTGFAPTLNGTRTDKDEDGNVTGTTNVSFAISF NTARFENADQINAFGGKNGVLSGLHYKKQNLLMIGTVKNDLLGALQSAAANTNEAIQVRE IVKGPAYTIFMLADSRFWGYVDVASTSLGNGIEAYGIHVMW >gi|313159403|gb|AENZ01000009.1| GENE 132 144072 - 145370 1106 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229200236|ref|ZP_04326798.1| SSU ribosomal protein S12P methylthiotransferase [Pedobacter heparinus DSM 2366] # 22 428 1 407 410 430 52 1e-119 MKKINVITLGCSKNTVDSEHLMARLAAAGYEVLFDSDRTDAKVVVINTCGFIGDAKQESI DMILRAAAAKNAGKIERLFVVGCLSERYADELRAELPEVDEFFGARTWDGIVRALGAAED PALETERHLTTPKHYAYLKISEGCNWKCGYCAIPLIRGGHVSVPMERLEEEARKLAAGGV KELIVIAQDTTYYGLDLYGRRMLAELLRRLCRIGGIEWIRLHYAYPTAFPDEVIEVMASE PKICKYLDIPFQHISDAQLSAMQRRHTKADAYALIGRLREAIPDLALRTTLLVGYPGETQ ADFAELEEFVRDVRFERLGVFAYSEEEGTYSAQKLQDNVPEEVKQQRVERIMALQNEISL ENNLRRVGRTERVLIDSRQGDYYVGRTQYDSPEVDQEILIPASEKRLLRGRFYDVRIDSA ADYDLYGRAAGK >gi|313159403|gb|AENZ01000009.1| GENE 133 145435 - 145728 402 97 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159453|gb|EFR58816.1| ## NR: gi|313159453|gb|EFR58816.1| conserved hypothetical protein [Alistipes sp. HGB5] # 6 97 1 92 92 130 98.0 4e-29 MAEKSVITNIENRIRQLMDDHKRLSDQCAELTAQRDSLKAENRTLQERIRELDGELSRMQ LTEGLAGGSRNRDKARARVNRLMREVDKCIALLGRPE >gi|313159403|gb|AENZ01000009.1| GENE 134 145825 - 146124 526 99 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159494|gb|EFR58857.1| ## NR: gi|313159494|gb|EFR58857.1| hypothetical protein HMPREF9720_0819 [Alistipes sp. 
HGB5] # 1 99 1 99 99 169 100.0 5e-41 MAQQAITLKLAGKSYSLNIDSEKEEMYRLAEREVNSYLAAIKQNNFKNWTDQDYLSMTAL KFAIANVDMRQSRELGDEDLKRLGHLGEEIDAYLNALKG >gi|313159403|gb|AENZ01000009.1| GENE 135 146447 - 147994 2600 515 aa, chain + ## HITS:1 COG:CAC1816 KEGG:ns NR:ns ## COG: CAC1816 COG1418 # Protein_GI_number: 15895092 # Func_class: R General function prediction only # Function: Predicted HD superfamily hydrolase # Organism: Clostridium acetobutylicum # 5 515 9 514 514 408 49.0 1e-113 MYTTILVACISAIAAIGIYMAVTNLVTKNSIRKRREAALKEAEAEGEMIKKERILQAKEK FIQLKSEYDRQVNERNQKIAQSEQRAKQIENNLQNQQRDLENKLRENDRLKEQMQNQLQI LEHKKEEVDQMMREQNVRLEQISGLSSEEAKNILIENMKAEAKTEAAGYINETIEEAKMT ATKEAKRIIVASIQRVATETAIENAVTVFNIESDEVKGRIIGREGRNIRALEAATGIEII VDDTPEAIILSGFDPVRREIARLALHQLVTDGRIHPARIEEVVAKVQKQIEEEIVEVGKR TTIDLGIHGLHPELIRMIGKMKYRSSYGQNLLQHARETANLAGIMAAELGLNPKTARRAG LLHDIGKVPDDEPELPHAIIGMKLAEKYKEKPEVCNAIGAHHDEVEMTSLIAPIIQVCDA ISGARPGARREVVESYIKRLKEMEDIALSYPGVVKTYAIQAGRELRVIVGADKLSDQESE GLSHDIAKKIQDEMTYPGQVKITVIRETRAVSYAK >gi|313159403|gb|AENZ01000009.1| GENE 136 148112 - 148843 988 243 aa, chain - ## HITS:1 COG:no KEGG:BF2504 NR:ns ## KEGG: BF2504 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 228 1 228 244 245 56.0 1e-63 MELFWQTIADYNTMTWPVQALIVLAAVVLTARLYLRPTLRVKTAMKCFLAFLNAWITVAY YLVSCDARAYSGVMAFFWGVMAAVWIYDAVVGYTTFERTYKHDKFAFALYLMPFLYPLIS HLRGLDFPMTTSPVMPCTVAIFTIGLMLSFSKRINLFVVLFLCHWSLIGFSKVYFFGIPE DLLLASALVPALYLFFKEYIDVNFRRNTKPDLRVMNMLLLAMCSGLGVFFAAIIWQQLVG MAG >gi|313159403|gb|AENZ01000009.1| GENE 137 148846 - 149343 613 165 aa, chain - ## HITS:1 COG:Cj1165c KEGG:ns NR:ns ## COG: Cj1165c COG3610 # Protein_GI_number: 15792489 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 6 162 5 163 164 90 37.0 2e-18 MTLLAILSDGFFAAVAAVGFGAVSDPPMRAFPAIALLAAMGHALRFCLMDTGVDIASASL CASVAIGIGSLLLGGRIHCPMTVLFIPALLPMIPGMYAYKTVFSMIMFMQNLDDPSAAGY LAAIVRNGFVTFSVIFMLAAGAAAPIFLFNKRAHSLTRNKKQIHS 
>gi|313159403|gb|AENZ01000009.1| GENE 138 149340 - 150152 926 270 aa, chain - ## HITS:1 COG:Cj1166c KEGG:ns NR:ns ## COG: Cj1166c COG2966 # Protein_GI_number: 15792490 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Campylobacter jejuni # 21 270 5 257 258 142 33.0 7e-34 MFGLFTIFAPVKLGKMKSEAELKAVTQFVAEYATRLMGSGVHTSRVVRNTKRLGEALGVR VMVSAFQKVVTFSVYDDESGRVYSEVADIPPLPISFELNSELSALSWDACDLSLPLAEIR RRYDEIVARPRLDPIFTLILVGLANASFCRLFGGEWCAVGIVFTATLLGFYLKQRMQAKG FNHYVIFIASAFAASMYASVAMIFDTTSEVAIATSVLYLIPGVPLINGVIDIVEGHVLNG IARLTSALMLIVCIAVGLSCTLMIVKNGLL >gi|313159403|gb|AENZ01000009.1| GENE 139 150179 - 150844 866 221 aa, chain + ## HITS:1 COG:no KEGG:BVU_3025 NR:ns ## KEGG: BVU_3025 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 18 219 11 212 233 249 59.0 9e-65 MPHFPPNAYFCTMQPSEELRIYIETEIIPRYESFDAAHGTDHVRTVIAQSLDLARHYDVD ADMIYAVAAYHDTGLARGRKLHHIHSGEILLADTELRRWFTAEQLAVMRDAVEDHRASSD HAPRTIYGRIVAEADRCIDPATVLRRTVQYGLSHCPALDREGHFERCLAHLRKKYAEGGY LRLWIPESDNARKQEELRALIRDEARLRAAFDAVFDQETAR >gi|313159403|gb|AENZ01000009.1| GENE 140 151083 - 151709 1016 208 aa, chain + ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 16 194 14 188 194 62 20.0 8e-10 MKESVYNTFVNCDPQEQDLLRQLIRGDIAGYEVLFHKYYPTFFAFIKGMTKETAVAEDIA QNIFMKVWLNREKLDAAKSIRNYLFVLAKHEIYNYFRTKSRTFTTLKEAIAQTESKGGGN LPSRNEIEEKLDLAETAEQVETIVGKMPPQRQQIFRMSRFEHMPSREIAEQLNLSVRTVD KHLELALKELRKYLNIIPAIIVFLDILP >gi|313159403|gb|AENZ01000009.1| GENE 141 151817 - 152818 1575 333 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 68 294 78 289 
327 77 31.0 4e-14 MNKPTLDKVRDYFYGRCNDGEEVRIQQWFADNGNSSEADRLLGALLDEVRSENPALAQAA FEEFCRRIGHSVPARTAPRRLTTVVRWTQRIAAVLIVPLLIAVSLLYTKTAHTPEWEEVL VPAGQRSELRLADGTLLWLNSGTRVTYPTHFNGRQRKIFVDGEVYAEVMHDKRHPFVISA GDVEVEVLGTKFNMRAYNSDQLVEVALVEGSVRFDVNSDKCDDEVVMVRNDVAQYDRLTG ALEMTSFQSENYKSRATGGGFYFFNESLDGITAQLARCFDQKIIITDPELAGVRFYAFFT NNESLLKILGTLNTDKSMSIRERNDVIYISRKK >gi|313159403|gb|AENZ01000009.1| GENE 142 152863 - 156414 5934 1183 aa, chain + ## HITS:1 COG:no KEGG:Slin_4979 NR:ns ## KEGG: Slin_4979 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 32 1183 29 1188 1188 802 39.0 0 MKKLYNRTALFHARAIVGLFALCLLALPSPALAQAIDLKLTNVTVKEAIEALNQRENYSV AIKSAGVDMQRRVSISAQNASIDEVLAQIFADQDITYTITGKSISVTKAAPKSLAADKNQ LKGVVKDNLGLPVPGATIIVDGTNNGTTTMSNGDFSLENVKLPAKLVISFIGYQPRTVEV NSYAAIDITLVESSAAIDEVVVVGYGQQKRVNVTGAVGTISGKDLNNRPVTNTAAALQGA DPSLLLTLGSGSIEGKNYDVKIRGAVSLNSGSPLVLVDGIEASLAQVNPNDIESVSVLKD ASACSIYGAKASAGVVLITTKSGKAGTLKVNYNGRYGVSWNTTSTDFITSGYDYVKLTNE FCYPSKGYVGWNYTDEEMQMLYDRRNDKTEHPDRPWVITDSNGKYRYLGNFDWYDYMFKR SRPETEHNISLTGGNDKINYYVSGRYLYREGLFNHGAEDIYNGYSFRTKIAAEVTPWMHY SNNISMEVTDYKYGGYWEQDGSEELNSNGILFNVANNISPTFVPVNPDGTTFVYSNGIQF ANSPIASGRGGVFADGRNKNSRKNNYYIITNRVTFDLTRNKDLKLNADYTYRRRDNLGAY RSYPTANTWNATQTAVVDFTNGSIYDFYQEDRYYYNGHVVNAYLDYGHSWGKHNFSAVAG GNFEDFRSSKLSVRQKGSLSEKLSFINMAQGEIERCVESNTAYRTLGYFARANYDYAGKY LFEVSARYDGSSRFAANDRWGFFPSASAGWRISEEKFWEPMRNWWDNAKVRFSYGSLGNQ QVSNYYYIETISTGQLGYTFNGTEKANYASASNPISDGLTWETVVTYNLGFDLGFLKNRL NVTADLYIRDTKDMLTTSLTLPDVFGAPSPKENCADLRTKGYEITVSWRDRHMVAGKPFS YGISASLGDYKSKITKYKNDDMLLTDHYVGETLGELWGYRTDGLFKTDEEAARYQAQIND KAVNNRVYTSSDASAAHLMAGDVRFRDLDGNNIINNGDGTVKNPGDMRVIGNSLPRYTYS IRGDLNWNGFDFAVFFQGVGKIDWMPSANCYYFWGPYSFPTTTFIAKDFERLAWSEDNRN TYFPRRRSYQTSSAGSMNVKTDRYLQDASYIRLKNITLGYTIPINKRILEKVRVYVSGEN LAYWSPLKRYSKTVDPEVATTSATNDCLYPYSRTFSVGVDITF >gi|313159403|gb|AENZ01000009.1| GENE 143 156481 - 158262 2848 593 aa, chain + ## HITS:1 COG:no KEGG:Slin_4978 NR:ns ## KEGG: Slin_4978 # Name: not_defined # Def: 
RagB/SusD domain protein # Organism: S.linguale # Pathway: not_defined # 1 588 10 576 576 398 40.0 1e-109 MSLALSGCDDFLTKEPETNLSPNTFFSSEAELELWTNRFYSLFAGPDTDAIQVSDIQIAK NLSSVQQGTRSPATENWGTSAWAYLRYINYYLERSGNCSDETVRQRYDGVAYFFRALFYF EKVRKYGDIPYYDFVIPSNDWASLKRPRDSRGFVMKKVMEDLDRAITDLPDNWPSDALYR LSKNAARAMKARAALYEGTFRKYHGIADETIDGVTISADYFLQLAADAAWAVMEQNKYSL YKGNTLKLDAPYREYFILEDGDAKETILSMRFNADILVRHGIQFTFRNMRHSATQRLVNH YLMADGSKIQDQPGYETMTYNQQFQNRDPRMAQTLMAPGYVDLNGIDEVIEDCKSYDMTG YRFIKFVSDDTHNGATTSTTDWPVFRYPEILLTYAEAKAELGTLTPEDIALTVDVVRDRV GMPALDMTAANNNPDPLMASYYPNVNSGANKGVILEIRRERTVELPCEGLRQWDMLRWKE GAQLVPSSNGLDGFLGCYFPSLGEYDMNGDGKMDLCLWSGTKPATTCDAMLEIGEGKDAQ LTEGTSGYIVCFRGRTYKWEEGRDYLWPIPTDQRVATGGALSQNPGYEDGLSF >gi|313159403|gb|AENZ01000009.1| GENE 144 158355 - 159152 1334 265 aa, chain + ## HITS:1 COG:no KEGG:ZPR_1186 NR:ns ## KEGG: ZPR_1186 # Name: not_defined # Def: S1/P1 endonuclease family protein # Organism: Z.profunda # Pathway: not_defined # 1 264 1 259 259 125 29.0 3e-27 MKKLLLLLFAAALSLSASEPARAWGREGHETIAKIAERNLTKRAKKRIEKYLGGHSVVYY AKWMDEYRQTPEYAFTNDWHTAPVGADLRYGDELLKPGKGNAVYGLELAIRNLRDYRNLT DSAVAVNLKYVIHLVGDMHCPAHIKYTTHNTKYDVLFEDKYHKPHKYYVHHVWDNEIITT TRIWSVTEWAGELDRASKREKAAVQAGTPRDWLHDSAVTCEVQFEWAKPDERLGQDFLNK ALPLVEHQIRNAGYRLAAVLNELFD >gi|313159403|gb|AENZ01000009.1| GENE 145 159254 - 160183 1289 309 aa, chain - ## HITS:1 COG:mlr8455 KEGG:ns NR:ns ## COG: mlr8455 COG0584 # Protein_GI_number: 13476980 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Mesorhizobium loti # 46 302 99 382 407 110 30.0 3e-24 MKRLVSCILAIPALLWGCAGPKTAAPEYPTHAAKIVAEIHDPASKYVVVASHRGDWRNYP ENSIPAIESVIRMGVDIMELDLKLTKDSVLVLCHDHTIDRTTTGRGRVCDITYDSIQRCF LRTAHGVRTPRKMPTLREALEVCKDRIVVNIDQGYEFYDMALKISEELGVTEQMLIKGKR PAEAVAAKFGEYEHNMMYMPIIDILKPQGQKLFGEYMSKGIVPLAYEVCWDEYTPEVKDC MEKVVESGSKLWVNSLWESLCGGLCDDAAWESGDPGSVYGKLLDMGATMIQTDRPELLIS YLRSQGRHD >gi|313159403|gb|AENZ01000009.1| GENE 146 160204 - 163923 4522 1239 aa, chain - ## HITS:1 
COG:no KEGG:Fluta_0605 NR:ns ## KEGG: Fluta_0605 # Name: not_defined # Def: PKD domain-containing protein # Organism: F.taffensis # Pathway: not_defined # 1008 1229 21 245 411 73 26.0 6e-11 MKQIFAWLFAAIFFAAGCAHSPELAEPVRGTGYEDADGLVIVAACPPATRTDIEEGKSTW EAGDRITVVYDGAAYEYTAAEAGPTTAFTSEAGIADYDASKPLTAYYPATTAEGVVAVEA ERTIALDAESQSNPARAPLVGLPTSGNLAEGALEVTFRNIFSVIELRIDAGELASAAQSL TVEPADEGAFEGFLSFEGTVDPETLALTPAENGTGNSLIFNFAEGVDLTKPQTIKFPVGR FKSEAGLRLTLNTADGKSYSKNIYKTGITSYAEQGGVFRAKHMAKALYAFAPQGGISTAD DLIEFAAAVNAGGTLAPWQDDKGVVVLLDDIDLAEVTEWTPIGAATSKLASNALSITSGR PFTGYFDGQGHTIRNLKMVCKAAQATSAWGFFGAVANGAVVENLIFDASCSLEDKATAGT DCGVVAGLVYEATVRNVVNKASIAFDGAAPDETRMTIGMVGLAFADKNGARLEKLVNHGA LTATSGGNTKNGATGIHVGGIVGFSSNKSNTTAVVTIAECANYGDLDTQVARASGIVAAA NRYTEIENCVNEGDNVNAFATVNSARIGNITCITAVGSKITNTVNRGNVICKTSGAAGGI ICLVNDDGNAFVGCENYGLVITDRANYKGTLFGQCNKAARFSDCIAQGDLGAYEDGSYAL VGVNYTNYMSYIGDHNATAVHVNSQNILYYPNGSQVPAEPEFGVNLSSVELNAQGSNAAV VQLSSVDYDWTVSADGDWAHITDLSDVPVVSGVRDSGVQYIKIGAGANTKTAPRSAVVTF ASTDGSKSATVRVDQQARGEAFPSKWVFQASTLPLYGSSWTDDNVIPATSGAAGFISVVR GDANASAAFKRSVVTNRPAVSTMVEGDYWLYTFPVENLAAGSVVDFNATMAGEANSPKYF IVEYLDGGVWKSVEADLLTAPENPAVRYTYKCSGTATGSSYQHATVMQTMRFENAVTDGE VKIRCRAVGPYTCAGGTQNITATNAASSIPPYGFTGSYVQNFGTATPRDTKKVLCLGNSF SYYSNPAWMLKEIAWREGHALNIKAHFKGSQTLTQHLSLGFSTDVIEQGGYDFAFLQDQS QNPANYGRDATASILTGLTTLADKVRAASPSCKVILEETWTFSSASYGGFTDFPTFETYN DAGAKAMAKAAGTWVSPIGPAFRQVREGGSGINLYYSDSKHQSEYGAYLKACVNYLVLFG ERFGADPADCGLNPDKAAYLRSVAEQIVLGNEGDYFIER >gi|313159403|gb|AENZ01000009.1| GENE 147 163936 - 168177 5728 1413 aa, chain - ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 209 2 225 394 95 32.0 2e-17 MKKILNIIILFAAVVFCGCSDFDDSELRGRIDGYKNRIEALKAKAETLGKQLADLSYLTN GNVITTVSQDADGKYVVTYKDNKDVEHTVVLATMDDIVDVPIIGVKLDENGVYYWTKCID GEITWLTDDDGEKFPVSGYTPTISVDADGCWTVDGVQILDASGNPIKATTDATSVFRSAE LDGDGNFSLTLGDGSTITLRVFNSLNLKLDAMPVTTVADPSKSITVSYELSGEAKETAIV 
AVAKAEGLDAVIDRDAKTVTVSFDASFSRGTVIVMAYDLADNVIVKPLFYKAATLGTVAI STPDQLVAFAAAVNAGGEEAAAKAVLTQDIDMKDVAWTPIGNGAYTTANAMMGPAFQGTF DGQGHAVRNLKIVVPADAAAGSAWGLFGVLKGATVRNLAIGEGSSVVSTAAAMTAVGAVA GYAYEATIENCENRAAIDIQGGGDNVRESAGGIVGAICANENDSHILSCTNYGKITSKNS VNTKNGATGFSIGGIVGFADASTTTERYNNVVGCVNEGAIDAQATRTAGVVATMNKYTKL ENCANNAAVTCSDVTASNSRVAGIVSAMGGHTYLTSCVNNGTVAFAVAGDTTHGYAAGIA GQTNDNNTAIDGCENYGAVLSDIINAAANKYIGIVCANTNKKTIAIRNCKIGGRIGPFSD GQQGATEITEQNFEQYIYFTLTGGGVPTLENNSFSGGPAKPGIATVEDLTAFRDAVNAGE STAQWEDAGGVVSLLGDIDMKDVAGWTPIGNASYKWEKNLLTIEGNAFKGTFDGQGYALK NLKLAYGGSAANTAYGLFGVLDGATVRNLTVGAALGDASALKVTASGGTAEVGVIAGVCR DANVSDCVNYAKIEYDGTSAARVSAAMVGFIFSETEGTKLERLQNYGAVEADTHGNSANG AGAAIHMAGICGFATGNATNKICNDIAYCDNYGNITSNSARSSGIVAAANNYTRINSCVN HGNQLNSCGTTGRLGNITCITGTGCSMTDCINNGNLVSTGGARCGGLLSLANHATNSFSG CANYGEILTDDANRGVFFGYSAYATSWINCIAGGKVGVYNGGTAVYDSYGENEQVRYLGV QKSTDPINADNITYMIGSSSGGSGGGDDVEPTLRILFIGNSFTKDAVEHLPKMVSAADIP TLKMVHLYYGGRTIPEYADGYATKSDYTCYKYNPGTSLWLSYTGYNIQQIVKSDTWDIVC LQEHTGNSCGWIWNDTEKNAIQGLIADIRADQNGHTPKFVYIMSQAYFNMDKIGTAQRPY KNFTTQDEMFDVIVAQARKVLDQTDVEQIIPTGTVLQNLRTSSLNNDMDLTRDGYHMDYG LSRYAAACAVFESIISPSFGGKKLDGNSFRYNVSSTTDGTYTTPVTDDNQPVALQAARYA LATPFAVTDMSSGTQTPGNGIEDTDFENDSNKE >gi|313159403|gb|AENZ01000009.1| GENE 148 168208 - 170652 3421 814 aa, chain - ## HITS:1 COG:no KEGG:Bache_0357 NR:ns ## KEGG: Bache_0357 # Name: not_defined # Def: fibronectin type III domain protein # Organism: B.helcogenes # Pathway: not_defined # 402 715 335 639 781 68 26.0 9e-10 MKKILNTIFLATFLFAAACSDGYDDTRIKKDLDDVGQILDELEQAVDGLRTQMDALTQLI NSSFVSLISTDAEGNYVITYIDRGGESRTLTLATQKEVVTLPIVGIAKDEDGVWYWRQTS DNGETYEWILVDGEKVPAGGEKPEVGIDAEGFWTVNGKPITDAKGNKVLADDVSNILFKE AYVDEKTGEAVFILADGTELRLQMFEALSVSFDSPTYTAVPDYASKVKIRYTVGGSQAEG ALVDIFTAYNVDAEIDESISTITVSLKEGAQEGNILVMAHAGGNTILKPLFFTYGTAEIQ DPVYNGSTADIVLEGDMTQFEVKVSASIDYEVTVEESASKWLIYNSTRAMTILTHVFTAD YYEDASGALRTGEIRFSNALYDISAAIVVKQSPKIPEGGGGGISTAADLMGFAAAVNAGA STARWENEAGEVVLLNDIDMAAVESWTPIGGVDASEYNTTTPYKTVNPFKGIFDGQGFAV 
KNLNYTADMSTGKWGYAFFGSLDGATVRNLTLGDPDTDITWTFTGDAPKATSVASLAVYA VNSTIESCTNYYNIDFAGDSGSEVLCVASGLVGVMKNSTIGGRAKSLGCANYGFVRTGKI TNNGNGGNGMQTAGICGFMAKDQGNLIQFCVNYGHISCPTGRTGGLVATLMNGNVKNSDN RGTVEDDIVGKFEGAAAQNSYNSKRMGGLIGGTDDLKKVLTATVESCTNYGNVFTHIGCR TGGFIGHSNIRIIGCANQGAILGDVYNSDHGPAWACGYSGQSSGEWVNVSSCTMGGYVGS YTTYKDNPTSAPAATVHNAFSYKNDEYYDPTINN >gi|313159403|gb|AENZ01000009.1| GENE 149 170687 - 173959 4677 1090 aa, chain - ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 197 21 208 394 74 28.0 3e-11 MKRILTLMFAAATVFAYGCSDNFDDSALWKDIDGMYKSLNELKAQVTTMQQQLDALAAVV SGGAVTSITQDADGHYVLSYKDADNVEHTIDIATMDDANTQPIIGMKADGEIYYWTVTTG GKTSWLLDTDGAKIPVTGRTPEIGVDDQGYWTLFGKRITDASGNPVKAEGKSASVITKVE MKDDGTVVFTLGDGVQVTAQVQNGFNVLLSVEPRTVVPDVAQPLVITYTLVGETETSVLT VEKAEGLAAKLDEQAKTITVTFPDGFEEGRLVVMFYDGADNVIIKPLIFTTMEGAPTGIR NADDLKAFASAVNAGKSLAKYTIDGEVCLMNDIDMAGTDWSDYVIGGVVTPSTADANKAV TYAMGENVFDKVFNGKNFALKNVDWTFDLADGNVAHGLFSALGAEGEIKNLTIEGVIRLT GAAPQGAAIGAFAGYAEGKITSCTNKAAIAFAGSDAANISVCLGGIAGYLQNATLTQCVN DGALTCGTIANTGNGSNSGFHQGGIVGYMKTSSLTECTNNGALSAPSGRSGGIVAVATSG QVTACVNNGKVQDDVNGIFGANPGYKRMGGLAGGASADAAFTSCVNNGDVFSQLGCRTGG FVGHNEAKITKCENKGVILSDHTLSGTNYHGSGWAAGYNKSADLITECVVGGRVGDYTAY KDNPQSAPEATYAMAIVHGKFDPTLNGLSDQYEEFYDWEVKAETQLAEGVKFYHYAMKNF AQNVYVVEADLTNPNVVLETVMADELCLNPNANNNSNNGKKLRETLSETCTRRRAEGRNI VAGINTGFFNSHDGFPRGFHIEYGEPVFINNPTVRQSLSNHRPGFTFFEDRTVSFDNRSF TGYLKVNDTDYEYYSVNDTIVRLNNTDGYDANLYTSRFRKEPHPGIYNPVGSDALFVVGR CSQQMTVNDGWFDATVTAIVDGRNGASVEVPFVSEKTDWVLQVTGEKAAALAAALKVGDA VRINANVSIGSVSKQIIMHNSSMYRFLNGGNWNAVNDATLMPATCIGADQAGTTVKLVCV DGRTSIDTGMNYWQLYMTMKKLGLHNAIRFDGGGSTTLWKWENGAGAIANRPCDSKGERS CMNYMHVRIK >gi|313159403|gb|AENZ01000009.1| GENE 150 174196 - 175170 1501 324 aa, chain - ## HITS:1 COG:alr1912 KEGG:ns NR:ns ## COG: alr1912 COG0167 # Protein_GI_number: 17229404 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Nostoc 
sp. PCC 7120 # 1 317 3 323 343 179 34.0 6e-45 MKAKYLGLDLSSPVVVSSSPYTATVSNIEQCVRNGAGAVVLKSIFEEQIIRHAAALDYAS QQGMGDSGEYLERYIGDAYKGEFLKLVADARTTGVPVIASINCIASAEAWTDYAVSMEQA GAAALELNIFLQPTDRHRSAQELEQEYADVVRRVAEAVKIPVSVKLPMRLTNVLAVADSL LARGARGVVMFNRFFEPDIDVERMTFVNGDPFSEPAELRNVLRSVALCTTAVPQLDVSVS TGVHDGEAAVKALLCGANAVQICSAIHEKGYGVIVDINRFIDLWAERHGFGSLAEFRGKM NYGNAESDVYQRVQYMKYFPHDAE >gi|313159403|gb|AENZ01000009.1| GENE 151 175260 - 176360 1821 366 aa, chain - ## HITS:1 COG:BH2816 KEGG:ns NR:ns ## COG: BH2816 COG0404 # Protein_GI_number: 15615379 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system T protein (aminomethyltransferase) # Organism: Bacillus halodurans # 1 365 4 360 365 313 44.0 2e-85 MKTTAFTKYHIAAGAKMAEFAGYNMPIEFSGINDEHMAVRGGAGVFDVSHMGEIWVKGPK ALGLLQRITTNDVSKLYDGKVQYTCMPNGRGGIVDDILVYRVDAETYMLCVNAANIEKDW NHICEQGRAFGMEAGHGKELYNASDEICQLAVQGPLAMKIVQKMCDEPVEEMEYYTFKKM KVAGCDAILSITGYTGSGGCEIYAANEDGDKLWKALWEAGEEFGLKNIGLGARDTLRLEK GFCLYGNDIDDTTSPLEGGLGWITKFAEGKEFIDRPLLEKQKAEGVTRRLVGFKMIDRGI PRHGYEIAAPDGTRIGHVTSGTMSPCMKVGFGLGYVTPEYAKAGTEIAVVVREKPLRAEV VKIPFV >gi|313159403|gb|AENZ01000009.1| GENE 152 176378 - 177586 1961 402 aa, chain - ## HITS:1 COG:no KEGG:Slin_5553 NR:ns ## KEGG: Slin_5553 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 17 400 13 398 398 387 48.0 1e-106 MNSDKYVLQEVVTPALEREWLDLPKKIYKGNRNWVCPLDTDIRKVFDPARNEQFADGEAV RWVVRNRAGEVVGRIAAFYNREKAALEEQPTGGCGFFESIDDQQVADLMFDASRMWLASR GMEAMDGPINFGQRDAWWGLLVEGYEFQPLYENPYNPPYYKELFENYGFRNYFNQNTYIW KIYDDDVNAMVHDRAKRLFSTPGYGFRQIDMSRIEEEAENFRIIYNKAWSLFSGVKPMTQ EEAMKIMETMKPIIDPEIIFFAYFNDEPIGFFIMVPDLNRIIGKFNGKFGLIQKLRMLWD LKVRKASDRIFGIIFGIAPEFHGKGVESGMMRFILEKYMRTPRNHYKTIEFAWVGDFNPV MNRMIESYVCATRHKMHTTYRYLFDRTKEFHRCPRLGVKRRE >gi|313159403|gb|AENZ01000009.1| GENE 153 177761 - 178021 324 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159506|gb|EFR58869.1| ## NR: gi|313159506|gb|EFR58869.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 86 1 86 86 164 100.0 2e-39 MTTDYSEFGWSDAQPAFYHELLCKYIERLLPADGSPILDVGCGNGFTANYLAGKGYDVYG IDASRQGIAIANRTGGGIPAVSSSAT >gi|313159403|gb|AENZ01000009.1| GENE 154 178078 - 178437 396 119 aa, chain + ## HITS:1 COG:no KEGG:Galf_1289 NR:ns ## KEGG: Galf_1289 # Name: not_defined # Def: methyltransferase type 11 # Organism: G.capsiferriformans # Pathway: not_defined # 2 102 108 206 209 121 51.0 8e-27 MEVIEHLYSPRTFAAFVRSILEANGGGRFILTTPYHGYLKNLSIALAGKADRHYSALWEG GHIKFWSRRTLAILLREAGFRNMAFTGAGRIPYLWRHMVFSAETPAAAEVRACDPAAEP >gi|313159403|gb|AENZ01000009.1| GENE 155 178728 - 181721 5041 997 aa, chain + ## HITS:1 COG:no KEGG:BF3633 NR:ns ## KEGG: BF3633 # Name: not_defined # Def: phosphoenolpyruvate synthase # Organism: B.fragilis # Pathway: not_defined # 19 987 20 984 989 931 47.0 0 MNKNLEQSLSDGNRPQYRFMKYRIHKILLVCCSYDGYILEEDGHIESQINQEYLDLNMSN PPSFTRVSSTREALDLLGRDDSFDFILTMYNVGELDVFTFAKIVKERHPQIPVALLTSFS KDIYRRIEEQDRSGLDYIFGWHGNTDLIMAIIKLVEDKMNAEEDIVEGGVQAILLVEDSI RFYSTYLPELYKLILLQNTEFLKDALNEQQQILRKRARPKILLATNYEEAVELYDRYKKN MLGVISDVGFVLHRNDPPESEKRDAGIDLCRRIKEDNPLMPVLLQSSQTEFEAQARGLGA GFIAKNSKTLLSQLHEYIAKEFAFGDFLFKDPDTGAVIGRAKDLAQMQEMIATIPDKAFE YHTSQNHLSKWLYSRGLFPLAAAIRRGNKSQFATTEEHRQRIVNLIKDYRILLGQGVVAR FDTETYSDAVAFARIGEGSLGGKARGLAFMNSMLLKHRQYDKHDNLRIMIPRSVVIATDY FDEFIRNNGLKYIISQEFSDEEILSEFVSSTIPVKLQRELKAYIKTVSTPLAVRSSSKLE DSHYQPFAGIYSTYMIPYVDNEDQMLRLLLKAVKSVYASVYFASSRAYLSSSQNLISEEK MAVIIQEVCGTEQNGLFFPTFSGVARSINYYPIGDEAPEDGVCNVAMGLGKLVVDGGRTL RFSPRYPQKVLQTSMPELALRDTQNEVLALSLQPEEFRTSIDDAVNLRRLDIAQIAELRN SRFVCSVWDRENERISDSPFDRGRKVITFNNILKYNTFPLAEIVTDILHMGAEEMRCPVE VEFAVNMDVAPGEQQIFNLLQIRPIIDNQDNRPIDWSEVDTSDALVYGENALGIGMMSDI SDVIYIKSGTFSSLSTEKIADELLELNRRMRDEKRSYILVGPGRWGSSDPFLGVPVKWNH ISEAKVIVECGIEKFDVEPSQGTHFFQNVTSLGVGYLTISPFRGDGVFREETLDKCAAIY EGSFLRQVRFERPLWVCIDGRSNKGIVKETPDEAREE >gi|313159403|gb|AENZ01000009.1| GENE 156 181865 - 183202 2054 445 aa, chain + ## HITS:1 COG:HI0189 KEGG:ns NR:ns ## COG: HI0189 COG0334 # Protein_GI_number: 16272153 # Func_class: E Amino acid 
transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Haemophilus influenzae # 2 444 6 448 449 535 58.0 1e-152 MNVNKLMAELERKHPGESEYLQAVREVLMTVEETYNQHPEFEANRIAERIVEPDRSFTFK VVWVDDKGEVQVNTGYRMQFNNAIGPYKGGLRFHPSVNPSILKFLGFEQIFKNALTTLPM GGAKGGSDFNPKGKSDREVMRFCQAFMVELWRHIGPETDVPAGDIGVGGREIGYLYGMYR KLARENTGVLTGKGMTYGGSLIRPEATGFGAVYFLRQMLEKAGMDIKGQTIAISGFGNVA WGAATKATELGAKVVTISGPDGYIYDPEGLDAEKIAYMLELRASNNDVVEPYVHKFPNAQ FFAGKKPWEVKVDIAMPCATQNELDGEDAKKLLANGVKIVAEVSNMGCRPEAIDAFIAAK IPYGPGKAVNAGGVATSGLEMSQNSQKYNWTAEEVDAKLHQIMSSIHHACLEYGTEKDGF INYMKGANIAGFMKVAKSMVEQGVL >gi|313159403|gb|AENZ01000009.1| GENE 157 183513 - 184763 1498 416 aa, chain - ## HITS:1 COG:no KEGG:Bache_3206 NR:ns ## KEGG: Bache_3206 # Name: not_defined # Def: phosphoesterase PA-phosphatase related protein # Organism: B.helcogenes # Pathway: not_defined # 44 409 45 412 419 361 48.0 4e-98 MKRSFLLCLSAVFLISGPVSAQIKDSLGNECLIPAAESSFSQYVDRRTSTKGYRMTFVAV PLILGGAVVSLYDTDFRRLRNGYVRSFHHDYDDYLQYAPAALLVGMKVCGVESRSSWGRM LVSDAFSAGLMAAAVNSLKYSFRVMRPDGSTRNSFPSGHTATAFMTATMLHKEYGHRSPW YSIGGYTLATLTGVTRQLNNRHWMSDVMVGAGIGILATEFGYFLADLIFKEKGLKVRETY VVFDRYRHPSFVGFSVGLTVIPGSYTPYAGMHTDFKVGPTVGVQGAWFASPYVGVGGRFA VTNLEMNVNGVPQNDEFECCSVTAGPYFSYPLSTRWRIGSKLLCGYEHYKPYTTSGYRIE GRGGFTVGTGFSSTYLVNRNLSVRFTADYDVAPPLVSSSCVAMHKLTLGMAVSAMF >gi|313159403|gb|AENZ01000009.1| GENE 158 184778 - 187351 3364 857 aa, chain - ## HITS:1 COG:MTH1001 KEGG:ns NR:ns ## COG: MTH1001 COG0474 # Protein_GI_number: 15679019 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Methanothermobacter thermautotrophicus # 2 851 18 829 844 425 34.0 1e-118 MIRYNPKGLSSQEAADSRRTHGDNVITPPKDDSAWKLFVEKFKDPIIRILLLAAVLSLAI GSVHKDFTESIGIICAIILATCVGFWFEWDAMRRFRRLNQVNDDIPVKVMRDGSIREIPR RDVVVGDVVYIESGETVPADGELVEAVSLRINESTLTGELEVDKTVDEAHFDSEATYPSN VALRGTTVADGYGVLVATAVGDATEAGRVTEQATVQSDEQTPLNRQLTRLSKLIGRAGIA LAVAIFCVMLGKAVFVGGLFERDWLEISQQVLHIFMVSVAIIVMAVPEGLPMSITLSLAM 
SMRRMLKTNNLVRRMHACETMGAVTVICTDKTGTLTQNRMHVQELVRYDALPAHDFAEIV AANSTAFLDASGAVIGNPTEGALLEWMRSQGEDYEPLRSEAKIVDRLTFSTERKYMATII ESGVSGRRIVCVKGAPEIVRAMCAPDGKDTQVAEQLAGFQGRAMRTLAVAWAETAEDDCL RAVAASQLHFSGVAAISDPVREDVPDAVRRCLNAGIDVKIVTGDTPATAREIARQIGLWD DARDNDRNYMTGTEFAAMSDDELLGRVRELKIMSRARPLDKQRLVRLLQQCGEVVAVTGD GTNDAPALNFANVGLSMGSGTSVAKDASDITLLDDSFASIATAVMWGRSLYRNIQRFVLF QLTINFAAIVICFVGAVFGTDMPLTVVQILWVNIIMDTFAAMAMASLPPNPEVMLEKPRP RDEFIITPGMARTLFICGGIMVAVLLGMLFWWTITAGGLTVRQLTLFFSTFVFLQFWNMF NAKGFETCHSVFTCLRGCREFFLILLAIAAGQVLIVEFGGEVFRTEPLAWREWAAVIGCT SLLAIGGEAIRALRRKR >gi|313159403|gb|AENZ01000009.1| GENE 159 187365 - 188765 2258 466 aa, chain - ## HITS:1 COG:alr0653 KEGG:ns NR:ns ## COG: alr0653 COG0772 # Protein_GI_number: 17228149 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Nostoc sp. PCC 7120 # 79 452 93 423 438 169 29.0 2e-41 MDLWTVLLYVLIVLAGWVSITSASYDEGTADIFSFSHFYMKQLMWIGVAWTTALVVLLLD ERFYHMFAYPAYFAGLALLLGALLFGREVNGAKAWFEFGSFRLQPVEFVKIATALALARV MSAYSFSINRPGDLFKVGVVICIPLFIIILQNDTGSGIVLGSFLFVLYREGLNKWLCIPV LLIAALFIVSFLLSPMTLLVSIILVCTLSEAMMTGEWCSRLIYLASLALASILLCMTMAL VAPGVMDLHACLLTVTLLSLVGVLVYAFRMNLSSTLITVGLFLCTMIFLPTTDYIFNSIL KQHQRDRILSFLGIISDPLGTDYNVNQSKIAIGSGGFWGKGFLEGTQIKYGFVPERHTDF IFCTVGEEWGFLGTMVVLALLCMLILRLMRMGERQQEPFGRIYCYCVAAILLFHVLVNVG MTIGLMPVMGIPLPFMSYGGSSLIAFTILLFIAVRLDASTRQFSLN >gi|313159403|gb|AENZ01000009.1| GENE 160 188788 - 190614 2858 608 aa, chain - ## HITS:1 COG:PA4003 KEGG:ns NR:ns ## COG: PA4003 COG0768 # Protein_GI_number: 15599198 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Pseudomonas aeruginosa # 27 585 38 609 646 243 31.0 1e-63 MRDSEGFVRMRTLQCVVLFVFLLIGGRLAYIQLVDSRYNELAKANVLRHVVQYPPRGEVF DRNGEYLVQSRECYDLMVIYSEIDKKGFDTLRMCEVLGLSREKLEKELANARMRPRAPRV VTSYISKEDKLRFDECNFRGFYAVYRTVRRYPRKVGGNLLGFVGEVNADFLKRHPDYKSG 
DYVGMSGVESAYEPLLKGRKGVKIQEIDTHGAIKGSYMNGVYDSLPEPGRYLVSTIDARL QLLGEELMRGKVGAAVAIEPSTGEILMMVSSPTYDPDELVGRERGNNYMKMIYNKRHPLF SRAVKAKYPPGSTFKLVQGLIGLQEGVLRPSDLHVCHEGYQVGRRRMKCHAHASPLDLRF AVATSCNAYFCYVFRDILDNRKYESVKDGYDVWKEYVESFGFGRKLGSDFLDEGNGYVPD RSYYDRQYRGSWNSLTVISLAIGQDALGCTPLQLANLAAIVANRGYYYIPHIVKKIEGQD SLDRRFYERHYTKVDPKHFEPIVDGMWRGVNVGGTSTAARLEGLDVCGKTGTAENPRGRD HSTFLSFAPKDNPKIAISVYVENGGFGASAALPIASLLEEYYLTDTIRRPALLEHIKNMN IYYPSYDR >gi|313159403|gb|AENZ01000009.1| GENE 161 190611 - 191129 679 172 aa, chain - ## HITS:1 COG:no KEGG:GFO_2960 NR:ns ## KEGG: GFO_2960 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 3 167 4 166 168 75 30.0 9e-13 MHRTLPYISLFVVTVLLQVFLFDNLSISIYLNPLVYVAFVALLPLDTPPVVLLASGLAMG VTMDFAMGAAGVNTIATLLVAFVRPALLRTLYTRDDLREGGVPCAGRLGRRVFLNYLIVL VLLHHAVFFSLEALSWMHLVRTLVRIVVSGAVSVAFIWIIARIFTAKLPVRV >gi|313159403|gb|AENZ01000009.1| GENE 162 191130 - 191984 1316 284 aa, chain - ## HITS:1 COG:BH3030 KEGG:ns NR:ns ## COG: BH3030 COG1792 # Protein_GI_number: 15615592 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell shape-determining protein # Organism: Bacillus halodurans # 39 261 51 277 293 72 27.0 9e-13 MRKLLEFIRSIYVMVLFVVLEVIAVSYYAHSTYYTQARLLARSNQVVGGVHGMFADIRHY FSLGRENRDLLAHMAEMKERLAVYEEAETAARLDSYMQDVGISKYRVITASVASNTVNRA QNLVVLNRGRRDGVAEEMALLASDGSMAGYVVDCTERYSVAMSVLNTSFRASGKLVGSDY FGSIYWDGADPHTVVLDELSKYADPQPGQEVVTTGFSQYFPADVLIGWVESAELNETRTA FKVRVRLAAEMSRLTDVILVENRDLTEIRDLQNSEKVEQHTRLN >gi|313159403|gb|AENZ01000009.1| GENE 163 192079 - 193098 1774 339 aa, chain - ## HITS:1 COG:VC0415 KEGG:ns NR:ns ## COG: VC0415 COG1077 # Protein_GI_number: 15640442 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell morphogenesis # Organism: Vibrio cholerae # 2 332 7 337 347 298 51.0 1e-80 MGIFSLTQELAIDLGTANTLIIYNGKVVVDEPSIVALDVHTGKVVAIGHQARQMHEKTNP NIKTIRPLKDGVIADFNATELMLRGMIKKVKTSGSLFAPSLRMVICIPSGSTNVEIRAVR 
DSAEHAGGREVYMIYEPMAAALGAGLDVEAPEGNMVIDIGGGTSEIACISLGGIVCSESI NTAGDVFTNDIQSYVRQQHNIRIGERTAEAIKCSIGAAVSDLDEEPEDFVVTGPNMLTAL PQTVSLSYSEIAYALEKSLTKIDAALMKVLESMPPELYADIVKNGIYLAGGGALIKGLDR RLNEKTGIPFHIAEDPLRAIARGTGIALKNINRFSFLMK >gi|313159403|gb|AENZ01000009.1| GENE 164 193296 - 194303 1655 335 aa, chain + ## HITS:1 COG:SA1306 KEGG:ns NR:ns ## COG: SA1306 COG0240 # Protein_GI_number: 15927055 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Staphylococcus aureus N315 # 10 333 3 327 332 179 32.0 9e-45 MEYKIGKEARCAVIGYGSWATAIVGLLAANEARVGWYVRNPEVLEGLLAEGRNPRYLSDV EFDRRRIAPSDDLDEVVREADIVILATPSAYLKTFLEPLTVSLQDKFVVSAIKGIVPGDY KTIVEYVHDRYDLSYKQIGIITGPSHAEEVSRGKLSYLTVVCTDPENAQMLGEKFATDYI HLSYSTDLYGIEYAAILKNIYALSVGIAVGLGYGDNFLAVLIANSAGEMRRFLEESYPAE RDTLVSGYLGDLLVTCYSVYSRNRRLGLLIGHGCTVKSALNEMTMVAEGYFAADCIRHIN ARHKIEMPIAEMVYNVLYQGASARKCMKELTSKLI >gi|313159403|gb|AENZ01000009.1| GENE 165 194311 - 195642 2281 443 aa, chain + ## HITS:1 COG:BH3343 KEGG:ns NR:ns ## COG: BH3343 COG0166 # Protein_GI_number: 15615905 # Func_class: G Carbohydrate transport and metabolism # Function: Glucose-6-phosphate isomerase # Organism: Bacillus halodurans # 19 443 24 449 450 498 56.0 1e-141 METLKLDISKTGVAVSAAMQAKAQAANALLESGKGAGSDFLGWVHLPSSITPDRIEAIEK QAAKLREKAEVIICIGIGGSYLGAKAVLEAMSDSFKFLHKKRTEPVVVFAGQNISEDYTH ELLEAVKEYSIATIVISKSGTTTEPAIAFRLIKAEIEKRYGKQEAAERIVAITDKARGAL KTLADNEGYPTFVIPDDVGGRFSVLTPVGLLPLAAAGVDIAALVRGAQEMERATADGTPF EKNPAAVYAAVRNELYEGGKKIEILGSYEPKLQYINEWWKQLYGESEGKDGKGIFPASVT LTADLHSMGQYIQDGERTLFETIISVAEPAGKVVVEADGENLDGLNFLAGKRISEINRMA ELGVQLAHVDGGVPNIRIEIPQIDAHAVGSLLYFFERACGISGYILGVNPFDQPGVEAYK KNMFALLDKPGYEEASKAIKARL >gi|313159403|gb|AENZ01000009.1| GENE 166 195779 - 196585 570 268 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159546|gb|EFR58909.1| ## NR: gi|313159546|gb|EFR58909.1| hypothetical protein HMPREF9720_0854 [Alistipes sp. 
HGB5] # 1 268 21 288 288 513 100.0 1e-144 MIMKGRPLLVLLLFAASAVILAGMYYAATREGTTTRQDKWAGVLSDLDACSRRKHVKSAQ YDHFAGIARQEREHDAERLFRAMAHAERLQEYNCANAIVRLGGRYAPPERVTVFHGTTDD NLRRSIDFARRPPEGLHADDIERALQSGNRYAARVLIWARSGDMRHLALMETCRRRLAAN RPAGTQGNIAGRRANNGTAGAPANGTARASGRGTAEGSENSRPDATADNAPKGYLVCPTC GNIYPAGCSDSYCPCCLTDGRRFVAFED >gi|313159403|gb|AENZ01000009.1| GENE 167 196607 - 197497 1002 296 aa, chain - ## HITS:1 COG:BH2283 KEGG:ns NR:ns ## COG: BH2283 COG0613 # Protein_GI_number: 15614846 # Func_class: R General function prediction only # Function: Predicted metal-dependent phosphoesterases (PHP family) # Organism: Bacillus halodurans # 4 266 7 258 290 127 31.0 3e-29 MIRADLHIHSQYSSDGEFRPADIVGKCAAGGVDLFALTDHNTVRGLDEACDGALQAGLEF VPGIEIDCSFEGTDLHLLGYGIDWKSPDFRALEETVAAKVMASFGETVGKLRRLGFAVDE QAVLAAAAGKLPTMELIAEVMLSDAACDTPLLAPYREGGVRGDMPYINFYLDFGAQGRPA FVPMEYMNFRDAVELVRDNGGVPVVAHPGLNLRGRERIVEKLLERGAEGLEVFNNYHDDR QIAYFAPLVRRRGALMTCGSDFHGKTKPLIHVGRFGCDARWESSLADSVARLAKGV >gi|313159403|gb|AENZ01000009.1| GENE 168 197622 - 198347 1248 241 aa, chain - ## HITS:1 COG:AGl2215 KEGG:ns NR:ns ## COG: AGl2215 COG0217 # Protein_GI_number: 15891221 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 11 240 13 241 248 176 43.0 4e-44 MGRAFEYRKARKMKRWGHMARTFTKLGKEIEIAVKAAGPDPSGNTRLRILMQNAKAENMP KENVERAIKRATEKDAADYKEVIYEGYGSHGIAFLVETATDNTNRTVANVRMYFNKCGGT LGNSGSVAFMFDHKCVFKFRPAEGVDMEELELEMIDLGVDEFYPEEDGVTVYAPYESFGA IQKWLDDKGYEIVSGESVYLPTDTKELDAEGRESIEKLVEKLEEDDDVTNVYHNMKEVEE E >gi|313159403|gb|AENZ01000009.1| GENE 169 198500 - 198877 173 125 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900839|ref|NP_345443.1| lactoylglutathione lyase [Streptococcus pneumoniae TIGR4] # 5 125 3 126 126 71 34 3e-11 MEIKSRFDHFNINVTDLGRSLEFYDKALGLKETDRKEAGDGSFILVYLGDGQTGFRLELT WLRDHAGAYELGENESHLCMRVAGDYDAVREYHRAMGCICYENHDMGLYFISDPDDYWIE VLPVK >gi|313159403|gb|AENZ01000009.1| GENE 170 198994 - 199575 855 193 aa, chain + ## HITS:1 
COG:sll0216 KEGG:ns NR:ns ## COG: sll0216 COG0009 # Protein_GI_number: 16329324 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Synechocystis # 1 193 7 199 210 108 32.0 4e-24 MQNNTELQREADEAVRVMKAGGIILYPTDTVWGLGCDATNAEAVERIYKLKRSENKKSML VLCASADMVVRYVNKAPGIAFEVMELATSPLTAILPGAAGVAENLIPEERTLGVRIPDHE FCRRILRGLGRPVVSTSANISGEATPAGLAEVSREIIDGVDFVVNPRFEGKPTKKASSII AFGEGGEVKIIRE >gi|313159403|gb|AENZ01000009.1| GENE 171 199581 - 199976 553 131 aa, chain + ## HITS:1 COG:MA1423 KEGG:ns NR:ns ## COG: MA1423 COG0824 # Protein_GI_number: 20090283 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Methanosarcina acetivorans str.C2A # 8 128 5 126 151 62 27.0 3e-10 MAKTLETSIQKRFSDVDPFQHVNNVSQQMYFDVGKMEYYEKILGDEVLLGDLRIVTVSTS TSYMDQIRLHDRVRVTTTCEKIGNKSLTLFQRLLAGDRVCSESRSVMVAFDFSRQQSMPV PDAWRERLLAD >gi|313159403|gb|AENZ01000009.1| GENE 172 199973 - 200926 1549 317 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1888 NR:ns ## KEGG: Palpr_1888 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 13 294 9 289 307 97 26.0 5e-19 MPLGIPVSKTRYNIDLVFANLFFGANFSFYVSLTRNYLDFQQIFMLQVLSAAVFFIPFAL FSKQSFRIRWRDAGNILIVTMLIVYGWMYMLLWGSSYTSPIDASIISTLGPAVTLITDHL MHPHKYIRARVVGVVCALIGAAVLLFDHGFVLTHGSRAYGNALVLVAVVAIAINTVIIKP QLERLGTLVVMGWYYIIGLAVTAPFFWKYIAHTQFLKLPLQAQAELAYILILGTVLPMYL LYRGTEKLTSVHTALYRYIQPVTAGILAVVRGQAVFDTANVVALVFIFAGVVLVVIGYKY YVRHGLPPVGSDGKLRA >gi|313159403|gb|AENZ01000009.1| GENE 173 200976 - 201914 1319 312 aa, chain + ## HITS:1 COG:BS_yyaM KEGG:ns NR:ns ## COG: BS_yyaM COG0697 # Protein_GI_number: 16081133 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 17 302 9 302 305 67 22.0 5e-11 
MDKGKLKGHAALWVANLVWGLNAPIGKSVLWSEHNPGGVSPFALSVYRMLGACLLFWSVS LFLPRERVARRDIVLLLFASVFGIQLNQMLFLWGLSLTSPIDSSIIATVVPVLTMVLATL FLREPITWLKAGGVFLGCAGALLLILVSRHGTGHTSSVKGDVLCLVSAVSYATFLTAFRN VIVKYSPVTTMKWMFLFAAVVAAAVYYRPLAAVDYAALAPRTWAGIGYVVVCSTFLSYFM VPIGQRYLRPTVVSMYNYVQPVVAVLFSVAIGLDSFGFTKAAAALCVFAGVWLVTKSKSR AQVEAEAGMKKG >gi|313159403|gb|AENZ01000009.1| GENE 174 201972 - 203381 2157 469 aa, chain - ## HITS:1 COG:ML2697 KEGG:ns NR:ns ## COG: ML2697 COG0617 # Protein_GI_number: 15828457 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA nucleotidyltransferase/poly(A) polymerase # Organism: Mycobacterium leprae # 6 451 25 474 486 228 34.0 2e-59 MPLTNPIFRRISRLAEEQGVRAFVVGGYVRDHFLRRPSTDIDVVVVGSGIALAEALGREL HAKVSVFKTFGTAMLRHKGVEVEFVGARRESYTQDSRKPQVEAGTLEDDQRRRDFTINAM AWSLNAGSFGELVDPFDGMDDLEECVIRTPCDPDVTFSDDPLRMMRAVRFASQLGFTIEG ETFDAIRRNAHRIRIVSRERIAAELNKIVLSPVPSMGFELLELTGLLELIFPEMHNLKGV EKRGAHAHKDNFIHTLKVLDNVARRSGDLWLRWAAVLHDIAKPLTKAYDPKVGWTFHGHE VLGSKMVPAIFRQLKLPLNEHMKFVQKLVFLHLRPIILSEDLVTDSAVRRLLFEAGDDVE QLMILCEADITSGIDAKVKRYLANFELVRRKMKDLEERDRIRNFQPPITGEIIMQTYGIG PCRAIGDIKEVIKNAILDGEIPNDYDAAYALMERLAAEKGLTKAGGQNP >gi|313159403|gb|AENZ01000009.1| GENE 175 203386 - 204039 936 217 aa, chain - ## HITS:1 COG:SP1450 KEGG:ns NR:ns ## COG: SP1450 COG2755 # Protein_GI_number: 15901300 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Streptococcus pneumoniae TIGR4 # 40 216 33 210 211 95 33.0 7e-20 MKKIVLLAAALLLAAASFAQSEYNYQKRSLFEQLPIRGNDIVFLGNSITDGGEWAELFNN RHVKNRGISADRSGWLLDRLDPIINGHPKKLFLMIGTNDLAVGITPEEVAANVEKLLDRF AEESPWTKIYVQSILPVNGVDTKAKPKNHWKKGAEIIETNKLLETLCEGRKNVMYVDVYS ALVDEKGMLDKQYTNDGLHLMGEGYLAWKTVIEKFVR >gi|313159403|gb|AENZ01000009.1| GENE 176 204067 - 204780 1033 237 aa, chain - ## HITS:1 COG:XF0060 KEGG:ns NR:ns ## COG: XF0060 COG0854 # Protein_GI_number: 15836665 # Func_class: H Coenzyme transport and metabolism # Function: Pyridoxal phosphate biosynthesis protein # 
Organism: Xylella fastidiosa 9a5c # 2 232 5 246 260 195 49.0 6e-50 MTKLSVNINKIAVVRNSRGGNLPDVVRAALDIERFGAEGITVHPRPDARHIRYDDVRNLK RVLATEFNIEGNPIPSFVDLVLEVVPTQVTLVPDAHDAITSNAGWDTLANRGFLTEVTAR FHEKGIRVSVFVDPDPAMVAGAKACGADRVELYTEAYAREYPAGAEAALAPYLAAAEEAR RQGLGLNAGHDLNLENLRFFVSRIPWTDEVSIGHALICDALYYGLENTVQLYKRELR >gi|313159403|gb|AENZ01000009.1| GENE 177 204961 - 205839 1063 292 aa, chain + ## HITS:1 COG:RSc2650 KEGG:ns NR:ns ## COG: RSc2650 COG0061 # Protein_GI_number: 17547369 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar kinase # Organism: Ralstonia solanacearum # 68 286 72 292 302 145 39.0 8e-35 MKIILFSRRQLPHTAGEICQLFEAFRIFGFDYAVNEEFAPLAEELTGIRIPPEKIYGQCT GKQPANSVMVCYGGDGTLLEGVHRLCGAPIPVMGINAGHLGFLTSAPSAGLNLIFKEIAE GRLTTEARSMIEVTGDYAEQPDTTLALNEFTVQRHGAGMISVETYVDDQMVATYHGDGVI FSTPTGSTAYSLSAGGPVVAPTCACLVISPLAPHNLTMRPVVIPDTAVITLHVHTRRSDA FVTLDNRVYAVGQEATFTVKRAEQKIFLAVPHNISFYDTLRNKMMWGIDIRS >gi|313159403|gb|AENZ01000009.1| GENE 178 205918 - 206631 1065 237 aa, chain + ## HITS:1 COG:SMc02097 KEGG:ns NR:ns ## COG: SMc02097 COG0020 # Protein_GI_number: 15965252 # Func_class: I Lipid transport and metabolism # Function: Undecaprenyl pyrophosphate synthase # Organism: Sinorhizobium meliloti # 7 234 9 236 247 240 53.0 2e-63 MSEQNRIPQHVAIIMDGNGRWAELRGKERYEGHVAGVEPVRASLRAAARWGVKYLTLYAF STENWGRPAQEVDALMELFCKSVVNETPELIRQGVEIRMIGDRTRFSEKVQRYLAEAEQR TAGGKTLTLILALNYSSRSEITRAVRQIAARAASGELAPEEISEGTISAALDTAPYPNPD LIVRTSGEHRLSNFLLWQASYAELYFPEVLWPDFTEEEFDRAMEEYARRDRRFGLVK >gi|313159403|gb|AENZ01000009.1| GENE 179 206658 - 209231 4476 857 aa, chain + ## HITS:1 COG:NMB0182 KEGG:ns NR:ns ## COG: NMB0182 COG4775 # Protein_GI_number: 15676109 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein/protective antigen OMA87 # Organism: Neisseria meningitidis MC58 # 55 776 23 681 797 148 23.0 4e-35 MNYFGKIFTAAAALVLCCTNIFAQEQNPADTAASAKPAFPVDAPILRQDGAQKLYYIRDV NIHGVQYLNPDILKSSAGLIPGDSIYLPSNFIANAISRLWSQRFFSDVKIGAEIEGDSLD 
LAVFLKERPRVNNWDFEGISKGKKKDLLEKLKLKRGSELSDYVIDKNQKLIKAYWSEKGF RNTEVSTRITNDTLRPQMVNVTFLIDRKHKVKIGKINFTGNEQFKDKRLRRTFKKTHQKS INFFRGTKLNESDYENDKDLLIDFYNSKGYRNATIVSDSIYPISDKRLAIDLDISEGNKY YIRNVSWVGNSVYETDDLQRMFGVNKGDTYDKKSMHKRLGIGKETDPEAMSVSSLYQNKG YLMSQIEPAETIIGPDSIDIEVKVFEGKQFTINEVGITGNQRVDDEVIRRELYTRPGELY DRSLLMQTIRTLGSLGHFNPEAIMPDIKPVSNELVNVNWPLEEQASDQFNIAGGWGSGTF VGSVGITLNNLSIKNFFKKGAWRPYPMGQNQRLSLSAQTNGTYYKAFALSFTDPWLGGKK PNSFTISAHFSEQNNAYYVWQKSTQYFRTYGLAAGLGKRLNWPDPYFTLYGEASYERFSL KNWNTFGMTNGAANLLSLKFVFARNSVDQPIYPRRGSEFSASVQFTPPYSLWDGKDYKKL AELANSTDSKVADKANQERYRWVEFHKWQFKAQWFQALTKNSNLVLMLKAEMGYLGNYNK YKVSPFERFEVGGDGMSGYNIYGIDIISMRGYEDGALDPTNDYSVAYNKYTAEIRYPVIL KPSSQIYVLGFLEGGNAFDSWKKFSPFKIKRSAGFGVRLYLPVVGMLGIDWGYGFDSPAG QSGKSGSQFHFVMGQQF >gi|313159403|gb|AENZ01000009.1| GENE 180 209415 - 209921 913 168 aa, chain + ## HITS:1 COG:no KEGG:Sph21_5200 NR:ns ## KEGG: Sph21_5200 # Name: not_defined # Def: outer membrane chaperone Skp (OmpH) # Organism: Sphingobacterium_21 # Pathway: not_defined # 2 163 4 166 178 112 37.0 6e-24 MKRLILIAAFILTAGTLAAQNYIIVNSEKVFKSVAAYNKAISDLDELAKQYQEQVDAKFA EVEALYNAYMNQKASLSATTRQTRENAILAREKEAQTFQESLFGNDGALMKKRIEMIEPI QKKVFAAIEAYAKQAGADVVLDSANNPTLLYTNPSVERTQQLIDILKK >gi|313159403|gb|AENZ01000009.1| GENE 181 209952 - 210506 983 184 aa, chain + ## HITS:1 COG:no KEGG:BF0502 NR:ns ## KEGG: BF0502 # Name: not_defined # Def: putative outer membrane protein OmpH # Organism: B.fragilis # Pathway: not_defined # 1 170 1 168 169 98 35.0 1e-19 MKKAIKLTLAVVFVMGATSLFAQQKFGRINTQEIIVGMPETKEMQTNMEAYAKDLQDNLE SMTVEYNQKLQEFQKNFNTLSESVRQLKENDLNALIQRRNEFEQAAQQDFQKRQNELLAP IIEKAKNAIDKVAAAGGYLAVFDTSTGSLAYFDEASLTDIAPAVKKELGITDAPAAAAAP AASK >gi|313159403|gb|AENZ01000009.1| GENE 182 210690 - 211598 1360 302 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 197 299 183 285 288 87 38.0 4e-17 
MNNTESQTPINSFTLAELIDLAGEQRQGLMRECITASSDSQMQVFRFPCRIDAFIIGVGT EGETSVSFNLHEFRLKKDSMFIFTPKNILQVNSQQYFKADVIAISPDFMRRINIDIKNMM PLFLKFVENPTLALTPEESRSMRGMIAQIERETRGPETHFSFDIVSGLIAATIYKVGDIM YHYLAEHPEGQNNSHNRAEEYFKQFTHLLGEHFREERSVGFYARQLCITPKYLTTLIKRI SGQSVSEWIDNYVILEAKTLLKYSTMSIQEIAYYLNFPNQSFFGSYFKRNTGMSPSQYKA QN >gi|313159403|gb|AENZ01000009.1| GENE 183 211788 - 214163 3195 791 aa, chain - ## HITS:1 COG:no KEGG:BDI_0453 NR:ns ## KEGG: BDI_0453 # Name: not_defined # Def: putative surface membrane protein # Organism: P.distasonis # Pathway: not_defined # 6 791 167 946 946 402 31.0 1e-110 MRGLSRILAVAVLGFLCSACSVTRKIPEGQYLLQKVTIESDKSTPRKERITAADLEKYVR QTPNKRFLGTNFYVWLYEQANPGKQNWWNNWKRKIGQEPVLLDMSLTERSAQNLKVYMDT RGFFSSQATFEVDTTSRRRRAKVVYRTRQGEPYRIDSISYDFQDKFLEQIILPDTANTLI RPGRVFDIAVLDRERERVTAFLKERGYYNFTVNNIDYVADTLGGNHQVDVQVNIKQYLTG YNERGQAVMDNNMVYRIDRINVFPAYDPTVARTDSTFLSRLDTLYYRGLNIIYEKRPNLR PAILRQSVPLYPNYVYNSAQVNRAYTDLMSLGYFKSAKIAFVEQPRSVDVTNYVSFIGAS ADSTQTRFTKEGYLECNILCTPALKQSFKVDLEGSTTSSFYGLKATVGYQNRNIFRGAEA LDVSFTAGYEFMKAPDAKKKRATEFGVTTGLTFPRFLVPWRTRRFRSVNQPKTKVELSVN FQDRPYYRRTLSSAGITYQWTNNRYSSFSLRPVDINVVDVNRLDSTFLGKTTNKYLKNSF RTQFIGGLSFGYSYNNQRKNLGGNATNIRFNLETAGNLIDAVDRLFYARPKEGEPAKIFG IEYSQYFRTDLSVSRKIMLGEVSALVGRLYGGVAMAYGNSSAVPFDRQFYAGGSNGMRGW TPRTLGQGSVPNPHDSFPIQTGDVKLEANLELRFPIWGMVHGATFFDLGNIWYIRRNPSE YSDDAVFFFDKFYKQLGFNTGLGLRFDIKFAVLRLDWGVQLHNPNNPAGERWIHNLRWKN TALNFGVGYPF >gi|313159403|gb|AENZ01000009.1| GENE 184 214172 - 214918 1123 248 aa, chain - ## HITS:1 COG:BH1048 KEGG:ns NR:ns ## COG: BH1048 COG0778 # Protein_GI_number: 15613611 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Bacillus halodurans # 1 205 5 207 244 118 32.0 9e-27 MKSVLFKHRSIRKFCSTPVPEELLQEILAAASRASTCGNMQLYSLVVTRDAALRAKLAPC HFNQPMVTQAPCVVTVCADVHRFSMWCEQRDADPAYDNFAWFLNASTDALLAAQNLCVEA EMNGLGICYLGTTIYTAGMIAEILELPKGVIPVTTIVLGYPDESPELTDRLPLEAVVHYE KYTDYTAAEIDELWAEREESELTKRLLEENGLPNLAQIFTQRRYVREDNLSISNSYFALL KEKGFFNN >gi|313159403|gb|AENZ01000009.1| GENE 185 
215056 - 215862 977 268 aa, chain + ## HITS:1 COG:aq_1838 KEGG:ns NR:ns ## COG: aq_1838 COG0253 # Protein_GI_number: 15606880 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate epimerase # Organism: Aquifex aeolicus # 3 261 1 270 279 160 37.0 2e-39 MLVSFVKYEGAGNDFILIDDREELFSADARLIAALCDRHFGIGADGLMTLRRSVEMDCSM RYYNADGSPGEMCGNGARCFALFAEHLGIGGETKYFDATDGMHTARICRAKNRTGEIELG MINVSEIRSGKGWWFLNTGVPHYVEFTEELDGVDVKGLGRMIRHDTARFPQGTNVNFAEI AGEGEIRMRTYERGVENETLACGTGATAAAIITNYALQHETTKYRITVPGGELHVRFSHE PGTQTYTDIRLTGPARRVFKGVFETDNF >gi|313159403|gb|AENZ01000009.1| GENE 186 215884 - 216084 241 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159472|gb|EFR58835.1| ## NR: gi|313159472|gb|EFR58835.1| conserved domain protein [Alistipes sp. HGB5] # 1 66 1 66 66 68 100.0 1e-10 MKTQKGIREHEADELRKAKRLEPMRKSGKERHSLYRSIDEEEEEELMPRRESVLDYLDDQ DEEQPL >gi|313159403|gb|AENZ01000009.1| GENE 187 216087 - 216896 1065 269 aa, chain + ## HITS:1 COG:aq_327 KEGG:ns NR:ns ## COG: aq_327 COG0596 # Protein_GI_number: 15605847 # Func_class: R General function prediction only # Function: Predicted hydrolases or acyltransferases (alpha/beta hydrolase superfamily) # Organism: Aquifex aeolicus # 25 265 10 251 256 97 29.0 3e-20 MTEKFIMAGSTALHVCDSQAGDKCVVLLHGYLESLLVWEDFVPYIYKEVRVVTLDLPGHG ISVVTGAVHTMDFLADTVADALKALGIGRCTLVGHSMGGYVALAFCERHPEMLDGVVLLS STPNPDTPEKAENRRREIALVEAGKKEMLARVAPAAGFAEENRARMRDEIEDLTEQVFVT EDEGIVALLGGMIARRDQNEMLRTSKVPQLFILGRKDGYIPPEAAEKMVAEHPQAQVVWL ENSGHMGFLEEPEAAAQAILDFVHDEKIG >gi|313159403|gb|AENZ01000009.1| GENE 188 217035 - 217313 512 92 aa, chain + ## HITS:1 COG:no KEGG:Fisuc_0607 NR:ns ## KEGG: Fisuc_0607 # Name: not_defined # Def: hypothetical protein # Organism: F.succinogenes # Pathway: not_defined # 4 90 2 87 195 105 55.0 6e-22 MNEEMKFCQSCGMPMQTAGDFGTEADGGASADYCVYCYKNGAFTEAYTMEEMIRHCAEFH EEFRDENGKSYTREEAVRLMREYFPTLKRWSK >gi|313159403|gb|AENZ01000009.1| GENE 189 217415 - 217957 877 180 aa, chain + ## HITS:1 COG:yggG KEGG:ns NR:ns ## COG: yggG COG0501 # 
Protein_GI_number: 16130837 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli K12 # 15 180 47 212 294 117 40.0 8e-27 MKKLFFMGFALLLCAMWPADAAAQLKIGGRKLNTGKLLEAGKDVAKAVTLSDKDIAQLSR EAVEWMDANNPIADETTEYGARLKRLTEGITEINGLPLNFKVYHVVDVNAFACGDGSIRV FSALMDLMDDDELMAIIGHEIGHVVHADVKHAMKNAYLSSAARNAAGAAEGSTLAKLSDS >gi|313159403|gb|AENZ01000009.1| GENE 190 217970 - 218224 219 84 aa, chain + ## HITS:1 COG:ECs3811 KEGG:ns NR:ns ## COG: ECs3811 COG0501 # Protein_GI_number: 15833065 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Zn-dependent protease with chaperone function # Organism: Escherichia coli O157:H7 # 7 77 224 290 294 57 40.0 5e-09 MTAFTGAQFSQKQEYEADEYGFEFSVKNGFSPYGMGNSLNKLVELSKGAKSSTVQKMFSS HPDSEKRAARMKEKADAYVAAHQQ >gi|313159403|gb|AENZ01000009.1| GENE 191 218402 - 219232 598 276 aa, chain - ## HITS:1 COG:ML1658 KEGG:ns NR:ns ## COG: ML1658 COG0266 # Protein_GI_number: 15827876 # Func_class: L Replication, recombination and repair # Function: Formamidopyrimidine-DNA glycosylase # Organism: Mycobacterium leprae # 1 273 1 278 282 84 27.0 3e-16 MLEIPESYSFARQAADMLSGRTVTDVFNATHPHKFTWYLGDPADYRARLVGKTVRAAEGH GAFIDLLLDDEAHIALSDGVNLRYHAPGAEVPPKYQLLIAFDDDSFLVFTVAMYGGIIAF RKNFDNPYYLGALSKPSPLDDAFDEACFGRLLSDAKSNLSAKAFLATEQRIPGLGNGVLQ DILFKSHIHPKRKLATLGDAELGRMYSSVKSTLRDMADRGGRDTEKDLLGKPGGYKTLLS KNTFAEPCPGCGGVIVKEAYLGGAVYYCPVCQPLIK >gi|313159403|gb|AENZ01000009.1| GENE 192 219660 - 219869 92 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLLTAMHDNPNHGFWQANARSNPPAIIRESRASPDKRMILFCMCFENQSGSVAQILTVAG TKLMQISSR Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:24:41 2011 Seq name: gi|313159314|gb|AENZ01000010.1| Alistipes sp. 
HGB5 contig00078, whole genome shotgun sequence Length of sequence - 93099 bp Number of predicted genes - 87, with homology - 86 Number of transcription units - 41, operones - 18 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 52 - 102 8.5 1 1 Op 1 . - CDS 136 - 690 970 ## COG1592 Rubrerythrin - Prom 728 - 787 2.8 2 1 Op 2 . - CDS 811 - 1734 1242 ## BF1573 hypothetical protein - Prom 1868 - 1927 4.8 + Prom 1728 - 1787 3.8 3 2 Op 1 4/0.000 + CDS 1878 - 3425 2425 ## COG0531 Amino acid transporters 4 2 Op 2 . + CDS 3446 - 4429 1524 ## COG2066 Glutaminase + Term 4454 - 4491 9.4 + Prom 4855 - 4914 4.8 5 3 Op 1 . + CDS 5057 - 5302 252 ## gi|313159390|gb|EFR58754.1| conserved hypothetical protein 6 3 Op 2 . + CDS 5348 - 5599 163 ## gi|313159387|gb|EFR58751.1| conserved hypothetical protein + Term 5668 - 5701 2.1 - Term 5646 - 5698 11.6 7 4 Op 1 3/0.000 - CDS 5715 - 7442 2515 ## COG1960 Acyl-CoA dehydrogenases 8 4 Op 2 29/0.000 - CDS 7442 - 8461 1654 ## COG2025 Electron transfer flavoprotein, alpha subunit 9 4 Op 3 . - CDS 8479 - 9357 1448 ## COG2086 Electron transfer flavoprotein, beta subunit - Prom 9382 - 9441 3.5 - Term 9406 - 9464 2.1 10 5 Tu 1 . - CDS 9518 - 10783 1680 ## COG0104 Adenylosuccinate synthase - Prom 10842 - 10901 2.6 11 6 Tu 1 . + CDS 10793 - 11263 4 ## - Term 11434 - 11471 4.2 12 7 Op 1 . - CDS 11546 - 12154 725 ## COG4886 Leucine-rich repeat (LRR) protein 13 7 Op 2 . - CDS 12160 - 12492 601 ## gi|167752041|ref|ZP_02424168.1| hypothetical protein ALIPUT_00283 - Prom 12521 - 12580 5.9 + Prom 12550 - 12609 10.0 14 8 Tu 1 . + CDS 12632 - 14716 2896 ## COG0272 NAD-dependent DNA ligase (contains BRCT domain type II) 15 9 Op 1 . + CDS 15099 - 15989 1346 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 16 9 Op 2 . + CDS 16029 - 16766 1343 ## gi|313159364|gb|EFR58728.1| hypothetical protein HMPREF9720_2725 17 9 Op 3 . + CDS 16727 - 17464 810 ## BT_0727 hypothetical protein 18 9 Op 4 . 
+ CDS 17486 - 18217 930 ## HMPREF9137_0437 putative lipoprotein + Term 18220 - 18265 9.3 - Term 18212 - 18247 5.1 19 10 Tu 1 . - CDS 18255 - 19526 1820 ## COG1187 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases - Prom 19570 - 19629 2.9 + Prom 19595 - 19654 3.3 20 11 Op 1 . + CDS 19703 - 21028 2042 ## COG0534 Na+-driven multidrug efflux pump 21 11 Op 2 1/0.143 + CDS 21034 - 21567 621 ## COG3910 Predicted ATPase 22 11 Op 3 . + CDS 21564 - 21773 282 ## COG3910 Predicted ATPase + Term 21774 - 21806 5.6 - Term 21852 - 21895 7.7 23 12 Op 1 . - CDS 21934 - 23280 2021 ## Odosp_0710 tetratricopeptide TPR_1 repeat-containing protein 24 12 Op 2 . - CDS 23319 - 25874 3955 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit - Prom 25902 - 25961 3.3 + Prom 26125 - 26184 1.6 25 13 Tu 1 . + CDS 26226 - 27005 1113 ## COG2116 Formate/nitrite family of transporters - Term 27106 - 27141 0.6 26 14 Tu 1 . - CDS 27147 - 27869 615 ## gi|313159362|gb|EFR58726.1| hypothetical protein HMPREF9720_2735 27 15 Op 1 1/0.143 - CDS 28127 - 31081 3698 ## COG1674 DNA segregation ATPase FtsK/SpoIIIE and related proteins 28 15 Op 2 . - CDS 31173 - 31946 950 ## COG0796 Glutamate racemase 29 15 Op 3 . - CDS 32007 - 32531 731 ## COG0703 Shikimate kinase 30 15 Op 4 24/0.000 - CDS 32541 - 35687 4680 ## COG0188 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit 31 15 Op 5 1/0.143 - CDS 35694 - 37568 3158 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 32 15 Op 6 . - CDS 37572 - 38300 1124 ## COG0204 1-acyl-sn-glycerol-3-phosphate acyltransferase - Prom 38533 - 38592 3.6 + Prom 38366 - 38425 4.4 33 16 Tu 1 . + CDS 38497 - 39444 985 ## COG0789 Predicted transcriptional regulators + Term 39466 - 39506 10.5 + Prom 39538 - 39597 4.8 34 17 Tu 1 . + CDS 39623 - 41995 3314 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 35 18 Tu 1 . 
+ CDS 42113 - 42982 677 ## BF3797 putative integral membrane protein - Term 42961 - 42998 1.1 36 19 Op 1 . - CDS 43022 - 43933 1248 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 37 19 Op 2 8/0.000 - CDS 43939 - 44514 1016 ## COG1704 Uncharacterized conserved protein 38 19 Op 3 . - CDS 44634 - 45554 1232 ## COG1512 Beta-propeller domains of methanol dehydrogenase type - Prom 45594 - 45653 1.8 - Term 45605 - 45660 6.1 39 20 Tu 1 . - CDS 45673 - 46956 2181 ## COG0366 Glycosidases - Prom 47096 - 47155 8.5 + Prom 46898 - 46957 5.0 40 21 Op 1 2/0.000 + CDS 47127 - 47924 1241 ## COG1596 Periplasmic protein involved in polysaccharide export 41 21 Op 2 . + CDS 47933 - 50392 3558 ## COG0489 ATPases involved in chromosome partitioning 42 21 Op 3 . + CDS 50394 - 50783 543 ## PRU_0341 hypothetical protein + Term 50836 - 50879 8.7 - Term 51101 - 51142 10.2 43 22 Tu 1 . - CDS 51158 - 52069 1503 ## BF3922 hypothetical protein - Prom 52104 - 52163 3.6 + Prom 52045 - 52104 3.5 44 23 Op 1 . + CDS 52189 - 53061 1273 ## COG0739 Membrane proteins related to metalloendopeptidases 45 23 Op 2 17/0.000 + CDS 53068 - 54219 1517 ## COG0743 1-deoxy-D-xylulose 5-phosphate reductoisomerase 46 23 Op 3 . + CDS 54233 - 55549 2215 ## COG0750 Predicted membrane-associated Zn-dependent proteases 1 47 23 Op 4 . + CDS 55578 - 56174 584 ## Amuc_1287 protein of unknown function DUF1121 48 23 Op 5 . + CDS 56188 - 56856 850 ## Avi_5186 outer membrane autotransporter 49 23 Op 6 . + CDS 56867 - 58234 2086 ## COG1066 Predicted ATP-dependent serine protease 50 23 Op 7 2/0.000 + CDS 58270 - 59238 1089 ## COG0451 Nucleoside-diphosphate-sugar epimerases 51 23 Op 8 . + CDS 59250 - 59582 449 ## COG0526 Thiol-disulfide isomerase and thioredoxins 52 24 Op 1 . - CDS 59593 - 60288 681 ## BT_1845 hypothetical protein 53 24 Op 2 . - CDS 60297 - 61190 1271 ## gi|313159388|gb|EFR58752.1| hypothetical protein HMPREF9720_2764 54 24 Op 3 . 
- CDS 61243 - 61959 741 ## COG4123 Predicted O-methyltransferase 55 24 Op 4 . - CDS 61962 - 62735 1003 ## Palpr_0457 hypothetical protein 56 24 Op 5 . - CDS 62747 - 63397 334 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 57 24 Op 6 . - CDS 63411 - 64250 1193 ## COG0294 Dihydropteroate synthase and related enzymes 58 24 Op 7 . - CDS 64231 - 64824 679 ## COG0237 Dephospho-CoA kinase 59 24 Op 8 . - CDS 64821 - 65237 686 ## gi|291513720|emb|CBK62930.1| hypothetical protein AL1_02330 - Term 65242 - 65289 11.4 60 25 Tu 1 . - CDS 65311 - 68337 4893 ## COG0342 Preprotein translocase subunit SecD - Prom 68380 - 68439 4.8 + Prom 68310 - 68369 5.3 61 26 Op 1 . + CDS 68485 - 69402 1344 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily 62 26 Op 2 . + CDS 69421 - 69843 367 ## gi|313159389|gb|EFR58753.1| conserved hypothetical protein 63 26 Op 3 . + CDS 69834 - 70241 487 ## gi|313159396|gb|EFR58760.1| hypothetical protein HMPREF9720_2774 + Prom 70293 - 70352 3.4 64 27 Tu 1 . + CDS 70375 - 71679 1659 ## COG3669 Alpha-L-fucosidase + Term 71742 - 71778 5.0 - Term 71674 - 71717 13.1 65 28 Tu 1 . - CDS 71767 - 73686 2833 ## COG0443 Molecular chaperone - Prom 73848 - 73907 2.0 - Term 74208 - 74243 7.2 66 29 Tu 1 . - CDS 74275 - 74517 329 ## BT_0902 hypothetical protein - Prom 74616 - 74675 8.1 - Term 74622 - 74667 14.6 67 30 Op 1 . - CDS 74690 - 76279 2176 ## gi|313159391|gb|EFR58755.1| putative lipoprotein 68 30 Op 2 . - CDS 76296 - 76922 715 ## gi|313159394|gb|EFR58758.1| putative lipoprotein - Prom 76952 - 77011 8.2 - Term 77002 - 77030 1.0 69 31 Tu 1 . - CDS 77038 - 77181 128 ## gi|313159344|gb|EFR58708.1| hypothetical protein HMPREF9720_2781 - Prom 77217 - 77276 2.8 70 32 Tu 1 . - CDS 77286 - 77870 847 ## COG0632 Holliday junction resolvasome, DNA-binding subunit - Prom 77892 - 77951 5.3 + Prom 77889 - 77948 6.7 71 33 Tu 1 . 
+ CDS 77969 - 78553 884 ## gi|313159355|gb|EFR58719.1| outer membrane protein + Term 78580 - 78615 6.5 + Prom 78673 - 78732 6.0 72 34 Tu 1 . + CDS 78767 - 79567 1409 ## COG1624 Uncharacterized conserved protein 73 35 Op 1 . + CDS 79763 - 79936 61 ## gi|313159365|gb|EFR58729.1| conserved domain protein 74 35 Op 2 . + CDS 79933 - 80463 180 ## PROTEIN SUPPORTED gi|237671810|ref|ZP_04531781.1| acetyltransferase, ribosomal protein N-acetylase 75 36 Tu 1 . - CDS 80723 - 80986 145 ## BT_0505 hypothetical protein - Prom 81229 - 81288 2.7 - Term 81201 - 81231 4.3 76 37 Op 1 . - CDS 81312 - 81677 635 ## Odosp_2375 preprotein translocase, SecG subunit 77 37 Op 2 . - CDS 81689 - 82177 572 ## gi|313159393|gb|EFR58757.1| conserved hypothetical protein 78 37 Op 3 . - CDS 82190 - 82699 758 ## Odosp_2373 hypothetical protein 79 37 Op 4 . - CDS 82785 - 83993 1730 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 80 37 Op 5 . - CDS 83994 - 84704 685 ## gi|313159400|gb|EFR58764.1| hypothetical protein HMPREF9720_2793 - Prom 84891 - 84950 2.8 + Prom 84758 - 84817 2.4 81 38 Tu 1 . + CDS 84924 - 86318 1612 ## PROTEIN SUPPORTED gi|163755454|ref|ZP_02162574.1| 50S ribosomal protein L19 + Term 86370 - 86414 9.5 - Term 86339 - 86398 11.1 82 39 Tu 1 . - CDS 86448 - 88049 1935 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains - Prom 88287 - 88346 2.1 + Prom 88109 - 88168 4.0 83 40 Tu 1 . + CDS 88367 - 90307 2038 ## BVU_0479 hypothetical protein 84 41 Op 1 . - CDS 90299 - 90853 494 ## gi|260170825|ref|ZP_05757237.1| hypothetical protein BacD2_03087 85 41 Op 2 . - CDS 90798 - 91184 388 ## Patl_2044 hypothetical protein 86 41 Op 3 . - CDS 91202 - 92701 279 ## COG0863 DNA modification methylase 87 41 Op 4 . 
- CDS 92723 - 92953 183 ## Ava_3180 type II site-specific deoxyribonuclease - Prom 93037 - 93096 5.9 Predicted protein(s) >gi|313159314|gb|AENZ01000010.1| GENE 1 136 - 690 970 184 aa, chain - ## HITS:1 COG:CAC3597 KEGG:ns NR:ns ## COG: CAC3597 COG1592 # Protein_GI_number: 15896831 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 3 182 2 180 181 216 67.0 1e-56 MAKKWRCTVCGYIHEGPEAPEQCPMCKVGKEKFVEVVEKEGDLEFVTEHVIGDGKGASKE LWEGLQNHFMGECTEVGMYLAMSRQADREGYPEIAEAYKRYAWEEAEHASKFAELIGEVV WDTKTNLEKRMNAECGACEDKMRLARMAKEQNLDAVHDTVHEMAKDEARHGKGFEGLYKR YFGK >gi|313159314|gb|AENZ01000010.1| GENE 2 811 - 1734 1242 307 aa, chain - ## HITS:1 COG:no KEGG:BF1573 NR:ns ## KEGG: BF1573 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 8 307 12 311 311 259 43.0 1e-67 MYRIVVILLAALCAVSCGRRSSRQQTSAAPQMRVFLPAIAPSSLSDDAKRDYLRWHYWDR FDFADTLFIREVDTVQMVEAYVRWIALISDRPTDGVPMDSLMRRASASRPMLDYFTMLAE LVIHDPNSPLRNDEFYIPVLRAVLASPYYDEYERIGPSYDLNMAMQNRIGERANDFRYTL ASGATGTLYGVKAEYVLLFINNPGCPMCKQLREQIGGSPMLSVMIKRGRLKVVALYPDED LAEWREYRGHIPPSWINGYDAGCVVREKSLYDLHAIPTLYLLDRDKRVLVKDSTDVPYIE EVIDRRG >gi|313159314|gb|AENZ01000010.1| GENE 3 1878 - 3425 2425 515 aa, chain + ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 13 489 23 499 510 527 59.0 1e-149 MDVKKTTTSTGFKLSVMTLAIMNVTAVVSLRGLPAEAVYGLSSAFYYLFAAIVFLIPTAM VAAELAAMFSTKQGGVFRWVGEAYGARTGFLAIWLQWIESTIWYPTVLTFGAVSIAFIGM NDVHDAALASNKVFTLCMVLAIYWIATFIALKGLGWVGKISKWGGMIGTIIPAGLLILLG IIYISTGGHNHMDMSQGFFPDLSKFDNLVLASSIFLFYAGMEMMGIHVMDVKNPSKNYPK AIIIGSLVTVCIFVLGTFSLGFIIPAKDISLTQSLLVGFDNYFHYLHMSWAGPIIAIALM FGVLAGVLTWVAGPSKGIFAVGKAGYLPPFFQKTNKNGVQKNILLIQGCVVTLLALLFVV MPSVQSFYQILSQLTVLLYLIMYMLMFSAAIVLRYKMKNTDRPFRLGKGNGLMWFLGCMG FCGALLAFVLSFVPPSQISTGSNTVWFSVLIIGCVVVVGAPFVIYALRKPSWKDPQAAAE 
FEPFHWETQPVAAAPVHTNTPEPTAKPSETELHNK >gi|313159314|gb|AENZ01000010.1| GENE 4 3446 - 4429 1524 327 aa, chain + ## HITS:1 COG:ybaS KEGG:ns NR:ns ## COG: ybaS COG2066 # Protein_GI_number: 16128469 # Func_class: E Amino acid transport and metabolism # Function: Glutaminase # Organism: Escherichia coli K12 # 5 312 2 308 310 292 48.0 5e-79 MIQKIDNATVREAVQQAYERCKNETGGKNADYIPYLANVPSNLFGIAACLPDGEVIAVGD TDYKFGIESVSKVPTAILAMNQYSAQEMLDKIGADATGLPFNSIMAILLENDHPSTPLVN AGAISACSMVKPVGDSDGKWKSIVGFIEGLAGSQVEVIDELYKSETATNFNNKSIVWLLK NYNRIYDDPDMSLDIYTRQCSIGVTAKQLATMAATIANGGVNPVTKQPVFKPELAPKIAS LMATVGFYEHTGDWLFTTGLPAKTGVGGGIMGVVPGVMGVAAFAPPLDEAGNSVKAQKAL AYVAGHLNLNIFGTTRCVMAEKEPAKA >gi|313159314|gb|AENZ01000010.1| GENE 5 5057 - 5302 252 81 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159390|gb|EFR58754.1| ## NR: gi|313159390|gb|EFR58754.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 81 1 81 81 156 100.0 5e-37 MKQMKPKRTDEKLAQYVISRMKQLRRDHNYSQEYVIENTGLDIFHFESGSKFPTLISLTI LCRFYGISLREFFGESDYPVE >gi|313159314|gb|AENZ01000010.1| GENE 6 5348 - 5599 163 83 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159387|gb|EFR58751.1| ## NR: gi|313159387|gb|EFR58751.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 83 1 83 83 149 100.0 5e-35 MERKKQRRDEELLQKIILRVKELRHMHDHQSQEQLAEATELSIAQLESGKNFPNLTTISI ICKFYNITLGEFFAPLDYPTKDN >gi|313159314|gb|AENZ01000010.1| GENE 7 5715 - 7442 2515 575 aa, chain - ## HITS:1 COG:SMc00977 KEGG:ns NR:ns ## COG: SMc00977 COG1960 # Protein_GI_number: 15964628 # Func_class: I Lipid transport and metabolism # Function: Acyl-CoA dehydrogenases # Organism: Sinorhizobium meliloti # 45 511 33 512 593 188 31.0 3e-47 MANFFSDNKDLQFHLGHPMMRKIVELKERGFAEKDLYDFAPQDFEDAMDSYRRVLEIAGE ICGEVIAPNAEGVDHEGPRVVNDHVEYASGTVRNMKAIVDAGLSGMTLPRKYDGLNFPLV CFVMANEMVSRADAGFENIWGLQDCAETLNEFASEEIKAKFLPWVSAGATCAMDLTEPDA GSDLGAVMLKATWSEEKQTWLLNGVKRFITNGDGDVSLVLARTEEGTTDARGLSMLVYDK RDGGVKVRRIENKMGIKGSPTCELVFTNAPAQLVGDRKMGLIKYVMSLMNAARLGIGAQS VGLCEAAYREALKYAHEREQFGKPIIRFAAVSEMLSNMKAKLQGVRALLYETTRFVEVYK QYGHIAHERSLEGEERQEMKFYNRLADGFTPLVKLFSSEYANQQAYDAIQIHGGSGFMKD YPCERLYRDARIMNIYEGTSQLQVVAAINSVTKGTFMEQIERYAAAEYSQPMQPVVAKLK ELTAKFAEMIARVEAGDKELCGFKDFHARRLVETAGHIIITYLLARQAGESEEYVNSAKV FCKLAEGKISEAYTYVMNSTLEDVELFKAVIEETE >gi|313159314|gb|AENZ01000010.1| GENE 8 7442 - 8461 1654 339 aa, chain - ## HITS:1 COG:CAC2709 KEGG:ns NR:ns ## COG: CAC2709 COG2025 # Protein_GI_number: 15895966 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, alpha subunit # Organism: Clostridium acetobutylicum # 4 335 9 332 336 268 47.0 1e-71 MNNIFVYIENEGGKVADVCLELLTKGRELATTLGVKLEAVVLGEHLAGIETELAKYGADT VWVADDKIFAPFRTLPHTAVMCGLIEQEKPQIALFGATCNGRDFAPRVSSALYSGLTADC TQLVIGDHKDAKTGKEYKDLLYQIRPAFGGNIIATIVNPDNRPQMATVREGVMRREYAAV PGAGEVKKIDWQKFVKDTDLAVRILDREISESKIDIKGASVIVAGGYGMGSKENFDLVFE LADVLGAEVGASRAAVDAGFADHARQVGQTGVTVRPKLYIACGISGQIQHTAGMDGSAMI ISINTDPEAPINKIADYAITGDVNEIIPKMIKYYKQNSK >gi|313159314|gb|AENZ01000010.1| GENE 9 8479 - 9357 1448 292 aa, chain - ## HITS:1 COG:CAC2710 KEGG:ns NR:ns ## COG: CAC2710 COG2086 # Protein_GI_number: 15895967 # Func_class: C Energy production and conversion # Function: Electron transfer flavoprotein, beta subunit # Organism: 
Clostridium acetobutylicum # 5 257 1 225 259 173 43.0 3e-43 MKKALKIVVLAKQVPDTRNVGKDAMTPEGTVNRAALPAIFNPEDLNALEMALALKDRTEG STVHILTMGPQRAADIIRDAMFRGADGGYLLTGREFAGSDTLATSYALSCALKMIAPDII FAGRQAIDGDTAQVGPQVAEKLGLPQVTYAEEITEIKADSLVIKRRLNCGMETVETPVPV VVTVNASAPECRPRNAKRVMTCKFALAKSEIAAAPDSPAAKRAAAKEYLQIVEWAAADVN PDPQQLGLAGSPTKVKKIENVVFQAKEAKKLTSSDEDINSLMVELIASHTLG >gi|313159314|gb|AENZ01000010.1| GENE 10 9518 - 10783 1680 421 aa, chain - ## HITS:1 COG:CAC3593 KEGG:ns NR:ns ## COG: CAC3593 COG0104 # Protein_GI_number: 15896827 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate synthase # Organism: Clostridium acetobutylicum # 6 419 5 423 428 363 46.0 1e-100 MKKVDVILGLQWGDEGKGKVVDVLTPAYEVVARFQGGPNAGHTLEFNGEKYVLRSIPSGI FQGGKTNIIGNGVVIDAVLFREEAEALAASGHDLTKQLCISKKAHLILPTHRILDAAYEA AKGSAKIGTTGKGIGPTYTDKVSRNGMRVGDVLSADFRQIYARAKARHESILRGLGYEYD IAELERKWFEAVEYLKRFNIIDSEYFVNGCLAQDKSILAEGAQGTLLDVDFGSYPFVTSS NTVCAGACVGLGIAPNRIGEVYGIFKAYCTRVGSGPFPTELFDETGARMRSIGHEYGAVT GRERRCGWLDLVALKYSIMINGVTQLIMMKSDVMNDFETVKVATEYEVGGERTAHFPYEI GDDLKPVYREFEGWKCDLRDCKSYDDFPAAFKTYVEFIERETGVPVKIISVGPDRGETIV R >gi|313159314|gb|AENZ01000010.1| GENE 11 10793 - 11263 4 156 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MYFCVQKTFFASKGTNYYPKYPIRRGDFSTMYQKSGAKKHPATKKQSTVRNRRNKIRRLR SGTGGAGGRGRRSGRSGRTRMSGETKRGQAGGQAEQTERSEQEDGTNGADKGIGGNQTKL TDGETKQTGQDEWADRRRDEPNRRKDETAKTGEAAG >gi|313159314|gb|AENZ01000010.1| GENE 12 11546 - 12154 725 202 aa, chain - ## HITS:1 COG:lin0354_1 KEGG:ns NR:ns ## COG: lin0354_1 COG4886 # Protein_GI_number: 16799431 # Func_class: S Function unknown # Function: Leucine-rich repeat (LRR) protein # Organism: Listeria innocua # 34 194 71 241 292 63 29.0 3e-10 MTRFVPLLAFALLAACGLVDDERDDKNKRVYITFADPAFEAYCLEHFDIDHDGRISRYEA QRVLKMDCPDRGIAHMWEIGEFSRLERLDCSGNDLTQLDLRKCTLLQTLDCSRNRIASLD LDGLRALLELNCADNALTLLDLKSAGALRLLDCSGNRLVTLDLRPCSERLKADAGGNPPL TTVYCRASQSVSADGHTEIIVR >gi|313159314|gb|AENZ01000010.1| GENE 13 12160 - 12492 601 110 aa, chain - ## HITS:1 COG:no KEGG:no 
NR:gi|167752041|ref|ZP_02424168.1| ## NR: gi|167752041|ref|ZP_02424168.1| hypothetical protein ALIPUT_00283 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_00283 [Alistipes putredinis DSM 17216] # 1 106 52 157 158 110 71.0 3e-23 MAKQNTTKKHIVTSFHNLAPELQEAVKQKYPLGFTEAMMRVDKPNGDFFYAVPFDTDEIA YMVKIDVKIDDNAQEDDDKDYYDDEIKGADEIQDDGNSDGGDDSDDDVNI >gi|313159314|gb|AENZ01000010.1| GENE 14 12632 - 14716 2896 694 aa, chain + ## HITS:1 COG:all1717 KEGG:ns NR:ns ## COG: all1717 COG0272 # Protein_GI_number: 17229209 # Func_class: L Replication, recombination and repair # Function: NAD-dependent DNA ligase (contains BRCT domain type II) # Organism: Nostoc sp. PCC 7120 # 4 660 9 675 677 562 44.0 1e-160 MIRERIEELRRQLEYHNFRYYVENAPEISDFEFDAMMRELQDLERAHPEMADPNSPSVRV GSDITAEFRSVKHRFPMLSLGNTYSPDELHEFIERIEKETGPTEFVCELKFDGTAISLTY EHGRLLRAVTRGDGTQGDDVTANVRTVRTVPLRLRGDDWPDYFEIRGEILMPYASFDKIN AEREAAGETLFANPRNAAAGTLKQQASAVVARRGLDCTLYQLAGDDLPFTNHWESLQKAR EWGFKISDQMRICRSAEEIDAYIAHWDEARRELPFPTDGVVIKVNDFGVRRQLGFTAKAP KWAVAYKFKAEQALTRLDSVAFQVGRTGAITPVANLEPVLLAGTTVRRATLHNAEQMALL DIRPGDMVYVEKGGEIIPKITGVELSQRPADSRPFEYITVCPECGTPLVKYEGEAKHYCP NQGGCRPQIIGRIIHFIRRKAMDIEGLGEETVELLYENGLVRDVADLYDLQAPQLACLPR LGEKSADNIIRSIRGSVEVPFRRVLFGLGIRFVGETTAKYLAEHFRSLDAVMKATREELT EADEVGGRIADAIIEYFADEKNLAIIRRLRDAGVKFEAEARELASESLAGKSFVISGKFS NHSRDELKELIEIHGGRNLAAVSANVDYIVAGENMGPAKLKKAEKLGVKIISEEDFIAMI GSDEIPAANAAEKRTEGSAEGGAAHKDAAEPKLF >gi|313159314|gb|AENZ01000010.1| GENE 15 15099 - 15989 1346 296 aa, chain + ## HITS:1 COG:NMB0929 KEGG:ns NR:ns ## COG: NMB0929 COG0329 # Protein_GI_number: 15676823 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Neisseria meningitidis MC58 # 6 281 2 279 291 239 46.0 5e-63 MIREKLTGVGVAMITPFTSDGRVDYQALARMIDYVIEGGVDYIVALGTTAETPTLYMPER AVIAMFITNHIAGRVPLVMGCGGNSTSEVLDQLREFDLRGADAILSVTPYYNKPSQEGLY QHFRTVSEHSPLPVILYNIPGRSGVNMTAETTLRLARDMKNVIGIKEASGDIEQMQRILD 
NRPEGFLVLSGDDGMTLDLMRRGGEGVISVAANVFPKRFMQCVGHAKQGDFDRAEEEYRA LDEAVHALFEEGNPVGAKCALSMMGKIGPTMRLPLVEGSQALREKFSRLIAEYDLR >gi|313159314|gb|AENZ01000010.1| GENE 16 16029 - 16766 1343 245 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159364|gb|EFR58728.1| ## NR: gi|313159364|gb|EFR58728.1| hypothetical protein HMPREF9720_2725 [Alistipes sp. HGB5] # 1 245 1 245 245 417 100.0 1e-115 MKNLLLLLLLLPALAAAKVPDEEDIQNKTMDAESPFYYPSLMMRYNAGDETLTDEDYHYL YYGYAYQESYKPLDSNPDLDKLLLMASGLDPDKPAVETLEAMLYTGEDALARDPFSPKIL NLMAYAHGALGNKLQEKMYYNRMQGVIRAIRESGDALTQKTPRHILMFDHALDVMATEGL SYDKSRIISRTVEFIPLTVPYTVEGKKRKGLYYDFGRIYWNKPEGYTYKRDRTWQFNNLK PRTYK >gi|313159314|gb|AENZ01000010.1| GENE 17 16727 - 17464 810 245 aa, chain + ## HITS:1 COG:no KEGG:BT_0727 NR:ns ## KEGG: BT_0727 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 103 244 11 149 393 67 27.0 4e-10 MAVQQPQTPYVQIIRRALCLLLALAVFTGCEKEREPAEITSSQEEAVLRSTAGSAAAFTV TATGPWTLTTTGGGFGISPTAGGRGETTVTVTASDGNPSRSRVKLGTVALTLNAGGAQCS VTVSQSPATATQTMLLYMPGRDLLKFYKQNIDGVLKAVDANVPGDGRVLVCYQPNAHSQA EMYEAYFNAEKQAAAFALLKTYDDFAAADPACVQRMLADVEAFAPAQHYGIIVGCHGKAW VPANQ >gi|313159314|gb|AENZ01000010.1| GENE 18 17486 - 18217 930 243 aa, chain + ## HITS:1 COG:no KEGG:HMPREF9137_0437 NR:ns ## KEGG: HMPREF9137_0437 # Name: not_defined # Def: putative lipoprotein # Organism: P.denticola # Pathway: not_defined # 18 243 179 408 408 125 33.0 1e-27 MSKELEDLWTPAPGALTTRSFGDTGRSIDITDFAAAVKAQNYRTDYLLFDACFMANIETL YDLRECTDYVIAAPCEIMGQGFPYDRAMPWFFTDGGKGRDLTKVCEAFWNFYMNDATTQS GCISLAVMSEMEGMKEVMRRINAAPKKSYAEELQSYEGMSSHIFYDLGHWVELACGDAKL KEDFKAQLDKAFPKAARLSTPGFYSAYNGRMNPVAYYSGVSFSEPSDKYVEENKQTSWYR DTH >gi|313159314|gb|AENZ01000010.1| GENE 19 18255 - 19526 1820 423 aa, chain - ## HITS:1 COG:L107724 KEGG:ns NR:ns ## COG: L107724 COG1187 # Protein_GI_number: 15673257 # Func_class: J Translation, ribosomal structure and biogenesis # Function: 16S rRNA uridine-516 pseudouridylate synthase and related pseudouridylate synthases # 
Organism: Lactococcus lactis # 185 417 15 250 257 163 38.0 6e-40 MKDFKNDSTSVLTRFGRDKRQRTVTATAERVERRPRLQRDAADSEQPSDDRKPVKRASYN PHFTEDNRPAFDKPRRSFGDERQGDRARSDERYRDGDKPRRSFGDKPRGEKPAYGDNRGP KKFGDRKFGDKPYGERKSGDKPAGDRRFADKKSGDRRFGDKPYKPGFRKHDDKPASYPKF TPEKQIGEMRLNRFLAQSGLCSRREADDFITAGLVTVNGQIVTQLGTKVLPTDEVKFNDS RVQGEKKVYLVLNKPKGYVTSLDDPHAGKTVMELVQGACTERIYPVGRLDKNSLGLLLFT NDGDLTKQLTHPAFRKKKIYQVSLDKPLTRADMDRIAEGITLEDGEIFADEISYVKDNKQ EVGIEIHSGRNRIVRRIFEFLGYTVTKLDRVYYAGLTKKNLKRGAWRFLSREEVERLKSG QYE >gi|313159314|gb|AENZ01000010.1| GENE 20 19703 - 21028 2042 441 aa, chain + ## HITS:1 COG:VC0090 KEGG:ns NR:ns ## COG: VC0090 COG0534 # Protein_GI_number: 15640122 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Vibrio cholerae # 2 422 15 431 454 219 35.0 1e-56 MNMNREILRIALPNIVSNITVPLMGIVSTAIAGHWGSDTAATIGALAIGVSIFNFIYWNC SFVRMGTSGLTAQAFGAGNFRECTNMLVRALLVAGGMGLLMLLLQYPLGELALWGLNGND MTREYFYARIWAVPAGILLFGFNGWFTGMQNALIPMFTAITVNVVHVLCSLWFAFGLDMG IVGIAYASVIAQWTGVALSALLLFACYRRILTGIDWAEVVNLKPLRTFFIVNRDIMLRTF CIVAVYTFFTGASARMDEPALLAVNTLLLQLFTLFSYMNDGFAYAAEALTGRFIGARDEG ALRDCLRRCIAWGTAVSVLFVGIYIGWWRELVGLFVDSTAPNAATIVELAGEYIVWIILI PVASAMPFIMDGIMVGATETRVMRNSMFWATAAYFGIFYVGYAIIGNNALWLAFTLYMFL RGVLQYFMTHRLRSIYVKATV >gi|313159314|gb|AENZ01000010.1| GENE 21 21034 - 21567 621 177 aa, chain + ## HITS:1 COG:MA0995 KEGG:ns NR:ns ## COG: MA0995 COG3910 # Protein_GI_number: 20089872 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Methanosarcina acetivorans str.C2A # 9 158 10 167 251 149 46.0 2e-36 MELTPDSLYIKYVRLKEEGETPQTYPYTVPALANFRRLEFRRPVTFIMGENGMGKSTLLE AIAVKAGFNPEGGSKNFRFATRASHSDLYEHVVLGRGLQPRDGYFLRAESFYNVATEIEN VADGVLKYYGDKSLHQQSHGESFLALLGHRLFGHGLYISTSRKPHCRPRARCTCCAG >gi|313159314|gb|AENZ01000010.1| GENE 22 21564 - 21773 282 69 aa, chain + ## HITS:1 COG:mlr8463 KEGG:ns NR:ns ## COG: mlr8463 COG3910 # Protein_GI_number: 13476986 # Func_class: R General function prediction only # Function: Predicted ATPase # 
Organism: Mesorhizobium loti # 1 69 188 257 258 57 37.0 5e-09 MKQLAEQGSQFIISTHSPIVMSYPGAEIYEITDRGLEPTELEETSHFRLMKRFILDRRGI LRQMELEKE >gi|313159314|gb|AENZ01000010.1| GENE 23 21934 - 23280 2021 448 aa, chain - ## HITS:1 COG:no KEGG:Odosp_0710 NR:ns ## KEGG: Odosp_0710 # Name: not_defined # Def: tetratricopeptide TPR_1 repeat-containing protein # Organism: O.splanchnicus # Pathway: not_defined # 106 436 55 381 390 77 26.0 1e-12 MKKVFLTVLAVALVAVTAVQAQKVNKSALVSKIEKSDADIADAKKGAKAATWINRGKAFY EAAIEPTKSLFVNMDAAMLKLAVGEPASTESVTLVNVPYEAWVYPWFTAYIKDGKIATWS QTQWVIEDAPAKAIEAYNKAYEMAPKTADKVKEGLKQISDFCSQVGNTGIDTGNYADAAD AYALAFEAQSSPAHGNPEPALLYYAGYLRTVDGAANPASYVIGADYLNKALDLGYNDEEG NIYYYLFHCYYGQKDADKANVLKAKDALVAGIKKFPKNERILDGLVQLYTNPEDSVGDPA DLVALIDAAIESNPENVDLWFGRGRIFFALKQYDESIASFQKVVELKPELFEGNYYLGVF YTIKADEMNKVMNEKQYSSQAAYDADLKAANAVYMEAIPWFEKAHELKADDFNTLDMLKQ LCFRLRDEPGIQEKYDTYFPLWKAAKGE >gi|313159314|gb|AENZ01000010.1| GENE 24 23319 - 25874 3955 851 aa, chain - ## HITS:1 COG:BH0007 KEGG:ns NR:ns ## COG: BH0007 COG0188 # Protein_GI_number: 15612570 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A subunit # Organism: Bacillus halodurans # 14 819 8 804 833 815 51.0 0 MLTEEEKNAGLVGRIIPINIEEQMKSAYIDYSMSVIVSRALPDVRDGMKPVHRRILYDMS AELNLYSDKPTRKSARIVGDVLGKFHPHGDTSVYDAMVRLAQDWSMRYPLVDGQGNFGSM DGDSPAAMRYTEARMKKITDEVMADIDKETVDWTLNFDDTIPEPTVLPTKIPLLIVNGAS GIAVGMATNMAPHNLSEVVDACCAYIDNPEITGEEMLQYIKGPDFPTGGIIYGYEGVREA MLTGRGRVMMRAKTDIEHTPSGRECIVITEIPYMINKAEMIKKIADMINEKKIEGISYIN DESDRNGLRIIIILKHDAVASVVLNTLFKNTPLQTSFAVNNIALVNGRPQMLPMRDLVKH FVDHRHDVVVRRARFDLKKAEERLHIVQGLLIAQDNIDEIVHIIRSSQTPDAAKQTMIER FNLSDIQASAIIEMRLRALTGLEYGKLIAERDELTKQIAYLKEVLENVGMQMQIIKDELL EIKEKYGDERRSEIVYSSEEFNPEDFYADDDMVITISHMGYIKRTPLAEYRTQNRGGVGA KGSATRDEDFIEHIYVASMHNTMLFFTEKGRCFWLKVYEIPEGARSSKGRAIQNVIQIEP DDKVRAYINVKRLNDEEYVNNNFIIMCTKDGTIKKTKLEAYSRPRQNGVNAIVIREGDQL IEAKLTSGQAEVMIAARDGKAIRFNESTVRPIGRVGAGVRGISIEESDEVVGMICVEPDS 
KQDVLVLSENGYGKRTDLDEYRITNRGGKGVKTINVTEKTGKLISIQAVTDDNDLMIINR SGLTIRTAVSQIRLAGRATQGVRIINLREGDAIASVMAVPAAGDEDEEVQSAEVAATGND ATPEADRPAEE >gi|313159314|gb|AENZ01000010.1| GENE 25 26226 - 27005 1113 259 aa, chain + ## HITS:1 COG:BS_yrhG KEGG:ns NR:ns ## COG: BS_yrhG COG2116 # Protein_GI_number: 16079774 # Func_class: P Inorganic ion transport and metabolism # Function: Formate/nitrite family of transporters # Organism: Bacillus subtilis # 7 254 6 249 266 162 39.0 7e-40 MTRINTPKEVLALAGSSAADKLQNTAGKTLILAFLAGAYIAIGGLFSLMAGFGFPGAAEA PGFQRLLSGAVFPLGLILVVFTGAELFTGNNAVLMPGALARRYGWGKVARNWTLVWMGNF AGALFFTYFMVVLPGVLSSEMWREAACNIAQAKVSLPWSTVFLRGVGANWLVCLAVWLGM SANDVPGRMLGLFFPIMCFVVIGYEHCIANMFFIPLGMLLGAPVSAAELFAGNLVPATLG NIVGGGLFVGGLYWYLNRK >gi|313159314|gb|AENZ01000010.1| GENE 26 27147 - 27869 615 240 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159362|gb|EFR58726.1| ## NR: gi|313159362|gb|EFR58726.1| hypothetical protein HMPREF9720_2735 [Alistipes sp. HGB5] # 1 240 1 240 240 383 100.0 1e-105 MSAVSSVSAEAASGLRAVFPETTVSAVPAAAGDFFRSGVPAAAASPASSAAVAALEPSAD GRAAEILEKLAAGFRALGAYGVTFEVSSDEYTTRGRYAVEGENYYIAVGEAEVYCDGKIR YEIDNRRREVTIDDVDTSSRNLLSNPAHAFDFIGTQYAPSLVSDAEGRAVVRLTPTSADA SPAGEILVTVDTAAMRPESLRYDYDGEQVGIAVLGVAPLDTPLKAFSKGDYKGYEFIDFR >gi|313159314|gb|AENZ01000010.1| GENE 27 28127 - 31081 3698 984 aa, chain - ## HITS:1 COG:SA1119 KEGG:ns NR:ns ## COG: SA1119 COG1674 # Protein_GI_number: 15926859 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: DNA segregation ATPase FtsK/SpoIIIE and related proteins # Organism: Staphylococcus aureus N315 # 488 967 318 780 788 363 44.0 1e-100 MASKNTTNTTASKNGRQRIERSNSDSARWIAGLLLLFVGVFAASSVLFSFFSWAADQSGL QLSPEERLTLGVEPENLCGWAGAKLGRLLVDNSFGVFGILIPTMIILVGVRIIRQRPLLF NHSILSLFLIMILGSLTLGFAFGDKWSLCSSTGWGGAFGIETGALLHTHIGVFGTLILLV GCWILTGVFINRNFINKVNRAGNVMVDKSGRIVEIVKHKVVPGHLHAEEADGAVSAAATP AAPDAAGAAGTERAAEPETPRAVRVPEPETPRAARTPDPEIVRPAREPEAAATQIVRGEE DDPFVEISSDGTPVGSEPAREAESAVKTDEDGEFIEVDLSRPEGRLVLGPGGLVELERPS 
TPSAAPSAGTPVRNMPVSDGPFTELTVGGDAGASSPAALAGAAPESLSGSESLSGPASLA AVASASGDAPAHGAGSVPEYESAPADASVSAGSAGAEGVVVTVEANEARMVDERAITTES YDPLKDLVNYSKPPVTLLEDYQSDSEVSDEEIFDNKTRIEETLKNFGIPIQRIKATVGPT VTLYEIVQAQGVKISKIQSLENDIAQSLKALGIRIIAPIPGKGTIGIEVPNRDKQVVSMY SAVRSLRFQESKAELPVVIGRTIQNENYVFDLAKMPHLLVAGATGQGKSVGLNAIITSLL YRKHPAQLKFVMIDPKMVEFSLYAKIERHFLAKMESEDDAIVTDPRKAVYALNSLCTEMD NRLELCKKAGARNIAEYNEKFTSRRLNPHNGHRYLPYIVVVVDEFADLIMTAREVEVPVM RLAQKARAIGIHLIIATQRPDVKVITGGIKANFPARIAFRVMQMIDSRTIIDQPGANQLI GRGDMLFSKDGELTRIQCALVETKEVERVVDYISRQQGYTEAYPLPDYTPDADGGGSSLG SEESAPVKYDSLFAEIARDAVSGGNISTSMIQRNYEVGFNRAGRIMTQLERAGIVGRQQG AKPRDILFHDLPSLEAKLQDLGLF >gi|313159314|gb|AENZ01000010.1| GENE 28 31173 - 31946 950 257 aa, chain - ## HITS:1 COG:RSc1956 KEGG:ns NR:ns ## COG: RSc1956 COG0796 # Protein_GI_number: 17546675 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glutamate racemase # Organism: Ralstonia solanacearum # 4 227 7 228 277 176 45.0 3e-44 MNDAPIGVYDSGLGGLTVWREVRRMLPSESLVYLGDGKNCPYGSRPREEVRRLADEAVAS LVAQGCKMVVVACNTATAAAIDFLREKYAPMPIVGMEPAVKPACLNTRSGVVGVLATERS LDGELFRRTAAKYGSGVELITAPGRGFVELVESDRESTPEAEQAVRDAVAVMLEHGADQI VLGCTHYPFLLPVLERVVAGRGVEIVDPSPAVARRVVQLLDQYGLHAAPDHVPSYTFRTF AGESYRLRLEHKAAESL >gi|313159314|gb|AENZ01000010.1| GENE 29 32007 - 32531 731 174 aa, chain - ## HITS:1 COG:sll1669 KEGG:ns NR:ns ## COG: sll1669 COG0703 # Protein_GI_number: 16329403 # Func_class: E Amino acid transport and metabolism # Function: Shikimate kinase # Organism: Synechocystis # 4 150 15 155 189 99 42.0 3e-21 MKPLFLIGYMGCGKSTLGRRLARRLGAEFADTDALIERREGASVADVFRYEGEERFREVE REVLEQTLAGTAAVVSTGGGLPVWRDNMARMNAAGFTVYLRREAEQIARRLSPYGRQKRP RLRGLGDAELVEFMSRDMAAREPFYAQAQLIVDCGELSDDEVVETILRHTMQNE >gi|313159314|gb|AENZ01000010.1| GENE 30 32541 - 35687 4680 1048 aa, chain - ## HITS:1 COG:BB0035 KEGG:ns NR:ns ## COG: BB0035 COG0188 # Protein_GI_number: 15594381 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), A 
subunit # Organism: Borrelia burgdorferi # 61 694 8 626 626 423 36.0 1e-118 MAEEKDIIGREEPLNEDASTPETADEAADETPAAGPQGKAGKFDRLTQDEGGVRKLTGMY KNWFLDYASYVILERAVPHVEDGLKPVQRRILHAMKVVDDGRYNKVANIVGQTMQYHPHG DASIKDALVQLGQKDLLIDCQGNWGNILTGDEAAAGRYIEARLSKFANEVVFNKKTTEWM LTYDGRKEEPVTLPIKFPLLLAQGSDGIAVGLASKILPHNFVELINACIAHLQGREFQLY PDFPTGGMADVSRYNDGLRGGAVKVRAKISKIDKRTLAITEIPYTTTTESIKDSIIKAND KGKIKIKKVDDNTADRVEIVIQVSPDESSDKTIDALYAFTDCEVSIAPNACLIWEDKPHF LGVSEILRRSAEHTKWLLGRELEIRLGELNEAWHAASLERIFIENKLYQLIEGSKSREEA YAAVDKGLEPFKKLLRREVTLSDVQRLTELKFIRISRYDSDKADNEIRQIEEDIKSTQYD LNHLTEYAVAYYERIRDKYGKGRERRTELREFDNIEATKVAVTNAKLYVDRAEGFFGIGK SMKDAELVCDCSDIDDVIVFTKDGRYVITKVSDKAFFEKGIYYIGVFKRNDERTIYNVLY RDGKNGPIMMKRCAIKAITRDKEYDITKGTPKSEILYMSVNPNGEAEVLKIYFKPRPRLK KVIVDLDFSTLAIKGRQSQGNLFSRYGIHKIVLKERGTSTLGGQNVWFDEDVRRLNADGR GTLLGEFKGDDKIIVWTSKNQYYITGYDLGQHFPDETVRVSRYEADRIYSICYYDRSQQY YYMKRFTAEMSDKMQFFLDEEGQADLVAVTERTGAKLEITYKGAHASRPADEIDVDEFVG VKSHRAKGKRLTTYDVAALRFIEPELPPEPEPEPSDDDGDDVPDDTPSGGGAFGGAQGGS GNGADNGAVGGSNGDAADRGAIDAAALSGGVGSAVSSGANSATAGPAFAPASSAAETPSA PGVPAGSAPARNKSAADGVPAAKDNPAAKDDSAVKDNPAAKASAPKKTPASPAGLPRTGT TSGGVEFEIERAKGDADEVIDPEQLNLF >gi|313159314|gb|AENZ01000010.1| GENE 31 35694 - 37568 3158 624 aa, chain - ## HITS:1 COG:BB0036 KEGG:ns NR:ns ## COG: BB0036 COG0187 # Protein_GI_number: 15594382 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Borrelia burgdorferi # 12 610 5 598 599 570 50.0 1e-162 MADLLNTNPNDNYGDDAIVTLSPREHIRLRPGMYIGKLGDGAQADDGIYVLIKEVVDNSV DEFIMGVGRQIDISIADNVVSVRDYGRGIPLKSLAAAVSEMNTGGKYGGSAFKKTVGLNG VGVKAVNMLSSEFTARSVRDGEARTVTFAQGLEQSDTWESGVREKNGTFISFRVDEEVFG QYAYNLEYVEQMIRNYTYLNLGLTFNFNGSSYVSKNGLLDLLNENMTEEPLYPPIHLSGD DIEVAIAHGTGYGESYFSFVNGQYTSQGGTHQAAFREAIAKTVKEFYHKDYDPSDIRTSI IAAISVKVTDPVFESQTKIKLGSKEIEPGVSMRNFVVDFLGKHLDDYLHKHSETAQILQK KIVENEKERKAISGIQKKARETAKKVSLNNKKLRDCKIHRTDKHELAEQSMIFITEGNSA 
SGSITKSRDVRTQAVFSLRGKPLNCYGLTKKVVYENEEFNLLQAALNIEEDMDNLRYNKV IIATDADVDGMHIRLLMMTFFLQFFPDVIRQGHLFVLQTPLFRVRNKKETHYCYSEDERL KAVSRCGANAEITRFKGLGEISPDEFREFIGEGMRLDKVRITKDDPIHDLLEFYMGKNTY ERQGFIIDNLRIEEDIVEQDLAIS >gi|313159314|gb|AENZ01000010.1| GENE 32 37572 - 38300 1124 242 aa, chain - ## HITS:1 COG:YDL052c KEGG:ns NR:ns ## COG: YDL052c COG0204 # Protein_GI_number: 6320151 # Func_class: I Lipid transport and metabolism # Function: 1-acyl-sn-glycerol-3-phosphate acyltransferase # Organism: Saccharomyces cerevisiae # 59 238 64 245 303 94 33.0 2e-19 MISAVYYIFLVFLCTFFMVLSAVALVVCYPFDKGRRVVHELSRILVRIFFAVPPRWRQRV IGREYVDRKKSYVIVLNHNTVIDIPTLYYIPLNFRWVSKREVFKTPFFGQYLVLHGDICI NRGRASEALEQMVRDGKLWISRGASVAVFPEGTRSKDGEIHRFKAGAFTLAKEAGVEILP VVLDGTKTLIKKNALFNWGNRITIRVLPPVSAGRVAAAETHELMQEVHDAMCAALAEIRN KK >gi|313159314|gb|AENZ01000010.1| GENE 33 38497 - 39444 985 315 aa, chain + ## HITS:1 COG:all0345_1 KEGG:ns NR:ns ## COG: all0345_1 COG0789 # Protein_GI_number: 17227841 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Nostoc sp. 
PCC 7120 # 40 149 15 132 135 84 40.0 4e-16 MTVGPDLFFRRPPKKNNAKQLTRPQGQGVSLPHEHCAEFMKTTKIKFKIGEFSKLNRVTV KTLRHYEEIGLLVPSQVDKWTGYRYYDVCQLGRMNTIRYLKELGFTLEEIGELFDEGRTR PSPELIAAKTAECRETLLRLERRMTELERLGKELAKPKKMEKAFIKELPAVVVASHRRVV RGYDELFDLCPNVIGPEMARLGCECSEPGYCFTIDHTCEFRERDIDIEYCEAVTERKEES ELIEFKELEAVPTAVCMYHRGNYDTLPQTFAELYAYVEKEGYKLAGSPRFSYIDGIWNKD SEEEWLTEIQIPAGR >gi|313159314|gb|AENZ01000010.1| GENE 34 39623 - 41995 3314 790 aa, chain + ## HITS:1 COG:CC0815 KEGG:ns NR:ns ## COG: CC0815 COG1629 # Protein_GI_number: 16125068 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Caulobacter vibrioides # 104 772 47 723 737 90 22.0 1e-17 MKKLILLILLLAMAAANAIAARMYPTGGRVVDAQGQAVEYATVVLLRGGEQVAGMATDDA GRFELKVPTGEYTLSIQYLGFDPVLRQVRVDADNDLGDIVLKSSSTQIEGVVVKAQLIRR EADRFVVDVANAPAAIGKDGIELLERAPGVWVDDEKISINGKSGSKVYVNDRELRMEPAQ LLTYLRSLRADDIQKIEVVPTTGADYDADSSGGIIRITLRKRRENGMEGSLSVDSRLGRW VRSVNPRARVNYHSGRLDLYGLAWFNLETDEFESDENTRYTSGSNSLEAHSEKFERDRNF GASFGSVYEINSRQSVGAEFEYWRNREKGPNDSYTDFTAEEGVTRTDSRYDNFTARNNYS LTFNYIRKIDSLGSTLKVLADYNRRTTDAENDNFSRIAAPAPAPAADSAYRDNSVSVYNV TTATLAFDKKFSPRWSLRAGAKYTYNDMHNDALYEYLKDDAWARNDNQSFTINYTENIAA AYGIASANLGRWSLVAGVRGEYTRTEGKGHGIAQNYFSLFPNANVSYALTKDGAYSLIAQ YARTIERPRFWTLNPQRFQISDYTYQTGNPKLDPAYKHDASLTLVLKHKYTLTGGIVVQT GEIQQTMRPDADDPKRLCIAWVNYDTTKSYYVSANAPCQFAKWWTMNLNATYIRQGQRID QHTPEKHYNLYFANASTTFTLPAKFYIDLSYRFQSRMDFGNCWVEPMHFLNAGIKKRFGD KFTAACSVRNLIDRPQHVGARGEGFVRRVDVSQQWNCREFRISLSYNFKSGKAFKRRAVE AGSADEKSRL >gi|313159314|gb|AENZ01000010.1| GENE 35 42113 - 42982 677 289 aa, chain + ## HITS:1 COG:no KEGG:BF3797 NR:ns ## KEGG: BF3797 # Name: not_defined # Def: putative integral membrane protein # Organism: B.fragilis # Pathway: not_defined # 4 289 6 240 240 213 44.0 1e-53 MQHFSYLAEIFVVGGIVSALIIAADLHSCRQPMRIMNSVWILTGLWASVIGLWAYFTFGR PRHCDLEEHNGGRKTGTKPETGAETRAEAETGARAAEAGADGMRKMSSTAGMENMGETAG MPGMAMPGMRNMAEAEMPGRPKWESAVLSTLHCGAGCTLADLVGEWFMYFVPIAVGGSLL 
AGAWIADYILALVFGIGFQYAAIRGMERTLGRGTALRRAAKADILSLTAWQAGMYGWMAV AIFGLNGGMPLPRTSFVFWFMMQIAMACGFLAALPVNVLLIKAGIKKGM >gi|313159314|gb|AENZ01000010.1| GENE 36 43022 - 43933 1248 303 aa, chain - ## HITS:1 COG:FN1744 KEGG:ns NR:ns ## COG: FN1744 COG0697 # Protein_GI_number: 19705065 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Fusobacterium nucleatum # 15 284 12 285 293 77 24.0 3e-14 MANAKSRGCVLGAVAAASYGLNPLFTLPLYEAGMGVDSVLFYRYLLAAAMLGALMLVRRQ SFAVRRRDLVPLAVMGLLFSFSSLFLFESYNHMDAGIASTILFLYPVLVAVIMAVGFHEK VSRITMLSILLAFTGIAMLYKGGGEPLSFLGVALVFLSSLCYAVYIVGVNRSSLRGLPTE KLTFYALLFGLSVYVVRLRFCADLQAIPTPGLWINAVSLALFPTIVSLVTMAAAIRAIGS TPTAILGALEPLTALFFGVVVFGERLTPRIVLGVVLILVAVTLIIAGRSLHIPFPHLRFR RAR >gi|313159314|gb|AENZ01000010.1| GENE 37 43939 - 44514 1016 191 aa, chain - ## HITS:1 COG:PM0785 KEGG:ns NR:ns ## COG: PM0785 COG1704 # Protein_GI_number: 15602650 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Pasteurella multocida # 1 191 1 192 193 207 59.0 1e-53 MKKWIWIGVAALVVIFFYATYNGFVNKEEGVKTAWANVETQYQRRADLIPNLVNTVKGYA AHETQTFTEVTEARAKATSINLSADDLTPEKLAEFQQAQAQVRSALGRLIAVSESYPDLK ANQNFLELQAQLEGTENRISVARKQFNEVARTYNVSVRRFPANLVAKMFGFEQKPYFESA EGSEAAPQVQF >gi|313159314|gb|AENZ01000010.1| GENE 38 44634 - 45554 1232 306 aa, chain - ## HITS:1 COG:PA1450 KEGG:ns NR:ns ## COG: PA1450 COG1512 # Protein_GI_number: 15596647 # Func_class: R General function prediction only # Function: Beta-propeller domains of methanol dehydrogenase type # Organism: Pseudomonas aeruginosa # 62 209 66 216 419 89 37.0 8e-18 MNMRCFIILCLLLFASLAAQARSYRAEDIPNVQRADRTRYVSNPDGILSPGAVARIDSVC GALRDRAIAQVAVVAVDDIAGGDVFDFAVDLFTQWGVGRAENDNGLGILLVRDRREIRFV TGGGLEGVLPDAVCKRIQLKYMLPAFREGDYSRGMVAGVEAAAQLLEGSELDLGGTDTAD GELPSWAVFLIVVGFVVLPLGVVMLGYYARKRCPKCHKPTLRQQSNRILQVTPSYRLVEY 
TYVCSNCGAVVKRRAKNLRDDNFGGGGGTIIGGFGGFGGFGGGSRGGFGGGFGGGSFGGG GAGSRW >gi|313159314|gb|AENZ01000010.1| GENE 39 45673 - 46956 2181 427 aa, chain - ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 38 379 34 365 422 211 35.0 2e-54 MEMLHPEWSYQAVLYEMNVRQLTAEGTLRAAAAKLGFLRDMGIDAVWLMPVYPIGVEGRK GSLGSYYSVRDYCAVDPCLGTMEDFDAFVAEAHRLGMKVLLDWIANHTARDARWVTEKPA SWYERDASGAPAVPWDWTDTAKLNYAEREVWRAQGDAMEFWVREHGVDGFRCDMAMLVPI EFWNETAQRLRRVKPDLFMLAEAEETYLFDGGAFDACYAWEMHHLMNDVAQQRVRVTSLR DYIYADRRRYPRSAMRLAFTSNHDENSWNGSEFARMGAAREIMAVFTFVVPRGLPLIYTG QEIGYDHSFAFFDRDPLPAYGSNPFSEFYRRLTALRHANPALASGERGGEMIEIRNNAED CLMIAVREAEGNCVVAVMNLSPYAIHADYYTGIYAGMYTDAMTGRPGELRGHVEEDMAPW SYRILTR >gi|313159314|gb|AENZ01000010.1| GENE 40 47127 - 47924 1241 265 aa, chain + ## HITS:1 COG:AGc2283 KEGG:ns NR:ns ## COG: AGc2283 COG1596 # Protein_GI_number: 15888569 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protein involved in polysaccharide export # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 96 197 64 164 190 62 32.0 7e-10 MKHLRLLMPVILTALLVTSCASQKRAWYLQDAQPFTPEQIAENGQIRIKPLDRLTIVVNS KDPELAVPFNSSSSLSSLTGATNYGSATNQSLQIRTVDENGQLEMPVIGKIECKGKTRSE LAQAIADKIVEGGYINDPTVNVQFADMKISVIGEVARPGHYDVTRDKLSIFDALAMAGDL TIYGIRTDVAVAREVDGVRTIEYLDLTSKELFNSPAFYIQQNDVIYVKPNKYKAQTGEIS QNRNFYLSLVGTAISVATLIITLTK >gi|313159314|gb|AENZ01000010.1| GENE 41 47933 - 50392 3558 819 aa, chain + ## HITS:1 COG:all4432_2 KEGG:ns NR:ns ## COG: all4432_2 COG0489 # Protein_GI_number: 17231924 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Nostoc sp. 
PCC 7120 # 567 783 30 244 263 126 34.0 1e-28 MTEQNINPHLLNEESENTNGITLHDLIQMVLANWYWFALSAIICLGTAYYYLASTPKIYS RTATILVKDSRKGGDTDLGAFSDLVGFQNRRNVDNEVYILQSHRLMSEVVKRLHLTVNYS VREGLRTADLYGRSPIEVDFINDNSKQKLSLEVTPLRNGKIELTDFEDKFVTRQETKRVI LAEYGDTVATPVGQVVVHKTLFMDSTYLKKPITVIKNSLAATTNAYRAAVKSDVANKQAS IVTISMNNSVPRRAEDVINTLIAVYEEDAIADKRQLSVVTADFIKERMQVIGRELGDVDR DIEDIKKSNKMIDITSEATRTITESTRYKAESLTIENQISVADFIREYLNDPSHAGDLIP MVASMTNNGIVAQISEYNEAILRREKLLENSSERSPVIQDLDNGLAAVRRSIIASLNSHI STLEIQLETMRKEEAQVNRRISTMPSQEKVMLDIMRQQKIKEELFLYLLNKQEETQLNYA VAESNSRTIDMAYGSARPVSPRSTIILGISLLAGLAIPFGILYLIGMLDTTIRGRKDVEE NLSAPFLGDIPFLEGDNKGGVVVRETGRDALSEAFRILRSNMTFMNVSSGKEIKCVLFTS SDPHAGKTFVAMNLAMTLAMAGKRVVLIDLDLRRHALTTHLGRSNSKNGMSGYLAGTITD IDQLVTNMGFHENLDVICAGIQPPNPTEMLLSNRLDKLVEQLKERYDFVFIDSTPAMSVA DAVITDRLADLCIYIVREGVLDRRQLPDIERLYREKKFHNMCIVLNGTRTRRHGYGYGYG YGYGYGYGYGYGYGDDHQEVSYRKRIKLFFSKLARKVKK >gi|313159314|gb|AENZ01000010.1| GENE 42 50394 - 50783 543 129 aa, chain + ## HITS:1 COG:no KEGG:PRU_0341 NR:ns ## KEGG: PRU_0341 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 115 1 120 156 128 49.0 8e-29 MAIHNALIAVDFDGTVVTHAYPEIGDDAGAVAVLKELTDNGCRLILYTMRSGALLDKAVK WFRDREIPLYAINENPSQQRWTSSPKIHADLYIDDSNLGCPIRFVDGVKRPVADWTKIRE QLVREGFLD >gi|313159314|gb|AENZ01000010.1| GENE 43 51158 - 52069 1503 303 aa, chain - ## HITS:1 COG:no KEGG:BF3922 NR:ns ## KEGG: BF3922 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 302 1 300 300 362 57.0 1e-98 MVKPTLLVLAAGMGSRYGSLKQMDGVGPNNEAIIDYSVYDAIRAGFGKVVFVIRHSFEKE FREVFSAERFGGRIAVEFVFQELDYLPEGASVPEGRVKPWGTNHAVMMAADAVHEPFAVI NADDFYGAEAYQTIGDYLSQLGDSKNRYCMVAYDLNRTLSDNGTVSRGVCGVDADGNLTS MVERTQIERMPDGRILFHDGGADEELAEDTPVSMNLFGFTPDFFAYSKEYFKTWLAENRE NLKSEFYIPTMVNKLIGEGAASLRVLRSAAQWHGVTYKEDKPALMASIEKMIAEGKYPRN LWA >gi|313159314|gb|AENZ01000010.1| GENE 44 52189 - 53061 1273 290 aa, chain + ## HITS:1 COG:RSc1206 KEGG:ns NR:ns ## COG: RSc1206 COG0739 # Protein_GI_number: 
17545925 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Ralstonia solanacearum # 159 287 139 265 268 65 34.0 1e-10 MKFLKNIRAWQRSFFRKRRFNMLNATDNSEEWHMHLSPASIFAAFVSFALLLFILILTLV AYSPVLEFLPGYRTEADRSRESLVQNIIRLDSMERMMNDMLTYNQNIALIMEGKTPVARV LANSDSTRATKVLVMPSPEDSVLRAQMEGDGEYAVTGRSGGSRRSVREAIELATPVEGII TDRFDIRNGNFGIRIAAAASDRIAAVDNGTVVLSLWTPETGYMVELQHAGNLLSVYKGLS QSLVTKGQTIRAGELIGYNAEAEQGEVRLFEFELWNNGKPVDPEGYIVFQ >gi|313159314|gb|AENZ01000010.1| GENE 45 53068 - 54219 1517 383 aa, chain + ## HITS:1 COG:lin1354 KEGG:ns NR:ns ## COG: lin1354 COG0743 # Protein_GI_number: 16800422 # Func_class: I Lipid transport and metabolism # Function: 1-deoxy-D-xylulose 5-phosphate reductoisomerase # Organism: Listeria innocua # 3 374 2 373 380 327 46.0 3e-89 MKQRLAILGSTGSIGVQTLDIVRENPDFFEVRVLTANSNWQRLAAQAREFDADTAVIADK RYYTRLRDALEDTDVKVYAGEEAVAQVAAGGDTDVAVNALVGYAGLAPTVAALGAGKKLA LANKESLVVGGEYVMRLAAEKRAPILPIDSEHSAIFQCLAGEQSPVRRLIITCSGGALRD LKREELAGVTAEQALRHPQWEMGAKITIDSSTLVNKGFEVIEAHWLFGTPAEKISVVLHP QSIVHSMVEFEDGAIKAQLGTPDMRMPISLALMYPRRANRPGERFDFLKHPQLTFAEVDR VKYPALDIAYDCLRRRGTAACTMNGANEVAVAAFLGGRCAWLDIVRAIEHALNRASFAAA PTLADYAAADGEARRLAAEFIAL >gi|313159314|gb|AENZ01000010.1| GENE 46 54233 - 55549 2215 438 aa, chain + ## HITS:1 COG:aq_1964 KEGG:ns NR:ns ## COG: aq_1964 COG0750 # Protein_GI_number: 15606963 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted membrane-associated Zn-dependent proteases 1 # Organism: Aquifex aeolicus # 7 436 3 428 429 137 27.0 4e-32 MDIIIKIIQFFLCFTILVGIHELGHFLMARVFKIRVDKFYIFFDPWFSLFKFKRGDTEYG LGWLPLGGYCKIAGMIDESMDKEQMKLPPKPDEFRTKPAWQRFLVMIAGVVMNVLLAIFI YIGICYTWGDNYFSNEDARWGYTFNEAGRKLGFQDGDRIVSIDGEAVDNVNKIVNALIIT EGERRVVVEREGRQVELTLPLGELIEMRQNKGYEDLLALRMPFLIDSAVYETASALRRGD EIVAINDAQGLEYPAYREYLKAHAGEDVTLTVKREGDMLLELVVPVSDEGRLGVTALNPY KLRTQKYTFWQAIPAGISKAGKVMSSYWEQLKMIVQPKTKMYEELGGFIAIGSIFPGDWN WEDFWMKTAFLSIILAVMNILPIPGLDGGHAIFTFWEMITGRKVSDKILEGAQYVGLFII 
LLLLLYANGNDIYRFFIK >gi|313159314|gb|AENZ01000010.1| GENE 47 55578 - 56174 584 198 aa, chain + ## HITS:1 COG:no KEGG:Amuc_1287 NR:ns ## KEGG: Amuc_1287 # Name: not_defined # Def: protein of unknown function DUF1121 # Organism: A.muciniphila # Pathway: not_defined # 1 198 14 210 210 211 51.0 2e-53 MRRHHFEVETVRDTDEAFAVIKAAVETERPETVSFGDSMSMRATGVIEWLRGDGRYRLLD GFDPAMTYPQRLEIRRQALMSDLFITGVNAVTAEGTLHWLDKVGNRIAPVAFGPRKVIIV AGRNKIVANRDEAEERIRTIAAPQNIARHPGFRTPCAKTGVCSDCNSPDRVCNTRMEMLR CWPDKRVLVVLIDQDLGL >gi|313159314|gb|AENZ01000010.1| GENE 48 56188 - 56856 850 222 aa, chain + ## HITS:1 COG:no KEGG:Avi_5186 NR:ns ## KEGG: Avi_5186 # Name: not_defined # Def: outer membrane autotransporter # Organism: A.vitis # Pathway: not_defined # 35 126 609 700 1056 75 35.0 2e-12 MGKLKILTALLLAAALTACGDDSDVFYTTSYPVARIEISVSLSEPEKPDPENPDPENPDP ENPDPENPDPENPDPENPDPENPDPENPDAGTSQTEKPENPEEPENPLLEEIRNDALAKA PVQAGGSYRLDFTHHNGGPLVVRPAADAETVTGTFIKEPDKPEELHFTFGEQAYTCKVSG YTDTDDLRKTLFSVDLTEEYKQLYPDAGITQVIRKEYTSHPY >gi|313159314|gb|AENZ01000010.1| GENE 49 56867 - 58234 2086 455 aa, chain + ## HITS:1 COG:BS_sms KEGG:ns NR:ns ## COG: BS_sms COG1066 # Protein_GI_number: 16077155 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATP-dependent serine protease # Organism: Bacillus subtilis # 1 450 1 453 458 439 48.0 1e-123 MAKVKKAYFCKNCGFEAPKWLGRCPSCGEWNTFTEEIVARESGSVPANVAGSLPAAKPQR VRDIRESEHRRMDLGNSEVNRVLGGGMVPGSLILLGGEPGIGKSTLSLQLALAANGLKTL YVSGEESAEQIKMRAGRIGIGNDECLIYPETLLENIVNQIGEHRPDLVVIDSIQTIYTDL LDSSAGSVSQIRECAATLLKYAKSTGTSIFIIGHITKDGTIAGPKILEHIVDVVLQFEGD SNNIYRILRGIKNRFGATFEIGVFEMLDNGLRGVDNPSEILLTHYEEPLSGIAVGASADG VRPYLIEVQALVSGAAYGTPQRTTTGYDTKRMNMLLAVLEKRVGMKMFQKDVFLNFAGGF KVADPGLDLAVVAAVISSYYDRPVAEGVCCAGEIGLSGEVRPAPRTEQRISEAARLGFKR IIVSGYLGKGGKRPKGIEIVTINSIDQLPRALFVE >gi|313159314|gb|AENZ01000010.1| GENE 50 58270 - 59238 1089 322 aa, chain + ## HITS:1 COG:SMb20458 KEGG:ns NR:ns ## COG: SMb20458 COG0451 # Protein_GI_number: 16264188 # Func_class: M 
Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Sinorhizobium meliloti # 1 309 27 338 348 233 42.0 4e-61 MQKRIVILGGVGFIGTHLCLRLLEEGHEVFCVDIRDAVNSPLLRSVLQHPLFRYVHHNII NTFGIRCDEIYNLTSPSRVRYNKALPVETLRINMQGAINALDTARMEHARILFASSGCVY GAGYRDPATDPAAGYSGQYILSEGKRAAEALHRAYQEELGVDARIARIFNTYGSGADIMD QRVVMKMIVAALQNRDIPVNGSGEQMRTFCWVEDMVDGLVRMMEATPSEQTRTLDLGSNH EISIRALAEKIISLTGSRSRIVHRAARPDDARRRIPDISAARRELDWTPRTPLAEGLRRT ISYAEKELTDKVYTELTWAEMN >gi|313159314|gb|AENZ01000010.1| GENE 51 59250 - 59582 449 110 aa, chain + ## HITS:1 COG:YLR043c KEGG:ns NR:ns ## COG: YLR043c COG0526 # Protein_GI_number: 6323072 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Saccharomyces cerevisiae # 2 80 7 82 103 79 40.0 1e-15 MTATDFDKIIWRDALTFVDFFATWCGPCQMMHPVIDRFQEKMNGRVDVYKVDIDDRDMLE IVHRYNIMSVPTLMFFRRGEVLWRESGRIGYDHLANVLRELEQREQVGQR >gi|313159314|gb|AENZ01000010.1| GENE 52 59593 - 60288 681 231 aa, chain - ## HITS:1 COG:no KEGG:BT_1845 NR:ns ## KEGG: BT_1845 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 167 1 167 221 74 26.0 4e-12 MDFTSRMVALLGELRRERNGAVADAMRCYGAPYGLNYGVSLPTLRKLARAETPDHDFSRY LYLQEVRELRLAALHIARPESLTPDEFPAWAAGIVNSEVAEEAAFAFLSRSAALPALFDA WIADPNPLLRYAALQSAARSDLLTAAWIAPAVEAVRRAAVCAAESLSKPAAAPLSASSAA RLIAQGAVALLSAVGGLNEENRQAVLRAAGSLGKLPAEDYVHEELTWRLEA >gi|313159314|gb|AENZ01000010.1| GENE 53 60297 - 61190 1271 297 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159388|gb|EFR58752.1| ## NR: gi|313159388|gb|EFR58752.1| hypothetical protein HMPREF9720_2764 [Alistipes sp. 
HGB5] # 1 297 1 297 297 533 100.0 1e-150 MKLRLLALALAVCGGAAAQNPYIALQGANETADGIVVSQPRTVLAVDVTVEKDVTLSGPY ARYAQKFLGVRAPLTDKTGWAVTGAAVALLDAKACLDAPAPAEPSRRIMTYAAAEDEFAR LQADKNDMTVLALEDAAREAANTIFSLRKHRLELITGEAGENVFGEGLKSALAEIERLEQ NYLELFLGKRIVTTETRRYVVYPQADKKQYIVCRFSPAAGLLPESDLSGDMVLLQIEPSG NTSNALEASAKETNVVACRVADPSVCTVIAGGREYARAVLPLFEFGRTINVALPRRK >gi|313159314|gb|AENZ01000010.1| GENE 54 61243 - 61959 741 238 aa, chain - ## HITS:1 COG:YPO2709 KEGG:ns NR:ns ## COG: YPO2709 COG4123 # Protein_GI_number: 16122913 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Yersinia pestis # 2 238 20 252 252 162 38.0 6e-40 MFRFKRFTIRQDRCPMKVGTDGVLLGAWAGVRPSDRRILDVGTGTGLIALMMAQRAPEAR ITGVDVEEVSQARENAAASPWGGRVVFEQCPVQEFAPDEHFDLIVSNPPFFVDSLTCPDE GRTTARHAVRLPFDQLRDAVVRLLAAEGRFAVILPTDEARRFVDICRGFLAQTRSTGVRT TPRRPVKRLLMEFVHENACSGLRSDSPPVCSELVVGTGEHEQYTPEYRALTRDFYLKF >gi|313159314|gb|AENZ01000010.1| GENE 55 61962 - 62735 1003 257 aa, chain - ## HITS:1 COG:no KEGG:Palpr_0457 NR:ns ## KEGG: Palpr_0457 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 2 250 22 270 274 263 48.0 5e-69 MDDELKELLERLHDKYNRPEFIEDDPISVPHRYTDRADREIAGFLAATIAWGNRRAIVTS GHRMMRCMDDAPADFVRNASDRELATLETYVHRTFNGRDLRDFVLALRRMDRRFGGLGAF FEERYAAAQSIPAVLSEFRREFFSCDHALRCEKHLSSIDRGAACKRLCMFLRWMVRRDER GVDFGQWTRIPMSALYIPLDLHTGDMARALGLLSRRQNDWRAVEEVTAALRGFDPADPVR YDFSLFGAGIDGYLKEC >gi|313159314|gb|AENZ01000010.1| GENE 56 62747 - 63397 334 216 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 216 1 218 245 133 39 4e-30 MIKVTDIHKRFGDLEVLKGVSLDVAEGEVVSIVGASGAGKTTLLQIVGTLSCPDAGRVEI DGQDAFSLGDKALSRFRNERIGFVFQFHHLLPEFTAFENVCIPGFIGRRPRAEVERRAAE LLEMMNLTPRRGHKPGQLSGGEQQRVAIARALVNSPAVLLADEPSGNLDSHNRDEIHRLF FELRDRLGQTVVIVTHDEHLAAMADRKITMSDGRIL >gi|313159314|gb|AENZ01000010.1| GENE 57 63411 - 64250 1193 279 aa, chain - ## HITS:1 COG:VC0638 KEGG:ns NR:ns ## COG: VC0638 
COG0294 # Protein_GI_number: 15640658 # Func_class: H Coenzyme transport and metabolism # Function: Dihydropteroate synthase and related enzymes # Organism: Vibrio cholerae # 5 277 8 277 278 215 45.0 1e-55 MKQHDIDLDLSRPQVMAILNVTPDSFFAGSRMPDATHVERRVKEAAAEGASIIDVGGYSS RPGADEVPADEEWRRVELGIGAVRRLAPGVLISVDTFRSEVAARAIEKFGPLIINDISAG ELDPQMPATAARYGVPYVAMHMKGDPRTMQTLTDYKRDITAEVTAYFETKTADLLAAGIK RENIILDPGFGFAKTTEQNYELLAGLHCLCALGYPVLAGLSRKSMIYRVLDATPAESLAG TVALGWECLRQGAAILRVHDVQEAVDTVRLFDAYERNRK >gi|313159314|gb|AENZ01000010.1| GENE 58 64231 - 64824 679 197 aa, chain - ## HITS:1 COG:BS_ytaG KEGG:ns NR:ns ## COG: BS_ytaG COG0237 # Protein_GI_number: 16079958 # Func_class: H Coenzyme transport and metabolism # Function: Dephospho-CoA kinase # Organism: Bacillus subtilis # 4 194 5 197 197 108 36.0 5e-24 MMKIGVTGGIGSGKSTVCRLFAQKGIAVYDSDAAAKRLMQQDDTLRMRLTERFGADTFRD GQLDRSRLAGVVFSDPQALADLNALVHPVVMADFDAWAARQEGPYVILESAILFEAGLET CVDKTVAVLAPRELRIERTCRRDDCGPDEVGRRIAAQLDDDTLSGRADYAIVNIFEEDLE PAVVKLDRIFSHEAARH >gi|313159314|gb|AENZ01000010.1| GENE 59 64821 - 65237 686 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291513720|emb|CBK62930.1| ## NR: gi|291513720|emb|CBK62930.1| hypothetical protein AL1_02330 [Alistipes shahii WAL 8301] # 7 134 2 129 132 224 83.0 2e-57 MELQRTLDNLLKWMHRYVSPVFLALLVASFILWYIAKLSYTYTTEQVVRVSVDGEPFEVT CVVEGVGTNLFGYRVYMNKTLRIPLSELKYKRSREEGHEGKVVIDPQSLQSAISVRYSDI KVVSIGAVPELDYPEEKP >gi|313159314|gb|AENZ01000010.1| GENE 60 65311 - 68337 4893 1008 aa, chain - ## HITS:1 COG:AGc2877_1 KEGG:ns NR:ns ## COG: AGc2877_1 COG0342 # Protein_GI_number: 15888881 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecD # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 392 664 275 543 562 242 46.0 2e-63 MQSKGAIKLLAVLLALACIYQLSFTFKTRSVEKKAAEYAAQFPVGQQSEAEQHYLDSVQN VGVYNLGFKKFTYKECKEKELNLGLDLKGGMNVMLEVQVEDVIKALAGDSQNDPAFVEAI AEANAALKEGTSKDYISDFVKAYQRLSNGGSLAAVFVSPDRKDITLESSDADVEKILKRE 
TDAAIAASFNVLRSRIDHFGVTQPNIMRLPNSHRILVELPGVKEPQRVRDLLQGTASLEF WLTYDAREVLPALAAADKFIKAELSEAPAAGQPEVASVEAAAAAGTPAGEAEGLLAEIGS DTTAVAEAEHTDRYDRAKNPLFAVLDPGFAGGAAIGAAYKADMAAVNEYLANPAVRELFP ADIAFKWGVKGDDKIDGRFYLYAIRISTPDGKAPLDGSVVTEAREQYAQRGANAVVSMSM NGEGTQEWARLTGENIGKCIAIVLDGYVYSAPRVNTKIDKGSSEISGDFTIQEAKDLANV LNSGKVPAPAKIIQDTVVGPSLGQESINAGMLSFLIAFILVLLYMGLFYKTAGWMSDIAL LTNVFLLMGVLVSFGAVLTLPGIAGIVLTMGMAVDANVIIYERIKEELRGGKGLSLAIKD GFSKAYSAIIDGNLTTIITGIVLFIFGNGPVQGFATTLIIGIITSFFSAVFITRLLIEWI VAKWGNISFSRKWSENFLTNTHFDFIRVRKVAYTVAVALLALSCISFVARGLNLGAEFTG GRAYVIRFDKPVSAEEVRQNLGQVFAQHADADASAISFEVKQYGNENQMRIVTQYRYDDT SDEATAEVEQIIYDAMKSLYSYDISFEQFRNTQTDVNGILTADKIGPSIAKDMTWNAIYS VLFSLIAIGLYITFRFKRWQWASGATIALAVNALFIIGLFSMLYGFVPFNLEVNQAFIAA ILTIIGYAINDTVVVFDRIREYLGMYPKRNLKENVNNAINSTLSRTINTSGTTLVTLLAI FFFGGETIRGFIFALTVGVIVGTVATIFIATPIAYDLMVKRAKLDKEA >gi|313159314|gb|AENZ01000010.1| GENE 61 68485 - 69402 1344 305 aa, chain + ## HITS:1 COG:BS_ybfH KEGG:ns NR:ns ## COG: BS_ybfH COG0697 # Protein_GI_number: 16077290 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus subtilis # 6 295 6 295 306 171 38.0 2e-42 MIKYTEYKYHAAALFTVAVWGATFVSTKVLIAHSLTPAEIFLLRFALAYVCIWPLSKGRL RADGWRDEALLAAAGVTGGSLYFLTENMALEYAPASNVSLIVCTAPVWTALLLSLVYRGE RMTRRQIGGSALAFAGMVLVVLNGRFVLHLSPKGDLLALSAALLWMVYSLAVKRIGGRYP AVFITRKVFFYGLLTILPVFAFQPFSVGAEVLARPAVWGNLLFLGVVASMLCYILWNAAM HRLGAVRTTNYIYFNPLVTIVTAALCIGERITAAALAGAALILYGMWRAERRPASGNKIP RPDVP >gi|313159314|gb|AENZ01000010.1| GENE 62 69421 - 69843 367 140 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159389|gb|EFR58753.1| ## NR: gi|313159389|gb|EFR58753.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 140 1 140 140 247 100.0 2e-64 MLKFLLVCAAFAALPIQAQTLSGRIVGADARPVPYANIGLKNRPAGTVSGRDGSFTLRIP DLSVRDTLRISCIGYAAREIPAAGHDARPLEIRLQEQPQTPLSPVITHIPRKRYTFGRSI GSKTVSARRSAPGRKAARWA >gi|313159314|gb|AENZ01000010.1| GENE 63 69834 - 70241 487 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159396|gb|EFR58760.1| ## NR: gi|313159396|gb|EFR58760.1| hypothetical protein HMPREF9720_2774 [Alistipes sp. HGB5] # 1 135 1 135 135 273 100.0 4e-72 MGVKCKLGELRGELQRVRLNLAGCKGVDTLHMRINVYAMRGDRIRKNLLAEPVYFSVPAS EADKTTTVDLRPYEIFVRGDFLISVENLTHMPAGAKYWIRARLLARTYTRDTSQAPWKMR KAGVGLSVDVLVEPE >gi|313159314|gb|AENZ01000010.1| GENE 64 70375 - 71679 1659 434 aa, chain + ## HITS:1 COG:TM0306 KEGG:ns NR:ns ## COG: TM0306 COG3669 # Protein_GI_number: 15643075 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Thermotoga maritima # 23 385 8 397 449 128 30.0 2e-29 MTRLFTSFCAFVIATGAFAQAGYRPAPENLLSRSQFADCRFGIFLHWGLYAMLAQGEWAM TNHNLNYREYEKLAGGFYPSKFDADAWAEAFRQAGARYVCFTTRHHDGFSMFRTAQTPYN IVDATPFARDAVKELADACHRRGLRVHFYYSLIDWWREDAPRGRTGLGTGRPAEKEDADA YFDFMKAQLTELLTQYGNVGAIWFDGIWDQDENPGFDWRLGELYGLIHKLQPGCLIINNH HLAPFEGEDAQAFERDLPGENTAGYSGQEISRLPLETCQTMNGMWGYKITDQNYKSAQTL VRYLAGAAGRGANLLLNIGPQPDGALPAAALERLDSMGRWLRANGETIYGTQAGPVSPRS WGVTTRRGDKIYVHILDWPDEELFVPVRDERIRRAVRFADKRAVAFSQDKAGVTLRLGEK PAGADCIVELTVGQ >gi|313159314|gb|AENZ01000010.1| GENE 65 71767 - 73686 2833 639 aa, chain - ## HITS:1 COG:BB0518 KEGG:ns NR:ns ## COG: BB0518 COG0443 # Protein_GI_number: 15594863 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone # Organism: Borrelia burgdorferi # 1 618 1 619 635 700 64.0 0 MAKIIGIDLGTTNSCVAVMEGSEPVVIPNSEGHRTTPSVVAFTADGERKVGDPAKRQAIT NPKRTVFSIKRFMGERYDQVASDIARAPYEIVKGDNNTPRVDIDGRQYTPQEISAIILQK MKKTAEDYLGQEVSEAVITVPAYFSDSQRQATKEAGEIAGLKVRRIINEPTAAALAYGMD KKDSDMKIAVYDLGGGTFDISILELGDGVFEVKSTNGDTHLGGDDFDHVLIDYMAEAFKA EHQIDLRQDPMALQRLKEAAEKAKIELSSSTTTEINLPYIMPVNGIPQHLVMSLTRAKFE 
QLCDHLIRKTIEPCKLALRDAGLDASQINEVILVGGSTRIPAIQKIVEEFFGRTPNKSVN PDEVVAIGAAIQGGVLTGEVKDVLLLDVTPLSLGIETLGGVMTKLIDANTTIPTRKSEVF STAADNQPSVEINVCQGERPLARDNKSIGRFHLDGIPAAPRGVPQIEVTFDIDANGILNV SAKDKGTGKEQKIRIEASSGLTEQEIQRMRDEAKANEAKDKEEKERIDKINAADSNIFAT EKQLKEYGDKLPADKKSAIEGALAKLKEAHKNADVAAIDTAIAELNAAWQAASQDIYAQQ QAQGAQPGADAGQQSQANAGNASNGDSSQPEDVEFEEVK >gi|313159314|gb|AENZ01000010.1| GENE 66 74275 - 74517 329 80 aa, chain - ## HITS:1 COG:no KEGG:BT_0902 NR:ns ## KEGG: BT_0902 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 78 4 80 96 106 71.0 3e-22 MIITFNELRRIKDRLPSGSSQRIADELGIDVDTVRNYFGGKHEAKECVGVHFEPGPDGGV VTLDDTTILDVAMRILGEQK >gi|313159314|gb|AENZ01000010.1| GENE 67 74690 - 76279 2176 529 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159391|gb|EFR58755.1| ## NR: gi|313159391|gb|EFR58755.1| putative lipoprotein [Alistipes sp. HGB5] # 1 529 1 529 529 868 100.0 0 MKFITRFLCVPVAGLLALTSCAKDEDESYVQFENQALEAWMTQHRPDLVKNYQPDGGYYI DVLDAGEPDKDPLNKEPIWVSFDFTGRDLAGNIILTRSASDAKLTGAFSKYTHYVPFYRY CGTENTGLMEGTWLAMRNTLKLDQEYFDKYKNDPERRLTSTELQLRIGSKIQLYMPSTVV GNGVEGTGGYEGQGSPSKYTLSANRPFIVTMEIRDTVSNPLEREGKNVDEFSDGNGGRLV YGKEADEKTQTVVRPTDPEDPNHAYVTDKRWVSACDTIPQLYVNVRYDPTQAADRLDYPA PYHSGYEPYVTEESVKTIDEKIAEALKKRFYPDDDDKYEGVVKLDADSVTLEGTAKIWYI GRFLDGFIFDTNIDEVKKIIYGEVKSAGSALSYKPSERKMIEAFYYTIPNLKYGQWAELI TTSTNAYGSSGKTGSTSSSTSGTSGYSSSYYDYLNYLNYANSYYGSGGYYGGYYNNYYGG YGGYYGGYYGDMYGSDYNSSSTGTTVTTISTEIPSFTPLIFQLYIEPKE >gi|313159314|gb|AENZ01000010.1| GENE 68 76296 - 76922 715 208 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159394|gb|EFR58758.1| ## NR: gi|313159394|gb|EFR58758.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 208 1 208 208 394 100.0 1e-108 MSRVFISIFAALVLLAGCSDEEDILPTQKTKIVSYLTGTHSPKLVAYEELEEGSDEPYFT TSGNAVYRYIAGINNPDRVNWTEVTRTSKVTVTFSAYVFTFANIVTPATSSTNLTMPYYS NDPVLIAAMEDPENGPGLTPGAWSSEPLEIDMRGSGIIKGLYDALLGCREGDYVESYMTY NMAYGDINFSTIPKESPVAYFFTVNSVE >gi|313159314|gb|AENZ01000010.1| GENE 69 77038 - 77181 128 47 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159344|gb|EFR58708.1| ## NR: gi|313159344|gb|EFR58708.1| hypothetical protein HMPREF9720_2781 [Alistipes sp. HGB5] # 1 47 1 47 47 79 100.0 8e-14 MEEQIPLGFIGILLGVFAFAFLIGVGVIYLKQRRIQRLKNPRKDYYR >gi|313159314|gb|AENZ01000010.1| GENE 70 77286 - 77870 847 194 aa, chain - ## HITS:1 COG:NMB0265 KEGG:ns NR:ns ## COG: NMB0265 COG0632 # Protein_GI_number: 15676189 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, DNA-binding subunit # Organism: Neisseria meningitidis MC58 # 1 194 1 192 194 113 35.0 2e-25 MYEYISGPVAEVAPAYAVIDAGGVGYYLHISLETFSAIEHTETARLYVHYVVREDAQLLY GFATKTERELFRHLISVSGVGGNTARMILSTYSPRELQGIITSGNAVLLKNVKGLGLKTA QKIIVELSGKLTGLGDADAALPAAGNGEKLEEALAALVMLGFAKTAAEKALRGILRENPG ASVEDLVRMGLKSL >gi|313159314|gb|AENZ01000010.1| GENE 71 77969 - 78553 884 194 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159355|gb|EFR58719.1| ## NR: gi|313159355|gb|EFR58719.1| outer membrane protein [Alistipes sp. 
HGB5] # 1 194 1 194 194 248 100.0 1e-64 MKKMLFTALVAACAMTACGTKTTETAAAAEEATREIGSSDIAYVQVEAVLAQCDLFLNEG KALQEKTQKAQKSWAQKEQNLQSEAAQLQEKYQKGLITTNDAQAQQQSIEKRVAAYQSSA QKEAQALDEENFVFTNRAQDLLHRAVQDINAGKKYKLIINSTALIDADTTLNITPAVLAK VNELYAADQKAEKK >gi|313159314|gb|AENZ01000010.1| GENE 72 78767 - 79567 1409 266 aa, chain + ## HITS:1 COG:BS_ybbP KEGG:ns NR:ns ## COG: BS_ybbP COG1624 # Protein_GI_number: 16077243 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 260 1 261 273 180 38.0 2e-45 MGFVPFTFVDFIDIILVAVIMYWIYRMTKGTNAPYILSGIIAIYLLWVVVRTLNMELLST ILGQLISVGAIALIIVFQPELRRFLQMIGMRQKHFNFITRIFSTGDDAVQTNVVPIVTAC REMSETKTGALIVIGQQSDLRLIAEGGIALDAKVSTPLIRNIFFKNAPLHDGAAIIEGDR IVAAKCILPVTQSDVPKSYGTRHRAAIGMSEISDAIIIVVSEETGGISIAQSGELRRDID PVRLQQTLQRYLTINTRKRSKKEVAE >gi|313159314|gb|AENZ01000010.1| GENE 73 79763 - 79936 61 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159365|gb|EFR58729.1| ## NR: gi|313159365|gb|EFR58729.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 57 1 57 57 95 100.0 1e-18 MSAAGSGKRPADNCSRKQPIRFRKTHGGGTGCDAKPKSERHSEYGIYIHQFKTEFAK >gi|313159314|gb|AENZ01000010.1| GENE 74 79933 - 80463 180 176 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|237671810|ref|ZP_04531781.1| acetyltransferase, ribosomal protein N-acetylase [Brachybacterium faecium DSM 4810] # 1 146 3 146 179 73 36 3e-12 MIFPAPTPRLVLRTWREEDLGTFTAMNADPRVMEYFPAPLAPEQSAEFMERIREEFSTEG FGLYAVERREDGELLGYTGLHRVTFTGMKDKIEIGWRLRAEFWDQGYATEAAQACVALAA KLGIEELVAFTTVTNLRSQRVMQKLGMELLCEFDHPALPEGHPLRRHVLYLIGTEK >gi|313159314|gb|AENZ01000010.1| GENE 75 80723 - 80986 145 87 aa, chain - ## HITS:1 COG:no KEGG:BT_0505 NR:ns ## KEGG: BT_0505 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 84 50 130 189 125 74.0 4e-28 MMRLKKPLCFDWIDSTIRPLDAEHKIQRFTPRNPKSPYSQANKERLKWLMDNGMIHPKFE EKIQKVLSVPFVFPDDIIDRLKEDRTV >gi|313159314|gb|AENZ01000010.1| GENE 76 81312 - 81677 635 121 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2375 NR:ns ## KEGG: Odosp_2375 # Name: not_defined # Def: preprotein translocase, SecG subunit # Organism: O.splanchnicus # Pathway: Protein export [PATH:osp03060]; Bacterial secretion system [PATH:osp03070] # 2 117 1 120 120 71 41.0 1e-11 MLYTICIALILIASVLIILAVLVQNPKSGMAANFGASNQVMGVRETSDFLEKFTWTMAIA IVVLSLVATLAMDKGLVAESNSEISKDVKALQERVIESETPATMPQAEIPAAEVPAEQSA E >gi|313159314|gb|AENZ01000010.1| GENE 77 81689 - 82177 572 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159393|gb|EFR58757.1| ## NR: gi|313159393|gb|EFR58757.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 162 1 162 162 284 100.0 2e-75 MTDFRQYLASPATAERPSAGELDALAARYEWFAPVRIAREIVTGICDPRLAVTAPWRAQS SLLRRPIDAEAVLRLSSDDIIDRFLQEDDLRIVAADGEPEEEVRTAAELDDDDEVVSEEL AEIYLAQGLCDKAIAIYRKLSLLNPEKSVYFAELIGKLENNN >gi|313159314|gb|AENZ01000010.1| GENE 78 82190 - 82699 758 169 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2373 NR:ns ## KEGG: Odosp_2373 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 2 169 3 173 173 125 41.0 8e-28 MKIKFAITALCAALLTGCGVAIKYSLSGASIPPDAKTFSVAYFPNNATMVAPILSSTLTD ALVDMFTRRTRLMQVEEGGDFAFEGEITNYMSATSSVSTDDYAVLNRLSITVKVRFTNAL DEKMSFNRTFTAFEDYESTRLLSEVEGELIPQIVDKLVTDIFQASASNW >gi|313159314|gb|AENZ01000010.1| GENE 79 82785 - 83993 1730 402 aa, chain - ## HITS:1 COG:AGl2141 KEGG:ns NR:ns ## COG: AGl2141 COG2204 # Protein_GI_number: 15891186 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 402 145 506 513 231 37.0 2e-60 MTDITTVKQRFGIIGASELLDRALEVAVRVAPTDLTVLVTGESGVGKEFFPQVIHAYSAR KHNKYIAVNCAAIPEGTIDSELFGHEKGSFTGALEARKGYFEEADGGTIFLDEVAELPHP TQARLLRVLQTGEFIRVGSSKVQKTDVRVVAATNMNLQQAIASGRFREDLYYRLSTVPIT VPALRERPQDIPLLFRKFAADVAVQYRMPAVTLDPAAREMLMHYYWRGNIRQLKNVAEQI SAIEQTRLITPDVLAKYLPPQGGGAPVMMGGAPVDDAMSTERELLYKVLFDMRADINDLK RMMSELLHNPCQEPVRDVKALLSTTAQSSAPVYAAGSVPVFAESEEVTDEPPREMTKADM QREQIIRALRRNNGRRREAAAELFMSERTLYRKIKELGIEEN >gi|313159314|gb|AENZ01000010.1| GENE 80 83994 - 84704 685 236 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159400|gb|EFR58764.1| ## NR: gi|313159400|gb|EFR58764.1| hypothetical protein HMPREF9720_2793 [Alistipes sp. 
HGB5] # 2 236 1 235 235 403 99.0 1e-111 MLIRTISQYLESHKRLVVPQLGTFIVKEPGVSIVFSELLKRDDGTLRRLLIDGGLSELEA AGEIDRFVFEVRHAVEHGAEFRLDGFGVMRPGPNGTIAFAFESRRAESASGASGLPGAAA ASDGQPKPARRPYDEPRISTSAKMAPDPSVRGLRYGKPPKNTDAYTYVDRPPRRRKADRF IWIALIAAALAVAAIAFGYLREAKDRETEMQYLEQPADTHTAASGDAAQPEQPITE >gi|313159314|gb|AENZ01000010.1| GENE 81 84924 - 86318 1612 464 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755454|ref|ZP_02162574.1| 50S ribosomal protein L19 [Kordia algicida OT-1] # 1 464 1 464 490 625 63 1e-178 MKDKYIDLIEQTFDFPQDEFSVEDNELNFHDIPLMELIKQYGTPLKITYLPKISQQINRA KRMFNVAMAKVDYKGSYNYCYCTKSSHFSFVLEEAMKNDIHLETSSAYDIHIINALYDSG IIDKDRYIICNGFKRPQYVENIAQLVNDGFVNTIPVLDNKEELELFEDSFTKKCKVGIRI ACEEEPKFEFYTSRLGIRYNDIIDFYKAKLKNSKKFQLKMLHFFINTGIKDTAYYWNELS KCINVYCELKAICPELDSLNIGGGFPIKNSLNFEYDYEYLTEEIIAQIKNICQRNDTEEP NIFTEFGSFTVGESGASLYSIVNQKQQNDRENWYMIDSSFITTLPDTWGINQRYIMLAVN NWDKEYQRVLLGGLTCDSEDFYNAESHTNAIFLPKLEPGNTQYIGFFHTGAYQESLGGFG GIQHCLIPAPKHIIIDRDKSDNEYYTRLFAKEQSYRSMLRILGY >gi|313159314|gb|AENZ01000010.1| GENE 82 86448 - 88049 1935 533 aa, chain - ## HITS:1 COG:CC2587 KEGG:ns NR:ns ## COG: CC2587 COG0488 # Protein_GI_number: 16126825 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Caulobacter vibrioides # 3 523 4 521 535 248 35.0 2e-65 MSIILSGVSYRYRNQQFLFEGIDLSVAAGAKAAVVGDNGAGKSTLLKLVAGEFAPAAGSI ACSSRPYYVPQHLSAAGISAVGVLGVADKLDALRAICGGSADPRCYDILADDWDVEARCR AALDHWGLPHVGLTDPVDALSGGERTKLCLAGISVHNPSVILLDEPTNHLDTSGRRRLYD LIRASRATIVVVSHDVALLNLLGTTCELSAKGLKVYGGNYDFYRERKRIEEEALAQQVDS EQTALRLARKKAQEVRERQERRMRQGERHKDQLPRILRRTMKDSGERTKAQLVGKHAELI DGSRERLDELRRRQRTQCELKIDFDDAQLHDGKLLVRADGLNFEYAPGRPLWREPLALEI RSGERIRLTGDNGTGKTTLVRLLTGELEPTAGRIGRADFSYVCLDQQYSRVNTPQSVLEL AQSCNHGNLQDHELKLRLHRALFGQQMWDKRCDTLSGGERMRLCLCSLMIANHVPDLFVL DEPTNNLDLSSLAILTRTVRNYRGTLLVVSHDDNFTREIGITRTVGLESPPAE >gi|313159314|gb|AENZ01000010.1| GENE 83 88367 - 90307 2038 646 aa, chain + ## HITS:1 COG:no KEGG:BVU_0479 NR:ns ## KEGG: 
BVU_0479 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 269 645 64 433 436 482 62.0 1e-134 MECKIRKGLTGLLAGSMLFLSGFARAGEAADESRKDLTTAGVSETPAAPANANRGGDPTT TESPTADLYQVNSTINEPIGSPTGASGPGACETDAPQSPETAKSDAKPTVESAPGTANGS PETTTPRPAKPSAQSAARSAEYAAGYDAGYRDGHKRGYRTGYETGFQAGFSAGSGAVGKH PDGASGTGPGSGSGTAESGGAGNWGGSAPNSGNSDRSSATGDGNNGSSTGFVPNAGTRIA SATGGGNGFGTGSETGANTRTDSTAAFPRRRFVHGIGLELRPEYVCPTNPFLAGDNLARQ PIDLSLAAHLRYSFRFQPGSRPDRIYGGAYQGIGVSYYSFENREELGSPVAFYIFQGARI ARISRRLSFNYEWNFGLSFGWKPYDAQMNPANIMMGSKINAYINADFYLNVTLTRELDFS AGLSMTHFSNGNTKFPNAGLNSIGMKFGLLYCFGRADDPLVKPRRPLLAPDFPRHFSYDL VLFGSWRRKGVEVGDKQYASPNAYGVAGFNFATMYNFGYKFRAGLSLDGVYDGSANVYTE DYIVEVGGRDPGYTFYTPSIDRQIALGISGRAEFVMPYFTVGIGLGVNVLHKGGDLKSFY QMLTLKIAATRRTFVHIGYCLKDFQTPNYLMLGVGFRFNNKYPVLR >gi|313159314|gb|AENZ01000010.1| GENE 84 90299 - 90853 494 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|260170825|ref|ZP_05757237.1| ## NR: gi|260170825|ref|ZP_05757237.1| hypothetical protein BacD2_03087 [Bacteroides sp. D2] predicted protein [Bacteroides sp. D2] predicted protein [Bacteroides sp. D2] # 10 125 152 266 321 70 31.0 4e-11 MKGWPPPPPQLLITTNLDNDDALAADAVELLQREIRFSSGKRIYSLLYGYQYFTDRRFAL KMRYTNNHFLTLVEPFDAHSETIVSYRHTKAIRQQPTVYLATDKGKWLEIVHEDNVSNDF RINIKVWYIPLLWGRRFADFGLPDLHVPGLRQWFCTLCVVPARFLVTAVRRLRRKWALRR AALP >gi|313159314|gb|AENZ01000010.1| GENE 85 90798 - 91184 388 128 aa, chain - ## HITS:1 COG:no KEGG:Patl_2044 NR:ns ## KEGG: Patl_2044 # Name: not_defined # Def: hypothetical protein # Organism: P.atlantica # Pathway: not_defined # 4 87 2 87 278 87 45.0 1e-16 MKSFEHLIITRFNLNLYARDKHDAPTRTERWLEHRFEIFERYCLPSVAAQTNPNFRWLCL FDAATPAAYRRRIGGYQSVCPQFRAVFYSAGQAGRLTESLRTTISGLLAGREGLATPPSA AADHYQSR >gi|313159314|gb|AENZ01000010.1| GENE 86 91202 - 92701 279 499 aa, chain - ## HITS:1 COG:all3632 KEGG:ns NR:ns ## COG: all3632 COG0863 # Protein_GI_number: 17231124 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Nostoc sp. 
PCC 7120 # 34 499 36 481 482 273 37.0 5e-73 MQDVFHNLEIVDAKGWYVTEFSAGNTNSQKARIALEERYWPITEITERFNRQSVSYQLSK RDCLHKWLKYKEGFSAELVRTLLKDFHLQKGDIVADPFMGSGTTALVSMFNGYNSLGFDI LPMSKIAIHAKTAIYTYNTLELKELLKEIINLNVPEEYDGRTPEIRITKDGYPRETSREL AYYKEHFRNSRYSENTKMLLELCTLNALERISYSAKDGQYLRWDWRCPKIIAAAEAREKT GRKPFVVKLDKGQLPSLKEVLTEELSGVIKDIEYLNNNPSGFDTQCKFIEGSALFELPKI ADGTVSAVISSPPYCNRYDYTRTYAMELAYLGMSEAGVRKLRQDLLSCTVENKSKIEQLQ DYYKQIGQQERYERTMNIVDENAALQEINNALKNRNENGEINNKGVLKMVEGYFTELTFL FSELYRVCKTGAYVAFVNDNVRYAGEVIPVDFLTTNLAEQIGFTPVKIYTLKQQKGNSSQ QMKKYGRVALRKSITIWKK >gi|313159314|gb|AENZ01000010.1| GENE 87 92723 - 92953 183 76 aa, chain - ## HITS:1 COG:no KEGG:Ava_3180 NR:ns ## KEGG: Ava_3180 # Name: not_defined # Def: type II site-specific deoxyribonuclease # Organism: A.variabilis # Pathway: not_defined # 2 75 230 305 315 100 64.0 3e-20 MFGELKGGIDPAGADEHWQTGNSALVRIRKAFEDYQVKTSFIAAAIEKKMATEIYNQLSE GILSNAANLTVDKQLT
Prediction of potential genes in microbial genomes
Time: Wed Jun 22 11:29:16 2011
Seq name: gi|313159277|gb|AENZ01000011.1| Alistipes sp. HGB5 contig00016, whole genome shotgun sequence
Length of sequence - 44764 bp
Number of predicted genes - 34, with homology - 34
Number of transcription units - 18, operones - 11 average op.length - 2.5
N Tu/Op Conserved S Start End Score
pairs(N/Pv)
1 1 Tu 1 . + CDS 1 - 1956 2756 ## COG0642 Signal transduction histidine kinase
- Term 2775 - 2816 11.4
2 2 Op 1 . - CDS 2837 - 4813 2515 ## Bache_0757 hypothetical protein
- Prom 4855 - 4914 3.4
3 2 Op 2 . - CDS 4963 - 6828 2318 ## COG1520 FOG: WD40-like repeat
- Prom 6870 - 6929 4.3
+ Prom 6798 - 6857 5.2
4 3 Op 1 . + CDS 6912 - 8087 1759 ## COG0027 Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase)
5 3 Op 2 . + CDS 8102 - 8542 763 ## COG0652 Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family
+ Term 8592 - 8623 4.1
- Term 8707 - 8750 1.4
6 4 Tu 1 . - CDS 8764 - 9585 754 ## CLJ_B1028 GNAT family acetyltransferase
- Term 9724 - 9766 9.0
7 5 Op 1 . - CDS 9804 - 10991 2103 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase
- Prom 11018 - 11077 2.8
8 5 Op 2 . - CDS 11177 - 12841 2730 ## COG2985 Predicted permease
- Prom 12885 - 12944 4.5
9 6 Op 1 23/0.000 - CDS 12969 - 13667 1011 ## COG1346 Putative effector of murein hydrolase
10 6 Op 2 . - CDS 13660 - 14052 508 ## COG1380 Putative effector of murein hydrolase LrgA
- Prom 14150 - 14209 1.9
- Term 14440 - 14487 -0.7
11 7 Tu 1 . - CDS 14528 - 17176 3530 ## COG0370 Fe2+ transport system protein B
12 8 Op 1 9/0.000 - CDS 17255 - 18619 468 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12
13 8 Op 2 27/0.000 - CDS 18616 - 21750 4856 ## COG0841 Cation/multidrug efflux pump
14 8 Op 3 . - CDS 21755 - 22867 1454 ## COG0845 Membrane-fusion protein
- Prom 22990 - 23049 7.1
+ Prom 22943 - 23002 3.1
15 9 Tu 1 . + CDS 23042 - 23215 350 ## gi|291513925|emb|CBK63135.1| Histone H1-like protein Hc1
+ Term 23233 - 23280 16.1
- Term 23211 - 23272 9.4
16 10 Op 1 . - CDS 23292 - 25004 2690 ## COG2194 Predicted membrane-associated, metal-dependent hydrolase
17 10 Op 2 . - CDS 25019 - 26221 1775 ## Odosp_2868 hypothetical protein
18 10 Op 3 . - CDS 26226 - 26744 956 ## Odosp_2869 hypothetical protein
- Prom 26781 - 26840 3.0
- Term 26801 - 26841 -1.0
19 11 Tu 1 . - CDS 26848 - 27261 699 ## Odosp_2937 DoxX family protein
20 12 Tu 1 . + CDS 27559 - 28035 639 ## COG0394 Protein-tyrosine-phosphatase
+ Term 28051 - 28091 7.5
- Term 28039 - 28079 11.3
21 13 Op 1 . - CDS 28126 - 29004 1470 ## Odosp_1765 hypothetical protein
22 13 Op 2 . - CDS 29219 - 29875 990 ## COG4845 Chloramphenicol O-acetyltransferase
- Term 29885 - 29923 7.2
23 14 Tu 1 . - CDS 29990 - 30736 250 ## PROTEIN SUPPORTED gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17
- Prom 30858 - 30917 5.2
+ Prom 30716 - 30775 4.1
24 15 Op 1 . + CDS 30907 - 32187 2118 ## COG0112 Glycine/serine hydroxymethyltransferase
25 15 Op 2 . + CDS 32189 - 32869 843 ## COG5587 Uncharacterized conserved protein
26 15 Op 3 . + CDS 32885 - 33832 1404 ## BDI_2196 hypothetical protein
+ Term 33856 - 33892 8.0
+ Prom 33877 - 33936 6.7
27 16 Op 1 . + CDS 33967 - 36093 1929 ## COG2213 Phosphotransferase system, mannitol-specific IIBC component
28 16 Op 2 25/0.000 + CDS 36140 - 36403 322 ## COG1925 Phosphotransferase system, HPr-related proteins
29 16 Op 3 . + CDS 36403 - 38043 1321 ## COG1080 Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria)
+ Term 38049 - 38073 -0.3
30 16 Op 4 . + CDS 38117 - 38965 1338 ## COG0657 Esterase/lipase
+ Term 38988 - 39019 3.1
- Term 38966 - 39013 7.2
31 17 Op 1 6/0.000 - CDS 39090 - 40070 1318 ## COG3712 Fe2+-dicitrate sensor, membrane component
32 17 Op 2 . - CDS 40110 - 40736 912 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
- Term 40797 - 40843 15.3
33 18 Op 1 . - CDS 40867 - 43860 3633 ## Phep_3775 hypothetical protein
34 18 Op 2 .
- CDS 43876 - 44763 1414 ## Phep_3774 hypothetical protein Predicted protein(s) >gi|313159277|gb|AENZ01000011.1| GENE 1 1 - 1956 2756 651 aa, chain + ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 267 505 61 308 328 187 41.0 7e-47 VICLFSDTTDVHNAHEALFNSEKLLRNIFDNVQVGVELYDREGYLVDLNNKDLEIFGLKK KEEVLGVNFFRNPLVPGEIRENVRNGQEQAFKLDYHFNRLDGYYSSGKNGNVQIYTTVTM LYDMYGELINFVVINIDNTEINEAYSRLAEFESSFSLVSRFGKVGYARFDLMTRDGYAVP QWFHNLGEESDTPMPQVIGVYNYVDPEDRDAIFREIRRVKAGESNGFTLDLRVTPKGEPT GWTRVNVVRNPLNTDPSKIEMVCVNFDVTELKQTEKNLIEAKNKAEVSDRLKSAFLANMS HEIRTPLNAIVGFSNLLAETDDIGERREYMRVVEENNELLLKLISDILDLSKIEAGTFEF NYGRVDVNRMCEETVCSLSLKVKDKPVELIFGEHDAQCCVVGDKNRLIQVITNFINNAVK FTDEGSITLGYRTEGGELLFHVEDTGSGISEEHRQSIFDRFVKLNSFAQGTGLGLSISKS IVEQMGGRIGVESEVGRGSRFWFTIPAVTCDAPEKKGAAPAPRPAAVHGDGRLPRLLVAE DTDSNYLLVSLMLRREFDIVRASDGEEAVRICRELKPAAILMDVKMPGMDGLEATRRIRA FDPSVPIVAVTAFAYDRDRQKALDAGASEYLSKPLNGERLRQTLRELLSEE >gi|313159277|gb|AENZ01000011.1| GENE 2 2837 - 4813 2515 658 aa, chain - ## HITS:1 COG:no KEGG:Bache_0757 NR:ns ## KEGG: Bache_0757 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 37 658 34 660 660 536 47.0 1e-151 MKSVTILTIAAMAMAAATGCSHDTPTPSPEPPVKGGEFNEFGFTTTAPVTLSVDYGDMAG MPANVYFEVYDTCPVEETETAYAKIAGVEPLYAGYTDGQGRFEGTIELPAYIRKAYVLTP AFYARTLLEATRSGDMLTASAESGAPANAAPASTRAKEYHSTAVERDGWKTWLGSYDTTY GSIGYEYDGDLKVKNYGALLKAHASVFDTTKKCPKEYRSSRDLYLEKSAEVVITLLGSNT CWNSSMGYYYYKTGNKPKSLADANVVMIFPNTQDGRWSNSPWESRLYQGVERGTAVQLIY YPEIADGSKAGATTVFPADYRIGFVLATNAWTNRLRAGDKKYRAATSDGLSVNNNGVAYQ TPRTAVFRYTDKKSGINSVLFSFEDHTNDENFSDVVFTMTSNPVDAVTDIPSVDVNDGKK TANVLRGIYAFEDLWPSRGDYDMNDVMVRSDYEKVFNEKGVFEESFMLKTFANFAGNANG LAVTLTGAAADAKLEFSVRKPGAETFEAADFERDGKVVLLTPDVKETMGATYRITAKYDA PVAEAQAGTIKPFIYRTDRDGLTAGKRWEVHIPYEAPTARAEMSFFGTNDDKSIPEKGIY 
YVRAENYPFAFFLSGANDGDVAKLLDQTNEKSPIDQVYPAYAEWAATNGEKNKDWYKK >gi|313159277|gb|AENZ01000011.1| GENE 3 4963 - 6828 2318 621 aa, chain - ## HITS:1 COG:MA4299_2 KEGG:ns NR:ns ## COG: MA4299_2 COG1520 # Protein_GI_number: 20093088 # Func_class: S Function unknown # Function: FOG: WD40-like repeat # Organism: Methanosarcina acetivorans str.C2A # 312 593 14 315 345 65 25.0 4e-10 MKNPFIELCGCICLTLLTLAPQPAAAKTAGEVPPADTLRTVFFTDIHVTPGNAQDSLFRI AVAEANASDADLVIFGGDLTNMGSDRELAHVHGLMSQLRKPWFTVPGNHETTWSESGCTT FRRLFGHAGCVATRAKGYLFLGYASGPYMKMADGMIRGEDLAWLERQAAAARPGERIVSL CHYPLNGDLTNRQEVTETLKSVGVTASLYGHYHQLQLRNFDGIAGIPGRALAGKRGEAPG YTLLDFFADSVRIREKPLGMPPVTRYTVCLKDDPQILALPCDPLPPAPDAEGRMRLVVQD SAMVLTGAAFCGEVLCYGNSQGVLRAYDTRKGRELWRHVFPDGLYTTPLCTDGLVIAGAA SGGIWAFDARTGREKWHLPTQSAVVGDGLTDGGALYIGLGTGSIGKIDLRSGRLLWRHDY GCGQAQGRPALADGRLVFGAWDRHLYCLDAGTGELLWKWNNGSGSLFFSPGHVVPRIAGG KVFIVAPDRVVTCLDLATGKQLWRDNSRKSRETTGLSGDGRRFYYKTMDGELAAIDTSAE RYRELWCTDLGWGYEYNSCPAFVRDGVVYVASRHGTVAAVGEDGTLLWSVKCCNSGGNDF AQAPDGSLWTTFAEGKVFRIP >gi|313159277|gb|AENZ01000011.1| GENE 4 6912 - 8087 1759 391 aa, chain + ## HITS:1 COG:alr1299 KEGG:ns NR:ns ## COG: alr1299 COG0027 # Protein_GI_number: 17228794 # Func_class: F Nucleotide transport and metabolism # Function: Formate-dependent phosphoribosylglycinamide formyltransferase (GAR transformylase) # Organism: Nostoc sp. 
PCC 7120 # 2 388 9 388 391 421 57.0 1e-117 MKKIMLLGSGELGKEFVIAAQRLGQTVVACDSYAGAPAMQVADACEVFHMLDGDALAAAV GKHRPDIIVPEIEAIRTERLYEFEKAGVQVVPSARAVNFTMNRKAIRDLAAKELGLKTAR YFYATSYDELKQAAAKIGYPCVVKPLMSSSGKGQSLVKGPEELEHAWRYGCEGSRGDIRE LIIEEFIRFDSEITLLTVTQKNGPTLFCPPIGHVQKGGDYRESFQSPHIAPEHLAEAQRM AAAVTEALTGAGLWGVEFFLSHENGVYFSELSPRPHDTGMVTLAGTQNLNEFELHLRAVL GLPIPAVTFERLGASAVVLSPVACTSAPRYEGIDRALAADPRTDVRIFGKPSARVNRRMG VVTGYAPLDGDLEALRERVKAAAAEIKVLEP >gi|313159277|gb|AENZ01000011.1| GENE 5 8102 - 8542 763 146 aa, chain + ## HITS:1 COG:MTH1338 KEGG:ns NR:ns ## COG: MTH1338 COG0652 # Protein_GI_number: 15679338 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peptidyl-prolyl cis-trans isomerase (rotamase) - cyclophilin family # Organism: Methanothermobacter thermautotrophicus # 5 146 4 141 141 177 64.0 8e-45 MAQYAIISTEKGDMKAELYTDETPGTVANFVKLAESRFYDGLTFHRVIPGFVIQGGCPKG DGTGGPGYTIKCETSAPRQYHDRGVLSMAHAGRDTGGSQFFICHTRETTQHLDGRHTCFG RVVEGLDVIDDIRPGDLILSVRIVNE >gi|313159277|gb|AENZ01000011.1| GENE 6 8764 - 9585 754 273 aa, chain - ## HITS:1 COG:no KEGG:CLJ_B1028 NR:ns ## KEGG: CLJ_B1028 # Name: not_defined # Def: GNAT family acetyltransferase # Organism: C.botulinum_Ba4 # Pathway: not_defined # 5 273 6 284 284 186 35.0 9e-46 MEIKSLEKTDFDTLFRAFGRAFADYEVQLNAEQLRAMLTRRGFDPALSFAAFDGAQIAAF TLNGIGNFNGVPTAYDTGTGTLKEYRGTGLGTEIFRHSMPHLRRAGVGQYLLEVLQHNTR AVSVYRNLGFETVREFNYFCRANTEIADRGDTPRLPHSVRRTDWDFTPSWQNSFESIGRA AGDFVSLGVFIEEELVGYCVFEPASGDVAQIAVKRQYRRKGIAGLLLREMLRLNRNDSIK IINTDVSCGSITGFLQAKNIVPQGKQFEMIRKI >gi|313159277|gb|AENZ01000011.1| GENE 7 9804 - 10991 2103 395 aa, chain - ## HITS:1 COG:aq_273 KEGG:ns NR:ns ## COG: aq_273 COG0436 # Protein_GI_number: 15605813 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Aquifex aeolicus # 10 390 5 383 387 280 39.0 4e-75 MAKAVNIARSHRLDGIGEYYFSRRLREIAEIEAATGRQIVKLAMGSPDLPPHQSVIDRLA KEAQRPDVHKYMSYKGEPILRKAFADWYKKWYRTELDYNSEVLPLIGSKEGIMHICMTFL 
NKGDKVLVPNPGYPTYSAAVRLAGGEMVTYALNKETNFRPDFAAIEKAGLDGVKMMLINY PNMPTGQTPTMELFQQIVDFGAKHNILIVHDNPYSFIRNAAAPISIMEAEGARDVALEMN SLSKGHSMAGWRVGVVVGKKEWIDSILTFKSNMDSGMFYPIQAAADTALALGEEWFKELN DIYYGREKQAYALLDALGCKYREHQAGLFVWAELPESYEGDSFAFSDEVMDKCDVFLTPG GIFGSEGKRYIRITLCCPEELLKKATENIKAKFGK >gi|313159277|gb|AENZ01000011.1| GENE 8 11177 - 12841 2730 554 aa, chain - ## HITS:1 COG:STM3807 KEGG:ns NR:ns ## COG: STM3807 COG2985 # Protein_GI_number: 16767092 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Salmonella typhimurium LT2 # 27 549 18 545 553 382 42.0 1e-106 MEWLHTLFFGNGIAHAVLTFALVITIGILLGKVKIGGVSLGITWILFVGIVLSHFGMTVD GEVRHFVQEFGLILFVFSIGLQVGPGFFASFKHGGMTLVMCAVAIVLLGVATAYVVHLVT GTPIPTMVGILSGAVTNTPGLGAAQQAYADASGIEDPTIALGYAVAYPLGVVGIIFTMIF IRYALRVKFEKEDEGLAALSREHKLADKVSVEFTNKTLDGRTVAYVRDLINRQFVISRIL RPDGTISMADADSVIHLGDRLWVISQAEDIEAIVAFFGRRVEMTDEQWGNNTPNAELVSR RILITKSSLNGKKFSDLRLRTKYGITITRVNRAGVDLIPYQGLELQVGDRVMVVGPAKAV AQVADVLGNSLKKLNQPNLVTIFVGIALGVLLGSIPLLNVPQPVKLGLAGGPLIVAILIG RFGTHFHLVTYTTMSANLMLREIGIALFLAAVGIGAGDGFIDAIVDGGYRWIGYGVIITV LPLIIVALVARLWLKMNYYTLMGLIAGSTTDPPALAYANATAGNDMPAVGYSTVYPVVMF LRVLTAQIFILFAL >gi|313159277|gb|AENZ01000011.1| GENE 9 12969 - 13667 1011 232 aa, chain - ## HITS:1 COG:MA3262 KEGG:ns NR:ns ## COG: MA3262 COG1346 # Protein_GI_number: 20092078 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative effector of murein hydrolase # Organism: Methanosarcina acetivorans str.C2A # 6 227 9 230 238 137 36.0 1e-32 MIDKAFLSSDLFLLTLTVGLYCLGCAVYRRLRVPLLHPVLLTFVAVIVFLSAAGIDYPRY KQATSALDFALGMSVVALGYLLYEQMEQLRGSLLPVGVATLAGCVTGVLSVVYIAMAFGA GREILTSIAPKSVTVPIAVSVSGPLGGIVPITSVVVFCVGIFGSIFGEWILRRCGVRDAE ARGFALGAAAHGIGTARAIEIGAVEGALSGLAMALMGLATALLMPLMERYLY >gi|313159277|gb|AENZ01000011.1| GENE 10 13660 - 14052 508 130 aa, chain - ## HITS:1 COG:YPO1514 KEGG:ns NR:ns ## COG: YPO1514 COG1380 # Protein_GI_number: 16121787 # Func_class: R General function prediction only # Function: Putative effector of 
murein hydrolase LrgA # Organism: Yersinia pestis # 14 127 25 130 135 64 33.0 6e-11 MLGLFYILLFWLIGNALSILTGGYVSGNIIGMILLFAALCMRWVKAETVRPAAKFLLGAM ALFFVPYGVGLMDSYRVILDNLWAIVISGIVSTIVVLLVTGQTFQSLNRRSRLRRIRHLR ETAPNTPDHD >gi|313159277|gb|AENZ01000011.1| GENE 11 14528 - 17176 3530 882 aa, chain - ## HITS:1 COG:MA3477 KEGG:ns NR:ns ## COG: MA3477 COG0370 # Protein_GI_number: 20092288 # Func_class: P Inorganic ion transport and metabolism # Function: Fe2+ transport system protein B # Organism: Methanosarcina acetivorans str.C2A # 128 733 2 592 670 493 42.0 1e-139 MRLSELKTGESATILKVTGHGGFRRRIMEMGFVRGQRVEVILNAPLKDPIEYKIMGYDIS LRRSEADMVVVLSDSEASEYLAGGGSRHHHHEHHRRHCGCGADGQPAEPEYPTAETPGPD DGCATIDEVINRHSRTIGVALVGNPNSGKTSLFNAISGGHEHVGNYSGVTVGAKIGHRHY RGYRFEVTDLPGTYALSAYTPEERYVRSHIAERTPDVIINSVVASNLERNLYLTTELIDI NPRMVVALNMFDELNSSGAELDYDNLGRMLGVPMVPVEARNGKGIEQLLDTVIAVYENQD ERVRHIHINMGSVIEEGLRRLNGEMNAFRGELPKAFPPRYYAMKMLEGDRQVEEQLRGCS RWPEWAAIRDLEAKRISEALGEDVETAVANQKYGFIQGALRETFTPGKREEASTTALIDT FVTHKLWGFPIFFFLMWFMFWCTFSLGAYPQEWIDALVGWIGGAIDYMMPAGPLRDLLVD GIVGGVGAVIVFLPNIMILYLFISFMEDSGYLARAAFIMDRVMHRIGLHGKSFIPLIMGF GCNVPAIMACRTIESRSSRLITILITPFMSCSARIPIYLLLAGTFFAENAGLVMIGLYVL GVVLAVFTARLMRRFMFPVDETPFVMELPPYRLPTWKTTLRHMWDKCAQYLRKMGGMILV ASVVVWFLSYYPRTDAGNTPEHYENSYLGRLGRTCEPVFSPLGLNWKAGVALLSGVPAKE IIVSTLGVLYSEGAPAAAQPSGTIEIVAESGEMADTAKLSINDLPNGKVAVIGGADGPTA IYIAKTADGRPGLNPGTAAEDAAADGDEETANLARRLTASGDFTHASALAFLVFILLYFP CIATVAAIGSEAGWRWAVASVAYNTLLAWLAAWAVYHLVMWL >gi|313159277|gb|AENZ01000011.1| GENE 12 17255 - 18619 468 454 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 1 454 1 456 460 184 27 6e-46 MKAGYLIIIFAAFTAACTPRFYPPRVSVPDDYIYGRGFSEDTTRFGPEWWTLFGDTVLNN LVARALDNNRDVAVAASRVEEARLNLKSVRAQYLPQVGLGVTAEGEYTPATKIVQSYAVE PSLSWEVALFGQLRNAKRAAKAQIASSEWALRGVRLALAAEVATTYFTLLEYERDLAIAR QSCALRRESAALIDSMFRYGMSDGVALEQARSLVYTAEADIPQYRRAVAQTRLSLDILLG ETPRRTDSAGAGLRLLTDYRPADIPVGLPSELLERRPDIMQARYDMLAAAAEVGIARGNR 
FPSIALTAKGGIASNSIKGLTSANPWAWDALGSLTLPVFNFGRLRRAEQAAMERYTQSAL TYEQTVLTAFSDVEQALVAIATYRRQTERYGELVLANDRIAAMTQALYRSGLSDYLDVID AQRSLYQSQMEFVNLVAQQYINYVNLCKALGGGW >gi|313159277|gb|AENZ01000011.1| GENE 13 18616 - 21750 4856 1044 aa, chain - ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Brucella melitensis # 1 1028 1 1027 1051 711 39.0 0 MEKFFISRPIFAISLAIVIVLVGLISILNLPIEQYPDITPPVVEVSATYDGADAETVNNA VATPVAQSVMGVSDMLYLQTTSANDGSMVMQVTFDIGSDPDLDAIFTQNNVSSAAAQLPA TVTKQGVTTRKTMTGFLLVFSLHSDGRYDDEFLSNYAYINLQNELLKINGVGKVSIMGAG EYAMRVWLRPDVLKYYDIPVSAVTAAIENQGGIYPAGQFGAEPAPDGTSYTYTVTMPPQI TTAEQFGDIVVVTTSEGEQIRLRDVADVSLGSQSYGVSSLFEGKPTALIVIYQEPGSNAV AVGDKVKAEMARLGERLPDGITTSTVVDTTTSIDAGISDIFTTLIIALVLVICIIYLFIQ DWRATVIPLVAIPVSLVGAFALFPLLGFSINIISLLGLVLAIGLVVDDAIVVVEAAQVNI ANGMKPRAAALEAMKNVASPIVATTVVLLAVFIPVSFTGGITGRLFQQFSVTIAVSVVIS AFNALTLSPALCALLLRRREPPKKGFFAAFNRWFARRMDKYTAFTPTLIRHVARTGVFIA AVLAVIFLVWRKLPAGFLPEEDQGYVMVMVSTPEASSLQVTQKAMIRADEVIRTLPEVAS TSFAAGFNMMAGIASTDSGIIFVSLVGYSDRKLTAMEIAQKLTDELYMAVPGAECFAFIP PSIPGLGITSGVSVEVQDLEGRGTAYLLEQSERLMDSLRKLPSVASVTTQFNAGVPQRRL RIDKEQALASGVDLGVLYGDLTTLLGGAYVNNFSRFGKLYQTYIQAAPAYRMDKRSLDSY YVASSSGESVPVSSLVEVVDTVGVEYVSQFNLYRSIGLTVTPAARTSTTTVMQDITRTAA QVLPDDVGTAWSGTSFQEANASKTGGLVYLLALVFVFLALAALYESWGLPLAILMSVPVA VLGAVLFIGVSHLLNALYVNDIYMQISLVMLIGLAAKNAILVVEYADRLFNEQGASLMDA AIGAAKLRVRPIIMTAFAFILGVMPLIFASGVYATARNIMGVALVGGMLFATLLGIFVYP ALYYFVGRIGGFERRRERKKQEQS >gi|313159277|gb|AENZ01000011.1| GENE 14 21755 - 22867 1454 370 aa, chain - ## HITS:1 COG:BMEII0914 KEGG:ns NR:ns ## COG: BMEII0914 COG0845 # Protein_GI_number: 17989259 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Brucella melitensis # 10 367 81 434 451 157 32.0 3e-38 MNRLLIVAGLVSALLSAGCGRHKPKPAMPPLQVETARAVTDSIPNRMSFIGYLSSNFDAV IQPRINGFLLSKQYANGMPVKRGQLLFTIDPDQLSTTMLAAEAALQSAKAQALEARNNYE 
RAVPLAKINAISQAQLDQYTAQYKAAEASVRSAEQTLSSARMNVGYTELRSPIDGIIEHT AAHVGDYVGPGTQFSVLTSISNIDTLTVDVAIPMAEYLRAAGGRSSIYDNEGLLSDIRLT LADGTRYPYGGLYDYTRKDVSSSTGTLVLVVMFPNPDRSLKPGQFARVSANVGPVQPRVI VPQQSVSQAQGVNSVWVVRPDSTAAYRRVVLGDTYGTMWCVDEGLAEGEQVVTAGQQKLR DGMKVIPVKR >gi|313159277|gb|AENZ01000011.1| GENE 15 23042 - 23215 350 57 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513925|emb|CBK63135.1| ## NR: gi|291513925|emb|CBK63135.1| Histone H1-like protein Hc1 [Alistipes shahii WAL 8301] # 1 57 23 79 79 71 92.0 2e-11 MKELVAKINAEYEVFAANAAAQVEKGNKAAGTRARKSALEISKLMKEFRKVSVDAAK >gi|313159277|gb|AENZ01000011.1| GENE 16 23292 - 25004 2690 570 aa, chain - ## HITS:1 COG:PA1972 KEGG:ns NR:ns ## COG: PA1972 COG2194 # Protein_GI_number: 15597168 # Func_class: R General function prediction only # Function: Predicted membrane-associated, metal-dependent hydrolase # Organism: Pseudomonas aeruginosa # 233 530 230 536 567 140 30.0 7e-33 MNADPNRVNRRRTVKTRSRFTTLLTVYFFVALIIPNCVLANTEPYSGWTVEALILMPLGF YMMWSVALSRSGVMIWLGFPFIFLCAFQIVLLYLFGNSIIATDMFTNLVTTNPGEAGELL SNIYPSVILVCVMYLPLLWFAAREIGHKRYISRTTRMNVGLSGAALMALGMLALWPAYHV SEQRHVLRDEIFPLNVAYNVYLSASEFRKAYNFEKTSADFTYEASREAEAPGREIYVYII GEASRAMSWELYGYERETNPRLSQVDDLVIFRDVLTQSNTTHKSVPLILSSVSTEEHDQL YRRKGLPALFNEAGFETWFISNQSPQGAMIDKLARDADHLIYIRSPRHDMQLLDEMRRIV ETAPAQKMLFILHCYGSHFSYHQRYPREFARFKPDDDVAISREHAHTLRNAYDNSICYTD HFLAETIAFLRGLRQTSSALLYCADHGEDILDDDRERFLHASPTTTAYQLYVASLAWFSD AYRRNFPAKAAAAGINETAPATTHALFHTMADMASIRGRFVDPTVSLVNPDYDRKRPRRY LNDHNEAVPFRKTGLEPEDMEVFARFGIKP >gi|313159277|gb|AENZ01000011.1| GENE 17 25019 - 26221 1775 400 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2868 NR:ns ## KEGG: Odosp_2868 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 29 400 25 387 387 316 46.0 1e-84 MLRKTAVLALALCFALQGAAQEGVYINGSDTLHYSYTPLPEQPQPQKRLSFPRRVVKYFE GSTRDRTFEKKIDFTFAGGPSYSKNTSLGIGLLAAGLYRIDRTDSVTSPSDISIFANVSI SGFYALGVTGNNIFSRNGKRIDYTVMFASAPRDMWGIGYNDGRYNEESSYNEKRYLIKAS 
YLHRVLPNTYVGGILSFEHTQGKKFKPLGESYLYGQKTHYTATGVGAILEYDSRDFIPNP FRGVYVSLQETLFPKGLGNCGRSLWRTTFTADYYRQLWKGSILATDLYAEFNSEGTPWLM LARMGGSQRMRGYYQGRYTDNDMITFQVELRQRIWRRIGCTFWGGAGNVFSDLERFDWSE TLPNYGVGLRWELKKRVNVRLDYGFGKKTSGFLLSINEAF >gi|313159277|gb|AENZ01000011.1| GENE 18 26226 - 26744 956 172 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2869 NR:ns ## KEGG: Odosp_2869 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 23 171 22 169 171 92 31.0 6e-18 MKMKVIGMLAAFTLAASAAAAQELPRKVNNVEVLDLDGNPAKLPHWGEKNLMIFYVDPDR HKQNNDFTVELEENHRAQSDNIFGFGVMNLKDAPMVPNGMARNMAKKRTAKNGALVLADQ DRILSKAWELGDCNNQFVLMIVSKEGELVFLRKGVLSEQDKADFYETIEKYK >gi|313159277|gb|AENZ01000011.1| GENE 19 26848 - 27261 699 137 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2937 NR:ns ## KEGG: Odosp_2937 # Name: not_defined # Def: DoxX family protein # Organism: O.splanchnicus # Pathway: not_defined # 13 127 14 132 146 72 36.0 5e-12 MGKRIEDWIGQYRHMDWAVLYMRLFAGGMMLFHNIGKMQDYNEIVNSYPSFASAGSAATF VVVTVAEVLLAVAIIMGIWVRMSALILSLGILLMFAWGGFGAGELEFVWLGVYVFLIISG GGLYAFDTALTPAKPKK >gi|313159277|gb|AENZ01000011.1| GENE 20 27559 - 28035 639 158 aa, chain + ## HITS:1 COG:alr5068 KEGG:ns NR:ns ## COG: alr5068 COG0394 # Protein_GI_number: 17232560 # Func_class: T Signal transduction mechanisms # Function: Protein-tyrosine-phosphatase # Organism: Nostoc sp. 
PCC 7120 # 1 158 1 158 161 152 48.0 2e-37 MKTNILFVCLGNICRSPAADGIMHHIVEERGLSGRIGIDSAGTYAGHTGELPDARMRRAA ARRGYDLGHRARQIREEDFDRFDMIVVMDDMNYENVHRLAPSRRAAEKIFRMREFFRRHS RWDHVPDPYYEGAEGFELVLDMLEDGCGGILEYLENPQ >gi|313159277|gb|AENZ01000011.1| GENE 21 28126 - 29004 1470 292 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1765 NR:ns ## KEGG: Odosp_1765 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 7 292 16 322 327 121 29.0 4e-26 MKRLITTLTLLAATLCAAAQEATLPTPDAKTLGMGGVAMTTLSGSHAIYNNSATAIFSQM PSQISSSYYGQGQFDYYAVTGFCRFDNVNLAQIGWRQYLRERGNNDMAVDLGYSRRIGGN WAVGVVARYMHLKRPETSADALAVDLSAAWSHPLENVGSYSTLRAGAKLGNLGGYLKDTP YTLPMDLTAGVALDTFLSDVHEITVGTDLGYYFSPSKVRGFQMSVGAEYNLMQLIQLRAG YHYGERRDYYPSYGSVGAGVRFLHLRLDFAYLIAKKQTLLHNTYSISFGLDF >gi|313159277|gb|AENZ01000011.1| GENE 22 29219 - 29875 990 218 aa, chain - ## HITS:1 COG:MA1703 KEGG:ns NR:ns ## COG: MA1703 COG4845 # Protein_GI_number: 20090555 # Func_class: V Defense mechanisms # Function: Chloramphenicol O-acetyltransferase # Organism: Methanosarcina acetivorans str.C2A # 5 216 8 209 209 91 29.0 1e-18 MKHTVDIETWERRDNYAFFRNFLNSWISVATEIDCTEARAAAKAAGRSFFLYYLYAVLRA VNEIKEFRYRTDSRGNVIWHDTVDVISPIAVPGRTFYTVRIPYHADFERFYAEARAIVTD IPQDGDPYGAEKVLQAQGDYDVFLLSATPDLYFTSLTYAQGAPGQPTEYPLMNAGKAVMR EGRLVMPLSIFVNHAFVDGAHISRFFKLVEECLKRISA >gi|313159277|gb|AENZ01000011.1| GENE 23 29990 - 30736 250 248 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163739489|ref|ZP_02146899.1| 50S ribosomal protein L17 [Phaeobacter gallaeciensis BS107] # 7 247 4 241 242 100 32 1e-20 MKLLEGKVAVVTGAARGIGKAIALEFAKEGADVAFTDLVIDDNGKATEAEIAALGVKAKG YASNAANFEETHKVIEEIVKDFGRIDILVNNAGITKDGLMMRMSEAQWDAVLTVNLKSAF NFIHAVTPIMARQKGGSIINMSSVVGVSGNAGQCNYSASKAGMIGLAKSIAKEMGPRGIR ANAIAPGFIMTEMTDKLPDEVKEGWYKQIPLRRGGTPEDVAKVALFLASDLSSYVSGQVI HCCGAMNC >gi|313159277|gb|AENZ01000011.1| GENE 24 30907 - 32187 2118 426 aa, chain + ## HITS:1 COG:aq_479 KEGG:ns NR:ns ## COG: aq_479 COG0112 # Protein_GI_number: 15605959 # Func_class: E Amino acid transport and 
metabolism # Function: Glycine/serine hydroxymethyltransferase # Organism: Aquifex aeolicus # 1 424 5 410 428 487 58.0 1e-137 MKRDTQIFDLIAAERSRQMHGIELIASENFVSEQVMEAMGSVLTNKYAEGYPAARYYGGC EVVDKVETLAIERICRLYGAEYANVQPHSGAQANMAVFFAVLQPGDTFMGLDLAHGGHLS HGSPVNMSGKYFNAVGYQLDEATGVIDYDAMERKALECKPKLIVGGASAYSREWDYKRMR EIADKVGALLLVDMAHTAGLIAAGLLDNPVKYAHIVTSTTHKTLRGPRGGIILMGRDFEN PWGLTTPKGAVKMMSQILNSAVFPGIQGGPLEHVIAAKAVAFGEALEPSYKEYQTQVQKN AKAMAEAFVKRGYKIVSGGTDNHLMLVDLRTKFPELTGKLAEKCLVAADITTNKNMVPFD SRSPFQTSGLRFGTPAITTRGLKEDKMDYIVGLIDRVLHDPENEDNIAAVRRDVNALMAD YPLFAW >gi|313159277|gb|AENZ01000011.1| GENE 25 32189 - 32869 843 226 aa, chain + ## HITS:1 COG:XF2023 KEGG:ns NR:ns ## COG: XF2023 COG5587 # Protein_GI_number: 15838617 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 7 222 21 237 237 82 28.0 9e-16 MKEVIDFFRRLHDNNDRAWFDAHRAEWTHVKGRFAAFTERLIEGIAEFDPAVRGLRVQDC TYRIARDTRFSPDKSPYKTYIGAYIAPKGKKSGFAGYYFHIEPCCDSLIWSNALSAGLYC PEPVVLRSVRDEITDNGAEIAAAIRKAKGFDLVEENKLKRVPTGFPADSEYADMLKLKDF YIAKRITEEFLLADDLLEKTLAEFRCTQPFVAILNRAVQFAYEEMM >gi|313159277|gb|AENZ01000011.1| GENE 26 32885 - 33832 1404 315 aa, chain + ## HITS:1 COG:no KEGG:BDI_2196 NR:ns ## KEGG: BDI_2196 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 21 312 31 322 324 215 44.0 2e-54 MKRILVLFALFAAAAGYAQTVEKQTFVYAVKQTDTLRLDRYVALTPDSRTKPCLMFVFGG GFVGGRRDNASFLSYFEYYARKGYVVVSIDYRLGMKKAMQAGTLSEETFPEAWITTLAMA TGDLYDATAYVCDHASAWGVDKTRIVASGSSAGAITVLMGEYGICNGHPLARQKLPQDFN YAGVISYAGAIFDTQEELRWAKTPAPMMLFHGDADRNVPYDAVLYDGNGFFGSKHIADML TERRIPHWFYSVANTNHVMATRPMYDNRYEIDAFLEKLVLKREPLVIDTYVTPLEAPELP KAFTLSDYIDANYGK >gi|313159277|gb|AENZ01000011.1| GENE 27 33967 - 36093 1929 708 aa, chain + ## HITS:1 COG:BH3854 KEGG:ns NR:ns ## COG: BH3854 COG2213 # Protein_GI_number: 15616416 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, mannitol-specific IIBC component # Organism: Bacillus 
halodurans # 2 452 8 468 468 476 53.0 1e-134 MRQQIQQFGRTLSGMVMPNIGAFIAWGFITALFIPEGWWPNGQLAQLVGPMLTYLLPLLI AYTAGRNVAGERGGVIGAIAAVGVIVGSDIPMFIGAMIMGPLAGYAIRRFDRMVEGRVKA GFEMLVSNFSIGILGMLLAVLGYYVVGSVVSGLTMLISSGAELVIRHGLLPLVSLFVEPA KVLFLNNAVNHGIFSPIGIEQARETGQSIMFLLETNPGPGLGVLLAYWLFGRGNARQSAP GAVIIQFFGGIHEIYFPYILARPVLIVAAIAGSAAGLLFFSLSDAGLVAPASPGSILSVL AMAPKGKTLIVLLGVVISAAVSLVVAAPFIRRASKTETEGDPAVGKLPQSAAGISPHAAG RPVRKVIFACDAGMGSSALGATRFRKRLRDAEIGVAVGNSAADRIPSDADVVVCQSVLAE RIAAAAKGAELIVIDNFLSDPGLDALFARLESAKPTAAGPGTVSCGESDSDMSDARLAEA VLPSDVCDARSAGATPSSETSAFRPAETTASPDMPAARLAETTPSCGVSGSRSAQSALSC DQSGLRGAAADSAPQPEETASAPKDAAPDGAILQPGNIRVGLPAEPKEEAIRRAGELLVA GGYARPEYVDAMLRREELATTCLGMGLAIPHGTSDAKERVLRSGIVVLQYPDGVDFDGEK AHLIVGIAGVGDEHLEILARLSASFEDEELLQRLMTATDPQVIYDALK >gi|313159277|gb|AENZ01000011.1| GENE 28 36140 - 36403 322 87 aa, chain + ## HITS:1 COG:YPO2993 KEGG:ns NR:ns ## COG: YPO2993 COG1925 # Protein_GI_number: 16123174 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphotransferase system, HPr-related proteins # Organism: Yersinia pestis # 1 85 1 84 85 59 43.0 2e-09 MTSQKVTITAPNGMHARPAGELAKLVKGLAPVRITFRTAAKEVNAASMLALLSLGLKCGT EIEICAEGGDEAAAVKAVAAFIANMSE >gi|313159277|gb|AENZ01000011.1| GENE 29 36403 - 38043 1321 546 aa, chain + ## HITS:1 COG:CAC3087 KEGG:ns NR:ns ## COG: CAC3087 COG1080 # Protein_GI_number: 15896338 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate-protein kinase (PTS system EI component in bacteria) # Organism: Clostridium acetobutylicum # 3 530 1 536 539 405 42.0 1e-112 MRVLRTGIPAAGIAAGTAFVLERRVSAAPSAATGDPAAETARFHAALAQAKTELGVLAAE NDIFAAHLEMADDPMLAEQVEERISEHGFSAERALDEACEEVCGMLAALDDEYLGGRTDD VRDVCGRIRRILTGETAENPFAGLAPGTIVVAEELTPSDTALMDFSRIAGVVTARGSVTS HVCIIARAKGIAAIVGASECMLEINTGDKLIINGDTGEIIVAPDTATERRYRALSASRKR HGEHCLKGAHTPAVTRGGRRIAVLGNAGSVAEVRAALDAGAEGIGLFRSEFLYMQSRGGF PGEQTQFEAYRHAAELCGERPLVIRTLDIGGDKALPYMDFGHEENPFLGWRAIRVSLSMH DVFRTQLRALLRASVFGNLRIMFPMITSVGEFRRAETAVRECMAELDAEKAAYNPGIELG 
VMIETPAAVMVADLLAAEARFFSIGTNDLTQYVMAADRGNPRVAHLCDPFDTAVKRSVAM TLAAARSAGIEAGMCGELAADPEATAWLLEAGLEEFSVSAPAVAPLKERIRTLDLPEVRK TPGAAE >gi|313159277|gb|AENZ01000011.1| GENE 30 38117 - 38965 1338 282 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 24 261 43 304 328 122 33.0 6e-28 MKKTLLLLAALTLSPALYAQDYHVVLWDNATAPTSNGLTGEEIQPKPGQWSNTQKAEMWI YKAGDNATGQALVFFPGGGYSTLSVGNGHKEAKWFAENGVTAAVVKYRLPNGHSEVPRND ADEALRVMREMAAELNIDPAKVGVSGTSAGGYLAGSVGTLSEVKPDFMILYYPVISADAD KRHKGTFVQLLGADDADKPVAQEFSLEKKVDARTPPTLLFHCDDDKVVPAVNSALFYQKL KEHGVKATLHIYPSGGHGWGMNPKFRYYDDWQKATLDWLSTL >gi|313159277|gb|AENZ01000011.1| GENE 31 39090 - 40070 1318 326 aa, chain - ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 122 286 145 308 354 63 28.0 7e-10 MEDTLIMDAVINKYFAGETSDREEQFLMAWLEESEEHRKYFFELKSIWNARNVFAESADL GRFAAFMRSTDARIAKIAGEERARRRRGLLRWSMSAAAAVLLVVCAGFAWHFLAGPDIYR VYENNTETVTTVTLDDGTQVWLNARTRLILPKAFSPTERNVKLAGAAFFEVARDEAAPFT VATDDLRVRVLGTAFCVQAYENSGQAEVVLEHGSVRLQTPEGVNLVTLQPDQRALYDAAA DNLEISHVNASHLVLSQYDLVTMANATLHEIISHIETTYGVRISVPGRYGDKRYAFNYQR SNSLEEVLNIVEYMTGEHCEVIFRQK >gi|313159277|gb|AENZ01000011.1| GENE 32 40110 - 40736 912 208 aa, chain - ## HITS:1 COG:BH0263 KEGG:ns NR:ns ## COG: BH0263 COG1595 # Protein_GI_number: 15612826 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus halodurans # 1 186 1 184 187 63 25.0 2e-10 MESIPTDNARLTAAIRDGSREAFDEMCGRYYAPLMAYARLFLKGNWAQDIVQDVFVNVWI RRETLDPRQSLYGYLLRSVYNQSVNYVNKNRHAGNYRSYHQERIESIGYAYYDPDNNPII EKIYSQDLRAGIDAAVAALPEKCREVFTLSYLQGFSHREISERLGIAQSTVENHIYLALK ALRSKLSKSELLMLLFFIFLCNNSQSFG 
>gi|313159277|gb|AENZ01000011.1| GENE 33 40867 - 43860 3633 997 aa, chain - ## HITS:1 COG:no KEGG:Phep_3775 NR:ns ## KEGG: Phep_3775 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 12 475 1 462 465 325 43.0 6e-87 MKKSIIYILFALFSFSAAVSLTACKDDGVEISDAYKSKTPKNVKLLEYGNKSLTICWDFV RGATSYTVQLVDGDMNPVSEALCMTTADIDYHEFTDLPTDRIYYGRVRANYPYSATSDWV YVTEHDKPAMLMASVGILDLDPQLKLHAASGSTLTFEWSYTDDKATDAARSYNVELFRDE ACTDLYVSWLADGKLSSGKGIFTALAGYPVVRYTFSGLDPETTYYARITNLSFANIQTPV VAGTTTQAGPKAAANTPAKAGDIVLAQDFSAFIHGGDIVNSAAGYNAVSGTDFRKTWEKA EGVNPQADGDRPLCNWATEFHIHTGGTSAEYVEALGMKGWGSSGNTSTRPGYIKCGGGSG GIGILYTPQLTALPANTTVKVSFSASAYAEGENVYGSDIVVEAVEGAEFGSNNVVSKKGT AFVSKTVDISSAVGRFETYTVTLEGLTPASRIAFSSNPAQAGANKTRFLLDDIVVTCEGE THLEQLAAPANVKFDAEAVQPDKLTLKWDAVSGAASYTVACWADGTDESKAALVEKIDGT ACTLDKLTPETKYNAKVKAVAGRTATDSDWSAKASATTAEASQGKLDAPANLKAEPGFST VALTWDAVSGATGYSVTVDNGTPQTVTDNAFTATGLTAGTSHDFSVVALAALAADNSDAA TLQSKTLYVRIAAVTTSSIILEWEQAGDVSQYTVAIEEEGDPSRSARFTYDWSKASGYGS IPLRFAFNYVGTEWASAALSPSASYRLRVKAGAADSDSAWSNDITGTVARRTAPAGEVFY EDFDRFMGGDAVMTAVGVKSAKKVTSEAGLSTMLTEFTYTDWSAGGGAMNAGSGTYGDPY RLMFFKDDGWETGALVTGNGVHNAFCGSFRLGGDNANQSYIQTPAMSGLTAAANVTVTFK TAVNVWTAASTTAPAVVFKAGWCRDQFESMELYVVHGDGTKVKVGDTIAVGDAGSELAWK SHSVAIEGLLPTDKLMFTSVGAKKSRYYIDEISVVKN >gi|313159277|gb|AENZ01000011.1| GENE 34 43876 - 44763 1414 295 aa, chain - ## HITS:1 COG:no KEGG:Phep_3774 NR:ns ## KEGG: Phep_3774 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 6 295 166 440 440 168 40.0 3e-40 TEKYTVSVTQTGGNGLTAAAGSYSISSPGNGFIEIPLTGTATKQGIVTFALKVTDAAGVE TDFGEISTIVRAGFDQAGEETPLLVQNFNKFPWGGDCIGQKAGVTSADKTVANLSLDNET VSVTAGTNASIGGITSTVRNGNPSFYRAIGMEGWTGYLNYMCPGYMQFGAAGDNADGQPG NIISPVLNIPAGYDMLLTFKVGIWAAAPDQGMVGICEKNDGFWISKGTLSKIVASTKSYV SLDIPAYTWVEVSAVIPNPGSLANPSLFISTADSWFSGSTVKAGRWYVDDIKLVY Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:30:14 2011 Seq name: gi|313159274|gb|AENZ01000012.1| Alistipes sp. 
HGB5 contig00045, whole genome shotgun sequence Length of sequence - 1424 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 23/0.000 + CDS 44 - 568 280 ## COG2963 Transposase and inactivated derivatives + Prom 703 - 762 3.7 2 1 Op 2 . + CDS 793 - 1386 281 ## COG2801 Transposase and inactivated derivatives Predicted protein(s) >gi|313159274|gb|AENZ01000012.1| GENE 1 44 - 568 280 174 aa, chain + ## HITS:1 COG:YPO0996 KEGG:ns NR:ns ## COG: YPO0996 COG2963 # Protein_GI_number: 16121298 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Yersinia pestis # 1 170 8 178 178 106 39.0 2e-23 MKYNYEIRLKAVKLVLEGGLSVREAGCHLGCGRSQVHLWVTLFERHGLAGLKLRHGSYSA EFKLSVLKHMHQNHLSLLETAVHFGIPGPFVIRQWERLYQNQGAEGLRRKPQRRRPAMSK SKTKKVKLKTTPHEELLKELEYLRAENAYLKKLQALVEERIVRESGKEPKPSKD >gi|313159274|gb|AENZ01000012.1| GENE 2 793 - 1386 281 197 aa, chain + ## HITS:1 COG:RSc1436 KEGG:ns NR:ns ## COG: RSc1436 COG2801 # Protein_GI_number: 17546155 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Ralstonia solanacearum # 1 197 80 276 278 235 57.0 3e-62 MNICGIKSQVRLRKYCSYKGQIGRIAPNLLQRDFAAEKPNQKWVTDLTEFSVCGVKLYLS PIMDLYNREIISYKIAERPNFMQIMKMLDDAFARIPDSPGIVLHSDQGWQYQMKQYQLRL RQKGITQSMSRKGNCLDNAAMESFFGLLKSELLYLQKFSSIDHFRKELEEYIDYYNNKRI KKYLNNMSPVQYRTHAI Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:30:16 2011 Seq name: gi|313159267|gb|AENZ01000013.1| Alistipes sp. HGB5 contig00058, whole genome shotgun sequence Length of sequence - 5561 bp Number of predicted genes - 8, with homology - 7 Number of transcription units - 4, operones - 2 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 2 - 482 418 ## Dfer_4749 TonB-dependent receptor plug 2 1 Op 2 . 
- CDS 479 - 1348 495 ## gi|313159272|gb|EFR58639.1| hypothetical protein HMPREF9720_2309 3 1 Op 3 . - CDS 1351 - 1809 236 ## gi|313159273|gb|EFR58640.1| putative lipoprotein - Term 1822 - 1869 12.3 4 2 Op 1 . - CDS 1880 - 2917 181 ## gi|313159270|gb|EFR58637.1| putative lipoprotein 5 2 Op 2 . - CDS 2981 - 3574 351 ## PRU_2727 hypothetical protein 6 2 Op 3 . - CDS 3606 - 4070 150 ## gi|313159269|gb|EFR58636.1| hypothetical protein HMPREF9720_2313 - Prom 4254 - 4313 4.7 7 3 Tu 1 . - CDS 4633 - 5175 106 ## - Prom 5266 - 5325 4.4 - Term 5221 - 5257 1.0 8 4 Tu 1 . - CDS 5344 - 5559 140 ## gi|291514538|emb|CBK63748.1| Protein of unknown function (DUF3244) Predicted protein(s) >gi|313159267|gb|AENZ01000013.1| GENE 1 2 - 482 418 160 aa, chain - ## HITS:1 COG:no KEGG:Dfer_4749 NR:ns ## KEGG: Dfer_4749 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 32 159 103 233 806 116 41.0 3e-25 MRKLLAGCLALCIPALLTAQTPADSISRTEAIDSVVVTARRPLMVYKQTGNIAVDIEQLK YAPLFAGEKDIFKFLQLLPGVSAGKDGMSGLLVRGGSNDQTLILYDDVPIYNQAHAYGIL SIFSGETVQSAEVSKGFISPAYGSRLSALTQIRTREGDRL >gi|313159267|gb|AENZ01000013.1| GENE 2 479 - 1348 495 289 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159272|gb|EFR58639.1| ## NR: gi|313159272|gb|EFR58639.1| hypothetical protein HMPREF9720_2309 [Alistipes sp. HGB5] # 1 289 1 289 289 582 100.0 1e-164 MLRCISVFFVVFICWSCEVDYSFGEKDFKPRVVVNALISPQEPFAVRLHWSRSYSAQSGF TPVGEAEIRLYEDNTEVVRCPADPEGTTQTTFRAAAGRSYRLVVSVPGYGELSARTTIPE APAARISFAHQKGWYRHFDMTDLTAGVDAKAIWLRGTEQDNNGEKDIYAFYTTSVFVDQV NGANDAYESDEKGSTIDFEYFLRVARENLSAAFPLRFSVFGAVENRHTFQIIAASDTYDR YMRSRYKHELNTGESAQENPFIEQITVYSNIDNGIGIFAGYNYYTTPEL >gi|313159267|gb|AENZ01000013.1| GENE 3 1351 - 1809 236 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159273|gb|EFR58640.1| ## NR: gi|313159273|gb|EFR58640.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 152 1 152 152 317 100.0 2e-85 MKKILFLLAFAIGFISCNDEKHVWSFEDMFYFLHNDPSQEGWVEPVTKITVGPGAQVDLL VTRNAFAAKAHPKQTVKVVVEENLSTANPGGDFILSEQAFNFKNKDALQLPLRVNISGGT SGKKIVLRLDYGYYDECSPESRKGDKLIIKIK >gi|313159267|gb|AENZ01000013.1| GENE 4 1880 - 2917 181 345 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159270|gb|EFR58637.1| ## NR: gi|313159270|gb|EFR58637.1| putative lipoprotein [Alistipes sp. HGB5] # 1 345 1 345 345 694 100.0 0 MKITKICIFTIVAFAAYACAKSDSVDTSSVTPETLVPVENDEIQCINGTLHFSSGEVCLN TLESLTSEESLHAFEKAHNFTSWRSFTDSMLDDIIACETPEEYQAVLETCAGYLKMDNDR LMPVITSVGYASIADINGVFYVGDVKHTVNAESVTIERTDAKTRAVEVEEIGYVAPVFDE TQTRGEISQKYTDRRRETKKYKVFSRTNVLRHSVVQEVNGQKYYNTSFAIQVHVSGQKKK TLIGWNTYNDRFYIENLHWDLTIAGKRISFETYRNNYSKTESTKNLYATIGFGNGQFNMT PTEVPMPSGFNCIVHRARSREMGNCGVVTDINCCQGIYLPVPECI >gi|313159267|gb|AENZ01000013.1| GENE 5 2981 - 3574 351 197 aa, chain - ## HITS:1 COG:no KEGG:PRU_2727 NR:ns ## KEGG: PRU_2727 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 1 192 22 194 198 85 30.0 1e-15 MKKLIFAFALLCFMQGNAQTSQSTVADRKVGCIEGEIGGGFAFGADKLNFDKNKLGATFY AEVRYNIQRVPLDVGVQVGGTIFHRESVNAGQLKFKTWNVMAVTDYNFRRYKKISFFAGI GLGYASLDNSAPITFDDSQPNWGGFSTGTKTGSFCFMPRIGVELFHHLRVTFNYKLEEKA NRHFNISVGVVFGGGRK >gi|313159267|gb|AENZ01000013.1| GENE 6 3606 - 4070 150 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159269|gb|EFR58636.1| ## NR: gi|313159269|gb|EFR58636.1| hypothetical protein HMPREF9720_2313 [Alistipes sp. 
HGB5] # 1 154 72 225 225 314 100.0 1e-84 MTKKFPFANSKIVLFHIPLANYDVTTGDVNGAMAAGMKKLASSINSWFNDSQNKGLLNNP RGLFTTRDGDRKIRVIFPQSEEVDYHTGREVVRWDKQWFSGNFMIGFKSNMSGGGFHYKN SSFSPAKDVSIIRGRIYGAVKYDNQWRACVIETK >gi|313159267|gb|AENZ01000013.1| GENE 7 4633 - 5175 106 180 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKILLLFVMALYVSSCTQDLTQNENMQPQSDETTAPVKSGVLHFNSVDDLSSTISQMKE RQAFDIKHLPATRAAAQALNGEFVSLRQHLLDQGLREFTDVELAEIAADSLEYEPEDSLI VDPYMTAILNGEREVQVGDKICRFVEDGMMMYDANEKVLFDPNLVEQKLAVESMFSWSSC >gi|313159267|gb|AENZ01000013.1| GENE 8 5344 - 5559 140 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514538|emb|CBK63748.1| ## NR: gi|291514538|emb|CBK63748.1| Protein of unknown function (DUF3244) [Alistipes shahii WAL 8301] # 1 71 57 127 127 134 97.0 3e-30 PDGNTITLYSPENCDRAFVTISGNGTYLTEMVNFTDQTATLDVSDLDCGVYLITVEYENG TIYTGQIEFTE Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:31:23 2011 Seq name: gi|313159263|gb|AENZ01000014.1| Alistipes sp. HGB5 contig00096, whole genome shotgun sequence Length of sequence - 1397 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 1, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 57 - 341 247 ## Halhy_1324 integrase catalytic subunit 2 1 Op 2 . - CDS 358 - 948 308 ## COG2801 Transposase and inactivated derivatives 3 1 Op 3 . 
- CDS 945 - 1172 230 ## gi|313159266|gb|EFR58634.1| hypothetical protein HMPREF9720_3051 Predicted protein(s) >gi|313159263|gb|AENZ01000014.1| GENE 1 57 - 341 247 94 aa, chain - ## HITS:1 COG:no KEGG:Halhy_1324 NR:ns ## KEGG: Halhy_1324 # Name: not_defined # Def: integrase catalytic subunit # Organism: H.hydrossis # Pathway: not_defined # 1 86 217 300 311 85 52.0 7e-16 MTEKGDPYENAVAERVNGILKSEWIDEECFESFQAAKERIDQIVILYNSLRPHASCDWLT PLEAELRTGKLKHHWGRKTVVRKAYVNLYQDNIF >gi|313159263|gb|AENZ01000014.1| GENE 2 358 - 948 308 196 aa, chain - ## HITS:1 COG:PA0257 KEGG:ns NR:ns ## COG: PA0257 COG2801 # Protein_GI_number: 15595454 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Pseudomonas aeruginosa # 13 187 2 181 263 114 39.0 1e-25 MSLSFLCGLFGYTRQAYYKHLRRNREGSLSDTLLLERVGYYRKLMPRLGGRKLWHLLQQG GFPVSRDRLFTLLSENNLLVKRRKKYSVTTCSRHWMRKYPNLIRGFDLERPHRLWVGDIT YISLKEGFAYLALITDAYSKRIVGYNLNTTLERDGALRALRMAIDQTPQQKRQGLIHHSD RGCQYCSKRICEITDR >gi|313159263|gb|AENZ01000014.1| GENE 3 945 - 1172 230 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159266|gb|EFR58634.1| ## NR: gi|313159266|gb|EFR58634.1| hypothetical protein HMPREF9720_3051 [Alistipes sp. HGB5] # 32 75 1 44 44 68 97.0 1e-10 MSTPNPYPIMSRVTSEEASELLSENKALRRRLEEALLRLEGYEIMGDILQEEYGIDLLKK SAAGQSSVSKKDTQQ Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:31:37 2011 Seq name: gi|313159239|gb|AENZ01000015.1| Alistipes sp. HGB5 contig00080, whole genome shotgun sequence Length of sequence - 20865 bp Number of predicted genes - 26, with homology - 23 Number of transcription units - 17, operones - 4 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 42 - 1193 994 ## COG1672 Predicted ATPase (AAA+ superfamily) - Prom 1226 - 1285 10.0 + Prom 1559 - 1618 2.1 2 2 Op 1 . + CDS 1639 - 2574 1152 ## BDI_2532 tyrosine type site-specific recombinase 3 2 Op 2 . 
+ CDS 2608 - 2913 333 ## BDI_2532 tyrosine type site-specific recombinase 4 2 Op 3 . + CDS 2987 - 3508 101 ## Tresu_0262 hypothetical protein + Term 3510 - 3546 7.5 + Prom 3521 - 3580 4.3 5 3 Op 1 . + CDS 3606 - 3944 379 ## gi|313159259|gb|EFR58628.1| DNA binding domain protein, excisionase family 6 3 Op 2 . + CDS 3947 - 4318 429 ## gi|313159262|gb|EFR58631.1| hypothetical protein HMPREF9720_2938 7 3 Op 3 . + CDS 4344 - 5288 556 ## RB2501_01425 hypothetical protein + Term 5307 - 5349 8.5 + Prom 5975 - 6034 5.0 8 4 Tu 1 . + CDS 6143 - 6469 183 ## + Prom 6720 - 6779 7.2 9 5 Tu 1 . + CDS 6842 - 7069 148 ## + Term 7177 - 7213 5.5 + Prom 7218 - 7277 3.5 10 6 Tu 1 . + CDS 7403 - 7795 185 ## gi|291515329|emb|CBK64539.1| hypothetical protein AL1_22780 + Term 7804 - 7842 6.1 11 7 Tu 1 . - CDS 8236 - 8841 147 ## Dfer_5485 hypothetical protein - Prom 8952 - 9011 4.1 - TRNA 9485 - 9559 66.7 # Pro GGG 0 0 12 8 Op 1 . + CDS 9640 - 9909 340 ## Odosp_2779 periplasmic binding protein 13 8 Op 2 33/0.000 + CDS 9957 - 10793 1145 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 14 8 Op 3 35/0.000 + CDS 10811 - 11818 1588 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 15 8 Op 4 . + CDS 11820 - 12584 1029 ## COG1120 ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components + TRNA 12684 - 12758 44.0 # Glu TTC 0 0 + Prom 12684 - 12743 76.8 16 9 Tu 1 . + CDS 12963 - 13214 279 ## PROTEIN SUPPORTED gi|227370862|ref|ZP_03854358.1| ribosomal protein S20 + Term 13237 - 13268 1.1 - Term 13367 - 13406 8.0 17 10 Tu 1 . - CDS 13441 - 14478 1526 ## COG0611 Thiamine monophosphate kinase - Prom 14519 - 14578 3.0 + Prom 14431 - 14490 5.3 18 11 Tu 1 . + CDS 14572 - 15144 878 ## COG0563 Adenylate kinase and related kinases + Term 15171 - 15208 8.2 + Prom 15453 - 15512 3.5 19 12 Tu 1 . + CDS 15557 - 15673 148 ## 20 13 Tu 1 . 
+ CDS 15811 - 16719 1331 ## gi|313159257|gb|EFR58626.1| hypothetical protein HMPREF9720_2950 + Term 16802 - 16837 4.0 + Prom 16788 - 16847 4.6 21 14 Op 1 11/0.000 + CDS 16883 - 17227 393 ## PROTEIN SUPPORTED gi|34540403|ref|NP_904882.1| 30S ribosomal protein S6 22 14 Op 2 27/0.000 + CDS 17231 - 17509 384 ## PROTEIN SUPPORTED gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 23 14 Op 3 . + CDS 17528 - 17968 402 ## PROTEIN SUPPORTED gi|146301222|ref|YP_001195813.1| 50S ribosomal protein L9 + Term 18039 - 18082 9.7 - Term 18027 - 18070 7.6 24 15 Tu 1 . - CDS 18241 - 19161 1318 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 19376 - 19435 3.9 + Prom 19307 - 19366 3.1 25 16 Tu 1 . + CDS 19401 - 19922 938 ## COG3685 Uncharacterized protein conserved in bacteria - Term 19958 - 20011 19.3 26 17 Tu 1 . - CDS 20047 - 20700 737 ## Sterm_1337 hypothetical protein - Prom 20732 - 20791 2.8 Predicted protein(s) >gi|313159239|gb|AENZ01000015.1| GENE 1 42 - 1193 994 383 aa, chain - ## HITS:1 COG:MA1854 KEGG:ns NR:ns ## COG: MA1854 COG1672 # Protein_GI_number: 20090704 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Methanosarcina acetivorans str.C2A # 8 375 7 385 390 81 23.0 3e-15 MAKIINPFIVTGKIAPEYFCDRVSESARLVKSITNGNNLVVISPRRMGKTGLIQFCYDKP GIGKEYYTFFIDILHTSSLREFTYLLGREIYETLLPRSRKMATLFIQTIKSISGKFGFDP ITNLPTFNVELGDIERPEYTLDEIFQYLSHADKPCIVAIDEFQQIAKYPEKNIEALLRTH IQRSENSHFIFAGSERHMMQEMFASAARPFYHSADMLELKAIPAEIYIPFIVGHFERRNR SIAAIDVEKVYALFQGHTYYIQKTFNESFADTPEGDECTLETIRAAIDNMIASNDTIFRE ILSNVPEKQKELLYAIAKEGEAERITSADFIKRHSLTSASSVQSAAKKLLEKDLITEINK VFSVTDRLFAMWINKLYGNNLYL >gi|313159239|gb|AENZ01000015.1| GENE 2 1639 - 2574 1152 311 aa, chain + ## HITS:1 COG:no KEGG:BDI_2532 NR:ns ## KEGG: BDI_2532 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: P.distasonis # Pathway: not_defined # 16 306 15 297 422 206 40.0 1e-51 
MAHSIRFPRRHDKADRNGRYAVRLCITKNKRRKYIALDLYADPAYWDEAGEQFIILRNLK GAEQKAENKQREADNALLAKYKVRAREIVERFEIEGIDWTLNQFEDAFLNTSKQGKFNAY FTDRIAELHATGHIGNSQTYKQTQDMLRSYDRKLDQRLFSDIDLRYVRGFDMFLQKRGCC GNTRKFYFKALRAILNRANAEGVGSVATYPFGRGGFEVSKLEEATAKRYLPAAKLSKLKS STASNPQCEYARKLFLFSYYCYGISFIDMAMLTAAHIKQMEDGDYIVYKRQKIKRQKGVK PISIKITPAIR >gi|313159239|gb|AENZ01000015.1| GENE 3 2608 - 2913 333 101 aa, chain + ## HITS:1 COG:no KEGG:BDI_2532 NR:ns ## KEGG: BDI_2532 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: P.distasonis # Pathway: not_defined # 1 96 315 409 422 99 52.0 4e-20 MDDFLLPIVTRSGYTGERLYMHIRTRYSKYQKYLRLLAEELGIDFHLTSYVSRHTAAMTL QRNNIPREVISQMLGHADLETTNIYLDSFDNRVINEAAKVL >gi|313159239|gb|AENZ01000015.1| GENE 4 2987 - 3508 101 173 aa, chain + ## HITS:1 COG:no KEGG:Tresu_0262 NR:ns ## KEGG: Tresu_0262 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 9 120 1 99 303 66 38.0 6e-10 MYFFQSSPVSYKRFIKRFPTENSLIKLYAKTCVDFDIESKPKKRRCPHCGSCQNGPFRKN NEKLRLCKRCGRTFSIFRNSIFDKSKADLRIWMYIGYCMFYYHHVYFNVDKISRTQIIRE THAPSNATIYHIYDVLYKLFKSQNKVDMLLIFNIFLPFMKIFHNMNPYRKKEH >gi|313159239|gb|AENZ01000015.1| GENE 5 3606 - 3944 379 112 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159259|gb|EFR58628.1| ## NR: gi|313159259|gb|EFR58628.1| DNA binding domain protein, excisionase family [Alistipes sp. HGB5] # 1 112 1 112 112 207 100.0 2e-52 MNNITDDTPIAMLTVGQLKEILSIDKIMGHLTPSRAPDKEVLTAADVARLTGYSISTVYK LTSERKIPFHKPEHKGRKLYFNREEILDWLQSESHPTIEQENIRKIKQLKKH >gi|313159239|gb|AENZ01000015.1| GENE 6 3947 - 4318 429 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159262|gb|EFR58631.1| ## NR: gi|313159262|gb|EFR58631.1| hypothetical protein HMPREF9720_2938 [Alistipes sp. 
HGB5] # 1 123 1 123 123 191 100.0 2e-47 MKNEKTTTKPAFDPEVPPREGFIFYRTFYEALASMKNRARLHLYDVIMRYALYGEEPTDL NGEQMRTFILIRPQLDANERKRQAKYKKKAIKKNNDEELNEFFTKTEDEENDNLRILNDI QYE >gi|313159239|gb|AENZ01000015.1| GENE 7 4344 - 5288 556 314 aa, chain + ## HITS:1 COG:no KEGG:RB2501_01425 NR:ns ## KEGG: RB2501_01425 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 27 311 3 281 754 131 30.0 5e-29 MKENEYDNGYIERQTVSSNPPELPQIRVSVFENYFAQKPLGDVDLIKWCKTAKFKEQVIA FRTTSNEKVRQRIKRNLPCITPSGIFKTRSRDGLVQHTGFICIDIDHKDNGVFGPEWFDK KRLVAKTFDSLLCASMSISGNGLYLIFRIAHPDMHLAQFDALVREIYEKTGLVADQGCCD VCRLRGASYDAYPYINPHAKPYRGVLKERTARAKVRTAREKKLLDEKVYKLIQKIREEKK DITDDYHDWYCIGCALAHEYGKEEGLRLFHLVSMHSKKYYPTDCDEQFAKCLRSRKIGIE TFLWICKKHGVTFK >gi|313159239|gb|AENZ01000015.1| GENE 8 6143 - 6469 183 108 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDRSVFEILDAIGVNLLRLLIHDYLPQRFHAKQFYDLLINHHPLEYREMRNRYAHKPEQE IDGIVNAQISQYLLKRSSELNISKIENCYHKYTSHNNNSSQVSYWQHR >gi|313159239|gb|AENZ01000015.1| GENE 9 6842 - 7069 148 75 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNCTNEVNEKFTLKFGYTQELIQDKSETSVFLKGPLGIFFGTFQGIYDIRKVTESYESMR MMEQLMQSYREHYGL >gi|313159239|gb|AENZ01000015.1| GENE 10 7403 - 7795 185 130 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291515329|emb|CBK64539.1| ## NR: gi|291515329|emb|CBK64539.1| hypothetical protein AL1_22780 [Alistipes shahii WAL 8301] # 1 130 1 130 130 236 94.0 3e-61 MKTKEEIRHYLDTRFSPSEDDMVQIVRNDFDRILNSAASCALITVEGAMPVLAEQLHTEL EALHTGADLDVIVKVSYHPASEISFDDLCSLLTVIHEFAPQVNIVWGCGTDAKLAETDYS ILLLIGTSTE >gi|313159239|gb|AENZ01000015.1| GENE 11 8236 - 8841 147 201 aa, chain - ## HITS:1 COG:no KEGG:Dfer_5485 NR:ns ## KEGG: Dfer_5485 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 2 201 154 358 358 135 39.0 1e-30 MLKRPVRWIKEVRRRELDPYLYRMFTAHQAVNKVNDYAEIIERSVNDLFILEDEAHFVIN VGSDTIAAKNLFGLGASLMEILDEISEKFVLGISSEDLEVTININSPGKIDIKSKVKKTT VVFGLILLLCGGGYEAADGTKLATDGLPGIIKTIDEYLSHRQERELKSDIFTTYKDSLQI KDPEDILLLLKQVSENKDVAK 
>gi|313159239|gb|AENZ01000015.1| GENE 12 9640 - 9909 340 89 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2779 NR:ns ## KEGG: Odosp_2779 # Name: not_defined # Def: periplasmic binding protein # Organism: O.splanchnicus # Pathway: ABC transporters [PATH:osp02010] # 29 88 29 88 379 72 50.0 7e-12 MKRFQPLIPLLLALLGTACGGSASYTPADFTTEVYTPAYASGFDIRGTERNAATLVTVRT PWQGGSGVEQHLLVLREGIEPPAGFDGRS >gi|313159239|gb|AENZ01000015.1| GENE 13 9957 - 10793 1145 278 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 5 277 153 424 426 167 31.0 2e-41 MFDAIGQIRRVCGVSGIDYISNAYVNEHRCCGEVLDVGYDTNLNFERLAAMQPDLMLLYG VTSENTVVTGKLRELGIPYIYVGDYMEESPLGKAEWLMVAAELCNDRSRGAETFGGIAER YRAVKTAVAGSAPAVRPKVMLNTPYRDTWFMPSSRSFMIRLIEDAGGEYVYTKNDSDTSV AVDLEEAYLLASSADVWLNVGPCNTLAELTAQNPKFAGIPAVRNRMVFNNNRRQTPAGGS DFWESGVIHPDLVLRDLSLILGGKPAEEGELHYYKRLE >gi|313159239|gb|AENZ01000015.1| GENE 14 10811 - 11818 1588 335 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 7 330 22 353 362 222 46.0 9e-58 MSGRRTAILFTVLSLLTAALFTADLLIGSVAVALRDIWAALTGGSCDPAVRDIILKIRLL KAVTALFAGAALAASGLQMQTLFRNPLAGPYVLGISSGAGLGVALFLLGAPLLGVSAHSF VQSLGIAGAAWLGAALVLLIVMAVSRRIKDIMVILILGMMFGSGVSSVVEILQYLSSEAA LKSFVIWTMGSLGDVTGGNLALMLPVITAGLALSVAVIKPLNLLLLGENYARTMGLNVQR TRTLLFLSTVLLAGTVTAFCGPVGFIGLAVPHLARMLFASADHRVLMPASMLSGAALLLV CDLISKSLALPINTVTALMGIPVVIVVVVRNRNLF >gi|313159239|gb|AENZ01000015.1| GENE 15 11820 - 12584 1029 254 aa, chain + ## HITS:1 COG:alr4033 KEGG:ns NR:ns ## COG: alr4033 COG1120 # Protein_GI_number: 17231525 # Func_class: P Inorganic ion transport and metabolism; H Coenzyme transport and metabolism # Function: ABC-type cobalamin/Fe3+-siderophores transport systems, ATPase components # Organism: Nostoc sp. PCC 7120 # 15 243 19 248 333 184 44.0 2e-46 MSIALRHITLAYGQRILLRDVSASVPPGSLTALIGRNGTGKSTLLRTVAGLGAAASGGIE LCGKPLAALTPLQRASTVSFVTTDKVRIANLACEDVVALGRAPYTNWIGRMQETDRDIVA RSLRLVGMEAFACKTMDRMSDGECQRILIARALAQDTPVILLDEPTAFLDLPNRYELATL LRRLAHDEGKCILFSTHDLDVALGLCDAVALIDTPDLHCLPASDMASSGHIERLFAGAGI SFDPATLTIRLTKK >gi|313159239|gb|AENZ01000015.1| GENE 16 12963 - 13214 279 83 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|227370862|ref|ZP_03854358.1| ribosomal protein S20 [Chryseobacterium gleum ATCC 35910] # 1 83 1 83 84 112 68 3e-24 MANHKSSKKRIRQTLTKRAHNRLYHKTARNAVKALRNTTEKSAAEALLPKVTAMLDKLAK HNIVHKNKAANLKSAIQLHVNAL >gi|313159239|gb|AENZ01000015.1| GENE 17 13441 - 14478 1526 345 aa, chain - ## HITS:1 COG:MTH1396 KEGG:ns NR:ns ## COG: MTH1396 COG0611 # Protein_GI_number: 15679395 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Methanothermobacter thermautotrophicus # 9 341 5 321 327 144 34.0 3e-34 MEKKRRTEIAELGEFGLIDLLTSGFEPTNDSTLRAAGDDAAVILPPAGEAVLCSTDLLTE GVDFDLTYFPLKHLGYKAVTVAGSDILAMNARPSQLMVSLGVSAKMSVEALQELYEGIAF ACREQGIDLVGGDTKASVTGLVLSLTAAGHAPKERIVSRGGAQQNDLICISGNLGAAYMG LRLLEREKRVLADVENPEPKFGGYEYLLEKYLKPRLRRDIVDALAEEGIVPTSMIDLSDG 
LASDLLQICKASKCGARIYLDRIPIAKQTYALAEELHADPVVAALNGGEDHELLFTVPLA MQEQVMKMGCVDVIGHITPESTGAYLVTPDGSDIRLRAQGFAEKE >gi|313159239|gb|AENZ01000015.1| GENE 18 14572 - 15144 878 190 aa, chain + ## HITS:1 COG:MA1096 KEGG:ns NR:ns ## COG: MA1096 COG0563 # Protein_GI_number: 20089966 # Func_class: F Nucleotide transport and metabolism # Function: Adenylate kinase and related kinases # Organism: Methanosarcina acetivorans str.C2A # 2 190 1 214 215 147 37.0 1e-35 MINIVLFGAPGCGKGTQAQRLKAHYGIEHVSTGEVIRDEIRRGTELGRSMESYIKAGKLA PDQIVIGMIANYVAEHKHAKGCIFDGFPRTTVQAEEFDKILAGHGLQVDIMIDIHVPEEE LIQRILLRGKDSGRADDASEEVIRGRLDVYRQQTAVVSDYYAAQGKYASVNGTGTMDDVF ARITDVIAQL >gi|313159239|gb|AENZ01000015.1| GENE 19 15557 - 15673 148 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKSLLYATLLCAGFYAASCSDNEGGGKSLTPHPNIFG >gi|313159239|gb|AENZ01000015.1| GENE 20 15811 - 16719 1331 302 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159257|gb|EFR58626.1| ## NR: gi|313159257|gb|EFR58626.1| hypothetical protein HMPREF9720_2950 [Alistipes sp. 
HGB5] # 1 302 3 304 304 597 100.0 1e-169 MPQQPSVFYLGSYLKTPTDSILLTLCVEKDDYDNVKTITAAPEDPAKSLDVWQYCLTSAE ALRLGTFLGTKYKSPSSSGVFQTIDDTLNHIAQHGTAETLACTIFGVVPQAAYAVFQLDN GAFSARLMNSYLKLDYPAMRRWLGGDHAAFAAEYYVLGNKINAFGDLYVYFYYAKDAAGN TFTVDVHADKNGEKISEIDVYVSADRNDADKQLAIWKEYARDYADKGLGAFKEAYTTDSW GDRKETFADLAAAVAHVESNGRPGPFDGGIIVVFEADGVATNLVMNQQYIYLLIKDPNYN VE >gi|313159239|gb|AENZ01000015.1| GENE 21 16883 - 17227 393 114 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34540403|ref|NP_904882.1| 30S ribosomal protein S6 [Porphyromonas gingivalis W83] # 1 114 1 113 117 155 67 2e-37 MNNYETVFIVTPVLSDAQVQEVADKFQGVITENGGQIVNKESWGLRKLAYPIQKKTTGFY FLVEFTGEGMLINTLETQYRRDERIIRFLTFKQDKFAVEYSEKRRAKLSNKSEE >gi|313159239|gb|AENZ01000015.1| GENE 22 17231 - 17509 384 92 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|160885186|ref|ZP_02066189.1| hypothetical protein BACOVA_03184 [Bacteroides ovatus ATCC 8483] # 1 92 1 90 90 152 81 2e-36 MAQENKAQSEIRYLNPVSVDVKKKKYCRFKKLGIRYVDYKDGEFLKKFLNEQGKILPRRL TGTSQKFQKKVAQAVKRARHLAILPFVTDCMK >gi|313159239|gb|AENZ01000015.1| GENE 23 17528 - 17968 402 146 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|146301222|ref|YP_001195813.1| 50S ribosomal protein L9 [Flavobacterium johnsoniae UW101] # 1 144 1 144 146 159 53 1e-38 MEVILIKDMENLGYANDIVNVKPGYANNYLIPQGYAKAATASAKKVLAENLRQRAHKDAK ILADAQALAETIANLPLSLAVKAEEGKLFGTVTAADLAEALAAKGIELDRKVIVVDAIKT VGEYEAVAKLHKEVKAVIKFSVTAAE >gi|313159239|gb|AENZ01000015.1| GENE 24 18241 - 19161 1318 306 aa, chain - ## HITS:1 COG:BH0390 KEGG:ns NR:ns ## COG: BH0390 COG0697 # Protein_GI_number: 15612953 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 34 261 33 264 311 62 23.0 1e-09 MNRFKPHIALLICNIVWAMDYPFYNIVLPRYVHPMAMVSGSLIATALFSLVPLLWQKAEK VAKADVRKLIGAALLIGVLRKVFIMYGLSMTSPIDGSIIDTIVPLLVLLLSVLLGMDRFT KLKIAGLVLGMAGAVAVVLAGASSSHQHSHLWGNVMILLCACVTSLYMVWFKRLIAKYRI 
TTVLRWLYCVAAVVALPFGLKEIVHTDYAAIAKHALFPTLFVLTVPTYLPNLMLNYALKS VPATVSSIYTYLQPVLAIAISVGMGLDKLHADTVIFALVIFVGVGLVLRSYSVPPRHVDP PTAAPH >gi|313159239|gb|AENZ01000015.1| GENE 25 19401 - 19922 938 173 aa, chain + ## HITS:1 COG:AGl1433 KEGG:ns NR:ns ## COG: AGl1433 COG3685 # Protein_GI_number: 15890839 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 170 54 207 212 121 50.0 7e-28 MNTKKNKKTDSPHFREFFLDQIKDVYWAEKHLSSGLKKMRKAATSPKLAAAFEKHIEETA GQIERLKRVFELLGKTPQAKKCEAMEGLVSEAESMIEDTRKDSYTRDAGLILAAQKAEHY EIASYGTLKVFAEMMGETEIARELGMILKEEVSTDSLLSCLAEEDVNEMAVAE >gi|313159239|gb|AENZ01000015.1| GENE 26 20047 - 20700 737 217 aa, chain - ## HITS:1 COG:no KEGG:Sterm_1337 NR:ns ## KEGG: Sterm_1337 # Name: not_defined # Def: hypothetical protein # Organism: S.termitidis # Pathway: not_defined # 44 203 29 160 161 103 36.0 5e-21 MKKMIVVLLSAFALLSGGDACFAQVSAARHGGVHPGRDRRPEGYSKDSWTVYYRGLKVEG ASASSFVDLGDGYGKDNWKVFYCGEEVKGASASSFESLGKGRGRDNWNTYLYGERQRANA KMTRTLGGGYSKDSWTVYYREREVEGAAAGSFVSLGGGYGKDAWTVFFQGRKVGGASASS FENIGKGYGKDAWSVYYRGEKIDGASPATFEVPSQRR Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:33:03 2011 Seq name: gi|313159222|gb|AENZ01000016.1| Alistipes sp. HGB5 contig00028, whole genome shotgun sequence Length of sequence - 15365 bp Number of predicted genes - 15, with homology - 14 Number of transcription units - 11, operones - 3 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 67 - 915 1257 ## COG0648 Endonuclease IV - Prom 973 - 1032 3.8 + Prom 925 - 984 6.9 2 2 Op 1 . + CDS 1011 - 2255 1755 ## Weevi_0423 hypothetical protein 3 2 Op 2 . + CDS 2299 - 3936 2269 ## COG0793 Periplasmic protease 4 2 Op 3 . + CDS 3929 - 4153 251 ## gi|313159234|gb|EFR58604.1| hypothetical protein HMPREF9720_1158 + Term 4208 - 4248 13.4 + Prom 4261 - 4320 3.0 5 3 Tu 1 . 
+ CDS 4555 - 4767 68 ## gi|313159224|gb|EFR58594.1| conserved domain protein - Term 4698 - 4735 9.4 6 4 Tu 1 . - CDS 4761 - 5639 1472 ## COG0274 Deoxyribose-phosphate aldolase - Prom 5663 - 5722 4.2 + Prom 5381 - 5440 2.8 7 5 Tu 1 . + CDS 5538 - 5738 105 ## gi|313159228|gb|EFR58598.1| hypothetical protein HMPREF9720_1160 + Term 5824 - 5863 4.0 - Term 5719 - 5755 7.1 8 6 Tu 1 . - CDS 5778 - 7511 2837 ## BT_2844 hypothetical protein - Prom 7628 - 7687 3.5 9 7 Tu 1 . + CDS 7539 - 7655 81 ## 10 8 Op 1 26/0.000 - CDS 7721 - 9955 3912 ## COG1185 Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) - Term 9992 - 10034 7.7 11 8 Op 2 . - CDS 10050 - 10322 257 ## PROTEIN SUPPORTED gi|237672639|ref|ZP_04532607.1| SSU ribosomal protein S15P - Prom 10430 - 10489 3.8 12 9 Op 1 . + CDS 10717 - 11727 1443 ## COG1477 Membrane-associated lipoprotein involved in thiamine biosynthesis 13 9 Op 2 . + CDS 11736 - 12602 1327 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase + Prom 12657 - 12716 3.1 14 10 Tu 1 . + CDS 12740 - 13663 1347 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 13680 - 13716 6.6 + Prom 13752 - 13811 3.3 15 11 Tu 1 . 
+ CDS 13831 - 15291 1500 ## Aasi_1729 hypothetical protein Predicted protein(s) >gi|313159222|gb|AENZ01000016.1| GENE 1 67 - 915 1257 282 aa, chain - ## HITS:1 COG:STM2203 KEGG:ns NR:ns ## COG: STM2203 COG0648 # Protein_GI_number: 16765533 # Func_class: L Replication, recombination and repair # Function: Endonuclease IV # Organism: Salmonella typhimurium LT2 # 1 277 1 277 285 425 76.0 1e-119 MKYIGAHVSASGGVENAPANAHKIGATAFALFTKNQRQWVAAPLTAAQIDSFRKVCDLYG YLPAQILPHDSYLINLGHPERDGLEKSRAAFLDEMQRCEQLGLDRLNFHPGSHLQKIAPE ESLDRIAESINIALDKTHGVTAVIENTAGQGSNLGFAFEQLAYLIDRVEDKSRVGVCIDT CHAFAAGYDLRTAEACERTFAELDSIVGFEYLKGMHLNDAMKILGSRVDRHTPLGEGMIG MECFRYIMQDARFDGIPLILETPDEERWPEEIAMLKAFAEEK >gi|313159222|gb|AENZ01000016.1| GENE 2 1011 - 2255 1755 414 aa, chain + ## HITS:1 COG:no KEGG:Weevi_0423 NR:ns ## KEGG: Weevi_0423 # Name: not_defined # Def: hypothetical protein # Organism: W.virosa # Pathway: not_defined # 52 414 15 303 345 68 23.0 4e-10 MKPFYLILTLIPALLIGRTVSAQISVDEVDAGEESVTFQDKLKSIPVDVDYFSRARYKAE RAAIRKERNYLEFNSGVQGALTSYNDPWISVSGGDNSIALTAVIGLRHVFTKNLFTIETK FSAKLGYNRMKVETQRLDADGNPMVDDSGNKIMDSEGVWFKNQDEFVVSVAPSFKMSENW SYGSILNFRSQFVNGYKSRTEQKKEHLKSKFMTPGYLDISLGITYKSPKPKFPIVINLSP IALNATFAENDWIRRRQVEERVDSEGKKTETEIKPAYNYGIEDPDKTSKYEGGSSIQIDF DRTFGKTGFLRYRTTLYSFYGWITDIGQKNKISDYTKFRHAYEEWDKTANKDIKDKPRLP IHPIARWENTLDIKATKYLSTTLSFQLYYNRAQNVDVQTRTLLSVGLTYTFKNK >gi|313159222|gb|AENZ01000016.1| GENE 3 2299 - 3936 2269 545 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 52 380 39 364 408 221 40.0 4e-57 MYKNSKYTILFPLLLAAGVVLGLLLGQYMGRNSTTSQIKGMLRQMALPTNKLTYTLSLIE NQYVDSVAMDSLAEHVIPLLVKELDPHSVYIPAAEMQQLNEPLEGEFDGIGVVFNMATDT VIVLNVIPQGPSDKAGIKAGDRIIEINDTLVAGQKIPQRDVVKKLRGPRGTTVHLGLGRQ GIGELVGVDVVRDKIPIKSIESAFRIADGIGYIKLGQFARTTSAEMEQALASLRAEGVTK LIFDLRGNSGGYLDQAILVANEFLHKEQLIVYTEDRHHQQQREYADGTGSAQDMDVVVLI 
DEGSASSSEILAGALQDNDRGTIVGRRSFGKGLVQRQIPYSDGSALRLTTARYYTPTGRS IQKPYTIGDDESYEEDIWNRYRNNEFFSADSIHFADSLKRTTPGGKVVYGGGGIMPDVFV PADTTDVTKYFVEVAGRNILYRYTIEYADRHREALNAVKTIPELQALLDGDRGLVEDFIR YAARQGVAPNYRDIARSRKLIEAQLRAYIGRNTALEDNGFYANIYPVDNVIVRAVGILNE TREND >gi|313159222|gb|AENZ01000016.1| GENE 4 3929 - 4153 251 74 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159234|gb|EFR58604.1| ## NR: gi|313159234|gb|EFR58604.1| hypothetical protein HMPREF9720_1158 [Alistipes sp. HGB5] # 1 74 1 74 74 87 100.0 2e-16 MIKKLIGIIVATAVIVIIVIAAIRRDNFQSMVLRDEIEDQTYPEPVASQQPATPVPVETE TTDSISVEVIDSLR >gi|313159222|gb|AENZ01000016.1| GENE 5 4555 - 4767 68 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159224|gb|EFR58594.1| ## NR: gi|313159224|gb|EFR58594.1| conserved domain protein [Alistipes sp. HGB5] # 1 70 1 70 70 122 100.0 7e-27 MPTIQGRGYCKAAAFFRQSGCPAKQVARQTVWFSGRVFRTTGFSAGRRPNEKQQSESGLL HTSIALAGDQ >gi|313159222|gb|AENZ01000016.1| GENE 6 4761 - 5639 1472 292 aa, chain - ## HITS:1 COG:AGl76 KEGG:ns NR:ns ## COG: AGl76 COG0274 # Protein_GI_number: 15890144 # Func_class: F Nucleotide transport and metabolism # Function: Deoxyribose-phosphate aldolase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 27 285 64 329 345 154 35.0 2e-37 MEYANHLNEYAPAWSEAQVAEEVARIREAAKRNHNAEVYKMCYSAIDITTLSCNDSVTSV TEFARKTAEFYQKFPHIPNVASICIYPAFVETVGLAVDGTPMKITSVGGGFPAAQTFLEV KALEVAMAIENGADEVDIVLNVGRMLTGEYDEAANEVEVIRSEMDADVVLKVIIESGALK TPDLIRKASLLSMFAGADFVKTSTGKIDVAATPEAAVVMCQAIRDYYQKTGRKVGFKAAG GVRTAEDGALYYTIVEEILGPEWLNTDMFRIGASSAANNLLSAIEGREIKYY >gi|313159222|gb|AENZ01000016.1| GENE 7 5538 - 5738 105 66 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159228|gb|EFR58598.1| ## NR: gi|313159228|gb|EFR58598.1| hypothetical protein HMPREF9720_1160 [Alistipes sp. 
HGB5] # 29 66 1 38 38 75 100.0 2e-12 MIALRGFADAGDLFGHLRFAPCRSVFIQMIGVLHNVSLFFTKLGKIIHTRYLRPANIFPV SCKKRA >gi|313159222|gb|AENZ01000016.1| GENE 8 5778 - 7511 2837 577 aa, chain - ## HITS:1 COG:no KEGG:BT_2844 NR:ns ## KEGG: BT_2844 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 18 572 8 539 553 245 36.0 5e-63 MKNLMKLSVVLVAAALAFSSCNCFKKMAKNRDDINLTVTPEILTLNNGIVAADINVTFPV KYFNAKAVIKVTPVIVFEGGEVAGAAKYLQGSKVDENYTVVDKKNGGSYTQHVEFPYDPR MDKCALQLRAEIKCPGGKCKEFTLVNLNTGAIPTKEQAAVLAGNDEAAKAALAKEFGLTV AYGLNTLQKDLKYSDLMAPMANNYKKVHTVVDKTDLLYAINSSVVTKKNEKKANLDAFKA DVDAKLQNDRATQNIAVKGYASPDGPVKFNDKLSKARSESGKKVVAKLLKDSGLDVDAAA YGEDWDGFKELVEKSDIKDKNLILQVLSLYNSSAERESEIKNMSSVFNELKEEILPELRR SQIVNSTDIQGKTDAEIMAAFKNGQDLTVEEYLYAAEALANGAEEQVAILTVASKKFNDA RVYNNLGIAQAKAGDKAAALKSFEKAAKTDSSSEINKNLILANLANGNTAEAKKYAQAAD AQAKAAIAAAEGDYKSAAKNLDGYNEAVAQVMNNDLSAAKKAIAKDNSADADYLRAVIAA KEGDLKTAEAQLKSAVSKNPKLAQKAANDINLKALNK >gi|313159222|gb|AENZ01000016.1| GENE 9 7539 - 7655 81 38 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFAFFLFLSRKFRFARRRLGVRISEFNPLNNANIENSW >gi|313159222|gb|AENZ01000016.1| GENE 10 7721 - 9955 3912 744 aa, chain - ## HITS:1 COG:DR2063 KEGG:ns NR:ns ## COG: DR2063 COG1185 # Protein_GI_number: 15807057 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Polyribonucleotide nucleotidyltransferase (polynucleotide phosphorylase) # Organism: Deinococcus radiodurans # 11 743 35 769 810 580 44.0 1e-165 MEDKKLYNAVRKIITLADGRQIEIETGKLAKQADGSVVVKQGDTMLLATVVAAKDAKPDT DFMPLQVEYKEKYASCGRYPGGFMKREGRANDSEILVARLIDRALRPLFPADYHAEVYVT VNLISADKDIQPDALAGLAASAALAVSDIPFGGPISEVRVARLDGEYVINPKFSEMDRVD LDIMVGGTIDNILMVEGEMKEVSEEVMLGAIKFAHEEIKKHCAVQIELSKELGKDVKRTY CHEENDEELRQLIVKELYDKAYAIATSGTMKHEREDMFNALEAEFAARYSEEELAEKAML IHKYFHDDVQKKAMRNMILDEGLRLDGRKTDEIRPIWCEVDYLPAAHGSAIFTRGETQSL TSVTLGTKLDEKQKDEVLVQGTEQFVLHYNFPPFSTGEAKAARGLSRREIGHGHLAWRAL KPMVPLGEENPYAVRVVSDILESNGSSSMATVCAGTLALMDAGVKLKKPVSGIAMGLISD 
SQSGKWAVLSDILGDEDHLGDMDFKVTGTKDGITATQMDIKVDGLSYEVLAAALEQARKG RLHILGKLTECIAEPRADYKPFVPRIVQITIPQDMIGAVIGPGGKVIQDIQKTTGTTITI TEVDNKGIVDIFGQDKAALDGALNRIKAIVAIPEVGETYHGKIRSIVTFGAFVEIMPGKD ALLHISEIDYKRFETMEETGLKEGDEIDVKLIGIDPKTNKFKLSRKVLLPKPEGYVERER RPREDRPDRGERRERRDRPGRKEE >gi|313159222|gb|AENZ01000016.1| GENE 11 10050 - 10322 257 90 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237672639|ref|ZP_04532607.1| SSU ribosomal protein S15P [Brachybacterium faecium DSM 4810] # 1 90 1 89 89 103 55 7e-22 MAYLTAEKKQELFGQYGKSNTDTGSPESQIALFSYRISHLTEHLKSNKHDFGTQRSLLRL VGKRRALLEYLKEVDIERYRAIVKTLNLRK >gi|313159222|gb|AENZ01000016.1| GENE 12 10717 - 11727 1443 336 aa, chain + ## HITS:1 COG:YPO3234 KEGG:ns NR:ns ## COG: YPO3234 COG1477 # Protein_GI_number: 16123393 # Func_class: H Coenzyme transport and metabolism # Function: Membrane-associated lipoprotein involved in thiamine biosynthesis # Organism: Yersinia pestis # 1 331 1 333 340 174 34.0 2e-43 MAKKLIRKVWIAVSAAALWGCGGASSYTAVDGVMLGTTLHITADVQGVSPQELYAAVMEL DREAKASMSIFDPGSLLSRLNRNETDSVDRHIAFNLHLADSIGALSGGRYDVTVKPLVEA WGFAGRQAERHPDVDSILAFVGREKVRVEEGRLVKADPRVQLDFNSIAKGYTVDLLARLV ESFGARNYIVDIGGEVRCKGVNRQGGPWRIGIETPFDGNMSDGEYVQKRIRLTDGGLATS GNYRRFYLDADGNKVAHTIDPRTGRSAVSRLLSVTVAAPTCAEADALGTMFLAMGADDAL KAVRTMPDVKAYFILADGTDGYEEYISPAMEAMIMQ >gi|313159222|gb|AENZ01000016.1| GENE 13 11736 - 12602 1327 288 aa, chain + ## HITS:1 COG:CAC0679 KEGG:ns NR:ns ## COG: CAC0679 COG1597 # Protein_GI_number: 15893967 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Clostridium acetobutylicum # 1 282 1 289 295 192 37.0 4e-49 MKSAVFLYNTQSGKCKIERCTEAVCTVFRAYGYDIKPQLIDFCANPFDGNEQIDLMVVAG GDGTVNYVVNAMKNKGLDIPLGVIPAGTANDFAGALGMSHQPLEAARQIASGAVDRVDCG CVNGLYFVNIFSFGIFTTTSQRTPDQRKHKIGKLAYLIEGVKELRAMHAVPLKVVADGQA FDFNSLMVLVFNGETAGGFRLARRSSIKDGLFDCIMLEKKNFLRSTLAMGRYLLRGNPKI VRHLQVRSLDIVSTVNEPTDVDGQKGAEFPLHIECIAGGLRIMCPRGE 
>gi|313159222|gb|AENZ01000016.1| GENE 14 12740 - 13663 1347 307 aa, chain + ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 196 307 177 287 288 80 34.0 4e-15 MADNIKATRTTREGGRENLRKVSISRIKKEMSDVFYLSDDLVITTLDAQNNTTSDYPASI DGFSAIIMMTGEATVSIDMQNYNVKPNTIVFFNPDSIIRTVKCSSNAAAYFLAFSKSFVN EIQIDLSTSLPVYMRFGKAPVLEVTPQDVDQIRQLFQLIKTMLRSDKERYRHEIIRTLFT TAFYIITEINQREQPGEIKQGRCEVLFDEFMSLLQQYNKRERNVSFYAKQLNITPKYLSS VVKEVSGKTAARWIDESVILEAKALLKYSGMSIQEIAYHLNFSTQSFFGKYFKQHTGTSP SRYKRKG >gi|313159222|gb|AENZ01000016.1| GENE 15 13831 - 15291 1500 486 aa, chain + ## HITS:1 COG:no KEGG:Aasi_1729 NR:ns ## KEGG: Aasi_1729 # Name: not_defined # Def: hypothetical protein # Organism: A.asiaticus # Pathway: not_defined # 8 466 3 400 404 146 26.0 2e-33 MKIPKATVTAVLYKSKTLASGEHPVMIRVCYNSKRRYKSTGLSCPAKWWNAAKQEVRERH PLAPNMNAIIGSELTALKNKVLDFERQGVPYSVQRIFEASVRKPPSRKTLYDLFEERIAY FRDTLQKHNTATGYQTLLHIVERFSQHRTVELFDVDGAWLGEFEEYLHAHYADTSIKRFF SALKALMNYACQNGLLDANPFDRFRLSRRLDVRTAKRALETDEFDSLIRYYLDTYYYKTR KRPDPRTMKRRCWRAPCHGCRGEERQASYDAEQFALSMFICSYIFQGLALVDLARLRWKD LVCVEIPDREKYDRDCAAYGPRYAEAHKETVAFYEINFVRAKTLHPIRILVEQRVAWPYM KPFARTAKGGAGDDFVFPIYFDDDPQRRFERITYANNVINQGLQRAAKRIGLSRKLTFYS ARHTYASRLYHADVPLPLIAQNMGRNPAEIETYLKEFDTDRIISANKRVWQIPDPARKDP KTGAGL Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:33:43 2011 Seq name: gi|313159221|gb|AENZ01000017.1| Alistipes sp. HGB5 contig00060, whole genome shotgun sequence Length of sequence - 559 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 169 195 ## - Prom 374 - 433 5.4 Predicted protein(s) >gi|313159221|gb|AENZ01000017.1| GENE 1 1 - 169 195 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNQKFEIAVAGYDYRFKTYARDGVEASVKVKCFLGRPDAECTILIPTLNDDLRET Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:33:49 2011 Seq name: gi|313159217|gb|AENZ01000018.1| Alistipes sp. HGB5 contig00065, whole genome shotgun sequence Length of sequence - 3094 bp Number of predicted genes - 3, with homology - 3 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 417 305 ## Bacsa_3150 hypothetical protein - Prom 615 - 674 3.6 + Prom 718 - 777 1.9 2 2 Tu 1 . + CDS 873 - 1391 125 ## gi|291514412|emb|CBK63622.1| Protein of unknown function (DUF3408) - Term 1768 - 1810 0.7 3 3 Tu 1 . - CDS 1839 - 2846 349 ## COG3547 Transposase and inactivated derivatives - Prom 3001 - 3060 2.9 Predicted protein(s) >gi|313159217|gb|AENZ01000018.1| GENE 1 3 - 417 305 138 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_3150 NR:ns ## KEGG: Bacsa_3150 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 4 138 2 139 140 80 40.0 3e-14 MNTDPKQTNRNARKSETPSYKYSFWFNEEHIRFKKLLCKSGLEHNRSQFIVKRIFGEEFV VIKRDPSKTQFIARLNDFYFQFQKLGNNYNQIVKAVNSHFSNVAIPHQIAMLEQRTRELK ALSIEILSFAKHTVEWLQ >gi|313159217|gb|AENZ01000018.1| GENE 2 873 - 1391 125 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514412|emb|CBK63622.1| ## NR: gi|291514412|emb|CBK63622.1| Protein of unknown function (DUF3408) [Alistipes shahii WAL 8301] hypothetical protein HMPREF9720_2331 [Alistipes sp. 
HGB5] # 1 172 1 172 172 305 100.0 8e-82 MDSLKEKDNPAPNPAKRPRIEVDEELMRQMIAGQAPLDSEVVRRIPEPEEEDTNALEENT SETVSGASAPTAEKTGIDSTASTVKEHSGFRRKKLTLPDFERTFFAPVDCRNRSAIYVST RTKHKVSEILHLLGNESTRLTALVDNMLRFVMDIYSGELNYLHEKKNNRRPF >gi|313159217|gb|AENZ01000018.1| GENE 3 1839 - 2846 349 335 aa, chain - ## HITS:1 COG:NMB1750 KEGG:ns NR:ns ## COG: NMB1750 COG3547 # Protein_GI_number: 15677594 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Neisseria meningitidis MC58 # 6 330 2 310 316 102 27.0 8e-22 MHYLYYVGLDVSKETFDASLVAFEDANEMAHRKFANSRKGICSCLHWVEKRHGIRLDDVI FCAEDMGSYISEMAVCASDRTLNFNFSLISPLVIKYSMGIARGKTDRVDARRIAEYAITH YRKIALYLPAEKELCQLRTWLILRAHLAKQRVAKLVLLEKLDYKEKFADVSIQRSMLQEE IAYAETHMKTIEREMKELIAADSNICRNYKLLTSIKGVGPITASVMLCSTLNFTKITDHR KFACYCGLAPFEHSSGTSVRGGCHTSSMANRDIKVQLNRSALIAIRCDPQLKAYYERKVA EGKHKFSVLNAVRAKIAARCFAVVRRGTPYVALQI Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:34:19 2011 Seq name: gi|313159169|gb|AENZ01000019.1| Alistipes sp. HGB5 contig00021, whole genome shotgun sequence Length of sequence - 56702 bp Number of predicted genes - 46, with homology - 43 Number of transcription units - 26, operones - 9 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 36 - 71 6.5 1 1 Tu 1 . - CDS 122 - 1957 2284 ## PROTEIN SUPPORTED gi|237711154|ref|ZP_04541635.1| 30S ribosomal protein S1 - Prom 2049 - 2108 1.9 - Term 2050 - 2082 -0.9 2 2 Tu 1 . - CDS 2292 - 3716 1706 ## COG0737 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases - Prom 3750 - 3809 2.9 + Prom 3708 - 3767 4.9 3 3 Tu 1 . + CDS 3787 - 5457 698 ## PROTEIN SUPPORTED gi|39938628|ref|NP_950394.1| ribosomal protein L13 4 4 Tu 1 . - CDS 5448 - 5585 56 ## - Prom 5606 - 5665 1.8 5 5 Tu 1 . + CDS 5599 - 8424 4434 ## COG1003 Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain 6 6 Op 1 . 
+ CDS 9022 - 9561 172 ## PROTEIN SUPPORTED gi|126666687|ref|ZP_01737664.1| Ribosomal protein S2 7 6 Op 2 . + CDS 9620 - 10291 559 ## Ping_0030 pentapeptide repeat-containing protein 8 6 Op 3 . + CDS 10267 - 11613 1241 ## COG0515 Serine/threonine protein kinase 9 6 Op 4 . + CDS 11654 - 13384 2270 ## Bacsa_2872 hypothetical protein 10 6 Op 5 . + CDS 13392 - 14249 1174 ## Slin_4126 hypothetical protein 11 6 Op 6 . + CDS 14252 - 15274 1308 ## CLB_1682 hypothetical protein 12 6 Op 7 . + CDS 15297 - 16229 740 ## PROTEIN SUPPORTED gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 13 6 Op 8 . + CDS 16267 - 17190 1384 ## COG1234 Metal-dependent hydrolases of the beta-lactamase superfamily III 14 6 Op 9 . + CDS 17187 - 17942 1093 ## COG0566 rRNA methylases + Term 17964 - 18001 7.8 - Term 18148 - 18195 1.1 15 7 Tu 1 . - CDS 18252 - 18614 463 ## - Term 18693 - 18730 2.2 16 8 Tu 1 . - CDS 18731 - 18835 61 ## - Prom 18894 - 18953 2.9 + Prom 18887 - 18946 5.2 17 9 Tu 1 . + CDS 18972 - 20066 792 ## PROTEIN SUPPORTED gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 18 10 Op 1 16/0.000 + CDS 20289 - 22241 3055 ## COG0441 Threonyl-tRNA synthetase 19 10 Op 2 . + CDS 22527 - 23003 362 ## PROTEIN SUPPORTED gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 + Term 23044 - 23097 12.4 20 11 Op 1 . + CDS 23358 - 24323 1267 ## Sph21_3906 hypothetical protein 21 11 Op 2 . + CDS 24377 - 25030 902 ## COG1083 CMP-N-acetylneuraminic acid synthetase + Term 25064 - 25110 7.1 - Term 25059 - 25090 1.1 22 12 Tu 1 . - CDS 25113 - 25997 1288 ## Bacsa_0071 hypothetical protein - Prom 26130 - 26189 6.5 + Prom 26105 - 26164 3.8 23 13 Op 1 . + CDS 26184 - 27113 1285 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 24 13 Op 2 . + CDS 27131 - 28027 1005 ## COG3509 Poly(3-hydroxybutyrate) depolymerase + Term 28088 - 28123 6.7 - Term 28072 - 28115 11.0 25 14 Tu 1 . 
- CDS 28170 - 30767 1883 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 - Prom 30957 - 31016 3.4 26 15 Tu 1 . + CDS 31107 - 33356 1904 ## COG2812 DNA polymerase III, gamma/tau subunits 27 16 Op 1 . + CDS 33488 - 34786 646 ## PROTEIN SUPPORTED gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 28 16 Op 2 . + CDS 34788 - 35141 546 ## COG1539 Dihydroneopterin aldolase + Term 35144 - 35180 9.4 29 17 Op 1 . + CDS 35194 - 36543 2061 ## COG0733 Na+-dependent transporters of the SNF family 30 17 Op 2 . + CDS 36595 - 38019 1766 ## Odosp_3161 metallophosphoesterase + Term 38237 - 38270 5.4 - Term 38219 - 38264 12.3 31 18 Op 1 . - CDS 38286 - 38816 756 ## COG0054 Riboflavin synthase beta-chain 32 18 Op 2 . - CDS 38825 - 39508 979 ## Cpin_5454 hypothetical protein - Prom 39544 - 39603 4.2 + Prom 39556 - 39615 2.7 33 19 Op 1 . + CDS 39700 - 41046 1480 ## COG1195 Recombinational DNA repair ATPase (RecF pathway) 34 19 Op 2 . + CDS 41043 - 41333 440 ## BVU_2621 hypothetical protein - Term 41611 - 41645 3.5 35 20 Tu 1 . - CDS 41701 - 42900 1462 ## BVU_0414 major outer membrane protein OmpA - Prom 42923 - 42982 8.3 - Term 42982 - 43036 7.7 36 21 Tu 1 . - CDS 43058 - 45613 4034 ## COG0209 Ribonucleotide reductase, alpha subunit - Prom 45707 - 45766 5.0 - Term 45782 - 45817 5.3 37 22 Tu 1 . - CDS 45840 - 46214 688 ## COG0251 Putative translation initiation inhibitor, yjgF family - Prom 46234 - 46293 2.8 + Prom 46198 - 46257 3.8 38 23 Tu 1 . + CDS 46292 - 46720 656 ## CJA_2990 putative lipoprotein + Term 46739 - 46789 11.2 - Term 46732 - 46770 6.6 39 24 Tu 1 . - CDS 46780 - 47607 649 ## gi|313159215|gb|EFR58588.1| hypothetical protein HMPREF9720_0975 - Prom 47850 - 47909 4.1 - Term 47874 - 47914 7.2 40 25 Op 1 . - CDS 47937 - 48935 1717 ## COG0039 Malate/lactate dehydrogenases - Term 48943 - 48973 2.3 41 25 Op 2 . - CDS 48979 - 50091 1479 ## COG1409 Predicted phosphohydrolases 42 25 Op 3 . 
- CDS 50123 - 50719 858 ## COG0353 Recombinational DNA repair protein (RecF pathway) 43 25 Op 4 . - CDS 50720 - 51538 1262 ## PRU_0761 putative BatD/BatE protein 44 25 Op 5 . - CDS 51547 - 53403 2855 ## BT_0904 hypothetical protein 45 25 Op 6 . - CDS 53480 - 55093 456 ## PROTEIN SUPPORTED gi|169634422|ref|YP_001708158.1| fumarate hydratase + Prom 55275 - 55334 8.9 46 26 Tu 1 . + CDS 55387 - 56379 1695 ## COG0191 Fructose/tagatose bisphosphate aldolase + Term 56399 - 56438 10.0 Predicted protein(s) >gi|313159169|gb|AENZ01000019.1| GENE 1 122 - 1957 2284 611 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237711154|ref|ZP_04541635.1| 30S ribosomal protein S1 [Bacteroides sp. 9_1_42FAA] # 1 593 1 594 598 884 74 0.0 MEETKTVVANENFDWNAFENDLGVYSQPKEEIAEAYDKTLSNVNVGEVVEGTVTGITKRE VLVNIGYKSEGVIPVSEFRYNPDLAVGDKIEVYVESAEDKNGQLALSHKKARQLKSWDRV NEALEKDEIIKGYIKCRTKGGMIVDVFGIEAFLPGSQIDVKPIRDYDVYVDKTMEFKVVK INQEFRNVVVSHKALIEAELEAQKQVIMSKLEKGQILEGTVKNITSYGVFVDLGGVDGLI HITDLSWGRVNHPEEIVALDQKIQVVILDFDDAKKRIALGLKQLTAHPWEALDQNLKVGD KVKGRVVVMADYGAFVEIAPGVEGLIHVSEMSWSQHLRSAQEFMKVGDEVEAVILTLDRE ERKMSLGIKQLTPDPWENIETKYPVGTKCTAKVRNFTNFGIFVEIEEGIDGLIHISDLSW TKKVKHPGEFTSVGADIEVVVLEIDKENRRLSLGHKQLEENPWNEFENQFSVDSIHEGTI TEMTDKGAVVALGENIEGFCPARQLTKEDGTTPKAGDKLDFKVIEFSKATKRITLSHVRT YEEAKRAEVAAEKAEKRAAADATKSTVKKINASVEKTTLGDIAGLAALKSAMEAAEAKNA KKAAAKEETEE >gi|313159169|gb|AENZ01000019.1| GENE 2 2292 - 3716 1706 474 aa, chain - ## HITS:1 COG:BH1015_1 KEGG:ns NR:ns ## COG: BH1015_1 COG0737 # Protein_GI_number: 15613578 # Func_class: F Nucleotide transport and metabolism # Function: 5'-nucleotidase/2',3'-cyclic phosphodiesterase and related esterases # Organism: Bacillus halodurans # 7 444 23 470 707 152 28.0 1e-36 MERILRTVLIVTAALAAACAPRERTLVLLSTNDMHAKIRNFPRLAAAVEACRDTAQLVVL VDAGDRWTGNAYVDRAATPGMPMIALMNRLGYDVATLGNHEFDHGQAFLGRMIDSMDFEV VCANVVSDTCSFPPLPPYTVLEEGGFRIGFVGVVTNYEGPGHPAGNASSFAGLTFPDPQQ MALKYAAELRPKCDVLVLVSHMGDDRDRELLAGGASQYDVVIGGHTHEEVDTLIGGTLLT 
QTGKDLRNIGVTTIRMKGRKVAGVDFRLVPLAGYEPDPVFQKQVDACYANPELNRPMGEF GNAANKWGLANWMAGAVADGIDAEVGFYHIGGVRLDSIPAGGVSAASVYGLEPFGTLVAE MKMTPADMRRMIVSKYNDPVNVKEAHRIDLISTTPYVIVTDQADNALDVEFPKLREGKVY TVAVSDYVYKNYNDLNYTDGKITEEDVTGLLLEELEEDSPLRIDNTPRQRVRRK >gi|313159169|gb|AENZ01000019.1| GENE 3 3787 - 5457 698 556 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|39938628|ref|NP_950394.1| ribosomal protein L13 [Onion yellows phytoplasma OY-M] # 16 556 11 546 546 273 31 2e-72 MADSIKVICDNLGGPIRVAMGTPLSDVAAQLTPGRYPFLAAFVNNRIKELNYKIYTPVTV RFVDITDFAGIRVYQRTSWFILQKAARTLFPGHTLHIRHSMGQSGFYCELEGLDEFTHEQ AAALEGHMREVVAQNLPIERTKVLTSELRAIYAEQGFDDKTALLDTRPRLYSDLYTLDGT AGYFYGALAPSTGYIDRFCIEPYYKGFYLALPLRTNPGVLNKNVQQEKMFGIFQEYQSWV RIMGVPTVGDVNSKVLAGDAGGMIKLAEAFHERKFAWVADTIYDANLSRGVRIVLISGPS SSGKTTSAKRLGIQLGVLGLKPVLISLDDYFVDREKTPKDADGEYDYEALEAIDLELFND HLRRLMRGESVDIPRYNFITGRRMQHNDPLTLDERSILIVEGIHGLNPRLTPGVPEEQKF RIYISCFTSVAMDNLSRIATTDNRLLRRLTRDYRQRGADALATLSRWASVRRGEEKHIFP YQENADVMLNSSLFYEISVLRPFAEKILREVPDTVPEYDEARRMLKFLDNFIPIPPDEIP PTSILREFIGGSSFQY >gi|313159169|gb|AENZ01000019.1| GENE 4 5448 - 5585 56 45 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLWGLGAAVRRNYRFGAGSEPGGRLLSCMVCVRVTPGRADDSQY >gi|313159169|gb|AENZ01000019.1| GENE 5 5599 - 8424 4434 941 aa, chain + ## HITS:1 COG:all4607_2 KEGG:ns NR:ns ## COG: all4607_2 COG1003 # Protein_GI_number: 17232099 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system protein P (pyridoxal-binding), C-terminal domain # Organism: Nostoc sp. 
PCC 7120 # 461 937 5 487 496 574 61.0 1e-163 MFDKFSERHIGVSNEKELKAMLETIGVKSVDELISQVIPHSIRLKKPLALPAEGMSEYEF AGHIRALAERNRCLRSFIGMGYYPCAVPAAVTRNVFENPAWYTSYTPYQAEISQGRLEAL LNFQTAVISLTGMEIGNCSLLDEATAAAEAMLMMFALRSREAVKEGRNQLFVDRNIFPQT LDVLLTRSEPFGIELIVDEYDEYSFTGKEFGAIVQYPAANGAVRDYADFTAAAHAKGALV TAVADLLALALLKAPGEWGADIAVGSTQRLGTPMGLGGPSAGYMTTREAFKRNMPGRIIG VSVDRLGNRALRMALQMREQHIKRERATSNICTASALMASMVGFYCVYNGPEGLKRAADT AHLAAATVAKALEAMDYKLAATAYFDTLEVAAEAAVVQSLALESGINFFYPTEGSVRMSF DEVTTPEEIAEVIRIFAAAKGKKAKAVKPVTESNVPAGLRRRSAYLTEPVFNAYRSESAL MRYIKQLELRDISLANSMISLGSCTMKLNAAALMQPLSLAGFQNMHPFAPADQAEGYMQL ITELENDLATITGFAASSLQPNSGAAGEYTGLMVIRAYHQSRGQGYRNVVLIPASAHGTN PASAAMAGMKIVTVACDANGNIDVEDLEAKAKEYSSELCGLMVTYPSTHGVFESRIREIV DAVHDAGGQVYMDGANMNAQVGLTNPGYIGADVCHLNLHKTFAMPHGGGGPGVGPICVAE HLKAFLPSHSVMATGGDEGITAVASAPWGSALLLPITYGYIKMLGEAGLRRATEMAIVNA NYMSAALASEFRTYYSGETGRVGHEMILDLTNFKKDYNIDCGDIAHRLMDYGFHAPTLSF PVHETLMVEPTESEPKAEMDRFIEALVSIKRECEAAVGQPDNVVVNAPHTAVEIAGEWPH PYTRQQAVFPLEWVRQAKFFPYVSKIDAGYGDRNLCCRNCE >gi|313159169|gb|AENZ01000019.1| GENE 6 9022 - 9561 172 179 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|126666687|ref|ZP_01737664.1| Ribosomal protein S2 [Marinobacter sp. 
ELB17] # 1 149 4 148 150 70 27 2e-11 MKVEQNKMVGVDYKLTVDGQIADQSRPGQPLEFIFGTGMLLPKFEEAILGKEVGEAVSFT LEPKDGYGELIADAVVDLPKNIFMVDGKLAEDILFVGSQVPMSDNQGNRMMGIVKEVGEE TVKMDFNHPMAGKTLNFDVEIVSVRDVTPEDLQGGCSCGECGDDCGGGCDHEKGHCDCH >gi|313159169|gb|AENZ01000019.1| GENE 7 9620 - 10291 559 223 aa, chain + ## HITS:1 COG:no KEGG:Ping_0030 NR:ns ## KEGG: Ping_0030 # Name: not_defined # Def: pentapeptide repeat-containing protein # Organism: P.ingrahamii # Pathway: not_defined # 17 105 193 281 359 65 34.0 2e-09 MTQTELLKMIGSGAVRDLDLTGRELKNIDFKGCRVENVTFDECTLTECNFDGCGMERVSF RKAVLRNCRFRRAKIAWSDFRYCEIERATFEEAEIRFCDLYRAMLTGIVIMRKARIGETS LYYAYFGEGVNIRRENIAGGRLLQQDLDAYRQFLIEWNTSGTGVRRNDRAEQSAWSPDAA LHAGGHAQRRPDDDRHPAHGYLRLHPRQQDTQPIKPCSHCANT >gi|313159169|gb|AENZ01000019.1| GENE 8 10267 - 11613 1241 448 aa, chain + ## HITS:1 COG:TVN0584 KEGG:ns NR:ns ## COG: TVN0584 COG0515 # Protein_GI_number: 13541415 # Func_class: R General function prediction only; T Signal transduction mechanisms; K Transcription; L Replication, recombination and repair # Function: Serine/threonine protein kinase # Organism: Thermoplasma volcanium # 32 230 140 349 401 100 29.0 5e-21 MFTLRQYLTTLADTHGLTRTLGEIEVCRDGKGRICYSAGNSAVVFRIRCEGRVRSLRCYM HHPRHLAEIYGEKLLPQELFIYTSPAGGVWVDVVLSDWIEGVTLHEAVAAAAEAGDTARL RRFAAAFDRMAAALTADDWAHGDLKPENIVADNWGRLHLIDFDAMFLPAFAGRHSPELGT AAFQHPARTVRDFDASLDDYPAALISTALHALALDPTLYARYSDADGLLFTPQKIGTDAA LCEVLALFERRGLAAQYRIARLLRSPSLRLPGLPQLLALAAETTETDEITGPEETKNAVN TATTGTTGTGETAGTTGPKRAMGAEETAGGNSGSAEGPAGDSADGTTEDPTDGAVAEAAE LFVENGLWGYRTPEQVVVPPLYDCGFDFTEGLAAVRLGATWHYIDGAGRTRISCPGCEAV KPFRNGRAPVVCGGRRLEIDREGREFDI >gi|313159169|gb|AENZ01000019.1| GENE 9 11654 - 13384 2270 576 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2872 NR:ns ## KEGG: Bacsa_2872 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 575 1 630 662 424 38.0 1e-117 MPQLRYMTPAEISAAESLGSSAEAWSQVRVSEDFTPFQLLQSHLEGTVEIGSGARIIRSR VCNYRIGEGALIEGVMALECRRRSTFGNGVGVATMNECGGRTVRIFDRMSAQVAYLMAVY 
RHRPQTVAALERMVEAYAEAGASEMGSVGRNTRIVGAKFIREVRIGDEVCIDGASMLENG TVCDGAHIGVDVKAYDFIAAEKAHIDNGSIVERCFVGESCRLDKAFTAAESLFFANSHCE NGEAASIFAGPYTVSHHKSSLLIAGMFSFFNAGSGSNQSNHLFKSGAVHQSVHLRGCKFA SGAYIMSPALEGAFTMIMGHHSYHHDTSAFPYSYLIEKEGRTTLMPGANLTSYGAVRDIE KWPARDRRERKRDVINFEEYNPYITEAMLRAVDTLHTLAEEDPDAPSYVYRKAVIRAAAL KRGIGLYNKFVVAALGAMLDRGESASRYDGSGRWLDVAGQYVTKREVEAILDAVDRGELT TPEEVDNRFRVFFVHYDDYAHSWAEGIYASLLGRVPTAAEIGDAIEAGRNAREAMRRTTD ADRERDCSLDMAVSYGLDSDDEREVRDDYYSVRGLK >gi|313159169|gb|AENZ01000019.1| GENE 10 13392 - 14249 1174 285 aa, chain + ## HITS:1 COG:no KEGG:Slin_4126 NR:ns ## KEGG: Slin_4126 # Name: not_defined # Def: hypothetical protein # Organism: S.linguale # Pathway: not_defined # 2 279 34 314 330 181 40.0 2e-44 MYTQSITRNHRTAFILAIDCSGSMAESILFRGRRLTKAEAVAGITNDLLFELVERARRSD GIRDYYDIAVIGYSGDDEVRSLLPDGEELVPVSALAAREMPVRTEVIEHRLPDGSIALRE IPAPSWIEPQAAGQTPMCEALRRVRDIAAEWTARAANAESFPPVVFNITDGEATDCDDEE LRAVCNQIKALETADGNVLLINIHIAAGDAGRHVFFPEADEANYTNRYAALLYDCSSEMP AVFNEAIREAKGPAAAPPFRGMSYNASAAQLVTMLNIGSISVKTE >gi|313159169|gb|AENZ01000019.1| GENE 11 14252 - 15274 1308 340 aa, chain + ## HITS:1 COG:no KEGG:CLB_1682 NR:ns ## KEGG: CLB_1682 # Name: not_defined # Def: hypothetical protein # Organism: C.botulinum_A_ATCC19397 # Pathway: not_defined # 227 319 73 164 337 68 40.0 5e-10 MTATIQHFTRALLTPDLSLATLSDARAVTDRNGMPRMVRTTRFAEAEIEWRGERWLAAMP LTPAALPRIERTASVLRRMNTEHLTQFRILPEEMRWTDALGNERRTGLVLQQLPGREFAE ALLTEDKTVLLAALDTLREALRELEFTHNNLRESNLRWHRGRIIPIRYYDARIGAAENGG ADAEAFEALKRRIAEAPMPRQSVNDVEAVYNPVRKLTGHRWTSHVFEGLVCVEDEGGFGF VDTDNNPVIPSQFLWAGDFHEGRAEVQTRTGMGLIDREGGYVIPPEYEIVDYDPAVSIVH VRHGGRWALFDYLGRRLTEFGREDMQEPVRQEPCRSEICR >gi|313159169|gb|AENZ01000019.1| GENE 12 15297 - 16229 740 310 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148988049|ref|ZP_01819512.1| 30S ribosomal protein S9 [Streptococcus pneumoniae SP6-BS73] # 8 305 5 303 306 289 49 2e-77 MEETIKCLIIGSGPAGYTAAIYTSRANLRPVLYEGIEPGGQLTTTTDVENFPGYPDGVSG QQMMADMRRQAERFGADIRIGTVTSADLSSRPFHVVIDGTTQLKAETLIIATGASAKYLG 
LPSETKFRGMGVSACATCDGFFYRKKDVAVVGGGDTACEEATYLASICRKVYMIVRKPHL RASKAMQERVFNTTNIEVMFNHNTAEVLGDESGVTGALLRCNDGKEVKIDIAGFFLAIGH HPNTELFADQLTLDAEGYIKTEAGTSKTNIEGVFAAGDVRDPHYRQAITAAASGCIAAID CERFILSRAE >gi|313159169|gb|AENZ01000019.1| GENE 13 16267 - 17190 1384 307 aa, chain + ## HITS:1 COG:alr5152 KEGG:ns NR:ns ## COG: alr5152 COG1234 # Protein_GI_number: 17232644 # Func_class: R General function prediction only # Function: Metal-dependent hydrolases of the beta-lactamase superfamily III # Organism: Nostoc sp. PCC 7120 # 5 307 3 310 322 175 34.0 1e-43 MSFSVTILGSSSAKPTLSRHPSAQAVNVHEQYYLVDAGEGVQQQLIRCGINPLKLRAVFI SHLHGDHVFGLFPLISTLALYGRKTPLRVFAPAPFGEILACHLRFFDSELPYPVVWTEVD TTKHALLLENRTLEVWSIPLRHRVPTAGFLFREKEPPLNVEKFKIAKYGLTVAQITAAKR GEEIALPTGEVIPNAELTYRPYAPRSYAYLSDTNFSAKAATLAKGADLMYHEATYAAAEQ KTAKERGHSTSADAAKAALKAGAKRLIIGHYSSRYKNENILVEEARAIFPETYPATEGVT FTIEKQR >gi|313159169|gb|AENZ01000019.1| GENE 14 17187 - 17942 1093 251 aa, chain + ## HITS:1 COG:VC0803 KEGG:ns NR:ns ## COG: VC0803 COG0566 # Protein_GI_number: 15640821 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Vibrio cholerae # 1 244 15 255 257 187 42.0 1e-47 MTKAEIQLVRALADKRGRTEHGLFVAEGEKLIGELRGSHLKVRKIFALEGVFAGPEVETV APRDMERLSLLKTAANSLAIVETPRYGLDMRSLAGRLTLALDDVQNPGNLGTIVRLADWF GIADIVCSEASADCFNPKVVQATMGAILRVRVHYTDLGAFLHDAGAAGLPVYGTFLEGEN IYGTTLTEEGVVVMGNEGRGISQAAARAVTHKLFIPPYPTDRRGSESLNVAMATGIVCSE FRRRAGMGSGK >gi|313159169|gb|AENZ01000019.1| GENE 15 18252 - 18614 463 120 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEPCAQKTTKKHNPELVDTVFRLMFEILWVAPYDRRRSNAALSGFERCSRETAVLLAATD LRSASPGELHTLLQAVDRLVQTIGRLESEALFSRWQCAEALAQVRRIAAIVQEHAAVAVG >gi|313159169|gb|AENZ01000019.1| GENE 16 18731 - 18835 61 34 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQPVRADDIFGLKRIGFFLTVENWTLSRNVRNKG >gi|313159169|gb|AENZ01000019.1| GENE 17 18972 - 20066 792 364 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163786851|ref|ZP_02181299.1| 50S ribosomal protein L32 [Flavobacteriales bacterium 
ALC-1] # 3 345 4 346 346 309 44 2e-83 MSEQKLKIGITQGDTNGIGWEVILKALADPRMTELFTPVVYGSPKAAAYYRNTLAQTEPV QFNAVTSARDARRGKVNLVACGDTEQIEPGKASAEAGRAAVEALRRATQELKEGLLDAIV TAPFNKESVQGDGFHFTGHTEFIGAELGGEPMMIMCSEILRVGLVTKHIPVSEISGSITG EKIVRDLHTLRRTLKEDFGIVEPRIAVMALNPHAGDGGLLGREEQEIIRPAVVEAFNQGV LAFGPFAADGLFAGGGYAKYDGILAMYHDQGLAPFKTLSPDGVNFTAGLSKVRTSPDHGT AYDIAGQDKADPQSMRSAIYAAIDIVEHRRAWAEWSRNPLQRAEREKGGRDVSVKDLPQT EKED >gi|313159169|gb|AENZ01000019.1| GENE 18 20289 - 22241 3055 650 aa, chain + ## HITS:1 COG:DR2081 KEGG:ns NR:ns ## COG: DR2081 COG0441 # Protein_GI_number: 15807075 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Threonyl-tRNA synthetase # Organism: Deinococcus radiodurans # 2 622 1 627 649 671 53.0 0 MIKITFPDGSVREYAEGTTAYQIAESISPRLAADSLAASVNGATVDMTRPIGEDASVKFY KWDDEEGKHAFWHTSSHLLAEALEALYPGIKFGIGPAIENGFYYDVDSPTPITEADLPKI EQKMQELSRNKEQLVRREVPKAEALKTFTEKGDQYKVELISDLQDGTISFYTNGAFTDLC RGPHIPNTGYIKALKLTSVAGAYWRGNEKNKMLTRIYGISFPKKSMLDEYLVMMEEAKKR DHRKLGKDLELFCFSQRVGQGLPLWLPKGAALRDRLEQFLRNVQKEYGYQQVITPHIGNK ELYVTSGHYAKYGKDSFQPIHTPIEGEEYLLKPMNCPHHCEIYRSKPRSYKELPVRLAEF GTVYRYEQSGELHGLTRVRGFTQDDAHLFVRPDQLLEEFERVIDIVLYIFKTLKFDNYTA QISLRDPNNKEKYIGSDENWEKAESAIMQAAEEKGLNTVVEYGEAAFYGPKLDFMVKDAI GRKWQLGTIQVDYNLPERFDLTYKGADDKLHRPIMIHRAPFGSMERFVAVLLEHTGGKFP LWLSPQQVVVLPISEKFNDYAQQVAEYLNRSDVRTEVDDRNEKIGRKIRDNELKRIPYLL IVGEKEEAEGLVSVRAQGEGDKGQMTLEGFRDFIAGLVKEEIEANKMEKK >gi|313159169|gb|AENZ01000019.1| GENE 19 22527 - 23003 362 158 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167856598|ref|ZP_02479300.1| 50S ribosomal protein L35 [Haemophilus parasuis 29755] # 1 154 3 157 159 144 50 1e-33 EVRVVGDNVEQPLVLPIREALKLADSMELDLIEISPKAEPPVCRIADYQKFLYQQKKKAK ELKANQVKVVIKEIRFGPQTDDHDYNFKLKHATNFLKEGCKVKAYVFFRGRSIVFKEQGE ILLLRFATDLEEVAKVEMMPKLDGKKMNMMLAPKTNKK >gi|313159169|gb|AENZ01000019.1| GENE 20 23358 - 24323 1267 321 aa, chain + ## HITS:1 COG:no KEGG:Sph21_3906 NR:ns ## KEGG: Sph21_3906 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # 
Pathway: not_defined # 2 321 4 328 328 343 51.0 4e-93 MITQDDIFGITDDAAFEAAALEVFRRQAAECPPYREYLALAGVRPEEVRAAREIPFLPIE IFKTHDVYCGGTEPEAVFTSSATTGMTPSRHPMRSLALYERTFRAAFRTFYGEPGQWSLY ALLPNYLRRKGSSLVYMADRLIADCGSGGFYLDDYDALLTAMQADPKPKILLGVSYALWD LAERYAPKLENTVIMETGGMKGYREEIPKEEFHKILCGAFGVGAIHSEYGMAELTSQAYS QGGNLFRCPPWMRVTTRDVNDPFDPQPAGTRGGLNIADLANWWSCAFIQTQDVGRVDARG AFVVEGRIDHSDIRGCNLLVQ >gi|313159169|gb|AENZ01000019.1| GENE 21 24377 - 25030 902 217 aa, chain + ## HITS:1 COG:slr2122 KEGG:ns NR:ns ## COG: slr2122 COG1083 # Protein_GI_number: 16330651 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-N-acetylneuraminic acid synthetase # Organism: Synechocystis # 1 209 4 217 225 129 34.0 4e-30 MKTAAFVPIRLNSQRVSGKNLRPLSGSPLMCHILRTLTEVEGIDEVYVYCSDERIREFLP EGVRFLRRSEELDRDTTLGREIYDSFTAEVEADLYVLAHATSPFIRAETVADALRKVVSG EYDSAFSAEKIQTFAWYEGRPLNYSPENIPRTQTIEPVYIETSAFFIFPRALWTGRHRRI GDRPYMAVVDRIEGLDIDYPEDFTMAEIIAASRNLPK >gi|313159169|gb|AENZ01000019.1| GENE 22 25113 - 25997 1288 294 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0071 NR:ns ## KEGG: Bacsa_0071 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 47 291 43 286 288 159 34.0 1e-37 MTAAEKLIAFIRNSFDTALYFAVMAVKENFRNYVGRAGTVGRPAQLMVILGNGPSLAGDL PRLIERREYETEDFLAVNFFAEDDRFEVVKPKYYVLSDPMFFRDSACRDRVRALYATLAR KVAWPMNLYVQYYNPEGFDYRAALPNSNIRIVRFHTQMYRGFRSLEFWLFRRGLGSANFG TVVQVGEYVALLLGYKRIELYGVDHTLLDGLCVDDGNRLCRIDRHYYDGAEAAAPQPIYK KVPHVPYTMADYLAEVAELFRGHEVLRDYAAALGARIVNRTRGSMIDAYERNPE >gi|313159169|gb|AENZ01000019.1| GENE 23 26184 - 27113 1285 309 aa, chain + ## HITS:1 COG:alr3070 KEGG:ns NR:ns ## COG: alr3070 COG0463 # Protein_GI_number: 17230562 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 6 242 4 240 318 114 31.0 3e-25 MASATISVIIPLYNKEREIGDTLRSVLAQTLPPAEIVVVDDGSTDRSAEIVRGIRSPLVK LVTQPNAGECAARNRAIAESTGDYIALLDADDTWEPGFLEEIAAMIAEFPGCGVYSTAFN IVSHDGRFPARTPSERGVVANFFRDSAHRYISIPSASAVPRAVFEAVGGFPEGMKIGGDM YMWIKIARRYAVCFSPKPLANYSKVASNRSASSYTPERTRYSFEELYDPSAPEEYNEFVA RAALGKALIISAKGGTKEAARAAEFFGYTKTYRRTLRKVRVLNALPRSWRAPLIGLYNSL AWRIARKGL >gi|313159169|gb|AENZ01000019.1| GENE 24 27131 - 28027 1005 298 aa, chain + ## HITS:1 COG:ML0715 KEGG:ns NR:ns ## COG: ML0715 COG3509 # Protein_GI_number: 15827302 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Poly(3-hydroxybutyrate) depolymerase # Organism: Mycobacterium leprae # 13 278 6 269 304 105 35.0 8e-23 MRHLLKSAAAKGFLLTALLLFAAASCRAQSAAPETPDYILRSGGMERTYKLHLPAGLPEN APLVVVLHGYGGNNNPDRFAMNATADRHGFAVCYPQGAKDGRGKSCWNVGYPFQADMTVD DVRFLTELVRHLQKKHGLSRHNVFCTGMSNGGEMCYQLAAQRPRLFAAVAPVSGLMLDWL YKADRSTAPVPLFEIHGTEDKTSAWLGDPQNKGGWGAYLPVPLAVHYWAAKNRCTVMQTD TLPAKAPGGRTVIAHRFTGGAGGSEVWLYEIVGGKHAWGEQDIDTGEELWKFFSRFVK >gi|313159169|gb|AENZ01000019.1| GENE 25 28170 - 30767 1883 865 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 1 858 1 804 815 729 46 0.0 MNINTLTIKAQEALQAALNLARERGQQAVEPLHLLAVLIREDDSLATFLLGRVGVNVRGL RDETQRAVGSLPRVEGGGEQFFAQETSKVIQRAVDFTKNFGDKYASVEHLLLGLVAERGQ AADILKRSGATEKELLEAIRIFRKGATVDSQTSEQQFDALGKYAVNLNEQARSGKLDPVI GRDEEIRRVLQILSRRTKNNPILVGEAGVGKTAIAEGIAHRIVDGDVPENLKSKVIYSLD MGALIAGAKYQGEFEERLKAVVQEVVASEGEILLFIDEIHTLVGAGKSSGAMDAANILKP ALARGDLRTIGATTLDEYQKYFEQDKALERRFQKVMVDEPTQEDAISILRGLKDRYENHH QVRIKDEAIVAAVELSTRYITSRFLPDKAIDLVDEAASRLRLEMNSVPEEIDTLDRRVRQ LEIEREAIRREKDKERVEQLTKEIEELKSRDAEMRAKWQGQRDLLKRIQENKDRIEQLKI EAQQAERQGDYGKVAEIRYGKIQEAEKEIAAFQEEYKLASANGSMIKEEVDAQDVAEVVS RWTGIPVTRMLASEREKLLHMEDELHRRVIGQEQAIAAISDAVRRSRAGLNDPRKPIGSF IFLGTTGVGKTELAKALAEFLFNDDSMMTRIDMSEYQERHSVSRLVGAPPGYVGYDEGGQ LTEAVRRKPYSVVLLDEIEKAHPDVFNILLQVLDDGRLTDNKGRTVDFRNTIIIMTSNMG 
SHIIQENFAAAFSGEKLAPEVVEKTRMDVIDLLKQQLKPEFLNRIDEIVMFEPLTRRDIE RIVDIQLGAVRRMLAENGIRLEYSDKAREWIAAAGYDPLYGARPVKRTIQRYIVNELSKR ILAGDVNREKPIKIGADDKGLTFAN >gi|313159169|gb|AENZ01000019.1| GENE 26 31107 - 33356 1904 749 aa, chain + ## HITS:1 COG:CAC0125 KEGG:ns NR:ns ## COG: CAC0125 COG2812 # Protein_GI_number: 15893421 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, gamma/tau subunits # Organism: Clostridium acetobutylicum # 9 358 8 358 542 288 42.0 4e-77 MSDFIVSARKYRPATFASVVGQKHITSTLKNAIERAQLAHAYLFCGPRGVGKTTCARIFA KAINCLSPNGAEACNECESCRSFNEGRSLNIHELDAASNNSVEDIRTLIEQVRIIPQVGR YSVFIIDEVHMLSAAAFNAFLKTLEEPPAHAIFILATTEKHKIIPTILSRCQIYDFNRIR VEDSVEYLKYIAGQENISADEESLNLIAQKADGGMRDALSMFDKAVSFCGTTLDYRNVAQ TLNVLDYDTYFSVTEMLLAGNYVDVLVTFDTVLSKGFSGQTFTAGLNRHMRDLLMAKRPE TLRLIEMTGTLLERYRTQAGACNVEFLFGAISILTELDGKIRQSSNQRLLVELGLMKIAG LGQKKNDDLTSSGEYSLPALSPRTAAGAAATPTAAARPAPQQSASTVQAQTVSAAGQTTP GTSQSGAGATVQAATPPAAGQTAPSAVQPGAGQTGQGTVRPEAGPTASAGIPQVSGFSVR GAAMQTAGRQAAEVSAQDNAPQAAAAGQTIPGGAANPAAQGGMANPAMQSGTPNPTAQGG AAGPTVLGGTAHPTAQGGAAVPAVQTAGGTTAETTPQPAPARPAVQTAPAPARRPLISGA SLSELLASAGSDPDEELSDGETPDEAEVVTVDPECAEKLEHARSRILNLIKEKRPRFVPA FELMTFRDNTISVSVPTTELREEILRSKTGMLMRIAELAGIEGMIELEVIVNEEIRAVRP IKLEDRVRYITEKNPLVAELRKALDLEVE >gi|313159169|gb|AENZ01000019.1| GENE 27 33488 - 34786 646 432 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163739624|ref|ZP_02147033.1| 50S ribosomal protein L32 [Phaeobacter gallaeciensis BS107] # 5 429 8 414 418 253 34 2e-66 MATKNFIEELEWRGMIHTIMPGAKEQLEKEMTTAYLGIDPTADSLHIGHLVGVMILKHFQ MCGHRPLALIGGATGMIGDPSGKSQERNLLDEPTLRHNQEAIKRQLAKLLDFESDAPNAA VLVNNYDWMKDISFLEFIRDIGKCITVNYMMAKDSVKKRFSGEGDGMSFTEFTYQLVQGF DFLHLYQTMNCKVQLGGADQWGNITTGTELIRRKLGSEAEAFAITCPLITKADGTKFGKT ESGNVWLDPRYTSPYKFYQFWLNVSDEDAKRYIRIFTLLDRETVEALTAEHEAAPHLRIL QKRLAQEITTMIHSKEEYEKAVEASAILFGGSTSEALRKLDEETLLQVFEGVPQFRIARA ELGLPFVDLCAEKAQVFPSKGECRKMVQGGGVSLNKEKVADAAREVTEADLIAGKYLLVQ KGKKNYFLLIAE >gi|313159169|gb|AENZ01000019.1| GENE 28 34788 - 35141 546 117 aa, chain + ## 
HITS:1 COG:BH0094 KEGG:ns NR:ns ## COG: BH0094 COG1539 # Protein_GI_number: 15612657 # Func_class: H Coenzyme transport and metabolism # Function: Dihydroneopterin aldolase # Organism: Bacillus halodurans # 7 117 1 112 114 66 34.0 1e-11 MEYRIVLSRMEFRALHGCYELERKVGNRFTVDLELTAELGDAAVQDDVRKTVNYLTVYEV VRMQMRITQHTIERVAMNIIEAIYAAFAQVRHVKCTVSKLAPPLGGKLEKVSVVLEK >gi|313159169|gb|AENZ01000019.1| GENE 29 35194 - 36543 2061 449 aa, chain + ## HITS:1 COG:BS_yocR KEGG:ns NR:ns ## COG: BS_yocR COG0733 # Protein_GI_number: 16078994 # Func_class: R General function prediction only # Function: Na+-dependent transporters of the SNF family # Organism: Bacillus subtilis # 13 449 10 445 445 321 42.0 2e-87 MHHKESRATLGGKLSAVLVAAGSSVGLGNIWRFPYVAGDNGGGAFLIIYILCVLLLGLPL MIAEFSVGRASHLNAVGAYRKLNRRWSFLGYNGVLAAFLILGFYFVVSGWTAEYMVHSAT GEIARYSTAEEYKNLFETFISNPWRPVLYTCLFVLATHFVIALGVQKGIERSAKILMPLL FVILIILSVHSLLMPGGEAGLKFLFAPDFSKVTPTTVLVALGQAFFSLSIGIGTMVTYAS YFKPDTNLRHTALNVTILDTLVAVLAGVVIFPAVFSVGIEPSSGPSLVFITLPSIFNDMP LSMVWSSVFFLLLVVAALTSTISLHEVVTVYLHEEWHLSRKTAAWLTTAATAALASLASL SLGVLSGWTICGLSLFDSLDFLTANILLPAGGFFTCVFVGWKLDRQILRDQITNNGELKF RIYGVFIFLLRYVCPAVLLLIFLDNLGVF >gi|313159169|gb|AENZ01000019.1| GENE 30 36595 - 38019 1766 474 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3161 NR:ns ## KEGG: Odosp_3161 # Name: not_defined # Def: metallophosphoesterase # Organism: O.splanchnicus # Pathway: not_defined # 14 473 14 465 465 377 41.0 1e-103 MKHPLLLLLICCGLCTAVSARPIRGTVKCGGRPLAGVAVTDGYTFAQSDAQGAFALDADD EALFISVVTPAGYIAPLKEGVPQFYLPYTPSAKRFDFELQAWPGTAACYELLAIADPQPK TDEHFERLRTEIIPPLQAATAAKKAQGVDQAVILLGDIVWDSPQLFAKVREQFASLGIPV YGVIGNHDHDLNKFTDREATENYRSHFGPTYYAFDMGRTHYVVLDDIVYHGARKYDEQID SMQLRWAAAYAERLPAGSRVCFAMHAPAMKSWRDERRVMESAETLMDAFAGHEIHFISGH THINSNFDIREGAMEHNVAQICGNLWRDPINRDGTPKGWQLFRECGEDFAWTYQSLEMPE ARQLRVWGPGTAEDYPASVIAKVWNWDSYWTVVWYEDGRYRGSMQRIWFNDPDYIANIDS LKAAGKTVSKSQRPRTTHFYFKARPSASAREIEVVATDRSGRRYSERITLPAKE >gi|313159169|gb|AENZ01000019.1| GENE 31 38286 - 38816 756 176 aa, chain - ## HITS:1 COG:VC2268 
KEGG:ns NR:ns ## COG: VC2268 COG0054 # Protein_GI_number: 15642266 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase beta-chain # Organism: Vibrio cholerae # 14 163 26 173 173 129 41.0 3e-30 MATKNHNLSKFDSPLPSAADMRFGIVVAEWNREVTEALLEGAVRTLRAAGCPDLNIQIKY VPGTFELALGAQFFAEYTDVDAVIALGCVIQGDTRHFDFICQGVTQGITQLQIQWNMPIA FGVLTVNDMQQALDRCGGRHGNKGDEAAATAVNMVKLQIEMEAASPDHEPDRRNIN >gi|313159169|gb|AENZ01000019.1| GENE 32 38825 - 39508 979 227 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5454 NR:ns ## KEGG: Cpin_5454 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 6 226 17 231 237 124 35.0 2e-27 MAKQHVAEQETLGEAMNRTELFFEKNGRNMAYIFLGLLVLAALIFGYRALIVAPRATKAA ERIAEAQYRFEEQNPDYQLALEGDANGAGFLDVIEEYGSTPSGNLAKHYAGICYLKTGDL ENAAKYLAKYSPVKGIPGALINAQNLGLQGDIAVEQQNYAKAVKFYEQAVKAADNNLTAP MYLRKAGLAEQAQGNNEKAAAFYEQILTSYPASTDAREAEKLLGSVK >gi|313159169|gb|AENZ01000019.1| GENE 33 39700 - 41046 1480 448 aa, chain + ## HITS:1 COG:BS_recF KEGG:ns NR:ns ## COG: BS_recF COG1195 # Protein_GI_number: 16077072 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair ATPase (RecF pathway) # Organism: Bacillus subtilis # 1 360 1 368 370 172 31.0 1e-42 MFLKKISLLNFKNIEQAELALCRGVNCLVGDNGAGKTNVIDAVYYLSMCKSSLPMTDGQS IRHGADFFLAEGQYLTDGGKSENIVCSFSRKGGKVLKRNGKEYERLSDHVGLVPAVIVSP ADSALISDASDERRRYLNAFISQLDRSYLTAVMRYNAVLAERNRLLKNMPDETMLQIYDM QLVEQGERIHARRREFAERLQPVAAEYYRILSGDREQVELHYKSELNDRPFGEILLAARQ KDLANEFTTSGIHRDDLVLRIGGYPLRKYGSQGQQKSFLIALKLAQYTIVAQEKGEKPIL LLDDLFDKLDAGRVEQLIRLVSEDSFGQIVITDCNPTRLRRILDKAGGAYSLFTVENGGI GQETATAGAPACGGQLPAEESAKEAADRTRHAGPQEAGAAEEIRTAAVQGEASEDLRNAA AAGEKSGGQDACGTDTADTASGGKEGAR >gi|313159169|gb|AENZ01000019.1| GENE 34 41043 - 41333 440 96 aa, chain + ## HITS:1 COG:no KEGG:BVU_2621 NR:ns ## KEGG: BVU_2621 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 96 1 96 96 64 33.0 2e-09 
MRRTKTMLMGDLLEEFFKRPYVAAKVAEGKLPDTWRAVVGDRAADFTTELKLENHILYAR IQSSVLRSELFYQREALKEELNRRSGVRIVNAVIIR >gi|313159169|gb|AENZ01000019.1| GENE 35 41701 - 42900 1462 399 aa, chain - ## HITS:1 COG:no KEGG:BVU_0414 NR:ns ## KEGG: BVU_0414 # Name: not_defined # Def: major outer membrane protein OmpA # Organism: B.vulgatus # Pathway: not_defined # 1 399 1 397 399 156 29.0 2e-36 MKKLILILAVVAFATTAWAQETPKKPSFAGFVSNGFWDNWEMSLGGGVGTALTNGSNSGS FGKRLGFEANFSLVKWVHPVVGMRLQLQGGQFANYDADLGKLKWPYLFVHSDFMLNFSNW AGGYRDDRAYYLVPFVGFGYLATNFTDKSQEDNMTGTHQAFAFSYGLLNKFRLSRSFDFN IELKGMLAPSRVCPAKTDGSYLFGFSATAGFTYRFNKRGWQRGVAGYTAADIQAFQDAVA ASMAAVEAAELENAQLAQQLAAAQAQAAAAQNAAADAAAAAAVATVVAADEAALADTSII LFDYSMSVLTPQEKTRLELIAEQIKSGSKDAVYQIVGHADQQTGTAAGNKRVAEHRAKRV YDFLVSKGVNPKQLNYEGKGNSPDPFKKVQAANRAAIIQ >gi|313159169|gb|AENZ01000019.1| GENE 36 43058 - 45613 4034 851 aa, chain - ## HITS:1 COG:AF1664 KEGG:ns NR:ns ## COG: AF1664 COG0209 # Protein_GI_number: 11499254 # Func_class: F Nucleotide transport and metabolism # Function: Ribonucleotide reductase, alpha subunit # Organism: Archaeoglobus fulgidus # 32 655 7 560 752 263 33.0 1e-69 MSKAAKPVAPQKVEYNDAVAESKKYFEGDDLAATVWVSKYALKDSFGNIYETSPKQMHER IAAEIERIERKYPDPMSKEEVFELLDHFRYVIPQGGPMTGIGNNFQVASLSNCFVIGHKN PADSYGGIFRMDEEQVQLMKRRGGVGHDLSHIRPTGSPVLNSALTSTGIVPFMERYSNST REVAQDGRRGALMLSLSIKHPDAERFIDAKVDTGKVTGANVSIKIDDEFMRAAIAGKKYH QQFPIKSDHPKYEQDIDAKKLWDKIIHNAWKSAEPGVLFWDTIIRESVPDCYADEGFVTV STNPCGEIPLCPYDSCRLLAMNLLSYVDNPFTKEAKFNFDRFKAHVGKAMRMMDDIIDLE LEKVEQIIAKIAADPEDEEVRRVELELWKKIRTMAQKGRRTGLGITAEGDMLAALGLLYG SDEAIAFAVEVQKTLAVEAYRASVRMAAQRGAFPIYDASKEKNNPMIARIREADPELYAE MEKNGRRNIAMLTIAPTGTTSLMSQTTSGIEPVFRTVYKRRRKINPSDHDTHVDYEDETG EKFQEYNVYHHNFVKWLDANGYDTSKLETISDAELNEWVAASPYHGATANDIDWVAKVKM QGAIQKWVDHSISVTVNLPNNVSEALVADVYRTAWECGCKGVTVYRDGCRDGVLLDKASK NKKKKCEGHPGEVAKRPKSIPADIVRFKNGTEDWIAFVGLQDGRPYEVFTGKIEEDAMYI PRKINKGFIIKVREDDGTKRYDFQYTDRYGYTNTIGGISRLFDEEFWNYAKLISGVLRHG MPIEKTVSLIESLHLDSESINTWKTGVCRALKQYIVDGTKSKGKCPSCGQENMAYQNGCL TCMSCGYSKCG 
>gi|313159169|gb|AENZ01000019.1| GENE 37 45840 - 46214 688 124 aa, chain - ## HITS:1 COG:PH0854 KEGG:ns NR:ns ## COG: PH0854 COG0251 # Protein_GI_number: 14590714 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation initiation inhibitor, yjgF family # Organism: Pyrococcus horikoshii # 1 124 12 136 137 130 50.0 4e-31 MKKIIASPHAPKAVGPYSQAVEAGGALYVSGQLPIDGATGKMAEGVEAQTHRSLTNLRHI LEEGGYTLGDVVKTTVLLQDIGDFAAMNAVYARFFTERMPARVCYQVAALPMGALVEIDA VAVK >gi|313159169|gb|AENZ01000019.1| GENE 38 46292 - 46720 656 142 aa, chain + ## HITS:1 COG:no KEGG:CJA_2990 NR:ns ## KEGG: CJA_2990 # Name: not_defined # Def: putative lipoprotein # Organism: C.japonicus # Pathway: not_defined # 29 140 58 172 176 71 40.0 1e-11 MKILLKIAALAAVAALAAGCCSCRSYQKKNRRPLVGTEWQLIQLGGKAVKPEEGKFTLTF LAEENRIAGVGACNRIMGRYEATEKGVLKIGPLASTMMACPGMEQEDAFTKALESTTHYD MDGPMLLLLSDGELRAVFQAKP >gi|313159169|gb|AENZ01000019.1| GENE 39 46780 - 47607 649 275 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159215|gb|EFR58588.1| ## NR: gi|313159215|gb|EFR58588.1| hypothetical protein HMPREF9720_0975 [Alistipes sp. 
HGB5] # 1 275 5 279 279 505 100.0 1e-141 MKTMKRANSVTVTLSATLLFCAVSAVASQPEIIDLPETLCCCPEPPAAPECTMLDNFSEK LARIEARIGELTLELNRLVWETYFDYAEQARIVPDIMTYPGLDYSALCDTVSAIAVLDAQ YRKAADVYTKVLKSDPKYDAIHREYVALQNVDDKDRKNANREQYNLMYDRLRRKNSEYAP ALKARQDALRARNIAVARFLLNHYQAEGRVMPVEPLFKRYSETMRTLQIRDARIEAYENE LSTLQRLRREVLEQVLREQYDVPKRDAVGASALKP >gi|313159169|gb|AENZ01000019.1| GENE 40 47937 - 48935 1717 332 aa, chain - ## HITS:1 COG:TVN1097 KEGG:ns NR:ns ## COG: TVN1097 COG0039 # Protein_GI_number: 13541928 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Thermoplasma volcanium # 4 269 1 267 325 107 30.0 3e-23 MEFLTNDKLTIVGAAGMIGSNMAQTALMMRLTPNICLYDPYAPALEGVAEELYHCAFEGV NITWTSDIKEALTDASYVVSSGGAARKAGMTREDLLKGNTEIAAQLGKDIKTYCPGVKHV VVVFNPADITGLVTLIYAGIRPSQLSTLAALDSTRLRSELAKYFKISPDEIRNCRTYGGH GEQMAVFASTTLVAGRPLSELIGHEMPEGDWHDLQQRVIQGGKHIIDLRGRSSFQSPAYL SICMIAAAMGGKPFGYPAGVFVHNDEFKHILMAMETQITKEGVSYKNVQGTAEENKTLAA SYEHLCKLRDEVISMGIIPPVEEWRSLNPHLK >gi|313159169|gb|AENZ01000019.1| GENE 41 48979 - 50091 1479 370 aa, chain - ## HITS:1 COG:AGl909_1 KEGG:ns NR:ns ## COG: AGl909_1 COG1409 # Protein_GI_number: 15890570 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 34 345 356 660 1299 120 29.0 6e-27 MKRGIMTLAALAAMLWSLPAAAQEEYRQPALENPESWSVVVIPDLQGYAKNEASQPIARL MTAWIADNIERLNVRMVLCVGDVVEQNDRIGNGFSGDLTSVRQWQGMADAFDVLDGRVPY LVATGNHDYTYTRSGARRTHLNEYFPIGRNPLNAAAICQFGLDSDENPAVENCAFELKAP DGKDYLFLNMEFAPRDTVMKWAQQVAGLEEYADHRIVLMTHAYLDAKDQRLSGPCKVTSY EPLVRNGRIVKIKGLPLPDASNGEDLWQKLVRPASNIELVVCGHISGSGFRTDRNSAGKD VHQMLFDAQSMGGGYEGNGGDGWIRILEFMPDGVTVKATTFSPLFAISPTTRQFAWMRDA KNEFTFRFSK >gi|313159169|gb|AENZ01000019.1| GENE 42 50123 - 50719 858 198 aa, chain - ## HITS:1 COG:L0265 KEGG:ns NR:ns ## COG: L0265 COG0353 # Protein_GI_number: 15672322 # Func_class: L Replication, recombination and repair # Function: Recombinational DNA repair protein (RecF pathway) # 
Organism: Lactococcus lactis # 8 197 10 198 198 172 44.0 4e-43 MSKLLQDVVGELSKLPGVGRRTALRLAIHILRMERESVAEMTESIDRFRNDVKYCAECNN LSDEEVCPICLDDERDRTTICVVEQVADVLSIENTHQYKGLYHVLGGVISPMQGISPSDL KIDLLTERIARGGVKEVILAISTSVEGETTLFYLMNRLRQFPGVKVTSIARGIGFGDELE YVDELTITHALRNRREVE >gi|313159169|gb|AENZ01000019.1| GENE 43 50720 - 51538 1262 272 aa, chain - ## HITS:1 COG:no KEGG:PRU_0761 NR:ns ## KEGG: PRU_0761 # Name: not_defined # Def: putative BatD/BatE protein # Organism: P.ruminicola # Pathway: not_defined # 51 272 632 853 853 206 46.0 7e-52 MKKRISVCLVLLLGCLASYAQEPADSLAQPAAEQSESAPRPTTDELWDMANTAYINGNFH SAAEVYEEILSRGVSSVKLYYNLANAYFKEDRIGKAILYYKRALRLAPGNDDIRHNLSVA EARTKDNIEDIPEFFFVTWMREARHTMSCTAWSVLSLVLLACALALFLVYLLAQRLSLRK AGFYGTVVAVLLCMLTTWFALGERREMLDDTSAVVMTASTAVKSSPDKSSTDLFVLHEGT VVTITNRLDDWCEVVIADGKKGWLECRKIETI >gi|313159169|gb|AENZ01000019.1| GENE 44 51547 - 53403 2855 618 aa, chain - ## HITS:1 COG:no KEGG:BT_0904 NR:ns ## KEGG: BT_0904 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 617 5 607 608 399 36.0 1e-109 MRDFIAKISLTALFALAIFSASAAEKVTFEASSPLTVAVGEAFRVEFALNAYPDKGTFKA PSFDGFDVIAGPAESSGQSIQIVNGAMTKSINYTITYVLLPQAAGNVTVGAAEVTVDGTV YRSNALPIEIVNEGKSPGAGGTQSRPREDSSPDVTAQNQIAKDDILLRAVVSRTSVYKGE PLRVTFKLYERVPVVGYNDVKFPSFNGFWAQELNTDNARRQRETFNGKVYETLVAKEYLL YPQQAGTLVIEPAELTAVAQVVVPGRRNVDPFFGGGPDFVNVPRKVQSPRINIAVKPLPA GAPASFSGAVGSFTMDAVLPQERLAANSAATYTVKISGTGNLTFVQAPKLTLPASFEQYN VKTTESINTSAAGISGYRQFEYPFIARAEGTYEVEPVEFSYFDPSRMQYMTLKSKALELE ITPDAKGGGDALVVQGRGMSKEEVKLLGQDIRFIKLGSPQLRSVREPFLFSAAYWVLFFG ILILFAMVYVALRRQIRESQNVALLRGKRANKVAVQRFRAAKRYMEEQNRHAFYEEMLRA LWGYMSDKFNIPVANLTKENVREELHKRGVSSEESQRFTAIITKCDEAQYSPAASARMTE VYGEGVDIISRIEAMIKR >gi|313159169|gb|AENZ01000019.1| GENE 45 53480 - 55093 456 537 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169634422|ref|YP_001708158.1| fumarate hydratase [Acinetobacter baumannii SDF] # 74 533 38 487 508 180 31 2e-44 MADFIYQEPFPVGEDKTEYRLLTKDYVKVVECDGRKILKVDPAGLELLSKAAYCDVSFCL 
RAAHLQKLRNILEDPEATDNDKFVAYTMLLNQVVAAEGELPTCQDTGTAICIGHKGEDVY TGADDAACIAKGVYETYKDRNLRYSQVVPFTMTDEKNSGTNLPAQIDIYGGRPGMEYEFL FITKGGGSANKTFLYQQTKALLNEESLTKFIKQHIFDLGTSACPPYHLAICIGGTSAEMC LSTVKKASAGYLDELPTSGNEGGRCFRDLEWEEKVLKICQESGVGAQFGGKYLVHDVRVI RAPRHAASCPVAIGVSCSADRNIKAKITPDGIFLEALEKNPARFLPAEAPAMSPAVDIDL DEGMDKVREILTKYPIKTRLNLKGTLIVARDIAHARIKQMLDEGKPMPEYFKKHPVYYAG PAKTPEGMASGSFGPTTAGRMDPYVDLFQSHGGSLVMVAKGNRSQQVTDACRKHGGFYLG SIGGPAAILAKNSIKSVEVVDFPELGMEAVRKIYVENFPAFIIVDDKGNDFFADFKH >gi|313159169|gb|AENZ01000019.1| GENE 46 55387 - 56379 1695 330 aa, chain + ## HITS:1 COG:TP0662 KEGG:ns NR:ns ## COG: TP0662 COG0191 # Protein_GI_number: 15639649 # Func_class: G Carbohydrate transport and metabolism # Function: Fructose/tagatose bisphosphate aldolase # Organism: Treponema pallidum # 1 330 1 332 332 513 76.0 1e-145 MVSYKDLGLVNTREMFAKAIKGGYAIPAFNFNNMEQMQAIIQACVEASSPVILQVSSGAR KYANQTLLRYMAQGAVEYAKELGKNIPIVLHLDHGNSFELCKSCIDMGFSSVMIDGSHLP YEENVALTKQVVDYAHQFDVTVEGELGVLAGVEDEVSAEHHTYTDPKDVVDFVSKTGVDS LAISIGTSHGANKFKPEQCTRNAEGILVPPELRFDILAEIEKELPGFPIVLHGSSSVPQE YVKIINTHGGALKDAVGIPEEQLRKAAKSAVCKINIDSDGRLAMTAAIRKIFVDQPAEFD PRKYLGPARDELKKLYMHKCESVLGSAGKA Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:35:58 2011 Seq name: gi|313159159|gb|AENZ01000020.1| Alistipes sp. HGB5 contig00010, whole genome shotgun sequence Length of sequence - 7759 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 4, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 204 220 ## BT_0095 conjugate transposon protein - Term 225 - 269 1.1 2 2 Op 1 . - CDS 348 - 884 317 ## gi|291513813|emb|CBK63023.1| hypothetical protein AL1_03450 3 2 Op 2 . - CDS 897 - 1493 149 ## BF1363 conjugate transposon protein TraB 4 2 Op 3 . - CDS 1517 - 1987 170 ## BT_0097 conjugate transposon protein 5 2 Op 4 . 
- CDS 1990 - 2745 490 ## BF1364 conjugate transposon protein - Prom 2796 - 2855 5.5 + Prom 3275 - 3334 2.9 6 3 Op 1 . + CDS 3367 - 3681 418 ## BF1366 hypothetical protein 7 3 Op 2 . + CDS 3666 - 4895 976 ## BF1367 hypothetical protein + Term 4899 - 4949 11.1 - Term 4893 - 4930 3.3 8 4 Op 1 . - CDS 4988 - 5839 683 ## BT_0683 alpha-glucosidase 9 4 Op 2 . - CDS 5824 - 6996 787 ## BT_0683 alpha-glucosidase 10 4 Op 3 . - CDS 7004 - 7750 731 ## COG3537 Putative alpha-1,2-mannosidase Predicted protein(s) >gi|313159159|gb|AENZ01000020.1| GENE 1 3 - 204 220 67 aa, chain - ## HITS:1 COG:no KEGG:BT_0095 NR:ns ## KEGG: BT_0095 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 98 79 67.0 4e-14 MQKRAIFMIFSLCSALASFAQGNGMAGISEATNMVTSYFDPLTKLIFAVAAILGLVGGVR VYSKFSS >gi|313159159|gb|AENZ01000020.1| GENE 2 348 - 884 317 178 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291513813|emb|CBK63023.1| ## NR: gi|291513813|emb|CBK63023.1| hypothetical protein AL1_03450 [Alistipes shahii WAL 8301] # 98 178 1 81 81 131 90.0 3e-29 MNSIILIAACAVCLDQLFFRGRLASYVVDNLKGPKKKHTGHNAAPHTPNAGTAPRQTPEI LVSKSYRCGLERTGADIGTAPAGGDAIFAPASEEKSAMASWNGLVVQPPIGEDGELAEPE APAGARPPLPALPARAELRDAEVHAMLGELDFGESTVMGDATAEERQAIANFDIRAYA >gi|313159159|gb|AENZ01000020.1| GENE 3 897 - 1493 149 198 aa, chain - ## HITS:1 COG:no KEGG:BF1363 NR:ns ## KEGG: BF1363 # Name: not_defined # Def: conjugate transposon protein TraB # Organism: B.fragilis # Pathway: not_defined # 61 192 3 128 133 72 33.0 1e-11 MLYSFLSESGVLDISLLALGTGIIVLSILGLLHLGKRRGHNKAGHFSEQPDDCAMPAVQT ENSDGWPSMEQPQTTELVSVVAAPEDSSNLDSPPSETVVSEKQDVKPERVPVPDIRWRRS MTLPDYKKTFLVRVDYDLRASLYVSAPTKRKILEVLKKIGGERLTATSYVDNILRHHLEM FRDEINCIHQEQNYHNIV >gi|313159159|gb|AENZ01000020.1| GENE 4 1517 - 1987 170 156 aa, chain - ## HITS:1 COG:no KEGG:BT_0097 NR:ns ## KEGG: BT_0097 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 154 8 143 145 67 30.0 2e-10 
MAKKPKIEVDESLVKQVIAGQLPITTKVTRVIPEQIPSGTATDAIPAEPPSVPEIDTEKI PDETPSVPAAQEPRRRRTPPLSDYERMFLTPVEYGIRATLYVNASTKRKILEILKRIGGE RLSATSYVDNILQHHIETFRDDINRLDRKRNFEKLV >gi|313159159|gb|AENZ01000020.1| GENE 5 1990 - 2745 490 251 aa, chain - ## HITS:1 COG:no KEGG:BF1364 NR:ns ## KEGG: BF1364 # Name: not_defined # Def: conjugate transposon protein # Organism: B.fragilis # Pathway: not_defined # 6 251 15 260 260 218 44.0 2e-55 MKQASISICNQKGGIGKSTFTMLLASHLHYTLGYDVLVVDCDYPQWSVQAQRERELSLIE HDDYHKLLLVRQFKATGRKLWPVLKCMPPEAPEEVERLLQSGYHPRIILYDLPGTVNAEG VIRLLASLDAVFVPMKADKVVMESTLSFARSLTTNLAQDPTLTLQSVYLFWTMIDRRERT PLYDRYEAVIRKLGLSLMTTHIPYRSKFNKELLADNTGVGRSTLLAPARTFACEAQLEIL TEEILTLLKIR >gi|313159159|gb|AENZ01000020.1| GENE 6 3367 - 3681 418 104 aa, chain + ## HITS:1 COG:no KEGG:BF1366 NR:ns ## KEGG: BF1366 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 102 45 138 140 74 43.0 1e-12 MLCKAGLEHNRSRFIVKRIFNEEFVVVKRDPSKVQFIARLNDFYFQFQKLGNNYNQIVKA INAHFSNIAIPHQIAALEQRTRELKALSIEILNLAKQTKEWLRI >gi|313159159|gb|AENZ01000020.1| GENE 7 3666 - 4895 976 409 aa, chain + ## HITS:1 COG:no KEGG:BF1367 NR:ns ## KEGG: BF1367 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 382 1 382 422 321 46.0 3e-86 MVANIRSGSSPGGALYYNKEKVDKDEAEVLFWQKMLEPFDKHGRMDVDACMDCFWPYLEA NRRTTNTVFHASLNPSPEDRLTDDQLRDIAQEYMERMGYGNQPYIVFKHKDIDRQHLHVV SLRVDENGHKLPHDFEARRSMEILRDLERKYNLHPSVKGEEQADKVGLHKVNYREGNVKQ QISSIVRSCLRNYKCSSYGEFRTLLELFNVSVEERTGTIDERNYAGIVYGAMTDDGYGIG TPFKSSRIGRDVGYKALQKYYERSKSALKQDGSLDRLRQTVRDAMSPHNTRDEFRQLLKA ENIDAIFRINPVGRIYGVTFIDHNDGIVANGSVLGKEFSANVFNELYPAPKEAQQVAEQH VEQKHEVQNHAANPISGIVDTVLDLADTRAYEEQQRQMQQRRKKRRHRS >gi|313159159|gb|AENZ01000020.1| GENE 8 4988 - 5839 683 283 aa, chain - ## HITS:1 COG:no KEGG:BT_0683 NR:ns ## KEGG: BT_0683 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 283 386 671 671 370 62.0 1e-101 
MEAAMRRYAELGVHTLKTGYAGGFKGGYLHHSQYGVQHYQRVVETAAKYRIMLDVHKPIK ETGIRRTWPNMMTREGARGMEWNAWSEGNSAEYLTTLPFVRMLSGPMDYTPGIFDIDYST AKADKGRIEWNGPNAERCIKTTLARQIANWVIIYSPLQMAADLIENYEGHPAFRFFRDFD ADCDWSKALQGEIGDYIVVARRAKDRYFLGAGTDEKARTLVQKLDFLEPGVTYTATIYAD APDAGRNPEAYLIGKRAVTARDTIRIEMAERGGQAITFIPAEK >gi|313159159|gb|AENZ01000020.1| GENE 9 5824 - 6996 787 390 aa, chain - ## HITS:1 COG:no KEGG:BT_0683 NR:ns ## KEGG: BT_0683 # Name: not_defined # Def: alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 380 9 379 671 424 57.0 1e-117 MKIPNPHAAFMAFACALFAACSPQQPFITSPDGRIAVSVEADAGAGVTYAVTVDDKPLIL PSAMGLAACGTSLQGATITGVWHSSHDGLWTAPWGENKRHRNHYNEMSVAVRKPDKSMAF TLRFRAFDDGIAFRYELAQPDSLAVTDELTEFRFADEGHTMSWSIPGNFETYELQYREQP VARITDANTPFTFRTGDGIFGAVHEAALYDWPEMVLRRDSAGVLKAALAPLPDGVKAYIP GRFTTPWRTIQIADRAVGLINSSLVLNLNEPSKIEDTSWIVPQKYVGVWWGMHLGTQVWT MGPRHGATTRNAIRHIDFAAENGIQGVLFEGWNKGWENWGGNQQFDYLEPYADFDLERIA AYAREKGVQLWMHNETGGNILSTKPSWKQR >gi|313159159|gb|AENZ01000020.1| GENE 10 7004 - 7750 731 248 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 6 244 458 696 717 193 38.0 2e-49 MLEKASQNYKNLFDPETRLMRGRNADGTFQTPFSPYKWGDAFTEGNAWHYTWSVFHDVEG LIDLMGERKGFTRMLDSVFVVPPVYDDSYYGFRIHEITEMQVADMGNYAHGNQPAQHMIY LYDYAGQPRKAQWWVREVMDRLYSAKPDGYCGDEDNGQTSAWYVFSALGFYPVCPGADEY AVGSPLFRKAVIHLENGNTIEIDAPENSSENRYVGKMAIDGKVSERPFLSYSTLLEGAKV RFEMESEK Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:37:27 2011 Seq name: gi|313159013|gb|AENZ01000021.1| Alistipes sp. HGB5 contig00032, whole genome shotgun sequence Length of sequence - 160386 bp Number of predicted genes - 145, with homology - 136 Number of transcription units - 69, operones - 34 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 1 - 1717 874 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 1774 - 1833 1.9 2 2 Op 1 . + CDS 2035 - 4269 1319 ## COG1472 Beta-glucosidase-related glycosidases 3 2 Op 2 . + CDS 4282 - 6219 1729 ## COG3533 Uncharacterized protein conserved in bacteria 4 2 Op 3 . + CDS 6240 - 7961 1063 ## Cpin_1088 hypothetical protein + Term 8093 - 8160 31.4 + Prom 8368 - 8427 2.2 5 3 Tu 1 . + CDS 8499 - 8951 213 ## gi|313159017|gb|EFR58392.1| hypothetical protein HMPREF9720_1376 - Term 9129 - 9182 -0.2 6 4 Tu 1 . - CDS 9246 - 10517 849 ## BF1833 putative bacteriophage integrase - Prom 10561 - 10620 3.4 + Prom 10564 - 10623 4.2 7 5 Tu 1 . + CDS 10803 - 11729 1456 ## gi|313159052|gb|EFR58427.1| hypothetical protein HMPREF9720_1379 + Term 11750 - 11791 10.1 - Term 11737 - 11778 10.1 8 6 Tu 1 . - CDS 11899 - 12795 1315 ## COG4974 Site-specific recombinase XerD - Prom 12846 - 12905 4.6 + Prom 12779 - 12838 4.1 9 7 Op 1 . + CDS 12865 - 13692 513 ## PROTEIN SUPPORTED gi|228472988|ref|ZP_04057745.1| ribosomal protein L11 methyltransferase 10 7 Op 2 . + CDS 13754 - 14422 765 ## BT_3431 DNA repair protein 11 7 Op 3 . + CDS 14454 - 14864 463 ## gi|291514730|emb|CBK63940.1| hypothetical protein AL1_15140 + Term 14879 - 14917 1.0 12 8 Tu 1 . - CDS 14893 - 15348 -456 ## 13 9 Op 1 . + CDS 15599 - 16645 1308 ## Odosp_3332 phosphate-selective porin O and P 14 9 Op 2 . + CDS 16667 - 17953 1915 ## COG2855 Predicted membrane protein + Term 17977 - 18016 7.1 15 10 Tu 1 . - CDS 17962 - 18264 99 ## gi|313159027|gb|EFR58402.1| hypothetical protein HMPREF9720_1385 + Prom 18032 - 18091 3.7 16 11 Tu 1 . + CDS 18187 - 19398 954 ## BVU_1527 integrase 17 12 Op 1 . + CDS 19861 - 20409 458 ## BF3925 putative transcriptional regulator Updx-like protein 18 12 Op 2 . + CDS 20459 - 21379 444 ## gi|313159049|gb|EFR58424.1| hypothetical protein HMPREF9720_1388 + Prom 21385 - 21444 4.8 19 13 Op 1 . 
+ CDS 21464 - 22636 824 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase + Prom 22646 - 22705 2.8 20 13 Op 2 . + CDS 22725 - 23609 1096 ## COG1209 dTDP-glucose pyrophosphorylase + Term 23649 - 23685 0.0 + Prom 23748 - 23807 5.6 21 14 Op 1 . + CDS 23829 - 24140 87 ## gi|313159078|gb|EFR58453.1| conserved domain protein 22 14 Op 2 . + CDS 24220 - 24717 96 ## MM_2099 hypothetical protein 23 14 Op 3 9/0.000 + CDS 24714 - 25289 830 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 24 14 Op 4 . + CDS 26010 - 26870 1084 ## COG1091 dTDP-4-dehydrorhamnose reductase 25 14 Op 5 . + CDS 26877 - 27635 44 ## COG3774 Mannosyltransferase OCH1 and related enzymes 26 14 Op 6 . + CDS 27654 - 28760 1388 ## COG1088 dTDP-D-glucose 4,6-dehydratase + Term 28764 - 28810 1.3 - Term 28608 - 28640 4.7 27 15 Tu 1 . - CDS 28844 - 28978 56 ## - Prom 29032 - 29091 4.2 + Prom 29870 - 29929 4.4 28 16 Tu 1 . + CDS 29964 - 30431 -91 ## BT_0467 hypothetical protein + Term 30435 - 30479 -0.5 29 17 Tu 1 . - CDS 30593 - 30754 110 ## - Prom 30807 - 30866 5.3 30 18 Tu 1 . + CDS 31004 - 31387 201 ## gi|313159102|gb|EFR58477.1| glycosyltransferase, group 2 family protein + Term 31407 - 31457 -0.1 31 19 Tu 1 . - CDS 31477 - 31590 65 ## - Prom 31675 - 31734 4.8 + Prom 31541 - 31600 5.6 32 20 Tu 1 . + CDS 31621 - 32592 211 ## COG2327 Uncharacterized conserved protein + Term 32755 - 32802 -0.1 + Prom 33080 - 33139 3.0 33 21 Tu 1 . + CDS 33198 - 33554 142 ## Mfla_2019 nitroreductase + Term 33795 - 33832 -0.1 34 22 Tu 1 . - CDS 34409 - 34615 226 ## - Prom 34704 - 34763 8.2 35 23 Tu 1 . + CDS 34922 - 35815 246 ## COG1216 Predicted glycosyltransferases + Prom 35831 - 35890 5.4 36 24 Tu 1 . + CDS 35910 - 36335 113 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 36536 - 36582 -0.9 37 25 Tu 1 . - CDS 36844 - 37065 119 ## - Prom 37273 - 37332 8.9 38 26 Tu 1 . 
+ CDS 36899 - 37426 116 ## gi|313159144|gb|EFR58519.1| putative acyltransferase + Term 37633 - 37669 0.2 + Prom 37540 - 37599 3.1 39 27 Op 1 25/0.000 + CDS 37727 - 38536 307 ## COG0438 Glycosyltransferase 40 27 Op 2 . + CDS 38538 - 39611 675 ## COG0438 Glycosyltransferase 41 27 Op 3 . + CDS 39702 - 40721 541 ## Bache_1136 hypothetical protein 42 27 Op 4 . + CDS 40760 - 40918 160 ## + Term 40950 - 40988 10.9 - Term 40938 - 40976 10.9 43 28 Op 1 . - CDS 41059 - 42915 2212 ## COG1032 Fe-S oxidoreductase 44 28 Op 2 . - CDS 43181 - 44269 1698 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 45 28 Op 3 . - CDS 44269 - 45234 1588 ## Palpr_2074 PASTA domain containing protein 46 28 Op 4 . - CDS 45292 - 46374 1533 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 47 28 Op 5 1/0.167 - CDS 46376 - 47479 1988 ## COG0795 Predicted permeases 48 28 Op 6 . - CDS 47486 - 48616 1722 ## COG0343 Queuine/archaeosine tRNA-ribosyltransferase - Prom 48661 - 48720 3.1 + Prom 48625 - 48684 5.5 49 29 Op 1 . + CDS 48708 - 49862 1669 ## BVU_3860 glycosyl transferase family protein 50 29 Op 2 . + CDS 49853 - 50416 971 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 51 29 Op 3 . + CDS 50444 - 51067 1003 ## COG0357 Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division + Term 51183 - 51217 -0.9 52 30 Op 1 . - CDS 51093 - 51332 181 ## gi|313159104|gb|EFR58479.1| conserved hypothetical protein 53 30 Op 2 . - CDS 51415 - 51660 195 ## gi|313159127|gb|EFR58502.1| conserved hypothetical protein 54 30 Op 3 . - CDS 51691 - 51921 78 ## gi|313159108|gb|EFR58483.1| hypothetical protein HMPREF9720_1422 - Prom 52032 - 52091 5.5 55 31 Op 1 . - CDS 52352 - 52465 56 ## 56 31 Op 2 . - CDS 52484 - 52729 120 ## - Prom 52817 - 52876 3.6 + Prom 52702 - 52761 3.1 57 32 Tu 1 . 
+ CDS 52814 - 53683 891 ## COG0500 SAM-dependent methyltransferases 58 33 Op 1 2/0.167 - CDS 53658 - 54074 657 ## COG0526 Thiol-disulfide isomerase and thioredoxins 59 33 Op 2 . - CDS 54113 - 55219 1138 ## COG0820 Predicted Fe-S-cluster redox enzyme + Prom 55175 - 55234 2.1 60 34 Tu 1 . + CDS 55305 - 55988 1016 ## Odosp_2864 TonB family protein + Term 56038 - 56075 5.2 + Prom 56024 - 56083 3.7 61 35 Op 1 . + CDS 56157 - 57374 1783 ## COG2262 GTPases 62 35 Op 2 . + CDS 57395 - 58051 910 ## COG0035 Uracil phosphoribosyltransferase + Term 58053 - 58100 5.2 63 36 Op 1 . + CDS 58375 - 60213 2247 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 64 36 Op 2 . + CDS 60290 - 61039 942 ## BDI_2598 hypothetical protein 65 36 Op 3 . + CDS 61036 - 61641 774 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 66 36 Op 4 . + CDS 61648 - 62538 1206 ## COG1230 Co/Zn/Cd efflux system component + Term 62597 - 62631 5.0 - Term 62579 - 62623 6.8 67 37 Op 1 . - CDS 62743 - 63003 260 ## COG0759 Uncharacterized conserved protein 68 37 Op 2 . - CDS 62960 - 63328 175 ## PROTEIN SUPPORTED gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 69 37 Op 3 . - CDS 63336 - 64196 1298 ## Odosp_2726 uroporphyrinogen III synthase HEM4 70 37 Op 4 . - CDS 64233 - 65339 1129 ## gi|313159109|gb|EFR58484.1| putative membrane protein 71 37 Op 5 2/0.167 - CDS 65381 - 66559 1993 ## COG0642 Signal transduction histidine kinase 72 37 Op 6 . - CDS 66569 - 68233 2570 ## COG0793 Periplasmic protease 73 37 Op 7 . - CDS 68226 - 68696 583 ## gi|313159060|gb|EFR58435.1| hypothetical protein HMPREF9720_1440 74 38 Op 1 2/0.167 - CDS 68827 - 69705 384 ## PROTEIN SUPPORTED gi|229229955|ref|ZP_04354520.1| SSU ribosomal protein S1P; 4-hydroxy-3-methylbut-2-enyl diphosphate reductase 75 38 Op 2 . 
- CDS 69727 - 70425 256 ## PROTEIN SUPPORTED gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 76 38 Op 3 . - CDS 70412 - 71065 881 ## Odosp_2288 lysine exporter protein (LysE/YggA) + Prom 71155 - 71214 4.6 77 39 Tu 1 . + CDS 71283 - 71588 464 ## Rmar_1696 hypothetical protein + Term 71598 - 71634 7.2 - Term 71586 - 71620 5.0 78 40 Op 1 . - CDS 71848 - 73251 1947 ## COG3119 Arylsulfatase A and related enzymes 79 40 Op 2 . - CDS 73280 - 74866 2135 ## COG3119 Arylsulfatase A and related enzymes 80 40 Op 3 . - CDS 74871 - 77177 2982 ## COG3525 N-acetyl-beta-hexosaminidase - Term 77199 - 77244 12.0 81 41 Op 1 . - CDS 77262 - 78575 1865 ## HMPREF0659_A7291 putative lipoprotein 82 41 Op 2 . - CDS 78594 - 79724 1674 ## HMPREF0659_A7292 putative lipoprotein 83 41 Op 3 . - CDS 79779 - 81377 2218 ## HMPREF9137_0327 putative lipoprotein 84 41 Op 4 . - CDS 81396 - 83426 2943 ## BF1326 hypothetical protein 85 41 Op 5 . - CDS 83518 - 84696 995 ## BT_3983 hypothetical protein - Prom 84735 - 84794 3.6 86 42 Op 1 6/0.000 - CDS 84865 - 85809 998 ## COG3712 Fe2+-dicitrate sensor, membrane component - Term 85829 - 85884 12.8 87 42 Op 2 . - CDS 85902 - 86492 785 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 86514 - 86573 2.4 - Term 86642 - 86691 8.1 88 43 Op 1 . - CDS 86741 - 88030 1869 ## COG0285 Folylpolyglutamate synthase 89 43 Op 2 . - CDS 88046 - 89632 2138 ## COG0815 Apolipoprotein N-acyltransferase - Prom 89687 - 89746 2.1 + Prom 89644 - 89703 6.1 90 44 Tu 1 . + CDS 89727 - 91124 1994 ## COG0673 Predicted dehydrogenases and related proteins + Term 91149 - 91187 7.0 91 45 Tu 1 . + CDS 91233 - 92219 1288 ## COG0657 Esterase/lipase 92 46 Tu 1 . - CDS 92412 - 93371 1396 ## COG0451 Nucleoside-diphosphate-sugar epimerases - Prom 93424 - 93483 5.1 + Prom 93360 - 93419 5.7 93 47 Tu 1 . 
+ CDS 93468 - 94514 1266 ## COG3021 Uncharacterized protein conserved in bacteria + Term 94559 - 94601 -1.0 - Term 94648 - 94697 8.2 94 48 Op 1 1/0.167 - CDS 94740 - 95918 1873 ## COG0784 FOG: CheY-like receiver 95 48 Op 2 . - CDS 95977 - 97149 1853 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 96 48 Op 3 . - CDS 97146 - 98273 1704 ## BT_1178 hypothetical protein 97 48 Op 4 . - CDS 98278 - 99759 2124 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 98 48 Op 5 . - CDS 99740 - 100831 1267 ## BT_1175 hypothetical protein 99 48 Op 6 . - CDS 100839 - 101447 775 ## COG0110 Acetyltransferase (isoleucine patch superfamily) - Prom 101600 - 101659 4.1 + Prom 101541 - 101600 5.5 100 49 Tu 1 . + CDS 101637 - 102893 1421 ## Bache_3096 hypothetical protein + Term 102914 - 102953 5.8 - Term 102907 - 102937 4.3 101 50 Op 1 1/0.167 - CDS 103010 - 104128 1749 ## COG0438 Glycosyltransferase 102 50 Op 2 . - CDS 104125 - 105327 1757 ## COG1215 Glycosyltransferases, probably involved in cell wall biogenesis 103 50 Op 3 . - CDS 105324 - 106217 1039 ## COG1216 Predicted glycosyltransferases 104 50 Op 4 . - CDS 106214 - 107719 2176 ## BDI_1279 hypothetical protein 105 50 Op 5 . - CDS 107757 - 109964 3599 ## BF2061 putative transmembrane protein 106 50 Op 6 . - CDS 109968 - 110768 1255 ## BF2062 putative outer membrane protein 107 50 Op 7 . - CDS 110780 - 111946 1369 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 108 50 Op 8 . 
- CDS 111943 - 112311 573 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 112555 - 112614 5.2 + Prom 112441 - 112500 4.5 109 51 Op 1 16/0.000 + CDS 112664 - 113056 537 ## COG0784 FOG: CheY-like receiver 110 51 Op 2 1/0.167 + CDS 113068 - 115059 2739 ## COG0642 Signal transduction histidine kinase + Term 115106 - 115133 -0.8 + Prom 115292 - 115351 2.2 111 52 Op 1 27/0.000 + CDS 115432 - 116682 1916 ## COG0845 Membrane-fusion protein 112 52 Op 2 9/0.000 + CDS 116686 - 119781 4877 ## COG0841 Cation/multidrug efflux pump 113 52 Op 3 . + CDS 119778 - 121154 412 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 + Term 121171 - 121224 11.1 - Term 121335 - 121380 8.1 114 53 Op 1 . - CDS 121406 - 121732 392 ## COG0724 RNA-binding proteins (RRM domain) 115 53 Op 2 . - CDS 121764 - 122201 687 ## BDI_0811 cold shock DNA-binding protein - Prom 122387 - 122446 2.5 - TRNA 122540 - 122615 83.7 # Thr CGT 0 0 + Prom 122641 - 122700 5.1 116 54 Op 1 . + CDS 122735 - 124204 1803 ## COG1966 Carbon starvation protein, predicted membrane protein 117 54 Op 2 . + CDS 124185 - 124877 1023 ## gi|313159133|gb|EFR58508.1| hypothetical protein HMPREF9720_1488 + Term 125006 - 125049 2.0 118 55 Op 1 . - CDS 124982 - 125698 1287 ## COG2186 Transcriptional regulators 119 55 Op 2 . - CDS 125734 - 126315 886 ## COG1435 Thymidine kinase 120 55 Op 3 . - CDS 126344 - 127660 1982 ## COG0766 UDP-N-acetylglucosamine enolpyruvyl transferase - Prom 127871 - 127930 7.3 + Prom 127818 - 127877 5.4 121 56 Tu 1 . + CDS 127898 - 128065 215 ## PROTEIN SUPPORTED gi|34540457|ref|NP_904936.1| 50S ribosomal protein L34 + Term 128085 - 128121 4.9 122 57 Tu 1 . + CDS 128212 - 128709 209 ## PROTEIN SUPPORTED gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase + Term 128759 - 128792 3.1 - Term 128745 - 128778 3.1 123 58 Op 1 . 
- CDS 128863 - 129429 923 ## Bacsa_0770 hypothetical protein 124 58 Op 2 . - CDS 129442 - 131685 3629 ## Amuc_1410 putative phosphate/sulphate permease - Prom 131706 - 131765 2.9 - Term 131834 - 131868 5.2 125 59 Tu 1 . - CDS 131920 - 134076 3409 ## COG0480 Translation elongation factors (GTPases) - Prom 134185 - 134244 4.6 126 60 Tu 1 . - CDS 134246 - 135370 1880 ## COG2311 Predicted membrane protein 127 61 Op 1 . - CDS 135514 - 136272 1053 ## COG2107 Predicted periplasmic solute-binding protein 128 61 Op 2 . - CDS 136314 - 136547 456 ## gi|313159077|gb|EFR58452.1| hypothetical protein HMPREF9720_1501 129 61 Op 3 . - CDS 136540 - 137319 672 ## gi|313159117|gb|EFR58492.1| conserved hypothetical protein 130 61 Op 4 . - CDS 137321 - 138712 2155 ## COG0006 Xaa-Pro aminopeptidase - Prom 138806 - 138865 3.7 131 62 Op 1 . - CDS 138911 - 139978 1359 ## COG1619 Uncharacterized proteins, homologs of microcin C7 resistance protein MccF 132 62 Op 2 1/0.167 - CDS 140052 - 141413 2053 ## COG0477 Permeases of the major facilitator superfamily - Term 141509 - 141546 10.1 133 63 Op 1 . - CDS 141626 - 143557 3377 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 134 63 Op 2 . - CDS 143562 - 143765 352 ## gi|313159043|gb|EFR58418.1| hypothetical protein HMPREF9720_1507 - Prom 143788 - 143847 8.6 - Term 143889 - 143922 2.2 135 64 Op 1 . - CDS 143969 - 145189 1751 ## COG0826 Collagenase and related proteases 136 64 Op 2 . - CDS 145215 - 146189 1597 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 146281 - 146340 3.0 + Prom 146137 - 146196 3.1 137 65 Tu 1 . + CDS 146307 - 147311 374 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Term 147371 - 147410 10.0 - Term 147501 - 147548 17.0 138 66 Op 1 . - CDS 147607 - 148545 1732 ## Odosp_0756 hypothetical protein 139 66 Op 2 . - CDS 148622 - 149419 1333 ## Cpin_1245 hypothetical protein 140 66 Op 3 . - CDS 149462 - 151600 3202 ## COG1200 RecG-like helicase - Term 151621 - 151670 -0.4 141 66 Op 4 . 
- CDS 151690 - 152604 958 ## COG0491 Zn-dependent hydrolases, including glyoxylases - Prom 152624 - 152683 2.4 + Prom 152637 - 152696 2.3 142 67 Op 1 . + CDS 152862 - 154343 1953 ## COG0821 Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis 143 67 Op 2 . + CDS 154340 - 154942 643 ## PGN_1944 hypothetical protein - Term 154790 - 154828 2.0 144 68 Tu 1 . - CDS 154866 - 156896 2442 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 156923 - 156982 4.0 - Term 157170 - 157211 9.3 145 69 Tu 1 . - CDS 157351 - 160386 3324 ## BT_1502 hypothetical protein Predicted protein(s) >gi|313159013|gb|AENZ01000021.1| GENE 1 1 - 1717 874 572 aa, chain - ## HITS:1 COG:VC1353_1 KEGG:ns NR:ns ## COG: VC1353_1 COG3292 # Protein_GI_number: 15641365 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Vibrio cholerae # 40 565 44 563 675 83 24.0 1e-15 MHRRFLISRFFIVCIGFVLSEIQIVAAESFPQVRFTPLPSDILPSNEVRKLYQDSDGYIW IPTYNGLARYDGYGAITYGMRDVSNGLFNTFVNVVAEDHDKNLWIGTEHGLFRLDKVSGN IVADEYPELADCNIAVILCDTGNGIWIGGDKGLFRKNALDRNFHPVPISNSAGRPVKAVT SIIKDDKLNLWIAAFDQGLLRYDIREDRAYACDDAVLRKAHVLARDVAGNIWVGTWGAGV VRLVNPLAPGPTRYVHYKHVPGRTHSLLDDIIYDIEENPEQNTIWIGGRSGLSILHDIDN PDSFQNFFPGDNVGDLPYNEVNSILRTRDGLMWIGLLGGGVCKVQTSGTKFESDRLEPIR TRYNTSSVRSMYYAGNGDFWFGLLDFGLIKYNIRSGKIVDYHEHPDLKSLPYTSTVNTII RRSTTGELYFGTQNAGIWVYNETQHKVRQINHFNQPNFLDDCVIALCEDTHGNLWIGSRL GIYVESTDGRFHTAAEWLGYATPFDQTYVFDICCDKAGDVWIASNGQGILHIRTADGTWR QYTRDNGMISDHVYCLQADDTGCIWAGTFADG >gi|313159013|gb|AENZ01000021.1| GENE 2 2035 - 4269 1319 744 aa, chain + ## HITS:1 COG:Cgl0317 KEGG:ns NR:ns ## COG: Cgl0317 COG1472 # Protein_GI_number: 19551567 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Corynebacterium glutamicum # 40 510 14 487 548 245 36.0 2e-64 MLPILCACGRKWTVEPHDTYCLIRQDGGQTLGYFPGSGVRILYSDGYAFKDLNRNGILDC YEDWRYTPEERAEDLAKRLSVEEIAGLMLYSSHQAVPTDSVGYWSSTYNGTSLRESGLPH 
SAVSDKQRKFLRDDNLRAVLVVRVESPRIAAEWNNNMQAFVEGLGQGIPVNISSDPRNET RAWAEYNAGSGGKISLWPSPLGLAATFDPELVEKFGRIASAEYRALGIATALSPQIDLAT EPRWNRFYGTFGEDPDLDTDMARAYIDGFQTSVGAAEIADGWGYESVNAMVKHWPGGGPE EGGRDAHFSFGKYTVYPGGNFEQHIRPFVEGAFRLNGKTRKAAAVMPYYTVSHGVDPSGK KVGNGFSKYIVTDLLRNRYGYNGVVCTDWGITHDYSSIEEADGKCWGVEALSIPQRHYEA LKAGVDQFGGNNDKGPVLEAYQMWTAEFGEKSARERFERSAIRLLLNIFRTGLFENPYVD PDRTEATVGNPEFMQAGYEAQLRSVVLLKNRAETLPASTRRSVYLPECGQSRLPVDTSLV KQYYELAESPENADFAIVFIDEPDGGCGYDAADREKGGNGYVPISLQYGDYTARHARPES IAGGDPKEPSIDRSYRGKTVRSSNREELKRVLGTKRQMGDKPVVAVVSTTRPFVPAEFEP SADAILLAFGVQRQAVLDIISGRREPSALLPMQLPADMKTVEQQHEDVPRDMVCYTDSEG NTYDFAFGLDWKGVIRDARVEKYK >gi|313159013|gb|AENZ01000021.1| GENE 3 4282 - 6219 1729 645 aa, chain + ## HITS:1 COG:SMb20631 KEGG:ns NR:ns ## COG: SMb20631 COG3533 # Protein_GI_number: 16265291 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Sinorhizobium meliloti # 290 515 334 550 640 88 28.0 4e-17 MKKLFSIIALCLVGTATNAQAPYAQFQGGMIGCIEPQGWIEEFLHRQQTGLTGHPEAMSY PYDSCLWAGEIGRNTETYGSDWWRYEQTAYYTDGLLRLGYLLGDREMIAKAEEGIRYTLA NASSTGVLGNKAIESMWPMCVYFRVLQAYYERTGDPAIPAALERHYMNFTQEQVEKWRNI VSIEGMLWTYGKTGNAKLLDICERAYNGGKFGDLTPAVAAGDERFVMHGVTCMEELKLPM LLYAYTGKRYYLDLALNAERKLTRDHMLPDGVPASAEALVGNGNVINSHETCDISDYTWT LEQFLLTTGEVRWADKIEKAVFNAGLGAVTKDFRSLQYFSSVNQVIATGRSNHNEFFHGS TWMAFRPTHETECCAGNVHRFMPNYVAHMWLRGKDGSIAAALYGPSAATFDLPNGRQCHI AQRTSYPFDGEIEFSFGLKERTDIPFLLRIPAWCRDAKIYVNGKLWRDACPAGTFVTLRR KFRNGDRIRLCLSMQPVMNTVPGQGIYVQRGPLLFSYPVPQRKTADRTVYANMNGKVPGN PEFECWSIEPAGPWNYALCSDPVIPLKVIRTKPAAAGSYPFDPEHTPVKISVPVKPIDWE LEKGRYTPRLPAEGIARAVSDRIEYLELIPYGCTELRLTVFPQCN >gi|313159013|gb|AENZ01000021.1| GENE 4 6240 - 7961 1063 573 aa, chain + ## HITS:1 COG:no KEGG:Cpin_1088 NR:ns ## KEGG: Cpin_1088 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 30 571 26 560 560 275 30.0 5e-72 MNISPLRLLVNRTIRTLCPALLMLPVACSESGLTIRVIDPLTNVLPGIVYPPAGDDTVRV ARGENAVLQFVISADDSVAAMTPVLRSLKTEQGTSLDKAVLGWVRNVQASHAYVPAAPDA 
LQSPSGEYPDPILTDTTVSIAVGGTALLWLDIPIPADAEAGLYEGSVRISGLKNGKRIVA DRQFTIQVYPVTLPEQSLLVTNWYFPDKFSFMNDNESVEDDSPAYWECMRRLVETASAYG QNVWLLYETGTPVPTADGKGLTFDFSRMDKTIEFLLRHADVRLIEANHFAKRSRNGWTDP FWANVPVPDGKGSYVYQRLPHDDPRVQRYIAAYFPALQEHLRSRMIDDGSGRSWLDIYTQ HIADEPLNENKTSWEGLARQVKQAAPDIRIIEAYRSSSYDPALIDILVPQLDEFVWEIYR TMPAGHSCWFYTCMYPRGNFANRYVTLPLIKTRLLHWINYKYDSPGYLHWGFNAWGANGD PFGDVSAPANDWPGGDSHIVYPGYRKLYPSIRLAAMRDGIRDYDLLKMVEARDSIRAQAF VNAIIFDFDRYDTSVSRFRQIRREILDFLEDMH >gi|313159013|gb|AENZ01000021.1| GENE 5 8499 - 8951 213 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159017|gb|EFR58392.1| ## NR: gi|313159017|gb|EFR58392.1| hypothetical protein HMPREF9720_1376 [Alistipes sp. HGB5] # 1 150 1 150 150 276 100.0 4e-73 MDSLKETDNLATKPSKRPRIEVDEELMRQMIAGQAPLDSKVIRRIPEPEAENTDAPEGKT SETVSGASAPTAERTDVNTQTSDVKELAGFRRKKIPLPDFERTFFAPADCRNRSAIYISA ATKYKVSTILHLLGMRIQGLQLWSTICCAL >gi|313159013|gb|AENZ01000021.1| GENE 6 9246 - 10517 849 423 aa, chain - ## HITS:1 COG:no KEGG:BF1833 NR:ns ## KEGG: BF1833 # Name: not_defined # Def: putative bacteriophage integrase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 408 1 399 410 310 41.0 1e-82 MRSTFKVLFYLKKDKHKVQPVVPVMGRITVNGTIAQFSAKLSVPPHLWEVKGGRAKGKSL EADRINRYLDNIRIQIGKHYQSICDHDGYVSADKVKNAYLGFSKRYKLLLELCDEFCKEY KNRIDVDRTIHSLFRYQTLRRDLSLFICQDYKVKDIPLVELDQSFAEKFAAYLKHVKGLA DTTISVEIKSLKHIVKKAFNDGQMEKNPFAYYYYFADQPEIEYLTEEEINKLIIGKVKQQ RQDRTRDMFLFCCFTGLSYADLAKLSYEELKQTPNGAWWISSIRQKTKVPFTVKLLPVAK AILEKYRIPANRFNRLFPENPGKVFPVASLKSSDASLKHIARQCGIAKNLKFHTARHTFA TTVSLMNGIPLETVSKMLGHKYTTTTQIYAKVTNQMIGNAISRIEDKIGDRFQFPTLKEE SDG >gi|313159013|gb|AENZ01000021.1| GENE 7 10803 - 11729 1456 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159052|gb|EFR58427.1| ## NR: gi|313159052|gb|EFR58427.1| hypothetical protein HMPREF9720_1379 [Alistipes sp. 
HGB5] # 1 308 1 308 308 582 100.0 1e-164 MQGFDLTESGSIYYSQVGSDGATLNICRAAGPGLNAQTDYMILKYFGHGTQIVAEEASDG KTYIWLNSNASVDKSGEYGDNWSVSRVEFVPGATSDAGYAGETFFLNKDGQYDQQVSIDF GARRLLIGSRRSGVRYFWIFDLDEALALPLKKMTATVTVGSAGSEPVTREIEARDLNDCR VLGGFSVPAGSDKENDVYSYSHQGHEVAGNYVYFYEGNAVETGADSFESKAYVTVFNYSG KIVVPRTEVAAVADKAGLAAAGLTTTGYAEGESLKVRDGKLYLGVACRDGSSGNRRANIL VYDCVQGE >gi|313159013|gb|AENZ01000021.1| GENE 8 11899 - 12795 1315 298 aa, chain - ## HITS:1 COG:lin2069 KEGG:ns NR:ns ## COG: lin2069 COG4974 # Protein_GI_number: 16801135 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Listeria innocua # 18 297 11 295 297 215 40.0 9e-56 MAENTKKWEETGRRYRTYIKLEKRLSQNTVESYMRDLRQFAHFILRQYDVAPRKVEGTMI ERYMAWLYDRGREKTSQARCLCGIRSFFNFLLVNDQIETSPAEFVDTPKFGRPLPDILTT DEIDSIIATVDMRSTKGLRDSAMLEVLYSCGLRVSELTSLRLSDLFFGEGYIRVIGKGNK QRLVPISNTARDKIQRYLEERRSARSGEETVFLNNRGGQLTRVMVFTILKRAVERAGIDK HISPHTFRHSFATHLLEGGASIRQVQEMLGHESILTTEIYTHLDSDHLRRTLEEHLPI >gi|313159013|gb|AENZ01000021.1| GENE 9 12865 - 13692 513 275 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|228472988|ref|ZP_04057745.1| ribosomal protein L11 methyltransferase [Capnocytophaga gingivalis JCVIHMP016] # 1 274 1 275 276 202 42 1e-50 MNYIALNIAFSEDEQAEILTAELADYPFESFETEDGTLKAYIPQERLADCKAGVDALLAR YGVQGRYIAIETQNWNAVWESNFPAVDVEGRLLIRAPFHDPAPEGVMEVVVMPKMSFGTG HHATTWLVSRAVLDLGVAGRRGLDMGSGTGVLSIVAAKCGAEHVDAVDIDDWADANCREN IAANGVADRITPMLGDVRRIAGRHYGFILANINRNILLADMPVYAAALDAGGDLVMSGFL EPDVPAITACAEKLGLRPVATAVKEGWVTVHVRKE >gi|313159013|gb|AENZ01000021.1| GENE 10 13754 - 14422 765 222 aa, chain + ## HITS:1 COG:no KEGG:BT_3431 NR:ns ## KEGG: BT_3431 # Name: not_defined # Def: DNA repair protein # Organism: B.thetaiotaomicron # Pathway: Homologous recombination [PATH:bth03440] # 1 222 20 241 242 152 40.0 1e-35 MVVYLLTDAGGRRSYMVQGVRSRNGRGSKLALFQPMFPVEFEGLESPRQQMHRFKEVRGG FVLQSLPFDVRKSTMALFMAEVLYRLVREYDEPNEALFDFVWNCVGALDSMDEGVANFHL WFLANLSRFLGFCPGNEYASGDWFDIREGLYTKTRPAHVSYMNRECACMLRDFLECDVRC 
LSEIGLNRGQRVEFLNAVLVYFGYHLDAISAVQSVRILREVF >gi|313159013|gb|AENZ01000021.1| GENE 11 14454 - 14864 463 136 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514730|emb|CBK63940.1| ## NR: gi|291514730|emb|CBK63940.1| hypothetical protein AL1_15140 [Alistipes shahii WAL 8301] # 29 136 24 121 121 151 76.0 2e-35 MKRMFMIAVALFAAFSLSAQEPGRFRLVRPDSTRTGSFRLPDSLRVEMSVPDSLAHMSPW KTDFRTAAPMKVKPAVSMTSVVVIQKENLPSRVTVIDNNSLRLGSHFNLSNGQAWNWSPF PDAFLDARTLSFPMPR >gi|313159013|gb|AENZ01000021.1| GENE 12 14893 - 15348 -456 151 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRARRWRQRRRGHRGRFAHRDQCGHRDQCGHRDQRGHRDQRGHRDQRGHRGYRGRCEHRN CLAHRRRERRDRNKRQRRGNPPPKYEISLGCSSRHAVLHPQARRFGHRSERCASTSCAPT TDRSVRRQHRQPRYARIRTDTPTKSLQNRAG >gi|313159013|gb|AENZ01000021.1| GENE 13 15599 - 16645 1308 348 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3332 NR:ns ## KEGG: Odosp_3332 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: O.splanchnicus # Pathway: not_defined # 31 332 19 351 373 151 32.0 4e-35 MRLILPVLLSFGLFCASYAQQPDTLCTATVGELSAEVAALKAKTSAWDKVLAALPAISGY LQTGYEWSDNSSSFFIKRVRLSLSGNIAPKLDYRVQIEFASPKIVDAYLCYRPFDELNLQ LGEYKLPFSIENTEYVPLRYEFIEYPLSLCKLMGFNDLCGLSATGRDMGAQLFGGFFRRD GYSILNYNIGVFNGEGLNVRDKNKSKDIVARLTLKPAAGLQLAGSYYWGEYGADYLKRVR YGAGACYDRGPVVVRGEYICGTTGDLDSEGWYAVAGWRVTKTLLPAVRYDTFLENTARSS SRQTNYTAALTWQPVKYLRCQLNYTYEDYAARDVSGRNVVALMLSGIF >gi|313159013|gb|AENZ01000021.1| GENE 14 16667 - 17953 1915 428 aa, chain + ## HITS:1 COG:YPO1307 KEGG:ns NR:ns ## COG: YPO1307 COG2855 # Protein_GI_number: 16121589 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Yersinia pestis # 75 311 22 261 362 69 25.0 1e-11 MKKWMEQFRVEDWVVVWVSIPLLALAAIVPAGLPKVPATLLGGAAWSNIAYLFAVVLAVL YVGCLLLRRPLRGLLPSLAVVFAVSLLAQIVAKIPAVSYYGFESVFFSVLFGLVIRNVWR VPEWMKPAIQGEFYIKIGVVCLGATILFSDVMKSGVFGLAQACLVVAVVWFFAYWVSRRM KVDERTAMILSSGVSICGVSACITAARVAGGDDRKLSYIVSLVLIVVVPMIYLMPWLANT ILPLIFDDPHVVQEVAGAWIGGTIDTTSGVAASSTIVGEVANQHAVIIKAAQNVLIGVVA 
FFIALYLSTRRGEKGAQAPSLGIVWEKFPKFIIGFVAASLVFSILQSNGLFTADAKGKLA EPGVAKMFSTVFFSLAFVCVGLDTRLKEIVSKENRNALWAFLVAQTFNIVVTLVIALVLF GVLKPMLA >gi|313159013|gb|AENZ01000021.1| GENE 15 17962 - 18264 99 100 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159027|gb|EFR58402.1| ## NR: gi|313159027|gb|EFR58402.1| hypothetical protein HMPREF9720_1385 [Alistipes sp. HGB5] # 36 100 1 65 65 122 98.0 9e-27 MAELVIDYPRLTLYRRIPEFYFYFCHTYSFLVVHILAPGTFSGAGRLQQSSLSQDRPTKL IIRSDKKAEKYAVRATKDVSPECMETKTPRQQEAGALTVL >gi|313159013|gb|AENZ01000021.1| GENE 16 18187 - 19398 954 403 aa, chain + ## HITS:1 COG:no KEGG:BVU_1527 NR:ns ## KEGG: BVU_1527 # Name: not_defined # Def: integrase # Organism: B.vulgatus # Pathway: not_defined # 1 401 1 401 407 438 52.0 1e-121 MAKVKVKFRNSTVEGKAGVIYYQLCHKCQSRQITTGIRLFPEQWDALRERAVVPFAGLDG EIAVVQRQIDGDLCLFRAIIGDFEARRVEYGLPDVVGRFRSFGRVTVFAFIEKQIACLRA GGRLGTARNYRRTLNSFAGFLNGADIPFSLLDEQLVSRYDDWLRRRRIVRNTVSFYMRVL RAVFNKAVREGIVSQSSPFRNVYTGVDRTRKRAVGEETVIRLRRLNLEHSPSLALARDIF IFSYCARGMAFVDIAFLRKQDIAGGAICYVRRKTGQRLTIRIEPCMGDIIERYAPVTRTS DYVFPLLNAEDPVRGFSQYQTALGYYNRRLKRLAELLGLEMPLSSYTSRHTWATTARNHN VPLSIISAGMGHASEKTTQIYLASLESSVVDQANRSIIASLYV >gi|313159013|gb|AENZ01000021.1| GENE 17 19861 - 20409 458 182 aa, chain + ## HITS:1 COG:no KEGG:BF3925 NR:ns ## KEGG: BF3925 # Name: not_defined # Def: putative transcriptional regulator Updx-like protein # Organism: B.fragilis # Pathway: not_defined # 11 182 9 177 179 145 49.0 1e-33 MMNSPDTAAEWYVLRVTYQRELSTKEYLDKLNIENFVPVRVVRRRNSKGQFFRACEVAVH NYIFIRSTREVIDELKTYKLPMLRYVMHPQNGENQIMIVPEEQMRNFIAVAGNEDEQVLF MSPEEVALSKGDKVRITGGVFEGVEGLLMRVKNSRGKRVVVKIDGITAVATASIPSALVE KI >gi|313159013|gb|AENZ01000021.1| GENE 18 20459 - 21379 444 306 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159049|gb|EFR58424.1| ## NR: gi|313159049|gb|EFR58424.1| hypothetical protein HMPREF9720_1388 [Alistipes sp. 
HGB5] # 1 306 1 306 306 613 100.0 1e-174 MTPEYLKEALPSLLNADFAGATNVRLLWLARAYAALCGLVTFDSPAGEFGTRRIWLQRID TLFGVLRERCRRESDAALRCRMVHAMYTLVCGTMSGDVSKKKEVCFAMADELVRDCLGGN EKRSSLRPDESELLIRTGVCHCLIDLLYPGPDAGDEYLLLLKRQIAGWIAEMNRDGSWSG VSPDVALERIGVMNRYSYAFLDKTNDRAVNRSFEYFRNSLPVPEDAGNFDENYLYTLVRL YDAAVLGNAYDPDPHLAWRIARFMYDYGRTPFCSDDDRFCCVCCVVCHVAERIDIWQSAP TERYIA >gi|313159013|gb|AENZ01000021.1| GENE 19 21464 - 22636 824 390 aa, chain + ## HITS:1 COG:BS_tagO KEGG:ns NR:ns ## COG: BS_tagO COG0472 # Protein_GI_number: 16080606 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Bacillus subtilis # 9 308 9 297 358 123 30.0 6e-28 MATSLCYTIIIPFFIALLLVGWIHPRLVKIALLKNIVDNPDARKLQRTPVPVLGGVAVFF GVVIAIGCMSSVVDCSGLPVVIMAMMAMLYTGTMDDILSLSPGLRFVIEIVVVLLLIFVG GYCIDDFHGLWNIGRFSYWYAVPLTVVAAVGIINAINLVDGVNGLSSGYCIMACLIFGTL FFLSGEESMTILAVVSVGALIPFFLHNVFGKTSKMFIGDGGTLVMGVVMSVFVIEILQNG SRAAAYVDPNVGLVPFTLAVLSVPVFDTLRVMSTRILKGTSPFRPDKTHLHHMFIDLGCS HVATTLAILGVNMFVVLCWWSLEASGFSIAVQLYAVIAVSLLVTSGLYHFMQWHICRDTR FMRAMRRLGYKTHISRTGIFFWLQQVMDKV >gi|313159013|gb|AENZ01000021.1| GENE 20 22725 - 23609 1096 294 aa, chain + ## HITS:1 COG:NMB0062 KEGG:ns NR:ns ## COG: NMB0062 COG1209 # Protein_GI_number: 15675999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Neisseria meningitidis MC58 # 1 291 1 288 288 405 65.0 1e-113 MKGIILAGGSGSRLYPITKGVSKQLLPVYDKPMVYYPLSALLLAGIREILVISTPEDLPG FRRLLGDGSDYGVRIDYAAQPSPDGLAQAFFIGEDFLGDDSACLVLGDNIFYGSGFTGLL REAVRTAEEDGKATVFGYRVDDPQRYGVAEFDGEGNCLSIEEKPAHPKSNYAVVGLYFYP NKVVDVAKSIKPSARGELEITSVNQCFLQSGELKVQTLQRGFAWLDTGTHDSLAEASIFV EVIEKRQGLKIACLEGIAYRNGWITADKLRALAEPMLRNQYGQYLLKLLDEPEA >gi|313159013|gb|AENZ01000021.1| GENE 21 23829 - 24140 87 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159078|gb|EFR58453.1| ## NR: gi|313159078|gb|EFR58453.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 103 31 133 133 181 99.0 1e-44 MWFPLVGYGIVFMLLHNLFFRAHFYNPLTSHLYTRQDYFDCLKYFCSCVTPEQLLGALWF LRSLFIVSFLFMIGVWISKRLSDRYSDIILGGDFVRSGPLFCF >gi|313159013|gb|AENZ01000021.1| GENE 22 24220 - 24717 96 165 aa, chain + ## HITS:1 COG:no KEGG:MM_2099 NR:ns ## KEGG: MM_2099 # Name: not_defined # Def: hypothetical protein # Organism: M.mazei # Pathway: not_defined # 1 149 201 347 385 62 32.0 9e-09 MGRMFRKYQRYMPVNIWSIGVLLMLLLFLQYKNVTVAIASSIFPPLPVFYFASAVGCLFT YTLAVYIHNLPTLSRIMIYAGNASLAIMALHFLAFKVVSLLQILIYGYGIDYLSAFPVIP DRINIWWVPYVVCGVALPLLYTSVKQILVLQSGRLYGQLILKFKL >gi|313159013|gb|AENZ01000021.1| GENE 23 24714 - 25289 830 191 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 178 1 179 183 220 58.0 1e-57 MKVVPTAIEGVVILEPEVFGDARGYFFESYSQRRFDAEVRPVRFVQDNESHSRYGVLRGL HFQKGRYSQSKLVRVVRGRVLDVAVDIRRGSPTFGRHVAVELTEDNKRQFFIPRGFAHGF AVLSDEATFQYKCDNPYSPESEGAIAWNDPSLGIDWRLAPEDVILSPKDSAHPLLSEAGE LFDYNTNYYSL >gi|313159013|gb|AENZ01000021.1| GENE 24 26010 - 26870 1084 286 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 1 277 1 271 280 236 46.0 3e-62 MNILITGANGQLGRSLRRLGGVSPHNYLFTDVAELDITDAAAVLRTVEERRIDVIVNCAA YTDVERAEEDEPTAELLNHKAAGNLAAAAKATGATLFHVSTDYVFDGTAHTPYTEDGTPS PLGAYGRTKLAGERAVMASGCRYLIFRTAWLYSEYGNNFLKTMLRLTSERDTLQVVFDQI GTPTYAGDLALAIFSIIESERYAGNEGVYHFTDEGVCSWYDFATEIAAAAGHDSCHIIPC HTSEFPTKAARPAYSVLDKTKIKTTFQMDIPHWRESMIYCLKQIEK >gi|313159013|gb|AENZ01000021.1| GENE 25 26877 - 27635 44 252 aa, chain + ## HITS:1 COG:FN1241 KEGG:ns NR:ns ## COG: FN1241 COG3774 # Protein_GI_number: 19704576 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 
Mannosyltransferase OCH1 and related enzymes # Organism: Fusobacterium nucleatum # 1 108 1 111 243 85 41.0 1e-16 MIPKIIHFVWLGGGKYPPPVQDCIDSWKRILPDYTIKRWDETNFDIDSVPWVKEAIQMRK WSLASDYIRHFALYTEGGIYMDTDVKVFRPFDEFLSWDFFSSVEYHPTYFLSQGIHQIDD AGVVRRDGDIVAGLGLLAALIASKEANPFIKECLDYYGSRHFIQEDGSLYVDVINPGIMA MLATKYGFRYKDENQLLDGNMMVYSSSIFAGDPATRSKESYSMHYCDGSWREKTLKTRMK DYIKGILYKKRK >gi|313159013|gb|AENZ01000021.1| GENE 26 27654 - 28760 1388 368 aa, chain + ## HITS:1 COG:CAC2332 KEGG:ns NR:ns ## COG: CAC2332 COG1088 # Protein_GI_number: 15895599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Clostridium acetobutylicum # 3 368 2 350 351 404 52.0 1e-112 MQRTILITGGAGFIGSHVVRLFVTKYPDYRIVNLDKLTYAGNLANLRDIEERPNYTFVRG DICDFEAMRELFRQYGIDGVIHLAAESHVDRSIRDPFTFARTNVMGTLSLLEAAREHWNG NWAGKLFYHISTDEVYGALELTRPAGDPAGCESGGGPFGEEFFTEETKYDPHSPYSASKA SSDHFVRAYHDTYGMPTLVTNCSNNYGPYQFPEKLIPLFINNIRHRRPLPVYGRGENVRD WLYVEDHARAIDVIFHKGKVADTYNIGGFNEWKNIDLIRVIVKTVDRLLGNPEGASEKLI TYVADRAGHDLRYAIDSRKLKDELGWQPSLQFEEGIEKTVRWYLQNQKWMDDITSGEYEK YYQSMYKN >gi|313159013|gb|AENZ01000021.1| GENE 27 28844 - 28978 56 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MENNILTYNVAVFLTKLFRSVEVAKLFFNFHKHYIASTSYLLND >gi|313159013|gb|AENZ01000021.1| GENE 28 29964 - 30431 -91 155 aa, chain + ## HITS:1 COG:no KEGG:BT_0467 NR:ns ## KEGG: BT_0467 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 154 340 486 489 69 32.0 3e-11 MASTINTFSLPLATAANANGNIKKFQLYTGIFEIMNIPLSYFFLRIGFQPIVVYFISLII IVITLYVRLVVLKNLVNLDVSYFVRSVVFRSLITGVVAWGLSCVIANYLDSDNFFLILSY LILSIICTGLTIYIMGLDIGERQYVVNAVRKFIKK >gi|313159013|gb|AENZ01000021.1| GENE 29 30593 - 30754 110 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MINSNNSFHLIIIPYIILISNQNKIATSLFESILEIYYKPLIFLISQQQPISK >gi|313159013|gb|AENZ01000021.1| GENE 30 31004 - 31387 201 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159102|gb|EFR58477.1| ## NR: gi|313159102|gb|EFR58477.1| glycosyltransferase, 
group 2 family protein [Alistipes sp. HGB5] # 1 127 193 319 319 248 100.0 8e-65 MQNGLFFLDEVLVKRREHQSATVRIAEEWSKQEFGGDLYIYYAYRRFLNLKGFLEFCSEY YPEKRDIIFDLKRYYSAGKLRYEYLKNNKIKDWFKTAGYVKYIGMRTYILDGLHIFKSGY KTLKSKV >gi|313159013|gb|AENZ01000021.1| GENE 31 31477 - 31590 65 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MTTFILKFDTTTMKGMIFSQPKSTIRKIEYMWDITLI >gi|313159013|gb|AENZ01000021.1| GENE 32 31621 - 32592 211 323 aa, chain + ## HITS:1 COG:MTH340 KEGG:ns NR:ns ## COG: MTH340 COG2327 # Protein_GI_number: 15678368 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanothermobacter thermautotrophicus # 87 305 170 379 400 80 30.0 3e-15 MENMIRPRQYFSSRKIYKDADFILDIGQGDSFADIYGEKRFKWIYSEYKLAKKFNIPLCI LPQTIGPFNDAGLRKKAMGAVRSAKCVMVRDKQSADYVKSLLPNLDVTEIIDVAFFMPYE KKEFNKEYIHVGLNISALLWNGGYTMDNQFGLKSNYQCLIRGIVEYFLSKKDVKLHLIPH VVGGERGLENDYAVAYDIYEEYCNENLILAPLFFDPIVAKNYIAGMDFFMGARMHSAIAA FSSEVAVYPMAYSRKFNGLFLETLDYPIMGDMKVKDERDILLDIKKAYFQRAELKNIIRN RMATVVAERYNLLMENLMTFFKI >gi|313159013|gb|AENZ01000021.1| GENE 33 33198 - 33554 142 118 aa, chain + ## HITS:1 COG:no KEGG:Mfla_2019 NR:ns ## KEGG: Mfla_2019 # Name: not_defined # Def: nitroreductase # Organism: M.flagellatus # Pathway: not_defined # 4 112 230 336 340 76 33.0 4e-13 MDNQLGSQGWCDNASVLICVTVNCNYFGGNYERYQALIDGGLYAMNFVMGLHLNHIASCF KMFIRTPRREKEFKKIAKIPQCEMPVVLILGGHYKSGIVTSPKSERFTFDELACVDNC >gi|313159013|gb|AENZ01000021.1| GENE 34 34409 - 34615 226 68 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKSLISKMHRQITKAKNQILLNIRHNGLNRIHSCKSRGIATFFPLSSFNQRIDKFIIKRK ENKSSNQR >gi|313159013|gb|AENZ01000021.1| GENE 35 34922 - 35815 246 297 aa, chain + ## HITS:1 COG:CAC2321 KEGG:ns NR:ns ## COG: CAC2321 COG1216 # Protein_GI_number: 15895588 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 2 274 5 275 298 125 32.0 7e-29 MDVSVIIVNYNTLGLTSDCIESIIAQTSTVEYEIILVDNASTDGSKEVFAQDKRIKYIYS 
DRNLGFGRANNLGIREAKGRYLFFLNSDTILLNNAVKLFFDFCEKNPDRKIGAVGAVLKD QNQKNIHSYGRFITPRGEIAEVLGKYLRFLKKRNHLCPASVQQPTPVDYITGADLFVPRM VCDELGTFDPVFFMYCEEVDWQFRMSKAGYERLLINGPEIIHLEGGSDPSNTKLWSFNRI ENIFRSKLLYIRKHYNYFVFWSYRLCNAVLWLPILLMRKDTPDQQKRLLRVLLFTNR >gi|313159013|gb|AENZ01000021.1| GENE 36 35910 - 36335 113 141 aa, chain + ## HITS:1 COG:mll2311 KEGG:ns NR:ns ## COG: mll2311 COG0110 # Protein_GI_number: 13472117 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Mesorhizobium loti # 69 128 98 157 196 73 58.0 8e-14 MTGFFRAKIYKLLGVSIEKGVVRIGRVSIDTIHPEDIFIGKGTTITDGCILLSHYYDVRN LKEHAYYRGEIHIGRNVYIGSNAIFTKPVTIGDGAVIGAGSVVNKDIPPYQVWAGVPVRF ICNRYQDESEIPVDTNAFKPK >gi|313159013|gb|AENZ01000021.1| GENE 37 36844 - 37065 119 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MQHGRFIAVSKQLLAAINSPRKKNMKKWTARLNANTPLISPTGIKNKPAAQHKQSKDNQL FFNILLYTSNKEK >gi|313159013|gb|AENZ01000021.1| GENE 38 36899 - 37426 116 175 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159144|gb|EFR58519.1| ## NR: gi|313159144|gb|EFR58519.1| putative acyltransferase [Alistipes sp. 
HGB5] # 1 175 147 321 321 299 99.0 5e-80 MLCLCCAAGLFFIPVGDMSGVFAFNRAVHFFIFFFLGLLIAANNCFETAIKRPCCIAICL ILYVAMFWVGGVEEDVVTLVMSLSGCLLCWGLAGKFDRNLSNNVFVSFRNYTYQIFLLGI FIQIGIKMIYNKLGMQGTYPYFYVLCVLSGIYIPVLVSKVIQRIDNKYLNMLIGL >gi|313159013|gb|AENZ01000021.1| GENE 39 37727 - 38536 307 269 aa, chain + ## HITS:1 COG:MA3755 KEGG:ns NR:ns ## COG: MA3755 COG0438 # Protein_GI_number: 20092553 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 9 234 126 358 384 61 26.0 2e-09 MYIVSLAKKSIIDIHEAIAQNADDNAGLKRCFRKVYSERFATVIVHSQRTNDLLEEYGYR GNRLFVPHFKYCFSKNYDIRRLGPDIVSAIAEDKVNILFFGNITYDKGIDILISAVNSLE PELRCKANVIIAGKDFDGTVHRVQPSDKAAFKIVLRHIEDDELVYLYEHTDYVALPYRKT SQSGILEMAFYFRKPILASDVPYFRRMLTEFPSFGILSGKDVPAYAETLASVIGLHSSAD YFTDEDYFRYTNRSEIEAFKKQFAIWIKR >gi|313159013|gb|AENZ01000021.1| GENE 40 38538 - 39611 675 357 aa, chain + ## HITS:1 COG:L1389 KEGG:ns NR:ns ## COG: L1389 COG0438 # Protein_GI_number: 15672182 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Lactococcus lactis # 2 357 3 378 402 170 33.0 3e-42 MKKVAIIGTVGVPAKYGGFETLVENLIGECCSEDVDYTVFCSGKAYPTRLKAYKNARLEY VPLHANGMQSIPYDIWSLIRTVGRGYDVVLILGVSGCLFLPLYRLLFRKRIVVNIDGLEH RRDKWGKWARKFLRLSERIAVKFADVVIADNKVIRDYVTETYRKPSVLIAYGGDHAVIPV DLSRQREILASFGVGAKAYACTICRIEPENNCHLILEAFSKTNIPLVFIGNWNLSAYGRE LKAVYGKFCNIHIVDPVYDREQLFVLRKQCLFYIHGHSAGGTNPSLVEAMSCGAPILAFD VGYNRETTGHKACYWSTPEELAALAAESREDTSGSMERLANECYTWRKVVGQYEAIY >gi|313159013|gb|AENZ01000021.1| GENE 41 39702 - 40721 541 339 aa, chain + ## HITS:1 COG:no KEGG:Bache_1136 NR:ns ## KEGG: Bache_1136 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 261 17 276 375 124 35.0 7e-27 MQEPALDKALSDPEWSRVYGQAAQQGVLAVAWESVRRLSPDLQPGRGLRLQWAYGAERIV SRSKRQMLEAARFAALCTDAGCRLILLKGIGLARYYPFPLQRECGDIDILLPEGFEKGNE IARCRGMKVSGLDYKHNHICSNGVMFENHRYLTSFKGKSDIRRLERIFQDAVKGGDFVPF 
GEDRFHTLFPTVNALYITYHGFFHFLIEGITLRHLLDWALFVGKEQNAIDWNYFYRICDD LGFSRFADTMNAIAVDVWGVPLTNPGVSFDRRCMDRVLADVLANKRHVSGLPVYKRRAKL VCNMLSSRWKYRQLYGRSLFGVMVRSVFGLLFDASPELK >gi|313159013|gb|AENZ01000021.1| GENE 42 40760 - 40918 160 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYVVVWSVFLVLFYMLLWWLIRRTPVTGDDDKEVSRISRLLKRVGDKWGVK >gi|313159013|gb|AENZ01000021.1| GENE 43 41059 - 42915 2212 618 aa, chain - ## HITS:1 COG:PA4928 KEGG:ns NR:ns ## COG: PA4928 COG1032 # Protein_GI_number: 15600121 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Pseudomonas aeruginosa # 2 572 23 652 747 497 41.0 1e-140 MFLPTTIKEVRARGWEQLDVILFSGDAYIDHPAFGAAVIGRLLEAEGYRVAIVPQPNWRD DLRDFTKLGAPRLFFGVSAGSMDSMVNHYTANLRLRSNDAYTPGGKAGFRPDYAVKVYTQ ILKRLYPHVPVVVGGIEASLRRLTHYDYWNDTLKPSVLAESGADLLIYGMGERVVQQVAK AMRNGYNAKLLRKLRQVAFLADDGYVQRLDPAETIRLHANEECVRDKRAFGENFTIIETQ SNLMEPTATLIEAVGDRYVVVTPPNTTLSTDELDHSFDLPYERAPHPRYNGKGDIPAWEM IKHSVNIHRGCFGGCSFCTISAHQGKFINSRSERSILAEVERVTRMPDFKGYLSDVGAPS ANMYRMGGRDRELCRKCRRPSCLHPKMCPNLDNDHRPLLALYEKIRAVKGVKKAFVGSGI RYDLFDESPYLDTVVKYHTSGRLKVAPEHTEDAVLKLMRKPPFALFERLTADFQRICRRE GLPYQLIPYFISSHPGCTERDMRSLSEKVLGKLHFNLEQVQDLTPTPMTLSSVMFYTGEN PYTHEKVYVARSQEEKRRQKSYFFNEKPTGNPTAPGRRTSAGQERAPKNHRPNGDRTSAP RPRKGENPRAIPGIKRKR >gi|313159013|gb|AENZ01000021.1| GENE 44 43181 - 44269 1698 362 aa, chain - ## HITS:1 COG:BH2542 KEGG:ns NR:ns ## COG: BH2542 COG0564 # Protein_GI_number: 15615105 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Bacillus halodurans # 44 348 17 300 305 265 46.0 1e-70 MADERYITEDPELDDGQTADEDGEGAGLYEHFAVVADKGQAPLRLDKFLTVRMEKCSRNR IQAAADSGSILVNGKAAKSSYKVKPLDRIQIVMPYPRREVEIIPENIPLEIPYEDDDLLI VNKPAGLVVHPGHGNYSGTLVNALTYHLRNLPLFQEGDMRAGLVHRIDKNTSGLLVVAKN EQSHARLAKQFFDHTIGRRYVALVWGNFEEDEGTITGNIGRSPRDRQKMFVFEDGSDGKH AVTHWRVLKRYGYVTLVECRLETGRTHQIRVHMSWQGHPLFNDERYGGDRILKGTTFSKY KQFIENCFAVMPRHALHAQSLGFVHPMTHEAVYFESELPDDFRALLEKWDIYAAASKESN NG 
>gi|313159013|gb|AENZ01000021.1| GENE 45 44269 - 45234 1588 321 aa, chain - ## HITS:1 COG:no KEGG:Palpr_2074 NR:ns ## KEGG: Palpr_2074 # Name: not_defined # Def: PASTA domain containing protein # Organism: P.propionicigenes # Pathway: not_defined # 7 274 6 270 277 110 29.0 9e-23 MEKLTYYWKKLRQNKLAYNLVLIAAIILAMAVTAHFVMQVGTRHGARRIVPDFSGVKLDD AQRTARKYDLELHINDSLFVPAYEGGIVLDQLPEGGVEVKPGRTVYITINSFRQKMVPVP YVAGRSLRQAKNMLEIAGLEIGELIYRADMATNYVLEEYCEGRPVTQNSRIQAEMGSGVT LYVGVEGGFGSTVVPRLVGFPLKEAKGRLWELGLNVGKIDFDEGINLLNQKDARVYVQVP TAERSAALGSRVDLKLTLDEKKLAQHRATAEKQAREAAEERLRLERERADSLAQAEFEQA AAAGEEQPEILPATDNDGFFD >gi|313159013|gb|AENZ01000021.1| GENE 46 45292 - 46374 1533 360 aa, chain - ## HITS:1 COG:aq_648 KEGG:ns NR:ns ## COG: aq_648 COG1060 # Protein_GI_number: 15606071 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Aquifex aeolicus # 7 354 20 364 371 293 45.0 5e-79 MQQIDDIAAEVRRGARIGDAEAARLWREAPLWLLGELATECKRRVSGDKVYFNRNFHIEP TNRCVFNCRFCSYRRPAGSPEAWDYTMEEIEQIARGRQGKGITEVHIVGGVHPDHGLEYY IDMIRRVKAILPEAAVKAFTAIELSYMIRKAGLSVEEGLRRLKEAGMEAIPGGGAEIFDQ EIRSRICPDKGSTDEWLEVHETAHRLGIPTNATILYGHVEGLAHRIDHLRRLRELQDRTG GFNAFIPLKYRNFGNPMSEIGEVSVIEDLRMLAMSRIYLDNVPHIKAYWVMYGKATTELA LAFGADDIDGTIDDTTKIYSMAGADDRRPSMSVEEMHRIVRAAGCRAVERDTFYNELPDR >gi|313159013|gb|AENZ01000021.1| GENE 47 46376 - 47479 1988 367 aa, chain - ## HITS:1 COG:FN1030 KEGG:ns NR:ns ## COG: FN1030 COG0795 # Protein_GI_number: 19704365 # Func_class: R General function prediction only # Function: Predicted permeases # Organism: Fusobacterium nucleatum # 253 365 249 361 363 70 35.0 6e-12 MKLRFPGFKILDRYILGKFLATYFFAIAMIIIVVVLFDYVEKIDDFTELHAPLRDVIFDY YLNFIPFFINQFSGLFTFIACIFFTSKMAYQTEIVAMLSGGMSFRRLMWPYFLGALIIGS LSLTLNLWLIPISQRHIVNFESQYIKRKQNNKFNRHIYRQIEPGTFAYIRGYNDASQQAS FLALEHYYSGTMTRSLEASDVKFNPETKCWTAPRYTKREFDSLGVEKFEQFRNLDTLINL EVTELGEINDLIQTMNITELNEFLDQQRAKGSDAINIIEVEKHARFAYPLSTFILTLIGV 
SLSSRKVRGGTGLHIGIGTGLCFSYILFNRFFEEFAKSGTLPPGLAVWLPNIIYLFIAVY LYRKAPK >gi|313159013|gb|AENZ01000021.1| GENE 48 47486 - 48616 1722 376 aa, chain - ## HITS:1 COG:aq_1308 KEGG:ns NR:ns ## COG: aq_1308 COG0343 # Protein_GI_number: 15606515 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Queuine/archaeosine tRNA-ribosyltransferase # Organism: Aquifex aeolicus # 2 376 3 378 378 385 49.0 1e-107 MKFTLQHKDPATKARAGELTTDHGVIRTPIFMPVGTAATVKGIFHRDVRDEARAQIILAN TYHLYMRPGMEIIEKAGGVHRFSTWEGPMLTDSGGFQVFSLAACRKLKEEGCHFRSHIDG SKHLFTPESVMDTERTIGADIMMAFDECPPGDAPREYAAKSLALTERWLERCFNRYHQTE PKYGHYQALFPIVQGCTYPDLRAHAAENVKQYDADGYAIGGLAVGEPTDVMYEMIEVVNA ILPEERPRYLMGVGTPVNILEAIERGVDMFDCVMPTRNGRNGQLFTAEGVINIRNKKWKD DFSPIDPEGTAFVDTLYTKAYLHHLVTCGEMLAAQIASLHNIAFYLRLVGMAREHIAAGD FKPWKDAMVAKLTRRL >gi|313159013|gb|AENZ01000021.1| GENE 49 48708 - 49862 1669 384 aa, chain + ## HITS:1 COG:no KEGG:BVU_3860 NR:ns ## KEGG: BVU_3860 # Name: not_defined # Def: glycosyl transferase family protein # Organism: B.vulgatus # Pathway: not_defined # 24 367 16 366 385 171 30.0 4e-41 MQFFDTLLTCYGWEGIALAGAVLAMFGVQLYYYIFVYGRIPGYKNNRRSAVLDAEPPVSV VVPLFSEDYSFVEERLPLMLAQSYPDFEVVIVYVGHDSDFYEDLVRLKQSFPQIVTTKIH LDPRFPISRKMALNVGIKSAHHEHMVFTSTDACPQTDRWLSLMAKGFTRGEIVVGYCGVE RKKGFSNYMMRAWRMMHSADWLARAVRRRAYRGTLHNFGFTKRIYFGANGFSHLNMNIGE DDLFMQKVMTRDNVSVILSPRASLREKTWGGMGWWMSQLRYYGSAFRFYPQTVRNYIQWE LGSRSLFFLTVVCAAAVMPFEFKIAALALLVIRFLIVAIEVRRIARRLGESGMMGRYFVY DLLSPLWALALSVMLMRKDDRVWR >gi|313159013|gb|AENZ01000021.1| GENE 50 49853 - 50416 971 187 aa, chain + ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 12 179 6 182 187 116 35.0 2e-26 MEIADYIVAEDRRLVELVLEGDDTAFEYLFNRYRDAIHRLFVQRLGGVNDADDLLQETFI KVYINLHRYSTDYTFGQWVYTIARNTFIDFVRRRQDDLSIDERFSSPPSTAPTPEESVIS LQQRTQIEHYLERLTPRYRTLIVMRFFDEYSYEEIAAKLTLPLGTVKTQIHRAREQMCKL IAQGDER 
>gi|313159013|gb|AENZ01000021.1| GENE 51 50444 - 51067 1003 207 aa, chain + ## HITS:1 COG:ECs4682 KEGG:ns NR:ns ## COG: ECs4682 COG0357 # Protein_GI_number: 15833936 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in bacterial cell division # Organism: Escherichia coli O157:H7 # 11 190 16 188 207 105 32.0 7e-23 MELILKYFPELTDCQRQRFAALYDLYADWNAKINVVSRKDFDQLYLRHVLHSLAIAKVCA FDAGARILDVGCGGGFPSVPLAILFPEARFTAVDSIRKKIAVVEGVTAGLGLRNLEPRCA RAETLPERYDYVVSRAVTAMSEFVNWVWNRLERGQRGTLPNGILYLKGGDLAEELALTGK RWDVYDIARFFEEEFFETKKVVYTPKK >gi|313159013|gb|AENZ01000021.1| GENE 52 51093 - 51332 181 79 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159104|gb|EFR58479.1| ## NR: gi|313159104|gb|EFR58479.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 79 1 79 79 154 100.0 2e-36 MKPTNPILVDLIARRLTEIREQHNHTKEYVLHNTGLGISGYENKVRFPSLESIAKFCKFY NISLEKFFAGITYPEEPQE >gi|313159013|gb|AENZ01000021.1| GENE 53 51415 - 51660 195 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159127|gb|EFR58502.1| ## NR: gi|313159127|gb|EFR58502.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 81 1 81 81 154 100.0 2e-36 MAKRKQRRDEELFKMVIQRIKNLRETHHYTQEYVNEYTGLDIPHLETGRDFPSLTTIAIL CKFYNITIVEFFSPIDYPAKE >gi|313159013|gb|AENZ01000021.1| GENE 54 51691 - 51921 78 76 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159108|gb|EFR58483.1| ## NR: gi|313159108|gb|EFR58483.1| hypothetical protein HMPREF9720_1422 [Alistipes sp. 
HGB5] # 40 76 1 37 37 64 97.0 2e-09 MKHTNDTLLRSVASRLKQIRIERGLTQEVVTDRTKVNVGLYEVGTTNITIVLLTVLCNFY NVTLEEFFKGIEYEKA >gi|313159013|gb|AENZ01000021.1| GENE 55 52352 - 52465 56 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRDFIVPLTRSQGRAARPYGINPGKRHGNARRPQAAG >gi|313159013|gb|AENZ01000021.1| GENE 56 52484 - 52729 120 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKITPGESPGANERTIRTAGTNRLQHRIRFMEGLNHGTKIVNIPNNRPLCRIKSRMQDRN NSVTQLSQIYEEFISAARTRL >gi|313159013|gb|AENZ01000021.1| GENE 57 52814 - 53683 891 289 aa, chain + ## HITS:1 COG:slr1117 KEGG:ns NR:ns ## COG: slr1117 COG0500 # Protein_GI_number: 16329224 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Synechocystis # 15 256 9 252 253 192 40.0 6e-49 MSNENNAIHGFDVNLICDFFLNTERQGPGNPEVTLKALSFIDNLTDESLIADLGCGTGGQ TMTLARHAPGRITGLDFFPGFIDRFNADARRLHLADRVKGVVGSMDALPFRAGELDLIWS EGAIYNIGFERGLNEWREYLKPGGYLAVSESVWFTDERPTEIHDFWVDAYPEIDTIPNKV AQIHRAGYLPVAAFVLPETCWMEHYFAPLAKARELFAAKYPGDSTAEGLMAFQRYEEELY RKYNEFYGYVFFIARKPNPRRTLCPGPMSNPGSTSCSGPAAVTCASSRR >gi|313159013|gb|AENZ01000021.1| GENE 58 53658 - 54074 657 138 aa, chain - ## HITS:1 COG:MTH1745 KEGG:ns NR:ns ## COG: MTH1745 COG0526 # Protein_GI_number: 15679737 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Methanothermobacter thermautotrophicus # 21 125 45 146 150 67 33.0 7e-12 MKKLIFVLLLALGAGAAQGQVKFETKSTDAVREMAIKSGKLVFIDLYASWCPPCRMMERE VFSRKDVGDFMQQRFVAAKYDTDKPTGKELMKRYGSGAIPLYLVFDTQGELLGRIQGASA PKEFMESVQKIIDGSKRR >gi|313159013|gb|AENZ01000021.1| GENE 59 54113 - 55219 1138 368 aa, chain - ## HITS:1 COG:RSc1212 KEGG:ns NR:ns ## COG: RSc1212 COG0820 # Protein_GI_number: 17545931 # Func_class: R General function prediction only # Function: Predicted Fe-S-cluster redox enzyme # Organism: Ralstonia solanacearum # 38 368 13 362 383 
255 41.0 9e-68 MRHPAQKSLFFTYFCAGKQLAGDGKMALSETLYGKTPEQLAAVCAELGMPRFAAKQLARW LYAKHVEDPMRMSDIAAAHRAKLAERFRPAFTPPARITESADGTKKYLYRTQQGAWIESA YIPDGERATLCVSSQAGCRMGCKFCATGRQGLQHSLTTAEILNQIVSLPERDSLTNVVFM GMGEPLDNTDNVLRALEIMTSEWGFGWSPTRITLSTAGVAPELQRFLDATKVHLAVSLHN PFHEERAAIMPVERAWPIAEVAAILRRYDFTHQRRVSFEYIVMSGLNDSPRHIRELCRLL DGIKCRINLIRFHKIPGSPFFSPDDEAMIRFRDTLTAKGIQTTIRASRGEDIQAACGLLS TAAAENGQ >gi|313159013|gb|AENZ01000021.1| GENE 60 55305 - 55988 1016 227 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2864 NR:ns ## KEGG: Odosp_2864 # Name: not_defined # Def: TonB family protein # Organism: O.splanchnicus # Pathway: not_defined # 1 227 1 229 230 156 41.0 7e-37 MEIKKSPKADLQNKRGLLLEIGLVVALALVIAAFAYTPKEHRIEKVDLNYGPVEEEITEI TRQDQKPPEPPKKVEVKVIADLLQVVTNDTKITTEVDFAEFDENTEVIQQVEVVEETIED DQPFLIAETMPSFQGGDLNTFRNWVQQNVKFPQIALENGIQGRVVLSFVIEKDGRLTNIQ VLQTPDRSLSEEAIRVLNKSPKWSPGKQRNQVVRVKYTLPVDFRVQN >gi|313159013|gb|AENZ01000021.1| GENE 61 56157 - 57374 1783 405 aa, chain + ## HITS:1 COG:PA4943 KEGG:ns NR:ns ## COG: PA4943 COG2262 # Protein_GI_number: 15600136 # Func_class: R General function prediction only # Function: GTPases # Organism: Pseudomonas aeruginosa # 17 358 8 337 433 271 45.0 2e-72 MEENKNKTPNAGTASDGRERAVLVAVVRDDQDPRQAAEFLDELEFLAETADIESVKRFTQ RLPQPSSRIYVGPGKLEEIAGYCKEQEIDVVIFDDELSPSQTRNIEKAMPCRILDRTRLI LDIFMSRAQTAYAKTQVQLANYEYMLPRLSGMWTHLERQRGGTGTRGGAGEREIETDRRI IRNRIAKLKEDLRKIDRQMAVQRSNRGSMVRVALVGYTNVGKSTLMNLISKSEVFAENKL FATLDTTVRKVVFDNLPFLLSDTVGFIRKLPTELIESFKSTLDEVREADLLVHVVDISHP QFEEQIDVVKQTLQDIGAGDKPVYLVFNKVDAYTYVKKDEDDLTPATRENLSLDDLKKSW IARANTPCIFLSARMKTNVEKFRTDLYGMVREIHAGRYPFNNFLY >gi|313159013|gb|AENZ01000021.1| GENE 62 57395 - 58051 910 218 aa, chain + ## HITS:1 COG:MTH1114 KEGG:ns NR:ns ## COG: MTH1114 COG0035 # Protein_GI_number: 15679125 # Func_class: F Nucleotide transport and metabolism # Function: Uracil phosphoribosyltransferase # Organism: Methanothermobacter thermautotrophicus # 10 215 13 211 215 139 39.0 4e-33 
MTLHTLNSQNTVLNKFIAQIRDKSIQTDSMRFRRNLERIGEITAYEISKELSYTPRVVET PLGEATVETIDDRIVVATILRAGLPYHQGFLNYFDDAENAFVSAYRKSTKDGKFTVKVEY ISCGDLEDKVLLLVDPMLATGSSLVLAYNALCEKGGTPKYTHVAAVIASEQGIDYVSKNM PRQATTIWAAAVDEELTSRSYIVPGIGDAGDLAYGEKI >gi|313159013|gb|AENZ01000021.1| GENE 63 58375 - 60213 2247 612 aa, chain + ## HITS:1 COG:HI1246 KEGG:ns NR:ns ## COG: HI1246 COG1368 # Protein_GI_number: 16273165 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Haemophilus influenzae # 28 562 35 578 647 157 26.0 8e-38 MVKKKLAFVFGVFCATVLLMAVQKPVFLAYYAADAAQASAGEWLGVVWHGLTLDSTVAGY VTALPLLLTLASLWVRLPERIWRRVLNVYFVLIAVLTAAIFAVDVELYRHWGFRLDSTVL IYLADPKEAMASIDFWLGVRQTLLAAAYAALMIWTYRRVVGLFDGEPLRRRAALPWSFGL LLLAGCDFLAIRGGTGASVANVSKVYFSSNMFLNHAATNPVFSFLTTLGDHTDYASEYPF FDETTRTAKFDALRGNAPASGPSERVLTKTRPNVVVVILESFARTIMDADVDGRPVMPNM RRLRDEGIWFENFFANSFRTDRGEVAVLSGFPAQTRMSIMKLPAKSRSLPSLARSLAREG YATSFVYGGDLNFTNQASYMYATGWQQLVWQRDLRFDTPPSDWGYDDAVMCDWFADRVIA QSGTGEPFLAGLLTLSSHRPFDVPFSRFDDRELNAFAFADDCVGRMIDKLKASPAWDDLL VVLVADHGYPWPRSLSYNEPLRHRIPMIWTGGAVAGPRVVEEYASQIDICATILGQLEVA HDDFDYSKDIFAPTPPRKFAYYTFNDGFGVVDASGEAVWDATSGKAVSETSPALMDVGRT MLQTTYTDIGNR >gi|313159013|gb|AENZ01000021.1| GENE 64 60290 - 61039 942 249 aa, chain + ## HITS:1 COG:no KEGG:BDI_2598 NR:ns ## KEGG: BDI_2598 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 223 21 247 249 275 58.0 2e-72 MYRIGRYDADDRMCDLVCDSYPVLLVMSRFGIALGFGDKTIGEVCRDSGVDTGTFLTVVN THIDGEAPDPGAVSVGALVGYLRNSHDYFLDFRLPGIRRKLIEAIDCSRNDVAFAILRFF DEYVAEVHKHMDYEEHKVFPYVEALLAGERPEGYSIDVFRRHHDQVEAKLAELKRIIIRY YPSGSTNELNGVLFDIFTCERDLASHNDIEDRLFIPAIKRLEEAGETPADVRSGDGARKR SSGAKGGVR >gi|313159013|gb|AENZ01000021.1| GENE 65 61036 - 61641 774 201 aa, chain + ## HITS:1 COG:BMEI1582 KEGG:ns NR:ns ## COG: BMEI1582 COG2197 # Protein_GI_number: 17987865 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response 
regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Brucella melitensis # 123 188 143 208 213 63 54.0 2e-10 MKALKLAVAESSAILRCGIIALLHRMPAANVDILEIGDISQLAAQLSRHRPDVLIVNPAS LGLCTPPQLRAQTGCEGMRCVALQLAMTDAATLGAYDEVLSIYDSEDCVRRKIVQPADES PREERQEPLSVREREIVVCIVKGMSNKQIADTLCISTHTVITHRRNIVAKLQIHSPAGLT IYAIVNKLVDLSEIRDTITEI >gi|313159013|gb|AENZ01000021.1| GENE 66 61648 - 62538 1206 296 aa, chain + ## HITS:1 COG:CC0303 KEGG:ns NR:ns ## COG: CC0303 COG1230 # Protein_GI_number: 16124558 # Func_class: P Inorganic ion transport and metabolism # Function: Co/Zn/Cd efflux system component # Organism: Caulobacter vibrioides # 16 254 75 313 361 220 50.0 2e-57 MGHDHHHDHTVSSLNKAFILGITLNVAFVVVEFAVGLYYGSMGLLSDAGHNLSDVASLLL AMLAFRLAQAHATPRYTYGYKKSTVLISLLNSVILLIAVGVIVAESIGRMMHPAPIEGGA IAWTAGVGVAINGFTAWLFMKDKDKDLNVKGAYLHMAADALVSVGVLVSGLVISWTGWTI VDPIVGLVVAAVIVASVWSLTRASLRLSLDGVPDGIRVDELERMMTAVPGVEAVHHIHVW AISTTENALTAHVVLTDLSRLEAVKRELKTRLDGAGVHHATLEFESVAENCCDFSD >gi|313159013|gb|AENZ01000021.1| GENE 67 62743 - 63003 260 86 aa, chain - ## HITS:1 COG:NMA0549 KEGG:ns NR:ns ## COG: NMA0549 COG0759 # Protein_GI_number: 15793543 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 26 86 36 96 96 91 62.0 4e-19 MQSGEFWKRSRSIFKRICALPLIALVKFYQLCISPFTPPSCRFTPTCSQYALEALRKHGP LKGSWLTLRRLSRCHPWGGSGYDPVP >gi|313159013|gb|AENZ01000021.1| GENE 68 62960 - 63328 175 122 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646897|ref|ZP_01719407.1| 50S ribosomal protein L34 [Algoriphagus sp. 
PR1] # 6 122 5 122 130 72 37 2e-11 MPDCSLPRTERLRSLGAVRRMFESGESGFIFPFRYVWFAEADEIPSAEVLFSVPKKFHKR ANKRNLLRRRTKEAYRLQKLLLHNGRPVNLDLALIYSSKEELSYKTISNAVRRILETVAE HL >gi|313159013|gb|AENZ01000021.1| GENE 69 63336 - 64196 1298 286 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2726 NR:ns ## KEGG: Odosp_2726 # Name: not_defined # Def: uroporphyrinogen III synthase HEM4 # Organism: O.splanchnicus # Pathway: Porphyrin and chlorophyll metabolism [PATH:osp00860]; Metabolic pathways [PATH:osp01100]; Biosynthesis of secondary metabolites [PATH:osp01110] # 1 246 1 248 253 230 48.0 5e-59 MKVKKVLVSQPRPAIIEKSPFYELSEKYKVEIAYKPFIRVVGVSLKEFRAQRVEILTHTA VIFTSRTTVDSFFHICEEARITVPETMKYICQTEAVALYLQKYIVYRKRKISFADGSFTS FIELIIKHKDEKFLLALSEPHKPELPETLAKLKMSFDPAILARTVAADMDDIALTDYGLL ALYSPSDVKTLVEKFGTENLPAVAVFGEGTLRAALDAGITVLANAPTPEAPSMVKAIDIY LNKVQKGEEIAPVELITDTQKEEFIRSQQHKLAKKSRARRPAEPRK >gi|313159013|gb|AENZ01000021.1| GENE 70 64233 - 65339 1129 368 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159109|gb|EFR58484.1| ## NR: gi|313159109|gb|EFR58484.1| putative membrane protein [Alistipes sp. 
HGB5] # 1 368 16 383 383 622 100.0 1e-176 MQEPGLASVREAATPAAQQAPADTSASAQGRGTDRSAAYPSTETAANTSDTAQETTRALS GTGDSARNITPRTALPAAPAPAQGADSLAADTLHSGPQAADSLPFFSLEGPLRGAADTLW RDAPADKVFGPASTLVPARILPAAPAPSLTENPVFQGFVLLLAATYAILLYRNLGDIRLL LTRVSHDTATGKRLMEDPGGSGFSHFLNVATAIGILFVGVMTVKYGDSLIPDSLIETLPQ GAVLALSLLATFACLCVGIFQRGVVRLAGAVTLSQPFVSQLMLLKRTYFTLGVIVTVPAL LLFALCPRGTGDVWFCIIVIELLATVFLYLRETLNLFLSKKISILHWILYLCTVEIFPIS LIGLLAAR >gi|313159013|gb|AENZ01000021.1| GENE 71 65381 - 66559 1993 392 aa, chain - ## HITS:1 COG:BS_ykvD KEGG:ns NR:ns ## COG: BS_ykvD COG0642 # Protein_GI_number: 16078430 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 172 386 282 501 506 84 29.0 4e-16 MKIRRDIFSFRNRVVVIIVGIVLGMVSLLYTNNMAKRLKEKEQHDVALWAHAMERVNRDV MGGSMEDPLIQDIVSNGNNIPFIITNENLEVFRSHLIPDKIIDHPDRLRRMIDKLTEENP PRTVRFWWSNDHYHIIFYGRSALLKSLYYFPYVQILVITAFIMLGFIAFRSSKHDEQNRV WIGLAKETAHQLGTPTSSLLGWIEYLRSQNVDQGAVEEMNKDLTHLMKIVDRFSKIGSET PLTPANINEIVGESVMYFRKRIPRNVTLDYNGLAIAPVQANINAALFEWVVENLMKNSLD ALQGHGSINVHISSDDKNVMIDVKDTGKGIPKGNWKRIFEPGFTTKTRGWGLGLSLSRRI VEDYHQGKIAVIESEIGKGTTIRITLKRIFEG >gi|313159013|gb|AENZ01000021.1| GENE 72 66569 - 68233 2570 554 aa, chain - ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 36 348 36 345 408 226 39.0 7e-59 MIRKGYRYLLTAAAAASAAALLTFAARNDFGLGRNMEIMVNMMRELSLSYVDPVDPDKLM EGAAEGMVRDLDPYTEFIPEEQMSDFELLTTGKYGGVGSLIRQKGDWVKIAQPYKGSPAD RAGLKIGDKILAIDGKDAKGFTTEQVSSRLKGEPGSKVKVTVEHLDGSTETVTLQRERIS IPGVPYTGWVADGIGYIRHSDFTEGCYEDMRAAVEKLRTEGELKGLILDYRSNGGGIMQE AIKILGMFVPKGTEVVSTKGRTEDSRQVYRTSSEPILPDLPLAVLINGNSASASEIVTGA LQDLDRAVVIGQRSYGKGLVQSPRPLGYNAMLKLTTAKYYIPSGRCIQAIDYSHSQEGSV KAIPDSLITEYRTRAGRKVYDGGGITPDIRTEPEYISRFAMTLYALGFIEDFGDDYMRRH PDQRIDIRNFSITDRDYAEFAEFMKDKKVPYESDTRRALKALKKAAEEDRFADLKEKFDR 
VEAELKDDTQTNLETYRKEIVETINNDIVMRHGYSEGVIEHSLKDDPEVLRATEILGDGA EYTRIVTEQDTPRK >gi|313159013|gb|AENZ01000021.1| GENE 73 68226 - 68696 583 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159060|gb|EFR58435.1| ## NR: gi|313159060|gb|EFR58435.1| hypothetical protein HMPREF9720_1440 [Alistipes sp. HGB5] # 1 156 1 156 156 292 100.0 6e-78 MKYFIRSLKYFVALCVLCAAIMALNRLSGFATLTLEETFYVMFHTTRGMMLPAVIVLLAA FYPKFGFVQRKVEGDVGENREQIVNAFKSAGFSLRTDEDGVMTFRADGLLLKLTLLGEDE IKVSQYGQWILIDGIRRGVARAGYKLDSYIQMTKHD >gi|313159013|gb|AENZ01000021.1| GENE 74 68827 - 69705 384 292 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229229955|ref|ZP_04354520.1| SSU ribosomal protein S1P; 4-hydroxy-3-methylbut-2-enyl diphosphate reductase [Desulfotomaculum acetoxidans DSM 771] # 4 290 1 279 676 152 30 9e-36 MKDLRIEIDDKSGFCFGVVRAISEAEKALAGGETVYSLGDIVHNRIEVQRLEKLGLSTVT HADMPRLTGRRLFIRAHGEPPTTYARAAELGIEVIDATCPVVARLQARVVKAHERMRPAG GQVVILGKRGHAEVVGLTGQVPDRTIVVEGEEDLSQIDFTRPVYFLSQTTQSIALFEALG AEMRRRAANPADVHIDDTICRQVSSREQHLQEFARRFDAVVFVCGRKSSNGRVLFEVCRT ANPRTCNIEEAAEIDPAWFEGAESVGICGATSTPKWLMQEVADRIGKIGKGK >gi|313159013|gb|AENZ01000021.1| GENE 75 69727 - 70425 256 232 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15639271|ref|NP_218720.1| bifunctional cytidylate kinase/ribosomal protein S1 [Treponema pallidum subsp. pallidum str. 
Nichols] # 8 218 35 274 863 103 30 6e-21 MSDTKHKIIIAVDGFSSCGKSTFAKAIARRLGYIFIDTGAMYRAVTLYALEHGAIRSGIV DEDAVVRLLDQITITFRFNPERGASDIYVNGDLAEGKIRTIEVSNCVSRVSAIPAVRRKL VAMQQEMGRRRGVVMDGRDIGTVVFPDAELKIFMTADPKVRACRRYDELRAKGDNVSLEE IERNVRERDKADMSRAISPLRQADDAVVLDNSCMTVEQQMAWFMTELERINK >gi|313159013|gb|AENZ01000021.1| GENE 76 70412 - 71065 881 217 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2288 NR:ns ## KEGG: Odosp_2288 # Name: not_defined # Def: lysine exporter protein (LysE/YggA) # Organism: O.splanchnicus # Pathway: not_defined # 1 208 7 208 218 131 38.0 3e-29 MFRGICVGVAASITVGPVAVLCIQRTLSKSRRSGIVSGLGVACADTFMAMAALFFYSMLQ TQIEQYNTLLRVIGGIFVVIVGVYIFAQNPVPQIRRNRAGKTSLWQDFASIFGLTIANFI MVIPYILAFFAVFKISGGDITHEGVGGFIRAVFIIAGFFGGAAAWWTMLAFVINLFRRRF RPRHMLTINHVAGLIIGILGIYTILSTFLDIFPNVGY >gi|313159013|gb|AENZ01000021.1| GENE 77 71283 - 71588 464 101 aa, chain + ## HITS:1 COG:no KEGG:Rmar_1696 NR:ns ## KEGG: Rmar_1696 # Name: not_defined # Def: hypothetical protein # Organism: R.marinus # Pathway: not_defined # 9 101 15 107 107 107 55.0 2e-22 MAEMKQFGIDLELDDTIAQGNYSNLAIISHSTSEFIIDFAAVLPGVQKARVKSRIILTPE HAKRLLRSLQENIVRYESNVGKIEIPTPSPASEAGPKMGEA >gi|313159013|gb|AENZ01000021.1| GENE 78 71848 - 73251 1947 467 aa, chain - ## HITS:1 COG:MA2648 KEGG:ns NR:ns ## COG: MA2648 COG3119 # Protein_GI_number: 20091471 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 24 429 43 479 535 194 32.0 4e-49 MKYLPILPAAAAAAGIVSCDTEPKRPDHVNFILINLDDAGNGDFSFSGALGYKTPNIDRL ALEGMRYTNFYAAQPISGASRAGLLTGCYPNRIGFAHAPNPGCPYGISDEEETIAEVLKK RDYATAIFGKWHLGDAKKFLPLQHGFDEWYGLPYSNDMWPYHPKWKFPDLPTYEGNEILG YNLDQTTLTTDYTERSVKFIRENRERPFFLYLAHSMPHVPLAVSDKFKGKSEQGLYGDVM MELDWSVGEVLRTLEELGLERNTLVIFTSDNGPWIAYGNHAGSTGGLREAKATTFNGGLR VPLIARWKGTIPAGTVCERLASNIDLLPTFAQIAQAPLPAHKIDGVSMLPLLEDPDAAPV RDALCLYYQDNSLEAVTDGTYKLIFPHDYLAYGTPGDDGMPGEMIPRRVEEKELYDMRRD PGERCNVLSQHPEIAEKLEAVAERYRAELGDDLTGCEGTGRRKPGYR >gi|313159013|gb|AENZ01000021.1| 
GENE 79 73280 - 74866 2135 528 aa, chain - ## HITS:1 COG:ECs4619 KEGG:ns NR:ns ## COG: ECs4619 COG3119 # Protein_GI_number: 15833873 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Escherichia coli O157:H7 # 29 505 2 440 497 115 25.0 3e-25 MDTRGFKLCAAAVGILPVTAGCTDRAPQEHPNILFILSDDHTSQSWGIYGGILAPYAAND NIARLAAEGCVLDNAFCTNSISVPSRAAILTGQYSHLNGVYTLDDALRPEQDNIAKRLQQ AGYRTALIGKWHLKKELAGFDYYSVFHDQGTYRDPVFKTAENWMDDDKGVGGAVEKGFST DLVTDKAIRWIKERDRTKPFSMFCHFKATHEPCDFPERFAHLYDGVRFPEPENLLEFGPA KSGRTFAGQPLETMAWRWNKAYTDPENWWTYYPELPFCGVEGDSVANRRAIYQKLVRDYL RCAAAIDDNIGRLLRTLDEEGLAENTVVIYVSDQGYFLGEHGFFDKRIMYEEPLRMPFVI RYPKEIPAGTRNGDIVTNVDFASLLADYAGAEPPQQAQGRSFRSNLAGDTPQEWPRSMYY RYWTQHEIRPAHMGIRNDRYKLIFFYGDRLGMTGSDDCVSTPSWEFYDLRTDPHENRNQY GNPAYAEVIGAMKREMLELRRQYKDTDEGSARMREIMETHYAPESAKH >gi|313159013|gb|AENZ01000021.1| GENE 80 74871 - 77177 2982 768 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 29 593 31 583 757 368 38.0 1e-101 MKFSFHLLLTCIALTACGESGKTTADYRIVPRPGEIAPGDGGDFELDARTRIVCTGDGDG MRANAEFLAGYVESVVGSAPEITGPDGLCDDGRAVILRADAPGKGGEAYEIEVSENRITV SGESDAGVFYGIQTLRKAMGAAKAAKVLFPAVRIADEPYLGYRGVMLDVGRTFYPVEEVK RIIDLAALHNLNVFHWHLTDDQGWRIEIDAYPRLTETGAFRRDTTLAGEPGTFGGYYTKR QIRDIVEYAARRYIEVVPEIDMPGHMTAALASYPELGCTGGPYVITSQPGVRRDILCAGN PAVFDFVEKVLEEVIGLFPSKYIHIGGDESPRTRWRECPECQSLIRRAGLKADARHSAED KLQGYFNTRIEEFLARHGRRLIGWDEIVDGGMSPDATVMSWRGTAGGIRAADEGFDVIMS PNSSLYFDYYQSANIDTEPPTIGGYIPLKKVYDTEPVPEELTPEQARHIIGVQANVWTTY MRTDTILEHMLLPRLAALAERAWSDREKDFTDFMMRLDRLVGFYDRDGYSHFPHFYDITG AFTADYARKAVGMTLSSLPGADIRYTLDGSEPTEESPLCTDTVWIGSPAAVRAVAVLPDG RRSEPFREEVTFNKATMKPIRLLTAPHPKYAAAALNDGLRGKRVFTYGNWAGWQDDDMEA VIDLGRQEEIAQVAFNALLDYGSHIMDAAGAKVWVSDDGETFREVYGEHYPEIPYGAVKR ILRHEANFSPALRARYVKLRVIRSEQLPEAYFDRNLRPFLFIDEIYVY >gi|313159013|gb|AENZ01000021.1| GENE 81 
77262 - 78575 1865 437 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A7291 NR:ns ## KEGG: HMPREF0659_A7291 # Name: not_defined # Def: putative lipoprotein # Organism: P.melaninogenica # Pathway: not_defined # 3 382 5 372 386 130 30.0 1e-28 MKIKAFYLSAVAAIVTGLVSCDNAELKYDDVTGPGRLFLNEAVGTAKSVSFVIPDDGSVY TVTPRISKPLGRDLKLTVYVDEKALDEYNKQNNTSYTLLPNTNYEFESSAVIPAGKVVSN PVTITLHPLTTEQNKTGFVHALPIAVKAEGEAMQVLEGADAFVLAATPVPYSNVVEITGN SYLQTRFAEDYKLSSWTFETLFNPSQIYTGNTTTWLASAIGYYEQGGTKIIQDGFMLRLG DSGGGTKNSLINGMVSRSGKQISSIPLQSNIWHHIAIVCDEGDITLYVNGNVGYTAKSGY AATQFYALNGIRFGNESASGNLGSLRYRYCQVRFWSVARSVEELNNNRYAVPADTPGLLG YWKLNEYTEGKAMVPGSSVKCIEEDTPMTETETYLFKDATGKQPDALIHRASANTPYIFT PDKRIEVGYTWDGVLAQ >gi|313159013|gb|AENZ01000021.1| GENE 82 78594 - 79724 1674 376 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A7292 NR:ns ## KEGG: HMPREF0659_A7292 # Name: not_defined # Def: putative lipoprotein # Organism: P.melaninogenica # Pathway: not_defined # 23 367 23 361 363 232 41.0 2e-59 MKKILLMCLSLLLFASALVSCEDWTEPESKVFLKGDGHDDAYYAALRKWKAETDYDMAWG WFGGWGANATNLKNSLRGLPDSLYLAAIWGRWRPSVLTDAMKADMEYVQRVKGTKVVCTT ITGWVGVDVIGGDYQNNPQKEEYFGWKPEWDTPSGWRAADGTDERAAQEASIRKYARMLA DSVYTGGYSGIDLDYEPNVGGAGCKRELSNRDNFYIFVDELGKYLGPKSGTDKLLVIDGE INAVEGRCMPYFDYFIWQAYSTSSDSGLNTYISTVIRNGSGYMEPEELIRKLYTTVNFEQ YAAEGGGSYTGGINRLLGQALWKPTWEGKTYRKGGLGSYHIEYEYYLSGKSGFYPWTRQA INAVHRSENEEEVPNE >gi|313159013|gb|AENZ01000021.1| GENE 83 79779 - 81377 2218 532 aa, chain - ## HITS:1 COG:no KEGG:HMPREF9137_0327 NR:ns ## KEGG: HMPREF9137_0327 # Name: not_defined # Def: putative lipoprotein # Organism: P.denticola # Pathway: not_defined # 1 523 1 517 519 320 39.0 8e-86 MRKYMNIINRLLFGAVALAALSCTGNFETYNRNPDQPSTIDRDVWNYIKSMQYNVIPTVK NQYQIVENVSGGAYGRYFAYAKSTDAGWNTGMFAFYRIANDWANPPYEVPMTNIYSNWRA IKAELEEPEDDYRMALAQIIRVAAMQRVTDIQGPIPYRTMENNDDLTAPYQSQEAVYGFM LEDLDHAVNVLETYITLAGTDVVAEAASADYVYNGNLAKWIRFANSLKLRLAMRISNVSP DAERYAEEAVAHPYGVIESNADNASLDVSQTNDQNPLRWLVEDYSDVHAAAEIVTYMKSF DDPRLSKYFNPAVRDSEYHGSRVGAYSSSQWKDYYSLPKTNALDRLMWMNAAEVAFLKAE 
MAVNGWDVTGDAKTLYEEGIRLSFEQYQAENYTSYIAGTAAPTAYSDPRRANNYTPGTAY SVTVAWDGALSAEKQREKIITQKWIALYPLGTEAWAEQRRTGYPRFYPTPTSTNASNEPN LPTEGASRIPFAPNEQTRNADNYDAAVALLGAGGDKFGTRLWWDVKTDKPAW >gi|313159013|gb|AENZ01000021.1| GENE 84 81396 - 83426 2943 676 aa, chain - ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 676 421 1101 1101 619 48.0 1e-175 MHVYANGILPNNSYDKHSVTVRNTTSFLQDRMTLDAGFSYVETEDQNMLSAGRYFNPLTS LYTYPRGESVEDLMMFEIYDSNKKRYVQNWAWSRGALDMQNPLWMMHRNIQNNKRRRSMM NASLTYKVTDWLTLAGRAKADLSVGESSRKLYASTDPLFAGGDNGFFEFTKEEDTHRYGD FIATVNKHWEQFSLFANVGASISDQLSSIAGARGPLELPNFFSLTNTNPYGASGQRLQER KREQEQAVFASVEAGWRGMLYLTATVRNEWHSNPAYTNNLSYCFPSVGLSGLIHEMVKLP SFIDYLKVRASWAEVGSPILWGISYLTYKWKQGSWSTSATRPIQNLRPKNTTSWEVGLNA RLFANSLSLDFTYYKSNTKHQTFLVPLSGSAGYENMYAQTGNVENKGFELSIGYDKTWRD FAWSSSLTLSHNKNKVVELMNNYYDPQTDKYYTLKEYEVASMGGANFIMTEGGTMGDLYA YGDLWLDENGYIYINPNDPKLMVTDERIKLGSVLPDYNAGFRNDFSYKGINLSLLFSARF GGIVVSPTQGFLDGLGVSKASADARDAGGVPINNGRIDAQHYYETVGLGQLLSHYTYSAT NVRLQETSIGYTFPKKWMGDRVKLTASLVGRNLWMIYCKAPFDPELTASTGSYLQGIDYF MQPSLRSLGFNVQLKF >gi|313159013|gb|AENZ01000021.1| GENE 85 83518 - 84696 995 392 aa, chain - ## HITS:1 COG:no KEGG:BT_3983 NR:ns ## KEGG: BT_3983 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 373 18 379 1135 306 47.0 7e-82 MNKKLILQIRSLYSLLFAVSLSLSSVAAFAQPAPPRITITMADVPMERVIDAIEKQSSYL FAIDTNVNISQKVSVSCTDASVETALTQMVKGTDVAYRIDASNVVLSAVRKEVSAQPRFV SGKVLDAQGNPVVGAAVIVKGTTVGMSTGADGDYLLQISPSAGNAVLVVNYLGFRPVEVT VGNRTQIDFTLHEESQAVDAVVVTALGIKRSEKALSYNAQQVSSEDISIVKDVNLMNSLN GKIAGAVINSSASGIGGATRVVLRGLKSIISSNNALYVVDGIPLFNNNNGEIKSEFESQP RGEGLTDLNPDDIESMTVLTGPSAAALYGSSAANGVIVITTKKGKAGRPQISFSNQTTFS SPLKMPEFQTKYGNQPRTYSSWGGGTGDPFVL >gi|313159013|gb|AENZ01000021.1| GENE 86 84865 - 85809 998 314 aa, chain - ## HITS:1 COG:RSc2919 KEGG:ns NR:ns ## COG: RSc2919 COG3712 # Protein_GI_number: 17547638 # Func_class: P Inorganic ion transport and 
metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Ralstonia solanacearum # 109 258 71 215 274 64 31.0 2e-10 MDDLLLYKFLQNEASEDEIRTVLDWLDADPANRKYFDGLDNMLNAARVNRPAGRKPQRRP LPTLRSIGAWSMRIAAVLCLCLGVSYFAATKMFDRKASNNSLSVFVPNGERISMTLTDGT TVWLNSGTTLEYPAVFARGERRVKVTGEAMFDVKSDPQHPFVVETFACDVRVLGTKFNVE ANEEKGIFSADLLRGKVQVCNRTDRSDRITMEPNQTVHLENGKLQLHAQENADKLLWTDG ILCITGMTFEEVMAKFERCYDINVEILRDDLPQIEFQRCKLRISEGIDHALAVLQHAADF RYERDITTNTLYIR >gi|313159013|gb|AENZ01000021.1| GENE 87 85902 - 86492 785 196 aa, chain - ## HITS:1 COG:TP0092 KEGG:ns NR:ns ## COG: TP0092 COG1595 # Protein_GI_number: 15639086 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Treponema pallidum # 18 183 1 154 162 60 29.0 2e-09 MHQKPTKTCITSAEFGRLYSDYNPRFTEVTYRYVRDRGIAEDLVSDSFMAFWEERDRLPE DVNAPAYVLTIVKNKCLNHLKVRLLHLQIEKKLHSTQQRLIQSDIQSLAACVPDTLFVEE MQTMLDRAVSRMPELTRRIFEGSRYENKSYKELAQELNIAFTHVNFEMRRALKILREEFK DYIFALIILLAVSNLI >gi|313159013|gb|AENZ01000021.1| GENE 88 86741 - 88030 1869 429 aa, chain - ## HITS:1 COG:BH3037 KEGG:ns NR:ns ## COG: BH3037 COG0285 # Protein_GI_number: 15615599 # Func_class: H Coenzyme transport and metabolism # Function: Folylpolyglutamate synthase # Organism: Bacillus halodurans # 2 427 3 422 430 229 32.0 8e-60 MNYTQTLEFLFSSLPVFEAKGATAYKPGLERITAFCRHIGNPQRNFFTIHVAGTNGKGSV SHIIASVLQQAGYRTGLFTSPHLQDFRERIRVDGEMIPKQKVVNFVDKHHGKMVELELSF FEMTAALAFDYFAQSDVEVAVIETGLGGRLDATNIIVPLVSVITNVGLDHTALLGDTLPK IAAEKAGIIKKSIPVVIGESDPRYNEVIEQTAAANKSKVIYAEQEFSCEPCPCGTDRQHF CMHRTRDNRNFEVDLDLGGNYQRRNMVTAAATVDFLHEETPLTISRRAFLEGARDTAANT SLMGRWQKLGEQPLTVCDTGHNPHGIAYVARQLKATPHKQLYCVIGFVRDKDLAHILPLL PHDAHYIFTQAKTERAFSAAELTAKAAIYGLHGETVTEVPAAVARAKELAGADDMIFIGG STYVVAEAL >gi|313159013|gb|AENZ01000021.1| GENE 89 88046 - 89632 2138 528 aa, chain - ## HITS:1 COG:mlr5543 KEGG:ns NR:ns ## COG: mlr5543 COG0815 # Protein_GI_number: 13474621 # Func_class: M Cell wall/membrane/envelope 
biogenesis # Function: Apolipoprotein N-acyltransferase # Organism: Mesorhizobium loti # 64 489 82 495 530 86 26.0 1e-16 MFRKLTAAILSVILLSPGWLGATGLTLPFALVPLLWISSSYDASRRSWRCMFGWAALTFA LWNVATVWWIWNATPVGPVAATLASTFFNMIAFMLFHTVSKKGPKALAYTLLVAGWIATE YWYTVGEFSWPWLILGNGFSHEAWLVQWYEYTGVFGGSLWVLLCNILFFEAIRARRNAAR WAAAAGMLLVPAAVSLGIWWSWQQPAEGTAEVSVVQPNVDCYDKFNGRAEWQERNIVDLL TEVPAGAQFILLPETAVPRHYWEPGLTETRFDNEPGPFWQELADTLRSSHPQALLVTGAF TNRYYAAGLQPRTAEGNPYGEGYYDDFNTAVGLDSAGHTQLHHKGKLVIGVENTPTVIFD LMQFLVIDLGGVVGQIGMGQHGTAFGHAGMRMGPAICYEGLYGDFYGDFVRRGAQFMAII SNDGWWGDTPGYKHLFTISRLRAIEHRRAIARSANTGMSGFISARGDIGETLGWEKRGVI SAAVPLNSELTFYTRYGDYLGRISEYLTLLCVLYYIAYRVKRKNYLVK >gi|313159013|gb|AENZ01000021.1| GENE 90 89727 - 91124 1994 465 aa, chain + ## HITS:1 COG:lin2262 KEGG:ns NR:ns ## COG: lin2262 COG0673 # Protein_GI_number: 16801326 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 64 349 9 264 349 70 25.0 7e-12 MKKFAFLSWAVVAALFFTACGSSPKGYSVVDGVIVFDAPARVEGQHSVLRLTTDPIPVVR VGFIGLGMRGPGAVERFTYLDGVEIKALCDLYPDRVEKSQEILAGRNLPAAAAYSGEEGW KELCRRDDIDLVYICTPWQLHVPMAVYAMEHGKHVAVEVPAAMSLEECWQLVDTAEKTQR HCMMLENCVYDFFELTTLNMAQHGLFGEIIHTEGSYIHDLDPFWEHYQGNWRLDFNQSHR GDVYATHGLGPACQLLDIHRGDRMTRLVSMDTESINGLKTAKEKMGAETFANGDQTLTLI KTEKGRTILLEHNVYTPRPYSRMYQLTGTEGFANKYPVQGYAFRPEQLAGELPDHENLSG HSFIPAEARKTLMERYKHPIVRDIEDKARQVGGHGGMDFIMDYRLVYCLQHGLPLDQDVY DAAEWSCIGALTAASLEHNNAPVAVPDFTRGDWAKIDGYRHAMAQ >gi|313159013|gb|AENZ01000021.1| GENE 91 91233 - 92219 1288 328 aa, chain + ## HITS:1 COG:RSp1108 KEGG:ns NR:ns ## COG: RSp1108 COG0657 # Protein_GI_number: 17549329 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Ralstonia solanacearum # 16 327 28 339 339 318 52.0 1e-86 MNAVKEAPDYLHDEHLSQGTKTYLNVLNGGGPAVESLPVPDARRVLADLQAAVPVDLSGV EESERTIESEGRTLRLNLVRPEGTGRERLPVFVFVHGGGWVLGDYPTHRRLVHDLVAESG CAAVFVNYTPSPEARFPQAVEEIYIAVKWVAENGREIGVDGSRLALAGNSVGGNMSLAAA 
LLAKDHGGPQIRTLVLMWPVTDAGYDWDSYAEYGRQRFLTAPLMKWMFEQYVSDPAQRGS DLMSPVRASAERLRGLPPTLIAVAENDILRDEGEKMGRLLDAAGVEVTTVRFNGVIHDWG MLNGFAGLHPTRTLVRLAGAVLRDYLRE >gi|313159013|gb|AENZ01000021.1| GENE 92 92412 - 93371 1396 319 aa, chain - ## HITS:1 COG:XF0611 KEGG:ns NR:ns ## COG: XF0611 COG0451 # Protein_GI_number: 15837213 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Xylella fastidiosa 9a5c # 3 312 22 329 329 460 66.0 1e-129 MKRILISGGAGFIGSHLCERLLKEGNDVICIDNYFTGHKSNIRHLLKHPNFEVIRHDIVY PYMAEVEEIYNLACPASPIYYQHDPIKTTQTSVIGAMNMLAIANRNHAKILQASTSEVYG DPLIHPQPEDYWGHVNPLGLRSCYDEGKRCAESLFMSYYREHGVPVKIVRIFNTYGPKMD INDGRVVSNFIVQALRGEQITIYGNGEQTRSFQYIDDLIEGMLRMMTATPDDFTGPVNIG NPNEFTISELAHIVLELTGSKSKIIRMPLPSDDPQQRKPDITLAHKMLGDWEPTIQLRDG LLKTIAYFEEVLSRGGVRR >gi|313159013|gb|AENZ01000021.1| GENE 93 93468 - 94514 1266 348 aa, chain + ## HITS:1 COG:DR2293 KEGG:ns NR:ns ## COG: DR2293 COG3021 # Protein_GI_number: 15807284 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 98 347 92 320 322 63 29.0 5e-10 MGRKGVFCLLYFLAVVFTGLLLAAAVLSWRASFVSPEQGGFWATIALLLPVVLLANLAAL VWWLIRRRWIVALMPLAALLLNMGYVSSMIQLPDFNAPAGSHDIRIATLNVNGFRQLGPK SITAAAVAEMMRHEQVDVLCLQEFLDDGEFPADSIGALFSRRMPYFVSEGNGAVASRYPI LDCKYVRFPDTSNDYLRADLLVEGDTLRIFSVHLQTSGIAQLRRRFQKDYNREAPVDSML GAVDRNSRIRAAQVREIRAETDASPYPVILAGDFNDTPSSYTYREMKGALTDGFRRCGNG YGGTFRYLGGLLRIDYVFYDDTFECVRYYMPSEVVSDHKVVIAELRFK >gi|313159013|gb|AENZ01000021.1| GENE 94 94740 - 95918 1873 392 aa, chain - ## HITS:1 COG:MA2445 KEGG:ns NR:ns ## COG: MA2445 COG0784 # Protein_GI_number: 20091276 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Methanosarcina acetivorans str.C2A # 13 158 17 162 292 120 42.0 5e-27 MKNIAIKPEDYTILAVDDIATNIMLLKAVLSRAKYKIVTASGGYEAIEKASKEAPDLILL DIMMPDLDGYEVIKHLKADPATQDIPVIFLTALHNPEDIVKGFKLGASDYVSKPFNHEEL 
ITRVTHHIYLAAAQRTIMQQRDELQATVDARDKMYSVIAHDLRSPIGTLKMVFNMLLMNL TPEQIGEENLEMVTMGNNITESTFMLLDNLLKWTKSQTGRMNSVFQEVDISEVVVFASKM SDLVAQVKSISVEYDIPGPISVNCDVDMVKTIMRNLMSNAIKYSNEGGRIIISVRETPTH ARISVRDFGTGIREEDIPKLLNPEIHYMTYGTKNEEGSGLGLQLVQDLTRRNGSELTIES AEGEGSTFTFTIAKEQPRSETVEDTSGGIARP >gi|313159013|gb|AENZ01000021.1| GENE 95 95977 - 97149 1853 390 aa, chain - ## HITS:1 COG:PAE0419 KEGG:ns NR:ns ## COG: PAE0419 COG1215 # Protein_GI_number: 18311929 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Pyrobaculum aerophilum # 49 305 42 301 365 82 29.0 1e-15 MTTIAILFWAGVFLVFYTYLGYGILLWTLVKIREALRPARRYKVPTEAPEVTLLIAAYNE QEIVAEKMANCRALEYPASKLRITWITDGSTDRTVELLAAYPDATVLHDARRGGKTAALN RALEHIRTPLVVFTDANTMLNLEAVTEIVRCFEDPQVGCVAGEKRVADAGGAGAAATEGV YWKYESKLKELDYRLYSAVGAAGELFAVRRGLWQTLPEDTLLDDFVCSMLIASQGYKIAY CKEAYALETPSADMGEEGKRKKRIAAGGLQSVWRLRKLLNPFRYGVLWFQYVSHRVLRWT LTPVVLVALLPLNVALLWSGHPTLYVVTLALQCAFYLAALAGRALERSGRRSRLLFIPYY FLFMNLNVFRGTAYLATHRGRGAWEKAKRA >gi|313159013|gb|AENZ01000021.1| GENE 96 97146 - 98273 1704 375 aa, chain - ## HITS:1 COG:no KEGG:BT_1178 NR:ns ## KEGG: BT_1178 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 374 1 376 376 377 49.0 1e-103 MKGLFVVFHGFSAHSGISKKIFSQCDALRRNGADVELCHLEIAADGTQRRMVGGRAIRTF GQGLRAKVAKRVSLGDITRHIRDEGVEFLYIRHDHNASPVLIHWLRKVKKLGVRIALEIP TYPYDAEFAQSPFVRRLKLRIDRTFRRRMARWVDRIVTFSDDPEIFGRPTIRISNGIDFR SIPLKTQRHGSPHEIRLLAVANIHFWHGLDRVIEGLKVYYAGPHQCIVRLRIAGDGIESL IDGYRRTIEEYGLSEYAEVIGPRSGAALDAEFEWCDMGIASLGRHRNGITRIKTLKNREY AARGIPFVYSENDSDFDGMPYVMKAPADDTPLDIAALVCFCDGVNLTPEQIRASVEGTLS WDRQMKQVLTELFEA >gi|313159013|gb|AENZ01000021.1| GENE 97 98278 - 99759 2124 493 aa, chain - ## HITS:1 COG:L13324 KEGG:ns NR:ns ## COG: L13324 COG2244 # Protein_GI_number: 15672194 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: 
Lactococcus lactis # 15 478 4 466 475 225 29.0 2e-58 MTENDNKQPHTVGRQLASGVFYTSIAKYAGIVVTLVVSGVLARLFTPEEFGVVNIATVVI AFFAIFSDLGIGPAVIQHKNLDKRDLGGIFSLTLWSGAVMALLFFAASGLIASFYDDSAE LRNILRILSANLFFAAANIVPNGLILKEKRFRFAAVRSLSVQVVGGAAAIAAAYAGAGIY ALTINPVFSSLMLFVINYRQNPLSVHLKPGKAAIGKVFSFSAYQFSFQLLNYFSRNLDKL LMGRYMSLSQLGYYDKSYRLMMLPLQNIAYVVSPVMHPIFSEMQNDLRKLADSYRKVVRL LALIGLPLSVVLWFTAQELVLIIFGDQWQPSVPVFRILALSVGIQIVMSTSGSIFQAANA TRMLFFCGLFSAALNTAAICTGIFAFGTTQAVAWCICISFAINFIQCYHALFCLTLRTGW GAFWRSFLSPLLLSALVCLPLAAVEWLLPPMPMLVSLAAKGAAALAVWLLYIQLSGEYDL KGTALRLLSRKNR >gi|313159013|gb|AENZ01000021.1| GENE 98 99740 - 100831 1267 363 aa, chain - ## HITS:1 COG:no KEGG:BT_1175 NR:ns ## KEGG: BT_1175 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 357 11 367 372 277 42.0 6e-73 MKILIVYTASGTSDNPFVRLLADGIRACGYEVVCSVEEFRSNAADYDIVHLQWPEELFRW KAPAPADVDTLEQQLQTLKNNGIPLIYTRHNTLPHHGNPLVAEAYGLVERYADAIVHLGD CSLREFTAAHPDSEQLQVMIPHHIYEGLYDMDITRDAGRRALGIPADRLVVLAFGAFRHA EERRLVWGAFRRLHYPAKFLLAPRLWPYTRRGSRFKGLKRLAGRLLYAAAHSAEGFFDSR ITSPEPLIPDEQLPCYLAAADVVFIQRTDILNSGNVPLALSFGRVVTGPASGNIGGLLAE TGNPAFDPADPRSVDIALERAARLSATDQGARNRAYAQEHFGIGRIAAMYGGLYERLYDG KRQ >gi|313159013|gb|AENZ01000021.1| GENE 99 100839 - 101447 775 202 aa, chain - ## HITS:1 COG:MJ1064 KEGG:ns NR:ns ## COG: MJ1064 COG0110 # Protein_GI_number: 15669253 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Methanococcus jannaschii # 51 190 92 211 214 77 39.0 1e-14 MIDKIKDNPRLKKFVIGLISPHKHPRPRLWVRWFVNPFVHKKGRGALIRRHARLDVFPWR RFEVGRDALIEDYAVVNNGAGDVLIGDAARIGIGSVVIGPVRMGDRAGLGQHVFISGFNH GYADGTRDSNEQKLVRKEVVIGRESHIGANSVVVAGVTIGERCQIGAGSVVTKDIPAYSV AVGNPARVIKCYDPEKQTWVKI >gi|313159013|gb|AENZ01000021.1| GENE 100 101637 - 102893 1421 418 aa, chain + ## HITS:1 COG:no KEGG:Bache_3096 NR:ns ## KEGG: Bache_3096 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 12 406 1 382 397 124 
28.0 8e-27 MKKILLFAAAALSFAGCAEDAPTPGPADDAGKFAEMSVPSSFGWKTTASVACNFTAPQPT RVYVAAEQGAEPFASFMVGGDADPVKLNVPAATRTLCVSCRTESGVSPQVAVPVTSDGAF CSLDPKAASGTRAGIGNDDEASVDEGLIYIPARWNGWGTLMFEDLWPAYGDFDFNDFVVN YKIQLYMQNKNKVDAMLIGVRVKAVGGSIPYDLCLAMKGVKGGEIDQIEPYNSKNAPEAE LVALNSPNYVKEPAVLKFLNIRENANRPAGAAYVNTEEGYEMPEDRLAEASFMVYFRNSI AIEDVAFDTFDFFLTRDRESDGRRIEIHRGGFEPTPAATADYNALAGQSAYTDRAGRFYY SNDGLVWAINIPFDIQHAYEKTDFLKAYPQMLEWAQSGGAVAQEWYLHGVEKHLVKRK >gi|313159013|gb|AENZ01000021.1| GENE 101 103010 - 104128 1749 372 aa, chain - ## HITS:1 COG:CAC2907 KEGG:ns NR:ns ## COG: CAC2907 COG0438 # Protein_GI_number: 15896160 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Clostridium acetobutylicum # 1 372 1 373 374 152 26.0 1e-36 MKIAIEAQRIFRPNKHGMDFVALETIRCLQKIDTKNEYFIFTGEGGDRCLEETPNMHIET LRCPTYPLWEQWALPRAVARVKPDLLHCTSNTAPVWGRTPLVLTLHDIIFLEKQASRNKS SYQSLGRIYRRLVVPRILPKCRKIITVSQFECDRIRTALGLDPERITAIYNGYNERFRPL DDVARTVGKYLSDPGYIFFLGNTDPKKNTSGTLRAYAEYVRRSEKPLPLLVADLREQAAE EILREIGAPELRGMLRLPGYIPNGDLPAVYNGASAFLYTSLRESFGIPQLEAMACGTPVV TSNTSAIPEVAGPGAILVDPTSPTAIADALLRLETDPAFRAEQVAYGLERVKLFSWEETA RQLLALYESLGR >gi|313159013|gb|AENZ01000021.1| GENE 102 104125 - 105327 1757 400 aa, chain - ## HITS:1 COG:CAC1691 KEGG:ns NR:ns ## COG: CAC1691 COG1215 # Protein_GI_number: 15894968 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases, probably involved in cell wall biogenesis # Organism: Clostridium acetobutylicum # 1 265 4 271 425 83 25.0 7e-16 MTTLLNIADWIFIFVLGLPVLYLFVFALFSTRKSMDDYPQAKRRRKFVTLIPAYKSDAVI VRTAQAALRQEYPAQLHEVVVIADRLKPQTLAELRTLPIRVLEVSFENSSKAKALNFAVE ELGPEAAEAVTILDADNLAGEDFIARLNDVFDSGVQAVQAHRTAKNRDTDTAVLDAASEE INNAIFRRGHVALGLSSALIGSGMAFEYKWFRDNIARCTTSGEDKELEALLLRQGIYIDY IDGLRVLDEKVQGEGAYYNQRRRWIAAQFYALGSAVRQLPGAIAAGNLDYCDKLLQWCLP PRILLMGLVPLWTAAMTVFDPLGSIKWWIVLLLLLFALALALPDEQTDRQLGRALRRMPV LFLLTAANLFRLRGTKDKFIHTEHTGAGGNAPGANTPTEP >gi|313159013|gb|AENZ01000021.1| GENE 103 105324 - 106217 1039 297 aa, chain - 
## HITS:1 COG:CAC3069 KEGG:ns NR:ns ## COG: CAC3069 COG1216 # Protein_GI_number: 15896320 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 4 259 2 260 299 177 38.0 2e-44 MMERRISVISVNYNGYDLTCAMIDSLRRHVTTPLEIVIVDNGSTRDEAAPLRERYPDVKV LRSERNLGFAGGNNLGFAAATGDYLLLLNNDTEVTEDTLHYLRETLDNDRSIGAVCPKIR FFAPPQHIQFAGYTPLTRITLRNALIGFGEPDDGRYDTPHDTPYAHGAAMMVRRETLEKA GPMPEIYFLYYEELDWSVRIREQGRRIVYDPRCTVFHKESATTGQQSPLRSYCLTRNRLL FAWRNLRGGARLLSVAYQLCIAAPKNAVSALAHRRGDLAKAVCRGVRDFFTLKNKQA >gi|313159013|gb|AENZ01000021.1| GENE 104 106214 - 107719 2176 501 aa, chain - ## HITS:1 COG:no KEGG:BDI_1279 NR:ns ## KEGG: BDI_1279 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 16 486 20 483 490 315 38.0 3e-84 MERYGSITTPVSIPRFIIGGMLVGGVAVVAMAFLNYGLLAGTVVAAIPAALCILWVTLRN PALAMLGLFVINYFIMGLTRYAHDLPFGMILDALIFYNILIISLQAMMHRIEWKRASSGL TVVAAIWVAYCVLEVVNPESVSVSGWFSSVRSVAFYFFFIVVLTQLTMNEYKYLKYMLVI WSVLTLIAVGKACIQKFFGFNAAENYWLFVLGGRSTHIIHSGVRYFSFFSDAANFGGSMG LSMVVFSISALYYRNPWMKIYLLLVAAAACYGMLISGTRSALAVPFVGYTAFIMMSRNIK VIVMGVLLVIAAFVFLKFTSIGQGNALVRRARSAFNSEDPSFKVRLENQAKLREIMADKP FGAGLGHGGGKAKTFTPTAPLSQIPTDSWFVMIWVETGVVGILLHIGILLYILARGAYLV VFKLRNTQLRGFTAALTAGISGIVVMAYANEILGQIPTGAILYMSMGFIFLAPRFDRELS RKEILDKATAAQQPPLRENYE >gi|313159013|gb|AENZ01000021.1| GENE 105 107757 - 109964 3599 735 aa, chain - ## HITS:1 COG:no KEGG:BF2061 NR:ns ## KEGG: BF2061 # Name: not_defined # Def: putative transmembrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 732 1 715 738 556 39.0 1e-157 MNYIAYTARFLYRIKWWLILAPTVVALAVFFKMGAQPRNYKSMTTVYTGIVSGYDITTTE GTRQDWNIINNAMDNLINIILSQSTLKNVSMRLYAQGLTHLDPDNDNQYLTARTSRYLLN RTPKEVMDLVDRTSEEKTLENLRRFEEADHDNHVYGMFHWNPPYYSYQALSQIKVKRVTS SDMLEISYENDDPYIVYNTLVILNDEFVKQYRDLRFGETNNVIAYFESELARVGKNLREL EDSLRDYNVEHKVINYDEQTKHIAALSRDYELRYEEILLNFESAEKLRKSIEEQLEGLQT FHNNAQFIEKLHTIGSLYSRISAAEAFQPAPTESGETLSDRPSIAASRSNTGDLRRKLAE 
ETRNLQEITTNIASQQYTKEGLSTNSIVAEWLAAVLLAEKSRAELAVMKLRKTELDDKYT HFSPVGSTLKRKGREINFSEQSYLSILTALNTARLRQKNLQMTSATLKIINPPVLPINAE PNKRRLVVAAAFLATFLFVLGFFILLELLDRTLRDKVRAERITGGRVIGAFPGKGRFGQR RYAKQYREIAARYIGNAAINYFEPSRRPNILNILSTEAGDGKSTISEQVAAYLREADMKV RVISWNKDFDVEQKSYLLAGKLEDFVQDRPGDLPLCEADVVLVEYPPFSDKSVPKELLRN AALNIVIAPANRTWKDTDQLLFEKTEKLAGRTPVTLCLNCARRDVVQSFTGLMPPYTRMR RLGYQISQFGFTAVK >gi|313159013|gb|AENZ01000021.1| GENE 106 109968 - 110768 1255 266 aa, chain - ## HITS:1 COG:no KEGG:BF2062 NR:ns ## KEGG: BF2062 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 253 1 249 249 137 33.0 4e-31 MKRAVFTLWLLLAAVAATAQGNSPDFEFEYSDLQLSPDDYIGLQLPPLHQLLENARNTPQ VEYYNKAVEIQERELKNVRRNWQHYFKINANYNYGSSDIYNQNYQDNSNQIWTTTTTGRE QSWWNVGASFSLPIDEIFNRRNKIKQQKRRIEQVELDTERWLEEQRIKVIQQYTLAVQQL SVLRSAVEAMVTAQAQYRLSEADFINGKLDAQTLSRQKNIENVAIREYEEVRRSLNNALL TLEVLTRTPIISKPAPATAPQAPKTE >gi|313159013|gb|AENZ01000021.1| GENE 107 110780 - 111946 1369 388 aa, chain - ## HITS:1 COG:all4420 KEGG:ns NR:ns ## COG: all4420 COG2148 # Protein_GI_number: 17231912 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Nostoc sp. 
PCC 7120 # 141 385 250 441 445 145 33.0 1e-34 MKYPVIYIKGKGSHTARFSEMVEGRMYVVGDIGSAMRTLTSLRTDHAFILYEQRDPAADL SDIRLLNSWPGGGIVLITSRELTERERREYLRAGVNSAIAGDMPRKDFLRMLQFMGDYAF THKPVAQNSPEVQIFRMPLWKRTFDIAASLSAILLLSPLLIVVAAAIILDSPGPVVYKSK RVGSNYRIFDFLKFRSMRTDADRRLKELGELNQYAAQPQEADSGKMVLGEEEMRRLLEDT QNGMLYADDFVIAEETHHHKVETEQENAFVKIENDPRITRLGRFLRKYSIDELPQLFNIL RGDMSVVGNRPLPLYEAERLTSDEYIERFMCPSGLTGLWQVEKRGQAGKLSPEQRKQLDI EYARKMSPWFDLKIILRTLTAFIQKENV >gi|313159013|gb|AENZ01000021.1| GENE 108 111943 - 112311 573 122 aa, chain - ## HITS:1 COG:DR2418 KEGG:ns NR:ns ## COG: DR2418 COG0745 # Protein_GI_number: 15807407 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Deinococcus radiodurans # 1 118 1 115 373 83 33.0 1e-16 MNKKQILIVDDKEQIAKILYAYLQADYDCHYVQNPLHAIKRLLAGHMPDLIISDIRMPEM RGDEFLEYLKHNELFKQIPVVMLSSEDSTSERIRLLEEGAEDYIVKPFNPQELKIRIKKI LD >gi|313159013|gb|AENZ01000021.1| GENE 109 112664 - 113056 537 130 aa, chain + ## HITS:1 COG:all4097_4 KEGG:ns NR:ns ## COG: all4097_4 COG0784 # Protein_GI_number: 17231589 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Nostoc sp. PCC 7120 # 14 128 1 118 133 93 45.0 1e-19 MECNECVNSDDGRPRLLVAEDNPSNFKLVEVLLRRDYELVHAWDGRQAVDMFAEVHPDLV LMDINMPVMDGYDALRLIRERAPDIPVIALTAYAFETDRQRMFQAGFNECLAKPLRADEL RSRIASLLSR >gi|313159013|gb|AENZ01000021.1| GENE 110 113068 - 115059 2739 663 aa, chain + ## HITS:1 COG:alr3761_5 KEGG:ns NR:ns ## COG: alr3761_5 COG0642 # Protein_GI_number: 17231253 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Nostoc sp. 
PCC 7120 # 420 642 2 231 245 153 40.0 1e-36 MTGNRAIKHWTLLLLSCCILLPSLLRAEEARGRESILIISSYNPDTRRMSSFITEFERQV VASGVPCDIYVETLECKSINDAPLWISQTDNLITRYESKGGLRAIVLLGQEAWASFVSLG RIPKEVMCFSCFVSSNGIVLPPPADSTGVWMPPSINYMNMVDSLHNVGGLLNKYDVRRNI ELIRTLYPEVENIAFISDNTYGGISLQALVREEMENYPDLNLVLVDSRDGDEASHAAYAS LPPRSAVMLGTWRVGSDGEYLMQRSLNDLVQLNPRLPVFSITQTGIGDVAVGGFVPNYEN GANVIAAQIKEYYKTGSMEGAHFHLSDGGYVFDSRKLKELKIAEYALPKGSVIEDTVAAK LSKYSHYIELLVVGIVLLVLLLIFVAFLFLRTRRLKRTLEEREGQLVIAREKAEESDMLK SAFLANMSHEIRTPLNAIVGFSSLMQGEELSQEERAEYCAIVVNNSEMLLTLLNDILDIS SLECGKIRFNYSAEEIVQICQHILMTTAHTCQPGVEGRLECAVDSYMLTTDVHRLSQILI NLLTNAAKFTSEGSIVLGVEICPEQNEVLFSVTDTGPGIPLDKQEMIFNRFEKLDGNKKK GTGLGLAICRQIAMIFGGRIWVDPTYTGGARFIFAHPIGLRLPEDREHREGGGAFGRRQR PSV >gi|313159013|gb|AENZ01000021.1| GENE 111 115432 - 116682 1916 416 aa, chain + ## HITS:1 COG:mll6731 KEGG:ns NR:ns ## COG: mll6731 COG0845 # Protein_GI_number: 13475614 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Mesorhizobium loti # 18 403 1 392 402 215 35.0 1e-55 MKFDKLKPDLQNLNLAGVKKAASRAARVRVKLSRRQWIAIACAAIVAGVLLVVLLRPRPA SVTLPVVAVEPVETEDVNIYGDYVGRIRAQQFVEIRARVEGYLEKMLFAEGTYIKKGQTL FVIDPLVYRARVDKAKAQLNKARAQALKAKRDLDRIRPLYQQSAASQLELDNAIASYESA AADVVVGEADLTQAEMTLGYTNVKSPIAGYISERNADIGTLVGPNGKSLLATVVKSDTVR VDFSMTALDYLRSKARNVNLGHRDTARKWDPYITVTLADGSEYPYRGLVDFADPQVDPQT GTFSVRAEMSNPDHILLPGQFTKVRLLLDVREDAVVVPSKAVVIEKGGAYIFVVRPDSIV EKRFIETGPEIGQNIVVERGLVSGEDIVVEGYHKLQHGMKVEPVVTRREEETENQQ >gi|313159013|gb|AENZ01000021.1| GENE 112 116686 - 119781 4877 1031 aa, chain + ## HITS:1 COG:SMa1662 KEGG:ns NR:ns ## COG: SMa1662 COG0841 # Protein_GI_number: 16263363 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Sinorhizobium meliloti # 6 1030 7 1032 1044 835 42.0 0 MKADFFIDRPVFSTVLSIIIVIVGGIGLALLPVDQYPQIVPPVVRISASYPGADAQTVTQ AVATPIEQELNGTPGMIYMESSSSNSGGFSATVTFDISTDPDLAAVDIQNRLKKAEARLP AEVVQNGISVEKQAASKLMTITLLSSDPKFDEIYLSNYATLNVLDMLRRVPGVGSVSNVG 
SRYYAMQIWVMPDKLADLGLTVKDLQNALKDQNRESAAGVLGQAPMNGIDVTIPITAQGR LSSVSEFEDIVVRANPDGSIIRLKDVARISLEASSYSTESGINGGNAAVLNINMLPGANA MEVAGSVKKVMEEIRANFPEGISYEIPFDMTTYISESIHHVYRTLFEALLLVILVVFLSL QSWRATLIPIVAVPISLIGTFGVMLVFGFSLNMLTLLGLILAIGIVVDDAIVVVENVDRI MNEEHLSPYEATKKAMGSLSGALIAMSLVLCAVFVPVSFLAGITGQLYRQFTITIAVSVI ISTVVALTLSPVMCSLFLKPESGEKKNRFFRRINLGLATGNRFYGRMIRGALKHSRRMLA AFGIVLVGIWLMNRLVPQSFMPQEDQGYFTVELELPEGATIERTREVTDRAVEFLMNDPD VEYVLNVTGSSPRVGTNQARSQLTVILKPWGDRDSDGLSEVMQRVRAEMSRYPESRVYLS RPAVIPGLGNSGGFEMVLEARGNTTYAELQRAVDTLMHYAAQRPEFTGLSSSMQGDIPQL YFDVDRDKAQLLGVSMSDIFSTMKAFTGSIYVNDFNMFNRIYRVYIQAEAPYRAQRDNLN LFFVRGAGGVMIPITALGTTHYTTGAGNIKRFNMFNSATISGEAAHGYSSGQAMSVLESI VRKHFPASVGVEWSGLSYQEKHVGEQTGLVLALAFLFVFLFLAAQYESWSVPVAVILSLP VAGIGAYLGIWICGLENNIYFQIGLVMLVGLVAKNAILIVEFAKEEIEKGRDAVSAALTA AHLRFRPIVMTSLAFILGLLPLVFASGPGSASRQGIGTGVFFGMLVAITVGIVFVPFFFV WIYRIKAKLKR >gi|313159013|gb|AENZ01000021.1| GENE 113 119778 - 121154 412 458 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 9 456 7 456 460 163 27 5e-39 MKRLTIIILLAAAAGMVSCSAVRHCKAPDVDLPERIAGEDSDSLTVADVQWWRFYGDSTL CQIIERTLDNNRDMLAAAARVERLRELYRVSRAERLPSVTGTAYGDYETNDYAGEKSSRD PGFGAKVTLSWELDLWGNLRWAKRKGGAEYLASVEDRRAMRMTLVASVAAAYFNLIALDN ELSIVRRTLITRSEGLYQVQLRFEGGLTSEMVYQQAKVEYASTAALIPDLERKIKIMENG ILLLMGENPDRRVVRGKMNTDAEFADSLPVGLPSGLLQRRPDVRSSEQRLRAAMASAGMA YADRFPRLTFNLTGGLENDALSGFFRSPFSYVAGTLASPIFGFGRKQARYRAALAAYDEA RLAYEQKVLTVFKEADDAVVTYRSARKTAALKLNLRDAASKYVELAHLQYRAGSTNYIDV LDAQRRYFDAQIGLSNSVRDEHLALVQLYKALGGGWTE >gi|313159013|gb|AENZ01000021.1| GENE 114 121406 - 121732 392 108 aa, chain - ## HITS:1 COG:alr4683 KEGG:ns NR:ns ## COG: alr4683 COG0724 # Protein_GI_number: 17232175 # Func_class: R General function prediction only # Function: RNA-binding proteins (RRM domain) # Organism: Nostoc sp. 
PCC 7120 # 1 82 1 82 110 82 48.0 2e-16 MNIYVSKLNFRTTGESLQALFAQYGEVSSANIITDRETGRSRGFGFVEMPDEGQGQSAID ALNGTDFEGMTIIVNVARPKTEHGNGGGYGGNRGGGGYNRDGYGKRRF >gi|313159013|gb|AENZ01000021.1| GENE 115 121764 - 122201 687 145 aa, chain - ## HITS:1 COG:no KEGG:BDI_0811 NR:ns ## KEGG: BDI_0811 # Name: not_defined # Def: cold shock DNA-binding protein # Organism: P.distasonis # Pathway: not_defined # 4 141 6 143 148 152 57.0 4e-36 MAISFSKRELEKKKQGKKLEKQQRKEERKANRGGSSLDDMIAYVDANGMITDTPPDMSSK PKVDAETIAVFVPKKEEQEPVALRGRVEHFNTDKGYGFIKDLDSTEKYFFHISNAPADIA EGALVTFETERGQRGLNAVNIAYAK >gi|313159013|gb|AENZ01000021.1| GENE 116 122735 - 124204 1803 489 aa, chain + ## HITS:1 COG:MA1905 KEGG:ns NR:ns ## COG: MA1905 COG1966 # Protein_GI_number: 20090754 # Func_class: T Signal transduction mechanisms # Function: Carbon starvation protein, predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 448 1 433 479 368 49.0 1e-101 MVTFLVCLALLVAAYFTYGRYLERLCGADAGRPVPSAVSFDGVDYIPMPMWKTFLVQLLN IAGLGPIFGAVLGAAYGPVAFLWITFGGIFMGAAHDFIAGMISLRSNGASLPETVGTYLG NGIKQLMRVFSVGLMVLVGAVFLSQPASLIANRIDAPALSGIVFGDFSWLLLIVLGVILV YYIVATLLPVDKIIGRIYPVFGCALLFMALGILAVLLFSGKYTIPEFTSFRNQIADAARF PIVPMLFTTIACGAISGFHATQSPLMARCLGSERQARPVFYGAMISESIIALIWAAVAMA FWNGVGGLNAAIAEYGGQAAVMVDAIARDTLGEVLAGFVIFGVVACAITSGDTAFRSARL IVADFMGVEQRSLRKRICISVPLFAAGLVIIFCLPFQTMWSYFAWMNQTLAMVTLWMITA FLNRHGRNRWVGLIPATVMTYVCSSYVFVSPLMCGMRDRAAAYLLGGAVTLVLLIILIFK MRRDAKSLP >gi|313159013|gb|AENZ01000021.1| GENE 117 124185 - 124877 1023 230 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313159133|gb|EFR58508.1| ## NR: gi|313159133|gb|EFR58508.1| hypothetical protein HMPREF9720_1488 [Alistipes sp. 
HGB5] # 1 230 1 230 230 414 100.0 1e-114 MQNPCLNPTADQTYEVIGEGPYNFAKVLARTREMERAGNVEEACNERFQAFQRFAELVPE DEEVNLEWNHRNSRAALELIRASAIDHFLINDFEMSAALLELLLELDPEDHLEGSELLAF DYLAMDEQELFDEVINDVSDKYASREILLLWSAFRRGGKLPEGELQRFKTRFAPYFAEFT AEEHPADETYLRDIESERPSLRAQARELWLQTENLWCLWPGFIGALRAAR >gi|313159013|gb|AENZ01000021.1| GENE 118 124982 - 125698 1287 238 aa, chain - ## HITS:1 COG:BMEII0858 KEGG:ns NR:ns ## COG: BMEII0858 COG2186 # Protein_GI_number: 17989203 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Brucella melitensis # 16 226 17 222 242 81 30.0 1e-15 MEELKLANHNTTLVDSVEESLISYFKEQGLRPGCSIPNEMELAASLGVGRAVLREALSRF KMTGMIVSRTKKGMVLGEPSLLGGMKRCINPLLMNESTLCDILEFRVALEIGISDNIFRN LTDADVAELKEIVEMSQVIGNNKYAPVSEHRFHTKLYEITGNKIITEFQDIIYPVLDFVK EKYRDFLEPIEKELERAGALVTHRDLLKYIENRDLKGYKKAIEEHFRLYSIYLERHKK >gi|313159013|gb|AENZ01000021.1| GENE 119 125734 - 126315 886 193 aa, chain - ## HITS:1 COG:BH3779 KEGG:ns NR:ns ## COG: BH3779 COG1435 # Protein_GI_number: 15616341 # Func_class: F Nucleotide transport and metabolism # Function: Thymidine kinase # Organism: Bacillus halodurans # 8 183 6 188 204 197 54.0 9e-51 MFLENASRKGWIEVICGSMFSGKTEELIRRLKRAKFANQKVEIFKPRIDVRYSEEEVVSH DANAIRSTPVDSARNILLMTSDVDIVGIDEAQFFDDGIIEVCRELADSGVRVIVAGLDMD YTGKPFGPMPALMATAEYVTKVHAICVRCGNLAHHSHRLTQDEKLVMLGETDSYEAICRH CFKELVRNRKETE >gi|313159013|gb|AENZ01000021.1| GENE 120 126344 - 127660 1982 438 aa, chain - ## HITS:1 COG:BB0472 KEGG:ns NR:ns ## COG: BB0472 COG0766 # Protein_GI_number: 15594817 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine enolpyruvyl transferase # Organism: Borrelia burgdorferi # 1 434 16 439 442 370 45.0 1e-102 MAIFKVEGGHRLHGSITPQGAKNEALQILCATLLTPERVTVHNIPRILDVMQLIELLRRM GVEVEQLAEDTYSFCAKEIDFEYLRTEDYRRRCTRLRGSVMLIGPMLARFGVGYIPKPGG DKIGRRRLDTHFIGFQKLGAKFNFDIADEFFMVEGKELKGTYMLLDEASVTGTANIVMAA VLAKGSTTIYNAACEPYLQQLCKMLNRMGARISGIASNLLTIEGVDSLGGTEHTLLPDMI EVGSFIGMAAMNQSELTVKNVSFENLGIIPAQFARMGIRFEQQGDDIHIPRQDHYSIETF 
LDGSIMNIADAPWPGLTPDLLSIFLVVATQAKGAVLIHQKMFESRLFFVDKLIDMGAQII LCDPHRATVIGHDHRIRLRATRMTSPDIRAGVALLIAAMSAEGVSTIQNIDQIDRGYRDI DGRLNALGARITRLDECE >gi|313159013|gb|AENZ01000021.1| GENE 121 127898 - 128065 215 55 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|34540457|ref|NP_904936.1| 50S ribosomal protein L34 [Porphyromonas gingivalis W83] # 3 49 1 47 50 87 89 4e-16 MAMKRTYQPSRRKRINKHGFRSRMETANGRKVLAARRAKGRKKLTVSDESTFKYA >gi|313159013|gb|AENZ01000021.1| GENE 122 128212 - 128709 209 165 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|229884790|ref|ZP_04504247.1| acetyltransferase, ribosomal protein N-acetylase [Sebaldella termitidis ATCC 33386] # 2 160 3 160 169 85 31 2e-15 MLTIRRAVETDCGLIRSLADEVFPATYREIITSSQMDYMMKWMYAPEVLRREMRTGVAWF IASSDGEPCGYLSVQQEAEELFHLQKIYVLPRFQGLGVGEFLFRHAVEYVRSVHPAPCRM ELNVNRSNRAVRFYEKMGMRKLREGDFPIGDGYYMNDYIMGLEIE >gi|313159013|gb|AENZ01000021.1| GENE 123 128863 - 129429 923 188 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0770 NR:ns ## KEGG: Bacsa_0770 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 181 1 181 182 116 35.0 5e-25 MDIAQAKRRENIAEYILYLWQLEDLLRALQFSPEAIFSTLIAPRKDIADEQKHVFLLWYM DIANLLHQEGKDEKGHLEHTLHLIGDLHDLHLQLMKLPVGEHYRRTYARLEPELPRLRAV LGNPGMNDTELCFRALYAAMLYRIKGEGEKSAVADTLEFISPVIAELADMHGKVERGEVD LFKTAEGK >gi|313159013|gb|AENZ01000021.1| GENE 124 129442 - 131685 3629 747 aa, chain - ## HITS:1 COG:no KEGG:Amuc_1410 NR:ns ## KEGG: Amuc_1410 # Name: not_defined # Def: putative phosphate/sulphate permease # Organism: A.muciniphila # Pathway: not_defined # 1 732 1 740 743 567 43.0 1e-160 MSEIYTVIVGILGILAISGLFVGVTNDAVNFLNSAIGSKAAPMRTILLVASIGIIIGVVT SSGMMEVARSGMFNPGLFTFHEVMMLYLGVMFANIILLDLYNSWGLPTSTTVSLIFCLLG SAIAVSIYKISNTPDLGVGALGQFINTSRAMGIVSAILLSVVIAFTCGTFVMYVSRTIFS FRYTLPFRRFGSMWCGASLTAIVYFAVFKGLKSILAGHDFVTMVDNHLLLSLLICWVVCS LLLFFIQRFKINILRITILSGTFALALAFAGNDLVNFIGVPVAGFDAYSIARHSGDSQML MGVLNDNVPANFLILLAAGAIMILTLWTSKKAMHVTETELSLSAQGDEGTEQYGSSVLSR 
TIVRAALNINAGIERVIPPRVRAAISRRFEYEDIEHSGAPYDMIRATVNLTTSAMLIAIA TSLKLPLSTTYVCFMVAMGSSLADRAWGRESAVYRISGVMTVVAGWFITALGGFLIAFVV GLALIYGGTVAFVIVTLLCGYMLIHSNFLKKGKSASPTAANADTRTNEDIIASLREEVCR TMECATKIYDRTLIAVFKENRKVLRDMVKESNDLFYQSRERKYTLLPTLKKLQSGDVNTA HYYVQVVDYLNEMTKALMHITRPAFEHIDNNHEGLSKEQAKDLMSINDDVESIYRHINQM LRDGDFTEIEMVLALRDQLFESIAEAIKSELGRINEARSNTKASMLYLTILTETKNMVLQ SRNLLKSQQYFLKHRAGAPQRWIKPTK >gi|313159013|gb|AENZ01000021.1| GENE 125 131920 - 134076 3409 718 aa, chain - ## HITS:1 COG:sll1098 KEGG:ns NR:ns ## COG: sll1098 COG0480 # Protein_GI_number: 16330914 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Synechocystis # 7 709 8 684 691 391 34.0 1e-108 MKNYSAKEIKNIVLIGAPGTGKTTLAEAMAFEGKVIDRRGSIETNNTLSDNTDIEHEYKR SIYSTILFTEFMDRKLNIIDCPGSDDFCGSLFSAFKVGDVGVFLFNAQNGWEVGSEIQAR YARLLQKPVIGVVNQLDADKANFEAAVESIRASSRVKPVIVQYPVNQGPEFNAFIDVLLM KMYRFKDNDGHREELEIPAEELDKAQELNKELVEMAAEHDEALMELYFDKGTLTQDDIRA GLKIGLSKREVMPIFCTSGKRDIGTKRLMEFIINVAPGPLKAPNFLSTDGEEIAADETAP AVAFVFKSQVEQHIGEITYFRVIRGRIAEGTELVNTRTGNKEKVSQLFAVAGRNRIKVTE LMAGDIGCTVKLKGTRTNDTLAAPSAPVTVEPIVFPEPRYRAAVKAKEQGDEEKLGKVLN DAKFEDPTILVEYSKELKQTIIQGQGEHHLNILRTRIEKENRLQYDYIAPKIPYRETITK VAQADYRHKKQSGGAGQFGEVHMIIEPYYEGMPEPKNYKVPGKGDMVVNPKTKEEYDLPW GGKLQFYSAIVGGAIDARFMPAILKGIMEKMDEGPLTGSYARDIRVVIYDGKMHPVDSNE ISFKLAARNAFKEAFRNAGPKIMEPIYNVEVLVPSDYMGAVMSDLQNRRAMIAGMESDKG FDRLNATVPLAELYRYSTTLSSLTSGSATYSMKFSSYEQVPADVQDKLLKAYTDTDEE >gi|313159013|gb|AENZ01000021.1| GENE 126 134246 - 135370 1880 374 aa, chain - ## HITS:1 COG:BS_yxaH KEGG:ns NR:ns ## COG: BS_yxaH COG2311 # Protein_GI_number: 16081049 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Bacillus subtilis # 7 363 8 391 402 114 28.0 4e-25 MYRTTNLPNTRIEVADALRGIAVAGIILFHAREHFNLYWSGLDYLRAGFGDWEQTVAGAL GFLLSGKMYAIFALLFGLSFFIQSDNQAQRGNDFSLRFAWRMVLLFIIGLVNAAIYNGDV LTYYAVFGLLMIPIGKLPNRWVWVIAALLFIQPLELYQHFSGHTLSLRGIEGMETLYPTL 
ATGTFAQSARASLAYGPISSFAWGLEHGRATQTLLMFVLGMLAGRYRLFYDEGRNRRIWG GLLAAGILGTWLVPFEAMRNLMTATAIVAAVVLLWYGLPGFRRALHGMTFFGRMSLTNYL LQSLLGTALFYNWGMGLYRHVDVIYGTLAGVGIILVQYAFCRIWLRRHSHGPVEWLWKKL TWLDIPFLDRKIAS >gi|313159013|gb|AENZ01000021.1| GENE 127 135514 - 136272 1053 252 aa, chain - ## HITS:1 COG:DR1011 KEGG:ns NR:ns ## COG: DR1011 COG2107 # Protein_GI_number: 15806034 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Deinococcus radiodurans # 1 238 28 269 301 176 43.0 5e-44 MFDALLNGRIDTGGLHFDVEYRDIEALNRGVQAGDADVSKISCAVLPAVADRYALLDSGA ALGRGNGPLLVRRAGDRSPIRRVAVPGVHTTANALMMRLFPRITERTPLLFSQIAAAVER GDFDAGVLIHEGRFVYRQRQLELVADLGQLWERTTALPLPLGGIAAKRSLPEAMRRQVET LIRQSIEYAFAHPEASRAYIKEHAQELDDAVIDAHIALFVNDYSLSLGIEGRRAVEALTG IVLRSKAQNNEK >gi|313159013|gb|AENZ01000021.1| GENE 128 136314 - 136547 456 77 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159077|gb|EFR58452.1| ## NR: gi|313159077|gb|EFR58452.1| hypothetical protein HMPREF9720_1501 [Alistipes sp. HGB5] # 1 77 1 77 77 102 100.0 9e-21 MNKKWTLWAGLILILVVFILIWQFAPAIQRWLANVFIDIIVLALTFAAGWLLGRYGGRRN RNRDQENVRNITEAQKR >gi|313159013|gb|AENZ01000021.1| GENE 129 136540 - 137319 672 259 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159117|gb|EFR58492.1| ## NR: gi|313159117|gb|EFR58492.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 259 1 259 259 445 100.0 1e-123 MKRIFIFPTIEEAKAFILTQPRDPVFVAGVGAAEIAAATIRAVKARKPQLVVLAGIAGAY DRSLQRGEVVEVVTERIAGIPERYAREYETSGPDLGLPLAAGVTVNRSGDGLRETGRKQE QEQEQEQEFRETGYGIPEPETKSGDAVSAPAGQTLQPTGERADSPEQPSDMQSAIGDTNG TAGQGSLPGIDGTDGQSALPEIENMEGAPFFAVCEALGVACCQIRAVSNYVGEPFDRWAV GLAVENLTATLTQIFGNDE >gi|313159013|gb|AENZ01000021.1| GENE 130 137321 - 138712 2155 463 aa, chain - ## HITS:1 COG:FN1949 KEGG:ns NR:ns ## COG: FN1949 COG0006 # Protein_GI_number: 19705251 # Func_class: E Amino acid transport and metabolism # Function: Xaa-Pro aminopeptidase # Organism: Fusobacterium nucleatum # 1 458 1 454 462 337 38.0 4e-92 MFSAKTYAARRNMLRTKIGSGIILLPGNSLSPNNYPNNAYYFRQDSSFRYYFGLNTPSLA GLIDADTGEEALYGDDFTVEDIIWTGPQPTLRELGAEVGVTATFPMAELEKRLRKAVSLG RKIHYLPPYRGETKLQLSALLGIKPELLHDYKSVELMFAVAEMREKKSAEEVEAMERAFL IGYQMHTLAMKMCRPGVVEREIAGAIEGIAKSSGSGLSFPSIVSQHGETLHNLNADGVLE QGRLLLCDAGCETVDGYCSDHTRTYPVNGRFTQKQKEIYNVVLAAHDHVARIVKPHMMYT EIHNAAYMTLAEGLVALGLLKGTAADAVASGAMTMLMPHGLGHGLGMDVHDCEAMGERSF DFSTIADRAAKSGTCIYRAAWRIEPGTVMTDEPGLYFIPALIDKCRAEGLYKGIVDYDAL EAYRDFGGIRIEDDILVTEEGSRMLGDRKIPITVEELEAAVGR >gi|313159013|gb|AENZ01000021.1| GENE 131 138911 - 139978 1359 355 aa, chain - ## HITS:1 COG:slr1534 KEGG:ns NR:ns ## COG: slr1534 COG1619 # Protein_GI_number: 16330906 # Func_class: V Defense mechanisms # Function: Uncharacterized proteins, homologs of microcin C7 resistance protein MccF # Organism: Synechocystis # 50 350 8 296 301 157 36.0 3e-38 MKYPFLLAALLLCSANLAARHSDTDTLLQPDTLSLRQSDALSAQKSDAPFVRPPYLRPGD TIGIVTPARKLKEKADTAKVRERFEEWGLKVKFGAHYADREQPYFAGTDARRAADLQAMI DDPGVKAVVSYQGGYGSVRLLPLLDLTPLREQPKWIVGFSDVTMLHMALGRLGVESLHAT MPGKFRFGADEKPEAIVSDESLRSALFGRWTRIDAAAHPLNVSGTARGRLAGGNLSLLCS AIGTPEQPDFDTPTVLFIEEIGEQMYRLDRMMQQLERSGILAKCKAVLVGHFTDMLGQKH FGVWDPCDIIAAYVRPLGIPVVFGIPAGHEDPNVALYMGREVAVTVNDAGASVEF >gi|313159013|gb|AENZ01000021.1| GENE 132 140052 - 141413 2053 453 aa, chain - ## HITS:1 COG:YPO1712 KEGG:ns NR:ns ## COG: YPO1712 COG0477 # Protein_GI_number: 16121972 # Func_class: G 
Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 1 440 11 450 455 481 60.0 1e-135 MPRRLWAILAVAFGVGLSVLDSAIANVALPTIGQELGISSADSIWIVNAYQLAIMVSLLS FSALGELVGYRKVYIGGLMLFTVASVGCALSSSLATLVLARVMQGFGAAAVTSVNTTLIR IIYPKAQLGRGLGLNATVVAVSSVAGPTIAAGILSIAHWPWLFAVNIPVGLVALSLSRRF LPDNPVRVGTHRFDWRDAVMNALTFGLLMASVEGFSHGLDPRILSAGIAALLVVGFFFIR SQLREPYPLLPFDLLRIPIFSVSVLTSICSFLGQMLAMVALPFYLQHEFGYDEVATGLLM TAWPAVIMVVAPIAGLLVERVHAGFLGGMGLTAMAAGLFLLAFLPDDPAPFDIAWRLVLC GAGFGLFQSPNNSILIASAPPQRSGSASGMLATARLVGQITGAALMALLFHIVPENSTHT ALLLAGGFALTGAVVSITRIRLPLPEGLTRRRK >gi|313159013|gb|AENZ01000021.1| GENE 133 141626 - 143557 3377 643 aa, chain - ## HITS:1 COG:AGpA267 KEGG:ns NR:ns ## COG: AGpA267 COG1506 # Protein_GI_number: 16119416 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 4 629 6 653 654 350 35.0 4e-96 MKKLLLFAAAAAVLASCAPKTAPELPMETFFRNSEKTGYQISPDGRYFSYMAPYESRRNI FVQPVDGKQAVRITSETERDLAGYFWAGDNRILYLKDTGGDENYQLYGVNIDGSDPKAYT AIPGVRTQIIDPLEEIDSLIIIGTNQRNPMIFDPYRLNLNTGEMTMLCENPGDIQGWQTD HNGRLRVAYAIVDGVNSQIRYRETEAEPFRPVLTTNFKESVSFAAFTPDNKQVYAVTNLG RDKVALVLMDPATCEEIEQLYVNDKYDLDNIWYSDAQKKLLGVSYTGHKGTARHFFDKAT GEMFGRMEKHLRGYAIGIAGSNKAEDKYIVYAGGDRTMGTYYLYDVEADTMTKLADLAPW IEEEQMAEMIPINYTSRDGLEIEGYLTLPVGKTVHNAKNLPVVVNPHGGPWARDYWGFNP EAQFLANRGYAVLQMNFRGSTGFGRKFTEIAYGKWGQTMQDDITDGVNWLIGKGIADPAK IAIYGGSYGGYATLQGIVKDPDLYACAIDYVGVSNLFSFLETIPPYWKPMLDMMYEMVGN PEKDAEMLRENSPALNAERIKTPLLVVQGANDPRVNINESNQMVEALRSRGVHVDYMVKD NEGHGFHNEENRFDFYRAMEKFLAKYLKGVEPQGGIVPESDRR >gi|313159013|gb|AENZ01000021.1| GENE 134 143562 - 143765 352 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159043|gb|EFR58418.1| ## NR: gi|313159043|gb|EFR58418.1| hypothetical protein HMPREF9720_1507 [Alistipes sp. 
HGB5] # 1 67 1 67 67 105 100.0 1e-21 MEIEKIIRMLAVFIVVAAFALSVAVQLLFPEVSHLSDTPLLILLILSVCAIWRDALRSEK QNENTNE >gi|313159013|gb|AENZ01000021.1| GENE 135 143969 - 145189 1751 406 aa, chain - ## HITS:1 COG:aq_1015 KEGG:ns NR:ns ## COG: aq_1015 COG0826 # Protein_GI_number: 15606313 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Aquifex aeolicus # 1 406 7 400 409 269 37.0 7e-72 MAPVGSYESLAAAIQAGADSVYFGVGKLNMRSASAANFTLDDLAKIVATARTAGVKTYLT VNTIVYEDELRTVHEVIDRAKAEGIDAVIASDFAAILYARRIGVEVHISTQSNISNSEAV KFFSQWADTVVLARELTLEQVARIHREIAENDIRGPRGELVQLEMFAHGALCMSVSGKCY LSLYETNCSANRGACRQLCRRKYTVTDKETGAALDVDGRYVLSPKDLCTVDFLDKFIGAG VRVLKIEGRARGAEYVKRVVECYDEALRAIEAGTYTPELAAGLKERLATVFNRGFWEGYY AGRPVAEHSEHYGSAATRRKVYVGKVTNFYKRISVAEVLVEAAPLAVGEEIFFMGATTGV AEQTLAELHDTDGKPVPSVAQGTLCAVRTQGTIRRGDQLYKFVDAQ >gi|313159013|gb|AENZ01000021.1| GENE 136 145215 - 146189 1597 324 aa, chain - ## HITS:1 COG:MA0606 KEGG:ns NR:ns ## COG: MA0606 COG0142 # Protein_GI_number: 20089495 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Methanosarcina acetivorans str.C2A # 25 245 30 253 324 188 44.0 1e-47 MMQTNEQLLLAIENYLAQTEFPAEPERLYAPIGYSLAGGGKRLRPMLLMLAHGIFTDRFQ AALPAAAAVEVFHNFTLLHDDIMDNAAVRRGKPSVYAKWGPSVAILSGDAMMICAYRLLS EVPAELLPRILGTFNTMALEVCEGQQYDMDFESKRKVSVVEYMHMIELKTSVLLAGSITI GAMLGGASEEDCRKLRRFAIELGLAFQLQDDLLDSYGDDRLGKAIGGDILEGKQTYLMIT AMSRADEATREALRTTRLDARLSDAEKIAAVKSIYDRLDVPRLTEQQISLRFERALSILD TLSADKARTQRIREYAESLIGRKN >gi|313159013|gb|AENZ01000021.1| GENE 137 146307 - 147311 374 334 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 1 330 1 319 319 148 30 1e-34 MKKLAIGVDIGGINTAFGLVDENGDLYAESVISTKKYPHVDDYPAYIEDLCQAMHALADS LSFEYELTGIGIGAPNANFHKGTVENPANLWKFREGDPNPDESRRLFPLADDIGRCFGGV KVLVTNDANAATIGEMIYGNAKGMRDFVMITLGTGLGSGFVSNGEMIYGHDGFAGEFGHI 
IVERNGRECGCGRRGCLETYVSATGIKRTAFELMATMTAPSKLRDIAFADFDASMISAAA EQGDPVALEAFRYTGELLGRALADVVTVTSPEAIFLFGGLSKAGKLIFEPTQWYMEENML FVFKNKVKLLPSGIQGKNAAILGASALIWQQENK >gi|313159013|gb|AENZ01000021.1| GENE 138 147607 - 148545 1732 312 aa, chain - ## HITS:1 COG:no KEGG:Odosp_0756 NR:ns ## KEGG: Odosp_0756 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 34 311 17 294 295 144 28.0 4e-33 MEENKQGYDPSEFEDDDYMTPQPDAGKSIRGYRIVIIILSVILVALSALYFSIHRQQMID NRLLQSDRDSIQNDLGRLMTDFDNLQVTNDSISAGLTIERDRADSLMTRLKKERSWNLAK IKQYEKEVGTLRTIMKGYIRQIDSLNTLNKRLISENVGYRKEISSAKLRAEMAEEKAAEL DNKVRVGAVIRARDITLAALNANSKPVSRVKNAARLRVDFVLTANELATPGEKTIYVRIT SPDGYVLTTEAMPTFDFEGERLSYSAMREVDYQNQDLDVGIFYNSTGFAAGTYTVQLFCE GRLIGTSQIAMR >gi|313159013|gb|AENZ01000021.1| GENE 139 148622 - 149419 1333 265 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1245 NR:ns ## KEGG: Cpin_1245 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 264 1 262 264 130 30.0 4e-29 MKRLLITILFLSAVLPLPAQLYHPGEQLFYRVSYKAKMFPNTEVGAVEVKTSDSEIAGRK YYKVEGIGRTLPTYRWFFNLEDIYTVWIDPETKRPVRFESDLHEGDYTFQSYYNYDWENN QVFTRWRRRQKPYQEKTMPLTPESMDAIALFFTMRGSDADSFKPGEPATLQMVLQDTIRH LSYRFINRETKKIRNMGKFKTLKFECQLGTTEGFSFTDGTVFTLWISDDENKIPLYIESP VRVGSINAYISGYKGLKYPMTSLIK >gi|313159013|gb|AENZ01000021.1| GENE 140 149462 - 151600 3202 712 aa, chain - ## HITS:1 COG:slr0020 KEGG:ns NR:ns ## COG: slr0020 COG1200 # Protein_GI_number: 16331409 # Func_class: L Replication, recombination and repair; K Transcription # Function: RecG-like helicase # Organism: Synechocystis # 32 685 151 807 831 476 41.0 1e-133 MSPDFHYICRVEYLDNDIKYVGGVGEARARLLDKELGIRTLGDMLSHYPFRYIDRTKVYR IAEITEGAPTLLQFRARITGVAYAGTGRKKRFTAYVQDPTGSAELVWFQGIKWIEKRVEV GREYLIFGRPSFYRGLLSMAHPELETMEQALSRKAESGMQGIYPSTEKLSNVLGAKGMYQ IICNTWALVKDHITDCMPEAVRARYGLIPLRDAYYNIHFPQSAEMLRQAQYRLKFDELLG IQLNVQSRRTERLSKNNGFLFMKVGGVFNTFYNEKLPFPLTGAQKRVVKEIRQDTVTGFQ MNRLLQGDVGSGKTLVALMSMLLAVDNGFQACMMAPTEILARQHFATITRMLEGMDVKTA 
VLTGSSKVKERRLALEGIASGEVDILIGTHALIEDRVQFANLGFVVIDEQHRFGVEQRAR LWTKNEQPPHILVMTATPIPRTLAMTLYGDLDVSVIDELPPGRRPIKTVHYTDAARLRLF GFMKQEIAKGRQVYVVYPLIKESEAMDYKDLTDGYEAISRDFPLPDYVTTICHGKMKPAD KEESMRQFKSGEADIMVATSVIEVGVDVPNATVMVIESAERFGLSQLHQLRGRVGRGGEQ SYCILMSGEKLSKESRARLQAMCETNDGFRLAELDLKLRGAGDINGTLQSGMAFDLKIAN PTLDVQILTVSREAAAAILTADRDLSLPEHRGLQELRRKYSGQEEIDFSMIS >gi|313159013|gb|AENZ01000021.1| GENE 141 151690 - 152604 958 304 aa, chain - ## HITS:1 COG:RSp1314 KEGG:ns NR:ns ## COG: RSp1314 COG0491 # Protein_GI_number: 17549533 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Ralstonia solanacearum # 82 293 33 260 335 68 26.0 1e-11 MKPTLRKRLSYLLLALLDFASACDREPVDRNSAGNSIFAAEIQPVIMPKNMSALRPFLFA LLLAATAAACTFRRIDIGNDITVQRLAEGIYLYTAVDEIEGYGPVPSNGVLIVREGEAVL LDTPVGDAQTKTLTDWAAETLRACVTTFVPNHWHSDCMGGLGYLKTLGVKSYAHRLTHSI AQREGKPVPDIGFGDSLRLDLHGTEIRCYYFGGGHSEDNIAVWIPSAKLLFAGCMVKEMA AQNAGNLSDAVPAAWPATLDSLLVRFPDARIVVPGHGAPGGPELIRHTKALLSEAASTDN RRKE >gi|313159013|gb|AENZ01000021.1| GENE 142 152862 - 154343 1953 493 aa, chain + ## HITS:1 COG:CPn0373 KEGG:ns NR:ns ## COG: CPn0373 COG0821 # Protein_GI_number: 15618288 # Func_class: I Lipid transport and metabolism # Function: Enzyme involved in the deoxyxylulose pathway of isoprenoid biosynthesis # Organism: Chlamydophila pneumoniae CWL029 # 7 291 11 312 613 243 43.0 8e-64 MNLSEYSRRPSGEVRIGRVVIGGGHPVAVQSMTNTDTNDTEACAAQIERIDRAGGGIVRL TAQGRREGENLANIVRRVREDGFDTAIVADIHFVPEVAAIAAKYVDKVRINPGNYRTDHG ELEALIAQCRERGVALRIGVNHGSLSKRVFDRWGDTPQGMVVSAMEFLRVCKAQGFDQVV VSMKSSNTRVMVAAYRLLVEAMDAEDMHYPIHLGVTEAGNGIEGRIKSAVGIGALMADGI GDTIRVSLTEAPENEIPVAELLVKHFARRPGKFPVLHPERYSPTEYRRRTNVAVPVVHTE PLEGFCVIEAVSENPTAELRAAILNLDTPRPVVVKRRYGETSPETLAVKAAADLGVLFLD GLADGIWIDAPGFAEEEIRNIELMILQAARVRFSHTEYIACPSCGRTLYDIEKTLAAVKS RTSHLKNLKIGVMGCIVNGPGEMADADYGYVGAAPGRITLYKGRTVVERNIPQEEALDRL VELIRDNGDWVEP >gi|313159013|gb|AENZ01000021.1| GENE 143 154340 - 154942 643 200 aa, chain + ## HITS:1 COG:no 
KEGG:PGN_1944 NR:ns ## KEGG: PGN_1944 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 3 198 11 210 212 153 40.0 4e-36 MTVLPLAYLPSVEYFAHLLRGGCVVDLGEHFVKRSERNRARILATDGVMELTVHVRNANR PRQPVRDVRIDYSKRWQHQHWGALVASYKGSPYFDFYAEYFEPFYRREYGFLADYNRGLL ELLCSLTHVPMPDFSESYVDAAAGDLDLRPKRKKEGPAFIAETYVQVFSDRMPFQPNLSV ADLLFAEGPAAASVLARCRL >gi|313159013|gb|AENZ01000021.1| GENE 144 154866 - 156896 2442 676 aa, chain - ## HITS:1 COG:SSO3036 KEGG:ns NR:ns ## COG: SSO3036 COG3250 # Protein_GI_number: 15899743 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Sulfolobus solfataricus # 29 563 15 554 570 140 25.0 7e-33 MKRYIATLLLLAACASVQAREVFPLNEGWRFFFKSENSSDNARHVTLPHTWNTDTGACGY FLETTANYQNDMYVPAEWASKRLFVKFYGVQNVADLFVNGYHVGAHRGGSTAFTFEITDK IRFGEDNALLVVVSNNSRDDVLPASTDMNLYGGIYREAELILTGKTAVSPLHLGSEGVLV RQNSVTSALVEGEAEIYLTSAGESTCMLTLDITAPDGRKVFTKRQKTRLDGRPVVIPFSI ADPQLWSPSSPALYRVTASIGEETVTDSVTVRTGFRNIQVTTAGGLTINGERIPVHGVTL YHDNAISGGAVLAQDYDADLQQIRDLGANALRSAVMPHAQYLYDRCDEQGLLVWVDSPLH RSSFLGDVAYFATPQFEQNGIQQLQEIIAQNYNHPSVVMWGIFSRLWMRGDDVTPYLRRL NDTAHAMDRSRPTVACSDQNGGLNFITDLIVWRQDVGWRKGSTDDVAVWRNQLQKNWSNL RSGVCYGGSGFIGHKSYTAQAAPRSNWMPEERQTRFHEEYVKNLQNDSLFWGTWINNMFD YGSARRPYGINGEGLVTIDRRERKDAYYLYRALWNERKPTLHIVDKRRSLRDRNRQAFSV YSSVGAPTLFVGADTVAMTQYAACQYRSDSVEIQGIVQVKAVAGEQCDSVTLRVGNVLKP KRQPVPRRTAGPQQTN >gi|313159013|gb|AENZ01000021.1| GENE 145 157351 - 160386 3324 1011 aa, chain - ## HITS:1 COG:no KEGG:BT_1502 NR:ns ## KEGG: BT_1502 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 211 1002 194 937 945 109 25.0 6e-22 ELNNRLDELTTGKIATLESQLSSLQTSVDNLKSADEALGKRIDELKSDADANAKDIEALE KAQEQLQKDIEAIEKDLSDNYVTKSYLNTTLSSYATTKYVGDAVAAVTQNLGKFTTEKAI QDAINAAKDAAIKAAGDACKDAFQTSFDAAVASANLVNETKLTKAIDDYDVKIKQYLEDA VTKEDGFINKAIAQAVSDAVADLQAMISGRLTSVLLIPDLYVGGIETIELKSLSYEAWKV TSTASAETVAQGSTTSTTAASATSVRYNVSPSVVTKEDVKEPSFVFEKAEVRTRAAVSEQ 
LLSVASWDIAGGVLTVDVKKTAGTALTLDANHVYTAALKVPLADKYLGEGEVGTAVYSDY VALTETAIEPKIAALIDLNSGKATDGKFECDETATHYHFSAKYEDAQAADPSFTEAYNKP IDLLSMVTGCYIDNDEAKEITKETLKAAGLEFRFALPTKPYTVGDNDTDQQKFASVANTD GAWTLTSKLPDGTTNNQAAIDKTPIVRVELVDVNNKNAVVDVRYFKVQWLREKIAPVDLE ILKTFNYTLTCNAFTGSFTWEEMVTKVLGKLGENGLSQAEFIATYAAPEITATDHTVGTV AQAAADGVELVYNFDAAAAESAAAFTWTLSPEQIGNVIADLIAGKEVKKVVNVTIPAQNA YQGKLTFSFVVNIKKPTLPSIYGYDSSFWHSDYTLAYVYPIQYNTPGAYATCAYRYELDR LFSDSKPVKDMLPCGKWDIQFAKKQPATGYAPALTGNTEPAGETDKTGYTLKKGALDAVQ LNYNGMGDNWYAPETTNTGAVNAAAAADITVQVLNNEAGIGILKQQATLKVWAAINPYNH IELTDFNVYFVEPLKINTELTNAYFVDQIISGSEVDCSKAFTMTDFKDYIVAEVTTGTGE KEKYAAELYAYYAVNAPAWDLANAKTNLKKDANGNYVVDEAITAETAKIKAADRFGVNCI TKNGSKLVFKNINGVKVEKTVKLFIPVTVSHKWGTMTANVTIELHPEEPAN Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:44:10 2011 Seq name: gi|313159010|gb|AENZ01000022.1| Alistipes sp. HGB5 contig00036, whole genome shotgun sequence Length of sequence - 2416 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 164 - 223 6.7 1 1 Tu 1 . + CDS 309 - 1076 514 ## Bache_0848 hypothetical protein + Term 1096 - 1136 4.7 + Prom 1412 - 1471 3.8 2 2 Tu 1 . 
+ CDS 1493 - 2323 691 ## BVU_0433 hypothetical protein Predicted protein(s) >gi|313159010|gb|AENZ01000022.1| GENE 1 309 - 1076 514 255 aa, chain + ## HITS:1 COG:no KEGG:Bache_0848 NR:ns ## KEGG: Bache_0848 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 252 13 279 282 63 21.0 6e-09 MDPLYDYVSEIANRMYPTWVFDDSCGEYFQTDSFRGWDFVHQLEDGRLKSPFTKEQYLEN APIIRSLRAMGIDTEKFWMAILFVYDVVQERTENVQQVPTSVFEEFRTFAKYLQENPDAK IRAWRGREHGATLESDLAKRMLGKLLADNVAELYAKLSNTRSFGLDGFSVNLKSCYKITL AVKCFLPLLEKFKEEDNRSTNPNKVSYNRMLLISRIVYFFGYTDNPKFLDNDESISGIWT SYKDKEWTTVGANFR >gi|313159010|gb|AENZ01000022.1| GENE 2 1493 - 2323 691 276 aa, chain + ## HITS:1 COG:no KEGG:BVU_0433 NR:ns ## KEGG: BVU_0433 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 13 271 14 283 296 119 32.0 1e-25 MSVISVIVNRNFAFVKGNRPTNSKAVTAKMKSIEEYGLLSPITVVDGEQVITSGGHLVDL NGKDIPDSQSVNYYAVLDGQHRLIAYIKLGLNLNDLVITEPLNVDMSIAALIAEMNICTT TWKGTDYMAAPAMTLSKTNEVFEFAVQLRSKGFPLATISQWCTGTNSLKPKDLVNCVKSG ELPKILQSETWYQRSIRWYEAAQEKFSDSFLSKKYLITYIIMRYNNAADPVAYCRQIEQA LKQLTPAQATEIMEARKIGLKSREQVVVELLEQYLG Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:44:37 2011 Seq name: gi|313158953|gb|AENZ01000023.1| Alistipes sp. HGB5 contig00011, whole genome shotgun sequence Length of sequence - 61565 bp Number of predicted genes - 57, with homology - 55 Number of transcription units - 27, operones - 14 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 30 - 890 565 ## COG3829 Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains + Term 953 - 988 3.5 - Term 1141 - 1180 2.9 2 2 Op 1 . - CDS 1186 - 1503 526 ## gi|313158961|gb|EFR58338.1| hypothetical protein HMPREF9720_0303 3 2 Op 2 . - CDS 1514 - 3316 2876 ## COG1217 Predicted membrane GTPase involved in stress response - Prom 3381 - 3440 4.0 4 3 Tu 1 . 
+ CDS 3400 - 4416 1429 ## COG0337 3-dehydroquinate synthetase + Term 4448 - 4478 2.1 - Term 4436 - 4465 1.1 5 4 Op 1 . - CDS 4489 - 5769 1893 ## COG0281 Malic enzyme 6 4 Op 2 . - CDS 5842 - 6684 669 ## BDI_3048 hypothetical protein 7 4 Op 3 . - CDS 6728 - 7741 988 ## BDI_3048 hypothetical protein 8 4 Op 4 . - CDS 7778 - 8785 1514 ## COG2255 Holliday junction resolvasome, helicase subunit - Prom 8862 - 8921 2.6 - Term 8891 - 8929 12.3 9 5 Tu 1 . - CDS 8951 - 9643 1107 ## BT_4719 hypothetical protein - Prom 9690 - 9749 2.3 + Prom 9615 - 9674 9.7 10 6 Op 1 . + CDS 9780 - 10334 702 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 11 6 Op 2 . + CDS 10338 - 11636 642 ## Bache_1804 hypothetical protein + Term 11771 - 11826 3.6 - TRNA 11871 - 11945 55.6 # Arg CCG 0 0 + Prom 11907 - 11966 8.0 12 7 Op 1 . + CDS 11990 - 12619 760 ## Coch_0868 putative phage repressor 13 7 Op 2 . + CDS 12625 - 13221 827 ## COG0307 Riboflavin synthase alpha chain 14 7 Op 3 . + CDS 13224 - 14273 1791 ## COG0642 Signal transduction histidine kinase 15 7 Op 4 . + CDS 14307 - 15176 1200 ## COG0682 Prolipoprotein diacylglyceryltransferase 16 7 Op 5 . + CDS 15188 - 16252 954 ## gi|313158971|gb|EFR58348.1| hypothetical protein HMPREF9720_0318 17 7 Op 6 . + CDS 16252 - 16737 683 ## Palpr_1847 hypothetical protein + Term 16818 - 16863 4.4 18 8 Tu 1 . - CDS 16846 - 18921 3109 ## COG0556 Helicase subunit of the DNA excision repair complex - Prom 19062 - 19121 2.8 + Prom 18891 - 18950 5.3 19 9 Tu 1 . + CDS 19090 - 21063 3121 ## COG0358 DNA primase (bacterial type) + Prom 21134 - 21193 2.6 20 10 Op 1 . + CDS 21241 - 21735 853 ## COG0691 tmRNA-binding protein 21 10 Op 2 . + CDS 21732 - 22508 815 ## COG0561 Predicted hydrolases of the HAD superfamily 22 10 Op 3 . + CDS 22566 - 23273 917 ## COG1214 Inactive homolog of metal-dependent proteases, putative molecular chaperone 23 10 Op 4 . 
+ CDS 23289 - 23855 487 ## gi|313158983|gb|EFR58360.1| hypothetical protein HMPREF9720_0325 + Term 23924 - 23962 6.0 - Term 23906 - 23954 10.3 24 11 Op 1 . - CDS 23973 - 26588 4159 ## COG1501 Alpha-glucosidases, family 31 of glycosyl hydrolases - Prom 26611 - 26670 2.3 - Term 26612 - 26643 -0.8 25 11 Op 2 . - CDS 26704 - 28185 1957 ## COG1322 Uncharacterized protein conserved in bacteria 26 11 Op 3 . - CDS 28224 - 28418 152 ## + Prom 28274 - 28333 4.1 27 12 Op 1 . + CDS 28366 - 29934 2182 ## BF1024 hypothetical protein + Term 29956 - 29993 5.1 28 12 Op 2 . + CDS 30002 - 30649 194 ## PROTEIN SUPPORTED gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) 29 12 Op 3 . + CDS 30646 - 31842 1589 ## BVU_3608 hypothetical protein + Prom 31854 - 31913 2.1 30 13 Tu 1 . + CDS 32013 - 32639 923 ## COG0036 Pentose-5-phosphate-3-epimerase + Term 32813 - 32844 0.6 - Term 32444 - 32469 -0.5 31 14 Tu 1 . - CDS 32629 - 33189 880 ## COG1896 Predicted hydrolases of HD superfamily - Prom 33323 - 33382 5.4 + Prom 33261 - 33320 2.2 32 15 Op 1 . + CDS 33344 - 35329 2922 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains 33 15 Op 2 . + CDS 35392 - 36123 390 ## PROTEIN SUPPORTED gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 34 15 Op 3 . + CDS 36154 - 37872 2047 ## PM1469 hypothetical protein 35 15 Op 4 . + CDS 37869 - 38441 453 ## COG1451 Predicted metal-dependent hydrolase + Term 38444 - 38480 0.5 - Term 38391 - 38425 5.3 36 16 Op 1 . - CDS 38454 - 39803 2066 ## COG2271 Sugar phosphate permease 37 16 Op 2 . - CDS 40053 - 41633 2063 ## COG0578 Glycerol-3-phosphate dehydrogenase - Prom 41653 - 41712 6.9 + Prom 41584 - 41643 2.3 38 17 Op 1 . + CDS 41695 - 41814 58 ## 39 17 Op 2 . + CDS 41826 - 42620 926 ## COG1349 Transcriptional regulators of sugar metabolism + Term 42672 - 42705 5.1 - Term 42653 - 42699 14.7 40 18 Tu 1 . 
- CDS 42728 - 43483 816 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain - Prom 43547 - 43606 2.6 + Prom 43486 - 43545 1.7 41 19 Tu 1 . + CDS 43575 - 44225 596 ## COG3506 Uncharacterized conserved protein + Term 44256 - 44300 15.2 - Term 44244 - 44288 15.2 42 20 Tu 1 . - CDS 44310 - 45074 1309 ## COG0584 Glycerophosphoryl diester phosphodiesterase - Prom 45310 - 45369 3.5 + Prom 45248 - 45307 4.9 43 21 Tu 1 . + CDS 45340 - 46470 1807 ## COG0153 Galactokinase + Term 46489 - 46531 7.6 - Term 46482 - 46515 6.1 44 22 Op 1 . - CDS 46538 - 47983 1605 ## AMED_4170 hypothetical protein 45 22 Op 2 . - CDS 48022 - 48543 339 ## PROTEIN SUPPORTED gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase 46 22 Op 3 . - CDS 48543 - 49772 326 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase - Prom 49802 - 49861 8.5 - Term 49876 - 49914 8.6 47 23 Tu 1 . - CDS 49938 - 50819 1247 ## COG0331 (acyl-carrier-protein) S-malonyltransferase - Prom 50845 - 50904 2.8 + Prom 51180 - 51239 8.0 48 24 Op 1 . + CDS 51259 - 51969 212 ## PRU_2509 putative lipoprotein + Prom 51971 - 52030 3.1 49 24 Op 2 . + CDS 52110 - 52838 611 ## gi|313158992|gb|EFR58369.1| hypothetical protein HMPREF9720_0351 + Term 52877 - 52927 12.7 - Term 52868 - 52912 9.7 50 25 Op 1 . - CDS 53035 - 53826 1222 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 51 25 Op 2 . - CDS 53883 - 55832 3032 ## COG0363 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase 52 25 Op 3 . - CDS 55829 - 56479 700 ## Odosp_1650 hypothetical protein 53 25 Op 4 . - CDS 56476 - 57579 1692 ## BF0928 hypothetical protein - Prom 57703 - 57762 6.5 - Term 57753 - 57797 7.3 54 26 Op 1 3/0.000 - CDS 57816 - 58781 1427 ## COG1482 Phosphomannose isomerase 55 26 Op 2 1/0.000 - CDS 58789 - 59898 1822 ## COG1940 Transcriptional regulator/sugar kinase 56 26 Op 3 . 
- CDS 59909 - 61042 1875 ## COG0738 Fucose permease - Prom 61097 - 61156 2.9 + Prom 61165 - 61224 2.0 57 27 Tu 1 . + CDS 61253 - 61414 245 ## gi|313158955|gb|EFR58332.1| hypothetical protein HMPREF9720_0359 Predicted protein(s) >gi|313158953|gb|AENZ01000023.1| GENE 1 30 - 890 565 286 aa, chain + ## HITS:1 COG:aq_091m KEGG:ns NR:ns ## COG: aq_091m COG3829 # Protein_GI_number: 15607134 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Transcriptional regulator containing PAS, AAA-type ATPase, and DNA-binding domains # Organism: Aquifex aeolicus # 1 279 253 529 530 217 45.0 2e-56 MSVLIFGENGTGKEHIAHHLHDKSKRAGKPFVPVDCGSLSKDLAPSAFFGHVKGAFTGAD STKKGYFNEAEGGTLFLDEVGNLALETQQMLLRAIQERRYRPIGDKTDKSFNVRIIAATN EDLEKAVIEKRFRQDLLYRLHDFEITVPPLRDCQEDIMPLAEFFREIANNELECSVTGFD AEARKTLLTHPWPGNVRELRQKIMGAVLQAQTGVVMKEHLELAVTKPTSPVSFALRNDAE DKERILRALKQANGNRKVAAELLGIGRTTLYSKLEEYGLKYKFQQS >gi|313158953|gb|AENZ01000023.1| GENE 2 1186 - 1503 526 105 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158961|gb|EFR58338.1| ## NR: gi|313158961|gb|EFR58338.1| hypothetical protein HMPREF9720_0303 [Alistipes sp. 
HGB5] # 1 105 1 105 105 173 100.0 4e-42 MDHFTLSLAVGAVAGALDVIPMIFQNLSARSCFSAFFIYFFAAVIVFYSDLPYLPWWADG MAVTLMLMIPVLFTFSGKDRKAIPVVVFNGLLFGFLIGVAERYLG >gi|313158953|gb|AENZ01000023.1| GENE 3 1514 - 3316 2876 600 aa, chain - ## HITS:1 COG:DR1198 KEGG:ns NR:ns ## COG: DR1198 COG1217 # Protein_GI_number: 15806217 # Func_class: T Signal transduction mechanisms # Function: Predicted membrane GTPase involved in stress response # Organism: Deinococcus radiodurans # 5 596 4 593 593 627 54.0 1e-179 MQKLRNIAIIAHVDHGKTTLVDKMILAGNILRDNAKQSGELILDNNDLERERGITILSKN VSVIYKDYKINIIDTPGHADFGGEVERVLNMCDGVLLLVDAFEGTMPQTRFVLQKALALG KKPIVVVNKVDKPNCRPEVVNEQVFDLMFSLDATEEQLDYKTIYGSAKQGWMSHRWNEPT DSIVPLLDAIIDEIPEPKIVEGTPQMLITSLEYSSYTGRIAVGKVTRGSLKAGQNVTLAK RDGVTMQKTKIKDLMVFEGLGKKKADEVPCGEICALMGIEGFEIGDTICDYENPEPLPPI AIDEPTMSMLFTINNSPFFGKDGKYVTSRHIKDRLDRELEKNLALRVTPGPSADSFNVFG RGVLHLSVLIETMRREGYELQVGQPRVIVKEIDGKKCEPIEELTVDCPEEYSGTVIELAT KRKGTLTNMASNGDRTRLEFSIPSRGIIGLRSNMLTATAGEAIMTHRLKGFEPWVGDIEM RVNGSIISGETGTAYAYSIDKLQDRGRFFISPMEQVYEGQVIGEHTRQNDITVNVTKAKQ LTNMRASGSDEKTSIAPPKIFSLEEALEYIQADEYVEVTPHSMRLRKILLHEVDRKRASK >gi|313158953|gb|AENZ01000023.1| GENE 4 3400 - 4416 1429 338 aa, chain + ## HITS:1 COG:alr1924 KEGG:ns NR:ns ## COG: alr1924 COG0337 # Protein_GI_number: 17229416 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate synthetase # Organism: Nostoc sp. 
PCC 7120 # 6 321 8 343 363 211 40.0 2e-54 MKPTFNVREQSEIYIGPTADILSGVLPEGRVIVVSDATIDRLYHPLLAPYDTVLIGAGES IKTLQTVESIYRRFIELGADRSTFVLAVGGGIVTDVAGFAAATYMRGLSFGFVSTTLLGQ VDASVGGKNGVNVDGYKNMAGTFKQPRFVICDPEMLRTLSDREFRTGLAEVVKAAVIADA DLFTRIENVSFEALRADTDLLTDAISAAVRVKADIVERDEHETGDRRKLNLGHTLAHAIE KCSNRMNHGEAVAVGTALIADAAVKLGVLLPADRDRIVNVLEKLGFDLTPPVDVKRLLKE VGKDKKSEDGMLRIVLPVGIGDCEVRPMPIADFAALFA >gi|313158953|gb|AENZ01000023.1| GENE 5 4489 - 5769 1893 426 aa, chain - ## HITS:1 COG:maeB_1 KEGG:ns NR:ns ## COG: maeB_1 COG0281 # Protein_GI_number: 16130388 # Func_class: C Energy production and conversion # Function: Malic enzyme # Organism: Escherichia coli K12 # 5 420 3 420 434 481 60.0 1e-135 MSKGNKMDQAALRYHSEGRPGKIAVVPTKPYHTQHDLSLAYSPGVAAPCRAIEANPDDVY RYTNKGNLIAVISNGTAVLGLGNIGALAGKPVMEGKSMLFKTFADIDAFDIEVDETDPEA FIRTVKAIAPTFGGINLEDIKAPECFEIDRRLSEELDIPVMHDDQHGTAVISTAALLNAA KIAGKALDKLRVVVNGAGAAAIACARLFIAVGVRRENMIFCDSQGVVTTYRTDINEIKRE FATTRRITTLAEAIHDADVFLGVSKADVLTPEMLRTMATNPIVMALANPDPEIAYDKAVV SRPDIIFATGRSDYPNQVNNVLGFPYIFRGALDVRATKINEAMKLAAARALAELAQEPVP SMVLRAYALGKLEFGRGYLIPKPLDPRLLCTVTPAVARAAIESGVARQPITDWDAYIEKL RTMTEI >gi|313158953|gb|AENZ01000023.1| GENE 6 5842 - 6684 669 280 aa, chain - ## HITS:1 COG:no KEGG:BDI_3048 NR:ns ## KEGG: BDI_3048 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 110 280 148 317 317 111 36.0 3e-23 MNQRSSLSSRICGTAALVLLTCGAARLHAAAPATAASRTDISAAACTPAEERGRTTEEQV RTAETPARTTEEQVRTTEDPARTAEEPLRTAPSDDRSKRARHLAPRGRSQHTFFANAGYT AILSTIYLRESSSGSLNEGLEITAGYNWTSRRGLGVGVVYSGGFISARRGGFERTSRIHY IAPEFVARQRVGRRWLFRENAGIGYGRYIRSYAGLTGSRGGVGIHESVSAEFMLTRFLGL GVTVGGQWLIVDSPDVDDGAELNLAGIFRVQLGGGIRFYF >gi|313158953|gb|AENZ01000023.1| GENE 7 6728 - 7741 988 337 aa, chain - ## HITS:1 COG:no KEGG:BDI_3048 NR:ns ## KEGG: BDI_3048 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 16 337 15 317 317 232 40.0 2e-59 MRHFSASLQAFAAGALFLTTGCAPSVYTSIQKSYPARIEGSTVLVYDMADSLPEPAEVLG 
SVTVKDSGFSVNCNYAKVMQRAKDETNKIGGNGLHLFWHKEPSVLGSSCHRIAAHMLLLP DSVYTASYFDNVAAQNARLAAYGLIPADGTVAPGRETPKAANNNTLTFNAGYAFITSDFY LPAGVSGNPKQGLDINAAYQWTSRIGIGFGLRYSGYFTSVMSEGAKLKVRLHYFAPEFVL RQEAGRNRRWAFHESVGLGYARYTEQMGSLSGGIGGLGFHVEVGFEYKLAQNVGIGASVG TYGTRFSSMDDAVSQYNDDQKGGISRISLNGGLRFYF >gi|313158953|gb|AENZ01000023.1| GENE 8 7778 - 8785 1514 335 aa, chain - ## HITS:1 COG:ECs2570 KEGG:ns NR:ns ## COG: ECs2570 COG2255 # Protein_GI_number: 15831824 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, helicase subunit # Organism: Escherichia coli O157:H7 # 14 327 20 333 336 384 62.0 1e-106 MSIVRNTESDLEFENKIRPQELETFSGQDKIVDNLRIFIKAALMRGDSLDHVLLHGPPGL GKTTLANIISNEMNAQLRVTSGPVLDKPGDLAGLLTSLNPGDVLFIDEIHRLSPIVEEYL YSAMEDYKIDIVLDKGPSARSIQIELAPFTLIGATTRSGLLTSPLRARFGIQCHLEYYDA PVLAGIVKRSARILDVSIDDDAAHEVALRSRGTPRIANALLRRVRDFAMVKGEGHIDLEI TQIGLAALNIDSRGLDQMDNKILGTIIEKFGGGPVGLNTIATAVGEDAGTIEEVYEPFLI KEGFLKRTPRGREATELAYRHMGFTPPGGSDTLLF >gi|313158953|gb|AENZ01000023.1| GENE 9 8951 - 9643 1107 230 aa, chain - ## HITS:1 COG:no KEGG:BT_4719 NR:ns ## KEGG: BT_4719 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 224 1 234 240 89 29.0 1e-16 MKRFGFITLAVLTAVFFSSCNDSDGDYASSWAFASAETLSNDYYFVLDDDKTVYPSDKSR VAGYKPEDDQRVIIFFNLLKTGVEGYDYNIALYDVRNIYTGETRIVTTQEEVEELPDAQT SYFGSSMNANWLNVGIGFNASDLSKHKFLLVRNDFTQIDPDNKKEGYLNLELRHDAGTDT SGGYNYDDRYVSFKLDQFKEDLEGMSGVILRMNTRQNGVIYLQINMSKEK >gi|313158953|gb|AENZ01000023.1| GENE 10 9780 - 10334 702 184 aa, chain + ## HITS:1 COG:MT1259 KEGG:ns NR:ns ## COG: MT1259 COG1595 # Protein_GI_number: 15840665 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 22 182 86 245 257 75 29.0 4e-14 MEMEEQILAEGCREGEDTARRELYDRYAARLLAVCLRYSGDRATAEDLMHDAFLKIYGAF DRFSYRGPGSLRAWIERIAVNVALEWLRNRNKLNFRTLDEGRALPDVPEPDPSEVARIPR 
DVLMEFVGELPDGYRTVFNLYCIEGYSHRDIAQMLGINEKSSSSQLFRARTLLARRIAAY METH >gi|313158953|gb|AENZ01000023.1| GENE 11 10338 - 11636 642 432 aa, chain + ## HITS:1 COG:no KEGG:Bache_1804 NR:ns ## KEGG: Bache_1804 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 3 432 2 423 423 161 31.0 6e-38 MKQNKDWTDVMRSALRDAEVTPPADGWARLERELKTPAPRISVLRRYWPRIAAAAAAVLI FVGGGELLLHENRSLGDKGFVIASATDGGSSAADNRTYPAGGIGTLEESLRAVVPSQEAV RAESLPGREAQARQAVRAAYPATAVARNAAPAVVAVDVTDSRPQPVPENEAAATLRSEET IAQKQAAAETSGVQERTAPKENAASARTYPSTTSYYGDQIAFEQPKSRRRTSVSLFAAGS VTGGGTVSAGPGPGMMMSDAPVGGGAGTMAQLKYNYDDYSFKHHQPLSFGLSVRKEFAHG LSLESGVNYTLLWADVRMKSSRQDISQKLHFIGVPLRMNWQFLDTGRFSLYAGAGGMIEK CVSAKFGSRAMDEPGVQWSVLGAAGAQYDLGGLVGLYFEPEVSYYFTETDLRTSRTDSPL SLTLRLGVRLSF >gi|313158953|gb|AENZ01000023.1| GENE 12 11990 - 12619 760 209 aa, chain + ## HITS:1 COG:no KEGG:Coch_0868 NR:ns ## KEGG: Coch_0868 # Name: not_defined # Def: putative phage repressor # Organism: C.ochracea # Pathway: not_defined # 9 201 19 236 244 66 28.0 6e-10 MREKQNNWQRIEAVIKWANMSTNYFARYIGLARGENLYQIKRGNNGISLDVADRIVSKFP QVDKLWLLTGEGQMFSDEKLRGVQIPFYNVDVEQAVAHVAHLEAESSLMVPQAGQCDLAM CYMGRAMGPALPPGAVVLLKAVDPDAIIPGGEYVIVSRKIVTLRIVRLSDGDDKLRLVAG DKENYDDIILNVSDIKSVYKVKGKLIINS >gi|313158953|gb|AENZ01000023.1| GENE 13 12625 - 13221 827 198 aa, chain + ## HITS:1 COG:SP0177 KEGG:ns NR:ns ## COG: SP0177 COG0307 # Protein_GI_number: 15900114 # Func_class: H Coenzyme transport and metabolism # Function: Riboflavin synthase alpha chain # Organism: Streptococcus pneumoniae TIGR4 # 1 198 1 198 211 144 41.0 1e-34 MFSGIVERMGEVVEIRTDRQNKDFTLRADFCKELKIDQSIAHNGVCLTVVDIRDDTYTVT AMKETLDKSNLGLLKTGDLVNLERSMKPDALLDGHIVQGHVDQTAVCTAKEDADGSWYFT FEYQPQGGGLCTVEKGSVTVNGVSLTVCDSKESSFRVAIIPYTFEHTNFCRIEVGSVVNL EFDIVGKYIARLMQQYTK >gi|313158953|gb|AENZ01000023.1| GENE 14 13224 - 14273 1791 349 aa, chain + ## HITS:1 COG:BS_phoR_3 KEGG:ns NR:ns ## COG: BS_phoR_3 COG0642 # Protein_GI_number: 16079962 # Func_class: T Signal transduction mechanisms # 
Function: Signal transduction histidine kinase # Organism: Bacillus subtilis # 106 342 40 273 279 192 44.0 1e-48 MVYIKTKEGASLLVALLAAVIVAVTTTLLDVVWWMTLCAAAGVFVVMALVALFIIRKYVA YKLKPIYSIVLSRDVHTNEIFSELKDKHVENIGEELTAWADTNDKEIARLKETEQFRKQY LGNVAHELKTPIFNIQGYISTLLDGGLEDELINRKYLERAEKSIDRLINIVNDLDTISKL ESNMNRLKMERFDIVALTKEIAEQAEMEADKKGIKISVKGAENLPSPFWVLADKHYIGQV MVNLIINSIRYGKEGGQTRVHFRDMLDKILIEVEDNGSGIGKEDLPRVFERFYRTDKGRS REQGGTGLGLAIVKHIVEAHGERITVRSELGVGSTFSFTLKKVNLQDIK >gi|313158953|gb|AENZ01000023.1| GENE 15 14307 - 15176 1200 289 aa, chain + ## HITS:1 COG:RP046 KEGG:ns NR:ns ## COG: RP046 COG0682 # Protein_GI_number: 15603925 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Prolipoprotein diacylglyceryltransferase # Organism: Rickettsia prowazekii # 7 265 11 264 268 124 33.0 2e-28 MIQPLSIVWDFNPVFFSIGSLDIRYYGLMWALAILIGAKFFDNFCKREGLPSKVSESIFI YGTLATIIGARLGHCLFYDPMEYLSKPWTIITGFRDGGMASHGAAIGLLIGLWLFSRKNK LPYIWSLDRIMIAVGIGGAVVRLGNLFNSEIFGMATTLPWGFEFVRSAKWVNEFAPAAVH PTQIYEALCYLATFGLLCWLYYAKDMARRRPGILFGIGLIGIFLTRFFIEFIKTEQEAFE QGWLLDMGQWLSIPFILLGIYMIYRGATQPDVVPVVPKSPAATAKKKKK >gi|313158953|gb|AENZ01000023.1| GENE 16 15188 - 16252 954 354 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158971|gb|EFR58348.1| ## NR: gi|313158971|gb|EFR58348.1| hypothetical protein HMPREF9720_0318 [Alistipes sp. 
HGB5] # 1 354 6 359 359 557 100.0 1e-157 MVNEINRLVGDLLAGGGEVFLPGVGSLYTERRGARRISRRSVLPPSRAVSFTSQERGVSL VAEISGAVQCDAAEAQDVYDRWLARTLENGVLTIEGVGVLKLKHFTPDAAFDRRLNPQGR EPVRIKPARKFDWALWIGIAAILVALSFGGYEFLKLYDEGPEEIAQTTGIPGPDTNAGAA TADSPAAIPAASDSGATTGSAASASSADAAKAAGTTADTASGTNAEMRSEPAASGANAGT AASSVGGQGNAPAGQKDIAPAKRPETRSAGTADAPASLVSGRRYVVLGVFSTPENAARAV DAAAEKDPSVRCGIYRFGTKFMVSPFESADTEACALFVRNYSDRFPGLWTYTAR >gi|313158953|gb|AENZ01000023.1| GENE 17 16252 - 16737 683 161 aa, chain + ## HITS:1 COG:no KEGG:Palpr_1847 NR:ns ## KEGG: Palpr_1847 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 11 157 16 161 172 96 42.0 4e-19 MRISELIIRFIDWFYRKPVAAILPRQTFRYIVCGGANVVFSWVCYFLVYNFVLDKEQLDL GFVVVSAYVATMLLIFPFTFFTGFWLNRYVTFRHSPLPTGTQLFRYLLSIGGSVVVNYAG LKFFVEFCGLWATPSQMLATFVTMIYSYLAAKYFTFRHAEV >gi|313158953|gb|AENZ01000023.1| GENE 18 16846 - 18921 3109 691 aa, chain - ## HITS:1 COG:SP1238 KEGG:ns NR:ns ## COG: SP1238 COG0556 # Protein_GI_number: 15901100 # Func_class: L Replication, recombination and repair # Function: Helicase subunit of the DNA excision repair complex # Organism: Streptococcus pneumoniae TIGR4 # 3 673 10 661 662 667 51.0 0 MDFKLVSDYAPMGDQPEAIEQLVGSIRHGSKHNTLLGVTGSGKTFTVANVVAQLNRPTLV LSHNKTLAAQLYGEFRNFFPENAVEYFVSYYDYYQPEAYLPSTDTYIEKDLSINAEIEKL RLRTVATLLSGRRDVIVVSSVSCLYGAGNPADFHATAIDIKVGQIVSYKHFLYKLVEALY TRTERELEPATFRVNGDTVDIMAAFGEFGSQCFRVMFYDNEIEAIQTIDPVTGQRIHSLD NLTLYPTSLFVTTKERINGAVQQIYLDLGRQIEFFERAGRPMEAQRIKQRVEYDIEMIKE LGYCPGIENYSRYFDGRSEGTRPFCLIDYFPKDYLLVVDESHVTIPQVHAMFGGDRARKE NLVEYGFRLPAAKDNRPVTFAEFEQLQGTSIYVSATPADYELMKSEGVIVEQLIRPTGLV DPPLEVRVTMNQIDDLLEEIDKRVKNDDKVLVTTITKRMAEELSKYFDRVGVRNRYIHSD VDTLERIQILEDLRAGMFDVLVGVNLLREGLDLPEVALVAILDADKEGFLRNVRSLTQIA GRAARHSQGNVILYADTCTDSMRYAIEQSNRRREKQVRYNMEHGMLPRRAQRSGSGQSTL LTTRTDVPEIAAVYPIAEDHYMAAADVQKTYAASENLDALIVKAREDMERAAKSLDFLAA AKFRDRMYELQKLREETGNGSSRKTARHETL >gi|313158953|gb|AENZ01000023.1| GENE 19 19090 - 21063 3121 657 aa, chain + ## HITS:1 COG:BH1375 KEGG:ns 
NR:ns ## COG: BH1375 COG0358 # Protein_GI_number: 15613938 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Bacillus halodurans # 2 432 5 430 599 302 39.0 1e-81 MIDRETVDRIYAAANIVDIIGEYVTLKRKGVNYQACCPFHNEKTPSFVVSPSKGVYKCFG CGKGGNAVTFLMEHENITYPEALKMVAKRYGIEVKEKEMTDEEVRRNDDRESMFALNGWA ADYFADYLHHETEGMSVGMTYFRQKRGMADATIKKFGLGFCPAKGDRMSKDALAAGYKKE FLVATGLSLQRESDGSLYDRFRDRVIFPVHNISGRIVAFGGRTLRTDKTVAKYQNSPESE IYSKKRELYGLYFAKKAIQQLDFAIMVEGYTDVISMHQAGVENVVASSGTSLTTEQIRLL NRFTKNITVIYDGDSAGIHASLRGIDMILKEGMNVRVVLLPEPEDPDSFARTHTAAELQE YIRANEQDFLAFKARLLLQDAEGDPIKKAALIGDMVQSIAQIPDPIQRSVYIKECARIMD IDENILISEVARKRMSTTGDRETDEFVRRQTAQRRAEVREPEVEFVKQVQAGSSVEALER ELVKYLLKYGHCSFDFKEGRTMVACNVAEVIFMELDSDGLSFSNPLYDKILATYREQWKL LGTGAEVPAHFFLNHPDPEVCNVSVDILTSDDNYVPSELWRRKEIHIDTDAEMLAVGVPK AVTLYKSKVIEGYIKEWQAKLADESLTEEQQNEVIQRLAGFNKVKVTIARKLQRLIL >gi|313158953|gb|AENZ01000023.1| GENE 20 21241 - 21735 853 164 aa, chain + ## HITS:1 COG:TM0254 KEGG:ns NR:ns ## COG: TM0254 COG0691 # Protein_GI_number: 15644629 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: tmRNA-binding protein # Organism: Thermotoga maritima # 21 163 16 158 158 137 49.0 1e-32 MGNYCAVSEYMSEQKNINIRNKRATFDYEILEEYVAGVVLVGTEIKSIRAGRASLTDSFC YFDRGELWIRGVNIAEYAWGTCNNHTPRRDRKLLLNRRELDKLQRAGQDKGLTIVGLRMF LNERGLVKIVIGLARGRKTYDKREYLKENDAKREIDKAMKNYRR >gi|313158953|gb|AENZ01000023.1| GENE 21 21732 - 22508 815 258 aa, chain + ## HITS:1 COG:lin1028 KEGG:ns NR:ns ## COG: lin1028 COG0561 # Protein_GI_number: 16800097 # Func_class: R General function prediction only # Function: Predicted hydrolases of the HAD superfamily # Organism: Listeria innocua # 1 258 1 256 256 134 34.0 1e-31 MIRAIFLDVDGTLISFSTHEIPASARRALTQAHERGVRLFIATGRAANDLGPLEGIPYDG VVSLNGARCVANDGRVVSLHPIPRADFERALALSEEQDFAMGLELEEGVFVNRVTPDVKR VAHMVAHPVPEQTDLRELFGRVECCQLCFYFDAETQRRVMAELPTLVASRWCPIFADVNV AGIDKATGMEEFMRSYGFSAAEVAAFGDGGNDVAMLRAAGVGVAMGNACDEALNAADYVT ASVDDDGIAKALAHLGVI 
>gi|313158953|gb|AENZ01000023.1| GENE 22 22566 - 23273 917 235 aa, chain + ## HITS:1 COG:SA1856 KEGG:ns NR:ns ## COG: SA1856 COG1214 # Protein_GI_number: 15927626 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Inactive homolog of metal-dependent proteases, putative molecular chaperone # Organism: Staphylococcus aureus N315 # 5 223 13 214 229 96 31.0 3e-20 MSLILCIETGTDICSVGIAKDGELLSLRESDEGRDHARKVGVFVDELLRETGIAPDELDA VAVGKGPGSYTGLRIGVSFAKGLCYGLQKPLIAVGSLDALTEVAREDYEAGILSVSDWDR ALLCPMVDARRMEVYAQVFDAEGRPQNEVTAEVIDAGSFREFREQGRPFVIFGSGARKCA EVLAGAEFVEVAPSARGLARLAQQALDEGRTEDIAYFEPFYLKDFVVTTSKKSLF >gi|313158953|gb|AENZ01000023.1| GENE 23 23289 - 23855 487 188 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158983|gb|EFR58360.1| ## NR: gi|313158983|gb|EFR58360.1| hypothetical protein HMPREF9720_0325 [Alistipes sp. HGB5] # 1 188 1 188 188 349 100.0 4e-95 MTEAELVKSLSERFYSDFADTVARRVRDAGAVELLYRVATSPYANLPKPARHKAAFRSAY VLEKIYFAAPDAFMPFAGAFCDRDFPACTDRSAQRHFAKIMADLLGRYTPDSSSLERIAE TAAQWAADPGTKVAVRIWAVEVLKHCRERVGWVAEAWDDLVETMAHGATPGIECRMRKSW KPGRSDKA >gi|313158953|gb|AENZ01000023.1| GENE 24 23973 - 26588 4159 871 aa, chain - ## HITS:1 COG:SP0312 KEGG:ns NR:ns ## COG: SP0312 COG1501 # Protein_GI_number: 15900245 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-glucosidases, family 31 of glycosyl hydrolases # Organism: Streptococcus pneumoniae TIGR4 # 94 677 8 564 679 398 37.0 1e-110 MKRFLALLTALACTLSPAIADNPKADAKAVVTSGNARFTVLTPQLIRMEWSADGQFEDRA TLTFVNRETPVPEFKVRESRSKLTITTPALTLTYLKNGKFSDKNLKAVFTLNGKEVVWTP GMENPQNLLGTTRTLDGADGSKLKEPMEQGILSRAGWSLIDDSQRHVLTPDGSEWEEWIE ARPEGDRQDLYLFAYGHDYKQALADYALVAGRAPMPPKYTLGYWWSRYWQYSDNEFVDLV NKLKSMDVPIDVLIVDMDWHETWGLRKSNSPKDEYGQRIGWTGYTWQKELFPSPANFLKW TENEELKVALNLHPASGIQPYEAVYDDFTKEYGWSEKGKSVPFKIDERKWAEAYFKTVLE PMERDGVDFWWLDWQQWKESKYTPGLSNTFWLNHTFFNHAERQNPGLRPFIYHRWGGLGS HRYPLAFSGDTYATWPMLAYLPYFTATASNVNYGWWGHDIGGHMFHKTQKATDPELYTRW LQYGVFTPIFKTHSTKDPRIERCIWCFPDHMFLMRDAIRLRYTLAPYIYNAARENYDTGV 
GMCRPMYYDYPESDKAYETSEQFMFGNDILATTITQPVDSITGLAPRTIWFPEGAWFDCA TGSMYEGGRTEELHYTLAENPHYAKAGSIIPMNPATVKNLQQPCDTLVLTFIPGGDGQLR HYEDDGMSQQYKTNYAVTTVSKKQEGNTVRVRISPREGSFAGASDSRSYELRFPAVFPPK SVKVNGKELAYSRFPKAGEWTYDGYTLAPVIYTGTTACDAPVEIELAFDDYATAHQADLY GMSGVFKRCLDLTVEFKTEQGAHSEPYLMLPEEYLRVSQCPNFILEEPFRIAEFIGAYAK NKAALFEKTDSMTIIGDNFKQRLKAVIGSVK >gi|313158953|gb|AENZ01000023.1| GENE 25 26704 - 28185 1957 493 aa, chain - ## HITS:1 COG:VC0082 KEGG:ns NR:ns ## COG: VC0082 COG1322 # Protein_GI_number: 15640114 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 79 441 104 474 513 236 36.0 6e-62 MTPTSLFILTTCCLTAILAIVLLLWHRESRRLGASNADAEARERDAAARLSAAEARLAAA EQAVSAERELRQRYEAQAQALTEQRTRTEAELAALRASSETELASLRAKNAEERAAEKAE REKLEETFRAQFKNLATEILGEHSVRFRQTSQEQIDSLLKPFRDNITDFRKRVEEIYTTQ TSQRGELKAELKNLMELNRRITAETTNLTNALKGNSKVQGDWGEMLLETILDSSALIKGI HYQTQYNIKDEEGRNLRPDVVLHLPEKKDIVIDSKVSLTAFVGYTSAESEEERRRHLAAH VASVRQHVAELGRKEYQRRLNSPDFVIMFVPNEPAFLAALQNDPAIWADAYDRKVIVSSP TNLFALLKLVADLWKYNDQDKNTKEIAACGLKLYEQLVAFTSSLEGVGTALDKARDAYED AHKRLCTGNDNIVRVGERLRKTASLQTKKRHSARTLEIAGADEDPAALSATESAAETAAK SAAKAAATDTTAQ >gi|313158953|gb|AENZ01000023.1| GENE 26 28224 - 28418 152 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDAAARQASFKGRKKFFICFKFKMNPIKEQNYKKSANRTLFQANKFAFGAKIFRMRPERL VFRE >gi|313158953|gb|AENZ01000023.1| GENE 27 28366 - 29934 2182 522 aa, chain + ## HITS:1 COG:no KEGG:BF1024 NR:ns ## KEGG: BF1024 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 9 505 8 507 517 94 21.0 8e-18 MKNFLRPLKLACLAAASMLIVSCGGSADYRSILPADSFMTVSVNPASLMQKCEAGDLDQH PLYVRIKAELDKDQNLSAEEKEYLLALLKNPGESGIDVKKDFFFFAAMEGTVDAPVMRGG LLLPIGDKAKFDALLARINEKSGVTPETKGGVSVVDLGKEGDAGVLCAYNDIAFMVYFVQ NGVDDLAGDVRKLFAQKSGESLMGDKAVAEQLARKNDVNMVLSYGEILPMMNNPMLSSMP MMDALKGATMVGSVNFEKGRIVTEAAVSYKDKESEAKMMDFYAYVKPQTGALLRYVPAAS IGAMSLGLNGEKLYSMLSAMPGYGMLMANPMVKQVMEPFDGDFLLSFSGMGIDGKYPVAS 
MLAQVKDPAVLQTIVTNLAGMPVQQTAEGEYTLNMGGVTILFGVKGDVLYCTTDAVVKMA LDGGEIASMESMDKIFKGQSSTFYLDFEGLNAMIAQLLGGNVTPQAEAALSVLGMFDDME AYGTMKGGTMIVNMVDKEQNSFKTICGKTGELIRQYVPEANL >gi|313158953|gb|AENZ01000023.1| GENE 28 30002 - 30649 194 215 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157164682|ref|YP_001467345.1| 50S ribosomal protein L25 (general stress protein Ctc) [Campylobacter concisus 13826] # 24 209 27 209 223 79 29 5e-14 MESITLQQVLPDAFAGNPEPPKSDLWLHDVTFRKGEYYLVEAASGTGKSSLCSFLYGART DYAGRILFDAADCRTLSVAQWGELRRRSLSMLFQDLRLFGELTVAENLALKNELTHFLTP DRIERLLEAAGIAGKRDTPAGKLSFGQQQRVAFIRCLCQPFDFILLDEPVSHLDAANGEV LSALLLEGAQAQGAGVIVTSVGSRLRLPYHKIFTL >gi|313158953|gb|AENZ01000023.1| GENE 29 30646 - 31842 1589 398 aa, chain + ## HITS:1 COG:no KEGG:BVU_3608 NR:ns ## KEGG: BVU_3608 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 390 1 392 400 392 48.0 1e-107 MRLLWKLLRCHLSMAQTVGFTLAGLVGMAIVLTALQAYRDVLPVFDRPDSFMRGDYLVLS KQVGALQAIGLGSSDFTAEELADLRAQPFVREAGAFTPADYRIKGTVGMGGVELSTYLFF ESVPDRFLDVGAENWDYKAGDRDIPIIIPRNYLNLYNYGFARSQGLPQISEGIFRRVSLG IEIAGNGHREQFRGRIVGLSNRLNTILVPDAFIRWSNGRFGSGAAKQPARIIVETDRPVD AAVTDYLARKGYEAEGDRRDDGRAARFLRIAAGGVAGVGLVFSAMSFYILMLSIFLLLQK NSGKMENLLLLGYAPSRVARPYQLLTLGLNLGVLCVALLAVWLVRLCYLPSLAALQEDYR PAGLGLTVLCGVALALLLSLFNGAAIRRKIDGLNRVRR >gi|313158953|gb|AENZ01000023.1| GENE 30 32013 - 32639 923 208 aa, chain + ## HITS:1 COG:alr0782 KEGG:ns NR:ns ## COG: alr0782 COG0036 # Protein_GI_number: 17228277 # Func_class: G Carbohydrate transport and metabolism # Function: Pentose-5-phosphate-3-epimerase # Organism: Nostoc sp. 
PCC 7120 # 1 208 16 225 235 199 48.0 3e-51 MLSADFGHLERDTQMIDRSAAEWVHIDVMDGVFVPNISFGFPVLKAIRKATAKCLDVHLM IVEPERYVARFAEAGADVVTFHYEATYNPKGCIEMIRNAGAKVGVSIKPATSVEVLRDIL PQIDLVLIMSVEPGFGGQAFIPASLEKIAQLRALADELGTETIIEVDGGISSHNAHEVFG AGADVLVAGNAVFGAEDPQAEIVKMLNA >gi|313158953|gb|AENZ01000023.1| GENE 31 32629 - 33189 880 186 aa, chain - ## HITS:1 COG:PA1878 KEGG:ns NR:ns ## COG: PA1878 COG1896 # Protein_GI_number: 15597075 # Func_class: R General function prediction only # Function: Predicted hydrolases of HD superfamily # Organism: Pseudomonas aeruginosa # 1 180 3 182 192 165 52.0 4e-41 MEKLIRYLRFIREAERLKNVLRTAYTSEGRHESTAEHSWRLALLAAVLTGERPELDMQRV VLMCLIHDLGEAFDGDVPAIAQTAPGVKAASELAAMERLTRLLPPEAGATIREIWEEYEA CQTPEARWVKALDKAETIIQHNQGANPAGFDYAFNLTYGSEYFDDGALLSDLRLLLDEET GRHVRR >gi|313158953|gb|AENZ01000023.1| GENE 32 33344 - 35329 2922 661 aa, chain + ## HITS:1 COG:all4183 KEGG:ns NR:ns ## COG: all4183 COG0488 # Protein_GI_number: 17231675 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Nostoc sp. 
PCC 7120 # 1 533 1 531 564 430 43.0 1e-120 MISLDNLTVSYGGWTLFDNISFLINPKDRIGLVGRNGAGKTTLLRIITGEQQPTSGHVTL NGECTIGYLPQTMRVADTTTLAEETAKAFDEVLRLEAEIASLTREIAERTDYESAGYEQL LHRLNDAQDHYHILGGDTREADIEKTLLGLGFKRTDFGRATSEFSGGWRMRIELAKLLLR RPSIFLLDEPTNHLDIESIQWLEEYLKNYNGAVLLISHDRAFLDNVTNRTVELSLGKVTD YKVSYSKYVVLRAERRAQQMAAYENQQRMIEKTEEFIEKFRYKPTKSNQVQSRIKQLERL ERLEIEEEDLSTLNIKFPPAPRSGQIVAEINEAGMSFGTKHVFSGANFIIEKGDKIALVG RNGEGKTTLARMLIGQLTPTEGSVRLGANVNIGYYAQNQDDLMDGEFTVYDTLDRVAVGD IRTRLRDILGAFLFRGEDIDKKVKVLSGGERARLAMARMMLEPRNLLILDEPTNHMDMRS KDILKSAIMKYDGTVVVVSHDREFLDGMVQKVYEFRDGGVKEYLGGIYYFLEKRKLESLQ EIERRDAPAKPAANPAANPAAKSAAQPAANRDAAASGKLTYEQRKEQEKQLRKLRRAVET VEAELAEIEKQIAAYDAKFAAATEYNEADYKAYNDLKARYDHQMHEWEKASYELEIVEEQ G >gi|313158953|gb|AENZ01000023.1| GENE 33 35392 - 36123 390 243 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764761|ref|ZP_02171815.1| ribosomal protein S11 [Bacillus selenitireducens MLS10] # 5 240 9 246 255 154 34 9e-37 MDNIIFGIRPVAEAIEAGKQIEKLYIRKGAEGQLMSELRDLCFRHKVRVQEVPVEKLNRL TRGNHQGVVAQISAIEYVGLADILERVPEDETPLVVIFDGVTDVRNFGAIARSAECAGAH GLITPLKNSAPVNAEAIRSSAGALTAIPVCRVGSIRNTVKLLQTEGFQIVAATEKSRKLL YDADFRKPTALVMGAEDTGISKEVLKLCDEQLAIPLIGHIESLNVSAAAAVMLFEAVRQR IAE >gi|313158953|gb|AENZ01000023.1| GENE 34 36154 - 37872 2047 572 aa, chain + ## HITS:1 COG:no KEGG:PM1469 NR:ns ## KEGG: PM1469 # Name: not_defined # Def: hypothetical protein # Organism: P.multocida # Pathway: not_defined # 121 375 111 364 579 111 29.0 7e-23 MKAKLERYLRWLREGFLQMLRLHPVESALTVYACAGCLLTYELDWDHSLPKLALAALFFA VALVVNNLAGRGPWRRVYWVSWTPIVPLSLWGGLEAWIVSEPAFLTFGILVPLALLMCRR AVHNERFVCDALLWLRSGVLAVFFANVALGLFGAILFSTTYIFGLEGAWIDHVWTYALIV FETFAVPVLFLMMADRWREAGYESNRILEILLNYIVAPALLIYTAILYLYMAKILVTWSL PEGGVAYLVFGFTLFALAVKALQFLMRKRLYDWFFDRFSLISLPTQLLFWVGAIRRTSEY GLTSPRIYLLVCGGLMTLCVVLFLSRRAGRYMYVCLTAFAVFAAFAYVPSLEPERLAVRS QVGRAVRIAESLGRLGNDGRLILTPVPPADSVYREAYHGLYESLKFVERRDTAVFARFGT ELDEFAAIFPERMRSYVRWGCDYCNSYCVNNIIELEAPVNARFEVNAEYPHYYMNLRGWS DSSYHFGGDTLRLFLGEERAVHRIPCRELLERQLKESGFEPSEGCEPTSDQLLRLLDYRD 
DRCRILFENIKLERTDSAVVIQGLSINAVLMR >gi|313158953|gb|AENZ01000023.1| GENE 35 37869 - 38441 453 190 aa, chain + ## HITS:1 COG:NMA0032 KEGG:ns NR:ns ## COG: NMA0032 COG1451 # Protein_GI_number: 15793065 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Neisseria meningitidis Z2491 # 54 179 89 215 230 73 36.0 2e-13 MISTVTHPRLGEVTLSRTRRARRISISVRGTGAVRVSFPYGVSVRRAMEFLDLKADWVEQ ARVRMAARMPVEPPLPPAEAKARIEELRRAAKADLPARIERLSQLTGLKYAKLTIRASRT KWGSCSGRNTISLSLFLMTLPEHLRDYVIVHELCHTVHHNHSPRFHALVDRMVGGNEKAL NRELRAFAIR >gi|313158953|gb|AENZ01000023.1| GENE 36 38454 - 39803 2066 449 aa, chain - ## HITS:1 COG:BS_glpT KEGG:ns NR:ns ## COG: BS_glpT COG2271 # Protein_GI_number: 16077283 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Bacillus subtilis # 1 443 14 439 444 438 53.0 1e-123 MEADKVDAKYKRMRLQVFLGIFIGYAGFYIVRKNFSMAIPELALLGFDQSELSIVLAMNA VSYALSKFLMGSVSDRSNARVFLPLGLVLAALSMMFMIVPVTLLGPEHKRLAIFIMAVLN FLVGWFNGMGWPPCGRVMTHWFSIKERGTKMSIWNCAHNVGGALVGPMAVYGAIWFGSWF CGSHTELYFLIGTYLFPAVVAIFIAILAYVLIRDTPQSCSLPSVEKWRNDYPKNYSAKQE EVLTTREIFFKYVLNNKLLWFIAIANAFVYMVRYGCLDWAPTYLRDAQGYDIKQAGWAYF AYEFAAIPGTLICGWLSDKVFHGRRAVPTILFMAIVAVFIFLYWQFSSNYFIVTMSLIAI GFFIYGPVMLIGVQALDLAPKNAAGTAAGLTGFFGYFLGTAILANIVIGSVAEAAGWDWT FILLIAACFLSIVFMAFTYKGEKEILGQK >gi|313158953|gb|AENZ01000023.1| GENE 37 40053 - 41633 2063 526 aa, chain - ## HITS:1 COG:DR1019 KEGG:ns NR:ns ## COG: DR1019 COG0578 # Protein_GI_number: 15806042 # Func_class: C Energy production and conversion # Function: Glycerol-3-phosphate dehydrogenase # Organism: Deinococcus radiodurans # 17 520 23 522 522 470 50.0 1e-132 MERLAQIKKLSEKDKVWDIVVVGGGATGLGVAVDAATRGMSVALLEKTDFAKCTSSRSTK LVHGGVRYLQKGDVMLVLEALRERGRMKANAPHLVKDQAFVISNYRYWDNFLYFCGLTFY DMLSFGFGYGRSKYISAKKVMKYIPTSVEKGLKGGVVYHDGQFDDSRMAVNLAQTCVENG GTVVNQATVTGIVHDAAGRAAGVKFVDNLTGEEHTLKARSVVNAAGCFVDDIMHMDSPTH RRMVTPSQGVHIVLDMKFLQSDYAIMVPKTSDGRVLFAVPWHDKVVVGTTDIVRPKAEED 
PRPLKEEIDFILGTAGLYMNPAPTYKDILSVFAGQRPLAAPKKEGKNTKEISRSHKIIVS DNGLVTITGGKWTSYRLMAEDTVDKAIEVAKLPARKCVTKKLHIHGYRKNPDLTDHMYVY GSDEPKILELIKQQPELGEKLSPKYGYTYAEVLWAVREEMAMTVEDVLSRRVRLLFVDAR EAMAAAPKVAQLMAKELGRDQAWIDGQVKDFTELAKTYIFAGEAEA >gi|313158953|gb|AENZ01000023.1| GENE 38 41695 - 41814 58 39 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFYFIIFRFDMLPLRKNGYFFCPFRRISNYFSYLALRND >gi|313158953|gb|AENZ01000023.1| GENE 39 41826 - 42620 926 264 aa, chain + ## HITS:1 COG:YPO0831 KEGG:ns NR:ns ## COG: YPO0831 COG1349 # Protein_GI_number: 16121139 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Yersinia pestis # 14 263 6 255 258 171 40.0 1e-42 MNKSGEETILSLPERHNRILALLQQNGSISVAQLAELFKVSEVTIRKDLSYLEQQKKLYR THGSAILISPYISDRHVNEKEKKNVAEKRAIGAKAASLIAQDDSIIIASGTTMAFLAREI KPVGHLTVITAAVPVTSILSQDANIDVIQLGGITRSSSVSVVGPFAEQMLRHFNCSKLFV GVDGIDLEFGLTTTNMLEATLNGAMMNAAQKVIVLADSSKFGRRGFSKICDLEAVDRIIT DSGIQPLYLERLRERGIEVIVVDA >gi|313158953|gb|AENZ01000023.1| GENE 40 42728 - 43483 816 251 aa, chain - ## HITS:1 COG:ECs3712 KEGG:ns NR:ns ## COG: ECs3712 COG2197 # Protein_GI_number: 15832966 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Escherichia coli O157:H7 # 157 247 114 205 210 61 39.0 2e-09 MNIVSTLNEKLLRQNFDDEHSEALLWEECRQTALAYVRTENALAVLSDLQSNTSRIYNGR TAVRLGLAEHTTEETIASIWEEKIFDRIHPDDLLEKHTQELRFFGFLQEIPIAERVNYYV SGQIRMRDATDIYVPIRHRMFYIGSTPNGSLRMALCLYDSVCTVQPDYTFVNSATGEIIS PETRRSDNLLSDREKEVLQLISAGKMSKEIAILLSISIHTVNRHRQNILSKLKADTSIEA CRIAEYMHLIS >gi|313158953|gb|AENZ01000023.1| GENE 41 43575 - 44225 596 216 aa, chain + ## HITS:1 COG:all7165 KEGG:ns NR:ns ## COG: all7165 COG3506 # Protein_GI_number: 17233181 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 25 199 1 175 183 141 38.0 1e-33 MRKILLACLALAVAEVTCAQSLEKIQWFNEPERWTAEDGILTMQVPPHCDYWRVSHYGFT VDDAPFCYATYGGEFEAKVKVTGDYKTRFDQAGLMLRIDAGNYIKAGVEFVDGKFNLSTV VTHGTSDWSIIPQDGAVPFVWIKAVRRLDAVEIFYSFDDKEYVMMRNAWLQDNTPVMVGL MAASPDGEGFEAKFENFSVKHLPDMRRMEWLKRNAE >gi|313158953|gb|AENZ01000023.1| GENE 42 44310 - 45074 1309 254 aa, chain - ## HITS:1 COG:SA0220_2 KEGG:ns NR:ns ## COG: SA0220_2 COG0584 # Protein_GI_number: 15925931 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Staphylococcus aureus N315 # 28 246 1 217 242 73 26.0 3e-13 MTKKLILAAAMLLTGSIAVAAQPKVISHRGYWTAPNSAQNSLASFTKADSVGVFGSEIDI WLTADDKLIVNHDRVYKGTDINMEKSTLKEITSIVLPNGENIPTLDAYLRLVATKPNTRL ILEMKSLSDLKREDLAAEKIVKALRKYNLLDRTDIIAFSINACLAFKKLMPDGRIFYLNG DLAPRSIKKLGLTGIDYSMSVLRKNPKWVEQAHKEGLEVNVWTVDTEEDMRYFIDLGVDY ITTDYPERLQALLK >gi|313158953|gb|AENZ01000023.1| GENE 43 45340 - 46470 1807 376 aa, chain + ## HITS:1 COG:CAC2959 KEGG:ns NR:ns ## COG: CAC2959 COG0153 # Protein_GI_number: 15896212 # Func_class: G Carbohydrate transport and metabolism # Function: Galactokinase # Organism: Clostridium acetobutylicum # 1 375 7 388 389 267 40.0 3e-71 MNEKVSKKFGELYGSGAILFASPGRINLIGEHTDYNGGFVFPGAVDKGIVAAIKLNGTDK VRAYALDLGESAEFGLNEADKPAQSWACYIFGVCREIQKRGGKIGGFDTVFAGDVPLGAG MSSSAALESTYAFALNDLYSCGIDKFELAKIGQSTEHNYCGVNCGIMDQFASVFGKKGNL IRLDCRSLEYAYFPFDPKGYKLVLLDSRVKHELVGSPYNDRRASCERVAKMLGLEFLRGA TMEQLDAIKDKISEEDYKRARYVIGEEKRVLDVCEALEKGDYETVGKRMYETHWGMSKDY EVSCEELDFLAEVAEACGVTGSRIMGGGFGGCTINLVKDGLYDNFIATAKEKFNAKYGHE PKVYEVVISDGSRRLE >gi|313158953|gb|AENZ01000023.1| GENE 44 46538 - 47983 1605 481 aa, chain - ## HITS:1 COG:no KEGG:AMED_4170 NR:ns ## KEGG: AMED_4170 # Name: not_defined # Def: hypothetical protein # Organism: A.mediterranei # Pathway: not_defined # 45 469 52 498 502 220 35.0 1e-55 MTNTRIRISALLLAISGLLTPHPAAFGAGKAPKIHELTFSGKATPGQYYVPVYTSFTVPE GIVKISVTQHLGSGEARPGNLDLGIFDERGAGFEGPGFRGWSGGARRSFEIGETEATPGY 
LAGRINRGRWTVIQMPTTAGRTTDWTLKITLTEGPRAKKLPAPSYAAPQLNDKPGWYRIA PHVHTVHSDGRLTPAEVIALGKASGLDGIVSTDHNTTSALLHWGEVQDPEFLVINGMEVT YGQGHWNIFGLDPHAWIDFRIHHNDTLRYRKAVAKARKAGRLLVANHPYNLDFLYDVTPM DGIEVWNSAWSPANERAVELWHTLLVGGNRKFAVASTDFHRGGNIASPHTVVRAEALSAE AVIDAIAAGRSYMARDTSVEIGMQVRNSATPVQHADIGDTLLRSGKMQAEFRSNTAGRLR LTDQQGWFSDTSIAPGTVVVSIPERSLWVRAELRAEDGSMIALTNPIYIQHPLTTQSLLP E >gi|313158953|gb|AENZ01000023.1| GENE 45 48022 - 48543 339 173 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229254479|ref|ZP_04378409.1| acetyltransferase, ribosomal protein N-acetylase [Capnocytophaga ochracea DSM 7271] # 9 171 3 164 166 135 41 7e-31 MNLEGRTTRLRALEPGDIDLMYAWENDTAVWSVSGTLAPFSRHTLERFIQEQQFDIFQTR QQRLVIETLGGIPVGALDFFELDPINRRAGIGILIHDDALRGKGYASDAVETACRYAREV LNLHQLWCNVGSDNAPSRRLFAKAGFAEVGIKRDWLWRPGGYGDEILLQKILE >gi|313158953|gb|AENZ01000023.1| GENE 46 48543 - 49772 326 409 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 97 402 7 317 319 130 26 2e-29 MTSLTEFFNKHEDDTLKGISHKNSIIKRNIIAHMAVNGECTLSELTKELHISVPTITKLV QELVDENIVTDLGKVETPGGRRPNIFGLANSAIYFAGVNVGRDNMTFLITDLQNNIIKEE YDYTFELLDRPQCIERICTNIENFIATCGIDRGKILGLGVCMTGRVNPDTGRSYKYFTTS EQSLRDLLEERVGIRVLLENDTRARCYAEYTCGKSKDESNVLYLHMGRGVAIGIVVDGQL YYGKSGFAGEFGHIPFFDNEIICSCGKKGCLETEVSGIAIEDKMSHLIERGVNTILKEKY DKQKSIHIDDIIAAAKNDDNLSIELIEEAGEKVGKAVAFLINTFNPETVIVGGNMAAAGD YIMLPLKSATNKYSLNLVYKDTKFRVSKMTENANAWGVAMLIRNKIIGI >gi|313158953|gb|AENZ01000023.1| GENE 47 49938 - 50819 1247 293 aa, chain - ## HITS:1 COG:CAC3575 KEGG:ns NR:ns ## COG: CAC3575 COG0331 # Protein_GI_number: 15896809 # Func_class: I Lipid transport and metabolism # Function: (acyl-carrier-protein) S-malonyltransferase # Organism: Clostridium acetobutylicum # 2 287 3 294 308 270 49.0 2e-72 MKHAYVFPGQGAQAVGMGKDLYDNVPEAKELFEKANEILGFRITDIMFAGTDEELKQTKV TQPAVFLHSVIMAKALGVKPDAAAGHSLGEFSALVVAGALSFEDGLKLVSKRALAMQAAC EAQPGTMAAILGLDDKTVEDICASIDGVVVAANYNCPGQLVISGAVEAVEAACEKAKAAG 
ARRALRLPVGGAFHSPLMEPAKQELEKAINEAPFQAPVCPVYQNVDAKPYTDPAQIKANL IAQLTAPVRWTYIVKNMLADGVTEFTELGPGTVLQGLIKKVNPEATVESKSTL >gi|313158953|gb|AENZ01000023.1| GENE 48 51259 - 51969 212 236 aa, chain + ## HITS:1 COG:no KEGG:PRU_2509 NR:ns ## KEGG: PRU_2509 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 11 234 17 245 247 68 27.0 3e-10 MKKLLLIAFAALAFAACSDDDKTPTPPHVETITFSDAELGADGYLWGKKLATDEDGSLVF EGVIYKEGKASFLSYFSDFGGVWDTWCKFAISGCHDKVTEGTDNQFSVYTTVDDGQNKFA VAYDMKGMGPGYTFNPAVEFSTPVSPVSVRVANNTWTYLHLKGDYSDYSVSIIGFDGETQ TGKVDVPLAANNKIVADWNTVGLDKLGVVTKIIFSVECSDKMAPTYFCIDDLVYSE >gi|313158953|gb|AENZ01000023.1| GENE 49 52110 - 52838 611 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158992|gb|EFR58369.1| ## NR: gi|313158992|gb|EFR58369.1| hypothetical protein HMPREF9720_0351 [Alistipes sp. HGB5] # 1 242 1 242 242 465 100.0 1e-129 MLDFKGNAAALRTVTIVGEWAGGSFDDVLCGREYMNEEDANGNFFDGLLFTTADRKIGFG SYFCDMSKNQFGAYDTWGGFALSKNYSQTPTADGSPDYKGSHFSAWTKSGANNTATFALA YFNDYGAYDYNTPKIEFSERREVAHLYMANATVTGQSQSSLSDYWFKVSVTGYSGGVKGK TIEQVLISGKSIVSDWVKVDCSSLGAVDELRFGVMSNDVSGGFLNCPSYFCIDEIALVKQ TK >gi|313158953|gb|AENZ01000023.1| GENE 50 53035 - 53826 1222 263 aa, chain - ## HITS:1 COG:STM0684 KEGG:ns NR:ns ## COG: STM0684 COG0363 # Protein_GI_number: 16764054 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Salmonella typhimurium LT2 # 1 262 1 263 266 364 65.0 1e-101 MRLIIEDTQENAGRWAARYIVEQINKKQAAGGTFVLGLPTGSTPLTTYKELIALNKAGKV SFKDVVTFNMDEYVGLPEEHPESYHSFMWNTFFNHVDINPANVNILNGNAPDLQKECDEY EEKIRKAGGIDLFMGGVGEDGHLAFNEPFSSLNSRTRVKTLTYDTLVVNSRFFDNDVNKV PKQAMSVGVATVLDSKQVLILALGHKKARALQQCVEGPYSHVCTISAMQVHPHGIVVCDE PATVELKVGTYRYFKDIEKENLI >gi|313158953|gb|AENZ01000023.1| GENE 51 53883 - 55832 3032 649 aa, chain - ## HITS:1 COG:BS_ybfT KEGG:ns NR:ns ## COG: BS_ybfT COG0363 # Protein_GI_number: 16077305 # Func_class: G Carbohydrate transport and metabolism # Function: 
6-phosphogluconolactonase/Glucosamine-6-phosphate isomerase/deaminase # Organism: Bacillus subtilis # 63 281 22 243 249 182 45.0 2e-45 MNNKYSLPKDGGLIAESTPRDIIHRYEKIHTTVYENEYLGVQYVADTIVKAIRTYNELHC SNEVYEEQQPFVLGLTTGRTPLGLYRELVKRHKEGLVSFRNVAVFSLDEFYPIRSTEQQS RNFRIHEDFLNHIDILPENIHIPDGTISEDKVSEYCASYDHSVRKIDLMIIGVGEDGQIG FNEPGSYAKSRTRLVQLTHNTRKIQSGAFFGLDYTPKLAITMGIDTIMRADKIILMAWGE EKAQIIQKVVEGEITTQVPASNLQAHPNIEVVIDENAAQLLTREQTPWLVGPCKWTPKFT RKAVVWLCGVVQKPILKLTYKDYIENSLGELLEQGRAYDQINIDVFNDLQHTITGWPGGK PNADDSTRPVPSNPYPKRIVIFSPHPDDDVISMGGTFIRLVQQGHDVHVAYETSGNVAVH DDVVLQNIDTARELGYGNHYAEVEKIIAGKKKGEPEPRPLLDLKGAIRRAEARAAVRSFG LNPDTNAHFLNLPFYETGGIKKGLLTEKDIEIIVKLLREVKPHQIYAAGDLADPHGTHRT AMEAVLGALDVVKDDEWLKECHLWLYRGAWMEWDLGMVDMAVPLSPDELIMKRHAIYRHL SQKDIMPFPGSDPREFWQRAEERTQNTAKLYDKLGMAEYQAIEVFVKMF >gi|313158953|gb|AENZ01000023.1| GENE 52 55829 - 56479 700 216 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1650 NR:ns ## KEGG: Odosp_1650 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 34 213 33 211 211 210 55.0 3e-53 MTEIRKITGIVPTNKAAVEAAFAGIEPQAIACCNWPEQFPYAPEVSFRMFHTGDYLMLRF DVAERCTMARVTEDNGEVWTDSCVEFFITPDDSGYYYNFECSCIGRLLLGFRKEREHPAH AAPKVMAAILRNPSLGLRPFPEHEGDNRWSVVLAIPPQALFMHELTDWSGLKASVNLYKC GDKLSQPHFLSWKPIETPKPDFHRPDFFEQIKFSEI >gi|313158953|gb|AENZ01000023.1| GENE 53 56476 - 57579 1692 367 aa, chain - ## HITS:1 COG:no KEGG:BF0928 NR:ns ## KEGG: BF0928 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 359 2 358 361 320 47.0 5e-86 MEKKLQEIASHFALEGGIAVIDSLGEGFINDTFIIRTEGEAPDYILQRKNKNIFPDVPAM MDNIRRVTDHIRRGVIAAGGDPRREVMTVVPTREGALYHIDGDGEYWAVSVFIDDTVAYN KADSPELARKGGEGIGKFQAQLADFTEPLAETIKGFHNIRHRFVQWDEALKRDAAGRKKE LSEEIGWIESRRGDMLGFWSKVEEGTIPTRVTHNDTKINNILFDRQGEVLCAIDLDTVMA STSLNDFGDAIRSYANTGDEDDRDLSRVGLSLEMFRAYTEGYLSQRAKQLTDSEIDHLAF STRYITFEQVLRFLMDYIDGDTYYKVKYPGHNLVRTHAQYRLLQSMEEHYGEMCRIVRET VDKYRKA >gi|313158953|gb|AENZ01000023.1| GENE 54 57816 - 58781 1427 321 aa, chain - ## HITS:1 
COG:BS_yjdE KEGG:ns NR:ns ## COG: BS_yjdE COG1482 # Protein_GI_number: 16078267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus subtilis # 4 299 8 296 315 179 35.0 8e-45 MYKFQPILKSTIWGGEKIVPYKQIASGQTQVGESWELSGVKGNESVVAGGPEAGTTLPGL IARHGAALLGKANFERFGEEFPLLIKFIDARQDLSIQVHPDDRLAWERHKSKGKTEMWYV VDADKGARLRSGFAKQVTPAQYEASVEDNTITDILAEYEIHPGDLFFLPAGRVHSIGAGA FIAEIQQTSDITYRIYDFNRKDADGNTRELHTELAKGAIDYTVLPDYRTKYEQVQDRETE LVSCPYFTTSLCDLTAPLTLDYAALDSFVVVICVEGKGMIADDSGNEMPIHQGETVLLPA TVKSLRVVPEGKLKMLTSCIK >gi|313158953|gb|AENZ01000023.1| GENE 55 58789 - 59898 1822 369 aa, chain - ## HITS:1 COG:slr0329 KEGG:ns NR:ns ## COG: slr0329 COG1940 # Protein_GI_number: 16331233 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Synechocystis # 6 286 29 288 327 68 25.0 3e-11 MYRYDNRVVITLDAGGTNLVFGAMQANKFIVDPITLPSNADNLDKCLATMVEGFQAIIDK LEEKPVAISFAFPGPADYPNGIIGGYLPNFPSFREGVALGPFLEYKFGIPVFINNDGDLF SYGEALGGALPEVNARLEALGSPKRYKNLVGYTFGTGLGIGIVVNNELNRGDNSCVETFC LKHKKMPDIIVEDGAAIRAVKRVYGELTGNPNHGLEPKDLCDIADGKREGDTEAARKAFA EMGEIAGDAMATAVTLIDGLIVIGGGITGARKWIMPSLLKELRSKMHTIAGDELNRVQMK VYDLDSEEEFKEFAKGDQRTLKVYGTDRYVAYDPQKRIGVAISKLGASNAISVGAYAFAL SQLDAQKAQ >gi|313158953|gb|AENZ01000023.1| GENE 56 59909 - 61042 1875 377 aa, chain - ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 7 368 39 418 426 95 25.0 2e-19 MPILFSFFVMGFCDVVGISTTYVKNDFNLSEALAGFIPSMVFLWFLLLSVPVALAMNRVG RKRTVQISNVITIVGMLIPFVSYNFATCMVAFALLGIGNTILQVSLNPLLTNVVSGESLT SSLTAGQVVKAVSSFMGPIIAVFAVNVFGNWQYLFPIFAAITLLSSLWLMMTSIPKEEVS LQSGSSVGATFSLLKDSHILLFFIGILCTVGLDVGMNTLTPKLLIERCGLEITDAGLGSS VYFFCRTAGAFIGAFLLARLSDVRYLRVNLIVMLAALGVLYFANSYIEILICVGVFAFAL SCVFSIVYSLALRRRPDKANEISGLMITGVCGGAIIPPLMGLLTETVGSQVGSLIILTVC AVYLTFCAYSIKSKTNK 
>gi|313158953|gb|AENZ01000023.1| GENE 57 61253 - 61414 245 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158955|gb|EFR58332.1| ## NR: gi|313158955|gb|EFR58332.1| hypothetical protein HMPREF9720_0359 [Alistipes sp. HGB5] # 1 53 1 53 53 100 100.0 2e-20 MKKLLILALFAAVTFCAAAQELTVATYNIRNANRGDAERGNGWERRGPWVCRP Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:46:56 2011 Seq name: gi|313158951|gb|AENZ01000024.1| Alistipes sp. HGB5 contig00086, whole genome shotgun sequence Length of sequence - 1310 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 144 - 1308 1155 ## COG0582 Integrase Predicted protein(s) >gi|313158951|gb|AENZ01000024.1| GENE 1 144 - 1308 1155 388 aa, chain + ## HITS:1 COG:SSO0375 KEGG:ns NR:ns ## COG: SSO0375 COG0582 # Protein_GI_number: 15897309 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sulfolobus solfataricus # 255 388 147 272 291 63 32.0 8e-10 MQRSTFKVLFYVKRQSEKHGQVPVMGRITINGTMSQFSSKLSVRSSLWDAKANKASGRSL EAQRLNEKLENIKTNIGKQYQRLCDRDSYVTAEKVRNAFLGMGDDCRLLLQTFDEYLAGF LKRVGKDRAYSSYDNYRKRRNRLASFLEYEYHVKDIAFKELKREFVEKFVVYLSTVKGLA SGTIHSTLKKLKLMTYTAYKNGWIAADPFAGFYVKAEYAERRYLSASELQAVMDVRLPNY RTGINRDAFVFCAFTGLSHADVVKLTHADIHTDDNGERWIIDKRQKTGTQFRVKLLPAAE MLYKRYKDTYRTSEKVFPLKGTYKTLNMSLRHVARHAGLSFNPTIHMARHTFATTVTLTQ GVPLETVCKMLGHKRITTTQIYAKITND Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:47:17 2011 Seq name: gi|313158870|gb|AENZ01000025.1| Alistipes sp. HGB5 contig00050, whole genome shotgun sequence Length of sequence - 83488 bp Number of predicted genes - 96, with homology - 78 Number of transcription units - 39, operones - 20 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
+ CDS 2 - 982 1313 ## COG1640 4-alpha-glucanotransferase + Term 1024 - 1066 10.0 - Term 1202 - 1230 0.3 2 2 Op 1 9/0.000 - CDS 1357 - 2109 779 ## COG3279 Response regulator of the LytR/AlgR family 3 2 Op 2 . - CDS 2111 - 3013 1262 ## COG3275 Putative regulator of cell autolysis 4 2 Op 3 . - CDS 2925 - 3500 286 ## gi|313158896|gb|EFR58275.1| putative lipoprotein - Prom 3521 - 3580 4.9 + Prom 3637 - 3696 7.1 5 3 Op 1 . + CDS 3723 - 5030 2017 ## COG0477 Permeases of the major facilitator superfamily 6 3 Op 2 . + CDS 5074 - 8115 4829 ## BVU_1380 hypothetical protein 7 3 Op 3 . + CDS 8129 - 9787 2455 ## Halhy_0390 RagB/SusD domain-containing protein 8 3 Op 4 . + CDS 9811 - 11625 2313 ## Halhy_0391 hypothetical protein + Term 11640 - 11680 7.1 9 4 Op 1 7/0.000 + CDS 11733 - 14051 3204 ## COG0366 Glycosidases 10 4 Op 2 7/0.000 + CDS 14062 - 15399 1868 ## COG0366 Glycosidases + Prom 15514 - 15573 1.5 11 4 Op 3 . + CDS 15598 - 17475 2300 ## COG0366 Glycosidases 12 4 Op 4 . + CDS 17488 - 19683 3132 ## BF1158 putative alpha-glucosidase + Term 19746 - 19785 7.4 13 5 Op 1 . - CDS 19934 - 20692 776 ## Bacsa_2167 hypothetical protein 14 5 Op 2 . - CDS 20717 - 21787 764 ## HMPREF0659_A6126 relaxase/mobilization nuclease domain protein 15 5 Op 3 . - CDS 21777 - 22172 284 ## HMPREF0659_A6127 bacterial mobilization protein 16 6 Tu 1 . - CDS 22414 - 23271 772 ## PGN_0923 putative DNA primase - Prom 23291 - 23350 2.6 - Term 23405 - 23448 5.4 17 7 Op 1 . - CDS 23455 - 24675 1119 ## HMPREF9137_0377 hypothetical protein 18 7 Op 2 . - CDS 24665 - 24970 160 ## gi|291513569|emb|CBK62779.1| hypothetical protein AL1_00430 - Prom 25098 - 25157 2.4 - Term 25118 - 25168 12.2 19 8 Op 1 . - CDS 25185 - 25928 63 ## gi|313158876|gb|EFR58255.1| hypothetical protein HMPREF9720_2090 20 8 Op 2 . - CDS 25925 - 26944 449 ## PRU_1108 hypothetical protein 21 8 Op 3 . - CDS 26934 - 27542 331 ## Halhy_3922 hypothetical protein - Term 27546 - 27592 8.4 22 8 Op 4 . 
- CDS 27600 - 29003 685 ## Bacsa_2219 integrase family protein 23 9 Tu 1 . + CDS 29963 - 30286 99 ## 24 10 Op 1 . - CDS 30675 - 31115 150 ## gi|313158934|gb|EFR58313.1| hypothetical protein HMPREF9720_2095 25 10 Op 2 . - CDS 31105 - 31320 89 ## gi|313158938|gb|EFR58317.1| hypothetical protein HMPREF9720_2096 - Prom 31364 - 31423 5.5 + Prom 31383 - 31442 4.3 26 11 Tu 1 . + CDS 31485 - 31817 319 ## 27 12 Op 1 . + CDS 32034 - 33185 296 ## HMPREF9137_1422 initiator RepB protein 28 12 Op 2 . + CDS 33207 - 33962 318 ## COG1192 ATPases involved in chromosome partitioning 29 12 Op 3 . + CDS 33972 - 34220 91 ## gi|313158933|gb|EFR58312.1| hypothetical protein HMPREF9720_2099 + Prom 34260 - 34319 2.5 30 13 Op 1 . + CDS 34342 - 34521 273 ## 31 13 Op 2 . + CDS 34533 - 34868 241 ## 32 13 Op 3 . + CDS 34865 - 35140 239 ## 33 13 Op 4 . + CDS 35137 - 35715 63 ## COG0739 Membrane proteins related to metalloendopeptidases - Term 35694 - 35724 -1.0 34 14 Tu 1 . - CDS 35733 - 36011 325 ## gi|313158926|gb|EFR58305.1| hypothetical protein HMPREF9720_2100 - Prom 36032 - 36091 4.8 + Prom 36165 - 36224 2.5 35 15 Op 1 . + CDS 36276 - 36854 784 ## BF3513 hypothetical protein 36 15 Op 2 . + CDS 36867 - 37166 325 ## 37 15 Op 3 . + CDS 37163 - 37513 312 ## gi|313158924|gb|EFR58303.1| hypothetical protein HMPREF9720_2102 38 15 Op 4 . + CDS 37522 - 39306 1993 ## BF1481 hypothetical protein 39 15 Op 5 . + CDS 39308 - 39880 279 ## BF4320 hypothetical protein 40 15 Op 6 . + CDS 39885 - 40763 891 ## gi|313158902|gb|EFR58281.1| hypothetical protein HMPREF9720_2105 41 15 Op 7 . + CDS 40811 - 41881 1092 ## gi|313158895|gb|EFR58274.1| putative lipoprotein + Term 41989 - 42058 20.2 + TRNA 41931 - 42040 26.0 # Pseudo ??? 0 0 + Prom 41933 - 41992 80.3 42 16 Op 1 . + CDS 42059 - 43405 934 ## gi|313158907|gb|EFR58286.1| putative lipoprotein 43 16 Op 2 . + CDS 43521 - 44003 260 ## PP_1535 methyltransferase, putative 44 17 Tu 1 . 
- CDS 44032 - 44403 339 ## gi|313158922|gb|EFR58301.1| hypothetical protein HMPREF9720_2110 - Prom 44428 - 44487 5.7 + Prom 44354 - 44413 3.7 45 18 Tu 1 . + CDS 44439 - 44840 308 ## gi|313158925|gb|EFR58304.1| hypothetical protein HMPREF9720_2111 + Term 45021 - 45066 -0.3 46 19 Tu 1 . - CDS 44822 - 45043 170 ## + Prom 44992 - 45051 2.4 47 20 Tu 1 . + CDS 45129 - 45470 216 ## gi|313158909|gb|EFR58288.1| hypothetical protein HMPREF9720_2112 + Prom 45504 - 45563 3.0 48 21 Op 1 . + CDS 45686 - 45886 142 ## 49 21 Op 2 . + CDS 45883 - 46044 58 ## gi|313158950|gb|EFR58329.1| hypothetical protein HMPREF9720_2113 50 21 Op 3 . + CDS 46041 - 46262 182 ## gi|313158943|gb|EFR58322.1| hypothetical protein HMPREF9720_2114 51 21 Op 4 . + CDS 46259 - 46975 724 ## gi|313158908|gb|EFR58287.1| putative lipoprotein 52 21 Op 5 . + CDS 46983 - 48311 736 ## gi|313158912|gb|EFR58291.1| putative lipoprotein 53 22 Tu 1 . + CDS 48447 - 48521 91 ## + Prom 48586 - 48645 3.7 54 23 Op 1 . + CDS 48675 - 48884 267 ## 55 23 Op 2 . + CDS 48884 - 49180 236 ## gi|167754233|ref|ZP_02426360.1| hypothetical protein ALIPUT_02526 + Term 49254 - 49297 -0.3 - Term 49748 - 49782 6.1 56 24 Op 1 . - CDS 49815 - 51770 -175 ## Bacsa_0604 hypothetical protein 57 24 Op 2 . - CDS 51812 - 52387 81 ## Bache_0302 KilA-N, DNA-binding domain protein 58 24 Op 3 . - CDS 52399 - 52611 233 ## gi|313158905|gb|EFR58284.1| hypothetical protein HMPREF9720_2121 59 24 Op 4 . - CDS 52621 - 52941 317 ## gi|313158937|gb|EFR58316.1| hypothetical protein HMPREF9720_2122 60 24 Op 5 . - CDS 53021 - 53275 167 ## 61 24 Op 6 . - CDS 53334 - 53537 242 ## gi|313158878|gb|EFR58257.1| conserved hypothetical protein - Prom 53592 - 53651 3.2 62 25 Tu 1 . - CDS 53662 - 54069 93 ## 63 26 Op 1 . - CDS 54201 - 54569 350 ## 64 26 Op 2 . - CDS 54550 - 54744 105 ## 65 26 Op 3 . - CDS 54701 - 57109 823 ## COG3505 Type IV secretory pathway, VirD4 components - Prom 57129 - 57188 9.8 - Term 57206 - 57256 3.0 66 27 Tu 1 . 
- CDS 57288 - 58439 509 ## gi|313158872|gb|EFR58251.1| hypothetical protein HMPREF9720_2126 - Prom 58460 - 58519 4.8 + Prom 58380 - 58439 6.2 67 28 Tu 1 . + CDS 58470 - 58595 75 ## + Term 58668 - 58706 -0.8 68 29 Op 1 . - CDS 58979 - 60760 171 ## ZPR_1414 KWG Leptospira 69 29 Op 2 . - CDS 60807 - 61193 157 ## gi|313158901|gb|EFR58280.1| hypothetical protein HMPREF9720_2128 70 29 Op 3 . - CDS 61204 - 62067 74 ## gi|313158898|gb|EFR58277.1| hypothetical protein HMPREF9720_2129 71 29 Op 4 . - CDS 62078 - 62509 207 ## gi|255009787|ref|ZP_05281913.1| hypothetical protein Bfra3_11671 72 29 Op 5 . - CDS 62531 - 63337 280 ## gi|313158887|gb|EFR58266.1| hypothetical protein HMPREF9720_2130 - Prom 63368 - 63427 10.7 73 30 Tu 1 . - CDS 63498 - 64085 -25 ## gi|313158871|gb|EFR58250.1| relaxase/mobilization nuclease domain protein + Prom 64043 - 64102 3.0 74 31 Tu 1 . + CDS 64311 - 64553 180 ## - Term 64568 - 64606 3.2 75 32 Op 1 . - CDS 64655 - 65017 276 ## gi|313158892|gb|EFR58271.1| hypothetical protein HMPREF9720_2132 76 32 Op 2 . - CDS 65034 - 65255 147 ## - Prom 65462 - 65521 3.6 + Prom 65291 - 65350 6.9 77 33 Op 1 . + CDS 65427 - 66200 530 ## gi|313158893|gb|EFR58272.1| conserved hypothetical protein 78 33 Op 2 . + CDS 66190 - 66660 122 ## gi|313158929|gb|EFR58308.1| conserved hypothetical protein 79 33 Op 3 . + CDS 66662 - 67003 193 ## gi|313158900|gb|EFR58279.1| hypothetical protein HMPREF9720_2135 80 33 Op 4 . + CDS 67007 - 69385 1600 ## COG3451 Type IV secretory pathway, VirB4 components 81 33 Op 5 . + CDS 69417 - 70061 275 ## gi|313158921|gb|EFR58300.1| hypothetical protein HMPREF9720_2137 82 33 Op 6 . + CDS 70058 - 71110 325 ## gi|313158886|gb|EFR58265.1| hypothetical protein HMPREF9720_2138 83 33 Op 7 . + CDS 71117 - 71731 278 ## ZPR_1772 conserved protein found in conjugate transposon TraK 84 33 Op 8 . + CDS 71736 - 72059 226 ## + Term 72182 - 72212 0.2 + Prom 72068 - 72127 3.2 85 34 Op 1 . + CDS 72360 - 73457 431 ## Fjoh_4351 hypothetical protein 86 34 Op 2 . 
+ CDS 73485 - 74342 491 ## Pedsa_1554 conjugate transposon protein TraN + Term 74346 - 74379 5.2 - Term 74329 - 74368 8.0 87 35 Tu 1 . - CDS 74482 - 75375 865 ## COG3177 Uncharacterized conserved protein - Prom 75419 - 75478 2.8 - Term 76422 - 76468 11.8 88 36 Tu 1 . - CDS 76488 - 78236 1587 ## COG0270 Site-specific DNA methylase - Term 78253 - 78295 15.5 89 37 Op 1 . - CDS 78330 - 78557 170 ## gi|313158874|gb|EFR58253.1| hypothetical protein HMPREF9720_2146 90 37 Op 2 . - CDS 78595 - 78816 249 ## gi|313158889|gb|EFR58268.1| conserved domain protein 91 37 Op 3 . - CDS 78803 - 79330 497 ## gi|313158927|gb|EFR58306.1| hypothetical protein HMPREF9720_2148 92 37 Op 4 . - CDS 79347 - 79901 621 ## gi|313158913|gb|EFR58292.1| hypothetical protein HMPREF9720_2149 93 37 Op 5 . - CDS 79972 - 80517 93 ## gi|270265411|ref|ZP_06193671.1| hypothetical protein SOD_n00310 94 37 Op 6 . - CDS 80555 - 80887 274 ## gi|315223254|ref|ZP_07865115.1| conserved hypothetical protein - Term 81605 - 81638 7.5 95 38 Tu 1 . - CDS 81673 - 82662 922 ## COG4227 Antirestriction protein - Term 83125 - 83160 4.2 96 39 Tu 1 . 
- CDS 83163 - 83486 247 ## gi|313158899|gb|EFR58278.1| conserved hypothetical protein Predicted protein(s) >gi|313158870|gb|AENZ01000025.1| GENE 1 2 - 982 1313 326 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 52 325 220 487 489 230 42.0 3e-60 SADELRGMGFDTQDGRYTTPAPDEHMLGELFGDLAGEVRATCMKEGRLLPAYATQRKVAA RFPGDDERQTRLREGLMALLDDVLFIEDPRRKGYFHPRIAPHSTHAYRRLDGERRAAFDR LYTDFFYHRHNRFWQESALRKLPVLLSATEMLTCGEDLGMIPDSVPETMHELQILSLEIQ RMPKTPGELFADPAHYPYFSVCTTSTHDMNPLRAWWEEDRELTARFYHEALGIGGDVPYF CEPWICRRILDMHLNSPAMLTILPLQDWFSTDGELRYPDPAKERINVPAVARYYWRYRMH LPLEELLHKETFNESLHDMIACSGRR >gi|313158870|gb|AENZ01000025.1| GENE 2 1357 - 2109 779 250 aa, chain - ## HITS:1 COG:CAC1670 KEGG:ns NR:ns ## COG: CAC1670 COG3279 # Protein_GI_number: 15894947 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Clostridium acetobutylicum # 1 239 3 244 253 84 26.0 2e-16 MNVLIVEDEMMAQANLARALTQHFPDVRIVGTTGSVRETVLWLRTPGNSADVIFMDVELS DGDCFEIFRQADVTARVIMTTAYDNYAVRAFEVNSIDYLLKPIDLAALRRAVERCRVRSG GIDPDVLLNAIRSPREYKQRYVVRFNDRIVPVQTTDIAYFYSEEKNTYLVTNDNNRYIMD QSLDVLSDELDPGRFFRISRSCIIAMPAIVSIVKYLGNRLKITARPRPEFEMVVSRSRVD DFLKWLEGNG >gi|313158870|gb|AENZ01000025.1| GENE 3 2111 - 3013 1262 300 aa, chain - ## HITS:1 COG:VC0694 KEGG:ns NR:ns ## COG: VC0694 COG3275 # Protein_GI_number: 15640713 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Vibrio cholerae # 122 243 371 490 558 77 37.0 3e-14 MADLLASNSAALIILLCVNIIYVRHTRRRETRHGPIAEIAAFLLFVLLTSACVALITGLH LPFAFDRSFTAREFLQLWIVTLLVEVTLYSVVFMVDYALVARAVLRAERGKAHQAQFRYM KLKQQVNPHFLFNSLNILDCLVCEQRTEQASAYIHKLAGIYRYMLQNEEETLVRLREEMA FVGMYVDLLQVRFPEGFRVETDISDEVMNRHVLPCSVQLLIENAIKHNSVGADRPLVIRI VAEAEAEAVTVSNNLQLKVSGNPSTRVGLNYIRQQYLDLSGIPIGIRRTDTEYCVTLPLL 
>gi|313158870|gb|AENZ01000025.1| GENE 4 2925 - 3500 286 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158896|gb|EFR58275.1| ## NR: gi|313158896|gb|EFR58275.1| putative lipoprotein [Alistipes sp. HGB5] # 1 191 1 191 191 338 100.0 1e-91 MRSLKVSGIIHLFAVLHAVVALSCHLAGINDELVLTLLTIVLIVLICLKRGLNVEFTAAS VIVVNVIGYLLGTGGAQLIELLLHASPVVHAVSTFITTEILGWSIVWFTKLFRRGDDTAS RSSAWTPRIGLLLLAVGGDFRAEAGLCRAFRLALLFHRPPLPHGRLAGFQFRGTDYPAVR QYHLCPAYAQA >gi|313158870|gb|AENZ01000025.1| GENE 5 3723 - 5030 2017 435 aa, chain + ## HITS:1 COG:NMB0388 KEGG:ns NR:ns ## COG: NMB0388 COG0477 # Protein_GI_number: 15676302 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Neisseria meningitidis MC58 # 4 431 14 446 451 283 40.0 6e-76 MTKPKLSFWQIWNLSFGFLGVQIGYSLQNSNTSSIFESLGADVSHLSYFWLAAPLAGMIV QPIVGLFSDGTWTRWGRRIPYILGGSLISALALVLMPNCPKLLAFAPLAMGAFILLFMDL SFNVTMQPFRALVADMLDDSQKTQGYVVQTFLINLGAVVGAILPLVMTWLGVSDEAAPGH VSPHIAYSYYAGGAILLLTVLVTSFKTREYPPGEFARYNNLSEEDAKPMSFVGLMRNVPG VMVRLGVTQFFSWAALFLMWTYLKPAITGVVTDHATGEVLSAGATQTWVGVLNGTYPIPA CIAALFLGRVAARYGNKPVYAACLLAGALGFAGLCLLHDQYALMLPMVGIGIAWAGILAM PYAILSRAVEPRRMGVYMGIFNFTITVPQIVIGLTGGAIVKYCFASDAADMLALAGVFML LAAVSVFLVKEHKAE >gi|313158870|gb|AENZ01000025.1| GENE 6 5074 - 8115 4829 1013 aa, chain + ## HITS:1 COG:no KEGG:BVU_1380 NR:ns ## KEGG: BVU_1380 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 31 1013 30 1001 1001 865 48.0 0 MKRFLTMCAGILLICQFTPPLTLAQNAGYEVKGVVVDKSGMPILGATVVEKGTTNGVSTG INGDYAIRAAGPESVIEVSYVGYKTVSLVASSSLLVHLTLEEDAMGIDDVVVIGYGTIKK NDMTGSVVAIKAEEFNRGAVVSTQDMLKGKVPGVHIIPGDGGPGSSATIRVRGAASLNAS NDPLIVIDGVPIAVDGGKGMANPLETINPNDIESFTVLKDASAAAIYGSRASNGVILVTT KKGRGNTPRVSYSGSVSVQTNSDELPVMSPGEFRTYIDQVYPAGTTTGDKVQSMLGDKNT NWQDLVFRTAISHDHNVSLIGNINDRMPYRASVGYTNQQGTLETSKYERGTLDLSLSPNF 
FDKHLTVNLNAKGVITSQRYASGGVVGSAAFFNPTIDPYFRNDDGSIDYTTTNGYWNYGS GRGEDFTPNTLLGAGPLSQLYDRDNTARAKRFIGNAQIDYKVHGFEALRFNLNLGLDVSS AKTYDGVNPGSFQAYTDTEARGWGQYSWTTNFRRNQLLEFYANFNKEWGIHHLDVMAGYS WQHFYSSDHSVSYFNETHEQKGEDSRYPFNRQENYLLSFYGRLNYSIASKYLFSFTLRDD ASSRFSKDTRWGLFPSGAFAWNIAEENFLKDSRAVSALKLRVSVGQTGQQEIGSNYPYLA RYYMSTDVYKTYYMGSAGHMFYLTPGAYDPNIKWETTTTYNVGLDFGFLGGRINGSVDWY LRQTDDLLNNVITPMGSNFGNTVLTNIGSMENKGVEFNLNFIPVQTKDWNLTVGFNGTFQ HTEFTKLNNTDDPDYAIQVSSITKGTGNLLQRHMVGYAPYTYYCFQQVYDQDGKPIQNAL VDRNKNGQIDQGDRYMTDKSPNPDFFYGISLKLSYKNWDFGFNGHGSAGNWVFNDFASAN STSNIDINAGNLPNFARLVKKTGFTKANSGEQWYSDMFLENASFFRMDDINLGYTFNKIG NWKGSMRVAFGVQNAFVITDYSGVDPEIPGVNGIDGSIWPRPRTYSLRLNVNF >gi|313158870|gb|AENZ01000025.1| GENE 7 8129 - 9787 2455 552 aa, chain + ## HITS:1 COG:no KEGG:Halhy_0390 NR:ns ## KEGG: Halhy_0390 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: H.hydrossis # Pathway: not_defined # 10 546 12 530 530 329 37.0 2e-88 MKIDIKHFALAAAGATMLLTSCIGDLDTLPLNPSDSTSETVYGKDENGYLAGLTKLYFNF VSNDTTDLQVSDGGASELIRAFWTIQEVTADACKCAWENDAWVRAMNTDTWSDADNDATY AVYVRTLQGIAYVNEYLRQTASDKLSDRGVSSELAARIQSFRAEARFLRAYFYWIALDVF GDVPFTTETSPFGGGVNPKQASRKDVFDYCISELTDLASDESAMPAARSNYPRADKGAVN GLLARMYLNAEVYAGTPMWTEAKSACEAIFGMGYSLCPEYSDLFRGDNGENADALKEIIF GVAYDAEQTQSYGGTSYLTLAAIAATDVTAEQKINGVNNGWAGIRVPYEYVQKYFNARNA DYTTGEYTVNDKRGGMFYIKGRQESMNDALYVFLNGWTCLKFNNIPHNMTNDEFLATAAS KAYSDIDFPMIRLGEIYLIYAEACMNLGQANTALPKLKELTDRAGVSAPSSVTADYLLEE RARELMWEGHRRTDLIRYGKFTTPSFLWTYKGGTFTGQGFDDYMKIFAIPSSELASNPEL HQNPGYGNATDK >gi|313158870|gb|AENZ01000025.1| GENE 8 9811 - 11625 2313 604 aa, chain + ## HITS:1 COG:no KEGG:Halhy_0391 NR:ns ## KEGG: Halhy_0391 # Name: not_defined # Def: hypothetical protein # Organism: H.hydrossis # Pathway: not_defined # 19 359 19 342 345 110 29.0 2e-22 MKITRYAMMLAAATGLLSACQKLDEVKAYDPDKVVAPVLHALPGEIVITPDNMGSTQTFT WDAADFGVRTQINYSIEASYNDGAKLVLFTGMNGTSSEQTYESLNNILALSVEDGGLGVP SGEPTDVDFYISATIGTDFEKFYSAPATVRMTVTTAERTYPQVWVIGDYCGWNFDNAQGL 
FCFSGDEVTYEAIVDLGEKAANGFKLSGEAGWNDACNWGTDGDAAAPETEAPSITLISSG GSGNIMVYSKRFYRFVFDRSTLTLSNKLSFNSMGIIGDATPGGWDTDTEMNFDTQKQRFW VDVTLTAGEFKFRADNDWAINFGGADGRLSQNGDNIKATAGNYRVYATLNNSAEITYELN AGDYGTGGGEEPDPGPEKADWYIHGQTVATPDWGPTAMESASSNIVAYKAAGVEVAANSE FLFKSGDESQWIGADAAFAGSSPYTCTIGSAFKVSADKVNAVIAEAGTYDYWLLPEAGRA YVMAAGAKPELVADTWGLVGNITGWGDLGDFSMSEEGAYLVRKGVVLTTASEFKIRFNNA WDDSKNYGTASGGAVDINKAVDIITSGGSQNMKVQLDGTYDIYFDLANSQIYIMSEGKTP AEAE >gi|313158870|gb|AENZ01000025.1| GENE 9 11733 - 14051 3204 772 aa, chain + ## HITS:1 COG:MYPU_6320 KEGG:ns NR:ns ## COG: MYPU_6320 COG0366 # Protein_GI_number: 15829103 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Mycoplasma pulmonis # 138 263 66 190 607 129 45.0 2e-29 MKKFLFLLLALVLSVSVSCSDDKTGDTGGSMLSVTPTEIEFEAAGGSRLLDLRTDAGAWS LTQSDNTAWCTPALTSGKTSTSFAVTATANESSKRSATLTFTAPGCDPVVVTVTQSGDAS QDFEGEQVVAQPDAWDNRKRADISYQLLVYSFADGNGDKVGDLPGLTRRLDYIDALGASA VWLSPIHPAASYHGYDVLDYEAVNPAFGTDADLRAFIDAAHARGIRVYLDYVLNHTGKDH PWFKSAAASEGSPYRDRYIFSEDPQADIAAGRIDQIATEGAAGYDAGQWFSTDTGAGAAG RFKFVLDWTNADSPTVTVTETTDAADADNTQGGADDKYLYFGNGTSKRFYARGGNSYELT LDFDSDWGFLVRTSTTSWAAGTKYGAPDNRTIIRFGEPFTLMSNRSADPANVQFSLPTMY HSHFWTAAFADLNYGKAAEAEQSGAFKAVTEAADKWVRMGVDGFRLDAVKHIYHNAYNDE NPTFLKKFYDRMNESYKAAGGEGDFYMVGEMLDEADKAAPYYRGLPALFEFTFWYKLKWA LQNGIGCYFVKDILDVQPLYAQYRSDYIEATKLSNHDEDRTGSDLGQSAEKMKVAAAVLL TAQGAPYIYQGEELGYWGTKSNGDEYVRTPILWDKAGNELASGSLSGKIDMQMLTPAISV EAQADDDGSLLNLYRTFARLRNTYPVLAQGKMVKHPVYNDGNTSQQSIAAWYRELDGERM LVVHNFGREEQILTLTDQPDKAVGVSGEVKLQRGDASSKLLMGAWSSVVFTL >gi|313158870|gb|AENZ01000025.1| GENE 10 14062 - 15399 1868 445 aa, chain + ## HITS:1 COG:TM1650 KEGG:ns NR:ns ## COG: TM1650 COG0366 # Protein_GI_number: 15644398 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Thermotoga maritima # 34 365 7 336 422 200 35.0 5e-51 MKRLPALLLAMVLVACGTRPQQPAAHPDWSYNSVVYEMNVRQYTPEGTLAAAARHLPRLR ELGVDVVWLMPVYPIGVKERKGTLGSYYAISDYEAVNPEFGTLEDFDRFLAEAHRLGLRV 
ILDWVANHTSPDARWIEEHPADWYVRDSLGNTIVQYDWTDIAKLDYGNADMRAAMAAAMR FWLNRGIDGFRCDMACEVPIDFWQQTLPALRKEYPGIYLLAEGEAPALHDGAFDASYAWE LHHLLNDIAQGRKSAADLRAYVARDSAAMPREAFRLMFTSNHDENSWAGTEFERMGDAAR VMALLTFTLPNGQPLVYTGQEMGFDHRFEFFEKDPVPAWEHNGFTDFYTALIRLRHENPA LAAGERGGQAAYPLGERTPDGLMLFSRTAGGNEVTVAANLSAEEVAFDLPFEGSRREFFT GKEYTGGTRTVLPAWQWLVFAQDRR >gi|313158870|gb|AENZ01000025.1| GENE 11 15598 - 17475 2300 625 aa, chain + ## HITS:1 COG:BH2927 KEGG:ns NR:ns ## COG: BH2927 COG0366 # Protein_GI_number: 15615490 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Bacillus halodurans # 135 621 136 578 578 145 25.0 4e-34 MTLAAAGLLSCTGTGQSVFDPASVADASGEVTRVEPLSWWTGMQTPLQLLVNGAGIAAYD VRIEGGQGVGVKARHKADSPNYLFVDVKVKPDAEPGTYYLVFSQGERQFKVPYEIAARAE GSAARKSFTTADMIYLLMPDRFANGDASNDSTPHTRERADRSAFFGRHGGDLQGMIDHLD YIAGLGATAVWPTPLLLDDEPEGSYHGYACGDYYRIDPRFGSNELYREFVGKAHEHGLKV IMDIVTNHCGTGHWWMKDLPFRDWIHQFPEYTGTNVCFSTNMDPNASRYDLDLQESGWFV PSMPDMNLDNPYVLQYFKQWAVWWIEYAGLDGFRVDTYPYNEKVPMSEWCAAVRREYSDF NIVGECWTSSIPQLAYWQGGNPNKDGFDSHLPSIMDFPLQEAICRALPTDSLRWGEGMTR VYDCLSHDFVYHDLSKMMIFVANHDTDRIGDIVRGNPDRLKLSMAMLATMRGIPQIFSGD EMMFTSKDLSQGHGGLRVDFPGGWEGDAVNLFDPAQRDAVQAGLFDYTQRLFQWRKSKPV IHNGRTMHFLSRDNTYAYFRYDDTDAVFVFINNSRGKKQVPWSHYAEIASGLSDGRNVLT GEATEVDDTTTVGPRQALIVEFKRK >gi|313158870|gb|AENZ01000025.1| GENE 12 17488 - 19683 3132 731 aa, chain + ## HITS:1 COG:no KEGG:BF1158 NR:ns ## KEGG: BF1158 # Name: susB # Def: putative alpha-glucosidase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 3 729 14 720 723 1080 70.0 0 MKKIFLYISALLAGAACTAQETLLSPDGRLRLEFSLADGGRPTYTLDYKERPVILPSGMG LELRGEAPKLEFGAEIRKGEPGRPVSLYDGFTLGQVARSESDQTWRPVWGEQAEIRDRYN EMAVTLRQAATGRDMTIRFRLYDDGLGFRYEFPEQESLTYFVIGEEKTQFAMAGDHTAFW IPGDYDTQEYDYTESKLSQIRELMPGAITPNSSQTPFSPTGVQTALQMKSDDGLYINIHE AALVDYACMHLDLDDRNFIFTSHLTPDAQGWKGYMQAPCHTPWRTVTVSDDARDILASKL ILNLNEPCAYDDTSWIKPVKYMGVWWEMITGKSDWAYTRDVPSVQLGVTDYARCRPSGRH GAENENVKRYIDFAAAHGFDQLLVEGWNVGWEDWFGHTKDYVFDFVTPYPDFDIAALNEY 
AHGKGIRLMMHHETSASVRNYERHMEAAYRLMDKYGYNSVKSGYVGDILPRGEHHYGQWM NNHYLYAVRRAADHKIMVNAHEAVRPTGLCRTYPNLIGNESARGTEYQAFGGSKPHHVTI LPFTRLQGGPMDYTPGIFVMDVAEVNPGNHSHVNATLANQLALYVTMYSPLQMAADLPEH YEKYMDAFRFIEDVALDWEESRYLLAEPGDYIVVARKAKDTGEWFVGGVTDENRRTVTLP FDYLDPGKEYVATLYADAPDADYQTNPQAYVIRTGKVRQRTELDVEMARGGGFAISIREA TDADRKLCKLK >gi|313158870|gb|AENZ01000025.1| GENE 13 19934 - 20692 776 252 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2167 NR:ns ## KEGG: Bacsa_2167 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 83 252 80 249 257 72 29.0 1e-11 MNDENKIGLSIEMYEDLKETLVKAVKSTKANAGLENAQLERIERLIEATELSQGQVAQML EQLQRHTALPGVKTEQEREFTDKYNALLAQLAGRLEVIDKEIRQTHTLIEQVRQAVEAAA KPENLPVREHRHTYTLDISSSKTFLTMFVMLGCIFLQGAFIYRISENNRLLAANDLKYRY VKMRGEIKEKELMELETIFRDEQHTAIRDTVRNQVERYKEAVRRRAEQLERAASKERKAK KLLDDAAQLRGK >gi|313158870|gb|AENZ01000025.1| GENE 14 20717 - 21787 764 356 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A6126 NR:ns ## KEGG: HMPREF0659_A6126 # Name: not_defined # Def: relaxase/mobilization nuclease domain protein # Organism: P.melaninogenica # Pathway: not_defined # 1 273 1 277 306 262 44.0 2e-68 MVGKVISASSFSGTVGYVMKEESRILEAEGITPPEVKDMVQDFKDQTLLNPRLKNTVGHI SLSFSPKDAPRMTDALMTQIAKEYMQKMGITDTQYLLVRHLDQPHPHCHLVYNRVGNNGQ TISDKNIKLRSAKVCRELTEKCGLYLAPGKDDVRRERLREPDKTRYEIHDAIKRCLPRCA GWKGLEKQLEKQGIGIRYKYCGSTDRKQGVLFSKNGFEFSGSKIDRAFNFTKLDNRFNNI QQQTQHRATLFGNLSAAAGNYRSAFAGLFGGMGSSGTREEPSSVNLGKAGGIPLPPADSP GGVSAKQLQRKPGESPEKYIARITALLNAAAEAMAIAAMEHRRRMEEQKRRAKIKL >gi|313158870|gb|AENZ01000025.1| GENE 15 21777 - 22172 284 131 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A6127 NR:ns ## KEGG: HMPREF0659_A6127 # Name: mobC # Def: bacterial mobilization protein # Organism: P.melaninogenica # Pathway: not_defined # 15 127 3 115 117 72 36.0 6e-12 MNNLSKQPVMETPKKTYSKQGGRPKVGIGRICKYVVSTRLSPERKLRFSALCREAGQPPA EVLRQLIDRGTVRARITREQLDFMAQLKGIARNLNQLTRLANAKGLAAVRVRHAAIVTAI EKLLKQICDGR >gi|313158870|gb|AENZ01000025.1| GENE 16 22414 - 23271 772 285 aa, chain - ## 
HITS:1 COG:no KEGG:PGN_0923 NR:ns ## KEGG: PGN_0923 # Name: not_defined # Def: putative DNA primase # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 4 270 3 254 295 208 40.0 2e-52 MTVIPYSNRISIRDFLAWRGIQPKYERNGYGMYLSPLREERTPSFKVDYVQNLWYDFGLG EGGTLLTLVMRLERCDSREAVRRLQNGEKGDTGIASISPGIGEPLGVGGALSVVRPATVP ALRILSDASLRHPALVGYLASRGIVPPVAAAFCREVRYEINGRAFFAVGFRNDAGGWELR SARFKGGSSPKHITTIDNRSDTVIAFEGFMDFLAYLSLKYPERLRIDAAVLNSVVNLPKA VPFISRHPVIRTFFDNDEAGRKATADLIRLCPRSEVIDQSSFTRT >gi|313158870|gb|AENZ01000025.1| GENE 17 23455 - 24675 1119 406 aa, chain - ## HITS:1 COG:no KEGG:HMPREF9137_0377 NR:ns ## KEGG: HMPREF9137_0377 # Name: not_defined # Def: hypothetical protein # Organism: P.denticola # Pathway: not_defined # 9 397 3 393 396 468 59.0 1e-130 MPAKTVKRESPYLRVGTTIYKRVRQPLSSGRSVETLIPWNVETLRQDYGKSYLACIPKYD GFCTVPDHTNYRREIDGFLNRYEPIPFQPADGIFPHIHDFFAHIFGEQVELGYDYLQLLY LRPLQRLPVLLLVSDERNTGKTTFLNLLKSIFGGNVTFNTNEDFRSQFNDDWMGKLLICV DEVLLNRREDSERIKNLSTARSYKAEAKGRDRREVEFFGKFVLCSNNERNPVLIEAAETR YWVRRVPPLPYDDQHLLAKMRAEIPGLLFYLQQRMLSSHEESRMWFAPRLLVTDALRRII HYNRSKTETEMLSIIRDIMDAENLAEYRFDVSDMVNMLEIRGIRVDHPTVRRILAENWQL RPAPPTYYQRYTITYNGETQRQDSKTARVYTVTREQLGGLLDDAAM >gi|313158870|gb|AENZ01000025.1| GENE 18 24665 - 24970 160 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291513569|emb|CBK62779.1| ## NR: gi|291513569|emb|CBK62779.1| hypothetical protein AL1_00430 [Alistipes shahii WAL 8301] # 1 101 1 101 101 195 100.0 1e-48 MDLKQLSELGGEVNVTVRLEDLRQWHKELTAVASPPSVMPVPQHAGELYTRKQTIALLGV DSSTLWRWAKSGYLVPVEYGGQRRYRVADVQRILNGDSYAR >gi|313158870|gb|AENZ01000025.1| GENE 19 25185 - 25928 63 247 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158876|gb|EFR58255.1| ## NR: gi|313158876|gb|EFR58255.1| hypothetical protein HMPREF9720_2090 [Alistipes sp. 
HGB5] # 1 247 1 247 247 511 100.0 1e-143 MNPEYYFQKGLRLIVASAVIFALIVLFSAMVSRGFPLWLALVLLVPAGLGVTGFYFFSVG SSADLSERFFRRFGCSQPRGEEKTTTEGVGSEQRCERSPDAIVPESAVVSADPVPDDRIL ENALATVYEYTDKDLGDAVDGTNRQILRRRLLYLACMAPVPNDVPQVRLRHDRVSYGDLC HYGWIVWNAFKGATNRFYDQTELAEWLKASFESLAKYNTKTLRAKLRATDGGYRIRLIDN LKEYIQK >gi|313158870|gb|AENZ01000025.1| GENE 20 25925 - 26944 449 339 aa, chain - ## HITS:1 COG:no KEGG:PRU_1108 NR:ns ## KEGG: PRU_1108 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 17 329 2 330 330 266 44.0 9e-70 MKNNFWTFSDEQKSMFVAQTSERVGLPPQAVEKDWWVTMTLKALFESSCRDFITFKGGTS LSKGWHVIERFSEDIDIAIDKSFWRIAGDNKSQRDRIRKLSRAYIEQRLVAEMQALLEGY GASDFELRAVPAQDSDADPTLVLLPYRSIYANIEYVESQIRIEFSCRSMKEPRERIEIRP LIAEAYPDVFGELVFPIYAVVPSRTFLEKAFLLHEEFQKENPRVERMTRHLYDLERLMDT DFGKAALADPKMYVEIVRHRSIFNTIRGVDYRTHHPSRIDFIPPEKLAEVWRRDYERMQE YFIYGDSLPYDRLIARMAELRDRFRKVVMEDDFFSESEL >gi|313158870|gb|AENZ01000025.1| GENE 21 26934 - 27542 331 202 aa, chain - ## HITS:1 COG:no KEGG:Halhy_3922 NR:ns ## KEGG: Halhy_3922 # Name: not_defined # Def: hypothetical protein # Organism: H.hydrossis # Pathway: not_defined # 2 197 3 195 204 138 39.0 1e-31 MQAIEDKILNKIKKCGRGKLYSASDFAVYGSAVSVAKALERLTRKGGLVRIARGLYCYPK IDRKFGMGIQYPTINEIAEKVARQSEARVVPTGMHALNVLGLSAQVPMNYVFYTDGNSRT VNLFNGRKLRFKRVALKNLAYQNKTLMLAVFALKEIGRPQVTEEHTAQLKTIFARIPKSS ILPDLRLVPAWIRKIIMSFYEE >gi|313158870|gb|AENZ01000025.1| GENE 22 27600 - 29003 685 467 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2219 NR:ns ## KEGG: Bacsa_2219 # Name: not_defined # Def: integrase family protein # Organism: B.salanitronis # Pathway: not_defined # 3 467 2 445 445 320 39.0 7e-86 MAKVTYVLAQGENSAGESQVNFRVYVSRELRVRVPSGIWVDRKRWGKKNDINIPNIPGEE RDALLAKRAKLKELVDVIETSVEAADDKSTVTREWLEKLIRRTLRPKTATSVEEKKIGFF PLTDEYLATHKLSESRVKHFNVLVRTLKRYELYRKLSNRRFVLDVHTVSPTTLDDFGAFL MKEPEIFDEHPELYDEVPYARPKVRKNLPVKRGPYLNAAGETVIPGRPKERGMNYVSDML IRLRSFYVWLNDNGHTYNDPFKQYKIAEIVYGTPIYITTDERKQLAEADMGDDKQLETQR 
DIFVFQCMIGCRVSDLYKMTYANIIGDCIEYVPRKTRDDRVVTVSVPLIGAAKELIRKYL DENRGTLFPFISEQKYNVYIKAAFRKAGLTRMVTTIDQRTRQNVQVPICDLASSHMARRT FIGNVYKSVKDPAIVGAMSGHKDGSRAFARYRDIDMDIKRDAVSVLE >gi|313158870|gb|AENZ01000025.1| GENE 23 29963 - 30286 99 107 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSCAIQIMQVIHLPGLPINLFDHFDKTAEIIIRISLPIYPDPELIIEIDIDIFKLQTTPN HLFILLQLIQDCSVMPDRYYRPRIGIAGNITLKIIRVIARGGLRSIY >gi|313158870|gb|AENZ01000025.1| GENE 24 30675 - 31115 150 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158934|gb|EFR58313.1| ## NR: gi|313158934|gb|EFR58313.1| hypothetical protein HMPREF9720_2095 [Alistipes sp. HGB5] # 95 146 1 52 52 104 100.0 2e-21 MQNKFPTIKDRILYLAEIKGFGKKNFCEKIGMSYGSFTGQAKNTPLNSNAIANILCIVPD VNLHWLLTGIGDAVIVEQELDSSSIRGENTAKDSMIAELMTIIRTQAETLLQQQNFINDH FMELHQKKISPPRVGVEYPDKEHIKE >gi|313158870|gb|AENZ01000025.1| GENE 25 31105 - 31320 89 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158938|gb|EFR58317.1| ## NR: gi|313158938|gb|EFR58317.1| hypothetical protein HMPREF9720_2096 [Alistipes sp. 
HGB5] # 30 71 1 42 42 82 100.0 1e-14 MAGSFTAIPFKNGSHPLEDGTSVKVIFRAMSLVYKCYFLTLPNLLWYASSVRTYHNANIA NIFTDKQEYAK >gi|313158870|gb|AENZ01000025.1| GENE 26 31485 - 31817 319 110 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDSTHIDSTKISLYIQQLEFTVSNLEATITRMKKECFDPKKFYGTAITVDACARMHSVC TVTVREYVHAGLIPTHPDSTSRKILIFTIDALLLDFKQLKQKYKELKLSI >gi|313158870|gb|AENZ01000025.1| GENE 27 32034 - 33185 296 383 aa, chain + ## HITS:1 COG:no KEGG:HMPREF9137_1422 NR:ns ## KEGG: HMPREF9137_1422 # Name: not_defined # Def: initiator RepB protein # Organism: P.denticola # Pathway: not_defined # 77 370 108 381 390 71 24.0 7e-11 MILDAKYNYRLTQLSAIVKITNILQPIINQNIDEYVTLCRRFGNKEEHPKIDYLLGEERN LRLHAEDVYWKINERTLEVRIPLDAFTNIRSHYSRIRAAISDLIRIPVKFPTWSELAKET LTDGCRPLCQGVFYNNDKSQKNKSYIHIFFDEDIARLLVNPCYGYSRLLQATVDNCRSIY TARIYMQICRHADGKKWAIPYPDLRLLLNVDSPKKKGSTESKQRIRANSVPRYHEFRRRV LDVARDELLLMVQNNTTNYYFTYTEEFPSNRHSTPSHIIFNIYETHPSKKDLYAYENHLS QMRHYAGHILKIGQDRYNRVFKNITLRNYTYVFQRHLAAWIYMEDHGDKINNRDAYYLKS IENAIKEEKISHEQGVIQMELNL >gi|313158870|gb|AENZ01000025.1| GENE 28 33207 - 33962 318 251 aa, chain + ## HITS:1 COG:PAB0852 KEGG:ns NR:ns ## COG: PAB0852 COG1192 # Protein_GI_number: 14521493 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Pyrococcus abyssi # 1 247 1 252 257 128 35.0 8e-30 MKVLISINQKGGVGKTTTAANVGFRLSQLGKKVLLIDGDEQANLSLIFNATKHKETLFKL FITGEAVKPYEINPNLYIIPSDVRTSNINVQLANDVNAPFFLKKYLAMPEYKDYDIAIID CAPALDAIIINAITAGDAMLISLSPGEFSYDGMRRILSASNTIKKNYGAKIQLGGIILSM INARTKVYQQTVELLRADDLLPDAFNTNIRLCEAFKQAEAEHKTIFEFAPKSKGADDMSA LTDEIIAKLLQ >gi|313158870|gb|AENZ01000025.1| GENE 29 33972 - 34220 91 82 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158933|gb|EFR58312.1| ## NR: gi|313158933|gb|EFR58312.1| hypothetical protein HMPREF9720_2099 [Alistipes sp. 
HGB5] # 1 72 1 72 82 133 100.0 5e-30 MPAIEDRPSYDEDMMSIALAGITQRQKKKRSRTYTFVQKRSIDFTEEIWSIIEAHRKLPG IQDNLRETIYKGLLLLKEKENL >gi|313158870|gb|AENZ01000025.1| GENE 30 34342 - 34521 273 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKVYAITIEDAHGEYGRLYALADNDSDKLRLEEMAQAETMGDDTEVCVREIELNIPIKN >gi|313158870|gb|AENZ01000025.1| GENE 31 34533 - 34868 241 111 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEWNSKEKKELIKLYPKYTAKELEVKFKRSSKAIHRMANLLGIYKTANKVPPGLYDVGLR FMIVHCRVRADNKKNAKYRAIEKVERSILKRQSIKNYVFFSKSTVEKIKKQ >gi|313158870|gb|AENZ01000025.1| GENE 32 34865 - 35140 239 91 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSENIDKKLQQIIDLEIRELYDLAFFKPYLVNYIADFKPQFHQSMSDREYRDYLERKNEE AIQEFCRSIKADCSRDEAITNALHVLLADLL >gi|313158870|gb|AENZ01000025.1| GENE 33 35137 - 35715 63 192 aa, chain + ## HITS:1 COG:CC1872 KEGG:ns NR:ns ## COG: CC1872 COG0739 # Protein_GI_number: 16126115 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Caulobacter vibrioides # 51 179 247 369 383 92 38.0 3e-19 MKKWARISLFLQLLPISLWGQNPYALYTAGNKEILDIIAQLPARAPQILDIPCISPVQQP NFDRLRISSGYGLRIHPITQKVHRHSGIDIPPSGNDTIYATANGILDTVAYNDLIGLHIK IRHKYGFKTTYGHLKRTFIRTPHQKIHIGDPIGIMGTSGRSTGKHLHYTVQRNGQTLSPL PYCYLFINTRGP >gi|313158870|gb|AENZ01000025.1| GENE 34 35733 - 36011 325 92 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158926|gb|EFR58305.1| ## NR: gi|313158926|gb|EFR58305.1| hypothetical protein HMPREF9720_2100 [Alistipes sp. 
HGB5] # 1 92 19 110 110 153 100.0 4e-36 MAKNGDTRKLLLDIDIETMEYINDYKYAMLLSQGIVMTNVDIFIAAIQSLKRETEKEGKL KVVPRPEHVRKMEVSRREALQAGKLRKRQNKE >gi|313158870|gb|AENZ01000025.1| GENE 35 36276 - 36854 784 192 aa, chain + ## HITS:1 COG:no KEGG:BF3513 NR:ns ## KEGG: BF3513 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 30 192 25 187 187 94 37.0 3e-18 MRPPHFFHNIVNMKHYLLIIVSLFTFSVSHAQFSSVSTNIVGWAAGNINAAVDLNVNLHN TINIPVSANPFKFGDTQWSHVVLQPGWRHWFVERYIGQFVSPSLFYANYTIGYDKRTFKG NAYGIGCSWGYSKLLSTRWNFIVEIGAGIVYTPYTEKLRPKHIGEFDDEYTYRHRRFLLV PIKCNLSFSYLF >gi|313158870|gb|AENZ01000025.1| GENE 36 36867 - 37166 325 99 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTTFENSKGYKVLTLSAAEIKVWARAKVCDRCGRKISSSGFYVGAVNLMYCPDCYEEWH ETAPEKQELQDLRESRHLGKAIVHIAEALSDKKSKCNLL >gi|313158870|gb|AENZ01000025.1| GENE 37 37163 - 37513 312 116 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158924|gb|EFR58303.1| ## NR: gi|313158924|gb|EFR58303.1| hypothetical protein HMPREF9720_2102 [Alistipes sp. 
HGB5] # 1 116 1 116 116 202 100.0 8e-51 MNRIYLRIFGLTILLIQMALIGYTAFTVKELPHRYQQEFDNATKRLAIEEWSNHNAKKQD QLSSEDAILAKAKDISIESIITAYDAEISKLLSKQIGLLFLLILTSIASFVLPLED >gi|313158870|gb|AENZ01000025.1| GENE 38 37522 - 39306 1993 594 aa, chain + ## HITS:1 COG:no KEGG:BF1481 NR:ns ## KEGG: BF1481 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 193 590 286 681 687 227 35.0 1e-57 MKPYRTLVYIVLTVLMLSCAMTHSLQRSTPTADIHLPSNGPVETPEVEDVERAVTQMRKI SMNGGRDSAYLAEVERDSTGEVTIKGENIQTVYIVAKSKTVAERNGEIAIDFIVGIPAAL QSSTWGLSLTPVIENNGTEEALQPLSIRGELFSDIQKRQYWQMNKYLNRMLGDSTELTST TGLAAKYYEAYNRYIAGGQKRRADKLESTYKQTISFPYLSDPRLDSVLIRKGEIRYYYTQ TYRPTKDTKRLHLFFRGRVDAIDRSRYDLSNSDTLTYTVTSMLSLLDNEPRYMLKIIDKY VEVRDRNYITFPVGRANVIDTMGQNHVQLDKIERLMDTLINQYEFFVDSITITASSSPDG SMATNNRIAGERARAIKERLVRKFGHEVDTLIRTRSIGEDWTLLKRLIRHNSDVPNWEKI TDMIDQSSNLDVTEPQIRQQFPKDYAYMKEILYPQLRAVDFRYNLRRVDMVKDTVVLTVL DTTYMRGVQQLRDRDYVGALRTLNDYKCQNLAIVLLSLSYDEAAFEILEQLPPAEKNPRT DYLSAIALSRMNRPREGLEYYLKAIQADPVLKFRGNLDPEIQILYKQNNVKASW >gi|313158870|gb|AENZ01000025.1| GENE 39 39308 - 39880 279 190 aa, chain + ## HITS:1 COG:no KEGG:BF4320 NR:ns ## KEGG: BF4320 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 188 1 187 196 187 56.0 2e-46 MELQTIQNKIYEIRGQQVMLDFDLAALYQVETKVLKQAVRRNIERFPEDFMFEISNEEYN SLKDSLRSQIVTSNKGGIRYMPFAFTEMGVAMLSSVLRSGSAIQVNIAIMRAFVAMRNYL TQASQHSAELAEMRSRLQLIEHEVRENLKAMNDMSEDIGKDIDAIYEAIGALSVKLPQIR QEPQKIGFKK >gi|313158870|gb|AENZ01000025.1| GENE 40 39885 - 40763 891 292 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158902|gb|EFR58281.1| ## NR: gi|313158902|gb|EFR58281.1| hypothetical protein HMPREF9720_2105 [Alistipes sp. 
HGB5] # 1 292 1 292 292 592 100.0 1e-167 MKPIQILFTGLLLASLLPGCEIHDDIDVDVKRKAIVYLRYNYNDGEHDTMDEVQTLRLFF FDLKTGRQYRDTTLTRDEFLTDTGAMQTYISNGEYGIVTLANVGHGSTVSADNLGDAAIT FPDTGADPLFFNRIQTPIQKGDSLRFDIDLFKSVYKVNVRVEGMQNINNPEDFYFGLNNY AALSFDNKPCGGFRMYRPQLARDPAAGTMSGSFYTPYFPSDSPISIGIYTDDPTSIYGHE LFVATIQRYMEIAPNPGHDVEIDIRILLNRANVTVIISDWEGTVIQEEHFGA >gi|313158870|gb|AENZ01000025.1| GENE 41 40811 - 41881 1092 356 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158895|gb|EFR58274.1| ## NR: gi|313158895|gb|EFR58274.1| putative lipoprotein [Alistipes sp. HGB5] # 1 356 1 356 356 580 100.0 1e-164 MKKFFMKLTAIAVMPMLLAACTKDEVNPTPTNGGKEGEIELILKNESVADGTRAFGSGTT ESWEKSISSAVLIVYNTSGTQILRRVLTAAEVNGSTTTPIKFVLPGVSADASCDFYVVIN RNIADNITTKTGLLAELESDIASYNSTYANVTTKAMRTGGFVMTGSASVKIAAGTTAVTM TVKRVVAKVEIQTSMTDSFRTKYGNGCVEVKKVTLSRGAEKSFLIDQTTSKYATVSSSFT SAQDAYCDKTGASKTNAYKYNNLFYINEKAAAATGSRIKVVLNAVYDADGNLSTTTDQLA VTYETELTGATGGKIARNGSYKVNAKLDGLTGQDVTLAVTVANWDALSTQDVNLGQ >gi|313158870|gb|AENZ01000025.1| GENE 42 42059 - 43405 934 448 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158907|gb|EFR58286.1| ## NR: gi|313158907|gb|EFR58286.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 448 1 448 448 927 100.0 0 MKTYKTILMLLLAATTACSKDSAEPDTPGPKPSISMTLDYQSREYELGEVIEATLTITEK NPAADYFLLNTSCNGGKAVATVDGHELQWQAEQQIPYEIVNEEFSSKVLHLKITPQAGAT AKQPFNFGIYAISADGTKVEKRIYAVSVNTAEIITNAECITPTINLEQQFKFILTATKEN YAGDFFVQLSTEGNGFFILEDGIAGNRFYCSADSHNMLSYQPHETGLHKIHCIVKDDICV SEVDIEVEVNGINGSLTNPEPGVYIYCNSLYYPSSAWNQEWEEQAEGVAIITEECRFLMA PDPVIGDWGGGEFTSVSDLTVMQDWSEAKFDYNGRKNTEALLNAKDVMRKIEFTEKCYNY DKDNPGKWYQPAAGQMYLIRQNLDEVQRCLSLIGGRKLKANEHYISSTAADGLHLWAISL TQFERIYFFLYEPDNRTYPVRDLQTEEL >gi|313158870|gb|AENZ01000025.1| GENE 43 43521 - 44003 260 160 aa, chain + ## HITS:1 COG:no KEGG:PP_1535 NR:ns ## KEGG: PP_1535 # Name: not_defined # Def: methyltransferase, putative # Organism: P.putida # Pathway: not_defined # 7 156 1 150 154 201 61.0 1e-50 MKTDKLILDACCGPRMMWFDKRNPQAVFMDIRDEEHILCDGRSLEVHPDVIGDFRSIPFE DATFRLVVFDPPHLVRLGDNSYMAHKYGKLLSTWETDLKQGFDECMRVLKPEGVLIFKWC EEQIPASRIIEIFGVEPLFGHKSGKNSKTQWMCFMKINNP >gi|313158870|gb|AENZ01000025.1| GENE 44 44032 - 44403 339 123 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158922|gb|EFR58301.1| ## NR: gi|313158922|gb|EFR58301.1| hypothetical protein HMPREF9720_2110 [Alistipes sp. HGB5] # 1 123 11 133 133 248 100.0 1e-64 MKTDSLRIGNYVLCAGKVIAVTSVFPKGINAHTDRITGKVSYIPADRIDPIPLSTDMLQR LGMRRNAVRWVKYGVDNLYVDRAVDKFYISIGRLGERVCQVRFVHQLQNILSDGWGVELK LRR >gi|313158870|gb|AENZ01000025.1| GENE 45 44439 - 44840 308 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158925|gb|EFR58304.1| ## NR: gi|313158925|gb|EFR58304.1| hypothetical protein HMPREF9720_2111 [Alistipes sp. 
HGB5] # 74 133 1 60 60 124 98.0 2e-27 MTDQVTSVEQSKRLIELGVPAEKASMVWHTMPAVIGRSKLRIAEEEHVGWMCRNFPGQYA PAFTVADQQELLPVFIDAKGTFYLNISKCYEGWYVSYETESGAELISHRGIKLVDVLMDA TEWLLSNGYKLNI >gi|313158870|gb|AENZ01000025.1| GENE 46 44822 - 45043 170 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIFHNKIAKRLDCLLPHPLLCLLLGMLDRILGYIYSHGMIWQVKFFALTIDFYFPATVLQ FLSCFFAFHMFSL >gi|313158870|gb|AENZ01000025.1| GENE 47 45129 - 45470 216 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158909|gb|EFR58288.1| ## NR: gi|313158909|gb|EFR58288.1| hypothetical protein HMPREF9720_2112 [Alistipes sp. HGB5] # 72 113 1 42 42 93 97.0 6e-18 MRTTIKERAKAFCEKNICVDCGDRKDCDRGCMGCCIPTFSVLEWLIQFGKSERAELTRWH DPNITPDDNKPVIICTSPGIYYIAAYDKQFNYWFTGNGSFYRHEILGWREIHE >gi|313158870|gb|AENZ01000025.1| GENE 48 45686 - 45886 142 66 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKKQEKTGTTCEYCKPGMWWIVDQVSECVLQIDGQRLKITYDAECDDDSFASGVHIKYCP MCGRKL >gi|313158870|gb|AENZ01000025.1| GENE 49 45883 - 46044 58 53 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158950|gb|EFR58329.1| ## NR: gi|313158950|gb|EFR58329.1| hypothetical protein HMPREF9720_2113 [Alistipes sp. HGB5] # 1 53 6 58 58 88 100.0 2e-16 MTPQELYDWAVENGYEDYNIKIEWDDEYSYGWSEVSKVGLIVTEPDKTITINI >gi|313158870|gb|AENZ01000025.1| GENE 50 46041 - 46262 182 73 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158943|gb|EFR58322.1| ## NR: gi|313158943|gb|EFR58322.1| hypothetical protein HMPREF9720_2114 [Alistipes sp. HGB5] # 1 73 1 73 73 129 100.0 6e-29 MKRELTLTDMDGHTAVQTCKICGCTKDDPCDHEKLGMCWWVKADLCSHCAMIQTGEIPGD SVTHCVNSNPIIQ >gi|313158870|gb|AENZ01000025.1| GENE 51 46259 - 46975 724 238 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158908|gb|EFR58287.1| ## NR: gi|313158908|gb|EFR58287.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 238 1 238 238 433 100.0 1e-120 MKKYREIILPLLLFAAAGCEKSEYRITEPRPTLEVTALQDSYIINQPAYLQLKVSQQGYD GEFQLSAVLNEGACELSMQGSDLPTDGTWTSMSNTTEILTLTPTLAGPLRISFEVKTKEG EQSGRSFINFNVQKSPALALEVEYPETASITERIELTMLLTKTGWTGAIPVTYTQLTGNG TLQYGAVTITPAEAFSVPANMEQPLYYTPAERGIHRIQLSATDGYTTQFKTLEIIVTN >gi|313158870|gb|AENZ01000025.1| GENE 52 46983 - 48311 736 442 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158912|gb|EFR58291.1| ## NR: gi|313158912|gb|EFR58291.1| putative lipoprotein [Alistipes sp. HGB5] # 1 442 1 442 442 875 100.0 0 MRTTLFYALALFVSAGCSSSKYQIVEPTPQLSLSTVTAVCELGETVNLTLTVAQEGVDGN FSLSAFIREGKATLTLDGSDMDTSGQWVQLAAKHARLVITPAQAGDLLVSFQAKSPDGEV SEQQDLKVTVTAPSEITAEAVCEAKIVNPAADARIPVKLHIQGTPGADGKFVVTPTLSQG KGKIFRNGYAVNGQACPVDADATFEYAPEEIGEQILEFEVTAGKTSAKARAYMDVVKNIV VTSSVEGCFTIEGAGEHNTEGEKVTLALVNEELFNFEPAGWYDSTGQLLSNEATYALQLS RDCITRLEVRLKPRTVNITRQGIARIEFQYLVMEGGRPVPKVAYDYRTQYSTDYKASEPI KFYYEEYRLDRSKIPPVGLRSTAMPTITKGARNSTYLWRCDEKFSVSIRPGDNPGFKFHY NDRYIESQTTKYYLPSDVTMTR >gi|313158870|gb|AENZ01000025.1| GENE 53 48447 - 48521 91 24 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEHFANIVLTSLLLTEYEQYLNIA >gi|313158870|gb|AENZ01000025.1| GENE 54 48675 - 48884 267 69 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTTVHIEDSTPEGRWLLDLIKDHKSVTVEPEKKEAKTVGAWGAALAAGAVPLEEFNARFD AKIQEAYKQ >gi|313158870|gb|AENZ01000025.1| GENE 55 48884 - 49180 236 98 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167754233|ref|ZP_02426360.1| ## NR: gi|167754233|ref|ZP_02426360.1| hypothetical protein ALIPUT_02526 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02526 [Alistipes putredinis DSM 17216] # 1 98 1 98 98 108 50.0 1e-22 MREVIVSPTVLGKLSDLVSYLRDDIKLSEEAAQAYRGQFVQFIMTCSAEINHPLCQFKRW CKLGYRCAVFEKHWVLAYQILDEGIIIQDMCHTAILKE >gi|313158870|gb|AENZ01000025.1| GENE 56 49815 - 51770 -175 651 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0604 NR:ns ## KEGG: Bacsa_0604 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 7 646 5 763 790 114 25.0 2e-23 
MKKIFNYVLAISAIVIGMAFGSCNEEEFPSPSHPDGDVVPVILSVEVADLVSSRAIDENL ISDINIYFFGNKISYHFYYAETAPSFTFQMLPGTYQLYVVTNVHQDLGELTKDELLACDY IIGDMVSDIPMTGNMNIHITNAMTLPTLQVKRVAAKISYNITVDNAVASNIKLRSVQFCN LPKSTVLFGSGTTSDDVNDFYDANAIEIDNDMIFSDVFYMFENCQGNVNSIYNQTDKSPE KAPKCASYMRILADGPGKLLEYIVYLGENSTTNFDVRRNTKHEMNLYIKGENEIDNRVTV YEGLYYGKANCYICTGSQVAFDVTPYRTSKNLTYTYSDIYAGVEYEAVRAALLWQDTKNL VRSVSINDNTVIVNTNGGKGNAVVAIYNKAGEILWSFHIWCTDQPEICEFAKNSVGNSYS VMDRNLGATTSALGVQSSYGLYYQWGRKDPFVGALGATSSENGKMYDINGGEVSFQQERA VAGLTVENAIRNPQVYYTGYNTFWVAKSESTKLLWGDNIDDNVAPVKTVYDPCPAGYKVA PQDLLRITSKNGSLTTTQPSGYYTYGQFNSGWALYYDGRGTDRTKIFYLQAGGSRAGLTG GISMGTITGAMWTSGCNSNFSASSAGCNSVEILNQSIENLSHGFNIRCVKE >gi|313158870|gb|AENZ01000025.1| GENE 57 51812 - 52387 81 191 aa, chain - ## HITS:1 COG:no KEGG:Bache_0302 NR:ns ## KEGG: Bache_0302 # Name: not_defined # Def: KilA-N, DNA-binding domain protein # Organism: B.helcogenes # Pathway: not_defined # 1 185 1 182 191 199 60.0 6e-50 MDIQIIQNKIYEIRGQRVMLDFDLAALYQVTTSALNQAVKRNIKRFPDDFMFQLSKGEFD NLKSQIVTSSWGGTRKPPYAFTEQGLAMLSGLLNSDVAIHVNIMIMRAFVQLRNMIVQSA RYSAELAEFRSRLLLVEREVRENLEAMNDLSEDVRRDFDTVFEAIGALSVKLPEAKKSRQ PIGFVTGNKED >gi|313158870|gb|AENZ01000025.1| GENE 58 52399 - 52611 233 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158905|gb|EFR58284.1| ## NR: gi|313158905|gb|EFR58284.1| hypothetical protein HMPREF9720_2121 [Alistipes sp. HGB5] # 24 70 1 47 47 88 97.0 2e-16 MPKGEYTRTEAGRRAYFVVTGIELPNTLTHDEIKAYSHALPEEQWKRCHELYLQYMSIGR PEYMKNYAEN >gi|313158870|gb|AENZ01000025.1| GENE 59 52621 - 52941 317 106 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158937|gb|EFR58316.1| ## NR: gi|313158937|gb|EFR58316.1| hypothetical protein HMPREF9720_2122 [Alistipes sp. 
HGB5] # 1 106 4 109 109 201 100.0 1e-50 MAYPFPEIQRVEEVFSTLKADPALLEEARHRGFYMGHTKANQMFSRLFGGGRIMIKPEAD TEFVRRVIPYLKALLSSYALKHEEKEAICAMILDEIADDVLPEPKK >gi|313158870|gb|AENZ01000025.1| GENE 60 53021 - 53275 167 84 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVDVSTAYLAIDRARSDERHQLLAGWEPTTKSVPTTGLVDVFFLDGTIREIVGDHVCNWV YPYIETGYAVAWRLRKVDTEQVDE >gi|313158870|gb|AENZ01000025.1| GENE 61 53334 - 53537 242 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158878|gb|EFR58257.1| ## NR: gi|313158878|gb|EFR58257.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 67 1 67 67 115 100.0 1e-24 MTLNDAFSVLIAQPFWYKGSGYTKQYAYRDKKNFQNGKLIPEERMRHYLKTAGWEQTQEE QWEKDGK >gi|313158870|gb|AENZ01000025.1| GENE 62 53662 - 54069 93 135 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MADLRFMNQVTPRSAIGGEVVYNREQLSTTDDADFNAQQVLVGVVYQYPVLFGRLGFFPS ASCLIGGEFADKTTSKGDMLSFSNGVAFGVSVNIPVGVVIGKRFTVYADPRLMYIPTTNF ESCILSIGVGLKYYF >gi|313158870|gb|AENZ01000025.1| GENE 63 54201 - 54569 350 122 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKLEFDGIIGAIDEKTNKLVLRVGSETYAKTEEGNFETKTDWINCYVALSEKGRYRVGD RYFFRGNMIVEPYASKKEEGVAKAQIKVFVSEKPELIYRKKDTAPVPAAPAEPCDDVDDL PF >gi|313158870|gb|AENZ01000025.1| GENE 64 54550 - 54744 105 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNMKMNMEPIWSILKMIGYFLMELLVWIGTVLLHVTTFLLLFFVLRKVKNKPKIKKYYYE ETGI >gi|313158870|gb|AENZ01000025.1| GENE 65 54701 - 57109 823 802 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 203 577 115 470 589 83 23.0 2e-15 MATLSQIEEEDNEIYAYVVLGAICIGFLLLLVQLYYANYDLVQRLHIDIPFLDKHIFSKE GMKLLLGDSYKVRNVVLFLFAFSLTSPSKGEKPTSFWRQLAGLGISVAVFHCCRFFLLIP LTAVGQSLNILGSLISAVCILSYCGRITRFYLAPDRSLNENERLADGFEQTTELIETPTS VNWKYEFSYQGKTHIGYINELAVYRASFVIGMPGTGKSHSFLDPAMKQLIAKHFSAVVYD YKDPTLSNAVYQYYVAYKREHPASTLRFGYLSYVNINHTYRCNPMKGISTSAEAVNFAIT ILTALNKNFVEKQGEFFTESAKSYTAIVIYALGVLFGGRYLSLPHTLTMLSQVPSVLFPV LKLISVLYPDMKTLFSPFKEAYDTNTLPQLQGQLASAQIGLGSMSDASLAYVMTEDEESR DIAVDLDTISSKESPMLLCLGSNPRLGAVLGLANAVYLTRIANLLNRKGRNPTAFFADEV VTTYINGLDNLIATARSNKIAVFLGFQDFSQMVRDYGQKISDAIVNTVNNVFVGAVKGKT AKELAESFGKKTVRKISKSITEDGKVTTSIAEHKEERITQSMIEELSQGEFVGRIADEYG KEIKCKVFHGKVIVETPEKEQRLREELENRTKNECERDGRSYLPDETPWIPKVRNWSDEE IKRRLRLNVIKINNEVSDVLLKLNEIADTYKILTHLTTGTDEFCLRHYLAEPQNPQKRIN LFVWLEEAYRIVWRMGELELKDDILSFEEKFQLLLRDVYTTSYEGLQQILKAREQYHAMD LNSVKQLIDEYEDEYGTNLVNP >gi|313158870|gb|AENZ01000025.1| GENE 66 57288 - 58439 509 383 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158872|gb|EFR58251.1| ## NR: gi|313158872|gb|EFR58251.1| hypothetical protein HMPREF9720_2126 [Alistipes sp. 
HGB5] # 1 383 1 383 383 792 100.0 0 MIDYKEAKKYLLPTLLDYGFKERKDKSCKSCLVFSDGSETLLVYRNQRGIYDHYVNKSAS DDYGDAVHFIMKRILNKTTNLSAEDFCKVEAELSRIAGLGVIKDVINNTSTIQEKQPFDI SKYTVERLDLAAVPGACRRFFEQRHIDPGQLAPFKSVIGILVSDRGFRQPLFYWKNMDGN IIGAQYKYIKDNVCQKRFLKNTDRSNSLWMTPIEGSKALFVTEDPLDAIAHSQLFPGARY AYLCTGGTSTKAQREMIHSFSQKFKLPIILGNDNDLAGQLENFKLAYHTANIEYAINSKT QRVLLTVNGAEREWGKDEIVAALQAVMKQRPNMILSIPWAKDWNDDLRSSKKSISLRIAE NMVRAQIETTKSEQRGMIAGKLP >gi|313158870|gb|AENZ01000025.1| GENE 67 58470 - 58595 75 41 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNMLLNLSYKMLTAMIVQVIVSPCEPSHDWMTKIFQHDPFG >gi|313158870|gb|AENZ01000025.1| GENE 68 58979 - 60760 171 593 aa, chain - ## HITS:1 COG:no KEGG:ZPR_1414 NR:ns ## KEGG: ZPR_1414 # Name: not_defined # Def: KWG Leptospira # Organism: Z.profunda # Pathway: not_defined # 161 350 146 352 763 77 28.0 2e-12 MATINFRIEAPIAKYGTRLPVVENRLVWLIKDFIRDAEQIQGLSVGMSHVEFADGAYFTN VSIKNEWYESARGLKKLARKYDPNYDKRAHQAFPAMYATDLSSLGEEIVLHFQKSCDQLI VDRYAGIIKNEIGSLHTELGKDVLIRNINEDNSLFFVCREVDGKEKVGIVAKYGQTLIPL KYDDLKEIRNGYYLAKEAGKFGVVTLSNKVVVPLKYFRLHIFPLQFSWESLHNSIKKDYL LVAGKETGDGIRYGVVNLKGEVKIPFEFESLEVQYNWEQRFVRYGFMVAQKDGKYGMIDK EGSVIIPFEYDHLGQNGYSKEYFEFEDGYLGAVKDNKFGVVDVNNNVVIPFDRSKDALLD MVYPDVYFLCRPTNKTPEQRIADLAELENGLIKQDWECYKKNYLSGSYMREQLDKLRILY CNIDDRDAGKANELWDKYSRNGLSRIPDLFEDSNGNLWRTQNYMMEIAHLEKPSADNAVG VFLSPRERMQRIEKELRDIENQEIMGVHYGEFPPERVDSEISAKLLENDRRIEELHDSYR VIRANYRKYRARGSEQQDRGGLSKTLQVVEGQIHVKRSKAEHVHNDNANILAK >gi|313158870|gb|AENZ01000025.1| GENE 69 60807 - 61193 157 128 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158901|gb|EFR58280.1| ## NR: gi|313158901|gb|EFR58280.1| hypothetical protein HMPREF9720_2128 [Alistipes sp. 
HGB5] # 1 128 1 128 128 212 100.0 6e-54 MYAHFEGTSQELADLQNDVSAAKTALELHREFNIQYANVAAHRNPTFSMYEYLMEKYEAL NKQRIAEGIEPLKLSALEARYQELKSGSLSTSESLAFIERCLKTSEVGRQSNQNKNNSIP AKGRTSSD >gi|313158870|gb|AENZ01000025.1| GENE 70 61204 - 62067 74 287 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158898|gb|EFR58277.1| ## NR: gi|313158898|gb|EFR58277.1| hypothetical protein HMPREF9720_2129 [Alistipes sp. HGB5] # 1 287 1 287 287 524 100.0 1e-147 MKIVLGEAKEEYDTDVQMAVTRQLILRVNTEIKPEEAYVVSEITYSQRTGYSYTLINTQT KERRTTNLIRPLSKRFGVGFYKDDVRMMLMPEPEFKRLCEGAQNIVKKEEVKKDKTRYQL SNDYQGSRYDSKLTTKEIAIKIRAYCKTQYPDYKFSVRVDRSEINVKILSGPQEIRAGKG LAKGNWQTIGITDFYKDWLSEGAYKMLYDITQYAKSYNRSDTDIQDDYFDEHFYFRLDIG AWDRPYTVTPEIKPQTGITKSGTLQKIESRIVSQPEANHRSGPEKTI >gi|313158870|gb|AENZ01000025.1| GENE 71 62078 - 62509 207 143 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|255009787|ref|ZP_05281913.1| ## NR: gi|255009787|ref|ZP_05281913.1| hypothetical protein Bfra3_11671 [Bacteroides fragilis 3_1_12] predicted protein [Bacteroides fragilis 3_1_12] predicted protein [Bacteroides fragilis 3_1_12] # 1 93 1 93 201 80 47.0 3e-14 MDVYISGKISGKTQQEAEADFGKYAEIIQAAGHTPVNPMKNGLPFDAPWEDHMERDLEML RASDAICLLPDWVESCGAKIELHQALEIRMPVLNDSNLRLYLEQANHQGVQQEDSRSKIL RNIERSYKMQQGTIKDPSQKHNL >gi|313158870|gb|AENZ01000025.1| GENE 72 62531 - 63337 280 268 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158887|gb|EFR58266.1| ## NR: gi|313158887|gb|EFR58266.1| hypothetical protein HMPREF9720_2130 [Alistipes sp. HGB5] # 1 268 1 268 268 460 100.0 1e-128 MEASQTFKDKRSRLTKEQRERVTPELVEEIMEKHVSPEHWPSFRRRHKAEFRKIAEGYVS QCMTFSAKDTKRPYKARIKGLLVSEVNGEERISFDFLFAQRKLVIPNRMQIKNGQDAINI SDEQIEKLKAGAFLDELLVSEKGAKFFAEVDTELNRLAIAEATYVRAPKTLHGSILTPEQ SNAIMSGKSMEYERENPNDKSQVFVLKARYSPIYYTIRSESVLDKDGKPLVRSVSVSVDP SQTLQSVEEAVKMSAQKTKKRVSQQRQM >gi|313158870|gb|AENZ01000025.1| GENE 73 63498 - 64085 -25 195 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158871|gb|EFR58250.1| ## NR: gi|313158871|gb|EFR58250.1| relaxase/mobilization nuclease domain protein [Alistipes sp. 
HGB5] # 1 195 192 386 386 377 100.0 1e-103 MYRFCQACEKRGLKVKKLPRGYVISQGNSYPVALSKLPVFAERRLSHILRAVKQERIREM RYIRKCLYMVLRNMPQIDMQSDMAAELFGDKLLNQLAQYNINVIYNRNSAGIVGISFGRP DLTFKGSELKLSWETILHYVRSKRGVSAKKKPILQTTVRPQAAHLQTLVQGAAVGKKIAE DEEEKKRRGQAEQEI >gi|313158870|gb|AENZ01000025.1| GENE 74 64311 - 64553 180 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFIRVTIHYPGLICISHFLKILSGHCLNHLRRRIYFIVGKTDNGMKNRIFDPGRASCRD IHKVLGKFNRSIANNILRKA >gi|313158870|gb|AENZ01000025.1| GENE 75 64655 - 65017 276 120 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158892|gb|EFR58271.1| ## NR: gi|313158892|gb|EFR58271.1| hypothetical protein HMPREF9720_2132 [Alistipes sp. HGB5] # 45 120 1 76 76 129 100.0 1e-28 MDDKRNHRIYCRLTSAEYGRFKTEVRKTYLSESEYIRRKLLHPQMRIRSREETDLLNFVI GSLSQVNNNINQIARVMNALKGSGAVVIGSVAQRQLNQVEAILKEWNTFEGRLRQSLKSK >gi|313158870|gb|AENZ01000025.1| GENE 76 65034 - 65255 147 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNPIIRVLMCEIVGKAVLFGYRSDGQLLLLWGAKKVARCEFVGQTLTTTCSAACCILRQC SGSGAFDENLPFG >gi|313158870|gb|AENZ01000025.1| GENE 77 65427 - 66200 530 257 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158893|gb|EFR58272.1| ## NR: gi|313158893|gb|EFR58272.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 257 1 257 257 513 100.0 1e-144 MTQKGGAGKSVLTNILPNEIILWHLLHSGKQVNVLVVDIDPQQTNTAKRKRDIEQLSYKE TSPEFLAMDLGQKTEIGQFQRRYALLIDAGFKTYKMQYVDLGNDTQISRALENIQYQEYD YVFIDFPGTLTQDGTGAFLQLVQHIFIPTSINPSDVLGTECFLKTLQGLPIDLQSKYILW NKFEVSRLRKTNSTEKRLFNDYGIPFLQARIPYSPLNDCNTVIPASLKVNLNLGTYSVVK PYLQEMAAEIIKITNGK >gi|313158870|gb|AENZ01000025.1| GENE 78 66190 - 66660 122 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158929|gb|EFR58308.1| ## NR: gi|313158929|gb|EFR58308.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 64 156 1 93 93 179 98.0 8e-44 MASRRKYQSIQDLAGSAELIVPEAPTEDQSAMGKEIASEPKPASIPEPMPKPERENQHDY ESALDLQKIATQILIRQNYKNSRTAYPICEEYLHRIRAIAILGEIPITNVINNIFALFFN PDGPLNPTHTAIKSLLTDSNNMLKDYLISNRTTHRK >gi|313158870|gb|AENZ01000025.1| GENE 79 66662 - 67003 193 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158900|gb|EFR58279.1| ## NR: gi|313158900|gb|EFR58279.1| hypothetical protein HMPREF9720_2135 [Alistipes sp. HGB5] # 1 113 1 113 113 183 100.0 4e-45 MDTYIKYSFLIEIALIIVVLTLLYIAVYFYRKKPVKSIPEMAVKEIPPVEDEIISKLDRI LAMIPVDYDSTLHSSDDDIKHDKRIIQQITGTEEEITTILPQRDFNTRDLFEE >gi|313158870|gb|AENZ01000025.1| GENE 80 67007 - 69385 1600 792 aa, chain + ## HITS:1 COG:BMEII0028 KEGG:ns NR:ns ## COG: BMEII0028 COG3451 # Protein_GI_number: 17988372 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Brucella melitensis # 637 774 666 803 831 63 26.0 2e-09 MARTNLENIIPIYRIGDGYAIAKDGSVTVGFILTLPEYDTLSKADFRDDGSGDGMRLYTQ LEAAIKDLDEGYTFHQQDIIYYAPQDLPHYDNYLSKTVNRMYNGKRWLSNKSYMFITKQK SISIQSDYSDEAIDKIVSVIKRFRAALSQFSPRRMDDKDWLEYLRDFFSMQGRCALDLSF QDQKFGNYKMAGIAVLADPHIKSLRDKVRNNATSSPYAQRFNALVSPLCWSVPCYKIINN IISREDPVAIRKHLKTFTRNIAFLGKAAASHIASAETMAEIIETGESLPVYHHFNAFLIY PENEETAIEDKLDIALEKMQIAPTRLSLDFEHIFMSSIGGCCSGLEFPLDMYPTFLDEVT VFSNLEADYAQARDGIVLNDTHGKPVVVDIFNETMRNSQISNRNFTAIGPSGSGKSVSAN KLISGLVQTDEYFNFILDDGESYEMLHYLMGEKSGYMRMTPTGEELSFNPFLIPFVDPKN DSEKLLVPELEMLTSLILLLWDQNNGLGLKDDPGKTAYIKKLLYDFYSLRYEQRTEYVNF DSFYKYIIQQHERGTLNDKYFDFDSFELVLHQYSKEGEWSLLLNSQNNSLNLAPQLRMCI VELKAVSNVKSIYKIVLYLVMLMAKKVLAEAPTKYKIFWLDEAWKLLDDPYFGDFIKYLY KTIRKEDSGIGLIVQDVPDIIHSEHHEAIINNSDTMFFLSHEGKESNLEKYRSELSLTSR DLEIILSMKKSDHAICIRQGTHTTEYIVQLSQEELALYTTSKEHKATIKKYIDQYNGNVQ MAINAFCHAQNK >gi|313158870|gb|AENZ01000025.1| GENE 81 69417 - 70061 275 214 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158921|gb|EFR58300.1| ## NR: gi|313158921|gb|EFR58300.1| hypothetical protein HMPREF9720_2137 [Alistipes sp. 
HGB5] # 1 214 3 216 216 405 100.0 1e-112 MKKILIIILAGVLLHPLPTKAQFGQDAAALIGFLTPYLGALEALSEASGMEISDLKKITQ QSIQLTSSLAKVYNAGNRLYRVSSRTHEIYIEYLETIKYIYDNSIYFEDEEASYHVAVLN KVVFGQADQDGKLSVNNIMDGAMGDLKDLLNSMQSGAQTTLTEMARYLDDVRKQMDDTYG IICSYHTYLICSVNRKQHELGMYELDEYLTNRKK >gi|313158870|gb|AENZ01000025.1| GENE 82 70058 - 71110 325 350 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158886|gb|EFR58265.1| ## NR: gi|313158886|gb|EFR58265.1| hypothetical protein HMPREF9720_2138 [Alistipes sp. HGB5] # 146 350 1 205 205 383 99.0 1e-104 MNPILTSGLANIINSFTTQFETYFNSNFLTLLSSEVGSYYGVLGAGLMFFLIVNGALFAL REKTGKDLQAEIMRCVHLFLILIFGYCIIPAVDFFGETAIEYLYPNRQDAMQVAIDKSDG LGRIEASEYALSENDIRNVKRAQARVSKRNYNGAKIDNASDFNEDNKKESLWSLIKGGLV KLLANICLIVGAFCKLIITFFVLIIKALLRITFPVAVALSFIYGAEKTIGAWWQSYITVT MMSFIIAAIEILQNLLLFDVASNSLSVAQLGIALMAFAFGAAYLCAPTLTVLCVGGSEAA AQLPQQITTSFMTVGAMTWGILKSRSVTSGQFFKETYLKQKKGVSGRYFE >gi|313158870|gb|AENZ01000025.1| GENE 83 71117 - 71731 278 204 aa, chain + ## HITS:1 COG:no KEGG:ZPR_1772 NR:ns ## KEGG: ZPR_1772 # Name: not_defined # Def: conserved protein found in conjugate transposon TraK # Organism: Z.profunda # Pathway: not_defined # 21 193 18 190 204 65 30.0 1e-09 MKIFEHLTQLDNAVGFYRKGIIAILIVCAILMTAMYIHTETRILQATSSAYILNQDGDVA IATRIRAAEFCKIEAEAHVRLFIMLFFDIDKYTYEQRISSTYALGNSSIYALKQRFDKDN WFTNMRQYNISQSVQITDLKCVAENPLAVDAIFTVTITSEVHSKPQIYEVSITFYLEETK ERTRENPHALAITKIDFKNFNLKQ >gi|313158870|gb|AENZ01000025.1| GENE 84 71736 - 72059 226 107 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRKIIVLLPILLFIPIFLWAAGGQDIAATTKLKGVIKNGVQIVAGLIALIAVIKIGLST ATKSIMSGQELSQKETKDVITKLGYLIIGLSVMLFASSIADFITQKL >gi|313158870|gb|AENZ01000025.1| GENE 85 72360 - 73457 431 365 aa, chain + ## HITS:1 COG:no KEGG:Fjoh_4351 NR:ns ## KEGG: Fjoh_4351 # Name: not_defined # Def: hypothetical protein # Organism: F.johnsoniae # Pathway: not_defined # 143 331 196 379 429 65 28.0 3e-09 MDSSRTDKFKKWYDRNKQKIILLLGMAVVLILIFGLIIRGVLSIQSNTSTDERVASEDIA FDIVAETPRTQGGTDEKLYQERLANSQSKTVPEDVKISFDAIFVPETPKGETHSPVAHSS 
QVPGTTTSYTPAAERNAMEKTAIQQELDAIYNSQSQNQPNTTATDESQTEPSRPLTKDEI LAQQKKLMEEGFNLSDLQDPMNTNSDAAVSIASPSEIPAFIHGPQTVMVGNRLALRIDTS VILSNGQVIPRNSIIYAQIALNNNRVQVTVESIKFGTAIYSLELSGYSEDGSPGLPVQND QNQKIVGDATSSAAASAIDRKISTVTGGLVGSVVRDISNAGRSRKETYVSFFDNQKVFLR PGLPK >gi|313158870|gb|AENZ01000025.1| GENE 86 73485 - 74342 491 285 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_1554 NR:ns ## KEGG: Pedsa_1554 # Name: not_defined # Def: conjugate transposon protein TraN # Organism: P.saltans # Pathway: not_defined # 21 285 39 292 299 100 28.0 9e-20 MKNYIFAIVFVCLTTSGYGAKPEQIPLQQNKVCQIIFPDKIAKIRGGFNPNYFALDKYDN ILYIQCLTDFPSTNLSIITEDTSCYMLDLCYTEKSEKNAYIIDIADRIYVNEQKPLEPGQ ESSPSLRSQPAEDKTPTQAAVLPETGEYESILKQRDFIVAQNGVADKAMEVFLKGIYTRD NYVYFKLDITNNSELPYTYNYCGFAVITKKKGKMTSFDRTDLTPAGSYVPAHTIEYKKTM TVVYKFDKFNFSNDRVLLIEMVEDNGERGLFFRVTSSTFLKARKI >gi|313158870|gb|AENZ01000025.1| GENE 87 74482 - 75375 865 297 aa, chain - ## HITS:1 COG:VNG6349C KEGG:ns NR:ns ## COG: VNG6349C COG3177 # Protein_GI_number: 16120251 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Halobacterium sp. 
NRC-1 # 114 239 143 270 424 69 34.0 8e-12 MNYFSVSEIARRWVVSERTVRNWCAVGKIKGAFLTGKTWNIPEDAVRPRRGKEKPHSDNP LLNILKEQKDMQLRGGIYHRVQIELTYNSNHIEGSRLTQEQTRHIFETNTLGVTDQAVNV DDVIETSNHFRCIDLVIDKAASGRLTETFVKELHRILKAGTSDSRRTWFRVGEYKKLPNE VGGQETTPPEQVAAQMKALLSHYNALTNKTLEDIVAFHHAFECIHPFQDGNGRVGRLIMF KECLANRVVPFIIDEDLKLFYYRGLHEWDRAHEYLLDTCRTAQDNFRAVMRYFKIEM >gi|313158870|gb|AENZ01000025.1| GENE 88 76488 - 78236 1587 582 aa, chain - ## HITS:1 COG:mlr8517 KEGG:ns NR:ns ## COG: mlr8517 COG0270 # Protein_GI_number: 13477024 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Mesorhizobium loti # 10 306 27 330 667 133 30.0 8e-31 MNSNDKIRLLYIDLFCGAGGTSTGVHLARHGGDPCAKVIACVNHDANAIASHAANHPDAL HYTEDIRTLELGPLAAHAARMRRQYPDAFVVLWASLECTNFSRAKGGLPRDADSRTLAEH LFRYIEALNPDYIQIENVEEFMSWGDLDERGKPVSRDAGRLYRKWIDNVRGYGYDFGPRI LNAADFGAYTSRRRFFGIFARRGLPIVFPEPTHTKNPAQSDLFGQQLRKWRPVREVLDLE DEGESIFDRKRPLVEASLARIHAGLVKFLAGGREAFLVKYNSMNQSGKYVAPGIDEPCPT VATQNRLGVAKVYYLCKHFGGSPEGKCTAVDAPAGAITCRDHHAFVSAYYGNGFNSSIER PSPTVTTKDRFQLVRPFFANYYSGGGQTSGTDGPAPAVMTNPKQRLVTPWIMNTNFSNVG SSLDAPAQTVTANRKWQYLMNPQFASAGAATDRPCFTLIARMDKRPPRLVTAEADGEALP SFIRLEDETLVYEIYPEDTPMLVQIKEFMALYGLVDIRMRMLRIPELMRIMGFPENYVLI GTQEEQKKFIGNAVEVNMARVLCEALVAKLYELNIVPQRFAA >gi|313158870|gb|AENZ01000025.1| GENE 89 78330 - 78557 170 75 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158874|gb|EFR58253.1| ## NR: gi|313158874|gb|EFR58253.1| hypothetical protein HMPREF9720_2146 [Alistipes sp. HGB5] # 1 75 1 75 75 138 100.0 1e-31 MRTIVITHDGFWYTIEDWNFARWKLYESTQGRYYCDMHGIKVTFESVEHFLELMYGHSRI GEFVNYEIEIKESGR >gi|313158870|gb|AENZ01000025.1| GENE 90 78595 - 78816 249 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158889|gb|EFR58268.1| ## NR: gi|313158889|gb|EFR58268.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 73 1 73 73 132 100.0 1e-29 MDIVKTITTKIRRHNFTFREGRRIYYGLQMDTTRLVWARNLCEGWQIDIDGRRITVDCVL REITIGNGEIINY >gi|313158870|gb|AENZ01000025.1| GENE 91 78803 - 79330 497 175 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158927|gb|EFR58306.1| ## NR: gi|313158927|gb|EFR58306.1| hypothetical protein HMPREF9720_2148 [Alistipes sp. HGB5] # 1 175 1 175 175 350 100.0 3e-95 MKVGTKSLLFGAHNIFIHPFLVFIAWWRLYGFPADPRLWVAFIVHDWGYLGKPNMDGPEG ETHVELGARIMRIFGRRWEEFTRHHFRFHSRRDGARPSRLCYADKLALCCEWEWFYLFRT RITGELGEYMALSANGKYSCKEYMNRSIETPQSWYKAVKSFTLRYVAQNYRYGHR >gi|313158870|gb|AENZ01000025.1| GENE 92 79347 - 79901 621 184 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158913|gb|EFR58292.1| ## NR: gi|313158913|gb|EFR58292.1| hypothetical protein HMPREF9720_2149 [Alistipes sp. HGB5] # 1 184 1 184 184 342 100.0 1e-92 MAQVALLTKGIVYDTSRQVVTLHQVVERFMLGDSLCEKCIVTEIMFDEHAGYTYTLIGLK SLRNFRTHFIFDEHESASGFFADLAYPTFLAAEQVEEVIARAAAAEKQRREEAAIAQRRL HRGALVVDYSAKALAIFTDEPSDVLVLERIKAKRNSSLTYQGRKVAGWIFPKYRQAQLAA VMSL >gi|313158870|gb|AENZ01000025.1| GENE 93 79972 - 80517 93 181 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|270265411|ref|ZP_06193671.1| ## NR: gi|270265411|ref|ZP_06193671.1| hypothetical protein SOD_n00310 [Serratia odorifera 4Rx13] hypothetical protein SOD_n00310 [Serratia odorifera 4Rx13] # 24 117 140 228 284 72 38.0 1e-11 MKNDSNNAAKQMIARYPDLKPYPKSAAENLRRELRAVFPQITFSVRYKSFSGGDEITVSY EDGPKVEEVEAIANKYAYDSSQCDAMTDYYDYRPTEFTRIFGGAKFVLIRRDMSDRVRAD LCCKAVEIAPDLADGRNVRREELFSPGEMCASVELFEATRGLCWVSADSIARNLFNKMSF A >gi|313158870|gb|AENZ01000025.1| GENE 94 80555 - 80887 274 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|315223254|ref|ZP_07865115.1| ## NR: gi|315223254|ref|ZP_07865115.1| conserved hypothetical protein [Streptococcus anginosus F0211] conserved hypothetical protein [Streptococcus anginosus F0211] # 3 104 9 110 124 84 46.0 3e-15 MEISLDDYLGQRGLRSPISGYMDDKWRNMRLTARGQKRFEKEAEAAIIEYSKLRKAAIDE YNNLVKSGEIIPPHETKLEALLSVARGHPDNEGTQAARRLLKKRYGIISW >gi|313158870|gb|AENZ01000025.1| GENE 95 81673 - 
82662 922 329 aa, chain - ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 15 311 224 504 522 129 32.0 1e-29 MGHSQVLEKVSAHFAELLVLKIESMTQTWQKPWFYGQGLLPCNITGKRYAGTNLLTLLML TEVKGYRTPVFLTFRQAKERGLTVRKGSKQFPVVFWKPFYQANNPTDGQRRYLTVEEYNA LPESARAEYVLRFSLRYYPVLNLDQTDFEERFPEQWQKLLERQSMTPRPSGSVHPVLERL LAAQSWVCPILQKSGDAAFYSPCDDRIVLPRQEQFHDLEAFYSTMLHEMAHSTGSAGRLD RHLGEPSTAAYGREELVAEFSAALVGLGLGVSSGIRAENVAYLKAWLRELREKPVFIFSV LSDVQQVVRFFEEELCMRLFDTAPAENGA >gi|313158870|gb|AENZ01000025.1| GENE 96 83163 - 83486 247 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158899|gb|EFR58278.1| ## NR: gi|313158899|gb|EFR58278.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 97 1 97 107 186 100.0 6e-46 EGLRFRVQGFKHTGYVAVLYNPGSDYFEVELQDDMGHAKTRVEDVCFTELVDKIDELVER TDNYDADIEVWKNTPESDPDKQAAKDLGKLVCELQELGIGGDIIIID Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:57:36 2011 Seq name: gi|313158836|gb|AENZ01000026.1| Alistipes sp. HGB5 contig00084, whole genome shotgun sequence Length of sequence - 43092 bp Number of predicted genes - 34, with homology - 33 Number of transcription units - 14, operones - 6 average op.length - 4.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 397 - 849 -189 ## 2 2 Op 1 . - CDS 1001 - 2479 1339 ## Spro_1210 ribonucleoside hydrolase 1 3 2 Op 2 . - CDS 2494 - 4380 2290 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Prom 4368 - 4427 4.3 4 3 Tu 1 . + CDS 4568 - 6532 2316 ## COG0477 Permeases of the major facilitator superfamily 5 4 Op 1 34/0.000 - CDS 6536 - 7297 435 ## PROTEIN SUPPORTED gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P 6 4 Op 2 8/0.000 - CDS 7299 - 8072 790 ## COG0619 ABC-type cobalt transport system, permease component CbiQ and related transporters 7 4 Op 3 . 
- CDS 8111 - 9103 953 ## COG0310 ABC-type Co2+ transport system, permease component 8 4 Op 4 1/0.000 - CDS 9166 - 10209 1434 ## COG4413 Urea transporter 9 4 Op 5 9/0.000 - CDS 10241 - 11155 1084 ## COG0829 Urease accessory protein UreH 10 4 Op 6 17/0.000 - CDS 11152 - 11823 696 ## COG0378 Ni2+-binding GTPase involved in regulation of expression and maturation of urease and hydrogenase 11 4 Op 7 16/0.000 - CDS 11864 - 12556 949 ## COG0830 Urease accessory protein UreF 12 4 Op 8 10/0.000 - CDS 12567 - 13142 789 ## COG2371 Urease accessory protein UreE - Term 13202 - 13233 4.1 13 4 Op 9 17/0.000 - CDS 13283 - 15001 2395 ## COG0804 Urea amidohydrolase (urease) alpha subunit 14 4 Op 10 13/0.000 - CDS 15007 - 15534 694 ## COG0832 Urea amidohydrolase (urease) beta subunit 15 4 Op 11 . - CDS 15552 - 15854 414 ## COG0831 Urea amidohydrolase (urease) gamma subunit - Prom 16035 - 16094 5.0 + Prom 15821 - 15880 4.9 16 5 Tu 1 . + CDS 16073 - 16450 620 ## COG0239 Integral membrane protein possibly involved in chromosome condensation + Term 16495 - 16529 8.3 - Term 16479 - 16521 12.6 17 6 Tu 1 . - CDS 16592 - 17893 2068 ## COG0148 Enolase - Prom 17955 - 18014 5.6 + Prom 17941 - 18000 7.7 18 7 Op 1 . + CDS 18089 - 19273 1825 ## BF2555 putative Na+/H+ exchange protein 19 7 Op 2 . + CDS 19317 - 19817 810 ## BT_0176 hypothetical protein + Term 19920 - 19972 14.0 20 8 Tu 1 . + CDS 19989 - 20990 1251 ## COG3712 Fe2+-dicitrate sensor, membrane component 21 9 Op 1 . + CDS 21111 - 24626 5258 ## Bacsa_1516 TonB-dependent receptor plug 22 9 Op 2 . + CDS 24643 - 26415 2884 ## Bacsa_1515 RagB/SusD domain-containing protein 23 9 Op 3 . + CDS 26449 - 27348 1254 ## COG3568 Metal-dependent hydrolase 24 9 Op 4 . + CDS 27374 - 28435 1594 ## BT_2194 exo-alpha-sialidase + Term 28454 - 28495 8.1 + Prom 28478 - 28537 2.3 25 10 Tu 1 . + CDS 28635 - 29180 628 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 26 11 Tu 1 . 
- CDS 29303 - 30514 1543 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) - Prom 30570 - 30629 2.5 - Term 30597 - 30633 11.0 27 12 Op 1 . - CDS 30664 - 31623 689 ## gi|313158865|gb|EFR58245.1| F5/8 type C domain protein - Term 31646 - 31699 11.0 28 12 Op 2 . - CDS 31708 - 32877 1582 ## BT_1049 putative patatin-like protein 29 12 Op 3 . - CDS 32904 - 34025 1270 ## BT_1048 putative secreted endoglycosidase 30 12 Op 4 . - CDS 34054 - 35580 2357 ## BT_1047 hypothetical protein 31 12 Op 5 . - CDS 35594 - 38890 5010 ## BF1326 hypothetical protein 32 13 Tu 1 . - CDS 39067 - 40008 1422 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 40058 - 40117 3.7 - Term 40060 - 40098 7.7 33 14 Op 1 . - CDS 40144 - 40755 767 ## BVU_0502 putative RNA polymerase ECF-type sigma factor 34 14 Op 2 . - CDS 40770 - 43073 3221 ## COG3525 N-acetyl-beta-hexosaminidase Predicted protein(s) >gi|313158836|gb|AENZ01000026.1| GENE 1 397 - 849 -189 150 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIYEVFSFYRTPPTLLTFLPSHRNPAANDCRCLFPPPRHPLSAPPAAAAAAPDGRQKNGE GPLSGPSLRSKLRHCSRISEDSFHCMQVDTGSRSGETAGWFHPGYSDCFMTARLLKSLLK KRCAEPDSGNSRIIPIIQICNWSGTAHIAS >gi|313158836|gb|AENZ01000026.1| GENE 2 1001 - 2479 1339 492 aa, chain - ## HITS:1 COG:no KEGG:Spro_1210 NR:ns ## KEGG: Spro_1210 # Name: rihA # Def: ribonucleoside hydrolase 1 # Organism: S.proteamaculans # Pathway: not_defined # 36 238 6 217 310 76 29.0 2e-12 MTIRQCFFCRGLLAVILSAFLAPCHAHSGKARFHAVVDTDGAADDLRTLCMLLGNREVEV LAVTTSEGALLPDSAAVRVRALLDSFYHEGVPVGAGRAVNAPAPAWRAHSGTVDWGDAAT ATSGAGFPAASALIAETLGEEEEKVVFIALGALTNLYDVLRENPASGDRIDRIVWYNSRA EPLSGANFETDSVAARYVLASGVPVTVVSANPACPVVVTPALIDSVAAVPNVYARKIAAT HRTPPLAKLVGERHLEAWDDLVAVYLFAPELFSIRELNGTVRACELSDKSAAAEAQTRLT AILRGKPDSESRVFYGFPVGGALYAADVVPIIDSAVARHGLSEWRAGVLTNELHGHLGIY ATIGVKMGIRAREYFNIGVDDILVTTYAGHKPPVSCMNDGLQVGTGASVGHGLITVAEID TPRPEARFTFKDKTIRLVLKPRYADRIRRDVKRGIELYGDLTEPYWQYVRALALRYWLDF DRHEIFDMYVGD >gi|313158836|gb|AENZ01000026.1| GENE 3 2494 - 4380 2290 628 
aa, chain - ## HITS:1 COG:VC0475 KEGG:ns NR:ns ## COG: VC0475 COG4771 # Protein_GI_number: 15640502 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Vibrio cholerae # 39 619 35 602 652 103 24.0 1e-21 MKSKLFVLLLLGGIVPVCALRSETADSLRNGKTYPIGEVVVTGTRHETDVRHLPMTISVV GRQKIAEGFQPSLLPTLTEQIPGLFVTGRGIMGYGVSGGAAGQMTLRGVGGGPTTGLLVL IDGHPQYMGLMGHPIADACQSMLADRVEVVRGPASVLYGSNAMGGVINIVTRRMHEDGVK TDLHAGYGSYNTLETEVTNQVRAGRFTSTVTGSYNGTDGHRRNMGFEQYGGYAKLGYRIA QAWDLYADVNLTHFNARNPGSVSEPILDNRQRITRGMTSFTLRNDYGKTSGTLSFFYNWG RHRINDGHSPGEEPLNYRFNSRDRMLGVSWYQSVALFRGNRLTVGVDWQHFGGKAWNAFT DGAPDKTTADKTMDEAAGYVDFRQSLTHWLTLDAGLRLDHHSHAGTEWVPQGGLSFLLPH NAQIKAMVSKGFRFPTIREMYMFPPQNPDLKPEKLMNYELSFTQRVSGGALSYGVSVYYI DGDNMIQTVPVDGRPKNINTGRIENWGVEGDAACRINPVWAVSANYSYLHMAYPVVAAPE HKLYAGVDFSRRRWKASTGVQYVHGLYTSVKPVAKDNFVLWNASVTFRATRWLDLFVRGE NLLAQRYEINDGFPMPKATALGGMNVNF >gi|313158836|gb|AENZ01000026.1| GENE 4 4568 - 6532 2316 654 aa, chain + ## HITS:1 COG:Ta1379 KEGG:ns NR:ns ## COG: Ta1379 COG0477 # Protein_GI_number: 16082357 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Thermoplasma acidophilum # 208 639 23 463 476 113 22.0 1e-24 MENTRQPIPLRIRQHLRDAVGRMHPSVSFVSCIGVLLAYLTGVYITGSLHEASRWMGAML ACTSLVVVLQSHNYKDSLRAGWTRVLGTFLGALVAYIYLKIWPFSIVGMLASVFVLEMLC MLVGIYQNNRIATITLLIILLVSQMTPHVSPAVNCLLRFTESVVGVGVGVALLWVLERWK RWLEPQEHANIATDKNMNMDTMPLRWGHFRVLIVASLGQITGAGLATLVGVILPMIQIFR HPGLTSLQQGLVAATSLVGIMVGSVLFGAWSDKKGYLFFFRFCPALILAASLFTFFTADL FGLVTGLFFMGLGIGGGYSLDSDYISEIMPRRWKLLMVGVAKAASSLGSIAAAAICFSLL REWHDPHLWNRLLLLVAVMAVVMLLCRIRFEQSPGWLIAHGRISEAEKAVRYFLGPDVEI GEIRNHPNKTQMPKVAWSDLFKPGQMKKVIFSGIPWACEGLGVYGIGVFLPVLVMALGLE TGSESAFSRITDSVLLTTWINLCILPGFVLGLLLVNRWYHVRTQTWGFILCAAGMGILLA 
AYEFHWPVWIAVSGFMLFELFLNAGPHLMTFIIPPQIYSVAERGAGAGLAAAFGKLGAVA GVVVIPILLKWGGASLVLWVTIGVLLAGALVTAVVGREVLPDKGRSVRPEIRRD >gi|313158836|gb|AENZ01000026.1| GENE 5 6536 - 7297 435 253 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229849245|ref|ZP_04469311.1| LSU ribosomal protein L17P [Thermanaerovibrio acidaminovorans DSM 6589] # 8 247 132 375 398 172 43 4e-42 MSHHYLRFDDVHYRYPNGYEALCGVSFCITHGEKVALVGANGAGKSTLLLHTNGLLIPSQ GEVVMGGIKLTRRTLPLVRQSVGLVFQDSDNQLFMPTVEEDVAFGPSNMRLEPEEIRRRV TEALDAVGALHLRGASPFRLSGGQKKRVAIATVLSMEPSVLVMDEPTSNLDPRARRQIID LIRRFGHTTLIATHDMEMVLDLCDRTIVMKQGRIVADGSTRHVFGDLALLEECGLEQPCE LRMKRALKREYAL >gi|313158836|gb|AENZ01000026.1| GENE 6 7299 - 8072 790 257 aa, chain - ## HITS:1 COG:AF1842 KEGG:ns NR:ns ## COG: AF1842 COG0619 # Protein_GI_number: 11499427 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type cobalt transport system, permease component CbiQ and related transporters # Organism: Archaeoglobus fulgidus # 8 208 27 212 243 80 31.0 3e-15 MERTARMQSPLHRTDARSKLLVTVVFLVTMLSVPLCRLPELLLFFVFPIVACAMGGLSYG TIFRRSLVVLPFVVFIGVFNLFYDREPVFRIGTLAVTAGWVSFLSIVLRGLLSVQALLVL IGSTGYYGLCRSMQRLGVPAVFTTQLLFVYRYLYVLIEEAAAMQQARDARSFGRRSYPLK IWGTLVGQLLIRTFDRAEQISRAMLARGFSGRIPEGVFERPAWKMRDTLFLAVWCSAFVL LRLCRPAENLSMLINNL >gi|313158836|gb|AENZ01000026.1| GENE 7 8111 - 9103 953 330 aa, chain - ## HITS:1 COG:alr3947 KEGG:ns NR:ns ## COG: alr3947 COG0310 # Protein_GI_number: 17231439 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Co2+ transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 1 263 1 243 306 119 38.0 1e-26 MHMSDALVSPPVFAVTGAVSLVLLGTAIWKVKHPRNDRREPDARDEHIVPLMGVMGAFVF AAQMINFSIPGTGSSGHLVGGILLSAILGPWAALLTLASVLVIQCLVFADGGFMALGANI LNMAVLSCLAAYPLLFRPLIKRGASPGRIIAASLLASVVGLELGALAVTIETEASGITAL PMGRFLLFMLPIHLFIGIGEGLATAAVICVVQRYKPELLYGIRRERASGRRRLGKALAAI ALLALLIAGSFSWIASSDPDGLEWSIKKTAGRSELEPASDGLHRRAAAIQEKTAVIPDYN TTFAGIVGSGAILLAVFGASCLFRAGQKQG >gi|313158836|gb|AENZ01000026.1| GENE 8 9166 - 10209 1434 347 aa, chain - ## HITS:1 COG:BMEI0642 KEGG:ns NR:ns ## COG: BMEI0642 COG4413 # Protein_GI_number: 17986925 # Func_class: E Amino acid transport and metabolism # Function: Urea transporter # Organism: Brucella melitensis # 12 314 14 318 349 223 44.0 3e-58 MANTIARSGGAGGGSFDFIKILLRGTGQVMFQNSAWTGLLFMIGIFWGAYAEGQGLVGWG ALLGVTVSTVTGYLLGFPAKDGEQGLWGFNGVLVGCAFPTFMGNTVWMWLALALCAALTT WVRAGFNNVMAPWKVNSFTFPFVFCTWMFLLAARAMHGLPTTHMADPALPAAFSSLESIR FGDLAVYWLKGIGQVFLINSWVTGICFLAGLFLCSRWAALWAAIGSALALLTVIALKASG SDISDGLYGYSPVLTAIALATVFYKPNFRSALWAVLGILVTVFIQAGMYMLMAPVGIATL TGPFCITTWLFLLPLVRFDDEEKPDHSNWYPENKKHLAAQQPGAKTE >gi|313158836|gb|AENZ01000026.1| GENE 9 10241 - 11155 1084 304 aa, chain - ## HITS:1 COG:BMEI0643 KEGG:ns NR:ns ## COG: BMEI0643 COG0829 # Protein_GI_number: 17986926 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreH # Organism: Brucella melitensis # 6 303 5 301 302 291 47.0 1e-78 MNFAAAKEMSPYLEEPKAMPVGTPGKMGYLFLGFELDDDGRSIMRDLERHAPLIVQQELY FDEGMPEMPCVYILSSGGPNVDGDRYEQRFVVREGAYAHISTGAATKLAEMRYNYSGLTQ AFELEEDSYLEYLPEPTIPCRHTRFIADTAIRIAPSATLFYSEIYMSGRKYFDRGETFQY DILSVCSHAERPDGEQLFREKFIIRPEMYPPSTLGAMNGYDVFANVIVLTPPEHADSIYE RTEAFIDRDRRLAAGITRLPNDAGLLYKVLGEETAPVKRLVREFCSSVRQEVKGKPLPEE FPWR >gi|313158836|gb|AENZ01000026.1| GENE 10 11152 - 11823 696 223 aa, chain - ## HITS:1 COG:BMEI0644 KEGG:ns NR:ns ## COG: BMEI0644 COG0378 # Protein_GI_number: 17986927 # Func_class: O Posttranslational modification, protein turnover, chaperones; K Transcription # Function: Ni2+-binding GTPase 
involved in regulation of expression and maturation of urease and hydrogenase # Organism: Brucella melitensis # 8 212 6 210 212 317 77.0 8e-87 MNENTTCRIGIGGPVGSGKTALIEAITPRLLDMGYKVLVITNDVVTTEDAKHVRKMLKGI LVEERIIGVETGACPHTAVREDPSMNIAAVEEMEARFPDGDVVLIESGGDNLTLTFSPAL VDFFIYVIDVAAGDKIPRKDGPGISQSDILVINKTDLAPYVRADLEVMRRDSELMRPGKP FVFTNCMTGEGIDELVALIRRMALFDLGNNKEKVSESLTGAVR >gi|313158836|gb|AENZ01000026.1| GENE 11 11864 - 12556 949 230 aa, chain - ## HITS:1 COG:YPO2669 KEGG:ns NR:ns ## COG: YPO2669 COG0830 # Protein_GI_number: 16122875 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreF # Organism: Yersinia pestis # 4 230 2 228 228 189 43.0 3e-48 MTDDATTVMRLLEFTDSAFPVGTFSFSNGLETAAEEGLVHDAATLEQYTQDIVRQAAFTD GVAALHAFRSYNLGYYEGILNADRQAVLCKMNAEARLMTRRMGKKLAELSKHVFPDETAA KWLGDIAGGRTPGTYPVAQGIVFAACGISEKGLFCSHQYGVVNMVVSAALRCVRVSHYDT QRILFRAAEKLGELYDTAGDMDFDDMYTFVPQIDILASLHEKGIKRMFMN >gi|313158836|gb|AENZ01000026.1| GENE 12 12567 - 13142 789 191 aa, chain - ## HITS:1 COG:BMEI0646 KEGG:ns NR:ns ## COG: BMEI0646 COG2371 # Protein_GI_number: 17986929 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Urease accessory protein UreE # Organism: Brucella melitensis # 1 174 1 174 201 175 48.0 4e-44 MKIYTEIIGNLQDPEWVKKAREAEIEYIDLDQWTAQKSRFVVKGDRENEYAVALKRHSQM LDGDIIEYLPEQRRIAAIRIRLNDVLVADLSDLARQTPETIIHISVELGHAIGNQHWPAV VKGTKVYIPLTVDKKVMDSVMRTHHIEGVAYSFQPGSEVIPYLAPHEIRRLFGGTGPDSD VHHHHEHVHAH >gi|313158836|gb|AENZ01000026.1| GENE 13 13283 - 15001 2395 572 aa, chain - ## HITS:1 COG:YPO2667 KEGG:ns NR:ns ## COG: YPO2667 COG0804 # Protein_GI_number: 16122873 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) alpha subunit # Organism: Yersinia pestis # 1 571 1 571 572 807 68.0 0 MATISRQEYNNLFGPTVGDKIRLGDTDLYVEIEKDLREYGDEVVYGGGKTIRDGMGLANT MTSEEGALDLVITNVTIIDANLGVVKADVGVKDGKIAGIGKAGNPNIMHGVHPDLVTSTA 
TDAISGEHLILTAAGIDGHVHMISPQQAYACLSNGITTLFGGGIGPTDGSNGTTITSGRW NIERMLESIEGLPVNVGLLGKGNCSMNQPLEEQIEAGACGLKIHEDWGSTPAAIRAALGV ADRFDVQVAIHSDTLNESGYVEDTIAAMDGRTIHTYHTEGAGGGHAPDLLKVASMPYVLP SSTNPTLPFGVNSQAELFDMIMVCHNLNPKIPSDVAFAESRVRPETQAAENVLHDLGILS MVSSDSQAMGRIGESFMRTFQMASFMKNACGKLAEDADGNDNFRVLRYIAKITINPAITY GVSDYLGSVEKGKVADLVLWEPQFFGAKPKMVIKGGLINWSNMGDPNASLPTPQPCYYRP MYGAFGRTLPETCLSFVSDAAFQSGIKERLHLHRMVQPVRRTRQLTKYNMVRNGGMPKID VNPETFDVLVNDIRAYVKPADKFPLSQLLWFS >gi|313158836|gb|AENZ01000026.1| GENE 14 15007 - 15534 694 175 aa, chain - ## HITS:1 COG:BMEI0648 KEGG:ns NR:ns ## COG: BMEI0648 COG0832 # Protein_GI_number: 17986931 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) beta subunit # Organism: Brucella melitensis # 9 165 2 158 159 155 54.0 4e-38 MNTKNPAPAKNQAQAASLYPNKPGFKPAKPVVVGGEILMNAPIEYNTGRKTVKLVVRNTG DRPIQVGSHFHFFEVNRYLEFDRDAAFGCHLNIPATTAIRFEPGDQKEVEVVAYSGKRRV IGFNGLVMGYTGDEDAPTYFPARMHAVRKARHAGFKNISESDAEAAMKKSNNTKK >gi|313158836|gb|AENZ01000026.1| GENE 15 15552 - 15854 414 100 aa, chain - ## HITS:1 COG:BMEI0649 KEGG:ns NR:ns ## COG: BMEI0649 COG0831 # Protein_GI_number: 17986932 # Func_class: E Amino acid transport and metabolism # Function: Urea amidohydrolase (urease) gamma subunit # Organism: Brucella melitensis # 1 99 1 99 100 123 66.0 8e-29 MHLTPKEIDKLMLLSLGMVAERRMKKGLKLNYPEAVAYITSTALEGAREGKTVEEVMKGA ASVLKKSDVMEGVADMIDLLQVEAVFTDGSRLVSIHHPIK >gi|313158836|gb|AENZ01000026.1| GENE 16 16073 - 16450 620 125 aa, chain + ## HITS:1 COG:Cj0517 KEGG:ns NR:ns ## COG: Cj0517 COG0239 # Protein_GI_number: 15791879 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Campylobacter jejuni # 1 123 1 122 122 68 39.0 2e-12 MFKAMIIAGLGGFIGTCLRFLTGKLAHVITVSAFPWGTFAVNIIGSFVIGIFFGLAEKTH VISPSMNVFLITGFCGGFTTFSSFADDMYLLLQQKHWLYFGLYVGLSFILGLVLVWLGRS LIKAV >gi|313158836|gb|AENZ01000026.1| GENE 17 16592 - 17893 2068 433 aa, 
chain - ## HITS:1 COG:SA0731 KEGG:ns NR:ns ## COG: SA0731 COG0148 # Protein_GI_number: 15926453 # Func_class: G Carbohydrate transport and metabolism # Function: Enolase # Organism: Staphylococcus aureus N315 # 3 431 4 429 434 607 73.0 1e-173 MQIVEIHAREILDSRGNPTIEVEVRTVSGAFGRAAVPSGASTGENEALELRDGDKSRYLG KGVLNAVKNVNEVIAPAVIGMNVADQVGIDKAMIALDGTKTKSKLGANAILGVSLAVARA AADYFGMPLYRYIGGANAKTLPVPMMNIINGGSHSDAPIAFQEFMIRPVGAPTFKEAIRM GAEVFHNLKKVLHNRNLSTAVGDEGGFAPALNGTEDAIESIIEAIKMAGYKPGRKEEGGD VSIGMDCASSEFYKDGVYDYTKFEGEKGAKRSSKEQVEYLKGLVAKYPIDSIEDGMSEND WDGWQMLTEEIGGKCQLVGDDLFVTNVDFLKKGIEMGCANSILIKVNQIGTLTETLDAIE MAHRAGYTSVTSHRSGETEDSTIADIAVATNSGQIKTGSASRSDRMAKYNQLLRIEEELG DEAIYGYTKIYRK >gi|313158836|gb|AENZ01000026.1| GENE 18 18089 - 19273 1825 394 aa, chain + ## HITS:1 COG:no KEGG:BF2555 NR:ns ## KEGG: BF2555 # Name: not_defined # Def: putative Na+/H+ exchange protein # Organism: B.fragilis # Pathway: not_defined # 1 388 1 390 392 507 73.0 1e-142 MRKVLSFSLFLMLGLIVSQMLPGTLGAGYAAFKKCSDTLLYICLGFIMINVGREFEIDKT RWRSYTADYFIAMATAALPWILIVLYYVFALLPPELWTSGEAWKENLLLSRFAAPTSAGI LFTMLAALSLKKSWIYKKIQVLAIFDDLDTILLMIPLQILMTGLRWQMFAIIAVVTLLLA VGWRWQATWNVRQDWKTILGLAVVVCALTQLVHIVTARLYGPENSIHIEVLLPAFVVGML MKHKEIDTAAERRITTGISFLFMLLVGLSMPLVTGAPGDTAAATSVTGSQPMMPWGTLAL HVLAVSLLSNLGKLVPLFFYRDRKISERLALSIGMFTRGEVGAGVIFIALGYSLGGPALI ISVLTLVLNLILTGGFVVWVKKLALRTYAPEAQQ >gi|313158836|gb|AENZ01000026.1| GENE 19 19317 - 19817 810 166 aa, chain + ## HITS:1 COG:no KEGG:BT_0176 NR:ns ## KEGG: BT_0176 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 165 2 161 162 235 75.0 5e-61 MIEITDKALQTAAAKGMDEFLRVFTDKYKETVGGDPTAETMPLLSGEQHSLLAYRIFRDE VTEGGFCQLIQNGYGAYIFGNPFARVMRLWGAESFSKLVYRAKKIYDANREDLERERTDE EFMAMYEQYEAFDELEEEYGEMEAEVTETLARYVDGHLDLFAAVVG >gi|313158836|gb|AENZ01000026.1| GENE 20 19989 - 20990 1251 333 aa, chain + ## HITS:1 COG:AGl2871 KEGG:ns NR:ns ## COG: AGl2871 COG3712 # Protein_GI_number: 15891547 # Func_class: P Inorganic ion transport and 
metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 7 279 21 278 331 83 29.0 5e-16 MDKHYRDLLIRALKGTLDTASRREFDLWVADADNRRIYENALRIWDDVQRRTSAYNPDRD RLWEQLQSRIEVCGTPQRAERPAPKRRWMPILRTAAAVAAGVVVTLFAVRPDLSRTPVPA RQEFCAFGGKSLAHLTDGSTVWLQSGSTLAYDSSFGAESRNVKLRGEGFFDIAKDPERPF TVEVEGLRIRVHGTKFNVNTSHEDRTVAVSLVEGSVALDSGNGETRRLHPGEIACYDPAT QELNIRRGNVELESCWAAGKLVFERQSLGEICRYLSRWYDIRIDISPALAHNYAYTFTLT DEPLEAILRIMSRINPIVYTFSDNRSVSISEIE >gi|313158836|gb|AENZ01000026.1| GENE 21 21111 - 24626 5258 1171 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_1516 NR:ns ## KEGG: Bacsa_1516 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: B.salanitronis # Pathway: not_defined # 109 1171 24 1087 1087 1057 52.0 0 MKHFRLFAILLLLVCGWTTVAHGQTQEPAINLEFTDIPLSEAISRIEKSSKYTFFYDAKQ TDLTRRVSLRAKQLPISAALRQMLAPTGLNFTISERQIALIPAARKAPAGSRTITGTVND SHELPLAGVAVTLEGDNTRGTVSENNGSFTITVPNADAVLSFTYLGYISKKVSVPASQNN LKVFLAEDAVKMEDVVVVGYGTQKKVNLTGAIATVDDTQLASRSAPSVAHMLQGAVPGLT ISTTSGRPGNSADLNIRGITSINGGSPLVLIDGAEGDLMKLNPNDVASISVIKDASAAAI YGARAAYGVVLVTTKEGDDSGKTRVSYSGRWGWNAPTTSTDYETRGYYSVYVNDLFWHAD AGTNYTNYTEQDMMELWARRNDKVENPERPWVKIDQRDGRDTYVYYANFDWYHYLFKDEH PNTSHSVSLSGGNSKVKYMLSGNYYSEEGLFRQDPDRLQRINFRSKISFDINKWLKISNN TSYYNYQYYYPGPSGVNTAFSLGTVHGLASMMPYNPDGTSVYYTSLSKYSIMDGLPTIMN KGGHYNKDKTDNMSTTTELTWTPVKGLEIKGNFTYMFNTQHNLNRQVNTEYSQYPGEVQT LSTGSRFQDKLYEKTMMHNYYQANVYATYAHTWNEKHNFKAMAGFNWETKYLKDVSATGY NLLSETLMDLNLVGQGADGNERMEVGGGQNEYALMGFFGRLNYDYKGKYLVEVSGRYDGT SRFKRGHRWGFFPSLSLGWRISEEPFFEGIRKNFNNLKIRYSYGQLGNQNVGYYDYIRKI SIGNQNYLFGGDKPTTATISAPVASNLSWETSIHNNLGVDMSFLNNRLAFSADFYIRDTK DMLTAGVALPSVYGADSPKMNSADLRTKGYELSLSWRDEFQLLRRPFTYSVTVTFNDYVT NITKYANPDRTFAKSYYEGMRWGEIWGYRIGGLFATDAEAAGYAVDQTAVNNRINAAAGS ERGLHAGDLKFLDLDGDNIISIGKNTVDDPGDREIIGNSQPRFHYGTMLSMSWAGIDFSI FFQGIGRRHWYPKANTIAFWGPYARPYASWIPKDFHKMYWSEENPDAYFPRPRGYVALSG TNRELTAVNDRYMQNIRYCRLKNLTIGYTLPKKWTRKVLIDNLRVYFTGENLATWSPIRS DYIDPEMAAMNDEMRTYPWQKTYMFGVDVTF 
>gi|313158836|gb|AENZ01000026.1| GENE 22 24643 - 26415 2884 590 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_1515 NR:ns ## KEGG: Bacsa_1515 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: B.salanitronis # Pathway: not_defined # 1 590 1 600 600 630 53.0 1e-179 MKKILLILAAGCTMALVSCEEWLDKYPLAQMSPETFFSNENELQAFSNKFYTAFPSDGLY NEYWDNIIHNDLPQEMRGGRTIPASGGGWTWTSLRDINTLLEYSVNCKDLDVRNRYDALA RFFRAYFYFEKIKRFGDVPWYSKPIGSADPELKRPRDSREYVMQRMIEDIDFAIRYLPTK HDLYRITKWTALALKSRFCLFEGTFRKYHGIDLPENDWKYYLDLSAKASEEFITNSGYGL YTSGGTQTAYRDLFVSEDAQQIEVVLARDYNKGLSVFHNSTFYSLNTSYGRPGLTRKIVA SYLMADGTRFTDKAGWETMEFRDECQNRDPRLAQSIRTPGYTRINSTKVEVPSLSTCMTG YQPIKFVAPADFGSDGYNLSYTDLPIIRTAEIYLNYAEAKAELGTLTQDDIDLTIKKLRD RAGMPNLDMAQANANPDPYLSAPETGYPNVTGSNKGVLLEIRRERTIELLQEGHRYYDLI RWKEGKTFTQPLLGLYIPSKGEYDLDGDGTPDIYLKTKGETPSTKAPLIMEIDQDIFLSE GDHGYISPHKKNPGEWNEARDYYYPIPTDDRSLTGGALTQNPGWNDGLDF >gi|313158836|gb|AENZ01000026.1| GENE 23 26449 - 27348 1254 299 aa, chain + ## HITS:1 COG:CC0523 KEGG:ns NR:ns ## COG: CC0523 COG3568 # Protein_GI_number: 16124778 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 58 296 5 245 259 64 28.0 2e-10 MNRLKKYLTIMLLLGLTCPATACNSSSSSEDEVPPHLKPETPEEPGRDYYPKADGATRLV TYNVGVFTKYISDGSYQMIADMMTEGKADLVGISELDSCTVRTGRVFQLKKFAGLMGKEW SYEYSRAMAYQGGAYGDGIASREKPVRTFAAPLPKGDGAEPRVMVVMEFEKYVFATTHLD HVSTLAQAGQVDEINKVIEREFGGSKKPVFLGGDFNARPDSPTISKLKTGWTVLTPHGAS DFTHSSQAPNKCIDYILLWNGNGAKCDVVGTKIMLDFNKGDIKRASDHIPVLVDVKFQC >gi|313158836|gb|AENZ01000026.1| GENE 24 27374 - 28435 1594 353 aa, chain + ## HITS:1 COG:no KEGG:BT_2194 NR:ns ## KEGG: BT_2194 # Name: not_defined # Def: exo-alpha-sialidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 329 1 301 316 78 25.0 6e-13 MKRYANIRLLAAVALLVATAGCKDDEGPALKQPALLSIQDAAADAEVITVESPKKQTLNI RVEAEQISGNYITVVFKIDPELVETYNAAHGTSYKLCPSEAYAFSTTEVMLPRYNTVSST MKVELFSERMPDEEPYLLPIAIDSIKGDPRATVSPTGGVRYILFNKFVKPGPPAPVLLDR 
SGWKVLAAPKAKPNYEVEKMFDEDINTFGYCNADEEQANPPYDFVIDLGKEVTVRGFQFN QRYMTTGKPEAGARYGIAHLEVWTAKEITGDGLGAANDANWTYTQTYRDYRTDPDSPLDP MGVVISPMLDDYQHARYVRLRIHASYTNKTYNPTFRGWCIAELNVWGNNELLE >gi|313158836|gb|AENZ01000026.1| GENE 25 28635 - 29180 628 181 aa, chain + ## HITS:1 COG:MT0759 KEGG:ns NR:ns ## COG: MT0759 COG1595 # Protein_GI_number: 15840142 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mycobacterium tuberculosis CDC1551 # 26 178 16 170 177 60 31.0 1e-09 MRPENEHDMVLWKHISRRGGETAAFRELFDRYYAPLCRFAAYWLRDRTSAEEIVLDTFTH IWQHAGELRISTSVRAYLFRAVRNRALNRLRDQRTDGIPIEGPEPLFTNPEALQLEADEM MLLVAEAVSQLPDRCREVFRKSREEGLSNAAIADQMRISVKTVEAQITKALRRIRETLLR T >gi|313158836|gb|AENZ01000026.1| GENE 26 29303 - 30514 1543 403 aa, chain - ## HITS:1 COG:BMEI0889 KEGG:ns NR:ns ## COG: BMEI0889 COG0809 # Protein_GI_number: 17987172 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Brucella melitensis # 8 403 3 355 363 219 40.0 8e-57 MSERHIDINEFDYDLPDGRIAKFPLAERSASKLLVYRGGEIGEAHFADIGDVLPEGQLLV FNNTKVIRARIIMHKPSGARIEVFCLEPHDPADYERAFAVVGRCEWSCIVGNLKKWKEGY VEINFEHEGRPEHLRAWLVENRGREHTVRFEWSAAMTFGQLLEYLGRIPIPPYLNRESEE IDYTRYQTVYSKFEGSVAAPTAGLHFTPELIEGMKARGFGFEEVTLHVGAGTFLPVKDDD AARHPMHTEHFEVRRAAVARLLEKWGRITAVGTTSVRTLESLTALAWRIRTAGTPDAERV VGQWELYDVPAEFSGREALETLLKYMDEKGLERIKAATQIMITPLGYEFRIVRNIVTNFH QPKSTLLLLVSAFVGGDWKRIYEYALGHGFRFLSYGDSSVLMR >gi|313158836|gb|AENZ01000026.1| GENE 27 30664 - 31623 689 319 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158865|gb|EFR58245.1| ## NR: gi|313158865|gb|EFR58245.1| F5/8 type C domain protein [Alistipes sp. 
HGB5] # 1 319 1 319 319 625 100.0 1e-177 MMKKLKYLLAGVLVSAVAGCFVACDDDESATDEWTADYVYLERLKLGVGSLEFNQTHSSL GIAGDTEVSMPFAVCLAKPWKSDVAVKFAYTIEGDMPEEIVTFREGGAVVIPAGELAARD TMDLRTDWSFVPQEAATYKVTVGIGSVQPVSGQLRISSKQKELSVQINKAKSMDIQAGVK PSGSRIADYSGWSVYATNVDDNNADWGAHQPKLINGNTGDYIWFNTPHLGVKIDMGATKT VTGLETFSAYGAAYAMSSCAVYTSDDGAGWKLVTPEEGLSMTPAGTQYVSFIAPITARYI IWHMYGSAPLSSEIYVYSK >gi|313158836|gb|AENZ01000026.1| GENE 28 31708 - 32877 1582 389 aa, chain - ## HITS:1 COG:no KEGG:BT_1049 NR:ns ## KEGG: BT_1049 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 388 12 392 393 372 52.0 1e-101 MKNSIKTAVLGLAALVSGFSACNDADYDTLGVHAFVSESASNKSTKVTITELGADAEITA CLSEAATKDVKLKFVVDSEVLDRYNQKQASSFLVLPEEVYEMDSEVTIKAGEFSAPATKI HIKPLPKEYVGESYALPLRLVSADGSVPVTSLTSTYVITTEAIVVSSLPMFVGGAGLAAD GFPLTLPQFTVEVRFQVSNTANRNRAVFTNGGSVLLRFEDPQNDTSDHKAHSLVQFQGDG WYLNPSLSFTPNKWQHLALTYDGTKVTLYVNGTFAGSKEGVCDPNFGSANWFGGDAGGGH GTGDSWWSGCKILMSEARIWSVCRTEAQVQNNMTTTSAKSQGLEAYWRFNEGEGNTFEDC TGNGHTLKTSKTPTWVAGIKSTDTETPWP >gi|313158836|gb|AENZ01000026.1| GENE 29 32904 - 34025 1270 373 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 20 371 19 368 375 261 44.0 3e-68 MNIKKNIWALAAALLFATAFTNCSDWTEVEAEKKVDYGNTETSRPESYYQALREWKKTDH SISFGWFSGWGDPAVLTASMLMGLPDSMDMVSLWDNSSNLSQGKIEDLRMVQQKKGTKVL MCTFIQYVGKGFTPAEYDVDEATREAFWGWVDDDEAAIKTSMEKYAQAIADTIYKYNYDG LDIDFEPNVDGVVGKLDENDTYVRWFLDILCKHLGPQSNSGKMLVIDGELWKVPVSAATY FDYFIGQAYSVSGGTPSPNAGQSESNMDSRLTQIINRFGGVMSEEEITKRFIVTENMESA IDALNGGYFWTLRNGTRLDKAECPSLLGMARWQPVNGFRKGGFGGYRFSNEAVNKPSYKW MRTGIQAQNPAVN >gi|313158836|gb|AENZ01000026.1| GENE 30 34054 - 35580 2357 508 aa, chain - ## HITS:1 COG:no KEGG:BT_1047 NR:ns ## KEGG: BT_1047 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 506 5 512 514 548 55.0 1e-154 
MKNIIKLTCLLLTVCGFAGCTGNFDEYNTDPYALYKGDPAVLIPTMLDAMMYVQQNDSQM VDQMVGSLGGYFTLSNRWGGQNFDTFNASDAWNATPYNTMFEDIYANFFDIEKSTSKSGH YYAMARLVRAAVMMRVADCYGPIPYSKVADGSFYVEYDSAEDVYKHIIEDLAGAATTLYA YAQEYPASKPLANNDPIFAGDYTAWARLANSLCLRAAIRSNDREAAESACGHAAGLIETN AQNAMMSPGVQGNPYQLASASWGDLRINASIVDYMTGYDDPRCEAYFQKSTFDNTRYIGM RAGTAGFEKSAVSGYSLPNVQSGSSLPIFVAAETNFLRAEMALREWTVGGTAQSYYEAGV RLSMEQNGVAGEDIDAYLADETLVPAGHLNDPRGAKYNYDRQTDVKIKWNDADGTEKNLE RIITQKWIANFPMGLEAWAEFRRTGYPELAPAIDNLSGGVISDNFRGLRRLRYPYTERNL NKSNYDKAVALLGGTDNESVDLFWTKKK >gi|313158836|gb|AENZ01000026.1| GENE 31 35594 - 38890 5010 1098 aa, chain - ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 3 1098 2 1101 1101 1202 56.0 0 MNKKRIQQIKCLFTLFFVVCASFYATRSYAQNNAVTIRLSGVSMERVMNEIENQTNYLFL SNKDVDINSVVSVDVEARPLSEALAQMVRGTDVVYRFDGKYILLSKADRRPVTVKGVVTD ADGAPVIGASVIVKGTTVGASTNTDGSFSLQVPPPAENAVLTVTYLGYEPVDVTVGSRTD FRITLRDSAVAVENVVVTALGIKRQEKALSYNVQQVKSEELTTVKDANFMNSLVGKVAGV QINSGASGPGASARVVMRGEKSIEKGNNVLYVIDGIPMYNHSFGGDGGTYAKQAGSESAA DINPEDIESINMLTGPSAAALYGSDAANGVVVINTKRGVKDRTVVTVSNSTTFSKVYRLP DMQNSYGTSSGLMNWGEKSASTFDAKKFFNTGTNIINSVSVSTGNEKNQTYISASTTNTA GIIPENTYDRYNFTARNTTSFAKDKLVLDLGASFIIQHDENMVTQGKYYNPLPALYLFPR SDDFDEIRLFERWDPVRGYMVQYWPYGEGAHSLQNPYWIQNRMNRETSKRRYMLNASLRW NITDWMNVSGRVRVDNSEYRIKQKLYASTLTTFCGTNGGYEDQTQHDRSFYGDVMLNIDK TFGDDWTLNANIGASINDQRYEQAGVAGDLLLTNHFAMNNLNYQEKFKPLQEGWHDQTQA IFASVEVGWRSMLYLTVTGRNDWASQLAYSKNSSFFYPSVGLSAVVSNMVNMPEWFTFLK VRGSYSKVASAFARYLSNPAYSFNNQTHNWSKPSTYPAYNLKPEDTKSWEIGLNARFLNH INFDVTYYRSNTYHQTIYAPLPSSSGYNQVVVQAGDVQNQGVELALGYNNKWNDFTWSSS YTFTLNRNEIKRLAAGEVNPVTGETITDNEIQKDWLGASNVAPQVILRPGGSMSDIYVNH ELKRDLNGNIEIDPSTGNLSIAETDFRKVGQLSPRYTMGWSNSFGYKGIELGAVLTARIG GLTYSATQGVLDYYGASQATADARDRGGIPINYGTVDPQKYYSAISTAEGGYGAYYLYSA TNVRLQELSLQYTLPARWFRNKARLTVGFVAKNLWMIYCKAPFDPEISAATDNAYYQGVD YFMQPSTRNFGFNVKLQF >gi|313158836|gb|AENZ01000026.1| GENE 32 39067 - 40008 1422 313 aa, chain - ## HITS:1 COG:CC1130 KEGG:ns NR:ns 
## COG: CC1130 COG3712 # Protein_GI_number: 16125382 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Caulobacter vibrioides # 7 313 19 307 307 65 27.0 9e-11 MENTATDAETEQVLDWLDADPAHMRELDELDKVMAASVIYGPDVLSPAPAKKAARRISLG RIPLRRIVRYAAELAAVAVVGIGVARMLADDRIEEWTRRTTALEVPAGQYLSMELQDGTK VWLNAGTRLEYPLVFAGGERRVKVAGEAMFDVEHDPAHPFVVETFACDVEVLGTKFDVTA EEREGLFSAALLRGSVKVTNRLTPGEQFVLKPNEEVRLAGRRLNLNAIGSMDDYLWTEGM ISIKGLSFGELMHKFEKSFGVKIRIDRNRMPEVDYNHGKIRISDGVDSALRLLQMASDFT YTRSEDNGTIVIR >gi|313158836|gb|AENZ01000026.1| GENE 33 40144 - 40755 767 203 aa, chain - ## HITS:1 COG:no KEGG:BVU_0502 NR:ns ## KEGG: BVU_0502 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 21 196 8 183 186 153 44.0 4e-36 MEKTAVRIQGRESVAMTTAEFGRLFATWRARFEAIACRYVRSAAVAEDLVSDSFMSFWEN RGRIPADANLQAYILIIVRNKCLDWLRAQSLHAKIEQEVYELRRRVLAADIRSLQAFNPE EIFSAEVAAIVRQSLDRLPELTREVFVARRFEELSYKEIAEKYGITVRRVEFELEKAVKQ LRVALKDYLPVLLMLLSDNILRS >gi|313158836|gb|AENZ01000026.1| GENE 34 40770 - 43073 3221 767 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 17 597 29 589 757 409 40.0 1e-114 MAAGIALCSCGSHDPQIAIVPYPNHLEAGRGTYRVTDRPVTCDSRTDERTQRAVVGFAAR LATVTGGTNPVTVADEMPASGIRFVTDESLPAEGYELNVDGEGIEVRASQFPGFLYALQS LGQLLPAAVYGTETAPDAAWEVPCVKIADAPRFAYRGMHLDVARHFFSVDEVKRYIDVMA IHKLNTLHWHLTDDQGWRIEIKRYPELTAVGSIRKATVVRKEWGTYDGTPYGGFYTQDEI RDVVKYAADRGVTVIPEIDLPGHMLAALTAYPELGCTGGPYEVWGRWGVADDVLCPGREK TFEFLEGVLTEVMELFPSEYIHIGGDECPKVRWEKCPRCQAKIRQLGLKDDGEHTAEHYL QSYVTDRIGKFLAQHGRRIIGWDEILEGRAPSDAVVMSWRGSEGGIAAAKLGHDVIMTPN SHFYFDYYQSLDTDAEPFGIGGYIPMEQVYSYDPAFPELTPEQQKHILGVQANLWTEYVL SDEHLEYMLLPRLAALSEVQWCLPETKDWNRFIGSFRMDEIYSQMGYEFAKHIFGVTASY 
AVDPEKGGVVMTLTTQGGAPIRYTLDGSDPTASSPLYKAPVTIGESCTFKAAALREGMQT PVYTRKFDFNKATGRRIALNAAPTLKYTYGGASLLVDGYRGGPVYSNGAWIGFLNEPLDV TIDMQGAKPYSAVTVESLVEKGEWVFPPSSVGVYLSDDGREFTEAALMSVPQETAGSPDG VKPFKVLFPETSARYLRVVARTVDPIPAWHGAAGQKAHMFVDEIIVE Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:59:24 2011 Seq name: gi|313158832|gb|AENZ01000027.1| Alistipes sp. HGB5 contig00069, whole genome shotgun sequence Length of sequence - 3443 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - 5S_RRNA 29 - 88 93.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. + Prom 432 - 491 8.7 1 1 Tu 1 . + CDS 511 - 1317 904 ## BF2277 lipoprotein signal peptidase + Term 1476 - 1515 2.1 + TRNA 1387 - 1471 61.2 # Leu CAG 0 0 + Prom 1717 - 1776 3.0 2 2 Op 1 5/0.000 + CDS 1957 - 2298 278 ## COG3547 Transposase and inactivated derivatives 3 2 Op 2 . + CDS 2476 - 2844 299 ## COG3547 Transposase and inactivated derivatives + Term 2979 - 3026 -0.6 4 3 Tu 1 . 
+ CDS 3335 - 3443 100 ## gi|291513582|emb|CBK62792.1| 4-alpha-glucanotransferase Predicted protein(s) >gi|313158832|gb|AENZ01000027.1| GENE 1 511 - 1317 904 268 aa, chain + ## HITS:1 COG:no KEGG:BF2277 NR:ns ## KEGG: BF2277 # Name: not_defined # Def: lipoprotein signal peptidase # Organism: B.fragilis # Pathway: Protein export [PATH:bfr03060] # 72 267 13 200 210 166 49.0 1e-39 MTPDMMHDAGPDVSRFGNKGFHPAPIAGRKARSGNIIVRRTSKIGRHPVKQRFFTIFAAD NPTAMNFKKISLLILILLIADQLLKIWVKTHMHLDESIIVFPDWFQLRFIENNGAAFGMH IASKGGFDWGKLLLGIFRIVMVGLIGWLMHHLLRRREDTPKGVIVGLALVMAGALGNIID SAFYGLIFSESTPYAVAHFGGHYAGFMMGKVVDMFYFPLFQWNNVPRFLSFLVDSNNYFF GAIFNLADAYISVAVVYLLLFQYKFLSK >gi|313158832|gb|AENZ01000027.1| GENE 2 1957 - 2298 278 113 aa, chain + ## HITS:1 COG:AGpT275 KEGG:ns NR:ns ## COG: AGpT275 COG3547 # Protein_GI_number: 16119969 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 2 96 27 120 274 68 40.0 3e-12 MYEAGFCGFWIHERLTALGIDNIVVNPADVPTKSSEKLRKSDAVDSGKLARSLRANELKG IYTPDSVSLEMRSLIRLKNSITKDTTRQKNRIKSQLRCLGIGIPQEFLEPFSN >gi|313158832|gb|AENZ01000027.1| GENE 3 2476 - 2844 299 122 aa, chain + ## HITS:1 COG:MA2071 KEGG:ns NR:ns ## COG: MA2071 COG3547 # Protein_GI_number: 20090917 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Methanosarcina acetivorans str.C2A # 1 116 1 117 146 73 33.0 1e-13 MSVPGFGQTTGMAFLSEICDITRFRNAEQLAAYIGMIPMCHSSGEKDGTGDITVRKHAVM RCNLIEAAWVAIRQDPAMNLFYTEQCKRMPKSKAIVKVARKLVNRLYFVLKHQTEYVNSV VS >gi|313158832|gb|AENZ01000027.1| GENE 4 3335 - 3443 100 36 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513582|emb|CBK62792.1| ## NR: gi|291513582|emb|CBK62792.1| 4-alpha-glucanotransferase [Alistipes shahii WAL 8301] # 1 36 1 36 867 76 100.0 8e-13 MTLLISIEYRTRWGEQLVLRLGKRRIALQYADGGVW Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:59:35 2011 Seq name: 
gi|313158827|gb|AENZ01000028.1| Alistipes sp. HGB5 contig00030, whole genome shotgun sequence Length of sequence - 5351 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 112 - 171 3.1 1 1 Op 1 . + CDS 265 - 732 -146 ## gi|313158830|gb|EFR58212.1| hypothetical protein HMPREF9720_1363 2 1 Op 2 8/0.000 + CDS 735 - 3944 4808 ## COG0610 Type I site-specific restriction-modification system, R (restriction) subunit and related helicases 3 1 Op 3 . + CDS 3944 - 4639 513 ## COG1451 Predicted metal-dependent hydrolase + Term 4709 - 4746 8.7 4 2 Tu 1 . + CDS 4804 - 5350 590 ## PRU_1165 alpha-1,2-mannosidase family protein Predicted protein(s) >gi|313158827|gb|AENZ01000028.1| GENE 1 265 - 732 -146 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158830|gb|EFR58212.1| ## NR: gi|313158830|gb|EFR58212.1| hypothetical protein HMPREF9720_1363 [Alistipes sp. HGB5] # 100 155 1 56 56 111 100.0 2e-23 MHVKDLSAMSLIDKDQSDYNNVDNRLRDKSFADKSQDVEPGRTKIYARFRKWRRSEIENY LINCAVLARAAKISEEDVQTFFSEHALVIPDTNTMKQSDMQNNTIPFFNLSGKEYIELFC MGRGIDKYDIAKEFTKDEICDDIKTILTEIQDMCK >gi|313158827|gb|AENZ01000028.1| GENE 2 735 - 3944 4808 1069 aa, chain + ## HITS:1 COG:BMEII0449 KEGG:ns NR:ns ## COG: BMEII0449 COG0610 # Protein_GI_number: 17988794 # Func_class: V Defense mechanisms # Function: Type I site-specific restriction-modification system, R (restriction) subunit and related helicases # Organism: Brucella melitensis # 21 802 25 751 783 377 32.0 1e-104 MISFKEDHISQIPALLMLEKFGYTYLTPDEALALRGGNLANVLLEPVLRKQLAAINVVQV SSMRTTVFSEQNITAGIDALRNIPMDEGFMEASQTVYDLLTSGKTLEQIVDGDRKSFVMQ YIDWKNPRNNVFHVTEEFAVSRTGMTETYRPDIVLFVNGIPLCVIECKRPDVKDSIEQAI SQHLRNQKADGIRSLYLYSTLLLAINRQEGSYATTATPEKFWARWREQFADREAEARYRQ ELERVVNEPLLDDKLFGERFGYVRRNFEELYRQPVTPSVQDEYLCNLCRPERLLQLMYGF TLYDDGIKKVARYQQYFAVRQTMRRIRHIEGGRRRGGVIWHTQGSGKSLTMVMLAQAIIL DKEIRNPKIILVTDRTDLDRQITGTFRKCQIRVENATTGSKLVELLNSKSDAVITTIINK 
FETAVRGIRTPLTDPNIFVLIDEGHRSQYGEMGIKMEKTLPNACFIAMTGMPLMKKEKNT ARKFGGIIQPVYTVDQAVADKAVVPLLYEGRMVPQVVHEETIDRYFDKICGWMSNAQRAD MKKKFSHADQLNQTQQRIYAIAWDISQHFRENWQGSKFKAQLVAPRKRIAILYKQFLDEI GIVSSEVLITSPDTREGEDEAFGETSNVEVAFWKRMMDEYGTAKKYETSIINRFKNSDKP EIIIVVDKLLTGFDEPRNTVLYLDRKLKDHTLLQAIARVNRVCEDKEFGYIIDYYGVLGS LNSALELYTDFDKEDLEGTYTDISEEVAKLPQKHAELWDLFKEVRNTTDFEAFGNVLREE DRRSLFYEKLRAFARTLKVALSSIVFHQNTPQEEVERYKHDLAFFMKLRNAVQERYSDTV DYKQYEGQIQKLIDTHIESGEVQVITDLVNIFDKERFAEEVEKISGKAAKADTIASRTAK YITENMDTDPAFYKKFSQMLKETISQYEQGRIDEAEYLTQATDLMNKVLNHTDSEIPDVL KDNNAARAYFGLSLEVYKSVIKPEQGLDLTQIALDTANRIDAIIRQHIFEKGTLIVDWPL KDRLVGMMKLDIEDYLIDEVKRKYDLSMTFDDMDAIIDRAVDVAQKWFR >gi|313158827|gb|AENZ01000028.1| GENE 3 3944 - 4639 513 231 aa, chain + ## HITS:1 COG:BMEII0448 KEGG:ns NR:ns ## COG: BMEII0448 COG1451 # Protein_GI_number: 17988793 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Brucella melitensis # 2 229 6 237 246 121 34.0 1e-27 MQQIFVYGKETIPYSVLFSARRTLGIKVYPSGEVVLLAPEGTSEEVIEQKLHKRAPWILR QQAYFTDFGIRSPEKRYVSGESHYYLGKQFLLRVSEGKTNSVRYKGRCFEVVCTSPDRAR ELMRSWYREHAKLKFAEYAEPIISRFARYGVAPTSLYVQEMENRWGSCTPKGKIILNTEL IKAPRPCIEYVITHEMCHLLHPDHTAAFFTLLETEMPDWRRWKDKLERFMM >gi|313158827|gb|AENZ01000028.1| GENE 4 4804 - 5350 590 182 aa, chain + ## HITS:1 COG:no KEGG:PRU_1165 NR:ns ## KEGG: PRU_1165 # Name: not_defined # Def: alpha-1,2-mannosidase family protein # Organism: P.ruminicola # Pathway: not_defined # 1 178 92 269 746 208 54.0 9e-53 MPVRGRDKVDEESRQSWFSHQSEEARPYYYSVYLADHDIKAEIAPTERAAIMRFTFPESD ESGVVIDAFDRGSYIRVMHDKRTVVGYTTRNSGGVPDNFKNWFIVRFDRKIRDFQIYDGT KPVEGEQLVGEHALVRVGFETRRGEQVTVRAASSFISQMQAVQNLEELGKDDFETVKAKA QA Prediction of potential genes in microbial genomes Time: Wed Jun 22 11:59:56 2011 Seq name: gi|313158802|gb|AENZ01000029.1| Alistipes sp. 
HGB5 contig00053, whole genome shotgun sequence Length of sequence - 38195 bp Number of predicted genes - 27, with homology - 23 Number of transcription units - 15, operones - 7 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 484 - 792 110 ## - Term 634 - 677 11.4 2 2 Op 1 . - CDS 804 - 1001 212 ## 3 2 Op 2 . - CDS 1079 - 1459 285 ## gi|313158813|gb|EFR58196.1| single-strand binding family protein 4 2 Op 3 . - CDS 1481 - 1927 366 ## - Term 2692 - 2729 3.1 5 3 Tu 1 . - CDS 2739 - 2990 116 ## gi|224535162|ref|ZP_03675701.1| hypothetical protein BACCELL_00023 - Term 3424 - 3463 11.4 6 4 Op 1 . - CDS 3466 - 4242 271 ## BT_2589 type I restriction enzyme, M subunit 7 4 Op 2 . - CDS 4235 - 4483 282 ## Bacsa_0420 hypothetical protein 8 4 Op 3 . - CDS 4480 - 4785 158 ## gi|313158809|gb|EFR58192.1| conserved hypothetical protein - Term 5243 - 5280 9.4 9 5 Op 1 . - CDS 5302 - 5832 113 ## BF2908 hypothetical protein - Prom 5954 - 6013 2.3 10 5 Op 2 . - CDS 6017 - 6376 56 ## gi|313158814|gb|EFR58197.1| conserved domain protein - Prom 6431 - 6490 4.7 - Term 7117 - 7170 11.0 11 6 Op 1 . - CDS 7205 - 9184 2142 ## Coch_1009 glycoside hydrolase family 42 domain-containing protein 12 6 Op 2 . - CDS 9188 - 10315 1152 ## COG3250 Beta-galactosidase/beta-glucuronidase 13 7 Tu 1 . - CDS 10509 - 11498 677 ## COG3119 Arylsulfatase A and related enzymes - Prom 11603 - 11662 2.5 + Prom 11588 - 11647 3.7 14 8 Op 1 . + CDS 11709 - 14303 2434 ## COG3250 Beta-galactosidase/beta-glucuronidase 15 8 Op 2 . + CDS 14300 - 15850 2148 ## COG3119 Arylsulfatase A and related enzymes 16 8 Op 3 . + CDS 15874 - 19149 3833 ## COG3250 Beta-galactosidase/beta-glucuronidase + Term 19296 - 19328 1.7 + Prom 19255 - 19314 2.4 17 9 Tu 1 . + CDS 19498 - 21642 3106 ## BT_4606 hypothetical protein + Term 21661 - 21703 6.2 - Term 21650 - 21690 9.6 18 10 Tu 1 . 
- CDS 21707 - 24043 3021 ## COG3533 Uncharacterized protein conserved in bacteria - Prom 24080 - 24139 3.2 - Term 24319 - 24353 -0.9 19 11 Tu 1 . - CDS 24376 - 26685 2590 ## COG4953 Membrane carboxypeptidase/penicillin-binding protein PbpC - Prom 26715 - 26774 1.9 + Prom 26636 - 26695 4.7 20 12 Tu 1 . + CDS 26725 - 31977 6757 ## COG2373 Large extracellular alpha-helical protein + Term 32064 - 32115 18.0 - Term 32058 - 32097 9.0 21 13 Op 1 4/0.000 - CDS 32176 - 33351 1442 ## COG0477 Permeases of the major facilitator superfamily 22 13 Op 2 . - CDS 33348 - 34367 343 ## COG1853 Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family 23 13 Op 3 . - CDS 34376 - 35194 958 ## COG0627 Predicted esterase + Prom 35150 - 35209 2.6 24 14 Op 1 . + CDS 35247 - 36149 1167 ## COG0582 Integrase 25 14 Op 2 . + CDS 36174 - 36470 211 ## PROTEIN SUPPORTED gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 26 14 Op 3 . + CDS 36504 - 36689 98 ## - TRNA 36743 - 36817 46.6 # Glu CTC 0 0 + Prom 36835 - 36894 4.5 27 15 Tu 1 . + CDS 36959 - 37696 1026 ## Cpar_1045 hypothetical protein + Term 37764 - 37815 6.5 Predicted protein(s) >gi|313158802|gb|AENZ01000029.1| GENE 1 484 - 792 110 102 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGITDPRGDIAGDPAAGYGRFRFHPNIIPRIIINNISIRKLRENRAPKARREKQAARQVL TSRAAMLMSEGKKLQEPFLSLICLTPNDDCNYRHDDDSKNVN >gi|313158802|gb|AENZ01000029.1| GENE 2 804 - 1001 212 65 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKELGKWLMDIAKYVLTAVVVMSLVSDLGDVRWLIYVFGLLTTAVCLGGGLILVRDKDT KVKED >gi|313158802|gb|AENZ01000029.1| GENE 3 1079 - 1459 285 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158813|gb|EFR58196.1| ## NR: gi|313158813|gb|EFR58196.1| single-strand binding family protein [Alistipes sp. 
HGB5] # 18 126 1 109 109 221 100.0 1e-56 MQCLAEPHDVQANRIFQMKRTTIVGNIGRDAVCREGDGGHKFVSFSVAYAERKTNETDEQ GNPVTVAQWAECEIYVGPESSAEGLLKLLTKGRFIYAEASDKAEAWIDKDNQLRSRILYR ITNFQV >gi|313158802|gb|AENZ01000029.1| GENE 4 1481 - 1927 366 148 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKTYVATVCLTKSYTDKMCNQVLLQKEFPDEKAAHVWLNKALKRYNYVKGIKDLCYVSWF SLDDENGNQLLYFDYFSSDVPHEIQVRVTLTKKYPAQKQTIVLTHKCACYNHAILWIDWA YLKFESSENATFSSGEMIDAKTEIVIDK >gi|313158802|gb|AENZ01000029.1| GENE 5 2739 - 2990 116 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|224535162|ref|ZP_03675701.1| ## NR: gi|224535162|ref|ZP_03675701.1| hypothetical protein BACCELL_00023 [Bacteroides cellulosilyticus DSM 14838] hypothetical protein BACCELL_00023 [Bacteroides cellulosilyticus DSM 14838] # 6 83 19 99 99 68 43.0 2e-10 MENRAKKCIECGKTLTLDYFRRTPLSPDGYAKMCKLCANIKRTQGKNEERNPALLNFKAR ELIDELRARGYRGTLRITKDIQV >gi|313158802|gb|AENZ01000029.1| GENE 6 3466 - 4242 271 258 aa, chain - ## HITS:1 COG:no KEGG:BT_2589 NR:ns ## KEGG: BT_2589 # Name: not_defined # Def: type I restriction enzyme, M subunit # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 221 1 217 256 174 42.0 3e-42 MAKIQLPAEFKGFSEKFDALAYGQDASRVFDDMLAYIVDLFSFDNPWEPHGRYKDPEIRK RFFELFQEIVLLMNKKICDDREWYDPFGNLYQTQIASHARRANAGQFFTPEHIVDLMVSI NGEGRELTGKGLNFGDPACGSGRFLIAAHAKFPGNYCCGEDIDRTCALMTVCNFILHGVN GEVIWHDSLMPTKERFYGAWRVCPRPDLMGCPQVSKMEWEDTLCYAVWQGCLEKHQRETP ENSTSESAPQQIVQLNLF >gi|313158802|gb|AENZ01000029.1| GENE 7 4235 - 4483 282 82 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0420 NR:ns ## KEGG: Bacsa_0420 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 72 1 71 74 63 46.0 3e-09 MRDEIRQNGEPLLWAEGEEWTSIPMIFQNITGKNFTGQEYRAYIENWILHQGFELGPVEL WCDGRFVERGEVRLAGKEVCHG >gi|313158802|gb|AENZ01000029.1| GENE 8 4480 - 4785 158 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158809|gb|EFR58192.1| ## NR: gi|313158809|gb|EFR58192.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 101 1 101 101 196 100.0 6e-49 MPIYKFYQDVLCTSWERRHFTVTAQNQEEADMIAAQCKDTPLCFDPDAEPGETVYCVFED ETLLETVEPLPITDNHGKPTIEVYRSNDDLFIADNYKNREL >gi|313158802|gb|AENZ01000029.1| GENE 9 5302 - 5832 113 176 aa, chain - ## HITS:1 COG:no KEGG:BF2908 NR:ns ## KEGG: BF2908 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 174 46 225 226 114 36.0 2e-24 MPADLNVESGSRGHQGLEYIIGLSKSESREEVCKTWDSLTQEEKDNRLLFGAKYFTNTMR YGFPTWYEWRTQNWGAKWNACNASKSGNIIFFGTAWSTPEPIIKALSVKYPDVTFEVEYA DEDVGNNVGSYSYKSGEQIHFIEMSGSQQGLGLAISLLGLETYFEFVDGQYRRKKE >gi|313158802|gb|AENZ01000029.1| GENE 10 6017 - 6376 56 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158814|gb|EFR58197.1| ## NR: gi|313158814|gb|EFR58197.1| conserved domain protein [Alistipes sp. HGB5] # 37 119 1 83 83 167 100.0 2e-40 MTENYLVRILTDELTSFTDLPFWQIQAKVCDAVHRAMRQTDGLDEFINALQNRIVRFSYK KTDGSVREAIGTMKPSVLSTLSAKMQSGQKRRSTGCIVYFDLEREDWRCFRPENFLSIL >gi|313158802|gb|AENZ01000029.1| GENE 11 7205 - 9184 2142 659 aa, chain - ## HITS:1 COG:no KEGG:Coch_1009 NR:ns ## KEGG: Coch_1009 # Name: not_defined # Def: glycoside hydrolase family 42 domain-containing protein # Organism: C.ochracea # Pathway: Galactose metabolism [PATH:coc00052] # 4 657 8 666 667 469 38.0 1e-130 MRKWLMAFLLLAGTAGVRAQQCDVPQIGAQVFVEPGQTPEDIDGFFRLLHDNGMKVARIR MFGAHMYRGGEWDFSLYDEAFRAADKYGVRLFATLFPVTDELNDVGGFKFPRSKAHLREI DDYITAVVSHFRQYESLWTWVLQNEPGSGGTRVAMTDLAREVYDRWLADFPPEERGEGYL KADFTQEKFLTYYTTWYLNHIAQLVERLDPQRGRHINPHQILGTLPDYDFPAYSKFLTSV GASLHLSWHFGMFSQREYPLGVSLMSDIIRHNALGNPFWITELQGGNVTASGNVPYCPTA AHTAQYLWTAIASGAEGVIFWSLNQRAAVMEAGEWGLLDFLRRPSDRMLEAAKVASVLQR HGEEFRGLKPAPAPVTLLYNIASLRIQRRNAETPASGEEGRQASACMKSLAAAYEAVSAW GVTPEVADMATFDWDDAAGRTAVIPHMVALPSEFRPRIESFVRNGGKLIVTGLSGFYDEN MRCLFMNGFPLKSCFGAEVSEFKVAGEYFTLGEELPAHLWRGILVPASDETMMTDGGDVA AVRNRYGRGEVVWVPSPIELGGYHRDMVPLTAFYGRECRDAIDAAPASFRTPEPDVLMRT MRKEGVLTTVIVNKRPESAAIRLRTGRYGVPRVIYGDASVKGAQVTVGADRCAVVVWSR >gi|313158802|gb|AENZ01000029.1| GENE 12 9188 - 10315 1152 375 aa, chain - 
## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 28 352 655 981 1087 163 32.0 6e-40 MGATLGSGTFEVPRTEAGACTECRVGYKAPRPESGVDYWLDIVYRLKEDKPYALRGFEAG RYQFALPAAAPVRAAAARSAVVSRTERSVTLTAGSVTATVDAATGYLSGYAVRGCELMRS PLVPNFWRASTDNDRRGWRTRERMGAWRTMPERLKLESLNAADGAVTAVVCGGGVRLALR YRLAADGELAVSYDLRIADTLPEPLRIGLQALWSGELDRYVYCGRGPGENYADRKEGSLF GVYSGSTADFSPAYIYPQECGNRCDVHYLQLGGKGGGVVFAGRQPLCVSVWPCTQEALDA AEHTHEIVRLDDAWLVNVDCAQAGVGGTDSWSVKSRPSEAYRLLEKHYGYEFVIAPAGTP ADAARTSRRVAYKNE >gi|313158802|gb|AENZ01000029.1| GENE 13 10509 - 11498 677 329 aa, chain - ## HITS:1 COG:MT0310 KEGG:ns NR:ns ## COG: MT0310 COG3119 # Protein_GI_number: 15839682 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mycobacterium tuberculosis CDC1551 # 26 313 9 296 465 101 30.0 2e-21 MKTTHLSGLLTTAGLVFGGGNSYAQQRPNILWLTIEDTSPYDFGCYGNRHVATPTIDSLA GAGIQYMRAHSVGTQSSPSRSCIITGCYASTFGMEWHRCRFATPETVFLPDYMRAAGYYC TNKSKTDYNTLCDNKAMWDSCGPAATYNNPARKKDQPFFAVFNSMATHMGRIRSYHTDQR RDFALEGLDPARLELPPHVPDIPEIRSDFAFHMEGSQDIDAWVGMFLDDLRRQRLDENTI VFFFSDHGGCLPRGKGFVYETGTHVPFIVYLPPKWRHLANGQSGRTDRLIGFPDLAPTVL SLAGIEPPAYMQGKAFLGLRRRFRKPREP >gi|313158802|gb|AENZ01000029.1| GENE 14 11709 - 14303 2434 864 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 41 854 59 877 891 504 35.0 1e-142 MKRLLLILWLLTAAAGSRISAAGRTELNQGWKFRQYGLGEWLPAAVPGTVHTDLLANGQI PDPFYGSSQSDLQWIDKTDWEYCCTFDAPELASYDNVRLVFEGVDCYADIRLNGELLHRT GNMFRTWKSDVKGLLKSRNNRLHVVFHSPVMQGLKNMAAYGSRLTANNDLGALGGLGPNK VSVFTRKSGYQYGWDLAPRYVTSGLWRPVALEAWNEARVEDFHVRTRSTGPRKAQMSASA ALRTDAAGCYRIRILLNGKSILTADKTLDAGTHSIEEPFEIPSPRLWYPNGMGEPYLYDV 
ELVLEKEGRELDRTAVRCGVRTVSLRCRDDADGRGRGFGFEINGIPVFCKGSNYVPADAF LPRISREKTEFLVRSAAQANMNMLRVWGGGTYESDDFYEMCDRYGIMVWQDFVFACNMYP GSAQIYADIRAEAEDNVRRLRNHPSLVLWCGNNEIDVAWKPHDKRNSRFRKFYTEEEAEQ FDRVNETIFRNILPGVVDSLCGGTVPYWHSSPSPGWGLDTADRWRYGDVHNWDVWHKGDP ISAYNTQIARFISEYGLQSYPELSSVERFIPEGERRLASPSMTSHQGDRKKGDARMLEYV DRSYLRSDDFARTLYLSQLMQAEGMKTAMEAHRRNMPYCMGSLIWQLNDVWPCASWSGID YYGRWKAMHYFVRKACEPVVVSPYIQGDTLDIFVVSDLRKPLRGVMKLTLTDFSGNELKS SSHPVTVGAAASQRALRYGVRDYLDGADPGNAVLVCEFRSDKTAYRALQYFETMKNAALP AAAVRIRAEKPDAATCRITVSSDKLAKNLTLHYKGTAGIFSDNYFDLLPGESRSVTFAAP EDAAEILRHIECMALNPETTIIKL >gi|313158802|gb|AENZ01000029.1| GENE 15 14300 - 15850 2148 516 aa, chain + ## HITS:1 COG:PA2333 KEGG:ns NR:ns ## COG: PA2333 COG3119 # Protein_GI_number: 15597529 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pseudomonas aeruginosa # 37 509 5 526 538 145 27.0 1e-34 MIHLKRIALPAGFAALAISQAAAQPAAPVPVARNGEKPRNVIFILSDDHRYDFMGFTGAV PWLQTPALDRMAREGACLKNAFVTTSLSSPSRASILTGLFTHTHTVVDNQAPKPDDLVFF PQYLQQNVYRTAFFGKWHMGNQDDMPQPGFDHWEGFRGQGTYYNTVLNINGERVKFDPEL YSTDILTDHAIDFVRDNEEGPFFIYLSYKSVHSGFQPSPSRKGMYKDEKAVYPPSFNVPE YGIPRLPGKDADGRPLAGRGWYGESRLPDWVKNQRESWHGVDYQYHGALPYEEDFRNYCE TVTSMDDAIGRLLDFLQAEGLGESTLVIYMGDNGFTWGEHGLIDKRNFYEPSVRVPMLAY CPELIPAGRTVEEMVQNIDVAPTIMAACGLAKAPQMCGESFLPLLKGGTAADWRKRIYYE YYWEYAFPQTPTVFGVRTDRYKYIRYHGIWDTNEFFDLQEDPYETVNLIDRPELQDTIRS LANDLYDWLETTGGMQIPLKRSVYYRHGDHRNAKTY >gi|313158802|gb|AENZ01000029.1| GENE 16 15874 - 19149 3833 1091 aa, chain + ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 64 1069 44 981 1087 691 39.0 0 MNRLLAILLCGTTLCTAAAASERNAPADQFVNALRRLPARATSYSYSSEQDALAGDRTLS RIQSLDGMWKFRFAEDVSRSPADFWLPGADLTGWDEIPVPSCWEMQGYGYPIYTNVVYPF EFKPPYITRDNPTGCYVRTFSVPEAWSGNRVTLHFGGVYSGFYVWVNGALAGYAEDSCLP 
SEFDITGLLQPGENRLAVQVFKWTDGSYLEDADHWRMAGIHREVFLSAKPDAAIGDFGVR TIFDADMRDALLQIRPAIDLREGASAAGWQLGARLYAPDGTPSGRELTLPVEEILAEAYP QRDNVYFALLEERITAPEKWSAENPALYTLVLTLRDAAGKLAEARSCKVGFRDVRLRGRE MLVNGVPVKLCGVNRHDHDQYGGKTVSRESMEEDVRLMKRLNINSVRTSHYPNDPYFYEL CDRYGLYVVDEANIETHGKGGLLSNDPQWITPFLERVSRMVIRDRNHPSVIMWSLGNESG CGPAHAAAAGWAKDYDPTRLIHYEGAQGQPMHPLYVPLKRTSAAAFTSVMAADNQPAAGQ VKKPRNGGNPTDPAYVDVLSRMYPTAAQLEQMALNPMLDRPVMMCEYAHSMGNSTGGLND YWKLIRTHAGLLGGHIWDWVEQGLVKKDAQGRTYWAYGGDFEPAGEHNDAAFCCNGIVNP DRTLKPAALECKYVFQPVEFTASDLAAGKIAVRNRNFFAGTERYDFTWEISTDKGVLQHG SFEVPPTAAGCRAEAAIAFRPFKPEPGAEYLLRVQAREKRRTPYAEAGYAAAEEQFALPV YKEPVRSAPKGRASVAQDDEFIVLSAAGSRTEIDRRSGYVVSHSVRGRKLISAPLRPNFW RASTDNDWRGWRVGKIAGCWKEAPERLQTVSVRIDESAGSVTVEKAIPDSVRLTLTYTLD GAGALAVDYKLDISDRMPEPLRVGLQTCVPNTLERIAYFGKGPQENYSDRCEGAFLGLYR STPEKFMHSYITPQENGNRCDVRWLSLTASDGRGVQFVGAEPLSVSVWNCTQESLDKARH SNEVEPLADALTVNIDRTQTGVGGTDTWSLKARPSDQYRLLEKHYACRFTIIPCNGEAET IRNGRSLFRNQ >gi|313158802|gb|AENZ01000029.1| GENE 17 19498 - 21642 3106 714 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 214 1 212 394 136 39.0 4e-30 MMKKFALWSLALLLTFAAACSDYDDTDLWNSVNDIKKRVEALETAVAKLNSDVSAMQTLI EKLNEKVYVSKVETNADGSYTIYFTDPENTKITITDGKDGAAAPVIGVAKDTDEIYYWTL TSGEGEPEWLLADGKKIPVTAAAVVPLLKIDDKGFWIISCDKGATWNYITDDKDQPVSAL GQGGVSDSFFKEVTQDGAYAYFTLMDGTTITVAMRSDFYLLVKMAPALGTYAYGETKTYE TESVGVNDVVLTKPTGWVVSYADEVLTIKAPAEEAVCETEGEIAVIYFGDNDRSSLVKMK VKIDKDYTGVTEGDNFTLEITEVGDTRVRAAITPKDPEMSYYVYPYDPAKSDEACITQLQ KRFKADIADGPEYVDYFFKGTKTDYRYSGLTPGESYDLSVVGVKYDLTAKTLDIVTPLMR VPFSTKAPEIINTAYLMTLSDISWYGAKCAVHPSDDLPYFCTFVKKSTFDMAYDDADFAQ TYIDDRYLWPYYGELSDGILLWSDFTATGDLAFSSPGFVQRDPLYISEDIYPLESDTDYY AVAFGCNENGEFSNSRVSRKLFHTKAFTPTEACTFTIDVTVDQQDLAIKVIPSDKNTSYI TFIDERDTYRDNFATPRQYPPYDLYWRMQGLEAGQTIGDDDCFYTGDAAYNVVSLKAASA YIVFAYGCSADGRITTEPEIVEVHTKGTVDQPDLNSAAKRRRTVSARPAYRVIR >gi|313158802|gb|AENZ01000029.1| GENE 18 21707 - 24043 3021 778 aa, chain - ## 
HITS:1 COG:BH1877 KEGG:ns NR:ns ## COG: BH1877 COG3533 # Protein_GI_number: 15614440 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus halodurans # 47 777 4 756 758 503 38.0 1e-142 MKWRSFPLNMLLAAVVLGTGPVSISAAVPRTRERHAAVQPQRYPVPLNDVRITGGPFLHA QEMDRRWLDSMDPDRYLSGFRSEAGLEPKAPRYGGWESAGCSGHGFGHFLSAAAMMYAAT GDRALLDKINYSIDGLAECQQKEGTGLLAGFERSRALFAELERGDIRSQGFDLNGGWVPF YTLHKMYAGLVDVCRYTPNAKALTVLVRFADWLDGLVAKLSDEQMDKILICEHGGITESL ADIYVLTGERKYLELARRFDHREILRPLAAGVDSLPGKHANTQIPKIVGAVREYECSGDE RYRRIADYFWHRVVGFHSYAIGGNSEYEHFGAPGMLANRLSDGTCETCNTYNMLKLTKHL YQLDPTVRRADYYERALYNQILASQNPDDGMVCYMSPMGSGHRKGFCLPFDSFWCCVGSG MENHARYGEFIYFTDARENLYVNLYIPSTLDWKSRGVKVEQLTDFPCSDEVRLRVEMSGA QRFVLNLRYPEWAAEGYELTVNGRPVKQKAKPGSYISVNRKWRSGDEVRFVLRQSLHSEP IPGDSTLRAYFYGPVVLSSVLEDKEEIPVIVADDVTDVSALVKCTDKKRLRFETLTAQPV QKELIPYYEVGGRRMMVYFQHFPETTWKEQLADMRLREHREEWLRERTVSQFTPGEMQPE RDHNLRGEKTAPHEFEGRKYRETLGGWFSFDMAVAPDVPNTLYCTYWGNRFYNHSFDIEI DGRKVGFENIHNWGPQYVERSYRIPAELTAGKELVTVTLRAIRDDAVAGPLFDCRIMK >gi|313158802|gb|AENZ01000029.1| GENE 19 24376 - 26685 2590 769 aa, chain - ## HITS:1 COG:FN0580 KEGG:ns NR:ns ## COG: FN0580 COG4953 # Protein_GI_number: 19703915 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein PbpC # Organism: Fusobacterium nucleatum # 30 767 10 720 724 377 31.0 1e-104 MTEKLKRKTIYFLLLPLLAAAAAGYCFCLPRTLFDEPFSATVWSRDGRLMSAKVASDGQW RFFPTDSVPYKFRVAITTYEDKRFYRHFGVDPLALGRAVRQNLASGRVASGASTLTMQTI RLSRGRKARTFREKIIETILATRLEFRYSKEEILALYASHAPFGGNVIGLESAAWYYFGR SAGSLSWGESAMLAVLPNSPALIHIRRNRERLRRKRDDLLERIWRGGHIDSLTCALARQE PLPDAPEPMPMQAMHLLGRMRGGSLRSTLDYDLQRRVNELALRHNRRNRGNKINNLAVVV MDVKSGEVLAYVGNVYDPADRSEGTSVDVVRAPRSSGSVLKPLLYAAMLDNGTALPTMLF PDVPTYYKDFTPQNFNHDFSGAVPANRVIERSLNVPSVKMLDKYGRENFLALVRNLGFGT IDRSADHYGLSLILGGAEITLWDLTSVYAMLAAKLGGGPVRIPHCDPEAVKTVSADDIPL SRGAIWLMFNSISHVARPEEEGEWQYFSSSKKIGWKTGTSYGNRDAWAVGVTPDYAVGVW 
VGNCSGEGRPLMTGVGYAAPLLFDVFGLLPGGEWFAEPHGDLEAAVVCRQSGCLASHICP DRDTLMIPRTAAAGEVCPYHRIVNLSRDLRYRVTADCYDPAQIVRMPMFILPPAQEWYYR KQHPDYRPLPPLHPGLSGSGAGGNPIDIVYPQPGRVLVAPKSLEGEAQSLVFTAIHRDRN AVLYWHIDDNYIGATTLEHKISVRPAPGAHRLTVIDEHGARQSVAFSCR >gi|313158802|gb|AENZ01000029.1| GENE 20 26725 - 31977 6757 1750 aa, chain + ## HITS:1 COG:FN0579 KEGG:ns NR:ns ## COG: FN0579 COG2373 # Protein_GI_number: 19703914 # Func_class: R General function prediction only # Function: Large extracellular alpha-helical protein # Organism: Fusobacterium nucleatum # 232 1750 54 1611 1611 444 25.0 1e-124 MKAIRYVLFATALLLFASCGRQPSDIIESSPNGVTGVKMPILVTFKENVEPKEDRIREAV SISPSVDFDVYLSGMRMLRIIPRSPLKYDTRYKVTIDAAKLTGRQLKGVAEFEFATPKLR FAYSDYWLQQSDDMTSYVLVGEVVSSDYAESGYVEKNLKISGLKNPDVAWVHSANGTTHQ YTVGNIAPGEGAGYTLTLDFGYGDSKTLTVEVPKKDEYVVLDHSVAAEPLAVVVTFSEPL KQNQNLKDLIRFDTKFRTSVDKNRLYIYPESHVTGNFDIEISRNVLSKGGQRLKESYTFT ANLPSRVPAIRFAGKGSVLPSSNDMSLLFESVNYRKARVRVRKIFANNLLQFFQQNYYGG DYYSDMDYVSRIVRDTTVDLSAKASTKLDLSNTYSLDLSRLITDGRKSMYLLEIKGVEPL VPTDEYDYDYYFGDYRTYRERSKLVLQSDIGIICKNSGEEEYVVYTTDLLSARPKGGCKV RAYDKQNQTVAEASTDSEGRAVLKCREEPSIVAAEADGQLAFVKVERGAALSLSNFDVGG TTNPKGTKGFLFGERGVWRPGDDIFLTFIVTSDNPLPENHPASVEFFNPNGQLVQTLVSN GSSDGIYTFKLGTTPAAPTGQWLARVSLGGAVFEKAIRVDAIKPNRMKIDMRLGDGKRID ARKFTGSLTAKWLHGAPADGAKVTLQAQLRQIPTRFKGYEKYSFDDATKYFETEEREIVS GTTDQQGTLQLTTGGLASLEGLSPGMLSGKFTVKVFEKSGDFSVDQQIIPVSPYDAYFGI GVTPQTSDWGDEYLDSKKEHLLRIVMLDAQGRPLPGREEALVSVYKITSYWWWDAASDSQ ARYAKNALNTCYKTLQTTLTDGAGQVAMRWSAGDYGYYMIRVTGSGHEHAATQVVCVSSS DWRGDVSSVTDAATRLAVTKDKEKYAPGDKAKITIPSSPGARALVSVESGSVVRESFWID CADKQTAFEILIKAGMAPNVYASVTLVQPHNHTHNDAPIRLFGVVRLDVEDAATKLTPVI DMPETVRPESEITIKVREKDGKKMSYVLALVDEGLLGLTRFKTPNPYLHFNATEALGVRT WDMFDHVIGAYGGRIEQLFAIGGDAEQQQNTGALKAQRFKPVVRFLEAQKLGAGKTNTHK IALPPYFGSVRVMVVASNGRASGAAEKVAEVKKPLLVQATLPRVVSTDEEVELPVTVFAL AKGVGKIDLKVSANELFTAVGPRSKMLALSQSGEEVVTFRLKVNKETGVGKVRVTATSSG DSSVSEIELDVREPNPYVTTSEDYMLEPGKTIALRPLKDTGNAKLELSSIPSADLTRRME YLVRYPHGCIEQITSGAFPQLYLPAVMECDVRTLQDIDRNVKSVLSRLGGYQLYNGSFAY 
WSGGSNSSEWGTAYAAHFLTEAAKYGYAIDRPMLDRALKYLRGNEADSFLTQAYAQYVLA LNNMADRGAMNRLREKAADLKNDVKWMLAAAYALDGNRKVAEELTAQGGSGQTGKVDPYD DTYNSSERQMAVVLMTQTLLGKREEAFRTALKMSDILKKEKWLSTQSTAWMLSTLSNFIV SGQTGIDAKAGKESIRTDKSFVSMPLTEETGVTNNGKESLYAVVSQRYNPAKGEETEAAD NIRIAVRYTDMDGKAVDPKSIRASTDFYAVVTVSNISGYEKYTNLALTHIVPAGWEITSE RDLTSVTYQDIRDDRVLSYFDLKRGESKEIPVKLTATYKGRYYLPSIYCEAMYDNSVRAL KKGEWIEVVE >gi|313158802|gb|AENZ01000029.1| GENE 21 32176 - 33351 1442 391 aa, chain - ## HITS:1 COG:CAC3482 KEGG:ns NR:ns ## COG: CAC3482 COG0477 # Protein_GI_number: 15896719 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Clostridium acetobutylicum # 4 390 8 394 394 225 33.0 1e-58 MKAMLKEGAGLSNGLLCTLAVIAGVSVANLYYNQPLLDMLRQDLGTTTLAANHVALFSQL GYALGLLFIIPLADLFSRRRIVLVNFLLLAVSLLAIATASDIRVIHGFSLVTGVCSVIPQ IFIPLAAQYSRPEYKNRNVGIVLSGLLTGILASRVVSGVVGELFGWREMYFAAAGLMVVS AAVVLYVLPDARPNFRGTYAALMKSLLTIVRCYPTLRIYSVRAALAFGSFLCFWASLAFK MAQAPFYAGSNVVGMLGLCGIAGALTATFAGKYIRRVGVRRFNYIGVGLQILAWLLFFFG ADSYAALVAGIVVVDIGMQCIQLSNQATLFELDPSASNRINTIFMSTYFAGGSLGTLLSG AAWSLYGWTGVVAAGILLSSASLLVTLCTKK >gi|313158802|gb|AENZ01000029.1| GENE 22 33348 - 34367 343 339 aa, chain - ## HITS:1 COG:FN1468 KEGG:ns NR:ns ## COG: FN1468 COG1853 # Protein_GI_number: 19704800 # Func_class: R General function prediction only # Function: Conserved protein/domain typically associated with flavoprotein oxygenases, DIM6/NTAB family # Organism: Fusobacterium nucleatum # 19 184 20 185 197 171 46.0 2e-42 MKESWKPGTLIYPLPAVLVSCGATPDEYNLLTVAWTGTVCTDPPMCYVSVRPERHSYGFI RRTGEFVINLTTRGLARAADWCGVRSGRDYDKFREMGLTPGKVLKVAAPIVEESPVSIEC RVRQVLELGTHDMFLAEVVAVQVDADYIDPATGRFCLERACPIVYSHGEYFALGEALGHF GWSVRKKPRPKTPKSETGTKPVTGKKSAPGTKPVAGTKSLTGTKTASGVESEDRTSSALS SATPSASLSSSASAMTPASALSSSALSSATPPASPSSSASAMPSASALSSSASSAASSAG PNSNSTPSRNPNRDPNRRPGRNPDRRQGRKSGPEKPRRR 
>gi|313158802|gb|AENZ01000029.1| GENE 23 34376 - 35194 958 272 aa, chain - ## HITS:1 COG:PM1451 KEGG:ns NR:ns ## COG: PM1451 COG0627 # Protein_GI_number: 15603316 # Func_class: R General function prediction only # Function: Predicted esterase # Organism: Pasteurella multocida # 2 272 3 267 269 182 35.0 7e-46 MKRIFILAFLLAWLTAQFAAAAAVDTLAVRSASMDRDIPVIVILPDGASPANPCPTVYLL HGYGGNQTTWLRIKPSLPAIADREGIAFVCPDGATSWYLDSKVRAKSLYETFMTRELLPA VEERYPVSRDRSGRAITGLSMGGFGAVSLAIRHKELFGAVGSTSGGLDIRPFPENWEIPQ LLGTQAEHPEAWEAATPINLIPRIADGDLAIVIDCGYDDFFFGVNNDFHAELLRRGIMHD FYVRPGRHNNEYWGNSIDYQIVFFRNFFFRDK >gi|313158802|gb|AENZ01000029.1| GENE 24 35247 - 36149 1167 300 aa, chain + ## HITS:1 COG:Cgl1981 KEGG:ns NR:ns ## COG: Cgl1981 COG0582 # Protein_GI_number: 19553231 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Corynebacterium glutamicum # 1 295 32 314 315 173 37.0 3e-43 MLTEFIRYLSAERRYSPLTVRNYKHDVEQFLAWLDCDESRFDPRSVTTEQIREWIIFRTE EGKLSAGSMNREVASLRAFFRWLHRTGAVEKDIFRMISTLKTSRRLPAFVPESRMTTIVS ECGPDSEDFQTERNSLIILMFYACGLRLAELVGIDRSDFSADYTSLRVRGKGDKQRMVPI LEFLREKILHYLGLIERQNICISSEKALFLTHKGKRISRSVVYRTVQEELTRAGVQGKKS PHVLRHTFATHLLNGGADMREIQELLGHASLQATQVYTHNSIARLREIYAKAHPREKGGK >gi|313158802|gb|AENZ01000029.1| GENE 25 36174 - 36470 211 98 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755828|ref|ZP_02162946.1| 30S ribosomal protein S21 [Kordia algicida OT-1] # 1 95 4 98 102 85 43 3e-16 MNVQIQSVKFDADKRLIEFVEHKMEKLDRFAERSTGAEVILKLDKDHEKGNKIATITLHM PGEDLVACHQSKAFEESVDEAIDALKRQLDKFKSKSEK >gi|313158802|gb|AENZ01000029.1| GENE 26 36504 - 36689 98 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPQRPSFLGRPLHLKTSATWRRGKLETHAIWKHTSFENRGKRKKTGTRNYPEAETARNSG E >gi|313158802|gb|AENZ01000029.1| GENE 27 36959 - 37696 1026 245 aa, chain + ## HITS:1 COG:no KEGG:Cpar_1045 NR:ns ## KEGG: Cpar_1045 # Name: not_defined # Def: hypothetical protein # Organism: C.parvum_NCIB8327 # Pathway: not_defined # 126 245 129 250 262 70 33.0 5e-11 
MKYIKTAILIAGMAAATAACTRKPVVPQFGMLTIDTLIGTPANGCKIEYRFATIANAEKS PALRSIEAANAGYFFELEEFGGTARQAADSALRQIAAELAFPQSAPQMTEPYEISAEAEA AVTDSLVTYIISRWSYTGGAHGMYATECHTYSLAGGYELSTADLFSERQLLGMEALLRRK LCEQYEAANDDELAERGFFPEYISLTENFRITPEGITFYYNPYDIGCYALGAVDVTMSRE ELENL Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:01:24 2011 Seq name: gi|313158799|gb|AENZ01000030.1| Alistipes sp. HGB5 contig00040, whole genome shotgun sequence Length of sequence - 1620 bp Number of predicted genes - 3, with homology - 2 Number of transcription units - 3, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 145 - 897 728 ## BF1988 tyrosine type site-specific recombinase - Prom 945 - 1004 2.2 + Prom 772 - 831 3.3 2 2 Tu 1 . + CDS 859 - 1101 75 ## 3 3 Tu 1 . - CDS 1187 - 1360 143 ## BF1833 putative bacteriophage integrase - Prom 1510 - 1569 8.1 Predicted protein(s) >gi|313158799|gb|AENZ01000030.1| GENE 1 145 - 897 728 250 aa, chain - ## HITS:1 COG:no KEGG:BF1988 NR:ns ## KEGG: BF1988 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 6 248 163 405 409 199 41.0 1e-49 MPLAELEQSFIEQYHVYLKSDLGLKPTTVSGYLKCLKYVVKIAFNNGWMPRNPFSLYQYT APNPERSFLTEDELRRMMTTELRYKRQDYNRDMFLFSCFTGICYADMASLTYDRIEQDAQ GEWWISGNRQKTETRYVVKLLPYALFILNKYRGLTGDGRVFAMSTLDSIDDSLKNIAREC GIDKQLSFHLARHTYATTICLSNGVSLETLSKMLGHKNITTTQIYAKVTPPMIDREVTML REKLATKFSV >gi|313158799|gb|AENZ01000030.1| GENE 2 859 - 1101 75 80 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLDERLLQFCQGHIAHVILLTHESPQMVDGFIVAVPCTVAAVDADTFLEVAGEFLERFEK QPVFLAEAEIGILDLFGRNV >gi|313158799|gb|AENZ01000030.1| GENE 3 1187 - 1360 143 57 aa, chain - ## HITS:1 COG:no KEGG:BF1833 NR:ns ## KEGG: BF1833 # Name: not_defined # Def: putative bacteriophage integrase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 45 1 45 410 65 68.0 4e-10 MRSTFKVLFYLKRNKCKDQKVVPVMGRITVNGSIAQFSAKLSVPEALGGQRRPRQRP Prediction of potential genes in microbial genomes Time: 
Wed Jun 22 12:01:38 2011 Seq name: gi|313158789|gb|AENZ01000031.1| Alistipes sp. HGB5 contig00068, whole genome shotgun sequence Length of sequence - 10855 bp Number of predicted genes - 7, with homology - 6 Number of transcription units - 5, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1589 - 1648 4.5 1 1 Tu 1 . + CDS 1681 - 2076 348 ## gi|313158795|gb|EFR58180.1| DGQHR domain protein + Term 2145 - 2183 -0.0 + Prom 2108 - 2167 5.8 2 2 Tu 1 . + CDS 2234 - 2392 241 ## + Term 2419 - 2464 8.1 3 3 Tu 1 . + CDS 3099 - 5195 2972 ## COG1509 Lysine 2,3-aminomutase + Term 5318 - 5352 -0.3 + Prom 5231 - 5290 6.0 4 4 Op 1 . + CDS 5389 - 6468 1466 ## COG4299 Uncharacterized conserved protein + Term 6491 - 6530 -0.5 5 4 Op 2 . + CDS 6565 - 7281 1084 ## COG1741 Pirin-related protein + Term 7361 - 7403 9.8 - Term 7354 - 7387 6.1 6 5 Op 1 . - CDS 7464 - 9047 2335 ## COG4108 Peptide chain release factor RF-3 7 5 Op 2 . - CDS 9112 - 10671 2472 ## COG0029 Aspartate oxidase - Prom 10698 - 10757 3.1 Predicted protein(s) >gi|313158789|gb|AENZ01000031.1| GENE 1 1681 - 2076 348 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158795|gb|EFR58180.1| ## NR: gi|313158795|gb|EFR58180.1| DGQHR domain protein [Alistipes sp. 
HGB5] # 1 131 554 684 684 240 100.0 2e-62 MKEDFKTKLEQAYGSQWFKRGVPKAVYDKANQLASEKNYEITDASEEYSPWDCLTLIDYR RIATYGSNWRDIFEKYYTKPGEEKGGNKEAKTEWMQKLERIRNNNFHTYSVKEEEFEFLS ELHKWLIETSD >gi|313158789|gb|AENZ01000031.1| GENE 2 2234 - 2392 241 52 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSEIEIKQMQEKIDAGILLAQKRLIEKTKKEDGELVVVRDGKVVRIKARDLK >gi|313158789|gb|AENZ01000031.1| GENE 3 3099 - 5195 2972 698 aa, chain + ## HITS:1 COG:MJ0634 KEGG:ns NR:ns ## COG: MJ0634 COG1509 # Protein_GI_number: 15668815 # Func_class: E Amino acid transport and metabolism # Function: Lysine 2,3-aminomutase # Organism: Methanococcus jannaschii # 135 667 181 619 620 147 25.0 6e-35 MKQRKMLALTLYQLGQLYRHELPELVAAAEASGDAAQFRRRLGEWVSRSESAQGDAGEQI RLLIDYDGREVDELSTGERMPVRTLTLLWQFLTGCLENVEMRTDLFIDLFYLFKRLGGAE LPPLSMQRVRIRSERWPSGLDAEVREIRADNRERMLHLLVQKIENRKSTPSRFRFAEGMS YEDKYRQVAIWWNDFRFHLAMAVKSPSELNRFLGNSLSAETMYLLSKARKKGMPFFATPY YLSLLDVTGKGYGDEAIRSYILYSPQLVETYGSIRAWEKEDVVEAGKPNAAGWLLPDGHN IHRRYPEVAILIPDTMGRACGGLCASCQRMYDFQSERLNFEFESLRPKESWDHKLRRLMT YFEEDTQLRDILITGGDALMSQNKTLHNILEAVYRMACRKRKANAGRPDGEKYAELQRVR LGSRLPAYLPMRIDDGLVEVLREFKQKASAVGVRQFIIQTHFQSPLEVTPEAQEAIRRIL AAGWLVTNQLVYTVAASRRGHTTRLRQVLNALGVVCYYTFSVKGFQENYAVFTPNSRSLQ EQHEEKIYGRLTPEQAGELDAELEKGGDTAERLRRFMKRHRLPFLATDRSVLNLPAIGKS MSFRLVGITPEGRRILRFDHDRTRRHSPIIDRMGEIYIVENKSLAAYLRQLENMGEDADD YASIWNYTHGETEPRFGLYVYPDFPFGVTGRVSNLELE >gi|313158789|gb|AENZ01000031.1| GENE 4 5389 - 6468 1466 359 aa, chain + ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 30 304 23 306 375 166 36.0 9e-41 MSLDALRGFDMLFIMGFASLVVAVCGLWPSAVTDAAAASMSHVAWDGFAHHDTIFPLFLF IAGVSFPYSVAKQRAGGMSEGRIYAKIVRRGLTLVVLGMVYNGLFKLDFENLRIASVLGR IGLAWSIAAVLYLNFGVKTRAAIAVAVLAGYGALSALVAAPDAAGAGPLTFEGNLAGYID RQFLPGKLIYGSFDPEGLLSTVPAVVTAMLGMFTGEFVRRSDIRGGRKTLWMAAAAAALL AAGLAFSGVLPVNKKLWSSTFVCVVGAYSLGMFALFYYLIDVRGWRRWTLFFRVVGLNSI TIYLAQRIVGFGRISDFFLGGIASKCPEAVAAVISSAGYVAVCWLFLYFLYRKNVFLKV >gi|313158789|gb|AENZ01000031.1| GENE 5 6565 - 7281 1084 238 aa, chain + ## HITS:1 COG:all1172 KEGG:ns NR:ns ## COG: all1172 COG1741 # Protein_GI_number: 17228667 # Func_class: R General function prediction only # Function: Pirin-related protein # Organism: Nostoc sp. PCC 7120 # 1 234 6 237 238 184 41.0 8e-47 MNKTIHRAESRGYADHGWLQTHHTFSFANYYDPRRVHFGALRVLNDDTVAPGEGFGTHPH DNMEIVSIALEGALRHGDSMGNMKVLRPGEIQVMSAGTGITHSEMNASETEPVKFLQIWV LTDAQDHTPRYNQVELAPAKRNVPHVIVAPEGRGGEHVGWVHQDAWFYTLDLDKDRVVEY RMNTRGHGAYVFVIEGSVEAAGEELGPRDGMGITGADEFPIKGETDAKVLIIEVPMEV >gi|313158789|gb|AENZ01000031.1| GENE 6 7464 - 9047 2335 527 aa, chain - ## HITS:1 COG:XF0174 KEGG:ns NR:ns ## COG: XF0174 COG4108 # Protein_GI_number: 15836779 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptide chain release factor RF-3 # Organism: Xylella fastidiosa 9a5c # 6 526 21 540 548 549 52.0 1e-156 MTLQEEITRRRTFAVIAHPDAGKTTLTEKLLLFGGAIHVAGAVKSNKIKKGATSDFMEIE RQRGISVATAVMGFEYKGAKINILDTPGHEDFAEDTFRTLTAVDSVIIVIDGAKGVESQT RKLMEVCRMRSTPVIVFINKLDRPSKEPFDLLDEVEKELNIRVRPLAFPISNGPTFKGVY NLYEKNLSLFTSDEKLTADASTVEISDLASSELEAQIGDKFAAQLRSDVELVEGVYDDFD RGAYLRGELAPVFFGSAVNNFGVRELLECFVRIAPSPCPAPTETRIVEPAEPKMTGFVFK IHANMDPNHRDRIAFLKICSGTFERNKNYLHVRSGKQLKFSSPTAFMAEKKSIIDFAYPG DIVGLHDTGNFKIGDTFTEGEKLKFTGIPSFAPEQFRYIENADPLKFKQLAKGVDQLMDE GVAQLFVSSLNGRKIIGTVGALQFEVIQYRLEHEYNAACRWEPISIYKACWIESDNAAQL ADFKRRKHTNMAVDKHGRDVFLADTSYGLALAQENFKDIRFHFTSEF >gi|313158789|gb|AENZ01000031.1| GENE 7 9112 - 10671 2472 519 aa, chain - ## HITS:1 COG:PA0761 KEGG:ns NR:ns ## COG: PA0761 COG0029 # Protein_GI_number: 
15595958 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate oxidase # Organism: Pseudomonas aeruginosa # 4 516 8 520 538 488 48.0 1e-137 MKTDFLVIGSGAAGLSFALKAAAHGHVTIVTKGEMNECNTNYAQGGICSVTYAPDTFEKH IRDTLVCGAGKCDEEAVGLVVRRAPELIRDLIEWGTRFDKTPDGRFELNREGGHTEHRIL HHEDLTGAEIERALIESVRRHPGITVLEHRFAIDLLTQHHLGEFVTRHTRGLTCFGAYVL NLETNEIETMLAKFTVVATGGCGNIYSTTSNPVVATGDGIAMCHRAKAITENMEFIQFHP TTLYNPGEKPNFLITEAMRGFGAILRLPGGEEFMDKYHPMKSLAPRDVVARAIYREMTKR GSDFVYLDVTHKDPDAIRSHFPNIYEKCLSIGIDITKDWIPVTPAAHYCCGGVKVDTNGE TSIKRLYALGETSCTGLHGANRLASNSLIEAVVYADQAARHASSLLDRVEIQEGIPDWDF EGTQHTEEMLMIIQSKREMQTLMSNYVGIVRSNLSLKRAMRRLEILWQETEELYNKTKPN RELCELRNMIAVAYLVIKQGREIKESVGCHYNADYPKEN Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:01:53 2011 Seq name: gi|313158783|gb|AENZ01000032.1| Alistipes sp. HGB5 contig00061, whole genome shotgun sequence Length of sequence - 7130 bp Number of predicted genes - 11, with homology - 8 Number of transcription units - 5, operones - 4 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 1 - 35 6.7 1 1 Op 1 . - CDS 64 - 2139 2127 ## COG1475 Predicted transcriptional regulators - Prom 2177 - 2236 3.7 2 1 Op 2 . - CDS 2238 - 2582 107 ## - Prom 2740 - 2799 4.2 + Prom 2443 - 2502 5.4 3 2 Tu 1 . + CDS 2621 - 3022 256 ## gi|291513800|emb|CBK63010.1| hypothetical protein AL1_03290 + Term 3255 - 3287 2.0 + Prom 3542 - 3601 7.0 4 3 Op 1 . + CDS 3653 - 3994 380 ## gi|291513799|emb|CBK63009.1| hypothetical protein AL1_03280 + Prom 4090 - 4149 2.5 5 3 Op 2 . + CDS 4186 - 4362 277 ## gi|291513798|emb|CBK63008.1| Histone H1-like protein Hc1 + Term 4385 - 4419 5.3 6 4 Op 1 . - CDS 4470 - 5261 1120 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member 7 4 Op 2 . - CDS 5279 - 5491 165 ## 8 5 Op 1 . - CDS 5618 - 5914 454 ## gi|167754261|ref|ZP_02426388.1| hypothetical protein ALIPUT_02554 9 5 Op 2 . - CDS 5967 - 6389 149 ## 10 5 Op 3 . 
- CDS 6407 - 6622 303 ## gi|313158785|gb|EFR58171.1| hypothetical protein HMPREF9720_2326 11 5 Op 4 . - CDS 6635 - 7072 329 ## gi|313158784|gb|EFR58170.1| hypothetical protein HMPREF9720_2327 Predicted protein(s) >gi|313158783|gb|AENZ01000032.1| GENE 1 64 - 2139 2127 691 aa, chain - ## HITS:1 COG:DR0012 KEGG:ns NR:ns ## COG: DR0012 COG1475 # Protein_GI_number: 15805053 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Deinococcus radiodurans # 73 238 29 193 288 121 43.0 6e-27 MQTAKANSKSKQSGKSAVRTPAVSEAVAAQAQDPVAAGSPDMKPAEPQATDIGAVNPVAE TSPVEQPAVHPETDVRLLDLNKIVNSTYNPRKNFREETLLELAESIRQSGVLQPICVRPR DEGFEIVYGERRYWAAAMANLKFIPALVRDLSDAEAEDAAITENLQREDVRPREEAAAYK RALQSGRHTIESLVGKFGKSEAYIRSRLKLCELIDALAGMLDKEEISVGVATEIAKYPAD IQQEVYNDHFAEGCYSSWKTARIKEIARRLYERYMTKLESYNFDKTECLSCQHNTANQVL FKDECTGGCAGCQNRECMIRKNNEFLVQKAVKLLKDDPRTTLATGGETPAAVQEALEQEG YHVEELEYSVYHYDKGPQMPDAPQAEEFESEEDFTAAKEEYGAEMAVFAEETQQLEFDIS EGRVRKYAIIGNLDIEIRYEEIEDEEREVTVNEGQDDEHKVFVTVVPPSPLEGLMQQDRR NREICYEHITTDMKRVFLDVKVANKPLQKEEQQMFYYAVMQRVMSDSKLRQCGFRPKEGS YLTDREQFAAAGRITVKQQAALVRAFLVDYFRSAAPEYGCTDETLLTGMMCRFADMNFTE QSQKVQQEYLQVYERRKARLQEQIDALQAKAESEELAVSMQEAPDVEPEELPDLLPDETP DAEPSPEPLIIPMDPDIEPDTRMPEEMKAAA >gi|313158783|gb|AENZ01000032.1| GENE 2 2238 - 2582 107 114 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEPANFPEIYTGNEDESDRMQDIAGCFDPIIPRNDGFQYDIEAAASDVCHGKDKWKLPIL PVAPGKIIPGRSAVPPSPAPGRKNHPRRGCGIFPSLHLRRGLMNLPEATDPFGI >gi|313158783|gb|AENZ01000032.1| GENE 3 2621 - 3022 256 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513800|emb|CBK63010.1| ## NR: gi|291513800|emb|CBK63010.1| hypothetical protein AL1_03290 [Alistipes shahii WAL 8301] # 1 133 1 133 133 259 98.0 4e-68 MRKSVFILLTVFMACLLCKESENMPGTSRKFMDETFLNRGEENLLARQWQCEYIYSDLSF TSVLPEAKKTRSTVSMFRHGAHGRLRATVLSMCCVTYGISGVAGLAARVHGKGFVAGLHA VDYYVYRLHRLII >gi|313158783|gb|AENZ01000032.1| GENE 4 3653 - 3994 380 113 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513799|emb|CBK63009.1| ## NR: 
gi|291513799|emb|CBK63009.1| hypothetical protein AL1_03280 [Alistipes shahii WAL 8301] # 1 113 1 113 113 197 100.0 3e-49 MVELDKSQKKIARTLISRALERECCTFLAKLKRLLQDEKAQSCHEKYLEIYKSIQTFDKD ISRQYDGLNGSRYALTVFSLFYNGILTEKDLSEFDDRTREAFLEHRRQWNLEL >gi|313158783|gb|AENZ01000032.1| GENE 5 4186 - 4362 277 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513798|emb|CBK63008.1| ## NR: gi|291513798|emb|CBK63008.1| Histone H1-like protein Hc1 [Alistipes shahii WAL 8301] conserved hypothetical protein [Alistipes sp. HGB5] # 1 58 1 58 58 67 100.0 3e-10 MNELVKTIDELYAAFKTDAELQAGKNNKAAGLRARKVSLELEKKLKEFRKTSLAAAAK >gi|313158783|gb|AENZ01000032.1| GENE 6 4470 - 5261 1120 263 aa, chain - ## HITS:1 COG:all7071 KEGG:ns NR:ns ## COG: all7071 COG0507 # Protein_GI_number: 17233087 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Nostoc sp. PCC 7120 # 93 255 582 747 748 172 47.0 4e-43 MKIFERITKRYALQNYYLRRNFYWPDTLPEILDYWQQHNDLPFLYGGDNWPYDAMCERQR RKGVYASQYLTPDRTACQMAALAVRYFDNDSRIVGAANLNLLLQQALNPSGPSLGRGGYT YRQGDRVMQLRNNYAKEVFNGDQGYIREVDTEDRMLTVDFDGKKVEYDVTELDELTLAYA TTIHKAQGSEYPIVVMPVLMTHFVMLQRNLIYTGITRAKKICVLIGATKALAYAVRNVSV LKRNTSLRERLNPSLTTDGKLRG >gi|313158783|gb|AENZ01000032.1| GENE 7 5279 - 5491 165 70 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLGLAETSAGTMVVWECPRCGQKWMFHYRAQNSREAHDYAAQLLAYRTGDPDWRMTLNPD WIAVPRQKKE >gi|313158783|gb|AENZ01000032.1| GENE 8 5618 - 5914 454 98 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167754261|ref|ZP_02426388.1| ## NR: gi|167754261|ref|ZP_02426388.1| hypothetical protein ALIPUT_02554 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02554 [Alistipes putredinis DSM 17216] # 1 90 1 90 157 110 61.0 5e-23 MNLHEYYRNHKDAINASIMDIACDLAVGRLLNAHGAPFETFVEADDPDDPDGGTHYKEEY QKEYDTYYDKEYARVAKLMKFDYCQEDGVAASPEDTNT >gi|313158783|gb|AENZ01000032.1| GENE 9 5967 - 6389 149 140 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MKRYEFYRNQKITVIDCRYFSFEAENLETAVQKIKDLCADGQLDELSNDPTYQEDAAYQI PGTEYPLDIENNNGDPTVMIYSAADGNCITDNLPKRSIKCLSMPARRGGTDNSQRSKSES GKSLPNEYPEQTRSALRTPT >gi|313158783|gb|AENZ01000032.1| GENE 10 6407 - 6622 303 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158785|gb|EFR58171.1| ## NR: gi|313158785|gb|EFR58171.1| hypothetical protein HMPREF9720_2326 [Alistipes sp. HGB5] # 1 71 1 71 71 125 100.0 8e-28 MNMQESDFRSALEIITRNNRITVSFNTPIADNYSQVYPLLIHESNASVLKQLHEAGFSMS MTKKGLEVSKY >gi|313158783|gb|AENZ01000032.1| GENE 11 6635 - 7072 329 145 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158784|gb|EFR58170.1| ## NR: gi|313158784|gb|EFR58170.1| hypothetical protein HMPREF9720_2327 [Alistipes sp. HGB5] # 1 145 1 145 145 292 100.0 6e-78 MRRKFKLKVSYTFNGCFTLRAASGKEAARLVSEQCGTVLSAIQTTLGEESEVDWHFDKHP RLAIREVIALPETLDGEKLPRVQVFHPGQRVFRSDPDYEMSGCCVVCSGNDDKETPSDKD LILICCSDSETEVYPCELRSISDNH Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:03:16 2011 Seq name: gi|313158710|gb|AENZ01000033.1| Alistipes sp. HGB5 contig00042, whole genome shotgun sequence Length of sequence - 85484 bp Number of predicted genes - 74, with homology - 69 Number of transcription units - 42, operones - 22 average op.length - 2.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 11/0.000 + CDS 556 - 1401 1123 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 2 1 Op 2 . + CDS 1430 - 2434 1165 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 2475 - 2518 11.4 - Term 2463 - 2506 11.4 3 2 Op 1 . - CDS 2522 - 3589 1464 ## COG1887 Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC 4 2 Op 2 . - CDS 3614 - 5167 2004 ## COG2509 Uncharacterized FAD-dependent dehydrogenases - Prom 5291 - 5350 80.3 + TRNA 5266 - 5350 66.6 # Leu TAG 0 0 - Term 5471 - 5513 12.4 5 3 Op 1 . - CDS 5535 - 5819 135 ## 6 3 Op 2 . - CDS 5995 - 6183 97 ## 7 4 Tu 1 . 
+ CDS 6286 - 6687 83 ## gi|313157265|gb|EFR56692.1| ParB-like protein - Term 6579 - 6616 -1.0 8 5 Tu 1 . - CDS 6621 - 7694 1210 ## COG3049 Penicillin V acylase and related amidases - Prom 7818 - 7877 4.4 + Prom 7645 - 7704 3.0 9 6 Tu 1 . + CDS 7828 - 8073 263 ## + Term 8147 - 8178 4.1 - Term 8131 - 8170 8.4 10 7 Op 1 . - CDS 8194 - 9822 1974 ## BT_4471 hypothetical protein 11 7 Op 2 . - CDS 9825 - 11903 3002 ## BT_4470 outer membrane protein 12 7 Op 3 . - CDS 11936 - 12967 1021 ## BT_4470 outer membrane protein - Prom 13153 - 13212 4.6 + Prom 12944 - 13003 5.5 13 8 Tu 1 . + CDS 13163 - 14287 1280 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 14 9 Tu 1 . + CDS 14620 - 15204 928 ## COG3247 Uncharacterized conserved protein + Term 15242 - 15272 5.0 - Term 15436 - 15478 -0.2 15 10 Op 1 . - CDS 15483 - 15857 416 ## COG3189 Uncharacterized conserved protein 16 10 Op 2 . - CDS 15854 - 16255 402 ## COG0346 Lactoylglutathione lyase and related lyases - Term 16298 - 16327 3.5 17 11 Op 1 . - CDS 16349 - 16498 93 ## gi|313158765|gb|EFR58152.1| hypothetical protein HMPREF9720_1896 18 11 Op 2 . - CDS 16563 - 17147 762 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 17386 - 17445 2.7 - Term 17357 - 17384 -0.4 19 12 Tu 1 . - CDS 17487 - 18404 1219 ## COG0053 Predicted Co/Zn/Cd cation transporters - Prom 18434 - 18493 1.8 - Term 18533 - 18568 6.4 20 13 Tu 1 . - CDS 18607 - 20091 2395 ## COG0442 Prolyl-tRNA synthetase - Prom 20121 - 20180 7.6 - Term 20128 - 20168 9.0 21 14 Op 1 . - CDS 20188 - 21915 2376 ## BDI_1697 hypothetical protein 22 14 Op 2 . - CDS 21936 - 23093 1291 ## gi|313158729|gb|EFR58116.1| putative lipoprotein + TRNA 23367 - 23453 62.1 # Ser CGA 0 0 + Prom 23680 - 23739 4.5 23 15 Tu 1 . + CDS 23857 - 23988 79 ## 24 16 Tu 1 . - CDS 24114 - 24785 -133 ## Patl_2743 hypothetical protein - Prom 24817 - 24876 2.5 25 17 Tu 1 . 
- CDS 25003 - 25650 415 ## gi|313158719|gb|EFR58106.1| hypothetical protein HMPREF9720_1904 - Prom 25826 - 25885 7.5 26 18 Tu 1 . - CDS 26152 - 26838 439 ## Slin_3284 ATPase P 27 19 Op 1 . - CDS 27164 - 28645 899 ## COG3378 Predicted ATPase 28 19 Op 2 . - CDS 28648 - 28929 184 ## Palpr_1583 hypothetical protein - Prom 29039 - 29098 5.2 - Term 29036 - 29086 7.2 29 20 Op 1 . - CDS 29118 - 30047 567 ## gi|313158770|gb|EFR58157.1| hypothetical protein HMPREF9720_1907 30 20 Op 2 . - CDS 30050 - 31306 694 ## BVU_2469 tyrosine type site-specific recombinase - Prom 31396 - 31455 6.3 - Term 31486 - 31532 12.1 31 21 Tu 1 . - CDS 31634 - 32152 640 ## gi|313158782|gb|EFR58169.1| hypothetical protein HMPREF9720_1909 - Prom 32297 - 32356 6.6 + Prom 32164 - 32223 2.9 32 22 Op 1 . + CDS 32324 - 33427 1527 ## COG0668 Small-conductance mechanosensitive channel 33 22 Op 2 . + CDS 33438 - 35405 2332 ## COG1523 Type II secretory pathway, pullulanase PulA and related glycosidases + Term 35407 - 35446 -0.9 34 22 Op 3 . + CDS 35477 - 35737 456 ## gi|313158723|gb|EFR58110.1| transglycosylase associated protein 35 22 Op 4 . + CDS 35740 - 36054 480 ## gi|313158763|gb|EFR58150.1| conserved domain protein + Term 36079 - 36110 3.4 - Term 36067 - 36098 3.4 36 23 Tu 1 . - CDS 36116 - 36802 532 ## COG2135 Uncharacterized conserved protein - Term 36843 - 36886 6.4 37 24 Tu 1 . - CDS 36905 - 37339 565 ## BT_0923 putative periplasmic protein - Prom 37359 - 37418 1.6 - Term 37365 - 37400 -0.9 38 25 Tu 1 . - CDS 37458 - 38312 1202 ## COG1814 Uncharacterized membrane protein + Prom 38560 - 38619 2.1 39 26 Op 1 . + CDS 38667 - 39914 1712 ## Sph21_0818 putative secreted protein + Term 39940 - 39973 6.1 40 26 Op 2 . + CDS 39981 - 40250 377 ## Cpin_5022 hypothetical protein 41 26 Op 3 . + CDS 40262 - 40549 451 ## gi|313158766|gb|EFR58153.1| conserved hypothetical protein + Term 40554 - 40579 -0.1 42 26 Op 4 . 
+ CDS 40592 - 41563 1560 ## COG3546 Mn-containing catalase + Term 41588 - 41629 8.6 + Prom 41607 - 41666 1.6 43 27 Op 1 . + CDS 41693 - 43117 1917 ## COG0366 Glycosidases + Prom 43136 - 43195 2.6 44 27 Op 2 . + CDS 43304 - 43513 204 ## gi|313158761|gb|EFR58148.1| hypothetical protein HMPREF9720_1922 + Term 43547 - 43588 10.5 45 28 Tu 1 . + CDS 43687 - 44163 594 ## COG3476 Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) 46 29 Tu 1 . - CDS 44205 - 44906 751 ## COG0778 Nitroreductase - Prom 44945 - 45004 2.6 47 30 Tu 1 . + CDS 44937 - 46013 1354 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair + Term 46103 - 46127 -1.0 + Prom 46076 - 46135 2.1 48 31 Op 1 . + CDS 46219 - 51021 5532 ## Odosp_3427 hypothetical protein 49 31 Op 2 . + CDS 51005 - 53308 3400 ## Odosp_3428 surface antigen (D15) + TRNA 53466 - 53556 69.2 # Ser GGA 0 0 - Term 53575 - 53619 1.6 50 32 Op 1 1/0.200 - CDS 53641 - 54402 1011 ## COG1573 Uracil-DNA glycosylase 51 32 Op 2 . - CDS 54408 - 55667 1691 ## COG4277 Predicted DNA-binding protein with the Helix-hairpin-helix motif - Term 55786 - 55818 4.0 52 33 Op 1 . - CDS 55871 - 56482 694 ## COG0655 Multimeric flavodoxin WrbA - Term 56573 - 56610 1.0 53 33 Op 2 . - CDS 56631 - 56870 339 ## - Prom 56912 - 56971 8.3 54 34 Op 1 . - CDS 56976 - 57446 659 ## gi|313158724|gb|EFR58111.1| hypothetical protein HMPREF9720_1935 55 34 Op 2 . - CDS 57446 - 57946 391 ## gi|313158747|gb|EFR58134.1| hypothetical protein HMPREF9720_1936 56 35 Op 1 . - CDS 58165 - 58881 491 ## PROTEIN SUPPORTED gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 57 35 Op 2 . - CDS 58878 - 60122 1681 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component - Prom 60176 - 60235 4.0 + Prom 60093 - 60152 3.0 58 36 Op 1 . + CDS 60199 - 60948 1027 ## COG4221 Short-chain alcohol dehydrogenase of unknown specificity 59 36 Op 2 . 
+ CDS 60952 - 61719 1069 ## COG2738 Predicted Zn-dependent protease 60 36 Op 3 . + CDS 61727 - 63334 2284 ## COG2755 Lysophospholipase L1 and related esterases 61 36 Op 4 . + CDS 63331 - 64836 2463 ## COG1696 Predicted membrane protein involved in D-alanine export 62 37 Tu 1 . + CDS 64982 - 65629 887 ## FIC_00647 putative membrane-associated phospholipid phosphatase + Term 65747 - 65780 4.1 + Prom 65757 - 65816 2.3 63 38 Op 1 . + CDS 65858 - 67579 2824 ## COG1190 Lysyl-tRNA synthetase (class II) 64 38 Op 2 . + CDS 67623 - 68381 1152 ## COG0149 Triosephosphate isomerase + Term 68406 - 68437 3.4 - Term 68387 - 68433 7.9 65 39 Op 1 . - CDS 68473 - 69783 1961 ## Bacsa_0648 histidine acid phosphatase 66 39 Op 2 . - CDS 69799 - 70704 991 ## COG3568 Metal-dependent hydrolase - Term 70709 - 70763 12.1 67 40 Op 1 . - CDS 70797 - 72170 1970 ## Phep_0457 hypothetical protein 68 40 Op 2 . - CDS 72189 - 73496 1812 ## COG0477 Permeases of the major facilitator superfamily 69 40 Op 3 . - CDS 73513 - 75180 1921 ## COG1069 Ribulose kinase - Prom 75278 - 75337 9.5 + Prom 75634 - 75693 3.9 70 41 Op 1 . + CDS 75880 - 78993 4926 ## Sph21_2494 TonB-dependent receptor plug 71 41 Op 2 . + CDS 79006 - 80817 2301 ## Sph21_2495 RagB/SusD domain-containing protein 72 41 Op 3 . + CDS 80837 - 82381 2286 ## Phep_3771 metallophosphoesterase 73 41 Op 4 . + CDS 82406 - 84109 1545 ## gi|313158715|gb|EFR58102.1| putative lipoprotein + Term 84125 - 84177 18.0 74 42 Tu 1 . 
+ CDS 84188 - 85276 1229 ## BT_0193 hypothetical protein Predicted protein(s) >gi|313158710|gb|AENZ01000033.1| GENE 1 556 - 1401 1123 281 aa, chain + ## HITS:1 COG:Cj1135 KEGG:ns NR:ns ## COG: Cj1135 COG0463 # Protein_GI_number: 15792460 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Campylobacter jejuni # 2 253 257 515 515 189 37.0 5e-48 MTVSLIISTYNWPRALYLCLDSVMQQTVMPSEIIIADDGSGIATRDVVRHFEAVSPVPVR HIWHEDDGFRLAEIRNKAIAASRGEYVIQIDGDLILQRHFIQDHMIFAQPGCFVTGSRGI ITEMLTNQVLRGEITSLTPLMKGVRNSNNVVRIPLMAYLYRTLGPSRFVKGCNMAFWRSD LIRVNGYDEEFRGWGGEDSELATRLNNSGVRQRCMKFRGIVFHLYHGKCDRDRQSANEER YKQSLSEHRTRCRCGLDRHLSASERIVYTDAETAVPAGAGS >gi|313158710|gb|AENZ01000033.1| GENE 2 1430 - 2434 1165 334 aa, chain + ## HITS:1 COG:BS_yveT KEGG:ns NR:ns ## COG: BS_yveT COG0463 # Protein_GI_number: 16080481 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 2 244 4 237 344 125 31.0 1e-28 MLVSVITPVYNTAQYLDECIGSILSQSMTDFELLLIDDGSTDGSGAICDRYAEKDKRIRV FHIPNGGVSAARNLGLDNARGEFVVFVDSDDRITPDHLRQLADSDIGEDGVAFTNLFEER PASGRYPHGHTRIYAIPDCRITGGREACMPVLAQLLRRHCLGWTCNKMFSRATIERHGLR FDRSIRYAEDEIFTAQYCAHITHLVSNSNPTYHYRYVPTSLLRGKIDPMMLMRIRRYIHE QYKSLGYCDEILYLTTRTQFSRLRRELRRTKGWNAELANELAQGILDNWKFYRAYVRSEF RKGFYDTKALWIARLSCMINSRLWVKLVIKGLHI >gi|313158710|gb|AENZ01000033.1| GENE 3 2522 - 3589 1464 355 aa, chain - ## HITS:1 COG:XF1993 KEGG:ns NR:ns ## COG: XF1993 COG1887 # Protein_GI_number: 15838587 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative glycosyl/glycerophosphate transferases involved in teichoic acid biosynthesis TagF/TagB/EpsJ/RodC # Organism: Xylella fastidiosa 9a5c # 1 347 1 339 344 266 43.0 6e-71 MEPKKYLLFVTLPYAYSILRPLEHEIRRRGDSAAWFIEADCPVALEEGEVHLKTIREAVD YNPVAVFAPGNYIPDFFPGVKVALFHGYAIQKRIEAIDDHFTVRGWFDIYCTQGPSSTPY FKELERQYGFFRVYETGWPKADTYFSPETQLRPRNDRPVILYPPTFTRNVCSAPHLMKEI 
ELLAKTKPWDWIITFHPKLTDPDIIAGYKRIAAENDNVTFFEGPDKMPLLQRADAMLCDS SSIILEFMFLDKPVVTFRNSHPGPHLIDVDRPEKVGPALERALSRPEELMREIRAYTMHH EPHRDCRCSARVLDAVDDYIARGHAGLKRKPLNLIRKWKLRRQMRYYPLLEMFRR >gi|313158710|gb|AENZ01000033.1| GENE 4 3614 - 5167 2004 517 aa, chain - ## HITS:1 COG:L195271 KEGG:ns NR:ns ## COG: L195271 COG2509 # Protein_GI_number: 15673161 # Func_class: R General function prediction only # Function: Uncharacterized FAD-dependent dehydrogenases # Organism: Lactococcus lactis # 22 512 20 524 535 362 41.0 1e-99 MPQNITLVLTPRQAADAKYYTSLAARRLGMPEQDIALVRVVKRSIDARQRQPKVNLSLEV YADREPRPAPVHFDYPSVAGRTEVVIVGSGPAGLFAALRLIELGLRPVILERGRDVSARK VDIAQINRNGDVDPDSNYAFGEGGAGTFSDGKLFTRSKKRGDYNKALQTLVFHGATPEIL YEAHPHIGTDKLPRIMQRIRQTILDAGGGFVFNSRVTDLEIKGGRVRGVWCGATLVEGAA VVLATGHSARDIYELLHREGVRLEAKAFAMGVRIEHPQALIDSIQYHCETRGEYLPAAAY SLVSQENGRGVYSFCMCPGGFIVPAMTDAAQSVVNGMSPSGRTSPYANSGLVTEVRPADF EHLRAEWGELAGLKFQQQFEELARRYGGDRQIAPAQRVADFVAGRASASLARTSYIPGIV PSRLDRWMPGFIAQGLRQGLATFGRRMRGFVTNEAVVVGVESRTSSPVRIPRDPATLMHP ETAGLFPAGEGAGYAGGIISAALDGERIAEAVKNYIA >gi|313158710|gb|AENZ01000033.1| GENE 5 5535 - 5819 135 94 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCENTFLDYRHESDELLELFPQSASVEFSLWLPTVQTKYMVPAEANRFSLMLWDSIDKAF RCIRRREPGCCIDADKLLTYMTLYLEERSSVGMK >gi|313158710|gb|AENZ01000033.1| GENE 6 5995 - 6183 97 62 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSVRLCEILNYGDVMLEPGADHRRFTLWRPLAVCHRYCRLAKRSKFQDYRVNLTYAEMRK IG >gi|313158710|gb|AENZ01000033.1| GENE 7 6286 - 6687 83 133 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157265|gb|EFR56692.1| ## NR: gi|313157265|gb|EFR56692.1| ParB-like protein [Alistipes sp. 
HGB5] # 39 105 78 156 159 71 44.0 2e-11 MKKPPGLAAGSFLFFGCKGTECMGATGTETAGRKPHPAANRSSGNLRRPIAKGKPIRTRR SDRWTMRHISTERNALLRQKQLTGLADFFADEEFSPGDADEEAAENGPQPPSYSGIAICC TGCSRFRSSGWDW >gi|313158710|gb|AENZ01000033.1| GENE 8 6621 - 7694 1210 357 aa, chain - ## HITS:1 COG:BMEI0543 KEGG:ns NR:ns ## COG: BMEI0543 COG3049 # Protein_GI_number: 17986826 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Penicillin V acylase and related amidases # Organism: Brucella melitensis # 20 331 34 345 367 214 36.0 2e-55 MKTRNLLLGMAAVCGSSFQAVACTGIALTAADGSYVQSRTIEWARGVLQSEYVVIPRGQR LRSFTPLGADGMTFTAKYGVVGLSVVQKEFIAEGINEAGLSAGLFFFPQYGGYEPYEPAQ NDRTLADLQFVTWALTQFSTVDEVKAAVGGVRIVALEKTSVVHWRIGEPSGRQVVLEIVG GVPHFYENEVGVLTNAPGFEWQLTNLNNYVNLYPGDAPVRRLDGVTLRPTGGNSGFLGLP GDATPPSRFVRAAFYRATAPQRATGFETVQQCLHLLNNFDIPIGIEHPEGECPDIPSATQ WTSAIDLTNRRIYYKTAYNNTIRCIDLAGIDFGKTAYQSHPLDRNREQPVQQIAIPE >gi|313158710|gb|AENZ01000033.1| GENE 9 7828 - 8073 263 81 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRPAYDKEALRENKGGMQVGQPSADPQQTQETTRTQITPDDISDDGPIAPASANGTPSR QQIEQDVVAVNPSVDSMESRG >gi|313158710|gb|AENZ01000033.1| GENE 10 8194 - 9822 1974 542 aa, chain - ## HITS:1 COG:no KEGG:BT_4471 NR:ns ## KEGG: BT_4471 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 541 1 528 531 676 57.0 0 MKCRHTYLFAGLCALLFGACTGNFRDINDDLSGITDGELEADNNGLGYRLGIIQQGVYFN YDFGKGKNWPFQLTQNLNADMFSGYMHDPKPLQGGSHNSDYNLQDGWNSAMWQFTYSYVM PEIYRLEQTAAELMPPFYAIAKILKVLAMQRVTDYYGPVIYTRFGAQGAEYVPDGQREVY MRFFDDLDQASEILSDYVAERPTAGEFAKFDLLLDGSYAAWLRFANSLRMRLAVRLASVA PEKARAEFRKAAADPYGVIEVNTNNAAVKTSGIYSNPLGAINRSWNEAVMNASMESVLTG FGDPRIAKFFEPCADDLVITDDGGESVSVPLKGQYHGIRQGTAFSHNYYAAFSRLTVEPT TEVVLMSAAEVWFLRAEAALRGWTDEDAGTCYRAGVSISFAQWQVSGAEQYLENDASAAD YRDPIAPENDIAARCRVSPRWDESAAAEIKLERIITQKWIAMYPEGCEAWAEQRRTGYPR LFPVRFNHSKGQSIDTETMIRRLPYPATLETSDPEQYEMLLKELGGPDHGGTRLWWDTGC NF >gi|313158710|gb|AENZ01000033.1| GENE 11 9825 - 11903 3002 692 aa, chain - ## HITS:1 COG:no KEGG:BT_4470 NR:ns ## KEGG: 
BT_4470 # Name: not_defined # Def: outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 692 344 1035 1035 887 61.0 0 MQTYFSYANTAERGITGSNRLMRHNFNLRATTGLFRDRIKLDGNISFMRQVVKDKPVPGG FYMNPLVGLYRFPRGVDMTPYREHFEVYDPDRKLGVQNWIAPSDDFEQNPYWITNRIRSK SLRNRVMASLSADWKVNGWLRIRARGNVDHIDDKVRQRFYASTAPALAGNNGRYIESGYS ETLFNGEALALFDRRFTPDWTFSATVGVSLNDRTVNSLRIDSKTASLYYPNVFNVANIVM NSSAYVDEQIDARRQIQSLFATASVKYAESLNLEVTGRNDWASTLAYTSHEGSGFFYCSF GASWVVGKMFKLPEWMSFAKVRLTWSRVGNDIPMFITNPKAHITAGGGINAADAAPGTDL KPEMTNALEAGFEWRFFGDRLGINATYYKTNTHNQFFKLPALSGEAYAFRYENAGNIENE GVEVSLSAYPVYGSGLTWQSVVNFAHNRNRVIKLHDELREYAYGPSSFSSSYAMKLVEGG AIGDIYGRAFERDADGHIVYETEGDYAGLPRTAGDGNTVKVGNANPVFALSWSNTFSYRG ASLSLMLDCRYGGRLLSQTMADMDSYGVSAATADARDRGYVTLEGRRIDDVKGFYKLVGG RAGVTEYYMYDATNIRLRELAVSYALPQKLVRRTRVLSGVTMSLVARNLFFIYNAAPFDP DLILSTGNDNQGIEVYGMPTVRNIGFNIKLEF >gi|313158710|gb|AENZ01000033.1| GENE 12 11936 - 12967 1021 343 aa, chain - ## HITS:1 COG:no KEGG:BT_4470 NR:ns ## KEGG: BT_4470 # Name: not_defined # Def: outer membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 339 1 328 1035 343 54.0 7e-93 MTDCFCNVMPYTSYSEKLPGPGCRILFRTLVLTFCLCLLSGFEVRAQSAQRKFVVRGHVT DQTGAPLVGVTVVEHGTSNGVATLSGGEFSISVAPGAVLDVSCVGYVPRSVPTENRTGLD ITLEEDVKVIADVIVTALGLERNYADLTYSADKIKGTQLTNVKSSNMILSLAGKSAGVQV NESSSGAGASSKISIRGVRSVASDNQPLYVVDGMPILNSSPEQVYTAIGGVADAGNRDGG DGISNLNAEDIESVSILKGAPAAALYGSQAANGVILITIKKGSAEKRQPVTFTSNLTFQS PFRLPDFQNRYGVSGVVESWGTRAVMKAYDNAGDFFRTGALRR >gi|313158710|gb|AENZ01000033.1| GENE 13 13163 - 14287 1280 374 aa, chain + ## HITS:1 COG:XF1739 KEGG:ns NR:ns ## COG: XF1739 COG2220 # Protein_GI_number: 15838340 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Xylella fastidiosa 9a5c # 31 359 30 358 385 296 42.0 3e-80 MSGKQKRGKMIILSSILLATLATAFVIVRLPSFGRLPQGERLERIRQSPNYRDGRFRNQE AGPMMTGDKSRLRGILEFLFRKREGLRPHDAVPAIRTDLRALDPKSNLLVWFGHSSQLIQ 
AEGRRTLVDPVFRNAAPLSMLNRPFKGTDIYRPEEMPDIDYLIVTHDHWDHLDYRTVREL KERIGTVICPLGVGEHFEYWGFDPAKIVELDWHEQQTLGDGFTVRCLPSRHFSGRGRSSN QTLWGSFLLQTPARKIYMGGDGGYGCHFAEIGEEFPDIDLAVLENGQYDEAWKYIHTMPD QLAKAAKELGARRIMTVHHSKYALAKHRWDEPLATEAALAADTALHLLRPEIGEVVPLET PQPERAQADDPTGL >gi|313158710|gb|AENZ01000033.1| GENE 14 14620 - 15204 928 194 aa, chain + ## HITS:1 COG:RSp0426 KEGG:ns NR:ns ## COG: RSp0426 COG3247 # Protein_GI_number: 17548647 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Ralstonia solanacearum # 16 186 7 175 186 65 30.0 4e-11 MNNFSSLIENSKRAVRYWWLLLIIGIALFVVGILIFVYPTQSYLGMSLVFGWLMLFSGIL EVVLSSANKHFITGRGWMLAGGIIEIILGIILIFNVALSAATLPIFLGFWLMLRGFSAIG LGGDMNAMEIPGSGWTVFSGILLVLCSLWILFQPLVFGTTAVVIWVGISLLFAGIAACSL SIQLRRAHQCLGNR >gi|313158710|gb|AENZ01000033.1| GENE 15 15483 - 15857 416 124 aa, chain - ## HITS:1 COG:SSO1867 KEGG:ns NR:ns ## COG: SSO1867 COG3189 # Protein_GI_number: 15898659 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Sulfolobus solfataricus # 4 118 2 112 115 106 46.0 8e-24 MTRIRIKRVYEPAAPDDGCRVLVDKLWPRGVRKDALHYDMWAKEITPSPELRAWYHADPQ TRWPEFRRRYLEELRGSQAVREFVRRIAGNETVTLLYASKNAAENHALVLQEFLEHAVTE IAHA >gi|313158710|gb|AENZ01000033.1| GENE 16 15854 - 16255 402 133 aa, chain - ## HITS:1 COG:alr2922 KEGG:ns NR:ns ## COG: alr2922 COG0346 # Protein_GI_number: 17230414 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Nostoc sp. PCC 7120 # 1 130 3 132 135 88 34.0 3e-18 MKFSNVRLLVRDYEKCFRFYTEKLGLETAFDIEGCYGSFKVAEGIEGLAIFTSDLMAPVA GNADKELPAGCRDKMMVSFEVDNVDEAFETLSARGVEFVNRPTDIPGWGMRVVYLRDPEE NLIELFTPLAANL >gi|313158710|gb|AENZ01000033.1| GENE 17 16349 - 16498 93 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158765|gb|EFR58152.1| ## NR: gi|313158765|gb|EFR58152.1| hypothetical protein HMPREF9720_1896 [Alistipes sp. 
HGB5] # 1 49 3 51 51 73 100.0 5e-12 MEDIDKNEVSDDGISDEEELLHREVIQGDADLYYLHHYLSEVFRNGDNF >gi|313158710|gb|AENZ01000033.1| GENE 18 16563 - 17147 762 194 aa, chain - ## HITS:1 COG:PA2896 KEGG:ns NR:ns ## COG: PA2896 COG1595 # Protein_GI_number: 15598092 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Pseudomonas aeruginosa # 7 189 13 191 194 80 29.0 1e-15 MNVQVLSDQVLLNHYLSGDQSAISKLIERHSRRVRDYINMMVKDRDVAEDIFQETFIKAV RVIDDGRYTDNGKFLSWVLRIAHNQVIDYFRAQRQNKAVTEAEAGYDVLGSLRFAERNVE DAMVSEQIERDVRALVELLPAEQREVVMMRYFAGLSFKDIAEQTDVSINTALGRMRYALI NLRRMIKEKNMILC >gi|313158710|gb|AENZ01000033.1| GENE 19 17487 - 18404 1219 305 aa, chain - ## HITS:1 COG:MA3366 KEGG:ns NR:ns ## COG: MA3366 COG0053 # Protein_GI_number: 20092180 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted Co/Zn/Cd cation transporters # Organism: Methanosarcina acetivorans str.C2A # 4 297 2 295 341 242 44.0 5e-64 MSGEAEIRKKKIYRVTFIGFAVNLLLAGIKLAAGILGRSGAMVADAVHSFSDMATDVVVI AFAKISAKPKDEGHDYGHGKYETLATIIISLALAAVGTGILVNSIGAIRVVVDGGLLPRP GTVALLAAAVSIVVKEILYRYTVREGRRVSSPSMIANAWHHRSDALSSLGTLAGIGCAYF LGDKWRIADPIAALVVAVFIFKIAFDLIRTGLDELLERSLPEDVEEEILRVVAANPEVRE PHNLRTRRIGASIAVEVHVRVDGTMSVCRSHELTEDIERRLRARFGEGTMIAIHVEPLKA ACRAE >gi|313158710|gb|AENZ01000033.1| GENE 20 18607 - 20091 2395 494 aa, chain - ## HITS:1 COG:BB0402 KEGG:ns NR:ns ## COG: BB0402 COG0442 # Protein_GI_number: 15594747 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Prolyl-tRNA synthetase # Organism: Borrelia burgdorferi # 8 494 5 488 488 475 48.0 1e-133 MAKELKDLTKSDENYSQWYNDLVVKAGLAENSAVRGCMVIKPYGYAIWEKMHDALDKMFK DTGHQNAYFPLFIPKSFFSKEAHHVEGFAKECAVVTHYRLKNDPEGKGVVVDPDAKLEEE LIVRPTSETIIWNTYKNWIQSYRDLPILCNQWANVVRWEMRTRLFLRTAEFLWQEGHTAH ATREEAVEEAEKMIHVYQRFAEEWMSLPVVVGHKSPNERFAGAEDTLTIEALMQDGKALQ SGTSHFLGQNFAKAFDVQYVNKEGKLEYVWATSWGVSTRLMGALIMAHSDNNGLVLPPKL APIQVVMIPIYKGDDQLAEIRTRFEAIAAQLKAKGISVKIDDRDNVRSGFKFAEYELKGV 
PVRLAMGPRDMENGTIELVRRDTLEKQTVSQEGLVERIEGLMTEIQENIYRKALNYRESM ITKVDTWEEFKQVLDDKGGFLLAHWDGTVETEVAIKEATKATIRCIPIDAPDEEGVCVFS GKPSHRRVLFARSY >gi|313158710|gb|AENZ01000033.1| GENE 21 20188 - 21915 2376 575 aa, chain - ## HITS:1 COG:no KEGG:BDI_1697 NR:ns ## KEGG: BDI_1697 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 29 575 15 536 536 159 26.0 5e-37 MKSMKIVALCAAFVAAAPLQAQNTGLEFGGSVLNHDMLWATDLAQLSQTHVFGTARVMGM GGAFTSLGADLSSMSLNPAGLGMYRRNEISLTPLVPMAKASTAGTASWKGNSKSQFAFAN VGVALNVFESSRSSLTSLTLGIGMNRIADFNSRYSFSSESRYDPDRPDRLMPTIADIFSQ QMNNFGVRPDREAEGGGPNGPVPVDAWNPNVWPAILAYDAYMLGNYGTVKDPIWAVERIG RNASVLHSMDVVNSGSINEFTISMGGNINNILYFGASIGIQSVHKKLGVTYQEEYGYFNT DGIGHDGSGNALAEQLDVMNLYQEMEMNGSGVNFKLGFTARPLTGLRVGVAFHTPTYYSL DYSYRADIFSDIRNNADNKVATVGNATPRANNSGGNSWDFTSPARLMFGASYTFGTFAIV SIDYERDWYNGIRVKNVPQYNYFSEEDYKAEFKHNFQGTNSIRAGVEVKPLPILALRVGG GYTDSMLKERDFYVNDAYTSMPTTYESYYFSAGAGVNLGRNTTLDLAYQYVTNKQTQYQL FFSRPENGGDMETWSGLYDTSLKRHYLAATLSFRF >gi|313158710|gb|AENZ01000033.1| GENE 22 21936 - 23093 1291 385 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158729|gb|EFR58116.1| ## NR: gi|313158729|gb|EFR58116.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 341 1 341 385 395 100.0 1e-108 MKKLNILIGTLILSAGLAGCSSSYYASSGYANDDLYALHDKTAIARKQQAAAEARKAEAE ARRAEWEAKIAEAQALAAENGYYDESGSNPYNDILADTYESAYARRLRGFESPTYRMPSS YYNFRYGSSFTYATAYDPAFYNIIISGDEVWVEPKYITSMFGTWGGTPYGGWYFGWNYSP SWWGYPSYAWGGWNWGFGFTWYDPWWGPGWHPYWGPGWGPGWGGGWHGHHHAGYRPRDYY GNPGGRGRSYTPGRSTSGARYGNGGYAGGSRGGRNGSYSTSYGSRNGDSPFSSGSRGSSY NNRGSSYNNRGNSYNSGSRGDFNSGSRNDFNSGSRGSSYNDRGSSSSGSSFNSGGSRGGS SGGGYSGGGSRGGSSGGGGRGYGSR >gi|313158710|gb|AENZ01000033.1| GENE 23 23857 - 23988 79 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MATSASDGFCVIAYYKHECMRRIKPFPKRALYTAIKTAISRLL >gi|313158710|gb|AENZ01000033.1| GENE 24 24114 - 24785 -133 223 aa, chain - ## HITS:1 COG:no KEGG:Patl_2743 NR:ns ## KEGG: Patl_2743 # Name: not_defined # Def: hypothetical protein # Organism: P.atlantica # Pathway: not_defined # 1 220 42 272 278 135 35.0 2e-30 MVLGLFFLIVAIVFLAKYYKAKRGVSLKIRGIKVNIRQGDIFKGAGWKVIAFNEFFDTTV DDIVIAHNTINGVFIDYHVEDLDDLKNAITNAPDNPKLKKRTRDGKIVYPLGRIITYKNY MLLAFTHFDNNEAHLSQKDYEDCLRIMWTEISRTYANKPIFIPLLGGGITRFDGTPNKSK SDLLKCMLCTLRTSLVNINQPITILLTEEAIEEINIYEMKGVK >gi|313158710|gb|AENZ01000033.1| GENE 25 25003 - 25650 415 215 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158719|gb|EFR58106.1| ## NR: gi|313158719|gb|EFR58106.1| hypothetical protein HMPREF9720_1904 [Alistipes sp. 
HGB5] # 1 215 1 215 215 402 100.0 1e-110 MEYNKSREILRDFYAQNGILKYFERDNTYLESAFHEINEMWSRNLECIKEVKYLMIAEAP LWGKDKSYIYNPETKNTSFFYKSDLEYVLNIQIADKQDFINCCNEIGLLIIDISPFALNT EDTIINYRSISASKYRKLVKNTFPFYFEQKLKAVSNKKSDSIKAFFRYARVKKRFLPLIS EALIDCRFIKSANEILEISQRGGNMNRMKFRVLFC >gi|313158710|gb|AENZ01000033.1| GENE 26 26152 - 26838 439 228 aa, chain - ## HITS:1 COG:no KEGG:Slin_3284 NR:ns ## KEGG: Slin_3284 # Name: not_defined # Def: ATPase P # Organism: S.linguale # Pathway: not_defined # 1 218 96 303 748 108 33.0 2e-22 MDIDRKDNLQVEGYDRLKDQLGRLPYVAFCGRSVGGEGYYAIVPIAQPNKLLLHFRSLQT KFSAMGITIDPSCCDISRKRFVSYDPEPYINQEAEIYEGLGLADGAAVPDITGNATLPGT DSEDEPLKEVLKYIQIIEQKKVDITAGYANWLRIGYALHNAFGDFGRELFHRVSSFHPRY SYVETDRLFSGLSKGNCANQVTIRSFFYFVRQHGLDAAADFGVSDEKG >gi|313158710|gb|AENZ01000033.1| GENE 27 27164 - 28645 899 493 aa, chain - ## HITS:1 COG:XF0705 KEGG:ns NR:ns ## COG: XF0705 COG3378 # Protein_GI_number: 15837307 # Func_class: R General function prediction only # Function: Predicted ATPase # Organism: Xylella fastidiosa 9a5c # 158 468 483 809 845 117 26.0 4e-26 MGTINYAELQQDILNTLVKQVTPVNFKNLAYPEAKLLQKQLAGCAPDSDMAQSIQKKLLK MKVNEKHYVIFTIEEIARLAEKNDWGLCRNQNEIYLYNGMFWSRLDVDAFQKFLLKASER MGVPIVSSKYYQFGKKLFEQFMMQSYLQSPAANSNVVLINLLNGTYEIRNGQGKLRKFCK DDFLTHQLPFKYNPDAAAPLFDKYLSKVQPDESARKVLAEYIGYLFIKTGNTILKEEKAL MLYGGGANGKSVFFEIVNALLGAENVICHSLQDLTDGSGYYRAQLANKLVNYASEINGKL ESSIFKQLVSGEPVSARLPYGKPFHLTHYARLIFNCNELPRGNEFTDAYFRRFLIVPFDV TIPPEEQIKDLHSQIIENELAGVFNWVLRGLARLLKQNGFTECIAARRAVEDYRLQSDSL RQFLNDERYKSDVNVKTKIVDLYIEYKYYCQENGFYHLTKPNFIKRLKSYGIQVATINVG NVAYLKKHKSDEE >gi|313158710|gb|AENZ01000033.1| GENE 28 28648 - 28929 184 93 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1583 NR:ns ## KEGG: Palpr_1583 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 1 93 1 95 100 75 41.0 7e-13 MNLYELKRKPVWQMNGEELLELQRLAAPQRVESAEVKEPRLVYGLRGLCELLGCSRTTAN KIKHDPRLRGCYSQAGKKIVFDAEKVLKALGRE >gi|313158710|gb|AENZ01000033.1| GENE 29 29118 - 30047 567 309 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|313158770|gb|EFR58157.1| ## NR: gi|313158770|gb|EFR58157.1| hypothetical protein HMPREF9720_1907 [Alistipes sp. HGB5] # 1 309 1 309 309 654 100.0 0 MNNPTAEIDNLIRRIDDELLILQDAPRLKKYKFYNFGVALKHLIESCQKHGLDTNPYYGH KIQEYIAVYRDLFGGADLKNNVVSDLQPNGVIIVSSSLPPKVSKVCELLAGCKVDYSVFV SEKCPEPNIILSKYNVDKYSDETEYYKACAGAYCHLTEVCWNVGRRFNMGANGHGTLIIF EQYIADLKAKRAMKDTPPFETFVPNMELFASLFVPERKGWWNNLIIDITGALKSGKGGLT VVAMAKVLYDSGKLHPVTKPKNFAAWLKIFCQAWSLDKIPTSKAGNNGVKKEITRIHRLF YYLDGAIPK >gi|313158710|gb|AENZ01000033.1| GENE 30 30050 - 31306 694 418 aa, chain - ## HITS:1 COG:no KEGG:BVU_2469 NR:ns ## KEGG: BVU_2469 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.vulgatus # Pathway: not_defined # 16 417 24 429 430 431 52.0 1e-119 MAVNFYLNSRGDKHGDFPIRVSIAIGGIRLLTSVGYSINPAKWDSSKQKVRQGASNAKGI TYNVINSHLNNIVQQSIDFENKCLTEKLKVTKELIQQNINTNKNKLDDVGEIREKTLFDV FDEFIKEIGNINGWTHGTYEKYIALRNHLYIFSPNLSFDDLTEKGLADFITHLRDEKGLR NSTIGKQLGFLKWFLKWSANNGYHKNMAYLSFKPKLKTTEKRIIYLTWDELMTVYNFPIP ESKKYLDRVRDVFCFCCFTSLRYSDVYNLKRFDIKNGALHITTVKTADSLTIDLNKYSQA ILDKYDGVPFEDNKALPVISNQKMNDYIKELGQLCGLDQPETVTYYVGNERVDEVYPKYE LMGTHTGRRTFISNAIMMGIPPQVVMKWTGHSDYKAMKPYIAIADSAKAEAMKLFNEK >gi|313158710|gb|AENZ01000033.1| GENE 31 31634 - 32152 640 172 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158782|gb|EFR58169.1| ## NR: gi|313158782|gb|EFR58169.1| hypothetical protein HMPREF9720_1909 [Alistipes sp. 
HGB5] # 1 172 1 172 172 319 100.0 5e-86 MRMIIHKSRKKAVCITVAGLLAGIAGGLVLYYVRDVVLGWCFVITAGFALLYGMGSLFDR RTYIVLTEYGITEPFAIREQIEWDAVLYADDFYFRGRYWVRLLLGRDYKPQLIRPAWFWR FDRLYESKGVKAVYIRTMGLEVDSMRLAALIRRMKEADMSERIGLLREFRSE >gi|313158710|gb|AENZ01000033.1| GENE 32 32324 - 33427 1527 367 aa, chain + ## HITS:1 COG:MJ0170 KEGG:ns NR:ns ## COG: MJ0170 COG0668 # Protein_GI_number: 15668342 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Methanococcus jannaschii # 2 348 1 350 350 166 32.0 7e-41 MLEYEIWGNTVQTWIVSILIVIGAFVVVKLLSLLGRKVIKPFIARTNNRVDDIIYYSLEA PLKFAVMLLGIWIAIHRLVYPDQLVKYVDNAYRILIILDITWVLARLSTALLQQYWGQRS DGHAVKMMPVVRRTILVLIWIIGLVTALSNVGVDINALWGTLGIGGIAFALAAQDTVKNI FGAFTIFTDKPFGIGDTINVNGLEGTVIDVGMRSTRILGYDRRITSYPNYKITDASIVNI SSEPMRRAMVKLGLTYDTGAEKMKQALEILRAIPAKIKDVSGNPSDVTAYFSDYTDSALV VTFYYYIEKQGDVLKTTSDVNLEILDSFAKAGLTFAFPTRTLLVHQDDAEKAGGEPAAAD GTQIQNS >gi|313158710|gb|AENZ01000033.1| GENE 33 33438 - 35405 2332 655 aa, chain + ## HITS:1 COG:TM1845 KEGG:ns NR:ns ## COG: TM1845 COG1523 # Protein_GI_number: 15644588 # Func_class: G Carbohydrate transport and metabolism # Function: Type II secretory pathway, pullulanase PulA and related glycosidases # Organism: Thermotoga maritima # 22 655 220 843 843 515 42.0 1e-145 MTTQQNRPATGQPDAFDKYPCYYGSDLELAYTPQRSVFTLWAPTADKVRLNLYASGEGGE PEERLEMRPSDDGTWRVAAERDLKGTFYTFQIEKEGKWLDETPGIWAKAVGVNGDRAAVI DLRETDPEGWEADRAPELKMYSDIILYELHHRDFSVAPDSGIENKGKFLALTETGTKTPQ GEASGLDHLKELGVTHIHILPSFDYATVDEARLNDKTYNWGYDPKNYNVPEGSYSTDPAD PAARIREFKQMVQSLHRNGMRIVLDVVYNHTASVEHSNFNLTVPGYFYRHNADGSYSDAS GCGNETASERAMVRHYIVESVKFWAKEYHIDGFRFDLMGIHDIETMNRIREELSKIDPTI FIYGEGWLAADSPLPPEKRAVRDHVGQMEGIAVFCDDFRDAVRGSTFDEHACGYASGNIG GHYEPVKFGIVGATQHPQVDYNGLLYSSVPYAAAPSQAVNFVASHDGYTVIDKLRLSVTG DRAEEELLPIDKLIHTILLTAQGVPFIRAGEEMMQDKQGEPNSFRSPDAVNQIDWALKAE HRGLFDYIRQLIALRKAHPAFRIPTAEGLRQWLRFLDTGDSGVIAYTLGEHANGDAWKEI LVAYNGNRHPAEFRIPEAERIVVCRDGRIDPDSRERLSGEVVRMAPSSALIAYCE >gi|313158710|gb|AENZ01000033.1| 
GENE 34 35477 - 35737 456 86 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158723|gb|EFR58110.1| ## NR: gi|313158723|gb|EFR58110.1| transglycosylase associated protein [Alistipes sp. HGB5] # 1 86 1 86 86 97 100.0 2e-19 MYFLWYLLIGLAAGWIASLIFKGSGSGLLVNLIVGLIGGILGGWLVSLFGWVPTGTWGTF LASIIGAIVLLWIVSLFTRHKPAKTM >gi|313158710|gb|AENZ01000033.1| GENE 35 35740 - 36054 480 104 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158763|gb|EFR58150.1| ## NR: gi|313158763|gb|EFR58150.1| conserved domain protein [Alistipes sp. HGB5] # 1 104 1 104 104 111 100.0 2e-23 MEKYSTKENETYREHAEKAVHKVEEHGQKARHNLQENMQKLKDDAEEAAIEAKNEARTGV HKMAGKIEEAGTKVKNATKEGMTKAAHKVKETATQAANRIKEKV >gi|313158710|gb|AENZ01000033.1| GENE 36 36116 - 36802 532 228 aa, chain - ## HITS:1 COG:all3194 KEGG:ns NR:ns ## COG: all3194 COG2135 # Protein_GI_number: 17230686 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. PCC 7120 # 1 227 1 221 233 115 33.0 9e-26 MCYHIFLAAGPEELARHFGRKADLIRNLRPPVRACAFSHLPYPVLTGDEQIRYSRWGLIP YWLRRPEDVVTVRNQTVNARAETIFENASFRVPIRQRRCLVPVSGFYGWRHEQHRKIPLY VTLKERPLFSLAGIYDRWYCRATAAWTTTFSIITTEANPLMRYVNNANGRMPVILRPDDE SRWLSPALSDAQIAALLKSYPEQELHSEPVRCDFMHEVAADPGILVPA >gi|313158710|gb|AENZ01000033.1| GENE 37 36905 - 37339 565 144 aa, chain - ## HITS:1 COG:no KEGG:BT_0923 NR:ns ## KEGG: BT_0923 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 134 1 135 145 73 30.0 2e-12 MKKILLLVAAAVFAFAVSAAPQKVDFGKLPKNSQEFIQKNFPGEKVKAVEMDREASWDKY TVYFNSGNQVSFEGGSGDCSQIIMKNGSVPMSVIPAKIKTYAGNNYAGQRIVMMETTADG YKVALADKTVLDFDKDGNFTKATK >gi|313158710|gb|AENZ01000033.1| GENE 38 37458 - 38312 1202 284 aa, chain - ## HITS:1 COG:TM0497 KEGG:ns NR:ns ## COG: TM0497 COG1814 # Protein_GI_number: 15643263 # Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Thermotoga maritima # 3 284 2 284 284 216 42.0 5e-56 MTKNLLTLQRDETTLCEVYRRLAGLEKDPVRRRTLLRIMQDERRHCEVLRSRTGRTVTPD 
PKRVLWYVGMVRVLGRAFVVRQMEQCEKGTAASYSRYPEREEFVRIASEERRHGEELTTL AGGMRLCYISSVVLGLNDALVEFTGALAGFTLALNEPRLVALTGGITGIAAALSMAASEY LATKSEKGREKHPLRAAVCTGVTYLVTVAILILPYLLFSSALAALGVMLLAALTVIALFN YYYAVVLCESFRRRFFEMALLSFGIAGISFLIGYILKLFTGADV >gi|313158710|gb|AENZ01000033.1| GENE 39 38667 - 39914 1712 415 aa, chain + ## HITS:1 COG:no KEGG:Sph21_0818 NR:ns ## KEGG: Sph21_0818 # Name: not_defined # Def: putative secreted protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 21 414 18 433 434 297 43.0 5e-79 MKKKYFFTAFAAALLFAGFAFTACDDDDDNSAPGTKPTIKFENVIPTKNYVQSGTFAAVA PGATTSFTFHAAKGQRLMFATMYSYSNDLFFAPENPGIALFNDAGVPYTGVIANAVKLWD NGTRVNEQPGPNVNHPGVAQAGVVSEVNGTDTEGHTYPAASSLLQVSLTFDAVQSLFTCT ISNISNGTSNETPFSGGVFVVSNMLDGKLVMEKPFFNVDQKSGVELTKLAEAGNVDPLKS LVADQTGIITTLHGAIVVVYTGNTNPIYQLGQKDAQLGLAALAQRGDPGQLKASLEKVPQ VRKIYTITKTALPGESMECTYEAAANEKVAFATMFGYSNDWFFANGPELNALTKGDITSK TVLLDDGTAVSQYPGAGNAQSVFGGTLMPEDKAISAVGDTYPVPAVDGIVKITIY >gi|313158710|gb|AENZ01000033.1| GENE 40 39981 - 40250 377 89 aa, chain + ## HITS:1 COG:no KEGG:Cpin_5022 NR:ns ## KEGG: Cpin_5022 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 3 76 6 79 79 84 52.0 1e-15 MTNDKSQEGKVTVAVEKKTSKIPSMVYLSAGLGALAVSACMMFRGRKSAALLVGQWAAPL LIMGLYNKIVKTEGHDRPESGTDRLNRII >gi|313158710|gb|AENZ01000033.1| GENE 41 40262 - 40549 451 95 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158766|gb|EFR58153.1| ## NR: gi|313158766|gb|EFR58153.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 95 1 95 95 108 100.0 2e-22 MDKTNQMHGESREQQNDTQTSPMQNGNTNDLDSRSHNRDNSGMHAGSTEGSSYRQQPDNK EQYGSKKSSYWQEKGSDRKQSVGSRMNEESLKQKQ >gi|313158710|gb|AENZ01000033.1| GENE 42 40592 - 41563 1560 323 aa, chain + ## HITS:1 COG:PA2185 KEGG:ns NR:ns ## COG: PA2185 COG3546 # Protein_GI_number: 15597381 # Func_class: P Inorganic ion transport and metabolism # Function: Mn-containing catalase # Organism: Pseudomonas aeruginosa # 1 286 1 272 294 222 47.0 8e-58 MFYHSKELQFKARVSRPDPRFARLLLEQFGGGNGELKAAMQYFVQAFACHNPYPDKYDML MDIATEELGHLEIVGATIQMLLAGVNGNLKDAAERNDVFGESISKDDFIHSAFSINPQFG VLTGGGPRLTDSNGVPWQGSYVNANSDLTVDLRSDIAAESRAKIVYEYLMQFTDDKDVLA TLNFLMTREVAHFQQFEAALGTIQPNFPPGVLQSDPRYSNIYSNHSTGEDARGPWNEGES TGLGEEWLYVEDPREQVLETDGLTQLKPQGTPRTEASVRSADKKLSKERSGEIRKATPEK NLRWCEYEPQDTKASRRQPVEQD >gi|313158710|gb|AENZ01000033.1| GENE 43 41693 - 43117 1917 474 aa, chain + ## HITS:1 COG:SP1382 KEGG:ns NR:ns ## COG: SP1382 COG0366 # Protein_GI_number: 15901236 # Func_class: G Carbohydrate transport and metabolism # Function: Glycosidases # Organism: Streptococcus pneumoniae TIGR4 # 1 471 7 479 484 460 48.0 1e-129 MQYFEWNLPNDGKLWTQLKEDARHLHDVGVTSVWIPPAYKADEQQDEGYAVYDLYDLGEF DQKGTVRTKYGTRQELEEAIAALHANGISVYIDTVMNQKTGADYTEKFMACEVDPENREQ VIGAPVEVEGWTGYTFPGRGDKYSPFKWHWYHFSGTDQVYETGKRAIYLIQGEGKKWSEG VDGENGNYDFLIFNDVDFDHPEVIEEMKRWGAWIAQTLDADGMRLDALKHIRNTFIGEFM HSVRASRGKEFYAVGEYWSGDFESLEAYLDAVDHQIDLFDAPLHFKLFTASQQGRDFDLR TLLDDTLVRKHPTLAVTFVDNHDSQRGSSLESQVKSWFKPLAYGLILLMKEGYPCIFYGD YYSMKGEASPHRAVLDILLDARRSRAFGEQTDYFDHPNTVGFTRAGDEQHPDSGLALLLS NGEDGEKVMSVGVTHANETWHEITGSYDDEIVIGDDGKALFRVHGGKLAVWVKK >gi|313158710|gb|AENZ01000033.1| GENE 44 43304 - 43513 204 69 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158761|gb|EFR58148.1| ## NR: gi|313158761|gb|EFR58148.1| hypothetical protein HMPREF9720_1922 [Alistipes sp. 
HGB5] # 1 69 1 69 69 63 100.0 7e-09 MKESTNKTAGRTQTGQNKSATKSSRTTTSHASSTSRGTAKSQSSATGRVSRSNNPEGHNQ YTKKSQGNC >gi|313158710|gb|AENZ01000033.1| GENE 45 43687 - 44163 594 158 aa, chain + ## HITS:1 COG:AF1475 KEGG:ns NR:ns ## COG: AF1475 COG3476 # Protein_GI_number: 11499070 # Func_class: T Signal transduction mechanisms # Function: Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) # Organism: Archaeoglobus fulgidus # 12 155 11 153 153 76 38.0 2e-14 MKKAWAYILPTLLCFVLGGLVGWWQNDAIREWYPHLVKPALTPPNAAFPIAWSIIYLCMG LSAGLVLTSVSPLRRRVMALWFAQLGFNVLWSILFFVCRSPLLGMIDIALLDILVAAYIA QSGKIRAGAAWLFVPYLCWILFATYLNAGILAANGTEL >gi|313158710|gb|AENZ01000033.1| GENE 46 44205 - 44906 751 233 aa, chain - ## HITS:1 COG:FN1880 KEGG:ns NR:ns ## COG: FN1880 COG0778 # Protein_GI_number: 19705185 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Fusobacterium nucleatum # 58 233 10 190 192 70 27.0 2e-12 MPLFRPVILRIFVSGLILKKMADDYLGRKMEEYLARTASGPRNRRPLATLGRLLLKNRSH RGYDARFVVRGDQLRRIIGVNAKIPSARNQQVLRFRPVLADEAPKVLPHIRLGGALPELH LPLPGTEPNAFIIICSTVAEDRYVDIDLGISAQSMLLQAVEIGLNGICIGAFDKEPIRRE FGLDAEPLLILALGKGIEKIELTEIGSDGDRAYYRKNGTHYVPKVRLEDLLIG >gi|313158710|gb|AENZ01000033.1| GENE 47 44937 - 46013 1354 358 aa, chain + ## HITS:1 COG:SMa2355 KEGG:ns NR:ns ## COG: SMa2355 COG0389 # Protein_GI_number: 16263727 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Sinorhizobium meliloti # 4 336 27 363 379 370 58.0 1e-102 MTQRKIIHIDMDAFYASVEQRDRPEYRGQPIAVGHDGPRGVVATASYEARPYGVRSAISS ALAKRLCPNLIFVPARFDVYKEVSRQIRAVFRDYTELVEPLSLDEAFLDVTHLRSATLAA REIKARILAETGLTASAGISVNKMLAKIASDYRKPDGLFTIPPGQIEEFVAALPIERFFG IGEVTARKMHGMGIHTGADLRLRDEAELVRHFGKAGHSYYGYARGIDEREVTPNRIRKSL GAETTFAEDTDDREQLRLELSAVREEVWNRLMRHEFKGKTVVLKLKFDNFRQITRSKTLF APVNSAETLRQVSEELLAAVDFRGRKIRLIGLTVGNSPEACTECVQLRFDFGESARNV >gi|313158710|gb|AENZ01000033.1| GENE 48 46219 - 51021 5532 1600 aa, chain + ## HITS:1 
COG:no KEGG:Odosp_3427 NR:ns ## KEGG: Odosp_3427 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 2 1594 1 1553 1554 733 30.0 0 MLKKILKYTFRTLLVILLVLILVPALLYIPAVQDFVRSKAVGYASRTLGMDLSVERLRLS FPLRLSVDNTLLTDKGDTLLSCGHLSLEVAVWPLLRKEVAVRSLELAKLAAHYGDSTAGM DLKVAAGQFAVNDCRVGLPAKTVGISRIALTDGDVFLNTAESAPAEKADSAAALPWQIDV GKLTVANLVFGMRTAPAVTDLSVRLPDGEVDSCRVLLDSRQVSVKSILLNRGGYAYLTAP ADAGEKAPDKTATQGKAAHPDKTAAADNASSPHDAAADDGEAPALPWTVRVGSVTLNDNS LEYGTLHHRPAAGFDPAFIVLSPLDLSVDSIYNRGADIALRIRRLAFTERSGLSVRNAAG AFAMDSTGISLSGFELATPLSGVRAEAHAGAGIMRMAPDTPLTADLSASLNTEEIKLLYP QLIPAALDDRIVRIKLSAAGTLGDIKKAGLDISSPGHIDLAVNGTAKNLLAPERLEAAAR FEGEFRDMAFLLEMLPDTALRRRVTIPERITLRGAADADRGLYSLASTLNADGGQLTLNG RIDPEKQIYDAEVRCDSLPLNRFLPADSLGALDFTLTAGGAGFDPLLPQTRGSVRMRIGR AEYRSHDFGGIELDADLENQHLSGRLSDRDEALRLLLSVSGTLTEREQRIGVSGNVFDFD LADMGITPEQIGGSFALDADASASDAGGMAARLTLDSIVIRSKNRTDRIRRTNVTFGTDT AATRAGLTSGDLTLSFAAPEPLDSLTAAASRSAGVLAQQIRSQHVDMDSLKTVLPDFGLR VSAGRDNILSSFLRTKRIAFSNLDIAGTNCDSLPVSLRMRVEKLAYGSIVLDTLTASAVQ NGSRLEYALRVANAPGNLDNIALAGVYGHVVRNTGAVNFYQKNRAGREGFRFGVDAAWND SLIRASVTPLAPVFGSEPWTVNPGNYLVYRFDGNLSADLDMTHGDQRFAIHTVPETDSLR GIRLDIAGLNIGGALAMLPSAPPVGGVLGAAVTLNTGADSLAVRGDVSVAGLSYDKQRFG DVGLGVRYAQGREQQADVRLTLDGADVLTARGDYRKERESPLDLTASIPGFPLQRADVFL PADMLRLSGILSGKLHAGGTPQRLQLNGGLQFAQTEVRVPMIGTSFRLSSDTIRIDDSRV LFDDFAVTAPNKSPLTIGGYVDLTDFGRITADIALRASDFQFVNVARKEGTDVYGKAYLD LDATAKGPLDELVVRGSVALLKNTDINYVMQDSPMDVKERPQNIVTFVSFRDMDNQSFAE ATPTVRIGGMDILLNVDINDDVQAAVDLSADGSNRIDLQGGGNLTFTMNPLGDVSLSGKY VLSGGTVRYNPPVISQKIFKITPDSYVDWVGNVADPAFNITAVETVRASVSADGQDSRSV NFNISINIRNTLNDLEVSFGLSAPEDLTMQNQLNSLTAEQRANQAMNLLIYNTYTGPGTT AKVSTENPLNSFIQKELNQWAQNNLKGVDLSFGIDSYGEDDPNGQRTDYSYRLSKSLFSN RVRAVIGGKFSTDAAPSQNLKENLIDDISLEYMLTKRDNMYLKVFRHTGYESILEGEITE TGVGFVIRKKLLRLGDLFKPMRPKAEKEKQKRNESDARQK >gi|313158710|gb|AENZ01000033.1| GENE 49 51005 - 53308 3400 767 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3428 NR:ns ## KEGG: Odosp_3428 # Name: not_defined # Def: surface antigen (D15) # 
Organism: O.splanchnicus # Pathway: not_defined # 18 767 48 798 798 729 48.0 0 MRVRNSCILLCAALLAAACSTTRRLGTDEVLYTGVRKIRIEPDSGVVLSAAAESAVKEPL SVAPNNPLYSPYLRTPLPIGLWVYNYLYTPKEKGFKYWLYKRLAKQPVLISKVQPGLRTK VAEQVLENYGYFGSQAEDSLLYRKHGRKAKVYYTLRIAPPWYYSNISYPQAGGGLQPLID SMRATSLLRVGAQYNMDSLTLERKRISQLLRNRGYYYFRPEYMEYLADTTVERRRVDLRL NLKPNVPEVALKPYRVGDITVRLTNIKPGPADTFRLRDVRVIAQRPMKIRPKILSRTLSL EPGQLFTVDAQNRTQTALNKLGIFRSVNLGVTPLDSLRGADTLDVVIDAQFDYPLEAALE TDVTSKSNSFIGPGITFKVSNNNLFRGGEVLAVKLNGSYEWQTGNKNSGGRSSRLNSYEL GLNANLNIPRLLLPSFATRRLRYPGSTTFQLGVDLMNRPSFFRLIAFSGSAGYNFQTSPY SRHSLTVFKLTYNKLLHTTEAFDKTMDENPAIAMSFRNQFVPSINYTYTFDKTYGRTGNR RFYWQNSVTSAGNILSGVLSLFGSKQPQHLFGNRFSQFVKEVSEVKFYHRIGRRNNWLAT RFLVGVGYAYGNSEVMPYSEQFYIGGANSIRAFTIRSLGPGSYRPPADDRNGYLDQTGDF KLEANVEYRFGIMGRLNGAVFLDAGNIWLLKKDPKRPGAELKWKGLLNEIALGTGFGLRY DISYLVIRADLGIGLHTPYPNPDKTGYYNISSFKDGLGFHLAIGYPF >gi|313158710|gb|AENZ01000033.1| GENE 50 53641 - 54402 1011 253 aa, chain - ## HITS:1 COG:CC2333 KEGG:ns NR:ns ## COG: CC2333 COG1573 # Protein_GI_number: 16126572 # Func_class: L Replication, recombination and repair # Function: Uracil-DNA glycosylase # Organism: Caulobacter vibrioides # 102 251 92 231 479 59 28.0 6e-09 MLVFRYDKSFDGLLSALFDAYAMRAFPEQLIGPGEPEPLFTERVHEVATDPAHAARVWRG LERRLVVRARSMLVYAWHGEQERSDLLMLRCMRRVFDEGGGVLADQADPDMKALHRLALK VSHECERLRQFVRLQKAADGTYFAAATPEHDALPLSLDYFTDRFADQRWLIYDRRRDYGY YYDGRTARRVTLEDDRGMIADKLADRWLAEDERQFQLLWKNYFRALAIPQRINECQQRRM MPRRYWKHLTEME >gi|313158710|gb|AENZ01000033.1| GENE 51 54408 - 55667 1691 419 aa, chain - ## HITS:1 COG:CAC3343 KEGG:ns NR:ns ## COG: CAC3343 COG4277 # Protein_GI_number: 15896586 # Func_class: R General function prediction only # Function: Predicted DNA-binding protein with the Helix-hairpin-helix motif # Organism: Clostridium acetobutylicum # 5 380 3 378 440 460 56.0 1e-129 MKETVLEKLKVLAESAKYDVSCASSGVTRRHRAGGVGSTAGWGICHSFTADGRCVSLLKI MLTNVCIYDCAYCINRRSNDIRRAAFTPDELTELTIEFYRRNYIEGLFLSSGVVRNPDYT MERMVRVARDLRTVHRFNGYIHLKSIPGASPELVAEAGRYADRMSVNIEIPTERNLLLLA 
PDKNYESIYRPMSYIQQGVLQSAEERGRFRHASRFAPAGQSTQMIVGATDESDRDILLLS SSLYDRPTMKRVYYSGFIPVNPYDKRLPALDKAPLVRENRLYQADWLMRFYGFRAEEIAD EQTPRLDLEIDPKLAWALRHPEFFPVDVNRADYEALLRVPGVGVKSARLIVSSRRYRTLT MQSLRQIGVVMKKAQYFVRCRELTTGQGVNELRPEQVRRLLTAPKRQKPDKNEGQLTLF >gi|313158710|gb|AENZ01000033.1| GENE 52 55871 - 56482 694 203 aa, chain - ## HITS:1 COG:CAC3341 KEGG:ns NR:ns ## COG: CAC3341 COG0655 # Protein_GI_number: 15896584 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Clostridium acetobutylicum # 1 203 1 208 208 235 57.0 4e-62 MKVLLINGSPHREGNTFIALSEVARTLESEGVQAEIVHIGTKAVQGCIACGKCAELGHCV FSDALYTTVREKLADADGIVVGSPVYYAGPNGSLCALLDRVFYSCGKYLAYKPGAAVAVC RRGGASATFDRLNKYFTIMNMPVVPSQYWNSVHGRLPGEAREDAEGLQTMRVLARNMARL LKAGVGPALTPEAEERQWTHFIR >gi|313158710|gb|AENZ01000033.1| GENE 53 56631 - 56870 339 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKIFSFVVVMAAVAMVSCCGNSNKKAAEGEAAAAAATTEAAACTECTEKCDSTEACCAE KCDSTKDCAEKAEGECCNK >gi|313158710|gb|AENZ01000033.1| GENE 54 56976 - 57446 659 156 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158724|gb|EFR58111.1| ## NR: gi|313158724|gb|EFR58111.1| hypothetical protein HMPREF9720_1935 [Alistipes sp. HGB5] # 1 156 1 156 156 285 100.0 6e-76 MRRSLTILLILFAAAAYTPLRAQSLDAFKERLAAPVASDAAFGTAKVTVTEYGDAARAVN EASRTGVRLRFPGYRVCIFFDNGQDARAGAIAAKKLFEENYPGIKVYMVYENPYFKVAVG DCLTTEEAIILKGRVSSAFPKAFVKNETLSIADLLN >gi|313158710|gb|AENZ01000033.1| GENE 55 57446 - 57946 391 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158747|gb|EFR58134.1| ## NR: gi|313158747|gb|EFR58134.1| hypothetical protein HMPREF9720_1936 [Alistipes sp. 
HGB5] # 1 166 73 238 238 317 100.0 2e-85 MLFCILLLWVVLPLGSCRRMAEKAQRNIRLEAVEKARRQGLSGAEIVLRIKNGTGHKLKL EKASLTVYYAGGVVTKVALREPAEAPRRATASVTTLWRIRTSDPMALHVMTKKIREDDIS KIGVSFAIEGRGGPARVKISREMMPLSEFLNIFGLSLNDVKNYLEE >gi|313158710|gb|AENZ01000033.1| GENE 56 58165 - 58881 491 238 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163754278|ref|ZP_02161401.1| 30S ribosomal protein S15 [Kordia algicida OT-1] # 17 238 1 221 221 193 45 2e-48 MKPYNTEQTKKEEVREMFDNIAPKYDLLNHTLSMSIDRIWRRRVVRIVRRCRPHRILDVA TGTGDLAIEMARRIRGVQVLGVDLSEGMLDVARRKVTARGLDGRVVLDAGDAEHLHVADA SVDVATVAFGVRNFGDLDAGLREMARTIKPGGKVVVLEFSRPRNRLFRALYEFYTYKILP RIGGMVSKDKRAYEYLPASVGEFPAPKEFMAMMERAGFRECRARSQSFGIAQIYTGQK >gi|313158710|gb|AENZ01000033.1| GENE 57 58878 - 60122 1681 414 aa, chain - ## HITS:1 COG:aq_1585 KEGG:ns NR:ns ## COG: aq_1585 COG4591 # Protein_GI_number: 15606709 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Aquifex aeolicus # 23 372 22 352 394 100 25.0 5e-21 MNLAFFIARRTAQRTPENKPGVMERIAVISVALSVAVMLLAMAVIMGFKQEVSRKVAAFS GHAVVNGVRGVGTLDSAPVRRSARAEELIRTTEGFVSLAPYAVKGGIVRTADAVQGVMLK GVDGDYDWSSFGQWLVEGELPRVGDSVRTKDILLSRGMAEKLMLGVGDKVEMLFVETGDR PRRDRFKVAGIYSSGMDEMDDAVIMTDLRNVQRLADWADGEISGYEIFTSDLAGADDFAR RLGRALLYDDADDTANLAVSSVTELYPNIFDWLKAHDVNAAVIIGIMLVVAFFNMTSALL ILVLERTRMIGLLKAFGMRNATLREVFLWRAAFVTLRGLAWGNAAGLAVCLVQKYFHVVK LSSEGYLLSEVPVALGWGWWLALNAGVVAAIVALLVVPACIVSTVKPDESIRYE >gi|313158710|gb|AENZ01000033.1| GENE 58 60199 - 60948 1027 249 aa, chain + ## HITS:1 COG:all0475 KEGG:ns NR:ns ## COG: all0475 COG4221 # Protein_GI_number: 17227971 # Func_class: R General function prediction only # Function: Short-chain alcohol dehydrogenase of unknown specificity # Organism: Nostoc sp. 
PCC 7120 # 6 247 10 256 257 236 50.0 4e-62 MKKQALITGATSGIGRATALRLAAEGYDITATGRRAERLETLRREIEAAGGRCTALVFDV RSEAEVRKFLSPLERVDLLINNAGLAAGLEHIDQGDTADWDAMIDTNVKGLLYVTRVVTP KMVAAGGGHVFNIGSIAGTEAYENGAVYCASKHAVHAISQSMRADLLSSGIKVTEIRPGM VETEFSEVRFHGDKERADRVYDGVTPLTGDDIAEAIAWAAQLPAHMNVNEMVLMPSQQAG AYYTYRKNR >gi|313158710|gb|AENZ01000033.1| GENE 59 60952 - 61719 1069 255 aa, chain + ## HITS:1 COG:BH1677 KEGG:ns NR:ns ## COG: BH1677 COG2738 # Protein_GI_number: 15614240 # Func_class: R General function prediction only # Function: Predicted Zn-dependent protease # Organism: Bacillus halodurans # 29 255 1 223 224 149 38.0 5e-36 MTHMLLTLLQSGYYADSGSQGIHYSAAMIGMYILIIVIGIAGYAVQARLQSVFRKYSKVQ FPGGLTGAEVAEKMLRDNHIHNVKVTHVSGHLTDHFNPQTMTVNLSDSVYSSTSVAAAAV AAHECGHAVQHAQGYAPLTLRSQLVPVVQFSSTAATWVIILGLVLMTTTQNALLCWIGVG MIAMSAIFSLVTLPVEYNASARALEWLQVSRTMQGAQLAQAKEALGWAARTYLVAALSAV ASVLYWVFVILGRRD >gi|313158710|gb|AENZ01000033.1| GENE 60 61727 - 63334 2284 535 aa, chain + ## HITS:1 COG:NMA1480 KEGG:ns NR:ns ## COG: NMA1480 COG2755 # Protein_GI_number: 15794380 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Neisseria meningitidis Z2491 # 96 484 7 386 397 74 25.0 4e-13 MNTPEKDCIHRGWIAALALIAVLTAVSFIPPQSLGGVKLRRANILSDLVAFDDAVAVAEE PALFDEEDFHVDMEQVAERIEAERIEANSAPRPVQITFEWALAPDSVRRMPVVPDSVRLN PTLVEIEDFGTPDSSRLQAFYDTLLCARRPVRIAVLGDSFIEGDILTADLRERLQQAYGG GGAGFAPMASPLTAFRRTIKTQSKGWTSYNIMQRKAAPQNLRENFYISGWVCQPAAGAST RWENSDYRKRLDSCTAARVFFISPGESRVELTLNDSLRREFTVESAPNIRQIAVTAPHVR SLSFKVLSGNEGFIGYGAVFEADGVVVDNYSVRSNNGQAMFWTNPSVNAQMNALLGYDLV ILQYGLNIMQTGVSNYTNYAGQIEKMVAYVRQCFPTAAVLVLGVSDRSVKTDAGFEPMDA IPHMLGYQRGAAENTGAAFWPTCDAMRSLGGMEQFVANGWAGKDFTHINYAGGRRVAWSL FDALNAGANRAYTEAEAARIRRQAEQAVLDSLRRRRIERDLIPGIPPETLNAPLK >gi|313158710|gb|AENZ01000033.1| GENE 61 63331 - 64836 2463 501 aa, chain + ## HITS:1 COG:PA3548 KEGG:ns NR:ns ## COG: PA3548 COG1696 # Protein_GI_number: 15598744 # Func_class: M Cell wall/membrane/envelope biogenesis # 
Function: Predicted membrane protein involved in D-alanine export # Organism: Pseudomonas aeruginosa # 57 423 31 389 520 268 43.0 2e-71 MTLPALDGIPEKLQALLTYDASSPLIFSSGLFLFLFAGFMLVYSMFRRAPMARIVYVILF SLYFYYKSSGIYFLLLIFAATSDYWIAKGIHAARSTRAKRWLVVLSVAVNLGMLAYFKYT NFLIDIANQMFGQGFMQFQNIFLPVGISFFVFQSMSYTIDIYRGQLKPLDNWCDYLFYLS FFPQLVAGPIVRARDFIPQIRQNPIVVTREMFGTGVFLILTGLFKKAIISDYISLNFVDR IFDEPLLYSGFECLAGIYGYALQIYCDFSGYSDMAIGIALLLGFRFPKNFDAPYKSATIT EFWRRWHISLSTWLRDYLYISLGGNRKGKLRTYGNLLITMLLGGLWHGAAVRFILWGALH GAALALHKLWMAVVPGAKATGDRMHWWSRAAGIFFTFNLVCLGWLMFRAESMQTVELMLH QIFYNFNAAMIPQVVSGYAGAFALIAAGFLLHLMPGRADRFAQRLVSHAPLVLQIVMAAA MIWCVMQVKSSDIQPFIYFQF >gi|313158710|gb|AENZ01000033.1| GENE 62 64982 - 65629 887 215 aa, chain + ## HITS:1 COG:no KEGG:FIC_00647 NR:ns ## KEGG: FIC_00647 # Name: not_defined # Def: putative membrane-associated phospholipid phosphatase # Organism: F.bacterium # Pathway: not_defined # 5 211 8 184 189 85 29.0 2e-15 MYTFDHDLFSALNFDGGTALDRVMLTVSGTAMWLPLYALILWLVWRRSGWRGMLIFTVLM LAAIGLADMVSGIFKSNGVLGGLLPDFEPRPRPMFTPSLEGLDITPDSLRVMRREAVPHH WAVHVPPEAVSGMYGTVSAHAATIVALAVLACGAIRRRWFTGLMVVCSVVICYSRIYLGK HFPMDLVWGTLVGAALGYAALRAYRRLIRRDSAAG >gi|313158710|gb|AENZ01000033.1| GENE 63 65858 - 67579 2824 573 aa, chain + ## HITS:1 COG:CAC3197 KEGG:ns NR:ns ## COG: CAC3197 COG1190 # Protein_GI_number: 15896444 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Lysyl-tRNA synthetase (class II) # Organism: Clostridium acetobutylicum # 12 500 33 510 515 482 49.0 1e-136 MSIELSEQEQLRRQSLAALRELGIDPYPAARYEVTATAREIAENFDETKGNYGDVRIAGR IMTRRIMGSASFFELQDHTGRIQVYIRRDDICPEGDPTLYNTVFKKLLDIGDFIGVEGFA FHTNTGELSVHCRKFTVLSKSIRPLPVVKEKDGQTFDAFTDPEVRYRQRYVDLAVNPQVR EVFVKRAKIISTMREFFNSKGYVEVETPILQPIPGGAAARPFITHHNALDIDQYLRIASE LYLKKLIVGGFDGVYEFGKNFRNEGMDRTHNPEFTVMEIYVAYKDYLWMMEFTEQMLEKV AVAVNGTTEVTLDGKQISFKAPFRRLSMTDAIREKTGYDITGQSEEQLREACKRLNVETD ATMGKGKLIDAIFGQYCEEELVQPTFVTDYPIEMSPLCKRHRSNPELTERFELFVNGKEL CNAYSELNDPIDQLERFQEQLQLSEKGDDEAMFIDMDFVRALEYGMPSCSGMGIGIDRLT 
MFMTDQASIQDVLFFPQMRPEKKAVSDPVEKYTEIGVPEEWVPVIQKMGYITVEALKKLA PGKFFNDLCGFNKKNKLGLKAPSIEEVKKWCEE >gi|313158710|gb|AENZ01000033.1| GENE 64 67623 - 68381 1152 252 aa, chain + ## HITS:1 COG:TM0689_2 KEGG:ns NR:ns ## COG: TM0689_2 COG0149 # Protein_GI_number: 15643452 # Func_class: G Carbohydrate transport and metabolism # Function: Triosephosphate isomerase # Organism: Thermotoga maritima # 2 248 3 248 255 232 49.0 6e-61 MRKKIVAGNWKMNTLPAEGVELAKNIVAGRGEVCSCVNFIVCPPFTHLSTVAEALKGSDV ALGAQDCATEAKGAYTGEIAASMIAALGCKYVILGHSERRQYYGETSETLNKKMAQAYAN GLTPIYCVGENLEEREAGKHFDVVKAQIEEVVYNLTAEQYKDLVIAYEPVWAIGTGKTAT AEQAQEIHAYIRKVLAEKFGAAAEETAILYGGSCKPSNAAEIFAKEDVDGGLIGGAALKA EDFLSIGKGFAK >gi|313158710|gb|AENZ01000033.1| GENE 65 68473 - 69783 1961 436 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0648 NR:ns ## KEGG: Bacsa_0648 # Name: not_defined # Def: histidine acid phosphatase # Organism: B.salanitronis # Pathway: not_defined # 7 432 7 421 427 277 37.0 9e-73 MRFTKLFAAVLCSLAALAASGQNLRDEIAANPRKSGGVYYVYTYDNPVLTPAPKGYKPFY ISHYGRHGSRWLLHDSEYDEVMAVFRAADAANAFTERGREVYGRVKRVYDDGINRGGDLS PLGAEQHREIAGRMYRNFPEVFRSGAVVDAQATLVVRCVLSMAAFCERLKELNPRLEVSR TAGRRTTRYLNFYSKPTNPTLSREYLDFIDKGGWQEEYERIGDRFVRPDRLMSELFADGE FVRTIDAQKLMKGLFYFAADMQNVGLGISFYDLFTTDELYGLNVYDNYKYYVIRGPSPLN RRFPQYYAKALLEDFLTRADRAVEGGAVSADLRFGHDGNLMTFVSLLQFEGCDVVESDPE KIAQMWPLYRISPMAANIQLVFYRKKAADDVLVKFLYNEREVRIPVASDLAPYYRWNDVR DFYRNVMENLPDPAGK >gi|313158710|gb|AENZ01000033.1| GENE 66 69799 - 70704 991 301 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 36 292 4 248 257 94 27.0 3e-19 MKKSFMLCLAGCLAACVTAVSATAARPASGQPATRVMTCNVRITGLPEDETAGRRWEDCR DACLKAIRMYRPDVICMQEVIYDSYNYFKEKFSDYVAYGFAGPEMDPYTEGYHFIGKNVI FFSKKRYEFVSSGCYWLSETPLAAGSCSWNTMRARHCNWVRLRDRKSGAEFRVLDIHLDH KSDDARREQMKMIVGECAQYADGFPQIICGDFNSGIENAPVACLRDAGWQEAYEAVHGPG 
EAGFTYHGFKGPDYRKKNARRIDFIFVRGNLQPVAAEILRDKVDGLYPSDHYFLMSDFII Q >gi|313158710|gb|AENZ01000033.1| GENE 67 70797 - 72170 1970 457 aa, chain - ## HITS:1 COG:no KEGG:Phep_0457 NR:ns ## KEGG: Phep_0457 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 5 456 6 458 466 590 63.0 1e-167 MANQKTTITEPQREIPVRAEVDVLVVGGGPAGIMAARAATGKGLRVMLIESRGYLGGNLT IGLPILGYLGRKGNQIIKGLPQLFIDRLRARGAAGEHRPCKLHVSLTIIDPEEAKTVALE MLQEVGVEVLLYVFCADIVKNGDAVEGVVVESKAGREAILAKTVIDCTGDGDVACRAGVE CRKGDADGGMQPPTLMFCMKGVDVQKLRDALVGRPDVFDMDTMPAEQFRTGKFITVGLRN QIRKAEEAGYKIPVARTILITGIKDDEIWVNMSRVSGVDSTKAESYTHGEVEGRKQIYEL ERYLKNFVPGFENAWREKVAPFMGIRESRVIVGKYILTAEDILACRRFDDAVAVASYPVD IHHATGGDCTLHWCEDCYDIPYRSLVPAAVENLLVAGRCSSMNHEAMASTRVMSTCMALG EAAGRAARIALEEGVRPSAVDVEKVREELRQTGAYLR >gi|313158710|gb|AENZ01000033.1| GENE 68 72189 - 73496 1812 435 aa, chain - ## HITS:1 COG:RSp0834 KEGG:ns NR:ns ## COG: RSp0834 COG0477 # Protein_GI_number: 17549055 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 11 427 8 413 435 159 27.0 8e-39 MNLLQNPNKRRWYIVATIFVAIICNYLDRQLLSILKPEILTHFGIEDLQYAWIVNVFLIC YAIMYPISGILVDRFGSKKVMLGGIVAWSLACIGGGLSTNVVEFTICRGILGLAEPTIFA GQIVAVTLWFEKRHRATANSLCTAGGSLGTVIAPLVIAWMARFLPWHDVFIVAGLVGLVI AVAWALIYKNPPQEVLAVTMAPDKDEVARNNGEQQRAFTWGELWTTRSLWGVLLIRLISD PVWYFCCFWLPGYLRSMGEAQGLSQQATLDMIQYIGGLPFLFGAVGGILTSVWSDWLVRR GMPALKARKTMMILMAFVAPLCILTPYVSEWQAVSFDARIGLVVAIFSLIAIMCLSWLYT ICVVIAEAFPVKNVASVVGITAGAGAVGGAVFNIFIGSLLATMGNVLFAVMGVMHLVTAV ILWRMVRHETPKTVK >gi|313158710|gb|AENZ01000033.1| GENE 69 73513 - 75180 1921 555 aa, chain - ## HITS:1 COG:ECs0067 KEGG:ns NR:ns ## COG: ECs0067 COG1069 # Protein_GI_number: 15829321 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Escherichia coli O157:H7 # 7 553 5 551 566 442 
42.0 1e-123 MEHLYVIGIDYGTDSVRALLADAESGEGIAVAVCNYSRWGRGLYCDTAKSQFRQHPLDYL EGLEQVLRQVIAQCPAPEAIRAIAVDTTASTPCLVDRTCTPLSLTVGYEENPDAMFVLWK DHTAQRESEEITALCARGEINYARRSGNHYSSECFWSKVLHLLRGSERLRRDAWAVVELC DWIPAVLTGCRAMEDLRSGLCVAGSKVMWAEEWGGYPPEEFFAGLDPVLLPILRRLPVRT YGCDTPAGTLSPEWAAKLGLSEQVVIGVGNVDCHSGAVGAGICHGTVVLNLGTSACYMAV MPPEKMGDRMVEGIFGQVDGSILPGMVGFEAGMSAFGDVYAWFKRLLCWPLREVLLPADP ENETLRALAAQTEERLLAKLAEAAAQLPLRADAPLATDYLNGRRSPYPCNRLTGSVVGLN LSATAPELYYAFAEATVFATKAILDHLAENGVEIGRLVGIGGISQKSPFVMQLLADVTGM AIEVSGSAHSCALGAVVHAAAAAGLYPSVGAAQRALCPSVARVYTPDAAKSGILALRYER YRALGAFTENLFINR >gi|313158710|gb|AENZ01000033.1| GENE 70 75880 - 78993 4926 1037 aa, chain + ## HITS:1 COG:no KEGG:Sph21_2494 NR:ns ## KEGG: Sph21_2494 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: Sphingobacterium_21 # Pathway: not_defined # 29 1037 124 1117 1117 969 50.0 0 MKKQLRTLTKLLAAVIACCLSVAVHAQTTVRGTVTDADGMPLVGATVIVITGTQPGGGTT TDANGKFAIAAAPGQRLSVSYIGYKETSVEVTTKTDYDIRLESDNAQIDEVIVVGYGTQK KVNVTGSVATIDSKAFDKRPIVSTSAALQGMAPGVTVTTQSGAPGDDGGAIRIRGINSFG GSSTSPLVLIDGIEGSLDSVDPNLIESISILKDAASSSIYGSRAANGVILVTTKRGSKER FSITYKGYVGWQSPTDLPDMVNALEFRELTRAMYLNDGVDLNGEKPGSTPIPDYNDEDMA LYRKNWGKDPDLYPNTDWQDAVLTGSGFTHSHNVSLSVGSERVRMLTTLGYVDQEGIIKN ADFQRYTFRNNADVKFNDKMSMKLDLSFSNDDRKASPYQSTIFNYMNTRPADIPNQFSTG LYNGLGMQGMNPVALMLYGGSNNTNTIRLSGAITLTYEPAKWLSLQGMLAPRYTTTNRHN WKKPVTTYQDYKGTSTLTSASYATLTESGSRAFYGNYNFLATLKHNFSGHDLKLILGAER NTYDYKYLMAYRQVFNYNYDQIDVGEIDNMDNSGRRYEWAIQSYFGRLNYNYKERYLLEA NLRIDGSSRFNKSNRWGYFPSVSGAWRISEEPFMQDVKHIIDGLKLRASYGTLGNQNLAG GDAASYYPTTQNLAMGQISMNDNIYSLVTLNTLANPDIKWETTTMLDVGVDFSLFNKLNI TADWYRKETKDILMKLDMPLGIGLNAPYQNAGKVRNTGWEVSVGYNNQWRDFSFGVQANL SDVKNEILDMRGKTSTSGVLRNQEGYSIGSIYALKSLGIIRTQEEADWVNANCPQFKETV QIGDIRYADIDGNNSIDENDKDIVGSTIPRYTYSLNLNFGWKGLRLGLLFQGVGKTDGYL NTYYVMPSNQGGTFRKEHLDWASAENPNGKTPRLTSANKNNWYDSSFWMKSAAYLRLKNI QLGYELPKSWMHRIGLNSAYLYVNAQNLFTVTNFWDGYDPEVGYGGDSSGDFDVVKLGSA NNYPQVKIVTVGLELKF >gi|313158710|gb|AENZ01000033.1| GENE 71 79006 - 80817 2301 603 aa, chain + ## HITS:1 
COG:no KEGG:Sph21_2495 NR:ns ## KEGG: Sph21_2495 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 4 596 6 586 586 432 43.0 1e-119 MKNKIITGIALCSLFITGCSLDKAYLNGPNASTFPATKSEVEAGVFATYKGLTLIDASST PFPGIQDNASDIGASRINAANYNYQQQSKLPPSNAWVTKVYTQIYKTAARANLVLDGIDN VRELMSEQEYNMYKAELLLVRSFLYDWGCQLYGDIPYIDHTLKLGDTYTRTPKEEVISRI LNEDLADEMLDFLPIRHNKNSYGSARLGRAGAYGLKARICLNWKYYEEAAYYADKALKLA EEGGYSLQAYDIRYCGEDHTKGEPTPSNLFGINGHKNSDEWIWALQYNTAISSNQHNAGY YAAPRIAGGCSYFSPTQAFLDAIQCTDGKSIVESPLFDYQNPWKNRDPRLDLFCVRPGSR VLGMQFETNPSAQTIKNYNDKEEGVDVANSEAYGSKSEYGANGSKGPCGYLWRKYLDIQE YKYNNAFGTSSVCVLSYPLMRLAELYLIRAEANIEWNGGDLSVAKADIETIRSRAHMPGL SENLSRDGLRSALRYERMVELCNEGFRWFDIRRWDMAESVISGTLYAPALDGSVSNAVPT IDSDWHVTYNGQTFDGKDMNLRRFITMSYDPMKDALWPIPEDERIANPLITQNPGYAGHS IEE >gi|313158710|gb|AENZ01000033.1| GENE 72 80837 - 82381 2286 514 aa, chain + ## HITS:1 COG:no KEGG:Phep_3771 NR:ns ## KEGG: Phep_3771 # Name: not_defined # Def: metallophosphoesterase # Organism: P.heparinus # Pathway: not_defined # 2 511 3 521 521 330 36.0 7e-89 MIQRLYRLLLLFCSIAVFAACSDSDGNDNNGGNGGEIVPDGGDTYGYVTGANGNPVPGVV VSDGFTCTQTDAEGQYVLTRNAESGFVFYSLPSGYKVNMHKTYKLPKFFTKLKPETARYD FTLTALDAPETRFDLICIGDPQINEAAHAVRFKREAAKDIREYSSSADIPCYAIHLGDLV NNKWALYSNMVVALQPEQTGIPVFSTIGNHDHEFPTANEKEARAKYESFFGPVDYSFNRG DVHIVSMDNIIHECQESAGYTGGFSAEQYEWLKQDLSYVPKDKMVILCVHIPFRNSNYAY YDEVLELLSQYKYATIMSAHTHSNINHIHTKNGKEIFEHITGTSCGAWWRSTVCTEGTPI GFGIYRIDGAAMKEWTYKSVQHDEEFQIRLYRGSDKFTGGGDASYYFSKKNAGQIVANIW NWDPAWTVNVYENGAKTGTMTQYSDKDAWTVAYHIGILNNSSSYNKSTDHMFYYTLTNPA ADVKVEAIDRFGNKYEQTVFTDPDSHPGIYHLDY >gi|313158710|gb|AENZ01000033.1| GENE 73 82406 - 84109 1545 567 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158715|gb|EFR58102.1| ## NR: gi|313158715|gb|EFR58102.1| putative lipoprotein [Alistipes sp. 
HGB5] # 15 567 15 567 567 1033 100.0 0 MKKLLLFPCLLLALGLSSCESSDDEIMSIELVSPKKLAVTTELATASFKWDAVAKAEGYA YALDNSTEYTTIDAATTTLKLTRLSRGSHTFRIYAVGNQEHTTDSAERTVDFDIDPTLPT PAPSCTRGESADEVVVTWQAVKGAIGYAYKFNDETQWTEVGADVLKITKSGFDPEAKNTF TIYAKGQLPDSEDSEELTLPFQLIDTSEGVWARTSDGSLYELKESESGIYTAAITCKASD DITILIENTPYGFTAYSGNGGIGTVNSVYATVPFYTYPAAVYYVRESLGQMTAKTEDADI NKFWVNVSGSTCKIDVEIDCTNADKTPRYRLKLVENEDPSIILEQYFDLMVYGGDWIQTG KVASSGRKLASTSPAAIDGTEPATATASYTTFGINISSDKDAAPAYLANRGLTDWGIKYC YEFPGYVRLSNSASGSNMYYGVLTTPKLSKLTAASIVTVTFDAVRFASEHDIPVKVLNAG TIASAQVVVEGSGSAASIAPESGGKSINITSAHCPKHGNEALKKWSNFSITIEGATAETQ ICWDTTGAGSTSANGRICLDNIVVRKN >gi|313158710|gb|AENZ01000033.1| GENE 74 84188 - 85276 1229 362 aa, chain + ## HITS:1 COG:no KEGG:BT_0193 NR:ns ## KEGG: BT_0193 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 107 328 71 263 293 74 31.0 8e-12 MKKTLIFPLLAAALLCNCGGSDDKEPGAGEGNGSITVVEPTTALGAPNDDIFRLQLKPGK LITLKQVNTTNCKRAAATGMCIDIMAQDATIFAEGMTDAKLDALFSEAGAAIGSAAASLW GIHLPYRTYDISATDEAARTTAVAKLRTVINSAMKHMSPKHFVIHPSTGTILTTGSDFAA RKAQSRKSLRELQTHIASCNSTYGLHTILCVENCPRSVAYDAETMLALLSGEELEQVRVC LDTGHALIPLNGSYINPTRNGDAVAILRKLGTRLGTLHIQQNRGAEGQSGTLDKHLQPYD GGLIDWGEFYYELLKNNRYRGCFLYEVSFTDTYDGTTATIESAKANYTGLVYPAFTHRLN RQ Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:08:33 2011 Seq name: gi|313158694|gb|AENZ01000034.1| Alistipes sp. HGB5 contig00012, whole genome shotgun sequence Length of sequence - 12163 bp Number of predicted genes - 14, with homology - 14 Number of transcription units - 8, operones - 3 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 4 - 63 2.3 1 1 Op 1 . + CDS 313 - 1536 1004 ## COG0582 Integrase 2 1 Op 2 . + CDS 1533 - 1922 158 ## BDI_3501 hypothetical protein + Term 2041 - 2089 0.6 + Prom 2011 - 2070 1.9 3 2 Tu 1 . + CDS 2200 - 3405 1125 ## BDI_3502 hypothetical protein + Prom 3407 - 3466 3.2 4 3 Tu 1 . 
+ CDS 3663 - 4601 393 ## BDI_3503 DNA primase + Prom 4693 - 4752 2.4 5 4 Op 1 . + CDS 4778 - 5143 435 ## BDI_3504 mobilization protein BmgB 6 4 Op 2 . + CDS 5140 - 6087 671 ## BDI_3505 mobilization protein BmgA - Term 5892 - 5931 -0.8 7 5 Tu 1 . - CDS 6109 - 6714 517 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 6794 - 6853 3.9 + Prom 6513 - 6572 2.8 8 6 Tu 1 . + CDS 6816 - 6980 195 ## gi|313145646|ref|ZP_07807839.1| predicted protein 9 7 Tu 1 . - CDS 7028 - 7927 602 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 8005 - 8064 2.5 + Prom 7919 - 7978 5.5 10 8 Op 1 . + CDS 8058 - 8834 591 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold 11 8 Op 2 . + CDS 8864 - 9499 329 ## COG4422 Bacteriophage protein gp37 12 8 Op 3 . + CDS 9496 - 10371 363 ## COG1533 DNA repair photolyase 13 8 Op 4 13/0.000 + CDS 10420 - 11676 532 ## COG0642 Signal transduction histidine kinase 14 8 Op 5 . + CDS 11726 - 12161 142 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains Predicted protein(s) >gi|313158694|gb|AENZ01000034.1| GENE 1 313 - 1536 1004 407 aa, chain + ## HITS:1 COG:APE0805 KEGG:ns NR:ns ## COG: APE0805 COG0582 # Protein_GI_number: 14600982 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Aeropyrum pernix # 192 389 83 286 362 64 26.0 4e-10 MARSTFKVLFYVNGSKEKNGIVPIMGRVTINGTVAQFSCKQNIPKTLWDVKGNKAKGKSR EARDINLALDNIKAQIIKHYQRISDREAFVTAEMVRNAYQGIGSEYETLLKAFDRENEVF KKRVGKDRVMATYRARVRARNHVAAFIKSFYRRSDMSMLELTPDFIKEFAAYLSTEAGLR NGSIWANCMWLKGVVMKAHYNGLIPRNPFIQFHISPNVKEREYLTEDELKAVMTHEFADS KLAYIRDIFIFASFTALSFVDIQELTNDNIVEVNGEKWILSKRHKTKVAFQVKLLDIPLQ IVERYRPMQKDNRIFPGLNYWSICKPLKRMIRECGITKSISFHCSRHGFATLALSKGMPI ESVSRVLGHTNIVTTQLYAKITTQKLDNDLTMFGDKLSKTFNGITMS >gi|313158694|gb|AENZ01000034.1| GENE 2 1533 - 1922 158 129 aa, chain + ## HITS:1 COG:no KEGG:BDI_3501 NR:ns ## KEGG: BDI_3501 # Name: not_defined # Def: hypothetical protein # Organism: 
P.distasonis # Pathway: not_defined # 3 129 1 127 127 188 75.0 5e-47 MSMRRNSITVNESGNIIMPENVSNIWMSEPELVELFGVIAPTLRSAIRNIYNSGALKEYE VQKYVRQENGYHADVFSFPMIVALAFRIDSFGAEQVRNAIFKRLYLRKEKTNIFFSLGIN GWDMSNYQA >gi|313158694|gb|AENZ01000034.1| GENE 3 2200 - 3405 1125 401 aa, chain + ## HITS:1 COG:no KEGG:BDI_3502 NR:ns ## KEGG: BDI_3502 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 401 1 401 401 756 94.0 0 MKQGNFENEEFIRVGTTLYKLVNQPRLNGGYVKKRIVWNNETLRQDYGKDYLATVPKYDG FCTVPDHVDYRPVVDKFLNLYEPIGHRPQQGEFPCIRSLVRHIFGEQYELGMDYLQLLYL QPVQKLPILLLVSEERNTGKSTFLNFLKAVFQNNVTFNTNEDFRSQFNSDWAGKLLIVVD EVLLNRREDSERLKNLSTTLSYKVEAKGKDRDEIAFFAKFVLCSNNEYLPVIIDAGETRY WVRKIDRLQSDDTDFLQKLKAEIPAFLYYLQYRQLSTEKESRMWFAPSLLHTEALQKIIR SNRNRLEIEMCELILDIMESVGTDTFSFCHNDILLLLVHSQVKVEKHQVRKVLQECWKLT PASNGLTYTTYQFNYNRECRYEPIKRVGRFYTVTREQLESL >gi|313158694|gb|AENZ01000034.1| GENE 4 3663 - 4601 393 312 aa, chain + ## HITS:1 COG:no KEGG:BDI_3503 NR:ns ## KEGG: BDI_3503 # Name: not_defined # Def: DNA primase # Organism: P.distasonis # Pathway: not_defined # 1 312 1 312 312 567 84.0 1e-160 MNIQEAKQIKIADYLQSLGYSPVKQQGNCLWYKSPFRQETEASFKVNTDRNLWFDYGLGR GGNIIALAGVLYASDHVPYLLGKIAEQAPHIRPISFSFRQQASEPSFQHLEVGELTHPAL LRYLQERGINIALAKAQCKELHFTHNGKPYFAIGFPNVAGGFEVRNRFFKGCVAPKDISH IRQQGEAREKCLVFEGMMDYLSFLTLRKRNCSNLPDLDRQDYVILNSTANVSKAIDVLHG YGRIHCMLDNDEAGRKAYRELERKFAGRIRDFSDNYKGHKDLNDYLRGIRQKLAVSPPPR TIVKPKKKGLGL >gi|313158694|gb|AENZ01000034.1| GENE 5 4778 - 5143 435 121 aa, chain + ## HITS:1 COG:no KEGG:BDI_3504 NR:ns ## KEGG: BDI_3504 # Name: not_defined # Def: mobilization protein BmgB # Organism: P.distasonis # Pathway: not_defined # 5 121 2 118 118 201 90.0 9e-51 MEQTKEHKERNKGGRPKKEATEKLKYRIAVKMTAADYFRLLTRSHEAGVSPSEYMRECFR NGHVKERLSEEHAGYIRQLCGMANNLNQLARKANAGGFHDERWDCKVAVARIHELITKIG I >gi|313158694|gb|AENZ01000034.1| GENE 6 5140 - 6087 671 315 aa, chain + ## HITS:1 COG:no KEGG:BDI_3505 NR:ns ## KEGG: BDI_3505 # Name: not_defined # Def: mobilization protein BmgA # 
Organism: P.distasonis # Pathway: not_defined # 1 301 1 301 313 493 81.0 1e-138 MMAKIVKGSDFKGAVDYIIDKKKDTQIVASEGLFMENLETIAMSFNAQSRMNDKVTKPVG HIALSFSKEDELRLTNRIMAGIALEYMERMGIKNTQFFIAQHFDKEHPHVHIAFNRIDNN GNTISDRHERLRSTRICKELTLKYGLHMAGGKDNVKRNRLKEPDRTKYALYDIIKMEVGR CGNWNVLIANLKRQGVEVHFKHKGQTSEVQGVVFSMNGYHFNGSKVDRRFSYSKIDAALQ HNRCRERMGITAGIYETATPSAPSGTAQSELFNGSWGLLKSSGSSYNAADAEANQEMVDI LRRRKKVKRKRGVGF >gi|313158694|gb|AENZ01000034.1| GENE 7 6109 - 6714 517 201 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 80 197 175 285 288 67 33.0 2e-11 MFHPDLLRNTPLGRIMREYSFFSYETNEALHLSVEERQIIMDCFHKIQYELNHPIHRHSK NLITDSIKTFLDYCTRFYDRQFITRDNQNRDILARFERLLDEYFHDGAAKRIGLPTVQYC ADKLCLSPNYFSDLLKKETGSTALHFIHDKSIEIAKTELASTDDTVNEIAYNLGFQYPQH FTRLFKKEVGYTPNEYRAQVS >gi|313158694|gb|AENZ01000034.1| GENE 8 6816 - 6980 195 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313145646|ref|ZP_07807839.1| ## NR: gi|313145646|ref|ZP_07807839.1| predicted protein [Bacteroides fragilis 3_1_12] hypothetical protein HMPREF9011_04051 [Bacteroides sp. 3_1_40A] predicted protein [Bacteroides fragilis 3_1_12] hypothetical protein HMPREF9011_04051 [Bacteroides sp. 
3_1_40A] # 1 54 9 62 62 100 98.0 5e-20 MKPERMKQNLDILDFTLSADDMARIKTLDTDKPFLLGSHEDPEIVKWFMQYKNA >gi|313158694|gb|AENZ01000034.1| GENE 9 7028 - 7927 602 299 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 196 283 191 278 288 70 38.0 3e-12 MDEIINPDRLVSVPFNKTRCGVDFYINTAINKDIGLVLTENKRFKTDFFSFYFFRKANGY LLLNFRKIELRDGMVLLLSPHQQQEWHVDETALDYTFLIFREDFMRTFIADKFFVFRLLY CYQTDTPPYINATHDEMKEYMRLLGKIKYELMNPVSDTYNIIVSLLYYLLLIINRTYAAA YCLPAEIAKNNFAFRFKDLLEQNIRTHQRVQEYADMLHVSRITLNNSVKAQFGVSATHLI KQRLLEELKNELLFSNRTVSEMADDFNFSDPSHLMRFFKQQTGKTFTQYMTDYNKGIYE >gi|313158694|gb|AENZ01000034.1| GENE 10 8058 - 8834 591 258 aa, chain + ## HITS:1 COG:SA2259 KEGG:ns NR:ns ## COG: SA2259 COG2220 # Protein_GI_number: 15928050 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Staphylococcus aureus N315 # 7 254 3 249 255 159 36.0 4e-39 MKKTSTVQLVRNATLKIRYAGHTMLIDPVLADKGTLISALGVNKTPRVHLTIPIQDIIGG VDMVLLTHNHIDHYEPSVPTHLPKEIPFYVQPQDADAIRNDGFTNVIPIEEIKTIDGISI YRTTGHHGFGQIGQMMGPVSGYVLKAEGFPTVYIMGDCRWEACIRDTVERFNPDYIVVNS GGAIFPEFSKTDGPIIPDENEVMQILDELPSHIKLIAVHMDAIDHCQTTRAILRNEATHH EADMSRLIIPEDGETVVL >gi|313158694|gb|AENZ01000034.1| GENE 11 8864 - 9499 329 211 aa, chain + ## HITS:1 COG:mll4707 KEGG:ns NR:ns ## COG: mll4707 COG4422 # Protein_GI_number: 13473946 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Mesorhizobium loti # 2 187 14 222 249 74 28.0 2e-13 MNIVVGCTIGCPYCYARNNCRRFHITDDFSVPEYMERKLRIIDTSRPHVWLMTGMSDFSD WKPEWNAEIFERISSNPQHAYIFLTKRPDKISFSSDDENVWMGVTVTRSSEKRRIDDLKK NIKARHYHVTFEPLFDDIGEIDFEGIDWIVIGTETGNRKGKSYSRPEWVLSIAEQAKAHG IPVFMKEDLLPIMGDERMIQELPEQFTRRIQ >gi|313158694|gb|AENZ01000034.1| GENE 12 9496 - 10371 363 291 aa, chain + ## HITS:1 COG:MJ0683 KEGG:ns NR:ns ## COG: MJ0683 COG1533 # 
Protein_GI_number: 15668864 # Func_class: L Replication, recombination and repair # Function: DNA repair photolyase # Organism: Methanococcus jannaschii # 18 237 19 232 259 141 36.0 1e-33 MNMDTIKEIDVKSVMTKSSLPVGGYSVNPYVGCPHACRYCYASFMKRFTGHTEPWGTFLD VKNWKPITNPHKYDGERVVIGSVTDGYNPYEEEFHRTRRLLEELRGSDAEIMICTKSDLV LRDLDLLKSFPKVTVSWSVNTLDEQFCADMDNAVSIERRLKAMRQTYEAGIRTVCFVSPI FPRITDVKAIIEEVKDYADLIWLENLNLRGQFKGGIMTYIREKYPDLIPLYEEIYNKKRL EYWQALEQDISNYAKEQGFPYRINDLPYGRSEKGKPVIVNYFYHEKIRLTK >gi|313158694|gb|AENZ01000034.1| GENE 13 10420 - 11676 532 418 aa, chain + ## HITS:1 COG:mlr3215_2 KEGG:ns NR:ns ## COG: mlr3215_2 COG0642 # Protein_GI_number: 13472804 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 2 316 39 351 382 110 29.0 5e-24 MYVDNIRQSSERMREMLNTLLDFFRLDNGKEQPNLSPCRISAITHILETEFMPIAMNKGL TLIVESHTDAVVLTDKERILQIGNNLLSNAIKFTDNGSVSLAADYDNGLLKLIVEDTGTG MTEDEQQRVFGAFERLSNAAAKDGFGLGLSIVQRIVAMLGGTIRLESEKGKGSRFTVEIP MQTAEELPEQTIQAQMRHNHICHDVIAIDNDEVLLLMLKEMYAHEGMHCDTCTNVSVLME LIRKKEYSLLLTDLNMPEISGFELLELLRSSNVGNSKSIPIVVTTASGSCSKEELIGRGF AGCLFKPFSISELMEITDKCALSSILDEKPDFSTLLSYGNESVMLDKLIAETEKEMQAVK DAEQRKDLQELDTLTHHLRSSWQILRADQPLRELYAQLHGSATPDDEAICKAVTGCAG >gi|313158694|gb|AENZ01000034.1| GENE 14 11726 - 12161 142 145 aa, chain + ## HITS:1 COG:NMA0159 KEGG:ns NR:ns ## COG: NMA0159 COG2204 # Protein_GI_number: 15793186 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Neisseria meningitidis Z2491 # 1 143 1 145 425 79 31.0 2e-15 MDKTRIIVVEDNIVYCEFVCNLLAREGFRTVQAYHLSTAKKLLQQASDGDIVVSDLRLPD GDGIDLLRWMRKEGMTQPLIIMTNYAEVHTAVESMKLGSLDYIPKQLVEDKLMPLLHTIL KERSIGRNRMPVFARDSSAFQKIMQ Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:09:15 2011 Seq name: gi|313158645|gb|AENZ01000035.1| Alistipes sp. 
HGB5 contig00015, whole genome shotgun sequence Length of sequence - 64784 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 20, operones - 14 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 3 - 1326 1225 ## COG3525 N-acetyl-beta-hexosaminidase 2 1 Op 2 . - CDS 1341 - 2915 1500 ## COG3119 Arylsulfatase A and related enzymes - Term 2931 - 2972 7.6 3 2 Op 1 . - CDS 3022 - 4263 1236 ## BT_1049 putative patatin-like protein 4 2 Op 2 . - CDS 4281 - 5402 1268 ## BT_1048 putative secreted endoglycosidase 5 2 Op 3 . - CDS 5456 - 7033 2111 ## BF1327 hypothetical protein 6 2 Op 4 . - CDS 7047 - 10364 4307 ## BF1326 hypothetical protein 7 3 Op 1 . - CDS 10541 - 11476 669 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 11502 - 11561 3.1 8 3 Op 2 . - CDS 11563 - 12351 818 ## Bache_2931 RNA polymerase, sigma-24 subunit, ECF subfamily - Prom 12519 - 12578 4.0 + Prom 12479 - 12538 4.3 9 4 Op 1 6/0.000 + CDS 12570 - 13163 645 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 10 4 Op 2 . + CDS 13239 - 14141 1234 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 14261 - 14296 2.5 + Prom 14152 - 14211 3.1 11 5 Op 1 . + CDS 14312 - 17665 5111 ## BT_4707 hypothetical protein 12 5 Op 2 . + CDS 17683 - 19272 2244 ## Cpin_6734 hypothetical protein 13 5 Op 3 . + CDS 19301 - 20392 1709 ## BVU_0617 glycoside hydrolase family protein 14 5 Op 4 . + CDS 20422 - 21588 1646 ## BT_4710 hypothetical protein + Term 21614 - 21651 6.2 - Term 21592 - 21647 13.1 15 6 Op 1 . - CDS 21665 - 22819 872 ## Palpr_0285 hypothetical protein - Prom 22850 - 22909 1.9 16 6 Op 2 . - CDS 22970 - 23554 898 ## BF1053 RNA polymerase ECF-type sigma factor - Prom 23606 - 23665 7.5 + Prom 23595 - 23654 6.8 17 7 Tu 1 . + CDS 23676 - 24686 1462 ## COG3712 Fe2+-dicitrate sensor, membrane component 18 8 Op 1 . + CDS 24868 - 28230 5475 ## BT_4707 hypothetical protein 19 8 Op 2 . 
+ CDS 28249 - 29865 2245 ## BT_4708 hypothetical protein 20 8 Op 3 . + CDS 29885 - 30868 1488 ## BT_4709 glycosyl hydrolase 21 8 Op 4 . + CDS 30889 - 33558 3474 ## gi|313158676|gb|EFR58065.1| F5/8 type C domain protein 22 8 Op 5 . + CDS 33573 - 34733 1469 ## BT_4710 hypothetical protein 23 8 Op 6 . + CDS 34813 - 35547 977 ## Palpr_0285 hypothetical protein 24 9 Op 1 . + CDS 35670 - 37895 2580 ## COG3537 Putative alpha-1,2-mannosidase 25 9 Op 2 . + CDS 37892 - 40207 2960 ## COG1472 Beta-glucosidase-related glycosidases 26 9 Op 3 . + CDS 40212 - 42458 2505 ## COG3525 N-acetyl-beta-hexosaminidase - Term 42566 - 42613 17.3 27 10 Tu 1 . - CDS 42665 - 43129 635 ## gi|313158667|gb|EFR58056.1| hypothetical protein HMPREF9720_0441 - Prom 43208 - 43267 4.5 - Term 43530 - 43576 10.5 28 11 Op 1 . - CDS 43680 - 45176 2249 ## COG3263 NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain 29 11 Op 2 . - CDS 45192 - 45725 705 ## gi|313158685|gb|EFR58074.1| hypothetical protein HMPREF9720_0443 30 11 Op 3 . - CDS 45784 - 46149 218 ## gi|291513594|emb|CBK62804.1| hypothetical protein AL1_00780 - Term 46280 - 46318 -0.1 31 12 Tu 1 . - CDS 46330 - 47133 1201 ## Odosp_3205 hypothetical protein - Prom 47193 - 47252 2.2 + Prom 47114 - 47173 3.9 32 13 Op 1 . + CDS 47220 - 48611 2021 ## COG0774 UDP-3-O-acyl-N-acetylglucosamine deacetylase 33 13 Op 2 . + CDS 48617 - 49411 1260 ## COG1043 Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase + Term 49482 - 49520 5.3 34 14 Tu 1 . + CDS 49539 - 50006 717 ## BDI_3532 hypothetical protein 35 15 Op 1 20/0.000 + CDS 50240 - 51472 2150 ## COG0195 Transcription elongation factor 36 15 Op 2 . + CDS 51503 - 54418 4134 ## COG0532 Translation initiation factor 2 (IF-2; GTPase) 37 15 Op 3 . + CDS 54426 - 55061 994 ## gi|313158692|gb|EFR58081.1| hypothetical protein HMPREF9720_0450 38 15 Op 4 . + CDS 55086 - 56405 1956 ## COG0427 Acetyl-CoA hydrolase + Term 56425 - 56464 8.5 39 16 Op 1 . 
+ CDS 56487 - 57716 1657 ## COG1524 Uncharacterized proteins of the AP superfamily 40 16 Op 2 . + CDS 57736 - 58542 1026 ## BVU_3979 hypothetical protein 41 17 Tu 1 . + CDS 59491 - 59961 -3 ## gi|313158659|gb|EFR58048.1| conserved domain protein + Prom 60030 - 60089 4.5 42 18 Op 1 . + CDS 60205 - 60375 107 ## 43 18 Op 2 . + CDS 60372 - 60743 346 ## gi|313158689|gb|EFR58078.1| hypothetical protein HMPREF9720_0455 + Term 60993 - 61032 8.2 + Prom 61378 - 61437 6.2 44 19 Op 1 . + CDS 61506 - 62351 478 ## Lbys_1555 hypothetical protein 45 19 Op 2 . + CDS 62355 - 63209 643 ## Lbys_1555 hypothetical protein + Term 63226 - 63263 8.5 + Prom 63474 - 63533 3.7 46 20 Tu 1 . + CDS 63618 - 64782 1203 ## COG0582 Integrase Predicted protein(s) >gi|313158645|gb|AENZ01000035.1| GENE 1 3 - 1326 1225 441 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 30 441 31 423 757 248 36.0 2e-65 MTLVRIYAALAVGMWLMACTGPAPDERPVLIPAPQKTIWGDGTYRLPERLQLGIADTVLI AAAGYVNEVLAPYGTVAVDRCDRGDIVLELDSAALPPEGYRLSVTHAGVRIGGGSYRGVV NGIATLRQLLPAKAGSGAKIPYAEIADRPAFGWRGVMLDVSRHFFDKEEIFTLLDQMARL KLNKFHWHLTDDQGWRIEIPCYPELTQEGAWRLFNRQDTLCMGRAAREQNDDFLLPVRRL RTDGADTLYGGYYTRKDIREVVAYAAVRGIDVIPEIDMPGHCLQAIDSYPWLACFGRGSW GQSFSSPLCVGKDRTLAFCESVWEELFELFPYEYVHMGGDEVDKSNWKRCPDCQTRMRAE GLPDEAALQAWFMHRMQRFCEARGRRMIGWDEILEGGAVPGATVMWWRPWEPQSVSAATR QGCEVVLCPQSWFYFSLEEDA >gi|313158645|gb|AENZ01000035.1| GENE 2 1341 - 2915 1500 524 aa, chain - ## HITS:1 COG:PM0598 KEGG:ns NR:ns ## COG: PM0598 COG3119 # Protein_GI_number: 15602463 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Pasteurella multocida # 44 510 1 456 467 131 25.0 4e-30 MEKRLFLLGGLAAVAATGCARQSEAPARPMNILYIMTDDHTAQMMSCYDTRYASTPNLDR IAAEGVRFTNSFVANSLSGPSRACMLTGKHSHKNGFYDNTTCVFDNTQQTFPKLLQQAGY 
ETAVVGKWHLESLPTGFDHWEIVPGQGDYYNPDFIRQTGDTVRSEGYITNLITDKSLDWL RSGRDPEKPFCLLVHHKAQHRNWMADTCNLALYEELDFPLPDTFFDDYEGRPAAAAQEMS IMKDMDLIYDLKMWRPDKRSRLKGTYEAFVGRMNPEQRAAWDRFYTSLIEDFYRRNPQGR ELAQWKFTRYMRDYMKVLKSLDDNVGRLLDYLDESGLAENTLVVYTSDQGFYMGEHGWFD KRFMYEESFRTPLVMRLPGGVRGDIAEMVQNIDYAPTFLELAGVAVPDDIQGVSLLPLLR GEHPAGWRRSLYYHFYEYPAEHMVRRHYGVRNDRWKLIHFYNDIDQWELYDLQEDPHELR NLYGLPEYAAPRREMTDELVRLQTQYGDTLALRRNGRVTAANGN >gi|313158645|gb|AENZ01000035.1| GENE 3 3022 - 4263 1236 413 aa, chain - ## HITS:1 COG:no KEGG:BT_1049 NR:ns ## KEGG: BT_1049 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 381 1 370 393 175 31.0 3e-42 MKNMLNIRICRALFAGALLTGLVTGCDNADLGALDNAAYIKEAQKASSTTVKVEDDGGKT SLTVRLGGKGDTETAFRFEMNPDVLERYNALNGTSYVQLPETCFELPADPVIVPAGEVAA APAQIDILPFSEEMDDSGSVYALPVTLRCVSGGMKMLGDASDFLIVCERKKIIPVPIFNS EYRTGGSSKLNRVMLNMKDAPITFNAYTIEFKMYKEEFTARNYMIVGFDNGEGNINNRMW VRFEASSTTSDVVNRWMQMNTMAQPGQTAVTPCEAKKWQHVAIVFDGAATSLYLDGEQVV QKSAPRNTVTFKYFALLPQPSYCNTTISFSEVRLWNVARTSTQLKNNVNNVDPKSEGLLG YWKMNEGHGYTFEDATGNGNTAECEYSRAGDLWDPQSYGPATGLEWAPDMDNM >gi|313158645|gb|AENZ01000035.1| GENE 4 4281 - 5402 1268 373 aa, chain - ## HITS:1 COG:no KEGG:BT_1048 NR:ns ## KEGG: BT_1048 # Name: not_defined # Def: putative secreted endoglycosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 364 19 370 375 247 40.0 6e-64 MIAVLAAGAFVSCSDWTEPEARDFTQHKPASYYENLRAWKADKSHPIAFGWFGGWDPTGA TTATSLMGIPDSVSLVANWATFSLTPEQMEEVVQVKKLKGTDVVVTILLTNVGAKATPEE VTAGVEDTWEQVRLMREYWGWTDDADAAQIEAAIRKYANGLVDEVLKYGYTGLDLDYEPG LGSYHNGNIVMNQSQQGDIYAGTSPSQRTTWFVDECSKRLGPKSGSGKLLIVDGLVSSMP KETIECFDYYILQTYALTAQSSLDSYRLAGLVNAFGDIIDEETITNRTLVTENFEPEAMW KYGGTSCRLPDGTYTNSLQAMALWQPANGFRKGGIGAYQMQNDFKNDCYKYFRAAINAMD KLEKGGTEADDQQ >gi|313158645|gb|AENZ01000035.1| GENE 5 5456 - 7033 2111 525 aa, chain - ## HITS:1 COG:no KEGG:BF1327 NR:ns ## KEGG: BF1327 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 25 525 21 513 514 383 
43.0 1e-104 MKANTIIRRFGRFAACVAMLAAAACTGNFEDYNTKPFRPSDDDLNGDNVGVGILFPAMME FVTHFQVNASQMHDVLVADELGGYASAVKSYQGQNIATYNPSDKFNDYIFEQTFANFYGN YFKVETQTGGEGPVYQLARILRVAAMLRVTDAYGPIPYSKMQNGTFSVPYDSQRDVYMAM IDDLDAAMETLYTFASAGDGILMRDFDISSFKGDSRLWVSFANTLKLRMAIRMSGVEPEA RRIAEEAVRDMATYGLIDANAENLSFSSDSRQNPFYTQATSTSWQDLRSNASIVMYMNAY EDPRRASYFSKSGYDGIYVGVRAGIQNVTPTDYATYSYPLFAEKDPVPVMYASESWFLRA EGALLGWTMGDTAENLYNTGVKTSLELWGNGSSYETYIASETAPALSYADPRNKHNVSDF GTAVSVKWADDGNELVRIITQKWIANHMIGHEAWADVRRTGYPRLLQVVNNLSGGMWGNV DSSKGMRRLHYPQSEYDNNRENVLKGVELLGGSDLHGTDVWWAKK >gi|313158645|gb|AENZ01000035.1| GENE 6 7047 - 10364 4307 1105 aa, chain - ## HITS:1 COG:no KEGG:BF1326 NR:ns ## KEGG: BF1326 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 30 1105 27 1101 1101 1152 55.0 0 MKKKSIRQSKNLIISLFAAFLCLAAGTVRAQAGRVTIDMQQVKMTQIMDEIEKQTKFLFI YDKNISVDRTASVQVREQPVAEALAQMLAGTDLTYEINSSNIILSKRSADAAASSSLRNV AGVVLDAHGQPIIGASVVIKGTTIGISTNVDGSFGLQIPASVKDPVLVVNYLGCEPQEIA VGSRADFRITMVESAVALDNVVVTALGIKRQEKALSYNVQQVKSEDIVTVKDANFMNALN GKVAGVQINSSAAGVGGAARVVMRGAKSLYKDNNALYVIDGIPMSNVSFGSNDDGLQGNY MGSDGVADINPDDIESISVLTGPSAAALYGSDAANGVVLINTKKGQADKTSVTFSNSTTF SSPFVMPRFQNTYGNVSGSYQSWGDKTLRRFDPEGFFNTESNVNNSLTFSTGTQKHQTYF SAATTNARNILPNSGYNRYNFTFRDVTKFLKDKLTVDLSASYILQDNKNMIAAGKYFNPL PALYLFPRGDDFDNVRLYQRWDESRNLYTQAWDYGSGDMSFQNPYWIMNKMINQTKKRRY MLSANLKYDVTDWLDVTGRVKVDNADMDITEKRYAGTDTNFAGEKGMYSIIKRNDRQIYA DAIANIDLTFAGDYHLSANVGASLKDVRMDFMGWGGDLRKIPNFFSITNISTTNYKERED GSHIQSQSIFANVELAWKSMLYLTVTGRNDWESALAFSKYKSFFYPSVGLSAVVSEMVDM PEWFSFLKVRGSYSAVGSSYAAYLTKPYYDYKGQTHEWDSLHRFPNENLKPEKTKSWEVG LNARFFGGKLNFDATWYRSDTFNQTFESTPSSTSGYSSVLVQAGQVRNSGLEMLLNYHNT WRDFTWNTSVTYTMNRNKIIRLANGVISPVDGQPIDMPYLEKATLGDAGSPQVILYEGGT MGDLYINRELRTDLNGNIYVDPQTNKVELTTTERRKVASLLPKGNIGWSNSFAWKGLNLN VMFAARLGGSVVSNTEAFLDYYGVSERSAAAREAGGVRINNGMVDAQNYYQTIGAGTGAG AYYIYDATNVRMQELSLAYTLPKRWFRQKMAMTISLVGRNLWMIYNKAPFDPELTPSTAD NFLQGVDYFMLPSLRNIGFTVKLQF >gi|313158645|gb|AENZ01000035.1| GENE 7 10541 - 11476 669 311 aa, chain - 
## HITS:1 COG:PA1364 KEGG:ns NR:ns ## COG: PA1364 COG3712 # Protein_GI_number: 15596561 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 101 222 68 186 280 70 41.0 3e-12 MEKETLYRFYKGEATSDEQRRLMEWLDADQENRRIFDRERGMYNALLLFAPEEKAARGRL LNWRRVTRYAVQAAAVVILAVGIGWGYVSYKERSWANLMTRVAAPEGQRINLTLQDGTNV WLNSGAEIEYPSLFAGNSRQVRLSGEALFDVSRDTRRPFVVETFACKVEVLGTRFNVNAD ERHDVFSTALMRGSVRVSSIADPQQQVVLKPHEKVQLSGGQLSLRTSDDPNEYLWAEGLI SVSGRSFADILHRFEHCYGVRFDVKLDRMPDIEAMGKIRISDGIEHALRILQRSCSFHYA YDQETSTITIY >gi|313158645|gb|AENZ01000035.1| GENE 8 11563 - 12351 818 262 aa, chain - ## HITS:1 COG:no KEGG:Bache_2931 NR:ns ## KEGG: Bache_2931 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: B.helcogenes # Pathway: not_defined # 80 252 16 188 195 155 42.0 1e-36 MGRGRGRTHAIQRFQAEIRNLPCRFIQSGRGPAEILGGNFFGTGFSSLKIISYFYSIEFC NPTMKDRNVITPPLHKEAEFGELYKRYKARFESFAVGYVQDRLIAEDVVTDAFIYYWENR SRLDADCNAAAYILTTIRHKCLNYLRALQVRMRVHNEVGDLQRRVVSENIRSLELCDPQH LFAGEIEKLVRDCMAEMPELTRQVFTARRIHGKSYREIAAECGITERRVETELEKALGKL REVLRDYLPLVLAAGVLERILR >gi|313158645|gb|AENZ01000035.1| GENE 9 12570 - 13163 645 197 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 19 195 13 195 206 60 25.0 2e-09 MRSKGRSMTKKRAQIAFDDVLRQVTEGNEEAFQQLYLHYHDRLFQFARMYLHQQQAAEDV VADLFFQLWKSRGMLGGIENFNAYIYRAVRNSCTNYLLSGYRNRMSGFTQVQLQVCIDPA VPADEAVDFQLLNSALTDAVEQLPERCRIIFKLAKEDGMSHREIAAALDISPSTVEGQLA IAMRRLKAAAAPFLKKI >gi|313158645|gb|AENZ01000035.1| GENE 10 13239 - 14141 1234 300 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: 
Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 76 298 108 322 331 89 29.0 1e-17 MTDSKHTNDNNSLFTRIQVKSPCADERYSAGNSFARLMTRIHADAVNKARPERNYTKAYR IWLAAASLMIVFLTSGWLATALRPAPEMLIRNNAWEKVENLTLADGTQLTLNRGAQLIYP EKFAGRTREIFLSGEAYFDVAHDKKHPFIVRAGDLKIKVLGTKFNIEAYPDSETITTTLL EGSVEVESRLSRECIRMVPSQQLSYDIQSGEMKLSTISDSKEPVRWKDNVWVLHQTPFTQ VCHRLEQMFNVKIVIMNDKFIGKKFTGEFRFGDSLESILEVIRITTPFTYEREEDTIILK >gi|313158645|gb|AENZ01000035.1| GENE 11 14312 - 17665 5111 1117 aa, chain + ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 1117 1 1099 1099 801 43.0 0 MTNFDKFKTYCRKFCVQTRLAVRISLSITAITTFATQAAAQNNPQEIRLDFNDAALSQVL KSIETQSRYTFFYNNDIDVTQSVSAHIASSDINSVVTAVLKGTDIAHRITDNRVVLFVGG GDAADDQERSISGKVTDAAGLPLVGVTVLVTGTNFAAVTDVNGGYTVRVPGSGAEIVYSY IGYQSQQRIVGSQTVINLALQESTAEIGAVVVTALGIKRDQKSLTYNVQQIAGDDISIVK DVSVVNSLVGKVAGVRINQSSSGTGGSTRVVMRGAKSLFGDNNVLYVLNGIPMMSLRSTQ SDNYYEGAGVGDSDGISSINPDDIESLSVLTGASAAALYGNRGANGVILITTKKGSADHS THVSYSNNTTFSNPFVTHQFQNTYGRQSGDFKSWGEKLGKPSSYDPNDFFQTGYNTQNTV SVASSSERSQSFVSAGSVNSRGIIPNNEYSRYNFTIRHGRELVKDKLELDTDFFFTKSES QNGVAQGLYYNPLLPVYLFPPSENFERLKTYEIYDSGRNFATMYWPYSNTGLDALSQNPY WITNRNMFNTTQNRYIVSGSLKWKITDWINLSGRIRMDHADTDYERKLYASTPGLFCQSK GHYREQKMTNNALYADVLLNIDKQLSDDFRLTANIGASLYDEKYDMMGQSGNLLRLANFF HVSNINMSDPTTEMTREHIRQQTQAVYGSVQVGFRNFVYVDASVRSDWFSTLAGTPSMSS VYPSVGASLILSELIPQNKILNYWKIRGSYAGVGNPPSPYLTYTYIPLEDQNISTDGFAA ASHLKPELTKSFEIGTELRMFNNKLSLEFTYYNTNTHNQLFKYEIPSSSGYSYAYANAGK VNNRGIELKAGFNQNIGPVKWETTLTYTRNRNEIKELLDEYVTDPVTGTTVKAPQEFIVS TAESYRMVLTKGGTMGDIYATHLKQDPNGYIYVNPMTNGIEIEPNSYVKVGSIDPKYTLG WHNSFSWKGLNLGFLIDARVGGVVVSGTEAMMDQYGVSKSTARDRDNGGAIVNGGRMDAK TFYSVAASGTTGVLSNYVYSATNVRLRELSLSYTLPALARERLRITLSLIANNVLMLHNE APFDPELTSNTGTYYQGFDYFMPPSLRSWGFGVKVNF >gi|313158645|gb|AENZ01000035.1| GENE 12 17683 - 19272 2244 529 aa, chain + ## HITS:1 COG:no KEGG:Cpin_6734 NR:ns ## KEGG: Cpin_6734 # Name: not_defined # Def: hypothetical protein 
# Organism: C.pinensis # Pathway: not_defined # 12 529 12 543 543 355 40.0 4e-96 MKKLRNTIACGVAALALLATGLTSCTGDFDNINTSKDKPTPEDVDRDNAWAAFIQTMQRN VFAEGANAYQLADNLLGDSYAGYFGQAQDWDSGSNSTCYAFPTAKWKDEPYKQAYNNVMS SWNILRQKVDSASVVFALGEVVKVQAMHRVTDIYGPLPYTAFGKMSSGLPYDSQETVYRT FIKELDHALKTLKAAYEADSGAKPIADFDVVFDSDLEKWIRFANSLKLRLAMRARFAAPG DAQTWAEEAVNFQFGAYNIGVMTDNTHSAQLLTRTGIGFSYKHPLEYLWGEYNECRMGAT MESYLSGYGDPRLESYFSPAEKDGAYHGIPNGIISNPKDYQSLASCPKVGFTSPLTWMCA AEVCFLRAEGALLGWNMGGTAENLYNEGIRTSFSQWGAPNADKYLSDSKSTPMKYPGLGQ VGSIESPSSVTVKWAADGRELERIMVQKWIALYPNGQEAWSEIRRTGYPKVFPSQTNRSG GVLPSGATIRRVPLPQQEYDGNAEQVAKAVNEFLGGNDNCAQKLWWDAK >gi|313158645|gb|AENZ01000035.1| GENE 13 19301 - 20392 1709 363 aa, chain + ## HITS:1 COG:no KEGG:BVU_0617 NR:ns ## KEGG: BVU_0617 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 33 363 35 351 352 249 44.0 9e-65 MNKLIKLGCIALAGTLLFSCETDPEKLEIVKPYPKSDQYYADLRDYKNSDHEIVFGWFGN WSATGASASNYMKSLPDSIDMVGLWGGWQTITEAQRKDMEDSQHKRGLKVIATSLFDGFE MGVAPDGATAATEEEMLAYWGWEGPYRSYGNDNLTEGNIAAIQRFAHAFCQRIIDNGYDG LDIDNEPNIGTGKKPYGISGFPNRYKVFMDVVVTYFGRTSGTGRILAIDGELNANMPREF GKCFDYFISQAYGCSGDSNLDGRLQSCANYFSEVPFKDVARRFIVTENFENITYRTNGGA NYTTRDGESMKSVAGMAQWQPIIDGEKVMKGGCGTYHMEYEYRNPEPYKFLIEAIQIQNP ARK >gi|313158645|gb|AENZ01000035.1| GENE 14 20422 - 21588 1646 388 aa, chain + ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 377 6 371 402 140 28.0 1e-31 MKLKYSLFALIALAVSGCQKAPVYDDVVYIAGAEGEVPTLSLTLDDQFPKELSVKVSSSI VVGHDTKVRLVTDPSKIEEYNTQYGKKGVMLPETNFVLKNTELTIPAGSFSAAEPLVVSV TKDEGLKEGVSYVIPLSIESVDGDLKVKEGFRTVFVEIGRIIIGPAADCSGSTYFKVDFA EGEQDKYDIYNLGEVTYEARINLQRWGGWCNTVMGLEENFCLRFVQDDFKGQLQLSGWGG TLMVPTGGIAVSLNKWTHVAAVFDGPNKKVSIYIDGVLACSADTNKTSLDLTQVYQDDSF RIANSCNDGRQLKGYISEARVWAKALNGNDLKNNMCYVDPATDGLLAYWRFNEKADGKKV EDDLTGHGYKATYSSSAMKWVDNVRCPE >gi|313158645|gb|AENZ01000035.1| GENE 15 21665 - 22819 872 384 aa, chain - ## 
HITS:1 COG:no KEGG:Palpr_0285 NR:ns ## KEGG: Palpr_0285 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 48 318 21 293 360 146 35.0 1e-33 MIRTFLLIACLLTSVACRQRGGTPADANIVANAGNHSAETIRPDTVARPPLAFGHETWDF GTIPETGGPVVHVFRFTNSGTKPVVVERVGVTCGCMKPTFSQAPVLPGGSGEIGIAYDPA DRPGAFDKAVYVYTDGSRMMPLRIRGSVAARPETAADRFPVEFGGGIRLSVKEVNFRMVE QRHDRRLTLRCLNTSPYEVRLRAEFAEPLPFLRAEAPAVLAPGAEGEIAFICDLSAAEVW GTFVDSCYLTVSGRRLEGAVVVRGTGIGDLAELREMPRGKRPQAEIAESLLDFGTCAADG TAECRFTLENKGGSPLRIYAVKYPEGVTGQIAAGEEIAAGRAKTFVLRVDGRTAGSGNYF VHVELLVSDALSPLCDLRVRGRFE >gi|313158645|gb|AENZ01000035.1| GENE 16 22970 - 23554 898 194 aa, chain - ## HITS:1 COG:no KEGG:BF1053 NR:ns ## KEGG: BF1053 # Name: not_defined # Def: RNA polymerase ECF-type sigma factor # Organism: B.fragilis # Pathway: not_defined # 5 182 7 183 193 163 46.0 4e-39 MPERKKISDTCLLNRLVDGEMKAYEEIFVRYYPTLCTYARMYVKREETAENIVQDLMLWL WENRTVLCISTSLPKYLFGAVRNNCLTYLGHESIERRVLGNIRNKMCEQFEVPDFDVVEE LRENIRAAVADLPASYREAFEMSRFQRKTYEEIAAVLEVSPKTVDYRIQQCLKILRVKLK DYLPLLTMLLVVNE >gi|313158645|gb|AENZ01000035.1| GENE 17 23676 - 24686 1462 336 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 121 275 125 273 331 72 32.0 8e-13 MNTPNIEKLLVRYYNDELPFDQTARVAAWIAASEENRKIAEQVYYICFAAETQEAKAALD TAGALRKVKGRIRAEKWRRTLRQVERAAAVMLLPLVGLSAYLLMQVGYKYNSMVEIRSTT GMVSSVTLPDDTRVWLNSNSYLRYPAKFTGKERRVVLYGEGYFDVTKDPERKFVVEAQST EIEVYGTEFNVEAYDDEYIRATLVSGRIGMKYDDANHRSQLVQMMPDQQITYNARTGVIY LNTANIPCNTSWKEGKIVLSNTPLKDALRMIGNKYNVSFNIRNDELLGNTFTGTFSNQSL DLILKHFSISSKIKFRRIDQTAGDKASAGRAVFEVY >gi|313158645|gb|AENZ01000035.1| GENE 18 24868 - 28230 5475 1120 aa, chain + ## HITS:1 COG:no KEGG:BT_4707 NR:ns ## KEGG: BT_4707 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: 
not_defined # 29 1120 4 1099 1099 907 47.0 0 MKKIFSTQIMGGGNLHRTPAAGVSRSVRSYALALCMAFAIAAVPVQAQVKNITLKASNTP VSTVMQQIEEQSGYTFFYNTKDIPLDRNVTLSVANKDIRAALDELFRNTNVSYSIDNKSI ILSARGGAQPQQASQITGRIVDKNGVPVIGATIMISGTTQGTTSGVDGTFALQSNVSPAG MVLDVSFIGYKPQRISVNNRTSLSVTLEEDDQQIEAVVVTALGIKRQERAVSYNVQNISD EVFLTRDANMVNSLAGKIAGVTINASAAGVGGETKVVMRGSKSIAGSNNALYVLDGIPLP SLSLTNPGDEFTIMRENNLTGDGISNFNPDDIASMAALTGPSAAALYGSQAANGVLMLTT RSGEEGVSVNYSNNTTFMSPFLTPAFQNTYGAKDGYYASWGSKLVTKQSWTPTDFYQTGY NTSNSVGLSFGGKKSQTYVSAGVVTAEGIIPNNEYNRYNFTANHSSSFLDERMHLSVLGM YMNVNEQNMLSSGQYYNPMIPVYLMSPSDDIRKYAVYERYNASRNFPVQYWPWGSQSLQM QNPYWITNRNMFNTGKDRFLFGASLKYDVTDWLDITGRARIDYTHITAEQKNYASTLGLF AGDKGRYYNNTYTTSQRYADVLANVHKTFADGAFSLNATLGASIEDYKHRAILLGGDLTG VPNLFTLANMTTNKSQAKETINDQTQSLFATLQLGYKNMVFLDVTARNDWVTALANTSKT SMFYPSVGLSAVLTDIFKVDSRVLSFAKIRASYAEVGNAPMRWITIPTYPVSDGTPQTST YLTSDDFKPERTKSWEVGADVRLWGNKLILNATYYSSRTYNQVFNPDISSTSTYSSLYVN AGRVDNKGVEVSAELNQNLGPVKWSSNLVYSRNRNKVVDMLDSYKLSNGTVISQDSMVMG GTTGVKMVLREGGQIGDIYVNTLKTDEHGAIWVSPTGSNVAPAKDTWIYAGNSNPSYTLS WRNEFNWKGLSLGFMFNARVGGVGVSLTQAAMDYFGVSERTATDRLNGGALVNGQRIPAE NYYQTIGGNGADAIGACYVYSMTNVRLGELTLGYDIPVQKWCKWIKGLNVSFIGRNLWML YCKAPFDPETVAGTGTFSSGIDYFMQPSTRNLGFSVKVTF >gi|313158645|gb|AENZ01000035.1| GENE 19 28249 - 29865 2245 538 aa, chain + ## HITS:1 COG:no KEGG:BT_4708 NR:ns ## KEGG: BT_4708 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 29 537 33 531 531 385 42.0 1e-105 MKKANGIKLLTLAGLMATAACTGDFLNKNTNPDQATDEMLSWDNLSTGSAFAQMTKNVIP SFQLVGDKEYGSANYQVIEDLAGNIFAGYTGVINTSFTGNNLYNVTGKDWYNAMFNDAYT RVISAWNQLDPQRGEFPEVVALADIVKVAAMHRVTDTYGPIPYLNISSGDINKAYDPQQA VYKRFFEELDAAITTLTAFYEAQPGTKLLEAYDNVFSGNVSGWIKFANTLRLRLAMRVVY ADAALAQEQAAAAIGNPVGLMTGAADLAELHKPATGSWEYPLYMIQYNFDDSRIGATIEA YMNGYQDPRRGKYFVATAKGEYHGVRTGITPTSSYADSEDLSRVNCTNNDNIMWMTPAEA WFLRAEYELRWGSESDAATYYAEGIRTSFSTLGASGAEDYIADKELTPGSYTDMVTGGNS NQSPLASIPICWKAGSNFETHLEQIITQKYLAMFPEGQEAWSEFRRTGYPRILPVLVNNS GGKISTSKQIRRLNYPSTEYSTNAEAVATATHTLNSESSNPTGDNGGTSVWWDKNPRL 
>gi|313158645|gb|AENZ01000035.1| GENE 20 29885 - 30868 1488 327 aa, chain + ## HITS:1 COG:no KEGG:BT_4709 NR:ns ## KEGG: BT_4709 # Name: not_defined # Def: glycosyl hydrolase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 327 1 313 313 164 35.0 3e-39 MKKTITTLAVIFAVCAGFTACDTDAEPLNLQPAYKYSEEYYANLRAYKDTFKERSLCFVW FSDYDQSYSMAYRFAGLPDSVDICSLWGGYPDPVKNPLAYKEMWEMRNKKGTRLVMPTII RIMENDNFANFGLTYEDMVNQTTTIDPDGVYPDWCVLYGNHLLKDVWEYGIDGLDLDYEP ESSDERNYGIYGDRMTLFVQYLGQFIGPMSPNPEKLLIVDGHTPPAETEPYLDYYVKQNY GSSSVSFTSTFPYEKQVFTENIGAYWQTGGGMEAQAAAKAPEGHFKGGFGAFFCLRDYHT SDSGADKEIPYGHLRRAIQLQNPAVTK >gi|313158645|gb|AENZ01000035.1| GENE 21 30889 - 33558 3474 889 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158676|gb|EFR58065.1| ## NR: gi|313158676|gb|EFR58065.1| F5/8 type C domain protein [Alistipes sp. HGB5] # 1 889 1 889 889 1585 100.0 0 MKNITKYILAAAAAFSSLTACQEFEDFDKTIDGTPGLVYVQTGTENLYTIRVVHKPTGST GEFFTEFPVRCNTTRHAGVKATFVYDASLVESYNAEHKTSYAALPAEYLTLENTTLTVPE NATASADSVKVTLTGDLSLLTERNYLAPLRIKAEGIDASEVMGAVYVAVATEINLIRAIE STDDMVGFTATGRSAWTADCGNYANLFDGSTSTSVDFPEQYGNVLNIDMKEPQLVTGLCL GYGSVPSVSIEYSADGETFSQAGTPVSGEYVTSGSRMYAAFYGHIEARYLRLTIGFSSSW SKTLSEIDIYKIDSEDPTVYAVTGTENLITGKITHRQTGSTSDVNASFSAYATVASASGY TVSVAADNSLVAAYNTAHDTSYAALSAEYLQLDNSTLTIAAGAYKSEGEVRVSLKGDLTR LSDLNGYLAPLKLSSSGAGTSAGRGVVYLAVKVERNKIRPITSADDMVGFPAAGRTAWTA DCPDYANLFDGSASTRTNFTDQNDNVVTIDMKSTHMVTGIDLNSAGISSVSFEYSTDGQT FLTAGTPASNEYATSGSDRYIAFYDYLEARYLRLTISFSSSWSKYISEFNIYEIESDEPT VYAMCGSDNVLTGAIAHTPGGSFNGLNAAFNVYCTVSSASGYSVSATADNSLIAAYNSAN GTSYAALPDGHLLLENNPCAIGPNNNKSDGQIKASLTGDLTGLTNAKGYLVPLKLSAQDA VTSSSRGVVYLVITPAEELFRKNFTVADITGALVADRSGWSITGGDYHSGSWPEVIDDST DTFMRPWGSPIMFTITFDKEYEMTGLRITARTDNASYQNYQPNAITIEYSLDGEVYTELG TATSADGSLVKNVPSSYVALYGSQKMKYIRITASYGSNMGVGDFNIYAK >gi|313158645|gb|AENZ01000035.1| GENE 22 33573 - 34733 1469 386 aa, chain + ## HITS:1 COG:no KEGG:BT_4710 NR:ns ## KEGG: BT_4710 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 366 6 363 402 179 
32.0 2e-43 MKFNHILSGLALTAAVLAAGCQKSLEYSDVVYFTGTENSNITNMYVDGPSSMGVTVTSSC KMAADVQVALAVDAAAVDAYNALHGTDYRMLPAGSYRLSDDAVTIAEGTNVSTPSSFEIV SMDDFDEGAHYCVPLKITGTSNGMGVLEASRTQYIFISQITTTRGINLKNSWYITMDGMV DDPALKDLPACTMEIRMYANGWRTSGHMISSLIGVEENFLLRVGDESIKNPAQLQLAGRG TSITAPNALSLGRWYHIAVVDNGSEMTIYIDGNAEISVDSSSSKAINLGFYYNSPFAIGM SAADVRYFNGYVSEARVWKRALTPTELKNNQCYVDPATAEGLIGYWRLDQVEDDGRTFTD LSGNGYHGKASSNPIWTGEIKCPVVD >gi|313158645|gb|AENZ01000035.1| GENE 23 34813 - 35547 977 244 aa, chain + ## HITS:1 COG:no KEGG:Palpr_0285 NR:ns ## KEGG: Palpr_0285 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 27 214 24 208 360 86 30.0 9e-16 MRILKAMALIATILAPQVAPAANPRLLRFEEPVKNLGKVAETDGTVQLRFEYTNIADKEV TLLDVHTQCGCAQPAFSRKPVKPGDKGIVEVTFDPKDRYGDFSIGLTVIAGNGNYRKFNT LVVKGYVISRIPEAEIRYPYALSAALRADNKAIGMRQLSRSDPKRTRQLRLLNTSAKTLR LAYRTDSGHLALSGPGEIAPGKEAVLEFTLNPRNMPDGQFVLKCTVSTQDEEIPVEVKGQ IVGR >gi|313158645|gb|AENZ01000035.1| GENE 24 35670 - 37895 2580 741 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 25 732 45 765 790 572 44.0 1e-162 MKRIIFLPAIAALLCGACSSAVPEDYTQYVDPYIGTGDHGHVFMGANVPFGFVQLGPTSI PQTWDWCSGYNYADTTVIGFGHTHLSGTGIGDLNDISLMPVVGKVTPGRGTAGDPSSGMW SRFSRADERCAPGYYATRLLRYGIGVELTASPRVGMSRYAFPASDDAGIVLDLENGQGWD RSTDCGITACSDRTVEGWRYSTGWADDQRVYFHAEFSKPFERIVFTGADTLSADGRSARG RKLYGRFGFRTEENERITVKVGLSAVSAENARRNMQAECPGWDFEAVREATRRAWNGELS KIRIETADPAVRRIFYTALYHTMIAPSLFCDVNGDYRGADGAVRRDTTFTNYTTFSLWDT YRAAHPLLTLIHPEKVGDLINTMLRIHEQQGKLPVWHLMGCETDCMVGNPAIPVVADALL KGFGGFDRAKAYEAMKSSAMRDDRRLDLYKRYGYIPYEFNESVGYCLEYAIADWALAQAA QREGKREDYDYFLARSKAYRHYFDPSTGFIRGRSASGAWRTPFDPFHSRHMEQDYTEGNA WQYTWLVPHDIEGLMECFGSRERFVGKLDSLFLAEGDMGAHASPDISGLIGQYAHGNEPS HHITYIYTMIGQPWKCAEKVRRILGELYHDRPEGLCGNEDVGQMSAWYVLSALGMYQAEP AGGRYFFGSPAVDGATLRVRGGEFRITAADNSPENIYIQRITLNGEAYRKYYIDYAEIAA 
GGELRFEMGPEPADWTKQQPL >gi|313158645|gb|AENZ01000035.1| GENE 25 37892 - 40207 2960 771 aa, chain + ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 20 770 4 758 778 557 40.0 1e-158 MKRLLLLMACAAAAACSADWRDTSLPPQKRAELLTAEMTLDEKIGQLTSPYGWEMYERHG DSVRLTDAFREAVQNGHIGMLWGTFRADPWTQKDLRTGLTPQLAARLANRMQRYAVQHSR LGIPLFLAEEAPHGHMAIGATTFPTAPGQASTWNPELIERMGKVIAAEIRLQGGHICYGP VLDIVRDPRWSRTEESYGEDCYLTARIGEAYVRGTGSGDLSQSRHALSTLKHFIAYGASE GGQNGGSNLLGERELRETYLPPFEAAVKAGARSVMTAYNSVDGIPCTANRRMLTDILRGE WGFDGFVVSDLLSIEGLHETHGVAGSVREAAVQALRAGVDADLKGGAFASLREAAEAGDV AEAEIDRAVERVLALKFEMGLFENPYIDEAAAAEVGCAAHSELALEAARQSVTLLENRSG TLPLDPRRLRRVAVIGPNADNIYNQLGDYTAQQTAANTVRDGLEKLLGRDRVVYSRGCTV RGGDRSEIAAAVSAARGTDAAVVVIGGSSARDFDTEFLQTGAAKAAHDEVRDMECGEGFD RATLALLGEQEELLRRIKATGTPLIVVCIAGRPLDLRRASEQADALLMAWYPGARGGDAV AETILGRNNPAGRLPITIPRAEGQIPVYYNKKRPANHDYTDLTAAPLYPFGYGLSYSTFE YGSLEARQSGDNVLEVSCRIRNTSDREGDEVVQLYISDMVASTVRPPRQLGGFRRIRLAP GEQRQVSFTLGDEALALIDPQGRRVVEKGDFVIAVGSSSQDIRLQTTVTLE >gi|313158645|gb|AENZ01000035.1| GENE 26 40212 - 42458 2505 748 aa, chain + ## HITS:1 COG:XF0847 KEGG:ns NR:ns ## COG: XF0847 COG3525 # Protein_GI_number: 15837449 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Xylella fastidiosa 9a5c # 23 587 85 662 841 330 35.0 9e-90 MKHFAYLLPVCCLLFAACRKDSPATEIIPTPRSVKAGQGTFDLGGGIRIAPADPLLRPAA DYLAQLLREEDVAAAQDAGNANLSLELDPRLPQQGYTLKITPARIELRGGSCEGVVSAAA SLRQLLWSGKGSLPALEIDDAPRFAWRGFMLDVARHFFTKEEVMSLTDRLACYKFNRLHL HLTDDQGWRIEIKRYPLLTRRGAWRTPNKHDSVCLRRAADERDPKFLLPEKNIRREQGKI RYGGYYTQDDIREIVAYAAQRGIEVIPEIDLPGHSLAAIGCYPQLACDGRGGWGKHFSTP LCIGRDSTIAFCKNVLTELFDLFPSQYVHIGGDEVERTPWETCPDCQRRIRAEKLEGAGE LQAWFTRKTEQFLAAHGKTLLGWDEITEDGLTPQSAVMWWRSWMPSTLTAALQNGHRVIE SPSEFLYLNGELDRNTLSKVYGWEPLPESLRAWQEGLLGIQANMWTEDVPTADAAGERLF 
PRLLAVAETAWSAPEKRDFADFRRRLPLHLRQLERAGWNYRLDDVEGVCDDNVFIGAATV RLLPPESAELYYTLDGTVPDTASQRYTAPFSITDYCTLTLRCYNRRGVAAEIRRASFRPT RYAEPSADAGNLQNGLLVRWYDYDGDNCADIDEAPLQANFITDSVVIPDGVTGNIGLICD GYIDIPADGIYSFYTYSDDGSTLAIDGRTVVDNDGLHSRTERSGQAALRRGVHSFSLRYF DTNGGILEAGIIDSEGRRIPFSSRMLKH >gi|313158645|gb|AENZ01000035.1| GENE 27 42665 - 43129 635 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158667|gb|EFR58056.1| ## NR: gi|313158667|gb|EFR58056.1| hypothetical protein HMPREF9720_0441 [Alistipes sp. HGB5] # 1 154 13 166 166 296 100.0 4e-79 MQVLLPVVPGLWTVLFRPETAALSPRMLVTWMLCLNLLFILIYWRRLGWSRSERLDTALL LVLFCFFPLPGHWLLDGMLDGSLWLQIAFYCMMQAGFYVAIVLAPDWVGRFQQRRYEERV LSGREKFESRKEVMKAREERGKSANGSKSGEKDD >gi|313158645|gb|AENZ01000035.1| GENE 28 43680 - 45176 2249 498 aa, chain - ## HITS:1 COG:VC2703 KEGG:ns NR:ns ## COG: VC2703 COG3263 # Protein_GI_number: 15642697 # Func_class: P Inorganic ion transport and metabolism # Function: NhaP-type Na+/H+ and K+/H+ antiporters with a unique C-terminal domain # Organism: Vibrio cholerae # 11 480 11 476 581 262 36.0 1e-69 MIDISGENVVLVGALLLFVAVIAGKVAYRFGAPALLLFLGVGMLFGLNFISFRSVEMTQF VGMIALCIILFTGGMDTKFSEIKPIIGPGVVLATAGVVMTAFILASFVWLVAPWLGVEMP FVLALLLASTMSSTDSASVFSILRSKKQGLKQNLRPLLELESGSNDPMAYMMTILLISVV SHSSGSVGMGMSVVFFVVQMIVGALSGYLIGRLAVWTINRINLANHSLYSVLLLAFIFFS FAFTDLIKGNGYLAVYLSGLVVGNYKLQQKRPLTVFFDGFTWLMQIVMFLTLGLFVNSDE LLEPRVLILGGLVGAFMILVARPLTVFICMAPFRKFTTKARLYVSWVGLRGAVPILFAIY PLMAHVENAGLLFNVVFLGTIISLLVQGTTVSGMANLLGLAYEERESAFSVDMHQDMKSA LTEVEVNETMLESGHMLKDITLPENTLVMMVCRNGEYFVPQGKTELLLGDKLLVISDRSE ELATTYKDMGIDDVMKLG >gi|313158645|gb|AENZ01000035.1| GENE 29 45192 - 45725 705 177 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158685|gb|EFR58074.1| ## NR: gi|313158685|gb|EFR58074.1| hypothetical protein HMPREF9720_0443 [Alistipes sp. 
HGB5] # 1 177 1 177 177 338 100.0 1e-91 MDKNKICEALYALSGNVVVKRRKFQLLPAVLFVAGAALIVVNNMYGADLTNNLRSSIVFI GGIMVVAGMIMAAAQMFGSGGAPFHREERCFLLYEELYFDRGIRSDVMQAVSDGDAERLL GMKRAQVPALTVALYRTPGNRFAAMQAFEYADLEYKPLTELKIADKTELKVADKVQA >gi|313158645|gb|AENZ01000035.1| GENE 30 45784 - 46149 218 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291513594|emb|CBK62804.1| ## NR: gi|291513594|emb|CBK62804.1| hypothetical protein AL1_00780 [Alistipes shahii WAL 8301] # 1 121 9 131 131 77 43.0 5e-13 MGLFGCLLFCGGGGRTLAHDAGHVSVTLSVDRASDAVWLCEREYNADLNLPRVLGEVSPA PAAPVFQRGGHAESGPAADRYAADRHARGSTKVTSEFKSSDLSVGLHAVDYYVYRLCRLI I >gi|313158645|gb|AENZ01000035.1| GENE 31 46330 - 47133 1201 267 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3205 NR:ns ## KEGG: Odosp_3205 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 32 267 12 251 251 85 30.0 2e-15 MAENIPDNKPKTVRRPRLKLPFDNRKEDAGVWAYDHRIGLCVTIIAYLVLMIAFVSSKIV VGRRPHTQGMFIDLQTLAELEKERDRLEQQVRERQQQDPIDWKSVQNQVSNENALNEKLR DDRGTNAAALNDAAAEAERRMQANREAYEQGLAEERAIRERNGAGKDEERQDRKVKGRVT VSFSLTDPVRTSRSLNIPAYRCEGGGDVVVEITVNRAGEVVNARVQSGGDECMRESALRA ARVSRFNIDQSAPARQQGTITYIFIPQ >gi|313158645|gb|AENZ01000035.1| GENE 32 47220 - 48611 2021 463 aa, chain + ## HITS:1 COG:PA4406 KEGG:ns NR:ns ## COG: PA4406 COG0774 # Protein_GI_number: 15599602 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-acyl-N-acetylglucosamine deacetylase # Organism: Pseudomonas aeruginosa # 4 297 3 274 303 179 36.0 9e-45 MSTKQQTLKAPISFSGKGLHTGVKVTMTVNPADADTGIVFRRTDIEGQPIIPALCDNVVD TSRGTTIESGGHRVSTIEHIMSALWTLGVDNAVIDIDAGETPIMDGSACAYAKAITEVGI TDQPAERKFYHVTEKMVYTIPEKGVAIILYPDDEFSVSLHVDYNSKVIGNQYATFNPGDN FADKIAPCRTFVFLHELEPLMKMNLIKGGDLDNAIVVVENPVPDEQLEHLKKIFNKPDIE INAGYLNNLELRCNNELARHKLLDLLGDFALLGVRIKGRVWATRPGHYANTEFMKQLKHT IRKAGEKPRYQYDCRKPPLYDINDIRHLLPHRPPFLLVDRIFHRDATDVAGIKNVTMNEP FFVGHFPDEPVMPGVLIVEAMAQCSGILVLGDVPDPENYSTYFMKIDGVKFKRKVVPGDT LQFEIHLLEPIRRGVAVVEAKAFVGETLACEAVLMAQVVKNKK >gi|313158645|gb|AENZ01000035.1| GENE 33 
48617 - 49411 1260 264 aa, chain + ## HITS:1 COG:FN0595 KEGG:ns NR:ns ## COG: FN0595 COG1043 # Protein_GI_number: 19703930 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Acyl-[acyl carrier protein]--UDP-N-acetylglucosamine O-acyltransferase # Organism: Fusobacterium nucleatum # 2 255 4 257 257 193 40.0 3e-49 MISKLAYVHPDAKIGNNVTVEPFACIAGDVVIGDDCWVGPGAVIHDGARIGKGCKIHTAA SVSCLPQDLKFAGEVTTAEIGDYNDIREYVTISRGTASTGTTRIGNRNLLMAYVHVGHDC VVGDNCVIANRVSLAGEVHVGNWVVIGGHAAVHQWTHIGDHVMIQGGALLGQDVPPFIIV RNDTMRFAGINKIGLSRRGFTPERIAEIHDACRILFQSGLNYMSGCEEVEKQIPQSAERD ELVKFIRESKRGIIKPYESKAKEE >gi|313158645|gb|AENZ01000035.1| GENE 34 49539 - 50006 717 155 aa, chain + ## HITS:1 COG:no KEGG:BDI_3532 NR:ns ## KEGG: BDI_3532 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 155 1 155 155 110 46.0 1e-23 MIDTKKIIEAAERKLQGTDMFVVGCTCTPGNEIELLIDSDTSVAIEACAELSRAVEAEFD RDEEDFSLTVASAGIGSELKSLRQYRKLIGSTVEVLLNSGIKMLAKLDAADDEGITISYE EKQAVEGKKRKQLVTVTRRYAFDEIKSTKEWLDFK >gi|313158645|gb|AENZ01000035.1| GENE 35 50240 - 51472 2150 410 aa, chain + ## HITS:1 COG:mlr5551 KEGG:ns NR:ns ## COG: mlr5551 COG0195 # Protein_GI_number: 13474627 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Mesorhizobium loti # 17 400 21 410 531 240 37.0 5e-63 MDNLNLISNFAEFKELKNIDKSTMIGVLEDVFRHALQKQYETDENFDVIINPEKGDLEIW RNRTVVEDGAVEDPNAQIAVSEVKAIDPTYEIGDEYADEIKLSSFGRRAVLSLRQNLASR ILDLEKASLYEKYSEKVGEIVTGEVYQVWKKEVLILDDEENELILPKAEQIPNDFYRKGD TIKAIVKSVEMNNNQPRIILSRTANQFLERLFEQEVPEIFDGLITIKKIVRIPGERAKVA VESYDERIDPVGACVGMKGSRIYSIVKELRNENIDVVNYTANPSLMIQRALNPAKISSIT IDEEKMTASVYLKPDQVSLAIGKGGLNIRLSKMLTGYDIDVYREVEEEDVALTEFADEID GWIIDALKSVGCDSAKSVLELPVEEIAARADLELEQAQKVVAILKAEFEE >gi|313158645|gb|AENZ01000035.1| GENE 36 51503 - 54418 4134 971 aa, chain + ## HITS:1 COG:BH2413 KEGG:ns NR:ns ## COG: BH2413 COG0532 # Protein_GI_number: 15614976 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation 
initiation factor 2 (IF-2; GTPase) # Organism: Bacillus halodurans # 298 970 54 729 730 574 48.0 1e-163 MGNERKLRLIQVAKEFKVGLNTITDFLQKKGIKSDGSPNTLVDSETYAVLEKEFGANRAA GNARDSIRERISLKQTTITLEEAKKQEREEEKEVVIKSNVISVKDEIQQPKILGKIDLSP KPKAAPTPAPAPKAEPVKPAEQPRAAQPAPAGAPVEAPKPAAKPEPAAEKPAAPEVKTAP AAPAAPAAAAATAAQTETGKPAPAAAPAPAATPATTLAAAPAAAAAPAPKPEPAAPKDNI FRPETVTLTGPQVLGTMDVSGFVAGGKHKRKRLQKEKVDVSKAPKGNTPGGNRQGGNQGG QGSQNRPGGPGQNQPRPGEGRRNKNKGKAAPKPIVRPEVSDEEVSKQVKDTLARLTAKGA KSKSAKYRKDKRDAVAERMNEEFEREEMERSTLKVTEFVTVSELATMMNVTPTQVITACM NLGLMVSINQRLDAEALVVVAEEFGYKVEFVSVEIQEAINDDNEDKEEDLVPRPPIVTVM GHVDHGKTSLLDNIRKTNVIEGEAGGITQHIGAYSVELNGQKITFLDTPGHEAFTAMRAR GAAVTDVAIIIVAADDSVMPQTVEAINHAQAAGVPMVFAINKIDKPNANPDHIKEQLSQM NYLVESWGGKYQDQEVSAKKGMNLDKLLEKVLLEAEMLDLKANPNKKAQGTVIESTLDKG RGYVSTILVQSGTLHVGDVILSGTYTGRVKAMFNENGKKVESAGPSTPVQVLGLNGAPQA GDTFNVMEDDRSAREIANKREQLQRMQGIMTQKHVTLDEIGRRIAIGSFKELNIIVKGDV DGSIEAMSGSLIKLSKETVQVNVIHAAVGQISESDVLLAAASNAIIVGFQVRPSASARKL AEKEEIEIRLYSIIYDAINDIKDAIEGMLEPVMKEEIVASVEVLEIFKISKVGTVAGCVV REGRLQRNTPIRVIRDGIVIYTGKLGSLKRFKDDVKEVTNGQDCGLNIESFNDIRVGDIV EGYEQVEVKRK >gi|313158645|gb|AENZ01000035.1| GENE 37 54426 - 55061 994 211 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158692|gb|EFR58081.1| ## NR: gi|313158692|gb|EFR58081.1| hypothetical protein HMPREF9720_0450 [Alistipes sp. 
HGB5] # 1 211 1 211 211 414 100.0 1e-114 MLPKRLIVLLMSLCTLCGAASAVQPRTDLTQKEQKALQKAQKKKIRQEARRFESLRNDAV VQFAREHQPSDISTPSFVNDFRISNVICAGDNISGTVMLHFDITPLYGGFRLYMGGERNG TYAFAKGLAYPSSDHYGRIYTMRSGQPEHIVVTFYNIAPGVERLDRVDVSMGLALNALNI ISMRNVPIFWTYSDKAIKNYVQENANKVPAN >gi|313158645|gb|AENZ01000035.1| GENE 38 55086 - 56405 1956 439 aa, chain + ## HITS:1 COG:slr1888_1 KEGG:ns NR:ns ## COG: slr1888_1 COG0427 # Protein_GI_number: 16330298 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Synechocystis # 6 412 20 429 478 365 44.0 1e-100 MTTQIKFTTAEEAVKVIKSGDHVHLSSVASAPQCLINAMCARGEAGELKDVHIHHLHTEG PAPYADEKFEGVFQLDSFFVGGNVRKVTQSGYADYIPIFLSETQRLYRCGAVPCNVAMIQ VSTPDKHGFVSLGTSVDATLAAVETADYVLAVVNKHVPRAFGQAMIHSSKIDYFVQDDTP LMEAHFSEPNETETAIGKHCAALIEDGATLQMGIGAIPNAVLAQLGGHKNLGIHTEMFAD GVLPLVESGVINGEAKNIDKGKMVSTFLMGSQRVYDFIDDNPAVLMMDVGYTNDPYIISK NDRVTAINSALQVDITGQVCADSLGTKFYSGVGGQIDFVYGASLSKGGKAIIAMPSVTNK GISKIAPVLTPGAGVVTTRNHIHWFVTEHGAVDLYGKTLQERARLIISVADPSAQEELDR AAFERFGQHHHYVKNYMSK >gi|313158645|gb|AENZ01000035.1| GENE 39 56487 - 57716 1657 409 aa, chain + ## HITS:1 COG:CC1277 KEGG:ns NR:ns ## COG: CC1277 COG1524 # Protein_GI_number: 16125526 # Func_class: R General function prediction only # Function: Uncharacterized proteins of the AP superfamily # Organism: Caulobacter vibrioides # 22 403 60 447 451 197 31.0 3e-50 MKRYACILLALLSAGCATSRRSAVATAPQEKQYAVILSMDAFRWDLAGRSHTPTLDSLAR VGTYAEIYPVYPSNTFPSHYAMATGLHPDHHGVVNNGFYDRTQGRRLSVFDSLDVRTPGF WGGEPIWNTAERQGLTANIFMWPGSEVPIGGRQATVWTRYSSKPDYYQRADWVIDAMTRP EAEIPQLVMWYFEEPDAAMHTYGPESPEAVGRAEHIDSVLRYFFREIRRSPVFDRINFIV TADHGMAELSPDRYINLLPLLDTAQVVRVVPGTPFGLEVKEEYADQAVRTLRRTGHMKAW RRERMPRRFHYGTHPTRLTNVIVIPETGWTLDYAPQERPVRKRGTHGFNNRDRDMHMVFY GSGPAFRKGYRQRSFQNQNMYLILCRLLGIEPAPNDGEWRDIKRMFNEN >gi|313158645|gb|AENZ01000035.1| GENE 40 57736 - 58542 1026 268 aa, chain + ## HITS:1 COG:no KEGG:BVU_3979 NR:ns ## KEGG: BVU_3979 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined 
# 10 263 15 270 274 311 56.0 2e-83 MNYEDQSCYMRISTPKSRQDLHIKKRYRVLEIGPGNNPSFRADVLADKFLGDDSHRSGGL RIYPHQKLVNAAAEALPFADKEFDYVICNQVLEHSDDPAQFLREVMRVGKAGYIETPSLL GEWLFPKQSHRYVVLCIGDKLVLYDKQRVPGNYANDYGELFLNYLPYQSLPYKLLPFSEG ELMHVRYEWKDDIDFLVNPTDEYYSKFFLKKWDRQMVCTLFPPRGFVTELGRTLRAAAHV IGDKLRRSQGRRPITLEEYRKLHPGELR >gi|313158645|gb|AENZ01000035.1| GENE 41 59491 - 59961 -3 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158659|gb|EFR58048.1| ## NR: gi|313158659|gb|EFR58048.1| conserved domain protein [Alistipes sp. HGB5] # 1 156 169 324 324 313 100.0 2e-84 MKWRYTKKGVDIETSIKEPMNYFMGDKNIDWLANEDPSLWNNIKTDYDPCPAGYRVPSAT EIESIKQIEAELDENDYAKEFWYTWNNETTYYPSYGCRDEKGELVMEGGVFMWTSSQHLN QSSYSTRVVCWGFFTDINGIGGRGTGNNIRCIVDCK >gi|313158645|gb|AENZ01000035.1| GENE 42 60205 - 60375 107 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKFDAHVSNVGFILYLVAQVIPEPQKADQSFVPRFLYPYPKLPLTYAGGQAFRYLS >gi|313158645|gb|AENZ01000035.1| GENE 43 60372 - 60743 346 123 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158689|gb|EFR58078.1| ## NR: gi|313158689|gb|EFR58078.1| hypothetical protein HMPREF9720_0455 [Alistipes sp. 
HGB5] # 1 123 1 123 123 207 100.0 2e-52 MNNYLLLIGISCVPAVTGMCATLPLWYQRHLRNQTIHTLGLTLLFLTLVICLLSVITNES SGIGFAAIALSSINFFLFRHYERLLYTLRCPGCRRVALRIRAIHHKTYKLHCKHCGLYTD WHA >gi|313158645|gb|AENZ01000035.1| GENE 44 61506 - 62351 478 281 aa, chain + ## HITS:1 COG:no KEGG:Lbys_1555 NR:ns ## KEGG: Lbys_1555 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 20 107 19 111 303 69 39.0 2e-10 MKKIYFCLPAILTGVLQLSAQDIITKCNGEEIAAKVAEIGDTQIKYRKYENAQGPLYTLP ASEVFLIRFEDGTREVVTPLDAPAVSPATAAETTGTESSTTAPAAANENYAQINRSVKRT KEPLRDKHFYIGPRAEFGYAAITSKSGNISGPVCSAGVIAEYMFDKMSPHGAGAGISYDM YMFEEDFDLNCMNIDLYYVYRPRGRWNCFAGFKFGIPVKSEASGADLKTVTNTAVGIEMG AGWATRHFDLGAAIFMNFNSTLDISESNRMAGVKLRIAYRF >gi|313158645|gb|AENZ01000035.1| GENE 45 62355 - 63209 643 284 aa, chain + ## HITS:1 COG:no KEGG:Lbys_1555 NR:ns ## KEGG: Lbys_1555 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 19 94 19 92 303 63 43.0 7e-09 MKQLLSFIVCCLIGASSAAQDIILKRNAEQIEAVVKEITDTDVKYRKFTHTEGVTYSIRR SEVFSITYENGTKDLFADEKPQAVAASYPYPPVSRSYSLGELFDEGGIRGIVIRVSEDGR HGLLLSLVEGKAAWGFINKNPLKEKGFASGCTDTEDGWKNMLAIRELIANTDLLWSNLPA FSWCKDLGPGWYLPAQKELESIWNFGRSNPAYSYKEHKEAIENLNLRLAEYGQPELGRMR DYWNSTEADAKRAHILVFTNAPFKSYTKGTEKFAERFVRAVHKF >gi|313158645|gb|AENZ01000035.1| GENE 46 63618 - 64782 1203 388 aa, chain + ## HITS:1 COG:SSO0375 KEGG:ns NR:ns ## COG: SSO0375 COG0582 # Protein_GI_number: 15897309 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Sulfolobus solfataricus # 255 388 147 272 291 64 33.0 4e-10 MQRSTFKVLFYVKRQSEKHGQVPVMGRITINGTMSQFSSKLSVRSSLWDAKANKASGRSL EAQRLNEKLENIKTNIGKQYQRLCDRDSYVTAEKVRNAFLGMGDDCRLLLQTFDEYLADF RKRVGKDRAYSSYDNYRKRRNRLASFLEYEYHVKDIAFKELKREFVEKFVVYLSTVKGLA SGTIHSTLKKLKLMTYTAYKNGWIAADPFAGFYVRPEYSERRYLSASELQAVIDVRLPNY RTGINRDAFVFCAFTGLSHADVVKLTHADIHTDDNGERWIIDRRQKTGTQFRVKLLPVAE MLYERYKDMHLSGDRVFPLKGTYNTLNMSLRHVARHAGLSFNPTIHLARHTFATTVTLTQ GVPLETVSKMLGHKQITTTQIYAKITND Prediction of potential 
genes in microbial genomes Time: Wed Jun 22 12:13:03 2011 Seq name: gi|313158644|gb|AENZ01000036.1| Alistipes sp. HGB5 contig00090, whole genome shotgun sequence Length of sequence - 699 bp Number of predicted genes - 2, with homology - 0 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 1 - 169 195 ## - Prom 373 - 432 5.4 - Term 401 - 447 9.3 2 2 Tu 1 . - CDS 450 - 659 285 ## Predicted protein(s) >gi|313158644|gb|AENZ01000036.1| GENE 1 1 - 169 195 56 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKNQKFEIAVAGYDYRFKTYARDGVEASVKVKCFLGRPDAECTILIPTLNDDLRET >gi|313158644|gb|AENZ01000036.1| GENE 2 450 - 659 285 69 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLSARKVCEVLGLDRHQLEQCRKKRMIRARTVNGQMMYDTYELLALTELFYRRKLRKTL SRIPQFTVR Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:13:19 2011 Seq name: gi|313158628|gb|AENZ01000037.1| Alistipes sp. HGB5 contig00055, whole genome shotgun sequence Length of sequence - 22916 bp Number of predicted genes - 15, with homology - 15 Number of transcription units - 6, operones - 2 average op.length - 5.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 43 - 651 344 ## BF3288 hypothetical protein - Prom 697 - 756 4.1 2 2 Tu 1 . - CDS 759 - 953 87 ## gi|313158639|gb|EFR58030.1| hypothetical protein HMPREF9720_2289 - Term 1826 - 1865 8.2 3 3 Op 1 . - CDS 1933 - 2934 1203 ## BT_1285 endo-beta-N-acetylglucosaminidase 4 3 Op 2 . - CDS 3027 - 4628 1655 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 5 3 Op 3 . - CDS 4631 - 5986 1497 ## BT_1283 hypothetical protein 6 3 Op 4 . - CDS 5995 - 6903 1025 ## BT_1282 hypothetical protein 7 3 Op 5 . - CDS 6909 - 8483 2307 ## BT_1281 hypothetical protein 8 3 Op 6 . - CDS 8496 - 11825 3381 ## BT_1280 hypothetical protein - Prom 11872 - 11931 1.9 9 4 Tu 1 . 
- CDS 12008 - 12991 231 ## Bache_2469 anti-FecI sigma factor, FecR - Prom 13057 - 13116 5.5 + Prom 13037 - 13096 13.0 10 5 Tu 1 . + CDS 13289 - 13726 222 ## BVU_0502 putative RNA polymerase ECF-type sigma factor + Prom 13825 - 13884 6.2 11 6 Op 1 . + CDS 13948 - 15189 1408 ## COG0477 Permeases of the major facilitator superfamily 12 6 Op 2 . + CDS 15192 - 16199 1459 ## COG2152 Predicted glycosylase 13 6 Op 3 . + CDS 16196 - 18433 2338 ## COG3537 Putative alpha-1,2-mannosidase 14 6 Op 4 . + CDS 18434 - 22012 4009 ## COG3537 Putative alpha-1,2-mannosidase 15 6 Op 5 . + CDS 22033 - 22915 1033 ## COG3537 Putative alpha-1,2-mannosidase Predicted protein(s) >gi|313158628|gb|AENZ01000037.1| GENE 1 43 - 651 344 202 aa, chain - ## HITS:1 COG:no KEGG:BF3288 NR:ns ## KEGG: BF3288 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 74 194 6 131 139 65 34.0 1e-09 MENLDKSQFIVLLETSLNRRLLSPEQKTLDAAYEDYIDLLHGLTVQSCKSSLDLLHELAR LQSCLMRLQERIVEKESPQVLLLKSALLLTNFEIRLVFTRVRYPSIAAPVSVEVPKSPLF LSEQFTPTDIMELATALQLSGAIRRIDGTRIDLATLVDVLSGTFNVRINNPEQCRHTLIN RKLRLTHFLDILRNNLIAYSQR >gi|313158628|gb|AENZ01000037.1| GENE 2 759 - 953 87 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158639|gb|EFR58030.1| ## NR: gi|313158639|gb|EFR58030.1| hypothetical protein HMPREF9720_2289 [Alistipes sp. 
HGB5] # 1 64 138 201 201 125 100.0 7e-28 MTPPPFVLADGSTAPFNLVVRVFENTLHVKLGDPTDVKRRVLERKKDRTKFTDALLYSLN KEDE >gi|313158628|gb|AENZ01000037.1| GENE 3 1933 - 2934 1203 333 aa, chain - ## HITS:1 COG:no KEGG:BT_1285 NR:ns ## KEGG: BT_1285 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 316 4 306 306 141 34.0 5e-32 MKQNFKLFATLAASAAMMISCQQAEEPTMDKAQEPVVKTRAYGDKTPKVTIYVETNDVNP LNAGDYLLSDGSAYADIVEFFAANLNKETVNGVVRPTLYLNDKMTNLLENGGAATYVAGL QAKGIKVVLTILPNWQNIDFTSLNDTQADQLATILAYAVNKYGLDGIGFDNEYGGTVTSV SGSYGNLITKLRAKLPAGKLITLFQYNIGYNQINATAGAKLDHVYSNFGYNTYISTSGVT KERYAPLSINLGSIGSNVSYYGDRAYDLTEAGYGGIMHFNLRTRTDSDPLSLFNSIADGA WEETVTCENGNRPQDWTFVSSGYTITYDEATAE >gi|313158628|gb|AENZ01000037.1| GENE 4 3027 - 4628 1655 533 aa, chain - ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 14 500 8 486 508 136 25.0 2e-30 MKTIFDLQTGVVTLLCTAALAGLTACEADPVMQESGKLPDKGSIEEVHVMLCSSNSVENR VDVLLTEGGVMTKNFYLRQTQPAAAGYSLDAWCDASLLNDYDAGDEIERTLLPEANYEFP DGKTLDLSAATQRSELKRIKFSASGLAAGEYVLPLTVAAQDAPDADKTLYYNVSVRQPYT DEYTLHDGHDLFFVFYVNTNDFQPLLAQDYIMRKKVARGTTVAWYGTVGNIVNLRTVQVG YDAAAGRALLDLGTDMTYVLSQSTKYIRPLQEHGRKVCISIEGGGKGLGFCNLTDAQIED FAAQVKTVIEQYELDGVNLWDRNSGYGKEGMPAVNTTSYPKLIKALREALGTEKLLTVTV YEEPTATFWDTEATGGIAVGDYIDYAWSGYNSNSEAPQLLDPWHPELEYVSTYTQKPIAN LPKDRYGCINFPIYPAAQTEEEAMMREPRFLLDWTPNYKPNNIIVFDDLHSNLQDSYEAF WDNTFAACCTFMDPENSYLLGSKKGYTYLLDMNRLWTLSEGIQNYGKWLKDWN >gi|313158628|gb|AENZ01000037.1| GENE 5 4631 - 5986 1497 451 aa, chain - ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 450 1 440 440 289 39.0 2e-76 MNMKKILMAVLVSLVAFTACENYGDDNHKFDNVVYLDVASTSDAQLTTFSNTRATYDCAL QAVLTYPAGQDVAVTLTVDPSLVGTYNARYGTEWTMLDSKYYSLLSETVTIPAGKTTSDA VTLRLRELTGEGDEQTGALPIDETYLVPVRIGSASIDVLHGSDVAYYVVKRSSAITVAAQ 
LTDNWIEFPTLDKYSESSKDWNGLRAVTYEALIYIDEFVKTNTEGNPVNISSVMGVEQYL LLRIGDTNFEREQLQFDGSGSGSGFGKIPGRDAMKHLETGRWYHVACTYDQNTRTARIYV DGQIQSEVTGVGITTQNDKNCINLAMRALYDLWNTAPEGDKAQYEELGYDGLSEAYQFFI GRSYDDYRPLNGKIAEARVWSVARTLEQIWENMYEIENPENDSTLLGYWKFNEGSGNTVK DYSMYGNDGVAKYDIIWPQGIEIPKLNENNE >gi|313158628|gb|AENZ01000037.1| GENE 6 5995 - 6903 1025 302 aa, chain - ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 301 4 324 327 163 32.0 7e-39 MKQNFIFTLLALFSMLSLVSCSDWTETEALDSTVKKPWEQDPALWAEYTATLRAYKQSEH FIVYARMHNSPEKATGEQDFMRCLPDSLDIVTLTNADNFSKFDAEDMSVMHEKGTKVLYQ IDYAARAEEFADAAALDAYLDKVVVAVAANGMDGYSFTTDPLAADATARIIDRLATARTD GQLLVFEGNPLSLAATDRAKMDYVILDTEKLENVLDVRLHALHAVNYAGIEANRLLLTAE INAALFDEDRTEHAAVDEMSRRVVELGPYAGLAAYNIAEDYYHSEMNYQTIRQAIQTLNP SK >gi|313158628|gb|AENZ01000037.1| GENE 7 6909 - 8483 2307 524 aa, chain - ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 521 17 529 531 473 49.0 1e-132 MRKINMSWQSVLLSVALLGLAACTGDFEDINRNPNQVTEEQMDALNYKTGTKFKSLQSLV IPVQEHMYQFNESLSGGPFGGYIGATVDTWQTKFETYNPSADWRKWPFANVITETYTPYK GIVNGTEDEVAIAFARLLRVAIMHRVTDSYGPIPYSKLESNESVYVEYDSQEAVYTKMFE ELDEAIEILGRNTTLPAEAWSRYDGVYYGNIAQWLKYANSLKLRMAMRLSYVKSDVARAK AAAAIAGGVIEANADNAAMHAAENRTTLIYNDWGDHRVGADILCYMNGYKDPRMEKMFLA NDVGDYVGIRIGIDVTSKSQAVSKYSNMIVASDTPYLWFNAAEATFLRAEYELRWGSEET ARSLYEAAVRLSFEERGASGADAYLAQTDVKPAPYTDPVGNYSAMAQSDIGIPWVDGAEN FERNLERIIVQKWIAIFPLGVEAWSEHRRTGYPKLLPVPTDKSGGTVDVAQGARRLPYPV EEYQQNNANLQLAIQTLDGEQQNGNRTGDVMGTRVWWDCKSYNN >gi|313158628|gb|AENZ01000037.1| GENE 8 8496 - 11825 3381 1109 aa, chain - ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 1109 17 1110 1110 1226 56.0 0 MNKKIIQQICLSFLLVLTLVCIPVTGVQAQTQKVSIKMENVRMKQVMNEIERQTKFLFGI 
NDDVDVNRLVTVNVTNQSLQAALDQMVKGTDITYQISSSNIILSKKQAKLPVDISGTIRD INGSPIVGATIVIRGTSIGISSGSDGSFSLQIPPPAASRQLEVSYLGYETLVVDVGNRTA FDLTLQESSAEIEQVVVTALGIKRSEKAVAYNVQQIKAEDITTVKDANFINSLTGKVAGV TINASSSGVGGASKVVLRGNKSISQSSNALYVIDGIPMYNFGGGGGTEFDSRGATESIAD LNPEDIESMSVLTGAAAAALYGSEAANGAVMITTKKGQAGALKVTLSSNTEFLNPFVLPE FQNRYGTGLNGTRSGSGIYSWGEKLLPAARYGYTPNDYFETGHVYTNAVTVSGGTDKNQT YFSAASVNSDGIIPNNEYDRYNFTFRNTSYFLKDRLKLDASASYIYQKDQNMTNQGVYSN PLVPAYLFPRGENFDLYRRFERYNEGTKLMEQFWSADMEGGDLRMQNPYWINYRNLRNTD KKRYMLSFSASYDILPWLNVAGRVRLDNSNSLYTQKLYASSNTTITDGGKNGHYTEARAY DTQTYADLMVNINKTFGEDWSLNANIGASINNIKTDELSYRGPIQENGLPNIFNVFDLDD TKKRAEKVGWHDQTQSIFASVEVGWKQMLYLTVTGRNDWASQLANSPESSFFYPSVGLSW VPSSTWDLGKTFSYLKIRGSIASVGMPFPRHLTVPTYEYDATNKVWKDKTHYPIGDLKPE RTITYEVGLDARLWQHINLSASWYRADTKNQTFDPSLPPSSSYTTIYLQTGHVRNTGVEL SLGYDNQWRDFRWTTNFTYSWNKNEIRELAANAVNPVTGESLNLTKLDIKGLGKAKYILK EGGTLGDLYTTSNLRYNENGYVEVDNAGNLQVTDEGEDIYLGSVFPDANLAWRNDFSWKG INLGVLFTARLGGVCYSATQANLDLYGVSEASAAARDAGGVLINGREMVDAQKWYQAIGS QSGLPQYYLYSATNVRLQELSLGYTLPTKWFRDKMRMTVSFVGRNLWMIYCKAPFDPEAV ASTGLNYQGIDYFMMPSMRSLGFNVKFQF >gi|313158628|gb|AENZ01000037.1| GENE 9 12008 - 12991 231 327 aa, chain - ## HITS:1 COG:no KEGG:Bache_2469 NR:ns ## KEGG: Bache_2469 # Name: not_defined # Def: anti-FecI sigma factor, FecR # Organism: B.helcogenes # Pathway: not_defined # 7 327 1 313 313 170 33.0 7e-41 MRYYDRMEKKVTDAMLFRFFAEQTTLEETDMISAWLNEDPEKNQKVLDKAYNLYVLGIMC APNPEAGINNSRSFLRFGWPKVLRYAGVAAAILLLGIAVDHVLLSQRMNRWLQQQTTVEV PAGQHIRLTLGDGSIVELNAQSRIVYPAIFPKGERRVQLSGEAIFNVAHNSECPFIVETF ACDVEVLGTHFDVIANQAENRFSTALFSGRVAVTNKKNGESVILQPNDIVNLRDGGFQIQ ELEDKDDYLWMDGIISVYGMPFEQLMSKLERSYNVKIEIRRETMPVVKYKSLKIRVSDGI DHALHMLQLASDFTFEHDAVTDVIIIK >gi|313158628|gb|AENZ01000037.1| GENE 10 13289 - 13726 222 145 aa, chain + ## HITS:1 COG:no KEGG:BVU_0502 NR:ns ## KEGG: BVU_0502 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 1 142 42 184 186 83 34.0 2e-15 
MTIWENRETVQFTSIEAYLFGVVKKRCLKYRRDRYTGKAVYEKILMKERGVMDFYTRTIE SCNPNELFQREIIEICRKQLEQMPELTKQVFIAHKLEGKSYKEIADMLCINLKKVDRELQ QAAMKLRLSLKDYLLLLLLIVYSEI >gi|313158628|gb|AENZ01000037.1| GENE 11 13948 - 15189 1408 413 aa, chain + ## HITS:1 COG:RSc0154 KEGG:ns NR:ns ## COG: RSc0154 COG0477 # Protein_GI_number: 17544873 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 31 400 54 401 426 85 23.0 2e-16 MGLPFVVLNMVSAVLFKDLGISDAQIAFWTSLIMWPWTIKFLWSPFLEIFRTKKFFVVTT QLLSGILFGLAALSLHLPSFFAVTIAVFAVVAFSGATHDIAADGVYMSELTTQDQAKYIG WQGAFYNLAKLVATGGLVWFAGFLYEGFSAVGAATFDAYVRSWTVVLAILCGLLVLLGVY HVRALPAGGGAAERHSLRDGLSGLKEVVGAFFQKKHIWYYLGFIILYRLGEGFVMKIVPL FLKADTASGGLGLTNQQIGLYYGTFGAGAFLLGSLLAGYYIAHRGLRRTLFILCCIFNIP FAVYALLAWLQPQSMWIVGGGIVLEYFGYGFGFVGLTLFMMQQVAPGRHQMAHYAFASGI MNLSVMLTGAVSGYLSDALGYGMFFLAVMLATVPAFLVTWFVPFTYDDKPNDK >gi|313158628|gb|AENZ01000037.1| GENE 12 15192 - 16199 1459 335 aa, chain + ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 4 334 1 326 326 385 56.0 1e-107 MNNLKIVGPAMPDMPWEDRPAGSKEVMWRYSANPIIPRDALSTSNSIFNSAVIPFKKGKY NYAGVFRCDDTNRRMRIHAAFSVDGIKWDICEEDFKLVGADPEIAEWVYGYDPRVAKIDD KYYVTWCNGYHGPTIGIAWTDDFETFHQLENAFIPFNRNGVLFPRKINGRFAMLSRPSDN GHTAFGDIFYSESPDMEFWGRHRHVMSPAAFEVSAWQCMKIGAGPVPIETSEGWLLLYHG VLRSCNGYVYAFGSALLDLDQPWKTIARSGPYLISPREIYELTGDVPNVTFPCASLHDPE TGRIAVYYGCADTVTGLAFGYIPEIIDFTKKNNIL >gi|313158628|gb|AENZ01000037.1| GENE 13 16196 - 18433 2338 745 aa, chain + ## HITS:1 COG:CC0533 KEGG:ns NR:ns ## COG: CC0533 COG3537 # Protein_GI_number: 16124788 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Caulobacter vibrioides # 2 741 11 
752 770 330 33.0 7e-90 MRSLALLAVLTLCLGARAQEPAECVDPFIGTTNFGTANPGAVTPHGMMSVVPFNVMGSEE NVYDKDARWWSTPYEYHNKFFTGFAHGALSGVGCPELGALLTMATTGPLTVDYREYGTSY RDEKASPGYYSVFLDKYGVLAEVTATARTSVERYTFPGGEGHILLNLGEGLTNESGAMVR RVSATEIEGTKLLGTFCYNAQKVFPVYFVLRVSKTPSASGYWKKQRAMTGVEAEWTPDNG RYKLYMEYGRELSGDDVGYWFSFDGLAAGEQVEVRMGISYVSTENARKNLDAEQAADATF DAIREQARKGWNEALGRIRVEGGTEEQQRVFYTALYHALLHPNLVSDVNGEYPLMERSGE TGVTDGERYTVFSLWDTCRNLHQLLTLVYPDRQREMLRSMTGMYEEWGWLPKWELYGRET FTMEGDPAIPVIVDSWMKGLRDFDVAKAYEAMRKSATTPGAQNRMRPDLDPYIEKGYIPL GFYAKDLAGDTSVSHALEYYMADAALSLLADSLGHGDDARLFRARSLGYKRYYSPESGTL RPLHPDGSFLSPFDPKAGENFTAAPGFHEGSAWNYTFYVPHDVEGLAKLMGGRRKFIDKL QMVFDEGLYDPANEPDIAYAYLFSRFRGEEWRTQRETRRLLERYFTTAPDGIPGNDDTGT MSAWAVFTMLGLYPDCPGEPYYTLTSPTFDRVEIDTERGTLVIEKRGEGYIDRMTLGDKP LKNYRILHDELLKGGKLTFELQTGK >gi|313158628|gb|AENZ01000037.1| GENE 14 18434 - 22012 4009 1192 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 21 750 42 785 790 567 43.0 1e-161 MKLTGFLFAAALLGACTQPAAEENYTRHVDPKIGTGGHGHVFVGANVPFGLVQVGPTSIP QTWDWCSGYHASDSTVIGFSHTHLSGTGIGDLFDITVMPVVGEVTYARGEENDPASGLWS YADRTREITKPGYYSVPLVRYGITAELTATERVGLHRYTFPASDAAAVVFDLENGGCWDK ATDTGFRFSDDSTRIAGWRCSTGWAKNQEVYFVAEFSKPAKGISYLQPGEIDDSKMPRIA ARYARVDFDTAEGEQVLVKVALSPVSIKGAKANLAAELPGWDFDATAAAADKAWNDELSK VKIETEDETSKRIFYTALYHTMVAPSLFCDVNGDYRGADGKVHENPGRDTYTTFSLWDTY RAAMPLMTVLHPERMPDIIQTMLHIADEQGRLPVWHLWGNETDCMVGNPGIVAVADAIVK GIGGFDREKAFETIRKTAMNPDRGNGLRMEYGYIPCEMFNEAVAYDMEYALADGAAARAA EALGKAEDAKYFEERSHSYRNYFDPQTGFMRGRDSKKGWRTPFNAFASTHRADDYCEGNA WQYTWLAPHDVAGLVGCFGSRARMLEKLDSLFTVSSVIEGGETSPDISGLIGQYAHGNEP SHHILYLYTMLGQPWETADKVREVLTTLYHAAPDGLSGNEDVGQMSAWYVLSSLGMYEAE PAGGRYWFGSPLFDRVEITVPGGTFTIVAENNSAANKYIQRVWLNGQPYTKPWIGHADVM KGGELRFEMGAEEKVWYCPDEPEAYADQRPAEEQRLFKSEAVEGEIARVCGLLTNERLRW MFANCFPNTLDTTVHYGEDEAGNPDTYVYTGDIPAMWLRDSGAQVWPYVQLCKEDPALQK 
MIAGVIRRQFKLINIDPYANAFNVGPTGDGEDVGYPGNDQSPWVFERKWEIDSHCYPLRL AHHYWKTTGDTSVFDGEWISAMRNIVKTLKEQQMKEGPGDYIFLRTTDRQLDTRCHVGRG NPVKPVGLIVSAFRPSDDATTFGFLIPSNFMTVTSLRKAAEILTAVNGERELAAECTALA GEVEEALQKYAVVEHPEFGKIYAFEVDGFGGCFLMDDANVPSLLAMPYLGDVERTDPIYE NTRRFVWSEENPYFWRGSAGEGIGGPHIGVEMIWPMSIMMRAFTSEDDAEIRDCIVALMT TDAGTGFMHESFSRHDAANFTRPWFAWQNTLFGELILKLVNENKVDLLNSID >gi|313158628|gb|AENZ01000037.1| GENE 15 22033 - 22915 1033 294 aa, chain + ## HITS:1 COG:SP2145 KEGG:ns NR:ns ## COG: SP2145 COG3537 # Protein_GI_number: 15901958 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Streptococcus pneumoniae TIGR4 # 39 293 6 239 694 89 27.0 7e-18 MIKPVSLLLLAAVLCGSCGSRTGRTAATSCDEPSARPSEYVSTLVGTHSDFTLSTGNTYP AVALPWGMNFWTPQTGEMGSGWAYTYGAHTIRGLKQTHQPSPWINDYGQFSIMPIRGRDK VDEESRQSWFSHQSEEARPYYYSVYLADHDIKAEIAPTERAAIMRFTFPESDESGVVIDA FDRGSYIRVMHDKRTVVGYTTRNSGGVPDNFKNWFIVRFDRKIRDFQIYDGTKPVEGEQL VGEHALVRVGFETRRGEQVTVRAASSFISQMQAVQNLEELGKDDFETVKAKAQA Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:14:20 2011 Seq name: gi|313158626|gb|AENZ01000038.1| Alistipes sp. HGB5 contig00063, whole genome shotgun sequence Length of sequence - 660 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 2 - 659 304 ## BF1784 hypothetical protein Predicted protein(s) >gi|313158626|gb|AENZ01000038.1| GENE 1 2 - 659 304 219 aa, chain - ## HITS:1 COG:no KEGG:BF1784 NR:ns ## KEGG: BF1784 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 29 219 20 207 212 95 33.0 1e-18 ILPQAGCSAMRRWIDQATFDELLLANAEDNVERAIQSASAVLEAVQEFVPEATLFYHVTH AADSLKNYFEEVCTGFRRNGILFLVRQYFYPKPYYDLRFDWSSFRYADNYSYSKAFDKQS EPNRIGVFTKKKIDDWVEYLTQGFRNLERIDAENERKMTGYRNRLEAIPDVAWNKDKSRG HITRHGLTYTFEIRQTDYSEKISLDYRCRTLDDFLALSD Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:14:26 2011 Seq name: gi|313158616|gb|AENZ01000039.1| Alistipes sp. HGB5 contig00059, whole genome shotgun sequence Length of sequence - 9760 bp Number of predicted genes - 12, with homology - 11 Number of transcription units - 10, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 159 - 218 8.5 1 1 Tu 1 . + CDS 238 - 567 340 ## BT_p548232 hypothetical protein - Term 740 - 777 6.1 2 2 Tu 1 . - CDS 943 - 1332 334 ## BDI_3248 mobilization protein BmgB - Prom 1556 - 1615 2.8 - Term 1363 - 1413 -0.4 3 3 Tu 1 . - CDS 1662 - 2039 228 ## BDI_3249 hypothetical protein - Prom 2277 - 2336 7.6 + Prom 2572 - 2631 1.9 4 4 Op 1 . + CDS 2783 - 3094 421 ## BT_1129 hypothetical protein 5 4 Op 2 . + CDS 3137 - 3460 285 ## BT_1130 hypothetical protein + Term 3482 - 3537 13.4 - Term 3946 - 3972 -1.0 6 5 Op 1 . - CDS 3982 - 4374 280 ## BDI_2238 hypothetical protein 7 5 Op 2 . - CDS 4413 - 4973 525 ## BDI_2239 hypothetical protein - Prom 5001 - 5060 5.4 - Term 5775 - 5819 -0.5 8 6 Tu 1 . - CDS 5820 - 7067 1280 ## BDI_3265 transposase - Prom 7164 - 7223 4.7 9 7 Tu 1 . - CDS 7373 - 7597 87 ## gi|313159271|gb|EFR58638.1| hypothetical protein HMPREF9720_2312 - Prom 7627 - 7686 2.8 + Prom 7607 - 7666 1.6 10 8 Tu 1 . + CDS 7748 - 7930 132 ## 11 9 Tu 1 . - CDS 7999 - 9165 556 ## BF1542 hypothetical protein - Term 9356 - 9387 1.1 12 10 Tu 1 . 
- CDS 9540 - 9758 202 ## gi|291514538|emb|CBK63748.1| Protein of unknown function (DUF3244) Predicted protein(s) >gi|313158616|gb|AENZ01000039.1| GENE 1 238 - 567 340 109 aa, chain + ## HITS:1 COG:no KEGG:BT_p548232 NR:ns ## KEGG: BT_p548232 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 109 1 109 109 218 100.0 6e-56 MSTYSYKNPKFINSPKGVVEVVEVIYDGKDDPAYSLAIIKWENTYKLGIRWNIAYSEWDD YRKQNGQDECIGNPQSRGIPTWFVLPDDMMFGEKFSGAMQRLDELRKGK >gi|313158616|gb|AENZ01000039.1| GENE 2 943 - 1332 334 129 aa, chain - ## HITS:1 COG:no KEGG:BDI_3248 NR:ns ## KEGG: BDI_3248 # Name: not_defined # Def: mobilization protein BmgB # Organism: P.distasonis # Pathway: not_defined # 1 115 97 211 219 200 94.0 2e-50 MTKDMNEMKKKRGRPALGRTRKLTKGVTVKFSPVSYEALRFRARKSGRSLAVYIREVALA ATVTARHTPEENALLRSLAGMANNLNHLTKLSHQTGFYRTRLLIDGLLGKLKRIMVYLLG MLSDSPFKF >gi|313158616|gb|AENZ01000039.1| GENE 3 1662 - 2039 228 125 aa, chain - ## HITS:1 COG:no KEGG:BDI_3249 NR:ns ## KEGG: BDI_3249 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 41 125 78 167 167 92 57.0 6e-18 MKEGKEDMPLVGKTMAEGHTHRMAVLCEKLGSTLREILDGTCPQPAKPEKAPKRPALEKY RETYLIPPKIKGPKAVFISEETRSALDMIVLRLGCRGMSVSGLLENLARRHLETYRQDIE RWRKL >gi|313158616|gb|AENZ01000039.1| GENE 4 2783 - 3094 421 103 aa, chain + ## HITS:1 COG:no KEGG:BT_1129 NR:ns ## KEGG: BT_1129 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 103 1 103 103 196 99.0 3e-49 MEIVNIEARTFEAMLSAFRTFADRLDTLCRLYGDIEEKKWLDNQEVCLLLKVSPRTLQTL RDNGTLAYTQICHKTYYKPGDVESIIRIVEERRKRAESMGRSI >gi|313158616|gb|AENZ01000039.1| GENE 5 3137 - 3460 285 107 aa, chain + ## HITS:1 COG:no KEGG:BT_1130 NR:ns ## KEGG: BT_1130 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 107 1 107 107 173 99.0 2e-42 MMNTDNRLLTRESSEHIREFFSTVERLSVSMERLFAGRSPAMAGENFYTDRELAEKLKVS 
RRSLQQYRDSSLLAFTRLGGKILYRSSDIEKLLDGCYREARTRPEEL >gi|313158616|gb|AENZ01000039.1| GENE 6 3982 - 4374 280 130 aa, chain - ## HITS:1 COG:no KEGG:BDI_2238 NR:ns ## KEGG: BDI_2238 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 130 1 130 130 264 100.0 7e-70 MALIVYNRENSRPQEVTYKGKRTINLDSRGTVYLSKTMSIELGILGGGRVNFAHDDETGD WYICRADDSEGFIVWKDKRCARFSAGFIVQRLMRQAKVERKSVQFMMARMPVEIGGVAYY KILLSNPILR >gi|313158616|gb|AENZ01000039.1| GENE 7 4413 - 4973 525 186 aa, chain - ## HITS:1 COG:no KEGG:BDI_2239 NR:ns ## KEGG: BDI_2239 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 186 1 186 186 348 98.0 5e-95 MADKSAEKERLFNEWFTKSYDRLRGTLRRYGMLDEDNFHDTYLFVRRQVLVPGKDITDYD AYFIGCYKKAALVKIKRENRYAHPEDDFFLRCGEEAKFLSEDDLNGCERLVKDILRFVRQ KFSYEEYRMFMLRFYEAQFSFKALAECMGISASAISQKVCRIVDAVRTHSGFAWRSRMLA VESFMY >gi|313158616|gb|AENZ01000039.1| GENE 8 5820 - 7067 1280 415 aa, chain - ## HITS:1 COG:no KEGG:BDI_3265 NR:ns ## KEGG: BDI_3265 # Name: not_defined # Def: transposase # Organism: P.distasonis # Pathway: not_defined # 1 415 1 415 415 819 96.0 0 MKTSMSRSTFKILFYVKKGSERANGYLPLMCRLTVDGEIKQFSCKLDAPPKLWDVKTARA TGKSVEAQKINAAVDRIRVDVNRRYQELMQSDGYVTAARLRDVYLGLGVKRETLLKLFEQ HNEEFIKKVGHSRVQGTYNRYRTIYKHLCEFVPKVYRRDDIPLKELNLTFINNFEYFLRT EKKCRTNTVWGYMIGLKHIISIARNSGALPFNPFAGYINSFESVDRGYLTEREIQTLMEV PVKSGTCELVRDLFIFSVFTGLAYADVKALTTDRLQTFFDGNLWIITRRRKTNTESNIRL LDVPRRIIEKYKGLSKDDHVFPVPSSSRCNVILKELGRQCGFKIRLTYHVARHTNATTVL LSHGVPIETVSRLLGHTDLKTTQIYARITNQKISSDMEVLSHKLEKMEKEICDAI >gi|313158616|gb|AENZ01000039.1| GENE 9 7373 - 7597 87 74 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313159271|gb|EFR58638.1| ## NR: gi|313159271|gb|EFR58638.1| hypothetical protein HMPREF9720_2312 [Alistipes sp. 
HGB5] # 1 74 120 197 197 105 67.0 9e-22 MGLGLASHENAAPIILIGDNSYDIGGSSSSFCLMPRFGIELFHHLRVTFNYKLEEKANRH FNISVGVVFGGGRK >gi|313158616|gb|AENZ01000039.1| GENE 10 7748 - 7930 132 60 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQKIVAYFPKAHSPGFCIVKIKLTGAKNYAATNLDFYLANFSILLFRRSVRLGIDIEEVQ >gi|313158616|gb|AENZ01000039.1| GENE 11 7999 - 9165 556 388 aa, chain - ## HITS:1 COG:no KEGG:BF1542 NR:ns ## KEGG: BF1542 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 388 54 420 421 185 32.0 3e-45 MSLYIAKGYDSLVPNIKVAKLLNARAEVMVDQTIYKISPRGTYYFPESKQKFFESNYERF ETEEGIQMEENTYQLAEDIYRINTFEITEVLDAEAEDLPDDLPGVLDDDDLSTNIVNTLN MTTRTVTTQIPPFERFPRFNADAKTWLGKLWQSIFGRNRGYTYKLSKKHRMKGKFYYYNY KFYTEFGASAEMQKKNWIGWSGKQASELLIGWSNIIFSEGYKRIPDYPKNAEASIVSTEY KTIPGFNHQGMVLTIVGLDITALQQQQLSLMNPIQIRNWFKLRVENSNVDITQLDAIQCF SADKVITILPGGQKRKGDVKKVREIFLSEVSFTINLDLNHMPQSFKEWAKIINQGKFKIK PKNLKFGTVYVAGRLGNSWGGMTIVKKS >gi|313158616|gb|AENZ01000039.1| GENE 12 9540 - 9758 202 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514538|emb|CBK63748.1| ## NR: gi|291514538|emb|CBK63748.1| Protein of unknown function (DUF3244) [Alistipes shahii WAL 8301] # 2 71 58 127 127 129 92.0 6e-29 SDGNTITLHSPENCDRAFVTISGNGTYLTEMVNFTDQTATLDVSDLDCGVYLITVEYENG TIYTGHIEFLEI Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:15:53 2011 Seq name: gi|313158484|gb|AENZ01000040.1| Alistipes sp. HGB5 contig00079, whole genome shotgun sequence Length of sequence - 149490 bp Number of predicted genes - 133, with homology - 129 Number of transcription units - 51, operones - 28 average op.length - 3.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 982 1314 ## COG1640 4-alpha-glucanotransferase + Term 1131 - 1171 4.1 2 2 Op 1 . + CDS 1216 - 1584 691 ## COG1694 Predicted pyrophosphatase 3 2 Op 2 . + CDS 1587 - 1994 576 ## gi|313158497|gb|EFR57891.1| hypothetical protein HMPREF9720_2802 4 2 Op 3 . 
+ CDS 1995 - 4001 3291 ## COG1297 Predicted membrane protein 5 2 Op 4 . + CDS 4015 - 4317 183 ## gi|313158496|gb|EFR57890.1| putative lipoprotein 6 2 Op 5 . + CDS 4331 - 5548 1990 ## COG2195 Di- and tripeptidases 7 3 Op 1 . + CDS 5667 - 7604 2856 ## COG0171 NAD synthase 8 3 Op 2 . + CDS 7607 - 8227 870 ## BF2047 hypothetical protein + Term 8255 - 8291 4.2 + Prom 8238 - 8297 7.5 9 4 Tu 1 . + CDS 8466 - 9821 2089 ## COG0334 Glutamate dehydrogenase/leucine dehydrogenase + Term 9838 - 9882 9.0 - Term 9829 - 9868 8.5 10 5 Op 1 . - CDS 9889 - 10494 1003 ## Odosp_2687 hypothetical protein 11 5 Op 2 . - CDS 10644 - 12065 2423 ## COG1508 DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog - Prom 12111 - 12170 1.6 - Term 12148 - 12197 0.1 12 6 Tu 1 . - CDS 12251 - 13648 2015 ## COG0017 Aspartyl/asparaginyl-tRNA synthetases - Prom 13679 - 13738 6.6 + Prom 13647 - 13706 3.9 13 7 Tu 1 . + CDS 13755 - 15113 2250 ## COG3250 Beta-galactosidase/beta-glucuronidase 14 8 Op 1 . + CDS 15255 - 15728 692 ## COG1576 Uncharacterized conserved protein 15 8 Op 2 . + CDS 15733 - 16320 852 ## COG4566 Response regulator 16 8 Op 3 . + CDS 16397 - 18760 3217 ## COG1629 Outer membrane receptor proteins, mostly Fe transport 17 8 Op 4 . + CDS 18776 - 19252 456 ## gi|313158541|gb|EFR57935.1| putative lipoprotein 18 8 Op 5 . + CDS 19257 - 20111 800 ## PROTEIN SUPPORTED gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 19 8 Op 6 . + CDS 20148 - 22058 2839 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily 20 8 Op 7 . + CDS 22051 - 23436 2061 ## COG4866 Uncharacterized conserved protein + Term 23479 - 23516 6.3 - Term 23464 - 23507 10.1 21 9 Tu 1 . - CDS 23540 - 24118 321 ## gi|313158610|gb|EFR58004.1| hypothetical protein HMPREF9720_2820 - Prom 24230 - 24289 3.3 22 10 Tu 1 . - CDS 24316 - 24432 56 ## - Term 24850 - 24896 11.6 23 11 Op 1 . 
- CDS 24971 - 25825 1065 ## COG0190 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase 24 11 Op 2 17/0.000 - CDS 25835 - 27082 1606 ## COG0151 Phosphoribosylamine-glycine ligase - Prom 27123 - 27182 2.1 - Term 27168 - 27199 3.4 25 11 Op 3 . - CDS 27254 - 28783 2234 ## COG0138 AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) 26 11 Op 4 . - CDS 28790 - 29209 553 ## COG3871 Uncharacterized stress protein (general stress protein 26) 27 11 Op 5 21/0.000 - CDS 29206 - 29769 584 ## COG0299 Folate-dependent phosphoribosylglycinamide formyltransferase PurN 28 11 Op 6 13/0.000 - CDS 29774 - 30799 912 ## PROTEIN SUPPORTED gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase 29 11 Op 7 2/0.000 - CDS 30823 - 32238 2005 ## COG0034 Glutamine phosphoribosylpyrophosphate amidotransferase 30 11 Op 8 . - CDS 32244 - 32960 1413 ## COG0152 Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase 31 11 Op 9 . - CDS 32998 - 33966 1268 ## Phep_1385 endonuclease/exonuclease/phosphatase 32 11 Op 10 . - CDS 33983 - 37693 5243 ## COG0046 Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain 33 12 Op 1 29/0.000 - CDS 37810 - 38904 1347 ## COG0026 Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) 34 12 Op 2 . - CDS 38905 - 39372 738 ## COG0041 Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase - Prom 39604 - 39663 3.9 - Term 39390 - 39437 16.1 35 13 Op 1 . - CDS 39686 - 42016 3072 ## COG3537 Putative alpha-1,2-mannosidase 36 13 Op 2 . - CDS 42032 - 44248 3151 ## COG3537 Putative alpha-1,2-mannosidase 37 13 Op 3 . - CDS 44280 - 46475 3077 ## BT_1035 hypothetical protein 38 13 Op 4 . - CDS 46476 - 48743 3085 ## COG3537 Putative alpha-1,2-mannosidase 39 13 Op 5 . - CDS 48746 - 49756 1565 ## COG2152 Predicted glycosylase 40 13 Op 6 . 
- CDS 49774 - 51087 1910 ## COG0477 Permeases of the major facilitator superfamily - Prom 51180 - 51239 5.7 - Term 51375 - 51403 1.0 41 14 Tu 1 . - CDS 51429 - 52514 1679 ## COG0540 Aspartate carbamoyltransferase, catalytic chain 42 15 Op 1 . - CDS 52666 - 53298 974 ## COG0461 Orotate phosphoribosyltransferase 43 15 Op 2 . - CDS 53315 - 53575 474 ## BVU_2290 hypothetical protein 44 15 Op 3 5/0.000 - CDS 53598 - 54305 991 ## COG0284 Orotidine-5'-phosphate decarboxylase 45 15 Op 4 13/0.000 - CDS 54320 - 55279 1317 ## COG0167 Dihydroorotate dehydrogenase 46 15 Op 5 3/0.000 - CDS 55273 - 56013 1162 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 47 15 Op 6 24/0.000 - CDS 56137 - 59319 5081 ## COG0458 Carbamoylphosphate synthase large subunit (split gene in MJ) 48 15 Op 7 7/0.000 - CDS 59312 - 60403 1622 ## COG0505 Carbamoylphosphate synthase small subunit 49 15 Op 8 . - CDS 60406 - 61680 1420 ## COG0044 Dihydroorotase and related cyclic amidohydrolases - Prom 61708 - 61767 2.6 + Prom 62020 - 62079 5.2 50 16 Op 1 . + CDS 62110 - 62538 292 ## 51 16 Op 2 . + CDS 62661 - 63878 1338 ## EcE24377A_1450 hypothetical protein + Term 63900 - 63945 9.7 - Term 63888 - 63931 12.2 52 17 Tu 1 . - CDS 63981 - 64754 1160 ## COG1349 Transcriptional regulators of sugar metabolism - Prom 64779 - 64838 4.0 - Term 64804 - 64854 11.7 53 18 Tu 1 . - CDS 64899 - 66353 1904 ## COG2271 Sugar phosphate permease - Prom 66408 - 66467 4.4 54 19 Op 1 . + CDS 66713 - 67948 1648 ## COG0582 Integrase 55 19 Op 2 . + CDS 67960 - 68322 413 ## BF0151 hypothetical protein + Term 68345 - 68395 10.1 - Term 68332 - 68382 10.1 56 20 Op 1 . - CDS 68400 - 68702 341 ## BF0150 hypothetical protein 57 20 Op 2 . - CDS 68737 - 69036 289 ## BF0149 hypothetical protein - Prom 69237 - 69296 3.6 + Prom 68972 - 69031 3.9 58 21 Op 1 . + CDS 69238 - 69600 407 ## BF0147 hypothetical protein 59 21 Op 2 . + CDS 69604 - 69954 487 ## BF0146 hypothetical protein 60 21 Op 3 . 
+ CDS 69975 - 71546 1917 ## BF0145 hypothetical protein 61 21 Op 4 . + CDS 71607 - 73694 2175 ## COG0550 Topoisomerase IA + Term 73772 - 73808 5.5 + Prom 73699 - 73758 1.6 62 22 Op 1 . + CDS 73857 - 74309 514 ## BF0143 hypothetical protein 63 22 Op 2 . + CDS 74299 - 80115 4662 ## COG4646 DNA methylase + Prom 80570 - 80629 10.4 64 23 Tu 1 . + CDS 80773 - 82698 424 ## COG0480 Translation elongation factors (GTPases) + Prom 82729 - 82788 5.5 65 24 Op 1 13/0.000 + CDS 82875 - 85016 449 ## COG0642 Signal transduction histidine kinase 66 24 Op 2 . + CDS 85030 - 86331 751 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains + Term 86556 - 86590 2.0 + Prom 86360 - 86419 2.6 67 25 Tu 1 . + CDS 86611 - 87033 71 ## BF0137 hypothetical protein + Term 87153 - 87201 6.8 + Prom 87123 - 87182 8.7 68 26 Tu 1 . + CDS 87272 - 87877 670 ## BF0136 tetracycline resistance element mobilization regulatory protein RteC + Term 87896 - 87942 9.4 + Prom 88078 - 88137 7.7 69 27 Tu 1 . + CDS 88323 - 89309 635 ## COG1373 Predicted ATPase (AAA+ superfamily) 70 28 Tu 1 . - CDS 90499 - 90792 60 ## - Prom 90864 - 90923 5.8 + Prom 91585 - 91644 1.6 71 29 Tu 1 . + CDS 91672 - 92580 80 ## BF0134 hypothetical protein + Term 92704 - 92735 -0.8 72 30 Op 1 . - CDS 92712 - 94730 1763 ## COG3505 Type IV secretory pathway, VirD4 components 73 30 Op 2 . - CDS 94761 - 95978 1150 ## BF0132 hypothetical protein 74 30 Op 3 . - CDS 95987 - 96415 513 ## BF0131 hypothetical protein 75 31 Op 1 . + CDS 97125 - 97886 980 ## BF0129 hypothetical protein 76 31 Op 2 . + CDS 97889 - 98329 203 ## Bacsa_2547 conjugate transposon protein 77 31 Op 3 . + CDS 98344 - 98685 267 ## Bacsa_2548 hypothetical protein 78 31 Op 4 . + CDS 98702 - 99430 574 ## Bacsa_2549 hypothetical protein 79 32 Op 1 . + CDS 99632 - 99949 332 ## Bacsa_2550 hypothetical protein 80 32 Op 2 . + CDS 99960 - 100292 407 ## Bacsa_2551 hypothetical protein 81 32 Op 3 . 
+ CDS 100289 - 102793 2925 ## COG3451 Type IV secretory pathway, VirB4 components 82 32 Op 4 . + CDS 102834 - 103214 436 ## Bacsa_2553 hypothetical protein 83 32 Op 5 . + CDS 103238 - 103867 926 ## Bacsa_2554 hypothetical protein 84 32 Op 6 . + CDS 103871 - 104875 1202 ## Bacsa_2555 conjugative transposon TraJ protein 85 32 Op 7 . + CDS 104907 - 105530 963 ## Bacsa_2556 conjugative transposon TraK protein 86 32 Op 8 . + CDS 105558 - 105845 234 ## Bacsa_2557 hypothetical protein 87 32 Op 9 . + CDS 105826 - 107178 1121 ## Bacsa_2558 conjugative transposon TraM protein 88 32 Op 10 . + CDS 107213 - 108199 1260 ## Bacsa_2559 conjugative transposon TraN protein 89 32 Op 11 . + CDS 108202 - 108777 577 ## Bacsa_2560 conjugative transposon protein TraO 90 32 Op 12 . + CDS 108818 - 109687 469 ## Bacsa_2561 hypothetical protein 91 32 Op 13 . + CDS 109684 - 110187 649 ## Bacsa_2562 hypothetical protein 92 32 Op 14 . + CDS 110301 - 110711 251 ## Bacsa_2563 lysozyme + Term 110729 - 110779 13.6 - Term 111162 - 111192 3.7 93 33 Op 1 . - CDS 111199 - 111507 352 ## BF0110 hypothetical protein 94 33 Op 2 . - CDS 111561 - 111806 255 ## BF0109 hypothetical protein 95 33 Op 3 . - CDS 111803 - 112036 302 ## Bacsa_2571 hypothetical protein 96 33 Op 4 . - CDS 112056 - 112313 328 ## Bacsa_2572 hypothetical protein 97 33 Op 5 . - CDS 112338 - 113669 932 ## Bacsa_2573 hypothetical protein 98 33 Op 6 . - CDS 113666 - 114082 242 ## Bacsa_2574 hypothetical protein 99 33 Op 7 . - CDS 114096 - 114317 273 ## BF0105 hypothetical protein 100 33 Op 8 . - CDS 114329 - 114544 156 ## Bacsa_2576 hypothetical protein - Prom 114575 - 114634 3.6 + Prom 114497 - 114556 3.7 101 34 Op 1 . + CDS 114576 - 114833 67 ## gi|167762708|ref|ZP_02434835.1| hypothetical protein BACSTE_01066 102 34 Op 2 . + CDS 114841 - 115104 66 ## Bacsa_2577 hypothetical protein + Term 115209 - 115269 17.0 103 35 Tu 1 . 
- CDS 115588 - 116928 1708 ## COG0371 Glycerol dehydrogenase and related enzymes - Prom 116993 - 117052 7.8 104 36 Op 1 . + CDS 117308 - 118501 1383 ## BT_2457 putative purple acid phosphatase 105 36 Op 2 . + CDS 118511 - 119407 1341 ## COG0584 Glycerophosphoryl diester phosphodiesterase 106 36 Op 3 . + CDS 119407 - 120273 1501 ## COG0647 Predicted sugar phosphatases of the HAD superfamily 107 37 Op 1 . + CDS 120574 - 121449 1360 ## COG0428 Predicted divalent heavy-metal cations transporter 108 37 Op 2 . + CDS 121453 - 122406 1450 ## COG0530 Ca2+/Na+ antiporter + Term 122465 - 122494 2.1 - Term 122453 - 122482 2.1 109 38 Tu 1 . - CDS 122487 - 123524 1013 ## COG1194 A/G-specific DNA glycosylase - Prom 123595 - 123654 4.2 + Prom 123517 - 123576 3.8 110 39 Tu 1 . + CDS 123605 - 123904 471 ## COG2388 Predicted acetyltransferase + Term 123940 - 123987 11.0 111 40 Tu 1 . - CDS 124021 - 125427 1635 ## Bacsa_1635 hypothetical protein - Prom 125554 - 125613 4.6 + Prom 125530 - 125589 7.2 112 41 Op 1 . + CDS 125630 - 125857 454 ## gi|313158608|gb|EFR58002.1| hypothetical protein HMPREF9720_2907 113 41 Op 2 . + CDS 125871 - 126242 427 ## Bacsa_3075 hypothetical protein + Prom 126271 - 126330 2.0 114 42 Op 1 . + CDS 126353 - 127696 1412 ## BVU_0460 hypothetical protein 115 42 Op 2 . + CDS 127701 - 128333 861 ## gi|313158615|gb|EFR58009.1| putative membrane protein 116 42 Op 3 . + CDS 128326 - 128610 299 ## COG1846 Transcriptional regulators + Term 128757 - 128786 1.2 - Term 128637 - 128668 -0.9 117 43 Tu 1 . - CDS 128794 - 129408 719 ## COG2431 Predicted membrane protein - Term 129964 - 130007 8.2 118 44 Op 1 . - CDS 130051 - 132735 3727 ## BF0908 putative outer membrane protein 119 44 Op 2 . - CDS 132735 - 133172 469 ## ANT_07460 putative acetyltransferase (EC:2.3.1.-) - Prom 133354 - 133413 2.0 120 45 Op 1 . + CDS 133604 - 134647 1470 ## Odosp_2777 putative surface layer protein 121 45 Op 2 . 
+ CDS 134664 - 135548 1385 ## Odosp_2775 PKD domain containing protein 122 45 Op 3 . + CDS 135553 - 137604 2942 ## BT_1953 putative TonB-linked outer membrane receptor 123 45 Op 4 . + CDS 137601 - 138275 971 ## COG4912 Predicted DNA alkylation repair enzyme + Term 138315 - 138359 5.5 124 46 Tu 1 . + CDS 138401 - 139147 1067 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 139149 - 139202 4.7 + Prom 140420 - 140479 3.9 125 47 Tu 1 . + CDS 140519 - 140767 136 ## - Term 140690 - 140744 -0.4 126 48 Op 1 . - CDS 140908 - 141276 582 ## COG2246 Predicted membrane protein 127 48 Op 2 . - CDS 141273 - 142415 1514 ## Halhy_6054 glycosyl transferase, family 39 128 48 Op 3 . - CDS 142190 - 143041 255 ## Cpin_1026 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferase of PMT family - Prom 143134 - 143193 3.2 + Prom 143391 - 143450 2.0 129 49 Op 1 . + CDS 143515 - 145740 3664 ## COG1328 Oxygen-sensitive ribonucleoside-triphosphate reductase 130 49 Op 2 . + CDS 145737 - 146225 404 ## BVU_3106 radical enzyme activating protein - Term 146350 - 146385 1.1 131 50 Tu 1 . - CDS 146449 - 147285 668 ## BF2115 putative AraC-type transcription regulator + Prom 147622 - 147681 12.2 132 51 Op 1 . + CDS 147799 - 149148 900 ## Bacsa_2219 integrase family protein + Prom 149207 - 149266 6.6 133 51 Op 2 . 
+ CDS 149289 - 149490 173 ## Halhy_1851 helix-turn-helix domain-containing protein Predicted protein(s) >gi|313158484|gb|AENZ01000040.1| GENE 1 2 - 982 1314 326 aa, chain + ## HITS:1 COG:L94405 KEGG:ns NR:ns ## COG: L94405 COG1640 # Protein_GI_number: 15672678 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Lactococcus lactis # 72 325 235 487 489 235 45.0 1e-61 SADELRGMGFDTAGGRFTVPAPDDRMLGELFGELADEVRTTCMKEGRLLPAYATQRKVAE RFPGDDPRRSRLREGLMTLLDDVLFIEDPRRKGYFHPRIAPHSTHAYRRLDGERRAAFDR LYTDFFYHRHNRFWQESALRKLPVLLSATEMLTCGEDLGMIPDSVPETMRALQILSLEIQ RMPKTPGELFADPAHYPYFSVCTTSTHDMNPLRAWWEENRELSERFYREVLGMEGDAPRT CEPWICRRIVDMHLRSPAMLAILPLQDWLATDAALRSPHADRERINIPAAPRYYWRYRMH LTLEELLRQEPFNATLREMIIAGGRR >gi|313158484|gb|AENZ01000040.1| GENE 2 1216 - 1584 691 122 aa, chain + ## HITS:1 COG:L76848 KEGG:ns NR:ns ## COG: L76848 COG1694 # Protein_GI_number: 15673807 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Lactococcus lactis # 41 122 34 115 115 88 57.0 3e-18 MSDSYSQTFKFSAFMTLNEYQQHALETAIYPENRRIIYPTLGLTGEAGEVADKVKKVIRD GHEEFSDEKRLEIVKEIGDVLWYCATLSRDLGYELDDVAQMNVDKLRSRMQRHIISGSGD NR >gi|313158484|gb|AENZ01000040.1| GENE 3 1587 - 1994 576 135 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158497|gb|EFR57891.1| ## NR: gi|313158497|gb|EFR57891.1| hypothetical protein HMPREF9720_2802 [Alistipes sp. 
HGB5] # 1 135 1 135 135 220 100.0 3e-56 MEDYILREINRIGELIAALLDKIGLLKKSGAPELIRETAKTELAEQLNLDIDTLLAGADF IATLVDEYGFSDADLEKFAELLFDFAAASEERGERLRLAAAIGALYSYLDEKKAPASLNR YYILKDLDKYIKEPQ >gi|313158484|gb|AENZ01000040.1| GENE 4 1995 - 4001 3291 668 aa, chain + ## HITS:1 COG:PH0361 KEGG:ns NR:ns ## COG: PH0361 COG1297 # Protein_GI_number: 14590271 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Pyrococcus horikoshii # 23 624 2 590 626 250 31.0 7e-66 MEQEKQLTSLPENAYRELKPGEEYTPIMPASSSPREVTPYSVTMGIVMAVVFSAAAAFLG LKVGQVFEAAIPIAIIAVGMGNVLGKKNMLGQNVIIQSIGACSGVIVAGAIFTLPALYIL GLDAAFYQVFLSSLFGGLLGIVLLIPFRKYFVKEMHGKYPFPEATATTEVLISGEKGGNQ AKLLAVAGLIGGLYDFAVGTFGMWTEAVSTRICAWGQVAADKFKVVFSLNTSAAVLGLGY IIGLKYAMIITAGSCLVWFLVVPLVGSLADSIDPAAMASLLGVTRADIMADPQRLFTAEN LFAFIGKPLGIGGIAMAGIIGIVKQSKIIRQAVGLAVSELGGGNKTQAAATERTQRDLTM KRILTILIATLVSIFVFFHFGLLDGWVQSVTAILIVFVISFLFTTVAANAIAIVGTNPVS GMTLMTLILSSLVLVSVGLSGTTGMTAALIIGGVVCTALSMAGGFITDLKIGYWLGTTPR KQEAWKFLGTFVAAATVAGVMIILNKSYGFVGEGALVAPQANAMAAVIQPLMTGGQTPWM LYFCGAALALVLTSIGVPALAFALGMFIPMELNAPLVVGGLVAWFVSNRSKDEALNKARF DRGTLIASGFIAGGALMGVVSALLKFAEVDWFLSGWASSNAAEWTGLIMYLALIGYFGWH TLRAKKEE >gi|313158484|gb|AENZ01000040.1| GENE 5 4015 - 4317 183 100 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158496|gb|EFR57890.1| ## NR: gi|313158496|gb|EFR57890.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 100 1 100 100 142 100.0 9e-33 MKTAVILLVAATLLQGCAPLTVAQQSGTKTPATADAPQRKNRAGTHISLTGPSYPSVVVT PEGGHAVVMPPCGSRSVVVNPDGSHSIAIVNGSTATIVTP >gi|313158484|gb|AENZ01000040.1| GENE 6 4331 - 5548 1990 405 aa, chain + ## HITS:1 COG:CAC0476 KEGG:ns NR:ns ## COG: CAC0476 COG2195 # Protein_GI_number: 15893767 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Clostridium acetobutylicum # 5 405 6 407 408 487 60.0 1e-137 MDLKERFLKYVSYDTQSSEESGTFPSTEKQKVLLAALRDEMQALGMTEVTMDRYGYVMGT VPATPGCENAPVIGFIAHVDTSPDMSGKDVKPRIIEEYDGGDIALNGQLTMRVADFPELE FFKGHTLIHTDGTTLLGADDKAGVAEIMTAAEYLLAHPEIKHGKIRIGFTPDEEVGRGVD FFDVKAFGADFAYTVDGGMEGELEYENFNAASAKIDIQGRNVHPGYAKDKMINAIEVACD LQRFLPACQRPEHTEGYEGFYHCVGLNGTVEKASVSYIIRDHDADKFEQKKVFMWACVDL LKKKYGDEVLTLTVKDQYFNMRKMVEPHPQVIDKALEAMERAGVKPLVRPIRGGTDGARL SFMGLPCPNLFTGGMNFHGKFEYCSLTTMRKAQQVILNLAQLWAE >gi|313158484|gb|AENZ01000040.1| GENE 7 5667 - 7604 2856 645 aa, chain + ## HITS:1 COG:CAC1050_2 KEGG:ns NR:ns ## COG: CAC1050_2 COG0171 # Protein_GI_number: 15894337 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Clostridium acetobutylicum # 326 634 1 309 310 446 66.0 1e-125 MDNFGFLKVAAAVPHVRVGDCDFNTERIAAMAEEAAQRGVEIVAFPELAVTAYTCADLLL LPALLDAADEALARLVKATRKLPLVIIAGAPLRHGSTLYNCAVVFTQGRVLGVVPKTYIP DYTEFYENRWFASGAGISEETISVAEQSADFGADLTFGINGTEFGVEICEDLWTAIPPSS HLALNGAKVIFNLSASPESVGKHAYLRQLVAQQSARTLAGYVYCSAGFGESSTDLVFAGN GIVAENGRILRESGRFRLEEQLVVADIDIQRLEFERRRNTSFRMHEGAAENTVIEMEVPE GLRAAALDRDIDPMPFVPQDEAHRSERCEEIFQIQSHGLAKRLVHTRCEKAVIGISGGLD STLALLVTVRTFDKLGLDRAGIIGITMPGFGTTDRTYNNALELMRGLGVTIREIPIRDAC TQHFSDIGLDPEDRSAAYENSQARERTQILMDVANMEGGLVVGTGDLSELALGWATYNGD QMSMYGVNASVPKTLVRHLVKWVADTESDMAARATLLDVIDTPVSPELLPADKEGKIAQK TEDLVGPYELHDFFLYNFIRAGYGPAKILFLAEQAFHGSYDRATILKWLTVFFRRFFSQQ FKRSAMPDGPKVGSAALSPRGDWRMPSDASAAVWLKELETLKYLK >gi|313158484|gb|AENZ01000040.1| GENE 8 7607 - 8227 870 206 aa, chain + ## HITS:1 COG:no KEGG:BF2047 NR:ns ## KEGG: BF2047 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 199 1 199 199 95 32.0 2e-18 MKRNLLLAIATLLLAGSSATAQDWKEALKKAATTAADKATDGKLTQYALTGTWNYTAPGV KFEGNDLLSQLGGTVLQDNIKQQLDKGYQMAGIKPGAGTVTFAKDGKFTTLMGKYELSGT YEFDASTHVATLSFAKDKLDLGSVPGHAYIDGSDLLLVFPVTKLIDMVKAMGSKISSMET IVALLENYKNVYIGFEFSRQAGENAK >gi|313158484|gb|AENZ01000040.1| GENE 9 8466 - 9821 2089 451 aa, chain + ## HITS:1 COG:PA4588 KEGG:ns NR:ns ## COG: PA4588 COG0334 # Protein_GI_number: 15599784 # Func_class: E Amino acid transport and metabolism # Function: Glutamate dehydrogenase/leucine dehydrogenase # Organism: Pseudomonas aeruginosa # 7 451 3 445 445 608 63.0 1e-174 MNAQAHQYVEDFMAQLTLRNPNEPEFHQAVREVAESLAPHIVASPVLQKMKVLERIAEPE RVIIFRVPWLNDKGEIEINRGYRVQMNSAIGPYKGGIRFHPSVNLSILKFLAFEQTFKNS LTTLPMGGGKGGSDFNPKGRSDNEVMKFCQSFMTELQRHIGQDTDVPAGDIGVGGREIGF MFGQYKRLRDEFTGTLTGKGRDWGGSPLRPEATGYGTCYFAQEMLATRKESFEGKTVCIS GSGNVAQYACQKATQLGAKVVTLSDSSGYIHDPEGIDAAKLEYVMELKNIFRGRIKEYAD RYPSATYYPGQRPWGVKCDIAMPCATQNELNGDEAQALVDNGCICVAEGANMPSTPDAIR IFQQHKLLYAPGKAANAGGVATSGLEMTQNSMRITWSPEEVDAKLRSIMQNIHSVCVQYG TQADGYINYVNGANIGGFMKVANAMLAQGCV >gi|313158484|gb|AENZ01000040.1| GENE 10 9889 - 10494 1003 201 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2687 NR:ns ## KEGG: Odosp_2687 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 3 183 2 183 199 84 30.0 3e-15 MYRKLSNGISWALHPFLLPLYMIGVLLTLTVFAHYPSGVKIYLLWVVALYAIIIPLLALG VLRSLGRISDYRIDDRRERLLPLLVGAVCYVLCAITIAKIPSAIFLRKFMIAAACCEVMC LAVSLYWKISLHLTAMGAVVALLVVMNIAGVGDMMVPLMIAILCAGALASARLYLGCHNG QQVLAGFCGGFAVAALAVLFL >gi|313158484|gb|AENZ01000040.1| GENE 11 10644 - 12065 2423 473 aa, chain - ## HITS:1 COG:PA4462 KEGG:ns NR:ns ## COG: PA4462 COG1508 # Protein_GI_number: 15599658 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma54 homolog # Organism: Pseudomonas aeruginosa # 8 473 6 496 497 244 32.0 4e-64 MAINQKQVLSLQQKLSPQQIQMIKLLELPTVQLEQRIKQEIEDNIVLEEEEHASEEEEQP 
QQISVDEYLREDDTPSYKSRINNYSKDDKQRPVFLTEGRSLPEYLLEQLGFRNLSERDMR LAAYLVGSIDEDGYLRRDLESVADDIAFTLGIETSAEELERLLGVLHELEPAGIGARNLR ECLLLQMAQIPINSRPRRLARKILTNYFEEFVKKHYEKLMSRLQVSEEDFREAIAEIRRL SPKPGNLYAEGGTDTTPYIIPDFILDYQDGRFQLSLNSYNVPEVRVNRRYMDMIREMVGS DGTVREKDKEAIQFVKNKIDSAKWFISAIKQRHDTLMRTMQTILDYQQEYFKDGDKSKLR PMILKDIADRTGLDVSTISRVVNSKYVQTQFGIILLKSLFSEAMQTESGEEVSSYEIKNI LQECIDEEDKRHPLTDETLMDILNSKGYRIARRTVAKYREMLGIPVARLRKQI >gi|313158484|gb|AENZ01000040.1| GENE 12 12251 - 13648 2015 465 aa, chain - ## HITS:1 COG:alr3658 KEGG:ns NR:ns ## COG: alr3658 COG0017 # Protein_GI_number: 17231150 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl/asparaginyl-tRNA synthetases # Organism: Nostoc sp. PCC 7120 # 4 465 1 463 463 580 58.0 1e-165 MKDMKSQRIAEILKSGAEDTDIVVKGWVRTKRGNKNVAFIALNDGSCVANIQIVVDLAKF GEEQLKPITTGACIRVDGRLVASLGSGQRVEVQAEKIEIYGTADPETYPLQKKGHSLEFL RDIAYLRPRTNTFGAVLRIRHAMAYAIHEYFNREGFYYFHTPLITASDCEGAGAMFQVTT LDLNDVPKNEQGQVDFSQDFFGKACNLTVSGQLEGELGALSLGRIYTFGPTFRAENSNTP RHASEFWMIEPEAAFYELEDNMELAEDFLKYLIRYALDNCAEDLEFMNKMWDNGLLDRLH FVLENDFKRLDYTEGIEILKASGRKFEFPCDWGCDLQSEHERYLVEEHFKRPVILINYPK DIKAFYMKQNDDGKTVRAMDVLFPGIGEIIGGSEREADYGKLHARVAELGMNEKELWWYL DTRRWGSAPHSGFGLGFDRLLLFVTGMTNIRDVQPFPRTPKNADF >gi|313158484|gb|AENZ01000040.1| GENE 13 13755 - 15113 2250 452 aa, chain + ## HITS:1 COG:L0025 KEGG:ns NR:ns ## COG: L0025 COG3250 # Protein_GI_number: 15673962 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Lactococcus lactis # 71 445 99 458 996 117 28.0 6e-26 MKRTILLTAAILCAAGIRAQDLYNIKEPPATEILSRPVEAGRIHRTEVVPYDKRHDADAR NRAGVEAYIAYTPEAFAATDDAVAVGQVIDIPYVWTDGVVYLHLENVGTAYTLTVNDTEV AEVEDSSTPAEFALTPYIREGKNAVVLTLRRSAADALNAAPASRKAFENSYLYTQNKRSI RDFEIALVPDSTRKFGVLELAIVAQNAFNYDEPVTVGYDIYSPQGKLLDFNIREVVIPGR TTDTVRFSPFIYGTNANKWEPGAKNPPLYKVMLFTRRDGAYREYMPLKIGFGKTEFTDGR ITRFDKEIKLVKTRYNAAADRKITLAELKNLKSKGSNTVCPDYPQPAWFYELCDELGLYV IDRANINAPDRRDDRRVGGTPSNDPALADEYLERVKAMYYRSRNHTCVVAFALGGESGNG 
YNMYKAYEWLKSVEKSRPVIYTDADGEWNSDL >gi|313158484|gb|AENZ01000040.1| GENE 14 15255 - 15728 692 157 aa, chain + ## HITS:1 COG:BS_yydA KEGG:ns NR:ns ## COG: BS_yydA COG1576 # Protein_GI_number: 16081075 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Bacillus subtilis # 1 155 1 158 159 109 39.0 2e-24 MVIELIVIGKTDSKEVASLVEMYARRVNFYCKFAVTALPDIRNTKNLSVKQQRTAEGEAI LRQLSDGDYVVLLDERGDEMRSVEFAYWLQKRMNSGVKRLVLVIGGPYGFSEEVYKRSDA RLSLSRMTFSHQIVRAIFAEQIYRAFTILNNEPYHHE >gi|313158484|gb|AENZ01000040.1| GENE 15 15733 - 16320 852 195 aa, chain + ## HITS:1 COG:SMa1686 KEGG:ns NR:ns ## COG: SMa1686 COG4566 # Protein_GI_number: 16263379 # Func_class: T Signal transduction mechanisms # Function: Response regulator # Organism: Sinorhizobium meliloti # 122 193 136 207 213 60 48.0 2e-09 MERRPAIAVAAPNVLMGVGMKSILEKIIPAADVELFGDFESFTEADPERFFHFFVAAQLF VTHGSFFRARRSKTILLCSGQPQPAYADMHCIDVCTNEEAIVRDILRMHHGAHRSEHIVP GAVRPEAAPLSGREAEVLALIARGYMNKQIADSLQIGLTTVISHRRNIMEKLGIRSVAGL AIYALTAGYVDAGEL >gi|313158484|gb|AENZ01000040.1| GENE 16 16397 - 18760 3217 787 aa, chain + ## HITS:1 COG:YPO1011 KEGG:ns NR:ns ## COG: YPO1011 COG1629 # Protein_GI_number: 16121312 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor proteins, mostly Fe transport # Organism: Yersinia pestis # 53 787 33 690 690 174 26.0 8e-43 MHLRQLTIPLLTLLPLLAAAGGAQTPERHAAATAAVAGTAGEEPAAADTVIVVDKVQVTA IKQGMVLRSQPVAATIVGSRAIERERIGALKHLSQQVPNFFAPDYGSRMTSSIYVRGLGA RIDQPVMGLNIDNVPVLNKDNYDTELADAERIEVLRGPQSTLYGRNTMGGVINVYTLSPL SYEGVRLSAEYGSGDSYKFRASSYYKLTPDLGMAVTGYYTHTGGFFENLATGDKCDWERL GGGRWKTQWRNRTGLRIDNTLSFSVLEQGGYPYAYVGDDIIGDDGRPVVRRGEIRYNDPC SYRRTALSDGLTIRYDAGNFTVASITCYQYSDDEMILDQDFLPLSYFTLRQARTEHSVTE DLVFRSRGDKAYRWLLGAFGFYRHGTMNAPVHFKQTGIEELILKYANENDQSYRYSWGLK DGTGGDELFLGSDFRMPSAGGALYHESNYTLGRWRFTAGLRFDFEHARLRYRNYTDTWYT KTHIKNETAYERPIDIDDRSTLKQTFTELLPKFSVMYSFDETRNLYLTIAKGYKSGGFNT QIFSDVLQQKMMNRMGIGEVYDVQRGVAYKPEYSWNYEIGGHFSCMEGAVRGDFALFYID 
CRDQQLTVFPPGQTTGRMMTNAGRTRSLGAEAAVQISPWRTFDINLAYGYTDARFVRYET TVKNDDGDPVRISYKDNRIPYAPQLTLSAGAAWTVPTGVKWLGDLVFQAGVRCAGRIWWN EENTLSQPFYALTDASVRFEHARYSLSVWGRNLTDAGYDVFYFKSIGNEFVQRGRPRTFG ITLNINL >gi|313158484|gb|AENZ01000040.1| GENE 17 18776 - 19252 456 158 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158541|gb|EFR57935.1| ## NR: gi|313158541|gb|EFR57935.1| putative lipoprotein [Alistipes sp. HGB5] # 1 158 1 158 158 269 100.0 6e-71 MKKLLFLLLAAATVTACDNDDENFNPDSGSYTCTLKVTDTAGDVTNQTSGVICNLTEQPD KSVALTMKNVKFTDNPREPVKTIVFPKLSRHNDDGSTIFESTGAIIPEIDDAPYEDYVIS NFACEVTDNAEKFHLVFTCTNPTYHLNHTAEFNGTLIK >gi|313158484|gb|AENZ01000040.1| GENE 18 19257 - 20111 800 284 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163755345|ref|ZP_02162465.1| 30S ribosomal protein S6 [Kordia algicida OT-1] # 4 284 6 286 286 312 54 5e-84 MKPEYRPFVDELIELCIKEDIGDGDHTSLSCIPAGEHGRMRLLCKQEGIIAGIEIARIVF DRLDPDMHFEQVLHDGDRVKPGDVAFYVSGRLRSLLQAERIILNIMQRMSGVATQTAVYV ARLEGLHTKVLDTRKTTPGMRVLDKMAVKIGGGENHRMGLFDMILLKDNHIDFAGGIRKA ICGAREYLKAKGKDIPIECEVRSLEDIDEVFTAGGADRIMFDNFTPAMTREAVKKVAGRC ETESSGGITLDTIRDYAECGVDFISVGALTHQIKSLDMSLKACE >gi|313158484|gb|AENZ01000040.1| GENE 19 20148 - 22058 2839 636 aa, chain + ## HITS:1 COG:PA1689 KEGG:ns NR:ns ## COG: PA1689 COG1368 # Protein_GI_number: 15596886 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pseudomonas aeruginosa # 18 570 33 611 700 120 23.0 1e-26 MKRLLHTPTALLLRRIALLYAVLMLCRVIFYLYNAQLLGPLSWGELGQLLAGSLKFDTAS VVYADGVFVLLSLLPLHARERKWYRRLLFWYYAVVNAVLVVAVNMADCVYFRYTQKRFSA DEIFFADNGNSLQLVGKFMAENWYLVIAAAALTALLVWGYGRKIREESLLSRGWGYYAGG TAIFAVAAGLCIAGMRGGMTRMTRPITLSNATLYTADSGKANLILSNPFCILRTIGSAGG VSYRKFFPPEELPRRFTPVHRPADSTAVNLAGRNVVIFIMESMSAEHSAHLMPEIYADRK TKGFTPFLDSLMRSGLCFKRMYANGTRSIQAMPSVLGSIPSFKTPFVLMPQSLGESRQLP AILRDKGYATAFFCGSERGSMGFGAYARSAGVERLVSREDYEARHGKEDFDGYWGIWDEP FLQFMGEELSETPEPFFATLFTLSSHHPFVVPARYENTLPDGYTKIHKGVAYDDNAFRLF 
FERFGREEWFRRTVFVFVADHVSSEKFAPVTRTYPGNYHIIGFMYTPDGALRGEVGDVVQ QLDIMPTVLGLVGSEEPYFAFGRDVMNEPERPRWSVSYDGKFRALTGEGAVVLDDSGAQV QECPVTPAADSLMQSFRALIQQYYSHIERKSYTPDD >gi|313158484|gb|AENZ01000040.1| GENE 20 22051 - 23436 2061 461 aa, chain + ## HITS:1 COG:HP0292 KEGG:ns NR:ns ## COG: HP0292 COG4866 # Protein_GI_number: 15644920 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Helicobacter pylori 26695 # 4 295 2 287 290 153 31.0 7e-37 MIEFKPVRLEDKQIIERHTMPSGILNCDLAFANMYCWQAMYHSAWAVIDGFLVIRFHIGG GEKIGYMQPVGEGDFAGIIPALREDAHAHGQRLRLIGLTDEGREMIRNMHAGLFAFESDR ALEDYVYNAEDLRNLTGRRYQPKRNHINRFMSEYPDFWYENLTRDRFAECMQLEREWRRA HEGHTSELCAEQRAMQRAFDHFEELEMLGGCIYVGDKLVAFTFGSAVNEHTFDTHVEKAD TDYDGAFTIINKLFAEHLPERFTLINREEDLGIDGLRQSKLSYHPAVIQHKFTAIHLHPD EIACKELWTAAFGDDEQFIDSFLIRYYSRSRMLTAEYEGRTAAMLNLLPFESQLGRTTYI YGVATAPEFRRRGLAGKLMGEAMQLIADRDDDAALLIPSEEWLRGFYAAYGFSGAIPVTF ASPDGFDFGTGDPSKDLAMVWRRDASAPMPETLQCSYYKKK >gi|313158484|gb|AENZ01000040.1| GENE 21 23540 - 24118 321 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158610|gb|EFR58004.1| ## NR: gi|313158610|gb|EFR58004.1| hypothetical protein HMPREF9720_2820 [Alistipes sp. 
HGB5] # 1 192 1 192 192 308 100.0 2e-82 MVAWAPVGIEPDAEAAPNVFTLNPGNTTTAADFVYQENMEVNSGKCTDLALTLKHAFAKL TFTLNATNYPFSAPSLSSLVVKGTPGVSTIDISNGSYSTVTDADLTVVSTATDFTSPNAV TALVLPKTAASLTLDCTLDGTAYQNIAVKDIVKLDAGANYNITINITGTEISVSGVQVPD WTAGGNGSADLK >gi|313158484|gb|AENZ01000040.1| GENE 22 24316 - 24432 56 38 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKRFYLVAALAATAMAAAFFTSCDKQDEVTPPSLKAQH >gi|313158484|gb|AENZ01000040.1| GENE 23 24971 - 25825 1065 284 aa, chain - ## HITS:1 COG:SP0825 KEGG:ns NR:ns ## COG: SP0825 COG0190 # Protein_GI_number: 15900713 # Func_class: H Coenzyme transport and metabolism # Function: 5,10-methylene-tetrahydrofolate dehydrogenase/Methenyl tetrahydrofolate cyclohydrolase # Organism: Streptococcus pneumoniae TIGR4 # 2 284 4 285 285 317 60.0 2e-86 MIISGKELSAKLKAEMAAEVAAIGEKYGRVPHLVVILVGEDPGSVSYVTGKAKASAEVGI RNTTIRKPETISEAELLGIIAELNADAEVDGILVQLPLPKHIDEDKVIEAIDKAKDVDGF HPLNVAALWQKQPCTLPCTPKGIIKMLKAAGVGIAGKEAVVIGRSNIVGLPVSKLLLDEN ATVTMTHSRTRNLPEVTRRADILVVAIGRPKFVTADMVRDGAVVIDVGVNRDPETGKLCG DVDFAAVESKASVITPVPGGVGPMTICCLMENTIECFLRRMGRK >gi|313158484|gb|AENZ01000040.1| GENE 24 25835 - 27082 1606 415 aa, chain - ## HITS:1 COG:BH0634 KEGG:ns NR:ns ## COG: BH0634 COG0151 # Protein_GI_number: 15613197 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylamine-glycine ligase # Organism: Bacillus halodurans # 1 412 1 416 428 428 55.0 1e-119 MKVLVIGGGGREHAIVDALARSAQVEKIWCAPGNAGIAAQAECVAIKETEVEKLRDFAAA NEIGLTVVGPEVALAAGVVDAFKAAGLRIFGPTKAAARIESSKEFAKDLMAKYAIPTAGY RAFTEYAEALDYVASRPLPAVLKYDGLAAGKGVVIAETLEEADAALKDMLLDDKFGEGKV VVEDFLTGPEFSFMCFVSGRKVWPMALAQDHKRAFDGDKGPNTGGMGAYSPLPFVTAEDE RYALEKILQPTADAMVAEGCPFEGVLYGGLMKTPQGIKVIEFNARFGDPETEVVLPRLKS DIVDIFCAVADGADTQLEWHDFATLGVVLASKGYPGSYEKGHEIKGLDRVGSAVYHMGTK ADGGRILTAGGRVLFVVGKGATLAEARANALKDVAQIECDNLFYRTDIGHWAFEK >gi|313158484|gb|AENZ01000040.1| GENE 25 27254 - 28783 2234 509 aa, chain - ## HITS:1 COG:BS_purH KEGG:ns NR:ns ## COG: BS_purH COG0138 # Protein_GI_number: 16077720 # Func_class: F 
Nucleotide transport and metabolism # Function: AICAR transformylase/IMP cyclohydrolase PurH (only IMP cyclohydrolase domain in Aful) # Organism: Bacillus subtilis # 2 509 5 512 512 568 57.0 1e-162 MRALISVSDKTGVVDFARGLRALGWEVIATGGTMKLLADSGVEVINISDVTGFPEICDGR VKTLHPNVHGGLLARRDDPEHLKALKENNIEFIDMVCVNLYPFRETISKPDVKMEDAIEN IDIGGPSMLRSAAKNWADVTVVCDPADYAQILDEIRAGGNTEKATRLKLSAKAYTHTAEY DSMIATYMRAQAGLNEKLFLEFDLVQSLRYGENPHQSARFYREEKKVPYSLAFARQLNGK ELSYNNIQDANAALCIVREFDKPFCVGLKHMNPCGAAVGKDVVEAWTKAYEADKVSIFGG IVATNRTVTKEAAELMKPIFLEIIMAPKFDEGALEVLCTKKNLRLLEVDMEQGAVDPKQY VSVNGGLLVQDLDVETKSVTADMCVTAAKPAAEQMDDMNFGWHIVKHVKSNAIVVVKDGR TLGVGAGQMNRIGSAEIALKQAHAAGVTEGLVLASDGFFPFDDCVTLAAEYGVTAIVQPG GSVRDEDSVKKADEKGIAMVFTGERHFKH >gi|313158484|gb|AENZ01000040.1| GENE 26 28790 - 29209 553 139 aa, chain - ## HITS:1 COG:CAC3491 KEGG:ns NR:ns ## COG: CAC3491 COG3871 # Protein_GI_number: 15896728 # Func_class: R General function prediction only # Function: Uncharacterized stress protein (general stress protein 26) # Organism: Clostridium acetobutylicum # 11 139 13 138 145 116 43.0 1e-26 MSVAFNTLESLVDRQAVAFIGSVDAEGFPNMKAMLAPRVREGLKVFYFTTNTSSMRAAQY RRNPKAAVYFCDGASFEGLMLRGTMEVLEDAASRRLIWREGDTEYYPQGVDDPDYSVLRF TAVEGRYYSNFHSENIEIK >gi|313158484|gb|AENZ01000040.1| GENE 27 29206 - 29769 584 187 aa, chain - ## HITS:1 COG:BS_purN KEGG:ns NR:ns ## COG: BS_purN COG0299 # Protein_GI_number: 16077719 # Func_class: F Nucleotide transport and metabolism # Function: Folate-dependent phosphoribosylglycinamide formyltransferase PurN # Organism: Bacillus subtilis # 1 185 1 186 195 194 52.0 6e-50 MRRLAVFASGSGTNFEAIVSACEQGVTGGEVVLMVCDKPGARVVERAAAHGVETFVFAPK EYASKADYEREIVRLLDAAGVELVCLAGYMRIVGDVLLEAYGGRIVNIHPSLLPAFRGAH AIEQAMEYGVKVFGVTIHYVDASLDGGRIIAQRAFEYDGDDIEELEARIHAVEYPLYVET IKKLLDP >gi|313158484|gb|AENZ01000040.1| GENE 28 29774 - 30799 912 341 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|169632702|ref|YP_001706438.1| phosphoribosylaminoimidazole synthetase [Acinetobacter baumannii SDF] # 4 338 13 344 356 355 54 
5e-97 MAQSYEKAGVNLEAGYEVVRRIKKHVASTSRLGVMGNIGAFGGMFDLSALNVKEPVLVSG TDGVGTKLKLAFEMDKHDTIGVDAVAMCVNDVLAQGAEPLVFLDYVAVGHNEPKKIEAIV SGVAEGCRQAGCALVGGETAEMPGMYAQGEYDIAGFTVGVVEKSKLIDGSKVKAGDVLVG IASSGVHSNGFSLVRKIVADNCFDLHEVYPELSNKLLGEVLLTPTKIYVRQVLEVIRNCD VHGISHITGGGFDENIPRILHDGQGLEIEEGAWEILPVFRFLEKYGKVAHREMFNIFNMG IGMVIALDASEADKAIGILTAQGEKASVIGRVTDAEGVVIR >gi|313158484|gb|AENZ01000040.1| GENE 29 30823 - 32238 2005 471 aa, chain - ## HITS:1 COG:BS_purF KEGG:ns NR:ns ## COG: BS_purF COG0034 # Protein_GI_number: 16077717 # Func_class: F Nucleotide transport and metabolism # Function: Glutamine phosphoribosylpyrophosphate amidotransferase # Organism: Bacillus subtilis # 9 460 8 462 476 580 61.0 1e-165 MRQISDRELHEECGVFGVFGVPDAASLSYYGLHALQHRGQEGAGIVAVDEGGTFRRIKGS GLVTEVFDEAKLATLKGNTAIGHVRYTTAGGGGIENVQPFLFRHNTGDFALAHNGNIVNS ALLREYLENKGSLFQSTSDSEILAHLIKKETRYHDRPRIFSIIDALNMLEGAFAFLIMTA NRIYACRDKYGLRPLAIGRLGDGYVVSSETCAFDVLGAEFVRDVEPGEIVTIDRQGIRSR DYSMYKRCEMCSMEYIYFARPDSDIDGCNVHAYRKESGRLLFKESPADADIVVGVPDSSL SAAMGYAEASGLPYEMGLIKNKYIGRTFIQPSQELREKGVRMKLSAVRSIVKGKRVVLVD DSIVRGTTSRRIVTMLKEAGATEVHVRIASPQMTHPCFYGVDTSTRDELISARKDLEGVR EEICADSLAFLTPGALLKAGNRKELCMACFTGEYPTALYQSVDEANKDVKC >gi|313158484|gb|AENZ01000040.1| GENE 30 32244 - 32960 1413 238 aa, chain - ## HITS:1 COG:FN0988 KEGG:ns NR:ns ## COG: FN0988 COG0152 # Protein_GI_number: 19704323 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazolesuccinocarboxamide (SAICAR) synthase # Organism: Fusobacterium nucleatum # 1 235 1 235 237 303 63.0 2e-82 MKQLEMLYEGKAKQVFRTDDPEKIIIHYKDAATAFNNIKKATIENKGVLNNAISTLIFKE LQKAGIKTHYIETINDRDQICRKVTIIPLEVIVRNIIAGSMAQRLGIEEGTQPSNTIYDI CYKKDELGDPLINDHHAVALGAVTYDELDLIYAMTSRINEVLRNMFAKMNINLVDFKIEF GRTSDGEIVLADEVSPDTCRLWDMSTNEKLDKDRFRRDLGKVREAYEEILARLQKIVE >gi|313158484|gb|AENZ01000040.1| GENE 31 32998 - 33966 1268 322 aa, chain - ## HITS:1 COG:no KEGG:Phep_1385 NR:ns ## KEGG: Phep_1385 # Name: not_defined # Def: endonuclease/exonuclease/phosphatase # Organism: 
P.heparinus # Pathway: not_defined # 31 319 28 274 278 101 27.0 5e-20 MPMKTSFYRFAALLLFAAALTAGCGERTDDMRLLYWNIQNGMWSDQSNGYDNFVAWVKGY DPDVCVWCEAQSIYKSGTAERVDSVDRYLTANWGELAARYGHKYWYVGGHRDNFPQVVTS KYPIENVRRIVGQAPDSIVSHGAGWARIVKNGRPVNIVTLHTWPQAYAFQAVDRDASKAE HGGDRYRRMEIEYICSHTIATQSDAGQQLWMMMGDFNSRSRLDNRIYNYPDDDTRLLVHD YIAEHTPYVDVIAEKHPGEFHTTTHGQSRIDFVYCTPPLCERITRAEVIADDFTEPVRDP GKLSNFYHPSDHRPILVDFDMK >gi|313158484|gb|AENZ01000040.1| GENE 32 33983 - 37693 5243 1236 aa, chain - ## HITS:1 COG:FN0990_1 KEGG:ns NR:ns ## COG: FN0990_1 COG0046 # Protein_GI_number: 19704325 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylformylglycinamidine (FGAM) synthase, synthetase domain # Organism: Fusobacterium nucleatum # 1 950 5 964 983 933 50.0 0 MKNYRIFVEKHPRFRVEAESLRRELNANLNLDIRELRLLNVYDLFGFSEELLEKTRYSVF GEVVTDSVTDACDLAGQKYIAVEYLPGQFDQRAASAVDCVRLIDPSAEVRIRSSKLLLFD GAVTDEEIARIKRYYINAVESREKDLSVLSDMEQAEVKPVAVLEGFTKMTDAELAPYCAQ YGLAMNADDLREVVKYFRAEGRDPYETELRILDTYWSDHCRHTTFTTELEGITVEESFVK DEIEDSLALYLRIRRELGREHKSICLMDMATIGARYLRKKGLLDDMEVSDENNACSVYVD VDVDGRTEKWLLQFKNETHNHPTEIEPFGGASTCLGGAIRDPLSGRSYVYQAMRVTGAGD IYQKVGDTLEGKLPQNVISKKAAAGYSSYGNQIGLATTHVREVYHDGYVAKRLEVGAVVG AVKAENVRREKPAPGDKIIMLGGRTGRDGIGGATGSSKEHNTKSLETCGSEVQKGNAPEE RKLERLFRRPEVTRLIKKSNDFGAGGVSVAIGELADGLDIYLDRVKTKYSGLNSTELAIS ESQERMAVVVEAKDVETFMRYCLEENIEAVEVADVTDTARMRMFNKDRKVVDLSREFIDS AGARHYAEAKVGEVENRNPFVREIAGDTLAARFENNLRDNNVVSQRGLVEMFDSTIGAST VLMPFGGRTQGSETQVSVQKLPTDGYTDTASIMAFGYNPFIASWSPYHGAAYAVVEAAAK VVAAGARYDRMRYSYQEYFERMTKNPESWGKPLGALLGALKMQVELGLPSIGGKDSMSGT FQHINVPPMLMAFGITTVDAGTVISTDFKKAGSRIYLVRHTPAENYMPDTAQLKANFDFV CGQIESGKILSAWSVGFGGVAEGLAKMAFGNRIGAEVTTDESRLYEYAYGSILVESDGEL DFPAAELLGATVADEALTVNGVRMPLDGLYEANTVKFTTVYPDKGENRADVMSAEPECRT FAYEGEAVERPVAYLPVFPGTNCDYDTAKAFRNAGAEVTTSVLCNLGGDDILRSIAQMKE HIRRAHIFVLCGGFSSGDEPDGSAKFIVNVLNNKDIREEIHALLDRGGLILGICNGFQAL VKSGLLPYGRLGMVTKDSPTLFRNDVNRHISQIVSTRVSTLNSPWLAGFELGEVHSIAVS HGEGKFVVSEELAKELFANGQVAFQYVDAEGRVTAEAPYNPNGSSYAIEGIVSRDGRILG 
KMGHTERYGKNLFKNIAGNKEQALFRNAVEYFRKSK >gi|313158484|gb|AENZ01000040.1| GENE 33 37810 - 38904 1347 364 aa, chain - ## HITS:1 COG:lin1886 KEGG:ns NR:ns ## COG: lin1886 COG0026 # Protein_GI_number: 16800952 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) # Organism: Listeria innocua # 9 351 17 360 374 251 41.0 2e-66 MKTIGIIGGGQLGLMIIEQAHLLGARTVCLDPAADAPAFALSDERIVAAYDDPAALEELC RRSDAVTYEFENVPGSVLIPLEKRYNIPQGFRPLYDSQDRLREKDNARAGGLATPEYAAA DDEASLRAAVAKIGLPAVLKTRTLGYDGHGQLVLKTETDIEKALPMLAVPCILEQFVPFD FEASVVLVSDGSRVVTFPIGRNVHRDGILDLCFVPAGEMDDELRGRMAAAGERFMLTCGY RGILAIEFFVKGREFFFNEMAPRPHNSGHYTIEGCTTNQFRELVRFLLGEPLQEPRLVAP TVMKNILGQDLEAAEAIAAENRPGVHVRLYGKTESRPKRKMGHITFVGMTPAEYGAVWAD RFVK >gi|313158484|gb|AENZ01000040.1| GENE 34 38905 - 39372 738 155 aa, chain - ## HITS:1 COG:SSO1064 KEGG:ns NR:ns ## COG: SSO1064 COG0041 # Protein_GI_number: 15897932 # Func_class: F Nucleotide transport and metabolism # Function: Phosphoribosylcarboxyaminoimidazole (NCAIR) mutase # Organism: Sulfolobus solfataricus # 2 155 3 156 158 182 64.0 2e-46 MKVGVIMGSVSDYEVMADAVATLEQFGVDFEKRVVSAHRTPDLLCEYAKTAKARGIGVII AGAGGAAHLPGMVASMTTLPVVGVPVKSRALNGLDSLLSIVQMPAGVPVATTAINGSKNA ALIAVSILALQDAELAARLEAFRAKQTADVLKAEL >gi|313158484|gb|AENZ01000040.1| GENE 35 39686 - 42016 3072 776 aa, chain - ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 36 771 6 715 717 428 33.0 1e-119 MKSRNIFFAALCAAVLAGCSCPSAGQRSPQRPSDYVSTLVGSQSDFTLSTGNTYPAVALP WGMNFWTPQTGKMGDGWAYTYGAHRIRGFKQTHQPSPWINDYGQFALMPVRGNDKLDEES RASWYSHQAEVAKPYYYKVYLADHDIRAEIAPTERAAMMRFTFPESDESGVVIDAFDRGS QIGMLDARTIVGYTTRNSGGVPDNFRNYFVVRFDTPFTAVELTDAPGEYEPGSSLLYPDG SKSVTGGHAVAKVHFPTRRGQQVCASVASSFISPEQAVQNLRELGADDFETVKSKAQERW DEVLGRIEVEGGAEEQMRTFYSCLYRSVLFPRKFYEVTASGEIMHYSPYNGEVLPGYMYT 
DTGFWDTFRSLFPLLNLVYPSVNAEIQQGLANTARESGFLPEWASPGHRGCMVGNNSASV VADALLKGVTPEKEWQTLYEAMMHARTNVHPGVSSTGRLGHEYYNRLGYIPYDVKINENV ARTLEYAYDDWCIAQVARKLGKTEDAAALEEASRNYRNVFDKTTKLMRGRNENGEFQSPF SPYKWGDAFTEGNAWHYTWSVFHDVQGLAELMGGREGFAEMLDSVFVVPPIYDDSYYGTR IHEITEMQVADMGNYAHGNQPAQHMIYLYDYAGQPWKAQYWTREVMDRLYSSKPDGYCGD EDNGQTSAWYVFSAMGFYPVCPASDQYVIGSPLFRKATVKLENGKKIVIEAPDNAPENRY IGTMKLDGKRYDRNFIRWSDLVKGAKIDFAMQPDPAYKRGVGPQDAPYSLSREECR >gi|313158484|gb|AENZ01000040.1| GENE 36 42032 - 44248 3151 738 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 23 732 43 765 790 546 41.0 1e-155 MKPKHLLSLSVLLISAACSQPEQKPLTQYVDMYIGTGGHGHVFMGANVPFGAVQLGPTSI PQSWDWVSGYHISDTTVIGFSHTHLNGTGIGDLFDVTVMPVVGGVTYARGTEDDPQSGLW SYFSRKNEKARPGYYATRLDRYGVDVELTATKRVGLHKYTFPASQEAAVVFDLENGGCWD KATETAFVSDGLRLSGYRYSTGWAKDQRVYFTAEFSKPVKNISYPDASETVPEEGDVTVK GRYARVEFDTTDGEPLYMKVAISPVSIENAKLNMQAELPGWDFEATAAAADKAWNAELQK VRVETSDEAARRIFYTALYHTMVAPSEFCDVNGDYRGADGEIYRAAPFVNYTTFSLWDTY RAAQPLMTILHPEKMPDIANTMLHIYTQQGKLPVWHLAGNETDCMVGNPGIPAMADIVLK GYDGFDKELAYEALKKSAMLGERGMDLRMKYGYIPCDLFNEAVAYDMEYALADWAVAQVA AQRGDTADYDYFLDRSKSYRHFFDPETRFMRGLDSKGGFRTPFNPFASTHREDDYCEGNA WQYTWLVPHDVEGLIGCFGGKEAFVEKLDSLFTVSSVLEGAASPDISGLIGQYAHGNEPS HHVVYLYTMIGQPAKTADKVREILTTLYHDQPDGLSGNEDVGQMSAWYVLSSLGFYQVEP AGGRYFFGSPLFDKAELRVRDGVFTVIAHNNSAADKYIQRVKLNGKPYAKPYIGFDDIAA GGTLEFEMGPDPAVWYEL >gi|313158484|gb|AENZ01000040.1| GENE 37 44280 - 46475 3077 731 aa, chain - ## HITS:1 COG:no KEGG:BT_1035 NR:ns ## KEGG: BT_1035 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 731 1 734 734 930 57.0 0 MRLFRWIFVVSLLLVSSGVRAQTRLTRNGKPAARIVVAQQTAADLTAAQLLQRFVRESTG ATLPLLHDVTPRKGNILIGASDTAGLAEDGFRLRTQDGVLRISSGGDKGAVYGVMTLLER YLGLDYFAAGVYDLDRNPTVTLPAMDFAENPAFRYRQSQGYGMAQDSVYRLALRLEEPRD 
IFAGGLWVHTFNSLLPASVYGAEHPEYYSFINGERRPGRASQWCLTNDELFELVAAKVDS IFRANPGMNIISISQNDSNFTYCRCEECEKVNRHEGALSGNYVRFLNKLAERFPDKEFST LAYLFTMQPPKHVKPLPNVNIMLCDIDCDREVPLTDNASGREFIEAMEGWAALSDNIFVW DYGINFDNMVAPFPNFPILKPNIGLFRRNHATMHFSQIGGSYGGDFSEMRTYVVAKLMWN PDQDTDSLMLRFMRGYYGPAAPYIYQYEKLLEGALLAGGQRLWIYDSPVSHKDGMLNAAC RKRYGELFDCAERAVAGDSVLLRRVRLTRLPLMYSNLEIARTTADKDLGAVVSELDAFEK YVTQFGVKTLNERNNSPQEYCRLYRERYLPAENSNLALGARIVWIDAPAEKYRASGEKTL TDGLFGGASFVESWTGWEGKDGAFVVDLGRDVAFSTVETDFLHQLGQWILLPRSVRYSVS SDNADWTPFGQVEFPEDQSVPVKFVPAAVTAAQPVRARYVKVEIEGVKTCPPWHYGVGCP CWFFLDEVTVK >gi|313158484|gb|AENZ01000040.1| GENE 38 46476 - 48743 3085 755 aa, chain - ## HITS:1 COG:Rv0584 KEGG:ns NR:ns ## COG: Rv0584 COG3537 # Protein_GI_number: 15607724 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 19 751 35 749 877 332 33.0 2e-90 MLRRTLLTAALAAAVFCAPLRAQEPADWVDPFVGTTNFGTTNPGAVCPNGMMSVVPFNVM GSDENVYDKDARWWSTPYEFHNKFFTGFAHVTLSGVGCPELGSLLVMPTTGALDVDYRNY GSKYTGETASPGYYSNRLTKYGVLAEVTATPRSSAERYTFPAGESHILLNLGEGLTNESG AWVRRVSDTEVEGMKLLGTFCYNPQAVFPIYFVMRVSKRPSATGFWKKQPPKYGVEAEWD KDAGNYKLYTHYGRELAGDDVGVWFSYDTTEGEQLEVRMGVSFVSVENARLNLEAEQQER SFDDIRAAARRSWNDDLGRIRVEGGTDAQKKVFYTGLYHALIHPNVLSDVNGEYPAMESA EIRTAEGNRYTVFSLWDTYRNLHQLMTLVYPERQLEMVRSMIGMYKEWGWLPKWELYGRE TFTMEGDPAIPVIVDTWMKGLRDFDMDAAYEAMRKSATTPGAQNRMRPDIDPYVEKGYVP LGFYARDLSGDNSVSHALEYYIADHALSLLADSLGRREDAALFRNRSLGYKNYYSPESGT FRPITGEGGFLTPFDPRQGENFEPVPGFHEGSAWNYTFYVPHDVYGLSKLMGGRRKFIAK LQMVFDEGLYDPANEPDIAYPYLFSYFRGEEWRTQREVNRLLAKYFTTAPDGIPGNDDTG TMSAWAVFSMMGFYPDCPGEPYYTLTSPVFDRVTVTLDRKYYPAGELVIETKRPDSGAVY IRSMTLGGKPLKRYRIGHDELLRGGRLTFELQNRK >gi|313158484|gb|AENZ01000040.1| GENE 39 48746 - 49756 1565 336 aa, chain - ## HITS:1 COG:TM1225 KEGG:ns NR:ns ## COG: TM1225 COG2152 # Protein_GI_number: 15643981 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosylase # Organism: Thermotoga maritima # 5 335 1 326 326 383 56.0 1e-106 
MDNNLKIVGPAMPDMPWEDRPAGSKEVMWRYSANPIIPRDALSTSNSIFNSAVIPFKKGK YNYAGVFRCDDTNRRMRIHAAFSVDGIKWDICEEDFKLVGADPEIAEWVYGYDPRVAKIE DKYYVTWCNGYHGPTIGIAWTDDFETFHQLENAFIPFNRNGVLFPRKINGRFAMLSRPSD NGHTAFGDIFYSESPDMEFWGRHRHVMSPAAFEVSAWQCMKIGAGPVPIETSEGWLLLYH GVLRSCNGYVYAFGSALLDLDQPWKTIARSGPYLISPREIYELTGDVPNVTFPCASLHDP ETGRIAVYYGCADTVTGLAFGYIPEIVKFTKENNIL >gi|313158484|gb|AENZ01000040.1| GENE 40 49774 - 51087 1910 437 aa, chain - ## HITS:1 COG:YPO3162 KEGG:ns NR:ns ## COG: YPO3162 COG0477 # Protein_GI_number: 16123324 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Yersinia pestis # 19 411 22 398 492 80 23.0 4e-15 MSANSRKVSPLAWVPTVYFAMGLPFIIVNMVATLMFRGLGIDDARITLWTSLIILPWSLK PFWSPLMEMFRTKKFWVVATQLVSGLGLALVALSLPLPNFFPYAIALMAVVAFSGATHDI ATDGVYITELSKDLQAKFIGWQGAFYNIAKVFAMGGLVYLAGALKDHVGIVQAWMTVMGL CGGILFLLGLYHIRMLPSGGAATAHADSFGGAMRETKRIFLEFFKKKYIWIYFAFILFYR FAEGLVIKIVPLFLNAPLDQQGMGLTEQQIGLYYGTFGVIAFVVGSILGGYFISWLKLRR ALFPLVCIFNVPFVVYALLAWFQPSSPVLICAAIVFEYFSYGFGFVGLTLFIMQQVAPGP HQMAHYAFGSSLANLGVMLPGMISGWLCDSLGGYHYFFMWALLATVPAFLLAARIPFTHP DTEEVTAEEIDKELINE >gi|313158484|gb|AENZ01000040.1| GENE 41 51429 - 52514 1679 361 aa, chain - ## HITS:1 COG:CAC2654 KEGG:ns NR:ns ## COG: CAC2654 COG0540 # Protein_GI_number: 15895912 # Func_class: F Nucleotide transport and metabolism # Function: Aspartate carbamoyltransferase, catalytic chain # Organism: Clostridium acetobutylicum # 2 301 5 304 307 401 64.0 1e-111 MRHLIDPTDLTVAETEQIVALAEDIIANRAKYSEACKGRKLATLFYEPSTRTRLSFTSAM MELGGSVIGFSDANSSSVSKGETVADTVRVIRCFADIIAMRHFKEGAPLVASQYAGIPVI NAGDGSHSHPTQTLTDLLTIKREKGRFDNMTIGFCGDLKFGRTVHSLIKALSRYSGIKVI LIAPQELRLPDYMLAEMSENSKLEFREVETMEEVMPELDILYMTRVQKERFLDEEEFDRV KNSFVLDPGKLETAKEDMIILHPLPRVNEITRAVDNDPRAAYFRQVENGKFVRMALILTL LRWADENKPFEKTPVFSEDYVVNEMECSNRRCISATEDVDRLFHRLPDGSCRCAYCEAKA K >gi|313158484|gb|AENZ01000040.1| 
GENE 42 52666 - 53298 974 210 aa, chain - ## HITS:1 COG:BS_pyrE KEGG:ns NR:ns ## COG: BS_pyrE COG0461 # Protein_GI_number: 16078620 # Func_class: F Nucleotide transport and metabolism # Function: Orotate phosphoribosyltransferase # Organism: Bacillus subtilis # 1 210 7 216 216 241 56.0 6e-64 MEKSIAKDLLSIGAVFLRPEQPFTWASGIKSPIYCDNRLTLTAPVVRGHVEEGLAQIVRT RFPEAEVLMGTSTAGIAHAAITATILDLPMGYVRSGSKDHGRGNQIEGKLEKGQKVVVIE DLISTGGSCIEVVTALREAGAEVLGVASIFTYGMKKGLDRMKEANVVNYSLSNLDALVEV AAEEGYIKPEDKARLLKFRDNPSDESWMEK >gi|313158484|gb|AENZ01000040.1| GENE 43 53315 - 53575 474 86 aa, chain - ## HITS:1 COG:no KEGG:BVU_2290 NR:ns ## KEGG: BVU_2290 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 86 1 86 86 113 67.0 3e-24 MKPVKLFYLKNCPFCRKVLRYIEEAKAAHPELQPVAIEMIEESEQSDLADTFDYYYVPTF YVDGVKVHEGGIYAEEVEKILRSALE >gi|313158484|gb|AENZ01000040.1| GENE 44 53598 - 54305 991 235 aa, chain - ## HITS:1 COG:SP0701 KEGG:ns NR:ns ## COG: SP0701 COG0284 # Protein_GI_number: 15900600 # Func_class: F Nucleotide transport and metabolism # Function: Orotidine-5'-phosphate decarboxylase # Organism: Streptococcus pneumoniae TIGR4 # 1 233 1 228 233 244 56.0 8e-65 MMQHDVIIACDFKSAEDTFKFLDLFRDEERKPFLKIGMELFYAEGPAIVREIKRRGHRIF LDLKLHDIPNTVKKAMAVLSRLDVDMCNVHAAGTVEMMKYALEGLTREDGTRPLLIAVTQ LTSTSEERMRQELLINASINDTIVKYAQNTRAAGLDGVVCSPLEAGMVHEACGKEFLTVT PGVRFADGDVADQVRVTTPARAREIGSDFIVVGRPITAAADPVAAYRRCVSEFCD >gi|313158484|gb|AENZ01000040.1| GENE 45 54320 - 55279 1317 319 aa, chain - ## HITS:1 COG:BH2534 KEGG:ns NR:ns ## COG: BH2534 COG0167 # Protein_GI_number: 15615097 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotate dehydrogenase # Organism: Bacillus halodurans # 20 316 4 299 305 336 54.0 3e-92 MVSKQFIAQNRPAGKPAPDMSVTLSGLKLDNPVVPASGTFGYGNEFRDFYDINILGSFSF KGTTREPRFGNPTPRIAECEAGMINAVGLQNPGIDAVIAEELPRLKSFFHKPVIANISGF SIEEYAYCCERIDREEQVGIIEVNVSCPNVRHGGMSFGTSPECAAEVTRAVKTVTTKPVY IKLSPNVTDIVSIARACEEAGADGICLINTLLGMRIDVRCRKAVIANTMGGFSGAAVFPV 
AVRMVYQVAKACSVPVMGCGGVTTARDVIEMMMAGATAVQVGAANLVNPYASKEIVEALP AEMERLGIERLSDIIGIVE >gi|313158484|gb|AENZ01000040.1| GENE 46 55273 - 56013 1162 246 aa, chain - ## HITS:1 COG:FN0423 KEGG:ns NR:ns ## COG: FN0423 COG0543 # Protein_GI_number: 19703765 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Fusobacterium nucleatum # 12 244 11 256 259 196 44.0 3e-50 MYKKGIYKILANEPLTASVWRMVLEGDTEWIVRPGQFVNIALEGRYLRRPISVCDCDART LTLIYKVVGGGTEQMSRMAAGAELDLLTGLGNGFDTSNDARRPLLVGGGVGVPPLYKLAK ALLAEGKPVSVVLGFNTADEIFYADEFRALGCDVHVATADGSAGTKGFVTTAIAGDGIDF DYFYACGPLPMLRALCDSVAQDGQLSFEERMGCGFGACMGCSCKTKYGNKRICKEGPVLT KGEIIW >gi|313158484|gb|AENZ01000040.1| GENE 47 56137 - 59319 5081 1060 aa, chain - ## HITS:1 COG:BS_pyrAB KEGG:ns NR:ns ## COG: BS_pyrAB COG0458 # Protein_GI_number: 16078616 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase large subunit (split gene in MJ) # Organism: Bacillus subtilis # 1 1059 1 1060 1071 1278 59.0 0 MPKRRDIKKVMVIGSGPIVIGQAAEFDYAGTQACLALKEEGYEVILVNSNPATIMTDTHI ADKVYMEPLTLEYVAKIIRFERPDAIVPGLGGQTGLNLAVQLAKKGILKECQVEILGTSF ESIERAEDRELFKELCESLGEPVIESKIAMSVEEAVEVAAEIGYPVVLRPAYTLGGTGGG FADNEEQLREMMRHALFLSPVHQVLVEKSIKGYKEIEFEVMRDHNDTAISICCMENIDPV GVHTGDSIVVAPCQTLTNKEFQMLRDSALKIIRELKIEGGCNVQFALDPLSFKYYLIEVN PRVSRSSALASKASGYPIARVTAKVAVGLTLDEIMLANTPASFEPTLDYVVTKVARFPFD KFSDASNKLGTQMKATGEVMSIGRTMEESLLKAMRSLETGVCHIYHKKFEDWTNDRLLEY IAEPTDDRLYAIAQLIRGGVDLVLIYNRTKIDMFFLEKFKNIVEFERVVEAAPMDIEVLR EAKRMGFSDKFIGQLWGVSQHDMYLLREKNGIFPVYKMIDTCASEFSSYVPYFYSTYEDE NESQVSDKEKIVVLGSGPIRIGQGVEFDYSTVHAIWSIREAGYEAIIINNNPETVSTDYT TSDKLYFEPLMVEDVMNVIHLEKPKAIVVSLGGQTAINLAEPLSRLGVPIIGTDCEAIRN AEDRGCFEKIMEELRIPQPEAEAVTDIESGVKAAARIGYPVLVRPSYVLGGRAMQIVSNE ERLRHYLQTAVEVNDDSPVLVDRYIMGKELEVDAICDGKDVFIPGIMEHVERTGIHSGDS ISVYPTFSVSQKAKDKIIDYTVKLGLRIGIIGLYNIQFILDGNDEVYVIEVNPRSSRTVP 
FLSKSTGVPMAHIATQVILGKTLKELGVTEVYGKEKKRWYVKAPAFSFAKIRGMDSYLSP EMKSTGEAIGYDDKLTRALYKALQATGMNVCNYGTIFVTIADHDKEQALPLVRRFYDLGF NVEATTGTAEFLREHGIRTRTRRKLSEGSSEIIDALRQGHVSYVINTIDINQHNTRLDGY EIRRTAVENNVTIFTALETVQVLLDVLEEITFGVSTIDAK >gi|313158484|gb|AENZ01000040.1| GENE 48 59312 - 60403 1622 363 aa, chain - ## HITS:1 COG:FN0421 KEGG:ns NR:ns ## COG: FN0421 COG0505 # Protein_GI_number: 19703763 # Func_class: E Amino acid transport and metabolism; F Nucleotide transport and metabolism # Function: Carbamoylphosphate synthase small subunit # Organism: Fusobacterium nucleatum # 4 355 2 351 358 380 50.0 1e-105 MKPFTKKIVLENGREFYGYGFGADREVINEIVFNTSMVGYQEIMSDPSYTDQMVVMTYPL IGNYGMADEDYETKTPTIGGMIVREYNDSPSNFRYTKTLNEVFEEHDIPAIWGVDTRTLT RIIRNEGTQKVIITGADTPLEEALRKVREYALPHDMVSRVSCKKRWMSRVPNHKYDVVAV DCGIKYNIIRQLNRVGCNVTVVPFDSTLEEIMAFRPDGVVLSNGPGNPQDVTPVIELVRQ LRGKLPIFGICMGHQLISLAYGAKTVKMKFGHRGGNHPVKNLATGKIEITSQNHSYAVDV DSLAGTGLELTHVNLLDGTAEGVECVGDAVFSMQYHPESASGPQDSGYLFRKFTKIMEER KNA >gi|313158484|gb|AENZ01000040.1| GENE 49 60406 - 61680 1420 424 aa, chain - ## HITS:1 COG:FN0420 KEGG:ns NR:ns ## COG: FN0420 COG0044 # Protein_GI_number: 19703762 # Func_class: F Nucleotide transport and metabolism # Function: Dihydroorotase and related cyclic amidohydrolases # Organism: Fusobacterium nucleatum # 4 418 2 420 425 411 49.0 1e-114 MKTLYTNAAIYTGGRFAAGEFAVEGGRIVPVAGAAPDRTVDLGGRHVIPGLVDVHVHLRE PGFSQKETIASGTAAAARGGYTTVCSMPNLNPAPDAPDTLRAQTEIIRRDAVVRVVPYGC ITMGQRGAGELVDFAALAPDVVGFSDDGRGVQSDELMEEAMRRVAKAGRPVVAHCEVDDL LRGGYIHDGEYCRAHGHKGICSESEWKQVERDIALAEKTGCQYHVCHVSTKESVELVRRA KTRGVRVSCETAPHYLLLCDEDLQEDGRFKMNPPLRSREDRAALIAGVADGTIEVIATDH APHTAEEKSRGLAGSAMGIVGLECAFPLMYKYMVLPGTLTLEKLVALMSDNPRRIFGLGG GLNVGGEADFTVLDLGAQYEIDPAAFLSKGRATPFAGWPVQGRAVLTIVGGREAYMDETM KTNK >gi|313158484|gb|AENZ01000040.1| GENE 50 62110 - 62538 292 142 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRQFYALLCLLLFSGACSEDDTPNPAVKFSSPDSDVKISQDGTSAAITATHHAGQFVLTM EKNFEAVPESDRSWCTAVLSGDRLTVEIEENAEELRNAAISIMNGESVIGKITVEQGIAP 
TLSLESNTAEFTNEGGELTPSR >gi|313158484|gb|AENZ01000040.1| GENE 51 62661 - 63878 1338 405 aa, chain + ## HITS:1 COG:no KEGG:EcE24377A_1450 NR:ns ## KEGG: EcE24377A_1450 # Name: not_defined # Def: hypothetical protein # Organism: E.coli_E24377A # Pathway: not_defined # 115 392 410 672 848 150 34.0 9e-35 MTVTTGCKDNPAEVSAAINVTQGPPSLILEYTVPAGGKIILPLSGAIDCTVDYGDGYSEK LALTLNPATGSLINYEYAEAGVYEVSVSGSVEQLYSLQGHSETSRSYLTAVKQWGNVNLT SMYYAFYLCSNLKTLPENTTDSFAEVTTFKYAFEGCSGLQTIPASLFSGCDKVTDVLGCF TKCASLTSVPENLLAPLKNVTSLQSFLAHCKQLKTIPAGFFARSPQITTLKYTFSGNTAF ETLPAGLFKGLANATNFEETFYGCTALKEIPDEFFAGCTSADIFRSCFFGNKALTKVGRN VFKGCTNVTSYKWLLANCTELVSVPADMFDDSRKVTDFSGTFRDAAKLAVESPYTTIDGV KVHIYERSLHPDAFTAPKSFGTCFRGCTALTDWDAIGSGYAAWTK >gi|313158484|gb|AENZ01000040.1| GENE 52 63981 - 64754 1160 257 aa, chain - ## HITS:1 COG:STM3262 KEGG:ns NR:ns ## COG: STM3262 COG1349 # Protein_GI_number: 16766560 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulators of sugar metabolism # Organism: Salmonella typhimurium LT2 # 6 256 5 257 257 143 36.0 3e-34 MLSIAERHKYILDNLNKYGFVRITDVANELGVTKVTIRKDVKILESKGLLYKVHGSARPA NPHVADMDVHVKDNINRDSKRLIAQRALELLEDNDSIIVASGSTIYAFAEEIKARQWHHL NIVTPFLRLGVLLNEAENVNVVQLGGTVHKKSLSVLGEEAARSLDDCICSKLFFGVDGID LEHGITTSTIDEAKLTRKMMKASSQNIVLADSSKFGQRGFGRICALEDIDVIITDDRISE QMVAIVEEAGVDLIIVK >gi|313158484|gb|AENZ01000040.1| GENE 53 64899 - 66353 1904 484 aa, chain - ## HITS:1 COG:CPn0665 KEGG:ns NR:ns ## COG: CPn0665 COG2271 # Protein_GI_number: 15618575 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Chlamydophila pneumoniae CWL029 # 7 472 25 446 455 261 34.0 2e-69 MTPEQTKRFKYWQTRTIVATMIGYALFYFVRKNFSLAMPGLEADLGISKTSLGIFLTLNG VVYGLSRFVNGILADRMNARWYMAVGLALCALANFAFGFGEDVSYWITGQHDGSQFTNTM ILFMGIMWVINGLLQGTGFPPCARLLTHWIPPTQLATKMSVWNTSHSIGAGLVVILCGYI MGTLGMNVSADPDAVAAMAANLGVQPGDAAGMECVMAAAAHAGAWKWCFWIPSAISFAGA VGLVVFLRDTPSSVGLPELEGTEVRKEKKAAKGAEHRAFLMKHVFKNPLIWILGFANFFV 
YVVRFSVLDWGPSLLSQSKGVSMEHAGWLVAMFEIAGIVGMLVAGWATDRWLKGRAHRTC VFCMAGAAIFVFLFWQLPGDAPVWLLFTTLCAAGFCIYGPQALIGIAAANQATKNAAATA NGLTGLFGYASTVVSGVGLGYVAQHYGWNWAYVGILGMAVVGMLVFLLMWGARADGYDAE PEHD >gi|313158484|gb|AENZ01000040.1| GENE 54 66713 - 67948 1648 411 aa, chain + ## HITS:1 COG:TM0967 KEGG:ns NR:ns ## COG: TM0967 COG0582 # Protein_GI_number: 15643727 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Thermotoga maritima # 228 387 100 245 253 59 27.0 1e-08 MKSTFSVIYYLKRQVVKKDGTVPVMGRITVDGSQTQFSCKLTVDPKLWDTKGGRVTGRST AALETNRMLDKMRVRINRHYQEIMERDNFVTAEKVKNAFLGLEHRYHTLMQVFRQHNEDY EKQVEAGMKAKGTLLKYRTVYKHMQEFLDIRYHVKDIALKELTPAFISDFEMFLRTDKHC CTNTVWLYVCPLRTMVFIAINNEWLTRDPFREYEIKKEETTRSFLTKDEIRLLMEGKLKN AKQELYRDLYLFCAFTGLSFADMRNLTEENIRTYFDEHEWININRQKTGVVSNIRLLDIA NRIIGKYRGLCGDGRIFPVPHYNTCLAGIRAVAKRCGITKHITWHQSRHTAATTIFLSNG VPIETVSSMLGHKSIKTTQIYAKITKEKLNQDMENLAARLNGVEEFAGCTI >gi|313158484|gb|AENZ01000040.1| GENE 55 67960 - 68322 413 120 aa, chain + ## HITS:1 COG:no KEGG:BF0151 NR:ns ## KEGG: BF0151 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 120 1 120 120 241 100.0 7e-63 MKRDTIIIEDKAVSVTGNDVWMTATEIAGLFHTTVPAVNAAIRAVRKSDVLNDYEVCRYM QLENGLHADVYALEIIIPVAFRVNTYNTHLFRTWLVGKALSQEKRQTYVMFIQNGKAGYC >gi|313158484|gb|AENZ01000040.1| GENE 56 68400 - 68702 341 100 aa, chain - ## HITS:1 COG:no KEGG:BF0150 NR:ns ## KEGG: BF0150 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 100 2 101 101 188 100.0 6e-47 MNENNDVFTMEDEPIASVVQDMRKGSKWLSAFLESYRPPLDGERYLTDGEVSELLRVSRR TLQEYRNNRVLPFILLGGKVLYPETGLRGVLEANYRKPLE >gi|313158484|gb|AENZ01000040.1| GENE 57 68737 - 69036 289 99 aa, chain - ## HITS:1 COG:no KEGG:BF0149 NR:ns ## KEGG: BF0149 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 99 15 113 113 185 100.0 4e-46 MNMEIVSIEKKTFEMMVAAFGALSEKVAALRRKSDTGRMERWLTGEEVCGQLRISPRTLQ TLRDRRLIGYSQINRRFYYKPEEVKRLIPLVGTLYPHGR 
>gi|313158484|gb|AENZ01000040.1| GENE 58 69238 - 69600 407 120 aa, chain + ## HITS:1 COG:no KEGG:BF0147 NR:ns ## KEGG: BF0147 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 120 22 141 141 196 100.0 3e-49 MKVITMESSAYKEMMAQIANIAGYIREARDEKKRKRETEDKLLDTAQAAKMLNVSKRTMQ RMRTDHRIEYVVVRGSCRYRLSEILRLLEDNTVRNEEGTIDTLFHNHTLRTGGKPKGRRT >gi|313158484|gb|AENZ01000040.1| GENE 59 69604 - 69954 487 116 aa, chain + ## HITS:1 COG:no KEGG:BF0146 NR:ns ## KEGG: BF0146 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 116 1 116 116 213 100.0 2e-54 MELLTRNNFEGWMQKLMERLDRQDELLLAMKAEGKQPTITESIRLFDNQDLCMLLQISKR TLQRYRSVGALPYKTLGKKTYYSEEDVLTFLSNHIKDFKKEDIAFYKARIHNFFHK >gi|313158484|gb|AENZ01000040.1| GENE 60 69975 - 71546 1917 523 aa, chain + ## HITS:1 COG:no KEGG:BF0145 NR:ns ## KEGG: BF0145 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 523 1 523 523 854 100.0 0 MAKKKDEKDVLVVRDEKTGEISVVAGLNADGTPKRTPAKAENAQSFLQFDRHGDVLDNFF KNFFRQCKEPSRFGFYRIAADQAENLLEVMKQLLKDPEANKELLAPHKVDTSDYEKKVQE EMAAQQTEKQEPQKQENMEQRKEQQQDKSEQMQGKRGYQPIDESKINWQELEDRWGVKRD NLEKSGDLTKMLNYGKSDLVKVKPTFGGESFELDARLSFKKDGEGNISLVPHFIRKEQKL DEYKEHKFSDNDRKNLRETGNLGRVVDIVDRETGEIIPSYISIDRKTNEITDIPASRVRI PERIGKTEITTQERDMLRAGLPVRDKLIERNDGRKFVTTLQVNVEQRGVEFVPGTGKSPR TAQTQETKGDTSKSQAQGGENAAQTKKEQRRNTWTNEDGSIRPISKWSGVSFTDQQKADY VAGKAVKLENVTDKQGFHATMYIKFNPEKGRPYRYDTNPDNAQQVAPSNESRTQVAVNND GKTNEATKNLREPLQKGQTNPKDARQQQQQEKPQKKTGKGMKM >gi|313158484|gb|AENZ01000040.1| GENE 61 71607 - 73694 2175 695 aa, chain + ## HITS:1 COG:CAC3567 KEGG:ns NR:ns ## COG: CAC3567 COG0550 # Protein_GI_number: 15896801 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Clostridium acetobutylicum # 4 625 6 647 709 404 38.0 1e-112 MKTIIAEKPSVAREIARIVGATKREEGYFEGGGYAVTWAFGHLVQLAMPDGYGVRGFVRD NLPIIPDTFTLVPRQVRTEKGYKPDSGVVSQIKVIKRLFDTSEHIIVATDAGREGELIFR 
YLYHYTGCTTPFVRLWISSLTDKAIREGLRKLEDGSKYDNLYLAAKARSESDWLVGINGT QALSIAAGHGTYSVGRVQTPTLAMVCERYWENRRFTSEAFWQLHIATDGCDGEVVKFSSS EKWKEKEPAMELYNKVKAAGCATVTKAERKEKTEETPLLYDLTTLQKEANAKHGFTAEQT LEIAQKLYEKKLITYPRTGSRYIPEDVFAEIPKLLAFIGTQPEWKDKVRAKAAPTRRSVD DGKVTDHHALLVTGEKPLFLSKEDNTIYQMIAGRMVEAFSEKCVKDVTTVTAECAGVEFT VKGSVVKQTGWRAVYGEEKEEITIPGWQEGDTLTPKGSSITEGKTKPKPLHTEATLLSAM ETAGKEIEDDALRQAMKDCGIGTPATRASIIETLFKRGYMERCKKSLVPTEKGLALNSVV KTMRIADVAMTGEWEKELARIERGELSDDTFRKEIEAYTREITSELISCDKLFGSRDSGC ACPKCGTGRMRFYGKVVRCDNTECGLPVFRLKAGRTLSDDEIKDLLTEGHTKLLKGFKSK QGKSFDAVVAFDGEYNTTFVFPEAKKDKKFSGRKK >gi|313158484|gb|AENZ01000040.1| GENE 62 73857 - 74309 514 150 aa, chain + ## HITS:1 COG:no KEGG:BF0143 NR:ns ## KEGG: BF0143 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 150 46 195 195 253 100.0 2e-66 MNNKKKNEGQTDFSYYGLYLLDYLRTNKFEQADDTAFIRERADRAAETYERARLEGYPAD GAQELAMDTLLRGLHYSRYAILREVVENEFADEVPEEKREAFVLKLLPLVGNVFSVYDLS DDNFALSSDYDLLYTELTGATVLYLDEYGV >gi|313158484|gb|AENZ01000040.1| GENE 63 74299 - 80115 4662 1938 aa, chain + ## HITS:1 COG:AGpT188_2 KEGG:ns NR:ns ## COG: AGpT188_2 COG4646 # Protein_GI_number: 16119916 # Func_class: K Transcription; L Replication, recombination and repair # Function: DNA methylase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 614 1684 65 1140 1315 354 27.0 1e-96 MAFNRKQKLRDNIEAIRTAFILDRENRTATTEERAILQRYCGFGGLKCILNPAKELTDAV RWAKSDLELFAPTVELHRLIRENSKDETEYKRFVDSLKASVLTAFYTPKEITDTIADVLA DYSVRPARMLEPSAGVGVFVDSMLRHSPNADVMAFEKDLLTGTILRHLYPDQKMRTCGFE KIERPFNNYFDLAVSNIPFGDIAVFDAEFQRSDSFGRRSAQKTIHNYFFLKGLDAVRDGG IVAFITSQGVLNSTKTSVRNELFSQANLVSAIRLPNNLFTDNAGTEVGSDLIVLQKNLSK KEMSQDERLMTVIQTDTKTALTDNAYFIHHPERIVHTMAKLDTDPYGKPAMVYLHEGKAA GIAGDLRRMLDEDFHYRLAMRLYSGSIRQAGTEEKVAVQNKVERPAIKLETVSSAQTVET PTEKPQPADEKPEIEPRPQYSAGVQLTLLDLWGMTEEVSQPKTSKKKKTVKKAVTAKSTP PKPKVTVTPTAPTAKPAMENKEVKAENTAKPADPDDIYATLDWDTNPPINGFYEMMMGLT PERRKELRELARQHNEKQVAEKTEVKAVPETSREQPRQEETQPEAVAAPAVTDTPSEAVG 
TFLFPDIEAEKPKEEVVDLSPRAYHRTPEMHLREGSLVADRGRHNIGYLKDITPYGATFQ PLDLKGYQKEKALLYVSLRDAYERLYRYESLRREANVPWREHLNTCYDEFVMRYGNLNAK QNVKLVMMDAGGRDILSLERMENGKFVKADIFEHPVSFAVESHANVGSPEEALSASLNKY GTVNLDYMREITDSTAEDLLTALQGRIYYNPLVTGYEIKDRFIAGNVIEKAERIEAWMGD NPENERMPEVKQALEALKDAEPQRIAFEDLDFNFGERWIPTGVYAAYMSRLFDTEVKIAY SASMDEFSVVCGYRTMKITDEFLVKGYYRNYDGMHLLKHALHNTCPDMMKSIGKDEHGND IKMRDSEGIQLANAKIDEIRNGFSEWLEEQSPQFKERLVTMYNRKFNCFVRPRYDGSHQT FPDLNLKGLASRGIKSVYPSQMDCVWMLKQNGGGICDHEVGTGKTLIMCIAAHEMKRLNL AHKPMIIGLKANVAEIAATYQAAYPNARILYASEKDFSTANRVRFFNNIKNNDYDCVIMS HDQFGKIPQSPELQQRILQAELDTVEENLEVLRQQGKNVSRAMLKGLEKRKHNLEAKLEK VEHAIKSRTDDVVDFKQMGIDHIFIDESHQFKNLTFNTRHDRVAGLGNSEGSQKALNMLF AIRTIQERTGKDLGATFLSGTTISNSLTELYLLFKYLRPKELERQDIRCFDAWAAIFAKK TTDFEFNVTNNVVQKERFRYFIKVPELAAFYNEITDYRTAEDVGVDRPAKNEILHHIPPT PEQEDFIQKLMQFAKTGDATLLGRLPLSETEEKAKMLIATDYARKMALDMRMIDPNYEDH PDNKASHCAKMIAEYYQKYDAQKGTQFVFSDLGTYQPGDGWNVYSEIKRKLTEDYGIPPS EVRFIQECKTDKARKAVIDAMNAGTVRVLFGSTSMLGTGVNAQKRCVAIHHLDTPWRPSD LQQRDGRGVRAGNEIAKHFAGNNVDVIIYAVEKSLDSYKFNLLHCKQTFISQLKSGAMGA RTIDEGAMDEKSGMNFSEYMALLSGNTDLLDKAKLEKRIASLEGERKSFNKGKRDSEFKL ESKTGELRNNTAFIDAMTEDWNRFLSVVQTDKEGNRLNIIKVDGVDSADEKVIGKRLQEI AKNATTGGLYTQVGELYGFPIKVVSERILKEGLEFTDNRFVVEGNYKYTYNNGHLAMADP LAAARNFLNAMERIPSIIDQYKAKNEVLEMEIPQLQEIAGKVWKKEDELKQLKSELAALD RKIQLELAPPTPEVAEKENEGQQLKPEAEDVRNRQAQYPENAPPQIRSPADSIVANHVII GRPGLYAKEETRSKGLKI >gi|313158484|gb|AENZ01000040.1| GENE 64 80773 - 82698 424 641 aa, chain + ## HITS:1 COG:CAC1448 KEGG:ns NR:ns ## COG: CAC1448 COG0480 # Protein_GI_number: 15894727 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Clostridium acetobutylicum # 3 612 4 619 652 473 40.0 1e-133 MNIINLGILAHIDAGKTSVTENLLFASGATEKCGRVDNGDTITDSMDIEKRRGITVRAST TSIIWNGVKCNIIDTPGHMDFIAEVERTFKMLDGAVLILSAKEGIQAQTKLLFSTLQKLQ IPTIIFINKIDRAGVNLERLYMDIKTNLSQDVLFMQTVVDGSVYPVCSQTYIKEEYKEFV CNHDDDILERYLADSEISPADYWNTIIALVAKAKVYPVLHGSAMFNIGINELLDAISSFI 
LPPASVSNRLSAYLYKIEHDPKGHKRSFLKIIDGSLRLRDVVRINDSEKFIKIKNLKTIY QGREINVDEVGANDIAIVEDIEDFRIGDYLGAKPCLIQGLSHQHPALKSSVRPNKPEERS KVISALNTLWIEDPSLSFSINSYSDELEISLYGLTQKEIIQTLLEERFSVKVHFDEIKTI YKERPIKKVNKIIQIEVPPNPYWATIGLTLEPLPLGAGLQIESDISYGYLNHSFQNAVFE GIRMSCQSGLHGWEVTDLKVTFTQAEYYSPVSTPADFRQLTPYVFRLALQQSGVDILEPM LCFELQIPQVASSKAITDLQKLMSEIEDISCNNEWCHIKGKVPLNTSKDYASEVSSYTKG LGIFMVKPCGYQITKDGYSDNIRMNEKDKLLFMFQKSMSLK >gi|313158484|gb|AENZ01000040.1| GENE 65 82875 - 85016 449 713 aa, chain + ## HITS:1 COG:CC3623_1 KEGG:ns NR:ns ## COG: CC3623_1 COG0642 # Protein_GI_number: 16127853 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Caulobacter vibrioides # 198 461 176 445 460 125 33.0 4e-28 MIKFSLLGETILEWNDKDIEHYHARRMAMDSMLCRFKATYPAERIDSVRSLLEDKERQMF QIVRLMDEQQSINKKIANQIPVIVQKSVQEQSKKPKRKGFLGIFGKKKEVTPAVSTTILH SVNRNVISEQKVQDRQLSEQADSLAARNAELNRQLQELICQIEEKVQTELQSRENEIVAM REKSFMQVGGLMGFVLLLLLISYIIIHRDAKSIKQYKHKTTDLIRQLEQSVQRNEALITS RKKAVHTITHELRTPLTAITGYAGLIRKEQCEDKSGQYIQNILQSSDRMRDMLNTLLDFF RLDNGKEQPRLSPCRISAITHTLETEFMPVAVNKGLSLSVKTGHDAIVLTDKERIIQIGN NLLSNAVKFTEEGGVSLITEYDNGVLTLVVEDTGTGMTEEEQKQAFGAFERLSNAAAKEG FGLGLAIMRNIVSMLGGTIRLDSKKGKGSRFTVEISMQEAEEQLGYTSNTPVYHNNKFHD VVAIDNDEVLLLMLKEMYSQEGIHCDTCTDAAALMEMIRQKEYSLLLTDLNMPDINGFEL LELLRSSNVGNSPTIPVVVATASGSCNKGELLAKGFAGCLFKPFSISELMEVSDRCAIKA TPDGKPDFSALLSYGNEAVMLEKLITETEKEMQAVRDAAKEKDLQKLDSLIHHLRSSWEV LRADQPLNVLYGLLRGDALPDGEALSHAVTAVLDKGVEIIRLAEEERRKYEDE >gi|313158484|gb|AENZ01000040.1| GENE 66 85030 - 86331 751 433 aa, chain + ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms # Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 1 421 10 439 441 262 38.0 1e-69 MVEDNIVYCEYVCNMLSREGYRNMKAYHLSTAKKHLQQATDNDIVVADLRLPDGSGIDLL CWMRKEGKMQPFIIMTDYAEVNTAVESMKLGSIDYIPKQLVEDKLVPLIRSILKERQAGQ 
RRMPIFAREGSAFQKIMHRIRLVAATDMSVMIFGENGTGKEHIAHLLHDKSKRAGKPFVA VDCGSLSKELAPSAFFGHVKGAFTGADNAKKGYFHEAEGGTLFLDEVGNLALETQQMLLR AIQERRYRPVGDKADRNFNVRIIAATNEDLEVSVNEKRFRQDLLYRLHDFGITVPPLRDC QEDIMPLAEFFRDMANRELECSVSGFSSEARKALLTHAWPGNVRELRQKVMGAVLQAQEG VVMKEHLELAVTKPTSTVSFALRNDAEDKERILRALKQANGNRSVAAELLGIGRTTLYSK LEEYGLKYKFKQS >gi|313158484|gb|AENZ01000040.1| GENE 67 86611 - 87033 71 140 aa, chain + ## HITS:1 COG:no KEGG:BF0137 NR:ns ## KEGG: BF0137 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 140 1 140 140 295 100.0 6e-79 MGKVQILAVLTMDGCLSSELYDKAHQDLCLDRCGLDEIRKKAFYRVTPDYSISMLHEWRK DCTNIRYLAEATPDTADYINGLLRMHAVDEIILYTVPFISGSGRHFFKSALPEQHWTLSS LKSYPNGVCRIIYILDKKAR >gi|313158484|gb|AENZ01000040.1| GENE 68 87272 - 87877 670 201 aa, chain + ## HITS:1 COG:no KEGG:BF0136 NR:ns ## KEGG: BF0136 # Name: not_defined # Def: tetracycline resistance element mobilization regulatory protein RteC # Organism: B.fragilis # Pathway: not_defined # 1 201 1 201 201 402 100.0 1e-111 MNYFLLAETDFFRLINEAGDCNMETAYTAFATQVIELCNGGMDMNLTVIALAYIEIELQH HPVRNLSEEKREIAAYVSKALSFVRKMQKFLATPQVPPLISANNATETTASLLQWTGNAI DLVELIYGIDVMGCINNGNMPLKQLAPLLYKIFGVDSKDCYRFYTDIKRRKNESRTYFID RMQEKLNERMLRDEELERMRK >gi|313158484|gb|AENZ01000040.1| GENE 69 88323 - 89309 635 328 aa, chain + ## HITS:1 COG:RC1031 KEGG:ns NR:ns ## COG: RC1031 COG1373 # Protein_GI_number: 15892954 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Rickettsia conorii # 30 282 75 314 380 104 28.0 2e-22 MLNGEDYDTLALLENRSIANYRHLLDGIDLLAIDEAQNIPQIGSILKLIVDEIPGISVLA SGSSSFDLLNKTGEPLVGRSTQFLLTPFSQREIAQTETALETRQNLEARLIYGSYPEVVM MENYERKTDYLRDIVGAYLLKDILAIDGLKNSSKMRDLLRLIAFQLGSEVSYEELGKQLG MSKTTVEKYLDLLEKVFVIYRLGAYSRNLRKEVTKAGKWYFYDNGIRNAIIGAFSPLAIR QDVGALWENYIIGERRKANFNEGLHREFYFWRTYDKQEIDLIEESADSLTALEFKWGNKM PAAPKAFQEAYPYAEFHVVNRENYLEFV >gi|313158484|gb|AENZ01000040.1| GENE 70 90499 - 90792 60 97 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MIYVVVFFSVDAAFNIFFQWNLSSIFNIDTIYQSCNFALCLQIIVNYCLNAIQIVAGQYL FYWICQNIRLLQTEQCYVYHQNFVTPLLFKSDHLIDI >gi|313158484|gb|AENZ01000040.1| GENE 71 91672 - 92580 80 302 aa, chain + ## HITS:1 COG:no KEGG:BF0134 NR:ns ## KEGG: BF0134 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 302 783 1084 1084 601 100.0 1e-170 MDYYVDHGDLLVNSVGWNIPLLNETLQYMVNHKLGYKLLLSDILPQFEDIKNRIGVTDEV FIEHLAEWNTDLDKYITKNNIKDVIPDASFYDLTTKISNVLTDHINKIAFEALSEISVDT LYAQRTAHTSYYWFVAIKHLLAKIKSLPDNLTEFGKKILMDIASGTQSLNPFPNCFKNIV ERLDKRKIKSTVTDIRNDFCIGKKTINAIKFQFFETWLRSHGNLKSQAGDVIDKIVKPVI SDGACRSLILQNKDFYMDLINTAGDDAYELKKSLRNLIQKDSDPQLVKFVNSIDSVPEVE TA >gi|313158484|gb|AENZ01000040.1| GENE 72 92712 - 94730 1763 672 aa, chain - ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 202 559 117 466 589 99 26.0 3e-20 MSQQEDDLRALAKIMDFLRAVSIILVVMNVYWFCYEAIRLWGVNIGVVDKILLNFDRTAG LFHSILYTKLFSVLLLALSCLGTKGVKGEKITWGRIWTAFAVGFVLFFLNWWLLPLPLPL EAVTGLYVLTIGTGYVCLLMGGLWMSRLLKHNLMEDVFNNENESFMQETRLIESEYSVNL PTRFYYRKRWNNGWINVVNPFRASIVLGTPGSGKSYAVVNNFIKQQIEKGFSQYIYDFKY PDLSTIAYNHLLNHPDGYKVKPKFYVINFDDPRRSHRCNPIHPDFMEDITDAYESAYTIM LNLNKTWVQKQGDFFVESPIILFASIIWYLKIYQNGKFCTFPHAIEFLNRRYEDIFPILT SYPELENYLSPFMDAWLGGAAEQLMGQIASAKIPLSRMISPQLYWVMSDSEFTLDINNPE EPKILCVGNNPDRQNIYGAALGLYNSRIVKLINKKGMLKSSVIIDELPTIYFKGLDNLIA TARSNKVAVCLGFQDFSQLVRDYGDKEAKVVMNTVGNIFSGQVVGETAKTLSERFGKVLQ KRQSISINRQDVSTSINTQMDALIPPSKISGLTQGMFVGSVSDNFNERIEQKIFHCEIVV DAEKVKREESAYKKIPVITNFTDEDGNDRMKETVQANYRRIKEEVKQIVQEELERIKNDP VLCKLLPDNETV >gi|313158484|gb|AENZ01000040.1| GENE 73 94761 - 95978 1150 405 aa, chain - ## HITS:1 COG:no KEGG:BF0132 NR:ns ## KEGG: BF0132 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 405 11 415 415 780 99.0 0 
MYGAIAYNGEKINEAQGRLLTTNRIYNDGSGTVDIGKAMEGFLTFLPPQMKIEKPVVHIS LNPHPEDVLTDIELQNIAREYLEKLGFGNQPYLVFKHEDIDRHHLHIVTVNVDENGKRLN RDFLYRRSDRIRRELEQKYGLHPAERKNQRLDNPLRKVAASAGDVKKQVGNTVKALNGQY RFQTMGEYRALLSLYNMTVEEARGNVRGREYHGLVYSVTDDKGNKVGNPFKSSLFGKSAG YEAVQKKFVRSKSEIKDRKLADMTKRTVLSVLQGTYDKDKFVSQLKEKGIDTVLRYTEEG RIYGATFIDHRTGCVLNGSRMGKELSANALQEHFTLPYAGQPPIPLSIPVDAADKAHGQT AYDSEDISGGMGLLTPEGPAVDAEEEAFIRAMKRKKKKKRKGLGM >gi|313158484|gb|AENZ01000040.1| GENE 74 95987 - 96415 513 142 aa, chain - ## HITS:1 COG:no KEGG:BF0131 NR:ns ## KEGG: BF0131 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 142 1 142 142 246 100.0 3e-64 MKEKRKSKSGRNPKLDPAVYRYTVRFNEEEHNRFLAMFGKSGVYARSVFLKAHFFGQPFK VLKVDKTLVDYYTKLSDFHAQFRAVGTNYNQVVKELRLHFSEKKAMALLYKLEQHTVELV KLSRRIVELSREMEAKWSQKSV >gi|313158484|gb|AENZ01000040.1| GENE 75 97125 - 97886 980 253 aa, chain + ## HITS:1 COG:no KEGG:BF0129 NR:ns ## KEGG: BF0129 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 253 97 349 349 509 100.0 1e-143 MSKEIFVAFATQKGGIGKSTVTALAASYLHNVKGYNVAVVDCDDPQHSIHGLREHEMGLI DSSTYFKALACDHFRRIKKNAYTIVKSNAVNALDDAERMIATEDVKPDVVFFDMPGTLRS NGVIKTLSQMDYIFTPLSADRFVVESTLKFVTMFRDRLMTTGQAKTKGLHLFWTMVDGRE RNDLYGIYEEVIAEMGFPVLSTRLPDSKKFRRDLSEERKSVFRSTIFPMDTALLKGSGIR EFSEEISDIIRPQ >gi|313158484|gb|AENZ01000040.1| GENE 76 97889 - 98329 203 146 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2547 NR:ns ## KEGG: Bacsa_2547 # Name: not_defined # Def: conjugate transposon protein # Organism: B.salanitronis # Pathway: not_defined # 1 146 1 146 146 248 100.0 5e-65 MGSRKVNTEGIDEELLLASIGRRTQDGTLRPAQEVPAAAPTEEDTAAPEPSPVQPVTREK AQRESGRRKRQDEDYNELFLRRNEIKTRQCVYISRDVHGKILRIVNDIAGGEISVGGYVD TVLRQHLEQHKERINELYKKQREDLI >gi|313158484|gb|AENZ01000040.1| GENE 77 98344 - 98685 267 113 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2548 NR:ns ## KEGG: Bacsa_2548 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 113 1 113 113 
228 100.0 4e-59 MTPNEKRPQQDCGGMFTQVQASVEILSPVPVSGKCSEKDYERLFIRDPEVKAREGKMAYV RPEYHERIMRITRVIGHDRLTLSAYIDHVLTHHFNQCEDAIKSLYARNYNSVF >gi|313158484|gb|AENZ01000040.1| GENE 78 98702 - 99430 574 242 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2549 NR:ns ## KEGG: Bacsa_2549 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 242 1 242 242 441 100.0 1e-123 MNYTISMTDILLAVSVGCNLWFLFLLLYERIMDTRIVRFFKGIVGLWRSLDGNEAKRIAA HEEVPAEKADIIGKSRFRMASTRTTAAIPTQEAATIEKGIELSEEEATFDDGKTGNASRP AQVPEEKLDETFTSIPPEELGYGDDEPEEDASDTPRASGSSFDEIDDACKTAKNPDATQA EREKAAKVFTDMEGTELYEKMMEGSSEIGIRIKGLIEIRLKKSEKEFVVPDNIEEFDIRN YV >gi|313158484|gb|AENZ01000040.1| GENE 79 99632 - 99949 332 105 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2550 NR:ns ## KEGG: Bacsa_2550 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 105 1 105 105 167 100.0 9e-41 MNKNILKNRKAILSAALVIAATASAFAQGNGIAGINEATSMVSSYFDPGTKLIYAIGAVV GLIGGVKVYGKFSSGDPDTSKTAASWFGACIFLIVAATILRSFFL >gi|313158484|gb|AENZ01000040.1| GENE 80 99960 - 100292 407 110 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2551 NR:ns ## KEGG: Bacsa_2551 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 110 1 110 110 213 100.0 1e-54 MAEYPINKGIGRPVEFKGLKAQYLFIFCGGLLALFVLFVILYMVGIDQWICIGFGAASSS LLVWQTFALNARYGEHGLMKLGAARSHPRYLINRRRITRLFKRQRKEERQ >gi|313158484|gb|AENZ01000040.1| GENE 81 100289 - 102793 2925 834 aa, chain + ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 432 744 184 483 593 84 25.0 1e-15 MRNTSKMTTLENRFPLLAVEHGCIISKDADITVAFEVELPELYTVTGAEYEAIHSCWCKA IKVLPDYSVVHKQDWFIKERYKPELQKDDMSFLSRSFERHFNERPYLKHTCYLYLTKTTK ERNRMQSNFSTLCRGHIIPKELDRETTTKFLEACEQFERIMNDSGLVRLRRLSTDEIVGT 
EGKTGLIERYFSLMPEGDTTLQDIELSAREMRIGDNRLCLHTLSDAEDLPGKVATDTRYE KLSTDRSDCRLSFASPVGLLLSCNHIYNQYVLIDNSEETLQKFEKSARNMQSLSRYSRSN SINREWIDQYLNEAHSYGLTSVRAHFNVMAWSDDAEELKHIKNDVGSQLASMECVPRHNT IDCPTLYWAAIPGNAADFPAEESFHTFIEQAVCLFTEETNYRSSLSPFGIKMVDRLTGKP LHLDISDLPMKRGITTNRNKFVLGPSGSGKSFFMNHLVRQYYEQGAHVVLVDTGNSYQGL CGMIRRKTGGADGVYFTYTEDKPISFNPFYTDDYIFDVEKKDSIKTLLLTLWKSEDDKVT KTESGELGSAVSAYIERIQSDRSIVPSFNTFYEYMRDDYRKELAQRDIKVEKSDFNIDNM LTTMRQYYRGGRYDFLLNSTENIDLLGKRFIVFEIDSIKENRELFPVVTIIIMEAFINKM RRLKGVRKQLIVEEAWKALSSANMAEYLRYMYKTVRKYYGEAIVVTQEVDDIISSPVVKE SIINNSDCKILLDQRKYMNKFDQIQALLGLTEKEKSQILSINMANNPSRLYKEVWIGLGG TQSAVYATEVSAEEYLAYTTEETEKVEVYRLAEKLGDDIEAAIRQLAERRRNKE >gi|313158484|gb|AENZ01000040.1| GENE 82 102834 - 103214 436 126 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2553 NR:ns ## KEGG: Bacsa_2553 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 126 1 126 126 260 100.0 1e-68 MNLPKVKMLQVSKCLIGLAVMMLQSCDVADNRRDMLCGNWESVEGKPDVLIYKEGEAYKV TVFRRSGLRRKLKPETYLLQEENGNLFMNTGFRIDVSYNEATDVLTFSPNGDYVRVKPQP GHPTEE >gi|313158484|gb|AENZ01000040.1| GENE 83 103238 - 103867 926 209 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2554 NR:ns ## KEGG: Bacsa_2554 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 209 1 209 209 390 100.0 1e-107 MRTRITMIICLCLLFAGRASAQWVVSDPGNLAQGIINASKNIIHTSKTATNMVSNFQETV KIYQQGKKYYDALKSVNNLVKDARKVQQTILMVGDITDIYVNSFQRMLRDGNFRPEELSA IAFGYTKLLEESNEVLTELRNVVNITTLSMTDKERMDVVERCHSKMKRYRNLVSYYTNKN ISVSYLRAKKKNDLDRIMGLYGNMNERYW >gi|313158484|gb|AENZ01000040.1| GENE 84 103871 - 104875 1202 334 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2555 NR:ns ## KEGG: Bacsa_2555 # Name: not_defined # Def: conjugative transposon TraJ protein # Organism: B.salanitronis # Pathway: not_defined # 1 334 1 334 334 639 100.0 0 MKFDNLHQILRSLYEQMMPLCGDMAGVAKGIAGLGALFYVAYRVWQSLARAEPIDVFPML RPFAIGLCIMFFPTVVLGTINSILSPVVQGTAKMLEAETLDMNRYREQKDKLEYEAMVRN 
PETAYLVSNEEFDKQLEELGWSPSDMVTMAGMYIDRGMYNMKKSIRDFFREILELLFQAA ALVIDTVRTFFLVVLAILGPIAFALSVWDGFQNTLTQWICRYIQVYLWLPVSDMFSTILA KIQVLMLQNDIERMQADPNFSLDSSDGVYIVFLCIGIIGYFTIPTVAGWIIQAGGMGGYG RNVNQMAGRAGSMAGSVAGAAAGNAVGRVGKLLK >gi|313158484|gb|AENZ01000040.1| GENE 85 104907 - 105530 963 207 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2556 NR:ns ## KEGG: Bacsa_2556 # Name: not_defined # Def: conjugative transposon TraK protein # Organism: B.salanitronis # Pathway: not_defined # 1 207 1 207 207 402 100.0 1e-111 MEFKSLRNIESSFRQIRLFGIVFLSLCAVVTVWSVWNSYRFAEKQREKIYVLDNGKSLML ALSQDLSQNRPAEAREHVRRFHEMFFTLSPEKSAIEHNVKRALLLADKSVYHYYSDFAEK GYYNRIIAGNINQVLKVDSVVCDFNAYPYRAVTYATQKIIRQSNVTERSLVTTCRLLNAS RSDDNPNGFTIEGFTIIENKDLQTIKR >gi|313158484|gb|AENZ01000040.1| GENE 86 105558 - 105845 234 95 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2557 NR:ns ## KEGG: Bacsa_2557 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 95 8 102 102 189 100.0 2e-47 MWGMYWKLHDKRKRLAASLKGYLDGLPPETRRRIVLGMFAAFAVLALYTFGRAVYDIGRN DGSHMETGHAGRVELPTPAETGNHLTPYLYGTDKE >gi|313158484|gb|AENZ01000040.1| GENE 87 105826 - 107178 1121 450 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2558 NR:ns ## KEGG: Bacsa_2558 # Name: not_defined # Def: conjugative transposon TraM protein # Organism: B.salanitronis # Pathway: not_defined # 1 450 1 450 450 815 100.0 0 MEQTKNEPTKENKAAPETGKPKKEREPLTEAQRLKRQKMIVLPAMVLVFIGAMWLIFAPS SGKEQPPGTDGYNTEMPDADKANRQIIGDKLKAYEHGEMEERQESRNRAIGQLGDMFDRE IAGTENGVDFDLANPGGKEERAKPATPQTIQSSAAAYRDLNATLGNFYDQPKNDNAEMDE LLERIASLESELESERGKASSMDEQVALMEKSYELAAKYMGGQNGGQPSAEQRAEPTTVQ KGKKNKAMPIRQVEHQVVSSLSQPMSNAEFVAALSQERNRGFNTAVGTAEVLDRNTIPAC VHGAQSVTDGQTVRLRLLEPMAVAGRTIPRGAVVVGTGKIQGERLDIEITSLEYDGTIIP VELAVYDTDGQPGIFIPNSMEMNAVREVAANMGGSLGSSINISTNAGAQLASDLGKGLIQ GTSQYIAKKMRTVKVHLKAGYRVMLYQEKY >gi|313158484|gb|AENZ01000040.1| GENE 88 107213 - 108199 1260 328 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2559 NR:ns ## KEGG: Bacsa_2559 # Name: not_defined # Def: conjugative 
transposon TraN protein # Organism: B.salanitronis # Pathway: not_defined # 1 328 1 328 328 662 100.0 0 MRKVIIMFALAMGIITANAQENVTVETTNGSEQPTLTKEVYPQKEADGDLYHGLSRKLTF DRMIPPHGLEVTYDKTVHVIFPAEVRYVDLGSPDLIAGKADGAENIIRVKATVRNFPNET NMSVITEDGSFYTFNVKYAAEPLLLNVEMCDFIHDGSTVNRPNNAQEIYLKELGSESPML VRLIMKSIHKQNKREVKHIGCKRFGIQYLLKGIYTHNGLLYFHTEIKNQSNVPFDVDYIT WKIVDKKVAKRTAVQEQIILPLRAQNYATLVPGKKSERTVFTMAKFTIPDDKCLVVELNE KNGGRHQSFVIENEDLVRAGTINELQVR >gi|313158484|gb|AENZ01000040.1| GENE 89 108202 - 108777 577 191 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2560 NR:ns ## KEGG: Bacsa_2560 # Name: not_defined # Def: conjugative transposon protein TraO # Organism: B.salanitronis # Pathway: not_defined # 1 191 1 191 191 380 100.0 1e-104 MRKYIAIIIASLALFTGQAHAQRCLPKMQGIEVRADMADGFNLGGKDGGYSFGAALSTYT KKGNKWVFGGEYLLKNNPYKDTKIPVAQFTAEGGYYFKILSDARKIVFVYAGASALAGYE AVNWGKKVLHDGSTLHDRDAFIYGGALTLDVECYVADRIALLANLRERCLWGGDTRKFHT QFGVGIKFIIN >gi|313158484|gb|AENZ01000040.1| GENE 90 108818 - 109687 469 289 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2561 NR:ns ## KEGG: Bacsa_2561 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 289 12 300 300 603 100.0 1e-171 MPLADFLARLGHEPVRRSGNELWYLAPYRGERTSSFRVNVAKQLWYDFGLGKGGDIFTLA GEFLQSDDFMKQAKFIAEAANMTVAGWEKPVYLSKPTESVFEDVEVAPLLRSLLTEYLEE RGIPYAIASRHCCRLNYGVRGKRYFAVGFPNMAGGYEVRSRYFKGCIPPKSVSLVKANDI PADECLVFEGFMDFLSAVTLGVTGNADCLVLNSVANVEKAAGLLDGYGRIGCFLDRDEAG RRTLAALTMRYGERVTDRSSLYDGCKDLNEYLQLTTKKQKNNHLKIEEQ >gi|313158484|gb|AENZ01000040.1| GENE 91 109684 - 110187 649 167 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2562 NR:ns ## KEGG: Bacsa_2562 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 167 1 167 167 325 100.0 3e-88 MNILNNRNKRTSIFKAVALCLIAAMSFTLVSCDDDMDIQQSYPFTVEVMPVPNKVVKGQT VEIRCELKKEGDFSGTLYTIRYFQFEGEGSLKMDNGITFLPNDRYLLENEKFRLYYTAAG DEAHNFIVVVEDNFSNSYELEFDFNNRNVKDDDLTIVPIGNFSPLLK >gi|313158484|gb|AENZ01000040.1| GENE 92 110301 - 110711 251 136 aa, chain + ## 
HITS:1 COG:no KEGG:Bacsa_2563 NR:ns ## KEGG: Bacsa_2563 # Name: not_defined # Def: lysozyme # Organism: B.salanitronis # Pathway: not_defined # 1 136 40 175 175 276 100.0 3e-73 MERAFLCCRYFEGWHSEKHYPYVGWGHKLLPNEKYSARTMTKRDADELLRKDLRKFVAMF RKFGVDSILLGTLAYNVGPAKLLGSKTIPKSTLIKKLEAGDRNIYREYIAFCNYKGKRHA MLLKRRKAEFALLYIP >gi|313158484|gb|AENZ01000040.1| GENE 93 111199 - 111507 352 102 aa, chain - ## HITS:1 COG:no KEGG:BF0110 NR:ns ## KEGG: BF0110 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 102 38 139 139 193 100.0 2e-48 MIAKTILQQIGGKRFTAMTGSRDFIDMGNGLRMSLARNKTSANRLDIIYDEGADLYNMRF YRRTFSKKTFECKTKDIAVHEGIYFDMLEEMFTMVTGLYTRF >gi|313158484|gb|AENZ01000040.1| GENE 94 111561 - 111806 255 81 aa, chain - ## HITS:1 COG:no KEGG:BF0109 NR:ns ## KEGG: BF0109 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 81 28 108 108 154 100.0 2e-36 MNTTYQTLIVKFSEPITALDGIFDDTGAWGTDTLKGWIDDYESTRFTATDSHTAVITSEY NMECVKEWLQRQTPISEMREF >gi|313158484|gb|AENZ01000040.1| GENE 95 111803 - 112036 302 77 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2571 NR:ns ## KEGG: Bacsa_2571 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 77 1 77 77 141 100.0 7e-33 MTTRMTINGVSTCAEAGTEKYERFQSGIGRRRRTLVQYDYRHPIDRELFSCVKPTLDECR AARDKWLNAKKGKEDRL >gi|313158484|gb|AENZ01000040.1| GENE 96 112056 - 112313 328 85 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2572 NR:ns ## KEGG: Bacsa_2572 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 85 1 85 85 159 100.0 3e-38 MEVRIESMICVWDDAIPTMFLEFVNLLTLTTSEGELRKSVKEFAEKHELDKFFLYGFGSH HFYLHQRYTSNPEMVMKNRVLSVHF >gi|313158484|gb|AENZ01000040.1| GENE 97 112338 - 113669 932 443 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2573 NR:ns ## KEGG: Bacsa_2573 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 443 1 443 443 871 100.0 0 
MKPKTKIQKEVARLSANLRPISATQIDWAYRHCVEHIGYRTKKGNITCSDCGHEWHSDSG LCDTLEGCTCPKCHAELKVQDTRRRIYKETQNFSVITTCKGYQVIRVAQVRCESRKGEPM RFYCHEVVQRWISPDGKVTDMALLRGFLFCYCDVWALGSDMEVRPHNSLYDDVVARSCAY PKMRILPQLRRNGFKGDFHGISPVRLFKALLSDPRIETLMKGGEIEVMKHFLFNTRTADE CWASYLIAKRHKYQIDNLSMWCDYLRMLKKLGQDLRNPKNICPEDFMAAHDNATRKIEAI HEKERAAEQRRWEIERREREQQRQLQRKKDAEDFIANKSKFFGLVITDEEIIVKVLESID EYYNEGKTQGICVFGSGYYKKADTLILSARIGDEIIETVEVDLRTLEVVQCHGKHNQDTE YHERIIDLVNKNANLIRERMKAA >gi|313158484|gb|AENZ01000040.1| GENE 98 113666 - 114082 242 138 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2574 NR:ns ## KEGG: Bacsa_2574 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 138 1 138 138 257 100.0 1e-67 MKGTDHFKRTIYMYLEQRAEEDALFAKKYRNPAKNMDECVTHILNYVQKSGCNGFTDGEI FGQAIHYYEENEIEVGKPMDCQVVVNHVVKLTAEEKAEARQNAVRKYQEEELRKLQNRHR PSARKENQPQPSLFDLGL >gi|313158484|gb|AENZ01000040.1| GENE 99 114096 - 114317 273 73 aa, chain - ## HITS:1 COG:no KEGG:BF0105 NR:ns ## KEGG: BF0105 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 73 59 131 131 149 100.0 3e-35 MAKRNSKTAAQQCRYYEVDNIFVYMVETYINGNFETFRRLYHELNKDARRDFMDFLLSEV EPTYWREILKQII >gi|313158484|gb|AENZ01000040.1| GENE 100 114329 - 114544 156 71 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2576 NR:ns ## KEGG: Bacsa_2576 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 71 1 71 71 96 100.0 3e-19 MTATANFRQMAQHIGLAICGLMMRTAFGVFGILWGIIREIVNGVFRVAIGVIVAILSTIA FFGFILWLFTL >gi|313158484|gb|AENZ01000040.1| GENE 101 114576 - 114833 67 85 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|167762708|ref|ZP_02434835.1| ## NR: gi|167762708|ref|ZP_02434835.1| hypothetical protein BACSTE_01066 [Bacteroides stercoris ATCC 43183] hypothetical protein BACSTE_01066 [Bacteroides stercoris ATCC 43183] # 1 85 70 154 154 153 100.0 4e-36 MRIQELRELSFKLSYLPLVYPTFFLCVFLSHRSFSFRVLEKVGIRECKVFRSKYYPQGWR FLPKTGGLTLLSRPIPELPLRPERK >gi|313158484|gb|AENZ01000040.1| GENE 
102 114841 - 115104 66 87 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2577 NR:ns ## KEGG: Bacsa_2577 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 9 87 1 79 79 135 100.0 5e-31 MRLTERRKMEVNENRGRLCRKKLQGKIRKKKNHQKEKDITGSVKRANHNKGSMIVSDPTE LVQPGGNNHSSRFVRLCVGACFIHGAG >gi|313158484|gb|AENZ01000040.1| GENE 103 115588 - 116928 1708 446 aa, chain - ## HITS:1 COG:BH1862 KEGG:ns NR:ns ## COG: BH1862 COG0371 # Protein_GI_number: 15614425 # Func_class: C Energy production and conversion # Function: Glycerol dehydrogenase and related enzymes # Organism: Bacillus halodurans # 42 308 53 321 399 142 32.0 2e-33 MNKVESALSRTTDTKALVIGIETLPRVADMFKELFPGRRALVVADANTWRAAGGDVHRIL AQAGVAQDEPHVFTDPKLYAEWTFVEQLDGVLSRTDAIPVAVGSGVINDLTKLCSHHNGR RYMVVGTAASMDGYTAYGASITKDGNKQTFDCPAPLGMVLDPTIAAAAPAKMSASGYADL IAKIPAGADWMLADAVGSELLDEFAFSVVQDGLKDALCDPAGVRAGNVAKVEQLAEGLLL SGFAMQATQSSRPASGAEHQFSHLWDMEHLKYNGASVSHGFKVGIGTLASTAFIEMLLDT PVETLDVEKCVAAWKSWEETEKDIYRIFDNDPEFVARGLVETKGKYVDKEGLRRQLQRLQ KAWPALSGHIRRQIIPFGEVRRRLELVGAPYEPEHIGVSRAKFRASFEKIPYMRSRFTVI DIAFRCGWMDEWLDKLFGKGGIWEIK >gi|313158484|gb|AENZ01000040.1| GENE 104 117308 - 118501 1383 397 aa, chain + ## HITS:1 COG:no KEGG:BT_2457 NR:ns ## KEGG: BT_2457 # Name: not_defined # Def: putative purple acid phosphatase # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 394 21 389 389 286 39.0 1e-75 MKNTLFLLICFLASLPASPAFSQPSARIAGGPYLQNVTEDGFTVIWTTTTDAAAWVETAP DDGTHFYAVERPKYYDSHLGRRRLGKLHRVRVGGLEPGKTYRYRIMQQAVLSDEGNKRVV LGEGYGSDILKHAPYPVTTPSADRNELKFWMVNDIHGRDSVFRLLIGDAPKQKPDFVCLN GDLLNSIESEKALFEGFLASASELLTPAGIPLVVMRGNHDGRGKFARHWLDYFPTPTGES YYTFRRGPVFFVVLDGCEDKPDSDIRYYGLSTADAYREQQAQWLRGVVAGEEFRSAPYRI VLIHMPPNKGRGWHGELEIERLFLPILDGSGIDLMLCGHYHRYQWIDDRSRGADFPILIN SNNDKTVVKADEKGIDIRVVDTAGNVLKEHKITKNNR >gi|313158484|gb|AENZ01000040.1| GENE 105 118511 - 119407 1341 298 aa, chain + ## HITS:1 COG:L18600_2 KEGG:ns NR:ns ## COG: L18600_2 COG0584 # Protein_GI_number: 15672199 # Func_class: C Energy production 
and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Lactococcus lactis # 52 157 3 104 227 83 41.0 4e-16 MKTFRLLTALTLGTLLWGCAESGKKAEGPFYTDIESRAQLHDWFRYAPERPIVVSGHRGG MVTGYPENCIESFEKTLSMMPSFFEIDPRLTKDSVIVLMHDATIDRTTTGTGKVSDYTFE ELQQFFLKDREGNVTPYKIPTLEECIVWSRGKTILNLDIKDVPLEVMSDFINRLAPANVM YTVRNARQARTYLDRDPQAMFSCWCKNMKEFGEYAEERIPWSQMMAYVGTMMLPDQQELY DKLHEKGVMCMISLAPTHDRRATDAEKIKGYELEIPTGCDVIETDYPYLFQHLNLTRK >gi|313158484|gb|AENZ01000040.1| GENE 106 119407 - 120273 1501 288 aa, chain + ## HITS:1 COG:FN1255 KEGG:ns NR:ns ## COG: FN1255 COG0647 # Protein_GI_number: 19704590 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted sugar phosphatases of the HAD superfamily # Organism: Fusobacterium nucleatum # 16 278 11 270 275 199 41.0 5e-51 MATFDKFRSVYTHDELMERLRGIKHVALDMDGTIYMGMSLFPYTKPFLEGLKELGIGYSF LTNNPSKSIADYLHKLETLGIRATREEMYTTALATIDYIKQHYPAAKRLFLLGTPSMISE FEAAGFESAADSADDVPDVIVAAFDMTLQYDRLCRAAWWVSQGVPYIATNPDRVCPTDQP TVLVDCGSICACIGHATGRRPDITLGKPDPNMLSGILSRHGLKPDQIAMVGDRIYTDVAM AHNAKAMGVLVLSGETTLDVADKADPQPHITADSIEVLGRLIREAHGK >gi|313158484|gb|AENZ01000040.1| GENE 107 120574 - 121449 1360 291 aa, chain + ## HITS:1 COG:Cj0263 KEGG:ns NR:ns ## COG: Cj0263 COG0428 # Protein_GI_number: 15791634 # Func_class: P Inorganic ion transport and metabolism # Function: Predicted divalent heavy-metal cations transporter # Organism: Campylobacter jejuni # 30 291 8 290 291 239 49.0 5e-63 MDSVFPSLSPEKHYICEVNDKPHINMEHNILIPLLLTLGAGLATGIGSAIAFFARRTNKR LLSFSLGLSGGVMIYVSFVELFQQAQITLSEEWGAHTGIIVTVVSFFAGILLIGVIDRLV PSFENPHEAHMVEEMDKQPRNPKLMRMGMMTALAIGIHNFPEGIATFTSAVDNMALGVAI AVAIAIHNIPEGIAVSIPIFYATGDRKKAFKLSLLSGLAEPVGALLAYLVLMPFMSPTLM GCILAGVAGIMVFISIDELLPAAREYGEAHISIYGVVAGMALMAVSLILLA >gi|313158484|gb|AENZ01000040.1| GENE 108 121453 - 122406 1450 317 aa, chain + ## HITS:1 COG:slr0681 KEGG:ns NR:ns ## COG: slr0681 COG0530 # Protein_GI_number: 16329554 # Func_class: P Inorganic ion transport and metabolism # Function: Ca2+/Na+ antiporter # 
Organism: Synechocystis # 1 316 70 387 433 194 42.0 1e-49 MDILLLIVGLGLILAGANFLTDGSAAVAQRFRVPEFIIGLTIVAVGTSTPELVVSVLSAA AGNSDVAIGNVVGSNLFNVFLILGVCALIRPLPLTRSNIRRDIPFGMAASLVLLAVTSDR LICAGAADRIGRIDGIVMVALYIALMWFTIRTTKRPEAVPADAGAKKPAALWLAVVMIAG GLAGLIFGGDMFLKSATEIARRLGISESVIAITLVAGGTSLPELASSVVSLLKGKSEMAL GNVIGSNIANILLILGVSATIRPLTMGGITTTDILMVVLSSLLLLLTAFTFRRKQIDRWE GAIFLMIYVAYIWYLIR >gi|313158484|gb|AENZ01000040.1| GENE 109 122487 - 123524 1013 345 aa, chain - ## HITS:1 COG:L0296 KEGG:ns NR:ns ## COG: L0296 COG1194 # Protein_GI_number: 15672823 # Func_class: L Replication, recombination and repair # Function: A/G-specific DNA glycosylase # Organism: Lactococcus lactis # 8 272 15 279 387 233 41.0 5e-61 MDNISDILLDWYARHGRDLPWRRTRDPYRIWLSEVILQQTRVAQGMDYYLRFTERFPDVG SLAAAPEDEVLKLWQGLGYYSRARNLHAAARQVAARFGGVFPRSYDEVRSLRGVGDYTAA AVCSAAYDAPCAVLDGNVFRVLARLFDIDLPIDSTAGKRTFAELAQMRLDKRCPGRYNQA VMDFGALQCTPAQPGCADCPLASRCLALAAGTVAERPVKQSKTKVRDRWFNYLHVTCGDR TLLRRRGEGDIWQGLYEFPMIETDRAADFSELTDSEEFRTLLDGVEWRLLRSVAMPKHQL SHQTLHAVFHRIEISFPVDFPSVPTATLGDYAVPRLIDRYLDRQP >gi|313158484|gb|AENZ01000040.1| GENE 110 123605 - 123904 471 99 aa, chain + ## HITS:1 COG:DR1844 KEGG:ns NR:ns ## COG: DR1844 COG2388 # Protein_GI_number: 15806844 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Deinococcus radiodurans # 10 92 7 90 93 69 41.0 1e-12 MEKKHELIDNAAEKRYEFDLDGDLAMIEYIKAQGFIVLTHTEVPEKYEGQGIGSELTRAV LEDLRAKKLPMIPQCPFVAQYIYRHPEWADVVLKEVPAK >gi|313158484|gb|AENZ01000040.1| GENE 111 124021 - 125427 1635 468 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1635 NR:ns ## KEGG: Bacsa_1635 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 6 468 4 467 467 503 53.0 1e-141 MKRILLLFIALAAAWGAAAKIRLPEIIGNDMVLQQNAEVRLWGWAEPHAAVKVETSWGAG AAVKSDADGRWAVSVRTPAGSYEPQRITFASGDRVTFDRVLIGEVWFAGGQSNMEMPLGG FWQCPVTDANEIIARAGAKSGKLHYVKIPHAAAYTPQDRTAGSWTPCTTATAAKFTAAGY FFAEMLNDVLGVPVGIIDCSWGGSRVESWMSREVLSGYSDIDLTEEGISRVAQHLRPMVM 
YNAMLRPVAGYTVRGFLWYQGESNVGRYMDYAARLADMVSLWRGLWGRDDLPFYYVEIAP FEYGNGKSPYLREAQCRAQELIPNSGMICTNDLVEPYEFCNVHPKNKRDVGYRLAFMALN KTYGMKEIACQSPQYESMEIRDGKAHVRFKYMDQGFNRMADIRGFEICGADKVFYPADAV VENQFALVVSSEKVPEPVAVRYCFRDFQPGNLADTRELPVVPFRTDDF >gi|313158484|gb|AENZ01000040.1| GENE 112 125630 - 125857 454 75 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158608|gb|EFR58002.1| ## NR: gi|313158608|gb|EFR58002.1| hypothetical protein HMPREF9720_2907 [Alistipes sp. HGB5] # 1 75 1 75 75 128 100.0 1e-28 MELTNTSPDKRRLFGRWFARRITTYLVVCAFLAFVNWQTSPHYWWVAWVAAGWGLNLVLS LVWYLTDCDDENNYR >gi|313158484|gb|AENZ01000040.1| GENE 113 125871 - 126242 427 123 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_3075 NR:ns ## KEGG: Bacsa_3075 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 9 119 5 120 127 92 39.0 5e-18 MKAWKLVLTVVFALASVAAAAKTLEITVTDIRSDKGSILVMAKVAGQKQPVYGMSPAKRG EVVVTLEGIDADAAEVSLFHDEDGDYKMKTGERGPAEGYAAKKCTLPAERNTVTIRLRYP AAE >gi|313158484|gb|AENZ01000040.1| GENE 114 126353 - 127696 1412 447 aa, chain + ## HITS:1 COG:no KEGG:BVU_0460 NR:ns ## KEGG: BVU_0460 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 195 74 266 268 136 38.0 2e-30 MEDRKLDAAESLALIGRMIENTRSRMVRNAGRPFLAWGYATAATLIAVWAAVSCTGDVRW NYLWFMLPLLGSALMYFTRPKAAEGSVHTYVDRVLDLIWSVIGPATLLISTLAICFVVRF PVAFTVLLMIGLGTTINNLIIGFKPGVAGGIAGIVLASVSLVVTGNWHAGLFLAAFVLMT IIPGHILNYRYRHSRSDVKQAAGPRPGKATKDNTAEEDGAATENGTAEGGAATDSTAADG AATGSAAADNTAADRPLNAAENIALVGRMIDNTRSRMVRNAGRPFLIWGYATVVTTLAVW AAYALFPGRNWYYLWLLLPVFGTALSLLTMPKADGGRVYTFIDRVIGQVWLVMGLTAWFV SLLSVFGDMPVPILFVILLMMGMGTAITGLVIRFIPAAVGGAAAIVLAPVSLAATGYWVP ALFIAGFAAMMIVPGHILNYKSNHPKQ >gi|313158484|gb|AENZ01000040.1| GENE 115 127701 - 128333 861 210 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158615|gb|EFR58009.1| ## NR: gi|313158615|gb|EFR58009.1| putative membrane protein [Alistipes sp. 
HGB5] # 1 210 1 210 210 347 100.0 4e-94 MEERKMSEKEAMELIARSLEAQRRKLSFVRSTLFFGAGALALAVTTADYVVRRTTGRDLT LLLVLAGTVAGLAALYFHYRSHRTVTANDRTLMNVWAYALGVCLYTAGYVATGDGLQLPV MGLLTMGAAIAAGVTGELFRRSETNRSNQSGSLAGLLFISISGAFLAAWIFRVAAAGETV SQFILTLAAAVILFFGTGIVLRYNERRNRV >gi|313158484|gb|AENZ01000040.1| GENE 116 128326 - 128610 299 94 aa, chain + ## HITS:1 COG:CC2206 KEGG:ns NR:ns ## COG: CC2206 COG1846 # Protein_GI_number: 16126445 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Caulobacter vibrioides # 5 94 10 99 103 75 40.0 2e-14 MFKELNPLLHSELRLAVMSILIGVESADFVFIRQQTGATAGNLSVQLDKLAKANYIEIEK TFRGRMPCTVCRITDTGRDAFAEYVAALQTYIRK >gi|313158484|gb|AENZ01000040.1| GENE 117 128794 - 129408 719 204 aa, chain - ## HITS:1 COG:FN1083 KEGG:ns NR:ns ## COG: FN1083 COG2431 # Protein_GI_number: 19704418 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 9 198 2 189 198 97 27.0 2e-20 MWGALKGSLVIVAFFVAGCVAGLFALVPFDVAGARVSTYVLYALMFCVGVTLGHDETLAG RVRRLDPRLALLPVMTAVGTLAGAALATPLLPPLAVTDTLAVGSGFGYYSLSSIFIADFR GAELGTVALLCNVMREIFTLLAAPLVARWCGPLAAVSIGGATTFDTTLPVITQAAGKPYA VVSIFHGCTLDFSVPFLVTFFCTL >gi|313158484|gb|AENZ01000040.1| GENE 118 130051 - 132735 3727 894 aa, chain - ## HITS:1 COG:no KEGG:BF0908 NR:ns ## KEGG: BF0908 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 838 1 828 837 620 39.0 1e-176 MKKFVKIAAIVVAVVLVIALVAPMVLRGKIAEIVKREANAMLDARLDFEKLDISLLRHFP RASLDLKGLTLVGEEPFAGDTIVAAQRISVVVNVMSLFGDEGFEVTKIILSEPALHAHKL ADGRVNWDVVKASEEAQGEEETPADAEPSSFRLSVRDFRISDAAIRYEDDSTGMRFSTAP LSLRLRGDMSADQTDLDLRLTAKGMRLVSGGIPMLSGAEAELDAVIAADLANSRFTFSQN KLRLNAIEVGLDGWVELKDDAVAMDVTAGCDKVQFKDVLSLVPAFYTRDFRNLAAGGELS MALWARGEMRGAALPAFELRTEVRNGSFQYSSLPKAVDGINIDVRIANPGGVADKTEIDL SKFTLRMAGNSLSASAYATNLVSDPTFRAAVDGRVDLGAVKEVYPLEKGMELSGLITADV KVSGRMSDIEKSRYERIGASGTFVVERLGLTMPELPAVHIRRAAATITPAAMTLGELGLT IGRSDLAANGQLSGYIGYLLRGDMLSGRLYVKSDLLDLNEMMNVMPDSAPEETPETSSAG 
TSAEPVQALEVPRNLNLSLNTELQKVLFQKMTVSGISGEMRVAGGALSLERLRMQLFGGS ATASGSYSTAADPQRPALKLSLGLSGASFSKTFDELEMVQKLVPLFAKTGGDYSLSLDMT TALDPKMEPDLQSLSATGEIKSANIHIQNLEAFDALAKALDNDKLRKIEAKDVAVRFSIR DGRVTTQPFDLKMGDIKINMSGSTGLDQTIDYTAKVALPAGSAGGILESVNVGIGGTFTS PKITLGVKEAAEQAVKNVIDQQIQKLTGSESLGEEIQKQADNLRKEAKRAGDKLVEAAEA QRTKLVDGAKEKGALAKLAAEKAGDKLVEEAKKQAGKLSAEAEKQIEKLTAKQE >gi|313158484|gb|AENZ01000040.1| GENE 119 132735 - 133172 469 145 aa, chain - ## HITS:1 COG:no KEGG:ANT_07460 NR:ns ## KEGG: ANT_07460 # Name: not_defined # Def: putative acetyltransferase (EC:2.3.1.-) # Organism: A.thermophila # Pathway: not_defined # 8 141 6 140 145 119 49.0 6e-26 MTVEIRRITQVSDALEEAFARLMPQLSPRLGAPSREVLRRVAGSETGELLAAVAGERIVG VLTLAWYDAPSGRKAWIEDVVVDGAARGCGAGDALVRAAVEHAARIGAGKVMLTSNPARE AARALYRKVGFEEVETTVFAFKTDK >gi|313158484|gb|AENZ01000040.1| GENE 120 133604 - 134647 1470 347 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2777 NR:ns ## KEGG: Odosp_2777 # Name: not_defined # Def: putative surface layer protein # Organism: O.splanchnicus # Pathway: not_defined # 19 346 22 357 357 355 51.0 1e-96 MKRVLRHMAVLCAGAGFCCACMDYGPLHEEPFAQSQRGVFITCEGNFGWDNASLSFYDPA ARSVENEIFIRANGMKLGDVAQSMTLHKGRGYVVVNNSGAVYVIDPATFRVTGVIEGVVS PRYIHFPNDTTAYITDLYDPRITVVDSRSNRIRGRIPMNGHKSTEQMVQWGDEVFVNCWS YDDKILVVDAARETLVDSIEVGPQPSSLVLDRFGKLWTVTDGRTAGIPAALYRIDAATRR VEQTYTFAAGDHPSSVTLNGTRDTLYFINRDVWRMRVDAEELPVTPFLPYTQTTYYEVAV DPLTSEVYVADAIDYVQHGVVYRFTARGAAVDTIRVGITPGSFCFKP >gi|313158484|gb|AENZ01000040.1| GENE 121 134664 - 135548 1385 294 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2775 NR:ns ## KEGG: Odosp_2775 # Name: not_defined # Def: PKD domain containing protein # Organism: O.splanchnicus # Pathway: not_defined # 33 281 323 580 592 239 49.0 8e-62 MKKYAFPLAALLLAACNSGEEITTTQTDEPVARILEYTPAPGQFINEEARSGGAFDNVDT PEKACRYAAARFAENNWVSLGGWGGYLVAAFAEPVPNTGGYDLYVKGNAMNTSSEPGVVW VMQDANGNGMPDDTWYELKGSEYDNAATIRGYAVTYTPLADGSAAWTDDRGGSGTIDRMD EHTQASYCPAWIEPADLKFTGTRLRDNVEQADGQWRPQAFAWGYADNFSTVDRIGTTNRL 
RISDAVTADGSPANLQQIDFIKVQTGVNAKAPLIGEISTEVCGIGCYRTVTKRN >gi|313158484|gb|AENZ01000040.1| GENE 122 135553 - 137604 2942 683 aa, chain + ## HITS:1 COG:no KEGG:BT_1953 NR:ns ## KEGG: BT_1953 # Name: not_defined # Def: putative TonB-linked outer membrane receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 682 1 686 692 718 51.0 0 MLRTATILLCLTCALPAAAQRDSTARTIAIESVDISGRRPMKEIGVQRTELDTLVLRENI TASLADALATGSTIFIKSYGRATLATASFRGTAPSHTQVTWNDMKVNSPMLGQVDFSLIP AYFIDDATIYHGASSVGVTGGGLGGAVTLATKAPAEQGLGLRYVQGIGSFATFDEFLHLT YGGARWSSSTRVLYSTSDNDFRFRNYNSKEFVTDDDGQIVGEYYPLQRNRNGGFRDLHVM QELYYTTRGGDRFSLAAWYLDSHRGLAMLTSDRNKSKQKQNTQDERTLRAVAGWERLRNG LKLGARAGYTYTDLRYLLKQDPEGKGQFVVNTDARSRIHTLFAKAEAEYALGEKWLFSAN AALHQHRVRSGDLSVIGNNGLRADTLYRQERFELSAFAAVKWRPVPRLGVAADLRWELYG DRTTPVIPALFADYVLSKRGSVVVRASAARNYRYPTLNDLYFRPGGNKNLKPERGWTFDA GLETTLKGDARSLRASATAFDSRIDDWILWIGSPKLGIYTPINIRRVHSYGVEAKLSAEA ETSGGWRLLLDGNFAWTRSINRGDPFSPADESVGKQLVYIPVYSAALTARVEWRRWELTY KWNWYSERYTMSSNDLGVLGRVKPYFMSDLSLEKGLDCKWASFSLKGCIHNLLNEEYESV LSRPMPRLNVSFFIGITPKFGKR >gi|313158484|gb|AENZ01000040.1| GENE 123 137601 - 138275 971 224 aa, chain + ## HITS:1 COG:FN0805 KEGG:ns NR:ns ## COG: FN0805 COG4912 # Protein_GI_number: 19704140 # Func_class: L Replication, recombination and repair # Function: Predicted DNA alkylation repair enzyme # Organism: Fusobacterium nucleatum # 5 223 23 251 251 140 34.0 3e-33 MTLREQLLELTEPKYMKFTSALMPGVENVLGIRLPVLRSIAKEIAAGDWRAYLAEAEDFY FEERMLQGLVIGYARCEPAEKLAHVARFVPKIDNWAVCDCFCWRLRAAERQPMWEFIQPY FHAQAAYDVRFAVVMGLGNFVDEQHLEAFLRQLDGIRHEAYYARMAVAWAVSVCYVKFPQ RTHVWLGTCSLDDWTYNKSLQKIIESYRVSDAAKQEIRAMKRRR >gi|313158484|gb|AENZ01000040.1| GENE 124 138401 - 139147 1067 248 aa, chain + ## HITS:1 COG:MT2111_2 KEGG:ns NR:ns ## COG: MT2111_2 COG0463 # Protein_GI_number: 15841539 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mycobacterium tuberculosis CDC1551 # 2 227 14 239 277 198 44.0 9e-51 
MRKLVIIPTYNEKENISAMIDKVFSLPEPFELLVIDDGSPDGTAEIVRQRRKEFPETLHL LERSGKQGLGTAYLTGFRWGLDNGFDYICEMDCDFSHNPDDLVRLYEAAAEGNDVVVGSR YVQGVNVVNWPMSRLLMSYFASMYVRIVTRMPLRDATAGFVCYSRHALETLDLDAVRMKG YGFQIEMKYSAWRLGLKLKEVSIIFVERREGTSKMSGGIFREAFFGVLGLPLRKIRAAKK AQNMKNRI >gi|313158484|gb|AENZ01000040.1| GENE 125 140519 - 140767 136 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFRTRLQWRFRRKIRGGFAFLSLTRHCEDRKTDHNQCNTFHKIEKLVTQDLNSRVQVDIQ NLKYYKIVNITIYIYKIFHNIF >gi|313158484|gb|AENZ01000040.1| GENE 126 140908 - 141276 582 122 aa, chain - ## HITS:1 COG:MK0919 KEGG:ns NR:ns ## COG: MK0919 COG2246 # Protein_GI_number: 20094355 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanopyrus kandleri AV19 # 3 117 18 136 151 65 34.0 3e-11 MIEQFLKFCIVGGSGVFVDFGITYLCKEWLRLNKYVANSLGFLCASTTNYILNRIWTFHN ENPDITGQYLRFLGIAAVGLVINNLTIWLLHGRFRLNFYLAKLFAIGVVTFWNFFMNYFF TF >gi|313158484|gb|AENZ01000040.1| GENE 127 141273 - 142415 1514 380 aa, chain - ## HITS:1 COG:no KEGG:Halhy_6054 NR:ns ## KEGG: Halhy_6054 # Name: not_defined # Def: glycosyl transferase, family 39 # Organism: H.hydrossis # Pathway: not_defined # 5 302 191 484 567 155 30.0 2e-36 MFLRPTLYLSGAVALLLLVPHFVWQYEHDWASLAYHLAGRNSVFRPGYVAEYLLNLLVVF NPFFVPLYVRSWIAVKPQNAVERALRFIPAAFIVFFLLSTLRGYVQPQWVIVAVFGLLYT LFTYARRHPRTRRYLMRMGWVTLALIALTRLVMIFNPLGIRYEVFDNRTSYGEIAEIADG RPVVFRHGYAVAAKYAFYTGGEAYCQPNIRYRTHQWQFRDDDTRFAGREVLVETPYDEAD TTGRVRSVVLPNGKRFTWFVDTHFHPTRLVDISFTGLPERVAAGDTLRLSLRIDNPYPYN IDLGGDMSLVMLWKHGRFRVEEFDTGERFRIPAEEEIRREVAFVVPEQLAGTDFDAGFAL RREGYTNWFNGKPLRTEVAR >gi|313158484|gb|AENZ01000040.1| GENE 128 142190 - 143041 255 283 aa, chain - ## HITS:1 COG:no KEGG:Cpin_1026 NR:ns ## KEGG: Cpin_1026 # Name: not_defined # Def: 4-amino-4-deoxy-L-arabinose transferase and related glycosyltransferase of PMT family # Organism: C.pinensis # Pathway: not_defined # 52 203 26 173 557 136 49.0 9e-31 MHLSLGAGWRKTDKFCYLCRRMRASVKKIYASLRPDALVAFWLGVWWICNLFQAGFTELA NDEAYYHMFAENLSWGYFDHPPMTALLVWLGEHLFGGEFGVRFFFTLLQPLYLFVFWRII 
RPTDADRRDAGLFVMLSAATLMLQLYGFIAVPDGPLMMTAALFLLTFKWFTEGRRAAWLW MGVAMALMAYSKYHGALVVLFALAATPPACSCVRPYISAARWRCCCSCRISCGSTSTTGR RWPIIWRGVTRSFGPVTSPNTCSTCWSSSTRSSCRSMSGRGSP >gi|313158484|gb|AENZ01000040.1| GENE 129 143515 - 145740 3664 741 aa, chain + ## HITS:1 COG:TM0385 KEGG:ns NR:ns ## COG: TM0385 COG1328 # Protein_GI_number: 15643151 # Func_class: F Nucleotide transport and metabolism # Function: Oxygen-sensitive ribonucleoside-triphosphate reductase # Organism: Thermotoga maritima # 114 693 53 643 651 257 30.0 6e-68 MGISELYIVKRDGKRKAFSVEKIKRAVRKAFLSVGSYATDDDITSILSRVHVSDGMSVEE IQNQVEVALMAERYFAVAKSYMLYRQKHTEEREDREKLEFLINYCDASNPATGSKYDANA NVENKNIATLIGELPKQNFIRLNRRLLTDRIKEMYGKELSDKYLHLLKEHFIYKNDETSM ANYCASITMYPWLLGGTLSVGGNSTRPTNLKSFCGGFVNMVFIVSSMLSGACATPEFLMY MNYFIEQEYGEDYYKRADEVADLSKKRRTIDKIITDCFEQIVYSINQPTGARNFQAVFWN VAYYDRYYFESLFGEFVFPDGSHPHWESLSWLQKRFMRWFNRERTKTVLTFPVETMALLT KDGDVMDREWGDITAQMYAEGHSFFTYISDNADSLSSCCRLRNEIQDNGFSYTLGAGGVS TGSKSVLTINLNRCIQYAVKNGMHYLTYLEEITDLVHKVQTAYNENLKELKSKGMLPLFD AGYINLSRQYLTIGVNGLVEAAEFLGIRIDDNDDYVAFVQNVLGLIERYNKKYRTKELMF NCEMIPAENVGVKHAKWDREDGYAVPRDCYNSYFYVVEDQSLNVVDKFRLHGRRYIDHLT GGSALHMNLEEHLSKEQYRHLLRVAAAEGCNYFTFNIPNTVCNKCGHIDKRYLKACPACH SDDLDYLTRVIGYMKRVSNFSAARQQEAERRYYGNMKPAAGGAKTPGQQGAHPQEGPHPQ REPFAQKESAHQKSSPAQEAV >gi|313158484|gb|AENZ01000040.1| GENE 130 145737 - 146225 404 162 aa, chain + ## HITS:1 COG:no KEGG:BVU_3106 NR:ns ## KEGG: BVU_3106 # Name: not_defined # Def: radical enzyme activating protein # Organism: B.vulgatus # Pathway: not_defined # 7 158 1 152 155 187 56.0 1e-46 MTAPETVVRYYNFDIVFAEIPGETTLAINIANCPNRCPGCHSPHLQADAGHVLDAAELRA LLERYGRSVTCVCFMGGDAAPHRIARLAEVVRQELPVLHVGWYSGRQELPEGFAPQVFDY IKLGGWVESLGPLTSPTTNQRLYRIGPAGAMEDITEQFRHKP >gi|313158484|gb|AENZ01000040.1| GENE 131 146449 - 147285 668 278 aa, chain - ## HITS:1 COG:no KEGG:BF2115 NR:ns ## KEGG: BF2115 # Name: not_defined # Def: putative AraC-type transcription regulator # Organism: B.fragilis # Pathway: not_defined # 5 275 20 289 290 216 40.0 9e-55 
MHPKFQIYTDISSLPISEHEAYQEEGFCGICTGGTAVIEVFSVRSPISENDIVTILPLQL VSIREVSDDFSMTFFKVDKEMFLDIMSGLGKITPDFFFYMRKNFLYSVNDVDRNRFLGFC RTIDLRGSNDDPVFLRETILHLLRIYYWDFYVLFQQTTSAKKRPFVNTNKENIAMRFAML VGEHHKTRWEVAFYADKLCISPVYLTKVVLEVNGQSAHEMIADYVVIEIKTLLRDARLDI KAVARRAGFANQSSLSRFFRLHTGMSPTEYRRTIHVTR >gi|313158484|gb|AENZ01000040.1| GENE 132 147799 - 149148 900 449 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2219 NR:ns ## KEGG: Bacsa_2219 # Name: not_defined # Def: integrase family protein # Organism: B.salanitronis # Pathway: not_defined # 1 448 1 444 445 590 67.0 1e-167 MKITLIIKKSVKRYDTESKATIYARLRDGRQVDMVAPTRLTINPNLWDDKAEQVKSKIVC DDEMRSYYNDEARKLKSYLERAYQSRQTAEPQKEWLKETLEQYYNPQKYNVETATEETAK PTLIALFDEFLEKHRLSDVRKKNYRVIKRGLQRYELYIRTTKRGQKAFVLDIDQVTADTL RNIWDFLENEYRYCALYPEIYAAIPEARTPQPRGKNTLLDCFSRIRTFFYWCNSNKKTRN RPFDDFPLEECTYGTPYYITVEELHKIYGTNLMRHPQLAVQRDIFVFQCLIGCRVGDLLK MTKSNLIDGAIEYIPRKTKEGRPLTVRVPLNQTAKEIVSRYKSLDGDKLLPFISEQKYNL AIKRIFKAAGLKRLVTVINPTTREEEKRVLYEIASSHLARRTFVGNLYKQVKDPNLVGAL SGHKEGSKAFARYRTIDDEMKKELVNLLS >gi|313158484|gb|AENZ01000040.1| GENE 133 149289 - 149490 173 67 aa, chain + ## HITS:1 COG:no KEGG:Halhy_1851 NR:ns ## KEGG: Halhy_1851 # Name: not_defined # Def: helix-turn-helix domain-containing protein # Organism: H.hydrossis # Pathway: not_defined # 1 66 29 94 95 68 53.0 6e-11 MDKVGNAIKERRKILKITQRTLAELAGVGINTLTKIERGEGNPTIEVLEKILDTLGLELQ IGIKQRN Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:21:31 2011 Seq name: gi|313158424|gb|AENZ01000041.1| Alistipes sp. HGB5 contig00070, whole genome shotgun sequence Length of sequence - 56983 bp Number of predicted genes - 59, with homology - 56 Number of transcription units - 23, operones - 16 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 503 - 542 2.3 1 1 Op 1 . - CDS 605 - 1099 432 ## gi|313158474|gb|EFR57869.1| conserved domain protein 2 1 Op 2 . - CDS 1096 - 1545 385 ## gi|313158470|gb|EFR57865.1| hypothetical protein HMPREF9720_2450 3 1 Op 3 . 
- CDS 1565 - 4753 1863 ## COG0358 DNA primase (bacterial type) 4 1 Op 4 . - CDS 4811 - 5017 145 ## gi|313158475|gb|EFR57870.1| hypothetical protein HMPREF9720_2452 5 1 Op 5 . - CDS 5020 - 5205 299 ## gi|313158453|gb|EFR57848.1| conserved hypothetical protein 6 1 Op 6 . - CDS 5217 - 5426 316 ## gi|313158427|gb|EFR57822.1| hypothetical protein HMPREF9720_2454 - Term 5447 - 5504 8.1 7 1 Op 7 . - CDS 5521 - 5766 247 ## - Prom 5872 - 5931 6.5 + Prom 5841 - 5900 4.7 8 2 Op 1 . + CDS 5941 - 6354 224 ## gi|313158437|gb|EFR57832.1| DNA-binding helix-turn-helix protein 9 2 Op 2 . + CDS 6475 - 6840 171 ## gi|313158440|gb|EFR57835.1| hypothetical protein HMPREF9720_2456 + Term 6973 - 7012 9.1 - Term 6962 - 6997 6.7 10 3 Tu 1 . - CDS 7044 - 7541 581 ## BT_1390 hypothetical protein - Prom 7583 - 7642 2.5 + Prom 7526 - 7585 1.9 11 4 Tu 1 . + CDS 7668 - 7832 241 ## + Term 7845 - 7904 18.4 - Term 7843 - 7883 8.1 12 5 Tu 1 . - CDS 7903 - 9990 2925 ## COG4232 Thiol:disulfide interchange protein + Prom 9957 - 10016 5.3 13 6 Tu 1 . + CDS 10058 - 10954 936 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Term 10959 - 10990 4.1 - Term 10944 - 10981 7.1 14 7 Op 1 . - CDS 11004 - 12005 1750 ## COG0524 Sugar kinases, ribokinase family 15 7 Op 2 . - CDS 12033 - 13013 1519 ## COG0205 6-phosphofructokinase - Prom 13037 - 13096 6.0 - Term 13066 - 13102 3.6 16 8 Op 1 . - CDS 13147 - 14466 1945 ## GFO_0991 PspC domain-containing protein 17 8 Op 2 . - CDS 14485 - 15288 901 ## FB2170_14393 hypothetical protein 18 8 Op 3 . - CDS 15311 - 15643 617 ## COG1695 Predicted transcriptional regulators 19 8 Op 4 . - CDS 15648 - 15857 454 ## Haur_4670 phage shock protein PspC 20 8 Op 5 . - CDS 15862 - 16374 718 ## COG1983 Putative stress-responsive transcriptional regulator - Prom 16429 - 16488 6.1 + Prom 16463 - 16522 4.4 21 9 Op 1 . + CDS 16584 - 17141 622 ## Celly_0151 ECF subfamily RNA polymerase sigma-24 subunit 22 9 Op 2 . 
+ CDS 17116 - 17613 567 ## gi|313158480|gb|EFR57875.1| hypothetical protein HMPREF9720_2468 23 9 Op 3 . + CDS 17626 - 18564 1049 ## gi|313158441|gb|EFR57836.1| hypothetical protein HMPREF9720_2469 24 9 Op 4 . + CDS 18593 - 19522 1258 ## gi|313158466|gb|EFR57861.1| hypothetical protein HMPREF9720_2470 25 9 Op 5 . + CDS 19539 - 20018 376 ## + Term 20076 - 20113 7.2 26 10 Tu 1 . - CDS 20307 - 20525 337 ## gi|313158431|gb|EFR57826.1| hypothetical protein HMPREF9720_2472 - Prom 20557 - 20616 1.7 - Term 20757 - 20792 6.1 27 11 Op 1 36/0.000 - CDS 20819 - 21562 1021 ## COG0479 Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit 28 11 Op 2 . - CDS 21599 - 23545 3108 ## COG1053 Succinate dehydrogenase/fumarate reductase, flavoprotein subunit 29 11 Op 3 . - CDS 23579 - 24283 1081 ## BVU_1239 putative cytochrome b subunit - Prom 24391 - 24450 3.4 + Prom 24263 - 24322 5.3 30 12 Op 1 . + CDS 24403 - 24939 471 ## COG0212 5-formyltetrahydrofolate cyclo-ligase 31 12 Op 2 . + CDS 24946 - 25662 689 ## BT_0781 hypothetical protein 32 12 Op 3 . + CDS 25667 - 26464 1188 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily + Term 26632 - 26681 6.0 - Term 26845 - 26889 7.2 33 13 Op 1 . - CDS 26912 - 28291 2121 ## BDI_1230 hypothetical protein 34 13 Op 2 . - CDS 28305 - 29222 1307 ## gi|313158454|gb|EFR57849.1| hypothetical protein HMPREF9720_2481 - Prom 29245 - 29304 1.7 + Prom 29161 - 29220 3.9 35 14 Tu 1 . + CDS 29405 - 31225 2486 ## COG0322 Nuclease subunit of the excinuclease complex + Term 31227 - 31268 -0.2 + TRNA 31291 - 31377 60.2 # Leu CAA 0 0 + Prom 31833 - 31892 6.0 36 15 Op 1 . + CDS 32138 - 32449 131 ## gi|313158458|gb|EFR57853.1| conserved hypothetical protein 37 15 Op 2 . + CDS 32539 - 34845 2701 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 34864 - 34908 12.2 38 16 Op 1 . - CDS 34861 - 35607 606 ## COG2102 Predicted ATPases of PP-loop superfamily 39 16 Op 2 . 
- CDS 35595 - 36167 406 ## BT_1949 hypothetical protein 40 17 Op 1 . + CDS 36595 - 38277 2243 ## COG2985 Predicted permease 41 17 Op 2 . + CDS 38375 - 38740 383 ## gi|313158457|gb|EFR57852.1| hypothetical protein HMPREF9720_2490 42 17 Op 3 . + CDS 38743 - 39282 683 ## COG2109 ATP:corrinoid adenosyltransferase + Term 39426 - 39464 2.0 43 18 Tu 1 . - CDS 39283 - 40428 1044 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component - Prom 40501 - 40560 2.9 44 19 Op 1 . - CDS 40617 - 42170 1721 ## Weevi_2079 PKD domain containing protein 45 19 Op 2 . - CDS 42198 - 43427 1020 ## FIC_00184 hypothetical protein 46 19 Op 3 . - CDS 43460 - 44755 1020 ## FIC_00184 hypothetical protein 47 19 Op 4 . - CDS 44773 - 46014 1605 ## Weevi_2079 PKD domain containing protein 48 19 Op 5 . - CDS 46011 - 47168 1443 ## Poras_1287 hypothetical protein 49 19 Op 6 . - CDS 47181 - 49259 2824 ## COG4206 Outer membrane cobalamin receptor protein 50 20 Op 1 . - CDS 49624 - 50244 496 ## gi|291514016|emb|CBK63226.1| hypothetical protein AL1_06150 51 20 Op 2 . - CDS 50263 - 51333 794 ## HMPREF0659_A6126 relaxase/mobilization nuclease domain protein 52 20 Op 3 . - CDS 51323 - 51691 245 ## HMPREF0659_A6127 bacterial mobilization protein - Prom 51726 - 51785 2.7 53 21 Op 1 . - CDS 51795 - 52619 352 ## PG0841 mobilizable transposon, excision protein 54 21 Op 2 . - CDS 52665 - 52817 145 ## gi|313158429|gb|EFR57824.1| conserved domain protein - Prom 52974 - 53033 2.3 55 22 Op 1 . - CDS 53044 - 54252 1321 ## COG5545 Predicted P-loop ATPase and inactivated derivatives 56 22 Op 2 . - CDS 54242 - 54553 371 ## Bacsa_2217 phage transcriptional regulator, AlpA - Prom 54632 - 54691 3.9 - Term 54595 - 54636 12.3 57 23 Op 1 . - CDS 54709 - 55638 685 ## Bacsa_2218 hypothetical protein 58 23 Op 2 . - CDS 55652 - 56662 810 ## COG3550 Uncharacterized protein related to capsule biosynthesis enzymes 59 23 Op 3 . 
- CDS 56665 - 56982 138 ## BF0796 hypothetical protein Predicted protein(s) >gi|313158424|gb|AENZ01000041.1| GENE 1 605 - 1099 432 164 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158474|gb|EFR57869.1| ## NR: gi|313158474|gb|EFR57869.1| conserved domain protein [Alistipes sp. HGB5] # 8 164 1 157 157 303 100.0 2e-81 MSFQTSCMVLVRDEQQQIKLEKWVKGIGRQIIGGRQDWSNFFLADTDQTCFWLQLDENDL GLFDQDFYNCDENIEMFKALAAMNRDNDREQWFVAHAVIRFERLKDTVQTETGERLIMAG EWFKVLIPRASDIRAKWMAAVAPKQLCHKATKDEIIEHFNRKKL >gi|313158424|gb|AENZ01000041.1| GENE 2 1096 - 1545 385 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158470|gb|EFR57865.1| ## NR: gi|313158470|gb|EFR57865.1| hypothetical protein HMPREF9720_2450 [Alistipes sp. HGB5] # 1 149 1 149 149 293 100.0 2e-78 MQNNKSKQKQAEKETPTNWQRIEMVIQQSKMTANAFARHIGLPRGENLYQIKKGNNGISL DVATRICQHYPEIDKLWLLTGDGQMLRDDAPAGPWANIGTPNSEAFIGFAAALILPELVN KPECRDPYTMAVEHAKKLMAALAKKGGEQ >gi|313158424|gb|AENZ01000041.1| GENE 3 1565 - 4753 1863 1062 aa, chain - ## HITS:1 COG:PM1240 KEGG:ns NR:ns ## COG: PM1240 COG0358 # Protein_GI_number: 15603105 # Func_class: L Replication, recombination and repair # Function: DNA primase (bacterial type) # Organism: Pasteurella multocida # 2 364 5 367 582 229 36.0 3e-59 MIPEYVIDQILSKDIVSIIGGEGVSLKRAGVNYECCCPFHKEKTPSFKVSPVKGIFTCFG CSAKGNAISFVMMLYNMTFPEAVEYLAKKLNIEYKAEELTPEQKEARFRRSRIFEINQIA LEYFRESYKQSLPAQKYATKERGFKEETIDNMLIGFAPYKGGFREYATQKGYKEQLLIDA DLVRRSERDGSLYDTFRGRLMFTIRDRTGNIVGFSGRLMDKENPKKLPKYINTGDTAVYK KGEHLFAYFESARQAAAVRTMNLVEGNPDAIRMHQIGVDNTVAPLGTALTPKQIELIKKV ADTVIIIGDMDDAGQKAVVKNAETMLRAGLAVRVMEIKDNYKDPDDYFRQYSKGYEELLS NSTTDFIPWLCAHKMEGKNSQTEQIAVISEVCQLLALCRDESTVNMYLDMFAREYKNRKI WTAELQKIQLERERAQRKKEESYSEDMISEYGFYISHNSYYGAGRGNADVRWSNFILEPI VHVKDDQNARRLFRMRNDKGEEAVIKLDQRSLVSFADFRIRTESKGNYIWEAGQGELTKL KKYLFDGTPSADEIKQLGWQKRHQIYAWGNGAMDEGHFVKANDFGLVNVRGQLFYLPGCS KDTADDPQSYQFQRRFVYAITNDITLNDYATRLIEVFGDNAKVGLCFLIASLFRDLVVNV TTSFPILNLFGPKGSGKSELAHSLTSFFIPSYIAPNINNTTKAALAEAVAEVSNAIVHID 
EYKNTLDIEKREFLKGLWDGAGRSRMNMDNDKKRETTAVDCGVILSGQEMPTADIALFSR LVFLTFSKTTFSDDEKRRYNELKLIEKRGLTHLTGGLLKHRNQFRSNYRHMYDETAADFS VAFVGKIIEDRTFRNWVSITAAFRSIEHLLHLPFTYTEILPMVTRMCETQNLKTNENNEL AGFWETVDILASSGKIWIGVDYHIKASTKKGIPIKESKTPLELPEGVRYLSVSFLRISQL YAKESRDSESKKIPRDSLKYYLEHSREFLGTKKAERFKVIQNPTGFVPAGDAATGRTTTA MLFDYKMICENYGIDLDTSRSYTDNPDEQDDPEPFTPTQLSF >gi|313158424|gb|AENZ01000041.1| GENE 4 4811 - 5017 145 68 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158475|gb|EFR57870.1| ## NR: gi|313158475|gb|EFR57870.1| hypothetical protein HMPREF9720_2452 [Alistipes sp. HGB5] # 1 68 1 68 68 116 100.0 6e-25 MYCDKDNREHYAIMDLDKELLETIRRALTMYRCGLYEGIQEEKRPRIKEDFQQQYDRAGI IVEHIESL >gi|313158424|gb|AENZ01000041.1| GENE 5 5020 - 5205 299 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158453|gb|EFR57848.1| ## NR: gi|313158453|gb|EFR57848.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 61 1 61 61 69 100.0 8e-11 MQTIILIILACITGFLDFLILLYIAKSLLQARADDRQGLIPTLICFILNTIVAVALILKL W >gi|313158424|gb|AENZ01000041.1| GENE 6 5217 - 5426 316 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158427|gb|EFR57822.1| ## NR: gi|313158427|gb|EFR57822.1| hypothetical protein HMPREF9720_2454 [Alistipes sp. HGB5] # 1 69 1 69 69 132 100.0 7e-30 MEQTKTFIEFWRGLDIHSREELRTVGAKMLFVATSTFNAYGCGARQIPLSKREALAKFIA EKYQINVTF >gi|313158424|gb|AENZ01000041.1| GENE 7 5521 - 5766 247 81 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPQCETRYLRGKETKMIQTYPNNVRLIVNLDTEESRLMVGSRLLDIYNGTDIEKMRRITS AAESLSLTIGHRSPDRVRILS >gi|313158424|gb|AENZ01000041.1| GENE 8 5941 - 6354 224 137 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158437|gb|EFR57832.1| ## NR: gi|313158437|gb|EFR57832.1| DNA-binding helix-turn-helix protein [Alistipes sp. 
HGB5] # 1 137 1 137 137 251 100.0 1e-65 MYSIDIREFRRANKMTQQELADYFGVVQGFISNMENGREKVPDKYISKILGDPNVDSSMV KAVAPENEVKMPREVFDKFSLLLETISSQQGTIADQQKVNAEQGRLIADMYQKIDRLTSL GSRTARTEDDAGCAAAK >gi|313158424|gb|AENZ01000041.1| GENE 9 6475 - 6840 171 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158440|gb|EFR57835.1| ## NR: gi|313158440|gb|EFR57835.1| hypothetical protein HMPREF9720_2456 [Alistipes sp. HGB5] # 1 121 26 146 146 240 100.0 2e-62 MADFRPYTEEGFIISPLSSYAAPYKPLATICIEFTPGYDRTYVNEKIVESKVNDDMYLSA NGKKQQWRCPTQDEMLKKLVDYAKSIGANGLLDYRFEAVRPVKSNFNQVLYYRLTAFAIS I >gi|313158424|gb|AENZ01000041.1| GENE 10 7044 - 7541 581 165 aa, chain - ## HITS:1 COG:no KEGG:BT_1390 NR:ns ## KEGG: BT_1390 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 161 2 150 153 98 31.0 9e-20 MKIHKTDIPPRSLLTPCLPGDYHDCFTYGMTCRRKISPDDLMTAFWTTMPGWVNTLFKLR NALVRPFGLQTDNGDAKQLEKAIRSGRDYRMMSVVGKTDNETVISLDDKHLKAYLSVYAE AHEIHLSTLVRYHNRLGFFYFNLIHPFHTLVVKSMFRRIIKAHGL >gi|313158424|gb|AENZ01000041.1| GENE 11 7668 - 7832 241 54 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKEDTKTTCTPCEQNFERKVDELVKEGKAPTAREREEKEAEVEAAFSDRGKEKK >gi|313158424|gb|AENZ01000041.1| GENE 12 7903 - 9990 2925 695 aa, chain - ## HITS:1 COG:VC2701 KEGG:ns NR:ns ## COG: VC2701 COG4232 # Protein_GI_number: 15642695 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol:disulfide interchange protein # Organism: Vibrio cholerae # 76 599 82 531 600 90 24.0 8e-18 MYKLFRTLLLPVAALFAFTAATAQNVSWKSSVEHLEGDVYRIVLEASIPAPYHMYDMGPY EGGPNATAIVFTPGDGVTLEGGVEQLSTPERHYDKTFEMEIGTFAGKARFAQQVKLAAAK ATVKAAIEWMICDDTSCMPPDDTELTVELTARPGSAAANSTAAVSPNNDSSAKDTPAEAA ADTASGTAPTAAAEIVPAQKNAAGGGTLWALIIEAILWGFAALLTPCVFPMVPMTVSFFM KGEGGPARGRFRAAMYGFFIVALYTLPIAAIIVITRILGGDTVTADIFNWLATHWLPNIL FFLVFMVFAASFFGAFEITMPSWMVNKTDAKADSKGLAGIFFMALTLVLVSFSCTGPIVG SVLIKSTSGEFWSPIVTMLAFSVAFALPFTIFAFFPSMLKKLPKSGGWLNSVKVMLGFIE 
VALGMKFLSVADQTYHWGLLDREIYLAVWIVTFSLLGLYLLGKIKFAHDSDMPYLGVGRL ALAIITFSFVVYLIPGMWGAPLKGLSGYLPPLHTQDFVIGQGGPAAASGEPGMLLTADGK KPKHSDFLHLPHGLEGFFDLQEAEAYAAKVDKPLFIDFTGHGCVNCREMEARVWSDPQVL DILRRDYVIAALYSDDKKVLPENEWVTTDSGKVLKSLGKINSYYALKTYGVNAQPYYVLQ GRDGKTLVPPRGYDLNVENFVQFLRSGVEAYKKQQ >gi|313158424|gb|AENZ01000041.1| GENE 13 10058 - 10954 936 298 aa, chain + ## HITS:1 COG:alr3176 KEGG:ns NR:ns ## COG: alr3176 COG0463 # Protein_GI_number: 17230668 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 3 246 5 246 313 116 34.0 6e-26 MIRLSLIIPTHNRSERLIAALESVIRQDLPAADWECVVVSNNSTDDTVARFGDFAARYPG LNLRLVTEDGPGVSYARNRGIAETSAPLLVFIDDDERINPGFLRAYADFFDAHPDAVVAG GRIIAEYVTGRPAWLSKYTEMPIANPMDFGDAVRPFPAGRVPGGGNMAFRRSAALRYGGF DPSLGRVGRMLIGGEENDFFERLMRGGETCWYVPGAVMWHIIPPEKLTESYFRRLCYNVG VSQRLRAGMYRRYPKTLLFESLKWGATLLLSLTMPPRKSLWLIRMRYEISRGLLSDPQ >gi|313158424|gb|AENZ01000041.1| GENE 14 11004 - 12005 1750 333 aa, chain - ## HITS:1 COG:mll5335 KEGG:ns NR:ns ## COG: mll5335 COG0524 # Protein_GI_number: 13474450 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Mesorhizobium loti # 4 308 19 326 343 163 36.0 5e-40 MKRVIGIGNALTDMLVNLKSDSVLSRFKLAKGSMSLVDTTLQTEISKSVAGLPYSLSLGG SAGNTIRAMAKLGCDVGFIGKVGQDTTGDFFVQALENLGVEPVIFRGTERSGKCVSLISP DGERTMVTHLGAALELTAEEIETSIFDHYDCLYVEGYLVQNHDLILKAAKTAKECGLKVA VDLASFNIVAENLEFLRGLVRDYVDIVFANEDEAKTFTCEAEPLNALQVISEMCELAVVK IGIKGAMIKQGDEVVHVGIMAAAKRVDTTGAGDFYAAGFLSGLCDGLSLRQCGTIGAITA GKVIEVVGTTFGEEAWEDISRLVNKVKNEKYLF >gi|313158424|gb|AENZ01000041.1| GENE 15 12033 - 13013 1519 326 aa, chain - ## HITS:1 COG:TM0209 KEGG:ns NR:ns ## COG: TM0209 COG0205 # Protein_GI_number: 15642982 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Thermotoga maritima # 4 325 1 319 319 293 50.0 3e-79 MAGIKCIGILTSGGDAPGMNAAIRAVTRSAIYNGFAVKGIMRGYKGLVFNEIVSFKSQSV 
SNIIQLGGTILKTARSQEFTTTEGRKAAYDNMVAAGIDALVVIGGDGSLTGAGIFAEEYN VPIVGLPGTIDNDLGGTDSTIGYDTALNTIVEAVDKLRDTASSHERLFFVEVMGHTAGYL AMNSAIATGSEAAIIPEMETEVDQLAELINHGFRKSKNSAIVIVAENPKTGGATALAERV KKEFPQYDARVTILGHIQRGGSPTAVDRILASRMGEAAIEALLDGQRNVMIGTMNGKIVY VPFAKAVKHDKGIDRKDLELVKILSI >gi|313158424|gb|AENZ01000041.1| GENE 16 13147 - 14466 1945 439 aa, chain - ## HITS:1 COG:no KEGG:GFO_0991 NR:ns ## KEGG: GFO_0991 # Name: not_defined # Def: PspC domain-containing protein # Organism: G.forsetii # Pathway: not_defined # 1 345 1 357 586 151 29.0 5e-35 MKEVKKCSISGVAFTMDTDAYQELEAYLESLKKTYKETPDGAEIVADIEARIAELILSAQ DSTRVVEKPLILNIIRQMGTAEDISEEEPGRSPEEPRIPRRLYRDTENAKLGGVCAGIGK YFDVDPVWIRLAMFLPILLTVFQWIPFMRWTGPMMGNLFGIFVICYIIMWFAVPAPRTAR QKLEMNGEKITAQSIRETTEASASCDPDAKAKPIVAEAVSVFGKVVLILLKLFAGVLVFG LIMAACALIIGLFALIIGGASISIPFFPLGMTLWVPALGILIVLILVILLIYVLMCLIAS RKPGGKTILVIFLLWLAGIIACTCIAIRENVGDRIRTRHSVIERMLNSEIIIDNDTTTVE RLLEEYDSQNVIEDGRKTLHISVPNQSIDITIDREKAGLEVKANGKRVKVQTGDDKTESA KPDSVRQPDAGRVPAEQTE >gi|313158424|gb|AENZ01000041.1| GENE 17 14485 - 15288 901 267 aa, chain - ## HITS:1 COG:no KEGG:FB2170_14393 NR:ns ## KEGG: FB2170_14393 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium_HTCC2170 # Pathway: not_defined # 31 267 17 231 231 87 31.0 6e-16 MKKILLIAVLLGAMLSVTVYFVCIAAPSVFFGKGLKGSGNIVTRTIEAPAFDRIDAARAV KVVITDKTSGKITIAADDNVMDYVVVEANGGQLTATIDKSINNLSNADVTVTVPANGSIR ALDASSAAKISGDVVLNADKFAMDASSAAKIDVSVKAQSCAVAASSASKINASVEAVSCS VEASSASKITLKGSAAKCTADLSSASKLSAGEFVVAECSVNTSSAAKAVVNCTERLHADA SSGSSIRYSGDCRTSINKSSGGSVGRN >gi|313158424|gb|AENZ01000041.1| GENE 18 15311 - 15643 617 110 aa, chain - ## HITS:1 COG:CAC0571 KEGG:ns NR:ns ## COG: CAC0571 COG1695 # Protein_GI_number: 15893861 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Clostridium acetobutylicum # 6 110 1 104 107 84 43.0 4e-17 MAEDNVKAQMRKGILEYCILAILSREDSYAPKIIAELKQAEMIVVEGTLYPILTRQKNAG LLTYRWEESPQGPPRKYYTLTEKGREYLTTLDEAWDELVGQIRIIRHGKQ >gi|313158424|gb|AENZ01000041.1| GENE 
19 15648 - 15857 454 69 aa, chain - ## HITS:1 COG:no KEGG:Haur_4670 NR:ns ## KEGG: Haur_4670 # Name: not_defined # Def: phage shock protein PspC # Organism: H.aurantiacus # Pathway: not_defined # 5 61 3 60 143 64 48.0 1e-09 MTTNNTKRLLRTPDGIIAGVCGGLAEFFGLDVSLIRIATLILILFGGLSIWVYIILWLIV PKAPKRLTE >gi|313158424|gb|AENZ01000041.1| GENE 20 15862 - 16374 718 170 aa, chain - ## HITS:1 COG:MA4106 KEGG:ns NR:ns ## COG: MA4106 COG1983 # Protein_GI_number: 20092899 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Putative stress-responsive transcriptional regulator # Organism: Methanosarcina acetivorans str.C2A # 101 159 2 59 59 69 52.0 3e-12 MKETVNVNIGSQAFTLDEDAYRVLRNYLEDIRSRLPEYDTETMGDIEARIAEIFREIVSS PMRVITLDTVRQTMNRMGSPADFGERRGSAPQEEQPEPEPRKLFRSRTDRSIAGICGGLA EFFHADTTVLRLITLFLILFGGLSIWAYIILWIVIPEEPARKFSIHNKNR >gi|313158424|gb|AENZ01000041.1| GENE 21 16584 - 17141 622 185 aa, chain + ## HITS:1 COG:no KEGG:Celly_0151 NR:ns ## KEGG: Celly_0151 # Name: not_defined # Def: ECF subfamily RNA polymerase sigma-24 subunit # Organism: C.lytica # Pathway: not_defined # 14 182 1 167 169 103 35.0 3e-21 MLRFARRFRLTDTMEEFEFTRYVVPLRDRMFRYAQSLLLCADEAEDVTHDLLERLWRECD RLDGCRDVASFVMVAVRNCCYDRFRSRQAGERRDNAVAGWAERSTTGDAEGWEARDLVRR AMACLPERQREVLHLKDIEGYPTREIAEIAACDEAQVRVILSRARNGLREVLKKMMDDER AGRKN >gi|313158424|gb|AENZ01000041.1| GENE 22 17116 - 17613 567 165 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158480|gb|EFR57875.1| ## NR: gi|313158480|gb|EFR57875.1| hypothetical protein HMPREF9720_2468 [Alistipes sp. HGB5] # 1 165 1 165 165 268 100.0 8e-71 MKEQDERIERLLECYFDARTTEAEERELRDYFRYVDEIPASLRYARTMFGGLDALAEERY PGEGSVQTGPAVRPAERMRRYETRSAAGRGARGGRKLPPLWGVAAAAAVVLGVFFYAEYL RKPYCYIDGVAIYDKEAAMQATVYLQGLSDFGDQVRMVDELIINE >gi|313158424|gb|AENZ01000041.1| GENE 23 17626 - 18564 1049 312 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158441|gb|EFR57836.1| ## NR: gi|313158441|gb|EFR57836.1| hypothetical protein HMPREF9720_2469 [Alistipes sp. 
HGB5] # 1 312 1 312 312 622 100.0 1e-176 MKKLFVCTGALLLLAVSPAGAQFFISSAPAHEVAAAQKADSACFRRADGKPVAIETVERD STGIDTLRRIDTVDSDLIRLRDGGRDGNMILEVAGFGLTLGHTPMQRMELKKPRVWFNAF SNIELGFTQLVGVDYSGYASGEKGFLDQRLGSSFHFSFSAVQLRLALNRSRSLCLGVGMQ YTLDNYRLSDNTITLGNDGGRVVPVALDEPAGKSKAVTSSLGIPVRLTYTPAKHLRITAV AYSDFLLGADAIYKKPKEKNSLSGFRAYQFGVGLSVSYRGFGFFTRYGVTPLFKHNMGPD CHAFSFGVGLML >gi|313158424|gb|AENZ01000041.1| GENE 24 18593 - 19522 1258 309 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158466|gb|EFR57861.1| ## NR: gi|313158466|gb|EFR57861.1| hypothetical protein HMPREF9720_2470 [Alistipes sp. HGB5] # 20 309 20 309 309 565 100.0 1e-159 MKRLLLILILLLPQLLAAQSADRTAAQTADRIPANSAGRRTGREYAIRTVDRDSLGLDTL RRLDAMGGELMSLRGDGREGSVVLEVAGFGLTLGRAPETDSQADLVMRSGRVKLYFFNSG EFGFTKLVGVDYGGYAPEEKGFLDQKLGNSFHCAASVMQVQVALNKSRTLYFATDLRFAV DNYRLADAGIRLGYEKGRLLPVALDEPADKSKFVTSSLGIPLRLIYKPFKHFQASVIAYS DFGIELTSLHKKPRVEYELSGLRTYQFGVGGTVSYYGIGIFASYGVTPLFKKHAGPECHT FSFGISLLM >gi|313158424|gb|AENZ01000041.1| GENE 25 19539 - 20018 376 159 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKRLLLILILLLPRLAAAQSNSVDDFFSRYAAAEGFTSVQLEQKMMQLMSRQAAERGDKG LAVLLKDIQYIRIIALKEGDGGRFVRDAEAAVAADRKFQPVTSGSENGQTTKFYIRETAL SVKSELVMITYGAKETVVVNIFGVFDLRQVARLSTIRPK >gi|313158424|gb|AENZ01000041.1| GENE 26 20307 - 20525 337 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158431|gb|EFR57826.1| ## NR: gi|313158431|gb|EFR57826.1| hypothetical protein HMPREF9720_2472 [Alistipes sp. 
HGB5] # 1 72 13 84 84 107 100.0 4e-22 MDIVEFPNDRTVISTENNGFAGGERDENLIDEELLETEIPLDDRIPDAAHEDAENDRPVY DEEGYRHDPGGE >gi|313158424|gb|AENZ01000041.1| GENE 27 20819 - 21562 1021 247 aa, chain - ## HITS:1 COG:Cgl0368 KEGG:ns NR:ns ## COG: Cgl0368 COG0479 # Protein_GI_number: 19551618 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, Fe-S protein subunit # Organism: Corynebacterium glutamicum # 1 245 1 244 249 229 46.0 3e-60 MNFTLKIWRQKDAKTKGGFETYKVENISADTSFLEMLDILNNNLIHEGKEPVAFDHDCRE GICGMCSLHINGHAHGPGQAATTCQIYMRKFKDGATITIEPWRSAAFPVIKDLVVNRSAY DQILQAGGFISVRTNSVPDANAIPIAQADAEESMDAAACIGCGACAATCKNGSAMLFVAA RVSSLAKLPQGRVEGARRAKAMVAKMDELGFGNCTNTGACQAECPKSISIAHIARLNREF LSAKFED >gi|313158424|gb|AENZ01000041.1| GENE 28 21599 - 23545 3108 648 aa, chain - ## HITS:1 COG:Cgl0367 KEGG:ns NR:ns ## COG: Cgl0367 COG1053 # Protein_GI_number: 19551617 # Func_class: C Energy production and conversion # Function: Succinate dehydrogenase/fumarate reductase, flavoprotein subunit # Organism: Corynebacterium glutamicum # 5 647 28 673 673 623 49.0 1e-178 MAFKLDAKIPAGALAEKWSNHKAAIKVVSPANKRKLDIIIVGTGLGGASAAASLGELGYN VKVFCIQDSPRRAHSIAAQGGINAAKNYQNDNDSVYRLFYDTIKGGDYRAREANVYRMAE VANLIIDQCVGQGVPFAREYGGLLANRSFGGAQVSRTFYARGQTGQQLLLGAYSVLNKEI AAGTVKSYPRHEMLDIVVEDGEAKGIIARNLVTGEIERHGAHAVVIATGGYGNVFFLSTN AMASNGSAAFQCYRKGAGFANPCFTQIHPTCIPVHGDQQSKLTLMSESLRNDGRIWVPKK LEDVKALRSGAKKPSDIAEEDRDYYLERRYPAFGNLVPRDVASRAAKERCDAGFGVNETG LAVFLDFKTAIERLGKEIIKKRYGNLFDMYEEITDVSPYEQPMQIFPAVHYTMGGIWVDY NLMTTIPGLYAIGEANFSDHGANRLGASALMQGLADGYFVLPYTIGDYLSHKIQAPKVKT DTKAFDEAEKGVREKIAKLLSIGGKQSVDDIHKKLGHIMWENVGMARSKESLEKAISEIQ ALRKEFWKDVKVVGQEKDFNVELEKALRLADFLELGELMARDALNREESCGGHFRVEHQT EEGEAKRDDENFVYAAVWEYKGEDAAPELTKEPLNFEFVHPAQRNYKD >gi|313158424|gb|AENZ01000041.1| GENE 29 23579 - 24283 1081 234 aa, chain - ## HITS:1 COG:no KEGG:BVU_1239 NR:ns ## KEGG: BVU_1239 # Name: not_defined # Def: putative cytochrome b subunit # Organism: B.vulgatus # Pathway: Citrate cycle (TCA 
cycle) [PATH:bvu00020]; Oxidative phosphorylation [PATH:bvu00190]; Butanoate metabolism [PATH:bvu00650]; Metabolic pathways [PATH:bvu01100]; Biosynthesis of secondary metabolites [PATH:bvu01110]; Microbial metabolism in diverse environments [PATH:bvu01120] # 4 234 2 228 228 235 58.0 1e-60 MSCFLSNSSLGKKLVMSVTGCFLVLFILFHMSMNIVAIISPEAYNMICALLGANWYALAG TAVLAAGVVVHFIYAVILTMENLKARGSQRYAVTVVEPGVSWASKNMLVLGFIILGGLAL HLFNFWAKMQLVEVLGGHENSLGLHPADGASLIAYTFSQWYYVVIYLVWFFALWFHLTHG VWSMFQTVGWANDTWYPRLKCIANIVATVIFLGFAAVVVIYFIKSVCPCCAGAC >gi|313158424|gb|AENZ01000041.1| GENE 30 24403 - 24939 471 178 aa, chain + ## HITS:1 COG:aq_1731 KEGG:ns NR:ns ## COG: aq_1731 COG0212 # Protein_GI_number: 15606807 # Func_class: H Coenzyme transport and metabolism # Function: 5-formyltetrahydrofolate cyclo-ligase # Organism: Aquifex aeolicus # 1 176 1 174 186 119 37.0 2e-27 MTKKELRAAMKKRNLSLSSEERDAASERIFGRVERLPGFSAARCAAFFCALPDEPQTGAA LARWSAAKRIVVPRVEGDAMRFYDYAPGELCRGAFGIAEPSVAERPCDPAEIDFIVVPGV AFTAAGARMGRGRGYYDKYLSQPGFRGVKAGVCYAHQLVGELPVEPHDVFMDYVVTND >gi|313158424|gb|AENZ01000041.1| GENE 31 24946 - 25662 689 238 aa, chain + ## HITS:1 COG:no KEGG:BT_0781 NR:ns ## KEGG: BT_0781 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 236 1 236 240 307 61.0 2e-82 MDVKEFTAAYREAFGERPELPLLFRYTDTPLRPVEKVGGCFFKALAEARRGLPVSLNAGN IGCGGGKFYTGFSPMPEFVPQFVSLKERYKRTPEMVLEFIAALELRPAPKAWLEFVRADV AETFDGAEGVLFFAAPDALAGLVTWATFDNNAADAVASPFGSGCSSVVAQAVQENRNGGR RCFLGLFDPSVRPWVEPDVLSFVVPMSRFREMCGMMRASCLFDTHAWSKVRERINNDN >gi|313158424|gb|AENZ01000041.1| GENE 32 25667 - 26464 1188 265 aa, chain + ## HITS:1 COG:PA1640 KEGG:ns NR:ns ## COG: PA1640 COG1752 # Protein_GI_number: 15596837 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 4 182 3 180 345 168 45.0 1e-41 MEKRRAALVLAGGGARGVAHIGAIEELESQGFEVHAVAGTSMGALVGGMYASGHLEAFKE 
WMYTLDKYKVFSLVDFALSTEGLVKGNRVIGAMKELVPDVKIEQMPLPFAAVAADLLTGR EVVLERGGLYDAIRASISIPSVFRPVRRGNMVLVDGGTVNPLPLNRVRREPGDVLVAVDV SAPFSDEMAARVRSSLNYYKVITASSEIMQQHIARLMCEIYKPDLLIELPADRFGIFEFY RAREIVEAGRLAARAALEQHMVVAG >gi|313158424|gb|AENZ01000041.1| GENE 33 26912 - 28291 2121 459 aa, chain - ## HITS:1 COG:no KEGG:BDI_1230 NR:ns ## KEGG: BDI_1230 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 8 442 17 463 468 275 36.0 2e-72 MKKTYAFLATLLLASGVAAQNPTAYFMEGTTFRSQFNPAFAPLRGYVNIPGLGGIDINAS GNLAVDKLLYPRNGKLVPLLDEAVSTRQALSGLNADNLLGIDTRVNLIGFGAFTRNHKNF WSFDLNARTTADATLPYSLFAFLKEGTGSDIHNIGLTADSYLEAAFSYSFPLLDDRLYIG VRGKFLVGAARARMYFTRFNVTHGEEEWRIDAAGGLDITAAGLDVTTKRNDAGEEVYKLD DIDFKPTSPAGYGFAVDLGATYDILPNLQASLAVNDLGFIGWSKKKNVTGFSSKELSFTG VNVTEDGTESPDFDIDVLEFHKGAPQSVSRMLRATINAGLEYEVWRHKIGIGLLYSARFW EYKTLHNITGSVNFHPIRWFTLTGSYSVIDNRGGAVGLGLNLCPNWINFFIGTDVLITKH TPQFVPVRQSIMNVTLGIGVPIGKRSHRIPAYVYGKDRR >gi|313158424|gb|AENZ01000041.1| GENE 34 28305 - 29222 1307 305 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158454|gb|EFR57849.1| ## NR: gi|313158454|gb|EFR57849.1| hypothetical protein HMPREF9720_2481 [Alistipes sp. 
HGB5] # 3 305 1 303 303 576 99.0 1e-163 MRLILPLFLFGMLVTSCIENNYDLSEIDTDGVVIGDEFRLPLATVTVSMSELSKEGTDIK ALFDEADIWLPSPLPAGGKYVDLQKIQHTPSYIDELLDELIEQMKRSDAKINAVADLLYD KYLGTFLPLLPPNTDPKDFKQVFITMFRATTGLQEELAGEVRDLAGGYLTDLKIEDVTYD LGRIDLGSDVVDMLADNLDPKGTANPRNTLDIYGEIISALPVSLQFSPRFYPTEVEFDIR VEPNVKTKIGETRLHENDLRQIIDGTEIILPVKLEKYFPGSGFTPDQKIVIALRLVKRGG LKLNL >gi|313158424|gb|AENZ01000041.1| GENE 35 29405 - 31225 2486 606 aa, chain + ## HITS:1 COG:lin1197 KEGG:ns NR:ns ## COG: lin1197 COG0322 # Protein_GI_number: 16800266 # Func_class: L Replication, recombination and repair # Function: Nuclease subunit of the excinuclease complex # Organism: Listeria innocua # 8 606 5 596 603 413 40.0 1e-115 MAETGKNNLREQVAMLPLSPGVYQFVDRSGTIIYVGKAKSLRKRVSSYFVQSKEHSAKVR VLVRQIAEIRHIVVSSETDALLLENSLIKTLQPRYNILLKDDKTYPWIVVRREHFPRVQS TRQLTRDGSQYFGPYSSVMMQHSVLEFVREVVPLRTCKLNLAPEQIAKGRYSVCLQYHLG NCKGPCVGEQSEAEYAKLVDMVVAVLKGDLRPVRSYLESEMSHAAAGLKFELAQRYKQRL DALDNYSSRSVIVSAKIVDVDVFSLLPDDDVAYCNFVRIRHGSIVGVYTVKLSTGVGGDE RDMLTLAIQHIVEHIAGTLAKEVVVPFLPSTTLLFDGVTFTVPKRGEKLELLEFSQKSAR IYRAEQLKNLEIKNPERYTERLLNALQKELRLDRQPRHIECFDNSNLQGAHPVASCVVFR DGKPARKEYRHFNIKTVEGPDDYASMREVVFRRYSRLQEEGAELPDLIIADGGKGQMGII HEVLEYLGLDIPVAGLAKNDRHRTSELLCGYPPVLVGIRPTSPLFHFLAHIQEEVHRFAV SFHRQKRSKAFIHSELEQIEGIGDKTIRTLLQHFRSVAKVKAAPAEELSALVGPAKAKKI RAFFEK >gi|313158424|gb|AENZ01000041.1| GENE 36 32138 - 32449 131 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158458|gb|EFR57853.1| ## NR: gi|313158458|gb|EFR57853.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 103 12 114 114 193 99.0 3e-48 MLLVLFAWYWSSVTLFPHAHNIDGHIYVHSHPFSGTSNNPGHSHTPQQFQLIAHLSLLVM MVATLVAFALRLLGVSFIFKTQKPAVRQDAPIRIYGLRAPPVC >gi|313158424|gb|AENZ01000041.1| GENE 37 32539 - 34845 2701 768 aa, chain + ## HITS:1 COG:ECs3047 KEGG:ns NR:ns ## COG: ECs3047 COG4771 # Protein_GI_number: 15832301 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Escherichia coli O157:H7 # 119 766 31 657 659 116 24.0 2e-25 MKEKILGAFLMLLLPAVCSAGEEPAMKPQKQSDANITGHVVDAKTYEHLAFATIAVKGTT IGIATDATGHYFLKNLPQGRFTLVASSVGYRSAEQTVEISPDKTIEVNFSLTEEALSVEE VVVSASRTETNKKTSPTIVSVASAKLFESTASCNLAETMNFQSGLRVETNCGNCGTTQLR INGLEGQYSQVLLDSRPIFSSLASVYGLEQLPVAMIERVEVIRGGGSALFGANAIGGVVN IITKEPLRNSVTLSNTTNIFEGGTADFNTSLNGSFVSDDYKMGVYLFGMIKDRDSDDRNG DGFSDIPKLNSETAGFRAYYKTSPYTRLTAEYHHIHEFRRGGNEFDQPPHMADIAEQLNH KIDGGGLKFDWFSPNNRHRMGIYTSAQNIDRDSYFGTDKKPDAYGATDDKTFVAGAQYTY SFHKLLFLPSELTAGVEYNYNTLHDKYLGFGRDFEQTTHSTGFFFQNEWRSEKLNFLIGG RVDKHNMMKNVVFSPRVNVRYSPTEKIGLRASYSSGYHAPQAYNEDLHIDALDNKVAIIR LAPDLKPEYSHSLSASVDLYHNFGRVQANLLVEGFYTMLEDVFTLEKIGEDAQGNIIKER RNASGATVAGVGAEAKAGIPGRFELQLGYTFQRSRYDEPEKWSDDVTPQRRMFRSPDHYG YLTANVNLTRDFTASVFGNYTGRMLVQHNAGYIERDTERLTPDFWDMGLRLSYNFRLTKQ LRLELNAGVKNLFDSFQKDLDFGQNKDAAYIYGPAVPRTYFIGAKFAL >gi|313158424|gb|AENZ01000041.1| GENE 38 34861 - 35607 606 248 aa, chain - ## HITS:1 COG:AF0720 KEGG:ns NR:ns ## COG: AF0720 COG2102 # Protein_GI_number: 11498327 # Func_class: R General function prediction only # Function: Predicted ATPases of PP-loop superfamily # Organism: Archaeoglobus fulgidus # 5 205 7 195 214 124 36.0 2e-28 MERIKAVFNWSGGKDSAHALLRAQQSGEYEIVALLTTVNRDTHRSTMHGIPTALLQMQAE SIGVPLYIVDLTPKGNMEDYDVAMSRAVEHFKTQGVTRFIFGDIFLHDVRKYREQQLSPH GIEIVEPLWGKSSEEVMNDFLVSGFRTVVVTTMADGLGADAIGREIDRGFIASLPAGVDP NGENGEYHTFCYDGPIFRQPVPFRLGRSFSQSYDIRLDDGTVKTYSYWFADLQALNTNSD AGTGPASE >gi|313158424|gb|AENZ01000041.1| GENE 39 35595 - 36167 406 190 aa, chain - ## HITS:1 COG:no KEGG:BT_1949 NR:ns ## KEGG: 
BT_1949 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 182 4 179 188 227 59.0 2e-58 MDFSFRYTAEDHCAALPVADYIARFRDAKRFLECCRACRNYGCSWGCPPFGYDVGAYLSQ YTSALIIATKITPAEQHVPMSEAGRLIRPERQRLERRLLEMERRYGGRSFAYVGTCLYCP EGTCTRPEAKPCRHPELVRPSLEACGFDIARTTSELFGIELKWGTDGSLPEYLTLVCGFF HNAENIIWNG >gi|313158424|gb|AENZ01000041.1| GENE 40 36595 - 38277 2243 560 aa, chain + ## HITS:1 COG:HI0035 KEGG:ns NR:ns ## COG: HI0035 COG2985 # Protein_GI_number: 16272010 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Haemophilus influenzae # 19 559 4 548 551 368 41.0 1e-101 MELLERLLFGSESLWGGGVAHSVMILALVIALGIMLGKVKIAGVSLGVTGILFVGIAFSY FGMNIDEHLMHFLKEFGLILFVYSIGLQVGPGFFSSFRKGGITLNKLAVLVVALGVVTTV ALYYVTGLPMTTMVGVMSGAVTNTPGLGVAQQAFSDLHAGADAPDIATGYALAYPLGVIG AILTLLALRYLLRIDVRQEEEAAGLGTDVLKDLTTRRISVEICNPAVEGKSISGIRRLAL RDFVVSRICRPGEAPELADAATTLRCGDRILLVAAPKDAEALVALLGREVDAGPMMPDRK MISRRILITKPELNGKTLAELRIRSTSGVTITRINRSGIDLVAAGNLQLQLGDRVTVVGP ELSVAHAERLFGNSLKRLNHPNLIPIFIGIALGVVLGSISFWIPGVPQPVKLGLAGGPLI VAILIGRYGPHYRLITYTTMSANLMLREVGISLFLAGVGLGAGEDFVPTLVAGGYVWIAY GAVITIVPLLLAGLFGRFYYKLNYYTLIGVLSGASTNPPALAYSTEQTTSDAPSVGYATV YPLSMFLRVLAAQLLILIFG >gi|313158424|gb|AENZ01000041.1| GENE 41 38375 - 38740 383 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158457|gb|EFR57852.1| ## NR: gi|313158457|gb|EFR57852.1| hypothetical protein HMPREF9720_2490 [Alistipes sp. 
HGB5] # 1 121 1 121 121 244 100.0 2e-63 MNIVLYGVPAETAGRIADRYGLKVINSPDKFDASGTMVLVPSINAPRYLLAFYNAMLRHE DDVDAVIICGAESCEAVSTVQYCTPLGKFFTLNGDLDGEELVSELCLLLDSLFAEGNQIN F >gi|313158424|gb|AENZ01000041.1| GENE 42 38743 - 39282 683 179 aa, chain + ## HITS:1 COG:TM1465 KEGG:ns NR:ns ## COG: TM1465 COG2109 # Protein_GI_number: 15644214 # Func_class: H Coenzyme transport and metabolism # Function: ATP:corrinoid adenosyltransferase # Organism: Thermotoga maritima # 4 178 2 169 170 163 50.0 1e-40 METKGYVHVYTGNGKGKTTAAFGLALRALCAGKGVYVGQFVKSMKYNETKIERLFNGSDP AFGRIRIEQLGRGCFIGKDPELADAEAARKGWTRCADLLRSGEYDVVILDELTIALHFGL LSTAVVLDVLRSRHPAVEVVITGRYAPQALIDAADLVTEMCDVKHYYTQGVLSRDGIDR >gi|313158424|gb|AENZ01000041.1| GENE 43 39283 - 40428 1044 381 aa, chain - ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 2 380 33 424 426 233 34.0 7e-61 MKKTVYFILLSAATFLISCASNTTGIDAYTETLYKPSYASGFEIVGAEGRQSTLLKVFNP WQGAENTETTLFIVRNGEKIPAGFTGQVVKAGARRIVCLSSTYVAMLDALGQVDRVVGVS GRNYISNKYVTTHPEAVADIGFDGNIDYELLLAQRPDLVLLFGVSGASGMESKLLEMGIP FCYIGEYLEESPLGKAEWLVAIAEITDSREKGEAVFEPVPERYDALRAKVAAATSKHPKV MINTPYAGSWFMASTESYVARLIADAGGDYIYKKNTSNRSLPIDLEEAYMLTAQADMWLN AGSAASLGELKSQFPKFTNTRCVRNGAVYNCNKRLNVAGGNDYWESGVVQPDVVLHDLIA IMHPEVLDENDRELHYYQRLE >gi|313158424|gb|AENZ01000041.1| GENE 44 40617 - 42170 1721 517 aa, chain - ## HITS:1 COG:no KEGG:Weevi_2079 NR:ns ## KEGG: Weevi_2079 # Name: not_defined # Def: PKD domain containing protein # Organism: W.virosa # Pathway: not_defined # 135 507 34 400 401 135 32.0 5e-30 MKKFTFRSLLAIAAAGVLFTACKDENDDGNTSKTTYEFSVKAEDNPISAPADGKTYTILV TSTKTAQAGTSAVAYEVVSSPEWAPAELEQTALVITVAKNSSTEAREPGKVVLKQDESDK TLEITINQAGFSNSMSLDATYSTDRCKVLVIEPTITGFDQNPVYKWTVKGPNDAEAAEAG TDKTLSFIQLETGDYTISLTISDDSGITETKSATVTVTTEATAYSYYISEVLEYNPAITN GKGLAFSTIDTPSTVLASVNTKLVDKPFDKNDLGVNLGSFGGSIVFRFDHTVMNVNGLRD 
FRIGSYATKGAYPAQGVVYVSSDANGNGKADDEWYELAGSEYGKTGERRNLKMVYTRPDN VTPSDGMVEWVSYAINESETGTYCYAKAPWGTFSVWPAWLMESAEGTALTYTGCTMLPPL AKAPEDLTNGWPTQASRWYDYGYVCNSDPTDETGSSFDIGWARDKEGNKVNLPGIDFVKI QNATLQDLGYGYGPACVLFNCAIDLHLAGKEIETIAQ >gi|313158424|gb|AENZ01000041.1| GENE 45 42198 - 43427 1020 409 aa, chain - ## HITS:1 COG:no KEGG:FIC_00184 NR:ns ## KEGG: FIC_00184 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 64 322 47 324 1036 100 31.0 2e-19 MLLASVIGPSSCSYHNDDNPNEYPDPQPEPEPEPEPQPDVNEKYLEASYTPNCFMVKPGE SVDIPVLKAYAIWDLYAEWLDKSDFTGMTPEPVLLWQDTPGLITNVGLIPGQTAEEGSIF VSTADKVGNALIGLRIGGEIRWSWHIWVTRYDPNAELVAFGKIYTWDNNGDGVADYTFMD RNLGAVIDKALIENTPADSLAACGLLYQWGRKDPFPGDRILRGTNQTDYNRFDSKPIYDA AGTLLTEGSQSGGTGIRTIRTSEDLTRTNFAKSIFSPMHFLLADNWFPSEEPLKVEACDT LWCGARGKTPFDPCPEGWCVPADKNGLFAWNGLDKATTDYSPLGVFPYAGFRYRVGGGCL KNSGFNTGIWTQDTYQLSIYIIPYDQRPSIKRDRGYRSDGYSVRCAKDE >gi|313158424|gb|AENZ01000041.1| GENE 46 43460 - 44755 1020 431 aa, chain - ## HITS:1 COG:no KEGG:FIC_00184 NR:ns ## KEGG: FIC_00184 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 75 364 47 360 1036 110 30.0 2e-22 MKMKKLLLLPALLLTLSVGNVACSYHNDDNPNEYPDPQPEPEPEPEPGPDINQKYLEAAY TSNCFMVKPGQSVDIPILKAFAMWNLYAEWLGETDLMGLTPEPVLLWQDLPGLITNVGLI PGQQAEEGSIAVSTADKVGNAVIGIRIGGAIRWSWHVWVTRYDPDAELVAYGKIYTWDNN GDGVVDYTFMDRNLGATIDKAIIENTAADSLAACGLMYQWGRKDPFPGDHDFRRSNSTDY NYFESKPIYDASGTLLTERSATGGTGIRSVKTDTDLTRTGLAKSILEPMTVLLGAEGYSD WYCSSKPTEIKKCDTLWCGTRGKTPFDPCPEGWRVPANKNDKFIWNGLENATTDYSALGV FPYAGFRYRAGGGCLKNSGFGVSIWSGTPPTGIGNAHELSIYISPSEKRPIIKSDVSCRS DGQSIRCVKDA >gi|313158424|gb|AENZ01000041.1| GENE 47 44773 - 46014 1605 413 aa, chain - ## HITS:1 COG:no KEGG:Weevi_2079 NR:ns ## KEGG: Weevi_2079 # Name: not_defined # Def: PKD domain containing protein # Organism: W.virosa # Pathway: not_defined # 27 405 34 400 401 268 45.0 3e-70 MKKIFAILLSALALGGCENDGIPYVSLGLDDLYKVARMQTVDLKPAFTGESYRWTVKTAS GADSLLSEEKDYIFLMQYPGTYDVTFQIFDPVNPIVHRMTFYVVEEEVEYSPYISTVYEY 
CPAPGQFVNAMPEYREGDTAETMRQKAEEAIAGKMQSGVSLGAYGGYITFGFDHTVVNVP GEYDIRIDGNSFNSAAHPGVDGGSSEPGIIMVMFDENQNGKPDDKWYEIDKNPWYTDEAA TYGYEITYHRPAPDHVPTPGEGNDSALTDMTYIKWEDNRGQTDCVYKNKFHDQDYYPKWI DADEMTFRGTLLPKNAVDVSGNGSYYLQYMFKYGAYADNYPNEAKDENGNYYNGFDIGWA VDPETREPVHLPGVDFIRVYTALNQYCGWIGETSTEIFMARDMHIYVRPDQHQ >gi|313158424|gb|AENZ01000041.1| GENE 48 46011 - 47168 1443 385 aa, chain - ## HITS:1 COG:no KEGG:Poras_1287 NR:ns ## KEGG: Poras_1287 # Name: not_defined # Def: hypothetical protein # Organism: P.asaccharolytica # Pathway: not_defined # 18 380 22 389 389 355 48.0 1e-96 MKRYLFIVLAGCCLAAGCRKIELVNPTEYEVLPFGEDPDADPIGMYLLNEGNMGSNKADI DYLDYRTAVYARGIYAEKNPNVVKELGDVGNDIQVYDGRLFAVINCSHKVEVMDAYTARR ITQIDIPNCRYIRFKGKYAYVSAYVGPVAMDPNAQKGAVFKVDLDTYKIVGQVTVGYQPD ELAIIGGRAYVANSGGYRAPNYDSTVSVIELETMRQMYKIDVAINLSRIKADAYGNLWVS SRGNYDDVPSNLYRLEPNGGRYQVAEAMNIPASNMIISGDSLYVYSVEYSNQTSKNTVTY AIIDVKKKRVVSRKFITDGTELDIVIPYGIAIHPRNGDIYVTDAKNYVSSGVLHCYSREG VHKWSVRTGDIPAHMVFLERKQTRQ >gi|313158424|gb|AENZ01000041.1| GENE 49 47181 - 49259 2824 692 aa, chain - ## HITS:1 COG:VC0156 KEGG:ns NR:ns ## COG: VC0156 COG4206 # Protein_GI_number: 15640186 # Func_class: H Coenzyme transport and metabolism # Function: Outer membrane cobalamin receptor protein # Organism: Vibrio cholerae # 68 678 50 597 611 62 20.0 4e-09 MKQFYLSILCLAFIAGGGISTAVQASDKADSTTNRRYVGKLDSIQNINEVVIVGTPVIPK YREVIPAQVLKDVDLQRLNSFSVADAIRYFAGVQLKDYGGVGGLKTVNIRSMGTNHMAVF YDGIQLGNAQNGQVDLGRFSLDDVEEISLYNGQKSDIFQSAKDFGASGTIYITTRRPRFE EGKRANFKATMKTGSFGLINPSMRYEYKISDAVSSSFSGEWINATGRYKFRYRRRNTLGE IAYDTTAYRQNGDINATRMEAGLHGKMADGQWTVRAYTYNSERGIPGAIVNNVFRHGERL WDTNSFIQGTLTNRWSKKFRTLFNTKYAADYTHYENQDAKLIQTKNTYKQKEFYFSCANL YNIFEHWEVSLAYDFQWNTLDATFYMPTESDGTFPFPTRYTHMAALATAFEWGRLKMQAS VLGYFVHEKVERFEARPDKAVATPAFFGSFKVLKNHDLSVRAFYKRMFRMPTFNDLYYTD MGNAFLKPEYAEQFNVGLKYTRNFEKSPFMRMLDVSVDAYYNKITDKIIAYPKGQQFRWT MLNLGEVEIKGVDAVLNALFTLGKLEVTTKFQYTYQKAIDITDPADTYYRDQIPYIPWHS GSAILALFYKGWGLNYSFIYTGQRYNQQENIPRNHTQPWYTSDLSLQKTFKIRTRTLKAT AEVNNLFGQDYDVVLNYPMPKTNFRFVVSVDL 
>gi|313158424|gb|AENZ01000041.1| GENE 50 49624 - 50244 496 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514016|emb|CBK63226.1| ## NR: gi|291514016|emb|CBK63226.1| hypothetical protein AL1_06150 [Alistipes shahii WAL 8301] hypothetical protein HMPREF9720_2500 [Alistipes sp. HGB5] # 1 206 1 206 206 347 100.0 2e-94 MNDKNKIDLSIEMYEDLKETLVKAIKNAKAGGVGNAGWENTQLERIERLIEAAERSQGQI AQQLEQLQRNTALSGAKTEQMQEFTDRYNALLAQLAGRLEVIDKENRTIITEVLSFKKSL SEFKNEVDTTKFEKLANNTLWEMKTQIRYLKQPPIVLRTIVFLTVLSILTILATGYLYRD RNEWRGEATYWYKQSDQYKTTQTNKK >gi|313158424|gb|AENZ01000041.1| GENE 51 50263 - 51333 794 356 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A6126 NR:ns ## KEGG: HMPREF0659_A6126 # Name: not_defined # Def: relaxase/mobilization nuclease domain protein # Organism: P.melaninogenica # Pathway: not_defined # 1 273 1 277 306 269 44.0 1e-70 MVGKVISASSFSGTVGYVMKEESRILEAEGIMPPEVKDMVQDFKDQTLLNPRLKNTVGHI SLSFSPKDAPRMTDALMTRIAKEYMQKMGITDTQYLLVRHLDQPHPHCHLVYNRVGNNGQ TISDKNIKIRNAKVCRELTEKYGLYLAPGKEEVRREQLREPDKSQYEIYDAIEGSLPKCK NWNELERKLKDQGITMRYKYCGNTDRKQGVLFSKNGFEFSGSKIDRAFSFTKLDNRFTLL QQQALHRAELLGNLSAAAGNYRSAFAGLFGGMGGGTREEPAAPSPGGKIDAIPLPPASSP VGVSAAQLQRKPGESPEEHIARITALLDAAAAAMAIAAMERRRRMEEQKRRAKMKI >gi|313158424|gb|AENZ01000041.1| GENE 52 51323 - 51691 245 122 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A6127 NR:ns ## KEGG: HMPREF0659_A6127 # Name: mobC # Def: bacterial mobilization protein # Organism: P.melaninogenica # Pathway: not_defined # 6 118 3 115 117 72 35.0 5e-12 METPKKTYSKQGGRPKVGIGRIRKYVVSTRLSPERKLRFSALCREAGQPPAEVLRQLIDR GTVRARITREQLDFMAQLKGVARNLNQLTRLANVKGLAAVRARHAAIVTAIEKLLKQICD GR >gi|313158424|gb|AENZ01000041.1| GENE 53 51795 - 52619 352 274 aa, chain - ## HITS:1 COG:no KEGG:PG0841 NR:ns ## KEGG: PG0841 # Name: not_defined # Def: mobilizable transposon, excision protein # Organism: P.gingivalis # Pathway: not_defined # 3 224 71 278 297 153 41.0 1e-35 MRLERCDSREAIRRLQNGEKRDAGSVSLSPGVGERPAAGGPLPVLRPAAVPALRILSDAP LRHPALVGYLSSRGIVPSVAAAFCREVRYEVNGRAFFAVGFRNDAGGWELRSERFKGGSS 
PKHITTLDNRSDTVIAFEGFMDFLAYLSLKHPERLHIDAAVLNSVVNLPKAVPFLSRHPV IHAFFDNDEAGRKTTADLIRLCPRSEVIDQRYFYSGHKDVNDYLIARIKGHAEKLPATKN ISETKAVQALRNNLQALKAETAEIEPPRRKGVKI >gi|313158424|gb|AENZ01000041.1| GENE 54 52665 - 52817 145 50 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158429|gb|EFR57824.1| ## NR: gi|313158429|gb|EFR57824.1| conserved domain protein [Alistipes sp. HGB5] # 1 50 1 50 50 89 100.0 1e-16 MNDLKNISIRQFLARRGILPKYERNGYGMYLSPLREERTPSFKVDYVRNL >gi|313158424|gb|AENZ01000041.1| GENE 55 53044 - 54252 1321 402 aa, chain - ## HITS:1 COG:all8519 KEGG:ns NR:ns ## COG: all8519 COG5545 # Protein_GI_number: 17232892 # Func_class: R General function prediction only # Function: Predicted P-loop ATPase and inactivated derivatives # Organism: Nostoc sp. PCC 7120 # 31 388 306 652 836 86 23.0 7e-17 MANKVSDSVAAPGKQSRNERIEEFLREHYAFRYNTVKSRAEFRSSDGEFLPVTKYRLNSF RRELDRTIGISTSAENLRSMLESDFSERVNPVQAYFHKLPPATGTQAIDELTATVTVRNA RHWSEYLTKWLVGVVANATNDLGCQNHVCLVLTGERGKFKTTWLDNLCPRSLASYLFTGK IDLQNKDVLTLVAEYLFINIDDQLKALNKRDENELKNLITAPSVKYRRPYDVYIEEYPHL ASFMASVNGNDFLTDPTGSRCFLPFEVLSIDIDRAIWVNMDRVYAEARTLLNDGFRYWFD EAEIEELHRGNAAFHVQTIEYEMLLKGFEKPPEHAVADCFMTTVEILNYLRSYSSLNLSE KRMGEALRKAGFERRSKRIGGNPVYGWVIEKISPNPFVSYGL >gi|313158424|gb|AENZ01000041.1| GENE 56 54242 - 54553 371 103 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2217 NR:ns ## KEGG: Bacsa_2217 # Name: not_defined # Def: phage transcriptional regulator, AlpA # Organism: B.salanitronis # Pathway: not_defined # 1 101 1 101 104 140 67.0 1e-32 MKDLLSILRDAPGSIRLEVSGEDLLAFSKSLIDRAKEELAAQVAEARKERYLTKEQVKEL CDVCDATLWHWNRKGYLKAVKVGNKVRYRTSDIQRILGEQDGK >gi|313158424|gb|AENZ01000041.1| GENE 57 54709 - 55638 685 309 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2218 NR:ns ## KEGG: Bacsa_2218 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 308 1 315 316 202 38.0 2e-50 MGQQEYEAFKAKLREWMNTHPDEYAAFEEAMNARDYAGCQLVMFQAMSLIPRYRHLMSAK ANEGLFEHVDEIEQAARQHDLAGKIIRECEQPGKDSTLPAMLCWLYFGKSFERMVERCEE 
LRRSPDLGFLQKMTMSATIKLLISRSIKLELRTKQDWNAHREAMRLAESDRVLEWAAGTL PAEDSDEKRKPGRPSTTKSLVEMFSPVVAHPDELRQKIGEYLTKRHSQTDIARLKIALDE LRYLVVPTNIKPFRDALQAEYGSDIRIVHERGIQEAYSRLTEPLLIGSAVSSRGGEALII REIKDFLSQ >gi|313158424|gb|AENZ01000041.1| GENE 58 55652 - 56662 810 336 aa, chain - ## HITS:1 COG:SMa0592 KEGG:ns NR:ns ## COG: SMa0592 COG3550 # Protein_GI_number: 16262763 # Func_class: R General function prediction only # Function: Uncharacterized protein related to capsule biosynthesis enzymes # Organism: Sinorhizobium meliloti # 135 289 197 343 390 105 40.0 1e-22 MNPITICPSTLAEGYDSYSPIAIKHLFDGRQVSPFLDYTPIDDDNNSAAQEEFLHNQERI SLSGVQPKYSMIVRNGKLALTQKGEQGHYILKPKLSDFRNRIYSSANENLTMQIASQVFG IETAANGLCFFKGGEPAYITKRFDVKPDGTKRRKEDFASLAGLTTQNGGKNYKYEYLTYE ECGELIRRYLPAWKVETLKFFDLIIFNFLICNGDAHLKNFSVLETESGDFRLSPAYDLIN TKLHVDDRIFALDKGLLKDNAAESMPYGMTNSTTFREFGKRLGLPDKTIRRELDKFCTSY PLLDTLIANSYLSDELKENYRNMYLGRRDSYLKIGL >gi|313158424|gb|AENZ01000041.1| GENE 59 56665 - 56982 138 105 aa, chain - ## HITS:1 COG:no KEGG:BF0796 NR:ns ## KEGG: BF0796 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 105 2 106 109 145 67.0 4e-34 RQGAVYMNGKLAGILTEISPTEYVFKYDDTYYADDMQPAVSLTLPKTQQEYRSAYLFPFF SNMLSEGRNRIVQSRMLHIDENDHFGILLATAQTDVAGAVTVKPL Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:25:39 2011 Seq name: gi|313158422|gb|AENZ01000042.1| Alistipes sp. HGB5 contig00046, whole genome shotgun sequence Length of sequence - 952 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:25:39 2011 Seq name: gi|313158420|gb|AENZ01000043.1| Alistipes sp. HGB5 contig00100, whole genome shotgun sequence Length of sequence - 616 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 3 - 615 706 ## COG3537 Putative alpha-1,2-mannosidase Predicted protein(s) >gi|313158420|gb|AENZ01000043.1| GENE 1 3 - 615 706 204 aa, chain - ## HITS:1 COG:SP2145 KEGG:ns NR:ns ## COG: SP2145 COG3537 # Protein_GI_number: 15901958 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Streptococcus pneumoniae TIGR4 # 1 201 242 438 694 159 41.0 3e-39 WDEVLGRIEVEGGTTDQYRTFYSCLYRSTLFPRKFYEIDKNGNILHYSPYNGEVLPGYMY TDTGFWDTFRSLFPLLNLVYPSVNAEIQAGLANAYRESGFLPEWASPGHRGCMVGNNSAS VVSDAILKGVTPEEDIATLYEAMLAGRRKVHPTVSSTGRLGHEYYNTLGYIPYDVGINEN VARTLEYAYDDWCIAQVAKKLGKT Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:25:42 2011 Seq name: gi|313158409|gb|AENZ01000044.1| Alistipes sp. HGB5 contig00083, whole genome shotgun sequence Length of sequence - 7101 bp Number of predicted genes - 10, with homology - 10 Number of transcription units - 5, operones - 3 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 218 215 ## CHU_0617 outer membrane protein - Prom 238 - 297 2.2 2 2 Op 1 . + CDS 190 - 564 247 ## BT_p548219 hypothetical protein 3 2 Op 2 . + CDS 583 - 1038 503 ## BT_p548220 hypothetical protein + Term 1050 - 1094 6.1 4 3 Tu 1 . - CDS 1166 - 2224 398 ## Odosp_1706 regulatory protein LuxR - Prom 2395 - 2454 5.9 + Prom 3125 - 3184 4.2 5 4 Op 1 . + CDS 3215 - 3748 238 ## Odosp_1705 hypothetical protein 6 4 Op 2 . + CDS 3769 - 4665 507 ## Odosp_1704 hypothetical protein 7 4 Op 3 . + CDS 4711 - 5832 704 ## BT_p548229 hypothetical protein + Term 5856 - 5914 3.2 - Term 5803 - 5844 1.2 8 5 Op 1 . - CDS 6021 - 6278 307 ## BT_p548230 hypothetical protein 9 5 Op 2 4/0.000 - CDS 6283 - 6564 251 ## COG1192 ATPases involved in chromosome partitioning 10 5 Op 3 . 
- CDS 6633 - 7043 339 ## COG1192 ATPases involved in chromosome partitioning Predicted protein(s) >gi|313158409|gb|AENZ01000044.1| GENE 1 2 - 218 215 72 aa, chain - ## HITS:1 COG:no KEGG:CHU_0617 NR:ns ## KEGG: CHU_0617 # Name: not_defined # Def: outer membrane protein # Organism: C.hutchinsonii # Pathway: not_defined # 7 72 189 254 800 68 46.0 6e-11 MKALYTGGSNDQTLILYDDVPIYNQAHAYGILSIFSGETVQSAEVSKGYISPAYGSRLSA LTQIRTREGDRQ >gi|313158409|gb|AENZ01000044.1| GENE 2 190 - 564 247 124 aa, chain + ## HITS:1 COG:no KEGG:BT_p548219 NR:ns ## KEGG: BT_p548219 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 124 49 168 168 222 99.0 4e-57 MLPPVYKAFKVGANLNYRNLNRDLVKANTVTVGPELAYYALKGRKFSLLGIAAGTIGYQK AKAKSDLVYLAKSKAFVYGYEVGIRPEMLLSPKVALFAEYRFEMLFNSILRNNNHVGLGC VIYL >gi|313158409|gb|AENZ01000044.1| GENE 3 583 - 1038 503 151 aa, chain + ## HITS:1 COG:no KEGG:BT_p548220 NR:ns ## KEGG: BT_p548220 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 151 1 151 151 274 94.0 6e-73 MRKYLYLSAVCVCMALCFVGCSKDDDEPGGKGAMYEVTIEQSGDFRSFIKSVVVVANGTR LKDGATGESLASPLILSDEELSVEKVTLTTTGKAIEFAVSGSVVDGEDGVVNEPMQWVVT VYKNGKEIEKKSLSFRDGKEIGMDDLNLYYN >gi|313158409|gb|AENZ01000044.1| GENE 4 1166 - 2224 398 352 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1706 NR:ns ## KEGG: Odosp_1706 # Name: not_defined # Def: regulatory protein LuxR # Organism: O.splanchnicus # Pathway: not_defined # 1 352 1 353 353 530 98.0 1e-149 MIFSTKAASFLSSIKTQTYDKKEREMIITYQQKRVFHLSLLMLVLCAPIYIYSVPFPNEQ FYYINSVLFLFIIMCTLAYFKKRVNLTTTFSIILIAIHIEIFIEIIYCSICSGYEYSYQR ALIMSNITISLLFTMLSICAYMSNISILLSSLTIASYTICTLITDEPFLYSYLPLIIIIY TMIPLLGRSLHSNISSLLKSSNLLKEEEEMLLKRLQMKKEELFAFAELLSENNPEEKTSS LLNIIGEQSKENLFTALAAYQKKEKSKLDTIRRIYPNLSPSELNICRLILQDKTVSQICE LLHRSSGNITSQRANIRAKLGLKKSDNLKEALQERMRLYEEEHRQEDFSAMR >gi|313158409|gb|AENZ01000044.1| GENE 5 3215 - 3748 238 177 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1705 NR:ns ## KEGG: Odosp_1705 # 
Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 177 230 406 406 357 97.0 1e-97 MAVKTNLLYDAVLIPDIGVEFCLGKNWSVAGNWMYAWWKSDRKHNYWRIYGGDVELRRWF GRRAVEKPFSGHHVGLYGQIVTYDFELGGKGYLGDKWSYGGGVAYGYSLPVGHRFNVDFT LGIGYLGGSYKEYIPLDGHYVWQTTKNRRWFGPTKAGISLVWLIGRGNYNEKKGGRQ >gi|313158409|gb|AENZ01000044.1| GENE 6 3769 - 4665 507 298 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1704 NR:ns ## KEGG: Odosp_1704 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 282 9 290 306 568 95.0 1e-161 MFIFLWIGLTACEHKDLCYDHPHFAKVRVVFDWTKISNHDKPEGMRVVFYPTDDESNTWI FDFPGGEDGEVELPENDYRVICFNYDTDGMVWKENGSYTLFTADTRDVQSPDNRTMAVTP PWLCGDHIDRVILKDIPGGSAEIVRLTPVNMVCHYTYEVNGLRGLDRVADLRAALSGMSG SLNMSGDSLPAGLSESLLFDGMVSRNQIIGGFYTFGHSALEGEPNVFRLYLKNRSGSMSV LEQDVSDQVHDVPVAGHVGDVHLVLNFDYEVPSEPGSDGAGFDVDVDDWDDVNMDIVL >gi|313158409|gb|AENZ01000044.1| GENE 7 4711 - 5832 704 373 aa, chain + ## HITS:1 COG:no KEGG:BT_p548229 NR:ns ## KEGG: BT_p548229 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 373 1 373 373 697 95.0 0 MKKSTVMLWAIFGALLMGCSDEEIANVETSSRNAIGFNVLSNAAETRAIPTTPDNLTSTD FDVFAFTTDGTAFMGKVDTDFGHDGVKIVYKNGKWDYDDANDLRYWPTEALDFYAFNPGT VSEDMIVFYSWEATKDVQKISYTCMDEYGSGTTHANYDVMYAMAKGQTKDMNNGIVKFNF KHILSQVVFKAKTQYDNMQVDIDVIKIHNFKFAGAFTLPAAADGTGSWSSSDLAFPHAFT VVKNANITVNSNTEATDITTNTPMLNIPQELTAWKVSETATKSKLEADNAKQCYLEIACK IRQSGAYLLGSASEYKTIYVPFGDTWEQGKRHIYTLIFGGGYDDQGEAVLNPIQFDAETT GWVDADKDVNVQP >gi|313158409|gb|AENZ01000044.1| GENE 8 6021 - 6278 307 85 aa, chain - ## HITS:1 COG:no KEGG:BT_p548230 NR:ns ## KEGG: BT_p548230 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 85 1 85 85 135 98.0 4e-31 MAKKNDLKNSMSAGLTGGLDSLTQSTAGQKEAQKPKKAKTVHCNFVMDETYHQNLKLIAI RKGDSLKSVLQEAISDYLDKNSSLL >gi|313158409|gb|AENZ01000044.1| GENE 9 6283 - 6564 251 93 aa, chain - ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 
# Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 89 162 250 253 72 43.0 2e-13 MQVVHKVQQRLNSDLSIAGVLITQYDGRKNLNKSVSELVQETFQGKVFSTHIRNAITLAE APTQGQDIFHYAPKSAGAEDYEKVCNELLTEIK >gi|313158409|gb|AENZ01000044.1| GENE 10 6633 - 7043 339 136 aa, chain - ## HITS:1 COG:ML1367 KEGG:ns NR:ns ## COG: ML1367 COG1192 # Protein_GI_number: 15827712 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium leprae # 4 126 33 160 287 107 46.0 8e-24 MSKAKVISVLNHKGGVGKTTTTINLGGALRQKGYKVLLIDLDGQANLTESLGFSAELPQT IYGAMKGEYDLPIYEHKDGLSVVPSCLDLSAVETELINEAGRELILAHLIKGQKEKFDYI LIDCPPPRCRCSRLTH Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:26:31 2011 Seq name: gi|313158355|gb|AENZ01000045.1| Alistipes sp. HGB5 contig00019, whole genome shotgun sequence Length of sequence - 71476 bp Number of predicted genes - 59, with homology - 54 Number of transcription units - 32, operones - 14 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 17 - 76 6.9 1 1 Tu 1 . + CDS 182 - 1789 2650 ## COG1866 Phosphoenolpyruvate carboxykinase (ATP) + Term 1830 - 1871 11.0 - Term 2099 - 2136 8.5 2 2 Op 1 . - CDS 2180 - 2830 1034 ## Riean_0621 hypothetical protein - Prom 2850 - 2909 2.6 - Term 2850 - 2905 6.1 3 2 Op 2 . - CDS 2919 - 3476 1013 ## COG0233 Ribosome recycling factor 4 2 Op 3 1/0.333 - CDS 3479 - 4045 849 ## COG0526 Thiol-disulfide isomerase and thioredoxins 5 2 Op 4 . - CDS 4113 - 4826 1260 ## COG0528 Uridylate kinase - Prom 4853 - 4912 1.6 + Prom 4804 - 4863 2.3 6 3 Tu 1 . + CDS 5050 - 6948 2545 ## COG1154 Deoxyxylulose-5-phosphate synthase - Term 6915 - 6942 -0.9 7 4 Op 1 . - CDS 6961 - 7722 1084 ## COG3022 Uncharacterized protein conserved in bacteria 8 4 Op 2 . 
- CDS 7729 - 8568 758 ## gi|313158370|gb|EFR57769.1| hypothetical protein HMPREF9720_0888 9 4 Op 3 . - CDS 8580 - 9353 1176 ## COG0731 Fe-S oxidoreductases 10 4 Op 4 . - CDS 9410 - 9706 163 ## 11 4 Op 5 . - CDS 9718 - 9936 175 ## gi|313158364|gb|EFR57763.1| hypothetical protein HMPREF9720_0890 - Prom 9979 - 10038 5.4 + Prom 10576 - 10635 2.1 12 5 Tu 1 . + CDS 10696 - 11853 1750 ## COG1301 Na+/H+-dicarboxylate symporters + Term 11864 - 11896 2.1 - Term 11845 - 11880 1.5 13 6 Tu 1 . - CDS 11893 - 12006 68 ## - Prom 12026 - 12085 2.2 + Prom 11866 - 11925 4.9 14 7 Tu 1 . + CDS 12022 - 13293 2102 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes + Term 13297 - 13331 1.8 + TRNA 13590 - 13662 80.8 # Gly GCC 0 0 - Term 13760 - 13809 7.2 15 8 Tu 1 . - CDS 13827 - 14021 143 ## gi|313157795|gb|EFR57206.1| conserved hypothetical protein - Prom 14155 - 14214 5.9 + Prom 14369 - 14428 3.0 16 9 Tu 1 . + CDS 14465 - 14650 144 ## + Prom 14763 - 14822 3.0 17 10 Tu 1 . + CDS 14871 - 15449 559 ## gi|291514535|emb|CBK63745.1| hypothetical protein AL1_12740 + Term 15542 - 15585 5.5 - Term 15457 - 15496 8.0 18 11 Op 1 . - CDS 15514 - 15840 505 ## gi|313158397|gb|EFR57796.1| hypothetical protein HMPREF9720_0896 19 11 Op 2 . - CDS 15892 - 16149 260 ## gi|313157267|gb|EFR56694.1| conserved domain protein - Term 16242 - 16290 11.1 20 12 Op 1 . - CDS 16323 - 16799 621 ## BT_1401 putative sugar transport protein 21 12 Op 2 . - CDS 16812 - 17798 1227 ## COG1073 Hydrolases of the alpha/beta superfamily + Prom 17767 - 17826 9.3 22 13 Tu 1 . + CDS 18070 - 18963 1274 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 19008 - 19049 -0.0 - Term 18821 - 18865 2.1 23 14 Tu 1 . 
- CDS 18987 - 19904 1402 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 19978 - 20037 2.4 + Prom 19918 - 19977 4.4 24 15 Op 1 27/0.000 + CDS 20057 - 21067 1254 ## COG0845 Membrane-fusion protein 25 15 Op 2 9/0.000 + CDS 21100 - 24117 4373 ## COG0841 Cation/multidrug efflux pump 26 15 Op 3 . + CDS 24117 - 25394 1654 ## COG1538 Outer membrane protein + Term 25409 - 25433 -0.3 + Prom 25724 - 25783 3.7 27 16 Tu 1 . + CDS 25828 - 26364 832 ## BT_3993 RNA polymerase ECF-type sigma factor + Term 26368 - 26404 2.0 + Prom 26402 - 26461 2.0 28 17 Tu 1 . + CDS 26547 - 27533 1218 ## COG3712 Fe2+-dicitrate sensor, membrane component 29 18 Op 1 . + CDS 27691 - 31233 5317 ## Sph21_5171 TonB-dependent receptor plug 30 18 Op 2 . + CDS 31272 - 32780 2345 ## Sph21_5172 RagB/SusD domain-containing protein 31 18 Op 3 . + CDS 32815 - 34689 2112 ## gi|313158381|gb|EFR57780.1| conserved domain protein 32 18 Op 4 . + CDS 34751 - 35692 1260 ## Dfer_2668 glycerophosphoryl diester phosphodiesterase + Term 35731 - 35773 13.5 33 19 Tu 1 . - CDS 35647 - 35886 185 ## + Prom 35844 - 35903 7.1 34 20 Tu 1 . + CDS 35957 - 37585 2154 ## Bacsa_0697 putative regulatory protein + Term 37586 - 37634 0.1 + Prom 37624 - 37683 5.6 35 21 Op 1 . + CDS 37873 - 39180 2238 ## BDI_2462 hypothetical protein 36 21 Op 2 . + CDS 39202 - 40485 1851 ## COG0673 Predicted dehydrogenases and related proteins 37 21 Op 3 . + CDS 40497 - 43889 4169 ## BDI_1153 hypothetical protein 38 22 Op 1 . + CDS 43991 - 44203 352 ## Coch_0981 twin-arginine translocation protein, TatA/E family subunit 39 22 Op 2 . + CDS 44239 - 45153 1219 ## COG0805 Sec-independent protein secretion pathway component TatC 40 22 Op 3 . + CDS 45143 - 46402 1871 ## COG0673 Predicted dehydrogenases and related proteins + Term 46419 - 46458 9.1 41 23 Tu 1 . 
+ CDS 46491 - 47888 754 ## PROTEIN SUPPORTED gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 + Term 47964 - 48006 2.1 - Term 47861 - 47908 1.1 42 24 Op 1 11/0.000 - CDS 47926 - 48651 1098 ## COG1180 Pyruvate-formate lyase-activating enzyme 43 24 Op 2 . - CDS 48653 - 50881 3902 ## COG1882 Pyruvate-formate lyase - Prom 50945 - 51004 5.6 + Prom 50839 - 50898 2.0 44 25 Op 1 . + CDS 50926 - 51420 -68 ## 45 25 Op 2 . + CDS 51615 - 52007 664 ## BF2989 hypothetical protein + Term 52115 - 52166 12.3 + Prom 52106 - 52165 2.7 46 26 Tu 1 . + CDS 52398 - 54221 2690 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 54223 - 54258 1.5 + Prom 54269 - 54328 3.3 47 27 Op 1 . + CDS 54457 - 56343 2191 ## FIC_00184 hypothetical protein 48 27 Op 2 . + CDS 56377 - 58356 2645 ## Bache_0429 hypothetical protein + Term 58373 - 58419 14.0 - Term 58364 - 58404 10.2 49 28 Op 1 . - CDS 58577 - 60796 3356 ## COG0306 Phosphate/sulphate permeases 50 28 Op 2 40/0.000 - CDS 60998 - 62446 2164 ## COG0642 Signal transduction histidine kinase 51 28 Op 3 . - CDS 62424 - 63104 914 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 52 28 Op 4 . - CDS 63198 - 64262 1575 ## COG2876 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase 53 29 Tu 1 . - CDS 64387 - 65232 1150 ## COG0077 Prephenate dehydratase - Prom 65357 - 65416 4.7 - Term 65240 - 65278 5.3 54 30 Tu 1 . - CDS 65451 - 66239 989 ## Odosp_3543 hypothetical protein - TRNA 66597 - 66668 77.8 # Thr GGT 0 0 + Prom 66704 - 66763 3.3 55 31 Op 1 . + CDS 66783 - 67733 1346 ## COG0451 Nucleoside-diphosphate-sugar epimerases 56 31 Op 2 . + CDS 67730 - 68098 359 ## COG1917 Uncharacterized conserved protein, contains double-stranded beta-helix domain 57 31 Op 3 . + CDS 68108 - 69061 1228 ## COG1073 Hydrolases of the alpha/beta superfamily - Term 69269 - 69316 16.1 58 32 Op 1 . - CDS 69394 - 70467 1493 ## gi|313158405|gb|EFR57804.1| putative lipoprotein 59 32 Op 2 . 
- CDS 70489 - 71457 1072 ## gi|313158365|gb|EFR57764.1| hypothetical protein HMPREF9720_0935 Predicted protein(s) >gi|313158355|gb|AENZ01000045.1| GENE 1 182 - 1789 2650 535 aa, chain + ## HITS:1 COG:VC2738 KEGG:ns NR:ns ## COG: VC2738 COG1866 # Protein_GI_number: 15642731 # Func_class: C Energy production and conversion # Function: Phosphoenolpyruvate carboxykinase (ATP) # Organism: Vibrio cholerae # 3 534 11 541 542 799 71.0 0 MNKMDLKKYGITGVTEIVYNPSYDELFREETKKGLRGFEKGQETEMGAVNVMTGVYTGRS PKDKFFVMDETTKDTIWWTSDEYKNDNKPVTKAAWKELKKIAAQELSGKKLYVVDTFCGA NENSRLKIRFIMEVAWQAHFVKNMFIRPTDAELENYGEPDFVVLNASKAKVKNYKKLGLN SETAVVFNLTEKMQVIINTWYGGEMKKGMFSYMNYLLPLKGMASMHCSANTNAKGETAIF FGLSGTGKTTLSTDPKRELIGDDEHGWDDDGVFNFEGGCYAKVINLSQENEPDIWNAIRR NALLENVTVDKKGKIDFSDKSVTENTRVSYPIFHIENIVKPVSKAPAAKKVIFLSADAFG VLPPVSILNAEQTKYYFLSGFTAKLAGTERGITEPTPTFSACFGAAFLSLHPTKYGEELV KKMQKSGAKAYLVNTGWNGTGKRISIKDTRGIIDAILDGSIDKAETKTLPIFDFKIPTAL PGVDPKILDPRDTYKNAKDWDVKAEDLANRFVKNFVKFTGNEEGKKLVAAGPKVK >gi|313158355|gb|AENZ01000045.1| GENE 2 2180 - 2830 1034 216 aa, chain - ## HITS:1 COG:no KEGG:Riean_0621 NR:ns ## KEGG: Riean_0621 # Name: not_defined # Def: hypothetical protein # Organism: R.anatipestifer # Pathway: not_defined # 28 202 34 201 202 114 38.0 2e-24 MKKPFRKISVLIVLFSLFGIQNAASQGYVKLNALYALVGVVNPSVEFAISPKSTLQTDIV VSPWKSINRKHMTFAIFMGEYRRYFKEHNRGWYLGANIGMMAFDMSKPYIEGWKLKFEDR YCKGYGMMIGLCVGYEHQFGERWLLDAFFGWAWMDSHYNGYSFDGKVDMYPHRPVQPENP DPFNGSSEWYPNKLGVSIGYRIFMPKRLRTDGSCSR >gi|313158355|gb|AENZ01000045.1| GENE 3 2919 - 3476 1013 185 aa, chain - ## HITS:1 COG:DR1510 KEGG:ns NR:ns ## COG: DR1510 COG0233 # Protein_GI_number: 15806522 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome recycling factor # Organism: Deinococcus radiodurans # 2 183 3 182 183 152 45.0 4e-37 MDTKTILNEASGRMQKAIDHLEEELLNVRAGKASPNALNGIMVDYFGSQVPVSGAASVTV PDAKTILIQPWDKNMLRPLEKAIIDSNIGLTPSNNGEQIRLTIPPLTEERRKELVKQIRG EAETARISLRNARRDAVEAFKKAQKEGMPEDESKDGETQSQKLLEKFSKLLDEALQRKEK EIMTV 
>gi|313158355|gb|AENZ01000045.1| GENE 4 3479 - 4045 849 188 aa, chain - ## HITS:1 COG:TP0100 KEGG:ns NR:ns ## COG: TP0100 COG0526 # Protein_GI_number: 15639094 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Treponema pallidum # 48 185 60 199 200 88 34.0 9e-18 MATKKSNRQLWIMLALIVAIVAVILLLPSCGGNSSKKAGDVESTTLVKAGDKAPDFTVEM FGGGKTDLAELKGKVVLLNFWATWCPPCRQELARVQSDVIDRFAGRDFVFLPISRGETHQ TVAAFREKTGYSFPMGLDPSQTVYDRYASNYIPRNFVIDRNGKVVLASVGYDPEEFDEMI KTIEKTLE >gi|313158355|gb|AENZ01000045.1| GENE 5 4113 - 4826 1260 237 aa, chain - ## HITS:1 COG:FN1622 KEGG:ns NR:ns ## COG: FN1622 COG0528 # Protein_GI_number: 19704943 # Func_class: F Nucleotide transport and metabolism # Function: Uridylate kinase # Organism: Fusobacterium nucleatum # 3 235 6 238 239 260 58.0 2e-69 MKYKRILLKLSGESLQGSQKYGLSPEVLQSYAEQIKAAAGTGVQIGIVIGGGNIFRGLTG AKKGFDRVKGDQMGMLATIINSLALQSALEDNGVKAKVLTSIRMEPIGEYYSKARAIEYL EAGYVVIIGGGTSNPYFTTDSASALRGIEIEADVMLKGTRVDGVYTADPEKDPTAVKFDR ITFEEVLDRRLKVMDLTAFTLCRENGLGIIVFDMDTPGNLGKVLAGEQIGTLVTGLS >gi|313158355|gb|AENZ01000045.1| GENE 6 5050 - 6948 2545 632 aa, chain + ## HITS:1 COG:aq_881 KEGG:ns NR:ns ## COG: aq_881 COG1154 # Protein_GI_number: 15606220 # Func_class: H Coenzyme transport and metabolism; I Lipid transport and metabolism # Function: Deoxyxylulose-5-phosphate synthase # Organism: Aquifex aeolicus # 4 621 3 618 628 544 46.0 1e-154 MTPEEYRLLLHIDSPEDLKRLSAEELRAYCDELRRYIVDECSVNPGHLASSLGAVELAAA LHYVFDTPADKIVWDVGHQTYAHKIITGRREAFKTKRRLGGISGFPRMSESEYDAFGGGH ASVSISAAFGMAKAAELRGEKYRVVAVIGDGSMTGGLAFEGLNNAGASKRTNLLVILNDN NMAIDQATGALKNYLLKISTSVHYNRFKQRLWGILSHTPRLLRLCQKAGNAVKQGLLNKS NLFESLNFRYFGPVDGHNLKELVRTLRALRDIEGPKLLHVMTVKGKGYLPAEHNQPVWHA PGRFNPDTGERISSPGSASRYQDVFGETLVELAERDSRVVGVTPAMPSGCSMNLLMQAMP SRCFDVGIAEGHAVTFSAGLAAAGMVPFCNIYSTFMQRAYDNVIHDVAIQDLPVVMCLDR GGLVGEDGVTHHGVFDMAAFGCVPTLAIAAPMDELELRGMMYTGLQYGHPFMIRYPRGCG 
EGRMWRGARFETLPVGRGRKLRDGADVALVTVGTVGNAAARAAARAAEEGVSAAHYDLRF AKPLDEELLLEVGAKFRRVVTVEDGALRGGVGEAVAAFFNARGLDVSVRSLGIGDEWVEH GTPAQLYALCGYDEEGILKALLETKQAPVCAE >gi|313158355|gb|AENZ01000045.1| GENE 7 6961 - 7722 1084 253 aa, chain - ## HITS:1 COG:STM0005 KEGG:ns NR:ns ## COG: STM0005 COG3022 # Protein_GI_number: 16763395 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Salmonella typhimurium LT2 # 1 251 1 252 257 137 33.0 2e-32 MQILLSCAKTMAESTSLPIPRTTSPLYGAQAGELAGQLATLSTEELAKILRVNHRIAATN RLRYSRFHDNAERALPALAAYTGIVFKRIAPADFTAGDFEYAQAHLNITSFLYGLLRPLD TIRTYRLEGDAVLPGHGEQTVFEYWQDKLTDAFLEKIKADDGILVNLASSEMKRLFDWKR ICREVRVVTPEFRIREDDRLKTVVVYTKMCRGEMTRHILKNRIGDPQQLKAFEWEGFRLN PTLSKADDWVFTM >gi|313158355|gb|AENZ01000045.1| GENE 8 7729 - 8568 758 279 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158370|gb|EFR57769.1| ## NR: gi|313158370|gb|EFR57769.1| hypothetical protein HMPREF9720_0888 [Alistipes sp. HGB5] # 1 279 1 279 279 498 100.0 1e-139 MRLLLLSLLLVAGCTATAQRRPTARELGELQDSVWRRLDILLTTDRGFAACTPERETFDG SANAARASATKEIASIADAVRSSGTDETAGIVRVHLLRADSAMKARFRRYVHDSPLIRLE GFDPDTTPYPPAPEGTPPPDGLSMDAEYAFYPAETDCIRLTIRYRGDSTVYFGTDYTVCR FQNGRWETLPGADAWDSLLIGIGRPQFPVPGDARQTKEYAYGFTARLAPRIYPSVYARYR ICKNVYMENPRRDYLLTADFTVTPFVPFTRFPEQTGDNN >gi|313158355|gb|AENZ01000045.1| GENE 9 8580 - 9353 1176 257 aa, chain - ## HITS:1 COG:MA0160 KEGG:ns NR:ns ## COG: MA0160 COG0731 # Protein_GI_number: 20089058 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 8 246 5 242 326 98 31.0 1e-20 MTSLFHDIIFGPVHSRRLGLSLGVNLLSTDSKLCSFDCIYCECGWNAEHPGGRRFNARED VRTQLEATLRRMVADGTPPDVITFAGNGEPTLHPEFEAVIGDTIALRDALCPSAKVSVLS NATQLHREEVRRALLRVDNNILKLDSAFDATARLMNNPQSPAYTVRGVVEQMKGFDGRLT VQTMFLRGECDGQKTDNTTEEEVSAWLRLIEEIRPRQVMVYSLDRDTPCRTLEKVPREEL QAIAARVEALGIPCSVA >gi|313158355|gb|AENZ01000045.1| GENE 10 9410 - 9706 163 98 aa, chain - ## HITS:0 COG:no KEGG:no NR:no 
MALINCPECGNSISDQAPHCPKCGYQISNHSFFPKTEGCFLQSMNMGCLILVIIFLFSII LSEYDIPEEDLPIIGIITIIVFGIMICYNFFKKKKQNQ >gi|313158355|gb|AENZ01000045.1| GENE 11 9718 - 9936 175 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158364|gb|EFR57763.1| ## NR: gi|313158364|gb|EFR57763.1| hypothetical protein HMPREF9720_0890 [Alistipes sp. HGB5] # 1 72 135 206 206 129 100.0 6e-29 MLIEPLYDDTILVQTSNIILLKNHLSKILGSQPPLNSFHWTFSTYTVNLSQEKIKSKYYE KHYTYYRLTITK >gi|313158355|gb|AENZ01000045.1| GENE 12 10696 - 11853 1750 385 aa, chain + ## HITS:1 COG:Cgl2969 KEGG:ns NR:ns ## COG: Cgl2969 COG1301 # Protein_GI_number: 19554219 # Func_class: C Energy production and conversion # Function: Na+/H+-dicarboxylate symporters # Organism: Corynebacterium glutamicum # 7 379 5 378 412 323 53.0 4e-88 MKFRFGLLPRVVLAIGLGVGCGFFFPEWATRIALTFNDIFGQFLSFVIPLLILGLVAPGI ADLGKNAGWLLAVTAALAYAFTLFSGFGTYLVGRAVFPALLEGSQAVLPDDAGAALTPYF TVQMPPLFGVMSALVLAFVLGLGMAYTHSVKLKGVMDEFKGIIERVIGSVIIPLLPFYIF GIFLSMTRSGQVAGVLGVFVKLIAVIFCMTVVLLLVQFSVAGLAARKNPLRMLRTMLAAY VTALGTQSSAATIPVTLAQTLKLGVRPEIASFVVPLCATIHLSGSMMKIVACALAVSMIA GLELPAGTFVGFILLLAVTMIAAPGVPGGAIMAALGLLESMLGFDETLAGLMIATYIAMD SFGTATNVTGDGAVAVIVDAIDRRR >gi|313158355|gb|AENZ01000045.1| GENE 13 11893 - 12006 68 37 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLVSVPDAQKYKKTFLAIKISRFLFGTGTHLYLFTQV >gi|313158355|gb|AENZ01000045.1| GENE 14 12022 - 13293 2102 423 aa, chain + ## HITS:1 COG:PAB1244 KEGG:ns NR:ns ## COG: PAB1244 COG0156 # Protein_GI_number: 14521874 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Pyrococcus abyssi # 26 401 23 393 398 228 34.0 2e-59 MVDIFARLEKNAGGPIGQYMSYAHGYFAFPKLEGEIGPHMVFRGKKMLNWSLNNYLGLAN HPEVRKADAEGAAKFGMAAPMGARMMSGQTVYHEQLERELAEFVGKEDAFLLNFGYQGMI SIIDCLLTPRDVVVYDAEAHACIIDGLRLHKGKRFVYGHNDMDSLRLQLQHATDLAEEQK GGVLVITEGVFGMKGDLGKLDEIVALKKDFQFRLLVDDAHGFGTMGEGGRGTASHFGVTD GVDVLFNTFAKSMAGIGAFVCGPRWLVNLLRYNMRSQLYAKSLPMPMVMGALKRLELIRN 
HPEYQQKLWEIVRALQNGLKENGFEIGVTNSPVTPVFMKGGIPEATNLIVDLRENHGIFC SIVIYPVIPKGEIILRVIPTAAHTLDDVNYTIAAFKSVRDKLEGGIYAQMPIPVRADEGF KVR >gi|313158355|gb|AENZ01000045.1| GENE 15 13827 - 14021 143 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157795|gb|EFR57206.1| ## NR: gi|313157795|gb|EFR57206.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 64 1 64 64 76 59.0 8e-13 MREIREQHNHTQEFLTENAHLHLSHYEHGRKLPTLGTIIKFCKYYNLSLKEFFGEMTYPK EPKK >gi|313158355|gb|AENZ01000045.1| GENE 16 14465 - 14650 144 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRRIRFRNYPKAQNPHYILLLWHNTGRTFLPFDFARQDVRFWAGERGNALTPHDPARRA S >gi|313158355|gb|AENZ01000045.1| GENE 17 14871 - 15449 559 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514535|emb|CBK63745.1| ## NR: gi|291514535|emb|CBK63745.1| hypothetical protein AL1_12740 [Alistipes shahii WAL 8301] # 1 192 1 192 192 322 83.0 8e-87 MNQHDPDSEQALRAIDRFYAMQVSICIAALGRMRSAELADKNIILRAIMCDMPDCGICAD PTAEDYGVDFWVDRAEYVNDTTFNLYVTQAEENVLWNIEKKGQKPGRLVAPQQRITMRTD GISRPPYIPLIYLTVLSSKLMAMAMSDVLPNVDLSTYYDERVMASLGERIAGVLDRMTEM QDGEEEGGNPVS >gi|313158355|gb|AENZ01000045.1| GENE 18 15514 - 15840 505 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158397|gb|EFR57796.1| ## NR: gi|313158397|gb|EFR57796.1| hypothetical protein HMPREF9720_0896 [Alistipes sp. HGB5] # 21 108 1 88 88 152 100.0 7e-36 MSCNNSYMHLRVKAVQEFDQMYYEPEYKAKCHKRVWKRLGRYIFGISYQSYLDYLKMDVS DIPPTPFEARQAQLKLVDKLLERELERMKHPVRREKPEEWKKEPVEQG >gi|313158355|gb|AENZ01000045.1| GENE 19 15892 - 16149 260 85 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157267|gb|EFR56694.1| ## NR: gi|313157267|gb|EFR56694.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 81 1 80 90 84 50.0 2e-15 MTIKTCKFRIGDVYLFHTTDPGCDSRTSLWGIVGNRDAENRICLETSSADLRKYNYWTFL PAEYQFCRLSTREELRDFSFNLNRN >gi|313158355|gb|AENZ01000045.1| GENE 20 16323 - 16799 621 158 aa, chain - ## HITS:1 COG:no KEGG:BT_1401 NR:ns ## KEGG: BT_1401 # Name: not_defined # Def: putative sugar transport protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 7 76 10 79 401 102 74.0 7e-21 MKKRLAYSPGRLVAILTFGVFGIINTEMGVVGIIPQIAETFGVTVPQAGWTVGVFALIVA VSAPVMPLLFSGTDRRTGSVVFAYDLNPRYGEQLPSVRLTGLAPDLRYRIREINLMPGQN PRIEHNDRVVSGDYLMKAGLKLFSTDRARSVVVELKAE >gi|313158355|gb|AENZ01000045.1| GENE 21 16812 - 17798 1227 328 aa, chain - ## HITS:1 COG:RSc0206 KEGG:ns NR:ns ## COG: RSc0206 COG1073 # Protein_GI_number: 17544925 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Ralstonia solanacearum # 12 328 26 342 342 454 70.0 1e-127 MMHNISSAQTDADNFYRSDAVTAEKVSFPNQYKMKVGANLFRPKEMKPGEKRPAIIVGHP MGAVKEQAANLYAIKMAERGFVTLAIDLSFWGDSEGEPRNTVSPEIYAEDFSAAADFLGT REFVERSGIGAIGICGSGSFAVSAAKIDPRLKAVATVSMYDMGAASRSGLKNALTLEQRK RILAEAAEQRYAEFTGGETAYTGGTADELTENSTPVEREFYAFYRTPRGEFTPEGATHRT TTHPTLASNVKFMNFYPFNDIETISPRPLLFIVGENAHSREFSEDAFRRAAEPKELYVVP GAGHVDLYDRTGLIPFDKLEEFFGNSLK >gi|313158355|gb|AENZ01000045.1| GENE 22 18070 - 18963 1274 297 aa, chain + ## HITS:1 COG:all2035_2 KEGG:ns NR:ns ## COG: all2035_2 COG2207 # Protein_GI_number: 17229527 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Nostoc sp. 
PCC 7120 # 214 294 54 134 144 63 35.0 5e-10 MDEILKIDTIDRYNKLFGFETRHPLVGVVRFDTAESQGNYRMTMGFYSVFLKETRGCRIN YGKTGYDFDDQTVVSIAPGQTVGYTDIEGIPTKSVGLLFHPDFIRGTSLGRKIRKYTFFS YEANEALHLSEEERTIVLDCLKKIEMELRHAIDKHSKGLIATNIELLLDYCMRFYERQFV TREDLNLDALARFERLLDDYLSEGVAAREGLPSVRYFADKICLSPNYFGDLVKKETGKSA QEYIQLKMIDAAKESLLDPDKTIGQVAGELGFQYPQHFVRFFRRYAGCTPNQYRTRG >gi|313158355|gb|AENZ01000045.1| GENE 23 18987 - 19904 1402 305 aa, chain - ## HITS:1 COG:PA0248 KEGG:ns NR:ns ## COG: PA0248 COG2207 # Protein_GI_number: 15595445 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Pseudomonas aeruginosa # 167 304 156 287 288 84 33.0 3e-16 MNEKDIQSFSLATLIDVCGRRPGDICYDGCLIASRVNVQDEIDIDLFRYPTRIDAFVILL CSKGSGTFTSNLTRHTLTENSIFVHLPGSIIQAESTEEIALHAVICEEEFIRRINIDIRL LSQLFLHVEKQPCLKLDEKEWTGITRSFEELGAEGTEMPADVYSAEVIRSIIRTLAYKVC RVIGRHIETSGERQISARSRNDEYFSQFMNILGKHYTQERSVGFYAGQLNLTPKYLTTLI RKTSGRTAVEWIDDYVVLEAKNLLKYSTMSIQEIAYYLNFSNQSFFGKYFKSHTGMTPSA YRIGR >gi|313158355|gb|AENZ01000045.1| GENE 24 20057 - 21067 1254 336 aa, chain + ## HITS:1 COG:PA3677 KEGG:ns NR:ns ## COG: PA3677 COG0845 # Protein_GI_number: 15598873 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Pseudomonas aeruginosa # 4 334 11 351 367 103 28.0 4e-22 MKRTIFWALLLIAAGCSSPKNTRTADPLRVTTIVAAPSAGFGAAVYVGSVEEEASASLSF PVAGTVARTLADEGQRVRKGQLLAELDSTSARQTFDAARASLEQAEDACGRLRQLYEAQS LPEIKWVEAQTRLRQAESLFEIAKKNLSDCALYAPFDGVVGERRASAGETVLPGVPVMNL LQIGTVKVRFSVPEQEIAAIGADSRMRITVPALGDRTFQAGKIEKGAVANPAAHTYDVRA ALANAGGELLPGMVCRVEVSPAGAAEQIALPVRAVQQAGDGSRFVWTVRGDSAVRTAVTT GRLVNNAVVLEDGVRSGDRIVVDGMQKIGEGSKVVW >gi|313158355|gb|AENZ01000045.1| GENE 25 21100 - 24117 4373 1005 aa, chain + ## HITS:1 COG:VC1673 KEGG:ns NR:ns ## COG: VC1673 COG0841 # Protein_GI_number: 15641677 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 3 1000 28 1027 1037 343 26.0 1e-93 
MRNFRITFLLVGCLFVFGIYGLVHIPKQEFPEYTIRQGVVVGVYPGATSEEVEEQLAKPL EQFLMTYKEVKRSKTTSTSQNGMCYVMVELNDDVNDKDEVWSKIKHGLAAFKMQLPAGVA ALVTNDDFGDTSALLITLESDTRSYRELKGYMDDLSDRLRRIESVSNLRPYGVQQEQISV YADHDRLAAYGIGEKVLSAALASQGLTPAGGSVSNGETETPIHIAPSLAGEREVAEQIVW SDPEGHVLRVKDVARVVREYDDPDSYIRNNGHRCVLLSMEMRGGYNIVEYGREVDEVLHA FMEEELPSDVAVQRIADQAKVVGDSVHSFLRDLFVAMAIIILVMMLLFPLRSAVVAAVTI PLSTFISVGVMYLCGIPLNTVTLAALVVVLGMIVDNSIVVIDGYLDYIGRGHSRWYAAVE SAREFFPSLLLATICICMIFYPILFTMTGMMRDFLTYFPWTITINLMVSLLLAVLVIPFL EIVIIPAVQPRKEGRKSVTDRVHDVYRRVLAWTFRHGWLTISLGLASVAVSLVLATQLKL RMVPFADRDQFAVEIYLRPDTPLERTGAVADSVYRMLRADGRVKSVTSFVGCSSPRFQMS YAPQIAGKNYAQFIVNTTSIDDTEDILDRYADAWADRFPEAYVKFKQLDYQNVPSLEFRF YGSDIDSLRAAGDRLMDRMRRMPELMWVHSDYEDPRAVVDVRLDPVTAPQLGVTRTLAAV NLALAAGDVPVGAVWEGDYKLPVVLKNDVRRGERSLSDVGDTYVSSPVPGVSVPLRQIAD VGPAWSESKIVHRNGMRCLTVTADLKRGANAMRVNSRISELIDKELTLPAGIATELGGAY EFDWETIPPIASGLTISLVIIFFFILVNFRKFGITLVVMASMSLCLFGAVVGLRIADFTI GLTSVLGFITLLGMIVRNVILMYQHAEDKRKVCHWSGRLAAYDAGKRRMVPIFLTTATTA VGVVPMMLSGSTFWAPVGITIFAGGIGSLILVVTILPVLYSKIYK >gi|313158355|gb|AENZ01000045.1| GENE 26 24117 - 25394 1654 425 aa, chain + ## HITS:1 COG:FN1273 KEGG:ns NR:ns ## COG: FN1273 COG1538 # Protein_GI_number: 19704608 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Fusobacterium nucleatum # 19 419 7 408 413 80 22.0 5e-15 MKKAVVFLFLCVAAAPAAAQTLTLEECRAAAVEHNRTLRNSRLDLDAASQTHREAFTNYF PQISASGGLFQAQHGLVQADFAVPQMGTLPVSMVKRGIIGSVTAVQPLFAGLKIVTGNKL ARLGEEVGRLQLQQTEAEVRERTDACFWQVVSLRDNLSTLDAVERQLAEIRRQVELSVKA GLVTNNDLLRVELRQQEIASNRLKVENGLKVSKMLLAQHIGVDWRGFDVANAAFGEPEAP AAYYVPVEEALDRRTEYLLAEKNVEARKYEKRMERGKRLPTVGIGAGYLYYNVTDKSVDD GMVFAQVSVPVSEWWGGAHALKKARIREQQAENDRLQAREMLAVEIERTWSEVQESYAQI LLMRRSVESSAENLRQNRNFYKAGTAPLTDLLDAETLYTQSRNDLTSACASYRTALAKYM RVTGR >gi|313158355|gb|AENZ01000045.1| GENE 27 25828 - 26364 832 178 aa, chain + ## HITS:1 COG:no KEGG:BT_3993 NR:ns ## KEGG: BT_3993 # Name: not_defined # Def: 
RNA polymerase ECF-type sigma factor # Organism: B.thetaiotaomicron # Pathway: not_defined # 13 177 8 171 184 125 38.0 1e-27 MSDLDKQLMNADSFDELYLAYYPAMLSYARMFLRDQWAEDVVQDVFFNIWKNRHRISTDD PLYKYLLKAVYNRAINYIWKHKRDTEYRSWYGSQIDRMVFDYYDPDKNPILAKIYDNDMR QQLRQAVDELPDKRREIFRMRFFENMSNKEIGERLGLTVSTVENHMYLALKNLRDKLL >gi|313158355|gb|AENZ01000045.1| GENE 28 26547 - 27533 1218 328 aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 118 288 145 311 354 69 27.0 7e-12 MKNELIYKYLEGNATPAEEQDVLSWISESEANKAEFCEIRALWSVRDRNSVENDPQRILK SLKALHARIDADASPHRSKRARIIRWAGWSAAAMLGVALVVGGYLVGHDMHKAAMESYYT FTNTGRKPESLRLLDGTQVWLAQGSTLSYGKTFSPHDRTVKLDGEAFFDVASDRDHPFFV KTNTVVVKVVGTSFNVKMYKDTDDTEVVLERGVVKLQFGDNDNMITMQPGQKIYYSSASH DISISEVNVEYLMLLKYGMISMSDVRVAEIIRKVEEIYSVDIETVAPLDDDRLYNFNFLK SNTLDDVLDIIEKMSGVKCRPAPAAGAE >gi|313158355|gb|AENZ01000045.1| GENE 29 27691 - 31233 5317 1180 aa, chain + ## HITS:1 COG:no KEGG:Sph21_5171 NR:ns ## KEGG: Sph21_5171 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: Sphingobacterium_21 # Pathway: not_defined # 126 1180 28 1056 1056 850 44.0 0 MKKNLSVFRTGCCMLMLLLCNVFTLSAAEELYAQHYNVTLSLRNATVKEAVDAVTRQTGI AFSYESSVGVMPLGNVEIRVDGGAVDAVLGPLFADKGIAWQVVDGVVILTRQNSPPQSDS GRRLTVRGRITAGAGESIVGATVMVKGTQTGVSSDVEGNYEIVVPGPGSVLVFNYLGYQP QEIAVGNRTQIDVRLAEDNNVLDDVVVIGYGTQSRRTLTSAVSKVSGADLEGVPVNSIGD ALKGKVPGVRVSTANNQPGSDPVFLIRGGSSINQSNAPIIIVDGVRREMTGLNTNDIESI EILKDAASAGIYGASASNGVILVTTKKGNRAKGPQITFEAQAAWTSPATKFDLMDARDYV VTMRRALNDCPGYTYGQQVLTGANSAGIGNGDSSIWTTRYLQEGESVPRGWSSVVDPVDP SKIIVFQDNDQQSQWFDDSYWMQYYVGVNGGSDKVTYAASASYMKDGGIGINTDFSRFTF HGNTSFKITKSLTAGTTFDYSETSGNEIPSSGIGNYWTILGRGMFMPATHRDYLEDGQPG QGTNSTTISAAWFDRYYTNNYVTRRSTANFNLKWDIVDGLSAFAQIANHNSYKNSYQYLA GNAISQTRQTYEGWSQTNRLSFQAHVDYNKSFGDHNLSAMAGYDFLKRKINGMSMTVQGA 
ESDKTPTLGAGTTPTAWTDTQTPWCQISYFGRLNYNYKEKYMLGFTMRADGSSLFAKDNR WGYFPAVSAGWVVSEEKFWNVEKFNQLKLRLSYGLTGNNNVDYYDTLGAYSVTGIYAGSG ATLASTLPNLGLTWEKTKQFDIGLDMAFFNNRLRVAADYYSKKSEDLLFDVSLPDTSGYG SAMQNVGSIRFYGLEFEISSVNVSTKNFSWTTDFTYSFNANKVLSLPDEYYYKDIDGKDA WRIGGYTMSESGYRFGGTAVGEPLGRIYGYKTSHIIESEAQADAALYDSNSHGYRRSDGL SIAGRKDVGDYEWKNRAGSALTADGREQINGEDMFLLGNVVPHSTGGMNNTFKFKNLTLS VYLDYALGHSIYNYQYTRCFQTSMGNCNWNLVYDALNTWQKPGDDTKFARLTPNDADGGN RNYSRISNINVQKADYLCLRDVTLSYDLPLRWIRKVGLGRLTVSVSGNTLCYWTGVKGIS PESASVGSSTGMYSVTSSSATSFNNYPPTRKVLFGLKATF >gi|313158355|gb|AENZ01000045.1| GENE 30 31272 - 32780 2345 502 aa, chain + ## HITS:1 COG:no KEGG:Sph21_5172 NR:ns ## KEGG: Sph21_5172 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 1 501 1 484 484 262 36.0 3e-68 MKIYGIILSVLGAASVALSTSCHGELDVTQSDKLNTTNMWTSESDALTATTGIYYRLRQA FAQSKANPFFWGELRVGPAMWGKGTGRSLCDNDMLDVMLNTLSASDASTDWSYLYTTIDQ CNQVLKYAPGIGMSEDNMNYCLGNARFIRAYCYFWIARVWGEAPVITTPTEGTGETIYPS RHPVAEVYARIEDDLNAAAGYIRTNAKGCYYATADNVNLLRADFALWMYAAGGGDGYLTM AETALGAVQSAGLLDRFADVFSISNKKNREIAFALHLENGEYTSGSYFSYFIWGSTQIKA EYRNAPDGVPVSSNQWFLYSDDFIGFLKRSKELGDQRTDVTYMERTGVSDMYEVIQWPNK LIGNISTGTIVWEQDFILYRYAQYYTMLAELRYHRKDYSGALTALNTLAQRAYGKANFYT DSSAAAVRQALVDENLKEFAEEGNVYFTLIRLGAIHDYNPYRYVDGLGQCGIDTSRPNQL LMPVSKKAMNKNNKITQTQGWS >gi|313158355|gb|AENZ01000045.1| GENE 31 32815 - 34689 2112 624 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158381|gb|EFR57780.1| ## NR: gi|313158381|gb|EFR57780.1| conserved domain protein [Alistipes sp. 
HGB5] # 6 624 1 619 619 1170 100.0 0 MKTLRMLYTAALLSAAMLLAGCYSTDTDDDLLSTATLISIYPSTVSVGSAAATQNVVVTL APKSKRNLDWSCSVQSAWVTAVREKVPGESTGTVYEDAVVLVFLENTAYKRQTTLTITAS DGTVLEVPVVQKGIYADAYVKAAPESLTFMAENPQPAAVAIDTNMDTFAAVSDSDWCTVV DNGDGTITVTCTEYGDNTADRTAEITVTAGTPDTSEARTVIPVTQLRKDVYCYLYGSGVP KYPTREKWGQMTKLSDRVYVDSVYVKNGRFSVYTTEGDTYHLDGAGNLSAAETDLTVDVA GLRIVTLNLNEKTYSIERITTPNCLPDSEVAKYATSTYTANGRTKVWMRAGLDWNGGENI GGIKLGSRMVSDANNVGGYTSNVSSYETVRVSDYDEVESGGKAQGEVEMSAKYGRIYTLT EFLTGTPKAAVELARLLTDWPAQYRPGSTFVDAVGNDIPVSSVKSLASADFENTPSLSMQ IQGICPYGWHVANVQDFYDMLCAAAAAKGVTPNPLSAMIGKWCVPDVLRSAEGWSAAPTR HAAADAFGFDFFPQGRRLFKSGYQYYASRGEMFICHPGGLSSGVYQCWRINALTNNAADL TVSTTYNIGDCSASFRCVKNYENF >gi|313158355|gb|AENZ01000045.1| GENE 32 34751 - 35692 1260 313 aa, chain + ## HITS:1 COG:no KEGG:Dfer_2668 NR:ns ## KEGG: Dfer_2668 # Name: not_defined # Def: glycerophosphoryl diester phosphodiesterase # Organism: D.fermentans # Pathway: Glycerophospholipid metabolism [PATH:dfe00564] # 56 238 41 196 271 78 30.0 4e-13 MKVLKLLLLFAMGASLSLAGCDDDYLYPVEYPVYDPDGGSDNGSCDEIPVPEKVNNKVVA HRGGSAEAGTAKYPDNSIASLSYAISLGCYASECDIYWTADNNVVVAHAANGCYVNNLKP WEHTLDELRAAGKLSNGELLPTLEEFIDVALHAGTTRLWLDVKRIEVGGATSNTAESVKA CSRACEIIREKKAQHFCEFIVSGNAAIWAGCYSAAQLAGIKVGWMSYSAPSAYKSYLDPW ANLDYSNFFAGGAEVSNPKYPLQSYLDAGVALSVYNADTDPEMTYVLKYYPKLKAVCTNY PAKLLGRIGSEGL >gi|313158355|gb|AENZ01000045.1| GENE 33 35647 - 35886 185 79 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPGITNFPNYFIPQSGKSAVREISDPTMETLRAAASGAGAKRKVSRTVRLTFCHKDSRKN SGATHYSPSDPILPSSLAG >gi|313158355|gb|AENZ01000045.1| GENE 34 35957 - 37585 2154 542 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0697 NR:ns ## KEGG: Bacsa_0697 # Name: not_defined # Def: putative regulatory protein # Organism: B.salanitronis # Pathway: not_defined # 5 541 7 547 551 208 28.0 5e-52 MWIVLLCTLLPGCADRPEYSLQTEAALERLDAELANKPRYQQWRQEQADALRSRIDDGLP LLERAERLFAVAQIYVTFHLDSSAYYVERLYRLAERTGSRHVRRFALLGDIDVWLGRGQV SLAEELFRKIDTAGMTPGEYRAWHSRYSSVASRRYDEAADSLQKRAWGDTLSHIRKLSVP 
GCSDQTRTRMAAARLSETGRYDEALQLLEPLTAKNLSHQNFALVYYSMAHIAKGQGDEDR MLYYLAESSISDLRAGTRSYSSLYDLALELFDRKDYDRATAYMGSTFDDAIRCKSVTRIP NSSSAVVKINEAVISNIAQQQRLMSRTIWLAGIFLIVLIAILWFVLWQHRRLQCSHEQLI RMSDLLREKHDELLKKNDHIRDINNALVDSNRIKDRYVCHYIDLSVHYIKQIDAFRREVC RIAKTKGADEVVRRLNSSQVIDGEYQKFYQSFDSSFLDIFPNFIGQVNELLQPEFRFTPR SDSSLTTELRILAAIRLGITDSGHIASLLNCASATVYTYRTKLRNAALDRDNFEEQVSKT GL >gi|313158355|gb|AENZ01000045.1| GENE 35 37873 - 39180 2238 435 aa, chain + ## HITS:1 COG:no KEGG:BDI_2462 NR:ns ## KEGG: BDI_2462 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 4 435 6 430 430 659 75.0 0 MLSRSNLCSGLARTCGFALLLATGVSCASPQNVLTEAEAADGWELLFDGKTLDQWKDYNG DSLTQPWHVVDGCIQANGDGSDLSGYIVTKKQYENFILDWDWKLSYGGNSGMLYHVLEDP YFKVPYVTGPEYQLIDNDGWEAENAPTKLEEWQRLGVDYAMHLPDPDSMFVNPQGEWNNS RIVCDNGHVEHWLNGHKILEFDAYTDDWFARKNSGKWETAPEYGLARRGVICLQDHGYPA SFRNLKIKELPRKAGREVELFNGKDLTGWEAYGTEKWYVDKEGNLVCESGPDKEYGYLAT REYYNDFDLTLEFKQLANGNSGVFFRSFVEPPVKVHGWQCEVAPKGHDTAGIYESYGRGW LVQIPDEKENILKEGQWNTLRLRVEGDRVQTWLNGEEMVDITDAKIGAAQGRIALQIHDG GGIKVQWRNFRLTTL >gi|313158355|gb|AENZ01000045.1| GENE 36 39202 - 40485 1851 427 aa, chain + ## HITS:1 COG:lin2266 KEGG:ns NR:ns ## COG: lin2266 COG0673 # Protein_GI_number: 16801330 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Listeria innocua # 38 189 3 146 358 83 33.0 8e-16 MSTSRRDFLKTLGGMALLTIVPRQVLGGPKFTAPSDQLTKGIIGVGGIGKSSYHFTSNKD CRLVAVCDVDRKHLESAVALGQKKFGETLEAYSDFRRLITDPNIDIVHIATPPHWHGIMA VEAAKAGKDIWCEKPMTRTIGEGKRVAEAVRRNNRIFRLNTWFRFKDTFYGLGTTVEPLK KLIDSGLLGWPLKVTISGTTGFTWKFFWVGRENLAPEPVPAELDYDMWLGPAPYKPYNKH RVHSTFRGYWDYDGGGLTDMGQHYMDPVQYLLGKDETSPVKIEVDAPQQHPDAVGIWRKI VYTYDDGCQIVLEGEGFESEGDTPYIEGPLGKVYKGFRCTIPDVMEKLAEMPDPEPQNTD FLECVRTRRKFALDEQIGHRSSTLVNLGACALRLNRTLHFDPDTQLFIGDDAANRLIDQP MRGPWHI >gi|313158355|gb|AENZ01000045.1| GENE 37 40497 - 43889 4169 1130 aa, chain + ## HITS:1 COG:no KEGG:BDI_1153 NR:ns ## KEGG: BDI_1153 # Name: not_defined # Def: 
hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 13 1125 7 1154 1155 556 33.0 1e-156 MKKIITRLFILGALIALLPAGAAAQQLDARQRNAETVVADGLAQLPAATPAVFNEVMGEL AATGSAGVEMIAGMLTPADKSKNATLEYALNGITAYATAPGNEALCADVRKGLVRAIERC ADNANRAFLFSQLQLCSAAEDAAWILPYLDDSYLSDYAIRALISTPGTEPVLLAEARKEG LAAKRKRALAYAFAEKRMTQAEPLLLGWLGGADARTAEQIYNALAVCGSRASLKPLAAAA AEAGYGWDNTGAADAYLRLLANLVAAGDAQPAVKAAKGLLKCDRQYVRGAALAVLADALG IEKAMPYVLAAVKEGPAEYRYAALQTLGKGDDEIFARVAAGMPRYDAETRTAVLAWLGDC GAVSQSGVITGAVTSSDDRVARAAIEASGRIGGEQALAALVSALEGPHAGAAEKALLAFN GRIDPMVERLLARDDAKVLVPALRLAAARRMTVAADRVFALLETSDKQVREAAYAALPYV TGPRHMDRLAGLLDAADGQHAAQIQTALIRTSAQLPADKRYGAIAGYMQASKIPARYYPV LAQTGTPEAIASLLAGFREGDREAAFAALLTVESPAMTDILYRIAAENPALADRALMRYA DLCAKQPATPVRRCQLYRQGLELKPSAAVRNKLLGYLAGVHEVPALMLAADYLSDPATAG AAAAAVKTIVAKSAPMPGGEAVRSALEQAREVYRELAKSDADAGYAVDEITGLLGKIPAA GFALLPTAGLEGWTAVTADPAGERKLSARRIAGLRKAAAEAMAANWSADGNALVFAGKAP STIGTEKEYENFELWLEWRSEGEAGLAVRSMSQIRLGGTEGTGLKTQGQTDTPAAAADNA PGTWNTLYAKVVDDRITLIENGVTLAENVVLTNPCAPGEPVCVRGRIELSGLAAPACFRN LYIGELPSTPVFELSPEEAAQGYEVLFDGRSLHKWTGNTVNYVPLDGTIDVTASYGGSGN LYTVKEYGDFDLRFEFRFLRRGVNNGIGIRTPMGVDAAYHGMEIQILDHDAPIYKNLRIY QQHGSVYGIIPAQEHVVFGDLGTWNTMEIHAVGDRITVTVNGRVILDGDIRQACKGHNVS EDGSKKNPYTVDGKNHPGLFNKKGHLGLLGHGAGIQFRNLRVLDLSAGRQ >gi|313158355|gb|AENZ01000045.1| GENE 38 43991 - 44203 352 70 aa, chain + ## HITS:1 COG:no KEGG:Coch_0981 NR:ns ## KEGG: Coch_0981 # Name: not_defined # Def: twin-arginine translocation protein, TatA/E family subunit # Organism: C.ochracea # Pathway: Protein export [PATH:coc03060]; Bacterial secretion system [PATH:coc03070] # 6 58 8 60 66 61 52.0 9e-09 MFIPCMIGYGQLLIIVFVVILLFGGAKIPELMRGMGRGVREFKDAMNTDYSSENKKPNPA DVSGDKPADK >gi|313158355|gb|AENZ01000045.1| GENE 39 44239 - 45153 1219 304 aa, chain + ## HITS:1 COG:DR0806 KEGG:ns NR:ns ## COG: DR0806 COG0805 # Protein_GI_number: 15805832 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Sec-independent protein secretion pathway component TatC # 
Organism: Deinococcus radiodurans # 103 279 76 252 270 120 35.0 4e-27 MSKPNAGNESAEMDFWSHFAALRPHLVRGAAAVVVIAVAAFFARHFLIDVVLFGPKSAGF PTNRLLLETGHGLAALCGWINALTGLSLSVNPEAFDVSNLDFRVINTAMAGQFNLHMKIS FVTGLVLAMPYVLWEFWRFVKPALTQREIAATHRFVFWISACFFTGLLFGYFVIVPLSVS FFINYEASASIVNMIDVNQYLSTVIVASLASALVFQLPLLVYFLTRMGLVSSSFLKRYRR HAFVGLLVIAAIITPPDIFSLILVIIPLYWLYELSIRLSSRIERRDAADGAVAAAVISAT SDEN >gi|313158355|gb|AENZ01000045.1| GENE 40 45143 - 46402 1871 419 aa, chain + ## HITS:1 COG:DR1362 KEGG:ns NR:ns ## COG: DR1362 COG0673 # Protein_GI_number: 15806379 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Deinococcus radiodurans # 7 415 1 403 403 226 36.0 4e-59 MKTETTLSRPVRIVVIGAGNRAHKYLEYARRNPEQLRLAAIVEVNDLRRRAMADAFGLPD KYCYAHYDDFFAEGLDADMVLISTPENAHFDPAVKAIDAGYHVLLEKPIAQRLDECREIA RRARRRGVLVGVCHVLRYHPYFAKIRELVASGRLGHVVSVNHTASVGLDRATHSYVRGIF RRESEANPILLAKCCHDIDFLLWLTGSHCRLLSSFGSLRWFRAENAPEGSAARCLDCRME SECPFSARDLYYVRRDWVSNFDVPPGATLDATIMEELRTGMLGRCVYRCDNDVVDHQLLS MEMDDEVTMSLSMEMFTNDDFRKTHIRLTGGEIDGDERTLRVRRFRGGDAETYDFSDIAG QPFHAGADLQLIGDFIRALRDPSRPFLTTIDDSIESHRICYAAERSRRTGETIRMDAAE >gi|313158355|gb|AENZ01000045.1| GENE 41 46491 - 47888 754 465 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|145629959|ref|ZP_01785741.1| 50S ribosomal protein L21 [Haemophilus influenzae 22.4-21] # 8 455 7 454 456 295 37 6e-79 MQTLFFWIDSVNNVLWSYLLVALLLGCAVWFTVRTRGVQFRMLREMLRVLGESAGGGRRG ERHVSSFQAFAVSLASRVGTGNLAGVATAIAVGGPGAVFWMWIIALLGASSAFVESTLAQ LYKRRGRDSYIGGPAYYMERGLGRRWMGVVFAVLISVTFGFAFNSVQSNTICAAWEGAFG FDHHWVGIALTLLTLGVIFGGIHRIARVSGVVVPVMALGYIALALGVVLFNFRRLPEVVE LIVANAFGWEQAMGGGVGAALMQGIKRGLFSNEAGMGSAPNVAATAHVSHPVKQGLIQTL GVFTDTLVICTCTAFIILFGGVPDASLSGIRLTQAALVSEIGPVGSVFVAVAIFLFAFSS IIGNYYYGEANIRFITRRPGALTLYRLLVGAMVLFGALATLDLAWSLADITMALMTLFNL AAILLLGRQAFLLLADYVAQKRQGIKNPVYTRDRIPELKDKAECW >gi|313158355|gb|AENZ01000045.1| GENE 42 47926 - 48651 1098 241 aa, chain - ## HITS:1 COG:VC1869 KEGG:ns NR:ns ## COG: VC1869 COG1180 # Protein_GI_number: 
15641871 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Pyruvate-formate lyase-activating enzyme # Organism: Vibrio cholerae # 3 238 6 244 246 221 45.0 8e-58 MIKVHSYESMGTFDGPGLRLVVFLQGCNFRCLYCANPDTIDACGGTPTSPDEILRMAVDQ KPFYGRRGGVTFSGGEPTFQAAALAPLVRRLREAGIHVCIDTNGSIWNPAVEELLGLADL ILLDVKQADPERHLALTERDNAQTLRTAAWLEEHGKPFWLRYVLVPGYSDAEADIRTLGG RLGGYRQIERVELLPYHTLGVHKYEAMGKEYKLSDVRENTPEQLDRAAALLREYFANVVV N >gi|313158355|gb|AENZ01000045.1| GENE 43 48653 - 50881 3902 742 aa, chain - ## HITS:1 COG:CAC0980 KEGG:ns NR:ns ## COG: CAC0980 COG1882 # Protein_GI_number: 15894267 # Func_class: C Energy production and conversion # Function: Pyruvate-formate lyase # Organism: Clostridium acetobutylicum # 7 742 8 743 743 923 61.0 0 MELNKTFIDGLWSKEINVTSFVQTNITPYVGDASFLQGPTDRTKHIWNLCLKALEEERAN NGVRSLDNKTVSTITSHKAGYIDKEQELIVGLQTDQLLKRAIKPFGGINVVSRACKENGV DVDEKVKDIFTHYRKTHNDGVFDVYTEEIRSFRSLGFLTGLPDNYARGRIIGDYRRLALY GTDRLIEAKTEDLRGLTGPMTDARIRLREEVAEQIKALKEIRTMGEYYGLDLSRPAHSAQ EAVQWVYMAYLAAVKEQDGAAMSLGNVSSFLDIFIEYDLAHGNIDEVFAQELIDQFVIKL RMVRHLRMQSYNDIFAGDPTWVTEAIGGRFNDGRTKVTKTSFRFLQTLYNLGPSPEPNLT ILWSPDLPQGFKDFCAKVSADTSSIQYENDELMREVRHSDDYGIACCVSYQDIGRQIQFF GARCNLAKALLLALNGGRCENTGTLMVKGIPALTEGPLRFEEVMRNYKMVLTEIARVYNE AMNIIHYMHDKYYYEKAQMALVDTDPRINLAYGVAGLSIALDSLSAIKYAKVTARRNEQG LTDGFDIEGEFPCFGNNDDRVDHLGVDLVYYFSEELKKLPVYKNARPTLSLLTITSNVMY GKKTGATPDGRAKGVAFAPGANPMHGRDKSGAIASLASVAKLRYRDSQDGISNTFSIIPK SLGPTQEERIENLVTMLDGYFTKGAHHLNVNVLNRAMLEDAMEHPENYPQLTIRVSGYAV NFTKLSREHQLEVISRSFHERM >gi|313158355|gb|AENZ01000045.1| GENE 44 50926 - 51420 -68 164 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKGMLFFSQYRTLRRIFLKNLKKYLRFVGPYRPGAFFLQDGVLRPTVWPLLCSDECGPG SGGIGSVGPGGPVPPQIPAGRYVRPMRRNTARRSRTRILPSGPAAVLPGRVYAVLPGRVY VADLCWFAGTDARCFARPDLRCFAGWMHAVLPGRTLTVLSDRAH >gi|313158355|gb|AENZ01000045.1| GENE 45 51615 - 52007 664 130 aa, chain + ## HITS:1 COG:no KEGG:BF2989 NR:ns ## KEGG: BF2989 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: 
not_defined # 1 129 1 129 134 148 72.0 7e-35 MEELTLTTPALLFSAVSLILLAYTNRFLSYAQLVRTLKEQHLQHPSKVTRAQIDNLRRRL HLTRTMQILGVSSLFLCVVTMFLLYVGFFVLSAYVFGAALLLLIGSLGVSIWEIQISVRA LEIHLHDMEK >gi|313158355|gb|AENZ01000045.1| GENE 46 52398 - 54221 2690 607 aa, chain + ## HITS:1 COG:AGc425 KEGG:ns NR:ns ## COG: AGc425 COG2207 # Protein_GI_number: 15887598 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 454 597 207 340 365 72 32.0 2e-12 MKTVCAGTPVRPVSRSSAMLRRLFILAAVVSLGLPAVRAQEGAASHAAVIQADTLPVLLE RLERDPEDIDLLVEICSQYTMRSEFAAIHPYVSRLRRAGAAHDDERALMYADLFSGQSYL LAEGSDSTRIYLDRALIRGRQCEDPVALCRIYNALGIYAVSIETNYFGGIEYFLEAMEYA RSASLNRFYLVAQCNLANTYYMRNDPAGLKYAEEVCRLGAEWGYDYLAFGGAVISAYMHY MLGDQDRALEYILRTLSDTDKFGYHTELYSLYANIAQGADAGAERYYLMALDHIDEKVVT ATVMTYLSYGTYLNDRGEYARAIPVLRQGIELSERSNNAVHRYKLYLRISEAETALGRYR EALDYFKSYHSQADSIFNVERERSINELRVRYDAERQENMLRKSEIDLIRQQRRFQLLLL LLLFAVGISTVVYILYRRKDKMYKQIVRQQYEFLKKEKKAAQPAMPPPDPISPQTEKPSP DRDEHAVRDAELFARIEYLMQTEGVYRQNDLTIERLAERLDTNRTYISRAINQQAGKAFS SYVNSYRIDEAVRRLSDVDDDTPLKALAQMLGYNHLQTFYTSFQSAIGMPSSKYREKLLK LHREHQL >gi|313158355|gb|AENZ01000045.1| GENE 47 54457 - 56343 2191 628 aa, chain + ## HITS:1 COG:no KEGG:FIC_00184 NR:ns ## KEGG: FIC_00184 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium # Pathway: not_defined # 291 539 501 823 1036 124 32.0 1e-26 MKKTLLFSSRSLLTAALFFCAASVVSCQKEETAGAAPSCAAIRLTAEPAEFVRTKSETDV AYASRFVEGDAIGLFAVIRSDAETQAYPAASGNYIQNAKFVRQADGSWREEGSKSYYMEQ GQVMDLYAYYPYAENADPTALVYDASVAEADFMTARTAGFSERDGEIRLVFRHKLALAEA FVADADKLTDYAVTVQNVRTKAVFSLAAAAQDAEMQSVDAKTASVAMTRCGNAFRAYLPA QEIAEDKALLKVGGNGFAAFDYTHPGKTSLAAGQVRKLLVTPAFANPELLPNCYVVSPGE TLYIPVCKAFGVWEKNEVLAGAGTGMLGAMGARVVWEDVRYLLNDDQITVLGSGSKAVIQ VRTTPRTCGNAVLALEIGGEIRWSWHLWITDYDPNAPEAQKSKNGRTFMDRNLGAMNAEY GNIDALGLQYQWGRKDPVPCPAKWEDIATRPVWNGVGVKVGFSTKANSAVIRDNLVASIN DPLALIKAVGVPYDWYSTVKNQGENRWNAADGSKTEFDPCPRGWRVPVSGFGTLSPWYNC 
VTAGIPWLNGVIWEDLGYWPAAGLLESSGQSYLGMFGYYWSATPGENLSGGVKQEGCAYG YNFDNQTSLTSYLQYRHTVLSVRCVKVQ >gi|313158355|gb|AENZ01000045.1| GENE 48 56377 - 58356 2645 659 aa, chain + ## HITS:1 COG:no KEGG:Bache_0429 NR:ns ## KEGG: Bache_0429 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 14 309 9 305 305 112 32.0 7e-23 MKYTGSIHTTFYALLAAAALTAAGCSDDDNEKGGGATGVEANFSAEITPYTRAAGDTWTN GDAVGIYMLGGDAGTADNVRYTCEPNGRLSAADAKIVIPGSGTFDFVAYHPYKQSSLSGD AGKIDGYLYPVVLSDQTDPAAIDLLWSDNAAGVTAAAPDVKFTFEHVLSKVEIAVKQGGG ISADDLAGVVVTIDGVYNEGFFALNSGRIGNTGVPGAVTARAVTEGVSYEAIVLPTTDAA QRGRVFTVEAPLLGRTYTWTMPDDYLFVGGKIHEIEITVDVDGISVVTGDIADWNGGDSN PEVETDDAEFLALPNSYIARPGSTVTIPVAKAYAAWHNIPMLQEGSFSIPNDLTARIVWQ ERFDEDETILFSTNDKVTMTGSGINAAIEVVLNPDVEGSLVVGVYDGDGNCYWSWHIWVT EYDPSQPAGQQTVGGNVFMDRNLGATSLVKGPQSAGCFYQWGRKDPFQGPYTWMMFLNAN AGWSNDGIGKFMTLKGLASTKVPSENLVASVQQPYRYIVGISGTQDWLSPDVGGEAYRWS NADGTKGVFDPCPEGWRVPVSGAGAASAWADLNGVWDAERTGCVFAEGNGYYPAAGYINI MSASGSGGGAEAGTSGYCWSASSAGRNGYALTYSVSAIKTEAELTKAWACPVRCVKDVK >gi|313158355|gb|AENZ01000045.1| GENE 49 58577 - 60796 3356 739 aa, chain - ## HITS:1 COG:PAB0927 KEGG:ns NR:ns ## COG: PAB0927 COG0306 # Protein_GI_number: 14521600 # Func_class: P Inorganic ion transport and metabolism # Function: Phosphate/sulphate permeases # Organism: Pyrococcus abyssi # 25 286 23 275 405 62 28.0 4e-09 MSPLFTSIVIVMGLLAVMGIVVGVANDAVNFLNSALGSKAAPRRVILWIAAAGILVGTLT SSGMMEVARSGVFYPGQFSFPEIMMLFLGMMLGNILLLDIYNTLGLPTSTTVSMVFGLLG AGVAAALYHIWGDPAASVSDLSQYINTGKAMAIIAAILLSVALAFAAGSLFMYVSRLIFS FRYAKVFRRWGAVWCGISLAGILYFALFKGLKSSGLIPTSVGEYVGEHVLVTLLVFWAGA SVVLWIFQRMRLNIMRITILTGTFALALAFAGNDLVNFIGVPLASYDAWQIARETGSESI MMGELANPARANFLLLLLSGAVMVLTLFFSKKSRHVTETELSLASQHEGDERFGSTFVSR SLVRASLALNTMWTGLIPDRMQKVIDRRFEPLPAEERSSAPYDMIRAVVNLTAASILIAI GTSYKLPLSTTYVVFMVAMGSSLADRTWGRESAVYRITGVMAVISGWFITALGGFLIALC VTLLLLWGGWIAVTTLVALCAWLLIRSHRTKPEEKAAAEELPVIKAETPDEVLCVCIGEV CTTMEQVTRIYNRTLVAVFKENRKVLKEMVQSSNRLFEEARDRKYGIMPTLRRLQQCDID 
TGHFYVQVTDYLSEVTKALIHITRPAFGHIDNNHEGLSKEQIVDLMRVNDQVEAIFDKIN DMLRTKDFSDLDMVLEMRDKLFDTIVEAIKNQLRRINGDPAASTRASVLYLTILNETKTM ILQARNLLKSQHYFLESRK >gi|313158355|gb|AENZ01000045.1| GENE 50 60998 - 62446 2164 482 aa, chain - ## HITS:1 COG:BH3156 KEGG:ns NR:ns ## COG: BH3156 COG0642 # Protein_GI_number: 15615718 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Bacillus halodurans # 246 481 344 582 589 134 33.0 3e-31 MGTGLRSSLSYHRRLFLLLLAFSWALVACFIVFQYGREKQFKAERLDTRLQLFNLRMLDA LADGATPEEFIAARDEPFGDLRVTVIDAAGRVTFDNSAGSLPETNHLDRPEVAEALARGT GFTIRRHSESTDLYYFYSAMSDGGVVVRSAVPYSITLREVLAADREFLWFMLGVTLLMSV AAYFATRRLGHNITRLRDFAQRAERGERIDEQAVFPHDELGDISSHIIRLYARLQRTTAD RDREHALALHEQQEKIRIKKQLTNNINHELKTPVAAIQGYLETLLDNPGLGAEKRTAFLE KSCAQVARLRSLLADISTVTRMDEASQLIRKERVVLNDIVDEIRADMQLRPEGQRLRVNC DFPYPLEMEGSPSLLGSVFRNLADNAAAYSGGRDIFIRLLDDSPEQYTLQFSDNGIGVEE QHLPHIFERFYRIDKGRSRKLGGTGLGLAIVKNAVMMHGGTISVRNSERGGLEFTFTLRK HS >gi|313158355|gb|AENZ01000045.1| GENE 51 62424 - 63104 914 226 aa, chain - ## HITS:1 COG:TM1655 KEGG:ns NR:ns ## COG: TM1655 COG0745 # Protein_GI_number: 15644403 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Thermotoga maritima # 1 226 9 242 247 165 40.0 7e-41 MTKHKILVVDDEESLCEILQFNLEVEGYEVDTAYSAEQALGMHPERYSLILLDVMMGEMS GFKMARQLRAQPETAAVPIIFCTAKDTEDDTVAGLNIGADDYIAKPFSVREVLARVRSVL RRTAPQEGQSRAIAFEGLHMDLVRKICTVDGTPLALTKKEFEILALLLSNRGRIFSREEI LHQVWSDEVIVLDRTIDVNITRLRRKIGVYGDHIVTRLGYGYGFEE >gi|313158355|gb|AENZ01000045.1| GENE 52 63198 - 64262 1575 354 aa, chain - ## HITS:1 COG:DR1001_2 KEGG:ns NR:ns ## COG: DR1001_2 COG2876 # Protein_GI_number: 15806024 # Func_class: E Amino acid transport and metabolism # Function: 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase # Organism: Deinococcus radiodurans # 17 247 26 249 270 136 36.0 7e-32 
MKDLQPINFPGLDPRRPLVIAGPCSAETEEQVIETARELAAEGVRIFRAGLWKPRTKPGG FEGVGAEGVAWLQRVKRETGMYTATEVATRKHVMAALEGGIDMIWIGARTTANPFAMQEI ADALRGHDIPVLVKNPVSPDLELWIGGVERIYNAGIRRLGGIHRGFTSIDKSLYRNHPMW SIPIELHRRLPGLQIFCDPSHIGGRRELIAPLSQQAMDLGFDGLIVEAHCSPDCAWSDKA QQVTPQGLAYICRSLVIREANTTTESLSELRSQIDKIDDELLELLVRRMRVSRDIGQYKK EHNMPILQAKRYEDLLARRAEQAVQLGMDREFMRSVLQAIHEESIRQQMQVLGE >gi|313158355|gb|AENZ01000045.1| GENE 53 64387 - 65232 1150 281 aa, chain - ## HITS:1 COG:BMEI1905 KEGG:ns NR:ns ## COG: BMEI1905 COG0077 # Protein_GI_number: 17988188 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydratase # Organism: Brucella melitensis # 3 273 8 276 290 146 36.0 4e-35 MKRITIQGIAGCFHETAAHGSFPGEEVEVLPCVSFDEQFARMAADAELLGVAAIENTIAG SLLPNHELLRRSTLAVIGEYKLRISHVLAALPGQRVTDIREVHSHPIALMQCGEYLKARP AMKVVERDDTAGSAREIAGQRLAGTAAICGAEAAELYGLEILERGIETNKHNFTRFLVLA DRSRAAEFTDPARTDKASLVFTLPHAQGSLSKVLTLLSFYGINLTKIQSLPIIGREWEYR FYVDVTFDDPMRYRQAVDAARPLTSDFRILGEYAEYPNPEI >gi|313158355|gb|AENZ01000045.1| GENE 54 65451 - 66239 989 262 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3543 NR:ns ## KEGG: Odosp_3543 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 6 251 3 244 251 151 35.0 2e-35 MNTDMKITVNPGMEYLRQFVRQLPELFPVSGEVLHDGRNQIRAFDIGSERLVVKRYKRPS AFNAVMYSFFRKSKARRAYEHALRLRELEIDTPEPVAWSEYRRNGLITETYFVSRRSDYA PLTAATERFPTSDSLPVLEAFARFTVRLHEKGICHEDFNQTNILWRHDEATGRYDFQLID INRMKFLRRPLRPDECMINLRRLSCPAVAFLYILDRYAETRQWDMDDTLLRGTFFRLLFG RRQQFKKRFKARKLAAAGKKQG >gi|313158355|gb|AENZ01000045.1| GENE 55 66783 - 67733 1346 316 aa, chain + ## HITS:1 COG:SA0511 KEGG:ns NR:ns ## COG: SA0511 COG0451 # Protein_GI_number: 15926231 # Func_class: M Cell wall/membrane/envelope biogenesis; G Carbohydrate transport and metabolism # Function: Nucleoside-diphosphate-sugar epimerases # Organism: Staphylococcus aureus N315 # 1 312 1 314 321 362 54.0 1e-100 MKRILIIGAGGQIGSELTVYLRKIYGDRNVVATDMRECKALGEAGPFEVLNALDATAMAS 
VVARYHIDSIFNLVALLSAVGERNPQMAWNVNMGALNNSLEVARQHHCALFTPSSIGAFG PTSPKDRTPQDTIMQPTTIYGICKVTGEMLGNYYHHKYGVDTRSVRFPGIISNVTLPGGG TTDYAVEIYYEAIRSGRFTCSVPPDVYMDMIYMPDALRACVELMEADPAKLVHRNSFNIA SMSFTPEIICAEIRKRLPDFTMDYDVDPVKKEIAESWPNSLDDTCAREEWGWKPEWDLSR MTDDMLAHIRTKLAAE >gi|313158355|gb|AENZ01000045.1| GENE 56 67730 - 68098 359 122 aa, chain + ## HITS:1 COG:slr2101 KEGG:ns NR:ns ## COG: slr2101 COG1917 # Protein_GI_number: 16330587 # Func_class: S Function unknown # Function: Uncharacterized conserved protein, contains double-stranded beta-helix domain # Organism: Synechocystis # 10 117 31 137 143 84 39.0 5e-17 MSLSGGRPAVTRVAETASSWDGAPLPAYPAGRPQVTILRATIPPHATLPRHTHAVVNAGV ILRGELTVVSDNGTERTFRAGEGIVELVGTVHYGENRGDGETEMVMFYAGTEGVPLSESA DL >gi|313158355|gb|AENZ01000045.1| GENE 57 68108 - 69061 1228 317 aa, chain + ## HITS:1 COG:SPy1892 KEGG:ns NR:ns ## COG: SPy1892 COG1073 # Protein_GI_number: 15675706 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Streptococcus pyogenes M1 GAS # 1 314 1 305 308 219 39.0 6e-57 MTTRRITRTIALFVLLLTAAVIGGSFYMLGFSLRPEETMRAKNATAYGYMYAEYPFLRPW TDSLERAGALRDTVIVDPQGVRLHALYAAAPEPTDRTAVIVHGYTDCAVRMLMIGYLYNH DLRYNVLLPDLHYHGRSEGRAIRMGWLDRLDVLRWMEVADTLFGGNTQMVVHGISMGAAT TMMVAGESQRPYVKCFVEDCGYTSVWDEFSNELKTSFGLPAFPLMHTASWLCDLKYGWNF REASALAQVAKCELPMLFIHGDADDYVPTWMVYPLYEAKPGEKELWLVPGAGHAYSYRDN REEYTAVVREFVGKYVR >gi|313158355|gb|AENZ01000045.1| GENE 58 69394 - 70467 1493 357 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158405|gb|EFR57804.1| ## NR: gi|313158405|gb|EFR57804.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 357 1 357 357 640 100.0 0 MQIFRKSLLLTAALALTAFAACSNDDNENGGDGGETDPDYLALPHAYGTYYHTYWGDAAG DYYLVLTDADLDQADAAPYKYRLDIDFLSVLATDPKSARPLDGEYTIGKSTDNKAGVFIE GFLGTDSFGMPALFGTYWNEPDEEEGSTTPHAIVGGKMRIKTDNGITTIKAEFIDEDDST LRVSYRGALEFENRINDSYRGDSKLNADYAMNADAPVVSIKKITDECTARYDRWDLHFYE KECYATQGATGYYLWFRLKTAPGQEYIPSGTYKATSDDTPGFLPGTREELGGGNLRAVNT WFYSGRTPHAPMTNGSLTLKTSNGTDYEVSFSTGDDHPMPYLVTGSFSGTVKKLASN >gi|313158355|gb|AENZ01000045.1| GENE 59 70489 - 71457 1072 322 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158365|gb|EFR57764.1| ## NR: gi|313158365|gb|EFR57764.1| hypothetical protein HMPREF9720_0935 [Alistipes sp. HGB5] # 12 322 12 322 322 534 100.0 1e-150 MLLPLLAAAVLSSASCSDDKEKEPPVRAVNYTIEFTTVLQAVYLGDRYRTGDGNYRLTLT NADGDVLEADLTSVKAPKPSAAMPEAGVYTAAETCRAGTFAPEASSWETVKSGVEVRSAN QAGQKKKFAIRSGSMTLAPAASGDGTFTLSGEVTDGDKTALSFSWSGALAFANESGEEDP VVYERLSMVFGDYYPDDYGIAADTYMLRGGNERVELHSQLRSVPAADPAAPLPGDGKYTI DLRSEAGKYANETFTFRSGYISQSGRAAGTYWKTDEATYVATSGSFTVSTVGESWKIEGT LRDDGAEETFSFTYTGKPGFKN Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:30:21 2011 Seq name: gi|313158352|gb|AENZ01000046.1| Alistipes sp. HGB5 contig00066, whole genome shotgun sequence Length of sequence - 1245 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . 
- CDS 61 - 1029 763 ## COG3547 Transposase and inactivated derivatives - Prom 1177 - 1236 3.0 Predicted protein(s) >gi|313158352|gb|AENZ01000046.1| GENE 1 61 - 1029 763 322 aa, chain - ## HITS:1 COG:NMB1750 KEGG:ns NR:ns ## COG: NMB1750 COG3547 # Protein_GI_number: 15677594 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Neisseria meningitidis MC58 # 3 313 2 310 316 120 34.0 3e-27 MRYIGIDVSKATFVVAYSSDKGGEIRTFNNTTAGIRQFIGTLPKDGSIHCVMEATGNYSA LLLYMLNVAGITVSMENPLKVKNFAKAMLSTIKTDKSDARLITLYGEKMNPRPFKVQGEA ILRLRQKRTVIRQLTKQITAMSNLRGSLACLPVPDKGATHTVDETIKFLEKKRDRLQSEL TDLVEVEFSRQLALLTTIKGIGITLATALIITTGGFTYFQNAKQVSRYLGICPTYEQSGT SVNIKGHINRNGDAYTRGLLYIAAWPASRFNAQCKETYTRLRQNGKSGKLAMIAVANKLV RQAFAVVAHDKEYVDGFVSNRP Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:30:22 2011 Seq name: gi|313158347|gb|AENZ01000047.1| Alistipes sp. HGB5 contig00057, whole genome shotgun sequence Length of sequence - 1036 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 2, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 270 94 ## gi|313158350|gb|EFR57751.1| hypothetical protein HMPREF9720_2304 - Term 272 - 313 8.5 2 2 Op 1 . - CDS 319 - 618 290 ## gi|313158351|gb|EFR57752.1| hypothetical protein HMPREF9720_2305 3 2 Op 2 . - CDS 631 - 843 245 ## gi|313158348|gb|EFR57749.1| conserved hypothetical protein 4 2 Op 3 . - CDS 853 - 1002 98 ## gi|313158349|gb|EFR57750.1| conserved domain protein Predicted protein(s) >gi|313158347|gb|AENZ01000047.1| GENE 1 3 - 270 94 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158350|gb|EFR57751.1| ## NR: gi|313158350|gb|EFR57751.1| hypothetical protein HMPREF9720_2304 [Alistipes sp. 
HGB5] # 1 89 1 89 90 180 100.0 4e-44 MAYESSIPFADIRDFREAEINGIPALFTMSRLDSETLPADFHSCEVMGGRGSDFQWLVPL ALANFTGTFVSRQPLLREGQAYAEIRRYG >gi|313158347|gb|AENZ01000047.1| GENE 2 319 - 618 290 99 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158351|gb|EFR57752.1| ## NR: gi|313158351|gb|EFR57752.1| hypothetical protein HMPREF9720_2305 [Alistipes sp. HGB5] # 1 99 1 99 99 193 100.0 3e-48 MNNLCFNRFTLRTDDAKTRRKILRWLALNYRLYEYLPSDAEVRGRFISCKPFPAEAFRRQ IGKLHNDGTLFLRIVSADLYTLYVEGNVYLRGAWHRIFL >gi|313158347|gb|AENZ01000047.1| GENE 3 631 - 843 245 70 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158348|gb|EFR57749.1| ## NR: gi|313158348|gb|EFR57749.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 70 1 70 70 124 100.0 2e-27 MERNIIIENICTACHCGERRAEEYLAVELRNLRELRDAGALCYGDLETACSGLGLDFDYT DYFCRTLSLN >gi|313158347|gb|AENZ01000047.1| GENE 4 853 - 1002 98 49 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158349|gb|EFR57750.1| ## NR: gi|313158349|gb|EFR57750.1| conserved domain protein [Alistipes sp. HGB5] # 1 49 1 49 49 92 100.0 1e-17 MSTTKAIDLQKHIREGWRVEDFIDSLSPEVERIMAGRGWRKLFRLYNLK Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:33:42 2011 Seq name: gi|313158213|gb|AENZ01000048.1| Alistipes sp. HGB5 contig00071, whole genome shotgun sequence Length of sequence - 131205 bp Number of predicted genes - 131, with homology - 126 Number of transcription units - 44, operones - 21 average op.length - 5.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 2 - 41 8.2 1 1 Op 1 50/0.000 - CDS 73 - 678 512 ## PROTEIN SUPPORTED gi|228470701|ref|ZP_04055552.1| ribosomal protein L17 2 1 Op 2 26/0.000 - CDS 681 - 1673 1325 ## COG0202 DNA-directed RNA polymerase, alpha subunit/40 kD subunit 3 1 Op 3 36/0.000 - CDS 1698 - 2306 748 ## PROTEIN SUPPORTED gi|229496321|ref|ZP_04390041.1| ribosomal protein S4 4 1 Op 4 48/0.000 - CDS 2356 - 2745 528 ## PROTEIN SUPPORTED gi|229496347|ref|ZP_04390067.1| 30S ribosomal protein S11 5 1 Op 5 . 
- CDS 2759 - 3139 502 ## PROTEIN SUPPORTED gi|149279640|ref|ZP_01885769.1| 30S ribosomal protein S13 6 1 Op 6 . - CDS 3177 - 3293 170 ## PROTEIN SUPPORTED gi|228470715|ref|ZP_04055566.1| ribosomal protein L36 7 1 Op 7 9/0.000 - CDS 3314 - 3532 229 ## PROTEIN SUPPORTED gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 8 1 Op 8 2/0.000 - CDS 3539 - 4315 737 ## COG0024 Methionine aminopeptidase 9 1 Op 9 53/0.000 - CDS 4344 - 5720 839 ## PROTEIN SUPPORTED gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 10 1 Op 10 . - CDS 5743 - 6189 528 ## PROTEIN SUPPORTED gi|160883045|ref|ZP_02064048.1| hypothetical protein BACOVA_01008 11 1 Op 11 . - CDS 6217 - 6459 342 ## gi|313158269|gb|EFR57671.1| ribosomal protein L30 12 1 Op 12 56/0.000 - CDS 6474 - 6992 661 ## PROTEIN SUPPORTED gi|149279647|ref|ZP_01885776.1| 30S ribosomal protein S5 13 1 Op 13 46/0.000 - CDS 6999 - 7361 388 ## PROTEIN SUPPORTED gi|213962903|ref|ZP_03391163.1| ribosomal protein L18 14 1 Op 14 55/0.000 - CDS 7378 - 7965 611 ## PROTEIN SUPPORTED gi|126646968|ref|ZP_01719478.1| 50S ribosomal protein L6 15 1 Op 15 . - CDS 7991 - 8389 472 ## PROTEIN SUPPORTED gi|229496261|ref|ZP_04389981.1| ribosomal protein S8 - Prom 8417 - 8476 8.1 + Prom 8356 - 8415 2.2 16 2 Tu 1 . + CDS 8659 - 9270 -140 ## - Term 9260 - 9306 14.5 17 3 Op 1 50/0.000 - CDS 9309 - 9605 345 ## PROTEIN SUPPORTED gi|29348123|ref|NP_811626.1| 30S ribosomal protein S14 18 3 Op 2 48/0.000 - CDS 9609 - 10169 693 ## PROTEIN SUPPORTED gi|150008972|ref|YP_001303715.1| 50S ribosomal protein L5 19 3 Op 3 57/0.000 - CDS 10169 - 10492 324 ## PROTEIN SUPPORTED gi|228472133|ref|ZP_04056899.1| ribosomal protein L24 20 3 Op 4 50/0.000 - CDS 10513 - 10878 497 ## PROTEIN SUPPORTED gi|89891142|ref|ZP_01202650.1| 50S ribosomal protein L14 RplN 21 3 Op 5 . - CDS 10881 - 11135 325 ## PROTEIN SUPPORTED gi|241891060|ref|ZP_04778356.1| ribosomal protein S17 22 3 Op 6 . 
- CDS 11148 - 11336 244 ## gi|313158264|gb|EFR57666.1| ribosomal protein L29 23 3 Op 7 50/0.000 - CDS 11349 - 11777 623 ## PROTEIN SUPPORTED gi|150008977|ref|YP_001303720.1| 50S ribosomal protein L16 24 3 Op 8 61/0.000 - CDS 11805 - 12596 910 ## PROTEIN SUPPORTED gi|237725879|ref|ZP_04556360.1| 30S ribosomal protein S3 25 3 Op 9 59/0.000 - CDS 12601 - 13011 453 ## PROTEIN SUPPORTED gi|163786185|ref|ZP_02180633.1| 50S ribosomal protein L22 26 3 Op 10 60/0.000 - CDS 13019 - 13288 397 ## PROTEIN SUPPORTED gi|53715462|ref|YP_101454.1| 30S ribosomal protein S19 27 3 Op 11 61/0.000 - CDS 13306 - 14130 1086 ## PROTEIN SUPPORTED gi|53715463|ref|YP_101455.1| 50S ribosomal protein L2 28 3 Op 12 61/0.000 - CDS 14136 - 14426 329 ## PROTEIN SUPPORTED gi|53715464|ref|YP_101456.1| 50S ribosomal protein L23 29 3 Op 13 58/0.000 - CDS 14443 - 15069 708 ## PROTEIN SUPPORTED gi|126646981|ref|ZP_01719491.1| 50S ribosomal protein L4 30 3 Op 14 40/0.000 - CDS 15071 - 15685 785 ## PROTEIN SUPPORTED gi|53715466|ref|YP_101458.1| 50S ribosomal protein L3 31 3 Op 15 4/0.000 - CDS 15758 - 16063 451 ## PROTEIN SUPPORTED gi|53715467|ref|YP_101459.1| 30S ribosomal protein S10 32 3 Op 16 51/0.000 - CDS 16072 - 18186 2721 ## COG0480 Translation elongation factors (GTPases) 33 3 Op 17 56/0.000 - CDS 18203 - 18679 635 ## PROTEIN SUPPORTED gi|188995735|ref|YP_001929987.1| 30S ribosomal protein S7 34 3 Op 18 . - CDS 18720 - 19100 594 ## PROTEIN SUPPORTED gi|228470692|ref|ZP_04055543.1| ribosomal protein S12 - Prom 19257 - 19316 5.3 + Prom 19221 - 19280 6.0 35 4 Tu 1 . + CDS 19409 - 19951 657 ## gi|313158278|gb|EFR57680.1| hypothetical protein HMPREF9720_2545 + Term 20165 - 20195 1.0 + Prom 20162 - 20221 3.0 36 5 Op 1 . + CDS 20276 - 23032 4418 ## COG0495 Leucyl-tRNA synthetase 37 5 Op 2 . + CDS 23055 - 23930 1180 ## COG0657 Esterase/lipase + Term 23952 - 23987 4.6 38 6 Op 1 . + CDS 24049 - 25134 1800 ## COG0216 Protein chain release factor A 39 6 Op 2 . 
+ CDS 25154 - 25456 387 ## COG3695 Predicted methylated DNA-protein cysteine methyltransferase + Prom 25971 - 26030 2.3 40 7 Op 1 . + CDS 26097 - 26636 610 ## gi|313158325|gb|EFR57727.1| hypothetical protein HMPREF9720_2550 41 7 Op 2 . + CDS 26640 - 27887 1624 ## COG1570 Exonuclease VII, large subunit 42 7 Op 3 . + CDS 27959 - 28150 337 ## gi|313158336|gb|EFR57738.1| conserved hypothetical protein 43 7 Op 4 . + CDS 28150 - 28704 692 ## COG3153 Predicted acetyltransferase 44 7 Op 5 . + CDS 28718 - 29491 1065 ## Ftrac_3122 methyltransferase type 11 + Term 29531 - 29569 4.9 - Term 29517 - 29556 5.1 45 8 Tu 1 . - CDS 29582 - 30238 962 ## COG1272 Predicted membrane protein, hemolysin III homolog - Prom 30259 - 30318 1.8 + Prom 30211 - 30270 3.6 46 9 Op 1 . + CDS 30320 - 32518 3715 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 47 9 Op 2 . + CDS 32552 - 33172 723 ## TEQUI_0601 hypothetical protein 48 9 Op 3 . + CDS 33189 - 34391 2153 ## COG0137 Argininosuccinate synthase 49 9 Op 4 1/0.000 + CDS 34388 - 35350 1203 ## COG0002 Acetylglutamate semialdehyde dehydrogenase 50 9 Op 5 9/0.000 + CDS 35510 - 36640 1603 ## COG4992 Ornithine/acetylornithine aminotransferase 51 9 Op 6 1/0.000 + CDS 36776 - 37732 1685 ## COG0078 Ornithine carbamoyltransferase 52 9 Op 7 1/0.000 + CDS 37732 - 38499 1007 ## COG0548 Acetylglutamate kinase 53 9 Op 8 . + CDS 38553 - 39614 1206 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases 54 10 Tu 1 . + CDS 39767 - 41065 1993 ## COG0165 Argininosuccinate lyase + Term 41090 - 41122 1.5 - Term 41075 - 41113 6.1 55 11 Op 1 . - CDS 41131 - 42336 1682 ## COG4124 Beta-mannanase 56 11 Op 2 . - CDS 42361 - 43254 880 ## COG2207 AraC-type DNA-binding domain-containing proteins 57 11 Op 3 . - CDS 43264 - 44601 1492 ## COG2211 Na+/melibiose symporter and related transporters 58 11 Op 4 . - CDS 44605 - 46611 2123 ## BT_1871 putative alpha-glucosidase 59 11 Op 5 . 
- CDS 46608 - 46793 204 ## gi|313158280|gb|EFR57682.1| hypothetical protein HMPREF9720_2570 - Prom 46960 - 47019 1.6 60 12 Op 1 . + CDS 46984 - 48549 2254 ## gi|313158296|gb|EFR57698.1| hypothetical protein HMPREF9720_2571 61 12 Op 2 . + CDS 48552 - 51122 3265 ## COG3250 Beta-galactosidase/beta-glucuronidase 62 12 Op 3 . + CDS 51142 - 52503 737 ## COG1626 Neutral trehalase 63 12 Op 4 . + CDS 52500 - 55577 4724 ## ZPR_0351 TonB-dependent receptor Plug domain protein 64 12 Op 5 . + CDS 55590 - 57074 2486 ## ZPR_0352 hypothetical protein 65 12 Op 6 . + CDS 57086 - 57985 1020 ## Celly_1222 hypothetical protein + Term 58002 - 58049 16.6 - Term 57986 - 58041 16.2 66 13 Op 1 . - CDS 58084 - 58650 929 ## COG0231 Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) 67 13 Op 2 . - CDS 58711 - 59259 746 ## COG0163 3-polyprenyl-4-hydroxybenzoate decarboxylase 68 13 Op 3 . - CDS 59261 - 59833 995 ## COG0726 Predicted xylanase/chitin deacetylase - Prom 59984 - 60043 3.9 + Prom 59903 - 59962 2.6 69 14 Op 1 . + CDS 60033 - 61040 1603 ## BDI_0186 hypothetical protein 70 14 Op 2 . + CDS 61080 - 62495 1916 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Term 62515 - 62554 6.4 71 15 Op 1 19/0.000 - CDS 62590 - 63225 979 ## COG2197 Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain 72 15 Op 2 . - CDS 63230 - 64201 1504 ## COG4585 Signal transduction histidine kinase 73 15 Op 3 . - CDS 64236 - 65009 1056 ## COG0496 Predicted acid phosphatase + Prom 65431 - 65490 1.8 74 16 Tu 1 . + CDS 65663 - 66043 335 ## PROTEIN SUPPORTED gi|170017595|ref|YP_001728514.1| ribosomal protein L19 + Term 66060 - 66105 12.3 - Term 66627 - 66677 11.6 75 17 Op 1 . - CDS 66695 - 67993 1728 ## COG0738 Fucose permease 76 17 Op 2 . - CDS 68031 - 69020 1434 ## COG0142 Geranylgeranyl pyrophosphate synthase - Prom 69112 - 69171 1.6 77 18 Op 1 . 
+ CDS 69288 - 71777 3761 ## Sph21_1623 protein of unknown function, membrane YfhO 78 18 Op 2 . + CDS 71782 - 72663 1473 ## COG2177 Cell division protein 79 18 Op 3 . + CDS 72664 - 72921 452 ## CA2559_04540 hypothetical protein 80 18 Op 4 . + CDS 72981 - 73772 1189 ## COG1968 Uncharacterized bacitracin resistance protein 81 18 Op 5 . + CDS 73772 - 74488 1174 ## COG0130 Pseudouridine synthase + Term 74491 - 74551 5.0 82 19 Tu 1 . + CDS 74553 - 75602 1847 ## COG0809 S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) 83 20 Op 1 . + CDS 75725 - 76531 994 ## COG0297 Glycogen synthase 84 20 Op 2 . + CDS 76542 - 78197 2774 ## gi|313158277|gb|EFR57679.1| putative lipoprotein + Term 78217 - 78262 11.7 85 21 Op 1 . + CDS 78275 - 80572 3562 ## COG0550 Topoisomerase IA 86 21 Op 2 . + CDS 80603 - 81811 1819 ## Odosp_1312 peptidase C1B bleomycin hydrolase + Term 82032 - 82069 8.5 - Term 82016 - 82061 12.0 87 22 Tu 1 . - CDS 82104 - 82181 121 ## - Prom 82203 - 82262 7.3 88 23 Tu 1 . - CDS 82313 - 82954 478 ## COG1357 Uncharacterized low-complexity proteins - Prom 83164 - 83223 2.2 + Prom 82928 - 82987 2.0 89 24 Tu 1 . + CDS 83227 - 84636 2064 ## BVU_4075 hypothetical protein 90 25 Op 1 . - CDS 84859 - 85425 491 ## BDI_1990 siderophore biosynthesis regulatory protein 91 25 Op 2 . - CDS 85422 - 86708 1991 ## COG1253 Hemolysins and related proteins containing CBS domains 92 25 Op 3 35/0.000 - CDS 86725 - 88038 1635 ## COG0206 Cell division GTPase 93 25 Op 4 . - CDS 88518 - 89957 2175 ## COG0849 Actin-like ATPase involved in cell division 94 25 Op 5 . - CDS 89970 - 91070 1394 ## Palpr_1497 cell division protein FtsQ 95 25 Op 6 26/0.000 - CDS 91164 - 92546 1790 ## COG0773 UDP-N-acetylmuramate-alanine ligase - Term 92619 - 92644 -0.5 96 25 Op 7 . - CDS 92645 - 93751 1700 ## COG0707 UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase - Term 93996 - 94032 2.0 97 26 Tu 1 . 
- CDS 94102 - 95640 1783 ## COG0606 Predicted ATPase with chaperone activity 98 27 Tu 1 . - CDS 95796 - 97361 2402 ## COG0305 Replicative DNA helicase - Prom 97381 - 97440 2.7 99 28 Op 1 . - CDS 97526 - 98248 768 ## COG1040 Predicted amidophosphoribosyltransferases 100 28 Op 2 1/0.000 - CDS 98232 - 100487 2838 ## COG1198 Primosomal protein N' (replication factor Y) - superfamily II helicase 101 28 Op 3 . - CDS 100487 - 101383 1601 ## COG1284 Uncharacterized conserved protein 102 28 Op 4 . - CDS 101457 - 102410 1201 ## COG1242 Predicted Fe-S oxidoreductase 103 28 Op 5 . - CDS 102413 - 103936 1627 ## BT_3558 putative endonuclease 104 28 Op 6 . - CDS 103951 - 104406 598 ## COG1490 D-Tyr-tRNAtyr deacylase 105 28 Op 7 . - CDS 104420 - 105106 914 ## COG0564 Pseudouridylate synthases, 23S RNA-specific 106 29 Tu 1 . - CDS 105210 - 105665 606 ## COG0295 Cytidine deaminase - Prom 105696 - 105755 2.1 - Term 105844 - 105881 4.5 107 30 Op 1 . - CDS 105903 - 106589 1212 ## Riean_0656 hypothetical protein 108 30 Op 2 . - CDS 106664 - 107017 469 ## Sgly_0946 glyoxalase/bleomycin resistance protein/dioxygenase 109 30 Op 3 . - CDS 107050 - 108459 1860 ## COG2265 SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase - Prom 108541 - 108600 2.9 110 31 Op 1 . - CDS 108691 - 109830 1371 ## VS_0953 hypothetical protein 111 31 Op 2 . - CDS 109849 - 110673 1247 ## COG0657 Esterase/lipase - Prom 110824 - 110883 5.2 + Prom 110661 - 110720 5.0 112 32 Op 1 . + CDS 110834 - 111502 1041 ## COG0491 Zn-dependent hydrolases, including glyoxylases 113 32 Op 2 . + CDS 111567 - 112295 1066 ## MAB_2973c putative methyltransferase 114 32 Op 3 . + CDS 112319 - 112999 1082 ## COG0336 tRNA-(guanine-N1)-methyltransferase 115 32 Op 4 . + CDS 113020 - 114159 1690 ## COG0763 Lipid A disaccharide synthetase 116 32 Op 5 . + CDS 114169 - 114486 423 ## Cthe_2202 hypothetical protein 117 32 Op 6 . 
+ CDS 114496 - 115209 952 ## COG0179 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) + Prom 115334 - 115393 1.7 118 33 Op 1 . + CDS 115415 - 116008 960 ## COG0424 Nucleotide-binding protein implicated in inhibition of septum formation 119 33 Op 2 . + CDS 116008 - 117996 3045 ## Odosp_0699 peptidase S46 120 33 Op 3 . + CDS 118036 - 119559 1826 ## COG0793 Periplasmic protease + Term 119691 - 119731 -0.9 + TRNA 119769 - 119856 49.7 # Leu GAG 0 0 - Term 120051 - 120096 2.7 121 34 Tu 1 . - CDS 120098 - 121159 1475 ## Lbys_2826 hypothetical protein - Prom 121224 - 121283 6.6 - Term 121223 - 121272 -0.8 122 35 Tu 1 . - CDS 121313 - 121411 62 ## - Prom 121461 - 121520 3.1 + Prom 121708 - 121767 4.7 123 36 Tu 1 . + CDS 121956 - 122117 87 ## + Prom 122204 - 122263 4.4 124 37 Tu 1 . + CDS 122376 - 123143 179 ## Tresu_0562 hypothetical protein + Term 123173 - 123213 4.1 + Prom 123202 - 123261 5.6 125 38 Tu 1 . + CDS 123438 - 123797 242 ## BT_4618 hypothetical protein + Term 123836 - 123883 1.1 + Prom 124503 - 124562 2.9 126 39 Tu 1 . + CDS 124628 - 125170 10 ## BT_0230 hypothetical protein + Term 125183 - 125220 -0.4 + Prom 125571 - 125630 2.2 127 40 Tu 1 . + CDS 125762 - 126304 230 ## BT_0229 hypothetical protein + Term 126453 - 126486 0.3 + Prom 126309 - 126368 4.1 128 41 Tu 1 . + CDS 126523 - 127908 712 ## PG1109 mobilization protein 129 42 Tu 1 . - CDS 127916 - 128722 412 ## COG0582 Integrase - Prom 128744 - 128803 5.2 + Prom 129126 - 129185 5.0 130 43 Tu 1 . + CDS 129405 - 129506 99 ## + Prom 130280 - 130339 3.3 131 44 Tu 1 . 
+ CDS 130444 - 130731 95 ## CCV52592_0371 type I restriction-modification system S subunit + Term 130813 - 130848 6.1 Predicted protein(s) >gi|313158213|gb|AENZ01000048.1| GENE 1 73 - 678 512 201 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228470701|ref|ZP_04055552.1| ribosomal protein L17 [Porphyromonas uenonis 60-3] # 1 167 1 166 169 201 59 1e-50 MRHNKNFNHLGRQAGHRKALMSNMASSLILHKRIETTVAKAKAVRQFIEPLVTKSKDDTT HSRRIVFSYLKQKEAVTELFRTIAPKIADRPGGYTRILKTGFRLGDAADMCIIEFVDFND AYTLGIAPAAAAEAKPKTRRSRKPAAKKTDAVEEATVVEGEAKKAAPKKAAAPKAAKPAA PKAAKVAAPKVAKKTNVGKKM >gi|313158213|gb|AENZ01000048.1| GENE 2 681 - 1673 1325 330 aa, chain - ## HITS:1 COG:BH0162 KEGG:ns NR:ns ## COG: BH0162 COG0202 # Protein_GI_number: 15612725 # Func_class: K Transcription # Function: DNA-directed RNA polymerase, alpha subunit/40 kD subunit # Organism: Bacillus halodurans # 3 316 1 308 314 231 43.0 2e-60 MAILAFQKPEKVIMLESTSSFGKFEFRPLEPGFGMTVGNALRRILLSSLEGYAITTVKVA GVDHEFAAIPGVMEDMLKIILNLKQVRFIRTVDNQDAEKVSINVAGVTELTAGYISNYLS FFKVLNPDLVICHLAPGTKMQMTLTIGKGRGYVPAEENTPAECEFGTLPIDSIFTPIKNV KYSIENYRVEQKTDYEKLNLEITTDGSIHPKEALKEAAKILIQHFMLFSDEKITVNMEDT NGAEEFDEDVLHMRQLLKTKLSDQDLSVRALNCLKAADVDTVGDLVKLNRNDLLKFRNFG KKSLTELDELLTSLNLKFGMDVSIYKLDKD >gi|313158213|gb|AENZ01000048.1| GENE 3 1698 - 2306 748 202 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229496321|ref|ZP_04390041.1| ribosomal protein S4 [Porphyromonas endodontalis ATCC 35406] # 1 202 9 209 209 292 69 5e-78 MGKYIGPKSKIARKFGEAIYGADKVLEKRNFPPGQHGLARKRKKVSEYGTQLSEKQKAKY TYGLLEKQFARTYDQAARMGGITGENLLKLLECRLDNVVYRLGIAPTRAAARQLVSHCHI CVNGNVVNIPSYSLRAGDVVSVREKSKSLEVISASLAGGSKSRYAWLEWDNASMSGKFLQ KPEREEIPENIKEQLIVELYSK >gi|313158213|gb|AENZ01000048.1| GENE 4 2356 - 2745 528 129 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229496347|ref|ZP_04390067.1| 30S ribosomal protein S11 [Porphyromonas endodontalis ATCC 35406] # 1 129 1 128 128 207 77 2e-52 MAKKTGTVKKKVVKVGAVGNAYVHSTFNNVIITITNEVGDVISWSSAGKMGFRGSKKNTP YAAQTASADCAKVAYDMGLRKVKVYVKGPGAGRESAVRTIHGAGIEVMEIIDVTPLPHNG 
CRAPNRRRV >gi|313158213|gb|AENZ01000048.1| GENE 5 2759 - 3139 502 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149279640|ref|ZP_01885769.1| 30S ribosomal protein S13 [Pedobacter sp. BAL39] # 1 126 1 125 125 197 76 2e-49 MARIVGVDLPKNKRGEIGLTYIYGIGRSTARKILEVAGISYDLKVQDWSDDQIGAIRSTI ADMGIKVEGECRSAVQLNIKRLMDIGCYRGIRHRLGLPVRGQSTKNNARTRKGRKKTVAN KKKATK >gi|313158213|gb|AENZ01000048.1| GENE 6 3177 - 3293 170 38 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228470715|ref|ZP_04055566.1| ribosomal protein L36 [Porphyromonas uenonis 60-3] # 1 38 1 38 38 70 84 5e-11 MKVKASIKKRSEDCKIVKRKGKLYVICKKNPKFKMRQG >gi|313158213|gb|AENZ01000048.1| GENE 7 3314 - 3532 229 72 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900168|ref|NP_344772.1| translation initiation factor IF-1 [Streptococcus pneumoniae TIGR4] # 1 72 1 72 72 92 56 7e-18 MAKQAAIERDGTIIEALSNAMFRVELDNGHVLTAHISGKMRMHYIKILPGDKVKVEMTPY DLTKGRISFRYK >gi|313158213|gb|AENZ01000048.1| GENE 8 3539 - 4315 737 258 aa, chain - ## HITS:1 COG:BH0156 KEGG:ns NR:ns ## COG: BH0156 COG0024 # Protein_GI_number: 15612719 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionine aminopeptidase # Organism: Bacillus halodurans # 1 247 1 246 248 273 53.0 2e-73 MIYIKTDEEIELLRENNILVSKTLAEVGRHIRPGVTTKFLDSIAEDFIRAHGAVPAFLGY QGFPASLCVSVNEQVVHGIPSSKCVLKEGDIVSVDCGTFMKGFVGDSAYTFAVGEVAEEV RQLMEVTKEALYKGTAQAKAGNRVGDVSAAVQEYAESFGYGVVRELEGHGLGRKMHEDPG VPNYGARGRGPLLKEGMVICIEPMINMGTKAVVFERDGWTVRTRDHKPAAHYEFAVAVRK DGPDVLTDFSIIEQAINK >gi|313158213|gb|AENZ01000048.1| GENE 9 4344 - 5720 839 458 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163796899|ref|ZP_02190856.1| 30S ribosomal protein S11 [alpha proteobacterium BAL199] # 33 457 19 445 447 327 41 1e-88 MNQYLVVGVVFLVVIALLATNKKLVETLKNIYKIEELRKRVIYTIGLLLVYRLGSFVVIP GINPNALGEGSAYASQLEGNGLLGLLNVFSGGAFGNAAILALGVMPYITASIIIQLMGMM IPYFQKMQKEGESGRRKMNQWTRFLTIGVLILQGPAYIANLYHQVPTAFVYGNSFGFVAY ATTILIAGTMFIMWLGEKITDKGIGNGISLIIMIGIVARLPHALLAEVNARFQTASGSAI 
MLILELVLLFLVFMATIALVQAVRKVPVQYAKRIVGNKQYGGVRQYIPLKMNAANVMPII FAQALMFIPALFSGTAFAAAFSSMTGFWYNFTLAVLVIAFTYFYTAIIINPQMMADDMKR NGGFIPGVKPGKQTVNYIDTIMTRITLPGSFFLAIVAILPALAMKFLGIQQAFAYFYGGT SLLIMVGVVLDTLKQIESYLLMRHYDGLMKTGRIQGRH >gi|313158213|gb|AENZ01000048.1| GENE 10 5743 - 6189 528 148 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|160883045|ref|ZP_02064048.1| hypothetical protein BACOVA_01008 [Bacteroides ovatus ATCC 8483] # 1 148 1 148 148 207 70 2e-52 MELNNLKPAKGSTHHDKRIGRGAGSGHGGTATRGHKGAQSRSGYSRKLGFEGGQMPLQRR LPKFGFTNLKRVEFKAINLSTIEELAAKKSLAEITVDTLIAAGFISSKDKVKILGNGAVT KALSVKAHAFSKSAEAAITAAGGSVEKL >gi|313158213|gb|AENZ01000048.1| GENE 11 6217 - 6459 342 80 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158269|gb|EFR57671.1| ## NR: gi|313158269|gb|EFR57671.1| ribosomal protein L30 [Alistipes sp. HGB5] # 1 59 1 59 80 90 100.0 4e-17 MARLKITQVKSRIGATERQCKNLDALGLKRINASVEHDDSAIIKGMIERVKHLVKVEEVA VKKAAAPKAKKAEATEAAAE >gi|313158213|gb|AENZ01000048.1| GENE 12 6474 - 6992 661 172 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|149279647|ref|ZP_01885776.1| 30S ribosomal protein S5 [Pedobacter sp. BAL39] # 1 172 1 172 172 259 72 6e-68 MSNTNIKKVRTSDLELKDRLVSIQRVTKVTKGGRTFSFSAIVVVGNEDGVVGYGLGKASE VQSAIAKGVEDAKKNLVKIPVINGTIPHKQELRFGGSLVMIRPAAPGTGIIAGGAMRAVL ESVGVKNVLAKSKGSSNPHNLVKATIGALCELRDAHSIARLRGISMDKVFNG >gi|313158213|gb|AENZ01000048.1| GENE 13 6999 - 7361 388 120 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|213962903|ref|ZP_03391163.1| ribosomal protein L18 [Capnocytophaga sputigena Capno] # 1 120 1 118 118 154 65 3e-36 MSLNKIERRERIKMRIRKIVSGTAEQPRMTVYRSNKQIYVQFIDDLAGVTMAAASSLDKE VAEAAAGKNKCEVAALVGKLAAERAAAKGISSVAFDRNGYLYHGRVKQLAEAAREGGLKF >gi|313158213|gb|AENZ01000048.1| GENE 14 7378 - 7965 611 195 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646968|ref|ZP_01719478.1| 50S ribosomal protein L6 [Algoriphagus sp. 
PR1] # 1 195 1 185 185 239 62 4e-62 MSRIGKLPVNLPAGVTVEVSADNVVSVKGPLGTLSQKVDSDIKVEVGTHKDIHTGNEVPA VLVSRPTNQPRHRSLHGLYRALINNMVVGVSKGYEIKQELVGVGFKAEVKGQILEMSLGY SHDTHFMLPKEVTATAVTEKKGNPIVTLKSMDKQLIGQVAAKIRSLRKPEPYKGKGIKFV GEQLRRKAGKSAGAK >gi|313158213|gb|AENZ01000048.1| GENE 15 7991 - 8389 472 132 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229496261|ref|ZP_04389981.1| ribosomal protein S8 [Porphyromonas endodontalis ATCC 35406] # 1 132 1 132 132 186 68 5e-46 MTDPIADFLTRIRNAVKANHKVVEAPGSKIKQEITKILYEQGYILAYKFDTDEKGHPSIK IALKWDAATKGNAIKDLKRVSRPGLRQYVSVADMPRVLNGLGIAVLSTSKGIMTDAKARK ENVGGEVLCYIY >gi|313158213|gb|AENZ01000048.1| GENE 16 8659 - 9270 -140 203 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MTRTKIPIDNRPPFRRRFRPTTLRRLSARPSRHIRVHTRHPVIQRVTPCRTHFARSLSGP SVREICNQMPNPSISAPPPRSYRLRRHFPRHCDVTTRILHPYKPRSLRYRTAARFIGLRT IAGFTLRIPGFPLLQSPSPTPGRLSDSRPSTGLRPQNPDSYPGFLLQMSLTGKPCPSLRT NLAYLRAFIALLPSPFGLPPGRK >gi|313158213|gb|AENZ01000048.1| GENE 17 9309 - 9605 345 98 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|29348123|ref|NP_811626.1| 30S ribosomal protein S14 [Bacteroides thetaiotaomicron VPI-5482] # 1 98 1 99 99 137 71 3e-31 MAKESMKAREVKRVKLANRYAHKREEIAAKLKSGEIDHEEAWVALSKLPRNSNPIRQHNR CKITGRPKGYIRLFGISRIQFREMASKGLIPGVTKASW >gi|313158213|gb|AENZ01000048.1| GENE 18 9609 - 10169 693 186 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150008972|ref|YP_001303715.1| 50S ribosomal protein L5 [Parabacteroides distasonis ATCC 8503] # 1 185 1 185 185 271 74 1e-71 MAYVPTLKTQYKERIEPALMKEFGYSSVMQCPKLTKIVINQGMGQAVADKKLIDVAEAEL TQITGQKAVQTRSRKDISNFKLRKGMPIGVRVTLRDTKMYEFLERLIAVALPRIRDFKGI NEKFDGQGNYTLGITEQIIFPEIDIDKITKIFGMEITFVTTAKTDEEAYALLREFGLPFK NAKKNQ >gi|313158213|gb|AENZ01000048.1| GENE 19 10169 - 10492 324 107 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228472133|ref|ZP_04056899.1| ribosomal protein L24 [Capnocytophaga gingivalis JCVIHMP016] # 3 106 2 104 104 129 64 7e-29 MSVKLHIKKGDMVQVIAGDNKGQQGKVLKVEVTKQRAIVEGVNLCKKATKPNAQNPQGGI VEKEAAIHISNLQVLDPKSGKPTKVGRKLNAKGKLVRYAKKSGEEIK 
>gi|313158213|gb|AENZ01000048.1| GENE 20 10513 - 10878 497 121 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|89891142|ref|ZP_01202650.1| 50S ribosomal protein L14 RplN [Flavobacteria bacterium BBFL7] # 1 121 1 122 122 196 81 6e-49 MVQQESRLIVADNSGAKEVLCIRVLGGTKRRYASIGDKIVVAVKSATPSGDVKKGAVSKA VVVRTTKEIRRANGSYIRFDDNAVVLLNNQGEMRGTRIFGPVARELRDQYMKIISLAPEV L >gi|313158213|gb|AENZ01000048.1| GENE 21 10881 - 11135 325 84 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|241891060|ref|ZP_04778356.1| ribosomal protein S17 [Sphingobacterium spiritivorum ATCC 33861] # 1 84 1 84 84 129 77 5e-29 MERNLRKERIGVVVSNKMEKTIVVAVARKVKHPIYGKFVNKTTKFVAESLDETCKEGDTV RIMETRPLSKTKRWRLVQIIERAK >gi|313158213|gb|AENZ01000048.1| GENE 22 11148 - 11336 244 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158264|gb|EFR57666.1| ## NR: gi|313158264|gb|EFR57666.1| ribosomal protein L29 [Alistipes sp. HGB5] # 1 62 1 62 62 77 100.0 3e-13 MKSAEIKDISIKDLQERIETEKAQLAKLKVQHAVSPVENPSIIKKNRRDIARMLTILRQK NA >gi|313158213|gb|AENZ01000048.1| GENE 23 11349 - 11777 623 142 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|150008977|ref|YP_001303720.1| 50S ribosomal protein L16 [Parabacteroides distasonis ATCC 8503] # 1 137 1 137 144 244 83 2e-63 MLQPKKTKFRRMQKGRMKGIAQRGNQLAYGSFAIKALESTWITGRQIEAARQAITRYMKR EGQLWIRIFPDKPITKKPAEVRMGKGKGNPEGFVAPVTPGRILFEAEGVPLEIAQEALRL GAQKLPITTKFIVRRDYVETTI >gi|313158213|gb|AENZ01000048.1| GENE 24 11805 - 12596 910 263 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|237725879|ref|ZP_04556360.1| 30S ribosomal protein S3 [Bacteroides sp. 
D4] # 1 242 1 242 242 355 69 8e-97 MGQKVNPIANRLGIIRGWDSNWFGGKDFSEKLVEDAKIRKYLNVRLAKASISKIIIERTL KLVTVTISTARPGIIIGKGGQEVDKLKEELKKLTGKEIQINIFEVKRPEVDAVIVGQNIA RQLEGRVSFRRAVKTAVASTMRMGAEGIKIQVSGRVGGAEMARSETIKEGRIPLHTFRAD VDFCLTEALTKVGILGVKVWICQGIVYGKRDLFEIAGASAQAPSGDRGERGDRGDRRGRG DRRGGRGDRGDRGGDRRRNNNNK >gi|313158213|gb|AENZ01000048.1| GENE 25 12601 - 13011 453 136 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163786185|ref|ZP_02180633.1| 50S ribosomal protein L22 [Flavobacteriales bacterium ALC-1] # 1 130 1 130 135 179 69 8e-44 MGARKAIKAEKIKAEKKQKAIAVLRDCPTSPRKMRLVADIIRGVEINKALGILRFSKKEA SIRMEKLLKSAIANWEAKNENERLEDTKLCVKEVFVDGGRMLKRIQAAPQGRAHRIRKRS NHVTIVVDKMVTVENK >gi|313158213|gb|AENZ01000048.1| GENE 26 13019 - 13288 397 89 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715462|ref|YP_101454.1| 30S ribosomal protein S19 [Bacteroides fragilis YCH46] # 1 89 1 89 89 157 84 2e-37 MSRSLKKGPFIDPKLEAKVVAQAEGNKKSVIKTWSRASMISPDFVGQTIAVHNGNKFIPV YVTENMVGHKLGEFAPTRNFRGHAGNKKK >gi|313158213|gb|AENZ01000048.1| GENE 27 13306 - 14130 1086 274 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715463|ref|YP_101455.1| 50S ribosomal protein L2 [Bacteroides fragilis YCH46] # 1 274 1 274 274 422 73 1e-117 MAVKKFKPVTPGTRHKIIGTFDEITTNKPEKSLLSPQKNSGGRNNSGKMTVRYIGGGNKQ MYRNIDFKRTKDGIAATVKTIEYDPNRTARIALLVYADGEKSYIIAPNGLQVGQTVVSGA GVAPEVGNTLFLSEIPLGTVIHNIELYPGQGAAMARSAGSYAQLLAREGKFAIIKLPSGE TRMVLVTCRATVGVVSNIDHSLESSGKAGRERWLGRRPRNRGVVMNPVDHPMGGGEGRAS GGHPRSRKGLLAKGYKTRNPKKTTSKFIISKKKK >gi|313158213|gb|AENZ01000048.1| GENE 28 14136 - 14426 329 96 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715464|ref|YP_101456.1| 50S ribosomal protein L23 [Bacteroides fragilis YCH46] # 1 96 1 96 96 131 67 2e-29 MEIIIKPVLTEKMTIQGEKLNRYGFIVDVRANKLQIRNAVEQMYNVVVTNVNTVNYMGKL KSRFTKAGLLEGRANNFKKAIVTLKDGDKIDFYSNI >gi|313158213|gb|AENZ01000048.1| GENE 29 14443 - 15069 708 208 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646981|ref|ZP_01719491.1| 50S ribosomal protein L4 [Algoriphagus sp. 
PR1] # 1 208 1 208 209 277 64 2e-73 MELAVLNTQGKETGRKVVLSDAVFGVEANDHAIYLDVKQYLADQRQGTHKSKQRNEVAGS TRKLKRQKGTGGARCGSIKSPLFPGGGRIFGPQPRDYSFKLNKKLKQLARRSALTYKAQA GAISVVETLSLDAPKTKAVVALADALKVADKKVLLVLPESNANLQLSCRNLPYVQPVLAQ NVCTYDVMNASAIVMVEGAENVLNTMLA >gi|313158213|gb|AENZ01000048.1| GENE 30 15071 - 15685 785 204 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715466|ref|YP_101458.1| 50S ribosomal protein L3 [Bacteroides fragilis YCH46] # 1 204 1 205 205 306 71 2e-82 MSGLIGKKIGMTSVYSAEGKNIPCTVIEAGPCVVTQIKTVEKDTYEAVQIAYDEKSEKHT SSSLLGHFKKAGTTPKRKMAEFKGMPEVNLGDTLTVELFSEEDWVDVTGISKGKGFQGVV KRHGFGGVGGQTHGQHNRQRKPGSLGASSYPSRVFKGKRLPGQMGGEQVKVLNLRVLKVI PESNLILVKGSIPGAKGAYLTIEK >gi|313158213|gb|AENZ01000048.1| GENE 31 15758 - 16063 451 101 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|53715467|ref|YP_101459.1| 30S ribosomal protein S10 [Bacteroides fragilis YCH46] # 1 101 1 101 101 178 85 1e-43 MNQKIRIKLKSFDHNLVDKSAEKIVKTVKSTGAVVSGPVPLPTHKQIFTVNRSTFVNKKA REQFQLCTFKRILDIYSSTPKTIDALMKLELPSGVEVEIKV >gi|313158213|gb|AENZ01000048.1| GENE 32 16072 - 18186 2721 704 aa, chain - ## HITS:1 COG:BH0131 KEGG:ns NR:ns ## COG: BH0131 COG0480 # Protein_GI_number: 15612694 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factors (GTPases) # Organism: Bacillus halodurans # 6 699 7 692 692 804 58.0 0 MATRDLNYTRNIGIMAHIDAGKTTTSERILFYTGKTHKIGETHEGAAVMDWMVQEQERGI TITSAATTAFWNYKGNQYQINLIDTPGHVDFTVEVERSLRVLDGAVATFCAVGGVEPQSE TVWRQADKYNVPRLGYVNKMDRTGADFLAVCAQIKEHFGATALPVVLPIGAEDKFEGLVD LIYNNAIYYYDDKTVRDNFELKEIPESMKAEAAEWRAKLIEEVAATDDALMEKFFEDPDS ITAEELLAAIRKATISMQVVPMLCGSSFHNKGVQKLLDYVMAFLPSPLDVPAVTGINPKT EKEETRNASEDDPFCGLAFKIATDPFVGRLAFVRVYSGKLDAGSYVLDTRSGKRERISRI YQMHANKQNPMETVGAGDICAAVGFKEIRTGDTLCAENAPIVLEQMTFPEPVIGLAVEPK TQKDLDKLGIALGKLAEEDPTFTVHTDEDSGQTVISGMGELHLDIIVDRLRREFGVEINQ GAPQVNYKEALTKTVQHREVFKKQTGGRGKFADIIFEIGPADEGQIGLTFVDEVKGGNIP KEFIPSVQKGFASAMDNGALAGYKMDSMKVTLKDGSFHPVDSDQLSFEIAARNGYRAAAP 
KAGSVILEPIMSVEVVTPEESMGDIIGDLNKRRGQITGMESKGTARVVKAKVPLSEMFGY VTVLRTISSGRATSSMEFSHFEEVPANLAKEIIEKASGKRKDIE >gi|313158213|gb|AENZ01000048.1| GENE 33 18203 - 18679 635 158 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|188995735|ref|YP_001929987.1| 30S ribosomal protein S7 [Porphyromonas gingivalis ATCC 33277] # 1 158 1 158 158 249 75 6e-65 MRKAKPKKRILLPDPKFGDVSVTRFVNNLMLDGKKSIAYTIFYDALELVGKKMKDADKSP LEIWKQALENITPQVEVKSKRIGGATFQVPMEVRPERKISISMKNLVLYSRKRAGKTMAD RLSAEIMDAFNQQGAAFKRKEEMHRMAEANKAFAHFRF >gi|313158213|gb|AENZ01000048.1| GENE 34 18720 - 19100 594 126 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|228470692|ref|ZP_04055543.1| ribosomal protein S12 [Porphyromonas uenonis 60-3] # 1 124 1 124 125 233 91 3e-60 MPTIQQLVRNGRVRMEDKSKSPALAACPQRRGVCTRVYTTTPKKPNSAMRKVARVRLTNG KEVNAYIPGEGHNLQEHSIVLVRGGRVKDLPGVRYHLVRGALDSAGVDGRRQRRSKYGAK RPKAGK >gi|313158213|gb|AENZ01000048.1| GENE 35 19409 - 19951 657 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158278|gb|EFR57680.1| ## NR: gi|313158278|gb|EFR57680.1| hypothetical protein HMPREF9720_2545 [Alistipes sp. 
HGB5] # 1 180 1 180 180 342 100.0 5e-93 MKRLLTTFAFALAALCISACDSRRQPLFTANPLGAGPVLLTVSAAQIPASHPGLYDTFTT DRTPAGETVLRFTLAGEPVMEARAYGDEIESIEIFGPGVGSTDGIAPGTDVKHLFENGGI SQTDNDGRLVITLNGMTYRVSGLGEEGRKKLGKAHADGVTPRISPQDFDPGAKVTSILIN >gi|313158213|gb|AENZ01000048.1| GENE 36 20276 - 23032 4418 918 aa, chain + ## HITS:1 COG:MT0047 KEGG:ns NR:ns ## COG: MT0047 COG0495 # Protein_GI_number: 15839418 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Leucyl-tRNA synthetase # Organism: Mycobacterium tuberculosis CDC1551 # 8 918 35 969 969 736 43.0 0 MEYNFKEIESKWQRRWQEDKTYKVETDPSRPKFYVLDMFPYPSGAGLHVGHPLGYIASDI YSRYKRQKGFNVLHPMGYDAFGLPAEQYAIQTGQHPAVTTEQNIARYREQLDKIGFSFDW DREVRTCDPGYYKWTQWAFLKMFGSYYCNDRQQARPIEELTAAFERNGTEGLNVACTQEL HFTAEEWRAMSEAEKEQTLQNYRLAFRADTMVNWCAKLGTVLANDEVHDGLSVRGGYPVE QKRMKQWLLRVTAYAQRMLDGLDKLAWSDSLKEIQRNWIGRSEGAQVFFDIKDSDRKLEI FTTRPDTIFGVTFMVIAPEHEWVDSLTTPENRAAVEEYICQAKKRSERERIAETKRVSGV ATGSYTVNPFTGKAIPIYISDYVLAGYGTGAIMAVPAHDSRDYAFARHFGLEIVPVVEGG DISQESYDAKSGKVINSDFLNGKDVKEAIGLMFDEIERRGLGRKRINYRLRDAIFSRQRY WGEPFPIYYKNDIACPLAEDKLPLELPPVADFGPTEQGEPPLARVKEWATPEGWPYELST MPGFAGSSAYYLRYMDPHNDRALVGREADEYWRSVDLYVGGIEHATGHLMYSRFWNMFLY DLGVVCEEEPFRKLVNQGMIQGRSNFVYRIVGTNKFVSLGLKEQYRTQALYVDVNIVRND ILDIDAFRAWMPEYKDAEFILEDGKYVCGWAIEKMSKSFYNVVNPDYIVENYGADTLRMY EMFLGPLEQSKPWDTNGIDGVYKFLRRFWRLFYDRDGRLIVTDEKADEKELRTLHKTIKK VSEDIENFSFNTSVSAFMICLNELGECNKREIIEPLTVLLAPFAPHIAEELWAALGHTTS VCTADYPVYDEKHLAQSAFEYPVSINGKLRFKKEYATSLTPAQMQADVVTLPEAQKWLEG KTPKKVIVVPGKIINIVI >gi|313158213|gb|AENZ01000048.1| GENE 37 23055 - 23930 1180 291 aa, chain + ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 28 269 42 305 328 144 36.0 1e-34 MKRIATALTALLLMAGTATAQDYGQARTLKIWDNKTAPHGNGIATPEREPEKNRLTDVSE AVLYIFPAAPEKATGQAVVICPGGGYVKLCIDYEGYEMAQWFAEHGITAAVLKYRMPNGH PEVPLEDVEQALRIMMGLEAGATGFTADKVGIVGSSAGGHLAASASTLARTKPAFSILFY 
PVITAEKGKAHQGSFNALLGDKRSAETDTWYSLQNRVSSETPPTLLLLSDDDRVVPPVNS TLYYNALKDNGVKASMHIYPTGGHGWGIRKNFKYREQWQQTVLDWLQGIAK >gi|313158213|gb|AENZ01000048.1| GENE 38 24049 - 25134 1800 361 aa, chain + ## HITS:1 COG:CAC2884 KEGG:ns NR:ns ## COG: CAC2884 COG0216 # Protein_GI_number: 15896138 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor A # Organism: Clostridium acetobutylicum # 6 358 1 353 359 360 52.0 2e-99 MADNTILSRLDGLKLKYEETGQKLTDPEVIADVKQFVQLNKEYKELEPIIETSERYRTAL ANLAEAKDILANDKDEEMREMARGEITELEPAIEQMEEQIKLLLIPKDPQDSKNAIMEIR GGTGGDEAAIFAGDLLRMYTKYIESKGWRYEMTSFSDGAAGGYKEVVLKVTGQNVYGTLK YESGVHRVQRVPQTETQGRVHTSAASVAVLPEAEEFDVEISMNDIRKDIFCASGPGGQSV NTTYSAIRLTHIPTGIVVQCQDQKSQLKNFDKAFEELRTRVFNLEYSKYLDEIASKRKTM VSTGDRSAKIRTYNYPQGRITDHRINYTIYNLSAFMDGDIQDVIDHLIVAENAERLKESE L >gi|313158213|gb|AENZ01000048.1| GENE 39 25154 - 25456 387 100 aa, chain + ## HITS:1 COG:lin0580 KEGG:ns NR:ns ## COG: lin0580 COG3695 # Protein_GI_number: 16799655 # Func_class: L Replication, recombination and repair # Function: Predicted methylated DNA-protein cysteine methyltransferase # Organism: Listeria innocua # 3 96 2 97 98 94 48.0 7e-20 MRMPENFDAEVYAVVAEIPAGRVVSYKQIARLVGMPDHARRVGRALAEAPAGLPCHRVVN SAGRTVPGWTRQRELLEAEGVRFKTNGCADLARCGWMEIR >gi|313158213|gb|AENZ01000048.1| GENE 40 26097 - 26636 610 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158325|gb|EFR57727.1| ## NR: gi|313158325|gb|EFR57727.1| hypothetical protein HMPREF9720_2550 [Alistipes sp. 
HGB5] # 1 179 35 213 213 337 100.0 2e-91 MKRYFLFISALLAATLPVAAQQPLYIVNGVETEQIASIPPDDIENVEMLPADEETIARYG DKAANGVMLVTLRYDQSAVFTADSTFSGYIARQVKWGDDEPAARIILRYTVTPEGDTVVA QELESTDSRLKRRVLKAVAEAPRWTPARKNGTPVESEGILHIQLPEGKLMPRQVELVWR >gi|313158213|gb|AENZ01000048.1| GENE 41 26640 - 27887 1624 415 aa, chain + ## HITS:1 COG:DR0186 KEGG:ns NR:ns ## COG: DR0186 COG1570 # Protein_GI_number: 15805222 # Func_class: L Replication, recombination and repair # Function: Exonuclease VII, large subunit # Organism: Deinococcus radiodurans # 7 405 28 411 416 139 29.0 1e-32 MQTHSHITLSQLQRLVKEALHERFALPVWVSAEISEIKVNYSGHCYLELVEKGGDNGVPL AQSRAVIWRTAYPRIAGYFEAETGQRLAAGIKILAKVAVNYHELYGFSLQITDIDPTYTL GDMERQRQITIAQLQQEGVWDMNREAPMPVVVQRVAVVSSANAAGYQDFRKELAKSPYRF DVTLFDAFMQGAAAEESIVAALCAVAERMDEFDAVVLIRGGGSASDLNCFNAYRLCAHVA QFPLPILTGIGHDKDTSVADMVAHTALKTPTAVAGWLVERMTQAEGWLDYAALQLHDATK AAMHASEVRLERLTGDLRQMSGDLLTRQRLRAEHLSALLPEAVRNFLARQATRLDNAAEL IAGRSPERILRLGFAVVRTGGKAVVSARDVRKGDALEIEVADGTIHGTVQRSEPK >gi|313158213|gb|AENZ01000048.1| GENE 42 27959 - 28150 337 63 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158336|gb|EFR57738.1| ## NR: gi|313158336|gb|EFR57738.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 63 1 63 63 69 100.0 6e-11 MAKKEMTYAEAMAQIEKILARFRNEEMDVDSLAAEVKRATELIAGCKSRLRKAEEEVSKI LES >gi|313158213|gb|AENZ01000048.1| GENE 43 28150 - 28704 692 184 aa, chain + ## HITS:1 COG:yjhQ KEGG:ns NR:ns ## COG: yjhQ COG3153 # Protein_GI_number: 16132128 # Func_class: R General function prediction only # Function: Predicted acetyltransferase # Organism: Escherichia coli K12 # 10 174 11 173 181 159 52.0 4e-39 MSGEIVIRETVPADLAAIVEVHKKGFGYDKEAGLTAELLSDPTAEPTVSLLALRDGEAVG HILFTRCRIVGAEVQRPLFYILAPLAVVPSCQKQGVGGLLIAEGLQLLRERGAKLVFVLG HKEYYPRHGFIPGAEKLGFPAPYPILEKHADYWMAQALTDDGFSGPKGRVVCADALNRPE HWRE >gi|313158213|gb|AENZ01000048.1| GENE 44 28718 - 29491 1065 257 aa, chain + ## HITS:1 COG:no KEGG:Ftrac_3122 NR:ns ## KEGG: Ftrac_3122 # Name: not_defined # Def: methyltransferase type 11 # Organism: M.tractuosa # Pathway: not_defined # 1 256 1 252 255 270 51.0 3e-71 MKKLIKWMLNHVPRPLLQRVAGWAVPAAGLFYKGRGVECPVCGSRYRRFMPYGYVQSRAN ALCPKCLSLERHRLLWLYLTRETDLLTSFPRTLHIAPEVCIMRHLKPHFKGHPGQYVTAD LESPLADLHFDVQQIPLADDSVDVILCNHLLEHVADDRRALHELYRILKPGGWGILLSPV EPDYEQTYEDDSITDPEERTRIFGQYDHRRIYGADYTDRLREAGFEAADIDYAATFSEAE RRLYALPKDHIYVVYKH >gi|313158213|gb|AENZ01000048.1| GENE 45 29582 - 30238 962 218 aa, chain - ## HITS:1 COG:FN1885 KEGG:ns NR:ns ## COG: FN1885 COG1272 # Protein_GI_number: 19705190 # Func_class: R General function prediction only # Function: Predicted membrane protein, hemolysin III homolog # Organism: Fusobacterium nucleatum # 11 218 8 213 215 210 55.0 1e-54 MTKKKDAYVPTVGEEVANTVTHGVMSLLALVALPFAAVWAYAHDPDRILASVSVSIFVIS IFLMFLASTLYHSMNPASKHKAVFHILDHIFIYVAIAGSYTPIALSVIGGWQGVFITILQ WAMVLFGIFYKSLSRKTIPALSLTIYLVMGWTIVFFMPLFVRQASPPLLALIAAGGVLYT LGAWFYARQGFRYHHMVWHLLINLAVAAHFTGIVFFLY >gi|313158213|gb|AENZ01000048.1| GENE 46 30320 - 32518 3715 732 aa, chain + ## HITS:1 COG:FN1482 KEGG:ns NR:ns ## COG: FN1482 COG0317 # Protein_GI_number: 19704814 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases 
# Organism: Fusobacterium nucleatum # 6 724 5 720 725 397 33.0 1e-110 MDFSRYEKLRTLVRANFSEQTQALVDEALVYADGKLASLTRYDGSPLLTHAAAVAGIVIS EVGLGRNSTISAILHDVVRLAHKQLPAEEFLALTADIRSRFGEQVVGITIGMCNISELKL KVAKEQAENFRDLLVSYSEDPRVILIKLADRLEVMRSLEIFPHEKWRKKSWESMNLYAQI AHKLGLYSLKSELEDIALKYLEPKDYEHIVTKLEESADERKAFIARFLVPIEERLHKLGI KYHIKSRTKSIFSIWTKMHKQHVPFEGVYDIFAIRIIIDCPKEQEKPQCWTAYSVVTDFY TPNPNRMRDWISIPKSNGYESLHTTVSAEGRWVEVQIRTERMDAVAERGIAAHWRYKGVN QGAQTSEVWLGRLRELLEDTTHSLAQRFDAKPASGEIFVFTPNGDLRKLPEGASLLDFAF DIHTSLGSTCVGGKVNNRAVSIREQLRNGDIVEILTQKNQTPKSDWLNFVVTSKARNKIK SFLREEQAKHTRMGREELERKLKNWKFPLTIDEAVAYLVKYFKVRTGTEIYALIATQKVD LGTIKEILTRHISGEADEQRRAAAAEAERNKVHNAKESPAAQDALVIDDDISKIQYKLAK CCNPIKGDEVFGFVTINSGITIHRCDCPNAKRMRENYPYRVIDARWRSTAEGAFRVSLRI VAADTTGMANHITEVISRDLKLNIRSINFASLANGCIAGTVSVEVPGAGVVDTLIHSIMR IKGVQRAFRINN >gi|313158213|gb|AENZ01000048.1| GENE 47 32552 - 33172 723 206 aa, chain + ## HITS:1 COG:no KEGG:TEQUI_0601 NR:ns ## KEGG: TEQUI_0601 # Name: not_defined # Def: hypothetical protein # Organism: T.equigenitalis # Pathway: not_defined # 5 181 2 178 205 248 62.0 1e-64 MDNNSISVLFADSSHAHYAQRICDLIYESALQRGTGIAKRSPEYIAAKMTGGKAVVALDG EKLIGFSYIECWGHGDFVATSGLIVDPEYRHMGLAERIKRRTFELARRRFPYAKLFSITT SLPVMKLNSRMGYVPVTFSELTEDEEFWRGCQGCCNYDILQRNNRRMCLCTGMLYDPAKE PPQSQSRWAPYVMFFKNKFKKLFHLK >gi|313158213|gb|AENZ01000048.1| GENE 48 33189 - 34391 2153 400 aa, chain + ## HITS:1 COG:XF0999 KEGG:ns NR:ns ## COG: XF0999 COG0137 # Protein_GI_number: 15837601 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate synthase # Organism: Xylella fastidiosa 9a5c # 1 396 1 397 401 225 36.0 1e-58 MEKKKVVLAFSGGLDTSFCVKYLSEEKGYDVYTAIANTGGFSKEELANIEKRAVALGAVR HATLDIEQEYYEKSIKYMVFGDILRNGTYPISVSSERIFQAIAIIEYAKSIGADCVAHGS TGAGNDQVRFDLTFQVLAPEIEIITPTRDMTLTREYEIDYLKKHGFEADFKKMEYSINKG LWGTSIGGKETLHSEQTLPEEAYPSQITASGEERLKITFEQGEIAAVNGRPYADKVEAIR AVEAVGSKFGIGRDMHIGDTIIGIKGRVGFEAAAPILIIAAHRMLEKHTQTKWQIYWKEQ VANWYGMFLHEAQYLEPVMRDIEAMLVSSQRNVTGTVELILRPYSYTLVGVDSTYDLMKT 
DFGEYGEVNKAWTADDVKGFTKILGNQIKIYHNVQKRNEK >gi|313158213|gb|AENZ01000048.1| GENE 49 34388 - 35350 1203 320 aa, chain + ## HITS:1 COG:MJ1096 KEGG:ns NR:ns ## COG: MJ1096 COG0002 # Protein_GI_number: 15669284 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate semialdehyde dehydrogenase # Organism: Methanococcus jannaschii # 1 313 35 368 375 207 37.0 2e-53 MIRAGIIGGAGYTAGELIRLLVNHPQVRLAFVHSTSNAGNRIADVHGGLEGDTELRFADT CDLASIDVLFLCSAHGESRKWLEANDVPAGVKIIDLAQDFRDESDGFVYGLPELNRDRIR TAERVANPGCFATAIQLALLPLAAAGLLTGEVHVTAVTGSTGAGVKPSATTHFSWRTDNI SVYKAFTHQHLIEIGRNLRRLQPGFGKAVNFVPMRGDFARGILASVYTDCPLDETAARRL YEDFYAPALFTHVAAGSVDLKQVVNTDKALLSVAKYDGKLHVVSVIDNLLKGASGQAVQN MNLMFGLDEGEGLRLKASAF >gi|313158213|gb|AENZ01000048.1| GENE 50 35510 - 36640 1603 376 aa, chain + ## HITS:1 COG:AF0080 KEGG:ns NR:ns ## COG: AF0080 COG4992 # Protein_GI_number: 11497700 # Func_class: E Amino acid transport and metabolism # Function: Ornithine/acetylornithine aminotransferase # Organism: Archaeoglobus fulgidus # 3 367 11 371 375 252 40.0 7e-67 MTLFNVYSLYPVEPVRGKGCFVYNAAGTEYLDLYGGHAVISIGHAQPDYVKAISEQAARL GFYSNSVENSLQTALAEKLGRASGYGDYSLFLCNSGAEANENALKLASFQTGRAKVLAFS KAFHGRTSGAVAATDNPSIRSPFNGTPNVEFTPLNDLEAARAKLATREFAAVIIEGIQGV SGIHCPTDEFLRGLRAAATETGTQLVLDEIQSGYGRTGRFFAHQWAGIRPDLITMAKGMG NGFPIGGVLIAPHFEARPGLLGTTFGGSHLACAAAIAVLDVMEREKLVENAADTGEYLLG ELRKADGPKEIRGRGLMIGIEIDGSGAEFRKKLLFGKHVFTGGAGASTVRLLPALCLTRE LADRFLDAFDATLRGK >gi|313158213|gb|AENZ01000048.1| GENE 51 36776 - 37732 1685 318 aa, chain + ## HITS:1 COG:XF0998 KEGG:ns NR:ns ## COG: XF0998 COG0078 # Protein_GI_number: 15837600 # Func_class: E Amino acid transport and metabolism # Function: Ornithine carbamoyltransferase # Organism: Xylella fastidiosa 9a5c # 35 302 37 322 336 175 37.0 1e-43 MHKFTCAADIGDLHIAVAEALEIKRDRYQFTGLGRNKTLLMLFFNSSLRTRLSTQKAAMN LGMNVMVLDVNQGAWKLETERGVVMDGDKAEHLLEAIPVMGSYCDVIGVRSFAKFNNKAE DYEERVLEQFIRHSGRPVFSMEAATRHPLQSFADLITIEEYKTKERPKVVMTWAPHPNAL 
PQAVPNSFAEWMNAADYDFVITHPEGYELAPQFVGRARVEYDQRKALEGADFVYAKNWAA YADPNYGKVLCRDRAWTVDAEKMALTDNAFFMHCLPVRRNMIVTDEVIESPRSLVIPEAA NREISAQVVLKRLLEGLG >gi|313158213|gb|AENZ01000048.1| GENE 52 37732 - 38499 1007 255 aa, chain + ## HITS:1 COG:MK1631 KEGG:ns NR:ns ## COG: MK1631 COG0548 # Protein_GI_number: 20095067 # Func_class: E Amino acid transport and metabolism # Function: Acetylglutamate kinase # Organism: Methanopyrus kandleri AV19 # 4 255 1 246 246 162 41.0 4e-40 MEKITVVKIGGNVIDDPAVLKRFIADFAAMPGPKILVHGGGKVATRLAERLELKVQMVDG RRITDKGTLDVVTMVYAGLVNKQLVASLQAEGCNALGMSGADGNAVTARRRAPQPIDYGF VGDIEKVDSRLLGRLLEAGVTPVFCAIMHDGQGTLLNCNADSVASAIALGAAQIAPAELV FCFEKSGVLRDPDDETTLIREITAGSYAALKAEGVVSKGMIPKIENALKAVANGVRSVTI KHSDNLSNDTGTVIR >gi|313158213|gb|AENZ01000048.1| GENE 53 38553 - 39614 1206 353 aa, chain + ## HITS:1 COG:MK1581 KEGG:ns NR:ns ## COG: MK1581 COG0624 # Protein_GI_number: 20095017 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Methanopyrus kandleri AV19 # 7 295 18 314 381 90 28.0 5e-18 MDAKTTEAVELLRRLIATPSTSRDESRTADLLFAFLEERGAAPERLHNNVFARSADFDPA RPTLLLNSHHDTVRPAASYTRDPFTPTAEGDRLYGLGSNDAGASVVSLARTFLTFREQSL PFNLLLALSAEEECMGEHGMRALLPQLGTIDMALVGEPTGMQAAVGERGLVVLDCEARGK SGHAARNEGINALYIALDDIARLRSFRFDRVSELLGPIGIAVTQIAAGTQHNVVPDSCRF VVDLRTTDAYTNEETVEILRGALRSQAAPRSTRIRASALDAAHPLARAAQAAGRRSYVSP TTSDMALMPFPSLKMGPGESSRSHTADEYVLLSEIGEGIGIYEEFIRQLAGIL >gi|313158213|gb|AENZ01000048.1| GENE 54 39767 - 41065 1993 432 aa, chain + ## HITS:1 COG:XF1003 KEGG:ns NR:ns ## COG: XF1003 COG0165 # Protein_GI_number: 15837605 # Func_class: E Amino acid transport and metabolism # Function: Argininosuccinate lyase # Organism: Xylella fastidiosa 9a5c # 1 324 6 329 445 253 42.0 4e-67 MATKLWDKGFEPDKMIEEYTVGDDRELDMRLAKYDVEGSLAHIAMLEKIGLLTSAELEEL TAGLKEIAAEIEAGRFAIEPDTEDVHSQVELMLTRRLGDAGKKIHSGRSRNDQVLVDLKL FLRDELRQTADAVKTLFDRLQGLSEQYKEVLMPGYTHLQIAMPSSFGLWFGAYAETLVDD 
MRLVAAAWHIANQNPLGSAAGYGSSFPLDRTMTTRLMGFETLHYNVVAAQMSRGKSERAA ASAIAAIAATVGRLAMDVCLFMSQNFGFVSLPDNLTTGSSIMPHKKNPDVFEIMRGRCNR LQSVPNEIALLTANLPIGYHRDMQLLKDILFPATTEIKRTLSMCDFMLAHLKVNAGILDD PKYDYLFTVEDVNRLVLEGVPFREAYKQVGMAVQRGQYRPTREVRHTHEGSIGNLCTAEI RRKMERVMAEFE >gi|313158213|gb|AENZ01000048.1| GENE 55 41131 - 42336 1682 401 aa, chain - ## HITS:1 COG:BS_ydhT KEGG:ns NR:ns ## COG: BS_ydhT COG4124 # Protein_GI_number: 16077655 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-mannanase # Organism: Bacillus subtilis # 99 322 83 299 362 100 32.0 7e-21 MKKKLLFSACACMAVVLGLGCSEVNGVPSRAPSGYPEGPKTMVDAGATAITRTLYENLFV LAERGTMFGQQIPTLYGLDNNRKWWAGDDTDLSDTKYLTGSHPAVCGWELSNIELDMETN IDGELFTEVRKHIIAAFGRGAVNTISWHCANPVSGGNSWDGTRAVYSIIPGGENHEKFKG WLDKVADFMLSLKSPEGEPIPLIFRPWHEHTGSGFWWGKGNATAQEFVALWQFTISYLRD VKGLHNLIWAYSPDMIHLNSRDAYLEYWPGDEWVDILGLDAYDRNGADYDHKGLQMIRLM KNIAYTKNKPLALTETGLENNNPEESNYCNKKWWTSMLYKIIENEPVSFVLLWRNGDFPA NGGHYFSAFRGCYSEKDFMDFSRKEQILFEDDLPDMYKPLK >gi|313158213|gb|AENZ01000048.1| GENE 56 42361 - 43254 880 297 aa, chain - ## HITS:1 COG:AGl1086 KEGG:ns NR:ns ## COG: AGl1086 COG2207 # Protein_GI_number: 15890663 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 34 297 42 302 305 122 30.0 7e-28 MPNPDILTEIPPISGQDCFVVFDRYKREFCFPVHIHKEYELNLVCGARGATRIVGDSVRE ISDRDLVLITGSYLEHAWLNGNFSKNADVHEVCVQFSPDLFESGFIDKKQFYPIKKMFEY ARRGIAFSEETVSRVEPIVENLIRSEDRFDSVLNFLMLLNCLARESDYHILAQRSVSIEN NVNYDMPRIQKVMQFLNDNFHRDISIAEVKALVNMSEPTFRRFIKRHSGKSFVNLLNDIR LGAASRMLIENPTKNVSEIAYKCGFNNLSNFNRIFKKGKGLTPSPFKTFYTQKKISV >gi|313158213|gb|AENZ01000048.1| GENE 57 43264 - 44601 1492 445 aa, chain - ## HITS:1 COG:yagG KEGG:ns NR:ns ## COG: yagG COG2211 # Protein_GI_number: 16128255 # Func_class: G Carbohydrate transport and metabolism # Function: Na+/melibiose symporter and related transporters # Organism: Escherichia coli K12 # 3 443 4 432 460 250 37.0 3e-66 
MGIRLREKIGYGLGDTASSMLWKLFSVYLMFFYTDICGIDAWVVGVLFLVTRIWDSLLDP VVGLLCDRTSSRWGTFRPWLLWGALPFGALGVLTFYMPEWGTGGKVVYATVTYSLMMIVY SSVNVPYAALLGVMSADPHERTVLAAFRMTFAAAGSMAVVLAVEALVKAFRPWAGPSASW TAAVAVVAALAVVLFFVTFSTTRERVRPVRAERHPVLLSLRDLLHNRPWLILAGAAVCLQ VFNAFRESGTIYFFKYCVAGETVGTVSFAGIALTGSALFLAVGPFFNIAGIVLIPFAADR FGRRRTLVGTLLLTAVFSFAFYFVRQGGYSVLLLAQALISLSVGGVLPLLWAMSADTADY AERRSGRRDTGLIFSSYSMAQKMGWAVGSAATAWILSLAGFEANAVQSPAALTVIGCLQS VFPALAALGTCAFILFYPLADSRPK >gi|313158213|gb|AENZ01000048.1| GENE 58 44605 - 46611 2123 668 aa, chain - ## HITS:1 COG:no KEGG:BT_1871 NR:ns ## KEGG: BT_1871 # Name: not_defined # Def: putative alpha-glucosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 665 1 661 662 709 50.0 0 MKKLFLFLCLLFLLPTAALCRDKTYTIQSPDCSLTVTIGCGDRIVYSVRSDTNQILKPSP LALRLVSGECWGRRSRVAGVRRYRIDEVYDAPVYKKRKVEDRCNGIVIDFREGFSLEFRA YDDAVAYRFVARQPGEVLVAGEEARFRFAPGAVAFVPYVRDTPCRHDGLAFGEQFMQSFE NTYTRTPLCEMDSARLAFLPLLVQTAGGVNVCITDADLESYPGMFLHRSGADELEGVFAP SPRKVVPGGYDRMQGVVEEYEPFIASLAPGDRLPWRALSIVREDARLADNDLVWKLASPC RLDDISWIEPGKAAWEWWNDWGLSGVDFTAGINQPTYEYYIDFASRNGLRYLVLDDGWSR DHHSPLETAPGLDLPALVKYGEERGVGLILWLGYLPFAGQMEELCKLYSEMGIKGFKVDF MDRDDQQMVRFVYDAARTAARYRLLLDLHGIFKPAGIQRTYPNVINFEGVHGLEQVKWSP VEVDQVTYDVTVPFIRMMAGPMDYTQGAMRNASKANFRPVNSEPMSQGTRCRQLAEYVVF DAPLAMLCDSPSAYGREPECLRFISGVPTVWDETEVLAGEVGDHIVTARRHGDVWYVGGL AGWDGRTVRVDLSLLPEGTYAVELFRDGENAARIAQDYVREEFLLAPGKSFDVKMFPGGG FAARITKR >gi|313158213|gb|AENZ01000048.1| GENE 59 46608 - 46793 204 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158280|gb|EFR57682.1| ## NR: gi|313158280|gb|EFR57682.1| hypothetical protein HMPREF9720_2570 [Alistipes sp. HGB5] # 1 61 1 61 61 105 100.0 2e-21 MKRNMYGLIRWYYVIFLLYYSIEVCHGSWKILFDDVILPKTTRINMQCIVNHVYNLIITG L >gi|313158213|gb|AENZ01000048.1| GENE 60 46984 - 48549 2254 521 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158296|gb|EFR57698.1| ## NR: gi|313158296|gb|EFR57698.1| hypothetical protein HMPREF9720_2571 [Alistipes sp. 
HGB5] # 1 521 17 537 537 994 100.0 0 MRRKLYYLTAGLAATLLLAAACSSDDTAEKPVPEPATLTAPTLTLSADAVLVDPEHSDQP ALHAEWTAATDDEQTEVAYTLYVNLTGRDMFSGPVVIDKGAALTHDFTHGELNQLLLNTF DLEAGTEAAVRFAVYAKNADEEFDAQLSEIKNCAITPKTTYPNFPPSLIMVGAATPWQWD LNAGLELPESREGSHLYQASGVELRVQPMSLNNGFKFYFSRNVNDTDDPRFAAQDLSAET FGKIAVYKSGEAQFQPGSFGYENGVYDIEVNLDTKLLTLTRTGDLPEAPLPEQLYLLGGC FTWGWTWDGTQLAKAEEGIYRARNIDMTFGDNGDVGFKMFTERDNWGVYYAMTDDATADN ISLQLVTDTDAPQVYPGKLGYGKGVYDIEVNLNTMKMTLTAKSIDYSTAYSMTGEATPGG WESRTYLPKKGDNEWEATGVAMNFDGDYKGFKIFASSDGWWPWYGQTPDAPFGTVIRIDD QATSDAKGDPQFYPSRFGYASGTYTINLNLNTMTLTLTKEN >gi|313158213|gb|AENZ01000048.1| GENE 61 48552 - 51122 3265 856 aa, chain + ## HITS:1 COG:XF0846 KEGG:ns NR:ns ## COG: XF0846 COG3250 # Protein_GI_number: 15837448 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Xylella fastidiosa 9a5c # 8 855 12 879 891 544 36.0 1e-154 MYKQTLKAIMTLFLAGSATIGSAANDTMPLHSGWKFRQANRNEWHPATVPGVVHTDLLDN GLIEDPYYRLNERALQWIDKEDWIYEVSFDAGALTRGYEHVRLEFLGLDTYADVFLNETR ILTADNMFRRWAAEVKPLLKERGNVLRVYFHSPINKALPQYESLPYRYEAWNDQADNGGL DGKKLSPFTRKAGYHYGWDWGPRLVTSGIWRPVKLQGWNELRLEDVFHRQTEVSSEAARV ETQVEIEAAAPVENAVVTVSDGKRVLGSRSVQLHAGMNRVSVPFTIDHPELWWCRGMGEP HLYTFRTSVELGGRVLAGHSAQVGLRSVTVEKKPDAYGRSLRFLLNGEPVFCKGANYIPC DCFLPRVTRETYERTIRDAVDVNMNMLRVWGGGIYEDDFFYELCDREGILIWQDFMYACA VYPAEGALLENMRLEAVDNVKRLRNHACVVYWCGNNENQDSWLSGWKYDVDKVDPKYSDI IWKQYEEQYYRMLAQVVAEYAPDMGYQPTSPFSDYGAMSNDHEGDRHYWEVWHAKKPITE YNRQRSRFFSEYGFQSFPCFETVKRYAPLPGDQDITSEVMMSHQRGGEHANNLIKSYLLN EYHEPRDFESFLYASQILQGDAIKTAIEAHRRDKGYCWGSLYWQHNDCWPVASWSSRDWY GVWKAQHYFARYAFADILISPILDGGRLDIYAVSDLLTPEKGTLCVRAVRLTGGRTGEFE QQIDVPANASTKVAAIDTRTLLNGAAPEEVVIQATFTAAGKTYRNNYFLVQQKTMNYPKA TLHCKVDRAGGEYDVTVRSDNFARGVYLSVEGETAHFSDNFFDLMPGETRTVRVASELSA KQLARRLKSMSLADTY >gi|313158213|gb|AENZ01000048.1| GENE 62 51142 - 52503 737 453 aa, chain + ## HITS:1 COG:all0166 KEGG:ns NR:ns ## COG: all0166 COG1626 # Protein_GI_number: 17227662 # Func_class: G Carbohydrate transport and metabolism # 
Function: Neutral trehalase # Organism: Nostoc sp. PCC 7120 # 51 325 87 383 495 137 33.0 4e-32 MKKILIVMSAAAGLAFVGCRPQNPDAPAVREFIRDNWHTTVQHCTADTATLIGLPYPYTV PTAGAMFREMYYWDTFFTNEGLVRDGHPELAKGNTDNLLYMVRRFGKVYNGSRTYYEARS QTPYLSMMVDRIYRLTGDKQWLADAYQTLKEEYGFWMRERLTPTGLNRYGSSASDALVDE FLVTGGKRLGTDLFDKGYTPERLHKLGLDFVAEAESGWDFNPRYDRRCTDFCPLDLNANL FMYEVNFARFAQELGRTDEIREWIAKAELRRARILQYCYDPEAKQFYDYDYVNGRRSDVL SGAVFALLYSGAVPREYAGDIVQALIPLRHRRLRGQALRLPLPVVLSQHVAPGLLLCGDG TGPLRIRQRGRTHCRQMDESGHRIVQDDGQPLGKVQRGNGNDRCQQRIRDARDAGLDRRH LHLSGRLSGWGDAARPAALGIHELLTTNKITTL >gi|313158213|gb|AENZ01000048.1| GENE 63 52500 - 55577 4724 1025 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0351 NR:ns ## KEGG: ZPR_0351 # Name: not_defined # Def: TonB-dependent receptor Plug domain protein # Organism: Z.profunda # Pathway: not_defined # 31 1025 57 1052 1052 935 48.0 0 MKPNLRTNLGRFLLGAVLLLAAGTVWAQPQNIRGKITDPAGQPVAGASVIVKGTTNGTSA KADGTFELGAAKGATLQITMLGYKTAEKEVTTATFYPITLEDDSKLVNEVVVVGYGAVKK SDLTGSVASVKMGALDNLSSTSVDGLLQGRAAGLQVLNTSQDPGAGAIIRIRGNSSLEGS NTPLVVVNGFPMGDAGNLSQINVSDIESIEILKDASASAIYGSRGANGVILVTTKSAAKG QTNISVKHQTVISQLSDPLDIWYDPMLMAQVTNEEQVNAGLNPIYIGQTIGGIYYPSLLE IQNGEWANTDWRDLCLRTPVLNSTTAAISSANDRASINLNVNYHNNQGLYKKDSYERINA NLGVTYNLYKNLKLTSYNVFSITNRNISNGLEYGRNPLWPVYDKEGNYYLAGAQDFGHPL VILDNVLNKTTGRDFVTSLAVDWELVKGLTIHSQLDYRYQTSVGDVYRASNTSQDANDNG GIAQINNSFNQNLMSETYLTFSRTFNENHALTVMAGHSYNWDTGRGLYTTARGFVNDVLQ NENMNAGNPNLRQIYNDGYYLSKLLSFYGRINYSLKDKYLLTATMRADGSTKFGKNNKWG YFPSGAVSWKMHNEPWLKQSRWLSELKLRLSYGASGNQGISSYQTLDRYGMEKFWHNGEW ATVIGPGYEVGRTGANNRYMVWGGIANPDLKWESTSQLDFGVDFAAFDRRLRLTADVYYK KTTDLLREKYLPLPSSYDKIWVNDGEVTNKGFEVSLEGDIVHTRDWSFSATFIYSMNRNK VVSLGDAISSGLSQDYLTGLYYEVTGEPISMFNQNASIYAVGHPMNVFYGYRVDGIIQEG QDPGFIDPDGLKDRPGELKYVDLTGDYAITPEDRCIIGDPNPDFTASLNLSLRWKNLDLS VFLYGVYGNDVLYNNYTFSPRVKAKRWTPDNPTNDFPRLNNARQYWLSDYFLQDGSFLRI QNITLGYNLHFNRRFLKGMRIYANIDNVHTFTKFDGYDPEVGLDGIYWGGYPKLRKYTVG IDLNF >gi|313158213|gb|AENZ01000048.1| GENE 64 55590 - 57074 2486 494 aa, chain + ## HITS:1 COG:no KEGG:ZPR_0352 NR:ns ## 
KEGG: ZPR_0352 # Name: not_defined # Def: hypothetical protein # Organism: Z.profunda # Pathway: not_defined # 2 490 4 494 497 421 47.0 1e-116 MKKIIFTLAAVGMLSACSLDETPYGFYSSDNFYKTEEDAESGLMYAYNALNYLEYLRGIW YLGDIPTETMYPKSDEPGDIHMLQQWTVNSETELTMYYFKYCYIGINRANTVIERVSGGN LNETVKNRVVGEALFLRAWNYFNLVRAYGRVPLYTEPLSSLEETTPQMAESIDKIYDRIL EDLLAAEKMLTVNKVFGRIDKVGAQALLSKVYLTLASSIACKAPGYASAPRNADEMYTQA AYWSRKVVFDYKETYNFDSDLRSIYDVTKPDGPEHIFIMGIDRSGTHEGMYSKIPLQFLP NNGSAPIYIRYSDGSLQKGNGNGWGVFLIEDNFVEKTFLATDKRRTELIHKDIYDQTGKL VTPSPVRGYFSAKYVDPDFIGERTSARPFLIRYSDIALVFAEAAGPDEGLALVNEIRRRA GIPELPEGMSLAEFRKAVIQERALELAFEGNRLYDLRRTASVTATVPEATKLSEEQAAFY PIPQRELDLNPNVK >gi|313158213|gb|AENZ01000048.1| GENE 65 57086 - 57985 1020 299 aa, chain + ## HITS:1 COG:no KEGG:Celly_1222 NR:ns ## KEGG: Celly_1222 # Name: not_defined # Def: hypothetical protein # Organism: C.lytica # Pathway: not_defined # 4 247 8 241 565 87 29.0 5e-16 MKKYLIPTALLLVTALSSCTKTIDEDLSGNERSLLELRVEGQMGTAIIDRDGSRATATVY ILETPGFAYDRVPVKGIVVSEGASASVGNGDVLNFSNPERRARITVTSRSGHRLDWWIYL ETYDPFFLGTWRIVDVKLACNQRVSGVGDGAWTTQLSSGEFGTYGLPEYDDRVTITLGEI SDNELTGTLVHSAGDDGAYGNFWGVYAPYSVEAPLDMNPRLRHLLPPGESQWTLDLTTNQ MKITQNNVSSTMIFGTEGNNRLFRFLLPSAAGEPGRDSFYDNMWRSSTELFYVMYKVGD >gi|313158213|gb|AENZ01000048.1| GENE 66 58084 - 58650 929 188 aa, chain - ## HITS:1 COG:ML0522 KEGG:ns NR:ns ## COG: ML0522 COG0231 # Protein_GI_number: 15827184 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation elongation factor P (EF-P)/translation initiation factor 5A (eIF-5A) # Organism: Mycobacterium leprae # 1 187 1 185 187 168 47.0 5e-42 MATTADIKIGMCIELDGKTFQIVDFQHVKPGKGPAFVRTKLKNLESGRVLDNTFSAGSKI EPVRVERRPYQFTYEDDLGAHFMHTETFEEINIDKDLITNYDLMADGQIVEVMFHTEKET VLSAELPPIVDMEVTYTEPGIKGDTASTNSLKPATVNTGATVRVPLFINTGDKIRVDTRT REYYERIK >gi|313158213|gb|AENZ01000048.1| GENE 67 58711 - 59259 746 182 aa, chain - ## HITS:1 COG:PH1014 KEGG:ns NR:ns ## COG: PH1014 COG0163 # Protein_GI_number: 14590854 # Func_class: H Coenzyme transport and metabolism 
# Function: 3-polyprenyl-4-hydroxybenzoate decarboxylase # Organism: Pyrococcus horikoshii # 1 181 1 176 181 138 41.0 5e-33 MKIVVAITAASGGIYARLTLERLLRLSEVSQIALVCSARAREVLAHEDVVLPEDGRIRLF GNDDLFAPPASGSARYDAMVVVPSTVGTVGRVAAGAAQSLIERAADVMLKERRRLVFVVR ETPLSLIHLRNMTALTEAGAVILPACPSFYAGAQDVEALCGTVADRAVALLGVDAEHYEW GE >gi|313158213|gb|AENZ01000048.1| GENE 68 59261 - 59833 995 190 aa, chain - ## HITS:1 COG:SP1479 KEGG:ns NR:ns ## COG: SP1479 COG0726 # Protein_GI_number: 15901329 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted xylanase/chitin deacetylase # Organism: Streptococcus pneumoniae TIGR4 # 14 182 270 438 463 116 40.0 3e-26 MPDLIWEIDDPEGVFLTFDDGPTPGITEWILATLDKYDAKATFFVLGKNVEMYPDLFRRI VDAGHKVGNHTYSHQKGWGMSLERYTEDVDFANDLIHSELFRPPYARITPAQARFLGQRY KLVMWDIISRDYNRNLSPRTCLRNVTKYLAPGAIVVFHDSEKAFRNMRYALPRTLEKIRQ MGLKCKAIEF >gi|313158213|gb|AENZ01000048.1| GENE 69 60033 - 61040 1603 335 aa, chain + ## HITS:1 COG:no KEGG:BDI_0186 NR:ns ## KEGG: BDI_0186 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 330 1 327 345 187 32.0 7e-46 MKTFFRIAAAAALFAAMTACDAFHSLTKSRLKTAQGSPYELIVVCPQQEWTGELGDTLRS ILTAPVPYLNQTEPLFDVLRVTERGFTGMVADHRNILKIVEDPSLAATNIAVQYDVTAEP QIVLTLQGPDDKALTDYLSEHRESLVQVLEKAERDRAVKFAEAFSEQRVAKAIKSTFGVD MTVPKGYVLAADEKDFLWARYEYPTASQGFFIYSYPYRGKESLSPGALLAARNEFAARIP GPSDGSYMTTSEAFEPDYRMFRMEGRLWCELRGFWDVHGDFMGGPFVSYTTVDTATGRVL TIDCYVYSPKNHKRNYVRGVEHLLYLLKFPEQPRQ >gi|313158213|gb|AENZ01000048.1| GENE 70 61080 - 62495 1916 471 aa, chain + ## HITS:1 COG:BS_ywnE KEGG:ns NR:ns ## COG: BS_ywnE COG1502 # Protein_GI_number: 16080712 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardioli pin synthases and related enzymes # Organism: Bacillus subtilis # 1 471 3 482 482 353 38.0 5e-97 MQQILTYIFLIIYSVTILGIILVVITDNRNPLKTLPWIIVLVLAPVLGLIFYFFFGQNLS KQRIISRRTRKRITMQLEEAEHTDGPGLPEEYRPLAALLTNTSHSIPLYGSRITTYTDGR 
SKMEALLTEIACAKHHIHIQYYILNDDETGRRLRDALIAKAREGVEVRILYDDVGSSGAK KSFFRSMRSEGIEVYAFLHVKFPLFTSKVNYRNHRKIAVIDGCVGFLGGMNIADRYVRGT RWGTWRDTHFRIEGSGAAGLQASFLSDWSATTKQQIAAAEYYPPAARFTDNIMQIVSSGP FGKWRTLLQADSYAIARARRRVWIQTPYYLPSDVLNSALQEAALAGIDVRLMLPARSDSK VVDLATHSYLDDMMKAGVKILFYMPGFLHSKLLIIDDMLSVIGSANMDFRSFEHNFEVNA FVYDGEFTARMASVFEEDQSRSHPLTPAEWFRRPRSRRVAESLMRVFSPLL >gi|313158213|gb|AENZ01000048.1| GENE 71 62590 - 63225 979 211 aa, chain - ## HITS:1 COG:RSc0292 KEGG:ns NR:ns ## COG: RSc0292 COG2197 # Protein_GI_number: 17545011 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulator containing a CheY-like receiver domain and an HTH DNA-binding domain # Organism: Ralstonia solanacearum # 4 210 3 210 210 145 41.0 4e-35 MERKIVLVDDHSLFRNGLRGLLERCAGCRVVGEAASGEEFLAMLPGLETDVVFMDFAMPG LDGAQTTERALARRPDLRIITLSMFGEESYYSRMVQAGARGFLLKDSDIGDVIEAIDAVM SGGSYFSPQLLSSLTGRMRTRDDVPDEQLSVREREILVAVCRGLSNQEIADELFISKRTV DKHRANILEKTGCKNTASLVVYAIRNGIVEI >gi|313158213|gb|AENZ01000048.1| GENE 72 63230 - 64201 1504 323 aa, chain - ## HITS:1 COG:SA2180 KEGG:ns NR:ns ## COG: SA2180 COG4585 # Protein_GI_number: 15927970 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Staphylococcus aureus N315 # 81 307 115 342 344 121 28.0 2e-27 MLIKILLVISIIVQTLATAYALRLVRATKYNSVWILFIVGFSLLSVERFVQLLMAGGHYV PRWWFGYLGIVVSVCLSIGVMYAHKLFKYIDRLNRQRQLLNKRILTAVLRTEEKARSRFS KELHDGLGPLLSSARMSLSVLSREERSADQREIIDNTTYVIDEAIRSLREISNNLSPHVL NDFGLARGVQNFIDKSAAMHDAKIRFTTNLRTERYDTDIEVILYRVICELINNSLKHAAC TSINLSLSQNGSELALDYTDDGRGFNPQAMMDCGMGLSNISSRINSLGGTFDISSAKGKG MRAAIRVNTQQEPALPKRKRRNR >gi|313158213|gb|AENZ01000048.1| GENE 73 64236 - 65009 1056 257 aa, chain - ## HITS:1 COG:aq_832 KEGG:ns NR:ns ## COG: aq_832 COG0496 # Protein_GI_number: 15606188 # Func_class: R General function prediction only # Function: Predicted acid phosphatase # Organism: Aquifex aeolicus # 8 252 5 244 251 166 36.0 6e-41 
MKEERLILVTNDDGYDSKGLAAAVEVARGFGRVVVVAPETTQSGMSQAITMYNPLYLRCV RKEEGLEVYAFSGTPVDCVKMAFDYLLREERVDLVISGINHGSNSAVNVLYSGTMGAAIE GSFYGCPAVGLSLDDHGEDADFEAAVAYGRRIVGSVLENRIELPLCLNVNVPVGRPDELR GIRLCRQNRGFWREEFYRHEDPRGREYFWLTGAFVNEEPEAQDTDEWALSHGYVSVVPVQ VDLTDYRQLGALAEVLK >gi|313158213|gb|AENZ01000048.1| GENE 74 65663 - 66043 335 126 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|170017595|ref|YP_001728514.1| ribosomal protein L19 [Leuconostoc citreum KM20] # 1 119 1 116 119 133 57 4e-30 MNKDAIIKLVNEQMWAKTDADVPSFKAGDTITVSYKIVEGSKERVQSFRGVVIQIKGSGK TKMFTIRKISGGVGVERIFPLYSPHIDKIEVNKVGVVRRARIYYLRDLTGKKARIKEKRM TSADKK >gi|313158213|gb|AENZ01000048.1| GENE 75 66695 - 67993 1728 432 aa, chain - ## HITS:1 COG:BMEII1053 KEGG:ns NR:ns ## COG: BMEII1053 COG0738 # Protein_GI_number: 17989398 # Func_class: G Carbohydrate transport and metabolism # Function: Fucose permease # Organism: Brucella melitensis # 3 424 16 412 412 126 27.0 7e-29 METKKQNYTLPIIVMIFLFGMISFVTNLASPMGDILKYQFNVPNWMGTLGVFANFIAYAI MGYPAGNMLQKYGYKKTALVAIAVGFTGVGIQTLSGSVESFGVYLLGAFIAGFSMCLLNT VVNPMLNKLGGGGNKGNQLIQAGGSFNSLCGTAVIILTGLLIPEGIKNAQISNVFPLMYG ALAIFAFAFIVIALTKIPETQKTEGKKVAETSKYSPLSFRHFILGSVAIFVYVGVEVGVP NVLQKWLQNPELNVLGSGVNVEAIAGSVAATYWLLMLVGRLLGAMIGSKVSAKSMLMVVS SAGLLLTLGAMFAPNVPVNLPVFNGAEGFGLVNVPVSAALLVLVGLCTSIMWGGIFNLAV EGLGKYTEKASGIFMALVCGGGILPLLQNGIVDLTGGVGYLTSYWVIVAGLAYMLYYALI GSKNVNKDIPVD >gi|313158213|gb|AENZ01000048.1| GENE 76 68031 - 69020 1434 329 aa, chain - ## HITS:1 COG:PA4569 KEGG:ns NR:ns ## COG: PA4569 COG0142 # Protein_GI_number: 15599765 # Func_class: H Coenzyme transport and metabolism # Function: Geranylgeranyl pyrophosphate synthase # Organism: Pseudomonas aeruginosa # 11 327 10 320 322 170 33.0 3e-42 MITLDTIRKPVTAELEAFDEFVDRQFTAEGELLSDMLRYALSSRGKGIRPLLVLLSAAMN SPVAEAAKGRRVHLAAMLVEMIHVASLIHDDVIDEADMRRGKPSANARWQSHKAVILGDY ILARNLSIGLTSGQFDLVTHVCGSMAALCEGEVLQSECAEKHTMTRQAYLDIIYKKTACL IGVSASAGAMAVGASQQKVALMRRFGEAVGMAFQIQDDILDYTRTAHTGKPANNDLREGK 
ITLPLLAVLDKAPAERRTELLGRLACCHDDEESVEYLQRTVENEGGLTFAAEVMRSYIAR AVEMLSEYEASDYRTALANLCAYIAERDR >gi|313158213|gb|AENZ01000048.1| GENE 77 69288 - 71777 3761 829 aa, chain + ## HITS:1 COG:no KEGG:Sph21_1623 NR:ns ## KEGG: Sph21_1623 # Name: not_defined # Def: protein of unknown function, membrane YfhO # Organism: Sphingobacterium_21 # Pathway: not_defined # 16 820 13 806 816 513 36.0 1e-143 MESSKTIIRRLLPAAAALALFFVVSAVYFAPQFRGEVLPQHDVMQYDGMAKDINDMREST GEDPQWTGRMFGGMPAYLINVAYPAQLVKNTLGRIVKIIDTPAAFLFFAMTAMWLMLLIF GVNPWVGVVASLAYGLSTYFLLIIGAGHITKMWALVYAPLMMGGAWMTLRGNMWAGGALT ALTASLEIGANHPQITYYFLLAMAAFWISEGVTAFREKHFRNFALRTAVLAAAGLLAVGS NFSPLWYTAKHSKETMRGGSELASTAETSKNGLALDYATAWSYGRTESLNLLVPDFMGRE SGTAFAPDGEVAAVLNDYGLRGAAQQLPAYWGTQPYTGGPTYLGAAAIFLAALGVALVRG RNKWWIVAVSALTLLLAWGRNLMWFTELAFDWLPGYNKFRTVSMALVVVQWTVPLLGALA LMRLWRDEIPRRRLFKALGWAAGVTGGLCLLLAVAGGSLFDFGREESAAMMTEQFHQILK ANNMQEYLDRGMDAEMGIATADAMAAERASMMQADAWRSLLMILLAAGGVALFALRRINK YALTALLGAVMLLDLVPVDLRFLSHDDFISARRRQITATAADKAILADKDPGFRVLNLTV SPFQDATTSYFHRSVGGYHGAKLARYQDLIDRYLSYRNDAVLDMLNTRYLIVPGDDGQPQ AVRRATANGPAWFVDGIVAADTPQQEIDLLGSVDLKTTAVAAPADKAFAEEWQAGGQADT TLVRGIALTEYRPNYQKYEYTAPEESVAVFSEIFYDHGWTAYVDGEAMPGFRADYILRAM KLPAGRHTVEWRFRAPGWAAAEAVTLVSSLLILLGAAAALTYCFRKKKA >gi|313158213|gb|AENZ01000048.1| GENE 78 71782 - 72663 1473 293 aa, chain + ## HITS:1 COG:BH3601 KEGG:ns NR:ns ## COG: BH3601 COG2177 # Protein_GI_number: 15616163 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division protein # Organism: Bacillus halodurans # 12 227 19 234 298 80 26.0 4e-15 MKDDKRLKRKVRNSYIVSTVSIMLVLFLLGSVGYLMVAAMKVAQTLQESIAVTVELQNGI SDQQRETINKRLTAEELVATVAYVTKEEKADDAEFRKMFESEFEEILDENPLLDSFELTL TAESADKELLDGFIASVSGIAGVERVSYPALMAERLHATVGKIRLVLLLFGGALLIISLI LLSNTIRLAIFSKRYLINTMKLVGATKWFIMKPFLGSSITQGILAGLGASLLFGLSVFGL NEAAPELTTIAESGKIAVILGAMIAGGVVISGLFTVAALNKFVNMKSNKIYLY >gi|313158213|gb|AENZ01000048.1| GENE 79 72664 - 72921 452 85 aa, chain + ## HITS:1 COG:no KEGG:CA2559_04540 NR:ns ## KEGG: 
CA2559_04540 # Name: not_defined # Def: hypothetical protein # Organism: C.atlanticus # Pathway: not_defined # 11 81 8 78 86 85 59.0 6e-16 MKKANKNTPAEQEPPKMPLTRRNYVLLAIGFAVILLGFVLMAGGGSDSPDQFNYAMFSWR RITLAPILVIGGFVIEIYAILKRYK >gi|313158213|gb|AENZ01000048.1| GENE 80 72981 - 73772 1189 263 aa, chain + ## HITS:1 COG:STM3205 KEGG:ns NR:ns ## COG: STM3205 COG1968 # Protein_GI_number: 16766505 # Func_class: V Defense mechanisms # Function: Uncharacterized bacitracin resistance protein # Organism: Salmonella typhimurium LT2 # 4 249 9 261 274 133 34.0 3e-31 MDTLQAILLGIVQGITEFLPVSSSGHLQIAKELLGVELEENLTFDVALHAATVLSTIVVL WSEVWRLLKGLFSRRFNAEQAYVLKLVLSMIPIGFVGFLFKDRINALLDAPYILVIVGAM LLLTAALLAFAYYAKPRQKETISYRDAFIIGLAQACAAMPGLSRSGTTIATGLLLGNKKA AVAQFSFLMVLAPILGETLLEAVSGDLTAGVAAGPLAAGFLASFVTGCLACKFMIEIVKR GKLIWFALYCAAAGLVSILSYFC >gi|313158213|gb|AENZ01000048.1| GENE 81 73772 - 74488 1174 238 aa, chain + ## HITS:1 COG:CT094 KEGG:ns NR:ns ## COG: CT094 COG0130 # Protein_GI_number: 15604813 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridine synthase # Organism: Chlamydia trachomatis # 17 224 11 212 241 158 42.0 9e-39 MKEPIDLRGLNFEEGYIAVLDKPLRWTSTDVVRKIKFTLRRLGYRKIKVGHAGTLDPLAT GILLVCIGRATKMVDALQAEEKEYVADVMLGATTPSYDLEHPVDRTFPTEHITREKVLEA LSSLTGERLQEPPVYSAKKIDGTRAYELARAGEEVTMRKATVNIYELELLEYDLPRIRIR VRCSKGTYIRSLAHEIGQALESGAHLTSLRRTRSGGFTLEKAFEFEEFLKKLEELETK >gi|313158213|gb|AENZ01000048.1| GENE 82 74553 - 75602 1847 349 aa, chain + ## HITS:1 COG:SA1466 KEGG:ns NR:ns ## COG: SA1466 COG0809 # Protein_GI_number: 15927220 # Func_class: J Translation, ribosomal structure and biogenesis # Function: S-adenosylmethionine:tRNA-ribosyltransferase-isomerase (queuine synthetase) # Organism: Staphylococcus aureus N315 # 1 349 1 341 341 310 43.0 2e-84 MKLSQYGYEFAPEMLAKHPTENRDDSRLMVINRAKGTVEHRIFKDIIEYFDEKDLFVFND TKVFPARLYGNKEKTGAEIEIFLLRELNRELRLWDVLVDPARKIRIGNKLYFGDDDLLVA EVIDNTTSRGRTLRFLFDGSYEEFKQALFSLGETPLPKWVRDKVEPEDAERYQTIFARHE 
GAVAAPTAGMHFSKHLMKRMEIKGIDKSFITLHVGLGNFRTVDVEDLSKHKMDSEQFFVT DEAAEAVNSAKRDGHKIVAIGTTVMRTVETAVSTNGMIKPMEGWTNKFIFAPYEFTVADA MVTNFHLPYSTQLMMVAAFGGYETVMNAYKVAKEEGYRFGTYGDAMLIL >gi|313158213|gb|AENZ01000048.1| GENE 83 75725 - 76531 994 268 aa, chain + ## HITS:1 COG:TM0895 KEGG:ns NR:ns ## COG: TM0895 COG0297 # Protein_GI_number: 15643657 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen synthase # Organism: Thermotoga maritima # 4 187 2 181 486 94 28.0 2e-19 MANKILYVCQEITPYLPETESSGLCRALTQAMQERGNDIRTFMPRYGCINERRHQLHEVI RLSGMNLIIDDNDHQLIIKVASIPAARVQIYFIDNDDYFSRKFVLTDEEGKAFPDNDERA IFFARGVLETVKKLRWTPTVVHCHGWFSSVIPVYLKRIFADDPIFRNVKIVVSLYGDGFP GELDNAFGKKIAGEGVKDKNLAILDTPSYENLCRFVMEYADGVVAASPDVEPKVLEIARA SGKPMLEYQSPEAADFFDNYNRFYEALQ >gi|313158213|gb|AENZ01000048.1| GENE 84 76542 - 78197 2774 551 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158277|gb|EFR57679.1| ## NR: gi|313158277|gb|EFR57679.1| putative lipoprotein [Alistipes sp. HGB5] # 1 551 10 560 560 1029 100.0 0 MFRAAAVLAAVALMTFAGCTKVDDTLGSNLVPDNQQMKVGYTTLGARTLAGKLDAAKYVE TRLFQSDSLKASNISYGYMGSMLSDTFGLRTAGFLTQYLNVYKVKEGYFGYRPIFDSAQI LITITSYGNDTLTPQKYNVYEIKSNKYLTEKPVAAGKTERDTVFYLNFDPVKEGVIGADD EPVFTFTFPDGETTGPATTAVTMNTTPATQNFIDRLMLQKGSGTAYEDDYSIYSTDSLKQ WVEEFKGLYIVPAEDQKEKGKGNIYATELESSGFMVYARNRVESDPTLIKDTIGMGYIFY NTSISTEYGNVSVNTIRRDYGMATSADAKFDIADAKEEAGKPITDHPTYDQIYVEGMGGV VTAITFDKPFFDALEAEIVAGNEGAQNFKTLAMTQVRMSIYFKGSNYDWTKLTDMPHMIG EMDKSQTRLGLYTNYKTLSAIPDYAYAYEQNYSTTLAYGGYINRSRGCYVMDITGYAQSL WNSYRSAKAELGADAPWDELKEKIKNRTIYLGPEAYSVYTTTYSVLQGMTPDGVTEEDVP IKIDVTYNLIK >gi|313158213|gb|AENZ01000048.1| GENE 85 78275 - 80572 3562 765 aa, chain + ## HITS:1 COG:SA1093_1 KEGG:ns NR:ns ## COG: SA1093_1 COG0550 # Protein_GI_number: 15926833 # Func_class: L Replication, recombination and repair # Function: Topoisomerase IA # Organism: Staphylococcus aureus N315 # 1 579 1 559 571 446 46.0 1e-124 MQENLVIVESPAKAKTIEKFLGKDYVVKSSFGHIRDLSKKDLGINIGEGFKPVYEIPADK 
KKVVDELTKLAKDKTVWLASDEDREGEAIAWHLTEVLGLPVDKTKRIVFHEITKRAILEA IEQPRTVNMDLVNAQQARRILDRLVGFELSPVLWKKVRPSLSAGRVQSVAVRLIVERERE IIAFRSTPYFRVVAQFHAAGDPDKTIFKAELSSRFETREQAEEFLNTCIGATFTVAKAEE KPAQRFPAPPFTTSTLQQEAGRKLGMSVSQTMSVAQHLYEQGLITYMRTDSVNLSKQALA QCKEEITKLYGEKYSSWHNYKTKTKGAQEAHEAIRPSYIENHTIEGTPAEKKLYDLIWKR TVASQMVCAELDRTTVVIDMSGSSQQFVATGEVVRFDGFLRLYSESTDDDQAAESGEGLL PKMTAGDRVVSTQITATERFTQAPARYNEASLVKRLEELGIGRPSTYAPTITTIINRGYV VKQNKEGQKRGYVQLTLTGDKLTSKNLTENFGKEKNRLSPTDIGMVVNDYLETQFKPIMD YNFTANVEKEFDRVADGDITWDTMIHDFYGPFHQMVDTAIGTQTDKKSQARILGNDPKTG HVVKARIGRYGPMVEIEGNEGEKGRFASLKKGQLIESITLEEALELFALPRDLGELEGEQ LSVGIGKYGPYVRHGKSFASLQKGDDPYTLTYERAVEVVRAQQAAAAAANTPLRSFPEDP DMLVKNGRYGPYIAYKGKNYRLPKGAKPETLTLEECRKIVSSSKK >gi|313158213|gb|AENZ01000048.1| GENE 86 80603 - 81811 1819 402 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1312 NR:ns ## KEGG: Odosp_1312 # Name: not_defined # Def: peptidase C1B bleomycin hydrolase # Organism: O.splanchnicus # Pathway: not_defined # 1 402 1 397 397 414 50.0 1e-114 MKKFLLTALAFCTAFAAAAQTPAATEPEGYRFTDVRLIPMTPVKDQHRSGTCWCYSTLSF LEGEILRAGGRPVHLSEMWIVRHTFMDKAVKYVRMHGEINFAEGGASHDVTEGIKNHGIV PQEAYPGFNYGTEKADFHELSLVLKAYLDAVITASGKSSDKALSTAWKRGFDAILDEYFG RMPETFTYEGKEYTPRSFAASLPIDMDDYIDITSFTHHPFYTQFIIEIPDNWMWGTVYNL PLDEMMTVIDNALANGYPVSWGTDVSEKGFSRTKAIGIIPEADLANMDGTEAERWGKLTQ KEKDDALYKFDKPGKELRITQEMRQQAFDNYETTDDHGMVIMGTATDQAGNRYFKVQNSW DVRPPYDGFWYFSRPFVEYKTTSIMVNKNALPKETAKRLGLK >gi|313158213|gb|AENZ01000048.1| GENE 87 82104 - 82181 121 25 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAKINAWFRTAMVLLGESIKLRAM >gi|313158213|gb|AENZ01000048.1| GENE 88 82313 - 82954 478 213 aa, chain - ## HITS:1 COG:lin0069 KEGG:ns NR:ns ## COG: lin0069 COG1357 # Protein_GI_number: 16799147 # Func_class: S Function unknown # Function: Uncharacterized low-complexity proteins # Organism: Listeria innocua # 2 212 3 212 212 108 30.0 1e-23 MEITRPVPSPNPVEQTDAALLQTAEEVSDAVFRSVPLPPLMREHFWVQSSVFAGCIFPAP VLQGSHFTDVTFRNCDLSNADLSGCSFQRVEFVECKLVGANLSEGTWQHVVFERCKMEFA 
NFTLGKFRGVRFAGGTMRSVGFDECRFERVEFSLCDLTMAEFSRTRLKGLSFVDSDIRGI RVGDTGSFELKGLKVSALQASELARLLGLEIEG >gi|313158213|gb|AENZ01000048.1| GENE 89 83227 - 84636 2064 469 aa, chain + ## HITS:1 COG:no KEGG:BVU_4075 NR:ns ## KEGG: BVU_4075 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 61 469 41 395 395 226 35.0 1e-57 MEKGYRIFLTVSLLLTIFTTTAQTTDTSKDNFPSTQEALSRIGAAEARPGRQPADTAALT TPAQRKGFIRRIIDYYSRSNVDRTFEKKIDWSIAPGPNYSSDVGFGLGFLVAGLYRLDRT DSVTAPSNISIYGNITTQKFLLLRFSGDNIFQHNKRRLSYSGAFVYFPGAFYGVGHKAGK EGYAQDLTTTMGIFRISYCTSLVGRFYIGVSGGIDYTGAKYKSSGMGEYFTAIDNHEKEK PGGQIGELYDLYKDNKRYDPEKQDPFSNFIAATGDKPNAFNTNLGIFAQYDTRDVTFNAS KGIFIKAEAKWYPQWLGNTRRNFGRFTLTFDFYQKLWKGAIFAYDLYADCTAGTPSWHMY AKLGGMERMRGYYEGRYRDKRLVETQIELRQKIYRRHGIVGWIGGGQVWGTDKFRWDNTL YSFGCGYRFEFKNRMNIRLDYGWGVYGNHNLPWDRKRSSAFLFTASEAF >gi|313158213|gb|AENZ01000048.1| GENE 90 84859 - 85425 491 188 aa, chain - ## HITS:1 COG:no KEGG:BDI_1990 NR:ns ## KEGG: BDI_1990 # Name: not_defined # Def: siderophore biosynthesis regulatory protein # Organism: P.distasonis # Pathway: not_defined # 34 173 44 187 205 77 33.0 3e-13 MTKGLFIEPPMTEAEAAARVTPGERAAAQEFLRERRRREFLTWRAVVRRELGDDVSIAYD EAGAPVIRGREVHIGVSHCPGRVAVCISDAPCAVDVEPESRDFSRAASRYMSPAERALSD DPLLPAAVWCAKETLYKYAGRPGLDMLYDLHVEAVDFEAGVVVGRIADGEPLRLSLKRAD GFIVVYIL >gi|313158213|gb|AENZ01000048.1| GENE 91 85422 - 86708 1991 428 aa, chain - ## HITS:1 COG:FN1486 KEGG:ns NR:ns ## COG: FN1486 COG1253 # Protein_GI_number: 19704818 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Fusobacterium nucleatum # 32 427 17 419 426 188 32.0 2e-47 MEETTTSWITFNGFSAHIAVLCTVTLGLLCVSALVSGAETSFFSLSHNDIRRLRKRTTAS AEAVLRLLTNVDLLLATILVVNNLVNICIVILTSSIIDSVFTFLRFEFLFKTVLVTFLLL LFGEILPKVLAQTIPVRFAQFAARPLTVLRWIFYPISYALVRTSSRISEKASHKSEISLD ELADAVDMTQSSSPEEHVMLSGIVNFVNTEVQEIMKPRVDITVLNITDDYETVKKTIIES GFSRIPVYEDDIDNIKGTLYVKDLLPYINHGSEFGWQQLVRKPYFVPEHKKINDLLADFQ 
SNKIHMAIVVDEYGSTLGLVSLEDIIEEIVGEISDESDADESFFTRLDEKCFIFDGKSHL GDFERVLGLDEEIFNDVKGEAETLAGLMLELKRDFPRKGDVFTSHDIRFTVQEMDGHRID KIRVDLLA >gi|313158213|gb|AENZ01000048.1| GENE 92 86725 - 88038 1635 437 aa, chain - ## HITS:1 COG:BB0299 KEGG:ns NR:ns ## COG: BB0299 COG0206 # Protein_GI_number: 15594644 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Cell division GTPase # Organism: Borrelia burgdorferi # 9 326 23 331 404 206 45.0 8e-53 MSEIKAPKTDGSIIMVVGVGGAGGNAVNHMWNLGIRGVTFLVCNTDQQALDKSPVELKIR LGAEGLGAGNDPENGRRAAVESLPEIRQHLEESGTRMLFITAGMGGGTGTGASPVIAKLA KEMGLLTVAIVTSPLAVEGKIRYEQAFRGIEELRQNVDSLLIINNENILEIYGRLSLKQA FGKADDILCSAAKGIAEIITVESDLVNVDFADVSKVMRDSGRAHMAVATAEGDNRAEAAA EASLRSPLLDHNLISGAKNILLNISVADADGLMYEEVVRILEYIQAHASVQDDNGVIHNA NIIWGTSEKPQLGNAIELVVVATGFAGDVSAAGTMKQIIPPVRSVEPAKDPVALVLEPIK PVVPPKAAVQRPPEQVMLGAKSTRYSNIELLLSKPAYQSRNSKFVVQMPGGRKEVLKEEN DAGQQAQQDAQGGSLFD >gi|313158213|gb|AENZ01000048.1| GENE 93 88518 - 89957 2175 479 aa, chain - ## HITS:1 COG:PA4408 KEGG:ns NR:ns ## COG: PA4408 COG0849 # Protein_GI_number: 15599604 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Actin-like ATPase involved in cell division # Organism: Pseudomonas aeruginosa # 8 382 11 382 417 202 32.0 1e-51 MERKNYTVAVDLGSSNVVVAVGEKNAEERLDVVCVVSKPVEGVNAGKIENIELVSRAIRE AVSEAEEQLGIRITEAYAGISGDFVRCARHMDHVFVYDPQNGVNQKDVDALFDRMRNVQA PDDETIMDRVPQNYVVDDNQEVKNPVGSFGKKLSSTFNFILCLKTPMQRLDMALKRVGIK MLSVTSNAIATAEAVLLPDEKEEGVAVVDIGGGVTDVAVYYRNVVRYIATIPMGATAINR DIRTMSVPEKHVESLKQKYGSAVAELAPEDKLIRVNGRTAREAKDILLRNLATVIEARAT DIAEFVLQEIKDSGYIGKLAYGIVLTGGSAKLKDLDELFRRVTGMDVRVASAETGIAEES KEKAADPAYATAVGILLKGAEQGACAFVERPVARPAEQQGFRPPQPQPAPEFRHTPRFQQ PVQAPPAAPAAEETEEDAAAEREEAPVIEPKRKRDWGSIFQKTFDKINKSFTAAEDEEI >gi|313158213|gb|AENZ01000048.1| GENE 94 89970 - 91070 1394 366 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1497 NR:ns ## KEGG: Palpr_1497 # Name: not_defined # Def: cell division protein FtsQ # Organism: P.propionicigenes # Pathway: 
not_defined # 276 366 158 244 268 81 39.0 7e-14 MRKYLRYALLTLLWGAVAAYVVYAGTAAGRLRAGKKVGRVEIEVVDSSSMGYLVSGRMVR EWIAHSGIKTNGTAVDAVELAAIEALIAKNGFVERVDAYVTYGGVLHIDISQRRPLLRLL TDGVDSYVTPEGYVFAAPRASSLYVPVVTGAYRPPFPASFVGSVRGHIDLERAKIDKRIA ELEREKYPFFRRELQNDRNISALRRMRIKKQWWRMESSAAFDARVEELRARKAELRRKYR YEARLVQEGIDRIAQRQEAERLKQKKLEKSYEDFMKLLTFVEFIEDDDFWRSEVVQIAAH TTPSGALEVELVPRSGRHTIVFGRIEQVERKFDKLLRFYRSGLMNIGWGEYRTIDIRYND QVVCKK >gi|313158213|gb|AENZ01000048.1| GENE 95 91164 - 92546 1790 460 aa, chain - ## HITS:1 COG:CAC3225 KEGG:ns NR:ns ## COG: CAC3225 COG0773 # Protein_GI_number: 15896472 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate-alanine ligase # Organism: Clostridium acetobutylicum # 4 445 11 448 458 225 32.0 2e-58 MEFQNIYFLGIGGIGMSALARYFLHEGKRVAGYDRTPSHLTDELAAEGAAIHFEDDVRRI PAEFLDPAATMVVYTPAVPQEHGEYQYFVSHGFCVEKRSQMLGHLAEGKYVMAVAGTHGK TTTSTLVAWLNRGVTGGGSAFLGGISKNFGGNLVLGAGARLAVEADEFDRSFLRLYPDVA VITSADADHLDIYGTHEAVKEAFSQFVRQIRPGGFLIIKEGVDIVLDNPQITVYRYSYDV PCDFYARNVQLLEGGHYRYDIVTPGGVVEGCTLGIPGWVNIENSVAAAASVWCAAQAEGT TLDAERLREALASFAGVKRRFEFYVNTPKQVYMDDYAHHPRELAATLTSVRKMFPGRRIT ALFQPHLYTRTRDLYREFAEALSHADDVVLLPIYPAREEPIEGVTSEIIAQGVTVPCRIV ERAALADTVAAMDTDVVVSFGAGNIDACCGAIAEKLKAKS >gi|313158213|gb|AENZ01000048.1| GENE 96 92645 - 93751 1700 368 aa, chain - ## HITS:1 COG:lin2141 KEGG:ns NR:ns ## COG: lin2141 COG0707 # Protein_GI_number: 16801207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylglucosamine:LPS N-acetylglucosamine transferase # Organism: Listeria innocua # 4 368 2 363 363 221 36.0 2e-57 MTRKIILSGGGTGGHIYPAVAVAEALKRRFGDGVEILFVGAEGKMEMEKVPALGYRIVGL PIAGLQRRMDWHNLAVPFKVLKSVSMAKKTIREFGADAVVGFGGYASAPVLWAAQRLGVP TVIQEQNSYAGLTNKILAKRAKRICVAYEGMERFFPAGRITMTGNPLRGRFSKEGADRGE ALEYYGFTPDLPVVLVVGGSLGTRSLNEMMKAWILALEGADAPVQVIWQTGKYYEREMQA FLAAHPVANIWQGAFIDRMDYAYAAADLVLSRSGAGTVSELCLVAKPVLFVPSPNVAEDH QTKNAKALEAKGAAVVVPDAEARTAAMRRAMELLSDKEALRTMSENLEKLARPDAAERIV DEIEKVMK >gi|313158213|gb|AENZ01000048.1| GENE 
97 94102 - 95640 1783 512 aa, chain - ## HITS:1 COG:alr4088 KEGG:ns NR:ns ## COG: alr4088 COG0606 # Protein_GI_number: 17231580 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted ATPase with chaperone activity # Organism: Nostoc sp. PCC 7120 # 1 507 1 507 509 513 54.0 1e-145 MFVRTYAGAIVGIDAAAVTVEVNIAGGGLGMYLVGLPDSAVKESEQRIRAAFENSGERMS GRKVVVSLAPADLRKEGASFDLPIAVGILAAMSRVNAETLAGTMFAGELSLDGTLKPVRG VLPMAVRAREEGLRRLVLPAANAAEAAVVEGVEVIGAGSLKEVIGLLNGEREMTPAVPAD GEPFGAEAGRYEEDFADVKGQTHVKRALEIAAAGGHNVLMIGAPGSGKTMLARRMPTILP PLSREEALETTKIHSVAGNMAAGGLLARRPFRAPHHTISQAALIGGGQSPRPGEVSLAHN GVLFLDELPEFGRSVLEVLRQPLEEKRVTVSRAKYSVEYPANFTLVAAMNPCPCGYYNHP AKECVCSPGAVHRYMSRISGPLMDRIDLHVEVTPVPAAELSAAAPGEPSAAVRERVVRAR EVQAARFAGTEGVHTNAMMNVAMLREYCRPDAASAALLERAMERLSLSARAYDRILKVAR TIADLAGRETIAAADVAEAINYRSLDRGNWGR >gi|313158213|gb|AENZ01000048.1| GENE 98 95796 - 97361 2402 521 aa, chain - ## HITS:1 COG:BS_dnaC KEGG:ns NR:ns ## COG: BS_dnaC COG0305 # Protein_GI_number: 16081096 # Func_class: L Replication, recombination and repair # Function: Replicative DNA helicase # Organism: Bacillus subtilis # 25 470 9 452 454 363 46.0 1e-100 MAKDYNPSQRNREAFEALTDTAGNVPPQAVELEEAVLGALMLEKDSIIAVQEYVTPDAFY TEEHRTIYRAIEELSMELKPIDLYTVTERLKAKKELKKVGGASYLAQLTQKVGSAANVEF HAKIIAQKYVQRELIRSATEIQKRSYDESTDVTELIGYAEGEIFKVAEGHVKRSVQASKD ILARALMQIEEASKNTSAFNGVPSGFMAIDRVTLGWQLSDLIIIAARPSMGKTAFVLSMA RNMAVDHEQGVAFFSLEMSAVQLMMRLIIAETGLSGNDVKSGRLTPEQWRHLESATKPLG SAPLYIDDTPALSVFEFRSKARRLKIHHDIKIIIIDYLQLMTGNQDSKGNREQEVAFISR TLKAIAKELNVPMIALSQLSRATEMRGGSKRPQLSDLRESGAIEQDADIVAFIHRPEYYG INQDENGMPTAGMAEIILAKHRNGAVCDVNLRFLKEQARFADVDDTMLPPSQAADSQQAY DDYASGSNAVPGAGIGGGLGAGIGGGEFDPNPPALNDEAPF >gi|313158213|gb|AENZ01000048.1| GENE 99 97526 - 98248 768 240 aa, chain - ## HITS:1 COG:alr2926 KEGG:ns NR:ns ## COG: alr2926 COG1040 # Protein_GI_number: 17230418 # Func_class: R General function prediction only # Function: Predicted amidophosphoribosyltransferases # Organism: Nostoc sp. 
PCC 7120 # 14 232 14 218 229 86 30.0 4e-17 MSILSDLLSDVAALFFPPRCPVCGVPLAQGERTVCTLCRTTAPLTGFWLEADNPLLAKCR DMLPVERASGFLYYIHGSGWRELIRGFKYRGAWRTARAMGEWYGRCLKESGLYDGVEVVV PLPLHPVKRCRRGYNQSEYIAEGIASQLGAEVDRRSVRRKRNTESQALKPRRERARNVDE AFAVRRPERLEGRHVLLVDDVFTTGSTMLSCAGEMLRAAPGCRISIAALAVSRRELGVRE >gi|313158213|gb|AENZ01000048.1| GENE 100 98232 - 100487 2838 751 aa, chain - ## HITS:1 COG:BS_priA KEGG:ns NR:ns ## COG: BS_priA COG1198 # Protein_GI_number: 16078634 # Func_class: L Replication, recombination and repair # Function: Primosomal protein N' (replication factor Y) - superfamily II helicase # Organism: Bacillus subtilis # 16 750 19 801 805 411 32.0 1e-114 MPLYADIVLPLAQPAYTFAVPEGMHVAEGTAVAVQFGPRKFYTGVVWRVHDRRPPFRTIK SIQRVLYDAPLLSAQQKALWEWIAAYYMCSAGEVMRVALPSLMKPSGDTEEEFAEDEFRP RTECYVALARELHDEGRLHEVFEKLGRRAPKQYEALLEIASAGDETRISTGEVARRLLRA DYAVLHALERKGHIVCTERERTVERGGSAFRLPELTAPQREALGSLREQFAEKPAALLHG VTGSGKTEIYIHLIAETLARGGDVLLLVPEIALTAQLIGRMERIFGSRVTPYHSKLTNRR RTETYLRLNRSQGGEFVVGVRSSIFLPLRHLQLIIVDEEHDASYKQADPAPRYNARDCAV VMARLWGGRTLLGSATPSLETWLNAEGGKYGRAVLAERYGDARPPEIFVSDTIRAAKRGE RHAHFNKLLLDKMEETLGRGGQVMLFQNRRGFSPYVECSQCGWTARCPHCNVTLTYHKGG ARLVCHYCGYTAPVPAKCPSCEVTEVLPRGFGTEKVEEEIARLFPEARVARLDRDSVTSE RAFNAIISDFEARKTDILVGTQMITKGFDFGGVSLVGILNADNLLNNPDFRAAERAFQLM LQVAGRAGRRSDGGEVVIQTSEPGHPVIRQVVAGDFAGMARMQLSEREAFFYPPYARLTS LTLRHRDVALLRRGVTELAARLRVRFGRRVLGPMTPPVDRIRGEYLAGLLLKIESGASSA KARSLLAAELKAFAENPEFKTITVVCNVDPQ >gi|313158213|gb|AENZ01000048.1| GENE 101 100487 - 101383 1601 298 aa, chain - ## HITS:1 COG:TM0177 KEGG:ns NR:ns ## COG: TM0177 COG1284 # Protein_GI_number: 15642951 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Thermotoga maritima # 9 298 1 283 283 166 35.0 4e-41 MKPIIKELMLKPVKEYFLMTVGMIMYSFAWIGCILPAKGVGGGAAGLSLVLCHAVEEWLG VSIQIGTMVFIINGILLLIAGFIIGWNFGVKTVYCVVVISLGMNFWQSVLPEGDFLHLER ILSVILGGILAGMGVAMCFAQGGSTGGTDIAAMIINKYRTISYGKIVIYSDFVIIGSSML VGFHIDTVIYGYVMTAVFGYTVDMIMAGNQQSSQVFIVTHDYEKMAEAIVENIHRGVTLI 
DSQGWYTKKQSKIVMVVCRKRETAMILKFVKTIDPDAFMTVGSVMGVYGKGFQALNKL >gi|313158213|gb|AENZ01000048.1| GENE 102 101457 - 102410 1201 317 aa, chain - ## HITS:1 COG:aq_632 KEGG:ns NR:ns ## COG: aq_632 COG1242 # Protein_GI_number: 15606062 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductase # Organism: Aquifex aeolicus # 8 308 8 307 317 213 40.0 3e-55 MYAWGDNRRFNSYSGYFRRMFGCRVQKLSVDAGFTCPNRDGTISGGGCTFCNNGAFTPSY CMPVKGVRRQIEEGIEFHRRRYRTASKYLVYFQSFSNTYAPLERLKEVYGEALAHPDVAG IVVGTRPDCVDAEKLDYFAELARGRYVAVEYGIESTRDETLRAVNRGHDFACARRAVEMT AARGLHVGAHFILGLPGETDAMLLEQVAAINSLPLTTLKFHQLQLFRGTAMAAEYDADPG RFRFWEIGEYIDLFVEVLRRLRPDLVVERFASEAPPRYHYGRNWGLVRNEQLLAMLEKRL EERNAYQGEIFVSLQSL >gi|313158213|gb|AENZ01000048.1| GENE 103 102413 - 103936 1627 507 aa, chain - ## HITS:1 COG:no KEGG:BT_3558 NR:ns ## KEGG: BT_3558 # Name: not_defined # Def: putative endonuclease # Organism: B.thetaiotaomicron # Pathway: not_defined # 191 507 89 379 379 76 26.0 4e-12 MKRLLTVLLAGISAGFASCGDDGAEKGDTWFLKPEAVVSGTTAEISCRTKFADGVLTESG AGFVYTRIGAADGYFDAEPVEVSGNALSCRLSDLEPETDYLVYAYLELGGAGRMQSESVS FRTGKGSDPGNAPAFGQPAYSEVTSSSAVVSCTFSYQGAEPVSEVYFLYGTGGVSKREAV SAQPGAKSVRLSGLDASTDYTFRLCVAAGGKTYESGEAAFTTDAGDGPGPGPNPGLTKFS GWPELPVEVKNGDYHYAYHTISDFKVGNYNARNYTVCYSAENHGPVWVAAPLHNCYVTKS GNRSYSQDPDVPAGIQPASKSLADPYNKGHMLGNRERSRTAVMRRQVCYYTNIAPQHSGT FNTGGGAWNNLEDLIDTYWDSTESANVGDTLYVVVGAYYKTWTDSYGNTASPKKAAFGGQ QAGVPTMFYYAMLRTRNAKSGKSVLRCSREELQCAAFVMSHAMQKGHKPSRSDMRSVAEV ERLSGFQFFTNVPNAPKDTYNPSDWGL >gi|313158213|gb|AENZ01000048.1| GENE 104 103951 - 104406 598 151 aa, chain - ## HITS:1 COG:SP1644 KEGG:ns NR:ns ## COG: SP1644 COG1490 # Protein_GI_number: 15901480 # Func_class: J Translation, ribosomal structure and biogenesis # Function: D-Tyr-tRNAtyr deacylase # Organism: Streptococcus pneumoniae TIGR4 # 1 149 1 147 147 147 52.0 9e-36 MRLLIQRVKNASVSVGGDELSRIGQGLLVLVGVGVEDTDEDMEYLAGKLVRLRIFDDGQG VMNLDVRQVGGEVLVVSQFTLQASTRKGNRPSYVRAAPEAVSRPMYERFTARVAELLGRE VPTGEFGADMQVALVNDGPVTIWIDSKMRDC 
>gi|313158213|gb|AENZ01000048.1| GENE 105 104420 - 105106 914 228 aa, chain - ## HITS:1 COG:CAC1015 KEGG:ns NR:ns ## COG: CAC1015 COG0564 # Protein_GI_number: 15894302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthases, 23S RNA-specific # Organism: Clostridium acetobutylicum # 7 183 82 261 318 116 37.0 4e-26 MFTPEDILYEDNHLLIVNKRCGDLVQPDPSGESALEDQIKAFVKRRDAKPGDVFLGVVHR IDRPVSGAVLFAKTSKALTRLNEMIREGRIRKVYWALTERTPDPESGELRHYILRDARTN RSRACDGPKPGAKEARLRYETLGAGTNYTLVEVELLTGRHHQIRAQLSKIGCPIRGDLKY GARRSLPGGGISLHSRRVEFEHPVRREPVSVTAPAPADDNLWAYFETR >gi|313158213|gb|AENZ01000048.1| GENE 106 105210 - 105665 606 151 aa, chain - ## HITS:1 COG:BS_cdd KEGG:ns NR:ns ## COG: BS_cdd COG0295 # Protein_GI_number: 16079584 # Func_class: F Nucleotide transport and metabolism # Function: Cytidine deaminase # Organism: Bacillus subtilis # 23 117 4 96 136 70 43.0 2e-12 MEKQFCFNYEHYAALAELSDADRELVREAERATANANAPYSKFRVGAAARLRSGKILYGS NFESEVYPAGLCAERTLMFYAQANYADDPIETLAIASDPSERECYPCGQCRQVMVDVERR QGAPMRVIMSGGGTATAVDSAALLLPFTFVL >gi|313158213|gb|AENZ01000048.1| GENE 107 105903 - 106589 1212 228 aa, chain - ## HITS:1 COG:no KEGG:Riean_0656 NR:ns ## KEGG: Riean_0656 # Name: not_defined # Def: hypothetical protein # Organism: R.anatipestifer # Pathway: not_defined # 28 228 27 230 230 108 35.0 2e-22 MKKLILMAAVALMALPAAAQMQEAFPSYIQVNGRAEKEITPDEFYLSIVINERDSKGKIS VESQQRDMIAALKRLGVNVEKQLKVANLSSEFFKKNTSVATAKYQLQLGSSAEVAKVWQA LDGLGISNVSILKVSHSKIDEYKEQVRVEAMQNAKQSAQTLAGAIGQNVGKCFYIYDSNN NVMPVLYDNMAVMRSAKAGYGEEAAADEPLDFKTIKLQYNVQAKFVLE >gi|313158213|gb|AENZ01000048.1| GENE 108 106664 - 107017 469 117 aa, chain - ## HITS:1 COG:no KEGG:Sgly_0946 NR:ns ## KEGG: Sgly_0946 # Name: not_defined # Def: glyoxalase/bleomycin resistance protein/dioxygenase # Organism: S.glycolicus # Pathway: not_defined # 1 115 13 127 129 203 82.0 2e-51 MPTMVRFYRDVLGFEIREDENASNVYLEKDGTLFLLYRRADFEKMTGRRFGYAGPVNGHY EIALSVENHAAVDAAFREVVAKGARPVMEPTTEPWGQRTCYVADPEGNLIEIGSFKA 
>gi|313158213|gb|AENZ01000048.1| GENE 109 107050 - 108459 1860 469 aa, chain - ## HITS:1 COG:BH0687 KEGG:ns NR:ns ## COG: BH0687 COG2265 # Protein_GI_number: 15613250 # Func_class: J Translation, ribosomal structure and biogenesis # Function: SAM-dependent methyltransferases related to tRNA (uracil-5-)-methyltransferase # Organism: Bacillus halodurans # 1 466 1 454 458 277 34.0 3e-74 MARKKANYPLIEGLEITTLAAEGKAMGRWNDVVVFVPLTVPGDVVDVQIRSKRRRFMEGF VVRYVRKSPLRAEAFCAHFGVCGGCKWQNLPYEEQLRFKTEQVRDQLTRIGKIALPQIAP CLGSEQTRFYRNKLEFTFADRRWLTREEVESGTDFDAAPALGFHIPNMFDKVLDIDKCWL QPDPSNDIRTETRRFCIENGYTFHNAREHRGLMRNMIVRTASTGEVMVIVVFGEDDRERI AALLDHLAANFPQITSLFYIVNTKFNDSVGDLDPVCYKGKDHIVEEMEGLRFKVGPKSFY QTNSAQAYELYKVAREFADLKPEDVLYDLYTGTGTIANFCAARCSRVVGVEYVPEAIADA KVNSQINGIGNTVFYAGDMKQVLSDGFVAANGRPDVIILDPPRAGVDEPVIGVILRAAPR RIVYVSCNPATQARDLALLDADYRVEAVQPVDMFPHTHHVENVVKLVRR >gi|313158213|gb|AENZ01000048.1| GENE 110 108691 - 109830 1371 379 aa, chain - ## HITS:1 COG:no KEGG:VS_0953 NR:ns ## KEGG: VS_0953 # Name: not_defined # Def: hypothetical protein # Organism: V.splendidus # Pathway: not_defined # 15 367 12 386 392 132 27.0 3e-29 MKINGYALLQAAAALCAALVLTGCSLLKVAVATGDPLSKEEMNVRTMTRGFYYDMAAEVA RTADSIAQAAPDVATRVAAVRWKIHATRAGVSAAMQGIPDVALADMWILCRRMNERFAAT PDSLLFGAQSGMARDAAARLDRRAARLARQVLAEERYALMERFVAQYVRDNPSDEGMEAS NTTLAWIEFLRENGVEHAYATGSIAEVLADVNDRVSSQTQQLASSVGWSKDILEMQLQQD SLRMEVGARLDSLERSFTRIVVVAEHLPEISDRVLDELNKQVTQLIYTMNYSVDNAFADF DRQRDELQRYVTGERQALVEQLRQTAGEVVRTTLDAVPGLVGKVLLYVVLALAVLIGGPF ALGFWLGGVRARAGAKRKE >gi|313158213|gb|AENZ01000048.1| GENE 111 109849 - 110673 1247 274 aa, chain - ## HITS:1 COG:DR0821_2 KEGG:ns NR:ns ## COG: DR0821_2 COG0657 # Protein_GI_number: 15805847 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Deinococcus radiodurans # 43 210 4 174 242 96 32.0 5e-20 MKRLVALIFALAVCGAGAAARPGAYRTVKNIAYRDAGGDRNIDTMCRLDLYYPADGEGFT TVIWYHGGGLTGGRREIPEALKEKGFAVVGVEYRLSPHVKVADCVDDAAASAAWVVKHIA 
EYGGDPRRIFVAGHSAGGYLTSMIGLDKRWLEPYGIDPDTTFAALIPYSGQVVTHFARRR EMGIPDTQVVVDDMAPLNYVRPDCRPILILSGDREREMLGRYEENAYFWRMMRVAGHPDV RIYEFDGFDHGNMPQAGHFAAVRYIREMERKMER >gi|313158213|gb|AENZ01000048.1| GENE 112 110834 - 111502 1041 222 aa, chain + ## HITS:1 COG:CC1183 KEGG:ns NR:ns ## COG: CC1183 COG0491 # Protein_GI_number: 16125435 # Func_class: R General function prediction only # Function: Zn-dependent hydrolases, including glyoxylases # Organism: Caulobacter vibrioides # 10 210 51 249 273 148 44.0 7e-36 MKIACLQFNPIQENTYVVWDDTSEAVVIDAGNSNPREDAALDNFIAQHGLKPVLAVNTHG HFDHTLGVAHLKKRYGIPFALSSKDQFLLDNASTSGSIFGVAVGEMPPADIDLDTTPEIR FGHTVLRVIHTPGHTPGHVALFDPESKSLFTGDTLFRESIGRTDLPGGDYSWIMRSILDS LIPLGEEVRVYPGHGPESTIGHEMLYNPFVVEVLNQEVNYKN >gi|313158213|gb|AENZ01000048.1| GENE 113 111567 - 112295 1066 242 aa, chain + ## HITS:1 COG:no KEGG:MAB_2973c NR:ns ## KEGG: MAB_2973c # Name: not_defined # Def: putative methyltransferase # Organism: M.abscessus # Pathway: not_defined # 47 242 37 251 251 80 32.0 5e-14 MKTLIHKILYRTLPLEGYLRAVSRLFFLWQRLGIGRYAPATEYVYHLPQLVRAGDTAIDI GANLGYYARTLSRLTGPAGRVYAVEPVPPILAVLRRNLCRCRNVEILPYALGTENKPITM ANDSARETGYFGTGQNFVNDSGAAAAAQFSAQMRRGSELFAGLERLDFVKCDIEGYEVVV LTELRPLLERFRPTVLVETGGGNRPRIVELFTALGYKAFTLEHGREIPLTAASAKDIIFR PQ >gi|313158213|gb|AENZ01000048.1| GENE 114 112319 - 112999 1082 226 aa, chain + ## HITS:1 COG:BH2479 KEGG:ns NR:ns ## COG: BH2479 COG0336 # Protein_GI_number: 15615042 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA-(guanine-N1)-methyltransferase # Organism: Bacillus halodurans # 1 223 1 224 246 241 52.0 9e-64 MRIDILTVVPELLTSPLNESILHRAQEKGLVQIVVHNLHDYAHDRRKTTDDYPFGGEAGM VLKCEPVFELIEKLQSERHYDEIIYTSPDGVRYDQHEANRLSTLDNIIILCGHYKGIDHR IREHLVTREISIGDYVLTGGELAACIIADSVVRIIPGAIGDEASALTDSFQDNLLAPPVY TRPAEFRGWRVPDVLLSGNFAQIARWQEEQSYQRTRQLRPDLLNEE >gi|313158213|gb|AENZ01000048.1| GENE 115 113020 - 114159 1690 379 aa, chain + ## HITS:1 COG:FN0597 KEGG:ns NR:ns ## COG: FN0597 COG0763 # Protein_GI_number: 19703932 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Lipid A disaccharide synthetase # Organism: Fusobacterium nucleatum # 1 372 1 354 356 178 30.0 1e-44 MKYYLIAGEPSGDLHGANLIEGLRKADPEAQFRFWGGDRMAAAGGAANLAKHYRETSFFG IVQVLKNLRTIKRQMLECQADVAAFAPDVLILVDYPGFNMKMARWAKEHGIRTFYYIAPK VWAWREWRVKAIRKYVDRLFIIFPFERSYFPRHGIEPIFEGNPLVDAIEAKRAALPSPDE FRRRNGLDGRPIVALLAGSRRGEIRDNLPLMADLSKKFPGHQFVVAGVSWLDRALYEQYM AGSDIRYVCDQTYETLAAAEAAVVTSGTATLETALLGIPEVVVYRTLWFQVKLQPYVLNV PWVSLVNLNLGREAVAEIIQSGLDITRAERELRAVVEGGSKREKMLSDFDELRKVIGGPG ASDRFAARMVAELHATEAQ >gi|313158213|gb|AENZ01000048.1| GENE 116 114169 - 114486 423 105 aa, chain + ## HITS:1 COG:no KEGG:Cthe_2202 NR:ns ## KEGG: Cthe_2202 # Name: not_defined # Def: hypothetical protein # Organism: C.thermocellum # Pathway: not_defined # 4 103 1 100 245 63 37.0 4e-09 MDEIFIHILTGVFLIVVGLLVKRFPMLIAGYNTMPAEKKKNVDIAGLSSFMRRHLVIIGA LWVLLAVVLNLTGQQDALPVVYVIYLPVYVVWMLVRAQRYDHNKQ >gi|313158213|gb|AENZ01000048.1| GENE 117 114496 - 115209 952 237 aa, chain + ## HITS:1 COG:ycgM KEGG:ns NR:ns ## COG: ycgM COG0179 # Protein_GI_number: 16129143 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: 2-keto-4-pentenoate hydratase/2-oxohepta-3-ene-1,7-dioic acid hydratase (catechol pathway) # Organism: Escherichia coli K12 # 2 233 18 216 219 142 34.0 4e-34 MKIICIGRNYAAHAEELNHQIRGGAEDPAERSRQAGGGADTLPERNRPGSGECSPEPIWF LKPDTALLRNNDPFYIPAFSHEVHYECELVVRIDRVGKSIAERFAHRYYKEVGLGIDFTA RDLQRQAAAEGLPWERSKAFDRSAAISPEFVSLQELGGDVQKLRFTLEVNGQTRQTGDTR QMLFTVDRIIAAVSRYMTLRMGDLIYTGTPAGVGPVSPGDNLRAVLEGRELLSFDIR >gi|313158213|gb|AENZ01000048.1| GENE 118 115415 - 116008 960 197 aa, chain + ## HITS:1 COG:BS_maf KEGG:ns NR:ns ## COG: BS_maf COG0424 # Protein_GI_number: 16079857 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Nucleotide-binding protein implicated in inhibition of septum formation # Organism: Bacillus subtilis # 12 195 5 185 189 143 46.0 2e-34 
MLLKDKLEPYRLLLASQSPRRRELMTGCGLPYEPAPKYECEEIYPDSLPAEEVPLFLSRL KSEAYPAPLGPNDILLTADTVVISDGEVLGKPCDRGQAAQMLRRLSGRRHTVVSGVTLRT AGRMHTFSAESGVWFRPLSEEEIGYYLDTFAPYDKAGSYGIQEWIGYAAIEKINGSFYNV MGLPIQKVYVELEKFLQ >gi|313158213|gb|AENZ01000048.1| GENE 119 116008 - 117996 3045 662 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0699 NR:ns ## KEGG: Odosp_0699 # Name: not_defined # Def: peptidase S46 # Organism: O.splanchnicus # Pathway: not_defined # 1 656 1 712 717 585 43.0 1e-165 MKRTLLTLICSLFALAASADEGMWLPSLISQRIDDMRAKGFRLTAEDIYSINKASMKDAV VLFDGGCTGELISAEGLLLTNHHCGYDAIQSHSSVEHDYLTHGFWAMSRAQELPNEKLNV KFLVRMEEVTDRLAAGQTQEEIVRRAEAEGKGYKAAVEQMYYGNQQFLFVYEQFDDVRLV AAPPSSIGKFGGDTDNWIWPRHTGDFSLFRIYAGRDNKPAAYSPDNVPYKPKRHFKISTA GVGEGDFTMIYGFPGNTQEYILSDAVAYIAERSDPAKIAVRTGRLDIISAAQQSDPALRI HYAAKHASIANAWKKWQGEVLGINRLGTVASKRAYEEDFAAWAQDKPEYRSVVADLKAEY ARIADAYFAREITRETLDALPKRYTAAERAEAVFAQREATERALYRYLFGEYARRCPVQY QAPEFLAGTAAAGSPEAFAERIFSDVWEGSDTTAVNTLARGAKRMQDHIEWLLGTRSLRN LNSARLNELYTAYIKGLREWDAGRAFFPDANLTLRVAYGHVAGYEYADGEYHKPRTTIDG IIAKDNPEIYDYDIPQALRDIHASKEYGRWATRIDGRTTVPVCFLATNHTTGGNSGSPIL NARGELVGVNFDRTWRSTMSDVAFDETICRNIAVDIRYVLFVIDRIGGAGYLFGEMDFGK RK >gi|313158213|gb|AENZ01000048.1| GENE 120 118036 - 119559 1826 507 aa, chain + ## HITS:1 COG:CAC0499 KEGG:ns NR:ns ## COG: CAC0499 COG0793 # Protein_GI_number: 15893790 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Clostridium acetobutylicum # 155 440 128 372 403 65 26.0 4e-10 MRIIHCAWAAALLSALAAGSCAKDTAGNGGTGTTESYNDKIDSYLAERYLWNGEYGQMTR DLTIEYAADKDNFLTRTLLEMTSNTLDKKRYTDGAGQNYYNLYSYITRKARSRAATELKT RGVNHGVRKETEYGFGFARLGVVRFSNTGKAGFYVMAVYPGSPADKAGFRRGTIFSEIDG AEIADSESEYTPVYKRLTSPSGAASVLFKVRETGEETTLTAERLYPNPVLRAEVIEEGEH KIGYMVYNGFDAAYDDDLLDSLRKFRDAGITDLVLDLRYNGGGHVISAKMLATCIAGEKC DGKVFQYYRYNDSRMADPAKTQKQTGNTYDQGKALFYENFSYGNYYGADLTQYALDLPRI YVLTTGSTASSSEAVINGLRGIDVGVTLIGEKTNGKNVGMEIDKFDDGDYSYELAPITFE GYNAKEETVNPAGLAVDYAVADWNNRLVDFGPEEPMLAKALSLITGRTYTPAPGRSAAGI APITGVSLPEDARRPAGMLVLSPQHGE 
>gi|313158213|gb|AENZ01000048.1| GENE 121 120098 - 121159 1475 353 aa, chain - ## HITS:1 COG:no KEGG:Lbys_2826 NR:ns ## KEGG: Lbys_2826 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 5 353 227 566 566 404 54.0 1e-111 MGLALWSDAHFDALRPLVRMLADAGQKVVTATLNKDPWNHQCFDAYEDMIRWTKRADGTW EYDYSLFDRWVGLCAGEGIDRQINCYSMLPWNNELHYYDAAADRIVEVRANPGTPEFEAM WRPFLRDFEAHLDAKGWLGKCCVAMDERSPETMDAAIGLLRSAAPGLGIAMADNHASYKR YADLDDVCVQIDCRVADEDLARRRRDGLLTTYYVCCSSAFPNTFTFSEPWEAVYMAWFAA ACGYDGMLRWSYNSWPADPVRDSRFTAWPAGDTYLVYPDARSSIRFERLREGIQDYEKIR ILRGELASDNTPEGAAKRAELEAAVRPFEAHDPARPWPDLLRRAKETVARLSE >gi|313158213|gb|AENZ01000048.1| GENE 122 121313 - 121411 62 32 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIYPILITYSNCYRRYKDFPFQTIHKFFYLLF >gi|313158213|gb|AENZ01000048.1| GENE 123 121956 - 122117 87 53 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEDSTDLRKDISQKWRNTMRNIYMLLSMIKIMVYLFNSQQYFLTKISLQELLY >gi|313158213|gb|AENZ01000048.1| GENE 124 122376 - 123143 179 255 aa, chain + ## HITS:1 COG:no KEGG:Tresu_0562 NR:ns ## KEGG: Tresu_0562 # Name: not_defined # Def: hypothetical protein # Organism: T.succinifaciens # Pathway: not_defined # 9 255 6 247 251 145 35.0 1e-33 MSYIQPTNHIQFILRPETAIIKEAFRSLDATNGGIETYPISDYLLHSLFLRLTGAQEQKL KCICWEMACRDYEYRYERYERKPYGECSSYDDKCMVYNDLLNEIKKLDETFDITDAIRDE ILDDWKTSIQCVLENSLLVRNFKRSYDEYKLLIITIDKSWIMQGTQLFLNKNNIKAEKRT ATCGLSLLEIFKEYVYKERNRCAHNTRSYQHNLPSIREMMSQGYKLQNYFLYMSIIILLD KIYIKLFEVYLSKLK >gi|313158213|gb|AENZ01000048.1| GENE 125 123438 - 123797 242 119 aa, chain + ## HITS:1 COG:no KEGG:BT_4618 NR:ns ## KEGG: BT_4618 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 115 1 114 117 99 44.0 4e-20 MSEKSITFEDLPKAMSWMMDKLNELDSKIDGLNNQNSNVPTEQWMNLKELCEYIPSHPAE QTVYGWTSCHLIPFHKRGKRIMFLKSEIDEWLHAGKIKSDKNLEGEAAQFIKSKRNTKF >gi|313158213|gb|AENZ01000048.1| GENE 126 124628 - 125170 10 180 aa, chain + ## HITS:1 COG:no KEGG:BT_0230 NR:ns ## KEGG: BT_0230 # Name: not_defined # 
Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 180 263 443 452 121 37.0 1e-26 MWKSIIDKVVSLPFVENENGGNIQNILDFSSEAKAYFTNWRNNAIRTINQIQDDGLVDSR VIKTPMITARLALILQILRWACGEVHKDFVDIDSTKSAIALSEYFECCYSDIQKYMLKES VEPQKKELLDCLSESFTTADAIQAGKEVGLSGRSVMYSLVSLATNKIIKKVKRGEYEKLQ >gi|313158213|gb|AENZ01000048.1| GENE 127 125762 - 126304 230 180 aa, chain + ## HITS:1 COG:no KEGG:BT_0229 NR:ns ## KEGG: BT_0229 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 178 162 339 341 231 58.0 7e-60 MLYDVKTGHRTKEPRSYVSWVHTELNFQNYHLKQCLFGEHLLSDNSTKPIAIVESEKSAL VATHYMPEFIWLATGGMHGCFKPDVISVLKGRSVMLCPDLGAKEVWQTKMPLLTSVCSKV VLSDSLEQCATDEQRKNGLDIADFLLMTDTPQMILDKMIQRNPALQTLIDELGLKLVETE >gi|313158213|gb|AENZ01000048.1| GENE 128 126523 - 127908 712 461 aa, chain + ## HITS:1 COG:no KEGG:PG1109 NR:ns ## KEGG: PG1109 # Name: not_defined # Def: mobilization protein # Organism: P.gingivalis # Pathway: not_defined # 6 326 3 343 455 237 41.0 1e-60 MSNTQYAVCHLQRGSGNDSGMSCHIERKDAKREKYVPVNADANRTHLNRELVAFPAGVKN RTDAIQYRIDHAGLHRKVGKNQTKAIRIILTGTHKQMMKIAKEDKLDNWIDANLKWLKNT FGSENLVSCVLHMDEKTPHLHATIVPIITTERLRKKREGEKKYATKSGARLSADDVMRRS KLHEYQNSYAAAMKPFGLQRGIVGSTAKHLANLEYYKQQINRYEEDIAKLQADIEKAHEG KNTILSWFGKGDLAKAKKELASRDEQIAKLKNQIKTLMAEKSQLKEKHRGEIEQLRNGYQ KEIDKAIRMAEKADRQSKEKDSVIDKLNVRIEQLDRKANPQRYSLSSGAELVHHFIPNYN NPSLHIWTQVGNEKYDSIKYVDWLNPIWESFTKDEATIYELINDVFEPHEQVNKAQANLL GVAFELATGGQAQVHIGTGSGGSSSELPWGEQKTKNSQRRR >gi|313158213|gb|AENZ01000048.1| GENE 129 127916 - 128722 412 268 aa, chain - ## HITS:1 COG:SP0506 KEGG:ns NR:ns ## COG: SP0506 COG0582 # Protein_GI_number: 15900420 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Streptococcus pneumoniae TIGR4 # 3 268 1 265 265 330 63.0 1e-90 MNIENFETYLRQGNMAENTIAAYRYAVKEYYSRHKELNKRNLLVYKTYLIEKFKPKTVNL RIQAMNKYLDYMNKSRLRLKSVKVQQRSYLENVISNADYAFLKNKLKKEENQEWYFVVRF LAATGARVSELIQMKVEHVQIGYFDIYTKGGKIRRIYIPKSLRKEATEWLNSTNRTSGYL 
FLNRFGERITTRGIAQQLKNYATKYGLNEKVVYPHSFRHRFAKNFLEKFNDISLLADLMG HESIETTRIYLRRSSAEQQEIVDKIITW >gi|313158213|gb|AENZ01000048.1| GENE 130 129405 - 129506 99 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MANLKNQFFKKPFKGDLFHKLQKKEQHKSYLSK >gi|313158213|gb|AENZ01000048.1| GENE 131 130444 - 130731 95 95 aa, chain + ## HITS:1 COG:no KEGG:CCV52592_0371 NR:ns ## KEGG: CCV52592_0371 # Name: not_defined # Def: type I restriction-modification system S subunit # Organism: C.curvus # Pathway: not_defined # 1 95 228 322 323 105 52.0 5e-22 MCIEGGSAGRKIAILNQDVCFGNKLCCFSPFVGIGKYMYYYLQSPSFFELFNLNKTGIIG GVSIAKVKEILIPLPPIKEQQRIVAQIEKLFEQLR Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:38:06 2011 Seq name: gi|313158212|gb|AENZ01000049.1| Alistipes sp. HGB5 contig00075, whole genome shotgun sequence Length of sequence - 3252 bp Number of predicted genes - 1, with homology - 0 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 125 68 ## + LSU_RRNA 2242 - 2701 91.0 # FJ410386 [D:700..3332] # 23S ribosomal RNA # Elizabethkingia meningoseptica # Bacteria; Bacteroidetes; Flavobacteria; Flavobacteriales; Flavobacteriaceae; Elizabethkingia. + 5S_RRNA 3103 - 3161 91.0 # CYTRRAA [D:1..114] # 5S ribosomal RNA # Pedobacter heparinus # Bacteria; Bacteroidetes; Sphingobacteria; Sphingobacteriales; Sphingobacteriaceae; Pedobacter. Predicted protein(s) >gi|313158212|gb|AENZ01000049.1| GENE 1 3 - 125 68 40 aa, chain + ## HITS:0 COG:no KEGG:no NR:no NRGYDECKKVIKGVWGMPRLLEATKDVVSCDKPRGFANEN Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:38:17 2011 Seq name: gi|313158187|gb|AENZ01000050.1| Alistipes sp. 
HGB5 contig00073, whole genome shotgun sequence Length of sequence - 25441 bp Number of predicted genes - 26, with homology - 25 Number of transcription units - 9, operones - 6 average op.length - 3.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 108 - 167 3.6 1 1 Tu 1 . + CDS 365 - 655 293 ## BT_2612 hypothetical protein + Prom 1112 - 1171 2.5 2 2 Op 1 . + CDS 1201 - 3000 1024 ## COG3344 Retron-type reverse transcriptase 3 2 Op 2 . + CDS 2987 - 3268 294 ## gi|291513816|emb|CBK63026.1| Bacterial mobilisation protein (MobC) 4 2 Op 3 . + CDS 3316 - 3669 237 ## HMPREF9137_2153 relaxase/mobilization nuclease domain-containing protein + Term 3691 - 3730 6.0 5 3 Tu 1 . + CDS 3858 - 4067 208 ## Bacsa_2839 PfkB domain-containing protein + Term 4068 - 4102 7.0 6 4 Op 1 . - CDS 4075 - 4455 633 ## Bache_2020 hypothetical protein 7 4 Op 2 . - CDS 4458 - 4949 802 ## COG0716 Flavodoxins - Prom 5090 - 5149 3.5 + Prom 4900 - 4959 2.3 8 5 Op 1 . + CDS 5167 - 7956 3145 ## Odosp_2398 TonB-dependent receptor plug 9 5 Op 2 . + CDS 7961 - 8350 336 ## gi|291515694|emb|CBK64904.1| hypothetical protein AL1_27620 10 5 Op 3 . + CDS 8353 - 9855 1636 ## Odosp_2400 hypothetical protein 11 5 Op 4 . + CDS 9899 - 11110 1717 ## Odosp_2399 putative lipoprotein 12 5 Op 5 . + CDS 11133 - 12020 1149 ## COG1131 ABC-type multidrug transport system, ATPase component 13 5 Op 6 . + CDS 12030 - 12701 1112 ## Odosp_2402 hypothetical protein 14 5 Op 7 . + CDS 12721 - 13944 1605 ## Odosp_2403 hypothetical protein + Term 13968 - 14004 8.7 - Term 13955 - 13992 8.1 15 6 Op 1 . - CDS 14017 - 14595 669 ## Odosp_2395 hypothetical protein 16 6 Op 2 . - CDS 14611 - 15204 719 ## Odosp_2394 hypothetical protein - Prom 15315 - 15374 3.6 - Term 15281 - 15319 -1.0 17 7 Tu 1 . - CDS 15389 - 16048 637 ## COG2217 Cation transport ATPase - Prom 16173 - 16232 5.0 - Term 16507 - 16540 4.1 18 8 Op 1 . - CDS 16575 - 17345 1144 ## ZPR_1186 S1/P1 endonuclease family protein 19 8 Op 2 . 
- CDS 17382 - 19127 2351 ## COG1158 Transcription termination factor - Prom 19190 - 19249 4.1 - Term 19260 - 19292 0.6 20 9 Op 1 . - CDS 19498 - 21558 2487 ## BF1750 1-phosphatidylinositol phosphodiesterase 21 9 Op 2 . - CDS 21585 - 21719 141 ## 22 9 Op 3 . - CDS 21793 - 23148 1717 ## gi|313158200|gb|EFR57604.1| putative lipoprotein 23 9 Op 4 . - CDS 23192 - 23347 230 ## gi|313158192|gb|EFR57596.1| hypothetical protein HMPREF9720_2667 24 9 Op 5 . - CDS 23419 - 23691 308 ## gi|313158197|gb|EFR57601.1| conserved domain protein 25 9 Op 6 . - CDS 23712 - 24881 1862 ## BT_0418 outer membrane porin F 26 9 Op 7 . - CDS 24949 - 25392 451 ## gi|313158191|gb|EFR57595.1| conserved domain protein Predicted protein(s) >gi|313158187|gb|AENZ01000050.1| GENE 1 365 - 655 293 96 aa, chain + ## HITS:1 COG:no KEGG:BT_2612 NR:ns ## KEGG: BT_2612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 10 89 8 88 140 62 41.0 8e-09 MDTDRKLTDKTGKKPRTLSYKYSFHLNEEQNIHFNELLCKAGSEHNRSRFIVKRIFGEEF VVVRRDPSKVQFIARLNDFYFQFQKLGNNCVPVKAI >gi|313158187|gb|AENZ01000050.1| GENE 2 1201 - 3000 1024 599 aa, chain + ## HITS:1 COG:Q0050 KEGG:ns NR:ns ## COG: Q0050 COG3344 # Protein_GI_number: 6226520 # Func_class: L Replication, recombination and repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 28 599 255 827 834 287 34.0 3e-77 MRSPARVLNSLTKHSQTTEYRFERLYRILFNEEMYYLAYQRIYAKPGNMTQGADGETIDR MSLPRIEKLILSLKDESYSPQPSRRVHIPKKNGKTRPLGVPSFDDKLVQEVVRMVLEAIY EGHFEDTSHGFRPHRSCHTALNAVQKTFTGKKWFIEGDIKGFFDNVNHDILIDILKERIS DERSIRLIRKFLRAGYLEQWRFHGTYSGMPQGGIISPVLANIYLDKLDKYMKEYAAGFDR GNRVRAYREYEVLTYQKRLVMRELKTATNNVERKVLVKRLKEIDKTRSAMPCFDPMDGNF KRLKYVRYADDFLIGIIVSKVKIKDDIKRFLADRLALELSDEKTLVTHTEKPAKFLGYEV TVRRSNLQKRNKRGSLSRVYGNKVRLKVTTEVIKKKLLELGVLKFSYHNGHEQWIPKHRS ELINNDDLEILDSYNAEIRGFYNYYSIANNASELNTFHYIMQYSMYKTFAGKYRTTVRRI CRKYKRNGVFTVGYTVKNGKAKERRLYNEGFKRKRPSYDRSVDRCPNPMPGVSTTSLIDR 
LKAQKCELCGATDNLVMHHVRKLGELKGKENWEKLMIARRRKTMAVCGSCHQKIHHGTF >gi|313158187|gb|AENZ01000050.1| GENE 3 2987 - 3268 294 93 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291513816|emb|CBK63026.1| ## NR: gi|291513816|emb|CBK63026.1| Bacterial mobilisation protein (MobC) [Alistipes shahii WAL 8301] # 43 93 89 139 139 81 84.0 2e-14 MERFRLTILAESRIRWKVYVRFGGEYWETYHRNMTRRRVLSLHNQIVKAVNSHFSNVTIP HQIAALEQRMRELKAFSIKILNLAKQAKEWLRI >gi|313158187|gb|AENZ01000050.1| GENE 4 3316 - 3669 237 117 aa, chain + ## HITS:1 COG:no KEGG:HMPREF9137_2153 NR:ns ## KEGG: HMPREF9137_2153 # Name: not_defined # Def: relaxase/mobilization nuclease domain-containing protein # Organism: P.denticola # Pathway: not_defined # 1 115 22 134 426 114 48.0 2e-24 MDKDEAEVLLWQKMLEPFDKHGRMDIDACMDSFRPYLEANRRTTNTVFHVLLNPSPEDKL TGEQLRETAKEYMERMGYGDQPYIVFKHNDISREHLHLVSLRVDENGHKLSHDFETE >gi|313158187|gb|AENZ01000050.1| GENE 5 3858 - 4067 208 69 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2839 NR:ns ## KEGG: Bacsa_2839 # Name: not_defined # Def: PfkB domain-containing protein # Organism: B.salanitronis # Pathway: Fructose and mannose metabolism [PATH:bsa00051]; Starch and sucrose metabolism [PATH:bsa00500]; Amino sugar and nucleotide sugar metabolism [PATH:bsa00520]; Metabolic pathways [PATH:bsa01100] # 1 64 188 251 298 80 62.0 3e-14 MFGIQAQEPQAQCRDLLEKYGLRTVILTCGAVGSHVFTPDGMSYVATPHVEVADGVGAGD SFTAQIRKE >gi|313158187|gb|AENZ01000050.1| GENE 6 4075 - 4455 633 126 aa, chain - ## HITS:1 COG:no KEGG:Bache_2020 NR:ns ## KEGG: Bache_2020 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 8 120 13 125 126 116 48.0 3e-25 MSAEEFSRYDSMKLFLHQIYEFQKGVRSLVLCTMCRTCASLLSERLERLGIAYRIQSVTD SKVNLYFGNRMCLETVATFIHKPLNELSAEEDFMLGAMLGYDIAGQCERYCKRKSPGVSS EQAAVN >gi|313158187|gb|AENZ01000050.1| GENE 7 4458 - 4949 802 163 aa, chain - ## HITS:1 COG:ECs0715 KEGG:ns NR:ns ## COG: ECs0715 COG0716 # Protein_GI_number: 15829969 # Func_class: C Energy production and conversion # Function: Flavodoxins # 
Organism: Escherichia coli O157:H7 # 1 163 3 167 176 133 46.0 2e-31 MTGIFFGSTMGTTEAVAADIAKQLGVADADVHNVADTPAGEVQKYDLLVLGSSTWGAGEL QDDWYGFLDQLKAQDLSGKKVALFGCGDSGSYPDTFCDAVGLIYDGLQQSGCTFVGAYAP EGYDETGSLVCRGGKFVGLAVDESAPDKTDRRVAAWCEQIKNE >gi|313158187|gb|AENZ01000050.1| GENE 8 5167 - 7956 3145 929 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2398 NR:ns ## KEGG: Odosp_2398 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: O.splanchnicus # Pathway: not_defined # 1 929 1 927 927 643 40.0 0 MRLLLKYLLFILLPAYTYGQSGRPDAGITFEVTDSLHNPLAYATVALTPMSGGKAYATTT DEEGRARFTLPANTYNADISYIGYVSQRMEVRPVSGSDMLRTVRLRTSDTQIREVVITAT QVRGPVSGVHIGRDAMNHIQPSSFGDLLELLPGGRASDPSFSSSNHIHLREIGTSNSDYQ TTSLGVSFVMDGIPMSNDAGMTYNSGTTVGNNISLNRGVDMRTMPTDEIASVEIQQGIPS VEYGDLTSGLIKIKRKEGGRNLEARFKADLGSQLLYVGKGFEWGAPADLLTMNVGATWLN SHDDPRNTRQNYQRATGSWRMKKRWESTSAYRYTLGGSLDYTGSFDRQKSDRDIDEGPAG IPLERYKSSYNNVVAAVNFTAESKGANFFRSFDFSASVSSEFDLIDRWRYRANSGNVPIR TAVEEGVFDMEVLPVRYEATLQVDNKPFYANAKAVALLGADTPLSRNTIRIGAEWNMSKN YGGGLLYDVTRPFTDLMSSHPRRYDALPALHRLSAFLEDNTTITAGEWRIEIMAGLRTTA MANLGSRYTLQGKFHFDPRANLSVTLPAFDMAGDPMRITFAGGAGWHTKTPTLDQLFPEP DYSYYTRLNYFPADDESKRRVNVEVFKHDPTNYDLKAARNFKWEVRGNAEWNGYGLSVTY FRENMTSGFRTSTDVLTRTYREYDTGPLKDMEFTGPPQLEWLPYKDKTVFQTVGVKTNGS RTFKQGIEFSLSTRRIRALATKLIVSGAYFKTRYENSEPQYVLTTVQLAEGTPYPYIGLY DQDDNTFHEVCNTNFLLDTQIPKLGLIFSTSFQCQWFSGMKAEWCNPAPTSYLDLQLQEH PFTAESAADGILQHMIHDNVSKEAYLYRLTPFSMNVNLKISKRLYRDRLNIAIFVNRLFT YSPSYRNETNALVRRYSSPYFGMELNFKL >gi|313158187|gb|AENZ01000050.1| GENE 9 7961 - 8350 336 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291515694|emb|CBK64904.1| ## NR: gi|291515694|emb|CBK64904.1| hypothetical protein AL1_27620 [Alistipes shahii WAL 8301] putative lipoprotein [Alistipes sp. 
HGB5] # 1 129 1 129 129 221 100.0 2e-56 MMMKSTIRTLFLLSALLFSACEKDDSIFTELSIRFVMPDSRPVEKLEIRTDISYFDNINS GERVPFPAAEQNSATIRLRKGVYTFIVEADATYSTGEKKILRCTDYNQIPAAMTWVEDSE SIVLLLKSI >gi|313158187|gb|AENZ01000050.1| GENE 10 8353 - 9855 1636 500 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2400 NR:ns ## KEGG: Odosp_2400 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 48 404 49 410 508 179 32.0 3e-43 MMKALFLSILALLPAAASAQRDTLGVLERSSVFSAEDERAVAGFHTESPAMMLYRSEQSL SQISLRIDLRREQEALLQPLGDGALDGGFYAESYRRLSERSAAWADARYVRGNRRNVCWN STADYLLLYPCITADSAGGNLSTEEYAFGGGYVHRVGRFDLAIRGDYRAGQEYRQVDPRP HNVVSDFTIKLGVGMQFPQYVLGLDLQGRLYKQDQDIDFFYPPGASSSELYMTGLGSYYT RYSLNSDSFNIGYDGKGCLLAAQLMPRHAKGWYARAAFESLTTDRLNKSNNTVPITRLKT RQATLSAAYRSGRWSLRAGGGYELRRSIEMIADRTGHSVIVDEQAMYKNRIWHADAEATV EWRCGTVNYMLSPRAEWRQSTATYAYPARRMRLAQFCGAVRGSAEWLRPQWRVKATAGIG CYVSPDEEVSLSNVLNNISEYLTYTAARLSGRAVSPEVSLRAERRLSRNLACFAEAGWTP RFYSGGLSEHVLTATFGILF >gi|313158187|gb|AENZ01000050.1| GENE 11 9899 - 11110 1717 403 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2399 NR:ns ## KEGG: Odosp_2399 # Name: not_defined # Def: putative lipoprotein # Organism: O.splanchnicus # Pathway: not_defined # 86 401 102 415 417 209 42.0 1e-52 MKKLFLILTTALFAAGFASCTDDPEETVNPVVVSEILRALGGNVLVNVPEENFDVNIPAD AASWVQINKAESSGKALVLSVDENETGTERSTVVNVTRSGKSTVLATVTIKQSDITLQAG EFVIEEIYFTGTALPETGKPDRWLGDQYIKIRNNSDEDLYADGMMLILSSGLNSGTNSEM LEGKDFRKECCAGNAFYCIPGNGQDVLVKAGESLIVVNNAQNHTIGNPNSWDATKADFEW YDVSSNENYLDIDNPDVPNLDKWYASTLTVQVLHNRGFNAVAIAMPPVGLTAEQFLAEYP LKDAQYIFHSPNGSDNTLPLRNCYRVPNEWILDAVNTGCRDAYYIAPWDASLDAGYAWCG TADGDAGRFGKSVIRKSGSSGKLIDSNNSTNDFESNTKASLIK >gi|313158187|gb|AENZ01000050.1| GENE 12 11133 - 12020 1149 295 aa, chain + ## HITS:1 COG:lin0750 KEGG:ns NR:ns ## COG: lin0750 COG1131 # Protein_GI_number: 16799824 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase component # Organism: Listeria innocua # 1 295 1 298 301 129 31.0 9e-30 
MTDSVIECRNLTHCYGERLIYKDLSFEVPRGRILGLLGKNGTGKTTTINILSGYLQPRAG QCLIFGEDIRTMRPETRARIALLIEGHVQYAFMTIEQIERFYARFYPRWNRDAYYGLMEM LKVAPRQRISRMSCGQRSQVALGLILAQNADLLVLDDFSMGLDPGYRRLFVEYLRDYAQS EEKTVFLTSHIIQDMEKLVDDCIIMDYGSILVQMPVGELLGTFRRYTFTPGADVTIPAGD GLYHPSVVNGRAELYSFDSPATVQGVLERAGIVCSDLRGETLNLEDAFIGLTGKY >gi|313158187|gb|AENZ01000050.1| GENE 13 12030 - 12701 1112 223 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2402 NR:ns ## KEGG: Odosp_2402 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 223 1 223 223 221 56.0 2e-56 MIKAIFYKEWIKMRCFYPLSALFLFGATAYALLRVQRVITFKGAAHVWEVMLEKEVVFID ILQYLPALLGVLLAVVQFVPEMAQKRLKLTLHLPFPQWKMILLMSGIGLGALALLVIVQT AVLWGYFHALLAPELVARILLTALPWYLAGLTLYLLTAWICLEPTWKRRLANLLIAVGVC RIFFLSDTPQAYDGMLPWLALLLVCSLFFPLLSVYRFKQGCQD >gi|313158187|gb|AENZ01000050.1| GENE 14 12721 - 13944 1605 407 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2403 NR:ns ## KEGG: Odosp_2403 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 370 1 374 380 320 41.0 5e-86 MIKSIKTGIILLAALLLAWLLPWCYAFVFASPSWSPFTLYSCVTHGFASVDFDRENNVAG RDLQGNTYTQQQFDSILPTFYYRQLAAEGRFPSEIEGVAVESRDVERTNFMFRTSPGEIN RRRPTVYQLLESMPDRIDLEPATDVFRITGEGIEFVDMETNTIDQKKSAAFTKVLRDKGF SFPARVVSGNPSTRKRYDNGYLLVDDAQRVYHMKQVRGRPFVRRTDVADSLQIGQIFVTE FADRKSLGFLVDSKKRFYTLGAEDYKLHEIPVGKFGPTRENMMIIGDMFYWTVTIQGAES KRYVAVNARDYSLADEYRPEEKPQAWAEYAKYLFPFELSFTSPLDGYVKPRIAEVSFQAL WLGLALGAFYALIRRRSPGGRLWQTVGVVLFGLFLFVPLLVFGTAKR >gi|313158187|gb|AENZ01000050.1| GENE 15 14017 - 14595 669 192 aa, chain - ## HITS:1 COG:no KEGG:Odosp_2395 NR:ns ## KEGG: Odosp_2395 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 192 1 192 192 249 58.0 4e-65 MGTRSFMSAVRKWSRPVHRDLSFFFSGVLLIYAVSGFMLNHKRDFNSDYSVRRTELVFAQ EIPRAEQEWTRARAEELLLRVGEEGNYLKHYFPEPDRLKVFIRGGSSLTVDLASGHAVYE SIRKRPVLSSLNRLHYNPSRWWTVFSDVFLAGLIVIVLSGLVMMRGPKGLRGRGGVELIA GILIPLLFIFLT >gi|313158187|gb|AENZ01000050.1| GENE 16 14611 - 15204 719 197 
aa, chain - ## HITS:1 COG:no KEGG:Odosp_2394 NR:ns ## KEGG: Odosp_2394 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 41 197 86 242 242 218 63.0 1e-55 MRHFVYCAALAAAVFFLAGCGGNPNKKNASETQTSETQSSVMEVDNLLADAEKLTGGKVT VEGVCTHICRHGGRKIFLMGTDDTQVIRIEAGEKIGSFKPECVNNVVRVTGTLVEDRIDE AYLAEWELRLKDQIARQHGEGEAGCSAEHQARGESVASSTEKRIADFRARIADRKAKEGK EYLSFYHVDGGSYEVLK >gi|313158187|gb|AENZ01000050.1| GENE 17 15389 - 16048 637 219 aa, chain - ## HITS:1 COG:alr7635 KEGG:ns NR:ns ## COG: alr7635 COG2217 # Protein_GI_number: 17158771 # Func_class: P Inorganic ion transport and metabolism # Function: Cation transport ATPase # Organism: Nostoc sp. PCC 7120 # 7 128 623 744 753 148 56.0 6e-36 MQFSYDKRIRCHGRCVAMVGDGINDSAALAQADLGIAMGQGSDIAIDVEKVTILSSDLTK IAAAVRLSAATVRTVRQNLFWAFIYNLIGVPVAAGVLYPLTRFLLNSMLVGAAMALSSVS VVANSLQLSTKRLGTKAVHKINTEFITKKQTAMKKEYRIEGMMCNHCRMHVEKALNGIEG LRAVVTLDPPAATVEFTGREPELAELQRVLSEDYKITER >gi|313158187|gb|AENZ01000050.1| GENE 18 16575 - 17345 1144 256 aa, chain - ## HITS:1 COG:no KEGG:ZPR_1186 NR:ns ## KEGG: ZPR_1186 # Name: not_defined # Def: S1/P1 endonuclease family protein # Organism: Z.profunda # Pathway: not_defined # 1 255 1 259 259 194 39.0 4e-48 MKKLLILFSCVLFAHGAFAWGQKGHDVTAYIAECHLTPEAAEKIDKALNGHSPVYYANWL DIASHTPEYAYTKTWHYRNVDEGKTIDTMPENPDGDVLKAVTTLVAELKAGGLPPEEETL KLKMLIHLVGDMHCPMHAGRLSDIGGNLRPVLMFGKKTNLHSAWDTAIPEAARKWSYTEW QEQIDRLTDDEAMLIQAGEPYDWLKETHAICVGIYADSPEGTKISYDYVYKYTPVIELQF LRGGYRLARLLNEIYR >gi|313158187|gb|AENZ01000050.1| GENE 19 17382 - 19127 2351 581 aa, chain - ## HITS:1 COG:AGc5136 KEGG:ns NR:ns ## COG: AGc5136 COG1158 # Protein_GI_number: 15890078 # Func_class: K Transcription # Function: Transcription termination factor # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 213 580 51 420 421 473 65.0 1e-133 MQDLKALEGKSLAELREIAKALGIEDVMVKKRELIEKITGDSVQETPAENKGKRGRKKET PAAETPAVTAPETAAPETAVAEAAAKTAETSAAEPEPMAAAAPAAPAKPRGRRPRMTKAE 
NTAPVQPQTALTPEMPAELEFPTNGAQNEPAAKPAQETAAQPEPKRRGRKPKQEQAQEQE SAQAAQGQTQAEETPAPRPLEEEVITKDDFAGEIEGEGVLEIMPDGYGFLRSADYNYLNS PDDIYVSPSQIKLFGLKPGDTVNGAIRPPKEGEKYFPLVRVNEINGLAPEYIRDRVQFEF MTPLFPSEKFCLTGNGHNNLSTRVVDLFSPIGKGQRALIVAQPKTGKTVLLQAIANAIAD NHPEVYMIVLLIDERPEEVTEMARNVKAEVVASTFDEQASRHVKVAEMVLDKAKRMVECG HDVVIFLDSITRLARAYNSVQPASGKVLSGGVDANALHKPKRFFGAARNTEEKGSLTIIA TALIDTGSKMDEVIFEEFKGTGNMELQLDRKLANKRVYPAVDVIASGTRREDLLLPRDVM NRTWVLRKYLSDMNPVEAMEFLQKQMSLTDTNEEFLATMNH >gi|313158187|gb|AENZ01000050.1| GENE 20 19498 - 21558 2487 686 aa, chain - ## HITS:1 COG:no KEGG:BF1750 NR:ns ## KEGG: BF1750 # Name: not_defined # Def: 1-phosphatidylinositol phosphodiesterase # Organism: B.fragilis # Pathway: Inositol phosphate metabolism [PATH:bfr00562] # 250 645 2 326 345 90 25.0 2e-16 MTHKPIFRLLILAGLLSGCGKDTTEQLRPADRAVGGDAVSCRLAIGNDQTSGPENETRTA YGPLTDGYFPIYWRTGDRVGIISPQTTPQWAEVEVTVSGATESEADLSNTGMAWGSDSHY DFYAFYPADAVQTNSGSIVVTAIPAVQTCNNGECNMQYAYMAACTKGVERGAEVPFSFRP LMTTVTVDIRFSEAAEVQKLVLSSENDPVAGAFTFDIETHRAAVIEDRCSKVLALHMLTG EEPYIRIGAGSKIMVTAFMLPQDIRGLTLTAVTTEGKTYSYTTQATLKAGNRYSFTISDM QKQAQQTTEDYSDWMKYLPDNVYLSQVSMPGSHDACTMYGSHYEYKSGMPGERYHFKWLQ NVVFGYMNVTKIIKAQELSIEEQLAAGVRMFDLRPSASGSSLQDLPIHHGIAALGDPTRG GYTPGGSGRQELPPFMLSHLLDRFVNFLNEHPDETVLIHMKYENTSGTGKTGWDKSVVDL IKSHCGGHIADFHPRMTLADARGKILFMIREDYKSANGGRYLGAYLNWTHDKVVFETTLH GNTGETAPIKVNDLYNIKNGSSDGVSKYAAIDECIAYTYGATDVTRWCMNYVSCYDTAHC SVSGISLFGAVGDYDYCANLYNRYTADKLNRKDFRGNAGIVLMDFAGAGNATMTYGQTYS NMRVYGDDLVKAVIGVNNKWDLRRSE >gi|313158187|gb|AENZ01000050.1| GENE 21 21585 - 21719 141 44 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKAAYTKPELEILTVRTEAGFAISQPHGGELGPSSEDYADDCEY >gi|313158187|gb|AENZ01000050.1| GENE 22 21793 - 23148 1717 451 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158200|gb|EFR57604.1| ## NR: gi|313158200|gb|EFR57604.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 451 1 451 451 882 100.0 0 MKKLFVIAMLAAAGMSACSKEETATTPAEAKYIQVTIPEDAVAGETRTAVDGTETTWVQG DKVAVFLYNGSTTQKFEFTADKTGATTTFTGDVPVGSYSLAAAHYPYSCGATFDAVNKVF KHTWGTSYFTTNLADYDFMYTNVWETPFEINEDSTVLPVNFTFRPLMARIRMSLGLESNE TPKKIIISTSGDVMPTAGTIDRLGNFTASSVTNSITIPTDRKEFTIGLLPVSIHSTMNVQ VMTADGLTYSDKSFTLAGLKRNHHYTMSVPCNSKTVTIGVPDAIDKKYGTGTWKDNGFYF VDPQYDPDGIFDGWYFYRGELKSDRNGDDKLKKNRIQLQVPAGFNDDNNGAFVTAPIQCD GSKTVKISFEVATNRALVNIKYRIGVTGNGPITEVWLRDRNNILNGNDRECKSGVQTHTF TVTNGQRIMVKILSTGGAYHVEFADFTYTVE >gi|313158187|gb|AENZ01000050.1| GENE 23 23192 - 23347 230 51 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158192|gb|EFR57596.1| ## NR: gi|313158192|gb|EFR57596.1| hypothetical protein HMPREF9720_2667 [Alistipes sp. HGB5] # 1 51 3 53 53 70 100.0 3e-11 MKKQSMPYEAPELVQISVRVEKGFAVSSDIEPGTGGGELGGTRSLSYDDYE >gi|313158187|gb|AENZ01000050.1| GENE 24 23419 - 23691 308 90 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158197|gb|EFR57601.1| ## NR: gi|313158197|gb|EFR57601.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 90 1 90 90 131 100.0 2e-29 MSRTAATVTNETPSGAAHHLLAYLEEGRVRVYAPRRQSLWIMQQLPQAEELRIETQLREL HRTGRRTAVVEVQLRRDEETFRVRVLCVRA >gi|313158187|gb|AENZ01000050.1| GENE 25 23712 - 24881 1862 389 aa, chain - ## HITS:1 COG:no KEGG:BT_0418 NR:ns ## KEGG: BT_0418 # Name: not_defined # Def: outer membrane porin F # Organism: B.thetaiotaomicron # Pathway: not_defined # 19 389 10 371 372 236 39.0 1e-60 MKRLYLIALFTLVCGAVPAQENGNRDAQNRIVRGPYETNRLFDNIFVGVAGGINLYFGEH DSQGKFGKRLAPALDIHVGKWFTSSIGARVGYTGLQAKGWTTAGTMYAKKQDGDWFQEKF GVSYLHADALWNFSNAVSGYKEERTWNFMPFAGVGWARSYGNDTHDNEIGFDVGLLNVVR LCKALDLTLEARCLLVNQRFDGVSGGRIGEGMLSVTAGLAYKFNRRGFTRVSKPQAVDFT PYLDRIRRLEENNNDLASKNTALSDENEKLRNMPPKVVAEKKVAASPVALFFKIGRATLD SKELTNLEFYVRNAMKADKEKTFTLIGSADKATGSKELNQRLSEQRMEYVYDLLKNKYGI APERLIKKAEGDTNNRFAEPELNRAVIVE >gi|313158187|gb|AENZ01000050.1| GENE 26 24949 - 25392 451 147 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158191|gb|EFR57595.1| ## NR: gi|313158191|gb|EFR57595.1| conserved domain protein [Alistipes sp. HGB5] # 28 147 1 120 120 239 100.0 5e-62 MTLSLKILWAVKILLALRKASEAGAPPMKSRELMAACLNPGADTYMYRRVLRELAAKTDY ATCIIQSGRTYYEWNRNLRPTLYDLTLRLEGGMHEVTNGFWSCENRHVMQPLYKANKQYE SLTREYLSNIHIDELDSPALSARNRAK Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:41:18 2011 Seq name: gi|313158042|gb|AENZ01000051.1| Alistipes sp. HGB5 contig00002, whole genome shotgun sequence Length of sequence - 194471 bp Number of predicted genes - 142, with homology - 135 Number of transcription units - 59, operones - 31 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 3 - 1094 1333 ## BT_0150 putative ferric aerobactin receptor + Term 1161 - 1217 12.6 - Term 1319 - 1361 10.1 2 2 Op 1 . - CDS 1390 - 2238 1212 ## COG3568 Metal-dependent hydrolase 3 2 Op 2 . - CDS 2294 - 3202 1289 ## COG3568 Metal-dependent hydrolase 4 2 Op 3 . - CDS 3233 - 5548 3508 ## Phep_1386 hypothetical protein 5 2 Op 4 . - CDS 5568 - 7970 3509 ## Phep_1386 hypothetical protein 6 2 Op 5 . 
- CDS 8017 - 9513 2458 ## Phep_1384 RagB/SusD domain-containing protein 7 2 Op 6 . - CDS 9532 - 12930 5142 ## Phep_1383 TonB-dependent receptor plug - Prom 12967 - 13026 1.8 8 3 Op 1 6/0.000 - CDS 13087 - 14082 1350 ## COG3712 Fe2+-dicitrate sensor, membrane component - Prom 14104 - 14163 1.6 - Term 14113 - 14151 1.3 9 3 Op 2 . - CDS 14216 - 14773 832 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 14805 - 14864 9.1 - Term 14903 - 14953 14.1 10 4 Tu 1 . - CDS 15003 - 16469 2087 ## COG0531 Amino acid transporters - Term 16849 - 16892 10.6 11 5 Tu 1 . - CDS 16927 - 18795 2256 ## gi|313158114|gb|EFR57519.1| hypothetical protein HMPREF9720_0015 - Prom 18828 - 18887 2.6 - Term 19182 - 19222 10.5 12 6 Tu 1 . - CDS 19249 - 20622 2216 ## COG0015 Adenylosuccinate lyase - Prom 20647 - 20706 2.9 + Prom 20585 - 20644 3.5 13 7 Op 1 . + CDS 20695 - 21111 629 ## gi|313158141|gb|EFR57546.1| hypothetical protein HMPREF9720_0017 14 7 Op 2 23/0.000 + CDS 21115 - 21858 1222 ## COG0767 ABC-type transport system involved in resistance to organic solvents, permease component 15 7 Op 3 . + CDS 21855 - 22610 304 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 + Prom 22632 - 22691 6.2 16 8 Tu 1 . + CDS 22728 - 23744 1420 ## COG0057 Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase + Term 23753 - 23815 13.1 - Term 23754 - 23792 4.2 17 9 Tu 1 . - CDS 23811 - 24221 518 ## gi|167753069|ref|ZP_02425196.1| hypothetical protein ALIPUT_01339 - Prom 24405 - 24464 1.5 18 10 Op 1 . + CDS 24396 - 24980 798 ## Phep_3814 hypothetical protein 19 10 Op 2 . + CDS 25038 - 25223 235 ## PROTEIN SUPPORTED gi|110640211|ref|YP_678235.1| 50S ribosomal protein L32 + Term 25236 - 25277 0.3 20 11 Op 1 16/0.000 + CDS 25375 - 26322 1414 ## COG0416 Fatty acid/phospholipid biosynthesis enzyme 21 11 Op 2 . 
+ CDS 26348 - 27376 1447 ## COG0332 3-oxoacyl-[acyl-carrier-protein] synthase III + Term 27419 - 27464 13.0 - Term 27365 - 27405 0.1 22 12 Tu 1 . - CDS 27414 - 27557 58 ## + Prom 27472 - 27531 1.6 23 13 Op 1 40/0.000 + CDS 27586 - 28263 905 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain 24 13 Op 2 . + CDS 28260 - 29540 1752 ## COG0642 Signal transduction histidine kinase 25 14 Op 1 . + CDS 29651 - 30094 764 ## BT_0923 putative periplasmic protein 26 14 Op 2 . + CDS 30141 - 30962 1033 ## BF2756 putative lipoprotein + Term 30978 - 31022 9.0 + Prom 30993 - 31052 7.0 27 15 Tu 1 . + CDS 31133 - 31657 948 ## COG1528 Ferritin-like protein + Term 31683 - 31727 7.1 28 16 Op 1 . - CDS 31709 - 32434 971 ## COG4422 Bacteriophage protein gp37 29 16 Op 2 . - CDS 32441 - 32893 526 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 30 16 Op 3 . - CDS 32905 - 33363 787 ## COG1238 Predicted membrane protein + Prom 33410 - 33469 1.9 31 17 Tu 1 . + CDS 33489 - 35624 2454 ## COG3537 Putative alpha-1,2-mannosidase + Term 35700 - 35739 10.1 32 18 Op 1 . - CDS 35866 - 36126 355 ## Odosp_1596 hypothetical protein 33 18 Op 2 . - CDS 36211 - 36621 651 ## COG0432 Uncharacterized conserved protein - Prom 36760 - 36819 5.4 + Prom 36585 - 36644 2.8 34 19 Tu 1 . + CDS 36786 - 37226 613 ## BT_0816 hypothetical protein + Term 37360 - 37408 12.0 - Term 37346 - 37394 7.4 35 20 Op 1 . - CDS 37407 - 38837 1547 ## COG0486 Predicted GTPase 36 20 Op 2 . - CDS 38839 - 40221 1817 ## COG0741 Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) 37 20 Op 3 . - CDS 40225 - 41385 1789 ## BF0766 hypothetical protein 38 20 Op 4 25/0.000 - CDS 41389 - 42264 1435 ## COG1475 Predicted transcriptional regulators 39 20 Op 5 . - CDS 42277 - 43047 1169 ## COG1192 ATPases involved in chromosome partitioning - Prom 43076 - 43135 4.4 + Prom 43015 - 43074 5.9 40 21 Op 1 . 
+ CDS 43161 - 44075 1168 ## COG1266 Predicted metal-dependent membrane protease 41 21 Op 2 . + CDS 44135 - 45013 1178 ## Ftrac_1858 hypothetical protein 42 21 Op 3 . + CDS 45018 - 46424 2267 ## COG0760 Parvulin-like peptidyl-prolyl isomerase 43 21 Op 4 . + CDS 46429 - 48963 3203 ## Pedsa_3437 hypothetical protein + Term 49116 - 49167 5.2 + Prom 49108 - 49167 1.8 44 22 Op 1 . + CDS 49192 - 50349 1495 ## COG2814 Arabinose efflux permease 45 22 Op 2 . + CDS 50336 - 50836 553 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes + Term 50893 - 50927 2.5 + Prom 51015 - 51074 4.6 46 23 Tu 1 . + CDS 51106 - 52524 1769 ## COG0668 Small-conductance mechanosensitive channel - Term 52459 - 52498 1.0 47 24 Tu 1 . - CDS 52574 - 54235 2614 ## COG2985 Predicted permease - Term 54544 - 54582 10.9 48 25 Op 1 13/0.000 - CDS 54621 - 55994 2031 ## COG1538 Outer membrane protein 49 25 Op 2 13/0.000 - CDS 56013 - 57122 1862 ## COG0845 Membrane-fusion protein 50 25 Op 3 10/0.000 - CDS 57148 - 58407 1772 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 51 25 Op 4 36/0.000 - CDS 58414 - 59670 2111 ## COG0577 ABC-type antimicrobial peptide transport system, permease component 52 25 Op 5 . - CDS 59667 - 60356 325 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 60435 - 60494 5.4 53 26 Tu 1 . - CDS 60496 - 60753 178 ## - Prom 60784 - 60843 2.2 + Prom 60525 - 60584 6.6 54 27 Op 1 . + CDS 60797 - 63625 4041 ## BT_3633 hypothetical protein 55 27 Op 2 . + CDS 63631 - 64860 1532 ## BT_3632 hypothetical protein 56 27 Op 3 . + CDS 64864 - 66363 1828 ## BT_3631 hypothetical protein 57 27 Op 4 . + CDS 66381 - 67979 2005 ## BT_3630 hypothetical protein 58 27 Op 5 . + CDS 68023 - 69975 2834 ## COG0348 Polyferredoxin + Term 69982 - 70024 9.1 59 28 Tu 1 . + CDS 70032 - 72203 3023 ## Fluta_3976 peptidase S46 + Term 72216 - 72257 2.1 - Term 72249 - 72285 3.2 60 29 Tu 1 . 
- CDS 72434 - 73081 912 ## COG0546 Predicted phosphatases - Prom 73120 - 73179 1.9 - Term 73177 - 73215 7.3 61 30 Op 1 . - CDS 73238 - 74530 1678 ## COG0427 Acetyl-CoA hydrolase 62 30 Op 2 . - CDS 74535 - 75848 1796 ## COG0436 Aspartate/tyrosine/aromatic aminotransferase 63 30 Op 3 11/0.000 - CDS 75848 - 76432 888 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 64 30 Op 4 . - CDS 76514 - 78115 1994 ## COG4231 Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits - Prom 78148 - 78207 5.9 65 31 Tu 1 . + CDS 78107 - 78301 100 ## + Term 78313 - 78342 0.5 - Term 78248 - 78297 18.0 66 32 Op 1 . - CDS 78321 - 79328 1008 ## Lbys_0372 inosine/uridine-preferring nucleoside hydrolase 67 32 Op 2 . - CDS 79341 - 80765 1632 ## Phep_2210 sialate O-acetylesterase (EC:3.1.1.53) 68 32 Op 3 . - CDS 80762 - 81760 1114 ## Sde_2315 hypothetical protein 69 33 Op 1 . - CDS 81952 - 83103 1461 ## Sde_2315 hypothetical protein 70 33 Op 2 . - CDS 83116 - 85089 1708 ## COG1262 Uncharacterized conserved protein 71 33 Op 3 . - CDS 85102 - 85515 540 ## Phep_2281 hypothetical protein 72 33 Op 4 . - CDS 85536 - 87098 2155 ## Phep_2282 RagB/SusD domain-containing protein 73 33 Op 5 . - CDS 87110 - 90247 4740 ## BT_3332 hypothetical protein 74 33 Op 6 . - CDS 90330 - 92324 2009 ## Bache_1739 hypothetical protein 75 33 Op 7 . - CDS 92352 - 94127 2345 ## COG2407 L-fucose isomerase and related proteins 76 33 Op 8 2/0.000 - CDS 94162 - 95646 2415 ## COG0591 Na+/proline symporter 77 33 Op 9 . - CDS 95666 - 96583 1347 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 78 33 Op 10 . - CDS 96662 - 97717 1470 ## COG1609 Transcriptional regulators - Prom 97745 - 97804 4.8 + Prom 97709 - 97768 3.4 79 34 Op 1 . + CDS 97812 - 99452 1782 ## COG1069 Ribulose kinase 80 34 Op 2 . + CDS 99453 - 102299 3721 ## Fluta_0313 hypothetical protein + Term 102324 - 102377 1.4 + Prom 102319 - 102378 2.8 81 34 Op 3 . 
+ CDS 102402 - 104003 2115 ## COG3119 Arylsulfatase A and related enzymes + Term 104024 - 104072 14.5 + Prom 104179 - 104238 3.4 82 35 Tu 1 . + CDS 104275 - 105006 1215 ## COG2188 Transcriptional regulators + Term 105215 - 105248 -1.0 - Term 105287 - 105331 14.1 83 36 Tu 1 . - CDS 105449 - 108538 4258 ## COG3537 Putative alpha-1,2-mannosidase - Prom 108559 - 108618 4.5 + Prom 108999 - 109058 3.8 84 37 Op 1 . + CDS 109256 - 112750 4374 ## COG0383 Alpha-mannosidase 85 37 Op 2 . + CDS 112763 - 115105 3397 ## COG3537 Putative alpha-1,2-mannosidase + Term 115167 - 115209 5.3 86 38 Op 1 . + CDS 115229 - 118471 2583 ## Cpin_4504 TonB-dependent receptor plug 87 38 Op 2 . + CDS 118377 - 120365 2808 ## Cpin_4503 RagB/SusD domain protein 88 38 Op 3 . + CDS 120392 - 121153 1090 ## Cpin_4502 hypothetical protein 89 38 Op 4 3/0.000 + CDS 121234 - 122136 893 ## COG1940 Transcriptional regulator/sugar kinase 90 38 Op 5 . + CDS 122145 - 123911 2411 ## COG1482 Phosphomannose isomerase + Term 123933 - 123977 14.3 - Term 123921 - 123964 14.1 91 39 Op 1 . - CDS 123979 - 124332 447 ## BVU_1310 putative transcriptional regulator 92 39 Op 2 . - CDS 124316 - 124519 226 ## gi|313158171|gb|EFR57576.1| 4Fe-4S binding domain protein 93 39 Op 3 . - CDS 124527 - 124991 675 ## COG0526 Thiol-disulfide isomerase and thioredoxins - Prom 125041 - 125100 3.8 + Prom 125507 - 125566 6.2 94 40 Op 1 . + CDS 125612 - 126859 353 ## COG0582 Integrase 95 40 Op 2 . + CDS 126837 - 127085 60 ## + Term 127121 - 127156 3.4 + Prom 128669 - 128728 3.2 96 41 Tu 1 . + CDS 128799 - 128900 68 ## 97 42 Tu 1 . - CDS 129173 - 129409 91 ## gi|313158127|gb|EFR57532.1| conserved hypothetical protein - Prom 129586 - 129645 7.2 + Prom 129549 - 129608 11.0 98 43 Tu 1 . + CDS 129636 - 129830 123 ## gi|313158164|gb|EFR57569.1| conserved hypothetical protein 99 44 Tu 1 . - CDS 130264 - 130446 86 ## - Prom 130634 - 130693 5.0 + Prom 130734 - 130793 4.1 100 45 Tu 1 . 
+ CDS 130988 - 131353 132 ## gi|313158150|gb|EFR57555.1| hypothetical protein HMPREF9720_0101 + Term 131589 - 131622 4.0 - Term 134217 - 134253 0.1 101 46 Op 1 . - CDS 134341 - 134817 179 ## Gobs_3736 hypothetical protein 102 46 Op 2 . - CDS 134814 - 135716 441 ## Tery_3084 phage integrase - Prom 135751 - 135810 6.8 - TRNA 135889 - 135965 57.9 # Arg ACG 0 0 - Term 135831 - 135878 9.8 103 47 Op 1 . - CDS 136099 - 137145 1100 ## COG0526 Thiol-disulfide isomerase and thioredoxins 104 47 Op 2 . - CDS 137188 - 137778 1008 ## COG0817 Holliday junction resolvasome, endonuclease subunit - Term 137807 - 137853 9.5 105 48 Op 1 . - CDS 137871 - 138926 1767 ## COG0526 Thiol-disulfide isomerase and thioredoxins 106 48 Op 2 . - CDS 138944 - 139978 1625 ## COG1044 UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase 107 48 Op 3 . - CDS 139990 - 140460 249 ## PROTEIN SUPPORTED gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 108 48 Op 4 . - CDS 140519 - 141112 678 ## Bache_2807 hypothetical protein 109 48 Op 5 . - CDS 141164 - 141892 796 ## gi|313158178|gb|EFR57583.1| hypothetical protein HMPREF9720_0113 110 48 Op 6 . - CDS 141892 - 142962 1628 ## COG1466 DNA polymerase III, delta subunit 111 48 Op 7 . - CDS 143051 - 144394 2178 ## COG0037 Predicted ATPase of the PP-loop superfamily implicated in cell cycle control 112 48 Op 8 . - CDS 144399 - 145598 1362 ## COG0812 UDP-N-acetylmuramate dehydrogenase - Prom 145646 - 145705 6.0 + Prom 145559 - 145618 2.5 113 49 Tu 1 . + CDS 145670 - 147103 1955 ## Dfer_3787 hypothetical protein + Term 147117 - 147173 16.4 + Prom 147481 - 147540 4.5 114 50 Tu 1 . + CDS 147572 - 148777 360 ## PROTEIN SUPPORTED gi|116517028|ref|YP_816079.1| glucokinase + Prom 149054 - 149113 6.2 115 51 Op 1 . + CDS 149133 - 149747 1137 ## COG1390 Archaeal/vacuolar-type H+-ATPase subunit E 116 51 Op 2 . 
+ CDS 149753 - 150586 1454 ## BF2748 hypothetical protein 117 52 Op 1 16/0.000 + CDS 150725 - 152479 2718 ## COG1155 Archaeal/vacuolar-type H+-ATPase subunit A 118 52 Op 2 16/0.000 + CDS 152490 - 153812 2268 ## COG1156 Archaeal/vacuolar-type H+-ATPase subunit B 119 52 Op 3 4/0.000 + CDS 153819 - 154427 695 ## COG1394 Archaeal/vacuolar-type H+-ATPase subunit D 120 52 Op 4 16/0.000 + CDS 154436 - 156271 2613 ## COG1269 Archaeal/vacuolar-type H+-ATPase subunit I 121 52 Op 5 . + CDS 156293 - 156748 813 ## COG0636 F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K + Term 156771 - 156814 12.2 + Prom 156794 - 156853 4.8 122 53 Tu 1 . + CDS 156936 - 160142 5030 ## COG0383 Alpha-mannosidase + Term 160170 - 160211 14.1 + Prom 160369 - 160428 2.2 123 54 Op 1 . + CDS 160476 - 160559 61 ## + Term 160569 - 160608 -0.4 124 54 Op 2 . + CDS 160628 - 163759 5101 ## PRU_2712 hypothetical protein 125 54 Op 3 . + CDS 163775 - 165427 2746 ## PRU_2713 putative lipoprotein 126 54 Op 4 . + CDS 165455 - 166579 1373 ## PRU_2714 putative lipoprotein 127 54 Op 5 . + CDS 166595 - 168736 3119 ## PRU_2714 putative lipoprotein 128 54 Op 6 . + CDS 168801 - 170012 1890 ## COG4833 Predicted glycosyl hydrolase + Term 170039 - 170082 10.8 - Term 170029 - 170068 8.3 129 55 Tu 1 . - CDS 170312 - 174370 5717 ## COG0642 Signal transduction histidine kinase - Prom 174437 - 174496 2.4 + Prom 174371 - 174430 2.5 130 56 Op 1 . + CDS 174647 - 175366 734 ## COG2755 Lysophospholipase L1 and related esterases 131 56 Op 2 . + CDS 175377 - 176567 1808 ## COG4833 Predicted glycosyl hydrolase 132 56 Op 3 . + CDS 176596 - 178158 2397 ## COG3538 Uncharacterized conserved protein + Term 178170 - 178208 9.1 133 57 Tu 1 . + CDS 178231 - 180168 2658 ## BF3763 hypothetical protein + Term 180254 - 180299 6.8 134 58 Op 1 . - CDS 180612 - 181742 1516 ## COG1060 Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes 135 58 Op 2 . 
- CDS 181746 - 183977 1098 ## PROTEIN SUPPORTED gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 136 58 Op 3 . - CDS 183974 - 185086 1450 ## COG1473 Metal-dependent amidase/aminoacylase/carboxypeptidase 137 58 Op 4 3/0.000 - CDS 185092 - 186057 1455 ## COG0685 5,10-methylenetetrahydrofolate reductase 138 58 Op 5 . - CDS 186116 - 189727 5613 ## COG1410 Methionine synthase I, cobalamin-binding domain 139 58 Op 6 . - CDS 189728 - 190519 1178 ## COG3823 Glutamine cyclotransferase - Prom 190749 - 190808 3.6 + Prom 190706 - 190765 5.4 140 59 Op 1 29/0.000 + CDS 190849 - 192174 2365 ## COG0544 FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) + Term 192200 - 192242 9.4 + Prom 192197 - 192256 2.6 141 59 Op 2 24/0.000 + CDS 192407 - 193075 1103 ## COG0740 Protease subunit of ATP-dependent Clp proteases 142 59 Op 3 . + CDS 193075 - 194352 243 ## PROTEIN SUPPORTED gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 Predicted protein(s) >gi|313158042|gb|AENZ01000051.1| GENE 1 3 - 1094 1333 363 aa, chain + ## HITS:1 COG:no KEGG:BT_0150 NR:ns ## KEGG: BT_0150 # Name: not_defined # Def: putative ferric aerobactin receptor # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 363 430 792 792 487 64.0 1e-136 ACTFQRLFTDETLLSDYRTRLGIVGWGLFAGADYASEDKRFTASLGLRADGNDYSPRMAR LWHQLSPRASVGYRLGRGWSLSGSAGLYYQLPPYTALGFKEADAWVNKSLHYMRVGAFSA GVGWRLRDRLIVSVEGFYKLYGDIPLSVADGIPLSCKGNDYGVVGNEELRSSAEGRSYGV EAMARWQIPGKVNLVGSVTLFRSEYRRDSAAPYIVSAWDSRFVVNLSGTYDLPRSWSVGA RLSAVGGTPYTSYDVGKSSLVEAWNVQGRPYYDYARYNDGRLDAFAQLDLRIDKTFYFRH CMLGLYIDLQNVTGSKLRQPDVLMSTGVVENPEAPAAEQRYRMKYIRQESGTLIPTLGIT VEF >gi|313158042|gb|AENZ01000051.1| GENE 2 1390 - 2238 1212 282 aa, chain - ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 29 282 6 256 257 111 29.0 1e-24 MKKLLFTLLFCAAAAAAAGSEPAELKLLSYNIRYIGAPGDEGDFAWDARKEASIRMIRDV 
RPDVIGFQEPRRQQVAYLVEQLPEYGHIEMGRDFGAKDDPGEHLMIMYLRERYELLDHGH YWLSETPDEVSQGWDGRCRRVTVWARLRDRATGREFCLFDTHLDHIGKTARLEGARLNVE RMRSIAGKRTPQFIVGDMNATAGTPGGVCLEPYFKWLKSACEEAPLTDPRPTYHGFGKAK PLHIDHIFYRCAEPLRYQLLDSTGYGVQYLSDHYPIVCTFRF >gi|313158042|gb|AENZ01000051.1| GENE 3 2294 - 3202 1289 302 aa, chain - ## HITS:1 COG:CC0862 KEGG:ns NR:ns ## COG: CC0862 COG3568 # Protein_GI_number: 16125115 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 9 297 23 290 305 82 29.0 1e-15 MKRFLFSIALLFTALAGCSNHEGESTAPKSELNVMSYNIRYANASDKGDAAWDARKEASV AMIRDVKPDVIGMQEPRFSQAQYLIGELTEYEHYYLAPDDKDSQHRNAVWWRKDRFEMLA QGYFFLNEKDITQPIKGWGHNQFRTALWVKLRERSTGKEFFFFNTHLAHRASPVEGGDID QVARTESVKLIVEQMKQIAGRYAPIFVTGDMNASYAAGDGRRTCLDGFFEFMWSARETAP DGEADDVYSYNNFGEGTPRFTWNIDHIFYRKVTPVRFRTINNDGYGVPYISDHFPILFTS EF >gi|313158042|gb|AENZ01000051.1| GENE 4 3233 - 5548 3508 771 aa, chain - ## HITS:1 COG:no KEGG:Phep_1386 NR:ns ## KEGG: Phep_1386 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 47 771 50 692 692 211 27.0 1e-52 MKYAKRYMAKIGSAGILLAALAATACMDHGDDIPLIELGAVENSFTVNATAGHVDIQVYS NQPAVLSLLDDGASWADLSDTRLPLDGKFYIDYADNTGFPRMTRVLIQSADAVRSDTVVL RQRGARVPAMAFEGGTSTNVPGSSGGKASVRFGTNIAPDELTFTTQYDAGEEPAEEWLSE IALQPAGEEEYTFTFDYAGNDTDELRTATVTVSYVNGWEETEQLTLNVIQRTKNDMLGHD ISFAELREKALTVSKVDEYWLLEGYVVSDRDSKNASNNPMPTDMSVDYSGCEKTVYLESP DGRYGFCVETATPEDNAFTRYDKVKILLKDAELVFEPDPDRYMIKGIRSSMIVERVTGND ASVLPVKQKYISELTDEDIYTFVTLRDCEFAVRKGSLTPVHEGYTLADAQGRLNMYPRLI RDNKGSLIHLLTNTTCPYRRNGTRLPYGSGELAGVIVHERFPQYEYVDTADDLANGIIGS YQIRHMSFADIRFAEDRSQSFSETLTEYSYVKGKAAAADGYAYWYPTWGTNGRFTQTASK SAYPNGVYNAACWHYLGWCGTARGTAPFRSHIGNDGYSGFGVILEDGTNYAVKSTAVNTD GKGTDQANWLAWVTTYWWNASTDRPYAWVVEFSTAGISSDRVSLIMSVQSGRATLNAGPY FWKVQWSATGDYESDEGWFDVEAAQTAPDGSAYTFTQPDFPVFATQRSWQLPAYKQVEIP LPAEIMGREKVWLRIQPASRLSSGLGFLDGTIPMGYNAAGAMDYLAVRYNR >gi|313158042|gb|AENZ01000051.1| GENE 5 5568 - 7970 3509 800 aa, chain - ## HITS:1 COG:no KEGG:Phep_1386 
NR:ns ## KEGG: Phep_1386 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 264 799 165 690 692 227 31.0 2e-57 MLAVALTAALLGPGCQKDFKWNEPLAVTQNGLNLTSPAGSTRVTVYSTGRWKAEMAEGAW GEVSEVSGNGIGDFLFTYEANNGVSRRARILVSGEGEEQEIVLTQAGAVTEPTLALAETE FEFVRLPRERVQIGVTTNMTQALECILITATDVTDAENPAEAGWLKEIRLEKDAEENIVL VFGIDRNDSSSDRKAAIRLEIPDADGKILAQAEASVVQTTDNATVVFKDEETTVSVPGDQ HNRSALLTANFDVDPAHFSFDIAYDPAGTQWITDVTFSESAVSFIVAENTGDQPRSASLK ITYKDTDVECSSTLRLTQEVKQLSIADLRAMIPGAEGEVELTGEKMLSAVVISDAGNYNM ETNPNLTDTSIDFSVNEKTAYIESLDGQYGLRIVTKLPADNILKRYSSVQLSINGLKLVK ESNPERYTLTGFTKEHVLNQNAGTAADLPKKEKHISELTDADIYTYVTLKECEFMLNGGA YINVHDGYCYKTDLNTQGVLDPRFDCAVRGVIDSRGEKINMVLNTQVRWRRKGDGVPAGS GPISGIIVHTKLPRYGVKGDVGTYQIRPVEEADIAFSREESTRNYSTLVRWAWPGMTTNA GIKQHADGSIVPYLGEGRMFSSVSNKLNTSASVAGVSCTLDYNTLDYAKGIKSPAVRYNG IWWNSSRNEGEWVAFNFSTEGVSGSCMKMILSAALGNLSAATIVAPLYWDVSYSLDGSTF TRFDTVPIRTLVYWAGPQWYVPGLYEVDFDLPSACFGQKDVTIRLQAASKVCGSTTGEDN GTTTKTYVYFRFGDVSVKYF >gi|313158042|gb|AENZ01000051.1| GENE 6 8017 - 9513 2458 498 aa, chain - ## HITS:1 COG:no KEGG:Phep_1384 NR:ns ## KEGG: Phep_1384 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: P.heparinus # Pathway: not_defined # 1 493 4 502 505 424 45.0 1e-117 MKTKYILLILTAWLTASCSLQEDTEGFSTPGDFYKTKTQCQSAVNSCYIPLKNLYTYKMM IATEGVTDLMYIASGTQDAQLDISPSQPRFGADMWTYGYRGVMYCNMAVTGIENSPIDEK DRNSLVAEAKVLRAFYYYLLTSFFGDVPFYTDDVSDHEVLDRVAKLPRMSAEATRTALIE DLESCLGALPLIRTSEAAGNRAGAAMGHMLIAKLAMWNKDYDKALEAIAVLEQIYGDDLS VYPVSDIPFRMKNTPESIFEVQHTYTAGGLIYTSNVASICMPYPRNSDNIYSNVVIEELG DAATTWSPLRPNSFFYGNLMPEGGEDLRRDMQIITEWNGVKFTSGDAIVTRPFMGPKFWC PDLQAAYDSNNYKVFRYADAVLMAAECYCALLDKDKAMEYLNKVKRRAGIAEYTRFRTYP LLLNEIQNERGRELLGEFQRKFDMVRWGIWYERTAALSDYAKVKTSIQPCHEYYPIPDTE VIYSGGALDNNEYKKYGF >gi|313158042|gb|AENZ01000051.1| GENE 7 9532 - 12930 5142 1132 aa, chain - ## HITS:1 COG:no KEGG:Phep_1383 NR:ns ## KEGG: Phep_1383 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 11 1132 26 
1154 1154 926 45.0 0 MKKSGKSGGKGIIPLPPGRFFLLTLLFTLLAAAGVRAQNPPVSINAKAITINELFIRIEK QGVYTFAYNNADIDLKRIVTVHAKERPIESIVRECLPEVNVQVSNNKVILTGRRSDQSSM PMHKLTGTVTDENGAPVTGATVIVVGTQRGTTTSATGAYTLQVRNGEILEFQYLGYEKQA VSVGGQTTLDVTLQPSKAMAVDEVVVIGYGTVKKGDATGSVANIKISDVKDLPVLSVDQA LQGRVAGADIMSTTGEPGATTSIRIRGTRSISASNEPLIVVDGVMDAVSDLNDLNMADIE SVTILKDASSTAIYGSRGSNGVIIVTTKGGGNQTNTKPSITLKADIGFSQLPRKLDVMNA TEFALYRNDFAYFSTQSGYEDIGEGTPQSKYPFKDPFSLGKGTDWIDEITRTAPYQNYSL SISGRSKKSSYYASLGYNDSQGIIDNSGLQRITGRLNLDHQLFKWLKVGYRGSYTWRDNA QNLAEIGGTAYYRAAMYLSPHIDPQENYNPLWGNGQRINTPRATIDQNTYSIERTSLNHT AYLEVALAKGLKLRSQNSYYSFQRHTYRYYPGSLPAKNEGEGGQAYRAEFHEFSLSSENT LSYKLETKSGHNIDALAGFTAYRYKSDNFTLSGQGYMDDDVLWNNMNAVTDKDTYSAATG LTKRTKMSLLARFNYNYKQRYYLTVTGRYDGSSNFAANNKWGFFPSVALKWNAAKENFLK DVRWIDELSLRLSAGRTGNDAISAYRSLAAMSSTTSGYLFDGMQPGAYYRSRLASPNLTW EKTDLYNAALDLAFFNNRLMITAEGYISKTRDLLLTVQTASATGYTSRYANIGKTSNKGV ELSIESRNIVRPKFSWTTNLTIAHNKQNVDDIGSEDFVTALSSPGNNPYMMYGYVKGYPL NALWGFKYGGTWKSVEEFERNSVTNTYVSALAINSDAASRKASLGMPRYYDINNDGSLNN DDLVYQGNADPDLYGGLQNNFRFGRLNVGIYFTYSLGGKIYNYSELYMAGSSMSNQYRYM LEAWHPVRNPQSNLPRAGAVDAHVPSDLMIYDASYIRLKNITVGYTFDLSKRSKFIRDIT LNLSAENLHIWKKYNGFDPDVSTDSSDSALRRVDLGAYPKPRTIVFSIQLRY >gi|313158042|gb|AENZ01000051.1| GENE 8 13087 - 14082 1350 331 aa, chain - ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 21 278 63 308 354 77 27.0 3e-14 MERRQIRELIGRFLHNDVPGELQTQFRRWMLQPGDGDEKEAALREEWTAVVESPAAADRS EELQRLRHTIRMREAQQRRSRIFRRIRYAAAAAAVLLFAVGEYYYVRSASAVPAEIRLLT ARGSKGEFQLPDGTKVWLNASSRLTYPETFDRRERRVTLEGEAYFEVARNTSHPFVVDMN RMEIEVLGTTFDARYERTSGIAETTLNSGSICVRTSRSQQSVRLRPDERLVFNETSGSMI IEQVNASNYNSWIQPTLTFFDMTLEDIITNLERWFNVPIGTDASVDRTIRLSFHIRHESL EETLQVISLITGLQYTLDSESATFHTARTTR >gi|313158042|gb|AENZ01000051.1| GENE 9 14216 - 14773 832 185 aa, chain - ## HITS:1 COG:mlr8088 KEGG:ns NR:ns 
## COG: mlr8088 COG1595 # Protein_GI_number: 13476697 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Mesorhizobium loti # 3 178 37 213 217 71 27.0 1e-12 MTPEKEKELLTALSKGNQSAFDSLYLFYAPKVREFVFRLLKNPGEAEDVTQNIFLRVWEK RRELGGTRSLRSYLYTMARNAVFDIFSHSIVEDKYMQEHINSAAERRDAPLSEKIETEEL ALLIAVAVDRMPEQRRRVFSLSRYEELSNKEIAERLNLSVKTVERHMTAALSQLRRLLTL LALFV >gi|313158042|gb|AENZ01000051.1| GENE 10 15003 - 16469 2087 488 aa, chain - ## HITS:1 COG:BMEII0909 KEGG:ns NR:ns ## COG: BMEII0909 COG0531 # Protein_GI_number: 17989254 # Func_class: E Amino acid transport and metabolism # Function: Amino acid transporters # Organism: Brucella melitensis # 2 469 19 473 510 326 39.0 6e-89 MAKNRNITTTQLALMTAAAVISLRGLPMMAQEELTMFFYIFFATFLFLIPAALVGAELGS AFASKGGGVYTWVKEAFNKHMGFTAIFLQWIQNVVWYPTVLGFAAASIAYMIGMPDLAQN GLFVGLFSIAMYWCATLVTLRGTSAISGITSKGFLIGTVLPGIVVIVMAVVWMIGGNSVA LEHIPDTVSQVVNIDAAHHVHPRLFPHITGMSDIAFLAGILLLFAGVEVHAVHAPELKKP QTQFPRAMFLAALISFGLFTLGALAVAIITPYDQINLQSGLFTTFQIVFEHYHVGWLTNV MGLLVAFGALAGVMSWISGPSRGLLWTAQEGVLPCFLQKTNKNGVQINILIIQGCIVTLL SSLYIVMNDVSVAFFLLSALTVGLYLLMYMMMYAAGIRLRYTQPDLTRSYRIPGGNAGMW LVGGIGFLAVLFSFIVTFFPPSQLPVGSPAMYTWLVVVGTAVFLSIPFVISFVMDRRAAG AADKPSGR >gi|313158042|gb|AENZ01000051.1| GENE 11 16927 - 18795 2256 622 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158114|gb|EFR57519.1| ## NR: gi|313158114|gb|EFR57519.1| hypothetical protein HMPREF9720_0015 [Alistipes sp. 
HGB5] # 8 622 1 615 615 1033 99.0 0 MLRSSGLVRRQRRGGGNPPAPAVTLAVGDAASNTLKFTLTLSDADKCTYVCTKSSEAVPS AEKILADGKSVTASGLITIGDLEPSTTYRLSAVASNGKVNGKVETIEHTTSAPDVHPAVV LTPGTPTSTTLSFTAALTDPETAAYVCLEKTEGTTVPTAEEILRDGTAIAQTGELLVENL KPATVYLIAAAVANTGVYSEVETLEMETVARTPVVTVIAGTPTETTLSFHFKLTDAEKAA YVCIEKTDNPTIPPAEEILRDGTDLPVSADAAEIRELKPGTTYIVAVAASNKTVYSDVKT VEMTTDQAVEGPIVFDRQAAGGYYPTESSYIGEFLLVLADGETTESGGVYATTGAGRAMS IDLYQMAVSNPNTTITLPARDYRYATNKGLTTFDPVKTYCMVNDGKGNITRTDFKAGTIS VKKAGSTYTITATLTTTDDEEFTASYEGPLTIENKTPSEIPDLPTLDKDVTNLTFIRALG KYYSDSATADNCIVNLYDVEPTISYGSDYLGQAGHMVSLDLSTAVSTEMQLQEGTYSVST SGAPGSYAAGYQTEFMDTMLPVGTYCEERNDNFQSFYGFVSSGTVTITKSGSGYRFVLDF TTDKGYKVSGTYEGNVEMTDKR >gi|313158042|gb|AENZ01000051.1| GENE 12 19249 - 20622 2216 457 aa, chain - ## HITS:1 COG:XF1553 KEGG:ns NR:ns ## COG: XF1553 COG0015 # Protein_GI_number: 15838154 # Func_class: F Nucleotide transport and metabolism # Function: Adenylosuccinate lyase # Organism: Xylella fastidiosa 9a5c # 14 455 83 524 532 440 52.0 1e-123 MPENKSQHNMLNKLTAISPVDGRYRNTTETLADYFSEQALIRYRIRVEVEYFIALCEIPL PQLAGIDRTKFAALRALYLDFSPADAERVKQIESVTNHDVKAIEYIIKEKMDTLGLEAYK EFVHFGLTSQDINNTAIPLSLKEAMAGFYYPAVEEVRDKLAAFADEWREVPMLARTHGQP ASPTTLGKEFMVFVERIEKQLAMLHDIAVPAKFGGATGNFNAHRAAYPQYDWVAFANRFV GETLGLCRSQYTTQIEHYDNLAAIFDNMKRIDTILIDLCRDMWTYISMEYFKQQIKAGEV GSSAMPHKVNPIDFENAEGNFGIANALFEHLSSKLPVSRLQRDLTDSTVLRNIGVPVAHA AIALRSLMKGLNKVILNREALDRDLENNWAVVAEGIQTILRREGYPKPYEALKALTRTNA HITHESIAAFIETLDVAETVKEELRALAPSTYTGVFR >gi|313158042|gb|AENZ01000051.1| GENE 13 20695 - 21111 629 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158141|gb|EFR57546.1| ## NR: gi|313158141|gb|EFR57546.1| hypothetical protein HMPREF9720_0017 [Alistipes sp. 
HGB5] # 1 138 1 138 138 258 100.0 9e-68 MNFLTAIINALVGFVQRNPLTVLVILILAIAAPALLKGVALFFLYVVMSILILAVALILV FRWRMNKVRRQMEEQFGEGFDPRNFGGQGFGSPFAGEPRKGREGEVKVRKTSGAPEKRVS KDVGDYVEFEETKEPKQE >gi|313158042|gb|AENZ01000051.1| GENE 14 21115 - 21858 1222 247 aa, chain + ## HITS:1 COG:aq_355 KEGG:ns NR:ns ## COG: aq_355 COG0767 # Protein_GI_number: 15605864 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, permease component # Organism: Aquifex aeolicus # 1 246 1 245 245 114 33.0 1e-25 MLKIFELIGRYFILMGKVFSRPEKAAIYRRRIVYEMESLGIDSIGLTAIISVFIGAVITL QMCINLDSPFIPRSLVGYATRETMILEFSSTVVALILAGKVGSSIASEIGTMRITEQIDA LEIMGVNSASYLILPKIVAAVVFFPFLTILSIAIGIIGGWAIAYLTGIMIPADYIDGLLM DFKPYSIVYSLIKTAVFAYIITSISAFYGYYAKGNSLEVGAASTRAVVASSVVIMIFNLI LTQVLLI >gi|313158042|gb|AENZ01000051.1| GENE 15 21855 - 22610 304 251 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 232 1 231 245 121 32 2e-26 MIRAEHITKSFDGRVVLDDISIEFETGKTNLIIGRSGSGKTVLLKTLVGLHEPDSGDVWY DDTNFTQLDFKQRRAIRKDIGMIFQGGALLDSSTVEENVKLPLDLFTSQSEKEKMERVNF CLQRVRLEEANHLYPAELSGGMIKRVAIARAIVLNPRYLFCDEPNSGLDPQTSIVIDNLI HEITCEYNITTIINTHDMNSVMEIGEKIVYIHEGRKWWEGTKDDILHADNPELNDFVFAS AMAKRAKRAAK >gi|313158042|gb|AENZ01000051.1| GENE 16 22728 - 23744 1420 338 aa, chain + ## HITS:1 COG:YPO2157 KEGG:ns NR:ns ## COG: YPO2157 COG0057 # Protein_GI_number: 16122389 # Func_class: G Carbohydrate transport and metabolism # Function: Glyceraldehyde-3-phosphate dehydrogenase/erythrose-4-phosphate dehydrogenase # Organism: Yersinia pestis # 4 333 6 333 334 489 76.0 1e-138 MSKIKVGINGFGRIGRLVFRAACNNADIEVVGINDLVPVDYMAYMLKYDSVHGRFNGTID YNVEKGQLIVNGNVIRVTAEKDPANLKWNEVGAEYIVESTGLFLTKEKAEAHIAAGAKYV VMSAPSKDDTPMFVCGVNTDTYAGQQIVSNASCTTNCLAPLAKVINDNFGIVEGLMTTVH ATTATQKTVDGPSMKDWRGGRAAGGNIIPSSTGAAKAVGKVIPALNGKLTGMAFRVPTLD VSVVDLTCRLEKGASYDEIKAAMKKASENEFKGIIEYTEDDVVSTDFTGDANTSIFDAKA 
GIALNPNFVKLVAWYDNEWGYSNKVVMLIQKMAAYNNK >gi|313158042|gb|AENZ01000051.1| GENE 17 23811 - 24221 518 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167753069|ref|ZP_02425196.1| ## NR: gi|167753069|ref|ZP_02425196.1| hypothetical protein ALIPUT_01339 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_01339 [Alistipes putredinis DSM 17216] # 1 133 1 133 134 159 53.0 5e-38 MAATEGFSIEISSRYDGWWRYNAALMCGCFDAGDERIGFASAESHVADVGANLTAKPADM AGDRSLRLETTACDHLLLYIYIVPHTLPAGNDIADTQPFEITLRIAYGGKVLRSEKRLIN QWSGASVEMRVDRQDK >gi|313158042|gb|AENZ01000051.1| GENE 18 24396 - 24980 798 194 aa, chain + ## HITS:1 COG:no KEGG:Phep_3814 NR:ns ## KEGG: Phep_3814 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 1 177 1 177 177 110 36.0 3e-23 MGVTKRYTIAYKGLKPGLHDFRFEVDGSLFEEFESTEIKDGACEVTVALERGEAMLTLDV TVDGSVVVECDRCLEECRIPVHYQGRLLVKFSDEVHEYDGEVMWLLPMEDEIDLKQYIYE SIVLSLPYQRVHPEGECNPEMLERFRIVSDSELAAVEARAGAQEHDGGEWAKLAALKERL EAEGPQEEDGTPEK >gi|313158042|gb|AENZ01000051.1| GENE 19 25038 - 25223 235 61 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|110640211|ref|YP_678235.1| 50S ribosomal protein L32 [Cytophaga hutchinsonii ATCC 33406] # 1 60 1 60 61 95 73 2e-18 MAHPKHKVSSTRRDKRRTHYKAVVPTVVTCSNCGAATLYHRVCPECGFYRGKLAIEKKAA E >gi|313158042|gb|AENZ01000051.1| GENE 20 25375 - 26322 1414 315 aa, chain + ## HITS:1 COG:alr0238 KEGG:ns NR:ns ## COG: alr0238 COG0416 # Protein_GI_number: 17227734 # Func_class: I Lipid transport and metabolism # Function: Fatty acid/phospholipid biosynthesis enzyme # Organism: Nostoc sp. 
PCC 7120 # 2 314 6 334 341 212 39.0 1e-54 MLKIGVDAMGGDFAPEAAVQGAVLALEAIGPDSRIVLFGDEAKIKAVLEAEGCSAERFDI VATTEVIEMGDHPAKAFQAKADSSITVGFGYLAKGAIDGFASAGSTGAMMVGSMYAVKPI EGVIRPTISSIVPTIAGRPALLLDVGLNVDCKPEVLAQYGLIGSIYAEAVLGIGKPRVAV LNIGEEETKGNAQTKATYELLKEDGRINFVGNVEGSYIFTGQVADVIVCDGFVGNTVLKM AEGLYRINKKLGGGNAFWDAMNYENVGGTPVLGVNAPVIIGHGISSARAIKSMILSTEQC IKADLTVKLQHAFKN >gi|313158042|gb|AENZ01000051.1| GENE 21 26348 - 27376 1447 342 aa, chain + ## HITS:1 COG:BMEI1180 KEGG:ns NR:ns ## COG: BMEI1180 COG0332 # Protein_GI_number: 17987463 # Func_class: I Lipid transport and metabolism # Function: 3-oxoacyl-[acyl-carrier-protein] synthase III # Organism: Brucella melitensis # 4 327 2 323 323 258 41.0 2e-68 MGKITAAITGVGEYLPDYILTNEELSTMVDTSDEWIMSHIGVKTRHILKGEGVGSSYMGA RAVINLLDKAGVDPMEIDLVICATVTPDMFFPSTGNLIADQAGCKNAFAYDVSAACSGFL FALTTGAKFIESGSCKKVVVVGADKMSSIVDYTDRSTCPIFGDGAAAVLLEPNEEGYGVL DEILRSDGSGELQLYMKAGGSRYPASVETVQNNWHTIMWDGHAVFKAAVSKMADVSVEIM ERNGLTSADVRYLVPHQANLRIIDATARRMGLEHEKCMINIDRNGNTTAATIPSCLYDYE KELRKGDNLILSAFGGGYTWGAIYVKWAYDGSKACQIDSSEK >gi|313158042|gb|AENZ01000051.1| GENE 22 27414 - 27557 58 47 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRESEKKLKPHMFRESQIYENSRFPETQGAPEYAEKRRKPSLVSVSL >gi|313158042|gb|AENZ01000051.1| GENE 23 27586 - 28263 905 225 aa, chain + ## HITS:1 COG:all5323 KEGG:ns NR:ns ## COG: all5323 COG0745 # Protein_GI_number: 17232815 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Nostoc sp. 
PCC 7120 # 1 222 1 222 610 143 37.0 3e-34 MKILIVEDEPSLREIMVQTLRREQYVVEQAADYASALDKIAGYDYDCILLDIMLPDGSGL QLLEELKRQRRQAGIIIISARDSLDDKIEGLELGADDYLPKPFHLAELSARIRSVVRRHQ RGGQESLDAGNVRLFPDSRRVEIAGREVELLRKEYDILYYFMSRPNHMIDKTTLAEAVWG DHIDQADNFDFVYAQMKNLRRRLHDAQADIEIRAVYGFGYKLVTP >gi|313158042|gb|AENZ01000051.1| GENE 24 28260 - 29540 1752 426 aa, chain + ## HITS:1 COG:mll7952 KEGG:ns NR:ns ## COG: mll7952 COG0642 # Protein_GI_number: 13476585 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 138 408 155 430 452 99 30.0 9e-21 MKLVYRILLHMSWALSLLLAAWAVLFYFTMIDEINDEVDDSLENYAEVLVKRNLAGRELP AAFSGSNNGYFLHDVSAEYAAARPHMVYSDEEIFIPEKDEEEPARVLRMIFRDREGSYKE LTVMTPTIEKDDLQEAILWWIVYLYLFLLLTILLINILVLHRTLRPLYALLRWLDGYRVG GKNAPLANDTKIAEFRRLNEAAIRNAERAEVLFDRQKQFIGNASHEMQTPLAVCRNRLEM LVGDGSALSEEQLGEIMKVQRTLDYLVRLNRSLLLLSKIENGQFPETEQVDFNALVRRTA EDMSEIYAGRGMRLSLREEGRLAARMNSSLGASLVTNLLKNAYVHGDAGGEVTVTVSGDR LSVCNSGAGGALDADHIFERFYQGAKKEGSTGLGLSIVDAVCRLYGLRTEYAYINRCHCF TVHFPK >gi|313158042|gb|AENZ01000051.1| GENE 25 29651 - 30094 764 147 aa, chain + ## HITS:1 COG:no KEGG:BT_0923 NR:ns ## KEGG: BT_0923 # Name: not_defined # Def: putative periplasmic protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 147 1 145 145 146 48.0 2e-34 MKKLMMILAGAALLATSAPAFAGNDRPIAVGELPAVSQEFIKTHFAGVEVSFSKVDEELF DKDYKVVFVNGAKVEFAKNGEWKDVECKYGEVPAAIVPQQIRDYVTKNYPKNKIVAIDRD RRDYEVELDNGLDLKFDLKFRLIDIDN >gi|313158042|gb|AENZ01000051.1| GENE 26 30141 - 30962 1033 273 aa, chain + ## HITS:1 COG:no KEGG:BF2756 NR:ns ## KEGG: BF2756 # Name: not_defined # Def: putative lipoprotein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 13 270 20 275 278 222 43.0 1e-56 MLALASLGAMTGCDDDDDSVKVPAAVQDTFGRMFPGAGHVEWAGKQGYLVAEFREGGTDM QAWFDAAGKWYMTEEDVPYALLPQAVRTAFESGEYAAWHVDDADKLTREGLETVYVLEVE QRDAEYELVYSEDGVLLRAVPDADGDRDHGDMLPQELPQAVKDFIGRKYPGARIVDAERE KGGLEVEIIDGRTPREVYFGAGDAWLRTKTEVRRSEVPAAVMQAFQTSQYAGWEIDDIDH YDSPERQWYLFELEDPQSDREAELRIQADGTIF 
>gi|313158042|gb|AENZ01000051.1| GENE 27 31133 - 31657 948 174 aa, chain + ## HITS:1 COG:MTH158 KEGG:ns NR:ns ## COG: MTH158 COG1528 # Protein_GI_number: 15678186 # Func_class: P Inorganic ion transport and metabolism # Function: Ferritin-like protein # Organism: Methanothermobacter thermautotrophicus # 1 163 1 163 171 140 50.0 1e-33 MLSEKLHDALNEQINAELWSAYLYLSMSMDAEAKGLKGVANWFFVQFQEEQDHARILMNY INSRDAKVELKPIAEVRTEWTSPLDMFKDTLEHEKVVTSMINNLASIAAEDKDFASANML VWFVDEQVEEEESARDMITACEAVEGNKFGLYTLDKELAARVYQQASPLAANNA >gi|313158042|gb|AENZ01000051.1| GENE 28 31709 - 32434 971 241 aa, chain - ## HITS:1 COG:MT2803.2 KEGG:ns NR:ns ## COG: MT2803.2 COG4422 # Protein_GI_number: 15842273 # Func_class: S Function unknown # Function: Bacteriophage protein gp37 # Organism: Mycobacterium tuberculosis CDC1551 # 5 219 14 239 284 93 32.0 4e-19 MSVGWNPWHGCRKISEGCRHCYVYRQDERHDKDSSEVRKTSAFNLPVRRSRDGRFKVPAG EMVYTCFTSDFLVEEADAWRAEAWEMIRTRSDLRFFFITKRIDRLMQVLPPDWGDGYENL IVGCTVENQAAADRRLPLLLGAPLRHKLIVCAPLLGPLDIAQYLTSEIEEVSAGGESGND ARPCDFDWVLSIRRQCLDADIPFCYHQTGARLVKDGRLYRIRRRFQHAQARKAGIDYKVK R >gi|313158042|gb|AENZ01000051.1| GENE 29 32441 - 32893 526 150 aa, chain - ## HITS:1 COG:BS_yvbK KEGG:ns NR:ns ## COG: BS_yvbK COG0454 # Protein_GI_number: 16080442 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Bacillus subtilis # 15 149 21 155 155 129 47.0 1e-30 MEFEIVKTDTPDRSLLLLADESEESVADYAGRGTTYAACRGGQILGQYVLLHTRPFTAEV VNIAVAPEYQRRGIATALLQHAVRTARDAGFRLLEIGTGDIGAGQIALYERCGFVRCGID EDYFRKHYPAPIFENGVECRHMVRLRMELR >gi|313158042|gb|AENZ01000051.1| GENE 30 32905 - 33363 787 152 aa, chain - ## HITS:1 COG:MA3555 KEGG:ns NR:ns ## COG: MA3555 COG1238 # Protein_GI_number: 20092362 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Methanosarcina acetivorans str.C2A # 1 141 2 142 153 84 34.0 8e-17 METLTQFLIDWGYWGLFFSALIAGSVVPFSSEAVMLVLVHMGLDPVLCVVSAASGNTLGG 
MSCYWIGTLGKSEWITRLGVKEKQLDKARRFLAGRGAMMAFFSFLPTIGEAIAIVLGLMR SNVWLTGGSMLAGKTLRYIVVLATFQGAASLL >gi|313158042|gb|AENZ01000051.1| GENE 31 33489 - 35624 2454 711 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 57 702 63 754 790 290 31.0 5e-78 MLKTGLCIVLFALAAGCAPKQKPFTDYVNPLLGTATLWDSVDLGFKPTKRTWGAEVFPGS SLPNAMVQVSPVTKFHSGAGYQYEDSVIYGFTHTNKGHWNLCHIPLLPVTGTPLPGDYCS PYSHARESAAPGYYRVFLDRYGVDAELTSTLRCAFHKYTYPAGAQAALLADLQRSNEHVR AWDIRKEGDNAFAGWQQTGEKMYFYAVADRPVTGIEPVAEGDREIRIVRFGDGDGPVELR IGFSFVSEENARKNLEAEMLGRSFADVRSEATAVWNDLLGRIRVEGGTEREKGLFYSSLY RSFLWPALRSDVNGEFTDADGEVVNTGFRYYTGPSYWDDYRNKLILLGLLAPDVAADVIS SDTDRGEKRGGFMPTFFHGDHASAFVAGVYLRGIRGFDVRRAYELVLRNATVQGPSRPWL DEYVARGYIPEMDVENPVTQTVCKAAVTKTLEYAYDDYAVALLADALGDGANRDMLMKRT SNYKNLFDPSTGLMRGRLDDGSWITPFDDQYPYYEYMYREANAWQQAFFAPHDTEGLIGL YPSREAFGAQLDKLFSIPWKGYEVDNLSGFIGQYCHGNQPDHSFPYLYYFIGRQERSQEL LDHILDRFYGMGPEGLALSGMDDAGEMSSWYVFNAIGLYTYSPADPEYIVTVPLFDKVAV SLGDGSRWSIRREGTGRRITGITCGGNPVGGWFVSHDALGKGELVIATSEK >gi|313158042|gb|AENZ01000051.1| GENE 32 35866 - 36126 355 86 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1596 NR:ns ## KEGG: Odosp_1596 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 6 86 2 82 86 97 65.0 1e-19 MTTATKHKTADRLTAEERHELPDSAFGIPETREFPLVDAEHVRAAEAYFRYAPDNKKAAL ARRILAKAAAYGVNVQSQVIRSWAEE >gi|313158042|gb|AENZ01000051.1| GENE 33 36211 - 36621 651 136 aa, chain - ## HITS:1 COG:VC0373 KEGG:ns NR:ns ## COG: VC0373 COG0432 # Protein_GI_number: 15640400 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Vibrio cholerae # 1 136 1 138 139 160 57.0 5e-40 MIQQTEFTLRPRTRGFHLVTDEITANLPPLPDAGLVHLFIKHTSAGLSINENADPDVRTD LSAIFDRLVRERESYYVHTLEGDDDMPAHAKSTLTGVELTIPVTQGRLNLGTWQGIYLCE FRNRASGRRIVATLIG >gi|313158042|gb|AENZ01000051.1| GENE 34 36786 - 37226 613 
146 aa, chain + ## HITS:1 COG:no KEGG:BT_0816 NR:ns ## KEGG: BT_0816 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 146 2 144 145 165 61.0 6e-40 MYTILRFLFSDKPYTAGNSWLLLAVRIIFGGLLMSHGIAKWQNFDALSAAFPDPLGVGGG VSLALAIFGEVVCSAGFIAGLFYRLALIPMIFTMCVAFFAVHGGDPFAARELALAYLSVY VLMYVAGPGLYAADTLIARRLPKPRR >gi|313158042|gb|AENZ01000051.1| GENE 35 37407 - 38837 1547 476 aa, chain - ## HITS:1 COG:CAC3734 KEGG:ns NR:ns ## COG: CAC3734 COG0486 # Protein_GI_number: 15896965 # Func_class: R General function prediction only # Function: Predicted GTPase # Organism: Clostridium acetobutylicum # 4 476 3 459 459 300 37.0 3e-81 MICDHDTIAAPATAAGGALAVIRVSGGNALTICDRIFRGSAPLADAGGYTVHYGHIMADG RTLDDVLVTVFRAPRSYTGEDAAEISCHGSQYIVTETLRLLTAAGARMAQPGEFTIRAYL AGKLDLSQAEAVADIIASSSRASHALAANQMRGGYSAAFDTLREKLLHLTSLLELELDFS EEEVEFADRTQLRDTMRQIKERIDALRSSFSLGNAIKEGVAVAIVGAPNVGKSTLLNRLL NEERAMVSDIAGTTRDIIEERANIDGIVFRFLDTAGIRSTNDTLEQMGIARTMSSIERAQ IIIRLIDASQPAVPVSGGSPSSGQTGRKSAPAVPSADSAADTQSPDFPLRPDQTLLTVYN KIDKTPGLALPEGAVGISARNGDGIDDLRRILRDAVDTEALYHGGTVISNSRHYEALTSA SEALSAALDGLRDNLPTDLLSEEIRQVIRHLSGVTGQDIVPEEVLKTIFSKFCIGK >gi|313158042|gb|AENZ01000051.1| GENE 36 38839 - 40221 1817 460 aa, chain - ## HITS:1 COG:YPO1078 KEGG:ns NR:ns ## COG: YPO1078 COG0741 # Protein_GI_number: 16121379 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Soluble lytic murein transglycosylase and related regulatory proteins (some contain LysM/invasin domains) # Organism: Yersinia pestis # 132 457 91 411 475 160 33.0 4e-39 MRILLTIVLLMAVAAPASGLDRKPRKKEKKKARTEAVAEPAPAPAPPAAKPAHTERKTEP AAPAPAPLPDNDMTLGLTAAQADSLVAVWRERQCCDSFREFFDNYILVDSTQTGAVPDSV YTRRLRDLASPIQLPYNSIVRGYINRYTDSRYGTISRILGMSQYYFPLIEDELLKEGLPI ELRALPIIESALSVTAVSPMGAVGLWQFMPSTGKSYGLEVNSLVDERRDPVRSTQAACRY LKDLYAIYKDWSLAIAAYNCGPGNVNKALARAGEKSRTFWDIYDYLPRETRGYVPAFIGA SYAYAYHRQHGIELTEAPIPLATDTVRIDRIMHLQQIASTIDVPIETLRQLNPQYKLDII PATTKPYTLVLPQRSLTQYVANEPAILAKDSLYLKEYINPANLDKKRQERSGTVYTVKKG 
DTLGAIARRYRVTTAQLMRWNNIKSAHKLRIGQRLRIEGR >gi|313158042|gb|AENZ01000051.1| GENE 37 40225 - 41385 1789 386 aa, chain - ## HITS:1 COG:no KEGG:BF0766 NR:ns ## KEGG: BF0766 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 226 386 128 288 288 126 38.0 2e-27 MVRIRFRHLLPILAALTLWLPAQAQLHRGARTIVRGGMLVQRDSLASKTDSLADLSLVRE LDSLMTGLFAADSAAVAELRTDDISKADSLYRLKMAGIKIESDTTKHKRGWLMSDSMSLS KVCWISTVLPGYGQIYNKQYWKLPVLYGTLGAGLALYLHENKTYKPLKREYEAYTNKNLS RTPELDALQSRMIRSNTRRQVYLGLTVASYIYFIGDAAVNYSTNEVSDVKKATTLACIFP GAGQIYNKSYWKVPFVVGGFAAMIYCIDWNNRGYQRFKKAYRLLSDYEKNPDAYPDGPTD EFHGRYSADFIRNLRNNYRRNRDLCIIISAGLYVLQIVDAHVDAHLKDYDVSDDLSMNLE PLVDYTYVPSAGGNRAVFGFNLSFKF >gi|313158042|gb|AENZ01000051.1| GENE 38 41389 - 42264 1435 291 aa, chain - ## HITS:1 COG:PA5562 KEGG:ns NR:ns ## COG: PA5562 COG1475 # Protein_GI_number: 15600755 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pseudomonas aeruginosa # 3 280 4 283 290 181 40.0 1e-45 MKQPKGLGRGLDAIFGTDAVDAKLKPMSQMTEIDIAEIIPNPTQPRTQFDEEALDELADS IRQLGIIQPITVKKSGDGKYIIISGERRWRAAQRADLKSLPAYVREVDDENLHAMALVEN IQRQDLNAIEIALGMQRLIEECNLTQDALSEKVGKKRSTVSNYMRLLKLPDEVQLALKEG LISMGHAKAIAGAPQELQLRTLKKILKKSLSVRQAEELVRTLTEQPAEAAQNMEDEEYPE SYSRLVEQLEKFFSQEISIKRSKNGGGRIVIGFDDDKDIEQFIERFSGRLK >gi|313158042|gb|AENZ01000051.1| GENE 39 42277 - 43047 1169 256 aa, chain - ## HITS:1 COG:lin2923 KEGG:ns NR:ns ## COG: lin2923 COG1192 # Protein_GI_number: 16801982 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Listeria innocua # 1 246 1 248 253 256 55.0 4e-68 MAKVIALANQKGGVGKTTTAINLAASLALLGKKVLLLDADPQANATSGLGFDINLEGVYE CIAGLKKPEEVLLQSPDVKNLWVLPSSIDLVAADTELPKMEDAHHVMKRIVDSVRDRFDY IFIDCSPSLGYTTVNILTAADTVLIPVQCEYLALEGLSKLLNTIRKVKSGLNPALDIEGF LLTMYMRNRLNNQVVNEVREHFGQLAYDTIIQRNIRLGEAPSHGKPVMLYDAGAVGSENY LTLAREFLKRNRKRSK >gi|313158042|gb|AENZ01000051.1| GENE 40 43161 - 44075 1168 304 aa, chain + ## 
HITS:1 COG:slr0959 KEGG:ns NR:ns ## COG: slr0959 COG1266 # Protein_GI_number: 16331018 # Func_class: R General function prediction only # Function: Predicted metal-dependent membrane protease # Organism: Synechocystis # 143 250 417 523 529 68 38.0 2e-11 MTEKKTEAPVQEPAPAPEAVGKIWKLQGKFPALADYLVFFGIFLLAQVVGAVTALLVGCK WPDQALLASADETVSTAEQVLVGHFNAVSYFVAMTLTLVGFLFYRSRRRGPKIIAHFSSR GLNPVLLLWGVVFMLATSVVLEPLLSLLPEVPNAYGRGAWAIVTVVVMAPLFEEVIFRGV LLESTRAKYGVMAAWLVSSAVFGIVHVHPTVAVNAFAIGLVLGFVYMRTDSLWSTIILHA VNNGIAYLALVTGHGNAMLIDLVGSRTLYVLIYIGALAVFAVSGYMMLLALRRLKAEDKN RAAA >gi|313158042|gb|AENZ01000051.1| GENE 41 44135 - 45013 1178 292 aa, chain + ## HITS:1 COG:no KEGG:Ftrac_1858 NR:ns ## KEGG: Ftrac_1858 # Name: not_defined # Def: hypothetical protein # Organism: M.tractuosa # Pathway: not_defined # 32 286 42 296 297 110 25.0 5e-23 MNLRAKIAVVAASVFLAGCREMPGYFASDTTLARAGGKQLQLRDVESVVPKGVTGEDSAA FMKVYVDRWVRKQLKLKEAETLFSASGDDIDKMVEEYRQALLIRKLNQLYVDRSIDTTFT DDEITAYYNAHKADFRLDRTLVKGRIVQFGEGYRQARKLKELMGAKSAAQQQDFRDICEK NDFTVTDFREQWVDFPEFLSFLPTLRSQNYDSVLSSNAVQAMRDSHSYYYFQIDAVRREG EPIPLERLRSTIRRILFNQRQGEIIRSHEEELYNRAVENGEAKIFENTNDKV >gi|313158042|gb|AENZ01000051.1| GENE 42 45018 - 46424 2267 468 aa, chain + ## HITS:1 COG:RSc0516 KEGG:ns NR:ns ## COG: RSc0516 COG0760 # Protein_GI_number: 17545235 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Parvulin-like peptidyl-prolyl isomerase # Organism: Ralstonia solanacearum # 26 450 88 490 498 93 26.0 8e-19 MLKKGIMAAALAVLAAGAFAQKQQVMLDKVVAVVGSSSILYSEVADHARQLTAQRRAEGY TSDRDPMNEALEALMTQKLLFNQAQIDSVKINAGDIASHVEEQVQNMIEAEGSIPRLEAK HHMAIFNIRENMRQRYEEQSYASSMQNEVVSKVAVIPGEVERFYKSIDKDSLPTIAEQYV YAQITRFPKSITQAKQRTRERLLDMRERVITGKAKFENLARMYSQDPGTMMRGGEMDPAP LASLDSSFAAALENMKPGQISEVVESQFGFHIIQMLDKRGSLYHFRHILLRPVYTIDELT EGTHFLDSIANLIRKDSITFEKAALLYSDDATSKMNGGIVSNHDILERFSAFDAKLTVTK FLREDFAHFNALNDYNALVRLKPGEVSEAFLTEDIMGNQLAKVVKLVEIIPTHVASLNED YLRLEEMALNAKQEKVFKDWLSKKIDAMYVYIDPEFRDGAFENKHWVK >gi|313158042|gb|AENZ01000051.1| 
GENE 43 46429 - 48963 3203 844 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_3437 NR:ns ## KEGG: Pedsa_3437 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 64 774 53 661 702 186 26.0 6e-45 MRRLLSIAVVALAAGSVIFARGGMQAPEAEQPAEKKLIDLKSDNMGPVAPGDSVIFLVGN FAAQHNGAVITCDSAVRYSDMRIEFFGNVLINKNTTYIYGDRAEYNGEVNEARVFSDIVK VVDEDATLYTYEFLFNTKENVGEFGGGGVLVNRDSRLESVRGYYYANSKELVCVDRVEMR NDEYELKGDSVVYNMATDNAFFFRNTNIWNKEGDYLYADRGAYRKADSLYKVTSNGYVLT DKQEMWSDSIDFYRAEDHIILWRDIQIDDTEHKVLAFGDYGEYWKEPGNAFLTRRPSIVS YDLSQGDSLFMRADSMFLFTINENTERRAAEAAAADSLARSADSLALSGPDSLALSGPYS LAHAAGGVDVPADSLGRPRSGRRPQGVDAADSLATAGSAPDSLAAGGDADSPHATSSPLD APDRPAPADSLGGAAPADSLANAADTLTVAQRKALLKEAAKRAKAEEKAAAAKEKKKKLD EIAARRKEKMTARLLEQKEREEARLTARRLKAESKLKARQARATRKGRMIQIDSTALREL DSLIVLNMAEQDSLLNLLVDSLLTDTAAMAVPADSIDSLAAPRDSIYRLLKGFRNVKIYR SDFQTVCDSMTAISTDSTIHLYIDPVLWNENNQITSDVMDIFTENQQLTRAEFIGSPMMV SQLDTTHYNQVAGKTMTAYFFNNQIYRNDVNGNAQTIYYMQDGEPVEITMMGVIESGEIS FFIEDKQVVQITYRGDPVYNFYPMDKIPPTQDIRLKGFKWEGARRPSQAEVFDRRIRPSE RERRSEMKHPDFPIMQRIDEHRKRLIEQRRWTDRNDQVDAATVEWMHSLGYEVGQPRKTE APAE >gi|313158042|gb|AENZ01000051.1| GENE 44 49192 - 50349 1495 385 aa, chain + ## HITS:1 COG:araJ KEGG:ns NR:ns ## COG: araJ COG2814 # Protein_GI_number: 16128381 # Func_class: G Carbohydrate transport and metabolism # Function: Arabinose efflux permease # Organism: Escherichia coli K12 # 1 383 1 382 394 292 45.0 9e-79 MKKSLIALAFGTLALGMSEFVMMGILPDIARSLGVSIPEAGHLISAYALGVCFGAPLMVL VARKRPLKQILLALTAMIAAGNLCASLAPDYWSLLGLRFVSGLPHGAFFGVGSIVAERVA DAGKRTEAVSIMIVGMTVANLFGVPLGTYISNVLTWRATFGIVAVWGVVALLLVRLWVPR MEPLPDTGLKGQFRFLRSLAPWLVLLSVMLGNGGIFCWYSYVSPLMIHASGFTADDLTLI IMLAGFGMFAGNIIGGHFSDRFTPEKVVRFTLATAAATLLAIYFGAHVRYLSVALMVLCT GCLFCVSSPQQLLILENSRGGEMLGAALVQVAFNLGNALGAYCGGLPIDHGLGYRYTALA GMGFVLLGLLTVVIYIRKYPRHEMR >gi|313158042|gb|AENZ01000051.1| GENE 45 50336 - 50836 553 166 aa, chain + ## HITS:1 COG:PA4841 KEGG:ns NR:ns ## COG: PA4841 COG0494 # Protein_GI_number: 15600034 # Func_class: L Replication, 
recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage repair enzymes # Organism: Pseudomonas aeruginosa # 2 138 13 149 178 61 33.0 6e-10 MRCDNGSELFPVVDPSGVVVGSATRGECHGGSMLLHPVVHLHLFNSRGELYLQKRPEWKD IQPGRWDTAVGGHVDYGETVDEALRREVREELGVTEFTPERVAVYVFRSERERELVYVHR AVYDGPVAPSDELDGGRFWSRAEVLENLGKGVFTPNFEGEAKLIFE >gi|313158042|gb|AENZ01000051.1| GENE 46 51106 - 52524 1769 472 aa, chain + ## HITS:1 COG:STM0569 KEGG:ns NR:ns ## COG: STM0569 COG0668 # Protein_GI_number: 16763946 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Salmonella typhimurium LT2 # 30 417 24 409 415 347 43.0 3e-95 MQTLLKHWADALLEWLGLASAQDSGWDRWVAFTLVLLAVALFDFVCRTILVRGMRRIVKR TKATWDDDLFNPDVLSRLCNVVSAVVLGIVLPVVFEEQSEARTIVTRLVEVYIVVAVFRF INAVLFAVFQIASARPAWQNKPIKGLRQTGQGIAALICGILIVSILIDKSPASLLAGLGA SAAIIMLIFRDSILGFVSGIQLSANDMLKVGDWISVPKYGADGSVIEVSLTTVKVRNFDN TIVTLPPYLLVSDSFQNWRGMQASGGRRVKRSVSIDMTSVRFCTPEMLNKYRSIDLIREY IDDTERRVEEYNAAHGIKPGERRINGLHQTNLGVFRAYLERYLRNEVPVNAGMTLMVRQL QPTETGLPMELYFFTDTVIWVDYERIQSDVFDHVLAVIPEFDLRVFQNPSGSDIAALRYL KPEPGMGRKMEQRRGPDRAPEPEPEQKPEPTPTQEPTPKQEPESGRKPEPTP >gi|313158042|gb|AENZ01000051.1| GENE 47 52574 - 54235 2614 553 aa, chain - ## HITS:1 COG:ECs4625 KEGG:ns NR:ns ## COG: ECs4625 COG2985 # Protein_GI_number: 15833879 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Escherichia coli O157:H7 # 17 552 16 557 561 316 34.0 7e-86 MEWLKSLFVEHSALQAVVVISLISTIGIGLGKIRFFGISLGVAFIFFAGILAGHFGLTID PQMLTYAESFGLVVFVYALGLQVGPGFFNSLRADGLRLLSPAVAVVLIGTALAIGMSFTL NVSMPDAAGILCGATTNTPALGAAQQTLQQMGIDPNGAALSCAVTYPLGVVGVILAIVFI RKVFVRPSDMPEPDAEHRKNVFIASYHLTNPAIFGKSVHEIASQSHHHFVISRIWRNGRV SIPTSDKTLQKDDIILVITAPSEADGLRLIFGEQELKDWNAENIDWNAVDSQLISQRILV TRPEINGRKLSSLRLRNNYGINISRVYRSGVQLLATPDLRLQMGDRLTVVGEAAAIQNVE KILGNAVKNLDEPNLVAVFVGLILGLVLGSIPVSIPGISLPVKLGLAGGPIVIGILIGTF 
GPRLHMITYTTYSANLMLRALGLSMYLACLGLEAGAHFFETVMRPEGALWLLLGFLLTFV PVVLLGVFALRVLRIDFGSVSGMLCGSMANPMALTYANDTIEGDNPSVSYATVYPLCMFL RVVIIQVMLMIFL >gi|313158042|gb|AENZ01000051.1| GENE 48 54621 - 55994 2031 457 aa, chain - ## HITS:1 COG:BMEI1029 KEGG:ns NR:ns ## COG: BMEI1029 COG1538 # Protein_GI_number: 17987312 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Brucella melitensis # 51 453 41 421 456 78 21.0 3e-14 MKTSTLFAALSAAAMTLCTAAAAAQTVQAAPQAATPAPGTPWSLDDCIGYALHNNVGVQQ QALQVEQTDVKLSTSKYSRLPDLSASVGYNATFGRGTSDDNIRKTGTIQSGSFDVGASMP LFDGFRINREIKGGKLDLAAAMQDLERAREDLSINVMTLYLQVLYNKELTQIAERQLELS TQQAVRSRELVAAGKQPESARYESEALQANDELNLTQARNDLRLALLNLSQALNRESAAG FDIVAPQFDSVALASLHMLGTADDVYAYATENRPHIKAERLRLESAENAVRIAKSALYPS LSLRGGYGTGIYSTQEAAFGTQFRKNSSEFVGVSMSVPIFNRRATHNSILSARIAMRKQQ LAVTDAEQSLRKEIEQAWYNADAAYGKYRSADAALTSARVAFAYEQQKADAGRSTVFDFN DAKTRMEKAESELVQAKYEFVFRSKILDFYRGRPLEL >gi|313158042|gb|AENZ01000051.1| GENE 49 56013 - 57122 1862 369 aa, chain - ## HITS:1 COG:VC1563 KEGG:ns NR:ns ## COG: VC1563 COG0845 # Protein_GI_number: 15641571 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 44 318 51 337 338 135 28.0 2e-31 MKKFLKIFLGVLFAALLLGTFWFLWQKTRPVKVVYTIVQPKLDTLKQYVVATGKVEPRDE VLIKPQISGIVSDVYKEAGQMVRKGEVIATVKVVPEMGQLSSAESRVSVAEISLDQTRRE FDRTEALHKKGVVSDEEYEQGRTALRKAQEELQNAKENMEIVKNGITSRYKELSNTQIRS TIDGMILDVPVKVGNSVIQSNTFNDGTTIATVADMSNMLFRGNVDETDVGKLHETMPVKL TIGALQNVELDALLEYVSPKATEDNGVIMFEVKAAVKIPENVFVRAGYSANASIVIQSRE GVLTLPESTVEFEGDKTYVYLLTSADGAEEQTFDKHEVKIGLSDGINIELTEGVTAESKV RGAREEKKK >gi|313158042|gb|AENZ01000051.1| GENE 50 57148 - 58407 1772 419 aa, chain - ## HITS:1 COG:slr0594 KEGG:ns NR:ns ## COG: slr0594 COG0577 # Protein_GI_number: 16332318 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Synechocystis # 19 417 
18 405 407 130 28.0 6e-30 MRFFDTDRWNEIWQTISRNRKRSIMTALGVFWGIFMLIVLLGAGMGLGRLFRAQLGGMST NTVLLQSSRTSVPYKGMPTGRWWRMDNDDLEAVSKLEEVEYTSGVIWGNELHCSYKERKG DYQMMGYTPDYQKINPQKIIAGRYINEVDMVHKRKVCVIGTQVQKDLFPGEPDPTGKVIK VGGSYFTIIGVMRRESSAMSFSDVERTVVVPISLAQQMFGYGRTIHLLALAGYKDVPSKQ VEKAAREAVFARHMISPDDEKAAWTMSAGEMFDKVMSLFWGIGLLTWIVGLGTLLAGIVG VSNIMLVLVKERTQEIGIRRALGAPPTAIISQILSESFILTFIAGILGLTAAVGVLSVVD SVYYQAVTVAQEGFEVSWQISFGTGMLALFILIAGSLLAGVIPAYRALSIKAVDAIREE >gi|313158042|gb|AENZ01000051.1| GENE 51 58414 - 59670 2111 418 aa, chain - ## HITS:1 COG:slr0594 KEGG:ns NR:ns ## COG: slr0594 COG0577 # Protein_GI_number: 16332318 # Func_class: V Defense mechanisms # Function: ABC-type antimicrobial peptide transport system, permease component # Organism: Synechocystis # 2 415 5 405 407 132 28.0 1e-30 MRELLLEIWTSVRRNKLRTFLTGFSVAWGIFMLVILLGSGNGLKNGVLSNFGNYATNSIT ISGGWTSKAYAGYNKWRSIDLKNRDLEILVNEFPEVDDVAAKAWYRGSTATYLDRFADIS LRGSLPKIQQIEGVEIVKGRFINQADIHNYRKVIVIDEALERILFRGDDPLGKPVKIHNN IYRVAGVYRGDAQQRNSMGYIPLTTGQLIFKANDPVVDNAIITVHGLDTDAQMEAFEQRI RKRLAAEHHYDPEDQSALWIRNTMEQYKSMMIVFGGINLFVWIIGLGTLLAGIVGVSNIM LVTVTERTSEFGIRKALGAKPVSIIRLILTESVMITAMFGYIGMVLGVAVMEAVNYVINQ TPAQTGNFGGSIFLNPTLDLGVAVSATVVLVIAGLIAGYVPAYRAAQLKTIDALRYNK >gi|313158042|gb|AENZ01000051.1| GENE 52 59667 - 60356 325 229 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 218 1 217 245 129 36 7e-29 MIKLENINKTYDNGSPLHVLKGIDLTVEKGELVSIMGASGSGKSTLLNILGILDDYDSGS YYLAGRLIKDLSETQAAAARNNMIGYIFQSFNLINYKNAVENVALPLYYQGVSRRKRNAL ALEYLDMLGLKEWAEHMPNEMSGGQKQRVAIARALINKPQIILADEPTGALDSVTSQEVM NLLRSVNVDMGMTIICVTHEQSIADQTDKIIRLTDGVISSITETGLATR >gi|313158042|gb|AENZ01000051.1| GENE 53 60496 - 60753 178 85 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCIFKRFFNHFNNFYGQEIRFRPARQRPFRQIPHAFRTEIPALQRKRTAYKHRSQRLILF LQYYVAQSINILPIFALINRRLIGV >gi|313158042|gb|AENZ01000051.1| GENE 54 60797 - 63625 4041 942 aa, chain + ## HITS:1 COG:no KEGG:BT_3633 NR:ns ## KEGG: BT_3633 # Name: 
not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 21 942 28 945 945 1001 54.0 0 MRKLLLLLFTLIPAVVWAQRYDLSGRVLVKGTDKPIAQAVVELPQAGLWAVADGDGNFTV KGVPKGATRFVVSCLGYATTETELDVRAGMGRILLYAPEDNLKLESVVVTAKEAPNAMAT SRTIGGNAIDHLQMVNASDISSLLPGGKTINPDLTKDNVFSLRSGGSAAGNASFGTAVEV DGVRISTNASLGEMSGASTRNIASTNIESVEVVTGVPSAEYGDISSGVVKISTRKGKTPY TLVLSTNPRTKQISASKGFDLGTDRGVLNSSVEYTRAMKNPTSPYSSYSRTGIALNYQNT FAKVLRFNFGVTANIGGMNTEDDPDAQVGEWEKVRDNALRANTSLKWLLNKPWITCVDFD ASLSYADKLARRHTYQSSATETPAVHAESEGYYIAEMLPATFYTTRNIDSKQLDYAANLK ATWVRSWGDVHSNAKIGASWRANGNVGEGEYYAAPPLAPDGYRPRPYTDIPYMHNLAAYL EETLTVPLGTTSLQLMAGVRAEKTFIKNTQYENTSSLSPRLNLKFRINDHVTVRGGWGIT EKLPSFNVLYPLPEYRDTRVFSNTYSPNRSVYVYHTQPYQILYNDNLRWQRNRNAEVGVD LRIGGTSVSLVGYFNRTKYPYELSAAYEPFSYKVMGLPSKMPDGSSFQMPSNPLFRVDSQ TGEIFVRDKDDPSAGWIAMETTSTKRTFVKNTYQDNGSPVDRMGLEFVVDFPQINPIRTQ LRLDGAYGYTKYVNKGEASYYPSTTTGGEFFPYVGIYADNGGSSVVTYNGRKTHALDANL TATTHVPAIRMIVTLRLEASLVKRSQNLSEYDGREYAFNVDEDRNPTGGSIYDGNSYTAI WPVAYLDLDGNRHPFTDAQKNDPAFSSLLLRSGNAYSFNGDGYDPYFSANLSITKEIGDH VSISFYANNFTNSRPFVASYASGVKVVFTPDFYYGLTVRLKF >gi|313158042|gb|AENZ01000051.1| GENE 55 63631 - 64860 1532 409 aa, chain + ## HITS:1 COG:no KEGG:BT_3632 NR:ns ## KEGG: BT_3632 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 12 409 16 411 411 333 45.0 6e-90 MKKLFCILSCLLAAGCTSFEGNPYGDTLRSLSVQVVYPEEYASFLREGVPVKLTDRNSSN VYTALTDARGVAAFDVAAGHYRLSVLDRPDASSVFNGAVEQVDLAGADRNVSVELKYAKP GTILIKEIYSGGCPQDPPATGSYADDKYIVLHNNSFDTYYLDGLCLGMVAPYNSNANNPW TSTDPSGNIVFRDYAAMPDCIWMFPGTGTDFPLEPGEEAVVAYYGVDHTQTYSQSVNLNR KGYFVLYDMVHYPGNRLHPTPTPGDQIDESHYMKVLKKTGTNTAVVYVISQNSPAVILFR APGDFDLDAYLANDLESAIASGSVVYSKIPWEWIVDGVEVCNLSEATRNKRLHTDVDAGY VGFSAKAQGHSLYRRLDEEATASAGFDRYVDTNNSSNDFYERETQSLRE >gi|313158042|gb|AENZ01000051.1| GENE 56 64864 - 66363 1828 499 aa, chain + ## HITS:1 COG:no KEGG:BT_3631 NR:ns ## KEGG: BT_3631 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 
1 499 4 502 502 399 40.0 1e-109 MRRILTLLMIIAVSGAAAQERFDYVFRRNPWNGGPNAAGIRQDSLSRSYAEIYFTKETGG MTGHSSSDDSWNAGARTESVRHLKKVSFAGGFGYDYFDGRNMCGSMFSQPGYYPVDILEF TPGRKIREDYSFMGGVSAVLGRRWTGGLRVEFEAQNYAKRKDLRHKNTRLDFEFSPGVMY HAGRFAAGAVYIVGTNSEKLEAEEIGSTPESYQAFFDRGLGYGSLQLWESSDMHLTTSGV TGFPVRENTQGAGVQLQYGPVYAAAAYRNRQGETGERGMIWHEFGTHQVTANAALSLGNL QRRHFVRLHLDWRSLQTDENIITTETVNGVTKPYIHGSTPIFGRKGLDLGGEYEFANRHT DLRAGVAYSQLNRESTLMYPYVKGQKLHFTKVYAELIRAFGFWEVTLAADYRQGGFEECE KQFETSGEQPGEYPGQLTGYYNYETEYLTAKRLGAGLGVRRNIERFYVELYARYEHGFDL QYVAQPNRVRATVSVGYNF >gi|313158042|gb|AENZ01000051.1| GENE 57 66381 - 67979 2005 532 aa, chain + ## HITS:1 COG:no KEGG:BT_3630 NR:ns ## KEGG: BT_3630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 532 4 539 541 266 35.0 2e-69 MKKILMIAAAAAVCFAACSKNEGGVPGDRTLRITPTIGNPSLTPAAMTASTRATDTDFEL GDKVGLKVTMTADGKDFVSNKPMTFDGTDFKTEGFLWYEDINATSTLFAYYPWQEGGDIP AEFSVKADQSGDNYTASDLIVAVKAGVKPTLSSTQMTFKHKMSRIVIDVTNESGFDITNI VIKGAVGTGVLDPATGNFTAKADAEASDLIANTATANKVYYALLVPQNGVKLMVTVTTAD GKKRTQTLGTADFKSGENRRMECNVQPADIEVKFSGPITGWVDGDDLLPDGEGEVETPTV EWGGVKYEIVTLKDGRTWMAENLRYVPAGKTISSDPKDGSGVWYPCNLSKAADPSLVESN GLLYSYPVMLGMTGDMTGDNFDQYEGARGICPEGWHIPTMAEWLKLAGVGSGNLTDPTSP YFEQTQSGGSIVKLNKDGFNIAGCGYVNAANVTATPAYMATASAADASAFGMGYFPSSTG YKVTYNTADDPASGIKNIQYYAGMITYNKRFNRITVAYQGGYCAAPVRCIKD >gi|313158042|gb|AENZ01000051.1| GENE 58 68023 - 69975 2834 650 aa, chain + ## HITS:1 COG:yccM KEGG:ns NR:ns ## COG: yccM COG0348 # Protein_GI_number: 16128958 # Func_class: C Energy production and conversion # Function: Polyferredoxin # Organism: Escherichia coli K12 # 29 295 65 288 357 96 28.0 1e-19 MKRKPNYLKHILQWGVLAAIAGTVLWANFSEKPVDVEAYCPFGGLQAFGTYLVNNSLACS MSMLQIMMGLVLAVGVILFSKLFCGYLCPLGTVGEWMGRAGKKLHLQVEVPSGSIVDKLL RVVKYVLLFTILYFTLSSSELFCKKLDPFYAVATGFKGEIVLWMSLTSLTLLLLGGFVVK MFWCKYICPLGAASNIFKFTLLFVIAALGGWILGMLGVANAWVWTIGGACLAAYVVEIVK MRSCVFPLMYIERDIRTCNNCGLCEKKCPYQLPIHDYVKVKHVDCTLCGNCIGSCTKDAL 
QVNGRRSLRWVPGLLAVVLFFIAVWMGSTTELPTIDEKWGDYEQVENLQTFEMEGLQTIK CFGSSKAFSAKMQTVPGVYGVKTFVRRHGVEVLFDPAKTDTLKIQAAIFAPTLRKYAMPG ENVPMLDVVKLGVEGLHDRMDMIYFGMVLQKIEGVYGFTSEFACPVDVTVYADPAAGITE KMFEEAIDAEELVIPAKEGEKVIPMHTVLKSYAVAGQVSREEFAQIMFRDVEKQAGRFIA NIEKWGDDEQFPKAVYEMAFPGIEKMPIRNAFPYFKSFLSCSEGIVSVDFVLRDLTPVMR IHYVKSMWNDEKLWKEIFQAEKWMLRMADGTFKEADPRLKFTNPGKTVTE >gi|313158042|gb|AENZ01000051.1| GENE 59 70032 - 72203 3023 723 aa, chain + ## HITS:1 COG:no KEGG:Fluta_3976 NR:ns ## KEGG: Fluta_3976 # Name: not_defined # Def: peptidase S46 # Organism: F.taffensis # Pathway: not_defined # 1 723 1 708 708 458 36.0 1e-127 MRRWIACLAVVLLCMQTAVADEGMWLINRLGEIYPQMKSKGLKIKDKEIYNEQTSALADA VVAVDGGMGTGSMISDEGLMITNHHVAFSDICALSTPEHNYLETGFWARTRGEEIPVAGK TVWFLRKVVDVTEEVEAIRNGMMAEGKWGIMGMRRVYKEIEDRYAAQTEHEVSCYSMWGG KMYLLFYYDVYKDVRLVGTPPITLGAFGGDHDNWGWPQHKGDFTLYRVYADAEGRPAEYS AGNVPLKPRRVLRIATGGVHDGDFAMVIGFPGHTNRYASSFAVAEKQRVKNPVVVANRHD RMDILKRHMERDPKVRMKYSDSYFGLSNYADYAKWENKCLRRFDVVSIRKAEEERLQRWI EADSARRAEDGGLLADLERGYGGRRGAERSLNYFREAWLGPSEALLVANRVSSYLGKLDR LKLDSLKIDSKDAQSVVAGSGRLRRNYDVETDRDLLAKMIVNFTKHVPREMWGEQLTEMY DAAGGDADRMAREAFDASFCSDPDRYDAYFERNRSVAEMRRDPLVRLTESVRVQRFTGGV DKAERRVRAQVGKAESRYAGLLYDFRASEGIAQYPNANSTMRLTYGSVVPLNPSDGVHYD SRSTIAGYMEKYNPDEYEFRVDDRMKRLIAAEDWGRWGEKGTLYVNFLTDNDITGGNSGS PVLDGRGRLIGLAFDGNRESMAGDVWFHPDLARTVCVDIRYVMWIIDKYAEADWLLDEMK FEK >gi|313158042|gb|AENZ01000051.1| GENE 60 72434 - 73081 912 215 aa, chain - ## HITS:1 COG:CAC0418 KEGG:ns NR:ns ## COG: CAC0418 COG0546 # Protein_GI_number: 15893709 # Func_class: R General function prediction only # Function: Predicted phosphatases # Organism: Clostridium acetobutylicum # 3 212 2 210 216 189 45.0 5e-48 MHWKYLLFDLDGTLTDPMQGITRSVQYALRHFGIEVEDLTTLCPFIGPPLRDSFREFYGM TSAQADEASEKYGEYFSRQGIFENKLYAGIPELLSALQAQGATLAIATSKPDFLAERIAR HFGFRDRFTLICGGTAYGPCSTKAGVIRETLAQLRIDDPSQAVMIGDRRFDIEGAAAAGV ESIGVLWGYGSREELEKASPGYLAADIPELRRLLA >gi|313158042|gb|AENZ01000051.1| GENE 61 73238 - 74530 1678 430 aa, chain - ## HITS:1 COG:FN0621 KEGG:ns NR:ns 
## COG: FN0621 COG0427 # Protein_GI_number: 19703956 # Func_class: C Energy production and conversion # Function: Acetyl-CoA hydrolase # Organism: Fusobacterium nucleatum # 2 423 11 432 434 309 40.0 7e-84 MQFTTAAEAAKLIRSHSSVYIQGSTSIPEVLVQAMTDRADELTDVKIYSGFAVGRADAPY CRPEYKDTFLVNSLFVANNIRRWLAAGYGQSIPAFLGEIPGLFRDGTLPVDVAILNLSRP NEEGYCSFGVSADLAVSAVECAKTVIAQVNTAMPFSYGDAVIHLSRVAAAVEVDDPLVEV PSATPSEIESRIGGYIAELIPDGATLQIGVGGIPNAVLAALTGHKHLGLHTEAMTDGVLP LLESGVIDNSLKKVMPGTTVASLALGSRRLYDYMDYRKDLLMKDVAWTNDPFRIRENPRV MAINSAVEVDLTGQVCADSVGERIISGVGGQHDFMYGGALSEGGKTFIAIPSTTPKGESK IKALLTPGAGVVTTRHMVQHVVTEYGVAHLRGRNLAERARALISVAHPSVREELEKAACE RFGYTFLRIK >gi|313158042|gb|AENZ01000051.1| GENE 62 74535 - 75848 1796 437 aa, chain - ## HITS:1 COG:Cgl2547 KEGG:ns NR:ns ## COG: Cgl2547 COG0436 # Protein_GI_number: 19553797 # Func_class: E Amino acid transport and metabolism # Function: Aspartate/tyrosine/aromatic aminotransferase # Organism: Corynebacterium glutamicum # 58 279 30 236 369 65 28.0 2e-10 MERLPLDRSVLDSALERMDIADIAQATIRQSGDIARILENETGTEFLHLEMGVPGLPPEQ VGVEAECKALRQGVASQYPSMSGIPELKEQASRFIRAFLDIGIAPQGCIPTVGSMQGSFT LFLLCSQLDPKKNKILFIDPGFPVQRNQVHILNIPHASFDIYEYRAEKLGPKLESYLAQG DVAAIVYSNPNNPAWICLTEEELRTVGELASRYDAIVIEDLAYLCMDFRKPLGRPFEAPY QSSVARYTQNYILMISGSKIFSYAGQRIAVAAISDTLYTRQYPALRERYGMARLGDAYVL TILYAASSGTSHSAQYALAAMFRAAADGRLDFVAETSEYARRARLTKELFLRHGFRIVYD KDQDEAVSDGFFYTVGYGDLTSSELLSELLLYGICAISLTTTGSRQPGIRVCVSQLNRPE QFDLLEARLTAFGQNHR >gi|313158042|gb|AENZ01000051.1| GENE 63 75848 - 76432 888 194 aa, chain - ## HITS:1 COG:MA1023 KEGG:ns NR:ns ## COG: MA1023 COG1014 # Protein_GI_number: 20089898 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 2 185 9 193 200 115 37.0 4e-26 MKKDIILSGVGGQGILSIATVIGKAALDAGLSIKQAEVHGMSQRGGDVQSNLRIASGEIA SDLIARGVADIIISLEPMEALRYLPWLSKEGWVITNTAPLINIPNYPDTEALRRELDALP 
HVVALDVDAIAKSAASPRAANIVLLGAAAPFLGIDAEKLEAGIRAIFARKGQEIVEMNLA AFRAGYQYAQKQTE >gi|313158042|gb|AENZ01000051.1| GENE 64 76514 - 78115 1994 533 aa, chain - ## HITS:1 COG:CAC2001 KEGG:ns NR:ns ## COG: CAC2001 COG4231 # Protein_GI_number: 15895271 # Func_class: C Energy production and conversion # Function: Indolepyruvate ferredoxin oxidoreductase, alpha and beta subunits # Organism: Clostridium acetobutylicum # 4 528 2 519 584 375 39.0 1e-104 MAQRELLLGDEALALGALHAGLSGVYAYPGTPSTEITEFIQGHPLTAERGVHCTWSANEK TAMEEALGMSFAGKRALVCMKHVGMNVAADAFVNSAMTGANGGLVVVAADDPSMHSSQNE QDSRFYGQFALVPTFEPSNQQEAYDMVQAAFDFSEKYRIPVLMRLTTRMAHSRAVVEVAG VRAENECSYPAQTRQWVLMPGNSRVRYNTLLDDYARFETEAAESRFNRYTDAPDKSVGII ACGIAHNYLTENYPDGCPHPVLKISQYPLPASLVRRMASECGSLLIIEEGQPVVEQAVRG ILPAPLDIRGRMTGQLPRTGELTPDSVRAALGLAPHATHAASQIVVPRPPALCQGCGHRD VYTALKEIADEYENAKVFSDIGCYTLGWLAPFHAIDTCVDMGASITMAKGAADAGVYPSI AVIGDSTFTHSGMTGLLDAVNERSNITVIISDNLTTGMTGGQDSAGTGRLEQICAGIGVD PAHIRVVVPLPKNREEMKAILREEIAYDGVSVVIPRRECIQTAKRHNAAKQKQ >gi|313158042|gb|AENZ01000051.1| GENE 65 78107 - 78301 100 64 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCHKIVVLSNFYKSKKILIKIPELITKSYLIGVNYVKILVKTDGSACAVGKRTAAAEISA AAVA >gi|313158042|gb|AENZ01000051.1| GENE 66 78321 - 79328 1008 335 aa, chain - ## HITS:1 COG:no KEGG:Lbys_0372 NR:ns ## KEGG: Lbys_0372 # Name: not_defined # Def: inosine/uridine-preferring nucleoside hydrolase # Organism: L.byssophila # Pathway: not_defined # 4 328 3 335 338 243 42.0 7e-63 MFSKTILSLIASAMLFTGSVAAAEPVRIIFETDMGNDVDDVLALDMLYKYQDAGKIQLLL ISNNKGNAYSAPFIHLMNDFYGWSGIPVASSPAELLPRQKEKQPSYVEAVAASGVFPASS AETFDSVEKYRQVLAAQPDRSVVIVSVGLLSNLKRLLDSAPDRYSELPGRELVARKVRLL TMMGGSFDGQRRREFNIRIDIPAARAVFGRWPTDILVSPWEVGGQIFYRCAALDKIGYAA QHPLKMAYASYLPMPYDRECWDQTAVMAGVEEPAKWFGLSSRGEVEVDEEGRTKFTPAPN GRIRILQTTPGQRKKAEKYIMKLVGQTPEKYTNRK >gi|313158042|gb|AENZ01000051.1| GENE 67 79341 - 80765 1632 474 aa, chain - ## HITS:1 COG:no KEGG:Phep_2210 NR:ns ## KEGG: Phep_2210 # Name: not_defined # Def: sialate O-acetylesterase (EC:3.1.1.53) # Organism: 
P.heparinus # Pathway: not_defined # 23 472 24 466 466 410 44.0 1e-113 MKTKHLLALLLLALLPLPLTAGIRLPAVIGDNMVLQQQADVLLWGEASPRCNITLEPSWS QSVYTTRTDDDGNWKITLPTIAAGGPYEIAITAGNDTKTLKNVLLGEVWLCGGQSNMDVS FRGLANQPVADAADEILDSAYPDLRLFRVKRDFSLQPRHDCRGSWTVSSPESAETFSAIG FLFGRMLHQREKVPVGMIVCAWGGSEVEAWMSRENLMRFAGVRFRESADLRTANVTPTAL YNAMLLPISGYTIKGCIFYQGEANITDPAAYRTRFPAMVAEWRARWGEEFPFFYVQLAPY GYTNMGWSPDQTQAARFREVQQQCMSDIPRSGMVATTDIGAVHTIHPPDKKTVAKRLLYL AYARCYGYKGFECSGPVYKSMEIQGENIALSFDHAPYGVSSYGEPLAGFEIAGEDRIFHP AEAWLKGRSTVIVHSGEVKAPIAVRYCYKNYAYGNLYNNYGIVAPPFRTDDWNE >gi|313158042|gb|AENZ01000051.1| GENE 68 80762 - 81760 1114 332 aa, chain - ## HITS:1 COG:no KEGG:Sde_2315 NR:ns ## KEGG: Sde_2315 # Name: not_defined # Def: hypothetical protein # Organism: S.degradans # Pathway: not_defined # 3 331 56 381 392 111 26.0 5e-23 MPAGDWSYSHHPSVTYFQGAFYAVYSNGMRGEDEPGQRVMIASSRDFIRWSAPQLLAEPS DGDYGTKKILTPGGIRVCDGRLTVYYTENDNDGTSNARLNPQLYAVTSPDGRTWTRPVSL GLAIFPSHAPLRLSTGRLAMTCNRAFYYSDDPTGLRGWHKCGTAAQDQTGHVTFEQIRPS LCEGELFEHADGTLYCLFRNSSTPYDGYLWQAQSYDGGRNWTMPVRSDFSDNNTKSFFCA LPDGRYLYIGTPDNTRQGTRYPLVLALSEDGFNYDKCYILSDDRYTRQYEGRWKSGDYGY PFAYIREGWIYVIVSRQKERMDALKCKISDLR >gi|313158042|gb|AENZ01000051.1| GENE 69 81952 - 83103 1461 383 aa, chain - ## HITS:1 COG:no KEGG:Sde_2315 NR:ns ## KEGG: Sde_2315 # Name: not_defined # Def: hypothetical protein # Organism: S.degradans # Pathway: not_defined # 57 382 60 381 392 120 28.0 1e-25 MNKTIRCGLLLTAVTLSCTAGFGQQPTLFTNNYDPKAEIPKLDAEKSVIYDPVRTWTYSH HPSVAFFKGTFYAVFSNGPEGEDECGQRIMLATSEDFTDWSEPRVILSPGEGEFGQTKIL TPGGITVIGGKLTLYYTENDNDGVSNKRIKATLFAITSEDGRTWSEPVNLSIRVFPCHRP LTLSNGRIILTGNTLVYYTDDPSGTGKWNRCRMEVPKYQDGAVTFEQVHPSLCEGALFEH NDGQVFCLLRSTGKTYDGYLWQMQSTDGGISWTSPVKSRFTDNNTKSFFGNLPDGRYFYV GTPDTSRPGERYPLVLALSEDGYNFGKTFILSDDRYTQQYKGRWKGGDYGYPYAMVHDGH IYVIVSRRKERIEIIRIEIDALQ >gi|313158042|gb|AENZ01000051.1| GENE 70 83116 - 85089 1708 657 aa, chain - ## HITS:1 COG:MA4278 KEGG:ns NR:ns ## COG: MA4278 COG1262 # Protein_GI_number: 20093067 # Func_class: S Function unknown 
# Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 27 252 46 269 270 125 35.0 3e-28 MNRKTLFTALLLLLGRNVCSAGEPGPAIRFAEIPAGWFYMGSGGPGADYDEAPIHKVVIS RPFRMSVTEITNAQYEAFDPSHRALRGKQGFSSGDDEAVIFVDWHQADAFCRWLSEKEGR TYRLPTEAEWEYACRAGSCMHYSFGDRLPKAALKNQEEHHSFIPVDLTVAQSAPNAFGLY DMHGNVEEWCYDWYGPYFRGDQTDPVGYADGFYKVTRGGSHSTPVEYLRSANRMAMIPED KHFLTGFRIIEAPYPATKPVPASTAAEPVAQHVAHWVDMNDQPRYYTPIEYVIPPHISAA PMYPHNHCPAVTWCRNGDLLAVWFSTISEFGREMAIWHSRLRCGADSWDEASLLCKVPDR NMTGSSLRCDPDSGRIYLLNGVEAGGWWRNLALMAQTSDDNGATWSRPQIVSPEHTPGHQ VIAGMIRSSEGWLINACDGGPDNNEGSVIQISRDDGRSWTSPCGEKPEEFEAGRTGGMIA GIHACVVQLRDGSLMAFGRGNDVADAKGRRRMTRSISGDMGKTWRYEASPFPPISGGQRC TMIRLKEGPIVLVSFTHHPLRPPKDEDRMRLGGKIKSGMFAAVSYDEGKTWPVRKLITDG QYRFMDGGAWTGFFETDADRAEPRGYLTMTQSPDGIIHLLSSKNHYRFNLKWLEQND >gi|313158042|gb|AENZ01000051.1| GENE 71 85102 - 85515 540 137 aa, chain - ## HITS:1 COG:no KEGG:Phep_2281 NR:ns ## KEGG: Phep_2281 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 39 130 47 138 140 68 34.0 8e-11 MKKNLIITFITALLLCLISCGEVLEDVSFGVTPDSGNVYEAGKEVYFNFSGNPDYITFYS GEAGHKYEYAGKIDGEGTANYGIPVKAMNARADNYSYIYETAGEYDAVFVARNATFEGES KVVARLKITIAEPADNE >gi|313158042|gb|AENZ01000051.1| GENE 72 85536 - 87098 2155 520 aa, chain - ## HITS:1 COG:no KEGG:Phep_2282 NR:ns ## KEGG: Phep_2282 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: P.heparinus # Pathway: not_defined # 1 517 1 537 540 331 39.0 5e-89 MKKLLFIFIAGITLGSCNFFDVETAGFVSPETSYKDEESVQKALVGVYAPLSDLSFYGRD WFYAFNLQDDLSYYDRNYTKQELFLNNYTYTNATLNNLWANLYTGIDRANSFLEYIQGSP IDETLIAQYMGEVRFLRAYYFFTLSSLWGDVPLRLKSTRDTDMEALQMPSTPAAEVFDFI VTEMEDVVGQVRAADQLNGPGRISKSTVQGILARVYLKMGGFPLYKGKEAFEKAAYWARK VRSSRLHTLNPDYKEVFTNLSKDIYDLTYRESIWEIEFKGNNQDGHTTGGCVGSYNGVYN NQNDAYGFGYGYVSVTLKLFDLFDDPNDVRREWNICEYYYRNGEKLWRNLNNKQYVYCNA GKFRREYELSDTKDKDYTPINFPVLRYSDVLLMLAEAENEAYGGPTTLAYECINEVRRRA NSDYPAVAGLGQEDFRRLVIDERGRELCFEGLRRLDLIRWGLYVEAMTTERRAQVADPRW 
HANKSYADGIADYTEYRHNWYPIPSKELATNFSIKQNPLW >gi|313158042|gb|AENZ01000051.1| GENE 73 87110 - 90247 4740 1045 aa, chain - ## HITS:1 COG:no KEGG:BT_3332 NR:ns ## KEGG: BT_3332 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 38 1045 21 1053 1053 901 47.0 0 MIRIYNALRRNLRLWLILPVLFAGMAGPAGAQTKGHKVVGTVMGPDSLPLQGATVIIKEV NRGAVSDSKGRYVLSNVPPTASLVFSFLGMVSQEVPVGQKTEINVVLTATDARLDEVVVV GYSEVKYKDLTGSVGRANVGDMLKTDAGNIGEAFAGRVAGVQVTSNEGMPGEEMNIVIRG SNSITQSNSPLYVIDGFPLEDTSMGALNPSDIASISILKDASATAIYGARAANGVVIITT KLGTPGTTKVTFDASWGFAQISKKIDLMNAYEFVTLQQEIMTEADFNKTYLGEGGTIEDY RNVKSIDWQDQIFRTAPVQKYSVNLSGGNRSTRFSTTLSMLDQDGIIKNSNYSRYQGRAT IDHRFAKKWHVQATANYSRILQTGDSPSQTNYTGSANLLPNVWSYRPFVSVGDDDLINEW VDPSINPSQDFRVNPIYIVSDEYRKRINNYFRANAYVEYEIIKDLKLKVMGGFLTDDYSN EQFNGSRSRTGNKYRSDGVNASVQNKSTQQWVNENTLTYKTAFDKDEKHKFDILLGQTMQ GSRYKNTYMKATHIPNEHLGINGFGEGSPVVSTYNSAEWTLMSFLSRINYNYDNRYYFTA TFRADGSSKFAKNNRWGYFPSASAAWNISNEKFLSQNKVVSNLKLRASWGQTGNNRVSEY AYRTQMYATEGSEYPFDNSNTTGNIIVNLGNRNLKWETTTQWDAGVDLGLFRNRLTIAVD WYRKITSDLLLSADIPPSSGFSTNTMNIGKVRNQGWEFTLQTENFNTKNFRWTTDFNIAF NRNKVLALSHNQESIVNVISYNNAYITKIGKPMGMLYGYIYEGTYKYDDFDNVDGKWVLK ADIPDNGTGRSGVQPGDARFRDINGDGTVNADDCTIIGRGQPLHTGGFNNTFEFFGFDLN VFFQWNYGNDILNAARTQFVRGKSEVGYNRWKEYCGRWTPENPMSDIPRVGGWGSSQYST FEVEDGSFLRLKNLSLGYTLPSKITKKFFVQKLRVYFSAQNLVTWSNYSGYDPEVSTKNT ALTPGYDWSAYPRSRVYNFGINVVF >gi|313158042|gb|AENZ01000051.1| GENE 74 90330 - 92324 2009 664 aa, chain - ## HITS:1 COG:no KEGG:Bache_1739 NR:ns ## KEGG: Bache_1739 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 287 653 443 867 879 112 28.0 4e-23 MTIRKLLTLTVCTALLGCSDDPEQTQQVPPTTGFGNRLSGLTAIIPEAGVTDKSSTTRTG VAETGATVWLIDDAITVTDGLHAGTFDVATGIGTATGKFNGSLDFEQTPLYAVYPAVSAM NGFIAPIVISTMQNARTAGANNIMTARCEEVKAASESQFVFRKKAATAQLSFDFTSEGKY ASEKVNSIKITAEGVNFAGACDFDLTDPDAQLRGSSNSITYTFNEAPSLADKIEFTIGLA PCDLTQAANMYYLVSTDRYTFLFNKRPAQSSPEGSAISVSLPLDQFTATDSETPAEGEVK 
ITHELPALNLSAQGTANCYIVDKEARCSFDATVMGNGGEGIIPGGSFTDYLGNPLTAPAT IAPVSAELLWETTDGLISELVLSDGIVSFNTSAGKGNALIAVKDQQGRIMWSWHIWCTDL PADQVYMANKYGNSYTFMDRSLGATSALDDTGSLGMSYQWGRKDPIPGSSKFYDMTEPTL YGSVTKVSVAASDASTGTIANAIQNPVTFLKAADWLFAGRNNYLWGNPQGEEGLTATHQK SIYDPCPPGYRIPGKDAFSIFTKTGENTTDPAQHNVVNTSPTKRGWYLYYAATGSGEQAW FPYNGCRYYSSGALNRGSFYYYWTSAPSAGTKSCCLAFKNDWKEVDPLYTFQRGNAQTIR CVKE >gi|313158042|gb|AENZ01000051.1| GENE 75 92352 - 94127 2345 591 aa, chain - ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 5 591 4 588 588 806 64.0 0 MTTSHPKIGIRPIIDGRRGGIRESLEAMTMGMAQRVARLYSEELRYSDGSPVECVIADTT IGGVAEAAACTDKFRDSNVGAVLSVTPCWCYGAETIDMDPLTPKAIWGFNGTERPGAVYL ASALAGHNQKGLPAFGIYGRDVQDMDDDTIPADVREKLLRFGRAAVAVMQMRGKSYLSIG SVSMGIAGSMPDPDFFQEYLGMRNEAVDCSEIERRIELGIYDKEEFARAMEWVEKYCKPN EGSDCNPAHLVYSREQKDVVWEYVVKMTLIFRDLMHGNPKLAEMGFEEEAMGHNAIAGGF QGQRQWTDFRPDGDFPEAILNSSFDWNGIREAITFATENDTLNGASMLLLHLLTNRAQLF ADVRTFWSPEAVERVTGRRLTGRAAGGIIHLINSGSCTLDATGQQSDAAGNPAMKPFWEI SDKEAEACLQATTWYPAGREYMRGGGFSSHFLTRGEMPMTMCRLNLVKGLGPVLQIAEGW AVNVDNETFEIINKRTDRTWPTTWFAPRLTGHGAFRDVYSVMNNWGANHGAISFGHVGAD LATLAAMLRIPVAMHNLDADRLFRPSAWGAFGSDPEGSDYRACATYGPIYK >gi|313158042|gb|AENZ01000051.1| GENE 76 94162 - 95646 2415 494 aa, chain - ## HITS:1 COG:STM1128 KEGG:ns NR:ns ## COG: STM1128 COG0591 # Protein_GI_number: 16764485 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Salmonella typhimurium LT2 # 5 442 8 447 498 194 32.0 5e-49 MEITLLDYLVFFIFVGGVALFGCSFYFRSRKGAAAFTAAEGSLPTWVVGMSIFATFVSSI SFLGLPGDAYKGNWNPFVFSLSIPIATWLAAKVFIPLYRSVNSVSAYHYLEMRFGYWARC YVAVCYLLTQLARIGSILLLLALPLNTMFGWDIQTIIICTGIATLIYTLLGGIAAVVWTD AIQGIILIVGALACAAILTFTMPEGPGQLFEIAAAHGKFSLGSFSLSLTEPTFWVVLIYG LFVNMQNYGIDQNYVQRYMTTKSTAEAVKSTLFGGLLYIPVSLVFVYIGTALFSYYTARP 
ELLPAGTPSDQVFPWFIVHGLPTGLTGLVVASLFSAGMSTVATSINSSATIVLTDFAKRL SKKELTEKQNMGTLYATSFVVGTLGIVVGLLMMRIDGVLDAWWKLASIFSGGMLGLFLLG VVCKTVRRVHAVVAVILGLLTIAWMSLSPLINEGSPFYRFHSPLHTYLTIVFGTTVIFLT GFLLTTLNRRREAE >gi|313158042|gb|AENZ01000051.1| GENE 77 95666 - 96583 1347 305 aa, chain - ## HITS:1 COG:BMEII0862 KEGG:ns NR:ns ## COG: BMEII0862 COG0329 # Protein_GI_number: 17989207 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Brucella melitensis # 4 300 19 319 322 167 34.0 3e-41 MHPTKLQGIIPPMITPLKGDDELDREGAVRLTEHLVAGGAHAIFLLGTTGEAQSLTYRLR YEFVELVCRQVAGRVPVLVGVTDTAFIESIRLAEHAAKCGAVGVVAAPPYYFAPSQQELI EYYTALADALPLPLYLYNMPSHVKVFLEPATVKTLANHPNIVGLKDSSANMTYFQTLLYH LGDNPDFSLYVGPEELTGECVLLGADGGVNGGANIFPELYVAMYDAACAHDIARVREIQR RIMQISTSVYTVGKYGSSYLKGVKCALSLLGVCDDYLSYPYRKFRTEERARIRQALEALG ANCNV >gi|313158042|gb|AENZ01000051.1| GENE 78 96662 - 97717 1470 351 aa, chain - ## HITS:1 COG:BH2219 KEGG:ns NR:ns ## COG: BH2219 COG1609 # Protein_GI_number: 15614782 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus halodurans # 5 343 2 332 335 152 31.0 7e-37 MRTRKVTIKDIATEAGVSIALVSFVMNNKADGKGTYRVNKETAQRILEVAEKLNYQPNNA ARTLRSGKTNTIGVIVSDISNKFFADIARCIEDRAYKHKYTVLFGSTDENPQKLENLIEV FRNKGIDGLIIVPCEGADEIIRNVALQNIPVVLLDREVPDTELSSVVLNNRRAGVETTQA LIRQGFTRIEMISYSMNLSNIREREEGYRFCMTEAGLSDRINIHHLRHDKLGKIDEIVRD AKMRNVEAFVFATNTLAANGLTAIFRNGWRVPQDFAMACFDSNEAFDIYKTAVAYVRQPI EQFGTEALDLLIKNIEQKDQPINSIRIVLTPEIVESGIPGRETEISQLIEE >gi|313158042|gb|AENZ01000051.1| GENE 79 97812 - 99452 1782 546 aa, chain + ## HITS:1 COG:STM0103 KEGG:ns NR:ns ## COG: STM0103 COG1069 # Protein_GI_number: 16763493 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Salmonella typhimurium LT2 # 14 539 5 539 569 544 51.0 1e-154 MLRRDMQTQTKYVIGVDFGSDSVRCLIVGTADGAEIASAVAAYPRWKERLYCDPSLNRYR QHPLDYIESLEQCVRSALEKCGKEVADNICGISFDTTASTPALTDERGVPLALLPEYAEN 
PDAMFVLWKDHTALAEADEINDLAKRWEVDYTRYSGGAYSCEWVWAKMLHCLRSDPALRA KTCSWVEHCDWICALLTGDTKPETIARSRCAAGHKAMWHESWGGLPSPEFLSAVDPLLDI FRGHLYTRTVTGDECIGGLCPEWAARLGLRPGIAVGVGAIDCHVGAVGAGIVPGTLVKVM GTSTCDIVVGTYGEVGDRVVRGICGQVDGSVLPGYVGFEAGQAAFGDIYAWFRRIMAWPL SRIAQGDETLEERILGELTAEAESLPLTTGDPVALDWHNGRRTPDADPRVHGAIDGLTLA TSAPAVFKALVEATAFGSRAINERMLEEGVPIDNIIAIGGIARKSSFVMRTMADVMGMPI RVLDSDQACALGAAMFAAVAAGIHGSVAEAQQAMRPGFSAEYRPDMERHAIYDYLYARYL KLGAAR >gi|313158042|gb|AENZ01000051.1| GENE 80 99453 - 102299 3721 948 aa, chain + ## HITS:1 COG:no KEGG:Fluta_0313 NR:ns ## KEGG: Fluta_0313 # Name: not_defined # Def: hypothetical protein # Organism: F.taffensis # Pathway: not_defined # 57 932 65 938 963 417 31.0 1e-114 MTKRFFLLAALAGVWSVSAQNGTLHLRLVDRVTREPVVGAVAELRSQADTARTPFYAASD IDGAALLQRVPAGAWRLRVTSLGYEALERELRTSGGKTDLGTLEMSPGAEAIESVVLEVP ALRSSIRGDTLSYRASAYKVTFGADAGALIGKMPGLEVADGVIEAQGRTVQRVFVDGREF FGNDVMSAVRNIPADMVESIDVYNSQGDQSEFTGVDIGDGYTAINIVTQPDKRRGAFGRL FAGYGIPDKYIGGGNINIFDRARRLSVIALVNNVNMQNFSSEDILGTTEQGQANARSGSG NFMVRPLDGVSTVQAVGANYSDEWGEKAKITASYFFNRADNRNESLTDRQTFTSSEKLVL YDGATDARIENVNHRFNSRFDYKFNNRHSLMMRTAFSVQDYLLDNETFSRTDNKFADDDI RFVNRRRNFAHNDNFGYNVSNNLIYRYRLPGSKLRSLTFGVGGRYSGGEQFSLPRQYTFR DPDDIEADTTDYDARNISRTNREQPGYYVSGNAAYTHAFGRRSRMSADYRVTYAANRVNR RTVLFDNKTGMFGPEPDPRQSTDYDYDYLTQRAGLSYQYLFKKTKVAASVYYQHVDFGGD YTLPVPDRTSASFDNITYNVVGNIHFDRSNLLKIDASSRTRNPRAADLQSIVNTTNRQHV FAGNPGLKPVYTHDLSAQYIRSNAAKGRTFTFAVRFSASPNTIADSLVIDSPHFVIVIDG DGTELGEGNQFVRPVNLPGYWNLRTTLSYGFPVRWLRSNLNVRAGVTTGRIPSVINGTRN RLNGDSYDAGLTLGSNISESVDFKIGYTGYYNVSRNSSQIRTVDNVYVSQYLTADLNFVV RQRIVLRGSADYNYYKGITDTFREERLICNLQVGCKLFRRRLGEVTVGVNDLFDQNDTTF RRTVTGTYIRNVSNLGLGRYFLAQFSYNLRLFPRQGAAVTRILQQGVE >gi|313158042|gb|AENZ01000051.1| GENE 81 102402 - 104003 2115 533 aa, chain + ## HITS:1 COG:mll7612 KEGG:ns NR:ns ## COG: mll7612 COG3119 # Protein_GI_number: 13476324 # Func_class: P Inorganic ion transport and metabolism # Function: Arylsulfatase A and related enzymes # Organism: Mesorhizobium loti # 35 506 5 425 509 122 27.0 
2e-27 MRQNHLSPALLCGVAAAVPAACAAQQGAGGRESGRPNIVYIMTDDHAYQTVGAYGHPISR LAPTPNIDRLAREGMLFREAFVENSISAPSRATLLTGVYSHRHGQTTLSYGIMDTTLVHF PELLRAEGYLTAIFGKWHLSVEPKGFDHYDLLWDQGEYYNPVMRTPETGGRYVRQEGYAT DIITDHALAWLDARRDDDAPFCLMIHHKAPHRNWMADLKYLDLYEDVEFPEPETLFDDYA TRGDQMRQQQLTIDRHMGYAFDFKVEELKDEPTLQYIHDSWDIAMSTLTPEQRRVWDDSY GRKNRDFLADRPEGKELLRWKYQRYIHDYCRTIRSVDDQIGRVLDYLEENGLMDNTLIVY TSDQGFLLGEHGLYDKRFMYEESFRTPLIMAWRGHIRPGTVCRELVQNIDYAPTLLDAAG VGVPDGMDGVSLQPLFRSGKARGWRTSLYYQYYDYPAVGSVRAHYGIRTDRYKLIHWFGP GADGDPDIDFWELYDLRKDPCEVHNVYEERGYLPIRRELARLLAEKRGELGIR >gi|313158042|gb|AENZ01000051.1| GENE 82 104275 - 105006 1215 243 aa, chain + ## HITS:1 COG:Cgl2832 KEGG:ns NR:ns ## COG: Cgl2832 COG2188 # Protein_GI_number: 19554082 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Corynebacterium glutamicum # 4 235 26 255 266 103 33.0 2e-22 MKLRLDPNSTQPLHQQAEELLRTMIQQEEYKKGKLMPNEVELSKELNISRNTLRQAINRL VFEGLLIRKKGYGTTVAPQNVMSNARNWMSFSQEMHAQGMVVQNFELHISRKHPSQEAAA FLNAPEGTKLLCLERLRGRPGLPFVYFISYFNPAIPMTGEEDMAMPLYENLQKNYGIVVK TSREQIYARAASPDLADKLDISEGAPILVRKRFVLDVNDKPVEYNIGYYRADSFTYSIEF TND >gi|313158042|gb|AENZ01000051.1| GENE 83 105449 - 108538 4258 1029 aa, chain - ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 310 1028 30 738 790 246 30.0 2e-64 MRKTLFSGIAACLLATAATAAEQSPSVIWTVGQADRSASEFALAPDRFRDFLANDFGYED KYFLVGHSVPAADFPYVLPGPADTWGGTWPTSGWRTHQINILFGLKEIPSDGTCTLVVDL LDYAKNFLPLVKVSVNTQDAKFQLDAPGLSVADQRKPNQMEKIVDTLSITGNLAKATPRR LEIPIRPGVLKAGGNEVTITVLEGSWILFDRVSLEGPAGTNVEQPQQLFIRNIEAAPYEL ADGKRRVQPLLVDLEWLEGAPALSVELDDKEILNQRIETGRYQLEAPMPQVKKAKKSAYR ILCDGREIARGSVVRTPQTLQTPADYVDTRIGTAHSRWMIAPGPWMPFSMVKMSPDNQNT GWQAGYQPSFENIGCFSHIHEWTLGGLGIMAANGELKTRVGDEQKPDEGYRSRIDKRTER ADIGYYSADLTDYGIKAEVTATTRCGFERFTFPADRGTGRILIELHPETEYGIQLKNVSV KRVGERRIEGSCHQFSPGVWSHDADQDYVLHFVVEFDRPILRVGSWKDDVVANDIRELSS 
GPCKYAGLFVEFDPVRNPVVQVRSGISLVSVENAAQNLAAEVTEPFGWDFEAVRRNQVDT WNDLFSRLTVKTNDRLEKVRFYNNMYRAICSRNTWSDVNGQWVSTDGKVHTVADPSEDVM LGCDAFWNTFWNLNQFWNLVTPEWSNRWVNSQLAMYDANGWLAKGPAGMNYIPVMVGEHE IPLIVSAWQMGIRDFDAKKALEASVKMQTTPAQKVFKGFAGNRDLTAYMEHHYVPYDKGR FSNTMEYSYDDWTVGQLAKALGDEETYRTFNDRGYWWRNVISDEGYCHMRDSKGQWLENF DPFRSGANQHYVEGNAWQLTFFVPQDVPALVEKIGKQRFTDRLLWGFNQDEAWRYNAPND QYWDHPVVQGNQQSMHFAFLFNYADTPWNTQRWSRSILDRYYGYGIANAYLGDEDQGQMS AWAIMTSIGLFQTDGGCRVDPVYEIASPIFEEVTIDLGERYGRGKKFTVKAHNASRKNIY VQSARLNGR >gi|313158042|gb|AENZ01000051.1| GENE 84 109256 - 112750 4374 1164 aa, chain + ## HITS:1 COG:Rv0648 KEGG:ns NR:ns ## COG: Rv0648 COG0383 # Protein_GI_number: 15607788 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Mycobacterium tuberculosis H37Rv # 568 919 149 512 1215 75 28.0 6e-13 MKKLLLLWLLTLATTVAVHAQRVEIYRQLPSDQITASASSTLRGSSATAAVDGAGMRGDL HEANNLGHGMWVSQASAETVRYSPSTREGVVWFLCQVGDKNSPPRQIDQIRIWNHNQNEH TRRGLNKVYIEYSADGQTWQLLPDGRLDYHVIPESVGRNPEPADLILNTPGLKARYICFT AAAGGEGNHYDRNDPVVMREAADMHQNPDYYGLAEIRFYTKERADVRTLAPVSELSFAAS QGYLKTPEGPSREFTLRFDNPIYAGADLAFECGGRTWSAEIAPSGVGVVRYDGLFPAGYM EETAKLVVRLTSRQGTVEKRFEVPAARKWTVNFLSHSHQDIGYTHRQMDVMKFQWRNLER AMDLAERTKDYPEGARYRWNTEATWAIAGYLEAYAGTDKAARLIQAVRDGVINIDAPLGS ILTGICRQEELMHMFDDAHRLAREIGVEVNTAMMSDVPGQVWGLATAMSKNGVKYYSPGP NYVPFYGKIGNDRAAALHVEWGDRPFYWLSQSGTDKVLVWQAGRGYSWFHGWLAGRLSVC GVEPIWNYLQELETDEFPYNTCYLRYTVHGDNGPPDELMPDVIRAWNERYDSPQFRITTT KEFFTAFEEQYGEYLPTYGGDMTPTWEDGASSTARETAMNRESAARLTRTGILWSMLSPE SDYPARELAEAWKNVLLFSEHTWGASASGPDPYSQFTKDLWAGKKMYADSADVQSRRLCD EAMAGITAGEGYVQVLNTNLWPRTDVVTVAADLTGKRLLAPSGEPVAVQRLHDGGWIFLA EEEPALSSSVYRIVPAKTSAKAKKSLEAPVSMIGGNVLDNGLVRVAVDPAKGTIRSLKAA GADYEYAAGEGLNDYIYSGRIAADPRGIDRVTRVEVLDDGPVAATLRIVSDAPGCNALWR DVTVYRGLGRVDIRNTVDKQDILEHESVRFVFPFNFAHPEITMDLAMSEMHPEREQLAGV NKHYYSLLNGMAVGDLEHAVCLTTLDAPFVELGTPSGEDYRLNPRHGYGWWPSAQISPVV YSWVMNNTWRTNYKASQGGVASFRYSLQISDPFDLKLKQRGAEREQPLIAVESGRPEPVG RLFRLEGRNRIAVSGIAPSADGTGYIVRLQNMGDQSVHSAFVWGRMKARRVSVCDYREQP 
LAPFDDRSFWMKPFEYLMLKVETE >gi|313158042|gb|AENZ01000051.1| GENE 85 112763 - 115105 3397 780 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 24 778 46 769 790 269 29.0 2e-71 MNRLFLSLISLAACMSSCTSPEPLTDYVDPRIGTAHSRWFFFTPAAVPFGMAKLAPTTDG HLGNPGGWQAVGYDARHTSIEGFANFHEFQVGGVVIAPTVGALQTVPGPLDDPEAGYRSR FDKKDEVARPGYYSVLLKDYGVKAELTATERVGFHRYTFPASEEANLIFNIGTKMGESGP ARDASVTFTEDGRIEGWVVTEPVYVDIYQKGAVVKMYFSAVVDAEPASWGAFCGEEVFAG ERSRTGVGAGVYLRFDTRARRDVGVKIGLSYTSVENARLNMQAEAADLDFDGALLAANEA WEEALGRIRVEGGKREDRVKFYTGLFHAVLGRGLASDVNGAYPANDGTVGQIPLDPAGNP LHNHYNTDAIWGGFWNLTQLWSIAYPEYYADWISSQLLVYKDAGWLGDGIACSKYVSGVG TNFTGLAIAAAYNCGIRNFDVALGYEAARKNELGSEGRPAGAGKLDVGQFVERGYSPYST ELHMQTTPRGSGFSASHTLEYSFSAYAVAQMARQLGHEADYERLSKLSEGWKLLFDPETK LMRPRDEQGRFIADFDPYAPWVGFQEGNAVQYTYYVPHDIDRLVEMVGREEFNNRLDSTF LISRENIFGGGKTIDAFAGLHTYYNHGNQPNLHISGLFNFSGKPWLSQKWMRTICNEFYG TEEIHGYGYGQDEDQGQLGAWYVMASMGLFDVKGLTAPDPSFQLCSPLFDKITISLNPDY YEGREFVIETAGNAPENYYIGSAQLDGKPLTGVQLPFADVVRGGRLKITLAPEPNTSLTE >gi|313158042|gb|AENZ01000051.1| GENE 86 115229 - 118471 2583 1080 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4504 NR:ns ## KEGG: Cpin_4504 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: C.pinensis # Pathway: not_defined # 34 897 53 925 1074 670 42.0 0 MKICTNFLNLLKVCMCVALAFAALETRAQGEAFTVSGTVKDASGQPVIGATVFDTTTQKG DVTSTTGSFSLQVVPGSVLKVSFVGYAEQSAKVVAGKTTYDFVLESDALEIEELVVVGYG TQKKASLTGAISAINNEEIITTKNENVQNMLTGKVAGLRVRQNSSEPGQFNTSMDIRGFG APLIVIDGVPRDNMARLDPEDIEQVSVLKDASAAIYGVKGGNGVVLITTKKGDKGRVNIN YSGNISWQRPSNFPDLVDAADWMTLYNEKYTMHSVDNMSPVPQYSQEDIAAYRNGEKKSY NWKDAVFRNSAPQTQHTVSASGGNDKVTFYTSLGYQYQESFLQHTPITYDKYTLRANINA KIAKNLTLDVNLAGHMDEKKMSNFSSSDIVRSTWLFTPLDPFYYDDEQTMYHTKDDNTGI VNPLAMIDKDANGYQSLISRWFQSAFSLRWDMPFLPGLYAKGFFSYDYIMNDNKFYRKAF NTYTSTGAATSFLKGDAHDYFVQRNYYGKEHRQWHVQVGYDRTFGRHSVSGMLLFEDQHK 
VGDNFYGSREVVLPMDQVFVGDGETQQFNQSSGGGSLYDYAYQSLAGRFNYDFGGKYLAE FVFRYDASSRFPSGDLRWVFFPSVSVGWRISEENFWKESSLDFIENLKIRASYGKTGYDG DLNYEFLTGFTYPSGGAILGGDYVNGSVPKGIANKDITWQTIKMFDIGLDFSAWNGKLGL TFDYFRRHRDGLYARRNLSLPGSVGASLPLENLNSDRDTGFELELSHRNRVGDFSYAVKG NVSFTRRKTLYYERAESGNSYLNWRQNNNDRFNSIWWGYGEGGRITSWDQLYYNPVYIGR GSVMGDYLYEDWNGDGWINDLDVHPIGYTDMVPMVNFGLTISASWKGIDFSMLWQGSGKP LHRRSRVSAGAALVAYQRHFGTHGPLASGRSDGQSLRSRHEVGRRRIRLYGLDGKSQLGA WVAECPLPASEESRNRLFAAQEVADEGGYRECPDLCQRLQPADDHRSGLSRPGVLHSSVG WRGQQYGILLSDQQDLYYRTERQILKKQKNETSEILSPDSLCGRIRRVHRFGYTAEKYLQ >gi|313158042|gb|AENZ01000051.1| GENE 87 118377 - 120365 2808 662 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4503 NR:ns ## KEGG: Cpin_4503 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 18 652 20 596 607 339 36.0 2e-91 MKRLKYYLLILCAGAFAACTDLDIPPKNIFNEKDIFGNVEGATSYLARLYGVLPMEDFRY SFEHGFGYSGAMYRQMCCATGEAVGRDTQGFFAEPGSNMWDAAYKTIREINLLIEGLPKY KSGFSDSDYNTVLGEAYFIRAYVYFAMVKRLGGVPKIDYVIDYPANATLEDTWTPRASEE ECWDFIGQDLDLAIAGLPDKNDSGRATRYAAAALKSRAMLYAGSIAKYNTVSEEYLGKRI CGIPAERAVDYFEAAYKASKIVDEGGYSLYKAKWAAGNRDAQAENFAALFLDESSANTET IFARYYKEKFLTHDYDNSVQPRSTSTGGNDSEVSPTVDFIEMFDGVELDADGKFGFLDEN GYYKLYDNPMDAFANAEPRLLGTVILPMSKFKDQYIEIRRGLYTGKVEEGKGISRLIKEG ETAAYETLGLKDVQLTGNFNGGPAVQLKTPYTPYPGATPVTSMLITGLNGPVNGWNFGNI GGTYLRKYLDPDLENNSGGKSTQHWIDIRYAEVLLNRAEAAYELTQLGKTTSEDGDSYRD EAFRCINLIRERAGADLLASAADLNSVDIVRRERRKELAFENQTYWDLRRWRIFYDEQNT TRYRILMPFFATDVNKYFFDVRFTEPRAGYNYIFTYDTRYYYQPLPSGELTKNPNLKNNP GF >gi|313158042|gb|AENZ01000051.1| GENE 88 120392 - 121153 1090 253 aa, chain + ## HITS:1 COG:no KEGG:Cpin_4502 NR:ns ## KEGG: Cpin_4502 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 123 1 124 235 89 40.0 1e-16 MKKYIYLLLFSLPLLAASCSKDNYDEPQETFRGKFIDKQTGEPFQTAIGNTGIRIRMMEY SWSDNPQPYDMNVKMDGTFNNTKVFKGEYGITPSGAFVPLEEERIKISGTVEKTWEVEPL LRVEWVGEPVVNADGTVDVKVKVSRGTDNPDYQEALAEAWLFVSENMYVGDFSYSPNYST RISGAAIGMVQFDQVYTIRTGQPGGYNPAGTYTPFPAFSRKYFLRFGARTTRQFDGTNRY 
NYTTVKEITTIAR >gi|313158042|gb|AENZ01000051.1| GENE 89 121234 - 122136 893 300 aa, chain + ## HITS:1 COG:HI0182 KEGG:ns NR:ns ## COG: HI0182 COG1940 # Protein_GI_number: 16272147 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Haemophilus influenzae # 12 300 13 311 313 66 28.0 7e-11 MISNDMERYAVGADIGGSHICSAVVDLTTGEICGEPLSGKVDCDAGAGEILGAWADNIRQ SIAASGLKAVRRVGFAFPGPFDYEHGVSLIRGVNKFERIYGLDISESLFPLLLRSGAEEF RYVNDASAFALGECLGGAACDARRVVALTLGTGVGSGFVSDRKLVTSGDEVPADGWVYCL PFGDGIVDEAFSTRGIIRRYEELTGETLTGAREVAARYDADPAARRLFDVYGEELAQFAG PVLTRFNADVLVLGGNISRAYPLFGPALERRFAADGIRVAVRTSALLDHAAMIGAASLFL >gi|313158042|gb|AENZ01000051.1| GENE 90 122145 - 123911 2411 588 aa, chain + ## HITS:1 COG:BS_yjdE KEGG:ns NR:ns ## COG: BS_yjdE COG1482 # Protein_GI_number: 16078267 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannose isomerase # Organism: Bacillus subtilis # 248 572 9 304 315 79 22.0 2e-14 MTLLRKTTQFVLPVEKPETGVNQYDIYPAHDLGEEKIFCDYMSLARRIAGSKRVIIDGYV GVRFDIFSRELNKALETLGIRPVWWNAGAAMKEPAEIDRLIEPYLGGDDPIFGFRTPLRL EEFFDREKLDRIRPDDAAQMNILIGIGASLAGWDGLLLYIDIPKNEIQFRSRAGSITNLG AAAADAPKKMYKRFYFVDWVVLNRHKKALLPEIDVMIDGQRETEITWTEGADLRRGLDRL AGNGFRVRPWFEPGAWGGQWIRNHIEALPHDVPNYAWSFELIVPENGLIFRSSGRMLEVS FDTVMFQAGERVVGKECYGLYGDEFPIRMDYLDNFDGGNLSIQCHPQREYIRKNFGEVLT QEETYYILDTTPDSVVYLGFKEDIVPEKFEQALTESFEKGTELDVNKYVNAEPSAKHDLF LMPPGTLHSSGRNNLVLEISTTPYIFTFKMYDWLALDLDGKPRPLNIRRAMENLCFDRKG ETVKRELIAHPELLEKGADWELWHLPTHRNHSYDVHRYKLNTSVEVQTEGKCHVLNLVEG ESARIETAGGDSFPISYAETFVVSAGAGSYRIVNTSGREIMVVKAFMK >gi|313158042|gb|AENZ01000051.1| GENE 91 123979 - 124332 447 117 aa, chain - ## HITS:1 COG:no KEGG:BVU_1310 NR:ns ## KEGG: BVU_1310 # Name: not_defined # Def: putative transcriptional regulator # Organism: B.vulgatus # Pathway: not_defined # 1 115 1 115 115 119 52.0 3e-26 MDKLCRIRDLQRAVHQFEANLERLYGICLNEGMTLCSLSKAGRLSCGELSDLLGLTPSNM SKVLRSVEEKKFVRRELGTADKRQMYFSLTDRGRQLLASIDCGKIETPDPITALLQK 
>gi|313158042|gb|AENZ01000051.1| GENE 92 124316 - 124519 226 67 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158171|gb|EFR57576.1| ## NR: gi|313158171|gb|EFR57576.1| 4Fe-4S binding domain protein [Alistipes sp. HGB5] # 1 67 1 67 67 117 100.0 3e-25 MALTIDELRCPQDHRCPLIPVCPVGAISQLGDGLPAIDPQKCIECGKCIRHCGRQAVHKK PEYGQTV >gi|313158042|gb|AENZ01000051.1| GENE 93 124527 - 124991 675 154 aa, chain - ## HITS:1 COG:BB0061 KEGG:ns NR:ns ## COG: BB0061 COG0526 # Protein_GI_number: 15594407 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Borrelia burgdorferi # 38 147 3 111 117 127 48.0 7e-30 MKKLLVTAACILTFAAGADGQTDNKNDNTKQTNTMKTIALTKADFLTKVANYETNPTEWK YLGDKPALVDFYASWCGPCKALAPVLEELAAEYGESIYIYKINTEEEQELAAAFGIRSIP TLLFIPKDGKPQMAQGALPKASLKEAIDKVLLNK >gi|313158042|gb|AENZ01000051.1| GENE 94 125612 - 126859 353 415 aa, chain + ## HITS:1 COG:mlr7741 KEGG:ns NR:ns ## COG: mlr7741 COG0582 # Protein_GI_number: 13476425 # Func_class: L Replication, recombination and repair # Function: Integrase # Organism: Mesorhizobium loti # 162 403 100 311 359 61 26.0 3e-09 MRTAIILRQDKTNQKGVLPVCMRITHERRSMYVTLIRVKPENWDASRGIVKKSDLQHERL NNELRRRIAEVNKVVVLCEAMEPERGIDAVRDRLHKRTTADMFDYAQNYLSRKENTSVRT YNKCHSHITKFKKFIGKDQFSVSHLTYELLLRYENYLREDLGNSINTVTTNMKTLKELTA EMYEEFRLDTRNNPFRKYKMKSTPTERPCLSEREYFRVRNLKLIMQGKLYDARQLFVFES ETGIRISDILKLRWQNYDGHYIQIIMDKTDRQLKIPASDLVKEICAMRERRYRTAGVPVT PDGFIFRGILPYEYDSMSRAEKVCAENCAVARINLALKRVAVLARVKKNLSTHIGRHTFA SRLVRSGQNMVVIRDLLGHASIRTTEIYAKIMQSQMNDAIDVLNKLNNHGYKRAE >gi|313158042|gb|AENZ01000051.1| GENE 95 126837 - 127085 60 82 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDTKEQNSRDWGDLLLTSQQVADVLHVSLRTLQKYRDEGRLNYVRISPRIIRYYPEEVAA LIRKSHCSTWLKDNCARLLKLK >gi|313158042|gb|AENZ01000051.1| GENE 96 128799 - 128900 68 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MFPSDISAIKRLILNTGSLSGAVSIGKQPMVFI >gi|313158042|gb|AENZ01000051.1| GENE 97 129173 - 129409 91 78 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158127|gb|EFR57532.1| ## NR: gi|313158127|gb|EFR57532.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 78 1 78 78 137 100.0 4e-31 MYKSATVLRVCGNFLITHLGFIIPTIKARNDKMKIRVNNIALMFEKYNYRTLETNNKSDL KNHLDLFEFEKVSSILFF >gi|313158042|gb|AENZ01000051.1| GENE 98 129636 - 129830 123 64 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158164|gb|EFR57569.1| ## NR: gi|313158164|gb|EFR57569.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 64 1 64 64 91 100.0 2e-17 MTPEEKMQLAEERERSIAIRDSLLKQQAEKSKILLPVVTDSLLISKYVPQFDSLYTELLT LDKP >gi|313158042|gb|AENZ01000051.1| GENE 99 130264 - 130446 86 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MELCFGYNKYKKSRVIYPAFVFLFCVNNIITKTAAQIQPTGPKHNKATKSMKIRHPLIIH >gi|313158042|gb|AENZ01000051.1| GENE 100 130988 - 131353 132 121 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158150|gb|EFR57555.1| ## NR: gi|313158150|gb|EFR57555.1| hypothetical protein HMPREF9720_0101 [Alistipes sp. HGB5] # 1 121 182 302 302 218 100.0 1e-55 MLKRKYLRAHGFRDFRGTGLMNFHYYSILKRQLKDFFENVSFRNFYLKYDEISQDVVKRY IFVTSADYQLYLNYLKKQVGEVEYRKEKRATINFLKRMDAYKTARLEREIKSKFAETLAE L >gi|313158042|gb|AENZ01000051.1| GENE 101 134341 - 134817 179 158 aa, chain - ## HITS:1 COG:no KEGG:Gobs_3736 NR:ns ## KEGG: Gobs_3736 # Name: not_defined # Def: hypothetical protein # Organism: G.obscurus # Pathway: not_defined # 7 157 4 154 195 115 38.0 7e-25 MKKQLPRNIYCLEGNWRTNPRCKQSVRPMLDILRDAAGIKYIYRKCNTREEFFEYLRQYT FERYRNYTILYIAFHGRPNKIQIGRDLVTLREIADVLEGFLAHRIVYFGSCSTMRTKRAN IDDFLHRTKADILAGYRKDVDFIQATAWEMMRFANFLY >gi|313158042|gb|AENZ01000051.1| GENE 102 134814 - 135716 441 300 aa, chain - ## HITS:1 COG:no KEGG:Tery_3084 NR:ns ## KEGG: Tery_3084 # Name: not_defined # Def: phage integrase # Organism: T.erythraeum # Pathway: not_defined # 133 280 22 172 189 90 38.0 9e-17 MTNEIIRSNRLELLKNEVFGALDVTDATRNEYRVRITHFIRYAETHGINRNSYLDYKRYL AGIDTFSVSTKNKYLIAAKIFLDGLHRLNLIPQKITDHVRGFTQCRLHRKEGLQDADIGK 
LQRYCSDLPPAPQNLRLRALVALFLFQGLRQIEVVRLDVGDIDLRNKTAFIRGKSRDDKE PIHLHPSTVRVLREYLNCYRFRSGALFRSDSNHCRGERLTTKSVREIIKRIFAELEIDGS THGFRHFFITKLIKSYKGELLMVSKYSRHRSIQMLEVYNDEIIREQDLPRFYEVFKEIRI >gi|313158042|gb|AENZ01000051.1| GENE 103 136099 - 137145 1100 348 aa, chain - ## HITS:1 COG:BS_resA KEGG:ns NR:ns ## COG: BS_resA COG0526 # Protein_GI_number: 16079372 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Bacillus subtilis # 215 330 46 155 181 74 34.0 3e-13 MKKILLTASTAALLCGCGGPNYSITGNAGLEPGDSVFLIGSGRTELAAGIVCADSTIRLQ GRVAEPEIARLANQERIPVGTAVIFLEPGDIRTVPTDDRRVYAVSGTPLNDRKREFDERM ADFDRKFRELPLDAPADSLYADYNKLVPESIEANLDNLFGAYLFSVYEFDGNDIAAAKTR LKQFPAAVQAHSILQRIAEKVAATENTEIGKPYMDLTLPDAAGEPVALSGIVGKGRWVLL DFWATWCGPCCREIPHLREAYAACKSKGLEIYGVSLDNDAAKWKTFVADNDMPWINVLGV SADKRSDAAAMYGISSIPANFLISPEGIIVARDLRGENIKARLEEAMR >gi|313158042|gb|AENZ01000051.1| GENE 104 137188 - 137778 1008 196 aa, chain - ## HITS:1 COG:NMA1631 KEGG:ns NR:ns ## COG: NMA1631 COG0817 # Protein_GI_number: 15794525 # Func_class: L Replication, recombination and repair # Function: Holliday junction resolvasome, endonuclease subunit # Organism: Neisseria meningitidis Z2491 # 1 148 10 156 180 107 40.0 2e-23 MGIDPGTNYMGYGVLEIDGRSVRSVVLGDIDLHKLSDPYVKLRYIFERVGALIDQYAPRE VALESPFFGENVQSMLKLGRAQGVAMAAALSRGLPVSEYAPMRIKQSITGRGSAAKEQVA AIVCRILSLEQPPKRLDATDGMAVALCHHFTISNPLNAAMGDERVKGLGGGKKAASKGGS QSWEKFLREHPDREIK >gi|313158042|gb|AENZ01000051.1| GENE 105 137871 - 138926 1767 351 aa, chain - ## HITS:1 COG:aq_152 KEGG:ns NR:ns ## COG: aq_152 COG0526 # Protein_GI_number: 15605725 # Func_class: O Posttranslational modification, protein turnover, chaperones; C Energy production and conversion # Function: Thiol-disulfide isomerase and thioredoxins # Organism: Aquifex aeolicus # 208 280 34 102 146 63 38.0 5e-10 MKKAFLFAASAAVLCSCSTQPKYVVEGDIAGLEGTVYLFQQDSLIDSAVVKSGKFRFSGP 
AGAPAMHYLLDSRDGQPQAFAMQLILEPGTISIKSDADDPQVRHTTGTPANDAAEAYTAA SRALITEYRDAATTDQRREAIEEEFEQLTRTSVEANRDNFFGVMLMPNMAYELSGQEILD QIAKFSPEMQQTKELTELKATAEQKKRTDVGQPYINILQSDAEGQIVTLTSVIENPANKY TLVDFWASWCGPCMGEVPHLKETYDKFHKKGFEIYGVSFDNNRDKWLAAVKDNGMKWIQV SDLNGFDNPAAKDYAVQGIPSNFLIDASGNIVAKNLRGEDLYKKVEELLGK >gi|313158042|gb|AENZ01000051.1| GENE 106 138944 - 139978 1625 344 aa, chain - ## HITS:1 COG:RSc1414 KEGG:ns NR:ns ## COG: RSc1414 COG1044 # Protein_GI_number: 17546133 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-3-O-[3-hydroxymyristoyl] glucosamine N-acyltransferase # Organism: Ralstonia solanacearum # 2 344 1 354 356 207 33.0 3e-53 MMEFTAEMIAGFLGGDIVGDKETKVHTVSSIEEGKAGSLTYLTNPKYETFLYSTGASIVL VNRSFEPSQQVSATLIKVDDAAACVLKLLEMYNAAKPRRSGISKLASVAEKAEVGADCYI GDFTVVEAGVKIGKNCQIYPQVYLGAGVTVGEGTILYPGVKVYEGCRIGRNCILHAGAVV GADGFGFMPNAAGGFDKIPQLGNVVIEDDVEIGANTCIDRAKTDSTVIRRGVKLDNLIQI GHNVQIGENTVSSAQTGIAGTSRVGRNCFLAGQVGIADHVNVGDFVKIGSKSGLDKDVPD GEVRFGYPALPGMQYHRSAAVFKRLPELEKLVHNLEKQLAELKK >gi|313158042|gb|AENZ01000051.1| GENE 107 139990 - 140460 249 156 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764798|ref|ZP_02171851.1| ribosomal protein S19 [Bacillus selenitireducens MLS10] # 2 151 3 152 164 100 34 5e-20 MERTAIFPGSFDPFTRGHAALVDEALNLFDRVVIGIGNNTAKTGLLTVANRKRLIDDLYA GNPRVEAHIYTGLTGEFAEKAGACAIIRGVRNTTDFEYERTMEATNHRIYPDITTVMLFT PSPVADISSSTVREVLSFGRSVEEFMPKGIDINKYL >gi|313158042|gb|AENZ01000051.1| GENE 108 140519 - 141112 678 197 aa, chain - ## HITS:1 COG:no KEGG:Bache_2807 NR:ns ## KEGG: Bache_2807 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 197 1 183 183 98 36.0 1e-19 MKKYLFLYAVAVTALLVCGYGRYRRENRRLTQNQHALAAGIERYRTRLGQEAASVQALRL RCAEFEELRAADAAEIRRLGIKLRRLEAAARTATATKVEIRTPVRDTVVVRLSDTLVVRD TLRLFRWRDAWVRVEGAVTADSVLCRVESADTLRQVVHRIPRRFLFIRWGTKALRQEIVP SNPHTRIVYSEYVKIER >gi|313158042|gb|AENZ01000051.1| GENE 109 141164 - 141892 796 242 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158178|gb|EFR57583.1| ## NR: 
gi|313158178|gb|EFR57583.1| hypothetical protein HMPREF9720_0113 [Alistipes sp. HGB5] # 1 242 1 242 242 455 100.0 1e-126 MQEVYLYQTVHILGGRSLHLPAHLAVLDRWSRELFGRPAGLRQQPLARQIEALAAQTAPA DCDLSQFVRIVVPASGDPAFRLESAGISLYRGYDLRSLMPDAVTLQYDMPFPEAPTSARE AAAGLARQQARLHGASVAVRCDGDGIVRTADNAPLFAVRGKTAFVSPAEACVEQELGLRA VEAAGLELEERPVMRDELPRFDELFYVDHRGVTALAHCDGHPCMAILAERVAQSLGGLFP KM >gi|313158042|gb|AENZ01000051.1| GENE 110 141892 - 142962 1628 356 aa, chain - ## HITS:1 COG:BH1337 KEGG:ns NR:ns ## COG: BH1337 COG1466 # Protein_GI_number: 15613900 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, delta subunit # Organism: Bacillus halodurans # 15 274 3 267 342 86 26.0 8e-17 MAKSALKFKDSVAAYEKLAQEIAARRFAPVYLLMGEESYFIDAIAERLATTVLGEAERAF NQITVYGKDSEAGQVINLCRQMPMMGSYQVVILKEAQQLRGLDKLSLYTQKPSPTTILVI CHKEKNADKRSAFYKGCAANGAVLESVRPRDYEIASWLQQFIAKKGLAIDAKALSMLTDH LGTDISKISNELGKLVVSLPEGTKRITDADIEANIGISKDFNNFELCKAVVTRDMARALM IAEHFARNPKDNPLLVTVLALFGQFKELFVVNYLRWTARHKGMPFPSDAELMRILKKNNV YVLGEIKQNAANWDNRRVFNILGLLREYDAKSKGMNAGGASDGELLRELLLKIFLL >gi|313158042|gb|AENZ01000051.1| GENE 111 143051 - 144394 2178 447 aa, chain - ## HITS:1 COG:CAC3204 KEGG:ns NR:ns ## COG: CAC3204 COG0037 # Protein_GI_number: 15896451 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Predicted ATPase of the PP-loop superfamily implicated in cell cycle control # Organism: Clostridium acetobutylicum # 11 445 9 458 461 161 28.0 2e-39 MALLEAFRKYIADNDLATHDDRILLTVSGGVDSMVMLSLFTRSGYRVGVAHCNFQLRGAE SDEDEVLVEEEARKYGVEFYNKRFETAAEMERTGESMEMAARRLRYAWFDALSREHGYTV VAIAHHADDSIETFFINLLRGTGLRGLTGISTQVGKVVRPLMFASRKEILEYAVANRIPF REDSSNRSTKYLRNKIRLGLIPRIREINPKFTSLMRRNIARLTDAQLFINHGIERIRGEA VTTENGLDTIHLDRIDPAFPQEFVIYELLNSAYGFKGDVIDSLCHSLRHGMTGRRFYSRE RVAYVDRGTIVVAPIAPGDACMVTVERGVPRSYCGNSALYFEYCDIDAIKNFGVPEHIAQ VDADLLKFPLTLRRWREGDWFIPFGMAGRKKVSDFLIDAKVSMPEKERQFVLLSGDEIVW LVGRRIDDRYRLTPDTENVLRITKEIV >gi|313158042|gb|AENZ01000051.1| GENE 112 144399 - 145598 1362 399 aa, chain 
- ## HITS:1 COG:VC0318 KEGG:ns NR:ns ## COG: VC0318 COG0812 # Protein_GI_number: 15640345 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramate dehydrogenase # Organism: Vibrio cholerae # 71 396 18 338 357 256 40.0 5e-68 MRKIAHFPYPRPHRTTLYKEPRGGASRPERRKDRSGTGARRIYAPNARRYNRANDKQTET DGMIREFHQIDLSGRNSFGVGQQAARLAEFETEEDLRTIFSGGVPERWAVLSGGNNILFT RDYDGLLLTPVARQITPLGEEGDTVRVRADAGVEWDDLVEWAVERGLWGIENLSLIPGKA GAAPVQNIGAYGCEAKDVIERVHMFCTDNRSAMVIDAGHCCFGYRESIFKHELRGRVIIT AVDIRLSRTPRPRLGYGDVEREVEARGGVTLRNIREAICAIRRAKLPDPKVTGNAGSFFK NPVVDECVARQLQAQWPDMPVYPAAGCAGRVKLAAGWLIDKAGLKGYKRGRVGVHERQAL VLVNLGGATGGEVIDFAHTVQMRVHEKFGIEIDTEVNIF >gi|313158042|gb|AENZ01000051.1| GENE 113 145670 - 147103 1955 477 aa, chain + ## HITS:1 COG:no KEGG:Dfer_3787 NR:ns ## KEGG: Dfer_3787 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 1 477 1 498 498 514 51.0 1e-144 MFTQQDLQQIEKHGLIPEAVELQLENFRRGFPFLNVVRAASPGDGVLTLSDTEAAAAAER YEQAAAGLSVVKFVPASGAATRMFKELFEFVNDGKRGKGIDTLLENIGKFAFWPELKALL PAGADDRTIVSAIVNEGLGYGRKPKGLVTFHAYPEGARKAVEEHLVEGAVYAAANGVAKI HFTVSPEHIEGFQELLAAKVPVYEKRFGIRYDISFSVQKPSTDTIAVNPDNTPFRQDDGT LLFRPAGHGALIENLNEIDADVVFIKNIDNVTTDAQRGDTIRYKKVLAGILLDLQERAFE YLKALEVGGAELEPIVEFIEKRLCVKLPGSYDSAVLRAVLDRPIRVCGMVRNEGEPGGGP FWVGNPDGTQSLQIAESSQIGPDDLPLMRSATHFNPVDLVCGMKNSKGVRFDLRRYTDPS TGFISSKSSGGRDLRAQELPGLWNGAMAKWNTVFVDVPITTFSPVKVVQDLLRPQHQ >gi|313158042|gb|AENZ01000051.1| GENE 114 147572 - 148777 360 401 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|116517028|ref|YP_816079.1| glucokinase [Streptococcus pneumoniae D39] # 88 392 6 316 319 143 28 6e-33 MTSSFLKQATEGNKNAILKQQIICQYIFCGDSSITDLSKAVNLSVPTVTKLIGELIDEGF VHNFGEQGTAGGRRPNIYGLNPYAGYFVGVDLRKDSVMMAVINFKGQLIDETTVDFLMDN DPRSLDRLCDVIQNFIRARKIARDKVLAVGVNISGRVNSQTGYSYSYFFVEEQPLTMLLE ERLGTTVYIENDTRAATYGEYMYGDAHSEKTMLYINASWGLGLGMIIDGKIFYGKSGFSG EFGHFPLLDNEIICRCGKRGCLETGASGSAMHRIFLEKLKEGRVSMLSEKYKKGEEITLN DILAALLKEDVLAIEILESVGHTLGKAIAGLINMFNPEVIVIGGTLSVAKEYLMLPVRNA 
INKYSLMLVNKDTQIKLSSLGERAGVMGACLLARSKSLGLF >gi|313158042|gb|AENZ01000051.1| GENE 115 149133 - 149747 1137 204 aa, chain + ## HITS:1 COG:TP0424 KEGG:ns NR:ns ## COG: TP0424 COG1390 # Protein_GI_number: 15639415 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit E # Organism: Treponema pallidum # 27 201 1 171 232 74 29.0 1e-13 MENKLQQLTQKLYDEGLEKGRAEADRLVAEAKKEAAKIVAEARAQAEDIVRKAQDKAEDV EKNTMTEISLAGKQAAAKIKSEIAAMIVAKSTAAGVREAVMDPAFIKEMLLSVAKNWNGA DAGKVELKALLPEADRAKLDEAFGKSARELLSAGIEVGYSKEVKTGFKVGAKEGGYYISF SDADFDALLGEYLREKVSDMLFKA >gi|313158042|gb|AENZ01000051.1| GENE 116 149753 - 150586 1454 277 aa, chain + ## HITS:1 COG:no KEGG:BF2748 NR:ns ## KEGG: BF2748 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 268 2 268 279 134 31.0 3e-30 MFATEYYCLVAGLKEYSLDADTKGFDAKAIVGEILDGVSASDAAQVRLLYGYYDCENIAS LRAGRSAHNPLGNFTREELEEEVKAPKRLPSAVGRVLRAYADPEGEDAEEVDTAQRFETA LFDAYYATCSRAKSRFLREWSEFDRTLRNVTAAVTARAAGRPVEEVTVGGGDVVEQLERS SAADFGLRGELPYIDAVIAAVNDEANLVEKEHKIDLIRWNEATELATFDYFDINTILSYL ARVNIVARWTQLDAVRGREMFDRLMAELDGRELVNRI >gi|313158042|gb|AENZ01000051.1| GENE 117 150725 - 152479 2718 584 aa, chain + ## HITS:1 COG:TP0426 KEGG:ns NR:ns ## COG: TP0426 COG1155 # Protein_GI_number: 15639417 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit A # Organism: Treponema pallidum # 2 579 3 570 589 560 47.0 1e-159 MKTTGRVNGIISNIVIVKADGAVGQNEICYVYAGDTKMMAEVIKVVGDSAYVQVFDSTRG LKIGDKVEFEGHMLEVTLAPGLLSRNYDGLQNDLEKMDGLFIERGSITDPIDFDAEWEFA PLAKAGDKVTAASWLGEVKEQWVSHKIMVPFTMAGNYTVKSVAAAGKYKVTDTIAVLTDE EGREHAVTMVQKWPVKQAVRCYVEKPRPSRMMETGVRAIDTFNPMAEGGTGFIPGPFGAG KTVLQHAISKQADADIIIMVACGERANEVVEIFKEFPELVDPHTGRKLMERTIIICNTSN MPVAAREASVYTGMAIGEYYRSMGLKVLVMADSTSRWAQALREMSNRLEELPGQDAFPMD LSAIISNFYSRAGLVKLNNGQTGSVTFLGTVSPAGGNLKEPVTESTKKAARCFYALSQGR ADSKRYPAIDPLESYSKYLEYPEIREYLDEHIEKDWVDLVYAGKTLVQRGKEANDQINIL 
GDDGVPVEYHERFWKSELLDFVILQQDAFDDIDANCPLERQKMMYKMVLDICRKDFAFAD FEECSQFFKGLINLFRQMNYSEWQSEKFEGYRKQIEEYVSEKIK >gi|313158042|gb|AENZ01000051.1| GENE 118 152490 - 153812 2268 440 aa, chain + ## HITS:1 COG:CT307 KEGG:ns NR:ns ## COG: CT307 COG1156 # Protein_GI_number: 15605028 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit B # Organism: Chlamydia trachomatis # 7 440 2 434 438 390 46.0 1e-108 MTTRAFQKIYTKIDNITKATVTLRALGVGNDELATVGGKLAQVVKIMGENVTLQVFSGTE GIATDSEVVFHGEPPKLRVSDNLAGRFFNAYGEPLEGGEIVEGEPREIGGPTVNPFKRIQ PSELIATGIAGIDLNNTIVTGQKIPFFADPDQPYNAVMANVALRAKADKIILGGMGLTND DFLYFKSVFENAGALDRIVSFVNTTENPPVERLLVPDMALTAAEYFAVDKGEKVLVLLTD MTLYADALAIVSNRMDQIPSKDSMPGSLYSDLAKIYEKAVQLPNGGSITIIAVTTLSGGD ITHAIPDNTGYITEGQLFLRNDSDTGKVIVDPFRSLSRLKQLVIGKKTREDHPQVMNACV RLYADAANAKTKLENGFDLSDYDERTLKFAFDYSEKLLSIDVNIGITEMLDTAWGLFAKY FSKEEVAIKEELIKKYWKEA >gi|313158042|gb|AENZ01000051.1| GENE 119 153819 - 154427 695 202 aa, chain + ## HITS:1 COG:TP0428 KEGG:ns NR:ns ## COG: TP0428 COG1394 # Protein_GI_number: 15639419 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit D # Organism: Treponema pallidum # 1 179 1 180 206 97 34.0 1e-20 MAIKFQYNKTSLGDLGKQLKMRQKALPTIKSKESALRSEVKKAKDSAGDYRRRLDRLKAE YDYMVSLWGEFDCDLLRIADVELSVQKIAGVRTPVLQDVKFEVKPYDLFSSPVWFADGVD ILKRMAQLGIEFEVYNRKMELLDYARRKTTQKVNLYEKVQIPGYEDAIRKIKRFMEDEEN LSKSAQKIVKTKQQQAGEGATL >gi|313158042|gb|AENZ01000051.1| GENE 120 154436 - 156271 2613 611 aa, chain + ## HITS:1 COG:SPy0148 KEGG:ns NR:ns ## COG: SPy0148 COG1269 # Protein_GI_number: 15674358 # Func_class: C Energy production and conversion # Function: Archaeal/vacuolar-type H+-ATPase subunit I # Organism: Streptococcus pyogenes M1 GAS # 75 610 107 656 666 145 27.0 3e-34 MAKYDFVLYAAQSEDFIEKLRELGLVDITTTGWEPSEEDRQLLLDIEGHAKAFEFLRNFR AEEGRCKAGAEPFATGAEAYEHYAAAHQQATALHAEIGRLEKTADELRPWGEFSPERTRS LAEQGIVLRYFLTPKSNYDKFEAQWAERYTIAPINRTESTVYFVVVAAPGEEVTLDAQEM 
KTPSMDIREAERRIAEARKELSALDAEFSRVAVSEKLLAAHAASLKEQLQGVRVKATAQQ AADGTLVVMEGWAEKETSDKVDALLEQYPNVVYLKGDPTPEDDTPVKLKNNWFARIFELV GDMYARPKYGTMDLTPFFAPFYMLFFGICLNDAGYGAILALLGAWMLAKNRQPGMMRQAA WFATLCGLTTVVVGLACGSVFGVNLKEYFPSIPFFDFQGSFFSIALAIGVVQIVFGMMLK VVMISATVGFRYSLGTLGWLLVILGGSIAGGLPLLNPEWVIPFYTTSSPAFYATLGVGAV LMLFFNSPGKNPLLNFGLGLWDTYNNLTGILSDVLSYIRLFAIGLSGGILATVFNALADG FVPEGSGIVVRLLIMIPILLVGHGINLFMSTISSFVHPMRLTFVEFYKNAGFEMSMRSFD PLRKIENSENK >gi|313158042|gb|AENZ01000051.1| GENE 121 156293 - 156748 813 151 aa, chain + ## HITS:1 COG:FN1740 KEGG:ns NR:ns ## COG: FN1740 COG0636 # Protein_GI_number: 19705061 # Func_class: C Energy production and conversion # Function: F0F1-type ATP synthase, subunit c/Archaeal/vacuolar-type H+-ATPase, subunit K # Organism: Fusobacterium nucleatum # 6 147 13 154 160 72 37.0 3e-13 MESTTAIILGYLGVALMVGLSGVASCIGTSIAGMASVGAMKKNGGAFGSYMILSAIPGSQ GLYGFVGYFIIKGFLTDSMTLLQGAAIFGAGLLTGLVCLASSYFQSKVCANGIAAIGNGH DVMGKTLILAAFPELYAILTVAAVFLIAGAI >gi|313158042|gb|AENZ01000051.1| GENE 122 156936 - 160142 5030 1068 aa, chain + ## HITS:1 COG:lin2123 KEGG:ns NR:ns ## COG: lin2123 COG0383 # Protein_GI_number: 16801189 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-mannosidase # Organism: Listeria innocua # 32 858 233 1031 1032 301 29.0 6e-81 MKPNRFMLALAALCFASGGASAQQTDGKSVAPKTYKAYMVSDAHLDTQWNWDVQTTIREY IPRTLFQNLYLMERFPDYRFSFEGGVVYSWMKEYYPMHYERMKTYIASGQWHIAGASWDA NDPNMPSAESFFRNILLGQEFYKKEFGVRSTDIFLPDCFGFGYTLPTIAAHSGLIGMSTQ KLSWRNHDFYDAPYHKKNPFSWGVWYGIDGSSVIAAFDTGGYNSELPENVQYNERLMERA AHGYDNTCFRYYAGGGEGVGDKGNSGTVTTCRRLTKAINDPNAPIEIISATSDDLFKAYL GRKAELPSFDGELLMDVHATGCYTSQTAMKYYNRRNEELTGAAERASVAADWLGALAYDQ RKLTEIWQRFIWHQFHDDLTGTSIPEAYTWSWNDELIALRQGGDVMNSAVGALSYSLDTR VKGTPVVVYNAVTYPVKAVVEAEIPLAAKAKGVAVYGPDGRRVAAQILSREGDKAVIAFA ADVKSVGYAVYDVRPASPARSSALKVNGNTIENAVYKVTLDRNGDIASLVDKRYGRELVE QGKAFRLAIFEGNPSNEWPAWEVLKEVVDKTPRAVTDNVSVSVGEEGPVRASLKVERTYG DSKFTQYITLTDGAQDDRIDIRTTVDWNSRNTLLKAEFPMSVSNAKAAYDLGIGFIERGN 
NTATAYEVPALKWADLTDADGSYGISVLNDCKYGWDKPADNTIRLTLIHTPSTEKRYPHQ RDLDLGVNHFTYSIVGHKGTDRSGVVAASEQLNLPLVAYVAPKHAGSLGRTFSMLESSTP QIGVRALKKAEDGDGYIVRCYELTGKPVENACITFPAQILSAEECNGIEEKIGAAETEGR SLIVSAGKFAPKTYRVRLAAPAQKSAFEVKSAPVTLPYNTVAFTTDEFYTYYRFDNQRGS FAAELIPAELTCNGVRFVMGEENVKDAVTCRSQEIELPEGGYRKLYMLVAASDKECEALF KVGDSEQAVYVPLWKGFYGQWGWRGHSEAFLKDATIAHIGTHRHQGDVGNLPYDFSYMYM VSLDIPEGARTVTLPENKNVAVFAMTVSDNGIDDVLPVSGTFIRPDVK >gi|313158042|gb|AENZ01000051.1| GENE 123 160476 - 160559 61 27 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEKNSCLILSADFFAESRNLSCYFPVF >gi|313158042|gb|AENZ01000051.1| GENE 124 160628 - 163759 5101 1043 aa, chain + ## HITS:1 COG:no KEGG:PRU_2712 NR:ns ## KEGG: PRU_2712 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 38 1043 9 1010 1010 1070 54.0 0 MTNFGHKIMLTVLVLLITPPLGFAAATVNASPLTQTSKVTVKGVVKDAAGAPLLGVTVIV KGTSVGTSTSIDGEYQIQCAADAELDFSFIGYETQTLPVANRTLLNVTLKEDAQLMDEVI VIGYGTTTRRRAVGAVDQIKSEAITERSVANLTQALQGTSPSLVIQQRSSDPADNQLNIN IRGIGTMNNNEPLIVIDGLVSDNASFNKLNPSDIENVSVLKDAGTAAIYGSRAANGVLLV TTKKGKLNQAPTVELTAMVGWQQPEVLYTPVQGWQNATLLNISKANSGLAPAFTAEQIRD LKAHGNGTFYMDEILRTAMQQNYNVSVSGGGANTTYMISAGYYDQESNFEGPDYGRQRYN FRTNITTEYKRVKVTALLSYSHENSKTTTDGSTIANAMRIPTYYYYRQYDPVTGKYLLND KLTDRNSLGLLEEGGYNKYRNDYVNANLSAEVKIIDGLKLRGVLGADIFADHRYTRTHEV MFYNTDAAGNPVGEGRSANNDNKSEDWNKRAYLINSQLMLDFDRTFGKHHVAALLGASNE SYTSEANQISFKWADPDLGINGDGTEAVISGDGRSSVTPENTTRTSLTSVFGRAAYDYDN RYYAEFSFRYDGSSKFRSDLRWGFFPSVSLGWRLSQEAFMESYRDNVGDLKIRGSYGTLG NQSVGDYQYFTTYNVYSNTYAFNNVGVGGAGFQLGTDNLQWEVSKTFNVGFDASFFRGKL NLGFDYFNKHTTDILVKPQTPLILGTELQNYNAGEMRTQGWELTLSYALKKRDWSHFFQF NIGDSWNKVLKYEGFEDISGQEEFWRITREGLPFNSYYGYKTDGFFQSYDEIAHSAVPTG KSVKPGDVKFKDRNGDGTIDENDRYYLGNAFPRYTLGFTYNVAWKGIDLSIFLQGVLKRD MMVRGELIEPFHSNYGYTMYEHQLDFWTPTNTDARWPRLTDISDPSNQNNYRMGSDLYMF DGSYLRLKNIQLGYTLPERISMKFGVKRFKIYFNAQNLLTFSNCSFIDPESSEFGSNMNS GGANSGRNYPNLRYFGAGLNITF >gi|313158042|gb|AENZ01000051.1| GENE 125 163775 - 165427 2746 550 aa, chain + ## HITS:1 COG:no KEGG:PRU_2713 NR:ns ## 
KEGG: PRU_2713 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 1 550 6 543 545 429 42.0 1e-118 MKTTILKYITVTFAAGAMLLAGCNDLDQEPTNKFTDKAFWTSPERANMVLNMAYNQMFGH DKIWQDEALSDNLYEQRGNPDTRTIRMGQATPNTGLFRSEWKWVFEGVKTCNVFMNFVDE VPGMDPERLAGMKAQIRFIRAFLYFRLANFYGDIPFFLQDISLEESRRVSRTPYPEVMSA LHRELDEIVGDLPTRDELKADDNGRITKAAAMVLKARMYMYENVQRTGNSVSDNMRKAAD ICDKLIHEQGTYGTYSLLTQGSGDGIPAYEYLFTSAAEYNSEVILDYAAVELNKQWSTLY SMVPLTMGANLCQKAPTRALVDSYLNADGSVPADKTVYANRDPRLTATVVYNGYVWKDRN DKGEYVTKGTINVTSGNDKAGTDNGSPTGFYTRKYFDTTHGKNLEMWTNIIMMRYADVLL MYAEAKAYLNEMDAAVWNETIKPIRQRAGLSGTDFPSSGDYTQIVRDERRVELALEGLRY FDLIRWINYKDSKSQGVIDLLNGAVYGAKELNGGRQIDEFKFNSSRDILWSLPLSETQLV PTLLPNNSGY >gi|313158042|gb|AENZ01000051.1| GENE 126 165455 - 166579 1373 374 aa, chain + ## HITS:1 COG:no KEGG:PRU_2714 NR:ns ## KEGG: PRU_2714 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 13 372 15 372 374 177 34.0 8e-43 MKKILNILMAALVLTGISCSEDLEYKPGGVTPVQTLLLPADNYYVELQSASTATLRFSWE PALAADGQLPHYEVVFFGKPGGEIIYRCDAGSSTSVDIVHKEINRIANAAGIDVGADGAV YWSVVSSRGVDTAPVVATPRRLELTRLLGFNVIPDQLFLTGEATEGGSDLENACVGRRTG DGEFTFYQQLEAGKGFTFVSSKGDDRTTYTVVNGVLNDQSAEPATIDRSGVYRIKLDLNV RGITFEHIDRVEFNFAPRTEDNGNMEYIGNGCWKFSDYNVKFREESWGLDQRYNFHPVFD GVTYVWAGTKGNDSSPSALSGAEYYILTENLFTGDAYGPEKFKFHSDFNGKKVDITLIMS GDVENCTHVIEIVG >gi|313158042|gb|AENZ01000051.1| GENE 127 166595 - 168736 3119 713 aa, chain + ## HITS:1 COG:no KEGG:PRU_2714 NR:ns ## KEGG: PRU_2714 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 2 334 3 337 374 152 32.0 7e-35 MKNILNKLVVLLLAGFALAACTEHTEYDDTGFSAVEKLLYPSDGYALDLIEQADANLYFE WEVSKVGTPVYTVVFLDAAKKEIGRYLADNNGQKSSLKMLHSQLNAIAGDAGIEPDAAGD LYWTVSAGLGGAEQLSPAEPHKLTVKRYASIDAPYHLYVTGEGSEFGTDPAAAKQMRELG DGKFEIYTRFTGSFSFINRNAAGSKRTFGVAEGRLTEGADAAGTGDGVYRVLVDFKAGTV KMEKIESVKYHWCWTPDPDAVMEYAGNGVWKRQITWDGDNRYRFDAVIDGTDYIWGYSSS DMSDSDMPSGLTGPQYRLSMRETGNINQWDYSFKHIAVLRNVTCTIVVDCSPEAEAYYHR 
YEFDFAPEATPVTLISPDDNASVVLNAQAGAKLTFAWNKFPDSDPAGKLTSYTLVFYSDA ALEKAVGRKEAQYNGSVDVSFSELESIADAAGIAAEGTGDLYWAVESKLLSWTALSSGRK LTVTRMKGIPTAAYITGAASEFGSGYQALKALGAGKFEIFTKLTTDAAGYNFTDGDTADA RKFVVEGGAVRESDASVVSSEEAIYYILLDFGAGTAKLQKIENMRYLSPNVKTQHTEKQI KLPYQGNGIWYAAGVVPYLRDWDDDRYFFWAEVDGVKTKFGANPGMTGDLNSKPAENDSR FRVFWPIADVNGDDAGAFKMLHAYRGNDSKRVNIKLNMSPDTEHYYNYIEYLD >gi|313158042|gb|AENZ01000051.1| GENE 128 168801 - 170012 1890 403 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 57 323 24 274 341 102 31.0 1e-21 MKKYILALTATATLLGACEVDDKYYGKEVVDPAEKLEPAYGEDWTAHADKIDEKLIENFM NTDRGTFWYTDQDRTNESTYCYWPQAHAMDVLIDAYLRIPEGDARRDTYAGYMSKWYENK GNNYGNSYWGTYGFGNAYTDDSEWIILTLIRMYEATGVQKYLAAARKTYEETVIARWTED ENGGGIRWSFDAENSKNACSNGPGALCAMRLWANSPKGAERDQYLADAKKIYNWLSSTLY NPLTGAVSDNMKNGVINGGALTYNQGTFMGAAHELYKATGDAKYMTDAARAARYTMKSMV TNGVLNNEGSGDNALFKGIFVRYAMNVWKDASVDKADAGLRREIEEFLIYNGLVCWRDGV DKSEGSKWFFGGFWGEPGSSWDGRLNEQVSASTMIEAMALLIE >gi|313158042|gb|AENZ01000051.1| GENE 129 170312 - 174370 5717 1352 aa, chain - ## HITS:1 COG:lin2643 KEGG:ns NR:ns ## COG: lin2643 COG0642 # Protein_GI_number: 16801705 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Listeria innocua # 805 1042 358 589 591 99 29.0 5e-20 MTRKPGLVWLLLLFCTAVRASNLRQISSREGISNNAILSLAQDKHGFIWVGSCDGLNMWD GERMRLFPNDWGGETPLSGNLIEEIAVTTDSLFWIRTNYGLDLFDPDSKRVERHADFQGM YKIVTRRSDEVLVVTQDNKFSCYDPAARRFSSVQPMSGIDYADILDMQLDDDGALWVFSR KGIFRIPVAFPDKNNDRIGIEQPERFPTVSNPEYAFAENGSGFFIDENHIFYEFDIRERR LIYNKDMSDEMARMGSVSSIVRDGGDYIVSFYTNGVIRLRTTPEQSVKYATEHINIPCGV FSLLRDARQEIVWIGTDGQGLYQHTRNATAFRSITFNQLPYNLSKPVRALYIDDERTLWV GTKGEGILRIADFYDRKEFTPAHVQQLTTGNSELLDNSVYAFARSRRNLLWIGTDGEGLN YYSYGDRRIRTLRTPAPLKYIHALYETSPDTLWVATVGCGVFRVTLGGTASQPSIRGVEQ LHFNDELEIKNFFFTLYPENGSAMWFGNRGGGAVRYDIAGGGSEVFKFDAGRSAIANDIF 
AIHRSGDGTMWFGTSGGLIEFRRDSTSHVPGIRNTVHGILEDRKGDLWLSTNRGLVQYSP HTGYVVTYGYSYGLNTIEYSDGACFADPRTGTLFFGGIDGLVTIEDTGFREQEYNPPMLF RDIRLNDGIRNIGDMLSRDGVLTLRHGQRIFEITVSALDYVNGSNYSYYSRLDGYNDQWA GASSQLSFADLPAGTYRLDVRYRNNITGAESPVYSLPIRLKPAWYATAAAKCLYVLLLLA VVGGVIRHYLLRYRRRRAEKLQRLEIRRKEEVYESKMRFFSNITQELSMPLTMISAPCQQ ILAYRKADPYVLKYAQTIRQNVSKLHDLIYMLHEFRGIRTGQNDESENIELVPVAEIAQS MVETFGEYSQQNSIHCQLDIERNLVWPTDRDGLSTILNTLLSNAFKHTPYNGAVSLTIRS DNGKLLISASNDSVGVNLEDIEAIFDRYRVLDYFERKSERGLSFQGDLRLAICHSIVVRM QGEIKVESTPNAQTTFTVLLPQLKVTAENAPTADNIVPASKEYGLPLPTVTRREFPFDKN RRTVFIVNDNSEIMNFVAELFAADYNIKMPSGKTEMIELLKQMHPDIVICDALSEQSDCL SLIQFIKQGKLTSHIPVILLSTAQQIDERIKGVESGADICLTLPFNVEYLKAVAEQLLRR NRSLKDYYKSSISAFELSDGRMLHQDDKEFIDRMLKIINDNISNTEISTKFIADELGVSI RNLYRRLEGILNQTPTNIIKEYRLAKAEQLLTTTKLSIDEIIYKAGFVNRGTFFKCFSAK YGCTPKVYRKEKLSQIRQEVGEESEDAADRSA >gi|313158042|gb|AENZ01000051.1| GENE 130 174647 - 175366 734 239 aa, chain + ## HITS:1 COG:all0976 KEGG:ns NR:ns ## COG: all0976 COG2755 # Protein_GI_number: 17228471 # Func_class: E Amino acid transport and metabolism # Function: Lysophospholipase L1 and related esterases # Organism: Nostoc sp. 
PCC 7120 # 28 211 48 234 249 62 24.0 9e-10 MTCMKKIVLLSLALFSVAALRADGPVDWAQYGRYELQNAVLDRPVEVVFMGNSITDSWIR VDPDFFEQNGFLDRGISGQTTVQMLARFRSDVIDLKPQVVVILAGINDIARNNGPIELEN VFGNIVSMCDLARYNGIKVVLCSLLPCDRFSWRPEMEPAEEVRRLNTMLERYAAEQKIPY VDYHRALDNGSGGMSEELSQDGCHPVLSGYLRMESLVVEGINRALGVQKTWYTTVLPAK >gi|313158042|gb|AENZ01000051.1| GENE 131 175377 - 176567 1808 396 aa, chain + ## HITS:1 COG:lin0763 KEGG:ns NR:ns ## COG: lin0763 COG4833 # Protein_GI_number: 16799837 # Func_class: G Carbohydrate transport and metabolism # Function: Predicted glycosyl hydrolase # Organism: Listeria innocua # 139 289 90 243 341 92 36.0 2e-18 MKKLLIMAFAASVFAACCNNGGAACDARNLDRAKATLDSIYAHYGVAENRLLRENYPFNV DYTASYLASADQARPNPYSYLWPFSGTLSAVNTILEADASYRSVLDGRVLPGLAEYLDTT RMPAAYASYINTASASDRFYDDNVWLGIDFCDIYEATGDKKYLAEAEMIWKFIESGTDDV LGGGIYWCEQQKHSKNTCSNAPGTVYALKLYAATGDERYLAQGKALYAWTKKHLLDPEDG LYFDNIGLDGRIGRAKFAYNSGQMVQAGALLYKATGDESYLKDAQRTAAACYDRFFGEFT PQGGETFRIINKGNVWFSAVMVRGLAELYAIDGNAAYLGDVQRTLDYAWEHARDAQGLFE TDFTGADKQSEKWLLTQGAMAEMYGRMSGVKLDQTK >gi|313158042|gb|AENZ01000051.1| GENE 132 176596 - 178158 2397 520 aa, chain + ## HITS:1 COG:XF0843 KEGG:ns NR:ns ## COG: XF0843 COG3538 # Protein_GI_number: 15837445 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Xylella fastidiosa 9a5c # 62 505 52 497 516 447 48.0 1e-125 MKMKKILNICLAAAVVCAAEGVSARTVSLSAPAEAPQPAAREAAAPMPADNTRMAKPGKK TKAQTPAAYAATNRPAEAERLFRSEAVEQEIARIKGVLTNAKLAWMFENCFPNTLDTTVR ARKDANGKDDTIVYTGDIHAMWLRDSGAQVWPYLQLANKDEHLRSMLAGVIRRQFKCIEL DPYANAFLDPYDPNPDHQWMSDQTEMRPELHERKWEIDSLCYPLRLAYQYWLTTGDASVF DEHWMAAIKNILKTFREQQRKEGLGPYTFMRVTDRQLDTVCNAGKGNPVNPVGLIASVFR PSDDATTFLFLVPSNFFAVTSLRKAAEILTKVNDEPALAAECTALAAEVETALKKYATYN HPEFGTIYAFEVDGFGNRFLMDDANVPSLLAMPYLGDMDVNDPIYQNTRRFVWSEANPYF FRGTAGEGIGGPHIGYDMAWPMSIMMKAFTSQDDAEIKRCVEMLMTTDAGTGFMHESFNV NDPADFTRSWFAWQNTLFGELILKLVNEGKTDLLNSIEVK >gi|313158042|gb|AENZ01000051.1| GENE 133 178231 - 180168 2658 645 aa, chain + ## HITS:1 COG:no KEGG:BF3763 NR:ns ## KEGG: BF3763 # 
Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 2 645 3 647 649 814 57.0 0 MKRLFTLLCALTLACAAEAKTVEAASPGGNLRLAVDVADRITYGLWSGEDKILENCTLSL TLADRTLGEKPRLRSVKRSSADEVLERRNPTKNASVRNRYNAVRLNFAGGYAVEFRLFDD GAAYRFVTSLPGEVEVMGEACRLGFPAGSEAWLSEVDGFRSMYEEPYTRVATAEYDSADR MSYLPVLVGMPGGKKVLLAESDVRDYPCMFVRGDGAGGFESLFPRVPKEYAPAGDRSLKI VAEEPYIARTQGTRTFPWRVAVVAEKDAALLENELVWLLAEPAADADWEWVKPGLVSWDW WNGMRLTGVDFRAGRNTESYRYYIDFAARYGIPYIIMDEGWSASTTDVFHPNPDLDLQEL IAYGRERNVQIVLWLTWLAVEKDMDRLFATLEEWGIPGVKIDFMDRSDQWMVGYYERVMK CAARHRLFVDMHGSYTPKGLYRTYPNLLTYEGVLGMEQGARCKPENSNWLPFIRNAVGPM DFTPGSMFSAQPEDNRSTGANPMGTGTRAYQMALYVVFESPLQMLADNPVLYRREHPCTE FLAAVPTTWDSLRVLHAACGREVVVARRKGDKWYVGGITNDRPYTTEIALDFLPAGRTFT MTSFEDGVNADLQATDYRKRVREVDASTVVKVEMVRNGGWAAVIE >gi|313158042|gb|AENZ01000051.1| GENE 134 180612 - 181742 1516 376 aa, chain - ## HITS:1 COG:AF0390 KEGG:ns NR:ns ## COG: AF0390 COG1060 # Protein_GI_number: 11498002 # Func_class: H Coenzyme transport and metabolism; R General function prediction only # Function: Thiamine biosynthesis enzyme ThiH and related uncharacterized enzymes # Organism: Archaeoglobus fulgidus # 12 362 15 354 355 289 41.0 6e-78 MQRIYDKCRRREPLSREEAYRLYDEAPLQELALVADEVRRAVVPDPQVVTWQIDRNVNIT NVCISGCKFCNFHCKPHQTDRAFITTMEQYCEKIDRTLRLGGDQLLLQGGLHPGLKIDFY ERLFSGLKSRYPSLRLHALGAPEVAHIARISGLTTLETLRRLMAAGLDSLPGAGAEMLDS DVRRAISPAKPSVEKWLDVMHEAHCLNLPTSATMMYGHVETPHQRIDHLLRIRDLQTRCP EGNYGFIAFIPWIFRSTGTELERQGVETRFSPLEYLRIIAVSRLMLHNIPNIQASWLTVG KATAQAALHSGANDMGSIMIEENVVSSAGAHNSFDAEGIQSAIREAGFIPRLRDQLYRMR EYAPQTGELATSLQDA >gi|313158042|gb|AENZ01000051.1| GENE 135 181746 - 183977 1098 743 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15894003|ref|NP_347352.1| fused ribonuclease/ribosomal protein S1 [Clostridium acetobutylicum ATCC 824] # 23 717 7 702 730 427 34 1e-118 MKKQSQKNRPARGAKNTDRAVVILSLFREFPNNKFSLKHLASASGGATKEGRRETFEILG RLHDEGIVEECAREKYRLTHKHLPHFEGIADMTASGSIYVRVEGEENDIFVNQRNTANAL 
NGDRVEVVAIHRGRDGKLEGEITRIVERSRKPYVGVAEVGAHQIFVRADSRRMPMDIYLS KKQYPDVKDGEKVVVRIADWAEGSKSPVGELVERLGMAGNNDTEMHAILAEYELPYRFDP EIEQAAEAIDGSITAKEIAARRDFRQVTTFTVDPADAKDFDDALSVRRVRDGIWEIGVHI ADVTHYVRPQSTIDDEAVERGTSVYLVDRTVPMLPERLSNELCSLRPHETSLCFSAVFTL NENLDILEEWFGRTVIYSDRRFTYAEAQEVIETGRGDYAEEILTLNRLAQALRRQRFKNG AISFDREEVKFKLDETGKPLGVYFKEQKESNQMIEEFMLLANRRVAEFCGKRKTDKGRTV ERPMVYRVHDSPSEEKLDRFRQFILRFGHIFKATKGRAIAKELNKLFAQIKGTTEENAVS TMAVRSMAKAYYTTDNIGHYGLAFPYYTHFTSPIRRYPDMMVHRLLARYLADGKAADKAM LEDLCFRASEREVIAAEAERASIKYKMVEFMKERIGEEFEGHISGLTEWGIYVELDETHI EGMSFLRDIDGDFFQFDEPNYQIIGRSTGRRMTLGDAVRIRVKRADLQKRQLDFEVLLDG LRTPAPAEGQGPQVRRSNRRKNR >gi|313158042|gb|AENZ01000051.1| GENE 136 183974 - 185086 1450 370 aa, chain - ## HITS:1 COG:alr4934 KEGG:ns NR:ns ## COG: alr4934 COG1473 # Protein_GI_number: 17232426 # Func_class: R General function prediction only # Function: Metal-dependent amidase/aminoacylase/carboxypeptidase # Organism: Nostoc sp. PCC 7120 # 4 362 28 397 405 218 36.0 2e-56 MNPIQFRRHLHMHPELSFREHDTAAFISARLSELGIEHRPIAGTGVLAKIEGRGKADPKR RAVVLRADIDALPIDERNDADWSSRNQGVMHACGHDMHAAVLFGVLQQLAAEPDFRGTIF GLFQPGEECNPGGASLVLAENPFEGYDVRAVVGEHVEPQLEVGTLGFRAGKYMAASDELR FTVHGTGGHGAMRPQLKDPVAAAAEFVTRLVALNHEECVLSIGRVEAGGATNIVPDQVYL EGTLRTFDEREREIIHKRIRNFAADIDKRHGVRTDVDISRGYPCVVNDAALVKHAAALAA EAGLRVKMLPLRTTAEDFGFYCTKYPSLFYRLGVGAAAGQPHTATFNPDEGAIDTGIDYM KRLALQILKK >gi|313158042|gb|AENZ01000051.1| GENE 137 185092 - 186057 1455 321 aa, chain - ## HITS:1 COG:aq_1429 KEGG:ns NR:ns ## COG: aq_1429 COG0685 # Protein_GI_number: 15606607 # Func_class: E Amino acid transport and metabolism # Function: 5,10-methylenetetrahydrofolate reductase # Organism: Aquifex aeolicus # 1 320 1 287 296 172 31.0 6e-43 MNVLDIINTAIAEHRTRFAFELLPPLKGDGMGGVFAAIDRLIGFDPAYINVTFHREGIKQ SVRPDGGIDWHVVRRRPGTVGISAAIQKKYGVEVVPHLICGGQSKYDIEDALIDLDFLGL HNVLALRGDKSQNEKFFMPHPQGHSHAVDLVRQIAAMNRGEFVDGEVEECHHSKFSIGVA GYPEGHEESPSPEADIAALKAKIDAGAGYIVTQLFYDNARFFDFVRRCREAGIAVPIIPG 
IKPLSTLRHLTLLPQTFGCRIPEELEREVLRHREDPKAIREVGTEWAVAQSRELKQAGVP VLHYYPMGKADNIIKIAKAIF >gi|313158042|gb|AENZ01000051.1| GENE 138 186116 - 189727 5613 1203 aa, chain - ## HITS:1 COG:CAC0578_2 KEGG:ns NR:ns ## COG: CAC0578_2 COG1410 # Protein_GI_number: 15893867 # Func_class: E Amino acid transport and metabolism # Function: Methionine synthase I, cobalamin-binding domain # Organism: Clostridium acetobutylicum # 325 1199 4 886 890 867 49.0 0 MADIYEQLESRILLLDGGFGTMVQQYGLTEEDYRGERFREWPALLKGCNDLLALTRPEAV REIHVKYLQAGADIIETDSFNANAVSLADYGLEAYAYEMSRAAAAVARSAADEFTARNPQ KPRFVAGSMGPTNRTASMSADVANPAAREVTFAQLVEAYTDQARGLLDGGADILLVETIF DTLNAKAALYAIDALAEKLGRTIRVMASGTLADASGRTLSGQTVEAFCTSLSHAQLLSLG LNCAYGAKQLLPYLERLAETAPLRISAHPNAGLPNVMGGYDETPEMFAEDVGEYMRRGLV NIVGGCCGTTPAHIFELSKIACNYAPRPVPAPRRVTTLSGLEPLRIAPETNFVNVGERTN VAGSAKFARLIREGNYEEALSVARAQVDAGAQIVDVCMDDGMIDGVEAMRTFLNLMASEP EIARVPTMIDSSKWEVLAAGLEVTQGKAVVNSISLKEGEAEFLRRAREIHRYGAAAVVML FDEQGQADTCERKIEVASRAYRLLTDAGFPAEDIIFDPNILAVATGIPEHDGYAKAFIDA TRWIKENLPYAKVSGGVSNLSFAFRGNNAVREAMHSAFLYHAIRAGMDMGIVNPQMVKIY SEIEPELLQRVEDVILCRRADAAERLTEYAQQVRTTAETQPQAPDAWRAGTLAERIGHAM LKGVADYIEQDALEGYEALGTPMAVIDTLLMPAMERVGTLFGEGKMFLPQVVKTARVMKR AVAALTPYIEQGGAQTPHNAGKVLIATVKGDVHDIGKNIVAVVMACNGYEIKDLGVMVES QRIVEEAVAWGADCICLSGLITPSLDEMAHVCEELERRALRLPVIIGGATTSNLHTAVKI APVYSGLVIHSPNASRNSQILAQLLGPDGQLYADKVRADQQALRSDYLRAERTRNLIPIV EVRKTRKGAAPHLPVEPLHPGRMVFPDFDVADAEPYIDWSFFFAAWGLKGRYPEILDHPE KGAEARKVFADAEALLARIRDERLLTLQAAVGIFPARAEGDDIWITDQKGRERRLAMLRN QTRGEENRSLADCIAPHSDWIGCFAVTAGIGLRELCEKFRAEGDDYGAIMAKLLADRLTE AFAEAVHTFVRRQMWGYETGEPLTPQQTVRGQYRGRRMAFGYPASPDHSLKREVFDLLSV EMTTGMKLTENYMIDPGEALCGLMFADADYFSVGTIDTKQLLDYARRRGMEVEEIKKLIP NNI >gi|313158042|gb|AENZ01000051.1| GENE 139 189728 - 190519 1178 263 aa, chain - ## HITS:1 COG:CC2650 KEGG:ns NR:ns ## COG: CC2650 COG3823 # Protein_GI_number: 16126885 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Glutamine cyclotransferase # Organism: Caulobacter vibrioides # 38 262 33 258 
260 199 49.0 4e-51 MRKFATVLLSGLLLASCGGNQVRAKRPAAPAEPKEYTYGVRAVHPHPATSYTQGLQYADG RLWEGTGQHGESVVQTLDLATGRAEVFARLPKEDFGEGIALLDGKVYQLTWQSNKAYVYD ARTGEKIREFRYPGEGWGLTTDGKKLYMSDGTANIYTLDPATFKREKRTTVTLRGETLNF LNELEWIDGKIWANVYTTDQIVIIDPATGAVEGIVDLTGLLPEEDVTPATDVLNGIAYDA AGGRILVTGKNWNKLFEIEIREK >gi|313158042|gb|AENZ01000051.1| GENE 140 190849 - 192174 2365 441 aa, chain + ## HITS:1 COG:RC1306 KEGG:ns NR:ns ## COG: RC1306 COG0544 # Protein_GI_number: 15893229 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: FKBP-type peptidyl-prolyl cis-trans isomerase (trigger factor) # Organism: Rickettsia conorii # 28 315 28 308 445 66 25.0 1e-10 MKIVREQREENNSLLKVTVGEEDYGQAVEKELREYRRKANIPGFRPGMVPMGLVKKMYGK GVLAEQAYRTASNSVFEYLQKENIDYLGDVIPSEEQGDFDFENGKEFEFVFEIGEAPEIK LELSDKDKLTYSKIKVDKKMHDDYRANYLRQYGRLVDADEVTSDEALSVTLDNGEMNVAD AYVGLISMSEEERKPFIGKKVGFKTEVNVNELYKNPSQRAAVLQVKENELEGIKPEFTLE ITKIRKFAEPELNEEFFKTAFPGGEVKDEAGLDKFIDARIEPELRRESDYLFTLQLRDYL VQKAGLKMPEAFLKRWLYTINEGKFSMEDIDKDFDQFLKMFTWNYLQKHFIKQDNITVTK DEALAEAKALAAAQFAQYGMPSAPEDMLAGYAEKILADKDQGQKIYEKLYEVKVVEDVKS KIKVTEKAVSADDFAKLAKAL >gi|313158042|gb|AENZ01000051.1| GENE 141 192407 - 193075 1103 222 aa, chain + ## HITS:1 COG:alr3683 KEGG:ns NR:ns ## COG: alr3683 COG0740 # Protein_GI_number: 17231175 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Nostoc sp. 
PCC 7120 # 30 220 29 219 232 239 61.0 4e-63 MSEFEKYARKHCGISSLKLHDFQSVSGAYVSPTIIEERQLNIATMDVFSRLMMDRIIFLG APIYDDAANIIQAQLLFLESVDPEKDIQIYINSPGGSVSAGLGIYDTMQLVNSDVATICT GLAASMGAVLLTAGAKGKRSALPHSRVMIHQPLGGAQGQASDIEITAREIMKTKRELYEI LSAHSGVALKKIEKDADRDYWLSAREAKEYGLIDEVLSKRGK >gi|313158042|gb|AENZ01000051.1| GENE 142 193075 - 194352 243 425 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163762510|ref|ZP_02169575.1| ribosomal protein S16 [Bacillus selenitireducens MLS10] # 170 420 250 457 466 98 30 2e-19 MASGKGNNHKNGAGGDCCSFCGAKRSDVEIMFQGIDGANICNKCIENGYRIIVENDVASE RGGSAAAAPLKMEELLKPAQIKDFLDQYVIGQDDAKRYMAVAVYNHYKRLLYAPKEDEVE IDKSNIVLVGPTGTGKTLMARTIAKLLKVPFTIVDATVLTEAGYVGEDVESILSRLLQVA DYNVEQAERGIVFIDEIDKIARKGDNPSITRDVSGEGVQQALLKILEGTVVNVPPQGGRK HPEQKFIQVDTRNILFICGGAFDGVEKKIAQRLNTRAMGYGRLRADRIDRNNLMQYITPQ DLKSFGLIPELVGRLPVLTYMQPLSREALRSILTEPKNAIVKQYKRLFELDNVKLTFDDA VLDYIVDKAVEFKLGARGLRSICEAIMMDTMFDLPSTDTRKFVVTLPYAKKKIEKANILE LKQVG Prediction of potential genes in microbial genomes Time: Wed Jun 22 12:49:20 2011 Seq name: gi|313157860|gb|AENZ01000052.1| Alistipes sp. HGB5 contig00017, whole genome shotgun sequence Length of sequence - 223431 bp Number of predicted genes - 185, with homology - 178 Number of transcription units - 81, operones - 43 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 1182 686 ## COG4974 Site-specific recombinase XerD - TRNA 1348 - 1423 75.0 # Lys CTT 0 0 - Term 1410 - 1444 -1.0 2 2 Tu 1 . - CDS 1562 - 2866 1519 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 2961 - 3020 1.6 + Prom 2796 - 2855 2.2 3 3 Tu 1 . + CDS 2992 - 3444 662 ## SRM_01521 hypothetical protein + Term 3509 - 3559 3.6 4 4 Op 1 9/0.000 - CDS 3439 - 4185 1117 ## COG3279 Response regulator of the LytR/AlgR family 5 4 Op 2 . - CDS 4178 - 5251 1715 ## COG3275 Putative regulator of cell autolysis - Prom 5276 - 5335 6.0 - Term 5258 - 5301 8.4 6 5 Op 1 . 
- CDS 5419 - 7119 2460 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 7155 - 7214 5.5 7 5 Op 2 . - CDS 7249 - 7947 1111 ## COG0344 Predicted membrane protein - Prom 7972 - 8031 9.6 + Prom 7932 - 7991 5.2 8 6 Op 1 . + CDS 8083 - 9072 1621 ## COG1284 Uncharacterized conserved protein 9 6 Op 2 2/0.000 + CDS 9086 - 10450 554 ## PROTEIN SUPPORTED gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 10 6 Op 3 . + CDS 10467 - 11975 1848 ## COG0513 Superfamily II DNA and RNA helicases + Term 12000 - 12040 8.9 - Term 12092 - 12127 7.1 11 7 Tu 1 . - CDS 12196 - 13500 2033 ## COG1160 Predicted GTPases 12 8 Op 1 . - CDS 13692 - 15758 3096 ## COG0475 Kef-type K+ transport systems, membrane components 13 8 Op 2 . - CDS 15800 - 16645 867 ## BF3354 hypothetical protein - Prom 16672 - 16731 4.3 + Prom 16688 - 16747 3.1 14 9 Op 1 . + CDS 16792 - 17916 1643 ## COG0738 Fucose permease 15 9 Op 2 . + CDS 17929 - 18525 1054 ## Krodi_1953 secreted protein 16 9 Op 3 . + CDS 18529 - 19500 1548 ## COG1702 Phosphate starvation-inducible protein PhoH, predicted ATPase 17 10 Op 1 . + CDS 19701 - 21170 2359 ## COG2195 Di- and tripeptidases 18 10 Op 2 . + CDS 21177 - 22163 1527 ## COG1597 Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase 19 11 Tu 1 . + CDS 22330 - 22683 133 ## Desal_1939 hypothetical protein + Term 22750 - 22787 7.1 - Term 22731 - 22780 16.5 20 12 Tu 1 . - CDS 22854 - 23429 924 ## gi|313157883|gb|EFR57289.1| putative lipoprotein - Prom 23570 - 23629 3.4 + Prom 23368 - 23427 1.9 21 13 Tu 1 . + CDS 23580 - 23963 661 ## COG2315 Uncharacterized protein conserved in bacteria + Term 24011 - 24045 -0.7 - Term 24055 - 24101 6.1 22 14 Op 1 . - CDS 24237 - 24992 929 ## COG3201 Nicotinamide mononucleotide transporter 23 14 Op 2 . - CDS 25050 - 27548 3894 ## BF0615 hypothetical protein 24 14 Op 3 . - CDS 27559 - 28050 -306 ## 25 14 Op 4 . 
- CDS 28116 - 29540 2191 ## COG0477 Permeases of the major facilitator superfamily 26 14 Op 5 11/0.000 - CDS 29545 - 30858 2185 ## COG2115 Xylose isomerase 27 14 Op 6 . - CDS 30882 - 32366 2062 ## COG1070 Sugar (pentulose and hexulose) kinases 28 14 Op 7 . - CDS 32387 - 33133 1244 ## COG1051 ADP-ribose pyrophosphatase - Prom 33155 - 33214 3.3 - Term 33312 - 33375 22.0 29 15 Op 1 . - CDS 33445 - 34503 393 ## PROTEIN SUPPORTED gi|15900011|ref|NP_344615.1| aldose 1-epimerase 30 15 Op 2 . - CDS 34503 - 36035 2263 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 31 15 Op 3 . - CDS 36083 - 37183 1861 ## COG0012 Predicted GTPase, probable translation factor + Prom 37442 - 37501 4.0 32 16 Tu 1 . + CDS 37650 - 40124 3288 ## COG1472 Beta-glucosidase-related glycosidases + Term 40150 - 40186 5.3 33 17 Op 1 2/0.000 - CDS 40313 - 40837 723 ## COG2207 AraC-type DNA-binding domain-containing proteins 34 17 Op 2 . - CDS 40716 - 41492 547 ## COG1609 Transcriptional regulators + Prom 41703 - 41762 2.3 35 18 Op 1 . + CDS 41798 - 44689 3619 ## COG3537 Putative alpha-1,2-mannosidase 36 18 Op 2 . + CDS 44726 - 45565 1210 ## COG3568 Metal-dependent hydrolase 37 18 Op 3 . + CDS 45574 - 47505 2650 ## COG1785 Alkaline phosphatase 38 18 Op 4 . + CDS 47516 - 49129 2323 ## COG1626 Neutral trehalase 39 18 Op 5 . + CDS 49167 - 50150 1293 ## COG1409 Predicted phosphohydrolases 40 18 Op 6 . + CDS 50161 - 50982 1203 ## COG1940 Transcriptional regulator/sugar kinase + Prom 50984 - 51043 3.4 41 18 Op 7 . + CDS 51065 - 52708 2028 ## COG3525 N-acetyl-beta-hexosaminidase + Term 52731 - 52768 3.5 42 19 Op 1 . + CDS 52864 - 54876 992 ## gi|313157942|gb|EFR57348.1| conserved hypothetical protein 43 19 Op 2 . + CDS 54881 - 55588 1015 ## COG0313 Predicted methyltransferases 44 19 Op 3 . + CDS 55611 - 56990 1789 ## COG1808 Predicted membrane protein 45 19 Op 4 . 
+ CDS 57002 - 57922 1189 ## CHU_1078 lauroyl/myristoyl acyltransferase (EC:2.3.1.-) 46 19 Op 5 1/0.000 + CDS 57909 - 58922 1052 ## COG1216 Predicted glycosyltransferases + Term 58949 - 58980 0.1 47 19 Op 6 . + CDS 58982 - 59845 860 ## COG2207 AraC-type DNA-binding domain-containing proteins + Term 59926 - 59958 7.7 - Term 59912 - 59946 8.1 48 20 Op 1 . - CDS 60005 - 61681 2496 ## COG0008 Glutamyl- and glutaminyl-tRNA synthetases 49 20 Op 2 . - CDS 61702 - 62766 1801 ## Sph21_1686 hypothetical protein - Prom 62872 - 62931 5.1 - Term 62963 - 63010 2.7 50 21 Tu 1 . - CDS 63119 - 63577 290 ## gi|313157912|gb|EFR57318.1| hypothetical protein HMPREF9720_0548 51 22 Tu 1 . - CDS 63686 - 63880 117 ## gi|313158016|gb|EFR57422.1| conserved domain protein - Prom 63940 - 63999 3.8 52 23 Tu 1 . - CDS 64001 - 64267 217 ## gi|313158024|gb|EFR57430.1| conserved hypothetical protein 53 24 Tu 1 . + CDS 64352 - 64918 862 ## COG0616 Periplasmic serine proteases (ClpP class) + Term 64930 - 64987 11.1 - Term 64929 - 64964 4.1 54 25 Op 1 . - CDS 64998 - 66575 991 ## Pedsa_3179 metallophosphoesterase 55 25 Op 2 . - CDS 66589 - 68604 684 ## COG1262 Uncharacterized conserved protein 56 25 Op 3 . - CDS 68623 - 69243 209 ## Pedsa_0075 hypothetical protein 57 25 Op 4 . - CDS 69258 - 70565 565 ## Cpin_5276 hypothetical protein 58 25 Op 5 . - CDS 70595 - 72073 922 ## Pedsa_0072 hypothetical protein 59 25 Op 6 . - CDS 72085 - 75279 1877 ## Pedsa_0071 TonB-dependent receptor plug 60 25 Op 7 . - CDS 75349 - 76956 485 ## BURPS1106A_3267 BNR/Asp-box repeat-containing protein - Prom 76995 - 77054 1.6 61 26 Op 1 . - CDS 77085 - 77846 605 ## CA2559_07475 putative S1/P1 nuclease 62 26 Op 2 . - CDS 77860 - 79743 1558 ## Pedsa_2434 sialate O-acetylesterase (EC:3.1.1.53) 63 26 Op 3 . - CDS 79756 - 81624 1712 ## Pedsa_0715 hypothetical protein 64 26 Op 4 . - CDS 81638 - 82639 1100 ## Phep_1498 hypothetical protein 65 26 Op 5 . 
- CDS 82663 - 84261 1542 ## Phep_2282 RagB/SusD domain-containing protein 66 26 Op 6 . - CDS 84274 - 87432 3279 ## Phep_0787 TonB-dependent receptor plug 67 26 Op 7 . - CDS 87466 - 89241 2562 ## COG2407 L-fucose isomerase and related proteins 68 26 Op 8 . - CDS 89261 - 91240 1706 ## COG1621 Beta-fructosidases (levanase/invertase) 69 26 Op 9 2/0.000 - CDS 91237 - 92724 2363 ## COG0591 Na+/proline symporter 70 26 Op 10 . - CDS 92756 - 93667 1431 ## COG0329 Dihydrodipicolinate synthase/N-acetylneuraminate lyase 71 26 Op 11 . - CDS 93691 - 94566 1107 ## COG0657 Esterase/lipase 72 26 Op 12 . - CDS 94563 - 96383 2158 ## Psta_1430 FG-GAP repeat-containing protein 73 26 Op 13 . - CDS 96439 - 96615 87 ## gi|313157966|gb|EFR57372.1| hypothetical protein HMPREF9720_0571 74 26 Op 14 . - CDS 96665 - 97723 1728 ## COG1609 Transcriptional regulators - Prom 97771 - 97830 7.3 + Prom 97715 - 97774 6.8 75 27 Op 1 . + CDS 97839 - 98978 913 ## FB2170_11376 hypothetical protein 76 27 Op 2 . + CDS 98990 - 100615 2047 ## COG1069 Ribulose kinase + Term 100640 - 100706 20.0 + Prom 100716 - 100775 4.1 77 28 Tu 1 . + CDS 100851 - 101303 476 ## COG0720 6-pyruvoyl-tetrahydropterin synthase + Prom 101308 - 101367 3.4 78 29 Op 1 . + CDS 101450 - 101581 131 ## 79 29 Op 2 . + CDS 101660 - 101992 249 ## gi|237711810|ref|ZP_04542291.1| transposase IS116/IS110/IS902 + Term 102106 - 102132 -0.7 - Term 102083 - 102120 0.6 80 30 Tu 1 . - CDS 102124 - 102546 432 ## BT_2612 hypothetical protein - Prom 102743 - 102802 4.4 + Prom 102833 - 102892 1.6 81 31 Tu 1 . + CDS 103001 - 103138 97 ## 82 32 Tu 1 . + CDS 103244 - 103762 160 ## gi|313157877|gb|EFR57283.1| hypothetical protein HMPREF9720_0577 83 33 Tu 1 . - CDS 104306 - 105523 1533 ## BF1833 putative bacteriophage integrase - Prom 105756 - 105815 2.6 + Prom 105732 - 105791 4.2 84 34 Tu 1 . + CDS 105817 - 106353 780 ## COG0566 rRNA methylases + Prom 106374 - 106433 4.4 85 35 Tu 1 . 
+ CDS 106599 - 107414 1201 ## COG0413 Ketopantoate hydroxymethyltransferase + Term 107436 - 107469 5.1 + Prom 107518 - 107577 4.3 86 36 Op 1 . + CDS 107643 - 109145 1937 ## COG0516 IMP dehydrogenase/GMP reductase 87 36 Op 2 . + CDS 109148 - 109480 284 ## Bacsa_0047 hypothetical protein 88 36 Op 3 . + CDS 109608 - 111167 2245 ## COG0519 GMP synthase, PP-ATPase domain/subunit + Term 111214 - 111266 15.2 - Term 111208 - 111249 11.7 89 37 Op 1 . - CDS 111304 - 112674 1425 ## COG2271 Sugar phosphate permease - Term 112788 - 112835 19.4 90 37 Op 2 . - CDS 112857 - 113945 1475 ## Sph21_4209 hypothetical protein - Prom 113980 - 114039 4.8 + Prom 114282 - 114341 2.1 91 38 Tu 1 . + CDS 114373 - 114867 499 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases + Term 114884 - 114937 12.5 - Term 114877 - 114920 9.2 92 39 Tu 1 . - CDS 114947 - 116206 2018 ## Weevi_1984 phosphate-selective porin O and P - Prom 116438 - 116497 4.7 + Prom 116303 - 116362 3.4 93 40 Op 1 . + CDS 116470 - 116976 725 ## COG2077 Peroxiredoxin 94 40 Op 2 . + CDS 116998 - 117273 440 ## BVU_3861 hypothetical protein + Term 117289 - 117336 10.3 + TRNA 117570 - 117660 64.2 # Ser GCT 0 0 + Prom 117572 - 117631 80.3 95 41 Op 1 . + CDS 117839 - 119284 2120 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 96 41 Op 2 . + CDS 119297 - 119869 535 ## COG0655 Multimeric flavodoxin WrbA 97 41 Op 3 41/0.000 + CDS 119851 - 120615 186 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 98 41 Op 4 24/0.000 + CDS 120612 - 121433 1019 ## COG0719 ABC-type transport system involved in Fe-S cluster assembly, permease component 99 41 Op 5 . + CDS 121436 - 122653 1698 ## COG0520 Selenocysteine lyase 100 41 Op 6 . + CDS 122704 - 123393 987 ## COG0125 Thymidylate kinase 101 41 Op 7 . 
+ CDS 123386 - 124492 1410 ## Odosp_1120 DoxX family protein 102 41 Op 8 27/0.000 + CDS 124554 - 125582 1219 ## COG0845 Membrane-fusion protein 103 41 Op 9 9/0.000 + CDS 125590 - 128622 4690 ## COG0841 Cation/multidrug efflux pump 104 41 Op 10 . + CDS 128619 - 129863 1775 ## COG1538 Outer membrane protein + Term 129879 - 129915 9.6 - Term 129862 - 129910 11.6 105 42 Op 1 . - CDS 129925 - 130332 608 ## BDI_1522 hypothetical protein 106 42 Op 2 . - CDS 130380 - 131804 1550 ## COG0642 Signal transduction histidine kinase 107 42 Op 3 . - CDS 131756 - 131950 159 ## - Prom 132071 - 132130 1.7 + Prom 132028 - 132087 4.0 108 43 Op 1 . + CDS 132118 - 135102 4496 ## BDI_3591 TPR domain-containing protein 109 43 Op 2 . + CDS 135102 - 136835 2097 ## gi|313157875|gb|EFR57281.1| hypothetical protein HMPREF9720_0605 + Term 136874 - 136912 2.1 + Prom 136852 - 136911 4.6 110 44 Op 1 . + CDS 136980 - 137273 318 ## COG3877 Uncharacterized protein conserved in bacteria 111 44 Op 2 . + CDS 137257 - 137799 815 ## gi|313157879|gb|EFR57285.1| hypothetical protein HMPREF9720_0607 + Term 137823 - 137857 5.9 + Prom 137870 - 137929 2.1 112 45 Op 1 . + CDS 138068 - 139081 1308 ## COG1559 Predicted periplasmic solute-binding protein 113 45 Op 2 . + CDS 139102 - 139704 766 ## COG0009 Putative translation factor (SUA5) 114 45 Op 3 . + CDS 139733 - 140311 712 ## gi|313158003|gb|EFR57409.1| hypothetical protein HMPREF9720_0610 115 45 Op 4 . + CDS 140329 - 141285 1109 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 116 45 Op 5 . + CDS 141290 - 141754 667 ## Bache_2138 hypothetical protein + Term 141771 - 141818 14.0 + Prom 141902 - 141961 7.5 117 46 Tu 1 . + CDS 142004 - 146023 1042 ## gi|313157890|gb|EFR57296.1| putative lipoprotein + Term 146040 - 146085 13.2 + Prom 146059 - 146118 5.8 118 47 Op 1 . + CDS 146145 - 147362 1160 ## COG1864 DNA/RNA endonuclease G, NUC1 119 47 Op 2 . + CDS 147383 - 148417 1339 ## Poras_0065 Endonuclease/exonuclease/phosphatase 120 47 Op 3 . 
+ CDS 148434 - 151193 3533 ## Poras_0864 TonB-dependent receptor 121 47 Op 4 . + CDS 151220 - 152245 1453 ## gi|313157923|gb|EFR57329.1| putative lipoprotein + Term 152256 - 152321 27.3 - Term 152250 - 152301 21.0 122 48 Tu 1 . - CDS 152316 - 154373 3136 ## COG1368 Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily - Prom 154398 - 154457 4.4 - Term 154422 - 154468 8.5 123 49 Op 1 . - CDS 154484 - 157312 4315 ## COG0612 Predicted Zn-dependent peptidases 124 49 Op 2 . - CDS 157338 - 157559 268 ## PROTEIN SUPPORTED gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 125 49 Op 3 . - CDS 157620 - 158309 1230 ## gi|313157995|gb|EFR57401.1| hypothetical protein HMPREF9720_0621 - Prom 158332 - 158391 4.5 - Term 158402 - 158431 1.2 126 50 Tu 1 . - CDS 158454 - 159065 1031 ## gi|313158028|gb|EFR57434.1| hypothetical protein HMPREF9720_0622 + Prom 159030 - 159089 3.2 127 51 Op 1 . + CDS 159165 - 159893 1049 ## COG3884 Acyl-ACP thioesterase 128 51 Op 2 . + CDS 159966 - 160502 815 ## COG2193 Bacterioferritin (cytochrome b1) + Term 160595 - 160625 3.0 + Prom 160507 - 160566 1.9 129 52 Tu 1 . + CDS 160716 - 161852 1523 ## COG0438 Glycosyltransferase - Term 162104 - 162144 4.1 130 53 Op 1 1/0.000 - CDS 162229 - 163887 2128 ## COG0365 Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases 131 53 Op 2 . - CDS 163891 - 164445 885 ## COG1396 Predicted transcriptional regulators - Prom 164682 - 164741 5.9 - Term 164680 - 164727 11.3 132 54 Op 1 . - CDS 164753 - 166141 2264 ## Odosp_1741 major facilitator superfamily MFS_1 133 54 Op 2 . - CDS 166164 - 168137 3453 ## COG0457 FOG: TPR repeat 134 54 Op 3 . - CDS 168207 - 169088 1445 ## COG1159 GTPase - Prom 169114 - 169173 3.1 + Prom 169078 - 169137 3.7 135 55 Tu 1 . + CDS 169244 - 169936 673 ## COG0778 Nitroreductase + Term 169953 - 170002 14.6 - Term 169940 - 169989 12.3 136 56 Tu 1 . 
- CDS 170048 - 171373 2032 ## COG0541 Signal recognition particle GTPase - Prom 171492 - 171551 4.2 + Prom 171312 - 171371 1.9 137 57 Tu 1 . + CDS 171536 - 172264 758 ## gi|313157874|gb|EFR57280.1| hypothetical protein HMPREF9720_0634 - Term 172402 - 172428 1.7 138 58 Tu 1 . - CDS 172456 - 173031 967 ## BF0012 hypothetical protein - Prom 173053 - 173112 4.0 + Prom 173001 - 173060 3.9 139 59 Op 1 . + CDS 173103 - 173477 539 ## COG5496 Predicted thioesterase + Term 173503 - 173550 4.5 140 59 Op 2 . + CDS 173567 - 174343 817 ## COG2720 Uncharacterized vancomycin resistance protein + Term 174361 - 174399 6.0 - Term 174349 - 174387 6.0 141 60 Op 1 . - CDS 174411 - 175748 2204 ## COG1055 Na+/H+ antiporter NhaD and related arsenite permeases 142 60 Op 2 . - CDS 175759 - 176259 679 ## gi|313158026|gb|EFR57432.1| hypothetical protein HMPREF9720_0639 143 60 Op 3 . - CDS 176340 - 176720 215 ## gi|313158013|gb|EFR57419.1| conserved domain protein - Prom 176852 - 176911 73.9 + TRNA 176838 - 176908 61.1 # Gln TTG 0 0 - Term 176907 - 176939 1.5 144 61 Op 1 . - CDS 177011 - 177157 219 ## 145 61 Op 2 . - CDS 177175 - 177426 484 ## gi|313157980|gb|EFR57386.1| transglycosylase associated protein 146 61 Op 3 . - CDS 177494 - 177742 310 ## bpr_I0029 hypothetical protein - Prom 177763 - 177822 6.2 - Term 178018 - 178061 8.1 147 62 Op 1 . - CDS 178079 - 178399 556 ## COG1862 Preprotein translocase subunit YajC 148 62 Op 2 . - CDS 178472 - 178885 507 ## gi|313157866|gb|EFR57272.1| putative lipoprotein 149 62 Op 3 . - CDS 178892 - 179827 1601 ## Bacsa_1247 NusB antitermination factor - Prom 179879 - 179938 2.3 + Prom 179763 - 179822 3.2 150 63 Op 1 . + CDS 179910 - 181187 1620 ## COG2081 Predicted flavoproteins 151 63 Op 2 . + CDS 181193 - 181705 294 ## gi|313158031|gb|EFR57437.1| hypothetical protein HMPREF9720_0648 152 63 Op 3 . + CDS 181702 - 182253 780 ## COG2096 Uncharacterized conserved protein + Term 182254 - 182304 -1.0 + Prom 182281 - 182340 10.6 153 64 Tu 1 . 
+ CDS 182417 - 182638 310 ## Bache_1018 hypothetical protein + Term 182685 - 182726 7.1 - Term 183011 - 183052 10.2 154 65 Op 1 . - CDS 183073 - 185661 3142 ## Pedsa_3179 metallophosphoesterase 155 65 Op 2 . - CDS 185741 - 187744 2513 ## Phep_1386 hypothetical protein 156 65 Op 3 . - CDS 187838 - 189370 2321 ## Odosp_3156 hypothetical protein 157 65 Op 4 . - CDS 189378 - 192593 4595 ## Pedsa_0071 TonB-dependent receptor plug - Term 193383 - 193415 6.3 158 66 Tu 1 . - CDS 193443 - 193877 424 ## - Term 194309 - 194353 12.7 159 67 Op 1 1/0.000 - CDS 194375 - 195139 964 ## COG0500 SAM-dependent methyltransferases 160 67 Op 2 . - CDS 195139 - 195975 1246 ## COG2996 Uncharacterized protein conserved in bacteria - Prom 196108 - 196167 1.9 + Prom 195866 - 195925 2.7 161 68 Tu 1 . + CDS 196152 - 197972 2069 ## COG2936 Predicted acyl esterases + Term 198039 - 198075 7.1 + Prom 198046 - 198105 5.3 162 69 Tu 1 . + CDS 198131 - 198895 1057 ## COG0388 Predicted amidohydrolase + Term 198936 - 198977 -0.8 163 70 Op 1 . + CDS 199033 - 199497 -117 ## gi|313158032|gb|EFR57438.1| hypothetical protein HMPREF9720_0659 164 70 Op 2 . + CDS 199528 - 202347 4339 ## COG0178 Excinuclease ATPase subunit 165 70 Op 3 . + CDS 202412 - 202954 796 ## gi|313157957|gb|EFR57363.1| hypothetical protein HMPREF9720_0661 166 70 Op 4 . + CDS 202956 - 203891 658 ## gi|313158011|gb|EFR57417.1| hypothetical protein HMPREF9720_0662 + Term 203917 - 203954 6.0 - Term 203893 - 203932 -0.8 167 71 Tu 1 . - CDS 203970 - 204452 634 ## COG0245 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase - Prom 204496 - 204555 3.0 + Prom 204448 - 204507 1.8 168 72 Tu 1 . + CDS 204674 - 206125 2262 ## COG2195 Di- and tripeptidases + Term 206149 - 206189 9.4 - Term 205987 - 206025 1.0 169 73 Op 1 . - CDS 206185 - 206700 408 ## gi|313157952|gb|EFR57358.1| putative lipoprotein - Term 206715 - 206751 1.6 170 73 Op 2 . - CDS 206763 - 207377 301 ## - Prom 207445 - 207504 6.6 + Prom 207478 - 207537 8.4 171 74 Op 1 . 
+ CDS 207656 - 210082 2780 ## BT_3560 hypothetical protein + Prom 210087 - 210146 2.8 172 74 Op 2 . + CDS 210174 - 211280 1594 ## COG0686 Alanine dehydrogenase + Term 211473 - 211516 2.3 + Prom 211488 - 211547 1.6 173 75 Op 1 . + CDS 211570 - 211890 485 ## gi|313158034|gb|EFR57440.1| hypothetical protein HMPREF9720_0669 174 75 Op 2 . + CDS 211877 - 212542 908 ## COG1137 ABC-type (unclassified) transport system, ATPase component + Term 212548 - 212597 2.3 175 76 Tu 1 . + CDS 212697 - 213710 749 ## PROTEIN SUPPORTED gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 + Term 213725 - 213757 3.0 176 77 Op 1 . - CDS 213640 - 214470 646 ## Poras_0865 extracellular nuclease 177 77 Op 2 . - CDS 214448 - 215212 598 ## Poras_0865 extracellular nuclease 178 77 Op 3 . - CDS 215221 - 216441 1849 ## COG0560 Phosphoserine phosphatase - Prom 216474 - 216533 2.1 + Prom 216436 - 216495 3.6 179 78 Op 1 . + CDS 216536 - 217459 1394 ## COG0668 Small-conductance mechanosensitive channel 180 78 Op 2 . + CDS 217471 - 217923 581 ## gi|313157902|gb|EFR57308.1| putative lipoprotein 181 78 Op 3 . + CDS 217945 - 218349 625 ## Odosp_3558 hypothetical protein + Term 218355 - 218387 6.3 + Prom 218385 - 218444 3.6 182 79 Tu 1 . + CDS 218481 - 218975 708 ## COG1803 Methylglyoxal synthase 183 80 Op 1 . - CDS 218990 - 220012 484 ## PROTEIN SUPPORTED gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase 184 80 Op 2 . - CDS 220009 - 221295 1813 ## COG1253 Hemolysins and related proteins containing CBS domains - Prom 221405 - 221464 3.7 185 81 Tu 1 . 
+ CDS 221437 - 223218 2720 ## COG0018 Arginyl-tRNA synthetase + Term 223234 - 223277 -0.8 Predicted protein(s) >gi|313157860|gb|AENZ01000052.1| GENE 1 3 - 1182 686 393 aa, chain - ## HITS:1 COG:BH1529 KEGG:ns NR:ns ## COG: BH1529 COG4974 # Protein_GI_number: 15614092 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerD # Organism: Bacillus halodurans # 101 388 4 280 299 60 26.0 4e-09 MNISINTVFRKDRLNSQNAAPVHLRLTQTRKNKFISTGVTLNIDDWDFENQRIKTETPEL QALQLRIDTKIDELRRKIKRLEALEVDVTLDNLLETNGRKINCTVGEYLKQTIERLETLG KYGSASKHRSLLSRLSQFRSLNIRFDEIDLAYLHDFELFLRKEGNTNNSIATKYAIFKAA YNKALAEGLFVPKTTPFTKYKVGSLWTRTRKRAITKEDIQELVALEIAPNYRTNYAEFAR DIFLFSYYTAGINFTDMATLRHCDIVDGRIYYSRHKTQKLLSFQLVPNAMRIIEKYSKAN HAQEDYIFPILDRSEHKTAQQIFNRTHKVLRKVNRELKMLGEQIGLEMPLTTYVARHTFA TVLKRSGVSVALISESLGHSDLSTTQIYLDSFE >gi|313157860|gb|AENZ01000052.1| GENE 2 1562 - 2866 1519 434 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 66 419 116 469 1087 115 25.0 1e-25 MRSILSIIALFSCCTLAAQEYIPTYGRELPRGEVIAYPTAQEAAAAFAGDTRYFTRLADW TQKGNVFSTEFTVPFAWANRQVFFHLGWASADYEVRVNGKAVAYDSDCSAPAEFNLTRHA QEGRNTLEIVVSAPSQVARLESWKSDPAPAIGPAWVMSQPTLRVRDVLTKSWRSSEENDK VMAEIALVVKTESLNPRTSRIHYELLTPAGSNAAIGYKDVTLDMRREDTVRFLARIPDSM LWNPGRPTQYTLRVKTQHEGRYMEYIELPLGFRTVELRNGQLAVNGNPVALRTREVPPQF PAEEIAKLREQGFNTLRLLPGPVSPALYAVCDTMGMYVIPQAPIDTRSSGESRRIGGNPS NDPAWQNAYVERTEDSYHTSKRHPSVIAFSLATKSANGINLYESYLNMKKFSETRPFIYP DAAGEWNSDGLKLE >gi|313157860|gb|AENZ01000052.1| GENE 3 2992 - 3444 662 150 aa, chain + ## HITS:1 COG:no KEGG:SRM_01521 NR:ns ## KEGG: SRM_01521 # Name: not_defined # Def: hypothetical protein # Organism: S.ruber_M8 # Pathway: not_defined # 7 150 5 150 157 116 41.0 3e-25 MSELPKLNFPAIRLRARRRGEQVEVWDDLRGIYLVLTPEEWVRRHLIAYLVSHCGVLPKR IVQEYAVPVNGQPQRADVVVVGDGAEPLVLAECKAPEIRIDERTLSQAVRYNSVLGAQFV 
ILTNGRRHYCCEYRDGRYVQLAGFPDFAAL >gi|313157860|gb|AENZ01000052.1| GENE 4 3439 - 4185 1117 248 aa, chain - ## HITS:1 COG:ECs2936 KEGG:ns NR:ns ## COG: ECs2936 COG3279 # Protein_GI_number: 15832190 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Escherichia coli O157:H7 # 1 241 6 242 244 107 29.0 2e-23 MLKCIAIDDEPLALRQIESYIRKIPYLELTASCNNALEAQQFLAGQHADLIFVDINMPDL SGVEFVRSLVERPMVIFTTAYSEYAVEGFKLDAVDYLLKPFSFADFSRSAGKANSLYELR HNQRPADASEPVPEALPKDKEYISVKADYKVSLVKISDIVYLESEGEYVRMHLADGSTIT TLFRLKNMETALPSEMFMRVHRSYIVNLRCIKGYVRGRVFLSDTEYVPIGENYKESFQRY IDANFKNL >gi|313157860|gb|AENZ01000052.1| GENE 5 4178 - 5251 1715 357 aa, chain - ## HITS:1 COG:SMb21546 KEGG:ns NR:ns ## COG: SMb21546 COG3275 # Protein_GI_number: 16264735 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Sinorhizobium meliloti # 160 339 185 363 383 94 32.0 3e-19 MKITKKQTIGENLLYVMVWSAIILVPVLNSLMMSELHVSWENVLIAWRKIAPYLIIFIVH NSVIAPRFMLKHKYWKYLITNLTLIIGTFWFVDFYEQHITDSLLFSDLSEDQFIEHRKAS FTDLEMYWNVILGFFMSGANTGIKLMYQSIRDEQQMEELKQQNMQAEMDYLKYQINPHFF MNTLNNIHALIDIDTEYAKNAVIELSKMMRYVLYDSGREIISLNKDIQFIQNYIGLMRIR YTNDVDIRVEYPHDLPTQVSIPPLLLIVFVENAFKHGVSYNNPSFIHMHVDYADGKVTGT ISNSRHTQKGERHSAGIGLDNVRKRLELIYGPRNYALDIEENPTTYTVKLVIPTLNA >gi|313157860|gb|AENZ01000052.1| GENE 6 5419 - 7119 2460 566 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 28 558 15 499 600 218 29.0 3e-56 MLQENLIKIYETSFRDNREMPALTDYFKGETFSYYEMAKEVAKLHLLFKKAGIRQGDKIA LIGRNNPRWCITYIATITYGAVIVPILQDFTPADIIHIINHSESRMLFLGDNFWDNIEEE QIRQIEAVFSLTDFHTIYERDGKALTKFQRDILKNYRSKYPRGFSINDIKYPEIPNDQVI LLNYTSGTTGYSKGVMLTVNNLTGNVIFAQRMINTQTGTFYFRKGGRTLSFLPLAHAYGC AFDFLSPLAVGGHITLLGKIPSPKILLEAMGVVRPTVICCVPMILEKVYRKQVMPMLEKG 
PMSIAVKIPLLNTAIYSVIRKKLLEAFGNNVDIFIVGGAPMNQETEAFLMKIKFPITIGY GMTECAPLISFTPDNEFKAGSCGRYLHELLEVKIDSPDPQHEAGEIIVRGEHVMKGYYKN EKDTEKVLEPDGWLHTGDMATMDPDGTLYIRGRSKTMILSGSGQNIYPEEIEDKLNNMYL VLESLILETENGKLKALVVPDYEQAELEGVDKNDLPQIMQNNLTELNTLLAAYERVSELA IYPTEFEKTPKRSIKRYLYSPSLLTK >gi|313157860|gb|AENZ01000052.1| GENE 7 7249 - 7947 1111 232 aa, chain - ## HITS:1 COG:FN0537 KEGG:ns NR:ns ## COG: FN0537 COG0344 # Protein_GI_number: 19703872 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 11 211 6 191 194 133 41.0 2e-31 MILTYYTATTLIIIAYLLGSIPSAVWIGKKYYGIDIREHGSKNAGTTNMLRVLGKRAALP VFGLDVLKGFVAVTIIDLMKYDSVFADGANETWFYILRFIAVFAAVLGHIFPIFAGFRGG KGVATLVGAIMGVNAPLVLLCFGVWFLVLMITHYVSLASMVAGCSFPIFTLISPKVNHLV PFVVFSFIIAVLLIYTHRKNIERLKAGTESKIYIWKPRHVKQQPPKPEKKEE >gi|313157860|gb|AENZ01000052.1| GENE 8 8083 - 9072 1621 329 aa, chain + ## HITS:1 COG:CAC0496 KEGG:ns NR:ns ## COG: CAC0496 COG1284 # Protein_GI_number: 15893787 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 16 319 4 278 279 141 31.0 2e-33 MNTKALLEPMGSWAWWRSWFLIFLGCSIMGAGFVLFVNPYNFVPGGVYGMGIVLHNIFPS IQVGTFGYMFDVPLMLTAMLVFGGQFGTRTVLAALYTPGFMNMLTRLVYPDAAAVESLDP ALLLGGRLDLSNDLLLTCIIGAVVIGIGQGVVVRQQATTGGTDIVAMLLQKFAGIKFSTG ILLADGFVVLSGLAVIGFGLGTGDAASNGWMLTLYSLITIYITSRVIAYLLNGASYDKLL FIISDHHEQLKRFIIEDLDRSATYIKSKGMYTDALRDMIFLVVSRKEVRLVQHKIKEIDP SAFVVVTDAYETFGEGFKQFPDKNEIHAE >gi|313157860|gb|AENZ01000052.1| GENE 9 9086 - 10450 554 454 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|16079597|ref|NP_390421.1| hypothetical protein BSU25430 [Bacillus subtilis subsp. subtilis str. 
168] # 24 438 8 423 451 218 32 2e-55 MELRLDTNTLRPLSGAGRKLFVETYGCQMNVGDTEIVVSVMQREGYVYTERIDEADVILI NTCSIRDNAEQRIWGRLAEMKRYRRANPGLVVGVIGCMAERLKEKLVEGPHGVDVVAGPD VYRDLPRLVREAEAGGKGVNVLLSTEETYAEIAPVRLDRNGVSAFVAIMRGCNNFCSYCV VPYTRGRERSRDPETILAEVRTLFGNGYREVTLLGQNVNSYRFGEVDFPGLMRRVASVSP LLRVRFATSHPKDISDSLLEVMAEMPNICRAIHLPAQSGASSMLARMNRKYTREWYLDRV AAIRRYLPDCAVTTDLIAGFSGETEEEHAATLSLMREVGYEFAYMFKYSERPGTYAHKHL PDDVPEEVKSARLAEIIALQNELSRASNLRDVGREFEVLVEGTSKRDENQLSGRTSQNKV VVFDRGGHGVGDYVRVRITGCTPATLFGEEISEG >gi|313157860|gb|AENZ01000052.1| GENE 10 10467 - 11975 1848 502 aa, chain + ## HITS:1 COG:STM0820 KEGG:ns NR:ns ## COG: STM0820 COG0513 # Protein_GI_number: 16764183 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Salmonella typhimurium LT2 # 1 385 1 379 454 282 39.0 1e-75 MRFDELELEDEILDGLYDMNFQEMTPVQEHTIPVILEGRDIIGCAQTGTGKTAAYTLPLL NRLLLEGNEDNVIKSVIIVPTRELAQQIDQQFQGFSYYLPISTTVVYGGGDGKGWDVQKR GMLMGSDVVIATPGRMISHIQNSGIDLSHVECLILDEADRMLDMGFSEDIMKIVSYMPKE RQTIMFSATLPPKIRELAKTILRNPAEVNIAISKPNEAIDQSAYVCYENQKLGIIREMFA EPTESKTIIFSSSKMKVKELAHTLKRMKLNVAAMHSDLEQAQREEVMLDFKNNKVSILVA TDIVARGIDIEDIGLVINYDVPHDPEDYIHRIGRTARAAATGAAVTFVSEEEQGKFHAIE KFIERDIRKAELPASVGEGPKYAPDENRGRFGRGGRGGHGGSGRGGSGRGRGRGDKGDRG RSSRGDGDRNRDRRGDRERAAAAPAENGVGTPQAAPADNDHASGRGEGNRGTRDRKRHRG SRNRNRNRGGNPQEGGATPLAN >gi|313157860|gb|AENZ01000052.1| GENE 11 12196 - 13500 2033 434 aa, chain - ## HITS:1 COG:SP1709 KEGG:ns NR:ns ## COG: SP1709 COG1160 # Protein_GI_number: 15901543 # Func_class: R General function prediction only # Function: Predicted GTPases # Organism: Streptococcus pneumoniae TIGR4 # 4 434 6 435 436 381 46.0 1e-105 MSLVAIVGRPNVGKSTLFNRLVGQRKAIVDATAGTTRDRHYGKTDWNGKEFSVIDTGGYT VNSEDIFEDEIRRQVLLAIDEADVILFLVEVATGITDLDQLMADILRRTSKKVILVCNKV DNNDQIYSSHEFYALGLGDPFCISSMSGSGTGDLMDAILAALPAETVPEEDEDLPHITIV GRPNVGKSSMTNALLGEERNIVTSIAGTTRDSIHTRYNKFGMDFYLVDTAGMRKKGKAME 
DLEFYSVMRSIRAIENSDVCILMLDAQQGLESQDLNIHNLIVRNRKGCVIVVNKWDLIEK DNNTMKEWTEFLKKKLAPFNDIPIIFTSVLSKQRIFDVLQTAIRVYQSRKRRISTSELND FLLPLIENYPPPATKGKYIKIKYVTQLPTPTPQFAFFVNLPQYIKEPYRRFLENNIRDHW DFAGVPMQIYFRQK >gi|313157860|gb|AENZ01000052.1| GENE 12 13692 - 15758 3096 688 aa, chain - ## HITS:1 COG:BH2844_1 KEGG:ns NR:ns ## COG: BH2844_1 COG0475 # Protein_GI_number: 15615407 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, membrane components # Organism: Bacillus halodurans # 11 408 2 388 388 330 46.0 8e-90 MMFQFAHIAPITLPLEDPILKFLLILIIILAAPLLLNKLKIPYLLGLIIAGAVIGPHGLN LVLRDSSIILSGTAGLLYIMFLSGLDMDTSDFKKSGWRSLVFGLYTFCVPLALGILAGYY IMGFSIYSSILLAGLFASQTLIAYPIVSKLGISRDKAVTIAVGGTVITDTLALLLLTVIV GMATGNVDESFWWRLSASVLLCIGIILFLFPLLAHWFFKQCSDNVSQYIFVLAMVFLGAY LAQLAGLEPIIGAFLSGLALNRLIPRTSPLMNRIEFVGNAIFIPFFLIGVGMLIDYRAFF RDWESIEVALIMIVLITAAKYIAAYLTQKTFRLSPDQRTVIFGLSSAHVAATLAAVMVGY NVILGHTPDGEPIRLLSESVLNGTILMILATCTVSTFATQRGAHNIAMRNAQDDDEQPQQ EDHILIPVANEQTVSELVSLSTTLKKAKSRNGLYALNAIDNKVEDPTVEKRGRKILDTAA KAAAASDNYLQALLRYDVNIANAIVSVVKERSITDIVMGMHHDRTPGGSGIGRMAADVLG QSNVTTYLYHPAQPLPTIKRHLVVVPEKAEKEAGFQMWIQKLRNIADNSGARIVFYAPEN TMQYLCPTRGKRSAKAEYVVSNPWDDLTALTYETKRDDCLWVVMSRRERVSYHPAMSKVP ACMEQFFADNSFVLVYPVQAGDTESWYI >gi|313157860|gb|AENZ01000052.1| GENE 13 15800 - 16645 867 281 aa, chain - ## HITS:1 COG:no KEGG:BF3354 NR:ns ## KEGG: BF3354 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 257 5 304 335 172 40.0 1e-41 MDKKRARYAASLGLVAAMVAAAELLGEQEVVFPEIAALAAGMWVIDKQVWRIGRRQAVWL MTLAAAAGWLIARIPGLPPPGGIALGFLFAAGCQVLTRTSLAPMLSACLLPVALGADSPV YPVAVFLLTLTLSGGQLWMERRKLRTPLPPVFHLPRRRDECLVWLRMFAVLMLIAAGAAL GAGFRRLLCVELGMPQAAAALAACACLFALFERYGRIFAPAGAVALIPLLIPARDLSVYP LQVAAGAAVLIAAGMLLGRRKRSAGSAVRTPQQQNAAGACF >gi|313157860|gb|AENZ01000052.1| GENE 14 16792 - 17916 1643 374 aa, chain + ## HITS:1 COG:NMB0535 KEGG:ns NR:ns ## COG: NMB0535 COG0738 # Protein_GI_number: 15676441 # Func_class: G Carbohydrate 
transport and metabolism # Function: Fucose permease # Organism: Neisseria meningitidis MC58 # 9 350 39 398 426 70 24.0 5e-12 MLGPVLMGFFIMGFCDMVAPITGRIALEFPASRQAAVSFLPTMVFLWFLVLSTPLAALMN RIGRKATALIGYAFTVVGLMVPYVAGEGCALGWYFAGFGLLGVGNTAIQVAINPLLATIV PGERMTSYLTVGQIFRNTSLLLLAPIVTALVALTGSWRLLLPIYAGLTVLGGIWLQATSV AEPPRSDRAAGMADCFRLLGNPTVLLCTLGVACFIAGDVGIGFLSVRLIDNPDSILTTTG FYACRIVGTLVGAWVLVRVSDVKYLRWNMAGALALCAVLLLVRGETAVYAAVGLTGFAMA CVFATFYAVATKAVPEKANEVAGLMILAISAGAVSGPVCGAIVRWSGDAHWGMLFVALCV GYMLWASFKLKIKQ >gi|313157860|gb|AENZ01000052.1| GENE 15 17929 - 18525 1054 198 aa, chain + ## HITS:1 COG:no KEGG:Krodi_1953 NR:ns ## KEGG: Krodi_1953 # Name: not_defined # Def: secreted protein # Organism: K.diaphorus # Pathway: not_defined # 2 168 1 170 209 66 28.0 5e-10 MIRKIVVTSLLVVCALSAAAQRPRARIEWGVLGGINIPDYTTNMSGTDVKNKLGWQAGIT TAVNLGAFAVEPQILYVRQGLRIRPEGQKEINLKSNSIDVPVLLSFRLLRPVRIYAGPVF TVMNDCKQKSGGDLLDFGRVRPTMSYTVGAGVKLLGHLLVDLRYNGQFKSKKDVVLPDGG KLDKLRTYNVALNFGYLF >gi|313157860|gb|AENZ01000052.1| GENE 16 18529 - 19500 1548 323 aa, chain + ## HITS:1 COG:lin1504 KEGG:ns NR:ns ## COG: lin1504 COG1702 # Protein_GI_number: 16800572 # Func_class: T Signal transduction mechanisms # Function: Phosphate starvation-inducible protein PhoH, predicted ATPase # Organism: Listeria innocua # 10 312 11 315 319 264 45.0 2e-70 MTDAVGKEILIEGADPRELYGPRNVYLDQLKSLHPQLRIVARGSSLKVLGSEAETRRFEE RMEGLVAYYLKYGHISSQVIDQCFAGGIAAQEAPVDKDVIVYGNNGNIVRARTVNQLRLV KLYDRNDLLFAVGPAGSGKTYTAIALAVRALKEKQVRRIILTRPAVEAGEKLGFLPGDMK EKLDPYLQPLYDALNDMIPPAKLAKYLEEGTVQIAPLAYMRGRTLDNAFVILDEAQNTTL SQIKMFLTRMGRNAKFIVTGDVTQIDLPRKSDSGLVRTMEILREVEGIGIVEFDVRDIVR HPLVKHIVEAFDKCGTDLSKAEK >gi|313157860|gb|AENZ01000052.1| GENE 17 19701 - 21170 2359 489 aa, chain + ## HITS:1 COG:VC2279 KEGG:ns NR:ns ## COG: VC2279 COG2195 # Protein_GI_number: 15642277 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Vibrio cholerae # 7 489 52 533 534 464 47.0 1e-130 
MNDTRNLAALKPALVWKHFAQIVRIPRPSSHEEKIRKYVMDFAREKGLECKEDAAHNVYV RKPASKGMEDRKGVVLQAHLDMVPQKNNDKKFDFTKDAIDAYVDGEWVTADGTTLGADNG IGAAAILAVLEDDSLVHGPLEALFTATEETGMDGAFGLKKGLLHGDILLNLDSETEGELY VGCAGGLDANIVFKYAAEPTPARNYTAARITVKGLKGGHSGIQIVCQRANANKVLFRFLN AASASRDLVLCSVDGGGLRNAIPREAEAVVMVKTKEYEAFAKELKAYEKVIKAEYAGIED AVSIKLKECEAPKERIAADVAAQLTKAVVGCPDGVQRMSVSMPGLVQTSSNLARVVSDGK SVKLQCLLRSSVNSEKAALGEAIAAVFSLAGAKVQLTGSYDGWNPNMESPILKAMTASYK ALYGKEPAVTAIHAGLECGIIGSKYPKMDMISFGPTICYPHSPDEKVEIASVGKFYDFMV DTLRNIPKK >gi|313157860|gb|AENZ01000052.1| GENE 18 21177 - 22163 1527 328 aa, chain + ## HITS:1 COG:TM0358 KEGG:ns NR:ns ## COG: TM0358 COG1597 # Protein_GI_number: 15643126 # Func_class: I Lipid transport and metabolism; R General function prediction only # Function: Sphingosine kinase and enzymes related to eukaryotic diacylglycerol kinase # Organism: Thermotoga maritima # 9 312 5 304 304 182 35.0 1e-45 MEADKVKWFVIVNPVAGGGRGLDHFPQISKLLRDAHIVCEPVFTEHKFHATELTVSAVKE GYRNIIVVGGDGTLHEVVNGLFIQQEVCPDEVLLAVIAVGTGNDWVRTFGISNRYQDAVK AIGEGYSFLQDVGVVSYEESHYRQSRYMANVAGAGFDARMVRKLSHLKKKDRKSRWRSTW CLVKNFFRYKSTGVKVWVDDRLVYNNLLFSIAIGICKYNVGGMQQLPDAVADDGMLDVSL VRPVHFWHLLFRFHYLFNGGIYRIRHILQERGSCIRIESSPEISVEVDGELLGETPLEFS ILHKAVRIVVGREFYDRQERGECPCEGK >gi|313157860|gb|AENZ01000052.1| GENE 19 22330 - 22683 133 117 aa, chain + ## HITS:1 COG:no KEGG:Desal_1939 NR:ns ## KEGG: Desal_1939 # Name: not_defined # Def: hypothetical protein # Organism: D.salexigens # Pathway: not_defined # 1 116 4 119 122 173 65.0 2e-42 MTPKEVLEKWIDRFNAADAVRLTELYAEDAVNHQVANKPVVGKTAIHEMFANEFATAEMV CIVENIFEDGEWAIMEWKDQLGLRGCGFFHVKNDKIVFQRGYWDKLSFLKQHNLPIE >gi|313157860|gb|AENZ01000052.1| GENE 20 22854 - 23429 924 191 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157883|gb|EFR57289.1| ## NR: gi|313157883|gb|EFR57289.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 191 1 191 191 357 100.0 2e-97 MKKILLCICLLSTFAAGLVGCSQQRQWNHEQRKAMREALRSYRQMVYLDDLSDDEFVLFT DEVAGDLEGAYPVYTTFVEMPGVTDTVDMVVVTTIVDELNADAHNMRHIYPYNYLVAQGV LPAGLDHQQQKAFYNCFAGKVNAAYSTMSQFFNAILADTTDMSQIRQMEAGCANDLFDWV ITEVEVTETAD >gi|313157860|gb|AENZ01000052.1| GENE 21 23580 - 23963 661 127 aa, chain + ## HITS:1 COG:DR2400 KEGG:ns NR:ns ## COG: DR2400 COG2315 # Protein_GI_number: 15807390 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Deinococcus radiodurans # 6 127 11 130 132 72 35.0 2e-13 MDVLTFRDYCLSLPMAEETTPFDETTLVYKVGGKMFACADMAEFEWVAVKCDPAEAVLLR EQYDEVSAAWHFNKRHWNAVRTTGDLPDGFIREQIRNSYLLVLRCSVTPKALREEILAYV GEHGLPE >gi|313157860|gb|AENZ01000052.1| GENE 22 24237 - 24992 929 251 aa, chain - ## HITS:1 COG:YPO1128 KEGG:ns NR:ns ## COG: YPO1128 COG3201 # Protein_GI_number: 16121428 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinamide mononucleotide transporter # Organism: Yersinia pestis # 150 244 137 231 241 63 32.0 3e-10 MDWKLILQIAGVVLGLLYLWLEYRADIRLWIVGLVMPLVHGALYYKAGLYADCSMQVYYV LAGLYGWLVWRNAPRKKAKTARNAAAAGPTDQTARSAEPAAARTARNTTAGHSDQTACNA GPEAAQTARSAEPAAVRIGHTPLRYAAGLIAVYAAAHAGIYFLLSRFTNSTVPFWDAMTT AASIVAMWMLSRKYIEQWLVWLAVDLITIGLYLYKGIPLTAGLYALYSALAVAGYLRWRK LAAQETTGAAK >gi|313157860|gb|AENZ01000052.1| GENE 23 25050 - 27548 3894 832 aa, chain - ## HITS:1 COG:no KEGG:BF0615 NR:ns ## KEGG: BF0615 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 87 832 24 739 739 589 42.0 1e-166 MKRFLLPIAAVALCVAANTASAENPANAVITAAAETPAAVSLTAGVTNAATAADATRSAA FVTAAAVPPAKTSEAITAAAGPARPEDSVRMYRLQEIEVTATRASGSTPVAYTDLSHEVI ARNSYGFDIPSVLALTPSMIATNETGIGIGGTSMRLRGTDATRLNVTINGVSMNNPDSHS MYWYDTPDLISSVGTMQIQRGAGISTNGTGAFGGAVNMSTAPLETEFGGDVSLSYGSYNT NKQAVHVSSGLMGGHWTVDARLTRLSSDGYIRRGGTSLTSYMFQGGYYNGNTMLKLVSFG GKAKTNLSYNGATRDEMRLNGRRYNSSGMYSTSDAYSHRIWNSEDGEWQQVNFYDDETDN YLQINNQLILSQRLSDRWTLSATAFYTYGYGYYKQYKDNAKLFEYLFPDHLAYQTDDQGN 
FVTDGDGERIAVRRDLIREKLMRNHLGGLNAAAAYAADNLDLAFGGSWSYYSCPHWGELD WMQETDPADYRSKRWYDNDVDKQDANLFARANWTVARGLNLFADMQYRYVHYKAWGVNDN YDEASGGMQPIDVDETYHFFNPRAGVSYTLAKRHNFYFSFAVAQKEPTRSDFTDRYMFSG DDSYPSSEKLYDYELGYDYSSPRFSAGVNLYYMKYKDQLVHTGMVNDSSDALNVNVPDSY RRGIELTASWRVTGWFTAGANATFSQNRIENYVDALSNSPTYGQNLGERTISYSPSAMGS LFLDFHYRGFEAVLHTQAVGKQYFTNDENETLSLGRYCVTNLNLGYTFRTRAARSVRLGL QVNNLFNAEYESNGYGYSYMDTWSNPTPQRIDMAYYFAQAPLNVLANVTVKF >gi|313157860|gb|AENZ01000052.1| GENE 24 27559 - 28050 -306 163 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPVTERPGRETQAPGHLRQQGPRRRNPKAGSIAGCRRGRRIRRRNRATQAAEQRTKRAER AVRTRFARKSHRPQCNDAGTCPRRCIIVTKPSDNRSCLHLLSFSEILRIPLPISRIFYYL CRRYFEGVPYYQAEIIPIEPDTGNAETGMHNLSTHTPFQFINF >gi|313157860|gb|AENZ01000052.1| GENE 25 28116 - 29540 2191 474 aa, chain - ## HITS:1 COG:ECs5014 KEGG:ns NR:ns ## COG: ECs5014 COG0477 # Protein_GI_number: 15834268 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Escherichia coli O157:H7 # 11 469 8 478 491 422 50.0 1e-118 MTKTQETGSRAYLLSIVLVAIIGGLLFGYDTAVVSGAEQALDQFFRSAADFTYTPWMHGF TASSALVGCVIGGALSGLLASRLGRKRSLVAAAVLFFVSAVGSWRPESGILPYGEASLPL LASFNIYRILGGIGVGLASAVCPMYIAEISPAKIRGTLVSCNQFAIIFGMLVVYFVNYLI RNQLGDTGEAIQAAMVSVGWRRMFLSEAFPAATFLLLILFVPETPRYLTLTGRDQKALQV LTRINGAAEARTIHTEIRRTVHAKTERLFAYGAAVILIGILLSFFQQAIGINVVLYYAPR IFAGMGASGDASMLQTVVMGVVNILFTVVAIFTVDRVGRKPLLIVGSAGMMIGMAALAAL SFTGSIGIAALVFIIIYTASFMMSWGPICWVLISEIFPNTIRSQAVAVAVAAQWISNFLV SATFPSLSAWSVGGTYCIYALMALASALFVWKWVPETKGRTLEEMSKLWRKNGD >gi|313157860|gb|AENZ01000052.1| GENE 26 29545 - 30858 2185 437 aa, chain - ## HITS:1 COG:STM3661 KEGG:ns NR:ns ## COG: STM3661 COG2115 # Protein_GI_number: 16766946 # Func_class: G Carbohydrate transport and metabolism # Function: Xylose isomerase # Organism: Salmonella typhimurium LT2 # 4 437 2 437 440 452 49.0 1e-127 
MASKQYFPTVGQIEFEGLESKNPMAFRYYDPEKIVKGRKMKDWFRFSMAWWHTLCAEGGD QFGGGTKTFPWNESACPIERAKAKMDAGFEFMRKIGIEYYCFHDVDLVDEAATPEEYEKN LRQIVAYAKQKQDETGIKLLWGTANVFGHARYMNGAATNPDFDAAARAMLQIKNAIDATI FLGGENYVFWGGREGYMSLLNTDMKREKEHMATMLRMARDYARSKGFKGNFLIEPKPMEP MKHQYDADTETVIGFLRAHGLDKDFKVNIEVNHATLAGHTFEHELQCAVDAGMLGSIDAN RGDYQNGWDTDQFPIDLYEMVQAMMVIVRGGGLQGGGTNFDAKTRRNSTDPEDIFIAHIA AMDVMARALLIAADILDNSPYEKMLAERYASYDSGEGRLFEEGKLTLEQVADYARRHEPV QRSGKQELFEAIVNMYI >gi|313157860|gb|AENZ01000052.1| GENE 27 30882 - 32366 2062 494 aa, chain - ## HITS:1 COG:CAC2612 KEGG:ns NR:ns ## COG: CAC2612 COG1070 # Protein_GI_number: 15895870 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar (pentulose and hexulose) kinases # Organism: Clostridium acetobutylicum # 4 458 5 462 500 199 31.0 1e-50 MKSLGIDIGSSSVKVSLLDIASGECAASSANPATEMPIGSPQSGWAEQDPEMWWHYVCEG IRTIAAQGFAMSDVVSVGITYQMHGLVCLDKQGRPLRPSIIWCDSRAVEIGAEALEGIGR EFCLAHTLNSPGNFTASKLAWVRRNEPGIFAQIYKFMLPGDYIAYRLSGRMSTSVSGLSE QILWDFEEERRADFVAGWYGIPQETIPEAGVSIGTEARTDEAAERLLGIPAGTPISYRAG DQPNNAFSLNVMESGEVAATGGTSGVVYGVTDKRQADPQSRVNTFVHVNHTASNPRYGIL LCINGTGIMNSWIRRNVTQETLDYAEMNRLAASVPAGSEGLSVVPFGNGAERMLCNRCPG AGIIGLDLNRHTTAHILRATQEGIAYSFRYGIDIMRGLGVRPDVIRAGAANLFLSPLFRQ TLSTLTGARIELFNTDGALGAARGAALGAGYYKTRGEAFAALRRLEVVEPAEADRETLEE GYRAWVREVEKRME >gi|313157860|gb|AENZ01000052.1| GENE 28 32387 - 33133 1244 248 aa, chain - ## HITS:1 COG:alr2484 KEGG:ns NR:ns ## COG: alr2484 COG1051 # Protein_GI_number: 17229976 # Func_class: F Nucleotide transport and metabolism # Function: ADP-ribose pyrophosphatase # Organism: Nostoc sp. 
PCC 7120 # 9 230 23 237 248 115 32.0 1e-25 MSIQMNQSLSVDCAVFGFNGKSLKVLLIERRYYAPDTRLDKLKLPGAMILDNETLPQAAY RVLEEATGLRDVYLKQMDIFSDPNRVSGEELEWINRYHGIRTERVVTVGYYALVKLDTRT VAYTTAKGAQWVDVDSIQRLAMDHKQILSAALAVLCREMLQSPVAFELLPRKFTIRALQN LYSAVLGIEIDNRNFRKKILASGFLTPTAEREQGVAHKPAQYYTFNKNAYKKALKAKLKL GFINNWRY >gi|313157860|gb|AENZ01000052.1| GENE 29 33445 - 34503 393 352 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|15900011|ref|NP_344615.1| aldose 1-epimerase [Streptococcus pneumoniae TIGR4] # 14 343 13 337 345 155 28 1e-36 MVQIEQHVWGMTPEGEAIILYTMRNDKGAEVKITNFGAAIVSVTMPDREGRMADVVLGYK HPEGYFFDGAASGKSVGRCANRIAFGRMTVEGKEYALEVNNGPNHLHGGTKNFANRIWES RVETNRVVMSLLSEDGDQNYPGELNVEAAFDFDDENSLEITYLARTDKTTVVNLTNHVYF NLAGEGSGSVLDHQLRLNSSQVLEMNDKQIPTGRLLDVAGTPQDFREFRPFRPGIDSEFN HIRDFKGYDHPFVIDGWKPNILGEVGCLREQTSGRCVTVLSSQPSVMIYTGNWLAGGCPE TKSDGRYNDYDGVAIECQNYPDAVNHPEFPSPLLRPGETYCQKIVFRFGTFA >gi|313157860|gb|AENZ01000052.1| GENE 30 34503 - 36035 2263 510 aa, chain - ## HITS:1 COG:DR0485 KEGG:ns NR:ns ## COG: DR0485 COG0008 # Protein_GI_number: 15805512 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Deinococcus radiodurans # 2 500 47 521 525 320 37.0 6e-87 MSRPVRVRFAPSPTGPLHIGGVRTALYNYLFARQHGGVMILRIEDTDSQRFVPGAEEYIL ESLKWCGIEIDEGVGAGGPHAPYRQSERREIYLKYALQLVDAGWAYYAFDTAEELDALRR EAEERGEAFAYNYAVREKLATSLALPADEVKARIERGDQWVIRFRMPENQTVEMDDLIRG HVEVNTSTLDDKVLYKSADALPTYHLANIVDDYLMEVSHVIRGEEWLPSLPLHYLLYRAF GWQERQPRFAHLPLLLKPTGGGKLSKRDGDKMGFPVFPLFWTSPTGETARGYREDGYFPE AFINMLALLGWNPGTEQEIFSMQELIDSFSLERVSKSGARFQPDKAKWFNAQYMHHKSDA ELAALFQPILRDHGIEVADEVAGRAAGIMKERATFITDLWDLTSFFFTAPAEYEEKQTRK YWKGQNPEILRELRGVLASIDDFSKENTEQIVHGWIEEKGYGMGQVMNTLRLALVGAGKG PGMYDVTAFIGREETLRRIDHILATLKPAE >gi|313157860|gb|AENZ01000052.1| GENE 31 36083 - 37183 1861 366 aa, chain - ## HITS:1 COG:SMc02695 KEGG:ns NR:ns ## COG: SMc02695 COG0012 # Protein_GI_number: 15966110 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted 
GTPase, probable translation factor # Organism: Sinorhizobium meliloti # 1 365 1 366 367 404 55.0 1e-112 MALQCGIVGLPNVGKSTLFNCLSNAKAQAANFPFCTIEPNVGVITVPDRRLVRLAEIDKP KRVIPTTIEIVDIAGLVKGASKGEGLGNKFLANIRNTHAIIHVLRCFDNGNITHVDGSID PVRDKGIIDTELQLKDLETIENRLGKTQKQAAVGGDKNAKRQVELLLQYKSVLEQGRSAR TVELEKEDRKLVADLNLLTDKPVLYVCNVDEKSAATGNAYTEAVRAAIADEKAEMLIVAA ATEADIAELESYEERQMFLEDLGLEESGVERLIKSAYKLLNLETFFTTGADESRAWTYVR GMKAPQTAGVIHTDFEKGFIRAEVIKYDDYVTLGSEKACRDAGKIGVEGKEYVVQDGDIM HFLFNV >gi|313157860|gb|AENZ01000052.1| GENE 32 37650 - 40124 3288 824 aa, chain + ## HITS:1 COG:SPBC1683.04 KEGG:ns NR:ns ## COG: SPBC1683.04 COG1472 # Protein_GI_number: 19111852 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Schizosaccharomyces pombe # 32 824 8 823 832 404 32.0 1e-112 MKKTLVAVLAVAFVSSAAAQPIPPQAKERAAELVGQMTLDEKIDYIGGYNEFYIRAVPRL GIPEIRMADGPQGVRNNTRSTMFPCGVAAAATWDRALVRDMGRGLGQDARARGVHIMLGP GVNIYRSPLCGRNFEYFGEDPYLASETAVQYIEGMQSEGVMATIKHFAGNNQEWDRHQVS SDIDERTLHEIYLPAFRKAVEQAGVGAVMSSYNLVNGQHATENEQLAVDILRGMWGFEGI FMSDWNATYSAEGAANRGLDLEMPSARFMNARNLRPLIESGVVSERTIDLKCQHILQTLI AFGFLDRQQLDPAIPECNPFSDAAALDVARGGVVLLKNDGAFLPLTKQRDIVVLGPNSGN IPTGGGSGFVHPFSTVSVGEGMQMMGKKYRVTVLGNLPSASDMAAQGMVYTSADCKTPGL RGEYFANKRFEGTPALTRVDTRIGFNWKDKAPAEGLPADGFSIRWTGVFVPESDCTASLV MRGDDGYRLFVDGEEVLADWGNHSATTRKGSVEMKAGRKYALRLEYFDNASSAEVSFGYM TADPRAEDARIVRADAVIYCAGFDSSNEKENSDRTFALPEGQSEEIARLAALNENLIVVV NSGGGVDFSTFGDKAKAILMAWYPGQQGGQAIAEIVTGRISPSGRLPISVERRAEDNPTL GSYYENVARTHRKNTLQKRVTYNEGVFVGYRGYERSGVKPLYPFGYGLSYSTFEYSDLKV EKRDGGVVVSFAVKNTGGMDAAEVAQVYVGDVEASVPRPAKELKGYEKIFLKKGEQKRVE VTLTDEAFRFYDIFSHGFVTEPGDFNIFVGSSCEDIRLRGGVTL >gi|313157860|gb|AENZ01000052.1| GENE 33 40313 - 40837 723 174 aa, chain - ## HITS:1 COG:CAC2608 KEGG:ns NR:ns ## COG: CAC2608 COG2207 # Protein_GI_number: 15895866 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Clostridium acetobutylicum # 60 165 174 278 284 69 35.0 3e-12 
MDNDETICKLSTPNLSSLNQGVEQGGYDVARLIDRLIRNPEAEWEDVMVMPTHIVTRQST DIYANNDPHIAEVLRYIHENISQKITVNELVKLVPLSRRLLETRFKKSMGTSIYDYIIQV RIEKMMQLLCEGQSVSEAAAELGFSDIKNVSRTFRQLKGITPSEYREQFAPKRR >gi|313157860|gb|AENZ01000052.1| GENE 34 40716 - 41492 547 258 aa, chain - ## HITS:1 COG:HI1106_1 KEGG:ns NR:ns ## COG: HI1106_1 COG1609 # Protein_GI_number: 16273032 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Haemophilus influenzae # 3 199 7 202 265 74 26.0 2e-13 MARVIFLTDFSEAYARELLLGMARYAHDTAQAWSLCRLPLSIRDKFGIEAVVEWALRMKA DAVIGQFYNTDNVELFRKNGIIAIAQDFKKRFTTIPNITGPHYSAGRMAAEYFLQKGFRN FAFYGTRGIDFSDERCQGFRETIEAANPEFTFSSLRSSAQNDLWYYDSTQLITWLQSLPK PVAIMACDDNQAYHITEACLQIEGGGKFPHPERHRHSGRGQRRDDLQALHPQPFVAQSGR RAGRLRRGAPDRPADPQP >gi|313157860|gb|AENZ01000052.1| GENE 35 41798 - 44689 3619 963 aa, chain + ## HITS:1 COG:XF0842 KEGG:ns NR:ns ## COG: XF0842 COG3537 # Protein_GI_number: 15837444 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Xylella fastidiosa 9a5c # 22 726 44 781 790 513 42.0 1e-145 MTKKLTILLLALAALGCSPRSADPVDYVNPFIGTGFHGHTYPGATTPFGMVQLSPDTRAG NWDACAGYHYSDTTIDGFSHTHLSGTGCADLADILFHPTTREIVIHDGECVLQPYFFSHD DERASCGYYAVTLPDVNIGVELTAAPRTGVHRYTFAGKGPRRVIVDLLHTVTEEKIDLCE LRRTAPYELAGMRRTQGWVPDQYVFFAARFSEPFADVQLLGDKQAVLTFAPDVRTLTIAV GLSSVSVENARMNSLAEVPELDFDAVHARAVGQWRKALGDIVVEGGSRDEMTNFYTAQYH TKLTPNLMSDVNGEYRRHDQTVARMPEGKSYYSTFSLWDTFRAWNPLQTLVDTALVNDMI RSMLDMYDSTGELPIWPLASGETGTMIGYHAVSVIADAYLKGIRGYDADKALEAMIRSSN INKKGSDYYTAQGYIPSNIKRESVSCTLEYAYDDWAIARMAQAMGRDDVFGEYARRALNY VNVFDGSTCFFRGRQSDGNWSAPFEEFATGRDYTEATPWHYRFFVPHDVNGLIQLFGSRE AFVREMDRLFTLESDEMQLDVSDVTGLMGQYAHGNEPSHHMAYLYNYVGQPWKTQELTRR LLHEMYAPTPEGIIGNEDCGQMSAWYVFSSLGFNPVCPGSNEFALTAPQFPKAVVRLANG RTLTVTADNPRRNVYIASVTLDGKPIDRNYITYDELMQGGELHYALRPRPDYERGTDDAA APHSLTRGEVVSIPYTTQNVSLFTEPLAVALATTTSGAEIRYTLDGSEPTETSALYAAPV PVDRSLTLKAKGFKPGAAPSRTLTLEAEKAVFHRGMPAETATRPGVAYSHYEGVFSCVND IRKGKYVSSGTMPAPSIAQAPQEDHFAYVFTGLILIPERGIWEFMTKSDDGSVLMIGDRR 
VVDNDGSHASVMATGRVALEAGLHPYTLLYFEDYEGQDLAWGWKAPGAEGFEAIPEANLR LTN >gi|313157860|gb|AENZ01000052.1| GENE 36 44726 - 45565 1210 279 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 22 276 2 255 257 168 38.0 9e-42 MKRLIYLLAAVAFTACGSATSLSVMTFNMRYDNPEDGQNNWRFRRERVAGVIKAQEVDVL GTQELLSNQFNDLSGLLTGYQGVGVGRLDGVESGEYCAVFFRKDRFTLLDSGTFWLSETP EVVGSLGWDGACERIATWVVLRDRDGREFFFIDTHLDHVGQVARDEGVSLLMKRIETLSG GRPVILTGDFNSEPGSSVVAHVQKDGVLRDAKAIAAQRSGTDWSFSDFGQIPEAERPLLD YIFVSGDIEAVRYEVLPDTFDGGYVSDHAPVMAVVKIAK >gi|313157860|gb|AENZ01000052.1| GENE 37 45574 - 47505 2650 643 aa, chain + ## HITS:1 COG:TM0156 KEGG:ns NR:ns ## COG: TM0156 COG1785 # Protein_GI_number: 15642930 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Thermotoga maritima # 318 595 20 310 434 171 37.0 3e-42 MKKILKSWLLAAALCFCVTAAAERPVLIHSHNDYCRRAPFWQAYAQGVYSIEADVFLHDG RLLVGHDVEDLSPDMTFESLYVEPIVTLFGRNGGRAWKDSDERLQLMVELKSATEPTLQA VTALLGRYPEVFDPAVNPEAVRIAVTGRVPAPEDFGKYPAYVRFDGNWETDYTPDQLERI ALVSVNFRNYSQWNGKGSIIPAERVKLEKIIDRAHGWGKPVRFWGAPEGTTVYYTFYDMG IDYINTDRPEVCAGFFDDFGNKNFQIGQRRTSVGGVTGTKRLDKTTRDFRGFQNDKLQLT EGIDVYTPTYRNDGGKGKVRNVIYLIGDGMGLSQIVAAFYANKGLTTLQMKYMGLQQNNA LDAFTTDSAAGGSALATGERHDNRHISMSSEGEPYPSLSDFFHDKGMPVGVLTLGNVADA TPTAFYGHSVERDNADELTRCLMDGRVDLLCGSGIREFTRRSDGVELISELEKQYDFVRS VDEITARKGKVICIDETMDDAAEQANLTLLADATRASIAKLQERGGKGFFLMVEGAKIDY AGHSRCLPGSIIEMLSFDLAVAEALKFADRNGETLVVVTADHETGGLVLVDGDERTGRVM GVYVSDDHTPAMLPVFAYGPGADKFCGTYMNTEIARRIKSLIK >gi|313157860|gb|AENZ01000052.1| GENE 38 47516 - 49129 2323 537 aa, chain + ## HITS:1 COG:TP0931 KEGG:ns NR:ns ## COG: TP0931 COG1626 # Protein_GI_number: 15639916 # Func_class: G Carbohydrate transport and metabolism # Function: Neutral trehalase # Organism: Treponema pallidum # 124 478 62 418 476 129 26.0 2e-29 
MTLRKQIAALLPLVFAAGAAGAQGLKSGPYELPYKNTYVKEVFVAENEFRTAKPETIKPR SFAEARKILPAPVWEGHDKEIEMYWHAWRIAVGNIRQPQEGSGFVSPYLDIAYNGNIFMW DASFMMMFARYGYRFFPFQRSLDNFYAKQHPDGFICREIRADGSDCFERYDPTSTGPNLL PWVELMYYRQFGDIDRLHKVFPALCAYAKWWRLNRTWPDGTYWSSGWGTGMDNMPRVKPE YNPIFSHGHMVWLDTNLQQMLVDESLLNIGFYIERWQEIEDMEDEMKRLRAYVNEHLWDE RTAFLYDRYGDGMLCTTKGIGAFWALQADALDKERLDRMVAHLSDTTEFDRPHRVPSLSA DHPKYNPLGRYWQGGVWPGTNYMTIDGLYRKGYHDLAREIAANHYAAVFEVWKNTGTFWE YYAPEKIEPGFMARKDFVGWTGLPPIAVFIEYILGVKSDYSEGRIVWDIEHTEAHGIERY PYGPDGVVGLKVKRRASAGETPSVSVETNVPFDLTVTWGDGRSRTVRVEKSGSVKLN >gi|313157860|gb|AENZ01000052.1| GENE 39 49167 - 50150 1293 327 aa, chain + ## HITS:1 COG:CAC1961 KEGG:ns NR:ns ## COG: CAC1961 COG1409 # Protein_GI_number: 15895233 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Clostridium acetobutylicum # 13 301 32 316 324 157 32.0 3e-38 MCAVALWCSAQAQRPALEFNGEGKFKIAQFTDVHLDLGNPYRQAQAEKTIAQMRYILDAE RPDLVVFTGDVVTGKPAAEGWKRVLAPVAERNLPFCVVLGNHDAEQDIPRAGIGRIVTSY AGTLNTLNADGELADVVLEIAGKKSPAALLYCLDSHDYSTVEGIDGYGWFTQDQIRWYRD RSAAYTGANGGKPLPALAFFHIALPEYVAAWRNPDNTHIGRAAEDECPGELNPGMFAAMV ESGDVTGVFVGHDHDIDYVVAEKGIALGYGRFSGDDTTYNNLRPGVRLLVLTEGERGFET WIHERDGRIVDHVEFRDGRISKVKNSN >gi|313157860|gb|AENZ01000052.1| GENE 40 50161 - 50982 1203 273 aa, chain + ## HITS:1 COG:all5002 KEGG:ns NR:ns ## COG: all5002 COG1940 # Protein_GI_number: 17232494 # Func_class: K Transcription; G Carbohydrate transport and metabolism # Function: Transcriptional regulator/sugar kinase # Organism: Nostoc sp. 
PCC 7120 # 3 270 20 301 311 124 31.0 3e-28 MRIGVDLGGTNVRAGLVKDGHIVRLLSEPCKADRPEGEVVDHIASLIGELMRPEVSRIGI GVPSVVDAARGIVYNVVGIPSWREVYLKDLLEKRFSVPVHVNNDCNCFALGVCRFGEASA FSDVVCVALGTGVGAGVVIGGKLYCGRDTGAGEIGSIPYLDRDYEYYCSSRFFVGRGTTG KEAYERAAAGDPEALALWREFGGHVGQLVMLVLYTYDPEAIVFGGSIAHAFGFFREAMYE QLKRFPYAKTVEGLHICCSNIDHVGLLGASACE >gi|313157860|gb|AENZ01000052.1| GENE 41 51065 - 52708 2028 547 aa, chain + ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 30 526 29 507 757 358 41.0 2e-98 MKRLLPVLALFAVLCGCCSCGRVIPTDGEVAVVPLPEHIVEGQGTFRLTSDAEVVLLFDD ERLSGVTAALNKVLMPLFGRGPRVRHADKAADHAVNVTRDETMPAEAYRLFVTPCRIDIV AGGAQGAFYAVQTLRQLLPAAAYEADDVRAVELPVVTIEDKPCLGYRGMMLDVGRHFFTV DEVKEALDIMALHKLNVFHWHLTDDQGWRIEIRKYPKLTEIGSVRSRTLIGRDPGGTYDE NCKFDETPYGGYYTQDEIRDVVNYAAERFITVIPEIEFPGHAVGALASYPWLGCTGEQYE VRQTWDIDDRVFCIGKETTFEFIEGVLEEVLELFPSEYIHIGGDECPTVMWKKCPHCKAR MKAERLKRPRRLQNYATARVEKFLNAHGRRLIGWDEILEGEVTPTATIMSWRGAEGGIKA AKMGNHAIMAPTTHCYLDYYQTRDTAGEPLAIGGYLPVEKVYSLDPFEMLTADEQRCILG VQANLWTEYIATWPHAEYMLLPRLSALAEVGWSLDRKDYGDYLRRVRRLAKIYDACGYNY AKHIFGK >gi|313157860|gb|AENZ01000052.1| GENE 42 52864 - 54876 992 670 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157942|gb|EFR57348.1| ## NR: gi|313157942|gb|EFR57348.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 670 1 670 670 741 100.0 0 MDLKYVRSELQKLSEIIDNWNTPQETATLERDLVLEKLRKLYDAVRFGADADLAAGSGNA AEEAASETAPTEIPVSIDLGEILPLDPFAVEAQAEPDAAAKRDAVGEAGAAFESESFAEV GASFESESVAEAGAVSESESVSEPESVSEPEPETVPESKPEFGSEQAVEPGVNFAGHSEP AEELAAEPEVVADSAEAFVSQSDSDGGTNPISEPEISAAEPVTESGPVASEPESEFSIEE TAPAEPVAGNAAEPAAEQSVPLAETVRESEGASEKPEFIAEPVVESATESFAESLSESVA ESVSEPAAESVVESVAESVVEFPADPVSESAAESVFGSHSESVAEPAAESAAAPAPSVDA EQAPAQEAPAKAEAHSAPGTQPIAPTLFELEDETVRHRHKQRVIMSLYNTEPAAPAPKPA PAPKPASVPAPEPTAKPVAGGNPAAEAPFAQPAVPGSASSGSAASDSVSAGTAAAGVSVS ADATAGSALSATPFTAPVTEPAASATTSVGSATSSASGPDALASRTPVAAPQPKAAETTA PDNAGDDDEPDFEEITLEAKNTSGAVLGEVINHNVQTLADTIAPPRDVASELRRSEHVTD LRRAIGINDKFLMIRDLFGGDAAAYEAAIGTLNGFDDFDECMIYIAENYAWNANSDGAKF LMELLERKFA >gi|313157860|gb|AENZ01000052.1| GENE 43 54881 - 55588 1015 235 aa, chain + ## HITS:1 COG:all4680 KEGG:ns NR:ns ## COG: all4680 COG0313 # Protein_GI_number: 17232172 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Nostoc sp. PCC 7120 # 4 235 10 240 285 221 47.0 1e-57 MAKLYIVPTPIGNLDDITLRAVNVLRDVDFILAEDTRTTSFLLKHLGIEKKLYSHHKFNE HATVKMVAESIAAGRNAALVSDAGTPGISDPGFLLVRTCVEAGIEVETLPGATALIPALV QSGFPCDRFCFEGFLPQKKGRAKQLQSLAAEERTMIFYESPYRVVKCLEQFAGVFGPERR VSVSRELTKKFEQTVRGTVAEVLEHFRTTDPKGEFVIVLAGKPKPKRETAADEEE >gi|313157860|gb|AENZ01000052.1| GENE 44 55611 - 56990 1789 459 aa, chain + ## HITS:1 COG:SP1264 KEGG:ns NR:ns ## COG: SP1264 COG1808 # Protein_GI_number: 15901124 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Streptococcus pneumoniae TIGR4 # 55 362 26 326 347 223 37.0 8e-58 MENTNDTHNVKRSFSLWVGEMREFLRGHFSLDEDKAQRDEVVAAISKGVVFRGVNLWVLI FATMIASLGLNVNSAAVIIGAMLISPIMGPIMGIGLSLGINDFELLKKALRNYALMFIVA IITSTVYFLVSPLSSNSSELLARTVPTTYDVLIALFGGLAGIVAQTRQDRTSTVIPGVAI ATALIPPLCTAGFGLATGQIKFFVGAFYLFFINSVFIALATYLMVRFLKYEKNAFIDKVR ERNVKRLMMVITLVTFIPSVVIGFHMVRVSLFEAVADKYVAQVFRFPHTRVIECNKHYAH GKDPSQIELLLVGEPIGDEIIENARAQMRNFGLEKAELVVRQANKTDEINVASLQQSYLE 
LLEEKNRRIGEMTAQLKRYRVTNVEVDDISREVGVMMENIKAISLMKGITFDVAGTPLDT TLVCVVTPKAPADSIDRTTLASWLRIRTKVDNVKLFVEP >gi|313157860|gb|AENZ01000052.1| GENE 45 57002 - 57922 1189 306 aa, chain + ## HITS:1 COG:no KEGG:CHU_1078 NR:ns ## KEGG: CHU_1078 # Name: waaM # Def: lauroyl/myristoyl acyltransferase (EC:2.3.1.-) # Organism: C.hutchinsonii # Pathway: Lipopolysaccharide biosynthesis [PATH:chu00540]; Metabolic pathways [PATH:chu01100] # 34 299 18 279 291 183 35.0 6e-45 MKRAELNFAQKVWLESLWLGARFFAVLPYWFKYYVVENLLFFILYYCLRYRMKVVNENLR NSFPEKSVGELAVIRRNFYRTLAEIFVDTVNMAHMGEEKARTVLTVKGFDEHYKAVHGRD WIAMTAHFGCWEYCSYWGLYERSQMLVAVYHPLRSKVMERLYQRLRNYENSMTVAMKDSL RFYLRNRECGVDGKNLVMGLIADQNPPRRPDSHWFRFLNQDTIFFDGGEKLALRCKLPVY FVRMDRLRRGRYQMSFEPIYDGEEQVAEYEITERYVRKLEAMIREHPELWMWSHRRWKHK RTNAGR >gi|313157860|gb|AENZ01000052.1| GENE 46 57909 - 58922 1052 337 aa, chain + ## HITS:1 COG:CAC2327_1 KEGG:ns NR:ns ## COG: CAC2327_1 COG1216 # Protein_GI_number: 15895594 # Func_class: R General function prediction only # Function: Predicted glycosyltransferases # Organism: Clostridium acetobutylicum # 2 310 5 329 378 149 29.0 5e-36 MPVAKIVILNWNGAEHLRRFLPSVVAAAPAGVEVVVADNGSTDDSLRVLATEFPSVGVLR LDANYGFAGGYNRALKQVEADYYLLLNSDIETPEGWLAPILDVLEREPDVAVVSPKLISW TDRTKFEYAGASGGFIDFLGYPFCRGRILRQVETDEGQYDDARDVFWVSGAAFCCRADVF HALGGFDADFFAHMEEIDLCWRMQLAGYRVRVVPQSRVYHLGGGTLTTDSPTKVFYNHRN NLAMLYKCSSSVQRVSVAVVRPALDLLAALSYLVQGRSDNFRAVFRAYRDFLRWHRDLSR KRKTIRQNRKGAAEEKIYRGSVVLRYLLGRRTFGRMM >gi|313157860|gb|AENZ01000052.1| GENE 47 58982 - 59845 860 287 aa, chain + ## HITS:1 COG:BH3506_1 KEGG:ns NR:ns ## COG: BH3506_1 COG2207 # Protein_GI_number: 15616068 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Bacillus halodurans # 10 117 2 109 130 88 37.0 1e-17 MKNKPATTADYQRRINTVVEYIRTHLDETLDLCTLAEVSSFSPCHFHRIVSAFLGEPPGE FIARTRIETAARLLLYSDLTVAEIAYRVGYDAPSSLTKAFGRFFGISPKEYRTAKNLPSM TTRMHSDAGIRLSKPKMLEIPEKTVAYLRAQGDYGEVDYGDCFARLWQYVKEQKLFSAGI 
ESIVVYHNDPDISKSGDLRCDVCLTIPKGPVAVGPVGVKTIAAGRYAMFVHTGPYAGLGG AYHKIFREWLPRSGCELRDSPGFERYRNNPDRVPAEKLKTEIYIPLK >gi|313157860|gb|AENZ01000052.1| GENE 48 60005 - 61681 2496 558 aa, chain - ## HITS:1 COG:STM0686 KEGG:ns NR:ns ## COG: STM0686 COG0008 # Protein_GI_number: 16764056 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glutamyl- and glutaminyl-tRNA synthetases # Organism: Salmonella typhimurium LT2 # 6 555 2 552 555 662 58.0 0 MASETNETREKSLNFLEEIIEESIAKGETRVQTRFPPEPNGYLHIGHAKSICINFGLAKK YGGKCNLRFDDTNPVKEDVEYIDSIKRDIHWLGFDWAVERYASDYFDQLYDWAVVLIKKG LAYVDDQTQEQIRENRGTVSVPGTPSPWRDRSVEENLDLFERMKKGEFADGEKVLRAKID MAHPNMLFRDPIMYRIIHAEHHRTGDKWCIYPMYDYAHGQSDSIEQITHSICTLEFDVHR PLYDWFIQALEIFPSHQYEFARLNLTYTMMSKRKLLKLVQEGAVMGWDDPRMPTICALRR KGYTPASVRNFAEMVGVAKRDNVIDLGKLEYCVREDLNKVAQRRMAVLNPLKVVITNYDE NKTELFTAINNPEDETAGTRQVPFSKVIYIERDDFMEEPPKKFFRLSPGGEVRLRYSYLI KCEEVVKDAEGNVTELRCTYDPMSGQGSSSDGRRVKGVIHWVSAKDAVEAEIRLFNPLFT KENPDDVEEGQTWEDNLNPESMVRIKGYLEPSLRDVPVGQAVQFERVGYFCPDTDSTPEH LVFNRTVTLKDSWAKINK >gi|313157860|gb|AENZ01000052.1| GENE 49 61702 - 62766 1801 354 aa, chain - ## HITS:1 COG:no KEGG:Sph21_1686 NR:ns ## KEGG: Sph21_1686 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 6 344 7 326 331 95 27.0 3e-18 MSLKKVLTFALALFITGGGLQAQKLIVGADFTTRFDNREYANNDFNESQTLFSARLTPRI GVEWMDKNRLIVGVDLLQNFGDDTKFLSKARPLAYYQFNSRNVQANAGIFDRKELLGDYS LAFFGDSTAFYHNRLSGFLGHYKSTERRNTYVEMAIDWEGMYSEESREMFRIISAGRYTL DKGFYFGYAFSMFHFAGKIGNENVTDNLLVNPYAGWEFNAYFDFDIKAGFLFAPQRARSI ESGWKKQKGAQIDFVMTKWGVKLENNLYLGENLQPFRYAYVGEPLYAGEQFYSTTDHIYN RTWIGYDRRFFKGTMGVEAGMVFHYDGSGMGTQQMVKLSVDIQKLFNIGPKAKK >gi|313157860|gb|AENZ01000052.1| GENE 50 63119 - 63577 290 152 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157912|gb|EFR57318.1| ## NR: gi|313157912|gb|EFR57318.1| hypothetical protein HMPREF9720_0548 [Alistipes sp. 
HGB5] # 1 152 5 156 156 260 100.0 2e-68 MDNLDKNQFIVMLDTLLNSRLSDAVCTAFHASYKEFIDLLHELARLQSRLMRLQERIVEK ESPQASLLKSALLLTNFEIRLVFTRLVIHPSPLRYRSKYPNHHCFSPNNIMELITALHLS GAIRRIDGTRVDLATLVDVLSRTFNVRINNPE >gi|313157860|gb|AENZ01000052.1| GENE 51 63686 - 63880 117 64 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158016|gb|EFR57422.1| ## NR: gi|313158016|gb|EFR57422.1| conserved domain protein [Alistipes sp. HGB5] # 1 64 1 64 64 115 100.0 8e-25 MALPPFELADGSAAPFNLFVRVFENTLHVKLGDPTDVKRRILERKKDRTKFTEALLYSLN KEDE >gi|313157860|gb|AENZ01000052.1| GENE 52 64001 - 64267 217 88 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158024|gb|EFR57430.1| ## NR: gi|313158024|gb|EFR57430.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 88 1 88 88 169 100.0 6e-41 MLYCIFNLLFKVYSLLEWFPPSSLWLVGGGIVVEYFGFVGFTLFMMQQEAVIAGSFHDVV MLSDSKRTDELTHSRSGVRKSLNTATVR >gi|313157860|gb|AENZ01000052.1| GENE 53 64352 - 64918 862 188 aa, chain + ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. 
PCC 7120 # 2 130 411 539 609 95 43.0 7e-20 MADKLTLTGSIGVFGMILDTREALKNKLGITIDGVQSNASSSFLATQPLTPVQRSMIMRG VDKVYTTFTNDVAEGRNLPIEKVLDIAGGRVWSGADALGIGLIDTYGGLKTAIALAVDKA DLGDNYRVTEVTETPTGFAAFIASLNMSVREAFTRSELGLMMKDYNTVREAFRQQGVLMY SPYKVEIR >gi|313157860|gb|AENZ01000052.1| GENE 54 64998 - 66575 991 525 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_3179 NR:ns ## KEGG: Pedsa_3179 # Name: not_defined # Def: metallophosphoesterase # Organism: P.saltans # Pathway: not_defined # 8 525 113 664 664 318 34.0 3e-85 MKKFFIVATFFFASTCFACAKGTDTETSGKSNVSEYQTTVSGRVTCQGQPVVGAVISDGI ELTQTDANGHYTLRSPERMGYVFISVPSGYKVAHKGVLPNHYLDIKSKKTENADFELLPM ADKKYKLFVMADVHLIGSATHRDLEQFRSTFLPDITAAVDTTSEEVFSLSLGDMTTDSRW YHENFMLPNYLKEFKNYPSPIYHIMGNHDNDPSATGTTEEKLDWSASALYRKHIGPTYYS LNIGDVHYVMLDNIVGLGNCKYKYLINSAQLAWLKKDLALVDKSTPLIVSMHVPAYTYTG ISDGKPMTKKRNTQYQDVQVLINILKPFKEAHILTGHDHRNRNIQITHNILEHNFASASA ISWKLNDVRIMTTDGTLSGYQIFDISGKQIQWHYKAVGLTPEKSQFRAYDLNTVPEKYGG NPASNEILVNVFNWDPRWKISVTEDGQELPVEQLWEKDPLYMYIRDKTQRFNNRPKDWRA VNCIHMFHTKATEANSEIVVSVTDRFGNVYTEKMDRPKPFTENMN >gi|313157860|gb|AENZ01000052.1| GENE 55 66589 - 68604 684 671 aa, chain - ## HITS:1 COG:all3226_2 KEGG:ns NR:ns ## COG: all3226_2 COG1262 # Protein_GI_number: 17230718 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 29 257 3 244 246 119 36.0 2e-26 MIRKTLIGALAFLYLPGIISGQIQKPYGIRMVDIPSGEFIMGSRGYGAVEEFDEAPAHLV RISRPFRMSATEITNIQYEQYDPSHRKLRGKAGFSTEDDDAVIFVSYDDALGFCRWLSEK EGKNYRLPTEAEWEYACRATSITPFNTGTNGLPTKQHKVQEYNRTPKTVSLQTALFPPND WGLYDMHGNVEEWCLDWYGPYTQDPQTDPSGPAEGLYRVTRGGSHNTPVKYLRSANRSAM IPEDATWAVGFRIVEASNPTSHTPVIATRPEHISQKRYKWAAPTDKPFFLEPICYVHKPY ANSGVPFYGHNHCPAITWCANGDLLAAWFSTDDEAGREMVILSSRLRAGSDIWDSPQEFL RIPDRNLTGTSLLHDGEGTIYHTNGVEVIGGWKNLAIILRESRDNGATWTRPRIIVPEHD QRHQVIAGLFRTREGYLVQPCDAVPGHYGGSAIHISRDKGQTWENPYTNPEIPIYEEGAT GGLIAGIHAGVVQLTNGDLMALGRNNDIEGGAKYPGLRMPRSISSDMGQNWSYSATEFLP IYSGQRLVLRRLNEGPLLLISFTHHPYDKKRRGMEFEDARGNKFTGYGMFAALSFDEGKN WPVKRLLTDGKRRLLDGRGWTGYFEMTETQAEPLGYLAATQTPDNTIHLISSNIHYRFNL AWILGKPDMNN >gi|313157860|gb|AENZ01000052.1| GENE 56 68623 - 69243 209 206 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0075 NR:ns ## KEGG: Pedsa_0075 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 4 204 12 224 1170 77 28.0 4e-13 MKHYLLSLLLLLFSGTSYAQLVAWNFGPATRGDEKTSVSTFADPCMEASELSRGPGVRPQ RATYSFAGLWPGCNSMLGAVRQGAFYEFSLKARKGCLFSLKRLALVLRVQPDGPKAYILT YSTDGEHFIDIGAPVRVGSTGNDGKLQKPIDLSAVPELQNVPAKKTITFRLYAWGSKEGS ENTAFRIGKSTADRPALTVQGTAWKK >gi|313157860|gb|AENZ01000052.1| GENE 57 69258 - 70565 565 435 aa, chain - ## HITS:1 COG:no KEGG:Cpin_5276 NR:ns ## KEGG: Cpin_5276 # Name: not_defined # Def: hypothetical protein # Organism: C.pinensis # Pathway: not_defined # 1 247 1 246 504 105 30.0 6e-21 MKKIVLATALFALCATSACKTDPDWVAGTLNPTVSISDVRFLYKGNNRQLTTDDLSGAVQ ISGVVSSNYAEKNVPEGIVIIQNTFRSKTSGIAVHVGPELASTFQPGDSLLLQIAGKTLT RENGALTIRSLNSTEITVLSTGHKLSPIPVTAATLEANPEAYESILVRVSNCTPEDPSVE NPSYKEGLTVGDGTGTLTIYAREGSDLATLAVPEEPATYIGLVFTEAGEDGNTQLFIQPR SLRDVIEKYIVLAWNLTGHDNIKYPTRDATVVNANLEMSALSRGPGLTAQKAGNAFASEW PMDISKEAAIERGSYYQFTITPKNNAQVSLLSLDVALRVQPNGPKNYIWMYSLDNGENFS EMSGNLVFKGSTTDNNGIQQPTLNLEEVAGVQEFSEPMIVRIYAWGAADAKSTFRIGQSL ANRPYALTLEGMIRP >gi|313157860|gb|AENZ01000052.1| GENE 58 70595 - 72073 922 492 aa, chain - ## HITS:1 COG:no 
KEGG:Pedsa_0072 NR:ns ## KEGG: Pedsa_0072 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 6 488 2 476 476 488 50.0 1e-136 MRNHFKYIVAILLLLFVSVSCTSNFEDVSKNPNASDEALPQALLAPALTSVVKANMSRTR SLTNELMQVTVLMGDIEGRIFRYDIRKSVADYLWNNWYVQLTNFKDMYETAQTLYTVESD KTYNTYMGISLICQCWVFSMLTDTYGDIPYTEACQGKKGILTPVFDRQEDIYTDIFAKLE QANELLANGSDLPEDQVICDPLFQGNALKWRKLSNSLYLRLLLRVSGKEEETAAAKIAEI LEERPTDYPIMTSNDDSAILRWTGEQPYVSPFYTLRDSDWRTYVLAEFFLDNLNRWNDPR RPKWADTSNGEYAGVPSGYPVGEEVSARSRLPLALKNDPRLGNIMNYAELEFIITEASVK RYISADAETHYKNGVIAALGMWDLTASSAYLDNTAVKWDDGESLDDKMSRIHLQKYYSLF YTDLQQWFECRRTGYPILPKGAGLLNGGELPSRLNYPVYLQTTNRKNYYDAVSSQGEDEI STKVWWQRKDNQ >gi|313157860|gb|AENZ01000052.1| GENE 59 72085 - 75279 1877 1064 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0071 NR:ns ## KEGG: Pedsa_0071 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.saltans # Pathway: not_defined # 26 1064 99 1134 1134 1040 50.0 0 MKKNGPKTSPVQCNRWLLILTFMLMSFHAGAQNKVTVSGVVSDEKGTPMVGVTVLVKNTN IGRITGIKGEYRLEIPANSVLVFSYMGYTTEERLVSTQTQINVTLTEQAQAINDVVVTAL GIKREQKALGYATTKLGGDKISEALSTNWTSALSGKIAGLNLVRSGAGPAGSNKIILRGE SNLTGDNQALIVIDGVVINSSSGLSTGSGNSSYLGEDSPVDFGSALNDINPEDIESINVL KGAGAAALYGQRGAHGAIIITTKTGEETMRGLKVSFSSTANFETPYHWPDYQYAYGSGTE GVNYFSWGATEDGPNTRNTSQAWGPKFDGQMFYQYDPATHGQATERTLWRPYPNNHKDFF RTGQTYTNSVSLSGGSTNTNFRVGYTNVYNEWIIPNTGYKRNTISVSVNQKMTEKLRLST KINYTNKKSDNLPATGYNNQTVMYWNIMQVPNGNLDWLKEYWMPGKEGVDQSYPFSTSPD NPYLIVNEMLNKLNRHTVTGNVQMNYDIIKGLNLMVRGSMDMSYEDRSEQRPMDTQKFKY GMYRTTKIFAQEITGDFLLKYEKKISAFELSASLGGSTLHNRYRRENNRANALMFPQIFS LSNSKFALVTIPYESDYIINSIYGLLTLSYKNYLFLDVTARNDWSSVLATPDSTDNSSFF YPSVNVSAVLSDMFRMPRWITFAKLRASFSEVGSSGTKVYQNTYVYSTADGFSGGLTNPT TLPNPDLRPQRTRTWEVGADIRFFNNKLGVDIALYQGETRDQILTATLDRATGYNNMIVN GGIVRNRGIEIEANATPFRNKNGWTWRLFGTYSANENTVMELTDDLFALQLQNTLSGRGS MEARVGQSMSAIYGTGYKRAPDGQIVYGEDGYPLLANDLIYLGDSNPKGKGSFGTEIKYK GISLNILFDGQFGGKGYSFTNAMLMEHGKHQKTLPGRYNGIIGNGVIENPDGTYRKNDVV ANEIWTYYTRHSGRENLEGSMFNTDFLKLRELTLSYSFPARICKKIGMRKLSIGVYGRDL 
FMITNWPAFDPEFGTLNNGSIDKGFEIAQFPATRNFGVKLNIEF >gi|313157860|gb|AENZ01000052.1| GENE 60 75349 - 76956 485 535 aa, chain - ## HITS:1 COG:no KEGG:BURPS1106A_3267 NR:ns ## KEGG: BURPS1106A_3267 # Name: not_defined # Def: BNR/Asp-box repeat-containing protein # Organism: B.pseudomallei_1106a # Pathway: not_defined # 166 524 30 380 390 211 37.0 7e-53 MVIRKLTLLLLLAGSCSSGDNTPSLPTFDPDALISVSGTMIDIRDVRSGDALYPQNDDFK IPQEFAIAFGSNMHYATTLREYRGVERVSADHPTILRIAVADNTHPGSMWNDSRTSFRIG SLRMHLYTCHYTTPGKWIPVPSGQPNAATLVFGRKITVDGVPPAPGKLIASATELRKRNI TNPSIMILSDGSYFVGSSGPDPKGNTYFRSTDKGLSWKKCSNPDYMNFCKCFTCPSDPST LYEVGISAATRGNIVIRKSSDKGHTWSSLTTLFKGDYHGAPTPFIEYQGRIWHAMGTAPA DGKMGITVLSIPIGSDPMRPANWTLTNTLTGDTSWLSGNPDHIFNEWQEGCIVVDPAGKL VVLIRIDDSRTNDLAALVSVTDQRTITFNPATGFCDMPGGGKKFTVRYDTVSGKYWTLAN PCYDKDRVRTHTGWYSTRIYPIFLRSRLVLCSSADLRNWTVVKEVISSNNCFFHGFQYTD WEFDGNDIIAVSRTAFPESRGLPIRQHDANMLTFHRIVDFRSAGFTTENITYDQL >gi|313157860|gb|AENZ01000052.1| GENE 61 77085 - 77846 605 253 aa, chain - ## HITS:1 COG:no KEGG:CA2559_07475 NR:ns ## KEGG: CA2559_07475 # Name: not_defined # Def: putative S1/P1 nuclease # Organism: C.atlanticus # Pathway: not_defined # 15 250 26 267 268 115 29.0 2e-24 MKKNITLFICLAWCAIQSASAWNRTAHEAIAYIAEQHLTPSAKAAIEKYLDGRSIVYYAA WMDQRHEHIPYKHTVTVDEDNEPLSASKRPELDGMNAIMRSLDRLENRDMHPKDSIALDI KFIVHLIGDIHCPAHIVYPKTTRFFPVKLYGRVQKYHPIWDAMLDNNHGWTYREYQEQLD RFTDEQMAEMAAGTPISWARENARRCRIIYKWAKKDDELDRPFINKAYPLAEDLMLRASY RLAKLLNDIFSEE >gi|313157860|gb|AENZ01000052.1| GENE 62 77860 - 79743 1558 627 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_2434 NR:ns ## KEGG: Pedsa_2434 # Name: not_defined # Def: sialate O-acetylesterase (EC:3.1.1.53) # Organism: P.saltans # Pathway: not_defined # 29 471 20 461 463 357 40.0 1e-96 MERITLKSLCAALLLLGAGAGQITAQPILRLPSIIGDHMVLQENSNVKIWGWAEPNKEVL IIPSWSQDTVRTKSTGDTKWLASIQTPAADNRTYTLTVQTQQRKIEVKDILVGQVWLCGG QSNMNYSAAQGITDMQEELEKPMNPSIRLFTVTRNSSPWYQEDCEGEWQVCDAQSARWFS AVGYFFGKALTEGLNQPVGLVHASWGGSPVETWIPASNMSKEPVLKNMWSKNRKSKSPYK 
YSIGSMYNAMIHPIVNFALAGVVWYQGEANVGHTSYADAFSLLIDAWRTRFGRELPFYFV QIAPYKYKHPNSGAEIREQQARVAALKDRTGMVVVSDLVENVKDIHPRRKQEVGRRLANW ALAETYGKSGSKYRHATFASMTVKGQKAIISFNDAEDGITCDGNPAQALEICDASMVFQP ARAMIDQKNGTLIVWSDAVRKPVAVRYMFSNDGIGNLKDAAGLPVAPFRTDSPFHAADMA AAELAKEMTLTGVKATGQGYKLVKLKKGAKLFLNRNYPVNILPERFEDFDMLIRNATVGI LSQSCTIVPDADGMVYIIARKNERTAEDLFGWREVKNSEVTYSTHKGDESLKIFSKKAKA GKEIEIPQTKDFCGITPIAKKIDYRNE >gi|313157860|gb|AENZ01000052.1| GENE 63 79756 - 81624 1712 622 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0715 NR:ns ## KEGG: Pedsa_0715 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 307 621 27 313 315 97 29.0 2e-18 MKNNIFLFSMLATAMLAGCQDAPIPGDNTENYFSFSAGVAKTRTEAIVEDDGLTFRWTDR DEVGIYGIQDDRSLGDNYAYTAYPDSEDAANCVFRFSTYGQAYQNVAAGSEFYAYYPYDP AANESTPGALPIVLPAEQQQLTAGSPDHLARYGFMTAAPVRIADPAKGAGFNFRNLYTIV ELKLKMAASSSLEEVPVKQIRITSAATDLAIVRGTTDLTAVERRIDIIEGGKSVTMTFGQ TASLAKSERSFYFVVAPGNHPDGDITLEVTAIDNSVNTLTLPGKVDFKSNMHYVKSVELA LEDFRLTDEFNVVPTTLSCNAGESIDFVFSGIADKITFWSGETGHEYRYATQGKLQESSV NMNFKSIYINGMQRNCATVRYSTNFNGEYTEEAINKASWTDVSPQFQLPPYISVSSGAAI DPEANAITQPYDSGVADITSWFPDYDTPVYFAFFYHVDTYDADFVDEKTGLKGNGRTWFQ IYSVESQSCYPNEAPTPLISVVGKAQEQIEVVNGASYAEGKDTNICKKGVSSGGTTVIRM QAAFKPTTDRDAWAITHAIYRPAPKHAPADTGMTVKSASGEQPEKYSHVFMQPGEYVTTV VATVTTLAGELKLEKEFNIRVN >gi|313157860|gb|AENZ01000052.1| GENE 64 81638 - 82639 1100 333 aa, chain - ## HITS:1 COG:no KEGG:Phep_1498 NR:ns ## KEGG: Phep_1498 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 16 316 20 310 312 136 29.0 1e-30 MKKIIAFIGISLLAAACEKGSVSSPDFKVTVASTTVKAGTPVVFSISGNTDLISFYSGEL GNDYAYKTTDRITPTTMFITFSTENPNGTPGHGNPSRVPFCYSFDFNGADKVYTVEDVTS ATWTDFTDRFKMPTDINQTVASGEVSINELFPDAQTPMYLMFHYCVDTYDASLYGGQGNP RTQWLIKNFRIDGVTEAGNSTLYPFVDCNWQVVETESFEDNGKQHPNVNATRILMYCDAY PKTPKECYVIGGPFYRQADINTGPDLAEGIKSTADPMPSTYEHTFEQPGTYTVTFVGINS SVYGRKEAIRQLTLTVVEDSGGITPPDPGEWND >gi|313157860|gb|AENZ01000052.1| GENE 65 82663 - 84261 1542 532 aa, chain - ## HITS:1 COG:no KEGG:Phep_2282 
NR:ns ## KEGG: Phep_2282 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: P.heparinus # Pathway: not_defined # 20 532 17 540 540 333 37.0 2e-89 MRNTLYYALFAGTVAIMLFTSCSLLDTEPEDFVQPGEFYKTEKQLQSALLGVYATLANNS VYGVNMLGRVGLTADLGYERYSIDESSVGYYNVSSADAKILGYWRDLYSGIGRANMLLKY IGQPQMDETIRTKIEAQARFLRAYYHFMLVIRFKNIPLITEVPEDSNKTSVQIPQSDPRE VYRWIIREMEWAAPLLPDASTLTGGGELSKSAAYGLLARVALYMAGNPLNEPGMYAKAKE YAATVIGYEFHELNPSYKDVFLKLIQDEYDIRESIFEVEFYGNNQSTYTTTAGQVGRING IAYSIDINNWGRSLGTIRASQYFYQLFDKDDLRRDWTIASYSYDGKTGEKIDCANNYWMR FCGKFRREYELSNPKDSQYTPINFPVLRYADVLLMYAEAVAADPDNDDSQELADAYEYLN RIRRRGHGLDIHTPAPAVDLEQAGKSALLDEIKDERARELGFELLRKDDLVRWGELYDRM QAIRSTIPDHYTSSYYVAARLYYGNVSRRDIVWPIPSYELGVNRRLVQNEGF >gi|313157860|gb|AENZ01000052.1| GENE 66 84274 - 87432 3279 1052 aa, chain - ## HITS:1 COG:no KEGG:Phep_0787 NR:ns ## KEGG: Phep_0787 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 38 1042 15 1018 1028 972 51.0 0 MKKILYSQRLRLLWAAVCLTVGIGPILAVPAPKGKITVSGMVTDKNKQPLIGVTVIIKDT FSGTTTDNAGKYTINAASSDVLEFSLLGYTQQEALVGTRTKIDVTMAEDSKEIDAVVIEV GYGEQRKKDVSGSVGVVKVTDLIKAPVTSFDQALQGRVAGVNVSSSDGQPGTDFNIVIRG TNSLTQDNSPLYVIDGFPIENFSNAALNPEDIESMTILKDASATAIYGSRGANGVIIIET KKGKVGRAVVSYTGTFGVQSVTKKIDMMSSYDYVNYMIERDESNYATFLTNEGRTLDDYR NIRAIDWQDKIFRTAFVHMHNVSLRGGNKMTRYAASASLSNQPGVIVNSGYKKYQGRLSL DQQISKAVKLSLNANYTQDKISGQTSSSALSTSNSYATYLMYRTWAFRPVMLSHQSTDDL FDDDFDGSNSAVMNPYITTMNENRHQKRTTFMANAKLGITLAKGLTLNISGGYTSYNSRT TEFNNSESYKGYPRPNNSKNVNASVADYLRNDWMNENVLTYKRKYNNKHNFDVMVGFTMQ GTTTEKYGFETTNIPDEGLGLSGMDDGLPYATNVSLSENTLMSALVRVNYNYKSRYMFTA SFRADGSSKFMDGHRWGYFPSGAFAWRMGEEPWLRMAGWINEAKLRVSVGMTGNNRVGDF SAYPSISMSDYYSIGNATPGEAYIPNNLGNRDLTWETTTQYNIGYDLSLFRSRLNFTVDL YRKNTTDLLLNANMPYSSGYTKVFKNVGAVRNQGIELSLSTVNIQTKDFTWSSDFNISFN RSKVMELTEGEEFLLSKVSFTADYNSSYLYIAQVGQPMAQFYGYEWAGVYGLDDFDCDSS GNYTLKKGVPTNGEERDKIQPGDIKYVDQNNDGVVNDKDRVVIGRGEPIHTGGFNNNFTY KGLTLNVFFQWNYGNDIMNANRIIFEGNALSKNINQYASYADRWSPENQTSRNFRTNGQG 
PSGVYSSRTIEDGSFLRLKTLQLSYKLPRNWIRKLHIASCEIFVSGQNLFTWTDYSGLDP EVSTRYTALTPGFDYSAYARNRIFTGGIKLIF >gi|313157860|gb|AENZ01000052.1| GENE 67 87466 - 89241 2562 591 aa, chain - ## HITS:1 COG:SP2158 KEGG:ns NR:ns ## COG: SP2158 COG2407 # Protein_GI_number: 15901968 # Func_class: G Carbohydrate transport and metabolism # Function: L-fucose isomerase and related proteins # Organism: Streptococcus pneumoniae TIGR4 # 5 591 4 588 588 823 65.0 0 MKNSHPKIGIRPIIDGRRGGIRESLEEMTMGMAQSVAKLYAAKLRYADGAPVECVIADTT IGGVAEAAAATDKFRDNNVGAVLSVTPCWCYGSETIDMDPLMPKAIWGFNKTQRPGAVYL ASALAGHNQKGLPAFGIYGSEVQEQDDNSITPDVEEKLLRFGRAAVAVMEMRGKSYLSIG SVSMGIAGSMPDPDFFQEYLGMRNEAVDCSEIARRIELGIYDKEEFERAMAWVEKHCKTH EGEDLNPEHLRYSREKKDEVWEYVVKMTMIFRDLMVGNPKLAEMGFEEEAAGHNAIAGGF QGQRQWTDFRPDGDFSEAILNTSFDWNGIREAYTFATENDTLNGVSMLFMHLLTNQAQLF SDVRTYWSPEAVERATGRKLTGKAANGIIHLINSGSSTLDASGAMSDAEGNPVMKPFWEI TEQEAERCLEATTWHPAGREYMRGGGFSSKFFTRGDMPVTMCRLNLIKGQGPVLQIAEGW AVNVDRAIFEHIDKRTDPSWPTTYFAPRLTGKGAFRSVYNVMNNWGANHGAITYGHVGAD LITLAAMLRIPVCMHNIDEEKIFRPSAWAAFGSDPEGADYRACNNYGPIYR >gi|313157860|gb|AENZ01000052.1| GENE 68 89261 - 91240 1706 659 aa, chain - ## HITS:1 COG:CAC0425 KEGG:ns NR:ns ## COG: CAC0425 COG1621 # Protein_GI_number: 15893716 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-fructosidases (levanase/invertase) # Organism: Clostridium acetobutylicum # 143 620 18 490 490 140 25.0 9e-33 MKTRLKTGLLLCGGVLGGSAAVAAQKQPNILLVLSDDQSAFAVGCYGNADIRTPNLDRFA SEGVRFNRAYATSPQSVPSRASIIKQIALFLIVLTTLLPGLSATAQQTPAADRVTMYTFS DGLSDQLQELTGNPLLHRYAQTRERLSSGKYRPLYHFSAPEGRLNDPNGLCYWNGNWHLF YQAYPIENPIAHWGHAYSPDLIHWKDLPIAIAPSIEERVYSGSTLIQGDSVIAMYRGVGQ GEMVAVSRDPLLLNWRKPAANPVIPDLDPEKEQPVFARGGDPFLWYCDKLYYAILGGYSD SGPDGKRMFAEYLYRSPDLLRWEYLHPFIDADPYAFVGDDGACPYFYPIGNGDEHILLHF SHKSGGKYLIGNYDTQLQKFIVTDGGNFNHGPAKGGGIHAPSACPDGKGGIVVLFNTNPG FPRKDDFSEMMTLPRLLTWEEGALRVRPWDAVESLRRDHIRLEHIAVLANQETILDGITG DALEISMEIDLGKTKTFELSVLRSPDGQEHTRILFQRDMGKPTREYENHTPTSTITIDNT HGSRSPLFASRPPETAELSFAKGEKLKLRVFIDRSIVEVFVNDRQSVTLRTYPERDDSKG 
ISVRTAGNGAEILSLDAWRMESIWTDASRQTRRSADGTRSQQSADGLPTHEIHKTYKNR >gi|313157860|gb|AENZ01000052.1| GENE 69 91237 - 92724 2363 495 aa, chain - ## HITS:1 COG:STM1128 KEGG:ns NR:ns ## COG: STM1128 COG0591 # Protein_GI_number: 16764485 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Salmonella typhimurium LT2 # 5 442 8 447 498 196 32.0 7e-50 MEITILDYLVFFIFVGGVALFGCSFYFRSRKGAAAFTAAEGSLPTWVVGMSIFATFVSSI SFLGLPGDAYKGNWNPFVFSLSIPIATWLAAKVFIPLYRGINSVSAYHYLEMRFGYWARC YVAVCYLLTQLARVGSILLLLALPLNTMFGWDIQTIIICTGIATLIYTLLGGIAAVVWTD AIQGIILIVGAVACAAILTFTMPEGPGQLFAIASEHGKFSLGSFGASLTEPTFWVVLIYG LFVNMQNYGIDQNYVQRYMTTRSTAEAVKSTLFGGLLYIPVSLVFVYIGTALFSYYTARP ELLPAGTPSDQVFPWFIVHGLPTGLTGLVIASLFSAGMSTIATSINSSATIVLTDFAKRL SKKELSEKKSMRVLYATSFVVGALGIIMGLMMMRIDGVLDAWWKLASIFSGGMLGLFLLG VVCKTVRRAHAVVAVILGLLTIAWMSLSPLINEGSPFYRFHSPLHTYLTIVFGTTVIFLT GFLLTKLTNRKAENA >gi|313157860|gb|AENZ01000052.1| GENE 70 92756 - 93667 1431 303 aa, chain - ## HITS:1 COG:BMEII0862 KEGG:ns NR:ns ## COG: BMEII0862 COG0329 # Protein_GI_number: 17989207 # Func_class: E Amino acid transport and metabolism; M Cell wall/membrane/envelope biogenesis # Function: Dihydrodipicolinate synthase/N-acetylneuraminate lyase # Organism: Brucella melitensis # 1 298 18 319 322 162 33.0 6e-40 MKTIKGIIPPMITPLKGDDQIDREGTVRLTEHILAGGVHALFLLGTTGEAQSLSYKCRYD FVELVCRQVAGRVPVLVGVTDTSLDESVKLAAHAAKCGAVAVVAAAPYYFAPSQQELIEY YTALADALPLPLFLYNMPSHVKVFLEPATVKTLAEHPNIVGLKDSSANMTYFQTLLYHLG DREDFALYVGPEELTGESVVMGADGGVNGGANIFPELYVQMYYAACNRDVNTMRALQRKI MQISTSLYTVGKYGSSYLKGVKCSLSLMGICDDYMSYPYRRFRTEERARIRKALEALGTA CGE >gi|313157860|gb|AENZ01000052.1| GENE 71 93691 - 94566 1107 291 aa, chain - ## HITS:1 COG:CC2313 KEGG:ns NR:ns ## COG: CC2313 COG0657 # Protein_GI_number: 16126552 # Func_class: I Lipid transport and metabolism # Function: Esterase/lipase # Organism: Caulobacter vibrioides # 28 267 45 305 328 150 35.0 4e-36 MKTCLTTLCALLLAVAVSQAAEPLHVKLFPEGTPTKSGLEEQPEGVNDKGYYVGVSDPEL 
LVYLPDPARATGQAMLVVPGGSYEKVCITHEGYKTAEWLNEHGIAAMILKYRMPNGHPGI PLEDGEQAMRVIRRNAAQWGVDPHNVGIIGFSAGGHFASTLITEYTSEETRPDFAVLVYP VVSMNYSSVRTRENLLGARSEEEALRKRHSTFGQVHEGMPEVMLLLCNDDKAVVPDNSIA FYRALNRRGVKAEMHIYPEGGHGFWMRERYKYGEETYPAVIRWIKQHKTTN >gi|313157860|gb|AENZ01000052.1| GENE 72 94563 - 96383 2158 606 aa, chain - ## HITS:1 COG:no KEGG:Psta_1430 NR:ns ## KEGG: Psta_1430 # Name: not_defined # Def: FG-GAP repeat-containing protein # Organism: P.staleyi # Pathway: not_defined # 12 598 88 672 681 480 47.0 1e-134 MDYDGDGLKDLLVSCPDRPYKGLYFFKNIGTPRHPRFAAAERISEKGMNNIRLSEADGKP YVLSKNFEYPDFFKAPYAKKRRLHYEGEELGATYNKSRSNMWSYADWDGDGDKDIVVGID TWDDYGWDNAFDSLGRWTRGPLHGYVYLLENTGRGYVNRGKVEAAGAPIDVYGAPNPCIA DFDGDGDPDLICGEFVDGLTWFENVGTRTEPRFAAGRPLANKHGEIRLHLEMIVPVVSDF DGDGHPDLIVGDEDGRVAWVRHTGKVKKGMPQFESPVYFTQQADPVKFGALSTPCAFDWD GDGKQDIIAGNSAGEIAFIRNLSGGENPVWDAPRLFTVNGRPIRIQAGENGSIQGPAERK WGYTVLSVADWDGDGLPDIIVNSIWGKIEWFRNLGGKDGLRLAPAQPVRVAWEGEAPKPA WNWWTPEPGTLVTQWRTTPVAMDWNGDGLTDLIVLDQEGYLAYYERFRTPEGELLLHPGR RIFHGTNCSLYNSRSGVADASEGLLRLNDGIGGQSGRRKICFTDWDGDGRLDLIVDSQNA CWFRNVREERGEVWYEYMGNVSERILAGHTTCPTPIDWRGDSTPELLLGAEDGHFYRMAN QQKQAR >gi|313157860|gb|AENZ01000052.1| GENE 73 96439 - 96615 87 58 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157966|gb|EFR57372.1| ## NR: gi|313157966|gb|EFR57372.1| hypothetical protein HMPREF9720_0571 [Alistipes sp. 
HGB5] # 1 58 1 58 58 74 100.0 3e-12 MNPATHLYKTSVLAIRTAALAAALFVTTSAPARTAHTANPKEGGGKCPGPIRPGHPAL >gi|313157860|gb|AENZ01000052.1| GENE 74 96665 - 97723 1728 352 aa, chain - ## HITS:1 COG:TM1200 KEGG:ns NR:ns ## COG: TM1200 COG1609 # Protein_GI_number: 15643956 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Thermotoga maritima # 1 313 1 309 333 150 32.0 4e-36 MRTKRVTIKDIATEAGVSIALVSFVMNNKADGKETYRVNKETAQRILEVAQKLNYQPNNA ARTLRSGKTNTIGVIVSDISNKFFADIARCIENHAYKHKYTVLFGSTDENPQKLENLVEV FRNKGIDGLIVVPCEDADEIIRDIARQNIPLVLLDREVPDLEVSSVVLNNRRAGYETTEA LIRRGFTRIEMISYSMGLSNIREREEGYRRCMQAHEMSENAVIHHLRHDKFAKIEEIIRE ARQRRVEAFLFATNTLAGQGLSAIFRNGWRVPQDFAIACFDTNEAFDIYKTAIAYVRQPI ERFGTEALDLLIKSIEQRDKPGSCTRIVLTPEIVESAPEEALRSAEEAAVKS >gi|313157860|gb|AENZ01000052.1| GENE 75 97839 - 98978 913 379 aa, chain + ## HITS:1 COG:no KEGG:FB2170_11376 NR:ns ## KEGG: FB2170_11376 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium_HTCC2170 # Pathway: not_defined # 3 366 31 389 397 300 41.0 5e-80 MKTYGEERSFLESCGVRVVELSDESGRARVCLVPAWQGRVMTSTAAGTGGPGYGWINRSL IASGVRKSRFNPFGGEERLWLGPEGGPFSLYFAPGAEQVYANWQVPAALDTEPFRVVGRD ARQVRFEAEMSLRNAAGTRFEIGVARRVELLSHRQAEVSLGRVLPPELALVAYRSENRIG NCGPEPWTPAGGAPSVWMLGMFTPSPTTVVFLPCDGEEVRAAVNSDYFGELPDDRLSVSG GLVCLRIDGAFRSKIGLPAGRDTGLCGSYDAAAHHVTLVRYRRSAPGDRYVESRWGAQAD PFGGDVVNAYNDGPTETGEVMGPFYEIETSSPAAFLRPGETLCHTQEVFHLQGDEALLEE LLRGLIPGGLRAVKEAFNH >gi|313157860|gb|AENZ01000052.1| GENE 76 98990 - 100615 2047 541 aa, chain + ## HITS:1 COG:STM0103 KEGG:ns NR:ns ## COG: STM0103 COG1069 # Protein_GI_number: 16763493 # Func_class: C Energy production and conversion # Function: Ribulose kinase # Organism: Salmonella typhimurium LT2 # 9 534 5 539 569 547 51.0 1e-155 MQPRTNYLLGVDFGSDSVRCLVVDAADGSELASAVVAYPRWAERLYCDPAHNRYRQHPLD YVESLEASVRQALAACGRDVADAVCGISFDTTASTPALTDAKGTPLALLPEFAEEPDAMF VLWKDHTALAEADEINDLARKWETDYTRYSGGTYSCEWVWAKMLHCLRRSPGLRAAAYSW VEHCDWISALLVGDTTPETIARSRCVAGHKAMWHASWGGLPAPEFLEAVDPLLGIFRGHL 
YSDTVTGDACVGRLCGEWAARLGLKPGIAVGVGAIDCHVGAVGAGIVPGTLVKVMGTSTC DIVVGTYDEVGDRTVRGICGQVDGSVLPGYIGFEAGQAAFGDLYAWFRRVMAWPLRQIAG SDPQLEERILGELTAEAERLPLTPDDPVALDWHNGRRTPDADPRVRGALDGLTLATSAPA IYKALVEATVFGSRAIDERMLEEGVPIDNILAIGGIARKSPFVMQTMADVIGVPIRVVAS DQACALGAAMFAAVAAGLYGSVQEAQRAMCPGFSDEYLPDMERHAVYDRLYDRYLKLGGR R >gi|313157860|gb|AENZ01000052.1| GENE 77 100851 - 101303 476 150 aa, chain + ## HITS:1 COG:HI1190 KEGG:ns NR:ns ## COG: HI1190 COG0720 # Protein_GI_number: 16273112 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Haemophilus influenzae # 3 144 1 139 141 87 37.0 6e-18 MTVIRLTKEFSFEAAHALDGYDGPCREIHGHSYRLFVTVKGTPAEDEADPKCGMVMDFGV LKRIVNEEIVSRFDHALVLRGSASGAELRRVLAERFGNILTVDYQPTCENMLADFARRIA ARLPEGVALHSLKLHETATSFAEWFAGDNE >gi|313157860|gb|AENZ01000052.1| GENE 78 101450 - 101581 131 43 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRAQRFEELVGYGFVLDFCTEDIQAKSGSGSVARISNEELSGC >gi|313157860|gb|AENZ01000052.1| GENE 79 101660 - 101992 249 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237711810|ref|ZP_04542291.1| ## NR: gi|237711810|ref|ZP_04542291.1| transposase IS116/IS110/IS902 [Bacteroides sp. 9_1_42FAA] transposase IS116/IS110/IS902 family protein [Bacteroides eggerthii 1_2_48FAA] transposase IS116/IS110/IS902 [Bacteroides sp. 
9_1_42FAA] transposase IS116/IS110/IS902 family protein [Bacteroides eggerthii 1_2_48FAA] # 1 110 88 197 359 152 72.0 6e-36 MVNPADVLIKSSEMLRKSDAVDRCKQAWSLQGNKLKGIYTPDSISLVMCSLIRLKNSITK DTTRQKNRIKSQLRYLVIEMTWKFLEPFSNWLKPFIVWLKGGEMLTPSGR >gi|313157860|gb|AENZ01000052.1| GENE 80 102124 - 102546 432 140 aa, chain - ## HITS:1 COG:no KEGG:BT_2612 NR:ns ## KEGG: BT_2612 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 139 9 139 140 83 37.0 3e-15 MDTNPKLTDMTGKKPRTPSYKYSFRLNEEQNIRFNEVLSKAGSEHNRSRFIVKRIFSEEF VVIKRDPSKTQFVARLNDFYFQFQKLGNNYNQIVKAINAHFSNVAIPHQIAMLEQRTREL KALSIEILNLAKQAKEWLRI >gi|313157860|gb|AENZ01000052.1| GENE 81 103001 - 103138 97 45 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MDSLKEQTRPATKSSKPPRIEVDEEMVRRMIVGLASLDSKVVIPL >gi|313157860|gb|AENZ01000052.1| GENE 82 103244 - 103762 160 172 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157877|gb|EFR57283.1| ## NR: gi|313157877|gb|EFR57283.1| hypothetical protein HMPREF9720_0577 [Alistipes sp. 
HGB5] # 1 172 1 172 172 299 100.0 7e-80 MDSLKETDKSATEPPKRSRIEVDEELMRQMIAGQAPLDSKVGRRIPEPEEEDTNALEENT SETVSGASAPTAEKTDVDTQTSGVKEPAGFRRKKLALPDFERTFFAPVDCRNRSAIYVST RTKHKVSEILHLLGNESTRLTALVDNMLRFVMDIYSDELNYLHEKKNNRRPF >gi|313157860|gb|AENZ01000052.1| GENE 83 104306 - 105523 1533 405 aa, chain - ## HITS:1 COG:no KEGG:BF1833 NR:ns ## KEGG: BF1833 # Name: not_defined # Def: putative bacteriophage integrase # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 398 1 399 410 324 42.0 4e-87 MRSTFKVLFYLKRNKEKTQSAVPVMGRITVNGTISQFSAKLTVPERLWEVRGGRAKGRSL ESDRINRHLDEIRSQLDRHYRDIRDRESYVTAEKVKNAWLGFGKRYRTLLSTFRSFTEDL HGRIGVDRSKNTWYRYLATMKHLQAFLTAKYRVSDIALAELEQSFIEQFHVYLKTERALK LTSICRYLDCLINVVKIAFNDGIMPRNPFASYRYNEPAPERAFLNEEEILTLQHAAFRTK KQRMIRDLFLFSCFTGICYADLKTLAWKQFEQDTHGDWWVTGNRCKTDTQYVVKLLPAAL SILERYRDGTDYVFSFMPHLNTVDRSLKRIAVLCGIEKKLTFHVARHTYATTICLMNGVS LETLSKMLGHKRITTTQTYAKVTQPMVDREVEMLKEKLADKFMAV >gi|313157860|gb|AENZ01000052.1| GENE 84 105817 - 106353 780 178 aa, chain + ## HITS:1 COG:BMEI1798 KEGG:ns NR:ns ## COG: BMEI1798 COG0566 # Protein_GI_number: 17988081 # Func_class: J Translation, ribosomal structure and biogenesis # Function: rRNA methylases # Organism: Brucella melitensis # 25 172 144 288 297 67 35.0 2e-11 MRKITNEELGRPTAGEFAAMAKMPVTVVLDNVRSLQNVGAFFRTGDAFAVEHIALCGITA VPPNRDIHKTALGAELTVPWSYYETTEACIDRLHADGYEVLAVEQVEGAVMLDVFRAAPG VKYALVFGNEVMGVGQSAVDRCDGAIEIPQAGTKHSINVAVSGGVVLWSFFRRIGPGR >gi|313157860|gb|AENZ01000052.1| GENE 85 106599 - 107414 1201 271 aa, chain + ## HITS:1 COG:all1522 KEGG:ns NR:ns ## COG: all1522 COG0413 # Protein_GI_number: 17229014 # Func_class: H Coenzyme transport and metabolism # Function: Ketopantoate hydroxymethyltransferase # Organism: Nostoc sp. 
PCC 7120 # 10 265 3 256 257 254 52.0 2e-67 MSVESTVRAVTTYRLTEMKQRGEKIAMLTSYDYSMAKIVDAAGIDVILVGDSAANVMAGY ETTLPITLDMMIYHACSVVRAVNRALVVVDLPFGTYQGNSKVALDSAVRIMKETEADAVK IEGGEEILESVNRILSAGIPVMGHLGLTPQSIHKFGTYNVRAKEEAEAEKLVRDAHLLSE AGCFGIVLEKIPAALAARVTGEIPTPTIGIGAGGACDGQVLVIHDMLGINKGFSPRFLRR YADLHELMTGAVQQYIKDVKECDFPSEKEQY >gi|313157860|gb|AENZ01000052.1| GENE 86 107643 - 109145 1937 500 aa, chain + ## HITS:1 COG:CAC2701_3 KEGG:ns NR:ns ## COG: CAC2701_3 COG0516 # Protein_GI_number: 15895958 # Func_class: F Nucleotide transport and metabolism # Function: IMP dehydrogenase/GMP reductase # Organism: Clostridium acetobutylicum # 219 497 2 277 280 346 62.0 5e-95 MQQKSTFIMSFINERVQPEGLTFDDVLLVPAYSEVLPREVNIQTRFSRNIKLNIPIVSAA MDTVTEAPLAIALAREGGIGVIHKNMSIAEQAAQVRRVKRAENGMIYDPVTISKENTVGD ALNLMRENKIGGIPVVDDDNILIGIVTNRDLRFQRDMMRRIEEVMTPGDRLITTHSTELS HASEVLLNSKIEKLPVVDDKGHLVGLITYKDITKVQDHPNACKDAKGRLRVAAGVGITPD ALERVKALVAEDVDAVVLDSAHGHSIAMVRKLREIKEVYPSLEVIAGNVATAEAARFLIE NGADGVKVGIGPGSICTTRIIAGVGVPQLSAIFDAASAAAGTGVPVIADGGLRYSGDLVK ALAAGGDCVMIGSMFAGTEEAPGETIIYNGRKFKSYRGMGSIDAMKAGSADRYFQKGCEN NFSKLVPEGIAARVPFKGTLSETVYQLVGGVRAGMFYCGAKDIETLQRARFVRITSSGMH ESHPHDVAITSEAPNYSSER >gi|313157860|gb|AENZ01000052.1| GENE 87 109148 - 109480 284 110 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0047 NR:ns ## KEGG: Bacsa_0047 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 5 98 11 103 114 95 56.0 5e-19 MNPEIAACGLFCGNCRKFGKGSCPGCRPNEKASWCKVRACCIEHGWQSCAECTLMPPQTC KKFDNFIAKVFQVVFRSDRRGCVERIREVGPEAFAAEMRLSGSYNRPVKK >gi|313157860|gb|AENZ01000052.1| GENE 88 109608 - 111167 2245 519 aa, chain + ## HITS:1 COG:FN1444_2 KEGG:ns NR:ns ## COG: FN1444_2 COG0519 # Protein_GI_number: 19704776 # Func_class: F Nucleotide transport and metabolism # Function: GMP synthase, PP-ATPase domain/subunit # Organism: Fusobacterium nucleatum # 204 519 2 318 318 436 64.0 1e-122 MNPDSSDLRYIMEKILILDFGSQYTQLIARRVRELSVYCEIHPFNKIPALDASVRGVILS 
GSPFSVRDENAPTPDLSAIKGKLPLLGVCYGAQFLASAFGGEVQPAPSREYGRAMLTVGD ADDALMRGLPKTTQVWMSHGDTITSVPANYKIVASTEDVRVAAFHVEGEQTWGIQFHPEV YHSTDGTQLLKNFVAGICGCKQEWTSESFVEATVRELREKLGDDKVVLGLSGGVDSSVAA VLLHKAIGKNLYCIFVDSGLLRKNEFEDVLESYKNMGLNVKGVKAGAKFLGDLAGVCDPE TKRKIIGRDFVEVFNEEAVQIKDVRWLAQGTIYPDVIESVSVNGPSATIKSHHNVGGLPE KMNLRIVEPLRLLFKDEVRRVGRSLGISEQLIGRHPFPGPGLAIRILGDITPEKVEILQN VDKIYIDAMRDAGLYDKVWQAGAILLPVKSVGVMGDERTYESCVALRAVTSTDGMTADWV HLPYEFLANVSNDIINKVKGVNRVVYDISSKPPATIEWE >gi|313157860|gb|AENZ01000052.1| GENE 89 111304 - 112674 1425 456 aa, chain - ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 4 447 1 434 459 328 40.0 2e-89 MNFLTRFYRISSPVPRRQLDPAQEQKLYKRLRLQAFIAATLGYSLYYVCRTSLNVMKKPI LDSGTLDATQIGVISSALLFAYAVGKFVNGFIADYCNIKRFMATGLIVSAVANMLMGALG VAHSVIPTAVIFISFAVMWGLNGWAQSMGAPPAIISLSRWYPLKERGTYYGFFSASHNLG EFFSFIFVGSIVSIAGWQAGFFGSAIAGAIGAVIVLLMLHDTPESKGLPPVETLAHEEPA TGQELSVKEIQKQVLRTPAVWILAAASAFMYISRYAINGWGVLFLQEAKGFSDAQAISIV SINALLGIIGTVLSGWFSDKLFKGDRHMPALIFGILNSVALILFIYGGDALWVNILSMVL FGIAIGVLICFLGGLMAVDIVPRKATGAALGVVGVASYIAAGLQDVASGWLIDANITVAE NGEKIYDFSQAAIFWIAASVISFLLPLLNRKKKQAA >gi|313157860|gb|AENZ01000052.1| GENE 90 112857 - 113945 1475 362 aa, chain - ## HITS:1 COG:no KEGG:Sph21_4209 NR:ns ## KEGG: Sph21_4209 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 27 361 30 362 362 288 43.0 4e-76 MKHLPLLLLLGGLLSASAARSADPVRYVDAATLTVIGKALPTEQPYNRIDTTRFRVPAKT PGYCYHPTGLAVVFRTDSRTIRARWETSGKNPSDNMAAVAQKGLDLYIRSNGEWVFAGVG RPKINGKNDRHDAAIISNMAEGEKECLLYLPLYDQLKKLEIGVDEKSVITPMENPFRHKI VVHGSSITHGIAAGRAGMAYSSRLGRDTGLYCINLGFSGQCTMQPEFASYLSRVEADAFV FDCFSNPSAEVINERFDAFVDTIRRTHPTTPLIFLQTIRRETRNFSTRIEKFEREKQAAA EAQVRDRMKTDKNIYFVDPGDLLGSDHIATADGTHPTDLGFTYMLDRIEPQIRKILGKYG IR >gi|313157860|gb|AENZ01000052.1| GENE 91 114373 - 114867 499 164 aa, chain + ## HITS:1 COG:SA1710 
KEGG:ns NR:ns ## COG: SA1710 COG0847 # Protein_GI_number: 15927468 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Staphylococcus aureus N315 # 4 164 6 163 184 122 37.0 3e-28 MKDFAAIDFETANGKRTSVCSVGVVVVRGGEVTDSFYSLIRPRPNFYSRFTTAIHGLTYD DTAEAPDFETVWKQIAPRIEGLPLVAHNSPFDEGCLRAVFELYGMPYPGYRFYCTCRASR RTFGTQLPNHQLHTVSAACGFDLANHHHALADAEACARIALKIL >gi|313157860|gb|AENZ01000052.1| GENE 92 114947 - 116206 2018 419 aa, chain - ## HITS:1 COG:no KEGG:Weevi_1984 NR:ns ## KEGG: Weevi_1984 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: W.virosa # Pathway: not_defined # 5 419 8 415 415 294 40.0 5e-78 MKKTLLLLLLTSCCSIAAAQERSPLRRTEVITADSLDNDAELRERLRNMPNLEVGKGITF RPKSNWFSLTMRFRMQNMVGLSFDKDFTLTKTDAQVKRLRLRFDGHIYSPKLVYSIQLGF TSYDTEPLPNGNMNIVRDAIVYYVPSPKWNIGFGQTKIKANRARTNSSSALQFIDRSIVN SEFNLDRGFGFFGEYNMRQGEGFNLSAKGSVTLGEGRNWGSSAIGGVAYTGRLELYPLGR FKSKGDVLEGDFEGEERVKILLAGAYSYNHKASRLKGQRGAVMPDDATRNLGSYFADFIL KYRGFAFYTDFMGRSCDEPLFDPRSNAFVYNGQGLNVQTSYLFDKKWEIALRNSTLFPES EVQPFAGYKRWNQTTFGVTRYIIGHSLKVQADMSYNHRSESIAPNYNRWEIRFQLELGL >gi|313157860|gb|AENZ01000052.1| GENE 93 116470 - 116976 725 168 aa, chain + ## HITS:1 COG:RSc2790 KEGG:ns NR:ns ## COG: RSc2790 COG2077 # Protein_GI_number: 17547509 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Ralstonia solanacearum # 1 166 1 166 166 177 57.0 8e-45 MSETKFKGSPVRVAGEFITAGAEAPAFELVKSDLSPLRLSDLKGKRVVLNIFPSLDTSVC ATSVRKFNKLAASLPDTVVVAVSKDLPFAHARFCTTEGIENVVPASDFRASGFDGAYGVL MADGPLAGLLARAVVVIDKAGRVVYTELVPEITEEPDYDKAVEAVKAN >gi|313157860|gb|AENZ01000052.1| GENE 94 116998 - 117273 440 91 aa, chain + ## HITS:1 COG:no KEGG:BVU_3861 NR:ns ## KEGG: BVU_3861 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 91 1 91 91 132 72.0 3e-30 MIRLNVFIRTTESNRAEILSVAKELVTASLKDEGCIAYDIFESATRPDVLMICETWSDAK ALAAHEQASHFTTLVPRLHQLGEMKLEKFVF 
>gi|313157860|gb|AENZ01000052.1| GENE 95 117839 - 119284 2120 481 aa, chain + ## HITS:1 COG:slr0074 KEGG:ns NR:ns ## COG: slr0074 COG0719 # Protein_GI_number: 16331744 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Synechocystis # 9 481 6 480 480 704 70.0 0 MAEKDKDILENIGEQEYKYGFTSDIETETIGKGLSEDVVRLISAKKGEPAWMTERRVAAY RHWLTMGPPTWAHLTIPEIDFQDIIYYAAPKQQKKLDSMDEVDPELKRTFDKLGIPLEEQ MALAGVAVDAVMDSVSVKTTFKEVLAEKGIIFCSISEALRDYPDLVKKYLGSVVPYTDNF YAALNAAVFSDGSFCYIPKGVRCPMELSTYFRINAAGTGQFERTLIVADEGAYVSYLEGC TAPRRDENQLHAAVVEIIVEKDAEVKYSTVQNWYPGDREGRGGIYNFVTKRGICRENARL SWTQVETGSAITWKYPSCILAGDNSVGEFYSVAMTNNFQQADTGTKMIHIGRNTRSRIVS KGISAGRSENSYRGLVRMAKGAENARNYSQCDSLLIGDKCGAHTFPVIDSRNRTAVVEHE ATTSKISDDQLFYCNQRGLSTEDAVGLIVNGYAREVLAKLPMEFAVEAQKLLSISLEGSV G >gi|313157860|gb|AENZ01000052.1| GENE 96 119297 - 119869 535 190 aa, chain + ## HITS:1 COG:MA0332 KEGG:ns NR:ns ## COG: MA0332 COG0655 # Protein_GI_number: 20089230 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 127 1 123 185 76 33.0 3e-14 MKVLVLNGSPRRNGLVSQMLEHIAGSLPETCDVERIFVHDLTVRPCTGCMSCRSKLCCAL PEDDAHRVAEAIRTADALVIGSPCYWGNMNGALKVLFDRSVHVMMGEKESGMPVALHKGQ RAVLVATCNTIYPFSVLFNQTRGVFRALREILRWSGYRIVGTAAKSGCRKNGRLTDREIE KCKKLARKIC >gi|313157860|gb|AENZ01000052.1| GENE 97 119851 - 120615 186 254 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 243 7 237 312 76 26 1e-12 GKKDMLSVKNLHASVDGKEILRGIDLEVKAGEVHAVMGPNGSGKSTLAAVLAGNEKFTVT EGSATFLGRDLLEMPIEDRARLGLFLGFQYPVEIPGVTMANFMKLAVNEQRKFRGEEPLT AAEFLKLMREKSAVVELDSKLTSRAVNEGFSGGEKKKNEIFQMAMLDPKLAILDETDSGL DIDALRIVATGVTKLHTPENATVVITHYQRLLDYIVPDVVHVLYRGRIIHTGDKTLALKL EKEGYDWLINDYRE >gi|313157860|gb|AENZ01000052.1| GENE 98 120612 - 121433 1019 273 aa, chain + ## HITS:1 COG:AGc3345 KEGG:ns 
NR:ns ## COG: AGc3345 COG0719 # Protein_GI_number: 15889125 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ABC-type transport system involved in Fe-S cluster assembly, permease component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 27 245 193 406 435 111 32.0 2e-24 MTAILDIIRDFRTAQGEVFRIEGTQDGPYAAVDPRHMRIEAAAGASGRIVVVHTAPDMSS LEIVAAEDARLEITEVFLAEAFAEIAVKQSARSLCRLTAVQLTSANASYTIDLDGRDAEN MLGGVFLAGGEEHCVVKLRTNHNVPDCRSNSYIKGVAGGSARGEFCGLVYVAPDAQHTDA QQQNRNILLSETAHIATQPQLEIYADDVKCSHGATVGQMDSEAVLYMRQRGLSEVQARRL QIEGFVGDVVRRCGVEPLCETMMEAVRAKMEKL >gi|313157860|gb|AENZ01000052.1| GENE 99 121436 - 122653 1698 405 aa, chain + ## HITS:1 COG:mlr0021 KEGG:ns NR:ns ## COG: mlr0021 COG0520 # Protein_GI_number: 13470346 # Func_class: E Amino acid transport and metabolism # Function: Selenocysteine lyase # Organism: Mesorhizobium loti # 2 404 10 412 413 423 49.0 1e-118 MFDVEKIRGEFPILGREVYGKPLVYLDNGATAQKPLAVIETIDYLNRELNANIHRGVHYL SEEATTLYEASRERIRAFIGAAEKEEVIFTAGATASLNTVAYGWGEKFVRAGDNIVVSEM EHHSNIVPWQMLCERKGAEIRVLPFDDEGRLRTELLPSLLDGRTRAVAVTQASNTLGTRP ELRGIIDAAHAVGAIAVVDGCQGVVHGGADMQELDCDFYAFSGHKLFAPTGIGVLYGKRA LLEAMPPFLGGGDMVDTVTFAKTTYAPVPLKFEAGTANFSGAIALGEAVKFVQRFDPEEV ERHEQALLRRATERLAAIDGLRIYGTAPGKCAIVSFNVEGVHPYDMGMILDKLGIAVRTG QHCAEPVMDHYATTGMCRASFALYNTQAEADALADGVERAVRMLR >gi|313157860|gb|AENZ01000052.1| GENE 100 122704 - 123393 987 229 aa, chain + ## HITS:1 COG:PH1695 KEGG:ns NR:ns ## COG: PH1695 COG0125 # Protein_GI_number: 14591458 # Func_class: F Nucleotide transport and metabolism # Function: Thymidylate kinase # Organism: Pyrococcus horikoshii # 2 219 5 196 205 94 34.0 2e-19 MFIVLEGLDGAGKSTQIRMLRQLFTDRGVESEYVHFPRFDSPVYGQLIARFLRGEFGGVQ EVDPYLVALIFAGDRADAAPQIRQWLAEGKAVVLDRYVYSNVGFQCAKLPAGEERDRLAD WIVNLEFGHNALPRPDLSLFLDVPFAFTERKLSEVREGDDRDYLQGGQDIHEASLQLQQD VRSVYLASAAKDPSLRVVDCSDASGAMESPEGIFAKIRAELIPILGADV >gi|313157860|gb|AENZ01000052.1| GENE 101 123386 - 124492 1410 368 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1120 NR:ns 
## KEGG: Odosp_1120 # Name: not_defined # Def: DoxX family protein # Organism: O.splanchnicus # Pathway: not_defined # 8 366 2 363 415 216 33.0 9e-55 MCKTRVFKLIANVCRLILACTFILSGFTKVIDPWGTAMKVNEYLSIYGVESLQPASMAFS IWLCGAELMMGCMLLFKVRIRLISIFAVLSMLFFTALTLLSATLIPVEDCGCFGEALKLT PWQTFFKNLALLPMAFVVWWRYRPDKIFAFNALEIVLTVTFFFLSMYLGYYCYRHLPLID FLPYKVGVNIWEGMHAPAVEPGETETVLVYRNLETGKLHEFSLEDTAWQDAGKWEWVDTR TTEEMPAIRPLMSEFALRDAEGDATEEILTAPGRVYMLCVTSFDRLPRACAKRFARLIQR AETEGARAVCLTPQPLYGVTRHDFGSGEVRCYNIDASTMKTMLRANNGMVLLEEGVIRAK KNCRDIRP >gi|313157860|gb|AENZ01000052.1| GENE 102 124554 - 125582 1219 342 aa, chain + ## HITS:1 COG:VC1756 KEGG:ns NR:ns ## COG: VC1756 COG0845 # Protein_GI_number: 15641760 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 9 339 24 355 364 142 30.0 1e-33 MRTFQTLFLLLALCGCGRRQPAPETVRPVKVVTASGAGVIDKDFAGMATPDDAVNLAFKL SGQVLDVPVSQGENVKKGALLAELDPRDIQLQVSADRSAYEEARSQLQRMQRLLEHEAVS QQEFEASRTRYAQARSAYDNSLDMLKETKLRAPFASVVERKYVDNYERVQAGQTIVRVVN PVTTQVQFTLPESALPLLASQSTRFEVEFDNYRGVKFPAVLKDYAKTSSDASGFPVSLRL TNADPGRYPVSPGMSCTITMQSADPVPDAVSLPVSAIFAPAEGGTYVWIVTGDGHVVRRE VTLGELYGRDRVVVDSGVEPGERVVTAGVYQLRQGERVRILK >gi|313157860|gb|AENZ01000052.1| GENE 103 125590 - 128622 4690 1010 aa, chain + ## HITS:1 COG:VC1673 KEGG:ns NR:ns ## COG: VC1673 COG0841 # Protein_GI_number: 15641677 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 6 1008 25 1034 1037 612 35.0 1e-174 MSLPEYSLKNKKVVWFFLFVLLAGGALGFVTLGKKEDSTFVIKSASLVCTYPGATPLEVE QLISEPIEREVQSMRLVHKITSESYYGLSKIMVELDPATGAAEIPQLWDELRRKVLNIQS RLPAGASPVTVADDFGDVYGIYYGLSVDGGFTWSELREWAQRIKTALVTVDGVQKVTLFG EQTPVVNVYVSLAALANFSIRPESIVRTIGQQNTIVNSGEKQAGALEIQILEDGAYKDLG DISNQLLISSSGKQYRLGDIARVERGYADPPQTLMRVNGRRAVGIGISTEADVDVVKAGE KIGRVLGGLTRQMPVGMDLTVLYPEDRIAREANSTFILNLAESVAIVILIIMLVMGFRAG VLIGSSLLFSIGGTLLLMQFLGEGLNRTSLAGFIIAMGMLVDNAIVVTDNAQQAMLRGVA RRQAVIDGANAPRWSLLGATLIAIFSFLPLYLAPSSVAEIVKPLFVVLGLSLLLSWGLAL 
TQTPLFGDFMLRVKPAAHDPYDTKFYRKFDQLLAGLLRWRWGVVAGVVALFALSLAVMGL MPQNFFPSLDKPYFRADVLLPEGYNIRDTERNLRLMEEWLHAQPEVKTVSVTMGSTPPRY YLASSSISLRPNFGNILIELHDRKQTEEVENRFNAYVVANCPDVWLRSSLFKLSPVPDAA IEFGFIGDDIDTLRRLTQAAEDIMWRTPGTANIRNSWGNRVPTWLPLYSQMKGQRIGVTR SQMAQGITIATQGYRLGEYREGDQFMPILLKDENIDAYNLTNLQALPIFTPAGKVYSIEQ ATDGFRFEYRGGVVKRYNRQRVMKAQCDPARGVNTMELFAALRDSVTRAVPLPEGYSMKV FGEQESQQESNSALAEYMPLTMILIFIVLLLLFRNYREPVIILLMIPLIFIGVVLGLAVT GKVFNFFSLLGLLGLVGMNIKNAVVLVEQIGVLRAGGRGPYDALVSATRSRIVPVAMASG TTILGMLPLLFDSMFGAMAATIMGGLLVATLLTVCVLPVVYAIFYNIRES >gi|313157860|gb|AENZ01000052.1| GENE 104 128619 - 129863 1775 414 aa, chain + ## HITS:1 COG:tolC KEGG:ns NR:ns ## COG: tolC COG1538 # Protein_GI_number: 16130931 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Escherichia coli K12 # 37 402 41 419 495 67 24.0 4e-11 MKPFLLSVILFFCLPPTHAQTTLDDYRRDVIEYSRRLKVAAAGSDAAAETVGQARTGYLP RLSLDGNFTATVHHYDGVERWNFSVLPQLVQVVYGGGAVRAAYRQAELGYDIALCDEEFT RLDVRYTAEYTYWNLSAMALYAASMRQYVSIIRSLKEVVDRRFAEGYIAKGDVLMIDTRL SEAQYELISAERNYTVALHNFNILRGADPALGVTLAQGIRDSLPQPVRVPAAEAAAFRPD YAAAQLRSDRAEAGIRAARAAFNPQLSVGIGGSWQPYYPNRTGATTVDGSAFVKLSLPIF HWGERRRATGVARAVQRQSEWDAAQLYDDIVREEMNGWTALENSRAQVEAMEQSLRIAGE NLDISTYSYGEGLATILDVLQAQLSWIQLYTNAIKAHYNYAVAVSDYRRITAQQ >gi|313157860|gb|AENZ01000052.1| GENE 105 129925 - 130332 608 135 aa, chain - ## HITS:1 COG:no KEGG:BDI_1522 NR:ns ## KEGG: BDI_1522 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 18 134 31 142 148 73 37.0 2e-12 MIEKVTLQLTGDSKLVPENGPDCGEMNPKTLLLYAAAKCSGMTALMIMEKERLRPKRFEI SVSGELSTEEVRSESVFKSFHVVYNIECDTEDDQAKVSRAATLTHEKYCGIMRMFRMIAP VSHEIAVVSTEPAKA >gi|313157860|gb|AENZ01000052.1| GENE 106 130380 - 131804 1550 474 aa, chain - ## HITS:1 COG:MA2348_2 KEGG:ns NR:ns ## COG: MA2348_2 COG0642 # Protein_GI_number: 20091183 # Func_class: T Signal transduction mechanisms # Function: Signal transduction 
histidine kinase # Organism: Methanosarcina acetivorans str.C2A # 191 455 151 421 427 159 36.0 8e-39 MRRTPHPRGGDFAHPLNISDILSVCHDRQDLLPGIFARLRERHRYGDLPPWSYIRLRDSN LKFYIEGRFVGRYDRRRQLRRIVFLFRNTEEEITRKFILGMALSQTKIFPWFFDLGHNRM VIDERWFAHLGLPAGDGTIGTEDFFRLVHPEDRGRLADAFAKQLAGEMNPDTFTYRLRRG DGTWEWFEEQSVYLGQTGDGSPCRIVGICQSIHDHKISEDGLRAARDKARESDRLKSAFL ANMSHEIRTPLNAIIGFANLLTSEDIPFSEAEKQEYSRLITSNGDQLLRLISDILDLSKI ESNTMEFHFGEHSLHALLTDIYQAQRLCMPEGVRLRLDMPQADTPIVTDASRLKQVVNNL INNAAKFTDDGSITFGFRSPQGCEQVELFVHDTGKGIAQEHLDRIFERFYKADSFVKGVG LGLSICRTITEYLGGAISVESESGKGTRFTIQHPLSRPEAAPRREPAGTAGMRQ >gi|313157860|gb|AENZ01000052.1| GENE 107 131756 - 131950 159 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKSRLFHHKDPRAASRTGAQSETDTLSMLSVGAVTLTPEGEITEINARACDELRIRGGGI LHIR >gi|313157860|gb|AENZ01000052.1| GENE 108 132118 - 135102 4496 994 aa, chain + ## HITS:1 COG:no KEGG:BDI_3591 NR:ns ## KEGG: BDI_3591 # Name: not_defined # Def: TPR domain-containing protein # Organism: P.distasonis # Pathway: not_defined # 62 993 60 997 999 374 28.0 1e-101 MRKNLNLLLAFAVCALCASGAQAGTPESAAFIDRGRSLFDFGRWTDARHEFLQARETLSP ADREAAQTVDFYLAACAVELGSRDAEAALRRFEARYPGSVYANDVRFSLGSYYCAEGDMK RAREAFEKTDYKALGRARKEQYDVRMGYVEFTDGDYDKAFGYFDRIGPQSEYADHALYYK SYIDYAEGRYGRAKQGFTALQRSDAYRDVVPYYLLQIEFHEGNYRYVVENGGKLVQRAVP ERRKELERVIAESWFRLGDFNKTIEHLDAFAAAGGELDRDGSYLMGFSLYRTARYPEAAE YLRRACGAEDALTQNASYHLADCYLRAGDKRAAMHTFAMAADDRFDATIAEDALFNYGKL QYELGGGAFNGAINVLTRYVEQYPSSPRVGEARTLLIAAYYNSNDYDAAYRAIKSFPTQD ADIRAALQKITYFRGLEAYNAGDMRAAQRYLAESAAINVSPKYSALNSFWQGEIAFAQGD YTVAAAKYNAYLKRAPRSEKEYAMALYNLGYCAFSRMDMAQARGSFEKFLAVYPARDRYR ADACNRLGDIRYSDREFEAAVAEYDRAAALGGPEKYYAQYKRAVTLGILGRTEQKQQALR QIIAAGEGDYADEASYELGRSHIAQEQYAEGAAQLEKFVADYPSSPRRAQALSDLGLAYL NLGDKEKSLRYYDMVVETAPQSSEAKGAMEGIREIYVSEGRVDDYFDYAQKAGLESDLTA VSRDSLSFASAQKLYLAGQTDAAAKSLRSYVKSYPKGYYVNDALYFLSDCYLRSDQRDDA IETLTTLAGQGTNQYTVTVLEKLSDMTFEDKRYDEAAAAYRQLYDVTTTVAGREDAMKGY VRATLAGGDAAKIETMAADVAAHPDAGTVALRESKFAWAELLRRQDRRADAVKLYKELAS 
DVRTKEGSAAAYYVLEDVFAGGDMDKAEKAIFAYSEREPQAYWLAKAFILLGDVYVRKGD NFQARATYQSVADGYSPADDGIVAEAKERIAKLN >gi|313157860|gb|AENZ01000052.1| GENE 109 135102 - 136835 2097 577 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157875|gb|EFR57281.1| ## NR: gi|313157875|gb|EFR57281.1| hypothetical protein HMPREF9720_0605 [Alistipes sp. HGB5] # 1 577 1 577 577 1153 100.0 0 MMKRILFAAALLAALPCSVSAQVEKRVEVTKAYVPSVESAAKLAVVPDMTDTVKMRPEID YTITPLSLQTTLATRPIRPATVTYWEFNRPLPFYLKAGAGYPLNSVLDFYASSQNPGTGY VVGYVNHEGRYAKIKNDFGIRNNSTRMLNRIGAAAGKYFGKHILEGDLSYENRMYHRYGM YVASDVTSDMTPGAMADYGDANIAVRFGDDFQDLSRVNFEIAIRGGLFFDHSEWPDYNDK ARQTTLETHAKIARGFGRHQLAAEFGYTRLAGQKAIGLYNQQLIHAALRYGIEGGVVRLE AGADYYHDKVKTVDAENYIIPFVRLNLNLGTDGLCPFFEMDGSVDDNGFRSLTRQNPYVA AAFWQTKSSVDYNGRFGIGGSIWRGKFDYRVYAGFSVRDNHPYWYVSDNYAIDGEKTAVA GAMRPALARQTVLSFNGEVTWRPVSSFRTELGVHGYIYNDDETKLHNGAPSFDGSLSAHY DGRKISFGAGVYLQSARKWSAVFENTGTETAPGEPVMQYDTFEAPFAVDLRVNFDWKVSG RVTLFAEGRNLINRRLYEYPFYPEYGANFTVGVKANF >gi|313157860|gb|AENZ01000052.1| GENE 110 136980 - 137273 318 97 aa, chain + ## HITS:1 COG:CAC3001 KEGG:ns NR:ns ## COG: CAC3001 COG3877 # Protein_GI_number: 15896253 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 11 97 9 94 131 72 37.0 1e-13 MATLKNLPKICPSCGKRLSVHAMHCNGCQTHIEGNYSLPVMMQLSAPDQQFVLDFVKSSG SLKEMAHKLGLSYPTVRNRLDDIISQLNKFESDEQDS >gi|313157860|gb|AENZ01000052.1| GENE 111 137257 - 137799 815 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157879|gb|EFR57285.1| ## NR: gi|313157879|gb|EFR57285.1| hypothetical protein HMPREF9720_0607 [Alistipes sp. 
HGB5] # 1 180 1 180 180 262 100.0 7e-69 MNKILNPFEYLSTGKALGWGLAGTLFSICLLTAVSWPVEGDVSKIITILSSNLLLWLPLS LLLYVMALVLSPSRIRAVDIFATNLFAMLPTIVILGVLSLCAMWLGSIVCEPRSACEVLK QAAYNLIVIVLSVSMVWSMVWGCFAYLVSANMKGWKSIVVFVICYVFVSVVNQLLIEYTK >gi|313157860|gb|AENZ01000052.1| GENE 112 138068 - 139081 1308 337 aa, chain + ## HITS:1 COG:VC2017 KEGG:ns NR:ns ## COG: VC2017 COG1559 # Protein_GI_number: 15642019 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Vibrio cholerae # 78 335 84 338 338 133 32.0 5e-31 MRKKTLLYIFFAGLSVLLIAGFVLRQQFYGNAVKTERELFVSSRADYRALVDSLLPELKH HWAFGVYARRINLAETFKPGHYLLERGMSVIRVARMLKLGMQTPVRVTINNVRIPAQLAQ KLAGQIDADSAAIMRALTSDEVARKAGFDSLTLFSMFIPNTYEFYWTVTPEEFVERMKRE YDRFWTPERGALRKRSGLSRFEVMTLASIVYEETRKTDEMPRVAGVYVNRLKKGMPLQAD PTIKYAMQDFGLRRILYKHLKYESPYNTYLNKGLPPSPICMPGINAIDAVLNYEKHDYIF FCARETFDGYHNFAKTLREHNANAQAYSRELNRRKIK >gi|313157860|gb|AENZ01000052.1| GENE 113 139102 - 139704 766 200 aa, chain + ## HITS:1 COG:YPO2212 KEGG:ns NR:ns ## COG: YPO2212 COG0009 # Protein_GI_number: 16122440 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Putative translation factor (SUA5) # Organism: Yersinia pestis # 5 199 7 202 206 157 41.0 1e-38 MLTKIYEQNPSERELRKVVDALEDGGIIIYPTDSVYAFGCSLRSPKAIERLRRMKGKDAE AFTVVFENLAQAAEYCRVDNAAFHILKRNLPGPFTFVLTASSGMPDKALEKRRTIGIRIP GNAVPRVIVAALGCPLITTSVKDDDEVVEYTTDPELIHERYGRDVALVIDGGVGDNVPTT VVDLTGDEPEILREGKGELQ >gi|313157860|gb|AENZ01000052.1| GENE 114 139733 - 140311 712 192 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158003|gb|EFR57409.1| ## NR: gi|313158003|gb|EFR57409.1| hypothetical protein HMPREF9720_0610 [Alistipes sp. 
HGB5] # 1 192 1 192 192 347 100.0 3e-94 MEKQNFWNDAAKCGAIIGAILAVSMVLETMMTLSGSMKYYALMTVEWIGVVVLHYYLLHR FTRNRSKLYSAEEGFSFGQGYLFVLAVSAFAGVIVGGVQYIYLHLIMGYSNYTSRLVEAL TDMMALGGGVPASMESVMAQSLEQIQTAPAPSVLATVWGGIWVSLLFGAVFGLIIAGVLS RAPRPFDTQAGE >gi|313157860|gb|AENZ01000052.1| GENE 115 140329 - 141285 1109 318 aa, chain + ## HITS:1 COG:mlr7556 KEGG:ns NR:ns ## COG: mlr7556 COG0463 # Protein_GI_number: 13476277 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Mesorhizobium loti # 6 310 4 301 326 246 41.0 3e-65 MNKLDISVVVPLYNEAESLPELVAWIDRVAAEHGYTYEVILVDDGSSDDSWSVVESLKKQ YPAVRGIGFARNYGKSAALYCGFAAAEGEVVVTMDADLQDSPDEIPELRRMILEEGYDLV SGWKKKRYDPIGKRWPSKFFNWTARTVSGIGLHDFNCGLKAYRRKVVKAIEVYGEMHRFI PILAKQAGFRRIGEKPVEHRARKYGRSKFGLERMLKGYLDLISVLFMSYFGRSPMYFFGS LGTLMFLIGGGTTVWIIAAKIWKQVHELPLRAVTDQPLFYLAILAIILGVQLFLAGFLGE LINRGSSDRNKYLIDKTL >gi|313157860|gb|AENZ01000052.1| GENE 116 141290 - 141754 667 154 aa, chain + ## HITS:1 COG:no KEGG:Bache_2138 NR:ns ## KEGG: Bache_2138 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 154 1 152 152 79 36.0 3e-14 MIQRIQTLYMLIVTALMAVTLFAPLAWFAGEAGEFGLYAFSLKTAAGEAVQPTVYMGVVL ALACVLPFITIFLFKRRLLQLRLCVVEMVLLVGSAVMEGVYYFLSYRVFAEQTFHTQVLK PAVVLPLVCLLFAYLAARAVFRDELMVRAADRIR >gi|313157860|gb|AENZ01000052.1| GENE 117 142004 - 146023 1042 1339 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157890|gb|EFR57296.1| ## NR: gi|313157890|gb|EFR57296.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 1339 1 1339 1339 2276 100.0 0 MKKLNLLRLLFGAMFLIAATAFTSCVDDNDDDGMPYLEVAPEALTFSAEGVAEGASAVTV KSNRPWTLAFESGADWVTPSATDGKAGTTEVEFSIPASNESRTATLSFQLKNSYGAYFTK MVTINQGEVVPEKLVYKETFGTTGDKTPVASYDGWDKTGEGSSTVTYAGEGVDIRSNLSS VGNGSAYEGSGGSNLMFGTGNTYFTVQDITLPQGQTGYTLTFGASYFNAPNIEIYDQLTV QFSKDGEKWTNAIPYTFENFVAEKNWNLATLNFTLKEFSEHLYIKFLSTKKPDGNYRIDD VTLVTSAGGQQVDLDNGSVTPPVGDVELPTTVVTQFGDSFNDVISGVVYDSPNWAFTSSD AGYPANPKLGWFGSVFGDTFYLQCAPYSSTQKTVTAYAIMTPFNVKAADNKVLTFKLAWY FNATASAADDSKIEIVASTTVTNETITDPSVWTVVKTIEYKEGVNEINVYFDESADLSAY AASDKVYVAFRYVGHNNTYRLDDVSFNGGATGSLVVDPTAISLGDAAGATAKITVTSTGD WKATVSGSGFSIDKTTGTASDTSITVTASEANASSEIKNLGSIVVSNDFGTKTIAVSQKG VSNDIFYESFGDLEQKLDKWPYINESPYVIRSGFGYVSGTSDYKTAYASNRWTATTASPT SAGFSGKGLMWLQAGKNAYFTVQDLVLTTEKNLSIRFGLYGNTDAFDPQKDVIKLYLSSD AENWTEIAYNLENVDATPAWKWASADITLKNTVSRLSLKIEYPSGATGTRVDDVRIFVGT GGTEIDLGGGAATPAVTTANASNVAETTVTLGGSIANADLADYSAVGVEYVVFATGNTPD WTGSVKVPAATKAATWTVDVTGLTKETQYAFRAYATPNTGDILYGDHKTFITTGGGSVTA ITIPELLQIAKGSVISESQDKVLTAVVCGDPTGNNFSFGTLYVMTEGATEGGNGIVIYNN KSADFNVADYALGDRLKITLQANVAMMDEYNGVKQITGVTTYHVEKSGTATVTPINVAPA DLASYVSMPATVQQASTAKAGVWCTASSNGNHTFTSGGTNFTVNVNKRATVFQDIPYAAA TGNITGIVTMYGGKGQIAPRNLNDVEAFKVTVPTITEVTPSSLVWTANDTSEKTVTIEGV NFNGFTLSSLTSFNASVEGTTVKISPKAVNTSDAELKETLTITANGGNSVTVTLTQNGAS SGTKLTITITPTTADSGFPATAPTTEATATIDGYVWGLLNAKQNAGYLMINGTNGYLATP AVPGSTLKTIKITTSNGISASAKVTIKDTSDNIVSPEQAVGDKNTEYTFNLTNPAAGVSY RINSSNKNTQLVKVELSYE >gi|313157860|gb|AENZ01000052.1| GENE 118 146145 - 147362 1160 405 aa, chain + ## HITS:1 COG:all7362 KEGG:ns NR:ns ## COG: all7362 COG1864 # Protein_GI_number: 17233378 # Func_class: F Nucleotide transport and metabolism # Function: DNA/RNA endonuclease G, NUC1 # Organism: Nostoc sp. 
PCC 7120 # 185 399 65 273 274 72 28.0 1e-12 MFAASLAAMVLIAVAFTGCGKDSDNKGGGGANSVEWDPAKTVHPDQEGAAVLLVTGDTGT PWTAEIISGAEWISFNRTAPGGQTVKTGKVGTSLSDKNQYVYYWPNNTKDERHALIRFEF EGEMPVELELVQYSTSSNDNVYETGHNLVWPEIPAEKVNSNYIYVSHFAQLKNQNTNQWY NARNYTLCLDKTKYAAWWVAYPLHSSYTGSGRVETWAYDPKIAAEYQANLTRSYPAANYD RGHQIPNADRSGNATMQAQTFYFSNMTPQNSSLNQHPWADLEKMARDHWMCSDTLYVVTG AYWNPGSTLTTADRDGKLCPVPNYYFKVFVRTVKGNIRQAGDKLGDYSAGQLKSIGFWVA NEGGQGEAKSWVKSVKEIEELTKFEFFPTLPAEVKEKTDAASWGL >gi|313157860|gb|AENZ01000052.1| GENE 119 147383 - 148417 1339 344 aa, chain + ## HITS:1 COG:no KEGG:Poras_0065 NR:ns ## KEGG: Poras_0065 # Name: not_defined # Def: Endonuclease/exonuclease/phosphatase # Organism: P.asaccharolytica # Pathway: not_defined # 26 344 37 351 357 302 46.0 1e-80 MKKRLIFTLFLAAFVVMCFAQKPCKVMFYNLENFFDTINDPEVHDDEFTPEGPKKWNSAK YFKKLGNIERVLFDIAAADKNYPAVIGVSEVETRSVLEDIVATPKLAPGNYRIVHYDSPE ARGVDVAFMYRPDVFKLEGSFPVKTVVPQLPNFKTRDILTMWGTIENEPFFFMVAHWPSR LGGKDASEFKRIAVGEQMRRIADSVLKINPATKVVAMGDFNDDPTDESIAEGLGAKAKMK DLKPGDFYNPFADMLKAGLGTLAYGDAWNLFDNIVVTENLATGSEGRLRLQKAPGSKFYG NIFKRYYMLQKEGQYKGYPLRTYVGNNFQGGYSDHFPVYIYFAK >gi|313157860|gb|AENZ01000052.1| GENE 120 148434 - 151193 3533 919 aa, chain + ## HITS:1 COG:no KEGG:Poras_0864 NR:ns ## KEGG: Poras_0864 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.asaccharolytica # Pathway: not_defined # 41 919 32 906 906 594 39.0 1e-168 MKIKSLLILVLSLVTLTAFGQDGGIKGRVVSRAGRIALGDVKITMTPGGTTTVSDAQGNF VFENIPAGEYSLQFETPEFETSNIAVRVGSQMRDINAVILVPDTQRQMIDDAVFAEFDTE TTDDAQALPTSLSASKDLFNNIASYKFSEMRFNVRGYDSQYQDVYMNGIQLNDAMTGYTP WSLWSGLNDATRNQEVTSGIVASDAGLGGIGGTTNIVTSPAQMRQGLRASLVNGNSMYRF RAMVTYASGYQDNGWSYAFSVSTRQGGNSYVNGVYYNAFGYFAAVEKQFGQRHRLALTLL GAPTERGAQQAATQEAYDLVGNNYYNPNWGWQDGKKRNARVRNNHEPVVMLNYTFDISDR SKLDLATSLRFGMNGYSALTWQNGPDPRPDYYRYLPSYFALDKNMLGAAWQQVYWQANYQ NIRHFDWDKMYQTNYIQNDPVDEAQYGPGRRSNYMVEERHTDQLDWNLVANFSHIFRNNS KIYGGLSLRRNRTEYYSEVKDLLGGDYWADIDKFAERDFGSNAIAYQNNLDYYYRHGHAH AAKVGDKYGYDYYAHVLRTKAWAAYNFNVGGLGVTVGGEVGYASQWREGLWRKGLFPDNS 
KGDSKKLDYLTYKTKGNFSYRFSAAHTVEANVVYLQDAPEFQASFVSPRTRNTTTPGLSA ERVFGVDASYNLRVGDLKARVSGYYTRIYDQSKVISYYDDVAATFTNFAMSGIDKLHFGL EAAVSIPLYRSLALNGAVSWGQYTYDSNPYYLQTQDNSGDIVSDGYVYWKNFRVESTPQL AANVGLSYRSNNNIFLSADFNYYNNMYLSMSPIYRTDAVITTGMTAEDIAHLRRQEKFDS AYTLNASIGKNWYIHRQYTLGFSLEVKNILNNQDIRTGGYEQTRLTKNTETTVTTYQAFD SKYFYMFGTTYYLNLYFRF >gi|313157860|gb|AENZ01000052.1| GENE 121 151220 - 152245 1453 341 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157923|gb|EFR57329.1| ## NR: gi|313157923|gb|EFR57329.1| putative lipoprotein [Alistipes sp. HGB5] # 1 341 1 341 341 646 100.0 0 MNKVSKLFLIAAAGLFFVGCYNDYRNPKAAKIYTRADFEKEGLEYISIKDLKAQFKAENP GMNDGTVASWTVDEPIFTSGKVISTDRYGNVYKSVYLYDAESESAIELKLNTGNYLFHPA GQIVFVKLQGLVLGNYRGMTSIGTTSSNASYSNDNIESKIMQDEHIFSGEQQQMLKSDTL VVTKDNYKTAISDADLGRLVRFEGLESKFGTAPWGYKNTFPNYFANSTSYDVNSPGWSDI NEWATWATKRRLEGANAETYFYGSAWFTYDAAATGSGTNAAPGNYVVRTSGYSQFRDNKI PEDGWVVNLTAIYTKFTNGSGNYGTYQLTLNTDRDVTVVEK >gi|313157860|gb|AENZ01000052.1| GENE 122 152316 - 154373 3136 685 aa, chain - ## HITS:1 COG:PM1683 KEGG:ns NR:ns ## COG: PM1683 COG1368 # Protein_GI_number: 15603548 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Phosphoglycerol transferase and related proteins, alkaline phosphatase superfamily # Organism: Pasteurella multocida # 157 639 102 598 649 236 31.0 1e-61 MRQSLRLVSGLYLKLSLLTIAVGFLLRIVLLFNEQTTDLGFSFGEWLQVFLLGAVNDLCA ATIGFVFLWLFSMSVSETKYGKPWGYIILGILVAAFCYVTFCNTIFDEYGSVAPQIASAV LGYWAGTFALRLFFKGFRSYWTTVWFALIITLYVGAILFNGLSEYFFWSEFGVRYNFIAV DYLVYTNEVVGNIMESYPIVPMTLGLILVTLVVTWYLFRRDLALADRLKGWRWKAVAGPA YIAAVLLAVWLLNFNTRFQDSQNVYVNELQANGLYKFYDAFVKNELNYKQFYITEPEEQA EAFVHGIYGSTGDNLHAVRCEGPEIRRNIVLITIESMSASFMERFGNTNRLTPVLDSLYK QGLAFDRVYATGNRTVRGLEAVTLSLPPCPGQSIIKRPNNAGMHSTGALLREKGYNVKYF YGGNSYFDNMETFFSGNGYDIVDQRQYAPEEITFANIWGVCDEDAYRKVIRTLNEDAQSG KPFFAHVMTVSNHRPFTYPAGKISIPHDSKSRNGGVMYTDYALGEFFTEASRQPWFDNTI FLVTADHCASSAGRTEIPLNKYHIPALIYAPGFVEPQCVDGIVSQIDLMPTLLSLLNMSY DSHFFGRSIYDKQYVNRAFIATYQDLGYLEGDVFTILSPVRRFEQYRVVPTAENPHNLEP MEEKNRELVTRAVGYYQTSGEWNKR 
>gi|313157860|gb|AENZ01000052.1| GENE 123 154484 - 157312 4315 942 aa, chain - ## HITS:1 COG:CC2638 KEGG:ns NR:ns ## COG: CC2638 COG0612 # Protein_GI_number: 16126873 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Caulobacter vibrioides # 27 941 69 974 976 272 26.0 2e-72 MRKITMLVIAAFAVTSAAAQFNPQQPIPADKDVRTGKLENGMTYYIRHNEKPKGQADFYI LHDVGAIQENDSQQGLAHFLEHMAFNGTKNLPGKMLTEYLEKVGVKFGANLNAGTGWDQT TYMMKDVPTSREGIIDSALLILHDWSHFIALEPEEIDSERGVIMEELRTRDGASWRSTMK MLQALGKDTKYEHRNLIGYLDGLKGFHHKELEDFYNQWYRPDYQAVVVVGDIDVDAVENK IKTLMSDIPVPAADAARKETITVPDNEDPIISIYTDPEMQGSKIQLFVKRPALPEQMNNL IYGEMFDVIQAYMTTMENARLQEISMKPDAPFLGAGMGSGEIGVIPTLNATTFVAMTQDG KLAEGFEAIYTEMEKVRRYGFTQGEFERAQNDLMRRAERAYANRNDRRNGEFVQTYLNNY SKNTPMPDAETEWQLDSMLIKMINVEAVNGFAQQVIYPRNQVIVVTAPEKEGIVNPTAEE LLAIREKVANAEIEAYEDNTVKEPLIPEGTVLKGSPVKKTAQDATLGTTEWTLANGVKVV VKPTTYKADEVRMSAVAKGGLSILSDEEFYMGEMMPAFNSMSGVGKFSATDLKKQLSGKS ASVQPSVENYASAVNGYCSPKDLETMMQLLYLNFTQPRFDQNDYNTLMKMLRSQLDNVKS NPDYLMEEKFIDVAYGNNPRRQMISSEIIDKFSFEALPAIYRKLYPDANSFTFTIVGNVD LDALKPLVEKYIGSIPVSKKAMTFADDKCAPVKGDVTEEFTAPMQQPKVSVHYMFSGKMP YTLKDKAALTFLTQALNSRYLISIREEKGGTYGVQVSGSTEYIPDETYKLDIRFDTNEEM ADELREIVMKEIREIAENGPKTEDIEKNREFMLKSWKNSLEQNAGWMNYIQAKYGPGLEY LKDYEQVIRSLTNADVQAMAKKVLGDNNLVKVVMRPAKEKAE >gi|313157860|gb|AENZ01000052.1| GENE 124 157338 - 157559 268 73 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163756262|ref|ZP_02163377.1| 50S ribosomal protein L20 [Kordia algicida OT-1] # 1 65 1 65 67 107 78 3e-22 MAIIINIDVMMAKRKMSLGELAERVDITPANLSILKNGKAKAIRFSTLEAICRELDCQPG DIIEYRPDQPSEE >gi|313157860|gb|AENZ01000052.1| GENE 125 157620 - 158309 1230 229 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157995|gb|EFR57401.1| ## NR: gi|313157995|gb|EFR57401.1| hypothetical protein HMPREF9720_0621 [Alistipes sp. 
HGB5] # 1 229 1 229 229 435 100.0 1e-120 MQNKKRLHKHLLMIYIIYFAALIVGFAASFVPDFSRGWRNAQNTLEMEIPQGGIRSYYVS APVIRSAGEPIGIDNLPENITPTISRLDLKVTVDEAYTLGNAFKVMGNNSFCYLLMVITG LSYLTILVLIALIINSLRKSIRDEQPLRHSNILRTRAIGILILVGELSEAFMKYLNNNEA VRLLEGTSFEVVQTFPLSYWNVIVGILFLFMAEVLSLGTQLSEEQKLTI >gi|313157860|gb|AENZ01000052.1| GENE 126 158454 - 159065 1031 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158028|gb|EFR57434.1| ## NR: gi|313158028|gb|EFR57434.1| hypothetical protein HMPREF9720_0622 [Alistipes sp. HGB5] # 1 203 1 203 203 322 100.0 8e-87 MEMKKLDFAETLKDSIAIGVKNAPSVVVAVALWLVTIWIPYINIGTTIAISLLPVELAKG SVINPLGIFDSKYRRYMGEYLITAGLMVIPIYIAFVFMIVPGIVLSLSWALSFYFLLGKG KKPMQAIKASNDATYGSKWTMFLVTLVFGIMVGIVFGILYAICLAINVGFITFLVMFVLI VLAASVRMAIDASFWKQLKDNVE >gi|313157860|gb|AENZ01000052.1| GENE 127 159165 - 159893 1049 242 aa, chain + ## HITS:1 COG:CAC3591 KEGG:ns NR:ns ## COG: CAC3591 COG3884 # Protein_GI_number: 15896825 # Func_class: I Lipid transport and metabolism # Function: Acyl-ACP thioesterase # Organism: Clostridium acetobutylicum # 8 218 9 218 248 70 25.0 2e-12 MAEKSHYNYRVEPQDVDFTLRATIPSLGSAILNTAGVDAHGKGFGVDALNADNHSWVLSR MAVEFDSQPTQYTDYTIATWINEYGRVLSTRNFTLTDAAGTEFGRAVTQWAMIDLRSRSA LDLSWVGDAHADAIVDAPSPTDKPRKIREVNPAQTVEHKVVYSDIDFNRHVNTMRYIEMM IDMLPVELLMQETPVRFDIHFLRECRYGQTLSVGYEQRGRTALFEIRSDEGTPAVRASIE WK >gi|313157860|gb|AENZ01000052.1| GENE 128 159966 - 160502 815 178 aa, chain + ## HITS:1 COG:BMEII0704 KEGG:ns NR:ns ## COG: BMEII0704 COG2193 # Protein_GI_number: 17989049 # Func_class: P Inorganic ion transport and metabolism # Function: Bacterioferritin (cytochrome b1) # Organism: Brucella melitensis # 15 167 8 160 161 71 32.0 9e-13 MENAVKTQNKYQVSIDLLNDAVGKEIATSLQYMYFHTHFEDDRYQYLSKIMREISIAEMR HIEEFSDRILFLQGDVDMNASFRTRQVTDVKDMLRLALQLEQSTIDSYNEASRIASEHKD AVTHKMFQDIIAEEEQHLDTFRTELQNVLDYGEEYLALQSAAGSKHAAKSFGHPGDGE >gi|313157860|gb|AENZ01000052.1| GENE 129 160716 - 161852 1523 378 aa, chain + ## HITS:1 COG:MA3755 KEGG:ns NR:ns ## COG: MA3755 COG0438 # Protein_GI_number: 20092553 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Methanosarcina acetivorans str.C2A # 3 316 7 330 384 139 29.0 1e-32 MKITILGPAHPYRGGLASIMEIMARTFRLRGNEVDIKTFTLQYPSLLFPGKSQTVDTPPP ADLDICRCVNTVNPFNWLRVGRRIRRERPDFVLMKYWTPFMAPCFGTIARIARGNGHTKV LCQIDNVEPHERHLTDKPFNRYYLRAVDGFVYMSEQVHRELEAYSKAPALFSPHPLFENF GERVARSEACVRLGLDPAVGYALFFGLIRDYKGLDLLLDAWAELKRAGRAEGRKLIVAGE FYTPKERYLRQIADNGLQDDVILHDRFIPDEQVKYYFSAADFVVQPYKTATQSGVTQIAY QFCVPMVVTDVGGLAEIVPDGRVGYVCEPTAQGVAEAVAKMYEGDAIERFRLNCIEERKR FSWEEMCSRITELYERVR >gi|313157860|gb|AENZ01000052.1| GENE 130 162229 - 163887 2128 552 aa, chain - ## HITS:1 COG:MA2912 KEGG:ns NR:ns ## COG: MA2912 COG0365 # Protein_GI_number: 20091733 # Func_class: I Lipid transport and metabolism # Function: Acyl-coenzyme A synthetases/AMP-(fatty) acid ligases # Organism: Methanosarcina acetivorans str.C2A # 1 552 7 560 560 677 56.0 0 MVERFLQQTTFRSQEDYRKNLHIRVPENFNFAYDVVDAYAEEQPDRKALLWTNDQGAEIQ FTFADMKRETDRTASYFQSLGIGKGDVVMLILKRRYEFWFSILALHKLGAVVIPATHLLT KKDVVYRCNTAGIKAIVAAGERVIADHVAAAMPESPTTELLISVGPEIPEGFLDFHAGIA NAAPFVRPQRVNANDDIMMMYFTSGTTGEPKMVAHDFTYPLGHISTGCFWHNLHDGSLHL TIADTGWAKAAWGKLYGQWLAGANIFVYDHEKFTPADILHKIEQYRITSLCAPPTIYRFL IREDLSKYDLSSLEYCTTAGEALNGAVYDTFRQLTGVRLMEGFGQTETTLTLATFPWMEP KPGSMGVPNPQYEIDLLTPDGRSAEDGEQGQIVIRTDKGKPLGLFKEYYLNDELTREVWH DGVYYTGDVAWRDEDGYFWFVGRADDVIKSSGYRIGPFEVESALMTHPAVVECAITGVPD EIRGQVVKATIILGEKYRAQAGEALIKELQNHVKRITAPYKYPRIIEFVEELPKTISGKI RRKEIRNGDEKR >gi|313157860|gb|AENZ01000052.1| GENE 131 163891 - 164445 885 184 aa, chain - ## HITS:1 COG:MTH700 KEGG:ns NR:ns ## COG: MTH700 COG1396 # Protein_GI_number: 15678727 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Methanothermobacter thermautotrophicus # 1 184 1 182 182 174 49.0 1e-43 MCDPIKSIANRLRGLREVLELSAQEVAESCHLRVEEYMALESGESDISVNVLQTIARRYG ISLDVLMFGEEPKMNAYFITRAGAGVSVERRKAYKYEALASGFRDRKADPFIVTVEPAPA DAPMHLNSHEGQEMNYVLEGRLLLSLNGKELVLNVGDSLYFDSSLPHGMKALDGRPVRFL AIIM 
>gi|313157860|gb|AENZ01000052.1| GENE 132 164753 - 166141 2264 462 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1741 NR:ns ## KEGG: Odosp_1741 # Name: not_defined # Def: major facilitator superfamily MFS_1 # Organism: O.splanchnicus # Pathway: not_defined # 1 462 1 471 472 494 60.0 1e-138 MTEIIQRKLNDSAFVRWTALILIALTMFFGYMFVDMMSPLQSMIEAQRGWTPDVFGMYGS SEFIFNVFGFLILAGIILDKMGIRFTGVLSASLMFIGASIKYYGVSDAFIGSGIETWLNS WWVSFPGSAKLASLGFMIFGCGMEMAGITVSKTIAKWFEGKEMALAMGLEMAIARVGVFA VFTISPWLANMAPATVVRPVAFCTLLLLIGLLTYVVFTFMDRKLDKQLGLDAKGNNSSEE EFRVSDLGKIFGSKVFWIVAMLCVLYYSAIFPFQRFATNMLESNLGVSAQTAADIFRWFP MGAAAITPLLGSYLDHKGRGATMLIFGAILMTVCHLIFAFVLPAYPSTLIAYGAIIILGI SFSLVPAALWPSVPKIMETRYLGSAYSLIFWIQNIGLCLFPAVIGYALKFSNPGHIDGTA YNYTLPMVIFASCGVVAMLLGIWLKAEDRRKHYGLELPNIKK >gi|313157860|gb|AENZ01000052.1| GENE 133 166164 - 168137 3453 657 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 20 354 26 361 628 154 30.0 4e-37 MGIALAAALMCGTKSYAQYNREYFFWVGRSCMMNNDYQEAIRTLNTLLRFDENAFEGYFL RGIAKYNLNDLLGAEADFSTAIRMNPVYTQAYTYRAITRSRLGNYDDALQDFREAIELRP DLPGPYYSRGVTRLLNQQFTEAIDDFDKFIRQENKVADAFICRGLSYLHLKDTVRAYDNF NTAIRTNRENPNGYNRRGGLYMEQKRYPEAEADFNKAIECDSTYLLSYFNRALVYSNTNR PMLALSDFDKVIQLDSTNSLTYFNRAMLRTQIGDYNRALEDYDKVALYSPNNVLVYYNRA GVYAQLGEIEKAAEDYTSAIKLYPDFANAYIYRGRLRELLRDPQGAKKDRDTAQKKIAEY RSRLSDSTYSIFADTTQRFDRLLSFDSKFSGGSFDRVTGHNGGREEMRLLPLFKFTLMRP DTVPAVKRYHVQRVEDFKKRVDNEFLTLSWRETNIAPDTLVMLDKQYVEELKAAETSWTV LFERGVSQALIKQYTNAVNTFSSAIELNPSNPFLYLNRSTTRAEMIDFISSIDNSYQRIT IDSDPANRLNNNSKRTYSYDEAVADLNKAIKLFPDLAYSYYNRANLRALSGSLPEAFEDY TKAIELNPNFAEAYYNRGIIQIFMKDTRKGCLDISKAGELGIVEAYEVLKRYAPTDK >gi|313157860|gb|AENZ01000052.1| GENE 134 168207 - 169088 1445 293 aa, chain - ## HITS:1 COG:BS_bex KEGG:ns NR:ns ## COG: BS_bex COG1159 # Protein_GI_number: 16079583 # Func_class: R General function prediction only # Function: GTPase # Organism: Bacillus 
subtilis # 3 291 7 297 301 252 45.0 6e-67 MHKSGFVNIIGNPNVGKSTLMNALVGEKLSIITSKAQTTRHRIMGIVSGEDFQIVYSDTP GILKPSYKLQESMMKFVTGAVTDADVILYVTDTVEQGDRSAEIIGRIRESGIPTVVVINK IDLSTPEALDALVDKWRQELPEAQIVPASAKENFNIEGLFKTILALLPEGPAFYPKETLT DKTLRFFASEIIREKILKFYDKEIPYCCEIEIDSYREEPTIDRIAATIYVARDSQKGILI GHKGEKLKRVGQTAREDMEEFLGKKVFLQLFVKVSDDWRNNERQLRRFGYELE >gi|313157860|gb|AENZ01000052.1| GENE 135 169244 - 169936 673 230 aa, chain + ## HITS:1 COG:MA0330 KEGG:ns NR:ns ## COG: MA0330 COG0778 # Protein_GI_number: 20089228 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Methanosarcina acetivorans str.C2A # 61 226 9 172 179 169 50.0 4e-42 MQPPMKTSKTLFFLLGALLAALLTWWLAGGKQAAAGTSAVGMPAAGISAAGLPAAGERSA LDVIATRTSVRAYRDCPVGADTVELLLRAAMAAPSAMNRQPWVFVVVDDKALLQKFADSL QYAKMAASAPLAVVVCADLTRNPGASGDWWVMDASAASENLLLAAHAVGLGAVWTGVYPR SERVKAVRTILGLPESVVPLNVIPIGYPAQTPEPKQKWNPGNIRRNGWTR >gi|313157860|gb|AENZ01000052.1| GENE 136 170048 - 171373 2032 441 aa, chain - ## HITS:1 COG:FN1393 KEGG:ns NR:ns ## COG: FN1393 COG0541 # Protein_GI_number: 19704725 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal recognition particle GTPase # Organism: Fusobacterium nucleatum # 1 439 1 436 444 418 52.0 1e-117 MFENLTDKLERSFKILKGEGRITEINIAETLKEIRRALIDADVNYKVAKSFTDEVKQKAL GQDVLKAVKPGQMMTKIVRDELALLMGGTATDIRLEGNPAVILIAGLQGSGKTTFSGKLA LFIKSKKGRQVLLVAGDVYRPAAIDQLKVLGGQIGVEVYTEEGSKDPVQIAENAIKYARQ HAFNVVIVDTAGRLAVDEAMMQEITAIKTAVKPSETLFVVDAMTGQDAVNTAREFNDRLD FDGVVLTKLDGDTRGGAAISIRSVVNKPLKFISSGEKMDALQVFHPERMADRILGMGDVV SLVERAQEQFDEEEARKLKKKLVKNQFNFNDFIGQIQQIKKMGNLKDLASMLPGMGKMLK NVDIPDDVFKQTEAIISSMTPAERERPEIINARRRERIAKGSGTTMADVNRLMKQFEDTR KMMKAVAGGGMKMPKMPGMRR >gi|313157860|gb|AENZ01000052.1| GENE 137 171536 - 172264 758 242 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157874|gb|EFR57280.1| ## NR: gi|313157874|gb|EFR57280.1| hypothetical protein HMPREF9720_0634 [Alistipes sp. 
HGB5] # 1 242 1 242 242 483 100.0 1e-135 MKKTAFAVLLALLCCVSGAAQTPDEWQPPKGLPYGNEIGIRYGAQIQLGNGGRLSPDRLS PDLQAFSIDYARYNYYNIGFRTGLNVFLDSDVFDYYSIPMQFTWRTGRMVSAWRRSRDEG YPNNDPYGNDYYGESEPDWGSAFLTTLFSILPSAFEVHTGFTPGMMFGPLSLPPYDGGFL VRHRFSCTFDVGARLIIPIWRFNLYGDFTYHCYITDNFGFRYGDFRPTRSYMGVGVGLSF NF >gi|313157860|gb|AENZ01000052.1| GENE 138 172456 - 173031 967 191 aa, chain - ## HITS:1 COG:no KEGG:BF0012 NR:ns ## KEGG: BF0012 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 6 190 7 199 199 128 37.0 1e-28 MKKWLLLLITLAAALPGAQAQGGRGFWHQEWCVEKGDSVPLVHILPVYVFSRPVDLRRYR KLVDAVKKVYPIARIAKAKMAAMEEELCRLPTKKAQKAYIRQVYNQIKEEYTPVLKHMTR TQGKVLLKLIDRETEYTAYEVLKEFRGGFVAGFWQGVSRIFGQNLKSEYDKQNEDRMIEQ IVIYYEAGLLR >gi|313157860|gb|AENZ01000052.1| GENE 139 173103 - 173477 539 124 aa, chain + ## HITS:1 COG:FN0889 KEGG:ns NR:ns ## COG: FN0889 COG5496 # Protein_GI_number: 19704224 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Fusobacterium nucleatum # 1 123 1 123 127 89 39.0 1e-18 MLEKGLSAQSRTTVTNENTAAAMGSGDLPVFATPSMVALMENAAMTAAAAALPAGSTTVG AEMNVTHIKPSGLGAEITATAVLTEVEGRKLTFNVGARDAGGMIGEGVHIRYVVDCEKFM AKLG >gi|313157860|gb|AENZ01000052.1| GENE 140 173567 - 174343 817 258 aa, chain + ## HITS:1 COG:PA0749 KEGG:ns NR:ns ## COG: PA0749 COG2720 # Protein_GI_number: 15595946 # Func_class: V Defense mechanisms # Function: Uncharacterized vancomycin resistance protein # Organism: Pseudomonas aeruginosa # 2 191 28 218 273 174 47.0 2e-43 MWYFGGLRFARKSGSPSCDYSHFRHRTPLLRRLKDVDMQYQYNKITNLRLAAARLDGVVL RPGETFSYWRLIGRPSRRKGYLDGIVLFCGTFRPGVGGGLCQMSNLIFWMTLHTPLTVVE RYRHSYDVFPDCGRTQPFGSGATCVYPYRDLMIRNDTQTTFRLHVTVGGEFLEGEWRADR RPEYSYRVVERDHCMQSEYWGGYSRHNKLYREVYDAEGRCVDVQYVTRNDALMMYSPFLA ENAQPGKRPADNQRPGYE >gi|313157860|gb|AENZ01000052.1| GENE 141 174411 - 175748 2204 445 aa, chain - ## HITS:1 COG:CPn1015 KEGG:ns NR:ns ## COG: CPn1015 COG1055 # Protein_GI_number: 15618923 # Func_class: P Inorganic ion transport and 
metabolism # Function: Na+/H+ antiporter NhaD and related arsenite permeases # Organism: Chlamydophila pneumoniae CWL029 # 3 434 4 408 420 278 40.0 2e-74 MILTMILLFVVGYTFIALEHKVKVDKSAIALLMCGAIWTVFSLWGHDSNISHEMVDHLGD TCEILVFLIGAMTIVDLIDTHGGFNVITDHITTRNKHKLMWLLAIITFFMSAALDNMTTT IIMVMLLRRIIADQKERWIFASLIVIAANSGGAWSPIGDVTTIMLWMRGNVTAAALMGTL FVPCIVSVIIPTAIAMRYVGNEDAAPVDESAFEAELPKGVGPRLSKFILVTGVLSLLFVP VFKSITHLPPYMGMMVSLGVMWVLTEIIYDRKRGIEESIKCRVSKVLKHIDMPTILFFLG ILMSVAALQSAGVLTDVANWLDKQVHEVFTIAGVIGVLSSVIDNVPLVAACMGMYPVMDA AAVAASADPAYMQSFVQDGLFWHLLTYCAGVGGSLLIIGSAAGVVAMGLEKINFAWYFKK ITLLAFVGYLSGILVILLEHLLIGL >gi|313157860|gb|AENZ01000052.1| GENE 142 175759 - 176259 679 166 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158026|gb|EFR57432.1| ## NR: gi|313158026|gb|EFR57432.1| hypothetical protein HMPREF9720_0639 [Alistipes sp. HGB5] # 1 166 1 166 166 340 100.0 2e-92 MKNMIDDAIYSASPEIACRRYNPSLGIVLTAGSITLLWANSHASLFTENDMLSQWNLLIS SCILCTGIVMICYRLFGDSSAPVEKRSRERLYRSEYSFEAHDLPKVQSAIECGNFAVLQN LPRSYQPAAQVICYRTDSGSLIAAQVLMNRQPTGEIKVFRAGEYDF >gi|313157860|gb|AENZ01000052.1| GENE 143 176340 - 176720 215 126 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313158013|gb|EFR57419.1| ## NR: gi|313158013|gb|EFR57419.1| conserved domain protein [Alistipes sp. HGB5] # 6 126 1 121 121 210 100.0 3e-53 MKQWLMILALCWAAATVNAAGESRSRTLRAEITAAVLSQADSERFETDNAPLLATACTGC TQLSGAEQGAGRSVRHAGNFIVAATRGYACAGNPHPAAPAVGKCGVRPPSIRRADRYIYY LRRIII >gi|313157860|gb|AENZ01000052.1| GENE 144 177011 - 177157 219 48 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MDTLKTDKTDTAEAENIETMELLTGMLDNATQCGGTDDENPQPEKSGK >gi|313157860|gb|AENZ01000052.1| GENE 145 177175 - 177426 484 83 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157980|gb|EFR57386.1| ## NR: gi|313157980|gb|EFR57386.1| transglycosylase associated protein [Alistipes sp. 
HGB5] # 1 83 1 83 83 100 100.0 3e-20 MYFVWYLLLGLASGWIANLIVKGDGSGLIINLIVGLVGGVLGGWLFSPVGPAPLGSLGSL ATSVIGAVALLWIAAAITRKKIR >gi|313157860|gb|AENZ01000052.1| GENE 146 177494 - 177742 310 82 aa, chain - ## HITS:1 COG:no KEGG:bpr_I0029 NR:ns ## KEGG: bpr_I0029 # Name: not_defined # Def: hypothetical protein # Organism: B.proteoclasticus # Pathway: not_defined # 7 82 3 78 80 96 61.0 4e-19 MTQHISYPTVGTCSRQIDIELEEGVIRNVTFTGGCSGNTQGVAALVRGMKAADAVARLEG IDCRGKGTSCPDQLAKALKQAL >gi|313157860|gb|AENZ01000052.1| GENE 147 178079 - 178399 556 106 aa, chain - ## HITS:1 COG:PA3822 KEGG:ns NR:ns ## COG: PA3822 COG1862 # Protein_GI_number: 15599017 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit YajC # Organism: Pseudomonas aeruginosa # 20 98 25 102 112 63 35.0 1e-10 MINFLQVPPVEPQPGFMQQYSFIIMIGLMVLVLWLFMWRPEAKRRKQMQAFRDGLKKGDK VITAGGIYGTVKEIKETTLLIEVDGNVTLRIDKNMVVADNSDLQRQ >gi|313157860|gb|AENZ01000052.1| GENE 148 178472 - 178885 507 137 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157866|gb|EFR57272.1| ## NR: gi|313157866|gb|EFR57272.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 137 1 137 137 249 100.0 6e-65 MLRVCLLLLLAASLAACGARQVKTERKGRIITLTDKILDTGGTDTVRFGRLHSGEIAQLR LWLANETSGPVVIAKYGRSCGCTTLGYDNQPINAGDAQQVTLTFDARGEWGWQLKTLDVS FAGARSPLRLFIEAEVE >gi|313157860|gb|AENZ01000052.1| GENE 149 178892 - 179827 1601 311 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1247 NR:ns ## KEGG: Bacsa_1247 # Name: not_defined # Def: NusB antitermination factor # Organism: B.salanitronis # Pathway: not_defined # 1 305 1 307 308 233 42.0 9e-60 MLSRRLLRIKVVKALFAHLKSGADNMIASEKTLMTSVDKAYDLYFQILILPVEIARYAEQ RQELAKQKKLPTHEDLNPNTKFVDNQIIRVIANSDAVNDYAAARKLNWTRYPELIRTLYT QLTESDYFKDYMARPERSFADDRKLLEDFFKELQSCEPLDNVLEEMSILWSDDLPYIVMM ILRSLSNLRPTHTELKVPAKFKSDEDPQFVRTLFEKSLVNYDSYQDYIEKFTSNWDVERI VFMDNLIIGTAMAELTSFPSIPVKVTLDEYIEISKYYSTPGSSTFINGVLDKIVDSLTAE GRIKKAGRGLI >gi|313157860|gb|AENZ01000052.1| GENE 150 179910 - 181187 1620 425 aa, chain + ## HITS:1 COG:CAC1849 KEGG:ns NR:ns ## COG: CAC1849 COG2081 # Protein_GI_number: 15895124 # Func_class: R General function prediction only # Function: Predicted flavoproteins # Organism: Clostridium acetobutylicum # 24 421 1 388 393 264 40.0 2e-70 MEQLEEKEIIPYDVIVVGAGAAGMMAAGTAARNGKRVLLIEKMEKSGRKVRITGKGRCNV TNARPAEEFASQVRTNAEFFAPAFAEFNNRAAIRFFERAGVKLEIERGERVFPKSGKAWD IATALLEYCFENGVKIIYNTRVTGIMTLGSKVFGVRYINKRGFERKEECPNVIVATGGLS YPATGSTGDGYAFASDLGHSVEPVRPSLTPLVSSHPQIKYLDGLLLRNVRATLWIDDQPV REEFGELGFSQRGIEGAVALRMSRDAVDALIDQRRVKLVVDLKPALTEEVLRERIAREIA DMEPTEFFTELLRKLVPKPLVLTLVHEMDVNGKNYVSKITEEQIGRLIKTLKGFTFPLTD YAPFEYAVVTAGGVCCDEVNPYTMESLKVRGLYFAGEVLDLDANTGGYNLQIAFSTGRLA GMLKK >gi|313157860|gb|AENZ01000052.1| GENE 151 181193 - 181705 294 170 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158031|gb|EFR57437.1| ## NR: gi|313158031|gb|EFR57437.1| hypothetical protein HMPREF9720_0648 [Alistipes sp. 
HGB5] # 1 170 1 170 170 329 100.0 5e-89 MRIGGRTIGLGFYGRIAELHNRLVRARYFRGHGVHSPFVYNIVREVFMRGELLPGDRRLY RALRGTGVPERRAIQLQNVAIHCGYGTFGLNRADAEFCILTRDLPRAETLALVAAAGEAG HTVALMSPCEGQDRQMLCRQIVAAHRSTTVDNRGYLLIFNNNLPKQHFRI >gi|313157860|gb|AENZ01000052.1| GENE 152 181702 - 182253 780 183 aa, chain + ## HITS:1 COG:lin1172 KEGG:ns NR:ns ## COG: lin1172 COG2096 # Protein_GI_number: 16800241 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 1 179 1 172 188 119 37.0 3e-27 MKIYTKTGDKGTTSLVGGERVFKTDERVEAYGSVDELAAFTALLCDNMRSDAALTPYVDD LNRILSRLMTVEALLARGKSGCEKVAPLAPEAVTWLEGRIDALQAAVPPIDKFTIPGGNA VVSMCHVCRTVCRRAERAALRADEKYGTDAEALMFLNRLSDYFYALGRMLTAHYEVDETL WIP >gi|313157860|gb|AENZ01000052.1| GENE 153 182417 - 182638 310 73 aa, chain + ## HITS:1 COG:no KEGG:Bache_1018 NR:ns ## KEGG: Bache_1018 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 73 19 91 91 116 94.0 2e-25 MYWTLELASKLEDAPWPATKDELIDYAVRSGAPLEVLENLQEIEDEGEIYESIEDIWPDY PSKDDFFFNEEEY >gi|313157860|gb|AENZ01000052.1| GENE 154 183073 - 185661 3142 862 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_3179 NR:ns ## KEGG: Pedsa_3179 # Name: not_defined # Def: metallophosphoesterase # Organism: P.saltans # Pathway: not_defined # 54 658 57 655 664 363 36.0 1e-98 MRKMKRLIKFRLPYLFAAAFLLLCIGISCSEDPTWTKPTPEQEEIIPAVFIEDLVMPDAE EIFIPEDPIKIMGTGFDKYDVFLFRSTSEDPEEETKYFQAEISEVAEDYVIITVPKTIDL RSYELILKRYNKQQVLGPMTIYKAHFPAIPDQAGMTVKGRVYCGKKAVADVVVSDGVIVT KTDKDGVYYLPSEKKMGYVFVSIPSGYNIESEAGYPAFPAMSQQLLNEDNEQHDFALTEI PGGNTNHVMMVFTDLHLASKTPNTAGKYADIDQYKNKFMPDITAEMAQYANVYALCLGDI TWDAFWYKTMNQGPENMVEVRYQLPNYKELMKNFPCPVFNVIGNHDNDPKVVGDYFTELP FKQHIAPTYYSFNIGDVHYIVLDNDQYDNYNGGSRIERIGLLGDSDPQKWQMAWVAEDLK YVDKSKKIVVAMHAPMTSAGLNPIVTLSGGNDLLGLLQGYQVEFLTGHTHTNHHAKIAEG VREHNVAAVCGTWWFNVKSANGGADYDLCKDGSPTGYGVFKFNGTAVQWYYKGIEVARNE QFAVYDMNFVKSDYKSAVGAKANEVLVNVWNWDEDWTVSVTENGQPLTATRVYRKDPTHY TWQKDVLEPAHAPGSPNSTYLTGYNAHMFSAIAQVPGSTIKVVVTDPFGGVYEKTIVREI 
PSNLPAQWVFTKGVNVDEFVVDNKMPSATGKGYISYISNCDPALDVNNKIARANTAGEPY ITGRWPGDWWLFTIPEMTIKAGTVINAKFHARASGTGMKYWMLEYYDGGEWKPGAPLQTT TVGEGDQAQTFSYNYEMMNTDHCLIDRNMTFEHAINNGDILIRLRCMANWQASGKGALAA PNGGTHRISVQNNINPTISIVQ >gi|313157860|gb|AENZ01000052.1| GENE 155 185741 - 187744 2513 667 aa, chain - ## HITS:1 COG:no KEGG:Phep_1386 NR:ns ## KEGG: Phep_1386 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 19 663 19 689 692 97 23.0 2e-18 MKIKQFISGRWPGLKALALCAMAATAFVACKDDETDVVQDPHITLSEDAFLYNAAGETHS FTVYSNIDWKVEMADTSQTWVNIWPKEGSDDGVFAVKVSKQPAKDPVLREARFRVVGKHE HDNIIQEITISQRDVDPALKLGVTTTPPKVVINSNGEEKYSVGVTCNVEWTAKVGDSGEG WITIDEISDNEIKFSVPKYAGTVPRTGRIEFNATTERMATVELIVYQSIPLDIGENAELK TIAEVYEKLGGLQGTVRENVKIQGTVISDRVSGNCPNPAAFFVQDASGRAIQFRVQEGSH SLQLNSLVTIPLMNTQSVIDDDGQPYITISTERIRDVITTGGAPIEPKVMESTQGLTDEM LGTLVTIRNLQFVFPHGTYYNANEGSIYGGGRPTDLAQLLLDRDGNTISHYVYGGTSVET GAMFKHYELLTDDPMLDITGILIKVENEWAIRMRNINDRDKLTEARRYVPVTEFYWPETL PDKEILVPKVGTGQLFVNFAPKAVIAQSTTFRLYSGAAYVRVDASVNGTEKGYYSANCGG WGAQSNDTYTPTQKAWYFETSTAGISDDLYVCFCTTSSQSGPGPFSIEWAAAEAGPWTFV TDYKAGNWEVIPGPKNFAFKLPDGCKNQSSLWLRFRVNSDERVDGKAANSATGTNRMIYT SLKKKVN >gi|313157860|gb|AENZ01000052.1| GENE 156 187838 - 189370 2321 510 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3156 NR:ns ## KEGG: Odosp_3156 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 12 483 8 476 476 305 36.0 4e-81 MKRINSYILTGGVIILGLCGCTSNFDDYNKDPNRAEVGKANSSQMLEDLIFEGAGSMKYR TWRHNNEFMQYTVETSTLNQLHRFVLTDSEFKGAWNFCGRWAVNAIEMYKLAEKEENTNA QAIALTMKALYLSNMTDLFGDMPFKEAFKGIDENIMQPKFDDQKVIYDSLLMDLERANTL YTKTSTIDAKRDLLYNGDVTKWRKFTNSLYLRLLMRVSNRRDMNSAERIKTVFENPSQYP IFESNDDNATLKYSGTRPFVNDFGDNATDDSAMGERFINIMVDSSDPRISVYCNRVSSGA NAGGYVGITSGAPASVISKQSADGASNSNNTTFRQYTSPYTFMTYSEVLFIKAEAIQRGM MDGEASETYYAAVTASLKQWIPDISESAVETFLQNGAVAYDATTEIIITQKYVSLFTVGY ESWHEYRRTGYPLLVMGEAIRNDGVHPTRFPYPLVLRSSNKNYATQLAKMGEDNMKTPVW WSRKAVKLENPEEDNNLGYTGHKWDADIKW >gi|313157860|gb|AENZ01000052.1| GENE 157 189378 - 192593 
4595 1071 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0071 NR:ns ## KEGG: Pedsa_0071 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.saltans # Pathway: not_defined # 28 1071 105 1134 1134 746 41.0 0 MLMNFKFYVKRFLLSLFLVLASTMLYAQKKADITGTVYDSDGTTPAVGCAVTIEGTPTGV ITDINGKYKIRGGETQTLVFSSLGYKTQKILIGPKTRIDVTLEADSQEIEGVVVTALGIK RDEKALGYAVTKVDNDVLTSTVSSNWMNALSGKVAGLNFDQASTGPGGSMRVTLRGETSI NMNGGTALFVVDGVPITSSTTSSTSESAYNSTDGAVDFGNAAGDINPEDIENVSVLKGPA ATALYGSRAANGAIIITTKSGRTTKGIGVTFSSSVSFEKAGFWPDFQYEYGAGRFGGNPY SFYSVDGVSRNWSSYAFGDKFDGHLTYEWPSLNDDGTYTMMEWRPRNFFAGFFDTGVTWN NNISVSGNNGKGTSVRVSVTDVRNDWIVPNTGYKQQTFSISLSQQINKYIKLDAKVNYYR KDSDNLPMTGYGGSSIMYPLLWNTPNVDAQWFKADYKKWVREGGDLSKSTQHVSANGNNP YYTAYEQLNKLDRDRVFGNLAATINFTEELSLILRCGMDLNNDFRTQQKPKGAKTYINGM YKEQTVFDYEMNNDFLLRYTKKLNDFDLSASFGGNNMMQSYRSNTSLAESLVVDRDYRLS NSVDRPKVTSIRRQKSINSFYGLISASWRNMIFLDVTGRNDWSSTLAPGNNSYFYPSVSG SVILSDLLHIDTPMVNFLKVRASWANVGNDTSPYQLLNYYNNSSFTGGFNMPTNKANYNL KPENVESWEFGVEGRFFDSRLTFDVAFYNATTTNQIISVPVDITTGVYNTIVNAGEINNR GWEVSARIQPVRNKNIRWDMNFTWSRNRNKVVELAPNLDSWTIATGPRGEIRAVPGGSLG DLYGSGYEKAPKGSYVTADDGSTIDVSGWDIVDSDGYPVLASEFENLGNTQADWKAGWMN SISYKNFRLSFSFSAQWGGQAYSFTNAMLGYQGKIKATLPGRYDGLIHKGVNQNADGTYS INKTVTASIESYYNLRVFNRDNVVNNTFSTSFLKMKDIRLEYSLPKKIAAKTKVLQGASI AFFATNLFCWTNWPQFDPEIATMNGSEITKGFETAAFPMTRTYGVNLKLQF >gi|313157860|gb|AENZ01000052.1| GENE 158 193443 - 193877 424 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEFIITILVMDIFIVLIIYRRLRNHIHYVSGQAKARLNYIRILIDTVYTYSHQPELLQKP LQKEMSIKKLLAYHVIDNRCDRLFSARMKVNSNDQLLYMLHKEGFSPRELCVIFELNNLN SVYVKCHRINKKLHAGDMQEVLSK >gi|313157860|gb|AENZ01000052.1| GENE 159 194375 - 195139 964 254 aa, chain - ## HITS:1 COG:ML1713 KEGG:ns NR:ns ## COG: ML1713 COG0500 # Protein_GI_number: 15827917 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism; R General function prediction only # Function: SAM-dependent methyltransferases # Organism: Mycobacterium leprae # 16 176 42 211 280 97 37.0 3e-20 
MAKITTAERVSREASDNFVFQRSLLAYHAAAQRISGDVLEIGTGAGYGIEVVAPRARSFV TIDKLAPAPEPTQLPNVEFRQATVPPLPFANDCFDFVISFQVIEHIKRDLELVREVKRVL RPGGKFIVTTPNAPMSLTRNPWHIREYTAVELRSLLGSEFSSVEALGVSGNARVMEYYEQ NRRSVERITRFDILDLQHRLPRCLLQFPYDVLNRINRRRLLKSNTDLTTSIRMDDYRIGP VTDESFDLFFVAEK >gi|313157860|gb|AENZ01000052.1| GENE 160 195139 - 195975 1246 278 aa, chain - ## HITS:1 COG:VC1503 KEGG:ns NR:ns ## COG: VC1503 COG2996 # Protein_GI_number: 15641512 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Vibrio cholerae # 1 276 22 297 298 223 36.0 4e-58 MLRAGRIQKLTVSRISDYGLYLADEEQNEVLLPNRYISLTDKPGDEKEVFLYHDSEDRLV ATTETPLLRVGEAGYLRVVDKTAHGAFLDWGLYGKDLFLPNRNQQGGIIAGRSYIVYLYE DSVTGRCVATCKLKSFINNDIITVAPRQEVDLLVASESPIGYRVIINNRHWGMLYRNQLF RPIAVGDRTKGYVRKLTEDNRIDVSLQQEGFAQVKDSAEVLLQLVRDNGGFLPLNDDSAP EEVNRLTQTSKKIFKRSLGMLLKRGAVTVDEQGIKINE >gi|313157860|gb|AENZ01000052.1| GENE 161 196152 - 197972 2069 606 aa, chain + ## HITS:1 COG:XF1029 KEGG:ns NR:ns ## COG: XF1029 COG2936 # Protein_GI_number: 15837631 # Func_class: R General function prediction only # Function: Predicted acyl esterases # Organism: Xylella fastidiosa 9a5c # 30 590 41 620 663 301 33.0 4e-81 MKLRLFSALLLAGLLTAAAAFARQPVDELYDKAEYRIAMRDSVKLYTAVYTPKTDAKAPI LIFRTPYGCAPYGPGEFPRGFGSGYMRSYIDRGYIVVMQDVRGRFMSEGEFVHVRPAGYG ETDETTDSYDTVEWLVKHVPHNNGRVGFAGSSYPGFYAMMGGLCTHPAVKAVSPQAPVTD WFMGDDVRHNGVLMLTDSYRFLSGMNTPPGHTPTEKMPPMSKRTRPDEWTFFREHATLAE LTELQRPNPFWEELAEHSSYDAWWQARDLRRACYNVQPAVLVVGGTFDAEDCFGAWNLYR AMVRQSPATPCHLVVGPWAHGAWRNGSGKLGAFDFGREAAWEYYMEHFELPFFDHYLKGE GAADPLPAAAVFSSGDNRWHAFDRWTPREARKLTFYLSDGGRLETRRPTAKSSASAYTSD PADPVPYIESFGERRPKEYMLADQRFVSDRSDVLTFVTEPLDGDMTLAGPVDVELEVILS TTDADFVVRLIDVFPDDDPQSGCRMLVRGDVMRGRFRDGFSRPKAFTPGVPARVPFRMTD IAHTFRAGHRVMIQVQSTWFPLAERSPQQFVDVWTCGASDFVPCKVSVLHQRSNASSVTV WRLGES >gi|313157860|gb|AENZ01000052.1| GENE 162 198131 - 198895 1057 254 aa, chain + ## HITS:1 COG:PA3797 KEGG:ns NR:ns ## COG: PA3797 COG0388 # Protein_GI_number: 15598992 # Func_class: R General 
function prediction only # Function: Predicted amidohydrolase # Organism: Pseudomonas aeruginosa # 5 252 12 262 264 208 44.0 8e-54 MKIRVAICQVDMEWENTARNLKRLEPIVAAADADVVVLPEMFATGFKLRPSCVAEPADGL VVTTMRRWAAQYGRAVVGSVVVAEQERFRNRMFFVKPSGETAWYDKRHLFRPGGEARDYT PGDRRVIVEYMGFRFLLLVCYDLRFPVWSRNRGDYDAILCSASWADDRREVWRTLLRARA IENQCYLAGVNRVGTDPDASYAGDSALIDFAGRTLADAGDRERTLTAEFDSEAQAAFKEK FPAWMDADDFELKC >gi|313157860|gb|AENZ01000052.1| GENE 163 199033 - 199497 -117 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158032|gb|EFR57438.1| ## NR: gi|313158032|gb|EFR57438.1| hypothetical protein HMPREF9720_0659 [Alistipes sp. HGB5] # 51 154 1 104 104 157 99.0 2e-37 MPDPVFLQRRVAVSKEFRLGFSQFSASVPALPRRPCLLFLGVRACSSSASVPALPRRPCL LFLGVRACFPSASGPAFPRRPGLFFLGVRACFFSEFRAAVSSEFRFAFLRIPALSFRQLR RILSRPNFLLFVLFCAAYVVSLYDIPAGKIRSAQ >gi|313157860|gb|AENZ01000052.1| GENE 164 199528 - 202347 4339 939 aa, chain + ## HITS:1 COG:BH3594 KEGG:ns NR:ns ## COG: BH3594 COG0178 # Protein_GI_number: 15616156 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Bacillus halodurans # 1 937 1 937 957 979 53.0 0 MKEDTIKILGARVHNLKNVDLEIPRRKLVVITGLSGSGKSSLAFDTIYAEGQRRYMETLS TYARQFVGTMERPDVDKITGLSPVVAIEQKTTNKNPRSTVGTVTEINDFLRLLYARASRA YSPVTGEEMVHYTDEQIAELIMTGFAGRKIALMAPIVKGRKGHYRELFESLAKKGYIYAR IDGEIREISAGMRLDRYKIHTIDLVVDRLVVAEDARDRVMTSLRESMRQGKGTMAVYDYG TEQQRFYSRHLMCPSTGVAFEDPAPHTFSFNSPQGACPHCNGLGEEAVFDVGKIIPDMKL SLREGAVEPLGKFRNNMLFAILEMLGRRYDFTLDDPVGTLPEAGLNAVLYGDSEPLTINL SEFSSSGGNHLVSWEGVAEYIGRTEDDDSKHGQKWREQFLVYRKCSVCGGTRLKKEALQF RIGGLSIADVSAMSISEFSEWAARIGEHMTDKEWKIAQEVVKEIRERLRFLMDVGLGYLS LSRSSRSLSGGESQRIRLATQIGSKLVNVLYILDEPSIGLHQRDNRKLIRSLEELRDAGN SVIVVEHDEDMMRAADFIVDVGPRAGRKGGNIIAAGTFDDILKSDSITADYLTGRRRIEI PETLRKGTGQSIVIRGAKGNNLKNVTAEFPLGKFVCVTGVSGSGKSTLVNETLRPILSKE LYRSFAQPLEYDSVEGIEHVDKLVVVDQSPIGRTPRSNPATYSNVFSDIRKLFEMTPDAQ IRGFKAGRFSFNVKGGRCEECRGAGVQTIEMNFLPDVYVRCKTCGGHRYNRETLEVKYKG KNIDDVLNMTVNMAVEFFENIPNIYQKLRAIQEVGLGYLTLGQPCTTLSGGESQRIKLAS 
ELSKRDTGRTLYILDEPTTGLHFEDIRLLLEVLNKLVERGNTVIVIEHNLDVIKVADHII DIGPEGGAAGGEIVAAGTPHEVARCKRSYTGEFLKKLGL >gi|313157860|gb|AENZ01000052.1| GENE 165 202412 - 202954 796 180 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157957|gb|EFR57363.1| ## NR: gi|313157957|gb|EFR57363.1| hypothetical protein HMPREF9720_0661 [Alistipes sp. HGB5] # 1 180 1 180 180 327 100.0 3e-88 MKKTVIIVVGLLLCVVGATVIGQKKNLSPKEERREVREKRRADRIASFEKTMDSVILSRN FQFNPQTMQRQPAGPMRQIMNPAFNVGVWDGTVDICLPYIKGYVPPYYVMILNYTVPNVQ GYTTEQTHEGWMVTFSTSLFSASTYTFTFEIFTRTGGANLTITNPWYNPVEYSGSISQLY >gi|313157860|gb|AENZ01000052.1| GENE 166 202956 - 203891 658 311 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158011|gb|EFR57417.1| ## NR: gi|313158011|gb|EFR57417.1| hypothetical protein HMPREF9720_0662 [Alistipes sp. HGB5] # 5 311 1 307 307 502 99.0 1e-140 MKSGLQSRTGFRTPSPEGRAVRRAAVPASPERFRRHPAEDRTPDAVPEPQECSPQSEPPA SAGAFVLRRLKTGRALLCAAALFWLMPVAAQSPQAAAPSPQAATQSPQASSSPQVPSSPQ FSLGTSAVDSVPQTTVVPVQQQPVQLVEHTTVYEDPARGMTRPERRAYRARLYAQKIDSL VQSRDYMFFPNSMQEIPGGLIRSIYADYFFFGMFVDHVEIHLPTERGVTQYVEMLNFDSM SIRSYQAARLQWGWCISFDVADGNRVYHAEFAVSTATGETVLTLLTPDVTMRYVGWLWNK RMGDPKFRRID >gi|313157860|gb|AENZ01000052.1| GENE 167 203970 - 204452 634 160 aa, chain - ## HITS:1 COG:RSc1644 KEGG:ns NR:ns ## COG: RSc1644 COG0245 # Protein_GI_number: 17546363 # Func_class: I Lipid transport and metabolism # Function: 2C-methyl-D-erythritol 2,4-cyclodiphosphate synthase # Organism: Ralstonia solanacearum # 3 160 5 162 168 181 56.0 7e-46 MTDFRIGHGYDVHALADGLRLILGGIEIPHTKGCVAHSDGDVAIHAVCDALLGAAALGDI GLHFPDTSDDFAGIDSKILLRRVAALLRERGYEIGNVDCTIRMQRPKLRPHIDAMRAAMA GAMGVGDDRVSVKATTTEHLGFEGREEGVSVSAVALIYKS >gi|313157860|gb|AENZ01000052.1| GENE 168 204674 - 206125 2262 483 aa, chain + ## HITS:1 COG:YPO3230 KEGG:ns NR:ns ## COG: YPO3230 COG2195 # Protein_GI_number: 16123389 # Func_class: E Amino acid transport and metabolism # Function: Di- and tripeptidases # Organism: Yersinia pestis # 1 483 1 482 486 422 44.0 1e-118 
MSAITNLEPRIVWEQFDAITRVPRPSKKEGRIIEFLVDFARKHNLEYKKDAIGNVVMRKP ATPGFENRPTVILQSHMDMVCEKNSDVEFDFDRDPIRTRIDGGWVRAEGTTLGADCGIGM AAALAVLIDPAVEHGPVEALFTVDEETGLTGAFELGEDMLTGKYLVNLDSEDEGEIFIGC AGGIDTVATFRYTMEEAPKNYAFFRVDVSDLQGGHSGDDIDKGRVNSNKTVARLLWDGMQ SYELKLSWFDGGNLRNAIPREAYAIFGVPVRFKDEFIKRYKLFAADLEAEFRLREPNFKI TLNEMPQVDRVLDARTQFGLVYSLVGVPNGVIAMSFAVPGLVETSTNLASVKFVGDDRIV VTSSQRSSVESAKTYVMQMVESVFALAGADVAHSDGYPGWTPDPQSALLAKTVDAYKRLF GADPKVRAIHAGLECGLFLEKYPELEMVSFGPTLRGVHSPDERLEIATVPKFWDLLLEVL KTV >gi|313157860|gb|AENZ01000052.1| GENE 169 206185 - 206700 408 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157952|gb|EFR57358.1| ## NR: gi|313157952|gb|EFR57358.1| putative lipoprotein [Alistipes sp. HGB5] # 1 171 1 171 171 336 100.0 3e-91 MKTILKPLSIIILIMALAGCAAQKSGNTPGKSTAEEVLYTQALKALEERNFIIKIDEFHF PSDKAPVNSTNSYVSMQGSHAVIRMSQDFPSQIPSNRNTESDNAEITKGSAKKNGDIQFH MLIKGAEKWQDRELIIILYKNTNQCFVNINRGHSGQNIVNFKGYVYPLATE >gi|313157860|gb|AENZ01000052.1| GENE 170 206763 - 207377 301 204 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVAGSGLLSSNCSNDDLTVDRTVPDDVVATRSSAFHWTCPDSKCGFDLNAGWQSHCGRPG CKEEYSDSHGILILTFADYIKNTVGFDTGGGGGAGFDYNRIELPNGKFPKFAPEPWYETP CAMKYYNELRNSSQYRLTPGYAEAVEYAWYRTVRILYPKYHNSTTVEREYTRFLLNEGRN LTGFKGSGILDASEAAVKAFATCR >gi|313157860|gb|AENZ01000052.1| GENE 171 207656 - 210082 2780 808 aa, chain + ## HITS:1 COG:no KEGG:BT_3560 NR:ns ## KEGG: BT_3560 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 52 808 56 846 846 236 26.0 3e-60 MIPKYGCLLVGLLTVCGAAAQHPDDEYYPYAARQEERTPLLLTDSTLFYRAVQTTPDLYA EHTAFNLPYVSVKRRGLNYRDESASVGVVRLSSRYFGAMRLLGADEVRYGGLAAADGVTG GVGGLRLFRFTADYPQASRYTAVSFTDRSYLAGARLSVTEPLGCGWSGTAALDARTGRDM HVEGVFTNALTASLRAAKRFGDDHNLSLMLIVPPSVRGTRLSSAEEAFRLTGDRLYNPAW GFQHGKVRNSRVRREFVPLAVVSYRMPVSQSTSLAADFSAEYGTRKYSALGWYDARTPMP DNYRYLPGYTGDRETEQAWRAGDARYTQIDWDELIAQNRMAAGHAVYALEDRAERLGDLN FSALFTTALDARLTLRYDLVLRHARSRNYKQMRDLLGAEYITDIDQYLVDDDTYSNLLQN DLRRPDRTIRKGGRFGYDYALTTRRADARLHAEYRSDRFRADLVLSLGAAAVCRRGYYEK 
ELFPGAQSYGRSRRVRFTPYTLKATAGWAFSPRSYLEASAAAEAVLPDAADLFYQPQYNN RVIDDPCPERRYAAEINYSRTGETLTLRFSAYLLAMFDGVETRRYYDDMAGVFCDMAATR IGRMACGVEAAADIRLSYRWSLSLAASAGRYKFIRNPRITVISDVDNSVVDARAVSHLNG CTPGGAPQLTTCAELGYFGPKGWGFKTSAGYAGARYVEPSLLRRTERIARQGGTTREMFD AFTRQQRLGDAFTLDAALFKTFWFDRSRLTASLILRNLLGDGDTVYSAYESQRVRRIRSG DTLCYAPHATRLTYAYPRSFYLTVSYRF >gi|313157860|gb|AENZ01000052.1| GENE 172 210174 - 211280 1594 368 aa, chain + ## HITS:1 COG:Rv2780 KEGG:ns NR:ns ## COG: Rv2780 COG0686 # Protein_GI_number: 15609917 # Func_class: E Amino acid transport and metabolism # Function: Alanine dehydrogenase # Organism: Mycobacterium tuberculosis H37Rv # 1 368 1 369 371 383 53.0 1e-106 MIIGIPKEIKNNENRVALTPSGAQELTKRGHKVYVQASAGVNSGFADDAYTAVGAEMLPT IEDVYARAEMIIKVKEPIAPEYQLIRKDQLVFTFFHFASSEPLTRAMIDSGAVCCAYETV ERADRSLPLLIPMSEVAGRMATQEGRYFLEKPRGGKGILLGGVPGVKPAKVFVIGAGVVG TAAARTAAGTGADVTICDISLQRLTYLADVMPKNVKTLMSSEYNIREELKRADLVVGSVL IPGAKAPKLVTRDMLKEMEPGTVMVDVAIDQGGCFETSRPTTHEDPVYYVDGILHYCVAN IPGAVPYTSTLALTNATLPYAIQLADKGWRRACRENRELELGLNIVQGKVVYKPVADAWG LDYEPLTL >gi|313157860|gb|AENZ01000052.1| GENE 173 211570 - 211890 485 106 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313158034|gb|EFR57440.1| ## NR: gi|313158034|gb|EFR57440.1| hypothetical protein HMPREF9720_0669 [Alistipes sp. 
HGB5] # 1 106 1 106 106 171 100.0 2e-41 MIRKLLLILRFYYSAMFFFCAAVSVALAWLVKVHGPEIIPMCMLGKAAVYPLYLYVWIVQ RYGDEFFYYRNLGVRRRELLGVSCAADFVLCYVLLKLTALFLYEPL >gi|313157860|gb|AENZ01000052.1| GENE 174 211877 - 212542 908 221 aa, chain + ## HITS:1 COG:PA4461 KEGG:ns NR:ns ## COG: PA4461 COG1137 # Protein_GI_number: 15599657 # Func_class: R General function prediction only # Function: ABC-type (unclassified) transport system, ATPase component # Organism: Pseudomonas aeruginosa # 8 220 3 237 241 94 31.0 1e-19 MNPSEKHTLEIDSVELSFGDRRILSGVYLAVETGGVSAVLGRNGCGKSCLMKILCGSLQA GFRSMRIDGVWHDRFRADEVRYLPQQGFIPGWLTLDRVLRDYDMTRGDLLKWFPLFEKLF GTKIYAMSGGERRILECYLILRSPTRFILLDEPFSQVAPLHVSALKRLILQERASKGILL TDHMHRHVTDLADRLYVMADGRAYLAQGEEDLVRYGYLTHL >gi|313157860|gb|AENZ01000052.1| GENE 175 212697 - 213710 749 337 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|149915191|ref|ZP_01903719.1| 50S ribosomal protein L27 [Roseobacter sp. AzwK-3b] # 6 298 3 287 345 293 52 6e-78 MAGSNFVDYVKIFARSGHGGAGSAHFRREKFVAFGGPDGGDGGKGGSIVLQGDSQYWTLI HLKYQRHQFAEDGQCGSGARSSGKDAKDIVIPVPLGTVAKRIVTQEDGTETVETVGEVTA DGERLVLLHGGRGGLGNWHFKSATNQTPRYAQPGEEGDEGAFILELKVLADVGLVGFPNA GKSTLLAAVSAAKPKIANYAFTTLEPNLGIVEVRDHKSFVMADIPGIIEGAHEGRGLGTR FLRHIERNSVLLFMIPADSDDVRRDYEVLLGELTQYNPELLDKERLLAVTKCDMLDEELI EQMKPHLPEGIPSVFISSVSGLNISRLKDMLWEALQR >gi|313157860|gb|AENZ01000052.1| GENE 176 213640 - 214470 646 276 aa, chain - ## HITS:1 COG:no KEGG:Poras_0865 NR:ns ## KEGG: Poras_0865 # Name: not_defined # Def: extracellular nuclease # Organism: P.asaccharolytica # Pathway: not_defined # 36 255 43 265 272 105 31.0 2e-21 MKTTACIKLLALLLLTACDRTSKLSFEGGAPGGVHTIADLKARCTANSVAVTEDITLEGV VTGNDFYGEFYKTLVVEDASGGISVLIDGTRLAFDYPVGAAVTIFCNGLTLGDYGGKIQL GVAPDGDYGVGRIPREELERYLRRNPDKDRRPRPAVCTFDAIGPRQTDTYVCFEGVRFTE AGPWCDTDPETSEPQTSERTLTDAQGRTFRVRTLGTCAYATEPVPQGTGSVYGIIDYFNG KYTLRVSERRVEFANAAGLPRAYPSTGKYSAPKPTK >gi|313157860|gb|AENZ01000052.1| GENE 177 214448 - 215212 598 254 aa, chain - ## HITS:1 COG:no KEGG:Poras_0865 NR:ns ## KEGG: Poras_0865 # Name: 
not_defined # Def: extracellular nuclease # Organism: P.asaccharolytica # Pathway: not_defined # 3 249 7 260 272 94 27.0 5e-18 MRRLPLLCAFLLLTGCYDSRFGERSGGEEPEPTTATIAQLRQLYAGTTFVIATDVVVSGR VNTSDAEENFYRTFCVEEDGAGLEVMAGIDHLHNDFPPGSRVTLRLKGLALGESRGILQV GRKPAAGSGFATDYLGSKAALDAAVSRTGEAPQPVAPALLTIAELTPKRCGTLVRIDGLS YTPEDLTPGTWAGYKRFTDAGGAEIYTYVRNYAEFAGEEVPAGKSCSLTGILQYDDAGKG RYLLKIRDENDCMY >gi|313157860|gb|AENZ01000052.1| GENE 178 215221 - 216441 1849 406 aa, chain - ## HITS:1 COG:PA4960_2 KEGG:ns NR:ns ## COG: PA4960_2 COG0560 # Protein_GI_number: 15600153 # Func_class: E Amino acid transport and metabolism # Function: Phosphoserine phosphatase # Organism: Pseudomonas aeruginosa # 191 402 1 212 217 255 65.0 1e-67 MQNNVEIIQINISGEDKPGMTSSLTEILARYDAFILDIGQANIHQSLTLGILFKTTADKS GNIMKELLFKASELGVMIRFTPISERKYEDWVGRQGKNRYIITLLGRTVTARHIAEVTSV VAEHGLNIDAMKRLTGRIPLSEDDRAAKSCIELSVRGSLTDEERSTMQEGFMNLSEIGLD VSFQKDDIYRRSRRLICFDMDSTLIETEVIDELAERAGVGEEVRAITASAMRGEIDFRES FSRRVALLKGLDVSVMEEIARSLPITEGLERMMTILKRVGYKTAILSGGFTYFGNYLRQK YGFDYVYANELEIEEGKLTGRYVGEVVDGRRKAELLRLLCQFEEINIAQSVAVGDGANDL PMLNLAGLGIAFHAKPKVKATARQSISTIGLDGILYFLGLKDSRIE >gi|313157860|gb|AENZ01000052.1| GENE 179 216536 - 217459 1394 307 aa, chain + ## HITS:1 COG:VC0480 KEGG:ns NR:ns ## COG: VC0480 COG0668 # Protein_GI_number: 15640507 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Small-conductance mechanosensitive channel # Organism: Vibrio cholerae # 56 306 35 285 287 198 37.0 1e-50 MWSFLVPFFTETPEAPLILPDSVQKANLAEAVNKIIHLDVGDMLRSLLSESVWILVKILI ALAIYFIGRWIVRRIVRLLDAAFERRQVDTSLRSFLRNTVKVVFTLILLMIVVQTLGVNV TSLIALFSAATLAIGMALSGTAQNFAGGVMILLMKPYRVGDFISAQGQSGTVREIKLFST VITTGDNQTIYIPNNSIATAIIDNYSTSETRRVDWTIGISYGDDVDAARRALLDLLAADS RVLTDPAPVVWVAALADSSVNLSVRAWVKNADYWDVFFENNEKIYKLLPLKGISFPFPQM DVHVKQN >gi|313157860|gb|AENZ01000052.1| GENE 180 217471 - 217923 581 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157902|gb|EFR57308.1| ## NR: gi|313157902|gb|EFR57308.1| putative lipoprotein [Alistipes sp. 
HGB5] # 17 150 17 150 150 247 100.0 2e-64 MKKLLFAAAAVWLLAACNKELGPEYETPPVFDEVTFTTATQPDKITEPVKVDEVVWVNAT VSCPYKVKQVWLAYFVDGDESNVVTTTPFNYDATTVTFRERIPGQKAGAKVSFQVLTRSD YVFMWTQTYTYKVEGGEDEEKPNPDLPDEN >gi|313157860|gb|AENZ01000052.1| GENE 181 217945 - 218349 625 134 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3558 NR:ns ## KEGG: Odosp_3558 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 3 127 2 126 138 131 52.0 1e-29 MELKEILAISGQPGLYKYVAQSTNGVIVESLLDGRRMNASASSKVSSLTEISMFTEGEDI ALADVFTKIYAHTGGKEAISPKEAPEKLKAAFAEVLPEYDRDRVHVSDMKKCFAWYNILV RAGFTEFKLPSEAE >gi|313157860|gb|AENZ01000052.1| GENE 182 218481 - 218975 708 164 aa, chain + ## HITS:1 COG:TM1185 KEGG:ns NR:ns ## COG: TM1185 COG1803 # Protein_GI_number: 15643941 # Func_class: G Carbohydrate transport and metabolism # Function: Methylglyoxal synthase # Organism: Thermotoga maritima # 4 164 14 156 166 158 50.0 3e-39 MERRTKVLALVAHDNMKHDLAEWVDWNSKKLGRHHLVCTGTTGKMVEKTLRDHREEYEGG EEIEELPDLRITLLKSGPLGGDQQLGSLIADGKIDALIFFWDPMSAQPHDVDVKALLRLA TLYNVPTAINRSSADFLISSPLFEDDGYVPVVKDYKAYVERVLK >gi|313157860|gb|AENZ01000052.1| GENE 183 218990 - 220012 484 340 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631300|ref|ZP_00156838.1| COG0042: tRNA-dihydrouridine synthase [Haemophilus influenzae R2866] # 1 299 28 326 353 191 37 3e-47 MKIADIELGEKPLLLAPMEDVTDPSFRYMCKRFGADVVYTEFISSDGLIRDAAKSLKKLE IDAAERPVGIQLYGHLIEPMVEAARMAEAAGPDIIDINFGCPVKKIAGRGAGSGMMRDVP LMVEMTRRIVQAVRKPVTVKTRLGWDEESKNIEEIALRLQDVGIAALTVHGRTRAQMYRG EADWTLIGKVKNNPQIHIPIIGNGDVDSGPKAAEMFDRYGVDGVMIGRATYGRPWIFREV KHYLATGEVLPQPSVCERVEIAKEHLRKSLEVKGEFVGILEMRRHLSNYFKGLPDFKPTR LKLVTSLDIPELFDTLDSIARRWGDFDVSGTVPAPLSHDL >gi|313157860|gb|AENZ01000052.1| GENE 184 220009 - 221295 1813 428 aa, chain - ## HITS:1 COG:alr5216 KEGG:ns NR:ns ## COG: alr5216 COG1253 # Protein_GI_number: 17232708 # Func_class: R General function prediction only # Function: Hemolysins and related proteins containing CBS domains # Organism: Nostoc sp. 
PCC 7120 # 1 423 6 434 442 320 41.0 3e-87 MEIIIIILLILLNGLFAMSEIALISARRSNLTARAKQGNKAAAKALQLAQDPDRFLSTIQ IGITLIGILTGIYSGEALAGRFGAMLASLGMPLRTAVPVAQILIVIVVTYLTIVFGELVP KRIGMKSAERVAILVARPMQVLAKVTSPFVWLLSRSIELITRLLGIRDTESKVTEEEIKS IIQEGTEDGEVQIVEQQIVGRVFSLGDRKVGSIMTHRSEIAWIDPAMTPAEIRELVGREP HTLYPAARSNLDRLAGVIYLKDLFTHIGEEDFDVASILRPAKFFHEETQVYTALEQLRSE QVGYGIVCDEFGVTQGIVTLHDIFEALVGSIPEEREEPDIVHREDGSCLIDGQCPFYDFL VCYGLEDVYPNNLYNTLSGLILDELGHIPQTGEKLRWNTFTFEIVDMDGARIDKILACET SEKTNQNP >gi|313157860|gb|AENZ01000052.1| GENE 185 221437 - 223218 2720 593 aa, chain + ## HITS:1 COG:TP0831 KEGG:ns NR:ns ## COG: TP0831 COG0018 # Protein_GI_number: 15639817 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Arginyl-tRNA synthetase # Organism: Treponema pallidum # 10 593 11 589 589 459 40.0 1e-129 MNIENFISDAVRRSVEALYGPLDGEQLQIQKTRREFEGDYTLVTFPLLRRSRKSPEATAT EIGEYMTANVPEVKSFNVIKGFLNLTLDCAFWAARFAEIAADANFGQAPDTGRTVMIEYS SPNTNKPLHLGHIRNNLLGYSVAQILRANGHNVIKANLVNDRGIHICKSMLAWKLYGNGE TPASSGMKGDHLVGKYYVEFDKHYKAQIKELTAQGQTEEEAKKNAPVMLEAQEMLRKWEA KDPEVYSLWETMNGWVYEGFDVTYKALGVDFDKIYYESQTYLLGKSLVEEGLRKGVFYRR DDNSVWIDLTADGLDEKLLLRGDGTSVYMTQDLGTAYRRFEENRLDDMIYVVGNEQNYHF QVLKLILKKLGYADWSDHITHLSYGMVELPEGKMKSREGTVVDADDLIADMVSTAREMSD ELGKLDGCTEEEADAVSTMVGLGALKYFILKVDPKKTMLFDPRESIDFNGNTGPFIQYTH ARICSVLRKAAEAGIDFSGAAQADYLPEEIALVKSLTEYPSVVAAAGENFAPSIVGAYVY ELAKQFNGYYHDHSILKEENDETRRMRLQLAQQVARVIRSGMKLLGIDVPERM Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:00:50 2011 Seq name: gi|313157858|gb|AENZ01000053.1| Alistipes sp. HGB5 contig00001, whole genome shotgun sequence Length of sequence - 1763 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 208 - 267 3.3 1 1 Tu 1 . 
+ CDS 290 - 1546 1302 ## BT_1138 transposase Predicted protein(s) >gi|313157858|gb|AENZ01000053.1| GENE 1 290 - 1546 1302 418 aa, chain + ## HITS:1 COG:no KEGG:BT_1138 NR:ns ## KEGG: BT_1138 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 405 3 407 411 335 41.0 3e-90 MRNTFKVLFYIKKNAPLRNGWAPIMGRITINGQRTQFSTRLSVTPASWDTAQGRVCGRSN MAVQINEKLANIRFHIERCYNKLFYEQAFVTPVMVKEMYFGGNHKQETVVAFFKQHNDEF KRMVGVSRSMTTYYKYRCVCRHLADFVWDKYNRKDLMFKELNREFLTGFHSYIAQECAHK KNTTWIYMIALKHILMLARSKGYMNKDLFANYKLQSEFVTRNYLSMSEINKLMQYEGDTP TLQLIRDAFLFSCFTGLSYIDMKDLTLENIQQVNKQLWISTTRRKTGSEVNVRLFEVPYN ILLKHRPMTRTKRIFDLPSNGWCNACLDKIMSEIGIMKQITFHSARHTFATTITLSQGVA IETISKLLGHRNIRTTQIYATITHSQLDGEMERLSKRINSLYRNLLPPDAPEAGTKLI Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:00:57 2011 Seq name: gi|313157852|gb|AENZ01000054.1| Alistipes sp. HGB5 contig00031, whole genome shotgun sequence Length of sequence - 5802 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 3, operones - 1 average op.length - 4.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 35 - 1516 1344 ## BDI_2339 hypothetical protein + Prom 1890 - 1949 2.0 2 2 Tu 1 . + CDS 1988 - 2260 326 ## BDI_3040 hypothetical protein + Prom 3342 - 3401 3.8 3 3 Op 1 . + CDS 3564 - 4118 228 ## Odosp_3447 hypothetical protein + Term 4125 - 4173 8.2 4 3 Op 2 . + CDS 4185 - 4646 -108 ## gi|313157853|gb|EFR57261.1| putative lipoprotein 5 3 Op 3 . + CDS 4691 - 5149 -85 ## gi|291514233|emb|CBK63443.1| hypothetical protein AL1_08980 6 3 Op 4 . 
+ CDS 5156 - 5617 -57 ## gi|291514233|emb|CBK63443.1| hypothetical protein AL1_08980 + Term 5650 - 5694 11.1 Predicted protein(s) >gi|313157852|gb|AENZ01000054.1| GENE 1 35 - 1516 1344 493 aa, chain + ## HITS:1 COG:no KEGG:BDI_2339 NR:ns ## KEGG: BDI_2339 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 491 331 878 878 132 26.0 3e-29 MAGTLDGPIKRDKGSYLISARYFFPEAVLAIVDNAVRYGFYDVTGKLTYDIHRNHTLSLG IYSGDDHMKNKEDHAENGFGWGNTTASLRLESRWNDNLRSSVVAYYTYLQNRQETEFKDD GFSNWGKTTFKTHEFGARMTFDQRLSHIWTLEYGATFSHQRFEPMHTKKHHQRTAQGPRL FFRAACFGALFLNNRFQWGGWRADVGVRGAVYDNSEQTRYAVEPRAQLSYDFGRDNSVWL SGTINSQALVQFNRYYYSMPIDFWTPFRDGRLQHAWQVSLGGRAKLHENLTLSVEGYYKR MRNLPLIYDSDDFLLGNGGFIYGTGRAFGIEAMLQYQTERLSLTASYTYTDSRRRSDGVT YPFEYDVPHDFNAFVSYDVVKRPGRKHTFSLNVAWRSGLPYRLTNESYPDTDGNPIIGIT AYPTMRMHNYFRADVSYNMERRKRNGVRNWQFSIINATWHKNPVSIYPYRGSYKATVLIP IMPSVSYTRTFGK >gi|313157852|gb|AENZ01000054.1| GENE 2 1988 - 2260 326 90 aa, chain + ## HITS:1 COG:no KEGG:BDI_3040 NR:ns ## KEGG: BDI_3040 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 2 85 103 183 292 68 41.0 9e-11 MVLKLNPVKYHWRDESEYERFNIRPVQSGVEEYGFLAQELEAIIPGAVAMTEEGDRLVNY SALIPILTGAIQELTARVAALESQLKAAGK >gi|313157852|gb|AENZ01000054.1| GENE 3 3564 - 4118 228 184 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3447 NR:ns ## KEGG: Odosp_3447 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 108 448 556 557 93 41.0 5e-18 MTYGSYGNGTGKNAELCGIGEMWGYSMGHIRAYEKYNPSGLLDDYPDVHTWLKPHVFWDL QRDKVLTKKQIYDCLVVGVDTYDRLVAKMYEKYPEKADEIEKAFTDNGITPNVPKPDTGD LTHDAFYTDKTVSSSFVFSGNNILTRNVTVTNSAKLTFRANKSVTINSPFTINQGAQLEI TCGN >gi|313157852|gb|AENZ01000054.1| GENE 4 4185 - 4646 -108 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157853|gb|EFR57261.1| ## NR: gi|313157853|gb|EFR57261.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 153 1 153 153 309 100.0 5e-83 MKKIIFAVISLTLFVSCRIFHTTEPTNSAHWYVKNMTWKPIVIKKRDFRTERLLNPGDSI SVFHRLPMQKLGFPSFDLFYELWNVNGNGVEQNLDVLSENGQLLKTWKISDKNDPSVRIF QETSWRFYQKPQLQHVSTCEVAWVFDILPADIE >gi|313157852|gb|AENZ01000054.1| GENE 5 4691 - 5149 -85 152 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514233|emb|CBK63443.1| ## NR: gi|291514233|emb|CBK63443.1| hypothetical protein AL1_08980 [Alistipes shahii WAL 8301] # 1 151 1 150 153 154 53.0 2e-36 MKKIFVLCIIILGGVSCDPPEHLKTHYTYWYVKNATNKPISITTVPGHLMSILTIISGDS VCFHSFCPPQHWGIPSFNGLYDIWKKTAVQDQHTDILSNEGSLLKSWNYADRDAEGRQFF NESYWRLYTKKYSHSDELNFAWVFDILPADIE >gi|313157852|gb|AENZ01000054.1| GENE 6 5156 - 5617 -57 153 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514233|emb|CBK63443.1| ## NR: gi|291514233|emb|CBK63443.1| hypothetical protein AL1_08980 [Alistipes shahii WAL 8301] # 1 153 1 153 153 292 96.0 5e-78 MKKIAFIFAITLISISCDPKKWVATHATSWYAKNNTDQILIITTSPFIDTDAVVDPGDSV LVHSFSPFQYLGEPTFDTFYDAWNGKPEQEWCISICSKDGQLLKIWKYIDRNAEGRQFFN ESYWRLYTKKYSNSDQLNLSWVFNILPEDIATL Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:01:36 2011 Seq name: gi|313157848|gb|AENZ01000055.1| Alistipes sp. HGB5 contig00091, whole genome shotgun sequence Length of sequence - 3462 bp Number of predicted genes - 4, with homology - 3 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 10 - 756 691 ## COG3537 Putative alpha-1,2-mannosidase + Term 825 - 864 8.1 - Term 813 - 851 7.1 2 2 Tu 1 . - CDS 903 - 1253 184 ## - Prom 1287 - 1346 3.0 3 3 Op 1 . - CDS 1606 - 2832 748 ## BT_0101 hypothetical protein 4 3 Op 2 . 
- CDS 2817 - 3131 341 ## BF1366 hypothetical protein - Prom 3164 - 3223 2.9 Predicted protein(s) >gi|313157848|gb|AENZ01000055.1| GENE 1 10 - 756 691 248 aa, chain + ## HITS:1 COG:L135972 KEGG:ns NR:ns ## COG: L135972 COG3537 # Protein_GI_number: 15673483 # Func_class: G Carbohydrate transport and metabolism # Function: Putative alpha-1,2-mannosidase # Organism: Lactococcus lactis # 6 246 458 698 717 196 39.0 3e-50 MLEKASQNYKNLFDPETRLMRGRNADGTFQTPFSPYKWGDAFTEGNAWHYTWSVFHDVEG LIDLMGGHKGFTRMLDSVFVVPPVYDDSYYGFRIHEITEMQVADMGNYAHGNQPAQHMIY LYDYAGQPRKAQWWVREVMDRLYSAKPDGYCGDEDNGQTSAWYVFSALGFYPVCPASDEY AAGSPLFKKATIRLENGKTVEIEAPENSPENRYVGKMRINGQVSQRPFIEYAEIEDGAKI LFDMQVRK >gi|313157848|gb|AENZ01000055.1| GENE 2 903 - 1253 184 116 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MGAHIKFSMEHRYFRDWLEVDVDWNYPFLPRVGEFVNAWIWIEAGKFSRADIEKILNPDG QENLNSEFYRDYTLDDWLYEIGMECNKVYGVSYYREKNDPANIYARVSLSEPGTAL >gi|313157848|gb|AENZ01000055.1| GENE 3 1606 - 2832 748 408 aa, chain - ## HITS:1 COG:no KEGG:BT_0101 NR:ns ## KEGG: BT_0101 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 367 1 364 412 322 46.0 2e-86 MVANIRSGSSPGGALYYNKEKVDKDEAEVLLWQKMLEPYDKCGRLDIDACMESFMPYLEA NRRTTNTVFHASLNPSPEDRLTDEQLRKIACEYMERMGYGEQPYIVFKHKDISREHLHIV SLRVDEQGRKLPHDFEARRSMEILRDLERKYGLHPSVKGQGLTDREGLRKVNYSEGNVKQ QISSVARSCLRNYKCSSYGEFRTLLELLNVSVEERTGTVDGRDYAGVIYGAMTDDGYGIG TPFKSSRIGKDVGYKALQKYYERSKSALKQDGTLDRLRQTVKDAMSPDNTREEFRQLLKA DGIDVVFRINPVGRIYGATFIDHNAGIVANGSLLGKEFSANAFNELYPAPKQAQQVAEQH TEQQHKQRSHAVNPLTGIVDTVLDLADTQAYEEQQRQIQHRKKRRFRR >gi|313157848|gb|AENZ01000055.1| GENE 4 2817 - 3131 341 104 aa, chain - ## HITS:1 COG:no KEGG:BF1366 NR:ns ## KEGG: BF1366 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 10 102 45 138 140 75 44.0 7e-13 MLCKAGLEHNRSRFIVKRIFAEEFVVVKRDPSKVQFIARLNDFYFQFQKLANNYNQIVKV INSHFSNIAIPHQIAALEQRTRELKALSIEILNLAKQTKEWLRI Prediction of potential genes in microbial genomes Time: Wed Jun 22 
13:01:54 2011
Seq name: gi|313157843|gb|AENZ01000056.1| Alistipes sp. HGB5 contig00026, whole genome shotgun sequence
Length of sequence - 4028 bp
Number of predicted genes - 5, with homology - 5
Number of transcription units - 2, operones - 1 average op.length - 4.0

N  Tu/Op   Conserved    S         Start     End   Score
           pairs(N/Pv)
1  1 Op 1  .            - CDS        72 -  1274    1719  ## COG4992 Ornithine/acetylornithine aminotransferase
2  1 Op 2  .            - CDS      1277 -  1960     882  ## COG0566 rRNA methylases
3  1 Op 3  .            - CDS      1941 -  2861    1107  ## COG1162 Predicted GTPases
4  1 Op 4  .            - CDS      2865 -  3368     584  ## COG1286 Uncharacterized membrane protein, required for colicin V production
                        - Prom     3394 -  3453     5.6
5  2 Tu 1  .            - CDS      3486 -  3704     311  ## gi|167753959|ref|ZP_02426086.1| hypothetical protein ALIPUT_02244
                        - Prom     3731 -  3790     7.5

Predicted protein(s)
>gi|313157843|gb|AENZ01000056.1| GENE 1 72 - 1274 1719 400 aa, chain - ## HITS:1 COG:CAC2388 KEGG:ns NR:ns
## COG: CAC2388 COG4992
# Protein_GI_number: 15895654
# Func_class: E Amino acid transport and metabolism
# Function: Ornithine/acetylornithine aminotransferase
# Organism: Clostridium acetobutylicum
# 23 396 15 386 387 262 37.0 7e-70
MNTILRKQFLAHVGQTSPSPMLVEVERAEGTFFYTPEGKRYYDLVAGVSVSNVGHCNPAV
VRAVQEQAARYMHVMVYGELVETPQVEYASKLASLLPGELESVYFVNSGAEAVEGALKLA
KRFTGRTELISMRRAYHGSTHGSMSMMGAPEGEEWKGAFRPLLPDVQAIEFNDFDALNRI
TRRTACVLAEPVQGEAGVRTPAPGYLEALRRRCDEAGALLVFDEIQTGMGRTGQLFAMLK
YGVTPDIVCLAKAFGGGMPLGAFVARREIMDTLQSDPVLGHITTFGGHPVCCAAGLAALN
YLLDNKVVEQAEAKGALYEALLGKHPAVREIRRSGLLMAVELGESAKLYRTMELFKEAGI
LSDWFLFCDTAFRISPPLTISEDEVRESAALIRECLDQLR
>gi|313157843|gb|AENZ01000056.1| GENE 2 1277 - 1960 882 227 aa, chain - ## HITS:1 COG:BB0052 KEGG:ns NR:ns
## COG: BB0052 COG0566
# Protein_GI_number: 15594398
# Func_class: J Translation, ribosomal structure and biogenesis
# Function: rRNA methylases
# Organism: Borrelia burgdorferi
# 14 216 7 208 218 137 32.0 2e-32
MKNTANKGAEWYAERIACLAEFTTSERYEVLRRTVAMRTRYMTVLAENMYHGQNAAALIR
HCEAFGVQEMHTVETLCRFEPNPDIVRGTDQWVDVRRHPSTGEALAALRSAGYRIVATTP
HREDVTPETFDVAAGPFALVFGTEHAGISDEVIAGADEFLRIPMCGMVESLNVSASAAIL
IYMLSERMRLTVGEWRMTDAQQAETLFGWMRRSVKDADAILQRKFKD
>gi|313157843|gb|AENZ01000056.1| GENE 3 1941 - 2861 1107 306 aa, chain - ## HITS:1 COG:BH2503 KEGG:ns NR:ns
## COG: BH2503 COG1162
# Protein_GI_number: 15615066
# Func_class: R General function prediction only
# Function: Predicted GTPases
# Organism: Bacillus halodurans
# 7 300 4 285 294 201 37.0 2e-51
MEKRFTGTVVRATGSWYDVLHDGAVVRCRIRGRLRLRGVRSTNPVVVGDEVTCEADAGGD
YVICDISPRRNYVIRRASNLSKESHIIAANVDQALLMATLRSPETATEFVDRFLVTCEAY
KVPVTILLSKIDLQDPQAVADFHAVYESAGYRVMEVSSATGEGVGAVHELLKGRTTLVSG
NSGVGKSTLIRAIDPSLDIRTGEISESHHKGRHTTTFSTMYPLAEGGAVIDTPGIKGFGL
IDIDDAELWHYFPEMMRTAPDCRFYNCTHTHEPGCAVVEAVERGQIAYPRYESYLKILDE
DEKYRK
>gi|313157843|gb|AENZ01000056.1| GENE 4 2865 - 3368 584 167 aa, chain - ## HITS:1 COG:jhp0169 KEGG:ns NR:ns
## COG: jhp0169 COG1286
# Protein_GI_number: 15611239
# Func_class: R General function prediction only
# Function: Uncharacterized membrane protein, required for colicin V production
# Organism: Helicobacter pylori J99
# 1 160 1 154 236 71 30.0 7e-13
MNTIDLIVCLILALAVWNGWRQGFVMQICSLAGIVAGIWLASRYGVYVGAWLKLDATVSA
AGGFVVVLLAVILLVAVAGRLVRKLFHFAGFGIPDIVLGIAVSVLKYMLVLSVLFSAFDS
LNEDYTLVGPQTIEKSKSYKPVMRLSEAVFPFLEWVGEQVPQEQNQD
>gi|313157843|gb|AENZ01000056.1| GENE 5 3486 - 3704 311 72 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167753959|ref|ZP_02426086.1|
## NR: gi|167753959|ref|ZP_02426086.1| hypothetical protein ALIPUT_02244 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02244 [Alistipes putredinis DSM 17216]
# 1 72 35 106 106 98 75.0 2e-19
MGVVYEEEVSYISSRIRELRRERRLTVQELAYRCDMERSNLSRIEAGRTNLTVKTMCIIC
NALNVSLREVIR

Prediction of potential genes in microbial genomes
Time: Wed Jun 22 13:02:59 2011
Seq name: gi|313157655|gb|AENZ01000057.1| Alistipes sp.
HGB5 contig00029, whole genome shotgun sequence Length of sequence - 204753 bp Number of predicted genes - 191, with homology - 184 Number of transcription units - 80, operones - 41 average op.length - 3.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 2140 - 2199 2.3 2 2 Op 1 . + CDS 2234 - 3511 1559 ## BT_4752 hypothetical protein 3 2 Op 2 . + CDS 3524 - 4036 349 ## BT_4751 hypothetical protein 4 2 Op 3 . + CDS 4064 - 5095 1231 ## gi|313157724|gb|EFR57135.1| putative lipoprotein + Term 5111 - 5153 7.0 5 3 Op 1 . + CDS 5157 - 7754 2284 ## Pedsa_3099 hypothetical protein 6 3 Op 2 . + CDS 7782 - 9275 1473 ## gi|313157768|gb|EFR57179.1| putative lipoprotein 7 3 Op 3 . + CDS 9280 - 11112 1725 ## gi|313157841|gb|EFR57252.1| putative lipoprotein 8 3 Op 4 . + CDS 11136 - 11693 613 ## BDI_3526 hypothetical protein 9 3 Op 5 . + CDS 11696 - 13351 1575 ## BF1481 hypothetical protein 10 3 Op 6 . + CDS 13361 - 13540 161 ## gi|291515067|emb|CBK64277.1| hypothetical protein AL1_19270 + Term 13547 - 13586 8.3 11 4 Tu 1 . - CDS 13557 - 14075 185 ## BF1081 hypothetical protein - Prom 14235 - 14294 2.0 + Prom 13887 - 13946 4.2 12 5 Tu 1 . + CDS 14115 - 15092 200 ## Odosp_3572 transposase IS4 family protein 13 6 Op 1 . + CDS 16114 - 16431 425 ## mru_0298 hypothetical protein 14 6 Op 2 . + CDS 16501 - 16890 303 ## gi|291514492|emb|CBK63702.1| hypothetical protein AL1_12210 - Term 16866 - 16903 2.6 15 7 Op 1 . - CDS 16918 - 17214 386 ## BF3840 hypothetical protein 16 7 Op 2 . - CDS 17216 - 17500 260 ## gi|313157815|gb|EFR57226.1| hypothetical protein HMPREF9720_1186 - Prom 17733 - 17792 3.0 - TRNA 17862 - 17947 61.3 # Leu TAA 0 0 - Term 17771 - 17817 11.2 17 8 Op 1 . - CDS 17990 - 18895 1019 ## COG0324 tRNA delta(2)-isopentenylpyrophosphate transferase 18 8 Op 2 . - CDS 18888 - 19397 803 ## BF0904 hypothetical protein 19 8 Op 3 . - CDS 19446 - 20429 958 ## Dfer_0440 acyl-CoA reductase - Prom 20553 - 20612 5.6 + Prom 20455 - 20514 2.9 20 9 Tu 1 . 
+ CDS 20587 - 21189 821 ## COG3560 Predicted oxidoreductase related to nitroreductase + Term 21236 - 21274 -0.4 + TRNA 21299 - 21371 80.8 # Gly GCC 0 0 + Prom 22086 - 22145 5.5 21 10 Tu 1 . + CDS 22182 - 22367 144 ## 22 11 Tu 1 . + CDS 22544 - 23119 740 ## gi|313157440|gb|EFR56861.1| hypothetical protein HMPREF9720_3014 + Prom 23373 - 23432 3.1 23 12 Op 1 . + CDS 23477 - 24631 1162 ## COG1835 Predicted acyltransferases 24 12 Op 2 . + CDS 24654 - 26267 2035 ## BT_0763 hypothetical protein 25 13 Op 1 . + CDS 26483 - 26938 723 ## Odosp_0013 hypothetical protein 26 13 Op 2 . + CDS 27033 - 27683 998 ## gi|313157720|gb|EFR57131.1| putative lipoprotein + Term 27695 - 27737 5.1 + Prom 27685 - 27744 1.8 27 14 Op 1 . + CDS 27775 - 29106 2090 ## BT_0764 hypothetical protein 28 14 Op 2 . + CDS 29128 - 30216 1611 ## COG0265 Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain + Term 30239 - 30276 7.1 - Term 30225 - 30262 7.1 29 15 Tu 1 . - CDS 30322 - 31632 936 ## gi|313157697|gb|EFR57108.1| hypothetical protein HMPREF9720_1201 - Term 31658 - 31698 9.1 30 16 Op 1 . - CDS 31728 - 32108 423 ## BT_0963 putative ferredoxin-type protein 31 16 Op 2 1/0.200 - CDS 32119 - 32676 599 ## COG0716 Flavodoxins 32 16 Op 3 . - CDS 32696 - 33235 690 ## COG0655 Multimeric flavodoxin WrbA 33 16 Op 4 2/0.000 - CDS 33252 - 34643 1204 ## COG0599 Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit - Prom 34689 - 34748 5.3 34 16 Op 5 . - CDS 34753 - 35571 1080 ## COG0656 Aldo/keto reductases, related to diketogulonate reductase - Prom 35596 - 35655 2.9 + Prom 35548 - 35607 3.5 35 17 Tu 1 . + CDS 35837 - 36733 1261 ## COG2207 AraC-type DNA-binding domain-containing proteins - Term 36836 - 36886 -1.0 36 18 Op 1 . - CDS 36909 - 37652 990 ## COG0588 Phosphoglycerate mutase 1 37 18 Op 2 . - CDS 37656 - 38672 1520 ## COG1052 Lactate dehydrogenase and related dehydrogenases 38 18 Op 3 . 
- CDS 38730 - 39782 1545 ## COG1830 DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes - Term 39842 - 39869 1.5 39 18 Op 4 . - CDS 39873 - 40760 1214 ## COG2207 AraC-type DNA-binding domain-containing proteins - Prom 40856 - 40915 4.5 + Prom 40721 - 40780 4.4 40 19 Op 1 27/0.000 + CDS 40884 - 42041 1683 ## COG0845 Membrane-fusion protein 41 19 Op 2 9/0.000 + CDS 42045 - 45224 4555 ## COG0841 Cation/multidrug efflux pump 42 19 Op 3 . + CDS 45229 - 46602 387 ## PROTEIN SUPPORTED gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 43 19 Op 4 . + CDS 46602 - 47183 476 ## BT_1964 hypothetical protein + Term 47203 - 47247 14.6 - Term 47270 - 47310 11.2 44 20 Tu 1 . - CDS 47336 - 49528 2849 ## BF2681 hypothetical protein - Prom 49646 - 49705 4.0 + Prom 49645 - 49704 3.0 45 21 Op 1 7/0.000 + CDS 49907 - 51190 1886 ## COG2873 O-acetylhomoserine sulfhydrylase 46 21 Op 2 . + CDS 51190 - 52206 1301 ## COG2021 Homoserine acetyltransferase 47 21 Op 3 . + CDS 52216 - 53427 2015 ## COG0460 Homoserine dehydrogenase + Term 53484 - 53516 5.0 - Term 53471 - 53503 5.0 48 22 Op 1 . - CDS 53531 - 54871 1170 ## BT_3630 hypothetical protein 49 22 Op 2 . - CDS 54944 - 56146 1655 ## COG0477 Permeases of the major facilitator superfamily - Prom 56172 - 56231 2.1 - Term 56180 - 56221 12.2 50 23 Tu 1 . - CDS 56244 - 57710 1825 ## Bacsa_0604 hypothetical protein - Prom 57816 - 57875 5.7 + Prom 57772 - 57831 3.6 51 24 Tu 1 . + CDS 57866 - 58642 895 ## COG3369 Uncharacterized conserved protein + Term 58770 - 58822 11.2 - Term 58766 - 58804 8.5 52 25 Op 1 . - CDS 58883 - 60142 1558 ## Odosp_3666 peptidase M64, IgA - Term 60178 - 60211 4.5 53 25 Op 2 . - CDS 60231 - 61202 1544 ## COG4864 Uncharacterized protein conserved in bacteria 54 25 Op 3 . - CDS 61215 - 61685 692 ## BVU_3647 hypothetical protein 55 25 Op 4 . - CDS 61731 - 63281 2305 ## gi|313157708|gb|EFR57119.1| sporulation and cell division repeat protein 56 25 Op 5 . 
- CDS 63287 - 64732 1708 ## COG0168 Trk-type K+ transport systems, membrane components 57 25 Op 6 . - CDS 64738 - 65517 1202 ## Dole_1371 histidine kinase 58 25 Op 7 . - CDS 65553 - 66851 408 ## PROTEIN SUPPORTED gi|227424643|ref|ZP_03907734.1| SSU ribosomal protein S12P methylthiotransferase - Prom 66912 - 66971 2.4 + Prom 66787 - 66846 2.9 59 26 Op 1 . + CDS 66954 - 67787 1094 ## COG1108 ABC-type Mn2+/Zn2+ transport systems, permease components 60 26 Op 2 . + CDS 67784 - 69286 2188 ## COG0795 Predicted permeases 61 26 Op 3 . + CDS 69261 - 70484 1918 ## COG0108 3,4-dihydroxy-2-butanone 4-phosphate synthase 62 26 Op 4 . + CDS 70542 - 71513 1228 ## COG0223 Methionyl-tRNA formyltransferase + Term 71552 - 71581 2.1 - Term 71537 - 71573 6.5 63 27 Tu 1 . - CDS 71576 - 72280 1040 ## COG0450 Peroxiredoxin - Prom 72351 - 72410 1.7 - Term 72471 - 72507 9.0 64 28 Op 1 22/0.000 - CDS 72530 - 73075 835 ## COG1014 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit 65 28 Op 2 23/0.000 - CDS 73086 - 73847 1185 ## COG1013 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit 66 28 Op 3 8/0.000 - CDS 73853 - 74947 1659 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit 67 28 Op 4 . - CDS 74951 - 75178 230 ## COG1146 Ferredoxin - Prom 75200 - 75259 8.0 - Term 75266 - 75314 9.9 68 29 Op 1 . - CDS 75336 - 75464 94 ## gi|313157733|gb|EFR57144.1| hypothetical protein HMPREF9720_1241 - Term 75465 - 75496 1.7 69 29 Op 2 2/0.000 - CDS 75520 - 76764 1835 ## COG4198 Uncharacterized conserved protein 70 29 Op 3 6/0.000 - CDS 76789 - 77712 1391 ## COG0111 Phosphoglycerate dehydrogenase and related dehydrogenases 71 29 Op 4 . - CDS 77803 - 78876 1784 ## COG1932 Phosphoserine aminotransferase - Prom 78927 - 78986 6.9 + Prom 78896 - 78955 6.1 72 30 Op 1 8/0.000 + CDS 79067 - 79939 1287 ## COG1561 Uncharacterized stress-induced protein 73 30 Op 2 . 
+ CDS 79939 - 80508 551 ## COG0194 Guanylate kinase 74 30 Op 3 . + CDS 80505 - 81314 1128 ## COG1057 Nicotinic acid mononucleotide adenylyltransferase 75 30 Op 4 . + CDS 81341 - 82303 1483 ## COG1181 D-alanine-D-alanine ligase and related ATP-grasp enzymes 76 30 Op 5 . + CDS 82300 - 82872 673 ## Odosp_3353 hypothetical protein 77 30 Op 6 . + CDS 82880 - 83950 1492 ## COG0739 Membrane proteins related to metalloendopeptidases + Term 83993 - 84031 10.2 - Term 84021 - 84065 -0.9 78 31 Op 1 . - CDS 84072 - 85067 1088 ## Lbys_0372 inosine/uridine-preferring nucleoside hydrolase 79 31 Op 2 . - CDS 85081 - 86091 1522 ## BT_2809 putative integral membrane protein 80 31 Op 3 . - CDS 86102 - 87052 999 ## COG0524 Sugar kinases, ribokinase family - Prom 87109 - 87168 8.2 81 32 Tu 1 . + CDS 87486 - 88502 1092 ## COG1609 Transcriptional regulators + Term 88623 - 88657 5.0 - Term 88604 - 88650 12.8 82 33 Op 1 4/0.000 - CDS 88663 - 89445 915 ## COG0345 Pyrroline-5-carboxylate reductase 83 33 Op 2 . - CDS 89557 - 90222 830 ## COG0325 Predicted enzyme with a TIM-barrel fold - Term 90258 - 90306 10.1 84 34 Tu 1 . - CDS 90323 - 91582 1276 ## BT_1856 hypothetical protein - Prom 91664 - 91723 2.9 + Prom 91636 - 91695 4.4 85 35 Op 1 . + CDS 91764 - 93458 2508 ## COG4690 Dipeptidase 86 35 Op 2 . + CDS 93546 - 94922 1411 ## PROTEIN SUPPORTED gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 + Term 94956 - 94994 9.2 - Term 94938 - 94987 12.0 87 36 Tu 1 . - CDS 95006 - 97045 2253 ## COG0642 Signal transduction histidine kinase - Prom 97082 - 97141 3.4 - Term 97109 - 97144 5.1 88 37 Op 1 . - CDS 97151 - 97339 361 ## gi|291515703|emb|CBK64913.1| hypothetical protein AL1_27710 89 37 Op 2 . - CDS 97359 - 98123 1133 ## COG0101 Pseudouridylate synthase 90 37 Op 3 . - CDS 98163 - 99593 1847 ## COG1027 Aspartate ammonia-lyase - Prom 99630 - 99689 4.8 + Prom 99596 - 99655 5.2 91 38 Op 1 . + CDS 99823 - 100578 1053 ## COG0289 Dihydrodipicolinate reductase 92 38 Op 2 . 
+ CDS 100589 - 101941 2294 ## COG0681 Signal peptidase I 93 38 Op 3 . + CDS 101943 - 102551 848 ## COG1739 Uncharacterized conserved protein 94 38 Op 4 . + CDS 102620 - 105409 4020 ## COG0178 Excinuclease ATPase subunit 95 38 Op 5 1/0.200 + CDS 105412 - 106626 1606 ## COG0612 Predicted Zn-dependent peptidases 96 38 Op 6 . + CDS 106631 - 107227 734 ## COG0693 Putative intracellular protease/amidase 97 38 Op 7 . + CDS 107227 - 107874 833 ## COG4122 Predicted O-methyltransferase + Term 107914 - 107957 8.1 98 39 Tu 1 . + CDS 107979 - 108299 569 ## COG1694 Predicted pyrophosphatase - Term 108245 - 108286 -1.0 99 40 Tu 1 . - CDS 108465 - 109376 953 ## gi|313157667|gb|EFR57078.1| hypothetical protein HMPREF9720_1272 - Prom 109401 - 109460 2.4 - Term 109445 - 109483 10.1 100 41 Tu 1 . - CDS 109508 - 111250 2962 ## COG1109 Phosphomannomutase - Prom 111398 - 111457 8.2 + Prom 111382 - 111441 6.7 101 42 Op 1 . + CDS 111507 - 112889 2120 ## COG0593 ATPase involved in DNA replication initiation + Term 112917 - 112959 7.7 102 42 Op 2 . + CDS 112963 - 113478 318 ## gi|313157763|gb|EFR57174.1| hypothetical protein HMPREF9720_1275 103 42 Op 3 . + CDS 113485 - 114381 1244 ## HMPREF0659_A5182 hypothetical protein 104 42 Op 4 . + CDS 114436 - 114969 794 ## COG0662 Mannose-6-phosphate isomerase + Term 114986 - 115037 12.2 + Prom 114974 - 115033 5.1 105 43 Op 1 . + CDS 115092 - 115304 286 ## gi|313157753|gb|EFR57164.1| hypothetical protein HMPREF9720_1278 + Prom 115306 - 115365 2.0 106 43 Op 2 . + CDS 115388 - 115849 710 ## COG1225 Peroxiredoxin 107 43 Op 3 . + CDS 115863 - 116333 469 ## gi|313157736|gb|EFR57147.1| hypothetical protein HMPREF9720_1280 108 43 Op 4 . + CDS 116362 - 117378 1664 ## COG0468 RecA/RadA recombinase 109 43 Op 5 . + CDS 117385 - 119625 2961 ## COG0317 Guanosine polyphosphate pyrophosphohydrolases/synthetases 110 43 Op 6 . + CDS 119627 - 120112 276 ## gi|313157701|gb|EFR57112.1| hypothetical protein HMPREF9720_1283 111 44 Tu 1 . 
+ CDS 120417 - 120803 506 ## COG0537 Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases + Term 120831 - 120891 12.9 - Term 120824 - 120872 6.0 112 45 Tu 1 . - CDS 120919 - 121059 58 ## + Prom 121344 - 121403 5.1 113 46 Tu 1 . + CDS 121463 - 121927 284 ## BDI_1114 hypothetical protein + Term 122081 - 122112 3.4 - Term 122084 - 122126 -0.0 114 47 Tu 1 . - CDS 122168 - 122662 349 ## BF3191 hypothetical protein - Prom 122867 - 122926 5.3 + Prom 122874 - 122933 3.8 115 48 Op 1 . + CDS 123052 - 123846 578 ## RB2501_04605 hypothetical protein 116 48 Op 2 25/0.000 + CDS 123839 - 124744 977 ## COG0803 ABC-type metal ion transport system, periplasmic component/surface adhesin 117 48 Op 3 . + CDS 124720 - 125511 227 ## PROTEIN SUPPORTED gi|90020817|ref|YP_526644.1| ribosomal protein S16 + Term 125515 - 125551 10.3 - Term 125540 - 125597 2.9 118 49 Tu 1 . - CDS 125613 - 125903 548 ## gi|313157818|gb|EFR57229.1| hypothetical protein HMPREF9720_1291 - Prom 125982 - 126041 3.5 119 50 Tu 1 . + CDS 125908 - 126084 61 ## + Term 126197 - 126232 5.2 - Term 126001 - 126045 11.0 120 51 Op 1 . - CDS 126060 - 126977 959 ## Poras_0858 hypothetical protein - Prom 127069 - 127128 3.2 121 51 Op 2 . - CDS 127134 - 127625 380 ## COG2801 Transposase and inactivated derivatives - Prom 127748 - 127807 3.0 + Prom 127578 - 127637 3.7 122 52 Tu 1 . + CDS 127746 - 127874 67 ## 123 53 Tu 1 . - CDS 127920 - 128225 104 ## gi|167753422|ref|ZP_02425549.1| hypothetical protein ALIPUT_01696 - Prom 128294 - 128353 3.2 - Term 128423 - 128468 11.4 124 54 Op 1 . - CDS 128505 - 128801 422 ## BF3840 hypothetical protein 125 54 Op 2 . - CDS 128803 - 129087 245 ## HMPREF0659_A7057 hypothetical protein - Prom 129319 - 129378 4.1 - Term 129360 - 129402 6.2 126 55 Tu 1 . - CDS 129588 - 132266 4140 ## COG0249 Mismatch repair ATPase (MutS family) - Prom 132289 - 132348 4.8 127 56 Op 1 . - CDS 132377 - 132964 740 ## COG3663 G:T/U mismatch-specific DNA glycosylase 128 56 Op 2 . 
- CDS 132945 - 134276 1908 ## COG0507 ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member - Prom 134375 - 134434 5.9 + Prom 134282 - 134341 7.5 129 57 Tu 1 . + CDS 134415 - 135047 936 ## Bache_1240 hypothetical protein + Term 135070 - 135112 9.0 130 58 Op 1 . + CDS 135178 - 135636 516 ## gi|313157837|gb|EFR57248.1| hypothetical protein HMPREF9720_1300 131 58 Op 2 . + CDS 135627 - 136163 187 ## PROTEIN SUPPORTED gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 + Term 136348 - 136381 5.1 - TRNA 136395 - 136471 90.7 # Asp GTC 0 0 - TRNA 136524 - 136597 82.4 # Asp GTC 0 0 132 59 Op 1 . + CDS 136647 - 136838 115 ## 133 59 Op 2 . + CDS 136907 - 137452 540 ## gi|254423961|ref|ZP_05037679.1| hypothetical protein S7335_4118 + Prom 137457 - 137516 1.9 134 59 Op 3 . + CDS 137536 - 138702 1488 ## COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 135 60 Tu 1 . + CDS 138803 - 139111 467 ## + Term 139217 - 139249 1.3 136 61 Tu 1 . - CDS 139108 - 140043 1076 ## BT_1910 hypothetical protein - Term 140157 - 140193 0.8 137 62 Tu 1 . - CDS 140221 - 141105 1174 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily - Prom 141278 - 141337 3.6 138 63 Tu 1 . + CDS 141297 - 141956 856 ## COG0110 Acetyltransferase (isoleucine patch superfamily) + Term 142161 - 142200 5.1 - Term 142243 - 142281 1.5 139 64 Op 1 . - CDS 142300 - 143247 826 ## COG0229 Conserved domain frequently associated with peptide methionine sulfoxide reductase 140 64 Op 2 6/0.000 - CDS 143270 - 144511 1804 ## COG1228 Imidazolonepropionase and related amidohydrolases 141 64 Op 3 6/0.000 - CDS 144508 - 145989 2057 ## COG2986 Histidine ammonia-lyase 142 64 Op 4 1/0.200 - CDS 146006 - 148027 2988 ## COG2987 Urocanate hydratase 143 64 Op 5 . - CDS 148033 - 149730 2304 ## COG3643 Glutamate formiminotransferase - Prom 149756 - 149815 2.4 - Term 149944 - 149980 6.1 144 65 Tu 1 . 
- CDS 150058 - 152040 2603 ## Palpr_1878 TonB-dependent receptor - Prom 152060 - 152119 4.2 - Term 152399 - 152438 9.1 145 66 Op 1 . - CDS 152487 - 153794 1983 ## COG0498 Threonine synthase 146 66 Op 2 . - CDS 153815 - 154945 1343 ## COG0019 Diaminopimelate decarboxylase - Prom 154967 - 155026 2.5 + Prom 154946 - 155005 4.3 147 67 Tu 1 . + CDS 155080 - 155442 79 ## 148 68 Op 1 . - CDS 155250 - 156371 1421 ## COG2885 Outer membrane protein and related peptidoglycan-associated (lipo)proteins - Prom 156396 - 156455 6.4 - Term 156650 - 156687 7.1 149 68 Op 2 . - CDS 156737 - 157927 1836 ## COG1748 Saccharopine dehydrogenase and related proteins - Prom 158077 - 158136 80.3 + TRNA 158058 - 158133 78.0 # Lys TTT 0 0 150 69 Tu 1 . + CDS 158387 - 159034 216 ## COG1662 Transposase and inactivated derivatives, IS1 family 151 70 Op 1 . + CDS 159765 - 160436 118 ## COG1106 Predicted ATPases 152 70 Op 2 . + CDS 160450 - 161157 205 ## BVU_3668 hypothetical protein + Prom 161623 - 161682 2.6 153 71 Tu 1 . + CDS 161720 - 162784 1021 ## BVU_4155 hypothetical protein 154 72 Op 1 . + CDS 163349 - 163822 565 ## BF3466 putative LPS biosynthesis related transcriptional regulatory protein 155 72 Op 2 . + CDS 163896 - 165290 1952 ## COG2148 Sugar transferases involved in lipopolysaccharide synthesis 156 72 Op 3 13/0.000 + CDS 165303 - 166181 1106 ## COG1209 dTDP-glucose pyrophosphorylase 157 72 Op 4 9/0.000 + CDS 166186 - 166758 789 ## COG1898 dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes 158 72 Op 5 11/0.000 + CDS 166751 - 167614 1152 ## COG1091 dTDP-4-dehydrorhamnose reductase 159 72 Op 6 . + CDS 167625 - 168734 1287 ## COG1088 dTDP-D-glucose 4,6-dehydratase 160 72 Op 7 . + CDS 168806 - 170338 1350 ## BVU_2391 putative transmembrane protein 161 72 Op 8 . + CDS 170335 - 171534 878 ## GFO_0566 hypothetical protein 162 72 Op 9 . + CDS 171522 - 172514 790 ## gi|313157767|gb|EFR57178.1| putative acyltransferase 163 72 Op 10 . 
+ CDS 172516 - 173670 633 ## BT_0041 F420H2:quinone oxidoreductase 164 72 Op 11 . + CDS 173667 - 174815 451 ## Dfer_0060 hypothetical protein 165 72 Op 12 . + CDS 174818 - 175885 998 ## CPF_0915 putative polysaccharide polymerase protein 166 72 Op 13 . + CDS 175894 - 176826 508 ## COG0463 Glycosyltransferases involved in cell wall biogenesis + Prom 177327 - 177386 7.9 167 73 Op 1 . + CDS 177575 - 178006 241 ## gi|313157791|gb|EFR57202.1| glycosyltransferase, group 1 family protein 168 73 Op 2 . + CDS 178013 - 178987 645 ## PRU_0259 acyltransferase family protein 169 73 Op 3 1/0.200 + CDS 179055 - 180236 1146 ## COG0438 Glycosyltransferase 170 73 Op 4 1/0.200 + CDS 180238 - 181308 1261 ## COG0535 Predicted Fe-S oxidoreductases 171 73 Op 5 . + CDS 181349 - 182449 1137 ## COG0438 Glycosyltransferase + Term 182489 - 182534 -0.8 172 74 Tu 1 . - CDS 182459 - 182950 530 ## COG2606 Uncharacterized conserved protein - Prom 182994 - 183053 4.7 + Prom 182933 - 182992 2.6 173 75 Op 1 . + CDS 183040 - 185553 3969 ## Odosp_1673 hypothetical protein 174 75 Op 2 . + CDS 185613 - 186272 930 ## COG0637 Predicted phosphatase/phosphohexomutase + Term 186303 - 186341 6.6 175 76 Tu 1 . + CDS 186564 - 186929 606 ## COG3169 Uncharacterized protein conserved in bacteria + Term 186963 - 187004 9.0 - Term 186953 - 186989 7.1 176 77 Op 1 . - CDS 187043 - 187957 1309 ## Palpr_1890 bile acid:sodium symporter 177 77 Op 2 . - CDS 187958 - 189112 988 ## Halhy_6144 hypothetical protein 178 77 Op 3 . - CDS 189121 - 189570 211 ## PROTEIN SUPPORTED gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 179 77 Op 4 . - CDS 189601 - 192585 4105 ## Bache_0394 metallophosphoesterase 180 77 Op 5 . - CDS 192575 - 194026 1879 ## COG0591 Na+/proline symporter - Prom 194046 - 194105 4.5 181 78 Op 1 . - CDS 194195 - 194893 688 ## Closa_2690 GCN5-related N-acetyltransferase 182 78 Op 2 . 
- CDS 194974 - 195195 432 ## BT_1494 hypothetical protein + Prom 195522 - 195581 3.5 183 79 Op 1 18/0.000 + CDS 195686 - 196537 1246 ## COG0040 ATP phosphoribosyltransferase 184 79 Op 2 19/0.000 + CDS 196560 - 197855 1638 ## COG0141 Histidinol dehydrogenase 185 79 Op 3 13/0.000 + CDS 197852 - 198886 1460 ## COG0079 Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase 186 79 Op 4 18/0.000 + CDS 198883 - 200385 1625 ## COG0131 Imidazoleglycerol-phosphate dehydratase 187 79 Op 5 25/0.000 + CDS 200382 - 200969 850 ## COG0118 Glutamine amidotransferase 188 79 Op 6 23/0.000 + CDS 200974 - 201705 924 ## COG0106 Phosphoribosylformimino-5-aminoimidazole carboxamide ribonucleotide (ProFAR) isomerase 189 79 Op 7 24/0.000 + CDS 201687 - 202439 1007 ## COG0107 Imidazoleglycerol-phosphate synthase 190 79 Op 8 . + CDS 202445 - 203065 914 ## COG0139 Phosphoribosyl-AMP cyclohydrolase + Term 203096 - 203135 9.1 - Term 203077 - 203127 14.3 191 80 Tu 1 . - CDS 203205 - 204752 316 ## BDI_3446 hypothetical protein Predicted protein(s) >gi|313157655|gb|AENZ01000057.1| GENE 1 107 - 2119 2346 670 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. 
PCC 7120 # 196 586 112 498 589 104 26.0 6e-22 MQQEDDLKALENIVQFARGLGVFFLVLHVYWYCYEWLSSCGLTYGYADRLLLDIQRTTLL FSSPVITKLCAFLFLALGCYGTKSVRNGRVDRRHIVAAACLGFPLFFLNGFLLSLPAGIG FQGWSYTLTLGAGYLSLLAAGVWLRRLMKQPLMDDPFNDNNESFLQEERCIDNDCSVNLP TRYRYKGQWRNGWISVVNVYRSTAVIGLPGSGKSYAVLNNYIKQHIEKGFSMLIYDYKFD DLTRIAYNHLLRHLDKYTVKPRFCIINFDDPRRSNRCNPIDARFMDDISDAYESAYVTLL NLNKTWVDKQGDFFTDSAVILLAAIIWYLKIYDNGHYCTFPHAIEFLCQPLEKIFPILSS YPELENYLSPFMDAWKSNAQDQLQGQVASVKIPLSRMISPQLYWVLTGNDFTLDINNPEE PKILCMGNNPDRQSIYGAALGLYNSRLIRLINKKGKLKSSLIIDELPTIYIRGLDNLIAT ARSNKVSVCLGFQDFSQLERDYGEKEAKVITNTVGNIFSGQVVGDTARTLSERFGKIVQL RESHSVSNENVTTSTNTQLETLIPASKISNLTQGMFVGSVADNFDERIEQKIFHAEIVID NDKVKAETAAYVPIPEISSFVGEDGKDHMEEIVKENYYRVKADVAELVRREIARIEADPD LAKLLPKKKP >gi|313157655|gb|AENZ01000057.1| GENE 2 2234 - 3511 1559 425 aa, chain + ## HITS:1 COG:no KEGG:BT_4752 NR:ns ## KEGG: BT_4752 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 29 344 35 354 431 69 25.0 3e-10 MPEENQNQKPIQTPEQADPKYKETLLLYNEQNGAVEAVSDLKQSGNQYKVTTTQPLTANK PAFYELRNSSAVAAFIKGFMSQENAKPFHFLKVAADKASEVTQSLLRLADNPKDPEGLKA LYDHRVTSYQLEKVKFDTPDLKLQELKEMGIIVTPKELEAMKHGLPTTDLHDVTLKIGNI PVAGQFALHPYKDMNGDVQVGLTSALPRPEFEREEYRMMFSTSEKEQLLAGKTPDRLYEL PNPHTGEKEWCFATLNPATNRLVTIPKNEVPDLRYFNGVRLDDTQQNELALGGRVFVEGC SMRGSDITYSGKVGFDVLSNEYKMTDYQFSRPYISPQLDKQLDDRQRTALLSPEGLDCSK EKEHPILGKNGKALNCILRIDPRSNGVVYDFSQQRRQEQQEKQEQKAEKAQEQAPDQGQG RGRKR >gi|313157655|gb|AENZ01000057.1| GENE 3 3524 - 4036 349 170 aa, chain + ## HITS:1 COG:no KEGG:BT_4751 NR:ns ## KEGG: BT_4751 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 141 14 147 147 90 35.0 2e-17 METPDEKWFRERLRHFLEIRHPPRQFHHVMIERRSRLAFESYAQSVELGVPAASAVRAAD KVLFRGLLFSKYDTVHLILTTDFPAIPENQRQEIALRLVRICAPIFRAYDLKDDFSERPE FRQLKEELRTGIRSWIDENGVPGYRSPARKEKPPAPYSEQPRYSKRNKRK >gi|313157655|gb|AENZ01000057.1| GENE 4 4064 - 5095 1231 343 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157724|gb|EFR57135.1| ## NR: 
gi|313157724|gb|EFR57135.1| putative lipoprotein [Alistipes sp. HGB5] # 1 343 1 343 343 490 100.0 1e-137 MKKILFAATALAAALFASCDKDNDTINPALPETSDGARIAITLTGADEADTRAFFDASAK AESWESSLSSLSVFAFDNSGALIIRRDFTSSELASKSATFALPKSAAGTECSFYAVANYD ASSAKTRAALTVLVEKSAADYNGTFAEVSSAAKRSGGFVMSGSTSKTIGAVNSTTSVGIT LKRTVAKVALQTTIDPSFADKYAGELTINSVKLSKAASQSSVVAGTPTPGAMSYTHTQTL AAASGKYNALFYCFENGSLTAGNRVLLEINATYDLDGNGSTTDDRSEVTYSVELTGKAAG EILRNGYYRIAANITGLVGQDCAVTVTVADWETPVTQSVELGA >gi|313157655|gb|AENZ01000057.1| GENE 5 5157 - 7754 2284 865 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_3099 NR:ns ## KEGG: Pedsa_3099 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 536 770 155 362 445 101 29.0 2e-19 MKRLLYILSISLAATLMGCIAEDRSACPHPRGVSVRLTVCPDPMTAVTRTTDEEALRDLN LYLYDDDGEIVLHRYQSSSTLRFTSLPGNYRMRIAANMGRDLGENPASEDFTVTHADEYD TLPMSYEGDVTITSSSGGILTLPVVEVQRVVTKVSYDIAVKPADIELRSVQLCSVPRAVS VFDVAARPSDAPDDYTDCPESGISGQHISGNCYLLPNMQGTVPSITDQRDKNPDNAPANA SYLLIRAVRGAKVLAYYIYLGDNNTTDFNVRANVHYRLAISILGDSEVDTRVSSYTLNVY DSYAENAIGGYCTYDVMGELFVEVEGDPAPLTLRGRITASQGDKNRLLVDDASIGTGRDL ELANQPGLNEYFLYYDLPVYTAANSQVAYTVTVADDAGLAQSFDFERRFANRLQVVVETA DNGKGWVNVSRALYTAEIAGTRNHVVMCHEMGCTLVALPAAGYRFEGWYTTDGYTQRLSS STTYLYTPASNDASIFPKFTANTRPLDDEGTANCYIAPALNTSYSFDATTMGNGCTTLNV TPKRLSGVSARILWETGTLSEAIVKDVRYQDGRITFTTGTTHGNAVIGLFDSRGECLWSW HIWATDYDPAATAQTYASGAVFMDRNLGALETDYTLPASRGLYYQWGRKDPFPYPATATD AYIQAPTIYAAGFEYAESDPRTSGTESPYDVMTLEWATAHPTTYMDGVSFEDWEEWASSL DWLCDHHPNLWGNVTTGKNNISRTSHKSIYDPCPPGWKVPGAEDFAGIERISVSAPYYVT IRCNGTQTAKIPLGGTFYEGRYDRNGSLGRLYTNAPYYLHWGGSTGVFHDVACTSIVFDT SNYPPFVNTTDFYRYAANPVRCVRE >gi|313157655|gb|AENZ01000057.1| GENE 6 7782 - 9275 1473 497 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157768|gb|EFR57179.1| ## NR: gi|313157768|gb|EFR57179.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 497 1 497 497 1023 100.0 0 MKRYLWIFAVLPFLWSCIAEDSSACPQPRGVSVRMSVCPDPMTAVPRAADEEALRDLNFY LLDADGNVVQHRYQSSPTLRFECLPGNYLLRIAANMGRDTGDSPAWENLAVTHADDYDTL PMAYEGEVSIPSSDETVTLPTVEVQRVVAKVTYNISVAPDAGDIRLHSVQPVNLPARMPV FDPGLRAAEYTAGEIVGSSSQTLTGTFYMLPNLQGEVPSIADQRDKGPANAPADATFLRI RALRGSKVLDYYIYPGGNNTSNFDIRANTHYRLDIMIRGDAEVDTRIRAYTVEVLCTPEA ALSNGFLLERQPMRLTLRLGGAYEETGVEAFVELKTGDVRYFGFEGQWGAVARTMEIRGP ENDYDIRYWPPSFTREECRFMFRVQILDRYGEVASFSFPYCYAHLLRVYTKWFDGSNGKG TVASPDALRVVEDMTLASWYSLIYCPDEGCTLVATPDADRLFEGWCRNADHTGVLSYEEQ YRFSPDDDPAVIYAYFR >gi|313157655|gb|AENZ01000057.1| GENE 7 9280 - 11112 1725 610 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157841|gb|EFR57252.1| ## NR: gi|313157841|gb|EFR57252.1| putative lipoprotein [Alistipes sp. HGB5] # 1 610 1 610 610 1194 100.0 0 MKTLIHILLGTLLFAGSCSDEMPVCTPAVHGPRHVTFTFSCDALPGTRAVADESRVEDIN LYLYPANDAQARHVYAASQRTVALELPDGRYTLYAVANLGTDTGPLPEAEVRALRSAWNP DGSDSGTLPMSAMRTFTVRGTTQIAIELVRAVAKVDFSYTVAADFRNSLSVRSVRLCNVP RFAALFGEGRITADGECAATEAFPADGDGYSAVYYLPENLQGKNVLIPGQEQKNEANAPE YAAYILIEGEADGLHVVYRIYLGENNTSDFNIARNRVYRIEARILGRNTVDWRVSTTEVS VTPFTERHAPGRTATAQLRVVSTNNDRSTCFLTCEREEGDGDVTLDGEPILFGVPYPFLS GDGERIAEIAYTQQREGDVRLRLVLTDETGLHAERVLQTSYVREPLRVSFTQDKFELAAL DRARVNFTVEQDDYDGTYTVQVEGTPTLFRNLTDDIGPVTGYTVAGNGTYPLRIRPEAVG ENPYRITVTDEKGNSAFFDSFVTGIKTVSCFTCSFSKISGGIEVVAEGTYPVMNDLTITL NPTVTVVLRGGATKTLTCTMNVKIEADRMRGRASATLDLGTGYTDYFMTDYEATFSRTAS ENGMVEYAVR >gi|313157655|gb|AENZ01000057.1| GENE 8 11136 - 11693 613 185 aa, chain + ## HITS:1 COG:no KEGG:BDI_3526 NR:ns ## KEGG: BDI_3526 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 11 148 4 150 185 102 41.0 7e-21 MKKQRIFITAVLLLAALWGAPRRAAAQIFAVRANALAACSATLNVGAETALTDNWSLELS GYWNPVNTASLSMNFHAVQLGGRYWFYESFVGHFLGQHLTYADYDLGSRTKRYKGNAYGL GVSYGYAWMLSKRWNIALEAGIGLFHTEDTRRDPTVSDWEDEYIYRYRRWTLAPTKLEVS FSYIF >gi|313157655|gb|AENZ01000057.1| GENE 9 11696 - 13351 1575 551 aa, chain + ## HITS:1 COG:no KEGG:BF1481 NR:ns ## KEGG: BF1481 # Name: not_defined # 
Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 172 550 296 676 687 269 38.0 2e-70 MKICNKILQIGLLSAALGSLFGCSVAGRLQRQQATAGLAQLTRAERQERQQDSRLQVVKL QRDSNTFFLALVDTLADGERVMALQIEQVTVVAKMRSIPERNGRVVLDFIVTLPKQLLGK SRSVVITPILHKPDESVALEDLVIRGGRFSLLQERDYWQYETYVERFRPDTEGREAAFNR FVKFPYPEDVRLDSLVEGRSTVTYYYSQAVKTDETSKKMLVTLQGQVLAVDDSAYRLPPS DTLSYVVSSMLSFVDTVPRYRIKVIDKFVTVEDRNYIQFFVGDTRVVDTLGDNRQQLDKI TGLIRQIVEQQEFWVDTITLTAASSPEGAYAFNDRLSQGRAQALKRYLVRRYGRSIDTML IVRWVAENWPELTQRIRTDKSIENREAILALIASEKNPDRREQAIRLRFPKEYAYIRSVI YPQLRAVNFRYNLRRKGMVKDTIHTTELDTTYARGVELLQKRKYAKALYILNDYNDRNTV VAHLSLDHNERAMELLAAMPEDAATEYLRAIACSRLGRKEEGRRHFLEACRLDERMEYRG NLDPEIAELLK >gi|313157655|gb|AENZ01000057.1| GENE 10 13361 - 13540 161 59 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291515067|emb|CBK64277.1| ## NR: gi|291515067|emb|CBK64277.1| hypothetical protein AL1_19270 [Alistipes shahii WAL 8301] putative lipoprotein [Alistipes sp. HGB5] # 1 59 1 59 59 112 100.0 1e-23 MKQQKIKHAILFCLLVAIWGTLAASLFGSCNVYEYEEVEIIHCPQIQDTIQIPDWEPEE >gi|313157655|gb|AENZ01000057.1| GENE 11 13557 - 14075 185 172 aa, chain - ## HITS:1 COG:no KEGG:BF1081 NR:ns ## KEGG: BF1081 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 164 255 412 415 206 65.0 3e-52 MNFNYTKTADLYLPANSDIPVNHIHGELDNEQNPVIFGYGDELDEDYKTISNLNDNSYLT NIKSIRYLETDNYRQLLQFIDTGPYQIYIMGHSCGNSDRTLLNTLFEHKNCVSIKPYYYE WTDEEGGHSDNYIEIIQNISRNFNSMQLMRDRIVNKLYCRPLPQKPKVAAIE >gi|313157655|gb|AENZ01000057.1| GENE 12 14115 - 15092 200 325 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3572 NR:ns ## KEGG: Odosp_3572 # Name: not_defined # Def: transposase IS4 family protein # Organism: O.splanchnicus # Pathway: not_defined # 4 325 3 324 324 406 59.0 1e-112 MTNERLCIFLVFSNLIASKLLSSNRRMHNFIEKFTKINDICKKFAGNRVNEHGNVSRRGV IPTFSDLEVIALSLTAEAFGYDSENNLFKRLAESPEHIPNLISRRQFNVRRKLTACLAED IRRDIAKSIDGGENVFVIDSKPVKVCQLARAKRCVMGNDNPQSAPSKGFCASQQMYYYGY KLHAICGISGVIHSYDITAANVHDIHYLDDAKWDYHDCLMLGDKGYLSAEVQQDLFETAH 
IKLEVPYRLNQKNWRNPSWAYRRFRKRIETVFSQLNDQFMMVRNYAKQTGGLFTRTAAKI AAMTVLQYINFCNHRKIGLVKDALF >gi|313157655|gb|AENZ01000057.1| GENE 13 16114 - 16431 425 105 aa, chain + ## HITS:1 COG:no KEGG:mru_0298 NR:ns ## KEGG: mru_0298 # Name: not_defined # Def: hypothetical protein # Organism: M.ruminantium # Pathway: not_defined # 1 101 1 114 116 84 44.0 2e-15 MKLLTKEIISKFEKHPFHSQDGKGMDAEVLVKYFNPCGTGTWLITEAEREGDDWRLFGYC HIYEWEWGYLMLSELASLRLPFGLTIERDIYTARKYVRDFLPQDE >gi|313157655|gb|AENZ01000057.1| GENE 14 16501 - 16890 303 129 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514492|emb|CBK63702.1| ## NR: gi|291514492|emb|CBK63702.1| hypothetical protein AL1_12210 [Alistipes shahii WAL 8301] # 1 129 1 129 129 244 99.0 1e-63 MATFEEKAERLKKELEEATNDDQRRNLSREYELTLRLLRIIRGEVFTLDDINKCRMEIMR LYPGYDRPITAESGILLAAEAIRKSFGKKYYLPLYKYPILIDFGTPDGQICVIHSSNYIS YTSKKGGEE >gi|313157655|gb|AENZ01000057.1| GENE 15 16918 - 17214 386 98 aa, chain - ## HITS:1 COG:no KEGG:BF3840 NR:ns ## KEGG: BF3840 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 98 3 97 100 85 46.0 6e-16 MPEEIITRDSAAFKELYRDIVKAIRAVDILIDTHRPTIGNELYLTSEEICSIFSISKRSL QNYRDNRQIPYTTLGGKILYPQSSLYKLLEQHYMKAQR >gi|313157655|gb|AENZ01000057.1| GENE 16 17216 - 17500 260 94 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157815|gb|EFR57226.1| ## NR: gi|313157815|gb|EFR57226.1| hypothetical protein HMPREF9720_1186 [Alistipes sp. 
HGB5] # 50 94 1 45 45 87 97.0 3e-16 MDCVLMETSAYKELQAHLQRLLERVSALHSLSAQPTTVRWLTAEEVCKALSITKRALQYY RSAGIIPYTALGNKVLFRDDDIRHLLEKNLIKSL >gi|313157655|gb|AENZ01000057.1| GENE 17 17990 - 18895 1019 301 aa, chain - ## HITS:1 COG:BH2366 KEGG:ns NR:ns ## COG: BH2366 COG0324 # Protein_GI_number: 15614929 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA delta(2)-isopentenylpyrophosphate transferase # Organism: Bacillus halodurans # 4 300 3 296 314 212 40.0 8e-55 MSDKRLIVVVGPTGSGKTDLSIRLALNYDAPILSTDSRQVYRGLPIGTAQPTPEQLQAVE HHFIASHDINDNLNCGEYETQALVRLGELFAAHDYVVAVGGSGLYVRALCEGMDDLPQAD ETLRRNLAARLASEGVERLAEELRTLDPAYYAEVDRSNPARVVRALEVCLLTGLPYSQLR TGLRRERWFDIVKVGVDMPRGELYDRINRRVDRMLADGLEAEARAVYPYRSLNALQTVGY REFFDYFDGRISYDEAVELIKRNSRRYAKRQLTWFRRDGEIRWFRSDETDAIISYIDFGK S >gi|313157655|gb|AENZ01000057.1| GENE 18 18888 - 19397 803 169 aa, chain - ## HITS:1 COG:no KEGG:BF0904 NR:ns ## KEGG: BF0904 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 169 35 197 222 84 32.0 1e-15 MSMVFRFRMLSDENDNFVRDYEVLYDTTLLEFHDFILRSLEYEECMASFFTADDRWEKLR EFTRMDMDDGAEDAPLAMEKVTLGQIIHNNRDRLIYLFDIFGDRAYFLELTGSFESKPGI SYPREIYAQAEAPDQYDPSKNVVSGEGSIFDDVMSEFSDFEGDDNYDDE >gi|313157655|gb|AENZ01000057.1| GENE 19 19446 - 20429 958 327 aa, chain - ## HITS:1 COG:no KEGG:Dfer_0440 NR:ns ## KEGG: Dfer_0440 # Name: not_defined # Def: acyl-CoA reductase # Organism: D.fermentans # Pathway: not_defined # 14 326 13 340 341 230 41.0 8e-59 MKSAIELFSALGRRLADFGGDVPTQEVAERACRANGWFTPADVRRAVAAIADGMLRRDRL ETWLAPYPVPVAVPRRVLVVMAGNIPLVGFFDLLCVLAAGHRCLVKPSAKDRVLTEYVVG MLRELDPEVPVGFCDGSSPVDAVIATGSDNANRYFRTQYAGIPALLRGSRQSVAVLSGRE TEAQLEGLADDIWAYSGLGCRSVSLLFVPEGYDLRLRMPAVNEKYRNNYRQQKALLTMTG RPFRDLGSAVAVEERAFPAALSRIACSRYKTLGEVEAWLAQHDAELQCVVSECVSHGRRT GFGRAQSPALTDYPDDRDVIAFLAALN >gi|313157655|gb|AENZ01000057.1| GENE 20 20587 - 21189 821 200 aa, chain + ## HITS:1 COG:CAC3314 KEGG:ns NR:ns ## COG: CAC3314 COG3560 # Protein_GI_number: 15896557 # Func_class: R 
General function prediction only # Function: Predicted oxidoreductase related to nitroreductase # Organism: Clostridium acetobutylicum # 1 199 1 198 198 239 54.0 3e-63 MERTFKEALRHRRSYYALAPESPVEDAQIEEIVRFAIKHVPSAFNSQSTRAVLLLHEHHE ELWKIVKRTLRAIVPEDAFARTEEKIERSFAAGYGTVLFFEDTNVVRDLQQKFPGYAGNF PVWSEQTSAMHQLAIWTMLEDAGFGASLQHYNPLIDNEVRKRWSLPEEWRLIAQMPFGTP AGEPGEKTFKPLDERIRVFR >gi|313157655|gb|AENZ01000057.1| GENE 21 22182 - 22367 144 61 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRRIRFRNYPKAQNPHYILLLWHNTGRTFLPFDFARQDVRFWAGERGNALTPHDPARRA S >gi|313157655|gb|AENZ01000057.1| GENE 22 22544 - 23119 740 191 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157440|gb|EFR56861.1| ## NR: gi|313157440|gb|EFR56861.1| hypothetical protein HMPREF9720_3014 [Alistipes sp. HGB5] # 1 191 15 206 206 242 61.0 1e-62 MNEEDPEFEKTLRAINHYYITQVGICTAALSKLRSAELSDKNIIIRAIMCDMPDCGVETE ASADDCGVDLWADRAEYVNETSFNLYVTQAEENVLWNAGEKGRKPGRLAVPQQRITVRSD GMGTAPVIPLIYLTLLSAKLLAMAFPDGCTDAELSGYNDERTMAALTDRVMTVLEQLALR DGEEETRSPAS >gi|313157655|gb|AENZ01000057.1| GENE 23 23477 - 24631 1162 384 aa, chain + ## HITS:1 COG:CC1328 KEGG:ns NR:ns ## COG: CC1328 COG1835 # Protein_GI_number: 16125577 # Func_class: I Lipid transport and metabolism # Function: Predicted acyltransferases # Organism: Caulobacter vibrioides # 18 369 12 332 337 112 33.0 1e-24 MSETKVSAAAFADTKPHYNILDGLRGVAALMVVWFHVFEAFATSHVDQRINHGYLAVDFF FILSGFVIGYAYDDRWKRMTVREFVTRRFIRLHPMVVIGAVIGAVMFYFQGCSVWDVSKV SVTMLLAATLMNACMIPATPGMEIRGVTEMFPLNGPSWSLFYEYIGNILYALFIRRLPTK VLAALVLLAGCGLAAFAVWGPYGDICAGFSLTGDNIAGGSLRLLFSFSAGLLMSRVFRPV KVKGAFWICSLGVVVLLAVPRIGGEENYWMNGLYDTLCFALAFPLLVYLGASGKTTDRTT ARICKFMGDISYPLYMVHYPFIYLYYAWVKNGNLTFAQSLPGAAALVIGSVLLAYLCLKL YDEPVRRFLTKHLLRTQKPVRQTP >gi|313157655|gb|AENZ01000057.1| GENE 24 24654 - 26267 2035 537 aa, chain + ## HITS:1 COG:no KEGG:BT_0763 NR:ns ## KEGG: BT_0763 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 534 9 558 561 228 30.0 5e-58 
MHRIRLAILLFLCVSAAGCQPVPGVLLDAEAIVMEHPDSAARLLEGVPAPEKRLSRRNYA HYALVLTQARWLAGENMIDDTLSDVALDYYRTHTDDFAAAHKAYYYAAKIAHQRRQPEVA MTLLLKSRDMLPPKGEWRRHYVVETWLGVFCGQQHLFEEKIRHAQQAYAYADSMERYDWM CISLGDMAHAYMGLDNYDSMEYYAIKALRLAEEKGITENTSPKWAMLIESALKRKEYRKA EEYWEKGFGHAPAWTRYSWLSTKADICNRTGRYDTALELTDSSRRMCADTTHLTSRALWA LNTSNAWEGKGDAARSLEALKQYIRLKDSIDEKVHSAEVLNIRKLWRYEQLKTQNAALQS QKQHRERIVYETILICAALMLLGAWLGLRMWHRAKLREFAQQKRILRQENELDAMRRKEE HLRTTFFRQLNSRFIAQVRQGGESRKCRMSDEDWKTIFTHADAVFDNFTVRLRGQFPALN EEDIRYCCMVRMHLSQAEIASVVCLEKDSVKKRLKRIRTEKLASAKGSTLEEILSRL >gi|313157655|gb|AENZ01000057.1| GENE 25 26483 - 26938 723 151 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0013 NR:ns ## KEGG: Odosp_0013 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 2 150 15 163 163 140 47.0 1e-32 MQKFNYTPSNPFSGSVSSLAIGAGMVVVPLVYPFGIRIGRMRILGPTAVTIIFVIGGLAL LAFTVREIMQARKLIAQGGEITVEGGKVTIPVVRKKEVVNESFLLSEVEYTKFDEEENEF KISLPADHHVIRGAFFENAEAFDAFKSIFDK >gi|313157655|gb|AENZ01000057.1| GENE 26 27033 - 27683 998 216 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157720|gb|EFR57131.1| ## NR: gi|313157720|gb|EFR57131.1| putative lipoprotein [Alistipes sp. 
HGB5] # 3 216 1 214 214 358 100.0 2e-97 MTMRPIIRLLAAAVAAAATIQSCSTLTGYPTDAEASYAKAIELTKKSVDTDKFKIYSLSI MEGETLSDNLFLVTVKLVNKDDQAFSQSYYMNGLEPTDLRDVQSTFEAPEYETTVGIDLS ELDPARIAAQIAQAKTMLPEGHSYKSVGRYEIEEEVPAGNSAFNRNRKPGQQHTSFIVRF TEDGKETESSAGKTSYIYYEAKVTVGPDGTLTIEEN >gi|313157655|gb|AENZ01000057.1| GENE 27 27775 - 29106 2090 443 aa, chain + ## HITS:1 COG:no KEGG:BT_0764 NR:ns ## KEGG: BT_0764 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 369 1 370 444 392 49.0 1e-107 MKTTHSFVAPVLNSTQSKVNTEAYEQAIDNYNEGKHLEAFRLLLDHLNPEFRTKYGNAEG TEFHIPHGSILVNIRIGEGKLHISADFLELPEKGRVAMLRQVADLNINRLMLPRFRKEGD KLKMEYVCPLSQSHPHKLYFILRNICHIGDRYDDEFCAKFGAKRSYEPQVTPYSEEEVTR IHEAVRQTCRETLEAVKEYEAERKYGYSWNVIDIALYKISYFAQPQGQLMNDLDKAVDDM DKELPVAELVTKGKAFLERLLAMPREELARDLYFVDTLVSAKRRSSLNNVQENFKEVYKE AADAIEAENYERSAVRILYKFYEMYFYNDVQDDLNAVIANALRKSAGKSMEEASEILFEA LEKVMDDDLTTDDDDDEETGELGAAAAAAVEQVQQMAAAMQGEIVQMQAAMQQALMKGDM AEYMRLAQEFQQKMMEQALGQQK >gi|313157655|gb|AENZ01000057.1| GENE 28 29128 - 30216 1611 362 aa, chain + ## HITS:1 COG:CAC2433 KEGG:ns NR:ns ## COG: CAC2433 COG0265 # Protein_GI_number: 15895698 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Trypsin-like serine proteases, typically periplasmic, contain C-terminal PDZ domain # Organism: Clostridium acetobutylicum # 5 181 147 338 433 101 34.0 3e-21 MEHNNIYRSVFKVTHSGGSGSCFYLKNYDLFVTNYHVVEGYRTVAVHDNDRNPYLAKVVL VNPALDIALLAAEGDFSALPEMTLAADDSLTIGRKVYVAGYPYGMPFTITEGSVSSPKQL MDGKYYIQTDAAVNPGNSGGPILNDAEEVVGVTVSKFTQADNMGFGIRVETLHGLLESLG ELDRTAFQVQCGSCEELICEEEEFCPSCGDKLPEGVFDERELSPLGGFVEGAIREMGVNP VLARDGYDSWLFHKGSSEIRIFVYDGNYLFSTSPINLLPKKEVENVLDYMLSEDFGPYKL GIEGRQIYIAYRIHLADITDESEDEIRKNIVGLALRADEMDNMLVERFGCEFSEYSKTDA EA >gi|313157655|gb|AENZ01000057.1| GENE 29 30322 - 31632 936 436 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157697|gb|EFR57108.1| ## NR: gi|313157697|gb|EFR57108.1| hypothetical protein HMPREF9720_1201 [Alistipes sp. 
HGB5] # 1 436 1 436 436 882 100.0 0 MRKLYFLFMLLPALSWSVALRAQRLGVEVTPACPTVFQEFEVAYTADREIESIAPPRWGA LTLVRELGQSHGSQLSIVNGVRTDSELFSYRYRVRSRVSGEVFVPGTTAVVGGRECCFER KRITVLPDSAHCEPQCRLEPDSAAMAAMGGCIVRLVCDRKPDKADPVLVLDDTETCTPFS TAYSSRNGREECIYRYRIDVDTPGKHILTPRLSFGGKPFEAPSYVVWLKGEAGPEGTEPA TPGRRRIVYAFAGVLLFLWTAIFVRDHAPAAGRNAVVSRFSWQVYFVVAAALLFIGFCGL FICGALRAEGASRLPVYGVTLLMLFCVCWLVFGELRRSAVRLRIDGDTLAVTSFAGLGFT RRYDLKSFDAITSTILVSRGGAYEYRYLMKAGRRVVRVSAFYLKNYNGLCEALSEHCVYK GRRPMNLWLEMKEIFR >gi|313157655|gb|AENZ01000057.1| GENE 30 31728 - 32108 423 126 aa, chain - ## HITS:1 COG:no KEGG:BT_0963 NR:ns ## KEGG: BT_0963 # Name: not_defined # Def: putative ferredoxin-type protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 126 1 126 502 182 68.0 4e-45 MLRKIRIAAAAVFFTMITLLFLDFTGTLHAWFGWMARVQLLPAVLAVNVAVVAALLVLTL LFGRVYCSVICPLGVFQDVASWAGGKRRKNRFTYSRALPWLRYGMLALFVAVMVAHIKPV AELLAP >gi|313157655|gb|AENZ01000057.1| GENE 31 32119 - 32676 599 185 aa, chain - ## HITS:1 COG:MA0407 KEGG:ns NR:ns ## COG: MA0407 COG0716 # Protein_GI_number: 20089301 # Func_class: C Energy production and conversion # Function: Flavodoxins # Organism: Methanosarcina acetivorans str.C2A # 28 183 8 177 179 146 43.0 2e-35 MKRIVLIAIMTMFLPDMNTQAQTASGTKVLVAYFSHSGNTRAMARQIAEATGGDLFEIVP AAAYPAEYGAVVDQAKKEIGGGVRPALKGRLPDVGAYDVIFVGSPCWWSTVAPPVATFLA DCDWAGKTVVPFMTHEGSRMGRSEEDIRKLCAGATLLGGLPLRGGAVKDSRDVVRKWVQG LNLTK >gi|313157655|gb|AENZ01000057.1| GENE 32 32696 - 33235 690 179 aa, chain - ## HITS:1 COG:MA0418 KEGG:ns NR:ns ## COG: MA0418 COG0655 # Protein_GI_number: 20089311 # Func_class: R General function prediction only # Function: Multimeric flavodoxin WrbA # Organism: Methanosarcina acetivorans str.C2A # 1 179 1 179 179 240 62.0 9e-64 MAKKVLILSSSPRRGGNSDLLCDRFMAGAQQAGHDVEKIFLRDKKINYCTGCGVCYGGAK PCPQRDDAAEVVGKMIGAEVIVMATPVYFYTLCAQMKTLIDRTCARYTEMGGKEFYFILT AAEESVEMMERTVECFRGFLDCLDDPEERGVVYGVGAWQAGEIEGMPAMDEVYELGRRV >gi|313157655|gb|AENZ01000057.1| GENE 33 33252 - 34643 1204 463 aa, chain - ## HITS:1 COG:MA0409 
KEGG:ns NR:ns ## COG: MA0409 COG0599 # Protein_GI_number: 20089302 # Func_class: S Function unknown # Function: Uncharacterized homolog of gamma-carboxymuconolactone decarboxylase subunit # Organism: Methanosarcina acetivorans str.C2A # 213 461 2 250 250 374 69.0 1e-103 MEKPKKISAFPVGDKLPETFSKYFVGQAYLARLTRNGALNCPISNVTFEPGCRNNWHSHT GGQILVAVGGRGYYQAEGEPARELLPGDVVEIAPDAVHWHGAAPDSWFSHLAIEPNPQTN RNTWLGPVDDAHYSAVTAGAAAIASTPASATAPAAGIAATAESSAAAVSAAMSMSSAAPI PAAMSMSSAAPIPAAMSMSATAPLSSAKSAASRLRAAAVETRAQLFSGCESELAATDPEL IEIFDNFTFAEVLGYGDLDVKTRMMCILASCIAGAAQTEFRTMLEGALNVGVTPVEAKEV VYQAVPYVGMARTVDFVHIVNGVLTARGVALPLEGQSATSPETRFEKGLAVQKAIFGERI DAMRAAAPENQKHMQDYLSANCFGDYVSRGGLDAKVRELLTFSMLLTLGGCEPQLRGHIQ GNLNVGNDKRTLLAVVTQLLPYTGYPRTLNAIACLNEAIPENE >gi|313157655|gb|AENZ01000057.1| GENE 34 34753 - 35571 1080 272 aa, chain - ## HITS:1 COG:TM1009 KEGG:ns NR:ns ## COG: TM1009 COG0656 # Protein_GI_number: 15643767 # Func_class: R General function prediction only # Function: Aldo/keto reductases, related to diketogulonate reductase # Organism: Thermotoga maritima # 1 268 14 285 286 368 62.0 1e-102 MPVLGFGVYQVGETVCEQCVRDAIAAGYRSIDTASAYLNERAVGRAIRRSGVPREELFIT TKLWVQDAGYESTKRAFAKSLERLQLDYFDLYLIHQPFGDVYGSWRAMEELYREGAVRAI GVSNFQPDRLVDLILHNEVVPAVNQVETHPFCQQAEAAAVMASEGVQIESWAPFAEGRNN LFGNGTLVSLAAKYRKSVAQVVLRWLIQRGVVVIPKSVRPERMAENIDVFDFHLAPEDMD LIATLDTRRSCFLSHRDPETVKWLGTMKYEMD >gi|313157655|gb|AENZ01000057.1| GENE 35 35837 - 36733 1261 298 aa, chain + ## HITS:1 COG:BMEII0641 KEGG:ns NR:ns ## COG: BMEII0641 COG2207 # Protein_GI_number: 17988986 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Brucella melitensis # 193 295 193 291 307 61 33.0 2e-09 MDEIVKMEHVYQYNEMMGQETLHPLVSVIDFSKCPKARHARRMYGFYCVFLKDVKCGDMR YGRHYYDYQEGTLVFIAPGQVVGIEDNGEVFQPKGWALLFHPDLIRGTSLGRSMDEYTFF SYEANEALHLSEQERLVVLECLQNITGELNHAIDKHSRRLIVSNIELLLNYSIRFYERQF ITRSEVNKDALSRFERLLNDYFKGDTPQRDGVPSVRWCAEQLHLSANYFGDLIKKETGKS 
AQEYIQLKVIDIAKERIFDATKSISEIAYELGFRYPQHFTRLFKKCVGMSPNEYRTTN >gi|313157655|gb|AENZ01000057.1| GENE 36 36909 - 37652 990 247 aa, chain - ## HITS:1 COG:STM0772 KEGG:ns NR:ns ## COG: STM0772 COG0588 # Protein_GI_number: 16764136 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglycerate mutase 1 # Organism: Salmonella typhimurium LT2 # 3 246 5 248 250 287 59.0 1e-77 MKKVVLLRHGESTWNRENRFTGWTDVDLSEKGVAEAEKAGLLLREEGFLFGHAYTSYLKR AVKTLGVVLDKLDQDWVPVSKSWRLNEKHYGSLQGLNKKETAEKYGDEQVHIWRRSYDVA PAPLGEEDPRNPRFDPRYRDVPEAELPRTESLLDTVGRIMPYWKCEILPALARHDSLLVV AHGNSLRGIIKHLKGISDEAISEFNLPTAVPYVFEFDDGLNYVRDYFLGDPEEIAALMAA VADQGRK >gi|313157655|gb|AENZ01000057.1| GENE 37 37656 - 38672 1520 338 aa, chain - ## HITS:1 COG:NMB1685 KEGG:ns NR:ns ## COG: NMB1685 COG1052 # Protein_GI_number: 15677533 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # Organism: Neisseria meningitidis MC58 # 1 332 1 331 332 374 56.0 1e-103 MKITFFGTQPYDRESFDRANERYGFEFNYHRSHLNGNNTSLAQGADAVCIFVNDTADAAT IRSLAAMDVKLIALRCAGFNNVDLKAAAEYGIPVVRVPAYSPHAVAEYAVMLMLTLNRKV HRAYWRTRDGNFSLHGLLGFDMYGKTAGIVGTGKIARELIRILKGFGMEVLAYDIYPDPE YAVRAQIEYVSLDELYRRSDIISLHCPLTDATRYMIDGAAIGRMKPGVMLINTGRGQLIH TEALIEGLKEKKIGAAGLDVYEEEAAYFYEDTSDRIMDDDVLARLLSFNNVVMTSHQGFF TREALDNIAHTTLQNINDFAVHRELRNEVRAEPEKTMN >gi|313157655|gb|AENZ01000057.1| GENE 38 38730 - 39782 1545 350 aa, chain - ## HITS:1 COG:all3735 KEGG:ns NR:ns ## COG: all3735 COG1830 # Protein_GI_number: 17231227 # Func_class: G Carbohydrate transport and metabolism # Function: DhnA-type fructose-1,6-bisphosphate aldolase and related enzymes # Organism: Nostoc sp. 
PCC 7120 # 2 350 8 360 360 493 67.0 1e-139 MSKTIEILGEKAEYYLSHVCRTIDKKLLYLPAPDTVDRVWMESDRNIRTLGSLQWILSHG RLAGTGYVSILPVDQGIEHAAGASFAPNPAYFDPENIVKLAVEGGCNAVASTFGVLGAVA RKYAHRIPFIVKLNHNELLSYPNSYDQVMFGTVRDAWNMGAAAVGATIYFGSEESRRQLV EIAEAFDYAHELGMATILWCYLRNGSFRKDGVDYHASADLTGQANHLGVTIKADIVKQKL PENNGGFRAIGFGKTRDEVYTKLTTDHPIDLCRYQVANGYMGRVGLINSGGESHGASDLQ EAVMTAVVNKRAGGMGLISGRKAFQRPMDEGIALLNAIQDVYLDPEVTIA >gi|313157655|gb|AENZ01000057.1| GENE 39 39873 - 40760 1214 295 aa, chain - ## HITS:1 COG:AGl475 KEGG:ns NR:ns ## COG: AGl475 COG2207 # Protein_GI_number: 15890346 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 156 291 143 271 289 71 34.0 2e-12 MKKALNLFEKRIVMSATNPIITTPEEQFVVGESDFAFFGNYASRCEGGAILYCRKGSADA TVNQYCGQVRRNTLILVLPGSLLMLTNRTEDFRMTFCAFSRDLFAEAGFRLEPSFFRILR ENPITYPPARIVEGASTWFQMAAYTYRDRNNVFRNTIIRNRLQNVLLEIYDKLQRYANMQ QQTPETTTRQTELFHRFVALVHEHSSQQREVSFYADKLCISTRYLSTIVRNIAHSSAKEF IDRSVLLEIKMLLQSTDLSVQEIAYRLHFPDQSYLGRYFKKHTGESPTEYRNTKK >gi|313157655|gb|AENZ01000057.1| GENE 40 40884 - 42041 1683 385 aa, chain + ## HITS:1 COG:Cj0367c KEGG:ns NR:ns ## COG: Cj0367c COG0845 # Protein_GI_number: 15791734 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Campylobacter jejuni # 14 362 17 361 367 147 32.0 3e-35 MKQTFVKAAVMACFMAAVSCGQAPTAMGPAEYAVMTIATTDREIPINYSATIRGRQDIAI YPQVSGTIFELCVNEGQTVSKGQPLFIIDQVPYKAALQTAEANVAAAKAGVATAQLTYDS KKELYAKNVVSQYDLLTAENTLLTAKAQLAQAEAQRVNAANNLSYTVVKAPANGVVGTLP YRVGALVSASIPQPLTTVSDNSDVYVYFSMTENQLLNLTRQYGSIANTLKNMPDVRLVLN DGSVYDRTGRIESISGVIDTSTGSVQLRAVFPNADGLLHSGGAGSVIVPNIHKDCVVVPQ VATFELQNKVYVYKVEDGKATSSMIDVEKINNGREYIVKSGLTPGDVIVAEGVGLLREGT PIVVKGQGAAAQTAPEAATQTEKEE >gi|313157655|gb|AENZ01000057.1| GENE 41 42045 - 45224 4555 1059 aa, chain + ## HITS:1 COG:BMEI1629 KEGG:ns NR:ns ## COG: BMEI1629 COG0841 # Protein_GI_number: 17987912 # Func_class: V Defense mechanisms # Function: Cation/multidrug 
efflux pump # Organism: Brucella melitensis # 4 1024 3 1022 1051 739 41.0 0 MTLKHFIERPVLASVISIVIVIAGLIGLATLPVEQYPDIAPPTVMVRASYPGASAETIQK SVIVPLEQAINGVEDMTYMTSSAAVGSASVTVYFRQGINADMAAVNVQNRISRATGQLPS EVTQIGVTTMKRQTSMVKIFSLYSPDDSYDETFLSNYLKINIEPRVLRIAGVGEAFTLGA DYSMRIWLKPDVMAQYKLIPSDVTAALAEQNIESATGTLGENSQNTFQYTMKYRGRLMTP EEFGEIVLLAQPDGTILHLKDVADIELGSESYAYKGYTNGHPGVSTMVFQTAGSNATQVV NEINALLKEVESELPKGVAIAHLQSVNDFLYASINEVIKTLIEAILLVILVVYVFLQDIR STLIPTISILVALVGTFAFLSVAGFSINLLTLFALVLAIGTVVDDAIIVVEAVQARFDVG YKSSYMATIDAMSGITSAIITSTLVFMAVFIPVAMMGGTSGVFYTQFGITMAVAVGISAI NALTLSPALCALLLKPYLDENGEMRDNFAARFRKAFNSGFSAMINKYKHGVLLFIKHKWL MWSTLGLAFAALVLLMNTTKTGLVPDEDQGTIMVNVTTAPGTSLAETHQVMEEVSARIKA IPQIRDFMQVAGYGMIAGQGSSYGMCIIKLKDWEERPEKTDAVNAVIGQIYGRTADIKNA QIFAVAPPMISGYGTSTGFSMHLQDKSDGSLTDFYNIYLRFIGALNQRPEIAMAYSTFNI NFPQYVVSIDAAKAKRAGVSPNAILSTLSGYYGGQYVSNVNRFSKMYYVTIQSDPKYRLN TESLNNVFVRTNNGEMAPLGQFVDLTRVYSSEVLNRFNMYSSIAVNGTAADGYSSGDAIR AIQEVAAQVLPKGYGYDFDGITREEAQTGSNTTIIFGICLLLIYLILSALYESFLIPFAV ILSVPCGLLGSFLFAKMMGLENNIYLQTGIIMLIGLLSKTAILITEYAADRRAAGMSLTQ AAVSAAKARLRPILMTVLTCVFGMIPLVFSHGVGANGNSTLGSGVVGGMIVGTLALLFLV PTLFIVFQTLQEKVKPLEFDPDPQWAVRAELEECKNEKE >gi|313157655|gb|AENZ01000057.1| GENE 42 45229 - 46602 387 457 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157165073|ref|YP_001466086.1| 30S ribosomal protein S12 [Campylobacter concisus 13826] # 3 449 4 455 460 153 26 5e-36 MKKLIIILAAAAMTGCGIYKPYTRPEVKTDGLYGTAETADTMTIGNIGWEEMFSDPYLQT LIRQGLENNTDLQSAQWRVKEAEATLKSARLAYLPSFNFAPQGNLSSFDWATPSKTYNIP VTASWQIDIFNGLTNAKRKAKALYAQSREYEQAVKTQLISGIANLYYTLLMLDGQYEVTE QTAVKWRESVRTMRAMKEAGMANEAAVAQYEGTCLSIEASLHDLQYQIRMAENSLCTLLA EGPHQIERGRLEGQRLPDDLTVGVPVQMLSNRPDIRSAEYSLMQSYYATSEARSALYPTI TLSGTLGWTNNSGMGIKDPGKILWSAAASLLQPIFNANANRARVKIAKAQQEETKLAFQQ ALLNAGAEVNNALTQCQSARAKADLRVQQIEALERAVESTELLMKHSSTTYLEVLTAQQS LLSAQLSQIADRFDEIQGVVNLYQALGGGRDMAEEAK >gi|313157655|gb|AENZ01000057.1| GENE 43 46602 - 47183 476 193 aa, chain + ## HITS:1 COG:no KEGG:BT_1964 NR:ns ## KEGG: BT_1964 # Name: not_defined # Def: 
hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 9 190 9 190 202 68 27.0 1e-10 MKRGATKEEIIRITQTLIARNGIRAVRVDEIVQTLGISKRTLYEMFADKDELISSCLDSM SSQQRARIATYRKRRSGSALQRTFKLAYEYIANLYMVESSFLSDLRHKIIYADHFDEHRE FWRRELAHHLEASKEEGLLLPEIEGASFADRILETILELRLNNATREEVYLFCRTILRGA ATRQGIERIDRKR >gi|313157655|gb|AENZ01000057.1| GENE 44 47336 - 49528 2849 730 aa, chain - ## HITS:1 COG:no KEGG:BF2681 NR:ns ## KEGG: BF2681 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 14 722 107 796 808 155 24.0 6e-36 MCAMHKVKASQPVFSTAMTLPAGVRNLYVQTTLPDGTVSVSMQPVSAALNVAGAVMKSAE AHKIRTSAATRSESSSMPDYPTLPVKTESDFAEGAVIRQTPAGNIDLGAIWIAKPYAAAA EYYIPAGAEITGNINLNGKFSPHSAPVLYVAGKLTLNASVTIGQATLAVLPGGEVYIREA KANLQQNAVYPAIYVFEGGTFTTDQTSFSCKTIVNEGTFVVNGLFNVNTSCEFYNGASAA LQAGEVEVTNKAKLYNDGKVESADFRLNTYAELHNCENGTVAIDGTFYVTNYSVTYQKGV ARMDRLEARGGGTLYVNCHTAADDVIAEGAKFYIASGSGLDAGTVYFNSNTELYAAAGSI FSMDEYNASRSGGNVRIVSQAQADQPMAVVVIREKGVSSRYYGTKFEGLMEVVYDNAADA KYVIDAGSLIDGAVMRDRQTVVIAGSKCNGGKEPVTPDPEPEPEIIEVIGAPYTYCFEDG WPWIGDYDMNDVVVVTGIDRLVNKESGKVGSIRINWELKAAGAAHLNAFAVQLDKVAASQ VASVETTNTAFGKGAFAGPGLESGNEYAVIPLFNTAQEILGEGTYINTSKGTAPVPTVKH TTTVTFTRPVDAAAVLESAVNAFIVVNSKSSGVFSRDTEVHMPTYKPTGFAVVSGNTFTE AEPYKYFVSKGTGMKDNYMMWALMISGEFRYPAERKDIRTAYTYFNAWAASGGAQHVDWY EDEADEDMLY >gi|313157655|gb|AENZ01000057.1| GENE 45 49907 - 51190 1886 427 aa, chain + ## HITS:1 COG:SPBC428.11 KEGG:ns NR:ns ## COG: SPBC428.11 COG2873 # Protein_GI_number: 19111981 # Func_class: E Amino acid transport and metabolism # Function: O-acetylhomoserine sulfhydrylase # Organism: Schizosaccharomyces pombe # 1 425 1 427 429 469 55.0 1e-132 MTPKNYRFETLQVHAGQQPDPATGSRALPIYQTTSYVFDSAEQGAARFALREEGNIYTRL SNPTTSVFEERMAALEGGTAAVAAASGMAAQYLAFSNLVEAGDNIVSTSFLYGGTFSQFK HTFARIGVEVRFAQGDDIRSIASLMDDRTKAVYLETIGNPEFNVPDFEPIVALAHKHGIA VVTDNTFGAGGFLCRPIEWGVNVVTHSATKWIGGHGTSLGGVVVDGGNFDWGNGRYPLLS EPSEGYHGFNFWKECGKTAFATRTRCEGLRDIGPCISPFNAFLLTQGLETLSLRVQRSVD 
NALAVAEWLEKHPKIEHVSYPGLKSSKYHTLAKKYLRNGFGGVLTFRVKGSAEQASKIVD NLGLISHLANVGDAKTLLIHPASTTHAQLGEQELAASGVYPNLLRLSLGIEHIDDIKEDL NKALKHL >gi|313157655|gb|AENZ01000057.1| GENE 46 51190 - 52206 1301 338 aa, chain + ## HITS:1 COG:HI1263 KEGG:ns NR:ns ## COG: HI1263 COG2021 # Protein_GI_number: 16273181 # Func_class: E Amino acid transport and metabolism # Function: Homoserine acetyltransferase # Organism: Haemophilus influenzae # 10 332 12 352 358 218 36.0 2e-56 MEPRIYIAPEPFQLETGHTLPELKIAYHTYGELNAAHDNVVWVCHALTANSDVADWWPHT VETGKFLDPARHFVVCANILGSHYGTTGPLHTNPQTGTPYYRDFPPFTIRDIVRAHILLA DALGIGRIGALVGSSVGGFQAVEWAVTEPQRIGKLVLIATSAKASPWSIAIDETQRMAIE ADRTFGEPRDDAGMAGLAVARAIGLLTYRGSLGYNLTQQDREELPAVHRASSYQQYQGEK LCRRYNAYSYYAILNAFDTHDAGRGRGGVAAALSRIAARTIVVGITTDIIFTPTEMRELQ GMIAGSVYHEIDSPFGHDGFLVEHEQLNDILRPFMNNE >gi|313157655|gb|AENZ01000057.1| GENE 47 52216 - 53427 2015 403 aa, chain + ## HITS:1 COG:CAC0998 KEGG:ns NR:ns ## COG: CAC0998 COG0460 # Protein_GI_number: 15894285 # Func_class: E Amino acid transport and metabolism # Function: Homoserine dehydrogenase # Organism: Clostridium acetobutylicum # 4 352 1 362 429 201 33.0 3e-51 MQHMGKIKIGLFGFGVVGQGIYEVVRKSKNANAEIVKICVRSLDKPRSIDRSHFTDSPAE ILDNPEIDLIVEVIDDAKASYDIVKSALLKGIPVVSGNKTMLAHHLPEMIEIQKKHNVAL LYDASSCGSIPVIRNLEEYYDNDLLLEVKGILNGSSNYILSRVFDHKDSYANALAQAQAL GFAESDPSFDIGGYDSLFKLVIITVHALGTYVAPERIFTYGISTIHDADIQYAREKNVKI KLVAQVVKVSDEHFTMFVMPEFVTPAKYIYSVDDEYNGVVIRGECYDRQFMFGKGAGSLP TASSILSDIMARLHGYRYEYKKLSYIQKPDYTTDITLKIYVRYTDTDILGILNFTKIHEQ YTSEKSNYVIGDILLSELIAKRDRLCGDDVFVANIPIFFLNRD >gi|313157655|gb|AENZ01000057.1| GENE 48 53531 - 54871 1170 446 aa, chain - ## HITS:1 COG:no KEGG:BT_3630 NR:ns ## KEGG: BT_3630 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 364 4 392 541 216 36.0 2e-54 MRKILFMALAVLIAAGCAESEKGTLCVKIAPAIRTRVTGLHFDTGDRIGLTISKGSAPYA ENVPMTYDGTAFTASGLSWYNDLNEKSVLTAYYPYAEGGRPDEFTVSSDQTQGNGPSDLL AAVKTDVTPVSAPVGMIFYHVMSQLTIVVSNSSSAAVTGVSVGGLVPTAEIDYSAPKAAA 
KSGVAAAEVKTCEVKPGATYRAILAPQQAALTVTVTTDDGRSHSKTLSSAQLESGRRYDM SVLVTNEEIQISLSGDIGDWEDGGSLDGSGGGDGGDDDSQTLSYGGVTYRTTTVGETVWM AENLRYVPDEALLTKGIWYPDGNEDAEYVREHGMLYSFTAALGGAQAVSGAPVRGICPEG WHVPYIDELQTLAVGPDCGFTLAGMWNSVSKNYVSSAKSGYLMSAATDDGGSSYSALLFY ASGGNPSPSSLPAGNGVSLRCVKDAK >gi|313157655|gb|AENZ01000057.1| GENE 49 54944 - 56146 1655 400 aa, chain - ## HITS:1 COG:AGc4286 KEGG:ns NR:ns ## COG: AGc4286 COG0477 # Protein_GI_number: 15889635 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 16 386 10 387 400 96 26.0 1e-19 MKIKTGKGTITLTVLLAIWSVSAIASLPGLAISPILGDLNRIFPKATDLEIQMLTSLPSL LIIPFVLLSGRLSVGRDKLRILTVGLAVFFLSGVACIFAKSMAWLIVISCILGIGAGMVI PLSTGLVVDYFTGDYRVRQLGYSSAINNLTLVLATVVAGYLADVNWHLPFLVYTLPGVSL ALSFFLRRGRSTPEPEQSIQMRHKMIDRRKLVGLMLFYFFATYAVLVVTFYASFLVDDYK IDSSFSGVLISLFFLAIMVPGLFIDKIIRRLTRSVNLVSLALICAGLLCVGIFRDKTMLV VGVLCAGFGYGVIQPVIYDKAATIAPPRSATLALSFVMAMNYLAVMVCPFIVDLFRHLFG THSDRFPFFFNAALVLALTLVTFRRRDDFILGLDESYYKN >gi|313157655|gb|AENZ01000057.1| GENE 50 56244 - 57710 1825 488 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0604 NR:ns ## KEGG: Bacsa_0604 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 160 383 390 683 790 107 28.0 1e-21 MKGFVKLIAVVVTAAAFAVSCSKDKDSGAVSFDKPAVFIDAPSGMATVGFTLRNIKTLSV TSQPTGWDEPMLDQTTGMLTVIAPSATALEKGTAVESGTVVLAGVTPGGTSVSGVLFVGV VKTVDLSAAGVANSYMASVKETNYLFDVMHKGDGSPLATDHLGVIWKSASGLVQYLQMEN GKASFYIGADTEDSNKILKGNAVIGAYDANDELIWSWHVWATDYDPEGENASVELNGYTM MTRNLGALANGNATTSEILASYGLYYQWGRKDPFIGPSTYKISSGQGAAMYNDSGSRTYV TMVASSAETGTMDYAVKHPLTFVTGVAESNNDWLWSKSDAGWSSDGDAGAKSVNDPCPYG WRVAPSAAFAGLEIVGTPAAGDEEKFGWTLTDGKAESFFMGGGRRRYDDGKIQNIYIPKD EAKMKLYTRADIAQPWEGLYWTTAVNGTQSHALHFWYEKLTGTGGIAPAEAYARANGMQV RCVKVKNL >gi|313157655|gb|AENZ01000057.1| GENE 51 57866 - 58642 895 258 aa, 
chain + ## HITS:1 COG:MA0986 KEGG:ns NR:ns ## COG: MA0986 COG3369 # Protein_GI_number: 20089863 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 17 235 7 224 241 206 45.0 4e-53 MSNMSNLKIETGSSPAEERFSITVTETGPYLVYGRPPLAEQFIMPDESNDSWYFQEGRHF STESEPTALCRCGASRRKPYCDGSHEKASWDPRLTAQRDALLDNAEVIDGGTLQMTDNEK YCVFARFCHPHGDAWSLTEQSDDPEARRLAVREASMCPSGRLMAWDKETMKPYEFRFEPS LGLIEDITIGASGGLWIRGGIPMRRENGQTYEIRNRVVACRCGQSANKPYCDGTHAAIKW RDELNGEPVGETLPEEVY >gi|313157655|gb|AENZ01000057.1| GENE 52 58883 - 60142 1558 419 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3666 NR:ns ## KEGG: Odosp_3666 # Name: not_defined # Def: peptidase M64, IgA # Organism: O.splanchnicus # Pathway: not_defined # 12 419 21 428 428 632 71.0 1e-179 MLFLGAGVSAAAQTPFADCFFDKTMRFDYYHAGDSRSEEYFFDALKEEPYWAGSKVSLVD TTGYGNQFFRIVDRASGREIYSRGFCTLFNEWQSTPEADSVRRSYPESVVFPYPKRPCRI EIFTRNGKGRFEKRFSQNIDPDSYFIERFTPRCETFEVMYSGNSAQRVDIVLLPEGYGAG ERAKFESACRTFADEFFSYSPYKENAARFNVRAVWAPSDDSGVTIPGENVWRNTACGASF YTFDSERYQMVTDFQRLRDMAAHVPYDYIYVLSNTQKYGGGGIFNFYGISAAHHPTRTGK IHVHEFGHLLLGLGDEYVGTTSYDDMYAKSIEPWEPNLTTLVGFGDKFWSRMVAEGTPVP TPDTEEYDGVVGVFEGGGYAAEGVYRPWRNCLMNNLHKTDEFCPVCSKAIVDYIDFLCK >gi|313157655|gb|AENZ01000057.1| GENE 53 60231 - 61202 1544 323 aa, chain - ## HITS:1 COG:BS_yqfA KEGG:ns NR:ns ## COG: BS_yqfA COG4864 # Protein_GI_number: 16079592 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 1 322 1 322 331 376 64.0 1e-104 MDFQGGLIVLIIFVSLFAIWLIFYFIPVGLWFSALVSGVRISLLQLILMRWRKVPPSTIV SSMIESTKAGLSLDPNELEAHYLAGGNVTSVVHALVSAQKANIMLDFKMATAIDLAGRDV FEAVQMSVNPKVINTPPVAAVAKDGIQLIAKARVTVRANIKQLVGGAGEETVLARVGEGI VSSIGSAASHKIVLENPDSISRVVLEKGLDAGTAFEILSIDIADIDVGKNIGAQLQMDQA QADKNIAQAKAEERRAMAVALEQENKAKAQDARAKVILAEAEVPLAMAEAFRNGNLGIMD YYKMKNIMADTTMRETIARPEKK >gi|313157655|gb|AENZ01000057.1| GENE 54 61215 - 61685 692 156 aa, chain - ## HITS:1 COG:no KEGG:BVU_3647 NR:ns ## KEGG: BVU_3647 # Name: 
not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 153 1 152 156 81 35.0 1e-14 MTMFYIVLLIFFGLLFLVAELVLLPGVSIGAILALVCYGSSIYLAFRDLGPVAGSVVVLV ILVLSLIATVVSLRAKTWQRFSLKQKINSSSMPTLPEQELSVGDRGTTLSRLSPMGKIEV NGRTYEAKSLGAYVDPRKEIEVVGFENFSVIVKTTK >gi|313157655|gb|AENZ01000057.1| GENE 55 61731 - 63281 2305 516 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157708|gb|EFR57119.1| ## NR: gi|313157708|gb|EFR57119.1| sporulation and cell division repeat protein [Alistipes sp. HGB5] # 9 516 1 508 508 902 99.0 0 MLVRKTCILFLFAALLGAGSLRAQQLSVEARIAGLESNEEYMSLLREDAQLQMREDSIVN AVERARRQLRENPAGRQQYSQDILQLESRIFEIRNAKGRLIDRINTIEQEWVLANLNGAA SQPAGPAAENPAAEIPDSLKVRNLVSNPYFREQLPEADYAALVEAQRLEFDAVDYVNRYF ANYGTISELAEAYAAAQAEADAAEIYDKYKTLQGFNRVLADSLSDAWNYIFDNKSYAYGY LMDKLGRDAILSREEERLSEAARQLSELRGETASDAVTDYFLRKKVLVGYETSVADMLGL TSARDSLRGVMAQLEGIDFRLPRIDVAERYFLDYDSVSFSSTPKYSYQHPIPECKVYEHG TIYRILLGTFNTKRAVSTFRGAYPLSYLVNDDRKWCYYAGGFATREEAEAAQKLLKARGF VRPEIVVWTDGEYRNLSKDPEAQHIAYRVEIVSSEALSDAVKAAITDTAEGCELSRVGQQ LFVVGTFDDKAVADRVAAAVTQTDPALEIKVAEIAE >gi|313157655|gb|AENZ01000057.1| GENE 56 63287 - 64732 1708 481 aa, chain - ## HITS:1 COG:FN0993 KEGG:ns NR:ns ## COG: FN0993 COG0168 # Protein_GI_number: 19704328 # Func_class: P Inorganic ion transport and metabolism # Function: Trk-type K+ transport systems, membrane components # Organism: Fusobacterium nucleatum # 62 481 63 480 483 234 35.0 3e-61 MRVDVVLRYVGVVMLFIALFMLLSAGISFVSGMDSAFYPLLLASLLTALLGAFPLIFVER TQQITNKEGFCIVVGSWLVACLVGMFPYLIWGGEFSLVNAWFESVSGFTTTGSTILNDIE VLPRGLQFWRMSSTWIGGMGVVMFALLILPSLGRNKMTLSSVELSTLAKDNYRYRTQIIV QILLVVYVGLTVLTTVLLKMAGMNWFDSLCHAMSACATSGFSTKNASVAYFNSPMIDTIL IFAMATAGIHFGLIYATVTGKRNNIFRSEVTRWYLVMLLGGGLLIAVSLFTANIYPTFSA AFHHALFQFVSLVTTTGFATADSNQWTSFAVILLIFGSIVCACAGSTAGGIKTNRLVLAM KMMRTRLRQQQHPNAIIRIKLDGVIQENEALHSVMIFIVAYLMLILAGTVLGTMFGVDLM TSFSGSVASIGNVGPGFGQVGSMDNFSSVPGVFKISSSLLMLLGRLEIFGLIQLFFIKWW R >gi|313157655|gb|AENZ01000057.1| GENE 57 64738 - 65517 1202 259 aa, chain - ## HITS:1 COG:no 
KEGG:Dole_1371 NR:ns ## KEGG: Dole_1371 # Name: not_defined # Def: histidine kinase # Organism: D.oleovorans # Pathway: not_defined # 4 250 3 243 254 75 23.0 2e-12 MTGMRRRVTIGFLSIVCLLFFSGMVSFVELSHLSRDTGEILKANKRNIELAKEMLDAAYE QNVALIRLSVFGDNSYDSLCRSSMERLENTLLVAQSEALDKSFLDSLAFATTELRLLTDN YLAFGAHAAANPAAPDSVGRSWYDSEYEVLYGRLTAAIKNYMTSTQSSLAPRAEQMKKNA YRAVTPVLISLVVMIAIVLMLYYFMSVYCVNPIIRMNKGLGDYLSFRVPFAVKADCKDEV LELKEKIETLITVSKQPKS >gi|313157655|gb|AENZ01000057.1| GENE 58 65553 - 66851 408 432 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227424643|ref|ZP_03907734.1| SSU ribosomal protein S12P methylthiotransferase [Denitrovibrio acetiphilus DSM 12809] # 1 360 1 353 435 161 29 2e-38 MQQRRVNFHTLGCKLNFSESSTLAREFERGGFVRVAPDAEADICVINSCSVTEHADKKCR NLIRKLHRRNPGAIIAVTGCYAQLKPQEIAAIEGVDIVLSNNDKGMLYKRVVELSGKGRA QIYSCDTESLTSFFAAFSSGDRTRAFLKVQDGCDYKCAYCTIHYARGGSRNMPIADLVAE ARQIAAAGQKEIVITGINTGDFGRTTGERFIDLLRALNEVEGIERYRISSIEPNLLTDEI IAFCAASPKFQHHFHIPLQSGSDKILGLMRRRYTTARFADRIAAVRALMPDAFIGIDVIV GFPGETEEDFRTTYDFLEGLKPAFLHIFPFSERPGTPAVDMPGKVQPSVATRRVAELEEL CGRLHGAFCAKAVGSEDTVLFESTLRGGMMFGYTGSYRRVKAPYDRTRINAICRVRFGAM DESHDLMGDILD >gi|313157655|gb|AENZ01000057.1| GENE 59 66954 - 67787 1094 277 aa, chain + ## HITS:1 COG:MA0025 KEGG:ns NR:ns ## COG: MA0025 COG1108 # Protein_GI_number: 20088924 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Mn2+/Zn2+ transport systems, permease components # Organism: Methanosarcina acetivorans str.C2A # 6 258 3 254 274 173 39.0 4e-43 MEFFSDLIQYGYLSNALAACVLSGITCGVIGTYVVCRRMVFLAGGITHSSFGGLGIAFYL GTDPIAGAMIFAVLSALGIEWAGSRGRIREDSAIGIIWSVGMAVGALFMSLRPGYTSGDL SAYLFGSIVTVTHGDVAALTALTLFIIAGALLWLRPVMYVAFDRDFARSRGIPTRVISYV MAALIAVTIVLSIRIMGIVLLISLLTMPVAIVNSLSRSYRTIALCAPLVAVAGNVAGLVA SYNFEVPPGAAIIFTLTLTLIVVKLLPLRNKKAGAAA >gi|313157655|gb|AENZ01000057.1| GENE 60 67784 - 69286 2188 500 aa, chain + ## HITS:1 COG:FN1031 KEGG:ns NR:ns ## COG: FN1031 COG0795 # Protein_GI_number: 19704366 # Func_class: R General function prediction only # Function: 
Predicted permeases # Organism: Fusobacterium nucleatum # 1 211 1 224 359 78 22.0 3e-14 MKTIHKLVLKAYLGPMVLTFFIVMFVLMMNIVWRYIDELVGKGLSAGIIIELMTYFMANM IPLGLPLAMLLAAIMTMGNLGENYELLAMKSAGMSLIRITKPLIILVSLISVGSFFIGNN LVPYANKKVFSILYDIRQQKQSLEFQDGLFFNGIDNMSIRVSRQEPETHLLHDVLIYDNR AANGDMNTIVADSGYIRLSDDKKYLLVTLFNGETYEQTRSSQWFTQSKLRHHIFEKQDQV IPMEGFAMGRTDANQFSNSQTKNINELQHDIDSLEKMVNSATTRSYEPLLKEQIFSRDNS VLPQPDSLRIDKSRFRDMAAMDSIATLQMREKERVWNQARTLAKNSRNMFSFDESAAKEA LNQLYRSKVEWHKKMSLPVSIMIFVLIGAPLGAIIRKGGLGLPVVVSIIFFVIYYIISLS GEKLAKEGSWDAVYGMWLSTFILTPIAIYLTYKATNDSALLDTDWYAGKFKALYERMRPA INKLKNAIKLKRNGKKHDSE >gi|313157655|gb|AENZ01000057.1| GENE 61 69261 - 70484 1918 407 aa, chain + ## HITS:1 COG:aq_350_1 KEGG:ns NR:ns ## COG: aq_350_1 COG0108 # Protein_GI_number: 15605862 # Func_class: H Coenzyme transport and metabolism # Function: 3,4-dihydroxy-2-butanone 4-phosphate synthase # Organism: Aquifex aeolicus # 7 208 9 211 211 236 58.0 9e-62 MAKSTILNSVEEVIEDFRNGKIVIVVDDEDRENEGDFIVAAEKITPEIVNFMLKEGRGVL CAPLSEDRCAELGLNMMEENNTSLLGTPFTVTVDLLGHDCTTGVSIHDRAATIRALADPA TRATDLGRPGHINPLRARQKGVLRRPGHTEAAIDLARLAGLQPAGALIEIMNEDGTMARL PQLVEIAKKFDLKIISIASLISYRLKEESIVEKGETVDLPTAWGDFRITPFRQKSNGLEH VALTKGEWTEDEPVLTRVHSSCATGDIFGSCRCDCGDQLHEAMRMIEKEGKGAIIYLQQE GRGIGLCNKIKAYKLQDEGLDTVDANVQLGFGVDERDYGIGASIIREMGIKHMRLMTNNP LKRAGLEGYGLRIDEIVPVVIAPNEHNLRYLKTKEERMHHTLGLDKK >gi|313157655|gb|AENZ01000057.1| GENE 62 70542 - 71513 1228 323 aa, chain + ## HITS:1 COG:BS_fmt KEGG:ns NR:ns ## COG: BS_fmt COG0223 # Protein_GI_number: 16078636 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA formyltransferase # Organism: Bacillus subtilis # 7 312 3 301 317 221 42.0 1e-57 MNPKELRIVFMGTPEFAVPSLRALVAGGYNVVAVVTTPDKPAGRGQKLHQSEVKLAALEL GLPVLQPEKLKAPEFVEAMQALKPDLGIVIAFRMLPEVIWAMPRLGTFNLHASLLPQYRG AAPINWAVINGETETGVTTFLLNHEIDKGAIIAQVRVPILPKDNVGTMYDRLMHTGTALV TETVDRIAAGDIQPMEQTGIDESRLHPAPKIFKEDCRIDWSWEGRRIVNFVRGLSPYPAA WTEMRKEGETASLTAKIYAAAFEGAAHNKAAGTVESDGRTFIRVACADGWITLGELQIAG 
KKRLPVRELLLGLRDIGQYRFQK >gi|313157655|gb|AENZ01000057.1| GENE 63 71576 - 72280 1040 234 aa, chain - ## HITS:1 COG:MA4103 KEGG:ns NR:ns ## COG: MA4103 COG0450 # Protein_GI_number: 20092896 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Methanosarcina acetivorans str.C2A # 8 215 1 214 219 257 55.0 1e-68 MDKNEQQLPRLGEPVPAFEAMTTQGKVSFPADYKGKWVILFSHPADFTPICTSEVLTFGA RTAEFKALNCELIGLSVDSRNSHIAWLRTIREKIEYKGMKDIKVEFPIIDDVSMKVANLY GMIQPGESQTAAVRAVFFVDPKGVLRAMIYYPLALGRNFDEIKRVLVGLQAIDAFGVAMP ADWRPGDEVIVPMKGEDMNDQPEGVKCYDWFFCTRPLPKEEIEKKLGKGYEIKG >gi|313157655|gb|AENZ01000057.1| GENE 64 72530 - 73075 835 181 aa, chain - ## HITS:1 COG:MA2909_2 KEGG:ns NR:ns ## COG: MA2909_2 COG1014 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, gamma subunit # Organism: Methanosarcina acetivorans str.C2A # 7 133 12 137 186 109 43.0 3e-24 MKETLIIAGFGGQGVLSMGKILAYSGVMQDFEVTWMPSYGPEMRGGTANVTVILSDRKIS SPIAHEFDTAIILNQQSMEKFEPMVKPGGVLIYDTNGITRHPVRTDIEVYSIDATAECAK MGQAKLFNTMILGGYLKVRPVVEMENVMVGLKKSLPERAWKMLPENEKAIHHGGEIIKKI R >gi|313157655|gb|AENZ01000057.1| GENE 65 73086 - 73847 1185 253 aa, chain - ## HITS:1 COG:MA2909_1 KEGG:ns NR:ns ## COG: MA2909_1 COG1013 # Protein_GI_number: 20091730 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, beta subunit # Organism: Methanosarcina acetivorans str.C2A # 25 251 36 262 296 246 48.0 4e-65 MAEVEIKKENLVYAKPKLITDNVMHYCPGCSHGTVHKLIAEVIEEMGLEEKTVGVSPVGC SVFAYNYIDIDWIEAAHGRALAVATAVKRLHPGNLVFTYQGDGDLSAIGTAESIHAAARG ENVVAVYINNAIYGMTGGQMAPTTLLGMKTATTPYGRDPRLNGYPYKIAEMMAHLDGATF ITRQSVHTPANVRKCKKAIRKAFENSMAGKGFSLVEVVSTCNSGWKLSPVASNEWLEQNM LPFYPLGDIKDVK >gi|313157655|gb|AENZ01000057.1| GENE 66 73853 - 74947 1659 364 aa, chain - ## HITS:1 COG:TM1759 KEGG:ns NR:ns ## COG: TM1759 
COG0674 # Protein_GI_number: 15644505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Thermotoga maritima # 8 359 8 351 356 365 53.0 1e-101 MAEKEVKLMKGNEVIAHAAIRYGCDGYFGYPITPQSEVMETLMELKPWETTGMVVLQAES EISSINMVYGGASTGKRVMTSSSSPGISLMSEGLSYLAGAELPCLIINCQRGGPGLGTIQ PSQGDYFQACKGGGHGDFHLIVLAPNSVQEMHDHVAVAFDLAFKYRNPALILADGAIGQM MEKVVLAPQRPRKTDEEIAAECKSWATYGKPADRQRNIVTSLELQSEKMEIINKRLQEKY KALEENEVRYEAIDCDDADYVIVAFGSSARICSATVEMARAEGLKVGLLRPITLYPFPKK PLAELAARGVKGFLSAELNAGQMVEDVRLAVNGAAPVEHFGRQGGMMFAPDEVLAALKEK LIKK >gi|313157655|gb|AENZ01000057.1| GENE 67 74951 - 75178 230 75 aa, chain - ## HITS:1 COG:TM1758 KEGG:ns NR:ns ## COG: TM1758 COG1146 # Protein_GI_number: 15644504 # Func_class: C Energy production and conversion # Function: Ferredoxin # Organism: Thermotoga maritima # 4 72 1 68 77 58 46.0 3e-09 MAKIKGTVVVDKERCKGCGVCVASCPCNVLGMSAEVNSKGYPVARMANPDACIGCASCAV ICPDSVITVYRQKIE >gi|313157655|gb|AENZ01000057.1| GENE 68 75336 - 75464 94 42 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157733|gb|EFR57144.1| ## NR: gi|313157733|gb|EFR57144.1| hypothetical protein HMPREF9720_1241 [Alistipes sp. 
HGB5] # 1 42 3 44 44 82 100.0 1e-14 MRRNRYLAVMLVFFVWSLFFLVWFCIGGHRGENKYGPDPKAE >gi|313157655|gb|AENZ01000057.1| GENE 69 75520 - 76764 1835 414 aa, chain - ## HITS:1 COG:CAC0016 KEGG:ns NR:ns ## COG: CAC0016 COG4198 # Protein_GI_number: 15893314 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Clostridium acetobutylicum # 1 413 1 414 414 390 46.0 1e-108 MVRIKPFRGIRPPKQYASEVASRPYDVLNSAEAKAEATERSLLHIIKPEIDFDPIADEHS QEVYDKAVENFRAWREKGWLEQDPEEYYYIYAQTMDGRTQYGLTMCCNYEDYLSGAIKKH ELTRPDKEEDRMIHVRNQKANIEPVFFAYPDNAEIDAIVAGVVENNAPEYDFTAADGFGH KLWVIRDRKTNNRITEIFKDIPALYVADGHHRTAAAARVGQECQRRNPDHTGREEYCYFL AVTFPAGQLRIIDYNRVVKDLNGLTSAQLLEKLAENFTVEDKGAEEYRPSGLHNFSMYLD GRWYSLTAKPGTYDDNDPIGVLDVTVLSNLVLDRILDIKDLRTSKRIDFVGGIRGLGELK RRVDSGEMKAAFALYPVSMQQLINIADTGNIMPPKTTWFEPKLRSGVVIHSFEE >gi|313157655|gb|AENZ01000057.1| GENE 70 76789 - 77712 1391 307 aa, chain - ## HITS:1 COG:TM0327 KEGG:ns NR:ns ## COG: TM0327 COG0111 # Protein_GI_number: 15643095 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoglycerate dehydrogenase and related dehydrogenases # Organism: Thermotoga maritima # 2 307 3 312 327 167 36.0 3e-41 MKILVATEKPFAKKAVDGIRQIVEGAGYELALLEKYTDKSELLSAVADVDALIIRSDKVT AEVIEAARNLKIVVRAGAGYDNVDLAAATARGIVVMNTPGQNSNAVAELALAMMIFMARN GFTPGTGSEIQGKTLGIHAYGNVGRLVGRKGKAMGMNVIAYDPFITDSAVFEADGVKKVN SVEELYKNSDYLSLHIPATEQTKGSIGYDLLMSMPKGATLVNTARKEVIDEQGVMKALTE REDLKYITDIAAGIQSELNEKFGRRVFATAKKMGAETAEANINAGLAAANQIVDFFKNGN TRFQVNK >gi|313157655|gb|AENZ01000057.1| GENE 71 77803 - 78876 1784 357 aa, chain - ## HITS:1 COG:BH1188 KEGG:ns NR:ns ## COG: BH1188 COG1932 # Protein_GI_number: 15613751 # Func_class: H Coenzyme transport and metabolism; E Amino acid transport and metabolism # Function: Phosphoserine aminotransferase # Organism: Bacillus halodurans # 2 355 3 360 361 350 48.0 3e-96 MKKHNFNAGPSILADEVLENAAKAIIDFNGSGLSLLSISHRTKDFDDVMAEADALFREML 
DIPENYKIFYIGGGASTFFYEVPANFLGKKAGYVNTGVWAKKAMKEAKRYGEVELLASSE DRNFTYIPKGFAIPSDLDYVHITSNNTIYGTEYHEDLVSPVPLIADMSSDILSRPVDVTK YAMIYGGAQKNLAPAGVAFAIIREDMLDKIVRDLPTMVSFRTHVENGSMFNTPPVFPIYV LKETLKWLKSIGGVPEIYRRNQAKAALLYDEIDRNPLFRGTVDKEDRSLMNICFVMADGY EELAPEFMEAAKSRGIVGIKGHRLVGGFRASCYNALPVESVRVLVECMQEFAAKHAK >gi|313157655|gb|AENZ01000057.1| GENE 72 79067 - 79939 1287 290 aa, chain + ## HITS:1 COG:BS_yloC KEGG:ns NR:ns ## COG: BS_yloC COG1561 # Protein_GI_number: 16078630 # Func_class: S Function unknown # Function: Uncharacterized stress-induced protein # Organism: Bacillus subtilis # 1 289 1 290 291 135 34.0 7e-32 MVKSMTGFGKGEAALQNKKITVEIRSLNSKQLDLGLRLPAVYRQSEYEIRNIIARTIQRG KVDVFVTVESQAVETPARINKEVFREYLHQMTDTLAFAGIDADYDAIVPVIMRLPEVIST ETESISDEEHAALIAATEAAAARLDAFRMQEGAILIADLLGRVDRIESYKEEVVPFEKAR TETIKARILDNLEKLQADVDRNRLEQEMIFYLEKLDITEEKVRLANHCRYFREVAAGEEG AGRKLGFIAQEMGREINTMGSKANESNIQILVVKMKDELEKIKEQVLNIL >gi|313157655|gb|AENZ01000057.1| GENE 73 79939 - 80508 551 189 aa, chain + ## HITS:1 COG:CC1681 KEGG:ns NR:ns ## COG: CC1681 COG0194 # Protein_GI_number: 16125927 # Func_class: F Nucleotide transport and metabolism # Function: Guanylate kinase # Organism: Caulobacter vibrioides # 2 180 10 187 213 143 36.0 2e-34 MGKVVIFSAPSGSGKTTIVRELLKRFPQFEFSVSATSRAPRGCERHGCDYHFMTHEEFMQ AVAENRFVEWEEVYKGTCYGTLRSEVERIWAKGNIIVFDVDVIGGINLKRIFGGDACSIF VMPPSVEELRRRLEGRGTDAPEVIDRRVAKAEFELTKAPEFDHIVVNDSLDEAIAETTRI IDEFITGRS >gi|313157655|gb|AENZ01000057.1| GENE 74 80505 - 81314 1128 269 aa, chain + ## HITS:1 COG:RSc2193 KEGG:ns NR:ns ## COG: RSc2193 COG1057 # Protein_GI_number: 17546912 # Func_class: H Coenzyme transport and metabolism # Function: Nicotinic acid mononucleotide adenylyltransferase # Organism: Ralstonia solanacearum # 3 195 11 225 231 90 33.0 3e-18 MKRVMLYFGSFNPVHKGHIALAEYVVEQGLCDEAVLVVSPQSPYKRAAELAPEMDRFEMA ERACAASRLPERIKPSVVEFLLPKPSYTIDTLRYLTENHGAEMEFSILMGADQLERLDGW KEYEKILEYPIYVYPRRGEQVGRFAGRITVLEDAPLQDFSSTEVRGRIERGEDVSQMLDA 
GVAEYIRRKGLWSPAARKAALTAQIAAEPENTELYTERGKLHYRLNEWGAALNDFNRVLQ LDDSHAEARQYAQMVQEILEFRYKDIYNP >gi|313157655|gb|AENZ01000057.1| GENE 75 81341 - 82303 1483 320 aa, chain + ## HITS:1 COG:HI1140 KEGG:ns NR:ns ## COG: HI1140 COG1181 # Protein_GI_number: 16273066 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanine-D-alanine ligase and related ATP-grasp enzymes # Organism: Haemophilus influenzae # 1 309 11 297 306 172 32.0 7e-43 MAGGDSPEREIALQSAAQIESALDHEKYDITVIDLHHRDWHYTAPDGRQWQVDKNDFSLT VEGERKVFDYALIIIHGTPGEDGRLQGYLDMMGVPYSSCSMTSSVITFDKVTTKRTVAGR GINLAREIFLCKGETADPDEVIAEFGLPLFIKPNASGSSFGVTKVHTRDEVLPAVEAAFA QSDEVLIEECIEGREMGCGMMVAGGREYLFPITEIVSKKDFFDYQAKYTEGYSDEITPAQ IAPEIAEELHRMTRIAYKACRCSGVVRVDFIVTPEGKPYMIEINSIPGMSAGSIVPKQAK AMGMTLGEMFDLVIADTCRK >gi|313157655|gb|AENZ01000057.1| GENE 76 82300 - 82872 673 190 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3353 NR:ns ## KEGG: Odosp_3353 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 187 1 186 196 205 54.0 8e-52 MIEIDDKIVSADLLRECFACDIAACKGICCVEGNAGAPLEAEEVDILEREYEAYRPYMTP EGIEAVERQGFMVVDEDGDLTTPLVDDAECAYTYRENGITLCAVEKAWLEGKTAFRKPIS CHLYPIRLMRFSNGTVGLNYHRWSVCAPARECGRKLGIPVYKALREPIVRRFGEEFYKAL EAAEELIRQQ >gi|313157655|gb|AENZ01000057.1| GENE 77 82880 - 83950 1492 356 aa, chain + ## HITS:1 COG:slr0993 KEGG:ns NR:ns ## COG: slr0993 COG0739 # Protein_GI_number: 16331215 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Synechocystis # 138 252 589 701 715 104 48.0 2e-22 MKKYCLTCMAAVAALTAAAAEPQAAEKTDSLHIAALSPEEAAAVDSIAALRLRLAPKQIE SIFDTNDVVVLDTLDSGNDAVQVVLYSNNTWKYVRNREVAKDSTIFEKYWDTKTLFPYKE VDMSSMPQSVVIDLIDSLTGYHCPYQGSVHPRGKYGPRRRRQHQGVDLPLKTGDPVYATF CGRVRISEYNKGGYGNLVIIRHDNGLETYYGHLSERMVEPNQWVEAGQIIGLGGSTGRST GPHLHFETRYYGQSFDPERLIDFKNGILSRETFLLKKSFFSIYSNAGQDFDDEIANEEQD KKEAAEKAAMKYHRIKSGDTLGAIARRYGTTVNNICRMNNIKSTTVLRIGRSLRVR >gi|313157655|gb|AENZ01000057.1| GENE 78 84072 - 85067 
1088 331 aa, chain - ## HITS:1 COG:no KEGG:Lbys_0372 NR:ns ## KEGG: Lbys_0372 # Name: not_defined # Def: inosine/uridine-preferring nucleoside hydrolase # Organism: L.byssophila # Pathway: not_defined # 1 327 1 335 338 292 47.0 2e-77 MKKLLILAALAALCTPAVRAQQTPAAPLRVIFDTDMGNDVDDPLALDMLYKAVDRGEITL LGILSSKDTEFSPRYIDMMNTWYGYPEIPVGRVRDGVVLKRDDYARAVCESGLFPRSRRD RDYGDPVTLYRRLLAESPDSSVVVVSVGFSTNLGRLLESRGDRYSPLDGIELVKRKVKSC SVMGGSFGDKPRAEYNIVNDIPNAKRLFALCPVPVAVVSLELGKTVNYPGASIETDFAWA GKHPMVEGYKAYRKMPYDRPTWDMMSVLYVLRPEMFGVSEPGIICVDDQGYTYFTPTPRG KHTVLTLTPGQQDAVLRFFVRELSSKPAKYR >gi|313157655|gb|AENZ01000057.1| GENE 79 85081 - 86091 1522 336 aa, chain - ## HITS:1 COG:no KEGG:BT_2809 NR:ns ## KEGG: BT_2809 # Name: not_defined # Def: putative integral membrane protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 336 1 334 334 407 68.0 1e-112 MFVVNDYATAVILCIVTMLCWGSWGNTQKLAGRTWRYELFYWDYVVGMLLFSLILCFTLG SIGSEGRPFLEDLAQAAPKALGSVLLGGVIFNASNILLSASVSLAGLAVAFPLGVGLALV LGVVINYMGAPKGDPVILFLGVALIVIAIVCNGIASGRVRKEPGSSAQNRKGILLAVLAG VLMSFFYRFVAAAMDLDNFESPTPGMLTPYSAIFIFSVGVLLSNFVFNTLVMKRPFVGTP VGYDAYFKGSARTHLVGMLGGAVWCLGTAFSYIAAGKAGAAISYALGQGAPMVAAVWGVF VWKEFRGAGRSVGWLLAVMFLFFILGLACIILSGGN >gi|313157655|gb|AENZ01000057.1| GENE 80 86102 - 87052 999 316 aa, chain - ## HITS:1 COG:HI0505 KEGG:ns NR:ns ## COG: HI0505 COG0524 # Protein_GI_number: 16272449 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar kinases, ribokinase family # Organism: Haemophilus influenzae # 2 297 3 298 306 224 42.0 2e-58 MEKIVIVGSANTDLIVRADRIPSPGETVLGGEFRIVSGGKGANQAVAVARLGGDALFVAR IGTDLFGDELMARYTAEKMDTSHVVRDSQAPTGVALITVDNSAENCIVVAPGANARLSRK DIDDVRPELAKAGYLLIQLEIPLETVEYAIQTAAELGVRVILNPAPAAQIDEKYLKYVYL LTPNESECALLTGRPVLNGTDAAAAAETLLNKGVKNVIVTCGSRGALVKNADICTVVPAC RVSAVDTTAAGDVFNGALTVALAEGLPLLDAARFATHAAALSVTRIGAQSSIPTRREVDD ARFANNGRSADRMTTL >gi|313157655|gb|AENZ01000057.1| GENE 81 87486 - 88502 1092 338 aa, chain + ## HITS:1 COG:BS_degA KEGG:ns NR:ns ## COG: BS_degA COG1609 # 
Protein_GI_number: 16078147 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Bacillus subtilis # 1 332 1 333 337 188 30.0 1e-47 MKETLVTISKRTGFAISTVSRVLNGQAEKYRISRKTVELILAEARRCDYTPSLLAKGLRM KRTNTLGLLIPQVSNPYFADIASAIIREARSHGYTVMVVDTMENEENEQEGIRTMLSRRV DGIVAVPCGRTPDHLEKVNRSVPVMLVDRSFERTSLPYVCTDNYRGGVEATRILLQRGHR RIACIQGVPHSMPNRRRVQGYLDALRQAGLQEEAIITGDAFSLENGYRQTKLAVAGPDRP TAIFALSNTILLGAVKAIRESGLRIPEDISVVSFDDNPYLDFLVPAITRIGQPVEEIGKT AVRLLLESIREEVRCRTQLQLPPERIVRDSVANPRTAK >gi|313157655|gb|AENZ01000057.1| GENE 82 88663 - 89445 915 260 aa, chain - ## HITS:1 COG:TP0797 KEGG:ns NR:ns ## COG: TP0797 COG0345 # Protein_GI_number: 15639784 # Func_class: E Amino acid transport and metabolism # Function: Pyrroline-5-carboxylate reductase # Organism: Treponema pallidum # 1 255 1 258 263 189 42.0 6e-48 MKVGFIGFGNMAQALAEGFAATGALKPCQIGACARDRAKLRRNTEPKGFLAFDDAARVAE FADMVIVAVKPHQVEAVLSPVKELLAGKIVVSVVAGMTFGKYETILAPGTHHLSTVPNTP VAVGEGIVVCERLHSLSDAEWKETERLFSHVGLVMQVDTPLLGLAGTICGCGPAFVAMFI EALADAAVKHGIARADAYRMVSQMVVGTGKLQLATGQHPGAMKDAVCSPGGTTIVGVAEL ERKGFRGAVIDAVDAIQDKK >gi|313157655|gb|AENZ01000057.1| GENE 83 89557 - 90222 830 221 aa, chain - ## HITS:1 COG:CAC2121 KEGG:ns NR:ns ## COG: CAC2121 COG0325 # Protein_GI_number: 15895390 # Func_class: R General function prediction only # Function: Predicted enzyme with a TIM-barrel fold # Organism: Clostridium acetobutylicum # 1 221 1 220 221 167 41.0 2e-41 MSIACQLSFVRSTLPEDVTLVAVSKTHPVEAIREAYDAGHRVFGESRPQELREKYEALPQ DIEWHMIGHLQTNKIKYIAPFVSLIHSVDSARLAEAIQREAAKCGRTIEILLEIHVADEE TKSGWEIGELMRYVGTAPFARMPDVCVRGVMGIATNTDDGEVIRRDFTELKRCFDLLQPY FGARFNVLSMGMSHDYPFAVECGSNMVRVGSLIFGERDYAK >gi|313157655|gb|AENZ01000057.1| GENE 84 90323 - 91582 1276 419 aa, chain - ## HITS:1 COG:no KEGG:BT_1856 NR:ns ## KEGG: BT_1856 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 419 16 439 439 226 36.0 2e-57 MKKYLLLLLLGTFLCACSSDDGEGIPFVTDVVMPSEARTFAPGDEVTVSAKGFEAGDDIM 
LRIAWPLTNAAISEGYADGVWGVVTSRTESSITFLAPGGYPAGTAEVKLFRRGRAMPLGR ISVSDGTPPASYTLYGVRNDDTGEVAISEFDMQTGEIVRTERFPGSHFIHPVNRPGSNCI FGLSLDGGKRSAAFYDLTMRYFRNSGSDRVVTTGVLPNSEAAYLLCEDNYCIILGMTETR TSVVVPPSWRMPEGIGAGMLTGSPFVMNSDGYVMLSVNNAEGCYTPMVFGMRSAGAEARL GDPVYADGMIPFWIVESASDAAGPKHSVCCGYAVVKDGATELRLYDPVAMTFGEVLAAVP AAVRSVTAVTFPDSDIQDRIYMLCDLGDGGSRVQVYDRAAKSLDIFSGTVYCTEIVLVR >gi|313157655|gb|AENZ01000057.1| GENE 85 91764 - 93458 2508 564 aa, chain + ## HITS:1 COG:MA3377 KEGG:ns NR:ns ## COG: MA3377 COG4690 # Protein_GI_number: 20092191 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidase # Organism: Methanosarcina acetivorans str.C2A # 22 504 2 538 574 177 27.0 3e-44 MKKLRYCLAATAVFAYGATAACTNFIVTKGASTDGSAMVSYAADSHALYGALYHTPGGSH KAGTMMPVYEWDTGRYLTDIPQLRETYSTVGNMNEHSLIIGETTYGGRPELADSTGLIDY GSLIYITLQRAKTAREAIATIAELANTYGYASSGESFSIVDRDEAWIMELIGKGYKADGK GGNANKGIVWVARRIPDGYVSAHANQARITTFPKNDPENCLYAPDVVSFAREMGYYDGPD ADFSFCDAYAPLDFGALRACEARVWAFFRTVADDMDRYADYAMGHNAANRMPLWVMPREK VSPKTVFDCMRDHYEGTPMDMTADIGAGGSACPYRWRPMEFEVDGVSYVNERATATQQTG FWFVAQARPWNPADMGILWFGVDDAATSCLTPIYCSAQEVPACLSEQNGSMLEYSPTAAF WLFNRVTNFAYMRYDMISADIRKVVDKWENGMLETVREVDAEALSLSPKARGKFLTAFST ATAQQLFDRWSKLDKYLLVKYMDGNVKSEKADVLTFLDGDGGPAHFVDNGNGRNIPDKIQ FPGYNEKWKRAVAKDNGEILRVVK >gi|313157655|gb|AENZ01000057.1| GENE 86 93546 - 94922 1411 458 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163788782|ref|ZP_02183227.1| 30S ribosomal protein S1 [Flavobacteriales bacterium ALC-1] # 2 458 3 458 458 548 59 1e-155 MKYDIIVVGSGPGGYVAAIRASQLGRKVALVERAEAGGVCLNWGCIPTKALLKSAQVYTY CKSAAHYGLDLTGEVKPDLEKIVARSRGVAETMSKGVAFLLGKNNIDLIPGFGRLTAPGK LDVDGTEYEADHIVLATGARPREMAFMPIDGERVISSRQALTLAKLPETMIVVGSGAIGS EFAWFYAALGVKVTVVEYMPRMMPLEDEEVSKTMERAFRKLRAAVLTSTTVKSVRVNAEG RCEVEIEGKKGAETLTADIVLSAVGIKSNIENIGLEELGVAVERDKVVVDQFYRTNVPGV YAIGDIVGGPALAHVASAEGICCVEAICGLNPAPVDYSTIPSCVFTSPEVASVGMTEQQA QERGIAYKTGRFPFTASGKATAAGDRDGFVKLLFGEDDTLLGAHMVGMNVTEMIAEPTLA RMLGATGHRIARTIHAHPTMNEGVMEAAEAALGAAIHL >gi|313157655|gb|AENZ01000057.1| GENE 87 
95006 - 97045 2253 679 aa, chain - ## HITS:1 COG:mll3725_2 KEGG:ns NR:ns ## COG: mll3725_2 COG0642 # Protein_GI_number: 13473203 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 440 679 61 310 328 120 33.0 8e-27 MKKFLWTLLFVGYVLSCSAFSPKDRLIDSLRAECETEQDAPERVALLLNLKDLTDASKDE MHYSRRLFREACDAGDGFAVGASLGSLASYCIGNGGESRDSLDLLLAEAEPLLQNSSMEG LATYYRMVWLARRIQTAPHEKSQQYCREYLDSIRTAPAGDVYQEAARLFLSGVAAYRLAS AKGQMQMERGLPYWNDELALLPEMRPTARRNFHANLITCLISVYSKFRDQEALVRTADGY LAMLDAYYRDPEIVRRRPYISQEMSYLVCYYTMCTTAVLDKSTAREYYGRYSRFMQSAAA DPNNILTNRQGFYSISTEYFTRQEDYETALAYNDSLIMLTRPSGPSPLLVGMYKRRARLC ERMGQYREACRAYNTMTALRDSLPARDYAQKISEMEVRYGLDKVERDRALILAQKRQNTL WFIGVILLIAIVVLIFLWRNLLHIKRLQHNLRIESQRAQESDRLKSDFMGSMSHEIRTPL NAINGFAELIAEGGLSQAECAEFAQIIRDNTRLFTTLINDMLEVAQLDNTIAELPKMPMD ICRIIRAETELLPPKEGVEYRMNFEVPEVIVPLHRGYMTELIRELLKNAVKFTGHGTITV DCGKPENGELTFSVSDTGCGVGAEWADRIFERFYKKDPFGQGLGLGLSLCRLIVEKLGGT IRLDAAYTQGARFVVTLKV >gi|313157655|gb|AENZ01000057.1| GENE 88 97151 - 97339 361 62 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515703|emb|CBK64913.1| ## NR: gi|291515703|emb|CBK64913.1| hypothetical protein AL1_27710 [Alistipes shahii WAL 8301] # 1 62 1 62 62 92 83.0 1e-17 MIGTIIWLIGVACAIWCVMDIFKKNISTAGKVIAAIVVLLTSWLGLAVYYFYGRNHLEEW FR >gi|313157655|gb|AENZ01000057.1| GENE 89 97359 - 98123 1133 254 aa, chain - ## HITS:1 COG:CT463 KEGG:ns NR:ns ## COG: CT463 COG0101 # Protein_GI_number: 15605190 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Pseudouridylate synthase # Organism: Chlamydia trachomatis # 5 245 7 250 267 160 36.0 2e-39 MRYFLELRYNGAAYCGWQRQPDMPSVQQTLERALTTLLREKIEVTGAGRTDTGVNASYYV AHFDCEEPVPDPAQVVYKLNFLLPGDIAVGSMTPVGADAHARFAAREREYRYYIEPRKNP FTRDATWQYYVPLDMERMNEAAASLLEYDDFTSFAKLNSNNKTNICRIMHAAWTADERGV LCFTIRADRFLRNMVRSLVGTLVDVGRGRYTPDGFREIVGSRDLSRSSGGAPAQGLFLSD VVYPPGVFERKTFY >gi|313157655|gb|AENZ01000057.1| GENE 90 98163 - 99593 1847 476 aa, chain - ## HITS:1 COG:Cj0087 KEGG:ns 
NR:ns ## COG: Cj0087 COG1027 # Protein_GI_number: 15791475 # Func_class: E Amino acid transport and metabolism # Function: Aspartate ammonia-lyase # Organism: Campylobacter jejuni # 8 467 3 462 468 528 55.0 1e-149 MEKEQKKTRTECDLLGCVEVPENVLYGVQTMRGLENFPISKFRLCEYPLFIRGLAIVKLG AAVANHDMGLLSDELFNAISQACREIMEGKFHEFFPVDMIQGGAGTTTNMNANEVIANRA LEIMGHARGEYQYCSPNDHVNCSQSTNDAYPSAIHLGLYATHLELRKHLMELIASFARKG EEFANVLKMGRTQLQDAVPMTLGQTFHGFASVLQGEVENLDHAASEFLSINMGATAIGTG ICAEPGYAERCTAAIREITGWDVKQAEDLVGATSDTVCMVGYSSAMKRIASKMNKICNDL RLLSSGPRCGLGEFNLPARQPGSSIMPGKVNPVIPEVMNQIAFKVIGNELCVTMADEAAQ MELNAMEPVMAQCCFESAELLVNGFDTLRIRCVEGITANEERCLQYVHNSIGVVTALNPI IGYKNSTKIAKEALETGRGVYELVLEHGILSREELDEVLKPENMIRPVKLNIRPRR >gi|313157655|gb|AENZ01000057.1| GENE 91 99823 - 100578 1053 251 aa, chain + ## HITS:1 COG:AF0909 KEGG:ns NR:ns ## COG: AF0909 COG0289 # Protein_GI_number: 11498514 # Func_class: E Amino acid transport and metabolism # Function: Dihydrodipicolinate reductase # Organism: Archaeoglobus fulgidus # 1 251 1 257 257 103 31.0 2e-22 MKAAIIGYGKMGREIEKILIERGHDVTLVIDTDNAADLNAANLAGVDVALEFTMPSTAYA NIRTCIENATPVVSGTTGWTDRLGELQQLCREKGGALFYASNYCLGVNLMFRLNRQLAAM IDRVGGYDAKIEEVHHTQKKDAPSGTAITLAEGVISELHDKTAWVNYAPGIEHATNRIET PADAAADQLEIRSVREGMVPGIHTVTYESADDILELKHTIKNRRTLAMGAVIAAEFLCGR QGVYSMDDLLK >gi|313157655|gb|AENZ01000057.1| GENE 92 100589 - 101941 2294 450 aa, chain + ## HITS:1 COG:PM0062 KEGG:ns NR:ns ## COG: PM0062 COG0681 # Protein_GI_number: 15601927 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Signal peptidase I # Organism: Pasteurella multocida # 394 450 283 340 340 80 56.0 8e-15 MGKIKAFFRNKWVGFALASLLYTLWFVVWTGNLWMLLGLAVIFDLYITKFFYRYVWCHNA RMCQQSKVYKTVYEWVNAIIFATVVASLVHIFVFQMYVIPTSSMEKSLLIGDYLYVSKVT YGPQMPNTPLSFPFVHHTMPFSQTKKSFSEAVKWPYHRLKGLRKIKRNDVVVFNFPAGDT VLLENQNATYYDVLRSYEDSFGKEEGRKRLAEKYTIISRPVDKRENYIKRCVAVPGDSLE IRNGQVWVNGAPQEGIPGIQYQYAVQVSSPLSQYAIENLGITEYRGNGSVYDMFLTEEAA GKIEALNNVISVRRYIFSPNTEVFPQWKEPHWSQDNYGPIWVPKKGATVELTAENLPLFR 
RIIETYEGHSLEERDGKIFIDGAEADSYTFGMDYYWMMGDNRHNSADSRFWGFVPEDHIV GKASFVWLSLDANKSFPANIRWNRLFTKVR >gi|313157655|gb|AENZ01000057.1| GENE 93 101943 - 102551 848 202 aa, chain + ## HITS:1 COG:NMA0240 KEGG:ns NR:ns ## COG: NMA0240 COG1739 # Protein_GI_number: 15793258 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Neisseria meningitidis Z2491 # 6 178 4 176 203 172 44.0 4e-43 MEPEDSYLTIAAPAEASSRERSSKFLAYAYPVQQEEQIREILDGLRKKYYDATHHCYAWR LGPGGAAFRANDDGEPSGTAGKPILGQLLSNNLTDCLIVVVRYFGGTKLGVPGLIAAYRE SAAEAIAAARIVERTVDRTIRVDFPYIAMNDIMRVIKEQQPKIASQEFDNLCTMVLTIRE SRAGELTEKLKKAGGSIAGEDI >gi|313157655|gb|AENZ01000057.1| GENE 94 102620 - 105409 4020 929 aa, chain + ## HITS:1 COG:MA3325 KEGG:ns NR:ns ## COG: MA3325 COG0178 # Protein_GI_number: 20092139 # Func_class: L Replication, recombination and repair # Function: Excinuclease ATPase subunit # Organism: Methanosarcina acetivorans str.C2A # 5 926 2 932 993 823 46.0 0 MTHEKSIYIKGARVHNLKNIEVEIPHDKLVVVTGLSGSGKSTLAFDTIFAEGQRRYVESL SAYARQFLGKINKPDVDIITGIAPAIAIEQKVNTRNPRSTVGTTTEIYDYLKLLFARIGH TFSPVSGQEVRCYSVDDVATYVQELGEGSRAVIAAPLVLAEGQGIIEKLTLLLSDGLMRV YTDGQTRLIEEILPSIDETTGAADIQVVIDRMRIAPDDDTQTRIRDSVARAFSYGDGICT VISDKGAAEFSSRFEADGIEFEHPSEHLFSFNNPLGACPRCEGYGKVIGIDEDLVIPDKG KTIYEDAIACWRGETMRKWKEKLVENAYKFDFPIHTPFHELTQEQKRLLWRGNQYFHGLD EFFAYIDSERRKIQFRVMKARYTGKTTCPECGGSRLRKEALYVRIGGRNIADLVVMPVDE LIVFFNGLQLDEHDTKTAARILIEIRSRLQYLTDVGLGYLTLDRLSSTLSGGESQRINLS TSLGSNLTGSLYILDEPSIGLHPRDTNRLIKVLRQLRDLGNTVIVVEHEEEVIRAADYIV DIGPKAGYNGGEVVFSGTLAQLLKNKKSLTADYLTGRRAIAPPAAERGWSSSIVIKGARE NNLRNIDVRIPLGVMTCITGVSGSGKSSLAKGILYPALRRMLFDTGVKPGDFDGIVGDVQ LLKSVEMIDQNPIGKSSRSNPVTYIKAYDEIRRLFADQPYAQHTGLGASAFSFNIAGGRC EECQGEGVIKVSMQFMADVELVCEACGGKRFRDEILEVKYRGKSIYDVLEMTVDDAIAFF GEEKKDSTCKRIVERLKPLQDVGLGYIKLGQSSSTLSGGESQRVKLASFLTKDSAQGGVM FIFDEPTTGLHFHDINKLLAAFNALIERGHTIVIVEHNMDVIKCADWVVDLGPEAGTGGG RVVFEGTPRALEQCPESYTGKFLSLRTKL >gi|313157655|gb|AENZ01000057.1| GENE 95 105412 - 106626 1606 404 aa, chain + ## HITS:1 COG:BS_ymxG 
KEGG:ns NR:ns ## COG: BS_ymxG COG0612 # Protein_GI_number: 16078734 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Bacillus subtilis # 6 396 5 393 409 181 30.0 2e-45 MEFFTYRLPNGIRGIHRQVKSNVAHCALVVNAGSRDEHADQYGLAHFTEHAFFKGTRRRR AWQVNCRLENLGGELNAFTTKEDTTIHATTLRGDFPKAVELIADIAFRSTFPDRELEREK EVIADEINTYKDSPADLIYDTFEEMLFAGSELGHNILGRKAALMRYDGDAIRAFTGRTHT TDQMVFSSIGNFSAKTAETVAARYFAQQAATTRGFVRAATAPRPPFEKTVVKHTHQTHCI IGGRAYGIGEEKRLPLALVTNILGGPCANSLLNVVVREKNGLSYNIEASYTPYSDTGIVA IYFSSENGNTAQCIDLIEGELRKLRTTPLTGRQLSMAKKQFIAQLAISSESNEGYMLGAG KSFLTHDDVDTMEQVYAKVRSLTAVQLTEVAEEVFSGMSRLIYK >gi|313157655|gb|AENZ01000057.1| GENE 96 106631 - 107227 734 198 aa, chain + ## HITS:1 COG:BS_ydeA KEGG:ns NR:ns ## COG: BS_ydeA COG0693 # Protein_GI_number: 16077578 # Func_class: R General function prediction only # Function: Putative intracellular protease/amidase # Organism: Bacillus subtilis # 3 195 2 184 197 126 38.0 2e-29 MEKEIVFVLLDEFADWEAAFLAPALCSGVMPGHPGSYAAKYMSPDGESVRSIGGLRATPD YDASTLPESCAGLILVGGMQWESAAARRIAPLAGEALKRGILVGAICNAVSFMAANGLLN SVRHTGNTVEMLKQWGGANYTGEVLYEERQAVRDGNVVTANGTGYLEFTRECLLALKADT PDRIEASYKFNKYGFCRQ >gi|313157655|gb|AENZ01000057.1| GENE 97 107227 - 107874 833 215 aa, chain + ## HITS:1 COG:aq_1507 KEGG:ns NR:ns ## COG: aq_1507 COG4122 # Protein_GI_number: 15606661 # Func_class: R General function prediction only # Function: Predicted O-methyltransferase # Organism: Aquifex aeolicus # 2 213 7 210 212 152 38.0 6e-37 MDALEQYIRDHTSPEEELLHELDRETNLRVIQPRMLSGHIQGRLLEMLVRMLRPRRVLEI GTFTGYSALCMAAGLEGEAELHTVEVEDELEELAQSFFDRSAHGSRIRLHIGSALDIAPG LGGVFDLVFIDGDKREYPDYYRMLMGDDGGCPLVRSDSVLIADNILWSGKVVQPIAHNDR HTQAVMEFNRMVREDPRVENVIVPLRDGLNLIRIK >gi|313157655|gb|AENZ01000057.1| GENE 98 107979 - 108299 569 106 aa, chain + ## HITS:1 COG:SA1292 KEGG:ns NR:ns ## COG: SA1292 COG1694 # Protein_GI_number: 15927040 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Staphylococcus aureus N315 
# 3 99 4 101 105 78 44.0 4e-15 MELSELQRRVDAWIKEYGVRYFSELTNMAVLTEEVGELARIMARKYGDQSFKEGEKCNLA DEMADVLWVLTCLANQTGVDLTAAMEANFAKKTSRDKERHRSNPKL >gi|313157655|gb|AENZ01000057.1| GENE 99 108465 - 109376 953 303 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157667|gb|EFR57078.1| ## NR: gi|313157667|gb|EFR57078.1| hypothetical protein HMPREF9720_1272 [Alistipes sp. HGB5] # 4 303 1 300 300 574 100.0 1e-162 MKNMFIRTAALVAACIYMGSAAAQEPRRPEAQLRFDEARQEAIADLPSPAFAADSLAAAR PFAYKLNRRRTTEAAAYDAVTLWNVRKVKAKNRRMRLSDTTRREKVRLLKFFTERNRKKG VRPHLLGNVAGYTWADDVYDEPGHKWRSAPKHIAAYTLRDEYDSVRRAGGELSLFKVPKE LWRTLPPDAVYLLDGERVPGSSFQFIDGLILRTLEIYTDSATMARYGAGRGVVIGDIYPD RVPLVVFNGAFSSIESWLKMCRADAFSVSADVPMHYFYMLPVEAVQTYGVRGKYGAICID IVE >gi|313157655|gb|AENZ01000057.1| GENE 100 109508 - 111250 2962 580 aa, chain - ## HITS:1 COG:CAC2337 KEGG:ns NR:ns ## COG: CAC2337 COG1109 # Protein_GI_number: 15895604 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Clostridium acetobutylicum # 8 549 2 547 575 474 48.0 1e-133 MANELEQSVLKKAQAWLDGHYDEATKKQVKYLMDNDMKELVESFYKDLEFGTGGLRGIMG VGSNRMNVYTVGAATQGLANYLRKNFAGEQIRVAVGHDSRNNSRMFAERVADIFASNGFT VFLFDALRPTPELSFAIRELKCHSGVVVTASHNPKEYNGYKAYWTDGAQVTEPHDKNIIA EVAKITDVNMIQLGKNPQNITILDEKFDEIYLNKVHELSLSPESVKKHHDMKIIYTPLHG SGVRLVPESLKKFGFTNVKLVPEQAVIDGNFPTVESPNPEERKTMSMAIDLAAKEGADLV LATDPDSDRIGVALRNKKGEYVLLNGNQTLVLLLSYQLTRWAERGKLDGNQYVVKTIVTS QMANAVADYFKVKCYDCLTGFKYIAKIIRENEGKAKYIGGGEESFGYLAGDYVRDKDAVS ACSLAAEAAAWAMDTMGLTLYEWLQELYVKYGFFREGLVSVVRKGKEGAELIQKMMVEFR ENPPKTIVGSPVVKINDFLSLETTDVKSGSKTPIEQDKSNVLQWFTEDGSIVSVRPSGTE PKIKFYFGVKAPLESVADFERVQAELDAKIEAIKKDLKLE >gi|313157655|gb|AENZ01000057.1| GENE 101 111507 - 112889 2120 460 aa, chain + ## HITS:1 COG:CAC0001 KEGG:ns NR:ns ## COG: CAC0001 COG0593 # Protein_GI_number: 15893299 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication initiation # Organism: Clostridium acetobutylicum # 1 456 8 442 446 298 37.0 1e-80 
MWQNCLDRIKAQTSEEEFVKWFQPIVPLEFDGTNLRLRVPNESYVHQIEKNYIPFLRPII SQLYGQQTRLHYAVPRSAAQAVPVTADADTTAISRFNTQTNTANIKNPFVIPGLKKIIID PQLNPGLTFATFIEGECNRLARSAGMSVAVNPGNNPFNPLYIYGNSGLGKTHIVQAIGHE VRQRHPELQVLYVSMNKFQAQFQTAYKNGEIPDFIHFYQMIDVLIIDDIQELTGKTGTQN AFFNIFNHLQLAGKQLILTSDKPPVELKDIEQRLLTRFKWGLSAQLNTPDHETKLKIIRV KAQKLGAQISDDVVAYLADNISANVREIEGALSSLVANASFLGRKITTSLAKEILKVYVQ LYQKEITIDHIIKVVCEYLNLDFERFNSTERTREIAQARQIAMYLSKQHTKAPLTTIGSA IGGRNHATVLHSCKAVSNLIETDKAFRRQVEEIEKKVLAQ >gi|313157655|gb|AENZ01000057.1| GENE 102 112963 - 113478 318 171 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157763|gb|EFR57174.1| ## NR: gi|313157763|gb|EFR57174.1| hypothetical protein HMPREF9720_1275 [Alistipes sp. HGB5] # 1 171 14 184 184 313 100.0 3e-84 MRKVYHFKMLRTAALILIGAPAFGAAGLLLIRLHPGNLPVRIVGYAGILFFGGCFAAGLL VLRRLARNSEALTLTPEKLTVRQPNGREYALKWNEIRSFREADINGQPFIAVITSEPQKP AVQSHNRLRLWLRRLDERLTGSAINISPSSIHCRRDELYATLCEYLEKYGR >gi|313157655|gb|AENZ01000057.1| GENE 103 113485 - 114381 1244 298 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0659_A5182 NR:ns ## KEGG: HMPREF0659_A5182 # Name: not_defined # Def: hypothetical protein # Organism: P.melaninogenica # Pathway: not_defined # 1 297 23 319 328 186 34.0 1e-45 MDQSKLERLLRLMKLLTANTTYDIDQLAERLQMSRRTVYRYIDTFREAGFVIKKSGNCIR LDKESPHFRDISQLVHFTEEEAVILKSAIENIDDTNMLKQNLKRKLYSVYDNKTLADTVV RGKNSPNIRTLIEAIDEHRQVILHGYQSAHGGEVRDRRVEPFAFTTNYVQVWCYDPEAHA CKLFKTSRIGSAELTAAPWEHEPEHCEGFIDAFRMHGGARRRVRLELGLLAYNLLCEEYP LAERDVRPLGRGRWLLDTEVAGFAGVGRFVVGLLDDIRIVDSPELTTYIHNYIRANIS >gi|313157655|gb|AENZ01000057.1| GENE 104 114436 - 114969 794 177 aa, chain + ## HITS:1 COG:BH2720_2 KEGG:ns NR:ns ## COG: BH2720_2 COG0662 # Protein_GI_number: 15615283 # Func_class: G Carbohydrate transport and metabolism # Function: Mannose-6-phosphate isomerase # Organism: Bacillus halodurans # 44 177 29 162 164 168 57.0 5e-42 MKTGSILTGALCMALAAGSCCRQTPAAETAGKQLTTAETIVFKDYGAEPTVLDIESYTLA NENFRTVLWTGGNLQVTVMAIPVGGDVGLELHNGIDQFLRVEEGTAQVMMGDSADKLDFV 
KEVKDDYAIFVPAGKWHNIVNKGDKPLKIYSIYAPAEHPHGTIHKTQQESMEAEHDH >gi|313157655|gb|AENZ01000057.1| GENE 105 115092 - 115304 286 70 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157753|gb|EFR57164.1| ## NR: gi|313157753|gb|EFR57164.1| hypothetical protein HMPREF9720_1278 [Alistipes sp. HGB5] # 1 70 11 80 80 137 100.0 4e-31 MGTAYKIRCRHCGAQFEHYMQPGYGVLPMCVGCGEYVETETAIRCPACLKKLNTTQEEFN EQIEVTYMWD >gi|313157655|gb|AENZ01000057.1| GENE 106 115388 - 115849 710 153 aa, chain + ## HITS:1 COG:BMEI1049 KEGG:ns NR:ns ## COG: BMEI1049 COG1225 # Protein_GI_number: 17987332 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Peroxiredoxin # Organism: Brucella melitensis # 1 153 1 152 154 166 54.0 1e-41 MTQLQAGDMAPDFKSTTQDGETLTLADLKGQRTILYFYPKDNTSGCTLEAQSLRDGKAEL TRRGFRIVGVSPDSEKSHRNFCAKHDLNFTLLADTDHSVCEAYGVWAEKSMYGRKYMGVL RTTFVIDAEGRIEQVFTKVDTKNHYQQILDAYK >gi|313157655|gb|AENZ01000057.1| GENE 107 115863 - 116333 469 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157736|gb|EFR57147.1| ## NR: gi|313157736|gb|EFR57147.1| hypothetical protein HMPREF9720_1280 [Alistipes sp. 
HGB5] # 1 156 4 159 159 264 99.0 2e-69 MGSILLVAGCLSGTLQPYAQEYPAGPLYDTPDELSFKARAAAQTLLFKELEPLSLQWLRE EVNPLLRARRKLRQAMQTRAEKAAATAVPMVPKRDLSKEVGLWRMSVGNTSADNWSPYPD RALDARTLRFPLPRDSRADKRPDNLKALDRMRQQKR >gi|313157655|gb|AENZ01000057.1| GENE 108 116362 - 117378 1664 338 aa, chain + ## HITS:1 COG:AGc3441 KEGG:ns NR:ns ## COG: AGc3441 COG0468 # Protein_GI_number: 15889174 # Func_class: L Replication, recombination and repair # Function: RecA/RadA recombinase # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 8 330 66 389 416 429 66.0 1e-120 MAEKVQVNADKLKVLNAVMEKIEKDFGKGSIMRMNSQEVSDVPVIPTGSITLDIALGVGG YPKGRVVEIYGPESSGKTTLAIHAIAEAQKAGGIAAFIDAEHAFDSFYAQKLGVDVDNLL ISQPDNGEQALEIADSLIRSSAIDIIVIDSVAALTPKAEIEGEMGDSKMGLQARLMSQAL RKLTSSISKTKTVCIFINQLRDKIGVVYGNPETTTGGNALKFYASVRIDIRRMSVIKDGE EQLGTRTKVKVVKNKVAPPFKRAEFDIMFGEGISKIGEIVDLGVDYGVVKKAGSWFSYGD RKIGQGRDAVKELLKNDEELRNEIEGKVREAMKAPKEK >gi|313157655|gb|AENZ01000057.1| GENE 109 117385 - 119625 2961 746 aa, chain + ## HITS:1 COG:RSc2153 KEGG:ns NR:ns ## COG: RSc2153 COG0317 # Protein_GI_number: 17546872 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Guanosine polyphosphate pyrophosphohydrolases/synthetases # Organism: Ralstonia solanacearum # 31 734 33 721 735 437 34.0 1e-122 MDYTAEDEVIIKEKWDDLLLSCTKICKNDEDWNFIKRAFFLAKEAHQGVRRRSGEPYLLH PIAVAKIVIEEIGLGVKSVVAALLHDVVEDTEYSVEDMERIFGPKIASMVDGLTKMSGVF NADTSEQAEYFRKVLLTLSDDVRVILIKIADRLHNMRTLGSMPMNKQIKITGETIYLFAP LAYRLGLYSIKSELEDLCMKYRFPQQYAEITQKLNETEASRREFIDKFNAPIIAALNRDN INYEISGRVKSIYSIWSKMQRKQIPFEEIYDLFAIRIVFKPLPFPSEKTQCWQVYSTITD IYTPKPDRLRDWISMPKANGYEALHSTVMGPDGVWVEVQIRTQRMEDIAERGFAAHWKYK HATISQDEDEFDKWLKQIRAALNSPTENAVDFLDNFKLSLYTSEIVVFTPKGEARKMPFG ATALDFAYDIHSKIGNSAISAKINHKLEPITTQINSGDQIEIITADNARPKPEWLEIATT AKAKQSIKSFLKRERQNNIERGMQMLDEKMKSLNIKLSGRVLRKIVPIYESNNKEELYSK IGAGIIDLKDLDKVLKVNSKSKILKFWTLFINKKEDEEGDDPAAINDAGETAPNAAHPAK EQPETPQFEIAECCKPIPGDKVVGYRDPKTGNIIVHKATCDELNRLATQFGRNIVKEEIK WSQHKAMSYLVTIELRGIDRMGILLDLAKVVSADFSINIREVGIHSHDGIFEGSISLYVK 
DAEGLQDVMTKLRKIKGIESVKRTLS >gi|313157655|gb|AENZ01000057.1| GENE 110 119627 - 120112 276 161 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157701|gb|EFR57112.1| ## NR: gi|313157701|gb|EFR57112.1| hypothetical protein HMPREF9720_1283 [Alistipes sp. HGB5] # 1 161 1 161 161 270 100.0 2e-71 MEDYSTVIWVVAIFAAMLFNATSQARKKAKKQAHETQKHTQHEAWPSWDTQSADEMRHPD SQETVSGAETDRLRPAMQQPAPTPGFGEVARKTADFKSSGRHAGRKNAPDGRHLHEEILH EEHAAGDKHTTAEAVAEITEDFDLRKAVIYSEILKPKFNEE >gi|313157655|gb|AENZ01000057.1| GENE 111 120417 - 120803 506 128 aa, chain + ## HITS:1 COG:MT0784 KEGG:ns NR:ns ## COG: MT0784 COG0537 # Protein_GI_number: 15840174 # Func_class: F Nucleotide transport and metabolism; G Carbohydrate transport and metabolism; R General function prediction only # Function: Diadenosine tetraphosphate (Ap4A) hydrolase and other HIT family hydrolases # Organism: Mycobacterium tuberculosis CDC1551 # 3 126 2 125 133 93 36.0 1e-19 MASIFSRIIAGEIPSYKVAEDENYYAFLDINPLTKGHTLVVPKKEIDYIFDLDDQTLAGM MLFAKKVAGKIKQEIACSRVAVVVLGLEVPHAHIHLIPIKSENDVDFHREKLKLTPEEFE EIATKLSK >gi|313157655|gb|AENZ01000057.1| GENE 112 120919 - 121059 58 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MYRRLCIYKCNYKWKYFLHCVFTSVNSLPYFRKILIFLTDFRRIIA >gi|313157655|gb|AENZ01000057.1| GENE 113 121463 - 121927 284 154 aa, chain + ## HITS:1 COG:no KEGG:BDI_1114 NR:ns ## KEGG: BDI_1114 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 154 3 149 151 85 32.0 8e-16 MREKLLDLMKNEGLKPSQLAELLGINPAGISHILAGRNKPGFDLLQKILRRFPRINPDWL LLDSDKMYRDEPPAQSSAPQPMSQPASPGGDLFGLASGRHPLEEKRQPETTEKATDDPAP QRQLPTALFTANVKRIVVLYDDQTFESFTPTTKR >gi|313157655|gb|AENZ01000057.1| GENE 114 122168 - 122662 349 164 aa, chain - ## HITS:1 COG:no KEGG:BF3191 NR:ns ## KEGG: BF3191 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 162 1 162 163 233 62.0 2e-60 MKKITNPWRGMDGYHCYGCDPNSPQGLRMEFYENGDEIVSVWHPRPEYQGWVDTLHGGIQ ATLADEISSWVVFRKFQTSGVTSKMEVRYHKPIHTTDDHIVLRASVCRQRRNIVEIEVRI 
YDRHEVLCTEAVCVYFLFPPDKARSEFHFCDCGVEAAEENPLDK >gi|313157655|gb|AENZ01000057.1| GENE 115 123052 - 123846 578 264 aa, chain + ## HITS:1 COG:no KEGG:RB2501_04605 NR:ns ## KEGG: RB2501_04605 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 8 257 1 253 265 159 38.0 8e-38 MAENDSCIKRSAIILTAFGLFFVTWYFNLGVRPFANFISTCKSIHYILHYLIAGIIPAAA LLLLHRPDTILYRLGLTHGFGRGLLFGVLSTVPMFAGYAVIGSFNRETPPDHMFTFIVVA GFFEELIFRGFVFGELFRTARWGFLPAATLTALAFGSLHLYQGDDLISASAAFGVTTAGS IFFSWIYVESENNLWSVIWPHTLMNAPWMLFSGSGSGAVGGLWANSLRLCTLLIAIALVV IYKKRKGLPYHISGKTLIINRQYV >gi|313157655|gb|AENZ01000057.1| GENE 116 123839 - 124744 977 301 aa, chain + ## HITS:1 COG:MTH604 KEGG:ns NR:ns ## COG: MTH604 COG0803 # Protein_GI_number: 15678632 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type metal ion transport system, periplasmic component/surface adhesin # Organism: Methanothermobacter thermautotrophicus # 18 301 28 295 295 168 36.0 1e-41 MYKKLVFIFIAALMGACTPRQQADEKTLYVSILPLRSLVEEIVGDDFKIEVLVPLGASPE TFEPTPRQFIGLNRAEMIFNVGLIDFETTLLSKIEERGKVVNLSQGIELIAGSCSHACTH AEQETSGGHGTSHAETHNHSHAHGVDPHVWTSPRALQKMAQNAYAAIRRAYPDSVKYETN YKRLQADLRALDARTGEKIAQSGIEYFIIYHPALTYYARDYGIRQVAIEADGKEPSAKQL TQVIRQAREDGVRRILYQSQFPASAVEIIARDIDAEYVEVDPLREDAIANIDAITDIITR R >gi|313157655|gb|AENZ01000057.1| GENE 117 124720 - 125511 227 263 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|90020817|ref|YP_526644.1| ribosomal protein S16 [Saccharophagus degradans 2-40] # 1 216 5 218 318 92 28 2e-17 NRHNHAAMNLVTLRDVGVAYDGYEALQHVDLEIAKNDFLGVIGPNGGGKTTLVKAILGTV PHTGEVRLAPELFRDKERLIGYMPQLSDFDRTFPISVLEVVLSGLQGHKGIFSRYTKADR AKALGLLETAGVADTARNPIGEVSGGQMQRALLCRAVISDPKLLILDEPANFVDNKFENE LYRTLHELNRRMAIVMVSHDIGTISSVVKEIVCVNRRVHRHRSNIITEEQLRNYDCPIQL VSHGHIPHTVLEHHPGDCCCDHE >gi|313157655|gb|AENZ01000057.1| GENE 118 125613 - 125903 548 96 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157818|gb|EFR57229.1| ## NR: gi|313157818|gb|EFR57229.1| hypothetical protein HMPREF9720_1291 [Alistipes sp. 
HGB5] # 1 96 1 96 96 160 100.0 4e-38 MANLRNLKKEIDYRLEEVVFDCDMAICFQPSKEKEVFEVMQEAVAVRNDLFTKANNPAEP HNRSLVRKHFAALRVEMVEAYDKLFEKLSKINEAKK >gi|313157655|gb|AENZ01000057.1| GENE 119 125908 - 126084 61 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MCVLKICSDKGTKSFRIIKRFMPYAAEIIPAGPCKKKGFRMRNPFYRIIRPYIFSNSL >gi|313157655|gb|AENZ01000057.1| GENE 120 126060 - 126977 959 305 aa, chain - ## HITS:1 COG:no KEGG:Poras_0858 NR:ns ## KEGG: Poras_0858 # Name: not_defined # Def: hypothetical protein # Organism: P.asaccharolytica # Pathway: not_defined # 63 241 348 530 666 85 30.0 4e-15 MKTIFTTFALSLALCAAAGATAQVRSAAPVPAADTVMLSETTDGDYLVRRFIIKRPGDTD YSVRYQINLAKLSSTLDGNSRELGDLDDFVGNLMKDTLMRVKSVEITGYSSPDGPLKFNE TLARNRAQDFKNYVDRKYGFSNKYDVTVNAVAEDWEMCRSLVAKSEVPDKQSVLDILDGS RSPDAKELALKKMPSAWNYMKKNILPPLRRVELTINYGEGSVVEQRTMIPRPKPAPQPVC EPCGCEVIDESITGIIVEMPGSDVDYKKELREARKIVREQERAARKAARQEAKAAKKSYR ELEKM >gi|313157655|gb|AENZ01000057.1| GENE 121 127134 - 127625 380 163 aa, chain - ## HITS:1 COG:RSc2408 KEGG:ns NR:ns ## COG: RSc2408 COG2801 # Protein_GI_number: 17547127 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives # Organism: Ralstonia solanacearum # 2 163 115 276 278 187 53.0 9e-48 MAERPNQKWVTDLTEFSLFGKKLYLSPIMDLYNREIISYNIAEHPTFYQTMKMLEDAIKD LPDTPELILHSDQGWQYQMKRYQYRLREKGISQSMSRKGNCLDNAVMENFFGLLKSELLY LQKFESVDHFRKELEEYINYYNNKRIKTALNNMSPVQYRTHAI >gi|313157655|gb|AENZ01000057.1| GENE 122 127746 - 127874 67 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVYRISAIVHLYCDTAVTITSLMIVVYTYNLVFLLFVFIGEL >gi|313157655|gb|AENZ01000057.1| GENE 123 127920 - 128225 104 101 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167753422|ref|ZP_02425549.1| ## NR: gi|167753422|ref|ZP_02425549.1| hypothetical protein ALIPUT_01696 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_01696 [Alistipes putredinis DSM 17216] # 1 101 70 170 170 171 99.0 2e-41 MHENHLSLTETTAKFGIYNESTLSKWERIYYEEGESGLYRDNRGKMRTKPNKQELQPQEE NDLRVENQRLRAEVAYLKKLRVLVEERIVRENGSGQKPSKN 
>gi|313157655|gb|AENZ01000057.1| GENE 124 128505 - 128801 422 98 aa, chain - ## HITS:1 COG:no KEGG:BF3840 NR:ns ## KEGG: BF3840 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 4 98 3 97 100 84 46.0 1e-15 MTEEIITRDSAAFKELRRDIVKAIRAVDILIDTHRPTIGNELYLTSEEICSIFSISKRSL QNYRDNRQIPYTTLGGKILYPQSSLYKLLEQHYMKAQR >gi|313157655|gb|AENZ01000057.1| GENE 125 128803 - 129087 245 94 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A7057 NR:ns ## KEGG: HMPREF0659_A7057 # Name: not_defined # Def: hypothetical protein # Organism: P.melaninogenica # Pathway: not_defined # 1 89 1 87 89 63 32.0 2e-09 MDCVLMETSAYKEMQAHLQRLLERVSALHSLSAQPTTVRWLTAEEVCKALSITKRALQYY RSAGIIPYTALGNKVLFRDDDIRQLLEKNLIKSL >gi|313157655|gb|AENZ01000057.1| GENE 126 129588 - 132266 4140 892 aa, chain - ## HITS:1 COG:MA0523 KEGG:ns NR:ns ## COG: MA0523 COG0249 # Protein_GI_number: 20089412 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Methanosarcina acetivorans str.C2A # 12 891 4 900 900 627 42.0 1e-179 MNAEKKIEKKYVETPLMKQYYSIKAVHPDAILLFRVGDFYETFGEDAIKASGILGITLTR RANGSATYVELAGFPYHAIDTYLPKLVRAGERVAICEQLEDPKQVRGLVKRGVIELVTPG VVLGDNILANKENAFLASVYFGRQTTGVAFLDISTGEFYVAEGSESYIDKLISNLAPKEV IYQRGYEDRFSDAFGSKLYTYRLDEWVFSEEVNREKLCKQFGTRSLKGFGVDHFTSGLSA AGAILYYLEFTEHRDIAHITSISRIDQEDYVWVDKFTIRNLELFSSNGAREKCSFADVID RTLTPMGGRLLKRWIAMPIKDPVKIGERLDVVEKFVRDADLADVVREQVALVGDMERIAS RIAAARVTPRELVQLKNSLFAVELLKAALESTDDAQLHALAGQIDLLEGVRDRIAREIYP DPLNNQIQKGGVIADGVDPELDDLRRIALHGKDYLARIQQRESEATGIPSLKISYNNVFG YYIEVRNAHKEKVPETWIRKQTLANAERYITEELKEYEEKILGAEEKMLVIEQRIYADII AHISRSLSSLLRDAAVVARIDCLQSFARLACERRYVRPVLDDGKLIDIRQGRHPVIETLL PVGEEYVPNDVMLDDKEQQIMMITGPNMSGKSALLRQTALIILMAQMGSFVPAKSAHIGV VDKIFTRVGASDNISQGESTFMVEMLESASILNNISDRSIVLLDEIGRGTSTYDGISIAW AMVEYLHNHPSAHAKTLFATHYHELNEMEQMCPRVKNYHVSVKEMGNQIVFLRKLERGGT EHSFGIHVARMAGMPVSVVSRADEILRNLELVYGNNEIVPSRSIKNRGKKPSPSVKEAAE NGAPQNMQLSMFQLDDPVLVQIRDQIKGLDINSLTPIEALNKLNEIKKITGI 
>gi|313157655|gb|AENZ01000057.1| GENE 127 132377 - 132964 740 195 aa, chain - ## HITS:1 COG:NMB0698 KEGG:ns NR:ns ## COG: NMB0698 COG3663 # Protein_GI_number: 15676596 # Func_class: L Replication, recombination and repair # Function: G:T/U mismatch-specific DNA glycosylase # Organism: Neisseria meningitidis MC58 # 2 190 30 220 229 191 49.0 7e-49 MKNSMNSELHPLEPFLPANARILMLGSFPPKRIRWSMDFFYPNLQNDMWRIVGYLAAGDK SHFLMPGGRKFDKERIEAFCRERGIALYDTAVEVIRLKDNASDNFLQVVREVDLAALLAR IPHCRTLVTTGQKATDTLRAITGCGEPAVGESVAFEYAGRSMRLWRMPSSSRAYPRPVEW KAGFYRKVFEDNEIL >gi|313157655|gb|AENZ01000057.1| GENE 128 132945 - 134276 1908 443 aa, chain - ## HITS:1 COG:SMc01414 KEGG:ns NR:ns ## COG: SMc01414 COG0507 # Protein_GI_number: 15965850 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V), alpha subunit - helicase superfamily I member # Organism: Sinorhizobium meliloti # 11 435 58 409 410 96 27.0 9e-20 MSEYLADDDFSRIFVLNGYAGTGKTTLIAALVGALKDLNIKPVLLAPTGRAAKVLAQYAQ EKALTIHKRIYRQRTNADYESKYSLNINTEKGAVFIVDEASMLSDGSGDGALFGSGSLLE DLVQYVRSGRACRLILVGDSAQLPPVGADCSPALDPSALARFGDVEYGTMDEVVRQEAES GILFNATLVRCMLENGIYEIPHFETDYPDIEAVEGGEFLDKLQDCYAKYGRDETIVITRS NKRANRYNEGIRRNVTFAEEEIESNDMLMVVKNNYYFTERNKDCPMHFIANGDIARLKRL RRYEDFYGFRFADVVLEFPDYEDAELECKILLDTIASESPSLTREESTRLFYEVEKDYLD VKSKLKRFKEIRENPHFNALQVKFSYAVTCHKAQGGQWRAVFVDRCLFGDEPMTRDMLRW LYTALTRATDKLYLVNFDEKFYE >gi|313157655|gb|AENZ01000057.1| GENE 129 134415 - 135047 936 210 aa, chain + ## HITS:1 COG:no KEGG:Bache_1240 NR:ns ## KEGG: Bache_1240 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 210 1 228 228 115 36.0 1e-24 MKKIAQIILAVVIVALVYVIYDQISTPIRFEQEMKAKKAQVIDRIKDIRTAQRAFKSKYQ RFTGSFDTLASFILNDSLELERKIVDEDDSVAMAMLKKMNKKNVEKFTVAVIDTIFSPKR LSRQDVEDLRYIPTTDKQAQFIMEAGSAVASNVTVPIVECRAPYKLFLDTVAYRQEIINL IDDEENNFNRYAGVKFGSMEAANNEAGNWE >gi|313157655|gb|AENZ01000057.1| GENE 130 135178 - 135636 516 152 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|313157837|gb|EFR57248.1| ## NR: gi|313157837|gb|EFR57248.1| hypothetical protein HMPREF9720_1300 [Alistipes sp. HGB5] # 1 152 11 162 162 280 100.0 3e-74 MLVPEELFDKEHAGEVLAAAGMAALPGERAVWSAPQQNAVAVMAAAEDALAAVRERLGDR AHYTTPLLCAPQASVPTVWMYYAAGVLYIKVYDGKLRFAEVVPAPDESDLLYFFERLGSE FKLRDYTLRIGSGDGRALKRKLGGYFRQIVCE >gi|313157655|gb|AENZ01000057.1| GENE 131 135627 - 136163 187 178 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764797|ref|ZP_02171850.1| ribosomal protein L29 [Bacillus selenitireducens MLS10] # 1 153 13 167 199 76 29 8e-13 MRIVSGKYKGRAINPPRNLRARPTTDFAKENLFNVLGNLVDFEECDVLDLFAGTGSISYE FASRGARSVTSVEINPVHYNFIRQTAAQLGIGNLYAVKANVFLYLKSCPKQFDVIFSDAP YDLEGSEQVVRLVLENDLLRPEGVLIFEHSKKMDFSAYPEFWQQRSYGSVQFSFFRRQ >gi|313157655|gb|AENZ01000057.1| GENE 132 136647 - 136838 115 63 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLSGIFRKKKTAQIHNSLSVRSDYCMKKFAFLPAQGIERPQGGRGGGTALPPQVPVSAPP DGK >gi|313157655|gb|AENZ01000057.1| GENE 133 136907 - 137452 540 181 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|254423961|ref|ZP_05037679.1| ## NR: gi|254423961|ref|ZP_05037679.1| hypothetical protein S7335_4118 [Synechococcus sp. PCC 7335] hypothetical protein S7335_4118 [Synechococcus sp. 
PCC 7335] # 81 177 72 167 168 70 32.0 5e-11 MKKTTLADNLPECSAAAAGFGDFFSGLLETTAAKTIDDALGQQIRAYVLGWLQTHPLTAF DDYSDTAYRRTYLGRCPATGWEAIVMSWKNGNRTSIHAHPQFAGYHIADGTFRLEIFEPA AGGAARLTESAVVEGPCAFFAVGAPGRFDNHIHRITCLSDTGHSLHVYSDDALRGEVYRE E >gi|313157655|gb|AENZ01000057.1| GENE 134 137536 - 138702 1488 388 aa, chain + ## HITS:1 COG:CAC2970 KEGG:ns NR:ns ## COG: CAC2970 COG1168 # Protein_GI_number: 15896223 # Func_class: E Amino acid transport and metabolism # Function: Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities # Organism: Clostridium acetobutylicum # 3 387 2 384 384 370 42.0 1e-102 MKYDFDEIVPRRGTHSVKWELDSDPDILPMWVADMDFRTAPPVVEALRRRVEHGIFGYAQ APQAYYDAVVKWFERRHGWHTEPEWILHTTGVIPALSAILKALTQPGDKILVQTPVYNHF FISIRNSGCRTAENDLTYRGGAYTIDFADLERKAADPEVKAMLLCNPHNPAGRVWSPEEL RRIGEICLRNGVFVVADEIHCELVMPGFRYTPFASLGEEFLHNSVTCCSPSKAFNLAGLQ VANIIAAGETVRAKIEKALRINETGEIGPLAIDALIAAYDRGAEWLDALNEYLHANYLFL REYFAQHLPHCPVLPLEGTYLVWADCRAAGLASDELAERLLAEGRLRVNSGTMYGAAGAG FIRLNIACPRTLLAEELDRLRKVLELRK >gi|313157655|gb|AENZ01000057.1| GENE 135 138803 - 139111 467 102 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGRARENFNDVYFKDLRGQQMSFAGYANVLRHIFWNLMDGHSMVRESEKCNYTIEAWGRR CYLVQTNRETKAIDRYEILNCTYEQLRGKISKRHEAEEVPAR >gi|313157655|gb|AENZ01000057.1| GENE 136 139108 - 140043 1076 311 aa, chain - ## HITS:1 COG:no KEGG:BT_1910 NR:ns ## KEGG: BT_1910 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 304 1 305 312 489 70.0 1e-137 MIISASRRTDIPAFYAPWFFNRLREGYLLVPNPFNPKAVSRISLDPAVVDCIVFWTKNPA PMLPRLRELERYKYYFQFTLNPYGADVENRLPDLSRRIETFKRLSDAIGRDRVIWRYDPI LTNGKYDVGFHCEAFARIARALRDHTAKCMLGFIDHYRHIRGALGELGVGPLRRDEIEVM AQSFVRTLEPYPVALETCTVKVDLRHLGIPSGMCIDRGLVERIAGYPIAAKKDKNQRQVC NCIESIDVGTYETCLNGCAYCYAIKGNYNTAEYNRRRHDPDSPLMIGRVGGDDVIREREM KSLRCERRLPF >gi|313157655|gb|AENZ01000057.1| GENE 137 140221 - 141105 1174 294 aa, chain - ## HITS:1 COG:TM1017 KEGG:ns NR:ns ## COG: TM1017 COG0697 # Protein_GI_number: 15643775 # 
Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Thermotoga maritima # 7 288 5 281 293 63 27.0 4e-10 MERAQGILYAALSSSTFGLAPLFTLLLLAGGYSPFEALSYRWGVAALFLGVLAVFSGRSF RLGRRELVTVFLLSLFRAATSLSLIIAYQHIASGVASTIHFMYPLAVALAMMCFFREKGS AWVFAAIGMSVVGAVLLSLGNVDFTVENSALGMVSACVSVFSYGGYIVGVRKSRAVEIDS AVLTCYVMGLGALYFILGGLLTGGVRIETDGMTWLCILGLALPATALSNMTLVQAIKRIG PTLTSIFGALEPLTAVVIGVAVFGEPFTAQGAAGILLIVAAVSVVVLRTGRRPQ >gi|313157655|gb|AENZ01000057.1| GENE 138 141297 - 141956 856 219 aa, chain + ## HITS:1 COG:CAC0777 KEGG:ns NR:ns ## COG: CAC0777 COG0110 # Protein_GI_number: 15894064 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Clostridium acetobutylicum # 17 218 8 209 210 295 66.0 5e-80 MRRPGKGGATKHTMNTKTYPRTNDRRTIYLNTVIDNPCIEVGDYTIYNDFVNDPTEFEKN NVLYHYPINKDRLVIGKFCSIACGARFLFNSANHTLGSLANYTFPLFFEEWELDRANVAA AWDNKGDIVIGNDVWIGYEAVIMAGVRIGDGAVIAARAVVTRDVPPYTIVGGVPAKTIRP RFDDRTAARLLELQWWNWPVEKIRANLPRIMRGEIDKLG >gi|313157655|gb|AENZ01000057.1| GENE 139 142300 - 143247 826 315 aa, chain - ## HITS:1 COG:Cj1112c KEGG:ns NR:ns ## COG: Cj1112c COG0229 # Protein_GI_number: 15792437 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Conserved domain frequently associated with peptide methionine sulfoxide reductase # Organism: Campylobacter jejuni # 21 134 1 114 119 176 66.0 6e-44 MRLLILTIGLLLSGAVNGQNMKPLTPEERRVIVDKGTEAPFTGRYYDHREAGVYHCRQCG APLYRSADKFDAGCGWPSFDDEIPGAVMRTPDADGRRTEITCAKCGAHLGHVFLNEGFTP KNTRHCVNSVSLLFEPEAKAGEQPAAGGEQTQKKEGTETAIFAGGCFWGVEYLLSKMPGV LKVESGYTGGRTENPTYEQVCSHTTGHAEAVRVTFDPAKVSYEKLAKFFFEIHDPTQLDG QGPDLGDQYRSEIFYTTPAQQQTAERLIGELRRRGYDVVTEVTPAGRFWPAEDYHQQYYK RKGTLPYCHAYTKRF >gi|313157655|gb|AENZ01000057.1| GENE 140 143270 - 144511 1804 413 aa, chain - ## HITS:1 COG:BH1984 KEGG:ns NR:ns ## COG: BH1984 
COG1228 # Protein_GI_number: 15614547 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: Imidazolonepropionase and related amidohydrolases # Organism: Bacillus halodurans # 4 409 8 410 426 274 36.0 2e-73 MRLLVKNIGKIVGIETAGRLRLCGGEMDRLETLDDAWLLADDGRIAAFGRMETLGEMAAD RTVDAEGGMLFPSFCDSHTHLVYAGSREREFLDKINGLSYEEIARRGGGILNSADLLHET SEEELYAQAMERVREIAAKGTGCVEIKSGYGLTTEDELKMLRVIRRIRETAPLAVRATFL GAHAVPRNYIGRQEEYVELVCNEMLPAVAAGKLADFVDVFCDEGFFTVEQTARIMKAGRK LGMQPKIHANELAVSGGVQVGVEYGALSVDHLERMGEAEIRALRGAVTSPTMLPGAAFFL GMSYPPAREMIRAGLGVALASDYNPGSSPSGDMRMVVSLACIRMKMTPAEAINAATLNGA YAMGLSRSYGSVTVGKVANFFITKPIPSVEFLPYAYTTPLIRRVFLRGEEYVA >gi|313157655|gb|AENZ01000057.1| GENE 141 144508 - 145989 2057 493 aa, chain - ## HITS:1 COG:FN1406 KEGG:ns NR:ns ## COG: FN1406 COG2986 # Protein_GI_number: 19704738 # Func_class: E Amino acid transport and metabolism # Function: Histidine ammonia-lyase # Organism: Fusobacterium nucleatum # 6 493 7 497 511 424 44.0 1e-118 MEHHLISAQHLSIDRIREILFRRLPLALSDDARTRIVRCREYLDRKMENPERPIYGITTG FGSLCDISVGYDELAQLQKNLVMSHACGTGERVPSEIVKLILLLKIQSLSYGHSGVQLAT VERLIDFFNNDVLPVVYQQGSLGASGDLAPLAHMSLPLLGLGEVECEGRVRPAAEVLAER GWQPIRLESKEGLALLNGTQFMSAYGVWSLIGARRLSEWADRIGALSLDAFDGRIEPFCD EVHLIRAHKGQLTTARNIRRLLEGSELIARPKKHVQDPYSFRCIPQVHGASKDTIDYVEG VLTTEINSTTDNPTVFPEEDMVVSAGNFHGQPIALAMDFLAIALAELGSISERRTYKLIS GARELPAFLVAKPGLNSGFMIPQYTAASIVSQSKGLCMPASVDSIPSSQGQEDHVSMGSN AATKLYRVVCNTERVLAIELFNAAQALEFRRPARSSETLERLVAEYRKEVPFIDNDSVMY PHIESSIRFIRKS >gi|313157655|gb|AENZ01000057.1| GENE 142 146006 - 148027 2988 673 aa, chain - ## HITS:1 COG:SPy2082 KEGG:ns NR:ns ## COG: SPy2082 COG2987 # Protein_GI_number: 15675840 # Func_class: E Amino acid transport and metabolism # Function: Urocanate hydratase # Organism: Streptococcus pyogenes M1 GAS # 7 671 6 673 676 652 48.0 0 MTLQEFQNDIRAGIPDRLPAAKPYDRQINHAPKRKEILTPEEQVLAIKNALRYFPAKHHA VLAKEFAEELRKYGRIYMYRLRPDYEMYARPIDQYPCKCRQAAAIMLMIQNNLDKAVAQH PHELITYGGNGAVFQNWAQYRLTMKYLSEMTDSQTLAMYSGHPLGLFPSHPDAPRVVVTN 
GMVIPNYSKPDDWERMNALGVSQYGQMTAGSYMYIGPQGIVHGTTITVMNAARKRFTADR RDARGMLFVSSGLGGMSGAQPKAGTISGVVSVIAEINPKAAQKRYEQGWVDELHHSLDEL IPRIRRACREREAVSMAYVGNVVDLWERLAAEEIPVDLGSDQTSLHNPWAGGYYPVDVSY EASNKMMAEEPARFRECVQESLRRQVDAINKLTARGMYFFDYGNAFLLEASRAGAAVMGE GGRFRYPSYVQDIMGPMFFDYGFGPFRWVCTSGKPEDLELTDRLAAEVLEEIRRTAPAEI AGQLDDNIHWIREAGRNRLVVGSQARILYADSEGRTKIAQAFNRAIADGRLTAPVVLGRD HHDVSGTDSPYRETSNIYDGSNLTADMAVQNVIGDSFRGATWVSIHNGGGVGWGEVINGG FGMVIDGSEASDRRIREMLLWDVNNGIARRSWARNEGAVSAIRREMERTPGLQVTLPNAA DEEMIRNVLKENE >gi|313157655|gb|AENZ01000057.1| GENE 143 148033 - 149730 2304 565 aa, chain - ## HITS:1 COG:SPy2083 KEGG:ns NR:ns ## COG: SPy2083 COG3643 # Protein_GI_number: 15675841 # Func_class: E Amino acid transport and metabolism # Function: Glutamate formiminotransferase # Organism: Streptococcus pyogenes M1 GAS # 4 311 3 282 299 236 41.0 7e-62 MEKRIVECVPNFSEGRDKAVIDRIVSAIETSGGVKVLDVDPGEATNRTVVTFVGSPEAVV EAAFAGVKKAAELIDMRKHKGAHPRMGATDVLPLIPIAGITLEECAELARKLAERIAGEL HVPTYCYEAAAFTPRRRNLAVCREGEYEALPEKLAHEESAPDFGARPFDEGVARTGATTV GARDFLIAVNFNLNTTSTRRANAIAFDVREKGRPVREGNPITGKVVRDADGNPLMRPGTL RATKAIGWFIEEYGIAQVSMNITDIAVTPLHVAFDEVCRKADARGVRVTGTEIVGLVPKR ALLEAGRYFLRKQQRSAGIAEEEIVRIAVKSMGLDDLKPFDPKEKVIEYLLEAEDQKKRL IDMTCKGFAEETASESPAPGGGSIAAYMGALGAALGTMVANLSSHKAGWDDRWEEFSGWA EKGQVLLRELLHLVDEDTAAFNRIMAVFAMPKSTDEEKAARSAALQAATLYATQVPLRTM KTAFEVFAIVRAMAAEGNPNSVSDAGVGALAARSAVLGACLNVKINAAGLKDRAAADALV AEANAVAAAAERAEREVLEIVERKI >gi|313157655|gb|AENZ01000057.1| GENE 144 150058 - 152040 2603 660 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1878 NR:ns ## KEGG: Palpr_1878 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.propionicigenes # Pathway: not_defined # 58 658 18 630 630 305 32.0 4e-81 MKNIQAKQTPCFRRWSRKGWSAFASLHRHVTIGVLAATMSILLLATQSASAQHADTAAVL RTLRIDEVGISGSQTAPTRNVQSQTPLFDRKAQAAAPVQTLESALRLAPSVDIRERGGKG MQADIFIRGGSFDQTMVLLNGIDFTDARTGHQSHSLPVDLDCISAVDLLDGVPGVGAYAG AVNIRTAPLRPTYLRFEGTGGQYGYAYANLSGAVTAGRFSLLAAGSYRRSDGYRHNTDFT NGNAFLRATYETRRLGFFDFQAGWQNRGFGSNGFYAAYNPDQWEGTSTSLASLRWLRQAG 
RFSLGASASYRKNFDRYDWTRGTVMNRHNTDNAGARLWTDFDWAGGTTSLGGDYAFNHIY STNLGEKLSVPHGHYTHAKARHTGNVWLRHAEQWRRFDAAASAGVSLTPYGTSALWNVSG GWRPAAGLHLAVGASQSMRLPTFTDLYYSSPAQINNLDLIPEKAVTYRIEADYVKGRWNA SLRTYYRAGRDIIDWVWREDMDGKWHSEQTSRLDTYGVELTGGYAAAEGFLRRATLSYGY ITTDRNTEVVARSAMDFMKHKAALAVEVRFLRRMSLALTASVYDRNGSYTDYPTPGDASV SQVRDYEPYFLLDGRLSWGKGVCRLYVDATNITDTRYCDLGGIRLPGAWVTGGVVLTIGR >gi|313157655|gb|AENZ01000057.1| GENE 145 152487 - 153794 1983 435 aa, chain - ## HITS:1 COG:MA3297 KEGG:ns NR:ns ## COG: MA3297 COG0498 # Protein_GI_number: 20092112 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Methanosarcina acetivorans str.C2A # 9 408 7 403 419 356 47.0 5e-98 MKDFTPTSYTLECVATGREFEDTGWMLADPQCKTPSLVRAKYARRQLEVKPDEYGFYKFC DWLPVRRMLKGSSAPVTYKSKGLARHLGLENLYITFNGYYPAIGATMSTCSFKETEAFSV CARAAEDEERVLVVASAGNTARAFAKVCSDNHIKLLLSVPYDNIEALWFEHPLNPCVKLI SCEKGGDYFDAIHLSDIALKGPGFYAEGGAKNIARRDGMATTVLSAVTTIGRIPDYYFQA VGSGTGAIAAWEANMRLIEDGRFGSNTMKLMVSQNAPFVPMYDAWQAGSRKMLAYEDDKA RRDAEIIDAKVLSNRRPPYGITGGLYDALKATGGEFFVATNAMARKARKLFKELEGVDIY SASGVALASLVNAVAAGKIEKDAVVMLNITGGGEEHFKEDKELWYLKPSHVFPLEPDTDD VVAKVEALFAKNIAE >gi|313157655|gb|AENZ01000057.1| GENE 146 153815 - 154945 1343 376 aa, chain - ## HITS:1 COG:sll0873 KEGG:ns NR:ns ## COG: sll0873 COG0019 # Protein_GI_number: 16330194 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Synechocystis # 7 375 17 386 387 434 55.0 1e-121 MIDFLKLPSPCYVLDEELLDRNLAIIDRVRRESGAEIIVALKACAMWSIFPELARHSDGA TASSAAEARLVFEEFGRPAHTYAPVYTDRNIEEILRCSDHITFNSVAQFERFGPMALLRG ISCGLRINPQYSPVETDLYNPCVPGSRLGVTADLLKELPAGIDGLHFHVLCESRPHHLRL ALEAVEKHFGQYLDRIKWLNMGGGHLMTHAEYDCDELIALLREFKARHPHLRLILEPGSA FTWRTGYLVSTIEDLVENAGVHTVMLDVSFACHMPDCLEMPYKPAIVGAHEPAEGEKRWR MGGTSCLAGDYYGDWTFDHELKVGERIVFEDMIHYTMVKTTMFNGVAHPAIVIARRDGRI DVVREFGYEDFRNRMS >gi|313157655|gb|AENZ01000057.1| GENE 147 155080 - 155442 79 120 aa, chain + ## HITS:0 COG:no KEGG:no NR:no 
MREAQADKGDDVKPAAGLSNKSYSYDFTPVRICSDRGDFQFRGVVRQPLQAGKGAVPYFA IQMLTRFFSTPGNGWAVSPLPSALMREAAIPAFFSAAATLSARSLEIFRLTACEPVAASA >gi|313157655|gb|AENZ01000057.1| GENE 148 155250 - 156371 1421 373 aa, chain - ## HITS:1 COG:VC2213 KEGG:ns NR:ns ## COG: VC2213 COG2885 # Protein_GI_number: 15642211 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Outer membrane protein and related peptidoglycan-associated (lipo)proteins # Organism: Vibrio cholerae # 223 366 157 306 321 67 36.0 3e-11 MKMKCILCTLGALVMCTAPVFAQAANKQEKSEFNPHWFMQVQAGASYTLGEGPFGKLVSP AAALSAGYQFSPVWGLRFGLSGWQSKGAWVSPQTTYQYKYLQGNVEATLDLANLFGRFNP RRTVNPFLFAGVGLNGAFDNDEANALNDSGYRLGNIWSGSKVFVAGRLGLGVNFRLSDCV LFGVEMNANMLSDKYNSKKAGNLDWQFNALGGFTFRFGKNHKKARTAAVVPAPAPAPAPE PAPAPVEEKPAVKETPAPAPAAVAERPAELRENIFFRIGSSQIRTTEEAKVGALVEYLKA NPEANVEIVGYADAATGSHAVNLKISKLRAESVAAALKKAGIAASRISAEGRGDTAQPFP GVEKNRVSICIAK >gi|313157655|gb|AENZ01000057.1| GENE 149 156737 - 157927 1836 396 aa, chain - ## HITS:1 COG:slr0049 KEGG:ns NR:ns ## COG: slr0049 COG1748 # Protein_GI_number: 16331467 # Func_class: E Amino acid transport and metabolism # Function: Saccharopine dehydrogenase and related proteins # Organism: Synechocystis # 1 391 1 391 398 573 68.0 1e-163 MCKALIIGAGGVGTVVTQKIAANPVFTDVMLASRTKSKCDAVAAAIGGGRVKTAQVDADN VAELCELFRAFRPDIVVNVALPYQDLTIMDACLECGCNYLDTANYEPKDEAHFEYSWQWA YQDRFKAAGLTAILGCGFDPGVTAIFTAYAAKHHFDEIHYLDIVDCNAGNHGMAFATNFN PEINIREVTQKGRYYENGEWVVTEPHEIHRPLNYPGIGERESYVIYHEELESLVKNYPTI KRARFWMTFGQEYLTHLRVIQNIGMARIDPIIYNGVEIVPIQFLKAVLPDPKSLGANYHG QTSIGCRIRGVKDGKERTYYIYNNCDHEQAFKETGTQAVSFTTGVPAALGASMWAKGLWR GAGVFNVEEFDPDPFLGELGEQGLPWHELFDTDIEL >gi|313157655|gb|AENZ01000057.1| GENE 150 158387 - 159034 216 215 aa, chain + ## HITS:1 COG:insB_g4 KEGG:ns NR:ns ## COG: insB_g4 COG1662 # Protein_GI_number: 16128954 # Func_class: L Replication, recombination and repair # Function: Transposase and inactivated derivatives, IS1 family # Organism: Escherichia coli K12 # 96 213 42 161 167 68 34.0 1e-11 
MNCKFCNGKCVKDGHQKNGIQRYECKDCHRKQQASYAYNAYDPSLNKSITTFIKEGVGIR SIARILEISITTLTKRILSISSSIITPHIIADAVYEVDEIKTFVGRKTDHIWVTYALNRD DKSVSCFSVGPRTNETLRKVTDKLSNAKRVYTDKLRQYRTLLSPAIHKTTNRGTNHIERY NLTIRTHIKRLNRRTICYSRAISMLYAVMKIYFWG >gi|313157655|gb|AENZ01000057.1| GENE 151 159765 - 160436 118 223 aa, chain + ## HITS:1 COG:FN1198 KEGG:ns NR:ns ## COG: FN1198 COG1106 # Protein_GI_number: 19704533 # Func_class: R General function prediction only # Function: Predicted ATPases # Organism: Fusobacterium nucleatum # 30 220 235 415 420 122 41.0 5e-28 MGYSKEMFHTKLNGHNEALNFLQNLKLGFESLSTIENEFDPSTLPTELSEKLKKELTKQL ANKKVVELLSIHKKYDKNGVVIGEIPFDVYEKESEGTQKLIDLSGPIFDTLINGDTLFID ELDCRMHPIISEFIVKLFNDKETNPNNAQLIFTTHDTHLLSSKIFRRDQIWFAEKDSMQQ TDLYTLNDIILPNGQKPRNDSNYEKNYIAGRYGAIPFLMNYNL >gi|313157655|gb|AENZ01000057.1| GENE 152 160450 - 161157 205 235 aa, chain + ## HITS:1 COG:no KEGG:BVU_3668 NR:ns ## KEGG: BVU_3668 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 2 233 3 232 237 223 51.0 6e-57 MARKIKIDNAILKRRQHSAPKRKVRVINCNILIVCEGEKTEPNYFRVFNKTRKGTIVYEL TIDGGGINTMQVVDKAIELNSSFDYDSVWAVFDKDSFSPDQFNGAIIKARQNNINCAWSN EAFELWYLLHFHNRVTAMSREEYKKAISDAVNTSPKNKKKKKDYEYAKNAPNNYDIINQY GNQNAAIRWAKSLHDQFEDQRFHTHNPCTLVYKLVLQLIGEDDDLNSKLISKIES >gi|313157655|gb|AENZ01000057.1| GENE 153 161720 - 162784 1021 354 aa, chain + ## HITS:1 COG:no KEGG:BVU_4155 NR:ns ## KEGG: BVU_4155 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 10 354 12 367 371 189 34.0 1e-46 MYSENDTRHFFQLLRSGLKPDCAPAVTGKISARQWDDIFRMAADQGVCAVIGDGMERLPE KLRPPRDIRLRWALTAERQEKRYRRQQEKTAKMAAAFAENGIRMLVLKGLGLSRDYPVPA HRECGDIDIYLFGASDEGDRLLLQMGAQPYFDVPKHSSHTWDGILIENHRTILNVRRNRS ERELNALLTAVLEQEGVCEIGENIAVPPATFNAIYLLRHAAVHYQKEGIAVRHLCDWACF LERHGHEIDRQLFHKALADYRLDRFEALMTAAAVRYLGTEVPEPACDAGMLERFMREIYA MRPMPTRTLPKLYFKLFSPIHNRWRLRRVLRTSPWRYYYDTIRAQWNERFTLFR >gi|313157655|gb|AENZ01000057.1| GENE 154 163349 - 163822 565 157 aa, chain + ## HITS:1 COG:no KEGG:BF3466 NR:ns 
## KEGG: BF3466 # Name: uphY # Def: putative LPS biosynthesis related transcriptional regulatory protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 157 20 178 179 127 44.0 1e-28 MAAKSYLDSIGIESYVPMHFAERTYGGKRRKVWEPLIHNLLFIRTSADRLREIKATTTLP IRYIMDRESKSPTVIPERQMQDFMAVVATQNEHVEIVAPQDVDLEKGDPVRVTEGIFAGI EGRYIRHKGHSKVAVAIRNVATALTAYIPLKYIAKID >gi|313157655|gb|AENZ01000057.1| GENE 155 163896 - 165290 1952 464 aa, chain + ## HITS:1 COG:VC0934 KEGG:ns NR:ns ## COG: VC0934 COG2148 # Protein_GI_number: 15640950 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Sugar transferases involved in lipopolysaccharide synthesis # Organism: Vibrio cholerae # 106 464 110 465 465 248 38.0 2e-65 MKQVMQESLLIKLPVIIGDLFLLNLSWIFALTLFPQPAYVAHSLEIFACLNICFIPGLSW FGVILSSRIVPYEEIIRRVFYVVLCHIGFFTLIQTVWSYGLLPLRLIGVFYISLTVALML WRYICRMAVKITRGHGRNSRRVIIVGSKDNALEVYHEMVDNTSTGYRVLGFFSNHDDKAL PGNTPCLGSVDEALPWLKRHPVNEVYCCLSTDRYLEEIFPIMDYCENNFVRFYYVPNLRN YMKRAMNLELLGNVPILYIREEPLRQVSNRFVKRAFDVAVSGAFLCTLFPVVYLFVALGI KLTSKGPVFFIQERSGENGRTFGCIKFRSMRVNDEADRVQATKNDPRKTRFGSFLRKSSI DELPQFINVLKGDMSIVGPRPHMLQHTELYSKLVNKYMVRHLIKPGITGWAQVTGYRGET HELSQMEGRVRRDIWYLENWSLLLDIRIMLLTVWNALKRDENAY >gi|313157655|gb|AENZ01000057.1| GENE 156 165303 - 166181 1106 292 aa, chain + ## HITS:1 COG:NMB0062 KEGG:ns NR:ns ## COG: NMB0062 COG1209 # Protein_GI_number: 15675999 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-glucose pyrophosphorylase # Organism: Neisseria meningitidis MC58 # 1 291 1 288 288 404 65.0 1e-113 MKGLILAGGSGSRLYPITKGVSKQLLPVYDKPMVYYPLSALLLAGIREILVISTPEDLPG FRRLLGDGSDYGVRIDYAAQPSPDGLAQAFLIGEDFLGGDSACLVLGDNIFYGSGFTGLL REAVRTAEEDGKATVFGYRVDDPGRYGVAEFDDKGNCLSIEEKPAHPKSNYAVVGLYFYP NKVVDVAKSIKPSARGELEITSVNQCFLQSGELKVQTLQRGFAWLDTGTHDSLAEASIFV EVIEKRQGLKIACLEGIAYRNGWITADKLRALAEPMLRNQYGQYLLKLLDEK >gi|313157655|gb|AENZ01000057.1| GENE 157 166186 - 166758 789 190 aa, chain + ## HITS:1 COG:MA3780 KEGG:ns NR:ns ## COG: MA3780 COG1898 # Protein_GI_number: 20092576 # 
Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose 3,5-epimerase and related enzymes # Organism: Methanosarcina acetivorans str.C2A # 1 178 1 179 183 219 58.0 2e-57 MKVLTTAIDGVVILEPEVFGDARGYFFESYSQRRFDAEVRPVRFVQDNESHSRYGVLRGL HFQKGRYSQSKLVRVVRGRVLDVAVDIRRGSPTFGRHVAVELTEDNKRQFFIPRGFAHGF AVLSDEATFQYKCDNPYAPESEGAIAWNDPSLGIDWRLAPEDVVLSPKDSAHPLLSEAGE LFDYNEDYYA >gi|313157655|gb|AENZ01000057.1| GENE 158 166751 - 167614 1152 287 aa, chain + ## HITS:1 COG:CAC2315 KEGG:ns NR:ns ## COG: CAC2315 COG1091 # Protein_GI_number: 15895582 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-4-dehydrorhamnose reductase # Organism: Clostridium acetobutylicum # 2 278 1 271 280 230 44.0 3e-60 MLNILITGANGQLGSALRRLGCVSPHNYICTDVAELDITDAAAVLRTVEERRIDVIVNCA AYTDVERAEEDEPRADLLNHKAAGNLAAAAKATGATLFHVSTDYVFDGTAHTPYREDTAP SPLGAYGRTKLAGERAVMASGCRYLIFRTAWLYSEYGHNFLKTMLRLTSERDTLQVVFDQ IGTPTYAGDLALAIFSIIESERYAGNEGVYHFTDEGVCSWYDFATEIAAAAGHDSCRIIP CHTSEFPTKAQRPAYSVLDKTKIKTTFQMDIPHWREAMIYCLKQLAE >gi|313157655|gb|AENZ01000057.1| GENE 159 167625 - 168734 1287 369 aa, chain + ## HITS:1 COG:CAC2332 KEGG:ns NR:ns ## COG: CAC2332 COG1088 # Protein_GI_number: 15895599 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: dTDP-D-glucose 4,6-dehydratase # Organism: Clostridium acetobutylicum # 3 369 2 351 351 403 52.0 1e-112 MQRTILITGGAGFIGSHVVRLFVTKYPDYRIVNLDKLTYAGNLANLRDIEERPNYTFVRG DICDFEAMRELFRQYGIDGVIHLAAESHVDRSIRDPFTFARTNVMGTLSLLEAAREHWNG NWAGKLFYHISTDEVYGALELTRPAGDPAGCESGGGPFGEEFFTEETKYDPHSPYSASKA SSDHFVRAYHDTYGMPTLVTNCSNNYGPYQFPEKLIPLFINNIRHRRPLPVYGRGENVRD WLYVEDHARAIDVIFHKGKVADTYNIGGFNEWKNIDLIRVIVKTVDRLLGNPEGASEKLI TYVADRAGHDLRYAIDSRKLKDELGWQPSLQFEEGIEKTVRWYLQNQKWMDDITSGEYEK YYQSMYKER >gi|313157655|gb|AENZ01000057.1| GENE 160 168806 - 170338 1350 510 aa, chain + ## HITS:1 COG:no KEGG:BVU_2391 NR:ns ## KEGG: BVU_2391 # Name: not_defined # Def: putative transmembrane protein # Organism: B.vulgatus # Pathway: not_defined # 1 509 1 512 
512 400 44.0 1e-109 MSVKSENNKRIAKNTFLLYFRMLFTMGVSLYTSRVVLNTLGVEDFGIYNVVGGVVVMFSF LNSSLATATQRFLNYEMGQGNALKLEKVFSIALTGHYLIALSVILLAETIGLWFVSTQLE IPAERMHAALWVYQFSILTLAVSIISVPYNAVIIAHERMKVFAYVSIAEVSLKLTVVFLL VYLSFDKLVLYAFLLLVVAVIVRMIYATYCKRNFTECKYRFIFEKDLFAQMFCFSGWMFT GTLSNLFSSQGVNILINMFFGPIQNAARGIAYQIQGTVNAFVTNFMVAVQPQIVKSYSQG NYSYMYKLVFMASRYSFYLLFLLSLPVLLQTQYLLQLWLKNVPDYSVLFTQLVVIDLLIN SSFTPIASVAFASGKIRNYQLVVSAGYLLTFALTLLFYQLGFPSYVAFVVAIIISFIGLF ARLYVLRHSVHFPSKKYIHKVFLVQAKVGIIAIVIPAVFVYYASPTFFHFAITVLISSVS IAGTTWCWGLGPEERLYIRNKIRGYFHRKI >gi|313157655|gb|AENZ01000057.1| GENE 161 170335 - 171534 878 399 aa, chain + ## HITS:1 COG:no KEGG:GFO_0566 NR:ns ## KEGG: GFO_0566 # Name: not_defined # Def: hypothetical protein # Organism: G.forsetii # Pathway: not_defined # 1 332 1 337 388 197 32.0 7e-49 MITIHRPDFSQRGEYARIQSLVDIDGNRNTIWFEVRREYGEYLCWERSDAFVIGLLNYAM RNGHDITCEAPMGEDLHYQITTYLIEAIAKGDARMYHTRITAEVDHSELPCDNAVGTGIS CGIDSFHALAGNNDAKYPKHRITHLTFNNVGSHGEGEHARRLFAERKEHSRKFCEEFGFK FVESNSNIMDVIPQDHYQTHTFTSSFAIYCLQKLYSVYYYASGQPFAEFSLSDNYDRDCA YYDLLLAEAFSTRNLRIHSEGGALTRLEKTKRVITYAPSYNYLNVCTITAANCGKCEKCS RTLLSLEALGQLELYKNVFDIGCYQSHKRDYLIFLYGQRWLKNHTYIELYPFFKKQITPA IRIAAYIKFSKDFFVSCIRGTRLERMLRGLKKKITKCNA >gi|313157655|gb|AENZ01000057.1| GENE 162 171522 - 172514 790 330 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157767|gb|EFR57178.1| ## NR: gi|313157767|gb|EFR57178.1| putative acyltransferase [Alistipes sp. 
HGB5] # 1 330 1 330 330 535 100.0 1e-150 MQRIVYFDFLRGIAIIMVVCLHCIDYSAMEGPVLNLFGVEFRNLLNAAVPLFLAISGYFM ANKKTDTPTAYKEFLKRQAVKVYIPMLVWSVPYLILAYLHRENSVTSWLLFFFGGFSIYY FIILIVQYYALLPLMQKMGRSLRGGVLAAVAISIACISCLEILLYVYRMEIPLIVYAGPF PLWIMFFVIGIYLRNHTVQMKYVYPLVLLFFLLSVAESFVIMHCTGQVTGLGIKVFSFLF SLYFIIMLFASPLQRYLTARKDRFIVRSIAKVGEISFGIYLIHMLVNMVLNKLCGYHDFN FILVLTISIVVIVATMKLVPTSIRKYIGFS >gi|313157655|gb|AENZ01000057.1| GENE 163 172516 - 173670 633 384 aa, chain + ## HITS:1 COG:no KEGG:BT_0041 NR:ns ## KEGG: BT_0041 # Name: not_defined # Def: F420H2:quinone oxidoreductase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 380 1 400 400 403 50.0 1e-111 MIEIRNKRDCCGCNACVQKCPQQCIGQSEDAEGFIYPQVDKARCVGCGLCEKVCPVINQN PKSKPLKVMAGRNGNEEIRMVSSSGGAFTLLAGEIIGRGGVVFGARFDENWNVVHSYTEC TDGLAAFRGSKYVQSRIGETFSQAEKFLKQGREVLFTGTPCQIAGLRRFLGKEYDNLLSM DIVCHGVPSPGVWKEYLKQTTQNLSPIGRIVGINFRDKRTGWKEYSFSLHISSPDKEVLH TEKFYRNKYMRGFLANLYLRPSCHQCPSKSGKSGADITLADFWGIEHIAPRLDDDKGTGL ILVNTVKGKRIYDKLPIISEEIIYGQALKGNHSIEISAPVPPQRECFFENLGKEDFSRLV NRLTKVPLTIRLRRAARRLIKKLR >gi|313157655|gb|AENZ01000057.1| GENE 164 173667 - 174815 451 382 aa, chain + ## HITS:1 COG:no KEGG:Dfer_0060 NR:ns ## KEGG: Dfer_0060 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 1 381 1 368 368 278 39.0 3e-73 MKIGILTQPLQRNYGGILQCYALCEVLKGLGHEVCVLDRHHHTPFLRKVLRSAKRILLYC LGAEKKENILSIWMTAGQMRTISRHTSAFIDRHIPHTRPLLGTKALRQAAGNEHIDAYVV GSDQVWRPVFSPCIANYFIDFDHRDNIVRIAYAASFGVDTWEYDRQTQQVCAGYAKKFDA ISVREDSGVTLCRNYLGVAAEHVLDPTMLLRREQYEKLATDRHTLASKGNLLTYILDPSP EKSTIVGNVSRKLGLTPFSVMPHREADVLNCAMYPEECTSPAVEQWLRGFMDARYVVVDS FHGCVFSILFNKPFIAIANCSRGETRFTSLLRTFGLEGRLIHNAGELTDSLITAPVDWER CNGILQSERRKALLFLRTNLNK >gi|313157655|gb|AENZ01000057.1| GENE 165 174818 - 175885 998 355 aa, chain + ## HITS:1 COG:no KEGG:CPF_0915 NR:ns ## KEGG: CPF_0915 # Name: not_defined # Def: putative polysaccharide polymerase protein # Organism: C.perfringens_ATCC13124 # Pathway: not_defined # 3 350 4 354 358 65 22.0 4e-09 
MEFYIYLVLIVLFLSLVKRANNKVVYIGVAFALFFLAAYRSLGVGTDSEAYYTIYYQYAN NDAYVAANENYKTAEFIWLAMLRYFRNANNYRGVLVMLAALNAIPLFYALNKQSKWPLFS VFLYISLYFYGNSMNAMRQSVAMSFLLLAIPFLEKDKNQYYLLLMLLAMAIHMSALYVFL FVWAFYKANLNFDNKLKYALAIAISFTIGMFFTAWFKETLKPIANLFASSNYDYYLSGGS VNESRNLLSNLGLNLVAVFMLYINKDANNIYFKLYILGVIMVNLLAGLGAFSGRIASYFS LMQIVAIPHLLYLVKKTFQKYLYTVIFIAYGLSVYIYYLFKNLSEIVPYSNIFSN >gi|313157655|gb|AENZ01000057.1| GENE 166 175894 - 176826 508 310 aa, chain + ## HITS:1 COG:all2288 KEGG:ns NR:ns ## COG: all2288 COG0463 # Protein_GI_number: 17229780 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Nostoc sp. PCC 7120 # 2 238 3 248 343 132 32.0 1e-30 MFSVVIPLYNKQDCIRNTVQSVLNQTFPDFEINIVDDGSTDRSLEIARQFDDPRIRVFSK PNGGVSSARNYGIRQSRKKYIAFLDADDLWYPDYLSEIARLIDKYPGCGIYSAGYILRKK HREIRYAYPTEGVIQDYFKEYFKHKSPFCNASCAVIPVEAFGKVGYFPENLHTGEDLYMW MAIAVHYNVCVTPNILMSYEYQESTFNDRLEKNIYIQGDLFKKLCESQNTYLNEIIARLV ITDAIFQALRNNKDVSKSVEKKFNYTKFNRTELLYLRILNRTPAWSNKIIRNGWKLGSAI KTYLLGIMGR >gi|313157655|gb|AENZ01000057.1| GENE 167 177575 - 178006 241 143 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157791|gb|EFR57202.1| ## NR: gi|313157791|gb|EFR57202.1| glycosyltransferase, group 1 family protein [Alistipes sp. 
HGB5] # 1 143 230 372 372 285 99.0 6e-76 MYVIGQGPLEEQLKRQSIISDTQNNIIFTGFLPHVQMSEYTAHAKALLINTYQDNNMVSI PETLSCGTPVISNTVPTNSYIINSHQLGIAKDNWDEHDLKEIIANNSFYVKNCIAYRDNL ANSHIAQTFLDIFQDKKIAENIK >gi|313157655|gb|AENZ01000057.1| GENE 168 178013 - 178987 645 324 aa, chain + ## HITS:1 COG:no KEGG:PRU_0259 NR:ns ## KEGG: PRU_0259 # Name: not_defined # Def: acyltransferase family protein # Organism: P.ruminicola # Pathway: not_defined # 2 324 7 340 342 122 29.0 1e-26 MTINRINWLSVLQGWSMLLVVIGHVTLTNVFQDPETPVSAEIERIIYSFHMPLFMFISGF LFYLTKIGRNKRYVETIGDKAKRLLIPYAAFTCATFFLKYAFNPLMRRPVDFSWSEILDI VTFRSNPLAEMWFISTLFVLFLFFPIYKWSLGGKMKSVFVFCAALLIYFFFPKDIEVFCI SYASSYLLFFYTGILISKYGGGKFLDTPVLLWCCTALMVVCSLYPGIPLLNIFVGIFFSL TLCMFLSRRFPGLFGSFREYTFQIFLMGIFFQIAIRFIYARLGMAGLYWPLYVVSILLAL YMPVLISKAIRKTGSKALARCFGL >gi|313157655|gb|AENZ01000057.1| GENE 169 179055 - 180236 1146 393 aa, chain + ## HITS:1 COG:SMb21503 KEGG:ns NR:ns ## COG: SMb21503 COG0438 # Protein_GI_number: 16265081 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 361 5 369 447 227 35.0 3e-59 MKILLANKFYYPRGGDCTYILNLRNLLESHGHETAVFAMQYPENIPSPWSGYFPSEIKFT PGLKMLEAFFRPFGTGEVKRKFTKLLDDFRPDIVHLGNIHTQLSPVLAKIAHGKGIKVVW TLHDYKLLCPRYDCLRNGATLCEACFSDKHKVLEYKCMKNSRIASYIAYREALKWPREKL ENYTDAFISPSRFLKDKMVQGGFTKEKIRTCCNFINTAKVRRENYDKDDYYCYVGRLSHE KGIGTLIEAAKGLPLKLKIIGNGPLQEILTAQAAGADIELLGYRQWDEIKEIVGKARFCV LPSEWYENNPFSVIESQCLGTPVLGARIGGIPELIDEGKTGMLFMPGNAEELKTQIAQMF ATPFEYRQIALDAQKRYSAERYYTELMKIYNNI >gi|313157655|gb|AENZ01000057.1| GENE 170 180238 - 181308 1261 356 aa, chain + ## HITS:1 COG:CAC2796 KEGG:ns NR:ns ## COG: CAC2796 COG0535 # Protein_GI_number: 15896051 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Clostridium acetobutylicum # 17 164 44 197 394 74 31.0 4e-13 MNTRNIPHPTDASIILTYRCPMKCKMCNIWFNPTKKDDEIKAADLRTLPRLKFINLTGGE PFVREDLAEIVEECYKHTDRIVISTSGWFEDKVVALAKQFPNIGIRISIEGLSCKNDELR 
GHAGGFDKGLRTLLTLKQMGLKDIGFGCTVSNNNSKDMLSLYQLSLSLGMEFATAAFHNS YYFHKDDNVITNKDEVCKNFEQLIEWQLQENHPKSWFRAWFNMGLINYIEGGRRMLPCEA GSANFFIDPFGEVYPCNGLEEKYWQKSMGNIHETPDFMNIWESDRAQEVRAMVRKCPKNC WMVGTASPVMHKYMKYPLRWALRNKLRSLRGKPACLDKKWCDVGQDPMQGDLREKF >gi|313157655|gb|AENZ01000057.1| GENE 171 181349 - 182449 1137 366 aa, chain + ## HITS:1 COG:SMb21502 KEGG:ns NR:ns ## COG: SMb21502 COG0438 # Protein_GI_number: 16265080 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Sinorhizobium meliloti # 2 361 23 389 389 277 43.0 2e-74 MKIVVTGTRGIPGIMGGVETHCEELFPRIARKGFDVTVIRRKSYVHDTLRQYEGITLVDL ATPKKKSFEAIVHTFRAVLEAKKLKADIIHIHAVGPALLTPVARLLGLKVVFTHHGPDYD RDKWGAAAKTVLRLGEMMGARFANEIIVISNVINDMLIRKYHRKECHLIYNGVPAPQIVD FPEYLEETGVEKGKYIFSMCRFVPEKNLHHLIEAFSKTDNKGCRLVLAGDTDFEDEYSAG LKKYAREQGVVLTGFIKGKKLHALLTHARCFVLPSSHEGLPIALLEAMSYRLPVIVSDIP ANLEVGLEKEAYFPVGDIAALAQKLQQNIDAPYRQKEYPMEKYDWDAIAGQTAAVYEKCA KPRPER >gi|313157655|gb|AENZ01000057.1| GENE 172 182459 - 182950 530 163 aa, chain - ## HITS:1 COG:lin0783 KEGG:ns NR:ns ## COG: lin0783 COG2606 # Protein_GI_number: 16799857 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Listeria innocua # 7 162 3 158 158 160 51.0 1e-39 MAHGKVEKTNAARLLDRAKINYELVPYEVDEEHLAATHVAEQLGEEIATVFKTLVLKGDR TGYFVCVVPGNHEVDLKAAARVSGNKKADLIPMKELLPVTGYIRGGCSPVGMKKRFPTYI HSSAGEHPFIYISAGVRGLQLKIAPAELIAFVGATVAEISREG >gi|313157655|gb|AENZ01000057.1| GENE 173 183040 - 185553 3969 837 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1673 NR:ns ## KEGG: Odosp_1673 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 26 837 20 829 829 1011 58.0 0 MLRRKITYKPLAFLIAVLFLPTVLPAQSTRVRGRVTDAADGTPMQFVSVVFPGTTTGITT DEAGIYTLETRDTVSRVQASMVGYASQTKPLVQGGFNQIDFALEAVEFGIGSVVITPGDN PAHPILRGVIRRKPQNDPDLYDSYTCATYTKMQLDLTNIKPRFRSKRLQRNFGFVFDYID TSALTGQAYLPAMISEATADYYHSRRTPAVSREIIRASRVSGVEDSFAIAQFTGRMHGNV NFYANYIDIFNVRFASPLSDSGLAFYDYFLVDSMQVEGRKTYKIRFHPKRLATPVLDGEV 
NIDSASYALQSASARMPKGVNVNWIKHLRLENDNRIVADSTWFRHRDRVSAEFSIATGDS SKLTSFIGTREVVYSDVRVGVPIPDEVARMDNNVVIGDEEGVSRTDEAFWEQVRPYRLSD KEKGIYSMVDSVQNVPLYRNIYTLINTVIVGYWNTKYIGIGPYYKLASFNKLEGFRMQPG FRTTTAVSKRIRLSGYAAYGTRDGMFKGGGSIELAFNRRLTRKLTVSGRHDVMQLGAGQN ALTESNILSSLLSRGDSRLSMVNRGEIGYEHEWSHGTSNFLGARIQKIFGNRYVPLVRPD GRIVNSVSDAALHVGMRISKNESIYRLPFDKQYMGTVYPVLTLGFTAGVPAMLNDSYEYY RLEGGIHYKPELPPLGYSNITLQGGKIFGKVPYPLLKLHEGNGTYFYDPNAFSCMNFYEF ASDAWVTLFFEHHFNGILLGRIPLVKKLKWREVLVCKGVWGTLSKENDGSLPDTQAPLLF PRGMTSVSDPYVEMGFGVENIFRLLRVDCIWRLTHRDPKPGQDVQNFAVNLSMHLKF >gi|313157655|gb|AENZ01000057.1| GENE 174 185613 - 186272 930 219 aa, chain + ## HITS:1 COG:CAC0855 KEGG:ns NR:ns ## COG: CAC0855 COG0637 # Protein_GI_number: 15894142 # Func_class: R General function prediction only # Function: Predicted phosphatase/phosphohexomutase # Organism: Clostridium acetobutylicum # 2 217 1 212 212 105 31.0 5e-23 MIKGAIFDMDGTLVANSPVHIRAFEIFCDRYGVRGWKEKLGGAFGRGNDDIMRLIMPEEV IREKGTAALGDEKEAIYREIYAPEIEPMPGLVALLDKLRDAGIRCAVGSSGCRANVDFVL EKCDIESFFDAKINGDMVTRCKPDPEIYLTAAAALGLAPAECVVFEDAKAGIESARRAGA GRIVALASTHSRQMLAAETDADVIIGDFRDITDLETLLK >gi|313157655|gb|AENZ01000057.1| GENE 175 186564 - 186929 606 121 aa, chain + ## HITS:1 COG:XF0449 KEGG:ns NR:ns ## COG: XF0449 COG3169 # Protein_GI_number: 15837051 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Xylella fastidiosa 9a5c # 8 121 12 116 116 87 44.0 4e-18 MFKGISTVVLLVISNAFMTLAWYGHIAFKSKLERFGMLAIVLLSWGIALFEYCFQVPANR IGSAQFGGPFSIWELKVIQEVVSLSVFTLFALVFMKSDTLRWNHLAGFLCLILAVYFIFR K >gi|313157655|gb|AENZ01000057.1| GENE 176 187043 - 187957 1309 304 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1890 NR:ns ## KEGG: Palpr_1890 # Name: not_defined # Def: bile acid:sodium symporter # Organism: P.propionicigenes # Pathway: not_defined # 9 296 3 290 296 245 46.0 2e-63 MHVLEYLHRNVKTVAMPSAMVAGALLCRPVTALEAWTHQMITPSLIFLMLFVTFCRVKPS QMKPSMLHVWLLLFQVVVSAALYFALLPLDAIVAQGAMICVLAPVAMAAVVIAGMLGANV 
PTMATYSLLCNMAIALIAPVVLSLAGTGACSFTQILARIAPLLIMPFAAAQFCRFLLPKT AKWIGDHSQISFYMWLASLVVIIGRTTAFIIDLHDASLSVELWLAFAALAICLVQFKVGR MLGRRYGDAPAGGQSLGQKNTVLAVWMAQSFLDPISSIAPTAYIVWQNFVNSYQIYRHDQ RKSD >gi|313157655|gb|AENZ01000057.1| GENE 177 187958 - 189112 988 384 aa, chain - ## HITS:1 COG:no KEGG:Halhy_6144 NR:ns ## KEGG: Halhy_6144 # Name: not_defined # Def: hypothetical protein # Organism: H.hydrossis # Pathway: not_defined # 27 382 57 407 414 121 25.0 7e-26 MRRLLLVVCWLAAAQTVCARKPAVCFDLPYTLERGKYVVTVETAAGPRRFAFDTGASKTC ISERLSRELGLAASGRGTSGDFEGHRHAITYVRVPYLRMGEATYSDRQVIVLPDSSYIFR CLGFDGIAGSDLLRGFVVRMPNADSTITLAGDLRQLGGFRCRGGARIHFAGNAPVIAARV GCGKSRMKTYMKFDTGSGALFDCRYDECRAMIEKGILREVRRTEGHSGNLGWTNRSVVGE AVRGVIPEFGLAGNTLAGMPLEVTHGGHSKLGCGVFRWGTVVIDYPGRRFWLLPHAEPPG PPDVSVSGVTVALDGGRLVVGQVWDETLADRVAPGDRIVFLGTVDVSRVDPCAVIRGEIR GDKPQMTVERKDGTRVTVPVKRMK >gi|313157655|gb|AENZ01000057.1| GENE 178 189121 - 189570 211 149 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42519249|ref|NP_965179.1| 30S ribosomal protein S21 [Lactobacillus johnsonii NCC 533] # 1 148 1 146 147 85 34 1e-15 MSLEQQISKGIMEAMKAKDTVRLGALRNAKKYIIEAKTAGPEITELPDADVLKIISKLAK QGTDSAAIFTEQNRPDLAAEELAQVAVFQEFLPKQLTAEELAAEVKAVIAEVGATSRKEM GKVMGVASKRLAGRADGKDISAKVKELLA >gi|313157655|gb|AENZ01000057.1| GENE 179 189601 - 192585 4105 994 aa, chain - ## HITS:1 COG:no KEGG:Bache_0394 NR:ns ## KEGG: Bache_0394 # Name: not_defined # Def: metallophosphoesterase # Organism: B.helcogenes # Pathway: not_defined # 1 993 1 1002 1601 956 50.0 0 MKNRLLYILLLLGLCGQEVSAAGIDGPARREIGRTLTRIVAREVAGGYVRIEGVDASRKR VRIYTSVGLSYYPFREENLRAMRDSVRLLLPPEFRKADIELYSDKREVGELIPMACRTGA EYRKLLRKKKIVPFTNRSERPLVTRSSAPVVPSQGLAGRHIALWQSHGRYFDQPQNRWKW QRSQLWQTCEDLYTQSYVLPYLVPMLENAGACVMLPRERDVQKYEILADNDAAGQYREEE GPEKWQPGGMGFAHVQQVYTTGQNPFRDGTTRRVRSVTGGAESRAVWTADIPERGEYAVY VSYDSTPQNADDAQYTVHHLGGDSSFAVNQTMGGGTWIYLGRFLLDAGSQEVVTLTNRSR QAGRIVSADAVKIGGGYGNIARTVCDSLRRPGMVCHLETSGYPRFCEGARYWLQWAGFDE KVYSPKENRDDYKDDYMSRAHWVNALTGGSERMPDSAGLRIPVDMALAFHSDAGVRLNDD 
IIGTLGIFYTRENKGKFEGGADRYRSRDLTDIVMTQIVEDIRSTYEPAWSRRGLWNRAYY EARVPGVPTMLLELLSHQNFADMRYGSDPRFKFLVSRAVYKGILRYISSQYGLPYVVQPL PVESLAVQFAEGGKAAVTWSPVMDSLETTAAPTGYVVYTRIDDGGFDNGRYVDKPCLLTA QEPGRIYSYKVTAVNEGGESFPSETVAACRMPDEKGTVLIVNGFDRVSAPLSVRADSLAG FYTDIDGGVPDRQDISFIGAQHVFDMQMAKCEVDSIALGACACDYETEVIGGNTFDYPAL HGRSVAAAGYSFCSASVRAVERGEVSPDGYSAVDLILGEQRSTTIGRGVTGYAFKTFSPE LQAVLRRYMAGGGALFVSGSYVATDLWTGGEASDDDRRFAEEVLHYTYDGSRAAQRGRVR VVTSHPGFSRDEYRYVNEYRPDRYRVESPDALRPAGAGAFSVMRYVENGRTAGVASEAGG TFVMGFPFESIESDVQRDRLMRDVLDFLLKQKDR >gi|313157655|gb|AENZ01000057.1| GENE 180 192575 - 194026 1879 483 aa, chain - ## HITS:1 COG:sll1087 KEGG:ns NR:ns ## COG: sll1087 COG0591 # Protein_GI_number: 16330938 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: Na+/proline symporter # Organism: Synechocystis # 8 422 7 420 512 121 29.0 4e-27 MTSAAVIATVLGYIAVLFAVAWVSGRRADNAGFFSGNRRTPWYMAAFAMIGAAMSGVTFI SVPGSVAVDGFSYMQMVAGFTVGQLIVAFVLIPTFYRLKVVSLYEYLDDRFGVRSHRTGA WFFFVSKMLGAALRIYVVCAVMQLLVFSHYGLPFWLNAAVTMLFVWLYTQQGGVKSLIWT DTLKTVCLVGSLVLSIVFIMQALGMSLPEMTREVAASPYSKMFFFDDPASDRYFWKMFAA GIVLLIAMTGLDQDMMQRNLSCATPRDSQKNIVLTAVSQIFVIFLFLVLGVLLYIYMDRS GLAMPEKGDQVFSQVAVNGGLPVFVGILFVIGLISSTYSAAGSALTALTTSFTVDILDGT KRYGERRLTRLRRIVHVAMALGMALVILAFEYLADDSVINLVYKVASYTYGPILGMFAFG IFTRLRVRDGWMPLVALAAPLLSALLQYWAYEAWGYRIGFELLIYNAAFTMAGMLLLVKR HEK >gi|313157655|gb|AENZ01000057.1| GENE 181 194195 - 194893 688 232 aa, chain - ## HITS:1 COG:no KEGG:Closa_2690 NR:ns ## KEGG: Closa_2690 # Name: not_defined # Def: GCN5-related N-acetyltransferase # Organism: C.saccharolyticum # Pathway: not_defined # 1 232 18 249 249 398 75.0 1e-109 MCCIIRTKKAHPGVEAKRRWLSDRLREGHVFRKLNAKGVVFIEYAPLETAWVPVIGDNYY YIYCLWVAGPYKGKGYAKALMEHCLADARTHGKSGVCMLGAEKQKAWLSDQSFARKFGFE VVDTTDNGYELLALSFDGTTPQFAHNAKKQRIESQALTVYYDMQCPFIPGNLEMIRQHCE AHGVPADFIEVDTLRKAKELPCVFNNWGVFYKGRFETVNLLDAAALERILKK >gi|313157655|gb|AENZ01000057.1| GENE 182 194974 - 195195 432 73 aa, chain - ## HITS:1 COG:no KEGG:BT_1494 NR:ns 
## KEGG: BT_1494 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 6 71 3 68 74 105 65.0 9e-22 METEKKRIEYTNGEIVVVWQPHLCIHSGVCVRMLPEVYNPQERPWVKLENATTDRIVAQV EKCPSGALSYRKA >gi|313157655|gb|AENZ01000057.1| GENE 183 195686 - 196537 1246 283 aa, chain + ## HITS:1 COG:YPO1549 KEGG:ns NR:ns ## COG: YPO1549 COG0040 # Protein_GI_number: 16121822 # Func_class: E Amino acid transport and metabolism # Function: ATP phosphoribosyltransferase # Organism: Yersinia pestis # 2 282 7 298 299 224 43.0 1e-58 MLRIAIQAKGRLNEQSIDLLSEAGIGVAESKRKLISRADGFPMEVLYLRDDDIPQAVAMG VADLGIVGLNEVAEKNFPVERTMDLGFGSCRISLAVDRTADYNGLDYFRGKRVATSYPNI LARFFAEKGIDAEIHTIEGSVEIAPAVGMSDAIFDIVSSGGTLISNGLVEVEKVFYSEAV LIANPELDEDKRRETAQLTFRFNSILESRGMKYVLMNLPQEKLDEAITILPGMRSPTVLP LAQEGWCSIHAVICESQLWERIERLKALGAEGILVLTLENMIR >gi|313157655|gb|AENZ01000057.1| GENE 184 196560 - 197855 1638 431 aa, chain + ## HITS:1 COG:VC1133 KEGG:ns NR:ns ## COG: VC1133 COG0141 # Protein_GI_number: 15641146 # Func_class: E Amino acid transport and metabolism # Function: Histidinol dehydrogenase # Organism: Vibrio cholerae # 14 427 19 425 431 367 50.0 1e-101 MEMPKIYVNPPRSEWPALTARCTRQEEEIGERVAAILAEVRTGGDAALRRIVRRIEGYLP ETFEVTRERRAEAAKAVSPQLKAALAQAKANIEAFHRAQLPAQVEVETMPGVRCVQRAVA IGRAGLYIPGGKAPLFSTVLMLALPARIAGCREVILCTPCGRDGRIAPEILYAADLCGVD RVFALGGAQAVAAMAYGTESIPRVDKIFGPGNRYVTKAKQLAGAADVAVDLPAGPSEVLV LADEDARPEFAAADLLSQAEHGDDSQAVLVCRSEEFAQRAIASVGEQAARLSRRDAIGNS LANSRIVVFSDPDEQIAFADAYAPEHLIVAMRDAWDAAARITAAGSVFIGGYSPESAGDY ASGTNHTLPTGGWARAYSGVNTESFMRKITYQELTRGGLEALAPTITAMAEAEGLDAHAN AVRIRTEGGAR >gi|313157655|gb|AENZ01000057.1| GENE 185 197852 - 198886 1460 344 aa, chain + ## HITS:1 COG:YIL116w KEGG:ns NR:ns ## COG: YIL116w COG0079 # Protein_GI_number: 6322075 # Func_class: E Amino acid transport and metabolism # Function: Histidinol-phosphate/aromatic aminotransferase and cobyric acid decarboxylase # Organism: Saccharomyces cerevisiae # 4 340 5 376 385 226 36.0 4e-59 
MKPLAQLVRPNILALQPYSTARDEYAGGGIGVWLDANESPYDNGVNRYPDPHQRELKAQL AALKGVRSGQIFLGNGSDEAIDLAFRIFCEPGRDNAVSIAPTYGMYRVAAQTNGVELREV PLGADFSLPVEALLAAADGRTKLLWLCSPNNPTGNAFPDREIEELLRRFDGVVVLDEAYI DFAEGRGFLPRLDEFPNLIVLQTLSKAWGMAGLRLGLAFASERIAALFGQVKYPYNINTL TQQTAAECLRRDIAAQIAQIREERGRLAAALAGCGCIERVYPSQANFLLVKTADPDRLYG ELIAAGVIVRNRTRIAGCEGCLRITVGTPAENGRMLETVKNFRP >gi|313157655|gb|AENZ01000057.1| GENE 186 198883 - 200385 1625 500 aa, chain + ## HITS:1 COG:Cj1599_2 KEGG:ns NR:ns ## COG: Cj1599_2 COG0131 # Protein_GI_number: 15792904 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate dehydratase # Organism: Campylobacter jejuni # 316 500 4 190 190 186 49.0 8e-47 MKKALKKALFIDRDGTLIVEPPVDMQVDSLAKLEFVPGAISALKVLRGLDFELVMATNQD GLGTDSFPEEDFRIPQEKMLRTLAGEGVLFDDMLIDRTFESDGAPTRKPRTGMFGRYTGG GYDLAASYVVGDRATDILLARNLGARGILFAQETAGRRMLREAGAEEACVLISDSWAEIA EYIRRGERRVVVTRETRETRITVRLDLDGKGFGGVSADRPAAETGGAPESGTDTGNRVGT GADNGPDDNRADTGNTTKGIRSHTDDMTSSDRAGAKNPAGEWNNDVWNGNSSNSDGRNGD NRNRDDGNDNSRINDSSSDERNDDNGNADNSNCGDGISTGLRFLDHMLAQIAHHGGVALE VEARGDLDIDEHHTMEDVAIVLGEAIDRALGSKAGIGRYGFALPMDDCRALVLLDFGGRI DFEWDAEFRRERVGDVPTEMFRHFFHSLCCAARCNLQIAAKGDNDHHKAEAIFKAFARAL RMAVARSGFGYDIPSSKGVL >gi|313157655|gb|AENZ01000057.1| GENE 187 200382 - 200969 850 195 aa, chain + ## HITS:1 COG:YPO1545 KEGG:ns NR:ns ## COG: YPO1545 COG0118 # Protein_GI_number: 16121818 # Func_class: E Amino acid transport and metabolism # Function: Glutamine amidotransferase # Organism: Yersinia pestis # 4 194 5 195 196 182 48.0 2e-46 MTAIVDYDTGNLRSVADAMRRIGAEFTITADPALLRGADRVLLPGVGEASSAMAKLRERG LDLVIPTLTQPVLGICIGMQLLCLDSEEGDARCLGIFPAHVRRLKPAESGLKIPHVGWNT VGSLRGALFEGIGEETYVYYVHSYAAEVCEATIARTDYGGEFSAALGRGNFFGTQFHPEK SGSAGERILRNFLKP >gi|313157655|gb|AENZ01000057.1| GENE 188 200974 - 201705 924 243 aa, chain + ## HITS:1 COG:ECs2825 KEGG:ns NR:ns ## COG: ECs2825 COG0106 # Protein_GI_number: 15832079 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosylformimino-5-aminoimidazole 
carboxamide ribonucleotide (ProFAR) isomerase # Organism: Escherichia coli O157:H7 # 5 243 3 246 246 199 42.0 4e-51 MKIEIIPATDIIGGACVRLTQGDYGRRTTYYSDPLEAALRFEETGIRRLHMVDLDGAKAA QPRNLAVLERIASKTALEVQYGGGIKSTEALRSVFDAGASRAICGSVAVRRPELFAGWLA EFGPGKLILGADIRNGKAAVQGWTEASELSAQELIAQFAPQGLAQVICTDIARDGMLCGA SAEFYAALQGNFPGVEITVSGGIGSLADIEALDGAGLRSVIVGKALYEGRITLEELKRCL QNA >gi|313157655|gb|AENZ01000057.1| GENE 189 201687 - 202439 1007 250 aa, chain + ## HITS:1 COG:BS_hisF KEGG:ns NR:ns ## COG: BS_hisF COG0107 # Protein_GI_number: 16080540 # Func_class: E Amino acid transport and metabolism # Function: Imidazoleglycerol-phosphate synthase # Organism: Bacillus subtilis # 1 249 1 252 252 256 55.0 2e-68 MLAKRIIPCLDIREGQTVKGINFAELKNVGDPVELGAKYAAEGADELVYLDISATEEGRG TFTGLVSRIAARIDIPFTVGGGIASVADAGRLLDAGADKVTVNSAAVRDPGLIDAIASKY GSQFVVAAIDAKKVDGIWRVTTHGGKRLTERELFAWAHEVWRRGAGEILFTSMDHDGTRD GYPCETFARLAELPIPVIASGGAGSVRHIADVLTLGRADAALAASIFHYNEIPIPALKRE LHRQNINVRL >gi|313157655|gb|AENZ01000057.1| GENE 190 202445 - 203065 914 206 aa, chain + ## HITS:1 COG:PM1206_1 KEGG:ns NR:ns ## COG: PM1206_1 COG0139 # Protein_GI_number: 15603071 # Func_class: E Amino acid transport and metabolism # Function: Phosphoribosyl-AMP cyclohydrolase # Organism: Pasteurella multocida # 15 107 6 98 107 133 63.0 2e-31 MKIKAKPFAELDSAKQADGLIPAVIQDDRTLQVLMLGYMNREAYEKSLAEGRVTFYSRTR GCLWTKGETSGNYLDIVSVTADCDDDTLLIRVIPHGPTCHTGAKSCFAGAAPETEGFIRY LQSVIRGRHAEMPEGSYTSRLFDRGVNKIAQKVGEEAVETVIEAVAGNREAMIYEASDLV YHLLVLLEATGCSVADLEKELARRHS >gi|313157655|gb|AENZ01000057.1| GENE 191 203205 - 204752 316 515 aa, chain - ## HITS:1 COG:no KEGG:BDI_3446 NR:ns ## KEGG: BDI_3446 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 515 138 660 660 292 34.0 2e-77 FRLWESPWADYLTFDRFCAWLLPYKVAEFQPLDYWCDSLGCRFSERLRTAPQNDENSYSP YYRTLQILPEVRAGVGPNIPLNYAEYKGYRLFAASTFHRMPFGDCFDYSTLSVAVLRSHG IPAVFDYLPQWGRGFLRHAWFGFMNDNGVFMASEWGLDSDPGTPFHPWRPIPKIYRYSYA 
PVPARSDYLRNSKYPLDGFSPFEEDVTDWYMSTSDPNVPLFRTDLSDDYLYIAAFNNYRW NPIDVGRRNGRKGVFSKMGRRNAYLVFGHDGSGLVPVSHPFTIDAGGGICCRNADTTCRE RIRVTRKMLLNEHVARMEHRIVGARIEASDYPDFRIAEVLYSIDSVVFPDMIPVRASKKY RYWRLYSADGSHGSISELQFFMPGSDVPATGRPIGTHGLPPQRGLDKVFDGDWLTSYDHP DADGNWYGMAFDKPVVIDRVRCVPRTDDNLIHAGDTYELKYWDGIKWASLGCQVACERYL LFDNVPANALLWLSNLTRGYDERIFTYENDRQVWW Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:12:17 2011 Seq name: gi|313157653|gb|AENZ01000058.1| Alistipes sp. HGB5 contig00092, whole genome shotgun sequence Length of sequence - 1513 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 1511 1623 ## COG1640 4-alpha-glucanotransferase Predicted protein(s) >gi|313157653|gb|AENZ01000058.1| GENE 1 2 - 1511 1623 503 aa, chain - ## HITS:1 COG:CPn0326 KEGG:ns NR:ns ## COG: CPn0326 COG1640 # Protein_GI_number: 15618246 # Func_class: G Carbohydrate transport and metabolism # Function: 4-alpha-glucanotransferase # Organism: Chlamydophila pneumoniae CWL029 # 190 484 26 319 526 204 34.0 2e-52 CAVERYAPAAQPAEYRYEVEREGVCIRSEWRPHTLRIPSREGVRTLRIRDRWQEMPSDTP FYSSAFTRGIFGRGKTGNPKKAAGNITLRVILPTLRPDETLAVAGSGRELGDWKRIVPMD DSRFPEWELTLHTAHRFEYKFLIADRKTLTPILWEEGANRTWGELPGAGEHALDAAAYPR FPERRWQGAGTAIPVFSLRTEEDFGVGEFYDLKRLIDWAAATGQCVIQVLPINDTTMTGT WEDSYPYNANSTFALHPQFIRLPAAGVVEDDEYRTLRNELNALPEIDYERVNRHKLRLLR RAFERHGTRTAARRDYKDFIAANRHWLIPYAAFCTLRDETGTPDFTRWGGFARYDRKTVD AYCRSHNRDIAFHCYVQYHLHTQLSEVCAYARAHGIVLKGDLPIGISRTSVDAWLYPHLF HMDSQAGAPPDAFSAVGQNWGFPTYNWERMARDGYAWWRARMAKMAEYFDAFRIDHILGF FRIWEIPTHAVHGLLGYFNPALP Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:12:19 2011 Seq name: gi|313157642|gb|AENZ01000059.1| Alistipes sp. 
HGB5 contig00082, whole genome shotgun sequence Length of sequence - 12939 bp Number of predicted genes - 12, with homology - 9 Number of transcription units - 10, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 7 - 615 345 ## gi|313157648|gb|EFR57061.1| hypothetical protein HMPREF9720_2958 2 1 Op 2 . - CDS 644 - 976 165 ## gi|313157644|gb|EFR57057.1| hypothetical protein HMPREF9720_2959 - Prom 1011 - 1070 3.3 3 2 Tu 1 . - CDS 1589 - 2053 120 ## gi|313157651|gb|EFR57064.1| hypothetical protein HMPREF9720_2960 - Prom 2107 - 2166 3.1 - Term 2503 - 2554 11.1 4 3 Tu 1 . - CDS 2568 - 4628 1019 ## COG0457 FOG: TPR repeat + Prom 4387 - 4446 5.2 5 4 Tu 1 . + CDS 4627 - 4728 98 ## 6 5 Tu 1 . - CDS 5271 - 5618 63 ## - Term 6565 - 6626 17.5 7 6 Op 1 . - CDS 6632 - 7690 694 ## Odosp_0062 hypothetical protein 8 6 Op 2 . - CDS 7668 - 8387 728 ## Riean_1245 AAA ATPase - Prom 8479 - 8538 2.3 9 7 Tu 1 . + CDS 8414 - 8698 77 ## + Term 8841 - 8880 -0.4 - Term 8478 - 8531 6.1 10 8 Tu 1 . - CDS 8597 - 8923 225 ## gi|313157646|gb|EFR57059.1| hypothetical protein HMPREF9720_2966 - Prom 8985 - 9044 4.6 - Term 10463 - 10500 6.2 11 9 Tu 1 . - CDS 10515 - 11021 180 ## gi|313157643|gb|EFR57056.1| hypothetical protein HMPREF9720_2967 - Prom 11041 - 11100 3.1 12 10 Tu 1 . - CDS 11858 - 12079 144 ## gi|291515324|emb|CBK64534.1| hypothetical protein AL1_22700 Predicted protein(s) >gi|313157642|gb|AENZ01000059.1| GENE 1 7 - 615 345 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157648|gb|EFR57061.1| ## NR: gi|313157648|gb|EFR57061.1| hypothetical protein HMPREF9720_2958 [Alistipes sp. 
HGB5] # 1 202 1 202 202 393 100.0 1e-108 MNFYEIVSQKMKNYLLPIFALLIVGCGNRTQPNIVVENKISQTEQTDLIVRDIYERDTIK GDFNGDGKIEYAYSESNSAKYYSLDEVDDGKFNNITFSNPTIPAIETEFQIERLTNEGDL NGDGTDEIGFNERAVSRICRYTVYSLSRTNQWKKLLHVDYDALNPPSDVVRPHPNKKGYV IIQTVEWNDGESTLIEKSVKIE >gi|313157642|gb|AENZ01000059.1| GENE 2 644 - 976 165 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157644|gb|EFR57057.1| ## NR: gi|313157644|gb|EFR57057.1| hypothetical protein HMPREF9720_2959 [Alistipes sp. HGB5] # 1 110 139 248 248 215 100.0 8e-55 MLALNMWGKGGKDCNCYRFYEKTNDATYEPINYEPFISEINDFAEIAGGKITVTTLGMTG KRKIYQKAKEKNSYPMYELVRIEEYDNFIDSVFTYKLEKRLIKKEYVKRN >gi|313157642|gb|AENZ01000059.1| GENE 3 1589 - 2053 120 154 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157651|gb|EFR57064.1| ## NR: gi|313157651|gb|EFR57064.1| hypothetical protein HMPREF9720_2960 [Alistipes sp. HGB5] # 1 154 61 214 214 298 100.0 1e-79 MAANLLPEQEAYELIIRDLELDFMLTEGMIGVNGKQYIPEHYTDLVKNLADFYREAKMWG KAFDLANRMPEILSKAGSDPVFAYSVSKMYKAGIYKAKGDILESLKTLRQLKAYLVKEIK TSDNKEVLQGVLQVVDNTVKEVETEKWHILNGAI >gi|313157642|gb|AENZ01000059.1| GENE 4 2568 - 4628 1019 686 aa, chain - ## HITS:1 COG:FN1787 KEGG:ns NR:ns ## COG: FN1787 COG0457 # Protein_GI_number: 19705092 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Fusobacterium nucleatum # 52 468 18 395 628 91 26.0 5e-18 MKNYLLIILLLCAGAYSGYGQKKQPSDYHLQKAIDLIQNNGDKKEVMKCLDQQLQETPNN ADARIARANLYYQQKKYGNALLDANLAIKYYKKGGMFPKYLMYWERARTYYDMEDSDKAL ADYDTAYKLVLKDKKGTTDIHDILYERAQIYYDLKDYANADADYRQMLKHDEADQVAMIG LIRNMIARKEYDAALELVNKCEKYDTDYDETYRFRMQIYDKTGEVDKAIDDAIAYHEKSE NPSSELTNPIFKKHLSYALAKVTSKINSVSDNGSWKMLRVTIYELGHDYANAIKAYDDLE KEYGTSRNIYYYRADCYNEIGDTERAVVDMTKFIELGDGKDYFAIVRRADYYREGGKYEE AIADFTKGIELEPVDAYAYYKRGWCYELTGDDDSAMKDYNAGIDVDKSYPYIFLMRGELY LKRGEKEKANADFEEIIRQDTVAESGSCRQYALHLLGRNTEALEWMEKVIAADSTGNGVY YDKSCLLARMGQLDESVTALRKAFKNGYRSFAHIEHDDDMDAIRELPEFKRLIEEYKAKP IRIEADDEQAKDKIETISEIQMKKMHSGLYEIPCTINELPLKFFFDTGASTVTISSVEAN 
FMLKNGYLKSDDIKGKEYYSVATGEIHEGTTIRLREIKIGDAILRNVDASVVHNQQAPLL LGQTVLERFGTVTIDNINSKLIIKQK >gi|313157642|gb|AENZ01000059.1| GENE 5 4627 - 4728 98 33 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSYVYLLVFVYVVTHFAKISLIYPFEIMSLLQE >gi|313157642|gb|AENZ01000059.1| GENE 6 5271 - 5618 63 115 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MIFPFGIEKEAEIYCTSAKVALKSQEKSIGPQIADMANNLIVNGVVDFPYSTILQFMVTW TEQAVRANGWNIQDEDGVSWWIGLYAQSYIRAMNNNEHSFDEIFKAVFVKYFELR >gi|313157642|gb|AENZ01000059.1| GENE 7 6632 - 7690 694 352 aa, chain - ## HITS:1 COG:no KEGG:Odosp_0062 NR:ns ## KEGG: Odosp_0062 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 51 352 48 348 348 431 66.0 1e-119 MYPAFQPNAEFAVGLNADNPFLDNPTAEGTTTVEERPATPVIPLGVRHNRHFIEANTKPV DIAHLREDCVVPVFSKDNEVTISHQSFIETVLGAAHRMFPQEVIDAPDIVVSHIIKGRIP EAIHKPVNQLLETDKTIYYERMAFCFEIPTIYEDVAGNRLNLSIGGVRAYNHENLYSKKT VEKFKVFIGFKNLVCCNLCVSTDGFKSELRVMGVQDLFDATLHLFQAYDAERHVKRMAAM QERTLTEHQFAQLIGKTRLYQYLPTAEKRQLPAMEFTDCHINAVAKAYYMDENFSRGTNP EIDLWRVYNLFTGANKSSYIDTFLDRSLNATELISGIGRAIEGDAQYRWFVE >gi|313157642|gb|AENZ01000059.1| GENE 8 7668 - 8387 728 239 aa, chain - ## HITS:1 COG:no KEGG:Riean_1245 NR:ns ## KEGG: Riean_1245 # Name: not_defined # Def: AAA ATPase # Organism: R.anatipestifer # Pathway: not_defined # 1 219 1 221 295 303 62.0 4e-81 MQLRQSMRRAAKMRLALAGASGSGKTYSSLLIAYGMTGDWSKIAVIDSENCSADLYAHLG GYQVLTLENYAPETYIEAIGICEQAGAEVIIIDSISHCWDYLLDFHANLQGNSFANWAKV TPRQNAFIQRILTSSAHVICTMRSKQDYVLSDKNGKMVPEKVGLKAVQRDNVDYEFTAVL DIAMNHKATTSKDRTGLFTGRPEFLITPAVGQAILKWCNLSNPSVQPQTPYNHVPSVSA >gi|313157642|gb|AENZ01000059.1| GENE 9 8414 - 8698 77 94 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIKLIILSVAFVFVLQFLSIICHKKIPPRISFSKPNRGRGVNWGIYSWVDDKIPYLINIK ALRIKQLINGKCSKSFLIIQCIEQYLCLHLIITK >gi|313157642|gb|AENZ01000059.1| GENE 10 8597 - 8923 225 108 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157646|gb|EFR57059.1| ## NR: gi|313157646|gb|EFR57059.1| hypothetical protein HMPREF9720_2966 [Alistipes sp. 
HGB5] # 1 108 476 583 583 214 100.0 2e-54 MLRAYDKTANPKLLDLCINFSKWLLENANELPLPIRQLNYLQSVKRLRLLTGSELSIIND IIQNDTDPEHKIAAYLLCDNQVQAQVLFNTLDDQERFRTFPIYQLLNA >gi|313157642|gb|AENZ01000059.1| GENE 11 10515 - 11021 180 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157643|gb|EFR57056.1| ## NR: gi|313157643|gb|EFR57056.1| hypothetical protein HMPREF9720_2967 [Alistipes sp. HGB5] # 1 168 234 401 401 343 99.0 4e-93 MADKVPLMLFAWHANFEQLSPICNYLIRSLQHNRFFDAPDFLIIAQALDGYYKRFVNKKD GKDIKKYQLQIERLLEQFKGVYMLQECRIDAEELTQSRHKYSHLIPDDDKMVSKAVAGDD LYDLTQKCIVLLTCCILDNIGLTTDEINICFKDSAIQQIVRDLPPTFD >gi|313157642|gb|AENZ01000059.1| GENE 12 11858 - 12079 144 73 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515324|emb|CBK64534.1| ## NR: gi|291515324|emb|CBK64534.1| hypothetical protein AL1_22700 [Alistipes shahii WAL 8301] # 1 73 1 73 73 115 90.0 1e-24 MSVLLTKYNAIGLSASEPITITWQLNKRCKTLPEKVSLAYFIKALDMQEQYKAVTERAEQ AKQFKEEFKGFEF Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:13:35 2011 Seq name: gi|313157635|gb|AENZ01000060.1| Alistipes sp. HGB5 contig00041, whole genome shotgun sequence Length of sequence - 8817 bp Number of predicted genes - 7, with homology - 7 Number of transcription units - 4, operones - 3 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 131 - 772 1101 ## BF0572 hypothetical protein 2 1 Op 2 . + CDS 797 - 1429 1115 ## BDI_1217 hypothetical protein + Term 1448 - 1493 16.8 - Term 1497 - 1554 7.6 3 2 Op 1 . - CDS 1582 - 4458 4307 ## COG0749 DNA polymerase I - 3'-5' exonuclease and polymerase domains 4 2 Op 2 . - CDS 4555 - 5007 397 ## COG0720 6-pyruvoyl-tetrahydropterin synthase + Prom 5283 - 5342 3.1 5 3 Op 1 . + CDS 5432 - 5815 290 ## gi|237722468|ref|ZP_04552949.1| predicted protein 6 3 Op 2 . + CDS 5644 - 7566 1788 ## COG3291 FOG: PKD repeat + Term 7570 - 7615 1.7 + Prom 7697 - 7756 4.9 7 4 Tu 1 . 
+ CDS 7785 - 8801 1200 ## COG2207 AraC-type DNA-binding domain-containing proteins Predicted protein(s) >gi|313157635|gb|AENZ01000060.1| GENE 1 131 - 772 1101 213 aa, chain + ## HITS:1 COG:no KEGG:BF0572 NR:ns ## KEGG: BF0572 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 211 3 207 210 199 50.0 9e-50 MTVKKLRYLAFALTAVAFCACQKEPSTSSLHKDYLVYTAHDTDADFAAFDTYYIPDSILL IGSADKTEYWKDANALEIVNTVVGRMNAAGYTRTDDKDAANLGLQLSYVQKVTYFVGYDY PYWWWYYPYYWTPGYWGDWAGWHYPYSVYYGYTAGSLLVEMMNLEADQESGKKLPVIWDS FIGGLLTSSEELNQQRTVDAVQQAFDQSPYLKK >gi|313157635|gb|AENZ01000060.1| GENE 2 797 - 1429 1115 210 aa, chain + ## HITS:1 COG:no KEGG:BDI_1217 NR:ns ## KEGG: BDI_1217 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 9 210 8 209 209 206 47.0 7e-52 MKTSKYQVLKTIALCVVLLAAARTGKAQIFPNSYINVDWQVGVPLGSAYADKASGWGMNF EGGYFITPSVSVGPFISYQTNLSSIPRQTLDLGDGSALTVNQKHGSFQLPFGVTGRYTWL PDSVFQPYAGLKLGANYAEFSSYYYVIKQYTDTWGFYLSPEIGVSIFPRPDYRFGFHVAL YYSYATNSGDILTYSVNNLNNFGIRVGISF >gi|313157635|gb|AENZ01000060.1| GENE 3 1582 - 4458 4307 958 aa, chain - ## HITS:1 COG:HI0856_2 KEGG:ns NR:ns ## COG: HI0856_2 COG0749 # Protein_GI_number: 16272796 # Func_class: L Replication, recombination and repair # Function: DNA polymerase I - 3'-5' exonuclease and polymerase domains # Organism: Haemophilus influenzae # 341 958 20 648 648 504 45.0 1e-142 MKKLFLVDAYALIFKYYYAFLGRPMRNRAGMNTSVVFGFVKFLRDIQKRERPDLLGVAFD PKGGSFRREVFPEYKANRAETPEDILLSVPYVKRVLEAMCIPILEVEGYEADDVIGTLSQ KGVEAGYEVFMVTPDKDYGQLVRDNCKIYKQKGADGSIEIVDRDSIREKYGIDDPVLVRD ILALWGDASDNIPGVPGIGEKSACKLVQEWGTVENILDNVSKIKGKQGEKIAAWGDKLRL AKHLTTICLDVPIPFRPEDLTVCDPHIDELKAVFAELDFKAFMNDLTNLAPPEALPEGPR QEAQTQLAEMARAKSAAAKRAALVGQGNLFGDPVVEMPAATDVPAAELQAEAEAMQFKTA QTTPHDYRLVEDAAQLRGVVDEVGKYEEFCFDTETTGFDIFNDRIVGMSLAVNPFEAWYI PFKEENTAEYTEIVRPLFENDRIAKIGQNIKFDLMVLRQLGLEIRGRKYDTMILHYLLDP ESRHNMNALAEKYLNYKPIEIETLIGKGSKQLTMDLVNVERVKEYAAEDADVTLRLKHAL 
YPQIEELGLQHLYFEIEEPMIAVLADIEMAGVRIDSEALAVYSVELSRRLAELEAAICEE AGESQLNINSARQLGEVLFGKMRIAEKPKMTKTKQFCTDEDYLQSFAHKHRIVDLILEYR GVKKLLSTYVEALPQLVNRRTGRIHTSFNQAVTATGRLSSTNPNLQNIPVREEMGRRIRR AFIPSDEEHLLLSADYSQVELRLMAHLSGDESLIAAFAHGEDIHAATAAKLFNKTLDEVT SEERRRAKTANFGIIYGISAFGLSQRLEIPRKEAKEIIDGYFASYPKVQEYMDNVVAKAK EEGFVSTIFGRRRYLNDIASHNAIARGLAERNAVNAPIQGSAADIMKIAMINVHRRFAAE GIRSKVILQVHDELVVDMLRSEQERVVAIVTEAMESAAALKVRLVVDYGVGGNWLEAH >gi|313157635|gb|AENZ01000060.1| GENE 4 4555 - 5007 397 150 aa, chain - ## HITS:1 COG:HI1190 KEGG:ns NR:ns ## COG: HI1190 COG0720 # Protein_GI_number: 16273112 # Func_class: H Coenzyme transport and metabolism # Function: 6-pyruvoyl-tetrahydropterin synthase # Organism: Haemophilus influenzae # 3 144 1 139 141 89 37.0 2e-18 MTVIRLTKEFSFEAAHALDGYDGPCREIHGHSYRLFVTVKGCPAAGEGDPKCGMVMDFGV LKRIVNEEIVSHFDHSLVLRQTPCNASLRAMLAERFGNIVTVDYQPTCENMLGDFARRIS RRLPEGVELHSLRLHETATSFAEWFAEDNR >gi|313157635|gb|AENZ01000060.1| GENE 5 5432 - 5815 290 127 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|237722468|ref|ZP_04552949.1| ## NR: gi|237722468|ref|ZP_04552949.1| predicted protein [Bacteroides sp. 2_2_4] predicted protein [Bacteroides sp. 
2_2_4] # 7 94 30 116 431 66 40.0 6e-10 MKFWKSALLYAATVCTLTATSCSDETKEPKLYTTAVTDDGNGTAEANPASAIPGETVTLT ATPAENFFFLKWTVLSGGIELENPTENPTTFTLPEGGGISKSEPNSPICFRLQPPKSSRM TQRKRPR >gi|313157635|gb|AENZ01000060.1| GENE 6 5644 - 7566 1788 640 aa, chain + ## HITS:1 COG:MA4289 KEGG:ns NR:ns ## COG: MA4289 COG3291 # Protein_GI_number: 20093078 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Methanosarcina acetivorans str.C2A # 193 586 848 1306 1734 134 28.0 5e-31 MDGPQRRYRTGKSDGKPDDVHPAGGGGNIEIRAEFTDLLPLAASEIIAYDAEKAATVTRL SVSGDLTDEVTAAIFDKFQNLLFLELTDAKAIPDNFMYDAGSRAPKQTKLEELNAPEVTS VGVNAFRNSVLTAVNNSSLRTVHLPKVQTIGERAFSDLWTLEEAGSAFFYCGGPTSISLP KLKTAGESCFAYSTKLTQIDIPNVEKLEKMAMYACNGLLSFSGEKVESVGAQCFERCYAL KEVSFPAATAFGSGCFIDCESLTKVNIPLCTAISDKMFFIAQASVCTSTLKTIDLSNITT IGTSAFEGCKALEDVVDFSKVTTVGARAFYECITLRTVDLSNVTATGDYAFFRCWGFTGE LNMPKLTAPGKYLFRECKNITKVSAPVLENMSLYMFAECTALADIDMPVLKKVENFSFNK CEGLTKVDMPTVETIIAGGFSGCVNIATVNLPNVKSIGGTVFNNCSLITSISLPALQTFT GGAFSGMTGLTSYDVPLLGTLETVPSSLFQNCKSLTAIDLPAAKDLGMNAIRGCDALESI SLPNVEKVGNFCFSENPKLKEVDLPKATSLGTKAFDKCTALTTLKLGATTAITLANDTFT SADVPAACNLFLNAAGVEYPTAAGQQWKNLTWKSIASYIP >gi|313157635|gb|AENZ01000060.1| GENE 7 7785 - 8801 1200 338 aa, chain + ## HITS:1 COG:AGl1135 KEGG:ns NR:ns ## COG: AGl1135 COG2207 # Protein_GI_number: 15890685 # Func_class: K Transcription # Function: AraC-type DNA-binding domain-containing proteins # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 32 286 47 302 313 114 27.0 4e-25 MKYVLKDLAPRPEKDLFITQYWSDKQSDPPLHFHEDYMLSLTLNVRGTRITGRTADDFTE KDLIIIFPGVPHCYTRDEAYADIDCEAVAVQFSRDMPNWQLFETEYLQPIQKMLSLPVAG LHFSEMVVDRVRERLLQLPGLRGFESVALFLDILNDLATAGPDEMHIIGTTDYKTQADGN ERIKKILQFVENNYHNKITLEDIGAEVGMSPSSVCRYFKKNTCQNLWTYINGFRIVRAAQ MIVETDAPISEISTRCGFHNISNFNHAFRERIGSAPGEYRRKFGSTVISPDRKQVEEIGR KLDDLRNGGSLSEGGEIRPPGPLRKTGRGNLNPLSRSK Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:14:01 2011 Seq name: gi|313157600|gb|AENZ01000061.1| Alistipes sp. 
HGB5 contig00014, whole genome shotgun sequence Length of sequence - 33122 bp Number of predicted genes - 32, with homology - 31 Number of transcription units - 19, operones - 8 average op.length - 2.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 41 - 1207 1437 ## gi|313157632|gb|EFR57047.1| fibronectin type III domain protein - Prom 1357 - 1416 4.0 + Prom 1247 - 1306 4.0 2 2 Tu 1 . + CDS 1340 - 1516 58 ## + Term 1647 - 1684 -0.8 3 3 Tu 1 . + CDS 1878 - 4019 2782 ## BT_4606 hypothetical protein + Term 4041 - 4089 14.6 - Term 4030 - 4073 13.1 4 4 Op 1 . - CDS 4098 - 6998 4734 ## COG0612 Predicted Zn-dependent peptidases 5 4 Op 2 . - CDS 7010 - 7522 843 ## COG2839 Uncharacterized protein conserved in bacteria - Prom 7545 - 7604 4.3 - Term 7572 - 7601 -0.3 6 5 Op 1 12/0.000 - CDS 7650 - 8009 521 ## COG0853 Aspartate 1-decarboxylase 7 5 Op 2 . - CDS 8009 - 8872 1014 ## COG0414 Panthothenate synthetase - Prom 9039 - 9098 3.0 + Prom 9029 - 9088 1.7 8 6 Op 1 . + CDS 9211 - 9867 824 ## Odosp_2759 hypothetical protein + Term 9883 - 9933 10.0 + Prom 9897 - 9956 3.0 9 6 Op 2 . + CDS 10147 - 12135 3457 ## BF3425 putative dipeptidyl-peptidase III + Term 12151 - 12198 11.5 - Term 12145 - 12181 4.2 10 7 Tu 1 . - CDS 12205 - 12630 569 ## BF0549 hypothetical protein - Prom 12795 - 12854 5.8 11 8 Tu 1 . - CDS 12945 - 13109 302 ## COG1773 Rubredoxin - Prom 13147 - 13206 1.8 12 9 Op 1 . - CDS 13257 - 13835 884 ## gi|313157604|gb|EFR57019.1| hypothetical protein HMPREF9720_0392 13 9 Op 2 . - CDS 13819 - 14526 275 ## PROTEIN SUPPORTED gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 14 9 Op 3 . - CDS 14496 - 14903 299 ## COG3787 Uncharacterized protein conserved in bacteria 15 10 Tu 1 . - CDS 15284 - 16615 1974 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 16647 - 16706 6.4 + Prom 16645 - 16704 4.6 16 11 Op 1 . + CDS 16761 - 17804 1271 ## COG1663 Tetraacyldisaccharide-1-P 4'-kinase 17 11 Op 2 . 
+ CDS 17788 - 18594 1156 ## COG0005 Purine nucleoside phosphorylase + Term 18737 - 18775 1.1 18 12 Tu 1 . + CDS 18813 - 19310 805 ## COG0629 Single-stranded DNA-binding protein + Term 19345 - 19383 7.3 19 13 Tu 1 . - CDS 19683 - 20417 205 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 - Prom 20441 - 20500 2.2 - Term 20506 - 20536 3.0 20 14 Op 1 . - CDS 20638 - 21075 588 ## COG2131 Deoxycytidylate deaminase 21 14 Op 2 . - CDS 21083 - 24376 5136 ## COG0653 Preprotein translocase subunit SecA (ATPase, RNA helicase) - Prom 24405 - 24464 5.0 - Term 24457 - 24497 1.1 22 15 Op 1 . - CDS 24518 - 25480 935 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 23 15 Op 2 . - CDS 25477 - 25974 470 ## gi|313157611|gb|EFR57026.1| hypothetical protein HMPREF9720_0403 24 15 Op 3 . - CDS 26040 - 26450 485 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 25 15 Op 4 . - CDS 26461 - 27420 1430 ## Palpr_3034 hypothetical protein 26 15 Op 5 . - CDS 27422 - 27742 655 ## COG2151 Predicted metal-sulfur cluster biosynthetic enzyme - Term 27766 - 27800 2.3 27 16 Op 1 . - CDS 27870 - 28298 835 ## COG2166 SufE protein probably involved in Fe-S center assembly 28 16 Op 2 . - CDS 28319 - 29884 2306 ## RB2501_05390 hypothetical protein 29 16 Op 3 . - CDS 29910 - 30875 1317 ## Bache_1321 GSCFA domain protein - Prom 30905 - 30964 3.3 + Prom 30826 - 30885 3.5 30 17 Tu 1 . + CDS 30909 - 31721 860 ## COG1947 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase 31 18 Tu 1 . - CDS 32047 - 32502 805 ## COG0698 Ribose 5-phosphate isomerase RpiB + Prom 32488 - 32547 2.9 32 19 Tu 1 . + CDS 32606 - 32971 427 ## COG2832 Uncharacterized protein conserved in bacteria Predicted protein(s) >gi|313157600|gb|AENZ01000061.1| GENE 1 41 - 1207 1437 388 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157632|gb|EFR57047.1| ## NR: gi|313157632|gb|EFR57047.1| fibronectin type III domain protein [Alistipes sp. 
HGB5] # 1 388 1 388 388 692 100.0 0 MPPAPAASDITSDSFAISWEAVEGAVSYTYTLLHKNTHGAITVIVPETNTDALSAAFTDL KASTAYTFRIKALGDGEKTLDSEWCECPVTTDEPEYLSGPWVTFEVNYEERTSYYCRITA TFTPNDKTVAYYATCVYGSYFDDDPDDPDFEPNTEEDMIGYLSAQKPISNNTLTESSWGY STEYILAVVGVDADGNFGKLNWAKLKTPARSSSGDQGQSEAALRIQHVVVNSSELEGAPE NCFATVYRFEPTAGARAFRYEDGYYKGDFAKKETSYWRQYFSSLANAYGEDYDGYYSGWK SSMDLEVSADGFYYYDVTFWDVSMANETFEVIYMAYEADGVPGVPACYTVTLPAEIPAVT PQQTPSAYKAAVAAANRLAHPQALPRRR >gi|313157600|gb|AENZ01000061.1| GENE 2 1340 - 1516 58 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSLKNCETNLQVFSENSNTRSPSAEGLRKRIVTVIFGLFYNFFLFLRMEFAGFFRTGA >gi|313157600|gb|AENZ01000061.1| GENE 3 1878 - 4019 2782 713 aa, chain + ## HITS:1 COG:no KEGG:BT_4606 NR:ns ## KEGG: BT_4606 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 25 214 18 212 394 120 36.0 2e-25 MKFFTMRRLAACVAALLLVSPPLLTGCDDYDDSELWQNVNDLKSRIEALETKIGQMNTEI AALQKIVDDAVTVVKVEKSADGYVIYFSDNTTAEIKNGTAGADAPVIGVKQDTDEVYYWT ITTDGKTEWLMAGGVKLRVTGESVKPVMSVDDEGYWTISYDGGKTSERILDAAGDPVSAV TEGSSIGSLFKSVNYDDNNVYFELSDGSIVTVPVRSNFYMLIRKAPEVATFVFGETKVWD VESAGVTKTLVSKPDEWKVTYADGKLTVTAPTEAHKECADLRGTVGITYFSANGQADAVM MDVVADADYRGETVGEDFTVNITEITDKNIKATVTPKDDAAYWYMGYTTQENIDNGGLDK LINDPLNGYLSMLGYYAAYNFLDMYAFKGAKTDFSFPGVLKGGTEFCAVLFGFTPDASAA YPVPVTLNTEVMTVPFKTQEPVVINTVYRIDVSDVSWYGAKYVCVPSDDLGYLHGFVRKS EFDTYDDDAAFMKSRIDIYKRAYRDELESGALTWADLTFTETQTVVAPAYIEEAGATRMG LVDDTEYYVYAFSCTDGNATSPLSKAAFKTGKFIPSEECTFEITTTVERQDVTVNVVPSN GNVTYYFSVTRAAQQEQFDADLQFAVDDLIWTKIWVESQQITLSSMLSKGEDSNKWTDLW AATAYIVCAYGVSEDGTITTRPTLARFVTKGTIDQTSVKNAAASAMGGKVFRR >gi|313157600|gb|AENZ01000061.1| GENE 4 4098 - 6998 4734 966 aa, chain - ## HITS:1 COG:sll0915 KEGG:ns NR:ns ## COG: sll0915 COG0612 # Protein_GI_number: 16330991 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Synechocystis # 37 484 62 514 524 208 29.0 4e-53 MRKFLLFAVVIAAMLAAACSKYKYETVAGDPLNTRIYTLDNGLKVYMSVNKEAPRIQTYI 
AVKVGGKNDPAETTGLAHYFEHLMFKGTQQFGTSDYAAEKPMLDEIENLFEVYRKTADEA ERAAIYRRIDSISYEASKIAIPNEYDKLMSAIGANGTNAFTSQDMTVYVEDIPSNQIDNW AKIQADRFKNPVIRGFHTELETIYEEKNMSLTQDSRKVWEAMDAALFPNHPYGTQTVLGT QEHLKNPSITNVRNYHKTYYVPNNMAVCVSGDFEPDEMVATIEKYFGDMQPNPNLPELQF EPEKPITTPVVKEVYGLEAANVMLGWRLPGANDKSTDISDIVGSILYNGQAGLIDLDLNQ QQKVLSAYGYASTQPDYSSFLVAGRPKTGQSLDEVRDLLLEEVAKLREGDFDEKLIEATI NNYKMQLMRSFEENDSRAILYVYSFISGADWADEVARLDRMSKITKQDVVEWANKYLGPE SYAIIYKREGKDPDEQKIAAPKITPIVTNRDSQSEFLSEIQTSQVQPIEPVFVDYKKDMS QFEAQPGIDVLYKKNETNDIFTLIYVFNTGTENDPALNLAFNYLSYLGTDKMTAEEIASE MYDIACSFYMNAGANQSSIQITGLSENMGKAMEIVEGLIAGAKPDEAILENLKGDMLKSR ADAKLNQSRCFGALQRYLFYGGDFIRRTTLTDPALQALTSEQLLAKIGGLMGKQHEVLYY GPQKEAEVKSALAEHHKVAADLQPLEKKYSTLLPTDANKVVLAQYDAKQLYYLQFSNRGE KFDVAADPEITLYNEYFGGGMNTIVFQEMREARGLAYTAWATLATPSNANGDYSYYAFIA TQNDKMQKAVEAFDDIINNMPESEKAFGIAKEALVSRLRTERTVKDGVLWSYLRARDLGL DAPRDKQILEKVQSMTLDDVKAAQQKWVKGRKYVYGILGDIQDLDLNYLKTLGPIQTVSQ EEIFGY >gi|313157600|gb|AENZ01000061.1| GENE 5 7010 - 7522 843 170 aa, chain - ## HITS:1 COG:NMB0932 KEGG:ns NR:ns ## COG: NMB0932 COG2839 # Protein_GI_number: 15676826 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Neisseria meningitidis MC58 # 1 151 1 150 162 68 37.0 6e-12 MDIALSVAAFLLSIVGIVGCIVPALPGVVLSYAGLLCAYFTSYSSMSPAAIWLWLAITVA VSVADYFLPAWMTRRFGGSRSGAIGATVGVFAGFFFFPPVGIILGPFFGAVLGELLNDRR DAGKAFLVGIGSFLSFVVGTGIKLAAAIGMFVHVTADTYPAVRDWFATLF >gi|313157600|gb|AENZ01000061.1| GENE 6 7650 - 8009 521 119 aa, chain - ## HITS:1 COG:MT3706.1 KEGG:ns NR:ns ## COG: MT3706.1 COG0853 # Protein_GI_number: 15843213 # Func_class: H Coenzyme transport and metabolism # Function: Aspartate 1-decarboxylase # Organism: Mycobacterium tuberculosis CDC1551 # 7 111 5 109 139 129 60.0 1e-30 MKFQIEVIKSKIHRVTVTQADLHYVGSITIDEALLEAANIIEGEKVQIMDIDNGERFETY VIKGERGSGCICLNGAAARKVQVGDVIIIASYALMDFEDAKQFKPWIVFPDTATNRLVK >gi|313157600|gb|AENZ01000061.1| GENE 7 8009 - 8872 1014 287 aa, chain - ## HITS:1 COG:aq_2132 KEGG:ns NR:ns ## COG: aq_2132 
COG0414 # Protein_GI_number: 15607078 # Func_class: H Coenzyme transport and metabolism # Function: Panthothenate synthetase # Organism: Aquifex aeolicus # 12 287 4 281 282 222 43.0 5e-58 MPRKFVSEEMKIFTSVKELRAELETAEQSGIGFVPTMGALHAGHRSLVERARRENATVVV SVFVNPTQFNDKNDLRHYPRTPEADARLLERAGADFVLMPSVEEIYPEPDGRQFDFGQID KVMEGATRPGHFNGVAQVVSRLFDIVRPARAYFGEKDFQQIAVVKAMVDQLSLPVEIVEC EIVRGEDGLALSSRNALLDAEHRAAAPQIYAALRAAAEKSQELAPEALKAWVTAEVERNP LLKVIYYQSVDARTMQEVAAWSDAEHIQGCIAVQAGDIRLIDNIRIR >gi|313157600|gb|AENZ01000061.1| GENE 8 9211 - 9867 824 218 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2759 NR:ns ## KEGG: Odosp_2759 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 23 218 265 448 448 65 26.0 1e-09 MKKILLCLLALAAGLTAFGQNESSFGVRAGLNVANLHFSAGGLSASMDSRASYHVGFAYQ QPVLRVLPLYFETGLYLSGRGASVTAGDLDLDVEGKAKFNMLYLQVPAVVSWHFNIKSVS IQPAAGVYYGFGIHGKLKGDAAKVDLFKEISVPVEDGSVEGQVFKRSDFGLRFGVGVTVM KHYYAGVGYDLGLLNISKDSGEGKIKNGSFFISLGYNF >gi|313157600|gb|AENZ01000061.1| GENE 9 10147 - 12135 3457 662 aa, chain + ## HITS:1 COG:no KEGG:BF3425 NR:ns ## KEGG: BF3425 # Name: not_defined # Def: putative dipeptidyl-peptidase III # Organism: B.fragilis # Pathway: not_defined # 1 662 1 675 676 829 59.0 0 MTVTALALTSCGGASRSETPWIVDRFDDIKVIRYEVPGFEQLPLEEKELIYYLAEATKCG RDILFDQNFKYNLAVRRTLETVYENYEGDRTTAEWKALEKYLKKVWFANGIHHHYSNDKF VPEFTEGYLLDVIETIPEEKFGELNPLRGEVCKAIFDASLYKTRLNQTAGEDLIATSSNN YYEGVTQAEVEKFYADMADPHDPEPISYGLNSKLVKEDGVVRERVWKIGGMYSPAIEKIV YWLEKAQAVAAEPQKSNIAALISYYKTGDLREFDRYNIGWVKDTVSNVDFVNGFIEDYGD PLGRKASWEGIVDFMDKEACHRTEVISENAQWFEDHSPVDPRYRKEKVKGVSAKVITVAM LGGDCYPATPIGINLPNADWIRKEYGSKSVTIDNITYAYDMAAHGNGFNEEFVLRADDRA VMDKYGKLADDLHTDLHECLGHGSGQLAPGVKGGELKSYSSTLEETRADLFGLYYLGDPK MVELGLVPSFDVAKAGYAKYILNGMMTQLARIEPGKNVEESHMRNRKLICEWCYERGKAD NVIEMVRENGKTYVVVNDYEKLRGLFGDMLREIQRIKSEGDYEAGRALVERYAVQVDPEL HKEVRDRYYALSIEPYGGFVNPEFELVEKDGEIVDVKISYPADYVKQMLGYSKDYSFLPN IN >gi|313157600|gb|AENZ01000061.1| GENE 10 12205 - 12630 569 141 aa, chain - ## HITS:1 COG:no 
KEGG:BF0549 NR:ns ## KEGG: BF0549 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 2 133 54 184 186 125 48.0 6e-28 MRTGGGVYIAGFAASKSKTKIVIEGGRAGIRFKTGEPFALIVRAKDNKADPMSIVRIFRM KSTKKNRSAVISAVGSFAVQSNTMEYLPFSAEKYGESSYRLTFGEQPEGEYGIIVSNPNN VDEKMVIVSTFAIGGEEARNR >gi|313157600|gb|AENZ01000061.1| GENE 11 12945 - 13109 302 54 aa, chain - ## HITS:1 COG:CAC2778 KEGG:ns NR:ns ## COG: CAC2778 COG1773 # Protein_GI_number: 15896033 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Clostridium acetobutylicum # 1 54 1 54 54 81 79.0 3e-16 MKKYRCIVCEWIYDPAVGDPDGGIAPGTSFEDIPDDWVCPVCGVGKDQFEEVEE >gi|313157600|gb|AENZ01000061.1| GENE 12 13257 - 13835 884 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157604|gb|EFR57019.1| ## NR: gi|313157604|gb|EFR57019.1| hypothetical protein HMPREF9720_0392 [Alistipes sp. HGB5] # 1 192 1 192 192 289 100.0 6e-77 MEKIYKYRFSRRTIYWMLIYLVVFGLLGGLLYHLYEGGYLSAWFTSFIVALIALMSLSIP RYIVVTDEKVEVRCLLDITEIKRGEIASVRRVDPKRMKWFFPLFGGCGFFGYYGHFLDLR RFERVRLYASEWKNFVEITDIYEERLYVSCSDADCLIAELMPPGGNRLTEEAAEEEEEER EEAQTAKETQTT >gi|313157600|gb|AENZ01000061.1| GENE 13 13819 - 14526 275 235 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163764767|ref|ZP_02171821.1| ribosomal protein L15 [Bacillus selenitireducens MLS10] # 8 229 7 231 234 110 36 1e-23 MERQTGVIIVAGGGGSRMGGARPKQFMLLGGMPILARTINNFAAALPGAEIVVVLPAEHT GFWHNLSARFEVAPHTVAEGGRERFHSVKNGLAALKRDPELIAVQDGVRPLGSQELIRRT VAAAAEHGTAIPVVEPVDSFRETDGAASQIIDRRRLRIVQTPQVFRAELLRRAYETEYRP EFTDDASVVELSGEAVFLCEGERANLKITTPEDTVIAEALLADREETHETDGENI >gi|313157600|gb|AENZ01000061.1| GENE 14 14496 - 14903 299 135 aa, chain - ## HITS:1 COG:ECs4035 KEGG:ns NR:ns ## COG: ECs4035 COG3787 # Protein_GI_number: 15833289 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Escherichia coli O157:H7 # 2 127 9 138 147 102 40.0 2e-22 MRFLRRHHVLTLATVAEGVPYCSNAFYCYDKERNLLVFSSDLATRHAQEMERNPRIAASV 
VLETKIVGRVQGLQLCGTAARADETARRAYLKRFPYAALAELTLWAIRPDYMKLTDNTLG FGKKLIWNDKLESSS >gi|313157600|gb|AENZ01000061.1| GENE 15 15284 - 16615 1974 443 aa, chain - ## HITS:1 COG:PA0667 KEGG:ns NR:ns ## COG: PA0667 COG0739 # Protein_GI_number: 15595864 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Pseudomonas aeruginosa # 50 408 83 438 447 201 33.0 3e-51 MKSLRLPICVLLLVAAGACSRTQQQETAGEEAAEQHNIVYGINADNYRTETGEVGSGETL GKILNGFGVSALTIDRLDKASKDIFPLRNIRAGHKYTAFIHEDSLYAPHLDYLVYERNVA EYVVFGFHDDSVSVRTGEKQFTVRRTKKSATINSSLWGAIMEQELPYALAAEMEDIYQWT VDFFGIQKGDNFTVIYDERFIDDSVSVGIGRIWGAKFCQGGKEYYAIPFRQGGKIRYWEY DGASLRKQMLKAPLKYSRISSKFTYARKHPIYKVYRPHTGVDYAAPKGTPVHAVADGVVT FKGWGGGGGNTLKIKHAGNLMTGYLHLSGYAKGISKGSRVSQGQLIGYVGSTGASTGPHL DYRIWKNGTPIDPLKVPQEPAEPIAKENRATFEFVRDRIAAELNGEVKDEERITQLDSLV IPQAPAASAPAGETSAGETTAAK >gi|313157600|gb|AENZ01000061.1| GENE 16 16761 - 17804 1271 347 aa, chain + ## HITS:1 COG:PM0860 KEGG:ns NR:ns ## COG: PM0860 COG1663 # Protein_GI_number: 15602725 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Tetraacyldisaccharide-1-P 4'-kinase # Organism: Pasteurella multocida # 5 342 13 320 325 123 27.0 6e-28 MLKCLLAPAALLYKAGVTFRHRLFDWGVLKSEKFDIPVVCIGNITVGGTGKTPMAEMVIA YMSQMHNVALLSRGYGRRTKGYREVKTDSHYRDVGDEPLQIKLKFPGTVVVVCEKRAEGI RRIRAEHPEVDLIIMDDGFQHRYVEPKINIVMIDATRPIQHDRMLPLGTLRDLPEELHRA HYFVVTKCPEKMAPIDRRIMRKVLIQVAYQRVYFTRFESFMPQPLYPDAAPDEPLLHGRQ VIALSGIGNPKPFLATLRERYEVVQEMTLEDHHVYKVRDLNRLRDLLGRCPGAVIVTTEK DAVKLTNRAKIPEEIRRSIYYLPINISFIEDSATDFLQKLEEDVRGN >gi|313157600|gb|AENZ01000061.1| GENE 17 17788 - 18594 1156 268 aa, chain + ## HITS:1 COG:BS_pnp KEGG:ns NR:ns ## COG: BS_pnp COG0005 # Protein_GI_number: 16079406 # Func_class: F Nucleotide transport and metabolism # Function: Purine nucleoside phosphorylase # Organism: Bacillus subtilis # 1 268 1 266 271 278 52.0 8e-75 MLEEIKKTAAFIDAATDSFAPEVGVILGTGLGGFADKIETRFAIEYKDIPGFPVSTVEGH 
KGRMIFGMVEGRRVVAMQGRFHYYEGYGMQQVTFPVRVMRQLGIKYLFVSNASGGINTSF RVGDLMVITDHINLMPNPLIGPNIAELGPRFPDMHNCYDKELIAKATAIAEEENIKLQYG VYVGGTGPTFETQAEYRYFKNIGGDAAGMSTVPEVIVARHMLIPVFGVSVITNCGLSDEV GDHEDVQRQGKKAGIKMEVLFKRMIKAL >gi|313157600|gb|AENZ01000061.1| GENE 18 18813 - 19310 805 165 aa, chain + ## HITS:1 COG:PM1950 KEGG:ns NR:ns ## COG: PM1950 COG0629 # Protein_GI_number: 15603815 # Func_class: L Replication, recombination and repair # Function: Single-stranded DNA-binding protein # Organism: Pasteurella multocida # 2 165 4 166 166 124 43.0 6e-29 MVNKVILIGNVGIDPEIRTTEGGVKVARIRLATTERLFDRQANEAKEHTEWHTITLWRGL ADVVDKYVRKGTQIYVEGRLRTREWMDKDNNKRYTTEILADVMNLLGRRSDNPSSDGQQG YGSQQGYGSQQSSAGQSAGGYQQPAAPKPAPAPSIPADDPDDLPF >gi|313157600|gb|AENZ01000061.1| GENE 19 19683 - 20417 205 244 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 4 231 7 240 259 83 27 2e-15 MKRIVIVGATSGIGLEVAKLCIQAGWQIGAAGRREEALEKLRTQAPDLIVTESLDITRDD APEHLSRLIDKLGGMDIYLHCSGIGKRNTELHPDIEIDTLRTNGEGFVRMVTAAFGYFRA HGGGHLAVISSIAGTKGLGSAPAYSATKRMQNTYIDALAQLAHMEKLNIRFTDIRPGFVA TPLLAGDGHYPMLMQVGKVAARIMKVLKRQRRRVVIDRRYAVMVFFWKMIPEWLWERLNI RKND >gi|313157600|gb|AENZ01000061.1| GENE 20 20638 - 21075 588 145 aa, chain - ## HITS:1 COG:AF1764 KEGG:ns NR:ns ## COG: AF1764 COG2131 # Protein_GI_number: 11499353 # Func_class: F Nucleotide transport and metabolism # Function: Deoxycytidylate deaminase # Organism: Archaeoglobus fulgidus # 7 136 2 142 157 111 45.0 5e-25 MGYNEEKQRQLDIRYLRMARIWAENSYCVRRKVGALIVKDKMIISDGYNGTPSGFENICE DENGHTKPYVLHAEANAITKVAKSANNCDGSTLYITAAPCIECSKLIIQAGIKRVVYCEE YRSEEGLNLLRKVGIECVQHAMEDC >gi|313157600|gb|AENZ01000061.1| GENE 21 21083 - 24376 5136 1097 aa, chain - ## HITS:1 COG:CT701 KEGG:ns NR:ns ## COG: CT701 COG0653 # Protein_GI_number: 15605434 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Preprotein translocase subunit SecA (ATPase, RNA helicase) # Organism: Chlamydia trachomatis # 
7 1006 4 914 969 612 37.0 1e-175 MDIAASLIKLIFGSKADKDRKQIEPYLEKIKAVYPAIEALSNDELRARSEALKKQIADFI AADEARIVELKAKLELAETSLEEKEKVSKEIDETTKRIDEKIEEKLDEILPEAFAIMKDT ARRFAQNETVVVTANDFDRDLAAAKDFVTIEGDKAVYANHWMAGGNDVKWDMIHYDVQLF GGVVLHKGKIAEMATGEGKTLVATLPVFLNALAKKGVHLVTVNNYLAKRDSEWMGPMYQF HGLSVACIDDTQPNSDARRKAYMADITFGTNNEYGFDYLRDNMASSPADLVQRKHHFAIV DEVDSVLIDDARTPLIISGPVPKGDDQMFEQYRPAIDHLYNLQKNLVTGLLAEARQLIAE GKNDEGGVKLYRAHKGLPKYKPLIKYLSETGVKALMQKTENTYMQDNNRRMPEITDDLFF VIDEKLNSVELTDKGHEVLSKYFNEDGFFVMPDIGAEVAELEKSDLSAEERARKRDEVIN DYSIKSERVHTVHQLLKAYAMFEKDVEYVVMDNKVKIVDEQTGRILDGRRYSDGLHQAIE AKEHVKVEAATQTFATITLQNYFRMYHKLAGMTGTAETEASEFWSIYKLDVVVIPTNRPV VRDDRQDLIYKTKREKYNAVIEEIVKLVEAGRPVLVGTTSVEISELLSRMLKLRNIKHNV LNAKQHQLEAQVVAEAGRSGQVTIATNMAGRGTDIKLTPEVKQAGGLAIIGTERHESRRV DRQLRGRAGRQGDPGSSQFFVSLEDDLMRLFGSGRIASMMDRMGLKEGEVIQAGMMTRAI ERAQKKVEENNFGIRKRLLEYDDVMNSQREVIYTRRRHALYGERIEIDLNNIMYDYADNF VETNRGIEFEDFRFELIREVAVEPSFGEETYGSAKPAELVELIVKDLKETYARRAKAVAD TVRPVMERIYEDKKDQLDSNIYFPITDGHLGYNVPVNLLKCKNTDGAEIYKIFSKVVMFT TIDDAWREHLREMDDLRQSVQNATYEQKDPLLIYKFESFGLFSKMIVKVNRDVLSILNKA YIPVRDQNAEAVQRQRQERAKVDVNKLQASRMQAAAQAGQQDKQKPMPMHVEKKVGRNDP CPCGSGKKYKQCHGKGM >gi|313157600|gb|AENZ01000061.1| GENE 22 24518 - 25480 935 320 aa, chain - ## HITS:1 COG:STM3707 KEGG:ns NR:ns ## COG: STM3707 COG0463 # Protein_GI_number: 16766992 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Salmonella typhimurium LT2 # 3 215 7 221 344 94 32.0 4e-19 MKKISIIIPLYNLEGLIGKCIASALDQDLPPEEYEVIVVDDGSTDGSLAAAQRAACGHAN VQVHAKPNEGVSLARNYGTERAAGRYVMYVDADDYLSPGVLGGLTQIMDRERLDMLCFDM AGVDEHGGDLPLWSDGLFARGGGEVQSGRDFLRRDCFLPMACAYLYDRAMLERHGLAMLP MRHEDEEFTPRAIYFAKRIRYVPIRAYNYLQHNGSFMGSYDPQHRSDFVRAMKSLTGFAA LIEESDPEGARLMRLHVGKTMFFACKQTILGRQGNAREMLRDARQQGVLPLAFRRYNFRH FLINRAPALFVLYYRLHKRR >gi|313157600|gb|AENZ01000061.1| GENE 23 25477 - 25974 470 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157611|gb|EFR57026.1| ## NR: gi|313157611|gb|EFR57026.1| 
hypothetical protein HMPREF9720_0403 [Alistipes sp. HGB5] # 1 165 1 165 165 282 100.0 5e-75 MSSVMLRSEPFKRTGIRFRECMAEDYQLWVDLSEHLRMANIPEYLTFYRRWEDQISTRQL DRQTLSAQLTQQEQLARKLGVRLSDDEARIFTRFSLRTGDVKKRELASYRRILTRLYKAG IRHSHDPKLLKRQLMRRYKMACGLFYPSWRVWIHKRLFLVRLLAS >gi|313157600|gb|AENZ01000061.1| GENE 24 26040 - 26450 485 136 aa, chain - ## HITS:1 COG:BH3661 KEGG:ns NR:ns ## COG: BH3661 COG0463 # Protein_GI_number: 15616223 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus halodurans # 6 136 9 141 257 116 40.0 9e-27 MQKQTPLISVIMPLYNAERFVGESIENVLRQTVGDFELIVIDDASTDASAEIARAYAAKD DRIVLMHNELNSGAARTRNRALDAARGKFITFMDADDLCSPERFAKQLAFFEQHPQTDIC GSYYTMFGTRGGVMAN >gi|313157600|gb|AENZ01000061.1| GENE 25 26461 - 27420 1430 319 aa, chain - ## HITS:1 COG:no KEGG:Palpr_3034 NR:ns ## KEGG: Palpr_3034 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 40 311 148 419 422 182 34.0 2e-44 MTAEREQRAMNRLRAGAGAYACGGFLAGQSALQRSEICTSLLFDRLERKMRMVEALRHEA AENWNQTFYLLYFRTLGDRQNQEAYLTLARRVSYKTVLRERLAPHAVESLLFGASGLLAL YPHDTYTLNLARNFEYLAAKYDIEPMQAGAWQLGDIRPANHPVLRLAQAAEFFAQDEFVM ERAMACRTEEEIRRLFCVEASDYWRTHHIPGIAGDDRPKRLGAFKANIIGINLVSVLQFA YGSYTGREALRDSALTLLERLPAEDNRYMRGWHDAGLTPRSAFESQALLQLATEYCAAKR CAECPVGRRILNNLKIGNN >gi|313157600|gb|AENZ01000061.1| GENE 26 27422 - 27742 655 106 aa, chain - ## HITS:1 COG:CC1859 KEGG:ns NR:ns ## COG: CC1859 COG2151 # Protein_GI_number: 16126102 # Func_class: R General function prediction only # Function: Predicted metal-sulfur cluster biosynthetic enzyme # Organism: Caulobacter vibrioides # 1 106 13 118 118 95 48.0 1e-20 MTPKEILKVEKEIVLTLKNIYDPEIPVNIYDLGLIYEIDYTPDGVANIRMTLTAPNCPMA DMLVEDVNQQVAKVKGVKSVNVILTFDPVWDKSMMSEEALLELNLF >gi|313157600|gb|AENZ01000061.1| GENE 27 27870 - 28298 835 142 aa, chain - ## HITS:1 COG:XF0994 KEGG:ns NR:ns ## COG: XF0994 COG2166 # Protein_GI_number: 15837596 # Func_class: R General function 
prediction only # Function: SufE protein probably involved in Fe-S center assembly # Organism: Xylella fastidiosa 9a5c # 12 133 23 144 146 99 40.0 3e-21 MDKIQEEIIEEFSVFDDWLDKYDYLIGLSDSLPAIPAEHRTEQYLIEGCQSRVWVDARME QGRVYYAADSDAIITKGIIALLIRVLNGRTPQEILDTELYFIDAIGLSANLSPTRSNGLL SMVKQMRLYALAFASKDGANEK >gi|313157600|gb|AENZ01000061.1| GENE 28 28319 - 29884 2306 521 aa, chain - ## HITS:1 COG:no KEGG:RB2501_05390 NR:ns ## KEGG: RB2501_05390 # Name: not_defined # Def: hypothetical protein # Organism: R.biformata # Pathway: not_defined # 19 520 23 558 558 256 32.0 2e-66 MKHKIILLAIAAFAALEVSAARPRLVVNIVVGSMRAEDLSRYADNYGEGGLRRLIDGGTV FADSRYDYQQTTTPVSLATLSTGAMPSTHGVIGSRWRDYVANEAVELIAGRNGAGPYNLI APTLSEALLQHEPGSKAVTVAAEATSAIVLGGRGGEVFWLDSARCGWTTSPYYAADVPEW VARSNRERYNLSYIAPEWRTLYEKGRYVNSRNWDVILTGKNKKEKDETGSGRLKLTTDFD RMLYTPAGNTAVFGFAKQAIAQFKLGTDGTPDLLNICLDSSRRISEAYGPESVEAEDMYY RLDRDLADFLTFVFAQVKNGEVLVVFTSDHGTSPSYDAGREEADRFNARQFEVIVNGFLN VRYGMGNWVLEYEDKCLYLNHNLIYERGLNLADVQNEVAIFAMQFRGVSHALSATAMRTS YFGSGYARKMQNSFYPRRSGDVIMNLMPGWIEENDRCASSSGSMYGYDTQVPLVFYGFGA GPLRVKRRVDMTAVAPTVARVLEITEPAASEGEVLPEIIDF >gi|313157600|gb|AENZ01000061.1| GENE 29 29910 - 30875 1317 321 aa, chain - ## HITS:1 COG:no KEGG:Bache_1321 NR:ns ## KEGG: Bache_1321 # Name: not_defined # Def: GSCFA domain protein # Organism: B.helcogenes # Pathway: not_defined # 1 319 2 330 330 250 41.0 5e-65 MKFRTEIEIAPLGTKIGYENRILALGSCFAEHIAGRLAQAKFRVTANPSGILFNPLSIAA ALRSYAGESPVRHSELGFDGELWFHYGFHGSFSAPEADQALAAMNAARKSGAEALRAADR IVLTFGTAWVYEHDGAVVANCHRRPAAEFTRRRLGVDEIVTAFADLMAGPLGGREVILTV SPVRHLGDSLAGNAVSKAALRLAAEELAERFPDVHYFPAYEILNDDLRDYRFYADDLVHP SPQAVEYVWEKFSATVLSEQARRLLPDVRHIVVAAAHRPRNPQSATYREFCRRRLEEIAA LPQIDFRTEAEYFRRCIEINS >gi|313157600|gb|AENZ01000061.1| GENE 30 30909 - 31721 860 270 aa, chain + ## HITS:1 COG:CAC2902 KEGG:ns NR:ns ## COG: CAC2902 COG1947 # Protein_GI_number: 15896155 # Func_class: I Lipid transport and metabolism # Function: 4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate synthase # Organism: Clostridium 
acetobutylicum # 1 258 1 261 280 137 34.0 3e-32 MILRANCKINLGLDILRRRADGYHDLETVMLPVRGLYDEVEVVRMAGAGVDFRAEGLAVD CEPEQNICIKAFRLMHDRYGVDGVSIRLDKRVPFGAGLGGGSSDGTAVLLALDRLFDLHL SEAELIARAAELGSDTPFFVRNTPQFCTGRGEIMSPFPLSLGGLTLVIVKPDEGVSTREA YAGVRPRVPSVPLAERLRRPVAEWQEVVTNDFEPHIFAAHPAIRASKRNLLDAGALYASM SGSGSAVFGLFDRPEKAESLRSQTPFIFSL >gi|313157600|gb|AENZ01000061.1| GENE 31 32047 - 32502 805 151 aa, chain - ## HITS:1 COG:TM1080 KEGG:ns NR:ns ## COG: TM1080 COG0698 # Protein_GI_number: 15643838 # Func_class: G Carbohydrate transport and metabolism # Function: Ribose 5-phosphate isomerase RpiB # Organism: Thermotoga maritima # 3 141 2 140 143 135 46.0 3e-32 MTKIGIASDHAGYEMKEFLVGYLSAMGYDVHDFGPGSPESVDYADYAHPLAEAVEKGEFE CGIALCGSGEGMTMTLNKHQGIRAGLAWEPEIASLIRRHNNANIIVFPARFITNDEAVAM LDAYFAAQFEGGRHQHRIEKMAVPCGGAAEK >gi|313157600|gb|AENZ01000061.1| GENE 32 32606 - 32971 427 121 aa, chain + ## HITS:1 COG:PA1439 KEGG:ns NR:ns ## COG: PA1439 COG2832 # Protein_GI_number: 15596636 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Pseudomonas aeruginosa # 1 114 14 125 135 93 46.0 1e-19 MKFFLAALGCLSFVLGVVGIFVPLLPTTPFLLLSAALWGRSSPRLYDWLLAHPCLGGYVR NFRENRAIPLRAKIVSLTLMWGTMLYCVFAPLSGWWWAQAALLAVAVGVTWHILSFATLK K Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:15:35 2011 Seq name: gi|313157565|gb|AENZ01000062.1| Alistipes sp. HGB5 contig00074, whole genome shotgun sequence Length of sequence - 41675 bp Number of predicted genes - 32, with homology - 32 Number of transcription units - 13, operones - 8 average op.length - 3.4 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 16 - 1233 623 ## Bacsa_3107 integrase family protein - Prom 1381 - 1440 5.1 - TRNA 1484 - 1557 71.2 # His GTG 0 0 + Prom 1548 - 1607 5.6 2 2 Op 1 41/0.000 + CDS 1779 - 2048 461 ## COG0234 Co-chaperonin GroES (HSP10) 3 2 Op 2 . 
+ CDS 2064 - 3695 1705 ## PROTEIN SUPPORTED gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 + Term 3712 - 3755 11.7 - Term 3705 - 3738 6.1 4 3 Op 1 . - CDS 3805 - 4392 923 ## BDI_2199 hypothetical protein - Term 4408 - 4449 3.3 5 3 Op 2 . - CDS 4468 - 6111 2567 ## COG0497 ATPase involved in DNA repair 6 3 Op 3 . - CDS 6118 - 7404 1537 ## COG0452 Phosphopantothenoylcysteine synthetase/decarboxylase 7 3 Op 4 . - CDS 7417 - 7743 585 ## Sph21_1763 hypothetical protein 8 3 Op 5 . - CDS 7746 - 8567 1272 ## COG4105 DNA uptake lipoprotein - Prom 8592 - 8651 7.3 - Term 8637 - 8666 1.2 9 4 Op 1 . - CDS 8678 - 9289 703 ## gi|313157578|gb|EFR56994.1| hypothetical protein HMPREF9720_2682 10 4 Op 2 . - CDS 9325 - 10659 1713 ## COG1502 Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes - Prom 10797 - 10856 3.4 + Prom 10738 - 10797 7.4 11 5 Tu 1 . + CDS 10955 - 11431 816 ## COG0782 Transcription elongation factor + Term 11524 - 11554 1.0 12 6 Op 1 . + CDS 11573 - 11905 310 ## gi|313157569|gb|EFR56985.1| conserved domain protein + Term 11931 - 11974 3.4 13 6 Op 2 . + CDS 11990 - 14293 3316 ## COG4771 Outer membrane receptor for ferrienterochelin and colicins + Term 14321 - 14354 5.2 - Term 14516 - 14559 11.6 14 7 Op 1 . - CDS 14579 - 15520 1505 ## COG0039 Malate/lactate dehydrogenases - Term 15586 - 15615 2.1 15 7 Op 2 . - CDS 15633 - 16010 614 ## COG0509 Glycine cleavage system H protein (lipoate-binding) - Prom 16048 - 16107 6.0 + Prom 16133 - 16192 3.4 16 8 Op 1 9/0.000 + CDS 16288 - 16668 295 ## COG0673 Predicted dehydrogenases and related proteins + Term 16831 - 16864 1.0 + Prom 16836 - 16895 1.7 17 8 Op 2 3/0.000 + CDS 16934 - 17155 115 ## COG0673 Predicted dehydrogenases and related proteins 18 8 Op 3 . + CDS 17177 - 18319 1309 ## COG0399 Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis + Term 18368 - 18399 0.0 - Term 18423 - 18467 1.1 19 9 Tu 1 . 
- CDS 18652 - 18921 143 ## gi|291515975|emb|CBK65185.1| RteC protein - Prom 19015 - 19074 4.5 - Term 19616 - 19661 6.5 20 10 Op 1 . - CDS 19691 - 21442 616 ## Swoo_3656 hypothetical protein 21 10 Op 2 . - CDS 21457 - 23505 886 ## Pedsa_0495 hypothetical protein 22 10 Op 3 . - CDS 23513 - 24478 555 ## COG1073 Hydrolases of the alpha/beta superfamily 23 10 Op 4 . - CDS 24462 - 26285 1146 ## BT_4251 hypothetical protein 24 10 Op 5 . - CDS 26275 - 28131 823 ## Swoo_1248 hypothetical protein 25 10 Op 6 . - CDS 28140 - 29285 885 ## gi|313157571|gb|EFR56987.1| hypothetical protein HMPREF9720_2701 26 10 Op 7 . - CDS 29316 - 30899 1497 ## Dfer_2232 RagB/SusD domain-containing protein 27 10 Op 8 . - CDS 30913 - 34044 2366 ## Slin_0106 TonB-dependent receptor plug 28 10 Op 9 . - CDS 34064 - 35338 767 ## PG0232 zinc carboxypeptidase 29 11 Op 1 . - CDS 35717 - 38266 1266 ## COG0642 Signal transduction histidine kinase 30 11 Op 2 . - CDS 38263 - 39963 583 ## COG3292 Predicted periplasmic ligand-binding sensor domain - Prom 40107 - 40166 8.0 31 12 Tu 1 . - CDS 40168 - 40599 166 ## Bacsa_1585 transposase IS116/IS110/IS902 family protein 32 13 Tu 1 . 
- CDS 41077 - 41310 107 ## gi|291513816|emb|CBK63026.1| Bacterial mobilisation protein (MobC) - Prom 41509 - 41568 4.2 Predicted protein(s) >gi|313157565|gb|AENZ01000062.1| GENE 1 16 - 1233 623 405 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_3107 NR:ns ## KEGG: Bacsa_3107 # Name: not_defined # Def: integrase family protein # Organism: B.salanitronis # Pathway: not_defined # 3 397 4 398 406 277 39.0 7e-73 MYQRTTFNVVFFCKKTKISKKGKAPIYARITTSGQTTEIYTQCQIEPERWNQRLERSLYK DEVDQQINSIVASYRANILAAYDLLLKENKTPDCFSIKQRLSNASGSRMFLVEFSKYCDK RQQEVGTRITQVTCNKYHRLLRYMTEYTKQQYRKDDLPLDMVSYEYIDGLNTFMQTAHQC RNNGAVNLLCCLKNFILFAIRNEWIEKNPFQYYKMKIDKTNVKVPLTKEELDILISKEMP NERLECIRDVFCFCALTGLAFTDADNLQKRHITTDDSGTTWIHKPREKTSVISRVPLLPY PLKILQKYEHNPELQLKGKLLPIPSNQKMNSYLKEIGTICNIHKNLTTHCARHTFATLAI EYGMPIDIIAKILGHTNTNMTRHYAKISEANISREMQRIGKVLTA >gi|313157565|gb|AENZ01000062.1| GENE 2 1779 - 2048 461 89 aa, chain + ## HITS:1 COG:mlr2393 KEGG:ns NR:ns ## COG: mlr2393 COG0234 # Protein_GI_number: 13472183 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Co-chaperonin GroES (HSP10) # Organism: Mesorhizobium loti # 1 89 1 94 104 95 53.0 2e-20 MNVKPLSDRVLILPNPAEEKTAGGLIIPDTAKEKPLAGKVVAAGPGTSEVKMEVKAGDQV LYGKYAGQEIQIDGVDYLIMKQSDILAII >gi|313157565|gb|AENZ01000062.1| GENE 3 2064 - 3695 1705 543 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855908|ref|ZP_02478658.1| 50S ribosomal protein L28 [Haemophilus parasuis 29755] # 2 542 3 543 547 661 63 0.0 MAKDIKYNVEARELLKEGVDALSNAVKVTLGPKGRNVIIDKKFGAPQITKDGVTVAKEVE LEDAFANMGAQMVKEVASKTNDDAGDGTTTATVLAQSIIGVGLKNVTAGANPMDLKRGID KAVLKVVESLRKQSQEVGTDFAKIEQVATISANNDETIGKLIAEAMGKVNKEGVITVEEA KGTETHVEVVEGMQFDRGYISAYFMTDPEKMEAQLEKPYILITDKKVSTMKELMGVLEPV AQSGRSLLIIAEDVDGEALSALVVNKLRGTLKIAACKAPGFGDRRKEMLEDIAILTGATV ISTDKGMKIEDADLSMLGTADKVTLNKENTTIVDGAGKKEEIAARVAQIRASIEKATSDY DKEKLQERLAKLAGGVAVLYVGAATEVEMKEKKDRVDDALAATRAAVEEGIVPGGGVAYI RATAALEGMKGDNEDQTTGIQIVKRAIEEPLRQIVANAGGEGSVVVNKVKEGKDAFGYNA 
RDDKYEDLLKAGIIDPTKVSRVALENAASIASMFLTTECVLAEKKSDAPAMPAMPAGGMG GMM >gi|313157565|gb|AENZ01000062.1| GENE 4 3805 - 4392 923 195 aa, chain - ## HITS:1 COG:no KEGG:BDI_2199 NR:ns ## KEGG: BDI_2199 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 195 1 195 195 277 67.0 2e-73 MDKLSTLAAANQQRAQEVIRDSGIESIWRSVGAEPHLVGSLRTGLLMTHRDIDFHIYSSP LRVADSFAAMARLAENSRIHRIEYGNLLDTSEECLEWHAWYADADDQLWQIDMIHMPKGS AYDGYFERVADRIAAVLTDQTRETILRLKYETPETEKIMGIEYYQAVLRDGVRTYAEFEA WRAANPVTGVLEWMP >gi|313157565|gb|AENZ01000062.1| GENE 5 4468 - 6111 2567 547 aa, chain - ## HITS:1 COG:PA4763 KEGG:ns NR:ns ## COG: PA4763 COG0497 # Protein_GI_number: 15599957 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA repair # Organism: Pseudomonas aeruginosa # 1 545 1 552 558 260 34.0 6e-69 MLRRLSVENYALIEKLEMELDPHLNIITGETGAGKSILLGALGLLLGAKNDGSAMKDAAR NCVVEGTFDLSGRGMESFFDENDLDYAPETHVTRMITPAGKSRAFINDIPVQLAQLREFG TRLIDIHSQHQNLILSSEEFRTSALDTVAGNGGLLAQYTAQYARLSELRRQLAALREAAA DGRRDEEWLRFQCEELTAANLREGEQAATEAELAVLENADRIGEVLTTLRNAFDADETGI LTLLKNSENDLVRIREHYPAAGEYAERIHSVLEELKDIDNAAAADCERIDADPERLAKLS ARLDTLYALQQKHRVASEAELIAARDGYAARLAAIVHGDEEIAATEKALHETEQKAAALA DRLHKAREKAAAGFAKQILATLTQLGMPDTVFEVALTPLAELGRTGRDNVVFLFTANRSA TPQPVERIASGGELSRVMLALKALLARCMQLPTVIFDEIDTGVSGRIADAMGEIIESLSA TMQVVDITHLPQVASKGSAHFVVYKQDGRTGIRRLTAEERITEIAKMLSGSRITDAAVAQ ARILLGK >gi|313157565|gb|AENZ01000062.1| GENE 6 6118 - 7404 1537 428 aa, chain - ## HITS:1 COG:lin1939 KEGG:ns NR:ns ## COG: lin1939 COG0452 # Protein_GI_number: 16801005 # Func_class: H Coenzyme transport and metabolism # Function: Phosphopantothenoylcysteine synthetase/decarboxylase # Organism: Listeria innocua # 29 428 1 398 399 339 46.0 8e-93 MDSQGHDIQSPGAGRTVEGLTGLAGLAGLAGRRILLGITGSIAAYKAAVLCRLLKCAGAE VRVVMTPLAKQFITPLTMATLSKNPILVEFFDPENGAWNSHVSLGEWADCYLIAPATANT LAKMACGIADNLLLTTYLSARCPVVVSPAMDLDMYAHPATQENLRRLAERGVRIVQPAEG ELASGLTGKGRMAEPADIAAFVGQLLGGENPEKKKSLQGRRFVVTAGATIEAIDPVRFIS 
NHSSGKMGYAIAGALAARGADVTLVSGRTALPTPAGVQRTDVLSAAEMYDAATAAFDRAD GAVMCAAVADYTPDTVSETKIKKCGGEMCITLRRTRDIAAELGARKGSRLLVGFALETHD EQANAEAKLAKKNFDFIVLNSLRDAGAGFRGDTNKVTFIDRSGREELPLMPKSEVAERIA DKIESLLK >gi|313157565|gb|AENZ01000062.1| GENE 7 7417 - 7743 585 108 aa, chain - ## HITS:1 COG:no KEGG:Sph21_1763 NR:ns ## KEGG: Sph21_1763 # Name: not_defined # Def: hypothetical protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 4 100 6 102 110 105 60.0 4e-22 MEIKKNIPNNTITRKLVDLDRETGNVYESINIIARRANQIATELKTELNRKLADFSSPTD TMEETFENREQIEISRYYERLPKPAIIATEEFLDDELFFKENKEVKKA >gi|313157565|gb|AENZ01000062.1| GENE 8 7746 - 8567 1272 273 aa, chain - ## HITS:1 COG:aq_1273 KEGG:ns NR:ns ## COG: aq_1273 COG4105 # Protein_GI_number: 15606494 # Func_class: R General function prediction only # Function: DNA uptake lipoprotein # Organism: Aquifex aeolicus # 19 207 19 209 306 61 25.0 2e-09 MKQTFLYAVCGVILITAFSGCAGINALLKSGQPDLIYSKALEYYQKEKWSRASTLFEGVQ HYYSGTPREDSISFFNARCKYKNRDYDTAATLLDDFRRKFGRSAFIEDAEGMYALCFYYL SPGPSRDQTMTGQALIAINEFMSRYPHSEQIENFKTINTELTQRLHDKAYLNAYTYYKIG RYKSAIVSLKNALKQYPESSHREEIMYLIVDASYRFASNSVAEKQTDRYLAMLDSYLSFK EEFPESKHIKEVDRMAQHARDYLDRNKQEDNNI >gi|313157565|gb|AENZ01000062.1| GENE 9 8678 - 9289 703 203 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157578|gb|EFR56994.1| ## NR: gi|313157578|gb|EFR56994.1| hypothetical protein HMPREF9720_2682 [Alistipes sp. 
HGB5] # 1 203 1 203 203 350 100.0 3e-95 MKHLFPLLLCALLLTGAASAQRRFQVGVRGGVTTTSYTFSRVEIGENRFRPGPAKAGYQA GLVLRFNLTRRLHLQSELNFAFVNYSIRAEGPRTRSVSLRSERFEIPLQLGLQFGPLRLF GGAQFRVADSERDSAPKLLKVNFNDDVGIMGGVGFNIRKFFFDFRLSGYARSHVWNTFVS DGVSQRVRVPRNLVYGGSVGFFF >gi|313157565|gb|AENZ01000062.1| GENE 10 9325 - 10659 1713 444 aa, chain - ## HITS:1 COG:BH2858 KEGG:ns NR:ns ## COG: BH2858 COG1502 # Protein_GI_number: 15615421 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine/phosphatidylglycerophosphate/cardiolipin synthases and related enzymes # Organism: Bacillus halodurans # 1 444 18 503 503 291 34.0 2e-78 MYLLLAIHIAWTLGAVPLVTRQKRLPASAAAWLALILLLPVAGTLLYILAGYRPHLPAAE RRAYRGDRLGQIIARGCGTRITRHNSVELLHNGNNAFTALIASLQRAVRSIHMEYYIIRD DRIGRTIAEILIRKARAGLEVRVIYDAVGSWRLSRTMLRRMHEAGVETAAFEPVRFPWFT TRVTRRNHRKIVVTDGKVAYLGGINIAKYYLDGDYMGKWRDEHLRIEGDAVADLQKLFIA DWARVRGECLDIRRHIAPHDIRQRLPIQLAWAEEGPSRLTIAEAFAAAIVRAQRRVRISS PYFLPPAMLLDALRLAARSGVRVQVMIPTCSDSPFTDLISDSYAGDLLDAGIELYRYDNG FLHAKLLIVDDDTASVGTANMDYRSLLDNLEVTAFIRDREVVRELSATFDGDLASCRRIV RDEWRPAAWRRTLGDVLRLVSPLM >gi|313157565|gb|AENZ01000062.1| GENE 11 10955 - 11431 816 158 aa, chain + ## HITS:1 COG:RC1332 KEGG:ns NR:ns ## COG: RC1332 COG0782 # Protein_GI_number: 15893255 # Func_class: K Transcription # Function: Transcription elongation factor # Organism: Rickettsia conorii # 8 156 51 199 206 129 46.0 2e-30 MAKEIIYLTAEGYKKLRDELDHMRSVERPAISAAIAEARDKGDLSENAEYDAAREAQGML EMRIAKLEDSLANARVIDESKIDKSKVQILSKVTLLNHNTKKEMVYTIVAEHEANLREGK LAIGTPIAKALLGHKKGDRVDVEVPAGTIHFEILDISI >gi|313157565|gb|AENZ01000062.1| GENE 12 11573 - 11905 310 110 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157569|gb|EFR56985.1| ## NR: gi|313157569|gb|EFR56985.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 110 1 110 110 182 100.0 8e-45 MIRKAPKRIYAALLLLILVAAYAGHKVHIYKEDPAHFAALCGDLMPDNGAHQIVVERCIV DDFYFFPYLDAVQPAHSFYCGMLGVLMPEATQCRRCMPVPGVSLRAPPAA >gi|313157565|gb|AENZ01000062.1| GENE 13 11990 - 14293 3316 767 aa, chain + ## HITS:1 COG:STM2199 KEGG:ns NR:ns ## COG: STM2199 COG4771 # Protein_GI_number: 16765529 # Func_class: P Inorganic ion transport and metabolism # Function: Outer membrane receptor for ferrienterochelin and colicins # Organism: Salmonella typhimurium LT2 # 114 551 31 457 663 90 26.0 2e-17 MKKYLLPALFMCLYAGASAAETPKRSTDANLFGHVVDRETHEHLPYATVAVTGTAFGTTT DASGHYFLKNLPEGELTLEVRALGYAVLEKKVTLERGQTLELNFEVMQSGISMDEVVVSA SRSATLRREAPALVSVLDAALFEKTNASCLAQGLSFQPGVRVEDDCQNCGFAQVRINGLD GHYSQILVDSHPVFSALTGVYGLEQIPANMIERVEVLRGGGSALYGSSAIGGTINVITKE PSRNSAQLAHTLTSLGGSNSYDNNTMLNASLVTESGRAGLCVFAQNRHRSGYDHDGDGFT ELPLINSRSAGMRSFFRTGAYSRITAQYHHIDEYRRGGDLLDLPPHEAMVAEQTDHSIDG GSLSFDISSADRRNRFNAYASFQNTARKSYYGSKQDPDAYGTTHDLTVAAGVQYVHAFRK LLFMPAELTLGSEYSYDDLNDRSIGYDINTAQTVHIVGGYLQNEWKTKRWSLLIGGRFDK HNLVDHVIFSPRANIRFNPSEAVNLRVSYAGGYRAPQAFDEDMHIAVVGGKRVRIRLADD LKEERSHSVSLSADLYHNFGRVQTNLLVEGFYTILDDVFALRDLEDTDDGGKIKERYNGS GAAVRGFNVEGRAIFTRWFELQAGVTLQQSRYKEPEQWSEDADVPPVRRMFRTPDAYGYF TASFKPVRNFTADLTGTCTGSMLVQHMAGSGVAQDAAVSTPSFFDMNVRLSYDLRIYKEI TLQLYGGVQNLFNAYQKDFDKGADRDSGYIYGPSLPRSWFVGAKFSF >gi|313157565|gb|AENZ01000062.1| GENE 14 14579 - 15520 1505 313 aa, chain - ## HITS:1 COG:BH3158 KEGG:ns NR:ns ## COG: BH3158 COG0039 # Protein_GI_number: 15615720 # Func_class: C Energy production and conversion # Function: Malate/lactate dehydrogenases # Organism: Bacillus halodurans # 3 307 7 311 314 261 46.0 1e-69 MSKVTVVGAGAVGATCANVMACREVASEVVLIDIKEGLSEGKMLDMYQCSTLMDFDTKLV GVTNDYKQTANSDVVVITSGIPRKPGMTREELIGTNANIMKGVIENVVKYSPRAIIIVVA NPMDTLTYLALKASGLPKNRIIGMGGALDSSRFKCYLAKATGANINNVDGMVIGGHGDTT MIPLVSKATVNGVPVSQFASKKKLEEAVANTMVGGATLTRLIGTSAWYAPGAASAMMVEA ILNDQKKMIPCSCLLEGEYGQSDICIGVPAIIGRKGIEKIVKIDLSKEEAEKFAASADAV RKTNNVLHEIKAI >gi|313157565|gb|AENZ01000062.1| GENE 15 15633 - 16010 614 125 
aa, chain - ## HITS:1 COG:SA0760 KEGG:ns NR:ns ## COG: SA0760 COG0509 # Protein_GI_number: 15926486 # Func_class: E Amino acid transport and metabolism # Function: Glycine cleavage system H protein (lipoate-binding) # Organism: Staphylococcus aureus N315 # 1 125 1 125 126 128 48.0 3e-30 MNVPANLKYSNDHEWCRIEGDTAVIGITDFAQSQLGDIVFVDVPTVGETLAAGEVFGSIE AVKTVSDAFLPVGGEIVEFNDAVDADPAIVNKDAYGEGWIVKVKMSDPAEYDALLSAADY EKLIG >gi|313157565|gb|AENZ01000062.1| GENE 16 16288 - 16668 295 126 aa, chain + ## HITS:1 COG:jhp0620 KEGG:ns NR:ns ## COG: jhp0620 COG0673 # Protein_GI_number: 15611687 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Helicobacter pylori J99 # 1 122 20 150 315 95 38.0 2e-20 MRSLGCDLVAAHDVFDSVGMIDGYFPRAYFTTDPDDFRKRMVADRAEFLTVCTPNYLHCT HTVTGLEAGLDVICEKPLALTPDELSRMETCSRAAGRRVFPILQLRLHPEIERLKRMVDC APPDNL >gi|313157565|gb|AENZ01000062.1| GENE 17 16934 - 17155 115 73 aa, chain + ## HITS:1 COG:PA3158 KEGG:ns NR:ns ## COG: PA3158 COG0673 # Protein_GI_number: 15598354 # Func_class: R General function prediction only # Function: Predicted dehydrogenases and related proteins # Organism: Pseudomonas aeruginosa # 4 71 246 314 316 68 44.0 3e-12 MSPYRHLVINGEEFDFTNGFTDLHTLSYERILAGRGFAVEDTACAVHTLDMLRKSAAVGL TGDYHPLLRNLQG >gi|313157565|gb|AENZ01000062.1| GENE 18 17177 - 18319 1309 380 aa, chain + ## HITS:1 COG:alr3012 KEGG:ns NR:ns ## COG: alr3012 COG0399 # Protein_GI_number: 17230504 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted pyridoxal phosphate-dependent enzyme apparently involved in regulation of cell wall biogenesis # Organism: Nostoc sp. 
PCC 7120 # 5 367 12 377 382 285 42.0 1e-76 MKMVDLHRQYLGIRDEVDRAIGDVIAETAFIKGPAVGLFEEELARRFEVRHCIACGNGTD ALLLSLLAVGLQRGDEVIVPAFAFAAVAETVLLLGGVPVFADVDPRTFNIDPASVARLVS ERTRVVIPVHLFGQPCDMAALTAVAKEHGLRVIEDNAQSLGAVCRMSDGASRYAGTIGEI GCTSFFPSKVLGCYGDGGAAFTDDDGLAARIRALGSHGWFPKYDSRLVGMNSRLDTLQAA VLRVKLPHLDTWIAARRSVAARYTERLGGLTPVVTPVETPYATHVYHQYTVKVPPACRDG LRAALAKEGIPSMVYYPGTLYEQPAYRDACICDPAMRHAAALPAAVLSLPMDSELTDAEV ERVADAVCRYFADNESRYHE >gi|313157565|gb|AENZ01000062.1| GENE 19 18652 - 18921 143 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515975|emb|CBK65185.1| ## NR: gi|291515975|emb|CBK65185.1| RteC protein [Alistipes shahii WAL 8301] # 1 89 23 111 147 95 53.0 1e-18 MKSLLWTVIALREFEMKMVWLKVNHPVVMDCGRKSQFRNLRFIGQSNLHLQILWEIAVPM QELGAGRCLDGSQTSLDLFVEGFTWMINI >gi|313157565|gb|AENZ01000062.1| GENE 20 19691 - 21442 616 583 aa, chain - ## HITS:1 COG:no KEGG:Swoo_3656 NR:ns ## KEGG: Swoo_3656 # Name: not_defined # Def: hypothetical protein # Organism: S.woodyi # Pathway: not_defined # 28 554 33 558 608 310 34.0 1e-82 MKKLLHTLLLFAVGLFAACQAPTSGGDVYLNDFLDDLTAQTDAGPAIRAALSHCARIRAA RLILPGGELRIRPDLAVEKYQFISNNDEGLKRIAFDLVGLQDFTIEGADTKLLFTGFVSP FNLERCRNITIRNLSIDFTRTFHSEGTVRAAGNGWLDLEFPDKYRCDLTDGCLRFLDDEG RVYPYSSLLEFDTQRCEPAFHVDDYWLPAHTIPAERRPNGWIRIFRSDLKAAIGNTMVFG AARRLNPGITVSDSQGIAILDVKLHHCGGMGVIAQRSRDIGIERMEVVPAPGKKRMISIT ADATHFSNCGGQIRLIDCTFENQKDDASNIHGLYMPVDTIFDRERIWVRWGHSGQYGTDF LVPGMAVEIVDNHTLEAYARRIVAKVERFNKEYSAVTFTEPLPENIRPGHLIAADEPGPD VHISGCRMSGNRARGLLIGSRGRVIIENNYFHIAGASILFEGDGNFWFEQSGVRDVTIRN NIFANGNYGSRGWGSACIAVGSGISQRQTSRYHRNIQVDGNLFRVFDPRIVNLYCVDGFQ FSASNRIVRTSDYPATFDPKLHFVFDQCDHIEIPRQIEMAEQR >gi|313157565|gb|AENZ01000062.1| GENE 21 21457 - 23505 886 682 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_0495 NR:ns ## KEGG: Pedsa_0495 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 21 681 25 676 676 653 47.0 0 MKKLLFLLFCTGLGLPVAGQRWQIDPDGGIVWRPGNDTPHRDHLEMTGEQISTVLRWVVD DRGAFRVERSLIFPLLRVRPNDTHASLMCRVATDIPSLLAVNTSASGFNASALQFERVEK 
IVIDGTVRVFSQWSFNAVGAGYEEVEHPAPVLAMERTIFPSPTQPALCERYLIRNIGSQT LEISIPEFSQIVTTPSERGITGGYVIRAELLDAGIRILQPGDSTVVDALFRACRVGETLP PADVGAELRSRREFVSAISDKLVLETPDPVVDTEFRFAKIRAAESIIKTRGGYMHAPGGE CYYAAIWANDQAEYVNPFFPMLGYDIGNRSALNAFRHFARFMNAEYRPIPSSIIAEGDDI WNGAGDRGDAAMIAYGAARYALARGERNEAEELWPLVEWCLEYCRRKLTPEGVVASDTDE LEGRFPAGKANLCTSTLYYDALRSAVYLGQELGCPRETLRVYKRQTSELAEAIECHFGAT VAGYETYRYFDGCTKLRSWICMPLIAGLQNRSEGTVRALLGKELMTENGLLTEQGSSTFW DRSTLYALRGIYIAGQADRATEFLSHYARRRLLGDHVPYPIEAWPEGSQRHLAAESGLFC RVITEGLFGIRPTGFRSFDLTPSMPSGWNRMALRHIRAFENNFDLVIEKQPDGFLRVTLA ISGCNPRAFRLRQGASLRIKLP >gi|313157565|gb|AENZ01000062.1| GENE 22 23513 - 24478 555 321 aa, chain - ## HITS:1 COG:lin2180 KEGG:ns NR:ns ## COG: lin2180 COG1073 # Protein_GI_number: 16801245 # Func_class: R General function prediction only # Function: Hydrolases of the alpha/beta superfamily # Organism: Listeria innocua # 6 309 3 315 319 210 36.0 4e-54 MKKRNKKIIAALSAIFGLLVIGALCWSFVLCYRTLHPPRLTDLSWLTRQYPHTTEWLDSI RSNGLLHDIVRLNADGERLHARWLWAPEPTEKTAVLFHGYQGSAETMLMIGYLFNHDFGF NVLIPDLRGHGQSEPSAISMGWTERTEVVDWIRTADSLFGGNTQIVLYGVSMGAATTMIA AAEESLPACVRCAVEDCGYTSTRDIFADSWEKQCRLPLFPLFHLSDLWCRILYGWSFAKA SPLDAVHRCRLPMLFIHGDKDSVVPVEMVHRLYEAKIGDKELWILSGVDHGAAYLHDPQI YAQRVRTFVEHWFECPAILEQ >gi|313157565|gb|AENZ01000062.1| GENE 23 24462 - 26285 1146 607 aa, chain - ## HITS:1 COG:no KEGG:BT_4251 NR:ns ## KEGG: BT_4251 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 11 585 11 586 615 291 33.0 7e-77 MKISRYILPILATAWIGSVSSRPFGDTLRMRDFGALPDSGEDASEALRKAVATIGKREIP TVLCFESGRYDFHAPEGQDDRVNIVARLRGIRDLIIDGGGAEFIAHGRLMLFLAEECERL TIRNFSLDWERPYITQASIVALGDGHVDLAIDRKRYPYHIEKGRIRFTDETWEREIDPES YSTAYDPQSGAVLYGTRDCPLSDRNAVFRGEAREIAPDTVRFFGTVDRPLPIGTELALYH GRYLSNAMTVVNCRNVCFEKIDLRHSPGMGVYGLRSENILLKAVCTVVNRSEKRRFSCAA DAFHFTNCRGLIELDGCNCNGQGDDALNIHGIYARIVAVSKDRKQIELFGVRFPAERVFK ADDEIWPIVRTTGSRGARNRIERIVRKSSKRLVVALREPLDKTFREDDFIENASWYANVS IHDCYFGRANRARGILLTTPGKVVVHNNRFMTAGTAILIEGDTDYWFESGAVCDMDIHDN 
LFENCGTSASNNGGSGWGEALISITPSFRPADESSPVYHRNIRIRNNRILTYDRPLLHAR SVEGLQFVANCVEQTYDFPATAAQRQSFCLEGCRNVRIAENRFIGYGKPDFELIHMNPNN ICHEKAK >gi|313157565|gb|AENZ01000062.1| GENE 24 26275 - 28131 823 618 aa, chain - ## HITS:1 COG:no KEGG:Swoo_1248 NR:ns ## KEGG: Swoo_1248 # Name: not_defined # Def: hypothetical protein # Organism: S.woodyi # Pathway: not_defined # 7 585 9 582 610 259 29.0 2e-67 MSIRKFIILSALIVSGCEMHPETIAIDFDSGTEDYTPLVRKILAEHPAGEVTIRFGAGTF DFYPEQAAGSYLCVSNNDNGYKRCAFLLEEMRRVRIEGAGEKTQLRFHGAIVPFRVARCE QIVFEAFTIDCDASFIFEGLVVGNDPRTHSITLRPLDPERFEIRSGEPWFTGYDWASPFG ENILFDPGTRSPYYQAEQYEHDQKKTLQAEWIGDSLVRLSGYSSRELPPVGSVYTDKGPH STNRRYPGFSFYKSAGVEVRNVTLHDSGGMALIAENCRDVVCSQYRVEVPPQSGRMVSAS ADATHFVGCSGKIVLRACRFESMLDDATNIHGVYMTVVDRFSGNRFGASFGHFQQEGFDF AEQGDSLVFIDRADLGVLGCGRVEEVNHVNENYYIIRTGFDLSAIPDSVHIAVGNRAADA DVEISECTVRYNRARSFLLSTPGDVCVENSDLSSMMAGIRICGDANYWFESGRTRNVVIR NNRFGTMATGGRSPQAVLQIDPVISHDARSGGTPYHGCIRFEGNLVESFDNQLIYALSVD SLVISRNRFVDSRRFEPRFAGLSVIDAQHCRSVTVRNNDFSGWKENSTISLVDCSEHCLE GEEMPRMVENPNPYFYEN >gi|313157565|gb|AENZ01000062.1| GENE 25 28140 - 29285 885 381 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157571|gb|EFR56987.1| ## NR: gi|313157571|gb|EFR56987.1| hypothetical protein HMPREF9720_2701 [Alistipes sp. 
HGB5] # 14 381 1 368 368 739 100.0 0 MKTHALYVAGAALMIALSSCDRDLDLAHAYNNGSSSKPYEFDADFNPDDYTDFGTFVVGD RNICWQDPHIPRYMLWLDGKISVGKYDDSGLFTTYWSADNSLRQVSASPWLDENVNALTS DMEVFGKCIEPVSGFYDGGAWFIGVHKLMDGRLAGFFHAESHWPGVSAAYKSIGVTYSSD NGLTWETGRRILSGTDPKPENPAGQGESYGLGDGCVVWNDARKVWICYYSGYCDDPGNYM ITMACSEDPAGAAGTWKKWDGKAFTVEGCNGQTQLGGANIKIANLDTYAGGNPSVMWNHY IGKWVMVYHSWSRRICLSTSADGIAWEAPISLNLGEEEAMYPNLISEEGDQTGGQAVRMY YAADMNDLGQRSLAWRKVVFY >gi|313157565|gb|AENZ01000062.1| GENE 26 29316 - 30899 1497 527 aa, chain - ## HITS:1 COG:no KEGG:Dfer_2232 NR:ns ## KEGG: Dfer_2232 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: D.fermentans # Pathway: not_defined # 26 524 15 482 486 225 31.0 5e-57 MKKFITSAYSRLGRFVVIAVAGCIFSACSDFLDVPLESTVATSNFYKTAEEFDMGLTGVY NMLLSAEWANGDRYGSYFQGFLILGRVGTDEMIIPTNIDGNETELCNYTYTPSHRYISRT WYVQYRGIQRACVIIDRLTDTDIGNESEKKRILGEAYFLRAFYYFHLVRLFGEVPIINHE VTDLTMVQTEKSSVAQVYEQIVSDLKLAIGNLPVSNANGRAHYYAAKALLGKVYLQMAGE PLEDSEAAALAEVELNEVIQSGRFELVKDYFSLFDASNEYSSEYLFDVEFANNGTTTYGG QVGTIDGVQTPNNLYWSAVWSTQEFYETYDPADLRRDNIARFKYVYDDNQNLVKEDLSSE PIYYAYKFRHALTEEGRGPGWANWANPINFPIIRYADVLLMYAEAVWRAHEVPSPEALEY VNQVRRRGFGVDIKTPNEAVDLKMMDGDKFAEALLAERSFELCFEGQRWYDLVRFGKLEE GVKKLAKYSSVATSQAQNFQPKHVIFPIPQDVIDASNGKIEQNPLWK >gi|313157565|gb|AENZ01000062.1| GENE 27 30913 - 34044 2366 1043 aa, chain - ## HITS:1 COG:no KEGG:Slin_0106 NR:ns ## KEGG: Slin_0106 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: S.linguale # Pathway: not_defined # 8 1042 146 1161 1161 558 34.0 1e-157 MKRFVFLLAGLLLIASSLRAQTSVSGTITSPDGKPVAGASIIIEGSTVGATSDAKGFYRI KVRSSKDILVFNFLGYREVRESVGQRSLIDVVLSPSDLQIDDVVVVGYGALRKKDMTGAV ASIKADAFENRVLFSVDDALAGGVAGLMVNSSSGKPGSESSMLIRGANSLTGSTAPLIVL DGFPLFDVSTSTGGGIDGYDTGMSSLSMINPDDIASIEVLKDASATAIYGNRGANGVILI TTKRGRGDSGKIQYNTYFGFQQMNKRYDMMDFRQYAAYQVDRNSSNNHLFTDPITMQPRT IGDVQSRDWQDEIFRTGFIQNHSLSISHSTEKTSMFVSGSYMQNKSVLIATNWQKLTAKA TIDHRFTDFIRTGVDIGYNRIVDDGVPTGGEGTSQVAGVITSALTALPFEFDETTQAYFR 
RAGVSQSSLDSYIDNYHGNPVNIANETELTKRINRMMLNAYVEADILKNLTLRITAGYDS YSLKDRQFYPTSTPRGYFYKGQGIIGSSESGSWINENTLTWKPVFGKHRLNVLVGMTEQG YTNFYDRSETTQYEYEDLGSNNAQMAKVFNVYSSKEQVRYVSLIGRINYSYDNRYIATFT LRRDATSRFINDKWGTFLSGALAWNIDSERFMQNQNTVSALKLRLSLGEVGNSNVPTSGS YSQLYGTNYSFGTIETIGQSSLSIANENLSWETTREVNAGLEVGLWNDRLKFTADFYDKV TRDLLLEAPVVNIVGFDKAWQNIGKMRNRGIELSLNAQLINHKNFKWNFFANFARNKTKI LELGQGGAPILLGVTCLSGQNAVILQEGGEIGDIYGYVTKGVYGLNDFEIDGITPKPDVA VETGAEKPGAMRFADLYKDEKITSDDRTVIGNSTPDFFGAFGTNFTWKNLDLNLSFQYSY GGNVYNANYNQLAAFTGTTNNQMAFFEERWSPRNLASTQYSTMTNGAVCSAFVEDASFLR LRSARISYTCPRKWFGPNSHIGSIKFYIAGENLFVLTRYSGYDPEVYSNQGSSSMSNILT SGFDYGCFPRPRTFTAGINILFQ >gi|313157565|gb|AENZ01000062.1| GENE 28 34064 - 35338 767 424 aa, chain - ## HITS:1 COG:no KEGG:PG0232 NR:ns ## KEGG: PG0232 # Name: not_defined # Def: zinc carboxypeptidase # Organism: P.gingivalis # Pathway: not_defined # 39 182 573 716 821 70 30.0 2e-10 MKQLFFHLVPVLAFGLVLAACDNDSDEWTQPYEFAWSYGAESQFSVGKPATFTDLSLGVE TREWSFEDATPATSTDPEPSVVFNSKGIKTVTLTIHFLNGQVQSESFDIEVFYPLSARIK ALELTPKGCIRLDTPVSFGLTEVEGNPTSYQWTFEGGTPSSSTDPAPVVTWTSANKNGAR ISCRLTRADDGMTTTVEQTFIVGNYPMLHPIPEKDYDPWRFELSSIGKWTLWNTTTSADD LTTNTSIVSGGADGSKQALKVTLKPGVIYQLFTRDNWVCNAQLVAGQKYEVSFWQKTDAA EGSLIVLTGIYNNLPSWSWNEYLQVLASDHWSIYFPDIPFEEQVEEMFGIWSNIEYPLTE TITLPASPELMPSAEWKQVRFEFTATSAKYESLLNTYPQFALLSAGSADVNWYLDDIQIN LIEE >gi|313157565|gb|AENZ01000062.1| GENE 29 35717 - 38266 1266 849 aa, chain - ## HITS:1 COG:VC2369_1 KEGG:ns NR:ns ## COG: VC2369_1 COG0642 # Protein_GI_number: 15642366 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Vibrio cholerae # 265 511 268 519 526 123 32.0 2e-27 MTEGRCGGKTPRFSCFDTSNSRFFNNELESLQEDDHGAFWIGGYGITRFDPRDHSIRHFT VKDQLQSNSFKIWASYRLRNGEMVFGGVKGFNIFHPDSITDNDIRPRTAISRLKIRNQTV SAGDSVCGRVVLPRAINRLNKLDLSYNCNNLTFEFSAFAYVDPKYNTYKYRMENFDTDWN YTSGLSPQAVYTNLRPGSYRLVVYASNEDGYWSQAPAVLEITIRPPVYATRLAWLCYIAL TILFLYRWHRRSLRKLQERHQIELERNKYYEEQKSSANMLQFYTDIAHELKTPLSLITAP 
VEELLLNPHIGKTTRNRLELVNRNAGALKTLLEQILDLRKYESHKMELAAVQTDASEFLR GIAELFKPLADQLQIQFSQEIPNDPLILWIDRRKMEIVIANLLYNAFKYSRKSSGKVNLV CEEQQLSVTISVEDNGIGIARDEQSRIFERFYQARNNNSPQKSIGIGLSLVKHIVTLHHG EIEVASELNVGSLFRVRIPKGCAHLAPEQIKDSDKVPGRPFQNLPIGMDSALSDFGDDQF ESEMAAETAIGNDRGMQCGLLEENTRHGAAASDGKRVSILVAEDNTALRDYLRSALENRY NVSTARNGEEAYCQAINEQPDLVLTDIVMPGMSGIELCRKIKENSRTSAIGVILLTAHNL QNYEVSGYRVGADAYVAKPFSLEVLFSRIDNLVARQNKIHKNSQPIKEIPISEVAVEKSE DLFLVKCTETIETEIADPRFDVLQLCRKIGVSRSQLYRRILALTGLTPIQFIRSIRLKHA ASFLAQDGTLPVNEVMYRVGYTNLSHFAKIFHEEFGLYPKEFALRNRSKTGKVDSASEIF SKNEPETQK >gi|313157565|gb|AENZ01000062.1| GENE 30 38263 - 39963 583 566 aa, chain - ## HITS:1 COG:VC1353_1 KEGG:ns NR:ns ## COG: VC1353_1 COG3292 # Protein_GI_number: 15641365 # Func_class: T Signal transduction mechanisms # Function: Predicted periplasmic ligand-binding sensor domain # Organism: Vibrio cholerae # 22 517 38 494 675 90 23.0 8e-18 MLYFVGFLSPPLPAAAASYDRIQTLSMDEGLPHSDVNAITQDRDGYLWFATFSGLSKYDG YRLQTFRTDNSDLTSDRILCLFVARDSSLYIGTESGGLNRYDPVSETIFPVAEEPTTSAD QVINNIFEDRKGTVWVCRNDGLGYLTLRGATTYLHVKNRWRGFYIQCGTALDSCRLLLST DAGPVIYNPETEEIQNILRDEIKTRCFSMQSFADGKIALSGGWGVRIYDPKTDDLQRICD FSSRVTTQDNHGNIWVGSFNRGLYKYDRNGIKIVHYHPKLPIPHAVNSFEISALFEDRSG VLWIGTIGGGLSRLNVVEKHIECYTEAQGLCENRIITFLEARDGILWVSTHGGINLFNRS SATFRELRINGLPSIQFATVSAFFMAENGDIWLGTWDKGLWVIDSRDIDRAIQTGQVRAR QLKHPVIDGELSVFRIVEDRDRHLWISSNRGCFEYIPSHGDFKTGQWINYTHNGDDPNSL CSNFTTDIYPDTVTDNKTIWIGTRSGLNRIVFDADGKANSQRIELAATNARERKVCSSDA FISTIHCDRKEGGVFGSQRSAKVSSL >gi|313157565|gb|AENZ01000062.1| GENE 31 40168 - 40599 166 143 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1585 NR:ns ## KEGG: Bacsa_1585 # Name: not_defined # Def: transposase IS116/IS110/IS902 family protein # Organism: B.salanitronis # Pathway: not_defined # 32 137 205 312 321 123 55.0 3e-27 MPVTKTDEKDAYMIAMYGEKMNPPIYKMPSQATIRILTTEGFTYFDNAKQLSRFIGICPT YQQSRTSVNIREHINRNDDERLRSLLYIAFWTALRYNSRECYIRLKANEKPFKVALIAVT NKLVRPVLTITTINSIYRWICPG >gi|313157565|gb|AENZ01000062.1| GENE 32 41077 - 41310 107 77 aa, chain - ## HITS:1 
COG:no KEGG:no NR:gi|291513816|emb|CBK63026.1| ## NR: gi|291513816|emb|CBK63026.1| Bacterial mobilisation protein (MobC) [Alistipes shahii WAL 8301] # 1 65 1 65 139 116 95.0 6e-25 MTTTKKRIGRPTTTDPRVHRYNFKLTTEENIRFKQMLCKAGLEHNRSQFIVKRIFNEEFV VIKRDRRYSSNYSISLF Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:17:36 2011 Seq name: gi|313157563|gb|AENZ01000063.1| Alistipes sp. HGB5 contig00062, whole genome shotgun sequence Length of sequence - 818 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 209 - 817 237 ## gi|313157564|gb|EFR56981.1| conserved domain protein Predicted protein(s) >gi|313157563|gb|AENZ01000063.1| GENE 1 209 - 817 237 202 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157564|gb|EFR56981.1| ## NR: gi|313157564|gb|EFR56981.1| conserved domain protein [Alistipes sp. HGB5] # 31 202 1 172 172 310 100.0 3e-83 KRYFEWFNPINQTYILYARMADNQSQKRCTMDSLKEKDNPTPNPAKRPRIEVDEELMRQM IAGQAPLDSEVVRRIPEPEEEDTDAPEEKTSATASVASVAVAEKTDVDSRMSTVKEPAGF RRKKIALPDFERTFFAPVDCRNRSAIYVSAQTKRKVSAILHLLGNDNTRLTALVDNMLHF VMDIYSGELNYLHEKKNNRRPF Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:18:02 2011 Seq name: gi|313157524|gb|AENZ01000064.1| Alistipes sp. 
HGB5 contig00052, whole genome shotgun sequence Length of sequence - 51464 bp Number of predicted genes - 36, with homology - 34 Number of transcription units - 17, operones - 11 average op.length - 2.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 6/0.000 + CDS 2 - 1882 2568 ## COG0129 Dihydroxyacid dehydratase/phosphogluconate dehydratase 2 1 Op 2 32/0.000 + CDS 1900 - 3627 2528 ## COG0028 Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] 3 1 Op 3 15/0.000 + CDS 3642 - 4184 793 ## COG0440 Acetolactate synthase, small (regulatory) subunit 4 1 Op 4 . + CDS 4197 - 5243 1779 ## COG0059 Ketol-acid reductoisomerase + Term 5305 - 5343 3.7 5 1 Op 5 . + CDS 5401 - 7074 2561 ## COG0367 Asparagine synthase (glutamine-hydrolyzing) + Term 7141 - 7181 13.4 - Term 7005 - 7031 0.1 6 2 Tu 1 . - CDS 7256 - 8284 1604 ## COG1087 UDP-glucose 4-epimerase - Prom 8304 - 8363 6.6 - Term 8365 - 8414 14.1 7 3 Op 1 . - CDS 8528 - 10180 1996 ## COG1785 Alkaline phosphatase - Term 10405 - 10436 -0.9 8 3 Op 2 . - CDS 10479 - 12275 2179 ## Pedsa_1514 polymorphic membrane protein 9 3 Op 3 . - CDS 12288 - 13259 1213 ## Pedsa_1515 exopolysaccharide biosynthesis protein 10 3 Op 4 . - CDS 13279 - 14547 1806 ## Pedsa_1516 RagB/SusD domain protein 11 3 Op 5 . - CDS 14561 - 17485 4311 ## Pedsa_1517 TonB-dependent receptor 12 3 Op 6 . - CDS 17386 - 17862 119 ## BDI_2323 hypothetical protein - Prom 17926 - 17985 2.1 - Term 17902 - 17953 4.2 13 4 Op 1 6/0.000 - CDS 17988 - 18962 1086 ## COG3712 Fe2+-dicitrate sensor, membrane component 14 4 Op 2 . - CDS 18959 - 19561 725 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 19582 - 19641 2.8 + Prom 19811 - 19870 2.4 15 5 Op 1 . + CDS 19917 - 21104 1901 ## COG0156 7-keto-8-aminopelargonate synthetase and related enzymes 16 5 Op 2 . 
+ CDS 21118 - 21591 764 ## COG1522 Transcriptional regulators + Term 21625 - 21662 7.8 - Term 21807 - 21852 12.4 17 6 Tu 1 . - CDS 21890 - 22156 301 ## - Prom 22224 - 22283 2.0 + Prom 22117 - 22176 4.8 18 7 Tu 1 . + CDS 22227 - 22355 62 ## + Term 22595 - 22622 -0.9 - Term 22332 - 22362 1.2 19 8 Op 1 13/0.000 - CDS 22370 - 23710 1812 ## COG2204 Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains 20 8 Op 2 . - CDS 23713 - 26310 3132 ## COG0642 Signal transduction histidine kinase - Term 26456 - 26495 12.1 21 9 Op 1 9/0.000 - CDS 26520 - 27839 2309 ## COG1668 ABC-type Na+ efflux pump, permease component 22 9 Op 2 . - CDS 27852 - 28781 278 ## PROTEIN SUPPORTED gi|225088774|ref|YP_002660041.1| ribosomal protein S16 - Prom 28801 - 28860 7.0 - Term 28935 - 28975 12.7 23 10 Op 1 . - CDS 29010 - 31040 3171 ## COG3590 Predicted metalloendopeptidase 24 10 Op 2 . - CDS 31124 - 32386 2438 ## COG0126 3-phosphoglycerate kinase - Prom 32452 - 32511 5.3 25 11 Tu 1 . - CDS 32574 - 34901 3260 ## COG1752 Predicted esterase of the alpha-beta hydrolase superfamily - Prom 34949 - 35008 5.0 + Prom 34868 - 34927 6.2 26 12 Op 1 . + CDS 34974 - 35999 1536 ## COG0016 Phenylalanyl-tRNA synthetase alpha subunit 27 12 Op 2 . + CDS 36011 - 37795 2831 ## COG0457 FOG: TPR repeat 28 12 Op 3 . + CDS 37809 - 38936 1651 ## GFO_3257 M23 family peptidase 29 13 Op 1 . + CDS 39153 - 39584 716 ## COG0319 Predicted metal-dependent hydrolase 30 13 Op 2 . + CDS 39672 - 41555 1612 ## COG0445 NAD/FAD-utilizing enzyme apparently involved in cell division + Term 41661 - 41693 5.0 - Term 41642 - 41687 6.1 31 14 Op 1 . - CDS 41705 - 42718 699 ## COG0115 Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase 32 14 Op 2 . - CDS 42747 - 44822 2470 ## COG0143 Methionyl-tRNA synthetase - Prom 44852 - 44911 6.0 - Term 44877 - 44918 8.6 33 15 Op 1 . - CDS 44939 - 47053 2829 ## Bacsa_3552 peptidase S46 34 15 Op 2 . 
- CDS 47063 - 48004 1496 ## COG0379 Quinolinate synthase - Prom 48150 - 48209 11.0 - Term 48252 - 48292 -0.9 35 16 Tu 1 . - CDS 48382 - 49434 1429 ## COG0489 ATPases involved in chromosome partitioning - Prom 49478 - 49537 5.1 + Prom 49504 - 49563 5.9 36 17 Tu 1 . + CDS 49611 - 51389 2875 ## COG0616 Periplasmic serine proteases (ClpP class) Predicted protein(s) >gi|313157524|gb|AENZ01000064.1| GENE 1 2 - 1882 2568 626 aa, chain + ## HITS:1 COG:PA0353 KEGG:ns NR:ns ## COG: PA0353 COG0129 # Protein_GI_number: 15595550 # Func_class: E Amino acid transport and metabolism; G Carbohydrate transport and metabolism # Function: Dihydroxyacid dehydratase/phosphogluconate dehydratase # Organism: Pseudomonas aeruginosa # 26 621 5 605 612 788 66.0 0 RAFLTSGPAAGSGHAEKREKMKHTLRSAVTTTGRRMAGARSLWRANGMREEQFGRPVVGI ANSFTQLVPGHVHLHEIGQYVKQRIEACGCFAAEFNTIAVDDGIAMGHDGMLYSLPSREI IADSVEYMANAHKVDALVCISNCDKITPGMLMAAMRLNIPTVFVSGGPMEAGKFDGRDID LIDAMVMAADEACSDEQIARVERCACPGCGSCSGMFTANSMNCLTEALGLSLPGNGTIVA THANRRRLFEDAAALIVRNAERYYFLGDESVLPRSVATKASFENAMSLDIAMGGSTNTVL HLLAVAREAGVDFTMQDIDRLSRRVPVLCKVAPNSQYHIQDVNRAGGILSILGELARAGL LDTSVPRIDGRTLREAIDACDLRSETLTDAALRRWLSAPAGLYSRELGAQHSYYDAPDAD RTTGCIRDVEHAYSRDGGLAVLTGNIARNGCIVKTAGVDESLHVFRGRARVYESQEQAIE GILGDEVQAGDVVVIRYEGPKGGPGMQEMLYPTSYLKSKHLDKACALITDGRFSGGTSGL SIGHVSPEAASGGEIAVVRTGDEIEIDIPRRSIRLLADDAEIAARMAAQEYFRPATRDRK VPASLRAYAKLVSSADKGAVRIVDEE >gi|313157524|gb|AENZ01000064.1| GENE 2 1900 - 3627 2528 575 aa, chain + ## HITS:1 COG:TM0548 KEGG:ns NR:ns ## COG: TM0548 COG0028 # Protein_GI_number: 15643314 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Thiamine pyrophosphate-requiring enzymes [acetolactate synthase, pyruvate dehydrogenase (cytochrome), glyoxylate carboligase, phosphonopyruvate decarboxylase] # Organism: Thermotoga maritima # 19 575 8 560 584 510 47.0 1e-144 MDTTQTNMSEPQPGPSVTGAQALIESFLREGVDTIFGYPGGAIIPVYDALYDYRDRLRHI LVRHEQGAVHAAQGYARVSGRVGVCLVTSGPGATNTVTGLADALMDSTPLVLVTGQVGSS 
LLGTDAFQETNFVGITQAVTKWNCQVKRTEEIPGAIAKAFYIARSGRPGPVVVDITKDAQ CGTAPFSYEKVASIRSYVPASVPSEERIDEAAKLIDAAQRPLAMIGQGVILSGGEAELRA FLDKSGMPAASTLLGLSALPTDSPQYVGMLGMHGNYGPNIKNRDCDLLLAVGMRFDDRVT GNPGCFGANAKIIHLEIDPAEIGKIVPADVPLVGDVKQTLPLLTGRIQARSHRAWIDEFR ACDKIEYEKVIRRAVHPAEGRIRMGEAVAAVARAYDNDAVLVTDVGQQQMFAARYFGFRR SRSMVTSGGLGTMGFGLPAAIGAKLGAPDREVCLFAGDGGLQMTIQELGTIFQSQVAVKI VLLNNSFLGMVRQWQELFYDRRYSFTELANPDFGLIARGNGIAYRCVERRGELSEAIAEM QACRGAYLLEVRVEGEDNVFPMVPAGAPVAAIRLE >gi|313157524|gb|AENZ01000064.1| GENE 3 3642 - 4184 793 180 aa, chain + ## HITS:1 COG:BS_ilvN KEGG:ns NR:ns ## COG: BS_ilvN COG0440 # Protein_GI_number: 16079882 # Func_class: E Amino acid transport and metabolism # Function: Acetolactate synthase, small (regulatory) subunit # Organism: Bacillus subtilis # 7 165 4 170 174 90 36.0 2e-18 MEEQEYIITVFSENKVGLLNQITTVFTSRDVNIESLTTSESALAGIHKFTIVVRTGPERV EKLVRQIEKKIDVLKTFVYTSDEVVQQEIALYKVTRSRSVERLVRQHNVRILEIDDDYIV VEKTGHKAETKELFRLLQPYGVQQFVRSGIVAIIKSRRELLNEYLEELERSRRQTNSNHF >gi|313157524|gb|AENZ01000064.1| GENE 4 4197 - 5243 1779 348 aa, chain + ## HITS:1 COG:YLR355c KEGG:ns NR:ns ## COG: YLR355c COG0059 # Protein_GI_number: 6323387 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Ketol-acid reductoisomerase # Organism: Saccharomyces cerevisiae # 1 347 48 394 395 407 57.0 1e-113 MAKINFGGVEENVVTREEFPLEKAREVLKNETIAVIGYGVQGPGQSLNLRDNGLNVIVGQ REGSKSWEKAVADGWEPGKTLFGIEEAAERATIIQYLLSDAGQIAVWPTIEKHLTPGKAL YFSHGFGITYKERTGIVPPAGVDVILIAPKGSGTSLRRMFLQGRGLNSSYAVFQDATGRA FERVVALGIGVGSGYLFETDFKREVYSDLTGERGTLMGAIQGIFAAQYETLRANGHTPSE AFNETVEELTQSLMPLVAENGMDWMYANCSTTAQRGALDWWKRFRDATKPVFEELYREVA AGNEAQRSIDLNSKPDYREKLDEELRQMRESEMWQTGAVVRKLRPENN >gi|313157524|gb|AENZ01000064.1| GENE 5 5401 - 7074 2561 557 aa, chain + ## HITS:1 COG:VC0991 KEGG:ns NR:ns ## COG: VC0991 COG0367 # Protein_GI_number: 15641006 # Func_class: E Amino acid transport and metabolism # Function: Asparagine synthase (glutamine-hydrolyzing) # Organism: Vibrio cholerae # 
1 550 1 550 554 764 66.0 0 MCGFVGLFDIRQKSDALRTQVLKMSKNIRHRGPDWSGVYCGERAILSHERLSIVDPQSGG QPLFSRDGKLVLAVNGEIYNHREIRDELAGEYDFRTGSDCEVILPLYQKMGVGLLEKISG IFAFALYDIENDEYLIARDPIGVIPLYIGWDRDEQFYVASELKALEGVCTTIQPFLPGHY WSSKEGKMVRWYDRDWFEYDAVKETGADIPALRAALEAAVKRQLMSDVPYGVLLSGGLDS SIISAVARKFADRRVESHDRDRAWWPRLHSFAIGLEGSPDLAAARKVADRIGTVHHEIHY TIQEGLDALRDVIYFIETYDVTTVRASTPMYLLARVIKSMGIKMVLSGEGADEVFGGYLY FHKAPNARAFHEETVRKVGKLHLYDCLRANKSLSAWGVEGRVPFLDKEFLDVAMRINPEA KMAKDGRIEKWILRKAFEELLPEEIVWRQKEQFSDGVGYGWIDTLKRITSEAVSDREMEH AAERFPINPPRNKEEYYYRTIFEEHFPSQTAAQCVPSVPSVACSTAEALAWDAAFRDRND PSGRAVLGVHNTDLTHS >gi|313157524|gb|AENZ01000064.1| GENE 6 7256 - 8284 1604 342 aa, chain - ## HITS:1 COG:BS_galE KEGG:ns NR:ns ## COG: BS_galE COG1087 # Protein_GI_number: 16080937 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-glucose 4-epimerase # Organism: Bacillus subtilis # 6 335 3 328 339 340 51.0 3e-93 MKKSCVLVSGGAGYIGSHTAVELIGAGYDVVIVDNLSNSDMNAVEGVRRITGVEVPFEKA DCCDRDAFRKVFEKYDFDSVIHFAASKAVGESVGKPLEYYRNNLVSFMNVIDLMREFGRH NIVFSSSCTVYGEADKLPVTEQTPRKPATSPYGNTKQMCEDILRDSLAAYDGMKGIALRY FNPIGAHPSALIGELPRGVPQNLVPYITQTAAGLRECLSVFGNDYPTPDGSNIRDYIDVV DLAKAHVTAIARMIEGKNREKYEIFNIGTGRGVSVLELVRKFEEVNNLKLNYKIVGRRAG DIVAIWADPSYANDELGWKAERSLDQTLAAAWKWEKHIRGIE >gi|313157524|gb|AENZ01000064.1| GENE 7 8528 - 10180 1996 550 aa, chain - ## HITS:1 COG:MA4354 KEGG:ns NR:ns ## COG: MA4354 COG1785 # Protein_GI_number: 20093141 # Func_class: P Inorganic ion transport and metabolism # Function: Alkaline phosphatase # Organism: Methanosarcina acetivorans str.C2A # 26 545 46 558 585 283 36.0 8e-76 MMKKSAILLLLLFAALAAGAQPFYRKPVAPVRNVIVMIPDGTSTSVLAAARWYKFYNDPS QRMLNLDSCLCGLVGTFSSDSPIPCSAPAMSAYMTGMPQQAGNVSVYPAANPAQDIVEVE SGRACQPLATLLEAAKHDRGKATGLVVTADFCHATPAACAAHHYSRHAYPALAAQMAGND LDVMFGGGRKQVTDAMRAYLRGSGATLIEQDAAAFRAYEGKGPVWSLFAQGNMSYDLDRN DAEQPSLAEMTAKALEVLDRHKNGFFLMVEGSRIDMAAHAKDPVGVITEMLAFDKAVGVA LDFARRDGRTAVVIMPDHGTSGLSFGDGAYKDYASKGLDSTYMNISKVRRTASGLEKILL 
KARPEQIRPLFREYTGIELTDDELALLRSSRNYTEADYMKVANSVNMVSSIARIVTSRTH FGFLSGSHTAEDVFLAAYHPEGNLPVGRNTNTEIHAYMADLLGLERPLAELTGELFAPHT EVFRGAKCSISAAEGQMPCLTVRKGRRTLRVPANGSVAYLDGKPLKLRSAVVYIDRNDTF YLPRELGEKF >gi|313157524|gb|AENZ01000064.1| GENE 8 10479 - 12275 2179 598 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_1514 NR:ns ## KEGG: Pedsa_1514 # Name: not_defined # Def: polymorphic membrane protein # Organism: P.saltans # Pathway: not_defined # 31 597 25 585 587 285 35.0 6e-75 MQSIMKKIRKYGIILFAGLCACAAWSCEEDKTDRKFTPKDPVIKLGGDVEVGKAGGSYTV PIESNLPWRVRSEVDWILLGEVENGMGDGEFTFTVSPNKTLFEREGRVTAWITDEYAQSI RVVQAPSSPEDLEVHWYVKTDGSADNDGMTWETATTLHNALSKSINGNFIHVAAGTYVPE QSLAGSKGAAEDATFEISANVSLIGGYPAGAVTGAVADPDANPTVLSGRLSGGRHAYHVV CVTAAKADGRRVLMKGLTITEGLCSGTASYYTLNGARFYISRGGGVTVGNAAVDIADCKI TQNKSAKDCAGICIVAGADVSLTDTEISENECSNGNGAGLHNEASVVRMDRCTVRGNSAS GVCGGVYTFSSSAPSYTYIYNSTLCDNRTDGSKNSRRGGAVYSREYSETVLVNCTVHGNT GGNGGGIALYGASGKESKMTLVSCTVTGNTSLFVGGGVEFTPYTTMNVYNTVVSGNTAAN GGDDLVGTNTALAATANLPAVLSYAVNGSVVYGAGKAVVAGSSFDPATMLGPLAGNGGPT QTCLLLGADSPARTLGMPYVDLSALGQDFDPQIGPEITGFDQTGLSREGTSAMGACVK >gi|313157524|gb|AENZ01000064.1| GENE 9 12288 - 13259 1213 323 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_1515 NR:ns ## KEGG: Pedsa_1515 # Name: not_defined # Def: exopolysaccharide biosynthesis protein # Organism: P.saltans # Pathway: not_defined # 1 321 1 313 316 273 46.0 1e-71 MKNNCYIKLLFSGLFVLAGFAFCGCNDDPEYTRYEAVPRSEIGRKLVEGTNLIAHISVDS TYVLANGASVTELRYLSASGLAMAAFFFEIDLTSPDIALEVCTPQNKPIGVGFEPVTQQA MHVDAEGHRVLGGTNADFGSETQKGPQGIFWKDGVAQKTVFNTTPARPRSFFAIRTDKRA VAAAAADYDEIAASKVIYEAVGGGPVLVDDGVVPIPLDPNDLSVEPRTCIGVSEDGTKVW IMVVDGRNFYYSNGMSFVELGQFMKAVGSYDAINLDGGGSSTFFVRSAPGFSADDRFEIR NWPTDGGGVERAVCNGLLIVSNK >gi|313157524|gb|AENZ01000064.1| GENE 10 13279 - 14547 1806 422 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_1516 NR:ns ## KEGG: Pedsa_1516 # Name: not_defined # Def: RagB/SusD domain protein # Organism: P.saltans # Pathway: not_defined # 2 422 4 424 424 522 60.0 1e-147 
MKYITVLLFCALGFASSCSNMLDQYSHSAIPPEAVTEKDLPALRLGMYNRVQNEPQTRSF IMCDILGGDITQSNYNPIDVINSTLSPLNSAIVNGWNGYYSALYQVNNLLAVTAQFPDSE ISVRARGEAHYFRAYIYFCLVSRWGGVPLLRENTLDKLPRSPAADVWALIEEDLDTAASL LDTSESYYYVSRNAALALKARVMLSQGKMTEAARLAEDLITSGPYKLDSFDKIFRKKANT EIIFAFENISEESSINISDLFYTYGHPNKGQGVYRLPASTVELFGANDTRKEMSVINIAG TDCLNKYPSGQTGKDPVIVSRIAELYLISAEAQGRAKGVGRLNELRRERGLDDIYPASDA AYVDAVLDERRRELLGENFIYHDMVRTGRAVERLGIQKFQLLLPIPGKELQLNPLLEPNP GY >gi|313157524|gb|AENZ01000064.1| GENE 11 14561 - 17485 4311 974 aa, chain - ## HITS:1 COG:no KEGG:Pedsa_1517 NR:ns ## KEGG: Pedsa_1517 # Name: not_defined # Def: TonB-dependent receptor # Organism: P.saltans # Pathway: not_defined # 3 974 151 1125 1125 1151 59.0 0 MVGASVVIRGSTTGVSSDIDGRFAIEAREGEVLSVSFVGYTPQTITLGAKTMLTLTLRED TSELEEVVVVGYGTQRRSLVTSAISKVQMNESNMRQVASPTQLLSGRVAGVTTSTGSGNL GSGERMVIRGSSSLSAGNEPLYVIDGIPITNTNANLVDFGEDMSSLATLNISDIESIEVL KDAASAAIYGSRANNGVIVITTKSGKEGKSEVHLNFNTGVTRFANVGKIKMADSGLYVRD FNEGVDNYNRQYGYKPGDSGFKKHIQNPFGTLPDTDWMDLILQTGTFYNGDVSFSGGNVK TRYYVGANYNHQTGVIRTNKMEKMNFKVKISHEFTPWLEVGANVSGNYIKNHQIPGANSG TTIIGRSIQQRPFDRPYKPNGDYYVGGTDELVFHNPMQILDEETAYIENMRYLGNFYATF KYKDKFAFKSSVNTDITQIYDYTYYNENHPYGKGVGRIVDYNRTIKNILVENFATYNDKF GDFSLSAMLGHSFQKVTTRSAKLDGSGFPSPSFDVVGVASSLDAYSGSLSNYAMESYFGR ATFSYKDRYVLTATLRTDGSSKFARDNRWGWFPSVSLGWNISKENFMKDSDTELKFRVSY GKTGNQEGIGSYAYQALMSGGYNYGNGSGIAVSTFGNRDLTWEKADQFDAGFDITLFKGR VNIMADAYYKKTKDLLYSTPIHSTTGVTSIISNIGSMRNIGAELTINTHFNFGPLSWLSQ FNIATNRNKLTELLGDDKPISIGANRALQVGKEVGAFYLFIMDGIYQYDGEVPAEQYAQG IRAGDVKWRDVDDNNLINDNDRQVIGSSNPYFSGGWNNTFRYKGVSLDVFFTYMYGNDVY AAWKINTSKLGHKNGVLAEEARNRWTGPGSTDLHPRSVSGDTNNTRNSDRWLEDGSFLRL RSLTLSYTFPEKISRRLAMKSLRVYFQGDNLWLATRYSGWDPEVSNNLDPRFMGVDNFSV PQPRMFCFGLNVTF >gi|313157524|gb|AENZ01000064.1| GENE 12 17386 - 17862 119 158 aa, chain - ## HITS:1 COG:no KEGG:BDI_2323 NR:ns ## KEGG: BDI_2323 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 6 98 28 120 1125 88 45.0 1e-16 
MKHFRLLFLTVVLSAVCATASAQRVTLDMQNAKLEKVLGQITQQTGLVFNYTRPTINPDK RVSVSVKQAELESVLRQLFDADGIVYEIKDGKVYLADRKGGVNPPRPYRRPDSGMPDASS MLRGIRWSARASSYGDRLRASVPISTAVLPSRPAKERC >gi|313157524|gb|AENZ01000064.1| GENE 13 17988 - 18962 1086 324 aa, chain - ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 7 283 13 284 327 90 31.0 4e-18 MKRPDRDTIRRVLDERATPQEAREVARWFSSAEGAAELSRLIDEEARMLETGERMPASGI PSEAIYRRIRAALVRRRRRRIALRVAAVLIPCLLVLGVGIRLDRQVGGIFSAPEYAEIYV PKGERMQMVFQDGTRVWLNADTRLRYPEKFGFSMRRVELDGEAYFSVSHNPARPFVVGLG GAQIEVLGTEFNVQAYDDSPLIAVSLDAGSVSMSDGRNSFLLAPSDRLLYDRTTRQGRLD HVDTSGASLWRKSIISFRDTPLEEVLRTLGRWYDVTFRVTDPRAYDYYFTLTTHDSSLRE ILSELQRIAPMHFTREGDTIVIGM >gi|313157524|gb|AENZ01000064.1| GENE 14 18959 - 19561 725 200 aa, chain - ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. 
PCC 7120 # 17 181 21 189 201 68 28.0 9e-12 MNEDFHISGTAADECELFERIRGGDEQAFTLVYERYHGMLYSLALRMIKQTGAAEDAVQY VFVRLWELRGGLSVTVSLRNYLYTMTKNYLLNYIRHNGRVLEQSYVIAQLRQQQDDTLAQ YIESRDIDRRLGEAIDRLPRQQRAVIMMKRDGCSNREIAEKLHVSINTVKTHYADGLRSL RLSIGPIIKMITLLVMIRSL >gi|313157524|gb|AENZ01000064.1| GENE 15 19917 - 21104 1901 395 aa, chain + ## HITS:1 COG:YPO0059 KEGG:ns NR:ns ## COG: YPO0059 COG0156 # Protein_GI_number: 16120412 # Func_class: H Coenzyme transport and metabolism # Function: 7-keto-8-aminopelargonate synthetase and related enzymes # Organism: Yersinia pestis # 7 394 12 402 403 493 62.0 1e-139 MYGKIKEHLQQELAEIKAAGLYKSERVIESPQRAEIEVAGRKVLNFCANNYLGLSDNPRL IEAAKRAMDNRGYGMSSVRFICGCQDIHKQLEKAIADYFGTEDTILYAACFDANGGVFEP IFGQEDAIISDSLNHASIIDGVRLCKAVRYRYANADMAELEECLKQAQAQRFRIICTDGV FSMDGNAAPLDKICALAEKYDALVMVDECHSAGVLGKTGRGITELYDLRGQVDILTGTLG KAFGGAVGGFTTGRREIIEMLRQRSRPYLFSNSLPPAVVGAGIEMFRMLGESNELHDRLV ANVEHFREGMMAAGFDIKPTQSAICAVMLYDAKLSQDFAARLQDEGVFVTGFYYPVVPKG QARIRVQVSAGHTTEQLDRCIAAFIKVGKELNVIK >gi|313157524|gb|AENZ01000064.1| GENE 16 21118 - 21591 764 157 aa, chain + ## HITS:1 COG:VC0071 KEGG:ns NR:ns ## COG: VC0071 COG1522 # Protein_GI_number: 15640103 # Func_class: K Transcription # Function: Transcriptional regulators # Organism: Vibrio cholerae # 4 151 5 152 153 114 41.0 5e-26 MPKTSLDAIDRKILKYLIKNARMPFLEIARECGISGAAIHQRIRKLDDSGVILGSRLIVD PKMMGFDVCAHISITLKDPQLLKQTVEQLKEIPEIVEAHFITGSGNILVKLYCVDNEHLM RTIFDGILRIQGVSSTETQISLQEAFQRQVNIDFIEE >gi|313157524|gb|AENZ01000064.1| GENE 17 21890 - 22156 301 88 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKKTIVTMAALLLMAGGAMACAQQPEKREATEAAAPAADTQEKSAADDAEKTDSTAGESA KEESKVESEGSSEGNSSSASEAPASSAE >gi|313157524|gb|AENZ01000064.1| GENE 18 22227 - 22355 62 42 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEFKRRAIRIKNGKPLIDSGRQGTPESSGVWKGVESRYKVWK >gi|313157524|gb|AENZ01000064.1| GENE 19 22370 - 23710 1812 446 aa, chain - ## HITS:1 COG:hydG KEGG:ns NR:ns ## COG: hydG COG2204 # Protein_GI_number: 16131834 # Func_class: T Signal transduction mechanisms 
# Function: Response regulator containing CheY-like receiver, AAA-type ATPase, and DNA-binding domains # Organism: Escherichia coli K12 # 4 440 8 439 441 323 42.0 3e-88 MERILIIDDDITFALMLKTWLSKKGFRTETAASVAAARTALAEGGFSLVLSDMRLPDEDG IALLQWMSGQHMEIPVIVMTSYAEIQNAVRCMKLGARDYVAKPVNPDELLKKIREALDVP AAGSEKPPVPAASAKKPREERPNYIEGRSDAAQRLYEYVRLVAPTNMSVLVNGASGTGKE HVAQLIHRESRRAGKPFVAVDCGAVPRELAASEFFGHVKGSFTGAVGDKTGAFEAANGGT LFLDEVGNLTYETQVQLLRALQERRIRPVGGSREIPVDIRLIAATNEDLEAAIARGAFRA DLYHRINEFTLRMPELRQMRGDIMLFADFFLDAANRELDKRIVGFDPQAAAAMTRYDWPG NLRQMKNAVMSATLLCTGDYITCRELPAELSEAPETPAVPLRNPAGEEEQIRRALAMAGG NKSQAAKLLGIDRKTLYNKLHLYGIE >gi|313157524|gb|AENZ01000064.1| GENE 20 23713 - 26310 3132 865 aa, chain - ## HITS:1 COG:mlr3215_2 KEGG:ns NR:ns ## COG: mlr3215_2 COG0642 # Protein_GI_number: 13472804 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Mesorhizobium loti # 343 697 1 352 382 157 32.0 9e-38 MKESKFVKTKIFAGYAILIAVCVLSVGYVYRTVVRFSTPDGSYSLLHTKRSVAGQALYHL YQAESYGQLMIAGYQSYESRYKRELRTVRGLIDSLRGLTAAEDSLQTMRLDSIVRLLADK ERRTMSLRRTIRSAATSSLLDKNIRELIGPADSAAHGDSVVVRAADTVASRVVVQDTVTV PRRKRKFFRRFADLFSPPKEEAGMIISRHERVVDSLPAPEVKDTIAVVLRTLQDRVTSDR IGIYDRAWNEGMRLRYSNELVNTKIYRLIMDFEAEDTAFLMNRFEQTEAIRRRSSLTLGV IAVAAVVLMLLFVGILWRDINRSNRYRRALERANRDNEALLAAREKLMLAITHDIKAPLG SVMGYIDLLSRLTGDKREELYLHNMKESSEHLLALVNSLLDFYRLDINKVDVDKVAFCPA QLFETIRAGFAAQAGAKGIGLTLDVEPAAGREVAGDPFRIRQIADNLISNALKFTDEGSV TIRVDVVQGRLVFSVRDTGRGIGREEKERIFQEFVRLRSAQGVDGFGLGLSIVDRLVKLL KGTISLESRLGEGSKFIVSIPVGPVSGGEGRKLRPAEACVPAVRGGVKALLIDDDPLQLE MTAAMCRQAGVGAECCQYPEYAAKLVADGGFDVVLTDIQMPSADGFSVLAAVRGVNPALP VVAVSARGELEAGDFSDRGFAGCLRKPFTANELIAVLDAVCGAGKARSAEDAEGAEKSGA AGIVVGAGEPVSAGRHVGKRGADGAADNAPEVSEGGVNFGPLTAYAGDDAEAARGILESF AEQSAANCRLLERALESGDVAALKAVAHKMLPIFTMLGAAGVAATLRTAESWEGPLTDAL RGEIGAAAENIRAIVAEAQKKVSLP >gi|313157524|gb|AENZ01000064.1| GENE 21 26520 - 27839 2309 439 aa, chain - ## HITS:1 COG:MA4365 KEGG:ns NR:ns ## COG: MA4365 COG1668 # Protein_GI_number: 20093152 
# Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: ABC-type Na+ efflux pump, permease component # Organism: Methanosarcina acetivorans str.C2A # 6 436 18 421 425 152 23.0 2e-36 MSQVSIIIGREFNERVRKKSFIITTLLMPLLMIGLMFAPMLIMKYSRGDEKQIAVIDESG LVAPKLQSGEELVFQTTDLSTDAARKELTDKFGVLYIGSDILTNPNNVKLYVNSSSSLTV ESNITGQLEEIIEAEKLKSYNIENLSQILQEVKTTVGMQTFRNDESQEEESQAKSSVIAT GVGFVLGMILYMFLLIYGSMVMQSVIEEKNSRVLEVMVSSVRPFDLMLGKILGVASVAVV QVLIWGVLCAVGAAVAVHMMPADVLAGVQAMQHGVPDAAASIDMNPEMLQVMAAVTDFGY ILRIFAYLLLFVFGGYLFYSAMFAAVGSAVDSIQDAQQLQTPITIPIILALLVMITVIND PNSQMAFWFSMIPFTSPVVMMARIPYGIPLWEVILSLAILYASFTAMVWLAAKIYRVGIF MYGKKPTFKELYKWIRYKY >gi|313157524|gb|AENZ01000064.1| GENE 22 27852 - 28781 278 309 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225088774|ref|YP_002660041.1| ribosomal protein S16 [gamma proteobacterium NOR5-3] # 1 264 6 283 312 111 29 7e-24 AATMDLLTVEHVTKQYAGHKALDDVSLAIPKGSVYGLLGPNGAGKTTLIRIINRITAPDS GRVMFGGREISPEDVYRIGYLPEERGLYKKMKVGEQAVFFARLKGLSRREAVVRLKQWFV KFGIQDWWDKKIEELSKGMAQKVQFIVTVLHEPELLIFDEPFSGFDPINANLLKDEILAL RDKGATIIFSTHNMSSVEEICDHITLINKSRNILSGRVDDIRRRHGSNIFEVFYRGDEQA LRNAVDGRCEILEGSQAQAVYTSLKLHVERDEEVRGVIAAVNDAVELRSFQEIIPSMNDI FIRAVNGNL >gi|313157524|gb|AENZ01000064.1| GENE 23 29010 - 31040 3171 676 aa, chain - ## HITS:1 COG:CC3504 KEGG:ns NR:ns ## COG: CC3504 COG3590 # Protein_GI_number: 16127734 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Predicted metalloendopeptidase # Organism: Caulobacter vibrioides # 8 676 49 706 706 487 41.0 1e-137 MKKIFIFAATVACMASCQNTSKTPAIDMANFDLSVAPDADFYQYATGGWQKNNPLKPEYS RYGSFDVLRDNNEKRINELFSEMTKISAAPGSVEQKISDLYKMGLDSTRLNAEGAAPLKS AVGEILSVGDRGQLTGIVAKLHTTVANPFFGVGVQADLMNSDINALYISQSGLTMGNRDY YLDPENEHIRKGYKEYLGRIFRFAGIPEADVEKAVAGVMNVETKLAEKSWSNVELRNIPA QYNPTAKADFEKTYDAIDWEAYYKAMGIGDFETIIVTTPSAVANANDLLKNAPLEDIRYY LAAQYIDAAAPYLSDDFQQASFDFYGKVMSGQQEMKPRWKRAMSVPNGTLSEAVGEMYVA KYFPAKDKERMLTLVKNLQTALGQHIAALDWMSDATKAKAQEKLAAFTVKIGYPDKWKDY 
STLEIDPSKSYWENIVNANRWYTADNISELGKPVDKEKWHMSPQTVNAYYNPTTNEICFP AAILQPPFYNPDADDAVNYGAIGVVIGHEMTHGFDDQGRNFDKDGNMNNWWTEADAEAFK AKTDILVKQFDAIEVLPAKEGQPAVMANGSLSLGENIADQGGLRVAYTAFHNSLNGTEPA PIDNFTADQRFYLAYAALWAQNIRDEEIARLTKIDVHSLGKWRVNATLRNLQTFYDAFDI TDGAMFMPEQERVIVW >gi|313157524|gb|AENZ01000064.1| GENE 24 31124 - 32386 2438 420 aa, chain - ## HITS:1 COG:all4131 KEGG:ns NR:ns ## COG: all4131 COG0126 # Protein_GI_number: 17231623 # Func_class: G Carbohydrate transport and metabolism # Function: 3-phosphoglycerate kinase # Organism: Nostoc sp. PCC 7120 # 8 418 13 397 400 362 48.0 1e-100 MFAIDNYDFKGKRAIIRVDFNVPLNEKGEVTDDTRIRAAIPTIKKVLEKGGSVILMSHLG RPKKNPDPKFSLEQIVPAIEKRLGVKVMFAGDCMGEKAAEMAKNLKPGEVMLLENLRFYA EEEGKPRGLAEDATDEEKKAAKKALKEGPQKEFVKKLASYADCYINDAFGTAHRAHASTA LIADYFPNDKMFGYVMENELKAIDGIMLNPERPFCAILGGSKVSTKITIIENLLEKVDVL VLGGGMTYTFAAAEGGKVGNSICEPDQFQTALDILAKAKEKGVKVVMSPDALIADAFSAD ANTNVAPANNIPDGWEGVDIADEGKKVFREEILKCKTILWNGPVGVFEIDKFATGSRAVA EAIAEATSKGAYSLIGGGDSVACINKFGLADKVSYVSTGGGALLEYMEGKELPGVAAIRK >gi|313157524|gb|AENZ01000064.1| GENE 25 32574 - 34901 3260 775 aa, chain - ## HITS:1 COG:PA3339_1 KEGG:ns NR:ns ## COG: PA3339_1 COG1752 # Protein_GI_number: 15598535 # Func_class: R General function prediction only # Function: Predicted esterase of the alpha-beta hydrolase superfamily # Organism: Pseudomonas aeruginosa # 30 226 22 215 308 141 40.0 5e-33 MLRKTLSFISVLLLVAGAAGAQNPETPAARPRTVGVVMSGGGAKGLYHIGVLEALEENGV PIDYVAGTSMGSIIAAMYAAGYSPAEMREIVASGVVKDWVSGRIDPRYTPYYRQIGHNPA FLSVRMNVGGAGKRFRVTNLLSSTPIDMALTELFAPATAAAEGDFDRLMVPFLCVSSDMN AREPVVMRNGDLSEAVRSSMSIPLVFKPMKKDSMLLYDGGIYDNFPWRPLDVGFRPDLIV GSICTAGNAPPSENSSILDQAFMLAMHDTDYDLPKGRSVTVRRAVDVNMLDFNSAVAIMD AGYADATAAMPQILERVSERWSPERYAERREEFRAKWPPLWFSDYKLEGLAEAPESYIRD FVKVDRRTPGRQRPMGFEELRDNLYAVLAGGDFTMDFPHVRYDSLRGSYSFAAQFHTKPN FKLTIGGNISSTAFNQAYIGVNYETIGRVGQQLGADLYLGPIYTWGAIGGRTDFYMWKPV FLDYSYNFAVRNFRHGAFGNLTKIDNTQQVKNSESFFSVAAGMPLAHRGVFLLRANGGHI NYRYDSDVLFADDTDHSRYSFFGLKAELARNTLDKFLYPRKGSDMRLSGIFVTGRDKYEP 
FNADRFLSRTTRQWVGARFTWDKYFDMPGTGWFSLGVNVDGVITNHPEFTTGGATLMSMP AYTPVSHAQMIYMPDFRGKRFVAGGVMPTFDLMPNFFFRTGFYAMFRAKRAYVPGSRSTT ADERWHYIAEASLVYHTPIGPVSLALTKYDLRNWKNMYLTFNFGYAIFAPKGTFY >gi|313157524|gb|AENZ01000064.1| GENE 26 34974 - 35999 1536 341 aa, chain + ## HITS:1 COG:PA2740 KEGG:ns NR:ns ## COG: PA2740 COG0016 # Protein_GI_number: 15597936 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase alpha subunit # Organism: Pseudomonas aeruginosa # 23 337 24 337 338 351 53.0 1e-96 MTDKINELLRRVEEFKPKAAAEIEEFRIKILGKKGELNALMEDFKTVAPELKRELGQQLN KLKTTALDRINSLREQLQEAASDDGAATEDMTRPGSAEQLGSRHPISLVKNQIVEVFSRL GYTVADGPEIEDDWHVFSALNFPPEHPARDMQDTFFIEKNPDILLRTHTSSIQVRTMEHQ KPPIRVICPGRVFRNEAISYRAHCIFHQIEGLYIDEGVSFADMKQSLLYFAKEVFGEQTV IRMRPSYFPFTEPSAEVDVSCNLCGGKGCPVCKGTGWLEIMGCGMVDPNVLKANNIDPEK YSGFAFGMGIERIAMLKYGVKDLRLYFENDVRFLHQFDTAL >gi|313157524|gb|AENZ01000064.1| GENE 27 36011 - 37795 2831 594 aa, chain + ## HITS:1 COG:aq_854 KEGG:ns NR:ns ## COG: aq_854 COG0457 # Protein_GI_number: 15606205 # Func_class: R General function prediction only # Function: FOG: TPR repeat # Organism: Aquifex aeolicus # 41 593 51 541 545 66 21.0 2e-10 MKRLTATTALCVLFCTAFAAGKGPAATVLGGYPDSLRSVWLYTEGIKQNTIFHDTVRARE FLTEAIRADSTYAPAYYEMATNGMYSTPDEAVELARRAYRLDTTNKWYHQFLGQALIFAQ RYPEALGVYKRLRAADPQNPDNYRLLAALYEQVQQPYSAIATLDSAELRFGRIPMLSAMK RQLLISTRQLDKAVDEAKAMVEAAPYEAQHHVVLADLYAILNKDSLAMAEYDRAMQIDST DVATLMSLADYYNGRRDYRSVLNVTQRLFDSDQMPLDAKIKRFEQLTSDMRFYREYYIQL NALASTLAVRYPQDRRVVELYAKHLIASGELEQALALYKLHLDDQPPVESYYRSVIDIES YLQHPDSAALYINRALELFPGEIDFQLSKGHVLNYTKEYDKAIKAYQQSLRYARTDSLRG AIWGLIGDTWHQKAAAGESGDEEDFVLISRKGSFRSAMKQCYKAYDRSLRYDPDNAMVLN NYAYFLSLEERDLEKALAMASRATALTDNNPTYLDTHAWVLFKLGRVDEARKIMQQAVAL DAQESAALLVHYGDILKALGENFMAEIYWRKALEKGYDAGRIERRITESKAKKE >gi|313157524|gb|AENZ01000064.1| GENE 28 37809 - 38936 1651 375 aa, chain + ## HITS:1 COG:no KEGG:GFO_3257 NR:ns ## KEGG: GFO_3257 # Name: not_defined # Def: M23 family peptidase # Organism: G.forsetii # Pathway: 
not_defined # 38 342 32 381 409 80 22.0 1e-13 MKPTKSCILTLFLLFAALAASAQADRKIEQQKRVIAALEQKIAAEEREISKLKEGRAATE ERIRRLARQIDSRNQLLEETEKQARLLRGEIDRTDSVAGDLGRSLDRNRTQYAEMVREAY RNYRHNNYLTYIFSSRDFADMARKLTNLREVASMRERKLQDIAALSKQVAEEKELLDRRK RSLDSVTLKLKAQKQKLERDARNARTSIRQLSQKEKTALQRKISQEQQLDVAIGELRKLT KGNKEGDSFSTKTTGLRLPVTAGRVKRYKENMAEITGPKGAHVISIYDGKVVDVKRNRIT NKYDVYVAHGEYITSYANMGSICVEKGQKVARNAQLGTIGSAVNIMTMETEYKLVFGIYP PNPGQKLRAENCFKK >gi|313157524|gb|AENZ01000064.1| GENE 29 39153 - 39584 716 143 aa, chain + ## HITS:1 COG:TP0650 KEGG:ns NR:ns ## COG: TP0650 COG0319 # Protein_GI_number: 15639637 # Func_class: R General function prediction only # Function: Predicted metal-dependent hydrolase # Organism: Treponema pallidum # 27 137 30 145 160 65 30.0 4e-11 MAVKYYTDDCSYRLPQKRLTAEWLRRVAEAEGYELGDVSYIFCSAQRLLEMNRQYLGHDY FTDVITFDYSDRKGARVISGDIFIDVETVADNARLYGFSTLQEMRRVVVHGVLHLCGQGD KTPRTNAQMHRKEDKYLKFWEEQ >gi|313157524|gb|AENZ01000064.1| GENE 30 39672 - 41555 1612 627 aa, chain + ## HITS:1 COG:HI0582 KEGG:ns NR:ns ## COG: HI0582 COG0445 # Protein_GI_number: 16272526 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: NAD/FAD-utilizing enzyme apparently involved in cell division # Organism: Haemophilus influenzae # 5 627 7 621 629 612 49.0 1e-175 MILEYDIIVIGGGHAGCEAASAAARLGSRTLLLTMDMTKMASMSCNPAVGGVAKGQIVRE IDALGGQMGRITDLTTIQFRMLNRSKGAAMWSPRAQCDKSRFSAQWRHTLENTVNLYIWQ DAATELLFDTAGVKPRIKGVRTQMGIEFACRAVVLTSGTFLGGMMHCGTSHAEGGRAGDA ASHGITESLRAIGFETGRMKTGTPARLDARTIDFESLEPQYGDENPDKFSFSPDTQPVKH QLPCFLVYTSAEVHDLLRTGFDRSPLFNGTIKGIGPRYCPSIEDKLRTFADKDQHQLFLE PEGESTNEYYLNGFSSSLPWDIQWEALHKIRGFEDLHIFRPGYAIEYDYFPPTQLHHSLE TKLVSGLYFAGQVNGTTGYEEAAAQGLIAGINAHRALKGEEAVVLQRDEAYIGVLIDDLV TKGVDEPYRMFTSRAEYRILLRQDNADLRLTPIGYKIGLISQKRYAHFTEKKASVESLIS FARRQSIKAAEINDYLESVNSEPLTQGRKLYDILMRNNVTFESLSEALLKLRKFISDNNI SAEAIEEAEIQIKYKGYIEREKFIAEKLHRLENIRIPEDFDFHSMNSLTIEARQKLTRIR PATIGQASRIPGVSPADVNVLLVKFGR >gi|313157524|gb|AENZ01000064.1| GENE 31 41705 - 42718 699 337 aa, chain - ## 
HITS:1 COG:HI1193 KEGG:ns NR:ns ## COG: HI1193 COG0115 # Protein_GI_number: 16273115 # Func_class: E Amino acid transport and metabolism; H Coenzyme transport and metabolism # Function: Branched-chain amino acid aminotransferase/4-amino-4-deoxychorismate lyase # Organism: Haemophilus influenzae # 1 337 1 343 343 310 47.0 2e-84 MENMDWGALGFDYRKTDANVRYYYTDGKWSEMEITGDEYIKIHMSASCLHYGIELFEGLK AFRGVDGKVRLFRVEENAKRLQSSAERLCLPVPTIEMIVKACVEVVKRNEKYIPPYGTGA SLYLRPVMFGTTAGLGVKPAKDALLIVYCSPVGAYFKDGIKPISVAIDREQDRAAPRGTG DVKVGGNYASSLLSGENGHKLGYSNIMYLDAAEHKYIEECGACNFFGIKEGKYITPKSSS VLPSITNKSLRQLARDMGLEVEDRLIPVEELPTFEECGGCGTAAVISPIGKIFDMQTNDI YEYGTEVGKVCMELYTRLQDIQYGRAEDKYNWCTIVE >gi|313157524|gb|AENZ01000064.1| GENE 32 42747 - 44822 2470 691 aa, chain - ## HITS:1 COG:PAB2364_1 KEGG:ns NR:ns ## COG: PAB2364_1 COG0143 # Protein_GI_number: 14521189 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Methionyl-tRNA synthetase # Organism: Pyrococcus abyssi # 18 556 3 553 562 541 46.0 1e-153 MTEISNTVSVFMAKEFKRYLVTSALPYANGPVHIGHLAGVYIPSDIYTRYLRLRGRDVIS VCGSDEHGVPITIKARKEGVTPQQIVDRYHSMIKKSFEGLGMSFDIYSRTSSATHAKTAS EFFRKLYDEGKFIEQTSMQYYDEETQTFLADRYIVGTCPKCGNDRAYGDQCEKCGSTLSP DELIDPHSAVSGSVPVKRETKHWYLPLDQYEGFLREWILEGHKEWKSNVYGQCKSWLDGG LQPRAVSRDLDWGIPVPVEGAEGKVLYVWFDAPIGYISATKDLTPDWETYWKSEDTKMVH FIGKDNIVFHCIVFPSMLKAHGGYILPENVPANEFLNLEGDKISTSRNWAVWLHEYLEEF PGKEDVLRYVLCANAPETKDNDFTWKDFQARNNNELVAVLGNFVNRALVLTRKYYDGEVP ACGELNDYDRQTVGEVAAVKASLESNIEHYHFREALKDAMNIARIGNKYLADTEPWKVVK TDPQRVGTILNIALQITANTAIAIEPFMPFSAAKILKMLSVEKFGWEQLGTMDLIAAGHK IGEPVLLFEKIEDDVIQRQLDKLAATKEANAAAEAAQQVEPQKDAVSFDDFQKMDIRVST ILAAEKVAKTKKLLKLTVDTGIDQREIVSGIAEYFTPEELVGRQVLVLVNLQPRELKGIL SRGMILMAEDASGKLRLLSPNEATNSGAVVG >gi|313157524|gb|AENZ01000064.1| GENE 33 44939 - 47053 2829 704 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_3552 NR:ns ## KEGG: Bacsa_3552 # Name: not_defined # Def: peptidase S46 # Organism: B.salanitronis # Pathway: not_defined # 1 703 1 719 721 654 48.0 0 
MKKLFLLIAAACASLTAAADEGMWLLPYLQKMNIKEMKARGCKLSAEEIYSVNKSSLKDA IVIFGPGCTGEIVSADGLLFTNHHCGYGSIQALSSVEHDYLKNGFWAMSREEEIPAPGLK VRFIRKISDVTSDILGYVPDVASGEERARIVAEHAAAVRSRIAQDNPGMEVLVKDFFGGN QFFAFVIEVFSDVRLVGTPPTSIGKFGGDTDNWMWPRHTGDFSVFRVYAGPDNKPADYAP ENRPYKADKFLKVSVDGYKEGDFAMIMGFPGSTSRYMTSFEIDRMLDIENPQRIFIRGER QAILKEDMAASDKVRIQYASKYAQSSNYWKNSIGMSRGIRKLNVKAQKEAQEAAFRKWAE ANTLPTEGYMDALDRIREAVEGNASAFAAEQVLREALYRAVEILTPARSFLAVEKITDPA RSKEAMRAFYKDYNPATDRRVAKRMMQIVKEKCGDLPTVFAEVIDKRFGGDTDAYVDYLY DNSIFATEEGTLAFVDDFSVEKRDADPAVVFVRSLDAKLLELADAQRENNRRFKDGHRLY IAGLMRMQPDKAWASDANFTIRLTYGRVLPYDPADGIRYNYYTTLKGVMEKEDPKNPTEF TVPDKLKELYAAKDFGRYANAEGELPVAFLADCDITGGNSGSPVMNAKGELIGLAFDGNW EAMSGDVAFEPQLQRTIAVDVRYVLFVIDKFAGAKWLIDELSID >gi|313157524|gb|AENZ01000064.1| GENE 34 47063 - 48004 1496 313 aa, chain - ## HITS:1 COG:all4673 KEGG:ns NR:ns ## COG: all4673 COG0379 # Protein_GI_number: 17232165 # Func_class: H Coenzyme transport and metabolism # Function: Quinolinate synthase # Organism: Nostoc sp. PCC 7120 # 13 312 25 322 324 337 53.0 2e-92 MKVNNSAEIRRRIDELKRRKRAVILAHYYTTPEVQAAADFLGDSLALSVKAQSVDADIIL FAGVHFMAETAKVLCPDKKVLIPCPEAGCSLAESCDEKDFAVFKAKYPGHTVVSYVNTTV GVKALTDICCTSSNALKVVQSIPADRPIIFAPDRNLGGYIKKLTGRENIVLWDGACHVHE EFSLEKLLQLKKEHPAAKVVVHPECRAYIVEVADFVGSTAAILDYCGRSGADEFIVVTES GILAEMKKRYPGKTFIPAPPDDETCGCNNCKYMKMVTLENICACLENESPEIVLDEEVRR RAERSILNMINIK >gi|313157524|gb|AENZ01000064.1| GENE 35 48382 - 49434 1429 350 aa, chain - ## HITS:1 COG:BMEI1887 KEGG:ns NR:ns ## COG: BMEI1887 COG0489 # Protein_GI_number: 17988170 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Brucella melitensis # 3 349 14 380 394 221 38.0 2e-57 MEEKIRHLLTSVVHPETGQDIVGSGFIEHIASGAGKITVVLRFAKARDPFAVKIKNQAEE ILRREFPQQNVMVVIKEGGAAPRPEPKLKTTTGGIAKVIAVASGKGGVGKSTVTANLAVA LRNMGYRVGILDADIYGPSQPKMFGVEGYLPDAVQEEGTDRIVPAESMDIRLMSIGFFIK PTDALLWRGAMAVSALKQMIHQTKWGTLDFLLADLPPGTGDVHLSIIGELKIDAAVIVST 
PQQVAVADVVRGVEMFRNENVNIPVAGVIENMAWFTPEELPENRYYIFGKGGARRYAEEN GVDFLGEIPIVQSIMEGSDEGRPAAGIDPRVEKWYREIAEKTVEKVMKSC >gi|313157524|gb|AENZ01000064.1| GENE 36 49611 - 51389 2875 592 aa, chain + ## HITS:1 COG:all4590 KEGG:ns NR:ns ## COG: all4590 COG0616 # Protein_GI_number: 17232082 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Periplasmic serine proteases (ClpP class) # Organism: Nostoc sp. PCC 7120 # 42 591 47 608 609 329 38.0 1e-89 MNFIKTFLAAILAFVVGSFLVFFLWIFILLGIAGSMEKTVAVHAESILKIDFSDVLTDAP SSDPFAGIDFATLRSTRQLPLMKALRALEAAKDDSRIKGIYLRMNGNGGVMGAALLEELR EAVVDFKQSGKFVVAYDETYSQGKYYLASAADKIYMQPEGAMDWSGLAFNLMFYKGLLDK LDIKAEVFRPTACKYKSAVEPYILPKMSDANREQMQALVNSMWNTIAGSVAEARGIDLKE LNRITDKLEVSLPEDALEHGFVDSLIYEDQMKDIFAELGVASDSDGEYNFITLGEYASQV GADLKNISADQVAVVYADGAIVDGEGFGKEIYGNTLAATLAGVRDDEKVKAVVLRVNSPG GSALASDVIWREMELLKAEKPVVVSMGSYAASGGYYISCPADVIVADKLTLTGSIGVFGM YLNTIDAFKNKLGITFDAVKSNTSAGMGATSPLTAAERASIMRGVDKVYTTFTTHVAEGR NLPVEKVLDIAGGRVWSGEDALGIGLIDTYGGLKTAIAIAVDKAGLGDSYRVTEVIEEPT GFAAFIASLNVSVREAMTRSELGLLMKEYKQVQEATKQQGVVMYYPYKLELR Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:19:28 2011 Seq name: gi|313157460|gb|AENZ01000065.1| Alistipes sp. HGB5 contig00025, whole genome shotgun sequence Length of sequence - 69655 bp Number of predicted genes - 60, with homology - 56 Number of transcription units - 32, operones - 14 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 19 - 3303 4874 ## BVU_2749 OmpA-related protein + Term 3328 - 3365 9.4 - Term 3316 - 3353 9.4 2 2 Tu 1 . - CDS 3376 - 3963 683 ## COG1011 Predicted hydrolase (HAD superfamily) + Prom 3862 - 3921 1.9 3 3 Tu 1 . + CDS 4033 - 5076 1388 ## BT_0805 hypothetical protein + Term 5115 - 5153 4.6 - Term 5173 - 5207 1.5 4 4 Tu 1 . 
- CDS 5382 - 7598 3232 ## COG3808 Inorganic pyrophosphatase - Prom 7629 - 7688 4.4 - Term 7679 - 7718 9.1 5 5 Op 1 7/0.000 - CDS 7737 - 8999 1991 ## COG2871 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF 6 5 Op 2 9/0.000 - CDS 9022 - 9642 1093 ## COG2209 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE 7 5 Op 3 9/0.000 - CDS 9654 - 10310 1127 ## COG1347 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD 8 5 Op 4 9/0.000 - CDS 10303 - 11172 1227 ## COG2869 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC 9 5 Op 5 7/0.000 - CDS 11181 - 12329 1754 ## COG1805 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB 10 5 Op 6 . - CDS 12332 - 13687 2006 ## COG1726 Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA 11 5 Op 7 . - CDS 13742 - 13807 88 ## - Prom 13829 - 13888 2.0 + Prom 13765 - 13824 6.1 12 6 Op 1 . + CDS 13853 - 14488 548 ## COG0084 Mg-dependent DNase 13 6 Op 2 . + CDS 14509 - 15237 1062 ## COG1179 Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 14 6 Op 3 . + CDS 15252 - 16193 1317 ## COG1052 Lactate dehydrogenase and related dehydrogenases 15 6 Op 4 . + CDS 16193 - 17236 1240 ## COG0463 Glycosyltransferases involved in cell wall biogenesis 16 6 Op 5 . + CDS 17255 - 17755 783 ## COG3760 Uncharacterized conserved protein 17 7 Tu 1 . + CDS 17990 - 18442 674 ## gi|313157499|gb|EFR56918.1| hypothetical protein HMPREF9720_1081 + Term 18511 - 18544 4.5 - Term 18497 - 18532 4.1 18 8 Tu 1 . - CDS 18554 - 20080 2294 ## COG0696 Phosphoglyceromutase - Term 20299 - 20324 -0.5 19 9 Tu 1 . - CDS 20338 - 22959 4139 ## COG0013 Alanyl-tRNA synthetase - Prom 23000 - 23059 4.1 + Prom 22976 - 23035 5.2 20 10 Op 1 2/0.000 + CDS 23059 - 24057 1680 ## COG0739 Membrane proteins related to metalloendopeptidases 21 10 Op 2 . + CDS 24061 - 25011 1606 ## COG0739 Membrane proteins related to metalloendopeptidases 22 10 Op 3 . 
+ CDS 25074 - 25721 736 ## Astex_3484 had-superfamily hydrolase, subfamily IA, variant 3 23 11 Tu 1 . - CDS 25728 - 28610 3857 ## Palpr_0523 hypothetical protein - Term 28702 - 28734 2.2 24 12 Tu 1 . - CDS 28829 - 29536 1159 ## gi|313157478|gb|EFR56897.1| hypothetical protein HMPREF9720_1089 - Prom 29564 - 29623 1.9 + Prom 29562 - 29621 4.0 25 13 Op 1 . + CDS 29643 - 32786 4685 ## COG1074 ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) 26 13 Op 2 . + CDS 32788 - 34731 2493 ## COG1132 ABC-type multidrug transport system, ATPase and permease components 27 13 Op 3 . + CDS 34734 - 35723 879 ## BVU_4153 hypothetical protein - Term 37063 - 37129 13.1 28 14 Op 1 38/0.000 - CDS 37158 - 37988 358 ## PROTEIN SUPPORTED gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts 29 14 Op 2 . - CDS 38040 - 38981 999 ## PROTEIN SUPPORTED gi|237708157|ref|ZP_04538638.1| 30S ribosomal protein S2 - Prom 39092 - 39151 3.1 30 15 Op 1 59/0.000 - CDS 39165 - 39551 441 ## PROTEIN SUPPORTED gi|126646806|ref|ZP_01719316.1| 30S ribosomal protein S9 31 15 Op 2 . - CDS 39557 - 40012 567 ## PROTEIN SUPPORTED gi|229200005|ref|ZP_04326568.1| LSU ribosomal protein L13P - Prom 40146 - 40205 5.8 - Term 40231 - 40259 -0.7 32 16 Tu 1 . - CDS 40390 - 40824 265 ## - Term 41066 - 41103 7.8 33 17 Op 1 9/0.000 - CDS 41152 - 42327 1689 ## COG1883 Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit 34 17 Op 2 . - CDS 42332 - 42787 719 ## COG0511 Biotin carboxyl carrier protein 35 18 Op 1 . - CDS 42937 - 43386 754 ## BT_1687 hypothetical protein 36 18 Op 2 2/0.000 - CDS 43388 - 44950 2606 ## COG4799 Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) 37 18 Op 3 1/0.000 - CDS 45119 - 45529 739 ## COG0346 Lactoylglutathione lyase and related lyases 38 18 Op 4 . - CDS 45684 - 47447 1297 ## COG0438 Glycosyltransferase - Prom 47493 - 47552 5.5 + Prom 47301 - 47360 5.7 39 19 Op 1 . 
+ CDS 47517 - 47987 553 ## gi|313157505|gb|EFR56924.1| hypothetical protein HMPREF9720_1104 40 19 Op 2 . + CDS 47992 - 49107 1652 ## COG1703 Putative periplasmic protein kinase ArgK and related GTPases of G3E family + Term 49177 - 49218 10.3 - Term 49161 - 49210 14.6 41 20 Op 1 . - CDS 49226 - 49924 969 ## COG0846 NAD-dependent protein deacetylases, SIR2 family 42 20 Op 2 . - CDS 49928 - 50092 116 ## - Term 50477 - 50519 -0.7 43 21 Op 1 . - CDS 50524 - 51300 451 ## gi|313157508|gb|EFR56927.1| hypothetical protein HMPREF9720_1108 44 21 Op 2 . - CDS 51318 - 51587 368 ## Bache_1363 hypothetical protein 45 21 Op 3 . - CDS 51599 - 52090 737 ## COG0330 Membrane protease subunits, stomatin/prohibitin homologs - Prom 52121 - 52180 3.3 + Prom 52426 - 52485 1.5 46 22 Tu 1 . + CDS 52505 - 53920 2522 ## COG0076 Glutamate decarboxylase and related PLP-dependent proteins + Term 54069 - 54099 3.0 + Prom 54630 - 54689 4.5 47 23 Op 1 . + CDS 54714 - 55475 1224 ## BVU_1964 3-oxo-5-alpha-steroid 4-dehydrogenase 48 23 Op 2 . + CDS 55483 - 56694 1817 ## COG1902 NADH:flavin oxidoreductases, Old Yellow Enzyme family 49 23 Op 3 . + CDS 56724 - 57677 988 ## COG0300 Short-chain dehydrogenases of various substrate specificities 50 24 Tu 1 . + CDS 57820 - 58446 780 ## COG1011 Predicted hydrolase (HAD superfamily) 51 25 Tu 1 . + CDS 58554 - 58730 88 ## 52 26 Tu 1 . + CDS 58975 - 61440 2706 ## Odosp_0360 hypothetical protein - TRNA 61805 - 61882 74.1 # Pro CGG 0 0 - TRNA 62192 - 62266 85.4 # Pro TGG 0 0 53 27 Op 1 . - CDS 62645 - 63718 1553 ## COG0535 Predicted Fe-S oxidoreductases 54 27 Op 2 . - CDS 63722 - 64054 285 ## gi|313157509|gb|EFR56928.1| putative lipoprotein - Prom 64082 - 64141 5.7 + Prom 64021 - 64080 3.0 55 28 Tu 1 . + CDS 64198 - 64398 417 ## COG1278 Cold shock proteins + Term 64481 - 64519 9.3 - Term 64425 - 64471 -0.2 56 29 Tu 1 . - CDS 64524 - 64934 436 ## gi|313157466|gb|EFR56885.1| conserved hypothetical protein - Prom 64974 - 65033 6.0 + Prom 64920 - 64979 3.8 57 30 Tu 1 . 
+ CDS 65163 - 66398 1978 ## COG0205 6-phosphofructokinase + Term 66426 - 66462 9.6 + Prom 66781 - 66840 1.6 58 31 Tu 1 . + CDS 66884 - 67762 1229 ## gi|313157481|gb|EFR56900.1| hypothetical protein HMPREF9720_1127 + Term 67790 - 67825 4.1 59 32 Op 1 . + CDS 68118 - 69164 421 ## gi|313157518|gb|EFR56937.1| hypothetical protein HMPREF9720_1129 60 32 Op 2 . + CDS 69178 - 69507 443 ## gi|313157521|gb|EFR56940.1| hypothetical protein HMPREF9720_1130 + Term 69521 - 69558 9.4 Predicted protein(s) >gi|313157460|gb|AENZ01000065.1| GENE 1 19 - 3303 4874 1094 aa, chain + ## HITS:1 COG:no KEGG:BVU_2749 NR:ns ## KEGG: BVU_2749 # Name: not_defined # Def: OmpA-related protein # Organism: B.vulgatus # Pathway: not_defined # 5 1094 8 1113 1113 954 49.0 0 MKKLFILMLSIFTVAAASAQVTTSGMNGTVTDEQGQPLAGATVIAVHTPSGTQYGAVTNK DGRYNLQGLRTGGPYTVTFSFVGYQGVEFPGLELQLGETLTRNAFLKDSQTLEAVVVTAD GRNSSMNVNRAGAVTSISSEQIELMPSVSRSMNDIMKLTPQASTTTSGLAIGGGNYRQSY VTVDGAAFNNAFGIGGNLPAGGSPISLDALEQMSVSVTPYDVRQSGFTGGAINAVTKSGT NEFKASAYVFAKSDQLQGDKYDGGKLSLSEMRNTTLGFSIGAPIVKDKLFVFANFEREWN TTPGTSRLARTSEGQSFGGGSQYNRPTVEKLDEISNFLIDKYGYNPGPYQGYSVKTPGYK LMARVDWNINRNNSLNVRFSRTQNKYSSSPSSSISPLDSKLAYDRNNYGRTSNYAMYFQN SRYYQEQNFTSVAAELNSRFLEGRLTNTLRYTYSHQYEPRSYDGKLFPTVDILEEYQGNR AVYASFGLDPFTYGNLREVSTHVVTDEIGYTVGKNRFVAGLQFEHNVAKNGYLQGGAGYY VYETWDDFKNDREPLAFRIAHGNNDALAQEYPQFTYMQYSIYLQDEINFSERFKATVGIR FEVPSYPSIDNNENKDFTQAFANYGGYKTSDMPKARLAVAPRVGFNWDMTGERKYILRGG TGVFNGRLPFVWLVSVAGNSNCIQNGLSLYKGDSRMPSFHTNVNDMLKNIYGGTYKQQDL AANTQPTILDKKLKMPSTWKTSLALDLKLPGDVDLNIEGIYNKDFNSVTVTKLGIEENPA GIQLPGEPALRKAWKSQNIRNKNPEEKYSINPYLINNADIDGYYASVSAQVSKRWGFGLS LMAAYTYSSAKNVIDGIGDQVTSAYNTNTFNRNGSNTPELGYASYVSPHRILFNVGYRLA QKNGASNFGLYYEAFQHGYIGGYSYSRYSYTMGNVTGDGGAALLLYIPTREQLDKMTFAD LVDNGKVIYSAADQKNDFWAFINKDSYLSKHIGEYSKRGGAVMPWQHMVNFKFSQDFYIN INGKRNTITLGVDINNLANLINRNWCGIDRLESSQILKYNTKTNAYNFTKPVWSKYASTV ATWSAMFSIRYTFN >gi|313157460|gb|AENZ01000065.1| GENE 2 3376 - 3963 683 195 aa, chain - ## HITS:1 COG:VC1645 KEGG:ns NR:ns ## COG: VC1645 COG1011 # 
Protein_GI_number: 15641650 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Vibrio cholerae # 1 178 4 186 209 87 31.0 2e-17 MKNIVFDLGGVLFARDKNKSTQEFHEFFSFIRSERMPLFWEEYDRGASTLGEVTDILART TGCPREKCERFLLLSIDMQEPVAPTERLVGDLKAAGYRLYVLSNMSREFIAFLRRFPVYR LFDGEVVSCEEGTVKPEPRIYEILLERYGLDPAETLFIDDRAANIAAAEGLGIAGYLFDH RNPAAACDELRRRLL >gi|313157460|gb|AENZ01000065.1| GENE 3 4033 - 5076 1388 347 aa, chain + ## HITS:1 COG:no KEGG:BT_0805 NR:ns ## KEGG: BT_0805 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 347 1 349 349 347 58.0 5e-94 MNHGISILFRAIPLAMAAFCFGYGAYVFAAGSDPSRLIAGPVVFFLGSICVALYCTAATI IRQIIGTYSAAAKYLFPAVGYAFAAMTVICGIFIITSNMTGAYVTGHVVCGLGLITACVS TAATSSTRFSLIPKNSGDSSFSINPAGFSRGQSSLLIFIVSAVAAGAWLWCILLFSIGTL PAHIVAGSVMFGIACVCTSLIALVASIARQARGSYTMGERRRWMSLVLCMGGLAFVLGLI LIFTFRGESINFVGFVLIGLALICWSISSKVILLAKIWHADFPLANRIPIIPVLTALACL FLAAFLYEAALFETKYFVPARVLTGFGAICFTLYSIVSILESGANKK >gi|313157460|gb|AENZ01000065.1| GENE 4 5382 - 7598 3232 738 aa, chain - ## HITS:1 COG:MA3879 KEGG:ns NR:ns ## COG: MA3879 COG3808 # Protein_GI_number: 20092675 # Func_class: C Energy production and conversion # Function: Inorganic pyrophosphatase # Organism: Methanosarcina acetivorans str.C2A # 6 731 13 683 685 535 49.0 1e-151 MNIPSIFWFIPAAAVIALLLAWAFYRAMKREDEGTPRMREIAGHVRRGAMAYLRQQYKVV AIVFVVLALFFAYLAYGAGVQNPWVPFAFLTGGFFSGLAGYFGMKTATYASARTANAARQ SLDRGLKVAFRSGAVMGLVVVGLGLLDISFWYVILEYFVDVTGPQKLVVITTTMLTFGMG ASTQALFARVGGGIYTKAADVGADLVGKVEAGIPEDDPRNPATIADNVGDNVGDVAGMGA DLYESYCGSILATAALGAAAFASADGMAMQLCAVLAPMLIAAVGIVLSIIGIFLVRTKEG ATMRELLRSLGVGVNFSSLLIAGATFGILYLLGIENWIGLSFSVITGLLAGIIIGQATEY YTSHSYEPTQKIAGSAQTGPATVIIAGIGSGMISTAVPVLTIGVAIILAYLCAIGFDMEH IMSAQSMSLGLYGIGIAAVGMLSTLGITLATDAYGPIADNAGGNAEMSGLGPEVRKRTDA LDALGNTTAATGKGFAIGSAALTALALLASYIEEIRIGLLHNGVTALDLPNGTTQLVEKA SLLDFMEYYHVSLMNPTVLIGVFVGAMMSFLFCGLTMNAVGRAAQSMVNEVRRQFREIKG 
ILTGEGTPDYARCVEISTRGAQREMLLPSLLAIVVPVVVGLVFGVAGVMGLLVGGLSSGF VLAVFMANAGGAWDNAKKMVEEGHFGGKGSDCHKATVVGDTVGDPFKDTSGPSLNILIKL MSMVSIVMAGLTVAFHLF >gi|313157460|gb|AENZ01000065.1| GENE 5 7737 - 8999 1991 420 aa, chain - ## HITS:1 COG:PA2994 KEGG:ns NR:ns ## COG: PA2994 COG2871 # Protein_GI_number: 15598190 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrF # Organism: Pseudomonas aeruginosa # 5 419 6 406 407 469 53.0 1e-132 MELTIIVAIIAFLAVTLLLVGLLLFAKAKLTSSGEVTIDINDGEKVITAESGSTLLSTLA NNKVFLPSACGGGGSCGMCKCQVLEGGGDILPTETGFISRKLAKDHWRLGCQVKVKENLR IKVPEAVLGVKKWECTVVSNRNISTFLKEFVVKLPEGENLKFRSGGYIQIDIPKYDAIKF SDMDVDEKYRADWDKFKMWDLVTTNPEDTFRAYSMANHPAEGNIIMLNIRIATPPFDKAT GGFMKVNPGICSSYVFSRKPGDKITISGPYGEFFLPDDLPDTQELIFIGGGAGMAPMRSH LMHLFKTEKTKRPVSFWYGARALKEVPYLDEFHAIEKDFPNFSFNLALDRPDPEADAAGV KYTPGFVHNVLYENYLKNHQAPEDCIYLMCGPPMMIASVVKMLDSLGVPPENILYDNFGS >gi|313157460|gb|AENZ01000065.1| GENE 6 9022 - 9642 1093 206 aa, chain - ## HITS:1 COG:PA2995 KEGG:ns NR:ns ## COG: PA2995 COG2209 # Protein_GI_number: 15598191 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrE # Organism: Pseudomonas aeruginosa # 4 206 5 202 202 189 53.0 3e-48 MESLNIFIRSIFIDNMVFAFFFGMCSYIAVSKSVKTALGLGAAVTFVMVMTVPLNYFLYE YVLKAGALAWAGLPDVNLDFLTFIVFIATIAAFVQLVEMAVEKFSPTLYSQLGIFLPLIA VNCAIMGGSLFMQQKVDALELTSLWQSIVYGLGSGLGWWLAIVMMAAIREKTTYSQIPAA LKGPGIAFIITGLMGIAFMIFSGIQF >gi|313157460|gb|AENZ01000065.1| GENE 7 9654 - 10310 1127 218 aa, chain - ## HITS:1 COG:PA2996 KEGG:ns NR:ns ## COG: PA2996 COG1347 # Protein_GI_number: 15598192 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrD # Organism: Pseudomonas aeruginosa # 18 217 12 208 224 218 59.0 6e-57 MSKNEKEPLFSAKNMKFLTGPFSANNPVIVQILGICSALAVTVQLKPAIVMALSVTVVTG FSNLVMSLLRNGVPSRIRIIVQLVVIAALVILVDQVLKAYVFEVSKQLSVYVGLIITNCI 
VMGRIEAFALANKPWASLLDGIGNGLGYGLILVIVAFFRELFGSGSLLGFRIVPESWYIA EGGFYSNCGLMLFPPMALIIVGCIIWVHRSRNKDLQEK >gi|313157460|gb|AENZ01000065.1| GENE 8 10303 - 11172 1227 289 aa, chain - ## HITS:1 COG:PM1330 KEGG:ns NR:ns ## COG: PM1330 COG2869 # Protein_GI_number: 15603195 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrC # Organism: Pasteurella multocida # 150 268 137 256 260 100 45.0 4e-21 MATKKFKCTVCGYIHEGDAAPEKCPVCMAPASAFVELKDEKKKGLFSDKNGNAYIIMYST VMVVIVATLLAVAALSLQSRQYANELNEKKQSILSSLSAQGENYDAFIKGYVLDADGNEL EGEDVFELLKDLQGAFDDGKFPVFEAADGRVVIPVTGMGLWGPVWGYIALEKDMDTVAGI IMAHKGETPGLGAEIATPKYQAQFVGKKIFKGDEFVSVKLRKGGAQDPEHEVDAISGGTK TSDGVTAMLYNSMEHYLPLLEAKRKAAVEQVVFAPVRDESNEENVENNE >gi|313157460|gb|AENZ01000065.1| GENE 9 11181 - 12329 1754 382 aa, chain - ## HITS:1 COG:PA2998 KEGG:ns NR:ns ## COG: PA2998 COG1805 # Protein_GI_number: 15598194 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrB # Organism: Pseudomonas aeruginosa # 3 377 2 401 403 320 46.0 3e-87 MSALRNLVDKLKPTFTKGGKLGFLESTFDAFETFLFVPDTTTSKGAHIRDCNDMKRTMIM VVVALMPALLFGFYNTGYMHFLSQGVQPEFWASLWFGFLKVLPLIVVSYVVGLGIEFAFA QFRGHEVNEGFLVSGLLIPMIMPVDIPLWMVALSTAFAVVFGKEVFGGTGMNVFNPALLA RAFAFFAYTPSMSGATVWIAGMTSGEGVVDGFSGATALENLSTTGQMGYSAMDAFLGFIP GCVGETSTLAILIGAAILLFTGVASWRTMVSVFVGGLAMGYLFQALGVTTYPAWWHLIVG GFAFGAVFMATDPVTSAQTDKGKYIVGLMTGALAVLIRVVNPAYPEGMMLSILFMNALAP LVDYYVVEANISRRKKRVKLAK >gi|313157460|gb|AENZ01000065.1| GENE 10 12332 - 13687 2006 451 aa, chain - ## HITS:1 COG:NMB0569 KEGG:ns NR:ns ## COG: NMB0569 COG1726 # Protein_GI_number: 15676474 # Func_class: C Energy production and conversion # Function: Na+-transporting NADH:ubiquinone oxidoreductase, subunit NqrA # Organism: Neisseria meningitidis MC58 # 4 451 1 447 447 319 37.0 9e-87 MSKIITLRKGLDINLRGKAQESLADAPLASEYALSPLDFEGVTPKLLVKVGDEVKAGTPL FFNKYSERILFTSPVSGTVAAVNRGEKRKILSVTVTPASVQSYEEFAKPDLQKASREEIV 
ELLLKSGLWPMIVQRPYGVIADPNDTPKAVFISAFDSAPLAPDYNYVLRAEQKNLQTGID VMRKLTSGKVHLSVRAKAEGQMTALKGAETHAFAGKHPVGNVGVQIHHIDPVNKGEVVWT VNIQDLAIIGRLFNEGRVDMTKIIAVAGSEIEKPQYCRIIAGAKVDSILKGNVKPQKEGD HVRIISGNVLTGAKTAADGFIGFYANQLTVIPEGDKFELLGWAMPRFNKFSVSRAYFSWL CPKKEYNLDTNMNGGERPFVVTGLYERYLPMDIYPMYLLKACLAGDIDKMENLGIYEVVE EDFALCEFVDPSKIEIQQIIRDGINLMIKEA >gi|313157460|gb|AENZ01000065.1| GENE 11 13742 - 13807 88 21 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MHTGFSEKIRIFARELWLLAH >gi|313157460|gb|AENZ01000065.1| GENE 12 13853 - 14488 548 211 aa, chain + ## HITS:1 COG:MTH233 KEGG:ns NR:ns ## COG: MTH233 COG0084 # Protein_GI_number: 15678261 # Func_class: L Replication, recombination and repair # Function: Mg-dependent DNase # Organism: Methanothermobacter thermautotrophicus # 22 192 63 254 256 80 32.0 2e-15 MTDQFVNIHTHRPTGRGIELRTAGIHPWNADKEDVSTIVPLLGDVQAVGETGLDFVHGAD RETQLAAFRAQLALARERRLPVVLHCVRAFEPVMRELDACRPRAAIFHGFIGSPEQARRA VLKGYYLSFGLRAFASPKTLESLRETPLSQLFLETDDSDVPIEEIYARAAKVKGVTPEEL QRATLENYGRIFTTGPQGQDKAAGTPPLPAR >gi|313157460|gb|AENZ01000065.1| GENE 13 14509 - 15237 1062 242 aa, chain + ## HITS:1 COG:FN0725 KEGG:ns NR:ns ## COG: FN0725 COG1179 # Protein_GI_number: 19704060 # Func_class: H Coenzyme transport and metabolism # Function: Dinucleotide-utilizing enzymes involved in molybdopterin and thiamine biosynthesis family 1 # Organism: Fusobacterium nucleatum # 6 237 2 231 234 182 40.0 4e-46 MNTDNWLERTELLLGKEKLDLLRKAHVLVVGLGGVGAYAAEMIVRAGVGRMTIADADAVA PSNINRQLVALHSTVGRQKAEILAERLRDINPEIELTVVSRYIKDEETDLLLDAAKYDYA VDAIDTLSPKLALIKGALDRSLPLVSSMGAGAKTDPTKMEIADISKTHHCPLAHMLRKRL HKIGVRSGFRAVYSPEPMREGALILCEEQNKKSNVGTISYIPALFGIGCASVVIRGLIGE MN >gi|313157460|gb|AENZ01000065.1| GENE 14 15252 - 16193 1317 313 aa, chain + ## HITS:1 COG:HP0096 KEGG:ns NR:ns ## COG: HP0096 COG1052 # Protein_GI_number: 15644726 # Func_class: C Energy production and conversion; H Coenzyme transport and metabolism; R General function prediction only # Function: Lactate dehydrogenase and related dehydrogenases # 
Organism: Helicobacter pylori 26695 # 6 309 8 309 314 259 45.0 4e-69 MKPNIVFLDEYTLGGADLGRLQALGEYKGYARTTPEELPDRCREADVIITNKVVLRHETL QSLPRLRLICVAATGTNNIDLEAAAELGIEVKNAAGYSTHSVAETTLGAAIALRRNIVYY DRYVKSGAYSAAGQQFHFALPTHQLYGSKWGIVGLGAIGREVARLAAAFGCEVCYTSTSG VVREEPYPALPLTELLGRSDIVSIHAPLNDRTRGLIGAPELSVMKRSALLINVARGGIVD EAALAEALDRGSIAGAALDVFSREPFAADSPLLGIREPDRLLLSPHNAWSPREAVDVLVG CVEENIKTFYHIG >gi|313157460|gb|AENZ01000065.1| GENE 15 16193 - 17236 1240 347 aa, chain + ## HITS:1 COG:BS_yveR KEGG:ns NR:ns ## COG: BS_yveR COG0463 # Protein_GI_number: 16080483 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferases involved in cell wall biogenesis # Organism: Bacillus subtilis # 5 246 4 242 344 106 28.0 8e-23 MTGRPLLTLIVPAYNAANYLPNCLDDLLAQTFRDFEVLVMDDGSTDRTADIGEQYARRDP RIRVMRLPHGGVSAARNRGLEMAESRYVAFMDCDDRVDPQYLAAFFRTPLPDGDLLVSQG VAIEYIAQNRTELYDYPDTTGTQGELGKIIVENRLLRDGWTYCKLFSLPLIRREALRFDT GLSICEDLVFVLGYLSHVRQIILRSGTYYHYQVCPDGTSLSQRRQPAAETFRAGVSILAK QQALTERFGITDPEYLAEAYSEHGLTNLARSISTTDAADNRALQREIIRNRELFRQYFDA DSMHHPIVGKKILWMYLTLPESLHLLPHWIVRFRHFIRRIKGRPIYY >gi|313157460|gb|AENZ01000065.1| GENE 16 17255 - 17755 783 166 aa, chain + ## HITS:1 COG:CC0111 KEGG:ns NR:ns ## COG: CC0111 COG3760 # Protein_GI_number: 16124366 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Caulobacter vibrioides # 7 157 4 153 168 96 35.0 2e-20 METTDRRQKVFDWLDSHGIAYTWYEHPEAPTIEIARRYWRDDGSKHCKNLFFRNHKGNRH YLVAFDCEQNLAIHDLERRLRQGKLSFASEQRMERWLGLRPGSVSPFGLINDPERHVHLF LDRNLERFPAYSFHPNDNRATVVVSRSEFLRYLAAVGNTYEFIELY >gi|313157460|gb|AENZ01000065.1| GENE 17 17990 - 18442 674 150 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157499|gb|EFR56918.1| ## NR: gi|313157499|gb|EFR56918.1| hypothetical protein HMPREF9720_1081 [Alistipes sp. 
HGB5] # 1 150 36 185 185 299 100.0 6e-80 MKKFTLTLLLIFGTVSALFAQQSVREDIFAFLKREGYVPTFDEDRDILFKVQGINYYAII KEVADMDYAYVEVSANFTADVPYDKLLAISNDQNRDKFVCKCSAERDGEDNCFKVAMEFI TNNRTNTEYQMAHALRLLPGWIEKFKEELD >gi|313157460|gb|AENZ01000065.1| GENE 18 18554 - 20080 2294 508 aa, chain - ## HITS:1 COG:MA4007 KEGG:ns NR:ns ## COG: MA4007 COG0696 # Protein_GI_number: 20092802 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoglyceromutase # Organism: Methanosarcina acetivorans str.C2A # 5 505 14 518 521 481 49.0 1e-135 MNNKVLLMILDGWGNGHHDKADVISTVHPEYISAMTEKYPHAQLRTDGENVGLPEGQMGN SEVGHLNIGAGRVVYQDLVKINRACRDNSIMENPEVKAAFEYAKKNGANMHFMGLVSDGG VHSSLEHLFKLCDISAAYGLDNTYVHCFMDGRDTDPRSGKGFIADLEKHLAATTGRIATV IGRYYAMDRDKRWERVKIAYDALVNGIGERSSDMVEAVQKSYDEGVTDEFIKPFVRIDEN GQPVGMIRPNDVVIFFNYRNDRAKELTVVLTQEDMPAEGMHTMPLYYCCMTPYDAKFTGL HILFDKENVPNTIGEYVSKLGLRQLRIAETEKYAHVTFFLNGGREAEFEGEERILVASPK VATYDLQPEMSAPEVADKLAAALGEQKFDFICLNFANGDMVGHTGVYEAIVKAVKAVDGC VAKVVEAAKANGYEVVMIADHGNADNAVNADGTPNTAHSLNPVPIVVVSDRVKSVHDGIL ADVAPTVLRLMGLEQPAEMTGKALVELK >gi|313157460|gb|AENZ01000065.1| GENE 19 20338 - 22959 4139 873 aa, chain - ## HITS:1 COG:aq_1293 KEGG:ns NR:ns ## COG: aq_1293 COG0013 # Protein_GI_number: 15606507 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Alanyl-tRNA synthetase # Organism: Aquifex aeolicus # 1 868 3 865 867 671 43.0 0 MESNKIRRAFLDFFESKGHVIVPSAPMVVKGDPTLMFTNAGMNQFKDIFLGNAPRKYPRA ADTQKCLRVSGKHNDLEEVGHDTYHHTMFEMLGNWSYGDYFKKEAIEWAWELLGGVYKLP ADRMYATVFEGSEEDGVPFDQEAYDYWKQFLPEDHIIRGNKHDNFWEMGETGPCGPCSEI HFDLRDEAEIAAKPGREMVNAGHPQVIEIWNLVFMQFNRKANGSLEPLPARNVDTGMGFE RLCMILQGKKSNYDTDVFQPTIQRISQLSGKAYGADAKCDVAMRVIADHLRAIAFSIADG QLPSNVKAGYVIRRILRRAVRYGYTYLGFTEPTICKLVPGLVEQMGEQFPELKAQQSLIE KVIEEEEASFLRTLATGINLLDGVIEKTKGEGRELISGKDAFELYDTFGFPIDLTELIAR EQGVGVDLAAFETELQAQKERSRNAAAVDTDDWVELFPIRESLFTGYDTLTEQVRIARYR RVTSKGKTSYQLVFDRTPFYGNSGGQIGDIGYIENADERIAVVATEKENGLIIHIVKELP ENPAAGFTAVVDAAKRQSAANNHTATHLMHEALRKVLGQHVEQKGSMVTPEVLRFDFSHF 
QKMTPQELREVEMLVNRAVRANYPLEENREATKEEAEKCGAMMLFGEKYGDKVRMVRFGS SVELCGGTHTSATGNIGFFKILNESAISAGVRRIEAVTGERAEQIIYAAEDTMRDISDYL HNPQVLQAVKKMFESNEALSKEVETMRREQVAQWADKIIASTPERRGVQLIATQTDRTPE FVKDLAYCLRARAPKLVLVQGSVNDGKPMLTVMLGEEITAQGVNAGAVVREAAKLMQGGG GGQAFFATAGGKNPDGLQAAIDKAVELIMAQLH >gi|313157460|gb|AENZ01000065.1| GENE 20 23059 - 24057 1680 332 aa, chain + ## HITS:1 COG:VCA0079 KEGG:ns NR:ns ## COG: VCA0079 COG0739 # Protein_GI_number: 15600850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Vibrio cholerae # 194 304 282 389 430 130 56.0 5e-30 MGQKGQRSDTHTLTHEAQTLTRDVVTAPFRLRTYRLIRKILIGFILVSIANVLFSYFFYT PKMYRIIRENRDLVIKYRILQDRIRTAQRHVDEIRHRDNYVYRSLFSTDTMSIEGVWQPY PDTKYAALLDDDYAPLMVGTWRQLDALARTLYLESVSFDELQAFSQDKEKMSAAVPAIWP IDRSALHNNHIGAFSPRRYHPVLHRVQAHTGVDFGCDRGTPVYATGDGEVELAVGSGYNG GYGYQVLVNHGFGYKTRYAHLSKVLVKPGERVTRGQVIAETGNTGRSTGPHLHYEVIHKG TPVNPVNYFNRDMTAAEYDDLMARMRETNFEM >gi|313157460|gb|AENZ01000065.1| GENE 21 24061 - 25011 1606 316 aa, chain + ## HITS:1 COG:VC0843 KEGG:ns NR:ns ## COG: VC0843 COG0739 # Protein_GI_number: 15640860 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Vibrio cholerae # 160 288 134 259 302 106 37.0 5e-23 MAGKKDLIRLRKRKRRKQNIIRATIHFFVWAGVAVLYYIGFSVFFDTPVEYELKHSTDRL RREYTALVQRYDTLTTVMRNLSERDRNVFRILFESDPYDFDSEYERKQAATYENIFNRSS RRLKRELRERVAEMETRLDELNDTYLDLQARIDSAGSRCDNIPSIQPVINKQLTLLTASY GMRIHPFYKTLQSHQGVDYTIPEGSRVFATADGTVREVAQRNSTSGQTVVIDHGNGYETS YNHLSKIDVRKGQQVRRGDIIALSGDTGLSLAPHLHYEVRYNGMRVDPIHYFFMELSPTE YQRLMRIAQSGMQSFD >gi|313157460|gb|AENZ01000065.1| GENE 22 25074 - 25721 736 215 aa, chain + ## HITS:1 COG:no KEGG:Astex_3484 NR:ns ## KEGG: Astex_3484 # Name: not_defined # Def: had-superfamily hydrolase, subfamily IA, variant 3 # Organism: A.excentricus # Pathway: not_defined # 3 214 7 193 206 72 27.0 2e-11 MTHPGSEIKLILLDFDGTLADTRRANTLAYVATLREAGYTLTEKEYAARYFGMRCDEFLT 
RIGIADPAERERLRLRKIALYPTFFDTVRLNRPLWDFCRQFRAQGGRVWIVSTGSRANID NAMRHLGIAGPQAVRRLETPDEDTSRKEERLPEENAVDGILSGADVARSKPAPDCFLEAM RREGCTPRETLIFEDSEIGLEAARRSGASYFRVKL >gi|313157460|gb|AENZ01000065.1| GENE 23 25728 - 28610 3857 960 aa, chain - ## HITS:1 COG:no KEGG:Palpr_0523 NR:ns ## KEGG: Palpr_0523 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 1 959 1 972 972 588 35.0 1e-166 MKGFLEEVAGDLYARYGEGLSERAVLFPSRRARLFFVDALTRIAGRPMWQPEWVTVDDLM SEISGLHAGDRVRLITELYKVYSEFHTEPFDKFYFWGDMLLTDFDTIDKYRIDAQMLFRN ISEIKEIEADISYLTPAQLQILSFWSSLGEEADLSEEKRRFLAIWKTLGPVYRKFRRRLV SLGIAYNGMVQRAAADRIAEGAFAFPEPRRYVVAGFNALSECEKQLFKFLAVAAETDFYW DYDAYYKDNPEQEAGMFVRSNVALFPPRTEFAHDNMRGEKEVVSVAAVSNAVQCKYAAAI LADLARRRAQEDPEVAAGLKPALGKETAVVLTDENLLLPLLYALPADIGRVNVTMGFPLR QSLAYTFVERLVELQNHRRKKGGGWTFYHADVAGILAHPYVAECDAALTRTMHEEIVRDR WISVDAAWLGRNELLKRIFSPAAEWRELSDYLLGVIAAVARQPYEGDDARQRVEFLAVIA EQVTKLRNSLDECDIELATEVYTSLLRRHLQTLRIPFEGEPLEGIQIMGILETRNVDFEN VILLSMNDDNFPGNHVAQSSFIPYNLRAAYELPTPEHHEGVYAYYFYRLIQRAKSVHMLY CSHADDKSTGEPSRYIYQLDYESGFDVRKIEVGVDVNLAETAPIEVAKDGEVMRRLERFV DAESPAALSPTAFFRYVACPLRFYFHSVARLEADDEISEEVDAPMFGTILHAAVQTLYAR IVGEEHPGQTLRAMIRSGEVAAAVERAINENYLQDERASAEDYTGNLLLVKDIVTRYLRG GVMPYDAAHDAFAVSGLEERVAYSFPFRAGERELEMKFGGIADRIDMLGDGALRVVDYKT GAPHLEFDGVESLFTGTGKQRLSNILQTLLYSMILHHTRGCDAEPALYYVRSMNRPDYSP QLDDKQLGVRGARYTLYRERFEELLRAQLAEMYDPAVPFRQCEDADTCKFCDFRIICKRG >gi|313157460|gb|AENZ01000065.1| GENE 24 28829 - 29536 1159 235 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157478|gb|EFR56897.1| ## NR: gi|313157478|gb|EFR56897.1| hypothetical protein HMPREF9720_1089 [Alistipes sp. 
HGB5] # 1 235 1 235 235 410 100.0 1e-113 MKKMILSAVMLVFAASGYAQETGAAGAGERNEWLPTFEAVAVDADLDVKFVRVPDTEAPK IVYDTKGSYTTKFRAEVKDKTLRISEKPDSRRPERTEVTVYYNALRAVSLSGAAATFVGT LDAAVLDVTVNRKASLTAKLDVKDLKMEQTGYSTANLSGSVRYLTLYVSTGKVAASDLEV MSAEVNAQSKAEVSLWITDRFVGKTTTNARISYKGDPKIVRGGAKFMGGEINRVE >gi|313157460|gb|AENZ01000065.1| GENE 25 29643 - 32786 4685 1047 aa, chain + ## HITS:1 COG:RSc1190 KEGG:ns NR:ns ## COG: RSc1190 COG1074 # Protein_GI_number: 17545909 # Func_class: L Replication, recombination and repair # Function: ATP-dependent exoDNAse (exonuclease V) beta subunit (contains helicase and exonuclease domains) # Organism: Ralstonia solanacearum # 1 463 26 503 1177 82 24.0 6e-15 MRAKILNASAGSGKTYQLAYKYVRDVVEQPSIYRHILAVTFTNKATEEMKSRILKEIHLL ASGGESSYLENLCRELDMDAATVRRRAAEVRSKILHDYSRFTVLTIDTFFQRILRAFIKE LGIDLNYNVEIETASVLTKSADTLIEQITTDRDLQRWLTDFVQERIDEGKKWDVRDGILT LGGELFKEKNKEALSLARSREELGRIVGEATARAAATKQQMRERAAEAVRIMADAGVGPT DFTGKSRSFAHYFLTVAAGELKPYTATVGKMSLTTEGWAPKGSPAAPLAARLQPLLREMC DLYDANVRSWNTCDLLRENYRSFALLSDLYAKVQQLCDEQNMMLLSETKYILSEFIGHND APFIYEKVGNRFEHFMIDEFQDTSVKEWENFLPLLQNAMSQSEATSVLIVGDIKQSIYRW RGGDWKILHSQAQAQLDPASTEVEILRENYRSLPAVVEFNNEIIGRVVETDNRALNETLD EAAAKGGINPAAAAALHDTLRDAYRGHAQTPRRRSDRPGYVSVSTFAEQPPVVERICEVL DKGFRPRDIMILVRGATDGAKVAAELLDFKRRNTDPRYRFDVMTQEALIVGNAPVSSFIA ASLRLSLNPDDSLSRAVYNHYLGRGFDRPLSDDERTFFRSIRLLSPEEAFERIVMRHGLQ DDKQQTAYLQAIHEQIIGFCSSKIADIALFLDWWEQQGQNRSLSVEESETTVEITTIHKA KGLEKRVVLIPYCSWQLDPKSGGNVTNIVWAEAHGDAEAVGRFPVKYKKSMAESGFSAEY YRELVYSHVDNVNLLYVALTRAAESLHVFIPQKGGKTVGGLLLQSILTDGDKALAGSAEG RYTVTETGERFEFGQFRGPVADGGKTSGAEHVVLENYPTARADLRLRLPSQRYFEQEEEV ELSPRNFGILMHKAFENADDEEQIRLAVERMQADGTLSAAEAAALRRMIARALAHPAARE WFAGGWERVRNENEIIIPGGSSARRPDRVMIRGTRAVVVDYKFGGREPERYRRQVREYLA LLRQMGYTETEGYLWYVKLGRIEKVEE >gi|313157460|gb|AENZ01000065.1| GENE 26 32788 - 34731 2493 647 aa, chain + ## HITS:1 COG:jhp1129 KEGG:ns NR:ns ## COG: jhp1129 COG1132 # Protein_GI_number: 15612194 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase 
and permease components # Organism: Helicobacter pylori J99 # 426 647 359 572 578 186 42.0 2e-46 MNEIIGIIPHKFRRRGFGVACTLLLRAGLNFLGLAVLLPVLALALDARSLDGGGQLARIY TALGFTSPRSFALTVCAAVVAVIVLKCLTNLWLARIERNYIYDLYRTLSRRLYVTYHDRG LPFVKSSNSAVLARNVNVVCLAFTAGVLKPAAAIAAEALLLALLFGALLWYAPVAAVLTV AVFLPSIWIYYGLVRNRINRYGELENKAQREKARLVAETFRGYADIEINNAFPMMLRSFD RAMDQVIRTRLRETAIGMLPQAFTEIGLALGMALLVALSLGAEEGRAQLLFGVFAVAALR LMPSVRSIMAGWTSIKYNRYTIAVLRDATAGDEPSSASPDTLSGAAERAGKGPHTDAAAI PPSEYSAAESGETARGRRTAPDKRTASGRQTAPYGQTAPAKQPCTAKSTKPVSAGGASGA CAPEILPFEHEIAVRDLGFRFADDGHELFRGLTLSIRKGERIGIRGASGAGKTTLFNLLL GLYEPTGGEITIDGTPLTSANRRAWQNRIGYVSQNLFIADGSFAANVALGVPDDEIDRGR VAEALEAARLGEFVAGLSKGMDTHVGECGCRLSGGQRQRIGIARALYRRADVLFFDEATS ALDSRTEEEINRSVAELAARDKGLTLVVIAHRESSLEYCTRIITIGE >gi|313157460|gb|AENZ01000065.1| GENE 27 34734 - 35723 879 329 aa, chain + ## HITS:1 COG:no KEGG:BVU_4153 NR:ns ## KEGG: BVU_4153 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 58 321 75 300 308 205 41.0 2e-51 MEYIIAGIRIAVPQEFTAGSFGMALAPFAAADKGPAGLSVETRSEIRQADGYRELDEFDF ADADADCRFGRDAEGYLLTMTPRDGSAAARFRKAFGAPLVTTDVTLRHNPALFRFGLWSM FNIAAVARQAVAIHSSVISLNDGAVLFLGESGTGKSTHTRLWREHIPGAELLNDDSPIVR IVQTADADPICGAVQNPANAPAAAAETPDASAATTGAPKPQAMVFGAPWSGKTPCYRNVC QPIRAIVRLSQAPHNRIRRLRAIEAIGALLPSCPPSFAHDETLEDAVCATVSAVVAQVPV YHLECRPDAAAAELACRTIFGGESANPRP >gi|313157460|gb|AENZ01000065.1| GENE 28 37158 - 37988 358 276 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|42631241|ref|ZP_00156779.1| COG0264: Translation elongation factor Ts [Haemophilus influenzae R2866] # 2 276 3 262 283 142 34 5e-33 MEIKAADVMKLRKMTGAGMMDCKKALIEAEGDFARAQDIIREKGKLVVAKRADRTATEGV VVTKIVGQKAYILCLACETDFVAQNAEYTASAEAMLEVAVNSDAADRDALMAAKNAEGHT VEEMVTEKSGQTGEKIELAYYARIEAPYCAAYVHFNKKLGTILGFNKEVPAEVAHTVAMQ ATAMAPVSISEADCPAEVVEHERKIAVEAMKQDPKNANKPEAILEKIAEGKMRKFFEENT LLAQAVVGEKESIADFIHKADKEATVIAYKRFALGE >gi|313157460|gb|AENZ01000065.1| GENE 29 38040 - 38981 999 313 aa, chain - ## PROTEIN SUPPORTED ## NR: 
gi|237708157|ref|ZP_04538638.1| 30S ribosomal protein S2 [Bacteroides sp. 9_1_42FAA] # 1 269 1 270 279 389 73 1e-107 MSRTDFNTLLEAGAHFGHLKRKWNPKMAPYIFMEKNGIHIIDLHKTVLKIDEAAAAIKQI AKSGRRVLFVATKKQAKEIVAEKVAAVGMPYVTERWAGGMLTNFPTIRKAVKKMSTIDKM NSDGTFDNFSKREKLQIARQRAKLEKNLGSIADLTRLPAALFVVDVQKESNAVKEAKRLS IPVFAMVDTCCDPTDIDYVIPANDDATKSIAVVLDAMCSAIAEGTEERKLEKEKEAQEAE AGADAPVKKEGKPRIKKAVKAAIDAEEAAVADVVAAVETPFEEPAEVAAEEAAVVAEEAA AVAAEEVAEAKAE >gi|313157460|gb|AENZ01000065.1| GENE 30 39165 - 39551 441 128 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|126646806|ref|ZP_01719316.1| 30S ribosomal protein S9 [Algoriphagus sp. PR1] # 1 128 1 128 128 174 64 1e-42 MEVVNTVGRRKAAVARVYVKPGKGQITINRKALEVYFPLEILQYQVKQPLLATNTVENYD IAINLDGGGITGQASAARLGIARALCEIDAEMRPVLKKAGFLTRDPREVERKKPGQPGAR RKFQFSKR >gi|313157460|gb|AENZ01000065.1| GENE 31 39557 - 40012 567 151 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|229200005|ref|ZP_04326568.1| LSU ribosomal protein L13P [Pedobacter heparinus DSM 2366] # 1 147 1 147 147 223 69 3e-57 MDSLSYKTISANASTVTKEWIVIDATNEVLGRLASQIAKILRGKNKPSYTPHVDCGDYVI VINAEKVKLTGNKMTEKVYTRHTGYPGGQRFATPADYLAKKPTFIVEKAVKGMLPRTRLG AALLKNLKVYAGSEHPHAAQNPKTIKLNEIK >gi|313157460|gb|AENZ01000065.1| GENE 32 40390 - 40824 265 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNRVKWILLGVCLFGMLIGAAVAYSGYVKKRLDADGAETRAVLTASFSRTECVGYRPGYR GRMHRNFREKRTVYYFRYRFRVGGTDYTGKVRKTENLMTARIGDSVVVRYLPDDPDVHRL VRSEGGKYRVIRPRPRSRARRSAS >gi|313157460|gb|AENZ01000065.1| GENE 33 41152 - 42327 1689 391 aa, chain - ## HITS:1 COG:VC0551 KEGG:ns NR:ns ## COG: VC0551 COG1883 # Protein_GI_number: 15640573 # Func_class: C Energy production and conversion # Function: Na+-transporting methylmalonyl-CoA/oxaloacetate decarboxylase, beta subunit # Organism: Vibrio cholerae # 19 391 9 376 376 346 56.0 4e-95 MLDFLSSSFLGESVSKFVHETGFAQLFLQEGGYKYLIMIAVGCFFLYLAIKHKFEPMLLV PIGFGIIIGNIPMVTGHGIGIYEEGSVLNYLYFGVRYGVYPPLIFLGVGAMTDFSSLISN PKLMLIGAAAQIGIFAAYMGALALGFPANEAGSIAIIGGADGPTAIFLTSRLAPEYMGAI 
AICAYSYMALVPVIQPPVMRLLTSSRERLIKMRPPRAVSSTEKILFPIIGFLLTTFVVPD AIPLLGMLFLGNLLKESGVVKRLATTASNQLIDIMTILIGITVGASTQASVFLTPKTLGI FVLGACSFFIATMGGVLFVKLMNLFLKEGNKINPLIGNAGVSAVPAAARVSQDMGIRYDR SNYLLMHAMGPNVAGVVGSAVAAGILLSFLG >gi|313157460|gb|AENZ01000065.1| GENE 34 42332 - 42787 719 151 aa, chain - ## HITS:1 COG:PAB1771 KEGG:ns NR:ns ## COG: PAB1771 COG0511 # Protein_GI_number: 14521093 # Func_class: I Lipid transport and metabolism # Function: Biotin carboxyl carrier protein # Organism: Pyrococcus abyssi # 8 151 7 145 145 69 36.0 2e-12 MKEYNFKINGNDYTVGIEILNDTEADVIVNGASYSVEVLDTTVKPSVAAKPQVAAVPASH PVQPSTVPVTKPAAAAPAAAGESPVKSPLPGVILDIRIKEGDTVAVGQTLVVLEAMKMEN NIDSDRAGVVKSIKVNRGDSVLEGDVLITLG >gi|313157460|gb|AENZ01000065.1| GENE 35 42937 - 43386 754 149 aa, chain - ## HITS:1 COG:no KEGG:BT_1687 NR:ns ## KEGG: BT_1687 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 27 146 194 305 306 74 38.0 1e-12 MKRAILTAGLLFAALAASAKGADTILDKDPVGIYVTIVSVVTVLSALIVLFLLIQLFAKV MVKSAQKKAAKSKTRGMLVDDMKVSVPAGCDSVVNGEIIAAIALAVKLNKAEMHDRESDV ITINKVARVYSPWSSKIHGLTPMPERIRK >gi|313157460|gb|AENZ01000065.1| GENE 36 43388 - 44950 2606 520 aa, chain - ## HITS:1 COG:BMEI0801 KEGG:ns NR:ns ## COG: BMEI0801 COG4799 # Protein_GI_number: 17987084 # Func_class: I Lipid transport and metabolism # Function: Acetyl-CoA carboxylase, carboxyltransferase component (subunits alpha and beta) # Organism: Brucella melitensis # 4 520 1 510 510 672 64.0 0 MSEIQNKINQLIENRATARLGGGQKRIDAQHAKGKFTARERIQMLLDEGSFEEFDMFVTH RCYDFGMDKSHTFGDGVVTGYGTIDGRLVYVFAQDFTVTAGSLSLSMSDKICKVMDMALR NGAPCIGINDSGGARIQEGVNALAGYANIFQRNVMSSGVIPQISAIFGPCAGGAVYSPAL TDFIIMKKETSNMFLTGPKVVKTVTGEDVTQEQLGGATMHTTKSGVAQFAVDTEEEGIAL IRKLISYMPQNNMEDAPLAVCTDKITRLEDSLNEIVPDSANKPYDMSEVIKAIVDNGEYL ESAPGYAKNIITCFARFNGQSVGIVANQPKFMAGVLDINASRKAARFIRFCDAFNIPIVT LVDVPGFLPGTTQEYGGVITHGAKLLFAYCEATVPKVTVTLRKAYGGAYIVMSSKHIRGD INYAWPTAEIAVMGASGAVEVLYGKEVAAIESPEEKAKFVAEKEKEYNDKFSNPYNAARY 
GYIDDVIEPRNTRFRIIRALQSLSTKRLQNPAKKHSNIPL >gi|313157460|gb|AENZ01000065.1| GENE 37 45119 - 45529 739 136 aa, chain - ## HITS:1 COG:PH0272 KEGG:ns NR:ns ## COG: PH0272 COG0346 # Protein_GI_number: 14590197 # Func_class: E Amino acid transport and metabolism # Function: Lactoylglutathione lyase and related lyases # Organism: Pyrococcus horikoshii # 6 134 8 135 136 137 57.0 5e-33 MKVSHIEHLGIAVKSLDEAIPYWENVLGLKCYAVEEVADQKVRTAFFMLGQTKIELLEPT SEESTIAKYIENRGVGIHHMALACENIEEQLADAEAKGIRLIDKTPRKGAEGMTIAFLHP KSTQGILTELCENKNK >gi|313157460|gb|AENZ01000065.1| GENE 38 45684 - 47447 1297 587 aa, chain - ## HITS:1 COG:sll1231 KEGG:ns NR:ns ## COG: sll1231 COG0438 # Protein_GI_number: 16330676 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Synechocystis # 481 587 293 397 399 63 33.0 1e-09 MKIGFDDTPAAPAGTYSQHLARLLAEYAPEHEYIIDGKRCKEFDLYHGFRPGLPFPVLLR RIPCVMTVHNLNFLRYPHLYSLGERLVLLRLYRRMLRSASRVITVNRDAREELSDRLRID PGRIEVVMPLAARMPQNPPDGAELEGVRRKYALPRDFVLMLGTVEPRHNQEVLFEALALL REREREMLETRGAEMPAAAERTEVRAAERTGAQAGMQSGTEAGTETGAETGTEAGAETGM RTGARTGEQPAGGAAERTGERAAGRNVESVAEAGAFAETVARSVREPDGGGDLNPENAGW RDDTASDGVGGYGDVEAGAGLAMRGVGEDTPDANRNTVAEFAPAGRERVRPADGTARSGG AGTDGTDETAGVVGVGEASGAADRGADMSVEGVCRSARTSVGDAGRDVGVSVADAGRSDA AVAAGRRGAAGTDGRPPLPQRVGVVVCGRRTAYADFLLGYARERHMAARVDFIYELSPED LPALFRLARTFVYLPDAEIEASIVPVVEALRAGLPIVLSDTRLNREAAGDAAVYVDPEAV GEVTAALENVLWDETFRSEMRRRERRRAELFSEYAVARRLIDIYTSL >gi|313157460|gb|AENZ01000065.1| GENE 39 47517 - 47987 553 156 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157505|gb|EFR56924.1| ## NR: gi|313157505|gb|EFR56924.1| hypothetical protein HMPREF9720_1104 [Alistipes sp. 
HGB5] # 1 156 16 171 171 266 100.0 3e-70 MKKVLLILLFAALLPFAAADAQNIALGERVPELKVPAWLDGQKPAATPRLTYVEFFQSSN AACITSLKQLRAMTDKLGTKLRVVVITQEKEDKIGPLLRPYLSPQISVAIDAGGKIFTAF GVQYVPFGVLVDSKNRALWQGNSLHLTPEIIDKSSK >gi|313157460|gb|AENZ01000065.1| GENE 40 47992 - 49107 1652 371 aa, chain + ## HITS:1 COG:BH2954 KEGG:ns NR:ns ## COG: BH2954 COG1703 # Protein_GI_number: 15615516 # Func_class: E Amino acid transport and metabolism # Function: Putative periplasmic protein kinase ArgK and related GTPases of G3E family # Organism: Bacillus halodurans # 42 371 5 334 340 320 47.0 2e-87 MSFTKINHYEKEYADMHHDTALNVTEGVEDQPIVNPYFTRRKKPKLSTDQYVEGILAGNI TTLSQAITLIESNNPAHYAQAQEIIERCLPHAGKSVRIGITGVPGAGKSTFIEAVGNMVT SLRHKLAVLAIDPSSERSGGSILGDKTRMESISGNPDVFIRPSPSAGSLGGVARKTRETI VLCEAAGFDVIFIETVGVGQSETAVHSMVDMFMLLQISGAGDELQGIKRGIMEMADIMVI TKADGENIHKAELAKTQFQGALRLFPLPESGWRPKVYTCSAVAGTGLEEVWKGVEEFLDH IEANGYFRHNRNRQNKYWMYESINEVLRNSFYHDPAVESRIAEYEQRVLDDKISSFIAAK ELLDIYFKDLK >gi|313157460|gb|AENZ01000065.1| GENE 41 49226 - 49924 969 232 aa, chain - ## HITS:1 COG:jhp1180 KEGG:ns NR:ns ## COG: jhp1180 COG0846 # Protein_GI_number: 15612245 # Func_class: K Transcription # Function: NAD-dependent protein deacetylases, SIR2 family # Organism: Helicobacter pylori J99 # 1 228 1 226 234 204 43.0 1e-52 MKRIVVFTGAGVSADSGLATFRDSDGLWADYRIEDVCTPEALAHNRATVIEFYNKRRREM LAAEPNAGHRAIAELERDFEVEVLTQNVDNLHERAGSSRVTHLHGELIRLRSSRDSGLIV PIEGWEQRLDATAPDGSLLRPHIVFFGESVPMFERAAEIAQTADIMVVVGTSLAVYPAAS LVRYVRPDIPIYVVDPGNPDTAGIRNPLTLIRKRAAEGMPELAALLRERYAG >gi|313157460|gb|AENZ01000065.1| GENE 42 49928 - 50092 116 54 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSKAALGIAFCVLAAMLAYDIVTEGVAVGVTKAVGFVLLILVAALGSRLGYLKF >gi|313157460|gb|AENZ01000065.1| GENE 43 50524 - 51300 451 258 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157508|gb|EFR56927.1| ## NR: gi|313157508|gb|EFR56927.1| hypothetical protein HMPREF9720_1108 [Alistipes sp. 
HGB5] # 132 258 15 141 141 202 100.0 2e-50 MCNLLSKNRIGLCGKLRGSQSDGGMSALCPVRSVSGVRPGRGVSGLRPGQGISALRPVWS VSGVRPGRGVSALRPGQGISALRPVRSVSGLRPVLGVSGLRPGRNMSGRRYGRIVSGLQV VGGLLVFFLLLRPDWLPRWVYWGVLALGAALTLGEMALDYRRRRPRRRWRDAGLVLLSLS VVWIVILPPFFFWGVSGEKELTRATLGLGVAALVACVVCLLRWRRLRAEAEMRLWKLKAE RRRRDMLCARAAAEGGGR >gi|313157460|gb|AENZ01000065.1| GENE 44 51318 - 51587 368 89 aa, chain - ## HITS:1 COG:no KEGG:Bache_1363 NR:ns ## KEGG: Bache_1363 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 3 55 4 56 58 81 73.0 8e-15 MAKENNNKSFVLRIDAATMEALEGWAADEFRSINGQLQWIIADALRRSGRLKKPRPAGPE SGDSTGLPSGDAASAKPPVPPRSDYPDDI >gi|313157460|gb|AENZ01000065.1| GENE 45 51599 - 52090 737 163 aa, chain - ## HITS:1 COG:SP2132 KEGG:ns NR:ns ## COG: SP2132 COG0330 # Protein_GI_number: 15901946 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Membrane protease subunits, stomatin/prohibitin homologs # Organism: Streptococcus pneumoniae TIGR4 # 66 162 242 335 335 129 69.0 2e-30 MENGNFVYKGWKCNGFLALTLIFVGLALGVCAIVFGAQRIDAGLAFGVGPVVIGSVWCLC MLICMAGFMLLEPNDARVNYLAYAPEIAAVMLRRQQADAIISAREKIVEGAVSMVHMALE RLGNEHIVELDEERKAAMVTNLLVVLCADEAAQPVVNAGTLHH >gi|313157460|gb|AENZ01000065.1| GENE 46 52505 - 53920 2522 471 aa, chain + ## HITS:1 COG:sll1641 KEGG:ns NR:ns ## COG: sll1641 COG0076 # Protein_GI_number: 16329656 # Func_class: E Amino acid transport and metabolism # Function: Glutamate decarboxylase and related PLP-dependent proteins # Organism: Synechocystis # 9 471 15 467 467 453 47.0 1e-127 MNQQRHTNEDTLTDIFGSEEMRNPAPTEFIPKGKTSPEIAYQLVKDETYPQTQPRLNLAT FVTTYMDDYATRLMNEAINVNYIDETEYPRVAVMCGRCLNIVANLWNTPEKAEWKTGALG IGSSEACMLGGVAAWLRWRNRRKAAGKPFDKPNLVMSAGFQVVWEKFCQLWQIELRTVPL TLDHITLDPKQALEMCDENTICIVPIAGVTWTGLDDDIEGLDKALDEYNAKTGYEIPIHV DAASGGFILPFLKPEKKWDFRLKWVLSISTSGHKYGLVYPGLGWVVWKDKKYLPDEMSFS VNYLGANITQVGLNFSRPAAQILGQYYNFIRLGFEGYKEIQQNSMDVAKYCHQQIGTMKC FKNYSKEVVNPLFIWMMDPEYDKKAKWTLFDLQAKLQQSGWMVPAYTMPKNIENVVVMRI 
VVRQGMSRDMADMLMGDIRNAVAEFEQLEYPTTSRIKYDNMEHQKGKVFTH >gi|313157460|gb|AENZ01000065.1| GENE 47 54714 - 55475 1224 253 aa, chain + ## HITS:1 COG:no KEGG:BVU_1964 NR:ns ## KEGG: BVU_1964 # Name: not_defined # Def: 3-oxo-5-alpha-steroid 4-dehydrogenase # Organism: B.vulgatus # Pathway: not_defined # 3 253 4 254 254 267 56.0 4e-70 MAQTFDIFLIVMALLALVVFAALHFFEAGYGYLFNPKYGPPVPNKIGWVLMESPVFVAMC VLWLLSERTWEAGPLTLFALFQAHYLQRAFIFPLLMRGASKMPLGIVVMGMCFNTLNALM QGGWIFYVSPEGYYADWFAQPYIYIGGAMFLAGMAVNLHSDHIIRNLRRPGDTRHYIPRG GMFRYVSSANYFGELLEWTGFAVASWSWAGAVFAWWTFANLAPRAASLNKRYAKEFGDEF TSLGRKKIIPFIY >gi|313157460|gb|AENZ01000065.1| GENE 48 55483 - 56694 1817 403 aa, chain + ## HITS:1 COG:MT3467 KEGG:ns NR:ns ## COG: MT3467 COG1902 # Protein_GI_number: 15842955 # Func_class: C Energy production and conversion # Function: NADH:flavin oxidoreductases, Old Yellow Enzyme family # Organism: Mycobacterium tuberculosis CDC1551 # 4 378 11 385 396 276 42.0 6e-74 MSSLFTPYKLGPVTLRNRTIRSAAFESMGRHFGPTQQLKEYHVSVARGGVGMTTLAYAAV CRSGLSFDKQLWLRPEIVPGLRDITDAVHREGAAAGIQIGHCGNMTHLTTAGQIPIGAST GFNLYAYTPVRGMRAGEIEQVARDFGRAVHTAHDAGFDSVEVHAGHGYLISQFLSPYTNR RRDEYGGSLENRMRFMRMCLEEVMEAAAQTGTAVLVKHNMYDGFKGGIEIPESLEIAREI ERFGVDGIVLSGGFVSKAPMAVMRGLIPIYTMSYYSPLWLRYFIRWCGPWMIRQFPFEEC YFLEDAKKFRAELKCPLVYVGGLVSREGIDRALDAGFELVQMARALVNDPAFVNKLREGD AATRSECDHRNYCIARMYSVDMKCCKHCGDLPRKIREELAKLP >gi|313157460|gb|AENZ01000065.1| GENE 49 56724 - 57677 988 317 aa, chain + ## HITS:1 COG:AGl1280 KEGG:ns NR:ns ## COG: AGl1280 COG0300 # Protein_GI_number: 15890762 # Func_class: R General function prediction only # Function: Short-chain dehydrogenases of various substrate specificities # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 49 314 34 286 291 122 32.0 1e-27 MDKQSGHMPGEGAPGTRAVQETTGTAPGIRTGQETARHSRRMKRGEVRPGSAWALVTGAG SGIGRCYALRLAALGYRLVIVGNNAAPLEAVAGEIRRAAGANASASFPEVRVMPMDLARV GAAQELHDRTAAEGIEIDVVINNAGIFSFCDILTTPAERIERIILLHDLTVSQLCRLYAA DMVRRGVRGHILNMSSYSLWMPFPGLALYSASKAYMRSFSVAFAKEVRDRGIRVTAVCPA 
GVATDLYGLTPRWQRIGKRLGVLITPDNCARRGLRALWRGRRCIVPDWWNRAWIPLCKAL PMWVLRPIRRFTMRFQK >gi|313157460|gb|AENZ01000065.1| GENE 50 57820 - 58446 780 208 aa, chain + ## HITS:1 COG:L111950 KEGG:ns NR:ns ## COG: L111950 COG1011 # Protein_GI_number: 15672092 # Func_class: R General function prediction only # Function: Predicted hydrolase (HAD superfamily) # Organism: Lactococcus lactis # 4 195 3 194 207 89 32.0 6e-18 MNEIKNVIFDLGGVLVDLDIERCTAAFRALDMPRVADLINPYYPAEMIGRLERGDISFRE ACDEMRRLDGRPEVTDSEIAAAYSAFLVGIPVAKLRQIMRLREAGVRTYVLSNNNPAAMV AIRGAFTADGHTMDDYFDKIYLSYELHELKPSEAIFRKVIADSGVIPSETLFIDDGRKNV DAAQALGFAVYMPAPGEDFGHLFEEITR >gi|313157460|gb|AENZ01000065.1| GENE 51 58554 - 58730 88 58 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRTAKPTTGVPTRIPERDRKTHGRIQAEVPKPPGSIRQGAGNILTAFRQYSGGIPTTV >gi|313157460|gb|AENZ01000065.1| GENE 52 58975 - 61440 2706 821 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0360 NR:ns ## KEGG: Odosp_0360 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 769 27 778 791 423 32.0 1e-116 MAVLIVIAAIAVSPVAKNYIEKHDRELIGRSIRMERLRMNIFTGRLRIEGLRIGGSEDST TFFRLDSFEMRMRLWPLLGNRVLVKKISFAGPDVKIYQRGNAFSFDDITARFAGDTTAVA ATPEKTSKPWEIGIYDISIRNGRVFYKDLALDAEWGMNDLNLHIPGVYFSGEKTDVGAVL NFAEGGSLSTDVGYNIESSEFDIGIRLQDFALAGTLPYFRQSLDVTAVDGRLSADSRLRG NTQHLLSLTTEGTASLAGFALRDRQQRPVAGVDTLGVKLAEGDLGKMRFVFDRIYVSGLS ALFEMTPEGNNLAALMKSPAAETASPTTVSASAATNPAPEAAPAPAAEITTSAVPAPTAA NPAPAAEITTSDTAGSATPSPTLRIADLEITNGRVTVRDLTMHRPFEYTVSEIRMRSRDF APSKRNSMTVDARMQKTGSAKLRWEGTLDDMNNQNITLWLTNLDLRDFGPYCEHYTAYPL TKGNLTFRSQNVIRNRYLDGTNHLDMFEPKVDKKRREIKAEMNIPLKQGLYVLKDKKGHV KMNLPVKGSLDSPEFSYRKIVLKAIGNVLLKVVTAPFSFLSGNKENLEYINIDPLQYVFT SEQYASLDKIAQALQDKPEMHIVLTQRVNMRRALPRQAAGALRMAYAEHLKSADTTGRQP MSMLEYEKIQQTDIRTPAIMAFADSLLAQRGISPQGLSADDKALALYREKAAGQLARMMA ARNKALTDYMQSTHGATAPAFRVQTMDSLALPNYAGRDRYTIALEVDGETVKVEADDGNA GAGTDAGAGTDAGTEAETDARTNTESDTETNAGTHAGTDTE >gi|313157460|gb|AENZ01000065.1| GENE 53 62645 - 63718 1553 357 aa, chain - ## HITS:1 COG:RSc1728 KEGG:ns NR:ns ## 
COG: RSc1728 COG0535 # Protein_GI_number: 17546447 # Func_class: R General function prediction only # Function: Predicted Fe-S oxidoreductases # Organism: Ralstonia solanacearum # 20 357 12 361 491 130 30.0 3e-30 MARRRPGLRKRLALNIFSDLYASTVAEHRLTTLFWECTLRCNLSCRHCGSDCRVDPGVPD MPLEDFLKVLDEEITPHVDPSEVLIIFSGGEVLVRDDLERAGAEVTRRGYAWGMVTNGMA LTEARLRSLLDAGLRSVSVSLDGFEREHNHIRGNSRSYDRALAALRMIVREPSLTYDVVT CVTGAMVPQLEAFRDMLIAEGVAHWRLFSIFPMGRAKNDPSLRMTDAQFREMLEFIRRTR KEGRIGTSYACEGFLGDYETEVRDYFYQCAAGVSVASIRVDGAISGCTSIRANFHQGNIY RDKFWDVWQNRFEPFRNRDWARRGACADCKMFRYCLGGGMHLRGDDGELLYCHYHRL >gi|313157460|gb|AENZ01000065.1| GENE 54 63722 - 64054 285 110 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157509|gb|EFR56928.1| ## NR: gi|313157509|gb|EFR56928.1| putative lipoprotein [Alistipes sp. HGB5] # 1 110 1 110 110 138 100.0 1e-31 MNKKLKLALAALLGFSTACSTVKNAPAKGAEEQKQETGSQEVQDSVKVPPRIVVMYGVRP PGQAGPATGVHIPDAKVQSAEPEELDPVAEKPTEDTPVMDTKKGAKAPEK >gi|313157460|gb|AENZ01000065.1| GENE 55 64198 - 64398 417 66 aa, chain + ## HITS:1 COG:TM1874 KEGG:ns NR:ns ## COG: TM1874 COG1278 # Protein_GI_number: 15644617 # Func_class: K Transcription # Function: Cold shock proteins # Organism: Thermotoga maritima # 1 62 1 62 66 83 70.0 8e-17 MTGKVKWFDSKKGYGFITGENGKEIFVHFSGIVTDGFKSLNEGQAVEFEVGSGAKGDQAV NVTVIE >gi|313157460|gb|AENZ01000065.1| GENE 56 64524 - 64934 436 136 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157466|gb|EFR56885.1| ## NR: gi|313157466|gb|EFR56885.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 136 1 136 136 264 100.0 1e-69 MKKHLLPLCVTLFLIFAGEMTAGCDKLDDKYKSEKTVTLTIASVKPAAADDDYDWLQGLP VYIYKEESPDWRLWYSHDPIRGFDEIYEEGYEYVVKVRRLTLAGPPQDAGAYVYELKKVV SKTQKDSEGVPESYLR >gi|313157460|gb|AENZ01000065.1| GENE 57 65163 - 66398 1978 411 aa, chain + ## HITS:1 COG:TM0289 KEGG:ns NR:ns ## COG: TM0289 COG0205 # Protein_GI_number: 15643058 # Func_class: G Carbohydrate transport and metabolism # Function: 6-phosphofructokinase # Organism: Thermotoga maritima # 1 404 1 388 419 243 35.0 4e-64 MKEAIAILTGGGPAPGMNTVVGSVAKTFLRKGYRVIGLHEGYTGLFNPSPRTVDIDYPMA DGIFNQGGSFLQMSRFKPKDSDFENNFNLKFFTDNNIKLLVTVGGDDTASTANRIAKFLE AKKYPIANIHVPKTIDNDLPLPKGTPTFGYESAKDKGAVIARAVYVDARTSGNWFVLAAM GRSAGHLAFGIGEACHYPMIVIPEMFDKTEITVEKIVNLVISSIIKRKIMGMDYGAAVIS EGVFHALSDEEIRKSGIHFTYDEHGHPELGKVSKAHIFNEMIEKKVKELGIKVKSRPVEL GYEIRCQTPIAYDLTYCSELGIGVHKLFAEGKTGCMVYVDSEGNVSPLYLKDLQDPTTGK IPPRLVDIKSDKFSSVVETILNAITPADYEAAKQYVPNPEEYDFHKILNWK >gi|313157460|gb|AENZ01000065.1| GENE 58 66884 - 67762 1229 292 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157481|gb|EFR56900.1| ## NR: gi|313157481|gb|EFR56900.1| hypothetical protein HMPREF9720_1127 [Alistipes sp. HGB5] # 1 292 1 292 292 540 100.0 1e-152 METPENKNDVFEPMPEEAAAQVTAENSADGNRSQKMAEAKKNLLAGALWCVGGLAFSFLS YYFATAGGRYVVATGAIIWGAIQAIKGLVVILKIQHQEGRFTAFWRTAALAACSLAALLY LGQLSVRLAGGEEMQVVDTEQIYTGAQGIKAKIPAGYTLLEETVQPETDETYAYYGFDTY NDRIGYSLGKVVDMIPEEITSVDEISDHCAQRDSAFYTGGFIAPTQRTEIGGREMLCSEG YITENPELVYAVYDMLYEGSLVTATFRYGKKDYGKRETRLRIESLLKGIELE >gi|313157460|gb|AENZ01000065.1| GENE 59 68118 - 69164 421 348 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157518|gb|EFR56937.1| ## NR: gi|313157518|gb|EFR56937.1| hypothetical protein HMPREF9720_1129 [Alistipes sp. 
HGB5] # 95 348 1 254 254 331 99.0 6e-89 MEIDEFETPAQSAANGVVCAAGGTVVPRDQQCGVLYKVVVTHGITPCMIGRGECHAMVGG SQHTPEAFAVGSREAALLSPERNYGDQPQTGHRRVRRVRFRKKGRNGIHSRRKGRCRRDS SRSRNGGKGGSGSRGRRKDRCGRDSSRSRNGGKGGSGSRGGRKDRCGRDSSRSRGGGKGG SGSRGGRKGRCGLGIAAREDRLGGGEVETHGHAAEEGPEAAGVQLPHRRAQSRSGLAPVS AVAGKSDGGQAVFGHTFKGSVLARTLQDFLLNLKLYTNMKHSANCLASFLGGALVGAVVA MLVTPKSGPEFREDIRDMAKKGTRKLKNEIDKIHCDCNGLDCDCDKDE >gi|313157460|gb|AENZ01000065.1| GENE 60 69178 - 69507 443 109 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157521|gb|EFR56940.1| ## NR: gi|313157521|gb|EFR56940.1| hypothetical protein HMPREF9720_1130 [Alistipes sp. HGB5] # 1 109 1 109 109 177 100.0 3e-43 MKDSPSPTGNIVTSLIFFGVTAAIAIILFIDAFVMWLASLTGSVAVAALITGGFFAVIAA VVYLLAVRSALNYVRDRIETVYEVASRIKEGYDWLSDKFSFLSFLRRSR Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:22:37 2011 Seq name: gi|313157458|gb|AENZ01000066.1| Alistipes sp. HGB5 contig00056, whole genome shotgun sequence Length of sequence - 1944 bp Number of predicted genes - 1, with homology - 1 Number of transcription units - 1, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 274 - 323 9.2 1 1 Tu 1 . 
- CDS 424 - 1863 947 ## BT_1894 TPR repeat-containing protein Predicted protein(s) >gi|313157458|gb|AENZ01000066.1| GENE 1 424 - 1863 947 479 aa, chain - ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 5 476 77 547 551 287 37.0 1e-75 MDADDDSLIRIAYRYYDNRICSDSVKFLINYHYGRIYQNGDDYQEAMRYYLTAEKYAFAA HKNYYLGLVYSRIGEVYSEQMNFNGALEYYRNAYDAWEKLRKPAFMNNATLNIANAYSSL GDNDNAVKYYSQALQAAVQQEDNDMVIACLSNLGDIYVNEGDYPKALKAVKEIERTAPDG LSIYQYRVLAKVYYLQHKIDSARYYFNIASELAEDIRDDAQLAYLSIQIELASGNSEEAA NSINEYIWLSDSISRMVVSQSATAAEGKYYKEQTAFASYRLKVRTYFEYIVGLLICTVAI FLIYYYRERMKRKQQQVERYMLAVDTIRASKNRILEHLAVKEGQEVQLKELVLSRFEFLD QLGRAFYERNNTKAQQEVIYKQVRNFFTNLASNPATKKELEEIVNTVNDNIIVKLRKQFP KFKPADIDLLCYIYAGFSAQIISVIIGDSVSNVYNRKSRLKARIAASDSAEKDFFIQKM Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:22:52 2011 Seq name: gi|313157428|gb|AENZ01000067.1| Alistipes sp. HGB5 contig00087, whole genome shotgun sequence Length of sequence - 34284 bp Number of predicted genes - 34, with homology - 29 Number of transcription units - 21, operones - 7 average op.length - 2.9 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 151 - 203 -0.7 1 1 Op 1 . - CDS 220 - 708 469 ## gi|291514537|emb|CBK63747.1| hypothetical protein AL1_12760 2 1 Op 2 . - CDS 721 - 1650 624 ## gi|313157452|gb|EFR56873.1| hypothetical protein HMPREF9720_3013 - Prom 1832 - 1891 2.6 - Term 1716 - 1770 12.8 3 2 Op 1 . - CDS 1901 - 2479 496 ## gi|313157440|gb|EFR56861.1| hypothetical protein HMPREF9720_3014 - Prom 2499 - 2558 2.7 - Term 2533 - 2585 2.1 4 2 Op 2 . - CDS 2650 - 3192 -122 ## - Prom 3350 - 3409 4.8 + Prom 3408 - 3467 4.0 5 3 Tu 1 . + CDS 3634 - 4548 566 ## COG1708 Predicted nucleotidyltransferases 6 4 Tu 1 . - CDS 5282 - 5362 57 ## - Prom 5456 - 5515 1.9 + Prom 5229 - 5288 5.0 7 5 Tu 1 . 
+ CDS 5384 - 6238 844 ## Bacsa_2412 putative DNA repair ATPase + Term 6287 - 6314 -0.8 + Prom 6338 - 6397 2.8 8 6 Tu 1 . + CDS 6450 - 6923 212 ## gi|313157442|gb|EFR56863.1| hypothetical protein HMPREF9720_3018 + Term 7025 - 7061 3.1 9 7 Tu 1 . - CDS 6916 - 7539 416 ## BT_2779 3-demethylubiquinone-9 3-methyltransferase - Prom 7623 - 7682 2.2 - Term 7718 - 7764 10.3 10 8 Tu 1 . - CDS 7771 - 8271 353 ## COG1404 Subtilisin-like serine proteases - Prom 8388 - 8447 3.0 - Term 9562 - 9628 7.5 11 9 Tu 1 . - CDS 9638 - 10162 509 ## BT_1865 putative fiber protein - Prom 10233 - 10292 4.7 12 10 Tu 1 . - CDS 10413 - 10796 343 ## gi|291514529|emb|CBK63739.1| hypothetical protein AL1_12660 - Prom 10976 - 11035 4.4 13 11 Tu 1 . + CDS 10819 - 10989 84 ## - Term 10931 - 10969 4.1 14 12 Tu 1 . - CDS 11058 - 12443 870 ## BT_1894 TPR repeat-containing protein - Prom 12550 - 12609 4.6 - Term 13187 - 13244 14.1 15 13 Tu 1 . - CDS 13283 - 13735 149 ## - Prom 13799 - 13858 1.7 16 14 Tu 1 . + CDS 14022 - 15689 1513 ## COG3505 Type IV secretory pathway, VirD4 components 17 15 Op 1 . + CDS 15792 - 17147 1255 ## PGN_0083 hypothetical protein 18 15 Op 2 . + CDS 17152 - 17394 327 ## gi|291514523|emb|CBK63733.1| hypothetical protein AL1_12600 19 16 Op 1 . + CDS 17552 - 18574 1069 ## gi|313157436|gb|EFR56857.1| putative lipoprotein + Term 18594 - 18633 6.5 20 16 Op 2 . + CDS 18637 - 21216 1324 ## Pedsa_3099 hypothetical protein 21 16 Op 3 . + CDS 21237 - 21791 462 ## Bacsa_0599 hypothetical protein 22 16 Op 4 . + CDS 21799 - 23628 1092 ## gi|313157447|gb|EFR56868.1| hypothetical protein HMPREF9720_3029 23 16 Op 5 . + CDS 23632 - 25278 1845 ## BF1481 hypothetical protein 24 16 Op 6 . + CDS 25282 - 25353 71 ## 25 16 Op 7 . + CDS 25393 - 26160 503 ## COG1032 Fe-S oxidoreductase 26 16 Op 8 . + CDS 26165 - 26329 220 ## gi|291514517|emb|CBK63727.1| hypothetical protein AL1_12530 + Term 26332 - 26375 11.7 - Term 26319 - 26363 11.1 27 17 Op 1 . 
- CDS 26364 - 26984 430 ## gi|291514516|emb|CBK63726.1| hypothetical protein AL1_12520 28 17 Op 2 . - CDS 27045 - 27557 240 ## gi|313157450|gb|EFR56871.1| hypothetical protein HMPREF9720_3034 + Prom 28182 - 28241 4.0 29 18 Tu 1 . + CDS 28491 - 28766 190 ## gi|291514514|emb|CBK63724.1| hypothetical protein AL1_12500 + Term 28847 - 28893 0.3 30 19 Op 1 4/0.000 - CDS 28776 - 30044 929 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair 31 19 Op 2 . - CDS 30044 - 30484 264 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases) - Prom 30506 - 30565 7.2 - Term 30528 - 30572 6.3 32 20 Op 1 . - CDS 30586 - 31125 565 ## Neut_0148 hypothetical protein 33 20 Op 2 . - CDS 31137 - 32351 467 ## BF1340 tyrosine type site-specific recombinase - Prom 32511 - 32570 3.7 - TRNA 32640 - 32744 26.8 # Pseudo CTT 0 0 - Term 32949 - 32984 4.0 34 21 Tu 1 . - CDS 33062 - 34228 162 ## gi|313157441|gb|EFR56862.1| hypothetical protein HMPREF9720_3041 Predicted protein(s) >gi|313157428|gb|AENZ01000067.1| GENE 1 220 - 708 469 162 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514537|emb|CBK63747.1| ## NR: gi|291514537|emb|CBK63747.1| hypothetical protein AL1_12760 [Alistipes shahii WAL 8301] # 1 162 1 162 162 293 95.0 3e-78 MLRKLLNRLFGPFSSGGVIPLGEFLILLQKSYAYLRKHRYRVLSTELPDDLLQKWLMPMP DEKQAEEYKRSFARQNDTLSQTQLPVFVFDHVLQKRFNRDISTFDDAEMEIAGGDLRIYI LLLNYIFGMREAGRPLIEFDIFDVENYDAIIERIEAQETERE >gi|313157428|gb|AENZ01000067.1| GENE 2 721 - 1650 624 309 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157452|gb|EFR56873.1| ## NR: gi|313157452|gb|EFR56873.1| hypothetical protein HMPREF9720_3013 [Alistipes sp. 
HGB5] # 5 309 1 305 305 605 99.0 1e-172 MRKGLILGFVGNNPKHARRLPDDAVGQLIRGNVPLGYRTVLTGIEGNFEMGCAAAALRLR GEGLKIKLHIAVTRGKYKTYLRYKRDNLRLSEAHRIIEQADNVEIIEGKTPLEAERLRDR HVVDKSDLLFYYSTQLRDDFRNKFISYYLEQQHPRKNVCDLSDKSGRAFVAKEASLRYMR ERDLVVIANSIDKIYLQDWLAPDTDELRKYFRAPKETAVVLLRDTGVCDPKLLPLRVFFY ALSNSVITNLALPEKCWRESREYFDTFQNILRIIRLTRAHNIEIPDFNIFDFPRYGEIMR RIFQYQELK >gi|313157428|gb|AENZ01000067.1| GENE 3 1901 - 2479 496 192 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157440|gb|EFR56861.1| ## NR: gi|313157440|gb|EFR56861.1| hypothetical protein HMPREF9720_3014 [Alistipes sp. HGB5] # 1 192 15 206 206 376 100.0 1e-103 MIEQHTEEEQTLRAISHFYTTQVSICIAALGRMRSAELADKNIILRAIMCDMPDCGICAD PTAEDYGVDFWVDRAEYVDDLSFNLYVTQAEENVLWNIEKKGQKPGRLAAPQQRITVRTD GISRPPYIPLIYLTVLSSKLMAMAMSDVLPETDLSTYYDERMMAAMSEHMTEALDRIVAM QNGEEDAQMPVS >gi|313157428|gb|AENZ01000067.1| GENE 4 2650 - 3192 -122 180 aa, chain - ## HITS:0 COG:no KEGG:no NR:no METRCVLPSMRNIAVPADCTRPQLGAGRNRYVSAAPAWNRTPGLAVQNNKSRLSVAANKS CLQGWPFKIWSCCPAPYGNGTHVCAALTGGRIIAAPFFAKNPISLYSRSDGRETLNRRSA HHKHKIAKGSNPTGGGTAVTNPTSRTGGTELRLFHFWDLSLAPPTNRNTYRSKEPPRRMR >gi|313157428|gb|AENZ01000067.1| GENE 5 3634 - 4548 566 304 aa, chain + ## HITS:1 COG:SMb20835 KEGG:ns NR:ns ## COG: SMb20835 COG1708 # Protein_GI_number: 16264326 # Func_class: R General function prediction only # Function: Predicted nucleotidyltransferases # Organism: Sinorhizobium meliloti # 1 288 27 320 331 146 32.0 6e-35 MKQSIDFLPERKQRDLHELVGLIRDEVKDVVMIILYGSYARNTYVELDIRRDYGGGKIQY MSDYDILVVSKKRLGSRTVTVSSRVKSSFLKNKNADTQTRPQIINESITKFNNALSEGRY FYVDILTEGIVLYDSGECQLATPCELNYSEILQMAEEYYDTQFNRGKRFIRYTQIAYADN EIKDSSFFLHQASESFLKSIPLVYILYGYKEHELEFLIERCKPYTLELAKVFPCDTKEEE RLFKLLQRAYIESRYNPDFEITKEDIDLLMPKAEMLRDIVEDVCRKQFDYYKQQSKKQLP KERG >gi|313157428|gb|AENZ01000067.1| GENE 6 5282 - 5362 57 26 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MLYDMNNNIFRFLERLLKCINITYDP >gi|313157428|gb|AENZ01000067.1| GENE 7 5384 - 6238 844 284 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_2412 NR:ns ## KEGG: Bacsa_2412 # Name: 
not_defined # Def: putative DNA repair ATPase # Organism: B.salanitronis # Pathway: not_defined # 1 276 1 280 282 351 70.0 2e-95 MTQKQAIQLFEERKVRTVWDDETEKWYFAIVDVVAILTESADAAAYWRKLKQRLKAEGNE TVTNCHGLKMLAADGKMRLTDVADTTQLFRLIQSIPSPKAEPFKQWMAQVASDRLDQMQD PEMSIEQAMADYKRLGYSENWINQRLKSMEVRKELTDEWQRRGVEGQQFATLTDIITQAW ADRTTKSYKQFKGLKKENLRDNMTNIELALNTLAEASVTEISKQRRPKGFQQNVKVARSG GSVAKAARTQLEKQLGHSVVSPINAAQVLAAKKTEQIEGEKDKD >gi|313157428|gb|AENZ01000067.1| GENE 8 6450 - 6923 212 157 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157442|gb|EFR56863.1| ## NR: gi|313157442|gb|EFR56863.1| hypothetical protein HMPREF9720_3018 [Alistipes sp. HGB5] # 65 157 1 93 93 173 98.0 4e-42 MKKIVCLIAMALFTATGCTSVEMNQTKTDTPDDNTQGKATTLSCADYTLNPLWNWAETPA NQLIVIRSQAELADFIFPNDKLPEDIDFEKNTLLLVAGQATNGIESVKKNFVKADAQYIY SVTFLLNDTTEAPKWRVAQLVPSVPNEAKISLDLEVN >gi|313157428|gb|AENZ01000067.1| GENE 9 6916 - 7539 416 207 aa, chain - ## HITS:1 COG:no KEGG:BT_2779 NR:ns ## KEGG: BT_2779 # Name: not_defined # Def: 3-demethylubiquinone-9 3-methyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 200 55 255 258 206 48.0 4e-52 MGATVTGIDIHKPSIETAKTLFAERGLKGTFICSDIFDYSDTQPYDLIILHDALEHIPEK ERLMLHLKSFLKADGLLYLGFPAWQMPFGGHQQMAKNRIIANCPYIHLLPRPLFRFVFRA CGEKESTVKCFFEIRDTRITIEGFHKLVTATGLRIVNQRLYFINPNYQVKFGLKPRILSP VIGKIPYVRNFFTTTCYCLLAMQEGAN >gi|313157428|gb|AENZ01000067.1| GENE 10 7771 - 8271 353 166 aa, chain - ## HITS:1 COG:alr4870_1 KEGG:ns NR:ns ## COG: alr4870_1 COG1404 # Protein_GI_number: 17232362 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 4 115 467 583 590 73 41.0 2e-13 MAPGVLIYTTDRTGSVGYVSGDYMPNFNGTSSACPHVAGVAALILSVNPNLTQKEVATII EKTARKVGGYSYSTVSGRPNGTWHTEVGYGLIDAYAAVMEAQNASTTVYFNDKTVTTNTV VSGSEISATNVTVKNNAKLTFTNAKSIIITQPFTVELTSSLELSLQ >gi|313157428|gb|AENZ01000067.1| GENE 11 9638 - 10162 509 174 aa, chain - ## HITS:1 COG:no KEGG:BT_1865 NR:ns ## KEGG: BT_1865 # Name: not_defined # Def: putative fiber protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 174 39 203 203 117 37.0 3e-25 MERYNGYMTNWEGWAHCWKFGNRMIKFDLGPADSRVSSNSDKLVLYDTENGGFIDLYARN VYTNSDAASKTNIQSLGSATATLTQLRPVSFEWADKAHYFKTSRRSTGVSNPKEMGFIAQ EIEQVLPDIVAVDCEGHRVVNYSALIPLLTKSIQELNGQIETLKAEIEALKSGK >gi|313157428|gb|AENZ01000067.1| GENE 12 10413 - 10796 343 127 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514529|emb|CBK63739.1| ## NR: gi|291514529|emb|CBK63739.1| hypothetical protein AL1_12660 [Alistipes shahii WAL 8301] # 1 127 14 140 140 229 100.0 7e-59 MKKFSPAMSVIVGLFMLFAGIYSVYGGGKTAVIVKQVKVGGGPVCLSENIPLVFYDSSMS IVEVIASGAESKTGQLIISGNFGTVACERFDLFNGKSIIPIPELEPGFYTIELSIDKNVF AGDFFIE >gi|313157428|gb|AENZ01000067.1| GENE 13 10819 - 10989 84 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRYSYISKVERLFPKTKKSEPLKFFNRKVISTEKQTVTHTKKPENFSVFGLCFNAF >gi|313157428|gb|AENZ01000067.1| GENE 14 11058 - 12443 870 461 aa, chain - ## HITS:1 COG:no KEGG:BT_1894 NR:ns ## KEGG: BT_1894 # Name: not_defined # Def: TPR repeat-containing protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 16 459 96 549 551 145 27.0 4e-33 MLRPAVNYFLRKGTNRQKFLSWYCLGRMEYSANNYQKATESYLKALEYQDLIDDPYLIGV CNFVLGELNLKQNNYQRALFYYQEAYENYQAANKTKHQVSAKTAIAAVYYMENDNDNALR DYREALNLSEQFGYEDFQVYCLRCLIGILSNKGVTNETYNYVGRFLQLSDDLSPNDYCVL GEYYLRINKPDSAEYYLNAAISANIGGPRENDAAAKILLAETYTMNADYRAAYSMQKECL ALCDSIHLSYMNNSAAETELRYKNERLEFERYRSSVRQNVIYIIALVSVIAICGIIYIFR RRNKMQKQRIEEYLTLIEELNKTKLQIPDAAKISELLNTRFQVLKELAKTYYEFENSPAL TKKVKGILSEKILDKTIMSDLENTLNQQYDNVISNFKSEYPKIKPCFVELLCLLYAGFSP QQISVITNETLKTIYMRKFHLKKKIMASSITTKNAILNIIS >gi|313157428|gb|AENZ01000067.1| GENE 15 13283 - 
13735 149 150 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MPTAKMNATIFASALKSSSCCINIVFGFGGKSKSDATIPDVVGCRSPSFALIGYGICLES EKIYTFVGEYSKGACLNIHHSLFPAVKRGNLNGKQAGRLNFVHRLCGVRSDCFRVFHDLT AADCSPYIFKIFCYLCHCNKGTPTYSRPVQ >gi|313157428|gb|AENZ01000067.1| GENE 16 14022 - 15689 1513 555 aa, chain + ## HITS:1 COG:alr7213 KEGG:ns NR:ns ## COG: alr7213 COG3505 # Protein_GI_number: 17233229 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirD4 components # Organism: Nostoc sp. PCC 7120 # 81 440 115 466 589 90 26.0 7e-18 MPGVRVAWYTGTLFAGYLLMLTAGIWIWRLLRHDLLDDVFNDENESFMQETRLLSDEYSV NLPTLFRYRGQNFRGWVNIAVFRSCMVVGSPGSGKSFTVINNYIKQLIEKGYSLYLYDFK YPDLSMLAYNHLLRFTHKYEVKPQFCVINFDDPRHSHRCNPINADFLTDIADAYEAAYVI MIGLNRSWAQKQGDFFVESPVVLLTAIIWFLRIYQNGKYCTFPHAIELLNKKYEEVFTIL MARPELENYLSAFVDAWQGGAQEQLQGQIASAKIPLSRIISPALYWVMSGDDFSLDLNNP QAPKILCVGSNPDRQNIYSAALSLYNSRIVKTINRKGKLKCGVVIDELPTIFFKGLDNLI ATARSNRVAVCLGMQDFSQLTRDYGEHESKVIQNIVGNVIAGQVVGDTARILSERFGKIV QQRQSLNISREDKSTGINTQLDSVIPASKISNLSQGVFVGAVADNFGDNIPQKMFHAQIV VDVEKVKAEEKQYRPIPVLSDFTDENGKDRMAEVIEANYYRIKQEAAQIVEDELRRIADD PKLKHLFKGNETVAN >gi|313157428|gb|AENZ01000067.1| GENE 17 15792 - 17147 1255 451 aa, chain + ## HITS:1 COG:no KEGG:PGN_0083 NR:ns ## KEGG: PGN_0083 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 17 334 2 329 464 66 24.0 3e-09 MAEEQQQSPGAAQQPNDNSLNEPSIGMMYDKEKDQAKVFSQNPDGSIGTVDPTPENESLF FVMDKNIPLNFYKNLQKYHNNPTINIYVLPRRALGRMKDALKRYWKNTTRDDVKLYYNYK IHPDGKFECKMKTRGIPVAEMPWDTLNRMGYSFGGLEKMNYLQKLQNYEQTGMHKLKYHD DIINYIGEGKIRLKKSGNGYKVDVKSYARILDEGLFDQKFTETDMKNLEMYGNLGRVLET SEGPLLVSRDFDTRQLDYTNTENAFVPRYIRGTELTQEDINSFRRGEVREIAIVNRDGSV NIVPYQYNAVFRKPMEVLTIEQRNALATKLREQRDMERIISSMPETADGWTQYPQAPKQS AEQTAETAGQGQAPAAVPEQNQQGQAAETAGQGQTPAPDLFGAGQTTQGKTPVQGQTSPQ GQKQPRAGQKSPRPEVATPTPKAQRKTGQHV >gi|313157428|gb|AENZ01000067.1| GENE 18 17152 - 17394 327 80 aa, chain + ## HITS:1 COG:no KEGG:no 
NR:gi|291514523|emb|CBK63733.1| ## NR: gi|291514523|emb|CBK63733.1| hypothetical protein AL1_12600 [Alistipes shahii WAL 8301] # 1 80 1 80 80 150 98.0 4e-35 MERFDESAIDWVAAAKRGISRRMLKEYGFLELLLEGGTTPVIPLTLTLADVTIKLDGTLR LTEDCDKNVRFEVLGYGWEE >gi|313157428|gb|AENZ01000067.1| GENE 19 17552 - 18574 1069 340 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157436|gb|EFR56857.1| ## NR: gi|313157436|gb|EFR56857.1| putative lipoprotein [Alistipes sp. HGB5] # 1 340 1 340 340 511 100.0 1e-143 MKKILFAAFAALSMIACNKDDATTVIPDIDNTLNGAIVAMRFTDETPTRAFFAEAAAAEA WEKSLSSLSVYVFNAKGDLITLRDFTASELTARTATFALPHATAGTTCDFYAVANMSLTG VTTKSALLAKLETSAADYNGTFAEVSTKAKRSGGFVMTAGVSKAIAEGGTTSVALTLKRT VAKVAVQTSVDASFNSKYSGTLTVTGTKVVRAASQTPIIAPATPTPGAMTYTHTQAPAAA SGKYNNLFYIFENGALAEASRVALEITATYDADGNASTTADRMEVVYTVPLTGKAGGEIA RNGYYRVAANITGLVGQDCQVSVSVAEWETPVTQSVDLGA >gi|313157428|gb|AENZ01000067.1| GENE 20 18637 - 21216 1324 859 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_3099 NR:ns ## KEGG: Pedsa_3099 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 564 764 190 362 445 99 31.0 8e-19 MKQTIFILLALALCWGCITDREPTARSKVAVELTVRSAPMTTTTRATDETTIRDLNFYLM DKAGRVVVFRYLTTTTLRFECPPGVYLMRIAANVGRSLGESADLSRYMVTYQQDYDTLPM FYEQETTISCSSGSVVQLPPINVKRFVSKISYNLTAKPADMELKSVQLLTVPSTAALFAG NGPASGNPDDYTHGPEMLLTGRQAAGSYYMLPNLQGVRTSITDQKQKNADNAPPCASWLL IRATRGSKVLAYSVYLGENNTSDFNVRANCHYRLNITLLNDNTADTRIAAYTAAVYDDFD DGNIGGYCVYDPDYSLHVDMVNENSSLAISGRLEVTQGDSDYFEFDRTDCNNSFDFEIYN PKGGNYFYLDYGPSVFTKDNSTLAYRVTLTDEYGFAQSYDFKHTMANIVYAHTSAGGSVT ADRCLYTQTANEGGGKRTAALCHEDGCRLTAAATGSYAFVGWYADAAYSRLLSSSESYNY VPRDSHADIYARFRVMETPLDSKGTANCYIAPALDTRYSFDATVQGNGKNTTNIWPQQLH GVSARVLWESGTLSETVVKDAAYSNGRISFSTGAVRGNAVIGLFDAAGNCIWSWHIWSVD YDPATMAQTYSSGAVFMDRNIGALTTDCTQPSSRGLYYQWGRKDPFPHPATCQDINKRAD AVYAEGFEYAVSYPRNAGTESPYDNMTVEWSIAHPTTFMSDAMYEDWEEWTSVADWLYNH HPNLWGNVTTSNNNISKVSLKSIYDPCPVGWKVPSPEDFAGIERVSQSSPYYVTIHYNGN RTTNIPTGGTFVETRFMNNGQLGRLYTNAPYNMHWGTWACRYGDISCTSIFFSTGSVPSF IGTTDYYRYAANPIRCIRE >gi|313157428|gb|AENZ01000067.1| 
GENE 21 21237 - 21791 462 184 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0599 NR:ns ## KEGG: Bacsa_0599 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 23 184 16 191 191 104 38.0 2e-21 MKKRHIITLMLCLLALWGKPHRVAAQFFSAKVNALAALTATVNVGVDAAVADKWTLDLSA YWNPVNSDSFSCRLYAAQIGTKRWLYEAFVGHFIGSQLTYGNYLYGGSRRYYKGGMAGLG FSYGYAWLLSKRWNVTAEIGIGVFYMKDTRRDRIPPEYESIFIHHYKRWVVGPSRAEISF NYLF >gi|313157428|gb|AENZ01000067.1| GENE 22 21799 - 23628 1092 609 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157447|gb|EFR56868.1| ## NR: gi|313157447|gb|EFR56868.1| hypothetical protein HMPREF9720_3029 [Alistipes sp. HGB5] # 1 609 1 609 609 1161 100.0 0 MKKLIYTVVCMMLFCCACSDADIELNPATQAKQQLAMTFRCGAMSPTRAATNDTRIDDIN LYLFPVNGGQARHVYIAPVRPVVLELPKGDYTLYAIANLGHDAGERTQDFVRSLRVEREP AALADAPFPMSAQQAVTVRGDTQIAVSLVRAVAKVNFSYTVAADFAKSFHVKSVQFRSAP RSAALFDFSRAESSEAVADMPPVEASATAYAATCYLLENRQGEVAGIGSQQQKDQTRAPE HATYIAIAGEADGTQVVYRIYLGENNTTDFNVGRNRVYNIDARILGLNTVDWRVSTAELT VTPFAENHAPGEPATTELRLTSTNNPENVYYLSYHIDAGTGIVAIDGVNRNPGTPYPFFS GNGTVTAGISYTQAEPGDVRLRLTVTDKYGFSMERVLTTVFKNPELTITYAQQGNELTVY DRAYIDYTVSQPGYSGNYTVKVEGVPSVYYGRYGSDVPLTTFTQYGNGTNTLRVKPNMLG PNPLKITVTDTDGHSAEIITSVMGIKTTADLQPTFSTSGGLLYVTVESSQPVVDDLTVKV TANADIYSFGGAVTKKQYIVTVKIEADHTTGKGSVSLSGLGSYATFTVTSYSAEFKRTSD NGLVEYKLQ >gi|313157428|gb|AENZ01000067.1| GENE 23 23632 - 25278 1845 548 aa, chain + ## HITS:1 COG:no KEGG:BF1481 NR:ns ## KEGG: BF1481 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 171 546 299 676 687 268 38.0 4e-70 MKRIMKTIAVCLAALLCGCSITGHLERRQYCADVLHVSREQREREQQTYQPPALKIEWDS NRFYLVPTEVQENGERIMAMQIQQVTIRAKARTLPERLGKVIIDFVIDMPRQLQGACRSV IVTPYLHKYGEAHALQDITIRGGLFSRVQQRDYWQFGKYVRAFAPDSSAEARAFARFVKY PYPEGVRLDSVAAHPGHISYYYSQEVPTDETSKTMLVTLQGRVVALDDSSYTLPPSDTLT YHVSSMLSFVDTTTRYKIKVINKYATVQDRNYIQFLVNDTRVLDTLGRNTAQLGKIQARM AGLIAQQEFFVDSVVLTATASPEGSFTRNRALAQGRAHALKRYLQERIGPEVDTLVRVRW VAEDWPELATRIRGDESLPHRAEMLELIAAEKDPDRRERLIRERYPADYRYLKESVYPWL 
RAVTLRYDLRRQGMIKDTIHTREVDTAYMRGVDLLQKRRYAKALYNLLEYRDRNTVVTLL SLGHDAQALEILDGLEQSATTEYLKAVACSRLGRKVEGREHFLTACGMDKRMQYRGNLDP EITELLKQ >gi|313157428|gb|AENZ01000067.1| GENE 24 25282 - 25353 71 23 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MKIGLIDVDGHHFPNLALMKLSA >gi|313157428|gb|AENZ01000067.1| GENE 25 25393 - 26160 503 255 aa, chain + ## HITS:1 COG:Ta1390 KEGG:ns NR:ns ## COG: Ta1390 COG1032 # Protein_GI_number: 16082367 # Func_class: C Energy production and conversion # Function: Fe-S oxidoreductase # Organism: Thermoplasma acidophilum # 38 193 23 201 425 93 34.0 3e-19 MFGQYDRVYMAKVFTFTPDYPHVFPCEVVKGGTGYRDYARVLPEEVEHSCPDYGLYDYPA AVGFLTRGCVNRCPWCVVPRKEGAIRANADIEEFLGGRRNAVLLDNNVLASGWGLEQIEK IIRLGVRVDFNQGLDARQIARNTEIAELLSRVKWSKYIRMAYDSSAVRDNVHRAIERLKK CGVKPAKMFFYVLVRDVDDALGRIEELRALGCQPFAQPYRDFESKIHPTPEQRRLARWCN HKPTFHTVNYKNYKE >gi|313157428|gb|AENZ01000067.1| GENE 26 26165 - 26329 220 54 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514517|emb|CBK63727.1| ## NR: gi|291514517|emb|CBK63727.1| hypothetical protein AL1_12530 [Alistipes shahii WAL 8301] conserved hypothetical protein [Alistipes sp. HGB5] # 1 54 1 54 54 102 100.0 1e-20 MKRKSFVSILIVLLALFSLTRCKLAEVETIDEIPCPGVGIDTVIIPGWDNQPDD >gi|313157428|gb|AENZ01000067.1| GENE 27 26364 - 26984 430 206 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514516|emb|CBK63726.1| ## NR: gi|291514516|emb|CBK63726.1| hypothetical protein AL1_12520 [Alistipes shahii WAL 8301] hypothetical protein HMPREF9720_3033 [Alistipes sp. HGB5] # 1 206 1 206 206 385 100.0 1e-106 MYLFNNTPIQTRFDESDKKIASELNKITDNELLNCDLQKIADRIEQQYSIICDTEFTTED VEPISYLMPISREALRPELRIGAIHEFYDFVAVDYKFKIQGDYTFFFNTPTDTHYAPIKG SANANGLTLTIITEYTRIPLSDEWKERVKEDIKFLVSEVKTRINLLKEECKKRNANIKPN VLSILEKERQNLIEKKAHDAKLNPFK >gi|313157428|gb|AENZ01000067.1| GENE 28 27045 - 27557 240 170 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157450|gb|EFR56871.1| ## NR: gi|313157450|gb|EFR56871.1| hypothetical protein HMPREF9720_3034 [Alistipes sp. 
HGB5] # 1 170 102 271 271 335 100.0 7e-91 MWSYYNEHKGMCIGVNMKDIYTSLINHFHDKINSYLSFRKVVYKNKLPDLNLINIVPFDY QTPRIDNTPSKKAQNEINGFLLTKAKWWEHEKEYRLILRQTPNPYLNLDEPVKLQMRNLI NRVYLGCKFDGNIDTIINIAKEKKFDVYKFRQQSHTYGLYAEKLYSHNNQ >gi|313157428|gb|AENZ01000067.1| GENE 29 28491 - 28766 190 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|291514514|emb|CBK63724.1| ## NR: gi|291514514|emb|CBK63724.1| hypothetical protein AL1_12500 [Alistipes shahii WAL 8301] hypothetical protein HMPREF9720_3035 [Alistipes sp. HGB5] # 1 91 204 294 294 191 100.0 1e-47 MHHKFTDNVNADREYWIPAKFTFDVLVQPIIDGIFYESSQGRVDDRLKDCISVALKPQSV DTKLHFLGVYDVLIKNDGEKVTISAPIFRNL >gi|313157428|gb|AENZ01000067.1| GENE 30 28776 - 30044 929 422 aa, chain - ## HITS:1 COG:ECs1679 KEGG:ns NR:ns ## COG: ECs1679 COG0389 # Protein_GI_number: 15830933 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli O157:H7 # 1 420 1 422 422 325 42.0 9e-89 MYGLADCNNFYASCERMFAPDLNGRPVVVLSNNDGCIIARSDEAKALGLKMGDAYFQQAA FLRQNNVAVFSSNLELYGDMSTRVMNTLKSYSPATEVYSIDESFMDFTGIDPATLRDFGL EIVQRVKKNTGIPISLGIAPTKTLAKIASKLCKQYPKLAGCCYMHRPEDIEKVLRRFPIG DVWGIGRRFEKKLLAANVATAYDFTQLSPEWVRLNMGGITALRMWKELRGEACIGFDDMP QPKQQIRTSRTFHNDVSDYGELHRNIALFTATSAEKLRKQQSVCGEIRVFILTNRHRPDK PQSYENALLKLPNPTDSTIELVKYAGKILRDLYSKGYGYKRAGVVLSDIRSNEGVQINMF DPIDRDKHTRIMAVMDALNKTYGRNKISLATQGATPPYRMNREHLSQRYTTSWDDIIQVK AM >gi|313157428|gb|AENZ01000067.1| GENE 31 30044 - 30484 264 146 aa, chain - ## HITS:1 COG:STM1998 KEGG:ns NR:ns ## COG: STM1998 COG1974 # Protein_GI_number: 16765334 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Salmonella typhimurium LT2 # 7 144 1 138 139 112 42.0 3e-25 MNENIRLEWFKPDLTAPLEVPLMDSGVHAGFPSPADDSIEQPLDLNKLVAPNPNATFFAR VVGDSMKDAHIEEGDLLAVDKSADPLDGDVVVSFIDGEFTLKTIRFKGAQVWLVPANEKY KPIEVTTDCDFRVWGVVRYVIKKVKK >gi|313157428|gb|AENZ01000067.1| GENE 32 30586 
- 31125 565 179 aa, chain - ## HITS:1 COG:no KEGG:Neut_0148 NR:ns ## KEGG: Neut_0148 # Name: not_defined # Def: hypothetical protein # Organism: N.eutropha # Pathway: not_defined # 12 176 14 178 184 166 53.0 3e-40 MDAVQTAIKDKIIVVQGRQVILDSDVAALYGVETKRVNEAIKNNPDKFPDGYILQLTPSE WEDLRSKISTTKSAKVRYAPKAFTEKGLYMLATILKSPVATQTTIGIIETFAKVKELSRT MSALQEPNDPHLQERLAEKGSRIISDLLDDTFETTQSETTIECNLVVLKVKHTITRKPK >gi|313157428|gb|AENZ01000067.1| GENE 33 31137 - 32351 467 404 aa, chain - ## HITS:1 COG:no KEGG:BF1340 NR:ns ## KEGG: BF1340 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 404 1 405 409 317 42.0 9e-85 MKNTFSVLFYVKKQQKKSGLLPIMGRITVNKQTTQFFTQLEIDPQTTEWDSKTKKINGKG ASSLNSQLRNIEARIRTTYQEMFLRGPVLSAEQVKNAFLGIDDSSTTLLSVFQEYKQHQK DMIGKCRSLSTYKRITLVYNRVADFIKVKHHRSDMYAAEIDGSFVDDFSDYLFTTYKVSN NQVQKIMQVFKHVTKICIKKGLLKKDPFSDYKIKFDPVDPNFLTVDELKLIIQKKFSLRS LETTKDLFVFSCFTGLAFVDAMNLKRENITLAENGDMWIKITRQKTNIPSIIPVMDIAKS IIKKYENQNDSYVLPRSPHQHVNAYLKQIASSCGIKKNMTFHMARHTYATTAMSFGVPIA TIAKTMGHANMKMTQHYARVLPEKIIKDINVLNTATQTLQTLYQ >gi|313157428|gb|AENZ01000067.1| GENE 34 33062 - 34228 162 388 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157441|gb|EFR56862.1| ## NR: gi|313157441|gb|EFR56862.1| hypothetical protein HMPREF9720_3041 [Alistipes sp. HGB5] # 13 388 1 376 376 761 100.0 0 MKNCILILVVGLMMLGCGNRQVKQIEQTEKDSIEITIKTSYDTVHFKFKWRNADSIPYLF GKPSKDKIAEYQRRFDSLHLTERYGFYDRIEYFRDIPHIIAVNDFVALWFANGTTPYQDD LDMILWRINECYPLNCSTTEIERYQRLSEQIDSLLSYSLGSQWDYNMHASLATKLQECKV VFYNYRLLDCNNEFKDILASEHCAWKEYEETLYDVCDKIVVGKHGGSGAPMAYGDFFIES YKQRLISLLDYYFILGDSNYQPTERHRLIPDKMIHRAYSDFFDVINSDYEEYSVEEKRCA LKNDMRYWDKWMAERRKVSAQLPMPFKKVYDNCTNNLKRRKLIQIKSRYLGYGICSGFEL DCILKKDCSDEELFNYDYQTRYDALLIK Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:26:40 2011 Seq name: gi|313157417|gb|AENZ01000068.1| Alistipes sp. 
HGB5 contig00023, whole genome shotgun sequence Length of sequence - 8833 bp Number of predicted genes - 9, with homology - 9 Number of transcription units - 4, operones - 2 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 2 - 494 -25 ## PA14_49480 hypothetical protein - Prom 618 - 677 5.3 2 2 Op 1 . - CDS 1055 - 2314 614 ## COG0732 Restriction endonuclease S subunits 3 2 Op 2 . - CDS 2329 - 3045 250 ## gi|313157420|gb|EFR56842.1| hypothetical protein HMPREF9720_1025 4 2 Op 3 . - CDS 3060 - 3500 139 ## gi|313157422|gb|EFR56844.1| hypothetical protein HMPREF9720_1026 5 2 Op 4 . - CDS 3511 - 5037 2325 ## COG0286 Type I restriction-modification system methyltransferase subunit 6 2 Op 5 . - CDS 5027 - 5593 556 ## YPTB3883 hypothetical protein - Prom 5613 - 5672 5.7 - Term 5711 - 5750 4.4 7 3 Tu 1 . - CDS 5773 - 7197 1589 ## BVU_2464 mobilization protein - Term 7225 - 7253 1.0 8 4 Op 1 . - CDS 7443 - 8327 835 ## PGN_0923 putative DNA primase 9 4 Op 2 . 
- CDS 8330 - 8620 399 ## Palpr_1961 hypothetical protein Predicted protein(s) >gi|313157417|gb|AENZ01000068.1| GENE 1 2 - 494 -25 164 aa, chain - ## HITS:1 COG:no KEGG:PA14_49480 NR:ns ## KEGG: PA14_49480 # Name: not_defined # Def: hypothetical protein # Organism: P.aeruginosa_PA14 # Pathway: not_defined # 1 163 191 353 595 143 41.0 2e-33 MYINNQTKRKELKGDRPKIKNSDLATLRTTDPYEQLNAVLREIFNCELFPEKFNSDFHQY VHVNIKKGSHDVNGGFNLFPNYKARDIMTEGSGFLQWLSVYTYALNEEIDVLLLDEPDAH LHCSLQMLLVSKLQNFIEKNNKQILIATHSTEIIKNTPFNLIID >gi|313157417|gb|AENZ01000068.1| GENE 2 1055 - 2314 614 419 aa, chain - ## HITS:1 COG:MJ0130m KEGG:ns NR:ns ## COG: MJ0130m COG0732 # Protein_GI_number: 15669898 # Func_class: V Defense mechanisms # Function: Restriction endonuclease S subunits # Organism: Methanococcus jannaschii # 9 418 6 424 425 111 26.0 2e-24 MTHDTNIPQGYKATALGIIPQEWEVMRLGDIVSITSGESPSLYHLKAEGKYPYVKVEDLN NCEKYQESSREYSDDNNTTIKAGSIIFPKRGASILNNKVRIAAKDIQMDSNMMAITPHTT IVDTEFLYIRILHERLYRIADTSSIPQINNKHIIPYKIAVPPLAEQRKIAEVLGVWDEAI EKQARLIEKLALRKRALMQRLLSAKLRLPGFSEPWEKVKLGDIGHFLSSNTLSRDCLNEQ IGNIKNIHYGDILIKLPTIVDASFIHIPYVNDDVIVKSDYLKNGDIIFADTAEDYTVGKA IEIINIQAIPVTSGLHTIPFRPKSGIFVNRFLGYYVNSTDYRRQLQPLIQGIKVYSISKT ALCKTTLKIPTLSEQTAIAEVLTAADREIELAKEKLERLRRQKRGLMQQLLTGKRRIKY >gi|313157417|gb|AENZ01000068.1| GENE 3 2329 - 3045 250 238 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157420|gb|EFR56842.1| ## NR: gi|313157420|gb|EFR56842.1| hypothetical protein HMPREF9720_1025 [Alistipes sp. HGB5] # 1 238 1 238 238 488 100.0 1e-136 MNLSTWGDLGNIILGITSVFTAIVTAVVLCKQHKLQQEQHKLEQEKLKIQQMEHQPLFFF KREKDHMDICNSGAKLHRPIEFSIGSMIYVQSSMFLPGGLKTFVYCYPINIYKNCRSQEE LDGVVGTCLFAEEDRAKLHEISSKIWNSLVHSKKLVPQMPYMTSVIVRESDLIAIKYQDV YHIDHTVYYLDSSPISAETFNKLNKIRQLVPPGIYSVDKIDLNNILRGILQFRFEKTW >gi|313157417|gb|AENZ01000068.1| GENE 4 3060 - 3500 139 146 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157422|gb|EFR56844.1| ## NR: gi|313157422|gb|EFR56844.1| hypothetical protein HMPREF9720_1026 [Alistipes sp. 
HGB5] # 1 146 1 146 146 283 100.0 3e-75 METWTLIISAAALAVSLYTYFAHDRRLKQQERLINSYQLKQLQKEEQANKKADIRAEISQ NTKGPRTLRIWNNGRAVARNIRVEGLDVEGLIVMNDEIFPYEIMNPQDDAELKLYLTIGC PHKLTLKLICDDESGQNNVTTKVITL >gi|313157417|gb|AENZ01000068.1| GENE 5 3511 - 5037 2325 508 aa, chain - ## HITS:1 COG:BMEII0451 KEGG:ns NR:ns ## COG: BMEII0451 COG0286 # Protein_GI_number: 17988796 # Func_class: V Defense mechanisms # Function: Type I restriction-modification system methyltransferase subunit # Organism: Brucella melitensis # 3 508 12 516 518 369 41.0 1e-102 MNGKITQEEINKVVWQACDTFRGVIDPSQYKDYILTMLFLKYVSDVSKAKYKEYLQRYDG DTERAQRAMRRERFQVPEKSSFDYLFEHRNEPNIGELIDIALADLEFANREKLSSEDGSG IFRNISFNSSNLGETKERNARLKQLLIDFSDERLQFDESHLANNDVIGDAYMFLIEKFAS DAGKKAGEFFTPKEVSSLLARLTKSAPGSRICDPTCGSGSLLIKAGREVGSDNFSLYGQE LNGSTWALAMMNMLLHGFDSATIRWGDTLRNPKLKEGDALMKFDTVVANPPFSLEKWGAD EAADDPYNRFWRGIPPKSKGDWAFICHMLEVANEHGKVGVVVPHGVLFRGASEGKIRQQT VEENLVEAIIGLPANLFYGTGIPAAIAIFNKAKTTTDVLFIDASREFENGKNQNRLRDED IDHIVTTYRRFAQGELKPGIVEERYAYVARREEIADNDYNLNIPRYVDTFEEEPEIDIAA VQQEIDALEKELAEVHVRMDGYLKELGY >gi|313157417|gb|AENZ01000068.1| GENE 6 5027 - 5593 556 188 aa, chain - ## HITS:1 COG:no KEGG:YPTB3883 NR:ns ## KEGG: YPTB3883 # Name: not_defined # Def: hypothetical protein # Organism: Y.pseudotuberculosis # Pathway: not_defined # 3 186 7 192 192 89 31.0 1e-16 MVKLKDIATIQTGVYLKNAPSPDTCYLQVNDFDEVGNIRPTVRPTTTVSSKAARHLLTES DLLLAAKGGKNFCAIAPTQLGPCVASPSFLIIRIDDPARILPEYLCGFLNLPSTRQLLTA QAQGSAITSLSKADLEEFDVPLPPLERQRACIALTRLHRREQALYKAIAERRRQITDCKL TKIYKDER >gi|313157417|gb|AENZ01000068.1| GENE 7 5773 - 7197 1589 474 aa, chain - ## HITS:1 COG:no KEGG:BVU_2464 NR:ns ## KEGG: BVU_2464 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 457 1 431 446 337 42.0 8e-91 MGYAVLHLDKAPGNEVAMTAHIARTKIPTNASPERTSLNEELVEFPEEVADRTEAIKYRL EHAGLTRKIGTNQVRVIRVMLTGSHDAMKRIEEEGRLPEWCADNLAWLRETFGAENVVSA VVHRDESTPHIHAAVVPIVTGERRKARTTASETPGKRHYRPKSAARPRLCADDVMSRVRL 
KQYQDSYAATMSKYGLERGIDGSEARHITTQEFYRQAIAQQQDLQENIDALLRLEEQKRK AVELLKQQERQVRIEYEQTATLQTQKSAELNRTEAELKSVKGELKGARLKNAAAEVGSNI VEGIGAMIGTSRVKRQQQEIDRLNAENAALHEQIGTLNRANREEKARHEQAEQQLQAKLD RIEHWLPDTQTLISWGEYCRDMGISESGAREIIAMKPLYVSGELRAPEYSRRFDVSNTEI RLQRDPNGPGGFQLLIDRQPAHAWFRQRYVELMIRLGYPVRPERPVVRRHTIKR >gi|313157417|gb|AENZ01000068.1| GENE 8 7443 - 8327 835 294 aa, chain - ## HITS:1 COG:no KEGG:PGN_0923 NR:ns ## KEGG: PGN_0923 # Name: not_defined # Def: putative DNA primase # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 8 294 9 293 295 224 42.0 4e-57 MRTTPDCIGIREYLLRRGLHPHRETATHGMFLSPLRKERTPSFSVRYDKGLWYDFGLGEG GTLLQLVMRLEGCGMAEAIRRLRKGAADEVSFQPLPTASPREDSPLRILSVGEIRHPALI GYLRERGIDPAVAGALCREVHYAVGERRFFAIGFRNDAGGWELRSPQFKGSSAPKSITTF DRHGDTALLFEGFFDLLSYLTLQHEPTPTADTAVLNSVVNLPRALPFLARHATIHAFLDN NEAGRFTLERLRSALPGATVIDRAEGYRTHKDLNESLRSSAKVSRTPRRRGRKL >gi|313157417|gb|AENZ01000068.1| GENE 9 8330 - 8620 399 96 aa, chain - ## HITS:1 COG:no KEGG:Palpr_1961 NR:ns ## KEGG: Palpr_1961 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 3 96 10 105 107 63 37.0 3e-09 MTKEIFLSGMTADQLSEMIRKSLRDELRQLHPEPTAETPNYLTRRETARRLRISLVTLND WVNRGRIRAHKIGGRVLFRDSDVEAALHRIVPIKSR Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:27:47 2011 Seq name: gi|313157345|gb|AENZ01000069.1| Alistipes sp. HGB5 contig00043, whole genome shotgun sequence Length of sequence - 98666 bp Number of predicted genes - 72, with homology - 68 Number of transcription units - 35, operones - 16 average op.length - 3.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 41 - 3340 4535 ## BT_3239 hypothetical protein 2 1 Op 2 . + CDS 3362 - 4963 1957 ## BT_3238 hypothetical protein 3 1 Op 3 . + CDS 4994 - 5884 993 ## BT_3242 hypothetical protein 4 1 Op 4 . + CDS 5897 - 7696 1765 ## BT_3236 hypothetical protein + Term 7725 - 7768 12.6 + Prom 7907 - 7966 4.3 5 2 Op 1 . 
+ CDS 8095 - 10143 2249 ## COG1404 Subtilisin-like serine proteases 6 2 Op 2 . + CDS 10162 - 10629 681 ## gi|313157370|gb|EFR56793.1| putative lipoprotein + Term 10657 - 10696 8.4 + Prom 10739 - 10798 4.1 7 3 Op 1 . + CDS 10881 - 12068 1539 ## COG0426 Uncharacterized flavoproteins 8 3 Op 2 . + CDS 12095 - 13408 1953 ## COG1004 Predicted UDP-glucose 6-dehydrogenase + Term 13470 - 13515 14.3 - Term 13458 - 13503 15.1 9 4 Op 1 . - CDS 13521 - 13970 753 ## COG2731 Beta-galactosidase, beta subunit 10 4 Op 2 . - CDS 13983 - 15134 1433 ## BT_1184 hypothetical protein 11 4 Op 3 . - CDS 15139 - 15747 1010 ## COG0705 Uncharacterized membrane protein (homolog of Drosophila rhomboid) 12 5 Op 1 . - CDS 15943 - 17982 2645 ## COG0323 DNA mismatch repair enzyme (predicted ATPase) 13 5 Op 2 . - CDS 17983 - 18432 574 ## gi|313157380|gb|EFR56803.1| hypothetical protein HMPREF9720_1969 - Prom 18459 - 18518 2.7 14 6 Op 1 5/0.000 - CDS 18640 - 19806 1561 ## COG0484 DnaJ-class molecular chaperone with C-terminal Zn finger domain 15 6 Op 2 . - CDS 19811 - 20410 834 ## COG0576 Molecular chaperone GrpE (heat shock protein) 16 6 Op 3 . - CDS 20477 - 21187 828 ## COG0313 Predicted methyltransferases 17 6 Op 4 . - CDS 21166 - 21675 581 ## gi|313157394|gb|EFR56817.1| hypothetical protein HMPREF9720_1973 18 6 Op 5 . - CDS 21680 - 24058 3239 ## COG0787 Alanine racemase 19 6 Op 6 . - CDS 24141 - 26630 3354 ## COG1193 Mismatch repair ATPase (MutS family) 20 6 Op 7 . - CDS 26652 - 27653 1649 ## COG1463 ABC-type transport system involved in resistance to organic solvents, periplasmic component 21 6 Op 8 . - CDS 27667 - 28956 1555 ## COG0860 N-acetylmuramoyl-L-alanine amidase - Prom 28982 - 29041 2.6 + Prom 28936 - 28995 2.7 22 7 Tu 1 . + CDS 29097 - 31961 4241 ## BDI_3715 hypothetical protein + Term 31991 - 32034 14.0 - Term 31979 - 32022 14.0 23 8 Tu 1 . - CDS 32066 - 32938 869 ## COG0778 Nitroreductase + Prom 32898 - 32957 2.8 24 9 Tu 1 . 
+ CDS 32986 - 33429 564 ## COG2153 Predicted acyltransferase + Term 33439 - 33485 8.0 - Term 33422 - 33474 16.1 25 10 Tu 1 . - CDS 33484 - 34728 1808 ## COG0477 Permeases of the major facilitator superfamily - Prom 34881 - 34940 74.1 + TRNA 34867 - 34937 49.5 # Gln CTG 0 0 26 11 Tu 1 . - CDS 35308 - 36234 915 ## BDI_1719 hypothetical protein - Prom 36368 - 36427 6.1 - Term 36402 - 36444 11.2 27 12 Op 1 . - CDS 36467 - 38587 3397 ## COG0339 Zn-dependent oligopeptidases 28 12 Op 2 . - CDS 38624 - 40795 3518 ## COG0339 Zn-dependent oligopeptidases - Prom 40872 - 40931 3.0 + Prom 40810 - 40869 2.6 29 13 Op 1 . + CDS 40891 - 41604 898 ## COG1385 Uncharacterized protein conserved in bacteria + Prom 41608 - 41667 1.9 30 13 Op 2 10/0.000 + CDS 41687 - 43672 2852 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 31 13 Op 3 . + CDS 43675 - 44589 1429 ## COG0650 Formate hydrogenlyase subunit 4 32 13 Op 4 . + CDS 44586 - 45215 1156 ## HRM2_38990 HyfE 33 13 Op 5 7/0.000 + CDS 45220 - 46620 2346 ## COG0651 Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit 34 13 Op 6 6/0.000 + CDS 46617 - 48128 2101 ## COG3261 Ni,Fe-hydrogenase III large subunit 35 13 Op 7 . + CDS 48140 - 48904 904 ## COG3260 Ni,Fe-hydrogenase III small subunit 36 13 Op 8 7/0.000 + CDS 48907 - 49449 703 ## COG2059 Chromate transport protein ChrA 37 13 Op 9 . + CDS 49446 - 50018 927 ## COG2059 Chromate transport protein ChrA + Term 50041 - 50078 6.1 - Term 50028 - 50066 10.1 38 14 Op 1 . - CDS 50089 - 50376 515 ## gi|313157409|gb|EFR56832.1| conserved domain protein 39 14 Op 2 . - CDS 50389 - 51417 1162 ## COG1092 Predicted SAM-dependent methyltransferases 40 14 Op 3 . - CDS 51430 - 54483 4141 ## COG3250 Beta-galactosidase/beta-glucuronidase 41 14 Op 4 . - CDS 54493 - 56322 2082 ## COG0826 Collagenase and related proteases + Prom 56698 - 56757 2.5 42 15 Op 1 . + CDS 56778 - 58049 2306 ## COG0172 Seryl-tRNA synthetase 43 15 Op 2 . 
+ CDS 58101 - 58268 182 ## BT_2414 ferredoxin + Term 58289 - 58335 17.5 - Term 58279 - 58320 13.1 44 16 Tu 1 . - CDS 58346 - 58762 129 ## gi|313157413|gb|EFR56836.1| hypothetical protein HMPREF9720_2002 - Prom 58942 - 59001 4.8 - Term 59415 - 59455 10.6 45 17 Tu 1 . - CDS 59475 - 61556 -98 ## gi|313157412|gb|EFR56835.1| hypothetical protein HMPREF9720_2003 46 18 Tu 1 . + CDS 61593 - 61832 139 ## - Term 62193 - 62231 8.5 47 19 Tu 1 . - CDS 62254 - 65256 3073 ## Poras_0733 hypothetical protein - Prom 65480 - 65539 4.8 - Term 65494 - 65535 -0.8 48 20 Op 1 . - CDS 65546 - 66883 1235 ## COG3876 Uncharacterized protein conserved in bacteria 49 20 Op 2 . - CDS 66925 - 68037 1922 ## COG1186 Protein chain release factor B 50 20 Op 3 . - CDS 68069 - 68704 855 ## BF3595 hypothetical protein 51 20 Op 4 . - CDS 68714 - 70120 2196 ## COG0534 Na+-driven multidrug efflux pump - Term 70271 - 70324 3.3 52 21 Tu 1 . - CDS 70340 - 71815 2271 ## COG0215 Cysteinyl-tRNA synthetase - Prom 71841 - 71900 3.3 + Prom 71766 - 71825 3.1 53 22 Tu 1 . + CDS 72005 - 73426 1537 ## COG0513 Superfamily II DNA and RNA helicases + Term 73489 - 73518 3.5 - Term 73476 - 73506 3.7 54 23 Op 1 . - CDS 73550 - 74203 754 ## COG2173 D-alanyl-D-alanine dipeptidase - Term 74355 - 74397 1.7 55 23 Op 2 . - CDS 74417 - 76459 1892 ## COG3669 Alpha-L-fucosidase - Prom 76491 - 76550 5.8 56 24 Tu 1 . + CDS 76760 - 79990 5077 ## Palpr_3031 hypothetical protein + Term 80010 - 80058 19.3 - Term 79998 - 80046 19.4 57 25 Tu 1 . - CDS 80100 - 80711 693 ## BT_3525 hypothetical protein 58 26 Op 1 . + CDS 80710 - 81111 516 ## 59 26 Op 2 . + CDS 81124 - 82902 2513 ## BF0762 hypothetical protein 60 27 Tu 1 . + CDS 83156 - 84943 2443 ## BT_1768 beta-galactosidase - Term 85238 - 85292 14.4 61 28 Tu 1 . - CDS 85310 - 86986 2451 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) - Prom 87068 - 87127 3.6 + Prom 86952 - 87011 3.3 62 29 Tu 1 . + CDS 87086 - 87994 1134 ## COG1600 Uncharacterized Fe-S protein 63 30 Op 1 . 
- CDS 88123 - 88284 127 ## 64 30 Op 2 . - CDS 88297 - 89256 1660 ## gi|313157353|gb|EFR56776.1| tetratricopeptide repeat protein 65 30 Op 3 . - CDS 89350 - 89766 471 ## COG0590 Cytosine/adenosine deaminases - Prom 89886 - 89945 2.4 + Prom 89812 - 89871 2.7 66 31 Tu 1 . + CDS 89945 - 91702 2666 ## COG0173 Aspartyl-tRNA synthetase + Term 91729 - 91768 9.5 67 32 Op 1 . + CDS 91842 - 93116 1949 ## COG2256 ATPase related to the helicase subunit of the Holliday junction resolvase 68 32 Op 2 . + CDS 93117 - 93611 721 ## COG0622 Predicted phosphoesterase + Term 93624 - 93669 13.9 - Term 93615 - 93652 7.1 69 33 Tu 1 . - CDS 93681 - 94454 532 ## gi|313157368|gb|EFR56791.1| hypothetical protein HMPREF9720_2026 - Prom 94510 - 94569 6.1 + Prom 94473 - 94532 8.8 70 34 Op 1 . + CDS 94755 - 94862 78 ## 71 34 Op 2 . + CDS 94940 - 95983 1512 ## Pedsa_3099 hypothetical protein + Term 96003 - 96045 9.3 + Prom 96090 - 96149 1.6 72 35 Tu 1 . + CDS 96169 - 98622 1652 ## PROTEIN SUPPORTED gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 Predicted protein(s) >gi|313157345|gb|AENZ01000069.1| GENE 1 41 - 3340 4535 1099 aa, chain + ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 50 1099 1 1059 1059 1401 66.0 0 MVKKILLSLIGVLVFAAGAFAQAKQVRGSVVDENGAAVIGASVIVKGTTIGVSTDQQGFF VMNNVPAKSEELQISYLGYVTQDVKIRPNMQITLVPDEQTIESVVVTGMTKTDKRLFTGA ADRLSADDVKLSGMADISRGLEGRSAGVSVQNVSGTFGTAPKIRVRGATSIFGSSKPLWV VDGVIMEDVIDIDADALSSGDATTLISSAIAGLNADDIESFSILKDGSATSIYGARAMAG VIVVTTKKGRAGESRISYTGDYTFRMTPRYADFNIMNSQDQMSVYQEMAAKGWLNNSTIA NASSSGVYGKMWELVNTGKLENTEAARNRYLRKAEYRNTDWFKELFRSTLMHTHSISLSS GTEKSQYYASMSAMFDPGWTKASEVERYTANLNATYNIFDNLTLNLISSGSYRKQTAPGT LAATTDVVFGEVKRDFDINPYSYAMNSSRTLDPDEFYTRNYAPFNILDELGKNYIDLNVV DTKFQAELRYKPVTGLELSALAAIKYSATTQEHNITDYSNQARAYRAMATTVIRDNNPFL YTDPDNAYAVPISILPKGGIYKRRDNNMLSYDFRFAASYNTEINNQHIINLYGGMEVNQS DRHESWFNGWGMQYSMGLIPFYAYEVFKKGKEENTDYFGLADTHYRNVAFFFNGTYSWNG 
RYTLNGTFRYEGTNRMGKSRSARWLPTWNVAGAWNVHEENFFEALRPALSHLSLKASYSL TADRGPSFVTNSRVVITSNTPWRPSAGVTESGLTIYDLENSELTYEKKNEVNVGIDMGFL NNRISLAADWYKRDNHDLIGVINTQGIGGQVMKYGNVAGLKSSGVELTLSTRNIETKNFS WSTDFIYSHLKEEVTDLMSYKRAIDLVSGTGFGMEGYPSRSIFSFDFQGLNDEGIPQFIN EKGQLTTTSINFQERNNLGHLKYSGTADPTDFGSFGNMFRLYGFKLNVFITYSFGNVIRL DPVFSESYSDLLSMPREFRNRWMRPGDENHTDIPVIANKRQVKKDNYLSRAYNAYNYSTA RIAKGDFIRMKEISLTYDFPKRWIKKLRFSNMSLKLQATNLFLIYADKKLNGQDPEFFNT GGVAVPVPKQFTLTLKIGL >gi|313157345|gb|AENZ01000069.1| GENE 2 3362 - 4963 1957 533 aa, chain + ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 533 2 517 519 382 43.0 1e-104 MNTKKITTLIVLAAALVALPACNDFLDEMPDNRTELDSSDKITSLLVSAYSEHTYPVTCE YASDNVDETALVSPNFEPEQEEYYRWQDVTAAVTNEAPQVVWSQYYMAIAAANQALDAIK ELGGADTPQLKAAKGEALICRAYAHFCLANMFCQAYNPQYASEDLGIPYMEKAETILNPK YTRGTLADVYSKIDADLIEGLPLITDEFYSVPKYHFNQKAAYAFAARFYLFYQKWDECIA AAKVVLGSAPETMMRDLEANGKLGLQSADGQTLLRTMDYIDYSHKCNLLLQTSITDAPGI SFGPYNAYTGFSHGNWLDQTETFRAPTAPWGSSYTLYSAPYSMNTGKADKCFAWRAPYLF EIKDPVMQTGLAHTVFPAFTAEETLLCRAEAYIMKKEYGPALNDMNLWISKYVKSGAQTL TEASVNQWAQNTKEYTPANPTVKKPFGETAFPLEAGTQTNMCYALLHLRRVETVHLGLRW FDVKRYGITIYRRKLASYNTLGSVTDELHARDPRCAMQLPMDVVAAGLTPNPR >gi|313157345|gb|AENZ01000069.1| GENE 3 4994 - 5884 993 296 aa, chain + ## HITS:1 COG:no KEGG:BT_3242 NR:ns ## KEGG: BT_3242 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 293 1 299 304 250 44.0 5e-65 MKKNIILLMTAVFGVFVLAGCSEDDLKKTSAVLDSPTVENDFDRWLAKNYMEPYNIDFKY RFKDVLGDMDYNLTPADYGQAIKLAKLTLHLTLQVYDEVTGNRQFISENFPKIVHVIGSP AYNENNNIVLGVAEGGKMMTLYQVNIIDYLLSTKNIAQLNELFFKTMHHEFGHILHQTRP YSTDFNAVTPSSYVGDACFDTYRTDAAARQAGFITRYSSKAPDEDFVEQLSLYVTSTAAE WEAILAQGGSAGRPLLEQKNDIMRAYMLSTWDINIDELRKVVLRRQNEIWSLDYNI >gi|313157345|gb|AENZ01000069.1| GENE 4 5897 - 7696 1765 599 aa, chain + ## HITS:1 COG:no KEGG:BT_3236 NR:ns ## KEGG: BT_3236 # Name: not_defined # Def: hypothetical protein # Organism: 
B.thetaiotaomicron # Pathway: not_defined # 1 257 1 259 418 139 34.0 5e-31 MKKLIILLAVLPLVLSSCLKDDKDAFGKTASERIREVQEGTEKALTGAANGWLMEYYPSA TQEYGGIAIYMKFQNGTVTVTSEAGGAGKTAESLYSYDQDYGPTINFDTYNPYFHIYSEP NSGVGPTNTGMGGDSEFIIMSYAADKVVLRGKKTQNTIVLTPLPETAWNTMFGKYSADVT KMDNFRSYELVLNGKTYDISREVAAGYNSRYFSVVTENGTIPASFIYTQDTGLKFYEPLV VDGVSIEEMTWSGGMFTDEMSGARIQETVSNNKFTFAVSDIKATSVNVETTPTVANEYYY FDVVPSSAFASASDEEILYELMDGITSMNQLSLGKMMKTLTVDSDTEYIPCAFGLKIVNG WIYPITELAKGTAFMSGQGDPMSPEYQAWLGTWTVTSTSSEKSKQPITFDITLAKKVANT SYSLTGWDISVRRLTWAGTADFKATNNSFEIINGQELPYSDPNGTPYLIARGIAASGTYI VSGSYPALTGVMNSDGTASVSCYKGKFSDGEEFTVGTVGYFIDTGGSYGSYGPAAGFTSG DHPVGPFKMVKKSAAGVPAKAAALTNVNTVIRYHKLSDIPQMNLSSFQIGINVKAAKLE >gi|313157345|gb|AENZ01000069.1| GENE 5 8095 - 10143 2249 682 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. PCC 7120 # 205 514 37 297 416 134 36.0 4e-31 MNNRKLLIFASFLLLAGCTTDPDTDNNAGGGTSAQTPSAKIVNTSADAAAETLLVYFNDR AVETIESTAAATRTAATRSGVASVDDVLSRLEIVSLERLFTYDARNEEQTRAAGLHKWYI LTFGQGADLEKAARELAGVAEVSRIQFDTKLQKASVGNPMPFRIDETGTTRADFSGSGFN DPGLPNQWHYSNNGDKMFAATTAAGADINVPEAWKLTGGSPSIIVAIVDEGVKYTHPDLA DNMWVNKAELNGAANTDNDGNGYKNDVHGYNFATNSSNLTWSVSHYDNKGKYDGDSGHGT HVAGTVAAVNNNGKGVCGVAGGTGSNDGVKLMSCQIFSGGKGGTISTSVRAIKYAADNGA SIIQCSWGYPTQSPFTSDSYYERNYAAEKEAIDYFVSKKNNDVLDGGLAIFAAGNDATNF ASYPGGYRSYVSVTSFGPDYLPAYYTNYGPGCNVAAPGGDASISPAGTSAAQVLSTLPSE LPAELGTDGADYGYMQGTSMACPHVSGVAALGLSYALAKGRHYTVDEFKSMLLTAVNDID SYIIGGTKNGMVLRNYRKNMGTGSIDAYQLLMQIEGTPCLPAKVGVQQVLSLDKFFGQAS ANLTYKSAEMSQADMDKLGITAAPAFSYGKLQIKCTKPGVAKIKIKAMAGSSSANGEMSG MEITKEFAIIARAVQAGNGGWL >gi|313157345|gb|AENZ01000069.1| GENE 6 10162 - 10629 681 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157370|gb|EFR56793.1| ## NR: gi|313157370|gb|EFR56793.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 155 4 158 158 288 100.0 7e-77 MRKTIYIVLAALGLCLFAGCGSSDKEKVSPTAKQLVGEWQLKTWTGETPQDFDIYLSFGA DNTFEIYQRLAEVKYQKFTGSYQVQNDVLSGKYSGDKNFGSTYDISFNESGSTLTLTSAT NVTEVSVYERSTIPNSVKEGAAVMKSTRSAVRLAL >gi|313157345|gb|AENZ01000069.1| GENE 7 10881 - 12068 1539 395 aa, chain + ## HITS:1 COG:FN0512 KEGG:ns NR:ns ## COG: FN0512 COG0426 # Protein_GI_number: 19703847 # Func_class: C Energy production and conversion # Function: Uncharacterized flavoproteins # Organism: Fusobacterium nucleatum # 4 393 5 398 403 332 40.0 8e-91 MNITEILPGIHYVGVNDRTTTRFEGLWSLPHGVSYNAYLVADEKVALIDTVEEAFGSRLF ENIRETIGDRPIDYLVVNHMEPDHSSSIRALRLLYPDIRIVGNAKTLQMIQGFYGIDTGT LEVKEGDTISLGAKTLSFYMAPMVHWPETMVTWCAEAGTLFSGDAFGTFGAIDGGITDSQ LDPSRYWDEMRRYYACIVGKYGGPVQKALAKVRGLNPTTICSTHGPVWQRQIPQVMDIYD RLSRYEGEPGVVLAYGSMYGNTEQMAERIARELASRGVGPIVMYNLSFADESVVLRDVFR YDTLIVGAPTYNGGIYPPAARLLDLIAARCVPQRNFGWFGSFCWAGAAVRGMGEFAQRMK WEPICDPVEMKQGFSSGCHDFCSRFADGVAERFRR >gi|313157345|gb|AENZ01000069.1| GENE 8 12095 - 13408 1953 437 aa, chain + ## HITS:1 COG:XF1606 KEGG:ns NR:ns ## COG: XF1606 COG1004 # Protein_GI_number: 15838207 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted UDP-glucose 6-dehydrogenase # Organism: Xylella fastidiosa 9a5c # 1 437 1 444 450 496 56.0 1e-140 MKIAVVGTGYVGLVSGACFAEMGLDITCVDIDKKKIDGLNNGVIPIYEPGLEEVVRRNVE EGRLHFTTELTDCLDNVEVVFSAVGTPPDEDGSADLKYVLEVARTFGRNIKKYTVLVTKS TVPVGTAQKVKAVIREELARRKCKVPFDVASNPEFLKEGAAVKDFMSPDRVVVGVESERA RKLMTKLYRPFLINNFRVLFMDIPSAEMTKYAANAMLATRISFMNEIANLCDLVGADVEM VRKGIGSDARIGSKFLYPGCGYGGSCFPKDVKALAHTAHEHGYTMQVIEAVERVNERQKS IVFDKLQLALGDLRGKTVAVWGLAFKPETDDIREAPALVVIDRLLEAGATVRAYDPVAAN ESRRRLGEERVYHAASMYEAAEGADAVALITEWKQFRMPDWTLLRSAMRDNIIVDGRNIL DKEEALNAGFRYYGIGK >gi|313157345|gb|AENZ01000069.1| GENE 9 13521 - 13970 753 149 aa, chain - ## HITS:1 COG:CAC0836 KEGG:ns NR:ns ## COG: CAC0836 COG2731 # Protein_GI_number: 15894123 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase, beta subunit # Organism: Clostridium 
acetobutylicum # 1 149 1 151 152 87 32.0 1e-17 MIFDSLKNSALYYSLNPRLEKAFGFIASTDWEKMEPGIHELDGKDIYVNVMDTELKKPAD AKLEIHNAYLDIQVLVRGERETFGWSERAVLSKPLGGFDAEKDIQLFDDEPQTCYTIRPG QFTILMPEDGHAPMVGEGSIRKIIVKVRM >gi|313157345|gb|AENZ01000069.1| GENE 10 13983 - 15134 1433 383 aa, chain - ## HITS:1 COG:no KEGG:BT_1184 NR:ns ## KEGG: BT_1184 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 22 377 4 369 370 182 32.0 2e-44 MSETYYSGYDSDRGRKPRRSLLLRLLDLLMTLLTAVTAVTMVVTFFVPYVDPGRVWFFPV LGLAAPAVYVVTVVLALYWIIRWRLVRAGIMVALVVVGLFKVSLFYRPEFRRNYGDERYD RRAFKVMTYNVRGFYGENGLSSVGDVLQLIGDHNPDIICLQEFNARLAGQSDEFALLDEK YESAVFGRTQAPDSLYGAPLVILSKYRILRSGVVLTPGTSVWADLLIGDDTIRVFNNHLR STAIKAADNDYLTSRGFLSDTAREDKLRSMAGRFRENSVLRAAQVDSIAVVVEAWRSRCI VCGDFNDTPMSYVYRTMAKGLNDAFSQCGSGYSHTFRGFFNTLRIDYVLSSEGFDALSYE VPTVDYSDHHPVVVRLRKNAMNN >gi|313157345|gb|AENZ01000069.1| GENE 11 15139 - 15747 1010 202 aa, chain - ## HITS:1 COG:XF0649 KEGG:ns NR:ns ## COG: XF0649 COG0705 # Protein_GI_number: 15837251 # Func_class: R General function prediction only # Function: Uncharacterized membrane protein (homolog of Drosophila rhomboid) # Organism: Xylella fastidiosa 9a5c # 7 195 8 203 224 146 45.0 2e-35 MNRNYFQTPPVVLNLIIINVLIFMATALLPKAGNAIMEYCALSLGTPFFHVYQFITYMFL HANFEHIFFNMFALWMFGRTLEYELGQKRFLTYYMVCGIGAALIQYLTALAFGEFPLVLV GASGAVMGLLLAFGVLHPNAVIMLLIPPIPMKAKWFVVIYGVIELFLGWRGVGNVAHFAH VGGMLWGFLLLQWWKQRGIIRF >gi|313157345|gb|AENZ01000069.1| GENE 12 15943 - 17982 2645 679 aa, chain - ## HITS:1 COG:NMA1655 KEGG:ns NR:ns ## COG: NMA1655 COG0323 # Protein_GI_number: 15794549 # Func_class: L Replication, recombination and repair # Function: DNA mismatch repair enzyme (predicted ATPase) # Organism: Neisseria meningitidis Z2491 # 4 676 3 652 658 271 29.0 2e-72 MADKIRLLPEVVANQIAAGEVVNRPASVVKEMMENAIDAGARTVKVNFRDGGKDLIQIVD DGCGMSPIDARMAFDRHATSKITSVDDIYALNTFGFRGEALASIAAVAQVELRTRQEGDE VGTQTEINGGQFAGQTPVMCPVGSQFFVRNLFYNVPARRRFLDKSTTSAAQIKSEFQRVA 
LCNPQIAFELYANDAPVYTLGAASLAGRIVDVVGRHIKQNLLEVEADTSIARIEGYIGRP AAAKRRNGEQYLFVNGRYFKSSYLTSAIMKAYEKLIPESSSPSYFLYLTIDPSRIDVNVH PQKTEVKFADEEAVWQIVNAAVRETLAKTGAVPLMDFDREGIVEIPVLTKGAVYSEPRAM SNSEYNPFREEYIDPSAPDPNVDFAGFDVPFSDNGQTLSDNAGAGFAPRGGGRGSGVALP GGLRSAGGGGFPAAGSGADEFSIPSAADDAFEDFVSGGGFEVTSESGFDASELEFIPSEA TVEQQRLDVDGRPEFTDPLPLGGGYVAALLGGRFVVVDVRRARERILYEDYLKMLGSGSS VSQQLLFPERLVLPDSEYALLEENAVEFASLGFDLDFQGDCAVEVKGTPADMPADSVDQL LYELLQAFSTPVSLADVRREKIAAVMARGAAKQTARLMSRDEAAALLARLAASGNFSFSP SGKAITAEITVEDIRAKLG >gi|313157345|gb|AENZ01000069.1| GENE 13 17983 - 18432 574 149 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157380|gb|EFR56803.1| ## NR: gi|313157380|gb|EFR56803.1| hypothetical protein HMPREF9720_1969 [Alistipes sp. HGB5] # 1 149 60 208 208 274 100.0 1e-72 MLGFTPFKKHANKFNYIPRYYDPEKEAREQRRAELRGERAEDAGGEYRPGQYIRTQRDAR AARRSKEEKTGRDRIWKMVVGAVLVLLFIYILYPRLADVFLRARTSAPAQTEAAGAEETV RRNGLDQSGISDVEWQEQPITVVPNDYQE >gi|313157345|gb|AENZ01000069.1| GENE 14 18640 - 19806 1561 388 aa, chain - ## HITS:1 COG:ECs0015 KEGG:ns NR:ns ## COG: ECs0015 COG0484 # Protein_GI_number: 15829269 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: DnaJ-class molecular chaperone with C-terminal Zn finger domain # Organism: Escherichia coli O157:H7 # 4 387 3 371 376 282 43.0 9e-76 MAEKRDYYEVLGVQKNANADEIKKAYRKAAIQYHPDKNPGDKQAEEKFKEAAEAYDVLSN PDKRARYDQFGHAGMSGAAGGGGGFGGFSGGGFSMEDIFSQFGDIFGGHFSGGGFRSSSS GGRRVNRGSDIRIKVRLTLAEIANGVTKKLKINKTVACDKCGGTGAKDADSYSTCSTCNG TGHVTRVENTFFGRMQTQSVCPTCGGTGKVITSPCDKCKGEGTVRGSEVVEIKIPAGVGE GMMLTVTGKGNAARHGGVNGDLQVLIEEEPHPELMRDGNDLIHNLNITVTLALLGGTVEV PTVDGRAKIKIAPGTHAGKVLRLGGKGLPDVNGYGRGDELVVVDITIPSKLNAEEKRLVE ELSRQPGFQKADSVKNQNIFERMKSFFR >gi|313157345|gb|AENZ01000069.1| GENE 15 19811 - 20410 834 199 aa, chain - ## HITS:1 COG:alr2445 KEGG:ns NR:ns ## COG: alr2445 COG0576 # Protein_GI_number: 17229937 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone GrpE (heat shock 
protein) # Organism: Nostoc sp. PCC 7120 # 43 198 73 229 248 81 32.0 1e-15 MKKEKAYNEAINGDEGFEPGDGYPSCDGEPCANVSDEPQDGTDTMADATDSGATPDLAAA VAEWQDKYIRLQAEFDNYRKRTLKEKMDLVQTGGRDVLLAMLPVRDDVQRAVDAMQKSDD IEALRAGVNLISQKFTETLRQKGVTEIDVKGREFDADLCEAVAKFAAGEDMQGKVVDVVQ TGYMLGDKVLRFAKVVVGE >gi|313157345|gb|AENZ01000069.1| GENE 16 20477 - 21187 828 236 aa, chain - ## HITS:1 COG:RSc1045 KEGG:ns NR:ns ## COG: RSc1045 COG0313 # Protein_GI_number: 17545764 # Func_class: R General function prediction only # Function: Predicted methyltransferases # Organism: Ralstonia solanacearum # 1 233 1 238 243 208 51.0 9e-54 MYGTLYLIPCPISDETAPWDVLPAANRAVMDSLDYFIVENTRTARRFLSKAGAARPIDTL CFRELNEHTVAGREVEELVKPLVEGRSAGVISEAGVPGVADPGALVVEACHRRGIRVVPL VGPSSIVMAVMASGLNGQSFAFNGYLPVKPPERVQAIRRFERRAHAEGQSQLFIEAPYRN VKLMEQLLQTLAPETRLTVAMDITAPGEFIRTLRVREWRAAELPAMNKRPAIFIIG >gi|313157345|gb|AENZ01000069.1| GENE 17 21166 - 21675 581 169 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157394|gb|EFR56817.1| ## NR: gi|313157394|gb|EFR56817.1| hypothetical protein HMPREF9720_1973 [Alistipes sp. 
HGB5] # 1 169 1 169 169 263 100.0 4e-69 MKCYDCGGLIAPGADKCPACNCPAERMQAVKCLETRLLTARVEAESALDQLGRAKVAMLC AAFFALVGGVVVLVHAGGDATMRAVGIFMAALACVYAALAFLVRKAPLTLSIAGFLLSWL CLGGLPGLVIVGAMALSLWFAVQYVRIANRVGLLERKIAELKKCTEHSI >gi|313157345|gb|AENZ01000069.1| GENE 18 21680 - 24058 3239 792 aa, chain - ## HITS:1 COG:CAC0492 KEGG:ns NR:ns ## COG: CAC0492 COG0787 # Protein_GI_number: 15893783 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Alanine racemase # Organism: Clostridium acetobutylicum # 432 791 11 375 386 229 36.0 1e-59 MNYRLSRIAAICGGKFSGCDSEVRSVVTDSRSLSCELGCRPMFVAMRGANHDSHDFIADM QSRGVRAFLVERGVEASSCTGECGFVQVDNAIEALQRLAAYHRAQFRGTVVGITGSNGKT VIKEWIAEELPEGMKYYRSPKSYNSQLGVPLSVLMLEGDEELAIFEAGISKPGEMERLER IIRPDAVIFTSIGDAHQENFINLEQKCDEKMILARSASKIIYHSYYEPLGRMVAAHFADR RPVDAASYPEVPESVIGNAASRRNAQIVEAFCDAMRYPAPSFAGAPSVAMRLEVKEGIND SILINDAYNLDLNSLALALDYLHSVALSRRRTLVLSDISQSGLSDDELYGRVAGMVARAG IDFLIGIGPRLKRYAALFACDKEFFTATDECVARINRDAVAGRAILLKGARDFRFEKLVH LLSRKSHTTVLEVDLDAMIHNLNYFRSKLSFGTRLVAMVKAGSYGAGDFEVAQMLQHQGV DYLAVAFADEGVLLRERGISMPIVVLNADADSFDVMIANRLEPEIYSFHSLGAFADAVTH AGESRYPIHVKLDTGMHRLGFVEQEIAQLCATLAATPQVKAASVFSHLNCADMPEEDAYT RAQIALYDRMSAQLAASLPYPVIRHTANSAAIERFPEAQFDMCRLGLGLYGFGFRHNDAL RPVSTLKTRIVQIKRLPAGDAVGYGRAGKLTRPTTTATIPIGYADGLDRHLGCGRWSVLV AGQPAPIIGRVCMDSCMIDITDIPGVKEGDEVSVFSPVPGNDLETMARVLDTISYEIMTS VSGRVKRIYLKE >gi|313157345|gb|AENZ01000069.1| GENE 19 24141 - 26630 3354 829 aa, chain - ## HITS:1 COG:BH3106 KEGG:ns NR:ns ## COG: BH3106 COG1193 # Protein_GI_number: 15615668 # Func_class: L Replication, recombination and repair # Function: Mismatch repair ATPase (MutS family) # Organism: Bacillus halodurans # 13 827 10 785 785 372 31.0 1e-102 MIYPANFEQKIGFDRLREQVAARCTMRAARARLAGETFSTSAREIARRLTLADEMRLLLD MEHEFPGGEYPDVDHIVAKLRVEGSFLDVEEVVTLHRALTVVGGIVAFILNREEQYPALH ARSRGVAAFPEIVQRIDAIVDRFGNVKDNASPGLLEIRCAVREREGQAAKRLQAVLSAAK NAGIVDADAQISIREGKAVIPVAAANKRKLQGFIHDESATGRTFYVEPVEVVEINNELRE LEYAERREIVRILSEFTDSVRPDAELIADSGDYLAEIDMLRAKGRWASENGCVKPIVSTD 
DRLVLKNARHPLLQQTLRAQGREVVPLDMQLDRRKHILVISGPNAGGKSVCLKTTGIVQY MFQCGFLVPASEVSELPVFRSIFIDIGDEQSIDDDLSTYSSHLLNMKNMLAGASSATLVL IDEFGSGTEPVIGGAIAEAILERLLAKGCYGVITTHYANIKYYASNAEGIANGAMMFDVQ HIRPLFRLETGKPGSSFAVEIARKIGLPEEIIRIAGEKAGSDHINIEKQLREIARDKHYW EQKRDRIRLTDRKVEELEQTYAEQLAKIKSERQEILKKAKQEAQRLIADANKQIENTIRT IREAQAEKELTRLARRELDDFRDAVEQADTAEKDAQVAREIERIERRRQRRIERKTQRGE AQAAAAPEPPKPREVETGSKVRMAGQDMVGVVQSVKGKKAQVAFGQILTTVDKKLLTVVS NSEFREATRPQTARTVVSVDISARKLNFRDHIDVRGMRAAEALDAVQDFVDDALMVGVGS VSILHGKGTGALKEEIRRYLRTVPEVESAKDEHADRGGAGITIVTFKMD >gi|313157345|gb|AENZ01000069.1| GENE 20 26652 - 27653 1649 333 aa, chain - ## HITS:1 COG:MT2021 KEGG:ns NR:ns ## COG: MT2021 COG1463 # Protein_GI_number: 15841444 # Func_class: Q Secondary metabolites biosynthesis, transport and catabolism # Function: ABC-type transport system involved in resistance to organic solvents, periplasmic component # Organism: Mycobacterium tuberculosis CDC1551 # 9 296 11 260 423 63 23.0 4e-10 MKREVKIGIFAVAMIGVAWAGIRFLKGFDIFSRNVEYYAAYDQINGVQNASPIMMKGVKI GSVTGLSFDPARSDKVVLRFTIKREYRIPTDSEAKIFSNGLMGAKAIEITYGTADTYLHK GDTLRSSRDRDLMDVAGSELDFFKQKVSQLTGDLSRALDNLNRLMEDNAGNIAGTLGNLN SVTGDMAQILSAEKNNLKSALENLSKFSDMLGDNAQRVDSIIGNVDRVVAQFSEEEFAVK LSQTVGHLDDLVARIAQGEGTMGKLMNDPELYDSLNRASENLSALLADVKQYPGRYVHLS LFGRNPEKMKERADRKAAKAAEKAERDSLRRLR >gi|313157345|gb|AENZ01000069.1| GENE 21 27667 - 28956 1555 429 aa, chain - ## HITS:1 COG:DR1387 KEGG:ns NR:ns ## COG: DR1387 COG0860 # Protein_GI_number: 15806404 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: N-acetylmuramoyl-L-alanine amidase # Organism: Deinococcus radiodurans # 33 250 149 364 371 97 35.0 6e-20 MRTRLFLLFAFAAVTVLTNAASAGNIAHGVEVVVIDPGHGGNDPGAHYGGTSEKDLTLKV ALKLGKMIEEGMPGVKVVYTRKTDKALGLDKAKDLQARADIANKAGGDLFISIHVNAAKS TAARGVETLIMGESPKEQRYNENALFENNREDLIDMSDERTAAIVRAYIQNLQFTYGEYS MAIARCIQNNYLKAGRHSRGIKPQLLRVLYATDMPGVLTEIGFLSNSQEAAYMKSEKGQT EIARSIYEGVKNYSAYVLETRNAEEEAAAAPKPEQKPEQVVVGKEPVRGAGGKAGAKDSA 
AKSESVTAPAERETSETQTPKTETPKAGAPKRETPKAETPKQPAPKQDAQPLRYTVQVLA SPQTVPASSERFKSYRGKVKQYTADGKYRYKYCVGEYETRAAAQKQLAAVRKVFPDAFVV SCRGTRIVK >gi|313157345|gb|AENZ01000069.1| GENE 22 29097 - 31961 4241 954 aa, chain + ## HITS:1 COG:no KEGG:BDI_3715 NR:ns ## KEGG: BDI_3715 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 99 951 34 909 914 613 40.0 1e-173 MDKLRVKYLFSAALILLFAQLVFGASAPRSVFAAAAPDSTRTHDSIPAYTLPEAAAEVPE AVRTRSERRAQARAAREKARRHAEFEAMSQEERDSMFSSQVDSLIARRADSLAGRQADTV ARDSVKRPRKPGAFLDDPISGKNKDSLVYDVRNKMVYIYNEGDVTYQNSNLKADYMQIDM NSKLIHAYGKPDSLEGKAIVTKPEFSDGGEAPYKMDTITYNLESKKAKIKGVATQQGDGW LVGGSVKKMPDNTINIENGKYTTCDETDHPHFYLAMTKAKVIPGKKVVTGPAYLVMEDVP IYFLGIPEGFFPISSGPKSGLLMPTYGEEYTKGFFLRDLGYYFTLGDYADLAVRGGIYTL GSWEASAASRYIKRYKYSGSFNMQYSNIKTGEKGEPDYIKQSNFRIQWTHSQDPKANPGS TFSASVNFATSGYSKYSATNLNDILATQTNSSIAYSKNWAGTPFSLSANMSISQNSQNKT ISVTLPTLVFNVSRFYPFKRKEKTGKDRWYEKISMQYTGKMTNSVSTTEDKIFSKETLDN MKNGIEHNIPVSASFNLFNYINISPSANYNEKWYFKKVELEWNPVTNSVDTLPAQYGFYR LYNYNFSVSTSTTVYGMYDFTKKRRDRKIQAIRHTITPSIGFSYAPDFGDPKYGYQSNYQ SDSTGTFRPYSPYSVNAYGVPSSGRSMSMNFSLSQNLEMKVLSKRDTSGVKKIKLIDELR LSGSYNFLADSMGLSTIPISFRTTLFGNFGINLSATLNLYKLTPDGKLYDKLFLPGRIES TGWSFGYTFKSRQDRTERAINDITSIPPEYMNPFYDPYGNMDPVLRRQYMAQTYYDFSLP WNFGFNYTVNYSISRGNYPPKGYKKNITQTIGFNGSINLTPKTGVTFQGGYDIKQNKLTT SSVSISRDLHCWQMSFSWIPFGYHRSWSFNIGVKAASLSDLKYDKSQSMYDNMY >gi|313157345|gb|AENZ01000069.1| GENE 23 32066 - 32938 869 290 aa, chain - ## HITS:1 COG:CAC2933_2 KEGG:ns NR:ns ## COG: CAC2933_2 COG0778 # Protein_GI_number: 15896186 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 85 290 7 194 195 81 27.0 2e-15 MMENTIRVNDAICIRCGRCVKVCPSQIFVQEKAGAAVTLHKPENCIVCGHCAAACPTGAV EHADFPAEKVHPIDYAALPTPEQVELLLAVRRSNRAFSQRPVPQEYLDRIVAAADRAPTA TNARQLGYTLVTDPAMRREIVEYTLGVFGRIVKRLSNPLVRPWLSRLLPGVYRYIPAFRR MRREYAEQGVDRILRGATAVLFIHAPKESRFGAEDANLAYQNASLTAESLGVSQVYMGFV LTAVRQDGKGGLNRMLGLRDRRICAVMALGMPQFRYPNYIDRKEPEVRKI 
>gi|313157345|gb|AENZ01000069.1| GENE 24 32986 - 33429 564 147 aa, chain + ## HITS:1 COG:L9876 KEGG:ns NR:ns ## COG: L9876 COG2153 # Protein_GI_number: 15672965 # Func_class: R General function prediction only # Function: Predicted acyltransferase # Organism: Lactococcus lactis # 7 145 8 146 149 94 39.0 8e-20 MHPIVIKHFSEISAAEYHRIVQAREAVFFLEQHITEPDADEVDPQSVFLWMEEDGRVVAF LRIIPAGIVCAEASVGRVLVDAGYRRRGLCRRLMLEALRHIALTWGPQPVRISAQEYLAG FYASLGFEAVSGTYLEAGIPHVRMLRR >gi|313157345|gb|AENZ01000069.1| GENE 25 33484 - 34728 1808 414 aa, chain - ## HITS:1 COG:STM3113 KEGG:ns NR:ns ## COG: STM3113 COG0477 # Protein_GI_number: 16766414 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Salmonella typhimurium LT2 # 1 414 1 418 418 375 49.0 1e-103 MGIKFRLIVMNFLQYAIWGSWLISLGAYLGGGLDFTGLQIGSFFATMGIASLFMPAVMGI IADRWIPAQKLLGICHLISGAFLVAAASQTAYAPLYSCMLLSVMFYMPTIALSNSVAYNA LDLAKLDTVKHFPPIRVWGTVGFIAAMWFVDLTHIGGIQIKLTAWQLYVSAFLSFVLAVY SFSLPGCSVDRNVKSQSWIDTLGLRAFALFKEKRMAVFFIFSMLLGAALQITNAFGDTYI QNFGSMPQYADSAIVKHSVILLSLSQMSETFCILLIPFFLRRFGIKKVMLISMLAWVLRF GFFGIGNPGSGAAFLILSMVVYGVAFDFFNISGSLFVEKETSREIRSSAQGVFMIMTNGF GAFFGSYAAGAVVDAWGWPDSWYIFAGYALVVAVVFAFVFKYKHNPAEMASVNH >gi|313157345|gb|AENZ01000069.1| GENE 26 35308 - 36234 915 308 aa, chain - ## HITS:1 COG:no KEGG:BDI_1719 NR:ns ## KEGG: BDI_1719 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 294 28 319 319 230 39.0 7e-59 MTDKLNIGREAAYRRLRGEVPFTFGEAAVLSAQMNFSLDRAVGAVDFGNVLFRLSFNDYH TALEDYTGVIDQDTLFYREVSPDTDAEQAIAGNSFPRMLYMRYEGLTNFKLFKWLYQQGL VDCSTTKFEDMKVPDALLQSYRDLLKEAQLMPSTTLVFDGSCTKRWVNAICAFRNMHLIT DQSVEMLKGELLQLLDELEEIAVAGQFRSGKPVSIYISDMDIEATYCYVAARHYRASCIE AFSINSLRSADPGMFEHVKRWITSLSRFATLISCSGELQRIHFFKRQRAIVAELEREGLE PSPEVFQE >gi|313157345|gb|AENZ01000069.1| GENE 27 36467 - 38587 3397 706 aa, chain 
- ## HITS:1 COG:SMc04403 KEGG:ns NR:ns ## COG: SMc04403 COG0339 # Protein_GI_number: 15967061 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Sinorhizobium meliloti # 32 705 6 686 687 474 38.0 1e-133 MKKALIIMTFSAMAAGCADNSITKAQLPELDTTNPLLAEWNTPHQTPPFSQIELSDYEPA FDAAIACSRAEIEAIVNNPKKPTFGNTIVALERQGALLDRISGVFYNLLEAATSDQMQEI ALRVQPKLTELSNDVSLNPELFARVKAVYEKPGRGLSKEDRKLLEDTYQSFARNGAALSD ADKELYRKYTSELSAATLQFGQNALAATNAFTINITDPKVVAELPAFVREGMAADAKARG EKGWTVTLQAPSYVPFMTYSSNRALKEKLWRAYNSRALGGENDNTQIVKQIANLRLKIAN LLGYKCYADYVLERRMAENTPTVTAFLDELLGETKAYADKDYRMIGDYAASLGFKGQLMP WDWGYYSEKYKDEKYALNDEVVKPYLKLENVKKGVFMLANKLYGLNFTPNDKIEVYHPDV TAYDVTDENGRFMAVLYLDFFPRESKRSGAWMTEFRGTKIEDGEETRPLVSLVMNFTKPT ETAPSLLTFDELETFLHEFGHALHGMLGEGKYESQTGTSVYRDFVELPSQIMENWATEKE FLDLWAVNYQTGEPIPAEIVERIIAAQNYLAAYLNVRQLSFGLTDMAWHTITEPFEGDVE QFEVASMAPSLVLPVVPGTAMAPAFGHIFSGGYAAGYYGYKWAEVLEADAFSLFKEKGIF NREVSGSFRENILSKGGTEHPMELYVRFRGHKPETKALIEKMGLGK >gi|313157345|gb|AENZ01000069.1| GENE 28 38624 - 40795 3518 723 aa, chain - ## HITS:1 COG:mlr4139 KEGG:ns NR:ns ## COG: mlr4139 COG0339 # Protein_GI_number: 13473513 # Func_class: E Amino acid transport and metabolism # Function: Zn-dependent oligopeptidases # Organism: Mesorhizobium loti # 32 701 14 684 684 542 44.0 1e-153 MKRIAGLFTTLLLLMALASCNSPKSAGENPFFTQWDTPFGVPPFDKIRAEHFMPAFERGM SLHDAEIDAITSNNDEPTFENTILAYDNAGQMLAQTELIFGMLCAAETNERMQALQEQVM PLLSAHRDKIRLDEKLFARVKDVYDRRASLGLDAEQSRLLRKTYDAFVRSGALLDAGQKA RLKEINGELSLTAVKFGNNILAENNNFALELTADDLEGLPSGVRDAAREKAQAMGKGDKW VFTLHKPSLIPFLTYSAKRDLREKIYKAYLNRCNNGDQYDNKQLINDFIRLRTEKAKMLG YPSYAAYVVADEMAGTTDAVYALLDQIWTPALDRAGEELAEMDKLLQKDVPGATFESWDW WYYAEKLRKQNYALDEEMLRPYFSLENVQGGIFYLANRLYGITFQPIVAPLYHPEAIAYE VLDADQSHLGVLYFDFFPRDGKSQGAWCGNYVEQTYRDGERVAPVVSIVCNFTRPVRNTP ALLTLDETETLFHEFGHALHFLFHDVKYRGLTEVEGDFVELPSQIMENWAFEPEMLKQYA VHHRTGEVIPKYLVDKLRRSELFNQGFATTELIAASLSDMDIHSTTDYEPFDPMVFERRA LNEKRGLIPQIEPRYRYPYFSHIFDGGYSAGYYFYTWAEVLDKDAFAAFRESGDLFNKKI 
AADFRTKILARGGSEDGMTLYRAFRGADPDKRAMLRGRGLLAEPDPQPDEETPIPLEPES DGQ >gi|313157345|gb|AENZ01000069.1| GENE 29 40891 - 41604 898 237 aa, chain + ## HITS:1 COG:HI0303 KEGG:ns NR:ns ## COG: HI0303 COG1385 # Protein_GI_number: 16272258 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Haemophilus influenzae # 16 237 20 241 245 94 30.0 2e-19 MQLFYAPDITPPLHTLSEEESKHCVRVLRLGRGDTLHITDGRGNLFCCEITDDNPKRCTV RVTETRAGWEALPYSLTMAVAPTKNADRFEWFLEKATEVGVGTIIPLETEHCERRVFKPE RGERVITAAMKQSLKAYRPTLRPLTPFGEAAAMPFEGRKFIAHCAPAQSPEGKAYLPATL RRGEAALILIGPEGDFSPEEIAFARANGFEEITLGAQRLRTETAAVAAVVMVSVANG >gi|313157345|gb|AENZ01000069.1| GENE 30 41687 - 43672 2852 661 aa, chain + ## HITS:1 COG:MA4368 KEGG:ns NR:ns ## COG: MA4368 COG0651 # Protein_GI_number: 20093155 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Methanosarcina acetivorans str.C2A # 69 654 76 640 648 266 36.0 1e-70 MFLYSLLALAVVTALIFAAPLKAKVWATLAVVAAGALWASATAVGVLAGGHTVRLWATDV FLFGSDAGSMDPLSALFALIVSVGAVAAVLYSRGYLAHYLGRKSPAHISLHYTALATMFY AMLGVVTSEGGFSFLFFWELMTVASFLLILFDAERREVRRAALAYVIMMHVGFALLLIGF VRLDSVCGSADFALLGEYFRTQPALPLFVVFLAGFGMKAGMFPMHVWLPEAHPAAPSHVS ALMSGVMIKTGVYGILRVTAQIADLPTLRTAGLILLVAGIVTGLWGVILAAAQNDVKRLL AYSSIENIGIVLIGLGIAALGKSSGNQLAAICGLSGALLHTLNHSLFKSLLFFGAGNILS QTHTTSLDALGGLGRHMPVTGLLFLAGTTAICALLPLNGFVSELLIYLGMLDGIASGSDV LASAAGLAALALIGGVVILAFTKLYGTVFLGAPRTHEVAEASEVDNFRIAAMALPLAGIL FIGLFPQTAVSAVTRAAGFFIHHPADAADYLLSPTLAAVSRTAWLLILVVGLLAWLRSRA LRTRKVTDGATWGCGFTSPNVRMQYSGESYSEGLQSIATSLTQNSGEGGAVGKGEIFPAA HNFDIRHKDRFDKLFAAWWVELLRVINKRAMRLRTGKINHYVLFALAFLVLIFLLSIFNL I >gi|313157345|gb|AENZ01000069.1| GENE 31 43675 - 44589 1429 304 aa, chain + ## HITS:1 COG:MA4369 KEGG:ns NR:ns ## COG: MA4369 COG0650 # Protein_GI_number: 20093156 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 4 # Organism: 
Methanosarcina acetivorans str.C2A # 4 215 11 222 313 132 39.0 9e-31 MTALNTLLILLTAVCIPGLINRTRAVLAGRRGIRFAQHLYDVRLLLRKGAVYSTTASALF RAVPSVYLGSAIVAALFIPVGELKPLLSFDGDIVCFTYLMALGRFALILGAMDTGSSFEG MGASREALYGALVEPALMLTAGTLALLAGHTSFARIFAEAATGDLQLTVLLVLTAYVLVK IVFTESGRVPVDDPRTHLELTMIHEVMCLDYCGTDLGMIKIAGWLKTAALSMLAADAVTA ALCPHWWAAAPLAVLLTGLSVGVVESTQARNKLSRNTTFILTIAALAALVFFTGYLLQLN ISIQ >gi|313157345|gb|AENZ01000069.1| GENE 32 44586 - 45215 1156 209 aa, chain + ## HITS:1 COG:no KEGG:HRM2_38990 NR:ns ## KEGG: HRM2_38990 # Name: hyfE # Def: HyfE # Organism: D.autotrophicum # Pathway: not_defined # 5 209 8 214 215 76 26.0 8e-13 MILPLVILYVITLVYLAITERFRNFASIIGLQGWILFAVALLRLHAINPLELIFIAIETL AFKALLVPAILFAMIRKTKINRVRRSGSSQSGSLLLSLMALAVSASITYYIADSAIDLVF FGVALYALLSGLILIVLRSRIFSHMVGFLVIENGVFLFSMAIGVEMPLLINIAILLDILI SVLMLGIFFTKIDGKIHADDADSLTNVKD >gi|313157345|gb|AENZ01000069.1| GENE 33 45220 - 46620 2346 466 aa, chain + ## HITS:1 COG:MT0093 KEGG:ns NR:ns ## COG: MT0093 COG0651 # Protein_GI_number: 15839467 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: Formate hydrogenlyase subunit 3/Multisubunit Na+/H+ antiporter, MnhD subunit # Organism: Mycobacterium tuberculosis CDC1551 # 59 404 65 423 488 163 37.0 7e-40 MIYYLIASILIAVAALFARTKRAVLAAGILFYAVQIAAAGLLLWGGLYDTTSAAFFTFDA LGTLFYLLLVVVSMYVFGHSEAYLQGEDLRSYRLYFALLMLLTTAIAGAYFANNLAVTWI FLEATTLCSAGIIYHRRTAQALEAAWKYVFVCSTGIAMAYLGILLLAAATKCETLSYEAV AEAASSGNPLYLKTAFLLILSGYSCKLELFPLYTVGIDANFAAPAPASALISTGLVNAGF LALLRVYKLLAGTEVFPWVRSVLLIVGILSLGVGALFLRRTNNYKRFLSYSTVENMGIAA IGLGIGGIGVWAALFHVFCHTLVKSSLFLQMAVVRQVYDSYRINRIGDYIRINRVGAVGL LTGMVVLLAFPPSPLFVSELMILKQVITDRNWWLVAGLLLLMCIVIYSFGSRLIRLCYQP NQDRLHPSKAHKALSWSALSLLAVAIVLGMWQPELLQRIITEIAAL >gi|313157345|gb|AENZ01000069.1| GENE 34 46617 - 48128 2101 503 aa, chain + ## HITS:1 COG:Rv0087_2 KEGG:ns NR:ns ## COG: Rv0087_2 COG3261 # Protein_GI_number: 15607229 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III large subunit 
# Organism: Mycobacterium tuberculosis H37Rv # 132 503 1 361 361 187 35.0 3e-47 MKYYITDNTAPAVPLSEIPEASYAEFYADLAAKLADARYHIAHYFALPLGDRLRFFCLLL DDAEAKVLIASHAMEYYDEQELPSLTALHPQVHPFERDIAERYGVRFDGMPWPKPLRFPF DRFDRGSTMDNYPFYTMEGRSLHEVNVGPIHAGIIEPGAFRFICNGEQVLHLEIALGYQH RGVEHEFETTANRLRQMCLAEAVAGDSAVTHAATFASAVEKLTGGVPSARLGVERAVALE LERMAMQIADTGALCMDVGYQLGQVACEALRTMTINTTQAWCGNRFGKGLIRPGGTNHPL TAEKAETIRRNVAQIARRYDEVRRDIKSSPTLLARFEQCGIVPREEMLRIGGVGQAARAS GAARDIRATHPWGVFAGTIRHESIVKRHGDVMARLMVRCREVLQSAGYIDRLLADYTPGS LPAPDYAAPPAPDSLAFGLVEGWRGETCHVLLTDAQGRIAACRIKDPSLHNWLALALAVR GEGISDFPICNKSFNLSYCGHDL >gi|313157345|gb|AENZ01000069.1| GENE 35 48140 - 48904 904 254 aa, chain + ## HITS:1 COG:MA4373 KEGG:ns NR:ns ## COG: MA4373 COG3260 # Protein_GI_number: 20093160 # Func_class: C Energy production and conversion # Function: Ni,Fe-hydrogenase III small subunit # Organism: Methanosarcina acetivorans str.C2A # 114 253 31 170 170 117 40.0 2e-26 MILPKIKVLRSHGKQYIPDLDAVELTEEFRGRPHLTETPDTADLERAAAICPTGALTTAP LSLDLGRCLFCGECARTAPRNIRFTNDYRIGSPTREGLVVRPGDKCVRFDPAAVRPEIGR YFGRSLQLREVCAGGDASVEMELNATGNVNFDLGRYGIGFTASPRHADGVVVSGPITVNM AEALRICCDAVANPKILVVCGSEACSGGLFAGSRAVDRTFFDTHAADLWLPGAPTHPMTF IDGILNLLGKKKRE >gi|313157345|gb|AENZ01000069.1| GENE 36 48907 - 49449 703 180 aa, chain + ## HITS:1 COG:FN0712 KEGG:ns NR:ns ## COG: FN0712 COG2059 # Protein_GI_number: 19704047 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium nucleatum # 3 177 5 181 186 133 42.0 2e-31 MQRLRTIFASFFKIGLFTFGGGYAMLPLIEQELIAKRGWIEHKEFLDLLTLAQSVPGPIA INTSVFVGYKMRGLKGAVAALLGSVLPSFVIILAIALMFADIRHNPVVDAAFKGMRPAVV ALIVVPVFSLARGMHWSMFFVIAAAAFAVWYLGWSPIYILIAAAAIGISWTLLITKKAKK >gi|313157345|gb|AENZ01000069.1| GENE 37 49446 - 50018 927 190 aa, chain + ## HITS:1 COG:FN0713 KEGG:ns NR:ns ## COG: FN0713 COG2059 # Protein_GI_number: 19704048 # Func_class: P Inorganic ion transport and metabolism # Function: Chromate transport protein ChrA # Organism: Fusobacterium 
nucleatum # 1 189 1 175 176 120 40.0 2e-27 MIFWQLFASYLKIGFFGFGGGYAMLSLIQNEVVVQHAWMTNAEFADIVAVSQITPGPIAI NSATYVGYTVGAQAGNEWCGILGAALATFAVCLPSLTLMLLITRFFLRLKQNPLVEGAMK GMRPVVIGMIASAALLLMFPHSKAPDEQNFIDGWSWALFGGVFISSVKKVNPILLIALSA AAGILIYYVF >gi|313157345|gb|AENZ01000069.1| GENE 38 50089 - 50376 515 95 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157409|gb|EFR56832.1| ## NR: gi|313157409|gb|EFR56832.1| conserved domain protein [Alistipes sp. HGB5] # 1 83 1 83 95 95 100.0 2e-18 MAILTLLFGLLSGCLLAVLVGIIGSRRRIGFGLAFLLSLIFTPLVGLIITLLTDPLSGGD QRWGCIGTFIAVLGLLCLGVFLFLLLAGAAVVAVV >gi|313157345|gb|AENZ01000069.1| GENE 39 50389 - 51417 1162 342 aa, chain - ## HITS:1 COG:mlr3209 KEGG:ns NR:ns ## COG: mlr3209 COG1092 # Protein_GI_number: 13472800 # Func_class: R General function prediction only # Function: Predicted SAM-dependent methyltransferases # Organism: Mesorhizobium loti # 10 341 57 337 338 188 38.0 1e-47 MQEQLTPAFGDYELIDTGDFEKLERFGRYVTRRPEPQAIWRRTLSEEEWRRAADASFLRD TRSEERGEWRLGSEMPSRWTVDYAYKGMRLRMRLGLTSFKHVGIFPEQAANWNFIYDNCR ALASGGAAAMGIAGGKAPDAMPATTAPAGAPATTASEGVLSGAAPKGRTAPRKPVTPRVL NLFAYTGGASLAARAAGAETTHVDSVKQVVTWSRENMELSGLDGIRWIVEDALKFVRREV RRGSRYNGIILDPPAYGRGANGEKWILEENICEMLACCAQLLEPQGAFLVLNLYSMGLSS TLARTAVRQAFGAPLEEQYGELCFTDRAGKQLPLGTYYRFIR >gi|313157345|gb|AENZ01000069.1| GENE 40 51430 - 54483 4141 1017 aa, chain - ## HITS:1 COG:TM1193 KEGG:ns NR:ns ## COG: TM1193 COG3250 # Protein_GI_number: 15643949 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # Organism: Thermotoga maritima # 25 1013 7 984 1087 666 36.0 0 MKKTLLILAAALSGALAASAQTGREWQDPAVNEINRLPMHSTFAAGESMPLDGVWKFHWV RNASERPSGIWQEDYDDASWGAMPVPGMWELNGYGDPLYVNIGYAWRGNFENNPPHVPDA ENHVGSYRHTFEVPASWSGKDIFLSIGSATSNVYVWINGRFVGYSEDSKLAAEFDVTKFV KPGENLIALQMFRWSDGTYLEDQDFWRFSGIARGVTLTARDKAHLKDVRITPTLDADYRD AQLHVEVETTPRVKSVGLTLLDAAGEAVAEQQIMPAGGKAGCLFRVADPAKWSAEAPNLY TLRIDVSDGRNVTETVTQPVGFRSVEIRNAQLLVNGQPVLIKGADRHEMDPIGGYVVSRE 
RMIEDIRIMKEMNINAVRTSHYPNDPQWYELCDEYGIYVVDEANVESHGMGYGEETLAKV PAFAQAHMERNQRMVLRDKNHPSVIVWSMGNEAGMGPNFEACYRWIKDYDASRPVHYERA VYDRDGAFTDIICPMYWDYAQCEKYLKNNPSKPLIQCEYAHAMGNSMGGLDHYWKLVRQY PSYQGGFIWDFVDQGLARYEKNGRVSFLYGGDFNNYDATDNSFNCNGIIAPDRTWHPHAY EVKRQYQNIWTELLDPAQGRVEVYNENFFVGLDDYELTWELVADGRVVKAGRIGELDVAP RQRRTYALGFTSADFCPQAKELLVNVAYKLKNKQPLLDIGHTAAQQQFVVREYDCKAGFG LAVSERPVEVTAWERGTRVAGQTWEMFFSRDGFLSAYDVDGRALMAEGAELRPQFWRAPT ENDLGAKLDRKLAVWKNPELKLTKFDAGVKDGVAVVEALYELPAVKATLALTYRINGAGE MEVTERMTAGKSAKVPNLLRYGMTLTMPARYDRVIYYGRGEHENYADRLSSADLGLYEQR VADQYHDEYVRPQESGTKSDVRWWTVADSSGTGLTILSDAPFSASALPYATADLDVSNFP PQQHSGTLTARDATFVNFDLRQSGLGCIDSWGQLPEERFRVPYGDYTFRFLLRPTIK >gi|313157345|gb|AENZ01000069.1| GENE 41 54493 - 56322 2082 609 aa, chain - ## HITS:1 COG:ZydcP KEGG:ns NR:ns ## COG: ZydcP COG0826 # Protein_GI_number: 15801708 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Collagenase and related proteases # Organism: Escherichia coli O157:H7 EDL933 # 4 609 22 646 667 423 40.0 1e-118 MNFVELLAPARDLQSAVAAVDYGADAVYIGGAKFGARHAAGNSAEQIARVAEYAHRYGVR VHATLNTLIYDDELETAERQARELIAAGVDALIVQDMALRRMDLPVELHASTQVCNMTPE QARFLGECGFARVILERALSLDEIRAICAATNAEVECFVHGAICVGYSGRCFLSRSMSER SGNRGACSQPCRLTYDLTDGSGRTYLAGKHLLSVRDLNLSAHVGELLDAGVRSLKIEGRL KDINYIRNTVAYYRRAVDDALAVRPHLRRASAGESVADFTPDTAKSFTRGESEYFFAGKR AGVASFDTPKAVGEYAGRVTRVERTRFRLDRAHTLAAGDGICFLTPRGLVGTNVNAADGD CVTPNRMEGIAPGCEVYRNYDHRFNQLLERSRTRRVIPAAAVVTASAEGVTFRYTDCEGV AAVAERRIALEPANNAEANAAALRTQAMKSGDTIFAVRSAEVQGAEWFVPASLASQLRRE GLAALLQARLARGVEHRILPEGAAEYPSETLSAEENVTNRLAEAFYRDHGVRQIERGLDL AATTAGRRVMRSAYCIRREIGECLKERPRLKGDLWLERGANRYRLEFDCVRCEMSLVDCT RKETAKPKP >gi|313157345|gb|AENZ01000069.1| GENE 42 56778 - 58049 2306 423 aa, chain + ## HITS:1 COG:SP0411 KEGG:ns NR:ns ## COG: SP0411 COG0172 # Protein_GI_number: 15900330 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Seryl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 1 418 1 417 424 351 46.0 1e-96 
MLTLKQIRDDKEAAIRKLAKKGVDAAPVIEKIETLDDRRKAIQVELDNTLAAQNKAAKEI GALMGQGRREEAEERKRFVADLKEKSNRLQAESREVQEELQAAIVTLPNFPAEIVPEGKT AEDNLVVKLVESYTELPENPLPHWELARKYDIIDFDLGVKLTGAGFPVYKGKGARLQRAL INYFLDCNTKAGYLEVEPPIMVNEASGFGTGQLPDKEGQMYHATADNFYLIPTAEVPVTN IYRDVILDEMDFPVKMTAYTPCFRREAGSYGKDVRGLNRLHQFDKVEIVQLSLPQNSYEA LDGMVAHVESIVKSLGLPYRILRLCGGDMSFTSALTYDFEVYSEAQKRWLEVSSVSNFES FQANRLKLRYKDADKKTQLAHTLNGSSLALPRIVAALLENFQTPEGIRIPEALIPYTGFD IIK >gi|313157345|gb|AENZ01000069.1| GENE 43 58101 - 58268 182 55 aa, chain + ## HITS:1 COG:no KEGG:BT_2414 NR:ns ## KEGG: BT_2414 # Name: not_defined # Def: ferredoxin # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 54 21 74 76 68 75.0 6e-11 MAYKITDSCVACGTCIGECPVEAISAGDIYVIDPNTCIDCGTCAGVCPSEAIIPE >gi|313157345|gb|AENZ01000069.1| GENE 44 58346 - 58762 129 138 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157413|gb|EFR56836.1| ## NR: gi|313157413|gb|EFR56836.1| hypothetical protein HMPREF9720_2002 [Alistipes sp. HGB5] # 1 138 143 280 280 295 100.0 7e-79 MKVGPHYCVEVFFRPEMNYGFRKGISKLGVMYLNHINLCDQQQDVYASLPYRVYFKYKEV GIRSNGIAFEVVDWKEIRQAAGESGKVIMPATTTPMDGIFMGPNPDLPGDYWKIEIPRDR CYSLEEAGGIVIPGKPIE >gi|313157345|gb|AENZ01000069.1| GENE 45 59475 - 61556 -98 693 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157412|gb|EFR56835.1| ## NR: gi|313157412|gb|EFR56835.1| hypothetical protein HMPREF9720_2003 [Alistipes sp. 
HGB5] # 1 693 1 693 693 1204 100.0 0 MLPLSEYLATALVKGSAPQKITDLTAGTDYYAVALGMLTDGRFTSDLVKEAFKTDDAPEP APELTLSMRAGDASGANTSSYVTCHIKSSDIVSAKAAFLMKSAVDKALAEGKTLAEILAG VTTDPTLKTEQVEKINSADGYDLPCRGVPAATAVTAIVSVTSANGKTTVDSASATTDAAQ GGDLTFTIAVTNIGAKGADINITPSNNSDTYYFDIYPASVVDQIPDDNQLIAAIIGANDG DVTKYLSTGPDGYSKELIEQYIPFDPETEYCVVAFGCNGTAGTTAVTKERFTTLADDGET GDGPELTLTLRAGDANGANTDTKVYMGAYAPTATGAYYGVFLTSDVEKVLAQGASYDAIV TQNGTDMSTKDGWLDGLVQNPGIGVTFSGLDPATSYTCILKVTDSAGKSTTKHVAATTEG GGEASDAYKAWLGTWTLTSTSSEVNAAPLSFDVTFIQGVANSSYKLQGWGITTIRDQSQI LPSAKFDSATGNFEILEGQSLYTDPEDGSVLTLTGRALLQGKYYIINGGYPGFTGKLNAD KNSATITLYQGSVEGMGNFTTSSMDFFWVLGQDIYWQKAAEGYTDKDYPIGPYTMTRKSS GASVAASKSAATILHASKLSSVLSGCTVLSTSVKQNGVVFESQVAMPSSLVFDRPAASGT VIGNSGKFENAGLKKQISNNLDNNTDFQLLNRL >gi|313157345|gb|AENZ01000069.1| GENE 46 61593 - 61832 139 79 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLGIGRVGLERVELGLLYSRIIVGSLVVAGSYRYGSFGLGYIRNSDTKRIRGGGPTLSSS SLQPEVTNRRVTAANNMRK >gi|313157345|gb|AENZ01000069.1| GENE 47 62254 - 65256 3073 1000 aa, chain - ## HITS:1 COG:no KEGG:Poras_0733 NR:ns ## KEGG: Poras_0733 # Name: not_defined # Def: hypothetical protein # Organism: P.asaccharolytica # Pathway: not_defined # 38 276 69 312 462 89 29.0 9e-16 MTLRSYMRVLSAAIALLFVAVGCKDSDEEIPPPSEKYTFDIAVSDIRVADAVVTVTPSDA LATYYCSVVKKADFDKLGSDEAYLDDDIDYLKAQAEKKQLTFKAYLETVIATGSEPIKFV TLEPSTDYYAYVYGITPEGRVTSDLKKTPFTTATPEPTELTFEFKVENITMTAADIAVIP SDDEAPYYFDVIAAAAYEGMSEDEILADVLDAIIPAYLTQGPDGYPAEMFEGMLALKPGT EYYVYAIGYDAEKEEPTSELQMYKFSTTAPTGEAPDLTFSARAGDADGNNTSTMIYCTAV SEAAVSAKMACLPKQIVDDFIGQGASLENIADANGQDVKSEDIAALNAKGGLGLTLAGDV IVPSTDYTVIFKVVSAGGRSTVKSENVSTTSGDVPPSDLTFSIAVTELKATSAMVTVTPS NDTETYFFDIQPKKLIDENFADDASLIAALDETYAKYGGIAGMLSQGEDGYKPTSLTAGT SYYVLAFGYNTAATTAVTRHEFTTETAATSDLTLSIAIDTSAEPIPGGVTAAITASNDED PYMLDFMLADEIKGMSDAEIIASVEQKYGQIISWLLVTGDYATAPTDFGGELAMMPGAEY YLVAFGYDGASATTGVTKAKFTAGAGPDAAGTGFTFKVEDITSGGATVTVETTKEPVTYI WDVISDASYTQLGGNAEALVSHVTDMFALYGTAQYGNLTPVQVIAGLGAWYSGASYAYGK LSSKTAYRPYAACVDLSGNVVGTPAVGEIFTTLEAVAGSATVEAAYDKYFDGDEAVAAGV 
SSNASGKAYIPVTVTRSADAAHWYVALYTGDYTDPGAVSEATIINALKSQGVKDAAKINF TCSWDTQCTFLAVAEDASGNVGEIFRLGVRFTKSGASPISELSGASKAAAAGVKLPVLCS SLLPGLDVPKVFRPLEEPNMPQAAASQLINLRRERTSALSEKMVRGAEMSGNAEALSGGK PAQIKCLVGINAGLKQAEEVSLKTAAAARSPKNRVRTPRR >gi|313157345|gb|AENZ01000069.1| GENE 48 65546 - 66883 1235 445 aa, chain - ## HITS:1 COG:BS_ybbC KEGG:ns NR:ns ## COG: BS_ybbC COG3876 # Protein_GI_number: 16077233 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Bacillus subtilis # 119 443 74 414 414 225 40.0 2e-58 MKRLLVFLTLAALAVAPRAAAQGAQPESACGAPLLHAPVLVGMTDTAAYFPKLKGLRVAV LANHTAVARFCEPAQGKYAAEAEQLSGVSAGEAAALPDRAGCRPARLPGASADGTIHLVD LLHGRGFNVTGIFSPEHGFRGTADAGEHVGNSVDERTGIPIRSLYDGNTKRPSDEAMQSF DVLVVDMQDVGLRFYTYYITMLRMMDACAEFGRTVVVLDRPNPNGHLIDGPVLDMKYKSG VGALPIPVLHGLTMGEIARMAVGEGWSRKCRLDVVCCRNYTHATPYGLPVAPSPNLPTQR AVYLYPSLCLFEGTVVSLGRGTDKPFEIYGHPDMKGYGFSFTPRPTAGAKHPPLEGRLCH GADLSRMPLDEARQVGLTLRYVIEAYRNLGLGERFFTPMFEKLIGVGYVREMILAGASEA EIRACWQEDLSRFRRQRRPYLLYAE >gi|313157345|gb|AENZ01000069.1| GENE 49 66925 - 68037 1922 370 aa, chain - ## HITS:1 COG:L0374 KEGG:ns NR:ns ## COG: L0374 COG1186 # Protein_GI_number: 15672953 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Protein chain release factor B # Organism: Lactococcus lactis # 12 363 15 361 365 277 44.0 2e-74 MILADQIKDLEQRSGALERCLNIEQKRIDLLNEEEKTQVPDFWDNPDQAREQLRRVAAIK AWVDDYDAVAKDVEDLALMPEFVKEGVVSEQEMDAHYAQTLERIEKLEMRNMLRRDEDKL GAILDINAGAGGTEALDWASMLLRMYTRWGEAHGYKVKVLDYQAGDEVGVKSCTLEFEGE YAYGYLKSENGVHRMVRLSPFNANNKRQTTFASVFVSPAVDDTIEVTVNPSDIEWDTFRA SGAGGQNVNKVETAVRLRYHGKDADTGEAVEFLIENMETRSQLMNRENAMRILKSKLYQR ELDKRMATQQALEASKKKIEWGSQIRSYVFDDRRVKDHRTGVQTSAVEAVMDGGLDEFIK AYLMEYGAEA >gi|313157345|gb|AENZ01000069.1| GENE 50 68069 - 68704 855 211 aa, chain - ## HITS:1 COG:no KEGG:BF3595 NR:ns ## KEGG: BF3595 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 209 31 239 239 261 61.0 2e-68 
MKDKFVINIGRQLGSGGKAVGEAVAARLGIGVYDKQLINLAAEQSGICPEIFEKADEKES RNLFATFIGYLRSPFVGSEYSGSNVLSSDALFKIQSDVIRDLASRESCVFVGRCADYILR DHPRTVNIFIAAGRAERIERLCRLHGIAPGEAESLMDRTDARRAAYYNYYSSGTWGMAET YDLCIDSSVLGIDGTTGFVLEFVERKLGVKP >gi|313157345|gb|AENZ01000069.1| GENE 51 68714 - 70120 2196 468 aa, chain - ## HITS:1 COG:CAC0883 KEGG:ns NR:ns ## COG: CAC0883 COG0534 # Protein_GI_number: 15894170 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Clostridium acetobutylicum # 11 442 6 431 448 327 40.0 4e-89 MSATIKGAAAELGTERIRKLLVQYAVPAVIAMTAASLYNMVDSIFIGHGVGPLAISGLAL TFPLMNLAAAFGSLVGVGAATLVSMRLGQRDYETAQNILGNVVVLNLIIGLSFGLVTLLF LDPILYFFGASDATIGYARDYMTIILLGNVITHMYLGLNSVLRASGHPRKSMYATIYTVV INAALDPLFIYGFGWGIRGAAVATILAQVIALVWQFHILSDKSELLHFRPGIYRLRRKIV RDILAIGMSPFLMNLAACFIVILINKGLKQYGGDLMIGAYGIVNRLAFFFVMIVMGINQG MQPIAGYNFGARQYERVLRVLKLTIIGATCVTMAGFLMGELVPRLAVSAFTNDPELIRLS VEGMRIVFICFPIIGFQMVATNFFMSIGMAGKAIFLSLSRQLLFLMPGLIFLPHIFDVYT RWDGSWGVWCSMPLSDLLASVVAFCMLAYQLRKFRAKMDAAPATNNPE >gi|313157345|gb|AENZ01000069.1| GENE 52 70340 - 71815 2271 491 aa, chain - ## HITS:1 COG:DR1670 KEGG:ns NR:ns ## COG: DR1670 COG0215 # Protein_GI_number: 15806673 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Cysteinyl-tRNA synthetase # Organism: Deinococcus radiodurans # 5 489 52 531 532 436 47.0 1e-122 MESKLVLYNTLTRRKEEFEPLTPGRVGMYVCGPTVYGDPHLGHARPAVTFDLLFRYLKAC GYKVRYVRNITDVGHLEHDADEGEDKIAKKARLEQLEPMEVAHYYTERYHRAMEALNVET PSIEPYASGHIIEQIEFVKKILADGFAYESNGSVYFDVEKYNRKYNYGRLSGRNLDDIVA NTRELDGQSDKRHSYDFALWKKASPEHIMRWPSPWSEGFPGWHMECSAMSTRYLGERFDI HGGGMDLMFPHHECEIAQSTAALGHDSARYWVHNNMITINGQKMGKSLGNFITLEELFTG SHKLLAQAYSPMTIRFFVLQAQYRSTLDFSNEALQAAEKGLDRLMKGVEALGRIKPADVS TVNPAELEERCRAAMDDDLNSPMVISALFDWVRVINQLAEGRQTITAADLETLKTTVRRY AFDILGLRDEKAAGSASGRDYVTPLVEMLLDERLKARAAKDWAASDRIRDGLAAAGIRVK DRKDGSDWELE >gi|313157345|gb|AENZ01000069.1| GENE 53 72005 - 73426 1537 473 aa, chain + ## HITS:1 COG:STM0820 KEGG:ns NR:ns ## COG: STM0820 COG0513 # 
Protein_GI_number: 16764183 # Func_class: L Replication, recombination and repair; K Transcription; J Translation, ribosomal structure and biogenesis # Function: Superfamily II DNA and RNA helicases # Organism: Salmonella typhimurium LT2 # 1 451 1 440 454 391 50.0 1e-108 MTFKELNLIEPIMHAVAEKGYVTPTPIQEQAIPPALEGRDLMGCAQTGTGKTAAFTLPIL QLLSARPRTKGRRPIKALVLTPTRELAIQIDECCRDYARYTDLRHCVIFGGVNQRPQVDA LQRGVDLLVATPGRLLDLIGQGYVSLSDIRFFVLDEADRMLDMGFIHDIKRILPLLPKER QTLFFSATMPSDIVTLANSMLHDPVHVTVTPPASVVETISQSVMFAEKAEKKDLLISLLR ERSEGSVLVFSRTKHGADRIARILTKAGIEGRAIHGDKSQGARERAMNDFRAGRCRVLIA TDIAARGIDISELPLVINYDLPEVAETYVHRIGRTGRAGHDGTAIAFCSEDERPLLKDIQ KLTGLVLDPDSGRLTPGMQTDTKSPRKQESSQRSAPKQAAQGGDGRTPAGQSGDAKGPRR RRRNAPKREGQAQSAHAPQPQNGGGQASENREPGRRRPHRRRRSDRKSEGQRS >gi|313157345|gb|AENZ01000069.1| GENE 54 73550 - 74203 754 217 aa, chain - ## HITS:1 COG:ECs2092 KEGG:ns NR:ns ## COG: ECs2092 COG2173 # Protein_GI_number: 15831346 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: D-alanyl-D-alanine dipeptidase # Organism: Escherichia coli O157:H7 # 27 206 7 173 193 93 32.0 3e-19 MVFAAVMICWGVQAQEGFDAKMRRAGLVDVLAVDSTLRVKLMYSAPDNFMRRDVYGDLET AYLLPHFARKVAAAQRLLRERRPGWRLLVCDAARPVSVQRQMYELVKGTPNQVYVADGTR GGRHNYGVAVDVTLLDETGAPVDMGTPVDFFGDEAHTGDEPALAAEGRISAEACRNRQLL GSVMREAGLVPYRREWWHYEEPMAMSEVRTRYKLLDF >gi|313157345|gb|AENZ01000069.1| GENE 55 74417 - 76459 1892 680 aa, chain - ## HITS:1 COG:SP2146 KEGG:ns NR:ns ## COG: SP2146 COG3669 # Protein_GI_number: 15901959 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-L-fucosidase # Organism: Streptococcus pneumoniae TIGR4 # 23 497 6 470 559 293 37.0 6e-79 MKRTLSVFLLFAAACGRVEAPAPLFPVPNARQLAWQQLETYAFVHYGLNTFNDMEWGYGD TPASTFDPADLDCDQWVRTFRAAGMKGVMLTAKHHDGFCLWPSAYTEYSVKNSPWRSGKG DLVRELSDACRRHGLKFGIYLSPWDRNHPEYGREEYVAYFHNQMRELLTGYGPLFEYWFD GANGGDGWYGGADEKRSIDAKTYYGYERARRTINELQPGAVIFGGTCADIRWIGNEEGRA GQTNWSMVKGSGDERLNDFTCGESDGDTWLPGECDVSIRPGWFYHPREDHQLKSLSRLID IYYESVGRNANLLLNFPVDRSGRIASADSARIMEWRRALDAEFAHDLFAEAQATADNVRG 
GARRFSAAKAVDGRADTYWATDDGVTAATLSLRFEQPRRVNRILLQEYIPLGQRVGRFAV EWLDGDTWRPVETAEEMTTIGYKRIIRFAGVTTPALRVRFGQARGPLCISNVEAYDAPVL LEEPRIVRNGAGEVTLAAGDTQAEIRYTLDGTEPGPSSELYAKPFPMTGRGVVKALVRDP EDRRTSAVASRGFDIPCGAFRVKELPDEEAVGLFDGDVSTVVYLPAGVAGFSADTGEERT FRGFAYTPDQSRWASGVVTRYRVSVDGRTAAEGEFSNIVNNPVEQVVEFEPLRGRTVRFE ALSLAAGDRPGIAEFSVLTE >gi|313157345|gb|AENZ01000069.1| GENE 56 76760 - 79990 5077 1076 aa, chain + ## HITS:1 COG:no KEGG:Palpr_3031 NR:ns ## KEGG: Palpr_3031 # Name: not_defined # Def: hypothetical protein # Organism: P.propionicigenes # Pathway: not_defined # 1 1044 1 1062 1075 746 40.0 0 MTFFKRWNNIVGWAVFAIAAAVYLMTMEPVSSLWDCSEFIATSYKLEVGHPPGAPLFMML ARLATLFAFGNPDYVGVAVNAMNSIASAFCILFLFWTITHLARRLVTRGGGELTQANTLA VLGAGIVGALAYTFTDTFWFSAIEGEVYALSSMFTALVVWLMLKWEEEADEPHSSRWIVL IAYMMGLSIGVHILNLLTIPALVFIYYFRKTEKVTFKGMVYATLIAGAILLFINNIIIPY TVYVGAMVDLFFVNTLGLPVNSGIVLFALALILGCGWAAWYTHRKGKVVWNIILLSTTMV LVGFSSYASVTIRAAANPPMNSNNPNNPHALLSMLNRDQYGDRPLLLGAYYSAPPEGYKE KSFYYLDEDGKYKSASVITGYTHSPEFVHFFPRMWDARKGEKEYKQWGAYRTRTDVMRDD KGEILRDEQGRPIRGEVVDFGRKKLYNDGYEDRTIVEPTFAENLNFFFNYQLSYMYWRYF LWNFVGRQSDIQPTRTQITDGNWLSGIKWIDEKFLGPQDDLPREIAENKGRNTYYFLPFL LGLIGLIYQLNRDERNFSIVLWLFIMTGIALVFYFNTSPGEPRERDYVYAGSFYAFSIWI GFGVLAIRDLVAWLTKRKNVTAAVAATAVCMVVPTILAAQNWDDHDRSHRYMARDIGWNY LMSTLPNSIILNYGDNDTFPLWNNQEVYGVRPDVRIMNTSYLGGEWYIDEMKTKANDAPG VPFSLPKSKYTFVNDWVPVYDRVNRSVDIKEVMDFIRTDHPDSKIKMADGSMVDYIPTKR IALPVNKENAIASGIVAEKDRDKMVDTVYINLRKNSIDKNQLMILDMLANFDWKRPIYMT QVYILQDFGLMDYLQFDGYAYRFVPILTPVQNPWEIGRVDADYAAPLLRDTFRYGNLSDP RVYADYFIQYNLSASHARDSFARVAKELLRQDRAQEAVELLDLGLERMPTWQIRFTDTNT YPFLEAYYAASAMGVEEAAAKGDALLREYAQTLIEYIEYYLRFEGTQGDMVSPILDEKLD QLGDIYYLATYAGRKEVVAELNTYYRSFGVSEDNLVDVGDKPRQADTTLIPKGMNE >gi|313157345|gb|AENZ01000069.1| GENE 57 80100 - 80711 693 203 aa, chain - ## HITS:1 COG:no KEGG:BT_3525 NR:ns ## KEGG: BT_3525 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 201 520 719 721 176 46.0 6e-43 
MISRRPSEWTPTVIRRDPKDFVLTAGGDAEYLAAVRGLYGKGKQAGTLEFKNHFYLERGP YEIVAVVDENADTEPFVLEGKFIDLFDPALPVLTRRAVLPGTQALLYNLDKVADPGRPQV LACAACVETERVGRNGYSFMVKGPAQTTNVMRVLLPRKPVATDVTRSDGTPVADPGFEWD EESHTCLLRFENAPEGVNVALGF >gi|313157345|gb|AENZ01000069.1| GENE 58 80710 - 81111 516 133 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MENPGDEGNLVQEAEILKAFSIVAGVRCEGRRLTLMPRLPWLWDTMECVDWPVTDADGRT HRIRFTVRHERWLRRCTVELEGIGRFEGTDIRFGPFPRLLNNPKGYETELIGNASWIWVR GIKGDKRTITVEL >gi|313157345|gb|AENZ01000069.1| GENE 59 81124 - 82902 2513 592 aa, chain + ## HITS:1 COG:no KEGG:BF0762 NR:ns ## KEGG: BF0762 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 588 6 591 597 700 55.0 0 MKNLFIAACCCLASAGFAAEPFEIRGVLPWHNFLCGPTAWNKADYEVYLDNCKAEGINFI GFHNYTGGGERYATYVEPMVRISDRGIVPQAMLDNSLGCRWGALPLRLKEFAFGSQRALD VPRGAEAFGSDCSLISKTPDEHYRNTQRLMRDVLRMAHDRDIEMAMGFEFGVVPPEYFSL YAAGSCFFWLGAGNMVPNPCHPTSVELHRAALDDLLENYPDIDYVWLWLNEHSFLGVVVE QALGDPAFAEVFRRESGHFEEAQSDSERFVGVWSLEYIRRTLDHLRQKGSRAKLIIGGWG GGGQLPGILRGLDRALPEEVVFSCLNPDLGRTRQPGFLADIARHRKVWAVPWLEGDNQMW HQQPRVGKMRDHVQLAREQGLQGVAAIHWRTAETRYNFRTFARYARTSDDTTVETLYKEY FEEDFGAQAAAALAPLMAAVDTANAWEGPQSPEYFAFRPDWGVLDEANAASRQGIIDAID AVQDKEQTPQQRRNLKSFRAMLSFELLLDKVVRAMAPGWELRDKTLAENRPASREACAAA LRELESAPVEELIRTYVSRTGSRGEMGILTSINQRLWNNYLLLKNYLQENTH >gi|313157345|gb|AENZ01000069.1| GENE 60 83156 - 84943 2443 595 aa, chain + ## HITS:1 COG:no KEGG:BT_1768 NR:ns ## KEGG: BT_1768 # Name: not_defined # Def: beta-galactosidase # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 566 524 1052 1085 285 31.0 3e-75 MRITRECDHTRNWISGDGETQTDTDLPTVIGHYCNMPLMLEWKSKSKPWGVGEMGMCYAG TPAHVSVVNGDRAFESQEGRMEGLAGEAFETISMQRGLDACYASISNLAWYGVQPLEIGL DDITRPVEAEDGIWFGPYREGVPGVQPERLGPYTTTFNPGYDPRLPLYRPWALFEGVKAA FSDTYAALPNKWAVRKHTGISEPVPAARDVVWISADPESKAERQFEELSVAFRPLDTRRN QLILLDGIRPAEDPALVERLRAALGAGSTLLVWNISPAALPLVEKLGGHAVTLTPRNAAS YIIRGRHSLLNGQDLSTLYFNERTKEPVSSFVMTSGDAYATLLDPCNTDWSKWNYQEENI 
KTGQVLRSERESKPLGSVLLRSEAGRGELLLSAIDPFVLGNKGSALIHEMLHNLGARFNG RPRHIPAALDRNGTVVHALLSGTYAGGSMDEVMARDYLAGRYWGAMDSNEEGFWNFETMN LKGEQKDCAVYLSFWLFSPRSLVNLLLEPNMPRLDLEFAADDKLAVYVNGNLLDNAVKRQ DNPALPQRIEGIPLEKGWNHVLLKVGQLWGGWNGRFRFTATDPAYMRQLESVIMQ >gi|313157345|gb|AENZ01000069.1| GENE 61 85310 - 86986 2451 558 aa, chain - ## HITS:1 COG:aq_999_1 KEGG:ns NR:ns ## COG: aq_999_1 COG1022 # Protein_GI_number: 15606303 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Aquifex aeolicus # 27 537 15 499 600 233 30.0 6e-61 MTINTLYELARNSVEKFASKVAFSMFEGDDVTYAEVDRRIAKVQEILTGAGLQAGDKVAL LSSNMPNWGVCYFAVTSAGMIAVPILPDFSGEELDMIIEHSEARALLVSDKLFTKLSKET IGRLNVVVRTKNLGVIAQQVHAQGSTAVPKPDDLAVIIYTSGTTSKPKGVMLTHSALCAQ VGISSSIFPVQPDDVFLSVLPLSHTYECSIGMIYPFSMGAQVVYLDRPPTASALMPALRA VRPSVMLIVPLIIEKIYRLQVLAKFKSTGFWRTLYKIGFMRRYLHRVAGKKLLKLFGGRL RFLGIGGAKLDGDAEKFLLEAKVPYAIGYGLTETAPLLAGAAPSQVRLGSTGPQAPGVEL RLENINPDTRQGEIVAHSPSVMLGYFKNPEATKEVFTADGWFRTGDLGEFDKDGWLYIKG RLKNMIVGPGGENIYPEDIETVLNSHVYIADSIVTEQQGRLVALVHFNRDEIESMIDDWR EEWTSKKEALEAKTEQLKKEILDFVNAKVNRFSRISEVVEEKDDFVKTPTHKIKRFLYSK KKESGTPGEHPAGRPETK >gi|313157345|gb|AENZ01000069.1| GENE 62 87086 - 87994 1134 302 aa, chain + ## HITS:1 COG:PA4950 KEGG:ns NR:ns ## COG: PA4950 COG1600 # Protein_GI_number: 15600143 # Func_class: C Energy production and conversion # Function: Uncharacterized Fe-S protein # Organism: Pseudomonas aeruginosa # 5 299 20 319 361 169 34.0 8e-42 MLDYQHIKKLAADAGFDLCGLAPCGHMAQNEAWLRAWLGKGYQSSLTYMERNVEKRADPR KLVEGARTAVVCAVSYKNRIGEGYPPGHRTKIASYACTADYHTTIKGMLNGMLEHLKAAD PGLGGRAFVDTAPLLEKQLAVEAGLGWIGRQSLLVTPRHGTYVLLGELLLTDEADAYDAP FEGSRCGRCRNCIESCPTGAIVVPKVIDTGRCISCHTIEREPSGGIDLDGWIFGCDRCQS CCPHNQKAPMHANRAFDPVFDPLTMDAAAWMALDETEFEARMGATPLTRSGLARIRGNIK TE >gi|313157345|gb|AENZ01000069.1| GENE 63 88123 - 88284 127 53 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRYEAGPFAGIRFFRAVGEGLLVQSGRFLGLPGLLAQSGEGGGLPGRIIALAE >gi|313157345|gb|AENZ01000069.1| GENE 64 88297 - 89256 1660 319 
aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157353|gb|EFR56776.1| ## NR: gi|313157353|gb|EFR56776.1| tetratricopeptide repeat protein [Alistipes sp. HGB5] # 1 319 1 319 319 466 100.0 1e-129 MKKLLVSVIALLATVSLSAQDLTAVYNEAAAAFGNKDFAGAAAKFEEVIDKGMDDETAAS LVATAKTTLPKCYFMLGGSALKTKNYEEALKNFTKSAELSELYGDMNQMAKSNGWVAKIY QIQGGDAFNNKDYATASEIFAKGYAADPDNTGMALNLAMSYCEMAMSSGDMAQYEKGMEV YEAIAAKTHPKYAEDVAKAKEGMALYTNNMVAKMQAANDFDGIIKMADDLLAKNPESALA QNVRLQAYANKKDYAKVIELGQAAADAQTDAADKSLMYYLLGAAYNAKEMKPQAIAAFKQ VTDGPAAENAKAALAELSK >gi|313157345|gb|AENZ01000069.1| GENE 65 89350 - 89766 471 138 aa, chain - ## HITS:1 COG:SA0516 KEGG:ns NR:ns ## COG: SA0516 COG0590 # Protein_GI_number: 15926236 # Func_class: F Nucleotide transport and metabolism; J Translation, ribosomal structure and biogenesis # Function: Cytosine/adenosine deaminases # Organism: Staphylococcus aureus N315 # 1 137 8 149 156 144 51.0 6e-35 MRLALNEARKALQLQEVPIGAVVVADGAVVGRGHNLVETLSDPTAHAEMQALTAAASTLG GKYLQGCTLYVTVEPCIMCAGATGWAQVGRVVWGADDPKKGYRRYSETVFHPKTTVVRGV LGEECEELMTSFFAGLRN >gi|313157345|gb|AENZ01000069.1| GENE 66 89945 - 91702 2666 585 aa, chain + ## HITS:1 COG:CAC2269 KEGG:ns NR:ns ## COG: CAC2269 COG0173 # Protein_GI_number: 15895537 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Aspartyl-tRNA synthetase # Organism: Clostridium acetobutylicum # 1 585 8 595 595 610 50.0 1e-174 MYRTHTCGCLRADDVNRTVTLAGWVQKVRNLGAMTFIDLRDRYGITQIVVEEHSPAEARE TACKLGREWVVQVTGKVVERSSKNPKMATGDIEVAAEKVVVLHEAEVPPFTIEETSDGGD DLRMKYRYLDLRRPPLQRNMILRHKMAQEIRRFLDSEGFLEIETPYLVNSTPEGARDFVV PSRMSPNQFYALPQSPQTLKQLLMVAGYDKYFQIVRCFRDEDLRADRQPEFTQIDCEMSF VEQEDVLEIFERWAKHMFKEVMDIELTEPLRRMPWIEAMEKYGSDKPDLRFGMEFADITD LAKGHGFSVFDDAEYVTGFAATGCAVYTRKQIDALTEFVKRQQIGAKGLIWIRVEESGVK SSIDKFYTPDEVRAMADRCGAKAGDMVFILCGKKFKTLTQLCALRLEVAQQLGLRDPKKF APLWIVDFPLFEWDDETQRYYAMHHPFTSPKLEDVQYIDSDPGRVRANAYDFVCNGTEIG GGSIRIHDSKLQAKMFEVLGFTAEEAQVRFGFLMDAFKFGAPPHGGLAFGFDRLCSLFGG SDSIRDYIAFPKNNAGRDVMLDAPGCIDQAQLDELCLSLVKPAEK 
>gi|313157345|gb|AENZ01000069.1| GENE 67 91842 - 93116 1949 424 aa, chain + ## HITS:1 COG:CAC0326 KEGG:ns NR:ns ## COG: CAC0326 COG2256 # Protein_GI_number: 15893618 # Func_class: L Replication, recombination and repair # Function: ATPase related to the helicase subunit of the Holliday junction resolvase # Organism: Clostridium acetobutylicum # 4 423 16 437 443 405 48.0 1e-113 MSAPLAERLRPKTIDDYIGQEHLVGKNGVFRKFFETGNVPSFILWGPPGVGKTTLAKIVA TQLERPFFTLSAVTSGVKDVREVIESAKKQRFFDAKPPFLFIDEIHRFNKSQQDSLLGAV EQGTVTLIGATTENPSFEVISPLLSRCQVYILRPMEDKDLQTLLDRALTTDAELKAREVE VRQTGALFKFSGGDARKLLNILDILAGATDGKLTITDQYVTDCLQQNIALYDKNGEQHYD VISAFIKSVRGSDPNAAIYYLARMLAGGEEPRFLARRLVILASEDIGLANPNALLLANAC FDTVHKIGMPEARITLAETTIYLATSPKSNSAYMAINKAMSLVEHDTTNRPVPLHLRNAP TKLMDKAGYGKGYKYAHDFAGNFAEQEFLPDTLAGTKFYEPNTGNATEAKIAERMRELWR DKYK >gi|313157345|gb|AENZ01000069.1| GENE 68 93117 - 93611 721 164 aa, chain + ## HITS:1 COG:MA3016 KEGG:ns NR:ns ## COG: MA3016 COG0622 # Protein_GI_number: 20091834 # Func_class: R General function prediction only # Function: Predicted phosphoesterase # Organism: Methanosarcina acetivorans str.C2A # 3 85 2 87 182 63 44.0 1e-10 MKKIGIISDTHGTFDGTLREFLKDVDEIWHAGDIGSLELADAIAAFKPLRAVSGNIDGGL TRRVYPDFLSFECEGVRVLMTHIGGYPRRYDPRAVAKIQSLRPKLFIAGHSHILKVAYDP VYDLLHVNPGAAGEYGFHKVRTAVRLVIDGADMRDMEVGEWPRR >gi|313157345|gb|AENZ01000069.1| GENE 69 93681 - 94454 532 257 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157368|gb|EFR56791.1| ## NR: gi|313157368|gb|EFR56791.1| hypothetical protein HMPREF9720_2026 [Alistipes sp. 
HGB5] # 8 257 1 250 250 501 100.0 1e-140 MSAGCACMAQDRIHTKDGRVIECKVSDSSNRAGNVFFFPADDPETSHFIPVTQVDYLEIG DERVVRTPNDKLKPAETRFGLKAPALPHMLGATYDFLSNTYVHTHGATVSYAYFFDGRNG IHAELAYRSLESASASERRADPGTSVDIFAVIPHYERRFYIQRMGSFYFVRGGLGVNVYT ASILTDVGKSREGLAAPVVSLALGLDFRVSRNLYLEVAARYIGDFASWNFGYDSEPLLAN EGFFGGFSVGLRFGWGR >gi|313157345|gb|AENZ01000069.1| GENE 70 94755 - 94862 78 35 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIRKSYFKFAALLSMCIAAACSDDDPGKGGGGKLR >gi|313157345|gb|AENZ01000069.1| GENE 71 94940 - 95983 1512 347 aa, chain + ## HITS:1 COG:no KEGG:Pedsa_3099 NR:ns ## KEGG: Pedsa_3099 # Name: not_defined # Def: hypothetical protein # Organism: P.saltans # Pathway: not_defined # 10 347 162 444 445 122 28.0 3e-26 MFDAQYKGNSTTESIGNPVTAELVWQDAKNLIQDIYYVSKEKKIVAVTAPGTSGNAVVAA CGADGEILWSWHLWIADYDPAASLYTTPANASGTTWTFMDRNVGATTNAPDSFDCHGMIY QWGRKDPFTSAGTFTIINEDYSYQVDGERPIYNILNEELPKMRTRAEYHGTIAKSLRNPA VFYAMTYNFTGETDEYGQEIVLNDYRTKDWTDVSNDDYWGGVSGKKTIYDPCPVGYKVPT CDKDGNTPYAWLVYADMTWDDTNHGATQNGQWFPAAGTRVYASGGLDHQELNPYGGMWIG TAGKASANIEEYPELYGQYMFIVNGKRTFKVSKDARSQGMSMRCVKE >gi|313157345|gb|AENZ01000069.1| GENE 72 96169 - 98622 1652 817 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163764771|ref|ZP_02171825.1| ribosomal protein S8 [Bacillus selenitireducens MLS10] # 4 805 5 801 815 640 42 0.0 MQIKISKTLEAIIARAAFNTTKAGLDHSLKDFLTLELLREEGSLAYQLLSSRLKDWELYQ IRLRIEREMLAGQQSTTQSPEEFYRAFTEELRACSGAARSVSTAHALRAIVGDRTTATSR VLEMYGITGTVVDEDLKKFAVGDDFRTEIQVHMLDFNEENKPEEKTSSHLLDKFGVNLTR MAREGKIDPVVGREQEIERVVQILSRRKKNNPILIGEAGVGKSAIVEGLALRIARGEVPY TIADKTLFSLDVSSLVAGTKFRGEFEERMQQLLDELRKAKDTIIFIDEIHTIVGAGSTQG SLDTANILKPALARGELQTIGATTLDEYRENIESDSALERRFQKVVVEPTTPDQTLQILR NIAPHYERHHKVRYTEEALQACVTLTGRYITDRFFPDKAIDVMDEAGSRIHLQSAREPAE LREMETALTDAQRERREAVEALVYEKAASARMREIALRSKLGETRAEWQRSLETNPVEVT AEHIQQVITSITGIPAERISGGEMTRLQMLYDHLARRVVGQQEAVEKIARTIRRSRAGLK DENRPIGVFLFVGPTGVGKTLLAKEVSKWLFDEHRGLIRIDMSEYGEKHNVARLIGSPPG YVGYGEGGQLTEAVRRQPYSVVLFDEIEKAHPEVFNSMLQIFDEGHLTDGSGRRVDFRNT 
IIIMTSNVGSRNVVQKSVQVGYSTVSKSATAVAAPQCEYRKALEQTFAPEFLNRIDDIVL FRTLELSDVERIIELELQGLFEHTRRLGYKVKITDGAKRRLAAMGYESRYGVRSLKRTLM DNVEEPLSTLIIDGKLHEGDTVVVESGKPHGVRIRVA Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:31:53 2011 Seq name: gi|313157320|gb|AENZ01000070.1| Alistipes sp. HGB5 contig00047, whole genome shotgun sequence Length of sequence - 32720 bp Number of predicted genes - 25, with homology - 23 Number of transcription units - 14, operones - 5 average op.length - 3.2 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 33 - 92 2.4 1 1 Tu 1 . + CDS 121 - 486 218 ## COG0110 Acetyltransferase (isoleucine patch superfamily) 2 2 Tu 1 . + CDS 589 - 1845 1178 ## COG0677 UDP-N-acetyl-D-mannosaminuronate dehydrogenase 3 3 Tu 1 . + CDS 2123 - 3394 998 ## Ssed_2969 O-antigen and teichoic acid-like export protein + Term 3483 - 3527 8.6 4 4 Op 1 . - CDS 3510 - 4718 1329 ## Calni_1801 glycosyl transferase group 1 5 4 Op 2 . - CDS 4757 - 5509 1242 ## COG0340 Biotin-(acetyl-CoA carboxylase) ligase 6 4 Op 3 . - CDS 5557 - 6384 1280 ## COG2171 Tetrahydrodipicolinate N-succinyltransferase + Prom 6403 - 6462 2.1 7 5 Tu 1 . + CDS 6496 - 8637 3136 ## COG1505 Serine proteases of the peptidase family S9A 8 6 Tu 1 . + CDS 8789 - 10042 2008 ## COG0527 Aspartokinases + Term 10125 - 10161 -0.1 9 7 Tu 1 . - CDS 10226 - 11317 1753 ## PRU_2879 hypothetical protein + Prom 11614 - 11673 1.8 10 8 Op 1 . + CDS 11723 - 12070 522 ## COG0023 Translation initiation factor 1 (eIF-1/SUI1) and related proteins 11 8 Op 2 . + CDS 12071 - 15043 3978 ## BF1054 hypothetical protein 12 8 Op 3 . + CDS 15084 - 17489 3665 ## COG5009 Membrane carboxypeptidase/penicillin-binding protein + Term 17493 - 17534 8.5 + Prom 17500 - 17559 3.2 13 9 Tu 1 . + CDS 17619 - 18611 1327 ## COG0136 Aspartate-semialdehyde dehydrogenase + Term 18635 - 18675 6.6 - Term 18627 - 18659 1.5 14 10 Op 1 . - CDS 18775 - 19359 696 ## BVU_0502 putative RNA polymerase ECF-type sigma factor 15 10 Op 2 . 
- CDS 19487 - 19729 128 ## - Prom 19774 - 19833 7.9 + Prom 19760 - 19819 6.0 16 11 Tu 1 . + CDS 19881 - 20837 1030 ## COG3712 Fe2+-dicitrate sensor, membrane component 17 12 Op 1 . + CDS 21024 - 24356 4810 ## BT_1280 hypothetical protein 18 12 Op 2 . + CDS 24369 - 25958 2508 ## BT_1281 hypothetical protein 19 12 Op 3 . + CDS 25964 - 26893 1313 ## BT_1282 hypothetical protein 20 12 Op 4 . + CDS 26902 - 28257 1992 ## BT_1283 hypothetical protein 21 12 Op 5 . + CDS 28261 - 29883 1469 ## BT_1284 putative endo-beta-N-acetylglucosaminidase F1 22 12 Op 6 . + CDS 29953 - 30204 101 ## - Term 30050 - 30091 10.3 23 13 Tu 1 . - CDS 30179 - 31555 2265 ## COG3033 Tryptophanase + Prom 31544 - 31603 4.9 24 14 Op 1 . + CDS 31681 - 31848 211 ## COG1773 Rubredoxin 25 14 Op 2 . + CDS 31903 - 32511 414 ## PROTEIN SUPPORTED gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent + Term 32530 - 32574 8.4 Predicted protein(s) >gi|313157320|gb|AENZ01000070.1| GENE 1 121 - 486 218 121 aa, chain + ## HITS:1 COG:PA3156 KEGG:ns NR:ns ## COG: PA3156 COG0110 # Protein_GI_number: 15598352 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Pseudomonas aeruginosa # 2 115 35 148 191 120 52.0 5e-28 MLGPACNIGQNVMVAADVILGRQVKVQNNVSIYSGVICEDYVFLGPSCVFTNVKNPRSEI SRREQYKTTRVRRGATVGANATVVCGVEIGRYAFVGAGRSLRIRSPIMPSLRGCPHAWSD G >gi|313157320|gb|AENZ01000070.1| GENE 2 589 - 1845 1178 418 aa, chain + ## HITS:1 COG:PM1003 KEGG:ns NR:ns ## COG: PM1003 COG0677 # Protein_GI_number: 15602868 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetyl-D-mannosaminuronate dehydrogenase # Organism: Pasteurella multocida # 13 403 8 412 424 338 41.0 9e-93 MIYDRLTGRETSLAVIGLGYVGVPVALRFARCFDVVGYDMDARRIEKLREKYGAGCGSIR FTSLEEELSRAAFYIVTVLTPVDERKNPDLTCLVEATRTVARYLKRGDYVVYESTVYPGC TEEVCVPLLEKGSGLKLGADFKVGYSPERINPNDSEHTFSNTAKIVSANDGQALDEVARV YAEVVAADLCRASGIRVAEAAKMAENVQRSVNIALMNELQRLFSRMGIDMGEVIDMASSK 
WNFVKYRPGLGGGHCIPVDPLYLVSAASKLGVEMPVTSSGCAVNDGMAAYVVRSLLGAMG SDGRGARALLMGITYKENIDDIRNSRIAEMVGLLEREGLSVDVTDPHADPDKVYAMYGIR PVPALRPPYDLIVVAVAHDDYRGLDDAYFRSISRGAALLGDIRGLYKGRIKSLGYWSL >gi|313157320|gb|AENZ01000070.1| GENE 3 2123 - 3394 998 423 aa, chain + ## HITS:1 COG:no KEGG:Ssed_2969 NR:ns ## KEGG: Ssed_2969 # Name: not_defined # Def: O-antigen and teichoic acid-like export protein # Organism: S.sediminis # Pathway: not_defined # 5 333 75 400 449 84 24.0 1e-14 MSPEEGRDVVRKLNGLFLLLASVFSCILLWQAAAIAGVLGNPLLAEPLRLFAAAPMLLMP VLGVENILVVYGKAHLVAVYVLISRAFMIACVVLPVVVFDAGVSGAVAGFVLASLVTCVA GLKLSSVPFRNVSPVKSRLTVRELLRFSMPVFASSMYGFVIGSASQFLVSRYLGVEDFAL FANGYRELPLAGMIIGAAAGVLLPEFSRMSVEGADGGQFVRLWQAAVLKSSAIIYPLSVF CCVFAPEIICLLYGEGYREAAGLFRIVTLVNLARIMPYAPVLFALGRGKAFARAHLVTAL LIVGLELWCVMVFPSLWAIAAIATGCTFFCLFLLMTTTARVLGTSLAGLMPWRMLTIVLL VSAAACTAARLAVLLTGMTHHLVVVSVGFAVYLSLYLPFAQRAGICYMELFGPVLAKLRP NRT >gi|313157320|gb|AENZ01000070.1| GENE 4 3510 - 4718 1329 402 aa, chain - ## HITS:1 COG:no KEGG:Calni_1801 NR:ns ## KEGG: Calni_1801 # Name: not_defined # Def: glycosyl transferase group 1 # Organism: C.nitroreducens # Pathway: not_defined # 2 393 5 396 403 194 33.0 5e-48 MNILIVCNHFYPHNRIAAFRLNAFARYFREAGHSVTVITEGDRDETVMWNGCEVHYVKDP VITSSKQESLLQRRKKWAFRRILSALQFRLFLDYKRIWQFKASKKARKLAKSRRFDVVLS SYGHLSSHRIAYRLHRKTPFYWIADMRDEMSKWPWLLPINSRRLLFYERRILKDADLILS VSAPLVEDFKQIGGGIDKVIEITNGYDYEEVHDVSFQPVYTMAFIGHFYNSITPDKWFGA FSELVAEGALPSDSRILIIGNTSPLAVPENIRQNVFQIRQVDHDEAIRKSLETDTLVVVH PKGRKGVYTGKLFDYLATNKPILAICDPDDVIADLLKETRAGFTADETDNEQVKRMLLRC YSIWKNKEVLPRDWDKIRQYSRKNQVGRLLKYLAGQEALKQH >gi|313157320|gb|AENZ01000070.1| GENE 5 4757 - 5509 1242 250 aa, chain - ## HITS:1 COG:MA0676_2 KEGG:ns NR:ns ## COG: MA0676_2 COG0340 # Protein_GI_number: 20089561 # Func_class: H Coenzyme transport and metabolism # Function: Biotin-(acetyl-CoA carboxylase) ligase # Organism: Methanosarcina acetivorans str.C2A # 2 247 14 253 254 94 29.0 2e-19 MIYRIDETTSTNDEARDAKYRHGDIVWAERQTAGRGQRGHTWTSPEGENLTFSMVLEPRF 
LPVGEQFLLSEAVTLALTDTFAAYGIDTRIKWTNDIYIGDRKLVGILIEHNHAGASLSRT IAGIGINVNQTAFDPALPNPVSLAQAGGRKFNRSRLLETFLVRCLRRYAQLERGEKETLQ HAYRERMYRLGEQHPYRLPDGTLFQAAIEGVLPSGELILRHADGTRHEYLFREIEFVIAG KQENAGPAEK >gi|313157320|gb|AENZ01000070.1| GENE 6 5557 - 6384 1280 275 aa, chain - ## HITS:1 COG:BMEII0270 KEGG:ns NR:ns ## COG: BMEII0270 COG2171 # Protein_GI_number: 17988614 # Func_class: E Amino acid transport and metabolism # Function: Tetrahydrodipicolinate N-succinyltransferase # Organism: Brucella melitensis # 3 270 7 284 284 248 48.0 1e-65 MYSELKEIIGQAWENRELLREESVRQAVRQTVELVDKGELRTAQPVDPEKSQWQVNEWVK KAIILYFPIQPMRKMEAGELEWYDKMELKHGYEQLGVRAVPHAVARYGAYIAPGAILMPS YVNIGAYVDTGTMVDTWATVGSCAQIGRHVHLSGGVGIGGVLEPVQAAPVIIEDNCFIGS RSIVVEGAHVCREAVLGSNTVITGSTHIIDVTGPEPVTYKGYVPPRSVVVPGSYRKQFPA GEYSITCALIIGQRKESTDKKTSLNDALRDFGVSV >gi|313157320|gb|AENZ01000070.1| GENE 7 6496 - 8637 3136 713 aa, chain + ## HITS:1 COG:all2533 KEGG:ns NR:ns ## COG: all2533 COG1505 # Protein_GI_number: 17230025 # Func_class: E Amino acid transport and metabolism # Function: Serine proteases of the peptidase family S9A # Organism: Nostoc sp. 
PCC 7120 # 28 704 6 687 689 694 51.0 0 MMRLPRMAAALAGAAAILTGCNDMKQIKHMPYPRTERTDVTDNYFGTEVPDPYRWLEDDN SEATAAWVKAQNVVTQDYLSQIPFRGAIRERLTELWNYPKEGIPAKHGDAWYYFYNDGLR NQSVLYRTAQPGGEGEVFIDPNTLSEDGTVALSGVTFSKDGKYCAYSVAASGSDWVEIRV MNTADRTLTSDRINWVKFSGAEWAPDSKGFYYSAYDAPQKGVFSSQNQFQKVYYHRLGTP QSADRLIYADAEHPLRYFSPWPSKDGQWLFIVASEGTSGTEVLYKKVSEPKFRTLLPGFD ADYAPVECRDGQLYYVTNRDASNYALMKVDLNDPSKVSTVIPESGGKLLEGVGSAGGYLF ATYLEHAQSKVCQYGFDGKLVREVELPAIGTVSGFDGEKDDTELYYSLTNYIAPATIYKY DIAGGASTLYKAPAVNFDPSLFTTEQVFYTSKDGTKVPMFITRRKDMKLDGGNPCYLYAY GGFQINQTPTFRPSAMMFVEQGGIYCVANLRGGSEYGEAWHKAGMLENKQNVFDDFIAAA EYLIAEKYTSSDKLAIAGGSNGGLLVGACEVQRPDLYAVCLPAVGVMDMLRYHKFTIGWG WAVEYGSSENEEQFDYIYKYSPLHNIREGVKYPATLVTTADHDDRVVPAHSFKFAAQMQH CQAGDAPVLIRIESNAGHGAGKPTSKRIAEEADTYSFLFQNIGVPYKEVKKQK >gi|313157320|gb|AENZ01000070.1| GENE 8 8789 - 10042 2008 417 aa, chain + ## HITS:1 COG:VC0391 KEGG:ns NR:ns ## COG: VC0391 COG0527 # Protein_GI_number: 15640418 # Func_class: E Amino acid transport and metabolism # Function: Aspartokinases # Organism: Vibrio cholerae # 3 402 34 435 479 171 31.0 2e-42 MKVYKFGGASVRNADGVRNLRKIIDDEQNLFIIVSAMGKTTNALEKVFEGLQKGDKQLSM EHVAALREYHAGIIDDLWRGHKQLERVDALFDELERVAVETVYKPSDAELWYDTIVAFGE LVSTTIISEYLNYAGVSNRWIDMRRCFLTEQRHKDAGVDIEASAPLLKGALAECVENIFV GQGFIGGAPDGTTTTLGREGSDYSAAVVANILDAESMAVWKDVDGVLNADPKIFPDAVQI AELNYLDTIELAYSGAQIIHPKTIKPLQNKNIPLYVRPFGDKRKPGTVIRGMSAPVEVPI LILKKDQVLLTIRSRDFSFVLEEKFATIFSLLERFRIKTNLIHNSAVNLSLCVDNSWHID EAIEALREAGFDVMKAENMELLTVRGYTDELWRKYARGPQVFVRQATQSTVRVVRKK >gi|313157320|gb|AENZ01000070.1| GENE 9 10226 - 11317 1753 363 aa, chain - ## HITS:1 COG:no KEGG:PRU_2879 NR:ns ## KEGG: PRU_2879 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 23 336 15 330 332 219 36.0 1e-55 MAKKFGDKLRRFNPFTRPATLEELPKPLPANPDQRLVKVYVEGYEDVAFWRGIFDHFQNP YLRFEISVPNRADLPKGKKVLMGMIPRSSEELILCVDSDFDFLFADRTEQSREVNAARYM FHTYAYATENFLCYAPSLHNVCVKATKNDTRIFDFVHFMHEYSCTIYPLFLWYAYSAQLA TENVFPLIDFKSAVRIGYLDLENNGEKTLAWLQKNVAKREEMLRRRNPKMIDPMLEFEQQ 
LKSRGLTPENAYLFMHGHTLMDNVVMILLNSVCEKLRQMSIAKITASKKQGVALKNEMSN YTNSLRSIRDVLLDNENYTKCPLYKRLQRDIEKYIARTIWNMKRSGEIRDDSTWTILRNM RQG >gi|313157320|gb|AENZ01000070.1| GENE 10 11723 - 12070 522 115 aa, chain + ## HITS:1 COG:alr3795 KEGG:ns NR:ns ## COG: alr3795 COG0023 # Protein_GI_number: 17231287 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Translation initiation factor 1 (eIF-1/SUI1) and related proteins # Organism: Nostoc sp. PCC 7120 # 13 114 14 114 115 79 42.0 2e-15 MADNDWKARLGMVYSTNPDFNYETTETPEAETLPPARQELRVWLDRKQRAGKVVTLVKGF VGRDADLQELARLLKTKCGVGGAAKEGEIIIQGDHRDRVVEILTKSGYRCKKAGS >gi|313157320|gb|AENZ01000070.1| GENE 11 12071 - 15043 3978 990 aa, chain + ## HITS:1 COG:no KEGG:BF1054 NR:ns ## KEGG: BF1054 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 986 1 947 952 332 27.0 4e-89 MKRLLHILLLLCALGFAHETSAQYYTWGSDPAGLKWSTIRTPDVRMIYPDTVSAVARRTL FYIRTVRPDIAFGFRHGPMRIPFVMHPENFQSNGLVMYLPKRVEFLTSPAIDGYSMPWYK QLVAHEYRHAVQYNNLDRGVIRALSYILGQQGSTIGLLCMPIWAMEGDAVMSETAMSSFG RGLQPSFSLGYRAMGRVGRDRKDTRDRRNIDKWFCGSYRDYIPDHYELGYQICSYAYDRY GEVVWDKVARYGSRNPYVLATTRVALEKYYDTNVSRLFRETFDVLERHWESLPQVEDSAE PLTPMPAGNYTTYQWPLPLDAASALALKTDYDRPSRFVRLDTRTGEEEVICYTGAVSTRP AMAGGRVWWTEYRRSKLFEQRVNSQLCYMDLSDGTPRMVVGRRNALYPTPSEDAVAWVEY NPDGRYTVVVQGKEGAEKRFATPDRSEIHGLAWDDATRGYYVIVTDDSGMWLGRIDGDGV HPVTEGAYITLSNLRAGGGRLYFGSIASGRDEAHCFDLKTRREYRITTSAYGSFMPVPWR DGEGRERVLLTAYDRRGYHVAAQDADADALIPVAPSKLPLNVVNPDRKRWDVVNLDTVRF SAVDSLRQEGVYRAKRYRKVPNLVNVHSWTPVAFNPFEAVDEHNINLNLGVTLLSQNLLS NTEAFASYGWNRNEGAIFNLGVRYFGLGVRLDLDASYGGNQVFYSVGQYDEQTGKYEYQQ RPSPDKYYSVGLSATLPLYFQRGYHTRQLSVTSGWNYSNGMVANLGKIEWNAGQISNIQR IGFRKGLHKLSFGLGYSDQVRMAHRDFAPRWGYMLSTAYTFNPANTHFSDLISFYGQAYL PGFAAHNSLKVAATYQTSIGGYKFPSGYAPLSYRSTRLIPRGYTSSDIISNNYTAFSADY QLPVWYPEGGIGSVLYFKRIRLNVGGDYAQFRDVGRGGMKWRRIWSVGGDIVFDINAFRQ PASATSTFKLSVYRPANGGVWWAAAVGLPF >gi|313157320|gb|AENZ01000070.1| GENE 12 15084 - 17489 3665 801 aa, chain + ## HITS:1 COG:NMA0655 KEGG:ns NR:ns 
## COG: NMA0655 COG5009 # Protein_GI_number: 15793640 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane carboxypeptidase/penicillin-binding protein # Organism: Neisseria meningitidis Z2491 # 25 754 14 703 798 263 28.0 8e-70 MPKKKKTDVSPKTIKWIWCLAFAPFALLFFMLLLTAFGLFGRMPSFEELENPRSNLATEI YSEDGKVIGTFFVQNRSYVQYADLFPDDSTLHIRLGGHEVPPIVAALIATEDARFRTHSG IDIPSLARVAVKTVLLQNTSQGGGSTITQQLAKNLFPRDTVRNRGRLVRNAKLVTSKLKE WITALKLEYNYTKEEIAAMYLNTVEYGSNAYGIKSAALTFFNKEPHELNVQEAAVLVGVV NAPSRYSPVRNYDNALARRNLVISRMEDAGAITRKQRDSLSALPIVLNYRPVSHNEGTAT YFREMLRLVMNAERPKRNQFYNEWDYDQAVKEYEENPLYGWCRKNRKADGTPYNIYRDGL KIYTTINSTMQTYAEQAVQKQMESVIQPRMDAQYRRTKTLFIDADKAERERIMRHAVRYS DRYREMRNAGASEKEINASFDRPCSMKVFTYKGERDTLMTPRDSILHHKRIMRASFVALD PRTGYVKAYVGGPNFRYFKYDMAKQGKRQIGSTIKPFVYTFAIDHLGLTPCTMVPNLPVT IETANGTAWSPKEAGKVEYDGVMHPLSWGLARSRNNYSAWIMKQAKQPAAVADFIHNMGI RSYIDPVPALALGSSESNVFELVSAFSTFANEGVHNDAIFVTRIEDRQGNVIANFIPQSQ DAVSERTAYTMLTMLQGVVTSGTAGRLNRDYGFSGVEIGGKTGTSNKNRDAWFMCVAPKL VAGAWVGGEDQSVHFVSGGEGSVMALPIVGDFMKRVYDDGRLGLSRNDQFIRPAMMPRYD CDEETDPEGRVPDADDDNFFD >gi|313157320|gb|AENZ01000070.1| GENE 13 17619 - 18611 1327 330 aa, chain + ## HITS:1 COG:aq_1866 KEGG:ns NR:ns ## COG: aq_1866 COG0136 # Protein_GI_number: 15606903 # Func_class: E Amino acid transport and metabolism # Function: Aspartate-semialdehyde dehydrogenase # Organism: Aquifex aeolicus # 2 330 4 335 340 371 56.0 1e-102 MKIAIVGASGAVGQEFLRILEQRDLPIGELLLFGSQRSAGRVYKFRGKDITVKLLQHNDD FKGVDFALTSAGAGTSREFAETITRHGAVMIDNSSAFRMDAEVPLVVPEVNPEDAKNAPR RIIANPNCTTIQMVVALKAIEQVSHIRRVHVSTYQSASGAGAAAMDELVEQYAELGAGKE PTVNKFAFQLAYNVIPHIDVFTDNDYTKEEMKMFHETRKIMHSDIAVSATCVRVPVMRAH SESIWLETERPVSVAEAREAFAKAKGVVLMDQPADKVYPMPLFIAGKDPVYVGRIRRDLT CENGLTFWCVSDQIKKGAALNAVQILETLI >gi|313157320|gb|AENZ01000070.1| GENE 14 18775 - 19359 696 194 aa, chain - ## HITS:1 COG:no KEGG:BVU_0502 NR:ns ## KEGG: BVU_0502 # Name: not_defined # Def: putative RNA polymerase ECF-type sigma factor # Organism: B.vulgatus # Pathway: not_defined # 11 191 6 185 186 
123 35.0 5e-27 MDRQTPVLTSDEFGKFFSEYKNRFVSIAYGYVHDTDIAKDIVTDSFMYLWEHREQVNMAE NIKGYIYFCVRTRCANHIKEQEVLRRAKNEITKDAYWKLQSSLNSLSSDELSKKLFQSEI IELFSRELAKMPELTRAVYQASRQEGLTYQQIAERYNISTRQVASEMQRAYAQLRVPLKD YLPVIGLLFAMLSK >gi|313157320|gb|AENZ01000070.1| GENE 15 19487 - 19729 128 80 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNKLRPPDFPGRNLPGRIRRGILQKIRTGIGIPEKLPIFADTISHESGSAAGGPDFGFHT GCNEKGAPPAPFMIYCRSSK >gi|313157320|gb|AENZ01000070.1| GENE 16 19881 - 20837 1030 318 aa, chain + ## HITS:1 COG:AGl2289 KEGG:ns NR:ns ## COG: AGl2289 COG3712 # Protein_GI_number: 15891252 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 6 316 25 322 323 62 25.0 9e-10 MNQEILYRYLKGDTTPQEDLQVAEWLEADPVANQKELDIVRFVFEGMELYGNDARAGALR TRGIVIPWRKVWIFAMRAAAVLLLVFAGGYVAHQRTYEAISDRLTAVHIPNGQRIEITLP DGSRVWLNSGARIEYPVVFKKNLRSVKLSGEALFDVRHDAECPFEVETFATKINVLGTKF NVIADETYGRFSAALLQGRIKVTNLLDPRRQEIVMKPDDIVNLSNGRLFVETIRDPEALC WTEGLVRISGLPFDELMAKFEQVFDVKIVIARNSIPDIGKISGKIRVNDGIENALHILQY AADFSFEKDEETNVVTIR >gi|313157320|gb|AENZ01000070.1| GENE 17 21024 - 24356 4810 1110 aa, chain + ## HITS:1 COG:no KEGG:BT_1280 NR:ns ## KEGG: BT_1280 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 31 1110 33 1110 1110 1231 58.0 0 MNKTVFFKTFRRLFGFVLAISLSASVLTAQAQNARITLNLSDVPMSQVMKEIEKQTRYLF INKGVDTGKKLSVNLTQKTIQEVLDYVLKNTDITYKIDGTYIILAEKAAEKPVAVTGKVS DVNGVPVIGASVVVKGTTVGTSTDMDGTFSLQVPPPAATAQLEVNFLGYEPVMVSVGNRT SFDVTLHEAASEIEGVVVTALGIKRQEKALSYNVQQVKAEDITTVKDANFINSLTGKVAG VTINASSSGVGGASKVILRGNKSISQSSNALYVIDGIPMYNFGGGGGTEFDSRGASESIA DINPEDIESMSVLTGAAAAALYGSNAANGAIMITTKKGKAGALKVTLSSNTEFLDPFVQP EFQNRYGTGSNGVRSGSGIYSWGQKLLPAARYGYTPADFFETGHVYTNAVTVSGGTDKNQ TYFSAASVNSDGIIPNNEYDRYNFTFRNTSYFLKDKLRLDASASYIYQKDQNMTNQGVYS NPLVPAYLFPRGENFAIYKKFERYNPGTKLMEQFWSADMEGGDLRMQNPYWIAYRNLRNT 
DKKRYMLSFSASYDILPWLSVSGRVRLDNSNSLYTQKLYASSNTTITDGGKNGHYTEARA YDTQTYADVMLNVNKTFGDDWSLTANVGASLNNIKTDELSYRGPIQENGLPNIFNVFDLD DTKKRAEKVGWHDQTQSVFASVEVGWKQMLYLTATGRNDWASQIAGSPSKSFFYPSVGLS WVPSSTWDLGNAFSYLKLRGSIASVGLPFPRFLTTPTYEYDATGKIWKDKTHYPIGDLKP ERTITYEVGIDARLWKNISLSASWYSADTKNQTFDPALPPSSSYTTIYLQTGHVRNTGVE LSLGYSNQWRDFGWSSNFTFSWNKNEIIDLASGALNPVTGQPLNLSELDIKGLGKAKYIL KQGGTLGDLYTTSDLKFNDNGYVEVDKAGNLLLTDEGDQIYLGSVFPSENLAWRNDFRYK GINLGLLFTGRIGGICYSATQANMDLYGVSEVSAAARDAGGVLINGREMVDAQKWYQAIG SQSGLPQYYTYSATNFRLQELSLGYTLPRKWFRNKMGLTVSFVGRNLWMIYCKAPFDPEA VASTGLSYQGIDYFMMPSLRSLGFNVKFEF >gi|313157320|gb|AENZ01000070.1| GENE 18 24369 - 25958 2508 529 aa, chain + ## HITS:1 COG:no KEGG:BT_1281 NR:ns ## KEGG: BT_1281 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 28 525 27 528 531 462 48.0 1e-128 MKKINAYMQLLLLSVVLLSISCTGDFENMNRNPNEVTDGQMDALNYKIGTKFQSLQGLVV PVQEHMYQFNESLSGGPFGGYIGATVDSWLTKFETFNPSADWRKWPFANVITETYTPWRG IVTGTDDEVAVAFAKLLRVAIMHRVTDSYGPIPYSNLETNESITVKYDKQEEVYMKMFGE LDEAIAAFAANTTLPAEAFSRYDRVYGGNIAQWLKYANSLKLRMAMRLSEVKPDVARTKA AEAIAAGVITANADNAAMHAAENRTTLIYNDWGDHRVGADILCYMTGYEDPRLEKMFLPN DKGDYVGIRIGIDVTSKSQAQAKYSNMIVTSSTPYLWINAAEVAFLKAEYELRWGTKDAA KALYEQAIRLSFEDKGAKDADAYIADKTRKPAAYNDPLGNYSATALSSITIAWEDDSAEG ADKAAIKERNLERIITQKWIAIFPLGVEAWSEHRRTGYPRLLPAVEDKSGGTVDLAQGAR RLPYPVEEYDKNNANLQEAVQMLNSESQGSRKGDGMGTRVWWDVKPYNN >gi|313157320|gb|AENZ01000070.1| GENE 19 25964 - 26893 1313 309 aa, chain + ## HITS:1 COG:no KEGG:BT_1282 NR:ns ## KEGG: BT_1282 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 24 308 26 324 327 174 35.0 5e-42 MKQHRIIRFSGLFLLLGVLFGLAACNDWTEMEAVDNNVKKPWEQDPALWAEYTAALRDYK KSEHFIVYARLHNSPAPAASEKDFMRCLPDSLDIVALTNADNFSRYDAEDMAVMREKGTK VLWQVDYAGRAAEFADAAKLGAWLDRVVSSVAANGMDGYSFTGIPNAGDPAAVQAAALIV SKLSADESKLLVFEGNPLFVAEADRAKVDYFILDTEKTEDVGDVKLQILNATGYAAVPAS KLLLAAEAGAPLKDEDRVEYAAVEEMSRRVIAYGPLRGLGAYNIGNDYYNADMNYKMIRQ AIQTLNPSK >gi|313157320|gb|AENZ01000070.1| 
GENE 20 26902 - 28257 1992 451 aa, chain + ## HITS:1 COG:no KEGG:BT_1283 NR:ns ## KEGG: BT_1283 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 17 447 16 437 440 272 38.0 2e-71 MKLIHKYLILGVAAAGAFTACDNYDADDHKFDNVVYLDVSKTDEVQPATFSNNTPTVEKP IQATLAYPEGQDVAVSLTVDPSLVATYNARYGSDWKMLDAKYYELSSDNVTIHAGKTASE VVTLRLKGLMGEGEQQTGALPIDETYLLPVRISHSSMSVLKGSEAAYYVVKRSSAITVAA QLTDNWIEFPSLDKYGDNSMPWNKLRAMTYEALIYIDDFALTDIQGNPVSISTVMGEEDY ALLRIGDTNFEREQLQFDGSGVGFGKFPGRDNTKVLQKGYWYHVACTYDQNTRVVRVYVN GRIQSEGTEMGPAALDDKNFIHLARRALYDLWDNETNPGKKNEYKTLGYDTYSEARQFFI SYSYNDYRPLNGKIAEARVWSVARTPEQIWENMYNVENPKDDPTLLGYWKFNDGKGNTVK DYSRYGNDGQAKYDLKWPDGIEIPKINEVEE >gi|313157320|gb|AENZ01000070.1| GENE 21 28261 - 29883 1469 540 aa, chain + ## HITS:1 COG:no KEGG:BT_1284 NR:ns ## KEGG: BT_1284 # Name: not_defined # Def: putative endo-beta-N-acetylglucosaminidase F1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 452 3 434 508 145 26.0 3e-33 MKTQKIQNLLVVTLFCAAALAGLTACDADPVEREGGKLPDKDGLENTYGMLRSTRSVRDE VRVFLTEGNGFVTDNFYYQLTRPLGSALSLEAAVKAGEGETERTLLPEANYDFPDGKKLD IAGDAQHSALKRIRFFAENLAPGEYYLPLTVAADAADAADAERQTVNYLLTVRGLQMGEY KLNQEQVFTVFYLNTALYQPLLVDEYLMSKLDENWENAWPERPDGTRTIGDIVNLRTVVL DYDAATSRALLNLGADMRYVLGHATKYIRPLQDKGRKVCISIEGGGKGLGFCNLTDAQIA DFTAQVKAVVTEYGLDGVNFWDRNSGYGKEGMPAMNTTSYPKLIKAVREALGDGLLVTLT DHMEPTEYFWDTAAMGGIEVGKYLDYAWSGYLDNSKDMQIVDPWHQGAAFVSAEWPRKPI AGLAPAKYGCVNIPWYTGEAETTPTERIAPVFLWRDAGYKQSNILVFEEMRTVLQDKYES TWTSSITDGYKFFADDGVYLIDESPWGKNYFSENEYMFDAIKLGELADGRRGYNKWTKDW >gi|313157320|gb|AENZ01000070.1| GENE 22 29953 - 30204 101 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MNKPQEQITPCNNAPAKGGKSITNLNCGSRLVFTAETTLQPECCSVVCGVRRVLRRGFFT EEPAIRRPSDRRIPGVRFCPAPR >gi|313157320|gb|AENZ01000070.1| GENE 23 30179 - 31555 2265 458 aa, chain - ## HITS:1 COG:PM0811 KEGG:ns NR:ns ## COG: PM0811 COG3033 # Protein_GI_number: 15602676 # Func_class: E Amino acid transport and metabolism # Function: Tryptophanase # Organism: Pasteurella multocida # 6 
456 6 455 458 454 46.0 1e-127 MTLPYAEPYKIKMTEAIRTSTRAERETWIREASYNLFKLRSDQVTIDLLTDSGTGSMSDR QWAAMMTGDESYAGASSYFRLRETISRLFGMPFFLPTHQGRAAENVIFSALLKEGDIVPG NSHFDTTKGHIEFRRCHAVDCTIDDAADTQKELPFKGEMDIAKLEKLLRENPREKVPCVV LTITNNTAGGQPVSMRNIRETAEVCRRYGVPLLLDSARFAENAYFIKTREAGYADKTIKE IVREIYTYADIMTISAKKDGVVNMGGFVAMRSEELYKRAMTFSIMFEGYVTYGGMSGRDM DALAVGLDENTEFEQLDARIRQVKLLGDLLDEYGVPYQRPAGGHAIFVDAKKVLPNLPKE QFIAQTLAVELYLEAGIRGVEIGSILADRDPDTHENRYPRLELLRLAIPRRVYSDNHIRV IAAACRNIYERRAEITTGYRITFEAPILRHFTVELDKI >gi|313157320|gb|AENZ01000070.1| GENE 24 31681 - 31848 211 55 aa, chain + ## HITS:1 COG:CAC2778 KEGG:ns NR:ns ## COG: CAC2778 COG1773 # Protein_GI_number: 15896033 # Func_class: C Energy production and conversion # Function: Rubredoxin # Organism: Clostridium acetobutylicum # 1 54 1 54 54 75 68.0 3e-14 MKKYRCTVCEYIYDPALGDPENGIAPGTAFDDLPDGWICPVCGEGKWAFEPLEEQ >gi|313157320|gb|AENZ01000070.1| GENE 25 31903 - 32511 414 202 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|15900660|ref|NP_345264.1| superoxide dismutase, manganese-dependent [Streptococcus pneumoniae TIGR4] # 15 200 7 198 201 164 41 9e-40 MATDTAKQARFTPKDLPYAYDALAPQVSEETLRFHHDKHYVGYVNKLNELIVDTPYAEQP LEDIVVSADGAIFNNAAQMWNHEFFFDELSPKAQTRPTGALLKAIDENFGSFDQLKAQME KAAAGLFGSGWVWLAEDKSGKLAIVSEQNAGNPLRYGMKPLMCFDVWEHAYYIDYRNRRA DAVAALWDRIDWKVVEERYAKK Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:33:31 2011 Seq name: gi|313157286|gb|AENZ01000071.1| Alistipes sp. HGB5 contig00024, whole genome shotgun sequence Length of sequence - 41545 bp Number of predicted genes - 31, with homology - 30 Number of transcription units - 13, operones - 7 average op.length - 3.6 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 2 - 1282 1447 ## COG0021 Transketolase + Term 1405 - 1461 12.2 2 2 Tu 1 . - CDS 1842 - 3071 1190 ## BDI_3500 integrase - Prom 3230 - 3289 5.0 + Prom 3424 - 3483 7.9 3 3 Op 1 . + CDS 3562 - 3927 602 ## PRU_2129 putative transcriptional regulator 4 3 Op 2 . 
+ CDS 3930 - 5912 1490 ## BF0821 hypothetical protein
+ Term 6029 - 6067 9.1
5 4 Tu 1 . + CDS 6305 - 7732 1778 ## Calkr_1515 kwg repeat protein
+ Term 7738 - 7768 4.3
- Term 7726 - 7756 4.3
6 5 Tu 1 . - CDS 7786 - 9642 2584 ## Lbys_3247 hypothetical protein
- Prom 9668 - 9727 3.0
- Term 9690 - 9732 8.8
7 6 Tu 1 . - CDS 9850 - 10230 544 ## BDI_0449 DnaK suppressor protein
- Prom 10255 - 10314 6.9
8 7 Op 1 1/0.000 - CDS 10405 - 13806 5382 ## COG0060 Isoleucyl-tRNA synthetase
- Prom 13888 - 13947 4.5
- Term 13842 - 13891 8.6
9 7 Op 2 . - CDS 13966 - 16233 3398 ## COG0642 Signal transduction histidine kinase
10 7 Op 3 . - CDS 16247 - 16804 798 ## COG0110 Acetyltransferase (isoleucine patch superfamily)
11 7 Op 4 . - CDS 16819 - 18648 3000 ## COG1132 ABC-type multidrug transport system, ATPase and permease components
- Prom 18697 - 18756 3.4
- Term 18703 - 18728 -0.8
12 8 Op 1 . - CDS 18789 - 19253 577 ## COG0328 Ribonuclease HI
13 8 Op 2 . - CDS 19260 - 21887 3908 ## COG0525 Valyl-tRNA synthetase
- Prom 21980 - 22039 5.6
+ Prom 21885 - 21944 4.3
14 9 Op 1 . + CDS 22005 - 24434 3864 ## COG0466 ATP-dependent Lon protease, bacterial type
+ Term 24448 - 24495 10.1
15 9 Op 2 . + CDS 24514 - 25269 962 ## COG1212 CMP-2-keto-3-deoxyoctulosonic acid synthetase
+ Term 25277 - 25330 1.9
16 10 Op 1 . + CDS 25348 - 26100 1089 ## COG2877 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase
17 10 Op 2 . + CDS 26100 - 27065 1361 ## COG0794 Predicted sugar phosphate isomerase involved in capsule formation
18 10 Op 3 . + CDS 27069 - 27353 391 ## COG2350 Uncharacterized protein conserved in bacteria
19 10 Op 4 . + CDS 27381 - 29378 2447 ## BVU_0620 glycoside hydrolase family protein
+ Term 29420 - 29460 -0.9
- Term 29210 - 29245 1.3
20 11 Tu 1 . - CDS 29416 - 29760 448 ## Toce_0227 PAS fold domain protein
- Term 29779 - 29813 3.1
21 12 Op 1 . - CDS 29840 - 31621 2931 ## COG0038 Chloride channel protein EriC
22 12 Op 2 .
- CDS 31652 - 32776 1790 ## COG0180 Tryptophanyl-tRNA synthetase + Prom 33009 - 33068 10.5 23 13 Op 1 . + CDS 33088 - 33987 1338 ## COG0583 Transcriptional regulator + Term 34008 - 34050 11.0 24 13 Op 2 . + CDS 34051 - 34551 -242 ## 25 13 Op 3 . + CDS 34569 - 34919 528 ## COG0799 Uncharacterized homolog of plant Iojap protein 26 13 Op 4 . + CDS 34959 - 37037 1249 ## PROTEIN SUPPORTED gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 27 13 Op 5 . + CDS 37067 - 37915 1058 ## COG0575 CDP-diglyceride synthetase 28 13 Op 6 . + CDS 37936 - 38607 716 ## COG0688 Phosphatidylserine decarboxylase 29 13 Op 7 . + CDS 38620 - 39726 1775 ## COG0628 Predicted permease 30 13 Op 8 . + CDS 39726 - 40091 525 ## COG0196 FAD synthase 31 13 Op 9 . + CDS 40098 - 41507 2131 ## COG0499 S-adenosylhomocysteine hydrolase Predicted protein(s) >gi|313157286|gb|AENZ01000071.1| GENE 1 2 - 1282 1447 426 aa, chain + ## HITS:1 COG:BH2352 KEGG:ns NR:ns ## COG: BH2352 COG0021 # Protein_GI_number: 15614915 # Func_class: G Carbohydrate transport and metabolism # Function: Transketolase # Organism: Bacillus halodurans # 9 424 258 663 666 245 38.0 1e-64 PSGESFEDKVSTHGQPLTAAGADFAATVKNLGGDPNDPFAVFSESREAFSERREALRAWA SRQAEVEKTWRAEHGDLARKLDMFLSGRVPEIDYKSIEVKAGVATRAASAAVLGVFAEKI ENMIVASADLSNSDKTDGFLKKTKAFTKGDFSGKFFQAGVSELTMACVANGMALHGGVIP ACGTFFVFSDYMKPAVRLSALMRLHVIYIWTHDSFRVGEDGPTHQPIEHEAQIRLMEHLR NHHDERSMVVLRPADGDETVMAWKLAVEEQRPVALVLSRQNIKSLPALGASRREEASQVA KGGYVVLDSAKPEVVMVATGSEVSTLVEGASLLAAEGIAVRVVNVPSEGLFRDQPRSYQE SVLPVGVVRYGLTSGLPVNLMGLVGEKGMIHGLDHFGYSAPYTVLDEKFGYNGKTVAEEV KKLLGK >gi|313157286|gb|AENZ01000071.1| GENE 2 1842 - 3071 1190 409 aa, chain - ## HITS:1 COG:no KEGG:BDI_3500 NR:ns ## KEGG: BDI_3500 # Name: not_defined # Def: integrase # Organism: P.distasonis # Pathway: not_defined # 3 407 2 402 402 334 43.0 5e-90 MQRSTFKVLFYVKRQSEKHGQVPVMGRITINGTMSQFSCKLSVRSSLWDAKANKASGKSL ESQRINEKLENIKTNIGKQYQRLCDRDSYVTAEKVRNAFLGMGDDCRLLLQTFDEYLAEF 
RKRVGKDRAYSTYEDYCLRRRRLAAFLEYEYHVKDIPFKELNRNFIEKFVVYLSTVKGLA SGTISAGVKKLRLMTYTAYKNGWILVDPFAGFHVRPKYAERRYLSASELQAVMDVELPNY RTGINRDVFVFCAFTGLSHADVVKLTHADIHTDYNRNHWIIDKRQKTGTQFRVKLLPIAE IIYDRYKNLHLEGNKVFPIKRYYKTMNMSLRHVARLAGLSFNPTMHMARHTFATTVTLAQ GVPLETVSKMLGHKRITTTQIYAQITNDKIGRDMAALSEKLDSIFKVAQ >gi|313157286|gb|AENZ01000071.1| GENE 3 3562 - 3927 602 121 aa, chain + ## HITS:1 COG:no KEGG:PRU_2129 NR:ns ## KEGG: PRU_2129 # Name: not_defined # Def: putative transcriptional regulator # Organism: P.ruminicola # Pathway: not_defined # 1 116 1 116 119 154 63.0 9e-37 MEKLTRREEELMRCFWEHGPLFVRELVALAAEPKPHFNTLSTMVRALEAKGYVGHKTFGS TYQYYPVVTEEEFSRRTLGGVISKYFENSYLGAVSALVEEEKISVEDLRELIDRIENQNR R >gi|313157286|gb|AENZ01000071.1| GENE 4 3930 - 5912 1490 660 aa, chain + ## HITS:1 COG:no KEGG:BF0821 NR:ns ## KEGG: BF0821 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 372 2 434 520 285 40.0 5e-75 MYALMIYSLKVGACLAVFYLFFKLLLSRETFHRLNRIVVLAAMVLSFILPFCVITIYREL PAAPEMPAAEQLFEASAEPQPEPFPWDKAAALVFLTGAGATLLWTFGSVFGVIRMIRRGR RERLADGTVLVRIGRSVTPFSWCRYIVLSEKDLAENGDAIVLHEKAHLRLRHSVDLLLTD LAGCLQWFNPAMWLLRRELRAIHEYEADEAVLDSGVDAKHYQLLLIRKAAGGRWYSVANS FNHSKLKNRITMMLRKRSSRWAVARVLFVLPLAGLALGAFARTAYVFPDDKGKKENVTIL IRNAKIDPSDGKSGNPLILVDGREVKSIDSLSPDRIASVSVLKDSASKASYGEKGRDGVI IVTTKEGGADPQMQTFTYGGNSSSTALVVRGAGTEPDSLAGRVRTMTYRIDGRRDSLGGD MVWKINGRDVDAGDLEELSGNVRSIRITKDAAVQGEDSAAAEGMMRIMTGQTMQAADAGL KAAEAGMQAARAGLDAARRYMSAEEWKRAQKQLEMAQKQLDEARAQAKSEISFSEAIAVD MQDSSADTADGGAVVAGAVGGTASETPSYWRVTTGEGTTRVDHSKTESGKTVFTGRVTLH NPSEIPDNYVIFINGEQASKADVLRLAPQKIKRMEVLRGKEAAKEYGAGSEGVIKITTRR >gi|313157286|gb|AENZ01000071.1| GENE 5 6305 - 7732 1778 475 aa, chain + ## HITS:1 COG:no KEGG:Calkr_1515 NR:ns ## KEGG: Calkr_1515 # Name: not_defined # Def: kwg repeat protein # Organism: C.kristjanssonii # Pathway: not_defined # 29 472 267 753 826 175 26.0 3e-42 MKRLLFFLTVVTLVSCGQSFEKRTAKYAEVGEMSDNRAAVLDINEGVGRWGYVDGTGRLA 
IECRYADARRFADGLAAVQVPEGAWGYIDTVGRLVIAPQFVLAEPFEDGMAWVQAAGELW GRVDKSGKMVIPCLYSEIGEPDERGWMRVLRDGKWGYLRENGEVVVPCDYNLIGEPNAYG LIPVTSGGKHGFLDADGREVLPCFFTYISDFKDGYARTNYGGSMMLRDEPYGGQWGLIDT LGRETIPCRYYYLDTPGEGLAAFMLEQFGNFGYVDVKGNVAISPRFAVAKPFSGGVAVVS YNNVNYGLVDRNGNEVSSFRYKRIGEFHDGLAPFNTNLYGMNFQGMEPRCGYIDTEGREV LPAKWDDAGEFSEMRAPVMRFGVNSEDFYDARWGYIATNGQLVVPFKYHEAYPFSCGLAR VFVKGLGYGYINRKGEEVVPCKYQEAEDFHDFTAKVKQYDRVMTIDDKGQIVEGQ >gi|313157286|gb|AENZ01000071.1| GENE 6 7786 - 9642 2584 618 aa, chain - ## HITS:1 COG:no KEGG:Lbys_3247 NR:ns ## KEGG: Lbys_3247 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 50 618 22 609 609 203 28.0 2e-50 MYRRICILVLTVLAAVSVCFARTAGPGMRGPEITSPNINSPDGFKPDSLNTATSERNQRL YDSIQSKTNRRAVPRMLYRMLFVKPVLDTTFSGRVVDESRLLEPYAGKTVGDITIERQHP FDSDGNWLERTGNKIHMLTRDRVIRRDLLFKPGDKIDPQLIVRNKQLLRSRPYISDVDIT LMPDSLDSTRVNMVICTRDSWTISVDGAIHGEGRTMVGLSDANIFGWGNTLKFNTNFSRK DFSYGGNIVEYEIPNVLGSFYTADFSAGRDFYNSELNMGLRKEFIRPTDYEIGLTYSDVK SKRYMIDLDTSLLIKVRNLDAWGGKSRYIRSINSSIYFTGRYSYARFSRRPLVAPNHNPA LHDYDAMLFGAGLYREKFYSANMIYGFGTREYLATGYKAELVGGYSWGEFNDEMYLGMTY TTGGFRSVGYVMGSFTLGSYIDLATGMWRHSAVDVDLKWFSNLFMFKRSRIRQFLAFNYT QGWNRGTGSDEIIRMTRINGLQALKEHVTGTNRMILNTETVFFTPYQPLGFRIALFGFAD FGLIGYSPNIFKNDFFTSFGFGVRIKNERLIFNTIQIRLGLAFGKRGLVESEYVRVSNQT RIEQYRYRPTRPEIVGFK >gi|313157286|gb|AENZ01000071.1| GENE 7 9850 - 10230 544 126 aa, chain - ## HITS:1 COG:no KEGG:BDI_0449 NR:ns ## KEGG: BDI_0449 # Name: not_defined # Def: DnaK suppressor protein # Organism: P.distasonis # Pathway: not_defined # 4 124 3 123 126 157 73.0 1e-37 MADERTRYSDAELEEFKQLILKKLENARADYELLRATITHTADNDTEDTSPTFKVLEEGA ATLSKEESGRLAAHQMKFIRNLEMALVRIENKTYGICKTTGKLIPKERLMKVPHATECIE AKEGRR >gi|313157286|gb|AENZ01000071.1| GENE 8 10405 - 13806 5382 1133 aa, chain - ## HITS:1 COG:CAC3038 KEGG:ns NR:ns ## COG: CAC3038 COG0060 # Protein_GI_number: 15896289 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Isoleucyl-tRNA synthetase # Organism: Clostridium 
acetobutylicum # 6 1105 2 1009 1035 863 41.0 0 MKFKEYKGLDLAQTAADVLGEWDARDTFHKSITTREGHPAFVFYEGPPSANGTPGIHHVM ARTIKDVFCRYKTQQGYLVHRKAGWDTHGLPVELGVEKKLGITKEDIGKKITIEEYNRTC REAVMEFTGMWEDLTRKMGYWVNMDDPYITYDNKYIETLWWLLKQLFDKGLLYKGYTIQP YSPAAGTGLSTHELNQPGCYRDVKDTTCTAQFEVIRDQKSEKLFAGVEGPLYFLAWTTTP WTLPSNTALAVGPAIEYVKVKCRNPYTDRPQTVILAKELVPAYFTKKMEGTYEITGESWQ GPELEGIRYEQLIPWVKPMGDAFRVIVGDYVTTSDGTGIVHIAPTFGADDDRVGRAAGIA PLFMVDKAGKNQPMVDRQGKFFLLEELDPEFVKNNVDAAKYREYAGRYVKNAYDPNIACD AETTLDIDIAVMLKAENKAFKIEKHTHSYPHCWRTDKPVLYYPLDSWFIRTTALRERMIE LNKTIRWKPESTGTGRFGKWLEGLVDWNLSRSRFWGTPLPVWATEDYSELKCIGSIEELT GEIEKSVAAGFMKENPYKNFKVGDMSAENYSTKNIDLHRPYVDGIVLVSSKGEPMKRESD LIDVWFDSGAMPYAQLHYPFENGGEHFKTVYPADFIAEGVDQTRGWFFTLHAIASMLFDS VAFKNIISNGLVLDKNGNKMSKRLGNGVDPFEVLATYGADATRWYMISNSQPWDNLKFDR DGVDEVRRKFFGTLYNTYSFFALYANVDGFTGREPEVPVEKRPEIDRWIISLLNTLVRDV TRSLEDYDPTPAARAIQEFVGENLSNWYVRLNRKRFWGGGMNEDKLAAYQTLYTCLETVS MLAAPFAPFISDRIFCDLNAVSGRHTDESVHLSTFPKADGKLIDADLEEMMSLAQRVSSM VLALRRKVSIKVRQPLTKILIPVLDPAMTQHIAAVKTLIMNEVNVKEIELIENTAGIITK RIKPNFKTLGPRYGKYMKQIAAMTAEFTQERIAEIEAAPETILDLGAEKIAVTPADFEIT SEDMPGWLVASEGKLTVALDITVTEELRAEGVARELINRIQNIRKDSGFEVTDKIRVEIE QKELVADAVTKYAGYIASQTLAVEVKTAAQPAGGVIVDSDVDDEPVRIAVTRI >gi|313157286|gb|AENZ01000071.1| GENE 9 13966 - 16233 3398 755 aa, chain - ## HITS:1 COG:slr2098_3 KEGG:ns NR:ns ## COG: slr2098_3 COG0642 # Protein_GI_number: 16330584 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Synechocystis # 490 747 1 266 280 166 39.0 1e-40 MFTSGAPDLQSLMQLLDAAQMGWWKADFRTGELFFSDYIVDLLGLGSNRAAVGELMPLTR EDYRTRIHNEFLSLKVGNHFDETFPVVLPEGEIWIHTQLVRKQSDPQQGIKLFGYLQRVP VAGRKEADVLTLESNYYAVLKENKHMDELLDHLPIGYFRIRLLYDDDGQATDYLFLSVNQ TAQQILGVDAADYLDKTAREIDIPVDRHIDELAAIRLGDYKMDQWHASKTGRYCRSFLYN TPNDATEIVILILDITDVVTAHQALDEKEKLLRNVIQNAPIGIEIYDRTGHLVDINARDL EMFGVTDPAGIRGLSIFDNPNFSAEIKDCIREGRGTDFTTRYDFSKARSYYDTSRSGYID WTARIRCLYNDTGEITHYLLINIDNTELRQTQDRLTEFEALFRLISEYAQVGYINYNLCN 
KEGYAQSVWLRNYGEADTAAIGDVIGKYRHIHPDDRTVLLNRLAQFESGEIQSASVSCRV LHDDGRTTWIKSHLICREYRPQEQVIDMLGINYDITALKQTEQELIAAKERAEESNKLKT AFLANMSHEIRTPLNAIVGFSQLLTTEEDPEVRKEYNDIVSLNNGLLLQIISDVLDLSRI ESGHADLVRTRFDLRTLCREAVETFRLQTPEEVVLDVEKGLPSCTVCQYRQGLMQILANF IRNALKFTPKGFVTVGFERKRDRLSLYVRDTGIGIPAEELDKIFERFYKVDTFTQGTGLG LSICKSIAEQMGAQIGVDSAVGEGSCFRVEMPIAE >gi|313157286|gb|AENZ01000071.1| GENE 10 16247 - 16804 798 185 aa, chain - ## HITS:1 COG:BS_yyaI KEGG:ns NR:ns ## COG: BS_yyaI COG0110 # Protein_GI_number: 16081137 # Func_class: R General function prediction only # Function: Acetyltransferase (isoleucine patch superfamily) # Organism: Bacillus subtilis # 1 182 2 184 184 180 47.0 2e-45 MKSEKEKMLAGEWFDPRDEELTADRDRATLLMHRLNVECPGHDAAYRAALRELCPNAAGF IRAPFYCDYGYNIHIGEGSFVNFDCVFLDLAPIRIGRNTLIGPKVQLLTPHHPLDPDLRA TGREAGKPITIGDNCWLGGGVIVCPGVRIGNGAIIGAGSVVTRDIPADSVAAGNPTRVTR TISHT >gi|313157286|gb|AENZ01000071.1| GENE 11 16819 - 18648 3000 609 aa, chain - ## HITS:1 COG:BS_ygaD KEGG:ns NR:ns ## COG: BS_ygaD COG1132 # Protein_GI_number: 16077935 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, ATPase and permease components # Organism: Bacillus subtilis # 92 608 75 588 589 361 38.0 2e-99 MLKIYMRLLGFARPIQKYAIPYFFYSLLYALFNSLTFLLIMPILKTMFDADYTFVYVEKL PPLAFNQEYLTALFNYTYSHLFNEYNPENVLLMLAVVTIFVSLLSNLFRYLGSWTVENMR TRTLQRMRNEMFSKVVDMHVGFFSDQRKGDIISKITSDVGVVQFCITNTLQVAFREPFLI IGYTVMMVAISWELALFSVLFLPVVALIIGSIVKRLRHPARTSQQRMGELVSTLDESLSG IKVIKSYNAVDYIKQKFYDLNADLARLTLSMARRQQLASPMSEFLGISAVGVILVFGGSL VFKGSLSPEGFIAFVAMFSQITRPVRTFIDQFANINQGIAAGERIFSIIDAQSEIQDKPG ALELNGLKDKIEFRDIHFSYDGSREVIDGISFDIKRGETVALVGPSGGGKSTLSELVPRF YDVTAGDILIDGVSIRDYTQESLRAHMSVVSQDTVLFNDTIEGNIAMGKAGASHEEIVEA ARIANADCFITEAPEGYRTNIGDRGVKLSGGQRQRLSIARAVLKNPDILILDEATSALDT ESEKLVQDALNKLLEGRTSVVIAHRLSTIHNADKIIVVDHGRIAEQGTHAELMARGGIYA KLIELQSFE >gi|313157286|gb|AENZ01000071.1| GENE 12 18789 - 19253 577 154 aa, chain - ## HITS:1 COG:PM0107 KEGG:ns NR:ns ## COG: PM0107 COG0328 # Protein_GI_number: 15601972 # 
Func_class: L Replication, recombination and repair # Function: Ribonuclease HI # Organism: Pasteurella multocida # 3 152 4 154 154 145 50.0 2e-35 MAQITIYTDGSALGNPGPGGYGAVLLSGPHRKELSQGFRLTTNNRMELTAVCAALEALKF EGSDVTVYSDSKYVVDAVTKGWVFGWEKKRFAGKKNPDLWMRFLRIYRRHNVRFVWVKGH ADTVENNRCDQLAVAAANDKAHLSEDTGYEPEGR >gi|313157286|gb|AENZ01000071.1| GENE 13 19260 - 21887 3908 875 aa, chain - ## HITS:1 COG:BH3038 KEGG:ns NR:ns ## COG: BH3038 COG0525 # Protein_GI_number: 15615600 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Valyl-tRNA synthetase # Organism: Bacillus halodurans # 2 875 7 879 880 733 45.0 0 MQIADKYSPQEIESKWYDYWIEKRLFHSEPDQREPYTIVIPPPNVTGMLHMGHMLNNTLQ DVLVRRARMSGRNACWVPGMDHASIATEAKVVAMLHEKGIEKSSLSREKFLEYAWEWKEK YGGMILKQLRKLGASCDWERTCFTMDGPRTESVIKVFCDLFEKGRIYRGVRMVNWDPAAK TALSDEEVVFKESHGKLYYLRYKIEGTDKAIIVATTRPETILGDTALCVNPNDPRYASLP VDARVIVPLVNRSIPVIRDEYVDIEFGTGALKVTPAHDVNDYMLGEKYGLETIDIFNDDG TINDKVGMYVGMDRFDVRKQIEKDLAAAGLLEKTEEYTNNVGYSERTGVAIEPKLSMQWF LSMEELARPATKAVMEDAIRFVPEKYKNTYRHWMENIKDWCISRQLWWGQRIPAYYLPKG GYVVAPTAEEALAKARAKTGDDSLQPSDLRQDEDVLDTWFSSWLWPISVFDGIRNPDNQE ISYYYPTNDLVTGPDIIFFWVARMIMAGYEYRGEKPFGNVYFTGIVRDKIGRKMSKQLGN SPDPLDLIAKYGADGMRMAMLISSSAGNDVMFDEALCEQGRNFGNKIWNAYRLVNGWSAD DKAAQDDNNRLAVEWFRQTLGRALRQIDEDFRSYRISEAFMTAYKLFWDDFSGLYLEMVK PAYGQPVDAETFAATCGFFDELMRVLHPFMPFVTEEIWQDLAPRKEGESICTAPMPQPVE ADERLLARFELAKEAISSVRNIRNQKNLPQKEALTLKVIADENYPAEYAPVMRKMANLTA VETVTEKDPAAAAFIVKTTQYFMPMAGKIDVEAEIGKLTESLAYYEGFLASVMKKLSNER FVQSAPEKVVMNERAKQADAEAKIAAIREQLAALK >gi|313157286|gb|AENZ01000071.1| GENE 14 22005 - 24434 3864 809 aa, chain + ## HITS:1 COG:TM1633 KEGG:ns NR:ns ## COG: TM1633 COG0466 # Protein_GI_number: 15644381 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: ATP-dependent Lon protease, bacterial type # Organism: Thermotoga maritima # 40 809 24 786 787 659 47.0 0 MSKKDKFETIEVEDVNLLPDLLDGNHRVIPIVTGGDEVVEEVEVPEIIPILTLRSSVLFP GAITPITVGRDKSISLVRAVNAEGGILGAVLQRESDVEDPAPDDMYKVGTAARIIKILEM 
PNGNLTVILNGLEKVEIREYITTEPYFRARVTALRDTTPDLKSVEFEALVDSIRDVALNI INVSPSMPKEAAFAIKNIDSKRGIINFICSNMELTDEDRQSLLEAPGLLARARKLLEILI REQQLAELKNQIQERVKQEIDKQQRDYYLQQQMRTIQDELGDGADADIEKMREEAKKKNW PKEVGETFEKELQKVERLNPAVAEYSVQMTYLQLLLELPWNDTTKDNLDLKCAREQLDRD HFGLDEVKERILEHLAVIKLKGDLKSPILCLYGPPGVGKTSLGKSVAAALGRKFGRISLG GLHDESEIRGHRRTYIGAMPGRIIQTIKRCGSSNPVIILDEVDKVTVSNHGDPSSALLEV LDPEQNTTFHDNYIDMEYDLSKVLFIATANNVANIAPALRDRMEMINIPGYLVEEKIRIA LDHLLPKQREAHGIKEQELVMTPQIVEGIISGYTRESGVRSLDKLLAKIARARAKQIAFD EAFAPEITAKDVEKILGMPKFLREEYEVGGMTGVVTGLAWTEVGGDILYIESVLTPGKGK VSLTGNLGEVMKESATIAHEWVMAHHAELGIDASAFEKNDINIHVPEGAIPKDGPSAGIT MVTSIVSTYTGRKVRERIAMTGETTLRGRVMPVGGVREKILAAKRAGITELILSEENRKD IAEIKPEYVEGLTFHYVKTNDEVLKLALV >gi|313157286|gb|AENZ01000071.1| GENE 15 24514 - 25269 962 251 aa, chain + ## HITS:1 COG:FN0807 KEGG:ns NR:ns ## COG: FN0807 COG1212 # Protein_GI_number: 19704142 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: CMP-2-keto-3-deoxyoctulosonic acid synthetase # Organism: Fusobacterium nucleatum # 1 241 1 233 245 194 44.0 2e-49 MKFIAIIPARYASTRFPGKPLAMLGGRPVIQRVYEQVAGVLDDAVVATDDERIYDTVLAF GGRAEMTSPDHKSGTDRCWEAYLKQGKTYDVVVNVQGDEPFVRASQLEAVKRCFDDPATD IATLVRPFAATDGLEALENPNSPKVVLDAQSRALYFSRSVIPYLRGVERSEWLARHTFYK HIGLYAFRTEVLRAVTALPQSALEKAESLEQLRWLENGYKIGVGVTDAETIGIDTPEDLE KAEAFLIRHGK >gi|313157286|gb|AENZ01000071.1| GENE 16 25348 - 26100 1089 250 aa, chain + ## HITS:1 COG:FN1224 KEGG:ns NR:ns ## COG: FN1224 COG2877 # Protein_GI_number: 19704559 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic acid (KDO) 8-phosphate synthase # Organism: Fusobacterium nucleatum # 5 247 32 274 286 277 53.0 1e-74 MKIKFIAGPCVIESVELLDTVAQRLVAINERLGADIIFKASFDKANRTSISSFRGPGLEK GLRMLADVRAKWGLKLLTDIHESWQAAPVGEVVDVIQIPAFLCRQTDLLVAAAKTGRTVN IKKAQFLSGADMLYPYEKARDAGASEIWLTERGNIYGYNNLVVDFRNIAEMLRIAPTVVM DCTHSVQRPGAAGGKTGGNREFVPAMAQAAKAFGANGFFFEVHPDPDHAKSDGPNMLQLD ELENLIKTLL >gi|313157286|gb|AENZ01000071.1| GENE 17 26100 - 27065 1361 321 aa, chain + ## 
HITS:1 COG:PM0525_1 KEGG:ns NR:ns ## COG: PM0525_1 COG0794 # Protein_GI_number: 15602390 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted sugar phosphate isomerase involved in capsule formation # Organism: Pasteurella multocida # 10 209 39 238 238 228 56.0 1e-59 MTDTTKAQILDLARKAINTELLALKRMKETLGDNFADAVEMILSGQGKCIVTGMGKSGLV GRKIAATLASTGTPSFFLHPGEAFHGDLGMISKEDVVLALSYSGETDEILKIVPFIHSNG NKLISMTGNPESALAKNSDVHLDVSVEEEACILHLAPTTSTTAQIAMGDALAVSLMQMRG FTSVDFARLHPGGSLGRRLLMTVGNVMRSHDLPVVAPDCSATDMIHAISKGGLGLIIICD GDRIEGIVTDGDVRRAMERRRAEFFNIKAADIATPNPKTISADRKLIEAEKMMTRNKVTS LLVTDEAGKLQGVIQIYDIKL >gi|313157286|gb|AENZ01000071.1| GENE 18 27069 - 27353 391 94 aa, chain + ## HITS:1 COG:CAP0038 KEGG:ns NR:ns ## COG: CAP0038 COG2350 # Protein_GI_number: 15004742 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 1 94 1 94 96 105 48.0 2e-23 MFIVLLTYKLPLAEVERHLAAHREYLDRQYAAGTFLCSGPQNPRTGGVILCRAADRAAVE TLTAEDPFRIHGVADYEIIEFSPVKRLPGFEAFL >gi|313157286|gb|AENZ01000071.1| GENE 19 27381 - 29378 2447 665 aa, chain + ## HITS:1 COG:no KEGG:BVU_0620 NR:ns ## KEGG: BVU_0620 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: Galactose metabolism [PATH:bvu00052]; Starch and sucrose metabolism [PATH:bvu00500]; Metabolic pathways [PATH:bvu01100] # 1 662 1 667 671 960 69.0 0 MKSFLFSLAALALCGACSQNTGVTSPDGTIRLAFAVDSAGRMTYSVTDGGVRLFEPSRLG FEAAEADLGGGFAVEHVSRTSVDETWTQPWGENKENRSRYNEMAVRLCNGGGVRLTLRFR VFDDGLGFRYEYEACGADSLRVTDELTEFRFAADGDSWTIPASFDTYELLYRKLPLSELA DANTPATFKVGGLYGSIHEAALYDFPEMTLRKTDGLAFKSDLAPLPDGTKAHVGNKFTTA WRTVQLAPDAVGLINSSLILNLNEPSKIGDTSWIRPMKYVGVWWGMHLGVETWAMDDRHG ATTENAKRHIDFAAVNNIQGVLFEGWNEGWENWGGSQSFDFTKPYADFDIAEIVRYARER GIEIIGHHETGGNIPDYERQLERAVKWYADLGIHNLKTGYAGGISGGNNHHGQYMVRHYQ HVVETAAKYRMTVNAHEPIKDCGIRRTWPNMMSREGARGKEWDAWSAGNPPSHEVTLPFT RLLAGPMDFTPGTFDILYENTRNSPRRKLWNCGPEVDMRVNTTLAKQIAEWVIIYSPVQM 
ASDLIENYEGHPAFRFFRDFDADCDWSRALAGEPGEFVAVVRRAGENYFLGAATDEQPRT LSLPLDFLKPGTKYRATIYADGPDADWKTNPTSYTISEREVSSADTLEVAMAPGGGQAVS FMPAI >gi|313157286|gb|AENZ01000071.1| GENE 20 29416 - 29760 448 114 aa, chain - ## HITS:1 COG:no KEGG:Toce_0227 NR:ns ## KEGG: Toce_0227 # Name: not_defined # Def: PAS fold domain protein # Organism: T.oceani # Pathway: not_defined # 9 113 5 112 113 107 48.0 2e-22 MTTENPFPWADDMNCAVTVCDTEGVILYMNEKARATYIRHGNLIGSNLFGCHNERSREII RRLTAEGGSNAYTIEKQGVRKMIYQTAWRTGDKVGGLVEISMEIPAEMPHYIRG >gi|313157286|gb|AENZ01000071.1| GENE 21 29840 - 31621 2931 593 aa, chain - ## HITS:1 COG:SSO3202_1 KEGG:ns NR:ns ## COG: SSO3202_1 COG0038 # Protein_GI_number: 15899904 # Func_class: P Inorganic ion transport and metabolism # Function: Chloride channel protein EriC # Organism: Sulfolobus solfataricus # 27 457 27 463 473 140 28.0 6e-33 MNLRADHIYRTFLRLTRRLSNSQIMMLLAVVVGVLAGVGTYLFEMLLYGIKYGLTNWFPV DSSHILFLIYPAVGIILATLFVKYIVRDNISEGVTRVLYAMSSRNSRIAGHNCWTSIVGG ATTIGFGGSVGPEAPIVLTGAAIGSNVGRIARLNYKHTTLLLCCGAGAALAAIFKAPITG VVFVLEILMLDITAGSVIPLLIASITATTMAFMLRGFDPILAVTLAPADAFELWQIPLFI LLGVCCGLMSWYFTSTNLRVSTLFKKIDKQYKKWIVGGTVLGVLIFIFPPLYGEGYEGFT SLMHGQAEKLFDNSLFYRFSDIDWVVILFVIATMFFKVIAMASTNAAGGVGGTFAPSLFV GAFTGASIALLCNAFFDWDVSIVSFTLVGMAGVMSGVMKAPLTSIFLIAELSNGYGLFIP LMITACISFAVDYYLDPDSIYTKQLRLRGELLTHDKDQSVFVFLKLDELMETDFLRIREN FTLGDIVHIISTARRNIFPVIDNFGHLLGIIQLDDLREDMFKREKYGHPISDYMIQPPDK ILEHEAIQSVVQKFEDKHTWMLPVVDKQNRYMGFISKSRILNAYREQLVKIQQ >gi|313157286|gb|AENZ01000071.1| GENE 22 31652 - 32776 1790 374 aa, chain - ## HITS:1 COG:SP2229 KEGG:ns NR:ns ## COG: SP2229 COG0180 # Protein_GI_number: 15902033 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Tryptophanyl-tRNA synthetase # Organism: Streptococcus pneumoniae TIGR4 # 4 358 5 341 341 416 60.0 1e-116 MGKIILTGDRPTGRLHLGHYVGSLRRRVELQESGEFERIFIMIADAQALTDNADNPEKVR QNIIEVALDYLSVGLDPNKSTLFIQSQIPELCELAFYYMNLVTVQRLQRNPTVKAEIQLR GFAENNTEGDTQQRQGIPVGFFTYPISQAADITAFKATTVPAGEDQEPMIEQTREIVHKF 
NTVYGPALVEPEILLPENAVCMRLPGTDGKAKMSKSLCNCIYLSDSADEIKKKVMGMYTD PDHLRVEDPGKVEGNMVFTYLDAFSRPEHFPKYLPDYPSLEELKAHYRRGGLGDVKVKKF LIAVLNEMLEPFRERRKYYEGRIGEVYEILRAGSEAAREVAAATLADVKRAMKIDYFDDK ELIASQAEHFAHKE >gi|313157286|gb|AENZ01000071.1| GENE 23 33088 - 33987 1338 299 aa, chain + ## HITS:1 COG:SMc00818 KEGG:ns NR:ns ## COG: SMc00818 COG0583 # Protein_GI_number: 15964516 # Func_class: K Transcription # Function: Transcriptional regulator # Organism: Sinorhizobium meliloti # 6 297 1 291 311 161 34.0 2e-39 MTIIQLEYLLAVANCGSFSLAAEHCFVTQPSLSMQVKALEEELGVVLLDRSKKPVIPTEA GDVVLERARETLRAYNNIRESVAELKGETSGKLRLGVIPTIAPYLLHKFIPAFVRDYPKV ELEISEMITSDIVEALKRDRIDAALVASGTCGEGILEQELFNDRFFAYVSPENALYERSN IRIEDIDLRDLVILSAGNCMRDQIIELCQAKRDMPSHYSFESGSLDTLMRIVDCTSCLTI IPEMAVEYIPADHRDRLKMLAKGATSRKIAVAVRRTYVKSSIIKALTDTIMANAPEVGN >gi|313157286|gb|AENZ01000071.1| GENE 24 34051 - 34551 -242 166 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MLFFRSGLWRVAAPVCGALLLRSVARCCSGLWRAAVPPYAASSSSVCIGRRPRLCRVAAL ALCRAAVPPCAVSSSSACIGCRPRLCRVAAPICVALSSAVCVVSPFRFADGRRRRILRTV WRGICADCGSLAAFPCVATIPVGSPARFFVALNFQISFYFRTFARL >gi|313157286|gb|AENZ01000071.1| GENE 25 34569 - 34919 528 116 aa, chain + ## HITS:1 COG:BH1328 KEGG:ns NR:ns ## COG: BH1328 COG0799 # Protein_GI_number: 15613891 # Func_class: S Function unknown # Function: Uncharacterized homolog of plant Iojap protein # Organism: Bacillus halodurans # 3 113 5 113 117 82 38.0 2e-16 MDKLIETIVSAIEDKKGKDIVSLDLSGFDGAICSRFIVCNADSTTQVCAIAASIEEKVLE TLGEKVWRIEGQQNGVWVAMDYVDVVVHIFQTELRSFYKLEELWADAPMTKYEYEE >gi|313157286|gb|AENZ01000071.1| GENE 26 34959 - 37037 1249 692 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|157803230|ref|YP_001491779.1| 50S ribosomal protein L9 [Rickettsia canadensis str. 
McKiel] # 106 635 80 608 636 485 49 1e-136 MNMNMPRPSMLWIYGLIGAFIIGWYVFGDVNDTPLPSDWTTVREMVEKGDVEKIQVVNRD QAQVFLKKEAAEQYRRDTVDKRFKRLPETGVQLTFTIGSVDSFREDLKNAEQQSGQTVPV VYENKANDWTNVLINLLPWVLIIGVWIFIMRSMSRGAGGGAGGGIMNVGKAKAQVFDKDA SKRVTFKDVAGLEEAKVEIMEIVDFLKKSEKYKELGAKIPKGALLVGPPGTGKTLLAKAV AGEANVPFLSISGSDFVEMFVGVGASRVRDLFEQAKQKAPCIVFIDEIDAIGRARGKNAG FSGNDERENTLNQLLTEMDGFQTNTGVIVLAATNRADILDKALMRAGRFDRQIEVGLPDV KEREAIFNVHLRPLKLDPQLDREFLARQTPGFSGADIANVCNEAALIAARHNKKFVSKED FLAAIDRIVGGLEKPNMPMTAAERRATAIHEAGHATVMWSLPQCDPVLKVTVVPRGRSLG ATWYVPDERRIHVTNEALQERLAGLLGGRIAEEVNYGTLGAGALSDLERATETAYAMVAY YGMSKKIGPISYYDSSGTRDTFTKPFSEQTARDIDTEVRRIIEEAYAKARGIIERKSEQI NRMADLLLEKETIYAEDIERILGPAAQVPRENDDPKKGVVVADDDAASGTSAAPADAPPA GTGTTQEGDAGSASGTKTAASGAANAEKSELK >gi|313157286|gb|AENZ01000071.1| GENE 27 37067 - 37915 1058 282 aa, chain + ## HITS:1 COG:ECs0177 KEGG:ns NR:ns ## COG: ECs0177 COG0575 # Protein_GI_number: 15829431 # Func_class: I Lipid transport and metabolism # Function: CDP-diglyceride synthetase # Organism: Escherichia coli O157:H7 # 150 281 116 245 249 105 46.0 1e-22 MSEKMKNLMVRTLSGAVLAVVVLGAVIWSQWSFGALLLAMLVAGMLEFYGLAEKQGNAPQ KIVGMAAGIVLFALNFAFVSDDIEILSGASQAFACGMAFMLLLIPAMFICELYRKQQNPA SGIGTTLMGVCYIALPLSLMCYIPIIGSEVWTPWVMIFYIFIIWANDVFAYLVGMSVGRH RMFERLSPKKSWEGFFGGLAGAVAMGYVAARVLDADVWAWLGLALVAAATGVLGDLVESM FKRAAGVKDSGNLIPGHGGVLDRFDALLLSAPFVFVYMLFVM >gi|313157286|gb|AENZ01000071.1| GENE 28 37936 - 38607 716 223 aa, chain + ## HITS:1 COG:RSc2074 KEGG:ns NR:ns ## COG: RSc2074 COG0688 # Protein_GI_number: 17546793 # Func_class: I Lipid transport and metabolism # Function: Phosphatidylserine decarboxylase # Organism: Ralstonia solanacearum # 3 220 10 214 215 147 41.0 2e-35 MRINKEGYKIIGISGAVCVFLWWLIYHLLVSDANISLLWVSAVLLLLFWFFIVAFFREPR RVRIHDADLAFSPCDGRVVVTEVVKENEYLNEEMLQISIFMSVTNVHMNWVPVGGTVEYF KYHPGRFLVAWHPKSSTENERTTTVVKMASGQKVLFRQIAGLIARRIISYMKVGSPVEQN SVCGFIKFGSRVDVLVPKNSELLVEIGDPVVGSQTPIARLPKA >gi|313157286|gb|AENZ01000071.1| GENE 29 38620 - 39726 1775 368 aa, chain + ## HITS:1 
COG:BS_yueF KEGG:ns NR:ns ## COG: BS_yueF COG0628 # Protein_GI_number: 16080231 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Bacillus subtilis # 24 350 33 345 369 98 25.0 2e-20 MRATPQTFLRFIAGAAVAAAVLFLVWYFRSIVVYVLISAVLAVMGNPLVKRLAALHIKGW KVPRWLAALATLIVIWVVLATLCSLFVPLVFNKIYQLSNVDFATVVASIEEPIARAQSYL HEFFAMPESTFSLSEALASALKQVIDIESLNTVFASIVNVVLSSVIAIFSITFITFFFLR DEGLFYAMITAMFPERYHENITRALDSVTLLLARYFTGILSESLLLMVAVSLTMMAFGMK AADAAFIGLVMGVMNVVPYAGPLIGGVVSVFVGIVTPIGGMTVGHTAVVIIGSLLILKGL DDFVLQPTLYSSRVKAHPLEIFLVILIAGSLAGILGMLLAIPSYTVLRVFAKEFFSQFSL VRKLTEKI >gi|313157286|gb|AENZ01000071.1| GENE 30 39726 - 40091 525 121 aa, chain + ## HITS:1 COG:VC0681 KEGG:ns NR:ns ## COG: VC0681 COG0196 # Protein_GI_number: 15640700 # Func_class: H Coenzyme transport and metabolism # Function: FAD synthase # Organism: Vibrio cholerae # 6 115 190 300 322 100 45.0 6e-22 MEKVIIEGVVEHGRRLGRELGFPTANVAVPDSVAAEDGVYRSRAEMDGAVYDAMSNLGRN PSVGGTARRLETHIFGFRGALYGRMLRVELLEKIRDERRFDTLEELRAQIEKDREYILKL K >gi|313157286|gb|AENZ01000071.1| GENE 31 40098 - 41507 2131 469 aa, chain + ## HITS:1 COG:XF1037 KEGG:ns NR:ns ## COG: XF1037 COG0499 # Protein_GI_number: 15837639 # Func_class: H Coenzyme transport and metabolism # Function: S-adenosylhomocysteine hydrolase # Organism: Xylella fastidiosa 9a5c # 31 469 1 446 446 654 70.0 0 MYLDLTMPYKVADMSLAGWGRKEIEIAEHEMPGLMAVRRKYGPQKPLKGVRVMGSLHMTI QTAVLIETLVELGADVRWCSCNIFSTQDHAAAAIAETGVPVFAWKGETLPEYWWCTVMAL SFPGGKGPQLIVDDGGDATLLIHKGFKAEDDASTLDYEPSSYEEEVIIDTLKKVLAEDKD KWHRTVAEWKGVSEETTTGVHRLYQMQEAGELLVPAINVNDSCTKSKFDNLYGCRESLAD GIKRATDVMIAGKVVVVCGYGDVGKGCAASMRSYGARVIVTEIDPICALQAAMEGFEVKT VESALGEGNIFVTCTGNCDIITLEHMERMRDQAIVCNIGHFDNEIQMARLEKSGAVKTNI KPQVDKFTFPDGHSIFVLAEGRLVNLGCATGHPSFVMSNSFTNQCLAQMELWQEKLEVGV YRLPKHLDEEVARLHLDNLGVELTRLTEKQADYIGVSPDGPYKAEHYRY Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:34:29 2011 Seq name: gi|313157281|gb|AENZ01000072.1| Alistipes sp. 
HGB5 contig00013, whole genome shotgun sequence Length of sequence - 5008 bp Number of predicted genes - 4, with homology - 4 Number of transcription units - 3, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 51 - 674 1021 ## COG0572 Uridine kinase - Term 693 - 741 10.6 2 2 Op 1 . - CDS 752 - 2713 2994 ## COG0187 Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit 3 2 Op 2 . - CDS 2785 - 3522 277 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Prom 3543 - 3602 4.4 + Prom 3629 - 3688 5.8 4 3 Tu 1 . + CDS 3776 - 5006 1624 ## Odosp_2185 multi-sensor signal transduction histidine kinase Predicted protein(s) >gi|313157281|gb|AENZ01000072.1| GENE 1 51 - 674 1021 207 aa, chain - ## HITS:1 COG:BH1275 KEGG:ns NR:ns ## COG: BH1275 COG0572 # Protein_GI_number: 15613838 # Func_class: F Nucleotide transport and metabolism # Function: Uridine kinase # Organism: Bacillus halodurans # 5 203 7 205 211 215 52.0 4e-56 MKVTVIGVAGGTGSGKSTLVKRLQEAFEGDDVVTLCHDYYYKAHPELTYEERTKLNYDHP QAFDTQMLVDHIKALKENVPIEHPVYSFVEHNRMTETVSVKPSKVIIVDGILIFENKELR DLMDIKVYVDTDADIRLARRILRDVCERGRTMQSVITQYTSTVKPMHEEFVEPSKKYADV IIPEGGFNSVAVAMLIQNISSLIARNE >gi|313157281|gb|AENZ01000072.1| GENE 2 752 - 2713 2994 653 aa, chain - ## HITS:1 COG:MA1584 KEGG:ns NR:ns ## COG: MA1584 COG0187 # Protein_GI_number: 20090442 # Func_class: L Replication, recombination and repair # Function: Type IIA topoisomerase (DNA gyrase/topo II, topoisomerase IV), B subunit # Organism: Methanosarcina acetivorans str.C2A # 12 646 4 626 634 703 58.0 0 MSTELENVKKQEEEYSASNIQVLEGLEAVRKRPAMYIGDIGEKGLHHLVYEVVDNSIDEA LAGYCTDIDVTINEDNSITVRDNGRGIPTDYHEKEGKSALEVVLTVLHAGGKFDKGSYKV SGGLHGVGVSCVNALSTLLIAEIHSRDGKIYRQSYSKGAPTSQVEVVGECTDRGTTITFK PDGSIFTLTTEYKYEILANRLRELAFLNKGIHLNLTDKRSKDENGEYVHDNFYSEEGLKE FVQYLDATRGTPIIDQIIYLDTEKNGVPVEVAMQYNDSFSENVHSYVNNINTIEGGTHLT GFRRALTRTLKKYAEDSGMLSKLKFDINGDDFREGLTAVVSVKVQEPQFEGQTKTKLGND 
EVSAAVDQAMSSALGHYLEENPKDAKAIVQKVILAATARHAARHAREMVQRKTVLSGAGL PGKLADCTSRDRSIAEIFFVEGDSAGGTAKSGRDRNFQAIMPLRGKILNIEKAQEHRMWE NEEIKNMFTALGVTIGTEEDSKALNLEKLRYDKIIIMTDADVDGSHIATLMLTFFFRKMK ELIENGHVYIATPPLYLVKKGKQERYCWTEKERDTISAEFGKGVHIQRYKGLGEMNAHQL WDTTMNPDTRILRQVNIENAAEADRIFSMLMGDEVPPRREFIEKNAHYANIDA >gi|313157281|gb|AENZ01000072.1| GENE 3 2785 - 3522 277 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 36 237 16 219 245 111 32 1e-24 MSEKYVVRLKNAAIYHADNPFGSTSEAKLLQRGEMVLSDVNLCVAPGEFVYLLGRVGSGK STLLKTLYAEVQLLTGEGRVAGFDLRRLKRRDIPYLRRRIGIVFQDYQLLTDRNVFMNLY YVMKATGWKHESEIRERIDQVLNLVDLGAKSYKMPFELSGGEQQRLVIARALLNDPQVLL ADEPTGNLDPVTAEGIMQLFEEIACRGCAVVMSTHNTALIENHPARAILFSQGKIREVDL KAELG >gi|313157281|gb|AENZ01000072.1| GENE 4 3776 - 5006 1624 410 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2185 NR:ns ## KEGG: Odosp_2185 # Name: not_defined # Def: multi-sensor signal transduction histidine kinase # Organism: O.splanchnicus # Pathway: not_defined # 1 410 6 415 936 338 41.0 3e-91 MKEHEILQEIMRDARIGWWQADRKRRMFHISEGLRDLLGLSTCDVSYEEFRDMINPTYRE YAFASVGVRGGDFGNEHLYPLRGPKGEIWCYRKLLREETAEDGGMLLTGYFRVTDPPQEA NSPEKQQINDLLYRLNSISHTLLSLLKTNDPDVVIDKILTDVQTMFHGGRAYIIEFDRER RTHDCTYEVTAENVTAEQDLVNSLSMDEVPWWTRRIENGNPIIISSLDELPDEAFREKEV LAMQDIKSLIAVPLASRDKVWGYAGIDIVNEPRTWSGEDYQWFASLMNIISLCIELQRSE REAQSERTYLEGLYRHMPLGYARLQMLRDDAGEPVDFRILDANYAADKISGVFREQYIGK RASELGLDTRFYMPIFLQVFDSGDYIEHDSFDEKSKHWIHSILYTTHEDE Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:34:40 2011 Seq name: gi|313157264|gb|AENZ01000073.1| Alistipes sp. HGB5 contig00006, whole genome shotgun sequence Length of sequence - 16190 bp Number of predicted genes - 17, with homology - 15 Number of transcription units - 9, operones - 6 average op.length - 2.3 N Tu/Op Conserved S Start End Score pairs(N/Pv) - Term 99 - 150 21.1 1 1 Tu 1 . - CDS 330 - 779 239 ## Odosp_1150 hypothetical protein - Prom 821 - 880 3.9 + Prom 890 - 949 5.9 2 2 Op 1 . 
+ CDS 993 - 1400 518 ## Odosp_1949 hypothetical protein
3 2 Op 2 8/0.000 + CDS 1397 - 2713 1425 ## COG3969 Predicted phosphoadenosine phosphosulfate sulfotransferase
4 2 Op 3 . + CDS 2658 - 3221 712 ## COG1475 Predicted transcriptional regulators
- Term 3419 - 3454 1.0
5 3 Tu 1 . - CDS 3470 - 3904 323 ##
+ Prom 4549 - 4608 5.8
6 4 Op 1 . + CDS 4689 - 4961 232 ## gi|313157267|gb|EFR56694.1| conserved domain protein
7 4 Op 2 . + CDS 4963 - 5352 313 ##
+ Term 5376 - 5416 9.1
8 5 Op 1 1/0.000 - CDS 5433 - 6254 839 ## COG2135 Uncharacterized conserved protein
9 5 Op 2 4/0.000 - CDS 6276 - 7601 1212 ## COG0389 Nucleotidyltransferase/DNA polymerase involved in DNA repair
10 5 Op 3 . - CDS 7598 - 8482 820 ## COG1974 SOS-response transcriptional repressors (RecA-mediated autopeptidases)
- Prom 8696 - 8755 3.1
+ Prom 8982 - 9041 4.8
11 6 Op 1 . + CDS 9072 - 11000 306 ## CHU_1497 histidine kinase
12 6 Op 2 . + CDS 11046 - 12059 907 ## PROTEIN SUPPORTED gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31
- Term 12637 - 12675 0.5
13 7 Tu 1 . - CDS 12776 - 13021 328 ## COG3609 Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain
- Prom 13051 - 13110 4.6
- Term 13059 - 13105 9.2
14 8 Op 1 . - CDS 13126 - 13416 320 ## Bacsa_1125 hypothetical protein
15 8 Op 2 . - CDS 13434 - 14069 493 ## Bacsa_0421 hypothetical protein
16 9 Op 1 . - CDS 14219 - 15118 475 ## BDI_3903 hypothetical protein
- Term 15123 - 15160 4.5
17 9 Op 2 .
- CDS 15194 - 16183 1084 ## COG4227 Antirestriction protein Predicted protein(s) >gi|313157264|gb|AENZ01000073.1| GENE 1 330 - 779 239 149 aa, chain - ## HITS:1 COG:no KEGG:Odosp_1150 NR:ns ## KEGG: Odosp_1150 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 70 148 1 79 79 71 43.0 1e-11 MNCANHPAEAAVAQCTECGKGLCVQCVQNQPKPVCAACHEKLRSQTVGSIVFHLLVYVGL FILGYKLNFMEGNGFPDGRFASGYTLMAVVSGWQFLNSVVGWRLVQGDLTAWAIYYVLKL LISVIVGFFTAPFMILWNLIKLISFMMRR >gi|313157264|gb|AENZ01000073.1| GENE 2 993 - 1400 518 135 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1949 NR:ns ## KEGG: Odosp_1949 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 132 1 131 134 137 52.0 2e-31 MTIEKLDGTSPRLYTLVAPLVMRRSVLRQNNNYPFWTSRAHTWFVAWEGETVFGFVPVEI TDGGVAKINNYYVSGDDPRLLSRFLREIIQYYHRDYTIRSMTLLRHAEVFRTEGFVPTKE WTQYVTMQYGKKRQP >gi|313157264|gb|AENZ01000073.1| GENE 3 1397 - 2713 1425 438 aa, chain + ## HITS:1 COG:lin1347 KEGG:ns NR:ns ## COG: lin1347 COG3969 # Protein_GI_number: 16800415 # Func_class: R General function prediction only # Function: Predicted phosphoadenosine phosphosulfate sulfotransferase # Organism: Listeria innocua # 3 425 6 434 434 420 45.0 1e-117 MNVYERTQQRLKTVFDLFDNIYVSFSGGKDSGVLLNLCIDYVRRHGLRRRIGVFHMDYEI QYRDTLDYVDRTLAANADILDVYRVCVPFKVQTCTSMFRQYWRPWEESKRDVWVREMPEG CLTHQDFPFFTDEMWDYEFQNRFAEWLHARSGATRTCCLIGIRTQESFNRWRTIYSDRNH HRFAGKRWIRQWAGGGICNAYPLYDWLTTDIWTANGRFGWPYNRLYDLFYRAGVPLDSQR VASPFISPAIASLHLYKAVDPDTWGRMVGRVDGANFAALYGRTAAVGWQSARLPKGMTWE RYMHFLLSTLPEPTRRNYLEKLSVSIRFWREKGGCLSEETIAKLHRAGIRIEVGDKSGYR TDKRPVRMEYLDDIDLPEFGQLPTFKRICICILKNDHACKYMGFSPNKSETLRRRKIMEK YESLLRTSDRKELPKPCL >gi|313157264|gb|AENZ01000073.1| GENE 4 2658 - 3221 712 187 aa, chain + ## HITS:1 COG:L69383 KEGG:ns NR:ns ## COG: L69383 COG1475 # Protein_GI_number: 15673430 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Lactococcus lactis # 15 174 7 166 180 228 65.0 5e-60 
MNRYYEHPTEKSYRSPVYNVRAVPVEKVVANSYNPNVVAPPEMKLLELSIWEDGYTMPCV CYYDAERDCYELVDGYHRYLVLKTSKRIYERERGLLPVTVIEKDLSNRMASTIRHNRARG THNVELMSEIVAELTRARMSDQWIMRHIGMDRDELLRLKQITGLAELFSDKEFSLGDADE DTVEFVL >gi|313157264|gb|AENZ01000073.1| GENE 5 3470 - 3904 323 144 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MNGFEKDNENPYRLYRVVSVKKIDRWFFEKHNHRRTHAKIYETVIRPKFGICENTFLDYR HESDELLELFRQSVNVEFSMWLPTMEAKYMSPVEADRFSLMLWDAFDSAFKCILKEELAC RINAEKLLKYLIICLGEKSPVVVR >gi|313157264|gb|AENZ01000073.1| GENE 6 4689 - 4961 232 90 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157267|gb|EFR56694.1| ## NR: gi|313157267|gb|EFR56694.1| conserved domain protein [Alistipes sp. HGB5] # 1 90 1 90 90 185 100.0 9e-46 MTVKQCNFKVGEVYLFHTDDPRCPDAESLWGLYDRHDGNSIFLESWSTDQKHFSKGRHLP EQYRFCRLSTRSELRDYMVNSIYSEIKGLS >gi|313157264|gb|AENZ01000073.1| GENE 7 4963 - 5352 313 129 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MSRDNYNPYRIVGAKKIDVWFYEEGDMRRTHRIVYELIVLPLYGVCENSFLDYRHHSDEL LELFIQPPYIEVPLWLMVMTVKKMPVHEANCFFELLRTKMDRIFRKRSYPLTAEQLLRLL VDALAEFIY >gi|313157264|gb|AENZ01000073.1| GENE 8 5433 - 6254 839 273 aa, chain - ## HITS:1 COG:all3194 KEGG:ns NR:ns ## COG: all3194 COG2135 # Protein_GI_number: 17230686 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 77 270 47 227 233 111 36.0 2e-24 MCFFTSLSANAAAVAKRYGKQLDIIEAARQVLAEQEQDARARQDAGSEHRMSVYDVRLAE EIYVIPAYAEPHCVIVSGSEELQVMQWGLIPRTAKPGDAQRYDRENLFKNARAETLFEKW PWRMLWQHNRCIIPVTGFYEPHRLPNGKAQYYYTTLKDQELFSIAGLWDEWTHPQTGEKV LSFVLITTEANAMMRKIHNGGGNPFRMPKILTQEQEKRWLDPSIASEEAVAALLTVYPEK EMTAWPVRPKFNYGDPYDEGIIEPVAEMQTLGL >gi|313157264|gb|AENZ01000073.1| GENE 9 6276 - 7601 1212 441 aa, chain - ## HITS:1 COG:ECs1679 KEGG:ns NR:ns ## COG: ECs1679 COG0389 # Protein_GI_number: 15830933 # Func_class: L Replication, recombination and repair # Function: Nucleotidyltransferase/DNA polymerase involved in DNA repair # Organism: Escherichia coli O157:H7 # 13 439 1 421 422 283 39.0 5e-76 MSRKLLPLNPRQMVGLADGNSFYCSCEESMQPWLHGKPIIVASNNDGCAIAMNRPAKRYV RMGDALFQIADTIREHGIVTFSSNYELYGDMSNRMHSIWASFVPNLEIYSIDEAFLDFTG MEGFDFEKLGREIIRTTRRGIGIPICLGIAPTKVLAKAANKLAKTDDARRGLYIIDTEEK RVEALKKLPIGDVWGIGRRHEKRLTAMGVRTAYDFSVLPREWVRRNMTVVGDRLWREMNG TPCISLELAPPDKQEICTSRAFGKMTSDFNEVKAAVVRYLSSSAHKLRDQHSYARRIYVG IETNPFNENQRQTFRGLQIEFPVPTDNTFEMVPYAMTLLRAIWPAYAPGERPFLFKRATV TLSDLIPAEALQLNMFHHRPDIGQLRHLQAAVDEVNGPLNLDSRLVVLGAELTGQNNTRL RREMLSKCPTTKWSDRIDIAI >gi|313157264|gb|AENZ01000073.1| GENE 10 7598 - 8482 820 294 aa, chain - ## HITS:1 COG:STM1998 KEGG:ns NR:ns ## COG: STM1998 COG1974 # Protein_GI_number: 16765334 # Func_class: K Transcription; T Signal transduction mechanisms # Function: SOS-response transcriptional repressors (RecA-mediated autopeptidases) # Organism: Salmonella typhimurium LT2 # 18 139 15 136 139 80 35.0 3e-15 MNIHFDIYGLITTTVKSLPYVSEGIPAGFPSPAADYIGDRIDLNEVLILSPGATYYARVR DYYLVEEELEQGDGILFDTRLAPRTGDLAFCEVCGERVLKYIRKRGRELWLLGGGTDDTL PVDGDTEARVLGVVTVVIKVRRFKRRRTRLIDRLPQKEQPALFERRAIRASYFPASDVPV AGLVDLNEELIPNPISTFFGKVRGISLIEDDIEDSDSVIVDRMLEPRWGDLAVCVTSEGF TMKYVEVHDDGEIWLMPGNKDYAPIRITGENERLIWGIVTHSIRQLRTKKGGIG >gi|313157264|gb|AENZ01000073.1| GENE 11 9072 - 11000 306 642 aa, chain + ## HITS:1 COG:no KEGG:CHU_1497 NR:ns ## KEGG: CHU_1497 # Name: not_defined # Def: histidine kinase # Organism: 
C.hutchinsonii # Pathway: not_defined # 4 628 1 615 621 517 46.0 1e-145 MDGMEVNLSRAIDFFYPSSSSLELVYFEAIANAIDAGANNIQISIAVDSYSKPESLTIVI SDNGVGFTNDRYRKFTKLLETEEEDHKGLGRLVFLKYFKEVYIESIFEKQIRKFTFKKNF EKNNFTLSPAPDATYTGSRLAFSGYLKKKLYSYDFLKPVSIRQSILYHFYPHLFSIVKAG KDLKIDLTLSTKEPNPEQGFYNDCQILIASQSVDLESRKFPAESLDLFDTIDIFYKIKQT NKESSLVTAICVDGRTIPIDIMSKESTPKGYEIVFLLYSSFFTGKTNPSREALTLDEQEL KTVKNLFRKHATIILNEKIPSIQEENKHITESLNNNYPHLAGYFDEQSIGIIDRNKAIET AQRKFFQAQKEVLDAPNTISTEQYEKALNVSSRTLMEYILYRNVIIKNLKQIDKKNPEAD IHDIIVPRKKVCGKAGFINDLYTNNAWLLDDKYMSYSTILSDLEMNKLMPYLALDGENVD DEKRPDIAIVFSNDFHNPVGQVEVQTKTDVVIVELKKLGLPLAKREEVVSQLKQRARKLA KYYPDKIQRIWFYGIVDIDAEFRIALKEENYTELFSAGSLFYKEHTVILDEDKNIRIPFG LYVLSFDAFLKDAESRNSTFLNVLKQGFLKGCNIPKASNEQK >gi|313157264|gb|AENZ01000073.1| GENE 12 11046 - 12059 907 337 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|167855185|ref|ZP_02477956.1| 50S ribosomal protein L31 [Haemophilus parasuis 29755] # 10 335 11 337 339 353 51 3e-97 MSAITTLYDDIRAIIINTRNTIYKAVNTGILEANWKIGRRIVEEEQAGASRAEYGQRVIN DLAEKLSVEFGRGFDARELRRYRQFYLLFPKWDALRPELTWTHYRTLIRVENERARLYYM NEAALQNWSTRALDSQIERLTYERILSSQNQLIVKEAEDAASRQAQLTPADIIKDPYVLD FLGLPSGVNFYEKDLEKALIDNLQQFLLELGRGFSFVSRQYRFKTDNENYYVDLVFYNFI LKCFVLIDLKVGKLTYQDIGQMDFYTRYFEENIRTETDNPTIGIVLCTERDNTIVKYSVM NDSNQLFASKYKLYLPTEEELINELETSRKQIENNVK >gi|313157264|gb|AENZ01000073.1| GENE 13 12776 - 13021 328 81 aa, chain - ## HITS:1 COG:XF2071 KEGG:ns NR:ns ## COG: XF2071 COG3609 # Protein_GI_number: 15838662 # Func_class: K Transcription # Function: Predicted transcriptional regulators containing the CopG/Arc/MetJ DNA-binding domain # Organism: Xylella fastidiosa 9a5c # 2 77 3 78 80 65 44.0 3e-11 MKTTSVALGVYFEDFIKAKIAQGRYNNASEVIRAGLRLLEENESRLTELKAAIREGIDSG VAEGFDPEDHLKTLKAKWTNG >gi|313157264|gb|AENZ01000073.1| GENE 14 13126 - 13416 320 96 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1125 NR:ns ## KEGG: Bacsa_1125 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 96 1 97 97 139 69.0 5e-32 
MKIMCQEHYDKVVQYAESIDDKTLHECLKRLERREQNPHHPCEIELYKDFAPYSFLFKER YPDGRLGVVGGLVYHGCPDRSHCFIDGPFHGWMTHT >gi|313157264|gb|AENZ01000073.1| GENE 15 13434 - 14069 493 211 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0421 NR:ns ## KEGG: Bacsa_0421 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 211 6 209 209 200 52.0 3e-50 MHEAEQYLRNPETPYSLYIQYQGRRRRLFYNRDQNICGIIGIGRRRYGFGFGDWDNIEKI FKPAPDKDPEEINRRLIRKFQREAAKAGFTSPFIRKIQHADYSKDLYKNGITTGTSIDGQ IITLEAVRKWCGKTTYRSFCEAVKNRTPFHSGRFDFRGYDGSLWVEPCDKDDGYHRVGDL TAGFSKEYRGCGNGYYYLLINEHTFIGCDID >gi|313157264|gb|AENZ01000073.1| GENE 16 14219 - 15118 475 299 aa, chain - ## HITS:1 COG:no KEGG:BDI_3903 NR:ns ## KEGG: BDI_3903 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 5 268 64 320 320 150 33.0 8e-35 MQQDKTKDRYSCETLLSMNTLYDHEHHLTQADVDAANALVRHIERTRNPRVPQVGDRVRY TTRHGDFHGNALIEAVREGGTPSICLCPYVPFVWATAGGIGCAVSGGPFTAVMPQELKPS GAMPGDFCAWGHCGACGNGVVRFCAEVPLWEFREGDPLYGDFSTEKWRKISLYKDTENLH GNLYRGDCISFRTEEEFRRFLSDCEGTVFAAPNPKSVIVWGYRDEQVALPRTEWKALDVP VTERRIYNTLQPVKLVKDHGRHTAVCYFVRPEDDISQPQDMTLEELIYEESCPTSKVTV >gi|313157264|gb|AENZ01000073.1| GENE 17 15194 - 16183 1084 329 aa, chain - ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 13 317 223 510 522 101 28.0 2e-21 MNKNLIEKIAPQLTELMIKKMETLTGEWRKPWIADLAHGLPRNLRGTHYRGGNILMLLFL SEIAGYRTPLFMTFKQAKEEGLNILKGSGSFPVFFWKLYIRHKETRKKIELAEYYRLPQE QRRQYDVLPVMRYYPVFNIDQTDMSEQQPERYTSLTTPTGPKDYSDGLTCEVLDRMLAEQ SWLCPILLRSGNRASYSPTLDRIVCPEKRQFPESAAFWTTLLHEVTHSTGHAERLNRPFG ACYRDADYIREELVAELTAALCGAMLGFATTPREESAAYIKDWLAEFHKEPTYLFDILTD VNRAARMISERLACEQEPETPGAIPAEAA Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:35:28 2011 Seq name: gi|313157263|gb|AENZ01000074.1| Alistipes sp. 
HGB5 contig00097, whole genome shotgun sequence Length of sequence - 542 bp Number of predicted genes - 0 Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:35:32 2011 Seq name: gi|313157255|gb|AENZ01000075.1| Alistipes sp. HGB5 contig00003, whole genome shotgun sequence Length of sequence - 14421 bp Number of predicted genes - 6, with homology - 6 Number of transcription units - 4, operones - 1 average op.length - 3.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 192 - 1304 1468 ## COG0635 Coproporphyrinogen III oxidase and related Fe-S oxidoreductases + Prom 1335 - 1394 9.9 2 2 Tu 1 . + CDS 1459 - 5709 6619 ## COG0058 Glucan phosphorylase + Term 5720 - 5764 -0.6 + Prom 5734 - 5793 4.1 3 3 Op 1 4/0.000 + CDS 5820 - 7952 2774 ## COG3408 Glycogen debranching enzyme 4 3 Op 2 2/0.000 + CDS 7980 - 9293 2055 ## COG0438 Glycosyltransferase 5 3 Op 3 . + CDS 9309 - 10523 1814 ## COG1449 Alpha-amylase/alpha-mannosidase - Term 11002 - 11043 9.3 6 4 Tu 1 . - CDS 11188 - 14421 3726 ## gi|313157258|gb|EFR56687.1| hypothetical protein HMPREF9720_0155 Predicted protein(s) >gi|313157255|gb|AENZ01000075.1| GENE 1 192 - 1304 1468 370 aa, chain + ## HITS:1 COG:L0193 KEGG:ns NR:ns ## COG: L0193 COG0635 # Protein_GI_number: 15673121 # Func_class: H Coenzyme transport and metabolism # Function: Coproporphyrinogen III oxidase and related Fe-S oxidoreductases # Organism: Lactococcus lactis # 5 368 9 373 379 217 31.0 4e-56 MAGLYFHIPFCKRVCAYCDFYKSADLRRMDDLLAAMHRELDDRRGYPGGEAVTTRYFGGG TPSLCTPEAIRGLLDHAAQLFDCSGAEETTLEANPDDLTAEYLGGLLEAGIDRLSIGVQS FDDDCLKLMNRRHTAAQAAEAIRAAQRAGFGNITLDLIFGVPGFGGDTLRRSLDTALSLG VQHISAYHLTVEPDTAFGRRAARGEFRAVDEQVSETEFLTVHETLTRAGFEHYEVSNFAL PGFRARHNAAYWHGAKYLGIGPAAHSFDGEERHWNVGSVERYIAGDPAERELLTNRDRFN EYVMTALRTAEGIDTREVAARFGTKRLERMREEAAPYLQSGALRDAGGRLAVPPERFLIS DAVIEALFET >gi|313157255|gb|AENZ01000075.1| GENE 2 1459 - 5709 6619 1416 aa, chain + ## HITS:1 COG:all1272 KEGG:ns NR:ns ## COG: all1272 COG0058 # Protein_GI_number: 
17228767 # Func_class: G Carbohydrate transport and metabolism # Function: Glucan phosphorylase # Organism: Nostoc sp. PCC 7120 # 579 1416 9 853 854 629 39.0 1e-179 MSNAKLTPDYLFEVSWEVCNKIGGIHTVVSTKAQTAIRKFNERYILIGPDLQHEGVNPEF EEDPNLLKAWRQGLYAEGIRIRAGHWKIKGEPTAILVDFTSLIPRKDEILKKLWEAYHVD SISGQWDYIEPVLFGYAAGAVIASYVENFCTATDKIAAHFHEWMTAAGGLYLRRHAPYVA TLFTTHATVMGRCIAGNHLPLYTDLTKFNADELARQFNVVAKHSIEKAAAAYHDAFVTVS DITANECKFLLGREPDCITPNGFENDFVWTGEEYDAKRAEARKTMISVAEACLGEKFAEE PLIVGTSGRYEFRNKGIDVFIESLKKLAASPRLKRQVLAYITVPAANRGPRADLQAHLAD PSKPIDGGQYKYTTHYLEYQSSDPIVNALNGSILTTPDSKVKVIFVPTYLNKADGIFGKD YYELLVGMDMTVFPSYYEPWGYTPLESVAFSVPTITTTLAGFGLWAAKQREHAGVEIVLR DDYNDQEVEEKIAESLLHFSLLDDKHVNEMRVSAYEISETALWEHLFAAYEQAYSEAVES SVIRTNRAVLDEGGNRNEQINFVRQQLFAEKPNWNRMMVDKTLPKRLHALEELSRNLWWC WNPGTRDLFESIDHALWAECERNPIAFLDKMSVERMKELEQDTNFLSQLDAVYAQFRDYM NEKPDPKATSISYFSMEYGLHSSLKIYSGGLGILAGDYLKEASDKNVPMAAVGLLYRYGY FTQRLSSQGAQEATYEAQNFYKLPISPVRDEAGNWMTVTIAFPGRTLWARVWKCQVGRTD LYLLDADFEANLEEDRQITHHLYGGDWENRLKQEILLGIGGIRALRKLGIKHDVFHCNEG HAAFIGIERIRDLVTHKKLSFSEALEVIRSSSLFTTHTPVPAGHDAFPESMIRQYMSHYP DVLGITWEQYINLGKTNPNDPNEKFSMSVLACNLSQEVNGVSWLHGEVSKDILGSMWPGY FKNELHIGYVTNGVHFPTWIATSMRRLYARYFADGFEGHTYDIPAWQKVHQIPDEELWNE RMYLKNKLVKHIRRRYSDPNQVRLDSPRQMIQIIEGIKPDVLTIGFARRFATYKRAYLLF TNLDRLSAIVNNKERPVQFIFAGKAHPNDKPGQDLIKRIVEVAAMPQFVGKILFLQNYDM ELARRMVQGVDVWLNTPTRPLEASGTSGEKCVMNGVLQFSVLDGWWVEGYKEGAGWMLPM ERTFADQGYQDELDAEMIYNTIEEQIVPKYYDRGSDGIPHQWVDSVKKCVADIASNFTTN RMLGDYEERFYNKLAARKREIVEGGYKLAREIAAWKRKVSAAWDKVRVIDVQRVRIDDEA IFVGEKYHFEVTVDIANLRPEDIGVEMVIAQQIVVGGKVNVTRTIGLKHTKTDGSRVTYA LDYVPGEAGTFDVALRLYPYNPHLPHRMDFALVKWA >gi|313157255|gb|AENZ01000075.1| GENE 3 5820 - 7952 2774 710 aa, chain + ## HITS:1 COG:MA0905 KEGG:ns NR:ns ## COG: MA0905 COG3408 # Protein_GI_number: 20089784 # Func_class: G Carbohydrate transport and metabolism # Function: Glycogen debranching enzyme # Organism: Methanosarcina acetivorans str.C2A # 1 635 22 668 680 317 32.0 4e-86 
MSALTFDKSELGNLEYSLQREMLSTDRIGGYMSTTIVCCNTRRYHGLMVAPIDDSCRTYV LLSSLDETVVQHDQTFNLALHRFKGVYEPRGHKYITDFEYTPTPTITYRVGGVILKKEML WIHKRTQLMIRYTLVDAHSETRLRLRPFLAFRDKHALTHANMEADGHSYPAVNGVKCRLY NSFPWLYLQTSKPGTEFVPAPDWYYGFEYQQELARGYEGYEDLLTTGYFEAELKKGESII FSASLDEMGSTKTIEEVFAASIARRTHKIDFISCLEHSARQFLIRRPGDRTEVVSGYPWH GVSGRQTFISLPGITLEQGHKEDCIDALDTLVREMRDGMFTGSASAEVAADAPLWFFWTL QQLEREVGAKEIWASYGPAMKDILESYRQGIGGRVALHDNGLIWASSEEVPLTWMNSTID GRPVTPRNGYQVEVNALWYNAVCYALELAGKYGDKTDKAFVKAWASLPARTQESFIELFR LPEGYLADFVDGSGPDKSTRPNMIVACGLNYKMLDETMQLEVIRTVRQHLLTPKGLRTLS PQNPLYRGSQEGMPAERDFAAKNGSVWPWLLPFYIKACFDIDGDAFLPQAEEALENFDED IQRYGIGSICELYDADPPYASRGAISQAWSVGAALDIHRMIRERSKGDSQAPKAAKKSGR TNGPKEKKPAKTKPAAKSAGKTVKAAAKSPAAAKTPKAAAKKAAGKAAKK >gi|313157255|gb|AENZ01000075.1| GENE 4 7980 - 9293 2055 437 aa, chain + ## HITS:1 COG:TVN0430 KEGG:ns NR:ns ## COG: TVN0430 COG0438 # Protein_GI_number: 13541261 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glycosyltransferase # Organism: Thermoplasma volcanium # 155 430 95 369 369 199 37.0 7e-51 MRVLMFGWEFPPHIAGGLGTACYGMTRGLARNDVDVTFVVPHAYGDEDQRFTHVMNASDV EALYGSTGSGADDILEKMSFIHIDSNMVPYISPEEYEVYHERYVKSGQQVWSTTDAWKQR YTFSGKYGANLMEEVARYAVVAAQVAKDLEGQFDVIHAHDWLTYYAGIAAKRVSGKPLVV HMHATEYDRSGENVNTQVYAIERAGMHAADRVIAVSNLTRNIVINRYGVPAEKVVTVHNA VRFAGGPCKMPERGVKDKIVTFLGRITYQKGPDYFVEAAAKVLKRVPDVRFVMAGSGDMM NHVIRRVARLGIADRFHFTGFLRGEDVHKMFQLSDVYVMPSVSEPFGISPLEAMRSNVPV IISKQSGVAEVLDFAVKVDYWDVDALADAIYGLIQYPALSGMFASKGLEEVTNLKWNDAA AKIKAVYEAVIEENKKQ >gi|313157255|gb|AENZ01000075.1| GENE 5 9309 - 10523 1814 404 aa, chain + ## HITS:1 COG:MA4052 KEGG:ns NR:ns ## COG: MA4052 COG1449 # Protein_GI_number: 20092845 # Func_class: G Carbohydrate transport and metabolism # Function: Alpha-amylase/alpha-mannosidase # Organism: Methanosarcina acetivorans str.C2A # 1 388 1 386 396 372 48.0 1e-103 MKTVCLYFQVHQPWRLKKYRFFNMGKDHNYLDDLLNRSIMQKVARQCYLPMNALLLKLIK ENKGKFRCSFSITGIAVEQFRAFAPEVLDSFKELAATGCVEFLAETYSHSLASLASKEDF 
TEQVKLHTAMIKKEFGVKPTAFRNTELIYSDEIGAMVADMGFRTMLAEGAKHVLGWKSPN YVYANAINQKLRLLLRNYKLSDDIAFRFSNRSWDEWPLTADKYVKWLSSDETPGEVINLF MDYETFGEHQTADTGIFEFMRALPKAILAKKNDMEFATVSEAAKKYQPVSVLHCPHVMSW ADEERDVTAWLGNELQNEAFSKLYAQKEKVAALKSPDFDYVWSFMQTSDHFYYMATKWLS DGDVHSYFNPYDSAYDAFINYMNVLSDFIIELDQAVAAKAAKAK >gi|313157255|gb|AENZ01000075.1| GENE 6 11188 - 14421 3726 1077 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157258|gb|EFR56687.1| ## NR: gi|313157258|gb|EFR56687.1| hypothetical protein HMPREF9720_0155 [Alistipes sp. HGB5] # 15 1077 1 1063 1063 1858 99.0 0 DLKGQINDLKGQVELKADASALKAVTDKLNGIDFSSFVTNSGLQTELDKRLADYAKKSDL KDWLTSDEVLKLIKAQGYQTKADVQKLIEEATKNQLTADDVKKIFETMIASDETMGKLQA EMQKIIQQALVDGKYVTEQQMKDFVEGKGYITGADQLSATQINQILTAVANSVSSETATV TDAIKKVLGDNFASYMAEYMNDETVKAEMGKTITDTILKELTDANVTLKEAIEAMIEEGV NSKLFNPDGSAIYLKEADLKDRFDNYDTQLRNLWSAIADLAGRIQSLVYVPTSLEEVSNN VISIEGLPYIAMVDGEGEDYKFYLGSDEGGNEAAIEATFMVSPAALAVKMTKENVSFVTE EIKTRSAASFEVVDVIDQDVVTGKFTVVAKTQYEYGNGKTLAIALNVKLAGTGVAGATPE DAISGEDLGSDFTSAFIGTQYNAGGAINADLVAGRMKEDGNLELVVDDDTQFPEFEGDGG LKYNDEETVVEFFHDVKIYYMKDGKPQELAAIWGEDVPVLKTVVPEDAAVATVDPEDNKD SYTLTATTAAIKKGKGVPALIDDVITSEPFSFDLVSGEYTCRVVKNAKSQFTIIKDNTDV IVSEPQTVTWNYATVSAGDDYEAVDYLLNPDPTKPQIGHVIYNALKAGSFEDIQIFDGET EMTDELPFDVADIILTSSPADDRAEQHLELVVAGNSAMTSGSYTVKARYNYENTDITITL PVVIEGMPEITMAPETVTVDYLAGTSEYTVIETYAEKLWNKDWAPKYFADKQTFVDAIYA ATFEATADEAEDGTELVQATDDAKNIAIKLGTGAKEKSYELTAKLTADWGLDVTVAVTVN FEGFAGELVQGDNYANPVRLNAKLTTTAYVPDGLNLDNTWVAPKDSKGNEIGKVVFSVDA EEYEGASVDADAKTLTWGTEYKGLNMALSVKLMLDSYVLDEQELNVEIPDPIKDKTISLR KSAATTISASDADASLTVGDLLQLANINGNNVFEAAAADENTVLGVAVKYEVMNPEVLVD RITIAGDFATTGTITVKGNADSEFVGEKTVKVKVTVSYDYGTDRVHNLDIKVKQKAN Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:37:12 2011 Seq name: gi|313157045|gb|AENZ01000076.1| Alistipes sp. 
HGB5 contig00037, whole genome shotgun sequence Length of sequence - 199319 bp Number of predicted genes - 213, with homology - 204 Number of transcription units - 91, operones - 49 average op.length - 3.5 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 50 - 970 -213 ## PPSC2_p0627 hypothetical protein - Prom 1045 - 1104 5.8 2 2 Tu 1 . - CDS 1887 - 2072 165 ## gi|313157196|gb|EFR56626.1| conserved domain protein - Prom 2154 - 2213 3.4 + Prom 2099 - 2158 1.6 3 3 Op 1 . + CDS 2179 - 2739 815 ## COG0664 cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases 4 3 Op 2 . + CDS 2793 - 3509 191 ## PROTEIN SUPPORTED gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 5 3 Op 3 . + CDS 3509 - 3904 563 ## gi|313157193|gb|EFR56623.1| hypothetical protein HMPREF9720_1612 + Term 3930 - 3968 6.3 6 4 Tu 1 . - CDS 3989 - 4930 1196 ## COG2070 Dioxygenases related to 2-nitropropane dioxygenase + Prom 4869 - 4928 2.0 7 5 Op 1 . + CDS 5075 - 6115 1776 ## CHU_0571 hypothetical protein + Term 6147 - 6179 5.0 8 5 Op 2 . + CDS 6188 - 6661 -115 ## + Term 6745 - 6787 -0.8 + Prom 6736 - 6795 3.8 9 6 Tu 1 . + CDS 6901 - 7068 75 ## gi|313157088|gb|EFR56518.1| conserved hypothetical protein + Term 7076 - 7120 12.4 - Term 7070 - 7103 1.3 10 7 Tu 1 . - CDS 7110 - 7688 772 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases - Prom 7771 - 7830 2.6 + Prom 7656 - 7715 2.4 11 8 Tu 1 . + CDS 7807 - 8397 516 ## COG0242 N-formylmethionyl-tRNA deformylase + Term 8416 - 8457 6.3 - Term 8404 - 8445 5.0 12 9 Op 1 . - CDS 8452 - 9804 2124 ## COG0534 Na+-driven multidrug efflux pump 13 9 Op 2 . - CDS 9879 - 10442 798 ## BF1288 hypothetical protein - Prom 10623 - 10682 3.6 + Prom 10495 - 10554 1.8 14 10 Tu 1 . + CDS 10664 - 11101 809 ## COG0071 Molecular chaperone (small heat shock protein) + Term 11120 - 11168 12.7 15 11 Op 1 . 
+ CDS 11439 - 12173 806 ## COG3142 Uncharacterized protein involved in copper resistance 16 11 Op 2 . + CDS 12170 - 13357 1359 ## Odosp_0628 transglutaminase domain-containing protein + Term 13416 - 13460 11.4 - Term 13404 - 13448 11.4 17 12 Op 1 . - CDS 13533 - 14186 491 ## COG0454 Histone acetyltransferase HPA2 and related acetyltransferases 18 12 Op 2 . - CDS 14188 - 15096 675 ## BF3212 putative ferredoxin + Prom 15146 - 15205 3.7 19 13 Tu 1 . + CDS 15228 - 15914 990 ## COG0778 Nitroreductase + Term 15937 - 15975 5.3 + Prom 16010 - 16069 2.9 20 14 Tu 1 . + CDS 16093 - 17160 1439 ## COG3274 Uncharacterized protein conserved in bacteria 21 15 Tu 1 . - CDS 17161 - 17718 850 ## COG0494 NTP pyrophosphohydrolases including oxidative damage repair enzymes - Prom 17746 - 17805 2.2 + Prom 17714 - 17773 4.3 22 16 Op 1 . + CDS 17793 - 19079 1811 ## COG3681 Uncharacterized conserved protein 23 16 Op 2 . + CDS 19091 - 20368 1858 ## PRU_2549 hypothetical protein + Term 20403 - 20451 9.6 24 17 Tu 1 . - CDS 20728 - 21954 1903 ## COG1519 3-deoxy-D-manno-octulosonic-acid transferase - Prom 21976 - 22035 2.9 25 18 Op 1 . - CDS 22092 - 22457 471 ## Dfer_3709 hypothetical protein 26 18 Op 2 . - CDS 22463 - 23305 1225 ## COG0287 Prephenate dehydrogenase 27 18 Op 3 . - CDS 23359 - 24060 1013 ## COG3340 Peptidase E - Prom 24113 - 24172 10.1 + Prom 24117 - 24176 4.0 28 19 Op 1 27/0.000 + CDS 24239 - 24475 517 ## COG0236 Acyl carrier protein 29 19 Op 2 1/0.000 + CDS 24517 - 25764 1415 ## COG0304 3-oxoacyl-(acyl-carrier-protein) synthase 30 19 Op 3 . + CDS 25860 - 26558 865 ## COG0571 dsRNA-specific ribonuclease 31 20 Tu 1 . + CDS 26673 - 27287 708 ## COG2003 DNA repair proteins - Term 27470 - 27508 2.9 32 21 Tu 1 . - CDS 27509 - 27661 58 ## - Prom 27684 - 27743 2.8 + Prom 27629 - 27688 8.3 33 22 Op 1 . + CDS 27710 - 29254 2571 ## COG0423 Glycyl-tRNA synthetase (class II) 34 22 Op 2 . 
+ CDS 29265 - 29678 495 ## COG0816 Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. subtilis) 35 22 Op 3 . + CDS 29675 - 30220 782 ## COG0242 N-formylmethionyl-tRNA deformylase + Term 30242 - 30296 19.9 - Term 30239 - 30274 8.1 36 23 Op 1 . - CDS 30304 - 30888 789 ## gi|313157247|gb|EFR56677.1| hypothetical protein HMPREF9720_1642 37 23 Op 2 . - CDS 30934 - 31443 0 ## 38 23 Op 3 . - CDS 31503 - 31970 756 ## HMPREF0659_A6148 hypothetical protein - Prom 32020 - 32079 5.6 - Term 32081 - 32126 14.3 39 24 Op 1 . - CDS 32160 - 34889 4325 ## COG0574 Phosphoenolpyruvate synthase/pyruvate phosphate dikinase - Prom 34928 - 34987 2.0 40 24 Op 2 . - CDS 35062 - 36012 1086 ## FB2170_05240 hypothetical protein 41 24 Op 3 . - CDS 36023 - 36250 63 ## TEQUI_0595 hypothetical protein - Prom 36282 - 36341 5.6 - Term 36358 - 36387 0.4 42 25 Op 1 . - CDS 36618 - 37934 1737 ## COG0482 Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain 43 25 Op 2 . - CDS 37950 - 38615 828 ## gi|313157050|gb|EFR56480.1| conserved hypothetical protein - Prom 38739 - 38798 4.7 - Term 38758 - 38801 9.3 44 26 Tu 1 . - CDS 38827 - 39381 570 ## PROTEIN SUPPORTED gi|227536351|ref|ZP_03966400.1| 30S ribosomal protein S16 - Prom 39402 - 39461 4.7 - TRNA 39570 - 39643 81.1 # Arg TCT 0 0 - Term 39774 - 39810 7.2 45 27 Op 1 2/0.000 - CDS 39909 - 40292 622 ## COG0784 FOG: CheY-like receiver 46 27 Op 2 . - CDS 40349 - 42076 1703 ## COG0739 Membrane proteins related to metalloendopeptidases 47 27 Op 3 . - CDS 42061 - 42579 777 ## Odosp_3302 regulatory protein RecX 48 27 Op 4 3/0.000 - CDS 42561 - 43403 327 ## PROTEIN SUPPORTED gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase 49 27 Op 5 5/0.000 - CDS 43403 - 44377 1479 ## COG0082 Chorismate synthase 50 27 Op 6 . - CDS 44382 - 45626 1527 ## COG0128 5-enolpyruvylshikimate-3-phosphate synthase 51 27 Op 7 . 
- CDS 45699 - 46487 265 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 - Term 46677 - 46732 3.9 52 28 Tu 1 . - CDS 46788 - 48011 1821 ## Odosp_3332 phosphate-selective porin O and P - Prom 48232 - 48291 3.7 + Prom 48038 - 48097 2.9 53 29 Op 1 . + CDS 48281 - 50473 3923 ## COG0514 Superfamily II DNA helicase 54 29 Op 2 . + CDS 50550 - 50768 382 ## gi|313157235|gb|EFR56665.1| hypothetical protein HMPREF9720_1663 + Term 50797 - 50845 -0.7 55 30 Tu 1 . + CDS 50988 - 51230 388 ## gi|313157172|gb|EFR56602.1| hypothetical protein HMPREF9720_1664 + Term 51237 - 51274 4.0 56 31 Op 1 . + CDS 51673 - 51876 169 ## BT_0780 acetyltransferase 57 31 Op 2 . + CDS 51878 - 53029 1028 ## COG0030 Dimethyladenosine transferase (rRNA methylation) + Term 53131 - 53173 11.1 - Term 53119 - 53161 11.1 58 32 Op 1 14/0.000 - CDS 53252 - 54556 1947 ## COG0612 Predicted Zn-dependent peptidases 59 32 Op 2 . - CDS 54553 - 55791 1819 ## COG0612 Predicted Zn-dependent peptidases 60 33 Tu 1 . + CDS 56191 - 56361 97 ## + Term 56415 - 56450 -0.9 61 34 Tu 1 . - CDS 56311 - 56505 59 ## - Prom 56534 - 56593 4.0 62 35 Tu 1 . + CDS 56528 - 60061 5037 ## COG0674 Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit + Term 60080 - 60122 11.7 + Prom 60160 - 60219 7.7 63 36 Tu 1 . + CDS 60260 - 61036 1286 ## Cpin_0543 S1/P1 nuclease + Prom 61444 - 61503 2.0 64 37 Tu 1 . + CDS 61584 - 62471 805 ## gi|313157171|gb|EFR56601.1| hypothetical protein HMPREF9720_1673 + Term 62511 - 62551 10.4 + Prom 62582 - 62641 2.2 65 38 Tu 1 . + CDS 62714 - 63145 735 ## Odosp_1604 hypothetical protein + Term 63168 - 63206 4.1 + Prom 63549 - 63608 8.5 66 39 Tu 1 . + CDS 63738 - 64883 905 ## COG4973 Site-specific recombinase XerC + Term 64938 - 64973 0.4 67 40 Tu 1 . - CDS 64981 - 65487 66 ## gi|218129443|ref|ZP_03458247.1| hypothetical protein BACEGG_01020 68 41 Tu 1 . 
+ CDS 65452 - 65787 274 ## Bacsa_0506 hypothetical protein + Term 65903 - 65942 6.5 + Prom 65789 - 65848 3.0 69 42 Op 1 . + CDS 65979 - 66362 427 ## BF2790 putative excisionase 70 42 Op 2 . + CDS 66340 - 67440 914 ## BF2791 hypothetical protein + Prom 67455 - 67514 7.1 71 43 Tu 1 . + CDS 67543 - 68430 610 ## BF2792 DNA primase + Term 68459 - 68488 -0.3 + Prom 68585 - 68644 5.6 72 44 Op 1 . + CDS 68674 - 69117 354 ## BF2793 hypothetical protein 73 44 Op 2 . + CDS 69119 - 69445 336 ## BF2794 hypothetical protein 74 44 Op 3 . + CDS 69500 - 69877 566 ## BF2795 conjugate transposon protein TraE 75 44 Op 4 . + CDS 69893 - 70189 370 ## Bacsa_0499 hypothetical protein + Term 70201 - 70233 -1.0 76 45 Op 1 . + CDS 70301 - 73009 3113 ## BF2797 hypothetical protein 77 45 Op 2 . + CDS 73021 - 75567 1860 ## BF2799 hypothetical protein 78 45 Op 3 . + CDS 75564 - 76289 506 ## BF2800 hypothetical protein 79 45 Op 4 . + CDS 76308 - 77075 754 ## BF2802 hypothetical protein 80 45 Op 5 . + CDS 77088 - 78221 748 ## BF2803 hypothetical protein 81 45 Op 6 . + CDS 78263 - 78658 330 ## Bacsa_0493 hypothetical protein 82 45 Op 7 . + CDS 78662 - 79276 589 ## BF2805 conjugate transposon protein TraK 83 45 Op 8 . + CDS 79289 - 79669 286 ## BF2806 hypothetical protein 84 45 Op 9 . + CDS 79680 - 80162 290 ## BF2807 hypothetical protein 85 45 Op 10 . + CDS 80173 - 81321 948 ## BF2808 conjugate transposon protein TraM 86 45 Op 11 . + CDS 81333 - 82118 413 ## COG0863 DNA modification methylase 87 45 Op 12 . + CDS 82156 - 83007 877 ## BF2811 conjugate transposon protein TraN 88 45 Op 13 . + CDS 83031 - 83516 423 ## Bacsa_0487 hypothetical protein 89 45 Op 14 . + CDS 83542 - 84198 515 ## BF2813 hypothetical protein 90 45 Op 15 . + CDS 84227 - 86413 1259 ## BF2815 putative mobilization protein + Term 86416 - 86471 13.6 - Term 86403 - 86459 12.2 91 46 Tu 1 . - CDS 86501 - 87721 651 ## COG4804 Uncharacterized conserved protein - Prom 87774 - 87833 1.8 + Prom 87638 - 87697 4.3 92 47 Op 1 . 
+ CDS 87870 - 88046 102 ## gi|253571025|ref|ZP_04848433.1| conserved hypothetical protein + Prom 88049 - 88108 2.4 93 47 Op 2 . + CDS 88136 - 88258 86 ## gi|253571024|ref|ZP_04848432.1| conserved hypothetical protein + Term 88306 - 88351 8.3 - Term 88294 - 88339 6.2 94 48 Op 1 . - CDS 88361 - 88741 433 ## BF2817 hypothetical protein 95 48 Op 2 . - CDS 88775 - 89047 241 ## Bacsa_0432 hypothetical protein 96 48 Op 3 . - CDS 89055 - 89330 440 ## BF2819 hypothetical protein - Term 89708 - 89752 7.1 97 49 Tu 1 . - CDS 89772 - 91505 1122 ## COG1475 Predicted transcriptional regulators - Term 91845 - 91888 5.4 98 50 Op 1 . - CDS 91937 - 92191 289 ## BF2825 hypothetical protein 99 50 Op 2 . - CDS 92199 - 92954 667 ## COG1192 ATPases involved in chromosome partitioning - Prom 92977 - 93036 7.8 100 51 Op 1 . - CDS 93235 - 93597 221 ## Bacsa_0398 hypothetical protein 101 51 Op 2 . - CDS 93608 - 93949 180 ## BF2912 hypothetical protein - Term 94494 - 94542 7.1 102 52 Op 1 . - CDS 94550 - 95182 262 ## Bacsa_0421 hypothetical protein 103 52 Op 2 . - CDS 95230 - 95454 162 ## BF1096 hypothetical protein + Prom 95691 - 95750 4.3 104 53 Tu 1 . + CDS 95844 - 97013 873 ## COG1373 Predicted ATPase (AAA+ superfamily) + Term 97247 - 97289 -0.5 105 54 Tu 1 . - CDS 97019 - 97672 325 ## COG0739 Membrane proteins related to metalloendopeptidases - Prom 97864 - 97923 10.1 + Prom 97791 - 97850 4.4 106 55 Op 1 . + CDS 97892 - 98650 749 ## BF2874 hypothetical protein 107 55 Op 2 . + CDS 98669 - 98893 436 ## BF2875 hypothetical protein 108 55 Op 3 . + CDS 98910 - 100487 1267 ## BF2876 hypothetical protein 109 55 Op 4 . + CDS 100491 - 101879 936 ## Bacsa_0413 hypothetical protein 110 55 Op 5 . + CDS 101876 - 102373 345 ## Bacsa_0412 hypothetical protein 111 55 Op 6 . + CDS 102354 - 103043 694 ## Bacsa_0411 hypothetical protein 112 55 Op 7 . + CDS 103057 - 103740 487 ## BF2880 hypothetical protein 113 55 Op 8 . + CDS 103754 - 104419 554 ## BF2881 hypothetical protein 114 55 Op 9 . 
+ CDS 104445 - 105263 902 ## COG0739 Membrane proteins related to metalloendopeptidases 115 55 Op 10 . + CDS 105298 - 105579 316 ## Bacsa_0407 hypothetical protein 116 55 Op 11 . + CDS 105588 - 107387 1408 ## BF2884 hypothetical protein 117 55 Op 12 . + CDS 107415 - 109676 1257 ## BF2885 putative DNA primase 118 55 Op 13 . + CDS 109689 - 111245 1114 ## COG1705 Muramidase (flagellum-specific) 119 55 Op 14 . + CDS 111258 - 111434 138 ## gi|253570994|ref|ZP_04848402.1| conserved hypothetical protein 120 55 Op 15 . + CDS 111518 - 112126 549 ## BF2888 hypothetical protein 121 55 Op 16 . + CDS 112132 - 113820 1139 ## COG4227 Antirestriction protein + Term 113830 - 113881 3.5 - Term 113818 - 113869 8.3 122 56 Op 1 . - CDS 113873 - 114247 329 ## BF2890 hypothetical protein 123 56 Op 2 . - CDS 114262 - 114516 160 ## BF2891 hypothetical protein 124 56 Op 3 . - CDS 114513 - 114980 417 ## BF2897 hypothetical protein - Prom 115104 - 115163 4.6 125 57 Tu 1 . + CDS 115532 - 115783 65 ## + Term 115884 - 115924 -1.0 + Prom 116673 - 116732 6.2 126 58 Tu 1 . + CDS 116753 - 117721 483 ## Odosp_2774 hypothetical protein + Term 117731 - 117784 8.5 + Prom 117730 - 117789 5.7 127 59 Op 1 . + CDS 117849 - 119624 1140 ## Odosp_2775 PKD domain containing protein 128 59 Op 2 . + CDS 119633 - 121687 23 ## Odosp_2776 hypothetical protein + Prom 121691 - 121750 4.4 129 60 Op 1 . + CDS 121772 - 122851 1541 ## Odosp_2777 putative surface layer protein 130 60 Op 2 . + CDS 122860 - 123348 64 ## Slin_4541 transposase IS200-like protein 131 60 Op 3 . + CDS 123345 - 125423 1826 ## Odosp_3622 TonB-dependent receptor 132 60 Op 4 33/0.000 + CDS 125428 - 126567 1046 ## COG0614 ABC-type Fe3+-hydroxamate transport system, periplasmic component 133 60 Op 5 35/0.000 + CDS 126568 - 127548 966 ## COG0609 ABC-type Fe3+-siderophore transport system, permease component 134 60 Op 6 . + CDS 127545 - 128303 227 ## PROTEIN SUPPORTED gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 135 60 Op 7 . 
+ CDS 128300 - 128896 129 ## Odosp_2784 hypothetical protein - Term 128909 - 128956 6.7 136 61 Op 1 . - CDS 128977 - 130377 683 ## BF2899 putative outer membrane protein 137 61 Op 2 . - CDS 130432 - 131862 785 ## BF2900 hypothetical protein - Prom 132036 - 132095 1.6 138 62 Op 1 . - CDS 132107 - 132952 545 ## BF2901 hypothetical protein 139 62 Op 2 . - CDS 132942 - 133532 181 ## BF2902 hypothetical protein 140 62 Op 3 . - CDS 133523 - 133900 420 ## Bacsa_0388 hypothetical protein 141 62 Op 4 . - CDS 133921 - 134373 607 ## BF2904 hypothetical protein 142 62 Op 5 . - CDS 134395 - 135444 948 ## BF2905 hypothetical protein - Prom 135654 - 135713 5.5 + Prom 136077 - 136136 3.4 143 63 Op 1 . + CDS 136164 - 136343 65 ## 144 63 Op 2 . + CDS 136398 - 136502 61 ## 145 63 Op 3 . + CDS 136570 - 136851 268 ## COG0776 Bacterial nucleoid DNA-binding protein + Term 136931 - 136963 2.0 - Term 136882 - 136923 8.8 146 64 Tu 1 . - CDS 136939 - 137319 450 ## BF2915 putative single strand binding protein - Prom 137343 - 137402 3.7 + Prom 137617 - 137676 1.7 147 65 Tu 1 . + CDS 137801 - 138019 120 ## gi|319644002|ref|ZP_07998561.1| ATP-dependent RNA helicase DED1 + Term 138023 - 138070 0.1 - Term 137840 - 137873 2.5 148 66 Op 1 . - CDS 138003 - 138260 206 ## BF2918 hypothetical protein 149 66 Op 2 . - CDS 138244 - 138603 268 ## BF2919 hypothetical protein 150 66 Op 3 . - CDS 138615 - 138791 230 ## Bacsa_0363 hypothetical protein 151 66 Op 4 . - CDS 138811 - 139104 356 ## BF2920 hypothetical protein - Prom 139137 - 139196 4.0 - Term 140146 - 140187 11.8 152 67 Op 1 32/0.000 - CDS 140210 - 140470 349 ## PROTEIN SUPPORTED gi|227417938|ref|ZP_03901107.1| LSU ribosomal protein L27P 153 67 Op 2 . - CDS 140491 - 140802 381 ## PROTEIN SUPPORTED gi|124010267|ref|ZP_01694920.1| ribosomal protein L21 - Prom 140836 - 140895 5.5 + Prom 140741 - 140800 4.2 154 68 Op 1 . + CDS 140958 - 141326 415 ## gi|313157078|gb|EFR56508.1| hypothetical protein HMPREF9720_1760 155 68 Op 2 . 
+ CDS 141323 - 142270 1106 ## gi|313157211|gb|EFR56641.1| conserved domain protein 156 68 Op 3 . + CDS 142264 - 143220 941 ## gi|313157135|gb|EFR56565.1| hypothetical protein HMPREF9720_1762 157 68 Op 4 . + CDS 143239 - 144123 1244 ## COG1218 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase + Term 144163 - 144201 6.0 - Term 144153 - 144186 4.1 158 69 Op 1 . - CDS 144430 - 144912 500 ## STHERM_c15490 hypothetical protein 159 69 Op 2 . - CDS 145012 - 145281 429 ## gi|313157228|gb|EFR56658.1| conserved domain protein - Prom 145321 - 145380 3.2 + Prom 145295 - 145354 3.7 160 70 Op 1 . + CDS 145409 - 146749 2160 ## COG0569 K+ transport systems, NAD-binding component 161 70 Op 2 . + CDS 146822 - 148294 2097 ## Phep_3685 GH3 auxin-responsive promoter 162 70 Op 3 . + CDS 148383 - 148997 1051 ## COG1428 Deoxynucleoside kinases 163 70 Op 4 . + CDS 149001 - 150443 1656 ## COG0144 tRNA and rRNA cytosine-C5-methylases 164 70 Op 5 . + CDS 150446 - 151012 816 ## COG0789 Predicted transcriptional regulators + Term 151020 - 151073 3.7 + Prom 151107 - 151166 3.0 165 71 Op 1 7/0.000 + CDS 151194 - 151979 1060 ## COG0327 Uncharacterized conserved protein 166 71 Op 2 . + CDS 152003 - 152764 1359 ## COG1579 Zn-ribbon protein, possibly nucleic acid-binding + Term 152781 - 152829 12.8 - Term 152769 - 152817 13.3 167 72 Tu 1 . - CDS 152831 - 153280 419 ## Bacsa_1316 cupin 2 barrel domain-containing protein - Prom 153517 - 153576 2.9 + Prom 153479 - 153538 4.5 168 73 Op 1 . + CDS 153582 - 155162 2377 ## COG0171 NAD synthase 169 73 Op 2 . 
+ CDS 155166 - 155594 433 ## BDI_0517 hypothetical protein 170 73 Op 3 10/0.000 + CDS 155594 - 156679 1147 ## COG2878 Predicted NADH:ubiquinone oxidoreductase, subunit RnfB 171 73 Op 4 12/0.000 + CDS 156775 - 158118 1971 ## COG4656 Predicted NADH:ubiquinone oxidoreductase, subunit RnfC 172 73 Op 5 12/0.000 + CDS 158149 - 159120 1531 ## COG4658 Predicted NADH:ubiquinone oxidoreductase, subunit RnfD 173 73 Op 6 13/0.000 + CDS 159134 - 159811 923 ## COG4659 Predicted NADH:ubiquinone oxidoreductase, subunit RnfG 174 73 Op 7 3/0.000 + CDS 159804 - 160388 861 ## COG4660 Predicted NADH:ubiquinone oxidoreductase, subunit RnfE 175 73 Op 8 . + CDS 160391 - 160969 1115 ## COG4657 Predicted NADH:ubiquinone oxidoreductase, subunit RnfA + Term 161051 - 161098 8.9 - Term 161043 - 161080 7.1 176 74 Tu 1 . - CDS 161085 - 161267 384 ## - Prom 161322 - 161381 5.3 + Prom 161262 - 161321 5.9 177 75 Op 1 . + CDS 161357 - 161899 879 ## COG0634 Hypoxanthine-guanine phosphoribosyltransferase 178 75 Op 2 . + CDS 161899 - 162534 825 ## COG0602 Organic radical activating enzymes 179 75 Op 3 . + CDS 162534 - 162809 457 ## gi|313157167|gb|EFR56597.1| conserved hypothetical protein + Term 162843 - 162885 6.4 180 76 Tu 1 . - CDS 163258 - 164196 1190 ## COG0611 Thiamine monophosphate kinase - Prom 164246 - 164305 2.7 + Prom 164235 - 164294 2.6 181 77 Tu 1 . + CDS 164348 - 164776 638 ## COG0757 3-dehydroquinate dehydratase II + Term 164827 - 164860 -0.6 182 78 Op 1 . + CDS 165013 - 166488 2009 ## COG0469 Pyruvate kinase 183 78 Op 2 . + CDS 166492 - 167940 1953 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid + Term 167991 - 168034 13.1 - Term 167979 - 168022 9.3 184 79 Op 1 3/0.000 - CDS 168056 - 169123 1427 ## COG3426 Butyrate kinase 185 79 Op 2 21/0.000 - CDS 169182 - 170090 1217 ## COG0280 Phosphotransacetylase 186 79 Op 3 . - CDS 170116 - 171318 1905 ## COG0282 Acetate kinase 187 79 Op 4 . 
- CDS 171334 - 171705 538 ## COG0239 Integral membrane protein possibly involved in chromosome condensation - Prom 171733 - 171792 3.3 188 80 Op 1 22/0.000 - CDS 171813 - 173063 1907 ## COG0014 Gamma-glutamyl phosphate reductase 189 80 Op 2 . - CDS 173071 - 174162 1496 ## COG0263 Glutamate 5-kinase 190 80 Op 3 . - CDS 174191 - 174859 1174 ## COG0745 Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain - Prom 174985 - 175044 5.0 + Prom 174949 - 175008 4.5 191 81 Op 1 . + CDS 175036 - 177393 3325 ## BDI_2685 hypothetical protein 192 81 Op 2 . + CDS 177405 - 179027 2709 ## COG0504 CTP synthase (UTP-ammonia lyase) 193 81 Op 3 . + CDS 179061 - 181031 2826 ## COG0706 Preprotein translocase subunit YidC + Term 181101 - 181127 0.3 + Prom 181165 - 181224 4.2 194 82 Tu 1 . + CDS 181255 - 182079 1164 ## Odosp_3333 hypothetical protein + Term 182085 - 182111 -1.0 - Term 182065 - 182108 4.3 195 83 Op 1 . - CDS 182350 - 182727 448 ## COG1733 Predicted transcriptional regulators 196 83 Op 2 . - CDS 182729 - 183928 599 ## PROTEIN SUPPORTED gi|223476703|ref|YP_002580685.1| ribosomal protein L11 methyltransferase, putative 197 83 Op 3 . - CDS 183925 - 184554 1000 ## Odosp_3430 3'-5' exonuclease 198 83 Op 4 . - CDS 184627 - 186558 2894 ## COG0514 Superfamily II DNA helicase + Prom 186527 - 186586 2.7 199 84 Op 1 . + CDS 186633 - 186899 351 ## BVU_4156 hypothetical protein 200 84 Op 2 . + CDS 186920 - 187336 458 ## gi|313157086|gb|EFR56516.1| conserved domain protein + Term 187342 - 187389 7.3 201 85 Op 1 . + CDS 187402 - 189732 3526 ## Odosp_2264 TonB-dependent receptor 202 85 Op 2 . + CDS 189779 - 190090 511 ## gi|313157090|gb|EFR56520.1| heavy metal-associated domain protein + Term 190117 - 190151 5.1 - Term 190104 - 190137 4.1 203 86 Tu 1 . - CDS 190170 - 190892 980 ## COG2220 Predicted Zn-dependent hydrolases of the beta-lactamase fold - Prom 190919 - 190978 5.4 + Prom 190873 - 190932 2.5 204 87 Op 1 . 
+ CDS 191004 - 191456 679 ## Ctha_1316 hypothetical protein + Term 191470 - 191496 -1.0 205 87 Op 2 . + CDS 191536 - 192462 1435 ## gi|313157052|gb|EFR56482.1| putative lipoprotein + Term 192479 - 192525 8.4 206 88 Op 1 . - CDS 192537 - 193748 1724 ## COG0477 Permeases of the major facilitator superfamily 207 88 Op 2 . - CDS 193764 - 194186 628 ## gi|313157251|gb|EFR56681.1| putative lipoprotein 208 88 Op 3 . - CDS 194195 - 195157 1316 ## gi|313157105|gb|EFR56535.1| hypothetical protein HMPREF9720_1816 209 88 Op 4 . - CDS 194881 - 196293 438 ## COG4412 Uncharacterized protein conserved in bacteria - Prom 196345 - 196404 3.3 - Term 196387 - 196420 5.4 210 89 Op 1 . - CDS 196445 - 196789 422 ## PROTEIN SUPPORTED gi|227417053|ref|ZP_03900228.1| LSU ribosomal protein L20P 211 89 Op 2 . - CDS 196882 - 197076 244 ## PROTEIN SUPPORTED gi|153808045|ref|ZP_01960713.1| hypothetical protein BACCAC_02331 - Prom 197106 - 197165 5.5 + Prom 197052 - 197111 4.2 212 90 Tu 1 . + CDS 197336 - 198007 963 ## COG1285 Uncharacterized membrane protein 213 91 Tu 1 . - CDS 198004 - 198636 361 ## HMPREF9137_0261 putative lipoprotein - Prom 198832 - 198891 3.1 + 5S_RRNA 199232 - 199291 93.0 # AE015927 [R:2797299..2798807] # 5S ribosomal RNA # Clostridium tetani E88 # Bacteria; Firmicutes; Clostridia; Clostridiales; Clostridiaceae; Clostridium. 
Predicted protein(s) >gi|313157045|gb|AENZ01000076.1| GENE 1 50 - 970 -213 306 aa, chain - ## HITS:1 COG:no KEGG:PPSC2_p0627 NR:ns ## KEGG: PPSC2_p0627 # Name: not_defined # Def: hypothetical protein # Organism: P.polymyxa_SC2 # Pathway: not_defined # 33 212 136 316 452 82 27.0 2e-14 MTKRLFSEFGLDSILDTDEWVKIRKRAKMDSRRMTTEEFKEELFGINYKIRVVGEYTKAN DRIKVQCKTCNHTWNAIPASLRLGVGCPKCAGVLQLTHDQFVARLRVLQPDLIPLTEYIN TKEKVVVKCKVCGYVWSTQPYHLTATSPRQRTGCPKCSGRGRRTPDEFVAEIANLSPAIK IIGTFVGRNKPILVQCAECNRIWHAWPGSLLKGGGCKVCKSKQAIRQRNRKIRCITTGEI FNTLREAAEKYNVSPSSVCICCTNLPKRKHAGGLEWEYILIKKSFTFESLFITSTTKQPR QKGGKI >gi|313157045|gb|AENZ01000076.1| GENE 2 1887 - 2072 165 61 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157196|gb|EFR56626.1| ## NR: gi|313157196|gb|EFR56626.1| conserved domain protein [Alistipes sp. HGB5] # 1 61 1 61 61 72 100.0 1e-11 MKKQKITQQDYLKAHRKASREEEIARHGRPARFRAVHKSKKAYDRKREKAGLRNLPFSFL Y >gi|313157045|gb|AENZ01000076.1| GENE 3 2179 - 2739 815 186 aa, chain + ## HITS:1 COG:all4541 KEGG:ns NR:ns ## COG: all4541 COG0664 # Protein_GI_number: 17232033 # Func_class: T Signal transduction mechanisms # Function: cAMP-binding proteins - catabolite gene activator and regulatory subunit of cAMP-dependent protein kinases # Organism: Nostoc sp. 
PCC 7120 # 45 186 49 188 193 75 34.0 4e-14 MDFNRYISAAEFAPLAELFRREGESLFVPKDGCFSFQQVRNRRSGLVEEGAFRYVHATEA GERHVVGYAFAGEFVGDYISMRNDAPAWVSIEAMCDSRVLTLTGERLEKFYRTDAAHERL GRTVAEHMLAEIYERLLQSYVSTPQERYESLLARCPGLLELVPLRELASFLGVRPETLSR IRRRVR >gi|313157045|gb|AENZ01000076.1| GENE 4 2793 - 3509 191 238 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163797523|ref|ZP_02191474.1| 50S ribosomal protein L9 [alpha proteobacterium BAL199] # 5 231 8 240 259 78 32 3e-13 MKRAIVIGATSGIGRGLAECFAAQGYAVGLVGRRENLLKEIAQADPAAYRFAVADVTRPA ETVAALEHLAQELGGMDLCVVCAGTGDLNPSLDFALEETAILTNVLGWTAAADWAYNRFE RQGGGHLVVVSSVGGLRGGGAAPAYNASKAYQINYAEGLRQRAAKSRLPLYVTDVRPGFV DTAMAKGDGLFWVMPVEKAVRQIVRAVRRRRRVAVVTRRWRIAAWLLRHMPDGIYLKM >gi|313157045|gb|AENZ01000076.1| GENE 5 3509 - 3904 563 131 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157193|gb|EFR56623.1| ## NR: gi|313157193|gb|EFR56623.1| hypothetical protein HMPREF9720_1612 [Alistipes sp. HGB5] # 3 131 1 129 129 212 100.0 6e-54 MCMNGWFIAAGTLLTAAFFVHVVSGNRFYTAACPGRNAAADKAYEAWLMGRCGVQMITAD LALGAVFLLLLGFGAIPRNFQLELFLLLTYGGWFLLWLVSLGYERAAKTRYLRLCHWALF LTVALLVLLGM >gi|313157045|gb|AENZ01000076.1| GENE 6 3989 - 4930 1196 313 aa, chain - ## HITS:1 COG:CAC3576 KEGG:ns NR:ns ## COG: CAC3576 COG2070 # Protein_GI_number: 15896810 # Func_class: R General function prediction only # Function: Dioxygenases related to 2-nitropropane dioxygenase # Organism: Clostridium acetobutylicum # 1 302 2 301 310 221 44.0 1e-57 MENRITTLFGIRYPIIAGGMIWCSGWRLAAAVSDAGGLGLIGAGSMTPDLLREHIRKCRA ATGKPFGVNVPLMYRYAEQIMQVVMEERVPAVFTSAGSPKTWTGQLHAAGCKVAHVVSST KFALKCQEAGVDAVVAEGFEAGGHNGREETATMALVPQVRAAITIPLIAAGGIATGRGML AAFALGAEGIQMGTRFALSEESSASDAFKRLCIGLEEGGTMLALKKIGPTRLIRNDFYTA VETAENRGASAEELKELLGRGRSKRGIFEGDLADGELEIGQIASLIRDLPPAGEIVRQII DEYNAGVKSLTTL >gi|313157045|gb|AENZ01000076.1| GENE 7 5075 - 6115 1776 346 aa, chain + ## HITS:1 COG:no KEGG:CHU_0571 NR:ns ## KEGG: CHU_0571 # Name: not_defined # Def: hypothetical protein # Organism: C.hutchinsonii # Pathway: not_defined # 15 345 14 346 347 318 
47.0 3e-85 MLYTEESRIGALAVHIVGNKAKDEPLMVSPSLSDQLSDAALMQTLLAYFMGGFKSEEYFN LYHDTDLTCNEVWNFTCKIFDDPESFYDHSVSLARHLYETGMHPQIKSGEFYVVYFTDCR FGGETVDAVGLFKSETRDTFLDVEEQASGLHIAAHQGINVNKLDKGCVVFNTEREAGFAV CVVDNTNRAEARYWIDDFLHVRQRQDNYHNTHNAMAMCKKYVTKHLPNEFEVSKADQAEM LNETMKYFKEQDSFSLDDFSEKVIRQPEVMESFSRYKQEYEQDRDIRIEDEFAISDLAVK KQARSYKSVIKLDRNFHIYVHGNRSLLEQGEDEKGKFYKVYYEEEE >gi|313157045|gb|AENZ01000076.1| GENE 8 6188 - 6661 -115 157 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPAFFGVARGGCSLSGWCLWCLRCLRCLPVFPAKFAAISPVSRIILFSSAVCYPPPSAVF PLADRPSLSVLSRKNSFSRSETAPETRRCADFAPFYLFYSIIFCNMDLLSARVVTARTVP AMRNRMFSEESVLWPLGGGNCSLRRESFYRHTDRFGM >gi|313157045|gb|AENZ01000076.1| GENE 9 6901 - 7068 75 55 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157088|gb|EFR56518.1| ## NR: gi|313157088|gb|EFR56518.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 55 1 55 55 105 100.0 7e-22 MERKLTQENVRFDTDLNIARIESWRHSISLTTIADLCDYYKISPADLFRNIATHP >gi|313157045|gb|AENZ01000076.1| GENE 10 7110 - 7688 772 192 aa, chain - ## HITS:1 COG:CAC2751 KEGG:ns NR:ns ## COG: CAC2751 COG0454 # Protein_GI_number: 15896008 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Clostridium acetobutylicum # 5 188 1 165 167 96 32.0 3e-20 MNEDLKYRKAEPADAERIMEIIRQAQAQMRALGSLQWQDGYPARTDIDNDIARGYGYVFE KSDAAENPAAAETPQKYDKGTVPGGVIAYGAAVFDGEPAYGAIDGAWLTGGDYVVLHRLA VADGEKGRGVAAEFMRRTEALARGRGVGSFRVDTNFDNRYMLRMLERLRFAYCGKIVYGS GERLAFEKPFEA >gi|313157045|gb|AENZ01000076.1| GENE 11 7807 - 8397 516 196 aa, chain + ## HITS:1 COG:all2007 KEGG:ns NR:ns ## COG: all2007 COG0242 # Protein_GI_number: 17229499 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Nostoc sp. 
PCC 7120 # 51 193 16 159 179 90 41.0 1e-18 MKRFVITVACLLAACAVQKGFTDRECEVIASQSGAVMRLCRIGDRADSLFLRRKAAPLGA EELRSEYFRLLKQGMLLTVRDPADEGVGIAAPQVGVSRRLIAVQRFDKPGEPFECYVNPE IVGRSAARTAGREGCLSVPETSGTVLRADRIVLRYLDETTMRPVTDTVEGFTAVIFQHEI DHLDGVLFIDRMQRGN >gi|313157045|gb|AENZ01000076.1| GENE 12 8452 - 9804 2124 450 aa, chain - ## HITS:1 COG:lin2192 KEGG:ns NR:ns ## COG: lin2192 COG0534 # Protein_GI_number: 16801257 # Func_class: V Defense mechanisms # Function: Na+-driven multidrug efflux pump # Organism: Listeria innocua # 8 449 8 442 443 204 29.0 2e-52 MQRDSIDFSSSDVGKLFRRFLFPTVMGMVFSAVFVITDGIFVGRGIGSDALASVNIVAPL FTLGTGLGLMFGMGGSVVASINLAQDKRRVAEINITQSLALPALLIALLSALLIHCHEPL LMLLGTPGELMVPAREYLVWFTLFLAPLAVFNILMFIVRLDGAPGFAMACNIMAACINIA LDYLFIFVFEWGLAGAAVATGIGYVVGSGVMLWYMFRRSRTLRFVKLKMSGKSMRLTVRN LWYMSYIGFPALLSELAISCLMIVGNYTFIHYVGKDGVAAYSIACYIFPIIFMVYNGIIQ SAQPIISYNYGAGLMRRGRDAFRMALGTAALCGLAAFGFTWLCSPWIIGLFLTPDAPAYA IAVQGLPWFAAGYPFFGINVVTIGYYQSIERGRLATGLTVLRGIVLMTFCFLVMPHMAGV TGIWLAVPAAEALETLLLVYLLRSRRFAFA >gi|313157045|gb|AENZ01000076.1| GENE 13 9879 - 10442 798 187 aa, chain - ## HITS:1 COG:no KEGG:BF1288 NR:ns ## KEGG: BF1288 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 5 186 7 188 189 121 36.0 2e-26 MEFTEILCGRYGLSVRAAGVLMRHASQSVHARKEIIVEEGQRDQEIRFVTEGSARSYGMR EDKCVILSFAFEGDPTSSTLGAWPGGISGHTIETMEPTTLVRLPREEIDRLFAADAELAD WGRRMAEEQLRRHEEYFADFAWRDKSEQYKRMLREYPQLLQRIALKDLAAYLFVTPQSLS RIRAEIK >gi|313157045|gb|AENZ01000076.1| GENE 14 10664 - 11101 809 145 aa, chain + ## HITS:1 COG:MA0133 KEGG:ns NR:ns ## COG: MA0133 COG0071 # Protein_GI_number: 20089032 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone (small heat shock protein) # Organism: Methanosarcina acetivorans str.C2A # 33 134 49 145 153 64 38.0 5e-11 MTPARTNQNWLPGIFNEILGDNWLAERRSTTAPAVNIIDGENEYKVEVAAPGMTKEDFKV HINEDNELVISLEKKTENKEEDAKRKGTYLRREFSYTQFQQSLLLPDNIERENISAKVEN GVMTIDIPKKKIEETAAATRQIEVK 
>gi|313157045|gb|AENZ01000076.1| GENE 15 11439 - 12173 806 244 aa, chain + ## HITS:1 COG:STM1907 KEGG:ns NR:ns ## COG: STM1907 COG3142 # Protein_GI_number: 16765249 # Func_class: P Inorganic ion transport and metabolism # Function: Uncharacterized protein involved in copper resistance # Organism: Salmonella typhimurium LT2 # 4 226 5 226 248 176 43.0 5e-44 MTTELCAYSIEACETARRAGVARVELCASPYEGGTTPSAAAIRMARRIEGLQLSVMIRPR GGDFLYDDAEFRQMLEEVRFARECGADGVVFGLLTPDGRVDTARTAALVAEAGPMQTTFH RAFDMTCDLSDALEEIVRAGCTRILTSGGRNTASEGIGDLRALVVRAAGRIEIMAGSGVN PSNVRQLAATGVDALHFSARGMRPSGMAYRNPQISMGGCSGVPEYASPCADERTIRQILN ELNR >gi|313157045|gb|AENZ01000076.1| GENE 16 12170 - 13357 1359 395 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0628 NR:ns ## KEGG: Odosp_0628 # Name: not_defined # Def: transglutaminase domain-containing protein # Organism: O.splanchnicus # Pathway: not_defined # 6 395 11 400 400 436 51.0 1e-121 MKRIFLLLLGLSLCACGRYGCGVPAEYEPLLDAALADCPRADSLRQLLRETPRDRRAGMA FLVAYMPQGDRDTMRLDLLRENVEYAYRARAEYPWTRALPDSVFLNDVLPYAVVDEVRDS WRPDFYARFARRAAGAADVRAAIDSINRHIAADVAVEYNTAREKTNQSPSESMRQHMASC TGLSVLLVDALRAAGIPARFAGTPAWHDDRGNHSWVEVWIDGRWHFTEYYFPGRLDYAWF FADAGQASPDDRAHGIFAVSFRPAGDWFPMVWSEDSREVHGVNVTQRYRDLYAAYADDLA ERGRHATVTFMMYDKAAHAGRSDARVAANVDVFCGSEQMGGGRTAGPRQDMNDALRFLLE KGRTYTFRYENSRGEAAQVTAEVGDAPLTVTGYME >gi|313157045|gb|AENZ01000076.1| GENE 17 13533 - 14186 491 217 aa, chain - ## HITS:1 COG:L170990 KEGG:ns NR:ns ## COG: L170990 COG0454 # Protein_GI_number: 15672558 # Func_class: K Transcription; R General function prediction only # Function: Histone acetyltransferase HPA2 and related acetyltransferases # Organism: Lactococcus lactis # 14 186 14 147 147 100 34.0 2e-21 MTIERLKTPSAGKLNALTDLWEASVRATHDFLAPEDIAFFRQMVRQEALRGVELYVIRKP DSKAPETGTSAGQENRHESRTSGKTGESRRPEGLDENGMFGGFAAFAGIDGDMLEMLFVA PKARGKGVGRQLVEYLVRERGIRQVDVNEQNAQAAGFYARLGFRVTGRDATDPSGRPYPI LHLSLGETPQRIVPIPNLQTDRIDVDFEPEPTGDEGK >gi|313157045|gb|AENZ01000076.1| GENE 18 14188 - 15096 675 302 aa, chain - ## HITS:1 COG:no KEGG:BF3212 
NR:ns ## KEGG: BF3212 # Name: not_defined # Def: putative ferredoxin # Organism: B.fragilis # Pathway: not_defined # 1 302 1 278 278 219 40.0 2e-55 MNPTAVYTVVFSPTGTSRKIAAAVAQGVARHGGSCNATANTAAGADAGTNAAAGSAAKAF TTIDLTHAAGKPPVLPADAAAVFAVPVYGGRVAPAALERLQEIRGEGTPAVVLAVYGNRA FGTAVAQLAAFVAERGFVPVAAGAFVGEHSYSTPGTPIAEGRPDTQDLAEAAAFGARIGQ KLAHGETGPIDAAKLREPHTPLLSKLRFIRFVLGYRRRQKRNPVALLPAGDAARCTQCGR CVALCPTQAIARGDELHTDPARCIRCCACVKGCAFGARTFETPFAAVLSRNFARRKPPVT LM >gi|313157045|gb|AENZ01000076.1| GENE 19 15228 - 15914 990 228 aa, chain + ## HITS:1 COG:PAB1763 KEGG:ns NR:ns ## COG: PAB1763 COG0778 # Protein_GI_number: 14521107 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Pyrococcus abyssi # 40 214 2 182 196 91 34.0 1e-18 MTVKIITCAALLLAACGGAQPPKEEAKAPETRTEVKHGAEIALLKPDSKAGMTVNEALEN RRSWREYAPEALSLEELSGVMWAAGGVNRPADGRLTAPSALALYPVRVYAFFAEGVYSYD AKARKLVRVAEGDLRRLAGAQDFVYAAPLNLVYIADMSVYEGKSIPAEHVRYLCGQDAAG YAENVNLYTAGHGLKSITRGSAPEAELLKALGLDPKRYFMALAQTVGK >gi|313157045|gb|AENZ01000076.1| GENE 20 16093 - 17160 1439 355 aa, chain + ## HITS:1 COG:RSc3292 KEGG:ns NR:ns ## COG: RSc3292 COG3274 # Protein_GI_number: 17548009 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Ralstonia solanacearum # 52 354 42 336 336 76 27.0 7e-14 MTKLQSRNYGLDLLRVLACYMVLQVHAGEFYYIADGGLVAAGGEPVWACWLNSLCRTAVP LFVMLSGFFLLPVRERTRDFFARRFVRVVVPFLVWCVLYAIYQFLTGRTDAAGALLGVCR IPVNFGVEIGHLWFVYMLLGLYLFAPVVSPWIETASRRAMEGYLALWAFTLCIPYVHLVF PELLGECFWNDTPMLYYFSGFLGYMVLAAYLRRYHAAPRAWHKWGGALLVVAGYFATAWI FDARLATQRLVADLELSWGFGTINVAVMSVGLFLLLKDVRPGAGRFTGVVTDISRLSYGI YLIHIMLLNFFHGWLDPLIVSAGIKIPVLALCTFVSSYAVVKLISLLPGSKYVVG >gi|313157045|gb|AENZ01000076.1| GENE 21 17161 - 17718 850 185 aa, chain - ## HITS:1 COG:slr2078 KEGG:ns NR:ns ## COG: slr2078 COG0494 # Protein_GI_number: 16330005 # Func_class: L Replication, recombination and repair; R General function prediction only # Function: NTP pyrophosphohydrolases including oxidative damage 
repair enzymes # Organism: Synechocystis # 7 168 10 171 194 112 37.0 4e-25 MENPKSRAWKVVKSEYLARKPWFTVRHERIELPDGRSIPDYYVFEYPDWVNVIAITKEGK FVFIDQYRHGLGETNYEIPAGVAEPDDASMLAAAQRELAEETGYGGGEWRELLVVAPNPA TQSNLTYCYLATGVERLGAQRLDATEDIRVHLFTAAEVRELLTGGGIRQALMAAPLWRYA AEYGL >gi|313157045|gb|AENZ01000076.1| GENE 22 17793 - 19079 1811 428 aa, chain + ## HITS:1 COG:yhaN+M KEGG:ns NR:ns ## COG: yhaN+M COG3681 # Protein_GI_number: 16132252 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Escherichia coli K12 # 9 427 12 434 436 301 42.0 1e-81 MFSPNERAKIIALINREVVPAIGCTEPIAVALCTARAAELLDTQPEKIEVRLSANILKNA MGVGIPGTGMIGLPIAVALGALVGRSEYRLEVLRDVTPEAVGRGARYIDEKRVCISLKED IAEKLYIEVEASAGERRAVAVIAGGHTAFVYLERDGEVLLDKRTASVAEEEAGEVPLTLR RVWDFAMTAPLDELRFILETRRVNKAAAERAFAGEFGHCVGRTLRCERERRVMGDSIFSR ILSYTSAACDARMAGAMIPVMSNSGSGNQGIAATLPVVVYADETAADEEHTIRALVLSHL TVIYIKQSLGRLSALCGCVVAATGSSCGITYLMGGAYEQVAAAVKNMIANLTGMICDGAK PSCSMKLTSGVSTAVLSAMMAMDGHCVTPVEGIIEEDVDKCIRNLTAIGRDGMNETDRLV LGIMTHKC >gi|313157045|gb|AENZ01000076.1| GENE 23 19091 - 20368 1858 425 aa, chain + ## HITS:1 COG:no KEGG:PRU_2549 NR:ns ## KEGG: PRU_2549 # Name: not_defined # Def: hypothetical protein # Organism: P.ruminicola # Pathway: not_defined # 26 422 9 406 416 259 38.0 2e-67 MNRLALKLRFCAAFLLLGLWGADAQTAREQIAAVPERAGGIYHSYEYRPAAAAPVPDGYT PFYISHYGRHGSRWHASESVYAGPLKILRKAAEAGALTPLGRDVLGRVEIIAADADKRYG DLSPRGVAEHRGIAERMYKAYPEVFSTADGRECRIESRSTLVPRCILSMAAFNERLKELN PAIRTTRESSARYMPYMGNNKGLDAQRDRTLKTADSVRAARLIPDRLMKSLFSDPEFVKR EVKKPRKLMEQLLLQAAIMQDVDYLGISLYDLFTGEEIYAAWEDENFRRYVMFGPSKRFG DPIIADAKPLLRNIVETAEEVIGGGKELAASLRFGHDVNVIPLLALLGVEGASARVSTPE EAAEVWQVHRVSPMAANVQFIFFRNPAGDVLVRILHNERDAGLPLGGGPYYRWETFRDYC KSLYE >gi|313157045|gb|AENZ01000076.1| GENE 24 20728 - 21954 1903 408 aa, chain - ## HITS:1 COG:YPO0055 KEGG:ns NR:ns ## COG: YPO0055 COG1519 # Protein_GI_number: 16120408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: 3-deoxy-D-manno-octulosonic-acid transferase # Organism: 
Yersinia pestis # 50 348 51 357 425 122 29.0 1e-27 MWLYNFGLTCYVWIIRLVAPRHPKARLWINGRKDLYKRMAETIDPAARIVWIHVASLGEF EQGRPIIERIRKEHPEYKILLTFFSPSGYEIRKNYQGVDYIFYLPIDTPRNARRFLDAAH PEIAVFVKYEFWLNLLYELRRRKVRTYIVSAIFRRNSIFFRPYGGMWRQALESFDVMFVQ NEESKRLLAGLGFDNVIVAGDTRFDRVAEIAKAAKHIDVIERFKGDKRVFVAGSTWGPDE ELLIRLINDNPDIKFIVAPHEMDESRMARLAAEIKGGTLRYTQCTPHTSYGPKQLLILDT VGILSSVYSYATWSYIGGGFGVGIHNTLEAATFGLPVAFGPNYQKFKEARDLVTLGAARS INDYDELRAWFVPLRDNEEFLQKTSRIAKDYTTRHQGATGIIVRTIFN >gi|313157045|gb|AENZ01000076.1| GENE 25 22092 - 22457 471 121 aa, chain - ## HITS:1 COG:no KEGG:Dfer_3709 NR:ns ## KEGG: Dfer_3709 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 1 116 1 116 119 84 37.0 1e-15 MGTTAETGRMGERAAAEFLRRAGYEICALNWRSGRYELDIVARKGDFVHFVEVKTRCADG LTPPEAAITPQKFRALTRAALRYMACTGEEREAQFDLIAVEVMPGGETEVRLIPQAMEYN W >gi|313157045|gb|AENZ01000076.1| GENE 26 22463 - 23305 1225 280 aa, chain - ## HITS:1 COG:HP1380 KEGG:ns NR:ns ## COG: HP1380 COG0287 # Protein_GI_number: 15645990 # Func_class: E Amino acid transport and metabolism # Function: Prephenate dehydrogenase # Organism: Helicobacter pylori 26695 # 11 279 1 265 265 162 34.0 5e-40 MKILIAGLGLIGGSFALALRDRGIADEILGVEKSDENAAEALRLGLADRIVTLEEGVPQA DLVVLATPVDTIPLMAIKALNHVTDRQVVMDMGSIKAELCEVISMHARRGRFVAAHPMWG TEYSGPRAAQHGAFTGRNAVLCEAERSDADALATVERIFRTLDVPVVYMGAEEHDLHAAY VSHISHVTSFALALTVLEKEREERHIFDLAGGGFESTVRLAKSAAATWVPILLRNKYNVL DVLREHIHQLQIMRRMIERDDAEGLTGAFEKANSIQRIIH >gi|313157045|gb|AENZ01000076.1| GENE 27 23359 - 24060 1013 233 aa, chain - ## HITS:1 COG:ECs4939 KEGG:ns NR:ns ## COG: ECs4939 COG3340 # Protein_GI_number: 15834193 # Func_class: E Amino acid transport and metabolism # Function: Peptidase E # Organism: Escherichia coli O157:H7 # 1 226 1 226 229 208 46.0 6e-54 MKLLLISNSTNAGEEYLRYPLPEIGRFLQGVREIVFVPYAAVTFSYAEYEKKVQARFSEL GIRVRSVHRAKDPARMIREAEAICVGGGNTFALAKKMQEQGLMRAILRKIKAGTPYVGWS AGSNVACPTICTTNDMPIVEPESFKAVGAVKFQINPHYLDANPEGHAGETREQRILEYIE 
ANPRRWVAGLREGCMLRYEEGKLELIGKRPMRMFRKGTETFEIEPGGDLSFLL >gi|313157045|gb|AENZ01000076.1| GENE 28 24239 - 24475 517 78 aa, chain + ## HITS:1 COG:BS_acpA KEGG:ns NR:ns ## COG: BS_acpA COG0236 # Protein_GI_number: 16078655 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: Acyl carrier protein # Organism: Bacillus subtilis # 1 77 1 77 77 72 61.0 1e-13 MSEIQSKVVEIIVDKLGVKESEVVPEASFTNHLGADSLDTVELIMEFEKAFDIQIPDEDA EKIATVGDAINYIEEHKK >gi|313157045|gb|AENZ01000076.1| GENE 29 24517 - 25764 1415 415 aa, chain + ## HITS:1 COG:sll1069 KEGG:ns NR:ns ## COG: sll1069 COG0304 # Protein_GI_number: 16329903 # Func_class: I Lipid transport and metabolism; Q Secondary metabolites biosynthesis, transport and catabolism # Function: 3-oxoacyl-(acyl-carrier-protein) synthase # Organism: Synechocystis # 1 414 4 414 416 409 49.0 1e-114 MTERRVVVTGIGTINPLGNNIEEYFSNLEKGVSGAAPITHFDAEKFKTKFACEVKNYDSS LYFDRKEVRKYDLFTQYALIAATQAVEDSGLDLETVDKEQVGVIWSSGIGGIKSFFDECL GWAAGDGTPRFSPFFIPRMISDIAAGFISMKYGFMGPNYCTVSACASSNHGMMDAMNAIR YGKADVMVTGGSEAAVNEPSVGGFNSMQALSTRNDDPTHASRPFDKDRDGFVIGEGAGAL IFEEYEHAKARGARIYCEVVGSGCSADAYHFTAPHPEGKGAMKSMRDAIKDAGIKPEDID YVNVHGTSTPAGDIPELKAVAGVLGDHVYNINISSTKSMTGHLLGAAGAAEALACIFAIT HGVVPPTINCENLDPEIDPNLNLTLNKAQKRDVKCALSNTFGFGGHNSSIIFRKL >gi|313157045|gb|AENZ01000076.1| GENE 30 25860 - 26558 865 232 aa, chain + ## HITS:1 COG:SA1076 KEGG:ns NR:ns ## COG: SA1076 COG0571 # Protein_GI_number: 15926816 # Func_class: K Transcription # Function: dsRNA-specific ribonuclease # Organism: Staphylococcus aureus N315 # 2 210 24 234 243 126 38.0 4e-29 MFGFIPNNIELYKLALIHKSASLVLEDGRAINNERLEFLGDAVIEAVTSDYLFIEYPDRD EGFLTQLRSKIVSRQSLNVLAVNIGLDRHVISNGSTSVTQKHIYGDAFEAMIGAVYLDQG YEFVNRLLINRIYFRHLSLDELTESETDFKSRLIEWCQKNRHKIAFRTGYDKEYSANHPV FYSTVLVDGMEVGHGSGDSKKEAEQHAAFSVAQYMSDEQCATLLDRVDRLTQ >gi|313157045|gb|AENZ01000076.1| GENE 31 26673 - 27287 708 204 aa, chain + ## HITS:1 COG:lin1584 KEGG:ns NR:ns ## COG: lin1584 COG2003 # 
Protein_GI_number: 16800652 # Func_class: L Replication, recombination and repair # Function: DNA repair proteins # Organism: Listeria innocua # 5 201 20 222 224 111 30.0 1e-24 MAARGVDALTDRELLSLLTEDEQSAEALLAAYGGSLARVGSEEAARLRMIGGLGLRRARM LLAAAELGRRITAARAAETDVVSSSDDVVRLFRPQLETLSHEECWAVYLTSSNRIIERQR VSQGGVQGTVVDHRLIIKRALELLATQLILIHNHPSGAAEASPQDKVLTERIAQAAALFD IRLLDHIIIAREGDFSFLREGLIS >gi|313157045|gb|AENZ01000076.1| GENE 32 27509 - 27661 58 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MEIKPFCCDFISIFVTAACEGGRHTRHKKRIGFFLAGENWTLSRKVRNKG >gi|313157045|gb|AENZ01000076.1| GENE 33 27710 - 29254 2571 514 aa, chain + ## HITS:1 COG:SA1394 KEGG:ns NR:ns ## COG: SA1394 COG0423 # Protein_GI_number: 15927145 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Glycyl-tRNA synthetase (class II) # Organism: Staphylococcus aureus N315 # 10 505 8 463 463 473 49.0 1e-133 MTQEELFKKLIAHCKEYGFVFPSSEIYDGLAAVYDYGQNGVELKNNIKRYWWDSMVKLHE NIVGIDAAIFMHPRTWEASGHVGAFNDPLIDNKDSKKRYRADVLIEDWLAKQDEKIEKEV EKARKRFGETFDEAKYRETSPRVAETAARRDEMHKRFVEAQNSGDLGELRQIIVDCEIVC PISGTRNWTEVRQFNLMFSTQMGSTAEGANTIYLRPETAQGIFVNFLNVQKTGRMKLPFG IAQIGKAFRNEIVARQFIFRMREFEQMEMQFFVRPGEEMKWWQAWKETRMAWHRALGFGD DRYRFHDHEKLAHYANAATDIEYNFPFGFKEVEGIHSRTDFDLGNHQKFSGRKIQYFDPE TGESYVPYVVETSIGVDRMFLQVMSAAYTEEQLEGGDSRVVLRLPAALAPVKVAVLPLVK KDGMPEVAQKIVDLLKYDYNVVYDEKDSVGKRYRRQDAVGTPFCVTVDGQTLEDNTVTVR HRDTMQQERVAIDKLPALVEEECSYRKLFRKLNL >gi|313157045|gb|AENZ01000076.1| GENE 34 29265 - 29678 495 137 aa, chain + ## HITS:1 COG:BH1269 KEGG:ns NR:ns ## COG: BH1269 COG0816 # Protein_GI_number: 15613832 # Func_class: L Replication, recombination and repair # Function: Predicted endonuclease involved in recombination (possible Holliday junction resolvase in Mycoplasmas and B. 
subtilis) # Organism: Bacillus halodurans # 3 134 2 135 140 83 39.0 1e-16 MGRILAIDYGTKRTGIAVSDPLRLIAGGLETVETKGLEAWLAKYFAAEDVTTIVVGKPTQ MDGTPSETWRFVEPLAARLRRAYPDKEVVFYDERFTSVLAHRTMLESGIGRMARRNKALV DKISATIILQSYMEFNK >gi|313157045|gb|AENZ01000076.1| GENE 35 29675 - 30220 782 181 aa, chain + ## HITS:1 COG:aq_579 KEGG:ns NR:ns ## COG: aq_579 COG0242 # Protein_GI_number: 15606030 # Func_class: J Translation, ribosomal structure and biogenesis # Function: N-formylmethionyl-tRNA deformylase # Organism: Aquifex aeolicus # 1 173 1 168 169 122 42.0 5e-28 MIYPIVIYGNEVLRKQCEEIAPDYPEVKKLVEDMFQTLGEAEGVGLAAPQIGKAIRLFIV DCTPWGEDDPECADYKRAFINPEIYAFSEEKKTYNEGCLSFPGIHADVPRSLAIRMRYLD ENFVEHDEEFHGLKAWVIQHEYDHIEGVVFTDRISPLRRNLLKGKLLNLAKGKYRAAYKT R >gi|313157045|gb|AENZ01000076.1| GENE 36 30304 - 30888 789 194 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157247|gb|EFR56677.1| ## NR: gi|313157247|gb|EFR56677.1| hypothetical protein HMPREF9720_1642 [Alistipes sp. HGB5] # 1 194 1 194 194 332 100.0 1e-89 MKKILLISVAALFAWAASAQDNTYRNTQTNNYRNAAQTNNYRNAAQTYNNNAAQAGNYRG TPSDYRSAVGPRVNFYTNTDDASVGIGAYYRYSFNSHWRIEPSIYVLTEKDSSVDINFDA HYVFQIADWWGVFPQVGIVANDIKDWAVGMSVGAGFDFNVAHRWNISAGLKYEPMFDSDR SNPLVVYVGAAYRF >gi|313157045|gb|AENZ01000076.1| GENE 37 30934 - 31443 0 169 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MSGRERPAPHCRPPRKPTRNEVQRATTSDGDCCDGGEDSNGEDSNRDSEDSNGDGEDSNG DGNGDGSGNGSGNDELQRRRAVHFSTPGQKPPQAKQNARPVRRKIQQIGRIPRPENPSGE LRSPPARDGFSPPVAHKFRRQPIAARVLHAVKAMEQRRFGKLSESGDMN >gi|313157045|gb|AENZ01000076.1| GENE 38 31503 - 31970 756 155 aa, chain - ## HITS:1 COG:no KEGG:HMPREF0659_A6148 NR:ns ## KEGG: HMPREF0659_A6148 # Name: not_defined # Def: hypothetical protein # Organism: P.melaninogenica # Pathway: not_defined # 1 155 1 162 162 64 29.0 2e-09 MKKMFFAAIVALFAVTTVSAQTPGNWAVGPKIGVYTNTGADGAIFGIGASGRYSFTDNWR MESGIMAICKKFCSVDISADVQYLFNIAPDWHIYPQAGLSANDIGGWSCGINLGGGADFS VARNWDLSAGFKWMIQTAEYHKNPILINIGATYKF >gi|313157045|gb|AENZ01000076.1| GENE 39 32160 - 34889 4325 909 aa, chain - ## HITS:1 
COG:BMEI1436 KEGG:ns NR:ns ## COG: BMEI1436 COG0574 # Protein_GI_number: 17987719 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphoenolpyruvate synthase/pyruvate phosphate dikinase # Organism: Brucella melitensis # 5 896 46 917 930 1039 60.0 0 MADVKRVYTFGNKEAEGNGKMRELLGGKGANLAEMNLIGIPVPPGFTITTEVCAEYYAHG KEAVIKMLRPEVERAMQNIEKLTGMKFGDKEMPLLVSVRSGARASMPGMMDTILNLGMND QAVEAVAKRTGNPRFAWDSYRRFVQMYGDVVLGMKPESKEDHDPFEVIIEEQKEKRGVKN DTDLTTDDLKELVQNFKAAVKKQTGEDFPACPWDQLWGAVCAVFGSWMNERAILYRKLNN IPAEWGTAVTVQAMVFGNMGANSATGVAFSRDAATGENLFNGEYLINAQGEDVVAGIRTP QQITIEGSKRWAVAQNVSEEERRTKYPSLEEVMPTVYKELDEIQRHLEQYFKDMQDIEFT IQDGKLWMLQCRNGKRTGAAMVKIAMDMLREGLIDERTAVLRCEPAKLDELLHPVFDKKA IANAQVITKGLPASPGAATGPVVFFAEDAEKILAESGQKAILVRIETSPEDLKGMLDAAG ILTARGGMTSHAAVVARGMGKCCVSGAGELEIDYKTRTIRVNGFTVKEGDWISLNGSTGE VYLGQVATMAADLSGDFGQLMELAGKYAVLKVRANADTPKDALQAFAFGAEGIGLCRTEH MFFEGDRIKAIREMILADDEAGRRVALAKLLPIQRGDFEGLFKAMNGFPVTVRLLDPPLH EFVPHDEKGQQEMAKEMNVPLQKIVAKVESLAEFNPMLGHRGCRLGNTYPEITEMQARAI IEAAMNVKAQGIPVHVEIMVPLVGNHKELRYQKGIIDATAEQVFSERNDKIDYMVGTMIE VPRAAVTANQIAEVAEFFSFGTNDLTQMTLGFSRDDIGKFLPVYLEKGILKNDPFQILDR NGVGQLIREAVFKGRSTREKLKCGICGEHGGEPSSVEFCHYAGLNYVSCSPFRVPIARLA AAHAALNEK >gi|313157045|gb|AENZ01000076.1| GENE 40 35062 - 36012 1086 316 aa, chain - ## HITS:1 COG:no KEGG:FB2170_05240 NR:ns ## KEGG: FB2170_05240 # Name: not_defined # Def: hypothetical protein # Organism: F.bacterium_HTCC2170 # Pathway: not_defined # 54 316 17 321 324 82 25.0 2e-14 MFPHRTASATKHRPRTGPKYRSGKRPLLPFRTLLLAALLLSGIRCALAQPRIGIAYCDLD HLYDTIPALFYDDSDYTPGGRLAWDTERYRRKIARTAAVIDSMRMPLVALWSVENEAVVR DIAAACRGDYSYLHCTLNSLDGMDFALLYYGDLFDPHYEEPGRRYLYIEGTLRFPAPRTR RTTGRPVRPSRTDTVGLVLCSDTRMAEWVVRDLREERPGVKLIVLGRTASLDAAAYGLRD ALERPARLGRGNVRRRGGWLMRDRILADTALICSGGDVFARRYLLDPKTGNPLPTYERRR YRGGFGYALPVFVYLE >gi|313157045|gb|AENZ01000076.1| GENE 41 36023 - 36250 63 75 aa, chain - ## HITS:1 COG:no KEGG:TEQUI_0595 NR:ns ## KEGG: TEQUI_0595 # Name: not_defined # Def: hypothetical protein # Organism: T.equigenitalis 
# Pathway: not_defined # 1 75 1 75 75 91 56.0 1e-17 MKTAIDLYVIDAIRRERRAQKVSQAMLSFGIGVSRGFVGQVESPKYNTKYNIIHINEIAK FLGCSPRKFLPEEPI >gi|313157045|gb|AENZ01000076.1| GENE 42 36618 - 37934 1737 438 aa, chain - ## HITS:1 COG:L52034 KEGG:ns NR:ns ## COG: L52034 COG0482 # Protein_GI_number: 15672819 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Predicted tRNA(5-methylaminomethyl-2-thiouridylate) methyltransferase, contains the PP-loop ATPase domain # Organism: Lactococcus lactis # 4 223 19 232 384 233 52.0 5e-61 MAHVVIGLSGGVDSSVAAYLLKRQGHDVTGLFMINWHDTTGTLEGDCPWHDDRVFAELVA KRLDIPLRVVDLSADYRTRVVDYMFAEYEKGRTPNPDVLCNREIKFDVFLREALKLGADF VATGHYCRKAEEVLPDGRTVFKLLAGADPNKDQSYFLCQLSQEQLRYALFPIGDLLKPEV RRIAAKQGLATAKRKDSQGICFVGKVDLPAFLQQKLASKKGNVHEILPSWPKYDRALAAA PAKAENAAGPGTSDARTNSTLSTENRAIPAAADAEVARTDGEPSTERLTVLAAPWRYTVR DGKKIGEHNGAHFYTIGQRKGLGIGGRRESLFILATDTAQNVIYVGEGDAHPGLWRPALH IAPGEIHWVNPARALSPGQSARFSVRIRYRQPLQDATLFMRDRGAYILFDQPQRGITPGQ FAAWYDGDELVGSGVISE >gi|313157045|gb|AENZ01000076.1| GENE 43 37950 - 38615 828 221 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157050|gb|EFR56480.1| ## NR: gi|313157050|gb|EFR56480.1| conserved hypothetical protein [Alistipes sp. 
HGB5] # 1 221 1 221 221 372 100.0 1e-101 MSVSIPAGRINKLFGTDGGVMLSLYADFPADFDTDTPLLVTIDALEVPLWCERFERRGAS GAVAAFADFDTERRAQELLGLEFRIRFDEENDDEFYMEDLIGFAVTGFEMRHGGTENGGN GGNDDGNNANNDTCNDDGSRGGSNDNANANSAGSDAGDSMPPAGQFAGRVADYYDSEANP LFELEIGGRRVLVPAAEEFIAHIDFEGRTMKMVLPEGLIDL >gi|313157045|gb|AENZ01000076.1| GENE 44 38827 - 39381 570 184 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227536351|ref|ZP_03966400.1| 30S ribosomal protein S16 [Sphingobacterium spiritivorum ATCC 33300] # 1 177 1 178 179 224 64 3e-57 MAVKIRLARHGKKGYAFYHIVAADSRAPRDGRFIEKLGTYNPNTNPATIDLDFEKALGWL QKGAQPTDTCRAILSYKGVLYKKHLLGGIAKGAFSETEAEARFNKWMTEKAGKIEAKGNK LEADAKSAEKARLAAEAKVKEERAKAIAEKKAAAEAAAKEAEQAAAAEGAEAPAEGEAEA PAAE >gi|313157045|gb|AENZ01000076.1| GENE 45 39909 - 40292 622 127 aa, chain - ## HITS:1 COG:alr3120_6 KEGG:ns NR:ns ## COG: alr3120_6 COG0784 # Protein_GI_number: 17230612 # Func_class: T Signal transduction mechanisms # Function: FOG: CheY-like receiver # Organism: Nostoc sp. PCC 7120 # 11 125 1 118 123 90 41.0 7e-19 MTDNNPDVRKPLILIAEDVESNYKLLEIILKKEYDLLWAKNGREAVDFAMEHKPDAILMD IKMPVMDGIEALKEIRRHTAELPVIMQTAYAFDTDRRTAEEAGCNGFITKPVMPRELKMY LDKYLNK >gi|313157045|gb|AENZ01000076.1| GENE 46 40349 - 42076 1703 575 aa, chain - ## HITS:1 COG:TM1660 KEGG:ns NR:ns ## COG: TM1660 COG0739 # Protein_GI_number: 15644408 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 39 171 30 168 323 73 32.0 9e-13 MRRILTLLLFTTICSGAQGQRLNPGDYIYPIRGVAGLYSANFGEMRPGHFHGGIDIKTEG GEGKPLVAVADGYVSRVSVSPSGYGRAIYLTLGNGTTAVYGHLQRFRGDIEKQVREERCR RRANSVDLWFGPGAWPVAQGDTIGFSGNSGSSMGPHLHFELRDTPTQRLYNVVREGVVRP DDDLPPRIMRIHYIEVDSVQGVPVHGSPESYAVVREAEGRYRLTREEPVGTGRKGYFVLE ASDRRNGVHNTFGLWRASMSVDGDPRFEYRMDGFTHDLSRCCDAVSHYPMQLTSRNEVIR LAQLAESPDCFYPVMRERGLVRTAEGEKRRIRIEAEDDCGNRSQLEFDILGRTETLRAEA DSTAVVLRPEKVALVRAGRDLTARIPAGAVYEPIFCRPEQRQAPASDSTAAVLSPAYRIL DAETPLRHPMTVSVHAFVPDDMQSRTVLATRTAKGRIACIGGRYADGAVTATTRTTGDIF 
VAADIIPPRIRPLFSEGADLGGARSIRFRVSDNFSGIASCTLLIDGRWVPCDRFPMQGTL VHAFDLPAAKKRRSVQLTVTDRCGNTARWEGTFWR >gi|313157045|gb|AENZ01000076.1| GENE 47 42061 - 42579 777 172 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3302 NR:ns ## KEGG: Odosp_3302 # Name: not_defined # Def: regulatory protein RecX # Organism: O.splanchnicus # Pathway: not_defined # 17 155 4 142 160 88 35.0 1e-16 MQPPEVKKTGKRAKTPEQALAALMRLCARAEKSQDDARRLMRGWGLAEREGEQVLARLVR DRFIDDGRYAEAFVRDKLRLSGWGEYKIRTALQRKRIDRELIDAALAQADRQDMAGRLRQ QLERKMRTTRHTTQYELKTKLIRYGLSLGYDYETVLDSAAALVTDTETCDEF >gi|313157045|gb|AENZ01000076.1| GENE 48 42561 - 43403 327 280 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|225874212|ref|YP_002755671.1| ribosomal protein L11 methyltransferase [Acidobacterium capsulatum ATCC 51196] # 1 278 16 289 294 130 36 4e-29 MTTRRDILGRISARLAPLYGAREARSIALVAVSELSGLSASALLTDPGAPLEIAELDDIL GQLAAGRPVQYVVGRTEFCGRTFAVHEGVLIPRPETEELAAWIAQAKTEAATLLDVGTGS GCIAATLALALPGAQVYAADISDTALETAARNCRALGAGVILRKADALSDLAEVFPGPFD VIVSNPPYVPQSDLPAMHVNVREYEPHEALLVPDDDPLRFYRAIARAGRRTLRPGGRLYF EIYERFAEAMRRMLGEEGYTDTEVREDLNGKPRMTCSRLK >gi|313157045|gb|AENZ01000076.1| GENE 49 43403 - 44377 1479 324 aa, chain - ## HITS:1 COG:MA0550 KEGG:ns NR:ns ## COG: MA0550 COG0082 # Protein_GI_number: 20089439 # Func_class: E Amino acid transport and metabolism # Function: Chorismate synthase # Organism: Methanosarcina acetivorans str.C2A # 2 321 4 350 365 261 44.0 1e-69 MNIFGHNFRLAIWGESHGQQIGISIDGIPAGVPLSAEDFETDLARRRSGARGTTPRREPD IPQIVSGLYNGMTTGAPLTIEFANRDTHSQDYANVMRHYRPSHADMVAYHKFNGFNDPRG GGHFSARLTVALTAAGVVAKKILPPGVTFDTRIAEIGGCTDPEGFDEVLRAAAAEQDSVG GIIECRVQGVPLGLGQPFFDSAESMIAHLLFSVPAVKGVEFGSGFAGARMRGSENNDPFL DVEGTTATNNAGGINGGLTNGNELVVRAAVKPTPSIGREQMTYNLATNKVEPLMIRGRHD VCVALRGAVVVEAAVAIALANFIR >gi|313157045|gb|AENZ01000076.1| GENE 50 44382 - 45626 1527 414 aa, chain - ## HITS:1 COG:CAC0895 KEGG:ns NR:ns ## COG: CAC0895 COG0128 # Protein_GI_number: 15894182 # Func_class: E Amino acid transport and metabolism # Function: 5-enolpyruvylshikimate-3-phosphate 
synthase # Organism: Clostridium acetobutylicum # 10 412 11 417 428 242 33.0 1e-63 MDKTVPSGRVKGTLTPPCSKSYAQRALAASLLCEETSVLRNIEFCSDTRSALQCIETLGA RVTRIGEETLSIEGGLRPQGHTLRVGESGLSTRLFTPIASLCGTPITIEGEGTLLRRPME MMIGPLRTLGVRVRDNGGFLPFEVRGPIRGGEVDVDGSVSSQFITGLMLALPLARHDTTI HVRSAVSTPYLDMTIDTAARFGVEICHNDYEEFYIEGGQRYSPACLSIEGDWSAAAMLLV AGAVAGEVTVRNISMLSKQADTAVCTALVRAGAAVINEADSVTTISRPLHGFEFDATNCP DLFPALAALAAAADGVSTIRGTSRLLHKESNRAEAIREEYAKVGIEVDISEEDVMRIRGG KIRPARVSSHDDHRMAMSMAVSALRCDGQITIENAECVAKSYPGFFEDLEKIRV >gi|313157045|gb|AENZ01000076.1| GENE 51 45699 - 46487 265 262 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 10 231 9 230 245 106 25 7e-22 MRLYTSELVKKYKARTVVDHVSIDVNQGEIVGLLGPNGAGKTTTFYMIVGLIKPNEGRIF LENDEGQVAELTNEPVYKRAQMGVGYLAQEASVFRRLSVEDNLRAVLEMTEYSKEYQAER VESLIEEFRLQKVRKSLGIQLSGGERRRTEIARAVAINPAFILLDEPFAGVDPIAVEDIQ SIVATLKHKNIGVIITDHNVDETLAITDRTYLLYEGKILKTGTAEELAADPMVRKVYLGQ HFELKKSHQLTQEMKNRKTPQE >gi|313157045|gb|AENZ01000076.1| GENE 52 46788 - 48011 1821 407 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3332 NR:ns ## KEGG: Odosp_3332 # Name: not_defined # Def: phosphate-selective porin O and P # Organism: O.splanchnicus # Pathway: not_defined # 41 398 10 362 373 252 43.0 2e-65 MKKRLLSALLFTAALSLPAAAQTLPAQPAPEGSGTRPAPRTSAAEIDDANDAASPEEVSK KITRFEKILAALPKFSGYAQIGYTYQTNVGGTDGKNSSSFNVKRMRLILTGDISRTFDYK MQFEGFSSSADQQGKALITVQDMYLRAKIRPQIHIWAGQFPVPLTIENYDISPGTLEVPN FSHAILKMVCRNAVSGYNTYGRDCGLQATGAFLHRDGWDMLTYNLALFNGSQMNRSDDNK SKDIVARLTFFPLRELRISGSVNWGEYTNNTVSKDYIPMTRYAAGFWYDSEHILVRAEYA HTGSSAAYTDKTTGEKTRGKVDEQMYYLIAGYKFKGKYMPVIRYDVFNGKRNTYLPGSVG KQQDFLVGFLYMPIPRLKAQAAWTLSKYSAAGAKNGNGFEVALIGFF >gi|313157045|gb|AENZ01000076.1| GENE 53 48281 - 50473 3923 730 aa, chain + ## HITS:1 COG:alr0205 KEGG:ns NR:ns ## COG: alr0205 COG0514 # Protein_GI_number: 17227701 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Nostoc sp. 
PCC 7120 # 9 724 7 712 718 502 39.0 1e-141 MKQNESSLLHDKLKEYFGFSSFKGNQEAVIRNVLEGNDTFVLMPTGGGKSLCYQLPALIM DGVAIVISPLIALMKNQVDAMRTFSADSGIAHFLNSSLNKTAVAQVRADVLSGKTKLLYF APESLTKEDNVAFLHKIKVSFYAIDEAHCISEWGHDFRPEYRRIRPIINEIGPAPLIALT ATATPKVQLDIQKNLGMSDASVFKSSFNRPNLYYEIRPKHNVDRDIIRFIKQNEGKSGII YCLSRKKVEELTELLVANGIKALAYHAGMDAATRAANQDHFLMERADVIVATIAFGMGID KPDVRYVIHYDIPKSLEGYYQETGRAGRDGGEGYCLTFYSYKDIQKLEKFMQGKPIAEQE IGKLLLQETVSYAESSMCRRKTLLHYFGEEYTEENCGNCDNCRNPRPKIDAKAALKMLLE ALRDIGDKFKGDYLVNVLTGKTTALIKSYGHNKSKWFGAGAEHDASFWGAVLRQALILGL VDKNIENYGLISVNRKGENFIAMPFPVTVTLDHNYDEEEREAEAVAPMGKGGAADEELFS MLKDLRKKVAKQHGLPPFVIFQDPSLEDMAVQYPITLEEMQNITGVGVGKARKFGEEFIK LIKAYVEEKEIIRPQDMIVKSVGNKSGNKIFIIQSIDRKMDFEDIARAKDLDFDELLTEI EGIVNSGTKLDISYYLREFMDEDKIEDIYLYFKEDAESDSLDAAIDELGADYTEEEIRLV RIKFMCEQGN >gi|313157045|gb|AENZ01000076.1| GENE 54 50550 - 50768 382 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157235|gb|EFR56665.1| ## NR: gi|313157235|gb|EFR56665.1| hypothetical protein HMPREF9720_1663 [Alistipes sp. HGB5] # 1 72 1 72 72 137 100.0 3e-31 MNKKTFTVAVIFLIVTLVWLGVNIFWALSNKNCIFDLIVAVLFALAGAGMVYNECHKRKC RHLDEVNRHQIP >gi|313157045|gb|AENZ01000076.1| GENE 55 50988 - 51230 388 80 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157172|gb|EFR56602.1| ## NR: gi|313157172|gb|EFR56602.1| hypothetical protein HMPREF9720_1664 [Alistipes sp. 
HGB5] # 1 80 1 80 80 113 100.0 4e-24 MRKTNVMQILRLVMMIAAVAGVIYALEYMERGIAVWFWITLLGVGFVGLLYTFRLTQRRL RLERRAKERQEKKKSRKQKR >gi|313157045|gb|AENZ01000076.1| GENE 56 51673 - 51876 169 67 aa, chain + ## HITS:1 COG:no KEGG:BT_0780 NR:ns ## KEGG: BT_0780 # Name: not_defined # Def: acetyltransferase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 67 100 165 168 67 51.0 2e-10 MTAAVARAAEWVFANTPIIRIFAAAYVSNPASQRVLDKAGFRFVGTMHRAFVGNGAFVDG CYYELLK >gi|313157045|gb|AENZ01000076.1| GENE 57 51878 - 53029 1028 383 aa, chain + ## HITS:1 COG:XF2148 KEGG:ns NR:ns ## COG: XF2148 COG0030 # Protein_GI_number: 15838739 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Dimethyladenosine transferase (rRNA methylation) # Organism: Xylella fastidiosa 9a5c # 160 376 65 286 290 134 40.0 3e-31 MSEVRAKKALGQHFLVDLNIARKICDSLSGGEIRLRTAPAVAALPGADGQAAAGRGPEEP ATAVIIAAKRGTGNAAEMGAAAATGDAAVAVDGRTAEIAAGPDVAAGAGPDVVQSTEPGV AAGAGRDVAQETGPEGVSGIEPDVTPDAGQGAKAAGRCDVLEVGCGMGVLTQFLLRRDDI VTYGAEIDPESVEYLHAHYPEFTPRLMEGDFLKMNLRELFPGGLKIIGNFPYNISSQIFF KVLENRDLVPECVGMIQKEVAVRLAEPPGSKEYGILSVLLQAWYDIEYLFTVNETVFNPP PKVKSAVIRLRRNGVERLACDETLFVKVVKASFGQRRKMIRNSLRSVFGNFGGAEHPFFT QRAEQLSVADFVELTDWVAANRT >gi|313157045|gb|AENZ01000076.1| GENE 58 53252 - 54556 1947 434 aa, chain - ## HITS:1 COG:all1939 KEGG:ns NR:ns ## COG: all1939 COG0612 # Protein_GI_number: 17229431 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Nostoc sp. 
PCC 7120 # 12 418 68 481 512 120 24.0 4e-27 MTQPPLVIPATVEVPQAEKTTLANGATLYTLASDDFEVLRITFVFRAGSAVQRVPFSASA AANMLAEGTRDMTAQQIAEQLDYYGSYFDVNIDRDYAYISFCTLSKFFGQTLAVAEQVLL HPTFPEEELRTYCAKRKQRLAIERTKVDVEAREAFARTMFGPEHPYGISADENDYDRLTR ADVAEFYARHYTAANGFVVCSGRIGEQEREAVAALAERLPRSESETGTPFPAPVTRHEAF VEHPGAVQSSIRIGRMLFPRQHPDFLGMQVVASVLGGYFGSRLMQNLREERGYTYGVVAA MVNFEQAGYFAVATQVGTDVTRDALREIYAEIERLRTEPMPDEELSLVKNIMIGEMMRIL DGPFGIADVTIENILCGRDHTVIGENIRRIQAMTPADVQRLAQKYLAREDLVTVIAGDPI PEERQAPEEKADRQ >gi|313157045|gb|AENZ01000076.1| GENE 59 54553 - 55791 1819 412 aa, chain - ## HITS:1 COG:CC3584 KEGG:ns NR:ns ## COG: CC3584 COG0612 # Protein_GI_number: 16127814 # Func_class: R General function prediction only # Function: Predicted Zn-dependent peptidases # Organism: Caulobacter vibrioides # 2 412 43 458 948 202 31.0 7e-52 MIEYKKKILANGLTVVVNRDPVSKLAAVNILYKAGARNENPERTGFAHLFEHLMFRGTRR IPNFDLPVQMACGDNNAFTNNDYTDFYITLPKDNVETALWLESDRMEGLDITPEKLETEK RVVIEEFRQRYLNQPYGDQPMLLRALAYKVHPYRWATIGLTPEHIAQATPDEVQAFYRAH YRPSNAILSISADIDEEKMLDLAEKWFEPLANRPAAPDHIRQEPVQTAPRREVAERDVPA TTVSLAFHMGGRTSPDFYIADLVSDLLAGGDSARLYTHLVKEQRLLSSVNAYISGDVDPG LFVFTGQLLPETTPEQAEAAFRAEIEALRTRPATDYEVEKVKNKFEANTLFGELNVMNKA MNLGFYEMLGDLPLVNREVAAYRAVTTDDIIDFSRRTFRPENCSTLIYKASK >gi|313157045|gb|AENZ01000076.1| GENE 60 56191 - 56361 97 56 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MQAVLRYFGTEWNVRLFWSRETLPGGGAAERRKVCRRGNALWVNRFCNSVIYAGLC >gi|313157045|gb|AENZ01000076.1| GENE 61 56311 - 56505 59 64 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MCVFSEFRTECVAKIANPAEITVRFQNKKSGIFTIRGVSGITLKVVNFLTQTCVNHGVAK PVYP >gi|313157045|gb|AENZ01000076.1| GENE 62 56528 - 60061 5037 1177 aa, chain + ## HITS:1 COG:FN1170_1 KEGG:ns NR:ns ## COG: FN1170_1 COG0674 # Protein_GI_number: 19704505 # Func_class: C Energy production and conversion # Function: Pyruvate:ferredoxin oxidoreductase and related 2-oxoacid:ferredoxin oxidoreductases, alpha subunit # Organism: Fusobacterium nucleatum # 4 413 3 410 410 600 71.0 1e-171 
MAEKKFITCDGNYAAAHVAYMFSEVAAIYPITPSSTMAELVDEWAAQGRKNIFGETVKVV EMQSEAGAAGAVHGSLQSGALTSTFTASQGLLLMIPNMYKISGELLPGVFHVSARALAAQ SLSIFGDHQDVMATRQTGFAMIATSSVQEVMDLAGVAHIVALKSRVPFLHFFDGFRTSHE IQKIELIDEAALTAMLDRDALKAFRQRALNPEHPVTRGTAQNPDIYFQTREAANTFYDAV PDMVADAMREISKITGREYKPFVYYGAADAENIVIAMGSVTETLKETVDYLNAKGGKVGV VTVHLYRPFSVKYMMEVLPESVKRICVLDRTKEPGANGDPLYLDVVEAFATCPCKNKPLI IGGRYGLSSKDTTPAQMLAVFTNLAANEPKNQFTVGIVDDVTFRSLPVGEEISLAKPGTF EALFFGLGADGTVGANKNSIKIIGGTTNKYCQAYFSYDSKKSGGYTSSHLRFGDLPITSP YLVTTPDFVACHVPSYVDKYDVLKGLKAGGSFLLNSVHDAATTCETLPAHMKAYLAKNKI NFYIINATKIAAELGLGSRTNTIMQSAFFKIANVIPFDKAVEQMKKAIQKSYGKKGEDIV NMNYAAVDAGGDAVVKVEIPAEWASIEDKGFVHVSDASCPEFVRKIVEPINGLKGDDLPV SAFTGREDGTWENGTAAYEKRGIAVNVPEWKIENCIQCNQCAYVCPHAVIRPFLATEAEA AASGVEWKQGLGETKDYKFRIQISPLDCTGCSNCVDVCPAKEKALVMKPLESQLPQQRNW DYIVKNIGYKQVVDKTKSVKNLQFAQPLFEFSGACAGCGETPYIKAISQLFGEKMMVANA TGCTSIYSGSAPSTPYCTNDKGQGPAWANSLFEDNAEFGLGMHVGVEKLRDRIQESMEQA IAGCTKCSDELKGVMQEWIAARGSSAKSAEVSARLIPMMEACGCDYCKDILEMKDWLVKK SQWIIGGDGWGYDIGFGGVDHVLASGLDVNILVVDTEVYSNTGGQSSKSTPVGAVAKFAS AGKRIRKKDLGAIAMTYGYVYVAQVSIGASQMQLFNVLKEAEAYPGPSLVIAYAPCINHG IKGGMTRTQTVGKEAVSCGYWHLWHYNPQLEEQGKNPFVMDSKEPDWSKFRDFLMKEVRY TSLKKSFPAEAEELFAAAEENAKWRYNSYQRLAKMEY >gi|313157045|gb|AENZ01000076.1| GENE 63 60260 - 61036 1286 258 aa, chain + ## HITS:1 COG:no KEGG:Cpin_0543 NR:ns ## KEGG: Cpin_0543 # Name: not_defined # Def: S1/P1 nuclease # Organism: C.pinensis # Pathway: not_defined # 2 257 3 265 266 115 30.0 1e-24 MKKLSVCLLTAVLCCWCATAFGWGKIGHDAIADIAECNLTPKAKKNIEKYLGGRSIVYYA SWMDQVRHTPAYRHTNTWHTNKVDAGGIYVPDPEGDAMTFLDDCIAKVEDYRNQNDSTVT VSIRFIVHLVGDMHCPGHVKYPWYKSFKFTLSGKEYGLHNYWDEWALTLSNKWHYLEYGH QLDRCSKREKRDIAEGTPRDWLADSARECRVIYDWTKAGQTLSYEEARDFIIFSYEFAEA QVLKAGYRLAALLNRLFG >gi|313157045|gb|AENZ01000076.1| GENE 64 61584 - 62471 805 295 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157171|gb|EFR56601.1| ## NR: gi|313157171|gb|EFR56601.1| hypothetical protein HMPREF9720_1673 [Alistipes sp. 
HGB5] # 11 295 1 285 285 545 99.0 1e-153 MKNLLHKIASVFVVCAALAAFTACNDDDGKVKKPQVSSELVGSYMPHYFKEVPDEGGGEP SRFYLEVAAAWTDPDNIPGIDLSASMDMPAGSFIMPFNTILDLVSAIGSQFVSSGLVQLD LHSDGLFAAKYHEVVVEEGADIMTTIFSPTFAEAVSVFPSAETAAVLPEGALSFYTESNL FFAAVSKDFIRAIEKQEGMEILSEIDKMLGAYKGLSVVSTDSHYAIPFKYTFEGGVLRLY VDRAMMLPYVKLFKDVLGGLGLDPSQMMGLDPAVILDDLFNATTELQIAVYLKKM >gi|313157045|gb|AENZ01000076.1| GENE 65 62714 - 63145 735 143 aa, chain + ## HITS:1 COG:no KEGG:Odosp_1604 NR:ns ## KEGG: Odosp_1604 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 2 138 1 136 138 105 37.0 8e-22 MMKKYFAAAALLVTMAACGSAAQDKTLEGTTWKLAKMESIPATAINQETDFFTLEFNAAD TMVAGRTNCNRFFGRYELKGKKLEFENMGMTRMACPDMQYEDAFVKMLDDVDRFEIKGSE LTFFDDDKSLAVFKAVEKEPAKK >gi|313157045|gb|AENZ01000076.1| GENE 66 63738 - 64883 905 381 aa, chain + ## HITS:1 COG:XF1483 KEGG:ns NR:ns ## COG: XF1483 COG4973 # Protein_GI_number: 15838084 # Func_class: L Replication, recombination and repair # Function: Site-specific recombinase XerC # Organism: Xylella fastidiosa 9a5c # 127 364 43 273 294 81 27.0 2e-15 MARVKNHTKVKEPIRLRMKSLSNGSKSLYLDIYRDGKRTYEYLKMYIIPETDSNSRRQNQ TTMDAANAIKSKRIIELTSNEAGIVFRKDKTFLLDWMQVYMEAQESAGKKDGSQIKIAMR ILKDYAGEMVTLDQIDGDFCRGYITYLLTEYHPKGKDISNYTLHNYYRALNGALNSAVRK KKMKANPFNELEKSEKIRKPESMRSYMTIEEVQALIDTPMPHEEYEIVKCAYLFSCFCGL RISDIIKLKWNDVFVDRGQYRLAVSMKKTKEPIYLPLSPEALKWMPERGGKSSEDNVFDL PSANTIRMQLKPWAKAAGISKRFSYHTSRHTFATMMLTLGADLYTVSKLLGHADVKMTQV YAKIINKKKDEAVNLVNGLFH >gi|313157045|gb|AENZ01000076.1| GENE 67 64981 - 65487 66 168 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|218129443|ref|ZP_03458247.1| ## NR: gi|218129443|ref|ZP_03458247.1| hypothetical protein BACEGG_01020 [Bacteroides eggerthii DSM 20697] hypothetical protein Bfra3_03330 [Bacteroides fragilis 3_1_12] hypothetical protein BACEGG_01020 [Bacteroides eggerthii DSM 20697] # 48 156 2 110 134 118 49.0 2e-25 MISIGASHYTFHLFRGKITTKRKRESRRSVDTGRTKRRQRFRGVIHPFQLHALIKREIRC GKAGLDGTGALGQQPLDNVDVHFVAVPAGRMVTVYVPVSVYEIVHIAVVLFTTQHNIVEI 
NCFCFGKQGEKFIRHILVRSEVLQVLAPSKKREGRLQVFSCRFTFLSL >gi|313157045|gb|AENZ01000076.1| GENE 68 65452 - 65787 274 111 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0506 NR:ns ## KEGG: Bacsa_0506 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 111 145 256 256 148 70.0 7e-35 MKSIVACANTYHLFCVSVRIEDMEALFACKKGFSIRVNNIRRVVILFDALLENSFIQSRW QNVLGKGAFLQSKDGTRSVSVSTLSSALSSIKNNMTSVAYSIRKVIDRLKE >gi|313157045|gb|AENZ01000076.1| GENE 69 65979 - 66362 427 127 aa, chain + ## HITS:1 COG:no KEGG:BF2790 NR:ns ## KEGG: BF2790 # Name: not_defined # Def: putative excisionase # Organism: B.fragilis # Pathway: not_defined # 1 127 6 132 132 180 76.0 2e-44 MTFMERMSGRLAAIESVLKKLEPVESLLERITLLENTIFTTKRVFTFQEACMYIGVSESM LYKLTSSKEIPHYKPRGKMVYFAKEELDEWLLQNYEPTMNEAVRRVTEAAATEPFLNKRR YGKRKKD >gi|313157045|gb|AENZ01000076.1| GENE 70 66340 - 67440 914 366 aa, chain + ## HITS:1 COG:no KEGG:BF2791 NR:ns ## KEGG: BF2791 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 366 1 368 368 627 82.0 1e-178 MENERRIDRTTSMGMDEHRLSDILHASQIKATDIYETPPQIIWIDNSTIATLGNFSASTG KAKSKKTFNVSALVAASLAGKQVLNYRAHLPEGKQRILYVDTEQSRFHCRSVLERILRLA GLPTTTDPENLDFFCLREYSPSVRIEVIDYALRQQKGYGLVIIDGIRDLMLDINNAGESV EVINRMMEWSSRYDLHIHCVLHLNKGDNNVRGHIGTEMSNKAETVLVISKSNENPGISEV HALHIREKEFKPFAFTINETGLPVIAEVHSFGEPPKPKARTGFTELSIEQHREALSAAFG EKPIRGFDNLLQSLMVSYEAIGFKRGRSVMIKLMQYLIDNLKLIIKRDKLFYYDMTPTEA MLFDEE >gi|313157045|gb|AENZ01000076.1| GENE 71 67543 - 68430 610 295 aa, chain + ## HITS:1 COG:no KEGG:BF2792 NR:ns ## KEGG: BF2792 # Name: not_defined # Def: DNA primase # Organism: B.fragilis # Pathway: not_defined # 1 295 1 295 295 457 74.0 1e-127 MTIAEAKQVRIVDFLAQLGHHAQHIKSEQYWYFSPLRNERTPSFKVNDRINEWYDFGEAT GGDLVELAKYICRTDCVSEALAYIERLVNGASLPRTRMPTAPPRPVEAEMKDVIVIPLRH HALFSYLQSRLIDADIGRMYCKEVHYELRGRHYFALAFGNTSGGYEVRNAYYKGCLNNKD ISLIRHLTEETQENVCVFEGFMDFLSYMTLKLAGDRTVCLAMPCDYLVMNSVNNLKKTLA RLQEYSVIHCYLDNDLAGQRTTETIAGMYDGRVSDESCHYAEYKDLNDYLRGKKR 
>gi|313157045|gb|AENZ01000076.1| GENE 72 68674 - 69117 354 147 aa, chain + ## HITS:1 COG:no KEGG:BF2793 NR:ns ## KEGG: BF2793 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 147 51 200 200 169 62.0 4e-41 MNSYFIFAIVLTVAYIIYYAVIIAHDIYGKKGTDKPNEEVFDLGAPEEEESVAVTESETG FSIGSENYETESTATSSETSLEDVKDKPGTAQEKLERLKAEAEEQMEETTPYLSDARTSE EMYKAMISKGRLDNRPEIKWNPIQDRL >gi|313157045|gb|AENZ01000076.1| GENE 73 69119 - 69445 336 108 aa, chain + ## HITS:1 COG:no KEGG:BF2794 NR:ns ## KEGG: BF2794 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 108 23 129 129 144 78.0 1e-33 MSKAKKILCALCLPFPYTTFAKSGSVNYSWGADALATMHDFVVTMMLYVQYICCAIAGVY VIVSVCQIYIKMNTGEDGITKSIMTLVGACLFLIGAFYVFPVFFGYRI >gi|313157045|gb|AENZ01000076.1| GENE 74 69500 - 69877 566 125 aa, chain + ## HITS:1 COG:no KEGG:BF2795 NR:ns ## KEGG: BF2795 # Name: not_defined # Def: conjugate transposon protein TraE # Organism: B.fragilis # Pathway: not_defined # 1 125 1 125 125 154 76.0 1e-36 MFQKTKQLCRKAFGFVNGIPTKVMMFSFMLLSGMVAKAQNSAGDYSAGTSALSTVAEEIA KYVPIMVKLCYAIAGVVAIVGAISVYIAMNNEEQDVKKKIMMVVGACIFLIAAAKALPLF FGIAA >gi|313157045|gb|AENZ01000076.1| GENE 75 69893 - 70189 370 98 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0499 NR:ns ## KEGG: Bacsa_0499 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 97 1 97 98 146 79.0 3e-34 MKGKDERYPDYPLFKGLQRPLEFLGIQGRYIYWAAVTTCGAIVGFVAAYCLLGFIAGLVV LAAVVSAGIVLILLKQRKGLHSKKVVPGVYVYAHSRKI >gi|313157045|gb|AENZ01000076.1| GENE 76 70301 - 73009 3113 902 aa, chain + ## HITS:1 COG:no KEGG:BF2797 NR:ns ## KEGG: BF2797 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 895 1 897 903 1521 83.0 0 MTLYIILCFVALCAGMALSVYAFGTGGKRKRIFQDIYFSAEETDGVGVLYTKTGEYSAVL KIENPVQKYSADIDSYYDFTHLFTALAQTLGEGYAIHKQDIFVRKQFASEPTDGQEFLSS SYFRYFKGRPYTDSLCYLTITQEAKKSRLFSFDSKKWRDFLVKIRKVHDQLRDGGVQARF 
LNKAEASEYVDRYFAMNFKDRTVSMTNFKADDETVSMGDKRCKVYSLVDVDCAALPSQIR PYTNIEVNNTEMPVDLVSVVDSIPNAETVVYNQIIFLPNQKRELSLLDKKKNRHASIPNP NNQMAVEDIKRVQEVIARESKQLVYTHFNMVVAVSAGADLQKCTNHLENAFGRMGIHISK RAYNQLELFVGSFPGNCYTLNEEYDRFLTLSDAAMCLMYKERVLHSEETPLKIYYTDRQG VPVAIDITGKEGKNKLTDNSNFFCLGPSGSGKSFHMNSVVRQLHEQGTDVVMVDTGNSYE GLCEYLGGKYISYTEERPITMNPFRINREEYNIEKIDFLKNLILMIWKGSDSQIPEIEFR IVEQIIIDYYDAYFNGFTRYTDEQREVLLKNLFAAASRKNPNKPPREVDEMVRKQIEVLE ARRAALKVSELNFNSFFDYSFDRLEQICTENDITTISYSTYSTMLQPFYKGGAYEKILNE NVDSALFDETFIVFEVDAIKENKKLFPIVTLIIMDVFLQKMRIKKTRKVLVIEEAWKAIA SPLMAEYIKFMYKTARKFWASVGVVTQEIQDIIGSEIVKEAIINNSDVVMLLDQSKFKER FDEIRKILGLTEVDCKKIFTINRLENKDGRSFFREVFIRRGTTSGVYGVEEPHECYMTYT TERAEKEALKLYKKELRCSHQEAIEAYCRDWDASGIGKALPFAQKVNETGRVLNLRPVHE SK >gi|313157045|gb|AENZ01000076.1| GENE 77 73021 - 75567 1860 848 aa, chain + ## HITS:1 COG:no KEGG:BF2799 NR:ns ## KEGG: BF2799 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 848 1 843 843 1377 80.0 0 MKRLLLILTIASVTLCQTVHAQYYSVNYDTRTVAAMVAAFGTEAVAEGYYREQVDDVLKH YTAAEVATAGIFAAKFLEHKALSDLGIWNSRTENYYYRRIYRMVAEKIMPKIWVVAKQML HSPQTAIYWGSYLMKVCDDTKSLCMQFESVVTNSTLTFSDIAFLEINPEIAPLLKLSETG NIDWQRMMDNFARIPGNFTHENLKSDLDNLYNTGVGLATAGIANLGDALLQSSSFHDLLG GKVEEIGNLYEHYGSLFEQAEHDISGLLIDMVGGPDNVAGLFNFSNYNLTAWMTDYLDEA MGNYYTQRWYIARRDQGSVSLCDYYPPTDDNSILNGGAWVRFNTSDPNFYPDASQREQVL ANSEGYAGWSRNRVQQLNNQNDGFSYSINYWMSAYIISKKNKQTKKAYAYEIHVSKNWNK EEVVYEEVFDSYSMDLNTFRAQLNVRLAEFNENEDGYTYYISSGARNYYQATDAAKLKGC ESVTISVTCSDGVTLGQGSTQYKCRKCGSSLNAHTKECAMQTSVTENDLDLSELDALLNE ANSQAANIEAQIGALENENAVLLKKISTASIEDAANYRQQYNANKTRIDRLKGELAEWKK KQSDYAQAKSDAADDNSVATDDYYRIPAIMQDCKAAYGLTWQDGGAWNGYSYVRKATMPN INGIITFRATVSIARKPKYFLGIKIHRTIIQIKWELTSEYTDTHVADVLTLDPGLSDAEK AKLVNDRIAEIAREHPSCKITTEYARSAPTEETPSGDVYHLLWSSDRLEIAREVDSRITK IYADLVSLEKMMHYKRSILDVMKDVQPGLDTDEGRRLTLVEECHDRWVENARTLRSGNSR NSRKEVRP >gi|313157045|gb|AENZ01000076.1| GENE 78 75564 - 76289 506 241 aa, chain + ## HITS:1 COG:no KEGG:BF2800 NR:ns ## KEGG: BF2800 # Name: not_defined # 
Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 241 1 241 241 357 79.0 3e-97 MKRTIQIAVVLFALLPGIAKAQWTFDIVSVEAYINDHKKQRSLLLARSTLEYSNKLLHEY SCKEIGDYKELNIDLDKYTRAFDAIDVMYQSLRTVLNVKNTYTSVSDRIGDYKSLLEDFN AKILKRGRIESADTLIISINARGLRAIAREGEQLYKSVSDLVLYATGAAACSTSDLLMVL EAINRSLDNIERHLNRAYFETWRYIQVRIGYWKAKVYRSRTMREILDDAFGRWRGAGRLD Y >gi|313157045|gb|AENZ01000076.1| GENE 79 76308 - 77075 754 255 aa, chain + ## HITS:1 COG:no KEGG:BF2802 NR:ns ## KEGG: BF2802 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 238 1 238 256 366 81.0 1e-100 MNRKLLLMAVAVTVTTAVHAQYVTYNHDSPKQNQVTVMETGTGALSPDLYYSVLHNKYKK SAAAKNKLSFRTLAGINLYNQVDEAEAIDSALVKRAKVEALNVADRQADIAWLAEGDKVS RQMDRFRRNIDRILLSGGTPADKERWTEYYHVYQCAINATKDAYMPNAQRKKEYLRIYED VARQNEILVSYLAKRQNATATSTLLNATDNRTLHKGGIVRNAMSRWQESRLAVRGSQSGG NGNGEDDNESVNRGK >gi|313157045|gb|AENZ01000076.1| GENE 80 77088 - 78221 748 377 aa, chain + ## HITS:1 COG:no KEGG:BF2803 NR:ns ## KEGG: BF2803 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 377 1 377 377 701 92.0 0 MANGDILSDFGINLLEEEIDDVIFQTNEFLTDATFTGAQGPFWWILQMCMALAALFSIVM AAGIAYKMMVKHEPLDVMKLFRPLAVSIIICWWYPPADTGMAGSRNNWCFLDFLSYIPNC IGSYTHDLYEAEASQISDRFEEVQQLIHVRDTMYTNLQAQADVAHTGTSDPNLIEATMEQ TGVDEVTSMEKDAAKLWFTSLTAGVIVGIDKIIMLIALVVFRIGWWATIYCQQILLGMLT IFGPIQWAFSILPKWEGAWAKWLTRYLTVHFYGAMLYFVGFYVLLLFDIVLCIQIENLTA ITASEQTMAAYLQNSFFSAGYLMAASIVALKCLNLVPDLAAWMIPEGDTAFSTRNFGEGV AQQAKMTATGGLGGIMR >gi|313157045|gb|AENZ01000076.1| GENE 81 78263 - 78658 330 131 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0493 NR:ns ## KEGG: Bacsa_0493 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 130 1 130 136 147 56.0 1e-34 MKLQEKIKSWCKDEKFMSFAQERARKEVCEVTENHRIDPQYEELDEAFEYDDRYIAPLVT YLTYKLRLALLQRNAGKRKRGIWWVLVHVEMQGYYVEIFSAEFENLLTELRDAVIPMLHT EYVQMLNGKRE >gi|313157045|gb|AENZ01000076.1| GENE 82 78662 - 79276 589 204 aa, chain + ## HITS:1 COG:no 
KEGG:BF2805 NR:ns ## KEGG: BF2805 # Name: not_defined # Def: conjugate transposon protein TraK # Organism: B.fragilis # Pathway: not_defined # 1 204 1 204 204 382 91.0 1e-105 MVIKNLENKIRLVGIICTAFLVGCVIISLSSIWTARTMVSDAQKKVYVLDGNVPILVNRT TMDETLDMEAKSHVEMFHHYFFTLPPDDKYIRYTMEKAMYLVDETGLAQYNTLKEKGFYS NILGTSSVFSIYCDSVAFNKEKMEFTYYGRQRIERRSNILMRELVTAGQLKRVPRTENNP HGLLIVNWRTLLNKDIEQKTKINY >gi|313157045|gb|AENZ01000076.1| GENE 83 79289 - 79669 286 126 aa, chain + ## HITS:1 COG:no KEGG:BF2806 NR:ns ## KEGG: BF2806 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 122 1 122 123 172 74.0 4e-42 MNIKGFRRMLFGEKMPDKNDPKYKDRYEREVSAGRKFAQATRIDKAAAKVQGFANAHRIL FLVIVFGFAIGGFTWNIYRITMAYRNSRPTRTATEMQDSVLRERHKRLQGGEIRENRNEN KKYEPQ >gi|313157045|gb|AENZ01000076.1| GENE 84 79680 - 80162 290 160 aa, chain + ## HITS:1 COG:no KEGG:BF2807 NR:ns ## KEGG: BF2807 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 160 1 160 160 300 86.0 1e-80 MNTRFEKSVRSSDEWYTPKEVLKALGRFDLDPCAPIRPLWPTAEVMYDRNMDGLSLKWEG RVWLNPPYSRPLIEQFVRKLAEHGNGIALLFNRCDSKMFQDVIFEKATGMKFLRHRIRFY RPDGTRGDSPGCGSILIAFGVENAEVLKNCSIEGKYVQLN >gi|313157045|gb|AENZ01000076.1| GENE 85 80173 - 81321 948 382 aa, chain + ## HITS:1 COG:no KEGG:BF2808 NR:ns ## KEGG: BF2808 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 1 382 10 389 390 545 79.0 1e-153 MKIFDKINFREPKYMLPAVLYIPLLVASYFIFDLFHTETAEIPDKTLQTTEFLNPDLPDA RLKGGDGIGSKYENMAKSWGKIQDYSAVDNIERDEPDDNKEEYESQYTVDDIALLDEQQQ EKAAAAQIADAKTREQEALAELEKALAEARLRGRNEVLPATDTDSTASAQPPTATIEVKG KIEEESRSVKAPSENEPPSEVVRKVKTASDYFNTLAVNAREPKLIKAIIDEDIKAVDGSR VRLRLLDDVEINECVVKRGSYLYATMSGFSSGRVKGNITSILIEDELVKVSLSLYDTDGM EGLYVPNSQFRETSKDVASGAVSGNLNMNTGSYGNSLSQWGMQAATNAYQKTSNAIGKAI KKNKVKLKYGTFVYLVNGREKQ >gi|313157045|gb|AENZ01000076.1| GENE 86 81333 - 82118 413 261 aa, chain + ## HITS:1 COG:SMc00021 KEGG:ns NR:ns ## COG: SMc00021 COG0863 # Protein_GI_number: 
15964679 # Func_class: L Replication, recombination and repair # Function: DNA modification methylase # Organism: Sinorhizobium meliloti # 4 247 20 265 376 89 30.0 9e-18 MIEIDNIYNMDCIEGMKLMANGSVDAVIADLPYGVLNRSNKAAHWDRQIPLEALWKQYRR ITKPGSPVILFAQGIFSAQLMLSQPRMWRYNLVWRKDRVTGHLNANRMPLRQHEDIIVFY DRQPVYHPQMTPCPPERRNHGRRKTDGFTNRCYGEMKLAPVRVAEDKYPTSVISIPKEHK TGAFYHPTQKPVALIEYLIRTYTNEGDVVLDNCIGSGTTAIAAIRTGRHYIGFEIEPTYC EIVGRRIREELERGHGLKKAK >gi|313157045|gb|AENZ01000076.1| GENE 87 82156 - 83007 877 283 aa, chain + ## HITS:1 COG:no KEGG:BF2811 NR:ns ## KEGG: BF2811 # Name: not_defined # Def: conjugate transposon protein TraN # Organism: B.fragilis # Pathway: not_defined # 11 283 9 281 281 437 85.0 1e-121 MNKKIIVTAFLLAAGLFATRNAQAQRTYEEMERLTVNEQVTTVITATEPVRFVDISTDKV AGDQPIENIIRLKPKETGHEDGEVLAIVTIVTERYRTQYALIYTTRISEAVADKEIQLQE RDAYNNPTVSMSTADMVRFARRVWNSPAKIRNVATKAHRMVMRLNNIYSVGDYFFIDFSI ENKTNIRFDIDEIRVKLSDKKLSKATNAQTIELTPALVLEHGKTFKHGYRNVIVVKKMTF PNDKLLTIEMTEQQISGRNISLNIDYEDVLSADSFNTALLEEE >gi|313157045|gb|AENZ01000076.1| GENE 88 83031 - 83516 423 161 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0487 NR:ns ## KEGG: Bacsa_0487 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 2 161 20 179 179 272 78.0 3e-72 MLACMCVFMNASAQRNSGRLSLGVGLLYENGMDVTLAYEHEMNYRHAWEFFANGYLKWTE CKSCGHICPESFWRNYRTYGFGVAYKPCVVRGRNHYGNLRIGASAGSDTNKFLAGIHVGY EHNYVLRSGCTLFWQVKSDMMIKGADLLRTGIVLGVKLPIK >gi|313157045|gb|AENZ01000076.1| GENE 89 83542 - 84198 515 218 aa, chain + ## HITS:1 COG:no KEGG:BF2813 NR:ns ## KEGG: BF2813 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 217 1 217 218 355 83.0 1e-96 MIRKIQWFAMAVTAVLCAACDAHIDVPDTAVRPGHILCEDGTALSYVQYEQSGKRAIAVV FDTEHREGTEGNGYAVYLWDIAPAAFADSLGVAQGTSADIEALDGNMNTFALYDTRETAS PMAEAVFDLWRYGQSAYIPSVAQMRLLYAVRETVNPVIERCGGHPLPLDENDCWYWTSTE VTGQETAKAWLYSTGSGAMQETPKTQAHKVRPIITLNR >gi|313157045|gb|AENZ01000076.1| GENE 90 84227 - 86413 1259 728 aa, chain + ## HITS:1 COG:no 
KEGG:BF2815 NR:ns ## KEGG: BF2815 # Name: not_defined # Def: putative mobilization protein # Organism: B.fragilis # Pathway: not_defined # 1 728 1 728 728 1368 92.0 0 MEESKELQGFYKIFRAVIYISVLMEFFEYAIDLAMLDHWGGILIDIHGRIKRWMIYNDGN LVYSKIATFLLICITCIGTRNKKHLEFDARRQVLYPLICGLFLIVFSVWLYHHTMETRLY TLPLNIIFYMAATLVGVILVHIALDNISKFIKEGLGKDRFNFENESFEQSEEKDENQYSV NIPMRYYYKGKFRKGWVSISNCFRGTWVVGTPGSGKTFSIIEPFIRQHSAKGFAMVVYDY KFPTLATKLYYHYKKNQKLGKVPKGCKFNIINFVDVEYSKRVNPIQAKYINNLAAASETA ETLLESLQKGKKEGGGGSDQFFQTSAVNFLAACIYFFVNYEREPYDANGKKLYAEKRQDP QTKFWKPTGVVRDREGGSIVEPAYWLGKYSDMPHILSFLNESYQTIFEVLETDNEVAPLL GPFQTAFKNKAMEQLEGMIGTLRVYTSRLATKESYWIFHKDGDDFDLKVSDPKSPSYLLI ANDPEMESIIGALNALILNRLVTRVNTGQGKNIPVSIIVDELPTLYFHKIDRLIGTARSN KVSVTLGFQELPQLEADYGKVGMQKIITTVGNVVSGSARAKETLEWLSNDIFGKVVQVKK GVTIDRDKTSINLNENMDSLVPASKISDMATGWICGQTARDFVKTKTGTGGSMNIQESEE FKTTKFFCKTDFDMREIKAEEAAYVPLPKFYTFKSREERERILYKNFIQVGQDVKDMIAD VLNKRGAK >gi|313157045|gb|AENZ01000076.1| GENE 91 86501 - 87721 651 406 aa, chain - ## HITS:1 COG:MA2994 KEGG:ns NR:ns ## COG: MA2994 COG4804 # Protein_GI_number: 20091812 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Methanosarcina acetivorans str.C2A # 15 385 13 328 345 120 25.0 5e-27 MYMEQQVIPSDYTQYAEAVEIIKHAIERCRYRSAAAVNKETLSLYFGMGKFVSENSRIGC WGTKALPTISKLLQRELPGLHGFSESGLKRMRSFYEEWRTFLIRPTVLGELENHSSEKGT ALIHPTALGELDIDEHLLLQLIGQPVNTEFTWDDFVKISFSHHIEILTRSKDIAERLFYI HCAAQNAWSLSSLKNYLKEDIYSNRGSLPSNFLQVLPEAIYAVKATLAFKDEYMLEMVNL ENVGEREQDWNEKVIENQIVTNIKQFILRFGNDFTFIDSQHRLIVAGEEMFADLVFFNRE LNASVIVELKRGKFRPNYLGQLSGYLTVYDMTDKKPHENPSIGIVLCQDANRQFVEIMVR DYDKPMGVATYRTAQEMPENLRKTLPDIDKLQNLLSENDSKSETVR >gi|313157045|gb|AENZ01000076.1| GENE 92 87870 - 88046 102 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571025|ref|ZP_04848433.1| ## NR: gi|253571025|ref|ZP_04848433.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] hypothetical protein HMPREF9011_04225 [Bacteroides sp. 3_1_40A] conserved hypothetical protein [Bacteroides sp. 
1_1_6] hypothetical protein HMPREF9011_04225 [Bacteroides sp. 3_1_40A] # 1 58 3 60 60 113 100.0 5e-24 MCLGRAEKAGSGVDKIVSGWQSLGWPLPTVAEETRPDYVVLTLQLGMKTRQENLASRI >gi|313157045|gb|AENZ01000076.1| GENE 93 88136 - 88258 86 40 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253571024|ref|ZP_04848432.1| ## NR: gi|253571024|ref|ZP_04848432.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] hypothetical protein HMPREF9011_04224 [Bacteroides sp. 3_1_40A] conserved hypothetical protein [Bacteroides sp. 1_1_6] hypothetical protein HMPREF9011_04224 [Bacteroides sp. 3_1_40A] # 1 40 32 71 71 72 100.0 1e-11 MEVYINPMIGAGVLEMTEPDNPTSRNQMYVTVKVEQEFQK >gi|313157045|gb|AENZ01000076.1| GENE 94 88361 - 88741 433 126 aa, chain - ## HITS:1 COG:no KEGG:BF2817 NR:ns ## KEGG: BF2817 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 111 1 109 118 177 74.0 2e-43 MDKEYVMHIAQTIKEQLLSFTPIPVFMSWGVSEFVATVFQELPALRLKVNGRLHAGYVVI ALNGSDYYEVYLLKEDDSNAKCVNEEVCFDELGDVIDRAIESGTDKEEYDKFCDRQLAEL LSGTRA >gi|313157045|gb|AENZ01000076.1| GENE 95 88775 - 89047 241 90 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0432 NR:ns ## KEGG: Bacsa_0432 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 89 1 89 90 115 71.0 8e-25 MKTKELQFDGNIYICRIVKSNEGEELLIGSTALLDALHPGSFEDESEGFASKEAEQIYDE VFFFADAKTLKLPDDELITELKEDNPEWFN >gi|313157045|gb|AENZ01000076.1| GENE 96 89055 - 89330 440 91 aa, chain - ## HITS:1 COG:no KEGG:BF2819 NR:ns ## KEGG: BF2819 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 91 3 93 93 156 84.0 2e-37 MDNIKESKEYQLAKDWERAVNDYGFNPKRFAAAIPEMHPTLQQSLYRLVKECIVVMADET RNYDDRNRASHEEAKCIMEYLKANGRHIPLR >gi|313157045|gb|AENZ01000076.1| GENE 97 89772 - 91505 1122 577 aa, chain - ## HITS:1 COG:PA5562 KEGG:ns NR:ns ## COG: PA5562 COG1475 # Protein_GI_number: 15600755 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Pseudomonas aeruginosa # 
7 217 28 229 290 132 38.0 1e-30 MIMATTAVQAVEKNITSVALADIQPSNYNPRKNFDETSLAELAESIRQQGVLQPIGVRPI ADNRFEIVFGERRYRASLMAELAEIPAIVMEISDETAEEMAITENLQRKDVTPIEEANAY QKLIDSGRHDVQSLTVQFGKTEAYIRTRLKFVSLIPEIALLLEQDEITISVASEICRYGE EIQREVYDQHLKEGVQYNSWRGMKASEVAQSIERQYTADLNRYSFDKTLCLSCPHNTNNM MLFCEGGCGNCANRACLVEMNTSHLTEKAMRLMEQHPAVPLCHESYNYNEAVIDRLTAMG YEVESLKTYATKYPESPQAPQKEDYDTTEEYEDAEKDYGQELNGYTEKCEAIRTRSEAGE ISLYLRIESNDITLCYVANTATTVNGTATEMPLSPIEKLEKQDKRNKEIALEKTVEDTKK RILEVDMSERKFGQDEEKMVYFFLLSSLRKEHFNEVGIEDKGSYYYLTSEDKMRIIENLT AKQKAVIRRDYLIANFKDAFGNNATASLLLGFAQKHMPEELANIQDGYNEVYEKRHQRIK EKKAALQEQATQEAEQPDEPQPEAEAQTEPQTEEIAA >gi|313157045|gb|AENZ01000076.1| GENE 98 91937 - 92191 289 84 aa, chain - ## HITS:1 COG:no KEGG:BF2825 NR:ns ## KEGG: BF2825 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 84 1 84 84 117 83.0 2e-25 MGKSDSLKNGMRSGLDGLLSSTGKSTQKKEPAPVKTEKEPAVHCNFVINKSIHTRMKYLA IEKNMSLRDIVNEAMKEYLEKHEK >gi|313157045|gb|AENZ01000076.1| GENE 99 92199 - 92954 667 251 aa, chain - ## HITS:1 COG:Rv1708 KEGG:ns NR:ns ## COG: Rv1708 COG1192 # Protein_GI_number: 15608846 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: ATPases involved in chromosome partitioning # Organism: Mycobacterium tuberculosis H37Rv # 3 248 65 313 318 191 41.0 1e-48 MTKIIAVLNHKGGVGKTTTTINLAAALQQKKKRVLLIDMDGQANLTESCGLSIEEERTVY GAMKGEYTLPVFELENGLSVVPSCLDLSATESELINEPGRELILKGLIAKLLETRKFDYI LIDCPPSLGLLTLNALTSADFLIIPVQAQFLAMRGMAKITNVVGIVKERLNPNLNIGGIV ITQFDKRKTLNKSVAELISESFCDKVFKTVIRDNVSLAEAPIKGMNIFEYNKNSNGAKDY MELAKEVLKLK >gi|313157045|gb|AENZ01000076.1| GENE 100 93235 - 93597 221 120 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0398 NR:ns ## KEGG: Bacsa_0398 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 7 117 4 113 116 107 46.0 2e-22 MTAQEKIQKVTEISQSKGWSISVDDKNKSNIQFDFQRYTNYGQDFNFSAEMKCEDIDTLI ADMEQYFEGFDPDYEAYLWIGNDGHGKNGAPYHIKDIVSDMEEAEKQIHDLLEALETEFI 
>gi|313157045|gb|AENZ01000076.1| GENE 101 93608 - 93949 180 113 aa, chain - ## HITS:1 COG:no KEGG:BF2912 NR:ns ## KEGG: BF2912 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 18 89 11 82 104 84 56.0 1e-15 MEILKIVIPKWNIAEITGYDPMTTFWQDFSMADKFGNEAIADTYRKVKAEWKDNYKHWTE LCLVLNHKIWQWHERDNQKATLYDRRMARRAAGVLLSNNGLKQSVGRCEVSPL >gi|313157045|gb|AENZ01000076.1| GENE 102 94550 - 95182 262 210 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0421 NR:ns ## KEGG: Bacsa_0421 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 210 1 209 209 302 72.0 6e-81 METTTIHPAEAYLRNEQNSTSLYVRIAGQRRRLFINRNENVIGIIAPRKRKSGYIFTDWA SIEKIYYPSSSPEDAADIGKKQVLKYQKLARLATHTNDWLRKIANADLDKSLYENRITTG TRIDGKCIGLATIEKYCSSWDMARFRTALKQGEKFSTGRFDFCGYDGTLWCEPRENGDMA AGFSKEYRNCGNGYYYLLINDEYMIGYDID >gi|313157045|gb|AENZ01000076.1| GENE 103 95230 - 95454 162 74 aa, chain - ## HITS:1 COG:no KEGG:BF1096 NR:ns ## KEGG: BF1096 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 74 1 74 74 117 72.0 2e-25 MVEEIRQDDKVILSSEDGFSVPMIFNNLCGKNFIGKEYKDYIRHIAFEEMGLKPGIVSHY RDGVLYKNGTIPEL >gi|313157045|gb|AENZ01000076.1| GENE 104 95844 - 97013 873 389 aa, chain + ## HITS:1 COG:MT0627 KEGG:ns NR:ns ## COG: MT0627 COG1373 # Protein_GI_number: 15840000 # Func_class: R General function prediction only # Function: Predicted ATPase (AAA+ superfamily) # Organism: Mycobacterium tuberculosis CDC1551 # 6 350 4 373 411 157 28.0 5e-38 MEAKYIHRELSAVIEEAYRYFSVITVTGPRQSGKTTLLRNLFSYLPYYSLENLDVRSFAE NDPVAFLNQHTEGMILDEVHNAPNLLSYIQGMVDNDADRRFILSGSSQFAMLKKVTQSLA GRTAVFELLPLSYSEIREQITDTPLDNLLFNGFYPAIYSGRNIPKFLYPAYMKTYLDKDV RDLLQIKDMMQFHTFIRLCAGRIGSLFKASELANEIGVSSHTVTAWLSVLQASYIVFLLP PYFENTRKRLTKTPKLYFTDTGLACHLLGIESPEQLARDKMRGALFENFIVTEALKRRYN QGKESNLYFYRDSNQNEVDLLLKKHSGLYGIEIKSAMTYHADFEKALKQMDGWVKETILG KAVAYAGTLENTAGEIKLLNYSHLDEVLA >gi|313157045|gb|AENZ01000076.1| GENE 105 97019 - 97672 325 217 aa, chain - ## 
HITS:1 COG:TP0864 KEGG:ns NR:ns ## COG: TP0864 COG0739 # Protein_GI_number: 15639850 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Treponema pallidum # 79 195 423 540 546 99 45.0 3e-21 MRKILLIAISGISFCTMPVQAQFNTIAKTPERYKVEALQEGMKKPEPTPESMAPAQETST KPADESKKLWIDRYLSVSYPLQRIRITSPYGYRKDPFTGKRKFHGGIDLHARGEQVLAMM EGVVVKVGQDKTSGKYVTLRHGNYTVSYCHLSRVLAAKGTVVRPRDAVGITGSTGRSTGE HLHVTCKLNGKNINPSVLFDYIKSMQQECVSALAGLL >gi|313157045|gb|AENZ01000076.1| GENE 106 97892 - 98650 749 252 aa, chain + ## HITS:1 COG:no KEGG:BF2874 NR:ns ## KEGG: BF2874 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 252 1 252 252 425 80.0 1e-117 MKKVQIEFDELPFSTLERFGLTREMIEDLPMRVLEDICNGRHSPVLPVRVTDEHGGQIES RSRFAFIRMDDGQVDVVFYPALKSSPLERYDEAQQKQLLDGKAIVADVEMSDGRSSKAFV QIDAETKQVMYVPTPIIARNLKVLAEVMRLGTVEVNGMQHGEPLTVVVGGEPATVGIDLH AKTGIRICSGDAQQWRNQPKREWDKYTFGCYGCWVMDDDGNLDYVSEEDYTEELWNEQKK SGERNRATALHK >gi|313157045|gb|AENZ01000076.1| GENE 107 98669 - 98893 436 74 aa, chain + ## HITS:1 COG:no KEGG:BF2875 NR:ns ## KEGG: BF2875 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 74 1 74 74 100 78.0 2e-20 MAQNEYYPEDVLVGKMQSGEYGWLDYVNHFSADWQEEYARYCEEKGLAVGNESAAEFVRF KDEQLEAAMESGDA >gi|313157045|gb|AENZ01000076.1| GENE 108 98910 - 100487 1267 525 aa, chain + ## HITS:1 COG:no KEGG:BF2876 NR:ns ## KEGG: BF2876 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 525 1 522 522 872 83.0 0 MAIRTNTNPRQMDLQPEMRDMLMRNGLQAHVALDDAGYRLIVQGHDSPLLVYPITERQML ALTDWGTNTANKKAYNVLTSIVGKDFYMPKNFVHARNANGRVAMGLHGYRIVIGEYGRMG RMGMPPPFLGWTPRNQLGFHLRRVGGQLFFPGPSIVPERPDGRMKPGELQSGGYGFYYKG HQQEQPVQQRDVLQDLQAAITPMVSRPRSKEPARPYKELIASPVYFSNEKWAECLASHGL VVDAEKQTLTVQSESVQADMVYDLTEEEVRTLVAAPIEEQPVEKRLELLNGIIGADFSDK VTMEALNSDQRIAIGLHPEVQQDLKQRQRQEQEAFMPVKTSLQQQEESVQGNIGAVVDGR 
DLQFLNENKGWYREEKHGREVEVSQIAVQPAQTEGKYRMTAVIDGQAISHEITQRDYDKF LAVDDYHRMKLFSKIFSEVDMKTRPEAQRGLGTKIFAALTAGTVVAAGVAHGIHHHCHAP EFYGECFGGPPRPYFKPGVDTPRDVAIRCFEAEMNREINDMRMGR >gi|313157045|gb|AENZ01000076.1| GENE 109 100491 - 101879 936 462 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0413 NR:ns ## KEGG: Bacsa_0413 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 462 1 462 462 686 71.0 0 MREGELTYDDFLRRLNIQDVLIDAGYHLNRRDGLRYPSYVRLDSEGRRIRNDKFIVTQQG KCCFKPQQQKSYNIISFIKEHPHFFAEYHAGVSPDRLVNLVCNRLLNHPVADRDTRIIQP KRDVKPFDMADYDIHQFNPQDRATQKKFYPFFKHRGIDLYTQYAFHRNFCLATKHREDGM KYTNLAFPLTVPKDTGQVVGLEERGRPRMDGSGSYKGKAEGSNSSQGLWIASPAKTTLTE AKHIYWFESAYDAMAYYQLHQANDKDLRKAVFISTGGNPTVEQMRGVLTLSLPAKQHICF DTDLAGIEFAKNLQQEMYRAVRSTIEETPERKPYLDSVADGKNLDEGDIDLLPDALRSSY GKYESAWEEAMSMRSSGLCHPDDIREQTDIMNGNYKEFREGLREFLGLDKANDASFVREQ PTYPNKDWNEQLLAGQKQEETVDETQAREQSPEEEQQTHFRR >gi|313157045|gb|AENZ01000076.1| GENE 110 101876 - 102373 345 165 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0412 NR:ns ## KEGG: Bacsa_0412 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 165 1 165 165 226 67.0 2e-58 MMTDAPTGHFQSIVLQPHAGQFVIDELPAIVLCCAAWVYGGMEGLPLTALAVSVAALLSL ALLYRFIYLRRTRYHIGSEQLISRHGVLSRKTDYMEQYRIVDFVEHQSLMQQLCGLKTVR IFSMDRNTPRLDLVGIRHNFDVVTLIRERVEYNKRKKGIYEITNH >gi|313157045|gb|AENZ01000076.1| GENE 111 102354 - 103043 694 229 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0411 NR:ns ## KEGG: Bacsa_0411 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 228 1 230 231 373 85.0 1e-102 MKSRIISLVIFALCLMPHWAKAQITASNPLEWMALAEGNEVINDQIEKQINGQTKTAMLQ NSIAAEFNRIHKWEKQYNSYLKTVSGYASSLKACTHLYNDGVRIFLTLGKLGNAIRNNPQ GIIASMSMNNLYIETATELVSVFTLLNDAVAKGGKENMLTGAERSKTLWALNDKLSVFSR KLHLLYLSIRYYTLNDVWNNVTAGMLDRNNGEAARMAMSRWRRAAVLAR >gi|313157045|gb|AENZ01000076.1| GENE 112 103057 - 103740 487 227 aa, chain + ## HITS:1 COG:no KEGG:BF2880 NR:ns ## KEGG: BF2880 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 227 1 228 228 380 84.0 1e-104 MKRTIWIGILLLCIGIGKVRAQNDPVLAGMILLYTDKAEKELKNQEKVMLIQTTGHLWTK EEVKATTDLQREFNNYLDSFRSIICYAAQIYGFYHEISRLTDNMGDFTRQVSRNSPNALA VALSTQRNRIYRELIMNSVEIVNDIRMACLSDNKMTEKERMEIVFGIRPKLKMMNKKLQR LTKAVKYTTMGDIWREIDEGARPAADKRDIVEAAKRRWRQIGRNVRP >gi|313157045|gb|AENZ01000076.1| GENE 113 103754 - 104419 554 221 aa, chain + ## HITS:1 COG:no KEGG:BF2881 NR:ns ## KEGG: BF2881 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 221 1 214 214 286 78.0 5e-76 MGVWSTLLKYGGKAMKGVGTATATTGKSMGKAVLHPSQTLRGAGQAVKTATIGGAVGYVG WEKLTTDKSVVRIVSDAVIGEPATNTLAETADGVRELTGKAGEAVSSVSGAVTGIDNKLN GVSNFLRQASGGGGLDMFGNFFRNLGSGNVSGLSIAGLVAAAFLIFGRFGWLGKIAGAFL GMMLIGNNAGVFRTPETENVQRTRTPELPAEEQTHGGGMRR >gi|313157045|gb|AENZ01000076.1| GENE 114 104445 - 105263 902 272 aa, chain + ## HITS:1 COG:TM0409 KEGG:ns NR:ns ## COG: TM0409 COG0739 # Protein_GI_number: 15643175 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane proteins related to metalloendopeptidases # Organism: Thermotoga maritima # 29 135 140 254 271 63 33.0 3e-10 MKYTEEMILQSDSGYCMPFEEQQGKDVELSLGYGEQTDPATGEKFFHHGIDFNVRCYMLS ALASGIVSGVGNDSGHGICQTIRYGEYEVTYGNLSNVFAQFGQRVKAGQTVALSGDKLHM TVRFKGEELNPLEFLTMLYGNIQAMRQAGGHETDYLSGLEMELKTDYERDKREIEELMLH FLPHYMEDLRHGAYTLPRNTEQSLRHIFTVGAMKEYFYENMPSMANPLGLGHKAMPLACK VQNLLIADFLHYLALRHDVYLSTASSDIKKTP >gi|313157045|gb|AENZ01000076.1| GENE 115 105298 - 105579 316 93 aa, chain + ## HITS:1 COG:no KEGG:Bacsa_0407 NR:ns ## KEGG: Bacsa_0407 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 93 1 93 94 146 83.0 3e-34 MAELDIDIQSFDIPRLVTVYPDRAGVRWWTKAWFNNREEGETSVEIEREQAIRFMQDNIE KDAWLEEFFPRQMEVYHNAIEQTKEQLLKQINI >gi|313157045|gb|AENZ01000076.1| GENE 116 105588 - 107387 1408 599 aa, chain + ## HITS:1 COG:no KEGG:BF2884 NR:ns ## KEGG: BF2884 # Name: not_defined # Def: hypothetical protein # Organism: 
B.fragilis # Pathway: not_defined # 1 599 1 596 596 776 69.0 0 MDGIRHSRQFAEIERLVSDYFHCHLAPVMSKTQAFLTKKQGEEMREYSTSLGGILSAMAS AAQPMGDPYQVLKVTGEWNSKTTEDYIGMCKAEITGSKEIQQDLAYMAGQWRDTVVREIG RERYNELSEQLGGDLAYAYMDYRVEELMIDRLVKERMPKSSADYIIRKAAESSLLGLSQT LNRSPLAEEIEARGEAAYRPKGWEKGAGRVLGATADAVMMGGAGSWATLAKFVGADVAIS AVASRFEPEKSPKLSVEQCISKGVFGSERNVFTDFRKEAATVQTGDSALIVAANEQLKKK IPVMNFNFLEWMQTPKFTPFQMSEKPGHPEKKNERAERYKDVPLIVAPGQEEAYLKHLEK YKNATAVKSTKESTQPEMEQREKVEKEERQVVIPHEEEKQEREAVQTNTNGWSGLLGTLG LEGLGDITGNLGYVMAMLPDILLGAFTGKTQSLRFGDNLLPIASIVAGLFVRNPLLKMLL VGLGGANLLNKAGHEALGRPMPSADVHTENQYRRYPDEPLNSRIVNPVLQGSTLIATIDR VPCTIQLTPTVAEAYRAGALPLNTLANAVLAKSDQLRHIASQNYDDGQRETIVRTRGIQ >gi|313157045|gb|AENZ01000076.1| GENE 117 107415 - 109676 1257 753 aa, chain + ## HITS:1 COG:no KEGG:BF2885 NR:ns ## KEGG: BF2885 # Name: not_defined # Def: putative DNA primase # Organism: B.fragilis # Pathway: not_defined # 18 753 1 732 732 1110 76.0 0 MKEKSQIEKKAEEKQTELLSAALGGASNAGGHWLNVSGKGFPRLYPQGVSASPFNALFMA LHSDNNGCKTNLFTLYSETKARGAAVREHEQGVPFLFYNWNKYVNRNNPNETIDRTAYLQ LDEEQKAQFKGVHNREIRTLFNIDQTTLPYVDKPAYEDAVKQDGSVQERGYAEADNRRLR TRFNDFLLKMRDNLVPVRSDGSGVPHYETDKDAVYMPRQKDFEHYHDYVQEALRQIVSAT GHQQRLAREGMVMKNGVAPSEDAVKYERLVVELASGIKMLELGLPARLSDASLKTVDYWC REFKENPCIMDALESDVNNALDVIRKAERGEKIEYATLRNRRQTTTMQEQMPKHYFVANE IRQHPDKAAKSIVLVIDREAKSADVILPAGASTEANNEIPGMNKGRIERALQKEGIEQVR FYNTDGALGYRPDDSYFNEKMVTLARLRNYTLEKLSTLDVSEAVRRANEVGFDAVQMIQD DKNRWALYIKPENKEGYSVYPDKEDVNRFFSTLKQAMDNIDKVRMELAHKYYALAEVKPD LKVDLFGSDTPEIDLNRIQRVAIFKTRQDGIQCVATIDGQKLQPRSVTPQQWQRMWVAED RDGYKRHLAATLFADVLQKGQSVGEQKAEEQVRQQNEVVAENRSEETNDENMSPKRQFWD NLKEKHPDALYLIRAGEVYRLYNEDAAKGADILGITMKKYPERGFSAFAEFPRTQLDSYL PKLVRAGERVAISEIELQDKRQDEQETHRGIHR >gi|313157045|gb|AENZ01000076.1| GENE 118 109689 - 111245 1114 518 aa, chain + ## HITS:1 COG:SPy1438_1 KEGG:ns NR:ns ## COG: SPy1438_1 COG1705 # Protein_GI_number: 15675348 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase 
(flagellum-specific) # Organism: Streptococcus pyogenes M1 GAS # 22 161 18 163 174 79 38.0 1e-14 MTKNQEYALQYADYAMAQMRRYGIPASVTLAQGILESSNGQSRLARNENNHFGIKATSSW IAEGGKYGIYTDDKPNEKFCSYDSVGDSYEHHSRFLKENSRYAGCFKLSPDDYKGWAQSI EKAGYATGGKYAENLQKIIEQNGLQKYDRQVMQEMTAQGRQFGVEHNPLQTSESAEHGTG YSFPVEREEFLFITSPFGTRQDPLDSTKQQMHKGVDIRCKADAVLATESGGKVVAVNQNK NTPGGKSLTVEYARTDGSKVQCTYMHLKEISVKVGDTVQAGGRLGTSGNTGTRTTGEHLH FGVKNIYADGTKRDIDPAAYLAEIAQKGHIKLQMLHNGNDLLVKYKAAEDTVPEKNLSPD GWMKKLLSSEDSGVGISGCNDPIVEMAMTAFASLMLLATQIDNREEEEQKTAISAAMDLR TIDLKPLLPNMKNCDLTIGENGKAILKADNGELHVSRELTASELNRLSATLNNGTLTEEA KQMRVTGMLNTVILSEAASQNFEQGMTRQQGQTENLRR >gi|313157045|gb|AENZ01000076.1| GENE 119 111258 - 111434 138 58 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|253570994|ref|ZP_04848402.1| ## NR: gi|253570994|ref|ZP_04848402.1| conserved hypothetical protein [Bacteroides sp. 1_1_6] conserved hypothetical protein [Bacteroides sp. 1_1_6] # 1 58 1 58 58 79 100.0 1e-13 MIKCVMIQLCKCLAGVSLLALHLRFCYLLLGWIGTLVVVGVQLAIAGWMVWAMIRAPD >gi|313157045|gb|AENZ01000076.1| GENE 120 111518 - 112126 549 202 aa, chain + ## HITS:1 COG:no KEGG:BF2888 NR:ns ## KEGG: BF2888 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 202 1 203 203 313 77.0 2e-84 MIKCNVTVCGVIGRDASVRKNKEEKEFLVFPLRVQIPATGGGHTIEVDVRKEGCQEEAAG YRNGSRVEVRGTMYLKRRGDKLYFNLFADEICNAATDDADCVKGELVFRGKVGQNIEEKR DKKDQPYTVFSAFSAEKVDDGFEYQWVRFFCFGKEREAWLQPGVKVDAKGEMNLSAHNGK INLSCKMEELTQYVADSSNYNQ >gi|313157045|gb|AENZ01000076.1| GENE 121 112132 - 113820 1139 562 aa, chain + ## HITS:1 COG:XF2061_1 KEGG:ns NR:ns ## COG: XF2061_1 COG4227 # Protein_GI_number: 15838653 # Func_class: L Replication, recombination and repair # Function: Antirestriction protein # Organism: Xylella fastidiosa 9a5c # 23 340 224 514 522 92 27.0 2e-18 MAGYRKKNADGPNSEDKALDLFAEMMIEKIEGIQKDWKKPWFTEGTLQWPRNLHGREYNG MNAFMLLLHCEKEGYKIPRFCTFDCVQKLNKPGKDGEELPRVSVLRGEKSFPVMLTTFTC IHKETKEKIKYDDYKKLSDDEKEQYNVYPKMQVFRVFNVAQTNLQEARPELWQKLERENS 
RPAIEEGEHFSFAPVDTMIRDNLWICPITPKYQNDAYYSITKNEIIVPEKEQFKSGESFY GTLFHEMTHSTGAENVLDRFKPTTFGSPEYAREELVAELGSALVAQRYGMTKHIKEDSCA YLKGWLDELKESPQFIKTTLLDVKRATSMITQKVDKIAQELEQNVGEKQENGAAAKENTF YSSVAYLQFSDDTRQLDELREKGDYEGLLTLAKEYYDGNGINEQHTYLSATNNKGDSLIA EDENFAVVYNGSVGGTYEVMLKFTEQEIRDHIRRYGVDIAGDTIKGVAREMAAEQFSALA YQKIPAFEMPNGEVLYVEYNKESDTLDVGQPTNAGLVAQHRFPYDHNVGLDANLQAVNEK LNELEEYRAELQEAEYGSGMRR >gi|313157045|gb|AENZ01000076.1| GENE 122 113873 - 114247 329 124 aa, chain - ## HITS:1 COG:no KEGG:BF2890 NR:ns ## KEGG: BF2890 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 122 1 122 123 209 88.0 2e-53 MAQNFYTKWQDAILADAGDYVSKEYRSFQTALVREISKYAAAVGAKVASNSKGHYDTSCF IERNGKFVYISHSSGLSRMGSGVRIELDSFLIRTAQNGKDYRGGCNQYCDIANLQSMIDG LLGK >gi|313157045|gb|AENZ01000076.1| GENE 123 114262 - 114516 160 84 aa, chain - ## HITS:1 COG:no KEGG:BF2891 NR:ns ## KEGG: BF2891 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 7 83 18 94 94 125 76.0 5e-28 MTTNDILKRLCGNIAAGRFNWRKYCTPQSYFGWEICVTPLHCSYGQIGYTVHFPYTNIPE VEYDWEMGKLTIDGEKWKSYLRNE >gi|313157045|gb|AENZ01000076.1| GENE 124 114513 - 114980 417 155 aa, chain - ## HITS:1 COG:no KEGG:BF2897 NR:ns ## KEGG: BF2897 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 149 1 152 156 200 61.0 1e-50 MSKYDFIKQGNLLFWHTADNDIECRIISTPEKVDSDSIILISTSSSETEVLASELLPIGS SRSHKEEFMRWKKEREAEGMEFFSRLSEVMETDSDLAVGDMVAFTNDYGVVFGPKEVLAF RKPWNGYRCVYIDSDAYWFPDRPEQLTILSKGGTE >gi|313157045|gb|AENZ01000076.1| GENE 125 115532 - 115783 65 83 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MIGDIQRIFHCLFEIIGRTPFQHIICRKSIRSFFDIGDILLFPLLYSTIIAHGVIGYKLE HSFGQGMIDQIGNILFTVFLVYF >gi|313157045|gb|AENZ01000076.1| GENE 126 116753 - 117721 483 322 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2774 NR:ns ## KEGG: Odosp_2774 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 321 1 321 321 501 84.0 1e-140 
MKKNFRFLAMAVVAMAAAVFTGCSSDDDFLAPYEESAIQTRAISSTNALIDFDNVPSSVM ASDQYGNNLYSATANGKQVTTGYITQIGQTGTYIQFPINYLEQEWVSGQPWEYEFWNGGF AISNFHNLTQGDYQNQCSVYWPNGGHSGKNFAVAFGYSDSYNDSQATYDKCAKIYLTDAT GYRVVTTNTPVKGTPKYGKFNSVWVCNTTYTYLVMKDGNSFTQGSLSAQKGWFKVVFVAL DATGKPTGKEVEYYLANFDSSKDAESGLTNKIRTGWNQVDLSGLGDSVCTVAINFEGSDS SAYGLNTPAYVAIDDIDVTVNE >gi|313157045|gb|AENZ01000076.1| GENE 127 117849 - 119624 1140 591 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2775 NR:ns ## KEGG: Odosp_2775 # Name: not_defined # Def: PKD domain containing protein # Organism: O.splanchnicus # Pathway: not_defined # 1 591 1 592 592 1083 90.0 0 MKKNLLLFSLTIFLFACNKDEEISQDVTVPPVIELDSEDGIYVVKIGKEVVIEPTYQNVD YAVYSWKCNGRIISDEPQLKYIFNECGSYYVTLRVDTRDASTEEEIRVDVNELAPPVISL VTPSIGLKVVAGREYILTPDIQNAEGATYLWTLNGNEVGTENTYTFKQDELGTYELTLTV TNEDGQSKKTVSIEVVDKLPIEIVVPSSLYFTENNTKYVELGRTLFVRPFVSISAEPSYQ WSLDGQPIEGANSLVYGFKPAKTGEHTLTFTVKYDNQDTKAVLTRNISVSGVDEVSVNIP VKCCEAAGKRPFAAGNSIYSNKVYEFVPAPGQFVNETNTAGFNGENTHEAACAYAQKRLD NEQYVSLGGWGGYIVVGFDHSIENKGGYDFSIKGNAFDSSNEPGIVWVMQDVNGDGLPND EWYELKGSEYGKPETIQDYAVTYFRPGPNMDTQWQDNKGNKGAIDRLGNYHPQEFYYPLW IEEDSYTLYGTCLKARTEQSPSTGMWSNNPFGWGYADNIGDDMPNKDNPNAGALGNYFKI SDALNIDGTSANLSHIDFIKVQTGVNVKAGWLGENSTEVFKFCDENNNNDK >gi|313157045|gb|AENZ01000076.1| GENE 128 119633 - 121687 23 684 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2776 NR:ns ## KEGG: Odosp_2776 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 684 1 684 684 1047 78.0 0 MNKIYHFILFCFATLCLAACSDDDPEVSGIDGKDHFISEFALTVDGITYQAMIVGDKITV EIPYNTSLKGATVEYALCEGASINPNPSTIEDWENEWKFVVTSKMQDSKVYSYTYQYTDI EQSGSVVLATQAEVDNFAKTGINKIEGSLTIGTADGEEITNLDGLANLKQISNSLVINPS YKGTDLTGLDNLEQLGSFKLGSTTSASKNIMLKTVNLPSLLGVTGDFVVNSSVIEKISIP KVEFIGEDMYITSDALLDLDANAVESVGASLIVKGSVAQKESATTEAIVFSALKQIGNEL TIQYFPKLQGIYLPALESVVGTASFSDMSSIGSLAMTELHSVGGLTIKNCKEISIVELPG LISCGETSVDANKVNKLNIASLKDVLGDMTLSNLLIEELDLSQINFNGNTLTLQCARLNK IVGPETFNGSLLLLPKSCRLTEFTLEGISNIQGDFQCKDYSYVKEFVMPFIRVAGDMTIA 
LNSGSVNTAAEIEFPKLQEIGGTLTLGSNSNANNIAFPSLKKILGSCSVTTTYLKNDIEF TNLESIGTDGADEQIKFEIEATNILCPKLKTINGKFDIATSSFMFGMEVDKVSYPNVESI SENLSITCPYSDFGSNGILSIDFSGLKSVKGISISGQGDVSDFSSFKYLFENNVLTGESQ WSVKECGYNPTFQEMKDGKYKLAE >gi|313157045|gb|AENZ01000076.1| GENE 129 121772 - 122851 1541 359 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2777 NR:ns ## KEGG: Odosp_2777 # Name: not_defined # Def: putative surface layer protein # Organism: O.splanchnicus # Pathway: not_defined # 1 359 1 357 357 680 92.0 0 MTYHQLKYLLWSVVVGVTLSLTSCMKWDYGDAVEDFNATGAGLFITNEGNFQYGNATLSY YDPATKQVQNEIFFRANGMKLGDVAQSMTIYDNKGWVVVNNSHVIFAIDLNTFKEVGRIE NLTSPRYIHFLSDEKAYVTQLWDNRIFIINPKKYEITGYIQVPDMTMESGSTEQMVQYGK YVYCNCWSYQNRIIKIDTETDQVVDELKVGIQPTSLVMDKYNKMWTVTDGGYEGSPYGYE APSLYRIDAETFTVEKQFKFKLGDWPSEVQLNGDRDKLYWINKAIWSMDVTASHVPVRPF LEYSGTIYYGLTVNPANGEVYIADAIDYQQQGMIYRYSPEGKLIDEFYVGIIPGAFCWK >gi|313157045|gb|AENZ01000076.1| GENE 130 122860 - 123348 64 162 aa, chain + ## HITS:1 COG:no KEGG:Slin_4541 NR:ns ## KEGG: Slin_4541 # Name: not_defined # Def: transposase IS200-like protein # Organism: S.linguale # Pathway: not_defined # 4 135 7 138 148 72 28.0 6e-12 MRHIYHIIFITKRQEMTIPMTTKGMILDQMVQIFTRKGCHVYACNAFLNHVHILVEIPSP TDFAKIINKVKSSTGVVYRKYPEYADFSGWAAGFDSFSVSFNDLNRVKHHIEKQETVHQE LSFEDEYDQLLEENGFNAYQDFMRASFSAYAAVNNHIKNKRK >gi|313157045|gb|AENZ01000076.1| GENE 131 123345 - 125423 1826 692 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3622 NR:ns ## KEGG: Odosp_3622 # Name: not_defined # Def: TonB-dependent receptor # Organism: O.splanchnicus # Pathway: not_defined # 6 691 5 695 696 1242 89.0 0 MTYTKYLLFSILVVCPSVLSAQGITRRIHQIDEVTVWGKRPMKEIGVQKTKFDSLALKEN IALSMADILTFNSSVFVKSYGRATLSTVAFRGTSPSHTQVTWNGMRINNPMLGMTDFSTI PSYFIDQASLLHGTSSVNETGGGLGGLVKLGTAPEVAEGFNAQYVQGIGSFKTFDEFARF TYGSERWHVSTRAVYSSSPNDYKYTNHDKKINIYDEDKNIVGQYHPKERNRSGAFKDLHL LQEVYYNTGKGDRFGLNAWYINSNRELPMLTTDYGDATDFENRQREQTFRSVLSWDHMKS NWKLGVKGGYIHTWMAYDYKREVAPDNWASMTRSRSKVNTFYGQAEGEYSLDKKWFFTAN VSAHQHLVRSEDKNIILQDGGKAIVGYDKGRVELSGSVSAKWQPIDRLGMSVVLREEMYG 
SDWIPLIPAFFIDGIISPKGNVMLKASISRNYRFPTLNDLYFLPGGNPNLKNEQGFSYDA GVSFDVGKKGIYKLSGGANWFDSYIDDWIIWLPTTKGFFSPRNVKKVHAYGVEVKANFAV QPAKDWLIDLNGSYSWTPSINEGEKMSPADQSVGKQLPYVPKHSASLTGRLSWRTWAFLY KWAFYSERYTMSSNDYTLTGHLPEYFMSNVSLEKNLFFKPVDIQLKFAVNNLFNEDYLSV LSRPMPGINFEFFIGITPKFGKNKKKSENTNM >gi|313157045|gb|AENZ01000076.1| GENE 132 125428 - 126567 1046 379 aa, chain + ## HITS:1 COG:alr4031 KEGG:ns NR:ns ## COG: alr4031 COG0614 # Protein_GI_number: 17231523 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-hydroxamate transport system, periplasmic component # Organism: Nostoc sp. PCC 7120 # 55 379 97 426 426 209 34.0 5e-54 MKALKNLSLLLLLVLAFTGCHNKSSKLADFNRTVYTPEYASGFDIKGADGKKSVLVTVTN PWQGADSITTNLFIARDDEEVPADFTGQMLKGDAERIVCMSSTHIAMLDAIGETGRVVGV SGIDYISNPDIQARRDSVGDVGYEGNINYELLLSLDPDLVLLYGVNGASSMEGKLKELDI PFMYVGDYLEESPLGKAEWLVALSEVIGKRAEGEKVFAEIPVRYNVLKKKVADNILDAPS VMLNTPYGDSWFMPSTESYVARLIKDAGGDYIYKKNTGNASAPIDLEEAYLLASQADMWL HVGMANTLDELKAACPKFIDTRCFRGGQVYNNNARTNAAGGNDYYESAVVNPDLVLRDLV KIFHPELVEEDFVYYKQLK >gi|313157045|gb|AENZ01000076.1| GENE 133 126568 - 127548 966 326 aa, chain + ## HITS:1 COG:alr4032 KEGG:ns NR:ns ## COG: alr4032 COG0609 # Protein_GI_number: 17231524 # Func_class: P Inorganic ion transport and metabolism # Function: ABC-type Fe3+-siderophore transport system, permease component # Organism: Nostoc sp. 
PCC 7120 # 6 323 22 356 362 241 49.0 1e-63 MRSRSAILFAMLAALTLFLFLLDLAVGAVAVPLGDVWAALTGGDCPRATAKIILNIRLIK AVVALLAGAALSVSGLQMQTLFRNPLAGPYVLGISSGASLGVALVVLAGFGSSIGIAGAA WLGAALVLVVIAAVGHRIKDIMVILILGMMFSSGVGAIVQILQYLSKEESLKAFVIWTMG SLGDVTFDQLAVLVPSIIAGLLLAVVTIKPLNLLLFGEEYAVTMGLNIRRSRGLLFLSTT LLAGTVTAFCGPIGFIGLAMPHVTRMLFRNSDHRVLVPGTVLSGAAVLLLCDLVSKMFTL PINAITALLGIPIVVWVVLRNKSVTA >gi|313157045|gb|AENZ01000076.1| GENE 134 127545 - 128303 227 252 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|163803615|ref|ZP_02197481.1| 50S ribosomal protein L34 [Vibrio campbellii AND4] # 1 208 1 209 245 92 30 2e-17 MIKLHDFSIGYGERTLLCEVETTIEKGRLTALIGRNGTGKSTLLRAIAGLNRRYTGRILL DGHNAADMRAAEMARTLAFVTTERTRIANLKCKDVVAIGRAPYTNWIGKMQEVDKEIVMR SLASVGMEAYAERTMDKMSDGECQRIMIARALAQDTPIILLDEPTSFLDMPNRYELCTLL ARLAHEENKCILFSTHELDIALSLADAIALIDPPQLSYMPTEEMRRSGCIERLFRNNCVT FDATTGFIKVGQ >gi|313157045|gb|AENZ01000076.1| GENE 135 128300 - 128896 129 198 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2784 NR:ns ## KEGG: Odosp_2784 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 181 1 181 187 311 78.0 1e-83 MTKEYIIENFTANISVDEYISRFRDEKRFVEFCKQCPNYGNSWGCPPFDFDTGEFLRQYE YAHLMATKIIPVEKNIPIDRTQELIKPERLRIERELLEMEHRYGGQAFAYVGKCLYCPDS ECARKCNRPCLHPDKVRPSLEAFGFDMTRTLSELFGIELLWGKDGILPEYLVIVSGLFHN SAENIISHTKRNQDSGNL >gi|313157045|gb|AENZ01000076.1| GENE 136 128977 - 130377 683 466 aa, chain - ## HITS:1 COG:no KEGG:BF2899 NR:ns ## KEGG: BF2899 # Name: not_defined # Def: putative outer membrane protein # Organism: B.fragilis # Pathway: not_defined # 1 466 1 476 480 687 70.0 0 MTKQIFFILAVLCTLQAQASIQPVQVDTVQHTPYYNVSEELQPIQPVYLDGVVLPASLTG NWFVSIAGGTSAFLGTPLGCEDLFGRLKPSYSFAVGKWFTPSVGARINYSGVQFKDGTLS NQDYHHIHADLLWNVLGCRYARQEQVRWNLAPFAGVGLLHNASNGNNPFTVSYGVQGEYR ISKRVSAMLELSGTTTFQDFDGYGRPNRLGDHMVSLTAGFTFHLGKVGWKRAVDATSYIR QNEWLVDHANILSGENKRYKDWYDRNRRTVAELKKILEIEGLLDKYGHLVNDDETDRRQG YPRNNYSGLNSLRARLKNRHWDGISPLDSASIGYGNNGDDAEKPGIIVSERTELIQAGNC 
IGSPVYFFFNLNTAHLTDASQMLNLDELARVAKKYGLSLRVTGAADSSTGTPVLNGSLST SRADYIVAELKKRGIPIERIVKVSRGGIADHVPVEVNRHTKVELFF >gi|313157045|gb|AENZ01000076.1| GENE 137 130432 - 131862 785 476 aa, chain - ## HITS:1 COG:no KEGG:BF2900 NR:ns ## KEGG: BF2900 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 476 10 485 485 738 84.0 0 MQVSKGITAAQSNEHLRDRSERAEKYAMNKGNYDPTRKHLNFEIAPGGKVRPIDTSRNIP ERMADILERRGIKDPNEGLAEPKYRTVVNFIFGGSRKRMHELAFGTQNVDFDEGANNTRI KRMSDIERWAKDVYSFVSGKYGEQNIAAFIVHLDELNPHVHCTLLPIKDGRFAYKEIFAG KDKFEYSARMKQLHTDFFAEVNTRWGMSRGTSISETGARHRTTEEYRRMLSEECTTIEEN IDRHQKVLATLQSDIRLAERRVKGLTTMVDNLEKSKAEKEALLSAAEQDLKANKGDAEQL AAQVKSLEKELAGINRQLADKQEKLQTADRQLAELKENMDAIEERTGELKEEAYKYSHDI HSKVDTLLKDVLLENVVGEYRNVSAQMDVAERQLFDGSLVQSIAEQGTDVMHCATMLFLG MVDDATTFAETRGGGGGGSDLKWGRDEDEDNRAWALRSMRMASRMMRPAIGKKPKR >gi|313157045|gb|AENZ01000076.1| GENE 138 132107 - 132952 545 281 aa, chain - ## HITS:1 COG:no KEGG:BF2901 NR:ns ## KEGG: BF2901 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 277 3 280 288 298 57.0 2e-79 MKNNQKIPVSILADREVFEYLKEKGDERKTRTEAYCDLLDKSLAGFVSPFLRKKAYVLQP NQCHLTVSDLASEWHWHRATVRSFLSAMEAFGLLTRIQLPKSVVITMTVQSGQAAQPRNA QEQPDFARQLREVLSDWVIGKTTAAETGVICGQLVSLAKTEIADRDTGLCLDTHSNTTSA HSGTLVTEHRETALCCIAHAALQKILHKSRFDDSSPLVDFFRFDLGEEWAAFIESAKDLA GLILDTEASVTDFDMDEDQERLKSLRKPFLSLLAKAQAMVD >gi|313157045|gb|AENZ01000076.1| GENE 139 132942 - 133532 181 196 aa, chain - ## HITS:1 COG:no KEGG:BF2902 NR:ns ## KEGG: BF2902 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 185 1 185 185 192 62.0 5e-48 MLLNILLLIFCAIPFGVSMSLYKNNKRFMTPFYMAMARSGNARKLYVQVWLICLLLFHYV YACGHMGEFGILLSTGVCAAMFSFRRTDNWLRRLLDRPRAFVTLASGALVIGFVPHLYTL AITIAYLLLAALFYPSVRVMSECKDTDTLSGWAKHPGMLSESYHENHHANLPHEADSGNT DISAQYESLKPNENEK >gi|313157045|gb|AENZ01000076.1| GENE 140 133523 - 133900 420 125 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0388 NR:ns ## KEGG: Bacsa_0388 # Name: not_defined 
# Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 123 9 131 131 169 69.0 4e-41 MVYCINAYRTWIEVADDNLYKEHIISRTDRTDYLVSRTLVLRAFKTNGIHAEGTTWTIPE HELDKALAIYRKQDITFKQRIKKAAMYFSPKDAETLIRLATYGIVQLELIVRPTPIPEKP YYLCY >gi|313157045|gb|AENZ01000076.1| GENE 141 133921 - 134373 607 150 aa, chain - ## HITS:1 COG:no KEGG:BF2904 NR:ns ## KEGG: BF2904 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 150 1 150 150 238 82.0 7e-62 MNMLRHFYNDFMAFVPLQLPQLLDVTTMEEPQFYGDYVLLTFPLRTPYELDEVMDMFEDD MELITLYHHIPMRTEKFGHSTCAYSNPAFGQMFKMNAKTDTEGNVNSILVTIYDSLEQMY GDLCLDLELHSKSGFLKYKKDKSDVLMNFI >gi|313157045|gb|AENZ01000076.1| GENE 142 134395 - 135444 948 349 aa, chain - ## HITS:1 COG:no KEGG:BF2905 NR:ns ## KEGG: BF2905 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 349 1 349 349 399 57.0 1e-110 MKTLKSNLKQPIIFAKIIAVSIAIFLCASCDNGNGKRLAKAKNDPAGTYREYLSDIRRQK KLSIKELAGHLKQWQTLRDSVFLHLECDTLGRLHSTVREECEQIHDSIRMEFSRMVLSQT RTYKDILWVKEQISPHLEDKELHRAAETIRPFFASLDNRPVSQRNRQHILATYRTTLAET IDYGIHCMADLTTFIEKEDAVYRAFLSRLPDFDGENLSDITHDTERCCAQIFLAAERNDI TYRNAMIYLAMRTNRRLIQNIRTCIDDICNKKVKTPAQAQAYIWMLLQPYSSLDGFCLAL LSAEEREQLDMLASQTPDAFRELGRMLHPGDNRLDELPGMLMEAFIHTL >gi|313157045|gb|AENZ01000076.1| GENE 143 136164 - 136343 65 59 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGIQSVIFGKKAVSKVKAPCTAFSINTRFSLTEHRDECRASGNRTYKSIWILKINRLLA >gi|313157045|gb|AENZ01000076.1| GENE 144 136398 - 136502 61 34 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MGDIIILLLVFLVVGRLLRGVFGGFSKSSFRDDK >gi|313157045|gb|AENZ01000076.1| GENE 145 136570 - 136851 268 93 aa, chain + ## HITS:1 COG:BMEI0877 KEGG:ns NR:ns ## COG: BMEI0877 COG0776 # Protein_GI_number: 17987160 # Func_class: L Replication, recombination and repair # Function: Bacterial nucleoid DNA-binding protein # Organism: Brucella melitensis # 1 89 3 91 93 60 37.0 6e-10 MTKAEIVAQISRQSGIEKTVVMTVIESFMENVKESMVAGNEVFLRGFGSFIIKRRAEKTA RNISKNTTIKIPAHNIPAFKPAKAFLNAVKENK 
>gi|313157045|gb|AENZ01000076.1| GENE 146 136939 - 137319 450 126 aa, chain - ## HITS:1 COG:no KEGG:BF2915 NR:ns ## KEGG: BF2915 # Name: not_defined # Def: putative single strand binding protein # Organism: B.fragilis # Pathway: not_defined # 1 115 1 115 126 187 83.0 2e-46 MKKIENNFVVTGFVGKDAEIRQFTNASVARFPLAVSRLENNGEESKRVSAFMNFEAWRKN ENTGSFDQLTKGTMLTVEGYFKPEEWSDQSGVKHNRIVMVAVKFYPPVEKEETPEKPAKP AKKGKK >gi|313157045|gb|AENZ01000076.1| GENE 147 137801 - 138019 120 72 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|319644002|ref|ZP_07998561.1| ## NR: gi|319644002|ref|ZP_07998561.1| ATP-dependent RNA helicase DED1 [Bacteroides sp. 3_1_40A] ATP-dependent RNA helicase DED1 [Bacteroides sp. 3_1_40A] # 1 72 14 85 85 114 100.0 2e-24 MFPSGRKKRGQKGKRKEPCFSEKRTELFTNGCFLQKDMAQPKERDRCKKGNAAWQEWKII PFLPACIIKILP >gi|313157045|gb|AENZ01000076.1| GENE 148 138003 - 138260 206 85 aa, chain - ## HITS:1 COG:no KEGG:BF2918 NR:ns ## KEGG: BF2918 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 85 1 85 86 134 75.0 9e-31 MKATIEHSFCPYCDEVTELYFRIINTILFSTDEAELRTGMERLQEKTRLDDYFVFGYGKH HLWVCQRRPSNQKKIFEHRVIMAEF >gi|313157045|gb|AENZ01000076.1| GENE 149 138244 - 138603 268 119 aa, chain - ## HITS:1 COG:no KEGG:BF2919 NR:ns ## KEGG: BF2919 # Name: not_defined # Def: hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 109 1 109 109 192 90.0 3e-48 MQQIAMKFVQWDVPELEKLKDSRVYQLREQLDNGDKLSREDKNWITRNVKECIHFKRGIA LMGYFFDFSDVLKRYFVKQHGHIAEYYAIDKTALRSVLYGRIEDIVEVELKSKKHESND >gi|313157045|gb|AENZ01000076.1| GENE 150 138615 - 138791 230 58 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0363 NR:ns ## KEGG: Bacsa_0363 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 58 1 58 58 80 74.0 3e-14 MYEYEENSEIIGAHCTLLTPYKGYSEGTVVGDFGNKIVVRLSSGKEVVEYRDEVIIYD >gi|313157045|gb|AENZ01000076.1| GENE 151 138811 - 139104 356 97 aa, chain - ## HITS:1 COG:no KEGG:BF2920 NR:ns ## KEGG: BF2920 # Name: not_defined # Def: 
hypothetical protein # Organism: B.fragilis # Pathway: not_defined # 1 97 1 97 100 137 80.0 1e-31 MANYATNIFYARTENKADLDKIEAFLDDTFDGFVNRHSDSVDAEFTSRWVYPEEEIDRLI ASLEAKDKTYIKILSYEFTDEYVSFRIFSQGKWDIKL >gi|313157045|gb|AENZ01000076.1| GENE 152 140210 - 140470 349 86 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227417938|ref|ZP_03901107.1| LSU ribosomal protein L27P [Dyadobacter fermentans DSM 18053] # 1 85 1 85 92 139 78 1e-31 MAHKKGVGSSKNGRESESKRLGVKLFGGQFAKAGNIIVRQRGTVHNPGENVGIGKDHTLF ALVDGTVEFCKKGAGKSYVSVTPLSE >gi|313157045|gb|AENZ01000076.1| GENE 153 140491 - 140802 381 103 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|124010267|ref|ZP_01694920.1| ribosomal protein L21 [Microscilla marina ATCC 23134] # 1 103 1 103 103 151 68 2e-35 MYVIVEIAGQQFKAEKGRKLYVHRLQGEENSSVSFDKVLLTDNDGQVKVGAPVVKGASVK CKILKHLKDDKVLVFKKKRRTGYQKCNGHRQYLTQVLVEEIVA >gi|313157045|gb|AENZ01000076.1| GENE 154 140958 - 141326 415 122 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157078|gb|EFR56508.1| ## NR: gi|313157078|gb|EFR56508.1| hypothetical protein HMPREF9720_1760 [Alistipes sp. HGB5] # 1 122 17 138 138 226 100.0 6e-58 MNRILIVSDDIFLRDMVRLSLIDMHTEVRCAADADEMEGLCRRVLFDLVIVLHVAPFLCG RDVVRGVRPAGLRRPLFYVVSWLQSEQAVLSLLECGVDQYMAFPLSLQRLRGKVSNDLNR LL >gi|313157045|gb|AENZ01000076.1| GENE 155 141323 - 142270 1106 315 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157211|gb|EFR56641.1| ## NR: gi|313157211|gb|EFR56641.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 315 1 315 315 495 100.0 1e-138 MTETLFMIYSAAAAAVTLWLLGGMAVGRLRRRRRRGRDAVLQRKYLHIVMLALFSGGEEA PRFPLLRRAGARRLLIETVGRLVAATYGLDPAPLRRIVVQYGLDGWLLRRIRFAQGYRRA RYLMLLSRLPAGDDIGAEAARYMRSRNRYVRFYALMTQLAAEPATSLRRMAEYDYPFSAC EVSEIMAMLRRGLLPIAYEPLVGSPNRNLRMVGLGIVRQFGIEEAERLLLAMVAREREPE LGREALYTLCSMRCSLRRREVAGRIASMSRAERKALMRYMAREGYAPAVLRRLFGDRERP YYESLIHSYKRSLVC >gi|313157045|gb|AENZ01000076.1| GENE 156 142264 - 143220 941 318 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157135|gb|EFR56565.1| ## NR: gi|313157135|gb|EFR56565.1| hypothetical protein HMPREF9720_1762 [Alistipes sp. HGB5] # 25 318 25 318 318 481 100.0 1e-134 MLIVLLLLLAAVAAAALALLWLVVRGHNKRLLLAGSHAVASDTPGGIGISVLCSGVHAQE QVENLLSAEYAHYEVIVVLDSLRYPVEFAAFTARYRMIRVEYALSEELPVTGVRALGRSR KRRFRRLVLVDRAQDTPAGDFDAAAEVAAYDYVLPVREGQFLLPDAVVRLVAEVGARGVG GCDCVRARIGEPAVLLSREAVVAAGGFARRPGKAVPWRRRRSLWEPVFYTPQSARRQRIG LQEVLRVVPAAALAAAIVAAVWTGRWPAAAVLLTAALVWCVAGCVSLALSDIACGRTGGD EKGLLRRCKIGVKNFTIS >gi|313157045|gb|AENZ01000076.1| GENE 157 143239 - 144123 1244 294 aa, chain + ## HITS:1 COG:aq_337 KEGG:ns NR:ns ## COG: aq_337 COG1218 # Protein_GI_number: 15605852 # Func_class: P Inorganic ion transport and metabolism # Function: 3'-Phosphoadenosine 5'-phosphosulfate (PAPS) 3'-phosphatase # Organism: Aquifex aeolicus # 23 293 1 256 268 196 40.0 3e-50 MAIEITELKKRATMINDKVRMYLLPPLFNAAVRAGASIMNIYKNLDDYDISLKDDKTPIT LADRLAHKTIREYLGRTRIPILSEEGREMRYDERRNWELYWLVDPLDGTVEFIKGNNEFT VNIALMENNVCMGAVIYVPYFEKMYVAGRDSGSYVKEHIAPDAAAEYTYDQIVTGWRQLP LEEGAEHPRLRVAVSRSHQTPETAEHIARLREAHPDLEIVEQGSSYKFCLLAEGKVDYYV RTTHTYEWDTAAGELILAEAGGRTRTLPDDRELLYNEEDLRNPWFVCRSKHCKI >gi|313157045|gb|AENZ01000076.1| GENE 158 144430 - 144912 500 160 aa, chain - ## HITS:1 COG:no KEGG:STHERM_c15490 NR:ns ## KEGG: STHERM_c15490 # Name: not_defined # Def: hypothetical protein # Organism: S.thermophila # Pathway: not_defined # 2 156 19 174 181 133 55.0 3e-30 MIAYFTAGVFALLFLDYRTHFSSESLALLMRPVSSPWVALGPGLQIFRGVLIALALLPVR 
GFLYGKNGFLKLAWLVLGLSFISTIGPTPGSFDGYIYTILPVQYHLGGIPEAVLYTALFA GILAFWHKSGKRYVTVLSIVLVAIIVLLSVMGFLGAAQAE >gi|313157045|gb|AENZ01000076.1| GENE 159 145012 - 145281 429 89 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157228|gb|EFR56658.1| ## NR: gi|313157228|gb|EFR56658.1| conserved domain protein [Alistipes sp. HGB5] # 1 89 1 89 89 168 100.0 1e-40 MPTWLIVFLFALGAVALFVVGMSLTLMIKGHNIDSEISTNKNMQRLGIKCAVHETREADG SAGCSDTHTAAGCSGNCSACDIEEAGHKK >gi|313157045|gb|AENZ01000076.1| GENE 160 145409 - 146749 2160 446 aa, chain + ## HITS:1 COG:PM1563 KEGG:ns NR:ns ## COG: PM1563 COG0569 # Protein_GI_number: 15603428 # Func_class: P Inorganic ion transport and metabolism # Function: K+ transport systems, NAD-binding component # Organism: Pasteurella multocida # 1 432 1 437 458 229 33.0 1e-59 MKIVIAGAGEMGSHLAKMLSGNGHDITIIDSDQKLLSDVGSLADVITVEGDSTTFAVLRK ASVRKCDLFIAVNHEENDNVVAAMLAKKLGARKSIARIDNNEYLEPNNKEMFIDMGIDYL FYPEKVAAREVINLLGHTSTTEYVDFSSGKLSLVVFRLEPASPLVGRQIEGFDDDETPLS YRTVAITRGGETIIPRQGEIFTEGDVIYVIARQDAVKQVMEFSGQSNIEIKNMMILGGSR IGIRIATELQEEVNIKLVDYNAEKAYRLAELLDKTLIINEDGRNTEAMMEEGLSNMDAFV AVTGRSETNILAAMLAKRMGVKKVIAEVENLNYINLAESIGIDTIINKKLVTASNIFRFT MSTDVQAIKCLTGSDAEVLEFIVKPNAPAVKTRIKDLGLPEDTIIGGIVRGDKVFIAVDN MEINPYDRVVVFAMPASVGKVGYFFN >gi|313157045|gb|AENZ01000076.1| GENE 161 146822 - 148294 2097 490 aa, chain + ## HITS:1 COG:no KEGG:Phep_3685 NR:ns ## KEGG: Phep_3685 # Name: not_defined # Def: GH3 auxin-responsive promoter # Organism: P.heparinus # Pathway: not_defined # 1 484 1 500 502 461 45.0 1e-128 MSFRSILLKAWFSQRERAIDRFRRRPAETQERVFRQLLRRGRRTEFGWRYDLGHVRTVEQ FQTMVETFDYETFKPYIEKMLAGEKNVASPGRVKLFARSSGTTSDRSKYIPVTEESLWWN HTLGMRDVATVFSGNNPKTGVFEGKTLTLGGSCSREGRNLVGDLSAILIHETTFWSGWFR APRMATALIPDFDRKVEAICRECVGENITAFAGVPSWNLAMMRRVLEYTGRQNLLEVWPN LCMFAHGGVEFGPYRRSFEALIPSERMQYMETYNASEGFFALADDPSRDDMLLMLDYGNF FEFRSGGTIVPLEGVECGRVYAMLITSNNGLWRYEIGDTVEFTSTNPYRIRFAGRTRQYI NVFGEELIVDNADRALIAASNETGAVIGEYTVAPCYMSLRERGAHEWIVEFEREPDSREH FAEALDRELRAVNSDYDAKRRTTLERQRLTVVDRGLFLAWLRARGKNKVPRLVNDRHVAE 
EILAFGGEKK >gi|313157045|gb|AENZ01000076.1| GENE 162 148383 - 148997 1051 204 aa, chain + ## HITS:1 COG:DR0298 KEGG:ns NR:ns ## COG: DR0298 COG1428 # Protein_GI_number: 15805329 # Func_class: F Nucleotide transport and metabolism # Function: Deoxynucleoside kinases # Organism: Deinococcus radiodurans # 1 199 1 198 207 174 45.0 1e-43 MYIAIAGNIGSGKTTLTQILTKRYNAKSYLEECNNPYIGDFYEDMNRWSFNLQMYFLGSR IQQTMDMLSDGGSGVIFQDRTVYEDAHIFAGNLHEMGLMPTRDIETYMKIFRLVTELIPK PDLLIYLKASVPTLISQIRKRGREYEMNIDELYLKRLNDKYNNWIDNIYEGDVLVVDKDH EDFISDPAVLDKICARLDALNTRK >gi|313157045|gb|AENZ01000076.1| GENE 163 149001 - 150443 1656 480 aa, chain + ## HITS:1 COG:SP1402_1 KEGG:ns NR:ns ## COG: SP1402_1 COG0144 # Protein_GI_number: 15901256 # Func_class: J Translation, ribosomal structure and biogenesis # Function: tRNA and rRNA cytosine-C5-methylases # Organism: Streptococcus pneumoniae TIGR4 # 1 315 1 277 280 161 35.0 2e-39 MTLPAKFAERVLRDLGDSEGPALCAALDTEPPVSVRLNPAKCGGTENGGAGREEDADKVV VRPDGVPPAVLADADGRVPWCGDGYYLAARPQFTFDSDFHAGAYYVQEAASQFVGHLLQG VEVAGRRILDLCAAPGGKTTLYASLVGPDGLVVANEIDRRRAQVLADNVRKWGTGNVAVT TCEARQLGGFEAWFDVVAVDAPCSGEGMFRKDRDARGEWSEGNVKLCAARQDELLREAWR ALKPGGVLIYSTCTFNRDEDEGVLERMLGWAEDEAAQAGEVAVDASWGIVCGRVGAFRTF RFYPHRARGEGFFAAVVRKAFDAGGRCRTPKARRTVFASVDRAAAAELRRWVNSPERMCF ATVADTCYGYYVAQAEAVKALAEALPVIYSGVAMGQLFKGRLRPDPALAFFCGLNRDAVP AAELDEEQTLRFLRRQEIGAGPFAEGINLVCARGRALGFAKRIGNRVNNMYPNSLRIIKQ >gi|313157045|gb|AENZ01000076.1| GENE 164 150446 - 151012 816 188 aa, chain + ## HITS:1 COG:BMEI1178 KEGG:ns NR:ns ## COG: BMEI1178 COG0789 # Protein_GI_number: 17987461 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Brucella melitensis # 6 80 22 96 193 59 38.0 3e-09 MAEKLFYSMGEVAEMFDVNASLIRHWESQFSVIRPKRNKKGNRLFSPQDVENLKLIYHLV KERGMTLEGAKKALRQQPAEGGMERDAELMERLQRIRALLVEVREDLKAGEGEILADADS DAGAVPDAPDAEAAAPRRRGKPVVKIAAEGEEAADAGHSVAKRTRKPRRKKEEVENKELF AFYEQSLF >gi|313157045|gb|AENZ01000076.1| GENE 165 151194 - 151979 1060 261 aa, chain + ## HITS:1 COG:FN1316 
KEGG:ns NR:ns ## COG: FN1316 COG0327 # Protein_GI_number: 19704651 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Fusobacterium nucleatum # 1 251 1 252 258 128 30.0 9e-30 MLIREITDVIERFAPLAWQESYDNAGLIVGRPDDEVRKALLAVDVTDEVMAEAEREGCDL IITHHPIVFHALKRFNSADQVQRCVERAIRSGIALYACHTNLDSAPEGMSWRLAEMLGVA DLRVLQPSAADAKVGFGVVGELPGAVATVEFMRHIQRTLGVRVVRHSDIASPEVRRVAVC TGAGASMIGEARRAGADIYITADMKYNDFMTPDKALTVADIGHFESEYCAIQILFDILSK NLITFAVRKSEDSRNPVNYLV >gi|313157045|gb|AENZ01000076.1| GENE 166 152003 - 152764 1359 253 aa, chain + ## HITS:1 COG:CPn0525 KEGG:ns NR:ns ## COG: CPn0525 COG1579 # Protein_GI_number: 15618436 # Func_class: R General function prediction only # Function: Zn-ribbon protein, possibly nucleic acid-binding # Organism: Chlamydophila pneumoniae CWL029 # 14 251 1 237 254 59 23.0 7e-09 MATQKKTAEVDYSMQEKIMALYELQKIDSKIDEINKVKGELPLEVQDLEDEMTGLRTRID NINAEIEELNTLTKQRKREVDQAKILIGNYKEQQNNVRNNREFDAITKEIEYQELEIELA EKRLKEYSAGVKAKKLQLEEAEAVADGRAADLAAKKSELEGIEAETAPLVAEFEVQADRA KAKIDERLLAAYSRIRQNVRNGLAVVTVKRDACGGCFNRIPPQRQVEIRQGKKIIICEYC GRILVTDPEEVQE >gi|313157045|gb|AENZ01000076.1| GENE 167 152831 - 153280 419 149 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1316 NR:ns ## KEGG: Bacsa_1316 # Name: not_defined # Def: cupin 2 barrel domain-containing protein # Organism: B.salanitronis # Pathway: not_defined # 1 149 1 149 149 187 58.0 1e-46 MKKVEKTAGGANFAAVTVGKMDELNQHTLILAPGVEIPGKVFAGSALGATGAELSFQRIE PGAGVGFLHTHKTHEELYIIVRGDGEFQVDGKIFAVGEGSVIRVSPDGRRALRNTGAEPM IMICVQYKADSFASSDADDGVILDVGVVW >gi|313157045|gb|AENZ01000076.1| GENE 168 153582 - 155162 2377 526 aa, chain + ## HITS:1 COG:aq_959_2 KEGG:ns NR:ns ## COG: aq_959_2 COG0171 # Protein_GI_number: 15606275 # Func_class: H Coenzyme transport and metabolism # Function: NAD synthase # Organism: Aquifex aeolicus # 251 524 13 285 287 268 49.0 2e-71 MKIAIAQLNYTIGDIDGNASKIIDSINKAKAQRADLVIFAEQAVSGTPAFDLLRKTTFLE LCEDALVEIASCCDGIAAIVGLPILTREGTISAAALIQDRKVLRYVGKKYITARREMGFL 
VPSKGFEYATIKGHKCAIIVGDDLSREHDFDKSVETVISINARKYGKGAMTYRYDMMRHL SFVEAKNIVLVNQVGGATDIVYDGTSGAFNNRGELVLMMKQFEEDFQIFDTKAQNPPVGV PSTYNDRTRMVYQAARCGLRDFFLKNGYKKACIGLSGGIDSAVVACLAADALGAGNVRAL LLPSQFSSDHSVEDAKKLAENLGIEYNVIPITEIYTSVVNTLKPVIGGREFDATEENIQT RIRTVLLMAVQNKTDYILLNSSNKSENALGLCTLYGDTAGAFSPTGDLYKSEMYDVARYI NRTFGNVIPENILDKEPSSELHPGQKDSDILPPYEVVDAILLRMIEEGQHREEIVNAGFD SEVVEKIHCMIMRNEKKRYQFPPVLRLSMCSFGHERLMPLTNKYGD >gi|313157045|gb|AENZ01000076.1| GENE 169 155166 - 155594 433 142 aa, chain + ## HITS:1 COG:no KEGG:BDI_0517 NR:ns ## KEGG: BDI_0517 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 6 140 5 136 138 85 35.0 6e-16 MASGIIEHEGIVSRVEGDRVYVKITSQSACGTCKAREACGLAEAREKIVEVTTPGAQQHY TAGESVTVGVRRSAGAVAVILAYVGALAVLLAVLAAAVAALGWSEGRSALAALAAVGVYY CVLWLFRHKIEHTIHFSITKNY >gi|313157045|gb|AENZ01000076.1| GENE 170 155594 - 156679 1147 361 aa, chain + ## HITS:1 COG:MA0664 KEGG:ns NR:ns ## COG: MA0664 COG2878 # Protein_GI_number: 20089551 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfB # Organism: Methanosarcina acetivorans str.C2A # 3 267 4 263 264 177 42.0 2e-44 MEVLLYTILTLCALGVLSAVILYFVAQKFRVEEDPRIDEVEKMLPGANCGGCGFAGCRGM ADALVKQDDISSLFCPVGGGDCMKAVAAYLGKSAPEKEPQVATVRCGGTCEKRPRTNDFN GAKSCAVASSLYVGETGCAFGCLGFGDCVVSCAFDAIRMNPETGLPEVDPDKCTACGACV KACPKMIIELRKKWPKNRAVYVSCVSKDKGAVVMKACKAGCIGCGKCQKVCAFDAITIAN NLAYIDPQKCKLCRKCVNECPTGAIRLVGMEPLPKAPKTPAAPAAPKAAPVAEKTAPAAE KAVESAKPAAEKAALASGQAAPADASKAAAVEKPAGSAAAPEKAAPNAEKAAPAETASKA E >gi|313157045|gb|AENZ01000076.1| GENE 171 156775 - 158118 1971 447 aa, chain + ## HITS:1 COG:TM0244 KEGG:ns NR:ns ## COG: TM0244 COG4656 # Protein_GI_number: 15643016 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfC # Organism: Thermotoga maritima # 7 441 22 450 451 355 46.0 1e-97 MKTFPIGGVHPSENKLSGAKPIEVLPLPDAVAIPLAQHIGAPAAAKVAKGDKVLTGQLIA 
EASSFMSANIHSPISGTVTAVDAVPNGQGLRQVMITIKREGDEWAEGIDRSETLVRECAL SSAEIIARIKDAGIVGMGGATFPTHVKLSVPPGRKAEALIINGVECEPYLTSDHRTMLEH GEELLVGVTILMKAIAVEKAYIGIENNKPDAIAHLRKLAEGYKGIEVVPLKVKYPQGGEK QLIAAVTGREVPPPPALPIDVGAVVCNASTTYAVYQAVQKNKPLIERVVTVTGKGVKEPK NLLTRMGTPISALLEAAGGLPADAGKVINGGPMMGRAMVNLDSPVTKGCSGITVMSGRDA VRREASQCIKCAKCVAACPMGLEPYYLSKMTQKKGWDALEEQMITSCIECGCCQASCPSY LPLLDWVRLGKQTVMGIIRARAAAPKK >gi|313157045|gb|AENZ01000076.1| GENE 172 158149 - 159120 1531 323 aa, chain + ## HITS:1 COG:FN1595 KEGG:ns NR:ns ## COG: FN1595 COG4658 # Protein_GI_number: 19704916 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfD # Organism: Fusobacterium nucleatum # 1 321 1 307 314 234 43.0 1e-61 MANKLVVAPAPHVQTSQSTARIMRDVVIALMPALAVSAVVFGVDVLRVTALSVAACVLFE YLIQKFLVRGASTIGNWSAVVTGVLLAFNLPASIPWWIVLIGAFVAIAIAKMTFGGLGKN PFNPALVGRVFLLIAYPVQMTSFPVPVNGSFDALSGATPLAAVKHGAAADVLGVQELLLG NMPGSLGEVAALALICGFVYLLWRRVITWQIPVTILGTMALFAFVVAAANGGLSAGPVLW QFPLFHVLAGGALLGAIFMATDYSTSPMTVRGGVIFGVGIGAITMCIRLWGAYPEGMSFA ILIMNAVVPLINKYVKPKRFGVK >gi|313157045|gb|AENZ01000076.1| GENE 173 159134 - 159811 923 225 aa, chain + ## HITS:1 COG:HI1687 KEGG:ns NR:ns ## COG: HI1687 COG4659 # Protein_GI_number: 16273574 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfG # Organism: Haemophilus influenzae # 70 181 79 188 207 78 37.0 1e-14 MKSTLLNMTAVLFGITLVASAGVGAVNMITAEPIAQAEQAAKTAALNGVLPAFDETAAET LTIDEMPITVYTATKGGQLAGYAVESMTKNGFGGAINMMVGFTPDGEVINVNVLKQAETP GLGTKMADADNVLLRSVKGQKLEEKKLVNGKLAVAKDGGDVDALTAATISSRAYVDAINR AWMAYKSVATGSTPTDVSSGATSAAGDTAEEQTSGPEAQEGGQNE >gi|313157045|gb|AENZ01000076.1| GENE 174 159804 - 160388 861 194 aa, chain + ## HITS:1 COG:TM0247 KEGG:ns NR:ns ## COG: TM0247 COG4660 # Protein_GI_number: 15643019 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfE # Organism: Thermotoga maritima # 1 188 1 186 200 196 60.0 
3e-50 MNKLNIIVSGIVKNNPTFVLVLGMCPTLGTTTSAENGMGMGLATMAVLIMSNLVISLIKN IIPDKVRIPAFIVVIASFVTVIQMLMQAYVPALYASLGVFIPLIVVNCIILGRAEAFASK NSPLDSILDGVGIGLGFTLALTAVGAVREILGSGAIFGVSLGTADFMPLIFVLAPGAFLV LGYLMVLFNKLAKK >gi|313157045|gb|AENZ01000076.1| GENE 175 160391 - 160969 1115 192 aa, chain + ## HITS:1 COG:FN1592 KEGG:ns NR:ns ## COG: FN1592 COG4657 # Protein_GI_number: 19704913 # Func_class: C Energy production and conversion # Function: Predicted NADH:ubiquinone oxidoreductase, subunit RnfA # Organism: Fusobacterium nucleatum # 20 191 21 192 194 185 62.0 5e-47 MELSYFAIIIGAIFVNNVVLAQFLGICPFLGVSSKVETSMGMGAAVTFVMAIAAVVAWLI QTYVLVPLDIVYMQTIVFILVIAALVQMVEIMLKKLSPSLYQALGIFLPLITTNCAVLGV AILMIQKEFNLLQSVTYSVATALGFALALVLFAGIRERLDFEDVPKAFKGVPIALITAGI LAMAFMGFSGLV >gi|313157045|gb|AENZ01000076.1| GENE 176 161085 - 161267 384 60 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKLSEWFKARSGQELMMIAVVAMLLVMIATRWAYISRTAKEAIRQRFVPPAEQADSLQAK >gi|313157045|gb|AENZ01000076.1| GENE 177 161357 - 161899 879 180 aa, chain + ## HITS:1 COG:CAC3203 KEGG:ns NR:ns ## COG: CAC3203 COG0634 # Protein_GI_number: 15896450 # Func_class: F Nucleotide transport and metabolism # Function: Hypoxanthine-guanine phosphoribosyltransferase # Organism: Clostridium acetobutylicum # 13 180 7 174 178 116 36.0 2e-26 MKDIIKLHDRKFKIMIPAEKIDEAVSAVAQRINEDYGDKETPLFVGVLNGSFMFMSDLIK KIEFNNELSFVKLASYEGTCSSGCVKSLIGLNGSIEGRHVIVVEDIVDTGESIEYMICDL KAQKPASLEVCTLFFKPGSYRKQIPIKYRAMEIGNEFIVGYGLDYDQLGRSLKDIYVVTE >gi|313157045|gb|AENZ01000076.1| GENE 178 161899 - 162534 825 211 aa, chain + ## HITS:1 COG:sll2011 KEGG:ns NR:ns ## COG: sll2011 COG0602 # Protein_GI_number: 16330304 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Organic radical activating enzymes # Organism: Synechocystis # 17 210 15 207 208 176 44.0 2e-44 MAFTTDTELLDGGRLLPLVEDFYTIQGEGFHAGKPAYFIRLGGCDVGCRWCDAKYTWNPK LYPPTDVRTVIDRALACPAQAIVITGGEPLLYPLGVLTETLHEKGLQIFLETSGTHPFSG YFDWVCLSPKRQQPPLDEALERAHELKVIVESESDFEWAERNAARVRPECMLYLQPEWSV 
AERVMPAMVEYAKAHPKWNISIQTHKYMHIP >gi|313157045|gb|AENZ01000076.1| GENE 179 162534 - 162809 457 91 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157167|gb|EFR56597.1| ## NR: gi|313157167|gb|EFR56597.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 91 1 91 91 134 100.0 3e-30 MSFEVGRKFWIAATAVIVVVTLFVVGRNSLHAVKIKRQINAMTREKEYYRTKIEQDSTLL ERLQYDDYLEEYARENYHMQRRGEHVYIIKE >gi|313157045|gb|AENZ01000076.1| GENE 180 163258 - 164196 1190 312 aa, chain - ## HITS:1 COG:YPO3180 KEGG:ns NR:ns ## COG: YPO3180 COG0611 # Protein_GI_number: 16123342 # Func_class: H Coenzyme transport and metabolism # Function: Thiamine monophosphate kinase # Organism: Yersinia pestis # 3 312 5 322 329 150 36.0 4e-36 MTEFGFIDRIKDLFATLPDNGFEGIGDDCAVLPVGGGESLVFTTDMLAEGVHFLRTATSA RGLGRKSLAVNLSDVASMGARPIATLLSLSLPDDATGAWAEEFMQGYRELSQEFGVTLAG GDTTRSAAGITINVTAIGRAADTHIKRRSGARPGDVIFTAGALGASGAGLRDILAGRYDT PAAAIHRNPRPQVAEGLWLGRRHEVHAMMDISDGLASDIRHIMERSGVGAAIDTERIPAA ADADIRTAACAGEDYVLLLTAECADAERLAADFLARFGAPLHPVGRITDGRELEWFENGT LRPLDWHGFTHY >gi|313157045|gb|AENZ01000076.1| GENE 181 164348 - 164776 638 142 aa, chain + ## HITS:1 COG:Cj0066c KEGG:ns NR:ns ## COG: Cj0066c COG0757 # Protein_GI_number: 15791458 # Func_class: E Amino acid transport and metabolism # Function: 3-dehydroquinate dehydratase II # Organism: Campylobacter jejuni # 1 137 1 139 159 144 47.0 4e-35 MKILILNGPNLNLQGRRDTGVYGTQTFETFFERLRSLYPAVGFGYFQSNIEGELIDAVQQ ADGVYDGVVLNAGGYTHTSVALRDAVSAVSVPVVEVHISSILAREEFRHVSLLAPVAKGS IMGFGLDSYRLGVEALLLGGEK >gi|313157045|gb|AENZ01000076.1| GENE 182 165013 - 166488 2009 491 aa, chain + ## HITS:1 COG:BB0348 KEGG:ns NR:ns ## COG: BB0348 COG0469 # Protein_GI_number: 15594693 # Func_class: G Carbohydrate transport and metabolism # Function: Pyruvate kinase # Organism: Borrelia burgdorferi # 5 482 4 476 477 362 43.0 1e-99 MYRMKKTKIVATMSDFRCTEEFVKNLYDAGMDVVRVNSAHVTEDGATHIVETVHRVNPAI PIMIDTKGPEIRVTTIADEYGNSINFRVGDRVAVRGSDGSDMTTRKVVYMNVPTIVRDIP VGARMLIADGELEIRVVEKTDEELICEFVVGGAMRSRKSVNVPAVSIDLPSVTEKDRRFI 
EWAVKNDVDFIAHSFVRSAKDIKAVQEILDAHGSKIKIISKIENQEGLDNIDEIIEASYG IMVARGDLGVELPAEAIPNTQRRIVEKCICAKRPVIIATQMLYSMVKSPRPTRAEVSDVA SAIYERVDAVMLSDETAMGDYPVESVETMARVAREIERDETHFKPMIDMDMVSVNHEITA QLARSAVRASTNLPIKYVVLDTKTGRTGRYLAAFRGRKTVMAVCYQLHAQRILALSYGVV PILRNQELHDKYHFLVDALEFLDQYRKLKGEDLLAIVGGSFGPDGGASYVEIANVDNIRR RNEEIVAAQKC >gi|313157045|gb|AENZ01000076.1| GENE 183 166492 - 167940 1953 482 aa, chain + ## HITS:1 COG:mll5270 KEGG:ns NR:ns ## COG: mll5270 COG2244 # Protein_GI_number: 13474395 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Mesorhizobium loti # 2 470 73 543 561 133 23.0 9e-31 MKGELKDKVARGVAWSMAEKIGTMLLQMAVSLIILRLLTRDILGVMAIPTAVVAVALVLV DSGFSQALIRKAAPSADDYKSVFAFNIGVSLVMYALFTALAFPASRFYDMPQIAQIAPVF FLQLPIGALCAIQNTICVRKFRFALLSKVTFASSLAGGLAAIGLALAGFGIWCIVAQRVL QAAVRTLLLWWLSDWRPCGACSRGPLREMAPFSFSLIATDLISTFYNKIPQFFLGRLYPA DTLGSFDQAIKLKDQPTISAMQAVQSVTFPALAKIKDDAPKFAESYRQVVMVVSYVLFPV MLGMSAVAHDMFAVLLGEEWMPTVPYFETVCLVGLFYPMSIIAYNVLKVKSGGALIVKLE IVKKVIMTLVFAVTIPISVQAVVWGLVVIAFCEMSVNFLATTKFTSLSPGRFVRTLLPPL LAAAAMYGAVRLTARAIPEHALLRLMAEVTAGAVSYVLLSALFRLEAFREVAAIVRKQFL HR >gi|313157045|gb|AENZ01000076.1| GENE 184 168056 - 169123 1427 355 aa, chain - ## HITS:1 COG:CAC1660 KEGG:ns NR:ns ## COG: CAC1660 COG3426 # Protein_GI_number: 15894937 # Func_class: C Energy production and conversion # Function: Butyrate kinase # Organism: Clostridium acetobutylicum # 1 354 1 355 356 388 52.0 1e-107 MGFKILAINPGSTSTKVALYDEERPLLDLTLRHSAEQIAHYPNIIDQLDWRRDMILTALK EHGFDIMSLAAVIGRGGLIKPIPAGVYEVNDAMRRDLQHATMEHASNLGGLLADGIAATA GIKAYIADPVVVDEMDDIARLSGHPDCPRRSIFHALNQKATARLHCDRIGIVYEKANLVV AHLGGGISVAAHKQGRVVDVNNALDGDGPFAPERAGTLPAGELVDLCFSGRYTRHDIQQM LAGKGGLVAHLGTNSMIQVMERIGQGDEKARVVKDAMCYGIVKQIGAMAAALGGRVDAVI LTGGIAHNKSVVEYISEFCSFIAPIAVYPGENELESLVTNALVVLRGVITPKVYA >gi|313157045|gb|AENZ01000076.1| GENE 185 169182 - 170090 1217 302 aa, chain - ## HITS:1 COG:CAC3076 KEGG:ns NR:ns ## COG: CAC3076 COG0280 # 
Protein_GI_number: 15896327 # Func_class: C Energy production and conversion # Function: Phosphotransacetylase # Organism: Clostridium acetobutylicum # 1 298 1 296 301 209 40.0 4e-54 MIQHLTEIVEEARKRGKKRLAVAYGQDSHTLEAVYEAYKEGLVEPTLYGEKSVIEQVCAE NDIDIKAFNIVDETSDVKCVQQAVAAVVAGNADVLMKGLVSTDKYMRGILNKEAGLFPPK GVLSHVSIVEMPCYHKLLVISDVAVIPLPDFKQKTVQIGYLARTANLLGIKTPKIACIAP SEQLLPNVISSTEGALLAKMGDRGQLGDVVVDGPLSLDVALYKDVAEHKKVKGSSVAGDP DCLLFPNLESANVFFKSVTHLCGGELAAMVMGTKVPCVLTSRGDTSKTKLYSIALACLAV KQ >gi|313157045|gb|AENZ01000076.1| GENE 186 170116 - 171318 1905 400 aa, chain - ## HITS:1 COG:TM0274 KEGG:ns NR:ns ## COG: TM0274 COG0282 # Protein_GI_number: 15643044 # Func_class: C Energy production and conversion # Function: Acetate kinase # Organism: Thermotoga maritima # 1 400 1 399 403 424 54.0 1e-118 MVILVLNCGSSSIKYQVIDMEAASSTLLAKGIVERIGLPEGDLIHKPVGKQPFELHRPIP DHTTGIKLVLDALTDPEHGVIRSLDEVKAVGHRVAHGGEFFPESCIVTEEVKSKIRSLFE IAPLHNPANLEGVLSIEKVLPGTPQVTVFDTSFHQTIPAVNFMYALPHAYYDKYRVRKYG FHGTSHKYVAQTGAKLAGLDFENSKIITCHIGNGASVTAVLNGKSFDTSMGFSPVDGLVM GTRCGNVDPSAVTFIGEKEGMSYAELNEMMNKKSGVLGLTDLSSDMRDIDLAYDEGNPRA ILARDMHYGRIRKFVGEYAAEMGGVDMIVFTGGVGENSCEMRESVCTGLEFMGVEFDREA NKGARGVNKILSTPGSRVKVAVIATDEELVIATDTYNLVK >gi|313157045|gb|AENZ01000076.1| GENE 187 171334 - 171705 538 123 aa, chain - ## HITS:1 COG:BS_yhdV KEGG:ns NR:ns ## COG: BS_yhdV COG0239 # Protein_GI_number: 16078026 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Integral membrane protein possibly involved in chromosome condensation # Organism: Bacillus subtilis # 7 118 6 116 131 62 46.0 2e-10 MIRELMAVGFGGAAGSIVRYLLSGGILAGQTLLGFPAGTFTVNAAGSLLIGILLEATSSE TLGWLLIVGFCGGFTTFSAFSADTVRLLRAGCYNAAAIYVALSVAVCIVFAALGMWIGTT FRN >gi|313157045|gb|AENZ01000076.1| GENE 188 171813 - 173063 1907 416 aa, chain - ## HITS:1 COG:NMB1068 KEGG:ns NR:ns ## COG: NMB1068 COG0014 # Protein_GI_number: 15676952 # Func_class: E Amino acid transport and metabolism # Function: Gamma-glutamyl phosphate reductase # 
Organism: Neisseria meningitidis MC58 # 5 415 9 420 420 350 50.0 2e-96 MEYESLFKAARQAGTLLAETETERLGRTLMHAAQLLRTHTPRLLEANARDLEAMAPDSPL RDRLRLTPERIAAIADDMEAVAHLPSPQGEELERRTRPNGMTIVKRRVPFGVVGMICEAR PNVTADIFSLCIKTGNACVLKGGGDARHSNEAIAALLHEALRSEGIDEHAFTLLPSDHAS VGALLNAVGYVDVVIPRGGAGLIRFVRENARVPVIETGAGIVHTYFDRDGDLAKGRAVVC NAKTRRVSVCNALDCLVIHRDRLADLPELCAPLAAARVTVYADAEAFAALGGHYPADLLE HASEEHFGTEFLDYKLAVRTVASPEEALAHIARYSSRHSEAIVTENKQTATQFTRAVDAA CVYVNVSTAFTDGGQFGFGAEVGISTQKLHARGPMGLPELTTYKYVVTGDGQIRES >gi|313157045|gb|AENZ01000076.1| GENE 189 173071 - 174162 1496 363 aa, chain - ## HITS:1 COG:Cgl2305 KEGG:ns NR:ns ## COG: Cgl2305 COG0263 # Protein_GI_number: 19553555 # Func_class: E Amino acid transport and metabolism # Function: Glutamate 5-kinase # Organism: Corynebacterium glutamicum # 2 360 43 402 409 189 36.0 7e-48 MDRTSKYRRIVIKAGSNVLTRDDGRPDTTRISSLVDQIARLHRAGIEVILVSSGAVASGR SVLEQRTGRIDTVSARQLFSAVGQVKLLNRYYDLFNEYGIVCGQVLTTKESLSTRRQYLN QRNCMEVMLANGVIPIVNENDTISVTELMFTDNDELSGLVAAMMQAEALIILSNIDGIYD GSPSDPSSQVIRRVKPSEDLSRYIDPARSSRGRGGMATKSRISSRVAGEGIEVIIANGRR DNILTALALTGEEVLCTRFEPAPHTASGVKMWIASSEGFAKGVLRIDEGAAAAVRASKAA SILAVGVTAVEGDFERDDLVRILSPDGTQLAVGRISCDSETARRNIGRKGLKPLVHCDYL YLE >gi|313157045|gb|AENZ01000076.1| GENE 190 174191 - 174859 1174 222 aa, chain - ## HITS:1 COG:CAC1700 KEGG:ns NR:ns ## COG: CAC1700 COG0745 # Protein_GI_number: 15894977 # Func_class: T Signal transduction mechanisms; K Transcription # Function: Response regulators consisting of a CheY-like receiver domain and a winged-helix DNA-binding domain # Organism: Clostridium acetobutylicum # 4 220 5 227 232 179 41.0 3e-45 MSHRILLVDDEPDILEFVKYTLVKEGYEVFTAQNGAEALKAAAQHRPHLILLDMMMPVMD GAQTCRAIREDPQLRDTMVVFLSALGEEEQQLAGFGVGADDYLTKPIKMKLLVSRVQAIL KRIDAGKAPAPAEEGVAMDRERYTVTRDGREISLPRKEFALLDLLYSSPGKLFSREEIYA RIWGSEVVVGDRTIDVHIRKLRQKIGDEKIVTVKGVGYKYEP >gi|313157045|gb|AENZ01000076.1| GENE 191 175036 - 177393 3325 785 aa, chain + ## HITS:1 COG:no KEGG:BDI_2685 NR:ns ## KEGG: BDI_2685 # Name: not_defined # Def: 
hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 216 775 53 617 617 427 44.0 1e-117 MATENPILPAPEEQVSKAAEDVQKATSEVTPEVPDSETAAIAEPETPAVAQGEPAASAAE APVSEALAEETVAETAAEAPAAGEVAADAPAAGEVAADAPADGEGAPAEMPAAEVPAAEV VAEAPAAETAAEAAPAEETARAKRARIKPAAEAVVTETEEEAAASDVQVDFADEEAALAA QDAGLELEGETAEEAEAEQMAGSAADTEDKFTGRGKEELVALFTRMLEEQPVQSIRRDVE ALKIAFYRIRRAEVEAARRKFIEEGGAEEDFAPAVDGVEVQLKELFKEYRRRRDEFIANL EAEKEKNLQVKLGIIEELKELVNSDETLNHTFTKFRELQQRWKETGIVPLQNVKDLWETY NLHVENFYSFIKINKELRDLDLKKNYEQKVTLCEQAEALVLEPSVVEAFHKLQKLHDEWR ETGPVANEYKEVLWERFKAASSRINKQHQEHFEALKGEQVRNLELKTELCAATEELAAQP LTTRKEWNRASDRLLEIQKTWKTIGFAPKKDNNRIYERFRTACDRFFEAKRQFYSGVKAE MEHNLQLKTEICEAAESLMNSEEWKKATDELIALQARWKEIGAVSRRHSDAIWRRFRAAC DKFFERKGAHFASVDGEHEENLKKKLALLEEMAAADVKTGGYDVIRDFQRRWGEIGFVPI KQKDAVQKKYKAAVDALFNTLRGTERDRSMNRFREKVSSFKSAGGSRLRSERERLYGKVR QLEQEIGLLENNIGFFAKSKNAESLIADVKAKIERAREDMAAAIEKVKLIDRQAQEENHE NNENK >gi|313157045|gb|AENZ01000076.1| GENE 192 177405 - 179027 2709 540 aa, chain + ## HITS:1 COG:BH3792 KEGG:ns NR:ns ## COG: BH3792 COG0504 # Protein_GI_number: 15616354 # Func_class: F Nucleotide transport and metabolism # Function: CTP synthase (UTP-ammonia lyase) # Organism: Bacillus halodurans # 7 534 4 531 532 645 58.0 0 MKLSKPKYIFVTGGVASSLGKGIISASIARLLQARGYSVTIQKLDPYINVDPGTLNPYEH GECYVTEDGAETDLDLGHYERFTNQPTSKANNVTTGRIYKSVIEKERKGEYLGKTVQVIP HITDEIKRRIQLLAQKKVYDVIITEIGGTVGDIESLPFIESVRQLRYSLGYKNTALVHLT LIPYMAASGELKTKPTQHSVKALLENGLQPDILVLRTEHPLSVNLRRKVALFCNVDANAV MESIDVPTIYEVPQKMHDQHLDEVVLQKLDLPADKEPDMTAWNALVDKIKHPSKKIEIAL VGKYTELPDAYKSICESFVHAGAVNDCKVKLRYVNSEKITPETVAEKLGKMSGILVAPGF GNRGIEGKIVAVRYARENNIPFLGICLGMQCAVIEFARNVLGIADANSSEMESTPHPVID LMEEQKGVTAKGGTMRLGAYPCALKKGSKVAAAYGKLNISERHRHRYEFNNDYLEQFEAA GMQAVGINPDTGLVEVVEVANHPWFVGTQYHPEYKSTVLKPSPLFVAFVGAALKYAGEKK >gi|313157045|gb|AENZ01000076.1| GENE 193 179061 - 181031 2826 656 aa, chain + ## HITS:1 COG:NMA0548 KEGG:ns NR:ns ## COG: NMA0548 COG0706 # Protein_GI_number: 15793542 # Func_class: U Intracellular trafficking, 
secretion, and vesicular transport # Function: Preprotein translocase subunit YidC # Organism: Neisseria meningitidis Z2491 # 180 616 123 538 545 155 29.0 3e-37 MDKKTILGIVVVAVLFLGFAYVNTKQQEKYQQEMAAWQAYQDSVAAASRPAVPAADSVAG GAAESAVAASGETTAPEAEADLAQTVRQRRIAAMGEYLTAAQEAEPEEFTVENEVMTVRF STRGGQITGVTLKDYTKYAPRGQRDQLIELMDPASARFDMSFYVKNGLNNVKVNTMDYVF RAEPVETAGDARRVTMRLAVAENAWLEYEYLIYNKQAPERDYLVDFNVRLVNMAPQMANQ TSIGIDWSNVSYQNEKGFQNENMYTTLAYRFPGESSIEELGMSDGAKSKSVSTAVNWVAF KQQFFSSVFIAPQNVSSANMAFDTAAPGSELLKSFSVQMAVPYSAQVEGYDFAFYFGPNK YAILKKVTDNNGADLHMERLIPLGWGIFGWVNRWCVIPVFDFLRNYIGSFGIIILILVIL VKLVISPLTYKSYVSMAKMRLIKPQVDELNKKYPKKEDAMKKQQATMELYKKAGINPMGG CIPMLIQLPILIAMFRFFPASIELREQPFLWADDLSSYDSIVNLPFSIPFYGDHVSLFAL LMAVSLFGYSYFNYQQTASSQPQMAGMKFMMVYMMPIMMLLWFNSYSSGLCYYYLLSNLF TIGQTLVIRRIVDDEKIHAVMQANAARKSKGKKSKFQQRYEELMRQQEAQQRAKRK >gi|313157045|gb|AENZ01000076.1| GENE 194 181255 - 182079 1164 274 aa, chain + ## HITS:1 COG:no KEGG:Odosp_3333 NR:ns ## KEGG: Odosp_3333 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 4 274 14 322 322 328 53.0 2e-88 MAAGTALRAQSLGGEYRLKRIVPVEGRQGIAADSNYYYVSGSTALYKYDKQGRLVAENEH PFEGLPLEANHLGDIDVWNGELYAGIETFEDGKGENIQVAVYDADNLKWKRSIDWDPESG QVEVSGLAVDRDNDMVWMSDWVDGRYLYGYDLKTGKYVRKVHLRPVPQWQQGIFMVDGRM LISADDGDADLDEPDTIYIADMRDGKSYATALPFRAMTDFLRAGEIEGLAIDPQTDELLV LANRGSRIVLGMVKGFYPGYDKELHEVYVFEKVK >gi|313157045|gb|AENZ01000076.1| GENE 195 182350 - 182727 448 125 aa, chain - ## HITS:1 COG:AGc3635 KEGG:ns NR:ns ## COG: AGc3635 COG1733 # Protein_GI_number: 15889290 # Func_class: K Transcription # Function: Predicted transcriptional regulators # Organism: Agrobacterium tumefaciens strain C58 (Cereon) # 13 115 36 137 147 115 55.0 2e-26 MAKTATDPGYAACPIRNIVDRFGDKWSLLVLYNLHTGGRLRFSEIHRRMTDISQKMLAST LRRLEQDGLLSRTVYPEVPPRVEYALTPRGESLMPHLVSLIGWALENFDAIVSDRNSLQA TGPDL >gi|313157045|gb|AENZ01000076.1| GENE 196 182729 - 183928 599 399 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|223476703|ref|YP_002580685.1| ribosomal 
protein L11 methyltransferase, putative [Thermococcus barophilus MP] # 35 399 30 393 396 235 40 1e-60 MTPQIYLRKGKEESLMRRHPWIFSGAIDYIKAEDESEIAEGALVEVYTRSGEFIAQGHYQ IGSIAVRVLSFCRETIDQEWWNRRIAVAHDVRRTLGLTDNPDTNCYRLVHGEGDALPGLV VDIYGTAAVVQCHSVGMYHARMEIAKAIRATYGDKITAIYDKSSQTVPYNARLGAVDGYL WGSSDHESHIVTENGEKFLVNWEKGQKTGFFLDQRENRELVKRYAKGRTVLNTFCYTGGF SVYAMAGGARKVCSVDSSERAVTLATENMVLNFGREADFCEIAADAVEYLKDIGDKYDLI ILDPPAFAKHHKVLGNALQGYKRLNARAISQIKPGGILFTFSCSQAVSKELFRTTVFSAA AIAGRNVRILHQMTQPADHPISIYHPEGEYLKGLVLYVE >gi|313157045|gb|AENZ01000076.1| GENE 197 183925 - 184554 1000 209 aa, chain - ## HITS:1 COG:no KEGG:Odosp_3430 NR:ns ## KEGG: Odosp_3430 # Name: not_defined # Def: 3'-5' exonuclease # Organism: O.splanchnicus # Pathway: not_defined # 17 198 9 190 203 171 48.0 2e-41 MGTFHRMTTTNNFTDKISNEETAALPAIEFKGEIRIIEHERDIVPACKFLMKQAVVGFDT ETRPSFRPGISYRVSLLQLSTPQLCFLFRLNKIPLAKPILQVLETDSILKIGADVAGDLR SLRQIRHFRDGGFVDLQSIAPEWGIEDKSLRKLSAIVLRQRVSKAQRLSNWEAATLTDKQ KLYAATDAWVCTAIYDKLLHTPKIKRPKA >gi|313157045|gb|AENZ01000076.1| GENE 198 184627 - 186558 2894 643 aa, chain - ## HITS:1 COG:lin2900 KEGG:ns NR:ns ## COG: lin2900 COG0514 # Protein_GI_number: 16801959 # Func_class: L Replication, recombination and repair # Function: Superfamily II DNA helicase # Organism: Listeria innocua # 11 338 8 337 590 318 49.0 2e-86 MEGSADKIFEVLKRYWGFTEFRPVQERIIRSAMAGRDTLALMPTGGGKSLTYQVPGLAQP GLCIVVTPLIALMKDQVDRLRARRIPAVAIHSGLSPRQIDIALDNCVYGDVKFLYVAPER LATEAFRLRVERMKVSLLAVDEAHCISQWGYDFRPSYLRIAELREKLPGVPVLALTASAT KLVAEDIMRHLRFAEPHILRSSFARPNLSYSVRRTDDKNGQLLRLVQNVPGSGIVYVRTR EGTEQVADMLRRQGVTAAAYHGGMGHAERSLRQEEWVAGRTRVMVATNAFGMGIDKPDVR FVAHYAMCDSLESYYQEAGRAGRDGVRSYALLLTSPDDGGRIIKRFEQEFPPLEKIKDIY EKVCSYLQIGIGDGAEASFLFNIHDFCAREHLYSGTVQSALKLLQQNGYMTLTDAQENPA RIMFCVSRDDLYRIRVQRDELDHFLRTLLRLYNGVFTEFRQIDEGEIATWSGYTVERVKE LLKRLWQLRVIRYVPSNRSPILFMNEERLPRADLYISPDTYKRRQELMRERFEHMLDYAA NETRCRSAVLEAYFGEGDPAPCGVCDICLARRRAAKQKSADAAGNAAESLRKEVLERLAQ GPADPRELADGFPGGVQRTGEVIRQLLDEGLLATGKDGKIRLK 
>gi|313157045|gb|AENZ01000076.1| GENE 199 186633 - 186899 351 88 aa, chain + ## HITS:1 COG:no KEGG:BVU_4156 NR:ns ## KEGG: BVU_4156 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 87 1 87 89 78 42.0 1e-13 MKISEKFKVREMAGEHVIIMPGRVGADMTRILALNDSSLYLWETLRGRDFTTEEAAGLLT ERYEVDEATALRDAQAWAGKLAECGVLE >gi|313157045|gb|AENZ01000076.1| GENE 200 186920 - 187336 458 138 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157086|gb|EFR56516.1| ## NR: gi|313157086|gb|EFR56516.1| conserved domain protein [Alistipes sp. HGB5] # 18 138 18 138 138 221 100.0 2e-56 MKLKRLISLLLLSVYLLATGGPAYVSLSCKCVAMKARAAHVCCHHCQHGADTSDGTASLK ATCCGNHHSTEIDLYTFSQDNEKSTRCTVTDLPPALVAEAPEPADISLGGEKTVGRCAPF VERGHVRCAGLRAPPVLA >gi|313157045|gb|AENZ01000076.1| GENE 201 187402 - 189732 3526 776 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2264 NR:ns ## KEGG: Odosp_2264 # Name: not_defined # Def: TonB-dependent receptor # Organism: O.splanchnicus # Pathway: not_defined # 1 775 1 754 757 466 37.0 1e-129 MKINKILVLSVSALAVCGTLSAQDLRGVVRDADNQPLVGASVYWEGTTIGASTDAEGAFL LHRVKGYDNLVASYLGFVNDTLHVANGAERIEFALRADGVELEDVVVEGNLSGNFVKRDG IVKNEMISFAGLCKMACCNLAESFENSASVTVGYSDAISGARQIKMLGLAGTYTQILDEN RPIMRGLSAPYGLSYTPGMWLNSIQVSKGVASVTAGHEAITGQINLEHRKPTDDERLFVN LYLDDELRPEANISTAFPVSKNKKLSSVILLHGSMDTDVRKMDHNDDGFRDLPLADQFNI ANKWLYAADNGTQIRWGWKFVQENRLGGMLDYKNSMRDQMREKWDQPGTLYGSKIRNRGA NGYFKIGMPVGPSVYDPDEKDEMRSNLAFVADFDHFNENAYFGLNDYKGNENALAMNLMY NHYFTYRSSLIVGAQAQLQYYRESLANNTPWIEAAKSRFYDFDRSEQEVGAYAEYTYAVK DKFSIVAGVRGDYNAFYDKFFVTPRGHIRWNITPSTTLRGSAGLGYRSTNVITDNIGILA TGRAIVFKDSDFSKFNRLEKALTVGGSLTQTFGLVSADDATLSFDYFRTQFYNSVIADQE YNADQILLYNSDKRSYTDTYQIDFSWTPVERLDIFATFRYTNSEMTIDRPDGTTARVERP LVSKYKTLLNIQYATKFRRWVFDATAQLNGPARIPTQTGNLADDKYSPRYPMFFAQVSRK IGKFDIYVGCENIADYRQHDPILNADNPFSTGFNSMNVWGPLMGRKFYAGLRFNLY >gi|313157045|gb|AENZ01000076.1| GENE 202 189779 - 190090 511 103 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157090|gb|EFR56520.1| ## NR: gi|313157090|gb|EFR56520.1| heavy metal-associated domain 
protein [Alistipes sp. HGB5] # 1 103 1 103 103 152 100.0 7e-36 MKKILMLCLALVMGVGMCAAEKPANKKTVTTVFTTDIDCEHCVKKIMNNVPSLGKGVKDV QVDLPKKEVTVVYDGTKNNDENIVKGFASLKVKAEPKKADAKK >gi|313157045|gb|AENZ01000076.1| GENE 203 190170 - 190892 980 240 aa, chain - ## HITS:1 COG:AF2153 KEGG:ns NR:ns ## COG: AF2153 COG2220 # Protein_GI_number: 11499736 # Func_class: R General function prediction only # Function: Predicted Zn-dependent hydrolases of the beta-lactamase fold # Organism: Archaeoglobus fulgidus # 30 239 11 207 208 115 35.0 6e-26 MSLLSLLGCGKKAAPEYPADTLTTRDGTQITITFFRHASLSIEVGGKYVYVDPVSGYARY AALPKADVILITHSHYDHLDVAAVEAIQTPQTEILCDRTSAEAFEMNCYTMRPGSVAAPR DYLKVEAVAAYNTTDGHLQFHPKDREDCGYILTVGGTRIYIAGDTEPTPEMKALKNIDIA FLPVNQPYTMTVDQAVEAVKAIRPAVFYPYHYGEVEEKTDTARLVRELEGVTEVRIRPME >gi|313157045|gb|AENZ01000076.1| GENE 204 191004 - 191456 679 150 aa, chain + ## HITS:1 COG:no KEGG:Ctha_1316 NR:ns ## KEGG: Ctha_1316 # Name: not_defined # Def: hypothetical protein # Organism: C.thalassium # Pathway: not_defined # 1 150 15 163 165 84 35.0 1e-15 MKGTLKKLIVLAAALFAAAGVAAQDADRIVGIYKAVEEGKESKVEFTRRPDGTYRGQIVW LRNPNNPDGTPKLDAKNPDKSKRSGRADRVVVVDGVKYDAEKGVWNGGRVYDPTKGKSYK VEVSFEDDRTLRVKGSLLGFSRSVYWQKIE >gi|313157045|gb|AENZ01000076.1| GENE 205 191536 - 192462 1435 308 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157052|gb|EFR56482.1| ## NR: gi|313157052|gb|EFR56482.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 308 1 308 308 565 100.0 1e-159 MMKKVLFVAAVAMSLVFSGCMKEDDTYKKLKPVQPGLNIYTGAMNQNIVSMQQANFGLRL AMLVAEADKQQKTIDEVTVGSSNTLLKRQLLGNAKVETTANGYKITFDADYADLDTYVRK GTLLINTNETALLKDATESKPWTVTFEDKLTMGYSGGDMQAITLTGGLTKLYFVESSGAY GIGLEAQQSYVGKTEELTSNWNGKFTVKPENVNFTYTDCAGKKFMLNGTATGRTFNTYDG ISATTMSLRMTNGEYYSSSALYGGKIEASLGDGYNPSLYPSKDVIVEITLEGTRLRQTIT YAGHIVTV >gi|313157045|gb|AENZ01000076.1| GENE 206 192537 - 193748 1724 403 aa, chain - ## HITS:1 COG:RSp0310 KEGG:ns NR:ns ## COG: RSp0310 COG0477 # Protein_GI_number: 17548531 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; P Inorganic ion transport and metabolism; R General function prediction only # Function: Permeases of the major facilitator superfamily # Organism: Ralstonia solanacearum # 4 403 45 443 450 196 36.0 6e-50 MENWKKTFAVIWSGQLASILSSSVVAFAIIFWVSLETGSAQVLALAAIAGMLPQSLLGPL VGVYVDRWDRKWTMILADSFIALCTLLLAVLFWLGVARMWHIYVLLACRSIGSAFHTPAM QASVPLLAPKQQLTRIAGINQIISSLSNIIGPAFGALLINLTGIGNILLLDVAGAFIACT TLLLVRIPNPERSTAQKPSLWREFREGFGAMHAVPGMGWFFTLAVLVWFFIMPVGVMFPL MTLQHFGGGAYEMSLIEIVWGGGALIGGAVMGARTYRVNRIVLINLMYLTIGLLFAASGM LPPTAFALFAALSLIEGVTSSVFNSSFVSVIQSCIDAGVLGRVMSLYYSFGLLPSAIGLL GTGFLAEHVGLTTTFVIAGTVICCIGLIAFCIPSVMRLDRQSS >gi|313157045|gb|AENZ01000076.1| GENE 207 193764 - 194186 628 140 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157251|gb|EFR56681.1| ## NR: gi|313157251|gb|EFR56681.1| putative lipoprotein [Alistipes sp. HGB5] # 1 140 1 140 140 248 100.0 7e-65 MKHLKYLLLICLAAAAACSKDKTEDPTLKAQRTALQETQAVGIYRSGEALRLFDKAKQQL FVDPTTLTFRIQDDAGLKFVSLQLESMPSDGQKVRGTFTDNTGLNIGSIEDFVLLKSDKQ HYWFWSDQTRVGFVFPRIGM >gi|313157045|gb|AENZ01000076.1| GENE 208 194195 - 195157 1316 320 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157105|gb|EFR56535.1| ## NR: gi|313157105|gb|EFR56535.1| hypothetical protein HMPREF9720_1816 [Alistipes sp. 
HGB5] # 92 320 1 229 229 437 100.0 1e-121 MVGWAEPEEVTENGLYTLAPVSENKGYLVQTPTTNDYFLLENRDTRNNKWDQPLNSAAAC RGLLVYHVDYTSRYAPQWSYNTLNNNPAHECMKLVRSVPGRSSYDVPQKTFFPGANNITS LSPETNADYISWNSGKPSVSFSDIKLDGSQVRLSVKTKANLKAEVSAKQYDALLTWEGDP AAEWEITWKSAGTQRSETVTGCNAFHITGLSPATEYALSIAQVSDTVDSSKDMVFNTEPT YTYKSVRICVPDEGYTHDTPVMLSLLDYRGKIGRIDWYIDNRKTENTYTTLAAGEHTIMA VVTDAEEESQQYIVKYITVK >gi|313157045|gb|AENZ01000076.1| GENE 209 194881 - 196293 438 470 aa, chain - ## HITS:1 COG:CAC0746 KEGG:ns NR:ns ## COG: CAC0746 COG4412 # Protein_GI_number: 15894033 # Func_class: S Function unknown # Function: Uncharacterized protein conserved in bacteria # Organism: Clostridium acetobutylicum # 148 366 138 386 781 84 29.0 6e-16 MRHFFLSLAAASAALLCGIGTAHAVPAKPGLIQVTQSDGTQLTVRLVGDERRHYMLSSDG YTLTGGPDGDLYYAELSASGKLIPTKVKARPVNALTADEKAVVSRLGKELKPTYVKPMVM PAPQQTQAPVLPAAGKPAPPRRITSAQTTGRLKSLIILVESSDKKFASASAQQDFRNLLM QDGYSVNGATGSAWNYYQDNSNRLFDPEFVVVGPYKASKTSAYYAGSEGSDNVPELIVEA CRLADNDVNFTDFADNGVIRDVFVFYAGRGQADSGDTQSIWPHRWDVRVNSKYLDVRFDG VQLQGYACGAELNGGYQMTAIGTFCHEFGHVLGWPDFYDTDYSASGGTAPALESFSLMCS GSYNNTAARRRRSTYWNAGWSAGPNPKRSRKTGFTRLLRSRRTKAIWCRRRRRTTISCWR TATPGTTSGISPSTAQQRAAACWYTMSITPADTPRSGPTIRSTTIRPTNV >gi|313157045|gb|AENZ01000076.1| GENE 210 196445 - 196789 422 114 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|227417053|ref|ZP_03900228.1| LSU ribosomal protein L20P [Dyadobacter fermentans DSM 18053] # 1 114 1 114 114 167 74 4e-40 MPRSVNVVASRARRKKVLKLAKGNFGSRGNVWTVAKNTVEKGLCYAYAHRQLKKRTFRSL WITRINAAVRAQGMTYSQFIGKVNAKGIALNRKVMADLAMNEPKAFEAIVKAVK >gi|313157045|gb|AENZ01000076.1| GENE 211 196882 - 197076 244 64 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|153808045|ref|ZP_01960713.1| hypothetical protein BACCAC_02331 [Bacteroides caccae ATCC 43185] # 1 62 1 62 65 98 72 2e-19 MPKMKTNAAAKKRFTFTGTGKIKRKHAYHSHILTKKTTKQKRNLCYSGTVSAADEAKIKN LLVK >gi|313157045|gb|AENZ01000076.1| GENE 212 197336 - 198007 963 223 aa, chain + ## HITS:1 COG:lin2751 KEGG:ns NR:ns ## COG: lin2751 COG1285 # Protein_GI_number: 16801812 # 
Func_class: S Function unknown # Function: Uncharacterized membrane protein # Organism: Listeria innocua # 3 222 1 219 220 207 52.0 2e-53 MEMEWDLILRLCVAGLCGTVIGLDREYRVKDAGFRTHFLVALGSALMMIVSQYGFEGFLA THDGLRLDPSRIAAQVVSGIGFIGAGTIIIHRQLVRGLTTAASLWATAGIGLAAGGHMYV VAGAATLLTLFALEVLTLFFGRLGRRRIMVVFSAADRKAIDAMFNELQSSEYAVISYEVE AQRSPDGVVYRATLVIRAKGNADEERYVDLLRENPDVTVERIV >gi|313157045|gb|AENZ01000076.1| GENE 213 198004 - 198636 361 210 aa, chain - ## HITS:1 COG:no KEGG:HMPREF9137_0261 NR:ns ## KEGG: HMPREF9137_0261 # Name: not_defined # Def: putative lipoprotein # Organism: P.denticola # Pathway: not_defined # 1 207 8 216 218 134 37.0 3e-30 MKRSLLYIGILFSAVSCGRNDHALQSGDLLFQAGESSDMTGAITAATGENGRLNFSHVGI AVVKNGADSVLEATTNGGVRLTALPEFLARSAKIGGRPAVVAMRLKDTAGVCEAVRRAQN YLGLPYDYSFRPDNGKFYCSELVWECYRTSDGSPIFTARPMNFRAEDGTLPQFWTELFAR RSESVPEGVPGTNPNDMSQERTLREVYRWF Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:48:01 2011 Seq name: gi|313157032|gb|AENZ01000077.1| Alistipes sp. HGB5 contig00009, whole genome shotgun sequence Length of sequence - 12039 bp Number of predicted genes - 12, with homology - 12 Number of transcription units - 2, operones - 2 average op.length - 6.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 817 631 ## BVU_2144 conjugate transposon protein 2 1 Op 2 . - CDS 820 - 2028 1147 ## BF1352 conjugate transposon protein TraM 3 1 Op 3 . - CDS 2035 - 2250 184 ## gi|313157041|gb|EFR56472.1| hypothetical protein HMPREF9720_0282 4 1 Op 4 . - CDS 2240 - 2863 1037 ## BF1771 hypothetical protein 5 1 Op 5 . - CDS 2878 - 3954 1580 ## BF1355 conjugate transposon protein 6 1 Op 6 . - CDS 3961 - 4590 750 ## BT_2294 conjugate transposon protein 7 1 Op 7 . - CDS 4637 - 5719 503 ## gi|313157034|gb|EFR56465.1| hypothetical protein HMPREF9720_0286 - Prom 5740 - 5799 1.5 8 2 Op 1 . - CDS 5816 - 6250 486 ## gi|313157044|gb|EFR56475.1| hypothetical protein HMPREF9720_0287 9 2 Op 2 . 
- CDS 6240 - 6599 329 ## gi|313157042|gb|EFR56473.1| hypothetical protein HMPREF9720_0288 10 2 Op 3 . - CDS 6611 - 9121 3219 ## COG3451 Type IV secretory pathway, VirB4 components 11 2 Op 4 . - CDS 9126 - 9509 359 ## BT_4768 conjugate transposon protein 12 2 Op 5 . - CDS 9512 - 11329 968 ## COG3344 Retron-type reverse transcriptase - Prom 11359 - 11418 2.4 Predicted protein(s) >gi|313157032|gb|AENZ01000077.1| GENE 1 1 - 817 631 272 aa, chain - ## HITS:1 COG:no KEGG:BVU_2144 NR:ns ## KEGG: BVU_2144 # Name: not_defined # Def: conjugate transposon protein # Organism: B.vulgatus # Pathway: not_defined # 7 272 5 264 299 206 42.0 1e-51 MKRDLIYLTLIVAAIAAAKVTAQTTPETPTEIRPLRIEAGFTKTVHILFPSPVTYIDIGS MDIIAGKADGAENVVRVKAAVRNFAAETNLTVITEDGGFFTFDVHYAENPVVSTLNLTVQ EPQTEGVKKPAAAGYPQPTAPASEGRVLLREVGREKPATIKRMLSDIYRQNRTDVKGIRT KKYGIEVEVLGIYVSNDVIYIHTCMYNDTNISFEVDARQFIVADKKLAKRTAQQQTPLEI LRVCNDPAVVRGHQRQRTVFALPKLTISDDKV >gi|313157032|gb|AENZ01000077.1| GENE 2 820 - 2028 1147 402 aa, chain - ## HITS:1 COG:no KEGG:BF1352 NR:ns ## KEGG: BF1352 # Name: not_defined # Def: conjugate transposon protein TraM # Organism: B.fragilis # Pathway: not_defined # 18 402 36 438 438 206 36.0 2e-51 MTENKNPGNDPSAEFERQRKRKVLLFAAILGCIFLVAMWLIFRPAPVKPQEGAAGINTSV PDGKAQATVGDKRKAAEQLRSEEQQQKRMMTLGDNSFSLLDDGLKPTEEPAPADDPALRA AEANRAMQRQVQGFYAAPQRNAEVEALKEQVAALQSQLDAERQQPDPLELAEEQYKLARK YLGGGTAVGEEAVEQAKQRKDSRLSVMRPVREGEVEASTLDTRADFTVERNLGFLTAAGG VAHADTPTVRACVASTQVIRAGSTVQLRLLEAVRIDGVTIPRNTPLYSLATISGMRLQVT VSSVEYGGRIFAVEAVAYDMDGQPGLNIPNSRERTALKEALASVGQTAGTSVNVTRSAGQ QVLSELARGGLQASSQYVAGKLREVKITLKANHQLLLISKQQ >gi|313157032|gb|AENZ01000077.1| GENE 3 2035 - 2250 184 71 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157041|gb|EFR56472.1| ## NR: gi|313157041|gb|EFR56472.1| hypothetical protein HMPREF9720_0282 [Alistipes sp. 
HGB5] # 1 71 1 71 71 69 100.0 7e-11 MTAKPHTLRCRIRWVRRSLRRRHDALAPRIRERIVLAALVLLLFLYLWLLFAADMGSELE IGHPELLHLKP >gi|313157032|gb|AENZ01000077.1| GENE 4 2240 - 2863 1037 207 aa, chain - ## HITS:1 COG:no KEGG:BF1771 NR:ns ## KEGG: BF1771 # Name: traK # Def: hypothetical protein # Organism: B.fragilis_NCTC9343 # Pathway: not_defined # 1 207 1 207 207 304 71.0 2e-81 MEFKCLTNIETSFRQLRMYALVFAGVCAVVTVAAVWMSYSFAERQRQKIYVLDNGRSLMV ALSQDLAQNRPVEAREHVRRFHELFFTLSPDKAAIESNVGRALQMADKSALSYYKVLQEK GFFNRLIAGNVSQMVKVDSIRCDFDRYPYEVTTFARQRILRESTVTERSLVTTCRLVNVS RSDNNPQGFMVEALNIVENKDIATYDR >gi|313157032|gb|AENZ01000077.1| GENE 5 2878 - 3954 1580 358 aa, chain - ## HITS:1 COG:no KEGG:BF1355 NR:ns ## KEGG: BF1355 # Name: not_defined # Def: conjugate transposon protein # Organism: B.fragilis # Pathway: not_defined # 9 310 9 309 331 329 50.0 1e-88 MVHMLSLTDNLHTILRVLYDDMMALCYPMSQVAMAIAGIGALLHIAYRVWQSMAQAEPID LFPLMRPFAIGICILFFPTLVLGSLNGILSPLVKATHSLMVGQTLDMEQWQERRERLELE SREQMPPDSYYAENEEMERELSELGVDDQTQQTLDRMNEERSSWSVKGLIFKGLAWILEL LFAAASVILDVLRTFYLIVLSLLGPIAFAISVFDGFQSTLTQWLTKYVSIYLWLPISDLF SAIIARLQTLSMRHDADLMAEGYNWYVDWSNSLNLIFMLVAVCGYLCIPSIASWVVQANG FAAYNKTVSKMTSLVSAGAGWTAGKAWAGAKGAGSAALSGGKAVGRGIMNGARLIFRK >gi|313157032|gb|AENZ01000077.1| GENE 6 3961 - 4590 750 209 aa, chain - ## HITS:1 COG:no KEGG:BT_2294 NR:ns ## KEGG: BT_2294 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 3 209 2 209 209 241 57.0 1e-62 MKQKIILLCLCLVCTGLSTYAQWVVSDPTNLAQGIVNSTKQVVEAAKNGSTMLQSFQETV KIYEQGKRYYDALKSVSNLVRSARKVQQCILLVGEISDIYVDGYRRMVGDENFTPAELAA IAAGYARIIEESAGELKELQDIVNPTDMSLTDKDRIDVVQRVYGVLRRHRDLARYYTRKN ISISLLRAARKRDMEGVLSLYGTDEQRYW >gi|313157032|gb|AENZ01000077.1| GENE 7 4637 - 5719 503 360 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157034|gb|EFR56465.1| ## NR: gi|313157034|gb|EFR56465.1| hypothetical protein HMPREF9720_0286 [Alistipes sp. 
HGB5] # 1 360 1 360 360 643 100.0 0 MEWKDRIARLFRRTPDWEREHRRTLIVRHAENLLRERDINSVTDLVRRHRKSDLTIAGIR LTMNTATSRFLRTPEDAAGKDMLELMDALRRTPFAKKIGRRLADVPADTAHSLHGVFAMH TFLLDAYLEQHPDSKLRQPPMDEVQAAAHIIDRQFRAETFRELRHLAEISGGYIPSCYVA RLYDWDADMERLQEMRGRLNGPACPEAPQQVQQLRKRIWKAENRMVREAERILESDPEIS LRQTYIEKLDAELQTLGWLARFPERIDDSRINRQLLDKYRILPGISPAEQYGQVEKAFRE LDARLVRMTGRQSYADDLFESLRQKGPKPEKHISGVRQKEINSARKEQTGQPASGRRIKR >gi|313157032|gb|AENZ01000077.1| GENE 8 5816 - 6250 486 144 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157044|gb|EFR56475.1| ## NR: gi|313157044|gb|EFR56475.1| hypothetical protein HMPREF9720_0287 [Alistipes sp. HGB5] # 1 144 1 144 144 278 100.0 1e-73 MNSNQSTPASAVAALQQEIRTRTEVIRTLADLREQLDADRICGAWLSAENNLSASIRRIG EGMWRILVFDHALCYKRLVQDGIIALRRHRLWLGADDGNRVIYDASTETLTIGCYGRFVP EDCIRRQEDDAISAEACDFNEPAE >gi|313157032|gb|AENZ01000077.1| GENE 9 6240 - 6599 329 119 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313157042|gb|EFR56473.1| ## NR: gi|313157042|gb|EFR56473.1| hypothetical protein HMPREF9720_0288 [Alistipes sp. 
HGB5] # 1 119 5 123 123 249 100.0 5e-65 MKQLLWICAGILLTFTAVLGAFHLFYDYEYRKIRPLCGAWHSTLDDTRLVIEPCGDKFRI TITRRGTSETHALHYKDCVYYTAYGGRRIDLFYTPPADALLLVPGDAFKRTSKLKNNEQ >gi|313157032|gb|AENZ01000077.1| GENE 10 6611 - 9121 3219 836 aa, chain - ## HITS:1 COG:PSLT088_2 KEGG:ns NR:ns ## COG: PSLT088_2 COG3451 # Protein_GI_number: 17233453 # Func_class: U Intracellular trafficking, secretion, and vesicular transport # Function: Type IV secretory pathway, VirB4 components # Organism: Salmonella typhimurium LT2 # 438 742 189 481 593 72 24.0 4e-12 MRGTLKTSTLESKFPLLRVENNCIISKFADITAAYRVTLPELFTLTAEEYEALHGAWLKA LKVLPDYTVVHKQDFFIEERYTAPEEGSERSFLARSYERHFNERPYLRHTCYLFVTKTTP ERMRQTSASSVLCRGFIVPREMRDTDAVTRFLEAAEQMERILNDSGLVHVERLTETEIVG TADDAGLLARYFALSDERSPVVNEDIRLDPGVMRIGDKFLTMHTLSDLDMLPQSVATDFR YERLSTDRSDCRLSFAAPVGLLLSCNHVYNQVIFLDDHDETLKRLEASARNMNSLAGYSR SNAINREWIEMYLNEAHSQGLRSVRCHCNVMAWAESEAELKRIRNDVGSQLALMGCTPHH NTVDVPVLFWAAIPGNEADFPAEESFYTFLDQALCLFNEETNYRSSLSPFGIKMSDRLSG IPLHLDISDLPMKRGVITNRNKFILGPSGSGKSYLTNHLVRQYWEQGAHILLVDTGNSYQ GLCSLIHAKTKGRDGVYFTYTEEAPIAFNPFYVEDGVYDVEKRESLKTLLLTLWKRESEE PTRSEEVALSNAVNLYLSKLRADRSIVPSFDTFYEFVETDYRRLLEQKRVREKDFDLANF LNVLEPYYKGGEYDYLLNSDKQLDLLDKRFIVFELDNISSNRTLLPVVTLIIMETFISKM RRLKGVRKMILIEECWKALTSANMSAYIRYLFKTVRKYFGEAVVVTQEVDDIISSPIVKE SIINNADCKILLDQRKYLSKFDGIQRLLGLTDKERAQILSINLSNDPKRLYKEVWIGLGG VQSAVYATETSVEEYLTYTTEESEKMQVMELAEKLGGDIEAAIQQLACKRETKTES >gi|313157032|gb|AENZ01000077.1| GENE 11 9126 - 9509 359 127 aa, chain - ## HITS:1 COG:no KEGG:BT_4768 NR:ns ## KEGG: BT_4768 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 37 119 19 101 106 85 53.0 6e-16 MLVESRVRREVQARFGGEFLETYRRNTARHRVLSLRLTAQYLVLFVAGLLGVFVCFAGMY MSGVPQGLCILFALTAGLAVTGGTFYLNRRFGPHGLQKLMASRRHPRRIVHRRAVGRLLR FRRTVKL >gi|313157032|gb|AENZ01000077.1| GENE 12 9512 - 11329 968 605 aa, chain - ## HITS:1 COG:Q0055 KEGG:ns NR:ns ## COG: Q0055 COG3344 # Protein_GI_number: 6226521 # Func_class: L Replication, recombination and 
repair # Function: Retron-type reverse transcriptase # Organism: Saccharomyces cerevisiae # 8 602 267 847 854 317 33.0 4e-86 MRNPERVLNSLSEHSKVSSYKFERLYRVLFNSEMFLLAYHNIQGRQGNMTEGSDGKTIDG MSLKRIENLIDALKDESYQPKPARRTYIPKKNGNMRPLGIPSIDDKLVQEVLRMLLEAIY EGSFENTSHGFRPKRSCHTALIQVQKNFTAAKWFIEGDIEGFFDNINHDVLIGILKERIA DDRFIRLMWKFLKAGYIEDWTFHRTYSGTPQGGIISPILANIYLDKLDKYMKEYACQFDR GDRRAMNLEYKRYSRKIWWLGTKLKQTKDKDTRKELIDAIKQHQKNRMHLPSVDEMDEGY RRIKYVRYADDFIIGVIGSKSDCEAIKEDIKNFLGEKLKLTLSKEKTLITHGNRKAKFLG YEIYVRPFTDKTLRGEKSGVLIKAYGKKVVLEVPMSTMRDKLLYYEAMEIHQFEGKAKWK PTSRTKLLHLDDLEILDAYNREIRGFANYFSIANNSSHLNSFKYIMQYSLYKTFARKYST TARKIIAKYRHHKDFAVFYEDKKGGKKMRVFFNGSFKRKTTAMDASCDYVANTIFNTTVS SLIQRLKAGKCELCGATENIEIHHVKRLKDLKGKEPWKIQMIGRQRKTLAVCIPCHNKIH HGIID Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:49:31 2011 Seq name: gi|313156961|gb|AENZ01000078.1| Alistipes sp. HGB5 contig00054, whole genome shotgun sequence Length of sequence - 91818 bp Number of predicted genes - 71, with homology - 68 Number of transcription units - 34, operones - 18 average op.length - 3.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 79 - 951 1030 ## BF1781 tyrosine type site-specific recombinase - Prom 1044 - 1103 1.9 2 2 Tu 1 . + CDS 1343 - 1804 -367 ## + Term 1985 - 2014 0.4 - TRNA 2367 - 2438 79.8 # Gly CCC 0 0 - Term 2292 - 2330 6.1 3 3 Op 1 . - CDS 2505 - 4412 2938 ## COG1022 Long-chain acyl-CoA synthetases (AMP-forming) 4 3 Op 2 . - CDS 4424 - 6214 3205 ## COG0481 Membrane GTPase LepA - Prom 6234 - 6293 1.9 + Prom 6138 - 6197 3.0 5 4 Op 1 . + CDS 6428 - 7300 1164 ## COG0382 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases 6 4 Op 2 . + CDS 7350 - 8510 1969 ## COG0019 Diaminopimelate decarboxylase 7 4 Op 3 . + CDS 8583 - 9482 1246 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily + Term 9635 - 9669 1.6 - Term 9561 - 9607 5.3 8 5 Tu 1 . 
- CDS 9630 - 10991 2016 ## COG1350 Predicted alternative tryptophan synthase beta-subunit (paralog of TrpB)
- Prom 11015 - 11074 3.6
9 6 Tu 1 . - CDS 11348 - 11500 79 ##
+ Prom 11306 - 11365 2.7
10 7 Tu 1 . + CDS 11515 - 13980 3653 ## COG0072 Phenylalanyl-tRNA synthetase beta subunit
+ Term 14068 - 14107 7.1
+ Prom 14029 - 14088 1.7
11 8 Tu 1 . + CDS 14110 - 14751 617 ## Varpa_5280 yhhn family protein
+ Term 14769 - 14811 2.2
12 9 Tu 1 . - CDS 14887 - 17043 2861 ## COG3525 N-acetyl-beta-hexosaminidase
+ Prom 17049 - 17108 2.1
13 10 Tu 1 . + CDS 17195 - 18004 1325 ## COG1694 Predicted pyrophosphatase
14 11 Op 1 . + CDS 18136 - 18669 975 ## COG0663 Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily
15 11 Op 2 9/0.000 + CDS 18677 - 19984 2107 ## COG3275 Putative regulator of cell autolysis
16 11 Op 3 . + CDS 19981 - 20742 1325 ## COG3279 Response regulator of the LytR/AlgR family
17 12 Tu 1 . + CDS 20877 - 21341 609 ## gi|313156972|gb|EFR56404.1| conserved hypothetical protein
+ Term 21374 - 21410 6.2
+ Prom 21399 - 21458 1.8
18 13 Tu 1 . + CDS 21492 - 24446 4338 ## Dfer_5762 hypothetical protein
+ Term 24469 - 24505 6.4
- Term 24455 - 24493 5.2
19 14 Op 1 22/0.000 - CDS 24615 - 25850 1888 ## COG0842 ABC-type multidrug transport system, permease component
20 14 Op 2 9/0.000 - CDS 25852 - 27030 1833 ## COG0842 ABC-type multidrug transport system, permease component
21 14 Op 3 13/0.000 - CDS 27040 - 28029 1637 ## COG0845 Membrane-fusion protein
22 14 Op 4 . - CDS 28042 - 29493 2412 ## COG1538 Outer membrane protein
23 15 Op 1 . - CDS 29604 - 30680 1076 ## COG0117 Pyrimidine deaminase
24 15 Op 2 . - CDS 30671 - 31840 1576 ## COG1408 Predicted phosphohydrolases
25 16 Op 1 . + CDS 32227 - 33582 1993 ## BVU_2435 hypothetical protein
+ Term 33583 - 33638 4.1
26 16 Op 2 . + CDS 33668 - 33979 402 ## Odosp_0556 hypothetical protein
27 17 Tu 1 . - CDS 34129 - 35055 1312 ## COG0697 Permeases of the drug/metabolite transporter (DMT) superfamily
28 18 Op 1 . + CDS 35342 - 36400 1217 ## COG2843 Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation)
+ Prom 36415 - 36474 1.7
29 18 Op 2 . + CDS 36511 - 37089 1013 ## COG1592 Rubrerythrin
+ Term 37105 - 37149 15.1
+ Prom 37203 - 37262 2.8
30 19 Op 1 1/0.000 + CDS 37303 - 38979 2767 ## COG0488 ATPase components of ABC transporters with duplicated ATPase domains
31 19 Op 2 . + CDS 39011 - 40366 2201 ## COG0124 Histidyl-tRNA synthetase
+ Term 40389 - 40446 7.0
+ Prom 40388 - 40447 3.1
32 20 Op 1 . + CDS 40468 - 42045 2368 ## COG0793 Periplasmic protease
+ Prom 42051 - 42110 4.2
33 20 Op 2 . + CDS 42130 - 42447 441 ## Odosp_0148 PUR-alpha/beta/gamma DNA/RNA-binding protein
+ Term 42497 - 42542 12.1
34 21 Op 1 . + CDS 42637 - 43578 1464 ## COG2035 Predicted membrane protein
35 21 Op 2 . + CDS 43582 - 44319 1031 ## COG0169 Shikimate 5-dehydrogenase
36 21 Op 3 . + CDS 44446 - 44985 634 ## gi|313157015|gb|EFR56447.1| TonB family C-terminal domain protein
+ Term 45073 - 45117 13.9
- Term 45059 - 45105 14.3
37 22 Tu 1 . - CDS 45109 - 45318 313 ## gi|313156977|gb|EFR56409.1| conserved hypothetical protein
- Prom 45340 - 45399 1.9
+ Prom 45318 - 45377 4.4
38 23 Op 1 . + CDS 45443 - 49099 6125 ## COG0587 DNA polymerase III, alpha subunit
39 23 Op 2 . + CDS 49123 - 49437 169 ## PROTEIN SUPPORTED gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein
40 23 Op 3 . + CDS 49439 - 50431 1034 ## COG2017 Galactose mutarotase and related enzymes
+ Term 50476 - 50516 13.6
- Term 50468 - 50500 7.0
41 24 Tu 1 . - CDS 50608 - 51492 288 ## Bache_1414 hypothetical protein
- Prom 51580 - 51639 4.5
+ Prom 51588 - 51647 6.7
42 25 Op 1 . + CDS 51879 - 52313 426 ## gi|313156966|gb|EFR56398.1| putative lipoprotein
+ Prom 52335 - 52394 1.9
43 25 Op 2 . + CDS 52454 - 54295 2054 ## gi|313157007|gb|EFR56439.1| hypothetical protein HMPREF9720_2260
+ Term 54318 - 54366 14.1
- Term 54310 - 54350 10.6
44 26 Tu 1 . - CDS 54455 - 54973 734 ## COG0778 Nitroreductase
- Prom 55128 - 55187 4.7
+ Prom 54961 - 55020 3.9
45 27 Tu 1 . + CDS 55147 - 57210 3576 ## COG0326 Molecular chaperone, HSP90 family
+ Term 57234 - 57271 7.8
- Term 57220 - 57259 7.2
46 28 Op 1 . - CDS 57364 - 57777 788 ## COG0824 Predicted thioesterase
47 28 Op 2 . - CDS 57781 - 58518 223 ## PROTEIN SUPPORTED gi|163762565|ref|ZP_02169630.1| ribosomal protein S2
48 28 Op 3 . - CDS 58544 - 59200 987 ## COG2344 AT-rich DNA-binding protein
49 28 Op 4 . - CDS 59208 - 61037 2581 ## Bacsa_2701 peptidoglycan-binding lysin domain
50 28 Op 5 . - CDS 61040 - 61807 1354 ## COG0847 DNA polymerase III, epsilon subunit and related 3'-5' exonucleases
51 28 Op 6 . - CDS 61819 - 62943 1880 ## COG0592 DNA polymerase sliding clamp subunit (PCNA homolog)
52 28 Op 7 . - CDS 62983 - 63861 1356 ## COG2820 Uridine phosphorylase
- Prom 63939 - 63998 5.1
+ Prom 63835 - 63894 6.4
53 29 Op 1 . + CDS 63976 - 65367 2025 ## COG1109 Phosphomannomutase
54 29 Op 2 . + CDS 65378 - 65818 734 ## COG3015 Uncharacterized lipoprotein NlpE involved in copper resistance
55 29 Op 3 . + CDS 65831 - 67198 1834 ## COG0624 Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases
56 29 Op 4 . + CDS 67208 - 67990 961 ## COG0483 Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family
+ Term 67991 - 68019 1.0
+ Prom 67996 - 68055 4.4
57 29 Op 5 . + CDS 68089 - 69933 2612 ## COG0449 Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains
+ Term 69966 - 70005 12.1
- Term 69948 - 69997 18.4
58 30 Op 1 6/0.000 - CDS 70031 - 71380 2112 ## COG2271 Sugar phosphate permease
59 30 Op 2 . - CDS 71402 - 72400 1406 ## COG0584 Glycerophosphoryl diester phosphodiesterase
60 30 Op 3 . - CDS 72415 - 73572 1757 ## COG0246 Mannitol-1-phosphate/altronate dehydrogenases
61 30 Op 4 . - CDS 73574 - 73774 78 ##
- Prom 73794 - 73853 4.1
+ Prom 73770 - 73829 4.2
62 31 Op 1 6/0.000 + CDS 73864 - 74454 838 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog
+ Prom 74456 - 74515 3.6
63 31 Op 2 . + CDS 74535 - 75542 1315 ## COG3712 Fe2+-dicitrate sensor, membrane component
+ Term 75552 - 75596 16.6
64 32 Op 1 . + CDS 75624 - 79166 5190 ## Sph21_5171 TonB-dependent receptor plug
65 32 Op 2 . + CDS 79187 - 80653 2244 ## Sph21_5172 RagB/SusD domain-containing protein
66 32 Op 3 . + CDS 80679 - 82664 2789 ## gi|313156998|gb|EFR56430.1| conserved domain protein
67 32 Op 4 . + CDS 82725 - 83597 1330 ## Lbys_2251 glycerophosphoryl diester phosphodiesterase
+ Term 83619 - 83658 6.2
- Term 83605 - 83646 10.9
68 33 Op 1 . - CDS 83665 - 85266 2319 ## Phep_3771 metallophosphoesterase
69 33 Op 2 . - CDS 85290 - 87125 2936 ## Phep_3772 RagB/SusD domain-containing protein
70 33 Op 3 . - CDS 87138 - 90770 6042 ## Phep_3773 TonB-dependent receptor plug
+ Prom 91112 - 91171 7.7
71 34 Tu 1 .
+ CDS 91325 - 91817 674 ## Phep_3774 hypothetical protein Predicted protein(s) >gi|313156961|gb|AENZ01000078.1| GENE 1 79 - 951 1030 290 aa, chain - ## HITS:1 COG:no KEGG:BF1781 NR:ns ## KEGG: BF1781 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 8 289 120 401 404 237 48.0 4e-61 MSCFALHVAECEQTGRDGSAANYLTAYRMLVKYMNGRKLLPEQFTAEWLERYERWLLARG LGTNSVVFHMRYLRAVYNKAADRGSFIASRGNPFYRRHIRQVATRKRALPRETLRRISDA DLSELHPKYALARDLFMFSFYTRGMSFVDMIYLRKSDISDGVLTYKRKKTGQTLTLRIEP PLQRIIDRYNGDSPYILPVLARDDSYRAYRQQQRELNKFIRKIGAMLEISEPLTFYAARH SWATLARDCGTPLTVISAGMGHTSERTTRIYLAQLDHDVIDKANRKIINL >gi|313156961|gb|AENZ01000078.1| GENE 2 1343 - 1804 -367 153 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MPGRFSVRFFAVRRLPAVGIPVLKRAGRYGSRPFFVCVGGAPDLSAAMTRPCLRDAGTYD FRTKWFLIFFKKNRNPLFFCRPADRRPATLPPCHFAVLPLCRPATLPSCHFAALPPCHFA ALPLCRPATLPSCYFAALPPAVLPPAVLPPAAL >gi|313156961|gb|AENZ01000078.1| GENE 3 2505 - 4412 2938 635 aa, chain - ## HITS:1 COG:alr3602 KEGG:ns NR:ns ## COG: alr3602 COG1022 # Protein_GI_number: 17231094 # Func_class: I Lipid transport and metabolism # Function: Long-chain acyl-CoA synthetases (AMP-forming) # Organism: Nostoc sp. 
PCC 7120 # 14 614 68 682 683 231 27.0 3e-60 MKKTIIDLFEGSVEKYGAKTFLLEKKHRAFEPTTYAETKELALETGAGLAAIGIRPKDKV AILAEGSNAWIVSELGMFYAGAVSVPLSVKLEESNDLLFRMRHAEVKALFVSKYQLPKIR RIRTELPGVKHIIVIGHIPLEPGETALGTLRRLGRDYLSKHREEFLAIGRGIGNDDYATI TYTSGTTADPKGVVLTHRNYTANVEQSLSRIDIPSSFRTLIILPLDHCFAHVVGFYIMIA CGATVATVQVGATPMETLKNIPQNIREVRPHFLLSVPALAKNFRKNIEGSIRAKGRFTER LFNLALRTAYLYNKDGYGRGQGWRVLLWPAVRLFDAVLFRKVREAFGGSLRFFVGGGALL DAELQRFFYAIGIPMFQGYGLSEATPVISTNSPKYHWHRFGSSGKILIPLDLKIVDEAGR EVPRGEKGEIVIRGENVMAGYWKNPEASAETVRDGWLHTGDMGYVSKDDFLYVLGRFKSL LIASDGEKYSPEGMEEAIVDKSPYIDQIIIHNNQSPFTGAIVVPNREALRRELDSRGIAG EKRAETAAEILGGEIDRYRAGGVFGGEFPERWLPAGLAIVDEPFTEQNGLVNSTMKVVRN KVEAHFGDRLDYLYTPEGRSLKNAKNLASLKKIVA >gi|313156961|gb|AENZ01000078.1| GENE 4 4424 - 6214 3205 596 aa, chain - ## HITS:1 COG:BH1342 KEGG:ns NR:ns ## COG: BH1342 COG0481 # Protein_GI_number: 15613905 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane GTPase LepA # Organism: Bacillus halodurans # 4 596 13 605 609 746 59.0 0 MKNIRNFCIIAHIDHGKSTLADRLLEKTNTLNQREMQAQVLDDMDLEREKGITIKSHAIQ MEYKARDGQLYTLNLIDTPGHVDFSYEVSRAIASCEGALLVVDATQGIQAQTISNLYLAV GHDLEIIPVLNKIDMDSAMIDEVKDQVIDLIGCKDEDILLASGKTGLGVEEVLEAIVQRI PAPEGDENAPLQALIFDSVFNPFRGIIAYYRVFNGRIRKGEHVKFFNTGSQYDADEVGVL KLKMQPRDEIKAGDVGYICSGIKTSSDVKVGDTITTVTNPAREAIAGFEDVKPMVFAGVY PVEADQYEDLRASLEKLQLNDASLTFEPESSLALGFGFRCGFLGLLHMEIIQERLYREFD MDVITTVPNVSYRITTTQGDVVEVHNPSGLPEVTKIAKIEEPYILAQIITKSEFLGNVIK LCIDKRGVMKNQTFITQDRVEINFDMPLSEIVFDFYDKLKSISKGYASFDYHRTGYQPSK LVKLDILLNGEPVDALSSLTYTDHAYDFGRKMCEKLKELIPRQQFDIAIQAAIGAKIIAR ETVKAVRKDVTAKCYGGDISRKRKLLEKQKKGKKRMRQIGNVEVPQSAFLAVLKMD >gi|313156961|gb|AENZ01000078.1| GENE 5 6428 - 7300 1164 290 aa, chain + ## HITS:1 COG:BH1650 KEGG:ns NR:ns ## COG: BH1650 COG0382 # Protein_GI_number: 15614213 # Func_class: H Coenzyme transport and metabolism # Function: 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases # Organism: Bacillus halodurans # 10 289 1 276 277 187 38.0 2e-47 
MNTVSKYLSLVKFAHTIFAMPFAAVGFVYAYATLPAGAHDAAWWLTRAVQVLLCMVFARN TAMGFNRWADRRIDAENPRTAGREIPAGVIPARHALRFVAVNALLFIAAASTINLLTALL SPVALFIILFYSYCKRFTALAHLVLGLSLGIAPVGAYIAVTGRIALEPCILSLLVMTWCG GFDIIYALQDAAFDRERGLHSIPARFSARTALALSCGLHAVSVAALLWFASCCPGSAWFW TGAGIFTGLLVLEHLLVTPSRQRNIGIAFGTLNGLASLTLAAGIIADLLK >gi|313156961|gb|AENZ01000078.1| GENE 6 7350 - 8510 1969 386 aa, chain + ## HITS:1 COG:aq_1208 KEGG:ns NR:ns ## COG: aq_1208 COG0019 # Protein_GI_number: 15606446 # Func_class: E Amino acid transport and metabolism # Function: Diaminopimelate decarboxylase # Organism: Aquifex aeolicus # 4 373 26 387 420 252 37.0 7e-67 MLSRQIAQKLRGYETPFYLYDTALLRQTLESVVYESKKYGYKVHYAIKANYDDHLLAIIR EYGLGIDCASGNELRKAVEAGFDPKGIVYAGIGKRDKELRYAIGQEIMAINCESIEELEL VDRLAGEAGKKTDVALRINPDIDPKTNHCIDTGQADSKFGISYEEVLEHAKEIKSLKHIN IVGIHLHIGSQIRELHVFENMCNKVNVIVENLEKLGFSFRFVDVGGGLGVNYDVPENEPI PNFASLFSIVHNHLAVGDREVHFEFGRSIVAECGELITTVLFNKTTATGRKLVIVDASMT ELIRPALYGSYHNIENITSEDEVREKYTIVGTACESTDVFDENVTLRKTRRGDLLTLKSA GAYGMSMASRYNLHDLPGAVYSDEIR >gi|313156961|gb|AENZ01000078.1| GENE 7 8583 - 9482 1246 299 aa, chain + ## HITS:1 COG:CAC1984 KEGG:ns NR:ns ## COG: CAC1984 COG0697 # Protein_GI_number: 15895255 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Clostridium acetobutylicum # 2 296 4 283 285 104 26.0 2e-22 MWLTLAFTSAALLGFYDAAKKQSLTGNAVLPVLLLNTLFSALFLLPMIVSAECGFGWFDG TILAASSGTLRAHGLVLFKAVIVLTSWIFGYFAMKHLPITIVGPINATRPVMVLVGAMLI FGERLNACQWTGVVLTLLSLFLLSRSSRREGVDFRHNVWILCIAVAALAAVVSGLYDKYI MARLDPVFVQGWCNLYLFGLMSVVVGILWWPRRRTTTPFHWTWAIPLISFFLVLADFAYF YALHQPDAMVSVVSMVRRSSVVVSFLCGAVLFREHNLRSKAFDLAFILAGMFFLWLGSR >gi|313156961|gb|AENZ01000078.1| GENE 8 9630 - 10991 2016 453 aa, chain - ## HITS:1 COG:AF1240 KEGG:ns NR:ns ## COG: AF1240 COG1350 # Protein_GI_number: 11498839 # Func_class: R General function prediction only # Function: Predicted alternative 
tryptophan synthase beta-subunit (paralog of TrpB) # Organism: Archaeoglobus fulgidus # 4 434 7 434 435 516 61.0 1e-146 MKTKKFCLTENQMPAQWYNIVADMPNKPLPPLHPATKQPVTKEQMSAIFAEELIDQEMSA ERFIDIPEEVQEIYKIWRPTPLVRATGLEKALGTPAKIYFKNESVSPAGSHKPNTAVPQA YYNYKQGIRRLTTETGAGQWGAAIAFAAKHFGIDLQVYMVKVSYEQKPYRRLMMNTWGAE CIASPSTLTASGRAALERDPNCSGSLGLAISEAIEVALQHPEDTRYCLGSVLNHVILHQT VIGQEAVAQMEMADAEPDVVIGCFGGGSNFAGLGFPFLRKNLTEGKKIRIVAVEPSSCPK LTRGSFQYDFGDVAGFTPLLPMYTLGHDFQPSDIHAGGLRYHGAGSIVSQLLKDGLVEAQ SLPQTETLAAGILFAKTEGIIPAPESTHAIAAAIREAMQAKEEGTPKTILFNLSGNGVID LYAYEQYLAGALKDYAPGDAEIAKTVSRLEQLI >gi|313156961|gb|AENZ01000078.1| GENE 9 11348 - 11500 79 50 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MFLLQFNMPDRLPRGGRPRLQVRPPRNPRQGGECRRHSVRNGMLKSNKDT >gi|313156961|gb|AENZ01000078.1| GENE 10 11515 - 13980 3653 821 aa, chain + ## HITS:1 COG:FN2122_2 KEGG:ns NR:ns ## COG: FN2122_2 COG0072 # Protein_GI_number: 19705412 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Phenylalanyl-tRNA synthetase beta subunit # Organism: Fusobacterium nucleatum # 156 820 3 652 653 367 33.0 1e-101 MNISYNWLKRYLDTDLPAEEIARILTDIGLEVEGFEKIETVKGGLHGVVVGEVLTCTDHP DSDHLHLTTVDVGAGDPLQIVCGAPNCRAGLKVLCATVGTVLYPGGGDEEFKIKRSKIRG VESLGMLCAEDELGIGASHDGIMELPADARVGMTAKEYLGIEDDYLIEVGLTPNRVDAAS HIGVARDLAAYLRSQGLNAEVKMPDVSAFAPDNHDLPVTVRVENHEAAPRYAGVTVKNCK IGPSPEWMQNCLRAAGINPKNNLVDITNFVLFELGQPLHAFDAAKIEGREVVVRTCAEGT PFVTLDGVERKLTADDLMICSAERPMCIAGVFGGLDSGIGDTTTDVFIESAYFNPVWVRK TAKRFGLNTDSSFRFERGVDPNMQVYAAKRAALLMKELAGGEISSDITDIYPTPVEDFKF DISFARIDSLIGKKIPEDTVRTILAALEVKILAEKEGVLSVAVPPYRVDVQREADLVEDI LRIYGYNNVEIPSRVRSTLSYAPKPDRSKLMNLAADFLTSNGFTEIMSNSLTKAAYYEGL ESYKPENCVRILNPLSADLNVMRQTLLFNMLEAVQLNANRKNGDLKLYEFGNCYFYDESK RSEENRLAAYSEEYRLAIAVTGVAEPASWNARPQAASFFTLRAVAEKLLRRFGIDIYALR TETLQNDLFGEALSMSLGGKELLQIGTVSAKIRRRLDVKQDVYYLEMNFDALVKSTRKHK IAAEELSKFPEVKRDLALLIDKQVTFSALRDVAFATERKLLKSVSLFDVYEGDKLPEGKK SYALSFILEDKTRTLDEKTIEKAMKNLTAQFEQRCGAQVRG >gi|313156961|gb|AENZ01000078.1| GENE 11 14110 - 14751 617 213 aa, 
chain + ## HITS:1 COG:no KEGG:Varpa_5280 NR:ns ## KEGG: Varpa_5280 # Name: not_defined # Def: yhhn family protein # Organism: V.paradoxus_EPS # Pathway: not_defined # 57 197 423 571 581 64 34.0 3e-09 MQRTSRTLCLTTLFFAAGAALFFTKAAVIPYKVAYPVLLLSLSLFYLRLKPLIPVGAALL LSAVGDAAGAGGMFIPQMLFFALAHGAYMCYFLPQAQLAPRPFVWPVLTALLLFLFVCIV PRAADPAERAGVAVYGLVIAGMLYSALQYRGAYAAWFRLAALLFVFSDSVIAWGRFVAPV PCRTCVVMITYYAAQYLFYLFAVRAATLSDSPH >gi|313156961|gb|AENZ01000078.1| GENE 12 14887 - 17043 2861 718 aa, chain - ## HITS:1 COG:VC0613 KEGG:ns NR:ns ## COG: VC0613 COG3525 # Protein_GI_number: 15640633 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Vibrio cholerae # 75 519 195 633 637 281 35.0 3e-75 MKLRYLLPLAGVLCTFLTAHAQPSPARQAFYEGACRISPRTMLCYETPLAPLAAYLREYI NVETASDSMSADDAIVLSTDPTLGGEAFRLTVLPQRIEIAGGSYGGVFNGVQALFRLLPA EIYAKNCPLPVEIACTKVEDAPRFPYRGMMLDVARTWIDAAGVKRYIDLLSYHGINKLHL HLSDDEGWRIEIRSHPELTEIGGFRGGDSPVRPVYGKWDEKYGGYYTQDEMRGLIRYAAA RNIEIIPEIDLPGHSRNIASVHPEIRCNYPPDTVSTNGYDYRSAWCVAREENYALLADIL GELCALFPSEYIHVGGDEVDMTQWNRCPDCQALMSRRGMTDPHRLEDLFMERMAAILAAN GKRPGVWNEAVTTGGLSRECLVYGWQSVKACLDATAKGYKTVVMPGEYFYFDMRQTPQEE GHDWAAVFDAKKVFGFDFTEKGFSDEQMRNVVGLQGTFFSEAYVSHEPEKPDYLDYMCFP RICALARIAWRGNCEGWDAYYRELTDHYDRMAAMGIRFRLFPPKVSYKEGAFTVVADDGS EIFYLEGDSPEEHRYTGPVKTEKPHLYRFLTRYKTGRSPYAADKSYYRTLAPAVTITTSM GESTQFPYTNASAYKGLARTRRACRQQDWILYTYEQPVKCREMFLQTGNRQLPKTIITTG YAEVSYDGTTYERAGDLEKGSITLKPGRPVKAVRIVSTCDDNGTPYVTIQPPQIKPVL >gi|313156961|gb|AENZ01000078.1| GENE 13 17195 - 18004 1325 269 aa, chain + ## HITS:1 COG:SMc01051 KEGG:ns NR:ns ## COG: SMc01051 COG1694 # Protein_GI_number: 15965220 # Func_class: R General function prediction only # Function: Predicted pyrophosphatase # Organism: Sinorhizobium meliloti # 11 261 9 273 277 214 47.0 1e-55 MEDKRLEATARLLEVMNTLRRECPWDREQTFDSLRSNTIEETYELADAITDHNMEGIKEE LGDLLLHVVFYSKLGEEAGAFDFGEVADALCDKLIYRHPHVYGDIHANTPDAVKENWEAL KLRKKNRKSGTLGGVPRSLPAMVKAFRVGEKAAATGFDWQRREDVWDKVKEEIAEVEAEM 
RSEAPGSREKLEGEFGDLFFALVNASRLYGIDPESALERTNKKFIRRFNYMEEKAAAEGH TLHELPLEKMEEYWQEAKHEETFNSMCPH >gi|313156961|gb|AENZ01000078.1| GENE 14 18136 - 18669 975 177 aa, chain + ## HITS:1 COG:MJ0304 KEGG:ns NR:ns ## COG: MJ0304 COG0663 # Protein_GI_number: 15668479 # Func_class: R General function prediction only # Function: Carbonic anhydrases/acetyltransferases, isoleucine patch superfamily # Organism: Methanococcus jannaschii # 14 164 2 147 159 142 51.0 5e-34 MALIKKVRGHTPAIGENTFLAETAVILGDVTIGRDCSIWYNAVLRGDVNKIVIGDRTNIQ DGVVLHTLYDGSPHPSQTIIGSDVSVGHNAVIHGARIGDNCLIGMGATLLDNAVVPSGCI IAANALVLSNAQLEPNSVYAGVPAKKVKEVTPQQREEIIRRTARDYMLYASWFKEEE >gi|313156961|gb|AENZ01000078.1| GENE 15 18677 - 19984 2107 435 aa, chain + ## HITS:1 COG:SA0250 KEGG:ns NR:ns ## COG: SA0250 COG3275 # Protein_GI_number: 15925963 # Func_class: T Signal transduction mechanisms # Function: Putative regulator of cell autolysis # Organism: Staphylococcus aureus N315 # 254 412 383 557 584 93 33.0 7e-19 MGLQPKYSDLILNLLVALAVSLVVNFSYVLLLIVDQKSDGQPRPAKASVIARGEEGTLRV SPDGHGYIVYENGDSVYVMMQRIYRMNLKDGDRLVANLAAPRRHGAHPVMTELRMRNGEE FDYSTLFNRPSKTTELALQLFYYLVVSFIMLSILTSVRRRYTPGRFVRHCIWCVLAAAAL YMVAPVTEWRSGRIMLNCMGDHIFDYMLLLKCSFAVAVSMLYGWVYVLNSKQQAVVMENE RLKNENLTTRYNMLVGQINPHFFFNSLNSLAMLVREKHDEKALTYIDQLSYTFRYIIQNG QSTLMTLDEELKFVEAYSYLFKIRYADKLFFDIDIEEKYRTWTLPALSLQPLIGNAVKHN TITRSKPFHISIRTENGWLVVSNPKVPKLEPEPSTGIGLENLRNRWHLITGRDIEIIDTD KEFVVRMPLHTPLAG >gi|313156961|gb|AENZ01000078.1| GENE 16 19981 - 20742 1325 253 aa, chain + ## HITS:1 COG:BS_lytT KEGG:ns NR:ns ## COG: BS_lytT COG3279 # Protein_GI_number: 16079944 # Func_class: K Transcription; T Signal transduction mechanisms # Function: Response regulator of the LytR/AlgR family # Organism: Bacillus subtilis # 1 246 2 239 241 108 30.0 8e-24 MKALIIEDETAAALNLKAILKQAAPDIRVVDTLESVEESVDWLRANPQPDLLFMDIHLAD GDSFRIFDAVEVTAPVIFTTAYDRYALEAFKVNSIDYLLKPLNAADVQRALGKLMRLSKG ERSDYGSRVRTLAAARREQTFLVHVRDKIIPLKREDIAFCYTCNEKVTAYTFAGASYPLD 
KTLEALQAVLPEADFFRANRQFIVARRAVKEIAVWFGSRLSLSLVLETPERIVISKARVP EFKAWLTSVQPAE >gi|313156961|gb|AENZ01000078.1| GENE 17 20877 - 21341 609 154 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156972|gb|EFR56404.1| ## NR: gi|313156972|gb|EFR56404.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 154 6 159 159 201 100.0 2e-50 MKKRIFAVAAALCLLAGTGAFAQDSKKAPMMKERPTAEQMAQRRTERMTEKLNLSEKQSK QLYEVNLQDIKEMQAQAEQMRAYRKAQAEKMKGILTPEQFEQWKQMQGPRHGMNRGPRMK DGRGDKAAVREGRKCAPCAGDRKCDGPRSGKEKK >gi|313156961|gb|AENZ01000078.1| GENE 18 21492 - 24446 4338 984 aa, chain + ## HITS:1 COG:no KEGG:Dfer_5762 NR:ns ## KEGG: Dfer_5762 # Name: not_defined # Def: hypothetical protein # Organism: D.fermentans # Pathway: not_defined # 1 956 1 937 965 449 32.0 1e-124 MKRLLLSTLFLFFAVAAFAQKGAVSGTVLDADTGESVAGAVLEVAPVKTPDQKQYFTSGY KGAVAIPSLPYDEYRLTVSFLGYNNYETTFRVAAGKQNIGRIELKPGVQIETVVKEAKAL RTSQKGDTVSYNAGAFKVTNDADVEGLLKKMPGITVSDGTVEAQGEQIKKVFVDGKEFFG EDVTTAIKSLPAQAVDRVEVYNKLSDAAEFSGMDDGEGYKALNIVTHANMRQGQFGKLYA GYGYDADTKTEAKNKYVVGGNANIFSGSSRVSVIGLFNNINQQNFSFEDILGVSGGSGRG RRGGVGQYMVRPQSGVASVNAIGVNYSDTWGKRDQVSFQGSYFFNNTDTRNRSTIDKWYE SPMPVDTLSTRGYSDTESYNHRFNARLEWKISDNQNLMVRPSFSYQSNDPLSTTQGWQFG ESGYSRTENRSDALRHGYNVRTNAVYRAKLGKDGRTITVDGDVNYSDNTNNSNSFSNVLP LSDLRPDGVDAPWGWDTTGYTRLRYLRDLAPSSSYRLRGQFTYTEPVAKYAQVSLQYRIS YDYQERDKKSYITGADYSIAGLAPDGALSNSYRSNYLTQSAGPGFRYSKERNTFIANVFY QRSSLDGQIVRDDADKIRHSYNNVTYFMMGQLNINRENSLRLYVSSYTDNPSITDLQNVA DVSDAQNITKGNPDLNPTYSHRVNFHYINSNVEKGRTFMWMFSMQNTSDYNATHVVSNPG TITLGGVQYKPNYYSTPVNLDGYWSLRTHLSYGFPIGFLKSNFNVMAGVSYSLTPSMLGG EVDADGFITGGKRNDTSNIGYDFNTVLGSNISENVDFTLSWNGTYNEATNSLGESGSKNR YFNHRAQGNMKFVFPLGFTFTGSVAYTQYIGFTNDYDDSYLLCNAWIGKKIFKNKRGEIM FGVNDLLNRNKAFARTTGSGWTQNATNSVIGRYYMVQFTYNLRRFGKKGSKNISDYDGVV QSGPRRMGPGGPGGPPPGIFHGPR >gi|313156961|gb|AENZ01000078.1| GENE 19 24615 - 25850 1888 411 aa, chain - ## HITS:1 COG:VC1609 KEGG:ns NR:ns ## COG: VC1609 COG0842 # Protein_GI_number: 15641617 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, 
permease component # Organism: Vibrio cholerae # 15 398 29 405 408 159 27.0 7e-39 MSRFTHQMTSYWQQMLSVMRNEFRTIFTDAGVVLILVLALIIYATVYSMAYGAQVLRNVP IGVVDECRTPTSRSLIETFNAGPNTYVAYNPTSMEEAKDLFFKRKIYGVVYIPSDYEEKL LGGLQANVAIYADASYFLMYRQVFQEVVTSIGATGAMVEFQRLIAKGANLPQAQATTQPV IYQSHNLFNPYLGYGTFVMPAIIMVIIQQTMLIGIGMIGGTWREFGLYRKLCPPDRKRMS TLPIVLGKALVYGIIYSVTTFYILGLHYRLFHYPMNGATGTIVVFMLAYMAACIALGIAI STLFRYRENSLLLLLWTSIPLLMLSGVSYPREGIPDWLFNLGQIFPSSHGVNAFIRIQSM GASLSEVLSEIRMLCILALVYGGLACIGIHLVLNRADKDDIPAPHTSEPEK >gi|313156961|gb|AENZ01000078.1| GENE 20 25852 - 27030 1833 392 aa, chain - ## HITS:1 COG:VC1608 KEGG:ns NR:ns ## COG: VC1608 COG0842 # Protein_GI_number: 15641616 # Func_class: V Defense mechanisms # Function: ABC-type multidrug transport system, permease component # Organism: Vibrio cholerae # 14 365 12 361 387 141 27.0 2e-33 MLNFLRNTYAVLRRELTRLAHQPMYFVLMLVLPVVSFAFFALLFNKGVARDIPIAVLDQD HTSLSRKVTQMIDDTATAMVSYGIQDMDEGERLMREGKIMAIVQIPAFFEKNILSNSQTH IETYISGTNITVNGLLSKDIQTAVTTFSGGVQIQLLTKQGLTELQAMAQLMPVRFNKHVL FNPYINYGYYLSPSFMPMMLLIFIVMVTVFTIGTELKKGTAREWIETGNGSVSAALLGKV LPVTVVMFLMSLVMFLIIFKVVGVPLNGSLTVILLSTLLFVMSYQAIAVFIVSLLSNLRL SLSIGGGYSVLAFTFSGLTFPIMAMWPAMQYLSRLFPFTYYTDIFVDQMLRGAPVVCSLP DMCYMSLFIVLPLLCLPRLRTICTEEKYWGRL >gi|313156961|gb|AENZ01000078.1| GENE 21 27040 - 28029 1637 329 aa, chain - ## HITS:1 COG:VC1607 KEGG:ns NR:ns ## COG: VC1607 COG0845 # Protein_GI_number: 15641615 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 10 322 4 322 324 254 47.0 2e-67 MKRSNIIGIIAAAVVIIVSVVLISWYLRRSTPTLIQGTVECTTYKASSKVPGRIDDMKVE QGDRVEKGQLLYTLSTPELEAKLRQAEAVKSAAAALDQAAIAGARIQQIEAAMNMWEKAQ AGLELARKTYERVKNLYEQGVVPEQKMDEASANYKAMEATALAAKAQYNLAMDGARKEDK EAAAARVRQAEGAVSEVESYISDAMVYSPVTGEVSTIIAEQGELVGSGYPVVAILDMSDM WVTFNIKETLLPAIRVGTRMSGYVPALGYDVEFEVTYIAVQADFATWSATRTQGGFDIRT FAVKAKPTTHIENMRPGMSVLVDWDLIGK >gi|313156961|gb|AENZ01000078.1| GENE 22 28042 - 29493 2412 483 aa, chain - ## HITS:1 COG:VC1606 KEGG:ns NR:ns ## COG: 
VC1606 COG1538 # Protein_GI_number: 15641614 # Func_class: M Cell wall/membrane/envelope biogenesis; U Intracellular trafficking, secretion, and vesicular transport # Function: Outer membrane protein # Organism: Vibrio cholerae # 28 475 24 465 476 169 28.0 1e-41 MKTAVGIAFAAVFATVVPLSAQEVKQTLSLDEAIAVTLTENPAMKAAEFEEKAATQERRA AIGLRMPKINVTGAYAYLGKDIGFDFNEMKGPAKNLAGQLLGSGLIPPEAIPSINGLLNP LLNADWFLTLQDRSLGFVGGEVTMPIWLGGKINAANRAARINERTAAEQGNQTRNALISE LVERYFGLALAMQVVEVRRQVVDGVRRHLEDAIALEKNGMIAQSERLYVEFKMAEAEREL ANAKLQAETIASALSNTLGRENAWQPVTSMFMIDKVEDLAYYQDLAQDRNPLLNQVSLKR QLAEEGVRVQRAEFLPQVAAIGGGSFYNYQVSGIVPRWAVGVGVNIKIFDGLNREYKYSA AKQTARRVGALQNKAGKDISVLVEKLYNQMMNYRNQMTSIDASLNFAEEYLRMKNAAFLE GMSSSSDLIDAELNLAGVRTERLQAAYNFDLLLAQLLETAGISDEFPAYARRSDARPVLF DKK >gi|313156961|gb|AENZ01000078.1| GENE 23 29604 - 30680 1076 358 aa, chain - ## HITS:1 COG:SPCC4G3.16_2 KEGG:ns NR:ns ## COG: SPCC4G3.16_2 COG0117 # Protein_GI_number: 19075322 # Func_class: H Coenzyme transport and metabolism # Function: Pyrimidine deaminase # Organism: Schizosaccharomyces pombe # 219 339 4 126 148 100 40.0 6e-21 MRITLSAAVTADGFLDDNSPRRLIISTPEDWAEVYRLRAAHDAILVGAETLRRDDPSLLV RDEEVRARRREQGLPPDIAKATLTGTGELSPGLRFFTEGEAERYVFSPHEIDSLQNIATV ISTEGPITAKLIVTQLEKRGVERLMVEGGAAVLRMFLGEGMADTLRLAVNPQLRLGAAGG AEFRFTPPQGTPQTRENLGGMEVTTYTLHPDTSAADLRFLKQAVDEGRKCTPSATSYCVG AVVVAADGRIFAGHTHETSPTHHAEQEAIAKALAAGAPLRGAAMYSSMEPCSQRASEPES CTQLLLKYGFARAVFALYEPDCFVCCRGALTLREAGVDVRVYPGLAGGVWEANAHLKR >gi|313156961|gb|AENZ01000078.1| GENE 24 30671 - 31840 1576 389 aa, chain - ## HITS:1 COG:BS_ykuE KEGG:ns NR:ns ## COG: BS_ykuE COG1408 # Protein_GI_number: 16078469 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Bacillus subtilis # 120 386 39 284 287 107 32.0 5e-23 MFLWFTSILTLAAAAADIIHYRRLRRRGTSARNLRLFVALAAATDALPPVIALTDALSRD NTTGFMLFAMWAFWVWMITVLPRIACYFFRLVGLPRAGIAAGICVAALLIWGTTRGRTQI RVSRTEVCSERIPAGFDGFRIVQFSDVHLGTLVCPEAELSRIADSINSLHPDLVVFCGDL 
VNIRGSELDARAMRLLGGLRAPYGVVAVTGNHDAGVYIKDSIAHPAQASLAEVVARQREM GWRVLEDTTVYLRRGGDSISLTGLSFDPALRHLRHAPDLPPANLNAAYRGVPDSLYNITA VHIPQLWDQIAGEGYGDLTLSGHVHSMQLKIRFFGHAFSPAQLIYTRWSGRYDEKGRTLY INDGTGYVAFPMRLGAWPEITLITLKRCG >gi|313156961|gb|AENZ01000078.1| GENE 25 32227 - 33582 1993 451 aa, chain + ## HITS:1 COG:no KEGG:BVU_2435 NR:ns ## KEGG: BVU_2435 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 74 451 57 429 429 295 37.0 2e-78 MKTFRLLIVALLLASSASAQRHMRDGRNGEYSPTVYLISVHEVDTVYNCGGCGSRQAAAL NRLAMDNATQDYIETHRPGFQQSEKPQFVFASKNNRFSFSLGGFVSLRAGYDFDGIVDNI DFVPYDIPVPGNYNSKQKLMMDASTSRLFMKAITNTRALGRVVIYMDADFRGGAEGSYTP RLRSAYVSFKGLTLGRDVTTFCDLQAAPTTIDFQGPNAYNFNFATMIRYEVSFARRHMTF GVAAEMPNVSATYGENFKPMHQRVPDFPMYLQYAWGDDRSSHIRASGVIRNPYMHKVSKN STTSLLGWGVQFSGTIKCCDWFRLFMNGVYGEGITPYIQDLTGSGLDFTPNPEDPSLVRM MPMWGWQAAGQINITPRLFVSGGYSTVRVQRSHGYYTDDQYKQGQYIFGNIFYSLTPRCK VAGEYLYGSRKDMNSMKNHANRVNVMVQYSF >gi|313156961|gb|AENZ01000078.1| GENE 26 33668 - 33979 402 103 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0556 NR:ns ## KEGG: Odosp_0556 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 1 103 1 103 106 137 67.0 1e-31 MMEFLTEYNLTGLVIGVATFLIIGLFHPFVIKGEYYFGVRCWWFFLLMGVAAIAASVAVR HILWSTLLAVWGASSLWSIGELFEQRGRVAKGWFPANPNRKKK >gi|313156961|gb|AENZ01000078.1| GENE 27 34129 - 35055 1312 308 aa, chain - ## HITS:1 COG:BH2747 KEGG:ns NR:ns ## COG: BH2747 COG0697 # Protein_GI_number: 15615310 # Func_class: G Carbohydrate transport and metabolism; E Amino acid transport and metabolism; R General function prediction only # Function: Permeases of the drug/metabolite transporter (DMT) superfamily # Organism: Bacillus halodurans # 36 292 33 290 302 67 24.0 3e-11 MRNISNIKGVLYAVVSSATFGLIPLFSIPLLHAGMASPTILFYRMLLSAAIMAAVALLLK RNFRISRRDFGVLAGLSLMYAATSLGLLRSYDYIPSGVATTVNFLYPLVVTIVMTLFFRE RSSVWIVIAVFISLVGVALLAWGDAGNNDPGRGLIYAGTTVFTYAVYIIGVMKSRAGRLD PLIVAFYVLTFSAAVFLVYALSTSGIAVVHTWHIWRNLLLLALLPTVLSNLTLVLAIKHI 
GSTMTSILGSMEPLTAVLVGVVHFGEGFDLDSAAGLILVVTAVIIVILQTNHTPQTPPAN IPPTQPQE >gi|313156961|gb|AENZ01000078.1| GENE 28 35342 - 36400 1217 352 aa, chain + ## HITS:1 COG:SPy0818 KEGG:ns NR:ns ## COG: SPy0818 COG2843 # Protein_GI_number: 15674859 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Putative enzyme of poly-gamma-glutamate biosynthesis (capsule formation) # Organism: Streptococcus pyogenes M1 GAS # 34 325 79 382 430 130 29.0 3e-30 MKISARHILPWLLLAAACCVPAKQSERKPALPPPKPVRVRLLFGGDVMQHLPQVTAARRE TGFDYREVFAHLHRRFRAADLVIVNLETTLTRTDRYTGYPCFRSPAALADALRDAGVDVA VLANNHCCDGGGAGVHTTVAELRRCGIRHTGVFTDSLDRAANNPLWLEHCGVRFALLNYT YGTNGIPVPEGVRVNLIDTVRMAADLAAAREGSPDCIAVCIHWGNEYERQANAEQRRLAR FLRRNGADLIVGSHPHVVQPFETDSSHVVLYSLGNFVSNQRRRYCDGGLMAEIEAVRHPD GRMTYTLDPVPVWVAMPGYRIVPPEVGDTMALPAAYSLFRKDVAEMLDRVYK >gi|313156961|gb|AENZ01000078.1| GENE 29 36511 - 37089 1013 192 aa, chain + ## HITS:1 COG:CAC2575 KEGG:ns NR:ns ## COG: CAC2575 COG1592 # Protein_GI_number: 15895835 # Func_class: C Energy production and conversion # Function: Rubrerythrin # Organism: Clostridium acetobutylicum # 3 192 2 195 195 199 56.0 3e-51 MEKSIKGTRTEQNLLKAFAGESQARSRYVFFASKAKKEGYEQIAGVFAETAEQEKEHAER FFKFLEGGDVEITASYPTGPIGTTAENLLAAAHGEKEEWDVLYREFAKVAEEEGFIEIAT AFKMISTVEAEHERRYLKLLSRLTDGNFFKRDGKIWWQCRNCGFVIEAEEAPKLCPACKH PQAYFEPKKENY >gi|313156961|gb|AENZ01000078.1| GENE 30 37303 - 38979 2767 558 aa, chain + ## HITS:1 COG:VC0698 KEGG:ns NR:ns ## COG: VC0698 COG0488 # Protein_GI_number: 15640717 # Func_class: R General function prediction only # Function: ATPase components of ABC transporters with duplicated ATPase domains # Organism: Vibrio cholerae # 7 557 5 555 555 649 58.0 0 MADEKIIFSMVGVSKTFTNQKKVLNNIYLSFFYGAKIGIIGLNGSGKSTLMKIIAGIDKN FQGEVVFSPGYTVGYLEQEPKLDDSKTVREVVEEGCAATVALLKEYEEINMKLCEPMDDD TMAKLIERQGELYEQIDQVNGWELDSVLERAMDALRCPDPDQSVKVLSGGERRRVALCRL LLQQPDVLLLDEPTNHLDAESIDWLEQHLQQYKGTVIAVTHDRYFLDNVAGWILELDRGE GIPWKGNYSGWLDQKTTRMAMEEKQESKRRKTLERELEWVRMSPSGRHAKSKARLSAYDK 
MMNEDAKQKEEKLEIFIPNGPRLGDVVIEAHDVSKAFGDRVLYEHLEFSLPPAGIVGVIG PNGTGKTTLFRMIMGLETPTGGSFRVGPTVKLAYVDQQHKSIDPEKTVYEVISGGTDLIM LGNRQVNARAYVARFNFSGADQEKKCGMLSGGERNRLHLALALKEEGNVLLLDEPTNDID VNTLRALEEGLENFAGCAVVISHDRWFLDRIATHILSFEGDSKVVFYEGSYSEYEEWKKA QGGDTTPHRVKYRKLIEE >gi|313156961|gb|AENZ01000078.1| GENE 31 39011 - 40366 2201 451 aa, chain + ## HITS:1 COG:all5012 KEGG:ns NR:ns ## COG: all5012 COG0124 # Protein_GI_number: 17232504 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Histidyl-tRNA synthetase # Organism: Nostoc sp. PCC 7120 # 7 450 10 460 462 260 35.0 4e-69 MGIQKPSIPKGTRDFSPAEMMRRQYIFDTVKGVFRTYGFAPLETPSMENLSTLLGKYGEE GDKLLFKILNSGDYAAGLSDDEVRQASRICEKGLRYDLTVPFARYVVQHQGELTFPFKRY QVQPVWRADRPQKGRYREFYQCDVDVIGTRSLLCEVELIEIVERVFRALGIRVALKMNNR KILFGIAEAIGHADKMMDITVAIDKLEKIGLDNVKAELLERGLGQEAVDKLQPILELSDD NSQKLTKLREVLAVSETGLKGIEEMETVFGYVQRSGIGLTVELDLSLARGLNYYTGAIFE VKALDFAIGSICGGGRYDDLTGIFGMPNMSGVGISFGADRIYDVMTGLNLFPEEVSFTTR VFFTNLGAEEEAASLQLLRSLRDAGVAAEIYPECGKMKKQMEYANRRSIPYVVIIGSQEL EAQAATVKDMRSGEQRQVPFAALAADLESRR >gi|313156961|gb|AENZ01000078.1| GENE 32 40468 - 42045 2368 525 aa, chain + ## HITS:1 COG:aq_797 KEGG:ns NR:ns ## COG: aq_797 COG0793 # Protein_GI_number: 15606169 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Periplasmic protease # Organism: Aquifex aeolicus # 3 337 9 346 408 205 35.0 2e-52 MRIVLLFFTALFALPLAAQEIGAAQQLRKLMQVYRYLDGLYVDEVEMGPLVESAIEGMLE ELDPHSAYIGAEDMKGVQESFDGEFSGIGIEFNVLRDTVIVVNTIVGGPAERVGVMPNDR IVRIDTLDAVGFKQTDVPKHLRGKTGTRVEIDVVRHGEPAPLHFVIVRDKIPLNTVDAAY MAADGVGYIKVNRFGRTTMSEFREAFDKLGKPGSLILDLRGNGGGLLEQAVEMAGFFLPR GALVVSTEGRAVPPMSLRNPHDGEAPDGPLAVLIDESSASASEIVAGAVQDWDRGVVVGR PSFGKGLVQRQIGLGDGSAVRITVARYHTPSGRVIQRPYEKGKRREYYLDHLRRYDDAVC DSLDAGAPEYRTLRTGRTVYGGGGIRPDVMVEADTAGFSVYYANLIRRGVVNEYVISYMD RERGRLERAYPDFAAFDAAFGVGEEMLTGLTALGAERGVEFDEAGFAASEPLMRVQLKAL VAQRLFDTGAFYRVMNPAQNGAYRRAVEILGGWEHEGAPLLMPEN >gi|313156961|gb|AENZ01000078.1| GENE 33 42130 - 42447 441 105 aa, chain + ## HITS:1 COG:no 
KEGG:Odosp_0148 NR:ns ## KEGG: Odosp_0148 # Name: not_defined # Def: PUR-alpha/beta/gamma DNA/RNA-binding protein # Organism: O.splanchnicus # Pathway: not_defined # 10 90 11 91 100 90 58.0 2e-17 MTPQTTPRRDAEEYGDQILTKAVKAGRRTYFFDVRGTRGGDYFLTITESRKVTNPDGGFS YDRHKIFLYKEDFEKFSDGLSEVVAFIRRSKPEFFECREREKVGV >gi|313156961|gb|AENZ01000078.1| GENE 34 42637 - 43578 1464 313 aa, chain + ## HITS:1 COG:VCA0040 KEGG:ns NR:ns ## COG: VCA0040 COG2035 # Protein_GI_number: 15600811 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Vibrio cholerae # 2 279 4 281 308 243 48.0 2e-64 MKFSQYLLLTLKGCAMGMADVVPGVSGGTIAFISGIYEELIESIKSVDATALRLLGTLRL KEFWRHINGRFLLPVLLGIAIAIFSLARLMTYLLTNHPIAIWSFFFGLIVASALLVAKQI GRWRVQTVAACLIGAAAAWWITVATPAETPDTWWFILLSGAIAICAMILPGISGAFILLL LGKYQFIMQAVGDLNVPVIVIFVVGAAAGIISFSHLLSWLLKHWHDVTVAVLMGFMVGSL NKVWPWKEVVETYTDSHGKIMPLVERNVAPGRFEALAQQDALLGEAVVLCVVGFLTIYCI ELAARIVVKKREE >gi|313156961|gb|AENZ01000078.1| GENE 35 43582 - 44319 1031 245 aa, chain + ## HITS:1 COG:CAC0897_2 KEGG:ns NR:ns ## COG: CAC0897_2 COG0169 # Protein_GI_number: 15894184 # Func_class: E Amino acid transport and metabolism # Function: Shikimate 5-dehydrogenase # Organism: Clostridium acetobutylicum # 4 245 7 252 273 129 37.0 5e-30 MRRYGLIGRPLGHSASAAYFAAKFEREGIADCAYALYELPDIGALEGLLARTPDLCGFNV TIPYKREVMPLLDALSHDARMIGAVNCVRRAADGSLTGHNTDVVGLRASLDELLGGEQPE HALVLGTGGASQAVQYVLAERGIPFDLVSRDTAKGNYTYDDLPCEVVERSRLIVNASPVG TYPAVDAAPRIPYGFVTPGHYLLDLVYNPPLTQFLDYGRQRGAHILNGETMLREQAEASW RIWNE >gi|313156961|gb|AENZ01000078.1| GENE 36 44446 - 44985 634 179 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157015|gb|EFR56447.1| ## NR: gi|313157015|gb|EFR56447.1| TonB family C-terminal domain protein [Alistipes sp. 
HGB5] # 1 179 1 179 179 336 100.0 5e-91 MARLIGETEKIVDAAQIPAAEFSSQVVVGFTVDETGNVTQWRFLDNTCEGKDSVGVEPAT PRTREAMTEALGRLEKWTPAMKDGKPTTYSWRLTMRLRYDGRFTGRGAKGEGVVRVRFYI EPDGKITIGEVIKSPDEKLSREMIRVIRSSKGKWTPRKVRGVPQRTAYEYGVNFLGMAE >gi|313156961|gb|AENZ01000078.1| GENE 37 45109 - 45318 313 69 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156977|gb|EFR56409.1| ## NR: gi|313156977|gb|EFR56409.1| conserved hypothetical protein [Alistipes sp. HGB5] # 1 69 1 69 69 112 100.0 8e-24 MQDETMVVLAEYNTITEAEIAKSMLDSAGIWSMIRNEYMSAIYPIGTMPAQVVVREEDAE KAKAMLRHR >gi|313156961|gb|AENZ01000078.1| GENE 38 45443 - 49099 6125 1218 aa, chain + ## HITS:1 COG:BB0579 KEGG:ns NR:ns ## COG: BB0579 COG0587 # Protein_GI_number: 15594924 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, alpha subunit # Organism: Borrelia burgdorferi # 4 1139 21 1080 1161 785 40.0 0 MPDFVHLHVHTQYSILDGAAAIKPLIKRAKALGMNAIAITDHGNMYGVKNFHDTATDAGV KPILGCEVYVVKNRFEKDKDEKAGDHLILLAKNLEGYHNLCKMVSYSFTEGFYYKPRIDK QLIEQYHEGLICCSACLGGEVPQAIMHNDIEEAERVVQWFKNIFGEDYYLELQLHPSGDP QKDADVYENQLRVNKVILELAAKYGVKYICSNDVHFILAEDAVAHDHLICLNTGRDLDDP NRMRYTFQEYLKSPEEMAALFPDHPEALATTLEIADKCEDYKLTHAPLMPNFPPPEDFPI ALGELRESFVKKIEDEEMLAKIGACATVPELEELVAGDKELSDRLMVAKQYCYLKDLTYK GAHMRYGDVLDEKTEERIKYELSTIEWMGFPGYFLIVWDYIRAAREMGVSVGPGRGSAAG SVVAYCLKITNIDPLKYDLLFERFLNPERISLPDVDVDFDEDGRADVLRYCVQKYGQKRV AQIVTFGTMAPKMAIKDVARVQKLALSESDRLSKLVPDKVTPDKKHGETPFDFVYKESPE LAAERESPNQLIRNTLKYAEKLEGSIRQTGVHACGVIIGQDDLEKFAPMAIAKDAELNVV EFEGKEVESVGLIKMDFLGLRTLSIIKDAVENVKAVHGVDVDIDGISLDDAPTYEVFARG DTTGLFQFESPGMKKHLRNLKPNRFEDLIAMNALYRPGPMEYIPNFIARKHGQEPVTYEI ADMEEYLNDTYGITVYQEQVMLLSQKLAGFTGGEADTLRKAMGKKKRDVLDKMKPKFIEG CKQRGHDEKICDKIWGDWEAFASYAFNKSHSTCYAYVAYQTGYLKAHYPSEFMAALLSRN LADIKQLTLYMNECKRMGIRVLGPDINESMRTFSSNKAGDVRFGLGAVKGVGEAAVESII AERNANGRFKDIYDLMERVNFSAVNRKCFENLAYAGGFDSISGFHRGKFFGADARDNTGV TFIEQLMRYGQRFQAEKNNAQQSLFGGGGHVDIQRPVLPACADWSQLETLAKEREMIGHY LSAHPLDDYKIIINHMCKTQLTELENLEALKGQEIAVAGMVVSVQNLITKTGKPWGKFVL 
EDYNGTHEFALFSRDYENFRKYLFSDYFLFVRGRVQPKPYNDKELEFKIISMVQLSEMRD TMIKEMNVLLPVEDVTPTLVRELTEKVKEAKGETLFRISVIDREAHVSLSLFSKSHKVSL TQSLVSYLDDNEIKYSIA >gi|313156961|gb|AENZ01000078.1| GENE 39 49123 - 49437 169 104 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|124485582|ref|YP_001030198.1| ribosomal protein L12E/L44/L45/RPP1/RPP2-like protein [Methanocorpusculum labreanum Z] # 3 103 18 117 120 69 35 5e-11 MALAITKENFQEVISSELPVVIDFWAEWCGPCRTIAPIVDELAAEYEGRVLIGKCDVEEN DEITMKYGVRNIPTIVFLKGGELVDKQVGAASKAALAEKIEKLL >gi|313156961|gb|AENZ01000078.1| GENE 40 49439 - 50431 1034 330 aa, chain + ## HITS:1 COG:CAC0697 KEGG:ns NR:ns ## COG: CAC0697 COG2017 # Protein_GI_number: 15893985 # Func_class: G Carbohydrate transport and metabolism # Function: Galactose mutarotase and related enzymes # Organism: Clostridium acetobutylicum # 7 328 14 336 340 254 38.0 2e-67 MLRSTDWHGMQAVEFSKGDYTALLIPGMGANLIRLTNTRLGAEILRAPADDEVETFKGRP HVFGLPILFPPNRIEDGRYTFGGRTYRFPITNAKEHHYHHGILKSQPFAVSKAWETGEEV LVECRYYSNAGNDAIFRDFPHEFKCKITFRLTAQGLEQEVMFANRSAESMPVGVGFHTPM QIPFAGGEAADYVMRVAVGEQVEMSGRNIPTGRKLPLSEQFARLREGGLRVTDCEPIEAG FTLREIEVDGKPFRGALVENLRTGVRTFYEVDRQTTYWTIWNNGGRVPYCCPEPQSWTTN APNAADPAAEGFQAIAPGESWRMKFRLYAK >gi|313156961|gb|AENZ01000078.1| GENE 41 50608 - 51492 288 294 aa, chain - ## HITS:1 COG:no KEGG:Bache_1414 NR:ns ## KEGG: Bache_1414 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 146 293 190 343 350 66 32.0 2e-09 MKTKKSGILILFLCIFCVVGCDNDEKKGREVTDYKEYILTVASRKVPGVLTSDGHNYLTD VYAVKKELSDEWIPFGNIEGFDFEEKYECKIKISETSYLDYGMGEPAWTEYERLEVISKE KKDSEEMPLHFIPKWYYENRPLPQYRYAIESENKELIEEVLKANPILPLDYHYLLYRDED NHPKGIAIKDDDDLFGPSIIKSTNRNPEEMPESYKLLPPEEQVQGFMEWTFSDESGNATD YPSFDVFVAHSSRSRSVGPAPDLVCLYKDLTEHYKNQYPDAGVKTVVVSFTIDI >gi|313156961|gb|AENZ01000078.1| GENE 42 51879 - 52313 426 144 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156966|gb|EFR56398.1| ## NR: gi|313156966|gb|EFR56398.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 144 1 144 144 239 100.0 5e-62 MKKNLLKCLCLPLGMLCVVLSACSDDNTEPTKKTEYTVTEFASTTIPETIPAEGGAYTMT FTTRTETRSQPVVPVFEPWQYRVTLGEAVGDAIAVTEPKTEVSVVIGSNYSKEPRAVVVE MAVTEDAATPPGGFGSSRLNRNLP >gi|313156961|gb|AENZ01000078.1| GENE 43 52454 - 54295 2054 613 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157007|gb|EFR56439.1| ## NR: gi|313157007|gb|EFR56439.1| hypothetical protein HMPREF9720_2260 [Alistipes sp. HGB5] # 1 613 1 613 613 1214 100.0 0 MTLGDKVGDPVAVTERTESIDVEIAGNYTENERNIVVEMADAGAEPQWTKIVEARQEAAL VLTAGFYWAKGNVTLMDGKFAVADKMSDLGLYFRQGSKYGVPSDGGSYAGTAYTPEAVQV ALADIPYRQPNTDPCTMIDAGLRTPTYMELFCLYDREDYMNQHVLDGITGMGYLSSDYFM PFCGALELASGQISGKSQFGGYWGLGANYAGEGVIYVLNADYSMVDYDLAGTNMASLRCV KNIRQPSYVSHTPASVTDNASFKLTVKTDPGEFPAYEVDIEAEDGEIRSIDASSSETEVT LTVPKNDEVGNREWRLFINRVYSGVSFVQPGKKNYVDYISHSPSKATYEAFTLSVKCESD LASFPVVIKGSDGLELTQNGSKENPTVSFSVPENAGEERTLAIWVNGSDTGKSVRQEANP ATNAFSVTWSEGYLTVVDGAYTFAEPKERGMYFKWKSKYGIKFEGTVSSSTKYQGVVYGP DEQSIPNYADVPYGDVDPCSLVAPAGTWRMPTSEQLLELVADGAKEFEPYAFRMCSDGMQ NIYMVPSGQLYKDGTKTMLPNIISVWTADESATKPGQYCYLAWSTASTTNNPSVSAGGVV PATAMMVRCVRSK >gi|313156961|gb|AENZ01000078.1| GENE 44 54455 - 54973 734 172 aa, chain - ## HITS:1 COG:CAC1484 KEGG:ns NR:ns ## COG: CAC1484 COG0778 # Protein_GI_number: 15894763 # Func_class: C Energy production and conversion # Function: Nitroreductase # Organism: Clostridium acetobutylicum # 1 172 1 172 172 229 57.0 2e-60 MDFLSLAKKRYACRKYAPAKVEQEKLDIILEAGRVAPTGANRQPQRLIVVRSQEGMERLA RCTRDFGAPCAVIVCADTGEAWTRKYDGKTIGDIDASIVTDHMMLAAASLGLDTLWICMF KPEAVREEFRLPEHVEPVNILLIGYGIGTPADPDRHDLLRKPLAETVVYETF >gi|313156961|gb|AENZ01000078.1| GENE 45 55147 - 57210 3576 687 aa, chain + ## HITS:1 COG:alr2323 KEGG:ns NR:ns ## COG: alr2323 COG0326 # Protein_GI_number: 17229815 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Molecular chaperone, HSP90 family # Organism: Nostoc sp. 
PCC 7120 # 1 580 2 596 658 466 42.0 1e-131 MKNGKIGVTTENIFPVIKKFLYSDHEIFLRELISNAVDATQKLKTLSSIGEMKGELGDLT IHVAVDKEAKTLTVTDRGVGMTAEEVDKYINQIAFSGAEEFMEKYKNQSIIGHFGLGFYS SFMVADKVEIFTKSYKKEAKTVHWSCTGTPEFEMEETDAEHDRGTTIVLHLSEDSLEYAE SSKVEELLRKYCRFLPVPIAFGKVKEWKDGKYVDTDKDNIINNIDPLWTRKPADITEEQY KEFYHELYPLSDEPLFSIHLNIDYPFHLTGILYFPKIHNNFEIQKNKIQLYSNQVYVTDQ VEGIVPEYLTLLHGVIDSPDIPLNVSRSYLQSDSNVKKISSYITRKVADRLQELFNTMRA DYEAKWDDLKIFIEYGILTDEKFSEKAQEFMLWKNIEGKYFTPAEYLEKVKENQTDKNKT VVFLYVDDPVEKHTFLEAAKAKGYDVLLMDGQLDNHYINWYESKNKETRFVRVDSDVIDK LIQKEENIKMSLTEAQQELLRPVFESQMPKDDKIHYNISFEAMSPDEAPVVITQNEFMRR MKEMAAMGGGGGMSQFYGQMPDNFTIAVNGNHPIVADILSDAEKAYGDKLKSITKKIDAA VAEENRFDEVVKGKKEEELTPEEKSTREELSKKIVTLRDERNERLREIGGENRLVKQIID LALLTNGMLKGKNLTDFIQRSISLIEK >gi|313156961|gb|AENZ01000078.1| GENE 46 57364 - 57777 788 137 aa, chain - ## HITS:1 COG:SA1185 KEGG:ns NR:ns ## COG: SA1185 COG0824 # Protein_GI_number: 15926931 # Func_class: R General function prediction only # Function: Predicted thioesterase # Organism: Staphylococcus aureus N315 # 7 125 7 129 155 91 36.0 3e-19 MLSYDCQIRVWYKHTDQMAICHHSNYICYYEAARSELLRWLGMSFAEVEKRGIMMPILEV QSKYHRPAYYDELLTVRIMLKELPTARINFFYEIYNEKGDLLNTGMTQLGFIHSDTRRPC RVPDWFLQLVTNKWTEE >gi|313156961|gb|AENZ01000078.1| GENE 47 57781 - 58518 223 245 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|163762565|ref|ZP_02169630.1| ribosomal protein S2 [Bacillus selenitireducens MLS10] # 17 216 19 231 317 90 30 3e-17 MRVFHGFEALPHFTHPAVTVGSYDGVHSGHLKLLRTVTAAARGQGGESIVFTFEPHPRIT LGKADGLRLLTTLDEKICLLERLGIDNLVIIPFDRAFSRLAPDEFVRDCLVGHTGAETLV VGFNHRFGHDKQGSYDYLDSHGFGLRVIEVSECDVDAEKVSSTVIRQLVGEGKMARAARL LAHPYLVIGRALRNRIRVEDPLKLLPPPGEYDAHINGKPTRLTVDAAQELITRDILPEGR VVITF >gi|313156961|gb|AENZ01000078.1| GENE 48 58544 - 59200 987 218 aa, chain - ## HITS:1 COG:lin2178 KEGG:ns NR:ns ## COG: lin2178 COG2344 # Protein_GI_number: 16801243 # Func_class: R General function prediction only # Function: AT-rich DNA-binding protein # Organism: Listeria innocua # 1 213 1 213 215 131 34.0 9e-31 
MASLTNAIPEKTIERLSEYRRTLLASHKQGITHIFSHVLAGIHGITAVQVRRDLMLIGFS SDTKKGYDVQVLIEYISRILDSPSQMNIAVLGMGHLGQAITKYFNGKGLKLKITAAFDVD PGKVGKTIDGIPCYHMDTFEEIVEDKDISIVIVSSPTQVAPSLVVPIINAGIRGVLNFTS TPLNFPQGIVVENYDITTLLEKVAYFVKENEEKNSSNT >gi|313156961|gb|AENZ01000078.1| GENE 49 59208 - 61037 2581 609 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2701 NR:ns ## KEGG: Bacsa_2701 # Name: not_defined # Def: peptidoglycan-binding lysin domain # Organism: B.salanitronis # Pathway: not_defined # 34 606 44 606 612 178 25.0 9e-43 MKRLCATVLLLLLSVCAFAVQKSGTIVYINGSKFYIHTVQPGETLYGLSKAYEVGEKVIL QYNPSAGGGLKAGENVKIPFVTAVPEPKSERKLRRTFDSHTVAQGETLYAISRKYEIPIQ TVIEDNPNLDPTNLRLGERILIRKKEIGSEDEAGAKEQWEAYRNTLNSVAEEGYAYHMVK PGETFYSLSRRFGITEEQLGSLNGGLKPADLKAGAIIKVPGSPEELAAGERQPADTVRAD SVPDLSADNRVKEIEFRALRRSATLDVALLLPMDAGTPNPSYLEFYQGFLLGLDSVKTKY GYSVNVCLYNTARDAEKIREITESDAFRNTDLIIGPVYEEGLYPVIRFAEEHNVPVVSPL ANITGMNSDVLFQMAPDPAHKWEKAAELVNGDRQVALIYTESTDKEFEKEILTLLGDSDY KKHTYKYVHPSARTNSNSGDLTPLLDNGADNVFIVMSDNEVDVDRILAAIASADTNISSR GRTAPRFVVLGNTRWNRMNNIDRTMFFKDRVIFISTYHAKRDSQTILDFDSAYIRSFGSL PTLFSYRGYDAAVIFCPAMYNDIEYDMEGRSYAPLQTSYLFGQGEGRHNHVNRSWMRVNY NSDFTITVE >gi|313156961|gb|AENZ01000078.1| GENE 50 61040 - 61807 1354 255 aa, chain - ## HITS:1 COG:CT261 KEGG:ns NR:ns ## COG: CT261 COG0847 # Protein_GI_number: 15604982 # Func_class: L Replication, recombination and repair # Function: DNA polymerase III, epsilon subunit and related 3'-5' exonucleases # Organism: Chlamydia trachomatis # 9 255 4 232 232 84 31.0 2e-16 MKLNLKRPIIFFDLETTGVDTAKDRIVEISMAKIMPDGEEIVKTRKLNPEMHIPEEATAI HGITDEDVKDCPTFAQVAKSLEQFIRGCDFGGFNSNRFDLPVLVEEFLRAGVDVDFKRRK FIDVQNIFHKKEQRTLVAAYKFYCDKDLDDAHSAEADTLATYEVLKAQLERYPDLENDID KLAEFSTRAETADYAGRILFNEKGEEVFGFGKYKGRSVEEVFRMEPSYYAWMMNGDFPLY TKKVITEIRMREKLK >gi|313156961|gb|AENZ01000078.1| GENE 51 61819 - 62943 1880 374 aa, chain - ## HITS:1 COG:BMEI1942 KEGG:ns NR:ns ## COG: BMEI1942 COG0592 # Protein_GI_number: 17988225 # Func_class: L Replication, recombination and repair # Function: DNA polymerase 
sliding clamp subunit (PCNA homolog) # Organism: Brucella melitensis # 1 372 26 397 397 138 27.0 1e-32 MKFSVSSSALLSLLATTGKVISNKNTLPILDYFLMELSGNELKVTTSDLETTLIGSITVD SVESEGTIAAPAKLMLDSLKEFSELPLTIDVNDKNWEITINWKSGSLSIPGASAVSYPAV PQLSAEKKELSMDVDTLVNGINKTIFATADDELRPVMNGIYINLAPGALTFVGTDAHKLV KYESETANEVTASFILPKKPANLLKSVLLKEDDAIEMSFDSKNALFKLKSHTLVCRLIEG NYPNYNAVIPANNPNKVLVDRVELVNGIKRVAVCSNPTTNLIRMDIGDNRINLTAQDIDF SVSANETISCSYDGEEITIGFKSTFLVEILSNMDTPTVVIELADSTRAGVFKPVYDDKQT SSTLMLLMPMMINA >gi|313156961|gb|AENZ01000078.1| GENE 52 62983 - 63861 1356 292 aa, chain - ## HITS:1 COG:VNG0893G KEGG:ns NR:ns ## COG: VNG0893G COG2820 # Protein_GI_number: 15790029 # Func_class: F Nucleotide transport and metabolism # Function: Uridine phosphorylase # Organism: Halobacterium sp. NRC-1 # 13 292 9 255 273 106 29.0 6e-23 MRTIPASELIINDDGSIFHLHLLPEQLADTVILVGDPGRVALVAEFFDTRECEVANREFK TVTGTYKGKRMTVLSTGIGIGNIDISVTELDALANVDFATRQEKAQKRQLTLVRLGTSGA IQPDIKVGEFVFSRTSVGFDGLLNYYKGRNEICDLAIEKAFMEHVGWNELLPKPYFIDAD KTLFEHFRDVTREGITIAAPGFYAPQGRWVRLEPQDARLNEKIESFDYRGRRITNFEMEG SALAGLAALMGHRAATICTIIAQRIAQNVDTDYKPFVRKMISTALDKLAVLE >gi|313156961|gb|AENZ01000078.1| GENE 53 63976 - 65367 2025 463 aa, chain + ## HITS:1 COG:MTH1584 KEGG:ns NR:ns ## COG: MTH1584 COG1109 # Protein_GI_number: 15679579 # Func_class: G Carbohydrate transport and metabolism # Function: Phosphomannomutase # Organism: Methanothermobacter thermautotrophicus # 12 451 6 441 455 275 37.0 1e-73 MTLIKSISGIRGTVGGAQAENLTPPDVVKFTTAYARLIAERNPGKKLTIVVGRDARISGE MVSDLVEGTLLACGADVINVGLCTTPGTEMAVITKRADGGIIITASHNPRQWNALKLLNS DGEFLTDAEGKRVLAMAEEEGFEYPGVDGIGHVLSREPFNRTHIEQVLALPLVDAEAVRK RRFKVVVDAVNSVGGIVMPELLRRLGCEVVELNCDPTGEFAHNPEPLPENLTGIAETIRR EKADLGIVVDPDVDRLALVSEDGSMFVEEYTLVAVADYILSKNPGGDTVSNLSSSRALRD VTERHGGKYSASAVGEVNVVAKMKETGAVIGGEGNGGVIYPELHYGRDALVGTALFLTWL AHKGMTMTQLRATYPSYFASKNKIELTPAIDVDKVLREIKERYAGENMNDIDGVKIDFAE NWVHLRKSNTEPIIRVYTEAKSMAEADALAQRFIGEIKEICNI >gi|313156961|gb|AENZ01000078.1| GENE 54 65378 - 65818 734 146 aa, chain + ## HITS:1 
COG:VC1962 KEGG:ns NR:ns ## COG: VC1962 COG3015 # Protein_GI_number: 15641964 # Func_class: M Cell wall/membrane/envelope biogenesis; P Inorganic ion transport and metabolism # Function: Uncharacterized lipoprotein NlpE involved in copper resistance # Organism: Vibrio cholerae # 32 144 48 162 163 81 42.0 5e-16 MKSGILILAAAALLAACGGNTQKKAAAGGTETVAGMPDMHTAETSLDYQGTYAGTLPAAD CPGIETRLTLKKDGTFDLHMKYIDRDAEFDTKGGYSVRGNLLTLTPENGEEVEYYKVEEN RLRRLDADKQPVTGPLDENYVLKKTE >gi|313156961|gb|AENZ01000078.1| GENE 55 65831 - 67198 1834 455 aa, chain + ## HITS:1 COG:DR2025 KEGG:ns NR:ns ## COG: DR2025 COG0624 # Protein_GI_number: 15807020 # Func_class: E Amino acid transport and metabolism # Function: Acetylornithine deacetylase/Succinyl-diaminopimelate desuccinylase and related deacylases # Organism: Deinococcus radiodurans # 11 443 11 442 459 420 49.0 1e-117 MDKVKKYIDANKDRFISELFDLLRIPSISAQSEHRPDMTRCAEWLAAALVKAGADHAEVF PTEGNPVVYAEKTVDPKAKTVLVYGHYDVMPVDPRSEWRTEPFEPVIRDGRIWGRGADDD KGQLWMHAKAFEAMCAEECLPCNVKFMLEGEEEIGSPSLYKFCEQNKKMLKADIILVSDT SMISMQTPSITCGLRGLAYMEVEVTGPDKDLHSGLFGGAVANPANVLTRLVAQLVDAEGR VTIPGFYDDVRELTPAERKAFNKAPFSLAAYKKSLSIGDVEGEAGYTTLERTGVRPSLDV NGIWGGYIEEGTKTVIPSKASAKISMRLVPNQDYRKISKLFEKYFKSIAPKSVKVTVKSL HGGMPYVAPTDMPAYKAAEKAVAETFGKKPLPFYSGGSIPIISGFESILGIKSLLIGFGL AEDAIHSPNESYGLEQFYKGVETIPLFYKYFAEQK >gi|313156961|gb|AENZ01000078.1| GENE 56 67208 - 67990 961 260 aa, chain + ## HITS:1 COG:PM0315 KEGG:ns NR:ns ## COG: PM0315 COG0483 # Protein_GI_number: 15602180 # Func_class: G Carbohydrate transport and metabolism # Function: Archaeal fructose-1,6-bisphosphatase and related enzymes of inositol monophosphatase family # Organism: Pasteurella multocida # 13 236 12 230 267 182 47.0 6e-46 MYREFLDFIFPLAEGAGKIQFAYFRGDDLAIRTKSNVYDVVTRADKESEAYIAAAILERW PDHEILGEEGGSMGRAGSDYRWVVDPLDGTTNYSQGLPVFSVSIALQYRGETIAGVVYAP YLNELFWASKGGGAYMKSRSGEVKRLHVSDKQTLATSVIATGFPYDKDVNPDNNSDNVAR IIPYVRDVRRLGSAAYDLSCVAAGLLDGYWELDLHEWDVCAANLIVREAGGVVADFRPDR GVSQAAGNETLVREILKYVI 
>gi|313156961|gb|AENZ01000078.1| GENE 57 68089 - 69933 2612 614 aa, chain + ## HITS:1 COG:DR0302 KEGG:ns NR:ns ## COG: DR0302 COG0449 # Protein_GI_number: 15805332 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Glucosamine 6-phosphate synthetase, contains amidotransferase and phosphosugar isomerase domains # Organism: Deinococcus radiodurans # 1 614 37 642 642 561 48.0 1e-159 MCGIVGYVGGREACPILLKGLHRLEYRGYDSAGVAMVDEDGALNVYKCKGKVSDLEHFLA GKELGGNIGIAHTRWATHGVPNDVNAHPHCSESENIALIHNGIIENYRVLKDALEENGYT FRSSTDSEVLVNLIEYIRSTNACSLLEAVQQALRQVVGAYAIAVVEKNNRDEIIAARQSS PMAIGIGRGEYFLSSDAASIIEYTQDFVYVNDGEIAVINRNKPLKIVTLDNHEGKIDIKK LQMSISQLEKGGYPHFMLKEIYEQPKTIVDCIRGRINPETGEVKLSGVIDNRRKFLQARR IIFVACGTSWHAALIGEHLIENICRIPVEVEYASEFRYRNPVIYEDDIVIAVSQSGETAD TLAAVELARKAGAFVFGICNVVGSSIARATHSGAYIHVGPEIGVASTKAFTGQVTVMAML ALAVGREKGAVTEEYYREVAKGLLELPAVLEEVLKLGPRIEDLSKIFTYAHNFIYLGRGY NYPTAMEGALKLKEISYIHAEGYPAAEMKHGAIALIDNEMPTVAIATPDNTYEKTASNIE EIRARGGKIIAVIARDDTHVRRSADYTIEVPVITDCLMPVVVSVPLQLLAYYIAVNKGRN VDQPRNLAKSVTVE >gi|313156961|gb|AENZ01000078.1| GENE 58 70031 - 71380 2112 449 aa, chain - ## HITS:1 COG:RC0082 KEGG:ns NR:ns ## COG: RC0082 COG2271 # Protein_GI_number: 15892005 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Rickettsia conorii # 18 447 16 430 431 276 37.0 7e-74 MSAKQLNPAYAGMTDQQVKRFKYWEFRTMAMCIFGYALFYFVRKNLSIAMPYLNEEMGIS KSDLGLFLTLHGLIYGLSKFVNGIWGDRSNARYFLVTGLVFCGICNLLFGFSSSVLALGI FWILNGWFQGMGVPPCTRLMTHWIVPERLATKMSVWNTSHSIGAGLAIIFCGYVVSLNWG AWFDASSMLSQHWRWCFLLPATIALLGAAVVWAFVRDTPSSVGLPELKTGKTAEKPQSKA EEKAEYKAFLRRKVFLNPTIWIIAVGNFFVYIVRFAVLDWGPTMLKEHLHMDISHAGWTV AAFEIAGIAGMLAAGWATDRFFGGRAPRTCVICMLMAAVCLVGLYSLNEHTPLIVAVAIL MAAGFFIYGPQALVGIAAANIATKRAAATAGGFCGLFGYGSTIVSGWGLGMLVQYTDWSI ALYTLVGMALVGTAIFAAAWKAKPNGYDD >gi|313156961|gb|AENZ01000078.1| GENE 59 71402 - 72400 1406 332 aa, chain - ## HITS:1 COG:mlr8455 KEGG:ns NR:ns ## COG: mlr8455 COG0584 # Protein_GI_number: 13476980 # Func_class: C Energy 
production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Mesorhizobium loti # 47 157 97 213 407 72 36.0 8e-13 MKNTKLLLLAACCAAIMPACAQAPRLHTVKIDSIEELQAYFTYDPARDVIVSGHRGGMMP GYPENCIESCEKTLSLMPTFFEIDFSFTKDSVMVLMHDLTIDRTTTGKGLVADYTYDELR RLNLVDRDGKVTPYRIPRLKDVLEWGKDKVVFNFDNKYINTKGVSDEVRRASLDYYIRQL RPGGEWSMYHNIMLSVRSIEEALYYWNHGIRNVMFCVEISSMEHFRAYEASPIPWKYIMA YIRLAVNPELQQVYDLLHAEGVMTMTSITGSSDKVKNPHDRRVAYMRELLAEPDIIETDY PSEFIGLPWSRDAIHALQDAAIRGNRSSTDLK >gi|313156961|gb|AENZ01000078.1| GENE 60 72415 - 73572 1757 385 aa, chain - ## HITS:1 COG:mtlD KEGG:ns NR:ns ## COG: mtlD COG0246 # Protein_GI_number: 16131471 # Func_class: G Carbohydrate transport and metabolism # Function: Mannitol-1-phosphate/altronate dehydrogenases # Organism: Escherichia coli K12 # 3 381 2 382 382 384 52.0 1e-106 MKKAIQFGAGNIGRGFIGAVLEKAGYHVVFADVNEQIVDRINRDKGYTVQIMDTVCEEVR ITDISAVDSRSPELAQQIAEAEIVTTAVGLTILPRIAGAIAAGIEARREQGVEQPLNVIA CENGVRATSQLKAAVLTHLDAAGQAYCEQCVGFPDCSVDRIVPPVKSENPIDVVVERFFE WNVERAAFKGAVPEIPGMNPADNLIAYIERKLFTLNTGHAITAYLGRMKGYMTICQSISD EQIHAVVKAAMRESGRGLVARYGFDRDAHFAYIDKIIGRFTNPYLCDDVTRVGREPLRKL SAGDRLVKPVLTARQYGIGTPNLLLGIGAALHYDNPEDPQSVEMIAMTARLGAAAAVAEI AELPAGDPLPALAAQAYAEVERIIR >gi|313156961|gb|AENZ01000078.1| GENE 61 73574 - 73774 78 66 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVSSDRLYLICHNRTTRPMGKNLFPAPDAAAQSPAAMLRQYGGTCRRVFPSPAQTHHAIL VIINHR >gi|313156961|gb|AENZ01000078.1| GENE 62 73864 - 74454 838 196 aa, chain + ## HITS:1 COG:RSc2361 KEGG:ns NR:ns ## COG: RSc2361 COG1595 # Protein_GI_number: 17547080 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Ralstonia solanacearum # 6 181 36 210 213 66 33.0 3e-11 MEKILLTTRIRNGDRKAFDELCGRYYAMLVSYARLFMKDDWAEDVVQDVFYNVWQNRAAL DDSNSLYKYLLRSVYNRALNYLDKNRRAGDYRAYYQERIASMGSAYYAPDNSPIIRKLYS DDLRASLDAAIESLPPKCREVFKLSYLEDLSNREISERLGISQSTVENHMYSALKQLRRK LSKEQLLLLLALLFMR >gi|313156961|gb|AENZ01000078.1| GENE 63 74535 - 75542 1315 335 
aa, chain + ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 132 298 145 310 354 87 37.0 2e-17 MAHTDDTVLYSIDSVLYKHLNHETTEAEERRLYAWLEEREEHRTFYFEIAAIWSAHKTLS SKQLGERCDAMMRRLNARIDADEAMRREVKRQRPAWRRWASAAAAAAVVAVAVTAAWFAG VGTPGDELFRTYLNESGEITALRLEDGTQVWMQEGTRLQYAVGAGSAERVVRIDGEAYFD VAHDEAHPFVVKTENLSVRVLGTAFNVRAHAADPLTEVVLERGSVRLQTPEGYNLVRLHP NQRAVFDAAKDDIEVEEIYAEHFVTERYNLVAMKNATIGEIIARIESNYGVRIRIADPDN RKRYDINYLRTNSLEEVIDIVEFMTGQRCEVVRGK >gi|313156961|gb|AENZ01000078.1| GENE 64 75624 - 79166 5190 1180 aa, chain + ## HITS:1 COG:no KEGG:Sph21_5171 NR:ns ## KEGG: Sph21_5171 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: Sphingobacterium_21 # Pathway: not_defined # 125 1180 28 1056 1056 913 47.0 0 MKKNSIFRWGVAMLLLFTGLSLRAYAEDDGGSPFRKNHDVDLTVKSATLQTFTDAFTKQT GVLFSYESALASMPMGDVSVRESNAPLERILNNVFTKRGFRYKIVDRTVVLTYDRTAEPQ RKNSVTGRVCDAAGSPLVGATVLVKDSTRGTTTGADGTYSVEAEPGAVLLFSYIGYTERE EPVGSRSVIDVTMQEDQSVLEEVVVVGYGTQTRKTVTSAISKMDGKTLESMPVNLVGDGM KGRIAGLQVATTDATPGSAPKFLIRGGSSINNSNDPIVLVDGAVREMAGLNPNDIESIEV LKDAASAGIYGSRASNGVILITTKKGAPHKGPQIVFEGQWAYESPATKFDLMNGRDYLLT LRPAIAEGYCGGADPLSILDGAESAGAGNSAASRWTTRYLNPGESLPKGYKWIEDPVNPG KIIVFQDNDQQSQWFDDAFWQNYYIGVNGGGENIRYAASAGYTDDGGIGMATGFSRFTFH GNTSFKVTRRLTATTTFDYSQIERQMLDGGALNKRNSVIRGLSVPATHRDWYDAEAGEDL AGTPAMGPNNTTLPAAYYNYYYSDTGETTKRSSVNINLEWAIVDGLRAVAQFSNHNRHSR SHFFVKNNPTTGTNIRPTKEAFTETNRMDFQTYLNWKKTFAEDHNIDVVAGYDYMKDKNN SIDARVQGAASDKIPTLNVGTSNITNYPTSTRTDEVLISYFGRVNYNFREKYLLSFTMRA DGSSKFAAGNRWGYFPAASAGWIVDEEAFWPQNKVVSSLKLRASYGLTGNNGIGLYDTYG SYNSAYQYNGMSTTTTGEMPNSNLTWESTTQVNAGLDMGLADDRVRVSFDYYDKVTRNLL FDVTLPNTTGYGSVVSNVGKVRFYGVDLAISSVNISRGDFTWTTDFTYNFNMNKVLKLPD NGNVRNRMDGITIGDGSQFGGIAEGERMGRIYGYKVDHIIETQADADAAMYDASSRGYRR SDRRQIAGRKDIGDYEWVNRAGSTQRDGRDIINEEDQFLLGYATPHSTGGIGNTFRYRNW 
SLNIYMDYALGHSIQNEMQMRYFMATMGNCNWNLVNDVKQCWSQPGDKTKYARFTANDPD WGNRNFSRMSDVFVEKGDYLCIRDISLSYKLPVRWTSRLGMKDVTLTVSGNTLYYWTAVH GVSPEAATTGGLYNSASTYATGFSPYPPARKILFSAKFTF >gi|313156961|gb|AENZ01000078.1| GENE 65 79187 - 80653 2244 488 aa, chain + ## HITS:1 COG:no KEGG:Sph21_5172 NR:ns ## KEGG: Sph21_5172 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: Sphingobacterium_21 # Pathway: not_defined # 1 487 1 484 484 305 38.0 2e-81 MKCKKILTALLAAACTMSGCHGDLDIMQDNKLSASNMWIDESDAETATYGIYLYMRDALK SVHDIYLCWGEFRNGLWGAGTNKTLSGVDQTQVRTSTMSSSNAYADWSVMYKTVNQANLI LKHVPEMGISESVRNFSLGNAYFSRAWCYFWIARIWGDAPLALTGYESTGGDLYLPRAPK AEIFARIESDLTEAERYVTDNSDKAVATPAAVQMLKADYALWMYRVAGGGSSYLTMAEKA IEALNLSASRLESDYAQVFSSANKLNKEVIFAVHQANGEAVNGPGYYLGWNSNYVEAAYQ NNPVPITGGNQWWWYTDRYKALLTSVEGDTRTKLTYNRADYGTTIKSIEWTEKGVGQMIS GARIFDSDFILYRYAEAFLMDAEIKYYRKDYGGALTALGEITKRAYGNAAHYTDQSEAAV KRAIVEESLKELVGEGRTWWTLIRLDAVWEYNKDVADKRESNANILLWPITQESIIKNSK LKQTEGWY >gi|313156961|gb|AENZ01000078.1| GENE 66 80679 - 82664 2789 661 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156998|gb|EFR56430.1| ## NR: gi|313156998|gb|EFR56430.1| conserved domain protein [Alistipes sp. 
HGB5] # 1 661 1 661 661 1263 100.0 0 MKINCYIALVAAALACACNPLADTDDMDNNVQATRIAFSPEVVTLDNAGRNAEEDQGQNV IVTLNPKARRSMAWSAETDKTETWCTLTECSVTDADGVTHRGFRITATENTAYKRTATVT LTAADGTQETLRVVQTGVYPDAEVTVDPKQIEFNADEIVPVDVSFTTNMGDVYAVSRDED ADWISWEDLGGNVIRFTAAPWDDEEQPRTANVYITVGSGETSAATAKIPVTQLAKDLYCY VWGASLFDYDRFELARRMTKSETGVYRFDAYFTAGNAPEIRVNTVLRADYYPAYALAADG RIVTLNSAADAVPAGPAIDIDGMRTLTIDLNAMTYTLSRIAIANCMPDDAAAACPSKAFR TKDGGVKIWMTRNLNWNGGPEIGAMKLGSRLVPAYSGSATSGGYDPQDPYIERNPAYDEE ESGGSVKGDQAVTDRHGRLYTLDEILLGTPAGGLGRGITSIEWPSAYNVGSTFVDAVGTQ ITARVFKSAEMKAVTDDERFFAENPALSAQIQGICPYGWHIANFRDWYDLAYAALEASRG DGTYPVREEMLTIAKLVATNANNVAPWLRTQEGWTSTPARAAGADAFDFNLYPTGWRLNK NGYGQYGDTAHSWVPLIGSSAKKTWRLNNVKATSNYWFNDNLDAGNPAVGIRCVKNYKVK K >gi|313156961|gb|AENZ01000078.1| GENE 67 82725 - 83597 1330 290 aa, chain + ## HITS:1 COG:no KEGG:Lbys_2251 NR:ns ## KEGG: Lbys_2251 # Name: not_defined # Def: glycerophosphoryl diester phosphodiesterase # Organism: L.byssophila # Pathway: Glycerophospholipid metabolism [PATH:lby00564] # 51 266 19 228 249 94 31.0 6e-18 MKKLKYLLMACCAACVWLAACSDDDYEPLPDWWWEQTGEPAVYPEPEPVDNKVVAHRGCY AETGFPQNSVAGLKKAVEMKLFASECDIHLTKDGKVVVYHDDYYLGNTCFKDATYAELCA KGTLANGEKLPLLEEFIDVVLEGGCTQLWIDVKTLGDEAGGNAEASRTGIAAAQIVHEKR AKNFVGFIVGRLAIRDKVIPAVRSAWPVSYGAAAYEPGDFIARDIPWANMKLADFGLDTK RAKSFPENKVRLSLWQIDTDEQMQWYQGIRSDVYGITNYPLRMMDKLGLR >gi|313156961|gb|AENZ01000078.1| GENE 68 83665 - 85266 2319 533 aa, chain - ## HITS:1 COG:no KEGG:Phep_3771 NR:ns ## KEGG: Phep_3771 # Name: not_defined # Def: metallophosphoesterase # Organism: P.heparinus # Pathway: not_defined # 1 532 1 516 521 488 47.0 1e-136 MKTRIMLFCCMAAALLLACSNDKSDGEGREPGVETELNGTQLGEGTTLYGLVTDTSGNPV QGVVVSDGYNCVETDANGVYQMIRYKKARFVWYSTPAGYEINTSADNYPLYYAEIVHKNI ADRHDFVLKPLAAPETDFTLLCIADPQCASTADISRYVNETIPDIEATVETFKAKGRAVY GITLGDIVFDTPDLWSNMKEAMANRNLTIFQTIGNHDHLKTETSDDKAAANFESQFGPRN YSFNRGNAHIVSMDNVFYVGGSTPSSNYKGGITDQQLEWLRQDLSHVAKDKLVIFCAHMP FRGGTSETDESHMNHAGVLDLLSEFAEAHIMIGHTHYQQKYIHTRNGKKIFEHIHGAACG 
AWWTSTLCADGTPNGYGVYEISGSTVPNQYYKATNKAADHQIRAYSAKQVFGTSGSTTFG FAANASAMNDAKCIVANVWNSDTGGDWKVSLWQNGAKVGDMTRISTCDYWAYAYHVLYFS KSVGSTWGKKLDHFYYGRLTSGTPETADFEIVVEDGMGNTYRTSKLQTDFIGF >gi|313156961|gb|AENZ01000078.1| GENE 69 85290 - 87125 2936 611 aa, chain - ## HITS:1 COG:no KEGG:Phep_3772 NR:ns ## KEGG: Phep_3772 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: P.heparinus # Pathway: not_defined # 2 611 4 604 606 742 60.0 0 MKKILLYTLMTVAAAGMTACEDFLTRDTYDQIGSDEFWKSETDLELYANGFIQKMIPGDG TITRGDIDADYCAVDIATDLLRPDGNVSPDNQGGWTESSWTNLRRVNYMLDNMHRCRGRV SDEVYNHYEGVARFWRAWFYYDKVRTFGAVPWYDFTISASDKEALTKPRDSREYVMDKVL EDLTFASTYCLADAKYTKGSALINKWVALAFKSRVCLYEGTFRKYHSREPSSDKPWQNDV NNADNKFLREAAAAAKEIMDKGPFSLVTGDVKTAYRSLFTSAGLLTQEVIFGREYSKELS AFHETSWYYYSPTYGTKIAMTKKFMNTYLTTKGTPFTDIDGYKTIDFIHEFDGRDARMAQ TVISPSYQMKISGRTVLYSPNWKVTRTGYHPIKWSIDDDADNVLSKAASWNSLPIIRYAE VLLNYAEAKAELGEMDAAVWDQTIAPLRSRAGVTSVIPAKADPYMEAYFLNTVTDKWILE VRRERGIELCLEMGLRWDDDMRWHMGDLLTSDNNPWTGIYIGSTSYTYDYTGLSTDDNGD PVPDFYIRPGEDTEHSIAISNTGANQTFSLNGEGNIVWEYRRVWSEKKYLRPIPLTAITR NPNLEQNALWK >gi|313156961|gb|AENZ01000078.1| GENE 70 87138 - 90770 6042 1210 aa, chain - ## HITS:1 COG:no KEGG:Phep_3773 NR:ns ## KEGG: Phep_3773 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 95 1210 9 1120 1120 1482 64.0 0 MKNLFHKGLCLVMALCCIAFKGYAGGGTLPDAQGHDLKLSYSRTTLKTVADAITQQVGIV FSYEIALADYPMENLYVTEQGAALDAILKTVFTPRGIDYRVVDKVVVLTRSAHPAQTVRA APAKAQVTGVVRDAAGNPMIGVTVTIKGTLVGVSTGVDGSYTIPADKDAMLLFSYIGYRN QEEPVGQRTQIDVTMQEDQLMMDEVVVVGYGTLKKRNIVGAVENLAGDAVENRPNADITR SLQGQIPGLNIVQTDGKAAHGGQVTIRGVNNSFKARVNGGQKENKLGQGGSALVLIDGAE GDMSSVNPDDIASISVLKDASSAAVYGARGAFGVILITTKNPEKGKVRVNYNGSVSLHRR TVIWEDNVVTDPVQWVEAFRESYLNSSPTATVPSLFNNYMPYSNAWFEELKRRRADPTMD NYSIDANGNYSYYGETNWLKEIYKSVNYSTTHAVSIQGGREGVSYYVSGRYYNQDGIYKV GEETYKKYNLRAKGSIRIRPWLTLDNNTSLMSSKYHQPMMHYGQQMISRQIDMFAFPFAL LKNPDGTWTQTAAKSGYAAFAEGTSWQEDNRLEVANTTTFNFEFVPEVFKVSADVTYKGA 
RWTRDRMENLYTYYTGVNVSGQDNSYSSLENWNYISNYISTNIVGTLTPKLGRDHDLNVV AGWNLEDYDYRAQKTYRQGNLYPSKPSFTLMDGEYYSTTSGGYTWGLVGFFGRINYAYAG RYLVELSARYDGSSKFPANSQWGFFPSASVGWRLSEEPWLKPHVEGWLDNFKIRASIGSL GNANIDPYQYLETMTATNSASIAKSSVIINGQNVPYTSVPDLIPDDITWEKVTTYNIGLD LDLFNNRLSFTGDYYRRNTTDLYTVGPNLPQVLGSAAPYGNYASLKTKGWEVSLGWRDSF KLGGKPFNYSLRAMLWDSRSWITDYYNETGDLTTYYKGMEIGEIWGFRTAGIYASNAEAL NGPAYNFFKNGEMFRAYAGDLRFVDVDGDGIMTKGNRTLSNHGDMEIIGNQSPRYQYSIN MSLNWNGIGLSMLWQGVGKRDWYPWTESGFFWGKWNRAYNSLMKTQTGDRVVKIDKSTDN WRVTNMDKNPYWTRMVSLAANRNDGPLTWENDHYLQDASYIRLKNITVDYTFPKHICKKL RIEGLKIYVSGENLFTHSPMFKYTDMFDPEVITSGDSDFASTQKSGLGGTGNGYSYPMLK TVTLGVNVTF >gi|313156961|gb|AENZ01000078.1| GENE 71 91325 - 91817 674 164 aa, chain + ## HITS:1 COG:no KEGG:Phep_3774 NR:ns ## KEGG: Phep_3774 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 21 164 21 158 440 138 47.0 6e-32 MKFFHKLTIILALAGIGGLTACSDDDSGFLKRNLREMSFSYVESSNTFTIRATGDWYISD VPDSREWTGAYSWVHVDRLSGKGSPDTYQKITVTCDQNVSDERTATIYLHGCGEENVAIA IKQENGIFEWKPFDNGQRFGVLGLLKLNAESEASLRIPYIKAVG Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:53:06 2011 Seq name: gi|313156947|gb|AENZ01000079.1| Alistipes sp. HGB5 contig00048, whole genome shotgun sequence Length of sequence - 21531 bp Number of predicted genes - 13, with homology - 12 Number of transcription units - 11, operones - 2 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 143 - 202 3.8 1 1 Op 1 . + CDS 398 - 1492 1548 ## COG0628 Predicted permease + Prom 1504 - 1563 3.2 2 1 Op 2 . + CDS 1616 - 3469 2926 ## COG5016 Pyruvate/oxaloacetate carboxyltransferase + Term 3494 - 3538 12.1 + Prom 3506 - 3565 6.2 3 2 Tu 1 . + CDS 3589 - 4071 768 ## COG1970 Large-conductance mechanosensitive channel + Term 4090 - 4131 8.6 4 3 Tu 1 . - CDS 4102 - 4782 921 ## COG1226 Kef-type K+ transport systems, predicted NAD-binding component - Prom 4933 - 4992 79.6 + TRNA 4916 - 4992 94.2 # Met CAT 0 0 - Term 4991 - 5028 7.3 5 4 Tu 1 . 
- CDS 5103 - 5813 425 ## Mpal_0765 UvrD/REP helicase - Prom 5838 - 5897 4.0 6 5 Tu 1 . - CDS 6988 - 7422 108 ## Mpal_0766 SMC domain protein - Prom 7445 - 7504 3.4 - Term 9751 - 9790 11.1 7 6 Op 1 . - CDS 9820 - 11568 2384 ## BDI_3380 hypothetical protein 8 6 Op 2 . - CDS 11588 - 14461 4293 ## BVU_1054 hypothetical protein 9 7 Tu 1 . - CDS 14804 - 14944 76 ## - Prom 14964 - 15023 3.1 10 8 Tu 1 . - CDS 15081 - 15821 540 ## gi|291515409|emb|CBK64619.1| hypothetical protein AL1_23800 - Term 17399 - 17443 4.1 11 9 Tu 1 . - CDS 17447 - 18886 1243 ## BVU_2464 mobilization protein 12 10 Tu 1 . - CDS 19266 - 20489 1369 ## BT_1138 transposase - Prom 20672 - 20731 3.2 - Term 20541 - 20583 -0.9 13 11 Tu 1 . - CDS 20753 - 21157 215 ## gi|313156948|gb|EFR56381.1| hypothetical protein HMPREF9720_2070 Predicted protein(s) >gi|313156947|gb|AENZ01000079.1| GENE 1 398 - 1492 1548 364 aa, chain + ## HITS:1 COG:VC0624 KEGG:ns NR:ns ## COG: VC0624 COG0628 # Protein_GI_number: 15640644 # Func_class: R General function prediction only # Function: Predicted permease # Organism: Vibrio cholerae # 172 332 192 351 361 100 33.0 3e-21 MRNFRERYWRYSLFVLILGLGITICIELTPFLGGLLGAATIYVLLRRQMQVLTACRRWRR SLAASLLLAEAVLCFLVPISLIVWMVVNQVQDITLRPDSIITPLKHIAALIHEKTGYNLW QEENISSMIGMIPKLGQWVVSSIFDFGVNIVVLLFVLYFMLIGGRRMEEYCREILPFDSS VSGSVMREVHMIVRSNAIGIPLLAVVQGIVAYIGYLIFGAPSPLFWGVLTCFATIIPIFG TALVWLPLAGYMALAGEWGPAVGLTLYGGLVVTHVDNVVRFVMQKKLADTHPLVTIFGVF IGLSLFGFMGVIFGPLLLEMFVFCVNIFKKRYLDGTPDKQLFVPGGAVRAHGPEAPAGRG VYEK >gi|313156947|gb|AENZ01000079.1| GENE 2 1616 - 3469 2926 617 aa, chain + ## HITS:1 COG:AF1252m KEGG:ns NR:ns ## COG: AF1252m COG5016 # Protein_GI_number: 18677784 # Func_class: C Energy production and conversion # Function: Pyruvate/oxaloacetate carboxyltransferase # Organism: Archaeoglobus fulgidus # 1 515 1 443 480 274 36.0 4e-73 MARNLKIRDLTLRDGQQSSFATRMSQAQVDRCLPYYKDANFYAMEVWGGAVPDSVMRYLN ENPWTRLETIHKAVGNVSKLTALSRGRNLFGYAPYPDDVIDGFCRNSIESGLGIMRIFDA 
LNDVDNVKSTVKYVKQYGGIADCAVCYTVDPKYPEPGFFAKLMGRKSHEQVFTDAYFLDK AKQMAALGADMITIKDMSGLIPPRRVATLVKLFKKNIDIPVDFHTHCTPGYGLASVLAAI IAGVDVVDTNCWYFAEGTGAPAIELVHVFCKKLGVDTGVNMEAVAKINTQLREIRKELNQ SVFGTEKPEPKPFNPLTDTLPAEIDALFDKAIKAAQADDEAATIDACRKIEAYFGFPAPN ELVQKAEIPGGMYSNMVAQLKQLKAEDILPRAMELIPSVRLAAGLPPLVTPTSQIVGAQA VNCALDEKAGRPMYTNKSSQFVGLVKGEYGHTPVKIDPEFRFKICGVREETPYDTSKYQM QPNPELPEAGGVKLAANEKEVLLLELFPLVAKNFLTDMKVKAYAASKPAEPKAEEKKAEE SVAAAITGNTVTAPLPGRIIEFKVKVGDTVKAGQEIVVLEAMKMENSVTTDYAGTVKQIL AHPGDNVATDAVLVEIV >gi|313156947|gb|AENZ01000079.1| GENE 3 3589 - 4071 768 160 aa, chain + ## HITS:1 COG:ECs4156 KEGG:ns NR:ns ## COG: ECs4156 COG1970 # Protein_GI_number: 15833410 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Large-conductance mechanosensitive channel # Organism: Escherichia coli O157:H7 # 1 158 1 132 136 143 53.0 2e-34 MAFFKEFKEFALKGNVMDMAVGVIIGGAFGKIVSSLVNDILMPPIGALIGNTDFSQLRLD ISKFRDMTSSAVHAVGDAVTGGGETVAQAAAEPVYWNYGAFIQQCIDFTILAFCVFMMVK LMNRLMKKKEEAPAPAAEPVLSKEEQLLTEIRDLLKEQKK >gi|313156947|gb|AENZ01000079.1| GENE 4 4102 - 4782 921 226 aa, chain - ## HITS:1 COG:CAC1317 KEGG:ns NR:ns ## COG: CAC1317 COG1226 # Protein_GI_number: 15894597 # Func_class: P Inorganic ion transport and metabolism # Function: Kef-type K+ transport systems, predicted NAD-binding component # Organism: Clostridium acetobutylicum # 16 222 16 216 256 61 26.0 2e-09 MSLSESTRTEIRNCLSILKVIAGVVLLAAISWEIIAGDHMHFSRNYLIIQLVVCILFLCD FFVRWAASERKGRFFARNFLVLLISIPYLNIIDWSGAELPRYWAMLIGIMPLMRAFLAFY IVVQWLVDNKVRKLFFAYIFTVVVFTYISALVFYDYEILVNSKLHGFGNALWWAWMNVTT VGAEIFPVTAIGKIFCVLLPSLGMMFFPIFTTYVLQEYSHKKESDQ >gi|313156947|gb|AENZ01000079.1| GENE 5 5103 - 5813 425 236 aa, chain - ## HITS:1 COG:no KEGG:Mpal_0765 NR:ns ## KEGG: Mpal_0765 # Name: not_defined # Def: UvrD/REP helicase # Organism: M.palustris # Pathway: not_defined # 16 212 430 635 669 85 30.0 2e-15 MGNMIDENNPYTLLAMAIYLWKKTAKTKTDTQLSIYYAGKYLSQVCYNGDGNKRDFFCPS NFESLTWRKLIMNILAGIDDLLFPFENLQGQKYKWCEWARKARTYFDKLKNDFPECSDNI 
DNAKTKRLPNGYGNEEVLLYLEAMNNINELPITTIHDAKGQTFDSVLLISEKDKHSKGGH FEQWFNSNNVEYKRLGYVAASRAKHLLIIAVPILTQEQKQHMIEIGLIENPVLALF >gi|313156947|gb|AENZ01000079.1| GENE 6 6988 - 7422 108 144 aa, chain - ## HITS:1 COG:no KEGG:Mpal_0766 NR:ns ## KEGG: Mpal_0766 # Name: not_defined # Def: SMC domain protein # Organism: M.palustris # Pathway: not_defined # 9 132 508 619 623 84 38.0 2e-15 MIKQLEQNSKNARLFTNLKTFEYDLALETGNIIPLFNLLLEKIDTNGATKRKATEYSTID WIQKERDTDKENMEAFFTEKAEAAKFLLDTIENGTFITKGEFAQLLALKMVNEEIRLIVP TYIKNALDWLVEPYINKDDEKEAK >gi|313156947|gb|AENZ01000079.1| GENE 7 9820 - 11568 2384 582 aa, chain - ## HITS:1 COG:no KEGG:BDI_3380 NR:ns ## KEGG: BDI_3380 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 582 1 551 551 496 46.0 1e-138 MKLKNILRNVTAMALPLALLGSCTGDFEELNTNPYEIDPEELPFEAQFSTPLSFSYPTHQ NLFQYWTSLSIDNYGGYFEVPHSNWTMARYDLARGFCGGMHENFMQKIFNNTRRLIKQCD AAGQHDFAAVARIVEAYNLLQYTDTYGPVPYSSVLAADELAERPSSYAYDKQEDIYKAIF AQLDKALEGLDTETAGLASFDCWCNGDRTLWKKIANQLKLRMALRIVKVNPVDAEKYAKE AIQAGVLEDKDILINKSYSNELRRMMDWLDSGIGSSIVAFMNGYNDPRRPLYFTTNVRHL VKKTAEPTGEKDQNNEDIYNESDILIRKGAQYIGVPVGCELGNKNGGNDNQRVYYSFLAG GYATPQPIMFAAEGWFLRAEAKLRWTDAGQQSVKELYEKGIEVSIRNQKSYRQSDAAAAW TEKKLTAPDWAAIDDAAISSYINDDTSSPEGYTDPWKPEYNSDPTTSITVKWDEGASNEE KLERIITQKWIANFPLSTEAWADYRRTGYPKLFRPKQNLAPSIIDTQLGPRRLLYNETEL SSNTVEVNNAIQLLKAESSEVKGDGDTGGTRLWWDRKDKGNF >gi|313156947|gb|AENZ01000079.1| GENE 8 11588 - 14461 4293 957 aa, chain - ## HITS:1 COG:no KEGG:BVU_1054 NR:ns ## KEGG: BVU_1054 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 1 957 58 1012 1012 1154 62.0 0 MDGKFRLEGLQPGDVLQVTYVGYDPYEVTYTGQTTLDILMTTTANQLNAVVVTAMGIERQ SKTLSYAAETVGGDDVADIKSVNMINALQGKAAGLQITPNSTGAGGSSKILFRGNKSING SNQPLVVVDGVPLMMNITSDQVDSNWGAQRDGGDAMSTINPDDIASISLLKGASAAALYG AVAANGAIMITTKSAMAGRLAVNVSSNTTIDTPLSLPEFQNTYGANGQYSWGDKLASKAP DYAEKFFRTGWTTNNSISINGGAEDLRAYFSYGNVTSGGITPENDYSQHTLNAKVGFDLF NDHIKVDFNAKYVNQHISNQPAGGFVFNPLVGTYTFPRGGDWNGYKSNFETYNGELNANV 
QNWVTTTDETNSNPYWLLNRERPVVERNRYEFGGSIKYQIIDGLSLTGRMRYERADEHYV RNHYASSYGNKYTYGKMDDNRYFSEQLYADLLAQYNHTWDDFSLNATLGTSMMQTRSNNV SLLYEQSKFVAPGNGGAYYPNIFNPSNFYKNGTTMGLERKRLNSVFGAVTFGFKEALFLD VTARNDWSSALAYTDGYSFFYPSVGASLLLNRFVDMGRNIDLFKFRGSYSIVGNDVPVYK TNPRYTYGDQGAINPPKSVPFRTLKPEKTHSFEVGFDGEFFQHRLHVNATYYKTNTKNQY FEVTLPWESGYKSQFVNAGNVQNQGFELTAGWFQDFGNEFTWSTDLNLSYNDNKIIELFD GIQDGVTVSNLGGAKVILYEGGQYGDLYVRTLKRDESGKLVTETPEGADYQIPVNGGEQN SDLKYMGNMNSKWNMGWNNTFRYKDLTLSMLIDFRIGGKVVSMTEATLDGYGVSERTGRA RDRGYVMREGIKFSNVKAYYDVVGATSFNSVYNVEDYVYDATNVRMREISLGYTFRNLFG QSKNLTLAFIARNLFFFYKDAPMDPDVSMGTGNGLQGFDVFNLPTTRSFGLNVKLNF >gi|313156947|gb|AENZ01000079.1| GENE 9 14804 - 14944 76 46 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MRKIYTKTFVYFAEIQYIYNEINNRNIIMDILNFTYYFLIYNLIFN >gi|313156947|gb|AENZ01000079.1| GENE 10 15081 - 15821 540 246 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291515409|emb|CBK64619.1| ## NR: gi|291515409|emb|CBK64619.1| hypothetical protein AL1_23800 [Alistipes shahii WAL 8301] hypothetical protein HMPREF9720_2067 [Alistipes sp. 
HGB5] # 1 246 1 246 246 477 100.0 1e-133 MPQKSYTLGNLLQDCLDCDIRKSDIRQFFDDLKKKAHFVYAILCGAPKSVFWITLDDEES HPDEKTFSVAEFNDNYENFDSRIDSFRVNFNAYFFREQWATKFPEWELMENPFNGEERFC ATSRKELTVEIFVTEYPHLLAYLCRRAVLENAPFELTRETIKHYGFSREDVKLLRDCIAS NNRRIIDELREEYERLKAIDCLQVVPPKRRGRKPKAEVSAEIQPIAKKPRTRISRKKFVP TRSVKS >gi|313156947|gb|AENZ01000079.1| GENE 11 17447 - 18886 1243 479 aa, chain - ## HITS:1 COG:no KEGG:BVU_2464 NR:ns ## KEGG: BVU_2464 # Name: not_defined # Def: mobilization protein # Organism: B.vulgatus # Pathway: not_defined # 1 479 1 446 446 319 40.0 2e-85 MGYVVLHIEKAAGTDAAMSGHVERRITPANVITTLTYLNEELVEFPKGVTNRTEAIQHRL DNAGLERKIGKNQVRALRVMLSGSPEDMKRIRQAGQLDAWAKDSCGWLQKTFGKENVVSA VLHLDEKTPHIHATVVPITRGERRKAKLEREKNAQSGKRTYRTKKDRPRLCADDVMARDK LKAYQTTYAEAMAKYGLQRGVEGSEAKHISTQQYYREVFVRKNEMAEQIENLKEQQKTLT VDIAALQAQQRAAQTDCNIIDEQRRKKKEELEKAETELAQTRREIKTDKLKGVAVDATTK AVERIGALFHDPKPARYEKQIADLQGVIADKDKCIGQLQQEIKTMQAGHDKEVANLKQEA QQVIKALMRVDELCPYVKGLLKWENYCKDVGLDKERTKALFTMQPYRYTGELHSIRYNHT FRANDVILQFKPDKDGPSGFQFTINGKDCDEWFRQQRKEFYERIGIDIEQTEQRRGMKM >gi|313156947|gb|AENZ01000079.1| GENE 12 19266 - 20489 1369 407 aa, chain - ## HITS:1 COG:no KEGG:BT_1138 NR:ns ## KEGG: BT_1138 # Name: not_defined # Def: transposase # Organism: B.thetaiotaomicron # Pathway: not_defined # 2 399 3 400 411 592 71.0 1e-167 MRSTFRVLFYVKKGSARPNGDLPLMCRLTVDGEVKQFSCKMDVPPRLWDVKNGRATGKSV EAMQINLAVDKIRVEVNRRYQELMQSDGYVTAAKLRDAYLGIGVKQETLLKLFEQHNVEY RKKVGFNREVATWKKYCCVCKRVREFLAHTYHREDIPLKELNLTFINDFEYFLRTEKKCR TNTVWGYMIVLKHIVAIARNDGRLPFNPFSGYINSPESVDRGYLSQEEIKAVMNYKTANK AHARIRDLFVFSIFTGLAYADVKGLTTDNLQTMFDGNLWIITRRKKTNTESRIRLLDIPK RIIEKYADQRLDNHVFYMPCNCHCNDILREIGKQCGIKNKLTFHMARHTFATTITLSQGM PIETVSCLLGHTNIKTTQIYAKITNEKISRDMSALTERLGDKYRLAE >gi|313156947|gb|AENZ01000079.1| GENE 13 20753 - 21157 215 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156948|gb|EFR56381.1| ## NR: gi|313156948|gb|EFR56381.1| hypothetical protein HMPREF9720_2070 [Alistipes sp. 
HGB5] # 1 134 125 258 258 261 100.0 1e-68 MGYNRILKPVIFSTPTNEKIDLSEFEGRYIILLFWDACSSKDQNILSDLQILYDKIKTEK DVSLYAVYVWTPNENKLAIIDEISHSFPCLYTARNGYTTTILGVTKYPTMLILNKKSEII YNGKFTLNALSCIK Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:54:11 2011 Seq name: gi|313156944|gb|AENZ01000080.1| Alistipes sp. HGB5 contig00008, whole genome shotgun sequence Length of sequence - 639 bp Number of predicted genes - 2, with homology - 2 Number of transcription units - 2, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . - CDS 3 - 204 213 ## BT_0095 conjugate transposon protein - Term 225 - 269 2.8 2 2 Tu 1 . - CDS 349 - 594 282 ## gi|313156946|gb|EFR56380.1| hypothetical protein HMPREF9720_0279 Predicted protein(s) >gi|313156944|gb|AENZ01000080.1| GENE 1 3 - 204 213 67 aa, chain - ## HITS:1 COG:no KEGG:BT_0095 NR:ns ## KEGG: BT_0095 # Name: not_defined # Def: conjugate transposon protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 67 1 67 98 81 67.0 1e-14 MKKRAIFTILSLCSALLSFAQGNGMAGISEATNMVTSYFDPLTKLIFAVAAILGLVGGVR VYSKFSS >gi|313156944|gb|AENZ01000080.1| GENE 2 349 - 594 282 81 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156946|gb|EFR56380.1| ## NR: gi|313156946|gb|EFR56380.1| hypothetical protein HMPREF9720_0279 [Alistipes sp. HGB5] # 1 81 1 81 81 132 100.0 9e-30 MQGWNTLVVRPSIGEDGELAEPKAPAGARPPLPALPARAELRDAEVHAMLGELDFGESAV MGDATAEERQAIANFDIRAFA Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:54:20 2011 Seq name: gi|313156942|gb|AENZ01000081.1| Alistipes sp. HGB5 contig00039, whole genome shotgun sequence Length of sequence - 1030 bp Number of predicted genes - 2, with homology - 1 Number of transcription units - 1, operones - 1 average op.length - 2.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . - CDS 1 - 220 150 ## 2 1 Op 2 . 
- CDS 226 - 672 90 ## COG0338 Site-specific DNA methylase - Prom 799 - 858 7.3 Predicted protein(s) >gi|313156942|gb|AENZ01000081.1| GENE 1 1 - 220 150 73 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MKEETKQNILNSLVSGKSLTTVLSAKAKEFMFESVQNELVTKYETDGWEVYKRYKTSIRM QRRKPMDMAFEDD >gi|313156942|gb|AENZ01000081.1| GENE 2 226 - 672 90 148 aa, chain - ## HITS:1 COG:PH1032 KEGG:ns NR:ns ## COG: PH1032 COG0338 # Protein_GI_number: 14590870 # Func_class: L Replication, recombination and repair # Function: Site-specific DNA methylase # Organism: Pyrococcus horikoshii # 2 146 145 292 330 76 32.0 2e-14 MNLKGEYNVPYGFREKEFLNEKTLLDVSNALQNAILRYGDFDLVRQNIKPNDLVFLDPPY TVSHNNNGFIKYNQKIFSLEDQIRLNSLIQYIKKIGAYYILTNAAHETIKEIFNNGDWCI ELNRASLIGGINAKRGQTKEYIFTNLIK Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:54:29 2011 Seq name: gi|313156941|gb|AENZ01000082.1| Alistipes sp. HGB5 contig00004, whole genome shotgun sequence Length of sequence - 1436 bp Number of predicted genes - 0 Number of transcription units - 0, operones - 0 average op.length - 0 N Tu/Op Conserved S Start End Score pairs(N/Pv) - SSU_RRNA 36 - 1436 99.0 # AJ518874 [D:1..1465] # 16S ribosomal RNA # Alistipes finegoldii # Bacteria; Bacteroidetes; Bacteroidia; Bacteroidales; Rikenellaceae; Alistipes. Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:54:34 2011 Seq name: gi|313156931|gb|AENZ01000083.1| Alistipes sp. HGB5 contig00035, whole genome shotgun sequence Length of sequence - 20408 bp Number of predicted genes - 11, with homology - 10 Number of transcription units - 4, operones - 4 average op.length - 2.8 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Op 1 . + CDS 36 - 3233 4913 ## BVU_2578 hypothetical protein 2 1 Op 2 . + CDS 3259 - 4875 2209 ## HMPREF0659_A5075 hypothetical protein 3 1 Op 3 . + CDS 4913 - 7366 3325 ## gi|313156934|gb|EFR56371.1| putative lipoprotein + Term 7399 - 7439 11.0 4 2 Op 1 . 
+ CDS 7445 - 9472 2499 ## COG1404 Subtilisin-like serine proteases 5 2 Op 2 . + CDS 9491 - 9958 639 ## gi|313157370|gb|EFR56793.1| putative lipoprotein + Term 9994 - 10033 9.1 + Prom 10037 - 10096 3.2 6 3 Op 1 . + CDS 10312 - 13605 4137 ## BT_3239 hypothetical protein 7 3 Op 2 . + CDS 13621 - 15192 2029 ## BT_3238 hypothetical protein 8 3 Op 3 . + CDS 15230 - 16123 1182 ## BT_3237 hypothetical protein 9 3 Op 4 . + CDS 16134 - 18059 1840 ## BT_3243 hypothetical protein + Term 18085 - 18125 7.2 + TRNA 18263 - 18338 79.4 # Phe GAA 0 0 + Prom 18264 - 18323 78.7 10 4 Op 1 . + CDS 18451 - 19653 476 ## BF3455 tyrosine type site-specific recombinase + Prom 19680 - 19739 5.8 11 4 Op 2 . + CDS 19781 - 20335 508 ## Predicted protein(s) >gi|313156931|gb|AENZ01000083.1| GENE 1 36 - 3233 4913 1065 aa, chain + ## HITS:1 COG:no KEGG:BVU_2578 NR:ns ## KEGG: BVU_2578 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 16 1065 16 1053 1053 949 48.0 0 MVRKIVLSLIAVFVFLAYATAQNRQISGTVSDANGHPVAGATVIVDGTSLGTTTNTAGEY TLSAPVNGTLVVTFVGFEPQQLPIAGKTRINVTMKEDAQAIDDVIVVAFGTAKKEAFTGS AAVIKSDEIAKVQTSNVATALVGRVAGVQTSSTSGDLGKTPSIRVRGFGSINAGKEPLWI VDGMPYEGDLNNLNTNDIESMTVLKDAASNALYGARGANGVIMVTTKKAKSGDAVVTIDA KWGVNSKALEEYDVITSPAQYYETHFKALYGYYAQTNPAAKAYALASSGLTSNGTGGLGY NVYTVPEGQALIGTNGKLNPNATLGRKIIYNGQEYWLTPDDWIDEAYQSAFRQEYNVNIS GATERSSFYASLGYLDNTGIIKSSALERYTARLKADYQAKKWLKVGGNMSYAHFSNSNGN SNEGSASSTANIFAFSAQMPPIYPVYIRDGSGRIMVDDNGYQMYDYGDKGNAGLTRPLLP GANGLQTSWLNKKKAEGNAFSGSGFVDISLYKGLKLTVNGSTNIDETRTTYLNNQYYGQF AEAGGTISKYHTRDIAYNLQQILNYNETFGKHNVGLMVGHEYYQKKYYYLSGTKSKLFSY DNEELGGAVVDGAGAHSYIDDYNSEGYFMRAQYDYAGRYFVSGSYRRDASSRFHPDHRWG NFWSVGAAWLLNQENWFDAPWVNMLKLKASYGSQGNDNIGNYLYTDTYSIENNNGEIAVL FGQKGNPNITWETNTNLNIGTEFGFWNNRLSGSVDFFNRKTSDMLFAFSVPSSLGYSSYY ANVGDMVNRGVEVELNADLIRTKNVLWSFNLNLTHVKNEVTYLAPEHKSTTVEGYKGYID GSYFVGEGLPLYTYYLRSYAGVDPETGASLWYKDVKGDDGKITRTKTSDYTSATRYLHDS AIPSVYGGFSTSVSAYGVDFSISFNYQIGGKVYDSGYASFMSSPYGTTVGTNYHKDILKA 
WTPENKGSDIPRLQYGDQYTTSVSDRFLTDASYLNISNINVGYTLPSKITQKFGVQKLRV YLACDNVVYWSKRQGLDPRYSFTGATNFSNYSPIRTISGGVTVQF >gi|313156931|gb|AENZ01000083.1| GENE 2 3259 - 4875 2209 538 aa, chain + ## HITS:1 COG:no KEGG:HMPREF0659_A5075 NR:ns ## KEGG: HMPREF0659_A5075 # Name: not_defined # Def: hypothetical protein # Organism: P.melaninogenica # Pathway: not_defined # 40 529 2 534 545 260 35.0 9e-68 MKNILKTTTLALVCSMALTGCIKETFPTDSATSGQVSDSPTAVEAIVNAIPVNMTYPYSV YGRRNNYGFDFGYPGMMCATDAATGDVICSSGDEGSGYDWFTYWQSGILIGPTQTVCRFP WNCYYTFIKSSNDVVGLLAGVELSEAQKSYLGFAKAYRAQLYLDLARLYDPLNNKYTDVS GVKGLTVPIVDENTTEDMARNNPRVTREAMFEFILRDLDDAEALLANYTSSSKTNPSLAV VYGIKARAYLWLGGFDSSKYADAATYARKAIDASKCTIMTEAEWLNPKTGFNKANNSWMW YLPQSTEGIANLVNFAAWRSSEATWGYGGKYVFEGVTSNFYNEISDTDWRKRAFVGPDAK YADYSDITNLTAAQFANLPAYANLKFHPAEGETATYSVGNVTDIPMMRVEEMYLIEAEAT AHADASAGLQLLKTFMAKRDPEFSTSAVSSADVVEEIIWNKRIELWGEGVVFYDFKRLNY SLKTGYVGTNVPSDCRFSTDGRAPWWNFCIPETETQQNTALESQNNPNPVGIIDPWVQ >gi|313156931|gb|AENZ01000083.1| GENE 3 4913 - 7366 3325 817 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156934|gb|EFR56371.1| ## NR: gi|313156934|gb|EFR56371.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 817 1 817 817 1573 100.0 0 MKAIKYFLYALIATFVFAGCSDDPTYTRGESEADGCYGVYFPSQDNAADLELDPADPTVL TFTAMRTNDADAITVPVTVTGSEDAIFSASEISFDDGATETTFQVSFPDAEIGTTYSCNI QIEDKKYAFIYGEKASGVSFSVTRVKWNLVTGPKGETKGKWRDDILSSAYGIPNRYGEGE VEIYERDDNPGYYRISNVYSAEYLASLLNMSPSEVSGNRSDVITYIDATNPDKVWLPEQS TGVFLNSGDGIVSFASQVPENGFNGSGYGTNVNGVITFPAKSVLLMFGDDGWYVGNAGGM QRLMLPGAEEYDFSLALTDSEPADGKVEIAAKLGADVAKVKYAFFEGVFGDAIAKANSAG IDAGTVESKEITADGTIVAQFEETGKYTVVANIYDEAGELQGYEFLSFGYVKAGDDKPVV LSVRTELTWEFEAQGHTPENSIRSIIFGENIESGYMGLFKSADLTGKSSEDLIEIAKTSG KAITAEEIDKINDTGLSLLHKGLNAGTDYVLLVWAYNGYYGKLYSVQQTTAGKPDPLQIH YTYDDVENGLTKAGYCGTWDYYAVDAYDKTGNTSRQYFGQVVVEDGGTTSGYADFDNVFS DYVKVSGLTGAPQAFNAGESQYMAWSGGLLYVLAPQLIGNVTLQGTDYFVNVAYTAEGDD GVYNGRGLLLGAPVADGLVAFVPNPQLQASNNLTFTGMWFGIYADYDPATNKFSGGEGGI VFHEWLLLADPDLYPAADMASVVKALSVQPGNYVELRGPELAKVIVGENRMPADRGRECE IVGMPVLGAAKAKVEFRQGIAARSADFTRRTGAKITE >gi|313156931|gb|AENZ01000083.1| GENE 4 7445 - 9472 2499 675 aa, chain + ## HITS:1 COG:alr1615_2 KEGG:ns NR:ns ## COG: alr1615_2 COG1404 # Protein_GI_number: 17229107 # Func_class: O Posttranslational modification, protein turnover, chaperones # Function: Subtilisin-like serine proteases # Organism: Nostoc sp. 
PCC 7120 # 202 504 37 297 416 144 37.0 7e-34 MTDKKILLLALLFLAGCASDPMEESGGKEAPAAAMRKIVNAPANAARGELLIYFDGDAVG DVEQTAVAAAITRTAVTRSGIAPVDDIFTQLGVTSLRRVFPCNPVAEERTRAAGLHKWYI VTFGEEVDLDAAARRLAAVSEVSFVQFNTKLQLASDNRACPYRDGSAATRAAAGGFNDPG YKDQWHYSNNGDRIFAETTRAGADINVEEAWKLAAGDPSLTVAIVDQGIKYSHPDLAANM WINKAEQSGATGRDDDGNGYADDVYGYNFALGTSRLTWDVEAYDDKGKNVGDSGHGTHVA GTVAAVSNNGVGVSGIAGGTGRNDGVKLMSCQIFSGGEGGSAAVSAEAIKYAADNGASIL QCSWGYPAGAVTTDNAYASGARIEKQAIDYFIATKNNAVLDGGLVIFAAGNDAKAMSGYP GAYRDYISVTAFSPDYLPAYYTNYGPGCNVAAPGGDAYISPSGSSAAQVLSTLPSELYQS DYGYMQGTSMACPHVSGVAALGLSYALAKGKQYTVAEFKSMLLTSVNDIDTYLDGTKQSL TTMQLRNYRKQMGTGAIAYQLLMQIEGTPCLKAKVGVQQQLSLDKYFGKASANLTYLGVE ISQADRTKLGISAEPTFVSGKLRLHCTKPGVARITVRAVAGGSGLGTGTSMGGMEIAKEF AVIARAVQTENGGWL >gi|313156931|gb|AENZ01000083.1| GENE 5 9491 - 9958 639 155 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313157370|gb|EFR56793.1| ## NR: gi|313157370|gb|EFR56793.1| putative lipoprotein [Alistipes sp. HGB5] # 1 149 4 152 158 221 71.0 2e-56 MRKTIYIVLAALGLCLFAGCGSSDKEKVSPLVKQLAGEWQLKTWNGEAPRDFDAYVSFGA DRSFEIYQRIEQVGYQKYSGKYEIRNGMLGGVYSDGKPFGSTYDITFDESGNTLTLTSAT GVAETGVYVRAAIPDGVKNGAAVMKSSRTVPIRLL >gi|313156931|gb|AENZ01000083.1| GENE 6 10312 - 13605 4137 1097 aa, chain + ## HITS:1 COG:no KEGG:BT_3239 NR:ns ## KEGG: BT_3239 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 50 1097 1 1059 1059 1365 64.0 0 MVKKILLSLIGVLVFAAGAFAQAKQVRGSVVDENGAAVIGASVIVKGTTIGVSTDQQGFF VMNNVPAKSEELQISYLGYKTATVAVAPEVKVTLEPDSQTVETVVVTGMTKTDKRLFTGA SDQLVADDVKLSGLADISRGLEGRSAGVSVQNVSGTFGTAPKIRVRGATSIHGSSKPLWV VDGVIMEDVADISAEDLSSGDATTLIGSAIAGLNAEDIESFNILKDGSATSIYGARAMAG VIVITTKKGRAGVSHINYTGEFTYRMIPSYADFNIMNSQDQMSVYQELYNKGWLGNSRVT NASASGVYGKMYSLINAGLLENTVETRNRYLRKAEYRNTDWFKELFQNTVMHSHSVSITS GTEKSAYYASISALFDPGWTKTSEISRYTANLNATYNISDALTLNLITNGSYRKQKAPGT LGATTDYVTGEVKRDFDINPYSYALNTSRVLDPKEYYVRNYADFNILHELANNYIDLNVA DVKFQGELKWKAFKGFEASALASVRYTGTSQEHNVREASNQANAYRAMGTTTIRDNNPLL 
YKDPDDIYAVPVSILPAGGIYTRTDNRLLSYDFRFSATYNTTINNTHIINAYLGMETNQS DRQKTWFRGWGLQYDSGEIPFYDYLVFKKGKEDNSPYYEMGNTHYRNVAFFFNGTYSYKG RYTLNGTLRYEGNNALGLTTRSRWLPTWNVSGAWNVHEENFFNSISKAVSHLSLKASYSL TADRPSVTNALAVIQSYSPWRPSAGVTESGLTIKDLENSDLTYEKKHELNIGADLGFLDN RINLTVDWYKRDNFDLIGRVMTQGIGGQIQKYGNVATMKSRGIEFSLSTKNIKTKDFSWT TDFIYSHMKNEVTKFNTSTRAFNLVSGIGFSREGYPVNSLFSYDFRGLNEDGVPTFINEA GVLTSTDINFQETERFDNLVYSGPSEPTDVGSFGNIFQYKGLRLNVFITYSFGNVVRLDP VFSSIYNDLDAMPREFRNRWMTPGDERYTDVPAIVSSRQVRNDSKLSYAYNAYNYSTARI AKGDFIRMKEISLTYDFPKRWIKKLRMSNLALKIQATNLFLIYADKKLNGQDPEFFNSGG VATPVPKQFTLMLKIGL >gi|313156931|gb|AENZ01000083.1| GENE 7 13621 - 15192 2029 523 aa, chain + ## HITS:1 COG:no KEGG:BT_3238 NR:ns ## KEGG: BT_3238 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 8 523 4 517 519 389 43.0 1e-106 MKTNKLIYFLLAGLLVAGLSSCNDFLDEQPDNRTIIDSEEKAIKMLVSAYADSQPWLFLE MASDNTDNHGANYDGWSRFWQESYEWSDPTETDNEDAAHTWEAHYMAIANANEVLKAIDK MEMTDKLSAAKGEALICRAYAHFVLVNVFCQHYDPAHPDDLGIPYMEKAETELDPKYERG TVAEVYAKIEKDIEEGLPLINDVIYSVPKYHFNKRAANAFAARFYLYYQKWDEAVKYATV ALTNSPAGSLRNYEELLEYPMNNGATLNTAATYYVSADLTCNFMMATAYSYAGRVFGGYG VGMQYNHGQMIADNEGIDCTNTPFAPYTWKCEPLRFVSGIDKILLPRVPYLIEYLDPVAG TGYRHTVYPLFTAEETLLTRAEAYIMQKDYSAALSDMNLYLSNSCSKYKTLTEANISSWA ASVADYDPKNPTPKKPLNPAFEIDRTQEGMIHTLLLLRRYETLHYGLRWFDVKRFGIEIY RRTLTSNDLGIKSVDDTLEVRDNRRAVQLPADVLAADMTPNPR >gi|313156931|gb|AENZ01000083.1| GENE 8 15230 - 16123 1182 297 aa, chain + ## HITS:1 COG:no KEGG:BT_3237 NR:ns ## KEGG: BT_3237 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 293 1 289 292 255 46.0 2e-66 MKKTILGLFAFVCAFSGSSCSEDDLSGTSVIKPEQTTETPLDAWLYKNYIEPYNIEFRYR YEDMESDMIYDLTPANYEKSVQMAKLVKHLCLQAYDEVTGSRDFITSYFPKMVFLVGSPA YNNNGEVVLGTAEGGTKITLYAVNNMDPTNVDLLNEWYFKTIHHEFAHILNQKKPFSTDF NQITGLATGIRYVGNACWDVYPSEDLALKDGFISRYASTSAEEEFVEVSSIYVTNTAATW EEMLETAGEVGRPMLEAKFEIVDKYMKNDWGIDLDELRKVVLRRQKELPNLDLDATN >gi|313156931|gb|AENZ01000083.1| GENE 9 16134 - 18059 1840 641 aa, 
chain + ## HITS:1 COG:no KEGG:BT_3243 NR:ns ## KEGG: BT_3243 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 268 3 252 443 134 32.0 2e-29 MFMKKILILLALPLLFNSCLKDDEDKFSKSATERIEEAVKEAITVLQGAENGWRMELYPE SERIYGGYTLFLKFNADNTVVASSENFAAGKTESSYYSVVAESGPVLAFDTNNDIIHFYS SPTTGAELGIGTSNGGLEGDSDFIVMEASADFVKLKGRKTNNYAYLYPIEAGVNWKTELQ SYQDAAAKMDLIYTRCVVDGVAYPIDWEMRLNNFSSRVFRISYTPAPGDDGSVSAGEVVK APFVFTETGLKFYSPLTIGNATVSEMTFKEDYYFENEDGSVKIYSPKPVRSNNKLTINPA DITFSSATVNVTPSVATDYYYFDVYEKADLEGESDMAIIKSLISEMNSLVGTYTADFIVS ALGKKGAASEIFEDLSSETDHVVIGFGIVATENVVLATTDLFRKEFTTEKAPELDEAYAA WLGTWTVTSTTSMKSAKPISFDVTFSTKVANTNFALTGWAISAYRDRFPAIATFDKETGY IMIQSYQEIGATGDGTVRYVALCRDKNVSGKYYYPVGGSYVGLIGAIMSDGKGKVIGNEL TLTGDVEAEVVMMDQFVYNGSNYLGKYNPTADSGFTADDYPVGPFTLVKKSPAAAPVKGK AMLAGKFAAAADRNMKPAVPQLTSAPSFERRAVNANAVMLK >gi|313156931|gb|AENZ01000083.1| GENE 10 18451 - 19653 476 400 aa, chain + ## HITS:1 COG:no KEGG:BF3455 NR:ns ## KEGG: BF3455 # Name: not_defined # Def: tyrosine type site-specific recombinase # Organism: B.fragilis # Pathway: not_defined # 1 400 1 400 400 466 57.0 1e-130 MDAGVNILCYKSKTLANGAHPLMIRVCKDGKKKYVSLGVSVLPQFWDFTKNQPKKNCPNK AYIEKIIADKSSEFAERIIELKAEKKEFTATTLTEHLTDGTRAKRTVGEVFLSQIENLKQ TGRTGYALSHREVYNSLLKFNGHLNIYFSEIDTVWLKRYETWLRGQGFSENTIGRRFRTL RAVYNVAIEEKCVKADYYPFKSYKVSKLHQATAKRAISKADIMRIIEYRCDDFYKQFAVD LFAFGYFMGGINFVDIAYLKMDNIVDGRLIYTRRKTHKLIRLPLQPKAQEIIERYRQDGA LFLFPILSAFHKTEQQQRNRVHKVISKVNERLKEVGKELEIPIDLTTYVSRHSFATVLKR EGVSTSIICESLGHSSEKVTQIYLDSFENSQIDAAMANLL >gi|313156931|gb|AENZ01000083.1| GENE 11 19781 - 20335 508 184 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MEIPNYYKDTCVAYNRAYNVISLEEYKAKMLADVPETKNLIIWRRLIPEGCFSNALFICD FNGLLRRYGERCNYCEGLAISKIALDSDKEPTVLAVHHSFIYSNKYECYVDCTPGNKYHP NLYLYLLSDELKEDALEEYLESLMSEDENGDDVTTPKFIHDDLPLGVDVPKNWLLSNEKH RDKE Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:56:41 2011 Seq name: gi|313156882|gb|AENZ01000084.1| Alistipes sp. 
HGB5 contig00038, whole genome shotgun sequence Length of sequence - 59622 bp Number of predicted genes - 46, with homology - 45 Number of transcription units - 15, operones - 10 average op.length - 4.1 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 1776 - 1835 2.5 2 2 Tu 1 . + CDS 1855 - 2127 420 ## + Term 2191 - 2224 4.1 - Term 2178 - 2211 4.1 3 3 Op 1 . - CDS 2244 - 3566 2095 ## BF0682 putative outer membrane protein TolC 4 3 Op 2 27/0.000 - CDS 3563 - 6604 4715 ## COG0841 Cation/multidrug efflux pump 5 3 Op 3 . - CDS 6617 - 7699 1511 ## COG0845 Membrane-fusion protein + Prom 8111 - 8170 1.9 6 4 Op 1 6/0.000 + CDS 8237 - 8791 759 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog + Term 8817 - 8860 2.0 + Prom 8800 - 8859 5.2 7 4 Op 2 . + CDS 8933 - 9916 1247 ## COG3712 Fe2+-dicitrate sensor, membrane component + Term 9953 - 9982 -0.9 8 5 Op 1 . + CDS 10023 - 13583 5144 ## Dfer_1802 TonB-dependent receptor plug 9 5 Op 2 . + CDS 13597 - 15057 2544 ## Cpin_3634 RagB/SusD domain protein 10 5 Op 3 . + CDS 15070 - 16491 1885 ## COG0584 Glycerophosphoryl diester phosphodiesterase 11 5 Op 4 . + CDS 16520 - 18373 2532 ## COG3291 FOG: PKD repeat 12 5 Op 5 . + CDS 18448 - 20469 2889 ## COG3568 Metal-dependent hydrolase + Term 20477 - 20516 9.1 13 6 Op 1 . + CDS 20521 - 21435 1209 ## COG0584 Glycerophosphoryl diester phosphodiesterase 14 6 Op 2 . + CDS 21445 - 22329 1302 ## COG3568 Metal-dependent hydrolase 15 6 Op 3 . + CDS 22366 - 23739 2332 ## COG2271 Sugar phosphate permease + Term 23764 - 23805 9.6 + Prom 23846 - 23905 1.7 16 7 Tu 1 . + CDS 23989 - 24519 869 ## Bache_2563 hypothetical protein + Term 24539 - 24577 10.5 - TRNA 24729 - 24806 87.6 # Val GAC 0 0 - TRNA 24853 - 24930 89.1 # Val TAC 0 0 - TRNA 25067 - 25140 83.5 # Asn GTT 0 0 + Prom 25154 - 25213 5.1 17 8 Op 1 . 
+ CDS 25313 - 26341 1218 ## Odosp_0261 hypothetical protein 18 8 Op 2 5/0.000 + CDS 26400 - 27284 1161 ## COG0543 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases 19 8 Op 3 . + CDS 27342 - 28727 1890 ## COG0493 NADPH-dependent glutamate synthase beta chain and related oxidoreductases 20 8 Op 4 . + CDS 28748 - 29314 983 ## Amuc_1202 hypothetical protein + Term 29343 - 29379 4.6 + Prom 29372 - 29431 1.9 21 9 Tu 1 . + CDS 29451 - 30299 1180 ## COG4632 Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase - Term 30590 - 30629 7.8 22 10 Op 1 . - CDS 30646 - 31671 891 ## COG0859 ADP-heptose:LPS heptosyltransferase 23 10 Op 2 . - CDS 31655 - 32260 968 ## Bache_0540 hypothetical protein - Prom 32293 - 32352 4.6 + Prom 32249 - 32308 4.5 24 11 Op 1 . + CDS 32336 - 33343 1546 ## COG1705 Muramidase (flagellum-specific) 25 11 Op 2 . + CDS 33359 - 34045 1308 ## Fluta_3562 hypothetical protein 26 11 Op 3 . + CDS 34059 - 36194 3488 ## COG1506 Dipeptidyl aminopeptidases/acylaminoacyl-peptidases 27 11 Op 4 . 
+ CDS 36206 - 36703 686 ## COG0350 Methylated DNA-protein cysteine methyltransferase + Term 36832 - 36875 11.0 - Term 36820 - 36862 11.6 28 12 Op 1 22/0.000 - CDS 36880 - 38307 2183 ## COG1007 NADH:ubiquinone oxidoreductase subunit 2 (chain N) 29 12 Op 2 30/0.000 - CDS 38314 - 39798 2122 ## COG1008 NADH:ubiquinone oxidoreductase subunit 4 (chain M) 30 12 Op 3 26/0.000 - CDS 39811 - 41703 2716 ## COG1009 NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit 31 12 Op 4 30/0.000 - CDS 41726 - 42034 588 ## COG0713 NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) 32 12 Op 5 28/0.000 - CDS 42047 - 42556 910 ## COG0839 NADH:ubiquinone oxidoreductase subunit 6 (chain J) 33 12 Op 6 31/0.000 - CDS 42560 - 43000 700 ## COG1143 Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) 34 12 Op 7 8/0.000 - CDS 43003 - 44079 1828 ## COG1005 NADH:ubiquinone oxidoreductase subunit 1 (chain H) - Term 44216 - 44243 -0.1 35 12 Op 8 9/0.000 - CDS 44259 - 45878 2441 ## COG0649 NADH:ubiquinone oxidoreductase 49 kD subunit 7 36 12 Op 9 30/0.000 - CDS 45875 - 46498 439 ## PROTEIN SUPPORTED gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B 37 12 Op 10 . - CDS 46522 - 46872 224 ## PROTEIN SUPPORTED gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A - Term 46904 - 46958 10.6 38 13 Tu 1 . - CDS 46984 - 47349 441 ## gi|313156904|gb|EFR56342.1| hypothetical protein HMPREF9720_1864 - Prom 47424 - 47483 5.3 - Term 47502 - 47543 10.2 39 14 Op 1 . - CDS 47569 - 48933 1301 ## BT_3988 putative peptidoglycan bound protein 40 14 Op 2 . - CDS 49003 - 50520 1988 ## BT_3987 endo-beta-N-acetylglucosaminidase F1 41 14 Op 3 . - CDS 50540 - 51715 1711 ## BT_3986 putative patatin-like protein 42 14 Op 4 . - CDS 51728 - 52777 1485 ## BT_3985 hypothetical protein 43 14 Op 5 . - CDS 52797 - 54386 1933 ## HMPREF9137_0327 putative lipoprotein 44 14 Op 6 . 
- CDS 54402 - 57809 5077 ## Bacsa_0825 TonB-dependent receptor plug - Prom 57902 - 57961 5.1 45 15 Op 1 . - CDS 57975 - 58916 1005 ## COG3712 Fe2+-dicitrate sensor, membrane component 46 15 Op 2 . - CDS 59014 - 59550 596 ## Bache_2470 RNA polymerase, sigma-24 subunit, ECF subfamily Predicted protein(s) >gi|313156882|gb|AENZ01000084.1| GENE 1 2 - 1286 1807 428 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_2210 NR:ns ## KEGG: Bacsa_2210 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: B.salanitronis # Pathway: not_defined # 30 428 13 410 781 546 66.0 1e-154 MMSLGLRIICAFAALLACTRAAAGTPESGYPVNGRVIDRLTRRPVAYAAVVPVGQEQQGV STDSAGHFTLRRVRPGIHRLSASSVGYKSVVTPEYIVSAATPFIEIELEEDATQLEAVTV VPSPFRATAESPVGLQVIGLREIEKSPGANRDVSRIVRSYPGVSFSPVGYRNDLIVRGGG PSENKFYMDGIEIPNINHFATQGATGGPVSIVNADLIREISFYTGAFPADRSGALSSVLD FRLRDGNPEKQTFKATLGASEVGLSGSGHIGGKTSYLFSLRQSYLQLLFKMIGLPFLPNY IDGQFKVKTRFSERDELTVLGLAGFDNMKLNLDEEGEDAEYLLSYLPRIRQETFTVGASW RHYAGRHVQTVTLSHTYLNNRNLKYRDNDDSSEENLTLRLRSVEQKTTLRAENRSYLGRW TVREGAEL >gi|313156882|gb|AENZ01000084.1| GENE 2 1855 - 2127 420 90 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MRKQHYTAEIPVSNPYNIAAFEDLVSNIQAGYNEEKRLKNAAFEFMIDRGIYLEFHEWLH APEHKEHGSDVMGRLRMYMTGNRLTKIFSR >gi|313156882|gb|AENZ01000084.1| GENE 3 2244 - 3566 2095 440 aa, chain - ## HITS:1 COG:no KEGG:BF0682 NR:ns ## KEGG: BF0682 # Name: not_defined # Def: putative outer membrane protein TolC # Organism: B.fragilis # Pathway: not_defined # 1 440 1 441 441 492 58.0 1e-137 MKRLILTVLAAGAVGAAAAQECGEPMTLQRCLELGLEQNYDVRIIRNEQRISDNDATAAN AGMLPSVDLTAGYSGALDNERTTPREGDAVTENGIYDQTFDAGVAVNWTLFDGFRIRTNY KRLQELQQMGALRTRITIEDFVATLTAEYYNYVQQTLRLSNFRYAVQLSRERLRITEERF KIGSFSRLDLLQARVDFNADSSKYMSQHELVTASRIRINELLANDDLNGRLSICDSLIEV NSTLEWDRLEAETRAANASLLLAGHDNTLAELDLKSIRSRFYPYLNLTAGYGYTYNRFGN GATQSRGTLGLNAGVKLGFTIFDGNRRREQRNARITIENTELTRQQLEQSLMANLSNFWQ AYRNNLEVIQLETENLIAARENYEIAMERYLLGDLPGIEMREAQKSLLDAEERILTAQYN TKLCEISLQQISGNIGVYLQ >gi|313156882|gb|AENZ01000084.1| GENE 4 3563 - 6604 4715 1013 aa, chain - ## 
HITS:1 COG:VC0914 KEGG:ns NR:ns ## COG: VC0914 COG0841 # Protein_GI_number: 15640930 # Func_class: V Defense mechanisms # Function: Cation/multidrug efflux pump # Organism: Vibrio cholerae # 1 1001 1 1009 1036 659 37.0 0 MNISELSIRRPVLSTVLTIIILLFGFIGYSYLGVREFPSVDNPIISVTCSYPGANADVIE NQITEPLEQNINGIPGIRSMSSVSSQGQSRITVEFELSVDLETAANDVRDKVSRAQRYLP RDCDPPTVTKADADATPILMVTIQSDKRSLLELSEIADLTVKEQLQTISDVSGVQIWGEK RYSMRLWLDPAKMAGYGITPADVKSALDRENVELPSGSIEGNTTELTIRTLGLMHTSQEF NDLVVREDADRIIRLSDVGRAELGPQDTRSYMKMNGIPMVGIVVVPQPGANHIEIADEVY RRMESMKKDLPEDVVTDYGFDNTRFIRASIDEVKQTVYEAFLLVIIIIFLFLRNWRVTLI PCIVIPVSLIGTFFLMYVAGFSINVLSMLAVVLSVGLVVDDAIVMTENIYIRIERGMSPK EAGIEGAKEIFFAVISTSITLIAVFFPIVFMEGMTGRLFREFSLVISGAVVISTFAALTI TPMLATKLLVRQEKQSWFYNKTEPFFVWLNNFYSRTLAGFLRRRWIALPLVALMIGLIGW LWGSIPAEMAPLEDRSQITINTRGSEGATYEYLRDYTEGINRLVDSLVPEARAVTARVSS GRGNVQIALKDIATRERTQMEIAEEVSAAVRTRTKARAFVQQQSTFGGRRGSMPVQYVLQ ATNIERLQEVLPEFMRKVYESPVFQMADVDLKFSKPEARIAINRDKANLLGVSTRNIAQT LQYGLSGQRLGYFYMNGKQYEILAEINRQQRNKPADLKSIYVRSDKGEMIQLDNLITLEE NVAPPQLYHYNRFLSATVSAGLAKGKTIGQGLDEMDRIAEQTLGDDFRTALAGDSKEFRE SSSSLIFAFLLAIVLIYLILAAQFESFKDPLVIMLTVPLAIAGALVFMAANGQTMNIFSQ IGIIMLIGLVAKNGILIVEFANQKQDAGLNKREAIEQASLQRLRPILMTSASTILGLLPL TMASGEGAQGRIAMGVAVVGGMVVSTLLTLYIVPAIYSYVSTDRIRKTKKNAK >gi|313156882|gb|AENZ01000084.1| GENE 5 6617 - 7699 1511 360 aa, chain - ## HITS:1 COG:VC0165 KEGG:ns NR:ns ## COG: VC0165 COG0845 # Protein_GI_number: 15640195 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Membrane-fusion protein # Organism: Vibrio cholerae # 43 356 35 353 368 139 30.0 8e-33 MNKRKKWIVTAVILVVFCAVLGVYRHFGPGSSASDADPAAVSAPRKKSVLNVNGLVVRPQ SLADGITMVGNLLPDEEVDLSFETSGKIVEINFKEGSIVRKGELLAKVNDKPLLAQLSRY EAQLKLAEDRVYRQSALLEKDAVSQEAYEQASTELAMLNADIDIVKANIALTELRAPFDG IIGLRNVSEGAYASPSVVVAKLTKISPLKIDLFVPERYASQIRPGTPLSFTVEGRNETFR AEVYAQESKVDIATRTLAVRALYPNTRGTLLPGRFVTVKIRLHDIPDAIAVPTEAIVPEM GVDKVYLYKGGKAQAVEVKTGLRTESSIQIIEGLNVGDTLITSGTLQLRTDLPVKLDVVE >gi|313156882|gb|AENZ01000084.1| GENE 6 8237 - 8791 759 184 aa, 
chain + ## HITS:1 COG:all2193 KEGG:ns NR:ns ## COG: all2193 COG1595 # Protein_GI_number: 17229685 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Nostoc sp. PCC 7120 # 6 177 18 192 201 68 26.0 8e-12 MDKQPDELILQSLRQGSYDAFDALYMRYAPHVEAFAFCMLKNRSEAEDLAHDIFLKIWET RESIGRIKSFRSYLFRMTKNAVFDIFEHKSVQTRYEQRLLHVEDLLTDDISTKVATEDLL MIIDMAVEQMPEQRQRVFRLSRYEGLSHQQIAQKLGVTPKTVEYHIRTALAELKKIIGVI AFFF >gi|313156882|gb|AENZ01000084.1| GENE 7 8933 - 9916 1247 327 aa, chain + ## HITS:1 COG:PA1301 KEGG:ns NR:ns ## COG: PA1301 COG3712 # Protein_GI_number: 15596498 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 16 296 17 295 327 61 26.0 3e-09 MIKKIIHTFLHNPMPEQVQQQFRTWCLNKERAGEKTDALWKEWERLDPASVLPADEKAYR RKLDRLHREMTPPAAPKRGLFSITRRTAAAAAAIVLLAMSVEFFVVKRLSADTTTWLVTA ENSKGRFTLPDGSVVWLNADSRLAYSDRFTASGSRAVRLEGEAFFDVKRDTLRPFEVEMG KLRVKVLGTRFTASHMPAFNTEEVTLLSGKVEVSGYRADQSVVLTPDQSCSYDAGSGAVA VRNVAASNYCSWTGNSIIFDNMTLADIAVNLEHWYNIRFRIDERIDTSIRISFTLRPETL EETLKIIETLTHLRCRQIDKHYVIIHK >gi|313156882|gb|AENZ01000084.1| GENE 8 10023 - 13583 5144 1186 aa, chain + ## HITS:1 COG:no KEGG:Dfer_1802 NR:ns ## KEGG: Dfer_1802 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: D.fermentans # Pathway: not_defined # 130 1186 48 1119 1119 934 48.0 0 MKKDERLNFSDRTPSVRRLVVSLVLMLLAALPAAAQNKKITVDLDNVPVREFIKTVESQS GYTFAYNNSEIDLTRRVSVKAADENVVDVVIRALSAQNLTARMEGSRIVVSRKPAAARVQ TAQPVRGGVVTGTVKTISGEPVIGASVIVLETNRGNVTGLEGDFSVEATPGQTLSVSFLG YNTQQIKVGSQTSFDITLTEDSKQISEVLVVGYTPMRKSDFTGSIASVKASELSATTPTV GQSLVGKVAGVEVHQTSGAPGDGVTIRVRGVNSLSASSAPLYVIDGYPASEDVFINPNDI ESIDILKDAASAAIYGSRGASGVVLITTKRGKDGEAAKVSYDFSYGIQQLDHKVDLLNST QFRDLLIDARNNSYRLRATAAGVSWSPYDDNTIRAAKGFSLAEVGIHPMFYDFTTRTPVT PQYDTDWQDELFSNAGIMRHNVSVIGGTKAIKYMASVGYMDQDGIIAPSNHNRINARINL DAQITKRLTASISYSMYDAKNTVVQAEGRMINDGVIQSALMYLPNLPAYEENGDYARSAM 
IRMKTDWGMNFPENPLAIANELDINEKMSRHNLNLNLVYEFLPDLKLSARLGQQWYNYRY FYYRPMSIGRDAAPAYSEELRSSNIARTTSTYDVDRLGEFTLSYKKKIGRHHIDALAGYT LQKKTYDRLGVEATGFADDRIHEVTGHGSNASDISLYSTRKAAWAMMSFLTRVNYSFDDR YTLTGSFRADGSSRFGIDSRWGYFPSVSAGWTLSNEPFLKDALKDVASIRLRASWGKSGN NDIGNYASLAGISSGSYAFGQTPVSTTYEGSFTDAALGWETTRQTNVGIDLGFFNGRLNV IGNWYNSISTDILYKYPISSISGATSTTTNMSGAKIRNRGFDVQLDARLLTGKVNWNFST NISVNRNKVVSMGGLDDIISTTERSVGSHITKEGKPIGSFYGYQAVGIMSKADYANALLD RDVYIKNGNKFPEGYQLKGPAVASYALDNLSYGNAIWKDANGDGVITTDDKTIIGDAYPD FTGGFSTSLSWNGLDFSASFAYSYGGEVINFQDYYLYNMEGSGNQYSIVADRYISDAQPG RNNVPIASRISTTNTSLKLSSYYVEDASFFRCANITLGYTLPKRWTSKLHITSCRVYVSG DNLFTITPYRGYNPEVSYKSSNMMPGFDWGCYPLSRIYSVGLNLTF >gi|313156882|gb|AENZ01000084.1| GENE 9 13597 - 15057 2544 486 aa, chain + ## HITS:1 COG:no KEGG:Cpin_3634 NR:ns ## KEGG: Cpin_3634 # Name: not_defined # Def: RagB/SusD domain protein # Organism: C.pinensis # Pathway: not_defined # 1 483 1 510 510 307 37.0 6e-82 MKIKHMFILLAASVSTASCGFLDEYDPNATTVGNFYKSEADIVTSLNGVYASLTQSYYYT NNHYFTDVRSHDTAVTDSGANSGIPYQFYNYTLTEENSYVYNRYTQLFKTISRANTLLAH LDDVSYSNGDARNTYEAEIRFLRALTYFHLVTEWGDVPLILTELKTKDEVQANNYRRPKS EVYKAVFADLDYVLGSPLADMQPASECGRASKAAAWTLYGKAKLQYACDEDFASEKSASL TSAIEKLTAAWNLRSFGELSDIPYNAIWDLSTQKSCAENIFQINYIQNNADLGSTWNYLF GPSKTGITSNKVGLMQNMTTKSVYDAFAKEDVRKNYLRETTVAGVTYYHTMKYADLECGA NGYGGNNWIVMRYADVALMLAEAYYWQGTEPTAKSFLNMVRKRAGLGDWSGTDLRQGIYD ERRNEFIQEGLRWQDLLRMYSKQEMIEHFNAINPNFGLKDLLLPIPYNERILNTEGLYQN PGYGVK >gi|313156882|gb|AENZ01000084.1| GENE 10 15070 - 16491 1885 473 aa, chain + ## HITS:1 COG:BS_yhdW KEGG:ns NR:ns ## COG: BS_yhdW COG0584 # Protein_GI_number: 16078027 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Bacillus subtilis # 218 460 1 231 243 74 31.0 4e-13 MRFTNRIILILWAIACGPLAACSDGGTEAPGPENGTQTIVWTPAIPVRDAAVVFSLEGSP DNVRSVLWMFGDGATATSGAGESAEHTYRTAGTYTVRATVTRLHGPMTELSATVAVADTE AAIVVSNDHPARMEQVAFALSRIPGVSGVVWNFGDGSAAQRVTDPTQRVEHRYEADGQYR AGAAISYTDGTTQTLTQTVTVEGASLSRSCLNFDTGKMWIMAHRGNFNNGYDLAPNSMAA 
FRKCVELGCVDFIETDVQITKDGQVICLHDNYLKRFTDYSSYAGDEGYVINFTREELRKF RLKTTDGKVTGEQIPTLEEVLTELRGKVWFNLDKCGSDDVDIAKVYEVVKRCGCLARVQF YVGTNSDRAAWLARQEQPGIIAPHANNASALAAMSAFAPVYMIQTSTGYVNSAWISSLNA KGLSVSNLLDDEGAAFRNGNTVTMDGFVAAGLRMIQCDYPALMDEYLRDKGKR >gi|313156882|gb|AENZ01000084.1| GENE 11 16520 - 18373 2532 617 aa, chain + ## HITS:1 COG:VCA0223_2 KEGG:ns NR:ns ## COG: VCA0223_2 COG3291 # Protein_GI_number: 15600992 # Func_class: R General function prediction only # Function: FOG: PKD repeat # Organism: Vibrio cholerae # 45 200 53 199 204 79 37.0 1e-14 MKRILSYLSIVLLSASVACSDSSEMDKAYNSVITPGFTFGEEEIIAGITPVTFTNTTTAE GTTVSEYFWHFGFAGEGNWSEEAAPDPIVYNRPGDYTVTLTAWGADGNRATTTRTITVLA DNVLPTADFSYSPQMISIGTTVTFTDKSTDSDGEIVSREWLFPDGTTSTDINATYTFEQM GMFNVQLTVTDDRGGSNSTSKAINVRAGDVNAFTLLWSTAVASADALCAASVVAASDMGY IYSVSGDGKMVALNTDGAKVWEYDAMAKDGVYLKSEISYPSADTDGTVYWVAHGYGGSSS ASLVYAFDGATGSVLWKNTTAYAPAARIAFSTPCISPSMVIVGSRGTNGAIRGFEKGTGK NTATATPANGGGTSGTIALKNGVVIFTNTAQYGYGIMVPDASFVWSPVPTSDTFAPNKIL SAGRCQPCVDADNCVYLPGIAEGSGTWNLASFDCTNLTASSKKTPKWSVNLPDGFQQTGA SLSADGTTLYIVADKATPSVVYALNTSNGSTRWSYTLDAASNSIPAVDNLGQIHLCTKTG DYVVLSAEGELRYKEHLADSVDGSPTISEWGYSYFLGKDSAAGALKVYSVALPGVTSPAA SAWSQYGQNARHSNYQK >gi|313156882|gb|AENZ01000084.1| GENE 12 18448 - 20469 2889 673 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 49 307 2 251 257 84 24.0 6e-16 MKKTIILAMFAALCTLTAGCADDFKTVLNDKYYEDDTPSREPDITEQTLTLGSYNLWISS KGTGDYLWTNRRTVLAQSIVKNKWDIFGFQEANGTIQNELPTLVGQQGGKYEWWFVGRDS QDGVSGEALGIAYNPDRFELTDKHFFWISPTPDEMSYGWDELGYHRIAACAMVTDKLYNK QFFMMVTHAPLGATARAEGAKLLIEREKMYNPDGIPSILVGDMNAAMDDASSKTLRTHWN DSFLTVESDFVSGPVGTFNGHKITADLTQATARIDYIYSRGDVELKSYKVDNTVYGNIYP SDHCPLTIQFDTDYEKPAPDVVEGSGTAADPWQLNSVSDWNTVAASINGQAEDAVYTSAA YYRLTADIDFDNKNLTPISFTADNTIYFEGEFDGAGHKLLNVKIVAPGKSCGVFGANKGT IRDLAVEGALSTEFEIAGGIVGINAGVIDGATFKGDITGGTGAKTIGGIAGQNKGTLVNC 
ANLGGTMKTDAPKDPNMGGIVGQIAKGDDGLGRYVINCYSRVDQLEAKHNDVGGIAGIVS DDSFVINCYSTVEKITANSSYASVVGYSKKGNLQNIYGNSACPSKSAANSAVGSDKAAGT VWKKTTFALLSLDEMKSGAVTVPSSGESCANFAAALNAGATLFNDTPAATLPGKPDVVLR KWTASESYPVLEK >gi|313156882|gb|AENZ01000084.1| GENE 13 20521 - 21435 1209 304 aa, chain + ## HITS:1 COG:mlr8455 KEGG:ns NR:ns ## COG: mlr8455 COG0584 # Protein_GI_number: 13476980 # Func_class: C Energy production and conversion # Function: Glycerophosphoryl diester phosphodiesterase # Organism: Mesorhizobium loti # 6 303 63 388 407 126 32.0 4e-29 MKRYLLILCATAGLYACSQPAAETTDKARFVRAELHNPSSRYVVVVSHRGDWRNWPENSI PAIESVIGMGVDIMELDLKLTKDSVLVLCHDKTIDRTTNGRGRVCDITYDSIRRCVLKTG HGVKTSLKMPTLREALAVCKDRIAVNIDQGYEYYDLAFAITEELGVTDQVLIKGKRPVET VSAKFAEYGHNMMYMPIIDILKPQGRELFEEYASKGVVPLAYEVCWDDYTPQVEACMRQV VAGGSKLWVNSLWDSLCGGLSDDKAFTESPDEVYGRLLDMGASIIQTDRPELLIRYLEAQ GRRK >gi|313156882|gb|AENZ01000084.1| GENE 14 21445 - 22329 1302 294 aa, chain + ## HITS:1 COG:lin0348 KEGG:ns NR:ns ## COG: lin0348 COG3568 # Protein_GI_number: 16799425 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Listeria innocua # 23 292 2 257 257 156 36.0 4e-38 MNKLLICMAVACVTAFNAAAQSFSAATYNIRQLNAKDTAEGNGWERRLPVISSLIRFHDF DIFGAQEVFHSQLQGMLAALPGYDYVGVGRDDGAAGGEYSAIFYKRGRFRLLDSGHFWLS EDPSKPGKGWDAKYVRICTWGRFYDRQSRQRFWFFTTHTDHRGEQAQAESCRLILAKIEE LCRGERVILTGDFNVGETSRSYAILRDSGILSDTYDTAEIRYAETGTENWFDPDIKTFRR IDYLFVTPGFRVLRYGILTDTYRSENPDGGERKFRARTPSDHFPVQVELEFTKR >gi|313156882|gb|AENZ01000084.1| GENE 15 22366 - 23739 2332 457 aa, chain + ## HITS:1 COG:VCA0707 KEGG:ns NR:ns ## COG: VCA0707 COG2271 # Protein_GI_number: 15601463 # Func_class: G Carbohydrate transport and metabolism # Function: Sugar phosphate permease # Organism: Vibrio cholerae # 3 451 1 436 459 331 39.0 1e-90 MSILSFFRKSAPAPRLPKDDDAMMRLYKKLRWQSFIAGTVGYSLYYVCRTSLNVVKKPIL ESGALDATQLGVIGSALLFAYAIGKFVNGFLSDHSNIKRFMAAGLAVSAVANLLVGILGL ANGGGIVGNMVLFVAFAVMWGVNGWSQSMGAPPAIIALSRWYPLSKRGTYYGFFSASHNL 
GEFLSFLFVGAVVGFFGWQWGFVGSSVAGVLGVLTIVLLMHDTPESKGLPPIEELTGEET PGTPRDHGKTSDLQRSVIRNPLVWVLALSSAFMYISRYAINGWGVLFLQEIKGYTLAAAT QVISVNALCGIVGTVFSGWLSDAFFYGRRNVLAFGFGVLNTVALCLFLYSGDGLIVNILS MILFGVAIGVLICFLGGLMAIDIVPREATGAALGIVGMASYVGAGLQDIVSGWLINSGKT ELDGVTSYNFDSAIVFWIAASAVSFILALFVAKRSHR >gi|313156882|gb|AENZ01000084.1| GENE 16 23989 - 24519 869 176 aa, chain + ## HITS:1 COG:no KEGG:Bache_2563 NR:ns ## KEGG: Bache_2563 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 1 176 1 180 180 129 40.0 5e-29 MKKIIIVLLAAAASFTAANKASAQEKGLWIGGKLGYWHDKTEGVKTDSFSIAPEAGYDFN KRWSVGVALGYEYLKVKDGGSANVFTASPYARYKYYNKGILSLFLDGGVGFACCDHEGFQ AGITPGLSVKINEHFSFLTQVGFLGYRKDYFNGGSGEAFGLKLSSSDLKFGFYYKF >gi|313156882|gb|AENZ01000084.1| GENE 17 25313 - 26341 1218 342 aa, chain + ## HITS:1 COG:no KEGG:Odosp_0261 NR:ns ## KEGG: Odosp_0261 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 7 331 21 341 351 141 27.0 5e-32 MTAPAHAQLIFTPDSWDFGTISETDGRVSHTFTGENRSDKPVVILDVVTTCGCTVPEFTK RPIRPGEKTTIKVTFDPTNRPGAFTKELGVYSSERRKIATLTVRGSVTPRTKTTEELYPV DAGGGLRLASTLCTFSYIREGQQVQSAIGCINTSNRPVRLELHPKESSGLLAADYPRELA PGQTAEINLMYLNPEGAPRYGSLRDALEVRADGRGNGTLIVAHGIGIDKSPADARRTPKL QITENIIKFGPVKHNAPQQQRTFTLTNTGDGELIVRAVETGGKVATTLTPGQRIPAGSSF TAKATLDPGRQEYGVVTDFVTIITNDPTRPMRRLRATAIVED >gi|313156882|gb|AENZ01000084.1| GENE 18 26400 - 27284 1161 294 aa, chain + ## HITS:1 COG:PAB1737 KEGG:ns NR:ns ## COG: PAB1737 COG0543 # Protein_GI_number: 14521153 # Func_class: H Coenzyme transport and metabolism; C Energy production and conversion # Function: 2-polyprenylphenol hydroxylase and related flavodoxin oxidoreductases # Organism: Pyrococcus abyssi # 1 285 1 276 278 250 47.0 3e-66 MYPILEKRLLAEGIWLMKVLAPRVARSARPGQFIIVRADEHGERIPLTISDFDAEEGSVT IVTQAIGASTRKICSLEAGEAFADFAGPLGHPSEFVEMPLDELRKRHYLFVAGGVGTAPV YPQVKWLKEHGVKADVIIGAKTRDMLIYTDAMRAVAENLHIATDDGSEGFKGLVTQVIDE LIGKQGRHYDECVAIGPMIMMKFVALTTKKYALKTTVSLNALMVDGTGMCGACRVTVAGR 
TRFTCVEGPEFDGHEVDFDEAMRRQGMYRTQEQRAAAIEAERQAGHQCKIGLDK >gi|313156882|gb|AENZ01000084.1| GENE 19 27342 - 28727 1890 461 aa, chain + ## HITS:1 COG:MA3787 KEGG:ns NR:ns ## COG: MA3787 COG0493 # Protein_GI_number: 20092583 # Func_class: E Amino acid transport and metabolism; R General function prediction only # Function: NADPH-dependent glutamate synthase beta chain and related oxidoreductases # Organism: Methanosarcina acetivorans str.C2A # 4 461 13 469 469 511 60.0 1e-144 MANKIPRVPVREQDPKVRATNFEEVCYGYDLEEATLEASRCLQCKNPRCVAACPVNIRIP DFIGALHRGELQQAADIIAEDSSLPSICGRVCPQESQCEGSCILGIKSEPVAIGKLERFV GDWKLEHGAAAHRTAPRNGRRVAVIGSGPAGLACASDLAKMGYGVKIFEALHKVGGVLVY GIPEFRLPKQRIVAREIAEVERLGVEIETDVIVGRTVTVDSLLGEEGYDAVFIGSGAGLP RFMGIPGENLNGVVSANEFLTRANLMRAYDDTYDTPIYVGKRVVVVGGGNVAMDAVRTAK RLGAEATIVYRRSEKELPARVEEVHHAKQEGICFRMLTNPAEVLGDERGWVTGIRCVEME LGEPDQSGRRSPVVKPGSEFDIACDVVIMALGTSPNPLLAATTAGLETDRRGCITADGQG ATSREGVFAGGDAVTGAATVILAMGAGRTAARAIDEYLRKR >gi|313156882|gb|AENZ01000084.1| GENE 20 28748 - 29314 983 188 aa, chain + ## HITS:1 COG:no KEGG:Amuc_1202 NR:ns ## KEGG: Amuc_1202 # Name: not_defined # Def: hypothetical protein # Organism: A.muciniphila # Pathway: not_defined # 87 187 18 116 121 70 37.0 4e-11 MKSIKIMAAAIALCGMTACGDNTPKTFTGFITDASMNTVTVENAEGTFTFSTMDADKSEA NGLLLGAPVVVDYKGKLEDGAAAAKVATDPTYAEAVGKWTMPDPIDPEGVMGIDILIEGQ AQSINMATLRYTSWELQGEAGKILLKGQSVGNGQTIDFTETGIIAKDADGVYTLTIEGNK TVYTKATK >gi|313156882|gb|AENZ01000084.1| GENE 21 29451 - 30299 1180 282 aa, chain + ## HITS:1 COG:CAC2630 KEGG:ns NR:ns ## COG: CAC2630 COG4632 # Protein_GI_number: 15895888 # Func_class: G Carbohydrate transport and metabolism # Function: Exopolysaccharide biosynthesis protein related to N-acetylglucosamine-1-phosphodiester alpha-N-acetylglucosaminidase # Organism: Clostridium acetobutylicum # 82 278 147 344 347 72 30.0 6e-13 MIKFLAGAFLLLFSFTAAAQTPEDSVAFAGARWQITPLAAGAECRRAQIDMFDSRQTVSV VAYPARNFTTEIIQLDGKACATSELGKAAGADAALNGSYFNMKTLAPVTFVLIDKQILGR 
TTPGETMRTNGVIALRDKRGRKMDILRCDTTQYSRIARRYRSALAAGPVLVRDGRSVEYD SEDSFYARRHPRTLIGKRADGTVVMAVIDGRFKGEADGATIAETAYIACQLGLVDALNLD GGGSSTLWTAQEGVLNHPYDNKRFDHEGERGVPNCIVIRRNK >gi|313156882|gb|AENZ01000084.1| GENE 22 30646 - 31671 891 341 aa, chain - ## HITS:1 COG:aq_1543 KEGG:ns NR:ns ## COG: aq_1543 COG0859 # Protein_GI_number: 15606683 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ADP-heptose:LPS heptosyltransferase # Organism: Aquifex aeolicus # 89 333 72 302 312 71 27.0 2e-12 MVRKSKKLPRHLIVYRTSAMGDVAMLAHALRALKRAYPDLQVTVATRPLFRAFFTDLDVG FLDVDVKGAHHSLCGIWRLAAQARRLGADAVADVHGVLRSVTFRTALRLHGIPFAAIGKG RDEKGEFIRRGGRGVKPAKHTVVRYCDVFRKLGFVFDDPKPAVRREHPSPMGEKTGTWIG FAPFSAQQGKTYPEPLARETVRLLAGRYDRVFIHSGGGAEAEFAQEMERLHPNVTALFGR VKMAGEIDLIANLDCVVSMDSLVMHLASLVATPVVSVWGATHPGLGFFGYGCDPRGIVQA DMECRPCSVFGNKPCRYGDYRCLHAVTPAMIVERVEEIVGK >gi|313156882|gb|AENZ01000084.1| GENE 23 31655 - 32260 968 201 aa, chain - ## HITS:1 COG:no KEGG:Bache_0540 NR:ns ## KEGG: Bache_0540 # Name: not_defined # Def: hypothetical protein # Organism: B.helcogenes # Pathway: not_defined # 2 201 3 202 202 268 66.0 1e-70 MFTENANRIFNRVIEEYHRWDDVDRPVENPYEPGTIDHLLYHKNWIDTVQWHLEDIIRDP QIDPVAALAIKRRIDKSNQERTDMVEYVDSYLLDKYREIEPLPGARLNTETPAWAIDRLS ILALKIYHMACETERTDVDDAHRAACRKKLDVLLAQQADLSKAIEELIEDIEAGRKYMKT YKQMKMYNDPALNPVLYGQKK >gi|313156882|gb|AENZ01000084.1| GENE 24 32336 - 33343 1546 335 aa, chain + ## HITS:1 COG:lin1064_1 KEGG:ns NR:ns ## COG: lin1064_1 COG1705 # Protein_GI_number: 16800133 # Func_class: N Cell motility; U Intracellular trafficking, secretion, and vesicular transport # Function: Muramidase (flagellum-specific) # Organism: Listeria innocua # 22 170 44 201 201 80 36.0 4e-15 MNFSKNAILLAGILLTAFGSLRAQVRQTREEYIDRYKSIAVAHMERYGIPASITMAQGIL ESDCGNSRLSLMSNNHFGIKCKRNWTGEKVYHDDDAKGECFRSYPTVEASYQDHAEFLDS QPRYDSLFAYSPTDYKSWARGLKAAGYATAPDYAQRLCRIIEEAQLFLLDQPDGERLYAS RSGRKITDPEGWFTDQTSMERPADGSSAVDPDNYRVTINAHNGYNVYATNGVHYVLAKEG DTFENIGKKFRISARNLRKFNDLKDKKAQPMPHEVVYIERKKKRWEGNAHTHTCRQGETA 
YAVGQSYAIRTRSIEKLNKLRPGDTLEQGRQIRIK >gi|313156882|gb|AENZ01000084.1| GENE 25 33359 - 34045 1308 228 aa, chain + ## HITS:1 COG:no KEGG:Fluta_3562 NR:ns ## KEGG: Fluta_3562 # Name: not_defined # Def: hypothetical protein # Organism: F.taffensis # Pathway: not_defined # 9 227 6 222 226 171 41.0 3e-41 METTPLRSKILDTIIRKSTLKQRIFDNTFSAFNELKETLLEMASELDDELDGKLDRRVRL EYRDRGKFEAQLQVANDILIFQMHTDIFEFAADHPIWQNPYVQGDRENSYCGLINIYNFL SDSFKFNRNADEGYLIGRIFINRERRYFVEGKQQTSMRAAEFGKSEIGHDALIAILEEAI WFALNFDLLMPPYEDNKRVTVDQFNTKMDNSKFVTGKRLGYDFEVEDI >gi|313156882|gb|AENZ01000084.1| GENE 26 34059 - 36194 3488 711 aa, chain + ## HITS:1 COG:CC2154 KEGG:ns NR:ns ## COG: CC2154 COG1506 # Protein_GI_number: 16126393 # Func_class: E Amino acid transport and metabolism # Function: Dipeptidyl aminopeptidases/acylaminoacyl-peptidases # Organism: Caulobacter vibrioides # 37 690 39 712 738 303 31.0 7e-82 MKASLITLLAAALLGAGIVQAQNEYAEIARLRGMNQTVRGIRSMADGEHYTTLEGSNIIR HSYAAAGPGERMLPSSAANLTITDYSFSPDERQILIASGSKPIYRHSYTTSYFLAAGNSL TPVLREAEAPRDASFSPDSRLIAYSDRNDLYVYDTAARQTRRITDDGAWNSVINGTTDWV YEEEFGFTKAYAFSPDSRRIAYLRFDESQVPLMEMMRFDGKLYNQAYSFKYPKAGERNST VEVWVCDLATGAKERIDTGRETDQYIPRIGWTPDGRLYFFRLNRRQNTFEVILCEPNGAQ RTIYDERSQQYVERVDDSSLTFVDKDRFLVRQESHTGYMHLYLYSIRRGFLAQVTKGDWE VTDVAGSDGKRVWYLSTETSPLRRNLYSVRLDGTQKRRLTTGEGFYTIAPSAGMKYYIST FSNAATPNRVEICDGEGNPVRLLADSKALRDELAAADRPQKEFFTFATERGDTLNAYIVK PRGYDPSHRYPVLLTQYSGPGSQSVRDRWSLDWEDALADKGYIVVCADGRGTGFRGEKFK KQTYGRLGALEVEDQLSLARYMAAQPYTDPARIGIYGWSYGGFMALSCALKGHGLFKMAI AVAPVTSWRYYDSIYTEIYNNLPQYNASGYDDNSPLNFARMLDDTKTRLLIIHGTADDNV HFQNTVEMTRALNRCGKQYDMMVYPDQNHSMQPHDTANVRQKMIRYTLDNL >gi|313156882|gb|AENZ01000084.1| GENE 27 36206 - 36703 686 165 aa, chain + ## HITS:1 COG:BH1021 KEGG:ns NR:ns ## COG: BH1021 COG0350 # Protein_GI_number: 15613584 # Func_class: L Replication, recombination and repair # Function: Methylated DNA-protein cysteine methyltransferase # Organism: Bacillus halodurans # 7 164 11 172 175 148 46.0 4e-36 
MERMLFIETPVGKLGITASDEAVTRIFFGREYRNRRRPRNAPEACTPETGGPLLLQAAAE LAEYFAGTRREFTLPLAPAGTPFQQAVWEALRMIPYGETRTYGQIAAQIGRPTACRAVGM ANNRNPIAIVVPCHRVVGSTGALVGYAGGLGVKTHLLNLEKAHED >gi|313156882|gb|AENZ01000084.1| GENE 28 36880 - 38307 2183 475 aa, chain - ## HITS:1 COG:SMc01927 KEGG:ns NR:ns ## COG: SMc01927 COG1007 # Protein_GI_number: 15965032 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 2 (chain N) # Organism: Sinorhizobium meliloti # 13 461 16 462 480 237 33.0 4e-62 MDYSNIFTSMQAEVTLIAVIVLLFLYDLIAGERGRRRFSAVACGLLALQVAVNVVPGTPG EVFGGMFQYVPMHSVVKSVLTVGTLIVFLQADAWLRRGDTAHKQGEFYILTLSTLLGMYL MVGAGNFLLFFIGMELASVPMACLVAFDKYKRYSAEAGAKYILCALFASGLMLYGISFFY GTTGALYFDDMAARITGSPLQIMAMVFFFAGLGFKISLVPFHLWTADTYQGAPTTVTSYL SVVSKGSAAFVLMTVLMKVFGPMIGQWQTLLCAVTVLSITVANLFAIRQKDLKRFLAFSS ISQAGYIMLAVIGGSAFGMTSLVFYVLVYMAANLAVFGVVSTVEQHSGGKVGIEDYNGLY RTNPKLAQMMTLALFSLAGIPPFAGFFSKFFVFAAAFKGGFHLLVFIALINTVVSLYYYL LVVKAMYITPNDAPVAPFRSDRYTRISLALCMAGVLLLGVVSAVYDGINAFAFGL >gi|313156882|gb|AENZ01000084.1| GENE 29 38314 - 39798 2122 494 aa, chain - ## HITS:1 COG:BMEI1146 KEGG:ns NR:ns ## COG: BMEI1146 COG1008 # Protein_GI_number: 17987429 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 4 (chain M) # Organism: Brucella melitensis # 75 489 77 485 502 253 38.0 6e-67 MNMLSLFPAIPLLMMLGLWLSKSTRQIRAVMVAGSSLLLLLAAVLVVRYLGLREAGETAA MLFTASHVWYAPLDIHYAVGVDGISVAMLLLSAVIVFTGTFASWRMQTQIKEYFLWFCLL SVGVFGFFISVDLFTMFMFYEVALIPMYLLIGVWGSGRKEYAAMKLTLMLMGGSALLLIG ILGIYFGAGATTMNILEIAQLHSIPLAQQYVWFPLVFVGFGVLGALFPFHTWSPDGHASA PTAVSMLHAGVLMKLGGYGCFRVAMYLMPEAAHELSWIFIVLTTISVVYGALAACVQTDL KYINAYSSVSHCGLVLFAILMMNTTAGTGAILQMLSHGLMTALFFALIGMIYGRTRTRDI RELSGLMKVMPFLAVGYVIAGLANLGLPGLSGFVAEMTIFNGAFMNDDTFHRVVTIIACT SIVITAVYILRVVGKIIYGTCTDEHHLKLTDATRDERLSVVCLVAAIAGMGLAPLWVSDM IRESVTGIIAQLMS >gi|313156882|gb|AENZ01000084.1| GENE 30 39811 - 41703 2716 630 aa, chain - ## HITS:1 COG:slr0844 KEGG:ns NR:ns ## COG: slr0844 COG1009 # 
Protein_GI_number: 16331732 # Func_class: C Energy production and conversion; P Inorganic ion transport and metabolism # Function: NADH:ubiquinone oxidoreductase subunit 5 (chain L)/Multisubunit Na+/H+ antiporter, MnhA subunit # Organism: Synechocystis # 61 628 67 680 681 369 39.0 1e-102 MEYTILILLLPLLSFLFLGLAGMKLKPAAAGLVGTAVLGVVTLLCYCTAFEYFTAGRDAA GAFPTLIPWNTVWLPISRTLHIDMGILLDPISVMMLVVISTVSLMVHVYSLGYMKGERGF QRYYAFLSLFTMSMMGLVVATNIFQMYLFWELVGVSSYLLIGFYYTKKEAVAASKKAFIV TRFADLGFLVGILFYGYYAGTFSFTPEVRLLAAAGTMIPLALGLMFIGGAGKSAMFPLHI WLPDAMEGPTPVSALIHAATMVVAGVYLVARMFPLFIGYAPEVLHWTAYIGAFTALYAAV VACVQSDIKRVLAFSTISQIGFMIVALGVSTSADPHAGGLGYMASMFHLFTHAMFKALLF LGAGCIIHAVHSNEMSAMGGLRKYMPVTHATFLVACLAIAGIWPLSGFFSKDEILTACFA FSPAMGWLMTAIAGLTAFYMFRLYYNIFWGRENRELHAAHKPHEAPLTMTLPLVFLAAVT LVGGAIPFGKFVSSDGMPYTIHIDWRVAGVSLCVAAAGIALATWMYLRERQPVADRLALR FRGLHRAAYNRFYIDDVYQFVTHKVIFRFVSTPIAWFDRHVVDGFMNLLARAAGGAAYAI RDMQSGSVQRYCIWFLGGALGFTIILLLIR >gi|313156882|gb|AENZ01000084.1| GENE 31 41726 - 42034 588 102 aa, chain - ## HITS:1 COG:VNG0643G KEGG:ns NR:ns ## COG: VNG0643G COG0713 # Protein_GI_number: 15789840 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 11 or 4L (chain K) # Organism: Halobacterium sp. 
NRC-1 # 1 102 1 100 100 77 45.0 7e-15 MIHMEYYLVLSTIMMFVGIYGFVTRRNLLAMLISVELILNSVDINFVVFNRFLFPEQLEG FFFTLFAIGISAAETAVAIAIIINIYRNIRNIEVKSLREMKW >gi|313156882|gb|AENZ01000084.1| GENE 32 42047 - 42556 910 169 aa, chain - ## HITS:1 COG:sll0521 KEGG:ns NR:ns ## COG: sll0521 COG0839 # Protein_GI_number: 16332084 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 6 (chain J) # Organism: Synechocystis # 5 168 7 168 198 59 31.0 3e-09 MEITLELIVFTVLAVFTAVSALLAVTTKRILRAATYLLFVLFGTAGIYFQLNYSFLGAVQ LLIYAGGITVLYVFSILLTSSQGDKSEDLKGYKFFVGLAATLLSLGVCLYVTLGHDFRPS HFVHGELHVQTIGHALMSSGKYGYVLPFEVISILLLACIVGGILIARKR >gi|313156882|gb|AENZ01000084.1| GENE 33 42560 - 43000 700 146 aa, chain - ## HITS:1 COG:RP795 KEGG:ns NR:ns ## COG: RP795 COG1143 # Protein_GI_number: 15604627 # Func_class: C Energy production and conversion # Function: Formate hydrogenlyase subunit 6/NADH:ubiquinone oxidoreductase 23 kD subunit (chain I) # Organism: Rickettsia prowazekii # 1 140 1 131 159 85 35.0 4e-17 MKSYIKGFFHGLGSLLTGLKVTGREFFTPKVTEQYPENRATLRMFDRFCGELTMPHDAEG RNKCIACGLCQSACPNGTIRITTEMVADPETGKSRKRLVLYEYDLGACMFCRLCVNACPT GAIRFSTDFEHAVYTREKLVKTLNKQ >gi|313156882|gb|AENZ01000084.1| GENE 34 43003 - 44079 1828 358 aa, chain - ## HITS:1 COG:sll0519 KEGG:ns NR:ns ## COG: sll0519 COG1005 # Protein_GI_number: 16332086 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase subunit 1 (chain H) # Organism: Synechocystis # 9 347 10 358 372 234 38.0 3e-61 MFDFSIVTSWIHGLLTSLMPLGLAVFLECVIVGVCLLLMYTVIAILMIFMERKVCAAFQC RLGPVRVGPWGTLQVICDVFKMLTKEIITIRRSDKFLYNLAPYIVILASVLAFACLPVNK GLEVLDFNVGVFFMMAASSIGVVGILLAGWSSNNKYSLIGAMRSGAQMISYELSIGLSIL TIIVLTDTMQFSEIVARQADGWFIFKGHIPALIAFVIYLIAGNAEVNRGPFDLPEAESEL TAGYHTEYSGMHFGLFYVAEFVNLFIVAGVAATIFLGGWMPLHISGWEGFNAAMDYIPGF VWFFGKAFFVVWLLMWIKWTFPRLRIDQILTLEWKYLVPIGLANLLLMVVVVVFKLHF >gi|313156882|gb|AENZ01000084.1| GENE 35 44259 - 45878 2441 539 aa, chain - ## HITS:1 COG:SMa1529 KEGG:ns NR:ns ## COG: 
SMa1529 COG0649 # Protein_GI_number: 16263284 # Func_class: C Energy production and conversion # Function: NADH:ubiquinone oxidoreductase 49 kD subunit 7 # Organism: Sinorhizobium meliloti # 162 539 1 404 404 302 36.0 9e-82 MSMTLKDKITAWEPQAVWSEAGDGMWTVPAEKFHDLAARLRAEGFDFLRSLTGMDWGEEG FGAVYHLEASATGENVVLRTLTPSREKCGLPSVCDLWKAAELNEREVFDYFGIGFLNHPD MRRLFLRDDWVGYPLRKDYDPALNPLRMTNEVSMDSAPSFELTSDGSFIRKRNVLFEEDE YVINVGPQHPATHGVLRFRVSLEGEIIKKLDVHCGYIHRGIEKLCEGLTYPQTLALTDRL DYLGAAQNRHALCMCIEKGLGVEVSERVQYIRTIMDELQRIDSHLLFFACLCMDMGALTA FFYGFRDREKVLDILEQTTGGRLIQTYNTIGGVQADIHPGFVRKVKEFIAYMRPLLREYH EIFTGNVIAQERLKGTGVLTREDAISFGATGGTGRAAGWACDVRKRHPYAMYGKVGFEEV LFTEGDCFARYMVRMREIEQSMNIVEQLIDNIPEGEYQLKMKPVIRIPEGSYYAAVEGSR GEFGVFIESRGDKSPYRMKFRSTGLPLVSCLETIAKGTKIADLIAIGGTLDYVVPDIDR >gi|313156882|gb|AENZ01000084.1| GENE 36 45875 - 46498 439 207 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175216|ref|YP_001407461.1| NADH dehydrogenase subunit B [Campylobacter curvus 525.92] # 26 172 16 160 170 173 53 2e-42 MKYEDFNDNEYLEKMVAELRDNNTNVIVGCLDELIEWGRSNSLWPLTFATSCCGIEFMAV GAARYDFARFGFEVARASPRQADFIMVAGTITHKMAPVLRRLYDQMADPKYVIAVGGCAI SGGPFKKSYHVLNGVDRILPVDVYIPGCPPRPEAMLYGLMQLQRKVKLQRFFGGVNKQIG REEYEELLRRDLTSERNDLNVEGGEKQ >gi|313156882|gb|AENZ01000084.1| GENE 37 46522 - 46872 224 116 aa, chain - ## PROTEIN SUPPORTED ## NR: gi|154175415|ref|YP_001407462.1| NADH dehydrogenase subunit A [Campylobacter curvus 525.92] # 3 116 14 126 129 90 35 1e-17 MYFTLLVVVILTAIALVAVALGIARAVSPRSYNPQKGEAYECGIPTRGRSWMQFKVGYYL FAILFLMFDVETVFLFSWAVVVQELGVYGLVSILFFLVVLILGLAYAWRKGALEWK >gi|313156882|gb|AENZ01000084.1| GENE 38 46984 - 47349 441 121 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156904|gb|EFR56342.1| ## NR: gi|313156904|gb|EFR56342.1| hypothetical protein HMPREF9720_1864 [Alistipes sp. 
HGB5] # 1 121 1 121 121 204 100.0 2e-51 MLERIQYALFVLLFVLAGSLDGATPQGGACRVQPGMSPQAALHAPVTAQNYFTAAVMLSG DASIEIPAHSSEHKAIVRHRFYARSAAAEAVAQQNSALSDRGALYSKPVDYYIFSLGRIL I >gi|313156882|gb|AENZ01000084.1| GENE 39 47569 - 48933 1301 454 aa, chain - ## HITS:1 COG:no KEGG:BT_3988 NR:ns ## KEGG: BT_3988 # Name: not_defined # Def: putative peptidoglycan bound protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 450 1 441 444 228 37.0 4e-58 MKNMYKYALLFALAGGAALFTGCRADEEVDLAGYPETPVGATISGTTDRVATFAGTYDNE GVLNLAGSLSGEYTIALAQASPEETVVRVEPIITNVPAELVEISARELVIPAGSTTAAVS VRMIEENYDFMENVFGPVTYELGVRVVEARGSQVPVVDGEAKMVIEKAAYVAVASLVGVE GNAVTFKRNFIDGAIVNEDPITYDVKVVVDRPVLEDTKFVVKSAGIPEGFVDDERFTPAA EVTIPAGAKESDATTWSVTDDFLEADEELGTFPVQLTAELVGDTAGAAVDPEDAGVAISI VKKSDLLTFLSAADPSWKKYPTSGWTVQTNGSSSSNANTALWDGDSYSDIYLWRDSSLEL TIDMKTLQWVAGFDIQAFSIDYLGEYFELSTSEDGENWTPHGELRQAEAPNRGYGATNYV QLLKPCNARYVKWVGMKGSTALSINELYIYGRNE >gi|313156882|gb|AENZ01000084.1| GENE 40 49003 - 50520 1988 505 aa, chain - ## HITS:1 COG:no KEGG:BT_3987 NR:ns ## KEGG: BT_3987 # Name: not_defined # Def: endo-beta-N-acetylglucosaminidase F1 # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 490 1 476 476 436 47.0 1e-120 MKTRYFKSGILAAAAALLCSAALVSCTDDVTVGEWDGAGSIETGLNETGAMLQDLNSGKS NTTVELWKETYTADLRLVLTKTPSEGFTARAKVDDSYDVGQYNKANGTNYTLYPADKVTF ANDGLFAAAGRNVELTVEMTVHAAEGLVAGRGYLIPVALEADGVILKESHCFYVVKDMTS MPTCYKGDDLPKGFLFFEVNDVNPLNALTFELEDGRLLWDVVCLFSGNINHHADRNAPFL SLNPQTQYWMDNNEVFIQPLRKRGIKVIMCVLGNHDQSGVAQLSDYGCQMFAKELATFCE AYNIDGVCFDDEYSNSPDLSNPYYASRSSARAARLAYESKKAMPDKLVVAYCYSSFNISG WPTEIEGQDIAEWVDIAVGDYGWSTSPMGNMTQKQCSAISMEFNRGTGGNFTSVVAERML DPTTGKGWFMGFAPDPLKNTNGRVNKNAWRNIFVNRLNKGPETLYGSPLKEPEHFYKFAD TTRYNYPEDLPDTYSRPAQPEWPNY >gi|313156882|gb|AENZ01000084.1| GENE 41 50540 - 51715 1711 391 aa, chain - ## HITS:1 COG:no KEGG:BT_3986 NR:ns ## KEGG: BT_3986 # Name: not_defined # Def: putative patatin-like protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 390 1 383 384 397 52.0 1e-109 
MKYNKCLLVLSALAAVVFTGCKNEDVEKHRFDNKFYITSAILSDDLLIKSDTPGYTKTIE SRLAKPAGQQVAVTIEADPSQVAAYNMIYGDNAVALPAECYELSSNQIDIAAGQVSGDAI EIVFKDINTLDGSRRYVLPVTVTSCNGVDLLDSRSTVYFVARQGAMINVVANIAWMNFPV SWSADAKPLVSGMKQVTIEALLRSADWTDGRGDALSSVFGIEGNFLVRIGDADRPRDQLQ LATGSANGGNWPAANAAPGLPVNEWVHIAVVWDAVNGERIYYQNGKQVAYSNQKMSGSVT LTSNCYVGKSYNEERYIPGEISELRVWSVARTPEQIAGNMYGVSPESEGLVAYWKFNEGS GSTIIDHANGTNLSAVGGTPTWIPVTLPEIK >gi|313156882|gb|AENZ01000084.1| GENE 42 51728 - 52777 1485 349 aa, chain - ## HITS:1 COG:no KEGG:BT_3985 NR:ns ## KEGG: BT_3985 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 1 348 1 357 358 228 36.0 4e-58 MKINIKRYFLPLAAVLFMVGCHEAVEPEAIDVNYSSFEETNPELYARYMQALREYKAGSH KAVFVTMTVPENGAATTHRTQHFTAMPDSVDFIVVNPVPSVLCQTLVDEIRKVHEKGTRI LFNIDLQTFENDWTQVLKEDPTLSEEDALAYLGGRVGEQIALVDRWGYDGFIFTYTGKAV GSMQDEALAVYTARQEALFAPIRAWHEAHPSHALVFRGFTGAITETNMPLLDECAYIILP TNDVKTLDEMSFSALTAVSVAGVPADRLIVTAQTTRPGDDKKEFGYTGMVDAFGDTVEAI EAWAGWVTLPSPGYTRAGLLIEDVQYDYYNPAKVWAKTREAIGIMNPSI >gi|313156882|gb|AENZ01000084.1| GENE 43 52797 - 54386 1933 529 aa, chain - ## HITS:1 COG:no KEGG:HMPREF9137_0327 NR:ns ## KEGG: HMPREF9137_0327 # Name: not_defined # Def: putative lipoprotein # Organism: P.denticola # Pathway: not_defined # 8 522 9 517 519 462 47.0 1e-128 MKFNISNMTVALAAAALGAASCTANYEDINRNPYEVTAEDMERDGYAMRSFMTTMQSWVI PADVNQCQFTDLLLGGPYGGYIADANSGFNTGKFSTYDPQSNWSPVFYRVIYTNEMSNFT ELCKVTEDENAIAVAKVVKVAGVHRVADTYGPIPYTQVGTGADPVPLDSEKEVFKAMFAD LDAAVEVLTRNRAGMITTDADRLYGGSLERWGRLANSLKLRLAMRIVYTDFTTDDGRTPQ QLAEEAVSSELGVIEDNSGNAMYTGFGKDGNPFYVCFFSYKSGAGDHRVAADIASFMNGY DDPRRASYFNESAFSGGGYDGLRNGIQIPDDDRINNFSRYNVTASTSLMWMNASEVAFLR AEGALRGWAMGGNARTFYEEGIRLSFERCGVADQLTAYMASTALPGGYDDPAFGYSCSNQ STVTIPWNDGAAFEENLERIITQKWIANFPLGTEAWADYRRTGYPRLFPVVVNNNPDITN LQLGARRMTYPLEEYEANGETIRAAVDQWLGGQDKMSTRLWWDCNPRIR >gi|313156882|gb|AENZ01000084.1| GENE 44 54402 - 57809 5077 1135 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0825 NR:ns ## KEGG: Bacsa_0825 # Name: not_defined # Def: 
TonB-dependent receptor plug # Organism: B.salanitronis # Pathway: not_defined # 38 1135 33 1130 1130 1257 58.0 0 MNKTFIRQAKRSFTFLFMLGLCAIWSPRAAAQNRSLPQVTIRMDRVPMSQIMSEIERQTK YLFVTGSDVKTDRITSLDVTAVPLTEALNRMVAGADLTYDIRSLSIILSVKKAEQPAAVA GYVRDGSGQPVIGATIIVRDTSTGTTTDADGRFTLSVPAPSTAWLEVSYLGYEPQAVAVG NRTSFDITLREAASEIESVVVTALGIKRSEKALSYNAQQINAEDIVAVKDVNFVNSLNGK VAGLNINASSSGVGGSSKVVMRGSKSIEQSSNALYVIDGIPMCNFRSDGSEEFDSQGSSE AIADLNPEDIESMTVLSGAAAAALYGSDAANGAILITTKRGKAGQTTVTVTSNTEVLAPF VMPMFQTRYGTGDKLAGAANGIYSWGERLNAFNYRGYDPADDYFQTGVVGTETVSFSTGN ERSQTYASAGAVNSRGIIPNNGYDRYNFTFRNTSSFLKDKMHLDLGASYIIQKDRNMVNQ GVYSNPLVSAYLFPRGDDWADVEMYERYNPSRGIYEQYWPTMGADTYQMQNPYWISYRNL RENRKDRYMINASLTYDILDWLSVSGRIRVDNSNSDYTEKLYASTNNLRTEGSPNGLYGI TKTNDRQTYGDVMVNINKTFGENWSLQANVGASISDMRSDASKVRGPIAYGEQTGYDKDG NAIYEPNNIPNVFNVYNLSNSKTVREQIGWREQSQSVFFSAEVGFKGAYYLTVTGRNDWP SQLAGPNSVNSSFFYPSVGASVVLSEIIPNMPKNLSYVKLRASYASVGLAFSRFYANPTR SWNSSTNSWNLSSQSPLYDLKPERTKSFEVGLTMRFLRHFNFDISYYNTRTINQTFNTGI PASSMYSKVYKQDGDVRNRGFEMSLGYKNTWNRFSWDSNFTASVNRNKIMKLGKEVIDPN TGKVTPIGDLNVGGMGNVRFILREGGSLGDIYSRADLRRDSNGDIYQDMDGNIFVDNKTQ TKDFIKLGSVFPDANLAWRNDFRWRNFNFGFLLTARLGGVVYSRTQAVMDYYGVSASSAD ARDAGGIVINGNDLIDAQKWYQTVANGDTTPQYYTYSATNVRLQEASIGYTIPRKKLGDV CDITLSIVGRNLWMLYNKAPFDPEAVATTSNYYQGIDYFMMPSLRSIGFNVRVKF >gi|313156882|gb|AENZ01000084.1| GENE 45 57975 - 58916 1005 313 aa, chain - ## HITS:1 COG:SMc04204 KEGG:ns NR:ns ## COG: SMc04204 COG3712 # Protein_GI_number: 15965785 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Sinorhizobium meliloti # 114 259 154 299 354 65 29.0 9e-11 MEKETLYRYVACQASPEEETAVLTWLEADPAHESELAKVQRQHDLVALSAPVINELYAKD RRRRFGSVLRRWSAAAAAVVLLAFGGYYFHAARDFSRQGERLLSVSVPYGQRVSLTLQDG TSVWLNAGTTLRYPALFTGRERRVEIEGEARFEVVHDAKHPFIVRTYACDVEVLGTKFNV VAEEENGLFSTALFEGRVAVSSRLVPGERLVLEPDEMVTLEGKHLCLAQIDNDEEYLWTN GIISLTDQSFSELMHRFEKTFGVTIRIEREPSIRIGQGKIRQSVGIDNALQVLQRFADFE YEKDEQNNTITIR 
>gi|313156882|gb|AENZ01000084.1| GENE 46 59014 - 59550 596 178 aa, chain - ## HITS:1 COG:no KEGG:Bache_2470 NR:ns ## KEGG: Bache_2470 # Name: not_defined # Def: RNA polymerase, sigma-24 subunit, ECF subfamily # Organism: B.helcogenes # Pathway: not_defined # 3 176 8 182 185 121 38.0 1e-26 MEEVYAEYRTYFVRIAVSYVRDRMVAEDLVSDTFLKIWESRTETTPRNLPAYLLTALKRK CLDYLRDQAIHLRIQQNMHQTSARVVGERIARLEANDPQNLMMSEALAIIERELRRMPEQ RRRIFIAHRYEEMSYREIAAVYELSEGQVTYELRAAKEALKAALKDYLPLIGLLLNGF Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:58:40 2011 Seq name: gi|313156880|gb|AENZ01000085.1| Alistipes sp. HGB5 contig00044, whole genome shotgun sequence Length of sequence - 1858 bp Number of predicted genes - 4, with homology - 2 Number of transcription units - 4, operones - 0 average op.length - 0.0 N Tu/Op Conserved S Start End Score pairs(N/Pv) 1 1 Tu 1 . + CDS 78 - 266 188 ## + Term 502 - 540 -0.3 2 2 Tu 1 . - CDS 762 - 1208 111 ## Lbys_3001 hypothetical protein - Prom 1279 - 1338 7.2 3 3 Tu 1 . - CDS 1365 - 1628 167 ## + Prom 1181 - 1240 5.6 4 4 Tu 1 . 
+ CDS 1488 - 1857 151 ## Odosp_2382 hypothetical protein Predicted protein(s) >gi|313156880|gb|AENZ01000085.1| GENE 1 78 - 266 188 62 aa, chain + ## HITS:0 COG:no KEGG:no NR:no MVFMRNIHPNIFLPKTFRTHGRVHNLLHDNYTTHKDDSKIALRVHHRSYMFSDLYPHCEF LL >gi|313156880|gb|AENZ01000085.1| GENE 2 762 - 1208 111 148 aa, chain - ## HITS:1 COG:no KEGG:Lbys_3001 NR:ns ## KEGG: Lbys_3001 # Name: not_defined # Def: hypothetical protein # Organism: L.byssophila # Pathway: not_defined # 56 147 32 121 130 73 38.0 2e-12 MKKCICFILLICSCTCCTSTNNPRSVEIQATNVPLYDSLPHNIEGAYLHLPNNRFDFGKI NRKKIPNIPIEVEFSNIGTVPLVILKADVSCGCLSVEYPKAPILPNQTGKLIIKADLRAQ SGTFNKSVFIKTNAENDVVLLRVAGEIR >gi|313156880|gb|AENZ01000085.1| GENE 3 1365 - 1628 167 87 aa, chain - ## HITS:0 COG:no KEGG:no NR:no MVKYTFQFRAVPPGKLKRLDYLCTTTGHASGKQTNGEEKIQEKQKSHNKRNLGYANVWKY YGQPHPQLEHPVLHPHKNERFKSLHIK >gi|313156880|gb|AENZ01000085.1| GENE 4 1488 - 1857 151 123 aa, chain + ## HITS:1 COG:no KEGG:Odosp_2382 NR:ns ## KEGG: Odosp_2382 # Name: not_defined # Def: hypothetical protein # Organism: O.splanchnicus # Pathway: not_defined # 10 90 5 85 644 72 41.0 8e-12 MRFLLFLNFLFSVCLLSACMSRRSAEVIEALELAGRNRSELERILDHYAACEADSLKLRA AEFLIANMPGHQCIYGPEVDSFFCEIDSLLDCEVSDSYPIVVGLNEIADRHAPLESVIRN GIE Prediction of potential genes in microbial genomes Time: Wed Jun 22 13:59:28 2011 Seq name: gi|313156793|gb|AENZ01000086.1| Alistipes sp. HGB5 contig00005, whole genome shotgun sequence Length of sequence - 111437 bp Number of predicted genes - 88, with homology - 88 Number of transcription units - 21, operones - 18 average op.length - 4.7 N Tu/Op Conserved S Start End Score pairs(N/Pv) + Prom 436 - 495 7.5 1 1 Op 1 19/0.000 + CDS 728 - 1654 1305 ## COG0083 Homoserine kinase 2 1 Op 2 . + CDS 1670 - 2959 1669 ## COG0498 Threonine synthase + Term 2966 - 2993 0.3 - Term 3209 - 3246 1.6 3 2 Op 1 11/0.000 - CDS 3329 - 4390 1588 ## COG0473 Isocitrate/isopropylmalate dehydrogenase 4 2 Op 2 . 
- CDS 4380 - 5921 2363 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases 5 2 Op 3 30/0.000 - CDS 5932 - 6525 823 ## COG0066 3-isopropylmalate dehydratase small subunit 6 2 Op 4 6/0.000 - CDS 6538 - 7914 1667 ## COG0065 3-isopropylmalate dehydratase large subunit 7 2 Op 5 . - CDS 7916 - 9436 2190 ## COG0119 Isopropylmalate/homocitrate/citramalate synthases - Prom 9465 - 9524 4.4 - Term 9711 - 9749 9.4 8 3 Op 1 . - CDS 9781 - 11448 2237 ## gi|313156846|gb|EFR56286.1| hypothetical protein HMPREF9720_0165 9 3 Op 2 . - CDS 11462 - 11785 363 ## gi|313156810|gb|EFR56250.1| hypothetical protein HMPREF9720_0166 10 3 Op 3 . - CDS 11782 - 12342 847 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog - Prom 12515 - 12574 3.2 + Prom 12410 - 12469 2.3 11 4 Op 1 22/0.000 + CDS 12549 - 13118 486 ## PROTEIN SUPPORTED gi|29349997|ref|NP_813500.1| 50S ribosomal protein L25/general stress protein Ctc 12 4 Op 2 . + CDS 13130 - 13693 841 ## COG0193 Peptidyl-tRNA hydrolase + Term 13730 - 13789 19.1 - Term 13719 - 13774 9.1 13 5 Op 1 . - CDS 13984 - 14376 697 ## COG1188 Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) 14 5 Op 2 25/0.000 - CDS 14383 - 15843 1866 ## COG0772 Bacterial cell division membrane protein 15 5 Op 3 28/0.000 - CDS 15848 - 17197 1926 ## COG0771 UDP-N-acetylmuramoylalanine-D-glutamate ligase 16 5 Op 4 4/0.000 - CDS 17200 - 18444 2376 ## COG0472 UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase 17 5 Op 5 26/0.000 - CDS 18461 - 19918 2213 ## COG0769 UDP-N-acetylmuramyl tripeptide synthase 18 5 Op 6 . - CDS 20072 - 22336 3520 ## COG0768 Cell division protein FtsI/penicillin-binding protein 2 19 5 Op 7 . - CDS 22343 - 22858 842 ## gi|313156858|gb|EFR56298.1| hypothetical protein HMPREF9720_0177 20 5 Op 8 . 
- CDS 22911 - 23837 1303 ## COG0275 Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis - Prom 23868 - 23927 4.0 + Prom 23805 - 23864 3.7 21 6 Op 1 . + CDS 23916 - 24749 1348 ## COG3176 Putative hemolysin 22 6 Op 2 . + CDS 24774 - 25781 1651 ## COG3176 Putative hemolysin + Term 25803 - 25838 7.2 23 7 Op 1 . + CDS 25913 - 26407 291 ## gi|313156847|gb|EFR56287.1| putative lipoprotein 24 7 Op 2 . + CDS 26343 - 27410 1098 ## gi|313156852|gb|EFR56292.1| hypothetical protein HMPREF9720_0182 + Term 27449 - 27482 6.1 - Term 27437 - 27470 6.1 25 8 Op 1 . - CDS 27556 - 28101 904 ## COG1778 Low specificity phosphatase (HAD superfamily) 26 8 Op 2 . - CDS 28077 - 28856 968 ## PGN_0697 hypothetical protein 27 8 Op 3 . - CDS 28876 - 30180 2281 ## COG1295 Predicted membrane protein 28 8 Op 4 . - CDS 30255 - 31403 1480 ## Ctha_2085 hypothetical protein 29 8 Op 5 . - CDS 31400 - 32878 1618 ## COG0658 Predicted membrane metal-binding protein 30 8 Op 6 . - CDS 32970 - 33371 291 ## BDI_0904 hypothetical protein 31 8 Op 7 . - CDS 33364 - 33789 457 ## gi|291514742|emb|CBK63952.1| Holin family 32 8 Op 8 . - CDS 33764 - 34237 510 ## gi|167753868|ref|ZP_02425995.1| hypothetical protein ALIPUT_02153 - Prom 34317 - 34376 2.1 - Term 34333 - 34368 1.1 33 9 Op 1 . - CDS 34479 - 35711 1898 ## gi|313156821|gb|EFR56261.1| phage portal protein, PBSX family 34 9 Op 2 . - CDS 35708 - 36205 626 ## gi|313156841|gb|EFR56281.1| hypothetical protein HMPREF9720_0191 35 9 Op 3 . - CDS 36208 - 37185 1623 ## gi|313156857|gb|EFR56297.1| hypothetical protein HMPREF9720_0192 36 9 Op 4 . - CDS 37252 - 38133 1139 ## COG0740 Protease subunit of ATP-dependent Clp proteases 37 9 Op 5 . - CDS 38137 - 38400 400 ## gi|313156867|gb|EFR56307.1| hypothetical protein HMPREF9720_0194 38 9 Op 6 . - CDS 38397 - 40757 2613 ## gi|313156856|gb|EFR56296.1| hypothetical protein HMPREF9720_0195 39 9 Op 7 . 
- CDS 40748 - 41737 860 ## gi|313156817|gb|EFR56257.1| hypothetical protein HMPREF9720_0196 40 10 Tu 1 . - CDS 41872 - 42276 447 ## gi|167753876|ref|ZP_02426003.1| hypothetical protein ALIPUT_02161 - Prom 42395 - 42454 5.3 + Prom 42404 - 42463 6.7 41 11 Op 1 . + CDS 42527 - 43297 971 ## COG1427 Predicted periplasmic solute-binding protein 42 11 Op 2 . + CDS 43367 - 44350 1543 ## COG0598 Mg2+ and Co2+ transporters - Term 44344 - 44372 3.7 43 12 Op 1 . - CDS 44390 - 45214 543 ## BT_2655 hypothetical protein 44 12 Op 2 . - CDS 44923 - 47520 2689 ## BT_2655 hypothetical protein 45 12 Op 3 . - CDS 47538 - 48692 969 ## Bacsa_0603 hypothetical protein 46 12 Op 4 . - CDS 48705 - 49805 1324 ## BVU_0901 hypothetical protein 47 12 Op 5 . - CDS 49875 - 51602 2374 ## gi|313156864|gb|EFR56304.1| putative lipoprotein 48 12 Op 6 . - CDS 51622 - 52668 1264 ## Bacsa_1472 hypothetical protein 49 12 Op 7 . - CDS 52701 - 53039 344 ## BT_2659 hypothetical protein 50 12 Op 8 . - CDS 53052 - 53633 640 ## Bacsa_1471 hypothetical protein - Prom 53882 - 53941 2.5 + Prom 54188 - 54247 8.2 51 13 Op 1 . + CDS 54434 - 55903 2059 ## COG2244 Membrane protein involved in the export of O-antigen and teichoic acid 52 13 Op 2 . + CDS 55920 - 56105 258 ## gi|313156868|gb|EFR56308.1| hypothetical protein HMPREF9720_0211 53 13 Op 3 . + CDS 56112 - 57257 1770 ## COG0470 ATPase involved in DNA replication 54 13 Op 4 . + CDS 57261 - 57752 209 ## PROTEIN SUPPORTED gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 55 13 Op 5 . + CDS 57762 - 58727 1382 ## gi|313156796|gb|EFR56236.1| putative lipoprotein 56 13 Op 6 . + CDS 58730 - 60805 3159 ## PRU_2555 putative lipoprotein 57 13 Op 7 . + CDS 60810 - 61616 1163 ## gi|313156831|gb|EFR56271.1| outer membrane insertion signal domain protein 58 13 Op 8 . + CDS 61629 - 61970 594 ## COG0858 Ribosome-binding factor A 59 13 Op 9 . + CDS 61989 - 63200 1443 ## COG4591 ABC-type transport system, involved in lipoprotein release, permease component 60 13 Op 10 . 
+ CDS 63197 - 64024 1382 ## gi|313156806|gb|EFR56246.1| putative lipoprotein 61 13 Op 11 . + CDS 63984 - 65132 1208 ## COG0402 Cytosine deaminase and related metal-dependent hydrolases + Term 65143 - 65181 3.1 - Term 65126 - 65172 8.4 62 14 Op 1 . - CDS 65192 - 66151 1663 ## Bache_0655 glycoside hydrolase family 2 sugar binding protein 63 14 Op 2 . - CDS 66259 - 67011 1129 ## COG3250 Beta-galactosidase/beta-glucuronidase - Prom 67196 - 67255 2.9 + Prom 66973 - 67032 4.7 64 15 Op 1 6/0.000 + CDS 67247 - 67840 792 ## COG1595 DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog 65 15 Op 2 . + CDS 67911 - 68909 1188 ## COG3712 Fe2+-dicitrate sensor, membrane component 66 16 Op 1 . + CDS 69035 - 72592 5748 ## Phep_3773 TonB-dependent receptor plug 67 16 Op 2 . + CDS 72612 - 74399 2817 ## Phep_3772 RagB/SusD domain-containing protein 68 16 Op 3 . + CDS 74419 - 75738 1645 ## Phep_3774 hypothetical protein 69 16 Op 4 . + CDS 75756 - 76181 624 ## COG5492 Bacterial surface proteins containing Ig-like domains + Term 76183 - 76217 2.3 - Term 76878 - 76915 1.3 70 17 Op 1 . - CDS 76984 - 78855 1241 ## COG3525 N-acetyl-beta-hexosaminidase - Term 78868 - 78904 8.1 71 17 Op 2 . - CDS 78913 - 80433 1158 ## gi|313156832|gb|EFR56272.1| hypothetical protein HMPREF9720_0230 72 18 Op 1 . - CDS 80569 - 82020 1436 ## gi|313156826|gb|EFR56266.1| hypothetical protein HMPREF9720_0231 73 18 Op 2 . - CDS 82071 - 84344 2227 ## BVU_0121 glycoside hydrolase family protein 74 18 Op 3 . - CDS 84354 - 85844 1459 ## BVU_0121 glycoside hydrolase family protein 75 18 Op 4 . - CDS 85847 - 87580 1199 ## COG2730 Endoglucanase 76 18 Op 5 . - CDS 87610 - 88482 796 ## Bache_0346 coagulation factor 5/8 type domain protein 77 18 Op 6 . - CDS 88512 - 90362 1926 ## Bache_0347 RagB/SusD domain protein 78 18 Op 7 . - CDS 90385 - 93477 3133 ## BT_3012 hypothetical protein - Prom 93679 - 93738 6.3 + Prom 93564 - 93623 5.2 79 19 Tu 1 . 
+ CDS 93719 - 94651 591 ## COG1409 Predicted phosphohydrolases + Term 94774 - 94808 2.0 - Term 94494 - 94520 -0.6 80 20 Op 1 . - CDS 94711 - 95547 428 ## COG3568 Metal-dependent hydrolase 81 20 Op 2 . - CDS 95556 - 97550 1424 ## COG1554 Trehalose and maltose hydrolases (possible phosphorylases) 82 20 Op 3 . - CDS 97555 - 99570 1524 ## BF1323 alpha-glucosidase 83 20 Op 4 . - CDS 99620 - 100732 655 ## COG4299 Uncharacterized conserved protein 84 20 Op 5 . - CDS 100762 - 102348 680 ## COG3525 N-acetyl-beta-hexosaminidase 85 20 Op 6 1/0.000 - CDS 102371 - 104659 1533 ## COG1472 Beta-glucosidase-related glycosidases 86 20 Op 7 . - CDS 104694 - 107024 1499 ## COG1472 Beta-glucosidase-related glycosidases 87 20 Op 8 . - CDS 107035 - 108882 1067 ## COG3525 N-acetyl-beta-hexosaminidase - Term 108940 - 108973 1.2 88 21 Tu 1 . - CDS 109020 - 111437 1796 ## COG0642 Signal transduction histidine kinase Predicted protein(s) >gi|313156793|gb|AENZ01000086.1| GENE 1 728 - 1654 1305 308 aa, chain + ## HITS:1 COG:STM0003 KEGG:ns NR:ns ## COG: STM0003 COG0083 # Protein_GI_number: 16763393 # Func_class: E Amino acid transport and metabolism # Function: Homoserine kinase # Organism: Salmonella typhimurium LT2 # 4 306 2 307 309 157 33.0 3e-38 MKHIKVFAPGTVANLGCGFDIMGLTLDGVGDRIEVAAEAGASGLAIRNESGKRLPEKPED NVITPAVAAMLEAYGKPVQVEITILEKIAPGSGIGSSAASSAAAVYGLNELLGRPFAAER LVEFAMMGEALIGGTPHADNVGPALLGGVVLIRGYGPLDIIRLPVPDDFFYAVAHPDIVV GTKEAREVLPREIPMAHAVTQWGNVGGLVAGLALGDVALIGRSMHDVVAEPYRKGFIPGY DRLKEGVLSEGALAMNIAGSGPSVFALAAEGSVSERVAERMKSHFAGLGIGCNIYAGRVS NRGARIEE >gi|313156793|gb|AENZ01000086.1| GENE 2 1670 - 2959 1669 429 aa, chain + ## HITS:1 COG:YPO0461 KEGG:ns NR:ns ## COG: YPO0461 COG0498 # Protein_GI_number: 16120790 # Func_class: E Amino acid transport and metabolism # Function: Threonine synthase # Organism: Yersinia pestis # 1 426 1 424 429 307 39.0 2e-83 MKYYSTRDKALARPCSLREAVEAGLAPDGGLFVPERIPQADMAVAERLAGESYAALAGYL AALFFGEDIDRGILQREIDRIYDFQVPLRPVGSRYTLELFHGPTFAFKDFGAGFMGRMVG 
LLGGADEKLVILTATSGDTGSAVAHGFYDVPGVEVVLLYPEGKISRLQECQMTALGGNIH PLRVGGTFDDCQRLVKELFADSSFRAAHRVSSANSINLLRWIPQAFYYFYGYFRWRQASG GENPVVVVPSGNYGNLSAGMLARRMGLPLGGFVAASNANDVVPEFLRTGDYRPRPSVRTL ANAMDVGAPSNFERMMWLCGGDADALRGELLGFRCDDAMIRRTIDGLYRNYGYLSDPHSA VGYAASVACGRPGFYLSTAHPAKFGEVISPVTGAKVPLPPRLAELVKRPRVSEPMAVDLG ALEEYVAGV >gi|313156793|gb|AENZ01000086.1| GENE 3 3329 - 4390 1588 353 aa, chain - ## HITS:1 COG:aq_244 KEGG:ns NR:ns ## COG: aq_244 COG0473 # Protein_GI_number: 15605790 # Func_class: C Energy production and conversion; E Amino acid transport and metabolism # Function: Isocitrate/isopropylmalate dehydrogenase # Organism: Aquifex aeolicus # 5 349 6 352 364 378 53.0 1e-105 MNVNIALLAGDGIGPEIIAQAVKALDHVARKYAHNFTYREALVGACAIDATGDPYPEETH KICREADAVLFGAIGDPKYDNDPKAPVRPEQGLLRMRKSLGLFANLRPIALFDTLADRSP LKAEVVRGTDFVCVRELTGGIYFGRPQGRDEGGDRAVDTCTYSRQEIERVLHVAFRLAMS RRRHLTVVDKANVLETSRFWREIAQQLAPQYPEVELDFMFVDNAAMQIIRQPTHFDVIVT ENMFGDILTDEASVISGSLGMLPSASVGSEVALFEPIHGSYPQAAGKNIANPMAAILSAA MLLEHLGLTAEGSAVRDAVNRALTEGIVTEDLAAQGERRYSTSEVGDFIAAHI >gi|313156793|gb|AENZ01000086.1| GENE 4 4380 - 5921 2363 513 aa, chain - ## HITS:1 COG:MK0391 KEGG:ns NR:ns ## COG: MK0391 COG0119 # Protein_GI_number: 20093829 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Methanopyrus kandleri AV19 # 5 502 5 491 499 233 35.0 7e-61 MLPPVEIMDTTLRDGEQTSGVSFNAREKLSIARLLLEELRVSRIEIASARVSQGEQEAVR RIAEWAASKGYSDRIEALGFIDGGISLDWLGDAGCRVVNLLTKGSLKHCREQLRRTPEEH LGDIRATIDKAVARGMRVNIYLEDWSNGMQHSPEYVYAMLDALADAPVQRFMCPDTLGVL DPYATERFCGDLVRRYPRLRFDFHAHNDYDLAVANTAAAVKAGFHGVHVTVNGLGERAGN APLSSTVAVLHDRLHRATGIDETKINHVSRMVETYTGIRIPANRPIVGESVFTQCAGIHA DGDNKNNLYFNELLPERFGRVREYALGKTSGKANIMKNLETLGIDLDEASMRKVTERVVE LSDKKELVTAEDLPYIIADVLRYDLAENPVRILNYSLSLAQGLHPAASLKIEINGREYEQ TSAGDGQYDAFMRALRQIYEHQLRREFPMLRNYTVSIPPGGRTDAFVQTIITWEMGGKVF KTRGLDADQTEAAIKATIKMLNIIETNYNDHEC >gi|313156793|gb|AENZ01000086.1| GENE 5 5932 - 6525 823 197 aa, chain - ## HITS:1 
COG:BH3055 KEGG:ns NR:ns ## COG: BH3055 COG0066 # Protein_GI_number: 15615617 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase small subunit # Organism: Bacillus halodurans # 20 196 18 193 194 175 51.0 5e-44 MSIPKFETFTSGAVPVRTENIDTDQIIPARFLKATERKGFGDNLFRDWRYDAEGRKVASF PLNDSRYEGRILVAGRNFGCGSSREHAAWAIADYGFRVVVSSFFADIFRNNALNNGLLPI TVSDNFLAAIFAAIADDPAARFTVDLEGQTLTAEATGRSERFEIDAYKKRCLQNGYDDVD YLCSISDQIRRFEAARK >gi|313156793|gb|AENZ01000086.1| GENE 6 6538 - 7914 1667 458 aa, chain - ## HITS:1 COG:SMc03823 KEGG:ns NR:ns ## COG: SMc03823 COG0065 # Protein_GI_number: 15966959 # Func_class: E Amino acid transport and metabolism # Function: 3-isopropylmalate dehydratase large subunit # Organism: Sinorhizobium meliloti # 3 454 5 467 469 496 56.0 1e-140 MGKTLLDKIWDAHTVRTEDGGRCVLYIDRQYIHEVTSPVAFAGLERRGIGVARPAQITAT ADHNIPTVDQHLPIAEPESRRQVEALAANCARFGIEHFGVGDPRQGIVHIIGPELGFTQP GMTIVCGDSHTSTHGALGAVAFGVGTSEVEMVFASQCIIQSKPRSMRITVDGTLRPGVEA KDIILYIISRLTASGGTGYFIEFAGDAIRSLSMEGRMTVCNMSIECGARGGMIAPDQTTF DYLRGRERAPQGAAFDRAAARWRELYSDADAVFDKEYRFDAADIAPMITYGTNPGMGMAV DAAIPADADDKALKYMGFESGETLQGKPVDYVFVGSCTNGRIEDLRLFARLVESRRKADN ITAWIVPGSKGVEAAARAEGLDRILADAGFELRQPGCSACLAMNADKIPAGKYCVSTSNR NFEGRQGPGARTMLSGIAVAAAAAVCGRIEDPRKVFEM >gi|313156793|gb|AENZ01000086.1| GENE 7 7916 - 9436 2190 506 aa, chain - ## HITS:1 COG:aq_2090 KEGG:ns NR:ns ## COG: aq_2090 COG0119 # Protein_GI_number: 15607049 # Func_class: E Amino acid transport and metabolism # Function: Isopropylmalate/homocitrate/citramalate synthases # Organism: Aquifex aeolicus # 1 493 5 502 524 451 49.0 1e-126 MCEKLFIFDTTLRDGEQVPGCQLNTTEKIEVAKLLESLGVDIIEAGFPISSPGDFNSVLE ISKAVSAPTICALTRAVKKDIDVAADALRLAKHKRIHTGIGVSPQHIYDKLRSTPEKIIE SAVEAVKYAKRYVEDVEFYAEDAGRADPEYLARVIEAVIAAGATVVNIPDTTGYCLPGEY GAKIKYLVDHVSNIDRAIISTHCHNDLGMATANTLSGILNGARQAEVTINGIGERAGNTS LEEVVMTLRCHKELNVDTNIDAQLITKASHLVSSLMNMPVQPNKAIVGRNAFAHSSGIHQ DGVLKDRQTYEIIDPQDVGLNQSVIALTARSGRAALVHRLELLGYNLTQEETDDTYAKFL 
ELADRKKEIHDYDLLYLVGDIDRMKQQSLSLKFLQVTTGTLVPTATVVLKFGDHERMAIA TGNGPVDAAVSAIKTLINEKVVLTEFLMQAITKGSNDVGRVHVQVQCGSRTVHGFAAHTD TTRASIEAFLDALRILNVTERKEKEA >gi|313156793|gb|AENZ01000086.1| GENE 8 9781 - 11448 2237 555 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156846|gb|EFR56286.1| ## NR: gi|313156846|gb|EFR56286.1| hypothetical protein HMPREF9720_0165 [Alistipes sp. HGB5] # 1 555 1 555 555 871 100.0 0 MKHILLFLCALLFALTGNTAPACNEPEQLLAEARTATNAAGSVSSGEMERAMAEARRTME TTGREIERAVAEARRATELSDREIARAVTEARQAIDAAERIDLANQSLEELNKAAREQIV RELGLSTRQRREFEPIYKAYREALDKAVDARAGASGADEATQKNSLKAKLSNIAATAQVK RDYVDKFAAVLTAEQIRRLYNTEGEIGTNIKRAAFDRSSRTRSGRLKGSGRMVTQDWGKA GDYTGISAAAFFDITVSPAAKTISVTADDNVIDYLVLERDGGKLKFRVNANSTENISVSV TVPASASLREISAGSYGKVNCKMPLKGPSVSVSVSSYGSVSADIDTPGAAKLDVSSYGKF AGSVRCSDGELRISSYGSAQAPVECRNSCKLTVGSYAKFSNDIKASDLTVEVSSGASVGS TLTADALTMRIDSYAKFSGTVTVNARQAKLTVSSGGSFNGTFSGSSLEASVGSYGKIYLK GAAQVADATVRVSSGANFSAPELRVSDYDLTVSNYAKADVWCSGRLKINASTAAKVTYGG PCTVETVSDNIQRRK >gi|313156793|gb|AENZ01000086.1| GENE 9 11462 - 11785 363 107 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156810|gb|EFR56250.1| ## NR: gi|313156810|gb|EFR56250.1| hypothetical protein HMPREF9720_0166 [Alistipes sp. 
HGB5] # 1 107 1 107 107 161 100.0 1e-38 MNREFENSGREMPYRLPPQSLEALHERILSRTSRRPASPPRTVRRYCLTAAAAAAVLVMG LLVTEYRTRRADTPAPDLEQMLATTPAETLRQAAAENYDDILYNQQL >gi|313156793|gb|AENZ01000086.1| GENE 10 11782 - 12342 847 186 aa, chain - ## HITS:1 COG:BS_sigW KEGG:ns NR:ns ## COG: BS_sigW COG1595 # Protein_GI_number: 16077241 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Bacillus subtilis # 24 182 19 184 187 71 31.0 7e-13 METTRDISAEELAALRDPGARHGAFDRLVGAYLRPLYWHARRLVVVHEDAEDVVQETFVR AYDRIGTFRGGNDEIGAWLYRIATNTALTLLRRRKTGLFASLDEVSRTLAGRVAEECGED ADQMQVRFQQAVLELPLKQRLVFNLRYYDGMSYGQIARVLGQREQTLKVNYHYAVQKLKE KLTQEI >gi|313156793|gb|AENZ01000086.1| GENE 11 12549 - 13118 486 189 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|29349997|ref|NP_813500.1| 50S ribosomal protein L25/general stress protein Ctc [Bacteroides thetaiotaomicron VPI-5482] # 1 187 1 192 196 191 49 1e-47 MKTISVKAVKREEYGKKAAKAVRREGMVPCVLSGHGESVAFSVDPREIKPLIYTPNSYIV EFDFDGKKEQAVLREAQFHPVREQILHLDFYRIADGKPVSIAIPVRLTGNAEGVKVGGKL ALSARKLVVSALVENLPDELVVDVTTLGVGKTIFVGDLKFENLKFVTPATTAVCAVRVTR ASRGAAAQD >gi|313156793|gb|AENZ01000086.1| GENE 12 13130 - 13693 841 187 aa, chain + ## HITS:1 COG:BH0068 KEGG:ns NR:ns ## COG: BH0068 COG0193 # Protein_GI_number: 15612631 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Peptidyl-tRNA hydrolase # Organism: Bacillus halodurans # 4 185 3 185 185 145 43.0 3e-35 MKYLIVGLGNIGAEYAGTRHNIGFNVLDALAEASNAVFTTARYGDVAQVKHKGRTLILLK PSTYMNLSGKAVRYWMEAEKIAPENLLVVSDDIALPFGTLRLRPKGSAGGHNGLKNIAEL LGTENFARMRFGVGGDFPKGHQVDYVLGEWSEEDRKAMPERLKVFCDAILSFATIGVERT MNFFNKK >gi|313156793|gb|AENZ01000086.1| GENE 13 13984 - 14376 697 130 aa, chain - ## HITS:1 COG:Cgl2072 KEGG:ns NR:ns ## COG: Cgl2072 COG1188 # Protein_GI_number: 19553322 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-associated heat shock protein implicated in the recycling of the 50S subunit (S4 paralog) # 
Organism: Corynebacterium glutamicum # 4 119 9 120 126 71 38.0 3e-13 MDDIRLDKYLWAVRVFKTRSDAADAVRNNKVTVNGAYAKPSREVKIGDVIAVRRMQVTYS YKVLDLVSSRQPAKNVSLYCLNVTPQEELDKLNVPRETVFVFRDRGTGRPTKKERRELDG LMDELYYDEE >gi|313156793|gb|AENZ01000086.1| GENE 14 14383 - 15843 1866 486 aa, chain - ## HITS:1 COG:lin1059 KEGG:ns NR:ns ## COG: lin1059 COG0772 # Protein_GI_number: 16800128 # Func_class: D Cell cycle control, cell division, chromosome partitioning # Function: Bacterial cell division membrane protein # Organism: Listeria innocua # 87 470 10 378 400 127 26.0 5e-29 MKFGERKQTQGAQQAQEPWSETAGNAPRTAGAAAAETAPHGSEGRGGRGKSRSGAEYGTA ADEENPQAANACGETAEKPKFRLFTGDRVLWIIIAVLAVVSVLVVYSSTAKMAYDAHTAR TTAHFLRQQLMILIVSLVVMVAVQKINCRIYNLFSRPVYILSVLLTVAVYFIGATTNGAA RWIPLGPFQFQPSEALKVATVLFLASQLAGRQSKIDKIRIVPSLRFWTWRSSREQRRIWR EGTWPILMPVVVSCTVIFPAHTSSAVLVFLASWVMMLIGRVRFGELMKLVGLACVGIVLI MTLNLGRSETAEGRVSTWIHLWTRSQTDKPIEHLTDTERSMIAIYNGGIFGEGAGQSAMR VEMIHPESDYAYAFFVEEYGIVLAIALLMLYLWIFFRGIEIFRRCGTAFPGLLVLGLALL ITCQALLHIMVTVNLIPETGQTLPLISRGGSSTLFTTIALGMILSVSRQNDEQSHDTPRS ESIYEK >gi|313156793|gb|AENZ01000086.1| GENE 15 15848 - 17197 1926 449 aa, chain - ## HITS:1 COG:SA1026 KEGG:ns NR:ns ## COG: SA1026 COG0771 # Protein_GI_number: 15926766 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramoylalanine-D-glutamate ligase # Organism: Staphylococcus aureus N315 # 2 441 10 438 449 228 34.0 1e-59 MKKIAVLGGGISGYGSAILAKKKGFDVFLSDAGRIADRYKAKLDVWQVPYEEGGHTEERI LDAAEVVKSPGIPETAPLVRKLRAAGIPVISEIEFAGRYTGRAKCICITGSNGKTTTTSL IYKIMRDAGMRVALGGNIGESFAYSVATGDYDWYVLELSSFQLDGMYKFRAHIGVLMNIT PDHLDRYDHCFQNYADSKMRITQNQTSRDYFVYSGDDEVIWQQLPKYDLRMKQLPFAAKN AVASGAGDAFLCDCKFTAAVGKASVEIDTAKLQIKGLHNAYNAMAAALATLAAGIAPAAI RRSLYGFAPVEHRLEPVRESGGVLWINDSKATNVDSVYYALESMKRPVVWIAGGTDKGND YGPLKDFAREKVHTLVCMGIDNAKLLREFTGVVPEVISTDSLDAAMTAAKAAARPGDAVL LSPACASFDLFKNYENRGELFKAWVNEKA >gi|313156793|gb|AENZ01000086.1| GENE 16 17200 - 18444 2376 414 aa, chain - ## HITS:1 COG:YPO0552 KEGG:ns NR:ns ## COG: YPO0552 COG0472 
# Protein_GI_number: 16120880 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl pentapeptide phosphotransferase/UDP-N-acetylglucosamine-1-phosphate transferase # Organism: Yersinia pestis # 1 414 1 360 360 257 38.0 3e-68 MLYYIFKYLDEAYNLPGSGMFQYISFRAAASIILALLIVIIFGRKIIDFLRRKQIGEEVR DLGLEGQLQKRGTPTMGGVIILLAILVPILLLGRLDNIYIQLMIVSTVWLGLIGGLDDYI KVFRHNKEGLKGRFKIVGQVGLGIIVGTTMCVSQEIVVRDKVVQPVQTVYMNEDGSVLET VHRNVVLSSESLKTTQTTIPFIKDNEFDYGWLTGGNTTMTWLLYVLVAIFVVTAVSNGAN LTDGLDGLATGVSVPIVAVLGVLAYLSGHIVYADYLNIMYIPDSGELVVFAAALVGALVG FLWYNSFPAQIFMGDTGSLSIGGIIAVFALCIRKELLLPLLCGVFLVESFSVMMQVGYFK YTKRKYGEGHRLLLMAPVHHHFQKKGQFETKIVLRFWIISLLLAAITLVTLKIR >gi|313156793|gb|AENZ01000086.1| GENE 17 18461 - 19918 2213 485 aa, chain - ## HITS:1 COG:CAC2129 KEGG:ns NR:ns ## COG: CAC2129 COG0769 # Protein_GI_number: 15895398 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: UDP-N-acetylmuramyl tripeptide synthase # Organism: Clostridium acetobutylicum # 4 476 3 474 482 342 37.0 9e-94 MKTLKELLRNTPVTALHGDDSAAVAGLVYDSRAVGPGDCFFAVRGTQNDGHDFIPAAVAK GAAAVVCERLPEQTAADVVYVAVPDAAGALADMAAAFYDHPSRALKLVGITGTNGKTTTV TLLYDLVRALGHKAGLISTVVYKVGERTVGATHTTPDPVRLNAMMREMADEGCEYCFMEC SSHAIVQERTRGLHFAGGIFSNITHDHLDYHKTFAEYIRAKKLFFDSLSKEAFALTNADD KNGRVMVQNTAAAIHTYSLRSMADFRCKIVEMHPDGMLLRIDGQEVWVGLVGRFNAYNLL AVYGAAVLLGFDRGETLRAASMLHPVSGRFELVRSANGVTAVVDYAHTPDALENVIRTIE EIRTPAQKLIVVCGCGGDRDRTKRPEMAQIAVRYADTAVFTSDNPRHESPEAILDQMAAG LEPGARFLRIADRAEAIRTAVMLSQPGDILLVAGKGHETYQIVGDVKHPFDDREEVRKAF DSFGK >gi|313156793|gb|AENZ01000086.1| GENE 18 20072 - 22336 3520 754 aa, chain - ## HITS:1 COG:NMA2072 KEGG:ns NR:ns ## COG: NMA2072 COG0768 # Protein_GI_number: 15794948 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Cell division protein FtsI/penicillin-binding protein 2 # Organism: Neisseria meningitidis Z2491 # 215 626 166 557 581 130 30.0 1e-29 MKEERSKIKSDILLRVRLLYVLFILAGAVVLGRLVWVQLFSSEVAYNAERLASRIFTEET 
IPAQRGSILSRGGAPLATSIFRYQAAFDFASPGLDSLKTFREQSDSLAKLLAAFFRDKSA AEYSRKFREEHARHYRLVNGRDTTYLRSEGWFSRLMDRLRGEEFATRRIYDTIRDHTPVN IFPREVDYAEWQTLRRYPLLNWNMGMVYRLVERDQRVYPQGELARRTIGLTGDKGNYGLE EAYREELAGRDGKALRQRIARGFYGRVAGGGHEDPEDGYDVVTTLDLDLQDVADKALRRQ LEAQNAIWGTTIVMEVHTGEILAMANLGRAGTEGAFYERENYALGRSMEPGSTFKLATML TLLDDADMSPQTAYDTHNGDPVTVGPAKNIRDSHRGDHVIDFRRAVASSSNVYFAKAIWE RYGITGKKQEYSDFLHEKLRLGQTVGLERLGERAPSITTDWKVPDPGVMLVKMSYGYRVR LAPIQMITFYNAIANGGKMISPVLVRELRRGDRVEERFESRTLASSICSRSALREVQRCL ELVCTEGTASLYFRDTARLRVAAKTGTAQITDARSREGRYYLGSMIAYFPADNPRYTVLT TIETRAQAGKAYYGGPLAGPVVKRMVDYIYNRNRDWYGRVERHGPRRHPDRIKGGDIAQI RRVADKLSPRASFDHRTGWGRVQVDSLSNVVITSLPAETGTMPDVRGMGLKDALFVLESR GLKVRFSGRGAVVRQSVAPGARITPGATVAITLN >gi|313156793|gb|AENZ01000086.1| GENE 19 22343 - 22858 842 171 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156858|gb|EFR56298.1| ## NR: gi|313156858|gb|EFR56298.1| hypothetical protein HMPREF9720_0177 [Alistipes sp. HGB5] # 1 171 1 171 171 224 100.0 2e-57 MFKDHEFDPVTPEEEARRRQEEEFARRVRREVLRIERGEADEDIRADMEREEEEKAEEEE RQRRERRRKASTFWQLFSGTILVHEGVSKYYPYMLSIAGMFFLSIAVMFTTLHLDMKYSR LEREVQILRERSIRLQEQRYRRTTHSAIVERLRERDIELTDPPAPGEIIEN >gi|313156793|gb|AENZ01000086.1| GENE 20 22911 - 23837 1303 308 aa, chain - ## HITS:1 COG:SP0334 KEGG:ns NR:ns ## COG: SP0334 COG0275 # Protein_GI_number: 15900265 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: Predicted S-adenosylmethionine-dependent methyltransferase involved in cell envelope biogenesis # Organism: Streptococcus pneumoniae TIGR4 # 9 304 6 313 316 239 44.0 4e-63 MSATNMSQYHTPVLLKESVDLLAVDPAGTYVDLTFGGGGHSRRILEKLGPGGRLYAFDQD RDTRANCPDDGRFHYVESNFRFLRGALRLREVEQVDGILADLGVSSHHFDAVERGFSFRG EAPLDMRMNQRGALTAARVVNGYPADALTRLLGEWGELETPWKIANCIVRAREAAPIETT AQLVEAVKPCTPKKDESKFLTKLFQALRIEVNGEMEALKMALEQSLKVLRPGGRLVVISY HSLEDRLVKNFLRSGNFSGSVEKDFFGRALTPFEIITRKAVVPTAEELERNPRSRSAKLR AAAKRENE >gi|313156793|gb|AENZ01000086.1| GENE 21 23916 - 24749 1348 277 aa, chain + ## HITS:1 COG:VCA0646 KEGG:ns NR:ns ## COG: VCA0646 COG3176 # 
Protein_GI_number: 15601404 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 16 266 32 287 605 94 26.0 3e-19 MPRIDIGAVLADKAPRLARWIPGFVIGWLRRTIHESEINYILEHYWNLPPQEFIRACFRE WQVTYTVEGLEKLDPKGRYLFAANHPFGGMDGMMLADKLIDRFGDARVVVNDLLMHLEPL RPLWIPVNKHGSQNSAYARKFDEEFVGEVPILTFPAGLCSRCIGGEVTDLPWKTNFLKKA YASQREIVPVFVEGRLSNFFYRVARLRVMLGLKLNIEMLWLPDEMFSQKGRHFRIFVGDP IPVSELQRYGGLREQAEFVRKKAYFLENMLAPEPEKR >gi|313156793|gb|AENZ01000086.1| GENE 22 24774 - 25781 1651 335 aa, chain + ## HITS:1 COG:VC0489 KEGG:ns NR:ns ## COG: VC0489 COG3176 # Protein_GI_number: 15640516 # Func_class: R General function prediction only # Function: Putative hemolysin # Organism: Vibrio cholerae # 37 313 310 576 586 106 30.0 5e-23 MQQRQMEAIIPPVEREQLRAELTPERKARDTKKAGNEIYIFSAAECPSLMREIGRLREIA FRGAGGGTGKELDIDAEDLAEDGYYQLIVWDPAAEEIVGGYRFIICTDEYPRHLSTEHYF RFSDKFRRKYLPRTIELGRSFVQPSYQARGNAKSIYALDNLWDGLGALIVLDPRAKYLFG KVTMYTTYKAVARNALIWFLRRYFPDRDNLVEGIHPLRLDLDDPYYEQLFTGQTYQENYR ILIQKIREFNENIPPLINAYMNLSPTMRVFDTVSNPDFGGVEETGILVTIRDIYPEKRER YTRWLGWQANLQYRRELFRKRVREHLDRIKKRKGE >gi|313156793|gb|AENZ01000086.1| GENE 23 25913 - 26407 291 164 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156847|gb|EFR56287.1| ## NR: gi|313156847|gb|EFR56287.1| putative lipoprotein [Alistipes sp. HGB5] # 1 164 1 164 164 308 100.0 1e-82 MKRLFSALLCLALIAGSAACSDDETTRPVLVVNAGESFLSLDALARAGKISVTSSLPWRV TAAAGDKWFTLSATEGPAGYSEIEVTFDRNEGPARSAQLAFASEDLIVPFVMSQSAPKDG YDARRITIFMPDSARCRPLCRIACAVPRQTELFLLYAFENVRPG >gi|313156793|gb|AENZ01000086.1| GENE 24 26343 - 27410 1098 355 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156852|gb|EFR56292.1| ## NR: gi|313156852|gb|EFR56292.1| hypothetical protein HMPREF9720_0182 [Alistipes sp. 
HGB5] # 47 355 1 309 309 631 100.0 1e-179 MLSHDKPSYFYYTRSRTFDPDKFPPHVKVTTAADRTSDATEAEKDAMCREMKRRVLEINK ADPTAVFSLYVDDLRCRLGYDWFVAQGIDSARVKVSLLSDGTATYNNFYNYFGDPATAQQ NWETYAAEVEALDWNHGGRYPETRAEFELESWTWPYYLATRPGYRLVMQNGALLESSCPF ITDKLREMQIEDIQPYEMLSALSESARRLFYDMAGFDYDKFAALFDASPKGNLIIIGTSH QNAASEQQQRDYVARIVGQYGAAYDIFFKPHPADTSSASYETDFPGLTLLPGQMPFEIFV WSLMDHVDMIGGYPSTVFLTVPVDKVRFIFAPDAESLVRPLNLLFRDAANVEWMQ >gi|313156793|gb|AENZ01000086.1| GENE 25 27556 - 28101 904 181 aa, chain - ## HITS:1 COG:FN0213 KEGG:ns NR:ns ## COG: FN0213 COG1778 # Protein_GI_number: 19703558 # Func_class: R General function prediction only # Function: Low specificity phosphatase (HAD superfamily) # Organism: Fusobacterium nucleatum # 14 167 7 160 168 110 35.0 2e-24 MGNFKEDIARTEAFVFDVDGVMTDGGIIPTLDGDFIRRYNAKDGYALAYAIKMGYKVCII TGGRGKTLENRLKMLGVTRAYIDCMDKISALHEYFAEEGIDPRNAIYMGDDIPDLECMRE VGIPVCPADAAAEVIEASRYVSEFKGGEGAVRDIVEQVLRARGDWAKNSEGVTPSSLAAS R >gi|313156793|gb|AENZ01000086.1| GENE 26 28077 - 28856 968 259 aa, chain - ## HITS:1 COG:no KEGG:PGN_0697 NR:ns ## KEGG: PGN_0697 # Name: not_defined # Def: hypothetical protein # Organism: P.gingivalis_ATCC33277 # Pathway: not_defined # 3 252 13 261 270 192 45.0 1e-47 MERAVIIGSGNLAEALAQAVARSGLKLVQLFARNAQRGKTVAALAGTQWTSDPAGLAEAD IYLIAVSDRAVAEAAASLPIPAGAVVAHTAGSVPLEALPDRPTTRRAVFYPLQTFTKGRE VDFSQIPVFLETDDEALRPELEAFARRLSRTVIWADSACRAKAHLAAVFACNFVNHMYAV AERIAGSAGLPFDVLKPLIAQTAAKALDAASPADVQTGPAVRNDTGTRARHCALLDDDLQ LKNIYSIISNSIWETSKKI >gi|313156793|gb|AENZ01000086.1| GENE 27 28876 - 30180 2281 434 aa, chain - ## HITS:1 COG:FN1154 KEGG:ns NR:ns ## COG: FN1154 COG1295 # Protein_GI_number: 19704489 # Func_class: S Function unknown # Function: Predicted membrane protein # Organism: Fusobacterium nucleatum # 55 361 20 326 396 138 26.0 2e-32 MKISEIFTYFTDTIFRRDINEWRNPVIRWLVQQYRLLFYTARGLLEHGMIVRSAALTFYT LMSLVPIVAVVFAVVKGFGLADGLIDNLYALFPQNPEIIDYVVGFAEKALARTQGGVVAA VALVMLFWAIIRVFGSIESAFNNIWEVKVERSVTRQYTDYIAVVMIVPVLWVVANAVGNY 
TQQLLGFDGSWYFDLLSRFASMFIIWVMFTILYIIIPNTKVKFKSALMAGIVAGTLFLLF QWGYIYIQRWMTSYNAIYGSFAALPLLLIWLQTSWQILLFGGELSFAYQNIARFGEERES LLISYDQRRKILLAVMLSVVRHFREKGGATPADVIRARLGLPTRIVNDVLYQLVQAGQLI AVPSGDGEREVAFAPAHDTGTLTVYGVLEAVEASGQTTVDLARNAELTRIDRELENLKET ARKSQDNVRLVDLL >gi|313156793|gb|AENZ01000086.1| GENE 28 30255 - 31403 1480 382 aa, chain - ## HITS:1 COG:no KEGG:Ctha_2085 NR:ns ## KEGG: Ctha_2085 # Name: not_defined # Def: hypothetical protein # Organism: C.thalassium # Pathway: not_defined # 6 371 5 388 421 174 33.0 4e-42 MMTPEEYELLLTDEVQRAIAVSRGRDPLDVALDRTVPHARLVATQVKYLARAASKLPSYA AAQCILPPLAFEQASSEACAAHKRIDGDTALDLTCGLGVDALFLSRRFRRVVTLERSETL ARIAAENFRRLQAANIEVVHTSAEEYLQRPGLRFDWIFADPDRRSADGRKLVRLEACSPD VTALMPLIARASGRLCIKNSPLFDVDEALRLFPDSRVEVVSLGDECKEVLVYADGTGPAL TATALGRGSFTATPGQTPPPAPDTFDPSRYRWLVVPDVALQKARLARLHLRGKADIWSEN GYGFAAEEPQEVIGKVFAVESIEPYDPRRLKRSLKGAGAELMKRDFPFGAEELARRLGVH AGADVRLAFTKIGNGFWAVRLK >gi|313156793|gb|AENZ01000086.1| GENE 29 31400 - 32878 1618 492 aa, chain - ## HITS:1 COG:all1983_1 KEGG:ns NR:ns ## COG: all1983_1 COG0658 # Protein_GI_number: 17229475 # Func_class: R General function prediction only # Function: Predicted membrane metal-binding protein # Organism: Nostoc sp. 
PCC 7120 # 127 401 182 453 612 81 28.0 3e-15 MKSEFLLAKLDRMPMLKAVVPFAAGILAADRFALPLWFLAGTFLLSGVSALLLRSPLCTL AMLFTAGFGAAQLRDTGRTVPYGVYTAYELSVEGIPADRGRYASAEATVTAWRDPADGTW HPAGDRVTLHADSLTALHPGERLRCRGTVRPLRGGAESYRRLMTRRGFAGTIWLSERTIL ERLPARNTALHLHAAERMGRLRIPGDAGAVCRAMTAGDRSGITPELRAAYSRSGLSHLLA VSGLHTGIVFALVNLLLWWLPLLRRGHLLRNLLAAACIWIYVAAAGFPPSAVRAAVMFTM LQSALASASEYNGLNALAAAAFGMLLWNPAWLGDISFQLSFAAVAAILAWGVPLCRRLRT RRRALNPITDALAVSLAATLATAPLVSHTFGIVPLAGVVVNPAAILLGSVVVLAGALWML LPVGWAAPAFECVLSHTAEGLNALARVTADLPCTAAEYALGGGATAAVYLFFTLATVAAW SAEPKKSVHLPA >gi|313156793|gb|AENZ01000086.1| GENE 30 32970 - 33371 291 133 aa, chain - ## HITS:1 COG:no KEGG:BDI_0904 NR:ns ## KEGG: BDI_0904 # Name: not_defined # Def: hypothetical protein # Organism: P.distasonis # Pathway: not_defined # 1 131 6 135 135 132 52.0 5e-30 MSRGLSNRNPGNIRQSAVRYKGEVRPSRDPAFKQFESMPWGYRAIFVLLDTYRIRHGLDT IRGMISRWAPPSENRTEIYIRAVADAVGIADDRPVDTRDRTTMLRMAAAISRVENGTAAD MDDVERGWELFRQ >gi|313156793|gb|AENZ01000086.1| GENE 31 33364 - 33789 457 141 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|291514742|emb|CBK63952.1| ## NR: gi|291514742|emb|CBK63952.1| Holin family [Alistipes shahii WAL 8301] # 1 141 1 141 141 218 83.0 9e-56 METLYRLVSGAAAGIAALFAPIGPLVACTVVFIGIDFLSGVAASRAAARREGRAWYFESR EAWRTVLKLGLTITAIAMAWMIDSCILDFMGLNVARLFTGFTCGVELWSFLENASQLSDA PLFRWLRRYVRRRIRREAGDE >gi|313156793|gb|AENZ01000086.1| GENE 32 33764 - 34237 510 157 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167753868|ref|ZP_02425995.1| ## NR: gi|167753868|ref|ZP_02425995.1| hypothetical protein ALIPUT_02153 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02153 [Alistipes putredinis DSM 17216] # 1 154 1 154 157 120 44.0 3e-26 MNTLITPAQAVALAFADGEYLAPESVTQSDIAAAEQRYIVPVIGRRLYEKLLAGSHASFT TEYLAAPAAFFTRIALQPRLDVRTGQCGTVAPKSAAYQPAGTQALRELQRSLRRQARTLL RRAAEHLETHAAEFPEYDPHENILNRCTTDGNFVQTR >gi|313156793|gb|AENZ01000086.1| GENE 33 34479 - 35711 1898 410 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156821|gb|EFR56261.1| ## NR: gi|313156821|gb|EFR56261.1| 
phage portal protein, PBSX family [Alistipes sp. HGB5] # 1 410 1 410 410 833 100.0 0 MSKKHKIKPAVAVNNRTADPYITLGAGRVESDAFWRWGDDNLFPNALALMARRSVTHRRI INDKADYISGKGFTCDEAQQPRLAAFLRHVNGDGESLRQVLNKLAFDKALFGNAFLEAVT DAGHTFLSLHHQDASRCRLARDSAHVLLHHDWTAFKATEARTLPLYPAFEEQSDGTLRAV IHYKDYEPTFEHYGVPPYIAGFNVSAIAYKTDKWNISRLDNSFQLSGVMMLDSSVDSEAE AERIVRLAEQKFAGNPGQVMFVIRDGGEQNDNSRFIPIASQNEGDWQALHEQAVSDIVVA HSWFRTLSGLEYASGFSAERILHEYEVALNTVILGEQAELTEPIRQLLTSVSGFDTSSLQ IVNRPPTRSKPIYMKVWEARKADGLEYDPEDERQQAFLSEITKYNIRSIE >gi|313156793|gb|AENZ01000086.1| GENE 34 35708 - 36205 626 165 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156841|gb|EFR56281.1| ## NR: gi|313156841|gb|EFR56281.1| hypothetical protein HMPREF9720_0191 [Alistipes sp. HGB5] # 1 165 1 165 165 269 100.0 6e-71 MTASTPRKPAGGITAAAITPAGNLLRAALGADGRAEAVFRDGTLLTRLPLAEQRSSYTEL SDTQAGPHRVRHTLKLVVSREAARTLFDPAFRRTAATEGLIATVTAASGERLLIGWSARL GTEQPLRLGSVSNESGVKPLDGTPVVVTLTCEDADPAPELKQQEP >gi|313156793|gb|AENZ01000086.1| GENE 35 36208 - 37185 1623 325 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156857|gb|EFR56297.1| ## NR: gi|313156857|gb|EFR56297.1| hypothetical protein HMPREF9720_0192 [Alistipes sp. 
HGB5] # 1 325 1 325 325 644 100.0 0 MSFLENAKQYTGSDLENIFFRPILSGPSAGELGVRVLYNMPVPTTIQLWEGQRNVLQKYT AAGWSGGSAANKLQKTIALSRVKAELGFSAADYFSLVYEKIAARADVNMDDLTGSELEQA ETSLFKQAIAEGIRATMWVGDTTAASGFNTFDGFLKSVKAGAEQERFHNSVYEAADFANP EKIVAILDDLWQNADERIKDRKAEGQLAFFVTSDLYYAYEKYLDSKGADAAYAESTNGRQ GLAYHGIPLVDVRLGAYLADTSLHKSFCLLTDRRNLVLAVNTSDFPGNEVRMWYNPDLME NRQRAVFMAGCDVIDETMLSYAFRV >gi|313156793|gb|AENZ01000086.1| GENE 36 37252 - 38133 1139 293 aa, chain - ## HITS:1 COG:L34899 KEGG:ns NR:ns ## COG: L34899 COG0740 # Protein_GI_number: 15673378 # Func_class: O Posttranslational modification, protein turnover, chaperones; U Intracellular trafficking, secretion, and vesicular transport # Function: Protease subunit of ATP-dependent Clp proteases # Organism: Lactococcus lactis # 54 189 41 170 235 72 33.0 1e-12 MKSEIQIDNNAGVCHIDIEGTIGVPEEWQFDQPEARVATYEKFRDAVRRIAEIDAPEVVV NIRSTGGDVNDALLIHDALTALDAHITTRCYGYTASAATIIAQAASPGCREISANALYLI HTAVCATEGNAAELSGKLDLLRQTDTRIAAVYAARSGRPAAEFETLMAENNGSGRWLSPE EAVAAGLADVVTDAAGRGAPSLTRNIARGWERLFVRIGLSAGEQLPADRDRNVLHTEGEE RELLRRSAAAVRQAQQRVVPTQTRPREDPSYGDLVRTANQRAYSEDAKRLRNM >gi|313156793|gb|AENZ01000086.1| GENE 37 38137 - 38400 400 87 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156867|gb|EFR56307.1| ## NR: gi|313156867|gb|EFR56307.1| hypothetical protein HMPREF9720_0194 [Alistipes sp. HGB5] # 1 87 1 87 87 150 100.0 3e-35 MTTRHKKRLAAILLREIGGLEGERAVERLFELGLVNLRVCEQRAIRGEIERLAAEGVPRC EAMHATAELFCCSYEKIRNYYYNSYKS >gi|313156793|gb|AENZ01000086.1| GENE 38 38397 - 40757 2613 786 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156856|gb|EFR56296.1| ## NR: gi|313156856|gb|EFR56296.1| hypothetical protein HMPREF9720_0195 [Alistipes sp. 
HGB5] # 1 786 1 786 786 1502 100.0 0 MELNIDGKPCDLGTERIAVPGYDAAALADVEAAREGRSLTLTIPATPRNDALAGFARDPH TAERFNAAPHTAELTAEGSVMLKGAVRMLSASDEGYRIEIRDGGAGWAKNAARRMFNALG VDFRAALTPETILAGWTDDSPVKFFPIRRDEYPQRNSPTDLLPAQRLLTVDDYHPFLHVA TLVETIFGEAGYRIESRFMESDLFRSLYMSGAYASRDTAALAARMGFFARRLSSATAKAD YSGRVAANPNTALNSVGNIVETATPQTPDADGEPVPGLSNNGGCFGFDRGNIVFTPRTEV SAGFEYYLKYTTDHVIQSRQRLKGFDSVYFGTGADMRFTLANRYEDRRGEIAPNHSYTAI VFDHAEGRQYRLNYTKNGTADTLWAEFSARTAQVVTPAAGSVANPVLQVRSGTRWEPYAG DWALYDGYIAERGETTVELRVRTAAERLSPSSPKYFNRIYFHGAEEGMSLTLHKECSLQP RFSSAPGYGSAITFADVARHRIRQSELLEALQHLFNLRFHTEQATRTVRIEPADDFFAAG PAADWRAKTDFSQPVVLADIAPEVHERRTWRYLAGDGAVARFDAEAESPFGQWSVTTGSR AAKEGEKTLANPLFCPTISTAGGYSDASSAVIMQVGDRDDVQEDGTNFTPRIVRFAGMHP LPDGERWGFPSGQAEYPLAAFHFAGDGAAAGFTLCFEDRDGVRGLHRYYDLQTGRESAGR RITLSLRLAPHEFESLFTPGTGAPDLRSAFLLDTGEGTVRAALRAVEDYDPQAASVRCTF TQLPDA >gi|313156793|gb|AENZ01000086.1| GENE 39 40748 - 41737 860 329 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156817|gb|EFR56257.1| ## NR: gi|313156817|gb|EFR56257.1| hypothetical protein HMPREF9720_0196 [Alistipes sp. 
HGB5] # 1 329 1 329 329 571 100.0 1e-161 MTFTQIPQQYAPLGGELRYTVTAAAAGTIDLRITDTRSADLIGARRFAGSASVSFDAAPC LRRAIRFSPVTGSTGFHVAEGRCIAAVVEAGMTPAGETAPANGEKAVSPERTFLPRRNIV AAPALLTAMPSVRTISPGESDELTLLCDEACTITVTAQAGDSATAESYRTKSGGVLVFRL DTRDFPDAETLTVDAGKCGSVSYTVVPAVSGGRRLAWRSSAGSIEHYTFPIEKSVTAETV KNRAYGTDGHLTAAARTERRTTLVSAYETRAVLEALGEITASPEVWLAEDGGYTPIDVTT PAAVVHRHAAVSCLEIEIRPKRKTRMPWN >gi|313156793|gb|AENZ01000086.1| GENE 40 41872 - 42276 447 134 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|167753876|ref|ZP_02426003.1| ## NR: gi|167753876|ref|ZP_02426003.1| hypothetical protein ALIPUT_02161 [Alistipes putredinis DSM 17216] hypothetical protein ALIPUT_02161 [Alistipes putredinis DSM 17216] # 1 134 1 134 134 117 47.0 2e-25 MNRTHLEGAAEALAAQQGYAFHTGPEDGMAHTVRSYPAAWLAPLELHAVEGCGHGRATYD MTLHLLSSGARLSPTERRAALARMEEQLLEIFAALSLDERVLAVEGLSVRPRAFALTPHG EISQTAAARIVTFF >gi|313156793|gb|AENZ01000086.1| GENE 41 42527 - 43297 971 256 aa, chain + ## HITS:1 COG:aq_1070 KEGG:ns NR:ns ## COG: aq_1070 COG1427 # Protein_GI_number: 15606349 # Func_class: R General function prediction only # Function: Predicted periplasmic solute-binding protein # Organism: Aquifex aeolicus # 6 243 3 218 236 68 25.0 1e-11 MVIVPRIAAVSYLNTIPFIYGIEHEGNLRAELLLSPPAVCAKNFAEHKADIALVPAAAVP SLADAEIVTEYCIGAAGPVRTVVLLSGEPIETVRRVFLDAHSLTSVQLAGYLLAKHWKVS PEYYTMEDYAQLDHALPGDAFLLIGDKVFDYEGRFAYSYDLAAEWKKATRLPFAFAVWIA RKGVDPDLTEGLQHALTFGIEHTYEAVLEYGFDRKPYDAYGYLTQNIDYIFDNQKHKALQ KFWNSGIKVSPRANPG >gi|313156793|gb|AENZ01000086.1| GENE 42 43367 - 44350 1543 327 aa, chain + ## HITS:1 COG:Cj0726c KEGG:ns NR:ns ## COG: Cj0726c COG0598 # Protein_GI_number: 15792075 # Func_class: P Inorganic ion transport and metabolism # Function: Mg2+ and Co2+ transporters # Organism: Campylobacter jejuni # 1 327 1 327 327 181 34.0 2e-45 MITIYLKQYNKIIRNADTKLFDELGYDDILWIDMLLPTIKEQKAVENFMEISLQTKQQVE EIESTSKYSEDENAIISNSNFFVPTGDTFIVEPVSFIISNEGVLVSVRSAEFRTFRETEK RLQMNYRNYSTGYHLFISLLEVRIDFDADLVELIAKQVAALSKDINSEDSIDKAVLHRIS 
ALQESTMSLRENIFDRQRVLSGILRSERFPNDIYPRLQLMIKDVNSLINHADFSFQRLDY IQDAALGLINIEQNEIVKIFSVAAVIFMPATLIASIYGMNFKFMPELDWTLTLSNGWNIP LGYIFAIGLMIFCSALTIWYFKYKKWL >gi|313156793|gb|AENZ01000086.1| GENE 43 44390 - 45214 543 274 aa, chain - ## HITS:1 COG:no KEGG:BT_2655 NR:ns ## KEGG: BT_2655 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 47 274 835 1074 1075 109 33.0 1e-22 MALFQSLLQRHGSEVAVRGGVAVPLRQFRPAHRRDQQRVRQFQGPCDDRVRSGSLGVEKK GTWTEVAGKTQITNKPRASAGFSDRKQTINGNPNCHVAEYEDFHALQLNCQFAYGVLYDD ESTETSFDVKDIYSYAYYNASNKNKGMRGCFVYNPKGSTNSGGEGANLFFPTGVAGYGRR KNCTQEGGKRGVLRYANRGALYTSSDIHYRPMLYTVYTNFGAIYWLNKQSASGTSAWDIN VSTFDFNSFNNNAFLVSDWGAGDVSDACFVRLVE >gi|313156793|gb|AENZ01000086.1| GENE 44 44923 - 47520 2689 865 aa, chain - ## HITS:1 COG:no KEGG:BT_2655 NR:ns ## KEGG: BT_2655 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 4 806 3 818 1075 454 37.0 1e-126 MKSKAIYYLLLGGMLFGFGSCMRDDLDAEAEPIGKGESRVSFAVTFQALSNLSLGKTRSL KGDQIAEIEDIFIAWYNEDGSLAGSGYRTREQMKISNPEREGNPTEQTTQRAEFTYNIPY GRYRIYAVANMGDLAANTRYAEKIALEEDFRRIAPAWETDPELTRNNDQMSGYFTANKPE SGAVRGEAPLLTINSAKPTIHAWLRRLASKVTISFDTKNLYDNVYIYIKSARIHNIPRSC TLLDANDEQRIKDPDNDVWFAGDTIEYGQGTDFNQWPRQTKGSNPLGGTAQSKEQLHAND ARSLFFYENMQGKGKLKWQDVSGDNQHISFPDGNDPGDPGYRDGKPCGTYIEVEGFYISN TAENPGKGAIFYRFMLGKNADDDYNAERNFHYKLTMRFQGSANDVNWHIDYNEDPGIYVP NPYYISYLYDHSMMLPVKIKGKPVGNLKAEIVENNWGPYQAGDEFEYYPDEVYSLSGNPT AGNVTTADPDNPKIKDGPWNGFLSLAATHINWIGRDKSFWCGYNYFYWLQKDFGTALSGE SLEAYGEALLLPQEKGQAPRGYREYELSKSGTIDGGNNGDYIVTIDEAKERTVLQIPCYT RALQLVSTTGFSGNNPYFSYRRSAKVKLTVQLEGHPEPTVDTVTIFQVRRIINPKGIYRR HDNGTPFNVVLTHRKNEAATQYLPFESEGPWEAEIETGSDWIRINGVLGGRAKGATGSEI RFSYQPDGTIGADQCRYGIILIRYHNYSCYHRIFVRQGYAPAQIAGDAKWHCFNLCYNDT EAKSPCEEGSLFRFGNFDQPIDATNNVFDNFKDHATTEFDLAPLESKKRGHGQRSREKRR SRTNPALRQDSATANRPSTAIPTAM >gi|313156793|gb|AENZ01000086.1| GENE 45 47538 - 48692 969 384 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_0603 NR:ns ## KEGG: Bacsa_0603 # Name: not_defined # 
Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 30 374 23 362 369 122 30.0 2e-26 MSPNGIIRIFICLATLASATLVSCRRTGHEEPVPGTGTERPDDVRVSFRIALGSTNGTRA GGTPEGSYDDGTQSEFENGIDLENGNYRFFFFDAQNRYLSAFQTVELIPEGEDPDRSKTY EVVGKTDKMPPANFKVVALANWPSYPGKMTAGETTIEELCTAESSRYTFSAPFALSRERL IPMYGVKSCEEMTFTPGMLTHLGTVHLLRAMAKVEVSCKTSGWTLEKVELLRYNATGYCA PSGVYSQDDYVKGNYAGDYTDTVHLPDGRNDEAPKTRAFDRTADGRFVAYVPEYRNVAAG TANRKAADAAEMRVRFKEAPGKEYAVEFKYYNAPPEGSAVGDPFDIRRNYYYKFSISKTS EPDLVVEVFPYELVELDPGFGLLP >gi|313156793|gb|AENZ01000086.1| GENE 46 48705 - 49805 1324 366 aa, chain - ## HITS:1 COG:no KEGG:BVU_0901 NR:ns ## KEGG: BVU_0901 # Name: not_defined # Def: hypothetical protein # Organism: B.vulgatus # Pathway: not_defined # 15 357 16 339 348 256 40.0 1e-66 MGTSSYIYAIGKRTLAAIVCALTLLGCDSVIYDGEGDCSVNYRVKFRYDMNLKFADAFAR EVESVTLYLLDGGGRVVWQRTESGEALGKEGYAMEVDVAPGTYDLLAWCGSTDKGSFRIP ESASRKELTCTLMRESGTDGTGHIREDHDRLYHGYLPNQTFGDTEGIYTYVVPLVKNTNN VRVVLQQTSGERLDEKRFSFRITAENGRMDWDNQLLPDEPVTYHAWHKQSATAGTALPDL PDAVTSVNAVIAELTTARLMVRDKSATVRSETGTNPPQLQPEELPEERKMRLTVRDNDTG KTVLSIPLVDYALLVKGEYRRDMDEQEFLDRMDEYNIVCVLGEGMRWVSMSINIHSWKMV WQNTGI >gi|313156793|gb|AENZ01000086.1| GENE 47 49875 - 51602 2374 575 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156864|gb|EFR56304.1| ## NR: gi|313156864|gb|EFR56304.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 575 1 575 575 1131 100.0 0 MKLKNILCGILVCGAFCACSDQEPENSNKPGSFDGKAYLRVNITDANSTRATGGDLEYGD ATEHEVVNADFFFYDTNGIFIARANVWDSGTENEDNPAGNIEFFGKSVIVLKGLTETNHP KHLVTVLNTPEDFEPANTLDEMKTLLAGSIYNGSHFTMSTTSYGRTSAAVPYFVTEVSES DFKKELDEAQNATPVDIYVERLAAKVTLRVDNDKLVPVDGKPGIYKVKVTVAGDPNQQEQ GTQTGAADIYVRFLGWGLNATAKDSYIVKNIETAWSDADLGFVWNDAVRFRSYWGKSYNY GKQDGKYPDVSAEAAIAEYLNYINADNLTKQFGQSAYCAENSNTSEIVSEYLHSAVTSIL LKAQILDENEQPLEMIRYNGTLYKRNDYLNYVLNTLQKGNKLNLYCLEDEATGKYVQIDH NCVELANDRDGKVYLKLKPLAAGTSLYRKSGETYAPATDTELGEVEKNMKDFNAYNEAIG YKDGLMYYNIPIEHLNNAATTTDAENKRVIPVAKYGIVRNHHYVVTVNSLTKIGKGIFDP EEEIVPGKNDEKDTYYVGARINILSWKIVNQNVNL >gi|313156793|gb|AENZ01000086.1| GENE 48 51622 - 52668 1264 348 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1472 NR:ns ## KEGG: Bacsa_1472 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 1 347 124 471 471 339 51.0 9e-92 MNGAALRLSRRDYGCCNTVLAARQGLLGHFEEPVPYTPRPVYVRPTAEAVKSRSLSGSAF IDFPVNKTVIYHDYRRNTAELGRIEATIDSVRNDRDVTITSVWLKGYASPESPWTHNRML AIGRTEALKKHIGQLYRFDEGVIETDYEPEDWAGLRSYVERSNLTHRAEILALIDSSLEP DAKEAKIKRSYPEEYDFLLKNCYPALRHTDYRITYTIRTFSDAQEIRHIMLTQPQKLSLN EFYLAAQACSPGSDEFNEIFETAVRMYPEDTAANLNAANTAMQKGDLKNAEHYLRKAGES PEALYARGAYAMLTEEYETAAGYLQEAEKAGIREAGEALEQLQKRNKK >gi|313156793|gb|AENZ01000086.1| GENE 49 52701 - 53039 344 112 aa, chain - ## HITS:1 COG:no KEGG:BT_2659 NR:ns ## KEGG: BT_2659 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 26 92 5 71 446 65 46.0 9e-10 MKRTIFFLSLLLGTSPLPHAQEIDGVTVGNLKMDWNGEYLVVEMDVELSRLEVEANRAVL LTPRLVNGADSADLPAVGIYGRRRYYYYVRNGASMLSGDGETSFKAAENPNG >gi|313156793|gb|AENZ01000086.1| GENE 50 53052 - 53633 640 193 aa, chain - ## HITS:1 COG:no KEGG:Bacsa_1471 NR:ns ## KEGG: Bacsa_1471 # Name: not_defined # Def: hypothetical protein # Organism: B.salanitronis # Pathway: not_defined # 7 193 8 190 190 253 64.0 2e-66 MKFRIFLLLCASLWGWNVRGQSVALKTNVLSDAFMNINLGVETGLAPKWTLDITGDFNAW 
TLSHGRRWKHWLVRPEARLWFCDRFAGHFIGFHAHGGQYNIGGVKNGISFLGSDLSKLSD YRYQGWFIGAGVAYGYAWILGRHWNLEAEIGVGYAYTQYDSFRCVGCGKRVEKDKPHHYV GPTKAAVNLVYVF >gi|313156793|gb|AENZ01000086.1| GENE 51 54434 - 55903 2059 489 aa, chain + ## HITS:1 COG:TM0620 KEGG:ns NR:ns ## COG: TM0620 COG2244 # Protein_GI_number: 15643386 # Func_class: R General function prediction only # Function: Membrane protein involved in the export of O-antigen and teichoic acid # Organism: Thermotoga maritima # 5 440 8 429 479 71 23.0 4e-12 MLEKLAKQTAVYGISTIVVRFLSYLLTPYYTRIFGQETYGIVTDIYALIPLALTLLTMGM ESSYFRFSAKAEEAGGDVRAAKRRLFATTWGVTSLAAVVFFVLVASFRNGVAGLMGEAYA AHPEYVVWVGLIILFDVWACIPFSRLREQGRALLFVGIKALNVVMNVALAVAFGVAGLFA TEFGVGWVFVANLIASVVTWLVILATVDRTVPKINWALLAAVFAYSLPLLVGGLAGTANE FIDRQLIKYLVPEGAMAQVGIYGAITKIAVVMMLFYQMYRLAAEPFFLSNFKKSDFVQMN AAALKYYVMASMLIFLGIALFRDVFALIVGRDFREGIFILPVVLGANVLTGVWLNLSFWY KREEKTSLAIVVTGAGLVSMLVFGFWCIPVWGYYGAAWARLASESTMVAVSWWLNRRFYP TPYDWRRIGEYVAAALAVFAVCEAVTACGGNKLITYAFNIVLFAAYALYLVRRERIDVAA LVKAALKRK >gi|313156793|gb|AENZ01000086.1| GENE 52 55920 - 56105 258 61 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156868|gb|EFR56308.1| ## NR: gi|313156868|gb|EFR56308.1| hypothetical protein HMPREF9720_0211 [Alistipes sp. 
HGB5] # 1 61 1 61 61 65 100.0 1e-09 MRQSAKEKIVVLAVSFALFVGGVLALSMDRGAGRTVVLACVVLALLAVAVRQTVLKNGKK K >gi|313156793|gb|AENZ01000086.1| GENE 53 56112 - 57257 1770 381 aa, chain + ## HITS:1 COG:BS_holB KEGG:ns NR:ns ## COG: BS_holB COG0470 # Protein_GI_number: 16077099 # Func_class: L Replication, recombination and repair # Function: ATPase involved in DNA replication # Organism: Bacillus subtilis # 9 239 12 194 329 90 30.0 5e-18 MRFADITGQEDLKRHLVQSVDAGRISHAQLFTGQAGAGALALAVAYVQYLCCCHRRGGDS CGECPDCKQIAALAHPDLHLVFPVNKQGKKSGEVMRSDEFLPLFRGLFAERGGYFSAQDW YDRLDLGKTLKGMIAAREADDIIRKLSFKSFEADYKTMLIWLPEAMNEEAANKILKILEE PWEKTLFLLISEQPERLLPTIISRTQEVAVPRIAPDVLERAAQERGVTDPVKARNIARLA GGDLVELQHLVAGESDALRKENFELFCGLMRLSYNDRHLELVTWAEEAAQLSREQQRAFL RDASRLLRESYMLHAGINEISYLWGEELAFCSKFAPFVGSQNIEPLIAEIECALAQISQN GNPTIVFTHFALSVSKMIKRL >gi|313156793|gb|AENZ01000086.1| GENE 54 57261 - 57752 209 163 aa, chain + ## PROTEIN SUPPORTED ## NR: gi|148994682|ref|ZP_01823786.1| 50S ribosomal protein L13 [Streptococcus pneumoniae SP9-BS68] # 3 161 121 269 278 85 32 1e-15 MARVVLLTGGNSGDVKRTLQAAQQLVNAKVGAVLRCSHRYETKPWGFDADGVFSNQALEV STDLLPLEVLDAVQAIERELGRNRAAEAVEKARTGVNYTSRPIDIDILFYDDEVIDSERL TVPHPLMGEREFALVPLCEIMRQRRHPVTGRTVGEMLEELRNK >gi|313156793|gb|AENZ01000086.1| GENE 55 57762 - 58727 1382 321 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156796|gb|EFR56236.1| ## NR: gi|313156796|gb|EFR56236.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 321 1 321 321 605 100.0 1e-171 MKTKRIILFLAAAALFTGCFKDVSYKTTYVFKPLEQEKSGDPTQILADAKLYAFAADTSS WEVASYKDALAGVISQRGNRNEKISTPLAAGEPYEQEGVEGLLQMSFDRSSVMVVAVDTK NELYGYTQQEVPENTPKIYVTLIFKRWKQGYSYKDGKWLMFNDSYTPPIYIDSQVTAQLQ AEEGAEPTVPSNLRVYAYAVDTTAWKINSYNDAAQRIITSKSDPKQTRTSPDFEAYYSKE SGTYGMKVSSPTLMVVVTDPVNQLYAYSQQEVEIVEGGQPVNFLPVVFRPWKQEYLYVEE GGWRVVNDKLAPKEPEKASKR >gi|313156793|gb|AENZ01000086.1| GENE 56 58730 - 60805 3159 691 aa, chain + ## HITS:1 COG:no KEGG:PRU_2555 NR:ns ## KEGG: PRU_2555 # Name: not_defined # Def: putative lipoprotein # Organism: P.ruminicola # Pathway: not_defined # 28 649 17 633 650 228 26.0 6e-58 MQTRKRNILRLLRAAVLLLFATAFLSRCASMMTPTGGPRDTLPPVILNMTPDNFSVNRPT VHHEKIYIEFDEFVQLKDQQKEFFTSPQMKKKPLVSMRGKGIVVQLRDTLEANTTYALNF GSAVRDNNEGNPLYSMRYVFSTGPTIDSMIFSGYTADSYKADSVSKSFIWFFPADSVENV AEYDSTIFKYKPAVIARAENNGIFIAQNLKPIPYRVYAVQDKNDNQMYEPGSDQVGFLEK SYNPAEMPDFAMWYDSIRQYVTAEPQFYLRMFTDKAFRRQLLSQTERPLQHKAMLYFGAA HPRIERIRFDSIPEDRVIVDPQTVGRDTIALWFNMPSSALPDTIKGEITYFKHDTVNVLQ EVTEPLKLSWRLIETKEQEKEREKLERDRRKAEAAGEEWVEPKKENPFAYKLPLTGEINP ENNLTVDFDYPLTRLDSAAMLLTLTRSDNSIEDVPVRFVRDTGLLRRWHIEAPWTSGGQY TLTIPKGAITDVAGFSNDSIVGKYTVLDPEKFATVKIHVKGKDDKAKYILQLLDGSNALK QEKRDVTTGDWQFNYVPAGEIKFRIIEDMNGNGKWDTGNVVERLQPERAEIYANDEGEDT FATKTNWEVEFSIDMNRVFAPVTMESLSRLLDEREAQRLRREAEKRAKEPKTNRNSHDQN NQNNANGFGSFGGFGGGMNNSMNSTGGMFNR >gi|313156793|gb|AENZ01000086.1| GENE 57 60810 - 61616 1163 268 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156831|gb|EFR56271.1| ## NR: gi|313156831|gb|EFR56271.1| outer membrane insertion signal domain protein [Alistipes sp. 
HGB5] # 1 268 1 268 268 498 100.0 1e-139 MIKRQLKTLLFLAFAAVAWQGASAQHTLGFTAGYGMASSRLDPKQEMKAIWGSYTAGLTW RYYGQQRFVGGFGIDLEFLQQGFSLATNASMVEEKKDYRYYTRNVNSIVLPIVWQPHFYL FRNHVRVYLEAAATFSYHLSSTYENEEAKASGAADWKGDYSFKLPRDNRWGYGLAGGGGI ALLIRRFEINVRARYYFGLSDIVRNRNKYADNGIDGSENPFWATPMRSPLDNLTVSVGLS YRFNKEGFSTWKPRPKREKNREVFKYGL >gi|313156793|gb|AENZ01000086.1| GENE 58 61629 - 61970 594 113 aa, chain + ## HITS:1 COG:ECs4048 KEGG:ns NR:ns ## COG: ECs4048 COG0858 # Protein_GI_number: 15833302 # Func_class: J Translation, ribosomal structure and biogenesis # Function: Ribosome-binding factor A # Organism: Escherichia coli O157:H7 # 2 100 4 103 133 58 36.0 3e-09 METTRQQKIAKQIQKDVAEIFQKEGEGIVRGSLVTVTAVRVSPDFGYAKIYVSVFPFGRS AALMQELDRNNKFIRHALGQRIRNQVKNVPEIQFFLDDSLEYIDHIEQALKND >gi|313156793|gb|AENZ01000086.1| GENE 59 61989 - 63200 1443 403 aa, chain + ## HITS:1 COG:YPO1628 KEGG:ns NR:ns ## COG: YPO1628 COG4591 # Protein_GI_number: 16121896 # Func_class: M Cell wall/membrane/envelope biogenesis # Function: ABC-type transport system, involved in lipoprotein release, permease component # Organism: Yersinia pestis # 19 399 24 397 416 97 25.0 3e-20 MLPQFFARRYLFSPKSRSVVNLISGLSVAAVAMPVAAMIVLLSVFNGFESLVKSMCSAFD ADLTVSARQGQTFAADAVDTAALRRIPGVEAMSFVLEESALLEHGDRQATATVRGVDDAY EAVFPLSDAVAAGEYRVRVGDLERLVIGQSMAYMLGVRTLADADVGVYAVRRGSFSSLLP FDNYTRRTIPLGGVYTLDLETERTYVLASLRMAQELFSRPGRVSGLVLRLREGADAVQVR DAVAQQLGGDYRVRTRYELRASFYRIMTYEKWGIFFISLLVLVVASFSVVGSLAMLIVEK RRDIGTLRALGADTTLVRSIFRSEGLLICALGAALGVVLGVGATLLQQRFGLIEIPAETF LTKSYPVEFRPGDLAAVLAAFGAVACVISNITVRSMIKSNVKL >gi|313156793|gb|AENZ01000086.1| GENE 60 63197 - 64024 1382 275 aa, chain + ## HITS:1 COG:no KEGG:no NR:gi|313156806|gb|EFR56246.1| ## NR: gi|313156806|gb|EFR56246.1| putative lipoprotein [Alistipes sp. 
HGB5] # 1 275 1 275 275 521 100.0 1e-146 MKRIVRISLCMFLLVLAGACARHKIIPDRKLAQIFHDAFLANAYIGSEQVDIDSLNIYEP IFAGYGYTTEDVYYTIGNFSKRKSARLGDVVELAIEMLEAEGKYYNREVAVLDTIDNVAR RSFTRTVYADSLIRVGSLRDTARLRFSVDVRPGEYNLSLKYLVDSLDRNEKGLRGVVWLE RRDSTRTNVYTTTLRRDRQENFTRRFTVDTTHRRLWVDFIEFRGKPQRPSLTVSDLKIDY TPETSAAEDSLYMQQLDIRIFADEFFRAAIPADSL >gi|313156793|gb|AENZ01000086.1| GENE 61 63984 - 65132 1208 382 aa, chain + ## HITS:1 COG:jhp0252 KEGG:ns NR:ns ## COG: jhp0252 COG0402 # Protein_GI_number: 15611322 # Func_class: F Nucleotide transport and metabolism; R General function prediction only # Function: Cytosine deaminase and related metal-dependent hydrolases # Organism: Helicobacter pylori J99 # 53 360 52 374 409 72 28.0 1e-12 MNSSAQPSRRIASNLLWTPQGLVRNPLVEVAADGRILSVGTCSEPDRMPGTEFYSGVLAP GLVNAHCHLELSYLRGAIPEGCGFAGFAGAMGQVRERFGPEERLRAVAAADAAMWQDGVQ AVGDISNGDTTFSVKERSRVAYHTFIEFFGLRLASADSVRPLLRHPHTSLTPHSLYSVQD APLRAIAAEGDAPLSVHFMESPAESELFARRGPLWEWYRKVGFTCDFLHYDSPAERLVAS VPRDRRVILVHNCCVTQRDIDVVMNHFTAPVRWCLCPRSNRYISRLEPPVELLRCNRLDI CLGTDSLASNDRLSVFEEMRMFPEVPLPELLSWAAEGGARALDMDAELGEVAPGRRCGLT AISGLDYDTMRLTPASRIRRIL >gi|313156793|gb|AENZ01000086.1| GENE 62 65192 - 66151 1663 319 aa, chain - ## HITS:1 COG:no KEGG:Bache_0655 NR:ns ## KEGG: Bache_0655 # Name: not_defined # Def: glycoside hydrolase family 2 sugar binding protein # Organism: B.helcogenes # Pathway: not_defined # 1 317 287 603 604 509 73.0 1e-143 MEVALVRNGRKVDEVQSYTAMRKFSIGRDENDIVRLELNNEPLFQFGPLDQGWWPDGLYT APSDEALAFDVVKTKELGYNMIRKHVKVEPARWYYHCDKLGMIVWQDMPNGDQGPQWQMH NYFNGAEKHRSAESEANFRKEWKEIIDCLYSVPSIGVWVPFNEAWGQFKAPEIAEWTKAY DPSRLVNPASGGNHYLTGDILDTHHYPHPRMTLLDTNRATVLGEYGGIGLVMKEHLWEPD RNWGYVRLNSPKEVTDEYEKYADMLYKLIGRGFAAAVYTQTTDVEVEVNGLMTYDRKVMK VEPERIRRINERICNALNK >gi|313156793|gb|AENZ01000086.1| GENE 63 66259 - 67011 1129 250 aa, chain - ## HITS:1 COG:ebgA KEGG:ns NR:ns ## COG: ebgA COG3250 # Protein_GI_number: 16130971 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-galactosidase/beta-glucuronidase # 
Organism: Escherichia coli K12 # 77 227 103 233 1042 62 31.0 8e-10 MKKNILLTAAALLTTLAGAAQWQPAGDRIGTEWGEKLDPQNVLPEYPRPQMTRTQVQDGW QNLNGLWNYAILPMGETPEKYDGQILVPFAVESSLSGVGKRLGDRNELWYNRTFTISPKW NGKRVLLHFGAVDWKADVWVNGVCVGTHTGGFTPFEFDITAALKKGDNELKVRVWDPTDA GCQPRGKQVNRPEGIWYTPVSGIWQTVWLEAVPQQYIRNIRTTPDLDRKLFRVETAACGA QPGDVIEVRL >gi|313156793|gb|AENZ01000086.1| GENE 64 67247 - 67840 792 197 aa, chain + ## HITS:1 COG:XF2239 KEGG:ns NR:ns ## COG: XF2239 COG1595 # Protein_GI_number: 15838830 # Func_class: K Transcription # Function: DNA-directed RNA polymerase specialized sigma subunit, sigma24 homolog # Organism: Xylella fastidiosa 9a5c # 3 173 11 182 206 69 27.0 3e-12 MEDSTLIARVKAGDRGAFNELYGIYWASLVNYAGLFVGDDGAEDVVQDVFVRVWLRRDNL RDDGTLQGYLLRSVYHASLNALKKGANATAYRSWVAQQIEQSCYAHYDPDDSEVIRKLYS QEIAGQIDAAVESLSPKCREVFRMSHIEGLSNREISERLGITLSTVENHIYNALKQLRQK LSHYKMLILLTMYILGR >gi|313156793|gb|AENZ01000086.1| GENE 65 67911 - 68909 1188 332 aa, chain + ## HITS:1 COG:PA2388 KEGG:ns NR:ns ## COG: PA2388 COG3712 # Protein_GI_number: 15597584 # Func_class: P Inorganic ion transport and metabolism; T Signal transduction mechanisms # Function: Fe2+-dicitrate sensor, membrane component # Organism: Pseudomonas aeruginosa # 24 291 29 281 331 82 30.0 9e-16 MKQDDTTARFDEKQLAAYFSGTATPEEEQALLAWIRSSDDNRRTFAELRAVWQRGRMQRP DTQLQARFVRSLNSLNRRIDALGADAPLRRGRRIPLRRFAAAAVIVVALAAAFMTYRVAT APFVHRFHNADTVAMHVAMPDGTDVWLSPGTTLSYDDTFRIDGRNVELDGEAYFDVTHDA GQPFVVTAPALRVRVLGTVFNVRSFSGDPVAEATLAEGSVALQHAGGRNLICLHPGQQAV YDADAELLEVNEVPVGDLLLIRYGVVTLDDVTLPEIVARIERTYGVSLRIGAQQIPGERY NFSFQKDASVEDVVELLQFVSGCRFEIIQLNR >gi|313156793|gb|AENZ01000086.1| GENE 66 69035 - 72592 5748 1185 aa, chain + ## HITS:1 COG:no KEGG:Phep_3773 NR:ns ## KEGG: Phep_3773 # Name: not_defined # Def: TonB-dependent receptor plug # Organism: P.heparinus # Pathway: not_defined # 125 1185 33 1120 1120 973 45.0 0 MKKSITSLLKGWCAALLCAGLTLPSFAAADADAAFAAQRHDLKLSLRNAPLREVVTAFTQ QTGVVFSYETSLGERTLPHVDVALNGSTLDEMLASVFAGTGISYKIKDKVVALSAPPQSD 
APAARSAAQQRKQTVAGRVTDEQGEPLAGVSVLVKNTLTGTSTDADGRYSLSVSGSGAVL VFMYLGFQTLEEEVGARSTLDVTLRENKQILEEVVVVGYGSVKRRDLIGAVDQVDSRQFA ERSNPSVSRSLQGAIPNLNISMRDGKPSRAATIDIRGAGSIGSGGGALVLIDGVEGDLET VNPQDIASVSVLKDASSAAIYGARGAFGVILVTTKSASEGEVNVTYSGSFSIHSRTVKPN LVTDGYEWTTGYINAWNGYYNGSKELNSTINNIVPYSDSWYRELARRSTDPSLERVRIND NGKYEYFGNTDWTKAFYKDVNYSHEHNLSISGGGKNADYYVSGRFYDQDGIYRVGDERYK QYNVRAKGSVRIRPWLRLNNNMDFTVVDYHQPMLYYSNQLVPRMVEHSGQPVSLITNPDG TWTYAAVLNGYAGFAEGTSYQQEDKFTLKDKLGVEIDLVKDVLKVSGDLSYLFSRNTRER VTNMYTGYTGPESTITVNESQGSTLQNVRYDMNYVSSNIYAEYTPKLGDDHSLKLLGGWN LEKKKYRTLTVKREGLTVPSKPSFGLMDGVTSDPTVGGYDWSYVGAFFRANYGYKGRYLA EFSCRYDGSSKFPQNSKWGFFPSASVGWRLSEEKFMGWSEGWLDNFKIRLSAGSMGNGNI DPYKYIDYMTLKSSTVVIGNALGTYTVAPGAIPLSLTWETSTTYDVGADLDLFKNRLTMG FDWYRRYTTDMYTVGVSLPAVYGTAAPKGNNASLKTNGWELSVGWRDSFRLAGKEFRYGV KAMVWDSRTWVTKYINPTGNLDDYYEGMELGEIWGYRVEGLFRDQDDIDSHATQSFLQSS DKVTRPGQVKFADLNEDGKIDQGAKTLSDHGDLTVIGNTSARYHYGINVNLNWNGIGIST FWQGVAKKNWYPRYDSGYFWGQYNRPFGYMLKAHTGSNVYSEELDNWDTAYWPRYSAYQT NESSVNRVLTTPNDRYMQDVSYIRLKNLTVDYTFPAHICRKMRIKGLKVYVSGENLWTSS PMFKYCDNYDPEVINAGDSDFRTTEGDGYSYPMLRTVTLGLNLTF >gi|313156793|gb|AENZ01000086.1| GENE 67 72612 - 74399 2817 595 aa, chain + ## HITS:1 COG:no KEGG:Phep_3772 NR:ns ## KEGG: Phep_3772 # Name: not_defined # Def: RagB/SusD domain-containing protein # Organism: P.heparinus # Pathway: not_defined # 1 594 1 604 606 461 43.0 1e-128 MKKLFLILSVAAGLVSCEDFLTAGDPNKIDAPSYFRNESDLEAYANGFLQTMIPTAISVA TGDARADYMAWRGEWQYLTDNFDADDQSGWSTGSWEDLRNINYFLGNFRRAAAGEAILDH YEGVARFWRAYFYMNKVKTFGAVPWYDREIDANDREALYKPRDSREYVMRKVLEDLDFAS THCITNAKLEVNSVRITRNVALAFKARCCLYEGTFRKYHANDPSTGKPWTADESEMYLRA CADACETLMGEGKYALVSDPAKVATQYRSLFTSESVASGEVIWARAYDASLNATHILNTY FVNMQYGSYSLTRQFVNTYLNRDGSRFTDKAGYEKTLFADEFENRDYRLMQSIRYPGYTR RNNNVNTPYAPDFGYCVTGYQPIKWVIDDTSMDSNTAPCATCIPILRYAEVLLNYAEAKA ELGEFSETVWNATIKLLRERAGVDGAMPSAYDPYMAEYFLNTTTDMAILEIRRERGIELL LENCRWDDDMRWGMGRLLERPWYGVYVGELGKVYDMDGDGSGDVCFVRENPPVTEPGVTY RVLGGDYALTDGDSGYIECYIRMNRKWDDKKYVRPIPTTALNDNPALGQNPGWKK >gi|313156793|gb|AENZ01000086.1| 
GENE 68 74419 - 75738 1645 439 aa, chain + ## HITS:1 COG:no KEGG:Phep_3774 NR:ns ## KEGG: Phep_3774 # Name: not_defined # Def: hypothetical protein # Organism: P.heparinus # Pathway: not_defined # 23 212 21 220 440 84 31.0 1e-14 MKIMKRIYQFLLLAAGCAALTTACSDDDDSYLVRETDSIRFASCLASSKQITLRCNGSWR TVIPEDAGWLSTSPSEGVGSGAFEWIAVSATHNRSAERTATIYLESGGRQYPITVTQADG AVVYGAPYVEGNLIEQEPSKARLCFTYANAYGDETIDVSCALGGDSQGLSVAGASVSLVN GGDTVALDIAGTPTVPGYAAFAVSVDGVQIGTARAKVYAMSEMPIEGLPVTWEFCPVKGS TEDVNALKARQPDWVTASHSLVSEDGRAYITVVEADAKTASAVNSWGYNDGHAYLKGLYT GDYWLQKIPVKYLVSGTKINCTGSIGGSGSSAGFFLIEYSADGQTWRQAGGAKSGTFNNT EVTYHVRAYDSPLFEGENTGYFSYDFPVDLTINSGTLWIRYRVSANVRITANNTITTGGG GSTRLKGTFSVSVVDENMN >gi|313156793|gb|AENZ01000086.1| GENE 69 75756 - 76181 624 141 aa, chain + ## HITS:1 COG:CAP0003_1 KEGG:ns NR:ns ## COG: CAP0003_1 COG5492 # Protein_GI_number: 15004708 # Func_class: N Cell motility # Function: Bacterial surface proteins containing Ig-like domains # Organism: Clostridium acetobutylicum # 26 104 383 464 501 60 43.0 6e-10 MKKYLTLLLLAVVSAAAGAGCSSDDRPVQITSLKLNEKDLELTVGQSFQLVAATRPADAG VTLKWSSSDQTVATVDQTGLVTLRAFGTAVITARYRSYSARCTVATLQEPPLDPSAALAA PLSDGIIYSKNVLLYYPFIFR >gi|313156793|gb|AENZ01000086.1| GENE 70 76984 - 78855 1241 623 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 79 606 103 604 757 306 33.0 6e-83 MNERLLPGRIRSGWGVSILFLCLMALSGCCSHEAIRVNLIPRPAEMEVLSGYFQPGNTTV DDFTTVVVDNSRMDALGREGYELTVDRSSVRLTAATQTGIFYGKRTLEQLMTDKGIPCVR VSDRPRFAYRGMHMDVSRHFFPKSQVLKMLDEMARYKLNVFHFHLTDNGGWRIRIDKYPR LTAEGAFRTQRDWYAWWDQNDRRYLPEGTPGAYGGYFTKEDIREIVTYAAERHITVIPEI EFPAHSDAVFIGYPELCCTGKPYTTGEFCVGNEQVYTFMEDVLTEVMELFPSKYIHIGGD EARKVAWATCPKCQALIERERLDGIQGLQPYMIARIQDFLASKGRVMVGWDEILHNELHS ETLVMSYRGQKGAIEAANRGNYAVMTPGEVLYFDWYQADPSTQPRAMYGYSPIKKMYAFE PVPADPESAARNESIIRAEFVDPAAVEPIRADRADRIVGVQGCTWAEYIEDEEQQEYMIF 
PRLLAVAELAWTPQEKREWCDFKVRMNSHIPLLQQRGLNTFTLSDEVEITTRIRSGNRAV EVTLDCEKYPAEIRYTLDGTPPTPDASLYEAPFTVTDSTVVRAAVCRDGVIRSPERKQLV TLAAEIDNYYPFDVPEIWKEYFD >gi|313156793|gb|AENZ01000086.1| GENE 71 78913 - 80433 1158 506 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156832|gb|EFR56272.1| ## NR: gi|313156832|gb|EFR56272.1| hypothetical protein HMPREF9720_0230 [Alistipes sp. HGB5] # 1 506 13 518 518 968 100.0 0 MKAMMMKKQNVGNACGWADRLARLLLLPALTLVLLCSCDEDKAKSERPEEGDQLWKIVPV LASESDGFEGLGEYGVSLYLFNDMSSDCYKYQHADSFGDLEPFVVEKAAYSLVALMMDRD VPHVFYEASDEAKKNTTVTVTDMNETIPDMVLGTASVSRNVGATVPVSDLQRLVGALDVR LTDVPNDVTAVELTVGDLYDRVDLTGQYDFSTGEPASKRIELTKEGTVFSARSVLMPSDE TRANVKMDFRVFRGDAYEDFVVRLSGSIPADKLTRLEGRAEEILKNAELTLGLTYAPWDA SITIEDHFQTDDDLNKVWSSAPLPISGAADPGYDNFWASSVFTDWDGSVKYDSYLYDGIM DGSDEHKDLYWGPDAVGEAEKGTVPSWYVDLGTGCQGITLTYWNKFGGKGGQKLRTMNIY GSNAPEDYRGGNESWTLITTFTSDRTKPTEDAGAEVTTGRIEFDKGNATYRYVKCEITSR VNADGNVVEDSDVNVAEVQITVWSCK >gi|313156793|gb|AENZ01000086.1| GENE 72 80569 - 82020 1436 483 aa, chain - ## HITS:1 COG:no KEGG:no NR:gi|313156826|gb|EFR56266.1| ## NR: gi|313156826|gb|EFR56266.1| hypothetical protein HMPREF9720_0231 [Alistipes sp. 
HGB5] # 6 483 1 478 478 773 100.0 0 MKKYSMYTFAVAMLAGMILTGCSSSEEIPGGTPDNGPKLKIETRSADAAEAAYAHRAILV SEGNIADAKSAADAAALGSFSVEPGSYTVYAVASSDETNLTFTGVAASNAVADARVKITD ISAQIPELLVGNSDVVTVGAEGAAAALQLKRVVASVTVTVSGLENVDAESITLTIGNMYD QVSLDGAFSKSGAAFASKSLVLTKNAEGKFVGNAIVMPTDTEASNLSLTYSVNGTDYTST PAGRIAANGQYELATTVEAGSSSSKVNLNSAITYAAWDSTVVNLSDSFTIDETTPDKQWM GPTALPISGSAAPGYDNFWASSDGGDSGWGETFWQYNLYDGNKTDGSNYWCPDEWDRTAP VWYIDLGAAKQGVTIDYWNKAGGKGGQKIKTMDIYASNTRADYGGGNADWLKIMTFTSDK TTPTTDAGAQVTTGKIAFSEDGSVSYQYVKCVMTSKVNPDGATITDVLDVNVGEVEVSYW EFK >gi|313156793|gb|AENZ01000086.1| GENE 73 82071 - 84344 2227 757 aa, chain - ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 290 742 32 448 461 194 31.0 1e-47 MRRILITSLALCTAALSAACSDSACDSDLVAGGMAMVRAVPQSVTAEAAYAHRVCLIREG VVAQSVSAASAAALAPFRVEPGTYGMLAVAAADEDNLTFPAAEGIALSEYGVRITDMTAP IPDLLIGCNSRFEAVARSVSAPVGMKRAVAHLTVTVVGLEALSCESITVSIPRMYDRIAS DGTPGNSGAEFSEKAIVLARNSAGRYVGQAVVLPTDTASATLEFRFTINGKNYVSVQETR IEANRKYALSVAAKFKDDTDLKLTPVISYLPWDAPTTIPDDGLPAMDDRPANDDFTVEIF RNGCWEEIFVHNAEVSDYAANPAAGYVQHDMGFAMFTDAFAAPLKVRVTRRAGTFSKVEI RPLSYGIVPNVQTPNSVEFELDDPAQKVSVEFDDNRMENLFILPDLPDTAIPTGANVTYF GPGIHNMGRKEILYKDNQTIYLDEGALVYGCIYAKGCRNLTIRGRGILCSSKENHGDGRQ PQIETFDCNGFKVEGILLRDTPNWTLKIVGSTGVHIDNIKEIGWIMNSDGMDFICCRNVL VENTFQRNYDDNVTIKAFNGKTDYVTAHTASDGSFTDASIWTVYYLAQNKFDVYDYEIRN CVFWADKAHNMLVGPEARGIAFRNIRFHDNIVLENRQNDGIYPGAMAVMIADNGTFEDIA FENIIVEDIDGGKVFCAHFTNAWAFDGLYGQWARNITLRNIAYTGTRATPSWIRGRSDAQ SIDGVTIGNFTVNGAPVTDGSGPHLEINGYVRNVTFE >gi|313156793|gb|AENZ01000086.1| GENE 74 84354 - 85844 1459 496 aa, chain - ## HITS:1 COG:no KEGG:BVU_0121 NR:ns ## KEGG: BVU_0121 # Name: not_defined # Def: glycoside hydrolase family protein # Organism: B.vulgatus # Pathway: not_defined # 12 493 4 458 461 190 31.0 2e-46 MLLKIAQGVFWILAVCLMPGTAYGEGRDVLPGNAAYKVEVFRGGTWEEIFVYNARVSDYA GNPEAGYTQYTMGFALFTDSFRQPLKVRVTRRAGTFSKVEIRPLSYGIVPDVQTPNSVEF 
ELGDPAQKVSVEFDGNRMENLFILPDLPDTAIPMGANVTYFGPGVHNAGVIRIANESGRI LYLDEGAVVLGRIEAENAANLTIRGRGVFCSSQEDHGAGRRPQMEFRNCDNLKIEGILLR DTPNWTLKIVGSTGVHIDNIKEIGWIMNSDGMDFICCRDVLVENTFQRNYDDNVTIKAFN ATPEYIASHTNTDGSYSDGAIWMVAGLRNFEVCNYEIRNCVFWADKAHNMLVGPEARGIA FRNIRFHDNIVLENRQDDGIYPGAMAVMIADDGTFEDIAFENIIVEDIDGGKVFCAHFTN AWAFDGLYGQWARNITLRNIAYTGTRATPSWIRGRNDAQSIDGVTIGNFTVNGTPVTDGS GPHLEINEYVRNVTFE >gi|313156793|gb|AENZ01000086.1| GENE 75 85847 - 87580 1199 577 aa, chain - ## HITS:1 COG:TM1751 KEGG:ns NR:ns ## COG: TM1751 COG2730 # Protein_GI_number: 15644497 # Func_class: G Carbohydrate transport and metabolism # Function: Endoglucanase # Organism: Thermotoga maritima # 226 557 14 312 317 89 23.0 2e-17 MKKLLKYILLLIYVALMPACGEDSGLDGVFANKPFALVTAADGGRTSISAVIDDGKRTIR LTFDKKRDVDAVELTFQLNPGYAMVSPKEPETTMDLTQPRTVTVSSGAGEVSYEISARNE TPVLSASFLFRGKPIEGVIDHETRTVSFPITTIYASEYPEELLGAVPLDITLSDGYRPTS EKVSYDLSTGESFSVYKGRNTYTYRLVADIAPIPVADHLSDFRFRRGVNLAYWMTEADRD WLGYVMPAQFPLWKELGFDHFRVPVDELCIFDREGQFNEKAVGKLHELFTWCEQNGMYAI LDMHTLAPREGSDSYREEELYDPAYPEFRTHFVNVWRKLAAEFKGHNPKCVAFELLNEPH DGTPDATGWNKLQNEVLTAVREQDPERIVFVPAMGWQDYNYIKYARVAEEDPNAVVSFHY YLPMLLSHYKMLAWVGYQGAVQYPGVVIPTQSDADKYPQYASFHKTTYNADRIMGEMSNA ASDGRNAGLAVHCGEFGCSKNVAEAMRLAWFTDMVAAMEANNIPWTLWECLDGGFGFVDM EYGIYRINCPLLKVLTGKELTDAEAKALLQKYGFKTE >gi|313156793|gb|AENZ01000086.1| GENE 76 87610 - 88482 796 290 aa, chain - ## HITS:1 COG:no KEGG:Bache_0346 NR:ns ## KEGG: Bache_0346 # Name: not_defined # Def: coagulation factor 5/8 type domain protein # Organism: B.helcogenes # Pathway: not_defined # 1 289 5 284 286 137 33.0 8e-31 MKSIYKILLLFTCSCGVSCGGDDEPVMTALAVQGIRAEAFPGAIRLTWNPIESRDFLYAE VSYYDFGTQTTISKQCSRYTGTLYIDGLLNRYGKYRFTFTAYDVNGTPGQPVYIERQCEK AESYYVVVAENPIPLAADQLATNAQEPSEGPLDLLVDGNPDTFFQSFWDEWTYPELKPTG YHHLTFDLRKEVSAFKFQYWNRKKANGSLPKAVNIYGSPDGETWVLLAQLDNLPGDLGSS YTSELFLPEEPISHVKYEVVKGTDVYQPYFSLAEIEFYEVVRELVDPEAE >gi|313156793|gb|AENZ01000086.1| GENE 77 88512 - 90362 1926 616 aa, chain - ## HITS:1 COG:no KEGG:Bache_0347 NR:ns ## KEGG: Bache_0347 # 
Name: not_defined # Def: RagB/SusD domain protein # Organism: B.helcogenes # Pathway: not_defined # 7 616 9 639 639 561 48.0 1e-158 MKYYTHILSLILSAAFLFQACSYLDVVPDETPDEKDAFQNPTMAERYLYSCYSFLPNPRH ETASLDLMTADEVVTPWEHETFANFPKGNYNASEPVISYWNTLFGGIRRCYILLENIDAV PNMTEANLRIYKGEAKFLIAYYHFLLLKNYGPIPLIKGVMSIDMDKADFPERDSYDVCVQ WISDLFDEAAGMLPAEQTSTAYGRATSVAAKALKSRLWLYAASPLFNGNSEYYANFVSKV DGRHLISQTYSEAKWEKAASAAEEAIEVAKAAGYKLYTESPVVTSRFAQPTDPVHRVLRL NFMDYNTSKEVLWADTRKEGHYGLQYKSTPYRVGPSSGNGVGVTLTMVEMFYSKNGLPID LDPEYDYTGRYRYGEYHNDVCDGVTMNLNIDREPRFYAWVAFQNGYYEVLRRDGADDNAN IVQTKFRKNDVFGIKERTTNYTPTGYLNKKGCSPLYNNIQEDVAAPHYPWPVIRMAELYL NLAEAYANLGRIDEAAAALKPVRERAGLDPVDEAFEKAGLTLGRDEMIRMARQERTIELY LEGHRFWDVRRWKEGDKYFNVRPKGMNYNGESDADFFRVTEVIVQRKFMTPMHYLMPIPK DDINKNERKFVQNPGY >gi|313156793|gb|AENZ01000086.1| GENE 78 90385 - 93477 3133 1030 aa, chain - ## HITS:1 COG:no KEGG:BT_3012 NR:ns ## KEGG: BT_3012 # Name: not_defined # Def: hypothetical protein # Organism: B.thetaiotaomicron # Pathway: not_defined # 33 1030 121 1122 1125 1063 53.0 0 MLRTARLIMLAVLSGGIFAVPCGIYAQSAPRTVSGTVTDAATGEPLIGATVQVKGVTIGT TTDLNGMFSINLPANRKTLLFSYVGYEAQEIAVGRENYLNVALKQDANALDEVVVVGYGQ QKKGMLVSSVQSIAPAELRMPSSSLSTGFAGRLAGVIAVQRNGQPGADQADFWIRGISTF GSATKPLIILDGVAIDANELNGLDPEIIESFSVLKDATATALYGSRGANGVMIVTTKNGK NLDKPVINFRFETALSTPTSRPETVDAVTYMQMYNESVLTRGTGQIPYTQAKIDGTRAGS NKYIYPDVDWYDEMFKNLAVNENFNFNIRGGSSRVDYFMSATVRHEEGMLKNLSRDYFSY NNNYSVWRYAFQNNVNVNLTKSTKVSLKINTQLRDTHGPVKSSENIFGMIMNGNPADMPI TFPDDPTVNHIRWGGKAMVKNPVAEMVTGYKDEFQSVLNANLSLDQNFDFITEGLSASAL ISFKNYSYTQTSRSAGYNSYEISGTHTGDDGLEDYDLEIRGNEQSTTLATSSSTDGDRKI YIQGMINYNRSFGRHDVSAMVVYNQEETALGNPGSLFSSLPKRKQGLAGRLTYGYDNRYL IEANFGYNGSENFAEGRRFGFFPSVAVGYVVSQEKFWAPIKKAVSYFKLRGSYGLVGNDS EETRFMYMSDLSLSGAGYTTGADGEYTLSGPVYNRFENKKITWETGEKLNVGVDLQFYNR LNVMVDLFQEVRSGIFLERGTVPAFLGTATTKVYGNLGKVRNRGLDLSMDYTHQIGKDFF ISAKGTFTFARNRVLEQDEPDYLQYPNLSRVGRSVNSFLLYEAQRLFIDNNEVKYSPEQL LGGEIMAGDIKYVNQPDANGNYDNTINSNDRIYAGYPEVPEIVYGFGFSAQWKGLDFSVF 
FQGTANTSLVMSGFHPFGINNNVKRNVMQFVADDYWSESNPNIYAAYPRLSVVEYGNNTV ASTYWLRDASFLKLKNAEIGYTYKKMRFYISGSNLLTFSKFKLWDPEQGGGSGLKYPTQR VFNIGIQMTL >gi|313156793|gb|AENZ01000086.1| GENE 79 93719 - 94651 591 310 aa, chain + ## HITS:1 COG:BMEII0111 KEGG:ns NR:ns ## COG: BMEII0111 COG1409 # Protein_GI_number: 17988455 # Func_class: R General function prediction only # Function: Predicted phosphohydrolases # Organism: Brucella melitensis # 53 295 3 267 281 81 29.0 2e-15 MGFLRFFLDFGNPFVIFGYAFIYNPMKKIILTTLCCLTAALTSAREKSPVTIIMQVSDPQ MGFYADNRDMAYETRTLTKTVEAINRLRPDVVVFTGDYVHNAADESQWTEFLRIVAEINP RIKTLYLPGNHDVRLEEGSVDVEPYTKHLGIDRFCVRVNGILLTGINSDYLKDETRDPSK EENQFRWLARSLKKKRPSRTSLVFAHHPFFLRQIDEPDGYSTISPEKRRRYFELFRETGV QTVFTGHLHDNAETSYDNIGMITTSAVGRPLGDAPSGVRIIVIKDRTIIHRYYPLDEIPD ARTGLIQALR >gi|313156793|gb|AENZ01000086.1| GENE 80 94711 - 95547 428 278 aa, chain - ## HITS:1 COG:CC0862 KEGG:ns NR:ns ## COG: CC0862 COG3568 # Protein_GI_number: 16125115 # Func_class: R General function prediction only # Function: Metal-dependent hydrolase # Organism: Caulobacter vibrioides # 19 278 35 295 305 129 32.0 7e-30 MKKTMLFLLLAFCFSNSVQAGQDAVSIKLISFNMRTSWGRDGDNSWPNRRHATAQMLRQE APDVMGVQEAMQDQLYYIDTECPRYARVGEDRDGGAEGGETMAVFYLRDRFELLDSGTFW ISETPDNVSRGWDAACNRTVTWVELRDKSSGKEFFYFNTHLDHQGKIAREEGVKLIVTKI RQIAGKKAAVILGGDLNTSIDNPHLKPLTRLMASARDTAAETDQKGTFNGFGSAPDTIIL DHLFYRGRMKCRKFVTLDGDYGAPYISDHYPIAMVFTL >gi|313156793|gb|AENZ01000086.1| GENE 81 95556 - 97550 1424 664 aa, chain - ## HITS:1 COG:YPR026w KEGG:ns NR:ns ## COG: YPR026w COG1554 # Protein_GI_number: 6325283 # Func_class: G Carbohydrate transport and metabolism # Function: Trehalose and maltose hydrolases (possible phosphorylases) # Organism: Saccharomyces cerevisiae # 264 629 443 828 1211 210 34.0 9e-54 MMRLVFWGVLLSVVPLSVDARSEGWVITARDTTADYFAVAMANGEIGVAVGREPFALGSV ILGGSYEPGFGDDVSRILEGINPLGLTMAVDGHRFDPSEVRPQHQSVDMRRAVHTTRFST DGVTVVYRIRALRNMPYALMTEVEVTAERDAEVLFSNGHTVPGEFADTLRESRTVGCEDG SRIAVQRTSGSYNRGRDHIVASSTFLCGEGCEAVSPESVRIVLRKGGRASFSLVGTICTT 
AAFADPWNESERQAIYAAREGAAQLVAAHERKWAELWQGDIEIEGDPTAQLDVRFALFNL YGSIREGSRRSIPPMGLSARGFYNGHIFWDSEIWMYPALLVLRPCLARQMLDYRTDGLDA ARRRAYAHGYRGAMFPWEGDDRGEEATPTFALTGPLEHHITADIAIASWNYYCVTQDREW LRREGFPLMREAARFWCDRVTANADGSYSIRNVIGANEYAVGVTDNAFTNGAARRALEYA SAAAELCGERPDPQWSAVAAGLRIPHFADGTTREHAGYDGEMIKQADANLLGYPLGIVTG REAQLRDLEYYERRIDPRNGPAMSYSVFAIQYARLGMAEKACEMFRRSYLPNLRPPFGVF AETATSGNPYFMTGAGGMLQAVLFGFGGLEITADGLVQRPSVLPPQWKNLRIKINGKIYQ ATNQ >gi|313156793|gb|AENZ01000086.1| GENE 82 97555 - 99570 1524 671 aa, chain - ## HITS:1 COG:no KEGG:BF1323 NR:ns ## KEGG: BF1323 # Name: not_defined # Def: alpha-glucosidase # Organism: B.fragilis # Pathway: not_defined # 12 671 8 671 671 825 60.0 0 MNQTLKFSFRNLFLSCVCLLLFACGGNPEIISPDGRTRLSFVTGADGCMAYTVERDGKPL ILPSALGLVVQERNLAGGFSVREIVKRSVDETWTQPWGENKILRDCHNEMTAVLKNDDGV LLTLRFRAFDDGVAFRYEWEVPDLDSLTVTDELTEFRFAEDGVSWSIPGNFNTYELLYRE LPVSAVENANTPFTFRVDGTYGSIHEAALYDFPEMNLYRTDSLAFKAELAPLPDGIKARI PSKFMTAWRTIQVGDKAVDLINSSLILNLNEPSKIADTSWIKPQKYIGVWWGLHLGTHTW TMAPRHGATTENALRHIDFAAANNIQGVLFEGWNEGWENWGKTQHFDYVKPYADFDLDRI AAYAREKNIELWMHNETGGNIPEYEAALETAMQRYAGLGVHAVKTGYAGGFRDGQLHHSQ YGVRHYQRVVETAARYGIVIDAHEPIKDTGIRRTWPNMMTREGARGMEWNAWSEGNSAEY LCTLPFVRLLSGPMDYTPGVFDLDYSRVRGRETGMQEWNGDNNSCCIKTTLARQIANWVI IYSPLQMASDLIENYEGHPAFQFFRDFDADCDWSEALQGEPGEYIVVVRRADDSFFLGAG TNDEPRTLTQKLGFLKSGMTYTATIYADAPDSAENPENYRIEKRTVTSADMLEIAMTARG GQAVTFVPVNE >gi|313156793|gb|AENZ01000086.1| GENE 83 99620 - 100732 655 370 aa, chain - ## HITS:1 COG:all1887 KEGG:ns NR:ns ## COG: all1887 COG4299 # Protein_GI_number: 17229379 # Func_class: S Function unknown # Function: Uncharacterized conserved protein # Organism: Nostoc sp. 
PCC 7120 # 7 370 2 375 375 167 32.0 2e-41 MMNPNRRLLSLDTLRGVDMFFIMGFSGLVTSLCALWPGSFTDMLASQMQHAAWNGLTIQD TIFPLFLFIAGVAFPFSLAKQRARGFGRKRILDRIFRRGLILALLGMVYNGLFELNFSSL RIASVLGRIGLAWMFAALLCVYCSVRTRIAVAGIILIGYSLLLGLVVAPDAPVGADPLSV EGCLAGWIDRQYLPGHILYGAFDPEGILSTLPAVVSALFGMFTGEFLLDGRRGLSGSWKA FYMAVAALAITTAGLCWNLIMPVNKNLWSSSFTCVVSGYSLGMTALFYYLIDVCGYKRWT FVFRVIGLNSITIYMAQRIIPLRYASDFFVGGLASKCSETVGAVIYDIGYIVLCWLFLYF LYRKNTFLKV >gi|313156793|gb|AENZ01000086.1| GENE 84 100762 - 102348 680 528 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 21 504 26 507 757 384 43.0 1e-106 MKNVFFAVMALAVSFTETRAADSPAIVPMPQHIAYATGEGAVLSAGSRIAVPGGAKALRS VAELFVRDICREHGLRLKVSGISESGIVLGIDPALRAEAYSLDIGQGRVAVTGGSPSGLF YGLQSLRQLISQYGMRLPAVHVEDEPCFAYRGAMLDCCRHFFTVDEIKTFIDILALHKLN RFHWHLTDDQGWRIEIRHYPGLTKEGSRRAETVLGRNTNIYDGIPSGGYYTQRQIRDVVA YAAERFITVIPEIEMPGHASAALAAYPWLGCAGEGYMVRTRWGVFPEVYCAGKDSTFEFM ENVLAEVCELFPSEYIHIGGDECPKQSWTSCPACQQRIRNERLEHENELQSYFVHRIEKW LNARGRNLIGWDEILEGGISKTATIMSWRGADSGVAAAKAGNQVIMTPNTHCYLDYFQTQ EPERLEPLGIGGYVPVRKVYSFDPYDRLSSAEQSCIQGVQGNIWTEYIASFAHAQHMALP RLAALAEVGWACDRRDFGDFTRRMTVFRKLYDKCGYRYASYFFDGTGE >gi|313156793|gb|AENZ01000086.1| GENE 85 102371 - 104659 1533 762 aa, chain - ## HITS:1 COG:TM0076 KEGG:ns NR:ns ## COG: TM0076 COG1472 # Protein_GI_number: 15642851 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 16 749 7 751 778 372 33.0 1e-102 MGCNALFAAVPPAIRPDAKIETRIEKILGRLTLEEKIGQMCQLTVSMVTDMNDSGHPFIS DELLDTVIGHYKVGSILNVPFDEAQSREAWTQIIGRIQRRSLDCLGIPCIYGVDQMHGAS YTRGATFFPQGINMGAALNCELMRRSSEISAYETRACAIPWNFAPVMDLGRDPRWSRMWE SYGEDVCVNSRLAAASVRGLQGDDPNRIGMYRVAACLKHFMAYGVPVSGKDRTPSSVTRN ALREKYFAPFLECIRAGALSLMVNSSNNDGMPFHANRELLTGWLKEELNWDGVIVTDWND IYNLYERDHIAESRKDAVRIAINAGIDMAMVPLDRDFCVYLRELVEEGLVSERRIDDAVR 
RILRLKMRIGLFEEPFPDTSKFDRFASDEFAAVALQAAEESEVLLKNDGGLLPLPKSARI LLTGPNANSMRCLNGGWSYTWQGERCDEFADRYNTIYEALARKFDHVTWIPGVEYGTPSE NWQVERVRGIGEAVSAAADADVIVVCIGENSYCETPGNMNDLNLSQNQKKLVRELASTGK PLVLVLNEGRPRLIGDIEPLAQAVVDILLPGNYGGDALANLLAGDANFSARLPFTYPRWP DALATYDYKPCQKRGTMEGEYNYDAVMDVQWPFCHGLSYTTFEYGNLRANLTEFRAGDTL SFTVDISNTGDCAGKEAVLLWSSDLVASLTPDVIRLRNFEKISLEPGETRTVTLSIPASD LAFVGYDGRWRLEKGDFRIRMGTETLFVRCVETRVWDTPNIP >gi|313156793|gb|AENZ01000086.1| GENE 86 104694 - 107024 1499 776 aa, chain - ## HITS:1 COG:TM0025 KEGG:ns NR:ns ## COG: TM0025 COG1472 # Protein_GI_number: 15642800 # Func_class: G Carbohydrate transport and metabolism # Function: Beta-glucosidase-related glycosidases # Organism: Thermotoga maritima # 31 759 4 718 721 643 45.0 0 MKKLLIALFLFPVTLVVAVPSAKPRLGEAPVRKIVAAMSLEEKAGLLVGTSMEGYAGEGA VTGRTLKTVPGSAGTTRSLESYGIPTTVMADGPAGLRIDPERAGDARTYFCTAFPIGTML ASTWDEAAIERCGAAMGNEVLEYGCDVILGPGMNIHRHPLCGRNFEYYSEDPLLSGKCGA AMVRGIQSQGVGTSVKHFAANNQESMRLQNDARVSQRALREIYLRGFEIAVKEGRPWTVM SSYNRINGPYTQESRELLTSVLRDEWGYEGLVVSDWIGKRNTVAQVHAGNDLMMPGEPAQ AREIVEAVRSGRLAEADVDRCVTRVLEYILRTPRFRKQAVSETPDLEAHAAVSRQVAAEG MVLLRNEGAALPLAAGCNLSVFGVNAYDCIAGGTGAGHVNKAYTVDLDEGLRNAGFRLNT RTADLYAKYMSFGEALLAEQNALRYLGEKWFVPETTLTPEFIASRAADSDAAIVALGRNS GEYNDRPTSDFYLTEAERELLESVCSAFHAAGKKVVVVLNIGGVIETMSWKELPDAILLA WQGGLEAGNAMADVLSGRVSPSGRLPMTFPADYSDHPSSKNFPLDYRGRCGDWADDAPER HLRNLGFTCYEEDIWVGYRYFSTHAPEAVSYPFGYGLGYTTFEWTDAAVRRSGTDYAVTL RVTNTGSRAGREVVELYVAAPQGALPKPVRELRAFAKSRELQPGESQALTLRFAAADLAS FDERISAFVVDSGRYMAELGRSADDIVQRLPFVAKASERKVHDVLKPQEPLRRLKF >gi|313156793|gb|AENZ01000086.1| GENE 87 107035 - 108882 1067 615 aa, chain - ## HITS:1 COG:CC0447 KEGG:ns NR:ns ## COG: CC0447 COG3525 # Protein_GI_number: 16124702 # Func_class: G Carbohydrate transport and metabolism # Function: N-acetyl-beta-hexosaminidase # Organism: Caulobacter vibrioides # 53 584 89 593 757 337 35.0 4e-92 MKLIFAISIVALSFFGTACRRSVGEMILVPMPQEATFTGGYFETDSARFVERAGDGYVAC RIDPSAPEIRPEGYRLKVTRRGIDLTALDSAGLFYGRQTLCLLATAQGIPCVEIIDNPCF 
GYRGIHLDVSRHFFPVETIFRLLDEMARYKLNKFHFHLADNGGWRIRIDAYPLLTRLGAF RTESDWLEWWSYGDRHYVPEDTPGAYGGYYTKEEIRRIVAYAAERFIEVIPEIEFPGHSD EVFAAYPELCCSGRAYTSGEFCVGNPLSLKFMDEVLTEVLELFPSKYIHIGGDEADRTAW KSCPRCRHLARELGGVDQVQCYLVEHAEKFLAEHGRTMIGWDEILKNNLRSTSTVISYRG QRGGIEAANRGYDVVMSPGEILYFDWYQADPHTQPRAMGGFSPIRKMYGFHPVPDTPAKA ADNESIIRGEFVSPDSVEYIYDGGKEHVIGVQGCTWTEFIETEKHLEYMIFPRLLAVSEL AWTPRERCEWNDFRRRINVHVSLLHARGINAFPLSDDVVITAQMLSEGKKARVTLDTEKY PAEVRYTLDGTAPVPGSDLYDGPFVVKAGTTVRAALFVAGRMEGTMTELYVDARRNVDNY YTYLNTPEVYASTDR >gi|313156793|gb|AENZ01000086.1| GENE 88 109020 - 111437 1796 805 aa, chain - ## HITS:1 COG:CAC0323 KEGG:ns NR:ns ## COG: CAC0323 COG0642 # Protein_GI_number: 15893615 # Func_class: T Signal transduction mechanisms # Function: Signal transduction histidine kinase # Organism: Clostridium acetobutylicum # 264 521 375 644 654 132 31.0 2e-30 VAVRIPSEGSFKAVSAFPNLENKGIGNIARDENGRMWITTNNSVFSFSPDSLGNPEHINT YIISADMQSFFFNRNASAQVDYGRIAFGGSNGLMIFTGNRTQPHQTRLPIVLTDFKVHNR SLRTIPARERSRISLRDIDYTDAVTLTHDRNNFFIEFSMLSYANPRDHIFRYRLDGFDKE YVTADSHHRFASYSNLSPGTYTFRLQAAGENGVWSSNERTLTVRILPAPWLSWWAWTIYS VLLLVLAYGVVRFLRYRLRLRQEVQISKLERQKTEELNHAKLQFFTNVTHELMTPLTIIL TSLQNLNNGTGDNQTLYGVMSANATRLMRLIQQILEFRKVESGNLKIRVSHGDVVGFVRR CVEAFAPLVARKQLKVYFRASSEQVDGWFDPDKLDKIVYNLLSNAAKYTPDKGEIIIRIE TGDDCSVCISVANSGELMTQQTIDGLFRRFYDGNYRKHHTIGTGIGLSLVKDLTDLHRGS IRVSSDEQDGNCFRITLPIGRDAYTEEEIDDDTGDDAAEKIYEGAGEFVPVQPDAAMTDT PSRTRTDHTLLVVDDNEELLLLISNLLAPYFRIETASDGEEALRILSRQPVDLIVSDIMM PGMDGIELCRRIKQTFEYCHIPVILLTAKNADESRIEGYNSGADGYVTKPFNLQLLYAQI VNQLRKLEVRGLHFRNQPVFEVEKLEYTSMDEKFMRQAMACVNAHIDDCEFAQADFTREM NMSRTILTEKLKSLTGLTPAAFIIDVRLRAAYHLLEEQKKMRIADLAYASGFNDPKYFST CFRKKFGFSPKEFIDRLNEKGDKIA